implicit val searchFiltersDecoder: Decoder[SearchFilters] = { c =>
  for {
    bbox              <- c.downField("bbox").as[Option[Bbox]]
    datetime          <- c.downField("datetime").as[Option[TemporalExtent]]
    intersects        <- c.downField("intersects").as[Option[Geometry]]
    collectionsOption <- c.downField("collections").as[Option[List[String]]]
    itemsOption       <- c.downField("ids").as[Option[List[String]]]
    limit             <- c.downField("limit").as[Option[NonNegInt]]
    query             <- c.get[Option[Map[String, List[Query]]]]("query")
    paginationToken   <- c.get[Option[PaginationToken]]("next")
  } yield SearchFilters(
    bbox,
    datetime,
    intersects,
    collectionsOption getOrElse Nil,
    itemsOption getOrElse Nil,
    limit,
    query getOrElse Map.empty,
    paginationToken
  )
}
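A quick sketch of exercising the decoder, assuming circe-parser is on the classpath; the JSON payload below is illustrative, not from the original conversation:
import io.circe.parser.decode

val json = """{ "collections": ["sentinel-2-l2a"], "limit": 10 }"""
decode[SearchFilters](json) match {
  case Right(filters) => println(filters)
  case Left(err)      => println(s"decode failed: $err")
}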
val results = spark.read
  .stacApi(
    "https://planetarycomputer.microsoft.com/api/stac/v1/",
    filters = SearchFilters(
      collections = List("sentinel-2-l2a"),
      query = Map("title" -> List(Equals(Json.fromString("Band 1 - Coastal aerosol - 60m")))),
      paginationBody = JsonObject.fromMap(Map("page" -> Json.fromString("2")))
    )
  )
  .load
  .limit(10)
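To sanity-check what the query returned before any heavier processing, plain DataFrame inspection works:
results.printSchema()
results.show(10, truncate = false)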
"token": "next:S2A_MSIL2A_20220411T101601_R065_T45XWK_20220412T055304”
// For example, I perform a log calculation on the band5 file of Landsat 8
// and then write out the calculated image; the image resolution is 7581x7721.
val out = Paths.get("target", "band5.tif")
noException shouldBe thrownBy {
  df.select(rf_log(col("band5")))
    .write.geotiff
    .withCRS(LatLng)
    //.withDimensions(1024, 1024)
    .save(out.toString)
}
Why is the NDVI calculation different from QGIS's?
My QGIS formula:
( "LC08_L2SP_119038_20210623_20210630_02_T1_SR_B5@1" - "LC08_L2SP_119038_20210623_20210630_02_T1_SR_B4@1" ) / ( "LC08_L2SP_119038_20210623_20210630_02_T1_SR_B5@1" + "LC08_L2SP_119038_20210623_20210630_02_T1_SR_B4@1" )
My Spark code:
val ndvi = df.withColumn("ndvi", rf_normalized_difference(col("band5"), col("band4")))
ndvi.printSchema()

val out = Paths.get("target", "spark_ndvi.tif")
noException shouldBe thrownBy {
  ndvi.write.geotiff
    .withCRS(LatLng)
    .withDimensions(4096, 4096)
    .save(out.toString)
}
How do I write a full-resolution GeoTIFF? I saw the restriction on the GeoTIFF writer on the official website: it needs to read all the data onto the driver, which causes an OOM, so withDimensions has to be used to limit the output resolution. What is the correct approach if I want to write out a full-resolution TIFF?
@pomadchin Hi, is there any solution for writing out at full resolution?
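Not an official answer, just a workaround sketch: instead of the geotiff datasource (which collects everything to the driver), each tile can be written from the executors with GeoTrellis and the pieces mosaicked afterwards (e.g. with gdalbuildvrt). The column name, CRS, and output path below are assumptions, and the (Extent, Tile) encoder is assumed to come from the Spark and RasterFrames implicits in scope:
import geotrellis.proj4.LatLng
import geotrellis.raster.Tile
import geotrellis.raster.io.geotiff.SinglebandGeoTiff
import geotrellis.vector.Extent

// One GeoTIFF per tile, written from the executors; no driver-side collect.
df.select(rf_extent(col("band5")) as "extent", rf_tile(col("band5")) as "tile")
  .as[(Extent, Tile)]
  .foreach { case (extent, tile) =>
    SinglebandGeoTiff(tile, extent, LatLng)
      .write(s"/data/out/band5_${extent.xmin}_${extent.ymin}.tif")
  }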
@pomadchin Hi, is there any answer to the NDVI question above?
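One possible cause, offered as an assumption rather than a confirmed diagnosis: the Landsat bands are integer-typed, so cell-type and NoData handling in rf_normalized_difference can differ from QGIS, which computes on floats. Converting the tiles to a floating-point cell type before the calculation is worth trying; a sketch using the column names from the snippet above:
// Convert to float32 first so the division is done in floating point
// rather than on integer cells.
val ndviF = df.withColumn(
  "ndvi",
  rf_normalized_difference(
    rf_convert_cell_type(col("band5"), "float32"),
    rf_convert_cell_type(col("band4"), "float32")
  )
)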
Caused by: java.lang.IllegalArgumentException: A destination CRS must be provided
  at org.locationtech.rasterframes.datasource.geotiff.GeoTiffDataSource.$anonfun$createRelation$7(GeoTiffDataSource.scala:73)
  at scala.Option.getOrElse(Option.scala:189)
  at org.locationtech.rasterframes.datasource.geotiff.GeoTiffDataSource.createRelation(GeoTiffDataSource.scala:73)
  at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
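That exception comes from the GeoTIFF datasource requiring a destination CRS; adding .withCRS to the writer, as in the earlier snippets, satisfies it. A minimal sketch, where the DataFrame, output path, and CRS are placeholders:
import geotrellis.proj4.LatLng

df.write.geotiff
  .withCRS(LatLng) // required: GeoTiffDataSource throws without a destination CRS
  .save("target/out.tif")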
You can pass the --repositories flag to spark-submit to download packages from nonstandard locations. Perhaps there's a way to do that with pyspark?
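The same thing is exposed as a Spark config, which also works when building a session from pyspark; a sketch with an illustrative package coordinate:
import org.apache.spark.sql.SparkSession

// spark.jars.repositories / spark.jars.packages mirror spark-submit's
// --repositories / --packages flags.
val spark = SparkSession.builder()
  .appName("snapshot-deps")
  .config("spark.jars.repositories", "https://oss.sonatype.org/content/repositories/snapshots")
  .config("spark.jars.packages", "org.locationtech.rasterframes:rasterframes_2.12:0.10.1") // coordinate is illustrative
  .getOrCreate()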
If you still need GT snapshots, then they are available on the Sonatype Nexus, e.g.: https://oss.sonatype.org/content/repositories/snapshots/org/locationtech/geotrellis/geotrellis-spark_2.12/
Check out the GT README badges: https://github.com/locationtech/geotrellis#geotrellis
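For sbt builds, a resolver along these lines picks the snapshots up (the version shown is illustrative):
// build.sbt: add the Sonatype snapshots repo so SNAPSHOT GeoTrellis builds resolve
resolvers += "sonatype-snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
libraryDependencies += "org.locationtech.geotrellis" %% "geotrellis-spark" % "3.6.1-SNAPSHOT" // illustrative version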