Eugene Cheipesh
@metasim Related to the question above, I’ve been noticing that the GeoTiff reader is just kind of really, really bad compared to GDAL when reading tiles. Much worse than I would have expected based on previous experience with it. I wonder if that lines up with what you’ve seen and if you, perchance, have a theory as to what is going on. It’d be really nice to be able to handle COGs without GDAL, especially now that we have EMR Serverless as an option.
Simeon H.K. Fitch

@echeipesh That surprises me too. I've not done any RasterFrames work for some time, so I've not been tracking it. Debugging performance issues in Spark is just so damn difficult. Java version different? If I had $$ to work on it, I'd put in dropwizard/prometheus metrics and maybe even tracing on tile ops in hopes instrumentation would help.

I've been wondering if we should do a new release with just this flag flipped:

In short, I don't know what's going on or what to do about it without a project to support the work.
Eugene Cheipesh
Yeah, I know that feeling. I hope it's OK to talk about issues without the expectation that you have answers; it’s FOSS and all. But still, I don’t want to stress you out :)
It seems when using GeoTiffRasterSource it’s just spending a lot of time re-reading the TiffTags … not great.
Screen Shot 2022-06-27 at 3.35.33 PM.png
@echeipesh ouf and GDAL Caches them :/
Eugene Cheipesh
Yes, I think there is something suspect going on with the delegate/cache situation. I’ll try to dig into it when I have a minute. Aside from finding out what is happening, it shouldn’t be a hard fix.
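To make the caching idea concrete, here is a minimal sketch in plain Python. It is not the GeoTrellis or GDAL implementation; `read_tags_uncached`, `read_tags`, and `read_tile` are hypothetical names, and the "header read" only counts calls. It shows that memoizing header metadata per URI turns ten tile reads into a single tag read.

```python
from functools import lru_cache

# Hypothetical stand-in for an expensive header read (for example,
# fetching and parsing TIFF tags over the network). It only counts calls.
read_count = {"n": 0}

def read_tags_uncached(uri):
    read_count["n"] += 1
    return {"uri": uri, "tile_width": 512, "tile_height": 512}

@lru_cache(maxsize=128)
def read_tags(uri):
    """Return header metadata for `uri`, parsing it at most once per URI."""
    return read_tags_uncached(uri)

def read_tile(uri, col, row):
    """Pretend to read one tile; only the header lookup is modelled."""
    tags = read_tags(uri)
    return (uri, col, row, tags["tile_width"], tags["tile_height"])

# Ten tile reads against the same file trigger a single header read.
tiles = [read_tile("s3://bucket/scene.tif", c, 0) for c in range(10)]
```

Without the `lru_cache` decorator, the same loop would call `read_tags_uncached` once per tile, which is the pattern described in the profile above.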

@pomadchin Hi, how do I convert a RasterFrames DataFrame to a GeoTrellis RDD[(K, V)] with Metadata[M]?

val df = spark.read.raster
      .withTileDimensions(512, 512)


I get this error:

Caused by: java.lang.IllegalArgumentException: requirement failed: A RasterFrameLayer requires a column identified as a spatial key
    at scala.Predef$.require(Predef.scala:281)
    at org.locationtech.rasterframes.extensions.DataFrameMethods.asLayer(DataFrameMethods.scala:234)
    at org.locationtech.rasterframes.extensions.DataFrameMethods.asLayer$(DataFrameMethods.scala:229)
    at org.locationtech.rasterframes.extensions.Implicits$WithDataFrameMethods.asLayer(Implicits.scala:59)
@metasim How can I implement image mosaicking with RasterFrames?
I see in the source code that a left join is used and the intersection is taken, which yields the black part; I actually want the red part.
def apply(left: DataFrame, right: DataFrame, leftExtent: Column, leftCRS: Column, rightExtent: Column, rightCRS: Column, resampleMethod: GTResampleMethod, 
    fallbackDimensions: Option[Dimensions[Int]]): DataFrame = {
        val leftGeom = st_geometry(leftExtent)
        val rightGeomReproj = st_reproject(st_geometry(rightExtent), rightCRS, leftCRS)
        val joinExpr = new Column(SpatialRelation.Intersects(leftGeom.expr, rightGeomReproj.expr))
        apply(left, right, joinExpr, leftExtent, leftCRS, rightExtent, rightCRS, resampleMethod, fallbackDimensions)

.withColumn(id, monotonically_increasing_id())
      // 2. Perform the left-outer join
      .join(right, joinExprs, joinType = "left")
      // 3. Group by the unique ID, reestablishing the LHS count
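As a rough illustration of why the join type matters here, the sketch below compares a left join with a full outer join on plain dictionaries. It assumes simple equality keys (`"A"`, `"B"`, `"C"` are hypothetical grid cells) rather than the spatial intersection used in the excerpt, and it does not call any RasterFrames API. If the wanted result includes area covered only by the right-hand side, a left join cannot produce it.

```python
# Hypothetical tile keys: which grid cells each scene covers.
left = {"A": "left-A", "B": "left-B"}
right = {"B": "right-B", "C": "right-C"}

def left_join(l, r):
    """Keep every left key; the right value is None where nothing matches."""
    return {k: (l[k], r.get(k)) for k in l}

def full_outer_join(l, r):
    """Keep keys from both sides; the missing side is None."""
    return {k: (l.get(k), r.get(k)) for k in sorted(set(l) | set(r))}

left_result = left_join(left, right)          # cells A and B only
outer_result = full_outer_join(left, right)   # cells A, B and C
```

Cell `"C"`, which exists only on the right, survives the full outer join but is dropped by the left join.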
I want to create a function similar to rf_slope, where each cell's result is computed from the 8 cells around it. I did not find rf_slope's algorithm code. Could someone please show me how to rewrite it or develop a new function that works on neighborhood cells?
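For reference, a 3x3 neighborhood operation can be sketched in plain Python as below. This uses Horn's method, a common slope formula, and is not claimed to be the algorithm behind rf_slope, which this thread does not show. `slope_degrees` is a hypothetical helper that works on a list of rows, assumes square cells of size `cell_size`, and leaves edge cells as `None` because they lack a full neighborhood.

```python
import math

def slope_degrees(grid, cell_size=1.0):
    """Slope per interior cell from its 8 neighbours (Horn's method)."""
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Name the neighbours of the 3x3 window:
            #   a b cc
            #   d . f
            #   g h i
            a, b, cc = grid[r - 1][c - 1], grid[r - 1][c], grid[r - 1][c + 1]
            d, f = grid[r][c - 1], grid[r][c + 1]
            g, h, i = grid[r + 1][c - 1], grid[r + 1][c], grid[r + 1][c + 1]
            # Weighted finite differences in x and y.
            dzdx = ((cc + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
            dzdy = ((g + 2 * h + i) - (a + 2 * b + cc)) / (8 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    return out

flat = [[5.0] * 3 for _ in range(3)]                       # constant surface
ramp = [[float(c) for c in range(3)] for _ in range(3)]    # rises 1 unit per cell in x
```

Any other focal function follows the same shape: gather the window around each interior cell, then reduce it to one value.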
Solomon Negusse

Hello, I'm new to rasterframes and am encountering the error below while running a zonal statistics prototype using pyrasterframes in an AWS EMR Serverless environment. I'm a bit lost on the Java error and hoping someone can point me in the right direction to resolve it.
The environment:
EMR v6.6 (the only version available in EMR Serverless)
Spark v3.2.0
Pyrasterframes v0.10.1

And here is the error:

Traceback (most recent call last):
  File "/tmp/spark-515422fd-4ed2-4ca6-9f2d-160cd422df73/zonal_statistics.py", line 25, in <module>
    spark = create_rf_spark_session()
  File "/home/hadoop/rasterframes/lib/python3.8/site-packages/pyrasterframes/utils.py", line 88, in create_rf_spark_session
  File "/home/hadoop/rasterframes/lib/python3.8/site-packages/pyrasterframes/__init__.py", line 44, in _rf_init
    spark_session.rasterframes = RFContext(spark_session)
  File "/home/hadoop/rasterframes/lib/python3.8/site-packages/pyrasterframes/rf_context.py", line 45, in __init__
    self._jrfctx = self._jvm.org.locationtech.rasterframes.py.PyRFContext(jsess)
  File "/usr/lib/spark/python/lib/py4j-", line 1573, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 111, in deco
  File "/usr/lib/spark/python/lib/py4j-", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.locationtech.rasterframes.py.PyRFContext.
: java.lang.NoSuchMethodError: org.apache.spark.sql.catalyst.expressions.objects.Invoke$.apply$default$5()Z
    at frameless.RecordEncoder.$anonfun$toCatalyst$2(RecordEncoder.scala:154)
    at scala.collection.immutable.List.map(List.scala:293)
    at frameless.RecordEncoder.toCatalyst(RecordEncoder.scala:153)
    at frameless.TypedExpressionEncoder$.apply(TypedExpressionEncoder.scala:28)
    at org.locationtech.rasterframes.encoders.TypedEncoders.typedExpressionEncoder(TypedEncoders.scala:22)
    at org.locationtech.rasterframes.encoders.TypedEncoders.typedExpressionEncoder$(TypedEncoders.scala:22)
    at org.locationtech.rasterframes.package$.typedExpressionEncoder(package.scala:39)
    at org.locationtech.rasterframes.encoders.StandardEncoders.spatialKeyEncoder(StandardEncoders.scala:68)
    at org.locationtech.rasterframes.encoders.StandardEncoders.spatialKeyEncoder$(StandardEncoders.scala:68)
    at org.locationtech.rasterframes.package$.spatialKeyEncoder$lzycompute(package.scala:39)
    at org.locationtech.rasterframes.package$.spatialKeyEncoder(package.scala:39)
    at org.locationtech.rasterframes.StandardColumns.$init$(StandardColumns.scala:42)
    at org.locationtech.rasterframes.package$.<init>(package.scala:39)
    at org.locationtech.rasterframes.package$.<clinit>(package.scala)
    at org.locationtech.rasterframes.py.PyRFContext.<init>(PyRFContext.scala:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:238)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:750)
hey @solomon-negusse, RF works with Spark 3.1.x only; 3.2.x is binary-incompatible with 3.1.x
See the WIP draft PR locationtech/rasterframes#587
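A small guard can surface this kind of mismatch before the JVM throws a `NoSuchMethodError`. The sketch below is an assumption-laden illustration: `spark_minor`, `check_spark`, and `SUPPORTED` are hypothetical names, and the supported line (3.1.x) is taken from the reply above rather than from any published compatibility table.

```python
def spark_minor(version):
    """Return (major, minor) from a version string such as '3.2.0'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)

# Assumption taken from the reply above: this release line targets Spark 3.1.x.
SUPPORTED = (3, 1)

def check_spark(version):
    """Raise early if the Spark version is outside the supported minor line."""
    if spark_minor(version) != SUPPORTED:
        raise RuntimeError(
            f"Spark {version} found, but this build expects "
            f"{SUPPORTED[0]}.{SUPPORTED[1]}.x"
        )
    return True
```

In a job script this could be called with `pyspark.__version__` before the RasterFrames session is created.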
Hello folks! Any progress on using rasterframes in Databricks? I already read the topics about it here; however, it looks like it doesn't work well.
hey @LPontes I think it works well if you prebuild the assembly jar with all shaded deps inside and upload it into the DB classpath
Could we get a RasterFrameLayer from a Raster or Tile?
Yingyi Wu
Hi @pomadchin , I'd like to make PyRasterFrames read "Spark RDD" in "Supervised Machine Learning (https://rasterframes.io/supervised-learning.html)" and "Unsupervised Machine Learning (https://rasterframes.io/unsupervised-learning.html)", instead of HDFS. So, would you pls help to give me some suggestions? Thanks!
hey @JenniferYingyiWu2020 I think @metasim can cover the Python part of it better; but what error do you have?
Nikolay Gulyaev
Hello, I'm wondering if it's possible to clip raster data by polygons and save the clipped image for each polygon to geotiff using pyrasterframes functions. I know that there are many functions for working with tiles data but I didn't find the function for gathering clipped tiles data by some polygon Id and saving it as an image
Schmid, Matthias
Hi, how do you export raster coordinates when exploding tiles?
I found this but believe it is outdated: https://gist.github.com/metasim/5236f81a119fcc73833f6f93ee55608f
Schmid, Matthias
I actually managed to extract the value at a single point using st_makepoint and rf_value_at_point but how do I pass an array of point geometries?
Schmid, Matthias
Apologies, I found the solution from @hificoders and @metasim above (:point_up: August 2, 2021 3:57 PM) and will reproduce it. I'm still interested whether it's possible to export raster cell coordinates when exploding tiles but I have a workaround in the meantime.
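On the open question of cell coordinates: if each exploded row carries a column index, a row index, and the extent and dimensions of its tile, the map coordinate of a cell center follows from simple arithmetic. The sketch below assumes exactly that; `cell_center` is a hypothetical helper, not a RasterFrames function, and it assumes a north-up raster where row 0 is the top row.

```python
def cell_center(extent, cols, rows, col, row):
    """Map-space centre of cell (col, row) in a tile.

    extent is (xmin, ymin, xmax, ymax); row 0 is the top row.
    """
    xmin, ymin, xmax, ymax = extent
    cell_w = (xmax - xmin) / cols
    cell_h = (ymax - ymin) / rows
    x = xmin + (col + 0.5) * cell_w
    y = ymax - (row + 0.5) * cell_h   # y decreases as the row index grows
    return x, y

extent = (100.0, 200.0, 104.0, 204.0)   # a 4x4 tile with 1-unit cells
center = cell_center(extent, 4, 4, 0, 0)
```

The same arithmetic could be wrapped in a UDF or written with column expressions once the extent and indices are available as columns.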
Atheer Abdullatif
Hello, Is there a way to visualize rasters in Rasterframes scala?
Karthick Narendran

Hello! I'm just starting with PyRasterFrames and successfully installed the latest version (v0.10.1) on an Azure databricks cluster. Following this link https://rasterframes.io/getting-started.html, I created a spark session, read the sample raster file and was able to print the schema. However, when I try to do any other operations (described on the above link) with Dataframes, it fails with NoClassDefFoundError as shown below.

Py4JJavaError: An error occurred while calling o364._dfToHTML. : java.lang.NoClassDefFoundError: Could not initialize class org.locationtech.rasterframes.ref.RFRasterSource$

If I can create the spark session successfully, I'd assume the classpath and the pyrasterframes assembly jar are configured correctly. Do you have any thoughts on why I may be running into this error? TIA

Nitin Kandpal
Hi everyone, I am looking for functionality to mosaic raster tiles that have different extents but the same pixel size. I tried raster_join but I am not getting the desired result. Does RasterFrames have any solution for this?
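To pin down what "mosaic" means in this question, here is a minimal sketch in plain Python, independent of RasterFrames. It assumes all tiles share one cell size and are aligned to the same grid; `mosaic` is a hypothetical helper, each tile is given as `(xmin, ymax, cells)`, and later tiles overwrite earlier ones where they overlap.

```python
def mosaic(tiles, cell_size, nodata=None):
    """Place tiles on one grid covering the union of their extents.

    Each tile is (xmin, ymax, cells) with cells as a list of rows.
    Later tiles overwrite earlier ones where they overlap.
    """
    xmin = min(t[0] for t in tiles)
    ymax = max(t[1] for t in tiles)
    xmax = max(t[0] + len(t[2][0]) * cell_size for t in tiles)
    ymin = min(t[1] - len(t[2]) * cell_size for t in tiles)
    cols = round((xmax - xmin) / cell_size)
    rows = round((ymax - ymin) / cell_size)
    out = [[nodata] * cols for _ in range(rows)]
    for tx, ty, cells in tiles:
        # Offset of this tile's top-left cell within the output grid.
        col0 = round((tx - xmin) / cell_size)
        row0 = round((ymax - ty) / cell_size)
        for r, line in enumerate(cells):
            for c, value in enumerate(line):
                if value != nodata:
                    out[row0 + r][col0 + c] = value
    return out

a = (0.0, 2.0, [[1, 1], [1, 1]])   # covers x 0..2, y 0..2
b = (1.0, 1.0, [[2, 2], [2, 2]])   # covers x 1..3, y -1..1
result = mosaic([a, b], cell_size=1.0)
```

The output covers the union of both extents, with `nodata` where neither tile has cells, which differs from a join that keeps only the left-hand extent.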
Dzianis Dus
Hi there! I wonder if it is possible to use RasterFrames with structured streaming? I haven't seen any mention of this in the docs and it is not clear from a first look at the code. If yes, are there any specific best practices I should know?
Also, I see that RasterFrames supports writes to parquet files. So, is there an option to write a batch / stream into Databricks Delta Lake format?
Manrique Vargas
We now want to try rasterframes 0.10., GDAL 3.5. (which now supports ADLS), and an LTS Databricks runtime > 9.1. We would also like to set this up using a docker image to make this environment more stable, and I note the repo https://github.com/databrickslabs/geospatial-docker is no longer available. I was wondering if you have any suggestions.