Simeon H.K. Fitch
@metasim
I have plenty of space. If the file exists, it should skip installing it. Literally nothing has changed in the build definition since the last release. This is one of those mysterious changes in the python universe.
It's a serious impediment: I'm having to manually delete the .egg dir between the package and test stages, because python setup.py test wants to reinstall stuff again.
If I could throw code out the window, I would right now. setuptools is a complete abomination.
Repeatability is the first principle in package and dependency management.
Every time I try to do a little work on RF with my limited free time, I end up spending all of it fighting Python.
Simeon H.K. Fitch
@metasim
The ecosystem is fundamentally flawed.
If there are any Python experts out there in the community, I sure could use your help.
Simeon H.K. Fitch
@metasim
:balloon: :balloon: RasterFrames 0.10.1 is Released! :balloon: :balloon:
Release notes: https://github.com/locationtech/rasterframes/releases/tag/0.10.1
Artifacts deployed to Maven Central and PyPi.
Documentation has not been updated :disappointed:
Manrique Vargas
@mv1742
Hello, I installed pyrasterframes version 0.8.5 (compatible with Spark 3.*) in Databricks (runtime 6.*) and I was able to read a tif file from modis-pds.s3.amazonaws.com successfully, e.g.

df = spark.read.raster('https://modis-pds.s3.amazonaws.com/MCD43A4.006/11/08/2019059/MCD43A4.A2019059.h11v08.006.2019072203257_B02.TIF')

(following https://rasterframes.io/getting-started.html)

However I am not able to load other single raster files in other locations, or my own .tif files in DBFS or in S3 buckets. For example, when reading this single raster from a public S3 bucket (see https://rasterframes.io/raster-read.html) using the command rf = spark.read.raster('https://rasterframes.s3.amazonaws.com/samples/luray_snp/B02.tif'), I get the error below:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 61 in stage 287.0 failed 4 times, most recent failure: Lost task 61.3 in stage 287.0 (TID 5589, 10.130.180.94, executor 0): java.lang.IllegalArgumentException: Error fetching data for one of: JVMGeoTiffRasterSource(https://rasterframes.s3.amazonaws.com/samples/luray_snp/B02.tif)

Similarly when reading from dbfs I get the error below:

Caused by: java.lang.UnsupportedOperationException: Reading 'dbfs:/tmp/rasterframes/raster_problem/it_template_no0.tif' not supported

Any ideas how to load a single raster as a RasterFrame in Databricks using pyrasterframes?
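
One possible workaround (untested): the second error says the dbfs: URI scheme itself is unsupported, but Databricks also exposes DBFS through a local FUSE mount at /dbfs, so pointing spark.read.raster at that filesystem path may sidestep the scheme problem. A minimal sketch, assuming a RasterFrames-enabled spark session and reusing the path from the error message:

# Sketch only: read via the DBFS FUSE mount instead of the unsupported dbfs: scheme
rf = spark.read.raster('/dbfs/tmp/rasterframes/raster_problem/it_template_no0.tif')
rf.printSchema()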

Henry Rodman
@hrodmn

Hi all, I am attempting to send a STAC query via spark.read.stacapi on v0.10.1, and the filters I am sending do not seem to be applied to the resulting dataframe (the rows show bounding boxes that do not overlap with my supplied bounding box). Do you have any suggestions for applying the filters in a different way?

Here is what I tried:

from pyrasterframes.utils import create_rf_spark_session

# the imported helper was never called; the session is needed below
spark = create_rf_spark_session()

bbox = [-92.2646, 46.6930, -92.0276, 46.9739]
uri = 'https://earth-search.aws.element84.com/v0'
query_params = {
    'collections': ['sentinel-s2-l2a-cogs'],
    'datetime': '2021-06-01/2021-06-30',
    'bbox': bbox,
}

df = spark.read.stacapi(uri, filters=query_params)
df.select(df.id, df.bbox).limit(5).show()

When I run the same query using pystac_client I get 24 images back:

import pystac_client
catalog = pystac_client.Client.open(uri)
all_items = catalog.search(**query_params).get_all_items()
len(all_items)

I found the stacapi read method on GitHub but am not sure the next place to look.

Grigory
@pomadchin
hey @hrodmn let me have a look
@hrodmn do you have a full pystac example?
Henry Rodman
@hrodmn
@pomadchin here is the full pystac example:
import pystac_client

uri = 'https://earth-search.aws.element84.com/v0'
query_params = {
    'collections': ['sentinel-s2-l2a-cogs'],
    'datetime': '2021-06-01/2021-06-30',
    'bbox': [-92.2646, 46.6930, -92.0276, 46.9739],
}

catalog = pystac_client.Client.open(uri)

all_items = catalog.search(**query_params).get_all_items()

# to check on the results
for item in all_items:
    print(item)
Grigory
@pomadchin
@hrodmn :+1: will look in a bit
Grigory
@pomadchin
@hrodmn 1. the datetime is in the incorrect format, it should be ISO; 2. there is indeed some weird bug with bbox propagating? I need to clarify that
thx for reporting
try to use the polygon intersection for now
however, I'm really surprised by the output
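
For reference, the suggested polygon-intersection workaround might look like the sketch below. The intersects field name comes from the SearchFilters decoder quoted later in this thread; the GeoJSON polygon shape and ISO timestamps are assumptions, and the snippet is untested. An existing RasterFrames-enabled spark session is assumed.

# Sketch: swap `bbox` for a GeoJSON `intersects` polygon over the same AOI
uri = 'https://earth-search.aws.element84.com/v0'
xmin, ymin, xmax, ymax = -92.2646, 46.6930, -92.0276, 46.9739
query_params = {
    'collections': ['sentinel-s2-l2a-cogs'],
    'datetime': '2021-06-01T00:00:00Z/2021-06-30T23:59:59Z',  # ISO timestamps, per the advice above
    'intersects': {
        'type': 'Polygon',
        'coordinates': [[[xmin, ymin], [xmax, ymin], [xmax, ymax],
                         [xmin, ymax], [xmin, ymin]]],
    },
}
df = spark.read.stacapi(uri, filters=query_params)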
Grigory
@pomadchin
@hrodmn there is a bug in pagination, it doesn’t quite work with https://earth-search.aws.element84.com/v0
as a workaround try smth like
{
    "collections": ["sentinel-s2-l2a-cogs"],
    "datetime": "2021-06-01T19:09:23.735395Z/2021-06-30T19:09:23.735395Z",
    "bbox": [-92.2646, 46.6930, -92.0276, 46.9739],
    "limit": 30 // increasing limit to avoid pagination
}
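
Translated into the pyrasterframes call from earlier in the thread (a sketch; the // comment above is not valid JSON, so here it becomes a Python comment):

query_params = {
    'collections': ['sentinel-s2-l2a-cogs'],
    'datetime': '2021-06-01T19:09:23.735395Z/2021-06-30T19:09:23.735395Z',
    'bbox': [-92.2646, 46.6930, -92.0276, 46.9739],
    'limit': 30,  # raise the page size so pagination is never triggered
}
df = spark.read.stacapi('https://earth-search.aws.element84.com/v0', filters=query_params)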
Grigory
@pomadchin
oh, it is not a bug, we don’t support the pagination format used by https://earth-search.aws.element84.com/v0
Henry Rodman
@hrodmn
Thank you @pomadchin! Your filter parameters yield the expected result.
Grigory
@pomadchin
@hrodmn I filed an issue here azavea/stac4s#495 but honestly don’t know when we’ll have time to address it
Grigory
@pomadchin
@hrodmn I fixed the behavior, it turned out to be not that complicated. In the next RF release it will be fixed :tada:
evanatomicmaps
@evanatomicmaps
Can someone tell me if/how I can pass a RasterFrames masked tile to rasterio.features.shapes to get geometries of cells with data?
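
One possible approach, as an untested sketch: collect a tile to the driver, take its cells as a numpy masked array, and hand the data plus an inverted mask to rasterio.features.shapes. This assumes the pyrasterframes collect bridge yields Tile objects with a cells masked array; without a transform argument, the geometries come back in pixel coordinates.

import numpy as np
import rasterio.features

# Sketch: `df` is assumed to be a RasterFrame with a `tile` column
tile = df.select('tile').first()['tile']      # pyrasterframes Tile on the driver
cells = np.ma.asanyarray(tile.cells)          # masked array; the mask marks NoData

# shapes() wants an int16/int32/uint8/uint16/float32 array plus a boolean mask
# where True means "include this cell"
for geom, value in rasterio.features.shapes(
        cells.data.astype('float32'),
        mask=~np.ma.getmaskarray(cells)):
    print(geom, value)                        # geom is GeoJSON-like, in pixel coords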
Adrian Klink
@aklink
Off-Topic question: Has anyone ever tested combining rasterframes with horovod?
https://horovod.readthedocs.io/en/stable/spark_include.html
wxmimperio
@imperio-wxm
How do I customize a datasource? Is there any documentation or instructions for reference?
Grigory
@pomadchin
hey @imperio-wxm which datasource do you want to customize? In general there are none; I don't think Spark has official intros / docs related to the DataSources API
DonjetaR
@DonjetaR
Hi, I am interested in working with RasterFrames versions >0.10 because of the Spark 3 support. However, I see that the code is on the "develop" branch on GitHub. Do you have a plan for when it will be merged to main or released as an official stable version? Is the RasterFrames 0.10.1 version a stable release?
Grigory
@pomadchin
Hey @DonjetaR it is a stable release
wxmimperio
@imperio-wxm
I ran the STAC API spark reader test on the develop branch and got this error:

java.lang.ArrayIndexOutOfBoundsException: 28499
    at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:532)
    at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:315)
    at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:102)
    at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:76)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:45)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:59)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:59)
    at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
    at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.findConstructorParam$1(BeanIntrospector.scala:59)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$19(BeanIntrospector.scala:181)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
    at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
    at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
    at scala.collection.TraversableLike.map(TraversableLike.scala:286)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
    at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14(BeanIntrospector.scala:175)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14$adapted(BeanIntrospector.scala:174)
    at scala.collection.immutable.List.flatMap(List.scala:366)
    at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.apply(BeanIntrospector.scala:174)
    .......
    .......
    at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:142)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:132)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:131)
    at org.apache.spark.sql.Dataset.rdd$lzycompute(Dataset.scala:3241)
    at org.apache.spark.sql.Dataset.rdd(Dataset.scala:3239)
    at org.locationtech.rasterframes.datasource.stac.api.StacApiDataSourceTest.$anonfun$new$4(StacApiDataSourceTest.scala:67)
Grigory
@pomadchin
hey @imperio-wxm how do you run tests?
is it via IntelliJ IDEA?
wxmimperio
@imperio-wxm
@pomadchin Hi, I run tests in IntelliJ IDEA.
Grigory
@pomadchin
@imperio-wxm all right, indeed there is a bug / feature in how IDEA resolves dependencies for some reason; exclude the old paranamer dependency in IDEA
Or use SBT to run tests
wxmimperio
@imperio-wxm
@pomadchin Hi, which old paranamer dependency needs to be excluded?
Grigory
@pomadchin
If you look into the project deps, there will be two com.thoughtworks.paranamer deps in the project; just exclude the older one
It's a great question how the old one gets into the IDEA project deps tree
If you can’t figure it out just use a plain sbt to run tests
wxmimperio
@imperio-wxm
@pomadchin thanks, excluding version 2.3 works.
Grigory
@pomadchin
@imperio-wxm :+1: nice
Simeon H.K. Fitch
@metasim
@imperio-wxm Warning: if you reload the project in IntelliJ, you'll have to do the manual exclude again :/
(I think it has to do with IntelliJ merging the project dependencies with sbt plugin dependencies)
Grigory
@pomadchin
wow
wxmimperio
@imperio-wxm
Are the query and paginationToken parameters in StacApi actually used? I didn't find them in the test cases and don't know how to use these two parameters.
implicit val searchFiltersDecoder: Decoder[SearchFilters] = { c =>
    for {
      bbox              <- c.downField("bbox").as[Option[Bbox]]
      datetime          <- c.downField("datetime").as[Option[TemporalExtent]]
      intersects        <- c.downField("intersects").as[Option[Geometry]]
      collectionsOption <- c.downField("collections").as[Option[List[String]]]
      itemsOption       <- c.downField("ids").as[Option[List[String]]]
      limit             <- c.downField("limit").as[Option[NonNegInt]]
      query             <- c.get[Option[Map[String, List[Query]]]]("query")
      paginationToken   <- c.get[Option[PaginationToken]]("next")
    } yield {
      SearchFilters(
        bbox,
        datetime,
        intersects,
        collectionsOption getOrElse Nil,
        itemsOption getOrElse Nil,
        limit,
        query getOrElse Map.empty,
        paginationToken
      )
    }
  }
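Reading the field names off that decoder, a filters payload exercising both parameters might look like the sketch below. The JSON shape of Query is assumed from the STAC API query extension and is not verified against stac4s, and per the reply that follows, this codec is outdated anyway.

# Hypothetical sketch only; field names taken from the decoder above
filters = {
    'collections': ['sentinel-s2-l2a-cogs'],
    'query': {'eo:cloud_cover': [{'lt': 10}]},  # "query": map of field -> list of Query predicates (assumed shape)
    'next': '<opaque pagination token from a previous page>',  # decoded into paginationToken
}
df = spark.read.stacapi('https://earth-search.aws.element84.com/v0', filters=filters)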
Grigory
@pomadchin
hey @imperio-wxm that’s an outdated codec; RF needs an upgrade to https://github.com/azavea/stac4s/releases/tag/v0.8.0
not sure what your question is though...