Jan Van den bosch
@bossie
good afternoon my friends
I'm instantiating Geotiff*RasterSources with an s3:// URI; that used to work fine but now I have to support reading geotiffs in buckets in different regions at the same time. How do I tell the underlying S3Client with which region it should communicate?
Grigory
@pomadchin
hey @bossie you can configure client via the S3ClientProducer.set method
Jan Van den bosch
@bossie
yes but that singleton is not going to work in this multithreaded application
Grigory
@pomadchin
@bossie oh fair enough hm
well i’d say that you could have different purpose clients in this case for different regions
allocating a client per thread is certainly an option but a very slow one
:D
could you mock up some code to demonstrate the API usage and what you want (ideally) to achieve?
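A minimal sketch of the "one client per region" idea, assuming the clients are safe to share between threads. `RegionalClients`, `forRegion` and `FakeClient` are hypothetical names used only for illustration; `C` stands in for the real client type, and nothing here is GeoTrellis API:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

/** Caches one client per region. `C` is a placeholder for the real client
  * type; this class and its names are hypothetical. */
final class RegionalClients[C](build: String => C) {
  private val cache = new ConcurrentHashMap[String, C]()

  /** Returns the cached client for `region`, building it on first use.
    * computeIfAbsent applies `build` at most once per key, even when
    * several threads ask for the same region at the same time. */
  def forRegion(region: String): C =
    cache.computeIfAbsent(region, new java.util.function.Function[String, C] {
      def apply(r: String): C = build(r)
    })
}

object RegionalClientsDemo {
  // Stand-in "client" that only remembers its region.
  final case class FakeClient(region: String)

  // Counts how many clients were actually constructed.
  val built = new AtomicInteger(0)

  val clients = new RegionalClients[FakeClient](r => {
    built.incrementAndGet()
    FakeClient(r)
  })
}
```

With the AWS SDK v2 the builder function would presumably be something like `r => S3Client.builder().region(Region.of(r)).build()`. How such a client is then handed to a specific RasterSource is not covered by this sketch.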
Grigory
@pomadchin
hey @chdsb could you clear it up: which value is different, and when? You mean that after the ingestion you’re reading a chip via the collection reader and the min value differs from the one in the source? If I understood it correctly, then it is due to reprojection + resampling + regridding, since in your case the ingestion was done into GeoTrellis layers. That is a precision issue
@chdsb I think it just happened so (: it could also not match
Grigory
@pomadchin
if you want a complete match to the source - I would recommend operating on unchanged rasters via the RasterSource API
@chdsb answering the second question: how do you read sources into the RDD at the moment?
chdsb
@chdsb
@pomadchin hadoopGeoTiffRDD
Grigory
@pomadchin

@chdsb okay, and you would like to use smth like sc.hadoopTemporalGeoTiffRDD but pass the time in some other way than by setting TIFF tags.

You could use HadoopGeoTiffRDD.singleband directly:

Where I = ProjectedExtent and K = TemporalProjectedExtent; the function you’re interested in is uriToKey: (URI, I) => K. You can get the temporal information from the path of the tiff (that used to be one of the ways temporal metadata could be encoded) and add it to the SpatialComponent
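A rough sketch of the "time from the path" part, using only the standard library. The naming convention (an 8-digit `yyyyMMdd` date somewhere in the path) and the names `PathTime` and `timeFromUri` are assumptions made for illustration, not something GeoTrellis prescribes:

```scala
import java.net.URI
import java.time.{LocalDate, ZoneOffset, ZonedDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

object PathTime {
  // Hypothetical convention: the path contains a yyyyMMdd date,
  // e.g. .../scene_20200521.tif
  private val DatePattern = """\d{8}""".r
  private val Format = DateTimeFormatter.BASIC_ISO_DATE

  /** Extracts the first 8-digit run from the URI path and parses it as a
    * UTC date; returns None when there is no such run or it is not a
    * valid calendar date. */
  def timeFromUri(uri: URI): Option[ZonedDateTime] =
    for {
      path <- Option(uri.getPath)
      raw  <- DatePattern.findFirstIn(path)
      date <- Try(LocalDate.parse(raw, Format)).toOption
    } yield date.atStartOfDay(ZoneOffset.UTC)
}
```

Inside `uriToKey` the result would presumably be combined with the extent and CRS of the incoming `ProjectedExtent` to build the `TemporalProjectedExtent`; what to do when no date is found (fail, skip, or use a default) is left open here.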

林海听枫
@stjimreal
@pomadchin great! I will check it.
Grigory
@pomadchin
@yang162132 yea you’re right that’s a bug
a typo
(:
tech11-much
@tech11-much
when the tif is bigger than 800 MB, I just can't get the RDD of the tif. Please tell me how to use the lib.
image.png
Grigory
@pomadchin
Hey @tech11-much , what is the error you are seeing?
tech11-much
@tech11-much
@pomadchin the function hadoopGeoTiffRDD can't read a big tiff file
Grigory
@pomadchin
@tech11-much how is that so? what does "can’t read" mean? does it throw an error, or what happens?
tech11-much
@tech11-much
image.png
Grigory
@pomadchin
how large is this tiff?
tech11-much
@tech11-much
2 gb
a 1.5gb makes a different error
Grigory
@pomadchin
2gb is a tiny tiff
I guess, if it is not very well compressed and it is not a bit cell typed tiff

what errors do you have? typically that means that either you’re pointing to the wrong file or smth else is happening

Also, what GT version are you on?

tech11-much
@tech11-much
image.png
maybe the tif is wrong
Grigory
@pomadchin
@tech11-much well first of all the GT version is too old, try with 3.6.0
the other issue may well be related to the TIFF structure; you can try to access it using the GDALRasterSource
jterry64
@jterry64

I'm running into the issue discussed in this bug: https://github.com/locationtech/geotrellis/issues/3184#issuecomment-595326991
The same read issue happens to me randomly (though not frequently), but I'm not sure I totally understand the resolution. I'm using geotrellis 3.5.2, GDAL 3.1.2, and looks like the GDAL warp jar is 1.1.1. So it seems like it should've had the fix.

I'm also doing this in an environment where it's getting data from a S3 bucket using requester pays, but it definitely has AWS credentials and seems to read fine 99% of the time. Could this be related to the parallel reads like discussed in the bug?

jterry64
@jterry64
And we also run this process in the AWS account that owns the S3 bucket, and I can't seem to find the same errors there
Grigory
@pomadchin
hey @jterry64 what error exactly do you have?
I remember this one, was a tricky bug

but what was really going on in this issue is that GDAL / S3 reads sometimes time out due to load etc.
Increase the number of attempts;

the error code for increased amount of attempts is 100

In earlier versions it also overlapped with a deadlock on GDAL datasets
jterry64
@jterry64
Ah ok, I'm getting this Error Code 4, saying it can't find the NoData value. But I can definitely see it's set if I use gdalinfo, and most other times it reads the same TIF file fine:
FAILURE(3) CPLE_AppDefined(1) "Application defined error." _TIFFPartialReadStripArray:Cannot read offset/size for strile around ~9905
FAILURE(3) CPLE_OpenFailed(4) "Open failed." `/vsis3/gfw-data-lake/umd_tree_cover_density_2000/v1.6/raster/epsg-4326/10/40000/percent/gdal-geotiff/10N_020E.tif' not recognized as a supported file format.
ERROR TreeLossRDD$: Feature 3: {}
geotrellis.raster.gdal.MalformedDataTypeException: Unable to determine NoData value. GDAL Exception Code: 4
        at geotrellis.raster.gdal.GDALDataset$.$anonfun$noDataValue$1(GDALDataset.scala:313)
        at geotrellis.raster.gdal.GDALDataset$.$anonfun$noDataValue$1$adapted(GDALDataset.scala:310)
        at geotrellis.raster.gdal.GDALDataset$.errorHandler$extension(GDALDataset.scala:406)
        at geotrellis.raster.gdal.GDALDataset$.noDataValue$extension1(GDALDataset.scala:310)
        at geotrellis.raster.gdal.GDALDataset$.cellType$extension1(GDALDataset.scala:366)
        at geotrellis.raster.gdal.GDALDataset$.readTile$extension(GDALDataset.scala:383)
        at geotrellis.raster.gdal.GDALDataset$.$anonfun$readMultibandTile$1(GDALDataset.scala:400)
        at geotrellis.raster.gdal.GDALDataset$.$anonfun$readMultibandTile$1$adapted(GDALDataset.scala:400)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.Iterator.foreach(Iterator.scala:941)
        at scala.collection.Iterator.foreach$(Iterator.scala:941)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
        at scala.collection.IterableLike.foreach(IterableLike.scala:74)
        at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
        at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at geotrellis.raster.gdal.GDALDataset$.readMultibandTile$extension(GDALDataset.scala:400)
        at geotrellis.raster.gdal.GDALRasterSource.$anonfun$readBounds$2(GDALRasterSource.scala:107)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:459)
        at geotrellis.raster.gdal.GDALRasterSource.read(GDALRasterSource.scala:158)
        at geotrellis.layer.LayoutTileSource.$anonfun$read$1(LayoutTileSource.scala:80)
        at scala.Option.flatMap(Option.scala:271)
        at geotrellis.layer.LayoutTileSource.read(LayoutTileSource.scala:79)
        at geotrellis.layer.LayoutTileSource.read(LayoutTileSource.scala:61)
        at org.globalforestwatch.layers.RequiredILayer.fetchWindow(Layer.scala:212)
        at org.globalforestwatch.layers.RequiredILayer.fetchWindow$(Layer.scala:208)
        at org.globalforestwatch.layers.TreeCoverDensityPercent2000.fetchWindow(TreeCoverDensity.scala:42)
        at org.globalforestwatch.summarystats.treecoverloss.TreeLossGridSources.$anonfun$readWindow$5(TreeLossGridSources.scala:43)
        at cats.syntax.EitherObjectOps$.catchNonFatal$extension(either.scala:338)
        at org.globalforestwatch.summarystats.treecoverloss.TreeLossGridSources.$anonfun$readWindow$4(TreeLossGridSources.scala:43)
        at scala.util.Either$RightProjection.flatMap(Either.scala:701)
        at org.globalforestwatch.summarystats.treecoverloss.TreeLossGridSources.$anonfun$readWindow$2(TreeLossGridSources.scala:41)
        at scala.util.Either$RightProjection.flatMap(Either.scala:701)
        at org.globalforestwatch.summarystats.treecoverloss.TreeLossGridSources.readWindow(TreeLossGridSources.scala:40)
Grigory
@pomadchin

@jterry64 yea, this is a new one;

What GDAL version do you use?

ignore the GT error; the important lines are:
FAILURE(3) CPLE_AppDefined(1) "Application defined error." _TIFFPartialReadStripArray:Cannot read offset/size for strile around ~9905
FAILURE(3) CPLE_OpenFailed(4) "Open failed." `/vsis3/gfw-data-lake/umd_tree_cover_density_2000/v1.6/raster/epsg-4326/10/40000/percent/gdal-geotiff/10N_020E.tif' not recognized as a supported file format.
try increasing the number of attempts
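For intermittent read failures like this one, an application-level retry around the read can also be sketched with the standard library alone. This is not GeoTrellis's own attempts setting (its configuration key is not given in this thread); `Retry` and `withRetries` are hypothetical names:

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

object Retry {
  /** Runs `op` up to `maxAttempts` times, sleeping `delayMs` after each
    * failed attempt, and returns the first success or the last failure.
    * Only non-fatal exceptions are caught (that is what Try does). */
  @tailrec
  def withRetries[A](maxAttempts: Int, delayMs: Long = 0L)(op: () => A): Try[A] =
    Try(op()) match {
      case s @ Success(_)                     => s
      case f @ Failure(_) if maxAttempts <= 1 => f
      case Failure(_) =>
        if (delayMs > 0) Thread.sleep(delayMs)
        withRetries(maxAttempts - 1, delayMs)(op)
    }
}
```

A call would look like `Retry.withRetries(5, delayMs = 200)(() => readWindow())`, where `readWindow` is a placeholder for whatever performs the raster read. Retrying only hides transient failures; it does not help if the file is really unreadable.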
yang162132
@yang162132
Hey @pomadchin! I found that GeoTrellis can work with machine learning, but how do I do it? Does it only work with COG tiffs, or with both RDDs and COGs?
Grigory
@pomadchin
hey @yang162132 I don’t quite follow the question; GT is a tool for IO and Raster types + operations. Perhaps you are asking about its usage with SparkML - in this case I recommend looking into RasterFrames - it is a library that brings GT capabilities into the Dataset / Dataframe world + a lot of extra features to help with Dataframe manipulations
vkassiano10
@vkassiano10
Hello everyone! Complete noob to GeoTrellis here! Quick question: Does the S3 GeoTiff file reader work with files stored in DigitalOcean storage?
Grigory
@pomadchin
hey @vkassiano10 we’re using AWS S3 SDK v2; just configure the client properly and set it via https://github.com/locationtech/geotrellis/blob/master/s3/src/main/scala/geotrellis/store/s3/S3ClientProducer.scala