Frank Dekervel
@kervel
I see that all the other implicit classes there are serializable, so I think it's a bug that can be fixed by adding "extends Serializable" after the class definition
Simeon H.K. Fitch
@metasim
WRT @vpipkt 's question/analysis, why is this block of code necessary?
https://github.com/s22s/geotrellis/blob/9d8ceffc435906a1d18e1283bda5e22b1b4f37e4/gdal/src/main/scala/geotrellis/raster/gdal/GDALUtils.scala#L45-L57
Does GDAL not have a built-in indicator for unsigned byte cell type?
The issue at hand is that we want all the metadata about a scene/tile, including the cell type, and only want to be reading the header. This min/max scan of the cells triggers a larger read of the data, eliminating the ability to reason about (e.g. filter) a tile on metadata alone.
Grigory
@pomadchin

@metasim yes, to determine the cellType correctly we need bitsPerSample and sampleFormat: bitsPerSample is the number of bits per TIFF sample, and sampleFormat is used to determine the sign of the values.

LIBTIFF exposes this information and in geotiff native jvm reader we also use this information to determine the cellType properly: https://github.com/locationtech/geotrellis/blob/7871e015129ae4a1ccfd06a23544a6de9ac68c6e/raster/src/main/scala/geotrellis/raster/io/geotiff/BandTypes.scala#L37

However GDAL doesn’t expose such information, or I didn’t find a proper way of doing it. Probably you can find smth in the GDAL C API and somehow get the information from the used driver about the raster cellType, but I’m not sure that it is possible to expose such information via GDAL
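A minimal sketch of the header-only idea described above, assuming only the two TIFF tags are available. The SampleFormat values (1 = unsigned integer, 2 = signed integer, 3 = IEEE float) come from the TIFF 6.0 specification; the object name, the function name and the returned labels are hypothetical and are not the GeoTrellis BandTypes API.

```scala
// Hypothetical sketch: names and labels are illustrative, not GeoTrellis or GDAL API.
object CellTypeSketch {
  // TIFF SampleFormat tag values as defined by the TIFF 6.0 specification.
  val UnsignedInt = 1
  val SignedInt   = 2
  val IeeeFloat   = 3

  // Derive a cell type label from the two header tags only; no pixel data is read.
  def cellTypeLabel(bitsPerSample: Int, sampleFormat: Int): String =
    (bitsPerSample, sampleFormat) match {
      case (1, UnsignedInt)  => "bit"
      case (8, UnsignedInt)  => "uint8"
      case (8, SignedInt)    => "int8"
      case (16, UnsignedInt) => "uint16"
      case (16, SignedInt)   => "int16"
      case (32, SignedInt)   => "int32"
      case (32, IeeeFloat)   => "float32"
      case (64, IeeeFloat)   => "float64"
      case _ =>
        throw new IllegalArgumentException(
          s"unsupported combination: bitsPerSample=$bitsPerSample, sampleFormat=$sampleFormat")
    }
}
```

The point of the sketch is that the signed/unsigned decision for 8-bit data needs sampleFormat; bitsPerSample alone cannot distinguish the two.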

Simeon H.K. Fitch
@metasim
Ouch... what a PITA!
Grigory
@pomadchin
yea, without this ‘conversion function’ GDALRasterSource and all GeoTrellis types would behave really differently
Simeon H.K. Fitch
@metasim
I knew it was in GeoTIFF based on the JVM reader.
Surprised GDAL didn't handle it more gracefully.
Grigory
@pomadchin
they just use different types for tiffs
Simeon H.K. Fitch
@metasim
I'd have expected the NoData handling part... makes sense
I think @vpipkt has a workaround that will keep min/max from being computed for all other cell types (because the parameter is call-by-name), but it will still happen for byte types.
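A minimal sketch of how a call-by-name parameter defers the scan, assuming a made-up decision rule. The names scanMinMax and resolve, the scan counter and the "negative minimum means signed" rule are all hypothetical; this is not the code from the workaround mentioned above.

```scala
// Hypothetical sketch of deferring an expensive scan with a call-by-name parameter.
object ByNameSketch {
  var scans = 0 // counts how many times the expensive scan actually ran

  // Stand-in for a full pass over the cells.
  def scanMinMax(cells: Array[Int]): (Int, Int) = {
    scans += 1
    (cells.min, cells.max)
  }

  // `minMax` is call-by-name: the argument expression is evaluated
  // only if (and each time) the body refers to it.
  def resolve(isByte: Boolean, minMax: => (Int, Int)): String =
    if (isByte) {
      val (lo, _) = minMax // the scan is triggered here, exactly once
      if (lo < 0) "int8" else "uint8" // made-up rule, for illustration only
    } else "other" // minMax is never referenced, so no scan happens
}
```

Calling ByNameSketch.resolve(false, ByNameSketch.scanMinMax(cells)) leaves the counter untouched, while the byte branch still pays for one scan, which matches the limitation described in the message.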
Grigory
@pomadchin
¯\_(ツ)_/¯ I would also recommend double-checking the GDAL API; maybe there is something we can pull this information from
Simeon H.K. Fitch
@metasim
will do
After I finish the GT 3.x upgrade ;-)
Grigory
@pomadchin
:D
yo @kervel to make this function work you only need to have import geotrellis.spark._ in the call scope. If you have it and the function still doesn't work, try to use withGetBoundsMethod(rdd).getGridBounds in your code; this will at least give you a readable compile-time error.
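A minimal sketch of the debugging trick, using a stand-in wrapper so it runs without GeoTrellis or Spark. The names withDoubledMethod and doubled are hypothetical; the only thing carried over from the message is the pattern of calling the implicit class by name instead of relying on the implicit conversion.

```scala
// Hypothetical stand-in for an extension-method wrapper such as withGetBoundsMethod.
object ExplicitWrapperSketch {
  implicit class withDoubledMethod(self: List[Int]) {
    def doubled: List[Int] = self.map(_ * 2)
  }

  // Usual form: the compiler inserts the wrapper implicitly.
  val viaImplicit: List[Int] = List(1, 2, 3).doubled

  // Explicit form: if the argument type does not fit, the compiler reports a
  // type mismatch on this call rather than "value doubled is not a member of ...".
  val viaExplicit: List[Int] = withDoubledMethod(List(1, 2, 3)).doubled
}
```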
Jason T Brown
@vpipkt
My reading of this GDAL doc is that there will never be a signed Byte type. And that would mean that the first case here is always true and the second is never evaluated?
we have a somewhat richer type system; it is wider than what GDAL provides
also we just store data differently; we don't have unsigned types in Java (: so any byte TIFF is by default in the [-128; 127] range, and UByte allows interpreting the values as [0; 255]
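A small sketch of the signed/unsigned reading of the same stored byte. The object and function names are hypothetical; the masking with 0xFF is standard JVM arithmetic.

```scala
// Hypothetical names; shows one stored byte read as signed and as unsigned.
object UnsignedByteSketch {
  // The JVM Byte is signed, so the raw sample 0xFF reads as -1.
  val raw: Byte = 0xFF.toByte

  // Masking with 0xFF widens to Int and recovers the unsigned reading.
  def asUnsigned(b: Byte): Int = b & 0xFF
}
```

Here UnsignedByteSketch.raw is -1 as a signed byte, and UnsignedByteSketch.asUnsigned(UnsignedByteSketch.raw) is 255, so a value such as 255 only exists under the unsigned interpretation.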
Grigory
@pomadchin
ha @vpipkt mb you’re right
Frank Dekervel
@kervel
@pomadchin I know where the error is coming from: the withGetBoundsMethod class takes the rdd as an instance variable, and needs to be serializable because of this
Grigory
@pomadchin
Ah... the NoData value can probably be below zero, because we don't have ubytes
but still check it, e.g. what if the NoData value is 255
Frank Dekervel
@kervel
Just like the other implicit classes in the same file, which already are.
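A minimal sketch of why a wrapper that holds data as a field needs to extend Serializable, using plain Java serialization and no Spark. The wrapper classes and canSerialize are hypothetical; the data field stands in for the rdd held by the real implicit class.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical wrappers; `data` stands in for the RDD held as an instance variable.
object SerializableWrapperSketch {
  class PlainWrapper(val data: Vector[Int]) {
    def total: Int = data.sum
  }

  class SerializableWrapper(val data: Vector[Int]) extends Serializable {
    def total: Int = data.sum
  }

  // Returns true when Java serialization accepts the object.
  def canSerialize(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

In this sketch only the second wrapper serializes, even though both hold the same serializable field; a closure that captures an instance of the first one would be expected to fail in the same way.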
Grigory
@pomadchin
@kervel maybe; do you have serialization issues?
Frank Dekervel
@kervel
Yes. But the fix is simple
Grigory
@pomadchin
Nice! Would you like to create a PR marking this implicit class as serializable?
Frank Dekervel
@kervel
Ok.
Grigory
@pomadchin
:100:

It would be awesome if you could sign an ECA https://github.com/locationtech/geotrellis/blob/master/docs/CONTRIBUTING.rst#eclipse-contributor-agreement-eca

After that, commit the changes via the git commit -s -m "commit message" command

Frank Dekervel
@kervel
Ok.
Jason T Brown
@vpipkt
@pomadchin i am not sure how to proceed, but my understanding is that since GDAL's type system is narrower, a GDT_Byte should always result in either UByteCellType, UByteUserDefinedNoDataCellType or UByteConstantNoDataCellType
the function as called will never return a BitCellType either because no typeSizeInBits is passed
(at least as called in this context)
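A minimal sketch of the mapping described above, assuming that GDT_Byte is always unsigned and that the NoData value alone picks the variant. The function name is hypothetical, the results are plain strings rather than the GeoTrellis types, and treating 0 as the constant NoData value is an assumption to verify against the library.

```scala
// Hypothetical sketch: returns type names as strings, not GeoTrellis CellType values.
object GdtByteSketch {
  // Assumption: 0 is the "constant" NoData value for unsigned bytes.
  def ubyteCellTypeName(noData: Option[Double]): String = noData match {
    case None      => "UByteCellType"
    case Some(0.0) => "UByteConstantNoDataCellType"
    case Some(_)   => "UByteUserDefinedNoDataCellType"
  }
}
```

Under these assumptions no min/max scan is needed for byte data, since the header's data type and NoData value already determine the result.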
Jason T Brown
@vpipkt
@pomadchin happy to keep the discussion going on this PR: locationtech/geotrellis#3150
Grigory
@pomadchin
@vpipkt can you cover it with a test? Then we would know that GDALRasterSource behaves the same way GeoTiffRasterSource does
Simeon H.K. Fitch
@metasim
If I have, say, an RDD[ProjectedRaster[Int]] (heterogeneous in CRS) and want to write them all out to HDFS, isn't there an "unstructured cog writer" feature in GT?
Or rather, is that the right/best option?
I don't want to reproject anything or resample it into a new grid.
Grigory
@pomadchin
@metasim nope we didn’t solve it :s
There is still the problem of how you would collect the metadata of multiple raster sources
Simeon H.K. Fitch
@metasim
Did you do some experimental work on it?... seem to remember some code somewhere ages ago....
Simeon H.K. Fitch
@metasim
where you were writing out debugging vrt files? or am I making that up.
Grigory
@pomadchin

@metasim we didn't resolve that; imagine the case where you have multiple RasterSources
and you want to figure out how to tile them to some layout in some projection…

it would mean that you probably need to collect the metadata in some CRS, or reproject it into some CRS, to have some idea of the extent of the entire list of rasters

ahhhh you're talking about persisting
and not about reading
hmmm can you describe your use case a little bit more?
I'm just flying in a different context at the moment
Simeon H.K. Fitch
@metasim
yeh, sorry. Say you want to write out a bunch of DL chips as geotiffs, organized via some arbitrary partitioning scheme. Maybe some json sidecars for metadata (or not).