Grigory
@pomadchin
@15952026052 how is it possible that level 12 has the higher resolution?
Frank Dekervel
@kervel
hello, i have N tilesets in webmercator (see, i am getting further and further :D). every tileset is sparse and the overlap between the tilesets is small. i need to merge them into one big tileset. this shouldn't be an expensive operation, since it will just involve copying 90% of the tiles and merging 10% of them, but when doing this with spark a tileset is an RDD[(SpatialKey, Tile)], so merging N tilesets will require me to load all tiles into memory (even the ones that just need to be copied?)... or am i not understanding how a PairRDD works?
N ~ 200, with every tileset having a couple of hundred 1024x1024 tiles.
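For what it's worth, a copy-mostly merge like this can be sketched as a single `union` plus `reduceByKey` pass. This is a sketch, not GeoTrellis' own API; the helper name is made up, and `SpatialKey` lives in `geotrellis.layer` on GT 3.x (`geotrellis.spark` on 2.x):

```scala
import geotrellis.layer.SpatialKey
import geotrellis.raster._  // brings the implicit Tile.merge methods into scope
import org.apache.spark.rdd.RDD

// Merge N sparse tilesets in one shuffle. `union` is lazy, and
// `reduceByKey` only combines tiles that share a key, so the ~90% of
// keys that occur in exactly one tileset pass through untouched.
// Spark streams partitions, so this does not require all tiles to be
// resident in memory at once.
def mergeTilesets(tilesets: Seq[RDD[(SpatialKey, Tile)]]): RDD[(SpatialKey, Tile)] =
  tilesets
    .reduce(_ union _)
    .reduceByKey((left, right) => left.merge(right))
```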
Frank Dekervel
@kervel
a column-based data structure (e.g. Dataset[(SpatialKey, Tile)]) would be much more efficient here i guess
Frank Dekervel
@kervel
hmm, that's rasterframes, right ?
Grigory
@pomadchin
hey @kervel can you tell a little bit more about what you’re trying to do? it is a bit hard to follow
Not sure that rasterframes would solve your issue btw
Frank Dekervel
@kervel
well, i have now geotrellis jobs that generate tilesets based on mobile lidar data.
every run of the mobile lidar is now converted into one set of tiles.
but i want to make one big tileset of all the runs together... normally every time the mobile lidar runs it will go somewhere else, but sometimes it will have an overlapping section with a previous run
so i go raw lidar returns --> pointcloud --> raster --> retile --> reproject --> save to hadoop for one tileset
Eugene Cheipesh
@echeipesh
@kervel You should be able to use the LayerWriter.update function to merge in the new data. The writer will check if the target tile exists and try to update it using the provided function. (Be warned: the default implementation doesn't actually do any merging, so make sure to specify a real value for mergeFunc there.)
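A sketch of that call, assuming GT 3.x import paths and a Hadoop-backed catalog (the catalog path, layer name, and zoom level are placeholders; an implicit SparkContext is assumed in scope):

```scala
import geotrellis.layer._
import geotrellis.raster._
import geotrellis.spark._
import geotrellis.spark.store.hadoop.HadoopLayerWriter
import geotrellis.store.LayerId
import org.apache.hadoop.fs.Path

val writer  = HadoopLayerWriter(new Path("hdfs:///catalog"))  // placeholder path
val layerId = LayerId("lidar-mosaic", 12)                     // placeholder name/zoom

val newTiles: TileLayerRDD[SpatialKey] = ???  // the freshly generated run

// mergeFunc decides what happens where an incoming tile overlaps an
// existing one; without it the update would overwrite rather than merge.
writer.update(layerId, newTiles, (existing: Tile, incoming: Tile) => existing.merge(incoming))
```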
Frank Dekervel
@kervel
Ok, that sounds good! Tx
Frank Dekervel
@kervel
i'm reading about rasterframes... and i think it would also help me, because raster data is lazily loaded. what i can't easily do in geotrellis is getting a list of the raster tile keys without loading them all (or can i?). but i'm going to use the LayerWriter.update method.
Glider
@esmeetu
hey, @pomadchin i have submitted PR #3147 for optimization.
Besides, i have a question about tile map algebra calculation. How do you deal with the edges when using a 3x3 window to calculate a tile's slope? The edge pixel only has three adjacent pixels, so there is a big difference in the result of the tile slope. Is there a better way to solve this?
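One common approach (a sketch, not the only answer): give the focal op a one-pixel collar of real neighbour data, then crop the collar off afterwards; GeoTrellis' layer-level focal operations do essentially this with buffered tiles. Here `buffered` and `cellSize` are placeholders:

```scala
import geotrellis.raster._

// `buffered` is the centre tile plus a 1-pixel border stitched from its
// neighbours, so slope sees real values at the centre tile's edges.
val buffered: Tile     = ???                     // placeholder
val cellSize: CellSize = CellSize(10.0, 10.0)    // placeholder resolution

val slope = buffered.slope(cellSize)

// Drop the collar so only the centre tile's pixels remain
// (crop bounds are inclusive).
val result = slope.crop(1, 1, buffered.cols - 2, buffered.rows - 2)
```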
Grigory
@pomadchin
hey @esmeetu thanks! I answered under the issue; thank you so much for the contribution
Eugene Cheipesh
@echeipesh
@kervel You can do that with GT as well. If you make `RasterSource` instances for your rasters, that's a lazy view that will at some point read metadata. You can do a metadata-only reproject or resample on them and then trade it for a LayoutTileSource if you tell it how you want the tiles laid out (LayoutDefinition) -- the LayoutTileSource will both give you a set of keys you can read and allow you to actually read them lazily.
@kervel There is more than the normal amount of scaladocs on those interfaces, but you'll still need to read a few tea leaves, as these things are new and are not described in the documentation yet.
RasterFrames will give you a DataFrame interface, which is quite nice to work against and these kinds of workflows are built into the TileUDT. So it depends on how much flexibility and which direction you need.
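A sketch of that lazy pipeline, assuming GT 3.x import paths (the URI, zoom level, and tile size are placeholders):

```scala
import geotrellis.layer._
import geotrellis.proj4.WebMercator
import geotrellis.raster.geotiff.GeoTiffRasterSource

// A RasterSource is a lazy view: constructing it reads nothing.
val source = GeoTiffRasterSource("s3://bucket/scene.tif")  // placeholder URI

// Metadata-only reproject, then trade the source for a LayoutTileSource
// by saying how the tiles should be laid out.
val layout     = ZoomedLayoutScheme(WebMercator, tileSize = 256).levelForZoom(12).layout
val tileSource = source.reproject(WebMercator).tileToLayout(layout)

// The key set comes from metadata alone -- no pixel reads yet --
// and each subsequent read is per-tile and on demand.
val keys = tileSource.keys
val tile = keys.headOption.flatMap(tileSource.read)
```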
Jason T Brown
@vpipkt
Hi! Long time user, first time caller. I am using (an older version of) GDALDataset in RasterFrames. The evaluation of various datasets' metadata is happening in parallel, and I am un-scientifically seeing a lot of calls here to get the band min and max. I think this is forcing a lot of data reads that are ultimately unnecessary. Happy to elaborate. Would you be open to a PR trying to avoid this call if possible, as in most cases it is unnecessary?
Grigory
@pomadchin
hey @vpipkt yea; feel free to create a PR; I think you’re absolutely correct about it and we would be happy to assist you with it :100:
Jason T Brown
@vpipkt
@pomadchin thanks so much, I'll request your review on it.
Grigory
@pomadchin
perfect
Frank Dekervel
@kervel
@echeipesh thx! Did not know about existence of Rastersource
Eugene Cheipesh
@echeipesh
Fresh off the mint with GT 3.0 :)
James Hughes
@jnh5y
Congrats!
Grigory
@pomadchin
@kervel ): we also don’t have any 3.0 documentation
tosen1990
@tosen1990
Anyone know how to correctly use toJson in GT 3.0?
import spray.json._

val resultpoly: List[PolygonFeature[Int]] = geotrellis.raster.vectorize.Vectorize(st.tile, st.extent)
val jsons: JsValue = WithCrs(JsonFeatureCollection(resultpoly), NamedCRS("epsg:4326")).toJson

No implicits found for parameter writer: JsonWriter[Crs[JsonFeatureCollection]]
Grigory
@pomadchin
@tosen1990 we use circe now, instead of the outdated spray-json
import io.circe.syntax._
resultpoly.asJson
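Slightly fuller, assuming the GT 3.x circe encoders for `Feature` (the value is a placeholder):

```scala
import geotrellis.vector._
import io.circe.syntax._

val resultpoly: List[PolygonFeature[Int]] = ???  // e.g. the Vectorize output above

// Feature (and Geometry) carry circe encoders in GT 3.x, so a feature
// list can be rendered to a JSON string directly.
val geojson: String = resultpoly.asJson.noSpaces
```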
tosen1990
@tosen1990
@pomadchin It works now. x:D
rexuexiangbei
@rexuexiangbei
i want to update to 3.0.0, but after i changed the version in build.sbt, the code sc.hadoopMultibandGeoTiffRDD can not run
tosen1990
@tosen1990
@rexuexiangbei
try to add this
import geotrellis.store._
import geotrellis.store.index._
Frank Dekervel
@kervel
hello, i'm upgrading to geotrellis 3.0 but i have an issue: the implicit withGetBoundsMethod is not serializable, so when i try "getBounds" on a layer it fails, but when i just copy-paste the code that's in getBounds it works fine
i don't know why scala and the spark closure cleaner decide that the object of that class needs to be in scope
i see that all the other implicit classes there are serializable, so i think it's a bug that can be fixed by adding "extends Serializable" after the class definition
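A minimal, made-up illustration of that failure mode: when an extension method's closure calls another method of the implicit wrapper, Spark has to serialize the wrapper instance itself, hence the `extends Serializable`:

```scala
import org.apache.spark.rdd.RDD

implicit class WithShiftMethod(val self: RDD[Int]) extends Serializable {
  private def shift(x: Int): Int = x + 1

  // `shift` is a method on this wrapper, so the closure below captures
  // the wrapper instance; without `extends Serializable` this fails at
  // runtime with a java.io.NotSerializableException.
  def shifted: RDD[Int] = self.map(shift)
}
```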
Simeon H.K. Fitch
@metasim
WRT @vpipkt 's question/analysis, why is this block of code necessary?
https://github.com/s22s/geotrellis/blob/9d8ceffc435906a1d18e1283bda5e22b1b4f37e4/gdal/src/main/scala/geotrellis/raster/gdal/GDALUtils.scala#L45-L57
Does GDAL not have a built-in indicator for unsigned byte cell type?
The issue at hand is that we want all the metadata about a scene/tile, including the cell type, while only reading the header. This min/max scan of the cells triggers a larger read of the data, eliminating the ability to reason about (e.g. filter) a tile on metadata alone.
Grigory
@pomadchin

@metasim yes, to determine the cellType correctly we need bitsPerSample and sampleFormat; bitsPerSample is the number of bits per tiff sample, and the sampleFormat is used to determine the ‘sign’ of values.

LIBTIFF exposes this information and in geotiff native jvm reader we also use this information to determine the cellType properly: https://github.com/locationtech/geotrellis/blob/7871e015129ae4a1ccfd06a23544a6de9ac68c6e/raster/src/main/scala/geotrellis/raster/io/geotiff/BandTypes.scala#L37

However GDAL doesn’t expose such information, or I didn’t find a proper way of doing it. Probably you can find something in the GDAL C API and somehow get the information from the used driver about the raster cellType, but I’m not sure that it is possible to expose such information via GDAL

Simeon H.K. Fitch
@metasim
Ouch... what a PITA!
Grigory
@pomadchin
yea, without this ‘conversion function’ GDALRasterSource and all GeoTrellis types would behave really differently
Simeon H.K. Fitch
@metasim
I knew it was in GeoTIFF based on the JVM reader.
Surprised GDAL didn't handle it more gracefully.
Grigory
@pomadchin
they just use different types for tiffs
Simeon H.K. Fitch
@metasim
I'd have expected the NoData handling part... makes sense
I think @vpipkt has a workaround that will keep min/max from being computed for all other cell types (because the parameter is call-by-name), but it will still happen for byte types.
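The call-by-name trick mentioned here, in plain Scala with made-up names: because the expensive argument is a `=> T`, it is only evaluated on the branch that actually uses it.

```scala
// `minMax` is call-by-name: nothing is computed unless a branch forces it.
def signedness(bandType: String, minMax: => (Double, Double)): String =
  bandType match {
    case "Byte" =>
      val (min, _) = minMax            // forced only for byte data
      if (min < 0) "int8" else "uint8"
    case other =>
      other.toLowerCase                // minMax is never evaluated here
  }

// A side effect makes the laziness observable:
var scans = 0
def expensiveScan(): (Double, Double) = { scans += 1; (0.0, 255.0) }

signedness("Float32", expensiveScan())  // scans stays 0
signedness("Byte", expensiveScan())     // scans becomes 1
```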
Grigory
@pomadchin
¯\_(ツ)_/¯ I would also recommend double-checking the GDAL API - maybe there is something we can pull this information from
Simeon H.K. Fitch
@metasim
will do
After I finish the GT 3.x upgrade ;-)
Grigory
@pomadchin
: D
yo @kervel to make this function work you only need to have import geotrellis.spark._ in the call scope. If you have it and the function still doesn’t work, try to use withGetBoundsMethod(rdd).getGridBounds in your code; this will at least give you a readable compile-time error.