rfecher
@rfecher
@adil_ali_twitter it really does look like a Scala version mismatch on the classpath, but I'm pretty confident that jar was built with and includes Scala 2.11.8 as well ... I took out the inclusion of Scala in this jar, assuming that it's being provided on the classpath elsewhere: https://drive.google.com/file/d/11G7BxAzLzKqZhHHHr3aAbfE70E5lBrid/view?usp=sharing
you can give it a shot, but I'm not particularly confident that's going to do the trick
Adil Ali
@adil_ali_twitter

@rfecher Thanks a ton for sharing the build. It worked fine with Zeppelin after I added the GeoWave jar to the HBASE_CLASSPATH. If I don't add it, it gives the following exception:

Caused by: org.apache.hadoop.hbase.exceptions.DeserializationException: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.ClassNotFoundException: mil.nga.giat.geowave.datastore.hbase.query.HBaseDistributableFilter
    at org.apache.hadoop.hbase.filter.FilterList.parseFrom(FilterList.java:396)
    ... 12 more
Caused by: org.apache.hadoop.hbase.DoNotRetryIOException: java.lang.ClassNotFoundException: mil.nga.giat.geowave.datastore.hbase.query.HBaseDistributableFilter
    at org.apache.hadoop.hbase.protobuf.ProtobufUtil.toFilter(ProtobufUtil.java:1476)
    at org.apache.hadoop.hbase.filter.FilterList.parseFrom(FilterList.java:393)

Is it required for the GeoWave jar to be present in the HBASE_CLASSPATH?

rfecher
@rfecher
did you install the geowave-*-hbase rpm?
it puts a geowave jar in hbase.dynamic.jars.dir to get our custom filters on the HBase classpath
rfecher
@rfecher
alternatively, there is a --disableServer option in the geowave datastore config that avoids depending on the library being on HBase's classpath, for ease of deployment, but it will need to do additional client-side filtering instead (that's for v0.9.7 - it will actually change to --enableServerSideLibrary=false in future versions as part of consolidating the options across all the supported key-value stores)
Adil Ali
@adil_ali_twitter

I am trying to do a simple GeoWaveRDD operation via Zeppelin
val storeRdd: RDD[(GeoWaveInputKey, SimpleFeature)] = GeoWaveRDD.rddForSimpleFeatures(
    sc,
    pluginOptions,
    null,
    null)
storeRdd.first

This results in an exception:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 2) had a not serializable result: org.geotools.feature.simple.SimpleFeatureImpl
Serialization stack:
- object not serializable (class: org.geotools.feature.simple.SimpleFeatureImpl, value: SimpleFeatureImpl:usa_caps=[SimpleFeatureImpl.Attribute: the_geom<the_geom id=usa_caps.12>=POINT (-157.804233 21.31725), S

Am I missing something with serialization here?

rfecher
@rfecher
It depends on what version of GeoWave you're using, but in the latest there are a bunch of convenience methods for setting up serialization correctly within GeoWaveSparkConf, particularly these lines
the convenience methods have changed depending on the version, but regardless you need to do something similar to that to register a serializer
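for example, something like this minimal sketch - manual registration via the two Spark properties rather than the GeoWaveSparkConf helpers (the property values are the ones quoted later in this thread; everything else is plain Spark setup):

import org.apache.spark.{SparkConf, SparkContext}

// register GeoWave's Kryo registrator so GeoWave/GeoTools types serialize correctly
val conf = new SparkConf()
  .setAppName("geowave-kryo-example")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "mil.nga.giat.geowave.analytic.spark.GeoWaveRegistrator")
val sc = new SparkContext(conf)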
Adil Ali
@adil_ali_twitter
Thanks a ton for your response @rfecher
Adding the Kryo serializer properties did the trick.
shortwavedave
@shortwavedave

hello all, I've been trying to ingest and then query some raster data from GeoWave programmatically. So far, ingesting seems to work but querying does not. I wonder if I need to configure the CRS somehow, either for the raster or for the query. The raster is this:

GridCoverage2D["o41078a5", GeneralEnvelope[(698753.304798, 4539568.607192), (710925.797598, 4556059.506392)], DefaultProjectedCRS["WGS 84 / UTM zone 17N"]]

No query so far has returned any part of this raster. What would be a valid query?

Adil Ali
@adil_ali_twitter

Hi All,
I am again facing a serialization issue. I am trying to provide a SpatialQuery to GeoWaveRDD.rddForSimpleFeatures(), but it throws a NotSerializableException:

val spatialQuery: SpatialQuery = new SpatialQuery(bufferedGeom)
val dataStoreRdd: RDD[(GeoWaveInputKey, SimpleFeature)] = GeoWaveRDD.rddForSimpleFeatures(
    sc,
    hbOptions.createPluginOptions(),
    spatialQuery, // distributableQuery
    null,         // queryOptions
    5,            // min split
    10)           // max split

Caused by: java.io.NotSerializableException: mil.nga.giat.geowave.core.geotime.store.query.SpatialQuery
Serialization stack:

I am currently setting the following 2 properties in my conf:
"spark.serializer" "org.apache.spark.serializer.KryoSerializer"
"spark.kryo.registrator" "mil.nga.giat.geowave.analytic.spark.GeoWaveRegistrator"

I am using the 0.9.7 build shared here:
https://gitter.im/locationtech/geowave?at=5afaf853e0b50c2d05c40254

Could you please suggest what I may be missing? Thank you

rfecher
@rfecher
@shortwavedave the default index uses EPSG:4326 (lat/lon) within the index, although that is configurable. The raster is tiled in the index, reprojected if necessary, and then on query you can choose a projection for the query constraint and you can reproject results
also, GeoWave's DataStore.query() methods will return the raw results for each row, which are tiles. Often a user would prefer a mosaic as the result, and it is only for some customization that you may want the tiles. To get the mosaic we use GeoWaveRasterReader.renderGridCoverage(), which will choose the appropriate pyramid level when pyramiding is enabled, and it will grab the overlapping tiles, then mosaic and crop them to the query envelope.
rfecher
@rfecher
there are several examples of rendering a mosaic in our code base, but here is one such example
rfecher
@rfecher
there are some mechanics to constructing GeoWaveRasterReader which hopefully are easy enough to follow, but because of the generalities it may be easier to give a concrete example, such as HBase with placeholders MY_CRS, MY_TABLE_NAMESPACE, and MY_ZOOKEEPER:
        // build the query envelope in whatever CRS you choose
        final GeneralEnvelope queryEnvelope = new GeneralEnvelope(
                new double[] { WEST, SOUTH },
                new double[] { EAST, NORTH });
        queryEnvelope.setCoordinateReferenceSystem(MY_CRS);
        // point the reader at the HBase-backed data store
        final HBaseRequiredOptions hbaseOptions = new HBaseRequiredOptions();
        hbaseOptions.setZookeeper(MY_ZOOKEEPER);
        final DataStorePluginOptions dataStoreOptions = new DataStorePluginOptions(hbaseOptions);
        final GeoWaveRasterReader reader = new GeoWaveRasterReader(
                GeoWaveRasterConfig.createConfig(dataStoreOptions.getOptionsAsMap(), MY_TABLE_NAMESPACE));
        // mosaic the overlapping tiles of coverage "test" into a 1024x1024 rendering
        final GridCoverage2D gridCoverage = reader.renderGridCoverage(
                "test",
                new Rectangle(0, 0, 1024, 1024),
                queryEnvelope,
                null,
                null,
                null);
the other thing I'd suggest is making sure the raster shows up in GeoServer - on GeoWave's CLI you can run geowave config geoserver <geoserver host:port> and then geowave gs addlayer <raster datastore name>, which should just add it to GeoServer
rfecher
@rfecher
@adil_ali_twitter what you have there should work? @JWileczek any ideas as to why that wouldn't work?
rfecher
@rfecher
@adil_ali_twitter what I can suggest is that you can simply work around that - either by constructing the spatial query within a Spark task rather than trying to serialize it from the client, or by registering a serializer explicitly for the class, such as kryo.register(SpatialQuery.class, new PersistableSerializer()) (even though it should be done exactly like this in GeoWaveRegistrator, it doesn't hurt to be more explicit)
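a minimal sketch of what that explicit registration could look like as a custom registrator (the PersistableSerializer import path here is an assumption - verify the package in your GeoWave version):

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator
import mil.nga.giat.geowave.core.geotime.store.query.SpatialQuery
// assumed package for PersistableSerializer - check your GeoWave version
import mil.nga.giat.geowave.analytic.kryo.PersistableSerializer

class ExplicitGeoWaveRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // the same call suggested above, wired into a registrator
    kryo.register(classOf[SpatialQuery], new PersistableSerializer())
  }
}

then point spark.kryo.registrator at ExplicitGeoWaveRegistrator (or fold the register call into a subclass of GeoWaveRegistrator)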
shortwavedave
@shortwavedave

Thanks for the tips @rfecher. I am using Accumulo, so I set up an instance of AccumuloRequiredOptions with the correct user and pass. Now I get this error:

'GEOWAVE_METADATA' table does not exist org.geotools.data.DataSourceException: Unable to create a coverage for this source

The table is definitely there; it seems to me that it isn't looking in the right namespace. The createConfig method takes the table namespace, but ironically it is not used in the GeoWaveRasterConfig class - this looks like a bug. Any ideas for a workaround?

rfecher
@rfecher
yep, you're right - that createConfig method isn't used anywhere, but it was intended as a convenience to get around this logic that is used everywhere else for creating the GeoWaveRasterConfig
shortwavedave
@shortwavedave
haha, yea I was trying to avoid that too
rfecher
@rfecher
yep, it seems like we should add a couple of convenience methods here to wrap that logic up; it should be easy
shortwavedave
@shortwavedave

I wrote a convenience method that allows me to pass in the adapter and datastore directly. Seems to work OK; I'll fork/pull after I get some work done.

On another note, any idea what the purpose of the coverage name is? It seems like we use it to label an adapter, but I think I came across a case where I couldn't ingest two coverages with the same name, so I couldn't reuse the adapter to search over both coverages. Does that seem right?

rfecher
@rfecher
Yes, please PR when you have a chance. Eclipse requires us to have contributors sign a contributor agreement electronically here.
rfecher
@rfecher
the coverage name is a label for the adapter serving as the layer name, but it allows you to either ingest images as individual layers (different coverage names) or mosaic them into a single layer (same name). For the latter to work, the images have to have the same SampleModel, which is basically the data layout. Did you get any meaningful error when you tried to ingest multiple with the same coverage name?
perhaps playing around with the landsat8 command-line utilities would give you a good idea of how that works - by playing with --coverage you end up choosing whether to mosaic Landsat scenes or keep them separate (using ${entityId})
John Meehan
@n0rb3rt
Hi, I'm wondering if GeoWave can support heterogeneous SFTs in a single table?
mawhitby
@mawhitby
@n0rb3rt Yes it does. We use a common index model for all data types so you can ingest them all into a single table if you need to
rfecher
@rfecher
yeah, and a little more on that "common index model" - if you want, you can query across all SFTs within a single scan using the indexed fields such as space/time (and potentially other fields, but you'd have to do something a bit custom to support normalizing other fields across heterogeneous SFTs). This graphic from our developer guide is meant to be a Venn diagram on the left, in that the primary index data is a subset of the common indexed data, which is a subset of the native data, and our default is simply to use the primary indexed fields (what goes in the key, usually space/time) as the common index data. Basically, this notion under the hood allows doing a single scan for anything that is part of the common index data, rather than requiring a scan per SFT, if that's a use case of interest.
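as a rough sketch of that single scan against the 0.9.x API (assuming a dataStore and queryGeometry already exist; the key point is that a QueryOptions with no adapter set doesn't constrain to any particular SFT):

import mil.nga.giat.geowave.core.store.query.QueryOptions
import mil.nga.giat.geowave.core.geotime.store.query.SpatialQuery

// no adapter on the QueryOptions -> one scan over the common index data of every SFT
val results = dataStore.query(
  new QueryOptions(),
  new SpatialQuery(queryGeometry))
try {
  while (results.hasNext) {
    println(results.next())
  }
} finally {
  results.close() // query results are a CloseableIterator
}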
John Meehan
@n0rb3rt
Playing that out a bit more: if I ingested features with various SFTs and then tried to access that table in GeoServer via WFS, would I get just the common types as the attributes of those features, or would I need to write my own feature reader to handle that sort of thing?
rfecher
@rfecher
as it works now out of the box, each SFT (by unique feature type name) would be a layer in GeoServer - there'd need to be some custom code to expose the layers differently
it'd be fairly straightforward to expose a single layer composed of many SFTs with basically just a geometry attribute for spatial, or geometry + time if you're using spatial-temporal, but it seems that would only make sense if you really wanted the individual layers too (otherwise you'd just model your data as one SFT in the first place)
rfecher
@rfecher
as far as exposing common subsets of attributes as hybrid layers, I'd have to understand how that generalizes and think about how it could be exposed reasonably simply (it seems like there could be lots of candidate combinations of attribute sets) - it feels like a fairly custom use case
John Meehan
@n0rb3rt
I'll have to experiment a bit. In my case I have potentially thousands of different SFTs, though each with a few common attributes, so the common model seems very useful. But I need to figure out what to do with the rest of the sparse attributes. Maybe I just stick them in a generic bag.
For the common attributes, when is it better to make them dimensions of the SFC vs. secondary indices? When they're not nullable?
rfecher
@rfecher
hmm, yeah, sticking the extended fields in a generic bag could be an option - if you have SFT1 with attributes A,B,C and SFT2 with attributes C,D,E, do you really just want SFT3 with attribute C + generic bag, or do you also want SFT1 and SFT2 as individual layers within GeoServer?
how you index the data is going to be driven by the selectors you most frequently want to use on retrieval
John Meehan
@n0rb3rt
But if the common attributes are in all or most queries then they're generally better off as SFC dimensions?
rfecher
@rfecher
definitely - are the common attributes numeric?
John Meehan
@n0rb3rt
off the top of my head it would be id (Long), layerName or fileName (String), timestamp (Date as Long)
rfecher
@rfecher
are you doing a range query with id? seems to me the answer is likely no?
John Meehan
@n0rb3rt
no
rfecher
@rfecher
if you already have the id, are space-time constraints necessary?
John Meehan
@n0rb3rt
id isn't unique
it represents a group of input data tied to an external system
rfecher
@rfecher
ahh, got it
so our index is composable, and I'd structure it as <layerName> + <id> + <SFC>