James Srinivasan
I don't know if the esri big data toolkit is any better. Our experience is anything above say 10k points chokes esri desktop and above 1M chokes esri server on oracle
And anyone shouting about "big data" in 2022 should possibly be viewed with suspicion
@jrs53 hehe ... valid remarks... thanks
Sorry to come back to same topic, but I'm looking for ammunition to defend why I prefer geomesa over ESRI...
I realize this is not really a geomesa topic , but somehow it is...
I'm sure people here have thought about this. Here is what I found:
-ESRI does have a "spatiotemporal big data store" which looks like it includes geohash (implemented in 2016; before that they referred to geomesa as the technique they wanted to use). See around 35 min here: https://www.youtube.com/watch?v=iW7_w9Evr6c&ab_channel=EsriEvents
-The official links go to tutorials from 2017-2018 --> quite outdated...
-It seems to be using Elasticsearch; in their setup the recommendation is "to run it alone on A SINGLE machine", and they describe using 16GB of RAM. This doesn't really sound like big data.
-In their examples they work with 11M points (and they find this very large).
-One thing I really need is to filter dynamically while changing the values of the fields being filtered on... not sure they have this, while the CQL filter allows for it.
-One thing they do show is the counts for heatmaps (that's in fact a useful feature, unfortunately not standard out of the box for geomesa).
James Srinivasan
Any thoughts on the best way to automatically ingest new data dropped into an S3 bucket into AWS hosted geomesa instance (on Accumulo)? Am thinking either AWS Java Lambda calling the Geotools API, stateless nifi, or a stateful nifi instance on ec2
7 replies
This smells like a bug but not sure if it comes from spark itself or geomesa...
I have a shape, and many points that form a closed structure (begin point=end point).

I want to make a linestring out of it and then a polygon.
When I run this in sparksql (where sometable contains a couple of thousand shapes that each have about 10-15 points):

SELECT shape,st_makePolygon(st_makeLine(collect_list(geom))) AS line
FROM sometable 
GROUP BY shape

I see it fails because the result of st_makeLine doesn't have the same start/end point.
I checked the table and there it is correct (start point=end point).

17 replies
When I limit it to one shape that I know failed for the query above:
SELECT shape,st_makePolygon(st_makeLine(collect_list(geom))) AS line
FROM sometable 
WHERE shape = 'the_problematic_shape_if_all_shapes_are_taken_into_account'
GROUP BY shape
it seems to be working fine and the linestring is closed.
So for a shape consisting of Point1, Point2, Point3, Point4, Point5=Point1, the first query gives me Linestring(Point2,Point3,Point4,Point5),
while the latter query gives me Linestring(Point1,Point2,Point3,Point4,Point5).
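
A note on the two queries above: Spark documents `collect_list` as non-deterministic, because the order of the collected elements follows the row order, which can change after a shuffle. Whether that explains the missing first point here is not confirmed in this thread. The sketch below uses plain Python rather than Spark and only illustrates the idea of carrying an explicit sequence number and sorting on it before building the ring. The `seq` column and the `build_rings` helper are hypothetical and do not come from the original table.

```python
from collections import defaultdict


def build_rings(rows):
    """Group (shape, seq, x, y) rows by shape and return closed rings as WKT.

    `seq` is a hypothetical per-point ordering column; rows may arrive in any order.
    """
    grouped = defaultdict(list)
    for shape, seq, x, y in rows:
        grouped[shape].append((seq, (x, y)))

    rings = {}
    for shape, pts in grouped.items():
        # Sort on the explicit sequence number instead of trusting arrival order.
        coords = [xy for _, xy in sorted(pts)]
        if len(coords) < 4 or coords[0] != coords[-1]:
            raise ValueError(f"ring for {shape!r} is not closed")
        body = ", ".join(f"{x} {y}" for x, y in coords)
        rings[shape] = f"POLYGON (({body}))"
    return rings


# Rows deliberately shuffled, as they might be after a GROUP BY.
rows = [
    ("a", 3, 1.0, 1.0),
    ("a", 1, 0.0, 0.0),
    ("a", 5, 0.0, 0.0),
    ("a", 2, 1.0, 0.0),
    ("a", 4, 0.0, 1.0),
]
rings = build_rings(rows)
print(rings["a"])
```

In Spark the same idea would mean carrying the sequence number through the aggregation and sorting on it before calling st_makeLine.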
Hello guys, I have a question about Query vs Filter.
Executing this: Query query = new Query(typeName, Filter.INCLUDE, new String[] { name1, name2 });
what happens under the hood (in my case HBase)? Meaning, is this a server-side filter (for each row, download only the columns name1, name2) or a client-side filter (download everything and return only name1, name2)?
Thank you!
2 replies

Hi guys, I've been struggling for days with the GeoMesa Avro Schema Registry Converter in geomesa-kafka, trying to set up a converter for Kafka messages.
In Confluent Kafka I created a topic with a schema (in schema-registry) like the one in the example:
Then in GeoServer I installed geomesa-kafka with all the jars and created a Store linked to the topic in Confluent Kafka. The topic doesn't have a geometry because I want messages to be converted by a GeoMesa converter, which should create the geometry field from the lat and lon attributes (always referring to the example above).
So I put the file geomesa-site.xml in WEB-INF/classes of GeoServer, with this property filled in:

and I put the converters and the file reference.conf (from the geomesa-kafka dist) in the folder WEB-INF/classes/sfts.
The tree in Geoserver is like this
....|--- lib
....|--- classes
.........|--- geomesa-site.xml
.........|--- conf
..............|--- reference.conf
..............|--- sfts
....................|--- my_converter_folder

At the end of all this, every time I send a message to Kafka it seems that GeoServer ignores the conversion and leaves the message as is in Kafka.

Have any of you faced a similar situation?
What am I doing wrong?
Is there an example (or Docker image) with GeoServer properly configured?
Thanks a lot.

7 replies
Dear experts,
I am creating some polygons, and I would like to join some of them together into bigger polygons.
The ones I want to join are adjacent. Any clue which sparksql function from the list https://www.geomesa.org/documentation/stable/user/spark/sparksql_functions.html#sparksql-functions can do this? I would expect st_union, but it seems this one is not implemented yet.
5 replies
Will GeoMesa support Delta Lake, Hudi, or Iceberg?
2 replies
Hello experts,
I notice if I add, in addition to the regular geospatial/temporal index, an index on 2 other fields that 2 extra catalogs are created:
What is the use of these?
It is clearly related to the indexes, but what happens now if I run a CQL query on field 1 through geoserver? Does it read from the first table? And what if I query on both field 1 and field 2, which are both indexed? Or are the tables just there to store info about the indexes?
1 reply
James Srinivasan
Anyone here looked at mosaic? https://github.com/databrickslabs/mosaic
1 reply
Bruno Costa
Hello everyone.
Is it possible to send data from Node.js or .NET Core to a GeoMesa Kafka store?
10 replies
Bruno Costa
Does anyone know of a tutorial for the Confluent integration in other languages, like Node?
Bruno Costa
So, I create this schema in Confluent, and after that I use the Confluent API to create messages on the topic, right?
6 replies
Dear experts,
is it possible to use the heatmap sld for data stored as lines instead of points?
Or does it need to be points?
9 replies
Bruno Costa
ok ty. But if I need to persist this data in geomesa, what can I use?
2 replies
James Srinivasan
Re attribute level visibilities & accumulo, I was sorta surprised that users can query attributes that they don't have authorisations for. The matching data comes back, but those attributes are blank. Is this expected?
13 replies
Alex Leith

Hey folks, what version of Geoserver should I be using with the latest Geomesa?

Download is https://github.com/locationtech/geomesa/releases/download/geomesa-3.4.1/geomesa-fs_2.12-3.4.1-bin.tar.gz, so 3.4.1 is the latest I guess.

7 replies

Hi guys,

I have been wondering about Spark SQL and adjacent geohash blocks - I am currently using this python library to calculate the adjacent geohash blocks and store them in a single column for a point of interest.


Has anyone written a UDF, or found another approach, to calculate a list of the adjacent blocks in Spark SQL?

The python method does its job; however, I thought this might be a useful function for others, as it's the 'fastest' way to do a fairly accurate proximity join between 2 tables -> like Point of Interest and User Events.

This is at least the fastest approach I have found - anyone mind sharing knowledge?

Thank you.
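
For reference, the adjacent-block calculation can be written without any library. The sketch below is a minimal pure-Python version, not the library mentioned above and not a GeoMesa function: it decodes a geohash to its bounding box, steps one cell width or height from the cell centre in each of the eight directions, and re-encodes at the same precision. The names `encode`, `bbox`, and `neighbors` are illustrative. It could presumably be wrapped in a Spark UDF, but that wiring is an assumption and is not shown.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def encode(lat, lon, precision):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits, ch, even, out = 0, 0, True, []
    while len(out) < precision:
        # Bits alternate between longitude and latitude, longitude first.
        rng, val = (lon_rng, lon) if even else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            ch = (ch << 1) | 1
            rng[0] = mid
        else:
            ch <<= 1
            rng[1] = mid
        even = not even
        bits += 1
        if bits == 5:
            out.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(out)


def bbox(gh):
    """Return (south, north, west, east) of the cell a geohash covers."""
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    even = True
    for c in gh:
        idx = BASE32.index(c)
        for shift in range(4, -1, -1):
            rng = lon_rng if even else lat_rng
            mid = (rng[0] + rng[1]) / 2
            if (idx >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            even = not even
    return lat_rng[0], lat_rng[1], lon_rng[0], lon_rng[1]


def neighbors(gh):
    """Return the up-to-8 adjacent cells, ordered NW, N, NE, W, E, SW, S, SE."""
    s, n, w, e = bbox(gh)
    lat_c, lon_c = (s + n) / 2, (w + e) / 2
    dlat, dlon = n - s, e - w
    result = []
    for dy in (1, 0, -1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            lat = lat_c + dy * dlat
            if not -90.0 <= lat <= 90.0:
                continue  # no cell beyond the poles
            # Wrap around the antimeridian.
            lon = (lon_c + dx * dlon + 180.0) % 360.0 - 180.0
            result.append(encode(lat, lon, len(gh)))
    return result


print(neighbors("s"))
```

Storing the cell itself plus this list in one column is what makes the proximity join a plain equality join on geohash strings.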

User authentication is not supported with Kafka?
regular kafka authentication is supported - if you want fine-grained control, you'll either have to create a consumer/endpoint for each user group, or you have to implement per-user authorization on top, i.e. through geoserver's built-in user permissions
i want to use regular kafka authentication with geomesa-kafka and geoserver, is there a simple demo? @elahrvivaz
2 replies
James Srinivasan
Workshop of geospatial gems (& associated book)
taha turk

Hey Everyone ,
I am working with the Kafka Data Store, but I have trouble getting data from the API and the GeoServer datastore at the same time.
One has to stop for the other to consume.
What is GeoMesa's default publish and subscribe methodology?
Normally a topic can be read by different consumer groups, each with a unique id, but I could not configure this the right way.
Is there any trick for setting this in GeoServer or in the API datasource config to get this functionality?

Basically, does GeoMesa allow us to have a broadcast feature for all consumers?
Thank you,

12 replies
hello! i'm reading through the geomesa docs, and i see the following versions of accumulo are supported: 1.7.x, 1.8.x, 1.9.x and 2.0.x. any chance accumulo version 1.10.2 is supported?
15 replies

Quick question about writing lines.
I have a dataset with a couple of million points and a couple of thousand routes.
I first create point geometries with st_makePoint for each point.
Then I create a line for each route using sparksql like this:
SELECT somefield, collect_list(geom) as route FROM sometable GROUP BY somefield
if I try to write the resulting lines, it tells me something like "list element in field flt is not basic type: point' ".
printschema for route shows:
route: array (nullable = true)
| |-- element: point (containsNull = true)

Any clue why this is?

2 replies

good afternoon! i'm in the process of upgrading both my instance of spark (going from 2.3 to 3.1.3) and geomesa (going from 2.4 to 3.1.2). i've pulled down the distribution of geomesa-accumulo and can confirm that i'm able to query/interact with my data using the geomesa-accumulo command line tools. i have also installed the latest geomesa-pyspark. that said, i'm getting errors when i attempt to pull data back using both pyspark and the spark shell. here is the command that i'm running to initialize the pyspark shell:
./bin/pyspark --master local[*] --jars /opt/depZep/geomesa-accumulo-spark-runtime-accumulo1_2.11-3.1.2.jar,/opt/depZep/geomesa-accumulo-spark-runtime-accumulo2_2.11-3.1.2.jar

after configuring geomesa_pyspark, and the local spark session, i receive the following when i try to load any data:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark-3.1.3-bin-without-hadoop/python/pyspark/sql/readwriter.py", line 210, in load
    return self._df(self._jreader.load())
  File "/opt/spark-3.1.3-bin-without-hadoop/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1323, in __call__
  File "/opt/spark-3.1.3-bin-without-hadoop/python/pyspark/sql/utils.py", line 111, in deco
    return f(*a, **kw)
  File "/opt/spark-3.1.3-bin-without-hadoop/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o208.load.
: java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/expressions/PredicateHelper$class
        at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$.<init>(SQLRules.scala:144)
        at org.apache.spark.sql.SQLRules$SpatialOptimizationsRule$.<clinit>(SQLRules.scala)
        at org.apache.spark.sql.SQLRules$.registerOptimizations(SQLRules.scala:287)
        at org.apache.spark.sql.SQLTypes$.init(SQLTypes.scala:18)
        at org.locationtech.geomesa.spark.GeoMesaDataSource.createRelation(GeoMesaDataSource.scala:44)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:355)
        at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:325)
        at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:307)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:307)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:225)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.catalyst.expressions.PredicateHelper$class
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 22 more

has anyone else run into this or have any suggestions for things to try?
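
One thing worth checking here: class names ending in `$class`, such as `PredicateHelper$class`, are the trait implementation classes that Scala 2.11 and earlier generate, and both jars in the command carry the `_2.11` suffix, while Spark 3.1.x distributions are built for Scala 2.12. That points to a Scala binary mismatch, although the thread does not confirm it. The helper below is a small illustrative check and not part of GeoMesa or Spark; it assumes the usual `artifact_<scala version>-<version>.jar` naming and that you already know Spark's Scala version (for example from the banner printed by `spark-submit --version`).

```python
import re


def scala_suffix(jar_name):
    """Return the Scala binary version embedded in an artifact name, or None."""
    match = re.search(r"_(2\.\d+)-", jar_name)
    return match.group(1) if match else None


def mismatched_jars(jars, spark_scala_version):
    """List jars whose Scala suffix differs from the Scala version Spark was built with."""
    return [j for j in jars if scala_suffix(j) not in (None, spark_scala_version)]


jars = [
    "geomesa-accumulo-spark-runtime-accumulo1_2.11-3.1.2.jar",
    "geomesa-accumulo-spark-runtime-accumulo2_2.11-3.1.2.jar",
]
# "2.12" is the Scala version assumed for the Spark 3.1.3 install in question.
print(mismatched_jars(jars, "2.12"))
```

Any jar this reports was compiled against a different Scala binary version than the one Spark is running.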

9 replies

Hi, guys.
I found this statement:
The first feature is the Merge DataStore View. This lets one configure multiple GeoTools datastores into one consistent view in GeoServer and Spark

I'm not sure if I understand it correctly. Does it mean we can use geomesa-spark to read from a Merge DataStore View?

1 reply
Hi everyone, is it possible in GeoMesa to store a timestamped trajectory so that every trajectory point has its own timestamp, i.e. in (x,y,t) format? I don't want to store the points and timestamps separately, because I need whole trajectories for calculations. So I want to assign a timestamp to each trajectory point.
18 replies

Dear experts,

Quick question about writing lines.
I have a dataset with a couple of million points and a couple of thousand routes.
I first create point geometries with st_makePoint.
Then I create a line for each route using sparksql like this:
SELECT somefield, collect_list(geom) as route FROM sometable GROUP BY somefield
if I try to write the resulting lines, it tells me something like "list element in field flt is not basic type: point' ".
printSchema for route shows:
route: array (nullable = true)
| |-- element: point (containsNull = true)
Any clue why this is?
When I examine my dataset, it looks like it does have the geometric objects that look like POINT(24.1224,13.2963) everywhere.
Maybe one line in the set somehow has something else than a POINT?

1 reply

I'm writing my data as lines and then applying styles=heatmap (using the sld from the geomesa docs).
I think I once asked here and it should work.
I notice when I visualize it in geoserver I get:

java.lang.RuntimeException: Failed to evaluate the process function, error is: Error processing heatmap
        at org.geotools.process.function.ProcessFunction.evaluate(ProcessFunction.java:162)
Caused by: org.geotools.process.ProcessException: Error processing heatmap
        at org.locationtech.geomesa.process.analytic.DensityProcess.execute(DensityProcess.scala:75)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 928
        at org.geotools.process.vector.HeatmapSurface.addPoint(HeatmapSurface.java:100)
        at org.locationtech.geomesa.process.analytic.DensityProcess$$anonfun$execute$1.apply(DensityProcess.scala:81)
        at org.locationtech.geomesa.process.analytic.DensityProcess$$anonfun$execute$1.apply(DensityProcess.scala:76)
        at org.locationtech.geomesa.utils.io.package$WithClose$.apply(package.scala:64)
        at org.locationtech.geomesa.process.analytic.DensityProcess.execute(DensityProcess.scala:76)

Seems to be coming from here: https://github.com/geotools/geotools/blob/main/modules/unsupported/process-feature/src/main/java/org/geotools/process/vector/HeatmapSurface.java#L100 .

13 replies
I have some lines consisting of 528 points max.

Another small question about lines constructed with st_makeLine...
I have points with some features that are point-specific (like the timestamp of the point), and some are the same for all points (like the name of the route=line that these points make up).
Suppose I construct a line; this is 1 geometric object in geomesa, so I can add the name of the route once to the line.
Is it also possible for the points to keep some of their point-specific properties?
So that in geoserver I could use a CQL filter to filter out certain parts of the line depending on the timestamp?

structure=geometry/route/timestamp, e.g.:
point 1 (lon1,lat1), route_NYC-LA, 9am
point 2 (lon2,lat2),route_NYC-LA, 9.30 am
point 3 (lon3,lat3),route_NYC-LA, 10am
SELECT route,st_makeline(collect_set(points)) AS theroute
GROUP BY route
gives me:
route_NYC-LA, LINE (point1, point2, point3).
--> is there a way to keep info about the timestamp for each point?
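
As a side note on the query above: `collect_set` drops duplicate values and gives no ordering guarantee, so the vertices may not come out in travel order. The sketch below is plain Python, not Spark or GeoMesa, and shows one possible way to keep per-vertex times: order the points by timestamp and keep a parallel list of timestamps next to the line, so that entry i of the list belongs to vertex i. Whether such a list can then drive a CQL filter on parts of the line is not answered here. The `build_trajectories` helper and the coordinates are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def build_trajectories(points):
    """points: iterable of (route, lon, lat, timestamp).

    Returns {route: (wkt_linestring, [timestamps])}, both ordered by time,
    so timestamps[i] belongs to the i-th vertex of the line.
    """
    grouped = defaultdict(list)
    for route, lon, lat, ts in points:
        grouped[route].append((ts, lon, lat))

    result = {}
    for route, pts in grouped.items():
        pts.sort()  # order vertices by timestamp
        coords = ", ".join(f"{lon} {lat}" for _, lon, lat in pts)
        result[route] = (f"LINESTRING ({coords})", [ts for ts, _, _ in pts])
    return result


# Illustrative coordinates only; the rows are given out of order on purpose.
points = [
    ("route_NYC-LA", -98.0, 39.0, datetime(2022, 9, 1, 9, 30)),
    ("route_NYC-LA", -74.0, 40.7, datetime(2022, 9, 1, 9, 0)),
    ("route_NYC-LA", -118.2, 34.1, datetime(2022, 9, 1, 10, 0)),
]
line, times = build_trajectories(points)["route_NYC-LA"]
print(line)
print([t.isoformat() for t in times])
```

The alternative is to keep the individual points as separate features next to the line, which keeps the timestamp directly filterable per point.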

3 replies
Peter Corless
Hello! Peter Corless here from ScyllaDB. I see that GeoMesa has documentation for use with Apache Cassandra. Has anyone done any testing to make sure it works with ScyllaDB? It should be a compatible solution. How would I be able to work with the GeoMesa community to validate ScyllaDB as a storage layer and get the docs changed to reflect that?
4 replies
Bruno Costa

Hello guys, I need some help with this error:

Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /geomesa/ds/kafka/metadata/migration~check
[2022-09-06T13:41:50.836Z] at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
[2022-09-06T13:41:50.836Z] at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
[2022-09-06T13:41:50.836Z] at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:2021)
[2022-09-06T13:41:50.836Z] at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:268)
[2022-09-06T13:41:50.836Z] at org.apache.curator.framework.imps.ExistsBuilderImpl$3.call(ExistsBuilderImpl.java:257)
[2022-09-06T13:41:50.837Z] at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:67)
[2022-09-06T13:41:50.837Z] at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:81)
[2022-09-06T13:41:50.837Z] at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForegroundStandard(ExistsBuilderImpl.java:254)
[2022-09-06T13:41:50.837Z] at org.apache.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:247)
[2022-09-06T13:41:50.837Z] at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:206)
[2022-09-06T13:41:50.837Z] at org.apache.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:35)
[2022-09-06T13:41:50.837Z] at org.locationtech.geomesa.utils.zk.ZookeeperMetadata.scanValue(ZookeeperMetadata.scala:52)
[2022-09-06T13:41:50.837Z] at org.locationtech.geomesa.index.metadata.KeyValueStoreMetadata$class.scanValue(KeyValueStoreMetadata.scala:40)
[2022-09-06T13:41:50.837Z] at org.locationtech.geomesa.utils.zk.ZookeeperMetadata.scanValue(ZookeeperMetadata.scala:16)
[2022-09-06T13:41:50.837Z] at org.locationtech.geomesa.index.metadata.TableBasedMetadata$$anon$1.load(TableBasedMetadata.scala:114)
[2022-09-06T13:41:50.837Z] at org.locationtech.geomesa.index.metadata.TableBasedMetadata
[2022-09-06T13:41:50.838Z] at com.github.benmanes.caffeine.cache.BoundedLocalCache$BoundedLocalLoadingCache.lambda$new$0(BoundedLocalCache.java:3308)
[2022-09-06T13:41:50.838Z] ... 33 more

Does anyone know what I can do to solve this problem?


5 replies
@elahrvivaz sorry to bug you again... but I did more investigation into the bug with the heatmap for lines (see my question from Sep. 02 2022), and I'm still not sure if it comes from geotools or geomesa...
Getting this to work is quite important for me.
Here some observations:

1) Heatmap with millions of points: OK
2) Heatmap with lines: OK for some lines, but if I use too many I get the error.
3) I can find out for which lines I get the error.
4) Next I can take such a line and leave some points out until it stops failing. I then see exactly which point is "causing" the error.
I don't see anything strange about the point that causes it to crash. Example:
LINESTRING (allvalidpoints, 62.079166412353516 41.94388961791992 (this one is ok), 60.654998779296875 41.578887939453125 (when I add this one it crashes))
And if I leave out that point and add another point it will still crash. So it looks like the point itself is not corrupt.
(In my example it happened with the 68th point.) Other lines are much longer and don't have a problem.

Also, I either get the error posted originally, with the ArrayIndexOutOfBoundsException telling me it happens at index 928 (for different lines it's always the same number),
and if I change radiusPixels:xx to some other number I see the index in the error change,
or sometimes it shows this error (no index specified):

org.geoserver.platform.ServiceException: Rendering process failed
        at org.geoserver.wms.map.RenderedImageMapOutputFormat.produceMap(RenderedImageMapOutputFormat.java:642)
        at org.geoserver.wms.map.RenderedImageMapOutputFormat.produceMap(RenderedImageMapOutputFormat.java:275)
        at org.geoserver.wms.map.RenderedImageMapOutputFormat.produceMap(RenderedImageMapOutputFormat.java:135)
        at org.geoserver.wms.GetMap.executeInternal(GetMap.java:749)
Caused by: java.lang.RuntimeException: Failed to evaluate the process function, error is: Error processing heatmap
        at org.geotools.process.function.ProcessFunction.evaluate(ProcessFunction.java:162)
        at org.geoserver.wms.map.RenderedImageMapOutputFormat.produceMap(RenderedImageMapOutputFormat.java:601)
        ... 133 more
Caused by: org.geotools.process.ProcessException: Error processing heatmap
        at org.locationtech.geomesa.process.analytic.DensityProcess.execute(DensityProcess.scala:75)
        at sun.reflect.GeneratedMethodAccessor452.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.geotools.process.factory.AnnotationDrivenProcessFactory$InvokeMethodProcess.execute(AnnotationDrivenProcessFactory.java:621)
        at org.geotools.process.function.ProcessFunction.evaluate(ProcessFunction.java:148)
        ... 138 more
Caused by: java.lang.ArrayIndexOutOfBoundsException

which makes me think it could be a bug in https://www.geomesa.org/documentation/stable/user/process.html#densityprocess ?
I haven't found the code yet to investigate line 75 (note I am using an old version, 3.0. I know it's outdated, but I don't think this is causing it).

20 replies
Finally, I notice that if I set radiusPixels:xx with xx a very large number (way too high for what I want), then the one line that was causing an issue before is displayed, but of course the colour is then smoothed out over a way too large area.
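
The observations above (a constant index in the exception, an index that moves with radiusPixels, and a very large radius making the error go away) would be consistent with a boundary check that is off by one. Nothing in this thread confirms that this is what happens in HeatmapSurface or DensityProcess. The sketch below is a hypothetical model in plain Python, not GeoTools code; `make_grid`, `add_point`, and the padded grid layout are all illustrative assumptions.

```python
def make_grid(width, height, radius):
    """Hypothetical layout: the visible area padded by `radius` cells on every side."""
    return [[0.0] * (height + 2 * radius) for _ in range(width + 2 * radius)]


def add_point(grid, i, j, value, strict=True):
    """Add value at cell (i, j), skipping points that fall outside the grid.

    strict=False models a hypothetical off-by-one check that uses `>` instead of `>=`.
    """
    rows, cols = len(grid), len(grid[0])
    if strict:
        outside = i < 0 or i >= rows or j < 0 or j >= cols
    else:
        outside = i < 0 or i > rows or j < 0 or j > cols
    if outside:
        return False
    grid[i][j] += value  # raises IndexError when i == rows under the lenient check
    return True


grid = make_grid(8, 6, 2)  # 12 x 10 cells
print(add_point(grid, 12, 3, 1.0, strict=True))  # skipped cleanly
try:
    add_point(grid, 12, 3, 1.0, strict=False)
except IndexError as err:
    print("lenient check let index 12 through:", err)
```

In this model only an index exactly equal to the grid size fails under the lenient check: larger indexes are skipped, and a different padding changes the grid size and therefore the failing index.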
Hi guys, just wanted one clarification: when we say geomesa supports a spatio-temporal index, it is supported on the distributed databases mentioned in the documentation. With Kafka, geomesa is only used for streaming. Is there support for queries on Kafka (ksql) where the spatio-temporal index can come into play?
9 replies
James Srinivasan
1 reply
Villalta Humberto

Dear Experts;

I am new to the world of GeoMesa and data stores. I program only in Python and Scala. I am interested in how to connect a Jupyter notebook to GeoMesa and use its ST_ functions for processing some spatio-temporal data, and also in saving these data sets in a database. If someone has some documentation or tips on how to do it, I would appreciate it very much.

Once again, thank you for your time in answering my inquiry.

2 replies