You can use the geotools-vector format and a corresponding properties file containing the connection parameters (unless of course the datastore is already a file-based source such as shapefile, in which case you just use the filename of the data rather than a properties file):
geowave ingest localToGW <properties filename> ...
where the properties file has the <key>=<value> params specified here: https://docs.geotools.org/stable/userguide/library/jdbc/postgis.html
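For example, a PostGIS connection properties file might look like the following (the keys are the GeoTools PostGIS datastore parameters from the page linked above; the values here are placeholders for your own environment):

```properties
dbtype=postgis
host=localhost
port=5432
schema=public
database=mydb
user=postgres
passwd=mysecret
```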
final SimpleFeature sf = sfBuilder.buildFeature(feature.getID());
i++;
indexWriter.write(sf);
// flush periodically so accumulated statistics get written out
if (i % 1000 == 0) {
  indexWriter.flush();
}
flush() will write the statistics and clear them, so it is probably a nicety to periodically flush, but it really shouldn't be a necessity (aggregated statistics shouldn't be a memory issue) ... when you're flushing many times, it is best to merge the stats in the metadata table after you finish writing, because the stats will be stored as a row per flush() and the stat merging would otherwise need to be done at scan time. For Accumulo, when the server-side library is enabled, this is a table compaction on the metadata table, although generally speaking there's a CLI command, geowave stat compact, which will do the appropriate thing for each datastore and is probably just your best/easiest way to merge them. (Well, for Accumulo the merging is already tied to Accumulo's inherent compaction cycles, so it may end up merged through the background compaction anyways; I just find it's often nice to ensure it's compacted at the end of a large ingest.) I guess that's mostly a tangent to understanding why you're having memory issues - is it the Accumulo server processes that are constantly growing in memory, or is it the client process that you're writing that's building up memory?
geowaveapi-1.0-SNAPSHOT-jar-with-dependencies.jar doesn't contain the SPI files under META-INF/services. Can you confirm that inside that jar there is a file META-INF/services/org.locationtech.geowave.core.store.data.field.FieldSerializationProviderSpi, and that inside that file there is a line for org.locationtech.geowave.core.geotime.store.field.DateSerializationProvider?
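One quick way to check this: a jar is just a zip archive, so the service file can be read directly. Here is a small sketch in Python (the jar and class names are the ones from this thread; adjust the path to your build output):

```python
# Check that a shaded jar registers an SPI provider under META-INF/services.
import zipfile

SPI_FILE = ("META-INF/services/"
            "org.locationtech.geowave.core.store.data.field.FieldSerializationProviderSpi")
PROVIDER = "org.locationtech.geowave.core.geotime.store.field.DateSerializationProvider"

def has_provider(jar_path, spi_file=SPI_FILE, provider=PROVIDER):
    """Return True if the jar contains the SPI file and that file lists the provider."""
    with zipfile.ZipFile(jar_path) as jar:
        if spi_file not in jar.namelist():
            return False  # service file was not packaged into the shaded jar
        lines = jar.read(spi_file).decode("utf-8").splitlines()
        return provider in (line.strip() for line in lines)
```

For a correctly shaded jar, `has_provider("geowaveapi-1.0-SNAPSHOT-jar-with-dependencies.jar")` should return True; if it returns False, the maven-shade/assembly configuration is likely dropping or overwriting the service files.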
[root@ffe7b9e3d42a geowaveapi]# mvn exec:java -Dexec.mainClass="com.uasis.geowaveapi.Geowave"
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------< com.uasis:geowaveapi >------------------------
[INFO] Building geowaveapi 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[WARNING] The POM for commons-codec:commons-codec:jar:1.15-SNAPSHOT is missing, no dependency information available
[INFO]
[INFO] --- exec-maven-plugin:3.0.0:java (default-cli) @ geowaveapi ---
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'GeoServerResourceLoader', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionProvider', but ApplicationContext is unset.
Jul 22, 2021 10:52:28 PM org.geoserver.platform.GeoServerExtensions checkContext
WARNING: Extension lookup 'ExtensionFilter', but ApplicationContext is unset.
Finito
22 Jul 22:52:33 ERROR [zookeeper.ClientCnxn] - Event thread exiting due to interruption
java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
22 Jul 22:52:40 WARN [zookeeper.ClientCnxn] - Session 0x10008674a240008 for server zookeeper.geodocker-accumulo-geomesa_default/172.25.0.3:2181, unexpected error, closing socket connection and attempting reconnect
java.nio.channels.ClosedByInterruptException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:478)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
[WARNING] thread Thread[com.uasis.geowaveapi.Geowave.main(zookeeper.geodocker-accumulo-geomesa_default:2181),5,com.uasis.geowaveapi.Geowave] was interrupted but is still alive after waiting at least 14999msecs
[WARNING] thread Thread[com.uasis.geowaveapi.Geowave.main(zookeeper.geodocker-accumulo-geomesa_default:2181),5,com.uasis.geowaveapi.Geowave] will linger despite being asked to die via interruption
[WARNING] thread Thread[Thrift Connection Pool Checker,5,com.uasis.geowaveapi.Geowave] will linger despite being asked to die via interruption
[WARNING] thread Thread[GT authority factory disposer,5,com.uasis.geowaveapi.Geowave] will linger despite being asked to die via interruption
[WARNING] thread Thread[WeakCollectionCleaner,8,com.uasis.geowaveapi.Geowave] will linger despite being asked to die via interruption
[WARNING] thread Thread[BatchWriterLatencyTimer,5,com.uasis.geowaveapi.Geowave] will linger despite being asked to die via interruption
[WARNING] NOTE: 5 thread(s) did not finish despite being asked to via interruption. This is not a problem with exec:java, it is a problem with the running code. Although not serious, it should be remedied.
[WARNING] Couldn't destroy threadgroup org.codehaus.mojo.exec.ExecJavaMojo$IsolatedThreadGroup[name=com.uasis.geowaveapi.Geowave,maxpri=10]
java.lang.IllegalThreadStateException
at java.lang.ThreadGroup.destroy (ThreadGroup.java:778)
at org.codehaus.mojo.exec.ExecJavaMojo.execute (ExecJavaMojo.java:293)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
at org.apache.maven.cli.MavenCli.execute (MavenCli.java:957)
at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:289)
at org.apache.maven.cli.MavenCli.main (MavenCli.java:193)
at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke (Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 25.523 s
[INFO] Finished at: 2021-07-22T22:52:48Z
[INFO] ------------------------------------------------------------------------
[root@ffe7b9e3d42a geowaveapi]# geowave vector query "select * from acc.uasis limit 1"
Exception in thread "Thread-4" java.lang.NoSuchMethodError: java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;
at org.locationtech.geowave.core.store.entities.GeoWaveKeyImpl.<init>(GeoWaveKeyImpl.java:47)
at org.locationtech.geowave.core.store.entities.GeoWaveKeyImpl.<init>(GeoWaveKeyImpl.java:37)
at org.locationtech.geowave.core.store.entities.GeoWaveKeyImpl.<init>(GeoWaveKeyImpl.java:30)
at org.locationtech.geowave.datastore.accumulo.AccumuloRow.<init>(AccumuloRow.java:52)
at org.locationtech.geowave.datastore.accumulo.operations.AccumuloReader.internalNext(AccumuloReader.java:198)
at org.locationtech.geowave.datastore.accumulo.operations.AccumuloReader.access$200(AccumuloReader.java:35)
at org.locationtech.geowave.datastore.accumulo.operations.AccumuloReader$NonMergingIterator.next(AccumuloReader.java:146)
at org.locationtech.geowave.datastore.accumulo.operations.AccumuloReader$NonMergingIterator.next(AccumuloReader.java:125)
at org.locationtech.geowave.core.store.operations.SimpleParallelDecoder$1.run(SimpleParallelDecoder.java:41)
at java.lang.Thread.run(Thread.java:748)
[root@ffe7b9e3d42a geowaveapi]# java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
25 Jul 18:20:16 WARN [transport.TIOStreamTransport] - Error closing output stream.
java.io.IOException: The stream is closed
at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:118)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
at org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
at org.apache.thrift.transport.TFramedTransport.close(TFramedTransport.java:89)
at org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.close(ThriftTransportPool.java:335)
at org.apache.accumulo.core.client.impl.ThriftTransportPool.returnTransport(ThriftTransportPool.java:595)
at org.apache.accumulo.core.rpc.ThriftUtil.returnClient(ThriftUtil.java:159)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:755)
at org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:367)
at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
Hello all! Newbie here... I'm evaluating geographic tools for big data (Cloudera specifically) and have gone through the GeoWave quickstart; I also tried a variation where I use the Kudu DB in Cloudera as the data store.
What I haven't found is the exact syntax of the ingest localToGW command for ingesting a GeoJSON dataset. I have been able to ingest the GDELT files, but when calling the ingest with this command:
geowave ingest localToGW -f geotools-vector --geotools-vector.type geojson test.geojson kustore kustore-spatial
nothing happens (actually test.geojson can have any content; there's no debug output and nothing is stored on the backend).
Thanks in advance for any help!
You can add --debug (it has to come immediately after geowave, so geowave --debug ingest ...) to perhaps get a bit more feedback. Another thing I can say is --geotools-vector.type is not what you're thinking it is in this case. It filters the ingest to only use that feature type name, so if you had a file with various type names, let's say tracks and waypoints (or really whatever "names"), you could supply that option to ingest only one of the feature types.
Lastly, I can say we're using this geotools datastore for the GeoJSON support (which is an "unsupported" geotools extension; it does work for the geojson I've tested with, but mileage may vary), so if you're still having issues it could be worthwhile to make sure that it works with that data store (one way is by including that library in GeoServer and seeing if you can add the file directly as a GeoServer layer; here's a gisexchange thread briefly discussing it).
In the end it may also just be worth quickly writing an ingest format plugin for geowave (similar to what's done for GDELT). Here is an example for writing a custom ingest format in geowave.