These are chat archives for locationtech/geomesa

May 2017
Ragi Y. Burhum
May 10 2017 22:26
I have a dumb question.
I went over both examples of running GeoMesa on AWS EMR (the one for Accumulo using GeoDocker and the one for HBase, set up from scratch). Both work fine, but I am having trouble wrapping my head around data persistence after I upload all my data using the GeoMesa data ingest tools. For HBase, I am specifying an S3 data store, so the data persists even when I turn off my cluster and is still there when I turn it back on. For the Accumulo sample, the HDFS Docker container in GeoDocker will go away, and so will all my data. It seems that the recommended approach to persist that data after it has been loaded is to use the S3DistCp utility. Does this sound right? Is the recommended way to add a step on EMR that copies the data to S3? Just looking for guidelines, since I am about to copy 10 billion records and it would suck if I had to copy them by hand...