Nathan Banek
@natonic77
will do, and yes, I agree - I was going for something tried and true for my experimentation, but just fixing the bitshift operations would do it
Frederic Guiet
@fguiet
Hi @pomadchin
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f4b6ed53a82, pid=37772, tid=0x00007f315015f700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_112-b15) (build 1.8.0_112-b15)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.112-b15 mixed mode linux-amd64 )
# Problematic frame:
# V  [libjvm.so+0x6d0a82]  jni_SetByteArrayRegion+0xc2
#
# Core dump written. Default location: /mnt/data08/yarn/local/usercache/guiet/appcache/application_1571662316912_1404/container_e54_1571662316912_1404_01_000002/core or core.37772
#
# An error report file with more information is saved as:
# /mnt/data08/yarn/local/usercache/guiet/appcache/application_1571662316912_1404/container_e54_1571662316912_1404_01_000002/hs_err_pid37772.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#
Got this error while trying to load a LAS file using:
val las = spark.read.format("geotrellis.pointcloud.spark.datasource").option("path","hdfs:///user/guiet/Orleans_36_rue_de_la_Borde_Fond_L93.las").load()
The file is 1.6 GB.
I need to load the LAS file into Hive.
Maybe I can use:
val testDir = new Path("hdfs:///user/guiet/test_geotrellis/USGS_LPC_LA_Barataria_2013_15RYN6548_LAS_2015.las")
val numPoints = HadoopPointCloudRDD(testDir)
But how can I convert the RDD to a DataFrame?
1.6 GB is not a big LAS file... I need to load a 50 GB LAS file, so I am worried
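A rough sketch of one way to flatten that RDD into a DataFrame, assuming HadoopPointCloudRDD yields (header, point clouds) pairs and that io.pdal.PointCloud exposes length and getX/getY/getZ accessors; both the import path and those accessors are assumptions to verify against your geotrellis-pointcloud version:

// Hypothetical sketch: flatten each PointCloud into (x, y, z) rows, then toDF.
// The HadoopPointCloudRDD import path varies across geotrellis-pointcloud versions.
import org.apache.hadoop.fs.Path
import geotrellis.pointcloud.spark.store.hadoop.HadoopPointCloudRDD
import spark.implicits._  // enables .toDF on an RDD of tuples (spark = SparkSession)

val testDir = new Path("hdfs:///user/guiet/test_geotrellis/USGS_LPC_LA_Barataria_2013_15RYN6548_LAS_2015.las")
val pointsDF = HadoopPointCloudRDD(testDir)
  .flatMap { case (_, clouds) =>
    clouds.flatMap { pc =>
      // Assumed accessors on io.pdal.PointCloud: length, getX/getY/getZ
      (0 until pc.length).map(i => (pc.getX(i), pc.getY(i), pc.getZ(i)))
    }
  }
  .toDF("x", "y", "z")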
Grigory
@pomadchin
@fguiet yo, I'm very sorry that you faced this issue; it is a known bug / feature / limitation of the current geotrellis-pointcloud implementation. geotrellis/geotrellis-pointcloud#14
it will require much more time and thinking to make arbitrary file sizes work
as a workaround you can split them into smaller chunks via a PDAL pipeline (see the sketch below)
but at this point I don't have time to look into it; it is a pretty serious issue - so I would be happy to assist you if you come up with some really good solution
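For reference, a hedged example of that chunking step using PDAL's split kernel; the capacity value is illustrative, so check pdal split --help on your PDAL version for the exact options:

# Hypothetical invocation: split the input into ~5M-point chunks
pdal split --capacity 5000000 Orleans_36_rue_de_la_Borde_Fond_L93.las chunk.las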
Frederic Guiet
@fguiet
hummm I see... ok, I am gonna split my LAS file into chunks via the pdal split command
Check the IQmulus library, as it seems to be able to load big LAS files into Hive
worth a look at the implementation
Grigory
@pomadchin
Yes, I know about this library. It is completely different and much more restrictive
Frederic Guiet
@fguiet
moreover it's deprecated and works only with Spark 1.6... that's a shame
thanks for pointing me to this known limitation
Grigory
@pomadchin
well, it is restrictive in the sense of the operations and file formats it supports
pointclouds are not only las / laz files
Frederic Guiet
@fguiet
yeah of course
Grigory
@pomadchin
and the implementation details are completely different; in our case we're hitting a problem of trying to allocate a single array for the entire LAS file
Frederic Guiet
@fguiet
but LAS files can be very big... so gt-pointcloud must handle this, otherwise it will be useless
Grigory
@pomadchin
not really. usually you don't need all the dimensions loaded into memory, and not everything at once
Frederic Guiet
@fguiet
single array for the entire las file!!!
Grigory
@pomadchin
it is doable, and we would be happy to work on it once it's required by someone
Frederic Guiet
@fguiet
u sure?
Grigory
@pomadchin
Well yes, I'm sure - because I was the one who worked on all the PDAL / JNI interfaces and geotrellis-pointcloud itself
Frederic Guiet
@fguiet
yeah, but are you sure it is the right implementation?
as a LAS file can contain billions of points
Grigory
@pomadchin
em; I didn't say that it is a good implementation
it is a naive implementation
Frederic Guiet
@fguiet
:)
anyway, will try to chunk my las file
Grigory
@pomadchin
or you can create a PR and fix the way spark loads pointclouds into memory ;)
Frederic Guiet
@fguiet
so I can load its X, Y, Z values into a big Hive table
Grigory
@pomadchin
Also, if you only need x, y, z you can filter the file by dimensions
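A minimal sketch of that end-to-end idea once the file is split: read the chunks with the datasource, keep only the coordinate columns, and append into a Hive table. The column names, the Hive table name, and pointing the datasource at a directory of chunks are all assumptions; check las.printSchema() and the datasource docs for the real ones:

// Hypothetical: read previously split chunks, keep only coordinates,
// and append them into a Hive table. Column names ("X","Y","Z") and the
// table name are assumed; verify with las.printSchema().
val las = spark.read
  .format("geotrellis.pointcloud.spark.datasource")
  .option("path", "hdfs:///user/guiet/chunks/")  // assumed directory of split chunks
  .load()
las.select("X", "Y", "Z").write.mode("append").saveAsTable("guiet.lidar_points")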
Frederic Guiet
@fguiet
I would make a PR if I were smart enough, for sure
not the case though :(
Frederic Guiet
@fguiet
gtg, seeya @pomadchin
thanks for ya help
Grigory
@pomadchin
np, you’re welcome
rexuexiangbei
@rexuexiangbei
@pomadchin hi, which HBase version is required in GT 3.0.0? Is HBase 1.3.1 OK?
Grigory
@pomadchin
Hey @rexuexiangbei, nope, we are on HBase 2.2, but you can try; the API may be compatible