Luigi Pirelli
@luipir
Can bounds overlap? (I suppose yes)
Howard Butler
@hobu
No, bounds can't overlap.
Luigi Pirelli
@luipir
:(
Howard Butler
@hobu
if your density is highly variable, make more splits and have many of them do nothing quickly
we are discussing making that adaptive, but we need a sponsor for the development
Luigi Pirelli
@luipir
So the way to process a large amount of data would be to create a set of non-overlapping bboxes and iterate over them?
Howard Butler
@hobu
yep
or run them in parallel on some cloud instances
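The "set of non-overlapping bboxes" idea can be sketched roughly as below. This is only an illustration: `split_bounds` and the `(minx, miny, maxx, maxy)` tuple layout are hypothetical names chosen here, not part of any tool discussed in this chat. Neighbouring tiles share an edge but no interior; how points lying exactly on a shared edge are handled depends on the tool consuming the bounds.

```python
from typing import List, Tuple

# Hypothetical layout for a 2D bounding box: (minx, miny, maxx, maxy).
BBox = Tuple[float, float, float, float]


def split_bounds(bounds: BBox, nx: int, ny: int) -> List[BBox]:
    """Split one bounding box into an nx-by-ny grid of tiles.

    Tiles touch along their edges but their interiors never overlap.
    """
    if nx < 1 or ny < 1:
        raise ValueError("nx and ny must be at least 1")
    minx, miny, maxx, maxy = bounds
    if maxx <= minx or maxy <= miny:
        raise ValueError("bounds must have positive width and height")

    # Shared edge coordinates, so adjacent tiles use identical values.
    xs = [minx + (maxx - minx) * i / nx for i in range(nx + 1)]
    ys = [miny + (maxy - miny) * j / ny for j in range(ny + 1)]
    # Pin the outer edges to the exact input values.
    xs[-1] = maxx
    ys[-1] = maxy

    # Row by row, from the bottom-left tile to the top-right tile.
    return [
        (xs[i], ys[j], xs[i + 1], ys[j + 1])
        for j in range(ny)
        for i in range(nx)
    ]


for tile in split_bounds((0.0, 0.0, 100.0, 50.0), 4, 2):
    print(tile)
```

Using more, smaller tiles follows the advice above for highly variable density: tiles that contain little or no data simply finish quickly.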
Luigi Pirelli
@luipir
tnx @hobu. Reading the bounds param doc, I suppose I don't have to cut the sources, so I would process only the files that intersect the bounds tile,
iterate over the tiles (bboxes) and then merge...
Is the last parameter in "entwine merge somefiles.ept 4", the 4, the number of threads?
Or subsets?
Connor Manning
@connormanning

You shouldn't have to supply bounds unless your files have bad headers or you are discarding portions of them. Simply use something of the form:

entwine build -i my-files/ -o my-output/ --subset 1 16
...
entwine build -i my-files/ -o my-output/ --subset 16 16

Entwine will split each of these portions geospatially.

I think the 4 in Howard's merge command is a typo; simply entwine merge my-output is sufficient.
To both the build and the merge you can supply threads with the --threads or -t flag.
@luipir ^
Luigi Pirelli
@luipir
ok... so I have to rebuild the two (bbox-overlapping) datasets with --subset 1 2 and --subset 2 2
tnx @connormanning
Connor Manning
@connormanning
Subsets should be powers of 4, so I think you'll want to use 4 instead of 2.
Luigi Pirelli
@luipir
Can subsets overlap? Or can I create subsets randomly?
The reason is to avoid builds that are too "big" for my laptop.
Connor Manning
@connormanning
What do you mean by overlap? By their bounds?
Luigi Pirelli
@luipir
yes
Connor Manning
@connormanning
I recommend not including any explicit bounds.
Just use the exact same input specification each time.
Luigi Pirelli
@luipir
??? In that case, would --subset take care of selecting only the subset of (intersecting) laz files?
Connor Manning
@connormanning
Yes.
Luigi Pirelli
@luipir
I'm starting to understand the meaning of subsets
Connor Manning
@connormanning
Try it on a tiny sample set:
entwine build -i https://data.entwine.io/red-rocks.laz -o red-rocks/ -s 1 4
entwine build -i https://data.entwine.io/red-rocks.laz -o red-rocks/ -s 2 4
entwine build -i https://data.entwine.io/red-rocks.laz -o red-rocks/ -s 3 4
entwine build -i https://data.entwine.io/red-rocks.laz -o red-rocks/ -s 4 4
entwine merge red-rocks/
Each of the 4 builds will produce its own portion and the merge will join their metadata.
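For larger subset counts, writing the command lines by hand gets tedious. Below is a minimal sketch that only generates the same `entwine build ... -s k n` and `entwine merge` invocations shown above; it does not run them. The helper names `is_power_of_four` and `subset_commands` are hypothetical, and rejecting a count of 1 is an assumption made here for simplicity, based on the advice that subset counts should be powers of 4.

```python
from typing import List


def is_power_of_four(n: int) -> bool:
    """Return True for 1, 4, 16, 64, ..."""
    if n < 1:
        return False
    while n % 4 == 0:
        n //= 4
    return n == 1


def subset_commands(source: str, output: str, total: int) -> List[List[str]]:
    """Build the argument lists for `total` subset builds plus one merge.

    Every build uses the same input and output, as recommended in the
    chat; only the subset index changes.
    """
    # Assumption: a single "subset" (total == 1) is not useful here.
    if total < 4 or not is_power_of_four(total):
        raise ValueError("total must be a power of 4 (4, 16, 64, ...)")

    builds = [
        ["entwine", "build", "-i", source, "-o", output,
         "-s", str(index), str(total)]
        for index in range(1, total + 1)
    ]
    return builds + [["entwine", "merge", output]]


for command in subset_commands(
    "https://data.entwine.io/red-rocks.laz", "red-rocks/", 4
):
    print(" ".join(command))
```

Each argument list could be handed to `subprocess.run` one at a time on a laptop, or spread across machines, with the merge run only after every build has finished.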
Luigi Pirelli
@luipir
great... so no need to loop over bounds... the bounds are set automatically by first discovering the overall bbox, which is then divided according to the subsets (a kind of quadtree)
Connor Manning
@connormanning
Exactly.
Luigi Pirelli
@luipir
going to process them... it should take 5h
mavavilj
@MattiViljamaa_twitter
Hey, does anyone know if there's example code showing how to use matplotlib plotting with pdal?
Howard Butler
@hobu
plotting of what? https://pdal.io/workshop/exercises/python/histogram.html uses matplotlib to make a histogram
mavavilj
@MattiViljamaa_twitter
No, I want to pass them to plt.plot() or something
mavavilj
@MattiViljamaa_twitter
then there's also the thing that the plot functions expect xs, ys, ...
Luigi Pirelli
@luipir
@MattiViljamaa_twitter some plotting example here https://archive.fosdem.org/2018/schedule/event/geo_pointcloud/
mavavilj
@MattiViljamaa_twitter
but pipeline.arrays is an ndarray of 4-tuple numpy.voids?
Luigi Pirelli
@luipir
and numpy usage
mavavilj
@MattiViljamaa_twitter
the "# Load Pipeline output in python objects" part here does not look particularly elegant IMO: https://github.com/rockestate/point-cloud-processing/blob/master/notebooks/point-cloud-processing.ipynb
mavavilj
@MattiViljamaa_twitter
why do you need to take it to a pd df for plotting?
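On the "ndarray of numpy.voids" question: each entry of `pipeline.arrays` is a NumPy structured array, so single rows show up as `numpy.void` records, but whole columns can be pulled out by field name and handed to plotting functions directly, without a DataFrame. The sketch below uses a small hand-made array as a stand-in for `pipeline.arrays[0]`; the field names and values are made up for illustration and will differ for real data.

```python
import numpy as np

# Stand-in for pipeline.arrays[0]; the dimensions and values are invented.
points = np.array(
    [
        (1.0, 2.0, 10.0, 5),
        (3.0, 4.0, 12.0, 7),
        (5.0, 6.0, 11.0, 9),
    ],
    dtype=[("X", "f8"), ("Y", "f8"), ("Z", "f8"), ("Intensity", "u2")],
)

# A single row is a numpy.void record...
first_row = points[0]

# ...but indexing by field name returns an ordinary 1D array per dimension,
# which is the xs, ys, ... form that plotting functions expect.
xs = points["X"]
ys = points["Y"]
zs = points["Z"]

print(points.dtype.names)
print(xs, ys, zs)
```

With matplotlib installed, these columns could then go straight into something like `plt.scatter(xs, ys, c=zs)` or `plt.hist(points["Intensity"])`.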
Luigi Pirelli
@luipir
Any suggestions and documentation updates are welcome :) or at least a blog post somewhere proposing a better code style.
I will add something to the entwine docs (if the process succeeds).
mavavilj
@MattiViljamaa_twitter
yea, I will do something as soon as I learn how GitHub works
Andrew Bell
@abellgithub
@jtfell : You're going to have to roll your own tiling to do what you're wanting. You can probably modify existing code to get what you want.
@MattiViljamaa_twitter : I looked at the language for the header option of readers.text and I think it's pretty straightforward. I just added another example that shows using the option. It should get built into the website shortly.
Andrew Bell
@abellgithub
@MattiViljamaa_twitter : There is documentation describing the pipeline specification: https://pdal.io/pipeline.html#pipelines
Julian Fell
@jtfell
Ok I feared as much @abellgithub - do you have any recommendations for the best way to manage a custom filter implementation? I am currently using the PDAL docker as the base image of my application dockerfile. Should I be building a new PDAL docker image based on the official one but copying my filter implementation in and recompiling? I would like to avoid forking the PDAL repo as I don't want to have to manage merging upstream changes.
Julian Fell
@jtfell
Alternatively is there any way you can see a patch with equivalent functionality making it to master? https://github.com/PDAL/PDAL/compare/master...jtfell:add-mercator-tiling-option-to-splitter-filter?expand=1