github-actions[bot] on gh-pages
Update users documentation for … (compare)
Dear pgrouting-community,
we are big fans of your work! And are especially using the driving-distance feature of the pgrouting library.
There is one known issue with this function though, that is summarized in the issue #584.
With the help of @vjeranc I was exploring possibilities to fix this.
In this attempt we were having some preliminary results that are summarized in the PDF attached.
We wanted to contact you because so far the best fix for us seemed to be a custom dijkstra-function as we were not managing to fix this using Boost.
Therefore, we wanted to hear your opinion on that. Is this something that could be accepted? Looking forward to discuss with you.
Best,
Elias
@EPajares Hi, starting a discussion:
For simplicity suppose there is a graph G(E,V) where E= {(u,v,cost=10)} AND V={u,v} and a driving distance of 4 departing from
u=======+----------------------------v
then between vertex u & v at the "+" its where the driving distance is reached
From the documentation this are the possible columns
RETURNS SET OF (seq, [start_vid,] node, edge, cost, agg_cost)
What would the result be with this graph?
RETURNS SET OF (seq, [start_vid,] node, edge, cost, agg_cost, partial_edge)
We would have in the response all current edges + all edges that can only be "partially traversed" in the reality and are currently not in the response. With the additional parameter 'partial_edge' we could mark all edges that can only be "partially traversed". With post-processing we could use the cost, agg_cost and max_cost to extrapolate along the edge. But this also means that we would probably not have a node at the position "+" but would also create this one with post-processing.
Your result file contains
SELECT * FROM pgr_kargersContraction(
whereas it should be
SELECT * FROM _pgr_kargersContraction(
according to your queries.
kruskalBFS
function (an already-implemented function).kruskalBFS-edge-cases
: https://travis-ci.org/github/krashish8/GSoC-pgRouting/jobs/694577924#L1323
This is the result file. The main queries lie in the *.test.sql file - https://github.com/rajhim2/GSoC-pgRouting/blob/him/docqueries/kargersContraction/doc-pgr_kargersContraction.test.sql
In your case, all the three queries are calling the underscored function, so, I guess, if you change it to the main function, then the test will pass.
@EPajares Hi, starting a discussion:
For simplicity suppose there is a graph G(E,V) where E= {(u,v,cost=10)} AND V={u,v} and a driving distance of 4 departing from
u=======+----------------------------v
then between vertex u & v at the "+" its where the driving distance is reached
From the documentation this are the possible columnsRETURNS SET OF (seq, [start_vid,] node, edge, cost, agg_cost)
What would the result be with this graph?
@cvvergara
The result also does not need to contain additional columns. The partial edges can have agg_cost=drivingDistance
and cost
of the edge can stay the same. User would then check the predecessor agg_cost
of the node
at edge
to figure out the partially traveled part.
The result would be (seq, [start_vid,] node, edge, cost, agg_cost) = (1, u, v, 1, 10, 4)
. Given that predecessor of v
using edge 1
is u
and agg_cost
at u
is 0
then it's simple to figure out the partially traveled part.
I'm understand that implicitly marking the partial edges in this way complicates the function unnecessarily but I have not figured out a simpler way without adding an extra column in the result.
cost
is reduced to match the partial travel time, then one can find all the partial edges easily. Edges are found by comparing the cost of the input edges with same edge
id and the output edges. This is definitely more elegant than finding the predecessors and checking the agg_cost
for the predecessor.
tails
with default value FALSEvertex_ending
with default TRUE@cvvergara Yeah, the naming can be related to tree leaves, given that the current result is a spanning tree, this would extend the leaves with partial edges that end at a place between vertices of the real edge.
The implementation of the fix can be made in such a way to keep the current results. What worries me the most is the fact that the fix requires a custom dijkstra implementation and does not use any of the datastructures used by all of the graph algorithms in pgrouting codebase. I'm confident that the code is correct and performs well.
@EPajares and I will do exhaustive testing to see that the examples deemed previously incorrect (without partial edges or tails
as you call it) are now fully correct.
The reason why the pgr_drivingDistance
function is useful is for finding isochrones. In this case all the nodes at which agg_cost
is less than the given drivingDistance
are not useful.
Maybe that deserves a new function name?
@cvvergara When one wants to find isochrones, then pgr_drivingDistance
obviously gives incomplete results. It gives correct results according to the definition but incomplete for a particular use case. @EPajares had many examples where he wanted to find isochrones but was using the pgr_drivingDistance
and getting incomplete results. These are the examples that my fix is going to cover.
Given that the interest is in isochrones and not in nodes that are reached in time <= drivingDistance
, I asked if the fix maybe deserves a new function name and probably a different result.
@EPajares Just a side note,
One of the biggest issues that pgRouting had when the release 2.0 had was that all the functions were done thinking on real problems, then tests were done with "real" data, and code was adjusted to fit the "real" data.
Even pgr_dijkstra, to test this, download version 2.0 compile it and run a pgr_dijkstra on the sample data for a directed graph from vertex 1 to vertex 3, if you see the sample data graph answer should be (vertex ordering) 1->2->5->6->9->4->3 but version 2.0 returns 1->2->3
pgRouting is about graphs, graphs can be for streets which I think its your "real problem", but can also be for people relationships, electricity distribution, rivers, and so many other kind of data that I can not think of.
But, in the case of driving distance This is the definition we are using:
Using the Dijkstra algorithm, extracts all the nodes that have costs less than or equal to the value distance. The edges extracted will conform to the corresponding spanning tree.
Note that its not the minimum spanning tree its a spanning tree. where the root is the starting vertex.
The important thing here is the statement "extracts all the nodes that have costs less than or equal to the value distance."
So in your testing try to find an example where results are wrong given the definition we are using. (aka it returns a node that is not less than or equal to the value distance)
Thinking of sets, here is a problem
given S = {1,2,3,4,5}
which are the numbers less than 3.2?
Answer: A = {1,2,3}
@cvvergara pgr_drivingDistance
behaves as the definition states, there are no examples showing that the behavior is different from the definition.
I believe most users use the OSM heavily discrete graph together with pgrouting.
In this case, many of the reachable points in the continuous world on the network defined by the graph will not be included in the resulting edges. #584 shows one example of a point reachable in the real world but the edge including that point is not in the result.
@EPajares needs these points and pgr_drivingDistance
is not complete for his usecase.
Given that his usecase is finding these reachable points where travel time is exactly the given drivingDistance
(points that define the isochrones map), we ask if maybe the function implementing this continuous use case deserves a different name or maybe it does not belong in the pgrouting
codebase at all.
I am thinking of edge cases: (for simplicity suppose all edges cost is 1 and undirected graph)
E = {e(v1,v2) e(v1,v3) e(v2,v3)} V= {v1,v2,v3}
pgr_drivingDistance results for a distance of 1.2:
The spanning tree is {e(v1,v2) e(v1,v3)}
The vertices within the distance {v1,v2,v3}
The edge in question is e(v2,v3)v2 ==+------+==v3
So, get the vertices that are (vertices that are leafs of the tree, 1.2 - agg_cost to get there): {(v2,0.2)(v3,0.2)}
v2 ==v5------v5==v3
Regarding the pgr_drivingDistance
before I understand the use case and I have also had the one @EPajares describes.
However, as @cvvergara describes correctly the functions is about nodes and not actually edges and according to the definiton it does what it should do.
That said, there is definitely a need for another function, that could for example extend pgr_drivingDistance
and might make a lot of people happy. ;-)
However, pgRouting is an Open Source project and so it is some sort of "do-ocracy" and every contribution or pull request will be highly appreciated.
And if that's not the case, most Open Source projects are funded in some way, and so it is the case with pgRouting.
It's not only volunteer time, so whoever needs something, consider to fund the development!
https://funding.communitybridge.org/projects/pgrouting
Back to the pgr_drivingDistance
function, it should be possible to
So it should be possible to compute, which fraction of each connected edge can be reached with the remaining cost.
It takes a bit of time to develop this function, especially to make it applicable to generic use cases, but in my opinion it's solvable.
There are two approaches (as outlined in the pdf that @EPajares put a while ago https://gitter.im/pgRouting/pgrouting?at=5ed76ad07da67d06faebe5a3). One is to get the leaves using pgr_drivingDistance and then go to all the neighbors of a leaf L by travelling partially on the edges. This means that if someone was following the shortest path and arrived at leaf L then traveling partially on the outgoing edges reaches the drivingDistance limit. This method returns the largest number of edges.
If, on the other hand, we put the partial edges in the priority queue of the dijkstra algorithm then some of these partial edges will be ignored (the edge case described at the bottom of pdf).
It depends on what the user wants.
First approach does not require changes to the algorithm and can be a postprocessing step.
Second approach cannot be achieved without directly controlling the dijkstra search loop.
I implemented the second approach https://github.com/vjeranc/pgrouting (branch partial_isochrones) and backported it to 2.6.3 (main branch). @EPajares is testing this branch.
During development I noticed that my pgr_isochrones
function is 2-35x times faster than pgr_drivingDistance
. I believe there's unintentional quadratic behavior in the Path
data structure. Speed difference increases with the average number of neighbors of each node. I guess you'd definitely want a speed improvement PR (fix in Path
).
It took me a while to be able to do some tests (running pgr_dijkstra
and using pgrouting data structures against my implementation) with simple executables (without linking to postgres libraries and libpgrouting.so) so maybe a PR reorganizing cmake build so that it's easy to just add executables with CMakeLists.txt can also be the outcome of this endeavour.