Gary Gendel
@ggendel
Something else for you to investigate when you have some time. A performance degradation seems to be related to the size of the RG tables. My test was to take a graph with a couple million each of objects and edges and "commit" them so there would only be a single create command for each of these. Then I created a small independent graph (10s of vertex and edges) via RG and finally request a traversal of that. It took many seconds to return the request. This is not high priority for me, but it means that I can't use RG calls to do current-time traversals but do it directly on my tables.
Aditya Mukhopadhyay
@adityamukho

Something else for you to investigate when you have some time. A performance degradation seems to be related to the size of the RG tables. My test was to take a graph with a couple million each of objects and edges and "commit" them so there would only be a single create command for each of these. Then I created a small independent graph (10s of vertex and edges) via RG and finally request a traversal of that. It took many seconds to return the request. This is not high priority for me, but it means that I can't use RG calls to do current-time traversals but do it directly on my tables.

Ok I will take a look at this. Can you help me with the input parameters you used for the query?

Gary Gendel
@ggendel
I got pulled in for some top-priority tasks and somehow lost my notes. I'll recreate the problem when I get a little time. I'll take good notes so you don't have to spin your wheels.
Aditya Mukhopadhyay
@adityamukho
That sounds great! Thank you!
Gary Gendel
@ggendel
@adityamukho We're going to have to shelve this until I can recreate it. All my recent tests show no appreciable performance issue. Let's chalk it up to work on three conflicting tasks and messing something up. Once I get the patch for the traverse validation problem, I can do more rigorous testing. I've gotten RG on the official roadmap which means that others will join me in final evaluation and testing.
Aditya Mukhopadhyay
@adityamukho
@ggendel I have run the complete test suite on traverse providers again, and also inspected the validator schema by hand. Nothing seems to be out of place, especially nothing that would give rise to the nested validation error you got. This leads me to believe that this could be an issue with the input provided to the method - perhaps some inadvertent nesting has occurred? I would be able to better diagnose this once I get to see the input provided to the method that gave rise to your particular validation error.
Aditya Mukhopadhyay
@adityamukho
Here's an example input that would yield correct results:
// Method signature:
// function traverseProvider (timestamp, svid, minDepth, maxDepth, edges, options = {}) { ... }

traverseProvider(
  1581583228.2800217,
  "vertex_collection/starting_vertex_key",
  0,
  2,
  {
    "edge_collection_1": "inbound",
    "edge_collection_2": "outbound",
    "edge_collection_3": "any"
  },
  {
    "uniqueVertices": "path",
    "uniqueEdges": "path",
    "bfs": true,
    "vFilter": "x == 2 && y < 1",
    "eFilter": "x == 2 && y < 1",
    "pFilter": "edges[0].x > 2 && vertices[1].y < y"
  }
)
In this example I've been explicit about all options, but as with the API endpoint, the usual defaults are filled in for any fields left unspecified.
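The default-filling behaviour can be pictured as a simple merge of caller options over a defaults object. This is a minimal sketch; the default values below are hypothetical placeholders, not RecallGraph's actual defaults, which live in its service configuration:

```javascript
// Hypothetical defaults, for illustration only; RecallGraph's real
// defaults may differ.
const DEFAULT_TRAVERSE_OPTIONS = {
  uniqueVertices: 'none',
  uniqueEdges: 'path',
  bfs: false,
  vFilter: null,
  eFilter: null,
  pFilter: null
}

// Caller-supplied options override the defaults; anything unspecified
// falls through to the default value.
function withDefaults (options = {}) {
  return Object.assign({}, DEFAULT_TRAVERSE_OPTIONS, options)
}

const opts = withDefaults({ bfs: true })
// opts.bfs is true (overridden); opts.uniqueEdges is 'path' (default)
```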
Aditya Mukhopadhyay
@adityamukho

I think in your case you had embedded the uniqueVertices param inside edges, resulting in the nested validation error.

I realize that there is a lot of ground to be covered to make the documentation catch up with all the new exports and I really appreciate your patience in sticking with the product so far. I have it on priority to improve documentation and test coverage as the very next items on my plate, to ensure minimal hiccups in the future.

Gary Gendel
@ggendel
@adityamukho That was a help. Indeed I was sending some attributes in the wrong parameter. However, I have one more puzzle. I want to traverse and return all objects that have their 'type' value set to 'config' (the intermediate nodes). I added "vFilter": "type == 'config'" but it returns nothing. If I remove the vFilter, then the intermediate and leaf nodes are returned (as expected). Am I constructing the filter correctly? In AQL I would filter on v.type == 'config'.
Aditya Mukhopadhyay
@adityamukho
Just type=='config'. The filter works by setting the current object under iteration as the this object.
Also please note there are now 2 depth params - minDepth and maxDepth. The vFilter only applies to vertices starting from minDepth.
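The "current object as this" behaviour explains why the filter uses bare attribute names like type rather than v.type as in AQL. The snippet below is an illustrative simulation of that binding, not RecallGraph's actual filter engine (which uses its own expression grammar):

```javascript
// Sketch: evaluate a filter expression with the vertex bound as `this`,
// so bare names like `type` resolve to the vertex's own attributes.
// (Illustration only; RecallGraph compiles filters with its own parser.)
function applyVFilter (expr, vertex) {
  // `new Function` bodies run in sloppy mode, so `with` is permitted here.
  const fn = new Function('with (this) { return (' + expr + ') }')
  return fn.call(vertex)
}

applyVFilter("type == 'config'", { type: 'config' }) // → true
applyVFilter("type == 'config'", { type: 'leaf' })   // → false
```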
Aditya Mukhopadhyay
@adityamukho
If it still doesn't work after fixing any depth-related errors, I might need to run the query at my end. I think my existing sample database would work; I'd just need all the query params.
Gary Gendel
@ggendel
Yes, the starting vertex is a type=='config' and minDepth is 1. Interestingly, if I set minDepth to 0, then the root node comes out but none of the child nodes which are also type=='config' do.
The information I send to traverse is:
2020-06-24T14:32:42.038Z: RealGraph traverse, {\"svid\":\"pmconfig/453634713\",\"timestamp\":\"1593006183.013\",\"minDepth\":1,\"maxDepth\":50,\"edges\":{\"pm_content\":\"inbound\"},\"options\":{\"uniqueVertices\":\"path\",\"uniqueEdges\":\"path\",\"vFilter\":\"type == 'config'\"}}"
Aditya Mukhopadhyay
@adityamukho
Ok let me try it and see...
Gary Gendel
@ggendel
Sorry for the munged info. This comes from my debug code.
Aditya Mukhopadhyay
@adityamukho
Haha.. I'm quite used to it.. I get the same.
Gary Gendel
@ggendel
Interrupt! Operator error, I was examining the wrong testcase! The problem was indeed the minDepth.
Aditya Mukhopadhyay
@adityamukho
Whew, that's a relief! The filter was working fine over at my end.
So I guess everything's working fine?
Gary Gendel
@ggendel
Yes. Once I finish validation I can push this for internal testing for final approval. After that, we're wedded to RecallGraph.
Aditya Mukhopadhyay
@adityamukho
Awesome!
Gary Gendel
@ggendel
@adityamukho I hope this is the last question. Do you have an example of the purge provider? I can't seem to figure out the right information to put in the body.
Aditya Mukhopadhyay
@adityamukho
Sure... just a minute...
// function purgeProvider (path, options = {}) { ... }
// This function deletes ALL history of the given path. If deleteUserObjects is true, it also
// deletes the corresponding objects from the plain collections (holding the current state)

purgeProvider('/c/pmconfig', { deleteUserObjects: true, silent: false })
Gary Gendel
@ggendel
Super, thanks.
Gary Gendel
@ggendel
@adityamukho Just a "How do I?" question... How do I get a list of "deleted" documents? I'd like to implement a rollback feature, including recovering deleted items.
Aditya Mukhopadhyay
@adityamukho
You can use the log endpoint. Just a min, sending you an example
Aditya Mukhopadhyay
@adityamukho
logProvider('/', { groupBy: 'node', groupLimit: 1, postFilter: "events[0].event === 'deleted'" })
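The shape of that query can be pictured like this: with groupBy: 'node' and groupLimit: 1, each group carries only the latest event for its node, and the postFilter keeps the nodes whose latest event is a delete. Below is a hypothetical in-memory illustration of that filtering step; the real grouping and filtering happen server-side:

```javascript
// Each group holds the latest event for one node (groupLimit: 1).
// The post-filter keeps only nodes whose latest event is a delete.
// (Illustrative data and filtering only; not RecallGraph internals.)
function latestDeleted (groups) {
  return groups.filter(g => g.events[0].event === 'deleted')
}

const groups = [
  { node: 'pmconfig/1', events: [{ event: 'deleted' }] },
  { node: 'pmconfig/2', events: [{ event: 'updated' }] }
]

latestDeleted(groups) // → only the pmconfig/1 group survives
```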
Gary Gendel
@ggendel
Thanks. I'll give it a go.
Gary Gendel
@ggendel
Not quite what I expected. I created two documents in two collections, then I deleted one and got nothing back for the first request. Then I deleted the other and got the following error:
No object at /
Nevermind. Must be me again. I did it in the API "Try it now" and it worked. Not your problem.
Aditya Mukhopadhyay
@adityamukho
Ok, but please let me know if you can't get it to work with the provider.
Gary Gendel
@ggendel
I discovered my error, just outsmarted myself.
Aditya Mukhopadhyay
@adityamukho
A webinar recording of RecallGraph, where I discuss its roadmap, adoption and development efforts. Apologies in advance for the occasional drop in quality.
https://www.youtube.com/watch?v=A953O3hT1Os
Aditya Mukhopadhyay
@adityamukho
Provider API documentation now available at https://recallgraph.github.io/RecallGraph/
Gary Gendel
@ggendel
Very nice. Thanks.
Aditya Mukhopadhyay
@adityamukho
RecallGraph now has a website: https://recallgraph.tech/
Milko Škofič
@milko
A short question: approximately when one may expect support for valid time to be available?
Aditya Mukhopadhyay
@adityamukho

A short question: approximately when one may expect support for valid time to be available?

I've been trying to raise funds/sponsorship for the RecallGraph project during the last few months. Since it is now almost two years that I've been working almost exclusively on RecallGraph, I need to establish a revenue source to be able to keep working on it going forward. I'll get back to core development as soon as I secure some funds.

If you want, you can help by sponsoring the project. Please visit https://opencollective.com/recallgraph

LeenaBahulekar
@LeenaBahulekar
@adityamukho We are working on a requirement where we need versioning of data in ArangoDB. We are very excited to see this solution. We have already installed it and are trying our hand at the API. I have gone through most of the documentation provided for RecallGraph. I wanted to understand whether this is built on the same principles as cited by ArangoDB in "Time traveling with graph databases". Is there any white paper or technical paper that you have published to help understand more about RecallGraph?
Aditya Mukhopadhyay
@adityamukho

@LeenaBahulekar Glad to see you on board with the idea of RecallGraph as a data versioning solution.

RecallGraph's design is actually quite different from the one provided in the blog post on time-travelling databases. There isn't a technical whitepaper as such, but you can refer to this YouTube video (in case you haven't seen it already), which discusses a few details about RecallGraph's architecture:

https://www.youtube.com/watch?v=UP2KDQ_kL4I

LeenaBahulekar
@LeenaBahulekar
@adityamukho Yes, we did go through this video. Thanks for sharing again. We had a specific question around transaction support. ArangoDB provides a transaction API, and we were looking at a use case where we would need to execute a set of updates on the graph in a transaction. Now if we plan to use RecallGraph, all updates will go through the RecallGraph API. Is it possible to club a set of updates together as a transaction?
Aditya Mukhopadhyay
@adityamukho

@LeenaBahulekar Although RecallGraph supports bulk updates, it wraps each individual update in its own separate transaction. This is because each update involves a number of associated event log entries, skeleton graph updates, snapshot insertions, etc. On average, around 5 internal documents are created/updated during a single node update, although in some cases this number could go higher.

Since ArangoDB does not support nested transactions, it is not possible to wrap these individual small transactions under a larger envelope transaction. It might be useful to support bulk updates in a single transaction in future versions of RecallGraph, but it would need some careful design in order to avoid overshooting memory limits. Here's a list of limitations that must be accounted for when using the RocksDB engine - https://www.arangodb.com/docs/stable/transactions-limitations.html#rocksdb-storage-engine.

Supporting bulk updates in a single transaction is thus not a straightforward matter.
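To give a feel for the memory concern: if each node update writes roughly 5 internal documents on top of the user document itself, a bulk update of N nodes wrapped in a single envelope transaction would have to hold on the order of 6N document writes. A back-of-envelope sketch (the factor of ~5 comes from the discussion above; everything else here is hypothetical):

```javascript
// Rough estimate of the write volume a single envelope transaction would
// have to hold if a bulk update were NOT split into per-node transactions.
// internalWritesPerUpdate ≈ 5 (event log, skeleton graph, snapshots),
// plus 1 for the user document itself. Illustration only.
function envelopeTransactionLoad (numNodeUpdates, internalWritesPerUpdate = 5) {
  return numNodeUpdates * (internalWritesPerUpdate + 1)
}

envelopeTransactionLoad(100000) // → 600000 document writes in one transaction
```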