Dear all, first a BIG thank you for opening up this project to the world, pushing it forward, and providing good project documentation and groups like this one for new users. I'd like to see more projects like traildb out there :)
My question is about optimizing the retrieval of consecutive events: I would like to efficiently retrieve all the events stored in one trail within a time interval [t1, t2].
Let's first assume the simple use case, where no traildb filter is configured on the cursor.
The "data" (that gets stored with each "event") always has the same length.
The closest API I've seen for achieving this is tdb_multi_cursor_next_batch().
Is this the fastest traildb API to retrieve events stored in a given time range?
A tangent to the above: I just saw some notes here about a great feature, tdb indexing (created using 'tdb index -i my-tdb').
This could speed up such queries quite a bit, if I understood it correctly. Does this indexing operation apply to events as well, and is it safe to run it in parallel with "read" operations (cursors reading from different processes)?
Kind regards,
Marius
You can consume events one by one with tdb_cursor_next() vs. many events in a batch with tdb_multi_cursor_next_batch(). The batch mode tends to be much faster if using a language binding. That is what the tdb command line tool does, and you can use it in your own apps too.
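For illustration, here is a minimal sketch of the single-cursor path that keeps only the events of one trail inside [t1, t2]. The path, trail_id and the two bounds are placeholders, and error handling is mostly omitted:

#include <traildb.h>
#include <stdint.h>
#include <stdio.h>

/* Sketch: scan one trail and keep only events with t1 <= timestamp <= t2. */
void scan_trail_range(const char *path, uint64_t trail_id, uint64_t t1, uint64_t t2)
{
    tdb *db = tdb_init();
    if (tdb_open(db, path))              /* non-zero tdb_error means failure */
        return;

    tdb_cursor *cursor = tdb_cursor_new(db);
    tdb_get_trail(cursor, trail_id);     /* position the cursor on one trail */

    const tdb_event *event;
    while ((event = tdb_cursor_next(cursor))) {
        if (event->timestamp < t1)
            continue;                    /* before the window, skip */
        if (event->timestamp > t2)
            break;                       /* events are time-ordered, stop */
        printf("event at %llu with %llu items\n",
               (unsigned long long)event->timestamp,
               (unsigned long long)event->num_items);
    }

    tdb_cursor_free(cursor);
    tdb_close(db);
}

If events from several tdbs need to be merged, cursors like the one above can be handed to tdb_multi_cursor_new() and drained with the batch call instead.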
/usr/bin/ld: .dub/obj/TrailDB.o: relocation R_X86_64_32 against symbol `_D9Exception7__ClassZ' can not be used when making a shared object; recompile with -fPIC
The issue is the -fPIC flag: http://www.cprogramming.com/tutorial/shared-libraries-linux-gcc.html. dmd calls gcc and perhaps supplies the -fPIC flag automatically, while perhaps ldc does not?
You could also use the library via ctypes / cffi in Python, if you want to give it a try.
tdb_cons_open always creates a new, empty tdb. If you want to add previous events to a new tdb, open the second tdb with a different name and use tdb_cons_append to add the events of a.tdb there.
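A minimal sketch of that flow, assuming the events of an existing a.tdb should be copied into a new b.tdb with the same fields (the field names and paths below are just placeholders):

#include <traildb.h>

/* Sketch: copy all events from an existing tdb into a freshly created one. */
int copy_tdb(void)
{
    /* open the source tdb (a.tdb) */
    tdb *src = tdb_init();
    if (tdb_open(src, "a"))
        return -1;

    /* create a new, empty tdb under a different name */
    const char *fields[] = {"type", "value"};   /* must match the source fields */
    tdb_cons *cons = tdb_cons_init();
    if (tdb_cons_open(cons, "b", fields, 2))
        return -1;

    /* append every event of a.tdb into the new tdb, then finalize it */
    if (tdb_cons_append(cons, src))
        return -1;
    if (tdb_cons_finalize(cons))
        return -1;

    tdb_cons_close(cons);
    tdb_close(src);
    return 0;
}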
Hi @tuulos, thanks for the help!
I had a look over the API in https://github.com/traildb/traildb/blob/master/doc/docs/api.md. There are tdb_multi_cursor_(new|free|reset), tdb_multi_cursor_next and tdb_multi_cursor_next_batch. The *next* methods return events, so if I want to iterate per primary key, I have to merge them on visitor_id first, like this:
visitor_map = {}
// first pass: group events by visitor_id
for (event in cursor) {
    visitor_id = event.key
    visitor_map[visitor_id] = append(visitor_map[visitor_id], event)
}
// THEN do work, per visitor
for ((visitor_id, events) in visitor_map) {
    // DO SOME WORK
}
So I need to iterate over the data once and hold all of it, which takes time and space.
Is there a better way to do this?
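For reference, this is roughly the batch-consuming loop I have in mind, as a C sketch. The tdb_multi_event fields used below (event, cursor_idx) are my reading of the docs, so treat them as assumptions; error handling is omitted:

#include <traildb.h>
#include <stdint.h>

#define BATCH_SIZE 1000

/* Sketch: drain a multi-cursor in batches; where the comment says "group",
   the per-visitor buffering described above would happen. */
void drain(tdb_multi_cursor *mcursor)
{
    tdb_multi_event batch[BATCH_SIZE];
    uint64_t n;

    while ((n = tdb_multi_cursor_next_batch(mcursor, batch, BATCH_SIZE)) > 0) {
        for (uint64_t i = 0; i < n; i++) {
            const tdb_event *event = batch[i].event;
            /* group: append `event` to its visitor's buffer here;
               batch[i].cursor_idx says which source cursor it came from */
            (void)event;
        }
    }
}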