This is purely a renaming to match familiar concepts, and some fluent APIs abstract away the reader and writer
Vlad Ilyushchenko
@bluestreak01
@sirinath I get it now, thanks. The initial version was the way you suggested. There can only be a single instance of a writer for the same journal at any given point in time. This is the case even across processes. An attempt to create a second writer instance will result in an exception. At the same time there can be multiple simultaneous readers against the same journal. If both reader and writer functionality were wrapped in the same interface, single-writer enforcement would be deferred and less clear, as some methods would work and some would not. Also, having a single interface hides the intent of passing around an instance of the Journal class.
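The cross-process single-writer guarantee described above can be sketched with an OS-level file lock. This is not NFSdb's actual implementation, just a minimal illustration of the idea; the class and file names are hypothetical:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: enforce "one writer per journal" across processes
// by taking an exclusive lock on a marker file in the journal directory.
public final class JournalWriterLock implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    public JournalWriterLock(Path journalDir) throws IOException {
        channel = FileChannel.open(journalDir.resolve("_lock"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock l;
        try {
            l = channel.tryLock();            // null if another process holds it
        } catch (OverlappingFileLockException e) {
            l = null;                          // same-JVM holder
        }
        if (l == null) {
            channel.close();
            // mirrors the "second writer throws" behaviour described above
            throw new IllegalStateException("Writer already open for " + journalDir);
        }
        lock = l;
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

Readers never take this lock, so any number of them can open the same journal while one writer is active.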
Suminda Sirinath Salpitikorala Dharmasena
@sirinath
If you want to do SQL on many streams, how do you handle it?
Vlad Ilyushchenko
@bluestreak01
Do you mean a join?
Suminda Sirinath Salpitikorala Dharmasena
@sirinath
Yes
You need multiple streams, hence multiple readers.
Vlad Ilyushchenko
@bluestreak01
It isn't a problem having multiple readers. For the SQL implementation, and for any other concurrent access, there is the JournalPool class (which should be renamed to JournalFactoryPool), which hands out a JournalReaderFactory via get/release methods. It caches factories and readers to avoid opening and closing readers frequently. You can of course use a normal JournalFactory to do the same if performance is not a concern.
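The get/release caching described above can be sketched generically. This is not JournalPool's actual code; `ReaderPool` and its members are hypothetical names, and the real class pools reader factories rather than arbitrary objects:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Hypothetical sketch: a bounded get/release pool that caches reader
// instances so concurrent queries avoid repeated open/close cost.
public final class ReaderPool<T> {
    private final BlockingQueue<T> idle;
    private final Supplier<T> opener;

    public ReaderPool(int capacity, Supplier<T> opener) {
        this.idle = new ArrayBlockingQueue<>(capacity);
        this.opener = opener;
    }

    public T get() {
        T reader = idle.poll();          // reuse a cached reader if available
        return reader != null ? reader : opener.get();
    }

    public void release(T reader) {
        idle.offer(reader);              // cache for reuse; drop if pool is full
    }
}
```

Each query thread calls `get()` before reading and `release()` afterwards; the pool only pays the open cost when no cached reader is idle.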
Suminda Sirinath Salpitikorala Dharmasena
@sirinath
OK
This pooling is what I was thinking
The pooling can also be abstracted
Vlad Ilyushchenko
@bluestreak01
How abstract are you thinking?
Suminda Sirinath Salpitikorala Dharmasena
@sirinath
Like in a DB
Vlad Ilyushchenko
@bluestreak01
I don't think I understand. Do you mind giving me an example of how abstract the pool should be?
Suminda Sirinath Salpitikorala Dharmasena
@sirinath
A schema
A collection of tables
Which is also a table
You can have views defined as queries
And a DB which has many schemas
Under the hood they are a collection of pools, pools and journals
Vlad Ilyushchenko
@bluestreak01
Ok, got it. This is down the line, once there is a "query service", either local or over the network. Browsing database content is definitely essential
I think it might be an idea to have a benchmark suite against the competition as part of your CI
Vlad Ilyushchenko
@bluestreak01
I am going to need help with that to have an impartial benchmark. The only thing I need from CI is assurance that my changes do not make existing paths slower relative to the previous build
I had a look at that parser before writing my flat file import. Not bad, but too complex and slow for what I needed. The NFSdb parser is twice as fast as univocity on the very same file.
This is a very useful idea; in fact a friend of mine is doing a very similar project for a bank. It is very useful to integrate legacy data sources under a single query system. That said, what I'm doing is slightly different. Calcite's query system simply would not do for my project, for three reasons. First, its query system does not offer functionality beyond what you get from the individual databases; it looks more like the overlap between the functionality of the data sources it supports (compare the query functionality Splunk provides vs. Calcite). Second, pick a source file on the Calcite GitHub and search for uses of the "new" operator: there are far too many for what I'm building. Third, the name sounds strange (https://en.wikipedia.org/wiki/Calcite): what does it have to do with either querying or integration? ;)
Maybe one day somebody will honour my project by writing an adaptor for Calcite? :smile:
Came back from holiday today :) I found that I need a rewritable in-memory structure for some functions, the as-of join being one. I'm writing and testing that now. It'll unlock some exciting query capabilities once done.