    Tactevo
    @tactevo
    NFSdb looks great! I’m not clear on indexing though. How would I produce a timestamp ordered read after writing unordered? Does query include a sort order?
    Vlad Ilyushchenko
    @bluestreak01
    Thanks, it's not quite finished yet. If you can insert in batches, there is JournalWriter.mergeAppend(List) to merge an ordered List with an ordered journal.
    Once the query language is complete you'll be able to do "select * from x order by timestamp"
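
The merge described above (an ordered batch combined with an already ordered journal) can be illustrated with a plain two-way merge. This is only a sketch of the general technique, not NFSdb's mergeAppend implementation; the class name and the use of bare long timestamps are assumptions made for illustration.

```java
public class MergeAppendSketch {
    // Hypothetical helper: merges two arrays that are each already sorted by timestamp.
    static long[] merge(long[] journal, long[] batch) {
        long[] out = new long[journal.length + batch.length];
        int i = 0, j = 0, k = 0;
        while (i < journal.length && j < batch.length) {
            // "<=" keeps existing journal rows ahead of batch rows with equal timestamps
            if (journal[i] <= batch[j]) {
                out[k++] = journal[i++];
            } else {
                out[k++] = batch[j++];
            }
        }
        // Copy whatever remains on either side
        while (i < journal.length) out[k++] = journal[i++];
        while (j < batch.length) out[k++] = batch[j++];
        return out;
    }
}
```

An unordered batch would first have to be sorted by timestamp before a merge like this applies.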
    Vlad Ilyushchenko
    @bluestreak01
    @sirinath I tried to make an abstraction out of Journal with pure in-memory storage, but it is a lot of work, mainly because you cannot have overlapped memory in Java. So it is a major restructuring. I think what's crying out now is a tool set for data access, which is where my focus is.
    Suminda Dharmasena
    @sirinath
    Sorry, what I meant is to get rid of the writer and reader abstractions. All this can be abstracted as a journal with a schema. Physically they can be in separate files, but logically this is like a normal DB with a Schema which contains many tables and a DB which contains many Schemas. The current journal can be the DB abstraction.
    This is purely renaming to match familiar concepts, plus some fluent APIs that abstract away the reader and writer
    Vlad Ilyushchenko
    @bluestreak01
    @sirinath I get it now, thanks. The initial version was the way you suggested. There can only be a single writer instance for the same journal at any given point in time, even across processes. An attempt to create a second writer instance will result in an exception. At the same time there can be multiple simultaneous readers against the same journal. If both reader and writer functions are wrapped by the same interface, single-writer enforcement is deferred and less clear, as some methods will work and some will not. A single interface also hides the intent when passing around an instance of the Journal class.
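
One common way to get "second writer fails with an exception, even across processes" in Java is an exclusive file lock. The sketch below assumes a dedicated lock file per journal; the chat does not say how NFSdb actually enforces the rule, and the class name and lock file are hypothetical.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SingleWriterSketch implements AutoCloseable {
    private final FileChannel channel;
    private final FileLock lock;

    // Fails fast at construction time if another writer already holds the lock.
    public SingleWriterSketch(Path lockFile) throws IOException {
        channel = FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        FileLock acquired;
        try {
            // Returns null when another process holds the lock
            acquired = channel.tryLock();
        } catch (OverlappingFileLockException e) {
            // Thrown when this JVM already holds the lock
            acquired = null;
        }
        if (acquired == null) {
            channel.close();
            throw new IllegalStateException("Journal already has a writer: " + lockFile);
        }
        lock = acquired;
    }

    @Override
    public void close() throws IOException {
        lock.release();
        channel.close();
    }
}
```

Readers would simply not take this lock, so any number of them can be open at once. Rejecting the second writer in the constructor is the "clear intent" argument above: the failure happens when the writer is created, not later on some individual method call.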
    Suminda Dharmasena
    @sirinath
    If you want to do a SQL on many streams how do you handle it?
    Vlad Ilyushchenko
    @bluestreak01
    do you mean join?
    Suminda Dharmasena
    @sirinath
    Yes
    You need multiple streams hence multiple readers.
    Vlad Ilyushchenko
    @bluestreak01
    It isn't a problem to have multiple readers. For the SQL implementation and any other concurrent access there is the class JournalPool (which should be renamed to JournalFactoryPool), which gives out JournalReaderFactory instances via get/release methods. It caches factories and readers to avoid opening and closing readers often. You can of course use a normal JournalFactory to do the same if performance is not a concern.
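
The get/release caching idea can be sketched generically as follows. This is not the JournalPool API; PoolSketch, its Supplier-based opener and openedCount() are hypothetical names used only to show the pattern of reusing released instances instead of opening new ones.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

public class PoolSketch<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> opener;
    private int opened;

    public PoolSketch(Supplier<T> opener) {
        this.opener = opener;
    }

    // Hands out a cached instance if one is idle, otherwise opens a new one.
    public synchronized T get() {
        T cached = idle.pollFirst();
        if (cached != null) {
            return cached;
        }
        opened++;
        return opener.get();
    }

    // Returns an instance to the cache instead of closing it.
    public synchronized void release(T item) {
        idle.addFirst(item);
    }

    // Number of instances actually opened, useful to confirm reuse.
    public synchronized int openedCount() {
        return opened;
    }
}
```

A caller would pair every get() with a release() in a finally block, so that an expensive reader is opened once and then shared across queries.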
    Suminda Dharmasena
    @sirinath
    OK
    This pooling is what I was thinking
    The pooling can also be abstracted away
    Vlad Ilyushchenko
    @bluestreak01
    How abstract are you thinking?
    Suminda Dharmasena
    @sirinath
    Like in a DB
    Vlad Ilyushchenko
    @bluestreak01
    I don't think I understand. Do you mind giving me an example of how abstract the pool should be?
    Suminda Dharmasena
    @sirinath
    A schema
    A collection of tables
    Which is also a table
    You can have views defined as queries
    And a DB which has many schemas
    Under the hood they are a collection of pools, pools and journals
    Vlad Ilyushchenko
    @bluestreak01
    Ok, got it. This is down the line when there is "query service" either local or network. Browsing database content is definitely essential
    Suminda Dharmasena
    @sirinath
    I think it might be an idea to have a benchmark suite against the competition as part of your CI
    Vlad Ilyushchenko
    @bluestreak01
    I am going to need help with that to have an impartial benchmark. The only thing I need from CI is that my changes do not make existing paths slower relative to the previous build
    Suminda Dharmasena
    @sirinath
    Another project I stumbled upon. Not fast but interesting: https://geteventstore.com/
    Suminda Dharmasena
    @sirinath
    Vlad Ilyushchenko
    @bluestreak01
    I had a look at that parser before writing my flat file import. Not bad, but too complex and slow for what I needed. The NFSdb parser is twice as fast as univocity on the very same file.
    Suminda Dharmasena
    @sirinath
    OK
    Suminda Dharmasena
    @sirinath
    Might be of some interest to your work: https://calcite.incubator.apache.org/
    Actually you can have Calcite as a frontend
    Suminda Dharmasena
    @sirinath
    What are your thoughts on Apache Calcite?
    This looks cool to have as the front end
    Maybe it can be one front end
    Vlad Ilyushchenko
    @bluestreak01
    This is a very useful idea; in fact a friend of mine is doing a very similar project for a bank. It is very useful for integrating legacy data sources under a single query system. That said, what I'm doing is slightly different. Calcite's query system simply would not do for my project, for three reasons. First, it does not offer functionality beyond what you get from the individual databases; it looks more like the overlap between the functionality of the data sources it supports (check what kind of query functionality Splunk provides vs. Calcite). Second, pick a source file on the Calcite GitHub and search for usages of the "new " operator; there are far too many for what I'm building. Third, the name sounds strange (https://en.wikipedia.org/wiki/Calcite): what does it have to do with either querying or integration? ;)
    Maybe one day somebody will honour my project by writing an adaptor for Calcite? :smile:
    Suminda Dharmasena
    @sirinath
    Following article might be of interest to you: http://preshing.com/20130107/this-hash-table-is-faster-than-a-judy-array/
    Suminda Dharmasena
    @sirinath
    How are things moving?
    Vlad Ilyushchenko
    @bluestreak01
    Came back from holiday today :) I found that I need a rewritable in-memory structure for some functions, as-of join being one. I'm writing and testing that. It'll enable some exciting query capabilities once done.
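
For readers unfamiliar with the term, an as-of join pairs each row on the left with the latest row on the right whose timestamp is not later than the left row's. The sketch below shows that definition over two timestamp-sorted arrays; it is not the rewritable in-memory structure mentioned above, and the class and method names are hypothetical.

```java
public class AsOfJoinSketch {
    // For each left timestamp, returns the index of the latest right row
    // with timestamp <= the left timestamp, or -1 if there is none.
    // Both arrays are assumed to be sorted in ascending order.
    static int[] asOfJoin(long[] left, long[] right) {
        int[] match = new int[left.length];
        int j = -1; // index of the last right row known to be <= the current left row
        for (int i = 0; i < left.length; i++) {
            while (j + 1 < right.length && right[j + 1] <= left[i]) {
                j++;
            }
            match[i] = j;
        }
        return match;
    }
}
```

Because both inputs are ordered, the right-hand cursor only moves forward, so the whole join is a single pass over each side.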
    Suminda Dharmasena
    @sirinath
    Great.