    Suminda Dharmasena
    @sirinath
    Queries can be on the streams
    The streams can have an identifier and would be analogous to tables in a normal DB
    Suminda Dharmasena
    @sirinath
    Also have a look at: http://www.espertech.com/
    Suminda Dharmasena
    @sirinath
    Also, stream composition (like multi-table joins, Fan In / Fan Out patterns) must be well thought through.
    Perhaps compositional streams analogous to views in normal SQL
    Your thoughts on this?
    Vlad Ilyushchenko
    @bluestreak01
    Hey Sirinath, yes, the single writer pattern may not be very convenient or simple in multi-channel data acquisition, but it is wait- and lock-free and by far the fastest. I have an example here: https://github.com/NFSdb/nfsdb/tree/master/nfsdb-examples/src/main/java/org/nfsdb/examples/messaging that stores data produced by multiple threads in the same journal using a disruptor multiple-producer-single-consumer setup.
    On the SQL front, the entire language implementation is going to be stream based, including multi-journal joins and aggregation (more specifically, re-sampling). It is going to provide full ANSI SQL 92 support with extensions for handling time series. Unlike Java 8 streams, nfsdb streams are reusable, which will keep memory allocation to a bare minimum.
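A minimal sketch of the multiple-producer-single-consumer idea mentioned above. This is not the linked NFSdb example: a bounded `ArrayBlockingQueue` from the JDK stands in for the Disruptor ring buffer (and, unlike the Disruptor, it uses locks internally), and an in-memory list stands in for the journal. The class name `SingleWriterSketch` and the event strings are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: many producer threads, exactly one thread that appends.
final class SingleWriterSketch {
    // Sentinel value that tells the writer thread to stop.
    private static final String STOP = "__STOP__";

    static List<String> run(int producers, int eventsPerProducer) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
        // Stand-in for the journal; only the writer thread ever touches it.
        List<String> journal = new ArrayList<>();

        Thread writer = new Thread(() -> {
            try {
                for (String e = queue.take(); !e.equals(STOP); e = queue.take()) {
                    journal.add(e); // the single point where data is appended
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                try {
                    for (int i = 0; i < eventsPerProducer; i++) {
                        queue.put("producer-" + id + "-event-" + i);
                    }
                } catch (InterruptedException ex) {
                    Thread.currentThread().interrupt();
                }
            });
            threads.add(t);
            t.start();
        }

        try {
            for (Thread t : threads) {
                t.join();
            }
            queue.put(STOP); // all producers are done, let the writer drain and exit
            writer.join();   // join makes the writer's appends visible to the caller
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", ex);
        }
        return journal;
    }

    public static void main(String[] args) {
        System.out.println("events stored: " + run(4, 1000).size());
    }
}
```

Events from one producer stay in their original order, while events from different producers interleave in whatever order they reach the queue.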
    Suminda Dharmasena
    @sirinath
    How about some abstraction like a Schema, which is a collection of channels?
    Also, even the Journal can be an abstraction that need not map one-to-one to the file system.
    If you must, you can create multiple files behind the scenes.
    BTW, have you seen http://www.aerospike.com/?
    Tactevo
    @tactevo
    NFSdb looks great! I’m not clear on indexing though. How would I produce a timestamp ordered read after writing unordered? Does query include a sort order?
    Vlad Ilyushchenko
    @bluestreak01
    Thanks, it's not quite finished yet. If you can insert in batches, there is JournalWriter.mergeAppend(List) to merge an ordered List with an ordered journal.
    Once query language is complete you'd be able to do "select * from x order by timestamp"
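To illustrate what merging an ordered batch into ordered data involves, here is a generic two-pointer merge by timestamp. It is only a sketch of the idea, not how JournalWriter.mergeAppend(List) is implemented; the `Tick` type, its fields and the `merge` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of merging two timestamp-ordered sequences.
final class OrderedMergeSketch {
    // Made-up record type: a timestamp plus a payload.
    static final class Tick {
        final long timestamp;
        final String symbol;

        Tick(long timestamp, String symbol) {
            this.timestamp = timestamp;
            this.symbol = symbol;
        }
    }

    // Both inputs must already be sorted by timestamp, ascending.
    static List<Tick> merge(List<Tick> existing, List<Tick> batch) {
        List<Tick> out = new ArrayList<>(existing.size() + batch.size());
        int i = 0;
        int j = 0;
        while (i < existing.size() && j < batch.size()) {
            // "<=" keeps already stored rows ahead of new rows with an equal timestamp.
            if (existing.get(i).timestamp <= batch.get(j).timestamp) {
                out.add(existing.get(i++));
            } else {
                out.add(batch.get(j++));
            }
        }
        while (i < existing.size()) {
            out.add(existing.get(i++));
        }
        while (j < batch.size()) {
            out.add(batch.get(j++));
        }
        return out;
    }
}
```

The merge runs in time linear in the combined size, which is why inserting in ordered batches is cheaper than sorting everything at read time.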
    Vlad Ilyushchenko
    @bluestreak01
    @sirinath I tried to make an abstraction out of Journal with pure in-memory storage, but it is a lot of work, mainly because you cannot have overlapped memory in Java. So it is a major restructuring. I think what's crying out for attention now is a tool set for data access, which is where my focus is.
    Suminda Dharmasena
    @sirinath
    Sorry, what I meant is to get rid of the writer and reader abstractions. All this can be abstracted as a journal with a schema. Physically they can be in separate files, but logically this is like a normal DB with a Schema which contains many tables and a DB which contains many Schemas. The current journal can be the DB abstraction.
    This is purely renaming to match familiar concepts, plus some fluent APIs that abstract away the reader and writer.
    Vlad Ilyushchenko
    @bluestreak01
    @sirinath I get it now, thanks. The initial version was the way you suggest. There can only be a single writer instance for a given journal at any point in time, even across processes; an attempt to create a second writer instance results in an exception. At the same time, there can be multiple simultaneous readers against the same journal. If both reader and writer functions are wrapped by the same interface, single-writer enforcement is deferred and less clear, as some methods will work and some will not. A single interface also hides the intent of passing around an instance of the Journal class.
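The single-writer, many-readers rule described above could be sketched roughly as follows. This is an illustration only, not NFSdb code: `WriterRegistrySketch`, `Writer` and `Reader` are hypothetical names, and the check works inside one JVM only. Enforcing it across processes would need an OS-level mechanism (a file lock, for example), which is assumed away here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: at most one open writer per journal name, any number of readers.
final class WriterRegistrySketch {
    // Names of journals that currently have an open writer.
    private final Set<String> openWriters = ConcurrentHashMap.newKeySet();

    final class Writer implements AutoCloseable {
        private final String journal;

        private Writer(String journal) {
            this.journal = journal;
        }

        @Override
        public void close() {
            openWriters.remove(journal); // frees the slot for the next writer
        }
    }

    static final class Reader {
        final String journal;

        Reader(String journal) {
            this.journal = journal;
        }
    }

    // Fails fast at creation time, so the violation is visible immediately.
    Writer writer(String journal) {
        if (!openWriters.add(journal)) {
            throw new IllegalStateException("Writer already open for journal: " + journal);
        }
        return new Writer(journal);
    }

    // Readers are unrestricted.
    Reader reader(String journal) {
        return new Reader(journal);
    }
}
```

Keeping `Writer` and `Reader` as separate types means the second-writer error surfaces when the writer is requested, rather than later on some write method of a shared interface.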
    Suminda Dharmasena
    @sirinath
    If you want to run SQL over many streams, how do you handle it?
    Vlad Ilyushchenko
    @bluestreak01
    do you mean join?
    Suminda Dharmasena
    @sirinath
    Yes
    You need multiple streams hence multiple readers.
    Vlad Ilyushchenko
    @bluestreak01
    It isn't a problem having multiple readers. For SQL implementation and any other concurrent access there is class JournalPool (which should be renamed to JournalFactoryPool), which gives out JournalReaderFactory via get/release methods. It caches factories and readers to avoid opening/closing readers often. You can of course use normal JournalFactory to do the same if performance is not a concern.
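The get/release caching described above follows a common object-pool shape. The sketch below is generic and hypothetical (`ReaderPoolSketch`, `opener` and `openedCount` are invented names); it does not reproduce the JournalPool or JournalReaderFactory API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical sketch of a get/release pool that reuses expensive-to-open objects.
final class ReaderPoolSketch<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> opener; // called only when nothing is cached
    private int opened;

    ReaderPoolSketch(Supplier<T> opener) {
        this.opener = opener;
    }

    // Hands out a cached instance if one is idle, otherwise opens a new one.
    synchronized T get() {
        T cached = idle.pollFirst();
        if (cached != null) {
            return cached;
        }
        opened++;
        return opener.get();
    }

    // Returns an instance to the cache instead of closing it.
    synchronized void release(T reader) {
        idle.addFirst(reader);
    }

    // How many instances were actually opened, as opposed to reused.
    synchronized int openedCount() {
        return opened;
    }
}
```

A join over two journals would take two readers from such a pool and release both when the query finishes, so repeated queries avoid reopening files.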
    Suminda Dharmasena
    @sirinath
    OK
    This pooling is what I was thinking
    The pooling can also be abstracted away
    Vlad Ilyushchenko
    @bluestreak01
    How abstract are you thinking?
    Suminda Dharmasena
    @sirinath
    Like in a DB
    Vlad Ilyushchenko
    @bluestreak01
    I don't think I understand. Do you mind giving me an example of how abstract the pool should be?
    Suminda Dharmasena
    @sirinath
    A schema
    A collection of tables
    Which is also a table
    You can have views defined as queries
    And a DB which has many schemas
    Under the hood they are a collection of pools, pools and journals
    Vlad Ilyushchenko
    @bluestreak01
    Ok, got it. This is down the line, when there is a "query service", either local or networked. Browsing database content is definitely essential.
    Suminda Dharmasena
    @sirinath
    I think it might be an idea to have benchmark suite against competition as part of your CI
    Vlad Ilyushchenko
    @bluestreak01
    I am going to need help with that to have an impartial benchmark. The only thing I need from CI is that my changes do not make existing paths slower relative to the previous build.
    Suminda Dharmasena
    @sirinath
    Another project I stumbled upon. Not fast but interesting: https://geteventstore.com/
    Vlad Ilyushchenko
    @bluestreak01
    I had a look at that parser before writing my flat file import. Not bad, but too complex and slow for what I needed. The NFSdb parser is twice as fast as univocity on the very same file.
    Suminda Dharmasena
    @sirinath
    OK
    Suminda Dharmasena
    @sirinath
    Might be of some interest to your work: https://calcite.incubator.apache.org/