Alan Hemmings
@goblinfactory
so is that a non-starter then? in order for it to be remotely possible, what building blocks are needed?
how have other technologies offered what you'd like to see in memstate?
you have this already with cassandra, right?
Robert Friberg
@rofr
Seems like a showstopper, yes. Anyway, I found an event sourcing implementation :)
Not cassandra
postgres, sqlstreamstore, eventstore are the main ones. Working on a provider for Pravega
Cosmos might work...
Alan Hemmings
@goblinfactory
I have that on my trello board for domanium to investigate
can't remember how I found it.
Robert Friberg
@rofr
and the answer is....
drum roll.....
Alan Hemmings
@goblinfactory
Ah, just read the homepage docs and remembered where I saw it... I had googled table storage limitations, and it came up in the top results for the limitations
Robert Friberg
@rofr
optimistic concurrency!
Alan Hemmings
@goblinfactory
don't you still need to sort the clock problem?
Robert Friberg
@rofr
Let's just build it on top of Streamstone, that will be simple
Alan Hemmings
@goblinfactory
tbh ... I don't care how it's sorted, I know it's complicated, and solutions involve complicated stuff like gossip protocols, ...stuff I really don't want to care about.
streamstone looks great, from what I saw.
ok... will throw something together tomorrow and compare that with single node, and see how they run...
assuming I don't ignore the compiler warnings ...
(groan!)
this will be fun :D
has been fun, except for... you know... the ...er... ignoring compiler warning thing.
Robert Friberg
@rofr
On top of StreamStone you won't need the BatchingJournalWriter, it already does batching
Alan Hemmings
@goblinfactory
Hi Rob, going to have to postpone the eval of streamstone + memstate until this weekend. The big issue is not performance, but cost. Streamstone doesn't appear to support Azure Table storage, it only supports Azure CosmosDB CloudTable, which is similar but quite different cost-wise.
I also probably won't be able to use it myself, since I need a solution that works with Azure Functions, serverless-style, that's pay-as-you-go, with zero minimum monthly cost.
Alan Hemmings
@goblinfactory
Hi Robert,
so ...done a bit of experimenting (not a lot), and as far as I can see, streamstone doesn't solve any problems I have. Originally I thought it would help solve the problem of ...what happens if azure decides to create a second instance of an azure function without me expecting it to, despite any attribute or json or configuration requiring a single partition, or any other trick to try to force an azure function into some type of singleton
azure doesn't guarantee singleton instances, it guarantees a single instance will be processing a message at a time.
partitioning seems to apply not so much to azure functions that are http triggered, but much more to message-triggered ones
Alan Hemmings
@goblinfactory
hi @rofr, is await DisposeAsync on the engine safe to call more than once? i.e. is it thread-safe? And is it safe to wrap this in something like
    private static Engine<AccountModel> _engine;
    private static bool stopped; // set once the engine has already been disposed normally

    // static constructor for Azure function class Test1
    static Test1()
    {
        // safety net in case not stopped
        AppDomain.CurrentDomain.ProcessExit += (s, e) =>
        {
            if (!stopped)
            {
                _engine.DisposeAsync().GetAwaiter().GetResult();
            }
        };
    }
Robert Friberg
@rofr
yes, thread safe and calling more than once has no effect
Albert
@albertyfwu

Hi @rofr. Recently I was looking into ways to hold a large singleton aggregate in memory but still have reasonable persistence to DB/file/etc. in order to recover state. I eventually got into event sourcing, Prevayler, and finally, memstate.

When I look at the github page, there appears to be infrequent development, and it's still in alpha/beta stage. Could you give some idea on how stable the current release is and what your intentions are for further development/support?

In addition, I have a question about using Postgres as the persistence. Suppose in the write-ahead approach, memstate attempts to persist a command C1 to Postgres. For whatever reason, the server persists C1, but we experience a client-side timeout. At this point, the in-memory state will be behind that of the persistence. If we then receive a command C2, that will persist to Postgres and then operate on the stale in-memory state and possibly result in a different state than when the commands are later replayed. Is there some mechanism in memstate to detect this kind of out-of-sync behavior (e.g. using event versioning) and recover transparently?

Robert Friberg
@rofr

Hi @albertyfwu
I'll take the easy question first..

In addition, I have a question about using Postgres as the persistence. Suppose in the write-ahead approach, memstate attempts to persist a command C1 to Postgres. For whatever reason, the server persists C1, but we experience a client-side timeout. At this point, the in-memory state will be behind that of the persistence. If we then receive a command C2, that will persist to Postgres and then operate on the stale in-memory state and possibly result in a different state than when the commands are later replayed. Is there some mechanism in memstate to detect this kind of out-of-sync behavior (e.g. using event versioning) and recover transparently?

JournalRecords have a RecordNumber and by default these are required to have an unbroken sequence. In the scenario above, the engine will throw an exception when receiving C2 if it has not yet seen C1.
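Roughly, the gap check amounts to something like this (a sketch of the idea only, not the actual Memstate source; the type name is invented):

    using System;

    // Sketch: journal records must arrive in an unbroken sequence 0, 1, 2, ...
    public class SequenceGuard
    {
        private long _lastRecordNumber = -1;

        public void Verify(long recordNumber)
        {
            if (_lastRecordNumber >= 0 && recordNumber != _lastRecordNumber + 1)
            {
                // A gap (e.g. C2 arriving before C1 has been seen) is fatal.
                throw new InvalidOperationException(
                    $"Expected record {_lastRecordNumber + 1} but got {recordNumber}");
            }
            _lastRecordNumber = recordNumber;
        }
    }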

Robert Friberg
@rofr

When I look at the github page, there appears to be infrequent development, and it's still in alpha/beta stage. Could you give some idea on how stable the current release is and what your intentions are for further development/support?

We are running memstate in production for several systems using EventStore or SqlStreamStore for storage. The core features are solid using either of these storage providers. There are some loose ends that need to be addressed before a 1.0 release though. Event subscriptions over a remote connection are not working for example.
PS: don't use the standalone Postgres provider, we may drop it altogether in favor of SqlStreamStore which has support for MySql, MSSQL and Postgres.

Albert
@albertyfwu

JournalRecords have a RecordNumber and by default these are required to have an unbroken sequence. In the scenario above, the engine will throw an exception when receiving C2 if it has not yet seen C1.

Thanks for the quick response. I'm still unclear on when this discrepancy is resolved. After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2? At that time, it wouldn't have learned that C1 was successfully written to DB yet. Is there some syncing that's happening out-of-band between C1 and C2? Or is there some synchronous synchronization happening to resolve discrepancies between engine in-memory and DB at the time it persists C2 to DB? Let me know if I need to explain this question more clearly.

Another question I have is:
Albert
@albertyfwu

In my particular use case, I have one single big aggregate (instead of many instances).

I know that memstate persists/applies commands sequentially, but I actually need the aggregate "locked" before the "commands" are issued, since I'm doing something more like "event" rather than "command" sourcing.

My workflow is:

  • Lock the single big aggregate.
  • Validate the command.
  • If success, create an event, persist it, and apply it.
  • Unlock the single big aggregate.

It seems that memstate supports something like this instead:

  • Validate the request.
  • If success, create a command.
  • Lock the aggregate.
  • Persist the command and apply it.
  • Unlock the aggregate.

To me, it seems that I'd need to handle my own locking (before validation) before handing it off to memstate, roughly as in the sketch below. Let me know if you see how this can be handled more natively by memstate. Thanks.
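A minimal sketch of that first workflow, using an application-level lock (SemaphoreSlim, since the awaited calls can't sit inside a lock statement); Request, Validate and CreateEventCommand are placeholders for my own code, not memstate APIs, and _engine is the single Engine<AccountModel>:

    using System.Threading;
    using System.Threading.Tasks;

    private static readonly SemaphoreSlim AggregateLock = new SemaphoreSlim(1, 1);

    private static async Task Handle(Request request)      // Request is a placeholder type
    {
        await AggregateLock.WaitAsync();                    // 1. lock the single big aggregate
        try
        {
            Validate(request);                              // 2. validate with request + aggregate state in hand
            var evt = CreateEventCommand(request);          // 3. if success, create the event (wrapped as a command)
            await _engine.Execute(evt);                     //    persist it and apply it
        }
        finally
        {
            AggregateLock.Release();                        // 4. unlock the single big aggregate
        }
    }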

Albert
@albertyfwu
I wanted to clarify above that the reason I do it that way is that in the validation step, I need the context of the command (request) AND the aggregate state together in order to validate.
Robert Friberg
@rofr

After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2?

Because C2 has the wrong RecordNumber. How it works is not trivial to explain as the various storage providers behave differently. An RDBMS backend could use either auto_increment or optimistic concurrency to ensure that commands are assigned sequential ids.

Conceptually this is how a command is handled:
  1. You create a command object and pass it to engine.Execute(cmd)
  2. The engine sends the command to the underlying storage provider
  3. Storage provider writes the command
  4. The command is applied to the aggregate
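In rough code that flow looks something like this (a sketch; Deposit, AccountModel and Credit are made-up examples, the exact memstate base-class signatures may differ slightly, and engine stands for an Engine<AccountModel> instance):

    // A command targets the in-memory model type and mutates it in Execute.
    public class Deposit : Command<AccountModel>
    {
        public string AccountId { get; set; }
        public decimal Amount { get; set; }

        public override void Execute(AccountModel model)
        {
            model.Credit(AccountId, Amount);   // Credit is a hypothetical model method
        }
    }

    // The returned Task completes once the command has been journaled and applied.
    await engine.Execute(new Deposit { AccountId = "abc", Amount = 100m });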
Robert Friberg
@rofr
For your requirements you could do either:
  1. Validate your request with a memstate Query and issue a Command if the validation was successful
  2. Validate inside the Command.Execute method before making any changes to the aggregate. If invalid throw an exception or return a status code
2 is the recommended practice but has a drawback: the "failed" commands will be persisted, which could be a problem if many commands fail.
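A sketch of option 2 (the names are illustrative, not from the memstate samples, and the Command<TModel, TResult> signature is written from memory):

    public enum WithdrawResult { Ok, InsufficientFunds }

    // Validation runs inside Execute, under the engine's single-writer execution,
    // but note the command is journaled even when validation fails.
    public class Withdraw : Command<AccountModel, WithdrawResult>
    {
        public string AccountId { get; set; }
        public decimal Amount { get; set; }

        public override WithdrawResult Execute(AccountModel model)
        {
            if (!model.HasBalance(AccountId, Amount))       // HasBalance is hypothetical
                return WithdrawResult.InsufficientFunds;    // or throw an exception

            model.Debit(AccountId, Amount);                 // Debit is hypothetical
            return WithdrawResult.Ok;
        }
    }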
Robert Friberg
@rofr
For 1 you would need to ensure that no other Command is executed between your Query and Command which is impossible in a load balanced environment
Albert
@albertyfwu

After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2?

Because C2 has the wrong RecordNumber. How it works is not trivial to explain as the various storage providers behave differently. An RDBMS backend could use either auto_increment or optimistic concurrency to ensure that commands are assigned sequential ids.

@rofr Thanks for all your help so far, but I don't think we're talking about the same thing.

I do agree that using auto_increment or other techniques can guarantee that commands are assigned sequential ids in the DB. However, I don't see how that addresses keeping the in-memory representation in-sync with the DB in a certain client-side timeout (note: not server-side timeout) situation:

  1. Suppose we start with the state s0 for our in-memory model.
  2. We call engine.Execute(c1), which sends command to DB.
  3. DB writes a new row c1 with ID 1 (auto-incrementing).
  4. engine.Execute(c1) gets a client-side timeout because the DB write took a long time.
  5. At this point, the engine is unaware that row c1 was created in the DB, and in-memory model stays as state s0.
  6. At this point, if we call engine.Execute(c2), that will write the command to DB (row c2 with ID 2), and the in-memory model will try to compute c2(s0).
  7. However, if at this point the application crashes, and we replay events from the DB, we'd try to replay as c2(c1(s0)) instead.

How does memstate either (a) prevent c2 from being written to DB (and then immediately re-sync in-memory with DB), or (b) avoid being out-of-sync in the first place?

Either I am misunderstanding what the code is actually doing, or there are some additional synchronization steps above that I'm unaware of.

Thanks again. I appreciate it.

It seems to me there needs to be some sort of CAS-like model where the DB write is like INSERT c2 if current max_seq_id is <expected_seq_id_from_in_memory_model> (or we don't auto-increment in DB but use the in-memory seq_id to write, and enforce uniqueness on the seq_id column in DB).
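As a sketch of that CAS-style write (table, column and parameter names invented; Npgsql is used here purely for illustration):

    using System.Threading.Tasks;
    using Npgsql;

    static class Journal
    {
        // A primary key / unique constraint on record_number rejects the INSERT
        // if the slot is already taken, i.e. if the in-memory model's idea of the
        // next sequence number is stale.
        public static async Task AppendAsync(
            NpgsqlConnection connection, long expectedRecordNumber, string serializedCommand)
        {
            const string sql =
                "INSERT INTO journal (record_number, command) VALUES (@n, @c)";
            using var cmd = new NpgsqlCommand(sql, connection);
            cmd.Parameters.AddWithValue("n", expectedRecordNumber);
            cmd.Parameters.AddWithValue("c", serializedCommand);
            await cmd.ExecuteNonQueryAsync();  // unique violation => out of sync, resync before retrying
        }
    }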
Albert
@albertyfwu

I actually spent a couple of hours reading the code and I think I see my misunderstanding.

I originally thought that the in-memory model was immediately updated following a successful command write to DB, but I now see that memstate was written to support horizontally-scaled services in a traditional CQRS manner, where in-memory state isn't updated until the subscriber reads the command from the DB, which guarantees correctness.

Given that my particular use case runs in a single process (no horizontal scaling, ever), I'm wondering if that simplification allows a way for me to achieve write-ahead-logging semantics but get immediate read-after-write, instead of waiting to subscribe to my own writes to the DB. But in any case, that doesn't seem to be the use case for memstate.

Thanks.

Albert
@albertyfwu

For 1 you would need to ensure that no other Command is executed between your Query and Command which is impossible in a load balanced environment

That doesn't seem sufficient; suppose:

  1. engine.execute(C1), which gets persisted to DB.
  2. I lock the aggregate, and issue a Query followed by engine.execute(C2) (before subscription picks up on C1 having been written to DB).
  3. Now the DB has both C1 and C2, though C2 was persisted assuming that there was no state change between the Query (which really reflected state before C1 was applied) and engine.execute(C2).
Robert Friberg
@rofr
engine.Execute(Command) returns a Task which completes when the command has been applied to the in-memory state. If you have a single engine and do reads and writes (awaited) on a single thread then the system is 100% ACID.
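In other words, something along these lines (a sketch; Deposit and GetBalance are the made-up example types from above, the Query<TModel, TResult> signature is written from memory, and engine is the single Engine<AccountModel>):

    public class GetBalance : Query<AccountModel, decimal>
    {
        public string AccountId { get; set; }

        public override decimal Execute(AccountModel model)
            => model.BalanceOf(AccountId);   // BalanceOf is hypothetical
    }

    // Awaiting Execute means the command has already been applied to the
    // in-memory state before the next line runs, so the query sees the write.
    await engine.Execute(new Deposit { AccountId = "abc", Amount = 100m });
    var balance = await engine.Execute(new GetBalance { AccountId = "abc" });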