Alan Hemmings
@goblinfactory
tbh ... I don't care how it's sorted, I know it's complicated, and solutions involve complicated stuff like gossip protocols, ...stuff I really don't want to care about.
streamstone looks great, from what I saw.
ok... will throw something together tomorrow and compare that with single node, and see how they run...
assuming I don't ignore the compiler warnings ...
(groan!)
this will be fun :D
has been fun, except for... you know... the ...er... ignoring compiler warning thing.
Robert Friberg
@rofr
On top of StreamStone you won't need the BatchingJournalWriter, it already does batching
Alan Hemmings
@goblinfactory
Hi Rob, going to have to postpone the eval of streamstone + memstate until this weekend. The big issue is not performance, but cost. Streamstone doesn't appear to support Azure Table storage; it only supports Azure CosmosDb CloudTable, which is similar but quite different cost-wise.
I also probably won't be able to use it myself, since I need a solution that works with Azure functions, serverless style, that's pay as you go, with zero minimum monthly cost.
Alan Hemmings
@goblinfactory
Hi Robert,
so ...done a bit of experimenting (not a lot), and as far as I can see, streamstone doesn't solve any problems I have. Originally I thought it would help solve the problem of what happens if Azure decides to create a second instance of an Azure Function without me expecting it to, despite any attribute, json or configuration requiring a single partition, or any other trick to try to force an Azure Function into some type of singleton.
Azure doesn't guarantee singleton instances, it guarantees a single instance will be processing a message at a time.
partitioning seems to be much less about Azure Functions that are HTTP triggered, and much more about message triggers
Alan Hemmings
@goblinfactory
hi @rofr, is await DisposeAsync on the engine safe to call more than once? i.e. is it thread-safe? And is it safe to wrap this in something like

    private static Engine<AccountModel> _engine;
    private static bool stopped; // assumed to be set elsewhere when the engine is shut down cleanly

    // static constructor for Azure Function class Test1
    static Test1()
    {
        // safety net in case the engine was not stopped.
        AppDomain.CurrentDomain.ProcessExit += (s, e) =>
        {
            if (!stopped)
            {
                _engine.DisposeAsync().GetAwaiter().GetResult();
            }
        };
    }
Robert Friberg
@rofr
yes, thread safe and calling more than once has no effect
Albert
@albertyfwu

Hi @rofr. Recently I was looking into ways to hold a large singleton aggregate in memory but still have reasonable persistence to DB/file/etc. in order to recover state. I eventually got into event sourcing, Prevayler, and finally, memstate.

When I look at the github page, there appears to be infrequent development, and it's still in alpha/beta stage. Could you give some idea on how stable the current release is and what your intentions are for further development/support?

In addition, I have a question about using Postgres as the persistence. Suppose in the write-ahead approach, memstate attempts to persist a command C1 to Postgres. For whatever reason, the server persists C1, but we experience a client-side timeout. At this point, the in-memory state will be behind that of the persistence. If we then receive a command C2, that will persist to Postgres and then operate on the stale in-memory state and possibly result in a different state than when the commands are later replayed. Is there some mechanism in memstate to detect this kind of out-of-sync behavior (e.g. using event versioning) and recover transparently?

Robert Friberg
@rofr

Hi @albertyfwu
I'll take the easy question first..

In addition, I have a question about using Postgres as the persistence. Suppose in the write-ahead approach, memstate attempts to persist a command C1 to Postgres. For whatever reason, the server persists C1, but we experience a client-side timeout. At this point, the in-memory state will be behind that of the persistence. If we then receive a command C2, that will persist to Postgres and then operate on the stale in-memory state and possibly result in a different state than when the commands are later replayed. Is there some mechanism in memstate to detect this kind of out-of-sync behavior (e.g. using event versioning) and recover transparently?

JournalRecords have a RecordNumber and by default these are required to have an unbroken sequence. In the scenario above, the engine will throw an exception when receiving C2 if it has not yet seen C1.
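
A purely hypothetical sketch of what that unbroken-sequence rule amounts to (this is an illustration of the behavior described above, not memstate's actual implementation):

using System;

// Illustration only: the engine remembers the last RecordNumber it applied and
// refuses any record that does not follow it directly.
public sealed class SequenceGuard
{
    private long _lastApplied; // RecordNumber of the last record applied to the in-memory model

    public SequenceGuard(long lastApplied) => _lastApplied = lastApplied;

    public void Verify(long incomingRecordNumber)
    {
        if (incomingRecordNumber != _lastApplied + 1)
            throw new InvalidOperationException(
                $"Expected record {_lastApplied + 1} but received {incomingRecordNumber}; " +
                "a journal record was skipped or applied out of order.");
        _lastApplied = incomingRecordNumber;
    }
}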

Robert Friberg
@rofr

When I look at the github page, there appears to be infrequent development, and it's still in alpha/beta stage. Could you give some idea on how stable the current release is and what your intentions are for further development/support?

We are running memstate in production for several systems using EventStore or SqlStreamStore for storage. The core features are solid using either of these storage providers. There are some loose ends that need to be addressed before a 1.0 release though. Event subscriptions over a remote connection are not working for example.
PS: don't use the standalone Postgres provider, we may drop it altogether in favor of SqlStreamStore which has support for MySql, MSSQL and Postgres.

Albert
@albertyfwu

Hi @albertyfwu
I'll take the easy question first..

In addition, I have a question about using Postgres as the persistence. Suppose in the write-ahead approach, memstate attempts to persist a command C1 to Postgres. For whatever reason, the server persists C1, but we experience a client-side timeout. At this point, the in-memory state will be behind that of the persistence. If we then receive a command C2, that will persist to Postgres and then operate on the stale in-memory state and possibly result in a different state than when the commands are later replayed. Is there some mechanism in memstate to detect this kind of out-of-sync behavior (e.g. using event versioning) and recover transparently?

JournalRecords have a RecordNumber and by default these are required to have an unbroken sequence. In the scenario above, the engine will throw an exception when receiving C2 if it has not yet seen C1.

Thanks for the quick response. I'm still unclear on when this discrepancy is resolved. After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2? At that time, it wouldn't have learned that C1 was successfully written to DB yet. Is there some syncing that's happening out-of-band between C1 and C2? Or is there some synchronous synchronization happening to resolve discrepancies between engine in-memory and DB at the time it persists C2 to DB? Let me know if I need to explain this question more clearly.

Another question I have is:
Albert
@albertyfwu

In my particular use case, I have one single big aggregate (instead of many instances).

I know that memstate persists/applies commands sequentially, but I actually need the aggregate "locked" before the "commands" are issued, since I'm doing something more like "event" rather than "command" sourcing.

My workflow is:

  • Lock the single big aggregate.
  • Validate the command.
  • If success, create an event, persist it, and apply it.
  • Unlock the single big aggregate.

It seems that memstate supports something like this instead:

  • Validate the request.
  • If success, create a command.
  • Lock the aggregate.
  • Persist the command and apply it.
  • Unlock the aggregate.

To me, it seems that I'd need to handle my own locking (before validation) before handing it off to memstate. Let me know if you see how this can be handled more natively by memstate. Thanks.

Albert
@albertyfwu
I wanted to clarify above that the reason I do it that way is that in the validation step, I need the context of the command (request) AND the aggregate state together in order to validate.
Robert Friberg
@rofr

After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2?

Because C2 has the wrong RecordNumber. How it works is not trivial to explain as the various storage providers behave differently. An RDBMS backend could use either auto_increment or optimistic concurrency to ensure that commands are assigned sequential ids.

Conceptually this is how a command is handled:
  1. You create a command object and pass it to engine.Execute(cmd)
  2. The engine sends the command to the underlying storage provider
  3. Storage provider writes the command
  4. The command is applied to the aggregate (a rough sketch follows below)
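
As a rough illustration of those four steps. This is only a sketch: the Deposit command, the Credit method, and the exact Command<TModel> signature are assumptions based on this conversation, not verified against the memstate source.

// Hypothetical command, assuming a Command<TModel> base class with an Execute(TModel) override.
public class Deposit : Command<AccountModel>
{
    public string AccountId { get; set; }
    public decimal Amount { get; set; }

    public override void Execute(AccountModel model)
    {
        // Step 4: the command mutates the in-memory aggregate.
        model.Credit(AccountId, Amount);
    }
}

// Steps 1-3: create the command and pass it to the engine, which hands it to the
// storage provider; the awaited Task completes once the command has been written
// and applied to the in-memory model.
await engine.Execute(new Deposit { AccountId = "abc", Amount = 100m });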
Robert Friberg
@rofr
For your requirements you could do either:
  1. Validate your request with a memstate Query and issue a Command if the validation was successful
  2. Validate inside the Command.Execute method before making any changes to the aggregate. If invalid throw an exception or return a status code
2 is the recommended practice but has a drawback: the "failed" commands will be persisted, which could be a problem if many commands fail.
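
A sketch of option 2, validating inside the command before touching the aggregate. The Withdraw command, WithdrawResult type, and model members are made up for illustration; only the Command.Execute validation idea comes from the discussion above.

// Hypothetical command that validates before mutating, assuming a
// Command<TModel, TResult> base class that returns a result to the caller.
public enum WithdrawResult { Ok, Rejected }

public class Withdraw : Command<AccountModel, WithdrawResult>
{
    public string AccountId { get; set; }
    public decimal Amount { get; set; }

    public override WithdrawResult Execute(AccountModel model)
    {
        var account = model.GetAccount(AccountId);
        if (account == null || account.Balance < Amount)
            return WithdrawResult.Rejected; // validation failed: no change to the aggregate

        account.Balance -= Amount;
        return WithdrawResult.Ok;
    }
}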
Robert Friberg
@rofr
For 1 you would need to ensure that no other Command is executed between your Query and Command which is impossible in a load balanced environment
Albert
@albertyfwu

After C1 is written to the DB (but unknown to the engine), how does the engine know to throw an exception when receiving C2?

Because C2 has the wrong RecordNumber. How it works is not trivial to explain as the various storage providers behave differently. An RDBMS backend could use either auto_increment or optimistic concurrency to ensure that commands are assigned sequential ids.

@rofr Thanks for all your help so far, but I don't think we're talking about the same thing.

I do agree that using auto_increment or other techniques can guarantee that commands are assigned sequential ids in the DB. However, I don't see how that addresses keeping the in-memory representation in-sync with the DB in a certain client-side timeout (note: not server-side timeout) situation:

  1. Suppose we start with the state s0 for our in-memory model.
  2. We call engine.Execute(c1), which sends command to DB.
  3. DB writes a new row c1 with ID 1 (auto-incrementing).
  4. engine.Execute(c1) gets a client-side timeout because the DB write took a long time.
  5. At this point, the engine is unaware that row c1 was created in the DB, and in-memory model stays as state s0.
  6. At this point, if we call engine.Execute(c2), that will write the command to DB (row c2 with ID 2), and the in-memory model will try to compute c2(s_0).
  7. However, if at this point the application crashes, and we replay events from the DB, we'd try to replay as c2(c1(s_0)) instead.

How does memstate either (a) prevent c2 from being written to DB (and then immediately re-sync in-memory with DB), or (b) avoid being out-of-sync in the first place?

Either I am misunderstanding what the code is actually doing, or there are some additional synchronization steps above that I'm unaware of.

Thanks again. I appreciate it.

It seems to me there needs to be some sort of CAS-like model where the DB write is like INSERT c2 if current max_seq_id is <expected_seq_id_from_in_memory_model> (or we don't auto-increment in DB but use the in-memory seq_id to write, and enforce uniqueness on the seq_id column in DB).
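
A sketch of that CAS-style write, where the client supplies the sequence id it expects and a uniqueness constraint rejects a duplicate. The table name, column names, and use of Npgsql are assumptions for illustration only.

using System;
using System.Threading.Tasks;
using Npgsql;

// Hypothetical conditional append: the in-memory model's next expected sequence id is
// written explicitly; a UNIQUE constraint on seq_id makes a second writer with the
// same expectation fail instead of silently forking the history.
static async Task AppendAsync(NpgsqlConnection conn, long expectedSeqId, byte[] payload)
{
    using var cmd = new NpgsqlCommand(
        "INSERT INTO journal (seq_id, payload) VALUES (@seq, @payload)", conn);
    cmd.Parameters.AddWithValue("seq", expectedSeqId);
    cmd.Parameters.AddWithValue("payload", payload);
    try
    {
        await cmd.ExecuteNonQueryAsync();
    }
    catch (PostgresException e) when (e.SqlState == "23505") // unique_violation
    {
        // Another writer already used this sequence number: our in-memory state is stale.
        throw new InvalidOperationException($"Journal already contains record {expectedSeqId}.", e);
    }
}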
Albert
@albertyfwu

I actually spent a couple of hours reading the code and I think I've found my misunderstanding.

I originally thought that the in-memory model was immediately updated following a successful command write to DB, but I now see that memstate was written to support horizontally-scaled services in a traditional CQRS manner, where in-memory state isn't updated until the subscriber reads the command from the DB, which guarantees correctness.

Given that my particular use case runs in a single process (no horizontal scaling, ever), I'm wondering if that simplification allows a way for me to achieve the write-ahead logging semantics but get immediate read-after-write instead of waiting to subscribe to my own writes to the DB. But in any case, that doesn't seem to be the use case for memstate.

Thanks.

Albert
@albertyfwu

For 1 you would need to ensure that no other Command is executed between your Query and Command which is impossible in a load balanced environment

That doesn't seem sufficient; suppose:

  1. engine.execute(C1), which gets persisted to DB.
  2. I lock the aggregate, and issue a Query followed by engine.execute(C2) (before subscription picks up on C1 having been written to DB).
  3. Now DB has both C1 and C2, though C2 was persisted assuming that there was no state change between the Query (which really reflected the state before C1 was applied) and engine.execute(C2).
Robert Friberg
@rofr
engine.Execute(Command) returns a Task which completes when the command has been applied to the in-memory state. If you have a single engine and do reads and writes (awaited) on a single thread then the system is 100% ACID.
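
In other words, on a single node the pattern below gives read-after-write consistency, because the first await does not complete until the in-memory model has been updated. The Deposit and GetBalance types are hypothetical placeholders.

// Single engine, single logical thread of awaited calls (command and query types are made up).
await engine.Execute(new Deposit { AccountId = "abc", Amount = 100m });   // completes after the model is updated
var balance = await engine.Execute(new GetBalance { AccountId = "abc" }); // observes the deposit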
Robert Friberg
@rofr

I originally thought that the in-memory model was immediately updated following a successful command write to DB, but I now see that memstate was written to support horizontally-scaled services in a traditional CQRS manner, where in-memory state isn't updated until the subscriber reads the command from the DB, which guarantees correctness.

Not really CQRS as there is only a single model for both reads and writes. But yes, memstate is designed to support multiple nodes, each with the same sequence of in-memory states.

Also, from the point of view of the client the model is immediately updated given that the command is awaited.
Robert Friberg
@rofr
The purpose of the subscriber model is to obtain a globally ordered sequence of commands from multiple nodes. The only possible kind of inconsistency, given correct message delivery, is temporal. In theory, a node could lag behind and serve stale queries but I've never seen it happen.
Albert
@albertyfwu
Got it. Makes sense. Thanks.
Albert
@albertyfwu
I noticed in the docs that you mention snapshots, but I couldn't find any corresponding code actually supporting that. Is this a future feature? https://memstate.io/docs/core-1.0/storage/snapshots/
If it's not yet supported and I decide to use memstate and extend it personally, I'd be interested in contributing back.
Alan Hemmings
@goblinfactory

Right. It's for use on systems you own and manage, where you can push the code as needed.

sounds like a distributed Erlang

Robert Friberg
@rofr
@albertyfwu the docs are copied from origodb, and snapshots are not yet implemented. The main reason being we tend to avoid them and just replay the entire journal at startup. A huge and complex snapshot can take just as long or longer to deserialize than simply rerunning the commands. @goblinfactory did some experiments using Azure table storage and saw 75K commands/second replay rate. I have systems with millions of commands that take ~30 minutes to load. Not saying snapshots aren't useful just not as necessary as some may think.
Albert
@albertyfwu

@rofr Thanks for the explanation. The concern I have for my use case is not the execution of millions of commands but more so that snapshots help mitigate command versioning changes.

Specifically because in my use case, my single aggregate lives permanently (multiple years).

Alan Hemmings
@goblinfactory

@albertyfwu the docs are copied from origodb, and snapshots are not yet implemented. The main reason being we tend to avoid them and just replay the entire journal at startup. A huge and complex snapshot can take just as long or longer to deserialize than simply rerunning the commands. @goblinfactory did some experiments using Azure table storage and saw 75K commands/second replay rate. I have systems with millions of commands that take ~30 minutes to load. Not saying snapshots aren't useful just not as necessary as some may think.

my test was on a virtual machine running on my laptop, with JSON serialisation and no attempt at optimising for speed, e.g. using Wire or protobuf. I'm on leave in 2 weeks and will redo those tests so that I can update the docs with some real numbers.

Alan Hemmings
@goblinfactory

I noticed in the docs that you mention snapshots, but I couldn't find any corresponding code actually supporting that. Is this a future feature? https://memstate.io/docs/core-1.0/storage/snapshots/

for my use case I will probably look at compressing my journal records into batches as a background or scheduled service, depending on the usage patterns, to reduce my storage costs with Azure. Currently my PoC is stupidly simple: I write a new journal entry to Azure table storage per command. However, I can "optimise" the storage after the fact, i.e. take every n records, convert them into a batch, and write the batches to blob storage; then when reading I take a pointer to the top of the stream of journals and batches.

Alan Hemmings
@goblinfactory
for small microservices, if the model is fairly static in size and lends itself to a quick snapshot, I'll add snapshots to my table storage provider. I can't foresee that being tricky; @rofr has already suggested I use an index that allows me to read the tables backwards instead of forwards, i.e. snapshot every X records, then you're guaranteed a fixed startup time of X small records and 1 snapshot regardless of journal length.
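
A sketch of the load path that scheme implies. The snapshot store and journal reader abstractions here are invented for illustration and are not an existing memstate API.

using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical storage abstractions, just for this sketch.
public sealed class Snapshot { public AccountModel Model; public long LastRecordNumber; }
public interface ISnapshotStore { Task<Snapshot> GetLatestAsync(); }
public interface IJournalReader { IAsyncEnumerable<Command<AccountModel>> ReadFromAsync(long fromRecordNumber); }

// Startup: load the latest snapshot, then replay only the records written after it.
// With "snapshot every X records" this replays at most X commands plus one snapshot load.
async Task<AccountModel> LoadAsync(ISnapshotStore snapshots, IJournalReader journal)
{
    var snapshot = await snapshots.GetLatestAsync();
    var model = snapshot?.Model ?? new AccountModel();
    var from = (snapshot?.LastRecordNumber ?? 0) + 1;

    await foreach (var command in journal.ReadFromAsync(from))
        command.Execute(model);

    return model;
}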
Robert Friberg
@rofr
The concern I have for my use case is not the execution of millions of commands but more so that snapshots help mitigate command versioning changes.
Yes, but then you have to deal with version changes on the model itself. For commands you have the easy option of creating new commands, for example with a V2 suffix.
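
For example, a versioned command might look like this (the command names and model members are illustrative only):

// Hypothetical versioning by adding a new command type rather than changing the old one.
// Old journal records still deserialize to Deposit; new writes use DepositV2.
public class Deposit : Command<AccountModel>
{
    public decimal Amount { get; set; }
    public override void Execute(AccountModel model) => model.Credit(Amount);
}

public class DepositV2 : Command<AccountModel>
{
    public decimal Amount { get; set; }
    public string Currency { get; set; } // new field introduced in V2
    public override void Execute(AccountModel model) => model.Credit(Amount, Currency);
}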
Albert
@albertyfwu

engine.Execute(Command) returns a Task which completes when the command has been applied to the in-memory state. If you have a single engine and do reads and writes (awaited) on a single thread then the system is 100% ACID.

It seems that this solution still has one potential edge case (client-side timeout):

Suppose we have a function like this:

async Task RunStuff(Command cmd)
{
    await _semaphoreSlim.WaitAsync();
    try
    {
        var state = await _engine.Execute(_query);

        // Do some validation of `cmd` against `state`.

        // It's valid, so we proceed.
        await _engine.Execute(cmd);
    }
    finally
    {
        _semaphoreSlim.Release();
    }
}
  1. Model is at state s0 (and so is DB).
  2. We call RunStuff(c1), which gets state s0, validates c1, and runs engine.Execute(c1).
  3. engine.Execute(c1) succeeds in persisting to DB, but we experience client-side timeout and think it failed.
  4. RunStuff(c1) returns and unlocks.
  5. Somebody else calls RunStuff(c2), which could still read state s0, as engine may not yet have synced with database to see new c1 written.
  6. RunStuff(c2) reads state s0, validates c2, and writes it to DB.
  7. Now, c1 and c2 are persisted in DB and the in-memory model will be synced soon thereafter.

The issue now is that c2 was validated assuming the previous state was s0 instead of s1 = c1(s0).

It seems to me the only ways to resolve this kind of issue (in the case of a single process, and not load-balanced) are:

  1. Have the engine itself write its expected journal record numbers to DB (so c2 using the same record number as c1 would get rejected by column uniqueness constraint in DB).
  2. Expose a public API on engine that allows waiting on the next DB sync job/thread to complete successfully. This way we can prevent the second RunStuff(c2) call from even being able to acquire the semaphore lock until we resolve that DB sync job.

The first proposal obviously doesn't work for load-balanced architectures, but the second proposal seems non-breaking. Do you see a better way? @rofr

Robert Friberg
@rofr

Suppose we have a function like this:

You are silently dropping an exception, don't do that.

Expose a public API on engine that allows waiting on the next DB sync job/thread to complete successfully

That is exactly what await engine.Execute(cmd); does in your code example. It won't time out, it will wait forever. But we haven't really explored these system failure scenarios, so you're raising some really good questions here. System tests for failure conditions would be really good to have and will certainly uncover some bugs. At the very least, the behavior needs to be well defined and documented.