These are chat archives for akkadotnet/

Aug 2018
Aug 28 2018 15:04

Hi All,

I am getting a persistence failure when replaying events: recovery timed out because it did not receive an event within 30 seconds. I suspect this is because we have a lot of events. Extending the timeout period fixes the problem, but that implies we could eventually reach a point where we cannot recover at all. Is there a fix for this, or is it just a case of introducing snapshots? I don't really like the idea of never being able to recover from scratch if necessary.

Bartosz Sypytkowski
Aug 28 2018 17:35
@mauyl this really depends on the capabilities of the underlying storage you're using. There's also the akka.persistence.max-concurrent-recoveries setting, which controls how many persistent actors can be in recovery at the same time (it's 50 by default). It works a bit like a semaphore.
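For reference, a minimal HOCON sketch of the two knobs discussed so far: the concurrent-recovery limit and the 30-second replay timeout. The values are illustrative assumptions, not recommendations, and a specific journal plugin may override the fallback section with its own settings.

```hocon
akka.persistence {
  # Upper bound on persistent actors recovering at the same time (default: 50).
  # 20 is an illustrative value; a lower limit reduces pressure on the journal.
  max-concurrent-recoveries = 20

  journal-plugin-fallback {
    # How long a recovering actor waits for the next replayed event
    # before recovery fails (default: 30s). 60s is an illustrative value.
    recovery-event-timeout = 60s
  }
}
```

Lowering the limit trades slower overall startup for fewer simultaneous replays hitting the store, whereas raising the timeout only postpones the failure if the journal itself is overloaded.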
Vagif Abilov
Aug 28 2018 20:06
@mauyl we had exactly the same problem and at first blamed our SQL Server. But tracing the database calls showed that db performance wasn't the problem, only the number of simultaneously recovering actors. @Aaronontheweb mentioned here that he has also seen this problem with non-SQL databases. We are revising this part to reduce the concurrency of persistent actor recovery.
Peter Shrosbree
Aug 28 2018 21:11
I have been having a look at the dockerized version of Lighthouse. I see two problems.
The first is that it has only a Linux container version. It would be good to have a Windows one as well, preferably the .NET Core version.
The second is that the IP address set up for the seed node is the IP address of the container, not the host, so it is only visible from the host, right? Shouldn't the container ports be mapped to host ports, and the IP address of the seed node be the IP address of the host?
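A rough sketch of the arrangement the question describes, assuming Akka.NET's dot-netty transport: bind inside the container on all interfaces, but advertise the host's address to other nodes. The IP address below is a hypothetical placeholder, and the port is assumed to be 4053 and published from the container to the same port on the host.

```hocon
akka.remote.dot-netty.tcp {
  # Address the socket binds to inside the container.
  hostname = "0.0.0.0"

  # Address advertised to other cluster nodes.
  # 203.0.113.10 is a placeholder for the Docker host's IP.
  public-hostname = "203.0.113.10"

  # Assumed port; it must also be mapped container-to-host when the container starts.
  port = 4053
}
```

With this shape, other nodes would list the host's IP and port as the seed node address rather than the container's internal IP.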