These are chat archives for akkadotnet/akka.net

4th
Nov 2015
Zetanova
@Zetanova
Nov 04 2015 04:52 UTC
good morning
i tought about this PipeTo pattern thats creates basicly a free concurrent operation in the actor.
If it schould be supported, then there schould be some kind of operation manager in the ActorBase
So that the developer is forced to declare the execution mode.
One Modes would be the free asynchrone background thread, It schould be with cancelation support like in the DbSnapshot Actor with pendingOperations
Zetanova
@Zetanova
Nov 04 2015 04:58 UTC
and One Mode that the Task schould completed before the next message of the actor can be processed like the RunTask() in the ReciverActor behaivior
The first mode would be a background thread that can basicly be awaited or canceled on PreStop of the actor
the second mode is basicly a Task.WhenAll() then NextMessage
Zetanova
@Zetanova
Nov 04 2015 05:09 UTC
But currently this StartOperationAsync().PipeTo(Sender) is a fire and forget approach, very missleading and lose of control. If the Async Method is outside of the actor then the side effect of it will not have much impact. But if it is used with a AsyncMethod inside the actor then the side effect will have most propaly a big role.
Zetanova
@Zetanova
Nov 04 2015 05:25 UTC
If MessageA creates a background task with PipeTo()
then MessageA+PoisonPill should await this task
and MessageA+Kill should try to cancel it and then stop
Zetanova
@Zetanova
Nov 04 2015 05:32 UTC
Currently the system is only running because the message is reporcessed on Restart of a failed actor.
if MessageA creates this Background Thread with PipeTo() and a following arraivle of the messageA will generate a race-condition and exception (like Linklist.Remove())
Zetanova
@Zetanova
Nov 04 2015 05:37 UTC
the first messageA will be STILL processed on incanation-1 and the second message will then be parallel processed in incanation-2, thats a very bad behaivior.
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 07:38 UTC
@Zetanova I really like this kind of interest in the project. It's very motivating ;)
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 08:46 UTC
the major problem is not behavior of PipeTo itself, but rather this specific implementation. In general you should never mutate actor state inside called tasks - if you do, you basically violate one of the core principles of actor model (messages are processed synchronously). As long as this principle applies, actors are free of race conditions, and this is a behavior we want to remain. Async can be called sometimes - usually for interop or controlled cancelations - but in this case it should never touch mutable data to modify it.
Zetanova
@Zetanova
Nov 04 2015 09:17 UTC
@Horusiath I am very sure that i receive two times the persistent events on ReceiveRecover
but they are stored only once
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:19 UTC
could you manage to provide some snippet with this issue?
Zetanova
@Zetanova
Nov 04 2015 09:19 UTC
not realy
its near the same position / AR
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:20 UTC
logs?
Zetanova
@Zetanova
Nov 04 2015 09:20 UTC
logs are an other problem
i am pushing still the 200 commands
the AR are recovered
and near allways the same 1-3 AR's getting two times there events
no failure no exception (intelli trace)
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:22 UTC
Are you using sqlserver journal for that?
Zetanova
@Zetanova
Nov 04 2015 09:22 UTC
yes, i patched it with lock on the pendingOperations
maybe its the PipeTo + TellBack
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:23 UTC
which lib version do you use?
Zetanova
@Zetanova
Nov 04 2015 09:23 UTC
1.0.5.90
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:25 UTC
it will be hard to replay this one, also 1.0.5.90 is not up to date as our CI server have some issues with publishing packages lately
Zetanova
@Zetanova
Nov 04 2015 09:27 UTC
.pdb is not working
and if i am trying to add the packages with local nuget, they can be added but no reference appier
Zetanova
@Zetanova
Nov 04 2015 09:34 UTC
i copied it with the current dev build
LastSequenceNr in Eventsourced is strange to
the AR got 3 events in ReceiveRecover and the LastSequenceNr is still 0
?
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:36 UTC
the the moment it's set at the end of recovery cycle
Zetanova
@Zetanova
Nov 04 2015 09:36 UTC
    _sequenceNr    0    long
i will look it what is wrong and tell open a iisue
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 09:37 UTC
ok, thx
Zetanova
@Zetanova
Nov 04 2015 09:37 UTC
but still there should be some checks
if a gab in the replay of seqNr is discovered
at least a warning somehow
Thomas Lazar
@thomaslazar
Nov 04 2015 10:08 UTC
don't really know how to phrase it well... but is it possible that actors "wander" from one thread to another in between the processing of two messages?
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 10:50 UTC
@thomaslazar it is possible (but quite rare), unless you pin actor execution to specific thread
Thomas Lazar
@thomaslazar
Nov 04 2015 10:51 UTC
how would one do that?
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 10:51 UTC
maybe first question, what do you need it for? ;)
Thomas Lazar
@thomaslazar
Nov 04 2015 10:57 UTC
i'm writing a thing that loads up data from a database and puts it in a lucene index. problem is. it can be quite a lot of records and just loading all of it and have the actors chew through it is no option. tried it. makes the system go down to it's knees and uses up tons of ram because it has to shovel the data from the querying actor to the indexing one. so i need to use paging for that. query first 1k records, send them off to the indexing actor, and when he's done query the next 1k batch. thing is. if i don't use a snapshot transaction i can't guarantee that the data hasn't changed in between the first querying and the second. so i intend to make a query actor that opens up a database session with a snapshot transaction and keeps it open until he chewed through all the data. but i can't have the query actor wander off into another thread during processing.
the nhibernate session i need to use for this is not thread safe
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:02 UTC
it may not be thread safe, but still it looks like one session isn't used by more than one actor
Thomas Lazar
@thomaslazar
Nov 04 2015 11:02 UTC
it isn't
the principle should be sound. i just can't have the actor wander off to another thread between two processing messages
that's why i have to pin it basically to the thread that it was made from. or something like that
food now
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:08 UTC
you may want to use pinned dispatcher or fork-join dispatcher with one thread set, but to be honest I'm not sure if's this is a problem. I've seen some production code accessing NHibernate session from multiple threads, but as long as it's not accessed concurrently, it haven't cause any problems except NhProf warnings.
Thomas Lazar
@thomaslazar
Nov 04 2015 11:12 UTC
ok. well i will keep it in mind and just proceed without the thread pinning. and you said it's rare in any case
thanks
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:15 UTC
One of the core concepts is that actor is not processing message one by one and releasing the thread after processing every single message. In fact it processes a batch of messages - this value can be configured inside throughput HOCON config and AFAIK it's around 30 messages.
usually the higher throughput the more messages/sec. can be processed at cost of less responsiveness from other (esspecially less noisy) actors - but this is not a rule.
so if you have few dozens of messages in actor mailbox it's quite probable that they will be processes in serial before leaving the thread (also task dispatcher can reactivate actor in the same thread, it was working before)
Yin Zhang
@melcloud
Nov 04 2015 11:19 UTC
Morning / good evening guys, can I ask if it is ok to send message to other actors, particularly remote during akka persistence recovering? If not, how can we get notified if recover is finished so we can unstash messages?
Zetanova
@Zetanova
Nov 04 2015 11:21 UTC
protected override void OnReplaySuccess() {}
@melcloud
Yin Zhang
@melcloud
Nov 04 2015 11:23 UTC
@Zetanova I see. So normally we should stash on recovery and unstash them on replay success?
@Zetanova In other word, it is not a good idea to publish them outside current actor on recovery?
Zetanova
@Zetanova
Nov 04 2015 11:26 UTC
@melcloud yes not at all, the recovery should have no side effects beside the state. if u need to manipulate something outside then do it in OnReplaySuccess Incliding the Become()
@melcloud just htink about it, that the ReceiveRecovery() process can abort in the half of the way and restarted again. If it would have some side effects the second execution whould have a other outcome.
Zetanova
@Zetanova
Nov 04 2015 11:32 UTC
@Horusiath It was my hint. The Fournal is anwering the Eventsourced actor an old request. If the Eventsourced actor is traing to restart (on failure)
@Horusiath I added an autogenerated Guid StateId to the Eventsoruced State and a RequestId to the Request and Response Messages
@Horusiath The State Message Handler just test if the currentState.StateId is euqal to the RequestId and if not ignores the response message
Yin Zhang
@melcloud
Nov 04 2015 11:34 UTC
@Zetanova thinks that is really helpful
Zetanova
@Zetanova
Nov 04 2015 11:37 UTC
@Horusiath
protected Guid StateId { get { return _currentState.StateId; } }

ChangeState(ReplayStarted(recoveryBehavior));
Journal.Tell(new ReplayMessages(StateId, LastSequenceNr + 1L, res.ToSequenceNr, maxReplays, PersistenceId, Self));
Diego Frata
@diegofrata
Nov 04 2015 11:44 UTC
@melcloud @Zetanova Just jumping into this discussion, if my actor is going through recovery while other actors are sending messages, do I need to stash them? I had the impression recovery events are processed first and then the normal queue is processed.
Zetanova
@Zetanova
Nov 04 2015 11:45 UTC
@diegofrata the will be allready stashed an replayed after OnReplaySuccess()
Diego Frata
@diegofrata
Nov 04 2015 11:46 UTC
@Zetanova thanks!
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:47 UTC
@Zetanova I'm quite sure there is such identifier in actor already.
Zetanova
@Zetanova
Nov 04 2015 11:48 UTC
@Horusiath Maybe some Interface in the Akka.Core
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:49 UTC
it's ActorInstanceId property of the messages received from the journal
Zetanova
@Zetanova
Nov 04 2015 11:51 UTC
@Horusiath But basicly if Eventsourced Jis restarting then it can get from two Requests to the Journal Results. Because Journal is Piping them and all responses to the Eventsourced actor are then overlapped
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:52 UTC
but still, when persistent actor gets restarted it gets it's instance id incremented, so it's different from previous incarnation
Zetanova
@Zetanova
Nov 04 2015 11:54 UTC
I am getting Responses that dont correlate to the current stateId
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 11:56 UTC
ok, I'll set issue for this and check it out after work. My question is why does your actor restarts during recovery phase? I looks like a sign of other issue.
Zetanova
@Zetanova
Nov 04 2015 11:58 UTC
some external exception (AES passoword hashing or so) didnt looked it up now
John Nicholas
@MrTortoise
Nov 04 2015 12:34 UTC
hmm how do you do a recieve handler for a generic type message .;.. Eg GetSummary<TSomeType>
John Nicholas
@MrTortoise
Nov 04 2015 12:40 UTC
ahh use the type overload
In case anyone ever searches: Receive(typeof(GetSummary<>),msg => {});
MartinNiemandt
@MartinNiemandt
Nov 04 2015 13:41 UTC
Hi Guys sorry to bother anyone know of a project where the smallestmailbox routing strategy is used ?
Hussein Ait-Lahcen
@hussein-aitlahcen
Nov 04 2015 14:11 UTC
@Horusiath Hey, just wanted to tell you that it works perfectly for my 60 updates/sec
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 14:23 UTC
@hussein-aitlahcen nice to hear :)
Hussein Ait-Lahcen
@hussein-aitlahcen
Nov 04 2015 15:31 UTC
@Horusiath Is it normal that my server use 70% cpu when an actor update himself withouth delay ? You said that the scheduler had like 10ms between ticks ?
Hussein Ait-Lahcen
@hussein-aitlahcen
Nov 04 2015 15:36 UTC
And if i schedule a tellOnce with a timespan of 0 it goes back to 5% cpu
Chris G. Stevens
@cgstevens
Nov 04 2015 16:02 UTC
I'm currently trying to track down why when my worker tell my tasker that it is complete the tell doesn't make it.... Not sure what happens yet.
I know there is no guarantee of delivery for general "at-most-once". Are there any example of best practices to achieve an "exactly-once" delivery.
Some cases I do care about performance but like in the case I care more about getting the message than performance.
Jared Lobberecht
@Jared314
Nov 04 2015 16:06 UTC
Is there a version of the EventBus that, on event publish, preserves the sender?
I don't see one
and the LoggingBus just appears to .Tell without specifiying the sender
Chris G. Stevens
@cgstevens
Nov 04 2015 16:07 UTC
While asking a couple of QQ, I have been noticing... another thing I haven't had a chance to dig into... Is all of my members start reporting Akka.Remote.Transport.InvalidAssociationException: Association failure ---> System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
John Nicholas
@MrTortoise
Nov 04 2015 16:11 UTC
@cgstevens Sounds like you want to get dottrace/memory profileronto the job or soemthing similar. You must be doing something that is leaking memory. I havent had that prpoblem myself and have had some fairly long running systems.
@cgstevens locally messages rarley get dropped, if you want at most once then you need to look at something like how tcp/ip works . Exactly once is almost impossible due to race conditions - well its at least once on steroids. At least once implies some form of persistence and also soem kind of query and response to check if message was recieved. http://doc.akka.io/docs/akka/2.0.1/general/message-send-semantics.html
John Nicholas
@MrTortoise
Nov 04 2015 16:21 UTC
I still havent figured out my generic message problem ... I am expecting a message of type SomeTing<TInsertTypeHere> and am using the actor to pass that off to something that takes a parameter of type SomeTing<InsertTypeHere> so none of the overloads on recieve work in a compile time friendly way (I am using the overload protected void Receive(Type messageType, Action<object> handler, Predicate<object> shouldHandle = null);) passing in SomtTing<> as the type parameter ... i am using reflection to get the method and fire the message parameter at it via invoke ... feels really dirty. I really don't want to do the handler on a case by case basis for each potential generic type. Looking at the source of akka there isnt anything that seems to handle this scenario. If I added a handler is this something that i could create an issue and get a pull request for - i assume there is a load of existing spec to run against any implementation?
Chris G. Stevens
@cgstevens
Nov 04 2015 16:24 UTC
@MrTortoise Thanks for reply! I'm obviously doing things wrong and have a few bugs still... Getting a little frustrated but I am seeing the light at the end of the tunnel. The past 2 months I have learned so much. I will look into the memory leak(s). I have added a "Status Check" which if the status has changed to "completed" then it knows that it never got that message. So I am able to recover from that currently but it didn't feel clean...
John Nicholas
@MrTortoise
Nov 04 2015 16:26 UTC
@cgstevens have you considered using become? If you get a message you get become some other handler so thne if you get it again you can respond appropiatley. Then your sender may just retry the message - like a heart beat or something - until it gets a response from the new behaviour. Upon completion you can become waiting or whatever to start over
Chris G. Stevens
@cgstevens
Nov 04 2015 16:36 UTC
@MrTortoise Currently when my worker completes it sends the message to the tasker (remote) that it is complete and becomes Ready instead of Working. The issue is that the tasker never got the message that that worker is finished with the work so the tasker doesn't know to start the work again.
Aaron Stannard
@Aaronontheweb
Nov 04 2015 16:47 UTC
@thomaslazar just so I understand correctly, NHibernate has some thread-affinity requirements>
?
Roger Johansson
@rogeralsing
Nov 04 2015 18:40 UTC
@thomaslazar that the NH session isnt threadsafe doesnt mean it cant be used by different threads, just not by multiple threads at the same time, (unless they use any thread statics)
Zetanova
@Zetanova
Nov 04 2015 19:10 UTC
@Horusiath i am getting WriteMessagesSuccessful in the ReplayStarted State ... twice
and no, i dont Persist any events in the recovering state
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 19:22 UTC
@Zetanova take a look at #1403 and associated akka issue. Basically this is a bug, but according to typesafe guys, exceptions that occurred during recovery are treated as critical failures, and after bugfix actor should be arbitrary stopped in that case. If actor must be started anyway, this may be covered by applying BackoffSupervisor pattern over it (it's already ported in akka.net).
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 19:34 UTC
@cgstevens basically in local actor system each message is exactly once delivery as long as receipient exists. In distributed environment there is no such thing as exactly once delivery ... but we can provide at-least-once delivery with dedupe mechanism (to handle the duplicates). AtLeastOnceDelivery actors are part of akka.persistence, but depending on your needs you can achieve your own version without big effort.
let me know if you want some help with that mechanism
Zetanova
@Zetanova
Nov 04 2015 20:06 UTC
finally context.stop(self) :)
i even implemented a backlog, required one, even still with the requestId filter
Zetanova
@Zetanova
Nov 04 2015 20:45 UTC
@Horusiath Good news, with OnReplayFailure+Finaly Stop its working.
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 20:52 UTC
I'm going to add a PR which does this by default. But in general persistence plugin should be updated with akka-persistence 2.4. There are a lot of changes in the underlying code right now.
Zetanova
@Zetanova
Nov 04 2015 20:52 UTC
@Horusiath thx its fixes everything.
catch (Exception exc)
                    {
                        //var currentMessage = Context.AsInstanceOf<ActorCell>().CurrentMessage;
                        //ChangeState(ReplayFailed(exc, currentMessage));
                        try
                        {
                            OnReplayFailure(exc);
                        } 
                        finally
                        {
                            Context.Stop(Self);
                        }
                    }
and in the ReplayStarted() long seqNr = LastSequenceNr+1L;
if(persistent.SequenceNr != seqNr)
throw new InvalidOperationException(String.Format("Invalid SequenceNr {1} for persistenceId '{0}' expected {2}.", PersistenceId, m.Persistent.SequenceNr, seqNr));
seqNr++
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 20:55 UTC
not so easy, our recovery state machine differs from theirs at the present moment. We need to take that into account ;)
Zetanova
@Zetanova
Nov 04 2015 20:57 UTC
how is the behaivior if something is flaging some Events as deleted? They will be ignored on ReplayedMessage Request
but how should be the state of the AR? Success or Failure if some events are missing on recovery ?
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 20:59 UTC
events are never deleted by akka itself. If you had deleted events, it means you mean them to be deleted
Zetanova
@Zetanova
Nov 04 2015 21:00 UTC
yes sure, never sow something like it
Purge yes, but delete them in the middle, not realy
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:01 UTC
you can't delete them from the middle using available API
when you're deleting events, you're deleting them up to some sequence number
Zetanova
@Zetanova
Nov 04 2015 21:23 UTC
this SeqNr check and some kind of operation-manager in the ActorContext would be nice. WIll help a lot in the future bugs
Actor.Stop => "Warning some Task with Name 'task1' not completed."
the PipeTo method can register the task in the Context and the Context just hooks it up with t.ContinueWith(n =>...,
TaskContinuationOptions.AttachedToParent | TaskContinuationOptions.ExecuteSynchronously);
Zetanova
@Zetanova
Nov 04 2015 21:30 UTC
the Context can act as a Cancelation Source that an actor-method can pass the cancelationToken to a external Async-Method
on Actor-Stop if will be canceled
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:42 UTC
@Zetanova concerning LastSequenceNr - it's get updated every time next event is replayed during recovery.
with sequenceNr associated with that event
Zetanova
@Zetanova
Nov 04 2015 21:44 UTC
@Horusiath if the state change to ReplayStarted, the first seqNr can be set in the scope and it can be checked and incremented on every ReplayedMessage. It should never fail
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:47 UTC
yeah, I'm trying to think about situations when it actually could, but except long overflow it's hard to think about any
(which is error by itself)
Zetanova
@Zetanova
Nov 04 2015 21:47 UTC
private EventsourcedState ReplayStarted(Receive recoveryBehavior)
        {
            long seqNr = LastSequenceNr+1L;


            return new EventsourcedState("replay started", true, (receive, message) => { });
}
yes, its an integrity check

try
{
if (persistent.SequenceNr != seqNr)
   throw new InvalidOperationException(String.Format("Invalid SequenceNr {1} for persistenceId '{0}' expected {2}.", PersistenceId, m.Persistent.SequenceNr, seqNr));

UpdateLastSequenceNr(persistent);
base.AroundReceive(recoveryBehavior, persistent);
seqNr++;
} catch(...)
{
...
}
currently there is no check in place
and it should never be possible to miss one seqNr
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:51 UTC
this will acutally broke the end part of recovery, as after last event LastSequenceNr will be higher than expected.
Zetanova
@Zetanova
Nov 04 2015 21:51 UTC
no
it's not setting the LastSequenceNr or checking it
it should be the same value as
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:53 UTC
ok, I see
Zetanova
@Zetanova
Nov 04 2015 21:53 UTC
ChangeState(ReplayStarted(recoveryBehavior));
                    Journal.Tell(new ReplayMessages(StateId, LastSequenceNr + 1L, res.ToSequenceNr, maxReplays, PersistenceId, Self));
Bartosz Sypytkowski
@Horusiath
Nov 04 2015 21:54 UTC
you could actually propose this to the typesafe guys on akka/dev
I've already asked them about few things you've said
Zetanova
@Zetanova
Nov 04 2015 21:55 UTC
thx
Zetanova
@Zetanova
Nov 04 2015 22:01 UTC
i have great plans for an extension of Akka.Persistence. will try to implement it Decmber-Januar
to versioning
Zetanova
@Zetanova
Nov 04 2015 22:20 UTC
@Horusiath its running i think, but now getting still on some AR's Invalid SequenceNr 1 for persistenceId '{0}' expected 3.
no other exceptions
EventStore has two events, no other errors
Zetanova
@Zetanova
Nov 04 2015 22:31 UTC
LOL
extend it to
if (persistent.PersistenceId != PersistenceId)
throw new InvalidOperationException(String.Format("Invalid Persistent Message for persistenceId '{1}' expected '{0}'.", PersistenceId, persistent.PersistenceId));
WTF
found the bug
its the CS_PID in the SQl.common / SQL server extension
John Nicholas
@MrTortoise
Nov 04 2015 23:21 UTC
so no takers for my generics woes ... are any of you f# people? Have been working through some scala akka stuff and converting it to c# ... got me wondering if f# is a nice fit for this stuff? I've put off learning it - maybe now is the tiem to dabble?
Zetanova
@Zetanova
Nov 04 2015 23:26 UTC
Not so important any more @MrTortoise
what u do