These are chat archives for akkadotnet/akka.net

4th
Jan 2019
Matt Lethargic
@matt-lethargic
Jan 04 16:44
Hi, I'm having trouble getting the webcrawler working locally, is this a good place to ask for help?
Onur Gumus
@OnurGumus
Jan 04 16:48
When you call Cluster.Leave for a node, it becomes "Unreachable" for a while. Why is that so? Is there no way to gracefully shut it down?
Onur Gumus
@OnurGumus
Jan 04 17:27
Meh, it is impossible to do handover with singleton without losing messages. It's useless.
Aaron Stannard
@Aaronontheweb
Jan 04 17:29
@OnurGumus well yeah, when you change the topology of the network what do you expect?
that everything is going to happen instantly, everywhere, at the same time?
if you assume that distributed software is going to work like a single atomic SQL Server transaction, you should find a different industry to work in
moving singletons around in the cluster should be a relatively rare event
either design your system to tolerate a brief period of inconsistency when it does move
or write the actors who communicate with it to use at-least-once message delivery protocols, or maybe something stronger than that
> When you call Cluster.Leave for a node, it becomes "Unreachable" for a while. Why is that so? Is there no way to gracefully shut it down?
it's possible for this to occur under normal circumstances when a node leaves
but it'd be very brief
and shouldn't happen 100% of the time
if it's stuck in an unreachable state for a longer period of time, take a look at how that node is shutting itself down when it receives the Leave signal
assuming that you're waiting for the ActorSystem to terminate before the process exits, CoordinatedShutdown should do all of the hard work
Aaron Stannard
@Aaronontheweb
Jan 04 17:34
@matt-lethargic I wrote that sample - what issues are you having with it?
Onur Gumus
@OnurGumus
Jan 04 17:36
@Aaronontheweb I disagree. Why can't the actor consume and finish its own messages while signaling for handover, so that once it has terminated the proxy moves the traffic to the other one?
I can do this manually myself, it's not even hard.
I mean, I would understand if one node went down ungracefully: message loss can't be prevented then unless you use at-least-once delivery
however, no node is down. I just want the following to happen:
1) The active singleton signals the proxy, so that it buffers future messages until the terminated message comes.
2) The active singleton finishes its own buffer.
3) The active singleton signals for termination.
4) The proxy sends the buffered messages and any new messages to the new singleton.
Following the above protocol, a perfectly graceful handover will happen without any loss of messages.
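The four steps above can be sketched in plain C# without any Akka.NET types. This is only an illustration of the buffering idea, not how ClusterSingletonProxy is implemented; `Singleton`, `HandoverProxy`, and `HandoverDemo` are hypothetical names, and the in-memory queues stand in for actor mailboxes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a singleton actor: it only records what it processed.
public sealed class Singleton
{
    private readonly Queue<string> _mailbox = new Queue<string>();
    public List<string> Processed { get; } = new List<string>();

    public void Enqueue(string message) => _mailbox.Enqueue(message);

    // Step 2: finish everything that is already in the mailbox.
    public void Drain()
    {
        while (_mailbox.Count > 0)
            Processed.Add(_mailbox.Dequeue());
    }
}

// Hypothetical proxy: forwards directly, or buffers while a handover is in progress.
public sealed class HandoverProxy
{
    private readonly Queue<string> _buffer = new Queue<string>();
    private Singleton _target;
    private bool _handingOver;

    public HandoverProxy(Singleton initial) => _target = initial;

    public void Send(string message)
    {
        if (_handingOver) _buffer.Enqueue(message);
        else _target.Enqueue(message);
    }

    // Step 1: the active singleton asks the proxy to start buffering.
    public void BeginHandover() => _handingOver = true;

    // Steps 3 and 4: the old singleton has terminated, so flush the buffer
    // to the new one (in arrival order) and resume direct forwarding.
    public void CompleteHandover(Singleton next)
    {
        _target = next;
        _handingOver = false;
        while (_buffer.Count > 0)
            _target.Enqueue(_buffer.Dequeue());
    }
}

public static class HandoverDemo
{
    public static (List<string> Old, List<string> New) Run()
    {
        var oldSingleton = new Singleton();
        var newSingleton = new Singleton();
        var proxy = new HandoverProxy(oldSingleton);

        proxy.Send("m1");
        proxy.Send("m2");
        proxy.BeginHandover();                 // step 1
        proxy.Send("m3");                      // buffered by the proxy
        oldSingleton.Drain();                  // step 2
        proxy.CompleteHandover(newSingleton);  // steps 3 and 4
        proxy.Send("m4");                      // goes straight to the new singleton
        newSingleton.Drain();
        return (oldSingleton.Processed, newSingleton.Processed);
    }

    public static void Main()
    {
        var (oldSeen, newSeen) = Run();
        Console.WriteLine("old: " + string.Join(",", oldSeen));
        Console.WriteLine("new: " + string.Join(",", newSeen));
    }
}
```

In this sketch every message is processed exactly once and in order, but only because nothing fails mid-handover; a crash of either side would still need at-least-once delivery on top.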
Onur Gumus
@OnurGumus
Jan 04 17:42
@Aaronontheweb my intention is a blue-green deployment of a service with 100% uptime. If you have any better suggestion, I'm all ears.
I even tried CoordinatedShutdown.Get(system1).Run(CoordinatedShutdown.ClrExitReason.Instance).Wait();
worse, I have observed two singleton actors processing messages at the same time for a while if the first one's buffer is not empty.
Onur Gumus
@OnurGumus
Jan 04 17:47
actually maybe coordinated shutdown doesn't wait long enough.
hmm
Onur Gumus
@OnurGumus
Jan 04 18:01
yeah, it is a timeout issue: if I increase the timeout of the cluster-exiting phase, then it works fine. No message is lost.
Onur Gumus
@OnurGumus
Jan 04 18:09
yeah it even works for Cluster.Leave now
it was just a matter of setting timeout.
perfect, now I can do my blue-green deployment.
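For reference, a minimal sketch of what raising that timeout could look like in HOCON. The `cluster-exiting` phase is part of Akka.NET's coordinated shutdown configuration; the 30s value is an assumption chosen for illustration and should be sized to how long the singleton needs to drain its mailbox.

```hocon
akka.coordinated-shutdown.phases.cluster-exiting {
  # Illustrative value only: allow more time for the singleton handover
  # to finish before coordinated shutdown moves on to the next phase.
  timeout = 30s
}
```

Only `timeout` is overridden here; the rest of the phase definition still comes from the reference configuration.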
Aaron Stannard
@Aaronontheweb
Jan 04 18:44
:+1:
Onur Gumus
@OnurGumus
Jan 04 18:48
@Aaronontheweb one question: is a role name mandatory for singletons? Because if I don't specify WithRole (I am using a pure code approach), my singleton stays uninitialized.
But when I use WithRole everything works fine.
Aaron Stannard
@Aaronontheweb
Jan 04 22:14
@OnurGumus hmm
no - shouldn't be
one possibility though - if the code needed for running the singleton isn't available on all role types
or if the initialization code isn't available on all of them
then it's possible that the "oldest" node in the cluster, which is what the singleton gravitates towards
might not be a node capable of hosting it
could be Lighthouse or something
therefore the singleton would never start
because the node the others have designated as the "correct" host per their algorithm has no idea that it's supposed to start it
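One way to rule that scenario out is to pin the singleton to a role that only capable nodes carry. A hedged HOCON sketch of the configuration counterpart to WithRole follows; `worker` is a hypothetical role name, and it assumes the singleton settings are created from the ActorSystem's config rather than built entirely by hand.

```hocon
# On every node that is able to host the singleton
# ("worker" is a hypothetical role name; Lighthouse nodes would not carry it):
akka.cluster.roles = ["worker"]

# Restrict the singleton manager to nodes with that role, so the
# "oldest" node is chosen only among nodes that can actually start it:
akka.cluster.singleton.role = "worker"

# Point the proxy at the same role so it looks for the singleton there:
akka.cluster.singleton-proxy.role = "worker"
```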