These are chat archives for akkadotnet/akka.net

12th Sep 2017
Martin Cavanagh
@mcavanagh
Sep 12 2017 07:55
hi. I'm using CoordinatedShutdown.Get(actorSystem).Run().Wait(); when gracefully shutting down a node in my cluster. I would expect this to make the node go through leaving/exiting/removed, and any shard coordinator actors to then be moved to another node in the cluster. However, the other nodes never take over the coordinator, and from the cluster state events I'm seeing, the leaving node seems to stay at 'leaving' (which would explain why the other nodes don't think they should take over yet). The cluster leader does transfer to one of the other nodes, but I'm not sure how significant this is.
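The shutdown call being described can be sketched as follows (a minimal example assuming an existing `actorSystem`; not Martin's actual code):

```csharp
using Akka.Actor;

// Minimal sketch: run all CoordinatedShutdown phases (which include the
// cluster leaving/exiting steps) and block until they complete, as in the
// quoted call.
var shutdown = CoordinatedShutdown.Get(actorSystem);
shutdown.Run().Wait();

// Where an async context is available, awaiting the returned Task is
// preferable to blocking with Wait():
// await shutdown.Run();
```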
Sean Templeton
@seantempleton
Sep 12 2017 14:50
How do you handle converting procedural code to actors?
For example, in procedural code I have an if statement to call a function that inserts a row into a table. After that if statement is another function call that updates a column in that table. I want the insert to always happen first, and then the update.
Do I create one actor that does both actions? Or do I create actor X that inserts the data and it creates and owns Y that it Tells to do the updates? Or do I create parent B that creates X and Y, tells X, X replies, and then B Tells Y?
The second seems to break the fact that database calls should be in the leaf nodes. The third seems awfully chatty for what is two function calls in procedural code.
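The first option can be sketched like this (hypothetical message and method names; the ordering guarantee comes from an actor processing its mailbox one message at a time):

```csharp
using Akka.Actor;

// Sketch of option one: a single actor owns both database calls, so for any
// given message the insert always completes before the update.
public class TableWriterActor : ReceiveActor
{
    public TableWriterActor()
    {
        Receive<WriteRequest>(msg =>
        {
            InsertRow(msg);    // first: insert the row
            UpdateColumn(msg); // then: update the column in that row
        });
    }

    private void InsertRow(WriteRequest msg) { /* INSERT ... */ }
    private void UpdateColumn(WriteRequest msg) { /* UPDATE ... */ }
}

// Hypothetical message type carrying the data for both calls.
public class WriteRequest { }
```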
Boban
@bobanco
Sep 12 2017 18:36
@Horusiath are you around?
Boban
@bobanco
Sep 12 2017 19:51
I have updated the PostgreSQL plugin with a batched journal implementation, please take a look: BatchedPostgreSqlJournal
Youenn Bouglouan
@Youenn-Bouglouan
Sep 12 2017 20:07
@Aaronontheweb Props is available and I could use it as in F#, however I was wondering if there was a more idiomatic way to do this. So far I haven't used Props at all.
Jimmy Hannon
@JimmyHannon
Sep 12 2017 21:29
Hi all, I am playing around with Cluster and Cluster sharding. When I have a cluster running with multiple (non-seed) nodes and I kill one of these nodes, I get exceptions in the seed node and worker nodes. From then on the cluster seems broken. The sharding, for example, is not rebalancing the shards. The logs on the seed node:
logs on the seed node: [WARNING][12/09/2017 21:26:46][Thread 0011][[akka://ActorSystem/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FActorSystem%400.0.0.0%3A63855-52/endpointWriter#394224397]] AssociationError [akka.tcp://ActorSystem@127.0.0.1:4053] -> akka.tcp://ActorSystem@0.0.0.0:63855: Error [Association failed with akka.tcp://ActorSystem@0.0.0.0:63855] []
[WARNING][12/09/2017 21:26:46][Thread 0011][remoting] Tried to associate with unreachable remote address [akka.tcp://ActorSystem@0.0.0.0:63855]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: [Association failed with akka.tcp://ActorSystem@0.0.0.0:63855] Caused by: [System.AggregateException: One or more errors occurred. (No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855) ---> Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()
--- End of inner exception stack trace ---
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at Akka.Remote.Transport.ProtocolStateActor.<>c.<InitializeFSM>b__11_54(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
---> (Inner Exception #0) Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()<---
]
Aaron Stannard
@Aaronontheweb
Sep 12 2017 21:30
@JimmyHannon might be #3094
Jimmy Hannon
@JimmyHannon
Sep 12 2017 21:30
logs on the worker node: [WARNING][12/09/2017 21:24:50][Thread 0011][[akka://ActorSystem/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FActorSystem%400.0.0.0%3A63855-15/endpointWriter#1948372077]] AssociationError [akka.tcp://ActorSystem@0.0.0.0:63844] -> akka.tcp://ActorSystem@0.0.0.0:63855: Error [Association failed with akka.tcp://ActorSystem@0.0.0.0:63855] []
[WARNING][12/09/2017 21:24:50][Thread 0008][remoting] Tried to associate with unreachable remote address [akka.tcp://ActorSystem@0.0.0.0:63855]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: [Association failed with akka.tcp://ActorSystem@0.0.0.0:63855] Caused by: [System.AggregateException: One or more errors occurred. (No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855) ---> Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()
--- End of inner exception stack trace ---
   at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
   at Akka.Remote.Transport.ProtocolStateActor.<>c.<InitializeFSM>b__11_54(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot)
---> (Inner Exception #0) Akka.Remote.Transport.InvalidAssociationException: No connection could be made because the target machine actively refused it tcp://ActorSystem@0.0.0.0:63855
   at Akka.Remote.Transport.DotNetty.TcpTransport.<AssociateInternal>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.ConfiguredTaskAwaitable`1.ConfiguredTaskAwaiter.GetResult()
   at Akka.Remote.Transport.DotNetty.DotNettyTransport.<Associate>d__22.MoveNext()<---
]
Sorry for the split messages.
Anyone any idea?
@Aaronontheweb ok thanks. I suppose this fix will be available as a nightly build soon?
Martin Cavanagh
@mcavanagh
Sep 12 2017 21:38
hi. I'm using CoordinatedShutdown.Get(actorSystem).Run().Wait(); when gracefully shutting down a node in my cluster. I would expect this to make the node go through leaving/exiting/removed, and any shard coordinator actors to then be moved to another node in the cluster. However, the other nodes never take over the coordinator, and from the cluster state events I'm seeing, the leaving node seems to stay at 'leaving' (which would explain why the other nodes don't think they should take over yet). The cluster leader does transfer to one of the other nodes, but I'm not sure how significant this is.
Aaron Stannard
@Aaronontheweb
Sep 12 2017 21:38
@JimmyHannon I just took a look at the failing specs
and yeah, looks like the stuff that was failing was unrelated to the PR
if @alexvaluyskiy is on board I'll go ahead and merge it; should be available in tomorrow's nightly
@mcavanagh I think there might be a bug with CoordinatedShutdown at the moment when it's executed that way
noticed that behavior with one of my own apps but haven't filed a bug yet
however, as for the cluster leaving part not completing - not sure about that
if you can provide some logs on a GH issue that'd be helpful
Martin Cavanagh
@mcavanagh
Sep 12 2017 21:40
@Aaronontheweb I only recently discovered CoordinatedShutdown. I was previously registering OnMemberRemoved and doing Cluster.Leave and saw the same thing
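The earlier pattern Martin mentions looks roughly like this (a sketch, not his actual code):

```csharp
using Akka.Cluster;

// Sketch: leave the cluster gracefully, and terminate the ActorSystem
// once this node has reached the Removed state.
var cluster = Cluster.Get(actorSystem);
cluster.RegisterOnMemberRemoved(() => actorSystem.Terminate());
cluster.Leave(cluster.SelfAddress);
```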
but yeah I'll put together some logs
if I spin up a barebones cluster it works fine, so it's clearly some combination of other things we've layered on top in our actual product
Aaron Stannard
@Aaronontheweb
Sep 12 2017 21:41
hmmm
Martin Cavanagh
@mcavanagh
Sep 12 2017 21:41
the combination of persistence and sharding/singletons seems tricky to get right when upgrading nodes
Jimmy Hannon
@JimmyHannon
Sep 12 2017 21:42
@Aaronontheweb I will try with tomorrow's nightly. If I still see the same issue I'll create an issue on GitHub.
Martin Cavanagh
@mcavanagh
Sep 12 2017 21:42
sometimes, even if I manually go to the other nodes in the cluster and tell them the previous shard coordinator is down (using the petabridge.cmd tool), they acknowledge that in their view of the cluster state, but still don't bring up a shard coordinator singleton of their own
and sometimes, even though they see the correct remaining nodes, there's lots of chatter in the logs about trying to reach the old coordinator and failing
even if i delete knowledge of that seed node from their configs and restart
so the only place they could be getting that from is the persistence that the old coordinator made before leaving
I'll try and put together some useful sequences from our logs into a GH issue
zbynek001
@zbynek001
Sep 12 2017 22:16
@JimmyHannon when you kill the node, do you also down it in cluster?
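The manual downing step zbynek001 is asking about can be sketched as follows (the address is illustrative, taken from the logs above):

```csharp
using Akka.Actor;
using Akka.Cluster;

// Sketch: explicitly mark a killed node as Down so the rest of the
// cluster stops waiting for it and can rebalance shards.
var cluster = Cluster.Get(actorSystem);
cluster.Down(Address.Parse("akka.tcp://ActorSystem@0.0.0.0:63855"));
```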