These are chat archives for akkadotnet/akka.net

22nd
Apr 2018
Hyungho Ko
@hhko
Apr 22 2018 14:49
My service logs warnings such as:
[ WARN][13634][ 37]
[akka://Server/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClient%400.0.0.0%3A9172-5181/endpointWriter#1389624287]
AssociationError
[akka.tcp://Server@203.239.173.208:8081] -> akka.tcp://Client@0.0.0.0:9172:
Error [Association failed with akka.tcp://Client@0.0.0.0:9172] []
[ WARN][13634][ 37]
remoting Tried to associate with unreachable remote address [akka.tcp://Client@0.0.0.0:9172].
Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters.
Reason: [Association failed with akka.tcp://Client@0.0.0.0:9172]
Caused by: [System.AggregateException: One or more errors occurred.
---> Akka.Remote.Transport.InvalidAssociationException: Connection refused tcp://Client@0.0.0.0:9172
at Akka.Remote.Transport.DotNetty.TcpTransport+<AssociateInternal>d__1.MoveNext () [0x00208] in <aa58c9e3a0cf46328a4d91ccb4838078>:0
Hyungho Ko
@hhko
Apr 22 2018 15:05
How do I send a disconnection message gracefully to the remote system?
Joshua Garnett
@joshgarnett
Apr 22 2018 15:36
Good morning. I’ve managed to get my cluster into a non-recoverable state. I noticed I had two single-node processes running for one environment. After stopping both processes and bringing up only one, I am stuck in a recovery loop.
2018-04-22 15:28:45.148 ERROR PersistentShardCoordinator Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [40650] for persistenceId [/system/sharding/placementCoordinator/singleton/coordinator]
System.ArgumentException: Region [akka://AkkaCluster/system/sharding/p#673731278] not registered
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)

2018-04-22 15:28:45.438 ERROR PersistentShardCoordinator Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [50633] for persistenceId [/system/sharding/outboxCoordinator/singleton/coordinator]
System.ArgumentException: Shard 78 is already allocated
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)

2018-04-22 15:29:07.714 ERROR PersistentShardCoordinator Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [11774] for persistenceId [/system/sharding/playerCoordinator/singleton/coordinator]
System.ArgumentException: Region [akka://AkkaCluster/system/sharding/p#1560991559] not registered
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)

2018-04-22 15:29:10.192 ERROR PersistentShardCoordinator Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [3087] for persistenceId [/system/sharding/worldCoordinator/singleton/coordinator]
System.ArgumentException: Region [akka://AkkaCluster/system/sharding/w#43619595] not registered
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)

2018-04-22 15:29:15.210 ERROR PersistentShardCoordinator Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [11109] for persistenceId [/system/sharding/externalCoordinator/singleton/coordinator]
System.ArgumentException: Region [akka://AkkaCluster/system/sharding/e#1963167556] not registered
Parameter name: e
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.<Recovering>b__0(Receive receive, Object message)
Joshua Garnett
@joshgarnett
Apr 22 2018 16:49
Created a ticket for the issue akkadotnet/akka.net#3414
Bartosz Sypytkowski
@Horusiath
Apr 22 2018 16:57
@joshgarnett is it possible that you've run into split brain on your persistent shard coordinator?
Joshua Garnett
@joshgarnett
Apr 22 2018 16:57
That’s effectively what happened. It wasn’t so much a split brain, though, as me starting two independent clusters that both pointed at the same data set.
The question is: why did it fail to come back up once I fixed that?
I.e., the data shouldn’t be corrupted in that case.
Bartosz Sypytkowski
@Horusiath
Apr 22 2018 17:13
If you had two clusters sharing the same journal for sharding, I guess the coordinators in both clusters may have tried to write to the same journal at the same time.
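To make that scenario concrete, here is a toy Python model of event replay. It is only a sketch and not Akka.NET's implementation: the `CoordinatorState` class, the `replay` helper, and the `RegionRegistered` event label are hypothetical, and only the `ShardHomeAllocated` name and the two error messages come from the logs above. It shows how two event streams that are each valid on their own can fail to replay once they end up in one journal.

```python
class CoordinatorState:
    """Toy stand-in for a coordinator's recovered state (hypothetical)."""

    def __init__(self):
        self.regions = set()   # regions seen so far during replay
        self.shards = {}       # shard id -> region that owns it

    def updated(self, event):
        kind = event[0]
        if kind == "RegionRegistered":        # hypothetical event label
            self.regions.add(event[1])
        elif kind == "ShardHomeAllocated":
            _, shard, region = event
            if region not in self.regions:
                raise ValueError(f"Region [{region}] not registered")
            if shard in self.shards:
                raise ValueError(f"Shard {shard} is already allocated")
            self.shards[shard] = region
        else:
            raise ValueError(f"Unknown event: {kind}")


def replay(journal):
    """Rebuild state by applying every event in journal order."""
    state = CoordinatorState()
    for event in journal:
        state.updated(event)
    return state


# Each cluster's own stream replays fine in isolation.
cluster_a = [("RegionRegistered", "regionA"), ("ShardHomeAllocated", 78, "regionA")]
cluster_b = [("RegionRegistered", "regionB"), ("ShardHomeAllocated", 78, "regionB")]

# Assumption: both streams were written into the same journal, interleaved.
merged = [cluster_a[0], cluster_b[0], cluster_a[1], cluster_b[1]]

# Assumption: one writer's registration event did not survive in the merged stream.
missing_registration = [cluster_a[0], cluster_a[1], ("ShardHomeAllocated", 79, "regionB")]
```

In this model, `replay(merged)` raises on the second allocation of shard 78, and `replay(missing_registration)` raises because `regionB` was never registered. Whether the real journal ended up in either shape is not something this chat confirms.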
Joshua Garnett
@joshgarnett
Apr 22 2018 17:14
Yeah. In theory, though, one of the nodes should have seen an exception when trying to persist, since the journal sequence numbers would have collided.
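A minimal sketch of the collision described here, in Python. It assumes a journal that rejects a second write for the same (persistence id, sequence number) pair; whether a given journal plugin actually enforces that depends on the plugin and its storage schema. The `Journal`, `Writer`, and `SequenceConflict` names are hypothetical.

```python
class SequenceConflict(Exception):
    """Raised when a (persistence_id, sequence_nr) slot is already taken."""


class Journal:
    """Toy journal that enforces one event per (persistence_id, sequence_nr)."""

    def __init__(self):
        self._events = {}

    def highest_sequence_nr(self, persistence_id):
        # 0 means nothing has been persisted for this id yet.
        return max(
            (nr for (pid, nr) in self._events if pid == persistence_id),
            default=0,
        )

    def append(self, persistence_id, sequence_nr, payload):
        key = (persistence_id, sequence_nr)
        if key in self._events:
            raise SequenceConflict(f"{persistence_id} already has event #{sequence_nr}")
        self._events[key] = payload


class Writer:
    """Toy persistent actor: reads the highest sequence number once, at recovery."""

    def __init__(self, journal, persistence_id):
        self.journal = journal
        self.persistence_id = persistence_id
        self.sequence_nr = journal.highest_sequence_nr(persistence_id)

    def persist(self, payload):
        # Claim the next slot; only advance the counter if the write succeeds.
        next_nr = self.sequence_nr + 1
        self.journal.append(self.persistence_id, next_nr, payload)
        self.sequence_nr = next_nr


journal = Journal()
pid = "/system/sharding/placementCoordinator/singleton/coordinator"
node_a = Writer(journal, pid)   # both writers recover before either persists,
node_b = Writer(journal, pid)   # so both believe the next sequence number is 1
node_a.persist("first event from A")
```

With this setup, a following `node_b.persist(...)` raises `SequenceConflict`, because node A already took sequence number 1. A journal without such a uniqueness check would accept both writes silently.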