These are chat archives for akkadotnet/akka.net

8th
Jun 2018
AndreSteenbergen
@AndreSteenbergen
Jun 08 2018 05:34
I regularly see this (once a day)
[WARNING][6/8/18 4:18:05 AM][Thread 0003]
Marking node(s) as UNREACHABLE
[INFO][6/8/18 4:18:06 AM][Thread 0003]
Marking node(s) as REACHABLE
I am on Azure, maybe that doesn't help
but at the moment the cluster isn't really stable
AndreSteenbergen
@AndreSteenbergen
Jun 08 2018 05:42
What can I do to make the cluster a little more relaxed? All nodes are up and running, just high network IO
Bartosz Sypytkowski
@Horusiath
Jun 08 2018 05:55
@AndreSteenbergen I don't know what others' experiences are, but cloud providers have trouble keeping 100% uptime on a regular basis. I've heard about a big cluster that was disconnecting nodes 7 times a day due to network splits. What matters is that the rest of the system keeps working in the face of such failures.
AndreSteenbergen
@AndreSteenbergen
Jun 08 2018 07:22
@Horusiath Thanks, but I could manage something through settings, right? I know the nodes aren't really down, so I thought I could relax some of the settings. The ones I've tried so far haven't really improved stability.
Bartosz Sypytkowski
@Horusiath
Jun 08 2018 08:28
@AndreSteenbergen If I remember correctly, there are two types of unreachability: one for system-to-system communication (part of Akka.Remote) and another for system-to-cluster unreachability (part of Akka.Cluster). The second one is tuned through the failure detector configuration. But maybe @Aaronontheweb could say something here.
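For reference, the two kinds of detection mentioned here are configured in separate HOCON sections. A rough sketch of where they sit: the section and key names are the ones Akka.NET uses, but every value below is an illustrative assumption, not a recommendation and not a statement of the defaults.

```hocon
akka {
  remote {
    # Akka.Remote: heartbeats on the connection between two systems
    transport-failure-detector {
      heartbeat-interval = 4 s            # illustrative value
      acceptable-heartbeat-pause = 120 s  # illustrative value
    }
    # Akka.Remote: used for death watch of remote actors
    watch-failure-detector {
      threshold = 12.0                    # illustrative value
      acceptable-heartbeat-pause = 15 s   # illustrative value
    }
  }
  cluster {
    # Akka.Cluster: drives the UNREACHABLE / REACHABLE markings in the cluster log
    failure-detector {
      threshold = 12.0                    # illustrative value
      acceptable-heartbeat-pause = 10 s   # illustrative value
    }
  }
}
```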
AndreSteenbergen
@AndreSteenbergen
Jun 08 2018 08:29
Thanks, I'll tweak some of the cluster settings as well; I've only done the remoting ones so far.
Well, tweaking... trial and error, I guess.
Vagif Abilov
@object
Jun 08 2018 12:42
@AndreSteenbergen in our cluster one of the machines was often under high load, which resulted in high CPU usage (>90%). That often caused unreachability. Once we relaxed the load on that machine and tweaked some settings (related to Akka.Remote, if I remember right), things got better.
AndreSteenbergen
@AndreSteenbergen
Jun 08 2018 12:47
CPU load isn't really the concern, if I am reading top right. I get about 300% on the 4-CPU machines. The load average according to top is way higher (about 8-10). Using iftop I could tell (if I am correct) that a lot of network IO causes the high load. I am building a scraper, so network IO is normal. I am now running with the cluster settings tweaked (threshold = 16.0) and have had no unreachable nodes as of yet. Crossing my fingers.
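The tweak mentioned here would look roughly like this in HOCON. Only `threshold = 16.0` comes from the message above; the `acceptable-heartbeat-pause` line is an assumption, shown because it is the other setting that controls how tolerant the detector is of late heartbeats.

```hocon
akka.cluster.failure-detector {
  # From the message above. A higher threshold means fewer false
  # UNREACHABLE markings, at the cost of slower detection of real failures.
  threshold = 16.0

  # Assumed value, not from the chat: how long heartbeats may be
  # delayed before the node starts to look suspect.
  acceptable-heartbeat-pause = 10 s
}
```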
Chandra Sekhar Manginipalli
@leo12chandu
Jun 08 2018 19:56
Anyone know of a way to Tell/Ask a message to a cluster from outside of the cluster without having to use ClusterClient? I could use ActorSelection, but from what I've heard I can't death watch through an ActorSelection if a node goes down. Besides, if I need to scale my number of nodes up and down, I will have to add each new node's IP:Port to the list of ActorSelections every time.
Chandra Sekhar Manginipalli
@leo12chandu
Jun 08 2018 21:02
Or even group routees, I guess, but that has the same issue of not being able to death watch when remote nodes go down.
Arjen Smits
@Danthar
Jun 08 2018 21:15
Can't death watch with ActorSelection?
You use an ActorSelection to resolve to an IActorRef; you can death watch that, right?
Aaron Stannard
@Aaronontheweb
Jun 08 2018 21:17
yep
you can do that
you can just use Akka.Remote
to communicate with an Akka.Cluster instance, without having to use ClusterClient
just resolve an ActorSelection via ResolveOne first
if the node becomes unreachable you'll get a Terminated notice via death watch
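A minimal sketch of the approach described above, assuming a client ActorSystem with Akka.Remote enabled. The actor name `RemoteWatcher`, the remote path passed in, and the 5-second timeout are hypothetical; the failure handling is deliberately simplistic.

```csharp
using System;
using Akka.Actor;

// Hypothetical client-side actor: resolves one remote actor, then death-watches it.
public class RemoteWatcher : ReceiveActor
{
    private readonly string _remotePath;
    private IActorRef _target = ActorRefs.Nobody;

    public RemoteWatcher(string remotePath)
    {
        _remotePath = remotePath;

        // Result of ResolveOne, piped back to this actor from PreStart.
        Receive<IActorRef>(target =>
        {
            _target = target;
            Context.Watch(target);
        });

        // ResolveOne failed (timeout or no such actor); a real client would likely retry.
        Receive<Status.Failure>(_ => Context.Stop(Self));

        // Delivered when the watched actor stops or its node is deemed unreachable.
        Receive<Terminated>(t =>
        {
            if (t.ActorRef.Equals(_target)) _target = ActorRefs.Nobody;
        });

        // Everything else is passed on to the resolved actor.
        // Messages arriving before resolution are not delivered.
        ReceiveAny(msg => _target.Tell(msg, Sender));
    }

    protected override void PreStart()
    {
        // Turn the selection into a concrete IActorRef that can be watched.
        Context.ActorSelection(_remotePath)
               .ResolveOne(TimeSpan.FromSeconds(5))
               .PipeTo(Self);
    }
}
```

It could be started with something like `system.ActorOf(Props.Create(() => new RemoteWatcher("akka.tcp://ClusterSystem@10.0.0.4:4053/user/worker")))`, where the system name, address and actor path are placeholders.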
Chandra Sekhar Manginipalli
@leo12chandu
Jun 08 2018 21:53
@Danthar @Aaronontheweb Thank You. I will try that.
I am trying to distribute messages across multiple nodes in the cluster in a round-robin fashion. This would mean that, if new nodes are added, I'd have to restart the client app if I used ActorSelection, wouldn't I? Unlike ClusterClient, which resolves new nodes automatically through DistributedPubSub.