to11mtm
@to11mtm
you could try to restart the cluster within that app-domain, wire up a bunch of logic around that
or, have the services/etc set up for recovery, and kill the app domain
the second approach (killing the app after shutting down the system) is probably 'easier' from an app standpoint; you can't be certain of the -reason- the node was downed; what if the nasty thing eating CPU is still in the app? reconnecting to the cluster isn't going to help things out
second approach is also better in line with the rest of the actor philosophy of 'let it crash', fwiw
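A minimal sketch of that second approach, assuming the process runs under an external supervisor (a Windows service, systemd, Kubernetes, or similar) that restarts it on a non-zero exit code. The system name and the inline HOCON are placeholders, and a real node would also need seed nodes configured:

```csharp
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Cluster;
using Akka.Configuration;

public static class Program
{
    public static async Task<int> Main()
    {
        // Placeholder config: just enough to enable the cluster provider.
        var config = ConfigurationFactory.ParseString(@"
            akka.actor.provider = cluster
            akka.remote.dot-netty.tcp { hostname = ""localhost"", port = 0 }");

        var system = ActorSystem.Create("cluster", config);

        // When this node is removed from the cluster (for example after being downed),
        // shut the whole ActorSystem down instead of trying to rejoin in-process.
        Cluster.Get(system).RegisterOnMemberRemoved(() => system.Terminate());

        // Block until the ActorSystem has terminated.
        await system.WhenTerminated;

        // Non-zero exit code so the external supervisor treats this as a crash and restarts the app.
        return 1;
    }
}
```

The restart then starts from a clean process, which also covers the case where whatever caused the downing is still alive inside the app.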
evaldast
@evaldast
Thanks a bunch, I was aware of cluster events, however, having external supervisor restart the application somehow escaped my mind!
fscavo
@fscavo
Hello @Aaronontheweb
I have this configuration for PersistenceSupervisor. After starting the SqlServer instance, I don't see any records in the journal. Am I missing something? I'm using Akka.Persistence.Extras v0.5.2 and Akka.Persistence v1.4.13
fscavo
@fscavo
var props = actorRefFactory.DI().Props<NotificationCoordinatorActor>();
var supervisor = PersistenceSupervisor.PropsFor(
    (msg, confId) =>
    {
        return msg switch
        {
            NotificationCoordinatorActor.Subscribe sub => new NotificationCoordinatorActor.ConfirmableSubscribe(confId, string.Empty, sub),
            NotificationCoordinatorActor.Unsubscribe unsub => new NotificationCoordinatorActor.ConfirmableUnsubscribe(confId, string.Empty, unsub),
            NotificationCoordinatorActor.Notify notify => new NotificationCoordinatorActor.ConfirmableNotify(confId, string.Empty, notify),
            _ => new ConfirmableMessageEnvelope(confId, string.Empty, msg)
        };
    },
    msg => true,
    props,
    "notification-coordinator-actor",
    strategy: SupervisorStrategy.StoppingStrategy.WithMaxNrOfRetries(100));

ActorRef = actorRefFactory.SingletonActorOf(supervisor, "notification-coordinator-actor");
evaldast
@evaldast
How can I produce node Unreachable status locally for testing?
evaldast
@evaldast
var cluster = Cluster.Get(Context.System);
var member = cluster.SelfMember;
var seed = Context.ActorSelection("akka.tcp://cluster@events:5055/user/$a").ResolveOne(TimeSpan.FromSeconds(2)).Result;

seed.Tell(new ClusterEvent.UnreachableMember(member));
It was as simple as that :|
Birnsen
@Birnsen

Hello everyone. Quick question on Akka.Streams. I wanted to test out a RestartFlow and made a simple example, something along these lines:

using var system = ActorSystem.Create("system");
using var materializer = system.Materializer();

var source = Source.From(Enumerable.Range(1, 10));

await source
    .Via(RestartFlow.OnFailuresWithBackoff(
        () => Flow.FromFunction<int, int>(i => i == 2 ? throw new Exception("boom") : i),
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(10),
        0))
    .RunForeach(i => Console.WriteLine(i), materializer);

The problem is that after the exception is thrown, the RestartFlow throws a "Cannot pull a closed port" exception. Can someone explain to me why? Or am I missing something?
Thanks in advance

mike-ammer
@mike-ammer

Hello!

Is it possible that the BackoffSupervisor parameter description is not correct?

The doc says:
The following C# snippet shows how to create a backoff supervisor which will start the given echo actor after it has stopped because of a failure, in increasing intervals of ... seconds

But in a scenario, where the child Actor has a delay, I run into a problem.

I am supervising a child with BackoffSupervisor.OnStop. The ChildActor tries to establish a connection. If it doesn't manage to do that in a certain timeout, then the actor is stopped.

Parent
var supervisor = BackoffSupervisor.Props(
    Backoff.OnStop(
        childProps,
        childName: "myEcho",
        minBackoff: TimeSpan.FromMilliseconds(500),
        maxBackoff: TimeSpan.FromSeconds(30),
        randomFactor: 0.2,
        maxNrOfRetries: 3));

Child
Context.System.Tcp().Tell(new Tcp.Connect(endpoint, null,null, TimeSpan.FromMilliseconds(1000)));
(Stops on TCP ErrorMessage Received)

I read the documentation as saying that when the child stops, it is started again once the backoff interval has expired.
So in my case the sequence should look like this:

0ms: Initial start
1000ms: Child timeout => Context.Stop(Self)
1500ms: First retry (+500ms)
2500ms: Child timeout => Context.Stop(Self)
3500ms: Second retry (+1000ms)
4500ms: Child timeout => Context.Stop(Self)
6500ms: Third retry (+2000ms)
7500ms: Terminate child

Is this correct?
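The arithmetic behind that timeline can be checked with a small sketch. It assumes the delay doubles on every restart, starting at minBackoff and capped at maxBackoff, and it leaves out the randomFactor jitter so the numbers are repeatable. `BackoffSketch` and `Delay` are illustrative names, not part of Akka.NET:

```csharp
using System;

public static class BackoffSketch
{
    // Delay before restart number `restartCount` (0-based):
    // minBackoff * 2^restartCount, capped at maxBackoff. Jitter is omitted.
    public static TimeSpan Delay(int restartCount, TimeSpan minBackoff, TimeSpan maxBackoff)
    {
        // Avoid overflow for very large restart counts.
        if (restartCount >= 30) return maxBackoff;
        var ms = minBackoff.TotalMilliseconds * Math.Pow(2, restartCount);
        return TimeSpan.FromMilliseconds(Math.Min(ms, maxBackoff.TotalMilliseconds));
    }

    public static void Main()
    {
        var minBackoff = TimeSpan.FromMilliseconds(500);
        var maxBackoff = TimeSpan.FromSeconds(30);
        var childTimeout = TimeSpan.FromMilliseconds(1000);
        var clock = TimeSpan.Zero;

        Console.WriteLine($"{clock.TotalMilliseconds}ms: initial start");
        for (var retry = 0; retry < 3; retry++)
        {
            // The child gives up after its connect timeout and stops itself.
            clock += childTimeout;
            Console.WriteLine($"{clock.TotalMilliseconds}ms: child timeout, child stops");

            // The supervisor waits for the backoff delay before starting it again.
            var delay = Delay(retry, minBackoff, maxBackoff);
            clock += delay;
            Console.WriteLine($"{clock.TotalMilliseconds}ms: retry {retry + 1} (+{delay.TotalMilliseconds}ms)");
        }
        clock += childTimeout;
        Console.WriteLine($"{clock.TotalMilliseconds}ms: child timeout, retries exhausted");
    }
}
```

This only reproduces the expected sequence if the restart counter keeps growing between stops. As far as I know, the supervisor by default resets that counter once the child has stayed alive for minBackoff, so a child that lives longer than minBackoff before stopping would start from the first delay every time; the `WithAutoReset` and `WithManualReset` options on the backoff settings are worth checking in that case.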

But if I set the minBackoff time in the BackoffSupervisor to less than the timeout in the child actor, the backoff does not work.
The child is restarted immediately and maxNrOfRetries is not respected either.

What am I doing wrong? What am I missing?

Regards
Michael

fscavo
@fscavo
Hello all,
Has anyone used PersistenceSupervisor with Akka.Persistence.SqlServer? It doesn't seem to be working!
Vagif Abilov
@object

I found in our logs the following warning:

Configured Total of Connection timeout (15 seconds) and Command timeout (30 seconds) is less than or equal to Circuit breaker timeout (10 seconds). This may cause unintended write failures

I think two things need to be corrected here.

  1. The warning should say "greater than Circuit breaker timeout"
  2. Since it is recommended to set Circuit breaker timeout to the sum of those two, the condition in the source code should trigger the warning only if the sum is greater, not greater or equal.

@Aaronontheweb This is a minor detail but can be confusing. If you agree with my interpretation, I can send a PR.

to11mtm
@to11mtm
@object you are right on the greater than vs less than
The reason I set it to trigger on greater than or equal is because IMO circuit breaker should possibly be set just a little higher still on some systems due to timing/etc
Vagif Abilov
@object
I see. Then it's better to keep this logic and just update the message.
fbasco81
@fbasco81
Hello guys,
I have a question about Akka.Monitoring and Application Insights. I followed the wiki page, installed the required packages, and added some code to my actors to send metrics to AI. Everything works fine and I can see my data in the Application Insights metrics. The main issue is that the throughput of the application has more than halved! In the same application I also have Serilog sending logs to the same Application Insights instance. Could that be the problem? I searched for existing issues but did not find anything. Can somebody help me? Thank you!
evaldast
@evaldast
I'm using Akka.DI with .NET Core Microsoft Dependency Injection. I was wondering whether there's a way to create an actor using Akka.DI while still being able to pass arguments into its constructor from the creating actor.
Currently, I'm creating the child using DI and then, from the child, Ask()'ing the parent for props. This means that I have to sprinkle Receive<> blocks all over the parent actor, as it's a state machine with multiple Become()s
mrxrsd
@mrxrsd
@evaldast, there are some hacks to achieve this... but you should wait for the next release. The whole DI core will be updated.
to11mtm
@to11mtm

@object I peeked at some of your issues akkadotnet/akka.net#4265 and akkadotnet/Akka.Persistence.SqlServer#190

  • You may or may not want to try logging some threadpool metrics to see whether there's some form of contention going on. This helped us troubleshoot issues where a misconfigured Grafana dependency was causing pool explosions when actors would start up/shut down
     ThreadPool.GetAvailableThreads(out int avWt, out int avIot);
     ThreadPool.GetMaxThreads(out int maxWt, out int maxIot);
     var (workerThreads,ioThreads)= (maxWt - avWt,maxIot-avIot); //log this every second
  • If you can enable MVCC (i.e. Enable snapshot Isolation and default isolation level READ_COMMITTED_SNAPSHOT) that could help

  • Might be worth logging the SQL executed in the chunk

  • Mayyyybe try adding some code in the batching process so that if there's only READ commands in the batch, a transaction isn't used.

  • If you're feeling adventurous, you may want to try (it is technically prerelease state) the Linq2db plugin which is based on akka-persistence-jdbc. Code is the branch in akkadotnet/Akka.Persistence.Linq2Db#10 and I have it as a nuget package here. Configuring for 'compatibility mode' (Where it should be both backward and forward compatible with Existing Journal/snapshot) is exampled here. I've been running the journal in not-quite-prod (i.e. syncing data with old system) for a few months now with great success
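The thread pool logging from the first bullet can be wrapped into a small stand-alone sampler. This is only a sketch: `ThreadPoolUsage` and `Sample` are illustrative names, and `Console.WriteLine` stands in for whatever logger the application uses.

```csharp
using System;
using System.Threading;

public static class ThreadPoolUsage
{
    // Threads currently in use = pool maximum minus threads still available.
    public static (int Worker, int Io) Sample()
    {
        ThreadPool.GetAvailableThreads(out int availableWorker, out int availableIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        return (maxWorker - availableWorker, maxIo - availableIo);
    }

    public static void Main()
    {
        // Take one sample per second, as suggested above.
        using var timer = new Timer(_ =>
        {
            var (worker, io) = Sample();
            Console.WriteLine($"{DateTime.UtcNow:O} busy worker threads={worker}, busy IO threads={io}");
        }, null, TimeSpan.Zero, TimeSpan.FromSeconds(1));

        // Keep the demo process alive long enough to print a few samples.
        Thread.Sleep(TimeSpan.FromSeconds(5));
    }
}
```

A steadily climbing worker count while actors start up or shut down would point toward the kind of pool contention described above.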

Vagif Abilov
@object
Thank you @to11mtm. I will paste your suggestions to the original issue, so they won't get buried here. I will definitely continue my investigation.
Gustavo
@sainzg
@Aaronontheweb, how should the connection string for the snapshot store be formed when using Azure blobs that are set up to run with DefaultAzureCredential? I can access the storage from outside Akka.Persistence, but not when I go through it.
Cristiano Degiorgis
@crixo
We tried to use the test configuration from https://getakka.net/articles/configuration/akka.testkit.html and it broke our test suite. The main difference compared to our previous configuration is
scheduler {
  implementation = "Akka.TestKit.TestScheduler, Akka.TestKit"
}
and yes, we are using a standard scheduler within one of our actors, which stops sending scheduled messages under the new configuration.
The tests that fail are testing the entire app. In other words, they simulate the execution that is normally kicked off in Main, but using a test ActorSystem created via TestKit.
Aaron Stannard
@Aaronontheweb
next Akka.NET Community Standup will be tomorrow at 1:00pm CST https://www.youtube.com/watch?v=blLK8lH_MQ0
will have some updates on Akka.NET v1.4.15, the new DI system, and more
@sainzg hmmm
not sure off the top of my head
we're using an older version of the Azure Storage Client because the new Cosmos Db one isn't compatible with Azurite petabridge/Akka.Persistence.Azure#125
would you mind opening an issue here? https://github.com/petabridge/Akka.Persistence.Azure/issues
@crixo yeah the TestScheduler has to be manually advanced
won't run automatically
that's designed, ironically in this case, for deterministic testing purposes
were you able to fix your tests?
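A rough sketch of what "manually advanced" means in a test, assuming xUnit with the Akka.TestKit.Xunit2 package. The spec name, the "tick" message and the 5 second delay are made up for illustration:

```csharp
using System;
using Akka.Actor;
using Akka.TestKit;
using Akka.TestKit.Xunit2;
using Xunit;

public class ManualSchedulerSpec : TestKit
{
    // Same scheduler setting as in the TestKit configuration quoted above.
    public ManualSchedulerSpec()
        : base(@"akka.scheduler.implementation = ""Akka.TestKit.TestScheduler, Akka.TestKit""")
    {
    }

    [Fact]
    public void Scheduled_message_arrives_only_after_virtual_time_is_advanced()
    {
        var scheduler = (TestScheduler)Sys.Scheduler;

        // Schedule a message 5 seconds into the (virtual) future.
        Sys.Scheduler.ScheduleTellOnce(TimeSpan.FromSeconds(5), TestActor, "tick", ActorRefs.NoSender);

        // Real time passing does nothing: the TestScheduler's clock has not moved.
        ExpectNoMsg(TimeSpan.FromMilliseconds(200));

        // Move the virtual clock forward; the scheduled message is delivered now.
        scheduler.Advance(TimeSpan.FromSeconds(5));
        ExpectMsg("tick");
    }
}
```

For tests that exercise the whole app and rely on real timers firing, leaving the scheduler setting out of the test configuration keeps the default scheduler in place.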
Gustavo
@sainzg
@Aaronontheweb , here you go let me know if more detail is needed petabridge/Akka.Persistence.Azure#140
mike-ammer
@mike-ammer

Hi! I have a general question for you...

How do you document your message flow? Is UML your weapon of choice? Do you use any other tools?
It would be nice to hear some suggestions.

thx
Michael

Aaron Stannard
@Aaronontheweb
thanks @sainzg - I appreciate it
Aaron Stannard
@Aaronontheweb
@/all Akka.NET v1.4.15 is now live on NuGet: https://twitter.com/AkkaDotNET/status/1351939117303681028
which includes our new Akka.DependencyInjection package
and we would love to get some feedback on that in the real world
Aaron Stannard
@Aaronontheweb
@mike-ammer I usually do a flowchart
for diagramming the message flow specifications
for charting the live messaging of running systems I tend to rely on distributed tracing for that
we're relying on Azure Application Insights for that today
but we're going to move off of it because it is ludicrously expensive
we'll probably adopt Jaeger instead
Aaron Stannard
@Aaronontheweb
@/all going live with our community standup now https://www.youtube.com/watch?v=blLK8lH_MQ0
fscavo
@fscavo
Hello all,
Is it possible to broadcast a message to all entities in the same shard? For instance, I want to send one message to all entities that live in the shard customerupdated
/user/sharding/customer/customerupdated/123
/user/sharding/customer/customerupdated/567