These are chat archives for akkadotnet/akka.net

13th
Jul 2016
Peter Bergman
@peter-bannerflow
Jul 13 2016 07:23
If I have a bunch of actors that are supposed to be quite short-lived (100 ms max) and they have control over their own lifecycle (i.e. each actor knows when it's no longer needed), is it then good practice for the actor to somehow "shut down" itself?
Malisa Ncube
@malisancube
Jul 13 2016 07:25
C f
Bart de Boer
@boekabart
Jul 13 2016 07:32
@peter-bannerflow as far as I know, yes, it should just commit suicide (e.g. Self.Tell(PoisonPill.Instance); )
Arsene T. Gandote
@Tochemey
Jul 13 2016 07:35

Hello I would like to know when I have this in my App.Config file

akka {  
    stdout-loglevel = DEBUG
    loglevel = DEBUG
    log-config-on-start = on        
    actor {                
        debug {  
              receive = on 
              autoreceive = on
              lifecycle = on
              event-stream = on
              unhandled = on
        }
    }
}

Do I need to call _log.Debug("Some message"); in my actors? Or does Akka take care of that for me?

Peter Bergman
@peter-bannerflow
Jul 13 2016 07:37
@boekabart Thanks for the info. I think that in my case I want to go with Stop() since the dying actor might have a large unprocessed mailbox that I no longer care about.
Bart de Boer
@boekabart
Jul 13 2016 07:38
Context.Stop(Self) is what I use in exactly such a case
need to be 100% sure that no more messages are handled
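The difference between the two options discussed above can be sketched as follows. This is a minimal illustration, not code from the chat: the `WorkDone` message and the `ShortLivedWorker` actor are hypothetical names, and it assumes the Akka NuGet package is referenced.

```csharp
using System;
using Akka.Actor;

// Hypothetical message type, used only for this illustration.
public sealed class WorkDone { }

// Hypothetical short-lived actor that decides for itself when it is finished.
public class ShortLivedWorker : ReceiveActor
{
    public ShortLivedWorker()
    {
        Receive<WorkDone>(_ =>
        {
            // Stops the actor once the current message has been handled.
            // Messages still waiting in the mailbox are not processed
            // by this actor.
            Context.Stop(Self);

            // Alternative: Self.Tell(PoisonPill.Instance) is enqueued like
            // any other message, so everything already in the mailbox is
            // handled before the actor stops.
        });
    }

    protected override void PostStop()
    {
        // Runs once the actor has stopped, whichever way it was stopped.
        Console.WriteLine("{0} stopped", Self.Path);
    }
}
```

With `Context.Stop(Self)` a large backlog is simply abandoned, which matches the case described above where the remaining mailbox content no longer matters.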
Peter Bergman
@peter-bannerflow
Jul 13 2016 07:38
Yes, exactly. Thanks :)
And by the way, do you know how subsequent messages to an IActorRef of the dead actor will be handled? Deadletters?
Bart de Boer
@boekabart
Jul 13 2016 07:40
Can't imagine anything other than that, indeed
since they are letters and the actor is dead ;)
Peter Bergman
@peter-bannerflow
Jul 13 2016 07:41
Yeah, pretty obvious I guess... :P
Arsene T. Gandote
@Tochemey
Jul 13 2016 07:44
Also I would like to know whether it is necessary to provide a ToString method for my message POCOs in case I want to log them, or whether I can use the serialization feature to do that. The best would be that Akka takes care of that. Any ideas, gentlemen?
Bart de Boer
@boekabart
Jul 13 2016 07:51
well that depends on the verbosity you want. Using serialization will produce very verbose (large) logs vs. using an optimized ToString per POCO. We prefer the latter.
Bartosz Sypytkowski
@Horusiath
Jul 13 2016 07:52
@Tochemey if you'll use serialization to log the content of a message, you're introducing quite a big footprint for such a simple step
Bart de Boer
@boekabart
Jul 13 2016 08:39
Might be worth it only for exceptional cases (eg. errors that really aren't supposed to happen)
Ricky Blankenaufulland
@ZoolWay
Jul 13 2016 08:41
@Aaronontheweb Sorry, a bracket slipped into the URL. It should be https://gist.github.com/ZoolWay/8092e7c2aa3a86b009981885cd4aa271 - but it is basically the same as the code in the GitHub repository which shows Cluster.Leave for three types. Interestingly the shown method for ASP.NET Core does not work in a larger project of mine, still checking on that. The gist shows the exceptions and deadletters in its comments, though.
Ricky Blankenaufulland
@ZoolWay
Jul 13 2016 12:04
@Aaronontheweb Here is another gist https://gist.github.com/ZoolWay/2af4eb1815c6a5c41cdeb440dd0cd36b which is, in my opinion, the same leave code as in the GitHub example, but the node does not leave the cluster. Instead the other nodes keep trying to reconnect every 5s. The gist shows the ASP.NET Startup.cs and the output the node generates itself when trying to leave the cluster.
Curtis Swartzentruber
@skills0
Jul 13 2016 12:42
still would like some thoughts on my "Leader can currently not perform its duties" issue. should I open a bug with details?
in looking at @ZoolWay issue, it appears we are seeing some similar behavior
Ricky Blankenaufulland
@ZoolWay
Jul 13 2016 12:50
@skills0 The leaving issue with ASP.NET Core?
Ricky Blankenaufulland
@ZoolWay
Jul 13 2016 13:02
I got a rather basic question about how to define the message receive handlers. In http://getakka.net/docs/working-with-actors/sending-messages#ask-send-and-receive-future we have this hint: When using task callbacks inside actors, you need to carefully avoid closing over the containing actor’s reference, i.e. do not call methods or access mutable state on the enclosing actor from within the callback. This would break the actor encapsulation and may introduce synchronization bugs and race conditions because the callback will be scheduled concurrently to the enclosing actor.
Is this violated when I define the handler like this: Receive<SubscribeToDeadLetters>((message) => ReceivedSubscribe(message));? Because calling a method just feels like making the code more readable (and stacktraces too) but if it endangers my actor system...
Curtis Swartzentruber
@skills0
Jul 13 2016 13:06
no, i'm actually seeing a similar issue in standard C# where a node doesn't leave a 2 node cluster properly and the other node keeps trying to connect to it. not exactly the same, but some similar errors to your gist.
Peter Bergman
@peter-bannerflow
Jul 13 2016 13:14
@ZoolWay I would assume that the issue with synchronization only applies to cases where you fire off a task (as with the Ask example) and the callback is being executed sometime in the future. I use the same structure as you do in my Receive handlers. But perhaps someone else can confirm this...
Marc Piechura
@marcpiechura
Jul 13 2016 13:16
@ZoolWay @peter-bannerflow yep that's right
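To make the distinction above concrete, here is a minimal sketch, assuming the Akka NuGet package. The `Fetch` and `Fetched` messages, the `FetchingActor` class and the `LoadAsync` method are hypothetical names invented for the illustration. Delegating from a `Receive` handler to a private method still runs on the actor's own message-processing turn; the warning in the docs concerns task callbacks, and piping the task result back to `Self` is one common way to avoid touching actor state from such a callback.

```csharp
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical messages for this illustration.
public sealed class Fetch { }

public sealed class Fetched
{
    public Fetched(string body) { Body = body; }
    public string Body { get; }
}

public class FetchingActor : ReceiveActor
{
    private int _completed; // mutable actor state

    public FetchingActor()
    {
        // Fine: both lambdas just forward to methods that run while the
        // actor is processing the message, one message at a time.
        Receive<Fetch>(message => HandleFetch(message));
        Receive<Fetched>(message => HandleFetched(message));
    }

    private void HandleFetch(Fetch message)
    {
        // Risky (not done here): LoadAsync().ContinueWith(t => _completed++)
        // would mutate actor state from a thread-pool callback.
        // Instead the result is sent back to this actor as a message.
        LoadAsync().PipeTo(Self);
    }

    private void HandleFetched(Fetched message)
    {
        // Safe: state is only touched while handling a message.
        _completed++;
    }

    // Stand-in for real asynchronous work.
    private static async Task<Fetched> LoadAsync()
    {
        await Task.Delay(10);
        return new Fetched("payload");
    }
}
```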
Ricky Blankenaufulland
@ZoolWay
Jul 13 2016 13:22
Good to know, thanks! @Silv3rcircl3 @peter-bannerflow
Jared Lobberecht
@Jared314
Jul 13 2016 14:52
Is there a form of the Akka.Tools.MatchHandler.MatchBuilder.Match that accepts a MethodInfo object?
Aaron Stannard
@Aaronontheweb
Jul 13 2016 16:33
@skills0 in a 2-node cluster unless both nodes have the other node listed as a seed node, you are going to run into that
that's on you, in other words
if one node is a seed and the other isn't
when the seed node restarts
it knows it's a seed
and will join itself and form its own cluster if there are no other seeds listed
so you'll end up with a split brain
Akka.Cluster is not magic - it uses the same dynamo-style clustering system that Cassandra, Riak, DynamoDb, and others use
all of those systems require minimum seed node count >= 2
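A two-node setup in which each node lists both nodes as seeds could look roughly like the sketch below. The system name `demo`, the host and the ports are hypothetical, and it assumes the Akka.Cluster package with the Helios transport that was current at the time of this chat.

```csharp
using Akka.Actor;
using Akka.Configuration;

public static class SeedNodeExample
{
    // Hypothetical system name, host and ports. The second node would use
    // the same config with port = 4054; the seed-nodes list stays identical
    // on both nodes.
    private const string Hocon = @"
akka {
  actor.provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
  remote.helios.tcp {
    hostname = ""127.0.0.1""
    port = 4053
  }
  cluster.seed-nodes = [
    ""akka.tcp://demo@127.0.0.1:4053"",
    ""akka.tcp://demo@127.0.0.1:4054""
  ]
}";

    public static ActorSystem Start()
    {
        // Parse the HOCON above and start a system whose name matches the
        // one used in the seed node addresses.
        var config = ConfigurationFactory.ParseString(Hocon);
        return ActorSystem.Create("demo", config);
    }
}
```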
@ZoolWay I did disable Helios's logging system inside Akka.Remote so I'm a bit surprised this shows up
ERROR 13:55:24 [ 3] ios.TcpClientHandler - Error caught channel [[::ffff:127.0.0.1]:12301->[::ffff:127.0.0.1]:4053](Id=ChannelId(1600056064))
System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode)
   at Helios.Channels.Sockets.TcpSocketChannel.DoReadBytes(IByteBuf buf)
   at Helios.Channels.Sockets.AbstractSocketByteChannel.SocketByteChannelUnsafe.FinishRead(SocketChannelAsyncOperation operation)
ERROR 13:55:24 [ 3] ios.TcpServerHandler - Error caught channel [[::ffff:127.0.0.1]:12300->[::ffff:127.0.0.1]:12311](Id=ChannelId(-18529280))
System.ObjectDisposedException: Cannot access a disposed object.
Object name: 'System.Net.Sockets.Socket'.
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags, SocketError& errorCode)
   at Helios.Channels.Sockets.TcpSocketChannel.DoReadBytes(IByteBuf buf)
   at Helios.Channels.Sockets.AbstractSocketByteChannel.SocketByteChannelUnsafe.FinishRead(SocketChannelAsyncOperation operation)
that exception is nothing to worry about
happens in Helios and in DotNetty whenever a socket is closed, even gracefully
Aaron Stannard
@Aaronontheweb
Jul 13 2016 16:39
the SocketAsyncEventArgs object that those libraries use to asynchronously read / write from the socket is always waiting in a read state
when we close the socket, we have to interrupt the pending async read operation in order to complete shutdown
that exception is what's doing it
I have it on my to-do list to review some of the innocuous remoting exceptions and prevent them from being logged
in this case the reason why you're seeing two instances of the exception
one is for the client currently connected to another node
the other is for the socket that is used to accept inbound connections
Peter Bergman
@peter-bannerflow
Jul 13 2016 16:42
Nice, I was actually scratching my head over that one as well
Aaron Stannard
@Aaronontheweb
Jul 13 2016 16:43
yeah, it's annoying
the exception gets thrown on that Socket.Receive call
which is basically what's going to be run asynchronously by the SocketAsyncEventArgs object we're using
we could probably handle that up the stack better
so in effect
the fact that we have these innocuous errors
is a user-experience issue on our part
Aaron Stannard
@Aaronontheweb
Jul 13 2016 16:48
makes the user think they did something wrong
when they didn't
Peter Bergman
@peter-bannerflow
Jul 13 2016 16:48
Yeah
Aaron Stannard
@Aaronontheweb
Jul 13 2016 16:49
We might release 1.1.1 today
depends on if I can get one more hostname / binding bug fixed
fixed IPV6 support already
have one more issue with public-hostname that needs to be resolved
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:38

Here goes a kind of open question... Does anyone have some insights to share on Akka.Cluster performance in terms of throughput of messages from the network into the actor system? For example, what kind of throughput can we expect for one single actor (running on one machine) that receives messages from other remote nodes in the cluster? Of course it depends on what that actor actually does with the messages as well as hardware specs and such, but any insights would be appreciated. Real-world experience, anyone? For example, can we expect 10k messages/second or 100k messages/second? I have seen this article https://petabridge.com/blog/performance-testing-mandatory/ and from that I gather that around 10k messages/second is reasonable?

Also, does anyone have some insight in OS (Windows Server) level tweaks that could be made in order to increase the message throughput into the actor system from the network?

Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:38
check out the benchmarks on each pull request
green check mark next to perf-tests
that will take you to a folder chock full of them
there's remote one way and remote two way specs for the current transport
two way is a full request response chain
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:42
Hmm yeah ok, I found a lot of them (like this for example http://petabridge-ci.cloudapp.net/viewLog.html?buildId=13097&buildTypeId=AkkaNet_AkkaNetWindowsPerformanceTests&tab=artifacts) if that is what you mean?
Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:43
yep
the Akka.Remote Helios transport ones
are the live transport
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:44
Yeah ok
Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:44
now I just gave a talk on this yesterday
that's a benchmark for our hardware, for our spec
there's no such thing as a global benchmark
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:44
Really... is it recorded? :)
Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:44
yep
let me find it
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:45
Cool
my talk was only 30 minutes long
so I have no idea why the video is 60 minutes
Arjen Smits
@Danthar
Jul 13 2016 17:46
lol that screenshot
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:46
Thx
Arjen Smits
@Danthar
Jul 13 2016 17:46
its like: "kudos to that guy with the question!"
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:48
About the files, is two way (Akka.Remote.Tests.Performance.Transports.HeliosRemoteMessagingThroughputSpec+TwoWay) like an actor receiving a message from the network and then responding with a message to the sender?
Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:50
just an FYI: I got some of the causes and reasons behind the prior performance issues with Akka.Remote mixed up during the talk because I couldn't see my notes with the way our monitors were laid out
so I was going from memory, and some of that turned out to be mixed up
the lazy evaluation stuff was actually an issue with slowly creating / shutting down children
different performance issues
but aside from the technical detail there, the major point is true
Peter Bergman
@peter-bannerflow
Jul 13 2016 17:55
Alright
Aaron Stannard
@Aaronontheweb
Jul 13 2016 17:56
@peter-bannerflow yep, that's correct
you can see the source for the benchmarks in any of the *.Tests.Performance projects
Peter Bergman
@peter-bannerflow
Jul 13 2016 18:01
Cool, in the video you mention that the metrics in the graph show the number of messages pushed through a single Akka.Remote connection, would that be one instance of a process running an application with Akka.Remote that binds to one port ?
Curtis Swartzentruber
@skills0
Jul 13 2016 18:04
@Aaronontheweb thanks for feedback, however both the nodes are seed nodes and both cluster configs have both nodes listed. The active/passive bit is managed by our code, I should've made that more clear.
Aaron Stannard
@Aaronontheweb
Jul 13 2016 18:05
@skills0 check out some of our Akka.Cluster.Tests.MultiNode
we cover a big range of cluster disconnect / leave scenarios in there and verify they all leave correctly
in your scenario, does the cluster think the node is unreachable or has the node been removed from the membership?
Curtis Swartzentruber
@skills0
Jul 13 2016 18:21
I'll check that out. Let me find the exception sequence.
Aaron Stannard
@Aaronontheweb
Jul 13 2016 18:21
that'd be great
and please file a Github issue
it's easier to refer back to if there's a problem than Gitter chat
more persistent that way
Curtis Swartzentruber
@skills0
Jul 13 2016 18:39
@Aaronontheweb i will put together an issue, but basically we call Cluster.Leave on seed node A, the seed node B goes through some association failure exceptions, marks A unreachable, but then keeps trying to associate with it. Helios then gets into a loop of connection errors (Error connecting, SocketException, etc.). Eventually node B gets "Leader can currently not perform its duties" and at that point the whole AkkaSystem on the up node just seems to stop working consistently.
Aaron Stannard
@Aaronontheweb
Jul 13 2016 19:53
if A leaves and finishes the leave process, it will not be marked as unreachable
so when you call Cluster.Leave, do you wait for A to receive a local MemberRemoved event for itself?
or Cluster.RegisterOnMemberRemoved delegate
before shutting down the actor system?
boekabart @boekabart thinks a helper method for that might be useful...await Cluster.LeaveAsync() or so
qwoz
@qwoz
Jul 13 2016 20:49
"it's like watching ferrets describe how a supernova works" ... rofl!
Bartosz Sypytkowski
@Horusiath
Jul 13 2016 20:58
:+1: for Cluster.LeaveAsync()
Aaron Stannard
@Aaronontheweb
Jul 13 2016 21:02
yeah, that could work
would basically just need to send an actor a message when the Cluster.RegisterOnMemberRemoved delegate fires
and that actor would complete a TaskCompletionSource
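The helper floated above might be sketched like this. `LeaveAsync` is a hypothetical extension method, not part of Akka.Cluster as discussed in this chat, and the sketch simplifies the idea by completing the `TaskCompletionSource` directly in the callback rather than going through an intermediate actor. It assumes the Akka.Cluster package.

```csharp
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Cluster;

public static class ClusterLeaveExtensions
{
    // Hypothetical helper: returns a task that completes once this node
    // has been removed from the cluster membership.
    public static Task LeaveAsync(this Cluster cluster)
    {
        var removed = new TaskCompletionSource<bool>();

        // Register the callback before leaving so the removal cannot be
        // missed. TrySetResult tolerates being called more than once.
        cluster.RegisterOnMemberRemoved(() => removed.TrySetResult(true));

        // Ask the cluster to remove this node gracefully.
        cluster.Leave(cluster.SelfAddress);

        return removed.Task;
    }
}
```

A caller would then await `Cluster.Get(system).LeaveAsync()` before terminating the actor system, so that the leave process has a chance to finish first.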
Curtis Swartzentruber
@skills0
Jul 13 2016 21:39
yeah guys, on closer review of the logs I'm thinking this may be operator error (pebkac). we mainly test in console apps and then deploy as service. I'm wondering if I'm not giving the AkkaSystem a chance to finish all the shutdown. thanks for the idea on waiting for MemberRemoved for self, hadn't thought of that. Getting ready to test some stuff around that.
Aaron Stannard
@Aaronontheweb
Jul 13 2016 22:46
@ZoolWay I had to patch Helios anyway as part of 1.1.1
to solve some DNS resolution issues
went ahead and handled that exception better also