These are chat archives for akkadotnet/akka.net

14th
Jun 2016
Jannis
@schjan
Jun 14 2016 00:28
Okay thank you. I will build a small test application with Akka.Remote and one with SignalR and try out what works best for me.
Aaron Stannard
@Aaronontheweb
Jun 14 2016 05:11
looks like the QuickRestartSpec revealed a bug with the endpoint management system possibly
working on porting ActorsLeakSpec in Akka.Remote to help reproduce the issue
(in a manner that's a bit easier to debug)
Kris Schepers
@schepersk
Jun 14 2016 07:03
Anyone else experiencing issues with garbage collecting of actors? Our system works fine until there are about 20,000 actors, then it starts to log OutOfMemory exceptions and shortly after that, the cluster falls apart..
We're using Cluster Sharding btw..
And we've gone through the effort of eliminating as much infrastructure stuff as possible (from SQL to in-memory persistence, single-threaded "command executor", ...)
Bart de Boer
@boekabart
Jun 14 2016 07:24
@schepersk we're running systems that create 400k actors and don't run into OutOfMemory exceptions. Make sure you uncheck 'prefer 32bit' on your project build settings, and maybe take a look at the amount of memory each actor instance consumes?
hidavidpeng
@hidavidpeng
Jun 14 2016 09:07
Hi. What does it mean to keep mutable state in an actor? Can I not modify the message in the Receive action?
Kris Schepers
@schepersk
Jun 14 2016 09:09
@boekabart Is there an easy way to look at the amount of memory of an actor?
Arjen Smits
@Danthar
Jun 14 2016 09:24
@hidavidpeng You can, but it opens you up to all sorts of concurrency problems. Keep your messages immutable. And keep state in your actors.
This means that your actors have private properties/fields
when receiving a message you can mutate that state.
@schepersk Not at this time.
But if you are getting OOM exception, we would be interested to know where they occur.
You are probably hitting some kind of internal limit when running in 32 bit. So going to 64bit will most probably solve your problem. But we would still like to know which limit you encountered ;)
With internal limit, I'm not talking about some artificial limit we enforce, more like an internal collection size limit. (if it even is something in akka that reaches the limit)
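On the advice above about immutable messages and actor-private state: a minimal sketch of what that might look like with Akka.NET's `ReceiveActor`. The message and actor names (`AddItem`, `GetTotal`, `BasketActor`) are made up for illustration and do not come from the conversation.

```csharp
using System;
using Akka.Actor;

// Hypothetical immutable message: every field is set once in the constructor.
public sealed class AddItem
{
    public AddItem(string name, int quantity)
    {
        Name = name;
        Quantity = quantity;
    }

    public string Name { get; }
    public int Quantity { get; }
}

// Hypothetical query message with no payload.
public sealed class GetTotal { }

// The actor owns its mutable state; nothing outside the actor can reach it.
public class BasketActor : ReceiveActor
{
    private int _total; // only mutated while the actor handles a message

    public BasketActor()
    {
        Receive<AddItem>(msg => { _total += msg.Quantity; });
        Receive<GetTotal>(_ => { Sender.Tell(_total); });
    }
}

public static class Program
{
    public static void Main()
    {
        using (var system = ActorSystem.Create("demo"))
        {
            var basket = system.ActorOf(Props.Create(() => new BasketActor()), "basket");
            basket.Tell(new AddItem("apple", 2));
            basket.Tell(new AddItem("pear", 3));

            // Ask is used here only to read the state back from outside the actor.
            var total = basket.Ask<int>(new GetTotal(), TimeSpan.FromSeconds(3)).Result;
            Console.WriteLine(total);
        }
    }
}
```

Because an actor processes one message at a time, `_total` needs no lock. The messages stay read-only, so the same instance can safely be handed to several actors.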
hidavidpeng
@hidavidpeng
Jun 14 2016 09:28
@Danthar Thank you. So if I have a complex process, I'd better split it into small steps and Tell the results to child actors, rather than send the modified message along.
Arjen Smits
@Danthar
Jun 14 2016 09:30
Yes, the most common way to handle this is by modelling each step in your system as a separate actor (where it makes sense), and pushing work through them like a workflow.
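A rough sketch of that workflow idea, assuming two made-up steps (`ParseStep` and `StoreStep`) and made-up message types. Each step builds a new message for the next step instead of mutating the one it received.

```csharp
using System;
using Akka.Actor;

// Hypothetical messages, one type per stage of the workflow.
public sealed class RawOrder
{
    public RawOrder(string text) { Text = text; }
    public string Text { get; }
}

public sealed class ParsedOrder
{
    public ParsedOrder(int quantity) { Quantity = quantity; }
    public int Quantity { get; }
}

// Step 1: parse the raw input and pass a NEW message to the next step.
public class ParseStep : ReceiveActor
{
    public ParseStep(IActorRef next)
    {
        Receive<RawOrder>(msg =>
        {
            next.Tell(new ParsedOrder(int.Parse(msg.Text)));
        });
    }
}

// Step 2: keep a running total in private actor state.
public class StoreStep : ReceiveActor
{
    private int _stored;

    public StoreStep()
    {
        Receive<ParsedOrder>(msg =>
        {
            _stored += msg.Quantity;
            Console.WriteLine("stored so far: " + _stored);
        });
    }
}

public static class Program
{
    public static void Main()
    {
        using (var system = ActorSystem.Create("workflow"))
        {
            // Wire the steps back to front so each one knows its successor.
            var store = system.ActorOf(Props.Create(() => new StoreStep()), "store");
            var parse = system.ActorOf(Props.Create(() => new ParseStep(store)), "parse");

            parse.Tell(new RawOrder("2"));
            parse.Tell(new RawOrder("3"));

            Console.ReadLine(); // keep the process alive while messages flow
        }
    }
}
```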
hidavidpeng
@hidavidpeng
Jun 14 2016 09:35
Thank you. Today my colleague told me that if I want to maximize the workload, I should stay lock-free and separate my concerns into atomic units. I think that is what you mean.
Hi @Danthar, you said "the most common way". Are there any other ways? lol
Garrard Kitchen
@garrardkitchen
Jun 14 2016 09:54
@frasermolyneux Replying to posts on Jun 12. We use Azure for both QA and Load Testing multi-tenant on-prem / cloud solution. Infrastructure set up using ARM, manual deploy of solution though. Heavy dependency on Akka.Cluster throughout solution - multiple components. Using 1.0.8. Good levels of stability from load testing. Stress testing however, different story. Soon as CPU hits 100% util our product's world fell apart! Looking forward to running same stress tests with 1.1. We use JMeter from different machines hitting our product's baseline HA configuration (2 web servers - WS2012 R2, 1 Db server, LB, vnet, subnets used to protect db server behind ACL, etc..). Typical configuration. Not sure if this helps you?
frasermolyneux
@frasermolyneux
Jun 14 2016 10:19
@garrardkitchen Just for clarification are you deploying the solution onto an Azure VM then?
Garrard Kitchen
@garrardkitchen
Jun 14 2016 10:26
@frasermolyneux Yes, multiple VMs
frasermolyneux
@frasermolyneux
Jun 14 2016 10:29
Ah that's fair - the problem I was having wasn't with Azure itself but the Azure cloud service. I've gone to a deployment structure that will work with an Azure VM
Just for now doing it inhouse
Peter Bergman
@peter-bannerflow
Jun 14 2016 10:31
@garrardkitchen I'm a bit curious, how flexible is that setup? I mean, is it a pain to add new VMs into the cluster or do you have some process of doing that? Also, have you managed to handle the deployment chain smoothly (i.e. integrated with some CI server or similar)?
frasermolyneux
@frasermolyneux
Jun 14 2016 10:33
I know you can use RM to provision the VMs, configure them and deploy the Akka solution - I've seen similar done but deploying a different solution
Just takes time to setup
Peter Bergman
@peter-bannerflow
Jun 14 2016 10:41
Ah, sorry, I see now that @garrardkitchen wrote that deploys are handled manually
Garrard Kitchen
@garrardkitchen
Jun 14 2016 10:51
@peter-bannerflow From an elastic perspective, all component ports are same and as we're using vnet & ACL no need to open ports on each VM and we've min 2 seed (for HA deployment) so each VM deploy includes addresses of both seeds so it's just a case of spinning up a new VM (could even include image of existing VM). We've a module that manages Tenants / Licensing etc that also includes a UI that shows several performance counters and active components so visually we see when components & servers get added to cluster system. Not fully hooked into CI. We do use TeamCity for version builds and that's all automated and triggered on bitbucket push and we do have a one click option to create deployment package for partners / etc to deploy but no CD capability at present. Have considered Octopus but dealing with other priorities at the minute.
@frasermolyneux Yeah, ARM is painful and lacks appropriate tooling and has a long way to go for mass adoption. I do believe it has all the building blocks to automate the deployment of an akka solution.
Peter Bergman
@peter-bannerflow
Jun 14 2016 11:18
@garrardkitchen Ah that sounds pretty sweet
Peter Bergman
@peter-bannerflow
Jun 14 2016 11:27
About the module for tenant management, is that some open source component or proprietary?
Garrard Kitchen
@garrardkitchen
Jun 14 2016 11:36
@peter-bannerflow proprietary at present, and domain specific. That module is broken into 2 components: UI and API. Communication between the 2 is via akka, so with clustering the UI could be talking with the API on a different VM. There's a certain amount of statefulness with Tenant and Licensing too, so syncing between roles is implemented to negate the need to hit the db (think write-behind caching)
Peter Bergman
@peter-bannerflow
Jun 14 2016 11:39
Ah ok, I see
And how do you package your Akka.NET applications that run inside the cluster? I know that some other people have been using topshelf, is that something that you use or something different?
Garrard Kitchen
@garrardkitchen
Jun 14 2016 11:58
@peter-bannerflow windows & owin installed via topshelf
Kris Schepers
@schepersk
Jun 14 2016 12:29
@Danthar Yes, we're building 32 bit, so that could indeed be an issue. The memory limit seems to be somewhere between 1.2GB and 1.5GB for a single process (topshelf windows service in this case).
Arjen Smits
@Danthar
Jun 14 2016 12:30
That's not nearly enough for a normal OOM to occur
Kris Schepers
@schepersk
Jun 14 2016 12:30
But I think we've just found our issue..
Arjen Smits
@Danthar
Jun 14 2016 12:30
so my guess is that you're running into a collection limit
Kris Schepers
@schepersk
Jun 14 2016 12:31
Not correctly disposing EF DB Context instances.. They should be marked as ExternallyOwned in the Autofac bootstrapper!
Arjen Smits
@Danthar
Jun 14 2016 12:31
in .NET, List types can throw an OOM if they become too large, due to fragmentation of the LOH
aah
Kris Schepers
@schepersk
Jun 14 2016 12:31
(all IDisposable instances should be marked as such)
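For readers unfamiliar with the Autofac setting mentioned above, a minimal sketch follows. `ReadModelContext` is a hypothetical stand-in for the EF context in the conversation. By default Autofac holds on to every `IDisposable` it resolves until the owning lifetime scope is disposed, so resolving from a long-lived scope accumulates instances. `ExternallyOwned()` switches that tracking off and makes the caller responsible for disposal.

```csharp
using System;
using Autofac;

// Hypothetical stand-in for an EF DbContext; any IDisposable behaves the same way.
public sealed class ReadModelContext : IDisposable
{
    public void Dispose()
    {
        Console.WriteLine("context disposed");
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new ContainerBuilder();

        // ExternallyOwned: the container neither tracks nor disposes the instances.
        builder.RegisterType<ReadModelContext>()
               .AsSelf()
               .ExternallyOwned();

        using (var container = builder.Build())
        {
            // The caller owns the instance now and must dispose it.
            using (var context = container.Resolve<ReadModelContext>())
            {
                // ... project events into the read model here ...
            }
        }
    }
}
```

Another option is to keep the default tracking and resolve each unit of work from a short-lived `BeginLifetimeScope()`, which disposes its instances when the scope ends.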
Arjen Smits
@Danthar
Jun 14 2016 12:31
yup that would do it :P
but that would also mean that your performance is dreadful
because the EF context just keeps aggregating changes and stuff
Kris Schepers
@schepersk
Jun 14 2016 12:32
When commenting out our ReadModel projectors things looked much better and then came the "AHA" moment ;-)
Arjen Smits
@Danthar
Jun 14 2016 12:32
and calling savechanges would get slower and slower :P
ah. Nice
Kris Schepers
@schepersk
Jun 14 2016 12:34
Indeed, you're spot on! When memory hit around 1GB, things became really slow and unstable, and a few moments after it just stopped
Now running between 50MB and 60MB..
Arjen Smits
@Danthar
Jun 14 2016 12:34
the EF context is a glutton, gobbles everything up
Kris Schepers
@schepersk
Jun 14 2016 12:35
I know, not the first time the team ran into this problem
Arjen Smits
@Danthar
Jun 14 2016 12:35
I also have EF in my project. Love the migrations feature. Hate the performance.
Kris Schepers
@schepersk
Jun 14 2016 12:35
Also, this is why we're not using EF on the akka persistence end..
Indeed.. More than once I'm thinking, "This wasn't a problem with good old ADO.NET" :-)
Arjen Smits
@Danthar
Jun 14 2016 12:37
I have a hybrid where I'm using EF for the normal CRUD (I'm not using Akka everywhere in my project) and Dapper for almost everything else. Or I drop down to raw SQL on the EF context where needed.
And to be fair, there are alternatives for managing DB migrations, automatically even. But by the time EF performance was starting to become an issue, I was too committed. Ripping EF out and going Dapper all the way is a long-term goal though. But it's a lot of work, and performance is acceptable and manageable atm, so that's why it's low prio.
Kris Schepers
@schepersk
Jun 14 2016 12:39
Never used Dapper, heard of it on many occasions though.
Arjen Smits
@Danthar
Jun 14 2016 12:40
Key to EF performance is to keep your context lifecycles small, keep your model simple, and keep result sets small (since the object materializer is terrible perf-wise). And use AsNoTracking() if possible.
Dapper is nothing more than extension methods on the IDbConnection interface
with a very simple but fast object materializer.
  var resultset = conn.Query<MyModel>("select * from table where status = @status", new {status = 2}).ToList();
Kris Schepers
@schepersk
Jun 14 2016 12:42
Indeed, very few people know of the AsNoTracking() extension method. It makes a huge difference in perf
Arjen Smits
@Danthar
Jun 14 2016 12:43
Also the .Include helps as well.
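The EF tips above (short-lived context, small result sets, no change tracking) might be combined roughly as follows. This is a sketch that assumes Entity Framework 6, where the method is spelled `AsNoTracking()`. The `Order` and `ShopContext` types are hypothetical.

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// Hypothetical entity and context, defined only to make the sketch complete.
public class Order
{
    public int Id { get; set; }
    public int Status { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public static class OrderQueries
{
    public static List<Order> LoadByStatus(int status)
    {
        // One context per unit of work: created, used and disposed right here.
        using (var db = new ShopContext())
        {
            return db.Orders
                     .AsNoTracking()                  // read-only: skip the change tracker
                     .Where(o => o.Status == status)  // filter in the database
                     .ToList();
        }
    }
}
```

Entities loaded this way are not tracked, so changes made to them are not picked up by `SaveChanges()`. That makes the approach a fit for read paths such as read-model queries, not for updates.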
Diego Frata
@diegofrata
Jun 14 2016 13:55
Dapper is quite cool if you don't mind writing SQL. But seriously, when you start having to deal with joins or tables with dozens of fields, it becomes REALLY tedious to write those queries and also to map hierarchies of objects manually. On my personal projects I've been resorting to document databases for pretty much everything. I think I'm becoming allergic to O/R mapping in all of its forms.
Arjen Smits
@Danthar
Jun 14 2016 14:16
If you want performance, you got to do the work. There is always a tradeoff.
Toshko Andreev
@Ravenheart
Jun 14 2016 14:17
personally I use Linq2Db (NOT Linq2SQL)
Arjen Smits
@Danthar
Jun 14 2016 14:17
And document databases aren't the holy grail :P.
Toshko Andreev
@Ravenheart
Jun 14 2016 14:17
it's sort of Dapper with LINQ
Diego Frata
@diegofrata
Jun 14 2016 14:23
@Danthar that's why I use Postgres :D I can always change my mind later!
Arjen Smits
@Danthar
Jun 14 2016 14:24
Postgres allows you to store documents, while still allowing you to index and query on individual document properties?
Marc Piechura
@marcpiechura
Jun 14 2016 14:25
Diego Frata
@diegofrata
Jun 14 2016 14:29
Yeah, you can query and create indexes to properties inside the document! @Silv3rcircl3 I started with Marten this week, looks quite good.
Arjen Smits
@Danthar
Jun 14 2016 14:29
ah cool
Marc Piechura
@marcpiechura
Jun 14 2016 14:36
@diegofrata good to know, haven't used it yet but found it a while ago and follow the progress since then
Aaron Stannard
@Aaronontheweb
Jun 14 2016 15:20
@garrardkitchen yeah, the fixes in 1.1 should definitely produce some better results under a stress test like that
working on more of them still ;)
Peter Bergman
@peter-bannerflow
Jun 14 2016 15:26
About the discussions that have taken place here earlier on Akka.Cluster in Azure, has anyone played around with it deployed in Service Fabric?
to11mtm
@to11mtm
Jun 14 2016 15:28
I have two gripes with Linq2Db... not sure how much of them are because I'm in Oracle land, but I can't get actual parameterized queries, and the way it handles certain Oracle-specific things is too different from how our shop does it.
Toshko Andreev
@Ravenheart
Jun 14 2016 15:32
we use it with MariaDB and PostgreSQL, works great there
haven't used Oracle
Aaron Stannard
@Aaronontheweb
Jun 14 2016 16:10
@peter-bannerflow yep, I know @mmisztal1980 is experimenting with that
and I know of one other user running a large production workload on it
there are definitely others but those are the only two I know of off the top of my head
@alexvaluyskiy so I had to expose the ChildContainer as a public member on ICell
it's that way on the JVM
but it was also needed for a couple of important remoting specs, mostly this one
it's clever, the way they're able to recurse the remoting actor hierarchy
checking to make sure that dead connections don't leave behind actors
Garrard Kitchen
@garrardkitchen
Jun 14 2016 16:27
@Aaronontheweb Can't wait!
Peter Bergman
@peter-bannerflow
Jun 14 2016 16:44
@Aaronontheweb (and @mmisztal1980) I'm trying to figure out how a SF deployment of Akka.Cluster would work in terms of service discovery. To start with, I guess I need to dynamically build up the config so that the IP of the node that the service is running on is used in the akka.remote config section so that it can be reached by other nodes in the cluster.
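One possible shape for building the config dynamically, as described above. This is a sketch under assumptions: the `akka.remote.helios.tcp` keys are the ones used by the Akka.NET 1.0/1.1 transport, and the system name, port and seed list are placeholders. How the node address is obtained from Service Fabric is deliberately left to the caller.

```csharp
using System.Linq;
using Akka.Actor;
using Akka.Configuration;

public static class ClusterBootstrap
{
    // nodeAddress: IP or FQDN under which other cluster nodes can reach this node.
    // seedNodes: full addresses, e.g. "akka.tcp://MyCluster@10.0.0.4:4053" (placeholder).
    public static ActorSystem Start(string nodeAddress, int port, string[] seedNodes)
    {
        // Quote each seed address so it forms a valid HOCON string list.
        var seeds = string.Join(",", seedNodes.Select(s => "\"" + s + "\""));

        // Only the values that differ per node are generated at runtime.
        var overrides = ConfigurationFactory.ParseString($@"
            akka.actor.provider = ""Akka.Cluster.ClusterActorRefProvider, Akka.Cluster""
            akka.remote.helios.tcp.hostname = ""{nodeAddress}""
            akka.remote.helios.tcp.port = {port}
            akka.cluster.seed-nodes = [{seeds}]
        ");

        // Everything else still comes from the akka section of app.config.
        var config = overrides.WithFallback(ConfigurationFactory.Load());

        // "MyCluster" must match the system name used in the seed addresses.
        return ActorSystem.Create("MyCluster", config);
    }
}
```

Keeping the generated part small means the static settings stay in the regular config file, while the runtime override only supplies what cannot be known before the service is placed on a node.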
qwoz
@qwoz
Jun 14 2016 16:49
Is there an example of a bare-bones Service Fabric project for Akka.NET?
Aaron Stannard
@Aaronontheweb
Jun 14 2016 16:49
not as far as I know
qwoz
@qwoz
Jun 14 2016 16:52
ok, I haven't done much with it so far but it seems like a great way to go. If I get something up and running, I'll post what I have on github. And @mmisztal1980 and/or @peter-bannerflow if you have something you are comfortable sharing, let me know.
Peter Bergman
@peter-bannerflow
Jun 14 2016 16:52
@qwoz I'll do some work on it this week and let u know how it goes
qwoz
@qwoz
Jun 14 2016 16:55
:plus1:
Alex Valuyskiy
@alexvaluyskiy
Jun 14 2016 16:55
@Aaronontheweb when you finish with your PR, could you help me with routers? I've almost finished them, but SupervisingStrategy stopped working, somehow. I need some "debug" help
Aaron Stannard
@Aaronontheweb
Jun 14 2016 16:56
sure thing
@alexvaluyskiy just gave you PR access, by the way
err, commit access
check your private messages
:p
Aaron Stannard
@Aaronontheweb
Jun 14 2016 17:02
only two rules: never merge your own PRs (except sometimes for trivial stuff that doesn't touch code, like fixing a typo)
and make sure CI always passes
and CONGRATS - thanks for your contributions; you're now part of the core contributors team
should you choose to accept your mission :p
Sean Gilliam
@sean-gilliam
Jun 14 2016 18:08
:clap:
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:13
welp, just found one bug that would definitely cause problems when a node attempts to rejoin a cluster
caused by me nesting if statements incorrectly
:(
easy fix
with a test to prove it
frasermolyneux
@frasermolyneux
Jun 14 2016 18:15
Do you do much TDD Aaron?
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:16
@frasermolyneux I don't literally write the test first
since I think that's a good way to end up with bad code
my preferred way to code is usually bottom up
and I'll write tests as I go
frasermolyneux
@frasermolyneux
Jun 14 2016 18:17
Interesting, just asking as my company brought in a consultant to go through some TDD techniques
It requires a change to the norm
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:17
TDD is good in practice as long as you don't turn it into a religion with strict tenets
frasermolyneux
@frasermolyneux
Jun 14 2016 18:17
*from the norm
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:17
THE TEST MUST ALWAYS BE WRITTEN FIRST
etc
the value I get out of TDD
is that it gives me a sanity check that the current layer and the layers below it are generally working
before I build more stuff on top of it
fixing those low-level bugs gets more expensive
the more stuff you have depending on it
so TDD is a good way to fix a lot of those bugs when they're still cheap to fix
so I write the module first and then the tests afterwards
new thing I've been a big fan of lately is model-based testing
used that really aggressively in the Helios 2.0 code
frasermolyneux
@frasermolyneux
Jun 14 2016 18:20
model-based testing?
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:20
yeah - it's originally a functional programming technique
frasermolyneux
@frasermolyneux
Jun 14 2016 18:20
ah yeah googling now :)
John Hughes is the guy who invented QuickCheck, the original model based testing framework
his video here is a great explanation of how it works and what it does
frasermolyneux
@frasermolyneux
Jun 14 2016 18:22
I'll take a look at that thanks
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:22
and if you want to try it out in .NET, there's a .NET port of QuickCheck called FsCheck; it's written in F# but I've been happily using it in C#. https://github.com/fscheck/FsCheck
works with NUnit, XUnit
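To give a feel for FsCheck from C#, here is a small property-based sketch, assuming the FsCheck 2.x C# API (`Prop.ForAll`, `QuickCheckThrowOnFailure`). It only shows the basic idea of comparing an implementation against a simpler, trusted model on random inputs. Full model-based testing of stateful code uses FsCheck's state-machine facilities, which are not shown here. `InsertionSort` is a hypothetical implementation under test.

```csharp
using System;
using System.Linq;
using FsCheck;

public static class SortingSpec
{
    // Hypothetical implementation under test: a hand-written insertion sort.
    public static int[] InsertionSort(int[] input)
    {
        var result = (int[])input.Clone();
        for (var i = 1; i < result.Length; i++)
        {
            var current = result[i];
            var j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && result[j] > current)
            {
                result[j + 1] = result[j];
                j--;
            }
            result[j + 1] = current;
        }
        return result;
    }

    public static void Main()
    {
        // The "model" is a trusted, simpler implementation: LINQ's OrderBy.
        Func<int[], bool> matchesModel = xs =>
            InsertionSort(xs).SequenceEqual(xs.OrderBy(x => x));

        // FsCheck generates random arrays and throws if any of them falsifies the property.
        Prop.ForAll(matchesModel).QuickCheckThrowOnFailure();
    }
}
```

When a property fails, FsCheck tries to shrink the failing input to a smaller counterexample, which is usually much easier to debug than the original random value.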
qwoz
@qwoz
Jun 14 2016 18:24
I remember seeing that previously. It sounds like american fuzzy lop (http://lcamtuf.coredump.cx/afl/) for .NET
Aaron Stannard
@Aaronontheweb
Jun 14 2016 18:25
@qwoz that scheduler that John uses for randomizing the order in which concurrent operations execute is very similar to that idea
MSR had a project that did this which is unfortunately now shuttered called CHESS
I can't even get it to compile due to bit rot
ilhadad
@ilhadad
Jun 14 2016 19:43
@Aaronontheweb Sent an e-mail to you requesting paid support. Can you please take a look?
ilhadad
@ilhadad
Jun 14 2016 20:49

@Aaronontheweb @Horusiath got the following when loading the persistent actor.

[ERROR][6/14/2016 8:44:39 PM][Thread 0010][akka://SSAActorSystem/system/akka.persistence.snapshot-store.couchbase] Object reference not set to an instance of an object.
Cause: [akka://SSAActorSystem/system/akka.persistence.snapshot-store.couchbase]: Akka.Actor.ActorInitializationException: Exception during creation ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Akka.Actor.Props.NewActor()
   at Akka.Actor.ActorCell.CreateNewActorInstance()
   at Akka.Actor.ActorCell.<>c__DisplayClass113_0.<NewActor>b__0()
   at Akka.Actor.ActorCell.UseThreadContext(Action action)
   at Akka.Actor.ActorCell.NewActor()
   at Akka.Actor.ActorCell.Create(Exception failure)
   --- End of inner exception stack trace ---
   at Akka.Actor.ActorCell.Create(Exception failure)
   at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
[ERROR][6/14/2016 8:44:39 PM][Thread 0010][akka://SSAActorSystem/system/akka.persistence.journal.couchbase] Object reference not set to an instance of an object.
Cause: [akka://SSAActorSystem/system/akka.persistence.journal.couchbase]: Akka.Actor.ActorInitializationException: Exception during creation ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Akka.Actor.Props.NewActor()
   at Akka.Actor.ActorCell.CreateNewActorInstance()
   at Akka.Actor.ActorCell.<>c__DisplayClass113_0.<NewActor>b__0()
   at Akka.Actor.ActorCell.UseThreadContext(Action action)
   at Akka.Actor.ActorCell.NewActor()
   at Akka.Actor.ActorCell.Create(Exception failure)
   --- End of inner exception stack trace ---
   at Akka.Actor.ActorCell.Create(Exception failure)
   at Akka.Actor.ActorCell.SystemInvoke(Envelope envelope)
[INFO][6/14/2016 8:44:39 PM][Thread 0010][akka://SSAActorSystem/system/akka.persistence.snapshot-store.couchbase] Message LoadSnapshot from akka://SSAActorSystem/user/ClientSupervisor/ClientList to akka://SSAActorSystem/system/akka.persistence.snapshot-store.couchbase was not delivered. 1 dead letters encountered.

I have loaded the Akka.Persistence code and the Couchbase plugin into my project with break points at all possible entry locations. They never hit. I am going to need your help in figuring this out.

Aaron Stannard
@Aaronontheweb
Jun 14 2016 21:03
@ilhadad sure thing Abe
I'll check my email - was out of the office for a bit
qwoz
@qwoz
Jun 14 2016 21:04
@ilhadad I don't have experience with that, but if the couchbase actor can't be initialized, might be something in your config? https://github.com/akkadotnet/Akka.Persistence.CouchBase/blob/master/README.md#exclusive-hocon-configuration
Aaron Stannard
@Aaronontheweb
Jun 14 2016 21:05
the code where this error is thrown is totally different in #2086, which should be getting merged soon
so it might be a bug that I fixed as part of that
can't include #2086 in a nightly until it's merged though
@cconstantin told me he'd be reviewing it later tonight, so that's already taken care of
I'll follow up on your email
Bartosz Sypytkowski
@Horusiath
Jun 14 2016 21:17
@ilhadad 9 out of 10 times this exception kicks in due to an inability to create the journal or snapshot store - you may have something wrong with your config
Aaron Stannard
@Aaronontheweb
Jun 14 2016 22:12
aw yeah
fixed a second endpoint issue
which would have borked clusters that had nodes which very quickly restarted after being abruptly terminated
i.e. if you force killed a process and immediately restarted it
Aaron Stannard
@Aaronontheweb
Jun 14 2016 22:20
going to be checking this one in shortly
Aaron Stannard
@Aaronontheweb
Jun 14 2016 23:53
@alexvaluyskiy this is implemented in the JVM: akka/akka#13584 - I think we should table the WeaklyUp state until after 1.1
in the name of "just shipping" what we have so far
same idea as releasing Cluster.Metrics a little later
it's a neat idea
but I want the rest of the other changes we've made to be pulled in and see that they settle in their foundations
before we introduce a change like that
what do you think?