jalchr
@jalchr
On GetShardRegionState shard region will reply with ShardRegionState containing data about shards living in the current actor system and what entities are alive on each one of them.
On GetClusterShardingStats shard region will reply with ClusterShardingStats having information about shards living in the whole cluster and how many entities alive in each one of them.
okay ... this gives info about "All active Shards" ... not a single one ... right ?
very well ...
Bartosz Sypytkowski
@Horusiath
@wilddeveloperappears you can still set up a separate HOCON config for each ActorSystem, with different logging in mind. In general it's not possible to set up a separate logging adapter per actor, because logging runs through the actor system's event stream, which doesn't discriminate between actors.
@jalchr yes, for a single shard you have the corresponding message types, described below in the same article
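A minimal sketch of the per-ActorSystem configuration idea mentioned above, assuming only the `akka.loglevel` setting differs between the two systems. The system names and the `PerSystemLogging` helper are hypothetical; a real setup would more likely vary `akka.loggers` as well.

```csharp
using System;
using Akka.Actor;
using Akka.Configuration;

public static class PerSystemLogging
{
    // Builds an ActorSystem whose log level comes from its own HOCON fragment.
    public static ActorSystem Create(string name, string logLevel)
    {
        Config config = ConfigurationFactory.ParseString($"akka.loglevel = {logLevel}");
        return ActorSystem.Create(name, config);
    }

    public static void Main()
    {
        // Two systems in one process, each with its own logging configuration.
        var verbose = Create("verbose-system", "DEBUG");
        var quiet = Create("quiet-system", "WARNING");

        Console.WriteLine($"{verbose.Name}: {verbose.Settings.LogLevel}");
        Console.WriteLine($"{quiet.Name}: {quiet.Settings.LogLevel}");

        verbose.Terminate().Wait();
        quiet.Terminate().Wait();
    }
}
```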
jalchr
@jalchr

How can I query for the stats ? Is it a Tell or Ask ?

 _shard.Ask< ?? >(Shard.GetCurrentShardState.Instance);

or

 _shard.Tell(Shard.GetCurrentShardState.Instance);

And how can I get the response out of a 'Tell'?

Bartosz Sypytkowski
@Horusiath
shard.Ask<Shard.CurrentShardState>(Shard.GetCurrentShardState.Instance); - it's described there, by convention query names have Getxxx prefix, while replies don't
with Tell, the shard will simply respond to the message sender (an actor)
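The difference between Ask and Tell can be sketched with a plain actor, without any sharding involved. `GetCounter`, `Counter`, `CounterActor` and `AskerActor` are hypothetical names made up for this illustration; only the query/reply naming convention is taken from the discussion above.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical query/reply pair: the query has a Get prefix, the reply does not.
public sealed class GetCounter
{
    public static readonly GetCounter Instance = new GetCounter();
    private GetCounter() { }
}

public sealed class Counter
{
    public Counter(int value) { Value = value; }
    public int Value { get; }
}

// Replies to whoever sent the query.
public sealed class CounterActor : ReceiveActor
{
    public CounterActor()
    {
        Receive<GetCounter>(_ => Sender.Tell(new Counter(42)));
    }
}

// Uses Tell: the reply arrives later as an ordinary message in this actor's mailbox.
public sealed class AskerActor : ReceiveActor
{
    public AskerActor(IActorRef target)
    {
        Receive<string>(_ => target.Tell(GetCounter.Instance)); // this actor is the sender
        Receive<Counter>(c => Console.WriteLine($"reply via Tell: {c.Value}"));
    }
}

public static class Program
{
    public static async Task Main()
    {
        var system = ActorSystem.Create("demo");
        var target = system.ActorOf(Props.Create(() => new CounterActor()), "counter");

        // Ask: returns a Task that completes with the reply or fails on timeout.
        var reply = await target.Ask<Counter>(GetCounter.Instance, TimeSpan.FromSeconds(3));
        Console.WriteLine($"reply via Ask: {reply.Value}");

        var asker = system.ActorOf(Props.Create(() => new AskerActor(target)), "asker");
        asker.Tell("go");

        await Task.Delay(500); // give the Tell round trip time to finish
        await system.Terminate();
    }
}
```

Inside an actor, Tell plus a matching Receive handler is usually preferable to blocking on the result of an Ask.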
jalchr
@jalchr
Oops ... I missed that in the documentation ... strongly appreciated
jalchr
@jalchr
@Horusiath I'm getting this error:
2017-12-09 11:20:41,542 [33] WARN  Akka.Cluster.Sharding.ShardRegion - Message does not have an extractor defined in shard [FileHandler] so it was ignored: Akka.Cluster.Sharding.Shard+GetCurrentShardState
Bartosz Sypytkowski
@Horusiath
to which actor do you send that message?
it looks like you're trying to contact a sharded actor instead of the shard itself
jalchr
@jalchr
I think so ... here is what I'm doing:
            _shard = ClusterSharding.Get(Context.System).ShardRegion(nameof(FileHandler));
            var state = _shard.Ask<Shard.CurrentShardState>(Shard.GetCurrentShardState.Instance).Result;
            var exists = state.EntityIds.Contains(newVideo.File.ToLower());
How to talk with the "shard" itself ?
Bartosz Sypytkowski
@Horusiath
@jalchr in your case _shard is actually a ShardRegion. The general notion is that:
  • Actors managed by cluster sharding are often referred to as entities.
  • Entities are grouped in shards. Entities within the same shard always live on the same machine; when they are rebalanced to another one, they are always rebalanced together. If you want to get a reference to the target shard, keep in mind that the shard is the parent actor of its entities (so you can use Context.Parent from within an entity).
  • Shards themselves are grouped within shard regions. A shard region is a container for all shards of a given actor type living on a target actor system.
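Translated to code, the region-level query could look roughly like the sketch below. It assumes the `GetShardRegionState` / `CurrentShardRegionState` message pair from Akka.Cluster.Sharding; the `ShardQueries` helper and its method names are hypothetical. The reply only covers shards hosted by the local actor system, so entities living on other nodes will not show up.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Akka.Actor;
using Akka.Cluster.Sharding;

public static class ShardQueries
{
    // Pure helper: true if any shard in the snapshot lists the entity id.
    public static bool ContainsEntity(CurrentShardRegionState state, string entityId)
    {
        return state.Shards.Any(shard => shard.EntityIds.Contains(entityId));
    }

    // Asks the local shard region (not an individual shard) for its current state.
    public static async Task<bool> IsEntityAliveLocally(
        ActorSystem system, string typeName, string entityId)
    {
        IActorRef region = ClusterSharding.Get(system).ShardRegion(typeName);

        var state = await region.Ask<CurrentShardRegionState>(
            GetShardRegionState.Instance, TimeSpan.FromSeconds(5));

        return ContainsEntity(state, entityId);
    }
}
```

The result is only a snapshot: an entity may start or stop right after the reply is produced.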
jalchr
@jalchr
@Horusiath I think I understand the terminology. But it would be clearer if translated to code.
So _shard is a region.
How do I query for the "shard state" ... ?
Bartosz Sypytkowski
@Horusiath
let's first define what information you really want to get. "shard state" is very broad
jalchr
@jalchr
Very well, my ApiMaster is trying to launch a new "File Processing" operation .. I'm trying to detect if a 'job' is already running or not
private void HandleFindRunningJob(FindRunningNewVideo newVideo)
        {
            //var state = _shard.Ask<Shard.CurrentShardState>(Shard.GetCurrentShardState.Instance).Result;
            //var exists = state.EntityIds.Contains(newVideo.File.ToLower());

            var haveChild = Context.Child(newVideo.File.ToLower());
            if (haveChild != ActorRefs.Nobody)
            {
                ApiBroadcaster.Tell(new NewVideoFound(newVideo.File));

                _mediator.Tell(new Publish(Topics.Reporting, new ReportStatus(ReportStatusEnum.Info,
                    $"HandleFindRunningJob() : 'already processed' file = {newVideo.File}"
                    )));
                _logger.Info($"HandleFindRunningJob() : 'already processed' file = {newVideo.File}");

            }
        }
Bartosz Sypytkowski
@Horusiath
is there some kind of 1-1 relation between "job" and a particular actor instance?
Bartosz Sypytkowski
@Horusiath
usually "job" describes some stateless behavior - cluster sharding makes more sense for stateful problems (when you have a logical domain entity like User, Department, Vehicle etc.). You don't need to check if an entity is alive, as you shouldn't care about that. If it's running, just queue the next message for it; it will be picked up at some point. If it's not alive, it will be created ad hoc and start processing the message.
jalchr
@jalchr

I see what you mean. I have FileProcessing to be a stateful problem. It undergoes several steps. I need to make sure I have 1 actor (entity) per file.
Cause now, the duplicate jobs are causing file locking issues (a file is being used by another process) sort of things.

Before I moved to persistent actors, I used var haveChild = Context.Child(newVideo.File.ToLower()); ...
Now, I'm using shards and entities ... I need to mimic the same functionality ...
Much Appreciated

Bartosz Sypytkowski
@Horusiath
@jalchr you can just post a message to an entity - it will be created ad hoc if it wasn't alive. Just keep in mind that sharded entities are a bit slower and more heavyweight than normal actors (because of all the automatic lifecycle and cluster message routing mechanisms).
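A minimal sketch of "just post a message to an entity", assuming the lower-cased file path is used as the entity id. `ProcessFile`, `FileMessageExtractor`, the shard count of 100 and the empty `FileHandler` body are hypothetical placeholders; the region routes every message with the same id to the same single entity.

```csharp
using System;
using Akka.Actor;
using Akka.Cluster.Sharding;

// Hypothetical message carrying the file to process.
public sealed class ProcessFile
{
    public ProcessFile(string path) { Path = path; }
    public string Path { get; }
}

// Tells the shard region which entity a message belongs to.
public sealed class FileMessageExtractor : HashCodeMessageExtractor
{
    public FileMessageExtractor(int maxNumberOfShards) : base(maxNumberOfShards) { }

    // Same file (case-insensitive) => same entity id => same actor.
    // Other message types are not routable in this sketch.
    public override string EntityId(object message)
    {
        return (message as ProcessFile)?.Path.ToLowerInvariant();
    }
}

// Placeholder entity actor; the real processing steps are omitted.
public sealed class FileHandler : ReceiveActor
{
    public FileHandler()
    {
        Receive<ProcessFile>(msg => { /* process msg.Path here */ });
    }
}

public static class ShardingSetup
{
    // Starts (or joins) the shard region for FileHandler entities on this node.
    public static IActorRef StartRegion(ActorSystem system)
    {
        return ClusterSharding.Get(system).Start(
            nameof(FileHandler),
            Props.Create(() => new FileHandler()),
            ClusterShardingSettings.Create(system),
            new FileMessageExtractor(100));
    }
}
```

With this in place, `region.Tell(new ProcessFile(path))` is enough: no existence check is needed before sending.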
jalchr
@jalchr
@Horusiath You mean I don't have to query for the shard state to detect whether or not an entity already exists ?
And simply send a message, as long as an entity can exist only once... right ?
Bartosz Sypytkowski
@Horusiath
yes
jalchr
@jalchr
very well...
In all cases, how can we query a shard for its 'state' :) ?
jalchr
@jalchr
I'm experiencing a "shortage" in communication ....
here is a sample log file
2017-12-09 16:03:35,998 [6] WARN  Akka.Cluster.ClusterCoreDaemon - Cluster Node [akka.tcp://ArchiveSystem@140.125.4.1:16568] - Marking node(s) as UNREACHABLE [Member(address = akka.tcp://ArchiveSystem@140.125.4.2:16668, Uid=81451205 status = Up, role=[importer-old], upNumber=21), Member(address = akka.tcp://ArchiveSystem@140.125.4.3:16666, Uid=807193699 status = Up, role=[web], upNumber=19)]. Node roles [importer,pubsub-node]
2017-12-09 16:03:36,013 [6] INFO  Akka.Cluster.ClusterCoreDaemon - Marking node(s) as REACHABLE [Member(address = akka.tcp://ArchiveSystem@140.125.4.2:16668, Uid=81451205 status = Up, role=[importer-old], upNumber=21)]. Node roles [importer,pubsub-node]
2017-12-09 16:03:36,013 [6] INFO  Archive.Shared.Cluster.ClusterStatus - UnreachableMember: Member(address = akka.tcp://ArchiveSystem@140.125.4.2:16668, Uid=81451205 status = Up, role=[importer-old], upNumber=21), Role(s): importer-old
2017-12-09 16:03:36,013 [6] INFO  Archive.Shared.Cluster.ClusterStatus - UnreachableMember: Member(address = akka.tcp://ArchiveSystem@140.125.4.3:16666, Uid=807193699 status = Up, role=[web], upNumber=19), Role(s): web
2017-12-09 16:03:36,013 [6] INFO  Akka.Cluster.ClusterCoreDaemon - Leader can currently not perform its duties, reachability status: [Reachability([akka.tcp://ArchiveSystem@140.125.4.1:16568 -> UniqueAddress: (akka.tcp://ArchiveSystem@140.125.4.2:16668, 81451205): Reachable [Reachable] (29)], [akka.tcp://ArchiveSystem@140.125.4.1:16568 -> UniqueAddress: (akka.tcp://ArchiveSystem@140.125.4.3:16666, 807193699): Unreachable [Unreachable] (28)])], member status: [$akka.tcp://ArchiveSystem@140.125.4.1:16568 $Up seen=$True, $akka.tcp://ArchiveSystem@140.125.4.1:16668 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.2:16568 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.2:16668 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.3:16666 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.4:16666 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.5:4053 $Up seen=$False, $akka.tcp://ArchiveSystem@140.125.4.6:4053 $Up seen=$False]
2017-12-09 16:03:36,138 [6] INFO  Archive.Shared.Cluster.ClusterStatus - ReachableMember: Member(address = akka.tcp://ArchiveSystem@140.125.4.2:16668, Uid=81451205 status = Up, role=[importer-old], upNumber=21), Role(s): importer-old
2017-12-09 16:03:36,138 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: 
2017-12-09 16:03:36,138 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,138 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,154 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address: akka.tcp://ArchiveSystem@140.125.4.3:16666, 
2017-12-09 16:03:36,170 [6] WARN  Archive.Shared.Cluster.ClusterStatus - Unreachable Member; Role: web, Status: Up, Address:
I have more than 2 roles ... "web" which is the front-end web application and "importer" which is the video processing service.
I noticed that after a few days of running ... the combination of lighthouse and other roles ... somehow causes this
Bartosz Sypytkowski
@Horusiath
where are you hosting it?
jalchr
@jalchr
This causes even internal (no remoting) message forwarding to not reach its destination
I'm hosting this on 6 VMs (2 web , 2 processing ... 2 lighthouse)
and for the record, the IIS recycles the web app after some idle time ... so it is natural to become 'unreachable'
I wish I could reproduce this
Bartosz Sypytkowski
@Horusiath
maybe try to add IIS hook to allow node to gracefully leave the cluster before going down?
unreachable is a normal state when a machine crashes or the network connection goes down - if it's possible, it's better to signal gracefully that a node is going offline.
jalchr
@jalchr
This is what I do in IIS ... asp.net app
//==============================================================
            HandleStarted();

            var properties = new AppProperties(app.Properties);
            var token = properties.OnAppDisposing;

            if (token != CancellationToken.None)
            {
                token.Register(() =>
                {
                    // do stuff here for ending / disposing
                    HandleStopped();
                    log.Info("ASP.NET application stopped !");
                });
            }

            log.Info("ASP.NET application started !");
Bartosz Sypytkowski
@Horusiath
also there are several cases for non-graceful scenarios
jalchr
@jalchr
private void HandleStopped()
        {
            log.Info("ASP.NET application stopped");
            if (ClusterSystem == null) return;
            log.Debug("Leaving cluster");
            ClusterHelper.Tell(new ClusterHelper.RemoveMember());
            var cluster = Akka.Cluster.Cluster.Get(ClusterSystem);
            cluster.RegisterOnMemberRemoved(() => MemberRemoved(ClusterSystem));
            cluster.Leave(cluster.SelfAddress);

            asTerminatedEvent.WaitOne(20000);
            ClusterSystem.Terminate().Wait(20000);
            log.Info("Actor system terminated, exiting");
        }

        private async void MemberRemoved(ActorSystem actorSystem)
        {
            log.Info("Member removed ...");
            await actorSystem.Terminate();
            this.asTerminatedEvent.Set();
        }
Internal messages fail:
2017-12-09 16:03:43,795 [56] INFO  Akka.Actor.LocalActorRef - Message Shutdown from akka://ArchiveSystem/system/sharding/FileHandler/67/%5C%5Cnas20%5Cd%24%5Carchive%5Cincoming%5Creuters%5C201712096110wd-northkorea-missilescontributors.xml/$c to akka://ArchiveSystem/system/sharding/FileHandler/67/%5C%5Cnas20%5Cd%24%5Carchive%5Cincoming%5Creuters%5C201712096110wd-northkorea-missilescontributors.xml/$c was not delivered. 4 dead letters encountered.
2017-12-09 16:03:43,826 [56] INFO  Akka.Actor.LocalActorRef - Message FileValidationResult from akka://ArchiveSystem/system/sharding/FileHandler/67/%5C%5Cnas20%5Cd%24%5Carchive%5Cincoming%5Creuters%5C201712096110wd-northkorea-missilescontributors.xml/$c to akka://ArchiveSystem/system/sharding/FileHandler/67/%5C%5Cnas20%5Cd%24%5Carchive%5Cincoming%5Creuters%5C201712096110wd-northkorea-missilescontributors.xml was not delivered. 5 dead letters encountered.
jalchr
@jalchr
Some additional notes, my applications are CPU and Network intensive apps. It is typical to have 100% CPU and 100% Network utilization, most of the time.
Perhaps this affects how akka.net communication works and causes this ... perhaps
Bart de Boer
@boekabart
@ that, I sometimes think Akka.Net needs a 'priority' system for actor mailboxes for such high-load cases - some actors are simply more time-critical than others. cc @Aaronontheweb
Stefano
@delfuria
Priority Mailboxes are already available
http://getakka.net/articles/actors/mailboxes.html
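For reference, a priority mailbox can be sketched as below, assuming the `UnboundedPriorityMailbox` base class from Akka.Dispatch. The `Urgent` and `Routine` message types, the `Worker` actor and the `urgent-first-mailbox` config key are hypothetical names for this illustration.

```csharp
using System;
using Akka.Actor;
using Akka.Configuration;
using Akka.Dispatch;

public sealed class Urgent { }   // hypothetical time-critical message
public sealed class Routine { }  // hypothetical ordinary message

public class UrgentFirstMailbox : UnboundedPriorityMailbox
{
    public UrgentFirstMailbox(Settings settings, Config config) : base(settings, config) { }

    // Lower number = dequeued earlier.
    public static int PriorityOf(object message)
    {
        return message is Urgent ? 0 : 1;
    }

    protected override int PriorityGenerator(object message)
    {
        return PriorityOf(message);
    }
}

public sealed class Worker : ReceiveActor
{
    public Worker()
    {
        ReceiveAny(msg => Console.WriteLine($"handling {msg.GetType().Name}"));
    }
}

public static class Program
{
    public static void Main()
    {
        // Register the mailbox type under a config key the Props can refer to.
        var hocon = "urgent-first-mailbox { mailbox-type = \""
                    + typeof(UrgentFirstMailbox).AssemblyQualifiedName + "\" }";
        var system = ActorSystem.Create("demo", ConfigurationFactory.ParseString(hocon));

        var worker = system.ActorOf(
            Props.Create(() => new Worker()).WithMailbox("urgent-first-mailbox"), "worker");

        worker.Tell(new Routine());
        worker.Tell(new Urgent());

        system.Terminate().Wait();
    }
}
```

A mailbox like this only reorders messages that are already waiting in one actor's queue; it does not give that actor more threads or CPU time than other actors.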
anthonyhawes
@anthonyhawes
I don't think priority mailboxes would solve the time-critical actor problem in high load cases. A priority scheduler would be needed to give those actors a larger "slice" of message processing
Vagif Abilov
@object
I am checking our cluster sharding event journal, and it has a very small number of rows. And the snapshot store that we created for cluster sharding is always empty. So I wonder if there's any need for a snapshot store for cluster sharding. Can there ever be an attempt to create a snapshot? And is there ever a need to clean up the event journal for cluster sharding? Can it grow big (I suppose it doesn't have auto-cleanup)?
Bartosz Sypytkowski
@Horusiath

@object if you don't have the remember-entities flag set, cluster sharding won't produce too many events. The snapshot store will be used (by default) after every 1000 consecutive sharding events stored, but this only applies if the shard coordinator lives long enough to produce those 1000 events within a single incarnation.

Also I hope to be able to finish DData-based cluster sharding soon (see akkadotnet/akka.net#3199). With it, event journals will no longer be needed.

It actually already works, but I need to fix tests.
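The two sharding settings discussed above (remember-entities and the 1000-event snapshot interval) can be written out explicitly, as in this sketch. It assumes the `akka.cluster.sharding.remember-entities` and `akka.cluster.sharding.snapshot-after` keys; the `ShardingPersistenceConfig` class is a hypothetical wrapper, and the values shown simply restate the defaults described in the conversation.

```csharp
using System;
using Akka.Configuration;

public static class ShardingPersistenceConfig
{
    // HOCON fragment restating the defaults mentioned in the discussion.
    public static Config Build()
    {
        return ConfigurationFactory.ParseString(@"
            akka.cluster.sharding {
                # entities are not restarted automatically after rebalance or crash
                remember-entities = off
                # coordinator saves a snapshot after this many persisted events
                snapshot-after = 1000
            }");
    }

    public static void Main()
    {
        var config = Build();
        Console.WriteLine(config.GetBoolean("akka.cluster.sharding.remember-entities"));
        Console.WriteLine(config.GetInt("akka.cluster.sharding.snapshot-after"));
    }
}
```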
wilddeveloperappears
@wilddeveloperappears
@Horusiath can I set them up programmatically? Or does it have to be through HOCON config? I want to pass a function to the initial logger.
Vagif Abilov
@object
Thanks @Horusiath. We recently had a weird experience with our cluster: nodes were not able to form a single cluster and kept logging "trying to register... but no acknowledgement" until we deleted the whole sharding event journal (there were only about 80 rows in it). And if the DData implementation is around the corner, we probably should not spend much time investigating this failure.
Bartosz Sypytkowski
@Horusiath
@wilddeveloperappears if I remember right, you should be able to just subscribe to System.EventStream.Subscribe(actorRef, typeof(Akka.Event.LogEvent))
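A minimal sketch of that subscription, assuming a small forwarding actor that hands every `LogEvent` to a caller-supplied function. `LogForwarder` and the `sink` delegate are hypothetical names; `LogClass` is printed because it carries the type that produced the log entry, which may help with namespace-based filtering.

```csharp
using System;
using System.Threading;
using Akka.Actor;
using Akka.Event;

// Forwards every log event published on the event stream to a plain function.
public sealed class LogForwarder : ReceiveActor
{
    public LogForwarder(Action<LogEvent> sink)
    {
        Receive<LogEvent>(evt => sink(evt));
    }
}

public static class Program
{
    public static void Main()
    {
        var system = ActorSystem.Create("demo");

        // The function passed in from application code.
        Action<LogEvent> sink = evt =>
            Console.WriteLine($"[{evt.LogLevel()}] {evt.LogClass?.FullName} ({evt.LogSource}): {evt.Message}");

        var forwarder = system.ActorOf(Props.Create(() => new LogForwarder(sink)), "log-forwarder");

        // Subscribing to the base LogEvent type covers Debug, Info, Warning and Error.
        system.EventStream.Subscribe(forwarder, typeof(LogEvent));

        system.Log.Info("hello from the event stream");

        Thread.Sleep(500); // let the asynchronous log event reach the forwarder
        system.Terminate().Wait();
    }
}
```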
wilddeveloperappears
@wilddeveloperappears
I'm currently investigating doing hocon changes to solve the problem.
Bartosz Sypytkowski
@Horusiath
@object this sounds like an issue described by @zbynek001 (see: https://github.com/akkadotnet/akka.net/issues/3204#issuecomment-350274096) - regarding ddata, remember that PRs can take some time.
wilddeveloperappears
@wilddeveloperappears
I've retro-fitted NLog into a project I'm working on with hundreds of classes all separated out into projects. They all fed into one log file and now I have them filtering to several based on their namespace. Using Akka in this model is broken because the Akka loggers are in a separate namespace from the project they're logging for.