These are chat archives for atomix/atomix

3rd
Aug 2017
Kevin Daly
@kedaly
Aug 03 2017 14:45
@kuujo I'm having all sorts of problems with the Atomix DistributedMap locking up and never returning a value. I've tried join(), get(), and get() with a timeout; it seems like it's locked somehow. How do I diagnose what is happening here?
Johno Crawford
@johnou
Aug 03 2017 14:45
@kedaly take a thread dump if you haven't already and analyse it; that's the first thing I would suggest checking
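For reference, a thread dump is usually taken from outside the process with the JDK's jstack <pid> (or jcmd <pid> Thread.print). When that is not convenient, roughly the same information can be collected from inside the JVM. The following is a minimal sketch using only the standard java.lang.management API; the class name ThreadDumpSketch and the method dump() are made up for illustration and are not part of Atomix.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Atomix: builds a thread dump in-process.
final class ThreadDumpSketch {

    /** Returns the name, state, and full stack trace of every live thread. */
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // Ask for monitor/synchronizer details only where the JVM supports them.
        ThreadInfo[] infos = bean.dumpAllThreads(
                bean.isObjectMonitorUsageSupported(),
                bean.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                // The object this thread is currently blocked or waiting on.
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

In the dump, threads stuck in WAITING or TIMED_WAITING inside CompletableFuture.get or join are the ones worth looking at first.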
Kevin Daly
@kedaly
Aug 03 2017 14:47
@johnou thanks, will try that. I'm also wondering: when we get distributed maps, do we need to close them?
Or any other Atomix resource?
Johno Crawford
@johnou
Aug 03 2017 14:48
are you rapidly creating and destroying them?
or running in a container environment with app reloads?
Kevin Daly
@kedaly
Aug 03 2017 14:49
no, we have clusters where nodes "join" distributed groups when they start; they join base groups based on each node's roles.
public ServiceRegistryImpl(final Atomix atomix, final Set<ServiceDescriptor> descriptorSet,
                           final NodeInfo info) throws InterruptedException, ExecutionException, TimeoutException {
    this.nodeInfo = info;
    this.atomix = atomix;
    this.descriptorSet = descriptorSet;
    // let's join the groups
    for (final ServiceDescriptor descriptor : descriptorSet) {
        final DistributedGroup group = atomix.getGroup(descriptor.getServiceName()).join();
        try {
            group.<NodeInfo>join(info.getId(), info).thenAccept(localMember -> {
                logger.info("Joined Service Group -> {}  id-> {} <-", descriptor.getServiceName(), localMember.id());
                serviceIds.put(descriptor.getServiceName(), localMember.id());
            }).get(1, TimeUnit.MINUTES);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            logger.error("Cannot Register service -> {} Reason -> {} ", descriptor.getServiceName(), e.getMessage(), e);
            throw e;
        }
    }

    // let's build the roles groups
    for (final String role : nodeInfo.getRoles()) {
        final String roleString = ServiceDescriptor.ROLE_MEMBERSHIP_PREFIX + role;
        final DistributedGroup group = atomix.getGroup(roleString).join();
        try {
            group.<NodeInfo>join(info.getId(), info).thenAccept(localMember -> {
                logger.info("Joined Service Group -> {} id-> {} <-", roleString, localMember.id());
                serviceIds.put(roleString, localMember.id());
            }).get(1, TimeUnit.MINUTES);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            logger.error("Cannot Register ROLE -> {} Reason ->  {}", roleString, e.getMessage(), e);
            throw e;
        }
    }
}
Johno Crawford
@johnou
Aug 03 2017 14:50
I would have thought not; it should be enough just to clean up / shut down Atomix when the node leaves the cluster, but it wouldn't hurt for kuujo to clarify
have you considered using the non-blocking API?
like for example
yeah gitter is a bit naff for that
instead of final DistributedGroup group = atomix.getGroup(descriptor.getServiceName()).join() you could use handle((result, throwable) -> ...), which takes a BiFunction, or whenComplete, which takes a BiConsumer
or do you need it to block for some reason?
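The non-blocking style suggested above can be sketched with plain java.util.concurrent.CompletableFuture. This is only an illustration: openGroup is a hypothetical stand-in for an asynchronous call such as atomix.getGroup(name), and it returns a String rather than a real DistributedGroup so that the example runs without Atomix on the classpath.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch, not Atomix code: chaining work instead of blocking on join()/get().
final class NonBlockingJoinSketch {

    // Stand-in for an async call such as atomix.getGroup(name) (assumption for illustration).
    static CompletableFuture<String> openGroup(String name) {
        return CompletableFuture.supplyAsync(() -> {
            if (name.isEmpty()) {
                throw new IllegalArgumentException("empty group name");
            }
            return "group:" + name;
        });
    }

    // handle() receives either the result or the error, so no thread has to wait.
    static CompletableFuture<String> register(String name) {
        return openGroup(name).handle((group, error) -> {
            if (error != null) {
                return "failed: " + name;
            }
            return "joined " + group;
        });
    }

    public static void main(String[] args) {
        // join() is used here only so the demo prints before the JVM exits.
        System.out.println(register("orders").join());
        System.out.println(register("").join());
    }
}
```

The callback passed to handle runs once the future completes, so the calling thread is free to return immediately. If the caller really needs a synchronous result, the blocking call can be kept at the outermost layer, on a thread that is known not to be an event thread.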
Kevin Daly
@kedaly
Aug 03 2017 15:03
@kuujo so in the code above, should I close the resource? This code has to be synchronous, so I use get()
The above code runs when every node starts.
Jordan Halterman
@kuujo
Aug 03 2017 18:11
It doesn't look like there's anything wrong with the code above, but that depends on the thread in which it's being run. Atomix 1.x has a threading model that's largely single-threaded for events coming from the cluster. This is necessary to ensure events are delivered to clients in the order in which they occur in the cluster. But it also means that if an event thread is blocked on a get/join call, it may hang forever. Atomix 2.x addresses a lot of these threading issues by using thread pools and completing blocked futures on background threads. I don't know that that's what's happening here, but it's my first thought.
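The hang described above can be reproduced in miniature with a plain JDK single-thread executor. This is a simplified model under stated assumptions, not Atomix internals: the executor plays the role of the single event thread, and the inner task stands in for a response that can only be delivered on that same thread. The class and method names are made up for illustration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model of a single event thread blocking on its own pending work.
final class EventThreadBlockSketch {

    /** Returns true if waiting on the event thread for work queued on that thread times out. */
    static boolean blocksOnOwnThread(long timeoutMillis) {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> outer = eventThread.submit(() -> {
                // This task is queued behind the one that is currently running.
                Future<String> response = eventThread.submit(() -> "value");
                try {
                    response.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // Without the timeout, this get() would never return.
                    return true;
                }
            });
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            eventThread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocked on own thread: " + blocksOnOwnThread(200));
    }
}
```

The inner task cannot start until the outer task returns, and the outer task is waiting for the inner one, so only the timeout breaks the cycle. Running blocking get/join calls on a separate application thread, or chaining callbacks instead, avoids this pattern.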
Johno Crawford
@johnou
Aug 03 2017 18:50
@kuujo how's it looking in ONOS? almost time to start bringing in the other parts?
Jordan Halterman
@kuujo
Aug 03 2017 18:50
after next week
which is our feature freeze