These are chat archives for atomix/atomix

20th
Apr 2016
scorpiomedia
@scorpiomedia
Apr 20 2016 08:01
Hi. I am new to Atomix. Just compiled and ran the leader election example. It works on localhost, but when i change it to two VirtualBox machines the nodes are not registering
Jordan Halterman
@kuujo
Apr 20 2016 08:02
hmm
scorpiomedia
@scorpiomedia
Apr 20 2016 08:02
host-2/192.168.168.201:9002 - Polling members [ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=/192.168.168.200:9001, clientAddress=null
host-1/192.168.168.200:9001 - Requesting votes from [ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=localhost/127.0.0.1:9001, clientAddress=localhost/127.0.0.1:9001], ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=localhost/127.0.0.1:9002, clientAddress=localhost/127.0.0.1:9002
somehow the nodes are not joining or merging
i wonder why the clientaddress is localhost though
can anyone help
Jordan Halterman
@kuujo
Apr 20 2016 08:07
what commands are you using to run it?
It may just be how the example sets up Addresses
We’ve had to use InetAddress.getLocalHost().getHostName() in Jepsen tests
scorpiomedia
@scorpiomedia
Apr 20 2016 08:09
java -jar atomix-leader-election.jar logs/ink-cron-server1 host-1:9001 host-2:9002
java -jar atomix-leader-election.jar logs/ink-cron-server2 host-2:9002 host-1:9001
Jordan Halterman
@kuujo
Apr 20 2016 08:09
gotcha...
scorpiomedia
@scorpiomedia
Apr 20 2016 08:10
and i have the hosts mapping in my /etc/hosts
192.168.168.200 host-1
192.168.168.201 host-2
on both nodes
are my commands correct?
Jordan Halterman
@kuujo
Apr 20 2016 08:13
yep… obviously it’s not resolving to the correct address for some reason. The leader-election-example will parse the host:port and call new Address(host, port) which calls new InetSocketAddress(host, port) and then call getHostString() to get the host
I think it’s just the way the example is set up… @jhalterman has fought with this a bit but I’m sure he’s sleeping ATM :-(
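Note: a minimal sketch of the resolution path described above, using only java.net. The class name `AddressCheck` and the `hostString` helper are made up for illustration and are not part of Atomix or the example. It only shows what `InetSocketAddress.getHostString()` reports for a given host:port argument on a particular machine.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic helper, not part of Atomix or its examples.
class AddressCheck {

  // Splits "host:port" and returns the host string that
  // new InetSocketAddress(host, port) reports for it.
  static String hostString(String hostAndPort) {
    String[] parts = hostAndPort.split(":");
    InetSocketAddress socketAddress =
        new InetSocketAddress(parts[0], Integer.parseInt(parts[1]));
    // getHostString() returns the hostname if one is known, otherwise the
    // textual IP address; it does not attempt a reverse lookup.
    return socketAddress.getHostString();
  }

  public static void main(String[] args) throws UnknownHostException {
    // What this machine believes its own name and address are.
    System.out.println("local host: " + InetAddress.getLocalHost());
    for (String arg : args) {
      String host = arg.split(":")[0];
      System.out.println(arg
          + " -> getHostString=" + hostString(arg)
          + ", resolved=" + InetAddress.getByName(host));
    }
  }
}
```

Compiling this and running it on each node with the same host:port arguments (for example host-1:9001 host-2:9002) should show whether the names resolve to the 192.168.168.x addresses or to loopback.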
scorpiomedia
@scorpiomedia
Apr 20 2016 08:15
putting the ip addresses instead of the hostname still doesn't work.
hmm... what can i do for this to work?
Jordan Halterman
@kuujo
Apr 20 2016 08:18
does it still show localhost/127.0.0.1 in the logs?
scorpiomedia
@scorpiomedia
Apr 20 2016 08:19
Yes
Jordan Halterman
@kuujo
Apr 20 2016 08:19
this is odd
scorpiomedia
@scorpiomedia
Apr 20 2016 08:20
yeah. changing hostnames to the actual ip addresses doesnt help either
i.e. java -jar atomix-leader-election.jar logs/ink-cron-server1 192.168.168.200:9001 192.168.168.201:9002
it still returns localhost
host-1/192.168.168.200:9001 - Requesting votes from [ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=localhost/127.0.0.1:9001, clientAddress=localhost/127.0.0.1:9001], ServerMember[type=ACTIVE, status=AVAILABLE, serverAddress=localhost/127.0.0.1:9002, clientAddress=localhost/127.0.0.1:9002]]
Jordan Halterman
@kuujo
Apr 20 2016 08:24
one sec
Jordan Halterman
@kuujo
Apr 20 2016 08:33
so… I switched the example code to do:
Address address = new Address(InetAddress.getByName(mainParts[0]).getHostName(), Integer.valueOf(mainParts[1]));
the example’s just not set up properly for a network :-P
you can also pass an InetSocketAddress to new Address(…) if necessary
scorpiomedia
@scorpiomedia
Apr 20 2016 08:35
hmm... thanks. trying now
Jordan Halterman
@kuujo
Apr 20 2016 08:37
the top of the example becomes:
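    // Needs: import java.net.InetAddress; (getByName can throw UnknownHostException)
    // args[1] is this server's own host:port
    String[] mainParts = args[1].split(":");
    Address address = new Address(InetAddress.getByName(mainParts[0]).getHostName(), Integer.valueOf(mainParts[1]));

    // Every host:port argument, including this server's own, goes into the cluster list
    List<Address> cluster = new ArrayList<>();
    for (int i = 1; i < args.length; i++) {
      String[] parts = args[i].split(":");
      cluster.add(new Address(InetAddress.getByName(parts[0]).getHostName(), Integer.valueOf(parts[1])));
    }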
should actually just update the example
scorpiomedia
@scorpiomedia
Apr 20 2016 08:40
got an error while compiling. "symbol not found"
i think it needs an additional java class import?
ahh got it
need import java.net.InetAddress;
hmm... still returning localhost
scorpiomedia
@scorpiomedia
Apr 20 2016 08:51
FYI: using jdk - 1.8.0_77 on Ubuntu 14.04
Richard Pijnenburg
@electrical
Apr 20 2016 08:58
@scorpiomedia I’ll have a look on my machine if i can reproduce it with the current code
scorpiomedia
@scorpiomedia
Apr 20 2016 08:58
@electrical Thanks
Richard Pijnenburg
@electrical
Apr 20 2016 09:34
11:33:23.808 [copycat-server-/192.168.100.20:9001-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9001 - Transitioning to FOLLOWER
11:33:24.194 [copycat-server-/192.168.100.20:9001-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9001 - Found leader /192.168.100.20:9000
11:33:24.470 [copycat-server-/192.168.100.20:9001-copycat] INFO  i.a.c.s.state.ServerStateMachine - /192.168.100.20:9001 - Installing snapshot 1
11:33:24.559 [copycat-server-/192.168.100.20:9001-copycat] INFO  i.a.c.server.state.ClusterState - /192.168.100.20:9001 - Successfully joined via /192.168.100.20:9000
11:33:24.595 [copycat-server-/192.168.100.20:9001-copycat] INFO  i.a.copycat.server.CopycatServer - Server started successfully!
11:33:24.736 [copycat-client-io-1] INFO  i.a.c.client.session.ClientSession - Registered session 20
must admit it was still on the same host but using its ip address instead of localhost
Richard Pijnenburg
@electrical
Apr 20 2016 09:40
@scorpiomedia ^^
scorpiomedia
@scorpiomedia
Apr 20 2016 09:49
@electrical what command did you run for this?
Richard Pijnenburg
@electrical
Apr 20 2016 09:49
java -jar atomix-leader-election.jar logs1 192.168.100.20:9000 192.168.100.20:9000
that was to start the first one
java -jar atomix-leader-election.jar logs2 192.168.100.20:9001 192.168.100.20:9001 192.168.100.20:9000
and that for the second one
scorpiomedia
@scorpiomedia
Apr 20 2016 09:50
hmm...
scorpiomedia
@scorpiomedia
Apr 20 2016 10:00
@electrical You are using one server, right?
Richard Pijnenburg
@electrical
Apr 20 2016 10:06
This is on a single server yeah.
using that example indeed
scorpiomedia
@scorpiomedia
Apr 20 2016 10:08
Got it working. Rebooted my VM.
Jordan Halterman
@kuujo
Apr 20 2016 10:08
ahh nice
Richard Pijnenburg
@electrical
Apr 20 2016 10:08
lol okay :-) stupid vm’s :-)
scorpiomedia
@scorpiomedia
Apr 20 2016 10:08
yeah. i never thought it was just a simple reboot
thanks guys
Richard Pijnenburg
@electrical
Apr 20 2016 10:09
np :-)
Jordan Halterman
@kuujo
Apr 20 2016 10:10
thanks @electrical
Richard Pijnenburg
@electrical
Apr 20 2016 10:11
np bud
scorpiomedia
@scorpiomedia
Apr 20 2016 10:11
What i want really is just the election aspect of atomix. Instead of using AtomixReplica, is it possible to use AtomixClient instead?
Richard Pijnenburg
@electrical
Apr 20 2016 10:11
you need at least 1 replica
because that’s what stores the state
scorpiomedia
@scorpiomedia
Apr 20 2016 10:12
ahh i see. thanks @electrical
Jordan Halterman
@kuujo
Apr 20 2016 10:14

You can use servers to store state externally though… there’s a standalone server so you can do something like:

java -jar atomix-standalone-server.jar localhost:8700 -bootstrap

You can start a cluster of them and then just use clients in your application

but indeed at least one server that can store state - either a replica or standalone server - is needed
scorpiomedia
@scorpiomedia
Apr 20 2016 10:15
but what will happen if the standalone server goes down?
Jordan Halterman
@kuujo
Apr 20 2016 10:15
the standalone server hasn’t been tested much TBH. It’s one of the last things that needs more testing before the release
you lose your state… but that’s why you should start at least three of them
they’re standalone in the sense that they don’t themselves have a client, but they can form a “standalone” cluster
that’s separate from clients
it’s just more the traditional client/server architecture
scorpiomedia
@scorpiomedia
Apr 20 2016 10:16
hmmm... i see...
im using the library mostly for reliably electing a leader. Specifically some kind of a CRON master.
Richard Pijnenburg
@electrical
Apr 20 2016 10:19
ah okay
scorpiomedia
@scorpiomedia
Apr 20 2016 10:19
There are cron jobs that need to be run only on the elected leader/master
a simple way of replacing pacemaker + corosync
Richard Pijnenburg
@electrical
Apr 20 2016 10:19
yeah. then you have a cluster of nodes which form a single cluster and only one of them can become a leader.
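Note: a rough sketch of the "only the elected leader runs the cron jobs" idea, independent of any particular election library. `LeaderGate` and its `onElected`/`onResigned` methods are hypothetical names, not Atomix API; the assumption is that they get called from whatever election callbacks the library provides.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not an Atomix class.
class LeaderGate {
  private final AtomicBoolean leader = new AtomicBoolean(false);

  // Assumed to be called from the library's "elected" callback.
  void onElected() {
    leader.set(true);
  }

  // Assumed to be called when leadership is lost or the node shuts down.
  void onResigned() {
    leader.set(false);
  }

  boolean isLeader() {
    return leader.get();
  }

  // Runs the job only while this node holds leadership; returns whether it ran.
  boolean runIfLeader(Runnable job) {
    if (!leader.get()) {
      return false;
    }
    job.run();
    return true;
  }

  public static void main(String[] args) {
    LeaderGate gate = new LeaderGate();
    AtomicInteger runs = new AtomicInteger();
    gate.runIfLeader(runs::incrementAndGet); // skipped, not leader yet
    gate.onElected();
    gate.runIfLeader(runs::incrementAndGet); // runs
    gate.onResigned();
    gate.runIfLeader(runs::incrementAndGet); // skipped again
    System.out.println("jobs run: " + runs.get());
  }
}
```

A plain flag like this does not stop a job that is already running when leadership changes, so long-running jobs would probably need an extra check of their own.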
scorpiomedia
@scorpiomedia
Apr 20 2016 10:20
i don't know if i got it right. but i think your library is what i needed
because i can add node dynamically
Richard Pijnenburg
@electrical
Apr 20 2016 10:20
yeah indeed
scorpiomedia
@scorpiomedia
Apr 20 2016 10:21
anyway, thanks so much guys. Sorry for being a noob
Richard Pijnenburg
@electrical
Apr 20 2016 10:21
no worries! we all started that way :-)
scorpiomedia
@scorpiomedia
Apr 20 2016 10:23
@electrical I hope this helps. But i found that sometimes the onElection event didn't fire
Richard Pijnenburg
@electrical
Apr 20 2016 10:23
even while the node is a master?
scorpiomedia
@scorpiomedia
Apr 20 2016 10:23
Yes
Transitioning to LEADER
Richard Pijnenburg
@electrical
Apr 20 2016 10:24
hm, interesting.
something for @kuujo to look into :-)
scorpiomedia
@scorpiomedia
Apr 20 2016 10:24
but the "Elected Leader" didn't show up in the log
maybe because onElection didn't fire or something
Richard Pijnenburg
@electrical
Apr 20 2016 10:24
well, it doesn’t show up in the log. it’s a printout
scorpiomedia
@scorpiomedia
Apr 20 2016 10:25
ohh sorry. i mean in the printout
Richard Pijnenburg
@electrical
Apr 20 2016 10:25
ah okay :-)
scorpiomedia
@scorpiomedia
Apr 20 2016 10:25
This is the printout in the master node:
[18:18:50.247]: host-2/192.168.168.201:9001 - Transitioning to LEADER
[18:18:50.256]: host-2/192.168.168.201:9001 - Found leader host-2/192.168.168.201:9001
After that, no more printout
Richard Pijnenburg
@electrical
Apr 20 2016 10:26
doesn’t log anything like ‘Server started successfully!’ ?
scorpiomedia
@scorpiomedia
Apr 20 2016 10:27
it started successfully. in fact, it started the election process.
Richard Pijnenburg
@electrical
Apr 20 2016 10:27
this is what i saw:
scorpiomedia
@scorpiomedia
Apr 20 2016 10:27
i am testing terminating nodes randomly to force an election
Richard Pijnenburg
@electrical
Apr 20 2016 10:27
argh
11:33:00.294 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9000 - Transitioning to FOLLOWER
11:33:01.095 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9000 - Transitioning to CANDIDATE
11:33:01.102 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.server.state.CandidateState - /192.168.100.20:9000 - Starting election
11:33:01.289 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9000 - Transitioning to LEADER
11:33:01.300 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.server.state.ServerContext - /192.168.100.20:9000 - Found leader /192.168.100.20:9000
11:33:01.383 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.c.s.state.ServerStateMachine - /192.168.100.20:9000 - Taking snapshot 1
11:33:01.450 [copycat-server-/192.168.100.20:9000-copycat] INFO  i.a.copycat.server.CopycatServer - Server started successfully!
11:33:01.559 [copycat-client-io-1] INFO  i.a.c.client.session.ClientSession - Registered session 3
Elected leader!
scorpiomedia
@scorpiomedia
Apr 20 2016 10:28
Sometimes you get that printout. But sometimes it doesn't show
Richard Pijnenburg
@electrical
Apr 20 2016 10:29
hmm okay
Jordan Halterman
@kuujo
Apr 20 2016 10:35
Make sure when you blow up the cluster you delete all the logs. The cluster persists configurations on disk. So, if you start a cluster with a bad configuration and then restart it with a good one in the same log directory, sometimes it can mess things up. Servers should be able to restart with the same configuration, but if you’re switching configurations (like when trying out all these configurations) make sure the old log directory gets deleted. Could that be why restarting the VM helped?
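Note: one way to clear an old log directory between test runs, sketched with java.nio.file only. `LogDirCleaner` is a made-up name, and the directories passed in are whatever was given to the example as its first argument (logs/ink-cron-server1 and logs/ink-cron-server2 in the commands above). It deletes everything under the given path, so it is only meant for throwaway test clusters.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for wiping a test cluster's log directory.
class LogDirCleaner {

  // Recursively deletes dir and everything under it; does nothing if it is absent.
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parent directories.
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }

  public static void main(String[] args) throws IOException {
    // Each argument is a log directory to remove.
    for (String arg : args) {
      deleteRecursively(Paths.get(arg));
    }
  }
}
```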
I haven’t been able to reproduce the leader election issue. It works consistently well in tests and in playing with it for me. It’s impossible to tell what’s going on without debug logs if there is a bug
also note that if you kill one node in a two node cluster, you won’t see another one get elected since the remaining node does not form a majority
but if you start a three node cluster and kill the leader you’ll see another one get elected since the remaining 2/3 form a majority still
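Note: the majority arithmetic behind the two cases above, as a small sketch. `Quorum`, `majority` and `canElect` are illustrative names, not Atomix or Copycat API; the only assumption is the usual rule that a leader needs votes from a strict majority of the configured cluster.

```java
// Hypothetical illustration of majority quorum arithmetic.
class Quorum {

  // Smallest number of nodes that forms a strict majority of clusterSize.
  static int majority(int clusterSize) {
    return clusterSize / 2 + 1;
  }

  // True if the nodes still alive are enough to elect a leader.
  static boolean canElect(int clusterSize, int liveNodes) {
    return liveNodes >= majority(clusterSize);
  }

  public static void main(String[] args) {
    // Two-node cluster with one node killed: 1 of 2 is not a majority.
    System.out.println("2 nodes, 1 alive: " + canElect(2, 1));
    // Three-node cluster with the leader killed: 2 of 3 is still a majority.
    System.out.println("3 nodes, 2 alive: " + canElect(3, 2));
  }
}
```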
scorpiomedia
@scorpiomedia
Apr 20 2016 10:40
okay.