These are chat archives for atomix/atomix

4th
Mar 2016
Grant Haywood
@cinterloper
Mar 04 2016 18:06
Hello!
in member.execute( lambda... ) is there any way to get a handle (inside the lambda) on objects/resources in the remote node?
for example, I want to get a reference to the remote node's local storage service, and ask it to store a copy of something on its local disk
Jordan Halterman
@kuujo
Mar 04 2016 18:08
Yeah… there’s currently no way to do that. It should be easy to add. We can add an interface for the callback on which the receiving side sets some local variables
I have realized the same issue
Grant Haywood
@cinterloper
Mar 04 2016 18:10
ok, got it, always improvements to be made. I think I can pull this off with messaging instead
Jordan Halterman
@kuujo
Mar 04 2016 18:10
indeed
Grant Haywood
@cinterloper
Mar 04 2016 18:52
hey fyi, there are some broken links on this page
http://atomix.io/atomix/docs/messaging/
specifically, the api docs link to DistributedMessageBus
Jordan Halterman
@kuujo
Mar 04 2016 18:54
Cool thanks. Yeah the website docs are the big push this weekend. Should be cleaned up by Monday. A lot of them are stale... Much more so than the Javadoc
Grant Haywood
@cinterloper
Mar 04 2016 19:03
ok cool, I'll look at the javadoc
I had a big problem with the vertx website docs for a while. not that that's related, but it's kind of a personal frustration of mine, because it made it hard to convince other people to use the platform
Joachim De Beule
@joachimdb
Mar 04 2016 19:29
Hi. I'm considering atomix, but after reading the docs I'm still not sure how it would best fit my project, so I thought I'd ask here. I've got a cluster of nodes gathering data. I need to periodically collect all gathered data and persist it to an external database. Nodes shouldn't just persist their data individually; accumulation of the data before persistence is critical. If somebody could tell me that atomix is a good choice to tackle this problem and perhaps suggest how to approach it I'd be very happy! :)
Grant Haywood
@cinterloper
Mar 04 2016 19:36
well, atomix could potentially help you coordinate that 'accumulation', maybe, but you have left out the detail of what constraint is preventing you from having uncoordinated persistence, that is, each node just writing to the db as soon as the data comes in
Joachim De Beule
@joachimdb
Mar 04 2016 19:45
true, but in what way would those details matter for answering my question? Anyway, the answer is that I do not want every single event to be written to the db, only accumulated event counts over time (i.e. the time resolution of my db is lower than that of events), which in turn is for performance reasons.
But so I take it this is not an off-the-shelf use-case for atomix then?
Grant Haywood
@cinterloper
Mar 04 2016 19:47
well, if counting is what you want then
io.atomix.variables.DistributedLong
would allow you to count in a distributed fashion
Joachim De Beule
@joachimdb
Mar 04 2016 19:55
ok, interesting thought, but what if I want to count event types, and there are many types? I mean, I guess I can use a distributed map, but then I have two concerns still: (1) What about performance, particularly if the event rate is very high, should I consider partial accumulation per node instead of modifying the distributed map on each node on each event? And (2): How to decide which node should persist the data?
Grant Haywood
@cinterloper
Mar 04 2016 19:57
I'd have to defer to the authors about performance, but personally, I think you could keep track of the names of the distributed counters in a map (per-node local or distributed, depending on your application) and just use as many DistributedLongs as you wanted
Joachim De Beule
@joachimdb
Mar 04 2016 20:00
OK, thanks. And what about selecting the node that persists the data, any ideas?
Grant Haywood
@cinterloper
Mar 04 2016 20:01
well, I'm trying to do that myself at the moment, but group membership and leader election come to mind.
Jordan Halterman
@kuujo
Mar 04 2016 20:02
Copycat/Atomix/Raft are certainly not designed for performance. You could achieve high write performance, but only through sharding which Atomix doesn't currently do. There's a high cost to consistency. But there's another way to tackle a problem of high throughput, and Atomix does provide the primitives necessary to do that. You can use the consistency to do things like (2) in a way that provides fault tolerance and ensures all the nodes in a cluster have a consistent view of the structure of the cluster. To decide which nodes persist data, you use a DistributedGroup. Within the group, you elect a leader to decide which nodes persist data or have the leader persist it.
Ahh beat me to it
Grant Haywood
@cinterloper
Mar 04 2016 20:03
^^ listen to him, he's an author lol
I'm going to get something to eat, thanks for the help
Joachim De Beule
@joachimdb
Mar 04 2016 20:04
k, thanks. And enjoy the meal!
k, thanks kuujo, makes sense.
So basically I could have my nodes accumulate locally, and then have a distributed group with a leader that accumulates the data from all nodes and persists it, right? And Atomix would take care of there being always a leader etc.?
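A minimal sketch of the "accumulate locally, then merge on the leader" idea, in plain Java. It makes no Atomix calls: `LocalEventAccumulator`, `record`, `drain`, and `mergeInto` are hypothetical names invented for illustration, and how the drained counts actually reach the leader (messaging, a distributed map, etc.) is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-node accumulator; not part of Atomix. */
final class LocalEventAccumulator {
    // One counter per event type, safe to update from many threads.
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    /** Record one event of the given type; purely local, no network traffic. */
    void record(String eventType) {
        counts.computeIfAbsent(eventType, k -> new AtomicLong()).incrementAndGet();
    }

    /** Return the counts accumulated since the last drain and reset them to zero. */
    Map<String, Long> drain() {
        Map<String, Long> snapshot = new HashMap<>();
        for (Map.Entry<String, AtomicLong> e : counts.entrySet()) {
            // getAndSet is atomic, so an increment racing with the drain
            // lands either in this snapshot or in the next one.
            long n = e.getValue().getAndSet(0L);
            if (n > 0) {
                snapshot.put(e.getKey(), n);
            }
        }
        return snapshot;
    }

    /** Leader side: fold one node's drained counts into a running total. */
    static void mergeInto(Map<String, Long> total, Map<String, Long> partial) {
        partial.forEach((type, n) -> total.merge(type, n, Long::sum));
    }
}
```

Each node would call `drain()` once per reporting interval and ship the result to whichever node is currently leader, which calls `mergeInto` before persisting; the per-event cost stays local and only the periodic summaries cross the network.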
Jordan Halterman
@kuujo
Mar 04 2016 20:10
There are some interesting issues with leader election when writing to an external data store, though. In any cluster that does leader election, it's still theoretically possible for two members to believe themselves to be the leader simultaneously. For example, a node could be elected leader and then have a huge GC pause, causing another node to be elected. Right after the pause the first node will still believe itself to be the leader until it receives an event from the cluster telling it the leader changed. This is handled by the group term, which is a monotonically increasing unique token. Ideally, the leader that's elected writes to a data store that has an atomic check-and-set where it can verify the term is still the term that the writer believes it to be. If the data store's term is greater than the writer's term, the write fails because some other leader must have been elected.
This is a good post that talks about fencing tokens: https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
Yep
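The term check described above can be sketched roughly as follows. This is an in-memory stand-in, not a real data store and not Atomix code: `FencedStore` and its methods are hypothetical, and a real database would do the comparison with its own conditional-update or compare-and-set feature.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for an external store with an atomic term check. */
final class FencedStore {
    private long highestTerm = 0;                       // newest leader term seen so far
    private final List<String> records = new ArrayList<>();

    /**
     * Accept the write only if the writer's term is not older than the newest
     * term the store has seen. The method is synchronized so the check and the
     * write happen as one atomic step.
     */
    synchronized boolean write(long writerTerm, String record) {
        if (writerTerm < highestTerm) {
            return false; // a newer leader has already written; reject the stale one
        }
        highestTerm = writerTerm;
        records.add(record);
        return true;
    }

    /** Copy of everything accepted so far. */
    synchronized List<String> records() {
        return new ArrayList<>(records);
    }
}
```

In the GC-pause scenario, the paused leader wakes up holding term 1 while the new leader has already written with term 2, so the old leader's `write(1, ...)` returns false instead of clobbering newer data.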
Joachim De Beule
@joachimdb
Mar 04 2016 20:12
great, all this'll certainly get me going for now! thanks again!
Jordan Halterman
@kuujo
Mar 04 2016 20:12
:+1:
Jonathan Halterman
@jhalterman
Mar 04 2016 20:28
@cinterloper Fixed a few of the javadoc links, thanks for the heads up! As @kuujo said, we'll be doing a doc push this weekend ahead of some advertising on Monday (which we haven't really done yet).
@joachimdb Atomix is all about achieving strong data consistency (across nodes) in a way that is safe/fault-tolerant. So if your use case requires accumulation of data where you want the data to be replicated, for safety reasons, and viewed consistently across nodes, Atomix will take care of that for you.
@joachimdb As @cinterloper mentioned, there are various resources you can use for storing your data: DistributedValue, DistributedMap, etc. Whatever you throw in there, it will be replicated automatically across nodes and changes will be consistent across nodes.
Jonathan Halterman
@jhalterman
Mar 04 2016 20:34
So in your case, usage of an Atomix resource could be a good staging place for your data before it is written to DB. As @kuujo mentioned though, the strong consistency that Atomix provides comes at the cost of lower throughput (since it takes a while to write data out to several nodes). So it's all about what you need.
...but still, you can push tens of thousands of write ops/second through an Atomix cluster, depending on your network (we need to do some updated benchmarking in the near future). So it's no slouch.
Joachim De Beule
@joachimdb
Mar 04 2016 20:41
i see, thanks Jonathan!