These are chat archives for atomix/atomix

13th
Mar 2017
arshad khan
@arshadwkhan_twitter
Mar 13 2017 08:52
Hi @kuujo @electrical, I have a couple of questions about DistributedLong: 1) How do you set a TTL for a DistributedLong? 2) Is it possible to perform multiple operations on different DistributedLongs, e.g. d1.increment() and d2.get(), in one call?
Jordan Halterman
@kuujo
Mar 13 2017 09:00

@arshadwkhan_twitter DistributedLong doesn't currently have any TTL mechanism, just because I'm not sure it makes sense. A TTL would effectively set the value back to 0, since that's the value of an empty long, which makes it not really a TTL. But it's trivial to add one if there's a use case for it.

There's currently no way to do multiple operations atomically, if that's what you mean. One of the items on the Atomix TODO list is atomic transactions, which would just submit multiple commands as a single command, but that has yet to be implemented. There's certainly a use case for that, and it would be very easy to implement. I just need to find the time.

arshad khan
@arshadwkhan_twitter
Mar 13 2017 09:04
Thanks @kuujo for the quick response. In my use case I create one DistributedLong every minute (for reference counting on a sliding window). My sliding window is the current minute and the previous minute, which means I no longer need the older ones. I was concerned about memory usage over time and wanted to remove the older DLs using a TTL. Can you suggest an alternative? Should I remove the DLs using some call?
Jordan Halterman
@kuujo
Mar 13 2017 09:30
Gotcha
Jordan Halterman
@kuujo
Mar 13 2017 09:36

So you need a TTL that actually deletes the entire resource (DistributedLong) itself. You make a good point. That's memory/disk that won't be reclaimed until it's actually deleted. Right now you'd have to do something in your window to delete() the DLongs once they're no longer needed.
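
A minimal sketch of that manual cleanup, assuming one handle per minute is tracked on the client. `MinuteWindow`, `opener` and `deleter` are hypothetical names, not Atomix API. In a real client the opener would presumably fetch the DistributedLong for that minute and the deleter would call its delete(), but the sketch keeps both as plain callbacks so it runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.LongFunction;

/** Tracks one counter handle per minute and deletes handles that fall out of the window. */
final class MinuteWindow<C> {
    private final TreeMap<Long, C> handles = new TreeMap<>();
    private final int minutesToKeep;
    private final LongFunction<C> opener;   // hypothetical: opens the counter for a minute
    private final Consumer<C> deleter;      // hypothetical: deletes a counter that is no longer needed

    MinuteWindow(int minutesToKeep, LongFunction<C> opener, Consumer<C> deleter) {
        this.minutesToKeep = minutesToKeep;
        this.opener = opener;
        this.deleter = deleter;
    }

    /** Returns the handle for the given minute, opening it if needed, then prunes older minutes. */
    synchronized C handleFor(long epochMinute) {
        C handle = handles.computeIfAbsent(epochMinute, opener::apply);
        // Oldest minute that still belongs to the window ending at epochMinute.
        long oldestKept = epochMinute - minutesToKeep + 1;
        Iterator<Map.Entry<Long, C>> stale =
                handles.headMap(oldestKept, false).entrySet().iterator();
        while (stale.hasNext()) {
            deleter.accept(stale.next().getValue()); // release the remote resource
            stale.remove();                          // and forget the local handle
        }
        return handle;
    }

    /** Minutes that currently have a live handle, oldest first. */
    synchronized List<Long> trackedMinutes() {
        return new ArrayList<>(handles.keySet());
    }
}
```

With `minutesToKeep` set to 2, asking for minute 12 after minutes 10 and 11 deletes only the handle for minute 10, which matches the current-plus-previous-minute window described earlier.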

I have thought about doing a TTL in the past, and I think there's good reason to implement it. There are actually two things that can be done: tie a resource to sessions such that the resource is deleted when all sessions that access it are closed, or support a TTL for a resource. I think both would be nice.

BTW, it seems like you could benefit a lot from writing a custom state machine to do a reference counting sliding window. Implementing it as a single custom resource would eliminate problems with cleaning up resources and with doing atomic operations since you can do whatever you want within a single operation. The API for adding custom resources is a little clunky right now and will be improved in Atomix 2.0, but it works.
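
To illustrate the "whatever you want within a single operation" point, here is a rough plain-Java model of the state such a custom resource might hold. It deliberately uses no Atomix or Copycat classes, and `SlidingWindowState` and `incrementAndSum` are hypothetical names. The idea is only that one operation can count an event, expire old buckets and read the window total together.

```java
import java.util.TreeMap;

/** Plain-Java model of the state a custom sliding-window resource might keep. */
final class SlidingWindowState {
    private final TreeMap<Long, Long> countsByMinute = new TreeMap<>();
    private final int windowMinutes;

    SlidingWindowState(int windowMinutes) {
        this.windowMinutes = windowMinutes;
    }

    /**
     * One logical operation: count an event in the given minute, drop buckets
     * older than the window, and return the total of the buckets that remain.
     */
    long incrementAndSum(long minute) {
        countsByMinute.merge(minute, 1L, Long::sum);
        // Buckets strictly older than the first minute of the window are discarded.
        countsByMinute.headMap(minute - windowMinutes + 1, false).clear();
        long total = 0;
        for (long count : countsByMinute.values()) {
            total += count;
        }
        return total;
    }
}
```

The minute is passed in as an argument rather than read from a local clock, because a replicated state machine has to produce the same result on every node when it applies the same operation.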

I added issues for transactions and expiring resources so I don't forget.
arshad khan
@arshadwkhan_twitter
Mar 13 2017 17:48
Thanks a lot. delete() seems to be a good option for the time being. I am also exploring the custom resources option, though the documentation seems to be old and not very clear.
Vasily Sulatskov
@redvasily
Mar 13 2017 21:19

Hi, can someone please explain whether there is some kind of timeout on .join() calls on, say, DistributedLong.incrementAndGet() futures? Can they block indefinitely (it seems they can), and what's the correct approach to handling this?

I've tried building a small prototype with Atomix (I tried versions 1.0.3 and 1.0.2): 3 replicas running on the same machine, all on different ports and with different storage locations. After bootstrapping a cluster, I create a distributed long value and call incrementAndGet from each instance at 1-3 second intervals. Here's the main loop's code: https://gist.github.com/redvasily/d97f1b42153af22fe8cb07467feb8a03.
It all works fine, but if I begin killing/starting instances, things quickly get weird. Once there's only one instance left, incrementAndGet() stops working, as you would expect, until I start a new instance. But sometimes I've managed to get my app into a state where, after I start a new (second) instance, the Atomix cluster seems to be up, i.e. the new instance is able to commit new DistributedLong state (so there are at least two Atomix nodes present), while on the other Atomix node the thread seems to be blocked in an incrementAndGet().call() (confirmed by looking at the logs and the stack trace). On one hand, I can deal with this problem by, say, having my own timeout around a .join() call and cancelling the future in the event of a timeout. On the other hand, it feels kind of hackish, and I am afraid that even if I stop waiting on the future, Atomix can still hold some references. What's the correct way of dealing with this?
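
For reference, the "own timeout around a .join() call" workaround could look roughly like this. It is only a sketch using the standard `CompletableFuture` API, with `BoundedJoin` and `joinWithTimeout` as hypothetical names, and it assumes incrementAndGet() hands back a `CompletableFuture` (the .join() calls above suggest so). It does not answer whether Atomix keeps references to the pending command.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedJoin {
    /**
     * Waits at most timeoutMillis for the future. Returns the value if it
     * arrives in time, or an empty Optional after cancelling the future.
     */
    static <T> Optional<T> joinWithTimeout(CompletableFuture<T> future, long timeoutMillis) {
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel() only completes this local future exceptionally; it does not
            // interrupt or undo whatever work was supposed to complete it.
            future.cancel(true);
            return Optional.empty();
        } catch (ExecutionException e) {
            throw new CompletionException(e.getCause());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new CompletionException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for counter.incrementAndGet(): one completed future, one that never completes.
        CompletableFuture<Long> completed = CompletableFuture.completedFuture(42L);
        CompletableFuture<Long> stuck = new CompletableFuture<>();
        System.out.println(joinWithTimeout(completed, 1000).isPresent());
        System.out.println(joinWithTimeout(stuck, 100).isPresent());
    }
}
```

Because cancelling a `CompletableFuture` does not stop the underlying operation, a timed-out increment may still be applied by the cluster later, so the caller should treat a timeout as "outcome unknown" rather than "not incremented".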

I've run into another problem when I add a state change listener to the above code, so it becomes something like this:
https://gist.github.com/redvasily/c0d978fb321198c874d2331f2eb17263
Running with the same setup (3 replicas), it all works for a while, but when I start killing and starting replicas, I quickly get into a state where I see io.atomix.copycat.error.ApplicationException errors like this: https://gist.github.com/redvasily/7b7a0d6385269050738a2709b9de9bb9
Commenting out the onChange() call gets rid of this problem. Am I doing something wrong? What's the correct way of using onChange()?
Thanks