These are chat archives for atomix/atomix

25th
Oct 2017
Andrius Dagys
@adagys
Oct 25 2017 15:16
Has Atomix been tested with large log sizes, say 100 GB to 1 TB?
Jordan Halterman
@kuujo
Oct 25 2017 15:18
No... logs are generally compacted long before they reach that size. But we're actually running tests this week that fill up a machine's disk, and we just implemented hard caps on disk usage: once the cap is reached, writes to the cluster are blocked and the logs are compacted synchronously.
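A minimal sketch of the hard-cap behavior described above, assuming a simplified in-memory log. `CappedLog`, `markApplied`, and the byte accounting are hypothetical names for illustration and are not Atomix APIs:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from Atomix.
class CappedLog {
    private final long maxBytes;                      // hard cap on log size
    private final List<byte[]> entries = new ArrayList<>();
    private long usedBytes = 0;
    private int appliedCount = 0;                     // leading entries already applied to the state machine

    CappedLog(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Marks the first n entries as applied, which makes them eligible for compaction. */
    void markApplied(int n) {
        appliedCount = Math.min(n, entries.size());
    }

    /**
     * Appends an entry. If the cap would be exceeded, the log is compacted
     * synchronously first; if there is still no room, the write is blocked
     * (reported here by returning false).
     */
    boolean append(byte[] entry) {
        if (usedBytes + entry.length > maxBytes) {
            compact();
            if (usedBytes + entry.length > maxBytes) {
                return false;
            }
        }
        entries.add(entry);
        usedBytes += entry.length;
        return true;
    }

    /** Drops every entry that has already been applied. */
    void compact() {
        for (int i = 0; i < appliedCount; i++) {
            usedBytes -= entries.get(i).length;
        }
        entries.subList(0, appliedCount).clear();
        appliedCount = 0;
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

In this sketch a write is rejected only when compaction cannot free enough space, which is the situation the next question asks about.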
Andrius Dagys
@adagys
Oct 25 2017 15:23
Ok, I see. What would happen in the case where entries can't be compacted away?
For example, when using Atomix for a de-duplication service where the state machine maintains a set of action IDs.
I'm just wondering in general whether the log could be used as a data store, similar to Kafka.
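To make the de-duplication case concrete, here is a rough sketch of such a state machine, assuming snapshot-style compaction. `DedupStateMachine` and its methods are hypothetical and not part of Atomix:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a de-duplication state machine; not an Atomix API.
class DedupStateMachine {
    private final Set<String> seenActionIds = new HashSet<>();

    /** Applies a committed entry. Returns true if the action ID is new, false if it is a duplicate. */
    boolean apply(String actionId) {
        return seenActionIds.add(actionId);
    }

    /**
     * Takes a snapshot of the state. Every ID ever seen must be kept, so
     * compacting the log moves the data into the snapshot instead of
     * discarding it.
     */
    Set<String> snapshot() {
        return new HashSet<>(seenActionIds);
    }

    /** Restores the state from a snapshot. */
    void restore(Set<String> snapshot) {
        seenActionIds.clear();
        seenActionIds.addAll(snapshot);
    }
}
```

Under this assumption the total amount of retained state grows with the number of distinct action IDs, whether it lives in the log or in a snapshot.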
Jordan Halterman
@kuujo
Oct 25 2017 18:09
In theory a Raft log could be used to do that, but not the Atomix log specifically. Atomix's Raft implementation is really designed for managing a set of disparate state machines rather than for managing a raw log. Meaning, entries are only ever read once and applied to a state machine. To expose the log itself as a service, the state machine would have to have access to the set of committed entries in the log since you can't just hold the entire log in memory. Although, I suppose that wouldn't be that difficult to refactor into the system.
Jordan Halterman
@kuujo
Oct 25 2017 19:05
TBH the architecture of Atomix could handle this pretty well. I don't have the bandwidth to do it myself, but I'm open to PRs. I'd suggest abstracting out the RaftServiceManager and creating one implementation that manages state machines and another that serves a log.
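A rough sketch of what that split might look like. Everything here is hypothetical: `EntryManager` only stands in for an abstracted RaftServiceManager, and none of these types or signatures are taken from the Atomix codebase. The log-serving variant keeps entries in memory purely for brevity; a real one would have to read committed entries from the log on disk, as noted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical abstraction: receives each committed entry exactly once, in order.
interface EntryManager {
    void onCommit(long index, String entry);
}

// Variant 1: apply each entry to a state machine, then forget it.
class StateMachineEntryManager implements EntryManager {
    private final Consumer<String> stateMachine;
    private long lastApplied = 0;

    StateMachineEntryManager(Consumer<String> stateMachine) {
        this.stateMachine = stateMachine;
    }

    @Override
    public void onCommit(long index, String entry) {
        stateMachine.accept(entry);
        lastApplied = index;
    }

    long lastApplied() {
        return lastApplied;
    }
}

// Variant 2: retain committed entries so clients can read them back by index.
class LogServingEntryManager implements EntryManager {
    private final List<String> committed = new ArrayList<>(); // in-memory stand-in for on-disk segments
    private long firstIndex = -1;

    @Override
    public void onCommit(long index, String entry) {
        if (committed.isEmpty()) {
            firstIndex = index;
        }
        committed.add(entry);
    }

    /** Returns up to maxEntries committed entries, starting at fromIndex. */
    List<String> read(long fromIndex, int maxEntries) {
        int start = (int) Math.max(0, fromIndex - firstIndex);
        if (start >= committed.size()) {
            return new ArrayList<>();
        }
        int end = Math.min(committed.size(), start + maxEntries);
        return new ArrayList<>(committed.subList(start, end));
    }
}
```

The Raft layer would only depend on the interface, so choosing between state machines and a raw log becomes a matter of which implementation is plugged in.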