These are chat archives for atomix/atomix

17th
Jun 2016
Churro Morales
@churrodog
Jun 17 2016 17:23
hey guys, do you know of anyone using Copycat in production at decent scale?
Jordan Halterman
@kuujo
Jun 17 2016 21:16
TBH I never really ask anyone about their production environments. @madjam has the most experience with it of the people I frequently talk to and who contribute. Its use at my own company is in its infancy and will likely never hit anything I would consider decent scale, if you’re talking about throughput.
Gardner Vickers
@gardnervickers
Jun 17 2016 21:34
Hey folks, awesome project! I’ve been very impressed with how easy it is to get started.
Jordan Halterman
@kuujo
Jun 17 2016 21:34
thanks!
Gardner Vickers
@gardnervickers
Jun 17 2016 21:36
I’ve been particularly interested in Copycat, still reading through the docs and toying around with the examples, but I was wondering if it’s possible to trigger log compaction through a client rather than through one of the automatic options.
Jordan Halterman
@kuujo
Jun 17 2016 21:48
Not right now, but there may be an argument for supporting it. The fact that it supports snapshots is probably one such argument. Client-triggered compaction probably makes a bit less sense with the incremental compaction algorithm. We'd have to think about what the implications are, particularly what to do with the current segment. When a client triggers compaction, the log could roll over to a new segment so all of the log can be compacted. But there's no reason it can't be supported. I think there's probably a lot of optimization that can be done in log compaction. Right now it really just happens when a segment fills up, but there might be good arguments in general for different mechanisms. Logs accumulate obsolete entries continuously.
Would be an easy change
Gardner Vickers
@gardnervickers
Jun 17 2016 21:50
I definitely need to read a bit deeper, but that sounds promising
To go along with that, I would be very interested in being able to walk a state machine through the log by offset number. Obviously this would be pretty limited without control over what’s compacted. Definitely looking forward to diving in more, thanks!
Jordan Halterman
@kuujo
Jun 17 2016 21:53
I can take a stab at it this weekend. All we really need to do is add an internal Compact command that can be submitted by a compact() method call on the client. The internal ServerStateMachine has access to the log and can just call log.compactor().compact(). The only question is whether the client should be notified when compaction is complete. That would require the servers to send an event back to the client, which is fine.
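Roughly, as a sketch of that flow (the CompactLog command and the CompactionClient wrapper with its compact() method are hypothetical names, not existing Copycat API; CopycatClient.submit() is the real entry point as I recall the 1.x client):

```java
import io.atomix.copycat.Command;
import io.atomix.copycat.client.CopycatClient;

import java.util.concurrent.CompletableFuture;

// Hypothetical internal command a client could submit to request log compaction.
// On the server side the state machine would handle it by calling
// log.compactor().compact(), as described above.
class CompactLog implements Command<Void> {
}

// Hypothetical client-side wrapper exposing a compact() call.
public class CompactionClient {
  private final CopycatClient client;

  public CompactionClient(CopycatClient client) {
    this.client = client;
  }

  // Completes when the command is committed; notifying the client when
  // compaction actually finishes would need a session event, as noted above.
  public CompletableFuture<Void> compact() {
    return client.submit(new CompactLog());
  }
}
```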
Can you elaborate on that? What would that look like?
Gardner Vickers
@gardnervickers
Jun 17 2016 21:57
This might be totally incompatible, but
Say my log consists of [Set(name, gardner), Delete(name, gardner), Set(name, john)]. I would like to be able to pick an arbitrary offset [1, 2, 3] and view the state machine at that point.
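Purely as an illustration of that idea (nothing here is Copycat API; Set, Delete, and stateAt() are made up to mirror the example entries):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public final class LogReplay {
  // Entry types mirroring the example log above.
  public record Set(String key, String value) {}
  public record Delete(String key) {}

  // Fold the first `offset` entries to see the state "as of" that point.
  public static Map<String, String> stateAt(List<Object> log, int offset) {
    Map<String, String> state = new HashMap<>();
    for (Object entry : log.subList(0, offset)) {
      if (entry instanceof Set s) {
        state.put(s.key(), s.value());
      } else if (entry instanceof Delete d) {
        state.remove(d.key());
      }
    }
    return state;
  }
}

// stateAt(log, 1) -> {name=gardner}; stateAt(log, 2) -> {}; stateAt(log, 3) -> {name=john}
```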
Jordan Halterman
@kuujo
Jun 17 2016 21:57
I've thought it would be nice to be able to replay the log from a specific index, but there are some problems with it. It might not make total sense if indexes are exposed to a client, because in practice Copycat has internal commands like register, keep-alive, unregister, no-op, etc. that mean indexes are skipped from the client's perspective. So, indexes from a state machine's perspective are not sequential, but whether that matters depends on the use case. That issue could be resolved by appending a separate offset to each actual state machine command.
Gotcha... That's really interesting
I think the problem is that it sort of conflicts with log compaction. Once a snapshot is taken, the history is lost. But in reality, Copycat internally controls the indexes that are actually removed from the log
so it could be feasible to, say, keep n entries at the head of the log
But infeasible to keep all of them
Meaning, from the state machine's perspective once a snapshot is taken the history is lost, but that's not actually the case. Copycat internally retains entries in the log to make sure e.g. tombstones are applied on all servers or commands that trigger session events are retained until those events are acknowledged by the client for fault tolerance.
And the same could be done to allow versioning in the state machine
But state machines could also do it themselves
Just by not releasing entries that shouldn't be compacted
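For instance, a state machine could simply hold on to the Commits it applies instead of closing them, keeping its own in-memory history (a rough sketch against the Copycat 1.x StateMachine/Commit API from memory, so method names may differ slightly; SetCommand and the history list are illustrative):

```java
import io.atomix.copycat.Command;
import io.atomix.copycat.server.Commit;
import io.atomix.copycat.server.StateMachine;
import io.atomix.copycat.server.StateMachineExecutor;

import java.util.ArrayList;
import java.util.List;

public class VersionedValueStateMachine extends StateMachine {
  // Illustrative command type.
  public static class SetCommand implements Command<Object> {
    final String key;
    final Object value;
    public SetCommand(String key, Object value) { this.key = key; this.value = value; }
  }

  // Commits that are never released/closed are never compacted, so this list
  // doubles as an in-memory history, at the cost of also retaining the
  // underlying log entries (and the memory that goes with them).
  private final List<Commit<SetCommand>> history = new ArrayList<>();

  @Override
  protected void configure(StateMachineExecutor executor) {
    executor.register(SetCommand.class, this::set);
  }

  public Object set(Commit<SetCommand> commit) {
    history.add(commit); // deliberately not calling commit.close()
    return commit.operation().value;
  }
}
```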
Gardner Vickers
@gardnervickers
Jun 17 2016 22:01
ahh
I still have to read up on the release portion. Is that just to prevent premature compaction?
Jordan Halterman
@kuujo
Jun 17 2016 22:02
But that would still require holding them in memory, which might not be ideal
Gardner Vickers
@gardnervickers
Jun 17 2016 22:04
What my team currently does now is we have an “immutable log” in ZooKeeper, and each of our clients connects to ZK and runs their own state machines. I put immutable log in quotes because we allow a GC period where all of our clients verify they’re past a certain point in log reading and we snapshot the state to ZooKeeper, then we drop the tail of the log.
I guess I could make a special version of a Copycat state machine that read from the Raft log and ignored the coordination pieces I don’t care about
Jordan Halterman
@kuujo
Jun 17 2016 22:09
So, yeah, state machines can just not release/close a Commit to essentially say it's still relevant to the state machine and shouldn't be compacted. A Commit that's never released will never be compacted. But the contract is that the state machine is responsible for a commit until it has been released, and thereafter Copycat will ensure that it's compacted from the log when it's safe to do so. When it's safe depends on the command's CompactionMode too. This is the more advanced compaction stuff. For Snapshottable state machines, by default commands can be compacted once a snapshot is taken. If a command uses some other CompactionMode it just depends, e.g. a TOMBSTONE has to be held in the log and replicated until it's applied on all servers to make sure state is deleted on all servers, but a normal QUORUM command can be compacted as soon as it's committed and released by the state machine, indicating it's no longer needed. The simplest thing to do is just implement Snapshottable and immediately release all commands since they're presumably covered by the snapshot. Atomix uses the incremental compaction stuff. Basically, if a map key foo is overwritten by a later map key foo, the first foo command can be released. But you could do versioning by instead retaining it in an in-memory log.
Actually not sure you even have to release SNAPSHOT commands... The server may just automatically assume they can be compacted on the next snapshot.
Don't remember off the top of my head
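In its simplest shape, that Snapshottable pattern looks roughly like this (class and command names are illustrative; the Snapshottable/SnapshotWriter/SnapshotReader and configure() signatures are as I remember them from Copycat 1.x and may differ slightly between versions):

```java
import io.atomix.copycat.Command;
import io.atomix.copycat.server.Commit;
import io.atomix.copycat.server.Snapshottable;
import io.atomix.copycat.server.StateMachine;
import io.atomix.copycat.server.StateMachineExecutor;
import io.atomix.copycat.server.storage.snapshot.SnapshotReader;
import io.atomix.copycat.server.storage.snapshot.SnapshotWriter;

import java.util.HashMap;
import java.util.Map;

public class SnapshottedMapStateMachine extends StateMachine implements Snapshottable {
  // Illustrative command type.
  public static class PutCommand implements Command<Object> {
    final String key;
    final Object value;
    public PutCommand(String key, Object value) { this.key = key; this.value = value; }
  }

  private Map<String, Object> map = new HashMap<>();

  @Override
  protected void configure(StateMachineExecutor executor) {
    executor.register(PutCommand.class, this::put);
  }

  public Object put(Commit<PutCommand> commit) {
    try {
      return map.put(commit.operation().key, commit.operation().value);
    } finally {
      // The next snapshot covers this state, so release the commit immediately.
      commit.close();
    }
  }

  @Override
  public void snapshot(SnapshotWriter writer) {
    writer.writeObject(map); // whole-state snapshot
  }

  @Override
  public void install(SnapshotReader reader) {
    map = reader.readObject();
  }
}
```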
Gardner Vickers
@gardnervickers
Jun 17 2016 22:12
Awesome, thank you for your time. That clarifies things a bit.
Jordan Halterman
@kuujo
Jun 17 2016 22:12
:+1:
Jordan Halterman
@kuujo
Jun 17 2016 22:18
Basically, commands applied to a StateMachine that implements Snapshottable default to CompactionMode.SNAPSHOT. That means commands are retained until the next snapshot and until events published by the state machine up to that index are acknowledged by all clients to ensure fault tolerance for events. If a state machine doesn’t implement Snapshottable, commands default to CompactionMode.QUORUM and the state machine is responsible for calling release/close on Commits applied to the state machine to indicate when it’s safe to compact the command from the log. But for those types of state machines, if the state machine doesn’t properly set the CompactionMode for a tombstone it could result in inconsistent state. If a tombstone is only applied on a majority of servers and not all servers, the state may be retained e.g. on a partitioned server if the tombstone is compacted from the log too soon and thus is never applied on that server. So CompactionMode.TOMBSTONE is used to ensure those types of commands are handled properly. All that’s in the architecture documentation on log compaction, but it’s pretty heavy.
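In code, that choice lives on the command itself via compaction() (RemoveCommand is an illustrative name; CompactionMode.TOMBSTONE is the constant mentioned above, as I recall the Copycat 1.x Command interface):

```java
import io.atomix.copycat.Command;

// A delete-style command: because it erases state, it must be retained in the
// log and replicated until it has been applied on all servers, not just a
// majority, so it declares itself a tombstone.
public class RemoveCommand implements Command<Object> {
  final String key;

  public RemoveCommand(String key) {
    this.key = key;
  }

  @Override
  public CompactionMode compaction() {
    return CompactionMode.TOMBSTONE;
  }
}
```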
Gardner Vickers
@gardnervickers
Jun 17 2016 22:22
If I’m reading it right, I’d need everything to be snapshotted SEQUENTIAL to replicate what I’m doing in ZooKeeper, i.e. a log with [1, 2, 3, 4, 5] should never see 3 snapshotted before 1 or 2.
Jordan Halterman
@kuujo
Jun 17 2016 22:24
let me think about this...
Gardner Vickers
@gardnervickers
Jun 17 2016 22:27
That might have been a bad way of putting it; essentially I need a snapshotting/compaction mode where the resulting snapshot is a fold/reduce over the log to the most recent value that is on a majority of quorum nodes.
No worries gotta run for a bit
Jordan Halterman
@kuujo
Jun 17 2016 22:30
k