These are chat archives for atomix/atomix
I think we gotta start from scratch here. I’ll take parts of our fault injection/verification tests, but everything else would probably be easier to just rewrite.
The test framework is done enough now to start using it. It does node crashes, various partitions, memory/cpu/disk stress tests, etc.
I think the thing to do is just start by writing primitive tests. The Python client is very incomplete, and writing primitive tests will force the client to be completed and cleaned up. Also should get the fault injection test done since it’s good for finding general problems.
Then I’ll start adding stress tests, which I’m actually doing now. My current work priority is profiling, so I’m writing tests to verify that the cluster stabilizes after periods of high CPU/memory/disk/network usage and timeouts don’t lead to memory leaks when retrying operations.
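A rough sketch of how the "timeouts don't leak memory on retry" check could look, using only Python's standard `tracemalloc` module. The `retry` and `memory_growth` helpers are hypothetical names made up for illustration, not part of the test framework or the Python client, and the workload here is a local stand-in rather than a real cluster call.

```python
import gc
import tracemalloc


def retry(operation, attempts=3):
    """Hypothetical retry helper: re-run `operation` when it times out."""
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except TimeoutError as error:
            last_error = error
    raise last_error


def memory_growth(workload, iterations=1000):
    """Return the change in traced heap size (bytes) across repeated runs."""
    tracemalloc.start()
    try:
        workload()  # warm up so one-time allocations are not counted
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            workload()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


def always_times_out():
    # Stand-in for a cluster operation issued during a partition.
    raise TimeoutError("simulated timeout")


def retried_workload():
    try:
        retry(always_times_out)
    except TimeoutError:
        pass  # the caller gives up; nothing should stay allocated
```

A stress test would then assert that `memory_growth(retried_workload)` stays under some threshold. The right threshold is a judgment call and depends on what the real client allocates per operation.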
FYI here’s the fault injection/verification test I wrote for ONOS:
It just randomly kills/partitions nodes and attempts writes/reads on a value, then records the history of operations on the register and runs it through the Knossos linearizability checker.
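To make the shape of that test concrete, here's a minimal single-threaded sketch of the record-a-history part. Everything in it is an assumption for illustration: `FlakyRegister` is an in-memory stand-in that raises `TimeoutError` at random instead of a real cluster with killed/partitioned nodes, and the history entries are only loosely modeled on the invoke/ok/fail/info style of operation log that a linearizability checker consumes. It is not the ONOS test itself.

```python
import random


class FlakyRegister:
    """Hypothetical in-memory stand-in for a cluster-backed register."""

    def __init__(self, rng, failure_rate=0.2):
        self._value = None
        self._rng = rng
        self._failure_rate = failure_rate

    def _maybe_fail(self):
        # Simulates a node crash or partition as a client-side timeout.
        if self._rng.random() < self._failure_rate:
            raise TimeoutError("simulated crash or partition")

    def write(self, value):
        self._maybe_fail()
        self._value = value

    def read(self):
        self._maybe_fail()
        return self._value


def run_test(num_ops=20, seed=0):
    """Issue random reads/writes and record an invoke/completion history."""
    rng = random.Random(seed)
    register = FlakyRegister(rng)
    history = []
    process = 0
    for _ in range(num_ops):
        f = rng.choice(["read", "write"])
        value = rng.randint(0, 9) if f == "write" else None
        history.append({"process": process, "type": "invoke", "f": f, "value": value})
        try:
            if f == "write":
                register.write(value)
            else:
                value = register.read()
            history.append({"process": process, "type": "ok", "f": f, "value": value})
        except TimeoutError:
            if f == "read":
                # A failed read has no side effects, so it definitely did not happen.
                history.append({"process": process, "type": "fail", "f": f, "value": None})
            else:
                # A timed-out write may or may not have applied: indeterminate.
                history.append({"process": process, "type": "info", "f": f, "value": value})
                process += 1  # treat the old logical process as crashed
    return history
```

The resulting list would then be serialized and handed to the checker. The distinction between `fail` and `info` matters because a timed-out write can't be assumed to have failed.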