These are chat archives for atomix/atomix

23rd
Aug 2018
suimi
@suimi
Aug 23 2018 02:34
i got a error when restart the node:
```
ERROR SegmentedJournal - Journal appears to be corrupted! raft-group/1/partitions/1/raft-group-partition-1-2.log is not aligned with the prior segment raft-group/1/partitions/1/raft-group-partition-1-1.log [,,]
-rw-rw-r-- 1 admin admin 16777216 Aug 23 10:28 raft-group-partition-1-1.log
-rw-rw-r-- 1 admin admin 10485760 Aug 23 10:26 raft-group-partition-1-2.log
-rw-rw-r-- 1 admin admin      128 Aug 23 10:20 raft-group-partition-1.conf
-rw-rw-r-- 1 admin admin      128 Aug 23 10:25 raft-group-partition-1.meta
suimi
@suimi
Aug 23 2018 02:57
2018-08-23 10:26:20,759 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=507, prevLogTerm=2, entries=0, commitIndex=507} [,,]
2018-08-23 10:26:20,759 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=507} [,,]
2018-08-23 10:26:20,975 [raft-client-raft-group-partition-1-6] TRACE RaftSessionConnection - SessionClient{4}{type=com.higgs.trust.consensus.atomix.core.primitive.CommandPrimitiveType@36f6e521, name=CommandPrimitiveType} - Sending CommandRequest{session=4, sequence=176, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=49082, hash=272872877}}} to 0 [,,]
2018-08-23 10:26:21,020 [raft-client-raft-group-partition-1-1] TRACE RaftSessionConnection - SessionClient{4}{type=com.higgs.trust.consensus.atomix.core.primitive.CommandPrimitiveType@36f6e521, name=CommandPrimitiveType} - Sending CommandRequest{session=4, sequence=177, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=196, hash=1587331156}}} to 0 [,,]
2018-08-23 10:26:21,124 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=507, prevLogTerm=2, entries=1, commitIndex=507} [,,]
2018-08-23 10:26:21,259 [raft-server-raft-group-partition-1] DEBUG SegmentedJournal - Created disk segment: JournalSegment{id=2, version=1, index=508, size=64} [,,]
2018-08-23 10:26:21,259 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Appended Indexed{index=508, entry=CommandEntry{term=2, timestamp=2018-08-23 10:26:20,986, session=4, sequence=176, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=49082, hash=272872877}}}} [,,]
2018-08-23 10:26:21,264 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=508} [,,]
2018-08-23 10:26:21,264 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=508, prevLogTerm=2, entries=1, commitIndex=507} [,,]
2018-08-23 10:26:21,264 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Appended Indexed{index=509, entry=CommandEntry{term=2, timestamp=2018-08-23 10:26:21,124, session=4, sequence=177, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=196, hash=1587331156}}}} [,,]
2018-08-23 10:26:21,264 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=509} [,,]
2018-08-23 10:26:21,265 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=509, prevLogTerm=2, entries=0, commitIndex=509} [,,]
2018-08-23 10:26:21,265 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Committed entries up to index 509 [,,]
suimi
@suimi
Aug 23 2018 03:32
image.png
hear is the segment detail
suimi
@suimi
Aug 23 2018 03:47
2018-08-23 10:25:15,025 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=412, prevLogTerm=2, entries=1, commitIndex=412} [,,]
2018-08-23 10:25:15,025 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Appended Indexed{index=413, entry=CommandEntry{term=2, timestamp=2018-08-23 10:25:15,025, session=4, sequence=89, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=196, hash=1587331156}}}} [,,]
2018-08-23 10:25:15,025 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=413} [,,]
2018-08-23 10:25:15,033 [raft-client-raft-group-partition-1-2] TRACE RaftSessionConnection - SessionClient{4}{type=consensus.atomix.core.primitive.CommandPrimitiveType@36f6e521, name=CommandPrimitiveType} - Received CommandResponse{status=OK, index=413, eventIndex=4, result=null} from 0 [,,]
2018-08-23 10:25:15,033 [raft-client-raft-group-partition-1-2] TRACE RaftSessionSequencer - SessionClient{4}{type=consensus.atomix.core.primitive.CommandPrimitiveType@36f6e521, name=CommandPrimitiveType} - Completing response CommandResponse{status=OK, index=413, eventIndex=4, result=null} [,,]
2018-08-23 10:25:15,256 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=413, prevLogTerm=2, entries=0, commitIndex=413} [,,]
2018-08-23 10:25:15,256 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Committed entries up to index 413 [,,]
2018-08-23 10:25:15,256 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=413} [,,]
2018-08-23 10:25:15,256 [raft-server-raft-group-partition-1-state] TRACE RaftServiceManager - RaftServer{raft-group-partition-1} - Applying Indexed{index=413, entry=CommandEntry{term=2, timestamp=2018-08-23 10:25:15,025, session=4, sequence=89, operation=PrimitiveOperation{id=DefaultOperationId{id=submit, type=COMMAND}, value=byte[]{length=196, hash=1587331156}}}} [,,]
2018-08-23 10:25:15,257 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=413, prevLogTerm=2, entries=0, commitIndex=413} [,,]
2018-08-23 10:25:15,257 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=413} [,,]
2018-08-23 10:25:15,507 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=413, prevLogTerm=2, entries=0, commitIndex=413} [,,]
2018-08-23 10:25:15,507 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=413} [,,]
2018-08-23 10:25:15,508 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=413, prevLogTerm=2, entries=0, commitIndex=413} [,,]
2018-08-23 10:25:15,508 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Sending AppendResponse{status=OK, term=2, succeeded=true, lastLogIndex=413} [,,]
2018-08-23 10:25:15,756 [raft-server-raft-group-partition-1] TRACE FollowerRole - RaftServer{raft-group-partition-1}{role=FOLLOWER} - Received AppendRequest{term=2, leader=0, prevLogIndex=413, prevLogTerm=2, entries=0, commitIndex=413} [,,]
suimi
@suimi
Aug 23 2018 06:48
i checked the code, the log writer maybe not flush at node stop. i found there have the flush config flushOnCommit with the default value false.
public class RaftStorageConfig {
  private static final String DATA_PREFIX = ".data";
  private static final StorageLevel DEFAULT_STORAGE_LEVEL = StorageLevel.DISK;
  private static final int DEFAULT_MAX_SEGMENT_SIZE = 1024 * 1024 * 32;
  private static final int DEFAULT_MAX_ENTRY_SIZE = 1024 * 1024;
  private static final boolean DEFAULT_FLUSH_ON_COMMIT = false;

  private String directory;
  private StorageLevel level = DEFAULT_STORAGE_LEVEL;
  private int maxEntrySize = DEFAULT_MAX_ENTRY_SIZE;
  private long segmentSize = DEFAULT_MAX_SEGMENT_SIZE;
  private boolean flushOnCommit = DEFAULT_FLUSH_ON_COMMIT;
the property flushOnCommit not allowed config at RaftPartitionGroup.
Johno Crawford
@johnou
Aug 23 2018 07:04
IMO it should flush on stop
suimi
@suimi
Aug 23 2018 07:12
i used kill to stop node
Johno Crawford
@johnou
Aug 23 2018 07:13
@kuujo maybe in io.atomix.storage.journal.JournalSegment#close we should be flushing before close?
@suimi actually, how did you integrate atomix
are you using the agent
or are you building an atomix instance in your own application
if the latter, you could try this
suimi
@suimi
Aug 23 2018 07:16
started the atomix instance at my application
Johno Crawford
@johnou
Aug 23 2018 07:16
Atomix.builder()
            .withShutdownHookEnabled()
..
suimi
@suimi
Aug 23 2018 07:16
ok
Johno Crawford
@johnou
Aug 23 2018 07:16
if you don't have a shutdown hook already?
suimi
@suimi
Aug 23 2018 07:16
let me try
Johno Crawford
@johnou
Aug 23 2018 07:18
later on you probably want to control the order of shutdown with your components
are you using spring or any other framework?
suimi
@suimi
Aug 23 2018 07:18
using spring
Johno Crawford
@johnou
Aug 23 2018 07:20
what interacts with atomix? do you have a web service of some sort?
or tcp server?
suimi
@suimi
Aug 23 2018 07:24
no, i override the AtomixRegistry, integrate the atomix and spring together
Johno Crawford
@johnou
Aug 23 2018 07:25
well i'm sure at some point you close the spring context and cleanup resources when kill is invoked
you should look into calling Atomix.stop in there somewhere
suimi
@suimi
Aug 23 2018 07:28
yea, later will try it and take a look when stop
besides, the property flushOnCommit can't config when building the atomix. it's will open later, or closed not allowed config it?
Johno Crawford
@johnou
Aug 23 2018 07:51
you can, from the Atomix builder
.withStorage(RaftStorage.builder().withFlushOnCommit..
suimi
@suimi
Aug 23 2018 07:58
not found this method at AtomixBuilder, it's in RaftServer
Johno Crawford
@johnou
Aug 23 2018 07:58
ah yes
Jordan Halterman
@kuujo
Aug 23 2018 15:42
Just a missing method on RaftPartitionGroup.Builder. @suimi add it
Jordan Halterman
@kuujo
Aug 23 2018 16:43
@suimi @johnou the logs are flushed when rolled over to a new segment, which is what matters. close() is not called when a node crashes
We can’t rely on something being implemented when the node is gracefully shutdown
If you do enable flushOnCommit you have to also increase timeouts because flushing can take seconds on some systems
Jordan Halterman
@kuujo
Aug 23 2018 17:10
also I’m unable to reproduce this in any tests
@suimi what are your specs? And can you debug the log to see how many entries are missing from the first segment?
in fact I can’t even reproduce it without any flushing
IIRC RandomAccessFile is supposed to flush to the OS basically on every write, so kill -9 <the_atomix_process> shouldn’t even cause this without the flush() calls
Jordan Halterman
@kuujo
Aug 23 2018 17:16
also @suimi does flushOnCommit actually fix it?
if you’re using the default Raft partition group configuration then there’s a lot of data missing from that segment
Johno Crawford
@johnou
Aug 23 2018 23:11
Depends on the os and fs I think
Jordan Halterman
@kuujo
Aug 23 2018 23:12
the JVM has control over when bytes are handed to the OS so it shouldn't
Johno Crawford
@johnou
Aug 23 2018 23:17
Does raf write call sync on the FD?