esonic
@esonic
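@lni Could I ask how to make propose return success as soon as the raft log has been persisted (without waiting for the state machine's Update to finish)? We would like the apply step to run asynchronously in the background, so that writes can return success to the front end sooner.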
Zhou Yicong
@Jackmrzhou
@lni hi, I'm new to dragonboat. I was confused when reading the "ondisk" example, where the state machine has to persist the latest applied index. Why is this necessary? Is there a scenario in which stale entries will go into the update function? I mean these entries have already been processed by the framework.
lni
@lni
@Jackmrzhou the index is kept by the state machine so the Open() method can return it when the state machine is restored (e.g. after a crash). the returned index value is used to prevent already applied entries from being supplied to the update method again.
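The role of the persisted index can be illustrated with a minimal, self-contained sketch. It does not use dragonboat's actual interfaces: the `entry` and `diskSM` types and the `replay` helper are hypothetical stand-ins, and the in-memory fields only imitate state that a real on-disk state machine would write to disk atomically together with its data.

```go
package main

import "fmt"

// entry is a simplified stand-in for a raft log entry (hypothetical type).
type entry struct {
	Index uint64
	Cmd   string
}

// diskSM is a toy "on-disk" state machine. The applied field and the data
// slice imitate state that a real implementation would persist atomically.
type diskSM struct {
	applied uint64
	data    []string
}

// Open reports the index of the last applied entry, so the caller knows
// where replay has to resume after a restart.
func (s *diskSM) Open() uint64 { return s.applied }

// Update applies entries and records each index together with the data.
func (s *diskSM) Update(entries []entry) {
	for _, e := range entries {
		s.data = append(s.data, e.Cmd)
		s.applied = e.Index
	}
}

// replay plays the part of the framework: it only hands over entries whose
// index is above the value returned by Open.
func replay(s *diskSM, log []entry) {
	from := s.Open()
	var pending []entry
	for _, e := range log {
		if e.Index > from {
			pending = append(pending, e)
		}
	}
	s.Update(pending)
}

func main() {
	log := []entry{{1, "a"}, {2, "b"}, {3, "c"}}
	sm := &diskSM{}
	replay(sm, log[:2]) // entries 1 and 2 are applied, then the process "crashes"
	replay(sm, log)     // after the restart only entry 3 is applied
	fmt.Println(sm.applied, sm.data) // prints "3 [a b c]"
}
```

Without the index returned by Open, the second replay would apply entries 1 and 2 a second time.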
lni
@lni
@esonic in my own systems, they need to issue linearizable reads all the time. to do that, your described return-after-commit approach won't help much as current committed entries have to be applied before such linearizable reads. that being said, I agree that what you described can be an extra useful feature for certain applications. any plan to contribute it as a PR? let me know if you are interested, I have some ideas to share.
esonic
@esonic
@lni We are planning to build a consistent message queue system on top of multi raft. High throughput and low latency are priorities, whereas linearizable reads are not. For a message queue, it's OK to respond as soon as the WAL fsync finishes (like Kafka's log append does), apply can be done asynchronously, and the raft log is a good choice to use as the WAL. Maybe we can talk about how to contribute support for such a feature to dragonboat.
lni
@lni
@esonic I've sent you a private message to chat about implementing this new feature.
Nob
@nobsu
image.png
I found that the space occupied by this directory has been increasing, for example, it shows 287M, but the actual files do not add up so much, so where did the space go?
lni
@lni
@nobsu it is caused by preallocated spaces used by some rocksdb files
Nob
@nobsu
How to release the preallocated space? Is there any parameter recommendation for the production environment?
lni
@lni
@nobsu I've sent you a private message to get more details
Jeremy Hahn
@jeremyhahn
does dragonboat currently perform any optimizations regarding network traffic (similar to multiraft in cockroachdb)?
excellent library, btw!
lni
@lni
@jeremyhahn could you be more specific?
Jeremy Hahn
@jeremyhahn
sure, im referring to network level optimizations to avoid an explosion in network traffic with each new raft group added to the server. https://www.cockroachlabs.com/blog/scaling-raft/ https://tikv.org/deep-dive/scalability/multi-raft/ cockroachdb/cockroach#20
lni
@lni
@jeremyhahn there are some similar optimizations. e.g. heartbeat messages are batched to make it more efficient to be transmitted & processed. when raft groups are idle, they can also be put into the so called quiesce mode to avoid sending heartbeats.
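As a rough illustration of the two ideas mentioned above, the sketch below groups heartbeats by destination host so that each host receives one batch per tick, and skips groups that are quiesced. It is not dragonboat's implementation: the `group` and `heartbeat` types and the `batchHeartbeats` function are hypothetical names chosen for this example.

```go
package main

import "fmt"

// heartbeat is a simplified per-group heartbeat message (hypothetical type).
type heartbeat struct {
	ClusterID uint64
	To        string // address of the destination host
}

// group describes one raft group led by this host (hypothetical type).
type group struct {
	clusterID uint64
	peers     []string // addresses of the hosts holding follower replicas
	quiesced  bool     // idle groups send no heartbeats
}

// batchHeartbeats builds one batch per destination host instead of one
// network message per group and peer. Quiesced groups contribute nothing.
func batchHeartbeats(groups []group) map[string][]heartbeat {
	batches := make(map[string][]heartbeat)
	for _, g := range groups {
		if g.quiesced {
			continue
		}
		for _, p := range g.peers {
			batches[p] = append(batches[p], heartbeat{ClusterID: g.clusterID, To: p})
		}
	}
	return batches
}

func main() {
	groups := []group{
		{clusterID: 1, peers: []string{"host-b", "host-c"}},
		{clusterID: 2, peers: []string{"host-b", "host-c"}},
		{clusterID: 3, peers: []string{"host-b", "host-c"}, quiesced: true},
	}
	batches := batchHeartbeats(groups)
	// Two network sends (one per host) carry all four heartbeats.
	fmt.Println(len(batches), len(batches["host-b"]), len(batches["host-c"]))
}
```

With this shape the number of network sends per tick grows with the number of hosts rather than with the number of raft groups.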
令狐少侠
@blackfox1983
hi, when is v3.3 to be released? I see that the CGO option can be disabled from version 3.3. thanks. @lni
lni
@lni
hi @blackfox1983, I will wait for at least a couple more months to allow pebble to be better tested. please note that there is currently no known issue relating to pebble, it has been extensively tested for many months. you can definitely start playing with the master HEAD and expect v3.3 in Oct. or Nov.
令狐少侠
@blackfox1983
got it. thanks @lni. Looking forward to the release of version 3.3.
This is the best raft library I've ever seen
Иван Сердюк
@oceanfish81_twitter
Hi there
Иван Сердюк
@oceanfish81_twitter
Why am I getting this
$ go test ./...
go: finding module for package github.com/petermattis/pebble
go: found github.com/petermattis/pebble in github.com/petermattis/pebble v0.0.0-20200710160639-c9a380a7f499
go: github.com/lni/dragonboat/v3/internal/logdb/kv/pebble imports
github.com/petermattis/pebble: github.com/petermattis/pebble@v0.0.0-20200710160639-c9a380a7f499: parsing go.mod:
module declares its path as: github.com/cockroachdb/pebble
but was required as: github.com/petermattis/pebble
?
Seth Yates
@sethyates
Hi. Is there any way to detect and remove a failed node? It appears the leader just keeps trying to contact the failed node and doesn't actually remove it from the cluster. And I can't find an event, channel or anything where I could do this myself.
@lni
Seth Yates
@sethyates
Will ISystemEventListener do this for me?
lni
@lni
@sethyates you need to implement your own mechanism to detect failed nodes. to remove a node from a raft cluster, you can use the membership change method NodeHost.RequestDeleteNode for that.
I will check the godoc to make sure this is mentioned
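One possible shape for such a home-grown detection mechanism is sketched below. Everything in it is an assumption made for illustration: the `detector` type, the `remover` interface and the use of plain Unix-second timestamps are not part of dragonboat. In a real deployment the `RemoveNode` implementation would be the place to call NodeHost.RequestDeleteNode and wait for its result.

```go
package main

import (
	"fmt"
	"sort"
)

// remover is a hypothetical interface; a real implementation would wrap the
// membership change call of the raft library in use.
type remover interface {
	RemoveNode(nodeID uint64) error
}

// detector marks a node as failed when it has not been seen for longer
// than timeoutSec. Timestamps are Unix seconds to keep the sketch simple.
type detector struct {
	timeoutSec int64
	lastSeen   map[uint64]int64
}

func newDetector(timeoutSec int64) *detector {
	return &detector{timeoutSec: timeoutSec, lastSeen: make(map[uint64]int64)}
}

// Observe records that nodeID was alive at nowSec (e.g. it answered a ping).
func (d *detector) Observe(nodeID uint64, nowSec int64) {
	d.lastSeen[nodeID] = nowSec
}

// Failed returns the sorted IDs of nodes that have been silent for too long.
func (d *detector) Failed(nowSec int64) []uint64 {
	var failed []uint64
	for id, seen := range d.lastSeen {
		if nowSec-seen > d.timeoutSec {
			failed = append(failed, id)
		}
	}
	sort.Slice(failed, func(i, j int) bool { return failed[i] < failed[j] })
	return failed
}

// Sweep asks the remover to drop every failed node. Nodes whose removal
// fails stay tracked, so the next sweep tries again.
func (d *detector) Sweep(nowSec int64, r remover) []uint64 {
	var removed []uint64
	for _, id := range d.Failed(nowSec) {
		if err := r.RemoveNode(id); err != nil {
			continue
		}
		delete(d.lastSeen, id)
		removed = append(removed, id)
	}
	return removed
}

// recordingRemover is a test double that only records the requests.
type recordingRemover struct{ removed []uint64 }

func (r *recordingRemover) RemoveNode(id uint64) error {
	r.removed = append(r.removed, id)
	return nil
}

func main() {
	d := newDetector(10)
	for _, id := range []uint64{1, 2, 3} {
		d.Observe(id, 100)
	}
	d.Observe(1, 120)
	d.Observe(2, 120) // node 3 stays silent
	fmt.Println(d.Sweep(125, &recordingRemover{})) // prints "[3]"
}
```

A fixed timeout is the simplest possible policy. It can remove a node that is merely slow, so the threshold needs to be chosen with the expected restart and network delays in mind.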
Igor
@ikgo
hi, any updates about 3.3?
lni
@lni
@ikgo v3.3 is expected to be released in a few weeks
Igor
@ikgo
hi, how can I remove a failed node in cases where the cluster doesn't have consensus and the failed node can't be restarted with the same IP?
NodeHost.RequestDeleteNode and RequestAddNode return an error when the cluster is not operational.
for example:
  • a 3-node cluster is up and running.
  • two nodes crashed
  • two new nodes started, but with new IPs
  • how to restore the cluster in this situation?
lni
@lni
@ikgo please open a new issue ticket for your questions above, will be happy to help you there in the issues section. thanks
Seth Yates
@sethyates
What does "[00100:00018] no snapshot available during launch" mean? We're getting a case where a node in the cluster is not able to come up live. It seems to launch but then just sits there in a "not ready" state
lni
@lni
@sethyates that message means the node you are trying to launch couldn't find any existing snapshot saved on its local storage. If you believe the node launching but getting stuck in a "not ready" state is caused by dragonboat, please be more specific: describe what happened, what you observed, and what makes you believe dragonboat is the cause. please open a new issue with all this supporting info and we can have a look together. thanks.
Seth Yates
@sethyates
thanks. it was a brand new node. how would it have any snapshot in its local storage if it is new? When I say "not ready", I mean Pending=true. Is it possible that this would have been caused by a problem adding the node to the cluster before starting its NodeHost?
lni
@lni
a new node is not expected to have any snapshot in its local storage. the "no snapshot available during launch" message just confirms it and records a message in the log for debugging purposes. Starting NodeHost first is totally fine, but I don't quite understand what that "pending=true" means. what pending variable/flag are you referring to?
Seth Yates
@sethyates
lni
@lni
@sethyates please open a new issue on this one. please specify how you started that node, whether it is a brand new node or an existing node recovering from a crash, and what the content of the initialMembers parameter passed to the StartCluster method was. thanks
Seth Yates
@sethyates
we added a retry/backoff on the code that was adding the node to the cluster and haven't seen this error since then, so it was a problem adding the node to the cluster before starting NodeHost as suspected.
thanks for the help
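The retry/backoff approach described above might look roughly like the following sketch. It is a generic helper, not the code that was actually used: the `retry` function, the `errNotReady` error and the closure standing in for the "add node" call are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical error standing in for whatever the
// membership change call returns while the cluster cannot accept it yet.
var errNotReady = errors.New("cluster not ready")

// retry calls op until it succeeds or attempts calls have been made,
// doubling the delay between consecutive calls (exponential backoff).
func retry(attempts int, base time.Duration, op func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	// The closure imitates an "add node" request that fails twice before
	// the cluster accepts it.
	err := retry(5, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println(calls, err) // prints "3 <nil>"
}
```

Production code would usually also cap the maximum delay and add some jitter, both of which are left out here for brevity.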
what does "dropped an incoming message" mean and what might cause that to happen?
I can only find this in nodehost.go with the following code:
    if added, stopped := n.mq.Add(req); !added || stopped {
        plog.Warningf("dropped an incoming message")
    } else {
        msgCount++
    }
is this expected behaviour or is something in our setup potentially incorrect?
lni
@lni
that warning message means your node is receiving more messages than its processing capacity. you don't need to worry about it if it only happens occasionally
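The behaviour behind that warning can be pictured with a bounded queue whose Add method reports whether the message was accepted. The sketch below only mirrors the `added, stopped` return values visible in the snippet quoted above. The `queue` type itself, its limit and its string messages are assumptions, not dragonboat's actual message queue.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a bounded, mutex-protected message queue (hypothetical type).
type queue struct {
	mu      sync.Mutex
	items   []string
	limit   int
	stopped bool
}

func newQueue(limit int) *queue { return &queue{limit: limit} }

// Add appends m unless the queue is full or stopped. A full queue rejects
// the message instead of blocking, which is what makes a drop possible.
func (q *queue) Add(m string) (added bool, stopped bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.stopped {
		return false, true
	}
	if len(q.items) >= q.limit {
		return false, false
	}
	q.items = append(q.items, m)
	return true, false
}

// Drain hands all queued messages to the consumer and empties the queue.
func (q *queue) Drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := q.items
	q.items = nil
	return out
}

// Stop makes every later Add fail with stopped set to true.
func (q *queue) Stop() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.stopped = true
}

func main() {
	q := newQueue(2)
	dropped := 0
	for _, m := range []string{"m1", "m2", "m3"} {
		if added, stopped := q.Add(m); !added || stopped {
			dropped++ // the consumer has not drained the queue in time
		}
	}
	fmt.Println(dropped, q.Drain()) // prints "1 [m1 m2]"
}
```

Messages are dropped only while the consumer lags behind the producer, which matches the advice that occasional warnings are harmless while frequent ones point to an overloaded node.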
Igor
@ikgo
Hi, we use dragonboat in a k8s cluster. When a node crashes it is restored with the same IP, so gossip (implemented on top of hashicorp membership) waits more than 1 min to invalidate the IP.
I am preparing a PR that adds timeout settings for configuring membership.
@lni which would be better:
  • add timeout configuration to GossipConfig
  • add timeout configuration to ExpertConfig
lebovski
@lebovski
My question is, can I use Raft's built-in storage to store my data?
Do you have an example?
lni
@lni
@ikgo and @lebovski sorry that I didn't see your messages until now, for some strange reason I didn't get an email notification for them.
@ikgo please use GossipConfig for such timeout settings, thanks
@lebovski raft's built in storage logdb is used for storing raft logs, it is a private store for dragonboat.
btw, please feel free to raise an issue when you can't get a response here or via email. thanks!