Wouldn’t that always be the value passed to
`Refill.intervally` in that example?
No. The capacity of the bucket is an independent setting. With the token-bucket algorithm you have three configuration points. Conceptually, the algorithm works as follows:
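To illustrate why capacity is independent of the refill rate, here is a minimal, self-contained sketch of the token-bucket concept. This is not Bucket4j's implementation; it just shows the three configuration points (capacity, tokens per refill, refill period) as separate knobs:

```java
import java.time.Duration;

// Conceptual token bucket: capacity, refillTokens, and refillPeriod
// are three independent configuration points.
final class SimpleTokenBucket {
    private final long capacity;          // max tokens the bucket can hold (burst limit)
    private final long refillTokens;      // tokens added per refill period
    private final long refillPeriodNanos; // length of one refill period
    private double availableTokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, long refillTokens, Duration refillPeriod) {
        this.capacity = capacity;
        this.refillTokens = refillTokens;
        this.refillPeriodNanos = refillPeriod.toNanos();
        this.availableTokens = capacity; // start full
        this.lastRefillNanos = System.nanoTime();
    }

    synchronized boolean tryConsume(long tokens) {
        refill();
        if (availableTokens >= tokens) {
            availableTokens -= tokens;
            return true;
        }
        return false;
    }

    // Lazily add tokens proportional to the time elapsed, capped at capacity.
    private void refill() {
        long now = System.nanoTime();
        double elapsedPeriods = (now - lastRefillNanos) / (double) refillPeriodNanos;
        availableTokens = Math.min(capacity, availableTokens + elapsedPeriods * refillTokens);
        lastRefillNanos = now;
    }
}
```

A bucket with capacity 2 but a refill rate of 1000 tokens per minute can still never hand out more than 2 tokens in a burst, which is why capacity is not simply "the value passed to `Refill.intervally`".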
So, according to your question:
We have to implement a rate limiter in a distributed environment.
In order to make these rate-limit determinations with minimal latency, it’s necessary to make checks locally in memory. This can be done by relaxing the rate-check conditions and using an eventually consistent model. For example, each node can run a data-sync cycle that synchronizes with the centralized data store. Each node periodically pushes a counter increment for each consumer and window it saw to the datastore, which atomically updates the values. The node then retrieves the updated values to refresh its in-memory version. This cycle of converge → diverge → reconverge among nodes in the cluster is eventually consistent.
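The push/pull sync cycle described above can be sketched as follows. This is an illustrative model, not a real datastore client: `CentralStore` stands in for the centralized datastore, and all class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for the centralized datastore: atomically merges node increments.
final class CentralStore {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    // Atomically add a node's local delta and return the new global value.
    long addAndGet(String key, long delta) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
    }
}

// Per-node state: cheap in-memory increments plus a periodic sync cycle.
final class NodeCounter {
    private final CentralStore store;
    private final ConcurrentHashMap<String, AtomicLong> localDeltas = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Long> globalView = new ConcurrentHashMap<>();

    NodeCounter(CentralStore store) { this.store = store; }

    // Called on every request: purely in-memory, no network round-trip.
    void increment(String key) {
        localDeltas.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    // Rate check against the (possibly stale) global view plus unsynced local increments.
    long observedCount(String key) {
        long global = globalView.getOrDefault(key, 0L);
        long local = localDeltas.getOrDefault(key, new AtomicLong()).get();
        return global + local;
    }

    // Periodic sync cycle: push local deltas, pull back the merged global value.
    void sync() {
        for (String key : localDeltas.keySet()) {
            long delta = localDeltas.get(key).getAndSet(0);
            globalView.put(key, store.addAndGet(key, delta));
        }
    }
}
```

Between sync cycles a node may undercount traffic seen by other nodes, which is exactly the "diverge" phase; after every node has pushed and pulled, the views reconverge.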
AFAIK, all the caching libraries linked here (Ignite, Hazelcast, etc.) provide distributed caching, where the cache key map is distributed across all the nodes in the cluster. With this, a request to consume a token that arrives at a node other than the one on which that particular bucket is stored will still require communicating with the "primary node" of that key and updating the value.
Is there any mechanism in Bucket4j by which the above requirement can be solved? I want to avoid reaching out to the "primary node" of that key every time just to know whether the bucket has a token.
Bucket4j will solve the over-communication problem in another way: for the upcoming
5.0 release, I have implemented request batching.
The algorithm is simple:
The algorithm above does not solve the latency problem, but it significantly increases throughput. It is easy to achieve millions of operations per second on a single bucket.
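The batching idea can be sketched like this. This is an illustrative combiner, not Bucket4j's actual implementation, and all names (`BatchingExecutor`, `Remote`, `executeBatch`) are hypothetical: the first thread to arrive becomes the "leader" and ships every request that accumulated in the meantime to the primary node in a single round-trip.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative request-batching combiner (hypothetical names, not Bucket4j's API).
final class BatchingExecutor {
    // Stand-in for one network round-trip to the bucket's primary node.
    interface Remote { List<Boolean> executeBatch(List<Long> tokenRequests); }

    private final Remote remote;
    private final List<Long> pending = new ArrayList<>();
    private final List<CompletableFuture<Boolean>> futures = new ArrayList<>();
    private boolean flightInProgress = false;

    BatchingExecutor(Remote remote) { this.remote = remote; }

    CompletableFuture<Boolean> tryConsume(long tokens) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        boolean leader;
        synchronized (this) {
            pending.add(tokens);
            futures.add(result);
            leader = !flightInProgress;   // first arrival becomes the leader
            if (leader) flightInProgress = true;
        }
        if (leader) drainLoop();          // followers just wait on their future
        return result;
    }

    private void drainLoop() {
        while (true) {
            List<Long> batch;
            List<CompletableFuture<Boolean>> batchFutures;
            synchronized (this) {
                if (pending.isEmpty()) { flightInProgress = false; return; }
                batch = new ArrayList<>(pending);
                batchFutures = new ArrayList<>(futures);
                pending.clear();
                futures.clear();
            }
            // ONE round-trip carries every request accumulated while the
            // previous round-trip was in flight.
            List<Boolean> results = remote.executeBatch(batch);
            for (int i = 0; i < results.size(); i++) {
                batchFutures.get(i).complete(results.get(i));
            }
        }
    }
}
```

Each request still pays the round-trip latency to the primary node, but N concurrent requests cost one round-trip instead of N, which is where the throughput gain comes from.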
Thank you @vladimir-bukhtoyarov. If I understand correctly, this is what you are saying:
When using Bucket4j with a distributed cache, say there are 20 nodes and 40 buckets; each node will then primarily host roughly 2 buckets as part of distributed caching. Suppose node 1 hosts bucket1 and bucket2, and nodes 11 through 20 all receive requests involving bucket1. In that case, all 10 of those nodes will reach out to node 1 to get tokens, and you are trying to increase efficiency by reducing the number of calls to the primary node without adding much locking.
But the fact that all 10 nodes talk to the primary node 1 of that bucket will still remain and add to latency, correct?
Bucket4j does not provide deletion functionality; I was talking about the ID that was used during bucket creation. Deletion is easy to achieve via the Cache API:
```java
Bucket bucket = Bucket4j.extension(io.github.bucket4j.grid.ignite.Ignite.class).builder()
    .addLimit(Bandwidth.simple(1_000, Duration.ofMinutes(1)))
    .build(cache, key, RecoveryStrategy.RECONSTRUCT);

// to delete, just remove the item from the cache
cache.remove(key);
```
But there is a tricky moment: if you continue to use the bucket, it will be resurrected on the next interaction with the bucket because of