Dear Community
We have the following setup: a Hazelcast cluster which consists of 3 members, and some microservices which connect to this cluster.
We want to create counters for our Infrastructure Items, so that the value for each Item is incremented or decremented by 1 when some event happens (of course without worrying about concurrency issues).
Our first choice was to use the CP Subsystem, but because we have only 3 members, when one member is lost it cannot form the CP Subsystem cluster and we end up with the error: CP Subsystem error: majority of the CP members are not accessible.
Because we want to use it only for one microservice, we don't want to extend our cluster with another member (memory is valuable for us), and we don't want to enable the CP Subsystem.
In this scenario the natural choice for me was to use an EntryProcessor where I increment/decrement the entry value by one, or return a value equal to 1 if the key doesn't exist, as sketched below.
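A simplified sketch of that idea (not our actual code; the class and map names are just for illustration, and I'm assuming the 4.x EntryProcessor API):
import com.hazelcast.map.EntryProcessor;
import java.util.Map;

// Runs on the partition thread that owns the key, so the
// read-modify-write is atomic without any extra locking.
public class DeltaProcessor implements EntryProcessor<String, Long, Long> {
    private final long delta; // +1 or -1

    public DeltaProcessor(long delta) {
        this.delta = delta;
    }

    @Override
    public Long process(Map.Entry<String, Long> entry) {
        Long current = entry.getValue();
        long updated = (current == null) ? 1L : current + delta;
        entry.setValue(updated); // creates the entry if it didn't exist
        return updated;
    }
}

// usage (illustrative):
// IMap<String, Long> counters = hazelcastInstance.getMap("itemCounters");
// long newValue = counters.executeOnKey(itemId, new DeltaProcessor(+1));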
I read the docs and found an interesting data structure which can fill my needs: PNCounter. However, one thing is not clear to me: I saw I can easily preconfigure my PNCounter with a PNCounterConfig
and set it on my Hazelcast instance Config - that's clear. However, I cannot configure it that way in my case, because our Infrastructure Items are dynamic structures, and when I receive info about a new Item I have to create a counter for it. So in this case I saw I can simply use hazelcastInstance.getPNCounter(name),
which creates or returns a PNCounter. What is not clear to me is which configuration will be applied to a PNCounter that has no predefined PNCounterConfig. The default one? If yes, how can I change the replicaCount property for such a PNCounter to something smaller (we don't want to increase network traffic and decrease the speed at which replica states converge)?
Thanks in advance!
@jerrinot @kubagruszka
can you please elaborate on the restart scenario?
we changed the way we restart our pods. Now it looks like this:
- We have two instances of our application (two pods) deployed from scratch
- hazelcast maps are empty
- we set replicas to 0 so pods are deleted
- When deletion is completed (pods do not exist) we set replicas to 2
- new 2 pods are created with our application (from scratch)
- we test them (42 requests, each performs 2000 put operations; we have now changed from put to set)
- Usually, it works fine (each instance respects the limit)
- when the test is finished we set replicas to 0
- Wait until pods are deleted
- set replicas to 2
- when pods are running (from scratch, empty maps) we repeat the test (the same 42 requests)
- Usually one instance respects the limit, the second one does not
I reproduced it locally (ran two instances of our app in IntelliJ), but instead of setting replicas to 2 (I'm not using Kubernetes locally), I kill the instances by just clicking the stop button in IntelliJ. The other steps are the same.
I also changed the version to 3.12.8 and tested again, but it didn't help, so I reverted this change.
@jerrinot @kubagruszka
do you know if the 2 Hazelcast instances discover each other upon starting, or whether there is a period where each instance forms its own single-member cluster?
yes, I forgot to add one step about how I restart the instances
Members {size:2, ver:2} [
Member [IP1]:PORT- c1186667-1e58-44b2-996b-5ab59859f00e
Member [IP2]:PORT- 511b4976-5f62-413c-a19c-683747ebaf57 this
]
for me it tells me that after these 5 minutes they form one cluster
how do you populate the map? are you using an embedded HazelcastInstance to get a map instance or an external client?
simplifying our code, it looks like this:
HazelcastInstance hazelcast = Hazelcast.getOrCreateHazelcastInstance(config);
MapConfig mapConfig = hazelcast.getConfig().getMapConfig(getMapName());
mapConfig.setEvictionPolicy(EvictionPolicy.LFU);
addIndices(mapConfig);
mapConfig.setMaxSizeConfig(new MaxSizeConfig(25000, MaxSizeConfig.MaxSizePolicy.PER_NODE));
IMap<String, METADATA> cache = hazelcast.getMap(getMapName());
cache.addEntryListener(new MetadataEvictedListener<METADATA>(), true);
and then we have for example
cache.set(key, value, notRefreshableTTL.toSeconds(), TimeUnit.SECONDS, refreshableTTL.toSeconds(), TimeUnit.SECONDS);
Is it possible that the problem is that we configure our map after initializing the HazelcastInstance? Maybe we should configure our map first and only after that call Hazelcast.getOrCreateHazelcastInstance(config)?
I would like to add only that getMapName() returns, for example, "eventsMetadata".
@dawidjasiokwork:
in general it's better to configure data structures statically - before you start an instance - whenever you can. this way data structures are configured right from the start.
however I can see another issue in your code. you are modifying just a local copy of MapConfig, but you are not submitting it to the cluster.
You have to call instance.getConfig().addMapConfig(mapConfig)
to make sure the config is propagated to all cluster members. Otherwise newly joined members might not receive the configuration.
so my advice would be:
instance.getConfig().addMapConfig(mapConfig)
before you start using the map. so something like this:
MapConfig mapConfig = new MapConfig(getMapName());
mapConfig.setEvictionPolicy(EvictionPolicy.LFU);
addIndices(mapConfig);
mapConfig.setMaxSizeConfig(new MaxSizeConfig(25000, MaxSizeConfig.MaxSizePolicy.PER_NODE));
HazelcastInstance hazelcast = Hazelcast.getOrCreateHazelcastInstance(config);
hazelcast.getConfig().addMapConfig(mapConfig); // <-- this will submit the map config to all cluster members and also to members which join the cluster later.
IMap<String, METADATA> cache = hazelcast.getMap(getMapName());
cache.addEntryListener(new MetadataEvictedListener<METADATA>(), true);
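alternatively, if you prefer the static approach I mentioned above, the map config goes onto the Config before the instance starts; roughly like this (a sketch based on your snippet):
MapConfig mapConfig = new MapConfig(getMapName());
mapConfig.setEvictionPolicy(EvictionPolicy.LFU);
addIndices(mapConfig);
mapConfig.setMaxSizeConfig(new MaxSizeConfig(25000, MaxSizeConfig.MaxSizePolicy.PER_NODE));
config.addMapConfig(mapConfig); // static: part of the Config before the instance is created
HazelcastInstance hazelcast = Hazelcast.getOrCreateHazelcastInstance(config);
IMap<String, METADATA> cache = hazelcast.getMap(getMapName());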
@AdamUlatowski: data structures without explicit configuration will use a default configuration. changing a default configuration is as simple as adding a regular configuration named "default". in your case something like this:
<pn-counter name="default">
<replica-count>someNumber</replica-count>
<statistics-enabled>true</statistics-enabled>
</pn-counter>
or if you prefer YAML:
pn-counter:
  default:
    replica-count: some-number
    statistics-enabled: true
alternatively you can use wildcard matching:
<pn-counter name="my*">
<replica-count>someNumber</replica-count>
<statistics-enabled>true</statistics-enabled>
</pn-counter>
Then all PNCounters with a name starting with "my" will use this configuration.
Bear in mind PNCounters are eventually consistent: all replicas will eventually converge to the same value, but this process is asynchronous.
This may or may not be OK, depending on your use-case. For example it is not suitable as an ID generator - as it can generate duplicates - but it's usually a good fit if you are counting occurrences of some event. It's called a counter after all :) The main advantage is that the counters are always available.
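a tiny illustrative example of that counting use-case (the counter name here is made up):
// counting occurrences of an event with a PNCounter (illustrative)
PNCounter counter = hazelcastInstance.getPNCounter("itemEvents");
counter.incrementAndGet();    // an event happened
counter.decrementAndGet();    // an event was compensated
long current = counter.get(); // eventually-consistent read; replicas converge over time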
the Raft-based counters are on the opposite side of the consistency spectrum: they are linearizable. This means when an increment is acknowledged then subsequent read operations are guaranteed to see it. The flip-side is that Raft structures are unavailable when there is no quorum (when half or more of your CP group members are unavailable).
@jerrinot Thank you very much for the clarification. I didn't want to change the default values because it could affect other microservices using PNCounters, so we don't want to put microservice-specific configuration into our shared cluster.
Great that we can use wildcard matching! Could you confirm I can use it from my Hazelcast client side, e.g.:
HazelcastInstance hazelcastInstance = HazelcastClient.newHazelcastClient(clientConfig);
PNCounterConfig pnCounterConfig = new PNCounterConfig("my*")
.setReplicaCount(<someNumber>)
.setStatisticsEnabled(true);
hazelcastInstance.getConfig()
.addPNCounterConfig(pnCounterConfig);
What do you mean by Raft-based counters? Does it mean they are based on the Raft consensus algorithm (the implementation of Java concurrency primitives)?
the snippet looks good to me :thumbsup:
you just have to make sure to submit the configuration before you start using the counter.
indeed, the CP-subsystem is based on the Raft consensus protocol (when you enable it).
So you can have an IAtomicLong backed by the Raft protocol. See the reference manual for details: https://docs.hazelcast.org/docs/4.0.2/manual/html-single/index.html#cp-subsystem
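obtaining one looks roughly like this (illustrative name; it requires the CP Subsystem to be enabled on the members):
// a linearizable counter backed by Raft; unavailable without a CP quorum
IAtomicLong strictCounter = hazelcastInstance.getCPSubsystem().getAtomicLong("strictCounter");
long value = strictCounter.incrementAndGet(); // once acknowledged, subsequent reads are guaranteed to see it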
@pa-vishal what you described is a normal behaviour of a queue: when you have a queue with multiple consumers (listeners) and you submit a new item into the queue then only a single consumer will receive the item. do you observe a different behaviour?
@jerrinot, I absolutely see a different behavior - when I put a message on the IQueue from the server, all 3 listeners in the 3 instances (configured as HazelcastClient) receive the item. Am I missing something?
@pa-vishal I guess I know what the problem is. A queue listener is just a passive observer. It does not consume queue elements. When you have listeners registered on multiple clients then each listener will be invoked for each item.
This is probably not what you want. Not only will multiple listeners be invoked for each item, but the items won't ever be consumed and the queue will leak memory.
You most likely do not want a queue listener, but a queue consumer. Please have a look at the reference manual for how to take items off the queue: https://docs.hazelcast.org/docs/4.0.2/manual/html-single/index.html#taking-items-off-the-queue
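the consumer side is essentially a loop calling take() (a minimal sketch; the queue name and the handle() method are made up):
IQueue<String> queue = hazelcastClient.getQueue("tasks");
try {
    while (true) {
        String item = queue.take(); // blocks until an item is available and removes it from the queue
        handle(item);               // hypothetical processing method; each item reaches exactly one consumer
    }
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // stop consuming when interrupted
}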
@jerrinot
I checked your advice and it looks like it works. Well, almost, because now we are facing another problem.
When I set the max size policy to PER_NODE with 15000 entries, I observe while sending requests to our application that the limit is enforced, so each instance never contains more than this.
For me it looks OK, I see in the logs that entries are evicted.
But when I set USED_HEAP_SIZE to 100MB (does that mean 50MB for entry memory and 50MB for backups, right?) I observe that when each instance reaches 50MB, eviction starts (mostly it happens after 10000 entries per instance). But when I request, for example, another 1000000 entries and we have 1000000 putAsync operations, memory grows beyond 50MB:
summing up the memory on each instance we still have the 100 MB in total, but the distribution is different, with only 16 MB for backups. Is this expected behavior?
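For reference, the only difference from the PER_NODE setup is the max size config; it looks roughly like this (simplified from our code):
// limit the map by used heap size in MB per node instead of by entry count
mapConfig.setMaxSizeConfig(new MaxSizeConfig(100, MaxSizeConfig.MaxSizePolicy.USED_HEAP_SIZE));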
@dawidjasiokwork
I'm glad the threshold is being respected when configured properly.
the primary/backup imbalance looks like a possible bug in eviction. for some reason backup entries are evicted more frequently than primaries.
please open a new ticket at https://github.com/hazelcast/hazelcast/issues/new and describe your scenario. thanks!
hi,
Currently I am using a client-server topology with 5 Java clients. All 5 Java clients are web applications, and I am getting the below error in the logs.
Failed to execute java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@6bb6a27
java.lang.NoClassDefFoundError: com/hazelcast/client/impl/protocol/util/BufferBuilder
Could someone guide me on how to fix this?
The other error, which is getting printed in the logs continuously, is below:
Illegal access: this web application instance has been stopped already. Could not load [com.hazelcast.client.impl.protocol.ClientMessage]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [com.hazelcast.client.impl.protocol.ClientMessage]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.