Matt Hall
@mattall

Regarding locks: I have a replicated object (A) on a cluster of servers that clients will modify. The object itself holds a graph (A.graph), and clients will modify attributes on the edges of that graph (A.graph.edge_color := 'blue'). Each edge will have a lock, so clients updating multiple edges will need to acquire all of those locks before making changes.

The edges of the graph are populated by a method after the object (A) is instantiated. If I want to protect my edges from race conditions across different servers, I need to assign a lock to each edge as the edge is created.

My question is: Should one lock manager oversee all of the locks, or should there be a lock manager for each lock?

quocbao
@baonq-me
Since, as you said, "clients updating multiple edges will need to acquire all locks before making changes", you only need one lock manager.
quocbao
@baonq-me
This problem can be solved easily if you redirect all client requests to a single server that is the Raft leader.
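For illustration, a node can check whether it is currently the leader before deciding to handle or forward a request. This is only a sketch built on the underscore-prefixed introspection helpers; the addresses are examples, and handleRequestLocally / forwardTo are hypothetical application functions:

from pysyncobj import SyncObj

syncObj = SyncObj('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'])

if syncObj._isLeader():
    handleRequestLocally()           # hypothetical: serve the request here
else:
    forwardTo(syncObj._getLeader())  # hypothetical: pass it to the current leader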
Filipp Ozinov
@bakwc

Should one lock manager oversee all of the locks, or should there be a lock manager for each lock?

One lock manager is enough; it can handle multiple locks.

A better solution would be to add a @replicated method to your object that modifies edges, e.g.:
@replicated
def changeEdgeColor(self, edgeID=12345, edgeColor='blue'):
You don't need any lock to execute replicated methods; they are always executed in the same order on all servers (but you need to handle the case where an edge with the given id has already been deleted - just skip the edit, for example - and all edgeID-s should be unique).
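For readers who want a self-contained picture, here is a minimal sketch of that approach; the class name, node addresses, and edge representation are made-up examples and not part of the original discussion:

from pysyncobj import SyncObj, replicated

class GraphHolder(SyncObj):
    def __init__(self, selfAddr, otherAddrs):
        super(GraphHolder, self).__init__(selfAddr, otherAddrs)
        self.__edges = {}  # edgeID -> attribute dict, populated after instantiation

    @replicated
    def changeEdgeColor(self, edgeID, edgeColor):
        edge = self.__edges.get(edgeID)
        if edge is None:
            return  # edge already deleted; skip, as suggested above
        edge['color'] = edgeColor

# One instance per server; calling changeEdgeColor on any node replicates the
# command through Raft, so every replica applies it in the same order.
holder = GraphHolder('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'])
holder.changeEdgeColor(12345, 'blue')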
Matt Hall
@mattall

Thank you both! I am using replicated sync for all of my graph-changing methods.

My server has multiple replicated classes, so forwarding to the leader is not an option, since the leaders of the different classes might not be on the same server. Basically, I let the server that caught the request handle responding to it, and make sure that it uses replicated methods for changing anything across the cluster.

The thing is that with the graph object, I want to make updates to multiple edges atomic. For instance, if multiple clients are trying to change edges on different paths, and those paths overlap at a common edge, then I only want one client to have access to all of those edges at any time.

@replicated_sync
def changeEdgeColor(self, edgeID=12345, edgeColor='blue'):
    ...
    return "Success!"

@replicated_sync
def changeManyEdges(self, list_of_edges=[1, 2, 3]):
    # Acquire locks in sorted order to avoid deadlock between clients.
    list_of_edges.sort()
    for edge in list_of_edges:
        self.__LockManager.tryAcquire(edge)

    for edge in list_of_edges:
        self.changeEdgeColor(edge, 'blue')

    for edge in list_of_edges:
        self.__LockManager.release(edge)

One question here about the lock manager: it seems like tryAcquire does not require me to create the lock myself. Is that true?

if existingLock is None or existingLock[0] == clientID:
    self.__locks[lockID] = (clientID, currentTime)
    return True
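For context, here is a minimal sketch of how a ReplLockManager from pysyncobj.batteries is typically wired up; the addresses and the lock id are made-up examples. Per the acquire logic quoted above, the lock entry is created on the first tryAcquire, which suggests you do not need to create locks yourself:

from pysyncobj import SyncObj
from pysyncobj.batteries import ReplLockManager

lockManager = ReplLockManager(autoUnlockTime=75)  # auto-release if the holder disappears
syncObj = SyncObj('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'],
                  consumers=[lockManager])

# The first tryAcquire for a given id both creates the lock entry and tries to take it.
if lockManager.tryAcquire('edge-12345', sync=True):
    try:
        pass  # ... modify the edge ...
    finally:
        lockManager.release('edge-12345')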
Geert Audenaert
@FastGeert
Hi guys. I'm using pysyncobj in a gevent application where I do monkey.patch_all() as the first line that gets executed. From time to time I get the following stack trace, which messes up the application:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.6/dist-packages/gevent/threading.py", line 177, in run
    super(Thread, self).run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pysyncobj/syncobj.py", line 500, in _autoTickThread
    self._onTick(self.__conf.autoTickPeriod)
  File "/usr/local/lib/python3.6/dist-packages/pysyncobj/syncobj.py", line 577, in _onTick
    res = self.__doApplyCommand(entry[0])
  File "/usr/local/lib/python3.6/dist-packages/pysyncobj/syncobj.py", line 701, in __doApplyCommand
    return self._idToMethod[funcID](*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pysyncobj/syncobj.py", line 1389, in newFunc
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/pysyncobj/batteries.py", line 419, in prolongate
    for lockID in self.__locks.keys():
RuntimeError: dictionary changed size during iteration
Could this be related to the use of gevent?
Filipp Ozinov
@bakwc
Hi, @FastGeert, thanks for the report. It is a bug in our library; I created an issue: bakwc/PySyncObj#102
Filipp Ozinov
@bakwc
Fixed it, try the version from GitHub.
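For anyone hitting the same RuntimeError: it is raised when a dict is mutated while it is being iterated, for example from another thread or greenlet. A common fix for this class of bug, shown here only as an illustration and not necessarily the exact patch applied in PySyncObj, is to iterate over a snapshot of the keys:

# Minimal illustration with a plain dict standing in for self.__locks:
locks = {'lock-a': 0, 'lock-b': 0}
for lockID in list(locks.keys()):   # snapshot the keys before iterating
    del locks[lockID]               # safe: we are no longer iterating the live view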
Geert Audenaert
@FastGeert
Great, will do.
I'm really thrilled by your project, by the way. I was wondering why consumers cannot be added after the SyncObj is created?
MoayedHajiAli
@MoayedHajiAli
Hello everyone.
Hope everyone is fine.
I have a simple question about PySyncObj.
When sending at too high an RPS, the leader fails.
That happens in the provided benchmark as well.
It is weird to me, though, as I would expect more packets to be dropped when sending too many requests, not the leader to fail.
Can anyone please explain why this happens?
Filipp Ozinov
@bakwc
What do you mean by "the leader fails"?
MoayedHajiAli
@MoayedHajiAli
Hi.
Thanks for your fast reply, and sorry for the delay; I did not notice your reply.
I mean that when sending a high RPS in the RPS benchmark, some packets fail, right?
And the FAIL_REASON that is reported is, in a high percentage of cases, 5 (which indicates LEADER_CHANGED).
However, it seems strange to me, because I think the leader should stay the same regardless of the number of packets that are sent.
Given a sufficient timeout, the leader should maintain its state and never change, unless the leader itself fails for some reason.
Am I misunderstanding something?
MoayedHajiAli
@MoayedHajiAli
Another question: do you have an idea of how I might benchmark the throughput of a sync program?
Would sending the max RPS every second and then counting how many responses the client receives every second do?
MoayedHajiAli
@MoayedHajiAli
Hello.
Filipp Ozinov
@bakwc
1) Packets are not lost; they are stored in a queue (it uses TCP).

Would sending the max RPS every second and then counting how many responses the client receives every second do?

yep, that way is ok
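A rough sketch of such a measurement, assuming a small replicated counter class; the addresses, target rate, and duration are made-up example values, and this is not the benchmark script shipped with PySyncObj:

import time
from pysyncobj import SyncObj, replicated, FAIL_REASON

class Counter(SyncObj):
    def __init__(self, selfAddr, otherAddrs):
        super(Counter, self).__init__(selfAddr, otherAddrs)
        self.__value = 0

    @replicated
    def incr(self):
        self.__value += 1

counter = Counter('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'])
acked = [0]

def onDone(result, error):
    # Count only commands the cluster actually acknowledged.
    if error == FAIL_REASON.SUCCESS:
        acked[0] += 1

TARGET_RPS = 1000    # example value, tune to your setup
DURATION = 10        # seconds
start = time.time()
while time.time() - start < DURATION:
    second_start = time.time()
    for _ in range(TARGET_RPS):
        counter.incr(callback=onDone)   # asynchronous; counted when acknowledged
    time.sleep(max(0.0, 1.0 - (time.time() - second_start)))

print('throughput ~= %.1f acknowledged commands/sec' % (acked[0] / float(DURATION)))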

quocbao
@baonq-me
@MoayedHajiAli I think the result will depend on network speed and the IOPS of the storage backend rather than on the algorithm itself.
MoayedHajiAli
@MoayedHajiAli
Thanks for your reply.
But I still do not get why the leader changes when sending a high RPS.
I tracked the leader, and it actually changes sometimes when sending a high number of requests.
For example in the RPS benchmark I get something like this for around 20000 RPS:
SUCCESS RATE: 0.169840984098
AVG DELAY: 4.72234106064
LOST RATE: 0.270159015902
ERRORS STATS: 2
1 0.212185504265
5 0.787814495735
Error number 5 indicates LEADER_CHANGED, which seems weird to me.
I understand that not all the packets will succeed, but why is the leader changing?
Why does the leader crash?
With a normal connection, when sending too many requests, some of them will be dropped or will not receive a response, but the server will never crash.
In our case, the leader crashes or times out, I am not sure which, but the leader changes.
@baonq-me I understand that.
But as I am testing on the same machine (which means the network connection is reliable to some extent), the leader has no reason to change.
Filipp Ozinov
@bakwc
The packet queue fills up; the leader does not have enough time to process all packets, so it can't respond in time.
packet 1 was sent at time 100
packet 2 was sent at time 100
packet 10000 was sent at time 120
After that, the packet that pings the leader was sent,
but the leader was busy processing packets 1-10000, so it answered the ping packet only a minute later.
MoayedHajiAli
@MoayedHajiAli
Hmmm, I understand that, but how is this connected to the change of leader?
Say we have a cluster of 3 servers and server 1 is the leader. After sending 15000 packets, for example, the leader will be busy processing packets 1-10000, and the other 5000 should wait or be answered with QUEUE_FULL (error number 1). But what happens is that a new election is started and server 2, for example, becomes the leader.
Is this normal? Shouldn't the leader maintain its state and never change unless we manually kill it?
Please note that when this happens, neither the server nor the connection crashes, but the leader still changes.
I tried increasing the timeout, but the same problem happened.
Filipp Ozinov
@bakwc
A leader election starts when the leader does not respond to pings. That is what happens when the leader gets a lot of messages: it fails to respond in time.
This is normal.
If the leader is overloaded, it cannot remain leader, because it doesn't have enough processing power to handle all messages; we need another, more powerful leader.
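For reference, the election timeouts that govern this behaviour can be tuned through SyncObjConf. A sketch follows; the values and addresses are illustrative only, and raising the timeouts merely gives an overloaded leader more slack, it does not prevent re-election if the leader stays behind:

from pysyncobj import SyncObj, SyncObjConf

conf = SyncObjConf(
    raftMinTimeout=2.0,   # minimum election timeout, seconds (example value)
    raftMaxTimeout=4.0,   # maximum election timeout, seconds (example value)
)
obj = SyncObj('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'], conf=conf)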
MoayedHajiAli
@MoayedHajiAli
Hmmm, now I get it.
I thought the heartbeat would get priority.
Thank you so much ^_^
schmidtfx
@schmidtfx
Howdy, is there a recommended way to redistribute load if a node goes down? In my scenario, I have n processes running, and the processes open websocket connections. Each process will be responsible for a subset of the required websockets. If a process goes down, I'd like to redistribute its subset to the remaining processes.
Filipp Ozinov
@bakwc
You should use a "distributed lock". A process should acquire the lock before processing a subset of sockets. You should have multiple processes responsible for one subset: one is processing and the others are waiting until the lock is freed.
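A rough sketch of that pattern using ReplLockManager from pysyncobj.batteries; the subset names, node addresses, and processWebsockets are hypothetical, and the polling loop is only one possible shape:

import time
from pysyncobj import SyncObj
from pysyncobj.batteries import ReplLockManager

def processWebsockets(subset):
    pass  # hypothetical application code: serve the websockets in this subset

lockManager = ReplLockManager(autoUnlockTime=75)  # lock auto-releases if its holder dies
syncObj = SyncObj('127.0.0.1:4321', ['127.0.0.1:4322', '127.0.0.1:4323'],
                  consumers=[lockManager])

SUBSETS = ['sockets-a', 'sockets-b', 'sockets-c']  # example subset names

while True:
    for subset in SUBSETS:
        # Whichever process holds the lock serves that subset; the others keep
        # polling, so when the owner dies the subset gets picked up again.
        if lockManager.tryAcquire(subset, sync=True):
            processWebsockets(subset)
    time.sleep(1)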