These are chat archives for RBMHTechnology/eventuate

18th
Jan 2017
Alexander Semenov
@Tvaroh
Jan 18 2017 07:18
@krasserm would it make sense to add this method to ORSet to prevent O(n^2) complexity when removing multiple entries?
  /**
    * Collects all timestamps of given `entries`.
    */
  def prepareRemoveMultiple(entries: Set[A]): Set[VectorTime] =
    versionedEntries.collect { case Versioned(entry, timestamp, _, _) if entries.contains(entry) => timestamp }
Alexander Semenov
@Tvaroh
Jan 18 2017 07:31
it can be done in user code but I'm not sure if it's a good practice to access versionedEntries field directly
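A minimal sketch of the proposal above, for comparison with calling prepareRemove once per entry. `VectorTime`, `Versioned` and `SimpleORSet` here are simplified stand-ins written for illustration (two-field `Versioned`, hypothetical process ids `p1` and `p2`), not Eventuate's actual classes.

```scala
// Simplified stand-ins for illustration; not Eventuate's actual classes.
final case class VectorTime(value: Map[String, Long])
final case class Versioned[A](value: A, vectorTimestamp: VectorTime)

final case class SimpleORSet[A](versionedEntries: Set[Versioned[A]] = Set.empty[Versioned[A]]) {
  def value: Set[A] = versionedEntries.map(_.value)

  def add(entry: A, timestamp: VectorTime): SimpleORSet[A] =
    copy(versionedEntries + Versioned(entry, timestamp))

  // Scans all versioned entries once per call.
  def prepareRemove(entry: A): Set[VectorTime] =
    versionedEntries.collect { case Versioned(`entry`, timestamp) => timestamp }

  // Scans all versioned entries once for the whole batch.
  def prepareRemoveMultiple(entries: Set[A]): Set[VectorTime] =
    versionedEntries.collect { case Versioned(entry, timestamp) if entries.contains(entry) => timestamp }

  def remove(timestamps: Set[VectorTime]): SimpleORSet[A] =
    copy(versionedEntries.filterNot(v => timestamps.contains(v.vectorTimestamp)))
}

object PrepareRemoveMultipleExample {
  val set: SimpleORSet[String] = SimpleORSet[String]()
    .add("a", VectorTime(Map("p1" -> 1L)))
    .add("b", VectorTime(Map("p1" -> 2L)))
    .add("c", VectorTime(Map("p2" -> 1L)))

  // k entries to remove means k scans of the set.
  val perEntry: Set[VectorTime] = Set("a", "c").flatMap(set.prepareRemove)

  // Same timestamps collected in a single scan.
  val batched: Set[VectorTime] = set.prepareRemoveMultiple(Set("a", "c"))

  val remaining: Set[String] = set.remove(batched).value
}
```

Both approaches collect the same timestamps; the batched variant only changes how many times the set is traversed.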
Martin Krasser
@krasserm
Jan 18 2017 09:30
Does that mean RemoveOp(entry: Any, timestamps: Set[VectorTime]) must be changed to RemoveOp(entries: Seq[Any], timestamps: Set[VectorTime])?
Alexander Semenov
@Tvaroh
Jan 18 2017 09:37
yes, or maybe introduce a separate op for the multiple-entries case. I'm not sure how it would affect already persisted, serialized RemoveOps
Martin Krasser
@krasserm
Jan 18 2017 09:45
I'd prefer a change over an operation addition. To be backwards-compatible, the protobuf definition requires a new entries field, and the serializer must be extended in a backwards-compatible way, i.e. it must handle both old and new RemoveOps
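A rough sketch of what handling old and new RemoveOps could look like at the decoding step. `RemoveOpFormat` is a hypothetical plain-Scala model of the wire message (old single `entry` field kept, new repeated `entries` field added), and `RemoveOpCompat` is a hypothetical helper; neither is Eventuate's generated protobuf class or its serializer.

```scala
// Simplified stand-in for illustration; not Eventuate's actual class.
final case class VectorTime(value: Map[String, Long])

// Hypothetical model of the wire message after adding the new field.
final case class RemoveOpFormat(
  entry: Option[Any],          // old field: set only in records persisted before the change
  entries: Seq[Any],           // new field: empty in old records
  timestamps: Set[VectorTime])

// The changed in-memory operation discussed above.
final case class RemoveOp(entries: Seq[Any], timestamps: Set[VectorTime])

object RemoveOpCompat {
  // Old records contribute their single entry, new records their entries list.
  def fromFormat(format: RemoveOpFormat): RemoveOp =
    RemoveOp(format.entry.toSeq ++ format.entries, format.timestamps)

  // New records are always written with the new field only.
  def toFormat(op: RemoveOp): RemoveOpFormat =
    RemoveOpFormat(None, op.entries, op.timestamps)
}
```

The point of the design is that readers never need to know which version wrote a record: both shapes normalize to the same `RemoveOp`.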
Alexander Semenov
@Tvaroh
Jan 18 2017 10:03
got it, I'm still not sure if it's really needed, will see
Martin Krasser
@krasserm
Jan 18 2017 10:09
That's actually a move towards batch-oriented operations. If we do that for RemoveOp we should also consider doing that for AddOp. However, operation dissemination and persistence under the hood are batch-oriented anyway, and one-by-one application of operations happens completely in memory, so I'm not sure about the benefit (except if you have really huge ORSets in memory and removals are high-frequency operations).
Alexander Semenov
@Tvaroh
Jan 18 2017 12:57
@krasserm could you help me understand how the remove method works? We collect the vector timestamps of a set element in prepare (as above) and then remove all elements having these timestamps downstream. The thing I don't quite understand is why other set elements cannot have the same vector timestamps (otherwise it would delete the wrong elements)?
Martin Krasser
@krasserm
Jan 18 2017 13:23

The thing I don't quite understand is why other set elements cannot have the same vector timestamps (otherwise it would delete the wrong elements)?

Sorry, I cannot follow. Can you please elaborate?

Alexander Semenov
@Tvaroh
Jan 18 2017 13:26

Sure, sorry, in prepareRemove we collect all the timestamps of a set element:

  def prepareRemove(entry: A): Set[VectorTime] =
    versionedEntries.collect { case Versioned(`entry`, timestamp, _, _) => timestamp }

Then in remove we drop all elements that have timestamps from the "prepare" step:

  def remove(timestamps: Set[VectorTime]): ORSet[A] =
    copy(versionedEntries.filterNot(versionedEntry => timestamps.contains(versionedEntry.vectorTimestamp)))

What if some other element has the same vector timestamp as the one we're about to delete? I suppose this can't be the case, because otherwise it would be deleted as well.
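The concern can be made concrete with a small sketch. It uses simplified stand-ins for `VectorTime` and `Versioned` (not Eventuate's classes) and an artificial state in which two different elements deliberately share one timestamp, only to show that remove matches on timestamps alone.

```scala
// Simplified stand-ins for illustration; not Eventuate's actual classes.
final case class VectorTime(value: Map[String, Long])
final case class Versioned[A](value: A, vectorTimestamp: VectorTime)

object SharedTimestampExample {
  // Mirrors prepareRemove: collect the timestamps of one element.
  def prepareRemove[A](entries: Set[Versioned[A]], entry: A): Set[VectorTime] =
    entries.collect { case Versioned(`entry`, timestamp) => timestamp }

  // Mirrors remove: drop every entry whose timestamp was collected.
  def remove[A](entries: Set[Versioned[A]], timestamps: Set[VectorTime]): Set[Versioned[A]] =
    entries.filterNot(v => timestamps.contains(v.vectorTimestamp))

  // Artificial state: "x" and "y" deliberately share a timestamp.
  val shared: VectorTime = VectorTime(Map("p1" -> 1L))
  val entries: Set[Versioned[String]] = Set(
    Versioned("x", shared),
    Versioned("y", shared),
    Versioned("z", VectorTime(Map("p1" -> 2L))))

  // Removing "x" also drops "y", because only timestamps are compared.
  val afterRemove: Set[String] = remove(entries, prepareRemove(entries, "x")).map(_.value)
}
```

So the scheme is only correct if no two different elements ever carry the same vector timestamp, which is exactly the assumption being asked about.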

Alexander Semenov
@Tvaroh
Jan 18 2017 13:51
I see, so how do vector timestamps from different locations differ? I'm trying to understand what values I should use in tests. Are the process ids different?
Martin Krasser
@krasserm
Jan 18 2017 13:54
Yes
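For tests, that suggests giving each location its own process id in the timestamps. A minimal sketch, assuming simplified stand-ins for `VectorTime` and `Versioned` (not Eventuate's classes) and hypothetical process ids `A` and `B`:

```scala
// Simplified stand-ins for illustration; not Eventuate's actual classes.
final case class VectorTime(value: Map[String, Long])
final case class Versioned[A](value: A, vectorTimestamp: VectorTime)

object ProcessIdTestValues {
  // Same local time at both locations, but different (hypothetical) process ids.
  val atA: VectorTime = VectorTime(Map("A" -> 1L))
  val atB: VectorTime = VectorTime(Map("B" -> 1L))

  val entries: Set[Versioned[String]] = Set(Versioned("x", atA), Versioned("y", atB))

  // Prepare step for "x", then the timestamp-based removal.
  val toRemove: Set[VectorTime] = entries.collect { case Versioned("x", timestamp) => timestamp }
  val remaining: Set[String] =
    entries.filterNot(v => toRemove.contains(v.vectorTimestamp)).map(_.value)
}
```

Because the two timestamps differ in their process id, removing "x" leaves "y" in place even though both were added at local time 1.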
Alexander Semenov
@Tvaroh
Jan 18 2017 15:35
cool thanks
gabrielgiussi
@gabrielgiussi
Jan 18 2017 15:43
:clap:
Alexander Semenov
@Tvaroh
Jan 18 2017 15:43
still, it's a non-ordered tree
Martin Krasser
@krasserm
Jan 18 2017 15:47
@Tvaroh congrats, that's great news!