by

Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Repo info
Activity
    Stephan Saalfeld
    @axtimwalde
    good, but no sorting?
    tpietzsch
    @tpietzsch
    every object has an index in that memory pool
    the second thing then is wrappers of for example and TIntList (where the ints are indices into the pool) and the wrapper exposes it as a List<(long, int)>
    The important point is that with this construct objects have identity
    you can have two (long, int) objects with both (1,2), and they will not be the same object
    and if you modify one of them to (2,3), the "references" in the List<(long, int)> will reflect that
    this is different than having immutable tuples, where equality is defined on the values
    So, if that matches your use case, I would look at mastodon-collection maybe
    If you would rather have basically an array of (long,int) primitive types, I would look for something different
    (although maybe the mastodon memory-pool could be used to store something like that)
    tpietzsch
    @tpietzsch
    For sorting, you would just "normally" define a Comparator on your objects and sort the List<(long, int)> by that
    like a normal ArrayList
    basically you can think of it as having normal java.util collections but a packed storage for the Objects you put into them
    tpietzsch
    @tpietzsch
    in your scenario, is the long the id and the int a count? because you said "inserting batches of ids at a time which would also update existing entries"...
    and if so, are you looking for a heap-like data structure? or is it a total order?
    Stephan Saalfeld
    @axtimwalde
    it's a heap like structure
    the long is the id
    I want to update existing ids with a new count (actually I want it to be a float)
    when I pop the highest element, the next pop should give me the second highest element, i.e. popping through the entire heap would deliver an ordered list
    Curtis Rueden
    @ctrueden
    I hate that Java's memory management is called "the heap". It makes it functionally impossible to web search for implementations of heap data structures. :disappointed:
    Stephan Saalfeld
    @axtimwalde
    yeah, very ugly
    I also hate that there is no proper struct handling in old java
    they did introduce it though recently, I think, didn't they?
    it'd be so much easier to use proper object logic instead of hacking some magic over primitive arrays or buffers
    Curtis Rueden
    @ctrueden
    https://pdfs.semanticscholar.org/2fc3/a70b3eaf526f7ec51d5be01c8164ca14e80c.pdf :arrow_left: Sounds promising, but no stated license or source code links at first glance. Grr.
    Curtis Rueden
    @ctrueden
    Oh yeah, Valhalla. :+1:
    Michael Doube
    @mdoube
    @axtimwalde take a look at Eclipse Collections, it's pretty full featured and actively developed (and previously well resourced by GS). I don't know without investigating your particular use case, but I understand maps of primitives is what it is meant to be good at.
    Michael Doube
    @mdoube

    I want to update existing ids with a new count (actually I want it to be a float)

    EC seems to handle most primitive pairs: long+float, long+int are in there:

    Michael Doube
    @mdoube
    @axtimwalde Sounds like a sorted tree map? Something like Trove's TDoubleLongMap would give you a memory-efficient hash from double to long... but sadly Trove doesn't include any tree-backed data structures from what I can tell
    EC has sorted tree maps, see the forum post for performance data.
    I will work on a PR
    Stephan Saalfeld
    @axtimwalde
    Thanks @mdoube !
    Hmmm... the tuple pairs are objects and the collections of those are collections of objects
    Stephan Saalfeld
    @axtimwalde
    I am after a solution that stores tuples of primitives in a set of primitive arrays juts like the primitive sets or lists in Trove, fastutils, EC do
    In the spirit of storing a list of 3D double precision points as 3 n-element arrays of doubles instead of one n-elements array of 3-element arrays of doubles, because it saves n - 3 * 52 bytes.
    Stephan Saalfeld
    @axtimwalde
    or, to avoid cache misses, chunks of reasonable size, but that's details
    Michael Doube
    @mdoube
    I made a PR for scijava/pom-scijava, is there anything else to add, @ctrueden ?
    (the pathway from declaring dependencies in pom.xml to having artefacts available on the updater is still a bit obscure to me)
    Michael Doube
    @mdoube

    In the spirit of storing a list of 3D double precision points as 3 n-element arrays of doubles instead of one n-elements array of 3-element arrays of doubles, because it saves n - 3 * 52 bytes.

    Good point. http://dx.doi.org/10.1145/3030207.3030221 this may help, certainly Object-bloat reduction & reduced memory footprint is one of the main goals of these projects.

    Stephan Saalfeld
    @axtimwalde
    Right, but there are no struct collections in any of those projects, or am I missing it?
    there are single primitive collections and primitive maps (sometimes in combination with objects for completeness)
    Imagine a set of a struct of {int, double, double, long} whose hash method uses the int
    Michael Doube
    @mdoube
    No idea, sorry, my use case was much simpler, just lots of .add() of int to a HashSet
    Stephan Saalfeld
    @axtimwalde
    or, for my use case, a sorted hash set of structs {long, float} that is sorted by the float and uses the long to do the hash and equals check
    Ok, thanks for this comparison though, very interesting!
    I always used Trove because simple and complete, but it doesn't look to good here.
    Michael Doube
    @mdoube
    ja, I think that Trove has been 'Eclipsed'