These are chat archives for non/algebra

14th
Dec 2015
Alec Zorab
@AlecZorab
Dec 14 2015 18:02
@rklaehn if I get a bit of spare time, would you accept a pr on abc that added a NegatableBitSet? It's more or less the one missing piece I need for the horrible mess I'm working on at the moment :)
Rüdiger Klaehn
@rklaehn
Dec 14 2015 19:02
@AlecZorab sure. A NegatableBitSet sounds great. Is a compact representation important for you? Otherwise a NegatableBitSet would be isomorphic to a TotalMap[Int, Boolean], right?
Alec Zorab
@AlecZorab
Dec 14 2015 19:06
That's an interesting question. We'd been relying on the compact representation when I first wrote this den of madness, but now I think about it, the data is sparse enough that maybe I don't. A NegatableArraySet[Int] might do me just as well
Rüdiger Klaehn
@rklaehn
Dec 14 2015 19:07
The internal representation is pretty compact. Just a sorted array of ints. But please be aware that while it is really fast for set/set operations and membership tests, the performance totally sucks when adding or removing single elements from a large set.
I usually don't care about that use case. When building a set from scratch, it is pretty fast (definitely much faster than scala collections)
Tom Switzer
@tixxit
Dec 14 2015 19:33
@rklaehn I find myself wanting an interval "map" that uses a semigroup on the value to do a union. How hard would it be to build something like this off of IntervalSet? Or is there another way?
Denis Rosset
@denisrosset
Dec 14 2015 19:35
@tixxit, are you sure you want a semigroup and not a semilattice?
It requires a Bool[T] for the key. Or more precisely some bespoke collection-specific typeclass that is basically a bool.
I use this to store a timeline of overlapping events. In that case I use my NegatableArraySet[T] as the value.
tixxit @tixxit nods
Tom Switzer
@tixxit
Dec 14 2015 19:46
the values I want are basically some sort of of wrapper for calculating the min, max, or avg of Double
I'm OK if I don't have & available though, for instance
@denisrosset Maybe! I haven't thought it through a ton yet - an IntervalMap sort of came up as the solution to a problem at work
Rüdiger Klaehn
@rklaehn
Dec 14 2015 19:47
I am not so sure about these "collection-specific" typeclasses. They are usually a way to make things faster for a small number of types while making things a bit slower for all other types. But since I usually have some specific types I want to be fast, it is a good compromise for me. See also RadixTree.Key in https://github.com/rklaehn/radixtree, where I have specialized impls for everything I care about (String and Array[Byte])
Tom Switzer
@tixxit
Dec 14 2015 19:48
yeah, I think I'm generally happy when the fast case is really fast, if the slow case is a bit slower
since you kind of expect the slow case to be slow already
I should also revise my thing - I think I'd likely either use the "average" or just a Set[Double] for the value
Actually - that should work for your IntervalMap, right?
I can't really implement isOne, but aside from that
you don't actually need a full Bool
Rüdiger Klaehn
@rklaehn
Dec 14 2015 19:53
I think what you want might be possible to implement directly in IntervalMap. Just remove the boolean style operations from the class itself, and provide either Bool[IntervalMap[K, V]] or some other typeclass depending on what you have for the value.
For the Set instance of IntervalMap.Value I just return false for isOne :-)
By the way: I published intervalset separately so I can experiment with stuff like IntervalMap[K, V]. But if people find this useful I could also add IntervalMap to spire-extras.
Tom Switzer
@tixxit
Dec 14 2015 19:57
Yeah, I'll likely not have a chance to implement this before Jan, but I think it'll be fairly useful for my case (creating a map of feature value -> predicted value for random forests)
Rüdiger Klaehn
@rklaehn
Dec 14 2015 19:59
You will have to make sure that you always keep a canonical representation. E.g. adding [0, 1) => 1 to [1, 2) => 1 should remove the middle boundary.
I always make sure to have a canonical representation so I can use structural equality, and it is just neat anyway.
Rüdiger Klaehn
@rklaehn
Dec 14 2015 20:34
@tixxit a Set[Double] should work out of the box
Rüdiger Klaehn
@rklaehn
Dec 14 2015 23:19
@AlecZorab assuming that your bits are not spread out too far, a NegatableArraySet[Short] or even NegatableArraySet[Byte] would be even more compact. I would have to specialize it for Byte and Short, but your use case seems to indicate that this would be a good idea anyway.
Rüdiger Klaehn
@rklaehn
Dec 14 2015 23:32
@ceedubs I finished writing a TotalArrayMap, for which you can define a Group without compromising on equality
https://github.com/rklaehn/abc/blob/master/core/src/main/scala/com/rklaehn/abc/TotalArrayMap.scala
There is still a lot of optimisation to do, but it does pass the discipline tests...