These are chat archives for rust-lang/rust

18th
Sep 2016
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 13:13
Hey guys :) I need an ultra-fast map of u64 keys with sorted tuple values. I am using a HashMap<u64, BTreeMap<u32, u64>> right now which is fast but I'm a bit worried about the hashing because I actually don't need it at all for my u64 keys. Whether the BTreeMap is the right data structure is also questionable. The operation I need to perform is: select 1 to x entries from the map and all entries from inside of it which are >= than the u32 key. Any suggestions for an efficient data structure for this kind of work?
Sergey Noskov
@Albibek
Sep 18 2016 16:41
are you trying to replace tuples with BTreeMap?
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 16:55
The code already evolved a lot by now. I was using a BTreeMap because I wanted to have it sorted by the u32. However, this is not required anymore and I now have a HashMap<u64, (u64, u32)>; you'll notice that the u32 and the u64 are swapped now.
My question boils down to the following: does the HashMap result in a lot of overhead if I'm using u64 only due to hashing? I mean, it hashes that u64 all the time, right?
Sergey Noskov
@Albibek
Sep 18 2016 16:58
it depends of how is Hash implemented for u64
I think it does the hashing
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 16:59
But there are no maps that don't hash at all in the std so one would need to implement this in userland?
Sergey Noskov
@Albibek
Sep 18 2016 17:02
Seems like so. You can do either custom type with custom hash function or change the built in hasher in HashMap for the passing-through one
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 17:03
I tried that and it was actually slower but I my bet is that I implemented the hasher incorrectly. :( I'll investigate further but thanks for confirming my assumptions. :)
Sergey Noskov
@Albibek
Sep 18 2016 17:06
If you need speed you can also think about just using vector, since usize is u64 on 64-bit arch. It, for sure, depends on the distribution of numbers you have. But probably correct indexing could make it faster
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 17:07
But get is not O(1) anymore.
I would need to traverse the vector to find the matching u64.
Amount of u64 to search is around 20 million entries. Note that HashMap is super fast already but I would not complain if it's even faster. ;)
Sergey Noskov
@Albibek
Sep 18 2016 17:10
It still can be faster depending of data. I'm still not sure that binary search in sorted vector or tree is slower.
Richard Fussenegger
@Fleshgrinder
Sep 18 2016 17:11
I'll give it a shot. :)