These are chat archives for boostorg/hana

28th Jan 2016
Jason Rice
@ricejasonf
Jan 28 2016 02:43
Is there any way to run just the hana::map part of the benchmark?
I wrote a script to aggregate just the results for hana::map into a single json file but it takes forever.
Louis Dionne
@ldionne
Jan 28 2016 02:44
Yeah, just remove the <%= benchmark("whatever") %> tags you don’t want to run.
Basically, this whole .erb.json file is given to ERB, a program that executes the code inside <%= … %> tags as Ruby code.
Then the tag is replaced in the file by whatever the Ruby code generates. This benchmark(…) function is what runs the compilers 1000000 times and takes so long, so just remove the calls you don’t want to run.
Does that answer your question?
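For context, a benchmark file of the kind described above might look roughly like this. The JSON keys and benchmark arguments here are purely illustrative, not the actual layout of Hana's benchmark suite; the point is only that each `<%= benchmark(…) %>` tag is executed by ERB and its output spliced into the JSON, so deleting a tag drops that series from the run:

```erb
{
  "title": "at_key",
  "series": [
    <%= benchmark("hana_map") %>,
    <%= benchmark("hana_tuple") %> <%# removing this tag skips the tuple run %>
  ]
}
```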
Jason Rice
@ricejasonf
Jan 28 2016 02:47
yeah i guess it's just a couple of manual changes then
in the past I've done <%# blah %> which comments out the tag
Louis Dionne
@ldionne
Jan 28 2016 02:54
Ah, ok so you know about ERB. For the time being, there’s no better way to disable parts of the benchmark, e.g. on the command line.
Jason Rice
@ricejasonf
Jan 28 2016 02:54
Would it be acceptable to add additional targets for individual benchmarks?
I could make it use the same erb.json files for the spec of course
actually wait that might be a pain
Louis Dionne
@ldionne
Jan 28 2016 02:58
Well, you would have to somehow add code to the benchmarks to allow disabling parts of it. I’m not sure it is worth it, but it would be doable.
Jason Rice
@ricejasonf
Jan 28 2016 03:18
I have another change I want to make to #242
I'm going to see if I can make the second one better.
Louis Dionne
@ldionne
Jan 28 2016 17:41
The real problem seems to be map creation, which is kinda slow. I’ll look at it now.
Jason Rice
@ricejasonf
Jan 28 2016 17:59
Yes, perhaps there is some low-hanging fruit there. Other than that, I was thinking about adding an optimization for maps whose keys could never possibly collide (i.e. type, string, integral_constant, and anything that uses the default implementation of hash)
Jason Rice
@ricejasonf
Jan 28 2016 18:49
...but that would mean an upfront, full scan of the Storage to see if the keys satisfy that.
Louis Dionne
@ldionne
Jan 28 2016 18:50
Yeah, but that’s nothing compared to the complexity of the current fold_left(…, insert) for creating the map.
But that would clearly be a peephole optimization, and let’s not focus on this for now.
Jason Rice
@ricejasonf
Jan 28 2016 18:51
Is there a better way to do that? (i.e. instead of fold_left)
Louis Dionne
@ldionne
Jan 28 2016 18:51
I’m thinking about it right now :)
There probably is one, but it’s probably not obvious.
Well it’s definitely not obvious because we would have seen it already.
Hey, side question: did you plan on going to C++Now in May?
Jason Rice
@ricejasonf
Jan 28 2016 18:52
I wasn't aware of it. Where is it?
Louis Dionne
@ldionne
Jan 28 2016 18:53
It’s a great conference, probably the most core C++ conference in the US. It’s in Aspen. http://cppnow.org
(Colorado)
Jason Rice
@ricejasonf
Jan 28 2016 18:58
I'd have to clear it with the CFO. I'm not seeing any info about an entry fee.
Louis Dionne
@ldionne
Jan 28 2016 18:59
Registration is not open yet, AFAIK. The entry fee is usually around $750, unless you submit a talk (in which case it’s $0).
Jason Rice
@ricejasonf
Jan 28 2016 18:59
Are you doing a presentation there?
Louis Dionne
@ldionne
Jan 28 2016 18:59
The deadline for submitting a talk is tomorrow, however.
I submitted two presentations on metaprogramming; we’ll see whether they are accepted.
Jason Rice
@ricejasonf
Jan 28 2016 19:05
I did a lame presentation about C++ metaprogramming at a local functional programming meetup here in Vegas. It introduces Hana at the end, and the ironic thing is that all of the examples I used are functions that no longer exist (i.e. only_when, from_just, ...) :P
Louis Dionne
@ldionne
Jan 28 2016 19:07
lol, sorry about that!
Jason Rice
@ricejasonf
Jan 28 2016 19:08
no.. it was only copy/paste example code snippets
Louis Dionne
@ldionne
Jan 28 2016 19:08
These were part of the optional interface, but I felt like it was better to make it closer to C++’s upcoming optional.
So I went from closer to Haskell to closer to C++.
Louis Dionne
@ldionne
Jan 28 2016 21:10
@ricejasonf I see no way to improve on your map creation algorithmically. However, I got a ~ 1.5x speedup by rewriting part of it using different constructs (classical metafunctions instead of functions).
I’ll continue thinking about an algorithmic improvement.
Btw, I would throw you flowers if I could for the idea of storing indices in the bucket instead of actual elements. This removes a huge number of runtime concerns.
Jason Rice
@ricejasonf
Jan 28 2016 21:17
lol.. I only thought of it after running into the wall of trying to construct it.
Louis Dionne
@ldionne
Jan 28 2016 21:21
Ouch. Indices are much easier.
Louis Dionne
@ldionne
Jan 28 2016 21:40
Check this data out: http://pastebin.com/f3tDVFEm
Jason Rice
@ricejasonf
Jan 28 2016 21:42
what is "my hash table"?
Louis Dionne
@ldionne
Jan 28 2016 21:42
sorry, it’s my “reimplementation” of what you did
IOW it’s the same thing as your hash table, but with peephole optimizations
It’s not fantastically better, but still a 2x speedup is nice.
I’ll benchmark the cost of creating maps of different sizes now.
Jason Rice
@ricejasonf
Jan 28 2016 21:46
did the peephole optimizations get around calling fold_left?
Louis Dionne
@ldionne
Jan 28 2016 21:46
Naw.
No algorithmic differences, really. Just more structs and fewer functions.
Jason Rice
@ricejasonf
Jan 28 2016 21:47
ah
Louis Dionne
@ldionne
Jan 28 2016 21:47
There is one interesting difference in the insert method, though.
I’ll create a PR so we can discuss.
Louis Dionne
@ldionne
Jan 28 2016 21:53
See #247
Jason Rice
@ricejasonf
Jan 28 2016 21:55
I only see benchmark changes
Louis Dionne
@ldionne
Jan 28 2016 21:55
Look at “benchmark/at_key/hash_table.hpp”.
Yeah, I’m a lazy asshole, LOL
Like I said, I just wanted to check whether it was possible to improve compile-times by using peephole optimizations. I also wanted to wrap my head around your own implementation, and to do this it helped to write mine based on yours.
Jason Rice
@ricejasonf
Jan 28 2016 22:02
when you say "right bucket" do you mean the correct one or is it a right sided position?
Louis Dionne
@ldionne
Jan 28 2016 22:15
I mean “correct”.
This one is actually quite nice. What happens is that I map the update_bucket metafunction over all the buckets. But update_bucket is specialized such that it always does nothing, except for the bucket with some precise Hash.
For that bucket with the correct Hash, it will append the index to the end of the list of indices.
That’s a fast-lane way to do a hana::replace on types only with the comparison predicate being std::is_same.
Jason Rice
@ricejasonf
Jan 28 2016 22:21
you eliminated my std::is_base_of sfinae hacks too :P
Louis Dionne
@ldionne
Jan 28 2016 22:21
Yeah, there was no need for that anymore. As surprising as it may seem, std::is_base_of is actually very hairy, IIRC.
It’s quite inefficient.
Oops, forget what I just said. It uses an intrinsic.
I’m thinking about regrouping benchmarks by the data structure to which they apply. For example, create a benchmark/map subdirectory with benchmark/map/{at_key,make,insert}, etc. I’m leaning towards this because I feel the baselines are otherwise not meaningful. For example, the current baseline for at_key is a map, but we’re benchmarking a tuple against it too.
Louis Dionne
@ldionne
Jan 28 2016 22:26
However, that will make it more difficult to compare algorithms across data structures (e.g. how does map compare to tuple w.r.t. at_key?).
Do you have any thoughts or preferences regarding this?
Jason Rice
@ricejasonf
Jan 28 2016 22:30
why not both? :P
Louis Dionne
@ldionne
Jan 28 2016 22:31
Because that would require duplicating benchmarks.
Or wait...
We could generate datasets individually (e.g. hana::at_key on a hana::map, hana::make on a hana::map, hana::at_key on a hana::tuple, and so on), and then mix and match them into charts as we want.
Jason Rice
@ricejasonf
Jan 28 2016 22:34
that would make running a single benchmark much easier
Louis Dionne
@ldionne
Jan 28 2016 22:34
Yes, it would.
But I’m not sure what would be the best way to go.
I mean to implement that.
Jason Rice
@ricejasonf
Jan 28 2016 22:38
this is just spitballing...
what if all of the benchmark definitions were in one big json file and they each have a target name...
Louis Dionne
@ldionne
Jan 28 2016 22:41
How would you generate a single dataset if they are all defined in the same JSON file? Right now, like I said, generating a benchmark is a matter of running ERB on the JSON file that defines it.
Not that it can’t change, though.
What I mean basically is that if all were defined in the same JSON file, your only choice would be to always run all the benchmarks, unless we define something at the Ruby level that allows us to skip some parts of the generated JSON.
I’m not sure that is practical, but what do I know?
Louis Dionne
@ldionne
Jan 28 2016 22:47
I need to go now or I’ll be late somewhere. But I’ve started preparing benchmarks for hana::make_map, and we’ll see whether the initial cost of creating a hash map is going to be a problem in practice. Hopefully not.
Jason Rice
@ricejasonf
Jan 28 2016 22:50
k