These are chat archives for HdrHistogram/HdrHistogram

3rd
Jun 2017
Gil Tene
@giltene
Jun 03 2017 00:04
Rounding doesn't have this problem (or prevents it). The problem occurs when math that calculates a percentile (countToHere / totalCount) produces a fp value that is slightly higher than the actual (countToHere / totalCount), and that percentile is then used in determining something (like what the value at that percentile is, which can be off by one whole subbucket) with logic that always rounds up. Rounding this fp value to the nearest integer erases the problem.
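A minimal sketch of the effect described above, in Python rather than the Java of the main library. The counts (7 out of 100) are hypothetical values chosen because they happen to trigger the overshoot; they are not taken from the discussion.

```python
import math

# Hypothetical counts, chosen because this ratio overshoots in binary floating point.
count_to_here = 7
total_count = 100

# The stored double for 7/100 is slightly larger than the true 0.07.
ratio = count_to_here / total_count

# Mapping the ratio back to a count therefore lands just above 7.
fp_count = ratio * total_count

always_round_up = math.ceil(fp_count)   # overshoots to the next whole count
round_to_nearest = round(fp_count)      # recovers the intended count

print(fp_count, always_round_up, round_to_nearest)
```

Here the round-up path asks for count 8 where count 7 was meant, while rounding to nearest returns 7.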
I'm tempted to play some ulp games to see if I can get rid of the problem (e.g. subtract one ulp from the result before doing Math.ceil). I think I can make that work, but need to think about whether or not it will break other things.
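The ulp idea can be sketched roughly as follows, assuming Python 3.9+ for `math.nextafter`. The helper name `ceil_minus_one_ulp` is made up for illustration, and this is not the code from the actual commit.

```python
import math

def ceil_minus_one_ulp(x):
    """Round up, but first step x down to the next representable double.

    Hypothetical helper: it absorbs a one-ulp overshoot in x, at the cost of
    treating a value that really is one ulp above an integer as that integer.
    """
    return math.ceil(math.nextafter(x, -math.inf))

overshoot = (7 / 100) * 100   # lands one ulp above 7.0

print(math.ceil(overshoot))            # plain ceil is pushed up to 8
print(ceil_minus_one_ulp(overshoot))   # the nudge brings it back to 7
print(ceil_minus_one_ulp(7.0))         # exact integers are unchanged: 7
print(ceil_minus_one_ulp(7.5))         # ordinary fractions still round up: 8
```

The trade-off is the "break other things" concern: an input that is legitimately one ulp above a whole count gets rounded down to that count.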
Julian Berman
@Julian
Jun 03 2017 00:29
@ahothan the use case is just being able to trust the conclusions I draw, isn't it?
If I don't know that the numbers I get out are accurate, or how inaccurate they can be, how can I draw conclusions from the data?
If there's one thing I've learned from @giltene's talks, it's to remind myself that I shouldn't trust my own statistical intuition, same as I wouldn't trust my cryptographic or probabilistic intuition, no?
I'm perfectly happy with an answer that this issue isn't important enough to fix, if that's the case. But to me as a layman, it doesn't feel great to not be able to make any statement about the accuracy of the numbers in the histogram, no? Just want to make sure I understand enough to know what bounds there are.
Without someone qualified telling me, though, I deeeefinitely wouldn't feel confident on my own to know whether inaccuracy can lead to wrong answers here.
Julian Berman
@Julian
Jun 03 2017 00:37
Ok, maybe reading this carefully, though, I'm understanding something.
Is it just the case that get_value_at_percentile is inconsistent, but anytime I look at the whole histogram in order I get correct results?
Gil Tene
@giltene
Jun 03 2017 00:54
Ok. I think I have a 99.999999% good fix for this ;-)
See HdrHistogram/HdrHistogram@5c7226c
It's good to within one ulp. 50.00000000000001% is still going to fall one count short in the example, but 50.0000000000001% won't.
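To illustrate the one-ulp granularity: 50.00000000000001 is the very next double above 50.0, so a one-ulp nudge collapses it back to exactly 50%. The sketch below assumes the nudge is applied to the requested percentile and uses a hypothetical total count of 2; neither detail is confirmed by the messages here, and `count_at_percentile` is an illustrative name, not a library function. Requires Python 3.9+.

```python
import math

def count_at_percentile(percentile, total_count):
    """Hypothetical: target count for a percentile, with a one-ulp downward nudge."""
    nudged = math.nextafter(percentile, -math.inf)
    return math.ceil(nudged * total_count / 100.0)

total_count = 2  # hypothetical stand-in for the example under discussion

# 50.00000000000001 is exactly one ulp above 50.0.
one_ulp_above = math.nextafter(50.0, math.inf)
print(one_ulp_above == 50.00000000000001)

print(count_at_percentile(50.0, total_count))               # 1
print(count_at_percentile(50.00000000000001, total_count))  # 1, treated as 50%
print(count_at_percentile(50.0000000000001, total_count))   # 2, several ulps above 50%
```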
Gil Tene
@giltene
Jun 03 2017 00:59
I do think this is worth putting in across the board, so thanks Julian for pointing it out. This allows us to make a clearer statement about the value returned: the percentile [to within +/- 1 ulp] is now inclusive of all values that it touches, and the results are more intuitive.
Please review the above commit (with associated test changes in HdrHistogram/HdrHistogram@95b3cf8 and https://github.com/HdrHistogram/HdrHistogram/commit/abb45f6d4a189e1faabec58871149b1cdb6369d5), and let me know if this makes sense to others, including the Python, C, Rust, JS, and .NET folks...