These are chat archives for HdrHistogram/HdrHistogram

Aug 2017
Gil Tene
Aug 17 2017 03:37
I'm probably ok with supporting both percentile and quantile stuff, it's just annoyingly redundant from an API point of view. Quantiles are more "proper" or "pure", but I originally chose percentile for Java because I expect it to be more readily/commonly understood by end users.
However, whichever you support (even if it's both), I don't think it would be reasonable to document a +/- 100 ulp thing about it. So I'd take only 1 ulp off of percentile input to getValueAtPercentile, and only 1 ulp off of quantile input to getValueAtQuantile.
Ironically, this may mean that at the edges, getValueAtQuantile(x) and getValueAtPercentile(100 * x) may return different values. But I think that is the right thing to do.
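One way to see why the two entry points can disagree at the edges: one ulp of a percentile input is a different-sized step than one ulp of the corresponding quantile input. A quick Rust sketch (next_down here is a hand-rolled stand-in for Java's Math.nextDown, valid for positive finite nonzero inputs):

```rust
// Step to the previous representable f64 (positive, finite, nonzero inputs only).
fn next_down(x: f64) -> f64 {
    f64::from_bits(x.to_bits() - 1)
}

fn main() {
    let q = 0.5_f64; // a quantile
    let p = 100.0 * q; // the same point expressed as a percentile

    // Taking one ulp off each input, then mapping both back to quantile space:
    let quantile_step = q - next_down(q); // 2^-54
    let percentile_step = (p - next_down(p)) / 100.0; // 2^-47 / 100

    println!("quantile step:   {:e}", quantile_step);
    println!("percentile step: {:e}", percentile_step);
    // The percentile-side perturbation is the larger of the two.
    assert!(percentile_step > quantile_step);
}
```

So even when both paths apply "one ulp off the input", they perturb the effective quantile by different amounts, which is why the results can diverge at the edges.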
Marshall Pierce
Aug 17 2017 12:11
I was just going to take the shortcut of leaving the percentile version there with a warning to use the quantile version if you care about precision at all. :)
Marshall Pierce
Aug 17 2017 12:17
But this port has the luxury of not having as much established usage.
(Hm, we might even drop the percentile one entirely -- there are some other breaking API changes I'd like to make anyway)
Marshall Pierce
Aug 17 2017 13:20
I have yet to find a case where nextDown(q) * total and nextDown(q * total) are 2 ulps apart, but they are frequently 1 ulp apart, and slightly less frequently have different ceil()s. I'm not sure I understand the rationale for removing the ulp from the quantile (or percentile) input rather than from the product immediately before the ceil -- I'm struggling to reason about what accuracy guarantees we could make about the resulting number if we perturb the input that early, especially if we do so unconditionally. If someone asks for the 0.5th quantile, which can be represented exactly in IEEE754, it seems undesirable to always step that down to the previous FP value.
here's some scratch code and sample output showing those cases:
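A sketch of that kind of comparison (not the original snippet; next_down reimplemented via bit manipulation as a stand-in for Java's Math.nextDown, positive finite inputs only):

```rust
// For each value i in a histogram holding 1..=n, compute the quantile i/n and
// compare the two placements of next_down: on the input quantile vs. on the
// quantile * total product.
fn next_down(x: f64) -> f64 {
    f64::from_bits(x.to_bits() - 1)
}

fn main() {
    let n: u64 = 100_000;
    let total = n as f64;
    let mut products_differ = 0u64;
    let mut different_ceils = 0u64;

    for i in 1..=n {
        let q = i as f64 / total;
        let a = next_down(q) * total; // perturb the input quantile
        let b = next_down(q * total); // perturb the product
        if a != b {
            products_differ += 1;
        }
        if a.ceil() != b.ceil() {
            different_ceils += 1;
        }
    }

    println!("{} of {} quantiles: products differ", products_differ, n);
    println!("{} of {} quantiles: different ceil()s", different_ceils, n);
}
```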
Marshall Pierce
Aug 17 2017 17:16

It seems like ultimately this boils down to picking one of a number of unpleasant choices. I'm assuming quantiles here, as percentiles will be the same just with less accuracy because of the one additional FP op.

If we simply round the quantile * total_count product ("quantile count"?), then we will yield the wrong answer when FP inaccuracy yields a fractional part of 0.5_f64.prev() = 0.49999999999999994 (using Rust-ish syntax, where prev() means the next representable value toward negative infinity) instead of the true value of 0.5 (exactly 1/2), or vice versa. However, a quantile count with a nonzero fractional part (in a perfectly accurate calculation, which is not what we have) should probably be considered part of the next integer anyway. That said, rounding does yield the right result for the next two examples.

If we ceil() the quantile count, then we will yield the wrong answer for quantile 0.64415 with a 100_000 count: 0.64415 * 100000 = 64415.00000000001 instead of the true value 64415 (which is exactly representable as an f64, btw), which means we'll compute the count as 64415_f64.next().ceil() = 64416. However, the next example with 0.5000000000000001 will be correct with this logic.

If we .prev().ceil() the quantile count, then we will yield the wrong answer for quantile 0.5_f64.next() = 0.5000000000000001 with count 2: 0.5_f64.next() * 2_f64 = 1.0000000000000002, and 1.0000000000000002_f64.prev() = 1_f64, so the ceil() gives 1. However, the prior example with 0.64415 will yield the correct result, because the prev() happens to counteract the FP error. (I also tried .prev() on the quantile before multiplication, but I didn't see a difference in the tests I ran.)

So, we're just trading off which case we want to yield the wrong result for. I think that if we end up with a fractional part that's not near enough to an integer boundary to be due to FP issues (e.g. .2) then it is desirable to round that up rather than down, so I'm happy to exclude the simple rounding logic. If we go with ceil(), I do wonder if maybe prev() isn't really doing much for us, at least in a way we can concretely define. When I only use ceil(), I get a lot of test failures (in the scenario where the histogram contains 1..n and we check each value's quantile) because the quantile counts are slightly too high, leading to the selection of the next sub bucket. So, by that measure, quantile_count.prev().ceil() fares better, but simply rounding the quantile count (the original logic) also passes that test! Consequently, while .prev().ceil() may end up working better, I don't have any justification for that other than "it seems to work in practice", which isn't very comforting. I don't know why the quantile count should so frequently be 1 ulp too high; maybe that's just how FP multiplication goes?

I don't think any of these options gets us to a consistent quantile calculation -- floating point issues are always going to ruin that. I suspect that maybe the best we can do is use .prev().ceil() on the quantile count and document valueAtQuantile/Percentile with a proviso like:

Due to the inherent inaccuracy of floating point arithmetic, certain combinations of quantile and histogram size will return a value that is one sub bucket smaller than infinite precision arithmetic would yield.

As an example of consistently being 1 ulp too high, here's some relatively inscrutable logging of test failures (the interesting point is the "q count" one):
len 100000 quantile 0.03475 q count 3475.0000000000005 actual 3475 equiv 3475 calc 3477 equiv 3477
len 100000 quantile 0.03477 q count 3477.0000000000005 actual 3477 equiv 3477 calc 3479 equiv 3479
len 100000 quantile 0.03493 q count 3493.0000000000005 actual 3493 equiv 3493 calc 3495 equiv 3495
len 100000 quantile 0.03509 q count 3509.0000000000005 actual 3509 equiv 3509 calc 3511 equiv 3511
len 100000 quantile 0.03511 q count 3511.0000000000005 actual 3511 equiv 3511 calc 3513 equiv 3513
len 100000 quantile 0.03527 q count 3527.0000000000005 actual 3527 equiv 3527 calc 3529 equiv 3529
len 100000 quantile 0.03543 q count 3543.0000000000005 actual 3543 equiv 3543 calc 3545 equiv 3545
You can see some structure of the FP math in there: a long run of nearby quantiles all yield the same fractional part, then another long run yields a different one. This is the transition to the next block:
len 100000 quantile 0.04071 q count 4071.0000000000005 actual 4071 equiv 4071 calc 4073 equiv 4073
len 100000 quantile 0.04073 q count 4073.0000000000005 actual 4073 equiv 4073 calc 4075 equiv 4075
len 100000 quantile 0.04089 q count 4089.0000000000005 actual 4089 equiv 4089 calc 4091 equiv 4091
len 100000 quantile 0.06259 q count 6259.000000000001 actual 6259 equiv 6259 calc 6263 equiv 6263
len 100000 quantile 0.06263 q count 6263.000000000001 actual 6263 equiv 6263 calc 6267 equiv 6267
len 100000 quantile 0.06295 q count 6295.000000000001 actual 6295 equiv 6295 calc 6299 equiv 6299
So, .prev().ceil() works, but I don't know why.
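For what it's worth, one row from the log above can be reproduced directly, and it shows the mechanism: the multiplication rounds the true product up by one ulp, and prev() steps exactly that ulp back off before the ceil (next_down standing in for .prev(), positive finite inputs only):

```rust
fn next_down(x: f64) -> f64 { f64::from_bits(x.to_bits() - 1) }

fn main() {
    // The quantile of value 3475 in a histogram holding 1..=100000.
    let q = 3475.0_f64 / 100_000.0;
    let q_count = q * 100_000.0;
    println!("q count {}", q_count); // one ulp above the true product, 3475

    assert_eq!(q_count.ceil(), 3476.0); // plain ceil() overshoots
    assert_eq!(next_down(q_count).ceil(), 3475.0); // prev() pulls it back
}
```

That only explains the cases where the product rounded up by exactly one ulp, though; when the product happens to be exact (as in the 0.5000000000000001 example), the same prev() subtracts a real ulp and undershoots.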