These are chat archives for HdrHistogram/HdrHistogram

8th
Sep 2015
Gil Tene
@giltene
Sep 08 2015 21:17
To all you HdrHistogram porters: I just pushed a new encoding scheme (V2) for the Java version. The code supports decoding V1 and V0, but as it stands right now it will encode in V2 (I'm disinclined to add options to output V1 or V0).
For some details on why this was done, see the discussion here: https://github.com/Searchlight/khronus/issues/21#issuecomment-133877545
(Summary: Khronus's SkinnyHistogram scheme was beating V1 on both space and time by a large enough margin to make this interesting).
Here is a space comparison output table (don't know how it will format for Gitter):
           case1 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [   725 /  2186 /  1155 /  66.83%/  37.23%]     401 /  492 /  500 /  18.50%/  19.80%
           case1 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1762 / 11058 /  2322 /  84.07%/  24.12%]     805 /  997 /  960 /  19.26%/  16.15%
           case2 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1119 /  4442 /  1895 /  74.81%/  40.95%]     168 /  186 /  181 /   9.68%/   7.18%
           case2 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  5399 / 29106 /  8695 /  81.45%/  37.91%]     218 /  375 /  263 /  41.87%/  17.11%
           case3 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1964 / 10532 /  2743 /  81.35%/  28.40%]     204 /  253 /  211 /  19.37%/   3.32%
           case3 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  9277 / 71680 / 15449 /  87.06%/  39.95%]     289 /  776 /  355 /  62.76%/  18.59%
        sparsed1 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [    52 /  1456 /    56 /  96.43%/   7.14%]      49 /   59 /   51 /  16.95%/   3.92%
        sparsed1 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [    52 /  5214 /    56 /  99.00%/   7.14%]      48 /   76 /   51 /  36.84%/   5.88%
        sparsed2 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [    61 /  4650 /    65 /  98.69%/   6.15%]      59 /   86 /   62 /  31.40%/   4.84%
        sparsed2 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [    61 / 30762 /    65 /  99.80%/   6.15%]      59 /  114 /   62 /  48.25%/   4.84%
       quadratic [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1873 /  5286 /  3198 /  64.57%/  41.43%]     396 /  450 /  402 /  12.00%/   1.49%
       quadratic [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  8570 / 35860 / 12998 /  76.10%/  34.07%]     778 /  877 /  869 /  11.29%/  10.47%
           cubic [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1446 /  6440 /  2084 /  77.55%/  30.61%]     352 /  374 /  381 /   5.88%/   7.61%
           cubic [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  2682 / 45096 /  2688 /  94.05%/   0.22%]     668 /  811 /  673 /  17.63%/   0.74%
case1PlusSparsed2 [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [   739 /  4650 /  1170 /  84.11%/  36.84%]     412 /  523 /  515 /  21.22%/  20.00%
case1PlusSparsed2 [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1777 / 30762 /  2341 /  94.22%/  24.09%]     816 / 1079 /  971 /  24.37%/  15.96%
longestjHiccupLine [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [   839 /  6380 /  1332 /  86.85%/  37.01%]     181 /  223 /  195 /  18.83%/   7.18%
longestjHiccupLine [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1351 / 44616 /  1377 /  96.97%/   1.89%]     207 /  420 /  207 /  50.71%/   0.00%
shortestjHiccupLine [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [    99 /  3288 /   104 /  96.99%/   4.81%]      94 /  128 /   96 /  26.56%/   2.08%
shortestjHiccupLine [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [   113 / 19880 /   127 /  99.43%/  11.02%]     100 /  171 /  101 /  41.52%/   0.99%
sumOfjHiccupLines [2] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1063 / 12720 /  1702 /  91.64%/  37.54%]     343 /  451 /  375 /  23.95%/   8.53%
sumOfjHiccupLines [3] (TzleBytes/noTlzeBytes/Skinny/%ReductionA/%ReductionB): [  1742 / 89164 /  1751 /  98.05%/   0.51%]     382 /  830 /  391 /  53.98%/   2.30%
Michael Barker
@mikeb01
Sep 08 2015 21:42
What happens if you flip useTzleEncoding?
Gil Tene
@giltene
Sep 08 2015 21:55
It encodes without the new stuff, but the cookie is still new
I included the useTzleEncoding stuff mostly to enable comparative testing
Will probably take it out later.
Michael Barker
@mikeb01
Sep 08 2015 21:56
Does that break the decoding or does it cope with that situation?
Gil Tene
@giltene
Sep 08 2015 21:56
The decoding in the new code works with everything. But encoding with the new code (even with TZLE off) will produce a new cookie that older code doesn't recognize.
This is work-in-progress stuff, and I pushed it to get feedback and alignment with others (C, photon, etc.). It's not in the released version or on Maven yet.
We can easily make turning useTzleEncoding off use the older cookie, but I don't think that helps much, since we'll want the default to be the "better" one, and that will create a problem for other libs (that don't yet have decoders for V2) that want to read it.
On the other hand, it may be better to have the older scheme (with word sizes) only work with V1, and for V2 to always use the TZLE+ZigZag stuff. Will make the code simpler in the long run (can remove the V1 encoding side over time).
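Editor's sketch (not from the chat): the messages above do not spell out what the TZLE step does. The following assumes it means dropping trailing zero counts and collapsing each run of two or more interior zeros into a single negative value whose magnitude is the run length, before the values are written out. The class and method names are hypothetical, and the Java source remains the authority on the actual format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the HdrHistogram API.
final class ZeroRunSketch {

    // ASSUMPTION: trailing zeros are dropped, a run of 2+ zeros becomes -runLength,
    // a lone zero stays 0, and non-zero counts are passed through unchanged.
    static List<Long> toTokens(long[] counts) {
        int limit = counts.length;
        while (limit > 0 && counts[limit - 1] == 0) {
            limit--; // trailing zeros carry no information
        }
        List<Long> tokens = new ArrayList<>();
        int i = 0;
        while (i < limit) {
            long count = counts[i++];
            if (count < 0) {
                throw new IllegalArgumentException("negative count at index " + (i - 1));
            }
            if (count == 0) {
                long zeros = 1;
                while (i < limit && counts[i] == 0) {
                    zeros++;
                    i++;
                }
                tokens.add(zeros > 1 ? -zeros : 0L);
            } else {
                tokens.add(count);
            }
        }
        return tokens;
    }

    // Rebuilds a counts array of the given length; slots not covered stay zero.
    static long[] fromTokens(List<Long> tokens, int length) {
        long[] counts = new long[length];
        int i = 0;
        for (long token : tokens) {
            if (token < 0) {
                i += (int) -token; // skip over a run of zeros
            } else {
                counts[i++] = token;
            }
        }
        return counts;
    }
}
```

Under that assumption, a mostly empty counts array shrinks to a handful of values, which would be consistent with the sparsed rows in the table above.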
Michael Barker
@mikeb01
Sep 08 2015 22:01
That would work for me.
Gil Tene
@giltene
Sep 08 2015 22:03
Format-wise, it ended up being a pretty simple scheme. You need a ZigZag LEB128 implementation, but that is nearly trivial (port the Java code I wrote from scratch for that), or use someone else's (Protocol Buffers use the same encoding scheme, so implementations abound). I wrote this from scratch rather than pull someone else's only because I didn't want to pull Apache-licensed code into the stuff (and it is so simple to implement).
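Editor's sketch (not from the chat): for porters who have not met ZigZag LEB128 before, here is a minimal Protocol Buffers-style version in Java. ZigZag maps signed values to unsigned ones so that small magnitudes stay small (0, -1, 1, -2 become 0, 1, 2, 3), and LEB128 then writes 7 bits per byte with the high bit as a continuation flag. The class and method names are hypothetical. The sketch has not been checked byte-for-byte against the HdrHistogram Java code, so treat edge cases such as the maximum encoded length of very large values as an assumption to verify.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the HdrHistogram API.
final class ZigZagLeb128Sketch {

    // Interleaves negative and positive values: 0->0, -1->1, 1->2, -2->3, ...
    static long zigZag(long value) {
        return (value << 1) ^ (value >> 63);
    }

    // Inverse of zigZag.
    static long unZigZag(long encoded) {
        return (encoded >>> 1) ^ -(encoded & 1);
    }

    // Writes the value as 7-bit groups, least significant first.
    // Every byte except the last has its high bit set.
    static void putLong(ByteBuffer buffer, long value) {
        long v = zigZag(value);
        while ((v & ~0x7FL) != 0) {
            buffer.put((byte) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        buffer.put((byte) v);
    }

    // Reads 7-bit groups until a byte without the continuation bit is seen.
    static long getLong(ByteBuffer buffer) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buffer.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return unZigZag(result);
    }
}
```

A round trip is `putLong(buf, v)`, then `buf.flip()`, then `getLong(buf)`. Values with magnitude below 64 take a single byte.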
Michael Barker
@mikeb01
Sep 08 2015 22:17
ZigZag shouldn't be a problem; I wrote my own Base64 encoding for the C version. Though I'll probably just copy yours...
Gil Tene
@giltene
Sep 08 2015 22:31
Ok, changed to use V2 only when TZLE is on (default). Turning it off uses V1 instead.
Will probably want to take the V1-outputting code out altogether after this sits for a while.
Michael Barker
@mikeb01
Sep 08 2015 22:32
Cool. I'm working on the V2 encoding now.
Gil Tene
@giltene
Sep 08 2015 22:32
The ratios for both space and speed are pretty impressive for such a simple change...
Michael Barker
@mikeb01
Sep 08 2015 22:32
Yes, I only maintain one version of the output code and multiple versions of the import code.
I might have to take a look at Khronus too.
Gil Tene
@giltene
Sep 08 2015 22:34
A time-series database that holds HDR histograms is a pretty kool thing.
Michael Barker
@mikeb01
Sep 08 2015 22:34
Yes, we've got a home-rolled DB for perf stuff, but its only 'value' is a double.
I wonder if there is a way to do the encoding without the branches.
Michael Barker
@mikeb01
Sep 08 2015 22:52
A hiccup test log in V2 format would be useful too.
Gil Tene
@giltene
Sep 08 2015 23:11
Just added one and a test case to match. Also updated the log version to 1.2 (in the writer).
Michael Barker
@mikeb01
Sep 08 2015 23:27
Cheers