These are chat archives for HdrHistogram/HdrHistogram

16th
Sep 2015
Gil Tene
@giltene
Sep 16 2015 02:19
@mikeb01 : I think @ahothan is correct. ZigZag will be LE encoded "on the wire", but the int64 values you are providing to your zig_zag_encode_i64() and zig_zag_decode_i64() functions are in native format. So I think your current code probably works on machines where htole64() is a no-op, but will break on big-endian machines (like SPARC).
Michael Barker
@mikeb01
Sep 16 2015 07:52
I've removed the call to htole64.
Alec
@ahothan
Sep 16 2015 18:23

@mikeb01
looks like the same issue on the decoding side (in _apply_to_counts_zz):
    data_index += zig_zag_decode_i64(&counts_data[data_index], &value);
    int64_t host_value = (int64_t) le64toh(value);

    if (host_value < INT32_MIN)
    {
        return HDR_TRAILING_ZEROS_INVALID;
    }

    if (host_value < 0)
    {
        int32_t zeros = -((int32_t) host_value);
        counts_index += zeros;
    }
    else
    {
        h->counts[counts_index] = host_value;
        counts_index++;
    }

the return from zig_zag_decode_i64 is already native endian (host) so the le64toh call is not needed.
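To make the endianness point concrete, here is a minimal generic ZigZag/varint encoder sketch (illustrative names only, not the actual hdr_histogram functions; note this generic loop form can emit up to 10 bytes for full-range values, whereas the V2 scheme caps at 9). The output is built byte by byte from the low bits of the native int64_t, so the byte stream is identical on little- and big-endian hosts and no htole64/le64toh step applies:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Illustrative ZigZag/varint encoder. Each output byte holds the next
 * 7 low-order bits plus a "more" bit, so the byte stream is the same
 * on any host endianness; only the in-register int64_t is native. */
static size_t zz_encode_i64(uint8_t *buf, int64_t signed_v)
{
    /* ZigZag: map signed to unsigned so small magnitudes stay small */
    uint64_t v = ((uint64_t)signed_v << 1) ^ (uint64_t)(signed_v >> 63);
    size_t i = 0;
    while (v >= 0x80)
    {
        buf[i++] = (uint8_t)(v | 0x80);  /* low 7 bits, more bit set */
        v >>= 7;
    }
    buf[i++] = (uint8_t)v;               /* final byte, no more bit */
    return i;
}
```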
Additional notes:

  • one of the peculiarities of this varint string encoding is that the number of elements is variable and cannot be determined without a full pass. This means the code should protect against overrunning the counts array during the iteration in case the varint string contains too many entries (it would be easy to crash the current code with a varint string that is well-formed but has too many counters).
  • the room for value overflow is real with this encoding, especially when supporting 16- and 32-bit counters, both when storing and when adding. Not sure the C code fully supports 16/32-bit counters; the python code does, and now I'm quite happy to have to do all these checks in the python C extension code...
    @giltene I guess the Java version does all these overflow checks as well? (did not have time to check)
    this V2 encoding might compress better and faster, but it does have a hell of a lot more caveats to deal with in the code!
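A sketch of the two guards described above, when replaying a decoded V2 value into a counts array: never step the counts index past the array length (a well-formed varint stream can still describe more slots than the histogram has), and detect overflow when adding a decoded count into an existing counter. Names and signature here are made up for illustration, not the hdr_histogram API:

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical guard around applying one decoded V2 value:
 * a negative value encodes a run of zero-count slots to skip,
 * a non-negative value is a count for the current slot. */
static bool apply_count(int64_t *counts, int32_t counts_len,
                        int32_t *counts_index, int64_t decoded)
{
    if (decoded < 0)
    {
        int64_t zeros = -decoded;              /* run of empty slots */
        if (zeros > counts_len - *counts_index)
            return false;                      /* would overrun array */
        *counts_index += (int32_t)zeros;
        return true;
    }
    if (*counts_index >= counts_len)
        return false;                          /* too many entries */
    if (counts[*counts_index] > INT64_MAX - decoded)
        return false;                          /* addition overflows */
    counts[*counts_index] += decoded;
    (*counts_index)++;
    return true;
}
```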
Michael Barker
@mikeb01
Sep 16 2015 20:11
@ahothan I'll fix the remaining le64toh. For the overruns, I think if the underlying array is padded out to payload_len + 1, such that the last padded byte is 0, then the varint code will terminate even if the last byte (unsigned) of the input is >127 and it is expecting more bytes. To be safe the buffer could be padded out by 8.
The C code supports decoding 16/32 bit counters, but just widens them into a 64 bit histogram.
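The padding trick above can be sketched as follows (illustrative helper, not the actual decompression code): a zero byte has no "more" bit, so even a truncated varint whose last real byte is >127 terminates inside the padding instead of running off the allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy a decompressed payload into a buffer with 8 trailing zero
 * bytes; a runaway varint read then terminates in the padding. */
static uint8_t *copy_with_padding(const uint8_t *payload, size_t len)
{
    uint8_t *buf = malloc(len + 8);
    if (buf == NULL)
        return NULL;
    memcpy(buf, payload, len);
    memset(buf + len, 0, 8);  /* varint terminator pad */
    return buf;
}
```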
Gil Tene
@giltene
Sep 16 2015 20:43
@ahothan : Java has built in array range enforcement, so if (for whatever reason) an encoding came in that had more payload than would fit in the actual array constructed to hold the histogram according to its header information, an exception would occur.
@ahothan : as for other enforcement: negative counts are considered undefined in HdrHistogram. Undefined isn't quite the same as "invalid", in the sense that negative counts are not strictly prevented. But the behavior when they exist is not defined. E.g. subtraction will refuse to subtract larger counts from smaller ones, but negative values can certainly result from overflows or explicit "malicious" counting. In any case,
the V2 encoding will refuse to encode a histogram with negative counts in it.
Gil Tene
@giltene
Sep 16 2015 20:51
@ahothan: for speed tests, can you try some of the test cases I have in the Java benchmarks? Most histograms tend to be very sparse, and won't cover anywhere near 30% of the usec...day range with data. As a result (and since the zero-count runs are not encoded with varints), I would hope that for at least some of the data sets, V2 would actually be faster even with python. In Java, it is 4x to 10x faster than V1, which is a big deal...
Alec
@ahothan
Sep 16 2015 21:09

@mikeb01: 2 more things needed. The decode function lacks a stop in 2 cases: buffer truncation (missing bytes to finish the decode) and encoded value overflow (9th byte has the more bit set).
For the first stop, the loop needs to check that it does not read past the end of the buffer; for the second, the code needs to check that the 9th byte does not have the more bit set. Since you may have copied this from Java, this may apply to the Java version too?
Note that the GPB varint decoder (python version) has the second stop but is also missing the buffer overrun stop, so it'd be possible to also crash that code with an incomplete buffer.

@giltene , @mikeb01 : you use 9 bytes max, which means that after decode, the max decoded value fits in 63 bits (9 x 7).
Given this is a zigzag value, it means that the counter range is actually 62 bits (since negative numbers take up half the space).
Hence this V2 can only encode up to 62-bit counters (while V1 could handle the whole 64-bit). Correct?
More of an academic concern: should the code bother to check for 62-bit overflow while encoding to V2, or should we just say that practically this will "never" happen?

Alec
@ahothan
Sep 16 2015 21:29
@giltene: can certainly try with your test cases. In my case, the typical use case would be a few thousand records per histogram, so it would not be unrealistic to expect ~1000 counters to encode/decode per histogram.
The perf test I did was with 2-digit precision, which yields around 4000 counters, so we're pretty close.
Michael Barker
@mikeb01
Sep 16 2015 21:34
@ahothan I think for the second case, I just need to check the dataIndex against the length of the buffer. I'll fix that this evening.
Gil Tene
@giltene
Sep 16 2015 23:13
@ahothan: LEB128's long integer encoding is 9 bytes max, but that's 8x7 + 8 = 64 bits. The 9th byte in the sequence (if you ever reach it) is always full, with no more bit.
(In both @mikeb01's C implementation and in my Java one, it's a long if/else sequence, not a loop, BTW).
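A loop-form sketch of that layout (hypothetical names; the real C and Java versions unroll this into an if/else chain as noted above): bytes 1..8 each carry 7 payload bits plus a more bit, and the 9th byte, when reached, carries a full 8 bits with no more bit, so the zigzagged value spans all 64 bits and even INT64_MIN round-trips.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* 9-byte varint layout: 8*7 + 8 = 64 payload bits total. */
static size_t zz9_encode(uint8_t *buf, int64_t signed_v)
{
    uint64_t v = ((uint64_t)signed_v << 1) ^ (uint64_t)(signed_v >> 63);
    size_t i = 0;
    while (i < 8 && v >= 0x80)
    {
        buf[i++] = (uint8_t)(v | 0x80);  /* 7 bits + more bit */
        v >>= 7;
    }
    buf[i++] = (uint8_t)v;  /* if i was 8, all 8 remaining bits fit */
    return i;
}

static size_t zz9_decode(const uint8_t *buf, int64_t *out)
{
    uint64_t u = 0;
    size_t i = 0;
    int shift = 0;
    int more = 1;
    while (i < 8 && more)
    {
        more = buf[i] & 0x80;
        u |= (uint64_t)(buf[i++] & 0x7f) << shift;
        shift += 7;
    }
    if (more)
        u |= (uint64_t)buf[i++] << 56;   /* 9th byte: full 8 bits */
    *out = (int64_t)(u >> 1) ^ -(int64_t)(u & 1);  /* un-ZigZag */
    return i;
}
```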
Gil Tene
@giltene
Sep 16 2015 23:24
Since the 9th byte can never have a more bit set (it has no more bit), overruns are limited. But it's still worth looking into, since a (bad) encoding can potentially extend the reading 8 bytes past the end of an input buffer. In Java that's not a problem because the buffer gets are individually range checked (even within the ZigZag decoding logic). @mikeb01 : it may be worth giving zig_zag_decode_i64() an extra limit parameter (basically the number of bytes remaining before data_limit) and having it check it. For speed purposes, rather than check on each byte in the common case, I'd split the code into two main cases: when the remaining limit is >8 bytes, do the current if-else chain, and if it's lower, do the slower chain (where each step does a range check).
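The slow (limit-checked) path of that split could be sketched like this, under the same 9-byte layout (hypothetical signature; a fast path for remaining > 8 bytes would simply skip the per-byte checks):

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Limit-aware decode sketch: "remaining" is the number of bytes left
 * before data_limit. Returns 0 on success, -1 on a truncated varint. */
static int zz_decode_checked(const uint8_t *buf, size_t remaining,
                             int64_t *out, size_t *consumed)
{
    uint64_t u = 0;
    size_t i = 0;
    int shift = 0;
    int more = 1;
    while (i < 8 && more)
    {
        if (i >= remaining)
            return -1;                     /* truncated input */
        more = buf[i] & 0x80;
        u |= (uint64_t)(buf[i++] & 0x7f) << shift;
        shift += 7;
    }
    if (more)
    {
        if (i >= remaining)
            return -1;                     /* 9th byte missing */
        u |= (uint64_t)buf[i++] << 56;     /* 9th byte: full 8 bits */
    }
    *out = (int64_t)(u >> 1) ^ -(int64_t)(u & 1);
    *consumed = i;
    return 0;
}
```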
Michael Barker
@mikeb01
Sep 16 2015 23:24
I've added padding to prevent crashing with corrupted values. The zig zag decoding will only read 9 bytes at a time between bounds checks (dataIndex < dataLimit).
Gil Tene
@giltene
Sep 16 2015 23:25
@mikeb01 : is the reading always done from a buffer you control (and can therefore pad)?
Michael Barker
@mikeb01
Sep 16 2015 23:25
@giltene Yes.
Because we decompress first into a buffer that I allocate.
There is another interesting check for corruption that could be done. If we have a truncated LEB128 value, i.e. only 8 bytes left in the buffer and the last is >127, then I can simply compare dataIndex == dataLimit: if they are not the same then I must have read past the end of the buffer and will probably have corrupt data.
More generally, if at the end of decoding the buffer dataIndex != dataLimit, I've encountered a truncated counter.
Gil Tene
@giltene
Sep 16 2015 23:38
I see. You don't provide an API for uncompressed buffers. Maybe I should drop that from the Java side as well?
Michael Barker
@mikeb01
Sep 16 2015 23:56
That's correct, when in doubt leave it out :-)