These are chat archives for HdrHistogram/HdrHistogram

27th
Jul 2015
Alec
@ahothan
Jul 27 2015 20:36
@tmontgomery: so far I have not found any way to reuse an output buffer for the decompression in Python (short of extending the zlib Python wrapper to support custom memory management). I suppose Java must have the same limitation.
Todd L. Montgomery
@tmontgomery
Jul 27 2015 20:43
probably
it's annoying that those capabilities are just not exposed
Michael Barker
@mikeb01
Jul 27 2015 20:53
@ahothan In Java you pass in the output buffer, so it is fairly easy to reuse one. Python's API doesn't seem to have an option for that.
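For illustration only, here is a minimal Python sketch of the closest workaround with the standard zlib module: inflate in bounded chunks and copy each chunk into a caller-owned bytearray. The function name `decompress_into` and the `chunk_size` parameter are hypothetical. This does not give true buffer reuse, since every step still allocates a temporary chunk; it only keeps those temporaries small.

```python
import zlib


def decompress_into(compressed, out, chunk_size=4096):
    """Inflate `compressed` into the preallocated bytearray `out`.

    Hypothetical helper: returns the number of bytes written. Each call to
    decompress() still allocates a temporary bytes object (normally at most
    `chunk_size` long), because zlib takes no destination buffer.
    """
    inflater = zlib.decompressobj()
    view = memoryview(out)
    written = 0
    data = compressed
    while data:
        # Bound the size of the temporary output chunk.
        chunk = inflater.decompress(data, chunk_size)
        end = written + len(chunk)
        if end > len(out):
            raise ValueError("output buffer too small")
        view[written:end] = chunk
        written = end
        # Input that was not consumed because of the output limit.
        data = inflater.unconsumed_tail
    # Collect any output still pending inside the decompressor.
    tail = inflater.flush()
    end = written + len(tail)
    if end > len(out):
        raise ValueError("output buffer too small")
    view[written:end] = tail
    return end
```

The same `out` bytearray can be passed to every call, so only the short-lived chunks are allocated per decode.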
Alec
@ahothan
Jul 27 2015 21:48
@mikeb01: thanks for the Python code snippet, this will save me a lot of time! Using a ctypes array of c_longlong is a great way to store the counters, probably more efficient than a list. I'll try to get the encode/decode done and will see how to avoid copies. Even skipping the header using [x:y] will introduce a copy (encoded[uncompressed_header_len:len(encoded) + uncompressed_header_len]), so I'll see what else is available to avoid that.
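A rough sketch of the two ideas above, assuming Python 3: slicing a memoryview skips the header without copying the payload, and a ctypes array of c_longlong can be reused as the counter store. The name `decode_counts`, the header length, and the native byte order are assumptions made for this example; this is not the actual V0 or V1 wire format. zlib.decompress still returns a newly allocated bytes object.

```python
import ctypes
import zlib

COUNTER_SIZE = ctypes.sizeof(ctypes.c_longlong)


def decode_counts(encoded, header_len, counts):
    """Decompress the payload after the header into the ctypes array `counts`.

    Hypothetical helper: assumes the payload is a zlib stream of 64-bit
    counters in native byte order. Returns the number of counters written.
    """
    # Slicing a memoryview does not copy the underlying bytes.
    payload = memoryview(encoded)[header_len:]
    raw = zlib.decompress(payload)  # still allocates a new bytes object
    n = len(raw) // COUNTER_SIZE
    if n > len(counts):
        raise ValueError("counter array too small")
    # Copy the decoded counters into the reusable array.
    ctypes.memmove(counts, raw, n * COUNTER_SIZE)
    return n
```

The `counts` array can be allocated once, for example as `(ctypes.c_longlong * 8192)()`, and passed to every decode.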
Michael Barker
@mikeb01
Jul 27 2015 21:54
Note that there are 2 formats. The snippet shows the older format, V0. V1 has some slightly different decoding semantics.
And some more fields.
Alec
@ahothan
Jul 27 2015 22:02
Is there a document somewhere that describes the various wire formats? I see there are at least V0 and V1. I think reducing the span of the compressed array will largely benefit the decoding side, because then we do not need to allocate the full array every time (especially to store zeros). Since most runs will result in relatively aggregated counters, it would make sense to exclude not only the upper counters that are zero (as suggested by @giltene) but also the lower zero counters. Even with 2 digits I still have around 6K counters, and in my case having to allocate only a few hundred counters per decode out of 6K would make sense (I still have not found how to keep decompress from creating new storage every time).
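The trimming idea above could be sketched roughly as follows: find the span of counters between the first and last non-zero entries, and carry only that span plus its start index. The function names `nonzero_span` and `restore_span` are hypothetical, and this is not part of any existing HdrHistogram format.

```python
def nonzero_span(counts):
    """Return (start, end) such that counts[start:end] holds every non-zero counter.

    For an all-zero array the span is empty (start == end).
    """
    start = 0
    end = len(counts)
    # Skip the zero counters at the low end.
    while start < end and counts[start] == 0:
        start += 1
    # Skip the zero counters at the high end.
    while end > start and counts[end - 1] == 0:
        end -= 1
    return start, end


def restore_span(start, values, counts):
    """Write the trimmed `values` back into `counts` beginning at index `start`.

    Assumes `counts` was zeroed beforehand and is long enough.
    """
    if start + len(values) > len(counts):
        raise ValueError("counter array too small")
    for offset, value in enumerate(values):
        counts[start + offset] = value
```

With this layout the decoder only has to materialize the trimmed span, which is the few hundred counters mentioned above rather than the full array.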