These are chat archives for HdrHistogram/HdrHistogram

22nd
Jul 2015
Michael Barker
@mikeb01
Jul 22 2015 11:31
@ahothan I think you can do it reasonably efficiently using a bit of ctypes magic. See this: https://gist.github.com/mikeb01/08c096864dd415e4dc95 for a partial implementation.
Gil Tene
@giltene
Jul 22 2015 20:36
A few notes:
  1. I use 2 significant decimal digits in almost all practical applications. +/- 1% across the dynamic range is more than virtually all monitoring use cases need, and MUCH better than virtually all other types of (non-HDR) histograms.
(this dramatically reduces the memory footprint of individual histograms, obviously, down to low 10s of KB)
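To put a rough number on the footprint claim above, here is a minimal Python sketch of how the counts array size could be estimated. It assumes the bucket layout follows the commonly described HdrHistogram sizing logic (a power-of-two sub-bucket count, one more bucket per doubling of range) and 8-byte counters. The function name and the 1 usec to 1 hour range are illustrative choices, not something stated in this chat.

```python
def counts_array_length(lowest, highest, significant_digits):
    """Estimate how many counters a histogram configuration needs (assumed layout)."""
    # Values up to this bound must be resolvable at single-unit resolution.
    largest_single_unit = 2 * 10 ** significant_digits
    # ceil(log2(n)) for n >= 1, computed with integer arithmetic.
    sub_bucket_count_magnitude = (largest_single_unit - 1).bit_length()
    sub_bucket_count = 1 << sub_bucket_count_magnitude
    sub_bucket_half_count = sub_bucket_count // 2
    # floor(log2(lowest)): the unit size of the first bucket.
    unit_magnitude = lowest.bit_length() - 1
    # Keep doubling the covered range until `highest` fits.
    smallest_untrackable = sub_bucket_count << unit_magnitude
    buckets_needed = 1
    while smallest_untrackable <= highest:
        smallest_untrackable <<= 1
        buckets_needed += 1
    return (buckets_needed + 1) * sub_bucket_half_count


ONE_HOUR_USEC = 3_600_000_000  # illustrative range: 1 usec to 1 hour
for digits in (2, 3):
    length = counts_array_length(1, ONE_HOUR_USEC, digits)
    print(digits, "digits:", length, "counters,", length * 8 // 1024, "KB")
```

Under these assumptions, two digits need 3328 counters (about 26 KB) and three digits need 23552 (about 184 KB), which is consistent with the "low 10s of KB" figure.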
Gil Tene
@giltene
Jul 22 2015 20:44
  2. The use of auto-resize during decoding could have a huge impact on practical histogram size as well, especially in wire format, data transfer, and work-needed-to-hydrate. In many monitoring situations, "most" interval histograms have a much smaller actual covered range than the occasional "spikey" interval. E.g. a 1:10 or 1:20 ratio between intervals that include >20 msec noise and ones that don't is very common in hiccup monitoring. As a result, the wire formats that encode the histogram's contents with an array length capped by the actual max value (as opposed to the max possible value in the histogram's configured range) save a lot of space "on the wire". Similarly, if we use auto-resize when opening the encoded histogram into a newly allocated histogram object, we will probably reduce practical allocation pressure dramatically in applications that do a lot of decoding.
  3. Decoding into an existing histogram object (overwriting all its settings and resizing its storage only if needed) will probably be the best solution for applications that decode a lot (e.g. read a lot of interval histograms from a log file or network traffic).
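The two ideas above (cap the encoded array at the actual max value, and decode into an existing object that grows only when needed) can be sketched roughly as follows. This is not the HdrHistogram wire format: the 4-byte length header, the plain list of counts, and the names `encode_trimmed` and `decode_into` are all assumptions made for illustration.

```python
import struct


def encode_trimmed(counts):
    """Encode only the prefix of `counts` up to the last non-zero entry."""
    used = len(counts)
    while used > 0 and counts[used - 1] == 0:
        used -= 1
    payload = struct.pack(f">{used}q", *counts[:used])
    # Hypothetical header: just the payload length in bytes.
    return struct.pack(">i", len(payload)) + payload


def decode_into(counts, data):
    """Overwrite an existing counts list, growing it only if it is too short."""
    (payload_length,) = struct.unpack_from(">i", data, 0)
    used = payload_length // 8
    if used > len(counts):
        counts.extend([0] * (used - len(counts)))
    counts[:used] = struct.unpack_from(f">{used}q", data, 4)
    # Clear whatever the previous contents left beyond the decoded range.
    for i in range(used, len(counts)):
        counts[i] = 0
    return counts


# A "quiet" interval: configured for 3328 counters, but only the low range is used.
quiet = [0] * 3328
quiet[10] = 5
quiet[100] = 2
encoded = encode_trimmed(quiet)
print(len(encoded), "bytes instead of", 4 + len(quiet) * 8)

reused = [7] * 50  # existing storage, too small for this interval
decode_into(reused, encoded)
print(len(reused), reused[10], reused[100])
```

In this sketch a quiet interval costs bytes proportional to its highest recorded index, and a long-lived target list is reallocated only when a spikier interval arrives.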
Michael Barker
@mikeb01
Jul 22 2015 20:50
@giltene The latest jHiccup binary download still appears to use the older V0 format that doesn't include payloadLength.
Version 2.0.2.
Gil Tene
@giltene
Jul 22 2015 20:51
  4. I haven't put much work into allocation efficiency on the decoding side thus far, mostly because decoding tends to be a much less performance (and latency) critical function than encoding (it will usually be a process that is separate from the applications that actually measure and record latencies). On the encoding side, I try hard to avoid buffer allocation. E.g. histogram instances cache an intermediateUncompressedByteBuffer used in the encoding process, since the same histogram object tends to be used repeatedly in encoding output, e.g. in a Recorder pattern.
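The caching pattern described above can be sketched in Python as an encoder that keeps one scratch buffer and reuses it across calls. The class and attribute names are hypothetical, and the serialization (big-endian 8-byte counts followed by zlib) is an assumption for illustration, not the library's actual encoding path.

```python
import struct
import zlib


class CachingEncoder:
    """Reuses one intermediate uncompressed buffer across encode calls."""

    def __init__(self):
        self._scratch = bytearray()  # grown on demand, never shrunk

    def encode(self, counts):
        needed = 8 * len(counts)
        if len(self._scratch) < needed:
            # Only allocate when the cached buffer is too small.
            self._scratch = bytearray(needed)
        struct.pack_into(f">{len(counts)}q", self._scratch, 0, *counts)
        # Compress only the part of the scratch buffer that was just written.
        return zlib.compress(memoryview(self._scratch)[:needed])


encoder = CachingEncoder()
first = encoder.encode([1, 2, 3])
scratch_after_first = encoder._scratch
second = encoder.encode([4, 5, 6])
print(encoder._scratch is scratch_after_first)  # same buffer object reused
```

The compressed output is still a fresh object on each call in this sketch; only the intermediate uncompressed buffer is cached.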
Michael Barker
@mikeb01
Jul 22 2015 20:56
I've updated the fix I made to support Direct ByteBuffers to cache the intermediate byte array in the same manner. It's not particularly efficient as there is still a copy cost, but doesn't allocate a byte array on every decode.
Gil Tene
@giltene
Jul 22 2015 20:58
@mikeb01 Good point. I just pushed a jHiccup 2.0.5 pom change that uses the latest HdrHistogram (2.1.5). So the github jHiccup is more up to date. Will build a tarball for the download site as well.
Michael Barker
@mikeb01
Jul 22 2015 21:00
I've realised that my V1 decoding is broken: I don't make use of the payload_length field. I'll fix that today, in combination with some optimisations for buffer reuse for both encode and decode.
On the 'C' implementation.
Gil Tene
@giltene
Jul 22 2015 21:09
@mikeb01 Looking at the changes for support of DirectByteBuffer, I guess I see the problem (decompressor sadly doesn't know how to deal with the non-heap array, so you need an on-heap one for the de-compression, and then have to copy its contents into the buffer with buffer.get()). But I'm confused about why anyone would want to use a DirectByteBuffer as a target. Is this change intended to just make it work if they do? Another approach would be to simply not support use of DirectByteBuffer targets, and to throw an exception if the buffer has no getArray() to decode into...
@mikeb01 re: jHiccup, can you build the current 2.0.5 version (use the tag) and kick it around a bit in your envs? I'd love to get a bit of external exposure before I update the actual downloadable tarball for jHiccup...
Michael Barker
@mikeb01
Jul 22 2015 21:12
Making it work if the caller happened to use a Direct ByteBuffer was the main intent. I was going to put a sensible IllegalArgumentException on the call to replace the confusing UnsupportedOperationException, but the implementation was fairly straightforward.
If annoyingly inefficient.
A shame that a number of the older, but still useful parts of the standard library haven't been updated to use ByteBuffers.
Gil Tene
@giltene
Jul 22 2015 21:14
The non-caching implementation is/was straightforward, I agree. The caching starts to smell of undesired complication for a "probably shouldn't be using it this way" use case...
Michael Barker
@mikeb01
Jul 22 2015 21:15
Happy to revert the latest change if you like.
Gil Tene
@giltene
Jul 22 2015 21:15
No problem with keeping it though, especially since you added these nice tests.
Michael Barker
@mikeb01
Jul 22 2015 21:16
JUnit Theories are very cool, even though the docs are squirrelled away in a corner.
Gil Tene
@giltene
Jul 22 2015 21:16
Agree on the lack of updates in standard lib to use DirectByteBuffer. decompressor is a good example of this.
Michael Barker
@mikeb01
Jul 22 2015 21:18
The patch that Richard Warburton did to support building Strings from ByteBuffers without a double copy is another.