These are chat archives for HdrHistogram/HdrHistogram

11th
Aug 2015
Michael Haberler
@mhaberler
Aug 11 2015 17:03 UTC
@ahothan - in case the wire format issue is not decided, I would suggest protobuf - it is the problem-free Swiss Army knife of serialization. You can go from it to almost arbitrary formats without writing (de)serializers, thanks to introspection. And it has a standard readable/parsable representation as well (TextFormat). Actually, in some cases the Message instance object is good enough to work as an internal application data structure, which then happens to be automatically (de)serializable
we use it with JSON, XML, TextFormat and the protobuf wire format without custom translators - all driven by the .proto definition
Alec
@ahothan
Aug 11 2015 18:10 UTC
@mhaberler just to be clear, you are referring to Google Protocol Buffers?
I have used it and found it pretty good as well. However I'm afraid the wire encoding is about to get frozen now that it is already implemented in Java, C and now Python (I'll let @giltene speak on that).
Michael Haberler
@mhaberler
Aug 11 2015 18:11 UTC
yes. google protobufs
Alec
@ahothan
Aug 11 2015 18:11 UTC
Using GPB would have saved me a bunch of time implementing this encoding/decoding in python though ;-)
Michael Haberler
@mhaberler
Aug 11 2015 18:11 UTC
amen
Alec
@ahothan
Aug 11 2015 18:12 UTC
note that the C binding for GPB is not quite standardized
Michael Haberler
@mhaberler
Aug 11 2015 18:14 UTC
well the wire format is, and that's what counts; and if push comes to shove with C runtime support (e.g. embedded/realtime kernels) we use nanopb, which won't even need malloc - not as elegant in use as C++/Python GPB but certainly minimal in prerequisites; it even runs on AVRs (sans double)
the protobuf-c binding is kind of lame, I agree
Gil Tene
@giltene
Aug 11 2015 20:54 UTC
GPB is not very useful for describing an HdrHistogram. It may be useful for describing something that contains one (or more), but for the internal encoding it doesn't really help much, and just requires lots of (otherwise unneeded) overhead and dependencies (as do most other field-based auto-serializers). E.g. the wire format has a handful of flat, well defined fields in it, followed by an array of values with a length that depends on the overall "value" of the HdrHistogram. And since zlib compression is supported (and expected to provide a very good yield) within each HdrHistogram value, the format doesn't tend to fit well into the way serializers tend to think of compression...
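To make the shape described above concrete, here is a rough Python sketch of "a handful of flat fields, followed by an array of values, with zlib applied inside the value". It is only an illustration: the field list, field widths, and the `EXAMPLE_COOKIE` marker are made-up assumptions and are not the actual HdrHistogram V0/V1 encoding.

```python
import struct
import zlib

# Hypothetical layout for illustration only, NOT the real HdrHistogram format:
# cookie (int32), significant digits (int32), lowest value (int64),
# highest value (int64), all big-endian, followed by int64 counts.
HEADER = struct.Struct(">iiqq")
EXAMPLE_COOKIE = 0x1234ABCD  # made-up marker value


def encode(significant_digits, lowest, highest, counts):
    """Pack the flat header and the counts array, then zlib-compress the lot."""
    header = HEADER.pack(EXAMPLE_COOKIE, significant_digits, lowest, highest)
    body = struct.pack(">%dq" % len(counts), *counts)
    return zlib.compress(header + body)


def decode(blob):
    """Reverse of encode(); the array length is inferred from the payload size."""
    raw = zlib.decompress(blob)
    cookie, digits, lowest, highest = HEADER.unpack_from(raw, 0)
    if cookie != EXAMPLE_COOKIE:
        raise ValueError("unrecognized format cookie")
    n = (len(raw) - HEADER.size) // 8
    counts = list(struct.unpack_from(">%dq" % n, raw, HEADER.size))
    return digits, lowest, highest, counts
```

Because a counts array is usually dominated by zeros and small numbers, the compressed blob in a sketch like this tends to be far smaller than the raw array, which is the "very good yield" point made above.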
Gil Tene
@giltene
Aug 11 2015 21:08 UTC
The purpose of the HdrHistogram wire format is to capture what a histogram "value" looks like as a BLOB, in a normalized, protocol independent (and overarching buffer scheme independent) way. I think of an HdrHistogram as something that is much closer to a "long" (or maybe "string") than a "struct": It's a basic value type (that is mutable; not to be confused with Java's upcoming value types). I expect it to show up as a basic value (in BLOB form) in things like time-series databases, as well as overall protocols that are used to transmit and/or persist histogram values. There is already a good example of this happening here: https://github.com/Searchlight/khronus . Unfortunately khronus uses an "enhanced" wire format that is not compatible with either V0 or V1, and this serves as a good example of why we should really want to standardize the format...
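A small Python sketch of the "histogram as a basic value" idea: the enclosing record treats the encoded histogram as an opaque BLOB, the same way it would treat a long or a string, and never looks inside it. The record shape and the names `to_row` / `from_row` are hypothetical, chosen only for illustration.

```python
import base64
import json


def to_row(timestamp, metric, blob):
    """Wrap an already-encoded histogram BLOB in a JSON record (assumed shape)."""
    return json.dumps({
        "ts": timestamp,
        "metric": metric,
        # The outer format only carries the bytes; it does not interpret them.
        "hist": base64.b64encode(blob).decode("ascii"),
    })


def from_row(row):
    """Recover the timestamp, metric name, and the untouched BLOB."""
    rec = json.loads(row)
    return rec["ts"], rec["metric"], base64.b64decode(rec["hist"])
```

With this split, the outer protocol (JSON here, but it could equally be protobuf or a database column) can change freely, as long as every producer and consumer agrees on the one standardized encoding of the BLOB itself.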