Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Repo info
Activity
    Alec
    @ahothan
    Ok, and base64 is optional as I understand?
    Marshall Pierce
    @marshallpierce
    base64 is orthogonal; it has nothing to do with serialization. You might as well hex encode it, or UUEncode, etc
    bytes are bytes :)
    Alec
    @ahothan
    In that case, python supports only "V2+Deflate" with base 64 optionally
    Marshall Pierce
    @marshallpierce
    OK. I think that's just fine as long as it's documented
    Alec
    @ahothan
    and my understanding was that "V2" would mean "V2+Deflate+Base64
    Marshall Pierce
    @marshallpierce
    No, I think that is not correct -- I would document that "V2+Deflate" is the supported mechanism
    Alec
    @ahothan
    and I agree base64 is orthogonal which is why python also supports "V2+Deflate"
    Marshall Pierce
    @marshallpierce
    Alec
    @ahothan
    that's ok, there is just some gray area in the spec (what spec?) wrt to base64 exact role and maybe @giltene can shed some light
    Marshall Pierce
    @marshallpierce
    I don't think base64 has any role to play here. it is just one of many ways of rendering bytes plain text. It happens to be used in logs but has nothing to do with the serialization itself.
    If you look at the implementations of https://github.com/HdrHistogram/HdrHistogram/blob/2268ca2607c54d2e10c160109521cd639d3e1a2c/src/main/java/org/HdrHistogram/EncodableHistogram.java#L23 for instance there is nothing to do with base64 or hex or url-encode or anything else
    Alec
    @ahothan
    I have not looked at other implementations to see if base64 part is optional, I think the go version makes it mandatory (need to check with @filipecosta90)
    Marshall Pierce
    @marshallpierce
    Ah, then that is a mistake by the go version
    It would be doing users a disservice to make all data 33% bigger and slower
    Alec
    @ahothan
    what supported my belief is the log format only supports base64 format
    Marshall Pierce
    @marshallpierce
    Log format != serialization
    Alec
    @ahothan
    and the examples provided for interop as well
    Marshall Pierce
    @marshallpierce
    It is entirely valid (and encouraged) to, for instance, use a plain serialized histogram as a wire format for shipping data to some aggregation service
    If I wanted to provide a copy pastable text version of a .jpg image I might base64 it but that doesn't mean jpg needs to be base64d.
    Alec
    @ahothan
    I agree with you in the general flexibility of optionally compressing/adding base64 - I just was not sure that this was an "officially supported flexibility" - and the lack f interop test dfoes not help. I think we just need to clarify a bit better the exact formats and give them an unambiguous name
    Marshall Pierce
    @marshallpierce
    100% agree we could really use a test corpus, but hard to find the time for that sort of (useful!) project gardening :)
    For the rust impl, I wrote a tool to have the java impl spit out V2 and V2+Deflate versions of canned data so I could use that in my tests
    anyway, I think you should feel free to punt entirely on involving yourself in base64 or any other bytes -> text conversion (except for specifically the interval log format) in the same way that you needn't bother with implementing automatically opening a .tar or .rar or .xz or .hqx or any other possible container for byte data
    Alec
    @ahothan
    Ok, to unblock the OP it looks then that he needs to use V2+Deflate without base 64 to work with current python library
    (or with base64 but I guess no base64 is preferred in his use case)
    Marshall Pierce
    @marshallpierce
    If I'm following the OP right it seems like when he didn't use base64 it wouldn't deser in python, which is curious
    presumably the de-base64ing should be entirely transparent
    (Unless there's a bug in the base64 code, which does happen -- I maintain rust's base64, and people have showed up with invalid base64 generated by other impls... sigh)
    Alec
    @ahothan
    as long as it is compressed it shoudl work, as python does not currently provide an uncompressed coder/decoder API
    Marshall Pierce
    @marshallpierce
    ok, so the takeway: (1) have the OP base64 the V2+Deflate serialization immediately before deser since that seems to work, (2) investigate why the non-base64 path is misbehaving?
    Alec
    @ahothan
    yes
    if you can provide an example of Rust generated V2+Deflate+base64 string I can have a look to see why it would fail decoding
    Marshall Pierce
    @marshallpierce
    I have a todo list entry from about 3 years ago to move and expand on the docs in the rust impl for the serialization and interval log stuff, but spare time y'know... :'(
    Alec
    @ahothan
    yeah, I"m in no hurry either, just to unblock the OP
    Marshall Pierce
    @marshallpierce
    The rust impl includes a basic CLI, so this should do it: jot 100 | cargo run --example cli serialize -c > data.hist
    that will be the numbers 1-100 in a -compressed histogram
    filipe oliveira
    @filipecosta90
    Hi there, good night,
    Wanted to know what you guys think about the following PR on hdrhistogram_c
    HdrHistogram/HdrHistogram_c#89
    It adds the ability to change the HdrHistogram allocator at compile time. This can be important to integrate into other projects that use by default different allocators like jemalloc ( example of redis ).
    Alec
    @ahothan

    Hi there, good night,
    Wanted to know what you guys think about the following PR on hdrhistogram_c
    HdrHistogram/HdrHistogram_c#89
    It adds the ability to change the HdrHistogram allocator at compile time. This can be important to integrate into other projects that use by default different allocators like jemalloc ( example of redis ).

    I think this is a great idea! Only comment is I would have done it a bit differently, instead of using a compile flag, I'd prefer to make this allocator selection dynamic and have an extra API to customize the malloc/calloc functions and use function pointers that default to the standard malloc/calloc

    Alec
    @ahothan
    this has the advantage of not having to deal with multiple hdr shared libraries (for example for packaging hdr_c)
    also solves the problem of importing extra header files when recompiling hdr_c code , of having a fork of that code just for changing the memory manager
    wdyt @mikeb01 ?
    Michael Barker
    @mikeb01
    I'll have a review of the PR in the next few days.
    Gil Tene
    @giltene
    A question & idea down the path of @filipecosta90 recent efforts towards more efficient percentile computation: where do things stand with regards to packed histogram support in the innards of our various implementations (either in already-there or planning-to-add-soon form)? One if the internal tricks I was going to add as a next step in the java implementation was to make scans through all non-zero values of a histogram much faster on packed histograms (than would be possible on non packed ones), adding a benefit beyond memory footprint to packed histogram approach. Computing percentiles is a classic consumer of this, and could be dramatically accelerate as a result. Does one of you want to take a stab at that?
    filipe oliveira
    @filipecosta90
    @giltene I would like to push on the go version as the next task if you guys agree. wdyt @ahothan ?
    PS: with regards to the C version I would also be very much interested in exploring any type of optimization. A reduced C version of the hdrhistogram is already present at redis-benchmark but not in the redis-server ( and we're discussing using it on the server for redis 7 -- meaning all optimizations (computation and memory wise) the C version can have are a way of getting closer to include it in redis-server )
    bottom line I can give a hand on the C and Go versions if you guys agree
    PS: opened the PR for the C version proposing hdr_value_at_percentiles() as a means for redundant computation removal
    HdrHistogram/HdrHistogram_c#90
    Alec
    @ahothan
    A question & idea down the path of @filipecosta90 recent efforts towards more efficient percentile computation: where do things stand with regards to packed histogram support in the innards of our various implementations (either in already-there or planning-to-add-soon form)?
    @giltene could you summarize what this packed histogram feature is?
    Michael Barker
    @mikeb01
    I'll get the C PRs merged in this week. Happy to take a PR for a packed histogram implementation, but it could be interesting to see how it would integrate cleanly. Perhaps a separate side by side implementation.
    Michael Barker
    @mikeb01
    Could figure out a good reason not to merge the allocator PR, done now.
    Michael Barker
    @mikeb01
    Sorry that should say "Couldn't figure..."