These are chat archives for HdrHistogram/HdrHistogram

2nd
Oct 2015
Alec
@ahothan
Oct 02 2015 15:13 UTC

Regarding JS, I was wondering what the use case would be (server-side JS or browser-side). Since you mention browser-side, it means you really only need the consumption side of HdrHistogram (that is, decode and display).
Visualizing histograms is great and always needed. But just decoding a histogram is not always sufficient because it lacks the associated semantics.

If you look at wrk2, which is a good model for all the distribution/aggregation of collected values, the encoded histogram needs to be sent to a collector along with some app-specific content. Today we have been using a proprietary protocol, which is to encapsulate the encoded histogram into a JSON container that holds application-specific fields - things that go with the histogram content - e.g. sender ID, rate in requests/second, number of connections, error counts, duration of capture... and we send that to the collector using Redis.
If we could define a standard protocol for doing the same thing with the corresponding libraries (in C, Java, Python...), then people would not have to reinvent this every time in their own way. Just thinking out loud: WebSocket + a well-known WebSocket protocol to be registered at IANA (http://www.iana.org/assignments/websocket/websocket.xhtml) + a flexible encapsulation that allows apps to put in their own fields (easy with JSON).
That would have saved me a lot of time, and it would clearly help a lot with adoption.
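
A minimal sketch of the kind of JSON encapsulation described above, using only the Python standard library. The field names (`sender_id`, `rate_rps`, `connections`, `errors`, `duration_sec`, `histogram`) and the helper names are assumptions made for illustration, not an agreed format, and the payload is a placeholder rather than a real encoded histogram.

```python
import base64
import json


def wrap_histogram(encoded: bytes, **fields) -> str:
    """Wrap an already-encoded histogram in a JSON envelope with app fields."""
    envelope = dict(fields)
    # JSON cannot carry raw bytes, so the payload travels as base64 text.
    envelope["histogram"] = base64.b64encode(encoded).decode("ascii")
    return json.dumps(envelope, sort_keys=True)


def unwrap_histogram(message: str):
    """Return (encoded_bytes, metadata_dict) from a JSON envelope."""
    envelope = json.loads(message)
    encoded = base64.b64decode(envelope.pop("histogram"))
    return encoded, envelope


# Placeholder bytes standing in for a real encoded histogram.
payload = b"\x00\x01placeholder-histogram-bytes"
message = wrap_histogram(payload, sender_id="wrk2-node-7", rate_rps=2000,
                         connections=64, errors=3, duration_sec=30)
decoded, meta = unwrap_histogram(message)
```

The resulting string can be pushed over whatever transport is in use (Redis, a WebSocket, and so on); the collector only needs a JSON parser to reach the metadata before it decodes the histogram itself.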

Gil Tene
@giltene
Oct 02 2015 16:51 UTC
@ahothan There are multiple JS use cases. The simplest one is being able to use a browser to view a .hlog file. Much like we can simply view a .hgrm file today with http://hdrhistogram.github.io/HdrHistogram/plotFiles.html , I'd like to point to a set of .hlog files and plot them both on a timeline and by percentiles (the format I use for showing jHiccup data and response time logs). A cool feature would be to be able to drag a range in the timeline and have the percentile distribution be continually updated for that range (e.g. look at the first 10 minutes of the hour on the timeline). We have a thick client app that does that right now, and some folks have been playing with JavaFX (e.g. look at https://github.com/ennerf/HdrHistogramVisualizer). A lightweight browser-based JS version would be really cool.
Another use case is server-side. I really want to create an enhanced statsd that could do two things: (a) receive and forward histograms as a data type, and (b) produce histograms as a data type from its input (as opposed to / in addition to the current percentile summaries it typically does). This would be an important step in getting time-series databases and stores (like graphite and what it uses for storage) to deal with histograms as a first order metric, as without the feeds being able to provide them, the storage is irrelevant. Since statsd is so prevalent (and written in JS), a JS HdrHistogram port would really help this cause...
Gil Tene
@giltene
Oct 02 2015 17:17 UTC
As for additional metadata: you make a good point, especially for display and plotting tools. I think that the best way to address this is by adding "conventions" for metadata tags in comment lines in histogram logs and streams. E.g. the current Java HistogramLogReader already assigns meaning to the optional "#[StartTime:" and "#[BaseTime:" tags (allowing it to output timeline information relative to a base time if needed). We can add additional useful tags by convention.
E.g. a simple label for the log would be a useful thing for anything doing plots to use (e.g. if #[Label: ...] is found in the log stream, the label found can be used by plotting programs in place of relying on the file name). Especially useful when there is no file (data was streamed over a socket). We can also include conventions for metadata about known environment stuff, like various IDs, load parameters (rate in ops/sec, # connections, other load-generator metadata), etc. It would be good to come up with a list of things that we would find useful in some tools.
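
A rough sketch of how a plotting tool might pick up such convention-based tags from comment lines. Only the "#[StartTime:" tag comes from the discussion above as something the Java reader already understands; the `Label` and `Connections` tags, the sample lines, and the function name are hypothetical, and the regular expression is an assumption about how such lines could be matched.

```python
import re

# Matches comment lines of the form "#[Tag: value]".
TAG_LINE = re.compile(r"^#\[(?P<tag>[A-Za-z]+):\s*(?P<value>.*)\]\s*$")


def read_tags(lines):
    """Collect "#[Tag: value]" comment lines into a dict (last one wins)."""
    tags = {}
    for line in lines:
        match = TAG_LINE.match(line.strip())
        if match:
            tags[match.group("tag")] = match.group("value").strip()
    return tags


sample_log = [
    "#[Label: checkout-service p99]",   # hypothetical tag
    "#[Connections: 64]",               # hypothetical tag
    "#[StartTime: 1443798780.000 (seconds since epoch)]",
    "# a plain comment line, ignored",
    "0.127,1.007,2.769,<encoded histogram elided>",
]
tags = read_tags(sample_log)
label = tags.get("Label", "unnamed")  # fall back when no label tag exists
```

Unknown tags are simply collected as strings, so a tool that does not understand a given tag can ignore it without failing on the rest of the log.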
Alec
@ahothan
Oct 02 2015 18:45 UTC
Perhaps you could consider having a V2 version of the log file that would use JSON instead of plain text with metadata embedded in comments? With plain text you have this decoding that is really screen scraping with metadata extraction. It is not the end of the world, but clearly not as clean as using a standard and more descriptive language like JSON. Then instead of a log file it would just be a histogram container with metadata that is parsable using standard libraries. And in my case I won't have to deal with 2 formats of encoded histogram (the Java log version, which I never use but had to implement to pass the interop test, plus my own JSON-container-based version). JSON is interesting because of the existing libraries that support the format and the native support in NoSQL databases.
I think that would also make it easier to write generic browsers that can read all that metadata and add more meaningful information to the display (label semantics, units, time, date...)
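
To make the idea concrete, here is one possible shape for such a container, read with nothing but the standard `json` module. The layout (one JSON object per line, a `header` record followed by `interval` records) and every field name are assumptions for illustration only; no such V2 format is defined in this discussion, and the histogram payloads are elided.

```python
import io
import json

# Hypothetical "V2" log: one JSON object per line, header record first.
sample = io.StringIO(
    '{"type": "header", "label": "api latency", "unit": "usec", "start_time": 1443798780.0}\n'
    '{"type": "interval", "offset_sec": 0.0, "length_sec": 1.0, "histogram": "<base64 elided>"}\n'
    '{"type": "interval", "offset_sec": 1.0, "length_sec": 1.0, "histogram": "<base64 elided>"}\n'
)


def read_container(stream):
    """Split a JSON-lines stream into (header, list_of_intervals)."""
    header, intervals = {}, []
    for line in stream:
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        if record.get("type") == "header":
            header = record
        else:
            intervals.append(record)
    return header, intervals


header, intervals = read_container(sample)
# A generic viewer could build its axis title straight from the metadata.
title = f'{header["label"]} ({header["unit"]})'
```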