loc, we need to agree on something
@jpivarski What do you think about using the "keyword"
slice to do a slicing cut in C++? It would look something like this:
auto h2 = reduce(h1, slice(2, 3));
Note that I have to use functions such as
slice, because C++ does not have keyword arguments.
std::span directly, because it is C++20, but I could use a type called
std::span can be constructed in many different ways; one way is from a pair of iterators. I suppose that's the meaning you are referring to, because axis indices behave a bit like iterators (but not quite, they are not dereferenceable).
reduce currently, but I could add that functionality
reduce in C++ was designed in this way so that the expensive allocation of a new storage buffer is only done once.
hist2 = hist.shrink(0, 3, 6).rebin(0, 2).shrink(1, 2, 4).rebin(1, 2)
does four expensive allocations and deallocations. Allocating from the heap is one of the most expensive operations nowadays, costing 100 to 1000 cycles.
hist2 = reduce(hist, shrink_and_rebin(0, 3, 6, 2), shrink_and_rebin(1, 2, 4, 2))
does only one such allocation, which is the minimum.
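To make the allocation argument concrete, here is a rough sketch in plain Python for a single axis. It is not the Boost.Histogram implementation: the function names `shrink_then_rebin_1d` and `fused_shrink_rebin_1d` are made up for this illustration, the counts are a plain list, and the cut is given as bin indices rather than axis values. The chained version builds an intermediate list per step, while the fused version knows the final size up front and fills one output buffer.

```python
def _check(counts, begin, end, merge):
    # Both variants assume the kept range splits evenly into merged bins.
    if not (0 <= begin <= end <= len(counts)) or merge < 1:
        raise ValueError("invalid range or merge factor")
    if (end - begin) % merge != 0:
        raise ValueError("range is not a multiple of the merge factor")


def shrink_then_rebin_1d(counts, begin, end, merge):
    """Chained version: shrink first, then rebin the intermediate result."""
    _check(counts, begin, end, merge)
    shrunk = counts[begin:end]  # intermediate list, thrown away afterwards
    return [sum(shrunk[i:i + merge]) for i in range(0, len(shrunk), merge)]


def fused_shrink_rebin_1d(counts, begin, end, merge):
    """Fused version: the output buffer is created once and filled in place."""
    _check(counts, begin, end, merge)
    out = [0] * ((end - begin) // merge)  # the only buffer that is created
    for i in range(begin, end):
        out[(i - begin) // merge] += counts[i]
    return out


counts = [1, 2, 3, 4, 5, 6, 7, 8]
print(shrink_then_rebin_1d(counts, 2, 6, 2))
print(fused_shrink_rebin_1d(counts, 2, 6, 2))
```

Both calls give the same merged bins; the difference is only in how many temporary buffers are created along the way, which is the cost that a single reduce call is meant to avoid.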
reduce needs to be a free function, not a method
.view(flow=True) and the rest of my calculation then operates on numpy arrays and pandas objects.
Another thing I find a bit odd: to my mind a canonical operation of a histogram is to a) fill it with data and then b) ask, for some (new) value z, "what is the percentile for z?" IIUC this requires operations "outside" the Boost.Histogram library (e.g. h.view().cumsum(), h.axis(0).index(X), etc.), which I find surprising.
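For reference, the percentile lookup described above can be sketched in a few lines of plain Python. This is only an illustration of the arithmetic, not library code: `percentile_of` is a hypothetical helper, `edges` and `counts` are plain lists standing in for the axis edges and the bin contents, underflow and overflow bins are ignored, and entries are assumed to be spread uniformly within each bin.

```python
from bisect import bisect_right
from itertools import accumulate


def percentile_of(z, edges, counts):
    """Approximate percentile (0 to 100) of z from binned counts.

    edges has len(counts) + 1 entries; bin i covers [edges[i], edges[i + 1]).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    if z <= edges[0]:
        return 0.0
    if z >= edges[-1]:
        return 100.0
    i = bisect_right(edges, z) - 1    # index of the bin that contains z
    cum = [0, *accumulate(counts)]    # cum[i] = number of entries below edges[i]
    frac = (z - edges[i]) / (edges[i + 1] - edges[i])  # position inside bin i
    return 100.0 * (cum[i] + frac * counts[i]) / total


edges = [0, 1, 2, 3, 4]
counts = [1, 2, 3, 4]
print(percentile_of(2.5, edges, counts))
```

The cumulative sum and the bin lookup are the same two steps mentioned in the question (a cumulative sum over the counts and an index lookup on the axis); the interpolation inside the bin is an extra assumption of this sketch.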
Boost.Histogram provides Python bindings to the C++ library of the same name. Boost and the C++ standard library follow the philosophy of providing a set of orthogonal components. Each one is specialized to do just one thing really well, and interfaces are provided to combine all these components to do amazing things. The keyword here is orthogonal: one library does not duplicate functionality that can already be obtained from another library component.
This is the most efficient and powerful way to design a set of libraries, because it requires you to learn the smallest number of library interfaces. I don't know whether you played with Lego as a kid, but this is the same thing. You can build anything with Lego, because all the pieces fit together.
The Python wrapper to Boost.Histogram will follow the same philosophy. It is a mapping of the C++ functionality. On top of that, Henry is working on a library called
hist, which provides an interface that lets you do common analysis tasks in a flash.
Just to end on a positive note - the speed of the library is amazing. Also, I really like the focus of the library - no pointless methods to plot the contents of the histogram. As a C++ developer I like that I can write code that can accept objects created from a Python script. So all in all I'm very pleased, and look forward to making use of more advanced features.
That's great to hear, you are one of the users I had in mind when I designed Boost.Histogram :).