@henryiii
Member

@henryiii henryiii commented Feb 10, 2021

This closes #413 with the proposal I made in scikit-hep/uproot3#511: we match uproot4 by default, returning normal NumPy histogram output from to_numpy() on all storages. If the full view is required, mode="view" can be passed. In boost-histogram 1.0, this will be type checked by MyPy. An expression like this:

mplhep.histplot(h.to_numpy())

will no longer crash if h is not backed by a simple storage. Of course, since mplhep supports PlottableProtocol now, there's no need to call to_numpy there, but other classic interfaces out of our control, including Uproot3 writing, still use it.

I've also added clarification to the docstring that to_numpy is not a replacement for calling h.axes.edges(); to_numpy returns the actual NumPy edges, not the boost-histogram ones, which differ on the upper edge. If you were to refill the histogram using np.histogram, this would produce identical results if you use the edges returned by to_numpy, but not if you use the edges returned boost-histogram style.
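To illustrate the upper-edge difference, here is a minimal NumPy-only sketch. It assumes that the boost-histogram convention is a half-open last bin and that NumPy-style edges compensate by nudging the final edge down by one floating-point step; the variable names and data are made up for the example, and this is not the library's actual implementation.

```python
import numpy as np

# Illustrative data; 2.0 sits exactly on the upper edge.
data = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
edges = np.array([0.0, 1.0, 2.0])

# np.histogram treats the last bin as closed, [1, 2], so 2.0 is counted.
closed_counts, _ = np.histogram(data, bins=edges)

# Assumption: moving the last edge down by one ULP makes NumPy's closed
# last bin behave like a half-open [1, 2) bin, so 2.0 falls out of range.
nudged = edges.copy()
nudged[-1] = np.nextafter(edges[-1], -np.inf)
half_open_counts, _ = np.histogram(data, bins=nudged)

print(closed_counts, half_open_counts)
```

The two calls differ only in whether a value lying exactly on the upper edge is counted, which is why refilling with np.histogram depends on which set of edges is used.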

@github-actions github-actions bot added the needs changelog Might need a changelog entry label Feb 10, 2021
@henryiii henryiii added this to the 0.13.0 milestone Feb 11, 2021
@henryiii henryiii changed the title feat: to_numpy matching uproot, with mode feat: to_numpy matching uproot, extra mode Feb 12, 2021
@henryiii
Member Author

henryiii commented Feb 12, 2021

We never really came to a decision, so I just proposed my favorite here as a starting point. As I see it, there are two constraints. a) If we pass the same arguments as Uproot, we should match Uproot's behavior. b) We are constrained by NumPy; the point of to_numpy is to allow boost-histogram to support the current histogramming ecosystem outside of Scikit-HEP, and NumPy does not have the idea of extra information, so it must be opt-in. We could make mode= a boolean switch, but if we wanted to add a mode="variances" or something like that, a string is more expandable. I'm very open, though. Enums, different spellings, etc.
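As a rough sketch of why a string-valued mode is easier to extend than a boolean, consider the hypothetical function below. The name to_numpy_sketch, its arguments, and the "numpy" default spelling are all invented for illustration and are not the boost-histogram API.

```python
from typing import Literal

# Hypothetical set of modes; a new spelling such as "variances" could be
# added here later without changing the function signature.
Mode = Literal["numpy", "view"]


def to_numpy_sketch(values, variances, edges, mode: Mode = "numpy"):
    """Toy stand-in for a to_numpy-like call (not the real API)."""
    if mode == "numpy":
        # Plain NumPy-style output: values only, extra information dropped.
        return list(values), list(edges)
    if mode == "view":
        # Opt-in: keep the extra per-bin information alongside the values.
        return list(zip(values, variances)), list(edges)
    raise ValueError(f"unknown mode: {mode!r}")


plain = to_numpy_sketch([1.0, 2.0], [0.5, 0.7], [0.0, 1.0, 2.0])
full = to_numpy_sketch([1.0, 2.0], [0.5, 0.7], [0.0, 1.0, 2.0], mode="view")
```

A boolean flag would need a second keyword for each new output flavor, whereas a Literal (or an Enum) can grow by one member and still be checked by MyPy.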

Another example of an interface is matplotlib.pyplot.stairs. It is designed to take NumPy style input. So, with the current design,

plt.stairs(*h.to_numpy())

will work even with weighted storage; if you want to plot uncertainties, you can then make a call to plt.errorbar just below this, etc.
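For the follow-up plt.errorbar call, the inputs can be prepared along these lines. This is only a sketch: the arrays are made-up stand-ins for the values, variances, and edges of a weighted histogram, and how the variances are obtained from the histogram is left as an assumption.

```python
import numpy as np

# Made-up stand-ins for a weighted histogram's contents.
values = np.array([3.0, 7.5, 4.2])      # per-bin sums of weights
variances = np.array([1.5, 3.1, 2.0])   # per-bin variances (assumed available)
edges = np.array([0.0, 1.0, 2.0, 3.0])  # NumPy-style bin edges

# Error bars are drawn at bin centers, with sqrt(variance) as the uncertainty.
centers = 0.5 * (edges[:-1] + edges[1:])
yerr = np.sqrt(variances)
```

With Matplotlib 3.4 or newer, plt.stairs(values, edges) followed by plt.errorbar(centers, values, yerr=yerr, fmt="none") would then draw the steps with uncertainties on top.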

Let me write up my idea for Flow, and I'll open that. Edit: Done, in #504.

@henryiii henryiii force-pushed the feat/to_numpy branch 2 times, most recently from d555cc9 to 9367c82 Compare February 16, 2021 17:22
@henryiii henryiii merged commit 98e2bfe into develop Feb 18, 2021
@henryiii henryiii deleted the feat/to_numpy branch February 18, 2021 05:07
@henryiii henryiii changed the title feat: to_numpy matching uproot, extra mode feat: to_numpy matching uproot, view parameter Feb 18, 2021
@henryiii henryiii removed the needs changelog Might need a changelog entry label Feb 18, 2021
Development

Successfully merging this pull request may close these issues.

to_numpy improvements for views

3 participants