Skip to content

Conversation

andrzejnovak
Copy link
Member

@andrzejnovak andrzejnovak commented Jun 22, 2021

When completed, closes #222

To implement:

  • sort by passing index
  • sort by passing lambda to sorted
  • add tests

@henryiii could you take a look? I had to abuse some design decisions of hist to get this work, so maybe you can recommend better workarounds.

  • AxisNamedTuple (and I assume axes) as well cannot be assigned to, so I had to recreate the sorted axis
  • Couldn't easily copy all meta/traits from old axis to new - hence the helper with inspect (name/label don't seem to be part of traits/metadata) - also cannot create axis and edit metadata

Here is a couple of examples
2D - yax
image
2D - xax
image
1D
image
3D - sort on not-projected axis
image

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jun 22, 2021

Sourcery Code Quality Report

❌  Merging this PR will decrease code quality in the affected files by 1.10%.

Quality metrics Before After Change
Complexity 6.58 ⭐ 6.58 ⭐ 0.00
Method Length 57.64 ⭐ 64.48 🙂 6.84 👎
Working memory 9.12 🙂 9.02 🙂 -0.10 👍
Quality 65.48% 🙂 64.38% 🙂 -1.10% 👎
Other metrics Before After Change
Lines 427 474 47
Changed files Quality Before Quality After Quality Change
src/hist/basehist.py 65.48% 🙂 64.38% 🙂 -1.10% 👎

Here are some functions in these files that still need a tune-up:

File Function Complexity Length Working Memory Quality Recommendation
src/hist/basehist.py BaseHist.__init__ 18 🙂 176 😞 12 😞 40.32% 😞 Try splitting into smaller methods. Extract out complex expressions
src/hist/basehist.py BaseHist.from_columns 12 🙂 160 😞 11 😞 48.49% 😞 Try splitting into smaller methods. Extract out complex expressions
src/hist/basehist.py BaseHist.sort 8 ⭐ 246 ⛔ 9 🙂 49.69% 😞 Try splitting into smaller methods
src/hist/basehist.py BaseHist.profile 3 ⭐ 181 😞 11 😞 55.06% 🙂 Try splitting into smaller methods. Extract out complex expressions
src/hist/basehist.py BaseHist.fill 2 ⭐ 90 🙂 12 😞 64.77% 🙂 Extract out complex expressions

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!


data = (data_dict[i] for i in range(len(args), self.ndim))

total_data = tuple(args) + tuple(data) # Python 2 can't unpack twice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally unrelated, but we should unpack twice, hist has never supported Python 2... :)

@henryiii
Copy link
Member

henryiii commented Jun 22, 2021

Would scikit-hep/boost-histogram#577 make this easier? h[{i: sorted(h.axes[i]})] would do it, I think? (I'm applying hist's UHI+ on top of boost-histogram's PR in this example)

@henryiii
Copy link
Member

henryiii commented Jun 22, 2021

Couldn't easily copy all meta/traits from old axis to new

I think, post refactor (bh 0.13 or so?), this is now just copying the __dict__ from the old to the new.

>>> h = Hist.new.Reg(10, 0, 1, name="hi", label="ho").Double()
>>> h.axes[0].this = "that"
>>> h.axes[0].__dict__
{'name': 'hi', 'label': 'ho', 'this': 'that'}

@andrzejnovak
Copy link
Member Author

Ok yeah, arguably scikit-hep/boost-histogram#577 could make this entire API redundant, though it might still be worth having as a shorthand for the 2 most obvious use cases I see - sort by label, sort by integral.

Btw the bh draft doesn't work for StrCat, I guess you are aware with the bh.loc comment.

Any ETA on that draft? This sorting histograms question comes up quite often for me and it's fairly experience breaking to reply with "export to 1Ds and pass as a list to mplhep".

@henryiii
Copy link
Member

Before PyHEP; but hopefully this week. Probably will be right after the PR above it (scaling).

@henryiii
Copy link
Member

henryiii commented Jul 6, 2021

This is going in as an "experimental" feature in boost-histogram 1.1 - scikit-hep/boost-histogram#577 . It does not add removed bins to the flow bin when a selection is not complete, but that should not matter for this PR, so the warning could be silenced.

@henryiii henryiii marked this pull request as ready for review July 7, 2021 04:00
@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

Only have a sort by label currently, but I think this is good enough for a first pass?

@henryiii henryiii merged commit 318cf56 into scikit-hep:main Jul 7, 2021
@andrzejnovak
Copy link
Member Author

Great, thanks a lot! It's amazing how much easier this is with the axis picking implemented.

@LovelyBuggies
Copy link
Collaborator

LovelyBuggies commented Aug 8, 2021

Only have a sort by label currently, but I think this is good enough for a first pass?

@henryiii FYI, do you mean h.project(*[i[0] for i in sorted(enumerate(h.axes.label), key=lambda x:x[1], reverse=False)])? Or do we have another shortcut to sort by the labels?

from hist import Hist

h = Hist.new.StrCat(["AB", "BCC", "BC"], name="Y", label="Y-axis").IntCat([1, 4, 2], name="X", label="X-axis").Double()
# sort StrCat
h.sort(0, reverse=True).axes[0]
# sort axes by name
h.project(*[i[0] for i in sorted(enumerate(h.axes.name), key=lambda x:x[1])]) # we may want h.sort_by_names or sorted_by_names(h)
# sort axes by label
h.project(*[i[0] for i in sorted(enumerate(h.axes.label), key=lambda x:x[1], reverse=False)]) # we may want h.sort_by_labels or sorted_by_labels(h)

@henryiii
Copy link
Member

No, I mean you only have access to the StrCategory label (like AB), and not the position (but you could get that by capturing it, so I don't think there's any loss). We don't have a way to sort the axes, but that's likely not needed, either just create the histogram with the axes in a different order, don't rely on the axis order at all, or use project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEATURE] Reorder categorical axes
3 participants