feat: Allow for clipping negative yields in the main model #1845
Conversation
Codecov Report
```diff
@@           Coverage Diff           @@
##           master    #1845   +/-  ##
=======================================
  Coverage   98.17%   98.17%
=======================================
  Files          68       68
  Lines        4322     4331    +9
  Branches      726      730    +4
=======================================
+ Hits         4243     4252    +9
  Misses         46       46
  Partials       33       33
```
/cc @mdhank
@HDembinski - this is a possibly interesting edge case for Minuit. Minuit has issues with convergence on an (admittedly hacked-together) workspace that throws negative yields. If we explicitly add clipping to make sure the data is non-negative during minimization, there's no issue. (SciPy seems to be "robust" regardless.)
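A minimal NumPy sketch (toy numbers, not pyhf's actual implementation) of why a negative yield breaks the minimization: the Poisson term `log(lambda)` is undefined for `lambda <= 0`, so the objective returns NaN and a gradient-based minimizer can stall, while a clipped expectation stays finite.

```python
import numpy as np

def poisson_nll(n, lam):
    """Per-bin Poisson negative log-likelihood (constant terms dropped)."""
    return lam - n * np.log(lam)

n_observed = np.array([5.0, 3.0, 7.0])
expected = np.array([4.8, -0.2, 6.9])  # one bin has a negative yield

with np.errstate(invalid="ignore"):
    bad = poisson_nll(n_observed, expected)
print(np.isnan(bad).any())  # True: log of a negative number gives NaN

# Clipping the expectation to a small positive floor keeps the objective
# finite, at the price of a kink in the cost function (discussed below).
clipped = np.clip(expected, 1e-10, None)
good = poisson_nll(n_observed, clipped)
print(np.isfinite(good).all())  # True
```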
@kratsg Overall LGTM. 👍 Thanks!
I just have a question on the tolerances, but beyond that this seems good. Also is there any associated GitHub Issue or Discussion for this? Or was this only brought to our attention through email?
(We can ignore that Semantic Pull Request isn't reporting at the moment given zeke/semantic-pull-requests#183 and just visually check. This PR is fine. 👍)
This is not a Minuit issue. You should not have negative predictions for bin entries in a maximum-likelihood fit, ever. Negative expectations are mathematically forbidden. Construct the model and limit the model parameters so that this never happens. If SciPy handles this better, it is by chance. The main sources of negative expectations are: using Chebyshev polynomials for the background instead of Bernstein polynomials, and allowing signal or background fractions that are negative. There is a related study in the iminuit docs about using SciPy as an optimiser and a constrained fit. I can't link it here because I am on the phone.
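The point about the polynomial basis can be shown in a few lines: Chebyshev basis functions change sign, so even positive coefficients can produce a negative shape, while Bernstein basis functions are non-negative on [0, 1], so non-negative coefficients guarantee a non-negative polynomial. An illustrative sketch (not from this PR), with coefficients chosen for the demonstration:

```python
import math
import numpy as np
from numpy.polynomial.chebyshev import chebval

x = np.linspace(0.0, 1.0, 501)

# Chebyshev series live on [-1, 1]: map x in [0, 1] onto that domain.
t = 2.0 * x - 1.0
cheb = chebval(t, [0.2, 1.0])  # 0.2 + 1.0 * T1(t), positive coefficients
print(cheb.min())  # dips below zero near x = 0

# Bernstein basis of degree 2: every basis function is >= 0 on [0, 1],
# so any non-negative coefficients give a non-negative shape.
coeffs = [0.2, 1.0, 0.5]
n = len(coeffs) - 1
bern = sum(
    c * math.comb(n, k) * x**k * (1.0 - x) ** (n - k)
    for k, c in enumerate(coeffs)
)
print(bern.min())  # never negative
```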
(Linking here before I sign off for the night) I think the study is this one: SciPy minimizers and constraints (which is the
@matthewfeickert Thank you for adding the link, yes, that's the one I was referring to. To add to my previous comment: this workaround is not going to work well, because by clipping the predictions you introduce discontinuities in the cost function. Minuit and other gradient-based minimisers expect the cost function to be analytical: an infinitesimal change in the parameters produces an infinitesimal change in the value of the cost function. When you clip, this is no longer the case. You may think that you can use a gradient-free minimiser as a way out, but in general that does not work either. If the algorithm converges to a minimum that has some of the predictions clipped, then the estimates found are no longer maximum-likelihood estimates (because you didn't mathematically maximize a likelihood function). In general, this means you lose one of the key properties of a maximum-likelihood fit: that the covariance matrix of the fitted parameters can be computed from the matrix of second derivatives (if second derivatives can be computed at all despite the discontinuities of the cost function). You are only OK if the minimum that is found does not have any of the predictions clipped.
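The discontinuity described here is in the gradient: `clip` leaves the cost function continuous but non-differentiable at the clipping boundary, so one-sided finite differences disagree on the two sides. A small numerical illustration with an assumed toy cost function (not pyhf code):

```python
import numpy as np

def clipped_cost(theta):
    """Toy cost: the 'prediction' theta is clipped at zero before use."""
    lam = np.clip(theta, 0.0, None)
    return (lam - 1.0) ** 2

h = 1e-6
# One-sided finite differences straddling the clipping boundary theta = 0.
left = (clipped_cost(0.0) - clipped_cost(-h)) / h   # ~ 0: flat clipped region
right = (clipped_cost(h) - clipped_cost(0.0)) / h   # ~ -2: quadratic slope
print(left, right)  # the gradient jumps at the boundary
```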
@HDembinski, in general I would agree that a well-designed likelihood should not spit out negative yields. I think it happens when you have a lot of shape systematics that are uncorrelated but pulled in the same direction. I think limiting the model parameters manually would make the most sense if your model is simple enough (some of the issues from within ATLAS correspond to models with > 100 parameters, so it becomes slightly more intractable to hunt down all problematic parameters). In ROOT, I will point out that this is enabled by default, which means ROOT is using Minuit with the forced positive-definite option. I think it would be useful to understand why SciPy seems to be able to converge in these cases while Minuit (which is more strict and careful) will rightfully complain. Does this indicate, even with the comparison studies you've done, that SciPy is wrong and should be more strict? Or is the argument you're making that for poorly-defined workspaces, everything goes out of the window when it comes to comparing Minuit and SciPy?
Yeah, this is actually the reason why I and others like @alexander-held are surprised to see this turned on by default in ROOT, so we're not entirely sure why. We can add specific documentation in the docstring to say that this option is "dangerous" (or that users should be careful trusting any results that need this option).
I think it is not so difficult, because most of the parameters are shape parameters. You can change those as you like; it will not generate negative bin expectations. You may get zero expectation in some bin, which is also bad, but easier to work around. You cannot work around negative expectations. The main problem is the use of Chebyshev polynomials for the background and allowing amplitudes of pdfs to be negative. Both are bad traditions that we need to break away from.
Can you please link to that code in ROOT, because I don't know what you mean? You talk about positive-definiteness, but that is a different thing from what I am talking about. Positive-definiteness means that the curvature of the cost function is positive in all directions. The matrix of second derivatives has to be positive-definite for Minuit/Migrad both in ROOT and in iminuit.
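For reference, the positive-definiteness meant here is a property of the matrix of second derivatives (the Hessian) of the cost function at the minimum, and it can be checked numerically via the eigenvalues. An illustrative two-parameter sketch, not tied to any particular fit:

```python
import numpy as np

# Hessian of f(a, b) = a**2 + a*b + b**2 (it is constant everywhere).
hessian = np.array([[2.0, 1.0],
                    [1.0, 2.0]])

# Positive-definite iff all eigenvalues are strictly positive; this is
# the condition Migrad needs at the minimum, in ROOT and iminuit alike.
eigenvalues = np.linalg.eigvalsh(hessian)
print((eigenvalues > 0).all())  # True: a genuine minimum in all directions
```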
Minuit's Migrad algorithm is not more strict and careful. On the contrary, Migrad as implemented in Minuit2, is less robust than some of the algorithms in SciPy in my experience. Migrad cannot recover well from situations where the cost function returns NaN. I think there is a potential for improvement there, but it is not an easy fix.
My comparisons indicate that sometimes Migrad works better than other minimisers and sometimes the other minimisers work better. It depends on the specific problem. My argument is that people need to define their models properly and there is little that you can do on the level of pyhf to fix bad models constructed by users. Another (unrelated) common problem is that users generate models with strongly correlated parameters. There is also nothing you can do in pyhf to prevent that. You should educate your users about this as I am doing.
Hans, thanks for being patient and sorry for not responding as quickly. I will quote the email from @alexander-held, but this was pointed out by @mdhank, who found the relevant code and did the digging.
I agree. I'm probably making this a bit confusing because this is the name that ROOT seems to have chosen, but looking at the code, it doesn't seem to be related to derivatives or the cost function, but rather simply the yields. When I refer to "positive-definite", I'm referring to the phrasing ROOT is using, rather than my understanding of what it means in minimization, which is what you explained. It's the reason why I chose a different name for the option in `pyhf`.
Noted, I think I understand where you're coming from. I still would like to provide this functionality in `pyhf` to at least match what exists in ROOT, but keep it as non-default.
In the context of
#1884 might also be of interest in this context: it's not specifically about protecting against negative yields, but protecting against NaNs due to ill-defined templates. I don't know where/how exactly ROOT protects against this. |
> I still would like to provide this functionality in `pyhf` to at least match what exists in ROOT, but keep it as non-default. I do want to add a more verbose description in the documentation to clarify the expectations of the minimizers and the expectations of the users, to make sure people understand that it is not up to `pyhf` to enforce that workspaces are well-defined, but we'll do our best.
@kratsg (thanks for your patience and I hope that I never do such a poor job of responding to you again) I think this is a reasonable path forward for now. Though to avoid more delays on this I think making a dedicated docs Issue for the descriptions/discussions would be a good idea.
I have not tested this very systematically, but have used this for a few problematic workspaces in the past few weeks and have not run into issues with the implementation. It does achieve the intended effect in the examples I have run.
I have also tested this some on my workspaces, and I've found it does seem to work as intended in my cases. Interestingly, it seems clipping bins might improve stability more than clipping samples; I've run into some cases where the former works but not the latter.
@mdhank Thanks for testing this out. I'm quite behind on things at the moment (this PR is testimony to that), but if you have examples of where this is helping, can you share them?
* Update lower bound on `iminuit` to `v2.7.0` as it is the "oldest" SemVer release that has the behavior:
  > `Minuit.hesse` no longer sets the status of the `FunctionMinimum` unconditionally to valid if it was invalid before.

  Without this fix, the situation can arise where `Minuit.hesse()` can cause `Minuit.valid` to be `True` when it should be `False`.
* Update `tests/constraints.txt` to use `iminuit==2.7.0`.
* Amends PR #1845
The workspaces in question contain non-public ATLAS results, but I have sent versions of them to the pyhf developers via e-mail in March-April. |
Description
Provides functionality for clipping `expected_data` in the `_MainModel` via `pyhf.Model(clip_sample_data=0.0, clip_bin_data=0.0)` keyword arguments. Added to the docstrings. Backwards compatibility is preserved: by default, no clipping is done. There does not seem to be a performance hit in checking for clipping or not. Clipping can be done at two levels: for each individual sample (`clip_sample_data`) or for each bin of the summed samples (`clip_bin_data`).
Note that ROOT seems to primarily clip at the level of the individual samples, so both options are added, leaving it up to the user to decide whether they want to use this. If all but one sample are fine and one sample provides large negative yields, you can either zero out the sample or zero out the bin.
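The difference between the two levels can be sketched with plain NumPy (hypothetical yields, not the actual `pyhf` implementation): sample-level clipping floors each sample's contribution before summing, while bin-level clipping sums first and floors the totals.

```python
import numpy as np

# Hypothetical per-sample expected yields, shape (n_samples, n_bins).
sample_yields = np.array([
    [5.0,  4.0, 3.0],   # well-behaved sample
    [0.5, -6.0, 0.2],   # sample with a large negative yield in bin 1
])

# clip_sample_data=0.0: floor each sample, then sum over samples.
per_sample = np.clip(sample_yields, 0.0, None).sum(axis=0)

# clip_bin_data=0.0: sum over samples first, then floor each bin total.
per_bin = np.clip(sample_yields.sum(axis=0), 0.0, None)

print(per_sample)  # [5.5 4.  3.2]
print(per_bin)     # [5.5 0.  3.2]
```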
See documentation changes here: https://pyhf--1845.org.readthedocs.build/en/1845/_generated/pyhf.pdf.Model.html#pyhf.pdf.Model
Checklist Before Requesting Reviewer
Before Merging
For the PR Assignees: