
Conversation

aseyboldt (Member)

Step size adaptation:
Currently, we compute the acceptance rate of a draw only over the part of the trajectory that is accepted at the end, which discards information about energy errors in the rest of the trajectory. After this PR we compute the mean acceptance rate over all leapfrog steps we take.
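As an illustration of the difference (a minimal sketch with made-up names, not the code in this PR):

```python
import numpy as np

def mean_accept_statistic(energy_errors):
    """Sketch: energy_errors[i] = H(start) - H(point after leapfrog step i)."""
    # Per-step Metropolis acceptance probability, capped at 1.
    accept_probs = np.minimum(1.0, np.exp(energy_errors))
    # Before: only the steps inside the finally accepted subtree contribute
    # to the statistic used for step size adaptation.
    # After this PR: every leapfrog step of the trajectory contributes,
    # so no energy-error information is thrown away.
    return accept_probs.mean()
```

This averaged statistic is what the step size adaptation then targets.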

Divergences:
Currently we count large deviations from the initial energy as a divergence regardless of the direction of the energy error.
This PR changes this so that only large energy errors that correspond to a low acceptance probability are counted as a divergence. I think this helps with some models in the initial phase of sampling, where with the old behavior we cut short promising trajectories that are "too good", leading to worse performance.
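A sketch of the new rule (the function name and the threshold are illustrative, not the actual PyMC code):

```python
MAX_ENERGY_ERROR = 1000.0  # illustrative threshold

def is_divergent(delta_energy):
    """delta_energy = H(proposal) - H(start) for a single leapfrog step."""
    # Old rule: any large deviation from the initial energy is a divergence,
    # regardless of its sign:
    #   return abs(delta_energy) > MAX_ENERGY_ERROR
    # New rule: only energy errors that imply a *low* acceptance probability
    # count; a large error in the favorable direction (a "too good"
    # trajectory) no longer stops the trajectory.
    return delta_energy > MAX_ENERGY_ERROR
```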

Also a little bit of housekeeping:
This introduces an additional sampler statistic: index_in_trajectory. I don't see much of a use-case for this other than teaching and debugging, but it proved useful while comparing the sampler to other implementations.
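A usage sketch, assuming the new statistic is exposed like the other NUTS sampler stats (the toy model here is just a placeholder):

```python
import pymc as pm

with pm.Model():
    pm.Normal("x")
    idata = pm.sample(draws=500, tune=500, chains=1)

# index_in_trajectory should appear next to the other NUTS sampler stats,
# mostly useful for teaching and debugging as described above.
print(idata.sample_stats["index_in_trajectory"])
```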

I also introduced some array copies for the integrated momentum array. I thought I saw bugs while comparing sampler behavior to nuts-rs, but somehow I can't reproduce those now. Still, I think it's better to be safe.
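The kind of aliasing hazard a defensive copy guards against, as a toy numpy example (not the integrator code itself):

```python
import numpy as np

momentum = np.zeros(3)
saved = momentum          # keeps a reference to the live buffer
momentum += 1.0           # in-place update during a later leapfrog step
print(saved)              # [1. 1. 1.] -- the "saved" state changed too

momentum = np.zeros(3)
saved = momentum.copy()   # defensive copy
momentum += 1.0
print(saved)              # [0. 0. 0.] -- unchanged, as intended
```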

codecov bot commented May 30, 2022

Codecov Report

Merging #5824 (7f1e031) into main (6b22ed5) will decrease coverage by 0.02%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##             main    #5824      +/-   ##
==========================================
- Coverage   89.40%   89.38%   -0.03%     
==========================================
  Files          74       74              
  Lines       13772    13771       -1     
==========================================
- Hits        12313    12309       -4     
- Misses       1459     1462       +3     
| Impacted Files | Coverage Δ |
|---|---|
| pymc/step_methods/hmc/base_hmc.py | 90.55% <100.00%> (+0.07%) ⬆️ |
| pymc/step_methods/hmc/hmc.py | 92.59% <100.00%> (+0.13%) ⬆️ |
| pymc/step_methods/hmc/integration.py | 78.84% <100.00%> (ø) |
| pymc/step_methods/hmc/nuts.py | 97.40% <100.00%> (-0.10%) ⬇️ |
| pymc/step_methods/hmc/quadpotential.py | 80.69% <100.00%> (+0.14%) ⬆️ |
| pymc/parallel_sampling.py | 86.46% <0.00%> (-1.00%) ⬇️ |

ColCarroll (Member) left a comment:

looks good!

two questions:

  1. Is the change from log_weighted_accept_sum to log_accept_sum what changes the step size adaptation? I am having trouble seeing that.
  2. These changes are delicate enough that I might ask for, say, 10k samples from a random 100d Gaussian to show that the sampler still produces unbiased samples.

Member

is this for the future?

Member Author (aseyboldt)

That got in here by accident...
I can take it out, though it is useful for the new code in covadapt (which I think is actually pretty nice, and which I'll try to get into pymc at some point).

Member

If you want to leave it in, I'm OK with it (so long as it has a comment!). I wonder what you think of Algorithm 2 in https://proceedings.mlr.press/v151/hoffman22a/hoffman22a.pdf as a means of estimating scales?

Member Author (aseyboldt)

Thomas sent me a link to that paper just this morning. I also wonder what I'll think about Algorithm 2; it looks fascinating though. ;-)
If you'd like to compare ideas with what I did in covadapt, I just wrote a sketch of an intro in the README: https://github.com/aseyboldt/covadapt
If you don't understand what I'm talking about over there, that's my fault. I'll try to improve it soon. :-)

twiecki requested a review from ColCarroll May 30, 2022 15:23
aseyboldt (Member Author)

@ColCarroll

is the change from log_weighted_accept_sum to log_accept_sum what is making the change in step size adaptation? I am having trouble seeing that.

Yes. Maybe it is easier to see if you only look at the first commit.
Previously, we kept the acceptance-rate information in the Subtree structure; now there is only the attribute log_accept_sum in the NutsTree class, which keeps track of the sum of acceptance rates, plus a counter n_proposals that counts how many leapfrog steps we took.
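A rough sketch of this bookkeeping, using the attribute names mentioned above (the real NutsTree carries more state, and the exact accumulation in pymc may differ):

```python
import numpy as np

class AcceptanceTracker:
    """Toy stand-in for the bookkeeping described above."""

    def __init__(self):
        self.log_accept_sum = -np.inf  # log of the summed acceptance probabilities
        self.n_proposals = 0           # number of leapfrog steps taken

    def update(self, delta_energy):
        """delta_energy = H(new point) - H(start) for one leapfrog step."""
        log_accept = min(0.0, -delta_energy)  # log min(1, exp(-dH))
        self.log_accept_sum = np.logaddexp(self.log_accept_sum, log_accept)
        self.n_proposals += 1

    @property
    def mean_accept(self):
        return np.exp(self.log_accept_sum) / self.n_proposals
```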

these changes are careful enough that I might ask for, say, 10k samples from a random 100d gaussian to show that it still provides unbiased samples.

Something like that should already be covered by the tests (https://github.com/pymc-devs/pymc/blob/main/pymc/tests/test_posteriors.py), but you are right to ask; I'll also run it manually to make sure. (We have to trigger the long-running tests manually anyway.)
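For reference, a rough sketch of such a manual check, here with a 100-dimensional standard normal and 4 × 2500 draws (the tuning settings and seed are arbitrary):

```python
import numpy as np
import pymc as pm

with pm.Model():
    pm.Normal("x", mu=0.0, sigma=1.0, shape=100)
    idata = pm.sample(draws=2500, tune=1000, chains=4, random_seed=123)

# Stack chains and draws: shape (100, 10000).
x = idata.posterior["x"].stack(sample=("chain", "draw")).values
print("max |mean|:", np.abs(x.mean(axis=1)).max())              # should be near 0
print("std range :", x.std(axis=1).min(), x.std(axis=1).max())  # should be near 1
```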

aseyboldt (Member Author)

Notebook with the promised verification of the posterior: https://gist.github.com/aseyboldt/43354eded52981340304d3aec94ced3c
I also added a few lines in the docstring about the new stats.

ricardoV94 (Member) commented Jun 3, 2022

Notebook with the promised verification of the posterior: https://gist.github.com/aseyboldt/43354eded52981340304d3aec94ced3c I also added a few lines in the docstring about the new stats.

Can we include that as a slow test (which will not run by default)?

ricardoV94 (Member)

@ColCarroll can we get a binary review from you :D?

ColCarroll (Member) left a comment:

apologies for the wait, and thanks for the verification, @aseyboldt!

ricardoV94 merged commit c120b02 into pymc-devs:main Jun 21, 2022
ricardoV94 (Member)

Thanks @aseyboldt and @ColCarroll!

ColCarroll (Member)

I guess this is an enhancement of the NUTS algorithm, so it will change results for NUTS users (i.e., most users), presumably for the better. My impression (I defer to Adrian!) is that the differences should be fairly subtle, or only matter for the most careful or difficult models, so it does not require (say) a minor version release (which I'd suggest if the old algorithm were arguably wrong) and can go out whenever the next release was planned anyway.
