make vi (posterior) mean and std accessible as a structured xarray #6086
The error messages from the "Read the Docs build" don't look like they have anything to do with this PR. Is there something wrong with the docstrings? |
@markusschmaus Thanks for your contribution! Yes, please ignore the Read the Docs error; sorry for the false alarm. I'll review your submission now at the code level, but will defer to my more VI-savvy colleagues here for the math and user-facing questions. Which brings me to my next question: the PR is currently marked as a draft. Did you want a review now, or were you still planning to work on it some more? |
Codecov Report
@@            Coverage Diff             @@
##             main    #6086      +/-   ##
==========================================
+ Coverage   89.54%   89.60%   +0.06%
==========================================
  Files          72       72
  Lines       12929    12947      +18
==========================================
+ Hits        11577    11601      +24
+ Misses       1352     1346       -6
|
Thanks, I left it at draft level as I was still trying to debug the docs issue. I'm polishing up a few things and then I will change the status. |
Should an |
I thought about using an It's already possible to get a true |
I was thinking something along the lines of the "sample_stats" or "observed_data" entries in the |
Let's go through the options: https://python.arviz.org/en/latest/schema/schema.html#schema
So a straightforward xarray looks best to me. |
Can we just get a dictionary? Never mind, you are passing dims around as well. |
Yeah, I find the coords just too useful not to use them. I considered returning a dict of numpy arrays when no coords are given, but this would result in an inconsistent return type, which I always find a pain to deal with when a library does this. |
Was hoping you'd be able to add custom attributes to |
What do you mean? You can, last time I checked |
The spec is only enforced with a warning:
So I could ignore this warning and wrap the xarrays in an Inference data object which doesn't conform to the spec, though I still don't see any benefits for doing so. It wouldn't make sense to start sampling just to be able to fill any of the other fields, since the whole point of this PR is to give the user the ability to extract mean and std without sampling. If it's about the syntax and you prefer |
I meant you can add attributes to one of the "allowed" groups. In this case I was thinking you could add it to the posterior group. |
The point of the PR is to avoid sampling just for extracting the mean and std, so there are no posterior samples and no posterior group. I could create an empty group, but I still don't see the benefit. |
Fair enough |
@fonnesbeck @ricardoV94 What's the future of this PR? |
General PSA: datasets have a dict-like interface, so for most purposes you can simply ignore the fact that you have a dataset and treat it as a dictionary. |
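To illustrate the dict-like interface mentioned above, here is a minimal sketch using plain xarray. The variable names (`intercept`, `slope`) and the `feature` coordinate are made up for illustration and are not taken from this PR.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for a Dataset of posterior means; the names are
# illustrative only.
means = xr.Dataset(
    {
        "intercept": ((), 0.5),
        "slope": (("feature",), np.array([1.0, -2.0])),
    },
    coords={"feature": ["x1", "x2"]},
)

# Iterating over a Dataset yields its data variable names, like dict keys.
names = list(means)

# Item access returns a labelled DataArray, which can be selected by coord.
slope_x2 = means["slope"].sel(feature="x2").item()

# .items() works as on a dict, so converting to plain numpy arrays is one line.
as_dict = {name: da.values for name, da in means.items()}
```

If the labels are not needed, the `as_dict` line is enough to get back to a dictionary of numpy arrays.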
Let's rebase and merge this. |
@markusschmaus are you good with us merging this as-is, or are there any additional changes you'd like to make? Sorry it's taken so long; it kind of fell off the radar! |
Merge conflicts fixed by #6387 |
What is this PR about?
For VI, the mean and std of approximations are currently only available as unstructured flat Aesara Variables. This leads to frequent questions on how to extract these properties from the posterior. This PR creates two new properties which evaluate the Aesara Variables and transform them into a structured xarray Dataset using the available coords.
See also:
https://discourse.pymc.io/t/quality-of-life-improvements-to-advi/10254
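The general idea of turning a flat vector into a labelled Dataset can be sketched roughly as follows. This is not the code from this PR: the helper `unflatten_to_dataset`, its arguments, and the example variables are hypothetical, and it assumes the flat vector stores the variables back to back in the order given by `shapes`.

```python
import numpy as np
import xarray as xr


def unflatten_to_dataset(flat, shapes, dims, coords):
    """Split a flat vector into named variables and label them with coords.

    Hypothetical helper for illustration only; not part of PyMC's API.
    """
    data_vars = {}
    start = 0
    for name, shape in shapes.items():
        size = int(np.prod(shape))  # a scalar variable has shape () and size 1
        block = flat[start:start + size].reshape(shape)
        data_vars[name] = (dims.get(name, ()), block)
        start += size
    if start != flat.size:
        raise ValueError("flat vector length does not match the given shapes")
    return xr.Dataset(data_vars, coords=coords)


# Illustrative values: one scalar and one vector-valued variable.
flat_mean = np.array([0.5, 1.0, -2.0, 3.0])
shapes = {"intercept": (), "slope": (3,)}
dims = {"slope": ("feature",)}
coords = {"feature": ["x1", "x2", "x3"]}

mean_ds = unflatten_to_dataset(flat_mean, shapes, dims, coords)
slope_x3 = mean_ds["slope"].sel(feature="x3").item()
```

With the coords attached, values can be looked up by label (as in `slope_x3`) instead of by position in the flat vector.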
Checklist
Major / Breaking Changes
Bugfixes / New features
Docs / Maintenance
mean, std, and cov properties