Log name of RVs that are being resampled in posterior predictive #5973

@ricardoV94

Description

Right now, sample_posterior_predictive is very opaque about which RVs are being resampled. Any RV that is downstream of a Shared variable or of another variable mentioned in var_names is resampled, as is any variable missing from the trace. We should log their names, similar to how pm.sample shows which variables are being sampled.

For instance, in the example below it might be unclear that y is being resampled, as it does not appear in any log message or in the posterior_predictive output:

import pymc as pm

with pm.Model() as m:
    x = pm.MutableData("x", 0)
    y = pm.Normal("y", x)  # y is downstream of the shared variable x
    z = pm.Normal("z", y, observed=0)
    idata = pm.sample(chains=1, tune=0, draws=5, random_seed=1)
    pm.set_data({"x": 100})
    # y is silently resampled here, even though it has draws in the trace
    pm.sample_posterior_predictive(idata, extend_inferencedata=True, random_seed=1)

print(idata.posterior_predictive.data_vars)  # z (chain, draw) float64 101.8 101.5 ...
print(idata.posterior_predictive["z"].mean().values)  # 100.30893015576387
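
To make the rule above concrete, here is a minimal pure-Python sketch of how the resampled names could be collected and logged. It does not use PyMC internals: the function name `vars_to_resample`, the `parents` dictionary, the logger name, and the "Sampling: [...]" message format are all assumptions made for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)
# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("posterior_predictive_sketch")


def vars_to_resample(parents, trace_vars, volatile):
    """Return the names of RVs that would be resampled.

    parents:    dict mapping each RV name to the names of its direct inputs,
                listed in topological order (inputs before dependents).
    trace_vars: names of variables that have draws in the trace.
    volatile:   names treated as changed, e.g. Shared variables or
                entries of var_names.
    """
    tainted = set(volatile)
    resampled = []
    for name, inputs in parents.items():
        if (
            name in tainted                  # explicitly requested
            or name not in trace_vars        # missing from the trace
            or tainted.intersection(inputs)  # downstream of a tainted variable
        ):
            resampled.append(name)
            tainted.add(name)  # propagate to everything downstream
    return resampled


# Mirrors the model above: x (shared) -> y -> z (observed, not in the trace).
graph = {"y": ["x"], "z": ["y"]}
names = vars_to_resample(graph, trace_vars={"y"}, volatile={"x"})
logger.info("Sampling: [%s]", ", ".join(names))
```

With a message like this, the user would see that y is resampled together with z, instead of having to infer it from the shifted mean of z.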
