Description
PyMC 5.13 incorporates the var_names parameter in pm.sample(). The documentation says: "var_names: Names of variables to be stored in the trace. Defaults to all free variables and deterministics."
This comes in very handy for something I've been trying to do in Bambi. I'm now porting Bambi to use this feature and noticed weird results in the tests. I reproduced one of the models in plain PyMC and saw the same problem. Have a look at this:
import arviz as az
import numpy as np
import pymc as pm
batch = np.array(
    [
        1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5,
        6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10
    ]
)
temp = np.array(
    [
        205, 275, 345, 407, 218, 273, 347, 212, 272, 340, 235, 300, 365,
        410, 307, 367, 395, 267, 360, 402, 235, 275, 358, 416, 285, 365,
        444, 351, 424, 365, 379, 428
    ]
)
y = np.array(
    [
        0.122, 0.223, 0.347, 0.457, 0.08 , 0.131, 0.266, 0.074, 0.182,
        0.304, 0.069, 0.152, 0.26 , 0.336, 0.144, 0.268, 0.349, 0.1  ,
        0.248, 0.317, 0.028, 0.064, 0.161, 0.278, 0.05 , 0.176, 0.321,
        0.14 , 0.232, 0.085, 0.147, 0.18
    ]
)
batch_values, batch_idx = np.unique(batch, return_inverse=True)
coords = {
    "batch": batch_values
}
with pm.Model(coords=coords) as model:
    b_batch = pm.Normal("b_batch", dims="batch")
    b_temp = pm.Normal("b_temp")
    mu = pm.Deterministic("mu", pm.math.invlogit(b_batch[batch_idx] + b_temp * temp))
    kappa = pm.Gamma("kappa", alpha=2, beta=2)
    alpha = mu * kappa
    beta = (1 - mu) * kappa
    pm.Beta("y", alpha=alpha, beta=beta, observed=y)
I want to sample the posterior, but I don't want to store the draws of "mu" by default. So I use var_names=["b_batch", "b_temp", "kappa"] (and I also sample without var_names to see the difference).
with model:
    idata_1 = pm.sample(random_seed=1234)
    idata_2 = pm.sample(var_names=["b_batch", "b_temp", "kappa"], random_seed=1234)
When I don't use var_names I get the following posterior:
az.plot_trace(idata_1, backend_kwargs={"layout": "constrained"});
and when I use var_names it's the following:
az.plot_trace(idata_2, backend_kwargs={"layout": "constrained"});
which makes me think it's basically omitting the likelihood and thus sampling from the prior.
Is this behavior expected?
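For anyone who wants to check the "sampling from the prior" suspicion numerically rather than by eye, here is a rough sketch. It relies on b_batch and b_temp using the pm.Normal defaults (mean 0, standard deviation 1), so draws that still have roughly that mean and spread would suggest the likelihood was ignored. The helper names summarize_draws and matches_prior and the tolerance are made up for illustration; this is a heuristic, not a formal test.

```python
import numpy as np


def summarize_draws(draws):
    """Return (mean, sample standard deviation) of the flattened draws."""
    flat = np.asarray(draws, dtype=float).ravel()
    return float(flat.mean()), float(flat.std(ddof=1))


def matches_prior(draws, prior_mean=0.0, prior_sd=1.0, tol=0.2):
    """Heuristic: True if mean and sd are both within tol of the prior's.

    tol is an arbitrary illustrative threshold (an assumption, not a
    calibrated statistical test).
    """
    mean, sd = summarize_draws(draws)
    return abs(mean - prior_mean) < tol and abs(sd - prior_sd) < tol


# Hypothetical usage with the objects from this issue (not run here):
#   matches_prior(idata_1.posterior["b_temp"].values)
#   matches_prior(idata_2.posterior["b_temp"].values)
```

If the second call returns True while the first returns False, that would be consistent with idata_2 being drawn from the prior.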