BUG: Datatype of Discrete RVs is changed to float64 when observed data has missing values #6424

@jessegrabowski

Describe the issue:

Issue first reported here. When a categorical likelihood is given observed data that contains missing values, the resulting variable cannot be used as an index variable, because the combined missing+observed data vector created in model.make_obs_var does not inherit the dtype of the underlying RV (it ends up as float64 instead of an integer type).

This will cause unexpected behavior if the user wants to index with the variable elsewhere in the model.
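As a rough illustration of the mechanism described above, here is a NumPy-only analogy. It does not reproduce the actual code in model.make_obs_var; the `combine` helper and the stand-in `imputed` values are hypothetical, and the sketch simply assumes the combined vector is written into a buffer whose dtype defaults to float.

```python
import numpy as np

# Integer observations plus stand-ins for two imputed (missing) entries.
observed = np.array([1, 1, 0, 0, 2])
imputed = np.array([2, 0])  # hypothetical sampled values
mask = np.array([False] * 5 + [True] * 2)


def combine(observed, imputed, mask, dtype=None):
    """Hypothetical helper: merge observed and imputed values into one vector.

    With dtype=None, np.zeros allocates a float64 buffer, so the integer
    values are silently upcast when they are written into it.
    """
    out = np.zeros(mask.shape, dtype=dtype)
    out[~mask] = observed
    out[mask] = imputed
    return out


table = np.arange(30).reshape(10, 3)

as_float = combine(observed, imputed, mask)                       # float buffer
as_int = combine(observed, imputed, mask, dtype=observed.dtype)   # inherits dtype

# Indexing with the float vector fails; the integer vector works.
try:
    table[:, as_float]
    float_index_ok = True
except IndexError:
    float_index_ok = False

picked = table[:, as_int]
```

Allocating the buffer with the dtype of the source values is what keeps the combined vector usable as an index in this analogy.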

Reproducible code example:

import pymc as pm
import numpy as np
import pytensor.tensor as pt

data = np.ma.masked_equal([1, 1, 0, 0, 2, -1, -1], -1)
something_to_index = pt.as_tensor_variable(np.random.normal(size=(10, 3)))

with pm.Model():
    idx = pm.Categorical("idx", p=[0.1, 0.2, 0.7], observed=data)
    stuff = something_to_index[:, idx]

Error message:

<details>
Traceback (most recent call last):
  File "/Users/jessegrabowski/Documents/Python/pymc/test.py", line 10, in <module>
    stuff = something_to_index[:, idx]
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/var.py", line 551, in __getitem__
    return at.subtensor.advanced_subtensor(self, *args)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/graph/op.py", line 296, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2556, in make_node
    index = tuple(map(as_index_variable, index))
  File "/Users/jessegrabowski/mambaforge/envs/econ/lib/python3.9/site-packages/pytensor/tensor/subtensor.py", line 2518, in as_index_variable
    raise TypeError("index must be integers or a boolean mask")
TypeError: index must be integers or a boolean mask
</details>

PyMC version information:

pymc: 0+untagged.9319.g78a3582.dirty
pytensor: 2.8.11

Context for the issue:

No response
