
Conversation

@ricardoV94 (Contributor) commented Jul 7, 2021

Closes #18.

TODO:

codecov bot commented Jul 7, 2021

Codecov Report

Merging #26 (f30b1c2) into main (8741683) will increase coverage by 0.38%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   94.36%   94.74%   +0.38%     
==========================================
  Files          11       11              
  Lines        1614     1733     +119     
  Branches      230      251      +21     
==========================================
+ Hits         1523     1642     +119     
  Misses         51       51              
  Partials       40       40              
Impacted Files         Coverage Δ
aeppl/transforms.py    96.43% <100.00%> (+1.94%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@brandonwillard added the enhancement and important labels on Jul 7, 2021

@brandonwillard (Member) left a comment

Can you add a somewhat high-level description of the approach and design of this feature?

I'm trying to reconcile this with the convolution and scaling rewrites we've been planning, and I want to make sure that we can do both without conflict and/or complications, or—better yet—that we can focus on only one approach/framework.

To be clear, I'm talking about RandomVariable-specific rewrites like sums of normals and location/scale properties. They're limited in that they're not as general as a change-of-variables approach, but they have the big advantage that they result in known distributions and they're rather simple to implement.
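For concreteness, these are the kinds of identities I mean (just a sketch of the target graphs, not actual rewrite code):

import aesara.tensor as at

# Sum of independent normals: Normal(m1, s1) + Normal(m2, s2) ~ Normal(m1 + m2, sqrt(s1**2 + s2**2))
x_sum = at.random.normal(1.0, 2.0) + at.random.normal(3.0, 4.0)
x_rewritten = at.random.normal(1.0 + 3.0, at.sqrt(2.0**2 + 4.0**2))

# Location/scale property: loc + scale * Normal(0, 1) ~ Normal(loc, scale)
loc, scale = at.scalars("loc", "scale")
y_affine = loc + scale * at.random.normal(0.0, 1.0)
y_rewritten = at.random.normal(loc, scale)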

It seems like we can always add such rewrites, but that they would need to be performed before these transforms. If that's all there is to it, then I think we're good; if not, then we don't want to dig ourselves into a hole.

Also, we might want to start focusing on those simple rewrites because of their ease and impact, so don't feel rushed to push this through before considering any of those.

@ricardoV94 (Contributor, Author) commented Aug 2, 2021

> Can you add a somewhat high-level description of the approach and design of this feature?

The new local rewrite basically checks whether there is a single chain of operations linking an "orphan" value variable to an unused base RandomVariable, and replaces the RV graph with a ChainTransformRV that contains the equivalent transformations.

y_rv = at.log(at.random.beta(1, 1) * scale)
# becomes
y_rv = ChainTransformRV(beta(1, 1), [ScaleTransform(scale), LogTransform()])

> I'm trying to reconcile this with the convolution and scaling rewrites we've been planning

> To be clear, I'm talking about RandomVariable-specific rewrites like sums of normals and location/scale properties. They're limited in that they're not as general as a change-of-variables approach, but they have the big advantage that they result in known distributions and they're rather simple to implement.

The idea would be to apply the loc-scale reparametrization whenever possible, as that results in more succinct graphs. Otherwise, this rewrite is more general in that it also applies to RVs that cannot be reparametrized that way.

In addition, it is not limited to loc/scale transforms. The reason I did not restrict it to that scenario is that, once we have the ChainTransformRV, no further complexity is needed to support other transformations.

By the way, convolutions are not addressed at all in the local rewrite (and I don't think they should be). I actually have an explicit check to make sure the graph does not correspond to a potential convolution.
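For illustration, such a check could look roughly like this (a sketch of the idea only, not the PR's actual code; the helper name is made up):

from aesara.tensor.random.op import RandomVariable


def might_be_convolution(node):
    # A sum/product of two (or more) RandomVariable outputs is a potential
    # convolution, not a deterministic transform of a single base RV, so the
    # rewrite bails out in that case.
    rv_inputs = [
        inp
        for inp in node.inputs
        if inp.owner is not None and isinstance(inp.owner.op, RandomVariable)
    ]
    return len(rv_inputs) > 1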

> It seems like we can always add such rewrites, but that they would need to be performed before these transforms. If that's all there is to it, then I think we're good; if not, then we don't want to dig ourselves into a hole.

Yes, this should not interfere with that at all. In fact, those rewrites can be applied just as easily before or after this one, since the ChainTransformRV includes all the information needed to perform the substitution (and we have already checked that the value variable really corresponds to the base RV and that the RV has not been used elsewhere in the graph).

y_rv = ChainTransformRV(Normal(0, 1), [ScaleTransform(scale), LocTransform(loc)])
# could be rewritten later (or before) as 
y_rv = Normal(0 + loc, 1 * scale)

In that sense, we could think of the smarter rewrites as a specialization of ChainTransformRVs. But, regardless of which rewrite is applied first, I don't see why the two approaches should conflict.

@brandonwillard added the rv-transforms label on Aug 17, 2021
@ricardoV94 (Contributor, Author) commented Aug 18, 2021

Okay, so my last roadblock is this edge case:

# loc = at.scalar("loc")  # works
# loc = at.vector("loc")  # fails
loc = at.TensorType("floatX", (True,))("loc")  # works

y_rv = loc + at.random.normal(0, 1, size=1)
y = y_rv.clone()

logp = joint_logprob(y_rv, {y_rv: y})

That example only works when loc is a scalar or the custom TensorType.
Otherwise it complains that I am trying to replace the vector y_rv by a LocTransformedRV of type TensorType(float64, (True,)), which is true.

TypeError: Cannot convert Type TensorType(float64, vector) (of Variable y) into Type TensorType(float64, (True,)). You can try to manually convert y into a TensorType(float64, (True,)).

Regardless of the broadcast question in #51, it seems the (1D) vector case should work by default, no?

After all, this is valid:

loc = at.vector("loc")
base_rv = at.random.normal(loc, 1, size=1)
base_rv.eval({loc: [1]})

The rewrite fails similarly with row or column sizes such as (1, 3) or (3, 1) when trying to use a generic matrix for loc / scale.

loc = at.matrix("loc")
y_rv = loc + at.random.normal(0, 1, size=(1, 3))
y = y_rv.clone()

logp = joint_logprob(y_rv, {y_rv: y})

Should this be handled by the new rewrite?

@ricardoV94 (Contributor, Author) commented Sep 4, 2021

My last commit provides a possible solution to the issue described in #26 (comment).

I am not very happy with it, but I haven't come up with a more satisfying solution yet.

Another option is to simply not support the cases where the original RV graph and the derived RV yield different broadcastable patterns.
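In code, that second option would just be a guard in the rewrite along these lines (an illustrative sketch; the function name is mine, not from the PR):

def types_match(rv, transformed_rv):
    # Only apply the rewrite when the replacement has the same broadcast
    # pattern as the variable it replaces; otherwise skip it and leave the
    # graph untouched (avoiding the TypeError shown above).
    return transformed_rv.broadcastable == rv.broadcastable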

@brandonwillard (Member) left a comment

If I'm correctly understanding the purpose of MeasurableRebroadcast, then it would appear that we need another canonicalization that lifts Rebroadcast Ops.

Can you provide an explicit example in which this issue arises, and show the relevant aesara.dprint output?

@ricardoV94 (Contributor, Author) commented Sep 15, 2021

> If I'm correctly understanding the purpose of MeasurableRebroadcast, then it would appear that we need another canonicalization that lifts Rebroadcast Ops.
>
> Can you provide an explicit example in which this issue arises, and show the relevant aesara.dprint output?

It appears in test_scale_transform when scalar_scale = False and rv_size = 1.

Note that the Rebroadcast is something I had to introduce in the rewrite phase so that I could replace the old generic vector node with what would otherwise be a new length-1 vector TransformedRV.

Relevant code:

if transform_rv_out.broadcastable != out_var.broadcastable:

@brandonwillard (Member):

> Note that the Rebroadcast is something I had to introduce in the rewrite phase so that I could replace the old generic vector node with what would otherwise be a new length-1 vector TransformedRV.

Broadcasting is just a subset of shape inference and—just like that—an Apply node's outputs' broadcast patterns are determined by its Op and inputs' broadcast patterns. This means that the mismatched outputs' broadcast patterns can theoretically be fixed by adjusting the broadcast patterns/shapes of their inputs.

The first thing that needs to be done in a situation like this is to determine whether the broadcasting inference of the Ops involved is working correctly. There have been quite a few instances in which an Op has been found to produce incorrect broadcast patterns for its output, and such things need to be addressed directly.

@ricardoV94 (Contributor, Author) commented Sep 16, 2021

Also, for comparison, I set up the similar test_loc_transform so that it passes without the additional Rebroadcast during the rewrite. That requires creating a loc variable with the same broadcast pattern as the RV. The "issue" boils down to:

(at.vector() + at.random.normal(size=1)).type
# TensorType(float64, vector)

(at.random.normal(loc=at.vector(), size=1)).type
# TensorType(float64, (True,))

It feels like the first expression should be replaceable by the second, but they have different broadcastable patterns.


@pytest.mark.parametrize("scalar_loc", (True, False))
@pytest.mark.parametrize("rv_size", [None, 1, 2, (3, 2)])
def test_loc_transform(scalar_loc, rv_size):
    # Incompatible parametrization
    if not scalar_loc and rv_size is None:
        return
    if scalar_loc:
        loc = at.scalar("loc")
    elif isinstance(rv_size, int):
        if rv_size == 1:
            loc = at.TensorType("floatX", (True,))("loc")
        else:
            loc = at.vector("loc")
    else:
        loc = at.matrix("loc")

@brandonwillard (Member) commented Sep 17, 2021

> Also, for comparison, I set up the similar test_loc_transform so that it passes without the additional Rebroadcast during the rewrite. That requires creating a loc variable with the same broadcast pattern as the RV. The "issue" boils down to:
>
> (at.vector() + at.random.normal(size=1)).type
> # TensorType(float64, vector)
>
> (at.random.normal(loc=at.vector(), size=1)).type
> # TensorType(float64, (True,))

Yes, that definitely seems like the/a problem! We should create an issue for this in Aesara.

Here's a little more background on the situation:

import aesara
import aesara.tensor as at


at.vector().broadcastable
# (False,)

at.random.normal(size=1).broadcastable
# (True,)

(at.vector() + at.random.normal(size=1)).broadcastable
# (False,)

(at.random.normal(loc=at.vector(), size=1)).broadcastable
# (True,)

Apparently, the last example is a bit contradictory, because the mean/loc parameter isn't necessarily broadcastable, yet the size parameter is saying that it is, and that's what's likely determining the type of the output.

@brandonwillard (Member) commented Sep 17, 2021

Actually, the last example is inconsistent; the loc=at.vector() and size=1 are contradictory, since at.vector() is a non-broadcastable array—by definition—and size=1 is stating that the output of at.random.normal is necessarily broadcastable (i.e. since size entirely determines the output's shape for scalar RandomVariables).

The second-to-last example makes sense only because the broadcastable RandomVariable output (i.e. size=1) is broadcast against the non-broadcastable at.vector.

Simply put, in the last example, size shouldn't be 1.
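For illustration, two consistent ways of writing that graph (just examples, not a proposed fix):

import aesara.tensor as at

# 1) Make the parameter's type agree with size=1 (a known length-1, broadcastable vector)
loc = at.TensorType("floatX", (True,))("loc")
at.random.normal(loc=loc, size=1).broadcastable  # (True,)

# 2) Drop size and let the non-broadcastable parameter determine the output shape
at.random.normal(loc=at.vector("loc")).broadcastable  # (False,)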

@ricardoV94, we need to address whatever it is that's giving rise to that situation.

@ricardoV94 (Contributor, Author) commented Sep 19, 2021

> @ricardoV94, we need to address whatever it is that's giving rise to that situation.

So the question in this PR is whether we want to support the derivation for this kind of graph:

loc = at.vector()
rv = at.random.normal(0, 1, size=1) + loc
vv = rv.type()

In theory loc could be of length 1 at evaluation time, which would define a "valid" RV graph.

If we want to support this, we would need to somehow make the derived rv have a vector type (one way could be through the MeasurableBroadcast I added).

If we don't want to support this, then there is no further roadblock to this PR; I would just revert the MeasurableBroadcast commit and let the rewrites fail naturally for this type of graph.

Users can still make this work by providing a valid loc, via loc = at.TensorType("floatX", (True,))("loc")

@brandonwillard (Member) commented Sep 19, 2021

> So the question in this PR is whether we want to support the derivation for this kind of graph:

No, the question is "Why are we getting inconsistent loc and size values?". As I stated above, loc=at.vector() and size=1 are not consistent parameters, because the former implies that the output should have broadcastable == (False,) and the latter implies it should have broadcastable == (True,).

As far as solutions go, we could do multiple things that don't necessarily restrict the cases covered by AePPL: e.g. rebroadcast the parameters when size == 1 (either in RandomVariable or before calling it), raise an exception stating the inconsistency, etc. Some of these approaches might still benefit from Rebroadcast canonicalizations, but they're not necessary in all cases.
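A rough sketch of the first option, rebroadcasting a parameter before calling the RV (an assumption about how it could look, not existing Aesara/AePPL behavior):

import aesara.tensor as at

loc = at.vector("loc")
size = (1,)

# Broadcast the parameter to the shape implied by size before constructing the
# RV, so the parameter and size can no longer silently disagree.
loc_bcast = at.broadcast_to(loc, size)
y_rv = at.random.normal(loc_bcast, 1.0, size=size)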

@ricardoV94 (Contributor, Author) commented Sep 24, 2021

Alright, I think I am going to leave the graph at.vector("loc") + at.random.normal(size=1, name="base_rv") unsupported for the time being.


@ricardoV94 (Contributor, Author):

This one is ready for review.

I reverted the Rebroadcast for now. In a subsequent PR we could add a rewrite that "lifts" a Rebroadcast to the RV inputs or changes the RV size, so that we can output rebroadcasted MeasureVariables from our rewrites (or apply it directly in the rewrites)

@brandonwillard (Member) left a comment

Looks like this needs to be rebased.

@ricardoV94 (Contributor, Author):

With #129 this PR is now much more straightforward! Ready for review

@ricardoV94 (Contributor, Author):

Hmm, does the IR "reverter" work in this case where nodes are removed instead of replaced?

@ricardoV94 (Contributor, Author):

I had to change strategy and move away from the transform machinery, because I couldn't see a way to incorporate inputs that are not part of the base measurable variable, which is needed for the add and mul cases.


Labels

enhancement, important, rv-transforms


Development

Successfully merging this pull request may close these issues.

Add affine transform support to total_logprob
