Implement Scalar Scan (as dummy Op) #174
Closed
Alternative to #283
Closes #83
The idea here is to implement a specialized kind of looping operation that can be treated as an Elemwise. I create a dummy scalar Op; a rewrite then converts its Elemwise version into a Scan.
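To illustrate why a scalar loop can be treated as an Elemwise and still be lowered to a scan, here is a minimal pure-Python sketch. It does not use the library's API: `scalar_loop`, `elemwise_loop`, `scan_loop`, and the Newton update are hypothetical names chosen for illustration only. It shows that running a scalar loop independently per element gives the same result as a single outer loop over steps that updates all elements at once.

```python
import math


def scalar_loop(update, init, n_steps):
    """Apply `update` to a scalar state `n_steps` times; keep only the final state."""
    state = init
    for _ in range(n_steps):
        state = update(state)
    return state


def elemwise_loop(update, inits, n_steps):
    """Elementwise view: run the scalar loop independently for every element."""
    return [scalar_loop(update, x, n_steps) for x in inits]


def scan_loop(update, inits, n_steps):
    """Scan view: one outer loop over steps; each step updates all elements."""
    states = list(inits)
    for _ in range(n_steps):
        states = [update(s) for s in states]
    return states


def newton_sqrt2(x):
    """Hypothetical scalar update: one Newton step towards sqrt(2)."""
    return 0.5 * (x + 2.0 / x)


inits = [1.0, 2.0, 10.0]
elemwise_result = elemwise_loop(newton_sqrt2, inits, 20)
scan_result = scan_loop(newton_sqrt2, inits, 20)
```

Both versions apply the same operations in the same order to each element, so they agree exactly. Neither keeps the intermediate states, which is the sense in which this kind of loop needs no accumulation.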
As a test case, I used Scalar Scans in the gradient of gammaincc. On the new benchmark test, performance is now 20-30x better than before, even though both branches are now always computed. A lazy switch could bring further improvements, but that is not really the goal of this PR.
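The cost of always computing both branches can be sketched in plain Python. This is only an illustration under assumed names: `branch_a`, `branch_b`, `eager_switch`, and `lazy_switch` are hypothetical stand-ins, not the actual gammaincc gradient branches or any library function. The eager version evaluates both branches for every element and then selects one; the lazy version evaluates only the branch it needs.

```python
# Call counters to make the amount of work visible.
calls = {"a": 0, "b": 0}


def branch_a(x):
    """Hypothetical first branch."""
    calls["a"] += 1
    return x * 2.0


def branch_b(x):
    """Hypothetical second branch."""
    calls["b"] += 1
    return x + 100.0


def eager_switch(cond, x):
    """Evaluate both branches, then select one (an elementwise switch)."""
    a = branch_a(x)
    b = branch_b(x)
    return a if cond else b


def lazy_switch(cond, x):
    """Evaluate only the branch selected by `cond`."""
    return branch_a(x) if cond else branch_b(x)


xs = [0.5, 1.5, 2.5, 3.5]
conds = [x < 2.0 for x in xs]

eager = [eager_switch(c, x) for c, x in zip(conds, xs)]
eager_calls = dict(calls)

calls["a"] = calls["b"] = 0
lazy = [lazy_switch(c, x) for c, x in zip(conds, xs)]
lazy_calls = dict(calls)
```

In this sketch the two versions return the same values, but the eager one calls each branch once per element, while the lazy one calls each branch only for the elements that select it.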
PS: It shouldn't be too hard to write a scan without accumulation.