deprecate symbols api, clean up tests #50
base: master
Conversation
I can take a look, but I'm still unclear as to where custom backend bindings would be absolutely necessary. The fact that Turing now uses DI directly seems to suggest that it does everything needed for PPLs?

Sounds good. Feel free to take inspiration from DI if you need help on the more subtle aspects of ADTypes, like function annotations for Enzyme.
- better label printing for tests
- fix inference for some cases
- disable inference checks for no prep
I'll take a look at this PR after JuliaCon Paris.
The shadows are required for stability of the forward-mode usage of Enzyme (and will improve performance by not re-generating one-hot vectors).
@wsmoses: thanks for explaining it. My understanding is that DifferentiationInterface already handles shadows when preparation is used, so simply using that package would work. @gdalle, can you confirm this?
That statement about DI is true. Whether you want to call Enzyme directly is a different story and up to you; Billy knows best what's good for his package. Still, I'm always happy to fix DI issues if buggy or slow examples are provided, as usual.
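As a minimal sketch of the preparation mechanism discussed above (assuming DifferentiationInterface's `prepare_gradient` / `value_and_gradient` API with a ForwardDiff backend; the function `f` is an illustrative stand-in, not this package's code):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
import ForwardDiff

# Illustrative log-density-like function (stand-in, not from this package)
f(x) = -sum(abs2, x) / 2

x = [1.0, 2.0, 3.0]
backend = AutoForwardDiff()

# Preparation happens once; for Enzyme's forward mode, DI would cache
# the one-hot seed vectors at this step instead of rebuilding them per call.
prep = prepare_gradient(f, backend, x)

val, grad = value_and_gradient(f, prep, backend, x)
# val is f(x) = -7.0; grad is -x
```

The point of the preparation object is that per-call setup work (like shadow allocation) is hoisted out of the hot loop, which is exactly the role `shadow` played in the hand-rolled Enzyme binding.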
I think that once we are breaking the API,
I understand why you kept the Enzyme extension, but is there a specific reason to keep the other backend extensions if you want to switch to DI? What are the things you wish you could do with DI but don't seem to be able to do?
```diff
 [extensions]
 LogDensityProblemsADADTypesExt = "ADTypes"
-LogDensityProblemsADDifferentiationInterfaceExt = ["ADTypes", "DifferentiationInterface"]
+LogDensityProblemsADDifferentiationInterfaceExt = ["DifferentiationInterface"]
```
DI is a very lightweight package (it depends on nothing except ADTypes and LinearAlgebra), I'm not sure it's worth putting it in an extension
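For context, a hypothetical sketch of the two layouts being weighed here (UUID illustrative), following Julia's standard Project.toml conventions: a weak dependency lives under `[weakdeps]` with its glue code under `[extensions]`, whereas a hard dependency would move to `[deps]` and need no extension entry.

```toml
# Option A: DI as a weak dependency behind an extension
[weakdeps]
DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"

[extensions]
LogDensityProblemsADDifferentiationInterfaceExt = ["DifferentiationInterface"]

# Option B: DI as a hard dependency (no extension needed)
# [deps]
# DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"
```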
```julia
    # active argument must come first in DI
    return LogDensityProblemsAD.logdensity(ℓ, x)
end

@inline _logdensity_callable(ℓ) = Base.Fix1(LogDensityProblems.logdensity, ℓ)
```
The reason behind `logdensity_switched` was to avoid use of `Base.Fix1`, which is a performance pitfall for Enzyme. The `Constant` annotation on `ℓ` could also help speed things up for Mooncake in the future, so I'd suggest leaving this as it was.
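A minimal sketch of the difference being discussed (assuming Enzyme's `autodiff` API; `Density` and the `logdensity` method here are illustrative stand-ins, not this package's types): passing ℓ as a positional `Const` argument, rather than capturing it in a `Base.Fix1` closure, tells Enzyme the argument carries no derivative information.

```julia
using Enzyme

struct Density end  # illustrative stand-in for a log-density problem
logdensity(::Density, x) = -sum(abs2, x) / 2

x = [1.0, 2.0]
dx = zero(x)  # gradient accumulator

# ℓ is passed as Const: Enzyme treats it as inactive data, and no
# closure object needs to be differentiated through.
Enzyme.autodiff(Enzyme.Reverse, logdensity, Enzyme.Active,
                Enzyme.Const(Density()), Enzyme.Duplicated(x, dx))
# dx now holds the gradient of -sum(abs2, x)/2, i.e. -x
```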
```julia
function logdensity_and_gradient(∇ℓ::EnzymeGradientLogDensity{<:Any,<:Enzyme.ForwardMode},
                                 x::AbstractVector)
    (; ℓ, mode, shadow) = ∇ℓ
    _shadow = shadow === nothing ? Enzyme.onehot(x) : shadow
```
As Billy said, this is actually useful to store ahead of time.
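For reference, a small sketch of what storing the shadows ahead of time looks like (assuming Enzyme's `onehot` helper, which builds the seed vectors used above):

```julia
using Enzyme

x = [1.0, 2.0, 3.0]

# One-hot seed ("shadow") vectors for forward-mode batch differentiation.
# Building them once at construction time avoids re-allocating them on
# every gradient call, which is the cost the `shadow` field amortizes.
shadow = Enzyme.onehot(x)
# a tuple of basis vectors: ([1,0,0], [0,1,0], [0,0,1]) as Float64
```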
```julia
                                 x::AbstractVector{T}) where T
    (; ℓ, mode) = ∇ℓ
    result = Enzyme.gradient(mode, Base.Fix1(logdensity, ℓ), x)
    T(result.val)::T, collect(T, only(result.derivs))::Vector{T}
```
Why convert everything to `Vector`?
Deprecate the `ADgradient(::Symbol, ...)` and `ADgradient(::Val, ...)` API. Backends should be defined using `ADTypes.jl`. This is a breaking change and will require a major version bump once deprecations are removed. However, I think it leads to a much cleaner API.

This PR is purposefully kept minimal and does not change the implementation unless necessary (see below). That is left for future work, which this PR should make much easier; the plan is to use `DifferentiationInterface.jl` for what we can, and only do the workarounds when absolutely necessary.

However, the test framework was also cleaned up, since the unified API simplifies testing. Various backends were not properly tested for type stability or for accepting generic vector types, and had bugs; these are now fixed.
The `shadow` kwarg from the Enzyme backend is removed since `ADTypes` does not support it. However, I don't think it was widely used; frankly, I am not even sure why it was there. The Enzyme forward-mode backend requires type conversions and assertions for stability; this should be investigated, but I think this PR does improve things.

I used this opportunity to deprecate the benchmarking code for ForwardDiff too; it was badly written and I don't think it belongs in this package. It should be removed.
TODO

- `onehot` (shadows) in the Enzyme implementation
- `copy` methods: copy deeply when preparation is used; document (cf. copy method for prep objects, JuliaDiff/DifferentiationInterface.jl#858)