Thoughts on auxiliary audio losses using V-Diffusion #54
brentspell started this conversation in Ideas · Replies: 0 comments
First, thanks for creating this repo; it's a great resource for audio ML.
The Moûsai paper hints at additional/perceptual losses in the Future Work section. I'm curious whether this would be possible to do in the V-Diffusion framework, since the denoiser predicts the "velocity" of the noise instead of the clean audio. Do you know of a transformation that could be applied to the model outputs at training time, for comparing against ground truth using an additional criterion?
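One such transformation does exist in the v-prediction setting. With the parameterization of Salimans & Ho's "Progressive Distillation" paper, the noisy input is x_t = α·x₀ + σ·ε and the velocity target is v = α·ε − σ·x₀, which gives the identity x̂₀ = α·x_t − σ·v̂ whenever α² + σ² = 1. That reconstructed x̂₀ can then be compared against the ground-truth audio with any auxiliary criterion (e.g. a spectral or perceptual loss). The sketch below is illustrative only (function names and the cosine schedule are assumptions, not from this repo):

```python
import numpy as np

def x0_from_v(x_t, v, alpha, sigma):
    """Recover a clean-signal estimate from the noisy input and a
    predicted velocity, using x0 = alpha * x_t - sigma * v.

    Valid for variance-preserving schedules where alpha**2 + sigma**2 == 1.
    """
    return alpha * x_t - sigma * v

# Demo with the exact v target: forward-noise a "clean audio" signal,
# form v, and check that the transformation recovers x0 exactly.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(16000)       # stand-in for clean audio
eps = rng.standard_normal(16000)      # Gaussian noise
t = 0.3                               # diffusion time in [0, 1]
alpha = np.cos(t * np.pi / 2)         # assumed cosine schedule
sigma = np.sin(t * np.pi / 2)

x_t = alpha * x0 + sigma * eps        # noisy input to the denoiser
v = alpha * eps - sigma * x0          # the v-prediction target
x0_hat = x0_from_v(x_t, v, alpha, sigma)
assert np.allclose(x0_hat, x0)
```

At training time the model's predicted v̂ would replace the exact target, so x̂₀ is only an estimate (noisy at large t), and an auxiliary loss such as a multi-scale STFT distance between x̂₀ and the ground-truth waveform could be added to the usual v-space MSE.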