Added torchaudio.models.Tacotron2()
#669
RegisterComponents();
}

public (Tensor, Tensor, Tensor, Tensor) forward(
Doc comments on all public methods, please.
}
}

public class Prenet : nn.Module
A public class -- please add doc comments.
/// Tacotron2 model based on the implementation from
/// Nvidia https://github.com/NVIDIA/DeepLearningExamples/.
/// </summary>
public class Tacotron2 : nn.Module
I added a couple of comments below, but there are public methods and classes in this class that should have doc comments.
@kaiidams -- I'm going to make another release this week, since someone found a very serious performance bug in tensor creation. I'd love to get this PR in the release, but no pressure -- if you don't have time to work on it, I'll include it without doc comments, and then create an issue that I'll assign to you to add them later.
Added Tacotron2 implementation from https://github.com/pytorch/audio/blob/e502df0106403f7666f89fee09715256ea2e0df3/torchaudio/models/tacotron2.py
The file has a LICENSE notice from NVIDIA.
PyTorch has four pretrained models.
The quality of Griffin-Lim output is not as good as that of WaveRNN. The phoneme-based models require DeepPhonemizer, which is written with PyTorch, to process text.
One of the pretrained models from PyTorch, converted to TorchSharp:
tacotron2_english_phonemes_1500_epochs_wavernn_ljspeech.pth
https://drive.google.com/file/d/11TnhmCSUy7aO1pv9CBi7Qivhz25nl_2l/view?usp=sharing