Updated the audio processing tools for this notebook.
The VITS training loop trains or fine-tunes a model with the Coqui framework, using phonemized text and speaker embeddings.
This notebook is set up for English, but other languages can be trained as well; it is easiest for languages supported by the espeak-ng phonemizer.
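As a rough sketch of the phonemization step, the helper below wraps the espeak-ng CLI (the helper names are hypothetical; -q suppresses audio playback and --ipa prints IPA phonemes, both documented espeak-ng flags):

```python
import shutil
import subprocess

def build_phonemize_cmd(text: str, voice: str = "en-us") -> list[str]:
    """Assemble the espeak-ng command that prints IPA phonemes for `text`.

    -v selects the voice, so passing e.g. "hi" targets Hindi
    if that voice is available in your espeak-ng install.
    """
    return ["espeak-ng", "-q", "--ipa", "-v", voice, text]

def phonemize(text: str, voice: str = "en-us") -> str:
    """Run espeak-ng and return the phoneme string (requires espeak-ng on PATH)."""
    if shutil.which("espeak-ng") is None:
        raise RuntimeError("espeak-ng not found on PATH")
    result = subprocess.run(
        build_phonemize_cmd(text, voice),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

In the actual notebook, Coqui's own text frontend calls the phonemizer for you; this only illustrates what that step does to the training transcripts.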
Please read the documentation on the Coqui GitHub page.
github.com/coq...
VITS Training Notebook
colab.research...
VCTK Hindi model - 22 kHz audio, 4 speakers
Trained to 376,500 steps on the Mozilla Common Voice and Open Speech and Language Resources (OpenSLR) datasets
**DOWNLOAD TEMPORARILY LOST**
Thorsten-Voice's video on Windows setup
• FREE Voice Cloning in ...
Demucs
github.com/fac...
FFMpeg-Normalize
github.com/slh...
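A minimal sketch of how these two tools could be chained for dataset prep (the helper name and file paths are illustrative; it assumes the demucs and ffmpeg-normalize CLIs, where --two-stems=vocals is Demucs's vocal-isolation mode and -ar sets ffmpeg-normalize's output sample rate, e.g. 22050 Hz for a 22 kHz model):

```python
def build_preprocess_cmds(wav_in: str, wav_out: str,
                          sample_rate: int = 22050) -> list[list[str]]:
    """Commands to isolate vocals with Demucs, then loudness-normalize
    and resample with ffmpeg-normalize. Flags and paths are illustrative;
    in practice you would point ffmpeg-normalize at the vocals stem that
    Demucs writes under its separated/ output directory.
    """
    return [
        # Keep only the vocals/accompaniment split, not all four stems.
        ["demucs", "--two-stems=vocals", wav_in],
        # EBU R128 loudness normalization, resampled for the TTS model.
        ["ffmpeg-normalize", wav_in, "-o", wav_out, "-ar", str(sample_rate)],
    ]
```

Cleaner, consistently normalized audio at the model's native sample rate generally makes fine-tuning runs like the one above more stable.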
Oct 3, 2024