Hello Jarods, first of all, great work, you are a genius. All this code stuff sounds alien to me. I have a few questions, asked from total ignorance. I've installed this one from you: "AI Voice Cloning v3 Package Installation - TortoiseTTS for Other Languages". It's the best I've found so far, even if the emotions feature isn't working for me. I find MeloTTS interesting in some respects; something like that would be good, if you think it's worthwhile. The thing is, is this version better than the one I've installed? And do I need to do any coding, or is it just an easy installer like the one I already mentioned? Thanks a lot for sharing your hard work, I really appreciate it.
Unfortunately, I can't comment on MeloTTS as I haven't played around with it... Tortoise is pretty good though, especially if you can finetune it. StyleTTS2 will not be as expressive as Tortoise, but it will be much more stable, meaning its output will sound more consistent in comparison.
Hi, do you have any open-source variants of the StyleTTS2 repo that would help me use it on my own custom data out of the box? For instance, which models (ASR, pitch extractor, BERT, etc.) did you custom-train to use with the StyleTTS2 model? I'm assuming you have to train them all yourself if you're using your own custom data. Are you able to share instructions on how you trained (e.g., you have a wav|text|speaker file and set the configuration to these input dimensions and these output dimensions, etc.)? Also, can you share your Python inference script?