Says this was posted 6 days ago, but when I go to the site it has a different setup, so links are gone or changed, etc. So do we assume they've added some of the features, like the requirements, into the main installation process?
This is wild! It’s crazy how little input audio it requires. Also I just wanted to say thanks. If it weren’t for you I would have never discovered my passion for creating AI voice models!
@amitnishad0777 No, I guess I could do commissions, but I haven't really thought much about it. I also want to improve before I do something like that, as I'm too amateurish at data cleaning atm.
I've followed your channel since the early days. I'm super happy about your growth, and also super happy when you do content like this... for non-tech people to be able to try and have fun with AI. A dedicated video for everyone to follow. Keep up the good stuff!
Voice synthesis with emotions? That’s a next-level breakthrough for personalizing user experiences. Feels like we're inching closer to seamless AI-human conversations.
I love how you don't assume that I know what you know, bothered explaining the basics, and made timestamps for the more knowledgeable to skip. Excellent, man!!! So we can't train it properly on a larger audio file? (You can't pack enough vocal range into a short clip for professional work...)
Dude, I am a retired software engineer / Java programmer who only used PCs... RELIGIOUSLY... so I totally understand what you did... VERY COOL, you did a fantastic job! When the iPhone 3GS came out and was 9.99 at Best Buy, I got it, switched to ALL APPLE, and never looked back!!!! Hearing what you have to do to get this to work, jeez, I don't miss those days of going through all that crap, but I know you love it and you have been successful. I applaud you for doing what you do; you're very articulate, very intelligent, and I think you did a great job knocking this video out. I wish you all the best in the future. Thank you much for the demo. I will look into whether Apple has something similar.
That mixing of Chinese and English is simply perfect. Any Chinese speaker, whether it's Mandarin or even Cantonese, just speaks like that, and the TTS shows no flaw in its voice, tone, and pronunciation. If I played that for my friends and family, they couldn't really spot the common AI characteristics in it.
Man, your channel is the bomb 💣 And right, that "Spanish" reading was a little bit hilarious and awful at the same time. Hope they make more languages available soon. 3 of your videos in a row. New subscriber here!
After a break, I deleted all the uploaded files and started again, this time successfully. My first error was when installing programs: stick to the older nominated versions! Don't think that by installing a newer version things will be better; they won't! The program is brilliant and will save me a lot of money. Thank you! Where I went wrong was creating the virtual environment: you said to add "conda activate f5", but you must run "conda init" first, hit enter, and then run "conda activate f5". Once done, it went smoothly.
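For anyone hitting the same wall, the fix described in the comment above is roughly this sequence (assuming an existing conda environment named `f5`, as in the video; this is a sketch, not the video's exact steps):

```shell
conda init          # one-time setup so your shell knows how to run `conda activate`
# close and reopen the shell (or re-source your shell profile) after `conda init`
conda activate f5   # activation now succeeds instead of erroring out
```

`conda init` only needs to be run once per shell; after that, `conda activate` works in every new session.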
There was a promise about updates with emotions, right? So far, nothing. With ElevenLabs we need to try workarounds like inserting "(And she says with great sadness)" or "(She says with great anger)" before the text. The context helps; it uses more characters, but in some tests it was worth it for me.
I always wonder why the requirements (specs, VRAM/RAM) are never listed first... xD The Chinese is insane; it always sounds more natural than the original voice lol
Gotta love installing installers for installing installers in an installer that installs the installer needed for a virtual environment used for installing an installer for a tts program. 👍
I'm glad that this is being developed, even if it's still at a point where I wouldn't even enable it if it was as easy as a toggle, let alone dig into code to get it working.
@jaredf6205 I think he had A/B testing going on in the thumbnail. One is a normal waveform thumbnail, and the other also has a waveform pic paired with a... sus anime pic.
This AI is really good...at sounding like a bad audiobook narrator! 😂 It nails those over-the-top emotions, but they don't sound very human. Maybe the problem is that it's trained on audiobooks, where the emotions are often exaggerated. What if we used this "fake emotion" data to our advantage? First, train an AI to recognize those audiobook patterns. Then, train a second AI to spot real emotions in everyday speech from YouTube, podcasts, etc. The second AI could learn to tell the difference between fake and genuine, and we'd get an AI that truly understands how we express emotions! What do you guys think?
Have you tried the ElevenLabs reader for audiobooks? Not all voices are great, but I found the voice of Burt Reynolds works really well for audiobooks. It also works in different languages.
I think that's what a lot of these AI models use. It's called a discriminator, and its job is to do just that: determine whether a piece of media (image, audio, etc.) is genuine or AI-generated. That's the extent of my knowledge; I don't know much beyond that, or whether they use one for this voice model.
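For what it's worth, the discriminator idea mentioned above can be sketched as a toy binary classifier in pure Python. Everything here is hypothetical for illustration: the 1-D "real" vs "generated" feature values stand in for what a real discriminator would learn from full audio or images, and the model is plain logistic regression rather than a deep network:

```python
import math
import random

random.seed(0)

# Toy features: "genuine" samples cluster near +1.0, "generated" near -1.0.
real = [random.gauss(1.0, 0.5) for _ in range(200)]
fake = [random.gauss(-1.0, 0.5) for _ in range(200)]
data = [(x, 1) for x in real] + [(x, 0) for x in fake]
random.shuffle(data)

# A one-weight logistic-regression "discriminator" trained by gradient descent.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(50):                              # epochs over the toy dataset
    for x, y in data:
        p = 1 / (1 + math.exp(-(w * x + b)))     # predicted P(genuine)
        w += lr * (y - p) * x                    # gradient step on log-loss
        b += lr * (y - p)

# Fraction of samples the trained discriminator labels correctly.
accuracy = sum(
    ((1 / (1 + math.exp(-(w * x + b)))) > 0.5) == (y == 1) for x, y in data
) / len(data)
```

In a GAN-style setup, this classifier would be trained jointly against a generator that keeps trying to fool it; here it just separates two fixed clusters, which it does with high accuracy.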
If you generate anything longer than 10 minutes, you'll notice that the voice model gets worse and worse until it becomes absolute gibberish and then static noise at around an hour
@captteemo9133 I built the bot from scratch. The basis of my bot is Ollama; for fast communication I used Llama 3.2 with 1B parameters. Speech recognition works on Whisper. I used to work with VOSK, and VOSK is not inferior, by the way, but only Whisper can insert punctuation marks into the recognized text. Speech synthesis is based on the Coqui TTS VITS multi-voice model. Unfortunately, it will not work on a smartphone.
There was some any-language-to-any-language AI voice tool too. Does anyone remember it? You could feed it a voice in any language and it would learn from it, and after that it could be used to generate speech in any language. I believe it was possible to make it sing too. It even creates a TTS file, I believe, so you can use that file with any text-to-speech engine.
It's really cool but I need it to be able to blend multiple voices together to create a new original one. Just copying other people's voices is not really ethical when using voices for commercial purposes.
Crazy! I'm interested in the cross-language options, and generally how it handles non-English languages. EDIT: just reached the end, so it's Chinese and English support at the moment. All in all, thanks for the upload; definitely checking this out!
Thank you for the effort in explaining this topic, but the video is too long, with a lot of unnecessary examples. The point was clear early on, so trimming the extras and making it more concise would really improve both the content and the viewing experience. Hope you see this feedback ;)
It did, and it was fun!!! You can find absolutely hilarious examples over on The Lost Narrator's YouTube channel. Yeah, it's My Little Pony voice examples from fan actresses, but I'd say they are some of the best clips I have found.
Yep, and does anyone remember Adobe VoCo? It could do cloning as well as emotions, and it was very real for 2016. I bet big tech already has very advanced stuff in their labs.
Looks great, but the only thing I wanted to know was the inference speed without processing the reference. What would the potential be for realtime if the reference voice were not being processed as part of the inference?
I haven't looked at it yet, but it shows a spectrogram of the clip¹, so it's possible/probable that it generates the entire clip in one go, i.e. it works on every part of the clip at the same time. If that's the case, it could probably create a 20-second clip in, e.g., 15 seconds, but you would still have to wait those 15 seconds before you can hear any of it. I may be wrong, though. ¹ Some text-to-audio systems generate an image of the spectrogram and then convert the spectrogram to an audio file. The spectrogram is a representation of the audio where time is on the x-axis, frequency is on the y-axis, and amplitude is the intensity/color of the pixel.
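For the curious, the spectrogram described in that footnote can be sketched in pure Python: slice the signal into overlapping frames and take the magnitude of a naive DFT of each frame. The frame and hop sizes are arbitrary choices for illustration; real TTS pipelines use FFTs, windowing, and mel filterbanks instead of this slow direct sum:

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Magnitude spectrogram via a naive DFT.
    Rows are time frames (x-axis when plotted), columns are
    frequency bins (y-axis), values are amplitudes (pixel intensity)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        mags = []
        for k in range(frame_size // 2):          # keep non-negative frequencies
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

# A pure sine with an 8-sample period: its energy lands in bin 64/8 = 8.
sine = [math.sin(2 * math.pi * n / 8) for n in range(256)]
spec = spectrogram(sine)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Plotted as an image, `spec` is exactly the kind of picture such systems produce; a vocoder then inverts it back to a waveform, and because every frame can in principle be computed at once, the whole clip can be generated in one go.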
I think I will wait for an LM Studio version or a Fooocus/Flux ComfyUI edition. This installation process is so involved I just can't... 😅 Anyway, thank you for the tutorial!
What GPU are you running? Your 30 seconds is around 5000 for me. I tried on Hugging Face with about the same results; Replicate was at about the speeds you were getting.
I write lyrics in every language now and I'm starting to find that there aren't words to describe certain notes of inflection you desire. But somehow, the language model understands that little subtlety that you're looking for. Whether it's a person or not, it does not matter to me. It understands me. Better than any of you lol.
Hmmm, it is almost as good as Character.AI's TTS, and it is not locked to one site! (Correct me if Character.AI's TTS technology can be used outside their site.) But unfortunately, F5 only supports English and Chinese...