How to make an AI VTuber Using GPT 3 and Google Cloud TTS

Подписаться 1,6 тыс.

Просмотров 167 тыс.

50% 1

Github: github.com/adi-panda/Kuebiko
Twitch Oath Token: twitchtokengenerator.com/
OpenAI API: openai.com/api/
Google Cloud: cloud.google.com/
List of Supported TTS Voices: cloud.google.com/text-to-spee...
VB Audio Cable: vb-audio.com/Cable/
OBS Script for Captions: gist.github.com/kkartaltepe/8...

Опубликовано:

26 фев 2023

Ссылка:

Скачать:

Готовим ссылку...

Добавить в:

Мой плейлист

Посмотреть позже

Комментарии : 348

@adi-panda Год назад

I FORGOT TO MENTION: MAKE SURE TO INSTALL VLC ON YOU'RE COMPUTER (64 Bit Version) on your computer. ALSO IF YOU RUN INTO ISSUES INSTALLING THE PACKAGES YOU WILL HAVE TO MANUALLY INSTALL PACKAGES. (pip install twitchio ... keep on doing until all packages are installed) Sorry if this tutorial isn't well made, I just wanted to share how I was able to make an AI Vtuber with all of you

@Byron_Hill Год назад

can you maker a follow up on how to set up vlc? i am having trouble with this

@pterodactylw3899 Год назад

If you add voice mod to the mix it would be amazing and if you get a anti AI text rescripter then your making free money

@justinhwang1160 Год назад

@adi-panda Hi, thanks for the awesome tutorial! I've made sure that all of the input/output settings are done correctly but I still can't seem to get the audio to work. I set the Vtube input to VoiceVolumePlusMouthOpen and output to ParamMouthOpenY, just like the tutorial. However, when I type in the chat, no sound comes from OBS studio and the Vtube model mouth does not move. (I also don't get how you would listen to the bot speaking if you set the speaker as the virtual aux) Also, I connected my Twitch account to Vtube Studio and OBS but nothing happened, so I guess I'll disconnect them again? Any help would be greatly appreciated!

@MrRaja Год назад

How do I feed llama outputs into my output folder..

@bronxandbrenx Год назад

This tutorial is so much :)

@Nazralova Год назад

19:15 For anyone wondering why you can't hear the sound output from OBS Studio, you need to change the Desktop Audio --> click the three dots --> Advanced Audio Properties --> change Audio Monitoring to other than Monitor Off. For example, if you record with the Monitor and Output on, both sounds (in this case, monitor and speaker) will be recorded.

@Gr13fM4ch1n3 Год назад

As soon as I saw clips of Neuro, I immediately wanted to know more. This video popped up in my recommendations just hours after finding her, and I'm so grateful.

@2DReanimation Год назад

For a chatbot like Neuro though, you'd need a local large language model, as the responses need to be fast. And GPT3 and ChatGPT will go "I'm afraid I can't do that Dave" too often to have the interesting responses Neuro has. ^^

@alexlarex7773 Год назад

@@2DReanimation to have a local large language model you need a local GPU farm with enough VRAM to fit it all, and stuff on the level of GPT-3 will require you like 300 gigs of it. Which is not feasible. When you use gpt3 or even gpt3.5 via it's api, it won't do "I'm afraid I can't do that Dave" whatsoever, it's your responsibility to filter the responses so they don't break openAi terms and conditions. Edit: not even just 300Gb, 700Gb of vram for gpt3-like nets considering it's 175B parameters...

@alexlarex7773 Год назад

@@2DReanimation Nope. for training you would need even more. 700 Gigs of vram is for inference (just running the model with some inputs just to get some outputs) on a GPT-3-like model. LLAMA models are smaller than GPT, yes, but even those are far from tiny. The smallest LLAMA model (7B params, which is considerably worse than gpt on this specific task) needs at least 14 gigs of ram, which is indeed possible for a consumer rig. The proper 65B param LLAMA model needs 130 gigs of ram, which is technically possible if you will run it on the cpu, but for gpu that's already like 10 GPUs worth of vram. Also have you actually tried doing any of that yourself? I'm speaking from experience saying that using the APIs is actually faster than running the net locally, i get 1-3 second responses with the API on average, which is perfectly fine for answering twitch comments. ChatGPT is GPT running with a specific prompt before each of your messages, and openAI's filters before and after. ChatGPT is not a good example of the limitations of GPT.

@danielhughes3758 Год назад

@@2DReanimation He did say GPT 3/3.5 API, not chat GPT. I haven't tried that yet so I can't confirm or deny, but I believe that if you skip the chat GPT added functionality and just go with underlying GPT then you have more freedom in those regards

@Gr13fM4ch1n3 Год назад

@@alexlarex7773 I've been testing with Vicuna and it's only taken 20ish GB of space. It gets a little confused and starts writing fanfictions halfway through some conversations, but it's still funny.

@Descalibrado_ Год назад

This is way more than what i expected, very cool to know that you can also refine the model so it learns

@zapp3264 Год назад

Thx for sharing your knowledge on this topic! You ve done an amazing job in this tutorial :D

@elzilcho222 Год назад

had to do some fancy audio stuff to make this work (for example, it doesnt matter on my computer is the output is the vb cable or not), and the vtube model talks for any sound coming out of my computer, but still super cool! I never would have gotten this far without this video, thank you!

@vegetablescankill Год назад

I'm only a minute and 30 in and i can tell this is a high quality tutorial by a competent individual. Excited to see where this goes!! Subscribed! Thanks for sharing ♥️

@MYPSYAI Год назад

THIS is exactly what I wanted. Lets make a community around this "advanced beginner" level!

@darkff9090 Год назад

If you tried it Bro can you telling me what's your Instagram account I want to ask something about this..? Please

@qmac9966 Год назад

Wow I feel like I should be paying money for this information

@adi-panda Год назад

thanks man

@Taurellejr Год назад

@jefftseng6939 Год назад

So dope, i create my own chat bot using python, but i dont know how to connect it with VST, thank you, now i think i can move on next step.

@ComputerSage-Dev Год назад

Quick FYI, the git repostory uses text-davinci-003 model, very powerful but is the most expansive, gpt-3.5-turbo is a better option, both are GPT-3 (3.5 technically) but gpt-3.5-turbo is focus on chat like interactions and is way cheaper, it will require you to change the openai request tho.

@endymion3213 Год назад

How can we do that? total noob here, i'm trying to use chatgpt to figure it out but it's still way over my head. ty

@TehBrownie Год назад

how Do you change it

@gispry Год назад

For an update on this, the git repository now uses gpt 3.5 turbo

@chromacobble Год назад

This is just impressive, kudos to you!

@replikvltyoutube3727 Год назад

Good video thanks for going over the process.

@bronxandbrenx Год назад

this is awesome. Learned so much from the video.

@DansuB4nsu03 Год назад

Why is this video in my recommended, I'm not even that much into coding, lol Thanks RU-vid, you're the best for recommending me smaller creators :)

@Homiloko2 Год назад

Great tutorial! The anime girl speaking with a male voice cracked me up though, gj

@kuroraikiri6343 Год назад

This is amazing!!! please make a video explaining how to train the model ❤

@nolaz6009 11 месяцев назад

wow esto es increible, soy estudiante de programacion, asi que lo voy a recrear, muchas gracias por el tutorial

@Mamika_AFK Год назад

Great info thanks! Just subbed 👍

@Bitscuits_La_Cielhare Год назад

It very appreciated for sharing your knowledge for free!

@rioharta526 Год назад

thank you for sharing, will watch it later

@FARSIDE0_0 Год назад

How'd you only have 360 Sub? You deserve more

@Caysprorespaldo Год назад

Aaaa dios santo si tan solo supiera inglés, pero este videos es Genial y de gran calidad, Gracias broder

@BanriFerdinand Год назад

Hay mucho contenido en español solo que está todo por partes. No hay un tutorial en especifico w

@gouphee Год назад

hey this is a fantastic video. i was wondering if you would be making a video on how to get started with fine-tuning or if you could point me in the best direction to learn more about it? thanks again for the great content!

@MrErick1160 Год назад

awesome bro I need to do this. That will help HEAPS. I subscribed!!!

@breakyourlimit735 Год назад

thanks bro i will try it later for my adding project

@holopekochan Год назад

thank you. the video is very informative.

@storqe 3 месяца назад

You can use more voices too by downloading a voice changer program and setting your input and output microphones so your tts talks to the voice changer and the voice changer talks to vtube studio. Dubbing AI has the most natural sounding voices I've found so far and I'm sure better ones will come out in time.

@VRAnimeTed Год назад

Mine is really coming along! I'm running 5+ hours a day on twitch while tested. Managed to get gpt3.5 working. I thought I'd posted the code changes in chat, but maybe I restarted the computer without hitting send, hmmm... main swapping engine for model, but more to it.

@1doom1000 Год назад

Knowing your work, I'm curious and excited to see where this goes...

@TehBrownie Год назад

Ya want to share the changes to GPT3.5

@ViperousVT 6 месяцев назад

Thanks this would work for my Lyra Ophuchius model 🐍 😃

@ghostsheepy. 8 месяцев назад

grats on the 10/10 from ign

@annasbites Год назад

Thank you for good information!

@SeanKula Год назад

Dude, your tutorial rocks! Thank you. Can we get a tutorial on an AI that plays the video games, kind of like neuro.

@Overneed-Belkan-Witch Год назад

*The Power Anime and ChatGPT in the palm of my hand*

@jymcaballero5748 Год назад

que gran video amigo, felicitaciones, 1 suscripcion!

@zazukanal Год назад

Wow you are best it is what i need

@basaran3d Год назад

Bro single handedly made me learn coding

@yedochen84 Год назад

Thanks for sharing!

@gosudrm Год назад

Thank you very much!

@deepnightly3824 2 месяца назад

damn brother i hope this still work, Thanks Man!

@wikanz09 Год назад

Is there any way to do these with elevenlabs and youtube chat instead? Appreciate your efforts in making this video. This Ai vTuber thing could be a goldmine in the future

@tangerinacat2409 Год назад

I like your words, funny internet man

@saturnorbit6810 Год назад

I like your perspective on fine tuning, thanks!! Any discords you recommend???

@paradoxgaming10 Год назад

Love the video man but I want to know if I can make my own custom responses to certain questions and keywords instead of using the GPT or any other AI service ? So like in this I want to make custom responses and the rest of the process is same as you showed.

@lordessofstrawberries1523 Год назад

Hey! Great walkthrough, for windows its working at the moment! I do have one issue however, in my test stream it disconnects, where as it will show a prompt from Twitch but just idle forever on a response. Any fix or direction i can look at to fix?

@LucyKosaki Год назад

Is it possible to make a version that uses elevenlabs api instead of google tts? The google ones seem a bit restrictive in terms of voice selection and elevenlabs not only has very easy voice cloning features but it also gives you 10k free tokens a month for testing. I think this might be better for imitating anime voices. But I'm not a programmer, I don't know if this would be just some very minor adjustments to the code or if you need to rewrite the whole TTS section of the script with new api requests.

@thomasfrigerio6147 Год назад

I was here (Sure u''ll gain a lot of subs)

@fate2304 Год назад

Now I can make a coding buddy that can not only help me help with coding but also talk to me to keep myself sane

@RockLou Год назад

This is cool!

@holyamvs4594 Год назад

can you make a google colab version, but stll this is some good work

@barrygovino3100 Год назад

Dude your awesome!! I'm gonna try it out soon. Btw is it possible to add different voices instead of the default one?

@adi-panda Год назад

yeah you have to change the code though, in the description I put a list of available voices

@evieyorisou2586 Год назад

@@adi-panda does the language of the TTS voice use have to be the same as the language that's being used in chat such as if a chinese user is saying Chinese would you then have to use one of the Chinese voices or if English users are using all English would you therefore have to use only English voices or do all the voices basically translate to the same thing meaning it doesn't matter what language you're speaking the language of the voice will still be able to be used.

@nightcorealg5765 Год назад

now all I need is a tutorial on how to run everything on your computer ;-;

@adi-panda Год назад

sorry lol, I tried my best, but this was only really a quick project lol

@hevilmateold Год назад

FINILLYYYYY LETS GOO

@incongruous4 Год назад

Will there be an updated video re-explaining after all the changes to the GitHub files recently? Thanks

@hapashreguiem7460 Год назад

Good tut

@ai_vids Год назад

this is gold

@joshuamart3910 Год назад

This is incredible, thank you very much, but how is it possible that vedals can talk in real time with his AI and it responds in real time, even interrupting him like a normal conversation.

@chinonoob3219 Год назад

Nice video, more tutos coming up?

@gab_campos Год назад

I have the same problem as Video. The bot stops talking after a while. How to solve? And also in the prompt appears code snippets, which should not be shown, it should only have the bot's lines.

@notbriann Год назад

this is exactly the video about neuro-sama I wanted thanks!

@ViajanteGaross Год назад

this feels like an 5k dollars worth tutorial

@programateiro9507 9 месяцев назад

Great tutorial sir! Just a little question can I run the chatbot only with vtube studio without connecting to twitch?

@youjinchoi3430 9 месяцев назад

Thank you so much for this. My only question is would it be possible to set up certain personality of an AI when it reads and responds on the chat? + due to some troll viewers, some things have to be filtered or ignored. Is there any way to do so? I am not sure if my question makes sense and is there any way to set this up in Korean?

@livelearn5741 Год назад

How do you retrain the model? Thanks for the great tutorial!

@jangsoodlor Год назад

yay i can finally have neuro-sama at home uwu

@flibflop8219 3 месяца назад

Can you do the tts in a different way without cloud? Im just trying to cut costs. Im not going to be streaming with it so I was wondering if there was a way to do it without cloud and just running off your pc, Im writing this only like 5 mins in so Ill edit this if my question changes.

@deadmeme4058 Год назад

Is there any way for connecting this to Elevenlabs instead of google cloud tts?

@0verDDoS 11 месяцев назад

Do you have to have any programs open to use this? My cpu runs a bit high when streaming games and idk if this will affect it

@MoriVR Год назад

Thanks for the tutorial! I followed it step by step but the audio cuts out before it can play a full response

@roseydeep4896 Год назад

Can I run this on colab/remote GPU somewhere & use a different model (Not GPT3/4, I wanted to use Vicuna)?

@DrunkyBearAslaxVODs Год назад

hi cool video bro, is it possibly to change the tts voice to an anime waifu voice? a costom one? thanks

@beyondmanhattan5099 Год назад

is beatiful

@evieyorisou2586 Год назад

Is it possible to use ElevenLabs instead of Google TTS or cloud services and if so how would I do that to make the necessary changes to the Python

@gispry Год назад

Got this all set up and it works well. Though I want to have it pull the most recent message after it has stopped speaking as currently it seems to just go top to bottom regardless of how many people are chatting. I can manually reset it to force it to jump to the most recent message but it would be good to know how to do that automatically. I have been researching for a couple hours now and I cant find how to do that.

@gispry Год назад

I am so lost. How can something that seems so simple be so complicated. i just want it to read the most recent chat message instead of reading every chat message top to bottom like a list :(((((((( How is this not something someone has wanted before. Why cant I find anything online

@gispry Год назад

For anyone else wanting this I did finally find how to do it. Just needed to add if not message.content.startswith('-'): return under the async def event_message(self,message):

@gispry Год назад

Though doing this in notepad is very difficult as you have to indent manually

@Laloktv Год назад

Does the ai only speak English? Or is it able to manage other languages? Amazing tutorial

@maublanch 6 месяцев назад

Hi, please I was wondering if I can lock the process so my avatar would only speak when asked mayube with some commands or words, I can't understand or find the solution for this, maybe is only adding an if, but I'm not good at it.

@mercurygalaxii Год назад

Nice tutorial! Is it possible to use gpt3.5 turbo api for this chat bot? It's 10 times cheaper, also can it support multi language like chatgpt does

@neociber24 Год назад

From the current documentation on OpenAI is not possible to fine tune the gpt 3.5 turbo (at the moment of this comment)

@AalummiCh Год назад

I'm having problems with registering the audio to make the avatar talk in relations to the audio, even though it's set the complete same way and I have the cable program installed

@deusexmakima 10 месяцев назад

same dude, if u have the solution plz tell me ;)

@KilerMansters Год назад

thanks

@Bettychan1 Год назад

is it possible to use elevenlabs tts instead of the google one?

@travelerwatanabe Год назад

is there any differences if this is being used for RU-vid streams and not on Twitch?

@FedzNGree Год назад

Is it possible to do the same also for the RU-vid Chat? Or even integrate both?

@danielhughes3758 Год назад

Pretty dang sure it's possible, as you're just streaming from OBS which in turn captures all the input. You'd need to get a youtube API key and modify the code to also connect to youtube using the key. You could even modify the script to take an input parameter so you can default to both but specify if you want to when you run the file. Perhaps youtube has some crucial limitation that Twitch doesn't have that would prevent running this, but I highly doubt it

@TheBBWolf 7 месяцев назад

pretty cool friend. Question: can I use a totally free alternative to openai, if so how and which one?

@sarabibrahim8312 Год назад

👏👏👏

@STE4LTHYP1CKL3 3 месяца назад

Hey I know this video is quite old so not sure if I'll get a reply but were you using any plugins for OBS for the subtitles? I can't seem to get my subtitles like yours the words are just cropped out of the text box if they don't fit (this is using the lua script).

@davidbarahona1368 Год назад

You da man

@KasumiChanXx 8 месяцев назад

Question with this am I able to change the voice? N if so how do u do it?

@Tuti_v3 8 месяцев назад

Hi, great tutorial. Can it be set to only read cheering messages?

@septumkmm6030 Год назад

in your bellow code, I have found that once the conversation limit is reached, it will remove the prompt which is found in the .txt file. if len(Bot.conversation) > CONVERSATION_LIMIT: Bot.conversation = Bot.conversation[1:] so I have fixed this issue by using if len(Bot.conversation) > CONVERSATION_LIMIT: Bot.conversation.pop(1) this way memory list is still preserved while prompt is still active which each new request sent.

@pipi2898 Год назад

anyone ran into this problem while the tts try to speak? "mmdevice audio output error cannot initialize com"

@mr.shgamingguy Год назад

cool

@Xaldras Год назад

How do you change the ai's voice? is it through the initial text document?

@adi-panda Год назад

In main.py at line 74-76

@RyuusanFT86 Год назад

Do you think this can be done with Pygmalion?

@MsAnxiety 7 месяцев назад

in miniconda, I keep getting error messages whenever I try to put in the folder location. How would I fix this so that I could continue?

@dylan-lp6km 26 дней назад

What is supposed to be running on port 7500? I get an error when connecting to the web socket there

@frederikdenis8208 Год назад

can you build it with LLaMA & Alpaca and have your thing on a 100G hard drive instead of having a openai account?....i mean im a noob and i dont know what im talking about but i think it can do the chat-gtp job?

@orcaolivegames Год назад

let's say I wanted to replace googles text to speech engine with my own code for that, would I delete that line and put my code in place of that or something else?

@peuug Год назад

Awesome video dude , i'm on a project similar to this one but i want my avatar to react to what i say in my mic and not using text ! any ideas to how to ? thanks

@Auhailbree Год назад

open AI whisper will create a transcript of what youre saying and then you just have to grab that as input somehow, but there will be a lot of latency