I Made an App that Accurately Produces Subtitles Using Whisper

Get the Transcriber: github.com/Jar...
Python - www.python.org...
Whisper - github.com/ope...
OpenAI - platform.opena...
Come join the Language Journey!
Discord - / discord
Github - github.com/Jar...
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoff...
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/3NBAsIq
PC Case - amzn.to/447499T
Mother Board - amzn.to/3CziMXI
Alternative prebuilds:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest and minimum specs recommended:
Cyberpower 3060 - amzn.to/3XjtZoP

Published: 12 Sep 2024

Comments: 17

@amixam 1 year ago

Is it accurate? Even if you set the audio quality and the model size high, there may be a lot of errors depending on the voice: compare a person who talks clearly with one who usually slurs their words (especially anime characters). But this is really cool, amazing job! I can see it being really useful for productive language learning and consumption.

@Jarods_Journey 1 year ago

It's weird that I use this as a baseline, but listen to this video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-1zj8j7sCH_g.html When it transcribes this video at 32, it does so with near-perfect accuracy compared to the hard-coded subs.

@exponentzero 1 year ago

Hi, Jarod! I've just tried this with a local file, which is probably going to be my standard use case. I'm using the Whisper API on the server, not running it locally on my PC. The mp4 I used was 1.6 GB, a 45-minute video. At first, I left the audio quality at 128; this generated an mp3 file just over 43 MB, and I eventually got a Whisper API error saying the audio file exceeded the size limit for transcription. I then selected 16 audio quality. This time the mp3 file generated was almost 11 MB. An srt file was created! I looked at the subs, and they looked really good (Japanese) for the first 26 minutes. The srt file kept going for the full 45-minute program, but after 26 minutes the subs were just 「どん」 repeated until the end of the file! 🤣 This is actually a big improvement over the beta version of the transcriber I tried out, so this is a really promising tool, and I'm quite excited. Do you have any thoughts on why the srt file crapped out halfway? Thank you very much for providing this language-learning game changer!
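The 43 MB figure above is roughly what constant-bitrate arithmetic predicts, if the audio quality setting is read as an MP3 bitrate in kbps. That reading is an assumption; the thread does not confirm it. A minimal sketch, with hypothetical function names and the Whisper API's 25 MB upload limit as the threshold:

```python
# Assumption: the "audio quality" setting is a constant MP3 bitrate in kbps.
WHISPER_API_LIMIT_MB = 25  # upload size limit of the hosted Whisper API


def estimate_mp3_size_mb(duration_minutes: float, bitrate_kbps: int) -> float:
    """Approximate size of a constant-bitrate MP3 in megabytes (10**6 bytes)."""
    total_bits = bitrate_kbps * 1000 * duration_minutes * 60
    return total_bits / 8 / 1_000_000


def fits_upload_limit(duration_minutes: float, bitrate_kbps: int) -> bool:
    """True if the estimated file size is under the upload limit."""
    return estimate_mp3_size_mb(duration_minutes, bitrate_kbps) < WHISPER_API_LIMIT_MB


if __name__ == "__main__":
    # Print the estimate for a 45-minute video at several bitrates.
    for kbps in (128, 64, 32):
        size = estimate_mp3_size_mb(45, kbps)
        print(f"{kbps} kbps: {size:.1f} MB, fits: {fits_upload_limit(45, kbps)}")
```

For a 45-minute file this gives about 43.2 MB at 128 kbps and 21.6 MB at 64 kbps, in line with the sizes reported in the thread. The "almost 11 MB" result at the 16 setting does not fit the model (16 kbps would be about 5.4 MB); the encoder may floor at a higher bitrate, but that is only a guess.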

@Jarods_Journey 1 year ago

This is actually a weird but known bug that occurs in Whisper, and I believe it's due to the model getting stuck in some loop. I'm not sure if they're working on fixing it, but I've noticed it tends to happen on longer videos and files. What might help: try encoding the audio at 64, which should come out under 25 MB, select Japanese in the language parameter, and then in the prompt section, describe what the video is about and say that it's in Japanese. These things have helped me in some transcriptions, so hopefully they help with yours.
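The advice above maps onto two optional parameters of the hosted transcription endpoint, `language` and `prompt`. The sketch below is not the Transcriber's actual code: the helper names are hypothetical, and it assumes the `openai` Python package (v1 or later) with an API key in the `OPENAI_API_KEY` environment variable.

```python
def build_transcription_params(language: str, topic_hint: str) -> dict:
    """Collect the request options suggested in the thread (hypothetical helper)."""
    return {
        "model": "whisper-1",
        "language": language,      # ISO 639-1 code, e.g. "ja" for Japanese
        "prompt": topic_hint,      # short description of the audio content
        "response_format": "srt",  # ask for subtitles rather than plain text
    }


def transcribe_file(path: str, params: dict):
    """Send one audio file to the hosted Whisper API and return the response."""
    from openai import OpenAI  # imported lazily so the helper above has no dependency

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio:
        return client.audio.transcriptions.create(file=audio, **params)


if __name__ == "__main__":
    params = build_transcription_params("ja", "Japanese anime episode dialogue")
    print(params)
    # transcribe_file("episode.mp3", params)  # hypothetical file; needs network and a key
```

Stating the language explicitly skips automatic language detection, and the prompt gives the model some context for vocabulary; neither is guaranteed to prevent the repetition loop.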

@exponentzero 1 year ago

@Jarods_Journey Excellent! Thanks, Jarod. I just checked the mp3 file to make sure the sound hadn't gone wonky at around the 27-minute mark. It hadn't; the audio seemed normal right up to the end of the file. But the sound quality is so crappy, I can't believe Whisper is able to be so accurate with such input. Amazing. Thanks again for your advice. Very exciting technology for language learning.

@Jarods_Journey 1 year ago

@exponentzero Aye, well, let's hope the transcription goes better. Wish ya the best!

@exponentzero 1 year ago

@Jarods_Journey OK, final update from me on this for today ;) I tried the audio setting at 64 and put some description in the prompt. The mp3 was 21 MB. Whisper generated an srt file whose only content was a "1". lol... I retried at 32 audio quality, and this time everything worked! An srt file that looks great right up to the end of the video. Awesome work!

@Jarods_Journey 1 year ago

@exponentzero Ah, just a 1? There must've been an issue with the transcription, and it never got to OpenAI in that case 😅. I'll have to add some debugging so it tells you that. But awesome, glad the options I added helped guide the Whisper API.
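Both failure modes from this thread, an srt file containing only "1" and a long run of identical 「どん」 cues, can be detected after the fact. The following is a rough sketch of such a check, not the debugging the author later added; the function names and the repeat threshold are hypothetical.

```python
import re
from itertools import groupby

# An SRT timestamp line looks like "00:00:01,000 --> 00:00:02,500".
TIMESTAMP_LINE = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def parse_cues(srt_text: str) -> list[str]:
    """Return the text of every cue that has a valid timestamp line."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) >= 2 and TIMESTAMP_LINE.match(lines[1].strip()):
            cues.append(" ".join(line.strip() for line in lines[2:]))
    return cues


def check_srt(srt_text: str, max_repeats: int = 10) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    cues = parse_cues(srt_text)
    if not cues:
        return ["no subtitle cues found"]
    problems = []
    position = 1  # 1-based index of the first cue in the current run
    for text, group in groupby(cues):
        length = len(list(group))
        if length >= max_repeats:
            problems.append(
                f"cue text {text!r} repeats {length} times starting at cue {position}"
            )
        position += length
    return problems


if __name__ == "__main__":
    print(check_srt("1"))  # the degenerate file described above
```

A check like this only reports the problem; re-running the transcription with different settings, as was done in the thread, is still the fix.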

@ArabicLearning-ki9gp 1 year ago

Great video

@greatideas218 1 year ago

Another video I saw didn't use an OpenAI key to get Whisper running. Is there a difference, or a benefit, to using the OpenAI key?

@Jarods_Journey 1 year ago

This video is pretty old, but there are multiple ways of running Whisper nowadays: one is through OpenAI's own API, the other is running it locally. You don't need an OpenAI key if you're running locally, and that's the approach I use nowadays.
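For comparison, a local run needs no API key at all. This sketch assumes the open-source `openai-whisper` package and `ffmpeg` are installed; the model size, file name, and helper names are illustrative choices, not the author's setup. It also shows one way the returned segments could be turned into SRT text.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp format HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Build SRT text from segments that have 'start', 'end', and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_locally(path: str, language: str = "ja", model_size: str = "small") -> str:
    """Run Whisper on this machine and return SRT text (no OpenAI key needed)."""
    import whisper  # from the openai-whisper package; imported lazily

    model = whisper.load_model(model_size)
    result = model.transcribe(path, language=language)
    return segments_to_srt(result["segments"])


if __name__ == "__main__":
    demo = [{"start": 0.0, "end": 1.5, "text": " こんにちは"}]
    print(segments_to_srt(demo))
    # print(transcribe_locally("episode.mp3"))  # hypothetical file name
```

The trade-off is that local runs depend on your own hardware (a GPU helps considerably with the larger models), while the hosted API is limited by upload size and billed per use.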

@greatideas218 1 year ago

@Jarods_Journey I see, thanks for the quick response. That should have been obvious given the speed of AI progress lol. I consider you the expert AI voice guy, so your videos help a bunch in this category. Greatly appreciate it!

@Jarods_Journey 1 year ago

@greatideas218 Haha, appreciate it!