Speech to Text in Kdenlive - How to Configure Whisper for Subtitles in 5 Minutes 

Photolearningism
Subscribers: 5K
Views: 2.8K

Published: 25 Oct 2024

Comments: 24
@Photolearningism 1 year ago
UPDATE: After upgrading Kdenlive to 23.04.1, I found that Whisper stopped working (it was throwing "pytorch" errors in the logs). In my case (Ubuntu), I manually updated Torch and TorchVision with the command "pip3 install torch==2.0.1 torchvision==0.15.2 -f https://download.pytorch.org/whl/torch_stable.html", then checked the configuration again in Kdenlive. Kdenlive then detected an additional update in the "Speech to Text" settings, which installed itself once I clicked the "update" button. Upon testing, subtitles are working again using Whisper. Hope that helps someone!
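A minimal Python sketch for checking whether the installed Torch and TorchVision match the versions pinned in the command above. It assumes you run it in the same Python environment that Kdenlive uses, which may not be the system one for AppImage, Flatpak, or Snap builds. `PINNED`, `installed_version`, and `compare_with_pins` are illustrative names, not part of Kdenlive or Whisper.

```python
from importlib import metadata

# Versions taken from the pip3 command in the comment above.
PINNED = {"torch": "2.0.1", "torchvision": "0.15.2"}


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_with_pins(pins, lookup=installed_version):
    """Map each package to (found_version, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Torch wheels may carry a local suffix such as "+cu118"; ignore it.
        base = found.split("+")[0] if found else None
        report[name] = (found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in compare_with_pins(PINNED).items():
        status = "matches pin" if ok else "differs from pin or missing"
        print(f"{name}: installed={found} ({status})")
```

A mismatch here is only a hint. Newer Kdenlive releases may expect different versions than the ones pinned in this comment.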
@leandrooliveira. 1 year ago
Thanks for sharing your knowledge, mate! I'm still having some trouble with Whisper... I've tried all the commands that you've posted, but it's still not working... I'm getting this message: "The openai-whisper python module is required for speech features." (Python is installed on my desktop.) "The torch python module is required for machine learning framework." (I don't know about this one...) Do you know how to get it working? Thanks a million!
@pasankawshik412 5 months ago
@leandrooliveira. Same here, stuck on this (I'm using Pop!_OS).
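The two messages quoted above name Python modules that Kdenlive could not find. A rough way to see whether a given Python interpreter can locate them is sketched below. It is not how Kdenlive itself performs the check, and `REQUIRED` and `missing_modules` are hypothetical names. The only outside fact relied on is that the `openai-whisper` pip package is imported as `whisper`.

```python
import importlib.util

# pip package name -> importable module name
REQUIRED = {"openai-whisper": "whisper", "torch": "torch"}


def missing_modules(required=REQUIRED, finder=importlib.util.find_spec):
    """Return the pip names of packages whose module cannot be located."""
    return [pip_name for pip_name, module in required.items()
            if finder(module) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not found by this interpreter:", ", ".join(missing))
    else:
        print("whisper and torch are both importable from this interpreter.")
```

If the modules are found here but Kdenlive still reports them missing, Kdenlive is probably using a different Python environment than the one that ran this script.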
@LowLightRecovery 7 months ago
Love your content dude!
@arnonart 1 year ago
I just returned from a friend whose son runs Windows and has nothing but troubles that I wasn't able to solve. 😢 So, yes, Linux has its fair share of troubles and bugs, but so do the others. Really appreciate this video. I run the Kdenlive AppImage and I might want to try the speech-to-text generator myself.
@amir-jg4zy 6 months ago
I had some trouble getting it working on Windows 11 as well. The real test is getting it going on my Mac, which has the M3 Pro CPU, so it might process the subs really fast.
@joopvervaart 8 months ago
Hi Nate, can you give the newest link to download all of this? I have already installed about 4 GB of all kinds of stuff and it still keeps telling me that the SRT is not there.
@joopvervaart 8 months ago
I just downloaded the latest AppImage, 23.08.5, and it installed everything on its own, so it seems the problem was in the AppImage.
@grysufeuermelder9602 1 year ago
At 1:20, where did you get this command from? I can't get Whisper to work (using the latest Flatpak version of Kdenlive). There are always some broken dependencies. It complains about requests==2.31.0 needing requests==2.25.1, and when I force-install that version, other dependencies break.
@Photolearningism 1 year ago
Thanks for the question! To my recollection, the command surfaced in the install attempt/log from Kdenlive - the rest was compiled from the Python documentation for Torch and by digging through the online repository for valid versions. Hope that helps!
@johnn1052 1 year ago
Hi Nate, the last time I used Kdenlive I found the reverse motion function very slow and problematic. Do you think Resolve has a better function? Also, do you know which of the two has the easiest speech-to-text function?
@Photolearningism 1 year ago
Hi John N! Every now and then I use the reverse speed feature - I hadn't noticed anything amiss at the time, but it's possible that it was indirectly affected at some point. I did run a quick test in the latest release (23.04.2) and it seems to be working, using H264-encoded video. As to Resolve, I've used it in the past but have since switched from #Windows to #Ubuntu and haven't been able to get it working in Linux. I do see a lot of others creating good work with it. The tool does have a lot of professional-grade features to offer (including speech to text and speed adjustments, whether moving to slow-mo or speed-ramping). However, in my opinion it requires too many resources (CPU, RAM, and overall storage), not just to install but to use regularly. Even when I had it working in #Windows10, it would crash frequently from system over-utilization. This may be less of an issue on a recent, well-stocked machine. My other gripe with Resolve is that it offers part of the tool for free (which is indeed generous), but the free version is limited as a means to sell the more powerful features and options. There's nothing wrong with this approach, but it may become a headache that gets in the way of creativity. Hope that helps!
@writethatdown100 10 months ago
I'm not getting the button to "install missing dependencies"; it only shows the x button.
@Photolearningism 9 months ago
Hi WriteThatDown! Thank you for the comment! The experience may be slightly different depending on the operating system being used (Windows, Linux, etc.). Just in case, the Kdenlive manual does have the needed installs, which may help to resolve the issue - docs.kdenlive.org/en/effects_and_compositions/speech_to_text.html - Hope that helps!
@Poetry-For-The-Itch-Of-It 9 months ago
Can Kdenlive, when using the speech-to-text function, throw the words out on the screen in an animated fashion in the same manner that CapCut can? Is there an option for fonts that have shadows or are easily read on the screen?
@Photolearningism 9 months ago
Hi William James! Kdenlive does have a control for changing the font, color, shadow, etc. of subtitles. I found that if you hold Shift and click and drag over all the subtitles, it will allow all of them to be formatted at the same time. However, it does not appear to support animation, because I believe it keeps the text in a subtitle format so that it can be exported (such as to an SRT file) for use with video platforms. I know that "Titles" in Kdenlive do support animation and the other included effects - I took a quick look to see if a subtitle could be converted to a Title, but couldn't find an obvious way to do it. Hope that helps some!
@Photolearningism 1 year ago
What slows you down in your video editing workflow?
@joopvervaart 8 months ago
Well, after two more hours... it doesn't work on Ubuntu 22.04 with the Kdenlive AppImage 23.08.5.
@Photolearningism 8 months ago
Hi joop vervaart! I finally had a few moments to test things out - I'm using the same distro and version of Linux and updated to the latest Kdenlive AppImage (23.08.5). I was able to try out Whisper and generate speech-to-text subtitles. Just as a thought, if you open Settings -> Configure Kdenlive -> Speech To Text, does it successfully detect the language model? If so, I'd suggest clicking "Check configuration" on the bottom-left of the window to confirm that all the needed components are detected. Hopefully that helps to shed some light on the issue!
@joopvervaart 8 months ago
@Photolearningism Hi Nate, thanks for your reply. Whisper keeps probing for a GPU. I use the graphics of my Ryzen 7950. That seems to be the problem. I do not have a dedicated graphics card. I will keep you posted.
@joopvervaart 7 months ago
@Photolearningism Hi Nate, I solved the problem by installing the snap version 23.08.4, and Whisper installed without any problem.
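For anyone hitting the same GPU probing on a machine without a dedicated card, the sketch below shows one way to decide between GPU and CPU before running Whisper from Python. It is an illustration only, not what Kdenlive does internally. `pick_device` is a hypothetical helper, and `torch.cuda.is_available()` is the only Torch call used.

```python
import importlib.util


def pick_device(prefer_gpu=True):
    """Return "cuda" only when Torch is present and reports a usable CUDA GPU."""
    if not prefer_gpu:
        return "cpu"
    if importlib.util.find_spec("torch") is None:
        # Torch is not installed in this environment, so fall back to CPU.
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

When calling Whisper directly from Python, the result could be passed as the `device` argument of `whisper.load_model`. Whether a given Kdenlive build exposes an equivalent setting depends on the version.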
@uweschroeder 11 months ago
While this is a cool feature, if you're doing YT videos, isn't YT already doing that for you automatically?
@Photolearningism 11 months ago
Hi Uwe! Thank you for the question! YouTube has indeed made some innovative advances with subtitle generation - the ongoing challenge is that the translation models still struggle with accent nuances or unusual spellings of tools/products. In the open source arena, I've noticed that YouTube's captions struggle with "Kdenlive", "Krita", and the other things that I regularly discuss. Additionally, there are shortcomings with periods of silence while something purely visual is being demonstrated. YouTube sometimes inserts "Thank you" or "[applause]", as it doesn't have a good handle on what is happening visually. For most things, the built-in subtitles are fairly good, but having this option built into the editor gives a bit more control over where subtitles appear (they are timestamped) and ensures the correct spelling is used. Hope that helps!
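To make "timestamped" concrete: in the common SRT subtitle format, each subtitle is a numbered cue with a start and end time written as HH:MM:SS,mmm. The sketch below formats one such cue. `srt_timestamp` and `srt_cue` are illustrative helpers, not Kdenlive or Whisper functions, and the times and text are made up.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index, start, end, text):
    """Build one SRT cue: index line, time range line, then the text."""
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    # Correcting a misspelled product name is a plain text edit on the cue.
    print(srt_cue(1, 0, 2.5, "Welcome back to Kdenlive."))
```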