I think that showing the outcome at the beginning of the video, and only then explaining how it's done, works as a "hook" for viewers. Keep it up!
It wasn't about Jennifer Lawrence; using a prompt with a celebrity (one that's surely included in the StableDiffusion model) helps improve the consistency across a sequence of frames.
Looks like the best way to prepare this is to render the character against a green background and render the background separately. Then just key it out as usual.
Yes, green-screening the character in Blender and making two render passes in SD, one with the character and one with the background, would also be an option. I used this method in one of my previous tutorials about creating an audio-reactive music video with StableDiffusion. Makes sense!
Right, temporal coherence is still one of the big issues with this technology. Still, it's getting better over time, and I'm just trying to find some workarounds by green-screening the character from the background and rendering it separately, which seems to produce better results in most cases. I've also tried RunwayML Gen-1, but in my view it's not yet suitable for producing professional work, due to its severe time limits for rendering clips. Maybe I'll make a video about it, if I think I can add a valuable contribution to this topic.
This is incredible, but also insanely long and complicated. I hope someday they can make an AI that generates a video like this all in one app, something like ModelScope text-to-video synthesis 2, Stable WarpFusion, or Runway Gen-1. Also, I think you could just use a video game like Blade & Soul to create the avatar, the dance, and the background scene, record a video of the gameplay, and feed that video into Stable Diffusion. Using a video game could have saved you a hundred of those steps.
I know, it's still a long process, but that's where we stand now. I'm working on a video comparing the pros and cons of Runway Gen-1 and Automatic1111 for video creation, in a quick step-by-step guide, also giving some ideas on how to deal with the Auto1111 temporal coherence issues. You're right that using a video game as input could speed up the process, but since I'm not much into video gaming, I had to go the more tedious way and create the input footage myself. Still, that's good advice!
Such a program has already been created by Epic Games: the MetaHuman mobile application lets you record your movements as animation, along with voice acting, in about 5 minutes.
Great intro to the topic. But did you also manage to get a good video result where the costume stays the same, as well as the background? If so, any chance of a follow-up video?
I'm working on exactly that issue! There have been a lot of developments in StableDiffusion since I posted this video, and it has now become possible to produce stable, flicker-free animations of any person, just from a Blender animation or a video input plus a single facial photo of that person. I'm planning to post another tutorial in the coming week, this time using ComfyUI instead of Auto1111, as it's more versatile... I just need to solve a few minor issues before I'm ready.
Well, the Deforum extension still has some issues with temporal inconsistencies. You can get better results if you set the Strength and CFG Scale even lower, so StableDiffusion sticks closer to the original video. I also found that batch img2img and ControlNet, with low strength and scale, make the scene stay more consistent.
Yeah, Automatic1111 is quite capable of producing great images, but there are still major issues with temporal coherence in videos, which makes them look kind of trippy. But I think I'm about to crack that nut... I'm working on a new video where I'll be addressing this issue (and hopefully can provide some ideas on how to solve it). I'm also checking out some other tools in this regard, like RunwayML Gen-1, which is far from a perfect solution, but gave me some conceptual ideas on how to deal with the coherence issues in Automatic1111. Well, we shall see...
Yes, we're still at the beginning, but it's just amazing how fast the technology and the tools are developing. It's just great fun being part of it already at this early stage, watching it grow and adding some humble contributions to it.
Thanks a lot for this awesome tutorial and all your work. Wouldn't the EbSynth program be a good step in the process to make the video look more consistent?
Yes, it might, and I've already installed the app, but haven't had the time yet to get deeper into it. There's also an extension for Automatic1111, available under Extensions->Available->Load from, which helps you through the process. Maybe I'll make a video about it, if I think it can be helpful. I'm currently working on another approach for improving consistency: green-screening the character in Blender and rendering it separately from the background scene, then making two render passes with batch img2img + 2 ControlNets, one for the character and one for the background, and putting them back together in my video editor. It looks promising, and I'm going to make a quick tutorial about it soon, together with a short review of RunwayML Gen-1.
Dude, that was so helpful, but I have a question: I'm starting to learn Blender, but I want to see if my PC is suitable for it. The CPU is an Intel Core i5-10400F and the GPU is a 6900 XT with 16GB. Is that good enough for making animations, or should I upgrade?
Well, Blender requires some effort to get started, but once you're familiar with it, it's a wonderful tool for producing 3D renders of any kind. I also use Unreal Engine for cinematic 3D renders and, while it's a monster in terms of memory and GPU requirements, it's also at the top of my list. Both Blender and Unreal are completely free.

That said, you don't need any of these tools for StableDiffusion; you can create your input videos any other way you like, be it with a simple smartphone camera, or even by recording scenes from a game, if you're into computer gaming.

StableDiffusion / Automatic1111 works best with NVIDIA RTX graphics cards. Besides my MacBook Pro M1 Max, I also own a middle-class PC with 32GB of RAM and an NVIDIA RTX 3060 with 12GB VRAM, so nothing fancy, and that works pretty well, even beating my Mac in terms of performance. To my knowledge, AMD cards are not as well supported as NVIDIA, but should also be able to get along with Automatic1111. Here's an article I found on GitHub addressing this topic: github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-AMD-GPUs

I would just give it a try with what you have, and consider an upgrade only if you're not satisfied with it.
Yes, that's possible, it just requires a little trick. In Blender, render the scene with only the character on a green-screen background, hiding the rest of the scene, and feed this video into StableDiffusion (if you don't know how to create a green-screen effect in Blender, take a short look at my tutorial about creating an audio-reactive music video, at timestamp 9:25). Next, unhide the scene in Blender, hide the character, and render again. Then import the background render from Blender and the SD-rendered video with the character into your video-editing software (FinalCut, DaVinci, Premiere), place the character in front, and remove the green screen from the character with a keyer - and it's done. I hope that explanation was understandable; if you have any further questions, just ask!
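For anyone curious what the keyer actually does in that last step, here's a minimal sketch of green-screen keying in Python with NumPy. The threshold value and the simple "green dominates red and blue" rule are my own assumptions for illustration; a real keyer in FinalCut or DaVinci also handles color spill and soft edges.

```python
import numpy as np

def key_out_green(frame: np.ndarray, background: np.ndarray,
                  threshold: float = 1.3) -> np.ndarray:
    """Replace green-screen pixels in `frame` with pixels from `background`.

    Both inputs are H x W x 3 uint8 RGB arrays of the same shape. A pixel
    counts as "green screen" when its green channel clearly dominates red
    and blue (the factor 1.3 is a rough guess, not a production value).
    """
    f = frame.astype(np.float32)
    r, g, b = f[..., 0], f[..., 1], f[..., 2]
    # Mask: green notably brighter than both red and blue, and not too dark
    mask = (g > threshold * r) & (g > threshold * b) & (g > 60)
    out = frame.copy()
    out[mask] = background[mask]
    return out

# Tiny demo: a 2x2 "frame" with two green-ish pixels and two kept pixels
frame = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[10, 200, 10], [100, 100, 100]]], dtype=np.uint8)
background = np.full((2, 2, 3), 7, dtype=np.uint8)
result = key_out_green(frame, background)
```

In a real pipeline you would run this per frame over the decoded image sequence, but in practice the keyer built into the video editor does a much better job.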
I'm sorry, but this is terrible. There are much better solutions for this than using Deforum. You should take a look at img2img batch processing. You can use it with multiple ControlNets, like depth, canny, pose and landmark, all at the same time. And there's no need to generate a video file first; img2img takes a folder of images as input. That should give you good consistency. Then look at a flicker-removal tutorial for the free version of DaVinci Resolve. You'll be amazed at how much better the result is.
Batch img2img can give you slightly better results with low strength and scale, but temporal consistency is still a big issue with all these tools. I still think Deforum is a great extension with many possibilities, like math functions and prompt shifting, but it's fairly complex to turn this great variety of functions into meaningful results. ControlNet has been a great improvement, whether you use it with batch img2img or with Deforum - I think it's a must-have for most use cases.

For reducing temporal inconsistency, it also seems to help to use less detailed background scenes, or even to separate the character from the background by green-screening it and using separate render passes with slightly different settings for the character and the background, before putting them back together in video editing software.

DaVinci Resolve is great for pre- and post-processing, though I tend to prefer FinalCut Pro as long as I'm on my Mac, as it has some pretty good tools for stabilizing and improving the optical flow of a clip. Still, my main focus right now is on tweaking the settings in StableDiffusion to improve temporal consistency. I'm also looking into some new scripts and extensions, and there are some promising concepts coming up that try to address these issues, but I still haven't found a convincing overall solution.

Well, it's a steep learning curve and all the available tools still have flaws, but I think it's worth dealing with them, as well as sharing your thoughts and ideas with others, no matter how imperfect they may still be.
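As a sketch of what the batch img2img route can look like outside the UI: the Automatic1111 webui, when started with the --api flag, exposes an img2img HTTP endpoint, and the payload fields below reflect my understanding of its schema. The URL, the exact field names, and the parameter values are assumptions based on my own installation, so please verify them against your webui's /docs page before relying on this.

```python
import base64

# Assumed endpoint of a locally running Automatic1111 webui started with --api;
# adjust host/port and verify the schema on your own /docs page.
API_URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

def build_payload(frame_png: bytes, prompt: str) -> dict:
    """Build one low-strength img2img request for a single input frame.

    Low denoising strength and a moderate CFG scale keep StableDiffusion
    close to the original footage, which helps temporal consistency.
    """
    return {
        "prompt": prompt,
        "init_images": [base64.b64encode(frame_png).decode("ascii")],
        "denoising_strength": 0.35,  # low strength: stay close to the input frame
        "cfg_scale": 6,              # low-ish scale, same reasoning
        "steps": 20,
    }

payload = build_payload(b"<png bytes of one frame>", "a woman dancing, studio light")

# Usage sketch (hypothetical; requires the `requests` package and a running webui):
# import requests
# from pathlib import Path
# for path in sorted(Path("frames").glob("*.png")):
#     r = requests.post(API_URL, json=build_payload(path.read_bytes(), "..."))
#     ...decode r.json()["images"][0] from base64 and save it...
```

ControlNet can reportedly be attached to the same request via the extension's "alwayson_scripts" mechanism, but since its argument layout changes between versions, I've left it out of the sketch.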
Please take a look at the ControlNet settings in the Settings tab (Settings->ControlNet) and make sure that the "Do not append detectmap to output" box is checked. If not, check it and restart the webui. Also make sure that the latest ControlNet version is installed (Extensions->Check for updates). If nothing helps, try removing the whole ControlNet folder from your stable-diffusion-webui/extensions folder and reinstalling ControlNet (Extensions->Available->Load from->ControlNet->Install). Hope that helps; if not, leave me another note!
Sure! If only single frames are defective, either just delete them from your output sequence, or replace them with the frame before or after. If there are several defective frames in a sequence, import all frames into your video editing software as an image sequence, delete the defective frames, and interpolate the missing parts. It depends on which software you're using, but I think most video editing apps are capable of that.
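To illustrate the idea, here's a tiny sketch in Python (with NumPy arrays standing in for decoded frames) that replaces each defective frame with a blend of its nearest intact neighbours. The 50/50 average is a deliberate simplification; real editors interpolate along the optical flow, which looks much smoother.

```python
import numpy as np

def repair_frames(frames: list, bad_indices: set) -> list:
    """Replace each defective frame with a 50/50 blend of its nearest
    good neighbours, or a plain copy at the edges of the sequence."""
    repaired = list(frames)
    for i in sorted(bad_indices):
        # Search outwards for the nearest non-defective frames
        prev_i = next((j for j in range(i - 1, -1, -1) if j not in bad_indices), None)
        next_i = next((j for j in range(i + 1, len(frames)) if j not in bad_indices), None)
        if prev_i is not None and next_i is not None:
            # Average in a wider dtype to avoid uint8 overflow
            blend = (frames[prev_i].astype(np.uint16) +
                     frames[next_i].astype(np.uint16)) // 2
            repaired[i] = blend.astype(np.uint8)
        elif prev_i is not None:
            repaired[i] = frames[prev_i].copy()
        elif next_i is not None:
            repaired[i] = frames[next_i].copy()
    return repaired

# Demo: three single-pixel "frames", the middle one marked defective
frames = [np.full((1, 1, 3), v, dtype=np.uint8) for v in (100, 0, 200)]
fixed = repair_frames(frames, {1})
```

The middle frame ends up as the average of its neighbours, which for a single bad frame in the middle of motion is usually invisible at 24 fps or above.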
Thanks for the link! It looks like a professional tool for SD video creation, definitely worth a closer look, though I'm not sure about the real costs of using it. They seem to work a lot with green-screening the characters, which surely helps keep them more consistent. I've also done that in one of my previous videos about creating an audio-reactive music video; you then just need two render passes, one for the character and one for the background, and then merge them together in video-editing software, like FinalCut or Premiere. Again, thanks, I'm going to play around with it for a bit and see what I can do with it!
I still can't generate any asset in Blender using Stable Diffusion, even though I already followed the installation steps. After I input the prompt, it doesn't generate anything. Can you help?
Can you tell me a bit more about the problem, please? Is it that you can't create and export an image sequence in Blender, or that you can't render this image sequence in StableDiffusion/Automatic1111? I'd like to help you if I can!
No, I'm not using DMs, but if you want, you can simply send an email to my channel address (blndrrndr@gmail.com) and attach the Blender file, so I can take a look at it.
Would have been nice if you gave us the full clip, dude 😎 That's like waving Jenna Jameson in front of us and not telling us where to find the rest.
Yes, I used a celebrity name to improve the overall temporal consistency of the character. It's not about her as a person; the name just gives StableDiffusion stronger guidance than, for example, "a beautiful blonde woman".
No, surely no clickbait. I think the model I used produces very realistic images; the only issue with Deforum is the rather low temporal consistency, which is especially visible in the background scene. I'm trying to fix this by green-screening the character and rendering it separately from the background, with the background render at very low scale and strength, then putting them back together in my video editing software and removing the green screen from the character. I wish there were built-in tools in StableDiffusion for keeping a higher consistency across subsequent frames, but the tools seem to get better with each new version, so I'm pretty confident we're on the right path. Just a few months ago nothing like this would have been possible, and the technology is advancing rapidly.
Well, it could also be done by simply feeding a dancing video into StableDiffusion, instead of creating one in Mixamo and Blender, but this tutorial was also meant to show how to combine different technologies and tools to create something new. Yes, there's still a long way to go, but the StableDiffusion tools are advancing so rapidly that I'm confident it's going to get a lot easier as we move forward.
@-RenderRealm- Point me to a tutorial for that, if there is one, as I'm new to Stable Diffusion. This feels like when the iPhone first came out; there's excitement all over.
There are some good basic tutorials about StableDiffusion on RU-vid that I'd recommend watching as a starting point: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-DHaL56P6f5M.html ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-cVkMnskciHU.html ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-3cvP7yJotUM.html I've also listed some good channels covering various Stable Diffusion topics at the end of my video description. If you have any specific questions, don't hesitate to ask!
@-RenderRealm- Thanks, I appreciate your reply. I'll go and check out the starter videos. I've already learned the basics, but of course it took me forever.
I see where we're going with this, but I have to say that until we have temporal cohesion it's kind of useless. What's the good of a video where the dress and face change 60 times per second?
Right, temporal consistency is still an issue with StableDiffusion, but the tools are improving rapidly, so I guess these issues will be only temporal, too ;-) I'm working on another video where I try to deal with temporal inconsistencies by interpolating the input frames from a video at low strength and scale, hoping it will be more consistent. The Deforum extension doesn't provide this feature yet, but maybe it can be done with the video-input mode and a frame-to-frame interpolation using mathematical functions in the prompts (just like the Deforum interpolation mode works in the background, but with a video input instead of just interpolating a series of prompts). Well, let's see how it works out... the whole technology is still a work in progress, but I believe we'll get there sooner rather than later.
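To sketch what I mean by keyframed values: Deforum-style schedules pair frame numbers with values, e.g. "0:(0.6), 60:(0.3)" to ramp strength down over 60 frames. Below is a toy evaluator with plain linear interpolation between keyframes. Note this is only my illustration of the concept; the real Deforum extension also evaluates math expressions like sin(t) inside the parentheses, which this sketch deliberately skips.

```python
import re

def eval_schedule(schedule: str, frame: int) -> float:
    """Evaluate a Deforum-style keyframe schedule such as "0:(0.6), 60:(0.3)"
    at a given frame, linearly interpolating between keyframes.

    Only numeric values are supported here; the real extension also
    accepts math expressions inside the parentheses.
    """
    pairs = [(int(k), float(v))
             for k, v in re.findall(r"(\d+)\s*:\s*\(([^)]+)\)", schedule)]
    pairs.sort()
    if frame <= pairs[0][0]:
        return pairs[0][1]          # before the first keyframe: hold its value
    for (f0, v0), (f1, v1) in zip(pairs, pairs[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return v0 + t * (v1 - v0)
    return pairs[-1][1]             # after the last keyframe: hold its value

strength_at_30 = eval_schedule("0:(0.6), 60:(0.3)", 30)  # halfway: 0.45
```

Driving the denoising strength per frame like this, instead of using one fixed value, is one way to soften visible jumps between shots.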
@-RenderRealm- Well, I think they're going to solve the temporal issue pretty quickly, and then 3D programs are going to become a thing of the past. Programs like Blender and Maya: we'll literally look at them and say "yeah, that's the way we used to do things...". They'll be nothing but relics of the past.
True :-) I still love my "old" 3D tools, like Blender and Unreal (I've never worked with Maya), and hope they'll integrate the new AI technologies in a meaningful form sometime in the future... but maybe they'll just become relics of the past, as you said. However it turns out, the way we create digital artworks is going to change dramatically. These are fascinating times we're living in!
That would be a great step forward in improving temporal consistency in Stable Diffusion videos. Until then, we need to figure out some creative workarounds. I'm currently trying to use frame interpolation for creating an SD animation... if it turns out to be a viable solution, I might post another tutorial describing the method.