ControlNet paper explained - Adding Conditional Control to Text-to-Image Diffusion Models

AI Bites

Published: 4 Aug 2024

Comments: 6

@abcd45058 4 months ago

Great work. Interesting paper read indeed. At 7:27, Bayes' theorem is stated incorrectly. The correct form is P(X|Y) = P(Y|X) · P(X) / P(Y). The rest of the math that follows is fine.
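
For reference, the corrected identity in the comment above follows directly from the definition of conditional probability. A short derivation, assuming P(X) > 0 and P(Y) > 0 and using X and Y as generic events (the video's own symbols are not reproduced here):

```latex
\begin{aligned}
P(X \mid Y)\,P(Y) &= P(X, Y) = P(Y \mid X)\,P(X) \\
\Rightarrow\quad P(X \mid Y) &= \frac{P(Y \mid X)\,P(X)}{P(Y)}
\end{aligned}
```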

@AIBites 4 months ago

Well spotted, thank you. I think I noticed it after the video was published. I left it as is because YouTube doesn't allow replacing a video with a newer version. I think I should start writing errata in the comments :)

@frazuppi4897 7 months ago

Great video, but it is not clear how one trains it. One needs pairs of ControlNet input and image output, right?

@AIBites 5 months ago

Yes, we need depth or pose datasets. Several such datasets already exist in computer vision. The problem is that they are tiny compared to the scale at which LLMs or LVMs are trained. ControlNet is the solution: we simply add a few trainable layers, and then we can train with these "small" datasets. As a result, we can control the spatial layout of the generated image during inference. Hope that clarifies :)
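
The "add a few trainable layers" idea in the reply above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the names `Linear`, `ControlledBlock`, `zero_in` and `zero_out` are hypothetical, and a tiny dense layer on plain Python lists stands in for the convolutional blocks of a real diffusion model. It assumes the ControlNet design of a frozen pretrained block, a trainable copy of it, and zero-initialized connecting layers.

```python
import copy


class Linear:
    """Tiny dense layer on plain lists (hypothetical stand-in for a conv block)."""

    def __init__(self, weight):
        self.weight = [list(row) for row in weight]
        self.bias = [0.0] * len(weight)

    @classmethod
    def zeros(cls, dim):
        # Zero-initialized layer: outputs all zeros until training updates it.
        return cls([[0.0] * dim for _ in range(dim)])

    def __call__(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(self.weight, self.bias)]


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [ai + bi for ai, bi in zip(a, b)]


class ControlledBlock:
    """Frozen block plus a trainable copy wired in through zero-initialized layers."""

    def __init__(self, frozen_block, dim):
        self.frozen = frozen_block                          # pretrained, never updated
        self.trainable_copy = copy.deepcopy(frozen_block)   # starts from the same weights
        self.zero_in = Linear.zeros(dim)                    # injects the condition
        self.zero_out = Linear.zeros(dim)                   # feeds the control signal back

    def trainable_layers(self):
        # Only these would receive gradient updates on the small paired dataset.
        return [self.trainable_copy, self.zero_in, self.zero_out]

    def __call__(self, x, condition):
        base = self.frozen(x)
        control = self.zero_out(self.trainable_copy(add(x, self.zero_in(condition))))
        return add(base, control)


frozen = Linear([[1.0, 2.0], [0.5, -1.0]])   # made-up "pretrained" weights
block = ControlledBlock(frozen, dim=2)
x = [1.0, 2.0]             # made-up input features
depth_hint = [0.3, 0.7]    # made-up conditioning signal (e.g. a depth feature)

print(block(x, depth_hint) == frozen(x))  # the added branch contributes nothing at first
```

Because the connecting layers start at zero, the combined block initially behaves exactly like the frozen model, so training on a small depth or pose dataset begins from the pretrained behavior rather than from noise. The condition only starts to influence the output once the trainable layers have been updated.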

@frazuppi4897 5 months ago

@AIBites Yeah, but I guess ControlNet is around 50M.

@AIBites 4 months ago

That's the upper bound, I guess. Not sure what the lower bound for training is.