Intuitively Understanding the Cross Entropy Loss 

Adian Liusie
3.1K subscribers · 79K views

This video discusses the Cross Entropy Loss and provides an intuitive interpretation of the loss function through a simple classification setup. The video will draw the connections between the KL divergence and the cross entropy loss, and touch on some practical considerations.
Twitter: / adianliusie
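The setup the description refers to can be made concrete with a minimal sketch (illustrative only, not code from the video): a model produces logits, softmax turns them into a predicted distribution, and the cross entropy loss compares that prediction against the true class distribution.

    import numpy as np

    def softmax(logits):
        # subtract the max for numerical stability before exponentiating
        z = logits - np.max(logits)
        e = np.exp(z)
        return e / e.sum()

    def cross_entropy(p_true, q_pred, eps=1e-12):
        # H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)
        return -np.sum(p_true * np.log(q_pred + eps))

    logits = np.array([2.0, 0.5, -1.0])   # raw model outputs for 3 classes
    q = softmax(logits)                   # predicted distribution, roughly [0.79, 0.18, 0.04]
    p = np.array([1.0, 0.0, 0.0])         # true distribution: one-hot on class 0

    print(cross_entropy(p, q))            # same as -log q[0], about 0.24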

Published: 8 Sep 2024

Comments: 63
@leoxu9673 · 2 years ago
This is the only video that's made the connection between KL Divergence and Cross Entropy Loss intuitive for me. Thank you so much!
@jasonpittman7853 · 1 year ago
This subject has confused me greatly for nearly a year now, your video and the kl-divergence video made it clear as day. You taught it so well I feel like a toddler could understand this subject.
@kvnptl4400 · 11 months ago
This one I would say is a very nice explanation of Cross Entropy Loss.
@nirmalyamisra4317 · 3 years ago
Great video. It is always good to dive into the math to understand why we use what we use. Loved it!
@yingjiawan2514 · 1 month ago
This is so well explained. Thank you so much!!! Now I know how to understand KL divergence, cross entropy, logits, normalization, and softmax.
@ananthakrishnank3208 · 7 months ago
Excellent expositions on KL divergence and Cross Entropy loss within 15 mins! Really intuitive. Thanks for sharing.
@Micha-ku2hu · 2 months ago
What a great and simple explanation of the topic! Great work 👏
@ssshukla26 · 2 years ago
And no one told me that (minimizing KL is almost equivalent to minimizing the CE loss) in 2 years of studying at a university... Oh man... thank you so much...
@DHAiRYA2801 · 1 year ago
KL = Cross Entropy - Entropy.
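A quick numeric check of that identity with made-up distributions (a sketch, using the definitions H(p,q) = -Σ p log q and H(p) = -Σ p log p):

    import numpy as np

    p = np.array([0.7, 0.2, 0.1])   # "true" distribution
    q = np.array([0.5, 0.3, 0.2])   # predicted distribution

    cross_entropy = -np.sum(p * np.log(q))   # H(p, q)
    entropy = -np.sum(p * np.log(p))         # H(p)
    kl = np.sum(p * np.log(p / q))           # D_KL(p || q)

    print(np.isclose(kl, cross_entropy - entropy))   # True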
@alirezamogharabi8733 · 1 year ago
The best explanation I have ever seen about Cross Entropy Loss. Thank you so much 💖
@bo3053 · 1 year ago
Super useful and insightful video which easily connects KL-divergence and Cross Entropy Loss. Brilliant! Thank you!
@shubhamomprakashpatil1939 · 2 years ago
This is an amazing explanatory video on Cross-Entropy loss. Thank you
@viktorhansen3331 · 2 years ago
I have no background in ML, and this plus your other video completely explained everything I needed to know. Thanks!
@allanchan339 · 2 years ago
It is an excellent explanation, making use of the previous KL divergence video in this one.
@chunheichau7947 · 1 month ago
I wish more professors could hit all the insights that you mentioned in the video.
@hasankaynak2253 · 2 years ago
The clearest explanation. Thank you.
@francoruggeri5850 · 1 year ago
Great and clear explanation!
@hansenmarc · 2 years ago
Great explanation! I’m enjoying all of your “intuitively understanding” videos.
@matiassandacz9145 · 3 years ago
This video was amazing. Very clear! Please post more on ML / Probability topics. :D Cheers from Argentina.
@whowto6136 · 2 years ago
Thanks a lot! Really helps me understand Cross Entropy, Softmax and the relation between them.
@yfd487 · 1 year ago
I love this video!! So clear and informative!
@TheVDicer · 2 years ago
Fantastic video and explanation. I just learned about the KL divergence and the cross entropy loss finally makes sense to me.
@HaykTarkhanyan · 2 months ago
Great video, thank you!
@LiHongxuan-ee7qs · 5 months ago
Such a clear explanation! Thanks!
@mixuaquela123 · 1 year ago
Might be a stupid question but where do we get the "true" class distribution?
@patrickadu-amankwah1660 · 1 year ago
Real world data bro, from annotated samples.
@user-gk3ue1he4d · 1 year ago
Humans are the criterion for everything in so-called AI.
@AnonymousIguana · 4 months ago
In the classification task, the true distribution has a value of 1 for the correct class and 0 for the other classes. So that's it, that's the true distribution, and we know it if the data is labelled correctly. The distribution in a classification task is called a probability mass function, btw.
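A small illustrative sketch of that point (my own numbers): with a one-hot target, the cross entropy sum collapses to a single term, the negative log probability the model assigns to the correct class.

    import numpy as np

    q = np.array([0.1, 0.7, 0.2])        # model's predicted probabilities
    p = np.array([0.0, 1.0, 0.0])        # one-hot true distribution: class 1 is correct

    full_sum = -np.sum(p * np.log(q))    # -sum_i p_i * log(q_i)
    single_term = -np.log(q[1])          # only the correct-class term survives

    print(full_sum, single_term)         # both are about 0.357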
@yassine20909 · 1 year ago
Nice explanation, thank you.
@blakeedwards3582 · 2 years ago
Thank you. You should have more subscribers.
@yegounkim1840 · 1 year ago
You the best!
@kevon217 · 2 years ago
Simple and helpful!
@jiwoni523 · 6 months ago
Make more videos please, you are awesome.
@lebronjames193 · 2 years ago
Really superb video, you should record more!
@dirtyharry7280 · 1 year ago
This is so good, thx so much
@mikejason3822 · 2 years ago
Great video!
@shchen16 · 1 year ago
Thanks for this video
@omkarghadge8432 · 3 years ago
Great! Keep it up.
@Darkev77 · 3 years ago
Brilliant and simple! Could you make a video about soft/smooth labels instead of hard ones and how that makes it better (math behind it)?
@SA-by2xg · 1 year ago
Intuitively, information is lost whenever a continuous variable is discretized. Said another way, a class probability of 0.51 and one of 0.99 carry very different information. Downstream, soft targets allow for more precise gradient updates.
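A sketch of that contrast (hypothetical numbers, and using the standard result that for softmax followed by cross entropy the gradient with respect to the logits is q - p): with a hard label the gradient only says to push the correct class up and everything else down, while a soft target also encodes how much probability mass the other classes should keep.

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    logits = np.array([1.5, 0.3, -0.8])
    q = softmax(logits)                   # model prediction

    hard = np.array([1.0, 0.0, 0.0])      # one-hot label
    soft = np.array([0.85, 0.10, 0.05])   # hypothetical soft label

    # gradient of the softmax + cross entropy loss w.r.t. the logits is (q - p)
    print(q - hard)   # push class 0 up, push everything else down
    print(q - soft)   # also reflects how the remaining mass should be spread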
@sushilkhadka8069 · 1 year ago
This is so neat.
@MrPejotah · 1 year ago
Great video, but it's only really clear if you know what the KL divergence is. I'd hammer that point home for the viewer.
@user-bi2jm1cn1h · 1 month ago
How does the use of soft label distributions, instead of one-hot encoding hard labels, impact the choice of loss function in training models? Specifically, can cross-entropy loss still be effectively utilized, or should Kullback-Leibler (KL) divergence be preferred?
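One way to think about it, following the identity mentioned above (cross entropy = entropy of the target + KL divergence): the two losses differ only by the entropy of the target distribution, which is fixed with respect to the model parameters, so minimizing either gives the same optimum and the same gradients even for soft labels. A hedged numeric sketch with made-up numbers:

    import numpy as np

    p = np.array([0.6, 0.3, 0.1])   # soft label; fixed, does not depend on the model
    q = np.array([0.4, 0.4, 0.2])   # model prediction

    ce = -np.sum(p * np.log(q))     # cross entropy H(p, q)
    kl = np.sum(p * np.log(p / q))  # D_KL(p || q)
    hp = -np.sum(p * np.log(p))     # entropy of the target, a constant offset

    print(np.isclose(ce, kl + hp))  # True: the two losses differ by a constant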
@shahulrahman2516 · 2 years ago
Thank you
@starriet · 2 years ago
essence, short, great.
@kutilkol · 2 years ago
Superb!
@thinkbigwithai · 11 months ago
At 3:25, why don't we model it as argmax Σ P* log P (without the minus sign)?
@vandana2410 · 2 years ago
Thanks for the great video. One question though: what happens if we swap the true and predicted probabilities in the formula?
@sukursukur3617 · 2 years ago
Why don't we just use the mean of (p-q)^2 instead of p*log(p/q) to measure the dissimilarity of pdfs?
@quantumjun · 2 years ago
Will the quantity at 4:12 be negative if you use information entropy or KL divergence? Are they both > 0?
@yassine20909 · 1 year ago
As explained in the video, the KL divergence is a measure of "distance", so it is always ≥ 0. There are other prerequisites for a function to be a true distance metric, like symmetry, and a couple of other things I forget about.
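A quick numeric check of those properties (illustrative numbers only): the KL divergence is non-negative, but it is not symmetric, which is why it is only distance-like rather than a true metric.

    import numpy as np

    def kl(p, q):
        # D_KL(p || q) = sum_i p_i * log(p_i / q_i)
        return np.sum(p * np.log(p / q))

    p = np.array([0.8, 0.1, 0.1])
    q = np.array([0.4, 0.4, 0.2])

    print(kl(p, q), kl(q, p))   # both are >= 0, but they are not equal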
@madarahuchiha1133 · 4 months ago
What is the true class distribution?
@elenagolovach384 · 3 months ago
The frequency of occurrence of a particular class depends on the characteristics of the objects.
@genkidama7385 · 3 months ago
distribution
@tanvirtanvir6435 · 1 year ago
0:08, 3:30: P* is the true probability.
@pradiptahafid · 2 years ago
3:24. The aha moment when you realize what the purpose of the negative sign in cross entropy is.
@pradiptahafid · 2 years ago
4:24. Do you know how golden that statement is?
@zhaobryan4441 · 2 years ago
Hello handsome, could you share the clear slides?
@zingg7203 · 2 years ago
The volume is low.
@ajitzote6103 · 6 months ago
Not really a great explanation; too many terms were thrown in. That's not a good way to explain something.
@commonsense126 · 1 year ago
Speak slower, please.
@Oliver-2103 · 10 months ago
Your name is commonsense and you still don't use your common sense, lol. In every YouTube app there is an option to slow a video down to 75%, 50%, or even 25% speed. If you have trouble understanding his language, you should just select the 0.75x speed option.
@commonsense126 · 10 months ago
@Oliver-2103 Visually impaired people have problems seeing some of the adjustments one can make on a phone, even when they know they exist.
Up next
Intuitively Understanding the KL Divergence · 5:13 · 83K views
Entropy (for data science) Clearly Explained!!! · 16:35 · 602K views
Categorical Cross - Entropy Loss Softmax · 8:15 · 16K views
Cross Entropy Loss Error Function - ML for beginners! · 11:15
Why do we need Cross Entropy Loss? (Visualized) · 8:13
Why Does Diffusion Work Better than Auto-Regression? · 20:18
Intuitively Understanding the Shannon Entropy · 8:03 · 94K views
The KL Divergence : Data Science Basics · 18:14 · 45K views
Logarithms: why do they even exist? · 12:47 · 91K views