Anyone here from MSA's playlist? Edit: I left this comment almost 2 years ago and I still get likes from it. Crazy how MSA still hasn't found out this was in their playlist.
Bro, this might be the best video available on the entire internet for explaining transformers. I have studied, worked on, and implemented transformers, but never have I been able to grasp them as simply and intuitively as you made it. You should really make more videos about anything you do or like: explain more algorithms and papers, implement them from scratch, and so on. Big thanks, man.
This goes in my favorites list to recommend to others. You have the gift of teaching at a level rarely seen, distilling key concepts and patiently explaining every step, even from multiple angles. This teaches not only the subject, but how to think in the domain of the subject. Please use this gift as much as you can :). Respect!
I am still waiting for part II. I haven't yet found an explanation better than this. The way you built the intuition on query, key, and value, which are the heart and soul of the self-attention mechanism, is impeccable.
I've been trying to wrap my head around this stuff, and between this video and ChatGPT itself explaining things and answering my questions, I think I'm starting to get it. I don't think I'll ever be able to calculate the derivatives of a loss function for gradient descent myself, though.
@tvcomputer1321 Usually, you don't have to worry about calculating derivatives yourself (not saying anyone shouldn't learn them), because tools such as PyTorch and TensorFlow have autograd, which does all of that for you.
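In case it helps, here's a minimal sketch of what autograd does for you, in PyTorch. The tiny linear model and the numbers are made up purely to illustrate; they're not from the video.

```python
import torch

# A tiny linear model y = w * x + b with one made-up data point.
x = torch.tensor(2.0)
y_true = torch.tensor(7.0)
w = torch.tensor(1.5, requires_grad=True)  # autograd tracks these parameters
b = torch.tensor(0.5, requires_grad=True)

y_pred = w * x + b             # forward pass: y_pred = 3.5
loss = (y_pred - y_true) ** 2  # squared-error loss: 12.25

loss.backward()  # autograd computes d(loss)/dw and d(loss)/db for you
print(w.grad)    # tensor(-14.) == 2 * (y_pred - y_true) * x
print(b.grad)    # tensor(-7.)  == 2 * (y_pred - y_true)

# A gradient-descent step would then just be: w_new = w - lr * w.grad
```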
In a landscape full of 5-to-15-minute videos where some weird dude stutters and stammers through technical terms, failing both to hide his own lack of understanding of a machine learning topic and to "summarize" a vastly complex subject in a ridiculously short amount of time, you managed to explain the topic amazingly cleanly. That's how it should be done. Keep it up! Great work!
The BEST source of information I've come across on the internet about the intuition behind the Q, K, and V stuff. PLEASE do part 2! You are an amazing teacher!
Can you please do a part 2? I usually don't comment on YouTube videos, but the way you explained the intuition in the first part was the best I've seen. Thank you so much; you gave me a lot of intuition!!
This was a fantastic video. I really hope you do the whole series on the "Attention is all you need" paper. It would be fantastic to cover the other parts of the architecture, as you said.
I've seen many videos on transformers that parrot the steps in the original paper with no explanation of why those steps matter. In contrast, this video actually gives an excellent intuitive explanation! Hands down, the best video explaining self-attention that I have seen...by a long shot.
This video is a masterpiece. I really loved it; it explains everything in a very effective and simplified way. The complexity hidden behind the architecture is peeled back layer by layer. Hats off!
Having just read the "Attention Is All You Need" paper with the intention of tackling a work problem with BERT and some specialized classification layers, I found your explanation here totally illuminates the self-attention mechanism component. Thanks a million times.
What an explanation! I read through other articles but couldn't figure out why they were doing what they were doing. But you nailed it, with an explanation for everything from the dot product to the weights, and most importantly the meaning of the query and key values. Thanks a ton!
I came here after Andrej Karpathy's video on building GPT from scratch. I have looked at many other videos, but this one explains the self-attention mechanism best. Amazing work.
THANKS! Out of so many videos on the attention mechanism, this is by far the best and most intuitive one; it explains very well how the score is calculated. THANKS!
By far the best and most intuitive introduction to the concept of Self-Attention I've ever found anywhere! Really looking forward to watching more of your amazing videos.
Dude, you need to make more videos. You have a gift. If you do a full series on some key deep learning concepts and things take off, you could have a very lucrative channel that does a lot of social good.
I have watched many videos explaining deep learning concepts. This one is without a doubt one of the best. Keep up the great work! You have just earned another subscriber.
Beautiful! Congrats to Ark; this video is wonderful. I've read many papers and seen different videos, but this one is a cut above the rest in explaining each component AND the intuition about WHY we use it, which is the part often skipped by other videos that just cover structure and formulas while missing the big-picture simplicity of each component's purpose. Please keep up this good work!
Wow, this is one of the most intuitive explanations of transformers I have found. Please make the second part as well; eagerly awaiting it. Thanks a lot for this. :)
I'd just like to add my voice to the chorus of praise for your teaching ability. Thank you for offering this intuition for the scaled dot product attention architecture. It is very helpful. I hope that you'll have the time and inclination to continue providing intuition for other aspects of LLM architectures. All the best to you and yours!
Thanks, bro, it's the BEST explanation of attention I've seen so far (and I have to say I've seen many others). Looking forward to the other parts, even though it's been almost 4 years since this Part 1!
I wish I could give you five thumbs up for this video. The diagrams, along with the commentary, provided the representations needed to comprehend the different aspects and steps behind multi-headed attention without delving too deep into the weeds. This is the best video I've ever watched in terms of explaining a complex technical topic; it's like a gold standard for technical education videos in my book. Thank you.
Finally, a video that talks about the learning part of transformers, plugging a big hole left by all the other videos. Great, I am finally able to understand this. Thank you.
Probably the best explanation of attention I have found here. Thank you so much. Implementing and coding these will still be a task, but at least I now have enough knowledge to know exactly what is happening under the hood of transformers.
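For anyone in the same spot on the implementation side, here's a minimal sketch of scaled dot-product self-attention in PyTorch. The function and variable names are my own illustration, not from the video or the paper, and real implementations add batching, multiple heads, and masking on top of this.

```python
import math
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Minimal scaled dot-product self-attention for one sequence.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q  # queries: what each token is looking for
    k = x @ w_k  # keys: what each token offers to be matched against
    v = x @ w_v  # values: the content that actually gets mixed
    scores = q @ k.T / math.sqrt(k.shape[-1])  # scaled dot products
    weights = F.softmax(scores, dim=-1)        # attention weights per token
    return weights @ v                         # weighted sum of values

# Toy usage with random embeddings for a 4-token sequence.
d_model, d_k = 8, 8
x = torch.randn(4, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([4, 8])
```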
This was absolutely one of the better explanations I've come across, and I feel like I've watched a hundred different videos explaining attention. Thanks so much for putting in the time to make it. I look forward to the next one if you can get around to it!
This is amazing. A very nicely explained self-attention mechanism. It seems that you are gifted with amazing teaching qualities. Thanks for sharing the information.
One of the best explanations of the transformer architecture I have ever seen. I hope you make some videos about the different variations of the architecture that have popped up since the original paper, and about some of the details you left out of this video (e.g. masking).
Best explanation on the self attention mechanism on the internet. Please explain the other concepts in the paper if possible. Thanks for the intuition!
Really, really good explanation, with all the details needed to understand everything. I really appreciated going into visual detail with the weights given by the softmax. Nice work; can't wait to see the other videos related to this one.
This is a very good explanation of the attention mechanism. The high-level intuition presented is superb. Your use of just enough detail allows someone to grasp the key ideas without getting lost in unnecessary complications. Thank you for this.
This was so detailed and clear! So many videos just gloss over the various blocks, but this made it much less of a mystery. We need a part 2!! The reason I'm personally intrigued by this is that I work with FPGAs, designing computer architecture, and I'm looking into how we can better build processors to handle these computations. Understanding all this at the lowest level is key.
This is probably the best explanation of the general architecture of the self-attention model I've seen. I hope you can get around to a video on positional embedding and masking to complete the transformer network!
Amazing! I have no background in deep learning, and yet I could understand every step of your explanation. Now I am going to build a transformer from scratch just for the fun of it, and because you motivated me beyond words. Keep making more of these videos. Could you please make videos on vision transformers that do zero-shot classification, detection, and segmentation? For example, DINOv2 by Meta, which was launched recently.