Attention is all you need (Transformer) - Model explanation (including math), Inference and Training 

Umar Jamil
Subscribers: 30K
Views: 317K

A complete explanation of all the layers of a Transformer Model: Multi-Head Self-Attention, Positional Encoding, including all the matrix multiplications and a complete description of the training and inference process.
Paper: Attention is all you need - arxiv.org/abs/1706.03762
Slides PDF: github.com/hkproj/transformer...
Chapters
00:00 - Intro
01:10 - RNN and their problems
08:04 - Transformer Model
09:02 - Maths background and notations
12:20 - Encoder (overview)
12:31 - Input Embeddings
15:04 - Positional Encoding
20:08 - Single Head Self-Attention
28:30 - Multi-Head Attention
35:39 - Query, Key, Value
37:55 - Layer Normalization
40:13 - Decoder (overview)
42:24 - Masked Multi-Head Attention
44:59 - Training
52:09 - Inference

Category: Science
Published: 9 Jun 2024

Comments: 568
@umarjamilai, 1 year ago
Slides' PDF: github.com/hkproj/transformer-from-scratch-notes
@bhaskartripathi, 5 months ago
I am not able to download the pdf file. My friends also tried. Will it be possible to put it on a downloadable link please? your content is too good and needs to be read again and again.
@mahek6110, 4 months ago
@bhaskartripathi It's getting downloaded.
@snehotoshbanerjee1938, 6 days ago
Umar, you are a great teacher. I have not seen such a great explanation of transformer. Your transformer from scratch coding is also awesome. So, basically you understand which part needs more explanation. Thanks for your effort.
@gabrielnsionu8583, 6 months ago
This is arguably the best explanation of multi-head attention on the internet, hands down. Very thorough, and most important to folks like me using the attention mechanism as the underpinning mechanism in developing my novel neural architecture to be applied to my deep reinforcement learning architecture. Sir, pls never stop making this type of video.
@umarjamilai, 6 months ago
You're welcome! 🤓
@csikel22, 6 months ago
I couldn't agree more. Best video on transformers I have seen so far. It doesn't get clearer than this. It would be very interesting to give some insight into why this whole thing works and what other variations and alternative architectures there are.
@rkbshiva, 6 months ago
@umarjamilai bro, you're a legend!!!!
@pablofe123, 2 months ago
There are still a couple of things that are not explained well in the video. Are the Q, K and V matrices the same matrix? And where do the parameter matrices Wq, Wk and Wv come from? Besides that, excellent video.
@peregudovoleg, 1 month ago
@pablofe123 At 21:25: "QKV are the same matrices". As for the W matrices, he only says that they are "parameter matrices", and parameters are something we train during the training process.
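To make the exchange above concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It illustrates the two points from the thread: the same input matrix is fed in as query, key and value, and Wq, Wk, Wv are separate parameter matrices. The helper names (`matmul`, `softmax`, `self_attention`, `random_matrix`), the sizes, and the random weights are assumptions for illustration only; in a real model the weights would be learned during training, and this is not the code from the video.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention; x has shape (seq_len, d_model).

    The same x is used three times: projected by w_q, w_k and w_v.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Random stand-in for trained parameters (assumption: not learned here)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
seq_len, d_model = 3, 4  # hypothetical toy sizes
x = random_matrix(seq_len, d_model, rng)    # stands in for embeddings + positions
w_q = random_matrix(d_model, d_model, rng)
w_k = random_matrix(d_model, d_model, rng)
w_v = random_matrix(d_model, d_model, rng)

output, weights = self_attention(x, w_q, w_k, w_v)
print(len(output), len(output[0]))  # sequence length and model dimension
```

Each row of `weights` says how much one token attends to every token in the sequence, and `output` is the corresponding weighted mix of the value rows.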
@hackie321, 27 days ago
The best Transformer explanation on the internet till now, and I have seen almost all of them. Kudos! You are a true teacher. I dare to compare you with Andrew Ng. Please become a professor and not a corporate slave.
@DembaDiop-om3gv, 5 months ago
The best explanation of "Attention is all you need" from my point of view, guys "This explanation is all you need". Thank you very much
@kerrykilian9127, 1 month ago
best explanation of the paper on the whole internet
@sushantpenshanwar8038, 7 months ago
You did the best job of describing the complicated details in a fluid manner. Sat, watched and took notes in one sitting. Hands down best one so far.
@JulianHarris, 25 days ago
I'm so glad I found this again. Do NOT rely on YouTube watch history, it doesn't look at all your history. This is definitely the best explanation of transformers and attention, and believe me I've watched quite a few! Kudos again Umar.
@umarjamilai, 25 days ago
You should subscribe to the channel to never lose it 😇 thanks for the kind words.
@utkarshashinde9167, 2 months ago
I cannot tell you how grateful I am for this explanation provided by you .............. nowhere I find this detailed and easy-to-understand description, a go-to video for every interview preparing students
@_seeker423, 3 months ago
The clearest explanation of a very important breakthrough paper that I have seen on YouTube. Thank you!
@_seeker423, 3 months ago
One thing that I felt was missing is the logical explanation of what is the role of value vector (V).
@ajithshenoy5566, 7 months ago
Bless you Umar One of the finest tutorials out there. Please don't ever stop. We're willing to support you in every way possible.
@NJCLM, 4 months ago
This video is surely among the top 3 of the 50 videos that I watched to understand this subject. We are very grateful to you, keep the energy, YouTube numbers will follow!
@marsupilami125, 3 months ago
Can you tell me the other 2?🙏
@keithchua1723, 3 months ago
Spent days trying to understand this and I wished I had come across this video first because now I understand everything fully. Immediately subscribed, keep it up!!
@silasnginyo7744, 6 months ago
So far the best laid out presentation of Transformers I have ever walked through
@vrvlbl, 4 months ago
Amazing explanation. I struggled too long to understand the architecture until I landed on your video. Way to go!!
@abhilashbalachandran7160, 8 months ago
super useful. I really loved how you explain this with linear algebra. Very insightful. actually easier to understand than a lot of lectures at universities
@hamzaomari7052, 2 months ago
This is the best explanation. It took me 4 hours to take notes and revise stuff, going with you word by word, with intuitions, and now I feel that I truly understand the transformer architecture and the mathematical intuition behind every detail. A thing that you cannot find in any other video. Thank you so much sir, this is very instructive and helpful.
@abc-by1kb, 10 months ago
Such a great video! Explained all the key concepts so clearly and precisely while giving very nice intuition!
@keviny2, 18 days ago
Thanks Umar for the amazing video. This is the most comprehensive yet understandable walkthrough of the transformer architecture that I came across. Super helpful. I feel like I have a good foundation for tackling more complex LLMs because of it.
@vitoroliveiradesouza4214, 24 days ago
I'm really glad to have found your video! Congratulations on the clean and yet detailed explanation
@tariqkhan1518, 1 month ago
TBH The best Explanation of Attention in whole Internet.
@ankitkacchap, 1 month ago
Awesome explanation, our professor also doesn't explain like you did. Thanks to the YouTube recommendation, and special thanks to you.
@SagarVibhute, 6 months ago
Kudos on the commendable work, and simplified explanation! I appreciate that you are also trying to explain the intuition behind each step and not just math. I'll view and re-view this a few times to understand more with successive passes. Thank you!
@Patrick-wn6uj, 2 months ago
This is the most important channel I have come across on YouTube. Keep creating these long-form videos, you are saving our lives in a huge way.
@KunalTiwariBCI, 1 month ago
Bro, legit the best explanation I have ever seen so far.
@brunogatti383, 2 months ago
Best video for attention mechanism hands down
@anirudhjoshi1607, 6 months ago
This is the clearest explanation on this paper I have ever heard. Always had doubts about Multi-Head attention and now finally I can visualise this 100%. Thanks a lot Umar Jamil.
@megatroneata9911, 4 months ago
After watching this video and the stable diffusion video, I can say for sure that you are an amazing teacher. Extremely digestible content and easy to follow along.
@sanskargupta7085, 13 days ago
I feel lucky enough to have come across this channel, amazing stuff!
@ishaanjoshi6959, 5 months ago
The best explanation of the attention-based mechanism I found online, thank you so much Umar for making this video.
@cristinaballesteros93, 3 months ago
I have watched a lot of videos about transformers, and this is by far the best one. I finally understand how they work. Thank you so much!
@user-pz5nn2kg2j, 5 months ago
The best video explaining the Transformer so clearly I have ever seen. Thanks very much for your efforts. I really appreciate your method of explaining every step with concrete examples and explicitly giving the shapes of every matrix involved. The shapes of the matrices in each step are the most confusing part for me in understanding Transformer models, and you make it so clear for me. Thanks a lot Umar.
@umarjamilai, 5 months ago
You're welcome! You can get in touch on LinkedIn.
@jdbrinton, 6 months ago
the clearest description I've found to-date. bravo!
@mculabs, 5 months ago
Probably the best explanation of the paper and the encoder and decoder sub layers. Kudos!!
@channel8048, 11 months ago
This is very clear! Better than anything I have read up till now. Grazie!
@Nereus22, 5 months ago
This is really a great video, exactly what I was searching for! Everything that you mentioned was explained in detail (others are skipping a lot).
@andreicristea997, 8 months ago
Finally the fancy "black box" called transformer became more understandable for me. Really interested in the other content you are making. Thanks for the explanation.
@AIVidya, 6 months ago
One of the best transformers videos encountered so far.
@AbhinavSharma-dc3kv, 1 month ago
the best explanation for attention architecture. kudos to you sir!
@NazerkeSafina, 2 months ago
This is brilliant. Thank you Umar for your hard work. Please keep new videos coming. You are helping immensely. May you live long and happy and healthy
@haoming3430, 2 months ago
Your video is very helpful and easy to follow. I have to say this is the best tutorial about transformer I've seen.
@tgyawali, 6 months ago
Thank you, so much for putting together such a detailed video. This helps technical people who do not have a lot of experience in research but have some background in machine learning to understand this very important and historic paper in AI.
@bsuhaib, 9 months ago
This is called decoding a transformer. What I really liked was explaining each chunk. That was really helpful for this topic and surely taught me the approach to decode any problem. Jazaakallah ul Khair
@zeeshanmehdi3994, 3 months ago
can't thank you enough, this is the best explanation of transformers i could find after trying for days to understand it. Thank you ❤
@atrijpaul4009, 5 months ago
Best explanation of Attention throughout YouTube!!!!! Thank you sir for making this video and helping us.
@dalilabdouraman3557, 5 months ago
Definitely the best explanation of multi-head attention with the transformer... just awesome
@sergewilsonmendy9051, 11 months ago
Thank you man, this is the best transformer video I've seen. Well explained and very detailed.
@calewang3713, 8 months ago
Oh Man, you deserve a Turing Award.....
@yuk-hoiyiu7023, 4 months ago
The only video that explains the difference between training and inference in the Transformer model!
@profyao, 2 months ago
Absolutely the best explanation for multi-head attention so far!
@BritskNguyen, 3 months ago
this is the best lecture on transformer one can get, period.
@Udayanverma, 7 months ago
I understand much more deeply with your explanation. The rest of the world is scaring people with diagrams and tables without explaining the practical implementation. Thank you dear!
@70152136, 5 months ago
Your presentation skills are simply amazing!!! Best video on transformers I've seen so far
@shuchenwu170, 3 months ago
This tutorial translates complex and terse structures into intuitions. A masterpiece of tutorials!
@albert4392, 10 months ago
I really appreciate your talent for presenting knowledge. Nice explanation, thank you so much!
@saravanannatarajan6515, 3 months ago
One of the best videos I have seen on this topic. Thanks a lot for making it easy for us. Great effort, hats off!
@ameyadesai6382, 7 months ago
The best explanation on this paper, can't wait to see the other videos on this topic.
@gauravmalik3911, 4 months ago
Detailed explanation, did great work on explaining difficult topic by dividing in chunks, I don't think any part is missed in explanation. Best Explanation
@tipu461, 10 months ago
I really appreciate your efforts to make it understandable for us 👍. Thanks a lot.
@ddstar, 4 months ago
Excellent. You answered a lot of questions I had about where the weights come from and how they were updated
@skc909887u, 8 months ago
This is the best explanation for an engineer for sure. Love this
@ltbd78, 2 months ago
You are incredible. Please continue making these type of tutorials.
@lethnis9307, 1 month ago
Finally, after a lot of articles and videos, I found a video I could understand. Thank you, sir. I am not strong in math but I think I understood a lot with this explanation
@AvinashKumar-pb2op, 1 month ago
Best Explanation Ever Existed in the whole Universe !!
@saima6759, 3 months ago
transformer model never got so clear to me! thank you Umar!
@prethasur7376, 2 days ago
life saver 😭 thank you so very much lots of love and gratitude 💙💙
@juwanyirenda3457, 6 months ago
Excellent exposition! Thank you Umar for the great work.
@JohnSmith-he5xg, 7 months ago
The best overview I've seen. Great job!
@rajkrishnamurthy8474, 8 months ago
Love it Umar. This is the best explanation of the paper. Thank you very much.
@aeigreen, 9 months ago
Great explanation. Thank you for demystifying the transformer. I have come to your explanation after watching countless videos on transformers, and your explanation is simply the best.
@madhuvamsi7055, 7 months ago
You've definitely earned a lifelong subscriber bro! Great video.
@subinaypanda9936, 21 days ago
Your explanation just hits my mind. You explained all the points, where I was facing problems to understand. It's like you can read my mind from past huh 😜. Yes subscribed.
@1tahirrauf, 9 months ago
Umar! You nailed it. Please make more videos. It was truly helpful. Thank you.
@debjyotimukherjee8275, 2 months ago
Excellent video gave a complete description with a great explanation. Looking forward to more such amazing content!
@nadyaabdel5559, 4 months ago
Amazing explanation. First time every bit is super clear. Thank you.
@danielvillalba4457, 5 months ago
Lots of new insights about transformers technology, every document and video provides more details, great video sir!
@oleksandrasaskia, 3 months ago
Thank you SO MUCH for your humane, empathic explanation! This means a lot! Keep it up!
@rkjellbe, 7 months ago
Thank you, Umar. This was very helpful and I feel I have a much better understanding of the process now. Great work!
@aurelagbodoyetin3321, 6 months ago
This is a masterclass. Thank you for your work
@atulsain6170, 9 months ago
Wow.. Thank you so much.❤ I was banging my head against different papers, books, and videos for the last two days. It's the best explanation I could find.
@umarjamilai, 9 months ago
Thanks! You should watch my other video on how to code the Transformer from scratch, that will also give you practical experience.
@koko-wf8vz, 7 months ago
Thank you so much for this video, hands down the best in-depth video I have seen. I love the graphical explanations, they help a math noob visualize matrices :) much love
@richeek10, 3 months ago
Such a nice explanation with a soothing voice. Thanks so much!
@MichaelJentsch, 6 months ago
Hi, I wanted to express my thanks for your fantastic video. Your clarity and expertise made a complex topic incredibly accessible. Your video has been a meaningful change for me.
@umarjamilai, 6 months ago
Thank you for your kind words, Michael! Have a nice day
@priyanjaligoel4294, 4 months ago
omg! I love it. Finally so many answers to my questions. I had a very abstract version of the process in my head before but now it's much clearer. Thank you so much!
@orevjoker8332, 13 days ago
I hardly ever comment on youtube videos, but wow this was a very well done video!
@arrozenescau1539, 5 months ago
This is by far the BEST explanation of Transformers I have ever seen, amazing video.
@TheFitsome, 6 months ago
I've seen a TON of videos and articles on transformers, enough to say "This is Number 1"
@lyte69, 7 months ago
Thank you for your great explanation and effort, this was very informative and honestly there are no problems with the video, it's only a preference for me if there was some code alongside each part explained so it's even better understood, but I want you to know that this was a huge help thank you again. ❤
@sudzam, 2 months ago
What a wonderful video with clear explanation! Thanks for making this and sharing with the community.
@vincetran6321, 9 months ago
Best explanation of transformer ive come across! Thanks so much :)
@user-xk7dy4nb7w, 6 months ago
Thank you for the excellent video. It is very illustrative, and you explained each concept very well.
@sujeethav9885, 1 month ago
This is just perfect! A wholesome video on Transformers!
@parametaorto, 8 months ago
Hi there! I watched it from start to end and written down all infos in my notebook, it was soooo interesting! Thank you for the explanation!! It was very clear and helpful!
@somdubey5436, 4 months ago
you have put such a hard work to explain it so clearly.....hats off to you :)
@ciliamadani3046, 3 months ago
The best explanation I have ever watched, thank you
@Zineb-ru8bp, 6 months ago
I was struggling trying to understand Transformers but you make it easy for me. Thank you so much
@abdulmajid8731, 5 months ago
It would be harsh if this were not rated at the top. Absolutely the best explanation so far around the 'world'. Thanks Umar for your efforts. Keep up the good work.
@brothachris, 11 months ago
Excellent tutorial! Please keep up the great work.
@srikanthvoleti5942, 4 months ago
Superb video, the best explanation, I have been trying to understand transformers for a long time and this definitely helped me a lot
@deepaksingh9318, 9 months ago
Well, I can't describe in words how much I loved it. I had been searching for a video like this for more than a year, one that could give me the whole picture under the hood, like how each sentence gets transformed to make it work. I just loved how he explained everything in such a good manner that even a non-technical person can understand it. And if asked, someone could draw the whole architecture and calculate everything by hand after watching it. The way you teach concepts is just amazing, with such a good slide PDF (which is also attached for notes). So I am gonna subscribe so that I don't miss anything from you 😊 Thanks again for making such a great video. Hoping to see more and more from you 😊.
@umarjamilai, 9 months ago
Thank you for your kind words! Always welcome to my humble channel!