
Rasa Algorithm Whiteboard - Transformers & Attention 1: Self Attention 

Rasa
101K views

This is the first video on attention mechanisms. We'll start with self attention and end with transformers.
We're going at it step by step, but if you're interested in immediately reading all about it in full detail then we might recommend these online documents:
- www.peterbloem....
- jalammar.github...
- d2l.ai/chapter_...
The general github repo for this playlist can be found here: github.com/Ras....

Published: 9 Sep 2024

Comments: 141
@joliver1981 3 years ago
I have watched tons of videos and finally an original video that actually teaches these concepts. There are so many YouTubers who simply make a video regurgitating something they read somewhere, but they don't really teach anything because they themselves don't really understand the idea. Bravo. Well done. I actually learned something. Thank you!
@RasaHQ 3 years ago
(Vincent here) I just wanted to mention that I certainly sympathize. It's hard to find proper originality out there when it comes to teaching data science.
@WIFI-nf4tg 3 years ago
@@RasaHQ Hi Rasa, can you also explain "how" we should express words into numbers for the vector v? For example, is there a preferred word embedding?
@RasaHQ 3 years ago
@@WIFI-nf4tg Rasa doesn't prefer a particular word embedding, but a common choice is spaCy. Note that technically, count vectors are also part of the feature space that goes into DIET.
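A minimal sketch of the spaCy option mentioned in the reply above, assuming the `en_core_web_md` model (which ships with pretrained word vectors) is installed; the sentence is just a toy example:

```python
# Hypothetical sketch: turning words into vectors with spaCy.
# Requires: pip install spacy && python -m spacy download en_core_web_md
import spacy

nlp = spacy.load("en_core_web_md")   # the md/lg models ship with word vectors
doc = nlp("my cat is a great cat")

for token in doc:
    # token.vector is a fixed, context-independent embedding for that word
    print(token.text, token.vector.shape)
```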
@briancase6180 1 year ago
Yeah, and some of those videos use a script generated by an AI that "read" the relevant sections of a paper. Get used to having to wade through tons of AI-generated content. We need laws that require AI-generated content to be labeled as such. But it's probably unenforceable. How much original content is enough to avoid the "AI-generated" label? ☺️
@homeboundrecords6955 1 year ago
TOTALLY agree, most 'education' vids really add to confusion and just regurgitate jargon over and over
@guanxi99 4 years ago
After studying dozens of papers and videos, this is the first one that really made me understand the context. Many thanks for that!!! It also highlighted one fact for me: self-attention is a smart idea, but the real magic sauce is the way the word embeddings are created. That will decide whether the contexts created by self-attention make sense or not.
@deoabhijit5935 3 years ago
agree
@davidlanday2647 3 years ago
A good thought! That makes me wonder if there are metrics we can add to a loss function to assess how well a mechanism "attends" to words in a sentence. Like, if we look at the embedding space, we would probably want to see words that are contextually proximal/similar and cluster close together. So I guess, some metric to assess how well an attention mechanism captures all contexts.
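For what it's worth, the "contextually proximal/similar" idea in the comment above is usually measured with cosine similarity between embedding vectors; a tiny illustrative sketch with made-up toy vectors (not from the video):

```python
# Minimal sketch of the "contextual proximity" idea: cosine similarity
# between two embedding vectors (assumed to be numpy arrays).
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors: "cat" and "kitten" should score higher than "cat" and "bank".
cat = np.array([1.0, 0.9, 0.1])
kitten = np.array([0.9, 1.0, 0.2])
bank = np.array([-0.2, 0.1, 1.0])
print(cosine_similarity(cat, kitten), cosine_similarity(cat, bank))
```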
@parryhotter18 1 year ago
This. If a bit late 😊. Yes, the creation of an embedding, i.e. the creation of a vector for each word, seems to be the main store of semantics for each word. This video IS the best I have seen so far, in that he always explains first WHY and then HOW each step works. Great approach!
@edouardthomasset6683 1 year ago
When the student understands the teacher, it means the teacher understood what he is explaining. I understood everything, contrary to the majority of other YouTubers' videos on the same topic. Thanks!
@alirezakashani3092 2 years ago
Mind-blowing how simply self-attention is explained - thank you
@stanislawcronberg3271 1 year ago
Only 4 minutes in and I can tell this series will be a banger. Don't stop teaching, this is pure quality content, much appreciated!
@azurewang 4 years ago
The most intuitive explanation I have ever seen!!! Excellent drawing and accent
@deudaux 1 year ago
The guy in the video doesn't just understand the concept, he also understands what others' understanding might lack, so he can fill in the gaps.
@SiliconValleyRunner 3 years ago
Best ever explanation of "self-attention". Awesome job.
@timohear 3 years ago
Can't believe I've only stumbled upon this now. Fantastic original explanation.
@mohammedmaamari9210 2 years ago
The clearest explanation of attention mechanisms I've ever seen. Thank you
@MrLazini 6 months ago
I love how you use different colors to represent different dynamics between relationships. Such a simple idea, yet so good at conveying meaning
@fallingintofilm 3 years ago
This was absolutely eye-opening. Congratulations sir! You win the Internet for a while.
@brianyoon815 2 years ago
This is incredible. After like 10 attention explanations, I finally get it here.
@rommeltito123 3 years ago
I had so many doubts about the actual operation that happens in self attention. This video just cleared it. Excellent delivery in such a short time.
@oritcoh 4 years ago
Best Attention explanation, by far.
@ferneutron 3 years ago
Thank you so much for your explanation! When you said "This is known as SELF ATTENTION", I just thought: BAM! Awesome job Rasa!
@magpieradio 3 years ago
This is the best video I have seen so far at explaining things clearly. Well done.
@foxwalks588 2 years ago
This is the best explanation of the attention mechanism so far for a regular person like me! I came here after going through the Coursera NLP specialization and several papers, but only now am I actually able to see how it works. Seems like the embeddings themselves are the secret sauce indeed. Thank you.
@binishjoshi1126 4 years ago
I've known self attention for some time; this is by far the most intuitive video I've ever seen, thank you.
@DaveJ6515 9 months ago
Yes sir, this is it. You have nailed it: not only do you know the subject; you also know the art of creating the conditions for everyone else to get into it gradually and logically. Great.
@seanh1591 2 years ago
This is the best explanation of the self-attention mechanism I've encountered after combing through the internet! Thank you!
@uniqueaakash14 2 years ago
Best video I have found on self-attention.
@blochspin 2 years ago
Best video, hands down, on the self-attention mechanism. Thank you!!!
@luisvasquez5015 1 year ago
Finally somebody explicitly saying that the distributional hypothesis makes no linguistic sense
@TheGroundskeeper 11 months ago
Still the best explanation 3 years later
@johnhutton5491 2 years ago
Since there isn't a like/dislike ratio anymore, for those wondering, this video is great
@gomogovo4966 1 year ago
I've been looking for a clear explanation for so so so long. First one I've found. I think all the people that made explanatory videos so far have 0 understanding of the subject. Thank you.
@thongnguyen1292 4 years ago
I've read dozens of papers and blog posts about this topic, but all they did was mostly walk through the math without showing any intuition. This video is the best I've ever seen, thank you very much!
@simranjoharle4220 1 year ago
This is the best explanation of the topic I have come across! Thanks!
@briancase6180 2 years ago
OMG, this helped me immeasurably. Thanks so much. I just couldn't quite get it from the other explanations I've seen. Now I can go back and probably understand them better. Yay!
@Anushkumar-lq6hv 1 year ago
The best video on self-attention. No debates
@giyutamioka9437 2 years ago
Best explanation I have seen so far .... Thanks!
@DrJohnnyStalker 3 years ago
Best self-attention intuition I have ever seen. Andrew Ng level stuff!
@avinashpaul1665 4 years ago
One of the best examples on the web that explains the attention mechanism. After reading many blogs I still had my doubts; the way attention is explained for both time series and text data is brilliant and helped me understand it better.
@andyandurkar7814 1 year ago
A very simple explanation .. the best one!
@tatiana7581 1 year ago
Thank you sooo much for this video! Finally, someone explained what self-attention is!
@vijayabhaskar-j 4 years ago
This attention series is the clearest and most intuitive explanation of self-attention out there! Great work!
@hiteshnagothu887 3 years ago
Never have I ever seen such a great concept explanation. You just made my life easier, @Vincent!!
@mokhtarawwad6291 1 year ago
Thanks for sharing. I watched based on a recommendation from a friend on Facebook, and I will watch the whole playlist. Thanks for sharing, God bless you 🙏 😊
@trantandat2699 3 years ago
I have read a lot about this: papers, Medium posts, videos; this one gave me the best understanding. Very nice!
@akashmalhotra4787 3 years ago
This is really an amazing explanation! Liked how you build up from time-series and go to text. Keep up the good work :)
@mmpcse 4 years ago
Have gone through some 10-12 videos on Self Attention. This Attention Series 1, 2 & 3 is by FAR THE BEST EVER. Many Thanks for these Videos. [came back and updated this comment ;-) ]
@pranjalchaubey 4 years ago
This is one of the best videos on the topic, if not the best!
@ArabicCompetitiveProgramming 4 years ago
Great series about attention!
@suewhooo7390 3 years ago
Best explanation of the attention mechanism out there!! Thank you a lot!
@dinoscheidt 3 years ago
Love the style. The more talent takes the time to teach new talent, the better. Very appealing style! Subscribed 🦾
@sebastianp4023 3 years ago
Please link this video in the TF docs. I tried for a whole day to get behind the concept of attention, and this explanation is just beautiful!
@dan10400 10 months ago
This is an exceptionally good explanation! Thank you so much. It is easy to see why the thumbs-up count is so high wrt views.
@galenw6833 1 year ago
At 11:29, the presenter says "cross product", but I think it's the dot product, so that each of the weights (W_11, etc.) is a number (with a cross product they would be vectors). Thus we can build a new vector from W_11, W_12, ... Great videos, exactly what I was looking for.
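A minimal numpy sketch of the reweighting discussed in this thread: dot products between the token vectors give scalar weights, a softmax normalizes them, and each output is a weighted sum of all the vectors. Variable names and the random toy data are illustrative only, not taken from the video:

```python
# Minimal sketch of parameter-free self-attention: dot products (not cross
# products) give scalar scores, softmax normalizes them per row, and each
# output row is a weighted sum of all the input vectors.
import numpy as np

def simple_self_attention(V: np.ndarray) -> np.ndarray:
    """V has shape (seq_len, dim): one embedding vector per token."""
    scores = V @ V.T                                      # w_ij = v_i . v_j (scalars)
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)         # softmax over each row
    return weights @ V                                    # y_i = sum_j w_ij * v_j

V = np.random.randn(6, 4)   # 6 toy tokens, 4-dimensional embeddings
Y = simple_self_attention(V)
print(Y.shape)              # (6, 4): same shape as V, but context-mixed
```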
@sgt391 2 years ago
Crazy useful video!
@vikramsandu6054 2 years ago
Loved it. Very clear explanation.
@sowmiya_rocker 1 year ago
Beautiful explanation, sir. I'm not sure if I got it all, but I can tell you that I've got a better idea of self-attention from your video compared to the other ones I watched. Thanks a lot 🙏
@siemdual8026 3 years ago
This video is the KEY for my QUERY! Pun intended. Thank you so much!
@jhumdas4613 2 years ago
Amazing explanation!! The best I have come across to date. Thank you so much!
@shivani404sheth4 3 years ago
This was so interesting! Thank you for this amazing video.
@kevind.shabahang 3 years ago
Excellent introduction
@Tigriszx 2 years ago
SOTA explanation. That's exactly what I was looking for. [tr] If anyone is reading this, follow this guy, his explanations are legendary.
@louiseti4883 1 year ago
Great stuff in here. Super clear and efficient for beginners! Thanks
@punk3900 4 months ago
The best explanations you can get in the world. Thanks! BTW, were you aware at the time of making these videos that transformers would be so revolutionary?
@distrologic2925 1 year ago
love the format
@skauddy755 10 months ago
By far the most intuitive explanation of self-attention. Disappointed, however, with the number of likes :(
@maker72460 2 years ago
Awesome explanation! It takes great skill to explain such concepts. Looking forward to more!
@hanimahdi7244 3 years ago
Thanks a lot! Really amazing, awesome and very clear explanation.
@ishishir 3 years ago
Brilliant explanation
@Deddiward 2 years ago
Wow this video is so well done
@adrianramirez9729 2 years ago
Amazing explanation! I didn't find too much sense in the comparison with time series, but the second part was really good :)
@offthepathworks9171 1 year ago
Solid gold, thank you.
@ashh3051 3 years ago
You are a great teacher. Thanks for this content.
@QuangNguyen-jz5nl 3 years ago
Thank you for sharing, great tutorial, looking forward to watching more and more great ones.
@devanshamin5554 4 years ago
Very informative and simple explanation of a complicated topic. 👍🏻
@sachinshelar8810 2 years ago
Amazing stuff. Thanks so much, Rasa Team :)
@saianishmalla2646 2 years ago
This was extremely helpful !!
@yacinerouizi844 3 years ago
Thank you for sharing, great tutorial!
@arvindu9344 3 months ago
Best explanation, thank you so much.
@RaoBlackWellizedArman 1 year ago
Fantastic explanations ^_^ Already subscribed!
@nurlubanu 2 years ago
Well explained! Thank you!
@vulinhle8343 3 years ago
amazing video, thank you very much
@MohamedSayed-et7lf 3 years ago
Perfectly explained
@alexanderskusnov5119 1 year ago
To filter (in signal processing (low-pass filters) and in programming (a filter predicate over a vector)) means to keep, not to throw away.
@mohajeramir 3 years ago
this was an excellent explanation. Thank you
@luisluiscunha 2 years ago
very well explained: thank you
@ankitmars 1 year ago
Best Explanation
@bootagain 4 years ago
Thank you for posting this educational and useful video. Though I cannot understand everything yet, I'll keep watching the rest of the series and trying to understand :) I mean it.
@injysarhan672 2 years ago
Great video! thanks
@pi5549 1 year ago
Your whiteboarding is beautiful. How are you doing it? I'd love to be able to present in this manner for my students.
@norhanahmed5116 3 years ago
Thanks a lot, that was very simple and useful. Wishing you all the best.
@ansharora3248 2 years ago
Beauty!
@ashokkumarj594 3 years ago
I love your tutorial 😙😙 Best explanation
@rayaay3095 3 years ago
Just Awesome... thank you
@thelastone1643 3 years ago
You are the best ....
@jmarcio51 3 years ago
I got the idea, thanks for the explanation.
@benjaminticknor2967 3 years ago
Incredible video! Did you mean to say dot product instead of cross product at 11:30?
@ParniaSh 2 years ago
Yes, I think so
@timholdsworth1 3 years ago
Why did you use cross product at 11:31? Wouldn't that be making the weights small when the word embedding vectors are similar, which would then mean the related words in the sequence would be unable to influence the current state?
@Erosis 2 years ago
I think he meant dot product? I don't know.
@xiaoyu5181 3 years ago
Clearly explained!
@burnytech 2 years ago
This is so genius, I love you
@Ameer-oe2jr 3 years ago
Thank you !
@timmat 1 month ago
Hi. This is a really great visualisation of weightings - thank you! I have a question though: at 11:30 you say you're going to calculate the cross product between the first token's vector and all the other vectors. Should this instead be the dot product, given that you are looking for similarity?
@23232323rdurian 1 year ago
The content of a word's vector is all the OTHER words seen to statistically co-occur with it in corpora, weighted by their frequencies. Stopwords [the, a, is] are so frequent that they don't contribute much topic/semantics, while content words are less frequent and so contribute more. The word vector ('meaning') for CAT is just the N words most frequently observed near CAT in corpora, discounted for frequency. That works great for cases like [king, queen], because they occur in similar contexts in corpora, but not for [Noa, cat], because that pairing is peculiar/local to this instance. It also doesn't work for co-references [cat, she], which are harder to resolve: you have to keep a STORY context, where presumably you've already seen some reference to... As for the co-reference, well, it's just harder to resolve, though in this example 'she' *HAS* to resolve to either Noa or the cat, because those are the ONLY choices, and by chance (we assume) all three co-refer. ==> After all, there's a legit chance that 'she' isn't the cat in the example but the cat's MOM, who can be an ANNOYING MOM, yet nevertheless Noa is still a great cat.
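A toy sketch of the count-based co-occurrence idea described in the comment above; the corpus and window size are made up for illustration, and real systems additionally downweight frequent stopwords (e.g. with PMI or TF-IDF style weighting):

```python
# Toy sketch: a word's "vector" as counts of which other words appear near it.
from collections import Counter, defaultdict

corpus = ["my cat is a great cat", "the king and the queen", "the queen saw a cat"]
window = 2
cooc = defaultdict(Counter)

for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        # count every word within `window` positions of w, excluding w itself
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                cooc[w][words[j]] += 1

print(cooc["cat"])   # neighbours of "cat", weighted by co-occurrence frequency
```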
@mohammadelghandour1614 1 year ago
Thanks for the easy and thorough explanation. I just have one question: how is "Y" now more representative or useful (more context) than "V"? Can you give an example?
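Not an official reply, but a compact way to see the difference, using the computation discussed elsewhere in this thread: each output y_i is a similarity-weighted mixture of all the input vectors, so it carries sentence context that the standalone embedding v_i does not.

```latex
w_{ij} = \operatorname{softmax}_j\left( v_i \cdot v_j \right),
\qquad
y_i = \sum_j w_{ij}\, v_j
```

For example, the output vector for a pronoun gets pulled toward the vectors of the words it is most similar to in that particular sentence, which its bare, context-independent embedding cannot reflect.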
@williamstorey5024 9 months ago
What is the reweighing method that you used in the beginning? I would like to look it up and get more details on it.
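I can't confirm exactly which method the video uses at the start, but one common way to reweight a time series in this spirit is a normalized (e.g. Gaussian-kernel) weighted average over nearby points; a hypothetical sketch:

```python
# Hypothetical sketch: Gaussian-kernel weighted moving average of a time series.
# Each output point is a normalized weighted average of all points, with weights
# peaking at the current position (an assumption, not the video's exact method).
import numpy as np

def gaussian_smooth(x: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    t = np.arange(len(x))
    out = np.empty_like(x, dtype=float)
    for i in range(len(x)):
        w = np.exp(-0.5 * ((t - i) / sigma) ** 2)   # weights peak at position i
        out[i] = np.sum(w * x) / np.sum(w)          # normalized weighted average
    return out

signal = np.sin(np.linspace(0, 6, 50)) + 0.3 * np.random.randn(50)
print(gaussian_smooth(signal)[:5])
```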