MIT 6.S191 (2023): Reinforcement Learning 

Alexander Amini
259K subscribers
124K views

MIT Introduction to Deep Learning 6.S191: Lecture 5
Deep Reinforcement Learning
Lecturer: Alexander Amini
2023 Edition
For all lectures, slides, and lab materials: introtodeeplearning.com
Lecture Outline:
0:00 - Introduction
3:49 - Classes of learning problems
6:48 - Definitions
12:24 - The Q function
17:06 - Deeper into the Q function
21:32 - Deep Q Networks
29:15 - Atari results and limitations
32:42 - Policy learning algorithms
36:42 - Discrete vs continuous actions
39:48 - Training policy gradients
47:17 - RL in real life
49:55 - VISTA simulator
52:04 - AlphaGo and AlphaZero and MuZero
56:34 - Summary
Subscribe to stay up to date with new deep learning lectures at MIT, or follow us @MITDeepLearning on Twitter and Instagram to stay fully-connected!!

Science

Published: June 19, 2024

Comments: 73
@BehindTheBackground 4 months ago
Excellent slides and explanations!
@mehmetburakguldogan6815 1 year ago
Very good work. Seen many lectures on the topic but this is by far the best one and very intuitive. Thank you for sharing.
@nikteshy9131 1 year ago
Wow, thank you very much )) 🥰🥰😊
@muhammadalikhan5003 5 months ago
Amazing lecture delivery. No words to thank you for sharing this wonderful resource for free. Thanks, MIT as well.
@bohaning 4 months ago
Hey, I'd like to introduce you to my AI learning tool, Coursnap, designed for youtube courses! It provides course outlines and shorts, allowing you to grasp the essence of 1-hour in just 5 minutes. Give it a try and supercharge your learning efficiency!
@cyrusmobini1321 1 year ago
Great as always, thanks for being consistent
@nageshwararaov118 1 year ago
Thank you very much. 😊
@sirabhop.s 1 year ago
Thank you so much
@TheEgesko 1 year ago
Great video! 🙏
@imZoox 7 months ago
Haha, at 19:50, William Lin the CP legend is answering the question :D It's so weird: I am not from the US, nor do I study there, but I recognize an MIT student by his voice in an MIT online lecture :D
@khalidalsaleh3858 1 year ago
Thanks!
@blas.duarte 1 year ago
Great!
@pavalep 1 year ago
Thanks for explaining complex Deep Learning and Reinforcement principles in a simplistic manner 🙌👍
@yuqiwang3296 1 year ago
great thanks for the course!❤
@user-ff9hy6qz5t 7 months ago
Thank you so much! I loved the lecture, and I'm learning so much! I'm only 16 now, but I hope I can one day get into MIT or another great university that teaches this well!
@ReeceGao 8 months ago
It is so clear. Thank you very much!
@franco-parra 6 months ago
Great lecture. To be precise, at 24:37, you propose the 'target' as a function of the best action a' in some state s', but you don't explicitly define where this s' comes from. I may be mistaken, but I believe that this s' essentially represents the state s in the next step (t+1), as demonstrated in ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-wDVteayWWvU.html (at 14:45). I hope this information is useful to someone.
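The point about s' can be illustrated with a small tabular sketch: the target is built from one observed transition (s, a, r, s'), where s' is the state the environment returns after taking action a in state s. This is only an illustration under stated assumptions, not the lecture's code; the names `td_target` and `q_loss`, the dictionary-based Q table, and the default `gamma` are all hypothetical.

```python
from typing import Dict, Tuple

State = str
Action = str
QTable = Dict[Tuple[State, Action], float]


def td_target(q_table: QTable, reward: float, next_state: State,
              actions: Tuple[Action, ...], gamma: float = 0.9,
              done: bool = False) -> float:
    """Return r + gamma * max_a' Q(s', a'), where s' is the state at step t+1.

    If the episode ended at s', there is no future return to bootstrap from,
    so the target is just the reward.
    """
    if done:
        return reward
    # Unseen (state, action) pairs default to 0.0 (an assumption of this sketch).
    best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
    return reward + gamma * best_next


def q_loss(q_table: QTable, state: State, action: Action, target: float) -> float:
    """Squared error between the target and the current estimate Q(s, a)."""
    return (target - q_table.get((state, action), 0.0)) ** 2


# Made-up transition: in s0 the agent moved "left", got reward 1.0, landed in s1.
q = {("s0", "left"): 0.5, ("s1", "left"): 1.0, ("s1", "right"): 2.0}
target = td_target(q, reward=1.0, next_state="s1",
                   actions=("left", "right"), gamma=0.5)
loss = q_loss(q, "s0", "left", target)
```

Here the maximum is taken over the actions available in s1 (the next state), while the loss compares the target against the estimate for the pair (s0, "left") that was actually taken.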
@esthertschache 6 months ago
Great video!
@hilbertcontainer3034 1 year ago
~wow my favorite area of AI =] can't wait to finish the lecture
@prithvishah2618 1 year ago
Thank you so much :)
@xzuanaja2746 1 year ago
This is so great! Unfortunately, due to my limited English, I didn't understand some parts. Hopefully in the future there will be subtitles in Indonesian or other languages. Thank you very much!
@master7738 5 months ago
you can use subtitles if you want
@vahidg1500 9 months ago
Thank you, Ostad Amini. But where can I find code examples for policy learning methods like PPO?
@saprogrammer2702 9 months ago
Dude, this guy did such a good job!!!!
@jennifergo2024 6 months ago
Thanks for sharing!
@seanwalsh358 9 months ago
Great lecture from a great instructor.
@sapienspace8814 1 year ago
@ 50:00 Very impressive work, VISTA!
@MrPejotah 1 year ago
Once again a great lecture. I have a challenge, and I wonder if you can help me. I'm currently implementing a NN to determine customer satisfaction from a set of inputs that capture behavioural patterns (think number of complaints to our customer service, rate of usage of our services, etc.), and I'd like to know how much each input I'm using contributes to the overall satisfaction score. I imagine this would involve computing the gradient of the output node (a single one in this case) with respect to each input. Is there any lecture where you go into the details of this, both the math and the TensorFlow code? Thanks in advance!
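The idea in this question, the gradient of a single output with respect to each input, can be sketched without any framework. The example below is a minimal illustration under stated assumptions, not an answer from the course: `model` is a hypothetical one-unit stand-in for a trained network, the weights and input values are made up, and the gradient is estimated by central finite differences. In TensorFlow the same quantity would usually come from automatic differentiation (for example with `tf.GradientTape`) rather than finite differences.

```python
import math
from typing import Callable, List, Sequence


def model(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical stand-in for a trained network: a single sigmoid unit."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def input_gradients(f: Callable[[List[float]], float],
                    inputs: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Central finite-difference estimate of d f / d x_i for every input x_i."""
    grads = []
    for i in range(len(inputs)):
        up = list(inputs)
        down = list(inputs)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2.0 * eps))
    return grads


# Made-up parameters and one made-up customer: [complaint count, usage rate].
weights, bias = [-2.0, 1.0], 0.0
x = [0.5, 1.0]
grads = input_gradients(lambda v: model(v, weights, bias), x)
# A negative entry means raising that input lowers the predicted score near x.
```

These gradients are local: they describe sensitivity around one particular input vector, so comparing inputs usually requires putting them on a common scale and averaging over many examples.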
@smftrsddvjiou6443 6 months ago
I recommend Sutton and Barto's "Reinforcement Learning", 1st edition; it is way, way better than the newer 2nd edition.
@kritsaphongphuthibpaphaisi1509
Great lecture
@agenticmark 4 months ago
Glad to see ML can figure out what I did as an 8 year old with a stack of quarters :D
@jiunyen5586 1 year ago
Thanks for the thorough vid! I'm a bit lost @ 39:31 on where the "-0.8" velocity comes from. My closest interpretation is that, given mean=-1 and var=0.5, the prob of the normal dist at the mean would be about 0.8... and since you're going in the negative direction for action a, it becomes -0.8?? But this interpretation seems wrong, since the mean should indicate the direction and velocity of action a, while the prob is for computing the loss. So.... what am I missing here? Thanks!
@gnikhil335 1 year ago
When you say "the prob of normal distribution at mean would be around 0.8", where did you get 0.8 from? (The maximum value of this distribution is 0.564, at the mean.) And secondly, I think he is using 0.8 m/s as an example (it's a random value which you might get after mapping it back to a speed variable in your game).
@jiunyen5586 1 year ago
@gnikhil335 Good call! I mistook that variance for the std. My mistake. And I also really should've said likelihood there. But yeah, really I was just trying to figure out why he said the mean is centered at -0.8 but also shows a mean of -1 for the predicted params of the pdf. As in, are they just separate random examples, or are we using a pdf with mean=-1, var=0.5 to determine the prob when the speed is -0.8? That also doesn't seem likely, since I thought we would use the velocity with the max likelihood (i.e. the mean).
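The numbers in this exchange can be checked with a short sketch. It evaluates the Gaussian density at its mean for variance 0.5 (the 0.564 quoted above) and for standard deviation 0.5 (the roughly 0.8 that results from confusing the two). The second half shows one common reading of a continuous policy, offered here as an assumption rather than a statement of what the slide intends: the action is sampled from the predicted distribution instead of being set to the mean, and its log-likelihood is what enters a policy-gradient loss. The function name `gaussian_pdf` and the fixed seed are illustrative.

```python
import math
import random


def gaussian_pdf(x: float, mean: float, variance: float) -> float:
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance)


mean, variance = -1.0, 0.5

# Peak density when 0.5 is treated as the variance: 1 / sqrt(pi), about 0.564.
peak_if_variance = gaussian_pdf(mean, mean, variance)

# Peak density when 0.5 is mistaken for the standard deviation: about 0.798.
peak_if_std = gaussian_pdf(mean, mean, 0.5 ** 2)

# Assumed reading: the action is a sample from N(mean, variance), so it can
# differ from the mean (for instance -0.8 when the mean is -1).
rng = random.Random(0)
action = rng.gauss(mean, math.sqrt(variance))

# Log-likelihood of the sampled action, the term used in a policy-gradient loss.
log_prob = math.log(gaussian_pdf(action, mean, variance))
```

Sampling rather than always taking the mean is what lets the agent explore; the density value itself is not a velocity.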
@MrMonkeyMana 1 year ago
Can you teach AI to play Cities: Skylines?
@herikaniugu 8 months ago
RL is so good for optimizing trading strategies
@ojasvisingh786 1 year ago
👏👏
@Gabcikovo 1 year ago
54:38
@madhusudhanreddy9157 1 year ago
Hi Alex, could you please suggest a good online coding platform that works properly for reinforcement learning? On our local systems we are getting errors (system dependencies). Even Google Colab shows an error when using the gym library. Thanks, your RU-vid follower
@xpcalc446 1 year ago
Have you tried to solve those errors by installing the correct versions of the packages?
@user-xd4cl3qd8x 11 months ago
Oh my God, he is so handsome. And your speaking, lecture delivery, and fluency in RL are as awesome as your looks....🤩 I was focusing on the speaker more than the slides. May Allah Almighty bless you, man
@forheuristiclifeksh7836 1 month ago
7:00
@DonReichSdeDios 7 months ago
An apple with a byte❤ ✒️ fellow August 13th🤳🏿
@forheuristiclifeksh7836 1 month ago
14:25
@Achielezz 5 months ago
You say state-action-pear but show an apple, I AM CONFUSION! AMERICA EXPRAIN! :) Loved the lecture, really well done.
@SantoshKumar-hx2ig 1 year ago
Lecture 7 ?
@AAmini 1 year ago
Lecture 7 is having some technical difficulties so it will be published tomorrow same time (10am ET) -- sorry for the delay!
@SantoshKumar-hx2ig 1 year ago
@AAmini I am very happy to get a reply within a few minutes. Today I feel the power of MIT.
@AAmini 1 year ago
Thank you for your understanding :)
@smftrsddvjiou6443 6 months ago
Now, he knows that Q values can be converted into Probability?
@roadto300kusdbtc7 1 year ago
once again, audio is super quiet. Had to turn the volume to 100. Fire the audio guy lol
@davidkamran9092 10 months ago
SEALCLATCONTITOIN - YALL NEED TO INCORPORATE HARD-CODED TRAJETORIES LIKE POLITICAL VIEWS IN DEEP LEARNING .. THE SYSTEM DYNAMICS CHANGE BASED ON POLITICAL MODALITIES
@shojintam4206 11 months ago
33:13
@pravachanpatra4012 10 months ago
16:03