RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning 

Google DeepMind

Reinforcement Learning Course by David Silver, Lecture 1: Introduction to Reinforcement Learning
Slides and more info about the course: goo.gl/vUiyjq

Published: 12 May 2015

Comments: 324
@jordanburgess 8 years ago
Just finished lecture 10 and I've come back to write a review for anyone starting. *Excellent course*. Well paced, enough examples to provide a good intuition, and taught by someone who's leading the field in applying RL to games. Thank you David and Karolina for sharing these online.
@Gabahulk 8 years ago
I've finished both of them, and I'd say that this one has better and much more solid content, although the one from Udacity is much lighter and easier to follow, so it really depends on what you want :)
@adarshmcool 8 years ago
This course is more thorough and for someone who is looking to make a career in Machine Learning, you should put in the work and do this course.
@TheAdithya1991 8 years ago
Thanks for the review!
@devonk298 7 years ago
One of the best, if not the best, courses I've watched!
@saltcheese 7 years ago
thanks for the review
@zingg7203 7 years ago
0:01 Outline
Admin 1:10
About Reinforcement Learning 6:13
The Reinforcement Learning problem 22:00
Inside an RL agent 57:00
Problems within Reinforcement Learning
@user-sf5ig4sz6p 7 years ago
Good job. Very thankful :)
@enochsit 6 years ago
thanks
@trdngy8230 6 years ago
You made the world much easier! Thanks!
@michaelc2406 6 years ago
Problems within Reinforcement Learning 1:15:53
@mairajamil001 3 years ago
Thank you for this.
@eyeofhorus1301 5 years ago
Just finished lecture 1 and can already tell this is going to be one of the absolute best courses 👌
@tanmaygangwani3534 7 years ago
The complete set of 10 lectures is brilliant. David's an excellent teacher. Highly recommended!
@passerby4278 4 years ago
what a wonderful time to be alive!! thank god we have the opportunity to study a full module from one of the best unis in the world. taught by one of the leaders of its field
@socat9311 5 years ago
I am a simple man. I see a great course, I press like
@nguyenduy-sb4ue 4 years ago
How lucky we are to have access to this kind of knowledge at the press of a button! Thank you to everyone at DeepMind for making this course public.
@BhuwanBhatta 4 years ago
I was going to say the same. Technology has really made our life easier and better in a lot of ways. But a lot of times we take it for granted.
@Edin12n 4 years ago
That was brilliant. Really helping me to get my head around the subject. Thanks David
@zhongchuxiong 1 year ago
1:10 Admin
6:13 About Reinforcement Learning
6:22 Sits at the intersection of many fields of science: solving the decision-making problem in these fields.
9:10 Branches of machine learning.
9:37 Characteristics of RL: no correct answer, delayed feedback, sequence matters, agent influences environment.
12:30 Example of RL
21:57 The Reinforcement Learning Problem
22:57 Reward
27:53 Sequential Decision Making. Action
29:36 Agent & Environment. Observation
33:52 History & State: stream of actions, observations & rewards.
37:13 Environment state
40:35 Agent State
42:00 Information State (Markov State). Contains all useful information from history.
51:13 Fully observable environment
52:26 Partially observable environment
57:04 Inside an RL Agent
58:42 Policy
59:51 Value Function: prediction of the expected future reward.
1:06:29 Model: transition model, reward model.
1:08:02 Maze example to explain these 3 key components.
1:10:53 Taxonomy of RL agents based on these 3 key components: policy-based, value-based, actor-critic (which combines both policy & value function), model-free, model-based
1:15:52 Problems within Reinforcement Learning.
1:16:14 Learning vs. Planning: partially known environment vs. fully known environment.
1:20:38 Exploration vs. Exploitation.
1:24:25 Prediction vs. Control.
1:26:42 Course Overview
@guupser 6 years ago
Thank you so much for repeating the questions each time.
@AndreiMuntean0 8 years ago
The lecturer is great!
@tylersnard 4 years ago
I love that David is one of the foremost minds in Reinforcement Learning, but he can explain it in ways that even a novice can understand.
@5m5tj5wg 5 months ago
Would be weird if he couldn't. If an expert can't explain it to a novice, who can?
@DrTune 1 year ago
Excellent moment around 24:10 when David makes it crystal clear that there needs to be a metric to train by (better/worse) and that it's possible, and necessary, to try to come up with a scalar metric that roughly approximates success or failure in a field. When you train something to optimize for a metric, it's important to be clear up-front what that metric is.
@ShalabhBhatnagar-vn4he 4 years ago
Mr. Silver covers in 90 minutes what most books do not in 99 pages. Cheers and thanks!
@linglingfan8138 3 years ago
This is really the best RL course I have seen!
@ethanlyon8824 7 years ago
Wow, this is incredible. I'm currently going through Udacity and this lecture series blows their material from GT out of the water. Excellent examples, great explanation of theory, just wow. This actually helped me understand RL. THANK YOU!!!!!
@JousefM 4 years ago
How do you find the RL course from Udacity? Thinking about doing it after the DL Nanodegree.
@pratikd5882 3 years ago
@@JousefM I agree, those explanations by GT professors were confusing and less clear, the entire DS nanodegree which had ML, DL and RL was painful to watch and understand.
@donamincorleone 8 years ago
Great video. Thanks. I really needed something like this :)
@Esaens 4 years ago
Superb David - you are one of the giants I am standing on to see a little further - thank you
@NganVu 4 years ago
1:10 Admin
6:13 About Reinforcement Learning
21:57 The Reinforcement Learning Problem
57:04 Inside an RL Agent
1:15:52 Problems within Reinforcement Learning
@mathavraj9662 3 years ago
bless u :)
@Abhi-wl5yt 2 years ago
I just finished the course, and the people in this comment section are not exaggerating. This is one of the best courses on Reinforcement learning. Thank you very much DeepMind, for making this free and available to everyone!
@tristanlouthrobins 5 months ago
This is one of the clearest and most illuminating introductions I've watched on RL and its practical applications. Really looking forward to the following instalments.
@johntanchongmin 3 years ago
Really love this video series. Watching it for the fifth time:)
@sachinramsuran7372 5 years ago
Great lecture. The examples really helped in understanding the concepts.
@asavu 1 year ago
David is awesome at explaining a complex topic!
@alpsahin4340 5 years ago
Great lecture, great starting point. Helped me to understand the basics of Reinforcement Learning. Thanks for great content.
@tianmingdu8022 7 years ago
The UCL lecturer is awesome. Thx for the excellent course.
@nirajabcd 3 years ago
Just completed Coursera's Reinforcement Learning Specialization and this is a nice addition to reinforce the concept I am learning.
@ImtithalSaeed 6 years ago
I can say that I've found a treasure... really
@user-hb9wc7sx9h 10 months ago
David is awesome at explaining a complex topic! Great lecture. The examples really helped in understanding the concepts.
@dhrumilbarot1431 6 years ago
Thank you for sharing. It kinda inspires me to always remember that I have to pass it on too.
@vipulsharma3846 4 years ago
I am taking a Deep Learning course rn but seriously the comments here are motivating me to get into this one right away.
@filippomiatto1289 7 years ago
Amazing video, a very well-designed and well-delivered lecture! I'm going to enjoy this course, good job! 👍
@wireghost897 1 year ago
It's really nice that he gives examples.
@kozzuli 7 years ago
Ty for sharing, Great Lecture!!
@aam1819 7 months ago
Thank you for sharing your knowledge online. Enjoying your videos, and loving every minute of it.
@sng5192 8 years ago
Thanks for a great lecture. I grasped the point of reinforcement learning!
@kiuhnmmnhuik2627 7 years ago
@1:07:00. Instead of defining P_{ss'}^a and R_s^a, it's better to define p(s',r|s,a), which gives the joint probability of the new state and reward. The latter is the approach followed by the 2nd edition of Sutton&Barto's book.
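A minimal sketch of the point made in this comment, assuming a toy problem: once the joint dynamics p(s',r|s,a) are given, both P_{ss'}^a and R_s^a follow from it by summing. The state names, the action name, the probabilities, and the helper functions `transition_probs` and `expected_reward` are all hypothetical and chosen only for illustration; they are not taken from the lecture or the book.

```python
from collections import defaultdict

# Hypothetical joint dynamics p(s', r | s, a) for a toy two-state problem.
# Keys are (s, a); values map (s_next, r) -> probability. All numbers are made up.
joint = {
    ("s0", "go"): {("s1", 1.0): 0.75, ("s0", 0.0): 0.25},
    ("s1", "go"): {("s1", 0.0): 1.0},
}

def transition_probs(joint, s, a):
    """P^a_{ss'}: sum the joint distribution over all rewards r."""
    probs = defaultdict(float)
    for (s_next, _r), p in joint[(s, a)].items():
        probs[s_next] += p
    return dict(probs)

def expected_reward(joint, s, a):
    """R^a_s: expected immediate reward, summing r * p over all (s', r)."""
    return sum(r * p for (_s_next, r), p in joint[(s, a)].items())

print(transition_probs(joint, "s0", "go"))
print(expected_reward(joint, "s0", "go"))
```

The reverse direction does not work in general: P_{ss'}^a and R_s^a only keep the marginal over next states and the mean reward, so the joint table carries at least as much information.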
@lcswillems 6 years ago
A really good introduction course!! Thank you very much!!
@hassan-ali- 7 years ago
lecture starts at 6:30
@deviljin6217 1 year ago
the legend of all RL courses
@zhichaochen7732 7 years ago
RL could be the killer app in ML. Nice lectures to bring people up to speed!
@mo3adhaytam771 1 year ago
I took this playlist as a reference for my thesis in "RL for green radio".
@MGO2012 7 years ago
Excellent explanation. Thank you.
@rossheaton7383 5 years ago
Silver is a boss.
@vorushin 5 months ago
Thanks a lot for the great lectures! I enjoyed watching every one of them (even #7). This is a great complement to reading Sutton/Barto and the seminal papers in RL. I remember looking at the Atari paper in late 2013 and having a hard time understanding why everyone was going completely crazy about it. A few years later the trend was absolutely clear. Reinforcement Learning is the key to pushing the performance of AI systems past the threshold where humans can serve as wise supervisors, to the limit where different kinds of intelligence help each other improve via self-play.
@bennog8902 6 years ago
awesome course and awesome teacher
@HazemAzim 3 years ago
just amazing and different than any intro to RL
@VishalKumarTech 7 years ago
Thank you David!!
@ProfessionalTycoons 6 years ago
amazing introduction and very cool
@abunickabhi 6 years ago
Excellent Indeed!
@TheAIEpiphany 3 years ago
His name should be David Gold or Platinum I dunno. Best intro to RL on YT, thank you!
@aaronvr_ 4 years ago
really high quality, I'm impressed at David Silver's (or somebody else's?) choice to offer this content to the general public free of charge.. what an age we're living in :DDDDDDDDDDD
@rohitsaka 3 years ago
For Me: David Silver is God ❤️ What a Man! What an Explanation. One of the Greatest Minds who changed the Dynamics of RL in the past few years. Thanks DeepMind for uploading this Valuable course for free 🤍
@ABHINAVGANDHI09 5 years ago
Thanks for the question at 19:48!
@legorative 6 years ago
Too good :) Best analogies.
@43SunSon 3 years ago
I have to admit, david silver is slightly smarter than me.
@AwesomeLemur 3 years ago
We can't thank you enough!
@Newascap 3 years ago
I actually prefer this 2015 class over the most recent 2019 one. Nothing wrong with the other lecturer, but David kinda makes the course flow more smoothly.
@mgonetwo 1 year ago
Rare opportunity to listen to Christian Bale after he is finished with dealing with criminals as Batman. On a serious note, overall great series of lectures! Thanks, prof. David Silver!
@yuwuxiong1165 4 years ago
Take swimming as an example: learning is the part where you jump directly into the water and learn to swim to survive; planning is the part where, before jumping into the water, you read books/instructions on how to swim (obviously sometimes planning helps, sometimes not, sometimes it is counterproductive).
@lauriehartley9808 4 years ago
I have never heard a punishment described as a negative reward at any point during my 71 orbits of the Sun. You can indeed learn something new every day.
@merajis 6 years ago
I love this!
@43SunSon 4 months ago
I'm back again, watching the whole video again.
@jamesr141 2 years ago
What a GIFT.
@dalcimar 5 years ago
Can you enable the automatic captioning to this content?
@rz4413 5 years ago
brilliant course
@ajibolashodipo8911 3 years ago
Silver is Gold!
@umountable 6 years ago
46:20 this also means that it doesn't matter how you got into this state, it will always mean the same.
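For reference, the property this comment describes is usually written as follows. This is the standard textbook statement of a Markov state in the lecture's $S_t$ notation, added here as a reading aid rather than quoted from the slide:

```latex
% A state S_t is Markov if and only if conditioning on the full
% history of states adds nothing beyond the current state:
\mathbb{P}\left[ S_{t+1} \mid S_t \right]
  = \mathbb{P}\left[ S_{t+1} \mid S_1, \ldots, S_t \right]
```

Read left to right: the distribution over the next state given only the current state equals the distribution given the whole history, which is why the path taken into a state does not matter.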
@vovos00 7 years ago
Thank you for nice lecture
@erichuang2009 4 years ago
5 days to train per game. Now it takes 5 minutes to complete training, based on recent papers. The field evolves fast!
@fandrade9 3 years ago
¡Great lecture!
@bingeltube 5 years ago
Recommendable
@florentinrieger5306 11 months ago
This is so good!
@viscaelbarca4381 2 years ago
Would be great if you guys could add subtitles!
@halefomkahsay2931 5 years ago
Great Help Thanks Man
@iblaliftw 2 years ago
Thank you very much, I recently got a good grade in RL thanks to your great teaching skills!!
@RahulSharma-yx5uf 2 years ago
Thank you very much!!
@vballworldcom 5 years ago
Captions would really help here!
@dharambir_iitk 1 year ago
love it!
@Delta19G 9 months ago
This is my first taste of deep mind
@vimalrajayyappan2023 1 year ago
Gifted!
@wentingwang883 11 months ago
Thanks so much!
@HoangPham-oh6re 7 years ago
Could you please turn on the auto-generated subtitles?
@yuxinzhang9403 2 years ago
Any observation and reward could be wrapped up into abstract data structure in an object for sorting.
@taherhabib3180 3 years ago
His 2021 "Reward is Enough" paper makes us agree with the Reward Hypothesis @ 24:18. :D
@AlessandroOrlandi83 3 years ago
Amazing teacher, I wish I could participate in this course! I did a course on Coursera, but it was very quick to explain very complex things.
@pratikd5882 3 years ago
Are you referring to the RL specialization by Alberta university? If so, then how good was it on the programming/practical aspects?
@AlessandroOrlandi83 3 years ago
@@pratikd5882 Yes, I did that. The exercises were good, but I'm not an AI guy, just a simple programmer. I managed to do the exercises, but I think the explanations were very concise. In 15 minutes they explain what you get in 1 hour in these lectures. I think it is very summarized. But it's good that they have exercises. So I don't think that after doing that I'm actually able to do much.
@satishrapol3650 2 years ago
Do you have any suggestions about which one to start with, the lecture series here or the RL specialization by Alberta University (on Coursera)? I need to apply RL in my own project work. By the way, I did the Machine Learning course by Andrew Ng and could follow the pace; it was good enough for me, and the programming exercises helped me a lot more than I could imagine. But I am not sure whether that would be the case with the RL specialization on Coursera as well. Can you guide me on this?
@saranggawane4719 2 years ago
42:00 - 47:55 : Information State/Markov State 57:13 RL Agent
@AhmedThabit99 5 years ago
if you can activate the subtitle from youtube, it will be great, Thanks
@dashingrahulable 7 years ago
On Slide "History and State" @ 34:34, does the order of Actions, Observations and Rewards matter? If yes, then why the order isn't Observations, Rewards and Actions; the reasoning is that the agent sees the observations first, assesses the reward for actions and then takes a particular action? Please clarify if the chain-of-thought went awry at any place. Thanks.
@razzlefrog 8 years ago
Only slide that threw me off a bit was the RL taxonomy one. There was some confusion with the redundant labeling, otherwise it was a great lecture!
@smilylife7515 2 years ago
Please add subtitles to make it more helpful for those who are from non-native English-speaking countries
@andyyuan97 8 years ago
If subtitles were provided, it would be perfect and classic~~
@kemalware4912 1 year ago
Great
@aidan9876 3 years ago
I found these psychologically useful. Are subtitles available? "The future is independent of the past, given the present."
@einemailadressenbesitzerei8816 3 years ago
I want to discuss: "All goals can be described by the maximisation of expected cumulative reward". "Do you agree with this statement?" My thought on why it could be controversial is that you can never specify the reward such that you will never have unexpected side effects or unexpected behaviour from the agent. Any other inputs/thoughts?
@prashanthduvvuri7845 4 years ago
The future is independent of the past given the present - David Silver
@utsabshrestha277 4 years ago
Only if it has a Markov state
@prashanthduvvuri7845 4 years ago
The above comment was meant in the context of your life. Your brain is an accumulation of all your prior experiences, and the choices/decisions you make are actions taken by your brain (which is a Markov state). So what I took from that statement was: "you need to forget your past and move on".
@donnysoh5610 5 years ago
Hi, in the example of the mouse pressing the lever, would that mean that the representation of the agent state will determine how well the agent learns?
@mechanicalmonk2020 4 years ago
Lecture 1 has half a million views, 10 has 36k. I'm surprised it's even 36k
@lazini 4 years ago
Thanks very much. But I need English subtitles. Could you change the settings of these videos? :)
@AntrianiStylianou 2 years ago
Can anyone confirm if this is still relevant in 2022? I would like to study RL. It seems that there is a more recent series but with a different professor on this channel.