DeepMind AlphaGo Zero Explained

Views: 51K

DeepMind's AlphaGo Zero algorithm surpassed the version of AlphaGo that beat world champion Lee Sedol, and it got there by training entirely through self-play. It played against itself repeatedly, getting better over time with no human gameplay input. AlphaGo Zero was a remarkable moment in AI history. Move 37, from the original AlphaGo's match against Lee Sedol, is worthy of many philosophical debates. You'll see what I mean and get a technical overview of AlphaGo Zero's neural components (code + animations) in this video. Enjoy!
Code for this video:
github.com/Zet...
Please Subscribe! And like. And comment. That's what keeps me going.
Want more education? Connect with me here:
Twitter: / sirajraval
Instagram: / sirajraval
Facebook: / sirajology
There are 2 errors in this video:
1. At the top of the residual network, it says value layer twice. One should say 'policy' layer.
2. The residual network is 40 layers; I say 20.
This video is a part of my Machine Learning Journey course:
github.com/llS...
More Learning Resources:
deepmind.com/b...
/ alphago-zero-explained...
hackernoon.com...
web.stanford.e...
tim.hibal.org/b...
www.jessicayung...
Join us in the Wizards Slack channel:
wizards.herokua...
Sign up for the next course at The School of AI:
www.theschool.ai
And please support me on Patreon:
www.patreon.co...
#AlphaGoZero #Deepmind #SirajRaval
Sign up for my newsletter for exciting updates in the field of AI:
goo.gl/FZzJ5w
Hit the Join button above to sign up to become a member of my channel for access to exclusive content! Join my AI community: chatgptschool.io/ Sign up for my AI Sports betting Bot, WagerGPT! (500 spots available):
www.wagergpt.co

Published: Sep 19, 2024

Comments: 115

@rishabhchopra6418 6 years ago

Siraj, you've improved your skill of teaching a lot over the years! Now, you talk more slowly and your videos are much more comprehensive than before! Beautiful. Keep going! :D

@SebastianMantey 6 years ago

“The dot product operations became self-aware.” (1:34) 😂 Nice point of view to put the danger of artificial “intelligence” somewhat in perspective.

@einemailadressenbesitzerei8816 6 years ago

Can you please explain? I understand the dot product but not the joke. I don't see self-awareness at all, unless you come from 1950 and play against AlphaZero.

@SebastianMantey 6 years ago

That’s exactly the point. There is no self-awareness. He is making fun of the fact that whenever there is a big breakthrough in AI (e.g. AlphaGo beating the world champion in Go 0:25), mainstream media in general immediately jumps to the conclusion that a general artificial intelligence is near (and you are bound to find Skynet comments under respective news articles or videos). But at the core deep learning is basically just a bunch of dot products and activation functions. So, just basic mathematical operations. There is nothing that is inherently “intelligent” about that, let alone self-aware.

@einemailadressenbesitzerei8816 6 years ago

thx

@SirajRaval 6 years ago

haha thanks

@benyaminewanganyahu 5 years ago

@SebastianMantey there is nothing 'self-aware' about 2 atoms and yet they make up our brains. This is identical to the argument you just used. Self-awareness is also independent of power or danger. Self-awareness, consciousness, sentience may not even exist.

@MLwithAlva 6 years ago

Great explanation!!

@SirajRaval 6 years ago

thanks!

@vijayabhaskar-j 6 years ago

Siraj is at his peak, making 4-5 videos per week.

@kristianwichmann9996 6 years ago

Don't forget to sleep, Siraj :D

@SirajRaval 6 years ago

yes i need to chill but i'm too dedicated

@akhileshpandey123 6 years ago

Thanks Siraj. I have been watching your content for like 1 year. You have become quite an awesome educator. Good wishes.

@prabhakartayenjam2258 6 years ago

Hi Siraj, I am a regular viewer of your videos. I like them. It would be great to see some videos on observational learning.

@HiEnergyMusic 6 years ago

This is truly amazing... and very well explained. Thanks, Siraj!

@GlorryMorry 6 years ago

Hey, @Siraj! There seems to be a mistake at 3:23. The green and cyan colors should be swapped, I suppose?

@sofia.eris.bauhaus 6 years ago

jup

@SirajRaval 6 years ago

yes, thanks for pointing out my mistake

@sinaabady1028 2 years ago

This video is amazing man thanks for explaining :)

@allmightqs1679 6 years ago

Did we reach the i-Robot era already?😱 freaky stuff indeed!

@quarkmarino 6 years ago

Hey Siraj, great video though. I would recommend that you don't animate every element of your diagrams with transitions (it feels a bit PowerPoint-y). Instead, just show the whole diagram (e.g. a neural network) and highlight only the piece you are talking about. The same goes for the code: show as much code (or pseudocode) as possible and highlight the important part (sometimes not even the whole line) as you mention it. I think that would make your explanations clearer and the visuals less distracting. But for the love of A.I., don't change the memes, they are great.

@DBCatch22x 6 years ago

Siraj you are an absolute beast. Thank you so much for everything you do. You have been so helpful, and I am so thankful for your channel.

@Mlantow20 6 years ago

"Discrete" and "perfect information" have switched explanations at 3:35

@RohilGupta123 6 years ago

I am a Master's student of Machine Learning and Data Mining, and I love your videos. I just have a small complaint: it's sometimes hard to follow you when you're explaining really tough concepts. I know you have time constraints, but it becomes nearly impossible to follow the long explanations you provide. If you could pause between really tough concepts, it would be a little easier to follow you.

@Cyberautist 3 years ago

1:25 Move 37 was not an invention of AlphaGo. The documentation says it was played by 1 in 10,000. Because the data was based on 10,000 games from human amateurs, that frequency was one human player out of the 10,000 whose games were the base data for AlphaGo's learning pattern, which it could optimize but not surpass. That also explains the dead lumps in AlphaGo's strategy.

@junkseed 6 years ago

Good video at the right time, a nice explanation for what I am currently reading about RL (Sutton)

@troemax 6 years ago

The mentioned move 37 was played in game 2 of 5, which was streamed here: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-l-GsfyVCBu0.htmlh17m46s At the bottom right of the video at 1:17:46 you can see that black (AlphaGo) is playing the 37th stone. The reaction of the moderators is funny :)

@larrybeckham6652 5 years ago

Very scary. If the algorithm learns TCP/IP, HTTP, and such, any computer on the Net is pwned.

@vijayabhaskar-j 6 years ago

1:34 that's a meme right there.

@curiousbit9228 6 years ago

Thank you for this juicy video @siraj!!

@jamestandy8594 6 years ago

Thank you! Somehow in every previous explanation I missed the fact that MCTS involved purely random playouts. As a Go player, it surprises me that purely random moves would be a good diagnostic for the goodness of a move. It makes sense that strong shape (for instance) would withstand a random barrage of stones better than weak shape. Perhaps the same goes for small territories vs. large areas, which could be why AG prefers quick territory over whole-board influence. Random playouts seem like they would be less effective in complicated fights, where there are several possibilities at the local scale, let alone the whole board, but am I correct that the layers of neural networks involve smaller portions of the board, and so it might be able to deduce the correct local sequence from fewer playouts than otherwise?
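
Editor's sketch of the random-playout idea raised in the comment above, on a toy game rather than Go. Everything here is an assumption for illustration and not from the video: the game is a one-pile Nim in which players alternately take 1 to 3 stones and whoever takes the last stone wins, and the names random_playout, estimate_win_rates, and best_move are hypothetical. Each candidate move is scored by the fraction of uniformly random playouts it wins, which is flat Monte Carlo evaluation (the playout step of classic MCTS, without the tree). Note that, according to the AlphaGo Zero paper, AlphaGo Zero itself evaluates positions with its neural network instead of running random rollouts.

```python
import random


def random_playout(stones, player_to_move, rng):
    """Play uniformly random legal moves until the pile is empty.

    Returns the player (0 or 1) who takes the last stone and so wins.
    Requires stones >= 1.
    """
    while True:
        take = rng.randint(1, min(3, stones))
        stones -= take
        if stones == 0:
            return player_to_move
        player_to_move = 1 - player_to_move


def estimate_win_rates(stones, playouts=200, seed=0):
    """Score each legal move for player 0 by its random-playout win rate."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    rates = {}
    for take in range(1, min(3, stones) + 1):
        remaining = stones - take
        if remaining == 0:
            rates[take] = 1.0  # taking the last stone wins immediately
            continue
        # After our move it is player 1's turn on the remaining pile.
        wins = sum(
            random_playout(remaining, 1, rng) == 0 for _ in range(playouts)
        )
        rates[take] = wins / playouts
    return rates


def best_move(stones, playouts=200, seed=0):
    """Pick the move with the highest estimated win rate."""
    rates = estimate_win_rates(stones, playouts, seed)
    return max(rates, key=rates.get)


if __name__ == "__main__":
    print(estimate_win_rates(3))
    print("best move from a pile of 3:", best_move(3))
```

With a pile of 3, taking all three stones wins outright, so its score is 1.0 by construction; the other moves are only estimated, which is where the noise the commenter worries about comes from.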

@OO-ie1pe 6 years ago

thank you Mr.Raval

@ZerofeverOfficial 6 years ago

"Hey, I'm a beautiful wizard ya'll!"

@SirajRaval 6 years ago

yes you are love u

@xPROxSNIPExMW2xPOWER 6 years ago

Chemical Reaction Networks. Bio comp the next thing for sure.

@mfeldman143 6 years ago

"let's be real, probably some tea"

@itsDe0n 6 years ago

Awesomely Explained

@kirill_good_job 1 year ago

hi, Siraj, thank you for your video, please explain how si 7:32 is calculated ? (aggregate score)

@kirill_good_job 1 year ago

thanks for video, which microphone do you use ?

@toki_doki 6 years ago

How do you post so often. I love it. Keep it up Siraj!

@kapilkansara5129 6 years ago

Epic content sir

@imays76 5 years ago

I have a question on "The outputs of neural network are: 1)possibilities for each next move and 2)possibility of win." Does "possibilities for each next move" correspond to one-hot encoding for the next move?
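
Editor's sketch relating to the question above. As described in the AlphaGo Zero paper, the policy output is a probability for every move (361 intersections plus pass), not a one-hot vector, and the value output is a single number between -1 and 1. The code below only illustrates those output shapes: the random numbers stand in for real network activations, and the names NUM_MOVES, softmax, policy_logits, and value_logit are hypothetical.

```python
import math
import random

BOARD_SIZE = 19
NUM_MOVES = BOARD_SIZE * BOARD_SIZE + 1  # 361 intersections plus "pass"


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-ins for the two network heads (assumption: random values, not a model).
rng = random.Random(0)
policy_logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_MOVES)]
value_logit = rng.gauss(0.0, 1.0)

policy = softmax(policy_logits)  # one probability per move
value = math.tanh(value_logit)   # squashed into [-1, 1]

# A one-hot vector would put all mass on a single move; the policy does not.
most_likely = max(range(NUM_MOVES), key=lambda i: policy[i])
one_hot = [1.0 if i == most_likely else 0.0 for i in range(NUM_MOVES)]

if __name__ == "__main__":
    print("moves:", len(policy))
    print("probability of the most likely move:", policy[most_likely])
    print("value estimate:", value)
```

A one-hot vector can be recovered by taking the most likely move, as one_hot does here, but the network itself outputs the full distribution.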

@sheng-yiye1552 6 years ago

Hello Siraj ! Thank you for the video ! I am a newbie here and I wanna build my own AI for Go. Do you think a genetic neural network approach can suit this application ? :D

@marshboy4150 4 years ago

Matrix reference at 3:48

@DiogoVKersting 6 years ago

What I'm curious about AlphaGo, is if there's "hidden technology" on its making. That is, I know the source is closed, but a lot of its designs are available in papers. Would someone with enough resources be able to create his own version of AlphaZero without having to research new technology (i.e. just coding based on the papers)?

@gJonii 6 years ago

Diogo V. Kersting LeelaZero is an open source version of AlphaGo, where computing power is crowdsourced.

@larryteslaspacexboringlawr739 6 years ago

thank you deepmind video

@ymi_yugy3133 6 years ago

How can you do hyperparameter optimization if it takes so long to train the network?

@abhirishi6200 6 years ago

Ymi_Yugy Hyperparameter tuning is only for supervised learning . Reinforcement learning is super dope , relying only on the algorithm and the training process to get better and better.

@SirajRaval 6 years ago

it's another layer of training time indeed

@geistreiches 6 years ago

Can you do a video about AI natural language understanding? With all the data we have in books and on the web, an algorithm using that data should be able to learn to understand language. How far are we in that area, and what are the main issues?

@abhijeetghodgaonkar 6 years ago

Yay AlphaGo!!!! I love the game of Go. I want to reach dan level, I want to become a 9-dan professional

@abhijeetghodgaonkar 6 years ago

@surupendu , AlphaStill there

@abhigo7788 6 years ago

It is quite hard to understand the image channels made on a 19x19 board to get more accurate features to train.

@wolfisraging 6 years ago

Deep mind is really digging deep in mind

@dannyiskandar 6 years ago

I made it to the end... but have no idea what you are saying :)

@eolew9829 6 years ago

Sir, thanks for the instruction. Many people begin playing chess/Go by remembering win/draw patterns (if you place stones in a certain position, your opponent has no other choice but to follow you, so you can control the result). There are many books about these "patterns" and tactics. Are they really tested by AI?

@barnmonster888 5 years ago

IF YOU TAKE THIS---YOU WILL LOSE YOUR SOLE AND NEVER BE ABLE TO GOE TO HEAVEN EVER AGAIN

@djneumonic 6 years ago

@Siraj what’s going to happen with your old block chain course on the school of ai ?

@bior87 6 years ago

Great explanation! but no unlimited stones, a standard set includes 181 black and 180 white stones (361 intersection points)

@neural1023 6 years ago

you should help promote Dataquest.io for everyone trying to get into A.I and data science its a great resource that truly allows anyone to get into the field.

@albrin 6 years ago

great video. what about chess? why so little information?

@dibydash 6 years ago

"Let's be real, probably some tea" XD

@rtechshow62 6 years ago

I am waiting for someone to develop Zola's algorithm 😈👹

@pratikdesai2396 6 years ago

If the human mind also had the capacity to traverse the whole state space (maybe by tree search and other methods) at the speed of computer processors/GPUs, would AI and humans be at the same level, especially for a game like Go?

@einemailadressenbesitzerei8816 6 years ago

What would then be the difference between humans and computers?

@einemailadressenbesitzerei8816 6 years ago

AlphaZero doesn't go through the whole state space; this is impossible with current hardware for games like chess or Go

@abdialibabaali132 6 years ago

Is it open source? And if it is open source, where can I get it?

@4acesproductions341 5 years ago

Would AlphaGo Zero vs AlphaGo Zero end in a draw then?

@Larkinchance 1 year ago

I'm having an anxiety attack...

@SirajRaval 1 year ago

It’s going to be ok. Life can be really hard but you’ll get through this

@Larkinchance 1 year ago

@SirajRaval you are very nice

@allsmiles6538 5 years ago

thank our AI god-masters -- may AlphaZero reign supreme -- that this video did not include tasteless royalty-free music to detract from the lesson lol what if humans create a religion out of AI :D. wait. actually it sounds plausible :(

@abhishekkapoor7955 6 years ago

please make videos on feature engineering.

@andyday4847 6 years ago

AlphaGo Zero never played against a human player.

@nicksocu 6 years ago

Just listen to 3:01-3:10 made my night so wet!!!!

@e4r281 6 years ago

Why do you keep trying to grab the air in front of you when saying hi ?

@papalevies 6 years ago

64 GPUs and 19 CPUs cost millions of dollars? What? Did you mean 100k?

@SuperProtector 6 years ago

I have the same question: this result is based on TPUs according to my information. Why only mention GPUs and CPUs? A TPU costs more, maybe about $10,000 per unit. A Titan V costs about $3,000.

@SirajRaval 6 years ago

over the very long period of time they trained it on, those costs really add up

@bharath5673__ 6 years ago

Siraj do some videos on CYBORG plzzzzz

@nicktohzyu 6 years ago

how is your definition of "discrete" different from "perfect information"?

@quickdudley 6 years ago

You can have discrete games with imperfect information: such as chess with invisible pieces or Space Hulk.

@HappyDancerInPink 6 years ago

Why does he gasp for air between each sentence

@tadashiuchiha5732 2 years ago

5:42

@bayesianlee6447 5 years ago

If an AI system is way better than human beings, how can we humans possibly remain the rulers of AI? It's like ants creating human beings. I'm a deep learner but still have huge doubts about the safety of AI once we finally get into the AGI period.

@rodthelimey 4 years ago

Could you redo this video, speaking more slowly? About 3 times more slowly? Thx ;)

@murtazaattari5868 6 years ago

Need ur help Siraj

@paulalexandrupop3709 6 years ago

"More possible Go positions than there are atoms in the universe" - isn't the universe supposed to be infinite?

@kiranroye6498 6 years ago

it is, but that doesn't mean that there is an infinite amount of individual particles

@paulalexandrupop3709 6 years ago

right, but how does that make the number of Go positions larger than the number of atoms in the universe? You don't even know how many particles there are in the universe. At best, you can approximate the number of atoms in the observable universe and make your claim against that.

@McRingil 6 years ago

Paul Alexandru Pop actually we know that based on the observable expansion of the universe

@ED-TwoZeroNine 6 years ago

Paul Alexandru Pop he should have said, the known universe.

@abhimanyusid 6 years ago

It's not known that the universe is infinite. In fact it is more likely believed to be finite, as we had a big bang and the universe expands from that moment, over a finite(but large) period of time
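
Editor's sketch of the arithmetic behind this thread. Two assumptions, neither from the video: each of the 361 points is treated as empty, black, or white, which gives 3^361 as an upper bound that also counts illegal positions (the number of legal positions is smaller, on the order of 10^170), and 10^80 is used as a common rough estimate for the number of atoms in the observable universe. The variable names are hypothetical.

```python
BOARD_POINTS = 19 * 19  # 361 intersections

# Upper bound: every point is empty, black, or white (includes illegal positions).
upper_bound = 3 ** BOARD_POINTS

# Common rough estimate for atoms in the observable universe (assumption).
atoms_estimate = 10 ** 80

digits = len(str(upper_bound))
exceeds = upper_bound > atoms_estimate

if __name__ == "__main__":
    print(f"3^{BOARD_POINTS} has {digits} digits")
    print("exceeds the 10^80 atom estimate:", exceeds)
```

The comparison only makes sense against the observable universe, which is the point made earlier in the thread.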

@gustavomartinez6892 6 years ago

It has been four days. Can I contact you on Google+, or by mail please? I don't have a Facebook account

@RDJ2 6 years ago

The problem is that a superior intelligence has no reason to share its intelligence with apes. Why on earth would it combine itself with us. We'll be pets in the best case scenario. Which isn't bad, I could live as a cat. Minus the neutering but I'm afraid there's no way out of that once it starts to genetically engineer us to be cuter and less violent.