
Neural Network Learns to Play Snake using Deep Reinforcement Learning 

Jack of Some
Subscribe 30K
26K views

Can an AI play Snake well by only looking at it? In this video I use two separate deep reinforcement learning algorithms to try to answer that question. The video was heavily inspired by and is a spiritual continuation of CodeBullet's excellent "A.I. Learns to play Snake using Deep Q Learning" ( • A.I. Learns to play Sn... )
Twitter: / safijari
Patreon: / jackofsome
SOME OF MY OTHER VIDEOS:
○ Snake Programming Stream: • Chill Music and Coding...
○ Deep RL Stream: • How to Solve a Basic R...
○ Vanilla Q Learning Stream: • How to Solve a Basic R...
○ Explaining RL to a baby: • Baby Learns about Arti...
○ 5 Common Python Mistakes: • 5 Things You're Doing ...
○ Making Python fast: • Can VSCode be a reason...
VIDEOS MENTIONED:
Alex Patrenko's Snake AI: • Advantage Actor-Critic...
OpenAI Hide and Seek: • Multi-Agent Hide and Seek
AlphaGo: • AlphaGo Official Trailer
OpenAI Five (Dota): • OpenAI Five
#deeplearning #machinelearning #ai

Category: Auto/Moto

Published: 21 Aug 2024

Comments: 55
@NikoKun
@NikoKun 4 months ago
I took my own weird route to playing around with this idea. First, I wrote a more classic rules-based bot to play snake, with its own recursive function to check future choices for dead ends. It's not perfect, but it can often play the game well enough to fill half the available space with snake before dying to harder-to-avoid dead ends. I then used that bot to record 10,000 of its highest-scoring games frame by frame, a couple million frames in total, also recording each action it took per frame. Then I fed all that data into a basic neural network and ran a few hundred training epochs. So far I've gotten the neural network to play the game alright, but only about as good as my bot. heh
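The supervised pipeline this comment describes (record a scripted bot's frames and actions, then fit a network to imitate them) is usually called behavioral cloning. A minimal, self-contained NumPy sketch with synthetic stand-ins for the recorded data and a single softmax layer in place of a full network (all shapes, names, and numbers here are illustrative, not taken from the comment or the video):

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_pixels, n_actions = 512, 64, 4        # e.g. flattened 8x8 boards, 4 moves

# Synthetic stand-in for the recorded dataset: frames X, and the bot's
# action per frame y (here generated from a hidden linear "bot policy").
X = rng.standard_normal((n_frames, n_pixels))
bot_policy = rng.standard_normal((n_pixels, n_actions))
y = np.argmax(X @ bot_policy, axis=1)

# One softmax layer trained with plain gradient descent on cross-entropy.
W = np.zeros((n_pixels, n_actions))
for epoch in range(300):                          # "a few hundred training epochs"
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)             # predicted action probabilities
    grad = p.copy()
    grad[np.arange(n_frames), y] -= 1.0           # d(cross-entropy)/d(logits)
    W -= 0.5 * (X.T @ grad) / n_frames            # averaged gradient step

# Fraction of the bot's recorded actions the trained layer reproduces.
accuracy = float((np.argmax(X @ W, axis=1) == y).mean())
```

With real data, `X` would hold the recorded frames (flattened or fed through conv layers) and `y` the bot's action per frame; as the comment notes, a clone trained this way can at best match its teacher.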
@TheSpotPlot
@TheSpotPlot 13 days ago
Thank you for the awesome video! I am creating my own reinforcement learning snake game, and I am able to run it on my phone. I will train my snake while my phone is charging while I sleep. The grid size is quite large, so I will see what happens over time and eventually show my results on YouTube.
@swordriffraff-green6319
@swordriffraff-green6319 4 years ago
Really enjoyed the brief overviews of each of the algorithms used - would watch more of this type of video!
@JackofSome
@JackofSome 4 years ago
Thank you for your kind words. I'm going to be doing more of these. So far this has fared much worse than my other work, but I think it has potential.
@shahzebafroze4093
@shahzebafroze4093 4 years ago
This is awesome!! Looking forward to more videos over the summer!
@martinsosmucnieks8515
@martinsosmucnieks8515 3 years ago
Great video. I'm sad I didn't find this channel earlier!
@lucamehl3109
@lucamehl3109 4 years ago
Awesome video! Subbed!
@KrzysztofDerecki
@KrzysztofDerecki 4 years ago
If I understand you correctly, you are feeding your learning algorithm the full environment data. That's why you end up running out of resources very fast. In my opinion, a general snake agent should not be dependent on board size. To start, you could try using a moving 16x16 window around the snake's head as the environment input, and use it on any board size. Later you can experiment with other additional environment inputs like current snake length, distances to every board edge, etc. I may be wrong, but maybe it's a good hint :) Nice video!
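The fixed-size observation window suggested here is straightforward to implement. One possible sketch, assuming a NumPy board and padding out-of-board cells with a "wall" value (the cell codes are my assumption, not something defined in the comment or video):

```python
import numpy as np

# Assumed cell codes: 0 empty, 1 food, -1 body, 2 wall/out-of-bounds.
WALL = 2

def head_window(board: np.ndarray, head_rc: tuple, size: int = 16) -> np.ndarray:
    """Return a size x size crop centred on the snake's head, padding
    anything outside the board with WALL so the crop never runs off the edge."""
    half = size // 2
    padded = np.pad(board, half, mode="constant", constant_values=WALL)
    r, c = head_rc[0] + half, head_rc[1] + half   # head coords shift by the pad
    return padded[r - half:r + half, c - half:c + half]
```

Because the crop is always 16x16, the same network input works for any board size, and the padded wall cells implicitly encode distance to the nearby edges.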
@sirynka
@sirynka 4 years ago
Also, does it make sense to feed an actual image to the neural network instead of an array filled with [-1, 0, 1] values corresponding to the states of the cells on the board?
@JackofSome
@JackofSome 4 years ago
The way I set up my network, it would immediately scale the image down to the representation you're describing. The overhead here is learning one additional convolutional layer, which isn't that bad, and it keeps the problem, err... "pure", for lack of a better term (since the aim was to learn from pixels).
@tan-uz4oe
@tan-uz4oe 4 years ago
Nice video! Clean and easy to follow explanation :)
@bobingstern4448
@bobingstern4448 3 years ago
Fantastic stuff dude, keep it up!
@gregmarquez8222
@gregmarquez8222 2 months ago
Great video! I'm trying to teach myself neural networks and AI, and one of my future projects will be a "snake"-playing AI based on a NN. I now have experience with basic multi-layer NNs for classifying images, etc. What would you recommend as a "next step" to go from classifiers to a NN that can play snake, and some learning resources for this? I realize that this video is 4 years old, but just in case you are monitoring the comments, I thought I'd ask. Thanks!
@ozmandunn
@ozmandunn 4 years ago
Here from Lex Fridman Podcast Discord!
@NightsAndDays
@NightsAndDays 4 years ago
wow, great video!
@MrCmon113
@MrCmon113 4 years ago
Damn, I wanted to do something like that for my Bachelor Thesis. If your agent can actually learn to play a board of any size in theory, I may go for something else.
@JackofSome
@JackofSome 4 years ago
The video is supposed to inspire, not discourage 😅. There are a lot of cool things you can do with Snake with many different approaches. I encourage you to still go down this route. If you'd like advice, come to the machine learning discord discord.gg/yHh5UwJ
@MrCmon113
@MrCmon113 4 years ago
@@JackofSome It's just that this is the first solution to snake with RL I have seen. ^^ However, I wrote this before watching the entire video. It doesn't seem like everything is said and done about snake after all. I'll check out that discord tomorrow. Thanks. :D
@revimfadli4666
@revimfadli4666 1 year ago
I mean, there are many possible ways you could develop this: using new architectures (such as modifications to LSTM/GRU), or hindsight experience replay without knowing the goal state, etc.
@andreashon
@andreashon 3 years ago
I have to say, the music in this video fits perfectly.
@McMurchie
@McMurchie 2 years ago
What really, really gets on my tits is how all reinforcement learning videos are about the trivial stuff, trying to make it seem hard, like "oooh, how do we train the network?", when the overwhelmingly hard things are how to get the pixel data from the screen, how to format it, how to build the game, etc.
@JackofSome
@JackofSome 2 years ago
That's... that's the easy stuff to me. I did a silent stream prior to these ones where I built all of that in under an hour. There are lots of tutorials out there on how to make games and capture screen data; it really doesn't belong in a discussion of RL algorithms. That said, _some_ aspects are covered in my RL streams.
@-mwolf
@-mwolf 1 year ago
@@JackofSome Could you share the code you used for this video?
@cobuslouw8319
@cobuslouw8319 3 years ago
I have a deep q-learning snake on my channel. It managed to get to a score of 107 on a 14x14 grid. Awesome video! Also interested to see how PPO performs.
@JackofSome
@JackofSome 3 years ago
That is really, really nice. I watched the video and it looks great. Do you have a repo I could look at?
@cobuslouw8319
@cobuslouw8319 3 years ago
@@JackofSome I don't have a nice repo to share right now... I'm busy finishing my master's degree at the moment, but I'm going to put everything in a repo as soon as I'm done. It is a distributed deep q-learning algorithm with prioritized experience replay and n-step updates. The other videos on my channel were trained using the same codebase.
@valentinpopescu98
@valentinpopescu98 3 years ago
Can you explain how you calculated the complexity of the problem at 9:05? As I'm thinking about it, at a 20x20 resolution there are 400 cells. So it has to compute 400 actions depending on the environment, meaning 3^400 states (given a pixel is 0 - no food/no body, 1 - there is food, 2 - there is part of its body), so 400 * (3^400)?
@jeffreyhsiao7938
@jeffreyhsiao7938 1 year ago
I have a similar question.
@lored6811
@lored6811 4 years ago
Beautiful video, are you working in the industry?
@JackofSome
@JackofSome 4 years ago
Thank you so much for watching. I do deep learning and computer vision professionally. Teaching myself deep RL for fun and also because I think it'll be important soon.
@elliotg8403
@elliotg8403 4 years ago
Is there source code for this somewhere? It would be super helpful to have a look at it.
@bobhanger3370
@bobhanger3370 2 years ago
Bruhh, where's the coooode? I am doing literally the same project and hoping it takes < a month.
@luissoares2467
@luissoares2467 2 years ago
Can you share the source code?
@agentkoko3988
@agentkoko3988 4 years ago
Would love to watch more ❤️... Can we feed a real game as the environment? Like, the program takes it as input frame by frame and we define the set of actions, something like that?
@JackofSome
@JackofSome 4 years ago
Yes. OpenAI does that with DotA and DeepMind did it for StarCraft. Both are very impressive projects, though they both had a bit more information about the game than just the frames and input (DeepMind plays by just looking at the image though). There are also some projects people have done playing Mario or the Chrome dinosaur game. The data requirements make all these tasks really daunting though (e.g. OpenAI trained on 5000 CPUs for 8 months...).
@agentkoko3988
@agentkoko3988 4 years ago
@@JackofSome Yes. Something we could do? For example, we feed the game frame by frame, the AI analyses each frame to recognise the text, and then the AI performs actions... actions which reduce the distance between the text and the agent get rewarded. If more information is needed for better results: the agent has its position in the frame/game, and by analysing every frame it's able to get the position of the text. That example was in general terms. Now something you could relate to... a Mario game: there is a demon, and our agent/player is standing still, and the demon is standing still, not moving. The agent performs actions, and the distance is reduced only when the right action is performed. After performing a series of right actions the distance is much reduced, and then the agent fires and the demon gets killed... The goal is to kill the demon... the rewards I have thought of are: reward when the distance gets reduced; reward for the time the agent stays alive (since if it just keeps reducing the distance it will collide with the demon and die); reward for achieving the goal, killing the demon. NOTE: the demon has a shape with "demon" written on it, so the AI recognises the text from the frames and knows there is a demon. Might have missed something... Hope you understand it, and I hope we could do something along these lines. Thank you ❤️
@a7744hsc
@a7744hsc 2 years ago
Great video, many useful thoughts in it! Could you share more details about your model? e.g. how did you design the reward system?
@nickmoen6017
@nickmoen6017 1 year ago
+1 for approaching the food, -1 for moving away from it, +10 for eating food, -100 for dying. It is a CNN. Calculate the deltas for each episode and then update the weights after all 4 snakes die.
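As a concrete illustration, the reward shaping this comment describes can be written as a single function (the numbers are the commenter's; the function name and signature are mine):

```python
def snake_reward(prev_dist: float, new_dist: float, ate: bool, died: bool) -> float:
    """Reward for one step: -100 for dying, +10 for eating,
    otherwise +1 for moving toward the food and -1 for moving away."""
    if died:
        return -100.0
    if ate:
        return 10.0
    return 1.0 if new_dist < prev_dist else -1.0
```

Distance-based shaping like this speeds up early learning, at the risk of the agent circling the food; the large death penalty is what pushes it to avoid trapping itself.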
@v.gedace1519
@v.gedace1519 3 years ago
04:15 - Take a look here for the details about the Hamiltonian cycle and how to solve __any(*)__ snake game perfectly, independently of the playfield size! (any -> Hamiltonian cycles aren't possible for playgrounds with odd width and odd height; see the repository for why.) Video: ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-UI_I6sJXaJw.html Sorry, the video has no sound, no animation, etc.; it just shows the two parts of my solution in action. Repository: github.com/UweR70/Hamiltonian-Cylce-Snake Contains deep but easy-to-understand explanations.
@JackofSome
@JackofSome 3 years ago
I'm familiar with the hamiltonian cycle and the video I mentioned at the start also used a similar solution. This video isn't about solving snake, it's about learning how to solve snake where the input is the game image.
@v.gedace1519
@v.gedace1519 3 years ago
@@JackofSome The comment was meant as a starting point for detailed background info for your viewers.
@marlhex6280
@marlhex6280 1 year ago
I've never watched that game in its final form 😂😂😂
@sabrimahmoud383
@sabrimahmoud383 3 years ago
The link to the code, please!
@arcadesoft294
@arcadesoft294 4 years ago
If you need it to be trained on a powerful computer, I think you should contact sentdex; he released a video where he said he might accept projects from fans to run on his new $30k PC.
@JackofSome
@JackofSome 4 years ago
I did. Haven't heard back.
@tk421allday
@tk421allday 1 year ago
Any chance we could see the code for this?
@sohampatil6539
@sohampatil6539 3 years ago
Why was the input to the neural network 4 frames of 84 by 84? Why 84?
@sohampatil6539
@sohampatil6539 3 years ago
Also, for different sizes of the snake grid, would you change the complexity of the model?
@JackofSome
@JackofSome 3 years ago
I wasn't changing the complexity, but I don't think that should matter. The model I was using was probably more complex than it needed to be. 84 comes from the image size used in the original Atari paper. No real reason to pick that number other than that.
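The 84x84, 4-frame observation format from the original Atari DQN paper amounts to a small preprocessing step. A generic reconstruction, not the video's code (nearest-neighbour resizing stands in for whatever downscaling was actually used):

```python
from collections import deque

import numpy as np

def to_84x84(frame: np.ndarray) -> np.ndarray:
    """Nearest-neighbour resize of a 2D grayscale frame to 84x84."""
    h, w = frame.shape
    rows = np.arange(84) * h // 84   # source row index for each output row
    cols = np.arange(84) * w // 84   # source column index for each output column
    return frame[rows][:, cols]

class FrameStack:
    """Keep the n most recent 84x84 frames as one (n, 84, 84) observation,
    so the network can infer motion (e.g. the snake's direction)."""

    def __init__(self, n: int = 4):
        self.frames = deque(maxlen=n)

    def reset(self, frame: np.ndarray) -> np.ndarray:
        # At episode start, fill the stack with copies of the first frame.
        f = to_84x84(frame)
        for _ in range(self.frames.maxlen):
            self.frames.append(f)
        return np.stack(list(self.frames))

    def step(self, frame: np.ndarray) -> np.ndarray:
        # Each new frame pushes out the oldest one.
        self.frames.append(to_84x84(frame))
        return np.stack(list(self.frames))
```

Stacking frames is what lets a feed-forward network see velocity; a single frame of Snake is ambiguous about which way the head is moving.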
@mamo987
@mamo987 3 years ago
pls more v nice
@los4776
@los4776 1 year ago
HER might have helped
@imkukis5949
@imkukis5949 4 years ago
Did you create the game yourself? And then applied the AI to it?
@JackofSome
@JackofSome 4 years ago
Yes
@simonstrandgaard5503
@simonstrandgaard5503 4 years ago
Nice snake you have there. Interesting to see what approaches you are exploring. I'm working on a snake AI myself and am experimenting with adding obstacles. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-T5xfyswESUQ.html repo here: github.com/neoneye/SwiftSnakeEngine
@JackofSome
@JackofSome 4 years ago
Hey, that looks fantastic. I don't have a Mac or iPhone unfortunately, otherwise I would have tried it out. What method are you using for the autonomous behavior?