
Build a Mario AI Model with Python | Gaming Reinforcement Learning

Nicholas Renotte
260K subscribers
153K views

Teach AI to play Super Mario
In this video you'll learn how to:
Setup a Mario Environment
Preprocess Mario for Applied Reinforcement Learning
Build a Reinforcement Learning model to play Mario
Take a look at the final results
Get the code: github.com/nicknochnack/MarioRL
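Roughly, the pipeline the video builds looks like the sketch below (an outline only, not the exact notebook code; call signatures vary between gym, nes-py and stable-baselines3 versions):

# Outline of the tutorial pipeline: Mario env -> preprocessing -> PPO
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
from gym.wrappers import GrayScaleObservation
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecFrameStack

# 1. Create the game and restrict the controls to a small action set
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

# 2. Preprocess: grayscale frames, wrap in a vectorised env, stack 4 frames
env = GrayScaleObservation(env, keep_dim=True)
env = DummyVecEnv([lambda: env])
env = VecFrameStack(env, 4, channels_order='last')

# 3. Train a PPO agent on the stacked image frames, then save it
model = PPO('CnnPolicy', env, verbose=1)
model.learn(total_timesteps=1000000)
model.save('ppo_mario')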
Links:
Super Mario RL: pypi.org/project/gym-super-ma...
Nes Py: pypi.org/project/nes-py/
OpenAI Gym: gym.openai.com/
PyTorch: pytorch.org/get-started/locally/
PPO Algorithm: stable-baselines3.readthedocs...
Intro to RL Loss: spinningup.openai.com/en/late...
0:00 - Start
0:18 - Introduction
0:44 - Explainer
1:58 - Client Interview 1
2:02 - Animation 1
2:30 - Tutorial Start
3:22 - Setting Up Mario
10:44 - Running the Game
18:26 - Understanding the Mario State and Reward
20:44 - Client Interview 2
21:38 - Preprocessing the Environment
26:22 - Installing the RL Libraries
31:11 - Applying Grayscaling
35:32 - Applying Vectorization
36:56 - Applying Frame Stacking
42:46 - Client Conversation 3
43:05 - Animation 3
44:00 - Importing the PPO Algorithm
47:33 - Setting Up the Training Callback
50:13 - Creating a Mario PPO Model
55:30 - Training the Reinforcement Learning Model
1:02:40 - Client Conversation 4
1:02:56 - Animation 4
1:04:01 - Loading the PPO Model
1:06:10 - Using the AI Model
1:15:56 - Client Conversation 5
1:16:37 - Ending
Oh, and don't forget to connect with me!
LinkedIn: bit.ly/324Epgo
Facebook: bit.ly/3mB1sZD
GitHub: bit.ly/3mDJllD
Patreon: bit.ly/2OCn3UW
Join the Discussion on Discord: bit.ly/3dQiZsV
Happy coding!
Nick
P.s. Let me know how you go and drop a comment if you need a hand!
#ai #python

Science

Published: 22 May 2024

Comments: 383
@Seriosso 2 years ago
Was waiting for such a video for so long! Your videos should be taught in CS classes! Thank you for the great effort!
@doremekarma3873 1 year ago
For people who are getting the following error at 15:14: "not enough values to unpack (expected 5, got 4)", you can fix it by changing the line at 10:44 from
env = gym_super_mario_bros.make('SuperMarioBros-v0')
to
env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
@preston748159263 1 year ago
Now it says "too many values to unpack (expected 4)"
@ManishGupta-mo4gj 1 year ago
Excellent, was struggling for last 2 days. Thanks a ton
@_obegi_ 1 year ago
Or you can do this:
state, reward, done, info, _ = env.step(env.action_space.sample())
@Sirobasa 1 year ago
thanks a lot. what has changed by using the bottom code?
@eruditeboen3860 11 months ago
@@_obegi_ You would have to do both
@amanlohia6399 2 years ago
Your videos are freaking amazing. I'm currently a senior undergrad and will start my career in applied ML next year. Love watching your videos and learning.
@PlacidoYT 2 years ago
Goodluck man!
@NicholasRenotte 2 years ago
Thanks so much Aman! Hoping you smash it in the new job!!
@unixtreme 7 months ago
I hope it’s going well!
@darkwave_2000 2 years ago
Great video again!! I've already learned a lot from your videos. Big thanks for all the effort!! For future videos I would like to see how to advance to the next level with RL. I read something about concept networks, where an RL agent can be trained to do a bunch of dedicated tasks. Here in the Mario game that would be e.g. jump on the enemies, jump over holes, jump on pipes, collect coins... It would be interesting to see how to teach an RL agent some kind of skills instead of just letting the algorithm do it all at once. Something like a two-layer approach where first the needed skills get trained and then, secondly, by using those skills the complete task gets trained.
@geriain7448 2 years ago
I really hope that you will never stop making such great videos! Cheers up man and happy Christmas!
@NicholasRenotte 2 years ago
Thank you so much!!! I’m hoping I can keep the run going!
@geriain7448 2 years ago
@@NicholasRenotte Due to projects at work I would love to see more about audio recognition or voice enhancement. If you need some inspiration for future videos 😅 Wish you all the best!
@johanneszwilling 2 years ago
🤯 Superbly done!! Really appreciate the commenting and explanations. You know your stuff!
@lukedirsuwei664 6 months ago
great explanation! Used this to solve a few environments. one of the best rl explanations i have seen
@anonymousking2053 2 years ago
Thank you so much! I've been asking for this for 2 months! Again, thanks a lot!
@vineetsharma189 2 years ago
Thanks for the amazing videos. Your videos are super informative and entertaining ! What GPU did you use to train this RL Model ?
@offstyler 1 year ago
Maybe you already know this by now, but by wrapping your env in the Monitor Wrapper by SB3 you can get additional info in your Tensorboard log like how much reward your model is collecting and how long the episodes run.
@haguda4096 9 months ago
Can you give me a code example?
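A minimal sketch of the Monitor idea described above, assuming stable-baselines3 is installed; wrap the plain env before vectorising it:

# Wrap the env with SB3's Monitor so episode reward/length show up in the Tensorboard log
from stable_baselines3.common.monitor import Monitor
import gym_super_mario_bros
from nes_py.wrappers import JoypadSpace
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)
env = Monitor(env)  # records episode reward and length; pass a filename to also write a CSV
# ...then continue with GrayScaleObservation, DummyVecEnv and VecFrameStack as in the tutorial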
@shikhargupta6626 2 years ago
Awesome work, just discovered your channel, I wish I get more videos from you!!!!!
@carveR01js 2 years ago
Really great tutorial! I just finished an RL course at school where we built our own DDQN for the gym lunar lander, and stable baselines is my next step into RL.
@NicholasRenotte 2 years ago
Sweet! What did you build the DDQN with?
@carveR01js 2 years ago
@@NicholasRenotte I built it with PyTorch, it worked really well and we trained a model that could land the lunar lander quite well most of the time 😁
@mrCetus 2 years ago
amazing tutorial, tried it and almost beat the first level, gotta keep training the model! will check if there are more games to try this in the future!
@NicholasRenotte 2 years ago
HELL YEAHHHH!!
@xolefray6724 2 years ago
This is freaking awesome! BTW what gpu are you using to train your model that fast?
2 years ago
Never ever missing any of your videos !
@NicholasRenotte 2 years ago
Thanks so much for checking them out man!
@alejo0789 1 year ago
WOW, great stuff, better than any course. Thank you so much, I wish you the best!!!! It would be great if we could learn about UAV drones or AirSim; that would definitely be stunning stuff for all of us. Thanks again.
@adityapunetha6606 2 years ago
We could also simplify the input by maybe removing the background image, flattening the pixels into just black or white instead of the 8-bit range, and maybe replacing the blocks and enemies with just white squares and black squares (this would require a new gym environment). This simpler input might be easier for the agent to understand, and it might learn faster.
@High_Priest_Jonko 1 year ago
How would you do all that though
@Joulespersecond 1 year ago
Even if you did all that, I'm not convinced it would help. If coins are encoded in the same way as boxes then the AI is not going to learn to use them properly (trying to jump on coins, or avoiding them entirely).
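Roughly, the binarisation idea above could be prototyped with a gym ObservationWrapper. This is only an illustration of the concept, not something from the video; the threshold is arbitrary and, as the reply notes, it may not actually help:

import numpy as np
import gym

class BinarizeObservation(gym.ObservationWrapper):
    """Illustrative only: threshold a grayscale frame to black/white."""
    def __init__(self, env, threshold=127):
        super().__init__(env)
        self.threshold = threshold
        self.observation_space = gym.spaces.Box(
            low=0, high=1, shape=env.observation_space.shape, dtype=np.uint8)

    def observation(self, obs):
        # obs is expected to be a grayscale frame (e.g. after GrayScaleObservation)
        return (obs > self.threshold).astype(np.uint8)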
@michaelalthoff1172 2 years ago
Great video! Thank you for this excellent work!
@funduino8322 2 years ago
thank you so much for the video, I really learned a lot.
@ndiritu_nyakiago 9 months ago
Where did you get the resources to teach this tutorial from, Nick? As of August 2023 most of the parts in the video do not work anymore because of version differences and lots of updates, but the content is great. I will be very glad if you can help me.
@KodandocomFaria 2 years ago
This example is so cool. You could create a custom environment to play games, like sentdex did with GTA for driving... Maybe you could do a different game like Clash Royale, where you must analyse health and choose the best card against the enemy. Or maybe Street Fighter V; for fighting games you may need to convert the image with health and power data to numpy so it can be used as an observation for gym... The interesting point here is how to collect that data to create the environment, because from there on you could solve multiple problems beyond games.
@NicholasRenotte 2 years ago
On it man! Going to do a ton more games. This was a first crack to see if people were interested in it. Got Doom coming up
@windrago 1 year ago
Much of this tutorial no longer works (due to updates to ALE and gym). It was well done, but for the purpose of learning it would be great to have a refreshed version - great work nonetheless
@itsbenteller1 9 months ago
Crazy how in just two years everything breaks
@gplgomes 2 years ago
Good work. Perhaps you should make a simpler game, showing the reinforcement algorithm being applied: a hole in a wall at a random height moves towards a horse, which it must jump through. The movement is when to jump and the strength of the jump. Once it's in the air, only gravity is at work. In this case you will have to draw the game and make the animations too.
@NicholasRenotte 2 years ago
Got something coming in this space with PyGame, stay tuned!
@chamberselias1903 2 years ago
Love your videos, was wondering if you could make a tutorial on building an object counting API using TensorFlow and TensorFlow Lite
@FireCulex 10 months ago
Awesome video. I learned so much. Even cropped to 176x144, scaled down 2x, and added Monitor and TimeLimit. Writing livesplits made it easy to use the integration to add flag and big rewards, also score and dying. Couldn't converge after 8M steps though; tried lr 1e-8, 1e-6, 1e-5. Forgot the n_steps. Running hyperparameter tests for a while... want to see if ent_coef will help. Setting deterministic just encouraged Mario to run at the first goomba.
@adityapunetha6606 2 years ago
Love your videos, btw can you point to some resources where i can learn about different RL algorithms like PPO and when to use which one
@NicholasRenotte 2 years ago
Yup, check this out, probably my fav sources: spinningup.openai.com/en/latest/
@henkhbit5748 2 years ago
Super Nicolas, after many youtube training you got your 50K reward 👍🍻
@NicholasRenotte 2 years ago
🙏 hahaha it's getting there right @Henk!
@jnu4355 2 years ago
I watched so many AI tutorial videos and you are the best! Can you also do the Health related AI videos such as genome sequencing or symptom diagnosis? Thank you always!
@NicholasRenotte 2 years ago
Ooooh, a little out of my skillset, have you checked out some of the stuff from data professor? He's a gun in bioinformatics.
@Scizyr 2 years ago
From the papers I've been reading it seems most of the success of RL models is dependent on a deep understanding of the subject environment in order to define proper rewards. I went through your video twice but I can't seem to find the section where you defined your rewards. You mentioned it briefly during the "Using the AI Model" section but I can't find the point it references. Since it's the most important aspect of RL I'm surprised there wasn't an entire section dedicated to it. Thanks for the video I really like the way you set up your workspace.
@NicholasRenotte 2 years ago
Thanks a mil @scizyr, you are a hundred percent right. I glossed over it in this video but very much brought it to the fore in the doom video. This is a more detailed description of the reward function (search for reward on this page): pypi.org/project/gym-super-mario-bros/
@Scizyr 2 years ago
@@NicholasRenotte Thank you for the reply, I haven't gotten to the doom video yet but I will move to that after I feel I have a good handle on gym retro with some SNES games I'm more familiar with. :)
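For quick reference, the gym-super-mario-bros page linked above describes the default reward as roughly r = v + c + d, where v is Mario's change in x position between frames (rewarding moving right), c is the change in the game clock (penalising standing still), and d is a death penalty; the total is clipped to the range [-15, 15]. Check the package docs for the exact values in the version you install.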
@vakurbarut8550 4 months ago
Newer versions of nes_py and gym cause some bugs as of 1/1/2024; here is how I managed to run the code:
1.
JoypadSpace.reset = lambda self, **kwargs: self.env.reset(**kwargs)
env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode='human')
env = JoypadSpace(env, SIMPLE_MOVEMENT)
2.
done = True
for step in range(100000):
    if done:
        env.reset()
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
    env.render()
env.close()
Hope it helps!
@singhamasingh22 3 months ago
Can you share the notebook with me? Because I'm getting an error like SuperMarioBrosEnv.__init__() got an unexpected keyword argument 'apply_api_compatibility'
@ibandr2749 3 months ago
what a hero
@henriquefantato9804 2 years ago
Hey Nick!!! Another awesome video, I just love this integration of AI and games! One thing that is bothering me is the random output for the same input: running the agent on the same level, the model gets different results for each run... Do you know why this happens and how I can change it?
@carsonstevens7280 7 months ago
Setting your random seed when predicting could fix this problem. They are statistical models and will produce slightly different outputs based on the random seed used. For consistent output, use the same random seed. If the game has any randomness to its sprites and locations, this will not fix your problem (if the random seed has been set to be different/changing elsewhere). The input will be different each time you run the level/give the frames to your model, and it will produce a different action
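A minimal sketch of what that looks like with stable-baselines3 (the checkpoint path is hypothetical, and env is assumed to be the preprocessed env from the tutorial); deterministic=True makes predict pick the most likely action instead of sampling:

from stable_baselines3 import PPO

model = PPO.load('./train/best_model_1000000')  # hypothetical path to a saved checkpoint
state = env.reset()
for _ in range(1000):
    action, _ = model.predict(state, deterministic=True)  # greedy action instead of sampling
    state, reward, done, info = env.step(action)
    env.render()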
@unixtreme 7 months ago
@@carsonstevens7280 huh, funny coincidence, just one day ago. This is the way.
@CodingEntrepreneurs 2 years ago
Incredible work once again Nicholas! Looking forward to diving in. Do you have a discord server?
@NicholasRenotte 2 years ago
Yup! Here you go! discord.gg/ZuU5Z5na
@dl7140 2 years ago
Thanks for this video. Wow! PPO is so amazing.
@Ugunark 2 years ago
Great video! What version of Python are you using? I'm having trouble installing nes-py
@NicholasRenotte 2 years ago
Using 3.7.3 for this my guy!
@anuragmukherjee1878 2 years ago
Were you able to solve it?
@benjaminrivera3577 1 year ago
Good news for those with an M1: PyTorch is now optimized for it. It's still in beta so it doesn't take full advantage of the chip, but it runs this faster than before.
@KhangNguyen-jg5ok 2 years ago
Love your video, may I ask about the sources you researched to figure out the adjustment of hyperparameters? I am trying to improve the model, since the 700000 model is not better than the 500000 one or a random one.
@NicholasRenotte 2 years ago
I built a hyperparameter optimisation flow @Khang, I'm going to be teaching it this week!
@MrCaradechancho 1 year ago
Hey, great video. What parameters should we change so that Mario learns faster and we can avoid doing 1 million steps? My PC is slow. Thanks!
@ciaranmalone1700 3 months ago
Just to note, you will need to run !pip install stable-baselines3[extra]==1.3.0 or you will get a render_mode error
@unknownbro9857 3 months ago
Did you get an error at the first command, when installing the Mario nes-py packages? If not, how did you avoid the error?
@lamtungphung-cq5bn 10 months ago
Can someone help me with my bug? Why does my state return a tuple? It shows exactly what it is supposed to return in the video, but inside (). This prevents me from checking the shape, since we cannot check the shape of a tuple. Thanks for helping me!
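A likely (though not certain) cause: newer gym versions return an (observation, info) tuple from reset() instead of just the observation, so a small sketch of the workaround is:

# Newer gym API: reset() returns (obs, info)
state, info = env.reset()
print(state.shape)

# or, equivalently
state = env.reset()[0]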
@Dream4514 2 years ago
Amazing as usual!
@eshwaranugandula6907 1 year ago
Hey Nicholas Renotte, I am getting errors right from the first line of code: ModuleNotFoundError: No module named 'gym_super_mario_bros'. Please help me by solving this error.
@marceloferreira950 1 year ago
Hi! Where is the link that helps to install CUDA? I installed it, but when I run the PPO model it always says that it will run with the CPU. However, with other projects in which I use TensorFlow it detects the GPU and runs the models with the GPU
@ProbeSC2 2 years ago
No offense, but you are absolutely amazing :D Keep up the awesome work.
@meetvardoriya2550 2 years ago
Amazing as always 😍
@reiitenshi 1 year ago
Hi, I have a question: I can't seem to use CUDA even though I've copied the tutorial exactly. I'm running a 3070 and I have the toolkit installed just in case, and the model is still running on the CPU instead of the GPU. Any help would be greatly appreciated
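A quick way to check whether PyTorch can actually see the GPU; if this prints False, the installed build is CPU-only and you need a CUDA-enabled build from the pytorch.org install selector for your CUDA version:

import torch

print(torch.__version__)          # a '+cpu' suffix means a CPU-only build
print(torch.cuda.is_available())  # must be True for GPU training
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))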
@ChristFan868 2 years ago
Awesome stuff!
@ingluissantana 2 years ago
First of all, thanks a lot for this greattttt videoooo!!! 🍻🍻🍻 I have one question to you/all: Do you know if there is a web environment that I can use to run this code? I don’t have a laptop when I’m traveling and I would like to try this code… I tried already with google colab but it is not working.. Thanks a lot in advance!! Cheers!!! 🙌🏼🙌🏼🙌🏼🙌🏼
@lucianosilvapereira1164 6 months ago
I have a question I would like answered: how do I make the bot, after being trained, play for real, reproducing the audio of the interaction?
@januszmewa5229 1 year ago
Hi Nick!!!! Is there any option to contact you? Because I am stuck on two lines of this code :( and I am really close to solving it.
@codewithkarthik7136 2 years ago
Hey Nick, loved the video! Could you make a tutorial on Gym Retro? There are very few videos which explain it properly. Since Gym Retro has over 1000 environments it would be pretty nice to have an introduction video. Cheers!
@NicholasRenotte 2 years ago
Coming this week for street fighter!
@codewithkarthik7136 2 years ago
@@NicholasRenotte I can't wait for the video nich. I'm so excited!!!!
@leosong8091 2 years ago
Amazing video! BTW, (240, 256, 3) should be height x width x channel
@mavencheong 1 year ago
I have a project that uses PPO. What could be the problem when, during the learning stage, PPO is able to perform well and I can see that in simulation, but when I use the saved model and call predict, the result is worse, like it never learned..?
@chintanrout4122 2 years ago
Hello, I chose not to use the callback because it was taking up a lot of space. I am done training, so how should I proceed from here?
@MaNuElxGaLvIg 1 year ago
is there a way to make the n_steps and learning_rate not fixed values?
@piyushbora5056 10 months ago
For some reason, after training the model, when I run the testing code, the game window does not open. I have confirmed that the model is working by printing the moves it makes; however, the rendering is not functioning. Can anyone help? It rendered perfectly in the random moves part.
@_obegi_ 1 year ago
For those that are getting an error saying "expected 5, got 4": change the line where you are taking the step to this:
state, reward, done, info, _ = env.step(env.action_space.sample())
Adding the underscore makes it so the extra return value is ignored, making the unpack error go away
@Sirobasa 1 year ago
still not working. but thx (ValueError: not enough values to unpack (expected 5, got 4))
@Joyboy_1044_ 8 months ago
@@Sirobasa change these two lines and you'll be good:
1) # Setup game
env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
2)
state, reward, terminated, truncated, info = env.step(env.action_space.sample())
done = terminated or truncated  # add this line
@Yupppi 7 months ago
@@Joyboy_1044_ Thanks, this actually made it run. Just got endlessly expanding ConnectionResetError though.
@manKeGames 5 months ago
@@Joyboy_1044_ ty !
@unknownbro9857 3 months ago
Is anyone getting an error from the starting command?!
error: subprocess-exited-with-error
python setup.py bdist_wheel did not run successfully.
Please HELP
@christophermoore7676 2 years ago
Why am I getting "ERROR: Command errored out with exit status 1:" on step 1?
@robkjohnson 2 years ago
It would be a bit different of a project for you, but programming a Rocket League bot would be cool to see. RLBot makes it easy to connect with the game and gather info, some community members have built gym environments for ML training, you can use multiple languages (including Python... obviously), and overall it's a cool project
@NicholasRenotte 2 years ago
On it! Already found an integration we can take a look at!
@robkjohnson 2 years ago
@@NicholasRenotte Sick, I look forward to it! I’m pretty new to reinforcement learning and decided to take on an rlbot which is going… interestingly lol. If you throw in advice on making better reward functions I wouldn’t mind;)
@Leroez09 2 years ago
Is it possible to do this with tensorflow instead of pytorch?
@FirasAbdallahh 1 year ago
How do you train the model after it has already been trained? Say you trained your model for 1 million steps and you want to train it to 5 million, do you need to start over from the beginning?
@Louisljz 2 years ago
Awesome Video
@flashkachannel2756 2 years ago
Cool, let's go Rockstar ⭐
@mrWhite81 2 years ago
The jump and run-fast buttons are on/off buttons, kinda like analogs. Not sure how OpenAI Gym handles that. But great content. 👍
@llonix29 1 year ago
Thanks!!
@user-fd4ph6hx4v 2 years ago
So, it's really cool. It blew my mind. But it would be even cooler to see RL for some strategy game (AoE2/3/4 or SC2 or WC3 etc.): how to create an environment for a big game, how to teach the AI such difficult movement. Is it possible? Anyway, thank you very much for what you're doing.
@michaelyoung3337 2 years ago
Can you pause (or stop/restart) the model learning?
@Yupppi 7 months ago
My favourite part of this was the "NameError" from gym_super_mario_bros and SIMPLE_MOVEMENT and env, even though I installed both gym_super_mario_bros and nes_py before using jupyter-lab, while still including the pip install like in the tutorial. A classic situation of following a coding tutorial and copying every move, but getting errors all over. Until I learned I have to click play on every cell from the beginning, in order. Just never managed to get that screen of Python playing Mario (until I read the comments and tried a couple of fixes). I bet this would be a pretty nice tutorial if every cell wasn't full of red and the left side brackets weren't full of *, with me having no idea if it does anything at all.
@emanuelrouse7076 5 months ago
Thank you, I couldn't figure out how to get the simple movement to work lol
@test_channel_eg 4 months ago
Hello, my friend. Was the AI model trained on your PC or in the cloud?
@unknownbro9857 2 months ago
Can I train the model on one device using a GPU and test it out on another device which has no GPU?
@user-om9hq5bg4g 5 months ago
I want to do this and make this model my final year project, so I wanted to know: does this tutorial still work fine? I am asking because I saw a comment saying that it doesn't work now due to some updates etc. Please reply, Nicholas sir, so that I know the situation. Also, if it does not work fine now, can you please help me build this model according to the updates? It is very important to me. Please let me know.
@Elderloc 1 year ago
What did you have to change? I'm still having a hard time getting it running. Could someone update the code? I can't get it to install Stable Baselines3; the AutoROM license step fails
@sylvainmazet52 2 years ago
Amazing presentation! I'll go and plug PPO in smash bros ultimate and see who wins! (hmmmm, just joking, this seems pretty hard to do) Do you know if some game AI actually do reinforcement learning, or any machine learning, "while playing"? I am thinking of AI bots in 1v1 fighting games.
@NicholasRenotte 2 years ago
I think fundamentally yeah, what Deepmind did with Dota comes to mind!
@pepperpotts3661 1 year ago
Could someone clarify whether this is a Deep Reinforcement Learning implementation or just Reinforcement Learning
@nourinahmedeka9518 1 year ago
In the last client interview, you mentioned that there are a bunch of different algorithms. Can you please tell me what are those algorithms?
@NicholasRenotte 1 year ago
Here's a solid table of them! stable-baselines3.readthedocs.io/en/master/guide/algos.html
@nourinahmedeka9518 1 year ago
@@NicholasRenotte Thank you so much! Can we implement those algorithms to see how they perform for the Mario game? Or is there already published work out there?
@aakruti8232 2 years ago
The project won't have significant differences for other games too, right? For example Contra
@NicholasRenotte 2 years ago
Hmmm, it likely will. You'll need a different reward function and gym environment for each.
@romaricovargas2657 1 year ago
Hi Nicholas, Your video tutorial is awesome! I would just like to know how I can run this in colab, it seems that the env.render() is not working in colab. Thanks!
@Bhaveshyoutube 1 year ago
Same problem here!!! Have you got a solution?
@tomzhao6113 9 months ago
Same Problem here!!!
@MrSurya-hn7pu 6 months ago
It will not work in colab. Run it in your local jupyter.
@eonryan8491 8 months ago
36:13 - where the trouble begins
46:06 - train and logging callback (optional)
@rakeshchaudhary1918 2 years ago
I used PyTorch with CUDA 11.3 but my AI learns with the CPU; CUDA is not working
@Icekeytrust 2 years ago
Hey, I really like this video :) Is it possible to continue training from a certain step? So can I, say, continue from step 100,000 or so :)
@NicholasRenotte 2 years ago
Yep, I don't think it preloads training schedules but you can reload weights using PPO.load then train from there!
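A minimal sketch of that, assuming a checkpoint saved by the training callback (the path is hypothetical); reset_num_timesteps=False keeps the logged step count running instead of restarting at zero:

from stable_baselines3 import PPO

# Reload a saved checkpoint and keep training from there
model = PPO.load('./train/best_model_100000', env=env)  # hypothetical checkpoint path
model.learn(total_timesteps=1000000, reset_num_timesteps=False)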
@mijkolsmith 1 year ago
So, do you need to install CUDA before this program will work? I don't see a link in the description on how to install it, pytorch seems to be installing fine
@mijkolsmith 1 year ago
Also could you make/recommend a video where we write an algorithm like ppo to train the AI?
@JhonnyAbouRjeily 2 years ago
nicholas the models are not in the github can u plz upload the whole project
@hsuan.kung. 2 years ago
Amazing tutorial! It is really straightforward to follow. If you could add the intrinsic reward to the agent, then it could learn even much better!
@NicholasRenotte 2 years ago
Hmmm, good suggestion. I've been learning about reward shaping this week for an upcoming project, agreed that would've helped a ton.
@bboyflamer 1 year ago
The order of preprocessing matters. Which is, in your opinion, the most efficient way to process the frames? SkipFrame, then GrayScale, then ResizeObservation?

def add_wrapper_functionality():
    # 1. CREATE THE MARIO ENV
    mario_env = gym_super_mario_bros.make("SuperMarioBros-1-1-v0")
    # 2. SIMPLIFY THE CONTROLS
    mario_env = JoypadSpace(mario_env, SIMPLE_MOVEMENT)
    # 3. SKIP FRAMES AND TAKE ACTION EVERY N FRAMES
    mario_env = SkipFrame(mario_env, skip=4)
    # 4. TRANSFORM OBSERVATIONS INTO GRAYSCALE
    mario_env = GrayScaleObservation(mario_env)
    # 5. RESIZE OBSERVATIONS TO REDUCE DIMENSIONALITY
    mario_env = ResizeObservation(mario_env, shape=84)
    # 6. NORMALIZE OBSERVATIONS
    mario_env = TransformObservation(mario_env, f=lambda x: x / 255.)
    # 7. STACK N FRAMES TO INTRODUCE TEMPORAL ASPECT
    mario_env = FrameStack(mario_env, num_stack=4)
    return mario_env

mario_env = add_wrapper_functionality()
@danialgholami4138 2 years ago
Hi Nick, I want to study object tracking and detection. Can you recommend a book? Plz reply to my message
@FuZZbaLLbee 2 years ago
Are you using anaconda, or does the gym now work on windows as well? I remember the Atari gym didn’t work on windows
@NicholasRenotte 2 years ago
Works on windows now! I'm using Windows 10 on this machine.
@cyborgx1156 2 years ago
Amazing. Can we use reinforcement learning for an agent which can play strategy game like clash royale
@NicholasRenotte 2 years ago
Oooh yeah, would need an alternate reward function but highly likely!
@rakeshchaudhary1918 2 years ago
i got an error installing nes-py. Could you please help me sir?
@nostradamus9132 2 years ago
You should implement a simple baseline that spams jumping, to see if that is everything the model learned.
@NicholasRenotte 2 years ago
😂 ngl I tried that to see if training for five days was actually worth it. Looking back there's more stuff I would do to improve performance, been getting back into Deep RL this month. A bunch of things we could use to improve outside of just training for longer.
@nostradamus9132 2 years ago
@@NicholasRenotte what was the result of the baseline test?
@nostradamus9132 2 years ago
@@NicholasRenotte I did a bit of RL myself, it can be quite hard. You should focus on presenting the RL agent with very well preprocessed data. Everything you can simply code should be implemented so that the model does not have to learn that much. I would also recommend trying deep Q-learning with a dense NN to predict the Q value. If you are feeling fancy you could go for an actor-critic model. Also, you should punish your agent for unnecessary jumps so that it is forced to use jumps more smartly. You should also modify your reward function further, and it is important to start in an easy environment so that the agent learns fundamental reactions first.
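For comparison, a throwaway jump-spam baseline like the one suggested above might look something like this; the action index is an assumption (check the SIMPLE_MOVEMENT list), and env is assumed to be the vectorised env from the tutorial:

RIGHT_JUMP = 2  # assumed index of ['right', 'A'] in SIMPLE_MOVEMENT; verify against your action list

state = env.reset()
total_reward = 0
for _ in range(5000):
    # always run right and jump; vectorised envs expect a list/array of actions
    state, reward, done, info = env.step([RIGHT_JUMP])
    total_reward += reward
print(total_reward)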
@DK-1907 10 months ago
After all the preprocessing steps, when I try to run env.reset() it throws an error: TypeError: reset() got an unexpected keyword argument 'seed'. Can anyone help me rectify it, as I can't proceed without fixing it
@JohnsonBonson 9 months ago
did you end up figuring it out?
@horrorstargames7592 6 months ago
@@JohnsonBonson doesn't work anymore
@JohnsonBonson 6 months ago
@@horrorstargames7592 took a while but ended up getting it working
@nilakshkashyap8936 6 months ago
@@JohnsonBonson hey did you figure it out. Could you please help. Stuck with the same problem.
@shreyasingh3410 1 month ago
same
@charleslc1853 2 years ago
that's cool
@phptest2529 1 year ago
Bro, please provide the 4M-step trained file, because I need it to run and I don't have a high-spec device, so it is taking a long time. I have trained a 1M model but it took 5 days because of my low-end device, and I want to complete the first level, so I need the 4M file
@yaswanths5382 2 months ago
hey nick, can i do this with street fighter game please reply
@somewhere8 2 years ago
very good !!!
@mauzeking6661 2 years ago
I wish I could find where you set up your environment to this point so that the setup steps would be easier
@NicholasRenotte 2 years ago
These are the commands I normally run:
python -m venv mariorl  # Creates a new environment
.\mariorl\Scripts\activate  # Activates it
python -m pip install pip ipykernel --upgrade  # Upgrades pip and installs ipykernel so we can add the env to jupyter
python -m ipykernel install --name mariorl  # Adds the mariorl env to jupyter
jupyter notebook  # Starts jupyter, can also run jupyter lab
@AdvancedNoob1908 4 months ago
A couple of things were not working for me; hopefully this will help others. It's 2023 so some of the API libraries etc. are behaving a bit differently, but it's a nice tutorial nonetheless. My changes are as follows:
1. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-2eeYqJ0uBKE.html - had to use this instead: env = gym_super_mario_bros.make('SuperMarioBros-v0', apply_api_compatibility=True, render_mode="human")
2. ru-vid.com/video/%D0%B2%D0%B8%D0%B4%D0%B5%D0%BE-2eeYqJ0uBKE.html - state.shape was not working, but state[0].shape did work
@singhamasingh22 3 months ago
Can you share the notebook with me? Because I'm getting an error like SuperMarioBrosEnv.__init__() got an unexpected keyword argument 'apply_api_compatibility'
@donaldduck2852 3 months ago
which software do you use for typing in the code
@donaldduck2852 3 months ago
?
@davidg16able 2 years ago
Awesome video! I trained the model using a GPU instance in the cloud, but when I try to run it on my local machine I'm getting the following error: AttributeError: Can't get attribute 'RandomNumberGenerator._generator_ctor' on
@NicholasRenotte 2 years ago
Possibly try upgrading the version of gym or ensure they're running both on the same version @David, check this out: github.com/MultiAgentLearning/playground/issues/200
@canpolarbearfly1833 1 year ago
I suggest putting the exact Python version + all the library versions you used for tutorial videos like this, which would make your video stay relevant for a very long time, as a new learner doesn't necessarily care about using the latest library but just wants to learn how things work. I noticed you did it for most of the libraries but left out the nes-py package's version, which caused the source code you posted to no longer work without some modifications.
@IcyClench 1 year ago
I believe this was done with gym 0.21.0 - that's the version that was out when the video was released + it worked for me
@User-ud5sz 1 year ago
Can you please let me know the modifications you made, since I encountered a lot of issues as well?
@canpolarbearfly1833 1 year ago
You can try this:
python 3.7
gym_super_mario_bros==7.4.0
nes_py==8.2.1
stable-baselines3[extra]==1.6.2
@User-ud5sz 1 year ago
Hello, after your recommendations I unfortunately still get the error, which is a ValueError; more specifically it says: not enough values to unpack (expected 5, got 4).
@canpolarbearfly1833 1 year ago
That error message means you are trying to assign 5 values, but the function only returns 4. For example, the code below will raise the error:

def foo():
    return 1, 2, 3, 4

a, b, c, d, e = foo()

If I remember correctly, env.step in older gym versions returns a 4-tuple, while in newer versions it returns a 5-tuple. I guess you were probably doing something like:
observation, reward, terminated, truncated, info = env.step(action)
instead of:
observation, reward, done, info = env.step(action)