Training an AI for WIPEOUT (MLAgents Unity Reinforcement Learning)

Alexandre Sajus

Views: 2.2K

Science

Published: Jul 26, 2024

Comments: 27

@AirpowerNow 1 year ago

Is your code open source?

@alexandresajus 1 year ago

Yes! I pushed the code on GitHub here: github.com/AlexandreSajus/Total-Wipeout-AI

@DeadRabbitCanDance 7 months ago

@alexandresajus Big thanks for the excellent tutorial!

@fotiskapotos 1 year ago

Great work! I would love to see a longer-form video with a more in-depth exploration of how you make your environment and how you create your agent. Subscribed!

@RobTheDon1234 1 year ago

I could see you going BIG if you do more vids like this, keep it up Brodie 👍🏾

@alexandresajus 1 year ago

Thanks a lot!

@daryladhityahenry 11 days ago

Wow nice!! I notice the walk is much better than almost any other I could find. Are you using imitation learning for that? Or pure reward and punishment? If it's pure reinforcement learning, I wonder how you achieved such a good walking motion. Great vid!!!

@alexandresajus 11 days ago

Thanks! It's just reward and punishment here. The walk comes as default when you use the Walker agent from MLAgents in Unity, so it is a good starting point.

@daryladhityahenry 11 days ago

@alexandresajus I see. Thanks for the info :). Love seeing things like this :D

@NxBy 4 months ago

Please make more videos like this

@alexandresajus 4 months ago

I would love to! This has been my dream project for 3 years. The problem is that these projects take a lot of work and frustration to make. Also, this was, weirdly, my worst-performing video.

@SlimsyBeetle 1 year ago

Very cool

@abisheksunil 1 year ago

Awesome! Love the humor that you put in there, and the agent converges pretty well. Is it PPO or SAC? Would definitely be up for more (some ideas: CNN-based RL or even better Multi-agent RL with similar complexity).

@alexandresajus 1 year ago

Thanks a lot! I trained with PPO. Yes, I definitely want to do more RL projects using Unity. I'll find something...

@Nebulaoblivion 1 year ago

Pretty cool!

@bhallos_baus 4 months ago

hey bro awesome work man..

@alexandresajus 4 months ago

Thanks!!!

@sergche3718 6 months ago

Wow nice. 40 muscles and still it learns. It's also funny how it developed a behavior with one main pushing leg and one supporting leg. Yes, it's limping, but still. What would be the strategy to make it aware of the moving obstacles? Right now it just tries to run fast, I suppose?

@alexandresajus 6 months ago

Yeah, this was an enjoyable project to look at. The moving obstacles were challenging to work with: on the swing, the AI does understand how to jump from the edges of the platforms to have a better chance; on the sweeper, it just learned to run as fast as possible, which could be better. I need a way to teach the agents timing: they should stop before the obstacle and wait for the right moment before moving forward. One way to do this is to add two inputs: how close the agent is to the start of the obstacle, and where the obstacle is in its movement cycle. This information would allow the agent to understand that some timings are better than others. That's one solution. There have got to be better ones...
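
The two extra inputs described above could be computed roughly as follows. This is only a sketch of the arithmetic, not code from the project: it assumes the obstacle moves periodically with a known period, that positions are measured along a single track axis, and that `max_distance` is a hypothetical normalization range. In an ML-Agents project, values like these would typically be added in the agent's C# `CollectObservations` method; Python is used here just to illustrate the idea.

```python
import math


def obstacle_timing_observations(agent_pos, obstacle_start, elapsed_time,
                                 period, max_distance=10.0):
    """Return [normalized distance, sin(phase), cos(phase)].

    All parameter names are hypothetical; they are not taken from the project.
    """
    # Signed distance from the agent to the start of the obstacle,
    # scaled by max_distance and clipped to [-1, 1].
    distance = (obstacle_start - agent_pos) / max_distance
    distance = max(-1.0, min(1.0, distance))

    # Fraction of the movement cycle already completed, mapped to an angle.
    phase = 2.0 * math.pi * ((elapsed_time % period) / period)

    # Encoding the phase as (sin, cos) avoids a jump in the input
    # when the cycle wraps around from its end back to its start.
    return [distance, math.sin(phase), math.cos(phase)]


# Example: the agent is 5 units before the obstacle, which is a quarter
# of the way through a 4-second cycle.
obs = obstacle_timing_observations(agent_pos=2.0, obstacle_start=7.0,
                                   elapsed_time=1.0, period=4.0)
```

With these three numbers in its observation vector, the policy at least has the information it needs to tell a good moment to cross from a bad one; whether it actually learns to wait still depends on the reward design.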

@Diego0wnz 10 months ago

very cool video! I like RL, what/where did you study?

@alexandresajus 10 months ago

Thanks! I did a Master of Engineering at CentraleSupélec in Paris, but the AI classes there were too theoretical. I learned RL primarily through the AI club we had at that Uni.

@keyhaven8151 1 month ago

I have always had a question about mlagents: they randomly select actions at the beginning of training. Can we incorporate human intervention into the training process of mlagents to make them train faster? Is there a corresponding method in mlagents? Looking forward to your answer.

@alexandresajus 1 month ago

Excellent question. What is commonly done to choose human actions instead of random ones at the beginning of training is called "Imitation learning." MLAgents does provide documentation on imitation learning, but I have never explored it, and it is probably complex to implement: github.com/gzrjzcx/ML-agents/blob/master/docs/Training-Imitation-Learning.md
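
As a rough illustration of the idea behind imitation learning (and not of how MLAgents implements it), the simplest variant, behavioral cloning, treats recorded human (observation, action) pairs as a supervised dataset and fits the policy to them before or alongside reinforcement learning. The sketch below assumes a toy setting with one scalar observation and two actions; the demonstration data and all names are hypothetical.

```python
import math


def train_behavioral_cloning(demos, epochs=500, lr=0.5):
    """Fit P(action=1 | obs) = sigmoid(w * obs + b) to demonstration pairs."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for obs, action in demos:
            p = 1.0 / (1.0 + math.exp(-(w * obs + b)))
            # Gradient of the cross-entropy loss for a single sample.
            error = p - action
            w -= lr * error * obs
            b -= lr * error
    return w, b


def act(w, b, obs):
    """Pick the action the cloned policy considers more likely."""
    return 1 if w * obs + b > 0 else 0


# Hypothetical demonstrations: the human jumps (1) when the gap is
# close (small obs) and keeps running (0) when it is far away.
demos = [(0.1, 1), (0.2, 1), (0.3, 1), (0.4, 1),
         (0.6, 0), (0.7, 0), (0.8, 0), (0.9, 0)]
w, b = train_behavioral_cloning(demos)
```

A policy initialized this way starts from the human's behavior instead of random actions, and reinforcement learning can then refine it.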

@keyhaven8151 1 month ago

@alexandresajus Thank you very much for your answer. I looked at the link you sent and found that it covers an old version of mlagents, which differs in several settings. For example, the new version does not have Brain or the Academy's Broadcast Hub. So, what should we do in the new version? Thank you for your answer!

@alexandresajus 1 month ago

@keyhaven8151 To be honest, I don't know, since I never tried imitation learning. Try looking up "imitation learning mlagents" online; I'm sure there are tutorials. Or use the older version of MLAgents.

@keyhaven8151 1 month ago

@alexandresajus Thank you very much for your answer. I will try to find a solution! Thank you!