Unity ML-Agents 1.0+ - Self Play explained

Views: 10K

ML-Agents in Unity implements self-play, a mechanism for an agent to play against past versions of itself. This expands the possibilities of reinforcement learning, as the environment and rewards always scale with the agent's abilities. I really believe self-play is one of the key ideas in machine learning / artificial intelligence, and I hope this video gives you some insight into the idea.
Github-Link: github.com/Sebastian-Schuchma...
I have created a Discord Channel for everybody wanting to learn ML-Agents. It's a place where we can help each other out, ask questions, share ideas, and so on. You can join here: / discord
Support me on Patreon: www.patreon.com/user?u=25285137
Keep in touch (Twitter): / sebastianschuc7
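
To make the idea of "playing against past versions of itself" from the description concrete, here is a minimal Python sketch of an opponent snapshot pool. It is an illustration under assumptions, not the ML-Agents implementation: `SnapshotPool`, `run_schedule`, and the parameters `window`, `latest_ratio`, `save_steps`, and `swap_steps` are hypothetical names chosen for this example, and an integer step count stands in for real policy weights.

```python
import random
from collections import deque


class SnapshotPool:
    """Hypothetical pool holding the most recent `window` policy snapshots."""

    def __init__(self, window=3, latest_ratio=0.5, seed=0):
        self.snapshots = deque(maxlen=window)  # oldest snapshots fall out automatically
        self.latest_ratio = latest_ratio       # chance of facing the current policy
        self.rng = random.Random(seed)

    def save(self, policy_version):
        """Store a frozen copy of the learning policy (here: just a version id)."""
        self.snapshots.append(policy_version)

    def sample_opponent(self, current_version):
        """Pick the opponent: the current policy or a random past snapshot."""
        if not self.snapshots or self.rng.random() < self.latest_ratio:
            return current_version
        return self.rng.choice(list(self.snapshots))


def run_schedule(total_steps=100, save_steps=20, swap_steps=10):
    """Simulate when snapshots are saved and when the opponent is swapped."""
    pool = SnapshotPool()
    opponents = []
    for step in range(1, total_steps + 1):
        current = step  # stand-in for the learner's weights at this step
        if step % save_steps == 0:
            pool.save(current)
        if step % swap_steps == 0:
            opponents.append(pool.sample_opponent(current))
    return pool, opponents


if __name__ == "__main__":
    pool, opponents = run_schedule()
    print("snapshots kept:", list(pool.snapshots))
    print("opponent versions faced:", opponents)
```

Because the opponent is always the learner itself or a recent copy of it, the difficulty of the environment tracks the agent's own skill, which is the scaling effect the description refers to.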

Science

Published: Aug 21, 2020

Comments: 23

@epjm_ 3 years ago

Great tutorial! Thank you so much for all the ML-Agents videos; they've really been helping me out. I just joined the Discord too, so I can't wait to see all the new things about ML-Agents I'm going to learn there.

@Prokage 3 years ago

You are contributing so much to my Master's thesis, thank you so much. This is brilliant work. Your teaching ability is awesome!

@SebastianSchuchmannAI 3 years ago

Wow, thank you!

@nuttaphoomboonmee2163 3 years ago

I've been trying to find your YouTube channel for hours... remembering your channel name is as hard as understanding your videos, but I DO LOVE IT

@BramOuwerkerk 3 years ago

Great video! I'll definitely check the project and Discord Server out

@ziquaftynny9285 3 years ago

I was already subscribed :)

@alexanderyau6347 3 years ago

Good explanation

@leo8755 2 years ago

Awesome video! Could you please elaborate on the policies? How are they handled and used? If there are 2 agents and each has a policy, what is the purpose and how do they use the other policies in the stack? Thanks!

@robosergTV 3 years ago

very cool

@ForsmanTomas 3 years ago

The reason win-or-lose rewards don't work has been on my mind since I saw WarGames when I was 9. If you want the AI to understand that a draw is preferred, there has to be a reward for that. As you point out, being the player who forces a draw is only rewarding if you go second. In WarGames the AI learns that it's more rewarding not to play, even though there is no reward for that or, AFAIK, even an option for that. I figured a time reward (for not losing for x amount of time) could make that happen, since that was the driving factor behind the Cold War. It would mean that taking time to make a move is rewarding if a loss is expected to be likely. In reality, human players know that tic-tac-toe should end in a draw, so winning or drawing is the same; the goal is only to not lose.

@keyhaven8151 1 month ago

I have always had a question about ML-Agents: agents select actions randomly at the beginning of training. Can we incorporate human intervention into the ML-Agents training process to make them train faster? Is there a corresponding method in ML-Agents? Looking forward to your answer.

@lunli4435 2 years ago

Thanks for your wonderful work. Where can I find the detailed training code?

@adelAKAdude 1 year ago

Great video and great project, I really enjoyed going through your project, very insightful. I need some help: I have been training my agent for almost a month now (as in stopping training and trying something else), and I can't seem to get to the max level. The best agent I trained simply stops me from winning but rarely goes for the win. For example, if it has two in a row and you have two in a row, it blocks you instead of winning. I get to this point within the first 700k or so steps; if I train longer, the agent goes nuts and is no longer an AI but more of an "I play 1, then 2, then 3" player. I would appreciate it if you could give me some advice or share anything you have done under the hood with your agents. You also mentioned in the video that your AI isn't perfect. Any idea why? Can't I reach a perfect player with self-play?

@nikhilsharma2236 3 years ago

Hello, can I use Unreal as well for training my models, or is Unity the only option, since everyone is using it? Please do tell me.

@MZXD 2 years ago

Interesting. So how can this apply to even more AIs? For example, having 6 agents all racing each other. For training, would you suggest just having a 1v1, so 2 cars racing each other, or a 3v3, even though the goal of the game is for an individual to win and not for a team to win?

@philip9611 3 years ago

Hi man, appreciate the tutorials! Just wondering though, why was your starting Elo 1200? Does this affect training in any way?

@philip9611 3 years ago

Also, how much of a jump in Elo is good?
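
For readers following the Elo questions above, a small Python sketch of the standard Elo update may help. It is only an illustration: the function names and the K-factor of 16 are assumptions made for this example, and nothing here is taken from the video or from ML-Agents internals.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of player A against player B (0..1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=16.0):
    """Return both new ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k (the K-factor) is an assumed value chosen for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated players starting at 1200; A wins.
    print(update_elo(1200, 1200, 1.0))
    # A 400-point favorite is expected to score about 0.91 per game.
    print(round(expected_score(1600, 1200), 3))
```

In this formula only rating differences matter, so the starting value acts as an arbitrary baseline, and a 400-point lead corresponds to an expected score of roughly 0.91.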

@mikailkotan5246 3 years ago

Hi, I have done what you said, but when I try to train I see this error. Please help me :) WARNING [trainer.py:240] Your environment contains multiple teams, but PPOTrainer doesn't support adversarial games. Enable self-play to train adversarial games.

@yyyrational 3 years ago

I have a general question here... Do you use C# or Python in Unity? Sorry, I am very new to the topic.

@skinnyboystudios9722 3 years ago

How many RTX 3090s do you need to train your own AlphaZero?

@CGoni 3 years ago

Hello, thank you for this useful video. I am a subscriber who wants to apply ML-Agents to a game. I am wondering if there is any way, or an example, of applying learning inside the game itself.

@Norbingel 3 years ago

By this, do you mean in-game AI learning and improving? Because I've been wondering about this as well. Is ML something you do outside of the formal game build, so that once you have the build the AI's learning is fixed, or is it something the AI can keep developing within the game itself?