Can A.I. finish this track without crashing?

Yosh
149K subscribers
174K views

A.I. learns to drive in Trackmania with the NEAT algorithm, but it is not allowed to hit walls!
Contact:
Discord - yosh_tm
Twitter - / yoshtm1
Music: • Neon.Deflector - Star ...

Games

Published: 8 Jan 2021
Comments: 272
@Tepalus 3 years ago
Top guy in the 40th and 41st generations be like: "C'mon bro, just 0.001 more seconds"
@flyingpugs3678 3 years ago
That's me all the way when I play the game. Only I'm just trying to get silver
@flyingpugs3678 3 years ago
(In Trackmania)
@winnerwannabe9868 9 months ago
Speedrunners be like:
@hatrian 3 years ago
I just finished watching the previous video and this one came out 4 minutes ago lol
@FelipeFerreira-wk5nv 3 years ago
19 minutes here, the RU-vid algorithm was probably doing some work
@Capa2. 3 years ago
30 mins here
@texasranger7687 3 years ago
A full day for me... I had to wait a long time...
@wsollers1 3 years ago
Ditto
@Symphony_ 3 years ago
20 hrs for me, still damn quick
@JZStudiosonline 3 years ago
It looks like they're wiggling like a hammerhead shark to try and see what's ahead of them. Maybe a change in the detection rays could fix that.
@Kram1032 3 years ago
Nope, that's them being mortally scared of the wall. OMG THE WALL IS COMING CLOSER I'M ABOUT TO DIE NOOO. It's caused by how severe the penalty for crashing is.
@2ARM2 3 years ago
Yeah, I think it's because they move away from one wall, then see another wall and turn away from that, and it repeats.
@floriel1 3 years ago
To get a record time, you need to plan your course much further ahead than you can see, so not having any knowledge of the track is a severe disadvantage. Humans will look at a map and instantly memorize it to some degree, and even without a map we memorize the track better and better with each run.

To my understanding, the network did not get any kind of map input. And unless it has a position input, or at least some way of measuring elapsed time, it also has no way to memorize the track across repeated runs. If those assumptions are correct, then although the track stays the same, to the network it's a new track each run. That would mean the only advantage of reusing the same track is comparable section times, at the huge disadvantage of overfitting the net to this one track. For rendering purposes you can of course use one specific map on which you "benchmark" every n-th generation, but I wouldn't use those rendering runs for training, again to prevent overfitting.

If my assumptions are at least partially wrong and the network has some means of memorizing a track, and if you only care about record times on one single track, completely ignoring possible bad performance on other tracks, then overfitting to the quirks of the map is not a problem at all but the actual goal. But if you do care about general performance on any random map, I'd suggest giving the network some kind of map and either position- or time-tracking capability. If the map input is too good, you might accidentally create an entirely different network that works almost exclusively on map data and simply approximates the fastest trajectory for a given map; but if you limit the accuracy of the map information enough, this shouldn't be a problem.

A safe bet would be to limit this "map" to nothing more than a sequence of turns and straights, like "lslrslsrs", which I'd say is about as much as humans will memorize from glancing at a map. If that proves too little information to be actually useful to the network, you can add more low-res information like turn angle in 45° steps, turn radius in 5 m steps, and distance between turns in 10 m steps. If that still doesn't prove useful, try a normal map/path, but with resolution limited to one node every x meters.
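The coarse track encoding described above is easy to sketch. Everything here (function names, the segment tuple format) is invented for illustration; only the 45°/5 m/10 m quantization steps come from the comment:

```python
def quantize(value, step):
    # Snap a measurement to a coarse grid so the network can't
    # overfit to fine detail it wouldn't "remember" anyway.
    return step * round(value / step)

def encode_track(segments, detailed=False):
    """segments: list of (kind, angle_deg, radius_m, gap_m) tuples,
    with kind in {'l', 'r', 's'} for left turn / right turn / straight.
    By default returns just the turn sequence, e.g. 'lsr'."""
    if not detailed:
        return "".join(kind for kind, *_ in segments)
    # "Low res" variant: angle in 45-degree steps, radius in 5 m
    # steps, distance between turns in 10 m steps.
    return [(kind, quantize(a, 45), quantize(r, 5), quantize(g, 10))
            for kind, a, r, g in segments]
```

The coarse string is roughly what a human might memorize from glancing at a map; the detailed list is the next fallback level the comment proposes.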
@Duck1en 2 years ago
It's because he is rewarding the distance driven.
@SergejKolmogorov 3 years ago
Next time, try ranking the AIs with a KPI for whichever one makes the fewest turning motions before the end of the track. That would approach the ideal line.
@odinlindeberg4624 3 years ago
Some sort of small penalty each time the car starts turning, maybe?
@satibel 3 years ago
Maybe use the shortest length driven? I mean something like adding a malus that looks like (length driven / progression) - minimum_of_all_cars(length driven / progression). Cars that turn less will naturally have a shorter driven distance per unit of progression, and this also rewards tighter corners.
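satibel's malus can be written out directly. A minimal sketch with made-up data structures (each car just needs the distance it actually drove and its progress along the track):

```python
def detour_malus(cars):
    """Implements (length driven / progression) minus the minimum of
    that ratio over all cars.  The straightest driver gets malus 0;
    everyone else is penalized for extra weaving distance."""
    ratios = [c["driven"] / c["progress"] for c in cars]
    best = min(ratios)  # the straightest car sets the baseline
    return [r - best for r in ratios]
```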
@TheDjcarter1966 3 years ago
Exactly my thought: either fewest steering motions or total distance. Right now you can see cars going back and forth a lot, trying to stay away from the walls.
@Kram1032 3 years ago
@@TheDjcarter1966 Such an extra penalty would definitely help, but the weaving is really caused by the AI being *overly* scared of the walls. This happens because even the mildest touch means the AI is dead. It'd help the AI to simply smooth out that objective: make hitting the wall a penalty but not an end to the run, and perhaps add the angle you hit the wall with into your penalty equation (so if you're going completely parallel to the wall there would barely be any penalty, but if you collide head-on it's an extreme penalty). Right now, there's literally nothing worse than hitting the wall. It's basically infinitely bad, so there is insane pressure to avoid it; you really want to maximize your distance from all walls, effectively asking the AI to drive in the center of the lane. If it were angle-based, it'd instead simply ask the AI to drive more parallel to the wall, not necessarily right in the center of the lane. That said, constraints of the sort "minimize input fluctuations" can work quite well in practice afaik: turn as little as possible, and also minimize the number of times you change stepping on the gas.
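The "minimize input fluctuations" idea at the end can be scored very simply. A sketch under assumed conventions (per-frame tuples of discrete key states; the 0.01 weight is arbitrary):

```python
def input_change_penalty(frames, weight=0.01):
    """frames: per-frame (steer, gas) tuples, steer in {-1, 0, 1},
    gas in {0, 1}.  Counts every frame where any input flips and
    scales it; subtract this from fitness to discourage weaving."""
    changes = sum(a != b for a, b in zip(frames, frames[1:]))
    return weight * changes
```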
@aBucketOfPuppies 3 years ago
@@Kram1032 I liked a lot of the ideas in your comment. I just wanted to add some context as a fan of the game being played. In Trackmania, the walls are one of the biggest enemies of a decent time. You don't bounce off the walls at all; on the contrary, your car just grinds against the railing, slowing you down instantaneously. I'm not saying any of your ideas were wrong, because I really liked all of them; I just wanted to give some background into why the AI was probably programmed the way you see in the video. Tldr: hitting walls is really bad and ends most runs in this game.
@DenisShiryaev 3 years ago
Thank you for sharing, and for the other videos in this series too. I think it would be interesting to see how the algorithm performs if you randomly spawn the "start" at a different position on the map each generation; should be fun, I believe 🙃
@Kram1032 3 years ago
This is a good idea for the sake of diversity and robustness, but the difficult part is how to combine fitnesses correctly. Some AIs automatically have it WAY harder and so should technically be rewarded more if they manage at *all*, vs. AIs that are spawned right in front of the goal and just have to drive forward for a single frame to win. It's possible to account for this, but it might be difficult. (The easiest way is to simply compare each AI over multiple runs, taking the average or median performance to get a more reasonable comparison, but that's time-consuming.)
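The "average or median over multiple runs" fix mentioned in the parenthesis could look like this; the `evaluate` callback and the spawn list are assumptions, nothing here is from the video:

```python
import random
import statistics

def robust_fitness(evaluate, genome, spawns, k=5, rng=random):
    """Score one genome from k randomly chosen spawn points and take
    the median, so a lucky spawn right before the goal doesn't
    dominate the comparison.  evaluate(genome, spawn) -> fitness."""
    picks = rng.sample(spawns, min(k, len(spawns)))
    return statistics.median(evaluate(genome, s) for s in picks)
```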
@chaimlukasmaier335 3 years ago
Something that might look interesting is showing the best run of each iteration at the same time, colored in a gradient. Thanks anyway, awesome video!
@kurtisharen 3 years ago
So, having just finished watching your previous video and then immediately watching this one, I have a question. Let's say you train an AI for 100 generations like you did before, on track "A". You train another AI for 100 generations on track "B". You pit the two trained AIs against each other in a race on track "B". How well would the generalizations taught to the "A" AI help it against the AI trained exclusively on track "B"? How many generations of track "B" training would it take, on top of those generalizations, to catch up to or beat the track-"B"-exclusive AI? Would the resulting AI that trained on track "A" and then track "B" do better against a natively trained track "C" AI than an AI that only trained on track "B" before encountering track "C"?

If that was a bit confusing, let me try rewording it: if an AI was exclusively trained on one track and had to rely on generalizations to race on a second track against an AI exclusively trained on that second track, would it learn faster when training on the second track? Does generalization from training on two different tracks help more on a new third track than generalization from training on a single track? Does training on multiple tracks help it learn new tracks faster than continuous training on a single track?
@isaacreilly915 3 years ago
I believe that while learning would get quicker and quicker as the AI trained on more tracks, against an AI that has trained exclusively on the track the two race on, the one with experience on multiple tracks will most likely lose. With enough training, it would come down to mutations, such as taking smooth lines or exploiting bugs like the speedslide, to decide the winner. On the other hand, the AI that has trained on multiple tracks will do better than the one that only trained on one track when both are put on a new track.
@naunau311 3 years ago
@@isaacreilly915 I don't really think taking smooth lines could be considered a "mutation" per se. On the other hand, using things like the speedslide definitely would be, but it would take a damn long time to be fully incorporated into a whole generation. To answer the first comment, though: if both AIs are trained for the same number of generations, the AI that stayed on only one track would definitely always beat an AI that just came to the new track, imo. Any result that differs would come down to luck with "mutations", as the first "team" of AIs to discover such a mutation would definitely have the upper hand once it has learned the map.
@michaelbuckers 3 years ago
It comes down to generalization vs. overfitting. An overfit AI can perform ridiculously well in its training environment but will struggle in novel situations. A generalized AI will be good at everything but not great at anything. Mastering racing comes down to memorizing each individual track, so you can simply train a separate AI for each track and not even expect it to do well on any other track; using a generalized AI to bootstrap the process will reduce learning time but is generally not necessary. The only scenario where a generalized AI is the optimal AI is if the game randomly generated a new track every time: each respawn, you get a new track.
@vast634 3 years ago
Using the same track for each individual favors overtraining. There should be more tracks per individual, or random tracks chosen across the trials. And the track does not have that many curves for a reliable fitness assessment, so a longer simulation with a longer track would be better overall.
@Azkunki 3 years ago
@@naunau311 _"the AI that stayed on only one track would definitely always beat an AI that just came into the new track imo"_ Not necessarily. It depends on how much *and* how well both have been trained. If we were to pit the final AI of the previous vid ("P1") against the final AI of this vid ("P2"), I would rather bet on P1. Sure, it would hit walls, but from what I've seen it's being stuck that eliminates you (also, that rule could be removed anyway, as it was a tool for teaching P2 to drive, not part of the track), so it wouldn't be an issue as long as hitting a wall doesn't pin it against the opposite one. P1 had more generations, but most importantly it was taught more efficiently, imo: they didn't just have to finish the track under a set time, they had to be as fast as they could, which let them reach much higher speeds (and checkpoints must have helped with that, since they allowed recording a time without completing the track). P2 is really slow even at the last generation. I guess it could have become faster with more generations, but I doubt by much, since there was (correct me if I'm wrong) no instruction to beat previous times; and seeing how slowly it improved, I'm guessing that having a time to compare only at the finish slows things greatly, compared to having multiple checkpoints along the track. In any case, I only wished to point out that how well an AI is taught should be taken into account ^^ (Also, I'm not saying P1 would necessarily beat P2 on the first run. It possibly would, but I'm sure it *would* after 3-5 runs at most, in this case. If I'm wrong, it would probably be because of the steep turn right above the start, at 0:10.)
@mahazero 3 years ago
Maybe: let the AI brake, make accelerating the default, and also involve the speed driven in the fitness function.
@raxitgohel2999 3 years ago
You should consider the total distance driven by the end of the run, and the one that takes the shortest path over the whole run gets extra points. This might help eliminate the zigzag strategy.
@sakshamrustagi1860 3 years ago
I think a big problem with the AI is that it cannot store the whole track in its own memory. Currently, at every moment, it can only make decisions based on what it can see, not what's ahead. By giving it some way to store the track it sees, or increasing its input to include more of the track, the AI would be able to master this map much faster.
@Erichteia 3 years ago
That would be what you call overfitting ;). It would be really good at racing that track, but only that specific track. This is already present in this video, because he does not change the track layouts. Simply mirroring the track between iterations could make a big difference in avoiding this (although it's still not great).
@blasttrash 3 years ago
For AI, I think overfitting is bad. But in real life, racers also memorize the full track; they know each curve in their mind. Instead of worrying about the generalization vs. overfitting trade-off, it would be interesting to see an AI video where speed itself is the final objective. After all, we want AI to beat human players, at least on this track.
@black-birdvolt8911 3 years ago
Watching all these generations trying to get around the map terrified of touching a wall really made me laugh. They even seem to tremble with their lefts and rights to avoid the walls. It gave me the idea of "penalizing" successive left/rights; I saw in the comments that several people had the same idea, and plenty of others! Congrats on these videos combining AI learning and Trackmania; a sequel where you improve the AI's training would be really nice.
@jeromelageyre5287 3 years ago
The real ones subscribed before the account reached 1 million subscribers. Great video!
@julienroche8233 3 years ago
So you post two comments, one of them in English :p
@jeromelageyre5287 3 years ago
@@julienroche8233 I do what I want 😜
@TheGoodFunGuy 3 years ago
This is amazing, man. Saw your last video on this topic and it made me a huge fan of AI. Keep it up!
@limited06 3 years ago
You should now use those weights on a different map and see how they perform.
@HeavyMetalorRockfan9 3 years ago
Yeah. It would also be interesting to see the improvements from first using supervised learning for some quick gains, and then various reinforcement learning methods later on. In many of these cases we're especially impressed by the quick learning at the beginning of reinforcement learning, but it's equally interesting to see what performs well later on, and you could actually compare various algorithms as well.
@satibel 3 years ago
I suggest the exact same track but mirrored, so all right turns become left turns.
@loicbarach8998 3 years ago
Damn, I am really impressed. Can't wait for your next videos!!
@EirikHoldal 3 years ago
This would be the dream for Ubisoft Nadeo: a car/player not trying to wallbang.
@OGPatriot03 3 years ago
Think of the day when RPG games have lifelike NPCs based on this kind of AI training.
@Toulous1 3 years ago
Your videos are fascinating! And on top of that, they're super addictive to watch. Make more!!
@anthonydale1169 3 years ago
First off, great video! I believe the AlGoRiThUm showed me this because I have been watching Code Bullet, and I saw you mention other channels in your last video; I was not disappointed to check these out :) My main question is: what music did you use for this? It was a real bop and I'd like to listen to it more! :P
@yoshtm 3 years ago
Thanks! The music is in the description now :)
@mohammedosman4902 3 years ago
After watching the previous video, I was wondering what would happen if you programmed in a negative condition for hitting the sides. This looked so much smoother! Fantastic work.
@sinnloses746 2 years ago
Love this project, I hope it continues.
@benjikrafter 3 years ago
I commented on the last video too! I'd love to see you take this AI and alter the reward used. Doing so, you can reach a better best time than if you had started off training it only to be fast. You basically "prepared it" with the necessary tools to be fast; now reward it for being faster much more than for not hitting a wall!
@gavindejong8478 3 years ago
I find it quite fun/interesting/mildly disturbing to see that a machine learns this the same way we do. If you had never touched a controller or seen the game in your life, you would also drive slowly and steer unnecessarily until you perfect parts of the track; the beginning gets done much quicker, and more attempts get past it, with a few mistakes that go fast but crash. Interesting stuff. Thanks, Yosh.
@the_break1 3 years ago
Much effort, good work!
@bruhhhhmoment4848 2 years ago
I love the AI learning videos. Are you planning on doing any other racing games for the AI to learn?
@soulwynd 3 years ago
I love making neural networks and am happy YouTube suggested these videos to me. If I may make a suggestion, you could use weighted fitness goals. In the last video and in this one, the cars were swaying a lot. You could make "time" the most important fitness goal, but also reward cars that don't make a lot of directional inputs.
@ssab9063 3 years ago
Yeah, or quickest time and shortest distance traveled.
@yourigan 2 years ago
I assume that you're French. I find your work super interesting, and the approach really good. I was wondering if you've ever considered entering one of your AIs in ZeratoR's Trackmania Cup? I think it could be very funny to see a self-trained AI beat humans :) Anyway, I subscribed and just finished watching everything :) Kiss
@EvoCreatures 3 years ago
Hey, that is pretty awesome. What programming language are you using? And how are you integrating with Trackmania?
@damian007567 3 years ago
You normally inject a DLL file into the game so that it can run your own code. The series Pwn Adventure 3: Pwnie Island by LiveOverflow explains game hacking in great detail.
@WhatUpTKHere 3 years ago
I love your work! Hope you like the article. I wonder how your implementation differs from Gigante's? His game appeared very similar but seemed to breed competency in far fewer generations, with a lot less left/right weaving than your algorithm. I'm tempted to try something like this with a real-world robot, but it's a lot more time-consuming without a way to simulate. Thanks for putting up such a great resource that explains the theory! -Lewin
@delomovs9582 3 years ago
You could discourage turning back and forth by making it so that if the car turns left and then right again within less than 4 seconds (random number), it's out.
@Kram1032 3 years ago
Such absolute penalties are way too harsh. The thing that causes the weaving in the first place is the AI being mortally afraid of the walls.
@MrTVx99 3 years ago
But what if there's a left and a right corner one after the other?
@arnaudsm 3 years ago
Great work! How did you simulate that many agents in the engine? Do you have VMs? A simplified engine?
@johannjanzen3377 3 years ago
Please make more of these videos! ;)
@andersmolzen7171 3 years ago
If time and distance travelled were added as inputs to the NN, along with some hidden layers, I think the model would be able to learn specific tracks more easily. A binary/discrete activation function might also be beneficial, as the output is bang-bang controlled either way.
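A binary activation for bang-bang outputs, as suggested above, is a one-liner. This sketch (all names invented) just thresholds three raw network outputs into key presses:

```python
def step(x, threshold=0.0):
    # Discrete activation: the key is either pressed or it isn't,
    # matching Trackmania's bang-bang keyboard control.
    return 1 if x >= threshold else 0

def outputs_to_keys(left, right, accelerate):
    # Map raw network outputs straight to key states.
    return {"left": step(left), "right": step(right),
            "accelerate": step(accelerate)}
```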
@Pyroxyde 3 years ago
Excellent work!
@akliryuzaki5853 3 years ago
Hitting a wall can result in a better time in some cases (i.e. when you have a tight trajectory but you want to preserve a portion of your momentum without using the brakes).
@Kram1032 3 years ago
Yeah, it really ought to take the crash angle into account. If you crash at 90°, right into the wall, that's never gonna be good. But if you go at 0°, parallel to the wall, it's really just a super tight trajectory.
@Z0ck3rb0hn3 3 years ago
Hey, great vids, man. I'm a big Trackmania fan, and I also develop AI stuff in my free time (and in my job as a data scientist). Is there any possibility of getting the source code that provides you the inputs from the game? I would love to try out my ideas and maybe improve on the current state of the A.I., which is still quite impressive. I love both worlds and would love to connect them. Thanks a lot!
@NikoxD93 3 years ago
*L4Bomb4 has entered the chat* This could be used to create thousands of replays to help the 20k replay project! :P
@lincolndethomasis6602 3 years ago
Now that you've reached the end without crashing in 40 seconds, you should try optimizing the AI for speed, because I think it could complete the course much faster.
@cinegraphics 3 years ago
It certainly should, because a child with no experience driving TM would finish it a lot faster within 2-3 attempts.
@zimzimph 3 years ago
@@cinegraphics Not really. Steering input with a keyboard is either full left/right or nothing; you'd need to learn to tap. Well, maybe a kid can do that, but I took a break for a couple of years and had trouble learning it again.
@cinegraphics 3 years ago
@@zimzimph Well, it depends on the programming. You could make steering increase the angle the longer you keep the arrow pressed, to avoid oversteer. I've played several paid versions of TrackMania, and I was surprised how basically the same game, made by the same studio, totally screwed up the controls between versions. There are versions where you can control the car easily with the keyboard; in the very next version, it's just impossible to drive, on the same computer, using the same keyboard. So obviously, getting a proper setup and coefficients for keyboard control is an art. How the same company can get things right, and then totally wrong AFTER they already got them right, is beyond me. Luckily, in the free version, TM Nations, keyboard control is really good. In some of the paid versions, they screwed it up.
@VictorRuiz-dc9ed 3 years ago
How do you implement AI in Trackmania? I mean the generations, tests, and everything in general. I'd like to test that same stuff on other games myself, but I never found a way to program behaviour inside a game without breaking my head on impractical solutions.
@bagok701 3 years ago
Look up tutorials for Cheat Engine. Then figure out how input works for the game you want, then figure out how either raycasting or collisions work, then figure out how to reset the game to a known state.
@satibel 3 years ago
libTAS might be useful.
@userPrehistoricman 3 years ago
@@bagok701 That's the hard way to do it.
@Azkunki 3 years ago
@@userPrehistoricman What's the easy way then? x)
@userPrehistoricman 3 years ago
@@Azkunki Find somebody else's work on it, or look for an official API.
@AwesomeIronGuy 3 years ago
Nice, another Code Bullet upload.
@janhansen3753 3 years ago
I simply love one of the cars' logic in generation 13. Rule: you can't hit the wall... Car: "Okay, I'll just stand still" :D
@BoBrEiGs 3 years ago
There's something pleasing about the many cars at the beginning slowly expanding from the starting position. And is it just me, or do the many tires together at the front make you think of a hovercraft?
@oddlyracer3728 3 years ago
Amazing!
@julienroche8233 3 years ago
Amazing video as usual! Such a shame I couldn't hear your pretty voice on this one :'( Keep going, my friend, even if we aren't friends anymore... xoxo PS: why didn't you put a bridge in the video????
@jeromelageyre5287 3 years ago
Don't pretend you can speak English... PS: true, why aren't there any bridges? 😮
@julienroche8233 3 years ago
@@jeromelageyre5287 You know, I am a bit of a scientist myself ;)
@jeromelageyre5287 3 years ago
@@julienroche8233 You write like an engineer. I think you are one, aren't you?
@julienroche8233 3 years ago
@@jeromelageyre5287 Totally :o Are you the Fabien Olicard of Trackmania?
@jeromelageyre5287 3 years ago
@@julienroche8233 Want some love, guy? 😍❤💓💋💋💗
@ZOMIX8 3 years ago
1:53 cute shy car at the start
@warpmonkey 3 years ago
That's great, good to see. Too often with ML we let the algorithm pick all the rules, but that's not how it works in real life. If someone is learning something, they typically get hints and tips from someone who knows it well, so saying "don't hit the walls" is fine; I don't see a reason not to introduce hinting to ML models. But we all know landing on the underside of a finish line is a big deal here, and that happened by total accident. I wonder if there are other ways to weight an ML algorithm, sort of like how a judge on a talent show says "I see something great in this not-so-great performer"; could we introduce "judging" to a model to promote it, even though it isn't the best? I saw that left/right swerving, and I wonder whether, if it were voted out manually as inefficient, the evolution would change. Great video :)
@HeavyMetalorRockfan9 3 years ago
While we're trying to attain human-like qualities, sure. But theoretically speaking, the true appeal of reinforcement learning in particular is the ability to go beyond human heuristics. For example, in actual car racing, drifting was thought to be just showy and inefficient, until someone figured out that on certain tight turns drifting is actually the faster way around a corner. What could also be cool is having runs set up with different sets of heuristics, and using the information from the best-performing sets to update the others.
@Kram1032 3 years ago
The swerving is actually caused, or at least exacerbated, by too severe a wall-hit penalty. Making it matter but not end the run outright would help with that.
@HPetch 3 years ago
Nice to see that you implemented some of the suggestions from your previous video. If you feel like doing it again, a slightly more nuanced approach to the controls you give the A.I.s might lead to better results: changing the three outputs from "full left," "full right," and "accelerate" (which you used before and I'm assuming you used here) to "steering angle" (ranging from -1.000 for full left to 1.000 for full right, or whatever range works best for the game's inputs), "accelerate," and "brake" should allow the A.I.s to at least come closer to optimal human performance, although it would take more time to program and probably to train as well.

I also think that rather than just punishing the A.I. for hitting a wall, it would be helpful to reward it for getting close to walls without hitting them. As others have noted, making the wall a hard fail discourages the A.I.s from driving optimally, so some sort of "time × positive modifier for hugging walls × negative modifier for hitting walls" performance algorithm would likely lead to better results. While this somewhat defeats the purpose of letting the A.I. learn for itself, it would also probably make up some of the time that more complex controls would lose, so it's still worth considering.
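One possible reading of the "time × hugging modifier × hitting modifier" score above, with every constant and name invented for illustration:

```python
def shaped_fitness(progress, closest_wall_m, wall_hits,
                   hug_bonus=0.2, hit_cost=0.5):
    """progress: base score (e.g. distance or checkpoint score).
    closest_wall_m: closest approach to any wall this run, in meters.
    Rewards near-misses, punishes each hit multiplicatively, and
    never hard-fails the run."""
    hug = 1.0 + hug_bonus / (1.0 + closest_wall_m)  # closer -> bigger bonus
    hit = (1.0 - hit_cost) ** wall_hits             # each hit halves the score
    return progress * hug * hit
```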
@spoodlypoofs 3 years ago
Hey Yosh, I'd really like to give this a shot. Can I take a look at your code? Is it posted on GitHub? Thank you!
@dripist5124 2 years ago
You have alerted the horde.
@Kram1032 3 years ago
The crash-and-you're-dead penalty, I think, is actually way too severe for AIs to learn well. It leads them to be overly cautious of the wall, deprioritizing speed a *lot.* Instead, there are a few ways to make it less severe but still matter:
1) Add a simple time penalty per wall crash (something like +1s per wall).
2) Divide fitness by the number of walls hit (one wall hit = half the fitness, two walls a third...).
3) Make the deadly penalty probabilistic (if you hit the wall, there's a, say, 10% chance you're out).
3a) Make the probability depend on the crash angle (take the probability to be, say, the square of the sine of the crash angle: if you hit the wall straight on, you're always dead; if you barely graze it, the hit isn't as big).
1a) Just add square-of-sine-of-crash-angle seconds.
2a) Divide by the number of crashes, weighted by square-of-sine-of-crash-angle.
Additionally, all of these could be slowly ramped up during training. The first generation gets no penalty yet, generation 2 is only very mildly affected, and from there each generation gets an ever-increasing penalty for crashing. This will allow the AI to explore more early on, giving you perhaps better diversity. By adding the directionality of the crash into the equation, you directly encourage the AI to drive parallel to the walls rather than being scared of them and weaving away. You could even double down on that effect by asking the AI to stick to the walls as closely as possible, giving a fitness boost for being close to a wall. But that'd probably distort where the ideal driving line is; could be worth a try, but probably not great if I had to guess.
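Options 1a and 3a from the list above, written out. The sine-squared formula is from the comment; the function names and random draw are illustrative:

```python
import math
import random

def crash_severity(angle_deg):
    # Square of the sine of the crash angle: a parallel graze (0 deg)
    # costs nothing, a head-on hit (90 deg) has full severity.
    return math.sin(math.radians(angle_deg)) ** 2

def time_penalty(crash_angles_deg):
    # Option 1a: add severity seconds per crash instead of ending the run.
    return sum(crash_severity(a) for a in crash_angles_deg)

def run_survives_crash(angle_deg, rng=random):
    # Option 3a: probabilistic death, weighted by crash angle.
    return rng.random() >= crash_severity(angle_deg)
```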
@avo_k
@avo_k 3 года назад
super video ! bravo ! I'm curious, what was the fitness function on this one ?
@Kram1032
@Kram1032 3 years ago
At a guess: same fitness function as last time (based on checkpoint times or driven distance), but if you so much as touch the wall, your run is stopped and you can no longer gain any fitness, i.e. walls are lava. (This makes the AIs mortally afraid of the walls, and they prioritize not hitting them WAY above going fast. It'd be a lot better to reduce this penalty and perhaps make it dependent on grazing angle.)
@RockMore
@RockMore 3 years ago
In some speedruns, crashing into a wall can actually save time, but nice improvement.
@fireboss05
@fireboss05 3 years ago
Next time it could be nice to try with wall collisions not being fatal ;)
@caleb7461
@caleb7461 3 years ago
0:01 ASMR, Trackmania edition.
@zine-e
@zine-e 2 years ago
I drive like the last cars, very slowly!!
@tscc
@tscc 3 years ago
If you intend to do further work on this, you might want to consider also feeding in the velocity at which the walls are approaching, instead of solely the distance. This could potentially help cut down on the fishtail movements, since the AI would have a better cue that it might not run into a wall anytime soon. Also, are you feeding actual vehicle speed into the neural network? As for Trackmania, it's been almost a decade since I last played it. Does it support analog steering input, instead of just full left and right?
@Kram1032
@Kram1032 3 years ago
Velocity is interesting, but I'd simply go with angle. Like, sine of angle squared, basically: if you go parallel to the wall, it doesn't matter (sin(0°)² = 0), but if you go at a 90° angle it matters quite a lot (sin(90°)² = 1). Could also do projected velocity though (how much of the current velocity is directed into the wall as opposed to forward). Either way, that should help fix the fishtail movements.
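The two quantities discussed here can be sketched as follows (hypothetical helper names; angles in radians):

```python
import math

def wall_closing_speed(speed, angle_to_wall_rad):
    # Projected velocity: how much of the car's speed points into the wall.
    # 0 when driving parallel to the wall, full speed when heading straight at it.
    return speed * math.sin(angle_to_wall_rad)

def grazing_weight(angle_to_wall_rad):
    # sin^2 weighting: parallel contact costs nothing, head-on counts fully.
    return math.sin(angle_to_wall_rad) ** 2
```

Either value could be fed to the network as an extra input, or used to scale a crash penalty.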
@kaku5186
@kaku5186 3 years ago
Could you make a tutorial please? I'm really loving this stuff and would like to be able to do it myself.
@kaku5186
@kaku5186 3 years ago
@felixifyable I don't know anything about coding, but I would like to learn how to code AI, specifically in Trackmania.
@potto1488
@potto1488 3 years ago
@@kaku5186 Try Coding with Mosh's Python and machine-learning tutorials.
@kaku5186
@kaku5186 3 years ago
@@potto1488 OK OK, thank you.
@UsmanDev
@UsmanDev 3 years ago
How do you access game variables to get the inputs for your neural network?
@yoshtm
@yoshtm 3 years ago
Screen capture + Openplanet to communicate with the game API
@froozynoobfan
@froozynoobfan 3 years ago
I think you have two challenges.
Generalization: your ML should be able to drive on any map. Reward: this is a racing game, so we want to get to the end fast.
For generalization you could train each generation on a different map. For reward you have some options; a continuous reward is always better:
continuous_reward = (total_distance_to_finish - distance_driven)/time - extra_punishment
OR continuous_sparse_reward = sum(checkpoint_reward) + time * -continuous_punishment - extra_punishment
or sparse_reward = sum(checkpoint_reward/checkpoint_time) - extra_punishment
I have added extra_punishment in case you want to punish the number of collisions with the wall, for example. This might improve the model, but I think it won't, as it should learn that on its own, because I assume that fewer bumps == more reward.
On to the learning part. Optimizing your per-generation learning time might be challenging. My first thought is scaling the learning time for each generation, either something simple and linear or something more exponential, with a cap at the world-record time of the map:
learning_time = gen_number * seconds (linear)
learning_time = 2^gen_number (exponential)
learning_time = learning_time if learning_time
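The sparse-reward and capped learning-time ideas from this comment could look roughly like this in Python (a sketch; the names and the linear schedule follow the commenter's suggestions, not the video's code):

```python
def sparse_reward(checkpoint_rewards, checkpoint_times, extra_punishment=0.0):
    # Reward each checkpoint by how quickly it was reached,
    # then subtract an optional penalty (e.g. per wall bump).
    total = sum(r / t for r, t in zip(checkpoint_rewards, checkpoint_times))
    return total - extra_punishment

def linear_learning_time(gen_number, seconds_per_gen, cap_seconds):
    # Per-generation time budget that grows linearly,
    # capped at e.g. the map's world-record time.
    return min(gen_number * seconds_per_gen, cap_seconds)
```

An exponential schedule would simply swap the first argument of `min` for `2 ** gen_number`.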
@asimovstarling8806
@asimovstarling8806 2 years ago
Honestly, I've seen two videos from you now running these tests, and I think it might be fruitful to use the results from the 100-generation genetic-algorithm generalization and track-distance test, plus the data from this test, as the training data sets for the AI. From there, it might be able to catch up with your time. Just a thought.
@profsalad6414
@profsalad6414 3 years ago
Are you able to make something like this for a lot more complex things, like making an AI adapt to games like Dark Souls?
@MadMaxShru77
@MadMaxShru77 3 years ago
Your PC rendering this be like *visible explosion*
@yuchaoguo8399
@yuchaoguo8399 3 years ago
If you want them to go faster, you can add a death wall that sweeps the track in, say, 50 seconds, and you can speed it up if you want the AI to go faster.
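A "death wall" like this could be simulated on the training side as a moving cutoff on track progress. A sketch with hypothetical names:

```python
def death_wall_position(elapsed_time, track_length, sweep_seconds=50.0):
    # Position of a virtual wall that sweeps the whole track in sweep_seconds;
    # lowering sweep_seconds forces the AI to drive faster to stay ahead.
    return min(elapsed_time / sweep_seconds, 1.0) * track_length

def is_eliminated(car_progress, elapsed_time, track_length, sweep_seconds=50.0):
    # A car caught behind the wall is removed from the run.
    return car_progress < death_wall_position(elapsed_time, track_length, sweep_seconds)
```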
@neoblackcyptron
@neoblackcyptron 2 years ago
Is it overfitting if the AI drives without hitting the wall? Can it perform the same way, without hitting walls, on a new track? That's only possible if the AI knows something about the environment, right? Really nice videos; I enjoy your channel's content.
@Mabra51
@Mabra51 3 years ago
Generation 1 be like: ⬅️➡️⬅️➡️⬅️➡️
@typicalhog
@typicalhog 3 years ago
Hi! Did you make your own NEAT implementation, or did you use an existing library? Was there speciation?
@nahpo3860
@nahpo3860 3 years ago
Map learning next?
@Argrouk
@Argrouk 3 years ago
If you weighted the sensor priorities by length, they would crash a lot less. At the moment, they can't tell the difference very well between a 100 m gap in front and walls 5 m to the left and right. In a scenario with no memory, they need to always head for the longest uninterrupted sensor.
@Kram1032
@Kram1032 3 years ago
Even better would probably be logarithmic length, so at 0 distance you get, like, negative infinity, but whether you're 10 or 100 units away barely matters.
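Both ideas, steering toward the longest ray and compressing ray distances logarithmically, are easy to sketch (hypothetical helpers, not the video's code):

```python
import math

def scaled_ray(distance, eps=1e-3):
    # Log-scaled sensor input: a wall right at the bumper dominates
    # (large negative value), while 10 vs. 100 units away barely differs.
    return math.log(max(distance, eps))

def longest_ray_index(distances):
    # With no memory, always head toward the longest uninterrupted sensor.
    return max(range(len(distances)), key=lambda i: distances[i])
```

Feeding `scaled_ray` values to the network instead of raw distances would make nearby walls matter far more than distant ones.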
@hippopotamus86
@hippopotamus86 3 years ago
How did you manage to do the wall detection? Is that via an API or image recognition?
@wsollers1
@wsollers1 3 years ago
You should select those NNs that minimize time or distance traveled over a complete run.
@F23GreyGhost
@F23GreyGhost 3 years ago
I wonder how fast AI could be on the checkpointed course if you first train the A.I. on not touching the walls, and when they start to reach their limit on that, put the best ones on the checkpoint course and only keep the fastest ones.
@TiagoTiagoT
@TiagoTiagoT 3 years ago
What sort of information can you extract from the game? Would it be possible to set the fitness score be something like average speed * percentage of the track completed?
@ouyaah
@ouyaah 3 years ago
Do you add a term for encouraging cars to be faster in your fitness function? I would suggest adding the mean and max speed of the car to encourage it to go full throttle on straight lines
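Such a speed term could be folded into the fitness like this (the weights `w_mean` and `w_max` are made up for illustration):

```python
def speed_bonus_fitness(base_fitness, speeds, w_mean=0.1, w_max=0.05):
    # Add mean- and max-speed bonuses so full throttle on straights pays off.
    mean_speed = sum(speeds) / len(speeds)
    return base_fitness + w_mean * mean_speed + w_max * max(speeds)
```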
@Jimmy_Jones
@Jimmy_Jones 3 years ago
I found the alternate universe Code Bullet
@JMPDev
@JMPDev 3 years ago
What were the sensory inputs the neural network had access to? Edit: ah ok, some previous videos show that you allow it to raycast in a fan formation, and that it sees some other pieces of data such as speed and gear.
@owenatkin3148
@owenatkin3148 3 years ago
I have a bit of a question; forgive me if it sounds dumb: why not feed the X,Z coordinates to the reinforcement-learning AI to enable it to learn the general shape of the course? Also, neural-net switching by sector? As a sim-racing fan I'm excited by this as an engineering problem.
@MaFd0n
@MaFd0n 3 years ago
Sooooo.. how is this project coming along? Any progress? Did you give up?
@thedogofchucknorris
@thedogofchucknorris 3 years ago
I see huge potential for AI in video games. In chess there is AlphaZero, which can beat the world champion himself; it also performed well in Go. It comes from a program called Deep Blue, which is self-improving!!
@bramweinreder2346
@bramweinreder2346 2 years ago
You should really split the track into sections, and have each next generation vary from the section champion.
@BlueskyFR_
@BlueskyFR_ 3 years ago
How do you retrieve the data from Trackmania about wall positions, speed, etc.?
@marcobonera838
@marcobonera838 3 years ago
this guy is making an AI to win tournaments :p
@RyushoYosei
@RyushoYosei 3 years ago
I was watching these, and I was thinking: a way to help it improve its overall speed, if you're not already doing this, would be to make the reward it receives at each checkpoint also depend on how much time is left. Reaching a checkpoint faster would reward more points, in addition to being rewarded for going farther, so it gets rewarded for going farther, but also for going farther faster, basically.
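That checkpoint bonus could be shaped as something like this (a sketch; the scaling is arbitrary):

```python
def checkpoint_bonus(base_points, time_limit, time_at_checkpoint):
    # Full base points for reaching the checkpoint, plus a bonus
    # proportional to the time still left on the clock.
    time_left = max(time_limit - time_at_checkpoint, 0.0)
    return base_points + base_points * time_left / time_limit
```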
@aurele8522
@aurele8522 3 years ago
Why are you not making more generations? How long does it take to calculate the first forty?
@lord1todd
@lord1todd 2 years ago
What if you took the car that had the best lines (disregarding its time) and the fastest car for its traveled distance (including crashed cars)? Use them as a selective-breeding wild card, so to speak. It would be fascinating to see what effect that had on the overall results.
@szymonkosakowski4373
@szymonkosakowski4373 3 years ago
How did you connect to Trackmania to simulate the cars?
@tearlach47
@tearlach47 3 years ago
Can't crash into any walls if you stay completely still.
@grosty2353
@grosty2353 3 years ago
I wonder if, given enough time, generation like a billion wouldn’t have such diverging paths at the start.
@Castle3179
@Castle3179 3 years ago
What I would like to know is how the AI would perform if it also could see behind it. I would also like to know if it could be trained to race against other AI so that they could bump into each other as a strategy...
@British_Bastard
@British_Bastard 3 years ago
What happens if you add obstacles?
@MrZimmerson
@MrZimmerson 2 years ago
I wonder if you can refine this by reducing the time needed to finish the map - maybe that'd help get the AI to make smoother lines when driving, thus speeding up their overall time.
@Toyback
@Toyback 3 years ago
Can you train the AI to understand or search for the optimum racing line? Because right now they try to stay in the middle of the track to not hit the wall, which is usually not the fastest way. So it needs to learn how corners are connected...
@Kram1032
@Kram1032 3 years ago
Two basic solutions to that: Make the walls not *quite* so scary and make the AI do as few turn inputs as possible
@Frans_W31
@Frans_W31 3 years ago
Maybe you could somehow differentiate between the distance of track covered and the distance the car has driven on a given run. We want to maximize the distance of track covered while also minimizing the distance the car traveled, i.e. driving in a straighter line and not zig-zagging. Maybe this could be added to the fitness function?
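One way to express that ratio (a sketch with hypothetical names; a pure ratio would need to be combined with absolute progress so a car that barely moves doesn't score perfectly):

```python
def straightness_fitness(track_progress, path_length):
    # Track distance covered divided by the distance the car itself drove;
    # 1.0 would be a perfectly straight, non-zig-zagging line.
    if path_length <= 0.0:
        return 0.0
    return track_progress / path_length
```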
@thomassimoens9302
@thomassimoens9302 3 years ago
I think an AI named Wirtual managed to do it on the TMNF campaign :P
@WillPybuseccentric
@WillPybuseccentric 3 years ago
Next, put in a laser wall, or just give them a minimum forward acceleration.
@VestedUTuber
@VestedUTuber 3 years ago
Just as an idea, try adding another fitness requirement - favor distance _and_ speed, not just distance.
@janknoblich4129
@janknoblich4129 3 years ago
How do you gather the telemetry data from the game?
@blazbohinc4964
@blazbohinc4964 3 years ago
Makes you think how primitive AI really is. Or at least, this particular learning algorithm.
@comet.x
@comet.x 3 years ago
Primitive? It took an effectively newborn AI ~40 attempts. How long does it take a newborn human to learn something as trivially basic as a single word? A lot more than 40 attempts, that's for sure.
@Kram1032
@Kram1032 3 years ago
@@comet.x bit unfair given how many more complexities a newborn has to deal with compared to this AI. Like, this AI can only do three actions (turn left, turn right, accelerate) and only has like eight inputs (the rays, each corresponding to effectively a single pixel of information) A baby, meanwhile, has to learn to control all of its muscles in complex and unknown ways (like, it has abilities it doesn't even know it has) and do so with *trillions* if not *more* sensors of input that need to be sorted out and understood.
@MrTVx99
@MrTVx99 3 years ago
I'm sure if coded by someone who understood what they were doing a lot more, we would see AI getting competitive times.
@darkbelg
@darkbelg 3 years ago
Could the next machine-learning project maybe involve a pro Trackmania player? I mean, if you asked him to play 5 minutes on your custom track for 12 days, you would have an hour of data points to use. I don't know what software you'd need to get the data points, though. Another idea is to use replays from players on speedrun websites to try to train the AI better. But 50,000 data inputs to train a supervised model does seem like a lot of replays, and I don't know where you are going to find that. The inputs are in the replay; I have seen a video where people are able to read the inputs out of the file, so maybe you can combine those. It does sound like a lot of work. I mean, machine learning is a lot of work just to make it do one thing.
@raymier1369
@raymier1369 3 years ago
How do you collect data from the game itself while it is running?
@birdwaveracing9
@birdwaveracing9 3 years ago
Could the AI, equipped with map memory, iterated by completion time, and subject to new inputs "lift" and "brake" learn concepts like minimizing apex distance and maximizing track width on a track where it was necessary?
@FunFuerFelix
@FunFuerFelix 3 years ago
I'm missing the voiceover in this video 😓