Great to see that you’re still making improvements! Have you considered trying the NEAT algorithm? It’s genetic and I’ve had some pretty cool results with it. It may be too simple for Mario Kart though. Keep up the videos!
A few years back, before I got into Reinforcement Learning, I actually started my AI journey doing genetic algorithms, NEAT included! In fact, my first ever AI project was using NEAT for Flappy Bird. Sadly I doubt it would perform too well on Rainbow Road (especially with items and CPUs), but I could be wrong, would love for someone to try it
I do think I could get it to play every track, it would just take sooo long to train! Maybe if I get some time where I can't be trying new stuff I'll leave it running and see what I can do!
Man, right at the end, watching it hop around that banana in its way... that was probably the biggest testament to how much it had learned. Your reward structure didn't even include rewards for faster times, just getting to the next checkpoint. Very cool.
Just a small suggestion: Next time you show off a different inferior algorithm, maybe don't use that black box filter over it. I could barely see the actual gameplay of the inferior AI because the black box filter combined with youtube's video compression made it a chunky mess to look at. For a better filter, maybe you could use a CRT filter instead?
@@thegamesguy2263 It would either have to learn it on its own (which has a very low chance of happening accidentally), or he'd have to change the reward system to force it to take that specific route, but that might break it
Great video, love your stuff. The people who think up novel approaches (or even iterative improvements to existing approaches) to AI are absolute geniuses, but applying existing ones to video games is something I always want more of. I also think it helps that you use Mario Kart Wii, because so many people have good memories with that game. Better than something obscure where I can't gauge how proficient a player is from footage.
Ah, that isn't to diminish your videos on other games. I just can't personally estimate how good the ai is with something I have very little experience with. Still great videos though!
Yeah, coming up with new AI techniques that actually work is a nightmare; I learned that constantly during my PhD haha. I do enjoy applying them to new situations though, it's always interesting to see how AI techniques can be used in different ways. Yeah, Mario Kart Wii really has the nostalgia factor for many people
This made me think of an idea: an AI able to finish the race could be implemented in the base game and improve gradually relative to the player's skill. If the AI beats the player, don't save what it learned that race. This way the AI and the player stay close to each other in skill.
humans tend to learn much faster than AIs (to an extent, AIs sometimes get ahead later on), so this would result in great boredom for many players waiting for the AI to catch up with them. I think it would be better to have a series of AIs which are all pre-trained, then players can choose which AI they want to play against - e.g. you could have the AI after training for different amounts of time (1 day, 2, 3, a week etc) so they have a range of differently skilled AI opponents to play against. Basically like current CPU opponent difficulties, but with an AI that hopefully makes the difficulty of the hardest CPUs higher.
I know this would be very (very) difficult to program and get working but I just want to say that [hypothetically] I think it would be really cool to see the AI do a shortcut or skip. Love your videos! Keep up the amazing work 👍
Heyy, I appreciate your work! Great job, maybe you can create a tutorial on how you did it? Maybe start a little tournament where people can submit their own AIs that then compete against each other? Wouldn't that be fun?
What needs to be done is to allow the AI more control over the inputs it can use - aka, standard turning, drifting with the three different levels of steering input, etc. That will allow the AI to be significantly more precise and dramatically improve its time.
Have you thought about analog controls for the AI? As in, let the AI control the polar (or Cartesian?) coordinates of the joystick input. If so, why did you decide against it?
I've thought about it, but haven't really looked into doing it. The only real reason is that I specialise in value-based Reinforcement Learning, which doesn't really deal with analog controls. I do think it's an interesting idea though
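(A possible middle ground, sketched below under my own assumptions, not anything the video confirms: value-based methods need a discrete action set, so the analog stick could be discretized into a fixed grid of steering angles and magnitudes. All names and bucket choices here are hypothetical.)

```python
# Hypothetical sketch: discretize an analog stick so a value-based agent
# (DQN-style) can pick from a finite action set. The bucket counts below
# are made up for illustration.
import itertools

STEER_ANGLES = [-1.0, -0.5, 0.0, 0.5, 1.0]   # full left .. full right
MAGNITUDES = [0.5, 1.0]                       # half tilt, full tilt
DRIFT = [False, True]                         # drift button held?

# Enumerate every combination into one flat discrete action space.
ACTIONS = list(itertools.product(STEER_ANGLES, MAGNITUDES, DRIFT))

def action_to_stick(action_index):
    """Map a discrete action index back to a (stick_x, drift) controller state."""
    angle, magnitude, drift = ACTIONS[action_index]
    return angle * magnitude, drift

print(len(ACTIONS))  # 5 * 2 * 2 = 20 discrete actions
```

The trade-off is resolution versus action-space size: finer buckets give smoother steering but make the Q-value head wider and slower to learn.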
Let this learn for a really long time, and when AI is more developed in the future, use this as a benchmark: give a newer AI the same number of hours (or fewer) and see if it reaches the same result
Thanks! I looked a while back into getting it to play online. I haven't made much progress in that area, but it's still something that will always be a goal of mine! I've got some more interesting stuff in the works with Mario Kart, so who knows what'll end up happening...
As a human who time trials Mario Kart Wii, I'd love to see an AI attempt it. Although I'm not sure it would do too well due to the sheer complexity and lack of strat knowledge, I would love to be proven wrong. Unless you use some kind of method to help the AI learn from time trial footage, then maybe I could see it happening
I'm relatively new to studying RL so please correct me if I'm wrong, but instead of training two agents independently and taking whichever one does better overall, can't you combine the results of the two agents to make one better agent? By this I mean for each state, we evaluate the state-action values for each agent-action combination, and we choose whichever agent's action yields the higher state-action value. This may account for one agent being better at one part of the track while the other agent is better at another part of the track. Great work by the way!
Yes, that is possible, and I guess it would be an "ensemble" network. It's hard to say exactly how much better it would get, as RL agents can overestimate values, so only taking the highest values may not always lead to better performance, though it likely would. Thanks!
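The ensemble idea being discussed can be sketched in a few lines. This is a toy illustration under my own assumptions, with stand-in dicts instead of real trained networks: for each state, query both agents' action values and act with whichever agent's best action has the higher predicted value.

```python
# Hypothetical sketch of the two-agent Q-value ensemble from the comment
# above. q_a / q_b map state -> list of per-action value estimates.

def ensemble_action(state, q_a, q_b):
    """Act with whichever agent predicts the higher best-action value here."""
    values_a, values_b = q_a[state], q_b[state]
    best_a = max(range(len(values_a)), key=values_a.__getitem__)
    best_b = max(range(len(values_b)), key=values_b.__getitem__)
    if values_a[best_a] >= values_b[best_b]:
        return best_a, values_a[best_a]
    return best_b, values_b[best_b]

# Toy data: agent A is better on the hairpin, agent B on the straight.
q_a = {"hairpin": [0.9, 0.2], "straight": [0.3, 0.4]}
q_b = {"hairpin": [0.5, 0.6], "straight": [0.1, 0.8]}

print(ensemble_action("hairpin", q_a, q_b))   # (0, 0.9) -- agent A wins
print(ensemble_action("straight", q_a, q_b))  # (1, 0.8) -- agent B wins
```

Note the caveat from the reply: because Q-learning tends to overestimate, always trusting the larger value can systematically favour whichever agent is more over-optimistic, not whichever is actually better.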
Seeing you have two computers to use for training AI now, I can see why the quality of your videos has gone up! I just hope your electricity bill hasn’t gone up too high with it 😅 Thanks for the great vid as always!
I've got a question: you said that your AI was training for 60 consecutive hours. Are those "real time" hours? And if so, isn't there a way to artificially speed up the process without altering the validity and reliability of the results?
Thanks! My code is not yet publicly available (though later down the line I do plan on open-sourcing it). If you're looking to get into this stuff, look at Reinforcement Learning Gym/Gymnasium; there are lots of easier problems to get into and play around with if you're just starting out
is the end goal to generalize against any track vanilla or custom against real world players? I wanna see this thing rip through CTGP worldwides one day.
I know you use Mario Kart Wii bc it's very popular and likeable (attracts views and such). But if you ever develop such an impressive AI that the Mario Kart community cannot ever take a shit again without thinking about your AI, would you consider applying the AI knowledge to an A Button Challenge?
I'm curious how much better the AI would perform with more RAM watches as NN inputs, for all the important data that TASers use. I bet it would be scary fast, though not the same as reading the screen pixels.
It's something I'm looking into at the moment! Deciding what exactly to include is a little difficult though, since the RAM is way too large to just include it all. I'll be doing a video at some point going into that!
Hi Tango! I'm curious, from your point of view and considering the AI already programmed in NPCs and such: would it be possible to train a brand new neural network to play Elden Ring, for example? I know shooter games have botting already, but for the RPG counterpart, is it reasonable to think AI can eventually learn these layered tasks? I imagine devising the reward system and all the input/output would be complex, but beyond that, I'd be curious to see how the intelligence adapts in a more open environment.
These types of tasks have typically been challenging for AI, since most AI heavily relies on there being a very set, direct task. Some AI models, however (Dreamer v3 comes to mind), have been able to adapt to these types of tasks. In that paper the AI was actually able to find diamonds in Minecraft! This works by the AI having its own "intrinsic" reward to guide it, though this often comes at the expense of needing a LOT of training time
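The general shape of "intrinsic + extrinsic reward" can be shown with a deliberately simple stand-in: a count-based novelty bonus that shrinks as a state is revisited. (Dreamer v3 itself uses a learned world model, not this; the sketch and all names are my own illustration.)

```python
# Hypothetical sketch: add a novelty bonus to the environment's reward so
# the agent is nudged toward states it hasn't seen much.
from collections import Counter
from math import sqrt

visit_counts = Counter()

def total_reward(state, extrinsic, bonus_scale=1.0):
    """Extrinsic reward plus an intrinsic bonus that decays with revisits."""
    visit_counts[state] += 1
    intrinsic = bonus_scale / sqrt(visit_counts[state])
    return extrinsic + intrinsic

print(total_reward("cave", 0.0))  # first visit: 0.0 extrinsic + 1.0 bonus
print(total_reward("cave", 0.0))  # second visit: bonus shrinks to ~0.707
```

Even with zero extrinsic reward, the agent still gets signal from exploring, which is exactly why such methods can make progress on open-ended tasks but also why they burn so much training time.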
Truth be told, I didn't even set the speed, I just let the AI run as fast as it wants. If I do something like time trials where the speed is really important, I'll definitely show it at normal speed though
There's a much better option than making the visual input black and white: make it all bright colors - round each pixel value to a pure hue, with maximum saturation and medium lightness. And if it's very close to black, white, or medium grey, round the pixel to those colors. This would remove a huge amount of noise and give the AI the information that the game designers thought was important enough to give a color. It would also give the AI very clear shapes with sharp edges. Forcing the AI to recognize thousands of complex grainy shapes that might differ from each other by only a few pixels sounds inefficient. Of course it had trouble with the banana peel.
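The suggested preprocessing can be sketched per pixel. This is my own hypothetical interpretation of the comment (the saturation threshold and grey buckets are made-up numbers), not anything from the video's actual pipeline:

```python
# Hypothetical sketch: snap each 0-255 RGB pixel to a small flat palette.
# Near-greyscale pixels become black / medium grey / white; everything
# else is pushed toward a pure, fully saturated hue.

def quantize_pixel(r, g, b):
    """Round one RGB pixel to a small palette of flat colors."""
    lo, hi = min(r, g, b), max(r, g, b)
    if hi - lo < 30:                      # low saturation -> greyscale bucket
        grey = (r + g + b) // 3
        if grey < 64:
            return (0, 0, 0)              # black
        if grey > 192:
            return (255, 255, 255)        # white
        return (128, 128, 128)            # medium grey
    # Otherwise saturate: dominant channel -> 255, the rest snap to 128 or 0.
    return tuple(255 if c == hi else (128 if c > (lo + hi) // 2 else 0)
                 for c in (r, g, b))

print(quantize_pixel(200, 40, 35))    # reddish pixel -> pure red (255, 0, 0)
print(quantize_pixel(120, 125, 118))  # near-grey pixel -> (128, 128, 128)
```

Applied over the whole frame this collapses compression grain into flat regions with sharp edges, which is the commenter's point about giving the network cleaner shapes to learn from.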
Imitation AI strategies are around, however I typically don't look into them since the result is basically guaranteed to be worse than the thing it's imitating
If I were to write a paper, it sadly wouldn't be on Mario Kart as I don't think that's a standard benchmark! The algorithm I used though, I might be writing a paper for quite soon!
you're one of the most cracked coders on this platform, but unless you can nail the storytelling aspect of these videos they won't get the views you deserve
I do wonder if the AI can learn certain top level player tricks, like using a mushroom at the right time before a Blue Shell impacts, and holding on to useful items to defend itself from random Red Shell attacks instead of using the item instantly as soon as it gets it. Time Trials would certainly be a lot more interesting to see though.
How has this not blown up!? Combining Mario Kart Wii and newbie AI... Having played this game since I was little, it amazes me to see the AI go from being as bad as I was when starting, to being sorta decent like me, to playing like the top players. Simply incredible!
@@aitango As you probably know, MvC2 is a 3 on 3 Tag Team fighting game with three different assist types for each character. It would be absolutely hilarious to see a team of the worst characters make it all the way to the final boss of arcade mode. AI-controlled Roll, Servbot, and Dan wiping the floor with Abyss would make me laugh until my sides hurt
What we have to ask ourselves is at what point companies will use AI to make game NPCs instead of handcrafting the behaviour. I bet for many games this would create a strong NPC far more easily than doing it manually
For this number of frames (around 50M), Rainbow is still pretty much the best out there, since there's been barely any other research since. Lots of big research labs like Google focus on either ultra sample-efficient (100k frames) or virtually unlimited frames (up to like 10B). Sadly there isn't much in between, though I am looking to publish the algorithm I'm using here at some point
This is a value-based method, not a policy gradient method, so it's quite different from PPO; it's much more closely related to DQN. I don't use a softmax; rather, the network's outputs are just the AI's value predictions (hence why they don't add up to one). Also, whilst I guess RNN is technically correct (residual neural network), I don't typically hear it called that, because RNN usually refers to recurrent neural networks, which are very different!
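The distinction in that reply can be shown numerically. A minimal sketch with made-up final-layer activations: a policy head (PPO-style) applies a softmax so outputs form a probability distribution, while a value head (DQN-style) emits raw per-action return estimates with no normalisation at all.

```python
# Hypothetical sketch: the same 3 raw network outputs read two ways.
from math import exp

raw_outputs = [2.0, 0.5, -1.0]   # final-layer activations for 3 actions

# Policy head: softmax -> probabilities over actions, summing to 1.
exps = [exp(x) for x in raw_outputs]
policy = [e / sum(exps) for e in exps]

# Value head: the raw outputs ARE the predicted Q-values; acting greedily
# just means taking the argmax, and the values need not sum to anything.
q_values = raw_outputs
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)

print(round(sum(policy), 6))  # 1.0 -- probabilities sum to one
print(sum(q_values))          # 1.5 -- Q-values don't
print(greedy_action)          # 0  -- highest predicted value
```

This is exactly "hence why they don't add up to one": the numbers on screen in the video are value estimates, not action probabilities.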
After the AI completes the race once, how long until it's consistent at the course? Also, are you training the model with a single agent or is it multithreaded?
For most of my other AIs, after completing a single lap they almost immediately complete the entire track. This AI had some more trouble however, not due to the track, but just because with items, lots of crazy stuff happens over the course of 3 laps, so it has to learn to deal with all of that; it easily took another 20+ hours in this case. In short it's multithreaded, but if you want to know how it works, be sure to check out my last video, Evolution of my Mario Kart AI
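The multithreaded setup mentioned in that reply usually means several environment instances collecting experience in parallel into one shared buffer. A minimal sketch, assuming fake placeholder environments (the video doesn't describe the actual architecture, so every name here is hypothetical):

```python
# Hypothetical sketch: 4 worker threads each step their own stand-in
# environment and push transitions into one shared, thread-safe buffer.
import threading
import queue
import random

replay_buffer = queue.Queue()

def worker(env_id, steps):
    """Stand-in rollout loop: one emulator instance per thread."""
    state = 0
    for _ in range(steps):
        action = random.randint(0, 3)   # placeholder policy
        next_state = state + 1          # placeholder dynamics
        reward = 1.0                    # placeholder reward
        replay_buffer.put((env_id, state, action, reward, next_state))
        state = next_state

threads = [threading.Thread(target=worker, args=(i, 100)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(replay_buffer.qsize())  # 4 workers * 100 steps = 400 transitions
```

A single learner thread would then sample batches from the buffer, which is how parallel collection speeds up wall-clock training without changing the underlying algorithm.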
@@aitango Interesting, I've done a bit with NES games and the models have struggled with consistency. Thanks for the reply, I watched the last video, must have forgotten you mentioned it. Really been enjoying the videos, excited for more.