Just a simple implementation of a neural net for evolving a car to finish the track. The neural network itself doesn't evolve in shape, only in its neuron connection weights. Made in Unity. Music: "Righteous" by Silent Partner
To be honest, the code itself is not very complex; it's only a few hundred lines of code. I pretty much followed the Cokemons path. There are plenty of amazing tutorials on this matter; I started with a channel called "The Coding Train".
I find it highly intriguing that it seems to be right-dominant, and I can't help but wonder: if it were given virtual hands, would it be right-handed? Which sparks a thought: the first part of the track had higher error potential on the left side than it did on the right, so the neural net had to "focus" on the challenge of the left-hand side and the ease of the right; perhaps _this_ contributed to its right dominance.
I agree with Pablo Pazos; it would be cool if you could give an overview of what is really happening to the neural network algorithm during a test, and what the changes are for each generation.
This is not a NEAT implementation, so the network structure doesn't evolve, just the weights. After each population is done I use a genetic algorithm to generate a new population: I use ranking-based selection combined with stochastic universal sampling to select the mating pool. Then each new genome gets its weights from either parent A or parent B; I tried crossing over entire layers, but this didn't give me good results. Then each weight has a chance to mutate, meaning it will get a new random value within the weight range.
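The selection and variation steps described above can be sketched in Python. This is a minimal illustration only; the function names, mutation rate, and weight range are my own assumptions, not taken from the actual Unity project:

```python
import random

def sus_select(population, fitnesses, n_parents):
    """Rank-based selection with stochastic universal sampling:
    individuals are weighted by rank (1 = worst), then n_parents
    evenly spaced pointers sweep the cumulative weight line."""
    ranked = [p for _, p in sorted(zip(fitnesses, population), key=lambda t: t[0])]
    weights = list(range(1, len(ranked) + 1))
    step = sum(weights) / n_parents
    start = random.uniform(0, step)
    cumulative, running = [], 0
    for w in weights:
        running += w
        cumulative.append(running)
    selected = []
    for k in range(n_parents):
        pointer = start + k * step
        for i, c in enumerate(cumulative):
            if pointer <= c:
                selected.append(ranked[i])
                break
    return selected

def crossover(parent_a, parent_b):
    # Each weight is inherited wholesale from parent A or parent B.
    return [wa if random.random() < 0.5 else wb
            for wa, wb in zip(parent_a, parent_b)]

def mutate(genome, rate=0.05, lo=-1.0, hi=1.0):
    # Each weight has a small chance to be replaced by a fresh
    # random value within the allowed weight range.
    return [random.uniform(lo, hi) if random.random() < rate else w
            for w in genome]
```

A single pass of SUS uses one random starting point and equally spaced pointers, which keeps the spread of selected parents closer to their expected counts than repeated roulette-wheel spins.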
Oh yeah, I could see in the video that the network structure doesn't evolve. Actually I had a question about the hidden nodes: how did you choose the number of those "hidden" nodes? Does it depend on something, or did you just choose a number without any clear reason? This is very interesting anyway, and the way your NN masters the track at generation 33 is very impressive. Thank you for your explanation.
Well, from what I learned the number of hidden nodes is chosen arbitrarily; that's why NEAT is much better, because it skips the limitations of this prerequisite. I also learned somewhere that in general it's good to have at least inputs + 1.
Very nice. How did you go about using a genetic algorithm? Based on one of your previous comments, I see that you only passed the weights, meaning an offspring got 100% of each weight from either A or B? What about your biases? Were these static, or initialized with every new vehicle? Thanks for sharing your research! :D
The offspring is created through node-level crossover, so each neuron gets all of its incoming weights from one of the parents, chosen at random. In my implementation the bias is an additional node in the previous layer, and its weights are subject to crossover and mutation just like any other weight. The bias value itself is constant, though.
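That node-level crossover could be sketched like this. It's a hedged illustration; the genome layout, where each neuron is represented by the list of its incoming weights with the bias weight included, is my assumption:

```python
import random

def node_crossover(parent_a, parent_b):
    """Node-level crossover: each neuron inherits ALL of its incoming
    weights (including the weight from the constant bias node) from
    one parent chosen at random, rather than mixing weights within a
    single neuron. Parents are lists of layers; each layer is a list
    of neurons; each neuron is its incoming-weight list."""
    child = []
    for layer_a, layer_b in zip(parent_a, parent_b):
        child.append([list(random.choice((neuron_a, neuron_b)))
                      for neuron_a, neuron_b in zip(layer_a, layer_b)])
    return child
```

Keeping a neuron's incoming weights together preserves whatever "feature" that neuron has learned, which is likely why it outperformed layer-level crossover here.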
Ah, yes! I hypothesized that your method would be an optimal way to train a reinforcement-learned neural net, although I have yet to train my first one. Have you compared the convergence of this method with other genetic methods, say, where every parameter is a gene instead of each node? (Sorry to probe you; I am trying to get as much data from you as I can before I write one xD) ALSO, WHAT'S THE FUNCTION TO DRAW LINES IN UNITY? (via your NN visual in the top right corner)
Hi Jabrils, what do you think about a hybrid approach, i.e. a combination of a genetic algorithm with other evolutionary strategies such as particle swarm or a gravitational search algorithm? (Sorry for my English.)
I don't have the slightest clue as to how these work or are programmed, but this fascinates me beyond belief :D I remember watching something with Super Mario 3 some years back, about a neural net learning to play.
Very fancy demonstration. I personally started looking into neural networks, and I was wondering if your code is open sourced; I'd be interested in attempting a NEAT-based implementation for personal use :)
I did something very similar (without the cool graphics) using TD(lambda) in 1998 with the goal of fastest laps using a reward function of "-1" each second and something really severe (-100?) for crashing. It worked pretty well, though occasionally I'd grow a pathological variation that would minimize the function by crashing as soon as it was able. This is really cool stuff. Thanks for sharing.
One suggestion for a fitness function that would take speed into account (i.e. reward faster runs, punish slower runs) is to have a constant living penalty: each time interval (e.g. every second), the fitness score decreases by a fixed amount. This means that for faster cars, the fitness increases at a greater rate. However, you would either want to apply this only once the cars can sort of get around the track, or make it non-linear; otherwise they'd learn too slowly.
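The living-penalty idea amounts to something like this sketch (the penalty constant is arbitrary, not from the video):

```python
def fitness_with_living_penalty(distance, elapsed_seconds, penalty_per_second=1.0):
    """Distance covered minus a constant penalty per second alive:
    of two cars covering the same distance, the faster one scores
    higher, so speed is rewarded on top of progress."""
    return distance - penalty_per_second * elapsed_seconds
```

With this shaping, a car that covers the same distance in less time keeps more of its score, which is exactly the "reward faster runs" effect described above.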
whitepen I will test this and let you know for sure. But in theory it doesn't perceive the track itself; it only reacts to sensor input, so my guess is that it should do just fine on any track, with the exception of ones having sharper curves than those in the learning session. In that case the car might not know that it has to decelerate. I will let you know when I test it.
Most of these neural networks do not consider past actions at all, so it is not a matter of memorization; however, it is possible that the decision making was improperly optimized due to a coincidental order of operations. In other words, certain combinations of turns may expose flawed neural connections. Also, overtraining a neural network in a sterile test environment could cause the network to completely fail to adapt to changes when brought to a less sterile environment.
I was about to ask the same thing as @whitepen. I think that a little overfitting may be produced due to imperceptible patterns in the track being used for training. Would it be better if a set of tracks were used for training, let's say 5 tracks, each with different degrees in the turns or even different track widths? Also, is this code available on GitHub? I'd be glad if it is.
So I tested it by adding two new tracks. I paused the best genome, which had beaten the #1 track, and switched the track. The simple track (#2) with soft turns was beaten without a problem, but another track (#3) with sharp turns couldn't be beaten. So I continued to run the genetic algorithm, and the car was able to beat the #3 track after 10 more generations. Then I swapped back to track #1 and it failed to beat it; it needed another 6 generations to finish it.
If you cast even more input-signal rays, the car will take the turns more smoothly. The rays should reach further down the road. It would also be good to add rays on the sides.
Hi! I am doing a very similar project in UE4. What kind of mutation did you use? Also, did you use recombination? What kind: one-point crossover, two-point, or uniform? Thanks!
very cool video. I'd love to hear an explanation of what's going on in that top right neural network map as it's happening... I couldn't quite figure out what that changing normal map actually meant
I did something similar and came across NEAT neural networks. These work in much the same way. The problem you will have though is that your evolution becomes part of your overall algorithm. The evolution may end up causing your networks to be overtrained. They might not be able to generalise. When you bring pedestrians into the mix you might get problems as people are unpredictable.
Have you tried adding recurrent connections, so that the car has some sense of velocity/memory? Also, why not give it more or longer distance sensors?
I think it would be more accurate to use another form of crossover, such as blended crossover, because a neural network is defined over a continuous domain, not a discrete one. In your case, though, it's treated as a discrete domain in the sense that you are using each node as a single point of a multi-dimensional vector!
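For reference, blended crossover (BLX-alpha) on a single weight looks roughly like this (a sketch; alpha = 0.5 is a common default, not a value from the video):

```python
import random

def blx_alpha(weight_a, weight_b, alpha=0.5):
    """BLX-alpha: draw the child weight uniformly from the parents'
    interval extended by alpha times its span on both sides, so the
    child can explore slightly beyond both parents."""
    lo, hi = min(weight_a, weight_b), max(weight_a, weight_b)
    span = hi - lo
    return random.uniform(lo - alpha * span, hi + alpha * span)
```

Unlike picking a weight wholesale from one parent, this exploits the continuous domain by producing intermediate (and slightly extrapolated) values.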
I've always wondered if fluid dynamics could be used for more natural AI pathing. If you watch the way people walk around things, it's similar to the fastest part of water flowing through rocks. The water in contact with the rock creates a kind of barrier that makes all subsequent water flow more efficiently and naturally.
Hey, can you tell me what strategy you used when mutating the weights? Also, how many networks were discarded and replaced in each generation? Thanks.
Are those sensors boolean sensors? What if you extended them and fed in the 3 distance vectors instead (9 ANN inputs vs. 3)? That way it could speed up on straightaways and slow down around curves.
We're trying to build a similar project. Can you please explain the measure of fitness and the fitness function in your implementation? We're planning to make the AI control just the steering; the acceleration will be fixed, so the time elapsed seems like a good measure for fitness, isn't it?
Hurdalık Cini I had distance + the square root of average velocity, or something like this. But with constant speed, just distance should do for a simple project; otherwise you can also measure dangerous situations, e.g. getting too close to a wall.
Tomek S thanks for the reply, we will use distance. One more thing: what about the threshold values of the neurons? Are they constant, or are they included in the chromosome and evolved with the weights?
Is there steering sensitivity or variable speed controlled by the neural network? If not, why did you use a hidden layer? Just connecting the inputs to the outputs would work.
I think it could perform better if it were a recurrent neural network (e.g. LSTM, GRU, or a plain RNN), since it would know what it was doing. For example, it could accelerate more if it has been driving straight for a long time.
Nice! Try seeing if PSO-based weight changing might perform better in terms of learning speed per time interval. Could you put this on git? It would be fun to test some new ideas with 🤘
You might get better results if you added two more 'whiskers' that extend 3-4 car lengths ahead and are even with the left and right of the car. The two sideways whiskers you already have don't really give it much of a chance to do longer-but-smoother turns; extra whiskers would give the neural network a better chance to 'know' the angle of a curve, which it currently can't, not without becoming a recurrent neural network. I'd be curious to see how it'd act if you added virtual barriers that it can see but pass through, like lane markers. See if it picks up bad habits, like cutting corners or drifting into other lanes.
Really nice. Like you, I also started with The Coding Train. I'm probably going to create a basic bit-based chromosome and train against a fitness function. I'm impressed by the use of neural networks in place of chromosomes. I guess you already made them fully connected and used this genetic algorithm for training the weights only.
Are all the output values calculated only from the three vision inputs? It seems to me that there is no internal memory in the computation that would keep a record of the past. I mean, the neural network will not be able to take account of the curvature of the track, because it only knows how close it is to the left and to the right. Am I correct?
Watching the red and green lines, it appears to me that it learned that if it is able to turn right, it should veer right, or if it can go left, it should veer left, and it should only go straight if that's the only option available. Does that seem accurate?
Is your fitness just the distance the car goes, or does it also use the time? If you used the time and made the max speed unlimited, it would be possible to get the AI to learn to go even faster and become more efficient, right?
Truly awesome. Can you please tell me how to integrate AI into Unity, and what is your GPU configuration? Your help will be appreciated, because I also want to work on a similar project.
Any reason to use genetic programming instead of gradient descent for this? Gradient descent is more efficient as it will change weights in the "right direction" instead of doing it randomly.
Tomek S isn't it also because GD requires the knowledge of the right output for every training sample? As far as I understand the correct output is unknown in this scenario
GD requires just a (differentiable) objective function, nothing else. It's only that in supervised learning settings we need ground truths to evaluate our objective function; otherwise, GD doesn't put any restrictions on the availability of ground truths. If you are faced with a problem setting where you can compute the utility/objective function even without ground truths (such as this one), GD will perform just as fine, I think.
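To illustrate the point: gradient descent only needs a differentiable objective, not labelled data. A toy example minimizing f(w) = (w - 3)^2 directly from its gradient, with no ground truths anywhere:

```python
def gradient_descent(grad, w0, lr=0.1, steps=200):
    """Plain gradient descent on any differentiable objective;
    no ground-truth labels are involved, only the gradient."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# f(w) = (w - 3)^2 has gradient 2 * (w - 3); the minimizer is w = 3.
w_opt = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

The practical catch for the car is whether the objective (simulated track progress) is differentiable with respect to the weights at all; when it isn't, gradient-free methods like the genetic algorithm remain the natural fit.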
Hello, thank you very much for your video; it's indeed really interesting. I've got a question (because I'm new at this and I can't seem to grasp the answer). Let's take an example: at 1:28 you've got generation 8, genome 7. It gets around the first turn okay, then it crashes around the second turn. Now we go to the very next one: generation 8, genome 8. The code has updated itself (in the weights of your neurons) between the 7th and the 8th genome. Am I correct so far? And it's its first breakthrough, so the correction of the weights should be minimal, because it is on the right track and has covered the maximum length ever! So why on earth does it crash on the first corner right afterwards? Why can't it, from one genome to the next, get to at least the point where it crashed before? Is it that the correction is "too strong" and it messes up something somewhere earlier, so that it can't take the first corner anymore? If so, how could one make the correction "less radical" so that it "learns faster"? I hope I made some sense =) Thank you for the answers!
I answered another question like this. The car can drive an easier course but needs additional training on a harder one. Easier or harder is defined by the sharpness of the curves.
Try running parallel simulations. Run multiple variants of the car simultaneously and wait for all to have failed or completed. The top contender of each track is tried on all other tracks as well as being used to bias the adaptations placed upon the other experimental algorithms. It should result in a program that can more reliably respond to changes in the environment rather than creating one that only works effectively on a set environment. Additionally, use a second factor like time in addition to distance or apply time constraints once the distance target is achieved. What I believe would work effectively is to track distance and time/distance ratio. Once an iteration reaches the distance target, begin tracking speed.
I see you used a random function to generate the random internal nodes. I'm thinking about my own approach: how to implement nodes in the places where they are most needed, and how to figure out where a new node should be added. That way the AI should learn much faster than placing nodes randomly and checking whether the added node improved the functionality of the whole neural network or made it worse, and when to break connections between nodes if the value arriving at the next node through a weight is small.
I can see the future going backwards from here. Remove human brains from driving, replace with perfected driving matrix, create a robot neural network and slowly introduce brains back into driving. It will be kind of like the Server / Client of the 70's moving to the personal computer of the 80's & 90's now moving back to the Server (Cloud) / Client (Device) model.
How many weights did you use? What is the output vector? And how did you calculate the fitness value: just by distance, or something else? Sorry, I'm so curious.
You state you have used tutorials to produce this within Unity. Any chance of pointing to them here or in the description? I would like to extend this to my own application: * have the sensors be float distance values (same layout) * change car dynamics to fit my physical chassis * once track is completed procedurally generate a new one
Ritchie Wilson For evolutionary algorithms I watched The Coding Train channel; for neural networks, just theory from some YouTube videos. The rest came from random articles and books on the topic. The implementation is straightforward; the problems come after that, because there are dozens of choices to be made when designing an experiment like this, e.g. what the input should be, how to do crossover, whether to use fitness-based or rank-based selection, etc. This field of science is very wide; this example merely covers a fraction of it.