https://github.com/thatgamesguy/ai_experiments-evolutionThis is part of an ongoing series in which we experiment with different methods of AI. We’ll look at the state of the art and the out of fashion; the practical and the (seemingly) impractical; to find what works and what doesn’t. In this part, we will finish the Genetic Algorithm that we started here. You can read the first article in the series for more information on what we are trying to accomplish.So far we’ve implemented our UFOs, provided them with neural networks, and started our genetic algorithm. Which means we’re almost finished the series. There are only a few more things we need to accomplish:
- We need a way of defining if a UFO is ‘fitter’ than another one. More on this shortly.
- We need a method of selecting the UFOs from the pool based on their fitness.
- Once two UFOs are selected from the pool we need a method of combining their neural networks to create a new network.
- Once we’ve created the new network we need a way to mutate it.
We’ll accomplish most of this today. We’ll leave mutation to the next (and final) part in the series.Let’s start with creating a method of defining fitness for our UFOs. A fitness function can be difficult to define as there may be complex goals that we want to accomplish, however in our game, with its relatively simple goal, we can define it quite simply:A UFO is fitter than another UFO if stays ‘alive’ longer.As we reduce a UFOs energy level when they collide with each other or the screens side and remove them when their energy levels reach zero, we can safely assume that if a UFO is able to survive longer in the game then it is better at the assigned task. There will be some edge cases where a UFO survives by luck (i.e. maybe all other UFOs avoided it) but that will be rare and will soon be rectified as the general population becomes fitter.We’ll record a UFOs fitness by implementing a component called C_TimeAlive.This component holds a float that represents a UFOs total time alive and provides access to it. In the update function we add delta time to timeAlive each frame.Don’t forget to add the new component to all newly created UFOs. You can add the component anywhere after creating the object but before adding it to the object collection. Now whenever a UFO is added to the game this counter will start incrementing. We can then compare two agents fitness using this time alive value.We can now write the function that will select UFOs from the pool based on their fitness. For this project, I’ve chosen to use Fitness Proportionate Selection (LINK). Using this method, UFOs with a higher fitness score (in our case time alive) have a better chance of being selected, although it does not guarantee that the fittest members will be selected. This provides some randomness in our selection process. If we just selected the two best UFOs from the pool our neural networks would not be diverse and would converge on a solution quickly even if the solution is not optimal. Solution: a vector of weights that represent a neural network. The weights will move the UFO in a specific manner based on its input. Better solutions will move the UFO away from other UFOs and the screen sides.You can think of it as a roulette wheel where a UFO with a high time alive has a larger share than the other UFOs in the pool. We then ‘rotate the wheel’ that selects a UFO at random. The UFOs with the larger share are more likely to be selected.
Of course there is no wheel in our implementation but it is a good way of visualising it.We’ll add two new functions to our GeneticAlgorithm class and a new member variable. To select an individual we need to know the total time alive for all UFOs in the pool, therefore we add a new function called CalculateTotalFitness to calculate that for us. This loops through all individuals in the pool and sums their time alive.Next up is our FitnessProportionateSelection function which returns an index to an individual from our pool collection.
- We calculate a random float between 0 and our total fitness score.
- We then iterate over the pool and add each individuals time alive. If the individuals time alive (plus all of its predecessors) is greater than the random slice then this individual is selected.
So for example:Say we have 5 individuals in the pool and we’ve sorted them in descending order:
- Cathleen 500
- Jim 450
- Samantha 300
- Nigel 200
- Greg 100
Their total fitness score is 1550.We then generate a random number between 0 and 1550, which we’ll say is 700.Now we loop through each individual in the pool:
- We add Cathleen’s fitness to a fitnessTotal – 0 + 500 = 500. This is not greater than our random slice of 700 so we move onto the next individual.
- We add Jim’s fitness to the fitnessTotal 500 + 450 = 950 (Cathleen’s + Jims fitness) and this is greater than our random slice so Jim is selected this round.
- We no longer need to check any further individuals and we return Jim’s index in the pool.
We need to calculate the total fitness score every time we add a UFO to the pool and once the fitness score is calculated we can retrieve two parent UFOs using our FitnessProportionateSelection function. We’ll do this in our GeneticAlgorithms Update function.And that’s it for our selection process. We now select two parents from the pool when a UFO is removed from the game. The next step is to combine their neural networks to create a new offspring. We’ll add a new function to GeneticAlgorithm to do just that.
- We retrieve each networks weights as a vector of floats.
- We retrieve the split points for the first network. This provides a vector of indices with each index pointing to a boundary between neurons. I’ll explain split points in more detail shortly.
- A random crossover point is selected from the vector of points. Using this point we will combine all of the parent one’s neurons before the point and all of parent twos neurons after the point.
- We add all of parent one’s weights before the crossover point into a new vector of weights.
- We do the same for parent two but this time after the crossover point. With this done we have a new set of weights that combines both parents.
- We create a new neural network and set its weights equal to the combination of its parents.
Getting back to crossover points. It may help if I showed you a visual example of what I mean when I say they point to a boundary between neurons.
In this example, there are two split points (between -0.2 and 0.6, and between -0.1 and 0.4). This image was borrowed from the book AI Techniques for Game Programming. A book I cannot recommend enough as it was the inspiration for this series.
We use split points to help us preserve individual neurons. For example, using the previous image, if we combined this network with another without using crossover and it selects a random point between 0.6 and 0.1, neuron 2 in this network is disrupted and its behaviour lost. Now that we have a method of combining neural networks we need to change our Update method.We retrieve the parent’s neural networks and call our new CreateNeuralNetworkFromCrossOver function, passing it the neural networks. We then spawn a UFO and give it the newly created child network.Now when you run the game our UFOs actually learn! If you watch them for a number of minutes you’ll see their behaviour change. In the beginning our little UFOs will move around randomly without much purpose. Some will move towards other UFOs, some will move away, and some UFOs will take no notice of anything around them. As time progresses and more UFOs are spawned they will start to react purposefully, they’ll move away from nearby UFOs and try to avoid the sides of the screen. The longer you leave the game running the better they’ll get. This is exactly what we set out to accomplish.
UFOs at the beginning of the game.
UFOs after 10 minutes.And that’s it for this part and almost for the whole series, in the last part we’ll add a mutation function and some way of logging data to a file so we can confirm that the UFOs are actually learning as the game progresses. As always, thank you for reading 🙂