Experiments in AI - Evolution Part 6


by that games guy

This is part of an ongoing series in which we experiment with different methods of AI. We’ll look at the state of the art and the out of fashion, the practical and the (seemingly) impractical, to find what works and what doesn’t. In this, the final part, we finish our genetic algorithm (that we started here) by applying a mutation to the newly evolved networks, and then we write a bit of code to output the average time alive for each generation. We can use this to make sure that our eyes are not deceiving us and that the UFOs really are evolving to become better at avoiding each other with each new generation.

So what do I mean when I talk about mutating a neural network?

Mutation is a genetic operator used to maintain genetic diversity from one generation of a population of genetic algorithm chromosomes to the next.

By making small random tweaks to our neural network weights we get a new solution that may not have been found by combining two neural networks (crossover) alone. If this mutation is beneficial then there is a good chance that it will be added to the pool of possible solutions and re-used in future generations.
Before we implement our mutation function, we need a method of limiting when a mutation occurs. We don’t want the mutation to occur on every new neural network so we need some way of setting the chance a mutation will happen. We’ll do this in our GeneticAlgorithm class.
GeneticAlgorithm.hpp
class GeneticAlgorithm
{
…
    
private:
…
    float mutationChance;
};
GeneticAlgorithm.cpp
GeneticAlgorithm::GeneticAlgorithm(ObjectCollection& objects, WorkingDirectory& workingDir, ResourceAllocator<sf::Texture>& textureAllocator, Window& window) : objects(objects), workingDir(workingDir), textureAllocator(textureAllocator), window(window), maxPoolSize(40), totalFitnessScore(0.f), mutationChance(0.1f) {…}
Setting the mutation chance to 0.1 in the constructor means that on average 1 in 10 neural networks will be mutated. This may be slightly higher than we would want for this project but we can always test and adjust once we’ve finished implementing mutation.
We’ll apply the mutation to our neural networks in the Update function.
GeneticAlgorithm.cpp
void GeneticAlgorithm::Update(float deltaTime)
{
…
                    
	if (parentOne >= 0 && parentTwo >= 0)
	{      
		const NeuralNetwork networkOne = pool[parentOne]->neuralNet->Get();
		const NeuralNetwork networkTwo = pool[parentTwo]->neuralNet->Get();

		NeuralNetwork childNetwork = CreateNeuralNetworkFromCrossOver(networkOne, networkTwo);

		// We create a random float between 0 and 1.
		float mutation = static_cast<float>(rand()) / static_cast<float>(RAND_MAX);
		if (mutation < mutationChance)
		{
			// If mutation is less than our mutation chance
			// we mutate the network.
			childNetwork.Mutate();
		}

		std::shared_ptr<C_NeuralNetwork> newUFO = SpawnUFO();
		newUFO->Set(childNetwork);
	}
…
}
We call the Mutate function on our neural network class. This method does not exist yet so we’ll create it now.
NeuralNetwork.hpp
class NeuralNetwork
{
public:
    void Mutate();
…
};
NeuralNetwork.cpp
void NeuralNetwork::Mutate()
{
	// Gets a vector of this network's weights.
	std::vector<float> weights = GetWeights();

    const unsigned int numOfWeights = weights.size();
    // Calculate how many swaps we want to do.
    int numOfSwaps = (rand() % static_cast<int>(numOfWeights / 2));
    
    for (int i = 0; i < numOfSwaps; i++)
    {
		// Gets random first position.
        int pos1 = (rand() % static_cast<int>(numOfWeights));
        int pos2 = pos1;
        
        while(pos1 == pos2)
        {
			// Sets random second position. 
			// We do this until position one and two are different.
            pos2 = (rand() % static_cast<int>(numOfWeights));
        }
        
		// Swaps the values at these two positions.
        std::swap(weights[pos1], weights[pos2]);
    }
    
    SetWeights(weights);
}

I’ve used a specific type of mutation called ‘swap mutation’, which gains its name from the fact that we swap weights at two locations. There are other mutation variants that you can play around with; some may work better than swap mutation, but I know from previous experience that swap mutation does a decent job. It works like this:

  1. Calculate how many swaps we want to do.
  2. For each swap:
    1. Get a random first weight position.
    2. Get a random second weight position (ensuring it is different from the first).
    3. Swap the values at these two positions.
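The steps above can also be sketched as a standalone function. This is a hedged sketch, not the project's actual code: it operates on a plain weight vector and uses the standard `<random>` facilities instead of `rand()`, and the name `SwapMutate` is illustrative.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Swap mutation: exchange the values at randomly chosen pairs of
// positions. Note that this only reorders weights, so the multiset
// of weight values is unchanged by the mutation.
std::vector<float> SwapMutate(std::vector<float> weights, std::mt19937& rng)
{
    const std::size_t n = weights.size();
    if (n < 2) { return weights; } // Nothing to swap.

    // 1. Decide how many swaps to perform (up to half the weight count).
    std::uniform_int_distribution<std::size_t> swapCount(0, n / 2);
    std::uniform_int_distribution<std::size_t> position(0, n - 1);

    const std::size_t numOfSwaps = swapCount(rng);
    for (std::size_t i = 0; i < numOfSwaps; ++i)
    {
        // 2a. Pick a random first position.
        std::size_t pos1 = position(rng);

        // 2b. Pick a second position until it differs from the first.
        std::size_t pos2 = pos1;
        while (pos2 == pos1) { pos2 = position(rng); }

        // 2c. Swap the values at the two positions.
        std::swap(weights[pos1], weights[pos2]);
    }
    return weights;
}
```

A handy consequence of the reorder-only property is an easy sanity check: sorting the weights before and after a mutation should give identical vectors.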
And that’s it for the mutation function. Now when you run the simulation the neural networks will be mutated based on the set mutation rate.
And that’s the last bit of code we will write for our evolutionary algorithm in this series!

Now for the last part: testing. We could just watch the UFOs but I like to be able to prove with numbers that we’ve actually achieved our original goal. We can do this quickly and simply by logging the average time alive for each generation to a text file. We’ll do this in our GeneticAlgorithm class.

GeneticAlgorithm.hpp
#include <iostream>
#include <fstream>

class GeneticAlgorithm
{
…
private:
…
    void LogAverageFitness();
    std::ofstream log;
    int genNumber;
	int addedToGame;
};
Along with the function that logs the output to file we also have an ofstream (http://www.cplusplus.com/reference/fstream/ofstream/) which we’ll use to write to the file, a genNumber which stores our current generation number, and addedToGame which stores how many UFOs we’ve added to the game. We’ll use addedToGame to calculate the number of generations that have passed. A generation has passed when the number of UFOs spawned exceeds the maximum number of UFOs we store in the pool. We’ll calculate this in the Update function.
GeneticAlgorithm.cpp
void GeneticAlgorithm::Update(float deltaTime)
{
…
	std::shared_ptr<C_NeuralNetwork> newUFO = SpawnUFO();
    newUFO->Set(childNetwork);
    
	if (++addedToGame > pool.size())
	{
		LogAverageFitness();
		addedToGame = 0;
	}
…
}
When addedToGame is greater than our max pool size we call our log function and reset addedToGame for the next generation.
GeneticAlgorithm.cpp
void GeneticAlgorithm::LogAverageFitness()
{
    log.open("ga_log.txt", std::ios_base::app | std::ios_base::out);
    log << std::to_string(++genNumber) << ": " << std::to_string(totalFitnessScore / maxPoolSize) <<  std::endl;
    log.close();
}

We calculate the average fitness by dividing the total fitness score (the sum of all the UFOs’ time alive in the pool) by the number of UFOs in the pool. If you look in the resources folder for this part on the GitHub page you’ll find an example log file of around 370 generations called ‘ga_log.txt’. As you’ll see, the average time alive does increase with each generation, and it does so quite drastically at times. In generation 1 the average time alive is just over 1 second, but by generation 371 the average is 778 seconds, which means that the UFOs, on average, survive for nearly 13 minutes!
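Since each log line has the `generation: average` shape produced by LogAverageFitness, checking the trend doesn’t have to be done by eye. A minimal sketch, assuming the log is read as a stream; the helper name `ParseAverages` and the sample values are illustrative, not part of the project:

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Parses lines of the form "generation: averageFitness" and returns the
// averages in order, so the trend can be inspected programmatically.
std::vector<float> ParseAverages(std::istream& in)
{
    std::vector<float> averages;
    std::string line;
    while (std::getline(in, line))
    {
        // Everything after the colon is the average fitness value.
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) { continue; }
        averages.push_back(std::stof(line.substr(colon + 1)));
    }
    return averages;
}
```

With the averages in a vector you could, for example, compare the mean of the first and last ten generations to confirm the population really is improving.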

They accomplish this fantastic average time alive by only moving when necessary. They become mostly stationary and only move when another UFO comes very close to them. They choose to mostly ignore the UFOs that are within their sight range but not close enough to damage them.
You’ll have to trust me that these UFOs are not moving much.
And that just leaves me to conclude the project. There are a few questions I would like to answer: did we meet our original goal, and is this a realistic method to use in a real game?
Let’s reiterate our initial goal:

Have 60+ UFOs onscreen that have taught themselves to avoid each other and the sides of their environment.

Well, in the image above there are 200 UFOs who have achieved exactly that. So I would say yes, we’ve definitely met our initial goal! Which is fantastic, but is this method of AI currently realistic? To answer that I’ll list some of the pros and cons of our implementation. We’ll start with the good points:
  • The UFOs mostly behave as we intended.
  • We didn’t have to write much solution specific code. If we change the fitness function we can adapt this to other problems without too much hassle.
  • We can store the weights for the top performing UFO in a data file and load this in whenever we need it. There is no need to evolve the UFOs twice.
  • In more complicated projects there is potential for our evolutionary algorithm to reach solutions to the problem that were not originally thought of by the programmer (i.e. me).
And the negatives:
  • We had to write a substantial amount of code for a relatively simple task. The more code we write the more opportunities there are to introduce bugs.
  • The UFOs do not behave exactly as intended, even after leaving the game running for 30 minutes. There were still some UFOs who preferred to kamikaze into the side of the window now and then.
  • There is the occasional jerky/jittery movement that would look terrible in a final game. This is due to the UFOs receiving conflicting instructions on adjacent frames. For example, on one frame the closest neighbour to a particular UFO is on the left, so the UFO moves right, and now it’s too close to a UFO on the right so it moves left and so on. This effect can be reduced by adding smoothing to our movement code.
  • We cannot easily tweak the UFOs’ behaviour. Say we have a top performer that is doing nearly everything we want, but for some reason whenever it gets near the left side of the window it runs offscreen; there is no way to change just that bit of the UFO’s behaviour without manually editing its weights and then testing the effect that a small tweak has. We would most likely have to perform the whole evolutionary process again.
  • Even if we imported a set of weights and disabled the genetic algorithm, running a separate neural network for each UFO is processor intensive. If we were using more traditional steering behaviours we could have many more UFOs onscreen.
  • How do we handle edge cases? There weren’t any (that I can think of) in this game but as we increase complexity and interactivity there may be actions that happen in the shipped game that we did not account for when training the neural networks. How would our UFOs react to a player intentionally trying to break the game?
  • It can be difficult to debug just why a character is doing something we deem weird or incorrect. Is it a fault in the code? The testing environment? Just a mutation? Or does it actually solve the problem in some way, just not one we would like?
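As an aside, the smoothing fix mentioned in the jitter point can be as simple as blending each frame’s freshly computed steering value with the previous frame’s. A minimal exponential-smoothing sketch; the function name and the smoothing factor are assumptions to tune, not part of the project’s code:

```cpp
// Exponential smoothing: blend the newly computed steering value with
// the previous frame's value so single-frame direction flips no longer
// produce visible jitter.
float SmoothSteering(float previous, float target, float smoothing)
{
    // smoothing = 0 keeps the old value forever; 1 follows the target
    // instantly. Values around 0.1-0.3 per frame are a common start.
    return previous + smoothing * (target - previous);
}
```

Applied per axis to the movement direction, repeated calls converge on the target instead of snapping to it, which trades a little responsiveness for much steadier motion.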
I’m sure I could think of more for each column but yikes, there do seem to be a few more negatives than positives. And to be honest this definitely isn’t the best way to solve our initial problem. I knew that from the start but by going through this process we’ve learnt a lot about how these systems work and we can apply that knowledge in the future to problems that may be a bit more suited to our neural network/genetic algorithm combo.
And that’s all for this part of the series! Well done for making it this far. If you want to continue the project here are some ideas to get you started:
Do let me know if you need help with anything, have a great idea of how to use this in a real game, or have spotted some obvious bug in my code that I’ve missed. I think next time I’ll try a more traditional form of AI, possibly goal-oriented action planning. Maybe we could combine our neural networks with the more structured actions in GOAP? Whatever I choose to do, it won’t be too long before I start another ‘Experiments in AI’ series, as I had a great time writing this one!

Quick update: I made an overly long video showing the evolution of the UFOs. If you jump from the beginning to near the end of the video you’ll notice a stark contrast to how they initially behave (random movement) to how they behave at the end of the video (purposeful, minimal movement).
I hope you’ve enjoyed the series (or at least found it useful). I’ve definitely had fun writing it. Until next time!