Dec 13, 2014 Javascript, Biology

Genetic Algorithms

In my Ludum Dare 31 entry, Bunnies!, I used a very simple genetic algorithm to breed bunnies. It can seem like magic at first, but genetic algorithms are very simple at their core.

“Genetic algorithm” sounds really fancy… but really, it’s just a type of search algorithm.

Imagine we have the following:

An object that we can duplicate and mutate gradually
A way to rank objects

With just these two concepts, we can create a simple search algorithm:

Create lots of random objects
Rank the objects
Keep the top ranking objects, delete the low ranking objects
Mutate the top ranking objects in random directions, making more objects
Go to step 2

That’s it!

Over time, if the objects have a gradual path to the optimum state, and the randomness leads the mutation in that direction, then the search algorithm will discover better and better objects.

(“Better” is a bit subjective… in this context, “better” really means “higher ranking”, or “more fit”)

Duplicate and Mutate

Implementing the search algorithm should be simple – once we have an object we can duplicate and mutate gradually. That’s really where the magic happens.

There are many ways to create an object that can duplicate and mutate. However, for our search algorithm to work well, we need to try and design mutation that leads to gradual improvement.

This is a different mindset than most programmers are used to. We usually optimize our programs for speed (go fast!), or space (use less memory!); in this case, we are optimizing our program so that the search algorithm will produce an evolving object.

Genes

Luckily, nature has already done most of the hard work. Instead of creating from scratch, let’s just create a crude model of DNA.

Let’s define a gene as a number, with a range, starting at 0. Every gene will define some property of the object. For example, in the bunnies, I have a gene that represents the bunny’s weight (which is shown by the radius of the body).

In normal programming, we would use an int or double to store this number – but that’s when optimizing for speed/space. For genes, we want a number that can mutate gradually. I tend towards using a string of “0"s and “1"s, and the value is the count of how many “1"s are in the string.

For example, if weight can range between 0 and 4, then I would need a string that is 4 characters long:

"0000" = 0
"0001" = 1
"0010" = 1
"0011" = 2
"0100" = 1
"0101" = 2
"0110" = 2
"0111" = 3
"1000" = 1
"1001" = 2
"1010" = 2
"1011" = 3
"1100" = 2
"1101" = 3
"1110" = 3
"1111" = 4

This seems strange at first, but it has some benefits:

It’s easy to mutate gradually. Simply by flipping a character we are either increasing or decreasing the number by 1. Suppose, instead, that we used actual int values. Flipping a bit causes the number to change by different amounts, depending on the bit; it could be 1, 256, 1024, or more. This is not gradual change.
There is a natural bell curve. This means that we can easily predict what an “average” object will look like. Bunnies can be really fat or skinny, but we’re not likely to see that, because it would mean the weight gene has all “1"s or all “0"s, which is rare.

Noncoding Genes

We could create our object using a series of parameters, all encoded using numbers with the design above. Which works! However, for a little extra spice, let’s take some more inspiration from nature:

Noncoding DNA is extra DNA that doesn’t directly encode a behavior. One piece that I like to incorporate into my genetic algorithms is the idea of “genetic switches”.

In the Bunnies game, you’ll notice that bunnies have different kinds of fur patterns. Some are just solid colors, but others have horizontal bars, or spots (or both!).

There is a gene, for example, that encodes the size of the spots. There is another gene that is simply a flag though: does this bunny have spots, yes or no? This is a switch. If the bunny has spots, then they are drawn according to the “spot size” gene. But all bunnies have the “spot size” gene, whether they have spots or not.

Who cares?

Well, it’s sort of cool. You can have parent bunnies, with small spots, give birth to a bunny without any spots – but this bunny will secretly be carrying the genetic encoding of small spots. That bunny could have children that have small spots – genes inherited from its grandparents.

Having genetic switches allows for our objects to have hidden state, which can produce interesting properties in the long run.

Implementation

Let’s take a look at a basic implementation:

function createSpecies(geneInfo){
  // count the number of 1's in a string
  function countOnes(str){
    return str.split('1').length - 1;
  }

  return {
    create: function(code){
      var obj = { code: code };
      // loop through the genes, and create our object
      for (var i = 0; i < geneInfo.length; i++){
        var gi = geneInfo[i];
        obj[gi.name] = countOnes(code.substr(0, gi.size));
        code = code.substr(gi.size);
      }
      return obj;
    }
  };
}

var bunnyGenes = createSpecies([
  { name: 'weight',   size: 4 },
  { name: 'hasSpots', size: 1 },
  { name: 'spotSize', size: 4 }
]);

var bunny = bunnyGenes.create('011010111');

console.log(bunny);
// => {
//   code: '011010111',
//   weight: 2,
//   hasSpots: 1,
//   spotSize: 3
// }

Gee, that code is actually really simple. Is that really it?

Yes, actually. That is the core.

Now it becomes trivial to implement creating random objects:

...
  return {
    random: function(){
      var code = '';
      for (var i = 0; i < geneInfo.length; i++){
        for (var j = 0; j < geneInfo[i].size; j++)
          code += Math.random() < 0.5 ? '1' : '0';
      }
      return code;
    },
    create: function(code) {
...

Might as well add breeding, too:

...
  return {
    breed: function(dadCode, momCode){
      var code = '';
      for (var i = 0; i < dadCode.length; i++){
        code += Math.random() < 0.5 ?
          dadCode.charAt(i) : momCode.charAt(i);
      }
      return code;
    },
    random: function() {
...

Mutation is nice too – this allows for new things to randomly come into existence:

...
  return {
    mutate: function(code, mutateFreq){
      for (var i = 0; i < code.length; i++){
        if (Math.random() > mutateFreq)
          continue;
        // flip a bit
        code = code.substr(0, i) +
          (code.charAt(i) == '1' ? '0' : '1') +
          code.substr(i + 1);
      }
      return code;
    },
    breed: function(dadCode, momCode) {
...

Easy!

Ranking

The last piece is to give some sort of meaning to the values.

This could be really simple (scoring function), or really complex (interactive environment with predators, hunger, etc), or user driven (select which objects live or die).

In the Bunnies game, my original plan was to have a complex environment where the genetic factors determine behavior, and certain behaviors were rewarded. In practice, I never got around to coding it all – and only the user has the power to truly influence the gene pool.

Imagine, though, if I had a property called “evasion”, ranging from 0 to 9. At 0, it would completely ignore the predator (snowman). At 9, it would have perfect intelligence and always evade the predator before being killed.

Over time, this gene would drift towards 9. Not terribly exciting, but still pretty cool.

Where things get more interesting is if we have properties in conflict. For example, instead of a single evasion property, imagine three:

Evasion (0 - 9): how aggressive the bunny is at evading the predator
Hunger (0 - 9): how aggressive the bunny is at chasing carrots
Speed (0 - 9): how fast the bunny hops

These values are encoded in the genes.

However, now we introduce a statistic: health.

All bunnies are born with 50 health. Hopping takes 1 health. Eating adds 1 health. If a bunny loses all health, or gains over 70 health, they die.

Where is equalibrium? Who knows! This is far more interesting – now the genetic search algorithm is actually solving this problem.

Scoring Function

Just to round things out, let’s implement a simple scoring function, so we can see the bunnies evolve in our demo code:

var bunnyGenes = createSpecies([
  { name: 'evasion',  size: 9 },
  { name: 'hunger',   size: 9 },
  { name: 'speed',    size: 9 },
  { name: 'weight',   size: 4 },
  { name: 'hasSpots', size: 1 },
  { name: 'spotSize', size: 4 }
]);

function scoreBunny(b){
  return 0 +
    b.evasion + // evasion is good
    b.hunger +  // hunger is good
    b.speed +   // speed is good
    -b.weight + // weight is bad
    // if the bunny has spots, larger is better
    // otherwise, no spots is worth 3
    (b.hasSpots == 1 ? b.spotSize : 3);
}

var bunnies = [];

// create 20 random bunnies
for (var i = 0; i < 20; i++)
  bunnies.push(bunnyGenes.create(bunnyGenes.random()));

// cycle through 1000 generations of bunnies
for (var g = 0; g < 1000; g++){
  // sort best bunnies to top of list
  bunnies.sort(function(a, b) {
    a = scoreBunny(a);
    b = scoreBunny(b);
    if (a > b) // if 'a' scores more, then sort it to the top
      return -1;
    return 1;
  });

  // kill the bottom 15 bunnies
  bunnies.splice(5, 15);

  // breed remaining bunnies to create 15 new bunnies
  for (var i = 0; i < 15; i++){
    var dad = Math.floor(Math.random() * 5);
    var mom = Math.floor(Math.random() * 5);
    var child = bunnyGenes.breed(
      bunnies[dad].code, bunnies[mom].code);
    child = bunnyGenes.mutate(child, 0.1);
    bunnies.push(bunnyGenes.create(child));
  }
}

// output our refined population
console.log(bunnies);

What are the results?

Well, it works! For me, the top 4 bunnies have the best possible score:

code: "111111111111111111111111111000011111"
evasion: 9
hasSpots: 1
hunger: 9
speed: 9
spotSize: 4
weight: 0

The 5th ranking bunny has the third highest score:

code: "111111111111110111111111111000001001"
evasion: 9
hasSpots: 0
hunger: 8
speed: 9
spotSize: 2
weight: 0

This bunny could rank higher with an extra hunger value, enabling its spots, and increasing its spot size to 4. This would no doubt happen if more generations were created.

Summary

So, as long as we can create a scenario with a gradually mutating object, and some sort of ranking environment to select winners and losers – along with breeding the winners to produce the next generation – our population will become more and more fit.

Don’t get caught up on this specific implementation. As long as we satisfy the overall requirements, it will work. You could implement breeding by averaging values, and mutation by adding a random amount to values. Or you could implement encoding with a base-4 system, like, say, AGCT.

Happy coding!