Genetic Algorithm Tutorial - Part II
Introduction
Toady we will code the Genetic Algorithm, discussed in the previous blog. I will be using Python for this, but feel free to code this in any language.
Generating a Parent
We will first have to do create a string which will be the data source for our random string. Then we will create a empty string and add random characters to the that string. First lets import the required library.
import random
INPUT = str(input("What would you like this genetic algorithm to find? ")) geneSet = "QWERTYUIOPASDFGHJKLZXCVBNM_qwertyuiopasdfghjklzxcvbnm1234567890" target = INPUT def generateParent(length): genes = list ("") for i in range (0, length): geneIndex = random.randint(0, len(geneSet) - 1); genes.append(geneSet[geneIndex]) return (''.join(genes))
Note: In the geneSet you can add special character as <>/?!|\ and even a space.
Fitness Function
Now we will find the fitness of the parent, by counting the number of letters that are same and exist at the same place between the parent and the target.
def getFitness(candidate, target): fitness = 0 for i in range (0, len(candidate)): if target[i] == candidate[i]: fitness += 1 return (fitness)
Mutation Function
Now we will mutate the strings by randomly choosing a letter from the parent/child and then replacing it by random character from the geneSet.
def mutate(parent): geneIndex = random.randint(0, len(geneSet) - 1) index = random.randint(0, len(parent) - 1) genes = list(parent) genes[index] = geneSet[geneIndex] return (''.join(genes))
Display Function
Since, we will have to display the generation etc. multiple times, it is better to just put it all in a single function. This will display a string and it's fitness.
def display(candidate): timeDiff = datetime.datetime.now() fitness = getFitness(bestParent, target) print ("%s\t%i" % (candidate, fitness)) #Fancy string formatting techniques
Initializing the Algorithm
Now we shall initialize the algorithm, by calling the functions defined above.
bestParent = generateParent(len(target)) firstGen = bestParent bestFitness = getFitness(bestParent, target) display(bestParent)
Now we will call a while loop which will run until the fitness of the strings are equal to the length of our target.
while bestFitness < len(bestParent): child = mutate(bestParent) childFitness = getFitness(child, target)
Now we will check, that is the fitness of the child is better ie. greater than its parents'. If it is true, then we shall rename the variables and display the generations.
if childFitness > bestFitness: bestFitness = childFitness bestParent = child generation += 1 display(bestParent)
Now we will display the summary of all of this evolution.
if bestFitness == len(bestParent): print () print ('Summary >>') print ("It took", generation, "generations to reach the word '",target, "' from '",firstGen,"'. ")
Question and its Answer
If you run the program a couple of time, which you should, you will find that the number of generation is always equal to the length of the target. Think about this, why does this happen?
The answer lies in the fact that we display only those generation that have better fitness than their parents, just like evolution. If we create an empty list, add all those generations which do not have a better fitness than their parents and then print the list, the result will surprise you.
Since they "lost", the race of evolution, lets call them the losers.
losers = []
while bestFitness < len(bestParent): ... if childFitness < bestFitness: losers.append(child)
if bestFitness == len(bestParent): ... print (losers) print (len(losers))
Running this code with our target to be 'lion', the list of the losers is 414 items long and with the successful generation to be only 4! For the case of 'The_Original_Coding_Cult' there are 2572 losers and just 24 winners!
In the real world, where mutation happen at some random rate and there is no end point, we must be grateful that we are the "winners" of evolution and we don't end up in the list of 'losers'
Comments