gpjpp is a Genetic Programming package used to generate function trees for various Genetic Programming application. The GPOthello package uses gpjpp to generate function trees used to evaluate the state of the board in an Othello game. The fitness of each individual tree is evaluated based on the performance of an Othello player using the tree to guide its movement selection.
The details of the evolutionary process can be controlled by several configuration parameters. The effect of each parameter on the population fitness is explored in an attempt to discover values that optimize the evolutionary process. The parameters examined during the tests are listed below.
The description of the configuration parameters examined is copied from the gpjpp documentation.
GPVARIABLE = 0 GPGROW = 1 GPRAMPEDHALF = 2 GPRAMPEDVARIABLE = 3 GPRAMPEDGROW = 4 * A tree creation type that is like GPGROW but starts with * a tree depth of two and increases the depth by one for each * tree until it reaches MaximumDepthForCreation, at which point * it cycles back to two. Note that if TestDiversity is enabled * many of the shallow trees are duplicates, so the distribution * is skewed toward deeper trees than one might expect. * * GPVARIABLE : A tree creation type that makes every node but those on the * bottom level a function and that builds every tree to * MaximumDepthForCreation. * If CreationType equals GPRAMPEDVARIABLE, all individuals are * created using the GPVARIABLE strategy and the depth of * successive individuals is ramped from minTreeDepth to * MaximumDepthForCreation. * * If CreationType equals GPRAMPEDGROW, all individuals are * created using the GPGROW strategy and the depth of successive * individuals is ramped. * * GRAMPEDHALF : A tree creation type that builds alternating trees using * the GPRAMPEDVARIABLE and GPRAMPEDGROW methods. The default * method. * * If CreationType equals GPGROW, all individuals are created * using the GPGROW strategy with depth MaximumDepthForCreation. * * If CreationType equals GPVARIABLE, all individuals are created * using the GPVARIABLE strategy with depth MaximumDepthForCreation.
The configuration parameters are defined in a configuration file which was manually modified for each run of the GPOthello application. The results of each run were stored in flat text files which were manually examined for emerging patterns.
The gpjpp package provides measurements of the best, average and worst complexity, depth and variety of the population for each generation. Each of the three measures characterizing the population makeup was expected to affect the run time of the GP, which is also provided by the package. The goal was to investigate the effect of the configuration parameters on the GP's efficiency via the parameters' effect on the the 3 measures. The motivation for the exercise was the attempt to define an evolutionary process which produces the best Othello players (lowest fitness) in the least amount of time.
The most straightforward indication of a variable's effect on the fitness of the individuals generated by the GP was the number of runs (Runs) required to reach the target number of 'good runs' (GoodRuns). A good run is one where the GP generates an individual with fitness below the acceptance threshold (TerminationFitness) within a given number of generations (NumberOfGenerations). With the default configuration for this test, Runs was defined as the number of runs required for the GP to generate an individual with fitness < 50 within 30 generations, in 3 different runs. Runs was taken from a line in gpjpp's output after each run. Example : "Run number 12 (good runs 1)"
However, the most interesting measure of the GP's efficiency is the total run time. The effect of a variable on the runtime of the GP was examined based on the average run time for each generation. The total run time for each evolutionary process was calculated by summing the run times of all the runs and is the primary measure of a parameter value's desirability. A small tcl script was used to parse the GP's output and generate the efficiency indicators. The depth, variety and complexity of the generations produced during each GP test were selectively averaged and are mentioned in a separate column.
#!/usr/local/bin/tclsh puts [format "%10s %23s %20s" Runs "Run time/generation" "Total run time"] foreach fname $argv { set buf [split [exec grep Run [lindex $fname 0]] \n] set run_T 0 set generation_T 0 set Runs 0 foreach line $buf { if {[string first "Run time" $line] == 0} { set run_t [lindex $line 2] set generation_t [lindex $line 4] set run_T [expr $run_T + $run_t] set generation_T [expr $generation_T + $generation_t] } else { incr Runs } } puts [format "%10s %23s %20s" $Runs [expr $generation_T / $Runs] $run_T] }Sample Run :
chi:OthelloProj/othello> ./process.tcl pop2.run pop20.run pop200.run Runs Run time/generation Total run time 161 2.6459 12373.1 15 12.8513 5637.42 5 132.318 15457.8All the run times are in seconds.
To save on CPU time, some tests were terminated before completing 3 good runs. Each test was allowed at least as much run time as the best test before it. Prematurely terminated tests are indicated as 'failed'. The number of good runs at the termination time is indicated in parentheses. For example, if a test was terminated when only one good run was completed, the word failed(1) will follow the test's statistics.
To study the effect of each variable separately, each parameter was given 3 different values while keeping the rest of the parameters unchanged. To minimize the run time, best value for the parameter was kept for subsequent runs. The initial values for all the parameters during the testing are listed below.
# primary GP settings PopulationSize = 200 NumberOfGenerations = 30 CrossoverProbability = 90 CreationProbability = 0 CreationType = 2 MaximumDepthForCreation = 6 MaximumDepthForCrossover = 17 MaximumComplexity = 100 SelectionType = 0 # secondary GP settings TournamentSize = 7 SwapMutationProbability = 0 ShrinkMutationProbability = 0 AddBestToNewPopulation = false SteadyState = false DemeticGrouping = false DemeSize = 100 DemeticMigProbability = 100 # reporting options PrintDetails = true PrintPopulation = true PrintExpression = true PrintTree = false TreeFontSize = 12 # run testing settings TerminationFitness = 50 GoodRuns = 3 UseADFs = false TestDiversity = true ComplexityAffectsFitness = true CheckpointGens = 0
The tests were performed on a 10 CPU Sparc station.
Manufacturer : Sun (Sun Microsystems) System Model : Ultra Enterprise Number of CPUs : 10 CPU Type : sparc App Architecture : sparc Kernel Architecture : sun4u OS Name : SunOS OS Version : 5.6 Kernel Version : SunOS Release 5.6 Version Generic_105181-17 [UNIX(R) System V Release 4.0]
Both the gpjpp and GPOthello code were left intact except for the following bug fixes :
The unlabeled column before the results tables contains the values of the parameter appearing in bold.
PopulationSize |
Runs Run time/generation Total run time Avg. Variety 2 161 2.6459 12373.1 0.710484 20 15 12.8513 5637.42 0.769451 200 5 132.318 15457.8 0.851538 |
CrossoverProbability |
Runs Run time/generation Total run time Avg. Variety 80 16 12.3175 6122.09 0.801816 failed(0) 90 15 12.8513 5637.42 0.808726 100 16 12.3731 5844.61 0.770557 failed(1) |
CreationProbability |
Runs Run time/generation Total run time Avg. Variety 0 15 12.8513 5637.42 0.769451 10 9 11.6889 3262.9 0.80597 failed(0) 20 4 13.09 1021.88 0.909615 30 7 11.8829 2586.87 0.825521 failed(0) |
A nonzero creation probability was expected to have a positive effect on the GP's efficiency, since a certain number of unique individuals would be added to the pool in each generation, thereby increasing the variety. An exceedingly large creation probability however was expected to pollute the gene pool and increase the fitness measure (decrease the average fitness). The results suggest that the effect is quite significant and verify the prediction. A creation probability of 20 produced 3 good runs in a total of 4, while probabilities of 10 and 30 did not produce a single good run in double the runs. The change in the run time per generation was found to be fairly insignificant. The creation probability was kept at 20 for all subsequent tests.
CreationType |
Runs Run time/generation Total run time Avg. Depth 0 18 13.1933 6974.45 3.08201 1 34 15.5653 16276.4 5.37759 2 4 13.09 1021.88 3.6141 3 6 11.004 2409.62 2.89803 failed(0) 4 27 14.0474 11437.9 3.43366 |
The different methods of generating generation 0 affect the GP's efficiency via their effect on the average tree depth. Very shallow trees were expected to be too simplistic to produce good results and deeper trees were expected to require a large number of generations to produce fit individuals. Indeed, the method for creating the first generation that produced the best results was the default method of GPRAMPEDHALF (2). This method results in trees of moderate depth which appears to positively affect the GP's efficiency. A surprising result was that the trees created using the GPVARIABLE(3) strategy were the ones with the lowest average depth. According to the documentation, all the trees were supposed to start with a depth equal to MaximumDepthForCreation, which had been kept to the default value of 6. An examination of the depth for generation 0 suggested that the GPVARIABLE strategy did not function as advertised, with tree depths varying from 2 to 6. The creation type was kept at 2 for all subsequent tests.
MaximumDepthForCreation |
Runs Run time/generation Total run time Avg. Depth 4 25 13.472 9896.57 2.58933 6 11 13.9182 4280.09 3.16306 8 26 13.8404 10722.5 3.34219 |
The effect of the maximum depth for creation on the GP's effectiveness verified the value of moderately deep function trees. The test with a default value of 6 was repeated, as the run time of 1021 sec previously produced appeared to be suprizingly short in comparison the the run time of the other tests. The GPRAMPEDHALF strategy with MaximumDepthForCreation of 6 once more again produced the best results, but in 11 Runs, as compared to the 4 previously required. The maximum depth for creation was kept at 6 for all subsequent tests.
SwapMutationProbability |
Runs Run time/generation Total run time Avg. Variety 0 11 13.9182 4280.09 0.808726 10 16 13.4156 6147.35 0.877669 20 16 13.4844 5887.06 0.912329 30 20 12.7405 7768.52 0.907355 |
A nonzero swap mutation probability was expected to improve the GP's efficiency by increasing the population variety. The results suggest that the variety was indeed increased, but with negative effects. The most likely explanation for the adverse effect of the swap mutation is the probable pollution of useful subtrees. As a result, the best individuals of a generation would be less likely to pass the attributes that enhanced their fitness to their offspring. The difference in the run times was relatively small, suggesting that the positive effects of a large variety partly offset the negative effects of the gene pool pollution. The swap mutation probability was kept at 0 for all subsequent tests.
MaximumDepthForCrossover |
Runs Run time/generation Total run time Avg. Depth 7 14 13.065 5121.71 3.16087 10 13 13.7154 5101.14 3.42466 13 24 13.3417 9450.86 3.18463 17 13 13.5862 4877.93 3.16306 20 11 13.9182 4280.09 3.19279 |
The maximum depth of individuals after crossover was expected to affect the GP's efficiency similarly to the way the creation strategy did, via its effect on the average depth of the function trees. All the tested values resulted in populations with average depths very close to the average depths of the trees produced with the default value of 17. The large increase in the run time with the value of 13 is puzzling and most likely insignificant. Since the effect of the maximum depth for crossover was relatively small, the default value of 17 was kept for the remaining test.
MaximumComplexity | Runs Run time/generation Total run time Avg. Complexity 20 12 13.9325 4314.64 0.788871 40 13 13.7431 4974.15 0.792818 60 13 13.4354 4780.75 0.779014 100 13 13.5862 4877.93 0.808726 |
The elimination of individuals exceeding an upper complexity limit was expected to have a positive effect, as long as the complexity of the remaining individuals was not exceedingly low. The average complexity of the individuals was already relatively low, so that the algorithm started excluding individuals only after the limit was set below 50. The results suggest that the effect was relatively insignificant, with a minor improvement when the limit was set to 20.
The tested parameters affecting the GP's efficiency can be broken in three categories, based on their effect on the average depth, complexity and variety of the resulting function trees. The parameters affecting the depth of the trees were the creation method for the first generation, the maximum depth for creation and the maximum depth of individuals for crossover. The parameter affecting the complexity of the trees was the limit on the number of nodes. The parameters affecting the variety of the trees were the population size, the creation probability and the swap mutation probability. The only parameter that did not seem to fit in any of the categories was the crossover probability. The suggested value for each parameter and the effect of the parameters on the relevant population attributes and on the GP's efficiency are tabulated below.
Parameter | Affected attribute | Effect on attribute | Effect on efficiency | Suggested Value |
PopulationSize | Variety | Strong | Strong | 20 |
CrossoverProbability | ? | ? | Strong | 90 |
CreationProbability | Variety | Strong | Strong | 20 |
CreationType | Depth | Strong | Strong | 2 |
MaximumDepthForCreation | Depth | Moderate | Strong | 6 |
SwapMutationProbability | Variety | Strong | Moderate | 0 |
MaximumDepthForCrossover | Depth | Moderate | Minimal | 20 |
MaximumComplexity | Complexity | Moderate | Minimal | 20 |