Othello Project
Group members:
Chun Chao and
Olga Merport
Introduction
Genetic programming (GP) extends the genetic algorithm to the domain of
computer programs. In genetic programming, populations of programs are
genetically bred to solve different types of problems. Game playing is one
of the many fields where genetic programming can be applied. The goal of
this assignment was to apply the GP algorithm to the game of Othello and to
study how variations on the hypothesis, the learning method, and the result
evaluation method change the performance.
The Java implementation of the game was given. The player was able to
play against Edgar (the strongest available player), a random player, or a
human being. The given cost function used to evaluate a move included 4
arithmetic operators (+, -, *, /) and 9 terminals: white, black, white_edges,
black_edges, white_corners, black_corners, white_near_corners,
black_near_corners, and the constant 10. During a run, an initial population
consisting of various hypotheses (function trees) was produced, and the
crossover and mutation operations were applied to them.
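For illustration, the following is a minimal sketch of how such a function
tree might be represented and evaluated against a board position. GPNode,
Terminal, Op, and BoardStats are hypothetical names of ours, not classes
from the given implementation, and the protected division (returning 1 on
division by zero) is a common GP convention rather than the given code's
documented behavior.

    // Hypothetical sketch of a GP function tree for board evaluation.
    abstract class GPNode {
        abstract double eval(BoardStats b);
    }

    // A terminal node reports one of the board statistics, or 10.
    class Terminal extends GPNode {
        private final String name;
        Terminal(String name) { this.name = name; }
        double eval(BoardStats b) {
            switch (name) {
                case "white":              return b.white;
                case "black":              return b.black;
                case "white_edges":        return b.whiteEdges;
                case "black_edges":        return b.blackEdges;
                case "white_corners":      return b.whiteCorners;
                case "black_corners":      return b.blackCorners;
                case "white_near_corners": return b.whiteNearCorners;
                case "black_near_corners": return b.blackNearCorners;
                default:                   return 10.0;  // constant terminal
            }
        }
    }

    // An interior node applies one of the 4 arithmetic operators.
    class Op extends GPNode {
        private final char op;
        private final GPNode left, right;
        Op(char op, GPNode left, GPNode right) {
            this.op = op; this.left = left; this.right = right;
        }
        double eval(BoardStats b) {
            double l = left.eval(b), r = right.eval(b);
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default:  return r == 0 ? 1.0 : l / r;  // protected division
            }
        }
    }

    // Hypothetical counts extracted from the current board.
    class BoardStats {
        int white, black, whiteEdges, blackEdges, whiteCorners,
            blackCorners, whiteNearCorners, blackNearCorners;
    }

A tree such as ( + white ( * white_corners 10 )) would then be built as
new Op('+', new Terminal("white"),
    new Op('*', new Terminal("white_corners"), new Terminal("10"))).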
Approaches
To evaluate the change in performance, we created a new Java class
called OthelloOurPlayer, a subclass of OthelloPlayer. We wanted
our player to inherit all the properties of the base class OthelloPlayer
while adding new functionality. We looked at the
following variations of the given implementation:
I. Variations on the hypothesis:
We wanted to maximize the number of moves our player can make and,
at the same time, minimize the number of moves our opponent can make.
This approach looked like a good heuristic: if, toward the end of the game,
the player can make more moves than the opponent, this should
increase the chance of winning.
To achieve this, we added two other primitives:
possible_moves_white
possible_moves_black
Now, after each move, the player would calculate the number of possible
moves for its opponent and use this information when evaluating the board.
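The following sketch shows how such a terminal can be computed, assuming a
hypothetical int[][] board encoding (0 = empty, 1 = white, 2 = black); the
given code has its own board representation:

    // Counts the legal moves for one color, which is what the new
    // possible_moves_white / possible_moves_black terminals report.
    class MoveCounter {
        private static final int[] DR = {-1, -1, -1, 0, 0, 1, 1, 1};
        private static final int[] DC = {-1, 0, 1, -1, 1, -1, 0, 1};

        static int possibleMoves(int[][] board, int color) {
            int count = 0;
            for (int r = 0; r < 8; r++)
                for (int c = 0; c < 8; c++)
                    if (board[r][c] == 0 && isLegal(board, r, c, color))
                        count++;
            return count;
        }

        // A move is legal if, in some direction, a run of opponent
        // pieces is flanked by one of our own pieces.
        private static boolean isLegal(int[][] board, int r, int c,
                                       int color) {
            int opponent = (color == 1) ? 2 : 1;
            for (int d = 0; d < 8; d++) {
                int i = r + DR[d], j = c + DC[d], seen = 0;
                while (i >= 0 && i < 8 && j >= 0 && j < 8
                        && board[i][j] == opponent) {
                    i += DR[d]; j += DC[d]; seen++;
                }
                if (seen > 0 && i >= 0 && i < 8 && j >= 0 && j < 8
                        && board[i][j] == color)
                    return true;
            }
            return false;
        }
    }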
II. Variations on the learning method:
Competitive fitness measure: a population plays against itself,
or competing populations co-evolve.
It is a well-known fact in biology that no two populations can occupy
the same niche. We intended to see whether this holds true for
genetic programming. If it does, and we still have one
population left after the competition, will this champion be a very
good Othello player? If not, perhaps there is something wrong with our
fitness measure to begin with, in that our "niche" does not correctly
reflect the real Othello playing scenario.
We created two "OurPlayers" that played against each other.
To implement this part, we had to make quite a few changes to
the given code.
Our implementation was done in the following way: we created
two "OurPlayers", WHITE and BLACK. WHITE was the player whose
performance we wanted to improve. We were trying to find
a tree for the WHITE player whose performance would result
in the lowest fitness for the BLACK player. For each fitness
measurement we played 5 games and took the average fitness as the
result. We did 10 runs with 31 generations per run (or
until our fitness reached 40, the termination fitness)
and picked the best individual from each run.
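In outline, the fitness measurement looked like the following sketch, where
Individual and the playGame hook are hypothetical stand-ins for the GP
framework and the given game driver (in our runs, lower fitness was better):

    // Hypothetical GP individual; in the real framework this would hold
    // the function tree.
    class Individual {
        double fitness;   // lower is better in our runs
    }

    class CompetitiveFitness {
        static final int GAMES = 5;

        // Scores a WHITE candidate by averaging its raw fitness over
        // 5 games against a BLACK opponent.
        static double measure(Individual white, Individual black) {
            double total = 0.0;
            for (int g = 0; g < GAMES; g++)
                total += playGame(white, black);   // raw fitness for WHITE
            return total / GAMES;
        }

        // Hypothetical hook into the given game driver.
        static double playGame(Individual white, Individual black) {
            throw new UnsupportedOperationException("game driver goes here");
        }
    }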
We used the fitness-roulette algorithm as the criterion
for selecting individuals. The algorithm selects an individual
with a probability related to its fitness. In particular, the
probability is equal to the fitness of the individual divided
by the total fitness of all individuals in the population.
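In code, the selection step might look like this sketch (reusing the
hypothetical Individual from the previous sketch; it follows the
description above literally, picking with probability proportional to
fitness):

    import java.util.List;
    import java.util.Random;

    class Roulette {
        static Individual select(List<Individual> population, Random rng) {
            double total = 0.0;
            for (Individual ind : population) total += ind.fitness;
            double spin = rng.nextDouble() * total;  // point on the wheel
            double running = 0.0;
            for (Individual ind : population) {
                running += ind.fitness;
                if (running >= spin) return ind;     // wheel stopped here
            }
            return population.get(population.size() - 1);  // rounding guard
        }
    }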
Here we ran into a problem that we did not notice at first (Q2 on the
final exam): our fitness results are biased. We did not compute
absolute fitness; rather, we computed relative fitness and based
our results on it.
III. Methods to evaluate results.
How often does our player win? This must be evaluated over a large number of
games (at least 100). At first we decided to play the games against people.
Then we found it easier to test our player against the strong computer
player Edgar, since we did not want to take up other people's time with our
experiment.
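This evaluation can be sketched as a simple loop; OthelloGame and its
result code below are hypothetical stand-ins for the given game driver:

    // Plays a large number of games and reports the percentage won.
    class Evaluator {
        static double winPercentage(OthelloPlayer ours,
                                    OthelloPlayer opponent, int games) {
            int wins = 0;
            for (int g = 0; g < games; g++) {
                OthelloGame game = new OthelloGame(ours, opponent);
                if (game.play() == OthelloGame.OURS_WIN) wins++;
            }
            return 100.0 * wins / games;   // e.g., games = 100
        }
    }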
Initial parameters
In the training process we used the following initial parameters:

    Population size        | 300
    Number of generations  | 21
    Good runs              | 10
    Termination fitness    | 40
Problems
The training on the CS machines in the department took a very long time.
We used the "nohup" command to run the processes overnight, but the
processes were getting killed before they were able to generate any results.
As we figured out later, the process of drawing the best individual
could not be handled under "nohup" (presumably because no display is
available). We were able to get results only after we turned off the
drawing option in the "ini" file.
Results
To evaluate the performance of our Othello player,
OthelloOurPlayer, we had it play against the random
Othello player, OthelloRandomPlayer, as well as against
Edgar.
In the first experiment, our player, OthelloOurPlayer,
played against OthelloRandomPlayer:
    Player Color | Player Name
    -------------+---------------------
    White        | OthelloOurPlayer
    Black        | OthelloRandomPlayer
During the training phase, we generated 10 "fit" individuals as Othello
players. We used these 10 individuals to play against
OthelloRandomPlayer and the following are the results:
    Run Number | Best Genetic Program               | Percentage win for WHITE
    -----------+------------------------------------+-------------------------
    1          | white                              | 57%
    2          | ( + white ( * white_corners 10 ))  | 80%
    3          | ( - white_corners black )          | 81%
    4          | ( + white_corners white )          | 75%
    5          | ( - white possible_moves_white )   | 65%
    6          | white                              | N/A*
    7          | ( - white 10 )                     | 67%
    8          | ( - ( * black_edges possible_moves_white ) ( * ( + ( + white_edges white_corners ) black_edges ) ( * ( + white 10 ) 10 ))) | 49%
    9          | ( + white_corners white )          | N/A*
    10         | ( - white black )                  | 65%
* The data was not available because for these runs, the Best Genetic
Program was similar to earlier runs, so we did not re-run the experiment.
In the second experiment, our player, OthelloOurPlayer,
played against Edgar.
Also worth noting is that OthelloOurPlayer was trained to
be the WHITE player. For its match against Edgar, we simply
switched the white primitives in the resulting trees to black ones and
vice versa, for all the primitives. This might not accurately reflect
the player's true strength.
It might be helpful to run a separate training session in which
we train OthelloOurPlayer to play BLACK.
    Player Color | Player Name
    -------------+------------------
    White        | Edgar
    Black        | OthelloOurPlayer
During the training phase, we generated 10 "fit" individuals as Othello
players. We used these 10 individuals to play against
Edgar and the following are the results:
    Run Number | Best Genetic Program               | Percentage win for BLACK
    -----------+------------------------------------+-------------------------
    1          | black                              | 18%
    2          | ( + black ( * black_corners 10 ))  | 10%
    3          | ( - black_corners white )          | 16%
    4          | ( + black_corners black )          | 16%
    5          | ( - black possible_moves_black )   | 4%
    6          | black                              | N/A*
    7          | ( - black 10 )                     | 12%
    8          | ( - ( * white_edges possible_moves_black ) ( * ( + ( + black_edges black_corners ) white_edges ) ( * ( + black 10 ) 10 ))) | 10%
    9          | ( + black_corners black )          | N/A*
    10         | ( - black white )                  | 20%
* The data was not available because for these runs, the Best Genetic
Program was similar to earlier runs, so we did not re-run the experiment.
Conclusions:
Assuming our results are "correct", we draw the following conclusions:
- The main conclusion we draw from our experiments is that
although fitness is relative to the other players in the population,
it improves in an absolute sense, as the comparison with random players
shows.
- Judging from the results of OthelloOurPlayer vs.
OthelloRandomPlayer, we see that corners are
very important (compare runs 2, 3, and 4 against
runs 5 and 8 in the first results table above).
Our primitives possible_moves_white and
possible_moves_black did not seem to help
significantly in the evaluation of the board.
- Judging from the results of Edgar vs.
OthelloOurPlayer, we noticed that it might be
helpful to have OthelloOurPlayer learn
from Edgar rather than from itself.
Edgar is apparently a stronger opponent, and
OthelloOurPlayer might be able to learn more from
it and become a better player.
As mentioned earlier, a potential limitation is
that we trained our player to be the WHITE player. Since
Edgar always plays WHITE, we were forced to play
OthelloOurPlayer as BLACK. This is a potential limitation
(the results against Edgar might not be accurate).
- If we were given more resources, e.g., more computational power,
we could have trained OthelloOurPlayer both against
Edgar and against OthelloRandomPlayer. In
these scenarios, we could determine whether the experiments are
domain dependent.
Created by Olga and Celia.
Last updated January 5, 1998.