Othello Project
Group members:
Chun Chao and
Olga Merport
Introduction
Genetic programming (GP) extends the genetic algorithm to the domain of
computer programs. In genetic programming, populations of programs are
genetically bred to solve different types of problems. Game playing is one
of the many fields where genetic programming can be applied. The goal of
this assignment was to apply the GP algorithm to the game of Othello and to
study how variations on the hypothesis, the learning method, and the result
evaluation method change the performance.
The Java implementation of the game was given. The player was able to
play against Edgar (the strongest available player), a random player, or a
human being. The given cost function used to evaluate a move included 4
arithmetic operators (+, -, *, /) and 9 terminals: white, black, white_edges,
black_edges, white_corners, black_corners, white_near_corners,
black_near_corners, and the constant 10. During a run, an initial population
consisting of various hypotheses (function trees) was produced, and the
crossover and mutation operations were applied to them.
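For illustration, the following is a minimal sketch of how such a function
tree might be represented and evaluated against a board position. GPNode,
Terminal, Op, and BoardStats are hypothetical names of ours, not classes
from the given implementation, and the protected division (returning 1 on
division by zero) is a common GP convention rather than the given code's
documented behavior.

    // Hypothetical sketch of a GP function tree for board evaluation.
    abstract class GPNode {
        abstract double eval(BoardStats b);
    }

    // A terminal node reports one of the board statistics, or 10.
    class Terminal extends GPNode {
        private final String name;
        Terminal(String name) { this.name = name; }
        double eval(BoardStats b) {
            switch (name) {
                case "white":              return b.white;
                case "black":              return b.black;
                case "white_edges":        return b.whiteEdges;
                case "black_edges":        return b.blackEdges;
                case "white_corners":      return b.whiteCorners;
                case "black_corners":      return b.blackCorners;
                case "white_near_corners": return b.whiteNearCorners;
                case "black_near_corners": return b.blackNearCorners;
                default:                   return 10.0;  // constant terminal
            }
        }
    }

    // An interior node applies one of the 4 arithmetic operators.
    class Op extends GPNode {
        private final char op;
        private final GPNode left, right;
        Op(char op, GPNode left, GPNode right) {
            this.op = op; this.left = left; this.right = right;
        }
        double eval(BoardStats b) {
            double l = left.eval(b), r = right.eval(b);
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default:  return r == 0 ? 1.0 : l / r;  // protected division
            }
        }
    }

    // Hypothetical counts extracted from the current board.
    class BoardStats {
        int white, black, whiteEdges, blackEdges, whiteCorners,
            blackCorners, whiteNearCorners, blackNearCorners;
    }

A tree such as ( + white ( * white_corners 10 )) would then be built as
new Op('+', new Terminal("white"),
    new Op('*', new Terminal("white_corners"), new Terminal("10"))).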
Approaches
To evaluate the change in performance, we created a new Java class
called OthelloOurPlayer, a subclass of OthelloPlayer. We wanted
our player to inherit all the properties of the base class OthelloPlayer
while adding new functionality. We looked at the
following variations of the given implementation:
I. Variations on the hypothesis:
We wanted to maximize the number of moves our player can make and,
at the same time, minimize the number of moves our opponent can make.
This approach looked like a good heuristic: if, toward the end of the game,
the player can make more moves than the opponent, this should
increase the chance of winning.
To achieve this, we added two other primitives:
possible_moves_white
possible_moves_black
Now, after each move, the player would calculate the number of possible
moves for its opponent and use this information when evaluating the board.
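The following sketch shows how such a terminal can be computed, assuming a
hypothetical int[][] board encoding (0 = empty, 1 = white, 2 = black); the
given code has its own board representation:

    // Counts the legal moves for one color, which is what the new
    // possible_moves_white / possible_moves_black terminals report.
    class MoveCounter {
        private static final int[] DR = {-1, -1, -1, 0, 0, 1, 1, 1};
        private static final int[] DC = {-1, 0, 1, -1, 1, -1, 0, 1};

        static int possibleMoves(int[][] board, int color) {
            int count = 0;
            for (int r = 0; r < 8; r++)
                for (int c = 0; c < 8; c++)
                    if (board[r][c] == 0 && isLegal(board, r, c, color))
                        count++;
            return count;
        }

        // A move is legal if, in some direction, a run of opponent
        // pieces is flanked by one of our own pieces.
        private static boolean isLegal(int[][] board, int r, int c,
                                       int color) {
            int opponent = (color == 1) ? 2 : 1;
            for (int d = 0; d < 8; d++) {
                int i = r + DR[d], j = c + DC[d], seen = 0;
                while (i >= 0 && i < 8 && j >= 0 && j < 8
                        && board[i][j] == opponent) {
                    i += DR[d]; j += DC[d]; seen++;
                }
                if (seen > 0 && i >= 0 && i < 8 && j >= 0 && j < 8
                        && board[i][j] == color)
                    return true;
            }
            return false;
        }
    }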
II. Variations on the learning method:
Competitive fitness measure: a population plays against itself,
or competing populations co-evolve.
It is a well-known fact in biology that no two populations can occupy
the same niche. We intended to see whether this holds true for
genetic programming. If it does, and we still have one
population left after the competition, will this champion be a very
good Othello player? If not, perhaps there is something wrong with our
fitness measure to begin with, in that our "niche" does not correctly
reflect the real Othello playing scenario.
We created two "OurPlayers" that played against each other.
To implement this part, we had to make quite a few changes to
the given code.
Our implementation was done in the following way: we created
two "OurPlayers", WHITE and BLACK. WHITE was the player whose
performance we wanted to improve. We were trying to find
a tree for the WHITE player whose performance would result
in the lowest fitness for the BLACK player. For each fitness
measurement we played 5 games and took the average fitness as the
result. We did 10 runs with 31 generations per run (or
until our fitness reached 40, the termination fitness)
and picked the best individual from each run.
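In outline, the fitness measurement looked like the following sketch, where
Individual and the playGame hook are hypothetical stand-ins for the GP
framework and the given game driver (in our runs, lower fitness was better):

    // Hypothetical GP individual; in the real framework this would hold
    // the function tree.
    class Individual {
        double fitness;   // lower is better in our runs
    }

    class CompetitiveFitness {
        static final int GAMES = 5;

        // Scores a WHITE candidate by averaging its raw fitness over
        // 5 games against a BLACK opponent.
        static double measure(Individual white, Individual black) {
            double total = 0.0;
            for (int g = 0; g < GAMES; g++)
                total += playGame(white, black);   // raw fitness for WHITE
            return total / GAMES;
        }

        // Hypothetical hook into the given game driver.
        static double playGame(Individual white, Individual black) {
            throw new UnsupportedOperationException("game driver goes here");
        }
    }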
We used the fitness-roulette algorithm as the criterion
for selecting individuals. The algorithm selects an individual
with a probability related to its fitness. In particular, the
probability is equal to the fitness of the individual divided
by the total fitness of all individuals in the population.
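In code, the selection step might look like this sketch (reusing the
hypothetical Individual from the previous sketch; it follows the
description above literally, picking with probability proportional to
fitness):

    import java.util.List;
    import java.util.Random;

    class Roulette {
        static Individual select(List<Individual> population, Random rng) {
            double total = 0.0;
            for (Individual ind : population) total += ind.fitness;
            double spin = rng.nextDouble() * total;  // point on the wheel
            double running = 0.0;
            for (Individual ind : population) {
                running += ind.fitness;
                if (running >= spin) return ind;     // wheel stopped here
            }
            return population.get(population.size() - 1);  // rounding guard
        }
    }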
Here we ran into a problem that we did not notice at first (Q2 on the
final exam): our fitness results are biased. We did not compute
absolute fitness; rather, we computed relative fitness and based
our results on it.
III. Methods to evaluate results.
How often does our player win? This must be evaluated over a large number of
games (at least 100). At first we decided to play the games against people.
Then we found it easier to test our player against the strong computer
player Edgar, since we did not want to take up other people's time with our
experiment.
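This evaluation can be sketched as a simple loop; OthelloGame and its
result code below are hypothetical stand-ins for the given game driver:

    // Plays a large number of games and reports the percentage won.
    class Evaluator {
        static double winPercentage(OthelloPlayer ours,
                                    OthelloPlayer opponent, int games) {
            int wins = 0;
            for (int g = 0; g < games; g++) {
                OthelloGame game = new OthelloGame(ours, opponent);
                if (game.play() == OthelloGame.OURS_WIN) wins++;
            }
            return 100.0 * wins / games;   // e.g., games = 100
        }
    }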
Initial parameters
In the training process we used the following initial parameters:

    Population size        | 300
    Number of generations  | 21
    Good runs              | 10
    Termination fitness    | 40
Problems
The training on the CS machines in the department took a very long time.
We used the "nohup" command to run the processes overnight, but the
processes were getting killed before they were able to generate any results.
As we figured out later, the process of drawing the best individual
could not be handled under "nohup" (presumably because no display is
available). We were able to get results only after we turned off the
drawing option in the "ini" file.
Results
To evaluate the performance of our Othello player,
OthelloOurPlayer, we had it play against the random
Othello player, OthelloRandomPlayer, as well as against
Edgar.
In the first experiment, our player, OthelloOurPlayer,
played against OthelloRandomPlayer:
    Player Color | Player Name
    -------------+---------------------
    White        | OthelloOurPlayer
    Black        | OthelloRandomPlayer
During the training phase, we generated 10 "fit" individuals as Othello
players. We used these 10 individuals to play against
OthelloRandomPlayer and the following are the results:
    Run Number | Best Genetic Program               | Percentage win for WHITE
    -----------+------------------------------------+-------------------------
    1          | white                              | 57%
    2          | ( + white ( * white_corners 10 ))  | 80%
    3          | ( - white_corners black )          | 81%
    4          | ( + white_corners white )          | 75%
    5          | ( - white possible_moves_white )   | 65%
    6          | white                              | N/A*
    7          | ( - white 10 )                     | 67%
    8          | ( - ( * black_edges possible_moves_white ) ( * ( + ( + white_edges white_corners ) black_edges ) ( * ( + white 10 ) 10 ))) | 49%
    9          | ( + white_corners white )          | N/A*
    10         | ( - white black )                  | 65%
* The data was not available because for these runs, the Best Genetic
Program was similar to earlier runs, so we did not re-run the experiment.
In the second experiment, our player, OthelloOurPlayer,
played against Edgar.
Also worth noting is that OthelloOurPlayer was trained to
be the WHITE player. For its match against Edgar, we simply
switched the white primitives in the resulting trees to black ones and
vice versa, for all the primitives. This might not accurately reflect
the player's true strength.
It might be helpful to run a separate training session in which
we train OthelloOurPlayer to play BLACK.
    Player Color | Player Name
    -------------+------------------
    White        | Edgar
    Black        | OthelloOurPlayer
During the training phase, we generated 10 "fit" individuals as Othello
players. We used these 10 individuals to play against
Edgar and the following are the results:
    Run Number | Best Genetic Program               | Percentage win for BLACK
    -----------+------------------------------------+-------------------------
    1          | black                              | 18%
    2          | ( + black ( * black_corners 10 ))  | 10%
    3          | ( - black_corners white )          | 16%
    4          | ( + black_corners black )          | 16%
    5          | ( - black possible_moves_black )   | 4%
    6          | black                              | N/A*
    7          | ( - black 10 )                     | 12%
    8          | ( - ( * white_edges possible_moves_black ) ( * ( + ( + black_edges black_corners ) white_edges ) ( * ( + black 10 ) 10 ))) | 10%
    9          | ( + black_corners black )          | N/A*
    10         | ( - black white )                  | 20%
* The data was not available because for these runs, the Best Genetic
Program was similar to earlier runs, so we did not re-run the experiment.
Conclusions:
Assuming our results are "correct", we draw the following conclusions:
- The main conclusion we draw from our experiments is that
although fitness is relative to the other players in the population,
it improves in an absolute sense, as the comparison with random players
shows.
- Judging from the results of OthelloOurPlayer vs.
OthelloRandomPlayer, we see that corners are
very important (compare runs 2, 3, and 4 against
runs 5 and 8 in the first results table above).
Our primitives possible_moves_white and
possible_moves_black did not seem to help
significantly in the evaluation of the board.
- Judging from the results of Edgar vs.
OthelloOurPlayer, we noticed that it might be
helpful to have OthelloOurPlayer learn
from Edgar rather than from itself.
Edgar is apparently a stronger opponent, and
OthelloOurPlayer might be able to learn more from
it and become a better player.
As mentioned earlier, a potential limitation is
that we trained our player to be the WHITE player. Since
Edgar always plays WHITE, we were forced to play
OthelloOurPlayer as BLACK. This is a potential limitation
(the results against Edgar might not be accurate).
- If we were given more resources, e.g., more computational power,
we could have trained OthelloOurPlayer both against
Edgar and against OthelloRandomPlayer. In
these scenarios, we could determine whether the experiments are
domain dependent.
Created by Olga and Celia.
Last updated January 5, 1998.