To navigate through this document quickly, click on one of the
following keywords:
Abstract * Introduction * Training * Implementation * Results * Other Works * Future Research * Conclusion * Miscellaneous |
Why does this matter? While the corner squares are the most important, and powerful, squares in the game, occupying a square next to an opponent's corner square can take away some of his advantage, while occupying a square next to one of your corner squares can help seal yours.
Say the top row is set up as follows (an X
is an empty square):
X X B B B B B X
Then B goes in the corner on the left:
B X B B B B B X
Invoking this primitive, the GP would want to go in column 2. Not only does this block the corner and help get white the top row of the game board, it also gains him a corner.
White's turn:
B W B B B B B X
White's next turn:
B W W W W W W W
These spaces are bad for a player to occupy. Why? They make
the sides accessible to your opponent, which gives him
greater control over the playing area. (Remember that the pieces on
the sides can only be flipped in one dimension.)
This primitive can be represented by the following (the X-X represents the desired positions):
This is useful because when a player has control of the 4 center pieces
on a side of the board, he has a very useful tool. It eliminates
the 2 squares next to it (row or columns 2 and 7) from being used by his
opponent, because if his opponent plays there, the GP can take the corner.
It also gives the GP 2 additional safe spaces to make a move. When
the middle of the game board is filled up, and the game comes down to the
corners, the GP can use those 2 spaces with minimal danger of giving his
opponent a corner.
In addition to the above analysis, we have decided on our own brief evaluations of the existing primitives:
num_white, num_black - These are the traditional measures of value for board games. You have more pieces so you're winning. Sounds good, right? Well, not when you're dealing with Othello. The rules of play provide for quick reversals of fortune that makes the game entirely into one of strategic placement of pieces, and not one of numbers (except after the last turn, of course).
num_white_edges, num_black_edges - Now these are important. They help establish one of the most important things needed for a victory in Othello ( ** Novice Players Note: Big tip on how to play ------> ) is control of the middle pieces. Not we say control, NOT occupancy. With pieces on the edges, large quantities of the middle pieces can be captured at once, with no danger of your opponent coming up from behind. Of course, if you want to occupy the edges, it helps to have control of them. This brings us to:
num_white_corners, num_black_corners - These are THE MOST IMPORTANT squares on the board. The corner pieces can never be flipped, and therefore occupying them can give a player control of the perimeter of the game board, which in turn delivers the middle of the board, and, the game. It is (although possible) difficult to neutralize the corner pieces. In fact, it takes at minimum three pieces to totally neutralize one corner square.
num_white_near_corners, num_black_near_corners - These are the most dangerous pieces on the game board. Occupying them is very risky, because it is the first step in giving your opponent a window tot he corner. this makes the non-edge near-corner square the most dangerous. It is the most vulnerable to being flipped. However, it is important to note that once the corners are occupied, these squares become valuable. Hence we have adjusted these primitives to mean pieces near unoccupied corners. For squares near occupied corners, see the additional primitive above.
This player is possibly the worst of the bunch. While making the
corners a goal is a good thing, making the near corners a goal is suicidal.
Since the near corner spots will become available first, the player will
grab them, thus opening the corners to his opponent, and costing him the
game.
This player has a simple idea: get the middle. It decides to do
this by looking for opportunities to give the perimeter to black.
This is also suicidal, since we benefit from the experience of knowing
that he who has the perimeter will almost certainly win the game.
This player is getting a little better. He goes for the edge whenever
possible. Unfortunately, his strategy is flawed. If it is available,
he will grab a spot near the corner, possibly giving it away, and he has
nothing telling him to grab a corner if it is available.
This player has a simple strategy. Get as many pieces as possible
in every move. As mentioned above, this is not the wisest of ideas.
He will tend to not choose the strategically best squares to move on, and
will not be so hard to defeat.
This player is bad. Not as bad as those that deliver up away the
corners on a silver platter, but not much better. This player is
basically going to avoid the outer two rows and columns (except for the
corners and near corners). Not much hope for him.
This is where the players start getting decent. This player would like
to grab the corners, which is always a good thing), but wants even more
to keep his opponent out of them.
This player has the general idea, but does not really execute it well.
He is trying to keep his opponent away from the corners, in all cases.
This makes it hard for him to get the corners, though.
This player has regressed a bit. He tries to occupy the squares near
the sides, while keeping his opponent away from them. This is not
such a good thing, since it does nothing for him to establish control over
squares on the board.
This player is getting better. He basically tries to get the two outer
perimeters of the board. There are really two flaws in this.
By trying for the two outer perimeters, he will often grab the squares
in the next to outer row first, leaving the edge pieces open to be grabbed
by his opponent. He also does not recognize that taking a square
adjacent to a corner is not usually such a good thing.
This is by far the best of the training players. He basically tries to get the corners and non-next-to-corner edges, while trying to keep himself out of the dangerous next-to-corner squares, and putting his opponent in them. Using the unoccup_near_corners primitives is important, because if the corner is occupied, it is often better strategically to take it, and the player might not do that if he just looked at corner spaces in general.
The resulting OthelloTrainer player was pitted against the GP learning algorithm in the GPOthello class and training was done with a population size of 200 and, as previously mentioned, 33 generations. (One game was played per player per generation, since the players were deterministic). Several different runs were made to see if mutation would work better in certain instances.
In order to evaluate the results generated by the above, in addition to using the fitness results generated by the system we created an OthelloPlayGP class which would essentially allow Scott to play our generated GP players. The results discuss some of the findings and conclusions he made about the generated players. (In fact, look below if you would like to challenge our player itself!)
========================================================================
Generation:32 best:0 worst:0
GP#
dad mum oper
mut shrk mut fitness len dep
===== ============= ============= ======== ======== ==========
==== ====
B 0 6:RPB:1
2:RPB:1
43.00 1 1
RPB: black_near_corners
Run time 66.79 seconds, 2.02 seconds/generation
Note that this player fared poorly against Edgar in generations 30-32, but against the other ten players it fared well, winning in most (the fitness is the number of pieces left, so if it's less than 32 our GP won.) This player, however, was difficult to use in reality. The problem is that black_near_corners tries to predict what the opposite player will do; unfortunately, against non-deterministic players like humans this algorithm immediately fails. We therefore decided to pick an intermediate child that appeared to have good performance, as shown immediately below.
RPB: ( + white_center_edges black_near_corners )
This player, while not the best in fitness, was playable by Scott, so we could see his analysis.
Scott's Analysis: This player is not so bad. Not much competition for Edgar, but better than most of our training players. A not surprising result for a player that trained on our training players, which are not the most skilled Othello players in the world, or on CS. He did okay playing me, but just didn't have the strategic depth to win.
Scott's Analysis: This player is also fairly decent. However, like many Shakespearean heroes, it does have a tragic flaw. The inclusion of white_near_corners opens up the possibility of giving the corner away. In fact, each of the five times I played him, the turning point of the game was when he seemed to be doing well, and then moved next to a corner, handing it to me on a silver platter, so to speak.
The use of alpha-beta to evaluate an Othello board is by no means a new idea. In Sections 18.6 - 18.12 of his book Paradigms of Artificial Intelligence Programming, Peter Norvig outlines a lisp program to play Othello using an alpha-beta evaluation function. He mentions two high-calibur Othello playing programs, Iago, created by Paul Rosenbloom, and Bill, created by Kai-Fu-Lee. The later program, Bill, defeated the highest rated American Othello player (Brian Rose) by a score of 56-8 in 1989. As we did, Norvig chooses to focus on certain aspects of the Othello board in his evaluation function (His code actually comes from both Iago and Bill). The two he chooses are Mobility and Edge Stability. Norvig defines two types of mobility, current and potential. Current mobility is defined as the number of moves available to the player. Potential mobility is the number of empty spaces next to an opponents piece. Edge stability is determined by taking the approximately 60,000 possible combinations of edge contents and evaluating them to see which are optimal. Norvig combines these factors in a linear function, with coefficients being based on the move number. While this is apparently an effective strategy, we fell having Edgar set his own weights will be as effective, assuming he can be trained sufficiently.
In his paper Games Computers Play: Simulating Characteristic Function Game Playing Agents with Classifier Systems, Garett Dworman discussed the abilities of his generated players to achieve their optimal performance level. The details of his "game" are not so important, as his methods are very general and could be easily applied to many types of learning. Dwormans paper concluded that his players were, in fact, able to reach their optimal levels in competition with each other. Each player started with a simple evaluation function which it adjusted through training. Dworman trained his players to the point where they had reached the optimal playing level, and compared them to his evaluation of what the player should do. They matched.
Scott's expert status is a scientific fact, not a product of boasting (although that doesn't hurt either). If you need convincing, click here to see his record against Edgar.