Project Othello Machine Learning Home Page
"Put money in thy purse." - Shakespeare's Othello
"A minute to learn, a lifetime to master." - Milton Bradley
This page describes the results of Project Othello, one of four
homework projects for the machine learning
course at Columbia University (Professor: Eric Siegel, TA: Eleazar Eskin).
A SIGCSE (cs education) paper on this
assignment is available. Note that there is also an intro to computer science
version of this assignment.
For Project Othello, students applied the machine learning method genetic programming (GP)
to the game Othello. Each student or student team selected a unique
experimental project investigating variations on the learning method,
hypothesis representation, or both. To evaluate results, students had
to deal with many machine learning issues such as overlearning,
temporal credit assignment, local optima, hypothesis representation
bias, learning speed and noisy fitness measurements (cf. GP Brainstorming
Archive for a more comprehensive list of learning issues). The
projects were mutually complementary. The bottom of this web page
contains a hierarchical index of the 18 experimental projects with
links to writeups of their approaches and results. Some have applets
that allow you to play Othello against learned players.
The homework assignment and all needed code are
available for interested researchers or machine learning instructors.
This includes ideas for GP-Othello projects from which students could
select. It also includes Java implementations of GP (jpgpp) and of
Othello, which are hooked up; to perform many non-trivial experiments,
this code can be modified easily by students who are not fluent with
Java.
GP is used to induce a board evaluation function to select
each move made during the game. In this approach, there was no
search beyond the next immediate move in the game, except
for one student project (Truta, below). The students started
from this rudimentary approach and evaluated variations.
The idea to use Othello as a machine learning homework project was Astro Teller's, and he
provided some of the code used in the project, as well as another
automatically-learned Othello player, Edgar. Edgar trained with a different
learning paradigm, so students also performed cross-paradigm
comparisons.
The goal of each student project was not
necessarily to beat Edgar. Rather the goals included:
- Compare, contrast and discover methods to approach Othello with GP.
- Investigate methods to evaluate resulting Othello players.
- Compare results to another learned Othello player,
Edgar
- Use another Othello expert,
Edgar,
during training
One interesting result demonstrated over a few reports, below, is that
while some strategies learned to beat Edgar, it proved more difficult
for learning to develop a strategy that beats a "noisy" Edgar, that
is, Edgar with some randomness added.
Introduction (from Suk's writeup, below)
In this project, we are given an Othello game written in Java with a
genetic programming implementation already in place. This original
implementation consists of 13 primitives, + - * / 10 black
black_corners black_near_corners black_edges white white_corners
white_near_corners white_edges. The first four primitives are standard
arithmetic operators while the rest of the primitives are terminals,
all but one corresponding to positions on the board. All the standard
genetic programming population control constructs such as crossover,
mutation, and ADFs are included in the given Java package, gpjpp. In
addition to this genetic programming setup, a computer player, Edgar,
is provided. Though not a genetically programmed player, Edgar boasts
a superior playing ability, easily dispatching most Othello players in
short time. It is the purpose of this project to modify and extend the
current genetic programming setup in hopes of discovering new
approachs in implementation and representation that will facilitate
the creation and evolution of GP players that can play well against
both Edgar and the general othello playing population.
In general, fitness over five games only is not precise, very noisy.
Student Othello Projects:
- Alpha-Beta search
- Competitive Fitness Measure
- Corey Tripp (Spring 2000):
I decided to try one of the learning method variations
suggested in the homework assignment. This was having a
population vote on each move.
- Thomas Keerikattu (Spring 2000) -
Modified the
fitness-measure algorithm so that fitness is evaluated
against several players. The new algorithm uses the number
of players defeated as a fitness measure and ignores the
number of pieces remaining on the board. Used a reduced
set of terminals: 'white' terminals are removed and
'constant' terminals modified.
- George Yi, Naho Osagawara, and Wei-Ang Lee (Spring 2000) -
Dare to
Mutate: Small population after generations and
generations of evolutions (adaptiving to random mutation
events) give birth to one smart dude that can slightly
overmatch Edgar. With this dude's structure left in tact,
the "extremities" of the dude, tree edges, are refined to
play Othello (as well as the internal nodes). Genetically
engineered, the super dude creams Edgar, and many other
random opponents called to compete against him.
Resistance is futile.
- Yi-Min Chee. (Fall 1997) - Single elimination tournament
- variations on fitness measure:
- non-competitive baseline
- straight competitive fitness eval
- combo of competitive and versus Edgar
- results:
- competitive runs reached termination criterion faster than
absolute fitness runs.
- Resulting player from competitive run beats
Edgar, and generalizes better than individuals trained directly against
Edgar.
- existence proof of a player that is better than random,
but worse than Edgar -- a tribute to Edgar.
-
Chun Chao and Olga Merport (Fall 1997)
- New primitives: possible_moves_white, possible_moves_black
- although fitness is relative to
other players in the population, it gets better in an absolute
sense, as compared to random players.
-
Federico Kattan (Fall 1997)
- large population size
- parallel 5-game evaluation of individuals
- Jorge Lepre (Spring 2000) -
Using Tournament
Selection as competitive fitness measure allows evolving a
player with minimal prior knowledge.
- Modifications to Primitives
-
Brian Whitman and Grace Junxin Zhang (Spring 2000):
Add new primitives, change fitness measure.
-
Jenny Weisenberg and Blaine Boman (Spring 2000)
- explore the effect of increasingly large sets of
terminals on evolved player fitness and performance
- explore the relationship between fitness and
performance against both EDGAR and the Random Player
-
Krishna Srikant and Pascal Saremsky (Spring 2000)
- Super primitives, evolving populations
- Reduction of original primitives to bare essentials
- Addition of primitives for individual square access through board symmetry
- Addition of primitives for piece mobility determination at different game stages
- Demes for evolving increasingly superior players which can play both black or white
- Defeats EDGAR effortlessly; challanging to humans as well
-
Manuel Reyes (Spring 2000): Added two new primitives,
weighted different sections of the board.
-
Ben May and Anastasiya Lebedev (Spring 2000):
Added two new primitives.
-
James Lott (Spring 2000): I decided to add two new
terminals to the othello primitives, based on the concept
of surroundedness. Surroundedness measures the number of
pieces immediately surrounding (in the 8 spaces adjacent
to) a possible move.
- Stephen Jan, Pablo de
Dios (Spring 2000): In this experiment, we attempt to
utilize human derived strategies in order to properly
represent the attributes necessary to derive a hypothesis
reprsentation that can be more quicly generated. The goal
of this experiment is to test one method for increasing
the learning rate and reducing the computation time in
genetic programming applications.
- Joe Gagliano, Stan John, Sheanon Chung
The intent of our addition to the genetic programming
othello player was to allow the programs evolved to put an
arbitrary weight on the features it was evaluating.
(Spring 2000)
- Arie Addi (Spring 2000) -
Discovering a simpler Othello Player by using less but
more effective primitives in the GP trees.
- Jenny Weisenberg and Blaine Bowman (Spring 2000) -
In our
project, we explore the effect of increasingly large sets
of terminals on evolved player fitness and performance.
-
Ilya Mayzus (Spring 2000):
- Fitness versus RandomPlayer
new primitives:
- use me/opponent instead of black/white to
allow testing for either side
- subsitute absolute number of pieces with
relative change after next move
- position on the board -- how far from center?
- fitness measure affects game strategy of evolved
players
- playing against RandomPlayers, just Edgar, or
a combination
- Leonid Portnoy (Spring 2000) - Investigated
using primitives based on the X/Y location of the piece, and a
primitive which guessed the value of the next board randomly.
- Juno Suk (Spring 1998)
- Writeup provides a survey and analysis of F97 work.
- General set of primitives, inspired by Porcelli/Pan and Sow, below.
- Fitness measured against Edgar
- Results: Defeats (deterministic) Edgar, as well as
random players
- Chris Porcelli and Patrick Pan. (Fall 1997)
- 10 terminals for all board-position-types -- captures game's symmetry
- resulting players look at next-to-next-to corners
- Known target function
- Daby Mousse Sow. (Fall 1997) - pre-ordained weights for board regions
- fitness based on these weights
- how well can our set of primitives approximate the known optimum?
- how well does result play Othello?
- vary population size
- William Bauder. (Fall 1997) - hypothesis tries to rank like Edgar
- effect of lessoning complexity penalty examined
- Fitness versus Edgar
-
Lenny Volchock (Spring 2000): Trained against a Random
Edgar.
- Matt Schultz (Spring 2000):
Learned against, Edgar, losers.
-
Ilya Malkovitch and Dmitriy Stolyarov (Spring 2000):
Trained against Random, Othello, and greedy players.
-
Shlomo Hershkop (Spring 2000)
The main goal of my project was to create a player to beat
Edgar. It was motivated out of pure revenge, after I lost
3 games in a row to Edgar J . In addition I experimented
with complexity, population size, and generation numbers
on the performance of the "beat-up Edgar" player, and how
it compared to the random player (Spring 2000)
- Mohamed F. Abdelsadek. (Fall 1997)
- Fitness versus random, Edgar and previously-evolved.
- new primitive: near_edge
- automatically defined functions
- Monty Kahan. (Fall 1997)
- existence proof: there exists a player better than random, worse than Edgar
- new primitives: near_edge, longest row/column/diagonal
- results:
- fitness versus combination of random and Edgar helped
- new primitives did not help
- Janak J Parekh and Scott Susser (Fall 1997) - (course final project).
- fitness measured dynamically against increasingly difficult opponents
- three new primitives measure number of pieces that are:
- near occupied corner
- near side, but not near corner
- in solid side-centers
- David Evans and Barry Schiffman. (Fall 1997)
- Results when train against Edgar are good, but the more Edgar is
randomized, the better "he" does against an hypothesis overspecialized to
beat "him".
- train versus previously-evolved
- demonstration that playing white has statistical advantage over black
- an epic tale of the experiment-result-thought-experiment cycle on the
quest towards beating Edgar
- new primitives: parity of row/column, quadrant
- Varying GP algorithm
-
Sara Schumacher and Eugene Mesgar (Spring 2000):
Throughout evolution, catastrophes have changed the scope
of development by suddenly changing the way in which
natural selection takes place. Those phenotypes that
survived best in one evironment can completely falter in
another.
For this project, we tried to simulate these catastrophes
in a genetic
programming algorithm that plays Othello by drastically
changing the way that the fitness is measured, favoring
hypotheses that may not have been favored before.
-
Tom Bellin and Greg Gasperin (Spring 2000)
The intent of our
addition to the genetic programming othello player was to
allow the programs evolved to put an arbitrary weight on
the features it was evaluating.
- Kayuri Mitsui (course final project) (Fall 1997) - Explicit credit assignment and brood selection
- brood selection: keep only the best of two parents, two offspring
(called "elitism" in the write-up)
- explicit credit assignment: rank all subtrees in the population, use ranking to select crossover points
- results: both modification influence population's convergence
- results: "white" as a local minimum
- Population votes on each move
- Xin Jin. (Fall 1997)
- faster GP runs
- comparable results
- New primitives help
- Matt Bogosian. (Fall 1997)
- "Mitosis" biological metaphor
- Varying GP Parameters
- Investigate results and find useful subtrees
- Jesse Schechter. (Fall 1997)
- argument against complexity penalty
- new primitive: time (which move is it?), diagonal
- Time information
Other, non-Othello GP projects