Project Othello Machine Learning Home Page

"Put money in thy purse." - Shakespeare's Othello

"A minute to learn, a lifetime to master." - Milton Bradley

This page describes the results of Project Othello, one of four homework projects for the machine learning course at Columbia University (Professor: Eric Siegel, TA: Eleazar Eskin).

A SIGCSE (cs education) paper on this assignment is available. Note that there is also an intro to computer science version of this assignment.

For Project Othello, students applied the machine learning method genetic programming (GP) to the game Othello. Each student or student team selected a unique experimental project investigating variations on the learning method, hypothesis representation, or both. To evaluate results, students had to deal with many machine learning issues such as overlearning, temporal credit assignment, local optima, hypothesis representation bias, learning speed and noisy fitness measurements (cf. GP Brainstorming Archive for a more comprehensive list of learning issues). The projects were mutually complementary. The bottom of this web page contains a hierarchical index of the 18 experimental projects with links to writeups of their approaches and results. Some have applets that allow you to play Othello against learned players.

The homework assignment and all needed code are available for interested researchers or machine learning instructors. This includes ideas for GP-Othello projects from which students could select. It also includes Java implementations of GP (jpgpp) and of Othello, which are hooked up; to perform many non-trivial experiments, this code can be modified easily by students who are not fluent with Java.

GP is used to induce a board evaluation function to select each move made during the game. In this approach, there was no search beyond the next immediate move in the game, except for one student project (Truta, below). The students started from this rudimentary approach and evaluated variations.

The idea to use Othello as a machine learning homework project was Astro Teller's, and he provided some of the code used in the project, as well as another automatically-learned Othello player, Edgar. Edgar trained with a different learning paradigm, so students also performed cross-paradigm comparisons.

The goal of each student project was not necessarily to beat Edgar. Rather the goals included:

Compare, contrast and discover methods to approach Othello with GP.
Investigate methods to evaluate resulting Othello players.
Compare results to another learned Othello player, Edgar
Use another Othello expert, Edgar, during training

One interesting result demonstrated over a few reports, below, is that while some strategies learned to beat Edgar, it proved more difficult for learning to develop a strategy that beats a "noisy" Edgar, that is, Edgar with some randomness added.

Introduction (from Suk's writeup, below)

In this project, we are given an Othello game written in Java with a genetic programming implementation already in place. This original implementation consists of 13 primitives, + - * / 10 black black_corners black_near_corners black_edges white white_corners white_near_corners white_edges. The first four primitives are standard arithmetic operators while the rest of the primitives are terminals, all but one corresponding to positions on the board. All the standard genetic programming population control constructs such as crossover, mutation, and ADFs are included in the given Java package, gpjpp. In addition to this genetic programming setup, a computer player, Edgar, is provided. Though not a genetically programmed player, Edgar boasts a superior playing ability, easily dispatching most Othello players in short time. It is the purpose of this project to modify and extend the current genetic programming setup in hopes of discovering new approachs in implementation and representation that will facilitate the creation and evolution of GP players that can play well against both Edgar and the general othello playing population.

In general, fitness over five games only is not precise, very noisy.

Student Othello Projects:

Alpha-Beta search
- Philip Kim, Jason Greenberg, Jeffrey Leong (Spring 2000): For our Genetic Programming Project, we decided to study the effects of using the MiniMax algorithm as an alternative to the current evaluation function used to choose the GP player's next move.
- Truta (Spring 1998):
Competitive Fitness Measure
- Corey Tripp (Spring 2000): I decided to try one of the learning method variations suggested in the homework assignment. This was having a population vote on each move.
- Thomas Keerikattu (Spring 2000) - Modified the fitness-measure algorithm so that fitness is evaluated against several players. The new algorithm uses the number of players defeated as a fitness measure and ignores the number of pieces remaining on the board. Used a reduced set of terminals: 'white' terminals are removed and 'constant' terminals modified.
- George Yi, Naho Osagawara, and Wei-Ang Lee (Spring 2000) - Dare to Mutate: Small population after generations and generations of evolutions (adaptiving to random mutation events) give birth to one smart dude that can slightly overmatch Edgar. With this dude's structure left in tact, the "extremities" of the dude, tree edges, are refined to play Othello (as well as the internal nodes). Genetically engineered, the super dude creams Edgar, and many other random opponents called to compete against him. Resistance is futile.
- Yi-Min Chee. (Fall 1997) - Single elimination tournament
  - variations on fitness measure:
    - non-competitive baseline
    - straight competitive fitness eval
    - combo of competitive and versus Edgar
  - results:
    - competitive runs reached termination criterion faster than absolute fitness runs.
    - Resulting player from competitive run beats Edgar, and generalizes better than individuals trained directly against Edgar.
    - existence proof of a player that is better than random, but worse than Edgar -- a tribute to Edgar.
- Chun Chao and Olga Merport (Fall 1997)
  - New primitives: possible_moves_white, possible_moves_black
  - although fitness is relative to other players in the population, it gets better in an absolute sense, as compared to random players.
- Federico Kattan (Fall 1997)
  - large population size
  - parallel 5-game evaluation of individuals
- Jorge Lepre (Spring 2000) - Using Tournament Selection as competitive fitness measure allows evolving a player with minimal prior knowledge.
Modifications to Primitives
- Brian Whitman and Grace Junxin Zhang (Spring 2000): Add new primitives, change fitness measure.
- Jenny Weisenberg and Blaine Boman (Spring 2000)
  - explore the effect of increasingly large sets of terminals on evolved player fitness and performance
  - explore the relationship between fitness and performance against both EDGAR and the Random Player
- Krishna Srikant and Pascal Saremsky (Spring 2000)
  - Super primitives, evolving populations
  - Reduction of original primitives to bare essentials
  - Addition of primitives for individual square access through board symmetry
  - Addition of primitives for piece mobility determination at different game stages
  - Demes for evolving increasingly superior players which can play both black or white
  - Defeats EDGAR effortlessly; challanging to humans as well
- Manuel Reyes (Spring 2000): Added two new primitives, weighted different sections of the board.
- Ben May and Anastasiya Lebedev (Spring 2000): Added two new primitives.
- James Lott (Spring 2000): I decided to add two new terminals to the othello primitives, based on the concept of surroundedness. Surroundedness measures the number of pieces immediately surrounding (in the 8 spaces adjacent to) a possible move.
- Stephen Jan, Pablo de Dios (Spring 2000): In this experiment, we attempt to utilize human derived strategies in order to properly represent the attributes necessary to derive a hypothesis reprsentation that can be more quicly generated. The goal of this experiment is to test one method for increasing the learning rate and reducing the computation time in genetic programming applications.
- Joe Gagliano, Stan John, Sheanon Chung The intent of our addition to the genetic programming othello player was to allow the programs evolved to put an arbitrary weight on the features it was evaluating. (Spring 2000)
- Arie Addi (Spring 2000) - Discovering a simpler Othello Player by using less but more effective primitives in the GP trees.
- Jenny Weisenberg and Blaine Bowman (Spring 2000) - In our project, we explore the effect of increasingly large sets of terminals on evolved player fitness and performance.
- Ilya Mayzus (Spring 2000):
  - Fitness versus RandomPlayer
    - use me/opponent instead of black/white to allow testing for either side
    - subsitute absolute number of pieces with relative change after next move
    - position on the board -- how far from center?
  - fitness measure affects game strategy of evolved players
    - playing against RandomPlayers, just Edgar, or a combination
- Leonid Portnoy (Spring 2000) - Investigated using primitives based on the X/Y location of the piece, and a primitive which guessed the value of the next board randomly.
- Juno Suk (Spring 1998)
- Chris Porcelli and Patrick Pan. (Fall 1997)
Known target function
- Daby Mousse Sow. (Fall 1997) - pre-ordained weights for board regions
  - fitness based on these weights
  - how well can our set of primitives approximate the known optimum?
  - how well does result play Othello?
  - vary population size
- William Bauder. (Fall 1997) - hypothesis tries to rank like Edgar
  - effect of lessoning complexity penalty examined
Fitness versus Edgar
- Lenny Volchock (Spring 2000): Trained against a Random Edgar.
- Matt Schultz (Spring 2000): Learned against, Edgar, losers.
- Ilya Malkovitch and Dmitriy Stolyarov (Spring 2000): Trained against Random, Othello, and greedy players.
- Shlomo Hershkop (Spring 2000) The main goal of my project was to create a player to beat Edgar. It was motivated out of pure revenge, after I lost 3 games in a row to Edgar J . In addition I experimented with complexity, population size, and generation numbers on the performance of the "beat-up Edgar" player, and how it compared to the random player (Spring 2000)
- Mohamed F. Abdelsadek. (Fall 1997)
  - Fitness versus random, Edgar and previously-evolved.
  - new primitive: near_edge
  - automatically defined functions
- Monty Kahan. (Fall 1997)
  - existence proof: there exists a player better than random, worse than Edgar
  - new primitives: near_edge, longest row/column/diagonal
  - results:
    - fitness versus combination of random and Edgar helped
    - new primitives did not help
- Janak J Parekh and Scott Susser (Fall 1997) - (course final project).
  - fitness measured dynamically against increasingly difficult opponents
  - three new primitives measure number of pieces that are:
    - near occupied corner
    - near side, but not near corner
    - in solid side-centers
- David Evans and Barry Schiffman. (Fall 1997)
  - Results when train against Edgar are good, but the more Edgar is randomized, the better "he" does against an hypothesis overspecialized to beat "him".
  - train versus previously-evolved
  - demonstration that playing white has statistical advantage over black
  - an epic tale of the experiment-result-thought-experiment cycle on the quest towards beating Edgar
  - new primitives: parity of row/column, quadrant
Varying GP algorithm
- Sara Schumacher and Eugene Mesgar (Spring 2000): Throughout evolution, catastrophes have changed the scope of development by suddenly changing the way in which natural selection takes place. Those phenotypes that survived best in one evironment can completely falter in another.
  For this project, we tried to simulate these catastrophes in a genetic programming algorithm that plays Othello by drastically changing the way that the fitness is measured, favoring hypotheses that may not have been favored before.
- Tom Bellin and Greg Gasperin (Spring 2000) The intent of our addition to the genetic programming othello player was to allow the programs evolved to put an arbitrary weight on the features it was evaluating.
- Kayuri Mitsui (course final project) (Fall 1997) - Explicit credit assignment and brood selection
  - brood selection: keep only the best of two parents, two offspring (called "elitism" in the write-up)
  - explicit credit assignment: rank all subtrees in the population, use ranking to select crossover points
  - results: both modification influence population's convergence
  - results: "white" as a local minimum
Population votes on each move
- Xin Jin. (Fall 1997)
  - faster GP runs
  - comparable results
  - New primitives help
- Matt Bogosian. (Fall 1997)
  - "Mitosis" biological metaphor
Varying GP Parameters
- Kuan-Hung Lin (Spring 2000): The goal of this project is to experiment how these attributes effect the players that the evolution generates.
- Li Liao (Spring 2000): In this project, we experimented "staged" hypothesis technique hoping to help hypothesis learn somewhat "directly". Variations on training parameters such as population size, termination fitness are experimented along with different stage points in order to reveal how these factors may influence the staged hypothesis.
- Chihiro Ishii (Spring 2000) Changed parameters:
  - MaximumDepthForCrossover: ranging from 6 to 30
  - MaximumComplexity: ranging from 50 to 250
  - CrossoverProbability: ranging from 0 to 100
  This experiment evaluates how the performance of othello game is affected by changes made in crossover parameters.
- Hamada Hironobu (Spring 2000): I implemented the evaluation of changing population size and crossover probability for GPOthello. I trained my genetic programming player against Edgar.
- Christopher Akritides (Spring 2000)
- Lawrence Leftin (Spring 2000) -
  - recognized distinct white/black strategies
  - trained against white random, black random, and white Othello players
  - varied opponent, population size, deme size, and genetic diversity
- David Pisapia (Spring 2000) - An examination of how the chosen fitness measure (i.e. the evaluation function) affects the complexity and effectiveness of the resultant hypothesis. Experiments over several fitness variations and generation lengths.
- Alberto Goldberger. (Fall 1997)
  - Vary:
    - maximum tree size
    - crossover depth limit
    - generation 0 depth limit
  - Extensive set of 54 runs
- LianShu Huang. (Fall 1997)
  - Vary:
    - population size
    - number of generations
    - proportion of crossover
  - new primitives
Investigate results and find useful subtrees
- Jesse Schechter. (Fall 1997)
  - argument against complexity penalty
  - new primitive: time (which move is it?), diagonal
Time information
- Peng Xu (Spring 2000): a. Generally investigate how to use multiple trees to evaluate in GP. b. Two approaches to integrating time information into evaluation. 1)Multiply trees evaluation in different time points 2)New primitives that can determine the branch to evaluate according to current time points.

Other, non-Othello GP projects

Tom Barry - using GP to solve the Traveling Salesman problem (PDF Format) (Spring 2000)