A Neural Network Approach to Othello Playing
Federico Kattan
Columbia University
Machine Learning class, Fall '97
fkattan@cs.columbia.edu
Introduction
The theory of situated cognition ([1] Clancey) claims that every human thought and action is adapted to the environment. That is, what people perceive, how they conceive of their activity, and what they physically do develop together. In this view thinking is like riding a bike: every turn of the handlebars and every adjustment of body position comes not from evaluating the physics formulae of stability and dynamics, but from a recoordination of previous postures, perceptions, and motion sequences. All human action is partially improvisatory, through the direct coupling of perception, conception (or interpretation), and action. This coupling involves a "self-organization with memory" in the brain that has not been duplicated in computer programs.
In regard to the symbolic approach of describing knowledge to a computer system, the theory of situated cognition claims that when researchers equate human knowledge with a set of descriptions (like rules and facts in an expert system), they can describe how the system should behave in particular situations, but they cannot capture the full flexibility of how perception, memory, and action interact in intelligent behavior. (This may be one reason why such descriptions are domain dependent and so expensive and difficult to build.)
Neural Darwinism
Some work has modeled remembering as a categorized coupling of perceptual and motor systems ([2] Edelman, pg. 93). It shows how complete structures that coordinate perception, categorization, and action are possible without explicitly stored descriptions (like production rules).
Gerald Edelman's model of learning is based on his earlier work on the immune system (for which he received the Nobel Prize in 1972). In that work he showed how recognition of bacteria is based on a competitive selection process over a population of antibodies. There are some intriguing characteristics of this approach:
The theory of neuronal group selection (TNGS) explains "how multiple maps lead to integrated responses, and generalizations of perceptual responses, even in the absence of language". Edelman joins other researchers in reacting against the descriptive cognitive approach, which assumes that learning occurs when an already categorized world with the correct responses is given. Edelman categorizes that approach as instructionism, and seeks instead to explain how categorization occurs at the neuronal level.
Darwin III is a recognition automaton based on this theory. It has a single moving eye and a four-jointed arm with touch and kinesthesia (joint sense). After experience with randomly moving objects, its eye follows an object and its arm tries to coordinate its movement to reach it, all without pre-programmed structures describing this particular behavior.
Neural Networks for Othello playing
For this project I tried to reproduce the behavior of an Othello player while keeping explicit descriptions to a minimum. Although this is a much more modest project, I tried to stay as close as possible to the lines of the research mentioned above. To that end I used a neural network (NN) as the foundation of an Othello player. The NN is a standard back-propagation network with the following topology:
Learning curve for the Othello Player network
Detail of the first 200 epochs
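A learning curve of this kind records the training error after each epoch. The following is a minimal Python sketch of how a standard back-propagation network produces such a curve. It is an illustration under stated assumptions, not the network built in SNNS for this project: the 64-input board encoding (+1, -1, 0), the single hidden layer, the single output unit, the learning rate, and all names (`BackpropNet`, `encode_board`, `train`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def encode_board(board):
    """Flatten an 8x8 board of 'B', 'W', '.' into 64 inputs (assumed encoding)."""
    value = {"B": 1.0, "W": -1.0, ".": 0.0}
    return [value[c] for row in board for c in row]


class BackpropNet:
    """One hidden layer, one sigmoid output, trained by plain back-propagation.

    The layer sizes are illustrative assumptions, not the project's topology.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Each weight row carries one extra entry for the bias term.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hidden + [1.0])))
        return hidden, out

    def train_pattern(self, x, target, lr=0.5):
        """One gradient step on squared error; returns the error before the step."""
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed from the output weights before they change.
        delta_hidden = [delta_out * self.w_out[j] * h * (1.0 - h)
                        for j, h in enumerate(hidden)]
        for j in range(len(hb)):
            self.w_out[j] -= lr * delta_out * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(xb)):
                row[i] -= lr * delta_hidden[j] * xb[i]
        return 0.5 * (out - target) ** 2


def train(net, patterns, epochs):
    """Return the summed error per epoch, i.e. the data behind a learning curve."""
    curve = []
    for _ in range(epochs):
        curve.append(sum(net.train_pattern(x, t) for x, t in patterns))
    return curve
```

Calling `train(net, patterns, epochs)` with a list of `(inputs, target)` pairs returns one error value per epoch; plotting that list gives a curve of the kind shown above.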
The modifications to the given Othello package involved: generation of game logs suitable for use by the neural network during training; a C function that loads the trained network and interprets its output (used during game playing); and the connection between the Java player and the C function. The neural network was defined using SNNS (the Stuttgart Neural Network Simulator), which provides a graphical user interface for network definition and training, as well as a utility that generates C functions to represent and use the trained networks.
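One way a player might interpret the network's output during play is sketched below in Python. This is not the C function described above. It assumes, hypothetically, that the network scores a board position for a given player and that the player picks the legal move leading to the highest-scoring position. The function names and the `evaluate` callback are illustrative, and `disc_difference` is only a stand-in for a call into the trained network.

```python
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def flips(board, row, col, player):
    """Return the opponent discs captured if `player` moves at (row, col)."""
    if board[row][col] != ".":
        return []
    opponent = "W" if player == "B" else "B"
    captured = []
    for dr, dc in DIRECTIONS:
        line = []
        r, c = row + dr, col + dc
        # Walk over a run of opponent discs in this direction.
        while 0 <= r < 8 and 0 <= c < 8 and board[r][c] == opponent:
            line.append((r, c))
            r, c = r + dr, c + dc
        # The run is captured only if it ends on one of the player's discs.
        if line and 0 <= r < 8 and 0 <= c < 8 and board[r][c] == player:
            captured.extend(line)
    return captured


def legal_moves(board, player):
    """A move is legal when it captures at least one disc."""
    return [(r, c) for r in range(8) for c in range(8)
            if flips(board, r, c, player)]


def apply_move(board, row, col, player):
    """Return a new board (list of lists) with the move played."""
    new_board = [list(r) for r in board]
    new_board[row][col] = player
    for r, c in flips(board, row, col, player):
        new_board[r][c] = player
    return new_board


def disc_difference(board, player):
    """Stand-in evaluator; a trained network would be called here instead."""
    opponent = "W" if player == "B" else "B"
    own = sum(row.count(player) for row in board)
    other = sum(row.count(opponent) for row in board)
    return own - other


def choose_move(board, player, evaluate):
    """Pick the legal move whose resulting position scores highest."""
    moves = legal_moves(board, player)
    if not moves:
        return None  # the player must pass
    return max(moves,
               key=lambda m: evaluate(apply_move(board, m[0], m[1], player),
                                      player))
```

Passing the evaluator as a parameter keeps the move-selection logic independent of how the score is produced, so the same loop works whether the score comes from a generated C function, a Python network, or a simple heuristic.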
Results
After training for 5000 epochs, the network beat a random player in all 50 games played. This result is similar to the one achieved using GP in the previous homework. No further experiments were done, but it would be interesting to see whether a hill-climbing learning scheme, in which increasingly good neural networks play against each other to learn better positions, is feasible. Playing against EDGAR would also be an interesting experiment. A different topology, or neural maps of the kind described by Edelman, could be future areas of research and experimentation.
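The hill-climbing idea can be sketched in a few lines of Python. This is a speculative outline of an experiment that was not run, under stated assumptions: a candidate is represented as a flat list of weights, a challenger is produced by adding small Gaussian noise to the champion, and `beats` is a hypothetical callback that would, in practice, play one or more Othello games between the two networks and report whether the challenger won.

```python
import random


def mutate_weights(weights, rng, scale=0.1):
    """Return a perturbed copy of the weights (assumed mutation scheme)."""
    return [w + rng.gauss(0.0, scale) for w in weights]


def hill_climb(initial, mutate, beats, rounds, rng):
    """Keep a champion; replace it whenever a mutated challenger beats it."""
    champion = initial
    for _ in range(rounds):
        challenger = mutate(champion, rng)
        # `beats` is hypothetical: in practice it would play a match.
        if beats(challenger, champion):
            champion = challenger
    return champion
```

Because the champion is replaced only when it loses, its strength never decreases with respect to whatever `beats` measures. Whether that measure reflects real playing strength depends on how many games each match contains.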
References
[1] William J. Clancey, Situated Cognition, Cambridge University Press, 1997.