Lecturer: Prof. Michael Collins (office hours Thursdays, 1-2pm, room 723 CEPSR)
TAs:
Yin-Wen Chang, yc2745@columbia.edu
(office hours Tues 1-2, Weds 11-12, in CEPSR 701).
Karl Stratos, stratos@cs.columbia.edu
(office hours Mon 1.30-2.30, Fri 1.30-2.30, in CEPSR 721).
Lectures: Wednesdays 4.10-6.00pm, in Hamilton 503
Prerequisites:
- Students will need a solid background in: (1) algorithms (e.g., a prior
class at the level of this class); (2) probability (e.g., this textbook is
highly recommended; chapters 1 and 2 should be good background for this course).
- A prior class in machine learning and/or natural language processing
is recommended. (In particular, if you're interested in a first class
in NLP, then COMS 4705,
taught in the fall, may be more appropriate.)
Course description:
This is an advanced course in machine learning for natural language
processing. The methods we will cover will be relevant to many NLP
applications, for example machine translation, dialog systems, natural
language parsing, and information extraction. The course will cover
the following topics:
- Models for structured prediction: e.g.,
hidden Markov models, maximum-entropy Markov models, conditional
random fields, probabilistic context-free grammars, synchronous
context-free grammars, dependency parsing models, max-margin methods
for structured prediction.
- Unsupervised and semi-supervised learning methods: e.g.,
the EM algorithm, methods that derive lexical representations from
unlabeled data, cotraining algorithms, methods based on canonical
correlation analysis (CCA).
- Inference algorithms: e.g., dynamic programming
algorithms, belief propagation, methods based on linear programming
and integer linear programming, methods based on dual decomposition
and Lagrangian relaxation.
Text/material:
There is no textbook for the course.
Throughout the course, we will make use of research papers as
readings.
The following book may
provide useful background, particularly for the early part of the
course:
Assignments:
There will be 3 homeworks (30% of the final grade), a final class
project (40% of the final grade), and one 2-hour in-class exam (30% of
the final grade, date TBD).