Instructor:
Michael Collins
Course Description:
COMS W4705 is a graduate introduction to natural language processing, the
study of human language from a computational perspective. We will
cover syntactic, semantic and discourse processing models. The
emphasis will be on machine learning or corpus-based methods and
algorithms. We will describe the use of these methods and models in
applications including syntactic parsing, information extraction,
statistical machine translation, dialogue systems, and summarization.
Problem sets:
There will be 4 problem sets during the class, due roughly every
three weeks. The problem sets will include both theoretical problems and
programming assignments.
Exams:
There will be a midterm and a final in the class.
The midterm will be held in class in mid-October.
Grading:
The overall grade will be determined roughly as follows:
Midterm 25%, Final 40%, Problem sets 35%.
Syllabus:
Here is a tentative syllabus for the class:
- Introduction
- Estimation techniques and language modeling
- Tagging, hidden Markov models
- Statistical parsing
- Machine translation
- Log-linear models
- Conditional random fields, global linear models
- Unsupervised and semi-supervised learning in NLP
Readings:
There are comprehensive notes for the class, posted
here.