CS4706: Spoken Language Processing, Spring 2012
Time: Mon/Wed 2:40-3:55
Place: Seeley Mudd 233
Professor
Julia Hirschberg (Office Hours M 4:15-6:15 pm)
julia@cs.columbia.edu, 212-939-7114
Teaching Assistants
Rivka Levitan (Office Hours TBD) rlevitan@cs.columbia.edu, 212-939-7147
Erica Cooper (Office Hours TBD) ecooper@cs.columbia.edu, 212-939-7122
Description
This course introduces students to research in spoken language in computational
linguistics, aka natural language processing (NLP). We will study the different
'meanings' that can be conveyed by the way that
speakers produce sentences, techniques for analyzing spoken language, methods
of developing speech technologies such as text-to-speech systems and speech
recognition systems, and applications of speech technologies in the real world,
such as spoken dialogue systems (SDS). Students will build an SDS in a
domain of their choice, working in small teams. NB: This course
can be counted as a PhD elective in Advanced AI. It is a requirement for
the MS NLP Track. There are no official prerequisites for this course
except Data Structures or equivalent, and no prior knowledge of NLP will be
assumed.
Requirements
The major requirements of the course are a midterm, a final, and a 3-part class project.
Class participation will also contribute to your final grade. The project
involves building a spoken dialogue system in a domain of your choice. You will build a text-to-speech
(TTS) system and an automatic speech recognition (ASR) system from components we will
provide; the dialogue component will involve building a simple system to put
inputs and outputs together to accomplish some interesting and useful or fun
task. You are encouraged to do these projects in teams of 2-3. There
will be several project deadlines during the term where we evaluate your project
description, your TTS system, your ASR system, and the overall project.
You are allowed a total of 5 late days across all project deadlines, no questions asked; after that, 10% per late day will be
deducted from the grade for that component, unless you have a note from your
doctor. Do not use these up early! Save them for real emergencies.
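As a rough illustration of the late-day arithmetic, here is a minimal Python sketch. It is not the official grading procedure. It assumes that the 5 free late days are used up in deadline order, that each late day beyond the allowance costs 10 percentage points of that component's grade, and that the deduction is capped at 100%. The function name `late_penalties` and its parameters are hypothetical, and a doctor's note is not modeled.

```python
def late_penalties(days_late, free_days=5, percent_per_day=10):
    """Return the percentage deducted from each project component.

    days_late: number of days late for each component, in deadline order.
    Assumptions (not stated in the syllabus): free late days are consumed
    in deadline order, and the deduction is capped at 100%.
    """
    remaining_free = free_days
    penalties = []
    for late in days_late:
        covered = min(late, remaining_free)  # days absorbed by the free allowance
        remaining_free -= covered
        excess = late - covered              # days that incur a penalty
        penalties.append(min(100, excess * percent_per_day))
    return penalties


# Four graded components: description, TTS, ASR, overall project.
print(late_penalties([2, 4, 0, 1]))  # [0, 10, 0, 10]
```

In this example the first two components use all 5 free days plus one more, so the TTS component loses 10%, and the single late day on the final component also costs 10% because no free days remain.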
All students are required to have a Computer
Science Account for this class. To sign up for one, go to the
CRF website and then click on
"Apply for an Account". The
Speech Lab is available
for use in homework as needed on a signup basis.
Academic Integrity
Copying or paraphrasing someone's work (code included),
or permitting your own work to be copied or paraphrased, even if only in part,
is not allowed, and will result in an automatic grade of 0 for the entire
assignment or exam in which the copying or paraphrasing was done. Your grade
should reflect your own work. If you believe you are going to have trouble
completing an assignment, please talk to Prof. Hirschberg or to Robert Coyne in
advance of the due date. Please see the
university policy.
Required texts:
Daniel Jurafsky and James H. Martin, Speech and Language
Processing (second edition). Pearson: Prentice Hall. 2009. See
errata before you do each reading assignment. There are some typos in
algorithms.
Other required readings
are available online via links from this syllabus.
Grading:
40% Exams
60% Course Project
Class participation will be taken into account in calculating the final grade.
Homework and project submission procedure is described HERE.
Lab Signup. Sign up to use the Linux computers in the Speech Lab.
· Sox - audio file editing
· Help using ToBI - ToBI Annotation Environments
· Text-to-Speech Links and more...
· Praat - Praat resources
Syllabus
Date | Topic | Reading Assignments | HW Due Dates and Other Assignments |
Jan 18 | | | |
Jan 23 | J&M 7.1-7.3, 7.5 | | |
Jan 25 | J&M 7.4 | | |
| Project Description due. Download Praat to your laptop if you have one and bring it to class, with headphones if you have them. | | |
Feb 1 | More on Praat and Lab Visit | " | |
Feb 6 | J&M 8 (pp. 249-50, 281-84); TTS-history; Historical examples | | |
Feb 8 | Project Part 1 (TTS) assigned | | |
Feb 13 | J&M 8.1, Sproatetal01 | | |
Feb 15 | J&M 8.2; Ghoshaletal09 | | |
Feb 20 | Hirschberg03, J&M 8.3.0-8.3.4, ToBI labeling conventions | Download and listen to all the ToBI examples. Prepare these exercises and bring them to class with your laptop and headphones. | |
Feb 22 | J&M 8.3.4-8.3.7 | | |
Feb 27 | | | |
Feb 29 | | | |
Mar 5 | J&M 8.4-5, 8.6, Tokudaetal02 | Project Part 1 due; Project Part 2 (ASR) assigned | |
Mar 7 | Midterm | | NB: Please deposit the exercises you did for Feb 21 in Courseworks before class. |
Mar 12-16 | Spring Break | | |
Mar 19 | J&M 9-9.2, 6-6.3 | | |
Mar 21 | J&M 9.3-9.7 | Fadi Biadsy | |
Mar 26 | J&M 4, 9.5 | | |
Mar 28 | J&M 9.8 | | |
Apr 2 | J&M 10.7 | Project Part 2 due; Project Part 3 (SDS) assigned | |
Apr 4 | Metadata: Speaker, Sentence and Topic Segmentation and Disfluencies [pdf] | J&M 10.5, Liuetal04, Liuetal05, Snoveretal04 | |
Apr 9 | Spoken Dialogue: Human and Machine | J&M 24-24.1, 24.8 | |
Apr 11 | J&M 24.2-3, Goldberg03 | | |
Apr 16 | | | |
Apr 18 | J&M 24.5, Hirschbergetal04 | | |
Apr 23 | Dialogue Acts and Information State (2) | | |
Apr 25 | J&M 24.4, Walkeretal97 | Preliminary Project Demos | |
Apr 30 | Final Exam | | |
May 1-3 | Study Days | | |
May 4-11 | Project Demos (1:10-4) | Interschool Lab, 750 CEPSR | Project Part 3 due |
Links to Resources
cf. also resources available from the text homepage
Places to look up definitions and descriptions of terminology:
Other resources
- Karen Chung Language and Linguistics links
- CatSpeak
- Check out Eliza
- AT&T Labs - Research Finite State Machine Library
- Appelt and Israel's information extraction tutorial (IJCAI-99).
- Framenet.
- Ask Jeeves -- a search engine that answers questions in plain English.
- Answer Bus -- another Q/A system.
- Columbia's NewsBlaster summarizer
- IBM summarizer demo (canned)
- Systran machine translation (also in use at Babelfish)
- Michael Collins' Parser
- On-line dictionaries in many languages.
- WordNet
- CoBuildDirect Corpus
- AT&T's SCANMail voicemail browsing/search system
- DiaLeague 2001 -- includes a link to an online dialogue system demo.
- James Allen's Dialogue Modeling for Spoken Language Systems ACL 1997 Tutorial
- Festival speech synthesizer demo and links to other TTS systems
- Julia Hirschberg's Intonational Variation in Spoken Dialogue Systems tutorial
Julia Hirschberg
Professor, Computer Science
Columbia University
Department of Computer Science
1214 Amsterdam Avenue
M/C 0401
450 CS Building
New York, NY 10027
email: julia@cs.columbia.edu
phone: (212) 939-7114