CS4706: Spoken Language Processing, Spring 2011

Time: Mon/Wed 2:40-3:55
Place: CEPSR 415

Professor Julia Hirschberg (Office Hours TBA)
julia@cs.columbia.edu, 212-939-7114

Teaching Assistant Bob Coyne (Office Hours TBA)
coyne@cs.columbia.edu, 212-939-7147


Description

This course introduces students to research on spoken language within computational linguistics, also known as natural language processing (NLP). We will study the different "meanings" that can be conveyed by the way that speakers produce sentences, techniques for analyzing spoken language, methods of developing speech technologies such as text-to-speech systems and speech recognition systems, and applications of speech technologies in the real world, such as spoken dialogue systems (SDS). Students will build an SDS in a domain of their choice, working in small teams. NB: This course can be counted as a PhD elective in Advanced AI. It is a requirement for the MS NLP Track. There are no official prerequisites for this course except Data Structures or equivalent, and no prior knowledge of NLP will be assumed.

Requirements

The major requirements of the course are a midterm, a final, and a 3-part class project. Class participation will also contribute to your final grade. The project involves building a spoken dialogue system in a domain of your choice. You will build a text-to-speech (TTS) system and an automatic speech recognition (ASR) system from components we will provide; the dialogue component will involve building a simple system that puts inputs and outputs together to accomplish some interesting and useful or fun task. You are encouraged to do these projects in teams of 2-3. There will be several project deadlines during the term at which we evaluate your project description, your TTS system, your ASR system, and the overall project. You are allowed a total of 5 late days across the project deadlines, no questions asked; after that, 10% per late day will be deducted from the grade for that component, unless you have a note from your doctor. Do not use these up early! Save them for real emergencies.

All students are required to have a Computer Science Account for this class. To sign up for one, go to the CRF website and click on "Apply for an Account". The Speech Lab is available for homework use as needed, on a sign-up basis. Some parts of the project must be done in the Lab.

 

Academic Integrity

Copying or paraphrasing someone's work (code included), or permitting your own work to be copied or paraphrased, even if only in part, is not allowed, and will result in an automatic grade of 0 for the entire assignment or exam in which the copying or paraphrasing was done. Your grade should reflect your own work. If you believe you are going to have trouble completing an assignment, please talk to Prof. Hirschberg or to Robert Coyne in advance of the due date.  Please see the university policy.

Required texts:

    Daniel Jurafsky and James H. Martin. Speech and Language Processing (second edition). Pearson: Prentice Hall. 2009. See the errata before you do each reading assignment; there are some typos in the algorithms.

    Keith Johnson. Acoustic & Auditory Phonetics (second edition). Blackwell.  2003.

    Other required readings are available online via links from this syllabus.

Grading:

  • 40% Exams

  • 60% Course Project

  • Class participation will be taken into account in calculating the final grade.

Homework submission procedure is described HERE.

Lab Signup.

Sign up to use the Linux computers in the Speech Lab.

Resources

  • Praat - Praat resources

  • Help using ToBI - ToBI Annotation Environments

  • Text-to-Speech Links and more...

  • Text-to-Song synthesis

Syllabus

 

| Date | Topic | Reading Assignments | HW Due Dates and Other Assignments |
|---|---|---|---|
| Jan 19 | It's not what you said, it's how you said it [pdf] | | |
| Jan 24 | From Sounds to Language [pdf] | J&M 7.1-7.3, 7.5 | |
| Jan 26 | Acoustics of Speech [pdf] | J&M 7.4; Johnson Ch. 1-2 | |
| Jan 31 | Tools for Speech Analysis [pdf] | Praat tutorial 1 | Project Description due. Download Praat to your laptop if you have one and bring it to class, with headphones if you have them. |
| Feb 7 | Speech Generation Overview [pdf] | J&M 8 (pp. 249-50, 281-84); TTS-history; Historical examples | |
| Feb 9 | Building a TTS System [pdf] | Black-Festival-Notes | Project Part 1 (TTS) assigned |
| Feb 14 | Text Normalization [pdf] | J&M 8.1, Yarowsky97 | |
| Feb 16 | Modeling Pronunciation [pdf] | J&M 8.2; Fackrell&Skut04, Ghoshaletal09 | Prepare these exercises and bring them to class. |
| Feb 21 | Prosody Modeling [pdf] | Hirschberg03, J&M 8.3.0-8.3.4, ToBI labeling conventions | Download and listen to all the ToBI examples. Prepare these exercises and bring them to class with your laptop. |
| Feb 23 | Predicting Prosody from Text [pdf] | J&M 8.3.4-8.3.7 | |
| Feb 28 | Information Status: Focus and Given/New [pdf] | GBrown83, Prince92, Terken&Hirschberg93 | |
| Mar 2 | Backend Synthesis and Evaluation [pdf] | J&M 8.4-5, 8.6; Tokudaetal02 | |
| Mar 7 | | | Project Part 1 due. Project Part 2 (ASR) assigned. |
| Mar 9 | Midterm | | |
| Mar 14-18 | Spring Break | | |
| Mar 21 | ASR: Overview [pdf] | J&M 9-9.2, 6-6.3 | |
| Mar 23 | Building an ASR System [pdf] | J&M 9.3-9.7, Johnson Ch. 1-2 (review) | Fadi Biadsy |
| Mar 28 | Language Modeling [pdf] | J&M 4, 9.5 | |
| Mar 30 | ASR Evaluation [pdf] | J&M 9.8 | |
| Apr 4 | Human Speech Perception [pdf] | J&M 10.7; Johnson 3-4 | Project Part 2 due. Project Part 3 (SDS) assigned. |
| Apr 6 | Metadata: Speaker, Sentence and Topic Segmentation and Disfluencies [pdf] | J&M 10.5, Liuetal04, Liuetal05, Snoveretal04 | |
| Apr 11 | Spoken Dialogue: Human and Machine [pdf] | J&M 24-24.1, 24.8 | |
| Apr 13 | SDS System Architectures [pdf] | J&M 24.2-3, Goldberg03 | |
| Apr 18 | Turn-taking in SDS [pdf] | Gravano&Hirschberg09 | |
| Apr 20 | Dialogue Acts and Information State | J&M 24.5, Hirschbergetal04 | |
| Apr 25 | SDS Evaluation [pdf] | J&M 24.4, Walkeretal97 | |
| Apr 27 | Project Demos | | |
| May 2 | Project Demos | | Project Part 3 due |
| May 3-5 | Study Days | | |
| TBD | Final Exam | | |


Julia Hirschberg
Professor, Computer Science

Columbia University
Department of Computer Science
1214 Amsterdam Avenue
M/C 0401
450 CS Building
New York, NY 10027


 

Columbia University Department of Computer Science / Fu Foundation School of Engineering & Applied Science
450 Computer Science Building / 1214 Amsterdam Avenue, Mailcode: 0401 / New York, New York 10027-7003
Tel: 1.212.939.7000 / Fax: 1.212.666.0140