koehn@isi.edu
Title: Advances in Statistical MT: Phrases, Noun Phrases and Beyond
Time: Wednesday, November 12, 12:30 - 1:30
Place: CS Conference Room in MUDD
Abstract: I will review the state of
the art in statistical machine translation (SMT), present my
dissertation work, and sketch out the research challenges of
syntactically structured statistical machine translation.
The best current methods in SMT build on the translation of phrases
(any sequence of words) instead of single words. Phrase translation
pairs are learned automatically from parallel corpora. While SMT
systems generate output that often conveys much of the meaning of
the original text, that output is frequently ungrammatical and
incoherent.
The research challenge at this point is to introduce syntactic knowledge into state-of-the-art systems in order to improve translation quality. My approach breaks up the translation process along linguistic lines. I will present my thesis work on noun phrase translation and ideas about clause structure.
About the speaker: Philipp Koehn recently graduated from USC/ISI, where he worked under the supervision of Kevin Knight. He specializes in machine translation.