Evaluating Text Generation without Penalizing Variation
Time: Thursday, April 15, 11:30 - 12:30
Place: CS Conference Room
Abstract:
In this talk I will analyze several text generation and machine
translation evaluation metrics (including simple string accuracy,
BLEU, and Melamed's F measure) with respect to how well they evaluate
generation systems that permit variation. Two different data sets
will be used: a set of paraphrases from Barzilay and Lee's work on
text-to-text paraphrase generation, and a set of paraphrases output
by the text generator HALogen. The metrics will be compared on their
ability to assess syntactic correctness and meaning equivalence. I
will close with ideas for text generation evaluation metrics that do
not penalize variation.
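
As a concrete illustration of the kind of metric under discussion, here
is a minimal Python sketch (my own, not from the talk) of BLEU computed
against multiple reference paraphrases. Scoring against several
references is one simple way to keep legitimate variation from being
penalized; the tokenization, smoothing floor, and example sentences
below are illustrative assumptions.

    from collections import Counter
    import math

    def ngrams(tokens, n):
        # Count the n-grams of a token list.
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    def bleu(hypothesis, references, max_n=4):
        # Multi-reference BLEU: geometric mean of clipped n-gram
        # precisions times a brevity penalty.
        hyp = hypothesis.split()
        refs = [r.split() for r in references]
        log_prec = 0.0
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            # Clip each n-gram count by its maximum count in any one
            # reference, so the hypothesis is rewarded for matching
            # *some* paraphrase.
            max_ref = Counter()
            for ref in refs:
                for gram, count in ngrams(ref, n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
            total = max(sum(hyp_counts.values()), 1)
            # Small floor avoids log(0) when there are no matches
            # (an assumption here, not a standard smoothing scheme).
            log_prec += math.log(max(clipped, 1e-9) / total) / max_n
        # Brevity penalty against the reference length closest to the
        # hypothesis length.
        closest = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
        bp = (1.0 if len(hyp) >= closest
              else math.exp(1 - closest / max(len(hyp), 1)))
        return bp * math.exp(log_prec)

    # Two reference paraphrases of the same content:
    refs = ["the cat sat on the mat", "on the mat sat the cat"]
    print(bleu("the cat sat on the mat", refs))  # close to 1.0
    print(bleu("on the mat sat the cat", refs))  # also close to 1.0

With both orderings supplied as references, either paraphrase scores
near 1.0, whereas against a single reference the alternate wording
would be heavily penalized.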
About the speaker:
Amanda Stent is an assistant professor at Stony Brook University,
where she directs the natural language processing lab. She received
her PhD from the University of Rochester and completed a postdoc at
AT&T Research. Her research focuses on spoken dialog systems and
natural language generation.