as372@cs.columbia.edu
Title: Generating Referring Expressions in Open Domains
Time: Thursday, January 15, 11:30 - 12:30
Place: CS Conference Room in MUDD
Abstract:
We present an algorithm for generating referring expressions in open
domains. Existing algorithms assume a classification of
adjectives. This exists (or is feasible to construct) only for very
restricted domains. Large-scale adjective classification is not only
infeasible, it is also unnatural and unintuitive. Unlike nouns, which
are classified hierarchically, adjectives are related chiefly by
antonymy and synonymy, not hyponymy. This largely follows from the
purpose of adjectives: to describe or contrast entities.
Our algorithm relies on WordNet synonym and antonym sets. It gives
equivalent results on the (restricted-domain) examples cited in the
literature and good results for many other cases that prior approaches
cannot handle. We believe that it is also the first algorithm that
allows for the incremental incorporation of relations as well as
attributes. Our algorithm also overcomes various other problems with
existing algorithms.
We also present a corpus-based evaluation of our algorithm using the
Penn Treebank. Evaluations are uncommon in this research area, and we
feel that our ability to evaluate our algorithm on a standard corpus
is by itself a vindication of our approach.
About the speaker:
I am doing a postdoc with the Natural Language Group at Columbia
University, where I am applying my thesis work to the problem of
summarization. My PhD is from the Natural Language and Information
Processing Group at the Computer Lab, University of Cambridge. My
supervisor was Ann Copestake. I've followed a rather erratic
trajectory to get this far, having started out in physics a while
back. My PhD thesis was on syntactic simplification and text
cohesion; I was looking at the problem of simplifying newspaper text
to make it accessible to people with reading difficulties. In
particular, I was interested in analysing the discourse-level aspects
of syntactically rewriting text.