jencc you-know-what us dot ibm dot com
Title:
Semantic Search via XML Fragments: A High-Precision Approach to IR
Abstract:
In some IR applications, it is desirable to adopt a high-precision search mechanism that returns a small number of documents that are highly focused on and relevant to the user's information need. With these applications in mind, we investigate semantic search using the XML Fragments query language on text corpora that have been automatically pre-processed to encode semantic information useful at the retrieval stage. We identify three XML Fragment operations that can be applied to a query to conceptualize, restrict, or relate terms in the query, and demonstrate how these operations can be used to address four different types of query-time semantic needs: to specify target information, to disambiguate keywords, to specify search term context, or to relate select terms in the query. We demonstrate the effectiveness of our semantic search technology through a series of experiments using the two applications in which we embed this technology, and show that it yields significant improvement in the precision of the search results.
This is joint work with John Prager, Krzysztof Czuba, David Ferrucci, and Pablo Duboue.
About the speaker:
Jennifer Chu-Carroll is a Research Staff Member at the IBM T. J. Watson Research Center. She also manages the Knowledge Structures group, which focuses on improving advanced search technology through the use of natural language processing and artificial intelligence techniques. Prior to joining IBM in 2001, she spent 5 years as a Member of Technical Staff at Lucent Technologies Bell Laboratories. Her research interests include question answering, semantic search, natural language discourse processing, and spoken dialogue management.
Dr. Chu-Carroll is program co-chair of the upcoming HLT/NAACL 2006 Conference and was program committee area chair for EMNLP/HLT 2005. In the past, she served on the editorial board of the Computational Linguistics Journal, and as secretary and scientific advisory board member of the ACL/ISCA special interest group on discourse and dialogue (SIGDIAL).