Paula Buttery, University of Cambridge, U.K.


Reliable Language Acquisition from Real Data

Time: Thursday, March 18, 11:30 - 12:30

Place: Interschool Lab
PLEASE NOTE THE CHANGE OF LOCATION.

Abstract:

A child acquires the language of her environment from exposure to example utterances; she has no formal language teaching. Lacking any information that would let her tell them apart, a child is likely to assume that every utterance she hears is grammatical (i.e., a valid example of her target language). This presents the child with two problems:

  1. Spoken language contains ungrammatical utterances, perhaps in the form of interruptions, lapses of concentration or slips of the tongue. When a child misclassifies these utterances as grammatical, errors are introduced into the acquisition process.
  2. Some utterances provide ambiguous grammatical evidence (the problem of ambiguous triggers). For instance, an English sentence with subject-verb-object ordering may also be interpreted as having subject-object-verb ordering with verb-second (V2) movement, as in German. If a child chooses the wrong grammatical interpretation of an utterance, a source of error is again introduced.

A child cannot know when she has encountered an error. Any simulation or explanation of language acquisition should therefore attempt to learn from every utterance it encounters. In this presentation I will describe a statistical learning system that implements the Bayesian Incremental Parameter Setting (BIPS) algorithm and is robust to errors, and I will discuss experiments which demonstrate the ability of such a system to learn from real child-directed speech.

About the speaker: Paula Buttery is a PhD student in the Natural Language and Information Processing Group, Computer Laboratory, University of Cambridge, U.K. Her interests are in computational models of child language acquisition; in particular, studying the trade-off between the empiricist and nativist positions on acquisition, and how varying the detail contained in models of Universal Grammar affects the speed and accuracy of language acquisition in computational simulations.