Event Date: Thursday, October 3, 2013 - 3:30pm

Event Location: South Hall 3605

Roger Levy, UCSD


This talk covers the following fundamental issues in human language processing: what determines the difficulty of comprehending a given word in a given sentence, and what influences the choice a speaker makes when a meaning can be expressed in more than one way? I present a combination of computational modeling, corpus, and experimental studies to argue that we can make headway on these problems by hypothesizing that human language comprehension and production fundamentally involve the rational application of probabilistic knowledge to achieve efficient communication.
I begin by describing the surprisal theory of incremental processing difficulty. Under this theory, a comprehender's probabilistic grammatical knowledge determines expectations about the continuations of a sentence at multiple structural levels, which in turn determine the difficulty of processing the words actually encountered. I show how this theory accounts for a number of results problematic for prominent theories of processing difficulty based in syntactic locality, and how surprisal goes a considerable way toward providing a unified account of two major classes of processing phenomena in the psycholinguistic literature: garden-path disambiguation and syntactic complexity. I also describe how surprisal can be derived in multiple ways from optimality principles, and present empirical results supporting surprisal's claim that processing times in language comprehension are truly linear in negative log-probability. Experimental results on the processing of extraposed relative clauses then illustrate in detail how a class of syntactic processing results originally taken to support locality-based theories in fact provides striking evidence for the power of probabilistic expectations to guide real-time language comprehension.

In the last part of the talk I describe the relationship of surprisal theory to Uniform Information Density, a theory holding that speaker choices in the face of grammatical optionality are guided by a drive to maintain an optimal, constant level of information transfer throughout a discourse. A case study of optional "that" production in English relative clauses, combining corpus analysis and a computational model, provides evidence supporting Uniform Information Density as a driver of optimal production decisions.
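The core quantity in surprisal theory, the negative log-probability of a word given its context, can be illustrated with a minimal sketch. The toy bigram model below is purely a hypothetical illustration (Levy's work uses richer probabilistic grammars, not this model):

```python
import math
from collections import Counter

def train_bigram(tokens):
    """Estimate P(w_i | w_{i-1}) by maximum likelihood from a token list."""
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return {(prev, w): c / context_counts[prev]
            for (prev, w), c in bigram_counts.items()}

def surprisal(probs, prev, word):
    """Surprisal in bits: -log2 P(word | prev); infinite for unseen bigrams."""
    p = probs.get((prev, word), 0.0)
    return float("inf") if p == 0.0 else -math.log2(p)

# Tiny illustrative corpus (hypothetical data, for demonstration only).
corpus = "the dog saw the cat and the cat saw the dog".split()
probs = train_bigram(corpus)

# "dog" follows "the" in 2 of 4 occurrences, so P = 0.5 and surprisal = 1 bit.
print(surprisal(probs, "the", "dog"))
```

Under surprisal theory's linear linking hypothesis, a word with 2 bits of surprisal is predicted to take proportionally longer to process than a word with 1 bit.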
Many open questions remain, but the work presented here shows how much headway can be made toward understanding human language comprehension and production as inference and action under uncertainty with the goal of efficient communication.
All are invited to a reception for Dr. Levy following the talk.