Stefan Th. Gries
Last updated: 7 Sept 2009

Teaching at the University of California, Santa Barbara


This quarter


Ling 201: Research methodology and statistics in linguistics
(S2006, S2008, F2008, F2009)


This course is a gentle hands-on introduction to fundamental aspects of quantitative/statistical methodology. We begin by looking at a few basic notions such as variables and hypotheses. We then discuss the logic of quantitative studies in general as well as the design of factorial experiments in particular. We deal with how data from experiments and corpora should be set up for subsequent statistical evaluation. In terms of analysis and evaluation, we will explore a variety of descriptive graphs and statistics for frequency data, averages, dispersions, and correlations. The largest part of the course is concerned with hands-on practice on a variety of statistical tests: working in a computer lab and practicing different methods each session, we will work on distribution fitting tests, tests for independence, and tests for differences for frequencies, means, dispersions, and correlations. We will use corpus-linguistic and psycholinguistic example data, sometimes from published research. The class uses the open source software tool R, and it is based on the English version of a textbook written by me and to appear soon, which comes with sample data, exercises, answer keys, etc. Since the class requires no prior knowledge of statistics and only very little knowledge of mathematics, it is an ideal entry class for absolute beginners from degree programs, especially in the humanities and social sciences.


Ling 202: Advanced research methods and statistics in linguistics
(F2009)


This course is a hands-on introduction (in a computer lab) to more advanced statistical methods for analyzing observational and experimental data. After a short recap of monofactorial methods, we systematically extend monofactorial tests to their multifactorial and multivariate counterparts. We begin with the linear model and extend correlations and t-tests to multiple linear regression, ANOVAs, and ANCOVAs. We then broaden the scope to the powerful methods included in generalized linear modeling: binomial logistic regression (for binary data) and Poisson regression and hierarchical configural frequency analysis (for frequency data). In addition to modeling techniques, we will also explore a versatile exploratory method, hierarchical cluster analysis, to find structure in large, potentially messy data sets. Finally, all statistical techniques will be accompanied by many possibilities for graphical exploration. Time permitting, we will have a look at randomization methods. The class uses the open source software tool R, and it is based on the English version of a textbook written by me and to appear soon, which comes with sample data, exercises, answer keys, etc. Despite the short recap at the beginning, knowledge of monofactorial statistical methods is required, as is knowledge of R.


Undergraduate courses

Ling 110/210: Computational linguistics (W2007)

This course was a (highly selective) introduction to the discipline known as Computational Linguistics. It featured (i) a brief general introduction to some main areas of research within this field, (ii) an introduction to a programming language, R, with which we worked on linguistic data, and (iii) hands-on work in a computer lab on a variety of case studies from domains such as computational lexicography, word sense and synonym disambiguation, information retrieval, automatic text processing, orthographic similarities of words and spell-checking, and computational methods for authorship attribution. Given its practical orientation, the course was ideally suited for students who were thinking of practical applications and wanted to acquire some first programming experience. Reading assignments were largely parts of Manning and Schütze's (1999) Foundations of Statistical Natural Language Processing as well as Jurafsky and Martin's (2000) Speech and Language Processing, supplemented with a variety of introductory chapters and research articles.


Ling 120: Corpus linguistics (S2008)


This course was an introduction to computerized research methods, which are applied to large databases of language used in natural communicative settings to supplement more traditional ways of linguistic analysis in all linguistic sub-disciplines. In the first part of this particular class, we began with a theoretical introduction: what is a corpus, what kinds of corpora are there and how are they created/compiled, and why would one use corpora in the first place? In the second part, we familiarized ourselves with the open source software tool R. In the third part, we read a variety of published corpus-linguistic studies and replicated, modified, or extended them. The topics covered included syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies and first language acquisition), and others. (This course was partially based on my textbook Quantitative corpus linguistics with R: a practical introduction.)


Ling 113: Introduction to semantics (W2008)


This course was an introduction to the linguistic subdiscipline of semantics. After a very brief general introduction to the course and some main semantic concepts, we looked at word meaning and sentence meaning and different theoretical approaches to these two aspects of meaning. Then, we considered selected aspects of the acquisition of word meaning by children, had a look at different empirical methods of semantic analysis, and explored a few central notions of pragmatics (or utterance meaning).


Ling 127 / Psy 127: Psychology of language (F2006)


This course was an introduction to psycholinguistics concerned with various aspects of language comprehension, production, and acquisition. It was broadly based on Carroll's (2004) Psychology of Language, but also incorporated a variety of additional information/materials.


Ling 137/237: Introduction to first language acquisition
(F2007, F2008)


This course was a selective introduction to the interdisciplinary enterprise of research on first language acquisition. It covered several different though interrelated topics: an introduction to 'the problem of language acquisition', overviews of different theoretical and methodological approaches towards first language acquisition, and introductions to aspects and processes of first language acquisition in different linguistic subdisciplines: phonology/morphology, semantics/lexicon, syntax.


Ling 194: Group studies in linguistics (W2006)


This course was an introduction to corpus linguistics, involving simple computerized research methods applied to large databases of language used in natural communicative settings.


Graduate courses and seminars


Ling 110/210: Computational linguistics (W2007)


This course was a (highly selective) introduction to the discipline known as Computational Linguistics. It featured (i) a brief general introduction to some main areas of research within this field, (ii) an introduction to a programming language, R, with which we worked on linguistic data, and (iii) hands-on work in a computer lab on a variety of case studies from domains such as computational lexicography, word sense and synonym disambiguation, information retrieval, automatic text processing, orthographic similarities of words and spell-checking, and computational methods for authorship attribution. Given its practical orientation, the course was ideally suited for students who were thinking of practical applications and wanted to acquire some first programming experience. Reading assignments were largely parts of Manning and Schütze's (1999) Foundations of Statistical Natural Language Processing as well as Jurafsky and Martin's (2000) Speech and Language Processing, supplemented with a variety of introductory chapters and research articles.


Ling 218: Corpus linguistics (S2007)


This course was an introduction to advanced corpus-linguistic research methods, which are applied to large databases of language used in natural communicative settings to supplement more traditional ways of linguistic analysis in all linguistic sub-disciplines. It was broadly based on my textbook Quantitative corpus linguistics with R: a practical introduction, supplemented with a variety of research articles. The course had a bipartite structure. On the one hand, we read and discussed a variety of papers on different corpus-linguistic applications ranging from morphophonology, morphology, syntax, the syntax-lexis interface, and semantics to text linguistics / the study of literature. On the other hand, the course taught a programming language (i) to use two of the three main corpus-linguistic methods to retrieve linguistically relevant data, and (ii) to perform elementary statistical analyses of these data. To that end, we looked at several different corpora and corpus formats. Thus, the course aimed at enabling you (i) to understand and replicate corpus-linguistic work, (ii) to pursue your own corpus-linguistic studies on a wide variety of data, and (iii) to acquire basic skills in programming and regular expressions, which are extremely useful both within and outside academia. See also the CorpLing with R Google group, which I moderate and which will host the companion website of my book.


Ling 219: Corpus construction
(to be developed in more detail and then taught for the first time)


Design and construction of electronic corpora to represent spoken or written forms of language. Data collection from electronically available texts/transcripts, linguistic fieldwork, archives. Issues of sampling, balancedness, representativity, scale; formatting, markup, annotation, coding, tools; archival preservation, orthography, politics, ethics.


Ling 137/237: Introduction to first language acquisition (F2007, F2008)


This course was a selective introduction to the interdisciplinary enterprise of research on first language acquisition. It covered several different though interrelated topics: an introduction to 'the problem of language acquisition', overviews of different theoretical and methodological approaches towards first language acquisition, and introductions to aspects and processes of first language acquisition in different linguistic subdisciplines: phonology/morphology, semantics/lexicon, syntax.


Ling 252-A/B: Cognitive Linguistics (F2006/W2007)


This course dealt with the set of related approaches known as Cognitive Linguistics. It provided a brief general introduction to the assumptions underlying most of the field, followed by a variety of case studies focusing on central notions of, and areas of research within, Cognitive Linguistics; these include metaphor/metonymy, polysemy, Cognitive Grammar, (argument structure) constructions, etc.


Note: W=winter quarter, F=fall quarter, S=spring quarter.


What I think every (graduate) student of linguistics should read ...

  • AAUP 2007-08 Report on the Economic Status of the Profession.
  • Zwicky, Arnold M. 1986. On referring. Natural Language and Linguistic Theory 4.1:121-6.
  • Postal, Paul. 1988. Advances in Linguistic Rhetoric. Natural Language and Linguistic Theory 6.1:129-37.
  • Pullum, Geoffrey K. 1991. The Great Eskimo Vocabulary Hoax and Other Irreverent Essays on the Study of Language. With a Foreword by James D. McCawley. Chicago, IL: The University of Chicago Press.
  • Harris, Randy A. 1993. The Linguistics Wars. Oxford: Oxford University Press.
  • Sadock, Jerrold M. 1996. PIFL: The Principle of Information-Free Linguistics. Papers from the 32nd Regional Meeting of the Chicago Linguistic Society, pp. 133-48.
  • Postal, Paul. 2004. Skeptical Linguistic Essays. Oxford: Oxford University Press, Chapters 7 to 14.