Ling 201: Research methodology and statistics in linguistics (S2006, S2008, F2008-2012, S2014, F2014, F2015, F2016)


This course was a hands-on introduction to fundamental aspects of quantitative/statistical methodology. We began by looking at a few basic notions such as variables and hypotheses. We then discussed the logic of quantitative studies in general as well as the design of factorial experiments in particular. We dealt with how data from experiments and corpora should be set up for subsequent statistical evaluation. In terms of analysis and evaluation, we explored a variety of descriptive graphs and statistics for frequency data, averages, dispersions, distributions, and correlations. The largest part of the course was concerned with hands-on practice on a variety of statistical tests: working in a computer lab and practicing different methods each session, we worked on distribution fitting tests, tests for independence, and tests for differences for frequencies, means, dispersions, distributions, and correlations. We used corpus and psycholinguistic example data, sometimes from published research. This course used the open-source programming language R and was based on the second edition of my book Statistics for Linguistics with R, which comes with sample data, exercises, answer keys, etc. Since the class required no prior knowledge of statistics and only very little knowledge of mathematics, it was an entry class for absolute beginners from degree programs esp. in the humanities and social sciences. See also the StatForLing with R Google group, which I moderate and which leads to the companion website of my book.

Ling 204: Statistical methodology (W2014, W2016)


This was a more advanced course on statistical modeling with an emphasis on various kinds of regression modeling; it presupposed a good understanding of the second edition (2013) of my Statistics for Linguistics with R: [...]. We began with a first recap of linear and generalized linear regression modeling. We then discussed the use of contrasts and general linear hypothesis tests for linear and generalized linear regression models, followed by some ideas on how to explore curvature in data (regressions with breakpoints, polynomial regressions, and generalized additive models). This was followed by a larger chunk on linear and generalized linear mixed-effects (or multilevel) modeling, where we reanalyzed published data and discussed numerical and visual exploration of regression results. The last two sessions were then devoted to classification and regression trees as well as influential data points and validation approaches. We used the open-source software tool R.

Ling 210/110: Computational linguistics (W2007, S2010)


This course was a (highly selective) introduction to a discipline known as Computational Linguistics. It featured (i) a brief general introduction to some main areas of research within this field, (ii) an introduction to the programming language R based on my book Quantitative Corpus Linguistics with R: […], with which we worked on linguistic data, and (iii) hands-on work in a computer lab on a variety of case studies from domains such as computational lexicography, word sense and synonym disambiguation, information retrieval, automatic text processing, orthographic similarities of words and spellchecking, and computational methods for authorship attribution. Given its practical orientation, the course was ideally suited for students who were thinking of practical applications and wanted to acquire some first computational programming experience (prior experience with R was not necessary, but a larger-than-average computer savviness was recommended). Reading assignments included parts of Manning and Schütze's (2000) Foundations of Statistical Natural Language Processing as well as Jurafsky and Martin's (2000) Speech and Language Processing, supplemented with a variety of introductory chapters and research articles.

Ling 218: Corpus linguistics (S2007, F2012)


This course was a hands-on introduction to advanced corpus-linguistic research methods, which are applied to large databases of language used in natural communicative settings to supplement more traditional ways of linguistic analysis in all linguistic subdisciplines. It was broadly based on my (2009) textbook Quantitative corpus linguistics with R: a practical introduction and McEnery & Hardie's (2012) Corpus Linguistics, supplemented with a variety of research articles. The course had a bipartite structure. On the one hand, we read and discussed a variety of papers on different corpus-linguistic applications ranging from morphophonology, morphology, syntax, and the syntax-lexis interface to semantics. On the other hand, the course taught how to use the programming language R (i) to employ two of the three main corpus-linguistic methods to retrieve linguistically relevant data, and (ii) to perform elementary statistical analyses of these data. To that end, we looked at several different corpora and corpus formats. Thus, the course aimed at enabling participants (i) to understand and replicate corpus-linguistic work, (ii) to pursue their own corpus-linguistic studies on a wide variety of data, and (iii) to acquire basic skills in programming and regular expressions, which are extremely useful both within and outside academia. See also the CorpLing with R Google group, which I moderate and which leads to the companion website of my book.

Ling 219: Corpus construction (maybe to be developed)


Design and construction of electronic corpora to represent spoken or written forms of language. Data collection from electronically available texts/transcripts, linguistic fieldwork, archives. Issues of sampling, balancedness, representativity, scale; formatting, markup, annotation, coding, tools; archival preservation, orthography, politics, ethics.

Ling 225: Semantics (S2011, W2015)


In this course, we explored a small range of topics in semantics. Topics we dealt with were structuralist approaches involving necessary and sufficient conditions, the Natural Semantic Metalanguage approach, lexical relations, cognitive semantics (esp. with regard to polysemy and prototypes), computational/distributional semantics, and the acquisition of meaning.

Ling 237/137: Introduction to first language acquisition (F2007, F2008, W2010)


This course was a selective introduction to the interdisciplinary enterprise of research on first language acquisition. It covered several different though interrelated topics: an introduction to 'the problem of language acquisition', overviews of different theoretical and methodological approaches towards first language acquisition, and introductions to aspects and processes of first language acquisition in different linguistic subdisciplines: phonology/morphology, semantics/lexicon, syntax.

Ling 252A/B: Cognitive Linguistics (F2006/W2007)


In the first quarter of this two-quarter seminar, we explored the set of related approaches known as Cognitive Linguistics. The course provided a brief general introduction to the assumptions governing or underlying most of the field, followed by a variety of case studies focusing on central notions of, and areas of research within, Cognitive Linguistics; these notions and areas of research include metaphor/metonymy, polysemy, Cognitive Grammar, (argument structure) constructions, etc.

Ling 257A/B: Psycholinguistics (F2010/W2011)


In the first quarter of this two-quarter seminar, we explored topics in psycholinguistics from (i) the theoretical perspective of exemplar/usage-based cognitive/functional linguistics and (ii) the methodological perspective of experimental and observational data and analysis. We read and discussed a variety of papers on topics in language acquisition, language production, 'distributional linguistics', and, depending on participants' choices, language change and sociolinguistics.
