Stefan Th. Gries
Contact information
Last updated: 18 May 2018

Teaching at the University of California, Santa Barbara

Ling 120: Corpus linguistics (S2018)

Syllabus and overview

This course is an introduction to computerized research methods that are applied to large databases of language used in natural communicative settings, supplementing more traditional ways of linguistic analysis in all linguistic sub-disciplines. In the first part of this class, we will begin with a theoretical introduction: what is a corpus / what are corpora, what kinds of corpora are there and how are they created/compiled, and why would one use corpora in the first place? In the second part, we will familiarize ourselves with the open-source programming language and environment R. In the third part, we will read a variety of published corpus-linguistic studies and replicate, modify, or extend them. The topics to be covered include syntax (patterns and alternations), lexis/semantics (key words in different cultures and near synonymy), psycholinguistics (disfluencies and first language acquisition), and others. This course is based on the second edition of my textbook Quantitative corpus linguistics with R: a practical introduction. New York: Routledge, Taylor & Francis Group.

Note, very important: We will be using a programming language, which means that the course absolutely requires computer literacy beyond swiping, pinching, long-tapping, and uploading/sending something to/via Facebook, Instagram, Pinterest, Snapchat, or whatever. If you cannot install software, or if you can install software but then don't know 'where the program is', and/or if you download a file onto your own personal computer but then ask me where it went, and/or if you do not know what unzipping a file means, you will not be happy in this course!

Downloads: slides, worksheets, code, data

Files for session 01
Files for sessions 02-04
Files for session 05
Files for session 06
Files for session 08
Files for session 10

Corpus data

Course-final assignments

General software

Microsoft R Open
R from CRAN
LibreOffice 6.0.3