What ...
This bootcamp/workshop is a hands-on introduction to quantitative corpus linguistics for both graduate students and seasoned researchers. Using R, an open-source software tool and programming language, participants will learn:
- how to generate frequency lists;
- how to search for words and patterns;
- how to handle corpora and perform corpus-linguistic searches that typical corpus software does not support;
- how to carry out basic statistical evaluations of corpus data (significance tests and statistical graphs).
R-based software tools will be made available that make it easy to perform many of the above operations and to compile topic-specific web-based corpora. The data covered include plain text corpora, corpora with SGML or XML annotation, chat files from CHILDES, and Unicode files. Case studies draw on examples from morphology, syntax, and first and second language acquisition, among other areas. The statistical tests introduced include tests for frequencies, means, and distributions.