Overview 

This is the central collostructional analysis website. It is based on a one-day hands-on workshop taught by myself and Anatol Stefanowitsch in 2005 and shows how to perform collostructional analysis with R, the open-source software tool and programming language. For questions regarding the use of the R program made available here, please either contact me or, better still, buy my books Quantitative Corpus Linguistics with R: […] and Statistics for Linguistics with R: […] ;), join their Google groups, and post your question(s) there.
Background information


Stefanowitsch & Gries (2003 and 2005), Gries & Stefanowitsch (2004a and 2004b), Gries, Hampe, & Schönefeld (2005 and 2010), Gries (2015a, 2015b)

General links to software


R, LibreOffice

My script Coll.analysis 3.2a


Coll.analysis 3.2a; readme.txt for Coll.analysis 3.5.1

Note: this is the legacy version of this script. A new version (coll.analysis_mpfr.r), which is much less likely to produce Inf results for large corpora/frequencies, will again be available from me upon request once I have implemented a few small changes.

Collexeme analysis


input files: 1.csv
output files: 1_out_mpfr.txt
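As a rough illustration of what a collexeme analysis computes for a single word, the sketch below builds the usual 2x2 table from four frequencies and derives a collostruction strength as the negative base-10 logarithm of a one-tailed Fisher-Yates exact p-value. It is written in Python for illustration only and is not the Coll.analysis script; the function names, the signed-strength convention, and the frequencies in the usage example are all assumptions made up for this sketch. The probabilities are summed in log space, which is one way to avoid the Inf results that arise when a p-value underflows to 0.

```python
import math


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_hypergeom_pmf(k, corpus_size, word_freq, cx_freq):
    """log P(X = k), X = word/construction co-occurrences under independence."""
    return (log_choose(word_freq, k)
            + log_choose(corpus_size - word_freq, cx_freq - k)
            - log_choose(corpus_size, cx_freq))


def collostruction_strength(obs, word_freq, cx_freq, corpus_size):
    """Signed -log10 of the one-tailed Fisher-Yates exact p-value.

    Positive values mean attraction (observed >= expected), negative
    values mean repulsion. The sign convention is an assumption of
    this sketch.
    """
    expected = word_freq * cx_freq / corpus_size
    attracted = obs >= expected
    lowest = max(0, word_freq + cx_freq - corpus_size)
    highest = min(word_freq, cx_freq)
    # Sum the tail that is at least as extreme as the observed frequency.
    ks = range(obs, highest + 1) if attracted else range(lowest, obs + 1)
    logs = [log_hypergeom_pmf(k, corpus_size, word_freq, cx_freq) for k in ks]
    # log-sum-exp keeps the sum representable even for tiny probabilities.
    peak = max(logs)
    log_p = peak + math.log(sum(math.exp(x - peak) for x in logs))
    strength = -log_p / math.log(10)
    return strength if attracted else -strength


if __name__ == "__main__":
    # Made-up frequencies: a word occurring 1,000 times in a corpus of
    # 10,000,000 slots, 500 of them in a construction of frequency 1,000.
    print(collostruction_strength(500, 1000, 1000, 10_000_000))
```

With frequencies this skewed, the plain p-value is far smaller than the smallest positive double-precision number, so taking its logarithm directly would give an infinite strength, whereas the log-space sum stays finite.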

(Multiple) distinctive collexeme analysis


input files: 2a.csv, 2b.csv, 2c.csv
output files: 2a_out_mpfr.txt, 2b_out_mpfr.txt, 2c_out_mpfr.txt
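For the two-construction case, a distinctive collexeme analysis compares how often a word occurs in construction A versus construction B. The sketch below, again in Python and not taken from the Coll.analysis script, uses exact integer arithmetic for the one-tailed Fisher-Yates test; the function name, its arguments, and the returned "A"/"B" label are hypothetical choices for illustration. Exact integers are convenient for small and medium frequencies but become slow for very large ones.

```python
import math


def distinctive_collexeme(freq_a, freq_b, total_a, total_b):
    """Return (preferred construction, -log10 of one-tailed exact p).

    freq_a, freq_b: frequency of the word in constructions A and B.
    total_a, total_b: overall frequencies of constructions A and B.
    """
    total = total_a + total_b
    word_total = freq_a + freq_b
    expected_a = word_total * total_a / total
    prefers_a = freq_a >= expected_a
    lowest = max(0, word_total - total_b)
    highest = min(word_total, total_a)
    # Tail of the hypergeometric distribution in the observed direction.
    ks = range(freq_a, highest + 1) if prefers_a else range(lowest, freq_a + 1)
    numerator = sum(
        math.comb(word_total, k) * math.comb(total - word_total, total_a - k)
        for k in ks
    )
    denominator = math.comb(total, total_a)
    # Taking logs of the exact integers avoids forming a p-value that
    # could underflow to 0 as a float.
    strength = math.log10(denominator) - math.log10(numerator)
    return ("A" if prefers_a else "B"), strength


if __name__ == "__main__":
    # Made-up frequencies for illustration only.
    print(distinctive_collexeme(4, 1, 5, 5))
```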

Covarying collexeme analysis


input files: 3.csv
output files: 3_out_1_mpfr.txt and 3_out_2_mpfr.txt

Data to play with


dat_AFRAIDs_1.txt and the optimal output file: dat_AFRAIDs_3.txt
dat_HORRIBLEs_1.txt and the optimal output file: dat_HORRIBLEs_3.txt
