Stefan Th. Gries
Last updated: 17 August 2008

Dispersion / adjusted frequency resources


R scripts to compute measures of dispersion and adjusted frequencies

R scripts

Script 1: dispersions1
Script 2: dispersions2
To define the function from the first script, enter the following line in R (for the second script, change the "1" to "2"):

source("http://www.linguistics.ucsb.edu/faculty/stgries/research/dispersion/_dispersions1.r")
Then you can call the function as described in the article.
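
This page does not name the function that each script defines. As a minimal sketch, assuming you would rather not load unknown objects straight into your workspace, you can source a script into a separate environment and list the functions it creates. The helper name list_sourced_functions is hypothetical and is not part of the scripts; the demonstration uses a throwaway file rather than the real script:

```r
# Hypothetical helper (not part of the author's scripts): source an R script
# into a fresh environment and return the names of the functions it defines.
list_sourced_functions <- function(file) {
  env <- new.env()
  source(file, local = env)   # evaluate the script inside env only
  objs <- ls(env)
  is_fun <- vapply(objs, function(n) is.function(get(n, envir = env)), logical(1))
  objs[is_fun]
}

# Self-contained demonstration with a throwaway script file
demo <- tempfile(fileext = ".r")
writeLines(c("my_measure <- function(x) x / sum(x)", "some_value <- 42"), demo)
demo_functions <- list_sourced_functions(demo)
print(demo_functions)
```

For the first script, the same helper could be called with the URL given above instead of a local file, provided that URL is still reachable.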


Reference files to download

If you use any of these lists, please cite the following paper as a reference: Gries, Stefan Th. to appear. Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics.


The British National Corpus Sampler


Here you can download files containing all words that occur at least 10 times in the BNC Sampler, together with their dispersion measures and adjusted frequencies; see bncsampler_readme.txt for details:
01n_bncsampler_output.zip (a zipped .txt file)
01n_bncsampler_output.RData (.RData file)
01n_bncsampler_output.ods (.ods file)
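
The readme describes the contents of these files; this page does not say which objects the .RData file holds. A minimal sketch, assuming the file has already been downloaded, is to load it into a separate environment and list what it contains. The helper name inspect_rdata is hypothetical, and the demonstration uses a small made-up data frame rather than the real data:

```r
# Hypothetical helper: load an .RData file into its own environment and
# report the name and class of each object stored in it.
inspect_rdata <- function(path) {
  env <- new.env()
  loaded <- load(path, envir = env)   # load() returns the names of the restored objects
  for (nm in loaded) {
    cat(nm, ":", class(get(nm, envir = env))[1], "\n")
  }
  invisible(mget(loaded, envir = env))  # named list of the loaded objects
}

# Self-contained demonstration with a made-up data frame (not the real data)
demo_file <- tempfile(fileext = ".RData")
toy <- data.frame(word = c("a", "b"), freq = c(12, 30))
save(toy, file = demo_file)
contents <- inspect_rdata(demo_file)
```

For the Sampler data, the call would be inspect_rdata("01n_bncsampler_output.RData"), with the file in the working directory. The same approach applies to the .RData files for the other corpora listed below.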


The British National Corpus Baby


Here you can download files containing all words that occur at least 10 times in the BNC Baby, together with their dispersion measures and adjusted frequencies; see bncbaby_readme.txt for details:
01n_bncbaby_output.zip (a zipped .txt file)
01n_bncbaby_output.RData (.RData file)
01n_bncbaby_output.ods (.ods file)


The spoken part of the British National Corpus World Edition (XML)


Here you can download files containing all words that occur at least 10 times in the spoken part of the BNC WE, together with their dispersion measures and adjusted frequencies; see bncxml-spoken_readme.txt for details:
01n_bncxmls_output.zip (a zipped .txt file)
01n_bncxmls_output.RData (.RData file)
01n_bncxmls_output.ods (.ods file)


The British Component of the International Corpus of English


Here you can download files containing all words that occur at least 10 times in the ICE-GB, together with their dispersion measures and adjusted frequencies; see icegb_readme.txt for details:
01n_icegb_output.zip (a zipped .txt file)
01n_icegb_output.RData (.RData file)
01n_icegb_output.ods (.ods file)


Graphs to download

If you use any of these graphs, please cite the following paper as a reference: Gries, Stefan Th. to appear. Dispersions and adjusted frequencies in corpora: further explorations.


The spoken part of the British National Corpus World Edition (XML)


scatterplot comparing different measures of dispersion (a .png file, dimensions: 1600x1200)
scatterplot comparing different adjusted frequencies (a .png file, dimensions: 1600x1200)