Stefan Th. Gries

Dispersion / adjusted frequency resources

Last updated: 29 May 2018


R script to compute measures of dispersion and adjusted frequencies

R scripts


dispersions.r
You can download this script, which improves on the earlier scripts that accompanied my 2008e and 2010c papers on dispersion. The script (i) creates the data set discussed in my article "Analyzing dispersion" in the Practical Handbook of Corpus Linguistics (lines 3-5), (ii) defines two functions, dispersions1 (lines 9-139) and dispersions2 (lines 141-214), and (iii) exemplifies their use (lines 217-234 and 238-254, respectively).
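As a rough illustration of what a dispersion measure computes, here is a minimal base-R sketch of one such measure, DP (deviation of proportions), together with its normalized variant. This is not code from dispersions.r: the function name dp_sketch and the toy frequencies are hypothetical, and nothing here reflects the actual interfaces of dispersions1 or dispersions2.

```r
# Hypothetical helper, NOT part of dispersions.r.
# v:          frequencies of one word in each corpus part
# part_sizes: sizes (in words) of those corpus parts
dp_sketch <- function(v, part_sizes) {
  stopifnot(length(v) == length(part_sizes), sum(v) > 0)
  observed <- v / sum(v)                     # share of the word's occurrences per part
  expected <- part_sizes / sum(part_sizes)   # share of the whole corpus per part
  dp <- 0.5 * sum(abs(observed - expected))  # 0 = evenly spread; larger = more clumped
  c(DP = dp, DPnorm = dp / (1 - min(expected)))
}

# Toy data (made up): a word occurring 3, 0, and 1 times
# in corpus parts of 10, 10, and 20 words
toy <- dp_sketch(v = c(3, 0, 1), part_sizes = c(10, 10, 20))
print(round(toy, 3))
```

The sketch compares, for each corpus part, the share of the word's occurrences with the share of the corpus that the part makes up; for the full set of measures and adjusted frequencies, use the downloadable script.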


Downloadable resources


Each of these files is an RDS file that you can import into R either with the function readRDS or by clicking on it in RStudio. Each file contains a data frame with one row per word (the words being the row names) and 24 columns (one for each dispersion measure or adjusted frequency). If you use any of these files, please cite the following paper: Gries, Stefan Th. 2019. Analyzing dispersion. In Magali Paquot & Stefan Th. Gries (eds.), Practical Handbook of Corpus Linguistics. Berlin & New York: Springer.
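The import step might look like the following sketch. So that it runs without a download, it first writes a small stand-in file. The column names measure_a and measure_b and all values are made-up placeholders, not the names or contents of the real files, which have 24 columns.

```r
# Stand-in for one downloaded file: a tiny data frame with the words as row names.
# Column names and values are placeholders, not those of the real files.
mock <- data.frame(
  measure_a = c(0.12, 0.85, 0.40),
  measure_b = c(1500, 3, 42),
  row.names = c("the", "zygote", "corpus")
)
rds_path <- tempfile(fileext = ".rds")  # replace with the path of the file you downloaded
saveRDS(mock, rds_path)

disp <- readRDS(rds_path)  # returns the stored data frame
dim(disp)                  # number of words (rows) and of measures (columns)
head(rownames(disp))       # the words themselves
disp["corpus", ]           # all values for one word, looked up by its row name
```

Because the words are stored as row names rather than in a column, a single word's values can be retrieved by indexing the rows with the word itself.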


Dispersion measures & adjusted frequencies of all words in the BNC Baby as an .RDS file (based on 182 files)


Dispersion measures & adjusted frequencies of all words in the BNC Sampler as an .RDS file (based on 184 files)


Dispersion measures & adjusted frequencies of all words in the BNC World Edition as an .RDS file (based on 4049 files)


Dispersion measures & adjusted frequencies of all words in the spoken data of the BNC World Edition as an .RDS file (based on 908 files)


Dispersion measures & adjusted frequencies of all words in the Brown corpus as an .RDS file (based on 500 files)


Dispersion measures & adjusted frequencies of all words in the ICE-GB as an .RDS file (based on 500 files)