Stefan Th. Gries
Contact information
Last updated: 03 June 2021

Collostructional analysis: the script, help files, and examples


This is the central collostructional analysis website (based on a one-day hands-on workshop taught by Anatol Stefanowitsch and me in 2005) on how to perform collostructional analysis with R, the open-source software tool and programming language. For questions about the R script made available here, please either contact me or, better still, buy my books Quantitative Corpus Linguistics with R: […] and Statistics for Linguistics with R: […] ;-), join their Google groups, and post your question(s) there.

Background information

Stefanowitsch & Gries (2003 and 2005), Gries & Stefanowitsch (2004a and 2004b), Gries, Hampe, & Schönefeld (2005 and 2010), Gries (2015a, 2015b)

General links to software

R, LibreOffice

My script Coll.analysis 3.2a

Coll.analysis 3.2a
readme.txt for Coll.analysis 3.5.1
Note: this is the legacy version of the script. A new version (coll.analysis_mpfr.r), which is much less likely to produce Inf results for large corpora/frequencies, will again be available from me upon request once I have implemented a few small changes.
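Why Inf results can arise with large frequencies: collostruction strength is often reported as the negative base-10 logarithm of an exact-test p-value. With very large counts that p-value can underflow to exactly 0 in double-precision arithmetic, and -log10(0) evaluates to Inf in R. The minimal sketch below (in Python rather than R, purely for illustration) reproduces the effect with a one-tailed hypergeometric test and shows one possible workaround, summing the tail in log space. It is not the code of Coll.analysis or of coll.analysis_mpfr.r. The function names and the frequencies are hypothetical, and the log-space approach is an assumption chosen for this example, not a description of how the new script avoids the problem.

```python
import math


def log_choose(a, b):
    """Natural log of the binomial coefficient C(a, b), via log-gamma."""
    return math.lgamma(a + 1) - math.lgamma(b + 1) - math.lgamma(a - b + 1)


def log_hypergeom_pmf(k, word_freq, cx_freq, corpus_size):
    """log P(X = k): k co-occurrences of a word (word_freq tokens) with a
    construction (cx_freq tokens) in a corpus of corpus_size items."""
    return (log_choose(word_freq, k)
            + log_choose(corpus_size - word_freq, cx_freq - k)
            - log_choose(corpus_size, cx_freq))


def naive_strength(k, word_freq, cx_freq, corpus_size):
    """-log10 of the upper-tail p-value, summed as ordinary doubles.
    If every term underflows, p is exactly 0.0; R would then return Inf
    for -log10(0), which is mimicked here with math.inf."""
    upper = min(word_freq, cx_freq)
    p = sum(math.exp(log_hypergeom_pmf(i, word_freq, cx_freq, corpus_size))
            for i in range(k, upper + 1))
    return math.inf if p == 0.0 else -math.log10(p)


def logspace_strength(k, word_freq, cx_freq, corpus_size):
    """Same quantity, but the tail is summed in log space (log-sum-exp),
    so the p-value itself is never represented as a double."""
    upper = min(word_freq, cx_freq)
    logs = [log_hypergeom_pmf(i, word_freq, cx_freq, corpus_size)
            for i in range(k, upper + 1)]
    largest = max(logs)
    log_p = largest + math.log(sum(math.exp(x - largest) for x in logs))
    return -log_p / math.log(10)


# Hypothetical frequencies: a strongly attracted word in a large corpus.
print(naive_strength(4000, 5000, 5000, 10_000_000))     # infinite
print(logspace_strength(4000, 5000, 5000, 10_000_000))  # large but finite
```

For small frequencies the two functions agree, so the difference only matters once the p-value drops below the smallest positive double (roughly 1e-308 for normalized values).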

Collexeme analysis

input files: 1.csv
output files: 1_out_mpfr.txt

(Multiple) distinctive collexeme analysis

input files: 2a.csv, 2b.csv, 2c.csv
output files: 2a_out_mpfr.txt, 2b_out_mpfr.txt, 2c_out_mpfr.txt

Covarying collexeme analysis

input files: 3.csv
output files: 3_out_1_mpfr.txt and 3_out_2_mpfr.txt

Data to play with

dat_AFRAIDs_1.txt and the optimal output file: dat_AFRAIDs_3.txt
dat_HORRIBLEs_1.txt and the optimal output file: dat_HORRIBLEs_3.txt