USING CORPORA TO AID QUALITATIVE TEXT ANALYSIS
AN INTERDISCIPLINARY APPROACH
Aim. The aim of this paper is to present and exemplify a number of basic uses of corpus-based text analysis tools that can supplement an otherwise qualitative analysis of a text and provide additional insight into it. I attempt to show that certain corpus tools are now easily accessible to any researcher and can be used to enrich the results of studies concerned with texts.
Methods. This paper covers the basics of corpus building, the main types of data that can be drawn from a simple corpus, and four methods that can aid text analysis: wordlists, concordances, dispersion plots and keywords. Each of the four methods is described in detail, with a number of examples of its applications and a note on its possible limitations.
Results. The examples provided suggest that even a very simple corpus analysis of a text can reveal trends and phenomena that are not noticeable through classic qualitative methods of text analysis (e.g. close reading). The paper argues that corpus research can therefore work as an extension of a qualitative analysis (or be its starting point) by examining the themes and keywords present in a given text, and can enrich the results of a qualitative study with a fresh perspective. Finally, the paper claims that basic corpus analysis can, in fact, be successfully employed by researchers who do not have any prior experience with statistics or corpora.
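As a rough illustration of the four methods named above (wordlists, concordances, dispersion plots and keywords), the following minimal Python sketch computes a simplified version of each on a plain list of tokens. It is not the procedure used in the paper, which relies on dedicated corpus software; all function names here are hypothetical, the tokenizer is deliberately naive, and the log-likelihood statistic is only one common choice for measuring keyness, assumed here for the sake of the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def wordlist(tokens):
    """Wordlist: every word type with its frequency, most frequent first."""
    return Counter(tokens).most_common()


def concordance(tokens, node, width=3):
    """Concordance: each occurrence of `node` with `width` words of context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def dispersion(tokens, node):
    """Dispersion: relative positions (0 to 1) of `node` within the text.

    These are the values a dispersion plot would draw as marks on a bar.
    """
    n = len(tokens)
    return [i / n for i, tok in enumerate(tokens) if tok == node]


def keyness(word, study, reference):
    """Keyness of `word` in a study text against a reference corpus.

    Assumption: keyness is measured with the log-likelihood statistic;
    other statistics (e.g. chi-square) are also in common use.
    """
    a, b = study.count(word), reference.count(word)
    c, d = len(study), len(reference)
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    ll = 0.0
    if a:
        ll += a * math.log(a / expected_study)
    if b:
        ll += b * math.log(b / expected_ref)
    return 2 * ll


if __name__ == "__main__":
    # Invented sample sentences, used only to exercise the functions.
    study = tokenize("The cat sat. The dog sat on the cat.")
    reference = tokenize("A bird flew over the house and the garden wall.")
    print(wordlist(study)[:3])
    print(concordance(study, "dog", width=2))
    print(dispersion(study, "cat"))
    print(keyness("cat", study, reference))
```

A word with the same relative frequency in both texts receives a keyness score of zero under this statistic; the larger the score, the more the word's frequency in the study text departs from what the reference corpus would predict.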
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. During the submission process, all authors agree to the publication of their email addresses, affiliations and short bio statements with their articles.