201) suggestion that TV series hold ‘predictable elements’ that appear in an expectable order, as well as in Baños’s (2014: 81) allusion to the ‘specificities of the audiovisual medium’. Arias-Badia and Brumme (2014) provide a first approach to recurrent linguistic patterns in TV police procedurals. As shown in Chapter 4, the fact that fictional characters play with genre conventions is evidence that the language of these series is conventionalised and grounded in a long genre tradition.
Linguistic research has long been concerned with the identification and description of systematic, recurring patterns of language use, based on the idea that much language use is routine (Stubbs 1993; Hanks 1996). The recurrence of these patterns matches Coseriu’s (1952) description of norm as the objectively established principles followed by speakers of a language, rather than rules imposed by subjective assessment criteria.
Thus, the term norm is understood as descriptive in this framework. As Curzan (2014: 18) points out, ‘[d]escriptive “rules” describe regularities in a language variety’s structure that are developed through analysis of what speakers do; they are sometimes invariant but not always’. Interestingly for the purposes of this study, both norms and genre were variables already under consideration in the classical approach to communicative events referred to by the acronym SPEAKING (situation, participants, ends, acts, key, instrumentalities, norms, and genre), proposed by Hymes (1989/1972).
Particularly since the 1960s, when the Brown Corpus was compiled, corpora have been a major tool employed by linguists for the exploration of patterns or norms. As defined by Sinclair (1991: 171), corpora are ‘collection[s] of naturally occurring language text, chosen to characterise a state or variety of language’. Laviosa (2003: 105) elaborates on previous definitions by adding that, in corpora, the texts are ‘stored on a computer, sometimes analysed automatically or semi-automatically’. The digitalisation of texts makes it easier for researchers to share their material and gain access to larger amounts of data, thus fostering the quantitative significance of the results: the more occurrences of a norm are observed, the more likely that norm is to be representative of the language explored.
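The frequency counts that digitised corpora make possible can be illustrated with a minimal sketch. The toy corpus, the file names, and the helper function below are all hypothetical, introduced purely for illustration; real corpus work would load texts from disk and use dedicated tooling.

```python
import re
from collections import Counter

# Hypothetical toy "corpus": in practice these would be text files on disk.
corpus = {
    "episode_01": "The detective examined the evidence. He examined it twice.",
    "episode_02": "She examined the report before the briefing.",
}

def count_pattern(texts, pattern):
    """Count occurrences of a regex pattern in each text of the corpus."""
    return Counter({name: len(re.findall(pattern, text, flags=re.IGNORECASE))
                    for name, text in texts.items()})

counts = count_pattern(corpus, r"\bexamined\b")
total = sum(counts.values())  # overall frequency across the corpus
```

The larger and more varied the corpus, the more confidently such a total can be read as evidence of a norm rather than an idiosyncrasy of one text.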
Traditionally, working in Corpus Linguistics (CL) has meant resorting to already existing large corpora of both written and spoken language. However, it may be argued that CL has evolved into a methodological tool or research methodology for many areas of knowledge within Linguistics, and even for other disciplines (Olohan 2004; Walsh et al. 2011; Laviosa 2013). In this paradigm, researchers either use the large corpora available or compile ad hoc–designed, often smaller, corpora that better fit their specific research purposes and interests. In this respect, the selection of texts in accordance with research criteria is an essential feature that distinguishes corpora from ‘more random collections of texts held in text archives’ (Barnbrook 1996: 23).
Within the field of Linguistics, a paradigm shift in the perception of corpora took place in the 2000s, when some researchers started to argue for the need to undertake corpus-driven studies to the detriment of corpus-based studies, which had been the norm until then. In this spirit, Tognini-Bonelli (2001: 84–85) claims that:
Corpus-driven linguistics rejects the characterisation of corpus linguistics as a method and claims instead that the corpus itself should be the sole source of our hypotheses about language. It is thus claimed that the corpus itself embodies a theory of language.
Corpas-Pastor (2008: 53, my translation) offers a table that succinctly compares the main differences between the two approaches (Table 1).
Corpus-based approaches | Corpus-driven approaches
A priori theoretical assumptions | A posteriori theoretical assumptions
Deductive method | Inductive method
Intuition and introspection | Statistical methods
Lemmatised, annotated corpora | Non-codified, large and representative corpora
Grammar patterns | Lexicogrammatical and phraseological patterns
Corpus-driven studies give more weight to the data, to the text at hand, than corpus-based studies do. The latter use corpora to answer a preconceived research question (e.g. ‘How often does nominalisation occur in this text?’), while in corpus-driven studies emphasis is placed on the text rather than on the research questions; that is, it is the text that leads the researcher to specific questions. If results are corpus-driven, it means that researchers have first approached the text trying to detach themselves from any assumptions derived from previous specialised literature. Given this lack of prior assumptions, corpus-driven studies have been compared with a ‘tabula rasa’ (Corpas-Pastor 2008: 52).
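The kind of preconceived corpus-based query quoted above can be sketched in a few lines. The suffix list below is a crude heuristic adopted purely for illustration, not an established linguistic test for nominalisation; it will both miss cases and over-generate.

```python
import re

# Assumption for illustration only: treat words ending in common nominalising
# suffixes as candidate nominalisations. This is a rough heuristic.
SUFFIX_PATTERN = r"\b\w+(?:tion|sion|ment|ness|ity)s?\b"

def nominalisation_count(text):
    """Return the number of candidate nominalisations and the matches found."""
    matches = re.findall(SUFFIX_PATTERN, text.lower())
    return len(matches), matches

n, found = nominalisation_count(
    "The investigation led to the identification of the suspect's movement."
)
```

The point of contrast is methodological: here the question (and the operational definition behind it) exists before the corpus is consulted, whereas a corpus-driven analyst would let the texts themselves suggest which patterns merit attention.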
Hanks’s (2004, 2013a) ongoing lexicographic project, the Pattern Dictionary of English Verbs (PDEV), adopts a corpus-driven approach to the lexicon to tease out linguistic norms; its entries correspond to the different semantic patterns of each word, manually annotated from corpora. Hanks has also proposed the Theory of Norms and Exploitations (TNE) together with a specific working methodology in CL, namely Corpus Pattern Analysis (CPA), which have served as a framework for the qualitative study of the lexicon in this book (Chapter 8). As specified in Jezek and Hanks (2010: 8), this type of lexicographic task consists of ‘us[ing] corpus evidence to tease out the different patterns of use associated with each word in a language and to discover the relationship between meaning and patterns of usage’. Thus, the understanding of norms in TNE intersects with the use of norm in quantitative linguistic research to mean the prototypical, average use of the language, from which creative realisations may deviate to a greater or lesser extent (Muller 1973, 1992).
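The shape of a CPA-style annotation can be suggested with a small data structure. The class, field names, and example below are a hypothetical simplification for this discussion; the actual PDEV uses its own notation and ontology of semantic types, and this sketch should not be read as its format.

```python
from dataclasses import dataclass, field

@dataclass
class VerbPattern:
    """Hypothetical, simplified stand-in for a CPA-style pattern entry."""
    verb: str
    pattern: str          # schematic pattern with semantic types
    implicature: str      # gloss of what the pattern conveys
    corpus_lines: list = field(default_factory=list)  # annotated examples

# Illustrative entry (invented, not taken from the PDEV):
p = VerbPattern(
    verb="examine",
    pattern="[[Human]] examine [[Physical Object]]",
    implicature="[[Human]] inspects [[Physical Object]] closely",
)
p.corpus_lines.append("The detective examined the evidence.")
```

What matters for the argument is that each pattern is tied to corpus lines assigned to it by a human annotator, which is where the subjectivity discussed next enters the process.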
Corpas-Pastor (2008: 218, my translation) explains that undertaking corpus-driven research does not mean undertaking objective research: ‘Former paradigms, reference frameworks, ideological positions, cultural traditions and key concepts, among others, are bound to influence the researcher’s perspective’. Manual annotation, as in the case of the PDEV, necessarily involves the annotator’s subjectivity (Church and Hanks 1990; Renau 2012). The point is, however, that