560029011 03 01 01 JB John Benjamins Publishing Company 01 JB code SCL 119 Eb 15 9789027246486 06 10.1075/scl.119 13 2024033283 DG 002 02 01 SCL 02 1388-0373 Studies in Corpus Linguistics 119 <TitleType>01</TitleType> <TitleText textformat="02">Crossing Boundaries through Corpora</TitleText> <Subtitle textformat="02">Innovative corpus approaches within and beyond linguistics</Subtitle> 01 scl.119 01 https://benjamins.com 02 https://benjamins.com/catalog/scl.119 1 B01 Sarah Buschfeld Buschfeld, Sarah Sarah Buschfeld TU Dortmund University 2 B01 Patricia Ronan Ronan, Patricia Patricia Ronan TU Dortmund University 3 B01 Theresa Neumaier Neumaier, Theresa Theresa Neumaier TU Dortmund University 4 B01 Andreas Weilinghoff Weilinghoff, Andreas Andreas Weilinghoff University of Koblenz 5 B01 Lisa Westermayer Westermayer, Lisa Lisa Westermayer TU Dortmund University 01 eng 266 vi 260 + index LAN009000 v.2006 CFX 2 24 JB Subject Scheme LIN.CORP Corpus linguistics 24 JB Subject Scheme LIN.THEOR Theoretical linguistics 06 01 This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. 
Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language. 04 09 01 https://benjamins.com/covers/475/scl.119.png 04 03 01 https://benjamins.com/covers/475_jpg/9789027215949.jpg 04 03 01 https://benjamins.com/covers/475_tif/9789027215949.tif 06 09 01 https://benjamins.com/covers/1200_front/scl.119.hb.png 07 09 01 https://benjamins.com/covers/125/scl.119.png 25 09 01 https://benjamins.com/covers/1200_back/scl.119.hb.png 27 09 01 https://benjamins.com/covers/3d_web/scl.119.hb.png 10 01 JB code scl.119.toc v vi 2 Miscellaneous 1 <TitleType>01</TitleType> <TitleText textformat="02">Table of contents</TitleText> 10 01 JB code scl.119.01ron 1 6 6 Chapter 2 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 1. Introduction</TitleText> <Subtitle textformat="02">Crossing discipline boundaries with corpus‑linguistic methods</Subtitle> 1 A01 Patricia Ronan Ronan, Patricia Patricia Ronan TU Dortmund University (Germany) 2 A01 Sarah Buschfeld Buschfeld, Sarah Sarah Buschfeld TU Dortmund University (Germany) 3 A01 Theresa Neumaier Neumaier, Theresa Theresa Neumaier TU Dortmund University (Germany) 4 A01 Andreas Weilinghoff Weilinghoff, Andreas Andreas Weilinghoff University of Koblenz (Germany) 5 A01 Lisa Westermayer Westermayer, Lisa Lisa Westermayer TU Dortmund University (Germany) 10 01 JB code scl.119.p01 7 1 Section header 3 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Integrated approaches</Subtitle> 10 01 JB code scl.119.02sai 8 40 33 Chapter 4 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 2. 
New approaches to investigating change in derivational productivity</TitleText> <Subtitle textformat="02">Gender and internal factors in the development of ‑<i>ity</i> and ‑<i>ness</i>, 1600–1800</Subtitle> 1 A01 Tanja Säily Säily, Tanja Tanja Säily University of Helsinki 2 A01 Martin Hilpert Hilpert, Martin Martin Hilpert Université de Neuchâtel 3 A01 Jukka Suomela Suomela, Jukka Jukka Suomela Aalto University 20 Construction Grammar 20 historical sociolinguistics 20 methodology 20 morphological productivity 20 nominal suffixes 01 We study the productivity of the suffixes ‑<i>ness</i> and ‑<i>ity</i> in seventeenth‑ and eighteenth-century letters in the <i>Corpora of Early English Correspondence</i>. We analyze the role of gender and five internal factors: etymology, the word class of the base, branching structure, semantics, and occurrence in possessive constructions. We develop statistical and visual methods that facilitate diachronic comparisons within factors and between competing suffixes; our basic measure is the proportion of types of interest out of all relevant types, and we utilize permutation testing to assess the statistical significance of our findings. Our results support and refine the earlier finding of a male-led increase in the productivity of ‑<i>ity</i> and provide new information on the interplay of gender and internal factors. 10 01 JB code scl.119.03sch 41 61 21 Chapter 5 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 3. 
A corpus-based comparative acoustic analysis of target-like vowel production by L1-Japanese learners and native speakers of English</TitleText> 1 A01 Martin Schweinberger Schweinberger, Martin Martin Schweinberger University of Queensland 2 A01 Yuki Komiya Komiya, Yuki Yuki Komiya University of Queensland 20 acoustic phonetics 20 corpus linguistics 20 Japanese learner English 20 multifactorial prediction and deviation analysis 20 random forests 20 regression 01 This study combines acoustic phonetics, (applied) corpus linguistics, machine learning, and speech recognition to analyse the production of the monophthongal vowels / ɐ ɒ æ e ɛ i: ɪ u: ʊ ʌ / in the speech of L1-Japanese learners and L1-speakers of English based on transcripts and audio data from the Japanese spoken monologue section of the <i>International Corpus Network of Asian Learners of English</i> (<i>ICNALE</i>). The aim of this analysis is to identify which vowels L1-Japanese learners struggle with in terms of target-like vowel production and to provide insights into the determining factors causing divergences from vowels produced by L1-English speakers. The results of a <i>Multifactorial Prediction and Deviation Analysis Using Regression/Random Forests</i> (MuPDARF) show that Japanese learners of English do indeed have difficulties in producing English vowels in a target-like manner but that these difficulties are confined to a relatively small set of vowels (/ ɪ u ʊ ɛ /). In addition, the analysis shows that difficulties are predominantly correlated with language-internal factors, while language-external factors (the age and gender of speakers) as well as their target variety and proficiency do not significantly correlate with non-target-like vowel production. 
The results suggest that Japanese learners of English can focus on specific vowels to enhance their target-like vowel production. They also suggest that difficulties are caused by L1-interference, due to a lack of phonemic vowel duration in Japanese and to the similarity of Japanese and English vowels, which leads learners to use their L1 vowels rather than the slightly but notably different English vowel variants. The results can be used to raise awareness of L1-specific difficulties among this learner cohort. 10 01 JB code scl.119.04sch 62 98 37 Chapter 6 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 4. Digital Dickens</TitleText> <Subtitle textformat="02">An automated content analysis of Charles Dickens’ novels</Subtitle> 1 A01 Gerold Schneider Schneider, Gerold Gerold Schneider University of Zurich 20 Charles Dickens 20 computational linguistics 20 conceptual maps 20 digital humanities 20 distributional semantics 20 document classification 20 topic modelling 01 This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming positive results for automated analysis of social criticism, the study emphasizes that it could discover differing associations only due to semantic abstraction, which distributional semantics, word embeddings, and topic modelling can offer. Literary realism is successfully traced through detailed descriptions and everyday activities. Plotting plots with computational linguistic methods, specifically conceptual maps with textplot, shows promise but requires refinement. The study shows that current methods in content analysis offer new possibilities for literary analysis and digital humanities. 
10 01 JB code scl.119.p02 99 1 Section header 7 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Change over time</Subtitle> 10 01 JB code scl.119.05ebe 100 124 25 Chapter 8 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 5. 120 years of reporting clauses</TitleText> <Subtitle textformat="02">Stability or change?</Subtitle> 1 A01 Jarle Ebeling Ebeling, Jarle Jarle Ebeling University of Oslo 20 corpus stylistics 20 diachronic change 20 distant reading 20 fictional dialogue 20 reporting clause 01 Introducing the <i>Corpus of British Fiction</i>, this paper studies the development and functions of expansions in the form of manner adverbs and ‑<i>ing</i> clauses within reporting clauses. The investigation shows that the use of single adverbs to modify the reporting verb has not increased in line with the increased use of <sc>say</sc> as a reporting verb. The opposite is the case with ‑<i>ing</i> clause expansions, whose use is stable or on the rise over the 120 years covered by the corpus. The study contains a diachronic, mostly unsupervised quantitative part and a synchronic qualitative part, including manual scrutiny of the functions of the expansions under study. 10 01 JB code scl.119.06gee 125 152 28 Chapter 9 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 6. 
Establishing a ‘new normal’</TitleText> <Subtitle textformat="02">Detecting fluctuating trends in word frequency over time</Subtitle> 1 A01 Matt Gee Gee, Matt Matt Gee Birmingham City University 2 A01 Andrew Kehoe Kehoe, Andrew Andrew Kehoe Birmingham City University 3 A01 Antoinette Renouf Renouf, Antoinette Antoinette Renouf Birmingham City University 20 collocation 20 diachronic change 20 lexical change 20 time series 20 visualisation 01 In this chapter we introduce statistical methods and associated visualisations for the analysis of lexical change on a monthly basis in a 1.8-billion-word news corpus spanning over 30 years. In previous work (Kehoe et al. 2022) we found examples of word frequency change in a data-driven manner by applying existing statistical tests. An ongoing limitation is that, as our diachronic corpus grows, so too does the possibility of a word exhibiting multiple frequency changes in different directions. This chapter reframes the problem as one of time-series segmentation, dividing the frequency history of a word into timespans exhibiting consistent upward or downward change. We then determine reasons for such changes by applying horizon graph visualisations to collocates. 10 01 JB code scl.119.p03 153 1 Section header 10 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">New approaches</Subtitle> 10 01 JB code scl.119.07mcc 154 191 38 Chapter 11 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 7. 
Syntactic segmentation of spoken corpus data</TitleText> <Subtitle textformat="02">What prosody can contribute</Subtitle> 1 A01 Karin McClellan McClellan, Karin Karin McClellan Ruhr Universität Bochum 2 A01 Kathrin Kircili Kircili, Kathrin Kathrin Kircili Philipps-Universität Marburg 3 A01 Sandra Götz Götz, Sandra Sandra Götz Philipps-Universität Marburg 20 L1 English 20 spoken language 20 syntactic and prosodic segmentation 20 syntax-prosody interface 01 Most corpus-based syntactic segmentation schemes rely on transcriptions alone, which can lead to segmentation difficulties, especially when analyzing spontaneous conversations. We therefore suggest an approach to segmentation that complements syntactic segmentation techniques with prosodic analyses and describe correspondences in syntactic and prosodic segmentation as well as the exact syntactic contexts in which prosodic analyses are necessary to avoid ambiguities and potential inaccuracies. Using 10 recordings from the <i>Louvain Corpus of Native English Conversation</i>, utterances are independently and manually segmented and annotated for various linguistic variables. While the results of our analyses indicate a considerable overlap of intermediate phrases and clausal units, we also showcase syntactic contexts where prosody is needed for disambiguation (e.g. monologs, discourse markers, dysfluencies, and adverbials). 10 01 JB code scl.119.08tob 192 216 25 Chapter 12 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 8. 
Short-term diachronic and variety-internal approaches to textual functionality in South Asian Englishes</TitleText> <Subtitle textformat="02">Evidence from newspaper language</Subtitle> 1 A01 Tobias Bernaisch Bernaisch, Tobias Tobias Bernaisch Justus Liebig University Giessen 2 A01 Sven Leuckert Leuckert, Sven Sven Leuckert TUD Dresden University of Technology 20 corpus linguistics 20 diachrony 20 Indian Englishes 20 MDA 20 South Asian Englishes 01 To empirically trace functional characteristics of texts such as speaker/writer involvement, narrativity or persuasiveness with a view to potential (a) intra-national variability in Indian English and (b) short-term diachronic change in South Asian Englishes, the <i>South Asian Varieties of English</i> (<i>SAVE</i>) corpus, its updated version <i>SAVE2020</i>, and the <i>Corpus of Regional Indian Newspaper Englishes</i> (<i>CORINNE</i>) are subjected to Multidimensional Analysis (MDA, Biber 1988) as implemented in Nini (2019). A hierarchical cluster analysis of the respective MDA scores reveals the tendency of mesolectal Indian Englishes as well as acrolectal Bangladeshi and Sri Lankan English to employ features of a conceptually written nature more readily than acrolectal Indian, Maldivian, Nepali, and Pakistani English. Still, in the observed time span of 15 years, the acrolects of South Asian Englishes also develop towards conceptually written language. 10 01 JB code scl.119.09sch 217 245 29 Chapter 13 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 9. 
Do corpus data on World Englishes inspire tolerance of variation in ELT professionals?</TitleText> <Subtitle textformat="02">An experimental questionnaire study with native English speaking teachers</Subtitle> 1 A01 Julia Schlüter Schlüter, Julia Julia Schlüter University of Bamberg 20 applications of corpus linguistics 20 bias 20 consistency 20 corpus literacy 20 Corpus-Assisted Language Learning (CALL) 20 Data-driven learning (DDL) 20 English as a Lingua Franca (ELF) 20 English as an International Language (EIL) 20 English Language Teaching (ELT) 20 error correction 20 Native English Speaking Teachers (NESTs) 20 Prepositional phrases 20 target norm 20 Varieties of English 20 World Englishes 01 The present study aims to show that, given the status of English as a pluricentric global language and as a lingua franca, Corpus Linguistics has important and unique contributions to make to English Language Teaching (ELT). Desirable innovations arguably involve popularizing the use of corpus concordancing as a tool to put native speaker intuitions on a firmer empirical footing, and imbuing ELT practitioners with an awareness that variation, in particular (but not only) between geographical varieties, is an inherent and legitimate characteristic of language in use. To support these points, a quasi-experimental questionnaire study with 76 native English speaking teachers based at German universities is reported, which demonstrates the promise but also the obstacles of such an approach. 10 01 JB code scl.119.p04 247 1 Section header 14 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Accessibility</Subtitle> 10 01 JB code scl.119.10mil 248 262 15 Chapter 15 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 10. 
Query a corpus in near-natural language</TitleText> <Subtitle textformat="02">A human-friendly corpus query language not only for linguists</Subtitle> 1 A01 Jiří Milička Milička, Jiří Jiří Milička Charles University, Prague, Czech Republic 2 A01 Denisa Šebestová Šebestová, Denisa Denisa Šebestová Charles University, Prague, Czech Republic 20 corpus 20 corpus linguistics 20 Czech national corpus 20 large language models 20 query language 20 universal dependencies formalism 01 This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in <i>SketchEngine</i>, <i>Czech National Corpus</i> web applications, and other services. The translator does not require strict syntactic patterns and tolerates a certain number of typing errors by exploiting the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the <i>Czech National Corpus</i> infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use. 02 JBENJAMINS John Benjamins Publishing Company 01 John Benjamins Publishing Company Amsterdam/Philadelphia NL 02 October 2024 20241015 2024 John Benjamins B.V. 
02 WORLD 13 15 9789027215949 01 JB 3 John Benjamins e-Platform 03 jbe-platform.com 09 WORLD 10 20241015 01 00 120.00 EUR R 01 00 101.00 GBP Z 01 gen 00 156.00 USD S 940029010 03 01 01 JB John Benjamins Publishing Company 01 JB code SCL 119 Hb 15 9789027215949 13 2024033282 BB 01 SCL 02 1388-0373 Studies in Corpus Linguistics 119 <TitleType>01</TitleType> <TitleText textformat="02">Crossing Boundaries through Corpora</TitleText> <Subtitle textformat="02">Innovative corpus approaches within and beyond linguistics</Subtitle> 01 scl.119 01 https://benjamins.com 02 https://benjamins.com/catalog/scl.119 1 B01 Sarah Buschfeld Buschfeld, Sarah Sarah Buschfeld TU Dortmund University 2 B01 Patricia Ronan Ronan, Patricia Patricia Ronan TU Dortmund University 3 B01 Theresa Neumaier Neumaier, Theresa Theresa Neumaier TU Dortmund University 4 B01 Andreas Weilinghoff Weilinghoff, Andreas Andreas Weilinghoff University of Koblenz 5 B01 Lisa Westermayer Westermayer, Lisa Lisa Westermayer TU Dortmund University 01 eng 266 vi 260 + index LAN009000 v.2006 CFX 2 24 JB Subject Scheme LIN.CORP Corpus linguistics 24 JB Subject Scheme LIN.THEOR Theoretical linguistics 06 01 This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. 
New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language. 04 09 01 https://benjamins.com/covers/475/scl.119.png 04 03 01 https://benjamins.com/covers/475_jpg/9789027215949.jpg 04 03 01 https://benjamins.com/covers/475_tif/9789027215949.tif 06 09 01 https://benjamins.com/covers/1200_front/scl.119.hb.png 07 09 01 https://benjamins.com/covers/125/scl.119.png 25 09 01 https://benjamins.com/covers/1200_back/scl.119.hb.png 27 09 01 https://benjamins.com/covers/3d_web/scl.119.hb.png 10 01 JB code scl.119.toc v vi 2 Miscellaneous 1 <TitleType>01</TitleType> <TitleText textformat="02">Table of contents</TitleText> 10 01 JB code scl.119.01ron 1 6 6 Chapter 2 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 1. Introduction</TitleText> <Subtitle textformat="02">Crossing discipline boundaries with corpus‑linguistic methods</Subtitle> 1 A01 Patricia Ronan Ronan, Patricia Patricia Ronan TU Dortmund University (Germany) 2 A01 Sarah Buschfeld Buschfeld, Sarah Sarah Buschfeld TU Dortmund University (Germany) 3 A01 Theresa Neumaier Neumaier, Theresa Theresa Neumaier TU Dortmund University (Germany) 4 A01 Andreas Weilinghoff Weilinghoff, Andreas Andreas Weilinghoff University of Koblenz (Germany) 5 A01 Lisa Westermayer Westermayer, Lisa Lisa Westermayer TU Dortmund University (Germany) 10 01 JB code scl.119.p01 7 1 Section header 3 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Integrated approaches</Subtitle> 10 01 JB code scl.119.02sai 8 40 33 Chapter 4 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 2. 
New approaches to investigating change in derivational productivity</TitleText> <Subtitle textformat="02">Gender and internal factors in the development of ‑ <i>ity</i> and ‑ <i>ness</i> , 1600–1800</Subtitle> 1 A01 Tanja Säily Säily, Tanja Tanja Säily University of Helsinki 2 A01 Martin Hilpert Hilpert, Martin Martin Hilpert Université de Neuchâtel 3 A01 Jukka Suomela Suomela, Jukka Jukka Suomela Aalto University 20 Construction Grammar 20 historical sociolinguistics 20 methodology 20 morphological productivity 20 nominal suffixes 01 We study the productivity of the suffixes ‑<i>ness</i> and ‑<i>ity</i> in seventeenth‑ and eighteenth-century letters in the <i>Corpora of Early English Correspondence</i>. We analyze the role of gender and five internal factors: etymology, the word class of the base, branching structure, semantics, and occurrence in possessive constructions. We develop statistical and visual methods that facilitate diachronic comparisons within factors and between competing suffixes; our basic measure is the proportion of types of interest out of all relevant types, and we utilize permutation testing to assess the statistical significance of our findings. Our results support and refine the earlier finding of a male-led increase in the productivity of ‑<i>ity</i> and provide new information on the interplay of gender and internal factors. 10 01 JB code scl.119.03sch 41 61 21 Chapter 5 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 3. 
A corpus-based comparative acoustic analysis of target-like vowel production by L1-Japanese learners and native speakers of English</TitleText> 1 A01 Martin Schweinberger Schweinberger, Martin Martin Schweinberger University of Queensland 2 A01 Yuki Komiya Komiya, Yuki Yuki Komiya University of Queensland 20 acoustic phonetics 20 corpus linguistics 20 Japanese learner English 20 multifactorial prediction and deviation analysis 20 random forests 20 regression 01 This study combines acoustic phonetics, (applied) corpus linguistics, machine learning, and speech recognition to analyse the production of the monophthongal vowels / ɐ ɒ æ e ɛ i: ɪ u; ʊ ʌ / in the speech of L1-Japanese learners and L1-speakers of English based on transcripts and audio data from the Japanese spoken monologue section of the <i>International Corpus Network of Asian Learners of English</i> (<i>ICNALE</i>). The aim of this analysis is to evaluate what vowels L1-Japanese learners struggle with in terms of target-like vowel production and to provide insights into the determining factors causing divergencies from L1-English produced vowels. The results of a <i>Multifactorial Prediction and Deviation Analysis Using Regression/Random Forests</i> (MuPDARF) show that Japanese learners of English do indeed have difficulties in producing English vowels in a target-like manner but that these difficulties are confined to a relatively small set of vowels (/ ɪ u ʊ ɛ /). In addition, the analysis shows that difficulties are predominantly correlated with language-internal factors while language external factors (the age and gender of speakers) as well as their target variety and proficiency do not significantly correlate with non-target-like vowel production. 
The results suggest that Japanese learners of English can focus on specific vowels to enhance their target-like vowel production and that difficulties are caused by L1-interference due to a lack of phonemic vowel duration in Japanese and the similarity of Japanese and English vowels leading learners to use their L1 vowels rather than the slightly but notably different English vowel variants. The results can be used to raise awareness of L1-specific difficulties among this learner cohort due to their L1-background. 10 01 JB code scl.119.04sch 62 98 37 Chapter 6 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 4. Digital Dickens</TitleText> <Subtitle textformat="02">An automated content analysis of Charles Dickens’ novels</Subtitle> 1 A01 Gerold Schneider Schneider, Gerold Gerold Schneider University of Zurich 20 Charles Dickens 20 computational linguistics 20 conceptual maps 20 digital humanities 20 distributional semantics 20 document classification 20 topic modelling 01 This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming positive results for automated analysis of social criticism, the study emphasizes that it could discover differing associations only due to semantic abstraction, which distributional semantics, word embeddings, and topic modelling can offer. Literary realism is successfully traced through detailed descriptions and everyday activities. Plotting plots with computational linguistic methods, specifically conceptual maps with textplot, shows promise but requires refinement. The study shows that current methods in content analysis offer new possibilities for literary analysis and digital humanities. 
10 01 JB code scl.119.p02 99 1 Section header 7 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Change over time</Subtitle> 10 01 JB code scl.119.05ebe 100 124 25 Chapter 8 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 5. 120 years of reporting clauses</TitleText> <Subtitle textformat="02">Stability or change?</Subtitle> 1 A01 Jarle Ebeling Ebeling, Jarle Jarle Ebeling University of Oslo 20 corpus stylistics 20 diachronic change 20 distant reading 20 fictional dialogue 20 reporting clause 01 Introducing the <i>Corpus of British Fiction</i>, this paper studies the development and functions of expansions in the form of manner adverbs and ‑<i>ing</i> clauses within reporting clauses. The investigation shows that the use of single adverbs to modify the reporting verb has not increased in line with the increased use of <sc>say</sc> as a reporting verb. The opposite is the case with <i>ing-</i>clause expansions, as the use of these is stable or on the rise in the 120 years covered by the corpus. The study contains a diachronic, mostly unsupervised quantitative part and a synchronic qualitative part, including manual scrutiny of the functions of the expansions under study. 10 01 JB code scl.119.06gee 125 152 28 Chapter 9 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 6. 
Establishing a ‘new normal’</TitleText> <Subtitle textformat="02">Detecting fluctuating trends in word frequency over time</Subtitle> 1 A01 Matt Gee Gee, Matt Matt Gee Birmingham City University 2 A01 Andrew Kehoe Kehoe, Andrew Andrew Kehoe Birmingham City University 3 A01 Antoinette Renouf Renouf, Antoinette Antoinette Renouf Birmingham City University 20 collocation 20 diachronic change 20 lexical change 20 time series 20 visualisation 01 In this chapter we introduce statistical methods and associated visualisations for the analysis of lexical change on a monthly basis in a 1.8-billion word news corpus spanning over 30 years. In previous work (Kehoe et al. 2022) we found examples of word frequency change in a data-driven manner by applying existing statistical tests. An ongoing limitation is that, as our diachronic corpus grows, so too does the possibility of a word exhibiting multiple frequency changes in different directions. This chapter reframes the problem as one of time-series segmentation, dividing the frequency history of a word into timespans exhibiting consistent upward or downward change. We then determine reasons for such changes by applying horizon graph visualisations to collocates. 10 01 JB code scl.119.p03 153 1 Section header 10 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">New approaches</Subtitle> 10 01 JB code scl.119.07mcc 154 191 38 Chapter 11 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 7. 
Syntactic segmentation of spoken corpus data</TitleText> <Subtitle textformat="02">What prosody can contribute</Subtitle> 1 A01 Karin McClellan McClellan, Karin Karin McClellan Ruhr Universität Bochum 2 A01 Kathrin Kircili Kircili, Kathrin Kathrin Kircili Philipps-Universität Marburg 3 A01 Sandra Götz Götz, Sandra Sandra Götz Philipps-Universität Marburg 20 L1 English 20 spoken language 20 syntactic and prosodic segmentation 20 syntax-prosody interface 01 Most corpus-based syntactic segmentation schemes rely on transcriptions alone, which can lead to segmentation difficulties, especially when analyzing spontaneous conversations. We therefore suggest an approach to segmentation that complements syntactic segmentation techniques with prosodic analyses and describe correspondences in syntactic and prosodic segmentation as well as the exact syntactic contexts in which prosodic analyses are necessary to avoid ambiguities and potential inaccuracies. Using 10 recordings from the <i>Louvain Corpus of Native English Conversation</i>, utterances are independently and manually segmented and annotated for various linguistic variables. While the results of our analyses indicate a considerable overlap of intermediate phrases and clausal units, we also showcase syntactic contexts where prosody is needed for disambiguation (e.g. monologs, discourse markers, dysfluencies, and adverbials). 10 01 JB code scl.119.08tob 192 216 25 Chapter 12 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 8. 
Short-term diachronic and variety-internal approaches to textual functionality in South Asian Englishes</TitleText> <Subtitle textformat="02">Evidence from newspaper language</Subtitle> 1 A01 Tobias Bernaisch Bernaisch, Tobias Tobias Bernaisch Justus Liebig University Giessen 2 A01 Sven Leuckert Leuckert, Sven Sven Leuckert TUD Dresden University of Technology 20 corpus linguistics 20 diachrony 20 Indian Englishes 20 MDA 20 South Asian Englishes 01 To empirically trace functional characteristics of texts such as speaker/writer involvement, narrativity or persuasiveness with a view to potential (a) intra-national variability in Indian English and (b) short-term diachronic change in South Asian Englishes, the <i>South Asian Varieties of English</i> (<i>SAVE</i>) corpus, its updated version <i>SAVE2020</i>, and the <i>Corpus of Regional Indian Newspaper Englishes</i> (<i>CORINNE</i>) are subjected to Multidimensional Analysis (MDA, Biber 1988) as implemented in Nini (2019). A hierarchical cluster analysis of the respective MDA scores reveals the tendency of mesolectal Indian Englishes as well as acrolectal Bangladeshi and Sri Lankan English to employ features of a conceptually written nature more readily than acrolectal Indian, Maldivian, Nepali, and Pakistani English. Still, in the observed time span of 15 years, the acrolects of South Asian Englishes also develop towards conceptually written language. 10 01 JB code scl.119.09sch 217 245 29 Chapter 13 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 9. 
Do corpus data on World Englishes inspire tolerance of variation in ELT professionals?</TitleText> <Subtitle textformat="02">An experimental questionnaire study with native English speaking teachers</Subtitle> 1 A01 Julia Schlüter Schlüter, Julia Julia Schlüter University of Bamberg 20 applications of corpus linguistics 20 bias 20 consistency 20 corpus literacy 20 Corpus-Assisted Language Learning (CALL) 20 Data-driven learning (DDL) 20 English as a Lingua Franca (ELF) 20 English as an International Language (EIL) 20 English Language Teaching (ELT) 20 error correction 20 Native English Speaking Teachers (NESTs) 20 Prepositional phrases 20 target norm 20 Varieties of English 20 World Englishes 01 The present study aims to show that, given the status of English as a pluricentric global language and as a lingua franca, Corpus Linguistics has important and unique contributions to make to English Language Teaching (ELT). Desirable innovations arguably involve popularizing the use of corpus concordancing as a tool to put native speaker intuitions on a firmer empirical footing, and imbuing ELT practitioners with an awareness that variation, in particular (but not only) between geographical varieties, is an inherent and legitimate characteristic of language in use. To support these points, a quasi-experimental questionnaire study with 76 native English speaking teachers based at German universities is reported, which demonstrates the promises but also the obstacles of such an approach. 10 01 JB code scl.119.p04 247 1 Section header 14 <TitleType>01</TitleType> <TitleText textformat="02">Crossing boundaries</TitleText> <Subtitle textformat="02">Accessibility</Subtitle> 10 01 JB code scl.119.10mil 248 262 15 Chapter 15 <TitleType>01</TitleType> <TitleText textformat="02">Chapter 10. 
Query a corpus in near-natural language</TitleText> <Subtitle textformat="02">A human-friendly corpus query language not only for linguists</Subtitle> 1 A01 Jiří Milička Milička, Jiří Jiří Milička Charles University, Prague, Czech Republic 2 A01 Denisa Šebestová Šebestová, Denisa Denisa Šebestová Charles University, Prague, Czech Republic 20 corpus 20 corpus linguistics 20 Czech national corpus 20 large language models 20 query language 20 universal dependencies formalism 01 This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in <i>SketchEngine</i>, <i>Czech National Corpus</i> web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the <i>Czech National Corpus</i> infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use. 02 JBENJAMINS John Benjamins Publishing Company 01 John Benjamins Publishing Company Amsterdam/Philadelphia NL 02 October 2024 20241015 2024 John Benjamins B.V. 
02 WORLD 01 JB 1 John Benjamins Publishing Company +31 20 6304747 +31 20 6739773 bookorder@benjamins.nl 01 https://benjamins.com 01 WORLD US CA MX 10 20241015 01 02 JB 1 00 120.00 EUR R 02 02 JB 1 00 127.20 EUR R 01 JB 10 bebc +44 1202 712 934 +44 1202 712 913 sales@bebc.co.uk 03 GB 10 20241015 02 02 JB 1 00 101.00 GBP Z 01 JB 2 John Benjamins North America +1 800 562-5666 +1 703 661-1501 benjamins@presswarehouse.com 01 https://benjamins.com 01 US CA MX 10 20241015 01 gen 02 JB 1 00 156.00 USD