560029011
03
01
01
JB
John Benjamins Publishing Company
01
JB code
SCL 119 Eb
15
9789027246486
06
10.1075/scl.119
13
2024033283
DG
002
02
01
SCL
02
1388-0373
Studies in Corpus Linguistics
119
01
Crossing Boundaries through Corpora
Innovative corpus approaches within and beyond linguistics
01
scl.119
01
https://benjamins.com
02
https://benjamins.com/catalog/scl.119
1
B01
Sarah Buschfeld
Buschfeld, Sarah
Sarah
Buschfeld
TU Dortmund University
2
B01
Patricia Ronan
Ronan, Patricia
Patricia
Ronan
TU Dortmund University
3
B01
Theresa Neumaier
Neumaier, Theresa
Theresa
Neumaier
TU Dortmund University
4
B01
Andreas Weilinghoff
Weilinghoff, Andreas
Andreas
Weilinghoff
University of Koblenz
5
B01
Lisa Westermayer
Westermayer, Lisa
Lisa
Westermayer
TU Dortmund University
01
eng
266
vi
260
+ index
LAN009000
v.2006
CFX
2
24
JB Subject Scheme
LIN.CORP
Corpus linguistics
24
JB Subject Scheme
LIN.THEOR
Theoretical linguistics
06
01
This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language.
04
09
01
https://benjamins.com/covers/475/scl.119.png
04
03
01
https://benjamins.com/covers/475_jpg/9789027215949.jpg
04
03
01
https://benjamins.com/covers/475_tif/9789027215949.tif
06
09
01
https://benjamins.com/covers/1200_front/scl.119.hb.png
07
09
01
https://benjamins.com/covers/125/scl.119.png
25
09
01
https://benjamins.com/covers/1200_back/scl.119.hb.png
27
09
01
https://benjamins.com/covers/3d_web/scl.119.hb.png
10
01
JB code
scl.119.toc
v
vi
2
Miscellaneous
1
01
Table of contents
10
01
JB code
scl.119.01ron
1
6
6
Chapter
2
01
Chapter 1. Introduction
Crossing discipline boundaries with corpus‑linguistic methods
1
A01
Patricia Ronan
Ronan, Patricia
Patricia
Ronan
TU Dortmund University (Germany)
2
A01
Sarah Buschfeld
Buschfeld, Sarah
Sarah
Buschfeld
TU Dortmund University (Germany)
3
A01
Theresa Neumaier
Neumaier, Theresa
Theresa
Neumaier
TU Dortmund University (Germany)
4
A01
Andreas Weilinghoff
Weilinghoff, Andreas
Andreas
Weilinghoff
University of Koblenz (Germany)
5
A01
Lisa Westermayer
Westermayer, Lisa
Lisa
Westermayer
TU Dortmund University (Germany)
10
01
JB code
scl.119.p01
7
1
Section header
3
01
Crossing boundaries
Integrated approaches
10
01
JB code
scl.119.02sai
8
40
33
Chapter
4
01
Chapter 2. New approaches to investigating change in derivational productivity
Gender and internal factors in the development of ‑<i>ity</i> and ‑<i>ness</i>, 1600–1800
1
A01
Tanja Säily
Säily, Tanja
Tanja
Säily
University of Helsinki
2
A01
Martin Hilpert
Hilpert, Martin
Martin
Hilpert
Université de Neuchâtel
3
A01
Jukka Suomela
Suomela, Jukka
Jukka
Suomela
Aalto University
20
Construction Grammar
20
historical sociolinguistics
20
methodology
20
morphological productivity
20
nominal suffixes
01
We study the productivity of the suffixes ‑<i>ness</i> and ‑<i>ity</i> in seventeenth‑ and eighteenth-century letters in the <i>Corpora of Early English Correspondence</i>. We analyze the role of gender and five internal factors: etymology, the word class of the base, branching structure, semantics, and occurrence in possessive constructions. We develop statistical and visual methods that facilitate diachronic comparisons within factors and between competing suffixes; our basic measure is the proportion of types of interest out of all relevant types, and we utilize permutation testing to assess the statistical significance of our findings. Our results support and refine the earlier finding of a male-led increase in the productivity of ‑<i>ity</i> and provide new information on the interplay of gender and internal factors.
10
01
JB code
scl.119.03sch
41
61
21
Chapter
5
01
Chapter 3. A corpus-based comparative acoustic analysis of target-like vowel production by L1-Japanese learners and native speakers of English
1
A01
Martin Schweinberger
Schweinberger, Martin
Martin
Schweinberger
University of Queensland
2
A01
Yuki Komiya
Komiya, Yuki
Yuki
Komiya
University of Queensland
20
acoustic phonetics
20
corpus linguistics
20
Japanese learner English
20
multifactorial prediction and deviation analysis
20
random forests
20
regression
01
This study combines acoustic phonetics, (applied) corpus linguistics, machine learning, and speech recognition to analyse the production of the monophthongal vowels / ɐ ɒ æ e ɛ i: ɪ u: ʊ ʌ / in the speech of L1-Japanese learners and L1-speakers of English based on transcripts and audio data from the Japanese spoken monologue section of the <i>International Corpus Network of Asian Learners of English</i> (<i>ICNALE</i>). The aim of this analysis is to evaluate what vowels L1-Japanese learners struggle with in terms of target-like vowel production and to provide insights into the determining factors causing divergencies from L1-English produced vowels. The results of a <i>Multifactorial Prediction and Deviation Analysis Using Regression/Random Forests</i> (MuPDARF) show that Japanese learners of English do indeed have difficulties in producing English vowels in a target-like manner but that these difficulties are confined to a relatively small set of vowels (/ ɪ u ʊ ɛ /). In addition, the analysis shows that difficulties are predominantly correlated with language-internal factors while language-external factors (the age and gender of speakers) as well as their target variety and proficiency do not significantly correlate with non-target-like vowel production. The results suggest that Japanese learners of English can focus on specific vowels to enhance their target-like vowel production and that difficulties are caused by L1-interference due to a lack of phonemic vowel duration in Japanese and the similarity of Japanese and English vowels leading learners to use their L1 vowels rather than the slightly but notably different English vowel variants. The results can be used to raise awareness of L1-specific difficulties among this learner cohort due to their L1-background.
10
01
JB code
scl.119.04sch
62
98
37
Chapter
6
01
Chapter 4. Digital Dickens
An automated content analysis of Charles Dickens’ novels
1
A01
Gerold Schneider
Schneider, Gerold
Gerold
Schneider
University of Zurich
20
Charles Dickens
20
computational linguistics
20
conceptual maps
20
digital humanities
20
distributional semantics
20
document classification
20
topic modelling
01
This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming positive results for automated analysis of social criticism, the study emphasizes that it could discover differing associations only due to semantic abstraction, which distributional semantics, word embeddings, and topic modelling can offer. Literary realism is successfully traced through detailed descriptions and everyday activities. Plotting plots with computational linguistic methods, specifically conceptual maps with textplot, shows promise but requires refinement. The study shows that current methods in content analysis offer new possibilities for literary analysis and digital humanities.
10
01
JB code
scl.119.p02
99
1
Section header
7
01
Crossing boundaries
Change over time
10
01
JB code
scl.119.05ebe
100
124
25
Chapter
8
01
Chapter 5. 120 years of reporting clauses
Stability or change?
1
A01
Jarle Ebeling
Ebeling, Jarle
Jarle
Ebeling
University of Oslo
20
corpus stylistics
20
diachronic change
20
distant reading
20
fictional dialogue
20
reporting clause
01
Introducing the <i>Corpus of British Fiction</i>, this paper studies the development and functions of expansions in the form of manner adverbs and ‑<i>ing</i> clauses within reporting clauses. The investigation shows that the use of single adverbs to modify the reporting verb has not increased in line with the increased use of <sc>say</sc> as a reporting verb. The opposite is the case with ‑<i>ing</i> clause expansions, as the use of these is stable or on the rise in the 120 years covered by the corpus. The study contains a diachronic, mostly unsupervised quantitative part and a synchronic qualitative part, including manual scrutiny of the functions of the expansions under study.
10
01
JB code
scl.119.06gee
125
152
28
Chapter
9
01
Chapter 6. Establishing a ‘new normal’
Detecting fluctuating trends in word frequency over time
1
A01
Matt Gee
Gee, Matt
Matt
Gee
Birmingham City University
2
A01
Andrew Kehoe
Kehoe, Andrew
Andrew
Kehoe
Birmingham City University
3
A01
Antoinette Renouf
Renouf, Antoinette
Antoinette
Renouf
Birmingham City University
20
collocation
20
diachronic change
20
lexical change
20
time series
20
visualisation
01
In this chapter we introduce statistical methods and associated visualisations for the analysis of lexical change on a monthly basis in a 1.8-billion word news corpus spanning over 30 years. In previous work (Kehoe et al. 2022) we found examples of word frequency change in a data-driven manner by applying existing statistical tests. An ongoing limitation is that, as our diachronic corpus grows, so too does the possibility of a word exhibiting multiple frequency changes in different directions. This chapter reframes the problem as one of time-series segmentation, dividing the frequency history of a word into timespans exhibiting consistent upward or downward change. We then determine reasons for such changes by applying horizon graph visualisations to collocates.
10
01
JB code
scl.119.p03
153
1
Section header
10
01
Crossing boundaries
New approaches
10
01
JB code
scl.119.07mcc
154
191
38
Chapter
11
01
Chapter 7. Syntactic segmentation of spoken corpus data
What prosody can contribute
1
A01
Karin McClellan
McClellan, Karin
Karin
McClellan
Ruhr Universität Bochum
2
A01
Kathrin Kircili
Kircili, Kathrin
Kathrin
Kircili
Philipps-Universität Marburg
3
A01
Sandra Götz
Götz, Sandra
Sandra
Götz
Philipps-Universität Marburg
20
L1 English
20
spoken language
20
syntactic and prosodic segmentation
20
syntax-prosody interface
01
Most corpus-based syntactic segmentation schemes rely on transcriptions alone, which can lead to segmentation difficulties, especially when analyzing spontaneous conversations. We therefore suggest an approach to segmentation that complements syntactic segmentation techniques with prosodic analyses and describe correspondences in syntactic and prosodic segmentation as well as the exact syntactic contexts in which prosodic analyses are necessary to avoid ambiguities and potential inaccuracies. Using 10 recordings from the <i>Louvain Corpus of Native English Conversation</i>, utterances are independently and manually segmented and annotated for various linguistic variables. While the results of our analyses indicate a considerable overlap of intermediate phrases and clausal units, we also showcase syntactic contexts where prosody is needed for disambiguation (e.g. monologs, discourse markers, dysfluencies, and adverbials).
10
01
JB code
scl.119.08tob
192
216
25
Chapter
12
01
Chapter 8. Short-term diachronic and variety-internal approaches to textual functionality in South Asian Englishes
Evidence from newspaper language
1
A01
Tobias Bernaisch
Bernaisch, Tobias
Tobias
Bernaisch
Justus Liebig University Giessen
2
A01
Sven Leuckert
Leuckert, Sven
Sven
Leuckert
TUD Dresden University of Technology
20
corpus linguistics
20
diachrony
20
Indian Englishes
20
MDA
20
South Asian Englishes
01
To empirically trace functional characteristics of texts such as speaker/writer involvement, narrativity or persuasiveness with a view to potential (a) intra-national variability in Indian English and (b) short-term diachronic change in South Asian Englishes, the <i>South Asian Varieties of English</i> (<i>SAVE</i>) corpus, its updated version <i>SAVE2020</i>, and the <i>Corpus of Regional Indian Newspaper Englishes</i> (<i>CORINNE</i>) are subjected to Multidimensional Analysis (MDA, Biber 1988) as implemented in Nini (2019). A hierarchical cluster analysis of the respective MDA scores reveals the tendency of mesolectal Indian Englishes as well as acrolectal Bangladeshi and Sri Lankan English to employ features of a conceptually written nature more readily than acrolectal Indian, Maldivian, Nepali, and Pakistani English. Still, in the observed time span of 15 years, the acrolects of South Asian Englishes also develop towards conceptually written language.
10
01
JB code
scl.119.09sch
217
245
29
Chapter
13
01
Chapter 9. Do corpus data on World Englishes inspire tolerance of variation in ELT professionals?
An experimental questionnaire study with native English speaking teachers
1
A01
Julia Schlüter
Schlüter, Julia
Julia
Schlüter
University of Bamberg
20
applications of corpus linguistics
20
bias
20
consistency
20
corpus literacy
20
Corpus-Assisted Language Learning (CALL)
20
Data-driven learning (DDL)
20
English as a Lingua Franca (ELF)
20
English as an International Language (EIL)
20
English Language Teaching (ELT)
20
error correction
20
Native English Speaking Teachers (NESTs)
20
Prepositional phrases
20
target norm
20
Varieties of English
20
World Englishes
01
The present study aims to show that, given the status of English as a pluricentric global language and as a lingua franca, Corpus Linguistics has important and unique contributions to make to English Language Teaching (ELT). Desirable innovations arguably involve popularizing the use of corpus concordancing as a tool to put native speaker intuitions on a firmer empirical footing, and imbuing ELT practitioners with an awareness that variation, in particular (but not only) between geographical varieties, is an inherent and legitimate characteristic of language in use. To support these points, a quasi-experimental questionnaire study with 76 native English speaking teachers based at German universities is reported, which demonstrates the promises but also the obstacles of such an approach.
10
01
JB code
scl.119.p04
247
1
Section header
14
01
Crossing boundaries
Accessibility
10
01
JB code
scl.119.10mil
248
262
15
Chapter
15
01
Chapter 10. Query a corpus in near-natural language
A human-friendly corpus query language not only for linguists
1
A01
Jiří Milička
Milička, Jiří
Jiří
Milička
Charles University, Prague, Czech Republic
2
A01
Denisa Šebestová
Šebestová, Denisa
Denisa
Šebestová
Charles University, Prague, Czech Republic
20
corpus
20
corpus linguistics
20
Czech national corpus
20
large language models
20
query language
20
universal dependencies formalism
01
This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in <i>SketchEngine</i>, <i>Czech National Corpus</i> web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the <i>Czech National Corpus</i> infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use.
02
JBENJAMINS
John Benjamins Publishing Company
01
John Benjamins Publishing Company
Amsterdam/Philadelphia
NL
02
October 2024
20241015
2024
John Benjamins B.V.
02
WORLD
13
15
9789027215949
01
JB
3
John Benjamins e-Platform
03
jbe-platform.com
09
WORLD
10
20241015
01
00
120.00
EUR
R
01
00
101.00
GBP
Z
01
gen
00
156.00
USD
S
940029010
03
01
01
JB
John Benjamins Publishing Company
01
JB code
SCL 119 Hb
15
9789027215949
13
2024033282
BB
01
SCL
02
1388-0373
Studies in Corpus Linguistics
119
01
Crossing Boundaries through Corpora
Innovative corpus approaches within and beyond linguistics
01
scl.119
01
https://benjamins.com
02
https://benjamins.com/catalog/scl.119
1
B01
Sarah Buschfeld
Buschfeld, Sarah
Sarah
Buschfeld
TU Dortmund University
2
B01
Patricia Ronan
Ronan, Patricia
Patricia
Ronan
TU Dortmund University
3
B01
Theresa Neumaier
Neumaier, Theresa
Theresa
Neumaier
TU Dortmund University
4
B01
Andreas Weilinghoff
Weilinghoff, Andreas
Andreas
Weilinghoff
University of Koblenz
5
B01
Lisa Westermayer
Westermayer, Lisa
Lisa
Westermayer
TU Dortmund University
01
eng
266
vi
260
+ index
LAN009000
v.2006
CFX
2
24
JB Subject Scheme
LIN.CORP
Corpus linguistics
24
JB Subject Scheme
LIN.THEOR
Theoretical linguistics
06
01
This volume illustrates new trends in corpus linguistics and shows how corpus approaches can be used to investigate new datasets and emerging areas in linguistics and related fields. It addresses innovative research questions, for example how prosodic analyses can increase the accuracy of syntactic segmentation, how tolerant English language teachers are about language variation, or how natural language can be translated into corpus query language. The thematic scope encompasses four types of ‘boundary crossings’. These include the incorporation of innovative scientific methods, specifically new statistical techniques, acoustic analysis and stylistic investigations. Additionally, temporal boundaries are crossed through the use of new methods and corpora to study diachronic data. New methodologies are also explored through the analysis of prosody, variety-specific approaches, and teacher attitudes. Finally, corpus users can cross boundaries by employing a more user-friendly corpus query language.
04
09
01
https://benjamins.com/covers/475/scl.119.png
04
03
01
https://benjamins.com/covers/475_jpg/9789027215949.jpg
04
03
01
https://benjamins.com/covers/475_tif/9789027215949.tif
06
09
01
https://benjamins.com/covers/1200_front/scl.119.hb.png
07
09
01
https://benjamins.com/covers/125/scl.119.png
25
09
01
https://benjamins.com/covers/1200_back/scl.119.hb.png
27
09
01
https://benjamins.com/covers/3d_web/scl.119.hb.png
10
01
JB code
scl.119.toc
v
vi
2
Miscellaneous
1
01
Table of contents
10
01
JB code
scl.119.01ron
1
6
6
Chapter
2
01
Chapter 1. Introduction
Crossing discipline boundaries with corpus‑linguistic methods
1
A01
Patricia Ronan
Ronan, Patricia
Patricia
Ronan
TU Dortmund University (Germany)
2
A01
Sarah Buschfeld
Buschfeld, Sarah
Sarah
Buschfeld
TU Dortmund University (Germany)
3
A01
Theresa Neumaier
Neumaier, Theresa
Theresa
Neumaier
TU Dortmund University (Germany)
4
A01
Andreas Weilinghoff
Weilinghoff, Andreas
Andreas
Weilinghoff
University of Koblenz (Germany)
5
A01
Lisa Westermayer
Westermayer, Lisa
Lisa
Westermayer
TU Dortmund University (Germany)
10
01
JB code
scl.119.p01
7
1
Section header
3
01
Crossing boundaries
Integrated approaches
10
01
JB code
scl.119.02sai
8
40
33
Chapter
4
01
Chapter 2. New approaches to investigating change in derivational productivity
Gender and internal factors in the development of ‑<i>ity</i> and ‑<i>ness</i>, 1600–1800
1
A01
Tanja Säily
Säily, Tanja
Tanja
Säily
University of Helsinki
2
A01
Martin Hilpert
Hilpert, Martin
Martin
Hilpert
Université de Neuchâtel
3
A01
Jukka Suomela
Suomela, Jukka
Jukka
Suomela
Aalto University
20
Construction Grammar
20
historical sociolinguistics
20
methodology
20
morphological productivity
20
nominal suffixes
01
We study the productivity of the suffixes ‑<i>ness</i> and ‑<i>ity</i> in seventeenth‑ and eighteenth-century letters in the <i>Corpora of Early English Correspondence</i>. We analyze the role of gender and five internal factors: etymology, the word class of the base, branching structure, semantics, and occurrence in possessive constructions. We develop statistical and visual methods that facilitate diachronic comparisons within factors and between competing suffixes; our basic measure is the proportion of types of interest out of all relevant types, and we utilize permutation testing to assess the statistical significance of our findings. Our results support and refine the earlier finding of a male-led increase in the productivity of ‑<i>ity</i> and provide new information on the interplay of gender and internal factors.
10
01
JB code
scl.119.03sch
41
61
21
Chapter
5
01
Chapter 3. A corpus-based comparative acoustic analysis of target-like vowel production by L1-Japanese learners and native speakers of English
1
A01
Martin Schweinberger
Schweinberger, Martin
Martin
Schweinberger
University of Queensland
2
A01
Yuki Komiya
Komiya, Yuki
Yuki
Komiya
University of Queensland
20
acoustic phonetics
20
corpus linguistics
20
Japanese learner English
20
multifactorial prediction and deviation analysis
20
random forests
20
regression
01
This study combines acoustic phonetics, (applied) corpus linguistics, machine learning, and speech recognition to analyse the production of the monophthongal vowels / ɐ ɒ æ e ɛ i: ɪ u: ʊ ʌ / in the speech of L1-Japanese learners and L1-speakers of English based on transcripts and audio data from the Japanese spoken monologue section of the <i>International Corpus Network of Asian Learners of English</i> (<i>ICNALE</i>). The aim of this analysis is to evaluate what vowels L1-Japanese learners struggle with in terms of target-like vowel production and to provide insights into the determining factors causing divergencies from L1-English produced vowels. The results of a <i>Multifactorial Prediction and Deviation Analysis Using Regression/Random Forests</i> (MuPDARF) show that Japanese learners of English do indeed have difficulties in producing English vowels in a target-like manner but that these difficulties are confined to a relatively small set of vowels (/ ɪ u ʊ ɛ /). In addition, the analysis shows that difficulties are predominantly correlated with language-internal factors while language-external factors (the age and gender of speakers) as well as their target variety and proficiency do not significantly correlate with non-target-like vowel production. The results suggest that Japanese learners of English can focus on specific vowels to enhance their target-like vowel production and that difficulties are caused by L1-interference due to a lack of phonemic vowel duration in Japanese and the similarity of Japanese and English vowels leading learners to use their L1 vowels rather than the slightly but notably different English vowel variants. The results can be used to raise awareness of L1-specific difficulties among this learner cohort due to their L1-background.
10
01
JB code
scl.119.04sch
62
98
37
Chapter
6
01
Chapter 4. Digital Dickens
An automated content analysis of Charles Dickens’ novels
1
A01
Gerold Schneider
Schneider, Gerold
Gerold
Schneider
University of Zurich
20
Charles Dickens
20
computational linguistics
20
conceptual maps
20
digital humanities
20
distributional semantics
20
document classification
20
topic modelling
01
This investigation employs computational linguistic methods such as document classification, topic modelling, and distributional semantics to scrutinize eight novels by Charles Dickens, uncovering dimensions of social criticism, literary realism, and narrative structures. While affirming positive results for automated analysis of social criticism, the study emphasizes that it could discover differing associations only due to semantic abstraction, which distributional semantics, word embeddings, and topic modelling can offer. Literary realism is successfully traced through detailed descriptions and everyday activities. Plotting plots with computational linguistic methods, specifically conceptual maps with textplot, shows promise but requires refinement. The study shows that current methods in content analysis offer new possibilities for literary analysis and digital humanities.
10
01
JB code
scl.119.p02
99
1
Section header
7
01
Crossing boundaries
Change over time
10
01
JB code
scl.119.05ebe
100
124
25
Chapter
8
01
Chapter 5. 120 years of reporting clauses
Stability or change?
1
A01
Jarle Ebeling
Ebeling, Jarle
Jarle
Ebeling
University of Oslo
20
corpus stylistics
20
diachronic change
20
distant reading
20
fictional dialogue
20
reporting clause
01
Introducing the <i>Corpus of British Fiction</i>, this paper studies the development and functions of expansions in the form of manner adverbs and ‑<i>ing</i> clauses within reporting clauses. The investigation shows that the use of single adverbs to modify the reporting verb has not increased in line with the increased use of <sc>say</sc> as a reporting verb. The opposite is the case with ‑<i>ing</i> clause expansions, as the use of these is stable or on the rise in the 120 years covered by the corpus. The study contains a diachronic, mostly unsupervised quantitative part and a synchronic qualitative part, including manual scrutiny of the functions of the expansions under study.
10
01
JB code
scl.119.06gee
125
152
28
Chapter
9
01
Chapter 6. Establishing a ‘new normal’
Detecting fluctuating trends in word frequency over time
1
A01
Matt Gee
Gee, Matt
Matt
Gee
Birmingham City University
2
A01
Andrew Kehoe
Kehoe, Andrew
Andrew
Kehoe
Birmingham City University
3
A01
Antoinette Renouf
Renouf, Antoinette
Antoinette
Renouf
Birmingham City University
20
collocation
20
diachronic change
20
lexical change
20
time series
20
visualisation
01
In this chapter we introduce statistical methods and associated visualisations for the analysis of lexical change on a monthly basis in a 1.8-billion word news corpus spanning over 30 years. In previous work (Kehoe et al. 2022) we found examples of word frequency change in a data-driven manner by applying existing statistical tests. An ongoing limitation is that, as our diachronic corpus grows, so too does the possibility of a word exhibiting multiple frequency changes in different directions. This chapter reframes the problem as one of time-series segmentation, dividing the frequency history of a word into timespans exhibiting consistent upward or downward change. We then determine reasons for such changes by applying horizon graph visualisations to collocates.
10
01
JB code
scl.119.p03
153
1
Section header
10
01
Crossing boundaries
New approaches
10
01
JB code
scl.119.07mcc
154
191
38
Chapter
11
01
Chapter 7. Syntactic segmentation of spoken corpus data
What prosody can contribute
1
A01
Karin McClellan
McClellan, Karin
Karin
McClellan
Ruhr Universität Bochum
2
A01
Kathrin Kircili
Kircili, Kathrin
Kathrin
Kircili
Philipps-Universität Marburg
3
A01
Sandra Götz
Götz, Sandra
Sandra
Götz
Philipps-Universität Marburg
20
L1 English
20
spoken language
20
syntactic and prosodic segmentation
20
syntax-prosody interface
01
Most corpus-based syntactic segmentation schemes rely on transcriptions alone, which can lead to segmentation difficulties, especially when analyzing spontaneous conversations. We therefore suggest an approach to segmentation that complements syntactic segmentation techniques with prosodic analyses and describe correspondences in syntactic and prosodic segmentation as well as the exact syntactic contexts in which prosodic analyses are necessary to avoid ambiguities and potential inaccuracies. Using 10 recordings from the <i>Louvain Corpus of Native English Conversation</i>, utterances are independently and manually segmented and annotated for various linguistic variables. While the results of our analyses indicate a considerable overlap of intermediate phrases and clausal units, we also showcase syntactic contexts where prosody is needed for disambiguation (e.g. monologs, discourse markers, dysfluencies, and adverbials).
10
01
JB code
scl.119.08tob
192
216
25
Chapter
12
01
Chapter 8. Short-term diachronic and variety-internal approaches to textual functionality in South Asian Englishes
Evidence from newspaper language
1
A01
Tobias Bernaisch
Bernaisch, Tobias
Tobias
Bernaisch
Justus Liebig University Giessen
2
A01
Sven Leuckert
Leuckert, Sven
Sven
Leuckert
TUD Dresden University of Technology
20
corpus linguistics
20
diachrony
20
Indian Englishes
20
MDA
20
South Asian Englishes
01
To empirically trace functional characteristics of texts such as speaker/writer involvement, narrativity or persuasiveness with a view to potential (a) intra-national variability in Indian English and (b) short-term diachronic change in South Asian Englishes, the <i>South Asian Varieties of English</i> (<i>SAVE</i>) corpus, its updated version <i>SAVE2020</i>, and the <i>Corpus of Regional Indian Newspaper Englishes</i> (<i>CORINNE</i>) are subjected to Multidimensional Analysis (MDA, Biber 1988) as implemented in Nini (2019). A hierarchical cluster analysis of the respective MDA scores reveals the tendency of mesolectal Indian Englishes as well as acrolectal Bangladeshi and Sri Lankan English to employ features of a conceptually written nature more readily than acrolectal Indian, Maldivian, Nepali, and Pakistani English. Still, in the observed time span of 15 years, the acrolects of South Asian Englishes also develop towards conceptually written language.
10
01
JB code
scl.119.09sch
217
245
29
Chapter
13
01
Chapter 9. Do corpus data on World Englishes inspire tolerance of variation in ELT professionals?
An experimental questionnaire study with native English speaking teachers
1
A01
Julia Schlüter
Schlüter, Julia
Julia
Schlüter
University of Bamberg
20
applications of corpus linguistics
20
bias
20
consistency
20
corpus literacy
20
Corpus-Assisted Language Learning (CALL)
20
Data-driven learning (DDL)
20
English as a Lingua Franca (ELF)
20
English as an International Language (EIL)
20
English Language Teaching (ELT)
20
error correction
20
Native English Speaking Teachers (NESTs)
20
Prepositional phrases
20
target norm
20
Varieties of English
20
World Englishes
01
The present study aims to show that, given the status of English as a pluricentric global language and as a lingua franca, Corpus Linguistics has important and unique contributions to make to English Language Teaching (ELT). Desirable innovations arguably involve popularizing the use of corpus concordancing as a tool to put native speaker intuitions on a firmer empirical footing, and imbuing ELT practitioners with an awareness that variation, in particular (but not only) between geographical varieties, is an inherent and legitimate characteristic of language in use. To support these points, a quasi-experimental questionnaire study with 76 native English speaking teachers based at German universities is reported, which demonstrates the promises but also the obstacles of such an approach.
10
01
JB code
scl.119.p04
247
1
Section header
14
01
Crossing boundaries
Accessibility
10
01
JB code
scl.119.10mil
248
262
15
Chapter
15
01
Chapter 10. Query a corpus in near-natural language
A human-friendly corpus query language not only for linguists
1
A01
Jiří Milička
Milička, Jiří
Jiří
Milička
Charles University, Prague, Czech Republic
2
A01
Denisa Šebestová
Šebestová, Denisa
Denisa
Šebestová
Charles University, Prague, Czech Republic
20
corpus
20
corpus linguistics
20
Czech national corpus
20
large language models
20
query language
20
universal dependencies formalism
01
This paper addresses the pressing issue of accessibility of corpora to users who are not able or willing to learn a formal query language. It introduces a working online automatic translator from a near-natural language into the Corpus Query Language (CQL), as used in <i>SketchEngine</i>, <i>Czech National Corpus</i> web applications, and other services. The translator does not require strict syntactical patterns and allows for a certain amount of typing errors, using the redundancy associated with natural language. It allows querying corpora of 35 languages hosted by the <i>Czech National Corpus</i> infrastructure, all of them annotated in the Universal Dependencies formalism. Alternatively, the translated CQL code can be employed in other compatible systems. The paper both presents the theoretical assumptions of our solution and outlines the details of its implementation, including examples of use.
02
JBENJAMINS
John Benjamins Publishing Company
01
John Benjamins Publishing Company
Amsterdam/Philadelphia
NL
02
October 2024
20241015
2024
John Benjamins B.V.
02
WORLD
01
JB
1
John Benjamins Publishing Company
+31 20 6304747
+31 20 6739773
bookorder@benjamins.nl
01
https://benjamins.com
01
WORLD
US CA MX
10
20241015
01
02
JB
1
00
120.00
EUR
R
02
02
JB
1
00
127.20
EUR
R
01
JB
10
bebc
+44 1202 712 934
+44 1202 712 913
sales@bebc.co.uk
03
GB
10
20241015
02
02
JB
1
00
101.00
GBP
Z
01
JB
2
John Benjamins North America
+1 800 562-5666
+1 703 661-1501
benjamins@presswarehouse.com
01
https://benjamins.com
01
US CA MX
10
20241015
01
gen
02
JB
1
00
156.00
USD