Bayesian parametric estimation shows the effects of social prestige and bilingualism on reshaping language competition
Sizhe Yang, Tingting Ye, Shengbo Bi, Li Jin, Yongyan Zheng, Menghan Zhang

TL;DR
This paper uses Bayesian methods to study how social prestige and bilingualism affect language competition and preservation.
Contribution
A novel Bayesian parametric estimation strategy is introduced to quantify socio-linguistic effects on language competition.
Findings
Bilingualism can slow the shift of minority language speakers to majority language speakers.
Bilingualism may accelerate reverse shifts when the majority language has higher social prestige.
The Bayesian framework helps assess socio-linguistic factors in language preservation.
Abstract
Understanding the mechanisms of language competition is crucial for mitigating language extinction and promoting cultural sustainability. Nevertheless, how to quantify the effects of socio-linguistic factors, such as social prestige and bilingualism, on language competition remains a critical challenge. Here, we present Markov-process-based language competition models to explore the interactions among monolingual and bilingual groups. Based on these models, we develop a Bayesian parametric estimation strategy, which enables quantifying the effects of socio-linguistic factors through rigorous statistical examinations. With six empirical cases worldwide, we observe a general trend of minority monolinguals shifting towards majority ones, where the presence of bilingualism can decelerate this shift. Typically, bilingualism can sometimes accelerate the reverse shift when the majority…
Figure 1
Figure 2

| competition case | equality of social prestige | role of bilinguals (r12) | role of bilinguals (r21) | BF |
|---|---|---|---|---|
| French versus English (Canada) [ | ≠ ( | decelerate | decelerate | 7.4566 |
| Spanish versus English [ | ≠ ( | decelerate | accelerate | >1 × 10^3^ |
| Welsh versus English [ | ≠ ( | decelerate | no effect | 2.8438 |
| Gaelic versus English [ | ≠ ( | decelerate | decelerate | 2.7164 |
| English versus French (Montreal) [ | = ( | decelerate | decelerate | <2 × 10^−16^ |
| Catalan versus Spanish [ | = ( | no effect | decelerate | <2 × 10^−16^ |

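| class | | symbol | definition | unit |
|---|---|---|---|---|
| parameter | transition rate | | transition rate from | proportion/year |
| parameter | transition rate | | transition rate from | proportion/year |
| parameter | transition rate | | transition rate from | proportion/year |
| parameter | transition rate | | transition rate from | proportion/year |
| parameter | transition rate | | transition rate from bilingual group to | proportion/year |
| parameter | transition rate | | transition rate from bilingual group to | proportion/year |
| parameter | social prestige | | social prestige of | no dimension |
| parameter | social prestige | | social prestige of | no dimension |
| notation | linguistics | | proportion of | no dimension |
| notation | linguistics | | proportion of | no dimension |
| notation | linguistics | | proportion of bilingual group at time | no dimension |
| notation | demography | | total population size of monolingual and bilingual groups at time | person |
| notation | demography | | the growth rate for each linguistic group at time | proportion/year |
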
Funding
- European Union
- Key R&D Program of China
- China Postdoctoral Science Foundation (http://dx.doi.org/10.13039/501100002858)
- National Natural Science Foundation of China (http://dx.doi.org/10.13039/501100001809)
- SIMIS
- National Social Science Fund of China (http://dx.doi.org/10.13039/501100012456)
Topics
Multilingual Education and Policy · Language and cultural evolution · Linguistic Variation and Morphology
Introduction
Language is an invaluable asset to humans, and its diversity holds profound importance for both individual speakers and human societies. Extensive research has shown that multilingualism can enhance cognitive executive control and problem-solving skills in the social interactions of individual speakers [1,2], as well as mental flexibility and empathy [3,4]. Beyond its transactional nature and role as an instrument for social communication, language also functions as a carrier of the identity, culture and knowledge tradition of a linguistic society [5,6]. Accordingly, preserving language diversity is pivotal not only for cultural inheritance but also for social diversity and inclusion, guaranteeing the sustainable development of human cultures. However, a United Nations report from 2022 presents a pessimistic outlook: between 90 and 95% of today’s languages may face extinction or severe endangerment by the end of this century [7]. Recent studies have also projected that the rate of language extinction will increase threefold over the next 40 years, with at least one language disappearing per month [8]. This alarming trend indicates an extensive loss of cultural diversity in the foreseeable future and poses a significant challenge to the sustainable development of human cultures. Consequently, revitalizing endangered languages to preserve them from extinction has become a universal imperative.
Language extinction can be brought about by language competition [9], which refers to the dynamics of language usage among interacting individuals who speak different languages [10–14]. It often engenders one of two primary consequences: language shift or coexistence. Language shift refers to the process by which a population abandons one language in favour of another [15], often accompanied by the loss of an active language tradition and the culture associated with it [16]. This process can be more broadly observed when competing languages carry distinct levels of social prestige owing to their association with different levels of socio-cultural, economic and political power [10,17]. As a result, speakers of a minority language in a linguistically and culturally diverse society tend to favour the high-prestige language [18]. It has been shown that most recent language extinction events are attributable to language shifts rather than a decline in the population that speaks the language [10]. In contrast to language shift, language coexistence is an ideal scenario in which none of the languages in a particular region faces extinction. In such circumstances, a low-prestige language survives under pressure from a high-prestige language in a specific community, so that multiple languages continue to be spoken and used in the region [19]. Besides affecting the language-speaking population, language competition can also trigger contact-induced changes in linguistic structures (e.g. grammar). Previous research has highlighted that the number of second-language learners in a speech community can facilitate the retention or loss of complex linguistic features that are difficult for second-language acquisition [20].
Recently, substantial demographic activities and socio-cultural interactions driven by globalization have induced the frequent occurrence of widespread language competition events, thereby accelerating the extinction of languages [8,21–23]. Therefore, understanding the hidden mechanism of language competition is urgent and essential for slowing down the language shift. Such an understanding would further strengthen linguistic and cultural diversity in addition to facilitating the sustainable development of human cultures.
Language competition is usually investigated on a case-by-case basis in traditional socio-linguistic research [18]. This approach can provide a detailed and comprehensive understanding of the specific socio-cultural, political, economic and historical contexts underlying a particular language competition event [24–26]. Using case-by-case research, sociolinguists have proposed that encouraging bilingualism and enhancing the social prestige of minority languages could be two effective ways to reverse language shifts. For example, Osler [15] pointed out the necessity of fostering a mode of coexistence between minority and dominant languages as an effective means of reversing language shift, suggesting the potential importance of bilinguals and bilingualism in this process. This highlights the complex interplay between bilingualism and language shift within language competition scenarios. On the other hand, measures of language policy and planning may also be effective in preventing the complete loss of an endangered language by elevating its social prestige in a community [27]. In other words, social prestige mediated by policy affects the size of the bilingual population and further changes the dynamics of language competition [28]. However, such an approach is qualitatively case-specific and has limitations in large-scale and cross-context comparisons. As a result, it is challenging to discover the mechanisms underlying language competition globally.
Recent methodological advances in language competition models provide alternative opportunities to address this challenge [29]. These models typically consist of a set of differential equations that can be used to interpret interactions among different language-speaking populations (i.e. monolingual and bilingual groups) and predict their dynamic changes [29]. Modelling studies of language competition primarily began with the early theoretical work of Baggs & Freedman [30,31] and gained momentum with the seminal work of Abrams & Strogatz [32]. The Baggs–Freedman model simulates the interaction between two monolingual groups in the presence of a bilingual group [30,31], while the Abrams–Strogatz model (AS model) investigates language shift between two monolingual groups based on the social prestige of each language [32]. Various more complex language competition models have been developed based on the AS model by incorporating and parameterizing other socio-linguistic or demographic factors. These factors involve population growth (Kandler’s model) [33], bilingualism (Castello’s model) [34], language similarity (Mira’s model) [35], social structure (Minett’s model) [36] and language diffusion (Zhang’s and Kandler’s models) [33,37]. Nevertheless, the parametric estimations in these models primarily rely on least squares estimation (LSE). LSE produces only a single point estimate for each parameter rather than a distribution, and therefore cannot quantify the parameter’s uncertainty or credible range. This limitation forces researchers to assess the effects of socio-linguistic factors solely by comparing the differences in their parameter magnitudes. However, without considering uncertainty, it is difficult to determine whether these differences result from random noise (e.g. sampling bias), which risks misinterpreting the actual effects of socio-linguistic factors.
Accordingly, how to rigorously identify the effects of these socio-linguistic factors, especially bilingualism and social prestige, on shaping language competition remains a critical challenge.
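The contrast between a point estimate and a posterior distribution can be illustrated with a minimal sketch. It is not the estimation procedure used in this study: the exponential-decay curve, the known noise level, the flat prior, the grid approximation and all function names are assumptions chosen only to keep the example short.

```python
import math
import random

def simulate(rate, times, noise_sd, rng):
    """Noisy observations of a decaying proportion exp(-rate * t) (toy data)."""
    return [math.exp(-rate * t) + rng.gauss(0.0, noise_sd) for t in times]

def sse(rate, times, obs):
    """Sum of squared errors between the decay curve and the observations."""
    return sum((math.exp(-rate * t) - y) ** 2 for t, y in zip(times, obs))

def grid_posterior(times, obs, noise_sd, grid):
    """Normalised posterior on a rate grid: flat prior, Gaussian likelihood."""
    log_post = [-sse(r, times, obs) / (2.0 * noise_sd ** 2) for r in grid]
    peak = max(log_post)  # subtract the peak before exponentiating for stability
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def credible_interval(grid, probs, mass=0.95):
    """Equal-tailed credible interval read off the cumulative grid posterior."""
    lo_tail = (1.0 - mass) / 2.0
    hi_tail = 1.0 - lo_tail
    cum, lower, upper = 0.0, grid[0], grid[-1]
    found_lower = False
    for r, p in zip(grid, probs):
        cum += p
        if not found_lower and cum >= lo_tail:
            lower, found_lower = r, True
        if cum >= hi_tail:
            upper = r
            break
    return lower, upper

rng = random.Random(0)
times = [0, 5, 10, 15, 20]                      # hypothetical census years
obs = simulate(0.05, times, 0.02, rng)          # hypothetical true rate 0.05
grid = [i / 2000.0 for i in range(1, 401)]      # candidate rates 0.0005..0.2

lse_rate = min(grid, key=lambda r: sse(r, times, obs))   # single point estimate
probs = grid_posterior(times, obs, 0.02, grid)           # full distribution
low, high = credible_interval(grid, probs)
print("LSE point estimate:", lse_rate)
print("95% credible interval:", (low, high))
```

The least squares fit returns one number, whereas the posterior also yields an interval, which is what makes a statistical comparison between two estimated rates possible.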
To address this challenge, we propose a Bayesian parametric estimation framework to quantitatively evaluate the effects of bilinguals and social prestige through rigorous statistical examinations. This framework rests upon model comparisons between baseline and alternative language competition models based on the posterior distributions of their parameters. The baseline model is a Markov process-based language competition model (see details in §4). It represents the most general scenario, in which any speaker can move between the bilingual and monolingual groups or between the two monolingual groups, and in which the two competing languages share equal social prestige. Two alternative models are compared with the baseline model. One excludes bilinguals and the other accounts for disparities in social prestige between the two competing languages (see details in §4). The equality of social prestige between two competing languages is assessed by comparing language competition models that do or do not involve equal social prestige, using the Bayes factor (BF) as the metric (figure 1; [38]). In addition, the role of bilingualism is evaluated by comparing language competition models that do or do not involve bilinguals, using Cohen’s d as the metric (figure 1; [39]). Using the language competition models and Bayesian framework, we then assess the role of bilingualism and social prestige in shaping the competition patterns of six empirical cases.
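As a rough illustration of a Markov-process-based competition model with and without bilinguals, the sketch below integrates a linear three-group system with constant transition rates. It is a simplified assumption rather than the model specified in §4: social prestige is left out, the rate names `r1b`, `rb1`, `r2b` and `rb2` are hypothetical (only `r12` and `r21` follow the notation used in the results), and all numerical values are invented.

```python
def derivatives(state, rates):
    """Right-hand side of a linear three-group system with constant rates."""
    x1, x2, xb = state  # proportions: L1 monolingual, L2 monolingual, bilingual
    out1 = rates["r12"] + rates["r1b"]  # total outflow rate from L1 monolinguals
    out2 = rates["r21"] + rates["r2b"]  # total outflow rate from L2 monolinguals
    outb = rates["rb1"] + rates["rb2"]  # total outflow rate from bilinguals
    dx1 = -out1 * x1 + rates["r21"] * x2 + rates["rb1"] * xb
    dx2 = rates["r12"] * x1 - out2 * x2 + rates["rb2"] * xb
    dxb = rates["r1b"] * x1 + rates["r2b"] * x2 - outb * xb
    return (dx1, dx2, dxb)

def rk4_step(state, rates, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = derivatives(state, rates)
    k2 = derivatives(shifted(state, k1, dt / 2), rates)
    k3 = derivatives(shifted(state, k2, dt / 2), rates)
    k4 = derivatives(shifted(state, k3, dt), rates)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(state, rates, years, dt=0.1):
    """Integrate the system forward and return the final proportions."""
    for _ in range(int(round(years / dt))):
        state = rk4_step(state, rates, dt)
    return state

# Hypothetical rates (proportion/year); none of these are estimates from the paper.
with_bilinguals = {"r12": 0.2, "r21": 0.1, "r1b": 0.05,
                   "rb1": 0.02, "r2b": 0.03, "rb2": 0.04}
# Excluding bilinguals: all flows into and out of the bilingual group are zero.
without_bilinguals = {"r12": 0.2, "r21": 0.1, "r1b": 0.0,
                      "rb1": 0.0, "r2b": 0.0, "rb2": 0.0}

final_with = simulate((0.3, 0.6, 0.1), with_bilinguals, years=200)
final_without = simulate((0.4, 0.6, 0.0), without_bilinguals, years=200)
print("with bilinguals:   ", final_with)
print("without bilinguals:", final_without)
```

Because every outflow from one group is an inflow to another, the three proportions always sum to one. In the two-group case the long-run share of L1 monolinguals settles at `r21 / (r12 + r21)`.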
Schematic diagrams for language competition models and evaluation strategy. Model 1 involves equal social prestige and bilinguals. Model 2 involves equal social prestige but no bilinguals. Model 3 involves unequal social prestige and bilinguals. Model 4 involves unequal social prestige but no bilinguals. The comparison between model 1 and model 3 based on the BF aims to assess the equality of social prestige. The statistical comparisons between model 1 and model 2, as well as the ones between model 3 and model 4, aim to assess the role of the bilinguals in language shift. In this study, the statistical comparison is implemented based on the metric of Cohen’s d.
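The logic of a Bayes-factor comparison between an "equal" and an "unequal" model can be sketched with a toy analogue. This is not the computation performed for the language competition models: the binomial data, the fixed probability of 0.5 for the "equal" model, the uniform prior for the "unequal" model and the function names are all assumptions. The ratio is taken as unequal over equal, which matches the way table 1 pairs BF values above 1 with unequal prestige.

```python
import math

def evidence_equal(k, n):
    """Marginal likelihood of k successes in n trials with probability fixed at 0.5."""
    return math.comb(n, k) * 0.5 ** n

def evidence_unequal(k, n):
    """Marginal likelihood under a Uniform(0, 1) prior on the probability.

    Integrating the binomial likelihood over that prior gives exactly 1 / (n + 1).
    """
    return 1.0 / (n + 1)

def bayes_factor(k, n):
    """Evidence for the 'unequal' model relative to the 'equal' model."""
    return evidence_unequal(k, n) / evidence_equal(k, n)

# Lopsided toy data favour the 'unequal' model (BF > 1) ...
print(bayes_factor(9, 10))
# ... while balanced toy data favour the simpler 'equal' model (BF < 1).
print(bayes_factor(5, 10))
```

The second case shows why a Bayes factor can support equality: the more flexible model spreads its prior over many outcomes and is penalized when the data are compatible with the restricted model.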
Results
To illustrate the generality and universality of our model across different spatiotemporal scales, we collected public data for six empirical competition cases that occurred in different regions (i.e. North America and Europe) and time periods (i.e. century-long and decades-long periods). In North America, these cases were French versus English competition between 1996 and 2017 in Canada (electronic supplementary material, table S1; [40]), English versus French competition between 1996 and 2016 in Montreal (electronic supplementary material, table S2; [40]) and Spanish versus English competition between 1980 and 2010 in the United States (electronic supplementary material, table S3; [41]). In Europe, these cases were the Gaelic-English competition between 1891 and 1971 in Scotland (electronic supplementary material, table S4; [37]), the Welsh-English competition between 1901 and 2001 in Wales (electronic supplementary material, table S5; [37]) and the Catalan-Spanish competition between 2003 and 2018 in Catalonia (electronic supplementary material, table S6; [42]). Each language case encompasses the proportions of two monolingual groups and one bilingual group at a series of time points.
Given that our model is built upon the Bayesian framework, we first determined the optimal and effective prior distributions of the model parameters for each case, according to a series of prior comparisons, prior sensitivity analyses and prior effectiveness validations (see details in the electronic supplementary material, S1, tables S7, S8 and figures S1–S3). With these optimal and effective prior distributions, we then applied our model to investigate the language competition patterns for these six cases (figure 2; table 1).
Language competition patterns of six realistic cases with different spatiotemporal scales around the world. Panels (a–f) are composite figures showing the language competition patterns of six empirical cases revealed by our model, each of which encompasses three subfigures labelled 1 to 3. Subfigure 1 is the network diagram depicting transitions among three linguistic groups. The blue circle denotes the bilingual group, while the red and green rectangles denote the minority L1 and majority L2 monolingual groups. The relative size of the rectangles for L1 and L2 monolingual groups signifies the comparative level of their social prestige. The arrow denotes the transition from one linguistic group to another, with a larger thickness representing a larger transition rate. ‘Accelerate’, ‘decelerate’ or ‘no effect’ indicates the impact of the existence of the bilingual group on the transition rates between two monolingual groups. Subfigure 2 is the box plot illustrating the posterior distributions of the transition rates between monolingual groups with and without the existence of the bilingual group. Breaks are inserted in the y-axis for better visualization. The differences among these distributions are measured by Cohen’s d. A larger value of Cohen’s d indicates a larger difference, which is annotated by ‘NS’ (0 < d < 0.2), ‘*’ (0.2 < d < 0.5), ‘**’ (0.5 < d < 0.8) and ‘***’ (d > 0.8). Subfigure 3 is the curve plot illustrating the competition dynamics among three linguistic groups. The red, green and blue curves refer to the frequencies of minority L1 monolingual, majority L2 monolingual and bilingual groups predicted by our model over time. The red dots, green rhombuses and blue squares refer to the true frequencies of the minority L1 monolingual, majority L2 monolingual and bilingual groups at available time points within the empirical data.
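The Cohen’s d comparison described in the caption can be sketched as follows, assuming the pooled-standard-deviation form of the statistic applied to two sets of posterior samples. The sample values, the function names and the text labels (‘small’, ‘medium’, ‘large’, which are Cohen’s conventional names for these bands) are illustrative stand-ins. The caption leaves the exact boundary values unassigned, so placing them in the upper band is also an assumption.

```python
import math
import statistics

def cohens_d(samples_a, samples_b):
    """Absolute standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(samples_a), len(samples_b)
    var_a = statistics.variance(samples_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(samples_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return abs(statistics.mean(samples_a) - statistics.mean(samples_b)) / pooled_sd

def effect_band(d):
    """Map a Cohen's d value onto the bands 0.2 / 0.5 / 0.8 used in the caption."""
    if d < 0.2:
        return "NS"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

# Hypothetical posterior draws of one transition rate under two models.
rate_with_bilinguals = [1.0, 2.0, 3.0, 4.0, 5.0]
rate_without_bilinguals = [2.0, 3.0, 4.0, 5.0, 6.0]

d = cohens_d(rate_with_bilinguals, rate_without_bilinguals)
print(round(d, 4), effect_band(d))
```

Applied to the values reported in the results, `effect_band(0.0849)` falls in the ‘NS’ band and `effect_band(1.7630)` in the largest band, consistent with the ‘no effect’ and ‘decelerate’ entries for those comparisons.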
(a) French versus English competition in Canada
As a result of the colonization of Canada by the French and British starting in the late fifteenth century, both French and English were introduced to this region [43]. Subsequently, competition between these two languages has spanned several centuries, with profound effects on the culture, religion and politics of Canada [43]. Under the Official Languages Act of 1969, English and French hold the same official federal status throughout the nation [44]. Nonetheless, English remains the predominant language in Canada. Based on the available data, we investigated competition between French and English in Canada from 1996 to 2017 (electronic supplementary material, table S1) [40].
As shown in figure 2a, the results of the model comparison revealed that English exhibited more prestige than French in Canada (BF = 7.4566, s2 = 0.5477 > s1 = 0.1770, Cohen’s d = 1.4871). Moreover, the shift rate from French to English (r12 = 0.2343) was higher than that from English to French (r21 = 0.1662). This indicates that in Canada, French monolinguals are more willing to acquire English and become English monolinguals. However, both shift rates between French and English monolinguals would increase in the competition model without bilinguals (r12 = 1.1560, Cohen’s d = 1.7630; r21 = 0.1987, Cohen’s d = 0.2379). This suggests that the presence of bilinguals plays an important role in slowing the shift between the two monolingual groups, and in particular in decelerating the otherwise more rapid shift from minority French to majority English. All the results are summarized in table 1 and the electronic supplementary material, table S9.
(b) English versus French competition in Montreal, Canada
To preserve French, the province of Quebec enacted the Charter of the French Language in 1977, which made French the exclusive official language within this province [45]. Here, we focused on Montreal, the largest city in Quebec and the second-largest French-speaking city in the world after Paris. We investigated the competition between English and French in Montreal based on the available data from 1996 to 2016 (electronic supplementary material, table S2) [40].
As shown in figure 2b, we did not observe a significant difference between the social prestige of English and French in Montreal (BF < 2 × 10^−16^). Moreover, the rate of English monolinguals shifting to French monolingualism (r12 = 0.3371) was faster than that of French monolinguals shifting to English monolingualism (r21 = 0.0475). This indicates that English monolinguals are more willing to acquire French and become French monolinguals in Montreal. In addition, we found that both rates would increase in the competition model excluding bilinguals (r12 = 0.4412, Cohen’s d = 1.5687; r21 = 0.1025, Cohen’s d = 1.1704). This suggests that the existence of bilinguals decelerates the shift between French and English monolingualism. All the results are summarized in table 1 and the electronic supplementary material, table S10.
(c) Spanish versus English competition in the United States
English serves as the predominant language in government and administration throughout the United States and is thus considered the de facto national language of sovereignty. However, Spanish is used as an additional language for broadcasting information and providing public services in certain states like New Mexico, Texas and California [41]. With 35 million speakers in the United States, Spanish has stood as the largest minority language in the country, primarily owing to the continuous influx of immigrants from Spanish-speaking countries [46]. We investigated the competition between English and Spanish in the United States based on the available data from 1980 to 2010 (electronic supplementary material, table S3) [41].
As shown in figure 2c, the results of the model comparison (BF > 1 × 10^3^) revealed that English has higher prestige than Spanish in the United States (s2 = 1.2553 > s1 = 0.0691, Cohen’s d = 4.1873). Nevertheless, the rate of Spanish monolinguals shifting to English monolingualism (r12 = 0.1067) was lower than that of English monolinguals shifting to Spanish monolingualism (r21 = 0.1716). This suggests that some policies may have been successfully implemented and encouraged many people to learn Spanish. Moreover, in the competition model without bilinguals, we found that the rate of Spanish monolinguals shifting to English monolingualism would speed up (r12 = 2.5362, Cohen’s d = 2.7731), yet the rate of the reverse shift would be slower (r21 = 0.0647, Cohen’s d = 1.0201). This indicates that when competing languages exhibit distinct social prestige, bilinguals can slow the shift from minority to majority languages while sometimes accelerating the shift from majority to minority languages. All the results are summarized in table 1 and the electronic supplementary material, table S11.
(d) Gaelic versus English competition in Sutherland, Scotland, United Kingdom
Gaelic is an indigenous language in the county of Sutherland in Scotland. However, this language is currently spoken only by a diminishing number of elderly islanders under the strong impact of English [47]. Accordingly, Gaelic is confronted with a severe threat of extinction within decades. Based on the available data, we investigated competition between Gaelic and English from 1891 to 1971 (electronic supplementary material, table S4) [37].
As shown in figure 2d, the results of the model comparison (BF = 2.7164) showed that English exhibited higher prestige than Gaelic in Scotland (s2 = 0.9655 > s1 = 0.7911, Cohen’s d = 3.2034). We estimated that the language shift was predominantly Gaelic monolinguals shifting to English monolingualism (r12 = 0.0113 > r21 = 0.0009). This suggests that Gaelic monolinguals are more willing to use English and become English monolinguals. Moreover, we found that the transition rates between these two monolingual groups increased significantly in the competition model without bilinguals (r12 = 0.1499, Cohen’s d = 2.2803; r21 = 0.0458, Cohen’s d = 1.9269). This implies that the presence of bilinguals would suppress the direct shift between monolingual groups, protecting minority Gaelic from rapid extinction. All the results are summarized in table 1 and the electronic supplementary material, table S12.
(e) Welsh versus English competition in Wales, United Kingdom
Welsh used to be the prevalent language in Wales. However, it has gradually become a minority language with the spread of English since the twentieth century [48]. Like other indigenous languages in the United Kingdom, Welsh is in danger of extinction owing to the overwhelming prestige of English. Based on the available data, we investigated competition between Welsh and English from 1901 to 2001 (electronic supplementary material, table S5) [37].
As shown in figure 2e, the result of the model comparison showed that Welsh and English had unequal social prestige in Wales (BF = 2.8438), with English significantly higher than Welsh (s2 = 0.1939 > s1 = 0.0130, Cohen’s d = 2.5897). Moreover, we observed that the shift between Welsh and English monolingualism was mainly Welsh monolinguals shifting to English monolingualism (r12 = 0.2137 > r21 = 0.0599). In the competition model excluding bilinguals, the rate of the shift from Welsh monolinguals to English monolinguals would be higher (r12 = 10.0877, Cohen’s d = 2.8270), yet no significant change was observed in the rate of English monolinguals shifting towards Welsh monolingualism (r21 = 0.0522, Cohen’s d = 0.1247). This suggests that bilinguals can slow down the shift from a minority language to a majority language, although they may have no effect on the shift from a majority language to a minority language. All the results are summarized in table 1 and the electronic supplementary material, table S13.
(f) Catalan versus Spanish competition in Catalonia, Spain
Catalonia is an autonomous community within Spain that acknowledges two official languages: Catalan and Spanish. However, Catalan has a deep-rooted history of political and cultural suppression, which intensified during the military dictatorship of Francisco Franco [49]. Relentless political and societal efforts have led to a revitalization of the Catalan language since the death of Francisco Franco. Using data from the Statistical Yearbook of Catalonia, we investigated the competition between Catalan and Spanish from 2003 to 2018 (electronic supplementary material, table S6) [42].
The result of the model comparison showed that Catalan and Spanish did not exhibit distinct social prestige in Catalonia (BF < 2 × 10^−16^). Moreover, the major shift pattern was Catalan monolinguals shifting to Spanish monolingualism (r12 = 0.1778 > r21 = 0.1165). In the competition model without bilinguals, we observed that the rate of this major shift would not change significantly (r12 = 0.1850, Cohen’s d = 0.0849), although the shift from Spanish monolinguals to Catalan monolingualism would be faster (r21 = 0.1358, Cohen’s d = 0.3007). This result suggests that bilingualism may not play a significant role in the shift from minority Catalan to majority Spanish. All the results are summarized in table 1 and the electronic supplementary material, table S14.
Discussion
Modelling language competition contributes to the revitalization and preservation of minority languages, facilitating the sustainable development of the cultures associated with them. In this study, we developed a novel Bayesian computational framework to explore language competition patterns across six empirical cases worldwide. Among these six cases, unequal social prestige between competing languages was identified in four: French versus English in Canada, Spanish versus English in the United States, Gaelic versus English in Scotland and Welsh versus English in Wales (table 1). By contrast, equal social prestige between competing languages was found in the remaining two cases: English versus French in Montreal and Catalan versus Spanish in Catalonia (table 1). In addition, we discovered a general language competition pattern of minority monolinguals shifting towards majority ones. Nevertheless, we found that bilingualism can help protect the minority language in two ways. Firstly, bilingualism can decelerate the direct shift from minority monolinguals to majority ones, which can prevent the rapid loss of minority languages. Secondly, bilingualism can accelerate the direct shift from majority monolinguals to minority ones when the majority language possesses higher social prestige, which can facilitate the growth of the minority monolingual population. In other words, unequal social prestige between competing languages can sometimes invert the role of bilingualism, hastening the shift from the majority to the minority monolingual group. Consequently, the existence of bilinguals plays an important role in protecting minority languages, especially when competing languages exhibit unequal social prestige.
The inequality of language social prestige identified in the competitions of Spanish, French, Gaelic and Welsh against English could be attributed to the hyper-central role of English in the global language system, with its linkage to distinct economic, social and cultural values [50]. Because of this, English holds an unshakeable status in the United States and the United Kingdom, and exerts a strong influence across a broader territory of North America, including Canada. Accordingly, Spanish, French, Gaelic and Welsh in these countries still cannot generally attain a social status comparable to that of English, although some states and regions in these countries have enacted many policies and laws to improve their social status. By contrast, the equality of language social prestige identified in the English versus French competition in Montreal and Catalan versus Spanish in Catalonia could result from the successful implementation of relevant policies and laws, which have had significant effects in maintaining the minority languages. Specifically, the monolingual official language policy implemented in Montreal effectively elevates the status of French to the extent that it has become almost as prestigious as English in this region [45]. By the same token, robust protective measures have been enacted by Catalonia in education, economy, politics and socio-cultural activities [51]. These measures have successfully revitalized Catalan and immensely elevated its social status, so that Catalan shares the same social prestige as Spanish [51].
Although language competition typically results in minority monolinguals shifting to majority ones, bilingualism plays various roles in the preservation of minority languages. Specifically, the presence of bilingualism can decelerate the shift from minority to majority monolingualism, thereby preventing the rapid decline of the minority language. The reason could be that bilinguals serve as icons who boost the confidence of minority monolinguals to transmit their language to their children. In contrast to this decelerating role, the existence of bilingualism can sometimes accelerate the shift from majority monolinguals to minority ones when the majority language exhibits higher social prestige. One plausible explanation is that a significant distinction between the social prestige levels of the competing languages can motivate the government to enact strong protective measures for the low-prestige language and encourage bilingualism. Accordingly, the existence of bilinguals may also create strong incentives for majority monolinguals to learn the minority language or for their children to learn the minority language at school. In other words, the strong social pressure brought forth by language policies, reflected by the existence of bilinguals, can disrupt the existing linguistic hierarchy within a specific region, thus reversing the language shift and leading to the revitalization of endangered languages. We also find that bilingualism may sometimes have little effect on the direct transition between two monolingual groups, such as in the Welsh versus English and Catalan versus Spanish competitions. The reason could be that the significant gaps in social prestige and speaker population size between Welsh and English leave English monolinguals with little motivation to learn Welsh, even when policies and laws support bilingualism.
By contrast, the equal social prestige between Catalan and Spanish may diminish the influence of bilingual policies and laws on the inclination of Catalan monolinguals to learn other languages. Nevertheless, we did not observe that the presence of bilingualism would accelerate the shift from minority monolinguals to majority ones. In summary, bilingualism can either positively play a protective role for the minority language or have no significant impact, but it will not contribute to the loss of the minority language.
Our findings favour some existing language competition theories while challenging others. Specifically, they align with Mufwene’s ecological theory [13], which posits that language competition is shaped by prestige and socioeconomic factors. For instance, a language associated with greater power or educational access tends to dominate, potentially driving shifts away from, or the extinction of, less prestigious competitors. However, as Mufwene underscores, such outcomes are probabilistic and context-dependent, reflecting adaptation to evolving environments rather than inherent linguistic superiority. Consistent with this view, our models reveal that bilingualism can persist despite prestige imbalances in certain scenarios. This supports Mufwene’s view that bilingualism mitigates outright dominance and enables feature-level fusion, thereby contributing to the preservation of minority languages within dynamic ecologies [13]. By contrast, our findings diverge from Fishman’s traditional theory [52], which views bilingualism solely as a transitory stage from minority monolingualism to majority dominance, during which prestige disparities between languages are amplified. Our observations, echoing Mufwene’s theory, instead demonstrate that the role of bilingualism is fluid rather than fixed. Under targeted institutional support and proactive language status planning, its role can be inverted, fostered by shifting ideologies, elevated minority language valuation and broader public-domain functions [53,54]. This observation challenges Fishman’s Graded Intergenerational Disruption Scale, suggesting that reversing language shift does not follow a linear, teleological progression through intergenerational transmission. Instead, it favours a more intricate, cyclical process that may engender novel power dynamics among minority speaker groups [55].
Although our model is also an extension of the AS model, it differs in several notable ways from other competition models extended from the AS model, especially Kandler’s model (a detailed comparison with Kandler’s model can be found in the electronic supplementary material, S2 and table S15). These distinctions lie primarily in the parametric estimation strategy and the model form, and they bring both advantages and limitations. Regarding the parametric estimation strategy, our model is built upon the Bayesian framework, which generates a posterior distribution for each parameter rather than a single value. This enables our model to quantify the uncertainties and credible ranges of the model parameters. It in turn allows the roles of socio-linguistic factors to be identified through rigorous statistical examinations rather than by simply comparing parameter magnitudes, as done in Kandler’s model, which reduces the risk of misinterpreting the effects of socio-linguistic factors owing to random noise (e.g. sampling bias). Regarding the model form, our model rests upon ordinary differential equations (ODEs) derived from the Markov process, which assumes constant transition rates among different linguistic groups. By comparison, Kandler’s model takes the more complex form of partial differential equations (PDEs), which allows the languages to diffuse across space and the transition rates to vary across time. Accordingly, our model is limited to simulating the average language competition pattern over a given time period, rather than the dynamic competition pattern varying across time and space. Noting this limitation, we further validated the robustness of our model against different time windows by dropping different percentages of the time points in the empirical cases. 
The results showed that, although the estimated parameter values differed, the primary competition patterns identified by our model remained stable across time windows, particularly the relative social prestige of the competing languages (detailed results of these validations are available in the electronic supplementary material, S3 and figures S4, S5).
Despite the limitations of the current version of our model, the advantages of Kandler’s model suggest two directions for future improvement. Firstly, we can extend our model from the ODE to the PDE framework to simulate more complex language competition patterns in both space and time. Secondly, we can replace the constant transition rates in our model with functions that vary across time. This would allow our model to capture dynamic competition patterns over time and to identify the dynamic effects, rather than the average effects, of socio-linguistic factors. However, a limitation common to AS-extended models, namely the definition and estimation of the social prestige parameters, also warrants further refinement. Specifically, social prestige is an abstract parameter that measures the social or economic opportunities afforded to the speakers of a certain language. Accordingly, this parameter can only be estimated from empirical data by model fitting, and it therefore lacks a concrete social and economic meaning. In future studies, we could incorporate empirical social and economic data to aid in the definition and estimation of this parameter. Moreover, other socio-linguistic factors, such as language learning difficulty and structural advantage, can exert pronounced influences on language competition [20] and should be considered in future extensions of our model. Nevertheless, we hope that our Bayesian language competition model can enrich our understanding of the hidden mechanisms of language competition and facilitate the protection and revitalization of minority languages, as well as the cultures associated with them. For instance, it can assist language policymakers in evaluating the effectiveness of bilingual incentive policies by analysing the role of bilingualism in minority language preservation. 
Similarly, it can help determine whether current policies aimed at enhancing the status of minority languages are effective or require adjustment by examining the equality of social prestige between languages across different time periods.
Material and methods
(a) Language competition data
We collected the time series data of six empirical language competition cases from public repositories. For French versus English (Canada), the data cover the proportions of French and English monolinguals and bilinguals across Canada from 1996 to 2017 at 5-year intervals (electronic supplementary material, table S1; [40]). The data for French versus English (Montreal) similarly track these proportions in Montreal from 1996 to 2016 at 5-year intervals (electronic supplementary material, table S2; [40]). For Spanish versus English, the data cover the proportions of Spanish and English monolinguals and bilinguals in the United States from 1980 to 2010 at 10-year intervals (electronic supplementary material, table S3; [41]). For Gaelic versus English, the data cover the proportions of Gaelic and English monolinguals and bilinguals in Scotland, United Kingdom, from 1891 to 1971 at 10-year intervals (electronic supplementary material, table S4; [37]). For Welsh versus English, the data cover the proportions of Welsh and English monolinguals and bilinguals in Wales, United Kingdom, from 1901 to 2001 at 10-year intervals (electronic supplementary material, table S5; [37]). For Catalan versus Spanish, the data cover the proportions of Catalan and Spanish monolinguals and bilinguals in Catalonia, Spain, from 2003 to 2018 at 5-year intervals (electronic supplementary material, table S6; [42]).
(b) Language competition model
Equal-prestige model (EPM). The EPM is derived from a continuous-time Markov chain model and has two forms. One includes bilinguals, as shown in equation (4.1), while the other does not, as shown in equation (4.2). The parameters and notation are described in table 2:
Unequal-prestige model (UPM). The UPM can be regarded as an extension of the EPM that accounts for unequal social prestige. Like the EPM, the UPM has two forms. One includes bilinguals, as shown in equation (4.3), while the other does not, as shown in equation (4.4). The parameters and notation are described in table 2:
In both EPM and UPM models, and denote the proportions of speakers of two monolingual groups of and languages at time , respectively. denotes the proportions of the bilingual group. represents the total number of the bilingual group and two monolingual groups. and are the social prestige parameters that reflect the economic or social opportunity and status offered to the speakers of and languages, respectively. A higher value of indicates a higher prestige of under social pressure. Specifically, if , the UPM will degenerate as an EPM. In other words, the EPM is the special case of the UPM when . and , which satisfy .
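To make the idea of constant transition rates among two monolingual groups and one bilingual group concrete, the following Python sketch integrates a generic three-state continuous-time Markov chain with a fourth-order Runge-Kutta scheme. It is an illustration only and does not reproduce equations (4.1) to (4.4): the group labels `x`, `y` and `b`, the rate names in `EXAMPLE_RATES` and all numerical values are hypothetical, and no prestige parameter is included.

```python
from typing import Dict, List, Tuple

# (proportion of X monolinguals, proportion of Y monolinguals, proportion of bilinguals)
State = Tuple[float, float, float]

# Hypothetical constant transition rates per unit time; "xy" means X -> Y, and so on.
EXAMPLE_RATES: Dict[str, float] = {
    "xy": 0.010, "yx": 0.002,   # direct shifts between the monolingual groups
    "xb": 0.030, "yb": 0.005,   # monolinguals becoming bilingual
    "bx": 0.004, "by": 0.020,   # bilinguals becoming monolingual
}


def derivative(state: State, rates: Dict[str, float]) -> State:
    """Net flow into each group; the three flows always sum to zero."""
    x, y, b = state
    dx = -(rates["xy"] + rates["xb"]) * x + rates["yx"] * y + rates["bx"] * b
    dy = -(rates["yx"] + rates["yb"]) * y + rates["xy"] * x + rates["by"] * b
    db = -(rates["bx"] + rates["by"]) * b + rates["xb"] * x + rates["yb"] * y
    return (dx, dy, db)


def _shift(state: State, slope: State, h: float) -> State:
    """Move a state along a slope by step h."""
    return (state[0] + h * slope[0], state[1] + h * slope[1], state[2] + h * slope[2])


def rk4_step(state: State, rates: Dict[str, float], dt: float) -> State:
    """Advance the system by one classical Runge-Kutta step."""
    k1 = derivative(state, rates)
    k2 = derivative(_shift(state, k1, dt / 2.0), rates)
    k3 = derivative(_shift(state, k2, dt / 2.0), rates)
    k4 = derivative(_shift(state, k3, dt), rates)
    new = [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
           for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return (new[0], new[1], new[2])


def simulate(initial: State, rates: Dict[str, float],
             t_end: float, dt: float = 0.1) -> List[State]:
    """Return the trajectory from time 0 to t_end, including the initial state."""
    trajectory = [initial]
    for _ in range(int(round(t_end / dt))):
        trajectory.append(rk4_step(trajectory[-1], rates, dt))
    return trajectory


if __name__ == "__main__":
    final = simulate((0.2, 0.7, 0.1), EXAMPLE_RATES, t_end=50.0)[-1]
    print("proportions at t = 50:", final)
```

Setting the four rates that involve `b` to zero and starting from `b = 0` gives a two-group variant without bilinguals, which mirrors the distinction between the model forms with and without a bilingual group.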
(c) Bayesian parametric estimation for the language competition model
To estimate the parameters of the language competition model, we performed Bayesian inference based on the Markov chain Monte Carlo (MCMC) method [56]. Here, we exemplify the estimation procedure using equation (4.3). First, we established the likelihood function for the competition model as shown in equation (4.5). To be specific, we let be the proportions of the two monolingual groups and a bilingual group estimated by the language competition model at time . We let be empirical proportions of the two monolingual groups and one bilingual group at time . Accordingly, and denote the empirical and estimated proportions of two monolingual groups and bilingual group across time points . We assumed that follows the matrix normal distribution , where follows the standard normal distribution. Accordingly, the likelihood function of for the set of all the unknown parameters can be constructed as equation (4.5):
Second, we established the posterior distribution of the parameter set based on Bayes’ theorem. Given the prior distribution of the parameter set, the posterior distribution can be calculated following equation (4.6):
Third, we used the MCMC method to draw samples of the parameter set from its posterior distribution. In practice, the prior distributions for the six cases are listed in the electronic supplementary material, table S7; they were determined through prior comparisons, prior sensitivity analyses and prior effectiveness validations (see details in the electronic supplementary material, S1, tables S7, S8 and figures S1–S3). Moreover, for sampling efficiency, we ran the MCMC with a case-specific number of iterations sufficient to guarantee convergence (electronic supplementary material, table S7). The first 60% of samples were discarded as burn-in, and the remaining samples were used to estimate the mean values and confidence intervals of the parameters. The differential equations of the competition model were solved using the ode function of the deSolve package (1.4.0) in R (4.0.3) [57]. The MCMC method was implemented using the MCMC function of the fmcmc package (0.5-2) in R (4.0.3) [58].
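The three steps above (likelihood, posterior, sampling) can be illustrated with a minimal Python sketch. It is not the authors' implementation, which relied on the deSolve and fmcmc packages in R: here a plain random-walk Metropolis sampler is swapped in, the competition model is replaced by a one-parameter toy decay curve with a closed-form solution, and the data, the uniform prior, the noise level `SIGMA` and the step size are all hypothetical. Only the 60% burn-in fraction is taken from the text.

```python
import math
import random
from typing import Callable, List, Tuple

# Toy data (hypothetical): a minority proportion decaying as X0 * exp(-k * t).
TIMES = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
X0, TRUE_K, SIGMA = 0.4, 0.05, 0.01
OBSERVED = [X0 * math.exp(-TRUE_K * t) for t in TIMES]


def log_posterior(k: float) -> float:
    """Log posterior up to a constant: uniform prior on (0, 1), Gaussian likelihood."""
    if not 0.0 < k < 1.0:
        return float("-inf")  # outside the prior support
    squared_error = sum((obs - X0 * math.exp(-k * t)) ** 2
                        for t, obs in zip(TIMES, OBSERVED))
    return -squared_error / (2.0 * SIGMA ** 2)


def metropolis(log_post: Callable[[float], float], start: float,
               n_iter: int, step_sd: float, seed: int) -> List[float]:
    """Random-walk Metropolis sampler for a single parameter."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


def summarise(chain: List[float],
              burn_in_fraction: float = 0.6) -> Tuple[List[float], float, Tuple[float, float]]:
    """Discard the burn-in, then return kept samples, their mean and a 95% interval."""
    kept = chain[int(round(len(chain) * burn_in_fraction)):]
    ordered = sorted(kept)
    mean = sum(kept) / len(kept)
    lower = ordered[int(0.025 * (len(ordered) - 1))]
    upper = ordered[int(0.975 * (len(ordered) - 1))]
    return kept, mean, (lower, upper)


CHAIN = metropolis(log_posterior, start=0.2, n_iter=5000, step_sd=0.005, seed=1)
KEPT, POSTERIOR_MEAN, INTERVAL = summarise(CHAIN)

if __name__ == "__main__":
    print("posterior mean of k:", POSTERIOR_MEAN, "95% interval:", INTERVAL)
```

The sampler returns the whole chain so that the burn-in fraction can be varied afterwards, which makes it easy to check how sensitive the summary is to that choice.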
(d) Evaluating the equality of social prestige
Evaluating the equality of social prestige was accomplished by comparing equations (4.1) and (4.3). This comparison was based on the Bayes factor (BF) [38], a widely used metric for comparing Bayesian models. To be specific, the BF between equations (4.1) and (4.3) was calculated from the estimated parameter values of the two equations, as in equation (4.7):
According to the criteria proposed by Harold Jeffreys [59], if then equation (4.3) is not more strongly supported by the data than equation (4.1). In other words, language competition is not significantly affected by social pressure. By contrast, indicates that equation (4.3) is more strongly supported by the data than equation (4.1). In other words, language competition is significantly influenced by social pressure.
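As a rough illustration of this kind of model comparison, the Python sketch below converts two log model evidences into a Bayes factor and grades it on Jeffreys' half-powers-of-ten scale. It is a sketch under stated assumptions, not the calculation of equation (4.7): the function names, the use of log evidences as inputs, the grade labels and the example values are all hypothetical, and the cutoff applied in the study is not restated here.

```python
import math


def bayes_factor(log_evidence_alternative: float, log_evidence_null: float) -> float:
    """Bayes factor of the alternative model over the null model."""
    # Working with the difference of logs avoids underflow of tiny evidences.
    return math.exp(log_evidence_alternative - log_evidence_null)


def jeffreys_grade(bf: float) -> str:
    """Grade a Bayes factor on Jeffreys' scale (cut points at 10**0.5, 10, 10**1.5, 100)."""
    if bf < 1.0:
        return "favours the null model"
    if bf < 10 ** 0.5:
        return "barely worth mentioning"
    if bf < 10.0:
        return "substantial"
    if bf < 10 ** 1.5:
        return "strong"
    if bf < 100.0:
        return "very strong"
    return "decisive"


if __name__ == "__main__":
    # Hypothetical log evidences for an unequal-prestige and an equal-prestige fit.
    bf = bayes_factor(-100.0, -103.0)
    print(round(bf, 2), jeffreys_grade(bf))
```

Taking log evidences as inputs keeps the comparison step separate from the question of how each model's evidence is estimated, which depends on the sampler and the priors.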
(e) Identifying the role of bilinguals
The role of bilinguals was identified via a comparison of the transition rates between the two monolingual groups. This comparison was performed using Cohen’s d, a robust metric that is less sensitive to the sample size than the p‐value [39]. Cohen’s d between two vectors was calculated using equation (4.8):
Here, equation (4.8) uses the mean values, standard deviations and sample sizes of the two vectors. According to Sawilowsky [60], Cohen’s d < 0.2 indicates a non-significant difference between the two vectors, 0.2 ≤ d < 0.5 a small difference, 0.5 ≤ d < 0.8 a moderate difference and d ≥ 0.8 a large difference.
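The effect-size comparison can be sketched in Python as follows. The sketch assumes the standard pooled-standard-deviation form of Cohen's d, which is consistent with the quantities listed above (means, standard deviations and sample sizes) but is not confirmed to be the exact form of equation (4.8). The function names and example vectors are hypothetical, and the labels follow the thresholds quoted in the text, applied to the absolute value of d.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Cohen's d using the pooled sample standard deviation of the two vectors."""
    n_a, n_b = len(a), len(b)
    pooled_variance = (((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2)
                       / (n_a + n_b - 2))
    return (mean(a) - mean(b)) / math.sqrt(pooled_variance)


def effect_label(d: float) -> str:
    """Map |d| to the categories quoted in the text."""
    size = abs(d)
    if size < 0.2:
        return "non-significant"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"


if __name__ == "__main__":
    # Hypothetical vectors, e.g. posterior samples of one transition rate
    # estimated with and without a bilingual group.
    with_bilinguals = [1.0, 2.0, 3.0, 4.0, 5.0]
    without_bilinguals = [2.0, 3.0, 4.0, 5.0, 6.0]
    d = cohens_d(with_bilinguals, without_bilinguals)
    print(round(d, 3), effect_label(d))
```

The sign of d shows which vector has the larger mean, so it can indicate the direction of the effect (acceleration or deceleration) while the absolute value gives its size.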
References
- 1. Bialystok E, Craik FIM, Luk G. 2012 Bilingualism: consequences for mind and brain. Trends Cogn. Sci. 16, 240–250. (doi:10.1016/j.tics.2012.03.001)
- 2. Bialystok E, Craik FIM. 2022 How does bilingualism modify cognitive function? Attention to the mechanism. Psychon. Bull. Rev. 29, 1246–1269. (doi:10.3758/s13423-022-02057-5)
- 3. Greve W, Koch M, Rasche V, Kersten K. 2024 Extending the scope of the ‘cognitive advantage’ hypothesis: multilingual individuals show higher flexibility of goal adjustment. J. Multiling. Multicult. Dev. 45, 822–838. (doi:10.1080/01434632.2021.1922420)
- 4. Dewaele JM, Wei L. 2012 Multilingualism, empathy and multicompetence. Int. J. Multiling. 9, 352–366. (doi:10.1080/14790718.2012.714380)
- 5. Edwards J. 2009 Language and identity: an introduction. Cambridge, UK: Cambridge University Press.
- 6. Harrison KD. 2007 When languages die: the extinction of the world’s languages and the erosion of human knowledge. Oxford, UK: Oxford University Press.
- 7. UNESCO. 2022 International decade of indigenous languages 2022–2032. See https://www.unesco.org/en/decades/indigenous-languages.
- 8. Bromham L, Dinnage R, Skirgård H, Ritchie A, Cardillo M, Meakins F, Greenhill S, Hua X. 2022 Global predictors of language endangerment and the future of linguistic diversity. Nat. Ecol. Evol. 6, 163–173. (doi:10.1038/s41559-021-01604-y)
