A new paper by Bentz et al. is available for preview here. It reports a correlation between the lexical diversity of languages and the proportion of non-native speakers in a population. This is particularly relevant to the work by Lupyan & Dale (2010), who found that the morphological complexity of a language correlates with the size of its speaker population. It’s reasonable to expect that the percentage of second language speakers within a population will be affected by the size of that population. There has been a lot of talk on this blog in the past about correlations between population structure and linguistic structure. There’s a pretty comprehensive page here covering some of the (spurious) correlations discussed on the blog. Bentz et al. are, however, aware of the criticisms raised by Sean and James in their PLOS ONE paper. They are all for a pluralistic approach and state that “there needs to be independent evidence for a causal relationship” before covering qualitative and quantitative evidence from other areas.
Here is the abstract for those interested:
Explaining the diversity of languages across the world is one of the central aims of historical and evolutionary linguistics. This paper presents a quantitative approach to measure and model a central aspect of this variation, namely the lexical diversity of languages. Lexical diversity is defined as the breadth of word forms used to encode constant information content. It is measured by means of comparing word frequency distributions for parallel translations of hundreds of languages. The measure is based on indices used in studies of biodiversity and in quantitative linguistics, i.e. Zipf-Mandelbrot’s law, Shannon entropy and type-token ratios. Three statistical models are given to elicit potential factors driving languages towards less diverse lexica. It is shown that the ratio of non-native speakers in languages predicts lower lexical diversity. This suggests that theories focusing on native acquisition as driving force of language change are incomplete. Instead, we argue that languages are information encoding systems shaped by the varying needs of their speakers. Language evolution and change should be modeled as the co-evolution of multiple intertwined adaptive systems: On one hand, the structure of human societies and human learning capabilities, and on the other, the structure of language.
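To make two of the indices named in the abstract a bit more concrete, here is a rough Python sketch of a type-token ratio and the Shannon entropy of a word frequency distribution. This is not the authors' code or their exact measure: the function names, the whitespace tokenization and the toy sentence are all assumptions made purely for illustration, and the paper works with parallel translations across hundreds of languages rather than a single string.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by the number of tokens."""
    if not tokens:
        raise ValueError("need at least one token")
    return len(set(tokens)) / len(tokens)


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the word frequency distribution."""
    if not tokens:
        raise ValueError("need at least one token")
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum(p * log2(p)) over the relative frequency p of each word type
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy input (an assumption for illustration, not data from the paper).
# Lowercasing and splitting on whitespace is a deliberately naive tokenizer.
toy_text = "the dog sees the cat and the cat sees the dog"
toy_tokens = toy_text.lower().split()

print("TTR:", type_token_ratio(toy_tokens))
print("Entropy (bits):", shannon_entropy(toy_tokens))
```

On both measures, a text that spreads the same content over more distinct word forms scores higher. That is the intuition behind comparing parallel translations, where the information content is held roughly constant across languages.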