Sean – Page 16 – Replicated Typo

A spin glass model of cultural consensus

Does your social network determine your rational rationality? When trying to co-ordinate with a number of other people on a cultural feature, the locally rational thing to do is to go with the majority. However, in certain situations it might make sense to choose the minority feature. This means that learning multiple features might be rational in some situations, even if there is a pressure against redundancy. I’m interested in whether there are situations in which it is rational to be bilingual and whether bilingualism is stable over long periods of time. Previous models suggest that bilingualism is not stable (e.g. Castello et al. 2007) and is therefore an irrational strategy (or at least not a primary strategy), but these were based on locally rational learners.

This week we had a lecture from Simon DeDeo on system-wide timescales in the behaviour of macaques. He talked about Spin Glasses and rationality, which got me thinking. A Spin Glass is a kind of magnetised material where the ‘spin’ or magnetism (plus or minus) of the molecules does not reach a consensus, but flips about chaotically. This happens when the structure of the material creates ‘frustrated’ triangles where a molecule is trying to co-ordinate with other molecules with opposing spins, making it difficult to resolve the tensions. Long chains of interconnected frustrated triangles can cause system-wide flips on the order of hours or days and are difficult to study both in models (Ising model) and in the real world.
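To make the idea of a ‘frustrated’ triangle concrete, here is a minimal sketch (my own toy example, not something from the lecture). It assumes the usual Ising-style energy, where a positive coupling means two nodes want to agree and a negative coupling means they want to disagree. The node names and coupling values are hypothetical.

```python
from itertools import product

# Couplings between the three nodes of a triangle (hypothetical values):
# A wants to agree with B and with C, but B and C want to disagree.
FRUSTRATED = {("A", "B"): 1, ("A", "C"): 1, ("B", "C"): -1}
# For comparison: everyone wants to agree with everyone.
UNFRUSTRATED = {("A", "B"): 1, ("A", "C"): 1, ("B", "C"): 1}


def energy(spins, couplings):
    """Ising-style energy: lower means more bonds are satisfied."""
    return -sum(j * spins[a] * spins[b] for (a, b), j in couplings.items())


def satisfied_bonds(spins, couplings):
    """Count the pairs whose preference (agree or disagree) is met."""
    return sum(1 for (a, b), j in couplings.items() if j * spins[a] * spins[b] > 0)


def all_configurations():
    """Every assignment of +1/-1 to the three nodes."""
    for values in product((-1, 1), repeat=3):
        yield dict(zip("ABC", values))


def max_satisfied(couplings):
    return max(satisfied_bonds(s, couplings) for s in all_configurations())


def ground_states(couplings):
    """All configurations that reach the lowest possible energy."""
    configs = list(all_configurations())
    lowest = min(energy(s, couplings) for s in configs)
    return [s for s in configs if energy(s, couplings) == lowest]


if __name__ == "__main__":
    for name, couplings in (("frustrated", FRUSTRATED), ("unfrustrated", UNFRUSTRATED)):
        print(name, max_satisfied(couplings), len(ground_states(couplings)))
```

In the frustrated triangle no assignment satisfies all three pairs, and several different assignments tie for the lowest energy. That tie is what lets the system keep flipping rather than settling on a consensus.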

Language Evolves in R, not Python: An apology

One of the risks of blogging is that you can fire off ideas into the public domain while you’re still excited about them and haven’t really tested them all that well. Last month I blogged about a random walk model of linguistic complexity (the current post won’t make much sense unless you’ve read the original). Essentially, it was trying to find a baseline for the expected correlation between a population’s size and a measure of linguistic complexity. It assumed that the rate of change in the linguistic measure was linked to population size. Somewhat surprisingly, correlations between the two measures (similar to the kind described in Lupyan & Dale, 2010) emerged, despite there being no directional link.

However, these observations were made on the basis of a relatively small sample size. In order to discover why the model was behaving like this, I needed to run a lot more tests. The model was running slowly in Python, so I transliterated it to R. When I did, the results were very different: in the first model, an inverse relationship between the population size and the rate of change of linguistic complexity yielded a negative correlation between population size and linguistic complexity (perhaps explaining results such as Lupyan & Dale’s). However, in the R model this did not occur. In fact, significant correlations only appeared 5% of the time, with that 5% split evenly between positive and negative correlations. That is, the baseline model has a standard confidence interval, not the much stricter one I had suggested in the last post.
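For readers who want to try this kind of baseline check themselves, here is a rough sketch of the idea. It is not the R script linked at the end of this post, and every detail is an assumption made for illustration: log-uniform population sizes, Gaussian steps whose size shrinks with population size, complexity clamped to the range 0 to 1, and a Pearson correlation tested with Student's t (the original may well have used a rank correlation).

```python
import math
import random

N_COMMUNITIES = 50
# Two-tailed 5% critical value of Student's t for N_COMMUNITIES - 2 = 48
# degrees of freedom. Change it if you change N_COMMUNITIES.
T_CRITICAL = 2.0106


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def run_once(rng, n_steps=100):
    """One run: every community random-walks, then correlate size and complexity."""
    log_pop = [rng.uniform(2.0, 6.0) for _ in range(N_COMMUNITIES)]
    complexity = [rng.random() for _ in range(N_COMMUNITIES)]
    for _ in range(n_steps):
        for i, lp in enumerate(log_pop):
            # Smaller populations take bigger steps; the direction is unbiased.
            step = rng.gauss(0.0, 0.02 / lp)
            complexity[i] = min(1.0, max(0.0, complexity[i] + step))
    return pearson_r(log_pop, complexity)


def baseline(n_runs=200, seed=1):
    """Count runs with a significant positive or negative correlation."""
    rng = random.Random(seed)
    positive = negative = 0
    for _ in range(n_runs):
        r = run_once(rng)
        t = r * math.sqrt((N_COMMUNITIES - 2) / (1.0 - r * r))
        if t > T_CRITICAL:
            positive += 1
        elif t < -T_CRITICAL:
            negative += 1
    return positive, negative, (positive + negative) / n_runs


if __name__ == "__main__":
    print(baseline())
```

If population size only affects the rate of change and not its direction, the proportion of significant runs should sit near the nominal 5%, with no systematic preference for either sign.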

Why was this happening? In short: Rounding errors and small sample sizes.

I checked the Python code, but couldn’t find a bug, so the correlations really were appearing, and really were favouring a negative correlation. Here’s my best explanation: First, the sample of runs was too small to capture the proper distribution. However, strong correlations were appearing. This could be because although the linguistic complexity measure started out pretty randomly distributed, the individual communities were synchronising at the maximum and minimum of the range as they bumped up against it. This caused temporary clusters in the low ranges where the linguistic complexity was changing rapidly (and therefore more likely to synchronise), creating tied ranks in the corners. In addition to this, the Python script I was using had a lower bit depth for its numbers than R, so was more prone to rounding errors. I have to assume as well that my Python script somehow favoured numbers closer to 1 than to 0. It’s still not a very satisfactory explanation, but the conclusion remains that, as one would expect, affecting just the rate of change of linguistic complexity does not produce correlations.

Modelling evolutionary systems often runs into these kinds of problems: The search spaces are often intractable for some approaches. Also I am not, as a mere linguist, aware of some of the more advanced computational techniques. It’s one of the reasons that Evolutionary Linguistics requires a pluralist approach and tools from many different disciplines.

It’s embarrassing to have to correct previous statements, but I guess that’s what Science is about. In the blogging age ideas can get out before they’re fully tested and potentially affect other work. This has its advantages – good ideas can get out faster. But it also means that the reader must be more critical in order to catch poor ideas like the one I’m correcting here.

Sorry, Science.

Here’s a link to the R script (25 lines of code!).

Lupyan G, & Dale R (2010). Language structure is partly determined by social structure. PLoS ONE, 5 (1). PMID: 20098492

Passwords adapt to hacking technology

One of this week’s xkcd comics makes the point that combinatorial passwords (sequence of common words) may be better than holistic ones (semi-random string). This may be because we’re fooled into thinking that a password that is difficult to remember will be difficult to guess. This turns out not to be the case. I’m currently thinking about whether combinatoriality would emerge from an iterated learning chain even if the participants were told to give answers that they thought nobody else would give.
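The arithmetic behind the comic's point can be sketched in a few lines. The figures here are assumptions taken from the comic's example as I understand it: four words drawn at random from a list of 2048 common words, an estimate of about 28 bits for a typical ‘semi-random’ password built from a mangled word, and an attacker making 1000 guesses per second.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def entropy_bits(choices_per_symbol, n_symbols):
    """Bits of entropy when each symbol is picked uniformly at random."""
    return n_symbols * math.log2(choices_per_symbol)


def years_to_exhaust(bits, guesses_per_second=1000):
    """Time to try every possibility at the assumed guessing rate."""
    return 2 ** bits / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    combinatorial = entropy_bits(2048, 4)  # four common words
    holistic = 28  # the comic's estimate for a mangled-word password
    print(combinatorial, years_to_exhaust(combinatorial))
    print(holistic, years_to_exhaust(holistic))
```

The estimate only holds if the words really are chosen at random. A phrase that a person finds meaningful will have fewer bits than this, which is exactly why it would be interesting to see what an iterated learning chain produces.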

Cultural Evolution and the Impending Singularity: The Movie

Here’s a video of a talk I gave at the Santa Fe Institute’s Complex Systems Summer School (written with roboticist Andrew Tinka; check out his talk about his fleet of floating robots). The talk was a response to the “Evolution Challenge”:

Has Biological Evolution come to an end?
Is belief an emergent property?
Will advanced computers use H. sapiens as batteries?

I also blogged about a part of this talk here (why a mad scientist’s attempt at creating A.I. to make new scientific discoveries was doomed).

The talk was given a prize for best talk by the judging panel, which included David Krakauer, Tom Carter and best-selling author Cormac McCarthy. At several points in the talk, I completely forgot what I was supposed to say because the people filming the event asked me to set my screen up in such a way that I couldn’t see my notes.

The Bilingual paradox in Language Evolution: Top down versus bottom up approaches

When thinking about bilingualism and language evolution, there appears to be a paradox: Children are adept at learning more than one language at a time and there are many bilingual societies in the world. However, pressures on memory and redundancy make it unclear what the adaptive advantage of a cognitive capacity for learning multiple languages at an early stage of language evolution would be. For instance, Hagen (2008) has argued that a bilingual ability would not have been adaptive in early societies and so could not have been selected for. Furthermore, many models have suggested that bilingualism is an unstable trait in a society (e.g. Castello et al., 2008). How can we account for the evolution of this ability? Would an early population of language users most likely be monolingual or bilingual? Here, I take a top down and a bottom up approach and show that they tend to lead to two different conclusions.

Sonority and Sex: Why smaller communities are louder

Through this post on Sprogmuseet about Atkinson’s analysis of the out of Africa hypothesis, I found an article by Ember & Ember (2007) (who also quantified the link between colour lexicon size and distance from the equator, see my post here) on sonority and climate. The article extends work by Fought et al. (2004) which finds that a language’s sonority is related to climate. Sonority is a measure of amplitude (loudness) and is greater for vowels than for consonants (for example, see here). Basically, the warmer the climate, the greater the sonority of the phoneme inventory of the population. The theory is that “people in warmer climates generally spend more time outdoors and communicate at a distance more often than people in colder climates”.

Linguistic diversity and traffic accidents

I was thinking about Daniel Nettle’s model of linguistic diversity which showed that linguistic variation tends to decline even with a small amount of migration between communities. I wondered if statistics about population movement would correlate with linguistic diversity, as measured by the Greenberg Diversity Index (GDI) for a country (see below). However, this is a cautionary tale about obsession and use of statistics. (See bottom of post for link to data).
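For reference, Greenberg's index is usually defined as the probability that two people picked at random from a country have different mother tongues, i.e. one minus the sum of the squared speaker proportions. Here is a minimal sketch of that calculation. The speaker counts are made up for illustration and are not the data used in this post.

```python
def greenberg_diversity_index(speakers):
    """1 - sum(p_i^2), where p_i is the share of people speaking language i.

    `speakers` maps a language name to its number of first-language speakers.
    """
    total = sum(speakers.values())
    return 1.0 - sum((count / total) ** 2 for count in speakers.values())


if __name__ == "__main__":
    # Hypothetical countries, for illustration only.
    print(greenberg_diversity_index({"lang_a": 100}))
    print(greenberg_diversity_index({"lang_a": 50, "lang_b": 50}))
    print(greenberg_diversity_index({"lang_a": 90, "lang_b": 5, "lang_c": 5}))
```

A country where everyone shares one language scores 0, and the score approaches 1 as the population is split across more and more equally sized language groups.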

SpecGram: Phonotronic Energy Reserves and the Tiny Phoneme Hypothesis

An article in this month’s Speculative Grammarian considers whether phonotronic energy could account for the results of Atkinson (2011) (commented on here) which support a serial founder effect on phoneme inventory.

The article demonstrates two things:

The effects on phonotronic energy correlate well with phoneme inventory size
I’m not the only one doing bonkers correlations

A random walk model of linguistic complexity

EDIT: Since writing this post, I have discovered a major flaw with the conclusion which is described here.

One of the problems with large-scale statistical analyses of linguistic typologies is the temporal resolution of the data. Because we typically only have single measurements for populations, we can’t see the dynamics of the system. A correlation between two variables that exists now may be an accident of more complex dynamics. For instance, Lupyan & Dale (2010) find a statistically significant correlation between a linguistic population’s size and its morphological complexity. One hypothesis is that the languages of larger populations are adapting to adult learners as they come into contact with other languages. Hay & Bauer (2007) also link demography with phonemic diversity. However, it’s not clear how robust these relationships are over time, because of a lack of data on these variables in the past.

To test this, a benchmark is needed. One method is to use careful statistical controls, such as controlling for the area that the language is spoken in, the density of the population etc. However, these data also tend to be synchronic. Another method is to compare the results against the predictions of a simple model. Here, I propose a simple model based on a dynamic where cultural variants in small populations change more rapidly than those in large populations. This models the stochastic nature of small samples (see the introduction of Atkinson, 2011 for a brief review of this idea). This model tests whether chaotic dynamics lead to periods of apparent correlation between variables. Source code for this model is available at the bottom.
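As a quick illustration of the core dynamic only (this is a separate toy sketch, not the model's source code), the snippet below random-walks a single cultural variant and compares how far it drifts in a small and a large population. The step size of 1/sqrt(N) is my assumption, chosen to mirror the sampling error of a mean in a sample of size N. The function names and parameter values are hypothetical.

```python
import random
import statistics


def walk(population, n_steps, rng):
    """Final position of an unbiased random walk with size-dependent steps."""
    value = 0.0
    for _ in range(n_steps):
        # Assumed dynamic: step size shrinks as the population grows.
        value += rng.gauss(0.0, 1.0 / population ** 0.5)
    return value


def spread(population, n_trials=500, n_steps=100, seed=0):
    """Standard deviation of the final positions over many independent walks."""
    rng = random.Random(seed)
    finals = [walk(population, n_steps, rng) for _ in range(n_trials)]
    return statistics.pstdev(finals)


if __name__ == "__main__":
    print("small population:", spread(100))
    print("large population:", spread(10_000))
```

Under this assumption the expected spread scales as 1/sqrt(N), so small populations wander much further in the same amount of time even though neither walk has a preferred direction.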

Linguistic interactions in the UK

I just heard a talk by social network creator extraordinaire Clio Andris about redefining regional boundaries in the UK based on telecommunications data. Her group took data from 12 billion telephone calls made over the space of a month and created a social network based on this (Ratti et al., 2010). This network was then used to calculate how closely connected two neighbouring locations were. By optimising the spectral modularity, the best-fitting boundaries could be defined.
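To give a feel for what is being optimised, here is a small sketch that computes the standard Newman modularity score Q for a given split of a weighted network. It only scores a partition; it is not the spectral optimisation used in the paper. The locations and call counts are invented for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: {(u, v): weight}, each pair listed once, no self-loops.
    community: {node: community label}.
    """
    total_weight = sum(edges.values())  # m, the total edge weight
    strength = defaultdict(float)  # summed node strengths per community
    internal = defaultdict(float)  # edge weight falling inside each community
    for (u, v), weight in edges.items():
        strength[community[u]] += weight
        strength[community[v]] += weight
        if community[u] == community[v]:
            internal[community[u]] += weight
    return sum(
        internal[c] / total_weight - (strength[c] / (2 * total_weight)) ** 2
        for c in strength
    )


# Hypothetical call volumes between four locations.
CALLS = {("A", "B"): 10, ("C", "D"): 8, ("B", "C"): 1}

if __name__ == "__main__":
    natural = {"A": 0, "B": 0, "C": 1, "D": 1}
    awkward = {"A": 0, "B": 1, "C": 1, "D": 2}
    print(modularity(CALLS, natural), modularity(CALLS, awkward))
```

A split that keeps heavily connected places together scores higher than one that cuts through them, so searching for the partition with the highest Q is one way to let the call data draw the regional boundaries.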

Here’s a video demonstration:

The data is fascinating, but there is little explanation. Here’s one of the maps (left) compared with a map of regional accents and a map of rail transport links (right):

A perceptual map of dialects, from Montgomery, C. (2007) Northern English Dialects: A perceptual approach, PhD thesis. pdf

CompareDialectAreas — A comparison of the two experiments.

One of the first things that struck me was the similarity with a map of regional accents (apologies for the quality of the accent map – I couldn’t find the one I was looking for). Apparently, people are talking to people that sound like them. Or, people who talk to each other sound like each other. This isn’t covered in the paper, but seems like an important issue.

Secondly, the rail links also seem to form the ‘backbones’ of the communications regions. This is also mentioned in the paper. However, these two features are linked.

Coming from Wales, the important fit here is the three-way split in Wales. South Wales feels like a different country to North Wales – culturally and linguistically. However, both are linked by having large amounts of natural resources: Coal in South Wales and slate in North Wales. This led to massive migration into cities in the north and south, and rail links were set up to extract these resources to London or the nearest ports: Cardiff in the south and Liverpool in the north. Thus, it’s still a real pain to get from North Wales to South Wales. The picture is somewhat true of the east and west sides of the north of England.

So, the natural resources concentrated people and transport links. However, they also concentrated political views. The large migrant community in Wales, working for little pay in large mine institutions, became unionised. Socialism emerged, promoting political movements that led to the minimum wage.

The point being, natural resources, transport links and politics are connected with some being historically dependent on each other. This is, perhaps, precisely why splitting the nation by who speaks to who is a good measure of political regions. It would be fascinating to see how linguistic divisions interact with these variables.

Ratti, Carlo, Sobolevsky, Stanislav, Calabrese, Francesco, Andris, Clio, Reades, Jonathan, Martino, Mauro, Claxton, Rob, & Strogatz, Steven H. (2010). Redrawing the map of Great Britain from a network of human interaction PLoS ONE, 5