Whoa, I just read some of the responses to Dunn et al. (2011) “Evolved structure of language shows lineage-specific trends in word-order universals” (Language Log here, Replicated Typo coverage here). It’s come in for a lot of flak. One concern raised at the LEC was that, under an extreme interpretation, there may be no effect of universal biases on language structure. This goes against Generativist approaches, but also against the Evolutionary approach adopted by LEC-types. For instance, Kirby, Dowman & Griffiths (2007) suggest that there are weak universal biases which are amplified by culture. But there should be some trace of universality nonetheless.
Below is the relationship diagram for Indo-European and Uto-Aztecan feature dependencies from Dunn et al. Bolder lines indicate stronger dependencies. The two families appear to have different dependencies: only one is shared (Genitive-Noun and Object-Verb).
However, I looked at the median Bayes Factors for each of the possible dependencies (available in the supplementary materials). These are the raw numbers that the above diagrams are based on. If the dependencies rank in roughly the same order of strength in two families, those families will have a high Spearman rank correlation.
Spearman rank correlation | Indo-European | Austronesian |
Uto-Aztecan | 0.39, p = 0.04 | 0.25, p = 0.19 |
Indo-European | | -0.13, p = 0.49 |
Spearman rank correlation coefficients and p-values for Bayes Factors for different dependency pairs in different language families. Bantu was excluded because of missing feature data.
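For anyone who wants to try this kind of comparison themselves, here is a minimal sketch of a Spearman rank correlation in plain Python. The Bayes Factors in it are made-up values for five hypothetical feature pairs, not the numbers from the supplementary materials, and the function names are my own. It shows the idea rather than the exact procedure used for the table above: rank each family's Bayes Factors, then correlate the ranks.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

# Made-up Bayes Factors for the same five hypothetical feature pairs
# in two hypothetical language families (illustration only).
family_a = [0.5, 1.2, 3.4, 7.9, 12.0]
family_b = [0.1, 2.0, 1.5, 6.0, 30.0]

rho = spearman_rho(family_a, family_b)
print(round(rho, 2))
```

Because only the ranks matter, the very large Bayes Factor at the end of `family_b` has no more influence than any other top-ranked value, which is why a rank correlation seemed a reasonable first look at numbers on such different scales.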
Although the Indo-European and Uto-Aztecan families have different strong dependencies, they have similar rankings of those dependencies. That is, two features with a weak dependency in the Indo-European languages tend to have a weak dependency in the Uto-Aztecan languages, and the same is true of strong dependencies. The same is true to some degree for Uto-Aztecan and Austronesian languages. This might suggest that there are, in fact, universal weak biases lurking beneath the surface. Lucky for us.
However, this does not hold between Indo-European and Austronesian language families. Actually, I have no idea whether a simple correlation between Bayes Factors makes any sense after hundreds of computer hours of advanced phylogenetic statistics, but the differences may be less striking than the diagram suggests.
UPDATE:
As Simon Greenhill points out below, the statistics are not at all conclusive. However, I’m adding the graphs for all Bayes Factors (these are made directly from the Bayes Factors in the Supplementary Material):
[Figures: Bayes Factors for all dependencies, one graph each for Austronesian, Bantu, Indo-European and Uto-Aztecan]
Michael Dunn, Simon J. Greenhill, Stephen C. Levinson, & Russell D. Gray (2011). Evolved structure of language shows lineage-specific trends in word-order universals. Nature, 473, 79-82.
Hi Sean,
Thanks for the review. One thing to note with those correlations is that neither of the Austronesian ones is significant (p < 0.05). The IE/UA one just scrapes through at p = 0.04, and once you correct for multiple tests… Even if the correlation is significant, it’s only explaining 15% of the variation.
What I see as the cool thing in this paper is that we can directly identify these correlations – the next step is to work out if there’s a reason for these correlations. Are there any plausible reasons why these word-order features could be (very weakly) correlated in IE and UA?
Simon
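Simon's two points can be checked with quick arithmetic on the figures in the table. The sketch below squares the IE/UA coefficient to get the share of variation explained, and applies a Bonferroni correction to the three p-values. Bonferroni is my assumption here, since the comment does not say which correction for multiple tests is meant, and the pair labels are just shorthand.

```python
# Coefficient and p-values as quoted in the table in the post
rho_ie_ua = 0.39
p_values = {"UA-IE": 0.04, "UA-AN": 0.19, "IE-AN": 0.49}

# Share of variation explained by the IE/UA rank correlation
variance_explained = rho_ie_ua ** 2

# Bonferroni correction (an assumed choice): multiply each p-value
# by the number of tests, capping the result at 1.0
n_tests = len(p_values)
bonferroni = {pair: min(1.0, p * n_tests) for pair, p in p_values.items()}

# Pairs that remain below the 0.05 threshold after correction
significant = [pair for pair, p in bonferroni.items() if p < 0.05]

print(round(variance_explained, 2))
print({pair: round(p, 2) for pair, p in bonferroni.items()})
print(significant)
```

Under this correction the IE/UA p-value rises from 0.04 to 0.12, so none of the three comparisons stays under 0.05. A less conservative correction would give different adjusted values.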
Thanks – yes, I admit, the statistics actually suggest that there isn’t a correlation between the language families. I also agree with your approach: The phylogenetic analysis suggests where to look, but other approaches are needed to explain the patterns found.
I’ve added some diagrams showing the Bayes Factors for all dependencies, not just the significant ones. Some patterns seem to hold, if weakly, e.g. SBV-OBV and ADJ-DEM.
It is worth remembering, from the discussions on Language Log, that there are massive and grave concerns about the quality of the data that went into this study. So the entire discussion is moot until the empirical work can be done right – a very far from trivial undertaking.
The quality of the data could be improved, it’s true. However, a large cross-linguistic corpus is never going to satisfy everyone. There’s even the question of whether a single database can evenly capture the structure of two distant languages, especially if cultural transmission really does have a big effect on language structure. However, I’m not sure the paper is meant as a preliminary version of a bigger, better investigation of the same kind. The paper tests a simple hypothesis with the data available, and suggests a line of inquiry for other types of investigation to take. This will involve theoretical models, lab experiments and careful case-studies.
Keeping in mind that Tunji C.’s ‘there are massive and grave concerns’ is really just Tunji C.’s and one other commenter’s painstakingly manufactured, overbroad dismissals that Mr. Greenhill and the OP of that post responded to with what I thought was undue kindness and moderation.
“Keeping in mind that Tunji C.’s ‘there are massive and grave concerns’ is really just Tunji C.’s and one other commenter’s painstakingly manufactured, overbroad dismissals…”
I’m not going to re-hash the objections, since anyone can go and read them in their original location. I will just remind readers of this: far from being “painstakingly manufactured”, they are blindingly obvious to any reader who knows the basic literature on the languages in question. The fact that so many of the *other* languages have been investigated very superficially makes it overwhelmingly likely, in my opinion, that the data are close to worthless for this sort of investigation.
Opinions may differ of course, and evidently they do. So be it.
I was sad to miss the LEC discussion of this paper – I think the methods used are an exciting addition to the typological toolkit (albeit one that will force a lot of us to struggle revising our maths!)
A couple of thoughts.
First, we should be very careful not to interpret these results as meaning that there are no language universals, as some commentators have expressed it. The universals that I am particularly interested in are the fundamental design features of language (such as duality of patterning, compositionality and so on). While we can argue about the statistical basis of the correlational and/or implicational universals that typologists are primarily interested in, we shouldn’t forget that there is a lot of absolute stuff that needs to be explained, especially by us evolutionary linguists!
Second, I think the results from this paper are exciting in that they leave some significant puzzles that need answering. Exactly what is it about particular lineages that makes certain routes through the space of language types more or less likely, for example? Answering this will probably take us from abstract models of cultural evolution to ones which look at the details of how particular features of language interact to create or block pathways for change.
However, I am left wondering if there are not still universal patterns that overarch these results. For example, although certain correlated changes are “missing” in particular language families, it still appears to be the case that we don’t expect to see many correlated changes that take us from “harmonic” to “enharmonic” orders. (Or am I missing something?) If this is correct, then we can still see the overall results as supporting the idea that weak cognitive biases show up cross-linguistically. Particular features of a language can act to stop a weak bias being effective in causing change, but sampled over different families, the universal is revealed by the lack of obvious changes that run counter to the bias.
Simon
@Simon Kirby
Seems like the recent research on ‘Verblog’ (Culbertson, Smolensky, et al.) is up your alley; will you be publishing anything related to the recent and unfortunately framed JHU press release?
http://gazette.jhu.edu/2011/05/09/artificial-grammar-learning-reveals-inborn-language-sense/
Ironically I think this problematic framing in the popular media of how contemporary work in linguistics relates to longstanding characterizations of Chomsky and UG has related to you in the past: http://languagelog.ldc.upenn.edu/nll/?p=1878
I believe Evans & Levinson commented on these sorts of discursive, paradigmatic complications in their ‘Time for a Sea-Change… ’ piece.