The evolution of numeral classifier constructions

I went to a good talk almost a year ago at the Interfaces III conference at the University of Kent, and I said I’d write about it, but I never got around to it. The slides have been on my desktop ever since. Now that I have a couple of hours to kill on the train coming back from the MPI in Nijmegen, here’s that promise fulfilled. I’m going mostly from the slides, which were kindly sent to me, and any errors in the transcription from those are my own.

The evolution of numeral classifier constructions

Vipas Pothipath, Dept. of Thai, Chulalongkorn University
The talk was based on work done at both Chulalongkorn and the MPI for Evolutionary Anthropology in Leipzig, as well as on Pothipath’s PhD thesis (unpublished at the time, although it may have been published since).

A numeral classifier is a morpheme that typically appears next to a numeral or a quantifier and categorizes the noun with which it co-occurs on a semantic basis. Here is an example from Thai, where tua is the classifier:

  • mǎ: sǎ:m tua
  • dog three CLF (lit. ‘body’)
  • three dogs

These can also be bound morphemes, and can co-occur with ordinal numerals or definitive markers. Pothipath focused on cardinal numerals, and defined numeral classifier constructions (NCCs) as syntactic constructions consisting of two core constituents, namely a cardinal numeral X and a numeral classifier Y. The Thai example above fits this definition: it is just as grammatical when mǎ: ‘dog’ is dropped and only the numeral and classifier remain. Now, based on WALS, these constructions exist in many languages across the world (although not so much in Europe), and are sometimes optional and occasionally obligatory. The sample size was only 56 languages, so there might be more widespread variation. Pothipath claims that the optional/obligatory split points to a possible typological continuum, and that the evolution can be shown using an evolutionary ladder.

This continuum wouldn’t work if there weren’t different types of NCCs. He outlines these (although the names given here are mostly my own):

  1. Repeater: Where a noun is used as the numeral classifier for the noun itself, particularly when there isn’t a suitable classifier for that noun. (I wish there had been a bit of a more explicit statement about how this isn’t just a switch in syntax for noun and number, as can be seen in the example given (fǽm) hâ: fǽm ‘(file) five files’.)
  2. Free form classifier: Where there is a single form used for certain nouns that isn’t related morphologically or lexically synchronically.
  3. Affixal classifier: Like above, but bound to the numeral, as in mat=tol ‘CLF=three’ in Taba. (Bowden, 2001)
  4. Obligatory affixal classifier: Here, the classifier is a dependent morpheme on the numeral, as in maq-ond ‘CLF-one’ in Malto. (Steever 1998)
  5. Joined (unanalyzable) classifier: Where a different lexical form is used for the numeral depending on the nature of the noun.

Now, among the languages Pothipath looked at, some showed more than one morphological type of NCC. This might be a sign that, under the theory of grammaticalisation, free forms develop into the final lexically closed type of classifier. He goes on to show, using diachronic examples, how different languages show this change. Interestingly, he cites Hurford (2001) as a justification for the affixation of classifiers to numerals below 4, as these behave differently from the other numeral words (they are used more, among other reasons). I wonder if this has any implications for the broad use of the Swadesh list, especially in cases like the ASJP database, which only has around 40 words per language in it. Later, he also mentions Corbett (2000), as the Animacy Hierarchy influences the lexicalisation of classifiers in Warekena.

The argument stands on the idea that a cline of grammaticality in current systems may show a hypothetical evolutionary ladder, which Pothipath rightly notes is tentative thinking. He also gives a counterexample from Beijing Mandarin, which only had limited scope. But, in essence, this is another cyclic case for grammaticalisation theory. Overall, it’s good research, and adds a bit more to the puzzle.

——

There was at least one open question for me after the talk, which a little WALSing was able to corroborate – is the link between gender assignment and numeral classifiers clear? How do they influence each other? Here’s the WALS markup for that.

As can be seen here, classifiers don’t appear when there is semantic and formal gender assignment. I think that’s interesting. I’d like to take a closer look and see if there are any cases where the numeral classifiers and semantic gender assignments clash – I suspect that they are linked, but that the evolutionary grammaticalisation cycle might be too complex to evolve easily. As I’ve got other evolutionary morphological processes on my mind (cf. my evolang talk), I won’t be looking into this soon, but it is an open question that might have some nice low hanging fruit.

References

  • Bowden, J. (2001). Taba: description of a South Halmahera language. Canberra: Pacific Linguistics.
  • Corbett, G. G. (2000). Number. Cambridge: Cambridge University Press.
  • Gil, D. (2005). Numeral classifiers. In M. Haspelmath, M. Dryer, D. Gil & B. Comrie (Eds.), (pp. 226-229).
  • Hurford, J.R. (2001) Numeral Systems. In International Encyclopedia of the Social and Behavioral Sciences, edited by N.J.Smelser and P.B.Baltes, Pergamon, Amsterdam. pp.10756- 10761.
  • Pothipath, V. (2008). Typology and Evolution of Numeral-Noun Constructions. Unpublished PhD thesis, University of Edinburgh.
  • Pothipath, V. (2011) The Evolution of numeral classifier constructions: a syntax-morphology-lexicon interface.
  • Steever, S. B. (1998). Malto in S.B.Steever (ed.) The Dravidian languages. London: Routledge, pp. 359-387.

Correction: Theory and evidence in language evolution research session still open!

I recently posted about a thematic session entitled ‘Theory and evidence in language evolution research’ at the Poznan Linguistics Meeting.  The call for papers is still open!  Here’s the call:

PLM2012 – Session CfP – Theory and evidence in language evolution

QHImp Qhallenge: Results on day 1

Earlier today we released an experiment on working memory in humans and chimps.  You can play the game here.

We’ve had responses from about 70 people, and we have some results.  Some are summarised on the live results page.

Astoundingly, people actually managed to get 9 numbers shown for only 210 ms!  Replicated Typo’s very own James Winters was one of those mavericks, but puts it down to luck.

There were some early leaders, but in the last few hours, the player known as ‘mjb’ has really kicked everybody’s ass and got to the top of all three leaderboards.  Who are you, magic human?  Let us know!


Evolang Coverage: Massimo Piattelli-Palmarini’s plenary talk

Post by Bodo Winter:

Massimo Piattelli-Palmarini’s talk at this Evolang gave an impressively confident and forceful argument for linguistic nativism. The basic tenets of the Chomskyan view of language evolution were reiterated with some old and some new arguments along the way. Piattelli-Palmarini (P.P.) claimed that (1) language is modular and autonomous from other cognitive systems, (2) syntax dominates other aspects of language such as semantics, and (3) that language has not arisen through natural selection because it is a non-adaptive trait. In line with Chomskyan syntactocentrism, syntax was argued to be the major evolutionary transition in the evolution of language.

Generally, it is a good thing to have strong arguments for a particular position because it spurs discussion and excites new research. However, P.P.’s arguments very much neglected or belittled major empirical advances in evolutionary linguistics and cognitive science. If this new evidence is taken into account, the picture that emerges is very different from what P.P. argued for.


Evolang coverage: More on linguistic replicators

Monica Tamariz presented a poster at Evolang (runner up for the best poster award) about linguistic replicators.  This is an alternative view to Andrew Smith’s talk and Bill Benzon’s post on the same subject.

Below I’ve copied out sections of Tamariz’s poster:


Evolang Coverage: Honest signalling between plants and insects

Yasuhiro Suzuki (from Nagoya University, co-authoring with Megumi Sakai and Kazuhiro Adachi) presents a model of the evolution of an honest signalling system between plants and insects.  While honest signalling systems have been studied before, this was the first I have heard of one between species, and certainly between kingdoms.

The vast majority of animals communicate to some extent.  Many signalling systems used by animals use costly signals, the paradigm case being the peacock’s tail (Zahavi & Zahavi, 1997).  Growing a long tail imposes a developmental and predatory cost and so only fit individuals can afford to grow long tails.  This makes it difficult to trick others into thinking that you are fitter than you actually are.

However, there are systems which use ‘cheap’ signals, the most often used example being badges of status in sparrows.  Sparrows have a patch of bright feathers on their chest.  A bigger patch signals a better fighter.  This is advantageous since they can avoid fights they would not win.  Yet, there appears to be no cost to growing the patch (although this is contested by some).  Zahavi & Zahavi suggest that ‘cheaters’ who do sport badges larger than their abilities are eventually punished when they get into fights with bigger birds.  Thus, the system remains honest.

Suzuki describes a system of communication between plants and insects.  Plants are in constant danger from bugs such as caterpillars.  However, some plants can emit a chemical that attracts small insects that will come and attack and eat the bugs.  The chemical is emitted when there are many bugs attacking.  However, there are plant mutants named ‘cry-wolf’ plants which emit the chemical even when there are very few bugs attacking them.  In this way, the cry-wolf plants have a small advantage over the normal plants.  However, the cry-wolf plants damage the stability of the signalling system.  The insects are attracted to the cry-wolf plants only to find a smaller meal than expected.  If this situation persists, the insects’ association between the chemical and food diminishes and they eventually stop coming.
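To make the last step concrete, here is a toy sketch of how the insects' association between the chemical and food might erode as cry-wolf plants spread. This is not Suzuki's model: the linear update rule, the payoffs (a full meal from honest plants, nothing from cry-wolf plants) and the names `insect_trust`, `honest_share` and `rate` are all my own assumptions, chosen only to illustrate the feedback loop.

```python
def insect_trust(honest_share, rounds=50, rate=0.2, trust=0.5):
    """Track how strongly insects associate the chemical with a meal.

    honest_share: fraction of signalling plants that are honest (assumed).
    rate: how quickly the association is updated (assumed).
    """
    history = []
    for _ in range(rounds):
        # Average payoff of answering a signal: honest plants offer a full
        # meal (1.0); cry-wolf plants are assumed to offer nothing (0.0).
        payoff = honest_share * 1.0
        # Nudge the association towards the payoff actually experienced.
        trust += rate * (payoff - trust)
        history.append(trust)
    return history


mostly_honest = insect_trust(honest_share=0.9)
mostly_cry_wolf = insect_trust(honest_share=0.2)
print(round(mostly_honest[-1], 3), round(mostly_cry_wolf[-1], 3))
```

Under these assumptions the association simply settles at the share of honest plants, so a population dominated by cry-wolf plants ends up with insects that barely respond to the signal.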


Evolang coverage: Network structure and the effect of L2 learners on language change

Evolang is over, but I have a backlog of posts to get out!

The idea that language change can be biased by the cognitive profiles of its learners has attracted a lot of interest (see Hannah’s post), and was a frequent topic of discussion at Evolang.  In the talk by James Winters and me, we urge a pluralistic approach involving statistical tests, models and experiments.  Here I describe some of the new studies relating to this presented at the conference.

Roland Mühlenbernd and Michael Franke discuss the network properties that characterise language contact.  They constructed an agent-based model where agents had to converge on a system of mapping two meanings onto two signals.  There were two evolutionarily stable mappings:  meaning 1 maps to word 1 and meaning 2 maps to word 2, or meaning 1 maps to word 2 and meaning 2 maps to word 1.

Agents played communication games with others to settle on the mapping they use.  Mühlenbernd & Franke used two types of agent:  The rational agent which chooses the rationally best response and reinforcement learners based around a Polya-Urn model.  However, this factor didn’t make a significant difference in the results presented in this talk.
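For readers who want to see the moving parts, here is a minimal sketch of urn-based reinforcement learning in a two-meaning, two-signal game between a single sender and a single receiver. It is not Mühlenbernd & Franke's implementation: the starting urn contents, the reward of one ball per success and the function name `play_urn_signalling` are assumptions of mine.

```python
import random


def play_urn_signalling(rounds=5000, seed=0):
    """Two agents learn a meaning-signal mapping by Polya-urn reinforcement."""
    rng = random.Random(seed)
    meanings, signals = (0, 1), (0, 1)
    # sender_urns[m][s]: balls for sending signal s when the meaning is m.
    sender_urns = [[1, 1], [1, 1]]
    # receiver_urns[s][m]: balls for guessing meaning m on hearing signal s.
    receiver_urns = [[1, 1], [1, 1]]
    successes = 0
    for _ in range(rounds):
        meaning = rng.choice(meanings)
        # Draw a signal and a guess in proportion to the balls in each urn.
        signal = rng.choices(signals, weights=sender_urns[meaning])[0]
        guess = rng.choices(meanings, weights=receiver_urns[signal])[0]
        if guess == meaning:
            # Successful communication: both agents add a ball for the
            # choice they just made, making it more likely next time.
            sender_urns[meaning][signal] += 1
            receiver_urns[signal][guess] += 1
            successes += 1
    return sender_urns, receiver_urns, successes


sender, receiver, wins = play_urn_signalling()
print(sender, receiver, wins)
```

Because successful choices are reinforced, the urns tend to become lopsided over time, which is how one of the two stable mappings can take hold without either agent reasoning about the other.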

The main focus was the structure of the social network that determined which agents interacted.  Small world networks were generated using the Watts-Strogatz method, which creates a variety of networks with certain degree and centrality features.  The social structure was held constant within each run, then the results over several runs were analysed.  Homogeneity always emerged, although the reinforcement learning maintained a higher number of ‘language regions’ (where connected agents used the same mappings) for longer.  Not surprisingly, these language regions tended to form in tightly connected regions of the network.  Mühlenbernd & Franke looked at the properties of the language regions, including how early agents settled on a mapping (early vs late learners) and the strategies of agents on the border between dense communities (border agents).
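As a rough illustration of the Watts-Strogatz method mentioned above, the sketch below builds a ring of agents connected to their nearest neighbours and then randomly rewires some of those links. The parameter values and the function name `watts_strogatz` are my own choices; I don't know which settings were used in the talk.

```python
import random


def watts_strogatz(n, k, p, seed=0):
    """Return a small-world network as a dict of neighbour sets.

    n: number of agents; k: even number of nearest neighbours in the
    starting ring (k < n); p: probability of rewiring each ring link.
    """
    rng = random.Random(seed)
    neighbours = {i: set() for i in range(n)}
    # Step 1: ring lattice, each agent linked to k/2 agents on either side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            neighbours[i].add(j)
            neighbours[j].add(i)
    # Step 2: with probability p, replace the link i-j with a link from i
    # to a randomly chosen agent that i is not yet connected to.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if rng.random() < p:
                candidates = [c for c in range(n)
                              if c != i and c not in neighbours[i]]
                if candidates:
                    new = rng.choice(candidates)
                    neighbours[i].discard(j)
                    neighbours[j].discard(i)
                    neighbours[i].add(new)
                    neighbours[new].add(i)
    return neighbours


network = watts_strogatz(n=20, k=4, p=0.1)
print(sorted(network[0]))
```

With `p` near 0 the network stays a tidy ring of local neighbourhoods; a small `p` adds a few long-range shortcuts, which is what gives small-world networks their mix of tight local clusters and short paths between distant agents.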

Interestingly, there was a big overlap between late learners and border agents.  That is, people on the border between two communities tend to be late learners.  This offers an interesting take on the hypotheses linking second language learners and linguistic change.  Lupyan & Dale (2010) find a correlation between group size and morphological complexity.  They suggest that the cognitive profiles of L2 learners bias language change towards morphologically simpler languages.

This hypothesis is further supported by the work of Christian Bentz and Bodo Winter (not to be confused with James Winters, founder of this blog), also presented at this conference, which shows that the ratio of L1 to L2 speakers of a language correlates with morphological features (number of cases, case syncretism and case symmetry), while controlling for language family and geographic region.  However, as I suggested in my talk with James Winters, backing up a statistical correlation with another statistical correlation is not as powerful as running an experiment or a model.

Mühlenbernd & Franke’s model might provide some insight into this problem.  It shows that people who are the most likely to be in contact situations (on the borders of communities) are also more likely to be late learners.  If late learners in the model can be equated with L2 learners, then this suggests a closer link than previously hypothesised between L2 learners and language change.  It would be interesting to think more about this dynamic.  However, Franke urged caution in interpreting the model in this way, since the concept of a ‘late learner’ is fairly abstract.

As a side note, Bart de Boer raised the intriguing idea that you could use statistical analyses like Bentz and Winter’s to find exceptions to the rule.  Rather than these exceptions being problematic, perhaps by studying the causes of change in them, a clearer idea of the role of L2 speakers could emerge.

What’s clear is that there is an emerging body of work surrounding the linguistic niche hypothesis using statistical and modelling techniques.  Combined with Hannah Little’s experiment on this phenomenon, I’m wondering how long before we get a special issue on this subject.

Evolang Coverage: Simon Fisher: Molecular Windows into Speech and Language

In his clear and engaging plenary talk, Simon Fisher, who is director of the Department “Language & Genetics” at the Max Planck Institute for Psycholinguistics, the Netherlands, gave a summary of the current state of research on what molecular biology and genetics can contribute to the question of language evolution. Fisher was involved in the discovery of the (in)famous FOXP2 gene, which was found to be linked to hereditary language impairment in an English family. He has also done a lot of subsequent work on this gene, so naturally it was also the main focus of his talk.

But before he dealt with this area, he dispelled what he called the ‘abstract gene myth’. According to Fisher, it cannot be stressed enough that there is no direct relation between genes and behavior and that we have to “mind the gap”, as he put it. There is a long chain of interactions and relations that stands between genes on the one side, and speech and language on the other. DNA is related to the building of proteins, which is related to the development of cells. These in turn are related to neural circuits, which then relate to the human brain as a whole, which is then related to speech and language.

So when we try to look at this complex net of relations, what we can say is that there is a subset of children who grow up in normal environments but still do not develop normal language skills. From a genetic perspective it is of interest that among these children, there are cases where these impairments cannot be explained by other transparent impairments like cerebral palsy, hearing loss, etc. Moreover, there are cases in which language disorders are heritable. This suggests that there are genetic factors that play a role in some of these impairments.

The most famous example of such a case of heritable language impairment is the English KE family, where affected members of the family lack one working copy of the FOXP2 gene. These family members exhibit impaired speech development. Specifically, they have difficulty in learning and producing the sequences of complex oro-facial movements that underlie speech. However, they also show deficits in a wide range of language-related skills, including spoken and written language. It thus has to be emphasized that the disrupted FOXP2 gene seems to affect all aspects of linguistic development. It is also important that this is not accompanied by general motor dyspraxia.

In general, non-verbal deficits are not central to the disorder. Affected individuals start out with a normal nonverbal IQ, but then don’t keep up with their peers. This is very likely related to the fact that possessing non-impaired language opens the door to the enhancement of intelligence in various ways, something which people with only one working FOXP2 copy cannot take advantage of to the same degree. In general, deficits in verbal cognition are much more severe and wide-ranging than other possible impairments. It is also important to note that after the FOXP2 gene was discovered in the KE family, researchers found a dozen cases of a damaged FOXP2 gene that led to language-related problems.

FOXP2 is a so-called transcription factor, which means that it can activate and repress other genes. As Fisher points out, in a way FOXP2 functions as a kind of ‘genetic dimmer switch’ that tunes down the expression of other genes. In this context, it should become clear that FOXP2 is not “the gene for language.” Versions of FOXP2 are found in highly similar form in vertebrate species that lack speech and language. It therefore played very ancient roles in the brain of our common ancestor. Neither is FOXP2 exclusively expressed in the brain. It is also involved in the development of the lung, the intestines and the heart. However, work by Simon Fisher and his colleagues shows that FOXP2 is important for neural connectivity. Interestingly, mice with one damaged FOXP2 copy are absolutely normal in their baseline motor behavior. However, they have significant deficits in what Fisher called ‘voluntary motor learning’.

From an evolutionary perspective, it is relevant that there have been very few changes in the gene over the course of vertebrate evolution. However, there seem to have been more changes to the gene on the human lineage since our split from the chimpanzee lineage than accumulated between the mouse and chimpanzee lineages. This means that when it comes to FOXP2, the protein of a chimpanzee is actually closer to that of a mouse than to that of a human.

Overall, what current knowledge about the molecular bases of language tells us is that these uniquely human capacities build on evolutionarily ancient systems. However, much more work is needed to understand the influence of FOXP2 on the molecular and cellular level and how these are related to the development of neural circuits, the brain, and finally our capacity for fully-formed complex human language.

EvoLang coverage: Boeckx on integrating biolinguistics and cultural evolution

Cedric Boeckx gave a remarkable plenary which tried to pull together the fields of cultural language evolution and biolinguistics, with surprising concessions on either side.  Boeckx started from a relatively uncontroversial part of Chomsky’s claim:  That aspects of language can be studied scientifically as part of biology.  However, Boeckx noted that Luria in 1976 was confident that ‘within a few years’ linguists would be interfacing with and contributing to findings from biology.  However, formal syntax has failed to carry out the biological commitment, and Boeckx wonders why linguists don’t have more to say about, for instance, the recent developments in the study of FOXP2.

Boeckx outlined his own position as minimalist, in the sense that a fully specified UG is not plausible.  We need to realise that biology is complex, and move beyond the classical model of Broca’s and Wernicke’s areas as dedicated centers of language.  Also, Boeckx urged the audience to forget about the FLN/FLB distinction, since from a biological viewpoint this view is misleading:  Genes build neural structures, not behaviour (although linguists should note the richness of the range of aspects now thought to be part of FLB).

Instead, Boeckx suggests that the subject of study should be a set of formal properties.  Boeckx suggested the following, while emphasising that the particular terms were not important and it is just the concepts that he would focus on:

  • An edge property:  This removes selectional restrictions on concepts in different domains and makes it possible to combine them.  For example, humans can pull together concepts from very different domains.  Also, lexical items have the property of being able to combine with other lexical items.
  • Set formation or Merge:  The ability to combine lexical items.
  • Cyclic transfer:  Elements are combined at different levels before being passed to other operations.  This allows recursion.

These make up a minimal specification of universal grammar for which we might realistically find biological explanations.  Boeckx sees no problem with the idea that we share some of these abilities with animals.  An even bigger concession is that he believes that the particular structures of language (e.g. word order or pro-drop) can be explained by cultural evolution, i.e. grammaticalisation.  The minimal specifications are weak biases, but we need a cultural explanation.

Boeckx went on to suggest how the biological underpinnings of the minimal specification might be approached.  He promoted the concept of the ‘Global workspace’ as used by Dehaene and colleagues.  This approach suggests that cross-modular computation is the key to human cognition.  It focuses on distributed networks of neurons with long-distance connections which allow different modules of the brain to interact.  Humans are particularly good at integrating concepts across perceptual modalities or time.  Boeckx suggests that this ability is the biological basis for the edge property.  It allows different perceptions to be treated in such a way that they can be combined.  I was put in mind of synaesthesia and the work of Chrissy Cuskley on synaesthesia and language evolution.

Boeckx went on to suggest that the thalamus could act as a regulator of information exchange in this global workspace and cited some studies showing that it is sensitive to syntax and semantics, but not phonology.  The thalamus is ideally placed, right at the center of the brain.  Boeckx also suggested that humans have evolved to have a more regularly spherical brain, facilitating this workspace by placing the thalamus equidistantly from all brain areas (suggesting that earlier ancestors of modern humans had a more elongated brain).  However, he was skeptical that we could ever know if this was an adaptation for language.

This integrative approach is in close alignment with proponents of cultural evolution such as Simon Kirby, who sees the structure of language as emerging from cultural transmission, but biology as providing the platform for cultural transmission.  Boeckx’s approach differs a great deal from that of Massimo Piattelli-Palmarini, whose talk essentially told cultural evolutionists that they were wrong and should stop researching explanations that could not be true.  However, one commenter wondered if Boeckx’s concessions were a dangerous form of moderate liberalism: these arguments might leave both the cultural camp and the formalist camp believing that there is no conflict and actually lead to further isolation.  However, I welcome this impressive synthesis and hope that it’ll raise the profile of cultural transmission in the evolution of language.

Evolang Coverage: Luke McCrohon on horizontal transfer

Luke McCrohon suggests that tools from evolutionary biology can be applied to linguistic borrowing between languages.  McCrohon correctly points out that the descent of lexicons is far from tree-like, and there is a great deal of horizontal transfer (see also my post on analysing an etymology dictionary). Although it’s mainly nouns that are borrowed into a language, any feature can potentially be borrowed, according to Thomason & Kaufman (1988).  However, we tend to observe hierarchies of borrowing such that some types of words are borrowed more frequently than others.  For instance, Haugen notes that nouns are more likely to be borrowed than verbs, which are in turn more likely to be borrowed than prepositions.  McCrohon links this with a similar observation in biological evolution that certain types of genes are more likely to be borrowed.  Informational genes (that provide the basis for functions) are less likely to be borrowed than operational genes (that modify other functions).  Jain et al.’s (1999) complexity hypothesis suggests that, while all genes have the same probability of being copied, simpler genes are more likely to be copied faithfully since they have fewer constraints on the precise form they must take to be effective.

McCrohon argues that, in a similar way, the linguistic borrowing hierarchy might reflect the increasing constraints on how a word can be used.  For instance, most nouns can be substituted by other nouns, while prepositions are highly restricted by context or domain.  Also, language-internal change might be affected by these restrictions.  Even if there is a more effective form than the one in the existing system, removing one form might have knock-on consequences for the whole system.  This inter-connectedness could have implications for how languages are likely to change.

Furthermore, this model might predict that words are equally likely to be selected for borrowing, but only certain types have a good likelihood of being successfully borrowed.  However, a commenter wondered about words that are borrowed to fill conceptual gaps such as new technologies.  Still, an interesting analogy between problems in biology and problems in linguistics.  And McCrohon is confident that his studies will also have something to give back to the biology community by studying how this problem applies to linguistics.