I’ve been reading through an earlier draft of my dissertation and noticed a few paragraphs that were cut for word length. Despite not making the final cut, they serve as a nice reminder of where our data comes from: when we dive into WALS or UPSID, take a particular inventory and look at one of its phonemes, we’re viewing something that has been ascribed to the language by its investigators/observers. Anyway, the paragraphs are about the Wichí language (a member of the Matacoan language family, spoken in parts of South America’s Chaco region) and the various reports on its phoneme inventory size. N.B. The source is a PhD thesis by Megan Avram (2008).
Even if we accept the theoretical justification for the concept of a phoneme, there is still the additional problem of how these representations are measured and recorded. These problems are neatly highlighted in the debates surrounding the Wichí language and its phoneme inventory. For instance, back in 1981 Antonio Tovar published an article showing that Wichí had 22 consonants, whereas if you jump forward 13 years to 1994, Kenneth Claesson’s paper would tell you that it is down to just 16 consonants. This is quite a big difference. In WALS terms, Wichí has gone from having an average consonant inventory to a moderately small one. Great news, then, for those of you searching for a correlation between small communities (Wichí has approximately 25,000 speakers) and phoneme inventory size. Not so great on the reliability front.
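To make the "In WALS terms" jump concrete, here is a minimal sketch of the binning. The band boundaries are the ones I recall from the WALS chapter on consonant inventories (small 6-14, moderately small 15-18, average 19-25, moderately large 26-33, large 34+), so treat them as an assumption and check them against WALS itself. The function name and the `reports` dictionary are purely illustrative.

```python
# Upper bounds of the consonant inventory bands, as recalled from WALS.
# ASSUMPTION: verify these boundaries against WALS before relying on them.
WALS_CONSONANT_BANDS = [
    (14, "small"),
    (18, "moderately small"),
    (25, "average"),
    (33, "moderately large"),
]


def wals_consonant_category(n_consonants: int) -> str:
    """Return the (assumed) WALS size band for a consonant count."""
    if n_consonants < 1:
        raise ValueError("a consonant count must be a positive integer")
    for upper_bound, label in WALS_CONSONANT_BANDS:
        if n_consonants <= upper_bound:
            return label
    return "large"


# The two consonant counts reported for Wichí in the text above.
reports = {"Tovar (1981)": 22, "Claesson (1994)": 16}

for source, count in reports.items():
    print(f"{source}: {count} consonants -> {wals_consonant_category(count)}")
```

The same language lands in two different bands depending on whose description you feed in, which is the whole problem: a six-consonant disagreement between observers is enough to change the category used in a typological comparison.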
Short of a conspiracy to bring the number of phonemes down (but see here), the reasons for these differences are broad and varied. Some could be genuine differences between speech communities in the form of dialectal variation. Others are more likely to be theoretically motivated. Take, as one of many examples, Claesson’s choice to omit glottalized consonants from his description of Wichí. His rationale is that these “are actually consonant clusters of a stop followed by a glottal stop” (Avram, 2008: 37-38). In summary, both sources of data are at the mercy of subjective choices: for each language, or dialect, the description relies on the decisions of potentially one researcher, at a very specific point in time, and with only a finite amount of resources (for a similar discussion, see the comments on Everett and recursion).
It’s straight out of phoneme inventories 101, but from time to time these little examples are useful as cautionary tales about the sources of data we often take for granted.
It’s not only the number of phonemes that people count differently: the number of languages in a country (a typical measure of linguistic diversity) often changes over time.
“There are arguably three indigenous languages in Japan: namely, Ainu, Japanese, and Ryukyuan. However, the genetic relationship between Japanese and Ryukyuan has been proven and the transparency of the relationship is such that the latter is now considered as a dialect (group) of Japanese by most scholars.” (p. xiii, from the preface of The languages of Japan, M. Shibatani, Cambridge University Press, 1990)
Indeed, when Korea was annexed by Japan, Kanazawa Shozaburo classified Korean as a ‘dialect’ of Japanese:
“The Japanese language and even Japanese linguistic science in some cases became the tools of imperial policy. In 1910, in what was purported to be a scientific study, the famous and respected linguist Kanazawa Shozaburo reached the conclusion that Korean was no more than a “dialect” of Japanese, like Ryukyuan. From a linguistic point of view, Kanazawa’s conclusion was odd even by the standards of the day, but it fit the political spirit of the times and helped pave the way for the kind of standardization policy in Korea that the Japanese government had been pursuing in the home islands … Korean linguists and language scholars were considered revolutionary secessionists, and many were arrested and thrown in jail, sometimes to die there, for nothing more serious than compiling a Korean-language dictionary.” (p. 131, Lee & Ramsey, The Korean Language, SUNY Press, 2000)
The linguistic varieties spoken in Japan have been classified on the basis of grammatical differences into 2 dialects (Mitsuo Okumura), 3 dialects (Misao Tojo) and 5 dialects (Toshio Tsuzuku), and on the basis of phonetic similarities into a different set of 3 dialects (Haruhiko Kindaichi).
I do wonder about these problems (what else is going to keep me up at night?) and how they make it increasingly difficult to derive any meaningful results.
Also: who would have thought that compiling a dictionary could kill you? Dicticide anyone?
Having compiled a dictionary for Na’vi, and knowing something about the amount of time and effort it takes, it seems to me that Dicticide should be a more common word.