Varieties of Chinese
Chinese, also known as Sinitic,[a] is a branch of the Sino-Tibetan language family consisting of hundreds of local language varieties that are not mutually intelligible. The differences are greater than within the Romance languages, with variation particularly strong in the more mountainous southeast. The varieties are typically classified into several groups: Mandarin, Wu, Min, Xiang, Gan, Hakka and Yue, though some varieties remain unclassified. Some authors further divide Mandarin, Yue and especially Min. These groups are neither clades nor individual languages defined by mutual intelligibility, but reflect common phonological developments from Middle Chinese.
Chinese varieties differ most in their phonology, and to a lesser extent in vocabulary and syntax. Southern varieties tend to have fewer initial consonants than northern and central varieties, but more often preserve the Middle Chinese final consonants. All have phonemic tones, with northern varieties tending to have fewer distinctions than southern ones. Many have tone sandhi, with the most complex patterns in the coastal area from Zhejiang to eastern Guangdong.
Standard Chinese, a form of Mandarin, takes its phonology from the Beijing dialect, with vocabulary from the Mandarin group and grammar based on literature in the modern written vernacular. It is one of the official languages of China.
Taiwanese Mandarin is one of the official languages of Taiwan. Standard Singaporean Mandarin is one of the four official languages of Singapore. Chinese (specifically, Mandarin Chinese) is one of the six official languages of the United Nations.
Population genetic evidence favors a linguistic origin for Proto-Sino-Tibetan in the upper and middle Yellow River basin in North China, with part of that source population branching off to settle in the Himalayas. On this account, the population ancestral to Chinese speakers split from the population ancestral to the larger Sino-Tibetan language family in the East Asian Neolithic era. At the end of the 2nd millennium BC, a form of Chinese was spoken in a compact area around the lower Wei River and middle Yellow River. From there it expanded eastwards across the North China Plain to Shandong and then south into the valley of the Yangtze River and beyond to the hills of south China. As the language spread, it replaced formerly dominant languages in those areas, and regional differences grew. Simultaneously, especially in periods of political unity, there was a tendency to promote a central standard to facilitate communication between people from different regions.
The first evidence of dialectal variation is found in texts from the Spring and Autumn period (722–479 BC). At that time, the Zhou royal domain, though no longer politically powerful, still defined standard speech. The Fangyan (early 1st century AD) is devoted to differences in vocabulary between regions. Commentaries from the Eastern Han period (first two centuries AD) contain much discussion of local variations in pronunciation. The Qieyun rhyme book (601 AD) noted wide variation in pronunciation between regions, and set out to define a standard pronunciation for reading the classics. This standard, known as Middle Chinese, is believed to be a diasystem based on the reading traditions of northern and southern capitals.
The North China Plain provided few barriers to migration, leading to relative linguistic homogeneity over a wide area in northern China. In contrast, the mountains and rivers of southern China have spawned the other six major groups of Chinese languages, with great internal diversity, particularly in Fujian.
Until the mid-20th century, most Chinese people spoke only their local language. As a practical measure, officials of the Ming and Qing dynasties carried out the administration of the empire using a common language based on Mandarin varieties, known as Guānhuà (官話/官话, literally 'speech of officials'). Knowledge of this language was thus essential for an official career, but it was never formally defined.
In the early years of the Republic of China, Literary Chinese was replaced as the written standard by written vernacular Chinese, which was based on northern dialects. In the 1930s a standard national language was adopted, with its pronunciation based on the Beijing dialect, but with vocabulary also drawn from other Mandarin varieties. It is the official spoken language of the People's Republic of China and of the Republic of China (Taiwan), and one of the official languages of Singapore.
Standard Mandarin Chinese now dominates public life in mainland China, and is much more widely studied than any other variety of Chinese. Outside China and Taiwan, the only varieties of Chinese commonly taught in university courses are Standard Mandarin and Cantonese.
Chinese has been likened to the Romance languages of Europe, the modern descendants of Latin. In both cases, the ancestral language was spread by imperial expansion over substrate languages 2000 years ago, by the Qin–Han empire in China and the Roman Empire in Europe. In Western Europe, Medieval Latin remained the standard for scholarly and administrative writing for centuries, and influenced local varieties, as did Literary Chinese in China. In both Europe and China, local forms of speech diverged from the written standard and from each other, producing extensive dialect continua, with widely separated varieties being mutually unintelligible.
On the other hand, there are major differences. In China, political unity was restored in the late 6th century (by the Sui dynasty) and has persisted (with relatively brief interludes of division) until the present day. Meanwhile, Europe remained fragmented and developed numerous independent states. Vernacular writing, facilitated by the alphabet, supplanted Latin, and these states developed their own standard languages. In China, however, Literary Chinese maintained its monopoly on formal writing until the start of the 20th century. The morphosyllabic writing, read with varying local pronunciations, continued to serve as a source of vocabulary and idioms for the local varieties. The new national standard, Vernacular Chinese, the written counterpart of spoken Standard Chinese, is also used as a literary form by literate speakers of all varieties.
Dialectologist Jerry Norman estimated that there are hundreds of mutually unintelligible varieties of Chinese. These varieties form a dialect continuum, in which differences in speech generally become more pronounced as distances increase, although there are also some sharp boundaries.
However, the rate of change in mutual intelligibility varies immensely depending on region. For example, the varieties of Mandarin spoken in all three northeastern Chinese provinces are mutually intelligible, but in the province of Fujian, where Min varieties predominate, the speech of neighbouring counties or even villages may be mutually unintelligible.
Classifications of Chinese varieties in the late 19th century and early 20th century were based on impressionistic criteria. They often followed river systems, which were historically the main routes of migration and communication in southern China. The first scientific classifications, based primarily on the evolution of Middle Chinese voiced initials, were produced by Wang Li in 1936 and Li Fang-Kuei in 1937, with minor modifications by other linguists since. The conventionally accepted set of seven dialect groups first appeared in the second edition of Yuan Jiahua's dialectology handbook (1961): Mandarin, Wu, Gan, Xiang, Min, Hakka and Yue.
Some varieties remain unclassified, including the Danzhou dialect of northwestern Hainan, Waxiang, spoken in a small strip of land in western Hunan, and Shaozhou Tuhua, spoken in the border regions of Guangdong, Hunan, and Guangxi. This region is an area of great linguistic diversity but has not yet been conclusively described.
Most of the vocabulary of the Bai language of Yunnan appears to be related to Chinese words, though many are clearly loans from the last few centuries. Some scholars have suggested that it represents a very early branching from Chinese, while others argue that it is a more distantly related Sino-Tibetan language overlaid with two millennia of loans.
Jerry Norman classified the traditional seven dialect groups into three larger groups: Northern (Mandarin), Central (Wu, Gan, and Xiang) and Southern (Hakka, Yue, and Min). He argued that the Southern Group is derived from a standard used in the Yangtze valley during the Han dynasty (206 BC – 220 AD), which he called Old Southern Chinese, while the Central group was transitional between the Northern and Southern groups. Some dialect boundaries, such as between Wu and Min, are particularly abrupt, while others, such as between Mandarin and Xiang or between Min and Hakka, are much less clearly defined.
Scholars account for the transitional nature of the central varieties in terms of wave models. Iwata argues that innovations have been transmitted from the north across the Huai River to the Lower Yangtze Mandarin area and from there southeast to the Wu area and westwards along the Yangtze River valley and thence to southwestern areas, leaving the hills of the southeast largely untouched.
A 2007 study compared fifteen major urban dialects on the objective criteria of lexical similarity and regularity of sound correspondences, and subjective criteria of intelligibility and similarity. Most of these criteria show a top-level split with Northern, New Xiang, and Gan in one group and Min (samples at Fuzhou, Xiamen, Chaozhou), Hakka, and Yue in the other group. The exception was phonological regularity, where the one Gan dialect (Nanchang Gan) was in the Southern group and very close to Meixian Hakka, and the deepest phonological difference was between Wenzhounese (the southernmost Wu dialect) and all other dialects.
The study did not find clear splits within the Northern and Central areas.
The two Wu dialects occupied an intermediate position, closer to the Northern/New Xiang/Gan group in lexical similarity and strongly closer in subjective intelligibility but closer to Min/Hakka/Yue in phonological regularity and subjective similarity, except that Wenzhou was farthest from all other dialects in phonological regularity. The two Wu dialects were close to each other in lexical similarity and subjective similarity but not in mutual intelligibility, where Suzhou was actually closer to Northern/Xiang/Gan than to Wenzhou.
In the Southern subgroup, Hakka and Yue grouped closely together on the three lexical and subjective measures but not in phonological regularity. The Min dialects showed high divergence: Min Fuzhou (Eastern Min) grouped only weakly with the Southern Min dialects of Xiamen and Chaozhou on the two objective criteria, and was actually slightly closer to Hakka and Yue on the subjective criteria.
Local varieties from different areas of China are often mutually unintelligible, differing at least as much as different Romance languages and perhaps even as much as Indo-European languages as a whole. These varieties form the Sinitic branch of the Sino-Tibetan language family (with Bai sometimes being included in this grouping). Because speakers share a standard written form, and have a common cultural heritage with long periods of political unity, the varieties are popularly perceived among native speakers as variants of a single Chinese language, and this is also the official position. Conventional English-language usage in Chinese linguistics is to use dialect for the speech of a particular place (regardless of status) while regional groupings like Mandarin and Wu are called dialect groups. ISO 639-3 follows the Ethnologue in assigning language codes to eight of the top-level groups listed above (all but Min and Pinghua) and five subgroups of Min. Other linguists choose to refer to the major groupings as languages. Sinologist David Moser stated that the Chinese authorities refer to them as "dialects" as a way to reinforce China as being a single nation.
In Chinese, the term fāngyán[b] is used for any regional subdivision of Chinese, from the speech of a village to major branches such as Mandarin and Wu. Linguists writing in Chinese often qualify the term to distinguish different levels of classification. All these terms have customarily been translated into English as dialect, a practice that has been criticized as confusing. The neologisms regionalect and topolect have been proposed as alternative renderings of fāngyán.[c]
The only varieties usually recognized as languages in their own right are Dungan and Taz. This is mostly due to political reasons, as they are spoken in the former Soviet Union and are usually written not in Han characters but in Cyrillic. Dungan is in fact a variety of Mandarin, with high although asymmetric mutual intelligibility with Standard Mandarin. Various mixed languages, particularly those spoken by ethnic minorities, are also referred to as languages, such as the Tangwang language and the E language. Some people and institutions may also refer to the Taiwanese language, the Cantonese language, and the Hakka languages. The Taiwanese Ministry of Education uses the terms "Minnan language" and "Taiwan Minnan language".
The usual unit of analysis is the syllable, traditionally analysed as consisting of an initial consonant, a final and a tone. In general, southern varieties have fewer initial consonants than northern and central varieties, but more often preserve the Middle Chinese final consonants. Some varieties, such as Cantonese and the Shanghai dialect, include syllabic nasals as independent syllables.
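As a rough illustration of the initial/final/tone analysis, the following Python sketch splits a Standard Chinese syllable written in tone-numbered pinyin into those three parts. The `Syllable` type, the `split_syllable` function and the `INITIALS` inventory are assumptions made for this example. The pinyin letters y and w are treated here as part of the final, that is, as a zero initial. The sketch covers pinyin spellings only and is not an analysis of any other variety.

```python
from dataclasses import dataclass

# Pinyin initials of Standard Chinese; two-letter initials come first so that
# "zh" is matched before "z". This inventory is an assumption of the sketch.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s")


@dataclass(frozen=True)
class Syllable:
    initial: str  # empty string stands for the zero initial
    final: str
    tone: int


def split_syllable(pinyin: str) -> Syllable:
    """Split tone-numbered pinyin such as 'ma3' into initial, final and tone."""
    body, tone = pinyin[:-1], int(pinyin[-1])
    for initial in INITIALS:
        # Require a non-empty final so that a bare nasal is kept as a final.
        if body.startswith(initial) and len(body) > len(initial):
            return Syllable(initial, body[len(initial):], tone)
    return Syllable("", body, tone)


print(split_syllable("ma3"))     # initial 'm', final 'a', tone 3
print(split_syllable("zhong1"))  # initial 'zh', final 'ong', tone 1
print(split_syllable("an1"))     # zero initial, final 'an', tone 1
```

Representing the zero initial as an empty string keeps every syllable in the same three-slot shape, which matches the convention of counting a zero initial among the initials of a variety.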
In the 42 varieties surveyed in the Great Dictionary of Modern Chinese Dialects, the number of initials (including a zero initial) ranges from 15 in some southern dialects to a high of 35 in the dialect of Chongming Island, Shanghai.
The initial system of the Fuzhou dialect of northern Fujian is a minimal example. With the exception of /ŋ/, which is often merged with the zero initial, the initials of this dialect are present in all Chinese varieties, although several varieties do not distinguish /n/ from /l/. However, most varieties have additional initials, due to a combination of innovations and retention of distinctions from Middle Chinese.
Conservative vowel systems, such as those of Gan dialects, have high vowels /i/, /u/ and /y/, which also function as medials, mid vowels /e/ and /o/, and a low /a/-like vowel. In other dialects, including Mandarin dialects, /o/ has merged with /a/, leaving a single mid vowel with a wide range of allophones. Many dialects, particularly in northern and central China, have apical or retroflex vowels, which are syllabic fricatives derived from high vowels following sibilant initials. In many Wu dialects, vowels and final glides have monophthongized, producing a rich inventory of vowels in open syllables. Reduction of medials is common in Yue dialects.
The Middle Chinese codas, consisting of glides /j/ and /w/, nasals /m/, /n/ and /ŋ/, and stops /p/, /t/ and /k/, are best preserved in southern dialects, particularly Yue dialects such as Cantonese. In Jin, Lower Yangtze Mandarin and Wu dialects, the stops have merged as a final glottal stop, while in most northern varieties they have disappeared. In Mandarin dialects final /m/ has merged with /n/, while some central dialects have a single nasal coda, in some cases realized as a nasalization of the vowel.
All varieties of Chinese, like neighbouring languages in the Mainland Southeast Asia linguistic area, have phonemic tones. Each syllable may be pronounced with between three and seven distinct pitch contours, denoting different morphemes. For example, the Beijing dialect distinguishes mā (妈 "mother"), má (麻 "hemp"), mǎ (马 "horse") and mà (骂 "to scold"). The number of tonal contrasts varies between dialects, with Northern dialects tending to have fewer distinctions than Southern ones. Many dialects have tone sandhi, in which the pitch contour of a syllable is affected by the tones of adjacent syllables in a compound word or phrase. This process is so extensive in Shanghainese that the tone system is reduced to a pitch accent system much like modern Japanese.
The tonal categories of modern varieties can be related by considering their derivation from the four tones of Middle Chinese, though cognate tonal categories in different dialects are often realized as quite different pitch contours. Middle Chinese had a three-way tonal contrast in syllables with vocalic or nasal endings. The traditional names of the tonal categories are "level"/"even" (平 píng), "rising" (上 shǎng) and "departing"/"going" (去 qù). Syllables ending in a stop consonant /p/, /t/ or /k/ (checked syllables) had no tonal contrasts but were traditionally treated as a fourth tone category, "entering" (入 rù), corresponding to syllables ending in nasals /m/, /n/, or /ŋ/.
The tones of Middle Chinese, as well as similar systems in neighbouring languages, experienced a tone split conditioned by syllabic onsets. Syllables with voiced initials tended to be pronounced with a lower pitch, and by the late Tang Dynasty, each of the tones had split into two registers conditioned by the initials, known as "upper" (阴/陰 yīn) and "lower" (阳/陽 yáng). When voicing was lost in all dialects except the Wu and Old Xiang groups, this distinction became phonemic, yielding eight tonal categories, with a six-way contrast in unchecked syllables and a two-way contrast in checked syllables. Cantonese maintains these tones and has developed an additional distinction in checked syllables as well as one in unchecked syllables. (The latter distinction has disappeared again in many varieties.)
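The arithmetic behind the eight categories can be sketched as follows: four Middle Chinese tone categories, each split into an upper and a lower register. The names `MC_TONES`, `REGISTERS` and `split_categories` are hypothetical labels chosen for this illustration, and the sketch models only the category count, not the pitch contours of any dialect.

```python
from itertools import product

# Middle Chinese tone categories; "entering" covers checked syllables.
MC_TONES = ("level", "rising", "departing", "entering")
# Registers conditioned by the initial: voiceless -> upper, voiced -> lower.
REGISTERS = ("upper", "lower")


def split_categories():
    """Return every (register, tone) pair produced by the tone split."""
    return [(register, tone) for tone, register in product(MC_TONES, REGISTERS)]


categories = split_categories()
unchecked = [c for c in categories if c[1] != "entering"]
checked = [c for c in categories if c[1] == "entering"]

print(len(categories), len(unchecked), len(checked))  # 8 6 2
```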
However, most Chinese varieties have reduced the number of tonal distinctions. For example, in Mandarin, the tones resulting from the split of Middle Chinese rising and departing tones merged, leaving four tones. Furthermore, final stop consonants disappeared in most Mandarin dialects, and such syllables were distributed amongst the four remaining tones, seemingly at random.
In Wu, voiced obstruents were retained, and the tone split never became phonemic: the higher-pitched allophones occur with initial voiceless consonants, and the lower-pitched allophones occur with initial voiced consonants. (Traditional Chinese classification nonetheless counts these as different tones.) Most Wu dialects retain the tone categories of Middle Chinese, but in Shanghainese several of these have merged.
Many Chinese varieties exhibit tone sandhi, in which the realization of a tone varies depending on the context of the syllable. For example, in Standard Chinese a third tone changes to a second tone when followed by another third tone. Particularly complex sandhi patterns are found in Wu dialects and coastal Min dialects. In Shanghainese, the tone of all syllables in a word is determined by the tone of the first, so that Shanghainese has word rather than syllable tone.
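The Standard Chinese rule mentioned above can be written as a small function over tone numbers. This is a minimal sketch: `apply_third_tone_sandhi` is a hypothetical name, tones are given as integers 1 to 4, and the rule looks only at the underlying tone of the following syllable. Actual pronunciation of longer runs of third tones also depends on how a phrase is grouped into words, which this sketch ignores.

```python
def apply_third_tone_sandhi(tones):
    """Return surface tones after changing 3 to 2 before another underlying 3."""
    surface = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            surface[i] = 2
    return surface


# ni3 hao3 ("hello") is pronounced with a second tone on the first syllable.
print(apply_third_tone_sandhi([3, 3]))     # [2, 3]
print(apply_third_tone_sandhi([3, 1, 3]))  # [3, 1, 3], no adjacent third tones
```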
Old Chinese had two families of negatives starting with *p- and *m-, respectively. Northern and Central varieties tend to use a word from the first family, cognate with Beijing pu5 不, as the ordinary negator. A word from the second family is used as an existential negator 'have not', as in Beijing mei2 沒 and Shanghai m2. In Mandarin varieties this word is also used for 'not yet', whereas in Wu and other groups a different form is typically used. In Southern varieties, negators tend to come from the second family. The ordinary negators in these varieties are all derived from a syllabic nasal *m̩, though it has a level tone in Hakka and Yue and a rising tone in Min. Existential negators derive from a proto-form *mau, though again the tonal category varies between groups.
First- and second-person pronouns are cognate across all varieties. For third-person pronouns, Jin, Mandarin, and Xiang varieties have cognate forms, but other varieties generally use forms that originally had a velar or glottal initial.
The Min languages are often regarded as furthest removed linguistically from Standard Chinese in phonology, grammar, and vocabulary. Historically, the Min languages were the first to diverge from the rest of the Chinese languages (see the discussion of historical Chinese phonology for more details). The Min languages are also the group with the greatest amount of internal diversity and are often regarded as consisting of at least five separate languages, e.g. Northern Min, Southern Min, Central Min, Eastern Min, and Puxian Min.
In southern China (not including Hong Kong and Macau), where the difference between Standard Chinese and local dialects is particularly pronounced, well-educated Chinese are generally fluent in Standard Chinese, and most people have at least a good passive knowledge of it, in addition to being native speakers of the local dialect. The choice of dialect varies based on the social situation. Standard Chinese is usually considered more formal and is required when speaking to a person who does not understand the local dialect. The local dialect (be it non-Standard Chinese or non-Mandarin altogether) is generally considered more intimate and is used among close family members and friends and in everyday conversation within the local area. Chinese speakers will frequently code-switch between Standard Chinese and the local dialect. Parents will generally speak to their children in dialect, and the relationship between dialect and Mandarin appears to be mostly stable, even amounting to diglossia. Local dialects are valued as symbols of regional cultures.
People generally are tied to the hometown and therefore the hometown dialect, instead of a broad linguistic classification. For example, a person from Wuxi may claim that he speaks Wuxi dialect, even though it is similar to Shanghainese (another Wu dialect). Likewise, a person from Xiaogan may claim that he speaks Xiaogan dialect. Linguistically, Xiaogan dialect is a dialect of Mandarin, but the pronunciation and diction are quite different from spoken Standard Chinese.
Knowing the local dialect is of considerable social benefit, and most Chinese who permanently move to a new area will attempt to pick up the local dialect. Learning a new dialect is usually done informally through a process of immersion and recognizing sound shifts. Generally the differences are more pronounced lexically than grammatically. Typically, a speaker of one dialect of Chinese will need about a year of immersion to understand the local dialect and about three to five years to become fluent in speaking it. Because of the variety of dialects spoken, there are usually few formal methods for learning a local dialect.
Due to the variety in Chinese speech, Mandarin speakers from each area of China often fuse or "translate" words from their local language into their Mandarin conversations. In addition, each area of China has its own recognizable accent in Mandarin. Generally, the nationally standardized form of Mandarin pronunciation is heard only in news and radio broadcasts. Even on the streets of Beijing, the pronunciation of Mandarin differs from that heard in the media.
Within mainland China, there has been a persistent drive towards promoting the standard language (Chinese: 大力推广普通话; pinyin: dàlì tuīguǎng Pǔtōnghuà); for instance, the education system is entirely Mandarin-medium from the second year onward. However, usage of local dialect is tolerated and socially preferred in many informal situations. In Hong Kong, colloquial Cantonese characters are never used in formal documents other than quoting witnesses' spoken statements during legal trials, and within the PRC a character set closer to Mandarin tends to be used. At the national level, differences in dialect generally do not correspond to political divisions or categories, and this has for the most part prevented dialect from becoming the basis of identity politics.
Historically, many of the people who promoted Chinese nationalism were from southern China and did not natively speak Mandarin, and even leaders from northern China rarely spoke with the standard accent. For example, Mao Zedong often emphasized his origins in Hunan in speaking, rendering much of what he said incomprehensible to many Chinese. Chiang Kai-shek and Sun Yat-sen were also from southern China, which is reflected in their conventional English names: these follow the Cantonese pronunciations of their given names and differ from the Mandarin pinyin spellings Jiǎng Jièshí and Sūn Yìxiān. One consequence of this is that China does not have a well-developed tradition of spoken political rhetoric, and most Chinese political works are intended primarily as written works rather than spoken works. Another factor that limits the political implications of dialect is that it is very common within an extended family for different people to know and use different dialects.
Before 1945, other than a small Japanese-speaking population, most of the population of Taiwan were Han Chinese, who spoke Taiwanese Hokkien or Hakka, with a minority of Taiwanese aborigines, who spoke Formosan languages. When the Kuomintang retreated to the island after losing the Chinese Civil War in 1949, they brought a substantial influx of speakers of Northern Chinese (and other dialects from across China), and viewed the use of Mandarin as part of their claim to be a legitimate government of the whole of China. Education policy promoted the use of Mandarin over the local languages, and was implemented especially rigidly in elementary schools, with punishments and public humiliation for children using other languages at school.
From the 1970s, the government promoted adult education in Mandarin, required Mandarin for official purposes, and encouraged its increased use in broadcasting. Over a 40-year period, these policies succeeded in spreading the use and prestige of Mandarin through society at the expense of the other languages. They also aggravated social divisions, as Mandarin speakers found it difficult to find jobs in private companies but were favored for government positions. From the 1990s, Taiwanese native languages were offered in elementary and middle schools, first in Yilan county, then in other areas governed by elected Democratic Progressive Party (DPP) politicians, and finally throughout the island.
In 1966, the Singaporean government implemented a policy of bilingual education, where Singaporean students learn both English and their designated native language, which was Mandarin for Chinese Singaporeans (even though Singaporean Hokkien had previously been their lingua franca). The Goh Report, an evaluation of Singapore's education system by Goh Keng Swee, showed that less than 40% of the student population managed to attain minimum levels of competency in two languages. It was later determined that the learning of Mandarin among Singaporean Chinese was hindered by home use of other Chinese varieties, such as Hokkien, Teochew, Cantonese and Hakka. Hence, the government decided to rectify problems facing implementation of the bilingual education policy, by launching a campaign to promote Mandarin as a common language among the Chinese population, and to discourage use of other Chinese varieties.
Launched in 1979 by then Prime Minister Lee Kuan Yew, the campaign aimed to simplify the language environment for Chinese Singaporeans, improve communication between them, and create a Mandarin-speaking environment conducive to the successful implementation of the bilingual education programme. The initial goal of the campaign was for all young Chinese to stop speaking dialects in five years, and to establish Mandarin as the language of choice in public places within 10 years. According to the government, for the bilingual policy to be effective, Mandarin should be spoken at home and should serve as the lingua franca among Chinese Singaporeans. They also argued that Mandarin was more economically valuable, and speaking Mandarin would help Chinese Singaporeans retain their heritage, as Mandarin contains a cultural repository of values and traditions that are identifiable to all Chinese, regardless of dialect group.