Dialect continuum

Geographic range of dialects that vary more strongly at the distant ends

A dialect continuum or dialect chain is a series of language varieties spoken across some geographical area such that neighboring varieties are mutually intelligible, but the differences accumulate over distance so that widely separated varieties may not be. This is a typical occurrence with widely spread languages and language families around the world, when these languages did not spread recently. Some prominent examples include the Indo-Aryan languages across large parts of India, varieties of Arabic across north Africa and southwest Asia, the Chinese languages or dialects, and subgroups of the Romance, Germanic and Slavic families in Europe. Leonard Bloomfield used the name dialect area.[1] Charles F. Hockett used the term L-complex.[2]
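The definition above can be illustrated with a toy model. The sketch below is not a linguistic measurement: it assumes each variety is a tuple of binary features, that each site along the chain has adopted one more innovation than its neighbor, and that two varieties count as mutually intelligible when they share at least 70% of their features. The function names, the feature encoding and the threshold are all hypothetical.

```python
# Toy model of a dialect chain (all data and the threshold are assumptions).

def build_chain(n_sites, n_features):
    """Site i has adopted the first i innovations, so neighbors differ in one feature."""
    return [tuple(1 if f < i else 0 for f in range(n_features))
            for i in range(n_sites)]

def shared_fraction(a, b):
    """Fraction of features on which two varieties agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mutually_intelligible(a, b, threshold=0.7):
    """Assumed cut-off: varieties sharing at least `threshold` of features count as intelligible."""
    return shared_fraction(a, b) >= threshold

chain = build_chain(n_sites=6, n_features=5)

# Every pair of neighboring sites is mutually intelligible...
neighbors_ok = all(mutually_intelligible(chain[i], chain[i + 1])
                   for i in range(len(chain) - 1))

# ...but the two ends of the chain are not, because differences accumulate.
ends_ok = mutually_intelligible(chain[0], chain[-1])
```

In this model, adjacent sites agree on four of five features while the end points agree on none, so intelligibility holds link by link but not across the whole chain.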

Dialect continua typically occur in long-settled agrarian populations, as innovations spread from their various points of origin as waves. In this situation, hierarchical classifications of varieties are impractical. Instead, dialectologists map variation of various language features across a dialect continuum, drawing lines called isoglosses between areas that differ with respect to some feature.[3]

A variety within a dialect continuum may be developed and codified as a standard language, and then serve as an authority for part of the continuum, e.g. within a particular political unit or geographical area. Since the early 20th century, the increasing dominance of nation-states and their standard languages has been steadily eliminating the nonstandard dialects that comprise dialect continua, making the boundaries ever more abrupt and well-defined.

Dialectologists record variation across a dialect continuum using maps of various features collected in a linguistic atlas, beginning with an atlas of German dialects by Georg Wenker (from 1888), based on a postal survey of schoolmasters. The influential Atlas linguistique de la France (1902–10) pioneered the use of a trained fieldworker.[4] These atlases typically consist of display maps, each showing local forms of a particular item at the survey locations.[5]

Secondary studies may include interpretive maps, showing the areal distribution of various variants.[5] A common tool in these maps is an isogloss, a line separating areas where different variants of a particular feature predominate.[6]

In a dialect continuum, isoglosses for different features are typically spread out, reflecting the gradual transition between varieties.[7] A bundle of coinciding isoglosses indicates a stronger dialect boundary, as might occur at geographical obstacles or long-standing political boundaries.[8] In other cases, intersecting isoglosses and more complex patterns are found.[9]
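The contrast between spread-out isoglosses and a bundle can be sketched with invented survey data. The example below assumes survey sites ordered along a single transect and one recorded variant per feature per site. An isogloss is placed in each gap between adjacent sites where the variant changes, and a gap crossed by at least three isoglosses is treated as a bundle. The data, the function names and the bundle size are illustrative assumptions, not values from any atlas.

```python
from collections import Counter

# Hypothetical data: variant of each feature at six sites along a transect.
survey = {
    "feature_a": ["x", "x", "y", "y", "y", "y"],
    "feature_b": ["x", "x", "x", "y", "y", "y"],
    "feature_c": ["x", "x", "x", "y", "y", "y"],
    "feature_d": ["x", "x", "x", "y", "y", "y"],
    "feature_e": ["x", "x", "x", "x", "x", "y"],
}

def isogloss_gaps(variants):
    """Indices i where the variant changes between site i and site i + 1."""
    return [i for i in range(len(variants) - 1)
            if variants[i] != variants[i + 1]]

def count_isoglosses(data):
    """Number of isoglosses falling in each gap between adjacent sites."""
    counts = Counter()
    for variants in data.values():
        counts.update(isogloss_gaps(variants))
    return counts

def bundles(data, min_size=3):
    """Gaps crossed by at least `min_size` isoglosses (assumed bundle size)."""
    return sorted(gap for gap, n in count_isoglosses(data).items()
                  if n >= min_size)

gap_counts = dict(count_isoglosses(survey))
strong_boundaries = bundles(survey)
```

Here three isoglosses coincide between the third and fourth sites, which the sketch reports as a stronger boundary, while the remaining two isoglosses fall in separate gaps, as in a gradual transition.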

Standard varieties may be developed and codified at one or more locations in a continuum until they have independent cultural status (autonomy), a process the German linguist Heinz Kloss called ausbau. Speakers of local varieties typically read and write a related standard variety, use it for official purposes, hear it on radio and television, and consider it the standard form of their speech, so that any standardizing changes in their speech are towards that variety. In such cases the local variety is said to be dependent on, or heteronomous with respect to, the standard variety.[11]

A standard variety together with its dependent varieties is commonly considered a "language", with the dependent varieties called "dialects" of the language, even if the standard is mutually intelligible with another standard from the same continuum.[12][13] The Scandinavian languages, Danish, Norwegian and Swedish, are often cited as examples.[14] Conversely, a language defined in this way may include local varieties that are mutually unintelligible, such as the German dialects.[15]

The choice of standard is often determined by a political boundary, which may cut across a dialect continuum. As a result, speakers on either side of the boundary may use almost identical varieties, but treat them as dependent on different standards, and thus part of different "languages".[16] The various local dialects then tend to be leveled towards their respective standard varieties, disrupting the previous dialect continuum.[17] Examples include the boundaries between Dutch and German, between Czech, Slovak and Polish, and between Belarusian and Ukrainian.[18][19]

The choice may be a matter of national, regional or religious identity, and may be controversial. One example is the disputed territory of Kashmir, in which local Muslims usually regard their language as Urdu, the national standard of Pakistan, while Hindus regard the same speech as Hindi, an official standard of India. Urdu is nevertheless also among the 22 scheduled languages listed in the Eighth Schedule to the Indian Constitution.

During the time of the former Socialist Republic of Macedonia, a standard was developed from local varieties of Eastern South Slavic, within a continuum with Torlakian to the north and Bulgarian to the east. The standard was deliberately based on varieties from the west of the republic that were most different from standard Bulgarian. Now known as Macedonian, it is the national standard of North Macedonia, but viewed by Bulgarians as a dialect of Bulgarian.[20]

Europe provides several examples of dialect continua, the largest of which involve the Germanic, Romance and Slavic branches of the Indo-European language family, the continent's largest language groups.

The Romance area spanned much of the territory of the Roman Empire but was split into western and eastern portions by the Slavic migrations into the Balkans in the 7th and 8th centuries.

The Slavic area was in turn split by the Hungarian conquest of the Carpathian Basin in the 9th and 10th centuries.

Distribution of the native speakers of major continental West Germanic dialects today (dialects of the following languages: Dutch, German, and Frisian). Map colors do not reflect the actual relationship between the varieties.

The Germanic languages and dialects of Scandinavia are a classic example of a dialect continuum, from Swedish dialects in Finland, to Swedish, Gutnish, Elfdalian, Scanian, Danish, Norwegian (Bokmål and Nynorsk), Faroese, Icelandic, with many local dialects of those languages. The Continental North Germanic languages (Swedish, Danish, and Norwegian) are close enough and intelligible enough for some to consider them to be dialects of the same language, but the Insular ones (Icelandic and Faroese) are not immediately intelligible to the other North Germanic speakers.

Historically, the Dutch, Frisian and German dialects formed a canonical dialect continuum, which has been gradually falling apart since the Late Middle Ages due to the pressures of modern education, standard languages, migration and weakening knowledge of the dialects.[22]

The transition from German dialects to Dutch variants followed two basic routes:

Though the internal dialect continua of both Dutch and German remain largely intact, the continuum which historically connected the Dutch, Frisian and German languages has largely disintegrated. Fragmentary areas of the Dutch-German border in which language change is more gradual than in other sections or a higher degree of mutual intelligibility is present still exist, such as the Aachen-Kerkrade area, but the historical chain in which dialects were only divided by minor isoglosses and negligible differences in vocabulary has seen a rapid and ever-increasing decline since the 1850s.[22]

Standard Dutch (based on the dialects of the principal Brabantic and Hollandic cities) and Standard German (originating at the chanceries of Meissen and Vienna) are not closely linked with regard to their ancestral dialects. They therefore do not show a high degree of mutual intelligibility when spoken, and are only partially intelligible when written. One study of written language concluded that Dutch speakers could translate 50.2% of the provided German words correctly, while German speakers could translate 41.9% of the Dutch equivalents correctly. In terms of orthography, 22% of the vocabulary of Dutch and German is identical or near-identical.[24][25]

The Germanic varieties of Great Britain, which are usually called "English" in England and "Scots" in Scotland, are often mutually intelligible for large areas north and south of the border between England and Scotland. The Orcadian dialect of Scots is very different from the various dialects of English in southern England, but they are linked by a chain of intermediate steps.

The western continuum of Romance languages comprises, from West to East: in Portugal, Portuguese; in Spain, Galician, Leonese or Asturian, Castilian or Spanish, Aragonese and Catalan or Valencian; in France, Occitan, Franco-Provençal, standard French and Corsican which is closely related to Italian; in Italy, Piedmontese, Italian, Lombard, Emilian-Romagnol, Venetian, Friulian, Ladin; and in Switzerland, Lombard and Romansh. This continuum is sometimes presented as another example, but the major languages in the group (i.e. Portuguese, Spanish, French and Italian) have had separate standards for longer than the languages in the Continental West Germanic group, and so are not commonly classified as dialects of a common language.

Focusing instead on the local Romance lects that pre-existed the establishment of national or regional standard languages, all evidence and principles point to Romania continua as having been, and to varying extents in some areas still being, what Charles Hockett called an L-complex, i.e. an unbroken chain of local differentiation such that, in principle and with appropriate caveats, intelligibility (due to sharing of features) attenuates with distance. This is perhaps most evident today in Italy, where, especially in rural and small-town contexts, local Romance is still often employed at home and work, and geolinguistic distinctions are such that while native speakers from any two nearby towns can understand each other with ease, they can also spot from linguistic features that the other is from elsewhere.

In recent centuries, the intermediate dialects between the major Romance languages have been moving toward extinction, as their speakers have switched to varieties closer to the more prestigious national standards. That has been most notable in France,[citation needed] owing to the French government's refusal to recognise minority languages,[26] but it has occurred to some extent in all Western Romance speaking countries. Language change has also threatened the survival of stateless languages with existing literary standards, such as Occitan.

The Romance languages of Italy are a less arguable example of a dialect continuum. For many decades since Italy's unification, the attitude of the French government towards the ethnolinguistic minorities was copied by the Italian government.[27][28]

The eastern Romance continuum is dominated by Romanian. Outside Romania and Moldova, across the other south-east European countries, various Romanian language groups are to be found: pockets of various Romanian and Aromanian subgroups survive throughout Bulgaria, Serbia, North Macedonia, Greece, Albania and Croatia (in Istria).

Conventionally, on the basis of extralinguistic features (such as writing systems or the former western frontier of the Soviet Union), the North Slavic continuum is split into East and West Slavic continua. From the perspective of linguistic features alone, only two Slavic (dialect) continua can be distinguished, namely, North and South.[29][30][31]

The North Slavic continuum covers the East Slavic and West Slavic languages. East Slavic includes Russian, Belarusian, Rusyn and Ukrainian; West Slavic includes Czech, Polish, Slovak, Silesian, Kashubian, and Upper and Lower Sorbian.

Ukrainian dialects and, to a lesser degree, Belarusian have been influenced by neighbouring Polish, due to the historical ties with the Polish–Lithuanian Commonwealth.

All South Slavic languages form a dialect continuum.[32][33] It comprises, from West to East, Slovenia, Croatia, Bosnia and Herzegovina, Montenegro, Serbia, North Macedonia, and Bulgaria.[34][35] Standard Slovene, Macedonian, and Bulgarian are each based on a distinct dialect, but the Bosnian, Croatian, Montenegrin, and Serbian standard varieties of the pluricentric Serbo-Croatian language are all based on the same dialect, Shtokavian.[36][37][38] Therefore, Croats, Serbs, Bosniaks and Montenegrins communicate fluently with each other in their respective standardized varieties.[39][40][41] In Croatia, native speakers of Shtokavian may struggle to understand distinct Kajkavian or Chakavian dialects, as might the speakers of the two with each other.[42][43] Likewise in Serbia, the Torlakian dialect differs significantly from Standard Serbian. Serbian is a Western South Slavic standard, but Torlakian is largely transitional with the Eastern South Slavic languages (Bulgarian and Macedonian). Collectively, the Torlakian dialects, Macedonian and Bulgarian share many grammatical features that set them apart from all other Slavic languages, such as the complete loss of their grammatical case systems and the adoption of features more commonly found among analytic languages.

The barrier between East South Slavic and West South Slavic is historical and natural, caused primarily by a one-time geographical distance between speakers. The two varieties started diverging early on (circa 11th century CE) and evolved separately ever since without major mutual influence, as evidenced by distinguishable Old Bulgarian, while the western dialect of common Old Slavic was still spoken across the modern Serbo-Croatian area in the 12th and early 13th centuries. An intermediate dialect linking western and eastern variations inevitably came into existence over time – Torlakian – spoken across a wide radius on which the tripoint of Bulgaria, North Macedonia and Serbia is relatively pivotal.

The other major language family in Europe besides Indo-European is Uralic. The Sami languages, sometimes mistaken for a single language, are a dialect continuum, albeit with some disconnections, such as those between North, Skolt and Inari Sami. The Baltic-Finnic languages spoken around the Gulf of Finland form a dialect continuum. Thus, although Finnish and Estonian are separate languages, there is no definite linguistic border or isogloss that separates them. This is now more difficult to recognize because many of the intervening languages have declined or become extinct.

Historically, two Celtic dialect continua existed in North-West Europe. These chains of dialects have been broken in several places due to language death, but several varieties continue to be mutually intelligible.

The Goidelic languages consist of Irish, Scottish Gaelic and Manx. Prior to the 19th and 20th centuries, the continuum existed throughout Ireland, the Isle of Man and Scotland.[44][45] Many intermediate dialects have died out, leaving major gaps between languages, such as on Rathlin and Arran and in Kintyre,[46] and also in the Irish counties of Antrim, Londonderry and Down.

The current Goidelic speaking areas of Ireland are also separated by extinct dialects but remain mutually intelligible.

The Brythonic languages consist of Breton, Cornish, Welsh and Cumbric.[47] Cumbric went extinct in the medieval era, while Cornish died out in the 18th century but has since been revived.[48]

Arabic is a standard case of diglossia.[49] The standard written language, Modern Standard Arabic, is based on the Classical Arabic of the Qur'an, while the modern vernacular dialects (or languages) branched from ancient Arabic dialects, from North Western Africa through Egypt, Sudan, and the Fertile Crescent to the Arabian Peninsula and Iraq. The dialects use different analogues from the Arabic language inventory and have been influenced by different substrate and superstrate languages. Adjacent dialects are mutually understandable to a large extent, but those from distant regions are more difficult to understand.[50]

The difference between the written standard and the vernaculars is apparent also in the written language, and children have to be taught Modern Standard Arabic in school to be able to read it.

In Assyrian Neo-Aramaic, the continuum starts from the Assyrian tribes in northern Iraq (Alqosh, Batnaya), whose dialects are at times considered part of the Chaldean Neo-Aramaic language. Nearing the Iraq–Turkey border in the north, the Barwar and Tyari dialects begin to sound "traditionally Assyrian". The Barwar and Tyari dialects are "transitional", having both Chaldean and Assyrian phonetic features.[51]

In Hakkari, as one goes eastward towards Iran, the Gawar, Baz, Jilu and Nochiya dialects respectively begin to sound slightly distinct from the Tyari/Barwar dialects in the west and more like the prestigious "Urmian" dialect in Urmia, Western Azerbaijan, which is considered the Standard Assyrian dialect, alongside the Iraqi Koine.[52] The dialects in northern Iraq (or "far west" in this continuum), such as those of Alqosh and Batnaya, would not be completely intelligible to those in Western Iran ("far east") even if the same language is spoken.

Going further westward, the "dialect" of Tur Abdin in Turkey, known as Turoyo, has a very distinct pronunciation of words and a different vocabulary to some extent. Turoyo is usually considered to be a discrete language rather than a mere dialect of Assyrian Neo-Aramaic. Finally, both Assyrian and Turoyo are considered to be dialects of the Syriac language.[53]

The Kurdish language is considered a dialect continuum of many varieties, of which the three main ones are Northern Kurdish (Kurmanji), Central Kurdish (Sorani), and Southern Kurdish (Xwarîn).

The Persian language, in its various varieties (Tajiki and Dari), is representative of a dialect continuum. The divergence of Tajik was accelerated by the shift from the Perso-Arabic alphabet to a Cyrillic one under the Soviets. Western dialects of Persian show greater influence from Arabic and Oghuz Turkic languages,[citation needed] but Dari and Tajik tend to preserve many classical features in grammar and vocabulary.[citation needed] The Tat language, a dialect of Persian, is also spoken in Azerbaijan.

Turkic languages are best described as a dialect continuum.[54] Geographically this continuum starts at the Balkans in the west with Balkan Turkish, includes Turkish in Turkey and Azerbaijani in Azerbaijan, extends into Iran with Azeri and Khalaj, into Iraq with Turkmen, across Central Asia to include Turkmenistan, Uzbekistan, Kazakhstan, Kyrgyzstan, to southern regions of Tajikistan and into Afghanistan. In the south, the continuum starts in northern Afghanistan and runs northward to Chuvashia. In the east it extends to the Republic of Tuva, the Xinjiang autonomous region in Western China with the Uyghur language and into Mongolia with Khoton. The entire territory is inhabited by Turkic-speaking peoples. There are three varieties of Turkic geographically outside the continuum: Chuvash, Yakut and Dolgan. They have been geographically separated from the other Turkic languages for an extensive period of time, and Chuvash stands out as the most divergent from other Turkic languages.

There are also Gagauz speakers in Moldova and Urum speakers in Georgia.

The Turkic continuum makes internal genetic classification of the languages problematic. Chuvash, Khalaj and Yakut are generally classified as significantly distinct, but the remaining Turkic languages are quite similar, with a high degree of mutual intelligibility between not only geographically adjacent varieties but also among some varieties some distance apart.[citation needed] Structurally, the Turkic languages are very close to one another, and they share basic features such as SOV word order, vowel harmony and agglutination.[55]

Many of the Indo-Aryan languages of the Indian subcontinent form a dialect continuum. What is called "Hindi" in India is frequently Standard Hindi, the Sanskritized register of the colloquial Hindustani spoken in the Delhi area, the other register being Urdu. However, the term Hindi is also used for the different dialects from Bihar to Rajasthan and, more widely, some of the Eastern and Northern dialects are sometimes grouped under Hindi.[citation needed] The Indo-Aryan Prakrits also gave rise to languages like Gujarati, Assamese, Maithili, Bengali, Odia, Nepali, Marathi, Konkani and Punjabi.

Chinese consists of hundreds of local varieties, many of which are not mutually intelligible.[56][57] The differences are similar to those within the Romance languages, which are similarly descended from a language spread by imperial expansion over substrate languages 2000 years ago.[58] Unlike Europe, however, Chinese political unity was restored in the late 6th century and has persisted (with interludes of division) until the present day. There are no equivalents of the local standard literary languages that developed in the numerous independent states of Europe.[59]

Chinese dialectologists have divided the local varieties into a number of dialect groups, largely based on phonological developments in comparison with Middle Chinese.[60] Most of these groups are found in the rugged terrain of the southeast, reflecting the greater variation in this area, particularly in Fujian.[61][62] Each of these groups contains numerous mutually unintelligible varieties.[56] Moreover, in many cases the transitions between groups are smooth, as a result of centuries of interaction and multilingualism.[63]

The boundaries between the northern Mandarin area and the central groups, Wu, Gan and Xiang, are particularly weak, due to the steady flow of northern features into these areas.[64][65] Transitional varieties between the Wu, Gan and Mandarin groups have been variously classified, with some scholars assigning them to a separate Hui group.[66][67] The boundaries between Gan, Hakka and Min are similarly indistinct.[68][69] Pinghua and Yue form a dialect continuum (excluding urban enclaves of Cantonese).[70] There are sharper boundaries resulting from more recent expansion between Hakka and Yue, and between Southwestern Mandarin and Yue, but even here there has been considerable convergence in contact areas.[71]

Cree is a group of closely related Algonquian languages that are distributed from Alberta to Labrador in Canada. They form the Cree–Montagnais–Naskapi dialect continuum, with around 117,410 speakers. The languages can be roughly classified into nine groups, from west to east:

Various Cree languages are used as languages of instruction and taught as subjects: Plains Cree, Eastern Cree, Montagnais, etc. Mutual intelligibility between some dialects can be low. There is no accepted standard dialect.[72][73][74]

Ojibwa (Chippewa) is a group of closely related Algonquian languages spoken in Canada, from British Columbia to Quebec, and in the United States, from Montana to Michigan, with diaspora communities in Kansas and Oklahoma. Like Cree, Ojibwe forms its own dialect continuum, but the Oji-Cree language of this continuum joins the Cree–Montagnais–Naskapi dialect continuum through Swampy Cree. The Ojibwe continuum has 70,606 speakers. Roughly from northwest to southeast, it has these dialects:

Unlike the Cree–Montagnais–Naskapi dialect continuum, with distinct n/y/l/r/ð dialect characteristics and noticeable west-east k/č(ch) axis, the Ojibwe continuum is marked with vowel syncope along the west-east axis and ∅/n along the north-south axis.