A diphthong (DIF-thong, DIP-; from Ancient Greek δίφθογγος (díphthongos) 'two sounds', from Ancient Greek δίς (dís) 'twice', and φθόγγος (phthóngos) 'sound'), also known as a gliding vowel, is a combination of two adjacent vowel sounds within the same syllable. Technically, a diphthong is a vowel with two different targets: that is, the tongue (and/or other parts of the speech apparatus) moves during the pronunciation of the vowel. In most varieties of English, the phrase "no highway cowboy" has five distinct diphthongs, one in every syllable.
Diphthongs contrast with monophthongs, where the tongue or other speech organs do not move and the syllable contains only a single vowel sound. For instance, in English, the word ah is spoken as a monophthong, while the word ow is spoken as a diphthong in most varieties. Where two adjacent vowel sounds occur in different syllables (e.g. in the English word re-elect) the result is described as hiatus, not as a diphthong. (The English word hiatus is itself an example of both hiatus and diphthongs.)
Diphthongs often form when separate vowels are run together in rapid speech during a conversation. However, there are also unitary diphthongs, as in the English examples above, which are heard by listeners as single-vowel sounds (phonemes).
In the International Phonetic Alphabet (IPA), monophthongs are transcribed with one symbol, as in English sun [sʌn], in which ⟨ʌ⟩ represents a monophthong. Diphthongs are transcribed with two symbols, as in English high [haɪ] or cow [kaʊ], in which ⟨aɪ⟩ and ⟨aʊ⟩ represent diphthongs.
Diphthongs may be transcribed with two vowel symbols or with a vowel symbol and a semivowel symbol. In the words above, the less prominent member of the diphthong can be represented with the symbols for the palatal approximant [j] and the labiovelar approximant [w], with the symbols for the close vowels [i] and [u], or the symbols for the near-close vowels [ɪ] and [ʊ]:
Some transcriptions are broader or narrower (less precise or more precise phonetically) than others. Transcribing the English diphthongs in high and cow as ⟨aj aw⟩ or ⟨ai̯ au̯⟩ is a less precise or broader transcription, since these diphthongs usually end in a vowel sound that is more open than the semivowels [j w] or the close vowels [i u]. Transcribing the diphthongs as ⟨aɪ̯ aʊ̯⟩ is a more precise or narrower transcription, since the English diphthongs usually end in the near-close vowels [ɪ ʊ].
The non-syllabic diacritic, the inverted breve below ⟨◌̯⟩, is placed under the less prominent part of a diphthong to show that it is part of a diphthong rather than a vowel in a separate syllable: [aɪ̯ aʊ̯]. When there is no contrastive vowel sequence in the language, the diacritic may be omitted. Other common indications that the two sounds are not separate vowels are a superscript, ⟨aᶦ aᶷ⟩, or a tie bar, ⟨a͡ɪ a͡ʊ⟩ or ⟨a͜ɪ a͜ʊ⟩. The tie bar can be useful when it is not clear which symbol represents the syllable nucleus, or when they have equal weight. Superscripts are especially used when an on- or off-glide is particularly fleeting.
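The notations described above map onto Unicode combining characters. The following is a minimal Python sketch, assuming simple two-symbol diphthongs; the helper names `with_nonsyllabic` and `with_tie_bar` are hypothetical and chosen only for illustration.

```python
# Unicode combining characters for the notations described above.
NON_SYLLABIC = "\u032f"  # COMBINING INVERTED BREVE BELOW
TIE_ABOVE = "\u0361"     # COMBINING DOUBLE INVERTED BREVE (tie bar above)
TIE_BELOW = "\u035c"     # COMBINING DOUBLE BREVE BELOW (tie bar below)


def with_nonsyllabic(nucleus: str, glide: str) -> str:
    """Mark the second, less prominent element as non-syllabic."""
    return nucleus + glide + NON_SYLLABIC


def with_tie_bar(first: str, second: str, below: bool = False) -> str:
    """Join two symbols with a tie bar; the combining mark follows the first symbol."""
    return first + (TIE_BELOW if below else TIE_ABOVE) + second


if __name__ == "__main__":
    for glide in ("\u026a", "\u028a"):  # the near-close vowels ɪ and ʊ
        print(
            with_nonsyllabic("a", glide),
            with_tie_bar("a", glide),
            with_tie_bar("a", glide, below=True),
        )
```

Because the marks are combining characters, each result is three code points long even though it renders as two letters with a diacritic.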
The period ⟨.⟩ is the opposite of the non-syllabic diacritic: it represents a syllable break. If two vowels next to each other belong to two different syllables (hiatus), meaning that they do not form a diphthong, they can be transcribed with two vowel symbols with a period in between. Thus, lower can be transcribed ⟨ˈloʊ.ər⟩, with a period separating the first syllable, ⟨loʊ⟩, from the second syllable, ⟨ər⟩.
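The syllable-break convention makes a transcription easy to split mechanically. The sketch below assumes that syllable breaks are always written explicitly with a period and that stress marks appear only at syllable edges; the function name `syllables` is hypothetical.

```python
def syllables(transcription: str) -> list[str]:
    """Split an IPA transcription at syllable-break periods, dropping stress marks."""
    stress_marks = "\u02c8\u02cc"  # primary (ˈ) and secondary (ˌ) stress
    return [part.strip(stress_marks) for part in transcription.split(".")]


if __name__ == "__main__":
    # "lower" as transcribed in the text: two syllables, so the vowels are in hiatus.
    print(syllables("\u02c8lo\u028a.\u0259r"))
```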
The non-syllabic diacritic is used only when necessary. It is typically omitted when there is no ambiguity, as in ⟨haɪ kaʊ⟩. No words in English have the vowel sequences *[a.ɪ a.ʊ], so the non-syllabic diacritic is unnecessary.
Falling (or descending) diphthongs start with a vowel quality of higher prominence (higher pitch or volume) and end in a semivowel with less prominence, like [aɪ̯] in eye, while rising (or ascending) diphthongs begin with a less prominent semivowel and end with a more prominent full vowel, similar to the [ja] in yard. (Note that "falling" and "rising" in this context do not refer to vowel height; for that, the terms "opening" and "closing" are used instead. See below.) The less prominent component in the diphthong may also be transcribed as an approximant, thus [aj] in eye and [ja] in yard. However, when the diphthong is analysed as a single phoneme, both elements are often transcribed with vowel symbols (/aɪ̯/, /ɪ̯a/). Semivowels and approximants are not equivalent in all treatments, and in the English and Italian languages, among others, many phoneticians do not consider rising combinations to be diphthongs, but rather sequences of approximant and vowel. There are many languages (such as Romanian) that contrast one or more rising diphthongs with similar sequences of a glide and a vowel in their phonetic inventory (see semivowel for examples).
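When a transcription uses the non-syllabic diacritic, the falling or rising pattern can be read off from which element carries it. The following is a minimal sketch under the assumption that the input is exactly two base symbols plus one non-syllabic diacritic; the function name `prominence_pattern` is hypothetical.

```python
NON_SYLLABIC = "\u032f"  # COMBINING INVERTED BREVE BELOW


def prominence_pattern(diphthong: str) -> str:
    """Return 'falling' or 'rising' for a two-symbol diphthong with one diacritic."""
    if len(diphthong) != 3 or diphthong.count(NON_SYLLABIC) != 1:
        raise ValueError("expected two symbols and one non-syllabic diacritic")
    position = diphthong.index(NON_SYLLABIC)
    if position == 0:
        raise ValueError("diacritic must follow a base symbol")
    # The diacritic follows the symbol it marks: index 1 marks the first
    # element (rising), index 2 marks the second element (falling).
    return "rising" if position == 1 else "falling"


if __name__ == "__main__":
    print(prominence_pattern("a\u026a\u032f"))  # [aɪ̯] as in eye
    print(prominence_pattern("i\u032fa"))       # [i̯a]
```

Note that this reads prominence only; it says nothing about vowel height, which is the separate opening and closing distinction.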
In closing diphthongs, the second element is more close than the first (e.g. [ai]); in opening diphthongs, the second element is more open (e.g. [ia]). Closing diphthongs tend to be falling ([ai̯]), and opening diphthongs are generally rising ([i̯a]), as open vowels are more sonorous and therefore tend to be more prominent. However, exceptions to this rule are not rare in the world's languages. In Finnish, for instance, the opening diphthongs /ie̯/ and /uo̯/ are true falling diphthongs, since they begin louder and with higher pitch and fall in prominence during the diphthong.
A centering diphthong is one that begins with a more peripheral vowel and ends with a more central one, such as [ɪə̯], [ɛə̯], and [ʊə̯] in Received Pronunciation or [iə̯] and [uə̯] in Irish. Many centering diphthongs are also opening diphthongs ([iə̯], [uə̯]).
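The closing, opening, and centering labels can be illustrated with a small classifier. This is a sketch only: the `OPENNESS` ranks and the `CENTRAL` set are simplified assumptions made for this example, not a standard phonetic scale, and they cover just a handful of vowels.

```python
# Illustrative openness ranks (0 = close, larger = more open).
# These values are an assumption for this sketch, not a standard scale.
OPENNESS = {
    "i": 0, "u": 0,
    "ɪ": 1, "ʊ": 1,
    "e": 2, "o": 2,
    "ə": 3,
    "ɛ": 4, "ɔ": 4,
    "a": 6,
}
CENTRAL = {"ə"}  # central vowels covered by this sketch


def classify(first: str, second: str) -> set[str]:
    """Label a diphthong by the height and centrality of its two elements."""
    labels = set()
    if OPENNESS[second] < OPENNESS[first]:
        labels.add("closing")
    elif OPENNESS[second] > OPENNESS[first]:
        labels.add("opening")
    if first not in CENTRAL and second in CENTRAL:
        labels.add("centering")
    return labels


if __name__ == "__main__":
    for pair in (("a", "i"), ("i", "a"), ("i", "ə"), ("u", "ə")):
        print(pair, sorted(classify(*pair)))
```

The labels are returned as a set because the categories overlap: a diphthong such as [iə̯] is both opening and centering.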
Diphthongs may contrast in how far they open or close. For example, Samoan contrasts low-to-mid with low-to-high diphthongs:
Narrow diphthongs are those that end with a vowel which, on a vowel chart, is quite close to the one that begins the diphthong, for example Northern Dutch [eɪ], [øʏ] and [oʊ]. Wide diphthongs are the opposite: they require a greater tongue movement, and their offsets are farther away from their starting points on the vowel chart. Examples of wide diphthongs are RP/GA English [aɪ] and [aʊ].
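The narrow versus wide distinction amounts to comparing distances on a vowel chart. The sketch below uses rough, made-up chart coordinates (backness, openness) that are assumptions for illustration only, not measured formant values; the function name `glide_distance` is hypothetical.

```python
import math

# Rough (backness, openness) positions on an idealized vowel chart.
# Illustrative assumptions only; not measured acoustic or articulatory data.
CHART = {
    "e": (0.0, 1.0),
    "ɪ": (0.5, 0.5),
    "ʊ": (1.5, 0.5),
    "o": (2.0, 1.0),
    "a": (0.5, 3.0),
}


def glide_distance(start: str, end: str) -> float:
    """Euclidean distance between the start and end points of a diphthong."""
    return math.dist(CHART[start], CHART[end])


if __name__ == "__main__":
    for start, end in (("e", "ɪ"), ("o", "ʊ"), ("a", "ɪ"), ("a", "ʊ")):
        print(f"[{start}{end}] {glide_distance(start, end):.2f}")
```

With these coordinates the narrow diphthongs [eɪ] and [oʊ] cover a shorter distance than the wide diphthongs [aɪ] and [aʊ]. The sketch deliberately compares distances rather than fixing a cut-off, since any threshold between narrow and wide would be arbitrary.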
Languages differ in the length of diphthongs, measured in terms of morae. In languages with phonemically short and long vowels, diphthongs typically behave like long vowels, and are pronounced with a similar length. In languages with only one phonemic length for pure vowels, however, diphthongs may behave like pure vowels. For example, in Icelandic, both monophthongs and diphthongs are pronounced long before single consonants and short before most consonant clusters.
Some languages contrast short and long diphthongs. In some languages, such as Old English, these behave like short and long vowels, occupying one and two morae, respectively. Languages that contrast three quantities in diphthongs are extremely rare, but not unheard of; Northern Sami is known to contrast long, short and "finally stressed" diphthongs, the last of which are distinguished by a long second element.
In some languages, diphthongs are single phonemes, while in others they are analyzed as sequences of two vowels, or of a vowel and a semivowel.
Certain sound changes relate to diphthongs and monophthongs. Vowel breaking or diphthongization is a vowel shift in which a monophthong becomes a diphthong. Monophthongization or smoothing is a vowel shift in which a diphthong becomes a monophthong.
While there are a number of similarities, diphthongs are not the same phonologically as a combination of a vowel and an approximant or glide. Most importantly, diphthongs are fully contained in the syllable nucleus, while a semivowel or glide is restricted to the syllable boundaries (either the onset or the coda). This often manifests itself phonetically by a greater degree of constriction, but the phonetic distinction is not always clear. The English word yes, for example, consists of a palatal glide followed by a monophthong rather than a rising diphthong. In addition, the two segmental elements of a diphthong must differ, so when a sequence such as [ii̯] occurs in a language, it does not contrast with [iː]. However, it is possible for languages to contrast [ij] and [iː].
Diphthongs are also distinct from sequences of simple vowels. The Bunaq language of Timor, for example, distinguishes /sa͡i/ [saj] 'exit' from /sai/ [saʲi] 'be amused', /te͡i/ [tej] 'dance' from /tei/ [teʲi] 'stare at', and /po͡i/ [poj] 'choice' from /loi/ [loʷi] 'good'.
In words coming from Middle English, most cases of the Modern English diphthongs [aɪ̯, oʊ̯, eɪ̯, aʊ̯] originate from the Middle English long monophthongs [iː, ɔː, aː, uː] through the Great Vowel Shift, although some cases of [oʊ̯, eɪ̯] originate from the Middle English diphthongs [ɔu̯, aɪ̯].
In the varieties of German that vocalize the /r/ in the syllable coda, other diphthongal combinations may occur. These are only phonetic diphthongs, not phonemic diphthongs, since the vocalic pronunciation [ɐ̯] alternates with consonantal pronunciations of /r/ if a vowel follows, cf. du hörst [duː ˈhøːɐ̯st] ‘you hear’ – ich höre [ʔɪç ˈhøːʀə] ‘I hear’. These phonetic diphthongs may be as follows:
The diphthongs of some German dialects differ from standard German diphthongs. The Bernese German diphthongs, for instance, correspond rather to the Middle High German diphthongs than to standard German diphthongs:
Apart from these phonemic diphthongs, Bernese German has numerous phonetic diphthongs due to L-vocalization in the syllable coda, for instance the following ones:
Diphthongs may reach a higher target position (towards /i/) in situations of coarticulatory phenomena or when words with such vowels are being emphasized.
There are five diphthongs in the Oslo dialect of Norwegian, all of them falling:
An additional diphthong, [ʉ͍ɪ], occurs only in the word hui in the expression i hui og hast "in great haste". The number and form of diphthongs vary between dialects.
In French, /wa/, /wɛ̃/, /ɥi/ and /ɥɛ̃/ may be considered true diphthongs (that is, fully contained in the syllable nucleus: [u̯a], [u̯ɛ̃], [y̯i], [y̯ɛ̃]). Other sequences are considered part of a glide formation process that turns a high vowel into a semivowel (and part of the syllable onset) when followed by another vowel.
In standard Eastern Catalan, rising diphthongs (that is, those starting with [j] or [w]) are possible only in the following contexts:
There are also certain instances of compensatory diphthongization in the Majorcan dialect so that /ˈtroncs/ ('logs') (in addition to deleting the palatal plosive) develops a compensating palatal glide and surfaces as [ˈtrojns] (and contrasts with the unpluralized [ˈtronʲc]). Diphthongization compensates for the loss of the palatal stop (part of Catalan's segment loss compensation). There are other cases where diphthongization compensates for the loss of point of articulation features (property loss compensation) as in [ˈaɲ] ('year') vs [ˈajns] ('years'). The dialectal distribution of this compensatory diphthongization is almost entirely dependent on the dorsal plosive (whether it is velar or palatal) and the extent of consonant assimilation (whether or not it is extended to palatals).
The Portuguese diphthongs are formed by the labio-velar approximant [w] and palatal approximant [j] with a vowel. European Portuguese has 14 phonemic diphthongs (10 oral and 4 nasal), all of which are falling diphthongs formed by a vowel and a nonsyllabic high vowel. Brazilian Portuguese has roughly the same number, although the European and non-European dialects have slightly different pronunciations ([ɐj] is a distinctive feature of some southern and central Portuguese dialects, especially that of Lisbon). A [w] onglide after /k/ or /ɡ/ and before all vowels, as in quando [ˈkwɐ̃du] ('when') or guarda [ˈɡwaɾðɐ ~ ˈɡwaʁdɐ] ('guard'), may also form rising diphthongs and triphthongs. Additionally, in casual speech, adjacent heterosyllabic vowels may combine into diphthongs and triphthongs or even sequences of them.
In addition, phonetic diphthongs are formed in most Brazilian Portuguese dialects by the vocalization of /l/ in the syllable coda with words like sol [sɔw] ('sun') and sul [suw] ('south') as well as by yodization of vowels preceding /s/ or its allophone at syllable coda [ʃ ~ ɕ] in terms like arroz [aˈʁojs ~ ɐˈʁo(j)ɕ] ('rice'), and /z/ (or [ʒ ~ ʑ]) in terms such as paz mundial [ˈpajz mũdʒiˈaw ~ ˈpa(j)ʑ mũdʑiˈaw] ('world peace') and dez anos [ˌdɛjˈz‿ɐ̃nu(j)s ~ ˌdɛjˈz‿ɐ̃nuɕ] ('ten years').
Phonetically, Spanish has seven falling diphthongs and eight rising diphthongs. In addition, during fast speech, sequences of vowels in hiatus become diphthongs wherein one becomes non-syllabic (unless they are the same vowel, in which case they fuse together) as in poeta [ˈpo̯eta] ('poet') and maestro [ˈmae̯stɾo] ('teacher'). The Spanish diphthongs are:
The existence of true diphthongs in Italian is debatable; however, a list is:
The second table includes only 'false' diphthongs, composed of a semivowel + a vowel, not two vowels. The situation is more nuanced in the first table: a word such as 'baita' is actually pronounced ['baj.ta] and most speakers would syllabify it that way. A word such as 'voi' would instead be pronounced and syllabified as ['vo.i], yet again without a diphthong.
In general, unstressed /i e o u/ in hiatus can turn into glides in more rapid speech (e.g. biennale [bi̯enˈnaːle] 'biennial'; coalizione [ko̯alitˈtsi̯oːne] 'coalition') with the process occurring more readily in syllables further from stress.
Romanian has two true diphthongs: /e̯a/ and /o̯a/. There are, however, a host of other vowel combinations (more than any other major Romance language) which are classified as vowel glides. As a result of their origin (diphthongization of mid vowels under stress), the two true diphthongs appear only in stressed syllables and make morphological alternations with the mid vowels /e/ and /o/. To native speakers, they sound very similar to /ja/ and /wa/ respectively. There are no perfect minimal pairs to contrast /o̯a/ and /wa/, and because /o̯a/ does not appear in the final syllable of a prosodic word, there are no monosyllabic words with /o̯a/; exceptions might include voal ('veil') and trotuar ('sidewalk'), though Ioana Chițoran argues that these are best treated as containing glide-vowel sequences rather than diphthongs. In addition to these, the semivowels /j/ and /w/ can be combined (before, after, or both) with most vowels. While this arguably forms additional diphthongs and triphthongs, only /e̯a/ and /o̯a/ can follow an obstruent-liquid cluster, as in broască ('frog') and dreagă ('to mend'), implying that /j/ and /w/ are restricted to the syllable boundary and therefore, strictly speaking, do not form diphthongs.
There are 9 diphthongs in Scottish Gaelic. Group 1 occur anywhere (eu is usually [eː] before -m, e.g. Seumas). Group 2 are reflexes that occur before -ll, -m, -nn, -bh, -dh, -gh and -mh.
For more detailed explanations of Gaelic diphthongs see Scottish Gaelic orthography.
Welsh is traditionally divided into Northern and Southern dialects. In the north, some diphthongs may be short or long according to regular vowel length rules but in the south they are always short (see Welsh phonology). Southern dialects tend to simplify diphthongs in speech (e.g. gwaith /ɡwaiθ/ is reduced to /ɡwaːθ/).
The vowel groups ia, ie, ii, io, and iu in foreign words are not regarded as diphthongs; they are pronounced with /j/ between the vowels: [ɪja, ɪjɛ, ɪjɪ, ɪjo, ɪju].
is conventionally considered a diphthong. However, it is actually [ie] in hiatus or separated by a semivowel, [ije].
All nine vowels can appear as the first component of an Estonian diphthong, but only [ɑ e i o u] occur as the second component.
There are additional diphthongs less commonly used, such as [eu] in Euroopa (Europe), [øɑ] in söandama (to dare), and [æu] in näuguma (to mew).
All Finnish diphthongs are falling. Notably, Finnish has true opening diphthongs (e.g. /uo/), which are not very common crosslinguistically compared to centering diphthongs (e.g. /uə/ in English). Vowel combinations across syllables may in practice be pronounced as diphthongs, when an intervening consonant has elided, as in näön [næøn] instead of [næ.øn] for the genitive of näkö ('sight').
The diphthong system in Northern Sami varies considerably from one dialect to another. The Western Finnmark dialects distinguish four different qualities of opening diphthongs:
In terms of quantity, Northern Sami shows a three-way contrast between long, short and finally stressed diphthongs. The last are distinguished from long and short diphthongs by a markedly long and stressed second component. Diphthong quantity is not indicated in spelling.
Rising sequences in Mandarin are usually regarded as a combination of a medial semivowel ([j], [w], or [ɥ]) plus a vowel, while falling sequences are regarded as one diphthong.
In addition to vowel nuclei following or preceding /j/ and /w/, Vietnamese has three diphthongs:
The Khmer language has a rich vowel inventory, with an additional distinction between long and short registers in its vowels and diphthongs.