Dash

The dash is a punctuation mark that is similar in appearance to the hyphen and minus sign but differs from these symbols in length and, in some fonts, height above the baseline. The most common versions of the dash are the en dash, longer than the hyphen; the em dash, longer than the en dash; and the horizontal bar, whose length varies across typefaces but tends to be between those of the en and em dashes.[a]

Usage varies both within English and in other languages, but the usual convention for the most common dashes in printed English text is as follows:

The figure dash has the same width as a numerical digit; most fonts have digits of equal width. It is used within numbers (e.g., the phone number 555‒0199), especially in columns, for maintaining alignment. Its meaning is the same[citation needed] as a hyphen, as represented by hyphen-minus. In contrast, the en dash is generally used for a range of values.[3] The minus sign (−) glyph is generally set a little higher.

When the figure dash is unavailable, a hyphen-minus is often used instead. In Unicode, the figure dash is U+2012 (decimal 8210). HTML provides no named character entity for it; it can be represented by the numeric character reference &#8210; or &#x2012;.
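As a minimal sketch of how the decimal and hexadecimal references relate to the code point, the following Python snippet derives both forms for U+2012 and checks that each decodes back to the figure dash. The helper name `numeric_refs` is hypothetical and used only for illustration.

```python
import html

FIGURE_DASH = "\u2012"  # U+2012 FIGURE DASH


def numeric_refs(ch):
    """Return the decimal and hexadecimal HTML numeric character
    references for a single character (illustrative helper)."""
    cp = ord(ch)
    return f"&#{cp};", f"&#x{cp:X};"


dec_ref, hex_ref = numeric_refs(FIGURE_DASH)
print(dec_ref, hex_ref)

# Both references decode back to the same character.
assert html.unescape(dec_ref) == FIGURE_DASH
assert html.unescape(hex_ref) == FIGURE_DASH
```

The same helper works for any other dash, since the references depend only on the code point.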

In TeX, the standard fonts have no figure dash; however, the digits normally all have the same width as the en dash, so an en dash can be substituted. In XeLaTeX, one can use \char"2012.[4] The Linux Libertine font also has the figure dash glyph.

The en dash, en rule, or nut dash[5] is traditionally half the width of an em dash.[6][7] In modern fonts, the length of the en dash is not standardized, and the en dash is often more than half the width of the em dash.[8] The widths of en and em dashes have also been specified as being equal to those of the upper-case letters N and M, respectively,[9][10] and at other times to the widths of the lower-case letters.[8][11]

The en dash is commonly used to indicate a closed range of values – a range with clearly defined and finite upper and lower boundaries – roughly signifying what might otherwise be communicated by the word "through".[12] This may include ranges such as those between dates, times, or numbers.[13][14][15][16] Various style guides restrict this range indication style to only parenthetical or tabular matter, requiring "to" or "through" in running text. Preference for hyphen vs. en dash in ranges varies. For example, the APA style (named after the American Psychological Association) uses an en dash in ranges, but the AMA style (named after the American Medical Association) uses a hyphen:

Some style guides (including the Guide for the Use of the International System of Units (SI) and the AMA Manual of Style) recommend that, when a number range might be misconstrued as subtraction, the word "to" should be used instead of an en dash. For example, "a voltage of 50 V to 100 V" is preferable to using "a voltage of 50–100 V". Relatedly, in ranges that include negative numbers, "to" is used to avoid ambiguity or awkwardness (for example, "temperatures ranged from −18 °C to −34 °C"). It is also considered poor style (best avoided) to use the en dash in place of the words "to" or "and" in phrases that follow the forms from X to Y and between X and Y.[14][15]

The en dash is used to contrast values or illustrate a relationship between two things.[13][16] Examples of this usage include:

A distinction is often made between "simple" attributive compounds (written with a hyphen) and other subtypes (written with an en dash); at least one authority considers name pairs, where the paired elements carry equal weight, as in the Taft–Hartley Act to be "simple",[14] while others consider an en dash appropriate in instances such as these[17][18][19] to represent the parallel relationship, as in the McCain–Feingold bill or Bose–Einstein statistics. When an act of the U.S. Congress is named using the surnames of the senator and representative who sponsored it, the hyphen-minus is used in the short title; thus the short title of Public Law 111–203 is "The Dodd-Frank Wall Street Reform and Consumer Protection Act", with a hyphen-minus rather than an en dash between "Dodd" and "Frank".[20] However, there is a difference between something named for a parallel/coordinate relationship between two people (for example, Satyendra Nath Bose and Albert Einstein) and something named for a single person who had a compound surname, which may be written with a hyphen or a space but not an en dash (for example, the Lennard-Jones potential [hyphen] is named after one person (Mr. John Lennard-Jones), as are Bence Jones proteins and Hughlings Jackson syndrome). Copyeditors use dictionaries (general, medical, biographical, and geographical) to confirm the eponymity (and thus the styling) for specific terms, given that no one can know them all offhand.

Preference for an en dash instead of a hyphen in these coordinate/relationship/connection types of terms is a matter of style, not inherent orthographic "correctness"; both are equally "correct", and each is the preferred style in some style guides. For example, the AMA Manual of Style and Dorland's medical reference works use hyphens, not en dashes, in coordinate terms (such as "blood-brain barrier"), in eponyms (such as "Cheyne-Stokes respiration", "Kaplan-Meier method"), and so on.

In English, the en dash is usually used instead of a hyphen in compound (phrasal) attributives in which one or both elements is itself a compound, especially when the compound element is an open compound, meaning it is not itself hyphenated. This manner of usage may include such examples as:[14][15][21][22]

The disambiguating value of the en dash in these patterns was illustrated by Strunk and White in The Elements of Style with the following example: When Chattanooga News and Chattanooga Free Press merged, the joint company was inaptly named Chattanooga News-Free Press (using a hyphen), which could be interpreted as meaning that their newspapers were news-free.[23]

An exception to the use of en dashes is usually made when prefixing an already-hyphenated compound; an en dash is generally avoided as a distraction in this case. Examples of this include:[23]

An en dash can be retained to avoid ambiguity, but whether any ambiguity is plausible is a judgment call. AMA style retains the en dashes in the following examples:[24]

As discussed above, the en dash is sometimes recommended instead of a hyphen in compound adjectives where neither part of the adjective modifies the other—that is, when each modifies the noun, as in love–hate relationship.

The Chicago Manual of Style (CMOS), however, limits the use of the en dash to two main purposes:

That is, the CMOS favors hyphens in instances where some other guides suggest en dashes, the 16th edition explaining that "Chicago's sense of the en dash does not extend to between", to rule out its use in "US–Canadian relations".[26]

In these two uses, en dashes normally do not have spaces around them. Some make an exception when they believe avoiding spaces may cause confusion or look odd. For example, compare "12 June – 3 July" with "12 June–3 July".[27] However, other authorities disagree and state there should be no space between an en dash and adjacent text. These authorities would not use a space in, for example, "11:00 a.m.⁠–⁠1:00 p.m."[28] or "July 9–August 17".[29][30]

En dashes can be used instead of pairs of commas that mark off a nested clause or phrase. They can also be used around parenthetical expressions – such as this one – rather than the em dashes preferred by some publishers.[31][32]

The en dash can also signify a rhetorical pause. For example, an opinion piece from The Guardian is entitled:

In these situations, en dashes must have a single space on each side.[32]

Either the en dash or the em dash may be used as a bullet at the start of each item in a bulleted list. (This is a matter of graphic design rather than orthography.)

In most uses of en dashes, such as when used in indicating ranges, they are closed up to the joined words. It is only when en dashes are used in setting off parenthetical expressions – such as this one – that they take spaces around them.[34][full citation needed] For more on the choice of em versus en in this context, see En dash versus em dash.

When an en dash is unavailable in a particular character encoding environment—as in the ASCII character set—there are some conventional substitutions. Often two consecutive hyphens are the substitute.

The en dash is encoded in Unicode as U+2013 (decimal 8211) and represented in HTML by the named character entity &ndash;.

The en dash is sometimes used as a substitute for the minus sign when the minus sign character is not available, since the en dash is usually the same width as a plus sign. For example, the original 8-bit Macintosh Character Set had an en dash, useful for the minus sign, years before Unicode, with its dedicated minus sign, was available. The hyphen-minus is usually too narrow to make a typographically acceptable minus sign. However, the en dash cannot be used for a minus sign in programming languages because the syntax usually requires a hyphen-minus.
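To illustrate the last point, text that was typeset with an en dash or a true minus sign generally has to be normalized to the hyphen-minus before it is handed to a numeric parser. The following Python sketch shows one way this might be done; the function name `parse_signed_int` is hypothetical, and the approach assumes the input contains a single integer.

```python
HYPHEN_MINUS = "-"     # U+002D, the sign that language syntax expects
EN_DASH = "\u2013"     # U+2013, sometimes typeset in place of a minus sign
MINUS_SIGN = "\u2212"  # U+2212, the dedicated Unicode minus sign


def parse_signed_int(text):
    """Parse an integer whose sign may have been typeset as an en dash
    or a minus sign, by normalizing it to a hyphen-minus first."""
    normalized = (
        text.strip()
        .replace(EN_DASH, HYPHEN_MINUS)
        .replace(MINUS_SIGN, HYPHEN_MINUS)
    )
    return int(normalized)


print(parse_signed_int(EN_DASH + "18"))     # en dash used as the sign
print(parse_signed_int(MINUS_SIGN + "34"))  # true minus sign
```

Normalizing in one place keeps the typographic choice in the displayed text separate from the value used in computation.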

The em dash, em rule, or mutton dash[5] is longer than an en dash. The character is called an em dash because it is one em wide, a length that varies depending on the font size. One em is the same length as the font's height (which is typically measured in points). So in 9-point type, an em dash is nine points wide, while in 24-point type the em dash is 24 points wide. By comparison, the en dash, with its 1 en width, is in most fonts either a half-em wide[35] or the width of an upper-case "N".[36]

The em dash is encoded in Unicode as U+2014 (decimal 8212) and represented in HTML by the named character entity &mdash;.
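The correspondence between the named entities and the code points of the en and em dashes can be checked with Python's standard `html` module. This is only a small sketch; the dictionary name `decoded` is an illustrative choice.

```python
import html

EN_DASH = "\u2013"  # U+2013, decimal 8211
EM_DASH = "\u2014"  # U+2014, decimal 8212

# Decode each named entity and record the resulting character.
decoded = {name: html.unescape(f"&{name};") for name in ("ndash", "mdash")}

for name, ch in decoded.items():
    # Print the entity name with its code point in hex and decimal.
    print(f"&{name}; -> U+{ord(ch):04X} (decimal {ord(ch)})")

assert decoded["ndash"] == EN_DASH
assert decoded["mdash"] == EM_DASH
```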

The em dash is used in several ways, primarily in places where a set of parentheses or a colon might otherwise be used.[37][full citation needed] It can show an abrupt change in thought or be used where a full stop (period) is too strong and a comma too weak. Em dashes are also used to set off summaries or definitions.[38] Common uses and definitions are cited below with examples.

It may indicate an interpolation stronger than that demarcated by parentheses, as in the following from Nicholson Baker's The Mezzanine (the degree of difference is subjective).

In a related use, it may visually indicate the shift between speakers when they overlap in speech. For example, the em dash is used this way in Joseph Heller's Catch-22:

Lord Cardinal! if thou think'st on heaven's bliss,
Hold up thy hand, make signal of that hope.—
He dies, and makes no sign!

This is a quotation dash. It may be distinct from an em dash in its coding (see Horizontal bar). It may be used to indicate turns in a dialog, in which case each dash starts a paragraph.[40] It replaces other quotation marks and was preferred by authors such as James Joyce:[41]

―O saints above! miss Douce said, sighed above her jumping rose. I wished I hadn't laughed so much. I feel all wet.

The Walrus and the Carpenter
Were walking close at hand;
They wept like anything to see
Such quantities of sand:
"If this were only cleared away,"
They said, "it would be grand!"

An em dash may be used to indicate omitted letters in a word redacted to an initial or single letter or to fillet a word, by leaving the start and end letters whilst replacing the middle letters with a dash or dashes (for the purposes of censorship or simply data anonymization). In this use, it is sometimes doubled.

Three em dashes might be used to indicate a completely missing word.[42]

Either the en dash or the em dash may be used as a bullet at the start of each item in a bulleted list, but a plain hyphen is more commonly used.

Three em dashes one after another can be used in a footnote, endnote, or another form of bibliographic entry to indicate repetition of the same author's name as that of the previous work,[42] which is similar to the use of id.

According to most American sources (such as The Chicago Manual of Style) and some British sources (such as The Oxford Guide to Style), an em dash should always be set closed, meaning it should not be surrounded by spaces. But the practice in some parts of the English-speaking world, including the style recommended by The New York Times Manual of Style and Usage for printed newspapers and the AP Stylebook, sets it open, separating it from its surrounding words by using spaces or hair spaces (U+200A) when it is being used parenthetically.[43][44] The AP Stylebook rejects the use of the open em dash to set off introductory items in lists. However, the "space, en dash, space" sequence is the predominant style in German and French typography. (See En dash versus em dash below.)

In Canada, The Canadian Style: A Guide to Writing and Editing (2nd ed.), Editing Canadian English, and the Canadian Oxford Dictionary all specify that an em dash should be set closed when used between words, a word and numeral, or two numerals.

The Oxford Canadian A to Z of Grammar, Spelling & Punctuation: Guide to Canadian English Usage

The Australian government's Style Manual for Authors, Editors and Printers (6th ed.), also specifies that em dashes inserted between words, a word and numeral, or two numerals, should be set closed. A section on the 2-em rule (⸺) also explains that the 2-em can be used to mark an abrupt break in direct or reported speech, but a space is used before the 2-em if a complete word is missing, while no space is used if part of a word exists before the sudden break. Two examples of this are as follows (properly typeset 2-em and 3-em dashes should appear as a single dash, but they may show on this page as several em dashes with spaces in between):

When an em dash is unavailable in a particular character encoding environment—as in the ASCII character set—it has usually been approximated as consecutive double (--) or triple (---) hyphen-minuses. The two-hyphen em dash proxy is perhaps more common, being a widespread convention in the typewriting era. (It is still described for hard copy manuscript preparation in the Chicago Manual of Style as of the 16th edition, although the manual conveys that typewritten manuscript and copyediting on paper are now dated practices.) The three-hyphen em dash proxy was popular with various publishers because the sequence of one, two, or three hyphens could then correspond to the hyphen, en dash, and em dash, respectively.
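The one/two/three-hyphen convention described above lends itself to mechanical conversion. The following Python sketch assumes that convention (two hyphen-minuses for an en dash, three for an em dash); it would be wrong for manuscripts that use the two-hyphen proxy for the em dash. The function name `hyphens_to_dashes` is hypothetical.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def hyphens_to_dashes(text):
    """Replace runs of exactly two or three hyphen-minuses with an en dash
    or em dash, respectively. Single hyphens and longer runs are left alone."""
    def replace_run(match):
        run = match.group(0)
        if len(run) == 2:
            return EN_DASH
        if len(run) == 3:
            return EM_DASH
        return run  # a lone hyphen, or a run of four or more (e.g. a rule)

    return re.sub(r"-+", replace_run, text)


print(hyphens_to_dashes("pages 10--20"))
print(hyphens_to_dashes("a well-known rule---mostly"))
```

Matching whole runs with `-+` rather than replacing `---` and `--` in sequence avoids partially converting longer runs of hyphens.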

Because early comic book letterers were not aware of the typographic convention of replacing a typewritten double hyphen with an em dash, the double hyphen became traditional in American comics. This practice has continued despite the development of computer lettering.[45][46]

The en dash is wider than the hyphen but not as wide as the em dash. An em width is defined as the point size of the currently used font, since the M character is not always the width of the point size.[47] In running text, various dash conventions are employed: an em dash—like so—or a spaced em dash — like so — or a spaced en dash – like so – can be seen in contemporary publications.

Various style guides and national varieties of languages prescribe different guidance on dashes. Dashes have been cited as being treated differently in the US and the UK, with the former preferring the use of an em dash with no additional spacing and the latter preferring a spaced en dash.[31] As an example of the US style, The Chicago Manual of Style recommends unspaced em dashes. Style guides outside the US are more variable. For example, The Elements of Typographic Style by Canadian typographer Robert Bringhurst recommends the spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography".[32] In the United Kingdom, the spaced en dash is the house style for certain major publishers, including the Penguin Group, the Cambridge University Press, and Routledge. However, this convention is not universal. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that the spaced en dash is used by "other British publishers" but states that the Oxford University Press, like "most US publishers", uses the unspaced em dash.

The en dash – always with spaces in running text when, as discussed in this section, indicating a parenthesis or pause – and the spaced em dash both have a certain technical advantage over the unspaced em dash. Most typesetting and word processing expects word spacing to vary to support full justification. Alone among punctuation that marks pauses or logical relations in text, the unspaced em dash disables this for the words it falls between. This can cause uneven spacing in the text, but can be mitigated by the use of thin spaces, hair spaces, or even zero-width spaces on the sides of the em dash. This provides the appearance of an unspaced em dash, but allows the words and dashes to break between lines. The spaced em dash risks introducing excessive separation of words. In full justification, the adjacent spaces may be stretched, and the separation of words further exaggerated. En dashes may also be preferred to em dashes when text is set in narrow columns, such as in newspapers and similar publications, since the en dash is smaller. In such cases, its use is based purely on space considerations and is not necessarily related to other typographical concerns.
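The hair-space mitigation mentioned above can be sketched as a simple text transformation. The Python example below assumes the hair space (U+200A) as the padding character; the function name `pad_em_dashes` is hypothetical, and whether a given layout engine actually breaks lines at hair spaces depends on the software in use.

```python
import re

EM_DASH = "\u2014"
HAIR_SPACE = "\u200A"


def pad_em_dashes(text, pad=HAIR_SPACE):
    """Surround each em dash that directly touches non-space characters
    on both sides with a narrow padding character."""
    # The lookarounds require a non-whitespace character on each side,
    # so em dashes that are already spaced or padded are left unchanged.
    pattern = rf"(?<=\S){EM_DASH}(?=\S)"
    return re.sub(pattern, f"{pad}{EM_DASH}{pad}", text)


sample = "an em dash" + EM_DASH + "like so" + EM_DASH + "in running text"
padded = pad_em_dashes(sample)
print(repr(padded))
```

Because the hair space counts as whitespace for the pattern, applying the function a second time leaves the text unchanged.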

On the other hand, a spaced en dash may be ambiguous when it is also used for ranges, for example, in dates or between geographical locations with internal spaces.

The horizontal bar (U+2015), also known as a quotation dash, is used to introduce quoted text. This is the standard method of printing dialogue in some languages. The em dash is equally suitable if the quotation dash is unavailable or is contrary to the house style being used.

The swung dash (U+2053) resembles a lengthened tilde and is used to separate alternatives or approximates. In dictionaries, it is frequently used to stand in for the term being defined. A dictionary entry providing an example for the term henceforth might employ the swung dash as follows:

Typewriters and early computers have traditionally had only a limited character set, often having no key that produces a dash. In consequence, it became common to use the hyphen. Em dashes are often represented in British usage by a single hyphen-minus surrounded by spaces, or in American usage by two hyphen-minuses.

Modern computer software typically has support for many more characters and is usually capable of rendering both the en and em dashes correctly, albeit sometimes with an inconvenient input method. Some software, though, may operate in a more limited mode. Some text editors, for example, were restricted to working with a single 8-bit (pre-Unicode) character encoding, and when unencodable characters were entered (for example, by pasting from the clipboard), they were often blindly converted to question marks.[citation needed] Sometimes this happened to em and en dashes, even when the 8-bit encoding supported them or when an alternative representation using hyphen-minuses was an option.

Techniques for generating em and en dashes in various operating systems, word processors and markup languages are provided in the following table:

In many languages, such as Polish, the em dash is used as an opening quotation mark. There is no matching closing quotation mark; typically a new paragraph will be started, introduced by a dash, for each turn in the dialog.

Corpus studies indicate that em dashes are more commonly used in Russian than in English.[52] In Russian, the em dash is used for the present copula (meaning "am"/"is"/"are"), which is unpronounced in spoken Russian.

In French, em or en dashes can be used as parentheses (brackets), but the use of a second dash as a closing parenthesis is optional. When a closing dash is not used, the sentence is ended with a period (full stop) as usual. Dashes are, however, much less common than parentheses.

In Spanish, em dashes can be used to mark off parenthetical phrases. Unlike in English, the em dashes are spaced like brackets, i.e., there is a space between main sentence and dash, but not between parenthetical phrase and dash.[53]

Llevaba la fidelidad a su maestro —un buen profesor— hasta extremos insospechados.[54]