HTML element

An HTML element is a type of HTML (Hypertext Markup Language) document component, one of several types of HTML nodes (there are also text nodes, comment nodes and others). An HTML document is composed of a tree of simple HTML nodes, such as text nodes, and HTML elements, which add semantics and formatting to parts of the document (e.g., make text bold, organize it into paragraphs, lists and tables, or embed hyperlinks and images). Each element can have HTML attributes specified. Elements can also have content, including other elements and text.

As is generally understood, an element spans from a start tag, through any child content, to a terminating end tag.[1] This is the case for many, but not all, elements within an HTML document. The distinction is explicitly emphasised in the HTML 4.01 Specification:

Elements are not tags. Some people refer to elements as tags (e.g., "the P tag"). Remember that the element is one thing, and the tag (be it start or end tag) is another. For instance, the HEAD element is always present, even though both start and end HEAD tags may be missing in the markup.[1]

Similarly, the W3C Recommendation HTML 5.1 2nd Edition explicitly says:

Tags are used to delimit the start and end of elements in the markup. (...) The start and end tags of certain normal elements can be omitted, (...)
The contents of the element must be placed between just after the start tag (which might be implied, in certain cases) and just before the end tag (which again, might be implied, in certain cases).

Certain tags can be omitted.
NOTE:
Omitting an element's start tag (...) does not mean the element is not present; it is implied, but it is still there. For example, an HTML document always has a root <html> element, even if the string <html> doesn't appear anywhere in the markup.
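As a minimal sketch of this point (the title and paragraph text are placeholders, not taken from either specification), the following complete document contains no <html>, <head> or <body> tags, yet an HTML parser still creates all three elements:

```html
<!DOCTYPE html>
<!-- No html, head or body tags appear below; the elements are implied -->
<title>Omitted tags</title>
<p>The html, head and body elements exist in the parsed document.
```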


As HTML (before HTML5) is based on SGML,[2] its parsing also depends on the Document Type Definition (DTD), specifically an HTML DTD (e.g. HTML 4.01[3][note 1]). The DTD specifies which element types are possible (i.e. it defines the set of element types) and also the valid combinations in which they may appear in a document. It is part of general SGML behavior that, where only one valid structure is possible (per the DTD), its explicit statement in any given document is not generally required. As a simple example, the <p> tag indicating the start of a paragraph element should be complemented by a </p> tag indicating its end. But since the DTD states that paragraph elements cannot be nested, an HTML document fragment <p>Para 1 <p>Para 2 <p>Para 3 is thus inferred to be equivalent to <p>Para 1 </p><p>Para 2 </p><p>Para 3. (If one paragraph element cannot contain another, any currently open paragraph must be closed before starting another.) Because this implication is based on the combination of the DTD and the individual document, it is not usually possible to infer elements from document tags alone but only by using an SGML-aware or HTML-aware parser with knowledge of the DTD. HTML5 creates a similar result by defining what tags can be omitted.[4]
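The paragraph inference described above can be laid out side by side. This is only an illustrative sketch; the comments mark which form is which, and the final </p> is added here to show where the last paragraph is implicitly closed.

```html
<!-- As written by the author: end tags omitted -->
<p>Para 1 <p>Para 2 <p>Para 3

<!-- As interpreted by the parser: each open paragraph is closed
     before the next one starts -->
<p>Para 1 </p><p>Para 2 </p><p>Para 3</p>
```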

SGML is complex, which has limited its widespread understanding and adoption. XML was developed as a simpler alternative. Although both can use the DTD to specify the supported elements and their permitted combinations as document structure, XML parsing is simpler. The relation from tags to elements is always that of parsing the actual tags included in the document, without the implied closures that are part of SGML.[note 2]

HTML as used on the current web is likely to be either treated as XML, by being XHTML, or as HTML5; in either case the parsing of document tags into Document Object Model (DOM) elements is simplified compared to legacy HTML systems. Once the DOM of elements is obtained, behavior at higher levels of interface (for example, screen rendering) is identical or nearly so.[note 3]

Part of this CSS presentation behavior is the notion of the "box model". This is applied to those elements that CSS considers to be "block" elements, set through the CSS display: block; declaration.

HTML also has a similar concept, although different, and the two are very frequently confused. %block; and %inline; are groups within the HTML DTD that group elements as being either "block-level" or "inline".[6] This is used to define their nesting behavior: block-level elements cannot be placed into an inline context.[note 4] This behavior cannot be changed; it is fixed in the DTD. Block and inline elements have the appropriate and different CSS behaviors attached to them by default,[6] including the relevance of the box model for particular element types.

Note, though, that this CSS behavior can be, and frequently is, changed from the default. Lists with <ul><li> ... are %block; elements and are presented as block elements by default. However, it is quite common to set these with CSS to display as an inline list.[7]
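For illustration, one way such an inline list might be written is sketched here. Inline style attributes are used only for brevity (a separate style sheet would be more usual), and the item labels are placeholders.

```html
<!-- The list is still block-level in HTML terms;
     only its CSS presentation is changed -->
<ul>
  <li style="display: inline;">Home</li>
  <li style="display: inline;">News</li>
  <li style="display: inline;">Contact</li>
</ul>
```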

In the HTML syntax, most elements are written with a start tag and an end tag, with the content in between. An HTML tag is composed of the name of the element, surrounded by angle brackets. An end tag also has a slash after the opening angle bracket, to distinguish it from the start tag. For example, a paragraph is represented by the <p> element, with the paragraph text placed between <p> and </p>.
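As a minimal sketch (the sentence inside the paragraph is placeholder text), a paragraph element with both of its tags looks like this:

```html
<p>In the HTML syntax, most elements are written with a start tag and an end tag.</p>
```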

However, not all of these elements require the end tag, or even the start tag, to be present.[4] Some elements, the so-called void elements, do not have an end tag. A typical example is the <br> (hard line-break) element. A void element's behavior is predefined, and it cannot contain any content or other elements. For example, the lines of an address can be separated with <br> elements, none of which has an end tag.
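A short sketch of such an address follows; the name and street are invented placeholders.

```html
<!-- Each <br> is a void element: it has no content and no end tag -->
<p>A. N. Example<br>1 Sample Street<br>Anytown</p>
```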

When using XHTML, it is required to open and close all elements, including void elements. This can be done by placing an end tag immediately after the start tag, but this is not legal in HTML 5 and will lead to two elements being created. An alternative way to specify that it is a void element, which is compatible with both XHTML and HTML 5, is to put a / at the end of the tag (not to be confused with the / at the beginning of a closing tag).
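The two spellings discussed above can be compared directly. This is an illustrative sketch, with the behavior of each form noted in comments.

```html
<!-- XHTML only: an end tag immediately after the start tag.
     An HTML5 parser would create two line-break elements here. -->
<br></br>

<!-- Compatible with both XHTML and HTML5: a slash at the end of the tag -->
<br />
```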

HTML attributes are specified inside the start tag. For example, the <abbr> element, which represents an abbreviation, expects a title attribute within its opening tag.
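A minimal sketch of an <abbr> element carrying a title attribute, with placeholder content, could look like this:

```html
<!-- The title attribute sits inside the start tag and holds the expansion -->
<abbr title="abbreviation">abbr.</abbr>
```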


There are multiple kinds of HTML elements: normal elements, raw text elements, and void elements.

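Normal elements usually have both a start tag and an end tag, although for some elements the end tag, or both tags, can be omitted. They are constructed from a start tag, which may carry any number of HTML attributes, followed by some amount of content (text and other elements) and an end tag.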

Raw text elements (also known as text or text-only elements) are constructed in a similar way, with a start tag, some amount of text content but no elements, and an end tag.

An example is the <title> element, which must not contain other elements (including markup of text), only plain text.
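The difference between the two constructions can be sketched as follows (the text content is placeholder material):

```html
<!-- Normal element: start tag, content (text and other elements), end tag -->
<p>Some <em>emphasised</em> text.</p>

<!-- Text-only element: start tag, plain text, end tag -->
<title>HTML element</title>
```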

Void elements (also sometimes called empty elements, single elements or stand-alone elements) only have a start tag (in the form <tag>), which contains any HTML attributes. They may not contain any children, such as text or other elements. For compatibility with XHTML, the HTML specification allows an optional space and slash (<tag /> is permissible). The slash is required in XHTML and other XML applications; the space before it is a compatibility convention rather than an XML requirement. Two common void elements are <br /> (for a hard line-break, such as in a poem or an address) and <hr /> (for a thematic break). Other such elements are often place-holders which reference external files, such as the image (<img />) element. The attributes included in the element will then point to the external file in question. Another example of a void element is <link />, which refers to an external resource such as a style sheet through its attributes.
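A sketch of a <link> element in the HTML syntax is given here; the file name fancy.css is a hypothetical example.

```html
<!-- A void element: everything it expresses is carried by its attributes -->
<link rel="stylesheet" href="fancy.css" type="text/css">
```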

A <link /> element with the attribute rel="stylesheet" points the browser at a style sheet to use when presenting the HTML document to the user. Note that in the HTML syntax attribute values don't have to be quoted if they are composed only of certain characters: letters, digits, the hyphen-minus and the period. When using the XML syntax (XHTML), on the other hand, all attribute values must be quoted, and a trailing slash, conventionally preceded by a space, is required before the last angle bracket.
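The two syntaxes can be compared in a short sketch. The file name fancy.css is hypothetical, and the unquoted form is limited to values made up of letters, digits, hyphens and periods.

```html
<!-- HTML syntax: these attribute values need no quotes -->
<link rel=stylesheet href=fancy.css>

<!-- XML syntax (XHTML): quoted values and a trailing slash -->
<link rel="stylesheet" href="fancy.css" type="text/css" />
```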


HTML attributes define desired behavior or indicate additional element properties. Most attributes require a value. In HTML, the value can be left unquoted if it does not include spaces (attribute=value), or it can be quoted with single or double quotes (attribute='value' or attribute="value"). In XML, those quotes are required.

Boolean attributes, on the other hand, don't require a value to be specified. An example is the checked attribute for checkboxes, which can be written as the bare attribute name.

In the XML (and thus XHTML) syntax, though, the name should be repeated as the value, as in checked="checked".
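Both spellings of a Boolean attribute are sketched here for a checkbox:

```html
<!-- HTML syntax: the attribute name alone is enough -->
<input type="checkbox" checked>

<!-- XML syntax (XHTML): the name is repeated as the value -->
<input type="checkbox" checked="checked" />
```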

Informally, HTML elements are sometimes referred to as "tags" (an example of synecdoche), though many prefer the term tag strictly in reference to the markup delimiting the start and end of an element.

Element (and attribute) names may be written in any combination of upper or lower case in HTML, but must be in lower case in XHTML.[8] The canonical form was upper-case until HTML 4, and was used in HTML specifications, but in recent years, lower-case has become more common.

HTML elements are defined in a series of freely available open standards issued since 1995, initially by the IETF and subsequently by the W3C.

During the browser wars of the 1990s, developers of user agents (e.g. web browsers) often developed their own elements, some of which have been adopted in later standards. Other user agents may not recognize non-standard elements, and they will be ignored, possibly causing the page to be displayed improperly.

In 1998, XML (a simplified form of SGML) introduced mechanisms to allow anyone to develop their own elements and incorporate them in XHTML documents, for use with XML-aware user agents.[9]

Subsequently, HTML 4.01 was rewritten in an XML-compatible form, XHTML 1.0 (eXtensible HTML). The elements in each are identical, and in most cases valid XHTML 1.0 documents will be valid or nearly valid HTML 4.01 documents. This article mainly focuses on real HTML, unless noted otherwise; however, it remains applicable to XHTML. See HTML for a discussion of the minor differences between the two.

Since the first version of HTML, several elements have become outmoded, and are deprecated in later standards, or do not appear at all, in which case they are invalid (and will be found invalid, and perhaps not displayed, by validating user agents).[10]

In HTML 4.01 / XHTML 1.0, the status of elements is complicated by the existence of three types of DTD: Transitional, Frameset, and Strict.

HTML5 instead provides a listing of obsolete features to go along with the standardized normative content. They are broken down into "obsolete but conforming" for which implementation instructions exist and "non-conforming" ones that should be replaced.[11]

The first Standard (HTML 2.0) contained four deprecated elements, one of which was invalid in HTML 3.2. All four are invalid in HTML 4.01 Transitional, which also deprecated a further ten elements. All of these, plus two others, are invalid in HTML 4.01 Strict. While the frame elements are still current in the sense of being present in the Transitional and Frameset DTDs, there are no plans to preserve them in future standards, as their function has been largely replaced, and they are highly problematic for user accessibility.

(Strictly speaking, the most recent XHTML standard, XHTML 1.1 (2001), does not include frames at all; it is approximately equivalent to XHTML 1.0 Strict, but also includes the Ruby markup module.)[12]

A common source of confusion is the loose use of deprecated to refer to both deprecated and invalid status, and to elements that are expected to be formally deprecated in the future.

Since HTML 4, HTML has increasingly focused on the separation of content (the visible text and images) from presentation (like color, font size, and layout).[13] This is often referred to as a separation of concerns. HTML is used to represent the structure or content of a document; its presentation remains the sole responsibility of CSS style sheets. A default style sheet is suggested as part of the CSS standard, giving a default rendering for HTML.[14]

Behavior (interactivity) is also kept separate from content, and is handled by scripts. Images are contained in separate graphics files, separate from text, though they can also be considered part of the content of a page.

Separation of concerns allows the document to be presented by different user agents according to their purposes and abilities. For example, a user agent can select an appropriate style sheet to present a document by displaying it on a monitor, printing it on paper, or determining speech characteristics in an audio-only user agent. The structural and semantic functions of the markup remain identical in each case.

Historically, user agents did not always support these features. In the 1990s, as a stop-gap, presentational elements (like <b> and <i>) were added to HTML, at the cost of creating problems for interoperability and user accessibility. This is now regarded as outmoded and has been superseded by style sheet-based design; most presentational elements are now deprecated.[15]

External image files are incorporated with the <img /> or <object /> elements. (With XHTML, the SVG language can also be used to write graphics within the document, though linking to external SVG files is generally simpler.)[16] Where an image is not purely decorative, HTML allows replacement content with similar semantic value to be provided for non-visual user agents.

An HTML document can also be extended through the use of scripts to provide additional behaviors beyond the abilities of HTML hyperlinks and forms.

The elements <style> and <script>, with related HTML attributes, provide style sheets and scripts.
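A minimal sketch of this separation follows. The class name, style rule and script behavior are invented for illustration only.

```html
<!DOCTYPE html>
<html>
<head>
  <title>Separation of concerns</title>
  <style>
    /* Presentation lives in the style sheet */
    p.note { color: gray; font-size: small; }
  </style>
  <script>
    // Behavior lives in the script
    document.addEventListener("DOMContentLoaded", function () {
      document.title = "Page loaded";
    });
  </script>
</head>
<body>
  <!-- Content and structure stay in the markup -->
  <p class="note">This paragraph carries no presentational markup.</p>
</body>
</html>
```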

The <meta /> element can be used to specify additional metadata about a document, such as its author, publication date, expiration date, language, page title, page description, keywords, or other information not provided through the other header elements and HTML attributes. Because of their generic nature, <meta /> elements specify associative key-value pairs. In general, a meta element conveys hidden information about the document. Several meta tags can be used, all of which should be nested in the head element. The specific purpose of each <meta /> element is defined by its attributes. Outside of XHTML, it is often given without the slash (<meta>), despite being a void element.

In one form, <meta /> elements can specify HTTP headers which should be sent by a web server before the actual content. For example, <meta http-equiv="foo" content="bar" /> specifies that the page should be served with an HTTP header called foo that has a value bar.
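As an illustrative sketch, a document head using both forms of <meta /> might look like this. All of the content values are placeholders.

```html
<head>
  <title>Example page</title>
  <!-- name/content pairs describe the document -->
  <meta name="author" content="A. N. Example" />
  <meta name="description" content="A short description of the page." />
  <meta name="keywords" content="html, element, metadata" />
  <!-- http-equiv stands in for an HTTP header -->
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
```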

The <object> element is used for including generic objects within the document header. Though rarely used within a <head> element, it could potentially be used to extract foreign data and associate it with the current document.

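In visual browsers, displayable elements can be rendered as either block or inline. While all elements are part of the document sequence, block elements appear within their parent elements as rectangular objects that do not break across lines, with margins, width and height that can be set independently of the surrounding elements.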

Conversely, inline elements are treated as part of the flow of document text; they cannot have margins, width or height set, and do break across lines.

Block elements, or block-level elements, have a rectangular structure. By default, these elements span the entire width of their parent element, and thus do not allow any other element to occupy the same horizontal space.

The rectangular structure of a block element is often referred to as the box model, and is made up of several parts. Each element contains the content itself, surrounded in turn by padding, a border, and a margin.
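The parts of the box model can be sketched with a small style rule. The class name and all sizes are arbitrary illustration values.

```html
<style>
  .box {
    width: 200px;             /* width of the content area */
    padding: 10px;            /* space between content and border */
    border: 2px solid black;  /* border drawn around the padding */
    margin: 20px;             /* space outside the border */
  }
</style>
<div class="box">Content</div>
```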

The above section refers only to the detailed implementation of CSS rendering and has no relevance to HTML elements themselves.

<dt>: A name in a description list (previously definition term in a definition list).
<dd>: A value in a description list (previously definition data in a definition list).
<aside>: Used for content in a document which is separate from the main page content, for example, sidebars or advertising.

<blockquote>: A block-level quotation, for when the quotation includes block-level elements, e.g. paragraphs. The cite attribute (not to be confused with the <cite> element) may give the source, and must be a fully qualified Uniform Resource Identifier.

<del>: Marks a deleted section of content. This element can also be used as inline.
<footer>: Used for document footers. These might contain author or copyright information, or links to other pages.
<header>: Used for document headers. These typically contain content introducing the page.
<hr>: A thematic break (originally: horizontal rule). Presentational rules can be drawn with style sheets.
<ins>: Marks a section of inserted content. This element can also be used as inline.
<nav>: Used in navigational sections of articles (areas of webpages which contain links to other webpages).
<noscript>: Replacement content for scripts. Unlike <script>, this can only be used as a block-level element.

Inline elements cannot be placed directly inside the <body> element; they must be wholly nested within block-level elements.[24]

An anchor element is called an anchor because web designers can use it to "anchor" a URL to some text on a web page. When users view the web page in a browser, they can click the text to activate the link and visit the page whose URL is in the link.[25]

In HTML, an anchor can be either the origin (the anchor text) or the target (destination) end of a hyperlink.

With the attribute href,[26] the anchor becomes a hyperlink to either another part of the document or another resource (e.g. a webpage) using an external URL. Alternatively (and sometimes concurrently), with the name or id HTML attributes set, the element becomes a link target. A Uniform Resource Locator (URL) can link to this target via a fragment identifier. In HTML5, any element can now be made into a target by using the id attribute,[27] so using <a name="foo">...</a> is not necessary, although this way of adding anchors continues to work.

To illustrate: the header of a table of contents section on example.com's homepage could be turned into a target by writing: <h2><a name="contents">Table of contents</a></h2>.

Continuing with this example, now that the section has been marked up as a target, it can be referred to from external sites with a link like: <a href="http://example.com#contents">see contents</a>;

or with a link on the same page like: <a href="#contents">contents, above</a>.

The attribute title may be set to give brief information about the link: <a href="URL" title="additional information">link text</a>.
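The same table-of-contents example can be sketched with an id attribute instead of a named anchor. The title text is a placeholder, and example.com is the illustrative site used above.

```html
<!-- Target: any element with an id can be linked to -->
<h2 id="contents">Table of contents</h2>

<!-- Link from the same page, with a title giving brief information -->
<a href="#contents" title="Jump to the table of contents">contents, above</a>

<!-- Link from another site -->
<a href="http://example.com/#contents">see contents</a>
```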

In most graphical browsers, when the cursor hovers over a link, the cursor changes into a hand with an extended index finger and the title value is displayed in a tooltip or in some other manner. Some browsers render alt text the same way, although this is not what the specification calls for.

Phrase elements are used for marking up phrases and adding structure or semantic meaning to text fragments. For example, the <em> and <strong> tags can be used for adding emphasis to text.

These elements are useful primarily for documenting computer code development and user interaction through differentiation of source code (<code>), variables (<var>), user input (<kbd>), and terminal or other output (<samp>).

<code>: A code snippet (code example). Conventionally rendered in a monospace font.

As visual presentational markup only applies directly to visual browsers, its use is discouraged. Style sheets should be used instead. Several of these elements are deprecated or invalid in HTML 4 / XHTML 1.0, and the remainder are invalid in the current draft of XHTML 2.0. The current draft of HTML5, however, re-includes <s>, <u>, and <small>, assigning new semantic meaning to each. In an HTML5 document, the use of these elements is no longer discouraged, provided that it is semantically correct.

<bdi>: Isolates an inline section of text that may be formatted in a different direction from other text outside of it, such as user-generated content with unknown directionality.
<bdo>: Marks an inline section of text in which the reading direction is the opposite from that of the parent element.
<mark>: Produces highlighted text. Intended for highlighting relevant text in a quotation.
<rp>: Provides fallback parentheses for browsers lacking ruby annotation support.
<ruby>: Represents a ruby annotation for showing the pronunciation of East Asian characters.
<time>: Represents a time on the 24-hour clock or a date on the Gregorian calendar, optionally with time and time zone information. Also allows times and dates to be represented in a machine-readable format.
<embed>: Inserts a non-standard object (like applet) or external content (typically non-HTML) into the document.
<track>: Provides text tracks, like subtitles and captions, for audio and video.

These elements can be combined into a form or in some instances used separately as user-interface controls; in the document, they can be simple HTML or used in conjunction with scripts. HTML markup specifies the elements that make up a form, and the method by which it will be submitted. However, some form of script (server-side, client-side, or both) must be used to process the user's input once it is submitted.

(These elements are either block or inline elements, but are collected here as their use is more restricted than other inline or block elements.)

<button>: A generic form button which can contain a range of other elements to create complex buttons.
<input>: Allows a variety of standard form controls to be implemented, selected through the type attribute.
<input type="button">: A general-purpose button. The element <button> is preferred if possible (i.e., if the client supports it) as it provides richer possibilities.
<input type="range">: Produces a slider that returns a number, but the number is not visible to the user.
<input type="hidden">: Hidden inputs are not visible in the rendered page, but allow a designer to maintain a copy of data that needs to be submitted to the server as part of the form. This may, for example, be data that this web user entered or selected on a previous form that needs to be processed in conjunction with the current form. The data is not displayed to the user but can still be altered client-side by editing the HTML source.
<label>: Creates a label for a form input, such as a radio button. Clicking on the label fires a click on the matching input.
<select>: Creates a selection list, from which the user can select a single option. May be rendered as a dropdown list.
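A small form combining several of these controls is sketched here. The action URL, field names and option values are all hypothetical.

```html
<form action="/subscribe" method="post">
  <!-- Hidden data: submitted with the form but not displayed -->
  <input type="hidden" name="source" value="newsletter-page">

  <!-- The for attribute ties each label to the control with the matching id -->
  <label for="email">Email address</label>
  <input type="text" id="email" name="email">

  <label for="volume">Volume</label>
  <input type="range" id="volume" name="volume" min="0" max="10">

  <label for="frequency">Frequency</label>
  <select id="frequency" name="frequency">
    <option value="daily">Daily</option>
    <option value="weekly">Weekly</option>
  </select>

  <button type="submit">Subscribe</button>
</form>
```

The markup only describes the controls and the submission method; processing the submitted values still requires a script on the server, the client, or both.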

The format of HTML Tables was proposed in the HTML 3.0 Drafts and the later RFC 1942 HTML Tables. They were inspired by the CALS Table Model. Some elements in these proposals were included in HTML 3.2; the present form of HTML Tables was standardized in HTML 4. (Many of the elements used within tables are neither block nor inline elements.)

<thead>: Specifies the header part of a <table>. This section may be repeated by the user agent if the table is split across pages (in printing or other paged media).
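As a brief sketch, a table with an explicit header section could be written as follows (the cell contents are placeholders):

```html
<table>
  <thead>
    <!-- Header rows: may be repeated on each page in paged media -->
    <tr><th>Element</th><th>Purpose</th></tr>
  </thead>
  <tbody>
    <tr><td>thead</td><td>Header rows</td></tr>
    <tr><td>tbody</td><td>Body rows</td></tr>
  </tbody>
</table>
```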

Frames allow a visual HTML browser window to be split into segments, each of which can show a different document. This can lower bandwidth use, as repeating parts of a layout can be used in one frame, while variable content is displayed in another. This may come at a certain usability cost, especially in non-visual user agents,[52] due to separate and independent documents (or websites) being displayed adjacent to each other and being allowed to interact with the same parent window. Because of this cost, frames (excluding the <iframe> element) are only allowed in HTML 4.01 Frameset. Iframes can also hold documents on different servers. In this case the interaction between windows is blocked by the browser. Sites like Facebook and Twitter use iframes to display content (plugins) on third party websites. Google AdSense uses iframes to display banners on third party websites.

In HTML 4.01, a document may contain a <head> and a <body> or a <head> and a <frameset>, but not both a <body> and a <frameset>. However, <iframe> can be used in a normal document body.

<noframes>: Contains normal HTML content for user agents that don't support <frame /> elements.
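A sketch of an HTML 4.01 Frameset document with fallback content follows. The file names menu.html and content.html are hypothetical.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
  "http://www.w3.org/TR/html4/frameset.dtd">
<html>
<head>
  <title>Frameset example</title>
</head>
<!-- A frameset takes the place of the body element -->
<frameset cols="25%,75%">
  <frame src="menu.html">
  <frame src="content.html">
  <!-- Shown only by user agents that do not support frames -->
  <noframes>
    <body>
      <p>This page uses frames. The <a href="content.html">content</a> is also available directly.</p>
    </body>
  </noframes>
</frameset>
</html>
```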

In HTML, longdesc is an attribute used within the <img />, <frame />, or <iframe> elements. It is supposed to be a URL[note 5] to a document that provides a long description for the image, frame, or iframe in question.[53] Note that this attribute should contain a URL, not – as is commonly mistaken – the text of the description itself.

longdesc was designed to be used by screen readers to display image information for computer users with accessibility issues, such as the blind or visually impaired, and is widely implemented by both web browsers and screen readers.[54] Some developers object that[55] it is actually seldom used for this purpose because there are relatively few authors who use the attribute and most of those authors use it incorrectly; thus, they recommend deprecating longdesc.[56] The publishing industry has responded, advocating the retention of longdesc.[57]

Since very few graphical browsers support making the link available natively (Opera and iCab being the exceptions), it is useful to include a link to the description page near the <img /> element whenever possible, as this can also aid sighted users.
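The recommended combination of longdesc and a visible link is sketched here. Both file names and the alt text are hypothetical.

```html
<!-- longdesc holds the URL of the description, not the description itself -->
<img src="chart.png" alt="Sales chart"
     longdesc="chart-description.html">

<!-- An ordinary link nearby makes the same description available to all users -->
<a href="chart-description.html">Description of the chart</a>
```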

The following elements were part of the early HTML developed by Tim Berners-Lee from 1989 to 1991; they are mentioned in HTML Tags, but deprecated in HTML 2.0 and were never part of HTML standards.

<listing>: This element displayed the text inside the tags in a monospace font and without interpreting the HTML. The HTML 2.0 specification recommended rendering the element at up to 132 characters per line.
<xmp>: This element displayed the text inside the tags in a monospace font and without interpreting the HTML. The HTML 2.0 specification recommended rendering the element at 80 characters per line.

This section lists some widely used obsolete elements, which means they are not used in valid code. They may not be supported in all user agents.

A comment in HTML (and related XML, SGML and SHTML) uses the same syntax as the SGML comment or XML comment, depending on the doctype.

The markup <!--Xbegin<!--Y-->Xend--> will yield the comment Xbegin<!--Y and the text Xend--> after it, or sometimes just Xend-->, depending on browser.

Comments can appear anywhere in a document, as the HTML parser is supposed to ignore them no matter where they appear so long as they are not inside other HTML tag structures (i.e., they cannot be used next to attributes and values; this is invalid markup: <span id="x1"<!--for "extension one"--> style="...">).

Comments can even appear before the doctype declaration; no other tags are permitted to do this.
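The placement rules above can be sketched in a short document, with placeholder title and text:

```html
<!-- A comment may appear before the doctype -->
<!DOCTYPE html>
<html>
<head>
  <title>Comments</title>
</head>
<body>
  <!-- Valid: this comment sits between elements -->
  <p>Visible text.</p>
  <!--
    Invalid placement would be inside a tag, for example
    between two attributes of a start tag.
  -->
</body>
</html>
```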

However, not all browsers and HTML editors are fully compliant with the HTML syntax framework and may do unpredictable things under some syntax conditions. Defective handling of comments only affects about 5% of all browsers and HTML editors in use, and even then only certain versions are affected by comment mishandling issues (Internet Explorer 6 accounts for most of this high percentage).
