HTML Standard_indo
HTML
Living Standard — Last Updated 19 July 2024
https://html.spec.whatwg.org/multipage/syntax.html#start-tags 1/12
7/24/24, 2:46 PM HTML Standard
Note
This section only describes the rules for resources labeled with an HTML MIME type. Rules for XML resources are discussed in the section
below entitled "The XML syntax".
This section only applies to documents, authoring tools, and markup generators. In particular, it does not apply to conformance checkers;
conformance checkers must use the requirements given in the next section ("parsing HTML documents").
3. A DOCTYPE.
The various types of content mentioned above are described in the next few sections.
In addition, there are some restrictions on how character encoding declarations are to be serialized, as discussed in the section on that topic.
Note
ASCII whitespace before the html element, at the start of the html element and before the head element, will be dropped when the document is
parsed; ASCII whitespace after the html element will be parsed as if it were at the end of the body element. Thus, ASCII whitespace around the
document element does not round-trip.
It is suggested that newlines be inserted after the DOCTYPE, after any comments that are before the document element, after the html
element's start tag (if it is not omitted), and after any comments that are inside the html element but before the head element.
Many strings in the HTML syntax (e.g. the names of elements and their attributes) are case-insensitive, but only for ASCII upper alphas and ASCII
lower alphas. For convenience, in this section this is just referred to as "case-insensitive".
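As a rough illustration of what "case-insensitive, but only for ASCII" means in practice, the following Python sketch compares strings while folding only the letters A to Z. The helper names are hypothetical and not part of the standard.

```python
import string

# Translation table that lowercases only ASCII upper alphas (A-Z).
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lowercase(s: str) -> str:
    """Lowercase ASCII upper alphas only; leave every other code point alone."""
    return s.translate(_ASCII_LOWER)


def ascii_case_insensitive_match(a: str, b: str) -> bool:
    """Compare two strings, ignoring case for ASCII letters only."""
    return ascii_lowercase(a) == ascii_lowercase(b)
```

A generic Unicode-aware lowercase is not equivalent: in Python, `"\u212A".lower()` (KELVIN SIGN) yields `"k"`, whereas the helper above leaves that code point untouched, so it does not match `"k"`.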
Note
DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some
specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant
specifications.
Note
In other words, <!DOCTYPE html>, case-insensitively.
For the purposes of HTML generators that cannot output HTML markup with the short DOCTYPE "<!DOCTYPE html>", a DOCTYPE legacy string
may be inserted into the DOCTYPE (in the position defined above). This string must consist of:
Note
In other words, <!DOCTYPE html SYSTEM "about:legacy-compat"> or <!DOCTYPE html SYSTEM 'about:legacy-compat'>, case-
insensitively except for the part in single or double quotes.
The DOCTYPE legacy string should not be used unless the document is generated from a system that cannot output the shorter string.
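The two DOCTYPE forms quoted in the notes above can be checked with a short Python sketch. The exact whitespace allowances used here (one or more ASCII whitespace between the parts, optional whitespace before the closing ">") are an assumption about the full grammar, which this excerpt does not reproduce, and the function name is hypothetical.

```python
import re

_WS = r"[\t\n\f\r ]"  # ASCII whitespace

# re.ASCII keeps the scoped (?i:...) groups from folding non-ASCII letters.
_DOCTYPE_RE = re.compile(
    r"<!(?i:DOCTYPE)" + _WS + r"+(?i:html)"
    + r"(?:" + _WS + r"+(?i:SYSTEM)" + _WS + r"+([\"'])about:legacy-compat\1)?"
    + _WS + r"*>",
    re.ASCII,
)


def is_conforming_doctype(markup: str) -> bool:
    """True for the short DOCTYPE or the legacy-compat variant."""
    return _DOCTYPE_RE.fullmatch(markup) is not None
```

The quoted part is matched case-sensitively and must use the same quote character on both sides, in line with "case-insensitively except for the part in single or double quotes".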
13.1.2 Elements §
There are six different kinds of elements: void elements, the template element, raw text elements, escapable raw text elements, foreign elements,
and normal elements.
Void elements
area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr
Foreign elements
Elements from the MathML namespace and the SVG namespace.
Normal elements
All other allowed HTML elements are normal elements.
Tags are used to delimit the start and end of elements in the markup. Raw text, escapable raw text, and normal elements have a start tag to indicate
where they begin, and an end tag to indicate where they end. The start and end tags of certain normal elements can be omitted, as described below
in the section on optional tags. Those that cannot be omitted must not be omitted. Void elements only have a start tag; end tags must not be specified
for void elements. Foreign elements must either have a start tag and an end tag, or a start tag that is marked as self-closing, in which case they must
not have an end tag.
The contents of the element must be placed between just after the start tag (which might be implied, in certain cases) and just before the end tag
(which again, might be implied in certain cases). The exact allowed contents of each individual element depend on the content model of that element,
as described earlier in this specification. Elements must not contain content that their content model disallows. In addition to the restrictions placed on
the contents by those content models, however, the six kinds of elements have additional syntactic requirements.
Void elements can't have any contents (since there's no end tag, no content can be put between the start tag and the end tag).
The template element can have template contents, but such template contents are not children of the template element itself. Instead, they are
stored in a DocumentFragment associated with a different Document — without a browsing context — so as to avoid the template contents
interfering with the main Document. The markup for the template contents of a template element is placed just after the template element's start tag
and just before the template element's end tag (as with other elements), and may consist of any text, character references, elements, and comments,
but the text must not contain the character U+003C LESS-THAN SIGN (<) or an ambiguous ampersand.
Raw text elements can have text, though it has restrictions described below.
Escapable raw text elements can have text and character references, but the text must not contain an ambiguous ampersand. There are also
further restrictions described below.
Foreign elements whose start tag is marked as self-closing can't have any contents (since, again, as there's no end tag, no content can be put
between the start tag and the end tag). Foreign elements whose start tag is not marked as self-closing can have text, character references, CDATA
sections, other elements, and comments, but the text must not contain the character U+003C LESS-THAN SIGN (<) or an ambiguous
ampersand.
Note
The HTML syntax does not support namespace declarations, even in foreign elements.
For instance, consider the following HTML fragment:
<p>
<svg>
<metadata>
<!-- this is invalid -->
<cdr:license xmlns:cdr="https://www.example.com/cdr/metadata" name="MIT"/>
</metadata>
</svg>
</p>
The innermost element, cdr:license, is actually in the SVG namespace, as the "xmlns:cdr" attribute has no effect (unlike in XML).
In fact, as the comment in the fragment above says, the fragment is actually non-conforming. This is because SVG 2 does not define any
elements called "cdr:license" in the SVG namespace.
Normal elements can have text, character references, other elements, and comments, but the text must not contain the character U+003C LESS-THAN
SIGN (<) or an ambiguous ampersand. Some normal elements also have yet more restrictions on what content they are allowed to hold, beyond the
restrictions imposed by the content model and those described in this paragraph. Those restrictions are described below.
Tags contain a tag name, giving the element's name. HTML elements all have names that only use ASCII alphanumerics. In the HTML syntax,
tag names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase,
matches the element's tag name; tag names are case-insensitive.
Start tags must have the following format:
1. The first character of a start tag must be a U+003C LESS-THAN SIGN character (<).
2. The next few characters of a start tag must be the element's tag name.
3. If there are to be any attributes in the next step, there must first be one or more ASCII whitespace.
4. Then, the start tag may have a number of attributes, the syntax for which is described below. Attributes must be separated from each other
by one or more ASCII whitespace.
5. After the attributes, or after the tag name if there are no attributes, there may be one or more ASCII whitespace. (Some attributes are required
to be followed by a space. See the attributes section below.)
6. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS
character (/), which on foreign elements marks the start tag as self-closing. On void elements, it does not mark the start tag as self-closing
but instead is unnecessary and has no effect of any kind. For such void elements, it should be used only with caution, especially since, if
directly preceded by an unquoted attribute value, it becomes part of the attribute value rather than being discarded by the parser.
7. Finally, start tags must be closed by a U+003E GREATER-THAN SIGN character (>).
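The numbered start tag steps above can be sketched as a small Python builder. This is only an illustration: the function name is hypothetical, attribute names are assumed to be valid already, and always emitting double-quoted values is a design choice of this sketch rather than something the standard requires.

```python
import html


def build_start_tag(tag_name, attributes=()):
    """Assemble a start tag following the numbered steps above.

    `attributes` is an iterable of (name, value) pairs; a value of None
    selects the empty attribute syntax. Names are assumed to be valid already.
    """
    parts = ["<", tag_name]                      # steps 1 and 2
    for name, value in attributes:
        parts.append(" ")                        # steps 3 and 4: whitespace before each attribute
        if value is None:
            parts.append(name)                   # empty attribute syntax
        else:
            # Always double-quote, so a trailing "/" can never be absorbed
            # into an unquoted value (the hazard noted in step 6).
            parts.append(f'{name}="{html.escape(value, quote=True)}"')
    # Steps 5 and 6 (optional trailing whitespace and solidus) are skipped.
    parts.append(">")                            # step 7
    return "".join(parts)
```

For instance, `build_start_tag("input", [("disabled", None), ("value", "yes")])` produces `<input disabled value="yes">`.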
End tags must have the following format:
1. The first character of an end tag must be a U+003C LESS-THAN SIGN character (<).
2. The second character of an end tag must be a U+002F SOLIDUS character (/).
3. The next few characters of an end tag must be the element's tag name.
4. After the tag name, there may be one or more ASCII whitespace.
5. Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).
13.1.2.3 Attributes §
Attributes have a name and a value. Attribute names must consist of one or more characters other than controls, U+0020 SPACE, U+0022 ("), U+0027 ('),
U+003E (>), U+002F (/), U+003D (=), and noncharacters. In the HTML syntax, attribute names, even those for foreign elements, may be written with
any mix of ASCII lower and ASCII upper alphas.
Attribute values are a mixture of text and character references, except with the additional restriction that the text cannot contain an ambiguous
ampersand.
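The attribute name rule above can be expressed as a short Python check. The ranges used for "controls" and "noncharacters" follow my understanding of the Infra Standard's definitions, which this section references but does not spell out, so treat them as assumptions; the function names are hypothetical.

```python
_FORBIDDEN = frozenset(' "\'>/=')


def _is_control(cp):
    # C0 controls plus U+007F through U+009F.
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def _is_noncharacter(cp):
    # U+FDD0..U+FDEF, plus the last two code points of every plane.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_valid_attribute_name(name):
    if not name:
        return False  # the rule requires one or more characters
    for ch in name:
        cp = ord(ch)
        if ch in _FORBIDDEN or _is_control(cp) or _is_noncharacter(cp):
            return False
    return True
```

Note that this only checks the syntax of the name; whether a given attribute is allowed on a given element is a separate question.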
Example
In the following example, the disabled attribute is given with the empty attribute syntax:
<input disabled>
If an attribute using the empty attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating the
two.
Example
In the following example, the value attribute is given with the unquoted attribute value syntax:
<input value=yes>
If an attribute using the unquoted attribute syntax is to be followed by another attribute or by the optional U+002F SOLIDUS character (/)
allowed in step 6 of the start tag syntax above, then there must be ASCII whitespace separating the two.
Example
In the following example, the type attribute is given with the single-quoted attribute value syntax:
<input type='checkbox'>
If an attribute using the single-quoted attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating
the two.
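Double-quoted attribute value syntax
The attribute name, followed by zero or more ASCII whitespace, followed by a single U+003D EQUALS SIGN character, followed by zero or more
ASCII whitespace, followed by a single U+0022 QUOTATION MARK character ("), followed by the attribute value, which, in addition to the
requirements given above for attribute values, must not contain any literal U+0022 QUOTATION MARK characters ("), and finally followed by a
second single U+0022 QUOTATION MARK character (").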
Example
In the following example, the name attribute is given with the double-quoted attribute value syntax:
If an attribute using the double-quoted attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating the
two.
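To see the attribute syntaxes side by side, the following Python sketch feeds sample start tags to the standard library's `html.parser.HTMLParser` and records what it reports. That parser is a lenient tokenizer rather than an implementation of this specification, and the `name="be evil"` value is just an illustrative input, so this is a demonstration of the syntaxes rather than a conformance check.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Record the (name, value) pairs of every start tag seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def parse_attributes(markup):
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen
```

With this helper, `<input value=yes>`, `<input value='yes'>`, and `<input value="yes">` all report the same `("value", "yes")` pair. Python's parser reports `None` as the value of an attribute written with the empty attribute syntax, which is a quirk of that library.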
There must never be two or more attributes on the same start tag whose names are an ASCII case-insensitive match for each other.
When a foreign element has one of the namespaced attributes given by the local name and namespace of the first and second cells of a row from the
following table, it must be written using the name given by the third cell from the same row.
Note
Whether the attributes in the table above are conforming or not is defined by other specifications (e.g. SVG 2 and MathML); this section only
describes the syntax rules if the attributes are serialized using the HTML syntax.
Note
Omitting an element's start tag in the situations described below does not mean the element is not present; it is implied, but it is still there. For
example, an HTML document always has a root html element, even if the string <html> doesn't appear anywhere in the markup.
An html element's start tag may be omitted if the first thing inside the html element is not a comment.
Example
For example, in the following case it's ok to remove the "<html>" tag:
<!DOCTYPE HTML>
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
Doing so would make the document look like this:
<!DOCTYPE HTML>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
This has the exact same DOM. In particular, note that whitespace around the document element is ignored by the parser. The following example
would also have the exact same DOM:
<!DOCTYPE HTML><head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
However, in the following example, removing the start tag moves the comment to before the html element:
<!DOCTYPE HTML>
<html>
<!-- where is this comment in the DOM? -->
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
With the tag removed, the document actually turns into the same as this:
<!DOCTYPE HTML>
<!-- where is this comment in the DOM? -->
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
This is why the tag can only be removed if it is not followed by a comment: removing the tag when there is a comment there changes the
document's resulting parse tree. Of course, if the position of the comment does not matter, then the tag can be omitted, as if the comment had
been moved to before the start tag in the first place.
An html element's end tag may be omitted if the html element is not immediately followed by a comment.
A head element's start tag may be omitted if the element is empty, or if the first thing inside the head element is an element.
A head element's end tag may be omitted if the head element is not immediately followed by ASCII whitespace or a comment.
A body element's start tag may be omitted if the element is empty, or if the first thing inside the body element is not ASCII whitespace or a comment,
except if the first thing inside the body element is a meta, noscript, link, script, style, or template element.
A body element's end tag may be omitted if the body element is not immediately followed by a comment.
Example
Note that in the example above, the head element start and end tags, and the body element start tag, can't be omitted, because they are
surrounded by whitespace:
<!DOCTYPE HTML>
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
(The body and html element end tags could be omitted without trouble; any spaces after those get parsed into the body element anyway.)
Usually, however, whitespace isn't an issue. If we first remove the whitespace we don't care about, then omit the optional tags, and finally add
some whitespace back, the document becomes:
<!DOCTYPE HTML>
<title>Hello</title>
<p>Welcome to this example.</p>
This would be equivalent to this document, with the omitted tags shown in their parser-implied positions; the only whitespace text node that
results from this is the newline at the end of the head element:
<!DOCTYPE HTML>
<html><head><title>Hello</title>
</head><body><p>Welcome to this example.</p></body></html>
An li element's end tag may be omitted if the li element is immediately followed by another li element or if there is no more content in the parent
element.
A dt element's end tag may be omitted if the dt element is immediately followed by another dt element or a dd element.
A dd element's end tag may be omitted if the dd element is immediately followed by another dd element or a dt element, or if there is no more content
in the parent element.
A p element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, dl,
fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, menu, nav, ol, p, pre, search, section, table, or
ul element, or if there is no more content in the parent element and the parent element is an HTML element that is not an a, audio, del, ins, map,
noscript, or video element, or an autonomous custom element.
Example
We can thus simplify the earlier example further:
An rt element's end tag may be omitted if the rt element is immediately followed by an rt or rp element, or if there is no more content in the parent
element.
An rp element's end tag may be omitted if the rp element is immediately followed by an rt or rp element, or if there is no more content in the parent
element.
An optgroup element's end tag may be omitted if the optgroup element is immediately followed by another optgroup element, if it is immediately
followed by an hr element, or if there is no more content in the parent element.
An option element's end tag may be omitted if the option element is immediately followed by another option element, if it is immediately followed
by an optgroup element, if it is immediately followed by an hr element, or if there is no more content in the parent element.
A colgroup element's start tag may be omitted if the first thing inside the colgroup element is a col element, and if the element is not immediately
preceded by another colgroup element whose end tag has been omitted. (It can't be omitted if the element is empty.)
A colgroup element's end tag may be omitted if the colgroup element is not immediately followed by ASCII whitespace or a comment.
A caption element's end tag may be omitted if the caption element is not immediately followed by ASCII whitespace or a comment.
A thead element's end tag may be omitted if the thead element is immediately followed by a tbody or tfoot element.
A tbody element's start tag may be omitted if the first thing inside the tbody element is a tr element, and if the element is not immediately preceded
by a tbody, thead, or tfoot element whose end tag has been omitted. (It can't be omitted if the element is empty.)
A tbody element's end tag may be omitted if the tbody element is immediately followed by a tbody or tfoot element, or if there is no more content
in the parent element.
A tfoot element's end tag may be omitted if there is no more content in the parent element.
A tr element's end tag may be omitted if the tr element is immediately followed by another tr element, or if there is no more content in the parent
element.
A td element's end tag may be omitted if the td element is immediately followed by a td or th element, or if there is no more content in the parent
element.
A th element's end tag may be omitted if the th element is immediately followed by a td or th element, or if there is no more content in the parent
element.
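Many of the end tag rules above share one shape: the end tag may be omitted when the element is immediately followed by one of a few specific elements, and sometimes also when nothing follows it in the parent. A table-driven Python sketch of just those rules might look as follows. It deliberately leaves out p, colgroup, caption, html, head, and body, whose conditions are different, and it does not model following text or comments; the names are hypothetical.

```python
# tag -> (tags that may immediately follow, whether "no more content in the
# parent element" also permits omission). Transcribed from the rules above.
_END_TAG_RULES = {
    "li": ({"li"}, True),
    "dt": ({"dt", "dd"}, False),
    "dd": ({"dd", "dt"}, True),
    "rt": ({"rt", "rp"}, True),
    "rp": ({"rt", "rp"}, True),
    "optgroup": ({"optgroup", "hr"}, True),
    "option": ({"option", "optgroup", "hr"}, True),
    "thead": ({"tbody", "tfoot"}, False),
    "tbody": ({"tbody", "tfoot"}, True),
    "tfoot": (set(), True),
    "tr": ({"tr"}, True),
    "td": ({"td", "th"}, True),
    "th": ({"td", "th"}, True),
}


def end_tag_may_be_omitted(tag, next_element=None):
    """`next_element` is the tag name of the element that immediately follows,
    or None when there is no more content in the parent element."""
    if tag not in _END_TAG_RULES:
        return False  # not covered by this simplified table
    followers, ok_at_end = _END_TAG_RULES[tag]
    if next_element is None:
        return ok_at_end
    return next_element in followers
```

For example, a dt end tag cannot be dropped at the end of its parent, while a dd end tag can.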
Example
The ability to omit all these table-related tags makes table markup much terser.
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)</caption>
<colgroup><col><col><col></colgroup>
<thead>
<tr>
<th>Function</th>
<th>Control Unit</th>
<th>Central Station</th>
</tr>
</thead>
<tbody>
<tr>
<td>Headlights</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Interior Lights</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Electric locomotive operating sounds</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Engineer's cab lighting</td>
<td></td>
<td>✔</td>
</tr>
<tr>
<td>Station Announcements - Swiss</td>
<td></td>
<td>✔</td>
</tr>
</tbody>
</table>
The exact same table, modulo some whitespace differences, could be marked up as follows:
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
<colgroup><col><col><col>
<thead>
<tr>
<th>Function
<th>Control Unit
<th>Central Station
<tbody>
<tr>
<td>Headlights
<td>✔
<td>✔
<tr>
<td>Interior Lights
<td>✔
<td>✔
<tr>
<td>Electric locomotive operating sounds
<td>✔
<td>✔
<tr>
<td>Engineer's cab lighting
<td>
<td>✔
<tr>
<td>Station Announcements - Swiss
<td>
<td>✔
</table>
Since the cells take up much less room this way, this can be made even terser by having each row on one line:
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
<colgroup><col><col><col>
<thead>
<tr> <th>Function <th>Control Unit <th>Central Station
<tbody>
<tr> <td>Headlights <td>✔ <td>✔
<tr> <td>Interior Lights <td>✔ <td>✔
<tr> <td>Electric locomotive operating sounds <td>✔ <td>✔
<tr> <td>Engineer's cab lighting <td> <td>✔
<tr> <td>Station Announcements - Swiss <td> <td>✔
</table>
The only differences between these tables, at the DOM level, are in the precise position of the (in any case semantically-neutral) whitespace.
Example
Returning to the earlier example with all the whitespace removed and then all the optional tags removed:
If the body element in this example had to have a class attribute and the html element had to have a lang attribute, the markup would have to
become:
Note
This section assumes that the document is conforming, in particular, that there are no content model violations. Omitting tags in the fashion
described in this section in a document that does not conform to the content models described in this specification is likely to result in unexpected
DOM differences (this is, in part, what the content models are designed to avoid).
For historical reasons, certain elements have extra restrictions beyond even the restrictions given by their content model.
A table element must not contain tr elements, even though these elements are technically allowed inside table elements according to the content
models described in this specification. (If a tr element is put inside a table in the markup, it will in fact imply a tbody start tag before it.)
A single newline may be placed immediately after the start tag of pre and textarea elements. This does not affect the processing of the element.
The otherwise optional newline must be included if the element's contents themselves start with a newline (because otherwise the leading newline in
the contents would be treated like the optional newline, and ignored).
Example
The following two pre blocks are equivalent:
<pre>Hello</pre>
<pre>
Hello</pre>
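A markup generator has to account for the newline rule above when it writes out pre contents. The following Python sketch shows one way to do that, assuming the text uses LF newlines and the element has no child elements; the function name is hypothetical.

```python
import html


def serialize_pre(text):
    """Serialize `text` as the contents of a pre element.

    Assumes `text` uses LF newlines and contains no child elements.
    """
    # Double a leading newline so the parser drops only the added one.
    guard = "\n" if text.startswith("\n") else ""
    return "<pre>" + guard + html.escape(text, quote=False) + "</pre>"
```

Without the extra newline, contents that begin with a line break would lose it when the document is parsed again.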
13.1.2.6 Restrictions on the contents of raw text and escapable raw text elements §
The text in raw text and escapable raw text elements must not contain any occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F
SOLIDUS) followed by characters that case-insensitively match the tag name of the element followed by one of U+0009 CHARACTER TABULATION
(tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN
(>), or U+002F SOLIDUS (/).
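The restriction above translates fairly directly into a pattern match. The following Python sketch flags text that could not be placed inside the named element; the function name is hypothetical, and `re.ASCII` is combined with `re.IGNORECASE` so that only ASCII letters are case-folded.

```python
import re


def violates_raw_text_rule(tag_name, text):
    """Return True if `text` contains "</" + tag_name + a terminator character."""
    pattern = re.compile(
        "</" + re.escape(tag_name) + r"[\t\n\f\r />]",
        re.IGNORECASE | re.ASCII,  # ASCII-only case folding
    )
    return pattern.search(text) is not None
```

A script generator could use such a check before embedding a string literal like `'</SCRIPT>'` in an inline script, and escape or split the string if it matches.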
13.1.3 Text §
Text is allowed inside elements, attribute values, and comments. Extra constraints are placed on what is and what is not allowed in text based on
where the text is to be put, as described in the other sections.
13.1.3.1 Newlines §
Newlines in HTML may be represented either as U+000D CARRIAGE RETURN (CR) characters, U+000A LINE FEED (LF) characters, or pairs of
U+000D CARRIAGE RETURN (CR), U+000A LINE FEED (LF) characters in that order.
Where character references are allowed, a character reference of a U+000A LINE FEED (LF) character (but not a U+000D CARRIAGE RETURN
(CR) character) also represents a newline.
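When processing HTML source text line by line, the three literal newline forms above need to be treated alike. A minimal Python sketch, with hypothetical helper names, is shown below. It avoids `str.splitlines()`, which also splits on characters such as U+000C FORM FEED that are not newlines here.

```python
import re

_NEWLINE = re.compile(r"\r\n|\r|\n")  # CRLF first so it counts as one newline


def split_html_lines(text):
    """Split on CR, LF, or CRLF only."""
    return _NEWLINE.split(text)


def normalize_newlines(text):
    """Rewrite every newline form as a single LF."""
    return _NEWLINE.sub("\n", text)
```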
In certain cases described in other sections, text may be mixed with character references. These can be used to escape characters that couldn't
otherwise legally be included in text.
Character references must start with a U+0026 AMPERSAND character (&). Following this, there are three possible kinds of character references:
Named character references
The ampersand must be followed by one of the names given in the named character references section, using the same case. The name must be
one that is terminated by a U+003B SEMICOLON character (;).
The numeric character reference forms described above are allowed to reference any code point excluding U+000D CR, noncharacters, and controls
other than ASCII whitespace.
An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is followed by one or more ASCII alphanumerics, followed by a U+003B
SEMICOLON character (;), where these characters do not match any of the names given in the named character references section.
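The definition of an ambiguous ampersand can be sketched in Python using the `html.entities.html5` table from the standard library. This assumes that table matches the named character references section of the standard; the function name is hypothetical.

```python
import re
from html.entities import html5  # keys are reference names such as "amp;"

_CANDIDATE = re.compile(r"&([0-9A-Za-z]+);")


def find_ambiguous_ampersands(text):
    """Return the offsets of ampersands that are ambiguous per the definition above."""
    return [
        m.start()
        for m in _CANDIDATE.finditer(text)
        if m.group(1) + ";" not in html5
    ]
```

Only ampersands followed by alphanumerics and a semicolon are candidates, so a bare "&" or an unterminated "&copy" is not reported by this check.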
2. Optionally, text, with the additional restriction that the text must not contain the string "]]>".
Example
CDATA sections can only be used in foreign content (MathML or SVG). In this example, a CDATA section is used to escape the contents of a
MathML ms element:
<p>You can add a string to a number, but this stringifies the number:</p>
<math>
<ms><![CDATA[x<y]]></ms>
<mo>+</mo>
<mn>3</mn>
<mo>=</mo>
<ms><![CDATA[x<y3]]></ms>
</math>
13.1.6 Comments §
2. Optionally, text, with the additional restriction that the text must not start with the string ">", nor start with the string "->", nor contain the
strings "<!--", "-->", or "--!>", nor end with the string "<!-".
Note
The text is allowed to end with the string "<!", as in <!--My favorite operators are > and <!-->.
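The restrictions on comment text listed above can be checked with a few string tests. A minimal Python sketch, with a hypothetical function name, follows.

```python
def is_valid_comment_text(text):
    """Check the text between "<!--" and "-->" against the restrictions above."""
    if text.startswith(">") or text.startswith("->"):
        return False
    if any(s in text for s in ("<!--", "-->", "--!>")):
        return False
    return not text.endswith("<!-")
```

The text from the note, "My favorite operators are > and <!", passes this check because it ends with "<!" rather than "<!-".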