An HTML charset (character encoding) defines how the characters in a web document are stored as bytes. Declaring the correct encoding ensures that text appears correctly across different devices and platforms.
The <meta> tag's charset attribute is used to specify which character encoding the HTML document uses. By setting the charset, we ensure proper rendering of special characters, symbols, and text.
Common Character Encodings
1. ASCII
ASCII stands for American Standard Code for Information Interchange. It is a 7-bit character encoding and is the basic character set commonly used in C and C++ programming.
It has 128 characters: 33 control characters and 95 printable characters, including the letters A-Z and a-z, the digits 0-9, and special symbols such as + - * / ( ) @.
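As a small illustration of the 128-value range, the Python sketch below looks up code values with `ord` and `chr` and checks what happens when a character outside ASCII is encoded. Python is used here only for demonstration and is an assumption of this example, not part of the HTML page itself.

```python
# ASCII assigns each character a number from 0 to 127.
print(ord("A"))   # 65
print(chr(97))    # a

# Every ASCII character fits in a single byte with a value below 128.
ascii_bytes = "Hello".encode("ascii")
print(list(ascii_bytes))  # [72, 101, 108, 108, 111]

# A character outside the 128-value range cannot be encoded as ASCII.
try:
    "é".encode("ascii")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)  # False
```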
2. ANSI (Windows-1252)
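Windows-1252, often called "ANSI" on Windows, is a single-byte character encoding with 256 code positions. The name "ANSI" is a historical misnomer: the encoding was defined by Microsoft and was never standardized by the American National Standards Institute. It was used as the default character set in Western-language versions of Microsoft Windows.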
3. ISO-8859-1
ISO-8859-1 (Latin-1) is one of the ISO-8859 character sets, which the International Organization for Standardization (ISO) defines for different alphabets and languages. It was the default character set in HTML 4 and also supports 256 characters. It contains numbers, uppercase and lowercase English letters, some special characters, and the accented letters used in Western European languages.
4. UTF-8
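The Unicode standard, maintained by the Unicode Consortium, was developed because the ISO-8859 character sets are limited in size and not suited to a multilingual environment. UTF-8 and UTF-16 are encodings of Unicode. Unicode covers almost all of the characters, punctuation marks, and symbols in the world, and UTF-8 is the encoding recommended for HTML5 documents.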
Attribute
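The web browser must know which character encoding an HTML page uses. This is declared with a <meta> tag in the <head> section. In the example below, the first line uses the HTML 4 syntax and the second line uses the shorter HTML5 syntax.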
Example:
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
<meta charset="UTF-8">
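The following Python sketch shows one way a script could read the declared charset from either form of the tag, using the standard library's `html.parser`. The names `MetaCharsetFinder` and `declared_charset` are hypothetical helpers written for this example. Browsers use their own, more involved detection rules, so treat this as a simplified illustration only.

```python
from html.parser import HTMLParser


class MetaCharsetFinder(HTMLParser):
    """Hypothetical helper: records the first charset declared in a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attributes = dict(attrs)
        if attributes.get("charset"):
            # HTML5 form: <meta charset="UTF-8">
            self.charset = attributes["charset"].strip()
        elif (attributes.get("http-equiv") or "").lower() == "content-type":
            # HTML 4 form: the charset is a parameter inside the content attribute.
            content = attributes.get("content") or ""
            for part in content.split(";"):
                name, _, value = part.strip().partition("=")
                if name.strip().lower() == "charset" and value:
                    self.charset = value.strip()


def declared_charset(html_text):
    """Return the charset named in the first matching <meta> tag, or None."""
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


print(declared_charset('<meta charset="UTF-8">'))
print(declared_charset(
    '<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">'
))
```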
Note:
- The values from 0 to 127 make up the "Standard" ASCII character set.
- The values from 128 to 255 are often called the "Extended" character set. Their meaning depends on the encoding in use, for example Windows-1252 or ISO-8859-1.
Why Is Character Encoding Important?
- Consistency: Encoding defines how text, numbers, and symbols are interpreted, ensuring that content appears correctly regardless of the user's device or browser.
- Global Compatibility: Without proper encoding, characters in different languages or special symbols may display as unreadable or incorrect.
- Web Development: By specifying the charset, you avoid issues with rendering characters and improve your site's accessibility across diverse languages.
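The points above can be seen in a short Python sketch: the same bytes are decoded once with the encoding they were written in and once with a different one. The sample word is an arbitrary choice for illustration.

```python
text = "café"
utf8_bytes = text.encode("utf-8")

# Decoding with the encoding the bytes were written in recovers the text.
correct = utf8_bytes.decode("utf-8")
print(correct)  # café

# Decoding the same bytes as Windows-1252 produces garbled output,
# because the two bytes of "é" are read as two separate characters.
garbled = utf8_bytes.decode("cp1252")
print(garbled)  # cafÃ©
```

This is the kind of unreadable text a visitor sees when the declared charset does not match the encoding the file was actually saved in.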
Character Sets for Different Character Encoding Standards
The following tables list the characters of the different character encoding standards along with their assigned number codes.
Table 1 (ASCII Device Control Characters)
This table contains characters that were designed to control hardware devices. They are also known as control characters.
Number | Character | Description |
---|---|---|
00 | NUL | null character |
01 | SOH | start of heading |
02 | STX | start of text |
03 | ETX | end of text |
04 | EOT | end of transmission |
05 | ENQ | enquiry |
06 | ACK | acknowledge |
07 | BEL | bell(ring) |
08 | BS | backspace |
09 | HT | horizontal tab |
10 | LF | line feed |
11 | VT | vertical tab |
12 | FF | form feed |
13 | CR | carriage return |
14 | SO | shift out |
15 | SI | shift in |
16 | DLE | data link escape |
17 | DC1 | device control 1 |
18 | DC2 | device control 2 |
19 | DC3 | device control 3 |
20 | DC4 | device control 4 |
21 | NAK | negative acknowledge |
22 | SYN | synchronize |
23 | ETB | end transmission block |
24 | CAN | cancel |
25 | EM | end of medium |
26 | SUB | substitute |
27 | ESC | escape |
28 | FS | file separator |
29 | GS | group separator |
30 | RS | record separator |
31 | US | unit separator |
127 | DEL | delete |
Table 2: This table contains the characters that have the same number assigned in ASCII, Windows-1252, ISO-8859-1, and UTF-8.
Number | Character | Description |
---|---|---|
32 | | Space |
33 | ! | Exclamation Mark |
34 | " | Quotation Mark |
35 | # | Hash Sign |
36 | $ | Dollar Sign |
37 | % | Percent Sign |
38 | & | Ampersand Sign |
39 | ' | Apostrophe Sign |
40 | ( | Opening Parenthesis |
41 | ) | Closing Parenthesis |
42 | * | Asterisk Sign |
43 | + | Plus Sign |
44 | , | Comma |
45 | - | Hyphen/minus Sign |
46 | . | Full-stop |
47 | / | Slash/Divide Sign |
48 | 0 | Number Zero |
49 | 1 | Number One |
50 | 2 | Number Two |
51 | 3 | Number Three |
52 | 4 | Number Four |
53 | 5 | Number Five |
54 | 6 | Number Six |
55 | 7 | Number Seven |
56 | 8 | Number Eight |
57 | 9 | Number Nine |
58 | : | Colon |
59 | ; | Semicolon |
60 | < | Less-than Sign |
61 | = | Equal-to Sign |
62 | > | Greater-than Sign |
63 | ? | Question Mark |
64 | @ | at Sign |
65 | A | Letter A |
66 | B | Letter B |
67 | C | Letter C |
68 | D | Letter D |
69 | E | Letter E |
70 | F | Letter F |
71 | G | Letter G |
72 | H | Letter H |
73 | I | Letter I |
74 | J | Letter J |
75 | K | Letter K |
76 | L | Letter L |
77 | M | Letter M |
78 | N | Letter N |
79 | O | Letter O |
80 | P | Letter P |
81 | Q | Letter Q |
82 | R | Letter R |
83 | S | Letter S |
84 | T | Letter T |
85 | U | Letter U |
86 | V | Letter V |
87 | W | Letter W |
88 | X | Letter X |
89 | Y | Letter Y |
90 | Z | Letter Z |
91 | [ | Opening Square Bracket |
92 | \ | Backslash |
93 | ] | Closing Square Bracket |
94 | ^ | Circumflex Accent |
95 | _ | Low Line |
96 | ` | Grave Accent |
97 | a | Letter a |
98 | b | Letter b |
99 | c | Letter c |
100 | d | Letter d |
101 | e | Letter e |
102 | f | Letter f |
103 | g | Letter g |
104 | h | Letter h |
105 | i | Letter i |
106 | j | Letter j |
107 | k | Letter k |
108 | l | Letter l |
109 | m | Letter m |
110 | n | Letter n |
111 | o | Letter o |
112 | p | Letter p |
113 | q | Letter q |
114 | r | Letter r |
115 | s | Letter s |
116 | t | Letter t |
117 | u | Letter u |
118 | v | Letter v |
119 | w | Letter w |
120 | x | Letter x |
121 | y | Letter y |
122 | z | Letter z |
123 | { | Opening Curly Bracket |
124 | \| | Vertical Line |
125 | } | Closing Curly Bracket |
126 | ~ | Tilde |
127 | DEL | delete |
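As a quick check of the claim behind Table 2, this Python sketch decodes every value from 32 to 126 under four encodings and records any value that does not map to the same character everywhere. The codec names are the ones Python uses (`cp1252` is Windows-1252).

```python
encodings = ["ascii", "cp1252", "iso-8859-1", "utf-8"]

# Decode each byte value from 32 to 126 under every encoding.
mismatches = []
for number in range(32, 127):
    decoded = {bytes([number]).decode(enc) for enc in encodings}
    if len(decoded) != 1:
        mismatches.append(number)

print(mismatches)  # [] because every value maps to the same character
print(bytes([65]).decode("utf-8"))  # A
```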
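Table 3: This table contains the characters whose numbers differ between character encodings. The characters shown are those of Windows-1252. In ISO-8859-1, the values 128 to 159 are control characters, and UTF-8 stores characters above 127 as more than one byte.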
Number | Character |
---|---|
128 | € |
129 | Not Used |
130 | ‚ |
131 | ƒ |
132 | „ |
133 | … |
134 | † |
135 | ‡ |
136 | ˆ |
137 | ‰ |
138 | Š |
139 | ‹ |
140 | Œ |
141 | Not Used |
142 | Ž |
143 | Not Used |
144 | Not Used |
145 | ‘ |
146 | ’ |
147 | “ |
148 | ” |
149 | • |
150 | – |
151 | — |
152 | ˜ |
153 | ™ |
154 | š |
155 | › |
156 | œ |
157 | Not Used |
158 | ž |
159 | Ÿ |
160 | no-break Space |
161 | ¡ |
162 | ¢ |
163 | £ |
164 | ¤ |
165 | ¥ |
166 | ¦ |
167 | § |
168 | ¨ |
169 | © |
170 | ª |
171 | « |
172 | ¬ |
173 | soft hyphen |
174 | ® |
175 | ¯ |
176 | ° |
177 | ± |
178 | ² |
179 | ³ |
180 | ´ |
181 | µ |
182 | ¶ |
183 | · |
184 | ¸ |
185 | ¹ |
186 | º |
187 | » |
188 | ¼ |
189 | ½ |
190 | ¾ |
191 | ¿ |
192 | À |
193 | Á |
194 | Â |
195 | Ã |
196 | Ä |
197 | Å |
198 | Æ |
199 | Ç |
200 | È |
201 | É |
202 | Ê |
203 | Ë |
204 | Ì |
205 | Í |
206 | Î |
207 | Ï |
208 | Ð |
209 | Ñ |
210 | Ò |
211 | Ó |
212 | Ô |
213 | Õ |
214 | Ö |
215 | × |
216 | Ø |
217 | Ù |
218 | Ú |
219 | Û |
220 | Ü |
221 | Ý |
222 | Þ |
223 | ß |
224 | à |
225 | á |
226 | â |
227 | ã |
228 | ä |
229 | å |
230 | æ |
231 | ç |
232 | è |
233 | é |
234 | ê |
235 | ë |
236 | ì |
237 | í |
238 | î |
239 | ï |
240 | ð |
241 | ñ |
242 | ò |
243 | ó |
244 | ô |
245 | õ |
246 | ö |
247 | ÷ |
248 | ø |
249 | ù |
250 | ú |
251 | û |
252 | ü |
253 | ý |
254 | þ |
255 | ÿ |
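The differences in this upper range can be explored with a short Python sketch. It relies on Python's built-in `cp1252` and `iso-8859-1` codecs; `decode_byte` is a hypothetical helper written for this example.

```python
def decode_byte(number, encoding):
    """Return the character for a single byte value, or None if it cannot be decoded."""
    try:
        return bytes([number]).decode(encoding)
    except UnicodeDecodeError:
        return None


# 128 is the euro sign in Windows-1252 but a control character in ISO-8859-1.
print(repr(decode_byte(128, "cp1252")))      # '€'
print(repr(decode_byte(128, "iso-8859-1")))  # '\x80'

# Byte values that Python's cp1252 codec leaves unassigned ("Not Used").
unused = [n for n in range(128, 256) if decode_byte(n, "cp1252") is None]
print(unused)  # [129, 141, 143, 144, 157]

# From 160 to 255 the two single-byte encodings agree.
same_upper = all(
    decode_byte(n, "cp1252") == decode_byte(n, "iso-8859-1")
    for n in range(160, 256)
)
print(same_upper)  # True

# A lone byte of 128 or more is not valid UTF-8 on its own.
print(decode_byte(233, "utf-8"))  # None
```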
HTML Charsets - FAQs
What is a charset in HTML?
Charset (character set) defines the encoding used to represent characters in an HTML document, ensuring text displays correctly.
How to specify the charset in an HTML document?
Use the <meta charset="UTF-8"> tag in the <head> section to specify the character encoding, with UTF-8 being the most common.
What is UTF-8?
UTF-8 (Unicode Transformation Format - 8-bit) is a character encoding that supports all characters in the Unicode standard, widely used for web content.
Why is UTF-8 recommended?
UTF-8 supports a vast range of characters, including special symbols and emojis, and is compatible with most languages, making it ideal for global web content.
How to specify a different charset?
Replace UTF-8 with the desired charset in the <meta> tag, e.g., <meta charset="ISO-8859-1"> for Latin-1 encoding.
What happens if I don’t specify a charset?
If no charset is specified, the browser may use a default or detected encoding, potentially leading to incorrect character display.