What Is UTF-8 - UTF-8 Character Encoding Tutorial
As of the mid-2020s, UTF-8 is the most widely used character encoding on the web.
To start using UTF-8, you will want to first familiarize yourself with the
basic ASCII character set.
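UTF-8 is backward compatible with ASCII: it represents every ASCII character,
printable or non-printable, with the same single byte value that ASCII uses.
In an HTML page, you declare UTF-8 with a meta charset tag in the head: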
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
</head>
</html>
With that out of the way, let me explain how UTF-8 works, and why it's such
a brilliant encoding scheme.
Code points in the ASCII range (0-127) are represented by a single byte.
Code points above that range are represented by a sequence of two, three, or
four bytes.
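To make the byte counts concrete, here is a minimal Python sketch that encodes one sample character from each length class with the built-in `str.encode` method. The helper name `utf8_length` and the sample characters are illustrative choices, not something from this tutorial.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes in the UTF-8 encoding of one character."""
    return len(ch.encode("utf-8"))


# One sample character per sequence length (an arbitrary selection).
samples = ["A", "é", "€", "😀"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # Show the code point, the byte count, and the byte values in decimal.
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {list(encoded)}")
```

The ASCII letter "A" comes out as the single byte 65, exactly as it would in plain ASCII, while the other characters need progressively longer sequences.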
The first byte of a UTF-8 sequence is called the "leader byte". The leader
byte tells you how many bytes are in the sequence, and it holds the
highest-order bits of the character's code point value.
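The leader byte for a single-byte sequence is always in the range (0-127).
The leader byte for a two-byte sequence is in the range (194-223). The
leader byte for a three-byte sequence is in the range (224-239). And the
leader byte for a four-byte sequence is in the range (240-244). Values from
245 upward never appear in valid UTF-8, because they would encode code points
beyond U+10FFFF, the highest code point in Unicode.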
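The leader byte ranges can be turned into a small lookup. The following is a sketch rather than a full validator, and the function name `sequence_length` is made up for illustration. It accepts four-byte leaders only up to 244, since higher values would encode code points past U+10FFFF, and it rejects 192 and 193, which could only produce overlong encodings of ASCII characters.

```python
def sequence_length(leader: int) -> int:
    """Return how many bytes a UTF-8 sequence has, judging by its leader byte."""
    if 0 <= leader <= 127:
        return 1  # plain ASCII
    if 194 <= leader <= 223:
        return 2
    if 224 <= leader <= 239:
        return 3
    if 240 <= leader <= 244:
        return 4
    # 128-191 are trailing bytes; 192, 193, and 245-255 are never valid.
    raise ValueError(f"{leader} is not a valid UTF-8 leader byte")


# Usage: inspect the first byte of an encoded character.
first_byte = "€".encode("utf-8")[0]
print(first_byte, sequence_length(first_byte))
```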
The remaining bytes in the sequence are called "trailing bytes." Whether the
sequence is two, three, or four bytes long, every trailing byte is in the
range (128-191). That range never overlaps with the leader byte ranges, so you
can always tell a trailing byte from a leader byte.
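Because trailing bytes always fall in 128-191 and leader bytes never do, you can count the characters in a UTF-8 byte string by counting only the bytes that are not trailing bytes. The sketch below assumes the input is already well-formed UTF-8, and the names `is_trailing_byte` and `count_characters` are hypothetical helpers chosen for this example.

```python
def is_trailing_byte(value: int) -> bool:
    """Return True if the byte value lies in the trailing byte range."""
    return 128 <= value <= 191


def count_characters(data: bytes) -> int:
    """Count code points in well-formed UTF-8 by counting leader bytes."""
    # Every character contributes exactly one byte that is not a trailing byte.
    return sum(1 for value in data if not is_trailing_byte(value))


# Usage: the count matches Python's own character count for the string.
text = "héllo €😀"
print(count_characters(text.encode("utf-8")), len(text))
```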
You can calculate the code point value of a character by looking at the
leader byte and the trailing bytes. For a single-byte sequence, the code
point value is equal to the value of the leader byte.
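For a two-byte sequence, the code point value is equal to ((leader byte -
192) * 64) + (trailing byte - 128). For example, é is encoded as the bytes
195 and 169, which gives ((195 - 192) * 64) + (169 - 128) = 233, or U+00E9.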
For a three-byte sequence, the code point value is equal to ((leader byte -
224) * 4096) + ((trailing byte1 - 128) * 64) + (trailing byte2 - 128).
For a four-byte sequence, the code point value is equal to ((leader byte -
240) * 262144) + ((trailing byte1 - 128) * 4096) + ((trailing byte2 - 128) * 64)
+ (trailing byte3 - 128).
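The arithmetic for all four sequence lengths can be sketched in Python as follows. The function name `decode_sequence` is an assumption for this example, and the code does no validation beyond checking the length, so it expects a single well-formed sequence. The two-byte case subtracts 192, the value of the marker bits that every two-byte leader carries, just as the three-byte and four-byte cases subtract 224 and 240.

```python
def decode_sequence(seq: bytes) -> int:
    """Compute the code point of one well-formed UTF-8 sequence."""
    leader = seq[0]
    # Strip the marker bits from each trailing byte, leaving six payload bits.
    trailing = [value - 128 for value in seq[1:]]

    if len(seq) == 1:
        return leader
    if len(seq) == 2:
        return (leader - 192) * 64 + trailing[0]
    if len(seq) == 3:
        return (leader - 224) * 4096 + trailing[0] * 64 + trailing[1]
    if len(seq) == 4:
        return ((leader - 240) * 262144 + trailing[0] * 4096
                + trailing[1] * 64 + trailing[2])
    raise ValueError("a UTF-8 sequence is between one and four bytes long")


# Usage: the result should agree with Python's built-in ord().
for ch in ["A", "é", "€", "😀"]:
    print(ch, decode_sequence(ch.encode("utf-8")), ord(ch))
```

Comparing against `ord` is a quick way to check the arithmetic on any character you like.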
I hope you've found this helpful. If you want to learn more about
programming and technology, try freeCodeCamp's core coding curriculum.
It's free.
freeCodeCamp.org © 2023