Chapter 1.04 ASCII and Unicode
In the early days of computing, programmers would combine groups (sequences) of
0s and 1s to represent different things. For example, they might decide that
00000000 could be used to represent an A and 00000001 could be used to
represent a B, and so on. The problem was that different programmers used their own
coding systems, so the same sequences meant different things to different people.
As a result of the confusion this caused, a standard was agreed upon for the
representation of all the keyboard characters, including the numbers and other
symbols. This standard is ASCII (the American Standard Code for Information
Interchange), which assigns each character a numeric code.
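For example, the short Python sketch below (the characters chosen are simply
illustrative) uses the built-in ord and chr functions, which expose these character
codes directly, to print the code and 8-bit pattern of a few characters:

    # Show the ASCII code and 8-bit binary pattern for a few characters.
    for ch in ('A', 'B', 'a', '0'):
        code = ord(ch)                        # e.g. ord('A') == 65
        print(ch, code, format(code, '08b'))  # e.g. A 65 01000001

    print(chr(66))  # the mapping is reversible: code 66 is the character 'B'

ASCII on its own, however, has a number of limitations, which led to the development
of Unicode: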
● 256 characters are not sufficient to represent all of the possible characters, numbers and
symbols.
● It was initially developed for English and therefore cannot represent all of the other
languages and scripts in the world.
● Widespread use of the web made it more important to have a universal international coding
system.
● The range of platforms and programs has increased dramatically with more developers from
around the world using a much wider range of characters.
Unicode also includes international characters for over 20 countries and even provides
encodings of classical and ancient scripts.
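To make this concrete, the short Python sketch below (the characters chosen are
simply illustrative) uses the standard unicodedata module to print the code point and
official Unicode name of characters from several scripts; note that every code point
shown is far beyond the 256 values an 8-bit code can hold:

    import unicodedata

    # Code point and official Unicode name for characters from several scripts.
    for ch in ('Ω', 'я', '中', '€'):
        print(ch, ord(ch), unicodedata.name(ch))
    # Ω 937    GREEK CAPITAL LETTER OMEGA
    # я 1103   CYRILLIC SMALL LETTER YA
    # 中 20013 CJK UNIFIED IDEOGRAPH-4E2D
    # € 8364   EURO SIGN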
To represent these extra characters it is obviously necessary to use more than 8 bits per
character, and there are two common encodings of Unicode in use today: UTF-8 and UTF-16.
As the name suggests, UTF-16 is based on 16-bit units; both encodings are variable-width,
with UTF-8 using between one and four 8-bit bytes per character and UTF-16 using one or
two 16-bit units.
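The difference is easy to observe in Python: str.encode returns the raw bytes of a
string under a given encoding, so the number of bytes each character occupies in UTF-8
and UTF-16 can be compared directly (the characters are again illustrative):

    # Compare how many bytes each character needs in UTF-8 and UTF-16.
    for ch in ('A', 'é', '€', '𝄞'):
        utf8 = ch.encode('utf-8')
        utf16 = ch.encode('utf-16-be')  # big-endian, without a byte-order mark
        print(ch, len(utf8), len(utf16))
    # A 1 2   (ASCII characters keep their single byte in UTF-8)
    # é 2 2
    # € 3 2
    # 𝄞 4 4   (beyond the 16-bit range, so UTF-16 needs two units)

Note that UTF-8 stores plain ASCII text in exactly one byte per character, making it
backwards compatible with ASCII; this is a major reason for its widespread use on the web.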