What is Internal Storage Encoding of Characters (ISCII)?
Last Updated: 17 Jul, 2020
We all know that computers do not store letters, numbers, and pictures directly. They convert them into small pieces called bits, each of which takes one of two values, 0 or 1. To store each letter or number correctly, we need rules for representing them; these rules make up an encoding scheme. We will look at the three most popular character encoding schemes:
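The idea that every character is stored as a pattern of bits can be seen directly in Python: `ord()` gives a character's code number, and `format(..., '08b')` renders that number as eight binary digits.

```python
# A character is stored as a pattern of bits. ord() gives the code
# number of a character; format(..., '08b') shows it as 8 binary digits.
for ch in "Hi!":
    code = ord(ch)
    bits = format(code, "08b")
    print(ch, code, bits)
# H 72 01001000
# i 105 01101001
# ! 33 00100001
```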
ASCII
ASCII stands for American Standard Code for Information Interchange. ASCII was introduced in 1963 by the American Standards Association (ASA). It is broadly classified into two sub-categories:
- Standard ASCII: Standard ASCII covers the first 128 characters, codes 0 to 127. It comprises non-printable ASCII and lower ASCII. Non-printable ASCII, codes 0 to 31, contains control characters that cannot be printed on the screen and serve as various system codes. Lower ASCII covers the remaining range, codes 32 to 127, and contains the alphabet, digits, and special symbols.
- Extended ASCII: Extended ASCII was proposed because Standard ASCII, while sufficient for English, could not represent the characters of many other languages. Extended ASCII addresses this by adding 128 more characters, bringing the total number of ASCII characters to 256.
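The split described above is easy to verify in Python: every Standard ASCII code fits in 7 bits (0 to 127), leaving the top bit of a byte free for the extra 128 Extended ASCII codes.

```python
# Every Standard ASCII code fits in 7 bits, i.e. is below 128.
print(ord('A'))          # 65
print(chr(65 + 32))      # 'a' -- lowercase letters sit 32 codes higher
assert all(ord(c) < 128 for c in "Hello, World!")  # all Standard ASCII
```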
ISCII
ISCII stands for the Indian Script Code for Information Interchange. It was standardized by the Bureau of Indian Standards (BIS) in 1991. It is an 8-bit standard in which the first 128 characters, codes 0 to 127, are the same as Standard ASCII, while the next 128 characters represent characters of Indian scripts. The encoding covers the scripts of the most widely spoken Indian languages, including Devanagari, Gujarati, Bengali, Oriya, Punjabi, Assamese, Kannada, Telugu, Malayalam, and Tamil.
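Python's standard library does not ship an ISCII codec, but the defining property of the layout, that byte values below 128 in an 8-bit ISCII stream are plain ASCII while values 128 and above carry script characters, can be sketched as below. The high bytes in the sample are illustrative placeholders, not real ISCII text.

```python
def split_iscii_halves(data: bytes):
    """Partition an 8-bit byte stream into its ASCII half (< 0x80)
    and its script half (>= 0x80), following the ISCII layout."""
    ascii_part = bytes(b for b in data if b < 0x80)
    script_part = bytes(b for b in data if b >= 0x80)
    return ascii_part.decode("ascii"), script_part

# Illustrative stream: ASCII text mixed with high bytes that would
# carry Indian-script characters in a real ISCII document.
text, script = split_iscii_halves(b"Year 1991 \xa4\xb3")
print(text)    # 'Year 1991 '
print(script)  # b'\xa4\xb3'
```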
Unicode
Even with ASCII and its extensions, character encoding remained limited and could not cover all the languages of the world, so a new encoding scheme was needed. The Unicode Consortium, a non-profit organization, designed and developed Unicode, first released in 1991. Initially only about 50,000 characters were present, but today Unicode covers more than 128,000 characters.
Types of Unicode encoding:
- UTF-8: A variable-width encoding that uses 1 to 4 bytes (units of 8 bits) per character. It is the standard encoding on the web and in most software, including email over the internet.
- UTF-16: A variable-width encoding that uses 2 or 4 bytes (units of 16 bits) per character.
- UTF-32: A fixed-width encoding that uses 4 bytes, i.e. 32 bits, per character.
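The size differences between the three UTF forms show up when encoding the same characters in Python. The `-le` (little-endian) codec variants are used so that no byte-order mark pads the byte counts.

```python
# UTF-8 is variable-width: an ASCII character takes 1 byte, while a
# Devanagari character (U+0928) takes 3. UTF-16 uses 2-byte units and
# UTF-32 always uses 4 bytes per character.
for ch in ("A", "\u0928"):          # 'A' and Devanagari letter NA
    print(ch,
          len(ch.encode("utf-8")),
          len(ch.encode("utf-16-le")),
          len(ch.encode("utf-32-le")))
# A 1 2 4
# न 3 2 4
```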
Why do we need Unicode?
- Unicode allows us to design a single application for many different platforms and languages; we do not need to rebuild the same application to launch it in another language.
- This leads to reduced application development costs.
- It prevents data corruption.
- It acts as a single encoding schema across all languages and platforms.
- It can be considered a superset of all other encoding schemes, so text in any of those encodings can be converted to Unicode and back.
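The superset property in the last point is exactly how conversion between encodings works in practice: decode the bytes into a Unicode string, then encode that string into the target encoding. A small sketch using Latin-1 as the legacy encoding:

```python
# Converting between encodings goes through a Unicode string:
# bytes --decode--> str (Unicode) --encode--> bytes.
latin = "caf\u00e9".encode("latin-1")   # legacy single-byte bytes: b'caf\xe9'
text = latin.decode("latin-1")          # back to a Unicode string
utf8 = text.encode("utf-8")             # re-encode as UTF-8
print(utf8)                             # b'caf\xc3\xa9'
assert utf8.decode("utf-8") == "caf\u00e9"
```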