Encoding Schemes

Encoding schemes are used to convert data into binary format so computers can process it. The document discusses several encoding schemes including ASCII, ISCII, and Unicode. ASCII uses 7 bits for English characters, while ISCII is an 8-bit encoding that supports Indian languages. Unicode provides a unique code for all characters of every language to allow worldwide interchange of written text. It discusses UTF-8, UTF-16, and UTF-32 as different Unicode encoding formats.

Uploaded by

satish kumar

Encoding Schemes

• Encoding is a method of converting data from one format to another. An encoding
scheme standardizes the encoding of character sets by defining a set of rules for
representing character data.
• Character encoding is a method of converting text input to binary values. Each encoding
scheme consists of a number of code pages that adhere to its rules. A computer can handle
a variety of numeric and non-numeric characters. Characters are represented using ASCII,
ISCII, and Unicode.
• When we have text (a series of characters) that we wish to store inside a
computer or transmit over a digital network, we must convert it to a binary
representation, since that is the only form that a binary-based computer can
process.
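The conversion described above can be sketched in Python. This is only an illustration: the helper names `text_to_bits` and `bits_to_text` are made up for this example, and UTF-8 is assumed as the encoding.

```python
# Hypothetical helpers (not from the notes): text -> bits -> text.
def text_to_bits(text):
    """Return one 8-bit binary string per byte of the UTF-8 encoded text."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


def bits_to_text(bits):
    """Reverse the conversion: binary strings -> bytes -> text."""
    return bytes(int(b, 2) for b in bits).decode("utf-8")


bits = text_to_bits("Hi")
print(bits)                # ['01001000', '01101001']
print(bits_to_text(bits))  # Hi
```

Each character becomes a pattern of 0s and 1s, and the same rules applied in reverse recover the original text.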

ASCII (American Standard Code for Information Interchange)

1) It is one of the most widely used alphanumeric codes, used in computers to translate text (letters,
numbers, symbols) into a form that a computer can understand.
2) Work on it began in 1960, and it was first published in 1963 by the American Standards
Association (ASA), which later became the American National Standards Institute (ANSI).
3) The standard ASCII character set uses just 7 bits for each character, so it has 2^7 = 128 possible
code groups. It represents all of the standard keyboard characters as well as control functions.
4) The characters encoded are the digits 0 to 9, lowercase letters a to z, uppercase letters A to Z,
basic punctuation symbols and control codes.
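The 7-bit range can be checked with a short Python sketch. The helper name `ascii_code` is an assumption made for this example; `ord` and `chr` are Python built-ins.

```python
# Hypothetical helper: return the ASCII code of a character,
# rejecting anything outside the 7-bit range 0..127.
def ascii_code(ch):
    code = ord(ch)
    if code >= 2 ** 7:
        raise ValueError(f"{ch!r} is not an ASCII character")
    return code


print(2 ** 7)                                             # 128
print(ascii_code("A"), ascii_code("a"), ascii_code("0"))  # 65 97 48
print(format(ascii_code("A"), "07b"))                     # 1000001
print(chr(65))                                            # A
```

The letter A is stored as 65, which fits in 7 bits as 1000001.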
ISCII (Indian Script Code for Information Interchange)

Indian Script Code for Information Interchange (ISCII) is a coding scheme for
representing various writing systems of India. It encodes the main Indic scripts and a
Roman transliteration.

1) In 1991, the Bureau of Indian Standards adopted ISCII.
2) It is an 8-bit code which allows English and Indian script alphabets to be used
simultaneously.
3) Being an 8-bit code, it can represent 2^8 = 256 characters.
4) ISCII retains all ASCII characters and also offers coding for Indian scripts.
5) When ISCII was adopted, 15 official languages were recognised in India: Hindi, Marathi,
Sanskrit, Punjabi, Gujarati, Oriya, Bengali, Assamese, Telugu, Kannada, Malayalam, Tamil, Urdu,
Sindhi and Kashmiri.

UNICODE (Universal Coding Standard)

Unicode is a universal coding standard adopted by all modern platforms. It is promoted by
the Unicode Consortium, which is a non-profit organisation. Unicode provides a unique number
for every character, irrespective of the platform, program and language. It is a character
coding system designed to support the worldwide interchange, processing and display of
written texts in diverse languages.
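As a small illustration, Python's built-in `ord` returns the Unicode number of a character. The helper name `code_numbers` and the sample characters are chosen only for this example.

```python
# Hypothetical helper: list the Unicode number of every character in a string.
def code_numbers(text):
    return [ord(ch) for ch in text]


# Latin letter A, Devanagari letter A, euro sign (written as escapes).
sample = "A\u0905\u20ac"
print(code_numbers(sample))  # [65, 2309, 8364]
```

The numbers are the same whichever platform or program runs the code.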

Significance of the UNICODE

Unicode provides a unique code for all characters in all languages of the world.
Unicode supports more than 100,000 characters from languages around the world. Its encoding
formats use a maximum of 4 bytes to represent each character.
With the help of Unicode, text can easily be typed in all languages of the world on all computer
platforms, programs and the internet. Unicode includes a Devanagari block for typing
Hindi and other Indian languages written in that script.
With the help of Unicode, we can type text in multiple languages in the same document, and
the text remains the same when opened on any computer or mobile phone.
Text in any language typed in Unicode can be read all over the world without
being corrupted and does not require a special encoding-specific font. It can easily be translated into most
languages of the world.

Encoding Schemes in UNICODE (UTF-8, UTF-16, UTF-32)

UTF stands for Unicode Transformation Format.

In Unicode, a character maps to a code point. Code points are the numbers assigned by the
Unicode Consortium to every character in every writing system. Code points are written as
U+ followed by four to six hexadecimal digits (for example, U+0041 for A). This ensures that there are
no collisions between the alphabets of different languages. These numbers are platform independent.
Unicode allows all the characters used in every language to be encoded under one consistent character set.
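The U+ notation can be reproduced with a short Python sketch. The helper name `code_point_label` is an assumption for this example. The digits after U+ are hexadecimal, and code points above U+FFFF need more than four of them.

```python
# Hypothetical helper: format a character's code point in U+ notation,
# padded to at least four hexadecimal digits.
def code_point_label(ch):
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))           # U+0041
print(code_point_label("\u0905"))      # U+0905 (Devanagari letter A)
print(code_point_label("\U0001F600"))  # U+1F600 (an emoji, five hex digits)
```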
UTF-8:- UTF-8 encoding is a variable-size encoding scheme for representing Unicode code points in
memory. Variable-sized encoding means that code points are represented using 1, 2, 3 or 4 bytes,
depending on their value.
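The byte count follows from the numeric value of the code point. The sketch below uses the standard UTF-8 range boundaries. The helper name `utf8_length` is made up for this example, and for simplicity it does not reject the surrogate range. Each result is checked against Python's own UTF-8 encoder.

```python
# Hypothetical helper: number of bytes UTF-8 needs for a given code point.
def utf8_length(code_point):
    if code_point <= 0x7F:        # ASCII range
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    if code_point <= 0x10FFFF:    # highest Unicode code point
        return 4
    raise ValueError("not a Unicode code point")


# A, e with acute accent, Devanagari letter A, an emoji.
for ch in ["A", "\u00e9", "\u0905", "\U0001F600"]:
    # Cross-check against Python's built-in encoder.
    assert utf8_length(ord(ch)) == len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}", utf8_length(ord(ch)))
```

The loop prints 1, 2, 3 and 4 bytes for the four sample characters.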

UTF-16:- UTF-16 encoding is a variable-length encoding scheme which uses either 2 bytes or 4
bytes to represent Unicode code points. Most of the characters for all modern languages are
represented using 2 bytes.
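A minimal sketch of how the 2-byte and 4-byte cases arise is given below. The helper name `utf16_units` is an assumption for this example, and it does not validate its input. Code points below U+10000 fit in one 16-bit unit, while larger ones are split into a pair of units called surrogates.

```python
# Hypothetical helper: the 16-bit code units UTF-16 uses for a code point.
def utf16_units(code_point):
    if code_point < 0x10000:
        return [code_point]                # one unit = 2 bytes
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return [high, low]                     # two units = 4 bytes


print([hex(u) for u in utf16_units(0x0905)])   # ['0x905']
print([hex(u) for u in utf16_units(0x1F600)])  # ['0xd83d', '0xde00']
print("\U0001F600".encode("utf-16-be").hex())  # d83dde00
```

The last line shows that Python's built-in UTF-16 encoder produces the same two units.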

UTF-32:- UTF-32 encoding is a fixed-length encoding scheme that uses 4 bytes to represent every
code point.

UTF-8 and UTF-16 are both variable-length schemes, while UTF-32 is a fixed-length scheme.
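The three formats can be compared side by side with Python's built-in codecs. The helper name `encoded_sizes` is made up for this example. The "-le" codec variants are assumed so that no byte order mark is added to the count.

```python
# Hypothetical helper: byte counts of one character in the three UTF formats.
def encoded_sizes(ch):
    return {
        "UTF-8": len(ch.encode("utf-8")),
        "UTF-16": len(ch.encode("utf-16-le")),
        "UTF-32": len(ch.encode("utf-32-le")),
    }


# A, Devanagari letter A, an emoji.
for ch in ["A", "\u0905", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

For these samples UTF-8 uses 1, 3 and 4 bytes and UTF-16 uses 2, 2 and 4 bytes, while UTF-32 always uses 4.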

Questions

1. A computer understands only

a) C++
b) Binary language
c) Assembly Language

2. ________ has only two digits, 0 and 1.

a) Machine language
b) High level language
c) Assembly language

3. When a key on an English keyboard is pressed, it is internally mapped to a
___________________

a) Unique Octal code
b) Unique Decimal code
c) Unique Hexadecimal code

4. When the key A is pressed, it is internally mapped to

a) 74
b) 65
c) 67

5. The mechanism of converting data into an equivalent cipher using a specific code is
called
a) Coding
b) Decoding
c) Encoding

6. The letter ‘a’ is internally mapped to the code value ‘97’. Is it the same on all keyboards?
a) True
b) False
c) Converted to some other value.

7. _________ means to convert data into a coded form to hide and conceal it from others.
a) Encryption
b) Decryption
c) Coding

8. A process which converts encrypted data back to its original content is called
___________

a) Programming
b) Encryption
c) Decryption
9. ASCII is able to encode the character set of the ______ language only.

a) English
b) Punjabi
c) Hindi
10. ISCII is an _____ code representation for Indian languages.

a) 16 bit
b) 15 bit
c) 8 bit

11. __________ has been developed to incorporate all characters of every written
language of the world.

a) ASCII
b) UNICODE
c) ISCII

12. _________ represents a single encoding scheme for all languages and characters.
a) ASCII
b) UNICODE
c) ISCII

13. The two types of ASCII are ________ and _________

a) ASCII 7 and ASCII 8
b) ASCII 14 and ASCII 8
c) ASCII 6 and ASCII 7

14. Any set of digits or alphabets is generally referred to as _______

a) Characters
b) Symbols
c) Bits
