Transforming Data Into Information: Syed Mohsin Ali Sheerazi

The document discusses how computers represent and organize data using binary numbering systems, with basic units like bits and bytes that are grouped to represent different data types. It covers floating point representation which allows for decimal places by using a sign, exponent, and significand field in the computer representation. Error detection and correction methods are also mentioned along with coding schemes like ASCII that are used to form meaningful bytes of data.


Transforming Data into

Information
Lecture 03
Syed Mohsin Ali Sheerazi
Outline
• Data Organization
• Numbering Systems
• Floating Point Representation
• BCD Representation
• Characters
• Error Detection & Correction
• Representing Colors
• Audio
Data Organization

Computers use the binary number system to store information as 0s and 1s.

Bits
– A bit is the fundamental unit of computer storage
– A bit can be 0 (off) or 1 (on)
– Related bits are grouped to represent different types of information such as numbers, characters, pictures, sound, and instructions
• Nibbles
– A nibble is a group of 4 bits
– A nibble is used to represent a digit in Hex (from 0-15) and BCD (from 0-9) numbers

Binary  BCD  Hex
0000    0    0
0001    1    1
0010    2    2
0011    3    3
0100    4    4
0101    5    5
0110    6    6
0111    7    7
1000    8    8
1001    9    9
1010    –    A
1011    –    B
1100    –    C
1101    –    D
1110    –    E
1111    –    F
Bytes

– A byte is a group of 8 bits that is used to represent numbers and characters
– A standard code for representing numbers and characters is ASCII
Byte Size

Bytes
– How many different combinations of 0’s and 1’s
with 8 bits can be formed?
– In general, how many different combinations of 0’s
and 1’s with N bits can be formed?
– How many different characters can be represented
with a byte (8 bits)?
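The questions above have direct answers: N bits form 2^N distinct patterns, so a byte allows 2^8 = 256 combinations, and hence 256 different characters. A quick check, sketched here in Python (the slides themselves contain no code):

```python
# N bits form 2**N distinct patterns
print(2 ** 8)  # 256 combinations with one byte

# Enumerating every 8-bit string confirms the count
patterns = {format(i, "08b") for i in range(2 ** 8)}
print(len(patterns))  # 256
```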
Words

– A word is a group of 16 bits or 2 bytes
– UNICODE is an international standard code for representing characters, including non-Latin characters like Asian, Greek, etc.
Double Words

Double Words
– A double word is a group of 32 bits or 4 bytes or 2
words
Related Bytes

– A nibble is a half-byte (4-bit) item -- hex representation
– A word is a 2-byte (16-bit) data item
– A doubleword is a 4-byte (32-bit) data item
– A quadword is an 8-byte (64-bit) data item
– A paragraph is a 16-byte (128-bit) area
– A kilobyte (KB) is 2^10 = 1,024 bytes (≈ 1,000 bytes)
– A megabyte (MB) is 2^20 = 1,048,576 bytes (≈ 1 million bytes)
– A gigabyte (GB) is 2^30 = 1,073,741,824 bytes (≈ 1 billion bytes)
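The power-of-two unit sizes listed above can be computed directly; a minimal sketch in Python (language chosen for illustration, since the slides contain no code):

```python
# Power-of-two storage units from the list above
KB = 2 ** 10   # kilobyte: 1,024 bytes
MB = 2 ** 20   # megabyte: 1,048,576 bytes
GB = 2 ** 30   # gigabyte: 1,073,741,824 bytes
print(KB, MB, GB)  # 1024 1048576 1073741824
```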
How is data processed into information?

[Diagram: Input → Process → Output, turning Data into Information. Library example: data items (Author, Title, Imprint, Subject, Call No., Accession No.) pass through the Cataloging and Filing Processes to produce a Catalog Record stored in the Card Catalog.]

The Information Processing Cycle

[Diagram: New Document → Data → Input → Process → Output, with Stored Data held in Storage and fed back into the cycle.]
What coding schemes are used to
form meaningful bytes of data?
• The coding schemes ASCII (pronounced
"AS-key") and ASCII-8, or extended ASCII,
have been adopted as standards by the US
government and by computer
manufacturers.
• ASCII has 128 combinations of 7 bits
each, while ASCII-8 has as many as
256 combinations.
Numbering Systems

• Unsigned number system
• Signed binary systems
– Sign-and-magnitude system
– 1's complement system
– 2's complement system
• Hexadecimal system
Binary Number System

• base 10 -- has ten digits: 0,1,2,3,4,5,6,7,8,9
– positional notation
2401 = 2 × 10^3 + 4 × 10^2 + 0 × 10^1 + 1 × 10^0
• base 2 -- has two digits: 0 and 1
– positional notation
1101₂ = 1 × 2^3 + 1 × 2^2 + 0 × 2^1 + 1 × 2^0
      = 8 + 4 + 0 + 1 = 13
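The positional expansion above can be sketched as a short function (Python assumed for illustration; `from_binary` is a hypothetical helper name, not from the slides):

```python
def from_binary(bits: str) -> int:
    """Expand a binary string positionally: sum of digit * 2**position."""
    return sum(int(b) * 2 ** i for i, b in enumerate(reversed(bits)))

print(from_binary("1101"))  # 13, matching the worked example
print(from_binary("1101") == int("1101", 2))  # True: agrees with Python's built-in
```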
Binary Code

Used to represent unsigned integers (natural numbers)
0 0000 8 1000
1 0001 9 1001
2 0010 10 1010
3 0011 11 1011
4 0100 12 1100
5 0101 13 1101
6 0110 14 1110
7 0111 15 1111
Decimal to Binary Conversion

• The binary numbering system is the most


important radix system for digital computers.
• However, it is difficult to read long strings of
binary numbers-- and even a modestly-sized
decimal number becomes a very long binary
number.
– For example: 11010100011011₂ = 13595₁₀
• For compactness and ease of reading, binary
values are usually expressed using the
hexadecimal, or base-16, numbering system.
Unsigned Conversion

• Convert an unsigned binary number to decimal
– use positional notation (polynomial expansion)
• Convert a decimal number to unsigned binary
– use successive division by 2
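Successive division by 2 can be sketched as follows (Python assumed; `to_binary` is an illustrative helper name, not from the slides):

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by successive division by 2."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder gives the next (least significant) bit
        n //= 2
    return "".join(reversed(digits))  # remainders are produced low bit first

print(to_binary(26))  # '11010'
```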
Examples

• Represent 26₁₀ in unsigned binary code
26₁₀ = 11010₂
• Represent 26₁₀ in unsigned binary code using 8 bits
26₁₀ = 00011010₂
• Represent 26₁₀ in unsigned binary code using 4 bits -- not possible
Signed Binary Codes

These are codes used to represent positive and negative numbers:

• Sign-Magnitude System
• 1's Complement System
• 2's Complement System
Sign and Magnitude

• The most significant (leftmost) bit represents the sign bit
– 0 is positive
– 1 is negative
• The remaining bits represent the magnitude
Examples of Sign & Magnitude

Decimal   5-bit Sign and Magnitude
+5        00101
-5        10101
+13       01101
-13       11101

Sign and Magnitude in 4 bits
0 0000    -0 1000
1 0001    -1 1001
2 0010    -2 1010
3 0011    -3 1011
4 0100    -4 1100
5 0101    -5 1101
6 0110    -6 1110
7 0111    -7 1111
Examples

Decimal   S-M       8-bit S-M
26₁₀      011010    00011010
-26₁₀     111010    10011010


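The sign-and-magnitude encoding can be sketched as a small helper (Python assumed; the function name is illustrative):

```python
def sign_magnitude(n: int, bits: int) -> str:
    """Encode n in sign-and-magnitude: a sign bit, then the magnitude in bits-1 bits."""
    sign = "1" if n < 0 else "0"
    return sign + format(abs(n), f"0{bits - 1}b")

print(sign_magnitude(26, 8))   # '00011010'
print(sign_magnitude(-26, 8))  # '10011010'
```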
1’s Complement System

• Positive numbers:
– same as in unsigned binary system
– pad a 0 at the leftmost bit position
• Negative numbers:
– convert the magnitude to unsigned binary system
– pad a 0 at the leftmost bit position
– complement every bit
Examples of 1's Complement

Decimal   5-bit 1's complement
5         00101
-5        11010
13        01101
-13       10010
2’s Complement System

• Positive numbers:
– same as in unsigned binary system
– pad a 0 at the leftmost bit position
• Negative numbers:
– convert the magnitude to unsigned binary system
– pad a 0 at the leftmost bit position
– complement every bit
– add 1 to the complement number
Examples of 2's Complement

Decimal   5-bit 2's complement
5         00101
-5        11011
13        01101
-13       10011
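The negative-number procedure above (complement every bit, then add 1) is equivalent to masking the number to the desired bit width; a sketch in Python (an assumption, since the slides give no code):

```python
def twos_complement(n: int, bits: int) -> str:
    """Encode n as a bits-wide two's complement bit string."""
    mask = (1 << bits) - 1      # e.g. 0b11111 for 5 bits
    return format(n & mask, f"0{bits}b")  # masking wraps negatives into 2's complement

print(twos_complement(5, 5))    # '00101'
print(twos_complement(-5, 5))   # '11011'
print(twos_complement(-13, 5))  # '10011'
```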
Hexadecimal Notation

• base 16 -- has 16 digits:


0 1 2 3 4 5 6 7 8 9 A B C D E F
• each Hex digit represents a group of 4 bits
(i.e. half of a byte or a nibble) 0000 to
1111
• used as a shorthand notation for long
sequences of binary bits.
Binary  Hex    Binary  Hex
0000    0      1000    8
0001    1      1001    9
0010    2      1010    A
0011    3      1011    B
0100    4      1100    C
0101    5      1101    D
0110    6      1110    E
0111    7      1111    F
Convert Binary to Hex

Binary                    Hex
1111 0110b                F6h
1001 1101 0000 1010b      9D0Ah
1111 0110 1110 0111b      F6E7h
1011011b                  5Bh
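Nibble-by-nibble conversion can be sketched as follows (Python assumed; `bin_to_hex` is an illustrative helper name):

```python
def bin_to_hex(bits: str) -> str:
    """Convert a binary string (spaces allowed as nibble separators) to hex."""
    return format(int(bits.replace(" ", ""), 2), "X")  # each hex digit covers one nibble

print(bin_to_hex("1111 0110"))            # 'F6'
print(bin_to_hex("1001 1101 0000 1010"))  # '9D0A'
print(bin_to_hex("1011011"))              # '5B'
```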
Addition and Subtraction in Sign and
Magnitude
(a) 5 0101
+2 +0010
7 0111

(b) -5 1101
-2 +1010
-7 1111

(c) 5 0101
-2 +1010
3 0011

(d) -5 1101
+2 +0010
-3 1011
2.5 Floating-Point Representation

Floating-Point Representation

• Floating-point numbers allow an arbitrary number of decimal places to the right of the decimal point.
– For example: 0.5 × 0.25 = 0.125
• They are often expressed in scientific notation.
– For example:
0.125 = 1.25 × 10^-1
5,000,000 = 5.0 × 10^6
Floating-Point Representation

• Computers use a form of scientific notation for floating-point representation.
• Numbers written in scientific notation have three components: a sign, a significand (mantissa), and an exponent.
Floating-Point Representation

• Computer representation of a floating-point number consists of three fixed-size fields: the sign, the exponent, and the significand.
• The standard arrangement places the sign bit first, followed by the exponent field and then the significand.
Floating-Point Representation

• The one-bit sign field is the sign of the stored value.
• The size of the exponent field determines the range of values that can be represented.
• The size of the significand determines the precision of the representation.
Floating-Point Representation

• Floating-point errors can be reduced when we use operands that are similar in magnitude.
• If we were repetitively adding 0.5 to 128.5, it would have been better to iteratively add 0.5 to itself and then add 128.5 to this sum.
• In this example, the error was caused by loss of the low-order bit.
• Loss of the high-order bit is more problematic.
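Loss of the low-order bit can be demonstrated with ordinary 64-bit doubles (a Python sketch; the magnitudes differ from the slide's 0.5/128.5 example, which double precision happens to handle exactly):

```python
# Adding a small operand to a much larger one can discard the small one entirely.
big = 1e16
small = 1.0
print(big + small == big)  # True: 'small' falls below the last stored bit of 'big'

# Grouping operands of similar magnitude first preserves their contribution:
print((small + small) + big == big + 2.0)  # True: 2.0 is representable at this scale
```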
Floating-Point Representation

• Floating-point overflow and underflow can


cause programs to crash.
• Overflow occurs when there is no room to
store the high-order bits resulting from a
calculation.
• Underflow occurs when a value is too small to
store, possibly resulting in division by zero.
Experienced programmers know that it’s better for a
program to crash than to have it produce incorrect, but
plausible, results.
Character Representations

• BCD (Binary Coded Decimal) & EBCDIC


(Extended Binary Coded Decimal
Interchange Code)
• ASCII (American Standard Code For
Information Interchange)
• UNICODE
Character Codes
• Calculations aren’t useful until their results
can be displayed in a manner that is
meaningful to people.
• We also need to store the results of
calculations, and provide a means for data
input.
• Thus, human-understandable characters
must be converted to computer-
understandable bit patterns using some sort
of character encoding scheme.
Character Codes

• As computers have evolved, character


codes have evolved.
• Larger computer memories and storage
devices permit richer character codes.
• The earliest computer coding systems used
six bits.
• Binary-coded decimal (BCD) was one of
these early codes. It was used by IBM
mainframes in the 1950s and 1960s.
Character Codes

• In 1964, BCD was extended to an 8-bit code,


Extended Binary-Coded Decimal Interchange
Code (EBCDIC).
• EBCDIC was one of the first widely-used
computer codes that supported upper and
lowercase alphabetic characters, in addition
to special characters, such as punctuation
and control characters.
• EBCDIC and BCD are still in use by IBM
mainframes today.
Character Codes

• Other computer manufacturers chose the 7-


bit ASCII (American Standard Code for
Information Interchange) as a replacement
for 6-bit codes.
• While BCD and EBCDIC were based upon
punched card codes, ASCII was based upon
telecommunications (Telex) codes.
• Until recently, ASCII was the dominant
character code outside the IBM mainframe
world.
Character Codes

• ASCII
– Used to represent characters and control information
– Each character is represented with 1 byte
• upper and lower case letters: a...z and A...Z
• decimal digits: 0, 1, …, 9
• punctuation characters: ; , . :
• special characters: $ & @ / {
• control characters: carriage return (CR), line feed (LF), beep
Examples of ASCII Code

Bit position:          76543210
Bit contents ('S'):    01010011
'S' = 83 (decimal), 53 (hex)

Bit position:          76543210
Bit contents ('8'):    00111000
'8' = 56 (decimal), 38 (hex)
ASCII Code in Binary and Hex

Character Binary Hex


A 0100 0001 41
D 0100 0100 44
a 0110 0001 61
? 0011 1111 3F
2 0011 0010 32
DEL 0111 1111 7F
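The table above can be cross-checked with Python's built-in `ord` (a sketch; the slides themselves use no code):

```python
# Character -> ASCII code in binary and hex, matching the table above
for ch in ["A", "D", "a", "?", "2"]:
    code = ord(ch)
    print(ch, format(code, "08b"), format(code, "X"))

# 'S' from the earlier example:
print(ord("S"), format(ord("S"), "X"))  # 83 53
```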
Character Codes

• Many of today’s systems embrace Unicode, a


16-bit system that can encode the characters
of every language in the world.
– The Java programming language, and some operating
systems now use Unicode as their default character code.
• The Unicode codespace is divided into six
parts. The first part is for Western alphabet
codes, including English, Greek, and Russian.
Character Codes

• The Unicode codespace allocation is shown at the right. [Figure: Unicode codespace allocation chart]
• The lowest-numbered Unicode characters comprise the ASCII code.
• The highest provide for user-defined codes.
Error Detection and Correction

• Check digits, appended to the end of a long number, can provide some protection against data input errors.
– The last character of a UPC barcode or ISBN is a check digit.
• Longer data streams require more economical and sophisticated error detection mechanisms.
• Cyclic redundancy checking (CRC) codes provide error detection for large blocks of data.
• Hamming codes and Reed-Solomon codes are two important error-correcting codes.
• Reed-Solomon codes are particularly useful in correcting burst errors that occur when a series of adjacent bits is damaged.
– Because CD-ROMs are easily scratched, they employ a type of Reed-Solomon error correction.
• Because the mathematics of Hamming codes is much simpler than that of Reed-Solomon codes, we discuss Hamming codes in detail.
• Checksums and CRCs are examples of systematic error detection.
• In systematic error detection, a group of error control bits is appended to the end of the block of transmitted data.
– This group of bits is called a syndrome.
• CRCs are polynomials over the modulo 2 arithmetic field.
The mathematical theory behind modulo 2 polynomials is beyond our scope. However, we can easily work with it without knowing its theoretical underpinnings.
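A single even-parity bit is the simplest instance of the systematic error detection described above: one control bit is appended so that the total number of 1s is even. A minimal sketch in Python (assumed for illustration; the slides do not give this code):

```python
def add_parity(bits: str) -> str:
    """Append an even-parity bit so the codeword has an even number of 1s."""
    return bits + str(bits.count("1") % 2)

def check_parity(codeword: str) -> bool:
    """Return True if no error is detected (1s count is still even)."""
    return codeword.count("1") % 2 == 0

word = add_parity("1011001")   # '10110010'
print(check_parity(word))      # True: transmitted codeword checks out

# Flip one bit to simulate a transmission error:
flipped = word[:3] + ("0" if word[3] == "1" else "1") + word[4:]
print(check_parity(flipped))   # False: the single-bit error is detected
```

Note that a lone parity bit detects any odd number of flipped bits but cannot locate them; that is what the Hamming codes mentioned above add.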
What are the advantages of using
computers for data processing?
• Faster data input, processing and
retrieval
• Tireless--can work 24 hours a day, 7
days a week
• Less prone to error
• Produce output requirements easily
• Could send and retrieve data from
other computers if in a network
