Data Representation What You Need to Know
Number representation
Any form of data needs to be converted to binary to be processed by a computer.
Data is processed using logic gates and stored in registers.
Convert between
A, positive denary and positive binary
Example: Convert 01001000 to denary and 147 to binary
1) 01001000 to denary:
   128  64  32  16   8   4   2   1
     0   1   0   0   1   0   0   0
   = 64 + 8
   = 72
2) 147 to binary:
   128  64  32  16   8   4   2   1
     1   0   0   1   0   0   1   1
   147 = 128 + 16 + 2 + 1
   = 10010011
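The place-value method in this example can be sketched in Python. This is a minimal illustration for 8-bit values only; the names PLACE_VALUES, binary_to_denary and denary_to_binary are made up for this sketch.

```python
# Place values of an 8-bit register, left to right.
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def binary_to_denary(bits):
    """Add up the place values that sit under a 1."""
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")


def denary_to_binary(number):
    """Work left to right, writing 1 wherever the place value still fits."""
    bits = ""
    for value in PLACE_VALUES:
        if number >= value:
            bits += "1"
            number -= value
        else:
            bits += "0"
    return bits


print(binary_to_denary("01001000"))  # 72
print(denary_to_binary(147))         # 10010011
```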
B, positive denary and positive hexadecimal
Example: Convert 1023 to hexadecimal and F56 to denary
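1) 1023 to hexadecimal:
   256  16   1
     3   F   F      (Note: F = 15)
   1023 = (3 x 256) + (15 x 16) + (15 x 1)
   = 3FF
2) F56 to denary:
   256  16   1
     F   5   6      (Note: F = 15)
   = (15 x 256) + (5 x 16) + (6 x 1)
   = 3840 + 80 + 6
   = 3926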
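The same two conversions can be checked with a short Python sketch. Note that denary_to_hex uses repeated division by 16 rather than the place-value table shown in the example; both give the same answer. The function names are illustrative only.

```python
HEX_DIGITS = "0123456789ABCDEF"


def denary_to_hex(number):
    """Divide by 16 repeatedly; each remainder is a hex digit, last digit first."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = HEX_DIGITS[number % 16] + digits
        number //= 16
    return digits


def hex_to_denary(text):
    """Multiply the running total by 16 and add the value of each digit."""
    total = 0
    for digit in text.upper():
        total = total * 16 + HEX_DIGITS.index(digit)
    return total


print(denary_to_hex(1023))   # 3FF
print(hex_to_denary("F56"))  # 3926
```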
C, positive hexadecimal and positive binary
Example: Convert EC4 to binary and 0110 0111 1100 1111 to hexadecimal
1) EC4 to binary:
      E    C    4
   1110 1100 0100
   = 1110 1100 0100
2) 0110 0111 1100 1111 to hexadecimal:
   0110 0111 1100 1111
      6    7    C    F
   = 67CF
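Each hexadecimal digit corresponds to exactly four binary digits, so the conversion can be done one group at a time. A minimal Python sketch of that idea (function names are illustrative):

```python
def hex_to_binary(text):
    """Replace every hex digit with its 4-bit binary group."""
    return " ".join(format(int(digit, 16), "04b") for digit in text)


def binary_to_hex(bits):
    """Split the bits into groups of four and replace each group with a hex digit."""
    bits = bits.replace(" ", "")
    # Pad on the left so the length is a multiple of four.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(format(int(group, 2), "X") for group in groups)


print(hex_to_binary("EC4"))                  # 1110 1100 0100
print(binary_to_hex("0110 0111 1100 1111"))  # 67CF
```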
An overflow error will occur if the value is greater than 255 in an 8-bit register.
A computer or device has a predefined limit on the values it can represent or store. An
overflow error occurs when a value outside this limit is reached.
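The 8-bit limit can be demonstrated with a small Python sketch. What a real device does after an overflow varies; this sketch assumes the extra ninth bit is simply lost, and add_8bit is a hypothetical helper name.

```python
def add_8bit(a, b):
    """Add two values in an 8-bit register and report whether overflow occurred."""
    total = a + b
    overflow = total > 255       # 255 is the largest value 8 bits can hold
    stored = total % 256         # assumption: only the lowest 8 bits are kept
    return stored, overflow


print(add_8bit(100, 55))   # (155, False)
print(add_8bit(200, 100))  # (44, True)
```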
To the left:
   128  64  32  16   8   4   2   1
     1   1   0   1   1   1   0   0
To the right:
   128  64  32  16   8   4   2   1
     0   0   1   1   0   1   1   1
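The two bit patterns above are consistent with a one-place logical shift of the 8-bit value 01101110. That starting value is an inference, since it is not shown here. A hedged Python sketch of logical shifts on a bit string (helper names are illustrative):

```python
def shift_left(bits, places=1):
    """Logical shift left: bits fall off the left end, zeros fill from the right."""
    return (bits + "0" * places)[-len(bits):]


def shift_right(bits, places=1):
    """Logical shift right: bits fall off the right end, zeros fill from the left."""
    return ("0" * places + bits)[:len(bits)]


start = "01101110"         # assumed starting value
print(shift_left(start))   # 11011100
print(shift_right(start))  # 00110111
```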
Two’s complement
Two's complement is used to represent negative numbers. In the two's complement system
the left-most bit has a negative place value (-128 in an 8-bit register), so the left-most
bit determines the sign of the number.
Convert a positive binary or denary integer to a two’s complement 8-bit integer and vice
versa.
For example: Convert 57 and 01101011 to two's complement.
1) 57:
   Step 1: It is a positive number, so write the left-most value as zero.
   Step 2: Write the rest as a normal binary number.
   -128  64  32  16   8   4   2   1
      0   0   1   1   1   0   0   1
2) 01101011: A positive binary number is already in two's complement. If its most
   significant digit is 1, add a 0 at the left.
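A minimal Python sketch of 8-bit two's complement conversion in both directions. It relies on the left-most bit having the place value -128; the function names are illustrative, and the negative example (-57) is added here only to show the sign bit at work.

```python
def to_twos_complement(number, bits=8):
    """Return the two's complement bit pattern of a denary integer."""
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not lowest <= number <= highest:
        raise ValueError("value does not fit in the register")
    # Negative values wrap round: -57 is stored as 256 - 57 = 199.
    return format(number % (2 ** bits), "0{}b".format(bits))


def from_twos_complement(pattern):
    """Return the denary value of a two's complement bit pattern."""
    value = int(pattern, 2)
    if pattern[0] == "1":            # left-most bit set, so the number is negative
        value -= 2 ** len(pattern)
    return value


print(to_twos_complement(57))            # 00111001
print(to_twos_complement(-57))           # 11000111
print(from_twos_complement("01101011"))  # 107
print(from_twos_complement("11000111"))  # -57
```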
Text
A computer represents text using character sets, including the American Standard
Code for Information Interchange (ASCII) and Unicode.
ASCII uses 7-bit codes, with the first 32 used for control codes. Extended ASCII uses
8-bit binary codes, allowing for 128 more characters; the extended character sets are
different for Windows and DOS.
Unicode uses 16-to-32-bit binary codes. Unicode allows for a greater range of
characters and symbols than ASCII, including different languages and emojis. The
first 128 characters are the same as ASCII.
Unicode requires more bits per character than ASCII.
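These points can be illustrated with Python's built-in ord() and str.encode(). The sketch uses the UTF-16 encoding as one example of a 16-to-32-bit Unicode representation; describe is a hypothetical helper name.

```python
def describe(character):
    """Show the character code and how many bits it needs."""
    code_point = ord(character)  # the number the character set assigns to the character
    return {
        "code_point": code_point,
        # Only the first 128 codes fit in 7-bit ASCII.
        "ascii_7bit": format(code_point, "07b") if code_point < 128 else None,
        # UTF-16 stores a character in one or two 16-bit units.
        "utf16_bits": len(character.encode("utf-16-be")) * 8,
    }


print(describe("A"))           # code 65, same in ASCII and Unicode, 16 bits in UTF-16
print(describe("\U0001F600"))  # an emoji: no ASCII code, 32 bits in UTF-16
```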
Sound
Sound waves are analogue data, which a computer cannot process directly. The sound
wave is therefore sampled (its amplitude is measured at regular intervals) and each
sample is converted to binary by an Analogue to Digital Converter (ADC), so that it can
be processed by the computer.
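As a rough illustration of sampling, the Python sketch below measures the amplitude of a pure tone at regular intervals and stores each measurement as a binary number. The sine wave stands in for a real sound, and the function name and parameters are assumptions made for this sketch.

```python
import math


def sample_wave(frequency_hz, sample_rate_hz, resolution_bits, duration_s):
    """Sample a sine wave and return each sample as a binary string."""
    levels = 2 ** resolution_bits  # number of amplitude values available
    samples = []
    for n in range(int(sample_rate_hz * duration_s)):
        t = n / sample_rate_hz                                # time of this sample
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # between -1 and 1
        level = round((amplitude + 1) / 2 * (levels - 1))     # nearest stored level
        samples.append(format(level, "0{}b".format(resolution_bits)))
    return samples


# A 1 Hz wave sampled 8 times per second with 4 bits per sample, for 1 second.
print(sample_wave(1, 8, 4, 1))
```

Raising sample_rate_hz or resolution_bits produces more samples or more bits per sample, which is why a more accurate recording needs more storage.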
Advantages                 Disadvantages
Larger dynamic range       Larger file size
Better sound quality       Longer to transmit/download sound files
Less sound distortion      Requires greater processing power
Images
A bitmap image is a series of pixels that are converted to binary, which is processed by
a computer.
The resolution is the number of pixels in the image; the higher the resolution, the higher
the image quality.
The colour depth is the number of bits used to represent each colour. A black and white
image only needs a 1-bit colour depth (two colours), whereas modern computers typically
use a 24-bit colour depth.
The file size and quality of the image increase as the resolution and colour depth
increase.
1 pebibyte (PiB) = 2^50 bytes = 1024 tebibytes
1 exbibyte (EiB) = 2^60 bytes = 1024 pebibytes
Calculate the file size of an image file and a sound file, using information given:
1) The file size of an image (in bits) = image resolution (in pixels) x colour depth (in bits)
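The image formula can be tried out in Python. The result of the formula is in bits, so it is divided by 8 to get bytes. The function names and the 8 x 8 example size are chosen for illustration.

```python
def number_of_colours(colour_depth_bits):
    """Each extra bit doubles the number of colours that can be represented."""
    return 2 ** colour_depth_bits


def image_file_size_bits(width_px, height_px, colour_depth_bits):
    """File size = resolution (total pixels) x colour depth."""
    return width_px * height_px * colour_depth_bits


size_bits = image_file_size_bits(8, 8, 24)
print(size_bits)              # 1536 bits
print(size_bits // 8)         # 192 bytes
print(number_of_colours(1))   # 2
print(number_of_colours(24))  # 16777216
```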
In lossy file compression the algorithm eliminates unnecessary data from the file, and the
original file cannot be reconstructed. The algorithm decides which parts to retain and
which parts to discard.
1) In an image it may reduce the resolution and colour depth
2) In a sound file it may reduce the resolution and sample frequency
MP3 files are used for playing music. They reduce the file size by about 90% while
retaining most of the music quality. The data eliminated by the algorithm are:
MP4 files are like MP3 files but are used for storing multimedia files like music, videos,
photographs, and animations.
b) JPEG
JPEG is a lossy file compression algorithm used for bitmap images. The algorithm is
based on two concepts:
Lossless compression reduces the file size without permanent loss of data.
Run-length encoding (RLE) is used for lossless compression of a number of different file
formats:
An issue arises with a string such as 'cdcdcdcdcd', where RLE compression is not very
effective. In such a situation a flag is used. A flag preceding data indicates that what
follows is the number of repeating units; when a flag is not used, the next byte(s) are
taken at face value.
The number of times a character is repeated is written first, followed by the ASCII code
of that character.
The original string contains thirty-two characters and would occupy thirty-two bytes of
storage.
The coded version contains eighteen characters and would require eighteen bytes of
storage.
This has fifteen values and requires fifteen bytes of storage. This compression saves 53%
compared to the original string.
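The count-then-ASCII-code scheme described above can be sketched in Python as follows. The sketch does not implement the flag, whose exact format is not given here, and the test strings are made up for illustration.

```python
def rle_encode(text):
    """Encode text as pairs: number of repeats, then the ASCII code."""
    encoded = []
    i = 0
    while i < len(text):
        run = 1
        # Count how many times the current character repeats.
        while i + run < len(text) and text[i + run] == text[i]:
            run += 1
        encoded += [run, ord(text[i])]
        i += run
    return encoded


print(rle_encode("aaaaabbbb"))        # [5, 97, 4, 98]
print(len(rle_encode("cdcdcdcdcd")))  # 20
```

The second result shows why a string with no runs is a problem: ten characters become twenty values, so plain RLE makes the data larger rather than smaller.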
The figure 'F' is drawn in a grid, with each square requiring one byte of storage. A white
square has the value 1 and a black square has the value 0.
The 8x8 grid would require sixty-four bytes; the compressed RLE format has thirty values,
and therefore needs only thirty bytes to store the image.
The figure shows an image with four colours. Each colour is made up of red, green, and blue
(RGB) values according to the code to the right.
This produces the following data: 2 0 0 0 4 0 255 0 3 0 0 0 6 255 255 255 1 0 0 0 2 0 255 0 4 255 0 0 4
0 255 0 1 255 255 255 2 255 0 0 1 255 255 255 4 0 255 0 4 255 0 0 4 0 255 0 4 255 255 255 2 0 255 0
1 0 0 0 2 255 255 255 2 255 0 0 2 255 255 255 3 0 0 0 4 0 255 0 2 0 0 0.
The original image (8x8 square) would need 3 bytes per square (to include all three RGB values), so
the uncompressed file for this image is 8x8x3= 192 bytes.
The RLE code has ninety-two values, so the compressed file is ninety-two bytes in size. Therefore,
the compressed file is 52% smaller.
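The figures above can be checked by decoding the RLE data in Python. The sketch assumes every run is stored as four values (a repeat count followed by red, green and blue), which matches the data listed; the variable and function names are illustrative.

```python
RLE_TEXT = """
2 0 0 0  4 0 255 0  3 0 0 0  6 255 255 255  1 0 0 0  2 0 255 0  4 255 0 0
4 0 255 0  1 255 255 255  2 255 0 0  1 255 255 255  4 0 255 0  4 255 0 0
4 0 255 0  4 255 255 255  2 0 255 0  1 0 0 0  2 255 255 255  2 255 0 0
2 255 255 255  3 0 0 0  4 0 255 0  2 0 0 0
"""
values = [int(v) for v in RLE_TEXT.split()]


def rle_decode_rgb(data):
    """Expand (count, red, green, blue) runs back into a list of pixels."""
    pixels = []
    for i in range(0, len(data), 4):
        count, red, green, blue = data[i:i + 4]
        pixels.extend([(red, green, blue)] * count)
    return pixels


pixels = rle_decode_rgb(values)
uncompressed_bytes = len(pixels) * 3   # 3 bytes (R, G, B) per pixel
compressed_bytes = len(values)         # 1 byte per RLE value
saving = round((1 - compressed_bytes / uncompressed_bytes) * 100)

print(len(pixels))         # 64 pixels, i.e. the full 8x8 grid
print(uncompressed_bytes)  # 192
print(compressed_bytes)    # 92
print(saving)              # 52 (per cent smaller)
```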