
Computer Systems

Data Representation

Dr Todd Waugh Ambridge


Module Outline
PART 1 – Fundamentals of Hardware and Software
1. Data Representation
2. CPU, Memory and Program Execution
3. Assembly Language
4. Compilation, Interpretation and Subroutines
Learning Objectives
To introduce the fundamentals of
the binary number system, how it
represents different types of data
and how binary can be operated on.
Topic Outline
01 Everything is Binary
02 Binary Number System
03 Representing Colour & Text
04 Computation with Binary Numbers
05 Representing Negative Numbers
06 Representing Real Numbers
01001000 01000101
01001100 01001100
01001111 00100001

01
Everything is Binary
Computers use 0s and 1s to represent
numbers, text, colours, images,
videos and everything else.
Computer Hardware is Logical

All computer hardware communicates using electronic signals – more specifically, they use logical signals.

A logical signal can only be on or off.
Computer Hardware is Logical

Let’s first think about something more basic than a computer: a digital circuit.

This circuit has two switches, A and B, connected to an AND logic gate, which is then connected to a lightbulb X.

[Circuit diagram: switches A and B feed an AND gate, which is connected to lightbulb X]
Computer Hardware is Logical

If only A is switched on, the lightbulb stays off.

The same happens if only B is switched on: the lightbulb remains off.
Computer Hardware is Logical
But if both A and B are switched on, the lightbulb turns on and lights up.

We can model this operation using a truth table.

A B X
0 0 0
1 0 0
0 1 0
1 1 1

0 represents off and 1 represents on.
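As a rough sketch, the truth table above can be reproduced in a few lines of Python. The function name `and_gate` is made up for this illustration; it models the gate, not the electronics.

```python
# Hypothetical model of the AND gate in the circuit: X is on only if A and B are both on.
def and_gate(a: int, b: int) -> int:
    return a & b

# Print one row of the truth table for every combination of the two switches.
print("A B X")
for a, b in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(a, b, and_gate(a, b))
```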


Computer Hardware is Logical
So, digital circuits operate using a logic consisting of only 0s and 1s – a binary logic.

The core components of a computer – such as the CPU and memory – are made up of millions of digital circuits.

The CPU is where data is operated on; this means that data can only be operated on using binary (just 0s and 1s).

  101100
+ 001001
--------
  110101
Computer Hardware is Logical
Memory is where data is stored; this means that data can only be stored as binary (just 0s and 1s).

[Figure: a block of memory filled with 0s and 1s]
Everything is Binary

In order to store and operate on everything as binary (0s and 1s), we need to be able to represent everything as binary.

This means that lots of different items of data (numbers, text, colours, images, videos, etc.) will have a common structure (strings of binary digits).

[Figure: the number 83 and the letter S, both shown as 01010011]
Everything is Binary

If everything is binary, then there will be different representations for different types of data.

A data representation is a method of representing some real-world data (text, images, etc.) as binary strings.
Everything is Binary

A binary string is a string of any length that consists only of bits.

A 1-bit string is called a bit (0 or 1).

An 8-bit string is called a byte (for example, 0010 1100).

(Less common, but very cute: a 4-bit string is called a nibble, for example 1001.)
Everything is Binary
How many possible byte strings (0000 0000 … 1111 1111) are there?

Possible 1-bit strings = 2
○ 0
○ 1

Possible 2-bit strings = 4
○ 00
○ 01
○ 10
○ 11
Everything is Binary
How many possible byte strings (0000 0000 … 1111 1111) are there?

Possible 3-bit strings = 8
○ 000, 001, 010, 011
○ 100, 101, 110, 111

Possible 4-bit strings = 16
○ 0000, 0001, 0010, 0011
○ 0100, 0101, 0110, 0111
○ 1000, 1001, 1010, 1011
○ 1100, 1101, 1110, 1111
Everything is Binary
How many possible byte strings (0000 0000 … 1111 1111) are there?

Possible 1-bit strings = 2 = 2^1
Possible 2-bit strings = 4 = 2^2
Possible 3-bit strings = 8 = 2^3
Possible 4-bit strings = 16 = 2^4

Each extra bit doubles the number of possible strings, so there are 2^n possible n-bit strings.

Possible byte strings = 2^8 = 256
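The counting argument can be checked by listing the strings. This is a minimal sketch; the helper name `bit_strings` is an assumption made for illustration.

```python
from itertools import product

# Hypothetical helper: list every binary string of length n.
def bit_strings(n: int) -> list[str]:
    return ["".join(bits) for bits in product("01", repeat=n)]

print(bit_strings(2))       # ['00', '01', '10', '11']
print(len(bit_strings(8)))  # 256
```

The length of each list matches 2^n, which is why a byte can take 256 different values.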


Everything is Binary

Possible byte strings (0000 0000 … 1111 1111) = 2^8 = 256

So, if we are limited to a single byte to represent something, we can represent 256 different things.

Have you heard of 256 colour images?
Everything is Binary
In 256 colour images (also known as 8-bit colour), colours are represented as single-byte words.

This means there are 256 possible colours.

Of course, which 256 colours are represented depends on how you design your representation.

Before we look into this, we need to first understand binary as a number system: in other words, how binary is used to represent numbers.
3 = 11?

02
Binary Number System
Humans use the symbols 0-9 to represent
numbers. Computers use the symbols 0-1 to
represent numbers.
Denary Number System
Our number system is called denary.

The name derives from the Latin denarius, meaning “containing ten”: each position holds one of ten options.

Humans use denary because we have ten fingers.

The ten options (0 1 2 3 4 5 6 7 8 9) are called digits.
Binary Number System
A computer’s number system is called binary.

The name derives from the Latin binarius, meaning “consisting of two”: each position holds one of two options.

Computers use binary because electrical signals have two states: ‘on’ or ‘off’.

The two options (0 and 1) are called bits.
Binary Number System
In every number system, every number has an infinite number of invisible zeros to its left...

Denary: 9 is really …0000000000009
Binary: 1 is really …0000000000001

...and once you run out of digits in a column, you add 1 to the next column to the left and reset the current column to 0.

Denary: After 9 comes 10
Binary: After 1 comes 10
        After 1011 comes 1100
Binary Number System
Denary Binary
0 0
1 1
2 10
3 11
4 100
5 101
6 110
7 111
214 ?
? 10110001

Counting is pretty easy, but how do we convert an arbitrary denary number to binary?

And also, how do we convert from binary to denary?
Binary to Denary
Converting from binary to denary is straightforward. Simply lay out each bit of the binary string under its place value (a power of two).

Let’s consider the binary number 10110001

256 128 64 32 16 8 4 2 1
0   1   0  1  1  0 0 0 1

Then we get the resulting denary number by simply adding the place values of the columns which show a 1.

128 + 32 + 16 + 1 = 177
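The column method can be sketched in Python as follows. The function name `binary_to_denary` is an assumption for this example; Python's built-in `int(bits, 2)` does the same job.

```python
# Hypothetical helper: add up the place values of the columns that show a 1.
def binary_to_denary(bits: str) -> int:
    total = 0
    place_value = 1  # the rightmost column is worth 1
    for bit in reversed(bits):
        if bit == "1":
            total += place_value
        place_value *= 2  # each column to the left is worth twice as much
    return total

print(binary_to_denary("10110001"))  # 177
```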
Denary to Binary
Converting from denary to binary is a little more involved.

We will look at two methods:

Left-to-right subtraction: Repeatedly subtract powers of two from the number.

Right-to-left division: Repeatedly divide the number by two and keep track of remainders.
Denary to Binary
Method 1: Left-to-right subtraction.

Let’s consider the denary number 214

10^8 10^7 10^6 10^5 10000 1000 100 10 1
0    0    0    0    0     0    2   1  4

To convert to binary, we work through the powers of two from left to right. If the power of two fits into what is left of the number, we subtract it and write 1; otherwise we write 0. We carry on until we reach 0.

256 128 64 32 16 8 4 2 1

214 is smaller than 256; write 0.
214 - 128 = 86; write 1.
86 - 64 = 22; write 1.
22 is smaller than 32; write 0.
22 - 16 = 6; write 1.
6 is smaller than 8; write 0.
6 - 4 = 2; write 1.
2 - 2 = 0; write 1.
0 reached; write 0 in all remaining columns.

256 128 64 32 16 8 4 2 1
0   1   1  0  1  0 1 1 0

214 in binary is 011010110
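The subtraction steps can be sketched in Python like this. The function name and the `columns` parameter are assumptions for illustration; the default of 9 columns matches the 256 … 1 row used above, and the sketch assumes the number fits in that many columns.

```python
# Hypothetical helper for Method 1: subtract powers of two from left to right.
def denary_to_binary_subtraction(number: int, columns: int = 9) -> str:
    bits = []
    for exponent in range(columns - 1, -1, -1):
        power = 2 ** exponent
        if power <= number:
            number -= power   # the power fits: subtract it and write 1
            bits.append("1")
        else:
            bits.append("0")  # the power is too big: write 0
    return "".join(bits)

print(denary_to_binary_subtraction(214))  # 011010110
```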


Denary to Binary
Method 2: Right-to-left division.

Let’s consider the denary number 214

10^8 10^7 10^6 10^5 10000 1000 100 10 1
0    0    0    0    0     0    2   1  4

To convert to binary, we repeatedly divide the number by 2 and write the remainder in the next column, working from right to left.

256 128 64 32 16 8 4 2 1

214 / 2 = 107 remainder 0; write 0.
107 / 2 = 53 remainder 1; write 1.
53 / 2 = 26 remainder 1; write 1.
26 / 2 = 13 remainder 0; write 0.
13 / 2 = 6 remainder 1; write 1.
6 / 2 = 3 remainder 0; write 0.
3 / 2 = 1 remainder 1; write 1.
1 / 2 = 0 remainder 1; write 1.
0 reached; write 0 in all remaining columns.

256 128 64 32 16 8 4 2 1
0   1   1  0  1  0 1 1 0

214 in binary is 11010110 (the same answer as Method 1)
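A minimal Python sketch of the division method follows. The function name is an assumption for illustration; each remainder is placed to the left of the bits already written, which mirrors filling the columns from right to left.

```python
# Hypothetical helper for Method 2: divide by two and collect the remainders.
def denary_to_binary_division(number: int) -> str:
    if number == 0:
        return "0"
    bits = ""
    while number > 0:
        number, remainder = divmod(number, 2)
        bits = str(remainder) + bits  # the new remainder goes in the next column to the left
    return bits

print(denary_to_binary_division(214))  # 11010110
```

Unlike the subtraction sketch, this version does not need to know the number of columns in advance, so it produces no leading zeros.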
Binary Number System
Denary Binary
0 0
1 1
2 10
3 11
4 100
5 101
6 110
7 111
214 11010110
177 10110001
03
Representing Colour & Text
Binary is a number system, so it is easy to
represent numbers. But how do we represent
other kinds of data?
Representing Colour & Text

In order to store and operate on everything as binary (0s and 1s), we need to be able to represent everything as binary.

This means that lots of different items of data (numbers, text, colours, images, videos, etc.) will have a common structure (strings of binary digits).
Representing Colour & Text

We know now how to represent numbers. (Well, we know how to represent positive whole numbers.)

But how could we use binary to represent colour or text?
Representing Colour
There are lots of ways we could represent colours in binary.

In order to design our representation, we first have to ask ourselves:
● Which colours do we want to represent? All of them?
● How many bits can we use to represent a colour?

The human eye can distinguish about 10,000,000 colours.

How many bits would we need to represent 10,000,000 colours?
Representing Colour

In order to represent at least 10,000,000 colours, we need 24 bits for each colour: 2^23 = 8,388,608 is too few, while 2^24 = 16,777,216 is enough.

But how should a given 24-bit binary word (for example, 101100001111001100000101) relate to a colour?
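The figure of 24 bits can be checked with a short calculation. This is a sketch; the helper name `bits_needed` is made up for the example.

```python
# Hypothetical helper: smallest number of bits whose strings can label `items` things.
def bits_needed(items: int) -> int:
    bits = 0
    while 2 ** bits < items:
        bits += 1
    return bits

print(bits_needed(10_000_000))  # 24
print(2 ** 23, 2 ** 24)         # 8388608 16777216
```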
Representing Colour
What if we just list all the colours?

Colour 0 (000000000000000000000000) = Black
Colour 1 (000000000000000000000001) = Green
Colour 11,596,549 (101100001111001100000101) = A kind of pinky shade of yellow

This is a bad representation, because there is no way to work out which colour we mean from a given binary word, and no way to work out what the binary word should be for a given colour.
Representing Colour

A good representation gives a clear, mathematical relationship between the representing type of data (binary) and the represented type of data (colour).

So, let’s think about the structure of colour, as a piece of data!
Representing Colour
What is a colour made up of?

A mixture of primary colours!

In additive (RGB) colour models, every colour can be modelled as a mixture of red, green and blue, which is added to black.

In subtractive (CMY) colour models, every colour can be modelled as a mixture of cyan, magenta and yellow, which is subtracted from white.
Representing Colour

RGB colour models make slightly more sense for a computer… why?

Because computer displays are (usually) black to begin with, and colour is added to that black.

So we will model each colour as a mixture of red, green and blue.
Representing Colour
10110000 11110011 00000101

So how should a given 24-bit binary word relate to a colour, which is a mixture of red, green and blue?

Let’s split the 24 bits into three bytes.

The first byte represents red, the next represents green and the last represents blue.
Representing Colour
10110000 11110011 00000101

Recap: How many possible values are there for a byte? 256

What is the minimum value? 0

What is the maximum value? 255

Therefore, each byte will measure the amount of that colour on a scale from 0-255.
Representing Colour
(10110000)_2 (11110011)_2 (00000101)_2
(176)_10 (243)_10 (5)_10

[Figure: the red, green and blue amounts are added together to give the colour]

Therefore, each byte will measure the amount of that colour on a scale from 0-255.
Representing Colour
(10110000)_2 (11110011)_2 (00000101)_2
(176)_10 (243)_10 (5)_10

Using this 24-bit colour representation, we can write this colour in many different ways:

In binary (base 2 number system) it’s 101100001111001100000101.

In RGB notation, which uses denary (base 10 number system), it’s rgb(176,243,5).
Representing Colour
(10110000)_2 (11110011)_2 (00000101)_2
(176)_10 (243)_10 (5)_10
(B0)_16 (F3)_16 (05)_16

Hexadecimal uses 16 distinct symbols (0-9, A-F). Therefore, every 4-bit nibble relates to a hex character:

0000 0    1000 8
0001 1    1001 9
0010 2    1010 A
0011 3    1011 B
0100 4    1100 C
0101 5    1101 D
0110 6    1110 E
0111 7    1111 F

As a hex code, which uses hexadecimal (base 16 number system), it’s #B0F305.
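The three notations for this colour can be derived from the 24-bit word with a short Python sketch. The function names `split_rgb` and `hex_code` are assumptions made for this example.

```python
# Hypothetical helper: split a 24-bit word into its red, green and blue bytes.
def split_rgb(word: str) -> tuple[int, int, int]:
    red, green, blue = word[0:8], word[8:16], word[16:24]
    return int(red, 2), int(green, 2), int(blue, 2)

# Hypothetical helper: write each byte as two hex characters (one per nibble).
def hex_code(red: int, green: int, blue: int) -> str:
    return f"#{red:02X}{green:02X}{blue:02X}"

rgb = split_rgb("101100001111001100000101")
print(rgb)             # (176, 243, 5)
print(hex_code(*rgb))  # #B0F305
```

The `02X` format pads each byte to two hex characters, which is why the blue value 5 appears as 05.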
Representing Colour
Memory on a computer is finite.

We can represent every colour we need using 24 bits, but what if we have fewer bits than that?

While 24, 32 and 64-bit binary words are commonly used today, this hasn’t always been the case.

Older games consoles only had 16, 8 or even 2 bits available to represent each colour. What did this mean for the colours of their games?
Representing Colour
Simple: with fewer bits, we have fewer colours.

For example, with 8 bits there are only 256 different binary words: so we can only have 256 colours.

VGA display adapters, released in ‘87, often used 256 colours.

Before that, EGA used 16 colours (4 bits).

The Game Boy had only 2 bits for its colours, giving just 4 shades.
Representing Colour

[Figure: Here’s The Secret of Monkey Island in 1-bit, 2-bit, 4-bit and 8-bit colour. (source: Point and Click Jam!)]
Representing Colour & Text

We know now how to represent (positive whole) numbers and colours.

But how could we use binary to represent text?
Representing Text
There are lots of ways we could represent text in binary.

As we learned with numbers and colours, the number of bits we have is directly related to the number of items (in this case, textual characters) we can represent.

So, which characters do we want to be able to store?
Representing Text

All of them?

There are tens of thousands of unique characters across the world.
Representing Text

But, unlike with colour, it might be easier for us to settle for less with text.

Let’s say we only need to display English text. How many bits do we actually need?
Representing Text
The Baudot code, an early telegraphic system, used a 5-bit representation for characters, meaning a maximum of 2^5 = 32 characters.

These 32 ‘codes’ allowed one to represent (or ‘encode’) the 26 letters, as well as two symbols and four ways of writing a blank.
Representing Text

In 1932, the ITA2 code was introduced, which used 5 bits and a shift key. With a maximum of 2 × 32 = 64 different characters, all common letters, numbers and symbols could be represented.

Are there any problems with this representation?
Representing Text

64 symbols isn’t quite enough if we want both uppercase and lowercase letters, as well as numbers and common symbols.

So, when computers came along, the standard became a 7-bit system called American Standard Code for Information Interchange (or, ASCII). With 7 bits, ASCII has 2^7 = 128 codes.
Representing Text
ASCII is still used today, and using only 7 bits means that text files remain very lightweight.

788,258 words = 3,116,480 characters = 21,815,360 bits (at 7 bits per character) = 2.73 MB

(8 bits in a byte, 1000 bytes in a Kilobyte, 1000 Kilobytes in a Megabyte.)
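The file-size arithmetic can be reproduced in Python. This is only a sketch: the function name `ascii_size` is an assumption, and it counts exactly 7 bits per character as the slide does.

```python
# Hypothetical helper: size of a text stored with 7 bits per character.
def ascii_size(characters: int) -> tuple[int, float]:
    bits = characters * 7
    megabytes = bits / 8 / 1000 / 1000  # bits -> bytes -> Kilobytes -> Megabytes
    return bits, megabytes

bits, megabytes = ascii_size(3_116_480)
print(bits)                 # 21815360
print(round(megabytes, 2))  # 2.73

# The letter S has ASCII code 83, which is 1010011 as a 7-bit word.
print(ord("S"), format(ord("S"), "07b"))  # 83 1010011
```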
Representing Text
Sometimes, we need to be able to use more than 128 characters. For example, there are thousands of unique Chinese characters.

For this, the Unicode standard was developed (first proposed in 1988 and first published in 1991). UTF-8, a Unicode encoding that uses between one and four bytes per character, has been the most common encoding on the World Wide Web since 2008.

UTF-16 and UTF-32 encode the same set of characters using 16-bit and 32-bit units instead.
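To see how the Unicode encodings differ in size, here is a small Python sketch using the built-in `str.encode`. The three sample characters and the helper name `encoded_lengths` are arbitrary choices for illustration; the `-le` codec variants are used so that no byte-order mark is added to the output.

```python
# Hypothetical helper: number of bytes a character needs in each encoding.
def encoded_lengths(character: str) -> tuple[int, int, int]:
    return (
        len(character.encode("utf-8")),
        len(character.encode("utf-16-le")),
        len(character.encode("utf-32-le")),
    )

for character in ["S", "\u00e9", "\u4e2d"]:  # S, e with acute accent, a Chinese character
    print(character, encoded_lengths(character))
```

An ASCII character such as S takes one byte in UTF-8, which is why plain English text is no larger in UTF-8 than in an 8-bit ASCII file.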
1 + 1 = 10?

04
Computation with Binary Numbers
We know how to store different kinds of data as binary numbers. But how do we perform logical and arithmetic operations on them?
Quick Recap
We’ve introduced the binary number system, and seen how binary can be used to represent colours and text.

From this, you can imagine how binary can be used to store all sorts of data:
● Images are made up of thousands of blocks (called pixels) of colour,
● Audio is represented by a stream of bits that give the amplitude of a digitally sampled sound wave over time,
● Video is a combination of a sequence of images and audio.
Binary Computation
Now let’s return to how computers represent numbers.

Numbers are used everywhere on a computer, and it’s not enough to just store numbers – we also need to be able to operate on them by performing logical and arithmetic operations.

How would a computer, or more simply a digital circuit, compute 101 AND (0100 OR 0110)?
Binary Logic

Logical operations are performed bitwise – this means that the operation is performed locally at each bit position.

Therefore, in order to perform logical operations on binary words, we first need to know how to perform logical operations on bits themselves.
Binary Logic
We already discussed the AND operation on bits.

A B A AND B
0 0 0
1 0 0
0 1 0
1 1 1

A AND B = 1 only when both A = 1 and B = 1.
Binary Logic
The OR operation on bits also does exactly what it says.

A B A OR B
0 0 0
1 0 1
0 1 1
1 1 1

A OR B = 1 when either A = 1 or B = 1 (or both).

We can now perform the first bitwise computation of our logical calculation: 0100 OR 0110 = 0110
Binary Logic
We now return to the AND operation to complete the calculation.

A B A AND B
0 0 0
1 0 0
0 1 0
1 1 1

The second bitwise computation is 101 AND 0110 = 0100. (101 is padded with a leading zero to 0101 so that the bits line up.)

Thus, we have used logical operations to compute that 101 AND (0100 OR 0110) = 0100
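The same calculation can be sketched with Python's bitwise operators `|` (OR) and `&` (AND), which act on every bit position at once. The variable names are made up for this example.

```python
a = 0b0101  # 101, padded with a leading zero
b = 0b0100
c = 0b0110

step_one = b | c       # first computation: 0100 OR 0110
result = a & step_one  # second computation: 0101 AND 0110

print(format(step_one, "04b"))  # 0110
print(format(result, "04b"))    # 0100
```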
Binary Logic
There are of course more logical operations than just AND and OR.

For example, NOT negates, or flips, the input; it only takes one argument rather than two:

A NOT A
0 1
1 0

XOR performs an exclusive-or; the output is 1 only when exactly one of the inputs is 1:

A B A XOR B
0 0 0
1 0 1
0 1 1
1 1 0
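NOT and XOR can also be tried on whole binary words in Python. This is a sketch under one assumption worth stating: Python integers have no fixed width, so the hypothetical helper `bitwise_not` masks the result to a chosen number of bits (4 here) to mimic a fixed-size word.

```python
# Hypothetical helper: flip every bit of a fixed-width binary word.
def bitwise_not(value: int, width: int = 4) -> int:
    mask = (1 << width) - 1  # e.g. 1111 for a 4-bit word
    return ~value & mask

print(format(bitwise_not(0b0101), "04b"))  # 1010
print(format(0b0101 ^ 0b0110, "04b"))      # 0011
```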
Binary Computation

Numbers are used everywhere on a computer, and it’s not enough to just store numbers – we also need to be able to operate on them by performing logical and arithmetic operations.

How would a computer, such as a smartphone, compute 105 + 28, given that they are represented in binary?
Binary Arithmetic – Addition
To answer this question, let’s think: how would we add 105 and 28 in denary?

We would write them out…

  105
+  28
-----
  133

…and add them digit-by-digit (remembering to carry).
Binary Arithmetic – Addition
In order to do this, we had to
know how to add digits themselves.

We needed to know that 2 + 1 = 3…

…or that 5 + 8 = 13.


Binary Arithmetic – Addition
As with logical operations, in order to perform binary addition we first need to know how to add individual bits.

However, unlike with logical operations, addition is not performed bitwise.

Instead, it has to be performed from right-to-left to take account of carry bits.
Binary Arithmetic – Addition
As with logical operations, in order to perform binary addition we first need to know how to add individual bits (remembering to carry).

A B Result Carry (of A+B)
0 0 0 0
1 0 1 0
0 1 1 0
1 1

The columns with a zero are easy: we don’t need to carry anything.
Binary Arithmetic – Addition
How do we add 1 + 1 in binary?

Here, 1 is behaving as both the highest digit and the lowest digit.

It’s a bit like if we wrote 9 + 1 in denary.

As with 9 + 1 in denary, we need to carry 1 to the next column, and reduce this column to 0.

A B Result Carry (of A+B)
0 0 0 0
1 0 1 0
0 1 1 0
1 1 0 1

So, 1 + 1 = 10, or 0 carry 1.
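The completed table can be expressed with the logical operations from earlier: the Result column matches A XOR B and the Carry column matches A AND B. The sketch below uses that observation; the function name `half_adder` is the usual name for such a circuit, but the code is only an illustration.

```python
# Sketch of adding two single bits: returns (result, carry).
def half_adder(a: int, b: int) -> tuple[int, int]:
    result = a ^ b  # 1 when exactly one input is 1
    carry = a & b   # 1 only when both inputs are 1
    return result, carry

for a, b in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(a, b, *half_adder(a, b))
```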


Binary Arithmetic – Addition
So, to compute 105 + 28 in binary, let’s first convert 105 and 28 from denary…

A B Result Carry (of A+B)
0 0 0 0
1 0 1 0
0 1 1 0
1 1 0 1

…and then we add bit-by-bit (remembering to carry).
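As a sketch of the whole procedure, the Python below adds two binary strings from right to left, carrying as it goes. The function name `add_binary` is an assumption for illustration; 105 is 1101001 and 28 is 0011100 (11100 padded to the same length).

```python
# Hypothetical helper: add two binary strings column by column, right to left.
def add_binary(x: str, y: str) -> str:
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)  # pad with leading zeros so the columns line up
    carry = 0
    bits = []
    for bit_x, bit_y in zip(reversed(x), reversed(y)):
        total = int(bit_x) + int(bit_y) + carry
        bits.append(str(total % 2))  # the bit written in this column
        carry = total // 2           # the bit carried to the next column
    if carry:
        bits.append("1")
    return "".join(reversed(bits))

print(add_binary("1101001", "0011100"))  # 10000101
```

The answer 10000101 is 128 + 4 + 1 = 133, which agrees with 105 + 28 in denary.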
Binary Arithmetic – Addition
So, to compute 105 + 28 in binary,
let’s first convert 105 and 28
from denary…

A B A+B
Result Carry
0 0 0 0
1 0 1 0
0 1 1 0
1 1 0 1
1
…and then we add bit-by-bit
(remembering to carry).
Binary Arithmetic – Addition
So, to compute 105 + 28 in binary,
let’s first convert 105 and 28
from denary…

A B A+B
Result Carry
0 0 0 0
1 0 1 0
0 1 1 0
1 1 0 1

…and then we add bit-by-bit


(remembering to carry).
Binary Arithmetic – Addition
105 in binary is 1101001 and 28 in binary is 0011100. Adding
bit-by-bit from right-to-left (remembering to carry) gives:

  1101001    (105)
+ 0011100    (28)
---------
 10000101    = 133
Binary Arithmetic – Addition
What should we do in this situation? What is 1 + 1 + 1 ?

1 + 1 = 0 carry 1, so adding another 1 gives us 1 carry 1.
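As a rough sketch of the whole procedure, the code below adds two binary strings right-to-left, passing the carry along. The function names add_bits and add_binary are made up for this example.

```python
def add_bits(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add two bits plus an incoming carry, returning (result, carry_out)."""
    total = a + b + carry_in          # 0, 1, 2 or 3
    return total % 2, total // 2      # e.g. 1 + 1 + 1 = 3, which is 1 carry 1

def add_binary(x: str, y: str) -> str:
    """Add two binary strings right-to-left, carrying as we go."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)   # pad the shorter input with 0s
    carry = 0
    out = []
    for a, b in zip(reversed(x), reversed(y)):
        bit, carry = add_bits(int(a), int(b), carry)
        out.append(str(bit))
    if carry:
        out.append("1")                     # the sum may need one extra bit
    return "".join(reversed(out))

print(add_binary("1101001", "11100"))       # 105 + 28
```

Note that the answer can be one bit longer than the inputs.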


Binary Arithmetic – Shifting
We will not go over binary multiplication in this module, but
multiplying a number by any power of two is easy!

As we saw in this example, doubling a number is the same as just
shifting all bits once to the left.

This is called a bit shift operation: 11 << 1 is a bit shift one
space to the left, which is the same as multiplying by 2.
Binary Arithmetic – Shifting
Performing a left bit shift n many times will multiply the input by
2^n.
Binary Arithmetic – Shifting
Performing a right bit shift n many times will divide the input
by 2^n.

Note that the last two bits have now been completely lost, because
we can currently only represent whole numbers. The right bit shift
therefore performs an integer division, rather than a real
division.
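Python uses the same << and >> operators, so the shifts can be tried directly. A small sketch, using the binary number 1011 (11 in denary) as an arbitrary example input:

```python
x = 0b1011                 # 11 in denary

print(x << 1)              # one left shift: multiply by 2
print(x << 3)              # three left shifts: multiply by 2^3 = 8
print(x >> 2)              # two right shifts: integer division by 2^2 = 4
print(bin(x >> 2))         # the last two bits (11) have been lost
```

Here 11 >> 2 gives 2 rather than 2.75, because the shifted-out bits are simply dropped.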
Binary Arithmetic – Overflow
Thus far, we have implicitly assumed that we have an unlimited
number of bits available for each number – just like we do in real
life when writing on paper or a whiteboard!

However, on a computer this is not the case:


● In a given type of computer, the size of integers is fixed
● 32 or 64 bits are popular choices these days
Binary Arithmetic – Overflow
The problem is, as we have seen, the result of binary arithmetic
often requires more bits than the size of the inputs:
● Adding two n-bit numbers can yield a result of size n+1
● Left shifting can yield results much bigger than the input
Binary Arithmetic – Overflow
Let’s pretend we are working on a hypothetical computer that
represents numbers as 7-bit binary words.

Both of these operations give results that are over 7 bits…

This is called overflow. When overflow occurs, an error could be raised,
or the extra bits might simply be lost without warning!
Binary Arithmetic – Overflow
Overflow can cause errors or, worse, wrong answers.

The below calculations say 105 + 28 = 5 and 19 * 64 = 64!
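Python integers do not overflow on their own, so the sketch below imitates the hypothetical 7-bit computer by keeping only the lowest 7 bits of each result. The names WIDTH, MASK and wrap are assumptions made for this example.

```python
WIDTH = 7
MASK = (1 << WIDTH) - 1        # 0b1111111: keeps only the lowest 7 bits

def wrap(value: int) -> int:
    """Discard any bits that do not fit in a 7-bit word."""
    return value & MASK

print(wrap(105 + 28))          # 133 needs 8 bits; the top bit is lost
print(wrap(19 << 6))           # 19 * 64 = 1216 needs 11 bits
```

The two results are 5 and 64: exactly the wrong answers described above.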


Binary Arithmetic – Overflow
In order to avoid overflow when programming, we should use
appropriate data types.

For example, some programming languages have many different types
for integers (whole numbers):
● short is the type of 16-bit integers
● int is the type of 32-bit integers
● long is the type of 64-bit integers

However, Python’s int type is of (effectively) unlimited size! How?

(Feel free to look into it and let me know if you find out how it works.)
Binary Arithmetic – Subtraction
The final arithmetic operation we would like to introduce is
subtraction.

It is rather tricky to perform subtraction on binary numbers…


…but we can cheat by simply adding a negative number instead,
because A – B = A + (- B).

However, to do that we’ll first have to learn how to represent
negative numbers.
-1011 0111?

05
Representing Negative Numbers
Let’s explore how we represent numbers below
zero. By doing this, we will illuminate how
to perform subtraction on binary numbers.
Representing Negative Numbers
Let’s say that our computer represents numbers as single-byte
(8-bit) binary words using the usual binary representation.

The maximum number that can be represented is 255.

128 64 32 16 8 4 2 1
1 1 1 1 1 1 1 1
Representing Negative Numbers
Let’s say that our computer represents numbers as single-byte
(8-bit) binary words using the usual binary representation.

The minimum number that can be represented is 0.

128 64 32 16 8 4 2 1
0 0 0 0 0 0 0 0
Representing Negative Numbers
The int datatype includes both positive and negative numbers, but the
representation we have explored only represents positive numbers.

We will look at two methods for representing negative numbers:

Sign-and-magnitude: the MSB is replaced with a sign bit, which acts
like a ‘-’ sign.

Two’s complement: the more common method, which makes the MSB itself
negative.
Negative Numbers – Sign-and-magnitude
How do we represent negative numbers?

We just write ‘-’ before the number!

128 64 32 16 8 4 2 1

We would like to do something similar in binary… but how can we
denote whether or not there is a minus sign before the number?

With yet another binary bit!
Negative Numbers – Sign-and-magnitude
In the sign-and-magnitude representation, the MSB (in this 8-bit
case, 2^7) instead becomes a sign bit.

When this bit is set to 1, it denotes the presence of a ‘-’ sign.

– 64 32 16 8 4 2 1
1 1 1 0 0 1 0 0

The remaining bits are the magnitude of the number: they represent
the absolute value of the represented number in the usual way.

Which denary number is represented by the sign-and-magnitude
binary number 11100100?
Negative Numbers – Sign-and-magnitude
– 64 32 16 8 4 2 1
1 1 1 0 0 1 0 0

The magnitude of the number is 64 + 32 + 4 = 100.

The sign of the number is negative.

The represented number is -100.

If the sign bit had been 0 (01100100), then the represented number
would have been 100.
Negative Numbers – Sign-and-magnitude

For an 8-bit sign-and-magnitude binary word…

– 64 32 16 8 4 2 1
1 1 1 1 1 1 1 1

What is the maximum representable number? 127

What is the minimum representable number? -127
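A minimal sketch of reading a sign-and-magnitude word, assuming the word is given as a string of 0s and 1s with the sign bit first. The name decode_sign_magnitude is made up for this example.

```python
def decode_sign_magnitude(bits: str) -> int:
    """Interpret a bit string as a sign-and-magnitude number."""
    sign = -1 if bits[0] == "1" else 1    # the MSB acts like a '-' sign
    magnitude = int(bits[1:], 2)          # remaining bits, read as usual
    return sign * magnitude

print(decode_sign_magnitude("11100100"))  # the example word above
print(decode_sign_magnitude("01100100"))  # same magnitude, sign bit 0
print(decode_sign_magnitude("11111111"))  # minimum for 8 bits
print(decode_sign_magnitude("01111111"))  # maximum for 8 bits
```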


Negative Numbers – Sign-and-magnitude
The sign-and-magnitude representation seems sensible, simple and
reflects what humans do with denary. Why not use it all the time?

Mathematically, sign-and-magnitude isn’t ideal: it doesn’t let us
define subtraction!

Take the following law of subtraction: X - X = 0.
Negative Numbers – Sign-and-magnitude
The law X - X = 0 doesn’t hold with this representation. For example,
adding 100 (01100100) to -100 (11100100) bit-by-bit gives 01001000
once the overflow is removed, which represents 72 rather than 0.

To define subtraction, we need to look elsewhere…
Negative Numbers – Two’s complement
In the two’s complement representation, the MSB (in this 8-bit
case, 2^7) keeps its magnitude but becomes negative: it is worth -128
rather than 128.

When this bit is set to 1, 128 is subtracted from the value of the
remaining bits.

–128 64 32 16 8 4 2 1
1 1 1 0 0 1 0 0

Which denary number is represented by the two’s complement
binary number 11100100?

-128 + 64 + 32 + 4
Negative Numbers – Two’s complement
–128 64 32 16 8 4 2 1
1 1 1 0 0 1 0 0

-128 + 64 + 32 + 4 = -28

The represented number is -28.

If the MSB had been 0 (01100100), then the represented number would
have been 100.
Negative Numbers – Two’s complement

For an 8-bit two’s complement binary word…

–128 64 32 16 8 4 2 1
1 0 0 0 0 0 0 0

What is the maximum representable number? 127

What is the minimum representable number? -128

How is it possible that we can represent more numbers with two’s
complement? Because there is only one representation of zero!
(Sign-and-magnitude has two: 00000000 and 10000000.)
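Reading a two’s complement word can be sketched in the same way, again assuming the word is given as a string of 0s and 1s. The name decode_twos_complement is made up for this example.

```python
def decode_twos_complement(bits: str) -> int:
    """Interpret a bit string as a two's complement number."""
    value = int(bits[1:], 2)              # the positive place values
    if bits[0] == "1":
        value -= 1 << (len(bits) - 1)     # the MSB is worth -2^(n-1)
    return value

print(decode_twos_complement("11100100"))  # -128 + 64 + 32 + 4
print(decode_twos_complement("01100100"))  # MSB is 0, so nothing is subtracted
print(decode_twos_complement("10000000"))  # minimum for 8 bits
print(decode_twos_complement("01111111"))  # maximum for 8 bits
```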
Negative Numbers – Two’s complement
Negating a number in sign-and-magnitude is easy: we just put a 1
in the MSB.

But two’s complement is a bit less intuitive than this.

–128 64 32 16 8 4 2 1

So how do we convert from denary to two’s complement binary?


Let’s take the denary number -81.
Negative Numbers – Two’s complement
To convert -81 to 8-bit two’s complement:

–128 64 32 16 8 4 2 1

1. Convert its magnitude 81 into binary: 01010001.
2. Flip every bit: 10101110.
3. Add 1 to the number, ignoring any overflow: 10101111.
4. Don’t forget to check your answer! -128 + 32 + 8 + 4 + 2 + 1 = -81
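Steps 2 and 3 can be sketched as a short function. This is only an illustration working on strings of 0s and 1s; the name negate_twos_complement is made up for this example.

```python
def negate_twos_complement(bits: str) -> str:
    """Negate a two's complement word: flip every bit, then add 1."""
    width = len(bits)
    flipped = "".join("1" if b == "0" else "0" for b in bits)   # step 2
    plus_one = (int(flipped, 2) + 1) % (1 << width)             # step 3, ignoring overflow
    return format(plus_one, f"0{width}b")

print(negate_twos_complement("01010001"))   # 81 becomes -81
print(negate_twos_complement("10101111"))   # negating again gives 81 back
```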
Negative Numbers – Two’s complement
Take the following law of subtraction: X - X = 0.

The law holds with this representation. For example, adding 81
(01010001) to -81 (10101111) bit-by-bit gives 00000000 once the
overflow is removed. We now know how to perform subtraction on binary
numbers!
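A sketch of subtraction as “add the negative”, assuming 8-bit words held as Python integers and masked to 8 bits to remove overflow. The names WIDTH, MASK, negate and subtract are assumptions for this example.

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1                 # 0b11111111

def negate(x: int) -> int:
    """Two's complement negation: flip every bit, add 1, drop overflow."""
    return ((x ^ MASK) + 1) & MASK

def subtract(a: int, b: int) -> int:
    """Compute a - b as a + (-b), dropping overflow."""
    return (a + negate(b)) & MASK

print(format(subtract(81, 81), "08b"))    # X - X = 0
print(format(subtract(105, 28), "08b"))   # 105 - 28 = 77
print(format(subtract(28, 105), "08b"))   # 28 - 105 = -77, as a two's complement word
```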
Negative Numbers – Two’s complement
Why does subtraction work for
two’s complement?

It’s all to do with overflow!

255 + 1 in 8-bit binary arithmetic “looks like” 0, once we remove
overflow.

255 is thus behaving like -1.

So, once we take out the MSB, we represent a negative number –X as
the usual binary representation of 256 – X.
Negative Numbers – Two’s complement
Why does subtraction work for
two’s complement?

It’s all to do with overflow!

2^n in n-bit binary arithmetic “looks like” 0.

2^n - 1 is thus behaving like -1.

So, once we take out the MSB, we represent a negative number –X as
the usual binary representation of 2^n – X.
Representing Negative Numbers
Let’s review the three representations of integers for 8-bit arithmetic:

Representation Minimum Maximum


Usual 0 255
Sign-and-magnitude -127 127
Two’s complement -128 127
Representing Negative Numbers
Let’s generalise this to n-bit arithmetic:

Representation        Minimum             Maximum

Usual                 0                   2^n - 1
Sign-and-magnitude    -(2^(n-1) - 1)      2^(n-1) - 1
Two’s complement      -2^(n-1)            2^(n-1) - 1
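The general formulas can be checked against the 8-bit table with a few lines of Python. The function name ranges is made up for this example.

```python
def ranges(n: int) -> dict[str, tuple[int, int]]:
    """Return (minimum, maximum) for each n-bit integer representation."""
    return {
        "usual": (0, 2**n - 1),
        "sign-and-magnitude": (-(2**(n - 1) - 1), 2**(n - 1) - 1),
        "two's complement": (-(2**(n - 1)), 2**(n - 1) - 1),
    }

for name, (low, high) in ranges(8).items():
    print(name, low, high)
```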
[Figure comparing the usual, sign-and-magnitude and two’s complement representations]
1.0 / 2 = 0.1?

06
Representing Real Numbers
Finally, let’s explore how we represent
numbers with a decimal point.
Representing Real Numbers
The float datatype includes both positive and negative fractional
numbers, but the representations we have explored thus far have only
represented whole numbers (also called integers).

We will look at two methods for representing real numbers:

Fixed point: an intuitive method, wherein a binary number has a fixed
binary point.

Floating point: the standard approach, wherein the binary point is
not fixed, but ‘floats’.
Real Numbers – Fixed point
How do we represent real numbers?

We separate the number into an integer part and a fractional
part with a decimal point ‘.’!

2^7 2^6 2^5 2^4 2^3 2^2 2^1 2^0

We can do something similar in binary very easily.

Let’s say we have a fixed representation of 8 bits. We can
choose to split those bits into (for example) a 5-bit integer
part and a 3-bit fractional part.
Real Numbers – Fixed point
The integer part is straightforward; but what should the value
of the fractional part’s bits be?

In denary, these would be 0.1, 0.01, 0.001…

2^4 2^3 2^2 2^1 2^0 2^-1 2^-2 2^-3

Integer part Fractional part

…so in binary, they should be 0.5, 0.25, 0.125!


Real Numbers – Fixed point
Binary-to-denary works exactly the same with fixed point binary
numbers.

16 8 4 2 1 0.5 0.25 0.125


1 0 1 1 1 0 1 1

Integer part Fractional part


What denary number does 10111011 represent in our 8-bit (5-bit
integer part, 3-bit fractional part) fixed point representation?
(Note that we would often write this as 10111.011 to make the binary point
explicit.)
Real Numbers – Fixed point
16 8 4 2 1 0.5 0.25 0.125
1 0 1 1 1 0 1 1

16 + 4 + 2 + 1 + 0.25 + 0.125 = 23.375
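Binary-to-denary for fixed point can be sketched by adding up the place value of every bit that is set. The function name decode_fixed_point and its frac_bits parameter are assumptions for this example.

```python
def decode_fixed_point(bits: str, frac_bits: int) -> float:
    """Convert an unsigned fixed point bit string to a denary number."""
    int_bits = len(bits) - frac_bits
    value = 0.0
    for position, bit in enumerate(bits):
        exponent = int_bits - 1 - position    # 4, 3, ..., 0, -1, -2, -3
        if bit == "1":
            value += 2.0 ** exponent          # add this bit's place value
    return value

print(decode_fixed_point("10111011", 3))      # 5-bit integer part, 3-bit fractional part
```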
Real Numbers – Fixed point
The left-to-right denary-to-binary method also still works.

16 8 4 2 1 0.5 0.25 0.125


9.625
Integer part Fractional part
Real Numbers – Fixed point
Converting 9.625, we work left-to-right and subtract each place
value that fits into what remains:

Place value    Fits?    Bit    Remaining
16             no       0      9.625
8              yes      1      1.625
4              no       0      1.625
2              no       0      1.625
1              yes      1      0.625
0.5            yes      1      0.125
0.25           no       0      0.125
0.125          yes      1      0

So, 9.625 is 01001.101 (or 01001101) in this fixed point
representation.

16 8 4 2 1 0.5 0.25 0.125
0 1 0 0 1 1 0 1

Integer part Fractional part
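The left-to-right method can be sketched as a loop over the place values. This is a simplified illustration for non-negative numbers only; the function name encode_fixed_point is made up, and any value too small for the last place is simply dropped.

```python
def encode_fixed_point(value: float, int_bits: int, frac_bits: int) -> str:
    """Left-to-right denary-to-binary conversion for unsigned fixed point."""
    bits = []
    remaining = value
    for exponent in range(int_bits - 1, -frac_bits - 1, -1):
        place = 2.0 ** exponent               # 16, 8, 4, 2, 1, 0.5, 0.25, 0.125
        if place <= remaining:
            bits.append("1")                  # the place value fits: subtract it
            remaining -= place
        else:
            bits.append("0")
    return "".join(bits)

print(encode_fixed_point(9.625, 5, 3))
```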


Real Numbers – Fixed point
The right-to-left denary-to-binary method is a bit different.

First, we compute the integer part of the number exactly the
same as before (divide by 2 and keep the remainder).

16 8 4 2 1 0.5 0.25 0.125


9
Integer part Fractional part
Real Numbers – Fixed point
For the integer part 9, each remainder is written into the next bit,
working from right to left:

9 / 2 = 4 rem 1
4 / 2 = 2 rem 0
2 / 2 = 1 rem 0
1 / 2 = 0 rem 1

The quotient has reached 0, so the remaining bit is 0.

16 8 4 2 1 0.5 0.25 0.125
0 1 0 0 1

Integer part Fractional part
Real Numbers – Fixed point
For the fractional part, we do the opposite: we go left-to-right
and multiply by 2 rather than divide by 2.

16 8 4 2 1 0.5 0.25 0.125


0.625 0 1 0 0 1

Integer part Fractional part


Following each multiplication, we place the integer part of the
number in the next bit and keep the fractional part.

0.625
Real Numbers – Fixed point
For the fractional part 0.625:

0.625 * 2 = 1.25; write 1 and keep 0.25
0.25 * 2 = 0.5; write 0 and keep 0.5
0.5 * 2 = 1.0; write 1 and keep 0

0 reached!

16 8 4 2 1 0.5 0.25 0.125
0 1 0 0 1 1 0 1

Integer part Fractional part
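Both halves of the right-to-left method can be sketched together: repeated division by 2 for the integer part, then repeated multiplication by 2 for the fractional part. This is an illustration for non-negative numbers; the function name encode_right_to_left is made up.

```python
def encode_right_to_left(value: float, int_bits: int, frac_bits: int) -> str:
    """Denary-to-binary using divide-by-2 and multiply-by-2 steps."""
    whole = int(value)
    fraction = value - whole

    int_part = []
    for _ in range(int_bits):
        whole, remainder = divmod(whole, 2)   # divide by 2, keep the remainder
        int_part.append(str(remainder))
    int_part.reverse()                        # remainders were found right-to-left

    frac_part = []
    for _ in range(frac_bits):
        fraction *= 2                         # multiply by 2...
        bit = int(fraction)                   # ...write the integer part...
        frac_part.append(str(bit))
        fraction -= bit                       # ...and keep the fractional part

    return "".join(int_part) + "." + "".join(frac_part)

print(encode_right_to_left(9.625, 5, 3))
```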
Real Numbers – Fixed point
Another way to understand fixed point binary numbers is as
scaled binary integers.

For example, let’s use a 7-bit fixed point representation (4-bit
integer, 3-bit fractional).

The number 3.5 can be represented as 0011100.

8 4 2 1 0.5 0.25 0.125
0 0 1 1 1 0 0

In the usual positive integer representation, this would be the
number 28…

64 32 16 8 4 2 1
0 0 1 1 1 0 0

…and 28 = 3.5 * 2^3.


Real Numbers – Fixed point
So an easy way to convert between an n-bit binary integer X and a
fixed point (with i-bit fractional part) binary number B is:

X = B * 2^i (or X = B << i)

For example, 3.5 is represented as 0011100 in a 7-bit fixed point
representation with a 3-bit fractional part. Left shifting by 3 moves
the binary point past the fractional bits, giving the integer 28,
and 28 = 3.5 * 2^3.
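The scaling idea can be sketched with ordinary integers. The names FRAC_BITS, to_scaled and from_scaled are assumptions for this example, and only values that fit exactly in the fractional bits are considered.

```python
FRAC_BITS = 3                              # i, the size of the fractional part

def to_scaled(value: float) -> int:
    """X = B * 2^i: the integer whose bits match the fixed point number."""
    return int(value * (1 << FRAC_BITS))

def from_scaled(scaled: int) -> float:
    """B = X / 2^i: recover the fixed point value from the integer."""
    return scaled / (1 << FRAC_BITS)

print(to_scaled(3.5))                      # the integer 28
print(format(to_scaled(3.5), "07b"))       # the same 7 bits as the fixed point number
print(from_scaled(28))                     # back to 3.5
```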


Real Numbers – Fixed point
Of course, we can also represent negative real numbers.

How would you represent -59.25 as a 10-bit sign-and-magnitude
fixed point binary number with a 2-bit fractional part?

– 64 32 16 8 4 2 1 0.5 0.25
Real Numbers – Fixed point
– 64 32 16 8 4 2 1 0.5 0.25
1 0 1 1 1 0 1 1 0 1

The sign bit is 1, and the magnitude is
32 + 16 + 8 + 2 + 1 + 0.25 = 59.25.
Real Numbers – Fixed point
Of course, we can also represent negative real numbers.

How would you represent -0.533203125 as a 10-bit two’s complement
fixed point binary number with a 9-bit fractional part?

–1 0.5 0.25 0.125 2^-4 2^-5 2^-6 2^-7 2^-8 2^-9


Real Numbers – Fixed point
There are many ways of doing this. Here’s how I worked this out:
● -0.533203125 * 2^9 = -273
● 273 in standard 10-bit binary = 0100010001 (using left-to-right method)
● Flip all bits and add 1 to convert to two’s complement: 1011101111

–1 0.5 0.25 0.125 2^-4 2^-5 2^-6 2^-7 2^-8 2^-9
1 0 1 1 1 0 1 1 1 1
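The same working can be sketched in Python: scale by 2^9 to get an integer, then keep the lowest 10 bits, which gives the two’s complement pattern. The function name encode_twos_fixed is made up, and the sketch assumes the value fits exactly in the chosen format.

```python
def encode_twos_fixed(value: float, width: int, frac_bits: int) -> str:
    """Two's complement fixed point via a scaled integer."""
    scaled = int(value * 2**frac_bits)        # e.g. -0.533203125 * 2^9 = -273
    pattern = scaled & ((1 << width) - 1)     # -X is stored as 2^n - X
    return format(pattern, f"0{width}b")

print(encode_twos_fixed(-0.533203125, 10, 9))
```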
Real Numbers – Fixed point
Fixed point isn’t perfect for representing real numbers.

Let’s try to represent 0.03125 with a 5-bit fixed point
representation with a 4-bit fractional part.

1 0.5 0.25 0.125 0.0625
0 0 0 0 0

The smallest place value is 0.0625, so in this representation,
0.03125 = 0!


Real Numbers – Fixed point
If we had chosen a 5-bit fractional part, we could have
represented it…

0.5 0.25 0.125 0.0625 0.03125
0 0 0 0 1

…but then we couldn’t have represented 1.5, for example.

1 0.5 0.25 0.125 0.0625
1 1 0 0 0


Real Numbers – Fixed point
If we have access to a given number of bits for our numbers (in
this case 5), it would be helpful to be able to change how large
or small the fractional part is according to demand.

1 0.5 0.25 0.125 0.0625
0 0 0 0 0

0.5 0.25 0.125 0.0625 0.03125
0 0 0 0 1

This is called a floating point representation.


Real Numbers – Floating point
A floating point representation is like scientific notation, but
in binary.

Scientific notation is when a denary (and usually fractional)
number is represented in the form m * 10^e, where:

● m is a denary (and usually fractional) number
● e is a denary integer

For example:

● 1025.6 = 1.0256 * 10^3
● 0.086 = 8.6 * 10^-2
● 1.099 = 1.099 * 10^0
Real Numbers – Floating point
We could follow this convention in exactly the same way in
binary!

A binary (usually fractional) number will be written in the
form m * 2^e, where:

● m is a binary (and usually fractional) number
● e is a binary integer

For example:

● 1011.1 = 1.0111 * 2^3
● 0.011 = 1.1 * 2^-2
● 1.011 = 1.011 * 2^0
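Finding m and e can be sketched by repeatedly halving or doubling until exactly one bit is left before the binary point. This is only an illustration for positive numbers; the function name normalise is made up, and the three inputs are the binary examples above written as denary fractions.

```python
def normalise(x: float) -> tuple[float, int]:
    """Return (m, e) with x == m * 2**e and 1 <= m < 2. Requires x > 0."""
    m, e = x, 0
    while m >= 2:        # too big: move the binary point left
        m /= 2
        e += 1
    while m < 1:         # too small: move the binary point right
        m *= 2
        e -= 1
    return m, e

print(normalise(0b10111 / 2))   # 1011.1 in binary (11.5 in denary)
print(normalise(0b011 / 8))     # 0.011 in binary (0.375 in denary)
print(normalise(0b1011 / 8))    # 1.011 in binary (1.375 in denary)
```

The returned m values are 1.4375, 1.5 and 1.375 in denary, which are 1.0111, 1.1 and 1.011 in binary.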
Real Numbers – Floating point
So, the idea of floating point is to represent binary numbers
as two numbers, m and e, such that n = m * 2^e.

How can we store this in a (let’s say 16-bit) computer?

m e
The representation can be split into two parts: the part that
represents m and the part that represents e.
In general, the mantissa m needs to have a higher number of
bits than the exponent e, so let’s have a 12-bit mantissa and a
4-bit exponent.
Real Numbers – Floating point
There are now three problems we need to deal with:
1. How should we represent the mantissa?
2. How should we represent negative numbers?
3. How should we represent the exponent?

m e
Real Numbers – Floating point
In our examples of both denary and binary scientific notation,
m was always normalised: this means it always had exactly one
digit/bit before the decimal/binary point.

m e

So, it would seem to make sense to represent m as a fixed point
number, with only a 1-bit integer part…
However, this is not an efficient use of space. Why?
Real Numbers – Floating point
For binary numbers, the normalised scientific notation of every
number other than zero starts with “1.”, because 1 is the only
non-zero bit. The leading bit therefore carries no information.

m e
We could simply remember that numbers should start with 1. and
effectively hide this extra bit.

(This does mean we now can’t represent zero, but we will fix this later.)
Real Numbers – Floating point
1. How should we represent the mantissa?
2. How should we represent negative numbers?
3. How should we represent the exponent?
4. How should we represent zero?

m e
Real Numbers – Floating point
Floating point numbers represent negative numbers by using a
sign bit.

sign m e
This sign bit gives the sign of the mantissa (and therefore the
overall represented number), not the exponent.

In general, we can assume it works as it did in sign-and-magnitude
representation: 0 for positive numbers, 1 for negative numbers.
Real Numbers – Floating point
The exponent needs to be able to take on both positive and negative
values. So, presumably, it either needs its own sign bit or to be
stored in two’s complement.

sign m e
However, both have their issues. In particular, with either choice
the order of the bit-strings no longer matches the order of the
numbers they represent. For example, in 4-bit two’s complement,
-1 is 1111, which as a bit-string comes after 0001 (+1).

This makes comparing numbers in the computer much less efficient!


Real Numbers – Floating point
In order to make sure floating point numbers could be compared
quickly in the computer, two things need to be done.

First, the exponent needs to come after the sign bit and before m.

sign e* m
Secondly, as introduced in 1954 with the IBM 704, the exponent
needs to be stored with an offset.

[Figure: the same bit-strings interpreted in sign-and-magnitude
representation, two’s complement representation, and offset/biased
representation when b = 3.]
Real Numbers – Floating point
An offset exponent, sometimes called a biased exponent, is used
instead of two’s complement so that the representations of negative
exponents are ordered earlier than the representations of positive
exponents.

sign e* m
Given a bias b, the true exponent e can be computed from the offset
exponent e* by calculating e = e* - b (thus also e* = e + b).

Additionally, as shown in the previous picture, the all 0s and all
1s values are not used: they are reserved for special values.
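
The ordering property can be illustrated with a minimal sketch. It assumes a 4-bit exponent field with bias b = 7 (the values used in this deck's worked examples); the function names are hypothetical.

```python
EXP_BITS = 4   # assumed width of the exponent field
BIAS = 7       # assumed bias b


def encode_exponent(e):
    """Return the biased exponent e* = e + b as a bit string."""
    e_star = e + BIAS
    # All 0s and all 1s are reserved for special values.
    if not 1 <= e_star <= 2**EXP_BITS - 2:
        raise ValueError(f"exponent {e} is out of range")
    return format(e_star, f"0{EXP_BITS}b")


def decode_exponent(bits):
    """Return the true exponent e = e* - b."""
    return int(bits, 2) - BIAS


def twos_complement(e):
    """Return e as a two's complement bit string, for comparison."""
    return format(e & (2**EXP_BITS - 1), f"0{EXP_BITS}b")


exponents = list(range(-6, 8))                 # every representable exponent
biased = [encode_exponent(e) for e in exponents]
twos = [twos_complement(e) for e in exponents]

print(biased == sorted(biased))   # True: bit-string order matches numeric order
print(twos == sorted(twos))       # False: negative exponents sort after positive
```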
Real Numbers – Floating point
Because of the hidden bit, zero cannot be represented.

However, we reserved the biased exponent value of 0000 for
special values.

sign | e*      | m
0    | 0 0 0 0 | 0 0 0 0 0 0 0 0 0 0 0

Zero is one of these special values: it is represented by a 0 sign
bit, 0 biased exponent and 0 mantissa.
Real Numbers – Floating point
Other special values include +∞, -∞ and NaN.

Positive and negative infinity are the values which occur when a
floating point result overflows, meaning its magnitude is too large
to represent: +∞ for positive results and -∞ for negative results.

sign | e*      | m
0    | 1 1 1 1 | 0 0 0 0 0 0 0 0 0 0 0

+∞ is represented as a 0 sign bit, all 1s biased exponent and 0
mantissa.

sign | e*      | m
1    | 1 1 1 1 | 0 0 0 0 0 0 0 0 0 0 0

-∞ is represented as a 1 sign bit, all 1s biased exponent and 0
mantissa.

NaN means “not a number”, which occurs when we do strange things,
like take the square root of a negative number.

sign | e*      | m
0    | 1 1 1 1 | 1 1 1 0 1 0 1 0 1 0 1

NaN is represented as any sign bit, all 1s biased exponent and any
mantissa except 0.
Real Numbers – Floating point
The standard for floating point arithmetic is IEEE 754. It consists
of the features we introduced: a sign bit, a biased exponent, a
mantissa with a hidden bit and special values such as +∞, -∞ and
NaN.

sign e* m
The two most commonly used IEEE 754 formats are:
● Single precision (32-bit numbers with 8-bit exponent, 23-bit
  mantissa and bias 127)
● Double precision (64-bit numbers with 11-bit exponent, 52-bit
  mantissa and bias 1023)
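
The special values can be inspected in real single precision numbers with a short sketch using Python's standard `struct` module. The helper name `float32_fields` is an assumption for illustration.

```python
import math
import struct


def float32_fields(x):
    """Split x, stored as IEEE 754 single precision, into
    (sign bit, biased exponent, mantissa bits)."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31
    e_star = (word >> 23) & 0xFF     # 8-bit biased exponent
    mantissa = word & 0x7FFFFF       # 23-bit mantissa, hidden bit not stored
    return sign, e_star, mantissa


print(float32_fields(0.0))         # (0, 0, 0)
print(float32_fields(math.inf))    # (0, 255, 0)
print(float32_fields(-math.inf))   # (1, 255, 0)
print(float32_fields(1.0))         # (0, 127, 0): true exponent 127 - 127 = 0

sign, e_star, mantissa = float32_fields(math.nan)
print(e_star == 255 and mantissa != 0)   # True: all 1s exponent, non-zero mantissa
```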
Real Numbers – Floating point
Let’s do some examples with our hypothetical 16-bit (4-bit
exponent) IEEE 754-style representation, with an exponent bias of
b = 7.

Which number is represented by the following?

sign | e*      | m
 –   | 8 4 2 1 | 1024 512 256 128 64 32 16 8 4 2 1
 1   | 0 1 1 0 | 0 1 1 0 0 1 1 0 0 0 0
Real Numbers – Floating point
Decoding the bit pattern 1 0110 01100110000 (sign, e*, m) with an
exponent bias of b = 7:

The sign bit is 1, so it’s a negative number.
The biased exponent is (0110)2 = 6, so the true exponent is 6 - 7 = -1.
The mantissa with the hidden bit is (1.01100110000)2 = (1.0110011)2.
The answer in fixed point is -(1.0110011)2 * 2^-1 = -(0.10110011)2.

Place value:  2^0  2^-1  2^-2  2^-3  2^-4  2^-5  2^-6  2^-7  2^-8
Bit:          0    1     0     1     1     0     0     1     1

The answer in denary is -(2^-1 + 2^-3 + 2^-4 + 2^-7 + 2^-8) = -0.69921875.
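
The decoding steps above can be sketched in code. This assumes the deck's hypothetical layout (1 sign bit, 4-bit biased exponent with b = 7, 11-bit mantissa with a hidden leading 1); the function name `decode16` is an assumption, and special values are rejected rather than handled.

```python
BIAS = 7   # assumed exponent bias b


def decode16(bits):
    """Decode a bit string laid out as sign | e* (4 bits) | m (11 bits)."""
    bits = bits.replace(" ", "")
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected 16 binary digits")
    sign = -1 if bits[0] == "1" else 1
    e_star = int(bits[1:5], 2)
    if e_star in (0, 15):
        raise ValueError("all 0s / all 1s exponents are reserved for special values")
    fraction = int(bits[5:], 2) / 2**11      # the 11 stored mantissa bits
    mantissa = 1 + fraction                  # restore the hidden bit
    return sign * mantissa * 2.0 ** (e_star - BIAS)


print(decode16("1 0110 01100110000"))   # -0.69921875
```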
Real Numbers – Floating point
Let’s do some examples with our hypothetical 16-bit (4-bit
exponent) IEEE 754-style representation, with an exponent bias of
b = 7.

How do we represent 142.5?

Convert it to fixed point: (10001110.1)2.
Write it in normalised scientific notation: (1.00011101)2 * 2^7.
The mantissa is therefore 00011101, padded with 0s on the right to
fill all 11 bits: 00011101000.
The true exponent is 7, so the biased exponent is 7 + 7 = 14 = (1110)2.
The number is positive, so the sign bit is 0.

sign | e*      | m
 –   | 8 4 2 1 | 1024 512 256 128 64 32 16 8 4 2 1
 0   | 1 1 1 0 | 0 0 0 1 1 1 0 1 0 0 0
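
The encoding steps can be sketched the same way. The layout and bias are the deck's hypothetical 16-bit format; the function name `encode16` is an assumption, and extra mantissa bits are simply truncated here, which is a simplification (real IEEE 754 hardware rounds).

```python
import math

BIAS = 7   # assumed exponent bias b


def encode16(x):
    """Encode x as 'sign e* m' with a 4-bit biased exponent and 11-bit mantissa."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("special values are not handled in this sketch")
    sign = "1" if x < 0 else "0"
    m, e = math.frexp(abs(x))        # 0.5 <= m < 1
    m, e = m * 2, e - 1              # normalise so that 1 <= m < 2
    e_star = e + BIAS
    if not 1 <= e_star <= 14:        # 0000 and 1111 are reserved
        raise OverflowError("exponent does not fit in 4 bits")
    fraction = int((m - 1) * 2**11)  # drop the hidden bit, keep 11 bits
    return f"{sign} {e_star:04b} {fraction:011b}"


print(encode16(142.5))   # 0 1110 00011101000
```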
Real Numbers

Floating point is a gold standard for representing real numbers
in binary on a computer.
It allows us to represent a wide array of numbers dynamically,
as we are able to shift the binary point left or right to get
the precision we require.
However, some numbers, no matter how much precision we have, can
never be represented exactly in binary.

Place value:  1   0.5   0.25   0.125   0.0625
Bit:          0   0     0      0       0

Place value:  0.5   0.25   0.125   0.0625   0.03125
Bit:          0     0      0       0        1
Real Numbers

Even with a huge number of bits, we will not be able to
represent many real numbers.
In denary, numbers such as π and 1/3 cannot be represented
without an infinite number of digits!
In binary, numbers such as those above, and others such as 0.1,
cannot be represented without an infinite number of bits!

Binary expansion of 0.1:

Place value:  2^-1 2^-2 2^-3 2^-4 2^-5 2^-6 2^-7 2^-8 2^-9 2^-10 2^-11 2^-12 2^-13 2^-14 2^-15 …
Bit:          0    0    0    1    1    0    0    1    1    0     0     1     1     0     0     …
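
The repeating pattern for 0.1 can be reproduced with a short sketch. It uses Python's standard `fractions.Fraction` so the arithmetic is exact; the function name `binary_fraction_bits` is an assumption for illustration.

```python
from fractions import Fraction


def binary_fraction_bits(x, n):
    """Return the first n bits after the binary point of x, where 0 <= x < 1."""
    bits = []
    for _ in range(n):
        x *= 2             # shift the next bit in front of the binary point
        bit = int(x)       # that bit is the integer part (0 or 1)
        bits.append(str(bit))
        x -= bit
    return "".join(bits)


print(binary_fraction_bits(Fraction(1, 10), 15))   # 000110011001100
print(Fraction(0.1) == Fraction(1, 10))            # False: the stored double is not exactly 0.1
print(0.1 + 0.2 == 0.3)                            # False: the rounding errors differ
```

The pattern 0011 repeats forever, so any finite mantissa has to cut it off somewhere.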
Real Numbers

Research is ongoing in the field of representing real numbers
exactly.
This includes the development of data types in programming
languages that allow one to represent any real number, not
just those that can be represented using floating point.

Shameless plug: If you’re interested, come and do your MSc
project with me!
Topic Recap
01 Everything on a computer is represented in binary.

02 Binary is the base 2 number system that consists only of 0s and 1s.

03 Colour and text can be represented in different ways, depending on
how many bits we have available.

04 We can perform logical (AND, OR, NOT, XOR) and arithmetic (+, <<, >>, -)
operations on binary words.

05 Negative numbers can be represented either by using a sign bit or
by using the two’s complement representation.

06 Real numbers can be represented either by using a fixed point or
a floating point representation.
Further Reading
Forouzan (2023) Foundations of Computer Science. Cengage Learning.

Chapter 1: Introduction

Chapter 2: Number Systems

Chapter 3: Data Storage

Chapter 4: Operations on Data


Feedback
Tell us what you thought about this
topic and its lectures!
Thanks!
Do you have any questions?

[email protected]

Office Hours:
10:00-12:00 Wednesdays in CS/109

CREDITS: This presentation template was created by Slidesgo, and
includes icons by Flaticon, and infographics & images by Freepik
