Sorting

Merge Sort

MergeSort(A, left, right) {


if (left < right) {
mid = floor((left + right) / 2);
MergeSort(A, left, mid);
MergeSort(A, mid+1, right);
Merge(A, left, mid, right);
}
}
// Merge() takes two sorted subarrays of A and
// merges them into a single sorted subarray of A.
// It runs in O(n) time and *does* require
// allocating O(n) extra space
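For concreteness, here is a minimal C sketch of such a Merge() — the deck leaves the implementation unspecified, so this is one possible version, assuming 0-based indices and a heap-allocated temporary buffer:

#include <stdlib.h>

/* Merge the sorted subarrays A[left..mid] and A[mid+1..right]
   into a single sorted subarray A[left..right].
   Uses an O(n) temporary buffer, as noted above. */
void Merge(int A[], int left, int mid, int right) {
    int n = right - left + 1;
    int *tmp = malloc(n * sizeof(int));
    int i = left, j = mid + 1, k = 0;
    while (i <= mid && j <= right)            /* take the smaller head element */
        tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
    while (i <= mid)   tmp[k++] = A[i++];     /* copy any leftovers */
    while (j <= right) tmp[k++] = A[j++];
    for (k = 0; k < n; k++) A[left + k] = tmp[k];
    free(tmp);
}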
Analysis of Merge Sort

Statement                               Effort
MergeSort(A, left, right) {             T(n)
  if (left < right) {                   Θ(1)
    mid = floor((left + right) / 2);    Θ(1)
    MergeSort(A, left, mid);            T(n/2)
    MergeSort(A, mid+1, right);         T(n/2)
    Merge(A, left, mid, right);         Θ(n)
  }
}
● So T(n) = Θ(1) when n = 1, and
T(n) = 2T(n/2) + Θ(n) when n > 1
● This expression is a recurrence
The Master Theorem
● Given: a divide and conquer algorithm
■ An algorithm that divides the problem of size n
into a subproblems, each of size n/b
■ Let the cost of each stage (i.e., the work to divide
the problem + combine solved subproblems) be
described by the function f(n)
● Then, the Master Theorem gives us a
cookbook for the algorithm’s running time:
The Master Theorem
● if T(n) = aT(n/b) + f(n) then

           Θ(n^(log_b a))          if f(n) = O(n^(log_b a − ε)) for some constant ε > 0
  T(n) =   Θ(n^(log_b a) · lg n)   if f(n) = Θ(n^(log_b a))
           Θ(f(n))                 if f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0,
                                   AND a·f(n/b) ≤ c·f(n) for some c < 1 and all large n
Using The Master Method
● T(n) = 9T(n/3) + n
■ a = 9, b = 3, f(n) = n
■ n^(log_b a) = n^(log_3 9) = Θ(n²)
■ Since f(n) = O(n^(log_3 9 − ε)), where ε = 1, case 1 applies:
T(n) = Θ(n^(log_b a)) when f(n) = O(n^(log_b a − ε))
■ Thus the solution is T(n) = Θ(n²)
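As a quick cross-check (not spelled out in the slides), the merge sort recurrence derived earlier fits case 2 of the same cookbook:

T(n) = 2T(n/2) + \Theta(n): \quad a = 2,\; b = 2,\; f(n) = \Theta(n)

n^{\log_b a} = n^{\log_2 2} = n = \Theta(f(n)) \;\Rightarrow\; T(n) = \Theta(n \lg n)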
Sorting
● So far we’ve talked about two algorithms to
sort an array of numbers
■ What is the advantage of merge sort?
■ What is the advantage of insertion sort?
● Next: Heapsort
■ Combines advantages of both previous algorithms
Heaps
● A heap can be seen as a complete binary tree:

16

14 10

8 7 9 3

2 4 1

■ What makes a binary tree complete?


■ Is the example above complete?
Heaps
● A heap can be seen as a complete binary tree:
16

14 10

8 7 9 3

2 4 1

The CLR book calls them “nearly complete” binary trees; can
think of unfilled slots as null pointers
Heaps
● In practice, heaps are usually implemented as
arrays:
16

14 10

8 7 9 3
A = 16 14 10 8 7 9 3 2 4 1
2 4 1
Heaps
● To represent a complete binary tree as an array:
■ The root node is A[1]
■ The parent of node i is A[i/2] (note: integer divide)
■ The left child of node i is A[2i]
■ The right child of node i is A[2i + 1]

16

14 10

8 7 9 3

A = 16 14 10 8 7 9 3 2 4 1
2 4 1
Referencing Heap Elements

● So…
Parent(i) { return i/2; }
Left(i) { return 2*i; }
Right(i) { return 2*i + 1; }
The Heap Property
● Heaps also satisfy the heap property:
A[Parent(i)]  A[i], for all nodes i > 1
■ In other words, the value of a node is at most the
value of its parent
■ Where is the largest element in a heap stored?
Heap Height
● What is the height of an n-element heap?
Why?
● This is nice: basic heap operations take at most
time proportional to the height of the heap
Heap Operations: Heapify()
● Heapify(): maintain the heap property
■ Given: a node i in the heap with children l and r
■ Given: two subtrees rooted at l and r, assumed to be
heaps
■ Problem: The subtree rooted at i may violate the heap
property (How?)
■ Action: let the value of the parent node “float down”
so subtree at i satisfies the heap property
○ What do you suppose will be the basic operation between i,
l, and r?
Heap Operations: Heapify()
Heapify(A, i)
{
l = Left(i); r = Right(i);
if (l <= heap_size(A) && A[l] > A[i])
largest = l;
else
largest = i;
if (r <= heap_size(A) && A[r] > A[largest])
largest = r;
if (largest != i) {
Swap(A, i, largest);
Heapify(A, largest);
}
}
Heapify() Example

16

4 10

14 7 9 3

2 8 1

A = 16 4 10 14 7 9 3 2 8 1
(Heapify(A, 2): node 2 holds 4; its larger child, node 4, holds 14, so they swap)
Heapify() Example

16

14 10

4 7 9 3

2 8 1

A = 16 14 10 4 7 9 3 2 8 1
(Heapify(A, 4): node 4 now holds 4; its larger child, node 9, holds 8, so they swap)
Heapify() Example

16

14 10

8 7 9 3

2 4 1

A = 16 14 10 8 7 9 3 2 4 1
(Heapify(A, 9): node 9 is a leaf, so the recursion stops; the heap property is restored)
Analyzing Heapify()
● Aside from the recursive call, what is the
running time of Heapify()?
● How many times can Heapify() recursively
call itself?
● What is the worst-case running time of
Heapify() on a heap of size n?
Analyzing Heapify()
● Fixing up the relationships between i, l, and r
takes Θ(1) time
● If the heap at i has n elements, how many
elements can the subtrees at l or r have?

● Answer: 2n/3 (worst case: bottom row 1/2 full)


● So time taken by Heapify() is given by
T(n) ≤ T(2n/3) + Θ(1)
Analyzing Heapify()
● So we have
T(n)  T(2n/3) + (1)
● By case 2 of the Master Theorem,
T(n) = O(lg n)
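Spelling out the Master Theorem parameters (treating the inequality as an equality to obtain an upper bound):

T(n) = T(2n/3) + \Theta(1): \quad a = 1,\; b = 3/2,\; f(n) = \Theta(1)

n^{\log_{3/2} 1} = n^{0} = 1 = \Theta(f(n)) \;\Rightarrow\; T(n) = O(\lg n)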
Heap Operations: BuildHeap()
● We can build a heap in a bottom-up manner by
running Heapify() on successive subarrays
■ Fact: for array of length n, all elements in range
A[n/2 + 1 .. n] are heaps (Why?)
■ So:
○ Walk backwards through the array from n/2 to 1, calling
Heapify() on each node.
○ Order of processing guarantees that the children of node
i are heaps when i is processed
BuildHeap()
// given an unsorted array A, make A a heap
BuildHeap(A)
{
heap_size(A) = length(A);
for (i = length[A]/2 downto 1)
Heapify(A, i);
}
BuildHeap() Example
● Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}

4

1 3

2 16 9 10

14 8 7
Analyzing BuildHeap()
● Each call to Heapify() takes O(lg n) time
● There are O(n) such calls (specifically, n/2)
● Thus the running time is O(n lg n)
■ Is this a correct asymptotic upper bound?
■ Is this an asymptotically tight bound?
Analyzing BuildHeap()
● Each call to Heapify() takes O(lg n) time
● There are O(n) such calls (specifically, n/2)
● Thus the running time is O(n lg n)
■ Is this a correct asymptotic upper bound?
■ Is this an asymptotically tight bound?
● A tighter bound is O(n)
Analyzing BuildHeap()
● Each call to Heapify() takes O(lg n) time
● There are O(n) such calls (specifically, n/2)
● Thus the running time is O(n lg n)
■ Is this a correct asymptotic upper bound?
■ Is this an asymptotically tight bound?
● A tighter bound is O(n)
■ How can this be? Is there a flaw in the above
reasoning?
Analyzing BuildHeap(): Tight
● To Heapify() a subtree takes O(h) time
where h is the height of the subtree
■ h = O(lg m), m = # nodes in subtree
■ The height of most subtrees is small
● Fact: an n-element heap has at most n/2h+1
nodes of height h
● We can use this fact to prove that
BuildHeap() takes O(n) time
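A sketch of that calculation: sum the O(h) cost of Heapify() over the at most ⌈n/2^(h+1)⌉ nodes of each height h, then apply the standard bound \sum_{h \ge 0} h/2^h = 2:

T(n) \;=\; \sum_{h=0}^{\lfloor \lg n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h)
      \;=\; O\!\left( n \sum_{h=0}^{\lfloor \lg n \rfloor} \frac{h}{2^{h}} \right)
      \;=\; O(n)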
Heapsort

● Given BuildHeap(), an in-place sorting
algorithm is easily constructed:
■ Maximum element is at A[1]
■ Discard by swapping with element at A[n]
○ Decrement heap_size[A]
○ A[n] now contains correct value
■ Restore heap property at A[1] by calling Heapify()
■ Repeat, always swapping A[1] for A[heap_size(A)]
Heapsort
Heapsort(A)
{
BuildHeap(A);
for (i = length(A) downto 2)
{
Swap(A[1], A[i]);
heap_size(A) -= 1;
Heapify(A, 1);
}
}
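Putting the three routines together, here is one possible C realization of the pseudocode above — a sketch, not the deck's own code. It uses 0-based arrays (so the root is A[0] and Left(i) = 2i+1), whereas the slides' pseudocode is 1-based, so the index arithmetic differs slightly:

#include <stdio.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Float A[i] down until the subtree rooted at i is a max-heap. */
static void heapify(int A[], int heap_size, int i) {
    int l = 2*i + 1, r = 2*i + 2, largest = i;
    if (l < heap_size && A[l] > A[largest]) largest = l;
    if (r < heap_size && A[r] > A[largest]) largest = r;
    if (largest != i) {
        swap(&A[i], &A[largest]);
        heapify(A, heap_size, largest);
    }
}

static void build_heap(int A[], int n) {
    for (int i = n/2 - 1; i >= 0; i--)   /* children of i are already heaps */
        heapify(A, n, i);
}

void heapsort(int A[], int n) {
    build_heap(A, n);
    for (int i = n - 1; i >= 1; i--) {
        swap(&A[0], &A[i]);   /* move current max to its final slot */
        heapify(A, i, 0);     /* restore heap property on the shrunk heap */
    }
}

int main(void) {
    int A[] = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7};  /* the BuildHeap() example */
    heapsort(A, 10);
    for (int i = 0; i < 10; i++) printf("%d ", A[i]);  /* 1 2 3 4 7 8 9 10 14 16 */
    printf("\n");
    return 0;
}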
Analyzing Heapsort
● The call to BuildHeap() takes O(n) time
● Each of the n - 1 calls to Heapify() takes
O(lg n) time
● Thus the total time taken by HeapSort()
= O(n) + (n - 1) O(lg n)
= O(n) + O(n lg n)
= O(n lg n)
Priority Queues
● Heapsort is a nice algorithm, but in practice
Quicksort (coming up) usually wins
● But the heap data structure is incredibly useful
for implementing priority queues
■ A data structure for maintaining a set S of elements,
each with an associated value or key
■ Supports the operations Insert(), Maximum(),
and ExtractMax()
■ What might a priority queue be useful for?
Priority Queue Operations
● Insert(S, x) inserts the element x into set S
● Maximum(S) returns the element of S with
the maximum key
● ExtractMax(S) removes and returns the
element of S with the maximum key
● How could we implement these operations
using a heap?
Implementing Priority Queues
HeapInsert(A, key) // what’s running time?
{
heap_size[A] ++;
i = heap_size[A];
while (i > 1 AND A[Parent(i)] < key)
{
A[i] = A[Parent(i)];
i = Parent(i);
}
A[i] = key;
}
Implementing Priority Queues
HeapMaximum(A)
{
return A[1];
}
Implementing Priority Queues
HeapExtractMax(A)
{
if (heap_size[A] < 1) { error; }
max = A[1];
A[1] = A[heap_size[A]];
heap_size[A] --;
Heapify(A, 1);
return max;
}
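● For reference, answering the running-time questions above: HeapInsert() walks up at most the height of the heap, so it runs in O(lg n) time; HeapMaximum() is O(1); and HeapExtractMax() is dominated by its call to Heapify(), so it is O(lg n) as well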
Quicksort
● Sorts in place
● Runs in O(n lg n) time in the average case
● Runs in O(n²) time in the worst case
● So why would people use it instead of merge
sort?
Quicksort
● Another divide-and-conquer algorithm
■ The array A[p..r] is partitioned into two non-empty
subarrays A[p..q] and A[q+1..r]
○ Invariant: All elements in A[p..q] are less than all
elements in A[q+1..r]
■ The subarrays are recursively sorted by calls to
quicksort
■ Unlike merge sort, no combining step: two
subarrays form an already-sorted array
Quicksort
The Quicksort algorithm works as follows:

Quicksort(A, p, r)      /* to sort array A[p..r] */
1. if (p >= r) return;
2. q = Partition(A, p, r);
3. Quicksort(A, p, p+q-1);
4. Quicksort(A, p+q+1, r);

To sort A[1..n], we just call Quicksort(A, 1, n)
Partition
● Clearly, all the action takes place in the
partition() function
■ Rearranges the subarray in place
■ End result:
○ Two subarrays
○ All values in first subarray  all values in second
■ Returns the index of the “pivot” element separating
the two subarrays
● How do you suppose we implement this function?
Quicksort uses Partition
Quicksort makes use of a Partition
function:
Partition(A, p, r)      /* to partition array A[p..r] */
1. Pick an element, say A[t] (called the pivot)
2. Let q = # elements less than the pivot
3. Put the elements less than the pivot in A[p..p+q-1]
4. Put the pivot in A[p+q]
5. Put the remaining elements in A[p+q+1..r]
6. Return q
More on Partition
• After Partition(A,p,r), we obtain the value
q, and know that
• Pivot is now at A[p+q]
• Before A[p+q]: elements smaller than the pivot
• After A[p+q]: elements larger than the pivot

• There are many ways to perform
Partition. One way is shown in the next
slides
• It will be an in-place algorithm (using
O(1) extra space in addition to the
input array)
Ideas for In-Place Partition
• Idea 1: Use A[r] (the last element) as
pivot
• Idea 2: Process A[p..r] from left to
right
• The prefix (the beginning part) of A stores
all elements less than pivot seen so far
• Use two counters:
• One for the length of the prefix
• One for the next element we are looking at
In-Place Partition in Action
before running

Length of prefix = 0
1 3 7 8 2 6 4 5        (next element = 1, pivot = 5)

Because the next element is less than the
pivot, we shall extend the prefix by 1
In-Place Partition in Action
after 1 step

Length of prefix = 1
1 3 7 8 2 6 4 5        (next element = 3, pivot = 5)

Because the next element is smaller than the pivot,
and is adjacent to the prefix, we extend the prefix
In-Place Partition in Action
after 2 steps

Length of prefix = 2
1 3 7 8 2 6 4 5        (next element = 7, pivot = 5)

Because the next element is larger than the
pivot, no change to the prefix
In-Place Partition in Action
after 3 steps

Length of prefix = 2
1 3 7 8 2 6 4 5        (next element = 8, pivot = 5)

Again, the next element is larger than the
pivot, no change to the prefix
In-Place Partition in Action
after 4 steps

Length of prefix = 2
1 3 7 8 2 6 4 5        (next element = 2, pivot = 5)

Because the next element is less than the pivot,
we shall extend the prefix by swapping
In-Place Partition in Action
after 5 steps

Length of prefix = 3
1 3 2 8 7 6 4 5        (next element = 6, pivot = 5)

Because the next element is larger than the
pivot, no change to the prefix
In-Place Partition in Action
after 6 steps

Length of prefix = 3
1 3 2 8 7 6 4 5        (next element = 4, pivot = 5)

Because the next element is less than the pivot,
we shall extend the prefix by swapping
In-Place Partition in Action
after 7 steps

Length of prefix = 4
1 3 2 4 7 6 8 5        (next element = 5, the pivot itself)

When the next element is the pivot, we put
it after the end of the prefix by swapping
In-Place Partition in Action
after 8 steps

Length of prefix = 4
1 3 2 4 5 6 8 7

Partition is done, and we return the length of the prefix


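A hedged C sketch of the partition just illustrated (essentially the Lomuto scheme with the last element as pivot; the name partition_prefix is ours, not the deck's), using 0-based indices:

/* Partition A[p..r] around the pivot A[r].
   Returns q = number of elements less than the pivot;
   afterwards the pivot sits at A[p+q]. */
int partition_prefix(int A[], int p, int r) {
    int prefix = 0;                      /* length of the "less than pivot" prefix */
    for (int next = p; next < r; next++) {
        if (A[next] < A[r]) {            /* extend the prefix by swapping */
            int t = A[p + prefix]; A[p + prefix] = A[next]; A[next] = t;
            prefix++;
        }
    }
    int t = A[p + prefix]; A[p + prefix] = A[r]; A[r] = t;  /* place the pivot */
    return prefix;
}

void quicksort(int A[], int p, int r) {
    if (p >= r) return;
    int q = partition_prefix(A, p, r);
    quicksort(A, p, p + q - 1);   /* elements less than the pivot */
    quicksort(A, p + q + 1, r);   /* elements greater than the pivot */
}

On the example {1, 3, 7, 8, 2, 6, 4, 5}, partition_prefix passes through exactly the states shown above and returns 4.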
Partition In Words
● Partition(A, p, r):
■ Select an element to act as the “pivot” (which?)
■ Grow two regions, A[p..i] and A[j..r]
○ All elements in A[p..i] <= pivot
○ All elements in A[j..r] >= pivot
■ Increment i until A[i] >= pivot
■ Decrement j until A[j] <= pivot
■ Swap A[i] and A[j]
■ Repeat until i >= j
■ Return j
Partition Code
Partition(A, p, r)
x = A[p];
i = p - 1;
j = r + 1;
while (TRUE)
repeat
j--;
until A[j] <= x;
repeat
i++;
until A[i] >= x;
if (i < j)
Swap(A, i, j);
else
return j;

Illustrate on A = {5, 3, 2, 6, 4, 1, 3, 7};
What is the running time of partition()?
Partition Code
Partition(A, p, r)
x = A[p];
i = p - 1;
j = r + 1;
while (TRUE)
repeat
j--;
until A[j] <= x;
repeat
i++;
until A[i] >= x;
if (i < j)
Swap(A, i, j);
else
return j;

partition() runs in O(n) time
Analyzing Quicksort
● What will be the worst case for the algorithm?
■ Partition is always unbalanced
● What will be the best case for the algorithm?
■ Partition is perfectly balanced
● Which is more likely?
■ The latter, by far, except...
● Will any particular input elicit the worst case?
■ Yes: Already-sorted input
Analyzing Quicksort
● In the worst case:
T(1) = (1)
T(n) = T(n - 1) + (n)
● Works out to
T(n) = (n2)
Analyzing Quicksort
● In the best case:
T(n) = 2T(n/2) + (n)
● What does this work out to?
T(n) = (n lg n)
Improving Quicksort
● The real liability of quicksort is that it runs in
O(n²) on already-sorted input
● Book discusses two solutions:
■ Randomize the input array, OR
■ Pick a random pivot element
● How will these solve the problem?
■ By ensuring that no particular input can be chosen
to make quicksort run in O(n²) time
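A minimal sketch of the second fix, reusing the hypothetical partition_prefix() from the earlier sketch: the random pivot is swapped into the last slot, so the rest of the routine is unchanged:

#include <stdlib.h>   /* for rand() */

/* Pick a pivot uniformly at random from A[p..r], move it to A[r],
   then partition exactly as before. */
int randomized_partition(int A[], int p, int r) {
    int k = p + rand() % (r - p + 1);
    int t = A[k]; A[k] = A[r]; A[r] = t;
    return partition_prefix(A, p, r);
}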
Analyzing Quicksort: Average Case
● Assuming random input, the average-case running
time is much closer to O(n lg n) than O(n²)
● First, a more intuitive explanation/example:
■ Suppose that partition() always produces a 9-to-1
split. This looks quite unbalanced!
■ The recurrence is thus:
T(n) = T(9n/10) + T(n/10) + n
■ How deep will the recursion go? (draw it)
Analyzing Quicksort: Average Case
● Intuitively, a real-life run of quicksort will
produce a mix of “bad” and “good” splits
■ Randomly distributed among the recursion tree
■ Pretend for intuition that they alternate between
best-case (n/2 : n/2) and worst-case (n-1 : 1)
■ What happens if we bad-split root node, then
good-split the resulting size (n-1) node?
Analyzing Quicksort: Average Case
● Intuitively, a real-life run of quicksort will
produce a mix of “bad” and “good” splits
■ Randomly distributed among the recursion tree
■ Pretend for intuition that they alternate between best-
case (n/2 : n/2) and worst-case (n-1 : 1)
■ What happens if we bad-split root node, then good-
split the resulting size (n-1) node?
○ We end up with three subarrays, size 1, (n-1)/2, (n-1)/2
○ Combined cost of splits = n + n -1 = 2n -1 = O(n)
○ No worse than if we had good-split the root node!
Analyzing Quicksort: Average Case
● Intuitively, the O(n) cost of a bad split
(or 2 or 3 bad splits) can be absorbed
into the O(n) cost of each good split
● Thus running time of alternating bad and good
splits is still O(n lg n), with slightly higher
constants
● How can we be more rigorous?
Analyzing Quicksort: Average Case
● For simplicity, assume:
■ All inputs distinct (no repeats)
■ Slightly different partition() procedure
○ partition around a random element, which is not included
in subarrays
○ all splits (0:n-1, 1:n-2, 2:n-3, … , n-1:0) equally likely

● What is the probability of a particular split
happening?
● Answer: 1/n
Analyzing Quicksort: Average Case
● So partition generates splits
(0:n-1, 1:n-2, 2:n-3, … , n-2:1, n-1:0)
each with probability 1/n
● If T(n) is the expected running time,

1 n 1
T n    T k  T n  1  k   n 
n k 0
Worst-Case Running Time
The worst-case running time of Quicksort
can be expressed by:

T(n) = maxq=0 to n-1 (T(q) + T(n-q-1)) + (n)

We prove T(n)= (n2) by substitution

method:
1. Guess T(n) ≤ cn2 for some constant c
2. Next, verify our guess by induction
Worst-Case Running Time
Inductive Case:
T(n) = max q=0 to n-1 (T(q) + T(n-q-1)) + Θ(n)
     ≤ max q=0 to n-1 (cq² + c(n-q-1)²) + Θ(n)
     ≤ c(n-1)² + Θ(n)      [the max is attained at q = 0 or q = n-1]
     = cn² − 2cn + c + Θ(n)
     ≤ cn²  when c is large enough
Inductive Case is OK now
Worst-Case Running Time
Conclusion:
1. T(n) = O(n²)
2. However, we can also show
T(n) = Ω(n²)
by finding a worst-case input
⇒ T(n) = Θ(n²)
Average-Case Running Time
So, Quicksort runs badly for some input…

But suppose that when we store a set
of n numbers into the input array,
each of the n! permutations is
equally likely
⇒ Running time varies with the input

What will be the "average" running time?
Average Running Time
Let X = # comparisons in all Partition calls
Later, we will show that:
Running time = Θ( n + X )
Finding the average of X (i.e., # comparisons)
gives the average running time

Our first target: Compute the average of X
Average # of Comparisons
We define some notation to help the
analysis:
• Let a1, a2, …, an denote the set of n
numbers initially placed in the array
• Further, we assume a1 ≤ a2 ≤ … ≤ an
(So, a1 may not be the element in A[1]
originally)
• Let Xij = # comparisons between ai and aj
in all Partition calls
Average # of Comparisons
Then, X = # comparisons in all Partition
calls
= X12 + X13 + … + Xn-1,n

⇒ Average # comparisons
= E[X]
= E[X12 + X13 + … + Xn-1,n]
= E[X12] + E[X13] + … + E[Xn-1,n]
Average # of Comparisons
The next slides will prove: E[Xij] = 2/(j-i+1)
Using this result,
E[X] = Σ i=1 to n-1 Σ j=i+1 to n 2/(j-i+1)
     = Σ i=1 to n-1 Σ k=1 to n-i 2/(k+1)
     ≤ Σ i=1 to n-1 Σ k=1 to n 2/k
     = Σ i=1 to n-1 O(log n) = O(n log n)
Comparison between ai and aj
Question: # times ai can be compared with aj?
Answer: At most once, which happens
only if ai or aj is chosen as the pivot

1 3 2 4 5 6 8 7        (pivot = 5)

After that, the pivot is fixed
and is never compared again
Comparison between ai and aj
Question: Will ai always be compared with aj?
Answer: No. E.g., after the Partition in
Page 14:

1 3 2 4 5 6 8 7        (pivot = 5)

we will separately Quicksort the first 4
elements and the last 3 elements
Comparison between ai and aj
Observation:
Consider the elements ai, ai+1, …, aj-1,
aj
(i) If ai or aj is first chosen as a pivot,
then ai is compared with aj
(ii) Else, if any element of ai+1, …, aj-1
is first chosen as a pivot,
then ai is never compared with aj
Comparison between ai and aj
When the n! permutations are equally
likely to be the input,
Pr(ai compared with aj once) = 2/(j-i+1)
Pr(ai not compared with aj) = (j-i-1)/(j-i+1)
E[Xij] = 1 · 2/(j-i+1) + 0 · (j-i-1)/(j-i+1)
       = 2/(j-i+1)
(Consider ai, ai+1, …, aj-1, aj. Given a permutation, if
ai is chosen as the pivot first, then by exchanging ai
with ai+1 initially, ai+1 will be chosen as the pivot first)
Proof: Running time = (n+X)
Observe that in the Quicksort algorithm:
•Each Partition fixes the position of pivot
 at most n Partition calls
•After each Partition, we have 2 Quicksort
•Also, all Quicksort (except 1st one:
Quicksort(A,1,n))
are invoked after a Partition
 total (n) Quicksort calls
Proof: Running time = (n+X)
So, if we ignore the comparison time in
all Partition calls, the time used =
(n)

Thus, we include back the comparison


time in all Partition calls,
Running time = ( n + X )
Probability & Expectation

About this lecture
• What is Probability?
• What is an Event?
• What is a Random Variable?
• What is Expectation or “Average
Value” of a Random Variable?
• Useful Thm: Linearity of
Expectation
Experiment and Sample Space
• An experiment is a process that
produces an outcome
• A random experiment is an
experiment whose outcome is not
known until it is observed
– Exp 1: Throw a die once
– Exp 2: Flip a coin until Head comes up
Experiment and Sample Space
• A sample space  of a random
experiment is the set of all
outcomes
– Exp 1: Throw a die once
– Sample space: { 1, 2, 3, 4, 5, 6 }
– Exp 2: Flip a coin until Head comes
up
– Sample space: ??

• Any subset of sample space  is


called an event
Probability
• Probability studies the chance of
each event occurring
• Informally, it is defined with a
function Pr that satisfies the
following:

(1) For any event E, 0 ≤ Pr(E) ≤ 1
(2) Pr(Ω) = 1
(3) If E1 and E2 do not have common
outcomes,
Pr(E1 ∪ E2) = Pr(E1) + Pr(E2)
Example
Questions:
1.Suppose the die is a fair die, so
that Pr(1)= Pr(2) = … = Pr(6).
What is Pr(1)? Why?

2.Instead, if we know
Pr(1) = 0.2, Pr(2) = 0.3, Pr(3) = 0.4,
Pr(4) = 0.1, Pr(5) = Pr(6) = 0.
What is Pr({1,2,4})?
Random Variable
Definition: A random variable X on a
sample space  is a function that
maps each outcome of  into a real
number.
That is, X:   R.
Ex: Suppose that we throw two dice
  = { (1,1), (1,2), …, (6,5), (6,6) }

Define X = sum of outcome of two dice


 X is a random variable on 
Random Variable
• For a random variable X and a
value a, the notation
“X = a”
denotes the set of outcomes  in the
sample space such that X() = a
 “X = a” is an
event
• In previous example,
“X = 10” is the event {(4,6), (5,5), (6,4)}
Expectation
Definition: The expectation (or
average value) of a random variable X is

E[X] = Σi i · Pr(X=i)
Question:
• X = sum of outcomes of two fair dice.
What is the value of E[X]?
•How about the sum of three dice?
Expectation (Example)
Let X = sum of outcomes of two
dice. The value of X can vary from 2
to 12. So, we calculate:
Pr(X=2) = 1/36, Pr(X=3) = 2/36,
Pr(X=4) = 3/36, … , Pr(X=12) = 1/36,

E[X] = 2·Pr(X=2) + 3·Pr(X=3)
+ … + 11·Pr(X=11) + 12·Pr(X=12)
= 7
Linearity of Expectation
● Theorem: Given random variables X1, X2,
…, Xk, each with finite expectation, we have

● E[X1+X2+…+Xk] = E[X1] + E[X2] + … + E[Xk]

● Let X = sum of outcomes of two dice. Let Xi =
the outcome of the ith die
● What is the relationship of X, X1, and X2? Can
we use it to compute E[X]?
Linearity of Expectation
(Example)
Let X = sum of outcomes of two dice.
Let Xi = the outcome of the ith die
⇒ X = X1 + X2

⇒ E[X] = E[X1+X2] = E[X1] + E[X2]
       = 3.5 + 3.5 = 7