
Chapter 6

Heap Sort

6 -- 1

Outline: Heap Sort

• Input: One-Dimensional Array
• Advantages of Insertion Sort and Merge Sort
• Heap Sort:
  • The Heap Property
  • Heapify Function
  • Build Heap Function
  • Heap Sort Function

6 -- 2

1D Array

Memory:  | a | y | f | k |
           ↑
         start

• 1-dimensional array x = [a, y, f, k]
• x[1] = a; x[2] = y; x[3] = f; x[4] = k

6 -- 3

Sorting Revisited
• So far we've talked about two algorithms to sort an array of numbers
• What is the advantage of merge sort?
  • Answer: good worst-case running time O(n lg n)
  • Conceptually easy, Divide-and-Conquer
• What is the advantage of insertion sort?
  • Answer: sorts in place: only a constant number of array elements are stored outside the input array at any time
  • Easy to code; when the array is "nearly sorted", runs fast in practice

                   avg case    worst case
  Insertion sort      n²           n²
  Merge sort        n log n      n log n

• Next on the agenda: Heapsort
  • Combines advantages of both previous algorithms

6 -- 4

Heaps
• A heap can be seen as a complete binary tree
• In practice, heaps are usually implemented as arrays
• An array A that represents a heap is an object with two attributes: A[1 .. length[A]]
  • length[A]: # of elements in the array
  • heap-size[A]: # of elements in the heap stored within array A, where heap-size[A] ≤ length[A]
• No element past A[heap-size[A]] is an element of the heap

A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

6 -- 5

Heaps

• For example, heap-size of the following heap = 10
• Also, length[A] = 10

A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

                16
              /    \
            14      10
           /  \    /  \
          8    7  9    3
         / \  /
        2   4 1

6 -- 6

Referencing Heap Elements
• The root node is A[1]
• Node i is A[i]
• Parent(i): return ⌊i/2⌋
• Left(i): return 2*i
• Right(i): return 2*i + 1

  Index: 1   2   3   4   5   6   7   8   9   10
  A:     16  14  10  8   7   9   3   2   4   1

  Levels, root to leaves: 3, 2, 1, 0
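The three index formulas above translate directly to code; a minimal sketch, assuming the array reserves slot 0 as an unused placeholder so the slides' 1-based indexing carries over:

```python
def parent(i):
    # Parent of node i (1-based indexing): floor(i / 2)
    return i // 2

def left(i):
    # Left child of node i
    return 2 * i

def right(i):
    # Right child of node i
    return 2 * i + 1

# Example heap stored with a placeholder at index 0
A = [None, 16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```

For example, the children of the root A[1] = 16 are A[left(1)] = 14 and A[right(1)] = 10.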

6 -- 7

The Heap Property

• Heaps also satisfy the heap property:
  • A[Parent(i)] ≥ A[i] for all nodes i > 1
  • In other words, the value of a node is at most the value of its parent
• The largest value in a heap is at its root (A[1])
• Subtrees rooted at a specific node contain values no larger than that node's value

6 -- 8

Heap Operations: Heapify()
• Heapify(): maintains the heap property
• Given: a node i in the heap with children L and R
  • the two subtrees rooted at L and R are assumed to be heaps
• Problem: the subtree rooted at i may violate the heap property (How?)
  • A[i] may be smaller than its children's values
• Action: let the value of the parent node "float down" so the subtree rooted at i satisfies the heap property
  • If A[i] < A[L] or A[i] < A[R], swap A[i] with the larger of A[L] and A[R]
  • Recurse on that subtree

6 -- 9

Heap Operations: Heapify()

Heapify(A, i)
{
1.  L ← Left(i)
2.  R ← Right(i)
3.  if L ≤ heap-size[A] and A[L] > A[i]
4.      then largest ← L
5.      else largest ← i
6.  if R ≤ heap-size[A] and A[R] > A[largest]
7.      then largest ← R
8.  if largest ≠ i
9.      then exchange A[i] ↔ A[largest]
10.          Heapify(A, largest)
}
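The pseudocode above maps naturally onto Python; a minimal sketch using 0-based indexing (so the children of node i sit at 2i+1 and 2i+2 instead of 2i and 2i+1):

```python
def heapify(a, i, heap_size):
    # Sift a[i] down until the subtree rooted at i is a max-heap.
    # Assumes both child subtrees of i are already max-heaps.
    l, r = 2 * i + 1, 2 * i + 2               # 0-based children
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]   # exchange A[i] <-> A[largest]
        heapify(a, largest, heap_size)        # recurse on the disturbed subtree

# The example worked through on the following slides:
# node 2 (value 4) violates the heap property
A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
heapify(A, 1, len(A))                         # node 2 is index 1 here
# A is now [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```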

6 -- 10

Heapify() Example

• Start: node 2 (value 4) violates the heap property

  A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]

                16
              /    \
             4      10
           /  \    /  \
         14    7  9    3
         / \  /
        2   8 1

• Exchange A[2] = 4 with its larger child A[4] = 14, then recurse on node 4

  A = [16, 14, 10, 4, 7, 9, 3, 2, 8, 1]

                16
              /    \
            14      10
           /  \    /  \
          4    7  9    3
         / \  /
        2   8 1

• Exchange A[4] = 4 with its larger child A[9] = 8; node 9 is a leaf, so the recursion stops

  A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]

                16
              /    \
            14      10
           /  \    /  \
          8    7  9    3
         / \  /
        2   4 1

Heap Height

• Definitions:
  • The height of a node in the tree = the number of edges on the longest downward path to a leaf
• What is the height of an n-element heap? Why?
  • The height of the tree for a heap is Θ(lg n)
  • Because the heap is a complete binary tree, the height of any node is at most ⌊lg n⌋
  • Thus, the basic operations on a heap run in O(lg n) time

6 -- 20

# of Nodes in Each Level

• Fact: an n-element heap has at most 2^(h−k) nodes at level k, where h is the height of the tree
  • for k = h (root level): 2^(h−h) = 2^0 = 1
  • for k = h−1: 2^(h−(h−1)) = 2^1 = 2
  • for k = h−2: 2^(h−(h−2)) = 2^2 = 4
  • for k = h−3: 2^(h−(h−3)) = 2^3 = 8
  • …
  • for k = 1: 2^(h−1)
  • for k = 0 (leaf level): 2^(h−0) = 2^h
6 -- 21

Heap Height
• A heap storing n keys has height h = ⌊lg n⌋ = Θ(lg n)
• Due to the heap being complete, we know:
  • The maximum # of nodes in a heap of height h:
    2^h + 2^(h−1) + … + 2^2 + 2^1 + 2^0 = Σ_{i=0..h} 2^i = (2^(h+1) − 1)/(2 − 1) = 2^(h+1) − 1
  • The minimum # of nodes in a heap of height h:
    1 + 2^(h−1) + … + 2^2 + 2^1 + 2^0 = Σ_{i=0..h−1} 2^i + 1 = [(2^h − 1)/(2 − 1)] + 1 = 2^h
• Therefore:
  • 2^h ≤ n ≤ 2^(h+1) − 1
  • h ≤ lg n and lg(n+1) − 1 ≤ h
  • lg(n+1) − 1 ≤ h ≤ lg n
• which in turn implies:
  • h = ⌊lg n⌋ = Θ(lg n)
6 -- 22

Analyzing Heapify()

• Aside from the recursive call, what is the running time of Heapify()?
• How many times can Heapify() recursively call itself?
• What is the worst-case running time of Heapify() on a heap of size n?

6 -- 23

Analyzing Heapify()

• The running time at any given node i is:
  • Θ(1) time to fix up the relationships among A[i], A[Left(i)] and A[Right(i)]
  • plus the time to call Heapify recursively on a subtree rooted at one of the children of node i
• The children's subtrees each have size at most 2n/3
  • The worst case occurs when the last row of the tree is exactly half full: the leaf level then holds 2^h ≈ n/2 nodes, and one child of the root owns roughly two quarters of the nodes while the other owns only one
6 -- 24

Analyzing Heapify()

• So we have the recurrence:
  T(n) ≤ T(2n/3) + Θ(1)
• Heapify takes T(n) = Θ(h)
  • h = height of heap = ⌊lg n⌋
  • T(n) = Θ(lg n)

6 -- 25

Heap Operations: BuildHeap()

• We can build a heap in a bottom-up manner by running Heapify() on successive subarrays
• Fact: for an array of length n, all elements in the range A[⌊n/2⌋+1 .. n] are heaps (Why?)
  • These elements are leaves; they do not have children
• We know that (when the last level is full):
  • 2^(h+1) − 1 = n  ⇒  2·2^h = n + 1
  • 2^h = (n + 1)/2 = ⌊n/2⌋ + 1 = ⌈n/2⌉
• We also know that the leaf level has at most
  • 2^h = ⌊n/2⌋ + 1 = ⌈n/2⌉ nodes
• and the other levels have a total of ⌊n/2⌋ nodes
  • ⌈n/2⌉ + ⌊n/2⌋ = n
6 -- 26

Heap Operations: BuildHeap()

• So:
  • Walk backwards through the array from ⌊n/2⌋ to 1, calling Heapify() on each node
  • The order of processing guarantees that the children of node i are heaps when i is processed

6 -- 27

BuildHeap()
// given an unsorted array A, make A a heap

BuildHeap(A)
{
1.  heap-size[A] ← length[A]
2.  for i ← ⌊length[A]/2⌋ downto 1
3.      do Heapify(A, i)
}

• The BuildHeap procedure, which runs in linear time, produces a max-heap from an unsorted input array.
• However, the Heapify procedure, which runs in O(lg n) time, is the key to maintaining the heap property.
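The same bottom-up construction as a runnable sketch (0-based indexing, with Heapify inlined so the snippet is self-contained):

```python
def heapify(a, i, heap_size):
    # Sift a[i] down; 0-based indexing (children at 2i+1, 2i+2).
    l, r = 2 * i + 1, 2 * i + 2
    largest = i
    if l < heap_size and a[l] > a[largest]:
        largest = l
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heap_size)

def build_heap(a):
    # Walk backwards from the last internal node to the root;
    # everything past index n//2 - 1 is a leaf and already a heap.
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]   # the example on the next slides
build_heap(A)
# A is now [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```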
6 -- 28

BuildHeap() Example

• Work through example A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
• n = 10, ⌊n/2⌋ = 5, so Heapify is called on nodes 5 down to 1:
  • i = 5 (value 16): already a heap → A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
  • i = 4 (value 2): exchange with child 14 → A = {4, 1, 3, 14, 16, 9, 10, 2, 8, 7}
  • i = 3 (value 3): exchange with child 10 → A = {4, 1, 10, 14, 16, 9, 3, 2, 8, 7}
  • i = 2 (value 1): exchange with 16, then with 7 → A = {4, 16, 10, 14, 7, 9, 3, 2, 8, 1}
  • i = 1 (value 4): floats down past 16, 14 and 8 → A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}

Final heap:

                16
              /    \
            14      10
           /  \    /  \
          8    7  9    3
         / \  /
        2   4 1

6 -- 35

Analyzing BuildHeap()
• Each call to Heapify() takes O(lg n) time
• There are O(n) such calls (specifically, ⌊n/2⌋)
• Thus the running time is O(n lg n)
  • Is this a correct asymptotic upper bound? YES
  • Is this an asymptotically tight bound? NO
  • A tighter bound is O(n)
• How can this be? Is there a flaw in the above reasoning?
• We can derive a tighter bound by observing that the time for Heapify to run at a node varies with the height of the node in the tree, and the heights of most nodes are small
• Fact: an n-element heap has at most 2^(h−k) nodes at level k, where h is the height of the tree

Analyzing BuildHeap(): Tight
• The time required by Heapify on a node of height k is O(k), so we can express the total cost of BuildHeap as

  Σ_{k=0..h} 2^(h−k) · O(k) = O(2^h · Σ_{k=0..h} k/2^k) = O(n · Σ_{k=0..h} k(½)^k)

• From Σ_{k=0..∞} k·x^k = x/(1 − x)², with x = ½:

  Σ_{k=0..∞} k/2^k = (½)/(1 − ½)² = 2

• Therefore, O(n · Σ_{k=0..h} k/2^k) = O(n)
• So the running time for building a heap from an unordered array is linear
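A quick empirical check of this bound (a sketch; the ascending input is just an illustrative hard case for a max-heap build, not a proven worst case): the total number of exchanges made while building the heap is bounded by the sum of node heights, which stays below n.

```python
def build_heap_swaps(a):
    # Bottom-up build of a max-heap, counting every exchange.
    n, swaps = len(a), 0
    for start in range(n // 2 - 1, -1, -1):
        i = start
        while True:                        # iterative sift-down
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < n and a[l] > a[largest]:
                largest = l
            if r < n and a[r] > a[largest]:
                largest = r
            if largest == i:
                break
            a[i], a[largest] = a[largest], a[i]
            swaps += 1
            i = largest
    return swaps

for n in (100, 1_000, 10_000):
    assert build_heap_swaps(list(range(n))) <= n   # linear, not n lg n
```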
6 -- 37

Heapsort
• Given BuildHeap(), an in-place sorting algorithm is easily constructed:
  • The maximum element is at A[1]
  • Discard it by swapping with the element at A[n]
    • Decrement heap-size[A]
    • A[n] now contains the correct value
  • Restore the heap property at A[1] by calling Heapify()
  • Repeat, always swapping A[1] with A[heap-size[A]]

6 -- 38

Heapsort
Heapsort(A)
{
1.  Build-Heap(A)
2.  for i ← length[A] downto 2
3.      do exchange A[1] ↔ A[i]
4.         heap-size[A] ← heap-size[A] − 1
5.         Heapify(A, 1)
}
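A self-contained sketch of the whole algorithm in Python (0-based indexing; the sift-down is the iterative form of Heapify):

```python
def heapsort(a):
    # In-place heapsort: build a max-heap, then repeatedly move the max to the end.
    def sift_down(i, size):
        while True:
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < size and a[l] > a[largest]:
                largest = l
            if r < size and a[r] > a[largest]:
                largest = r
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    for i in range(len(a) // 2 - 1, -1, -1):   # BuildHeap(A)
        sift_down(i, len(a))
    for end in range(len(a) - 1, 0, -1):       # for i <- length[A] downto 2
        a[0], a[end] = a[end], a[0]            # exchange A[1] <-> A[i]
        sift_down(0, end)                      # Heapify(A, 1) on the shrunken heap

A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
heapsort(A)
# A is now [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```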

6 -- 39

HeapSort() Example

• Starting from the heap A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}:

                16
              /    \
            14      10
           /  \    /  \
          8    7  9    3
         / \  /
        2   4 1

• Each pass exchanges A[1] with A[i], shrinks the heap by one, and calls Heapify(A, 1):
  • i = 10: A = {14, 8, 10, 4, 7, 9, 3, 2, 1, 16}
  • i = 9:  A = {10, 8, 9, 4, 7, 1, 3, 2, 14, 16}
  • i = 8:  A = {9, 8, 3, 4, 7, 1, 2, 10, 14, 16}
  • i = 7:  A = {8, 7, 3, 4, 2, 1, 9, 10, 14, 16}
  • i = 6:  A = {7, 4, 3, 1, 2, 8, 9, 10, 14, 16}
  • i = 5:  A = {4, 2, 3, 1, 7, 8, 9, 10, 14, 16}
  • i = 4:  A = {3, 2, 1, 4, 7, 8, 9, 10, 14, 16}
  • i = 3:  A = {2, 1, 3, 4, 7, 8, 9, 10, 14, 16}
  • i = 2:  A = {1, 2, 3, 4, 7, 8, 9, 10, 14, 16}

Analyzing Heapsort

• The call to BuildHeap() takes O(n) time
• Each of the n − 1 calls to Heapify() takes O(lg n) time
• Thus the total time taken by HeapSort()
  = O(n) + (n − 1) · O(lg n)
  = O(n) + O(n lg n)
  = O(n lg n)

6 -- 50

Analyzing Heapsort

• The O(n log n) run time of heapsort is much better than the O(n²) run time of selection and insertion sort
• Although it has the same run time as merge sort, it beats merge sort on memory space:
  • Heapsort is an in-place sorting algorithm
• But it is not stable
  • It does not preserve the relative order of elements with equal keys

6 -- 51
