Algorithms and Data Structures: Simonas Šaltenis

This document summarizes a lecture on sorting algorithms and data structures. It discusses quicksort and heapsort, describing their running times, worst cases, and average cases. Quicksort uses a divide and conquer approach, partitioning the array and recursively sorting subarrays. Its average case is O(n log n) but worst case is O(n^2). Heapsort uses a binary heap data structure to sort in O(n log n) time while sorting in place, like selection sort. The document also covers building heaps, priority queues, and heap applications.

Uploaded by

Rahul Saxena
Copyright
© Attribution Non-Commercial (BY-NC)

September 19, 2002 1

Algorithms and Data Structures
Lecture IV
Simonas Šaltenis
Nykredit Center for Database Research
Aalborg University
[email protected]
This Lecture
Sorting algorithms
Quicksort
a popular algorithm, very fast on average
Heapsort
Heap data structure and priority queue ADT
Why Sorting?
"When in doubt, sort": one of the
principles of algorithm design. Sorting is used
as a subroutine in many algorithms:
Searching in databases: we can do binary
search on sorted data
A large number of computer graphics and
computational geometry problems
Closest pair, element uniqueness
Why Sorting? (2)
A large number of sorting algorithms have been
developed, representing different algorithm
design techniques.
The Ω(n log n) lower bound for comparison-based
sorting is used to prove lower bounds for other
problems
Sorting Algorithms so far
Insertion sort, selection sort
Worst-case running time O(n^2); in-place
Merge sort
Worst-case running time O(n log n), but
requires additional memory O(n)
Quick Sort
Characteristics
sorts almost "in place," i.e., does not require
an additional array
like insertion sort, unlike merge sort
very practical: average performance O(n log n)
(with small constant factors), but worst
case O(n^2)
Quick Sort: the Principle
To understand quicksort, let's look at a
high-level description of the algorithm
A divide-and-conquer algorithm
Divide: partition array into 2 subarrays such
that elements in the lower part <= elements in
the higher part
Conquer: recursively sort the 2 subarrays
Combine: trivial since sorting is done in place
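Partitioning
Linear time partitioning procedure
Partition(A,p,r)
  x ← A[r]
  i ← p-1
  j ← r+1
  while TRUE
    repeat j ← j-1
    until A[j] ≤ x
    repeat i ← i+1
    until A[i] ≥ x
    if i < j
      then exchange A[i] ↔ A[j]
      else return j
Caution: with x ← A[r], Partition returns j = r whenever A[r] is the
largest element of A[p..r], and Quicksort(A,p,q) then recurses on the
same range; taking x ← A[p] avoids this
Example with x = 10:
17 12 6 19 23 8 5 10     i stops at 17, j stops at 10; exchange
10 12 6 19 23 8 5 17     i stops at 12, j stops at 5; exchange
10 5 6 19 23 8 12 17     i stops at 19, j stops at 8; exchange
10 5 6 8 23 19 12 17     j stops at 8, i stops at 23; i > j, return j
10 5 6 8 | 23 19 12 17   left part ≤ x = 10 ≤ right part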
Quick Sort Algorithm
Initial call Quicksort(A, 1, length[A])
Quicksort(A,p,r)
  if p < r
    then q ← Partition(A,p,r)
      Quicksort(A,p,q)
      Quicksort(A,q+1,r)
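The Partition and Quicksort procedures can be sketched in Python as follows. This is a minimal illustration rather than the lecture's own code: indices are 0-based instead of 1-based, the function names are chosen here, and the pivot is assumed to be the first element a[p], which guarantees that the returned index is strictly smaller than r so both recursive calls shrink.

```python
def partition(a, p, r):
    """Hoare-style partition of a[p..r] (inclusive). Returns q with p <= q < r
    such that every element of a[p..q] is <= every element of a[q+1..r]."""
    x = a[p]                 # pivot: first element (assumption of this sketch)
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:      # repeat j <- j-1 until a[j] <= x
            j -= 1
        i += 1
        while a[i] < x:      # repeat i <- i+1 until a[i] >= x
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def quicksort(a, p=0, r=None):
    """Sort the list a in place between indices p and r (inclusive)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q)
        quicksort(a, q + 1, r)


data = [17, 12, 6, 19, 23, 8, 5, 10]
quicksort(data)
print(data)
```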
Analysis of Quicksort
Assume that all input elements are distinct
The running time depends on the
distribution of splits
Best Case
If we are lucky, Partition splits the array
evenly: $T(n) = 2T(n/2) + \Theta(n)$
Worst Case
What is the worst case?
One side of the partition has only one
element
$$
\begin{aligned}
T(n) &= T(1) + T(n-1) + \Theta(n) \\
     &= T(n-1) + \Theta(n) \\
     &= \sum_{k=1}^{n} \Theta(k) \\
     &= \Theta\Big(\sum_{k=1}^{n} k\Big) \\
     &= \Theta(n^2)
\end{aligned}
$$
Worst Case (2)
Worst Case (3)
When does the worst case appear?
input is sorted
input is reverse sorted
Same recurrence as for the worst case of
insertion sort
However, sorted input yields the best case
for insertion sort!
Analysis of Quicksort
Suppose the split is 1/10 : 9/10
$T(n) = T(n/10) + T(9n/10) + \Theta(n) = \Theta(n \log n)$!
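One way to see why even a 1/10 : 9/10 split gives n log n is a recursion-tree argument, sketched below. The constant c is introduced here for illustration: each level of the tree costs at most cn in total, and the longest root-to-leaf path follows the 9/10 branch.

```latex
\begin{aligned}
\text{depth of the longest path:}\quad & \log_{10/9} n = \frac{\lg n}{\lg(10/9)} \approx 6.58\,\lg n \\
\text{cost per level:}\quad & \le c\,n \\
\text{total:}\quad & T(n) \le c\,n \cdot \log_{10/9} n = O(n \log n)
\end{aligned}
```

Any split with a constant proportion behaves the same way; only the constant factor in front of n lg n changes.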
An Average Case Scenario
Suppose we alternate lucky and unlucky
cases to get an average behavior
$L(n) = 2U(n/2) + \Theta(n)$ (lucky)
$U(n) = L(n-1) + \Theta(n)$ (unlucky)
we consequently get
$$
\begin{aligned}
L(n) &= 2\big(L(n/2 - 1) + \Theta(n/2)\big) + \Theta(n) \\
     &= 2L(n/2 - 1) + \Theta(n) \\
     &= \Theta(n \log n)
\end{aligned}
$$
[Figure: an unlucky split of n into 1 and n-1 followed by a lucky split of n-1 into (n-1)/2 and (n-1)/2 costs Θ(n), the same as a single split of n into (n-1)/2+1 and (n-1)/2]
An Average Case Scenario (2)
How can we make sure that we are usually
lucky?
Partition around the middle (n/2th) element?
Partition around a random element (works well in
practice)
Randomized algorithm
running time is independent of the input ordering
no specific input triggers worst-case behavior
the worst-case is only determined by the output of the
random-number generator

Randomized Quicksort
Assume all elements are distinct
Partition around a random element
Consequently, all splits (1:n-1, 2:n-2, ..., n-1:1)
are equally likely with probability 1/n

Randomization is a general tool to improve
algorithms with bad worst-case but good
average-case complexity
Randomized Quicksort (2)
Randomized-Partition(A,p,r)
  i ← Random(p,r)
  exchange A[r] ↔ A[i]
  return Partition(A,p,r)
Randomized-Quicksort(A,p,r)
  if p < r then
    q ← Randomized-Partition(A,p,r)
    Randomized-Quicksort(A,p,q)
    Randomized-Quicksort(A,q+1,r)
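A possible Python rendering of the randomized variant is sketched below. It is an illustration under stated assumptions, not the lecture's code: indices are 0-based, the names are chosen here, and the random element is moved to the front because this sketch's partition takes its pivot from a[p] rather than A[r].

```python
import random


def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the pivot a[p]."""
    x = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > x:
            j -= 1
        i += 1
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j


def randomized_partition(a, p, r):
    """Pick a uniformly random pivot position, then partition as usual."""
    i = random.randint(p, r)      # randint includes both endpoints
    a[p], a[i] = a[i], a[p]       # move the random pivot to the front
    return partition(a, p, r)


def randomized_quicksort(a, p=0, r=None):
    """Sort the list a in place between indices p and r (inclusive)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q)
        randomized_quicksort(a, q + 1, r)
```

With this change an already sorted input is no longer a guaranteed bad case; which inputs run slowly depends only on the random choices.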
Selection Sort
Selection-Sort(A[1..n]):
  for i ← n downto 2
    A: Find the largest element among A[1..i]
    B: Exchange it with A[i]
A takes O(n) and B takes O(1): O(n^2) in total
Idea for improvement: use a data structure to
do both A and B in O(lg n) time, balancing the
work, achieving a better trade-off, and a total
running time of O(n log n)
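For reference, the two steps A and B can be written out in Python as below. This is a small sketch with 0-based indices; the function name is chosen here.

```python
def selection_sort(a):
    """Sort the list a in place by repeatedly moving the largest remaining
    element to the end of the unsorted prefix a[0..i]."""
    for i in range(len(a) - 1, 0, -1):
        # Step A: find the largest element among a[0..i], O(i) time
        largest = 0
        for k in range(1, i + 1):
            if a[k] > a[largest]:
                largest = k
        # Step B: exchange it with a[i], O(1) time
        a[largest], a[i] = a[i], a[largest]
    return a
```

Step A dominates the cost, which is what a heap is meant to speed up.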
Heap Sort
Binary heap data structure A
stored in an array
Can be viewed as a nearly complete binary tree
All levels, except possibly the lowest one, are completely filled
The key in the root is greater than or equal to the keys of its children,
and the left and right subtrees are again binary heaps
Two attributes
length[A]
heap-size[A]
Heap Sort (3)
Index: 1 2 3 4 5 6 7 8 9 10
Key:   16 15 10 8 7 9 3 2 4 1
Parent(i)
  return ⌊i/2⌋
Left(i)
  return 2i
Right(i)
  return 2i+1
Heap property:
A[Parent(i)] ≥ A[i]
Levels of the tree (root to leaves): 3 2 1 0
Heap Sort (4)
Notice the implicit tree links; children of
node i are 2i and 2i+1
Why is this useful?
In a binary representation, a
multiplication/division by two is a left/right shift
Adding 1 can be done by setting the lowest bit
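The index arithmetic can be expressed with shifts as in the sketch below, which keeps the 1-based indexing of the slides. The function names are chosen here for illustration.

```python
def parent(i):
    return i >> 1            # floor(i / 2): shift right by one bit


def left(i):
    return i << 1            # 2i: shift left by one bit


def right(i):
    return (i << 1) | 1      # 2i + 1: the lowest bit is 0 after the shift, so set it


# Node 5 has parent 2 and children 10 and 11.
print(parent(5), left(5), right(5))
```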
Heapify
i is index into the array A
Binary trees rooted at Left(i) and Right(i)
are heaps
But, A[i] might be smaller than its children,
thus violating the heap property
The method Heapify makes A a heap
once more by moving A[i] down the heap
until the heap property is satisfied again
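A minimal Python sketch of Heapify follows. It is not the slide's pseudocode, which did not survive in this copy: it assumes 0-based indices (children of i at 2i+1 and 2i+2), passes the heap size as an explicit parameter, and uses a loop instead of recursion.

```python
def heapify(a, i, heap_size):
    """Move a[i] down until the subtree rooted at i is a max-heap.
    Assumes the subtrees rooted at the children of i are already heaps."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2        # 0-based children of i
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:                   # heap property holds at i
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest                        # continue one level down


a = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
heapify(a, 1, len(a))                      # the key 4 sinks below 14 and 8
print(a)
```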
Heapify (2)
Heapify Example
Heapify: Running Time
The running time of Heapify on a subtree
of size n rooted at node i is
the time to determine the relationship between
elements: Θ(1)
plus the time to run Heapify on a subtree
rooted at one of the children of i, where 2n/3
is the worst-case size of this subtree.
$T(n) \le T(2n/3) + \Theta(1) \Rightarrow T(n) = O(\log n)$
Alternatively
Running time on a node of height h: O(h)
Building a Heap
Convert an array A[1...n], where n =
length[A], into a heap
Notice that the elements in the subarray
A[(n/2 + 1)...n] are already 1-element
heaps to begin with!
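Since the leaves are already heaps, it is enough to run Heapify on the internal nodes from the last one up to the root. The sketch below shows this bottom-up construction in Python under the same assumptions as before (0-based indices, names chosen here); it is an illustration, not the missing slide's pseudocode.

```python
def heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_heap(a):
    """Turn the list a into a max-heap in place."""
    n = len(a)
    # In 0-based indexing the leaves are a[n//2 .. n-1]; skip them.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    return a
```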
Building a Heap: Analysis
Correctness: induction on i, all trees rooted at m
> i are heaps
Running time: n calls to Heapify = n O(lg n) =
O(n lg n)
Good enough for an O(n lg n) bound on
Heapsort, but sometimes we build heaps for
other reasons, so it would be nice to have a tight
bound
Intuition: most of the time, Heapify works on
heaps with far fewer than n elements
Building a Heap: Analysis (2)
Definitions
height of node: longest path from node to leaf
height of tree: height of root
time to Heapify = O(height of subtree rooted at i)
assume n = 2^k - 1 (a complete binary tree, k = lg(n+1))
$$
\begin{aligned}
T(n) &= O\left(\frac{n+1}{2} + \frac{n+1}{4}\cdot 2 + \frac{n+1}{8}\cdot 3 + \dots + 1\cdot k\right) \\
     &= O\left((n+1)\sum_{i=1}^{k}\frac{i}{2^i}\right)
        \quad\text{since } \sum_{i=1}^{k}\frac{i}{2^i} \le \sum_{i=1}^{\infty}\frac{i}{2^i} = \frac{1/2}{(1-1/2)^2} = 2 \\
     &= O(n)
\end{aligned}
$$
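Building a Heap: Analysis (3)
How? By using the following "trick"
$$
\begin{aligned}
\sum_{i=0}^{\infty} x^i &= \frac{1}{1-x} \quad \text{if } |x| < 1 && \text{// differentiate} \\
\sum_{i=1}^{\infty} i\,x^{i-1} &= \frac{1}{(1-x)^2} && \text{// multiply by } x \\
\sum_{i=1}^{\infty} i\,x^{i} &= \frac{x}{(1-x)^2} && \text{// plug in } x = \tfrac{1}{2} \\
\sum_{i=1}^{\infty} \frac{i}{2^i} &= \frac{1/2}{1/4} = 2
\end{aligned}
$$
Therefore Build-Heap time is O(n)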
Heap Sort
The total running time of heap sort is
O(n lg n) plus the Build-Heap(A) time, which is
O(n), so O(n lg n) overall
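Putting the pieces together, heap sort can be sketched in Python as below. The slide's own pseudocode is not present in this copy, so this is a reconstruction under assumptions: 0-based indices, names chosen here, and the heap size passed explicitly instead of stored as heap-size[A].

```python
def heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    """Sort the list a in place in O(n lg n) time."""
    n = len(a)
    # Build a max-heap bottom-up: O(n)
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    # Move the current maximum to the end, shrink the heap, and repair
    # the root: n - 1 rounds of O(lg n) each
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)
    return a
```

The second loop is selection sort's "find the largest, exchange it with the last position" step, with the search replaced by a heap operation.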
Heap Sort: Summary
Heap sort uses a heap data structure to
improve selection sort and make the
running time asymptotically optimal
Running time is O(n log n) like merge
sort, but unlike selection, insertion, or
bubble sorts
Sorts in place like insertion, selection or
bubble sorts, but unlike merge sort
Priority Queues
A priority queue is an ADT (abstract data type) for
maintaining a set S of elements, each with an
associated value called a key
A PQ supports the following operations
Insert(S,x): insert element x into set S (S ← S ∪ {x})
Maximum(S): returns the element of S with the
largest key
Extract-Max(S): returns and removes the element of
S with the largest key
Priority Queues (2)
Applications:
job scheduling on shared computing resources
(Unix)
Event simulation
As a building block for other algorithms
A Heap can be used to implement a PQ
Priority Queues (3)
Removal of max takes constant time on top
of Heapify: Θ(lg n)
Priority Queues (4)
Insertion of a new element
enlarge the PQ and propagate the new
element from the last place up the PQ
tree is of height lg n, running time: Θ(lg n)
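The three priority queue operations can be sketched on top of a heap as follows. The class and method names are hypothetical, chosen for this illustration; the sketch stores bare keys in a 0-based Python list and raises IndexError on an empty queue, neither of which is specified in the lecture.

```python
class MaxPriorityQueue:
    """Max-priority queue backed by a binary heap stored in a list."""

    def __init__(self):
        self._a = []

    def insert(self, key):
        """Append key in the last place and propagate it up: O(lg n)."""
        a = self._a
        a.append(key)
        i = len(a) - 1
        while i > 0 and a[(i - 1) // 2] < a[i]:
            parent = (i - 1) // 2
            a[i], a[parent] = a[parent], a[i]
            i = parent

    def maximum(self):
        """Return the largest key without removing it: O(1)."""
        if not self._a:
            raise IndexError("priority queue is empty")
        return self._a[0]

    def extract_max(self):
        """Remove and return the largest key: O(1) work plus one sift-down."""
        a = self._a
        if not a:
            raise IndexError("priority queue is empty")
        top = a[0]
        last = a.pop()
        if a:
            a[0] = last          # move the last key to the root
            self._sift_down(0)   # and restore the heap property
        return top

    def _sift_down(self, i):
        a, n = self._a, len(self._a)
        while True:
            l, r = 2 * i + 1, 2 * i + 2
            largest = i
            if l < n and a[l] > a[largest]:
                largest = l
            if r < n and a[r] > a[largest]:
                largest = r
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest
```

A short usage example: after inserting 3, 16, 7, 10 and 1, maximum() returns 16, and repeated calls to extract_max() return the keys in decreasing order.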
Priority Queues (5)

Next Week
ADTs and Data Structures
Definition of ADTs
Elementary data structures
Trees
