
DATA STRUCTURES & ALGORITHMS
Lecture 2: Sorting
Part 2: Sorting in Linear Time
Lecturer: Dr. Nguyen Hai Minh
[email protected]
CONTENT
• Sorting Lower Bound
  o Decision trees
• Sorting in Linear Time
  o Counting sort
  o Radix sort
• Conclusion


How fast can we sort?
• All the sorting algorithms we have seen so far are comparison sorts: they use only comparisons to determine the relative order of elements.
  o E.g., insertion sort, merge sort, quicksort, heapsort.
• The best worst-case running time that we have seen for comparison sorting is O(n log₂ n).
Is O(n log₂ n) the best we can do?
• Decision trees can help us answer this question.
Asymptotic lower bound – Ω-notation
• Provides an asymptotic lower bound on a function.
  o For a given function g(n), we denote by Ω(g(n)) (pronounced "big-omega of g of n") the set of functions
        Ω(g(n)) = { f(n) : there exist positive constants c and n₀
                    such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }
  o Explanation: f is big-omega of g if there is a constant c such that f(n) is on or above c·g(n) once n is large enough.
• Example:
  o √n = Ω(log₂ n)   (c = 1, n₀ = 16)
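As an extra worked check of this definition (an illustration of my own, not from the slides), take f(n) = 2n² + 3n and g(n) = n²; in LaTeX notation:

\[
0 \le c \cdot g(n) = 1 \cdot n^2 \le 2n^2 + 3n = f(n) \quad \text{for all } n \ge n_0 = 1,
\]

so f(n) = Ω(n²) with the constants c = 1 and n₀ = 1.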
Asymptotic lower bound – Ω-notation
• Saying that the running time of an algorithm is Ω(g(n)) means that the running time is at least a constant times g(n), for sufficiently large n.


Decision tree example
• Sort ⟨a₀, a₁, …, aₙ₋₁⟩, shown here for n = 3.
[Figure: the decision tree for sorting 3 elements. The root compares 0:1; the next level compares 1:2 (left) and 0:2 (right); the six leaves are the permutations ⟨0,1,2⟩, ⟨0,2,1⟩, ⟨2,0,1⟩, ⟨1,0,2⟩, ⟨1,2,0⟩, ⟨2,1,0⟩.]
• Each internal node is labeled i:j for i, j ∈ {0, 1, …, n-1}.
  o The left subtree shows subsequent comparisons if aᵢ ≤ aⱼ.
  o The right subtree shows subsequent comparisons if aᵢ > aⱼ.
Decision tree example
• Sort ⟨a₀, a₁, a₂⟩ = ⟨9, 4, 6⟩.
[Figure: the same tree with the root 0:1 highlighted; comparing a₀ = 9 with a₁ = 4, we have 9 > 4, so the sort follows the right branch.]
• Each internal node is labeled i:j for i, j ∈ {0, 1, …, n-1}.
  o The left subtree shows subsequent comparisons if aᵢ ≤ aⱼ.
  o The right subtree shows subsequent comparisons if aᵢ > aⱼ.
Decision tree example
• Sort ⟨a₀, a₁, a₂⟩ = ⟨9, 4, 6⟩.
[Figure: the node 0:2 is highlighted; comparing a₀ = 9 with a₂ = 6, we have 9 > 6, so the sort again follows the right branch.]
• Each internal node is labeled i:j for i, j ∈ {0, 1, …, n-1}.
  o The left subtree shows subsequent comparisons if aᵢ ≤ aⱼ.
  o The right subtree shows subsequent comparisons if aᵢ > aⱼ.
Decision tree example
• Sort ⟨a₀, a₁, a₂⟩ = ⟨9, 4, 6⟩.
[Figure: the node 1:2 is highlighted; comparing a₁ = 4 with a₂ = 6, we have 4 ≤ 6, so the sort follows the left branch.]
• Each internal node is labeled i:j for i, j ∈ {0, 1, …, n-1}.
  o The left subtree shows subsequent comparisons if aᵢ ≤ aⱼ.
  o The right subtree shows subsequent comparisons if aᵢ > aⱼ.
Decision tree example
• Sort ⟨a₀, a₁, a₂⟩ = ⟨9, 4, 6⟩.
[Figure: the path ends at the leaf ⟨1, 2, 0⟩, meaning a₁ ≤ a₂ ≤ a₀, i.e. 4 ≤ 6 ≤ 9.]
• Each leaf contains a permutation ⟨π(0), π(1), …, π(n-1)⟩ to indicate that the ordering a_π(0) ≤ a_π(1) ≤ … ≤ a_π(n-1) has been established.
Decision tree model
A decision tree can model the execution of any
comparison sort:
o One tree for each input size n.
o View the algorithm as splitting whenever it
compares two elements.
o The tree contains the comparisons along all
possible instruction traces.
o The running time of the algorithm = the
length of the path taken.
o Worst-case running time = height of tree.
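To make the decision-tree view concrete, here is a small C++ sketch (my own illustration, not part of the lecture) that counts the comparisons an actual comparison sort performs; each run follows one root-to-leaf path of the tree, so the count can never be smaller than the tree height needed to distinguish all n! permutations.

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <random>
#include <vector>

int main() {
    const int n = 1000;
    std::vector<int> a(n);
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> dist(0, 1000000);
    for (int& x : a) x = dist(rng);                  // random input

    long long comparisons = 0;
    // Each comparison is one internal node on the path through the decision tree.
    std::sort(a.begin(), a.end(), [&comparisons](int x, int y) {
        ++comparisons;
        return x < y;
    });

    // log2(n!) is the height lower bound proved on the next slide.
    double lower_bound = std::lgamma(n + 1.0) / std::log(2.0);
    std::printf("comparisons: %lld, lower bound log2(n!) = %.0f\n",
                comparisons, lower_bound);
}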
Lower bound for decision-tree sorting
Theorem. Any decision tree that can sort n elements must have height Ω(n log₂ n).
Proof. The tree must contain ≥ n! leaves, since there are n! possible permutations. A height-h binary tree has ≤ 2^h leaves. Thus, n! ≤ 2^h.
∴ h ≥ log₂(n!)          (log₂ is monotonically increasing)
     ≥ log₂((n/e)^n)    (Stirling's formula)
     = n log₂ n − n log₂ e
     = Ω(n log₂ n).


Lower bound for comparison sorting
Corollary. Heapsort and merge sort are
asymptotically optimal comparison sorting
algorithms.



Sorting in Linear Time
Counting sort: no comparisons between elements.
• Input: A[0..n-1], where A[j] ∈ {0, 1, …, k-1}.
• Output: B[0..n-1], sorted.
• Auxiliary storage: C[0..k-1].


Counting sort
for i ← 0 to k-1
    do C[i] ← 0
for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|
for i ← 1 to k-1
    do C[i] ← C[i] + C[i-1]        ⊳ C[i] = |{key ≤ i}|
for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
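A direct C++ translation of the pseudocode above (a sketch under the same assumptions: n elements, each key an integer in {0, …, k-1}; the function name and use of std::vector are mine). Note that with 0-indexed arrays the count C[A[j]] is decremented before it is used as the slot index.

#include <vector>

// Stable counting sort: every key of A must lie in {0, 1, ..., k-1}.
std::vector<int> counting_sort(const std::vector<int>& A, int k) {
    const int n = static_cast<int>(A.size());
    std::vector<int> C(k, 0), B(n);

    for (int j = 0; j < n; ++j)            // C[i] = |{key = i}|
        ++C[A[j]];
    for (int i = 1; i < k; ++i)            // C[i] = |{key <= i}|
        C[i] += C[i - 1];
    for (int j = n - 1; j >= 0; --j) {     // scan from the back so the sort is stable
        --C[A[j]];                         // last free slot for key A[j]
        B[C[A[j]]] = A[j];
    }
    return B;
}

For the example on the following slides, counting_sort({3, 0, 2, 3, 2}, 4) returns {0, 2, 2, 3, 3}.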


Counting sort - Example
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:

B:


Loop 1
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  0  0  0  0

B:

for i ← 0 to k-1
    do C[i] ← 0


Loop 2  (A[0] = 3 is counted)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  0  0  0  1

B:

for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|


Loop 2  (A[1] = 0 is counted)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  0  1

B:

for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|


Loop 2  (A[2] = 2 is counted)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  1  1

B:

for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|


Loop 2  (A[3] = 3 is counted)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  1  2

B:

for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|


Loop 2  (A[4] = 2 is counted)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2

B:

for j ← 0 to n-1
    do C[A[j]] ← C[A[j]] + 1       ⊳ C[i] = |{key = i}|


Loop 3  (i = 1)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:                         C': 1  1  2  2

for i ← 1 to k-1
    do C[i] ← C[i] + C[i-1]        ⊳ C[i] = |{key ≤ i}|


Loop 3  (i = 2)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:                         C': 1  1  3  2

for i ← 1 to k-1
    do C[i] ← C[i] + C[i-1]        ⊳ C[i] = |{key ≤ i}|


Loop 3  (i = 3)
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:                         C': 1  1  3  5

for i ← 1 to k-1
    do C[i] ← C[i] + C[i-1]        ⊳ C[i] = |{key ≤ i}|


Loop 4  (j = 4: A[4] = 2 is placed at B[2])
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:    -  -  2  -  -        C': 1  1  2  5

for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
Loop 4  (j = 3: A[3] = 3 is placed at B[4])
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:    -  -  2  -  3        C': 1  1  2  4

for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
Loop 4  (j = 2: A[2] = 2 is placed at B[1])
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:    -  2  2  -  3        C': 1  1  1  4

for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
Loop 4  (j = 1: A[1] = 0 is placed at B[0])
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:    0  2  2  -  3        C': 0  1  1  4

for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
Loop 4  (j = 0: A[0] = 3 is placed at B[3])
      0  1  2  3  4            0  1  2  3
A:    3  0  2  3  2        C:  1  0  2  2
                               0  1  2  3
B:    0  2  2  3  3        C': 0  1  1  3

for j ← n-1 downto 0
    do C[A[j]] ← C[A[j]] - 1
       B[C[A[j]]] ← A[j]
Counting sort - Analysis
O(k)     for i ← 0 to k-1
             do C[i] ← 0
O(n)     for j ← 0 to n-1
             do C[A[j]] ← C[A[j]] + 1
O(k)     for i ← 1 to k-1
             do C[i] ← C[i] + C[i-1]
O(n)     for j ← n-1 downto 0
             do C[A[j]] ← C[A[j]] - 1
                B[C[A[j]]] ← A[j]
Total: O(n + k)
Counting sort – Running time
If k = O(n), then counting sort takes O(n) time.
  o But sorting takes Ω(n log₂ n) time!
  o Where is the fallacy?
Answer:
  o Comparison sorting takes Ω(n log₂ n) time.
  o Counting sort is not a comparison sort.
  o In fact, not a single comparison between elements occurs!


Stable sorting
• Counting sort is a stable sort: it preserves the input order among equal elements.
      0  1  2  3  4
A:    3  0  2  3  2
B:    0  2  2  3  3
• Exercise: What other sorts have this property?
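To see what this property means in code, here is a small illustration (mine, not from the slides) using std::stable_sort on (key, original position) records; equal keys keep their input order, exactly as counting sort guarantees and as radix sort will rely on next:

#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
    // The keys of the slide's example, tagged with their original positions.
    std::vector<std::pair<int, int>> a = {{3, 0}, {0, 1}, {2, 2}, {3, 3}, {2, 4}};

    // Stable sort by key only: ties keep their original relative order.
    std::stable_sort(a.begin(), a.end(),
                     [](const std::pair<int, int>& x, const std::pair<int, int>& y) {
                         return x.first < y.first;
                     });

    for (const auto& p : a)
        std::printf("key %d from position %d\n", p.first, p.second);
    // The two 2s appear as positions 2 then 4, and the two 3s as positions 0 then 3.
}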


Radix sort
• Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census.
• Digit-by-digit sort.
• Hollerith's original idea: sort on the most-significant digit first.
• Good idea: sort on the least-significant digit first with an auxiliary stable sort.


Operation of LSD Radix sort
(right column: the array after a stable sort on the ones digit)
input   ones
329     720
457     355
657     436
839     457
436     657
720     329
355     839


Operation of LSD Radix sort
(each column: the array after a stable sort on the named digit)
input   ones   tens
329     720    720
457     355    329
657     436    436
839     457    839
436     657    355
720     329    457
355     839    657


Operation of LSD Radix sort
(each column: the array after a stable sort on the named digit)
input   ones   tens   hundreds
329     720    720    329
457     355    329    355
657     436    436    436
839     457    839    457
436     657    355    657
720     329    457    720
355     839    657    839


Radix sort
RADIX-SORT(A, d)
  for i ← 0 to d-1 do
      use a stable sort to sort array A using digit i as the key

Cost: n numbers of d digits each, where each digit can take k possible values.
One pass of the stable sort (counting sort) costs O(n + k), so the total is
O(d(n + k)), which is O(n) if d is constant and k = O(n).
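A C++ sketch of RADIX-SORT for non-negative integers in base 10 (the helper layout is mine; each pass is the stable counting sort from the earlier slide, with k = 10):

#include <vector>

// LSD radix sort: one stable counting-sort pass per decimal digit,
// starting from the least-significant digit.
void radix_sort(std::vector<int>& A, int d) {
    const int k = 10;                               // each digit has 10 possible values
    int divisor = 1;                                // 10^i selects digit i
    for (int i = 0; i < d; ++i, divisor *= 10) {
        std::vector<int> C(k, 0), B(A.size());
        for (int x : A) ++C[(x / divisor) % k];     // count digit occurrences
        for (int v = 1; v < k; ++v) C[v] += C[v - 1];
        for (int j = static_cast<int>(A.size()) - 1; j >= 0; --j) {
            const int digit = (A[j] / divisor) % k;
            B[--C[digit]] = A[j];                   // place from the back: stable
        }
        A = B;                                      // input for the next pass
    }
}

Calling radix_sort on the array from the previous slides, {329, 457, 657, 839, 436, 720, 355} with d = 3, reproduces the three passes shown there and ends with {329, 355, 436, 457, 657, 720, 839}.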
Analysis of Radix sort
• Assume counting sort is the auxiliary stable sort.
• Sort n computer words of b bits each.
• Each word can be viewed as having b/r base-2^r digits.
[Figure: a 32-bit word split into four 8-bit digits: 8 | 8 | 8 | 8]
• Example (32-bit words): r = 8 gives b/r = 4 passes of counting sort on base-2^8 digits; r = 16 gives b/r = 2 passes on base-2^16 digits.
How many passes should we make?
Analysis of Radix sort
• Recall: counting sort takes O(n + k) time to sort n numbers in the range 0 to k - 1.
• If each b-bit word is broken into r-bit pieces, each pass of counting sort takes O(n + 2^r) time. Since there are b/r passes, we have
      T(n, b) = O( (b/r) · (n + 2^r) )
• Choose r to minimize T(n, b):
  o Increasing r means fewer passes, but once r ≫ log₂ n, the time grows exponentially.


Choosing r
      T(n, b) = O( (b/r) · (n + 2^r) )
• Minimize T(n, b) by differentiating and setting the derivative to 0.
• Or just observe that we do not want 2^r ≫ n, and there is no harm asymptotically in choosing r as large as possible subject to this constraint.
• Choosing r = log₂ n implies T(n, b) = O(bn / log₂ n).
  o For numbers in the range 0 to n^d - 1, we have b = d log₂ n, so radix sort runs in O(dn) time.
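A quick numeric check of this choice (illustrative values of my own, not from the slides): for b = 32-bit words and n = 2¹⁶ keys, taking r = log₂ n = 16 gives, in LaTeX notation,

\[
T(n, b) = \frac{b}{r}\left(n + 2^{r}\right) = \frac{32}{16}\left(2^{16} + 2^{16}\right) = 4n = O(n),
\]

whereas r = 24 would make the 2^r term dominate each pass (2²⁴ ≫ n), and r = 8 would double the number of passes for no asymptotic gain.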
Radix sort – Conclusion
• In practice, radix sort is fast for large inputs, as well as simple to code and maintain.
• Example (32-bit numbers):
  o At most 3 passes when sorting ≥ 2000 numbers.
  o Merge sort and quicksort do at least ⌈log₂ 2000⌉ = 11 passes.
• Downside: unlike quicksort, radix sort displays little locality of reference, so a well-tuned quicksort often fares better on modern processors.


Most significant digit Radix sort
• Uses lexicographic order, which is suitable for sorting strings, such as words, or fixed-length integer representations.
• No need to preserve the order of duplicate keys.
• Example:
  o car, bar, care, bare → bar, bare, car, care
  o 9, 8, 10, 1, 3 → 1, 10, 3, 8, 9


Homework
• Read chapter 11 (page 305~)
• Do Group Assignment 1
• Next week: Individual Assignment 2 (Sorting)

