Notes2 Handout
Stacks:
- Container of objects that are inserted and removed according to the Last-In First-Out (LIFO) principle:
  - Only the most-recently inserted object can be removed.
  - Insert and remove are usually called push and pop.
Queues (often called FIFO Queues)
- Container of objects that are inserted and removed according to the First-In First-Out (FIFO) principle:
  - Only the element that has been in the queue the longest can be removed.
  - Insert and remove are usually called enqueue and dequeue.
- Elements are inserted at the rear of the queue and are removed from the front.
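As a quick illustration (a minimal sketch; the slides do not prescribe any implementation), a Python list works as a stack and collections.deque as a queue:

```python
from collections import deque

# Stack: push and pop at the same end (LIFO)
stack = []
stack.append(1)    # push
stack.append(2)    # push
top = stack.pop()  # pop: removes 2, the most-recently inserted object

# Queue: enqueue at the rear, dequeue from the front (FIFO)
queue = deque()
queue.append("a")        # enqueue
queue.append("b")        # enqueue
front = queue.popleft()  # dequeue: removes "a", in the queue longest
```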
Dictionaries/Maps
- A Dictionary (or Map) stores <key,value> pairs, which are often referred to as items.
- There can be at most one item with a given key.
- Examples:
  1. <Student ID, Student data>
  2. <Object ID, Object data>
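For instance (a minimal illustration using Python's built-in dict, which implements this ADT):

```python
# A Python dict stores <key, value> items -- at most one item per key.
students = {}                   # empty dictionary
students[42] = "Alice data"     # insert <Student ID, Student data>
students[42] = "Bob data"       # same key: the old item is replaced
record = students.get(42)       # look up a value by its key
missing = students.get(99)      # absent key: returns None
```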
Hashing
An efficient method for implementing a dictionary. Uses:
- A hash table, an array of size N.
- A hash function, which maps any key from the set of possible keys to an integer in the range [0, N − 1].
- A collision strategy, which determines what to do when two keys are mapped to the same table location by the hash function. Commonly used collision strategies are:
  - Chaining
  - Open addressing: linear probing, quadratic probing, double hashing
  - Cuckoo hashing
Hashing is fast:
- O(1) expected time for access, insertion
- Cuckoo hashing improves the access time to O(1) worst-case time. Insertion time remains O(1) expected time.
Disadvantages on next slide.
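The chaining strategy can be sketched as follows (a minimal illustration; the class name and methods are made up, not from the slides). Each of the N table slots holds a list of the items whose keys hash to that slot:

```python
class ChainedHashTable:
    """Hash table with chaining: table[i] is a list of <key,value> pairs."""

    def __init__(self, N=11):
        self.N = N
        self.table = [[] for _ in range(N)]

    def _slot(self, key):
        # Hash function: map any key to an integer in [0, N-1].
        return hash(key) % self.N

    def put(self, key, value):
        bucket = self.table[self._slot(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # at most one item per key
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for k, v in self.table[self._slot(key)]:
            if k == key:
                return v
        return None
```

With a good hash function the buckets stay short on average, giving the O(1) expected-time access and insertion claimed above.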
CompSci 161—Spring 2021— c M. B. Dillencourt—University of California, Irvine
Hashing: Disadvantages
[Figures: a binary tree drawn with Levels 1-3 marked, node labels B, C / D, E, F / G, H; and a second tree with keys 36, 65 at the top, 25, 52, 79 below, and 9, 32 below that]
The depth of a binary tree is the maximum of the levels of all its leaves.
def binarySearch(A, x, first, last):
    if first > last:
        return -1
    else:
        mid = (first + last) // 2   # floor of the midpoint
        if x == A[mid]:
            return mid
        elif x < A[mid]:
            return binarySearch(A, x, first, mid - 1)
        else:
            return binarySearch(A, x, mid + 1, last)

Initial call: binarySearch(A, x, 0, n-1)
[Figures: the terminating configurations of binary search, with mid marked (case 2: last = first + 1; case 3: last = first), and the comparison tree of binary search on 13 elements, with probe indices 2, 9 / 0, 4, 7, 11 / 1, 3, 5, 8, 10, 12 on successive levels]
Hence any algorithm for locating an item in an array of size n using only
comparisons must perform at least ⌊lg n⌋ + 1 comparisons in the worst case.
So binary search is optimal with respect to worst-case performance.
Sorting
Comparison-based sorting
- Basic operation: compare two items.
  - Abstract model.
  - Advantage: doesn't use specific properties of the data items, so the same algorithm can be used for sorting integers, strings, etc.
  - Disadvantage: under certain circumstances, specific properties of the data items can speed up the sorting process.
- Measure of time: number of comparisons
  - Consistent with the philosophy of counting basic operations, discussed earlier.
  - Misleading if other operations dominate (e.g., if we sort by moving items around without comparing them)
- Comparison-based sorting has a lower bound of Ω(n log n) comparisons. (We will prove this.)
[Plot: y = n²/2 versus y = 10 n lg n, for n from 200 to 1000; at n = 1000 the quadratic curve has reached 500,000 while 10 n lg n is still near 100,000]
Some terminology
An inversion in a list is a pair of elements that are out of order: the larger element appears before the smaller one.
Example: The list 18 29 12 15 32 10 has 9 inversions:
(18, 12), (18, 15), (18, 10), (29, 12), (29, 15), (29, 10), (12, 10), (15, 10), (32, 10)
Insertion sort
- Work from left to right across the array
- Insert each item in its correct position with respect to the (sorted) elements to its left
[Figure: the array at indices 0 through n−1 during the sort: a sorted prefix, the current item x, then an unsorted suffix; when the sort finishes, the entire array is sorted]
23 19 42 17 85 38
19 23 42 17 85 38
19 23 42 17 85 38
17 19 23 42 85 38
17 19 23 42 85 38
17 19 23 38 42 85
Time ≤ n − 1 + (# inversions)
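The procedure above can be sketched in Python (a minimal version; the slides give no code for insertion sort):

```python
def insertion_sort(A):
    # Work left to right; insert A[i] into the sorted prefix A[0..i-1].
    for i in range(1, len(A)):
        x = A[i]
        j = i - 1
        # Shift larger elements one slot right; each shift removes
        # exactly one inversion involving x.
        while j >= 0 and A[j] > x:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = x
    return A
```

The inner loop's shift count is what makes the total time at most n − 1 plus the number of inversions.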
Selection Sort
- Two variants:
  1. Repeatedly (for i from 0 to n − 1) find the minimum value, output it, delete it.
     - Values are output in sorted order.
  2. Repeatedly (for i from n − 1 down to 1):
     - Find the maximum of A[0], A[1], ..., A[i].
     - Swap this value with A[i] (a no-op if it is already A[i]).
- Both variants run in O(n²) time if we use the straightforward approach to finding the maximum/minimum.
- They can be improved by treating the items A[0], A[1], ..., A[i] as items in an appropriately designed priority queue. (Next set of notes)
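Variant 2 can be sketched as follows (a minimal in-place version, assuming the straightforward maximum-finding scan):

```python
def selection_sort(A):
    # For i from n-1 down to 1: find the maximum of A[0..i]
    # and swap it into position i.
    for i in range(len(A) - 1, 0, -1):
        max_index = 0
        for j in range(1, i + 1):
            if A[j] > A[max_index]:
                max_index = j
        # No-op if the maximum is already at A[i].
        A[i], A[max_index] = A[max_index], A[i]
    return A
```

The two nested scans are the source of the O(n²) bound; replacing the inner scan with a priority-queue extraction is the improvement mentioned above.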
Quicksort
Basic idea
- Classify keys as small keys or large keys. All small keys are less than all large keys.
- Rearrange the keys so all small keys precede all large keys.
- Recursively sort the small keys; recursively sort the large keys.
[Figure: the array from first to last, before and after splitting: the pivot x followed by unprocessed keys; then keys < x, x, keys ≥ x]
def quickSort(A, first, last):
    if first < last:
        splitpoint = split(A, first, last)
        quickSort(A, first, splitpoint - 1)
        quickSort(A, splitpoint + 1, last)
def split(A, first, last):
    splitpoint = first
    x = A[first]                 # pivot
    for k in range(first + 1, last + 1):
        if A[k] < x:
            A[splitpoint + 1], A[k] = A[k], A[splitpoint + 1]
            splitpoint = splitpoint + 1
    A[first], A[splitpoint] = A[splitpoint], A[first]
    return splitpoint
Loop invariants:
- A[first+1..splitpoint] contains keys < x.
- A[splitpoint+1..k-1] contains keys ≥ x.
- A[k..last] contains unprocessed keys.
[Figure: the array at the start (x followed by unprocessed keys, splitpoint at first); in the middle of the loop (x, keys < x, keys ≥ x, unprocessed keys, with first, splitpoint, k, last marked); and at the end (x, keys < x, keys ≥ x, with first, splitpoint, last marked)]
Example trace of split on 27 83 23 36 15 79 22 18 (s = splitpoint, k = scan index):
27 83 23 36 15 79 22 18   (s=0, k=1: 83 ≥ 27, no swap)
27 83 23 36 15 79 22 18   (s=0, k=2: 23 < 27, swap A[1] and A[2])
27 23 83 36 15 79 22 18   (s=1, k=3: 36 ≥ 27, no swap)
27 23 83 36 15 79 22 18   (s=1, k=4: 15 < 27, swap A[2] and A[4])
27 23 15 36 83 79 22 18   (s=2, k=5: 79 ≥ 27, no swap)
27 23 15 36 83 79 22 18   (s=2, k=6: 22 < 27, swap A[3] and A[6])
27 23 15 22 83 79 36 18   (s=3, k=7: 18 < 27, swap A[4] and A[7])
27 23 15 22 18 79 36 83   (s=4: loop ends)
18 23 15 22 27 79 36 83   (final swap of A[first] and A[splitpoint]; return 4)
Analysis of Quicksort
We can visualize the lists sorted by quicksort as a binary tree.
- The root is the top-level list (of all items to be sorted).
- The children of a node are the two sublists to be sorted.
- Identify each list with its split value.
[27 83 23 36 15 79 22 18]
[18 23 15 22]   [79 36 83]
[15] [23 22]    [36] [83]
     [22]
Worst case: if the input is already sorted, each split removes only the smallest element, so the lists are
1 2 3 ... n − 1 n
2 3 ... n − 1 n
3 ... n − 1 n
...
n − 1 n
n
and (n − 1) + (n − 2) + ... + 1 = n(n − 1)/2 comparisons are required. So the worst-case running time for
Quicksort is Θ(n²). But what about the average case ...?
Our approach:
1. Use the binary tree of sorted lists
2. Number the items in sorted order
3. Calculate the probability that two items get compared
4. Use this to compute the expected number of comparisons
performed by Quicksort.
[27 83 23 36 15 79 22 18]
[18 23 15 22]   [79 36 83]
[15] [23 22]    [36] [83]
     [22]
Sorted order: 15 18 22 23 27 36 79 83
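Step 3 can be carried out as follows (the standard argument for this analysis, sketched here; the slides do not show the derivation). Writing the items in sorted order as s_1 < s_2 < ... < s_n, two items s_i and s_j (i < j) are compared exactly when one of them is the first among s_i, s_{i+1}, ..., s_j to be chosen as a split value, which happens with probability

```latex
\Pr[s_i \text{ and } s_j \text{ are compared}] = \frac{2}{j - i + 1}
% Summing over all pairs bounds the expected number of comparisons:
E[\text{comparisons}] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j - i + 1}
  \;\le\; 2n \sum_{k=2}^{n} \frac{1}{k} \;=\; O(n \log n)
```

so the expected running time of Quicksort is O(n log n).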
MergeSort
- Split the array into two equal (within one) subarrays
- Sort both subarrays (recursively)
- Merge the two sorted subarrays

def mergeSort(A, first, last):
    if first < last:
        mid = (first + last) // 2   # floor of the midpoint
        mergeSort(A, first, mid)
        mergeSort(A, mid + 1, last)
        merge(A, first, mid, mid + 1, last)
Example: merging the sorted halves 19 26 42 71 and 14 24 31 39 into the temp array yields 14 19 24 26 31 39 42 71.
def merge(A, first1, last1, first2, last2):
    temp = []                       # auxiliary array
    index1, index2 = first1, first2
    # Merge into temp array until one input array is exhausted
    while index1 <= last1 and index2 <= last2:
        if A[index1] <= A[index2]:
            temp.append(A[index1]); index1 += 1
        else:
            temp.append(A[index2]); index2 += 1
    # Copy appropriate trailer portion
    while index1 <= last1:
        temp.append(A[index1]); index1 += 1
    while index2 <= last2:
        temp.append(A[index2]); index2 += 1
    # Copy temp array back to A array
    A[first1:last2 + 1] = temp
Analysis of Mergesort
Example: The list [18, 29, 12, 15, 32, 10] has 9 inversions:
(18, 12), (18, 15), (18, 10), (29, 12), (29, 15), (29, 10), (12, 10), (15, 10), (32, 10)
In a list of size n, there can be as many as n(n − 1)/2 inversions.
Inversion Counting
Key observation: when an element of the second subarray is copied into temp ahead of the remaining elements of the first subarray, it forms an inversion with each of them, so that step accounts for exactly last1 − index1 + 1 inversions.
Example: merging 19 26 42 71 and 14 24 31 39. When temp contains 14 19 24 26 31, the element 31 from the second subarray has just been copied ahead of the remaining elements 42 and 71 of the first subarray, accounting for the last1 − index1 + 1 = 2 inversions (42, 31) and (71, 31).
def mergeSort(A, first, last):
    invCount = 0
    if first < last:
        mid = (first + last) // 2
        invCount += mergeSort(A, first, mid)
        invCount += mergeSort(A, mid + 1, last)
        invCount += merge(A, first, mid, mid + 1, last)
    return invCount
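The slides do not show the modified merge; a minimal sketch consistent with the observation above (adding last1 − index1 + 1 whenever an element of the second subarray is copied first; function names are illustrative):

```python
def merge_count(A, first1, last1, first2, last2):
    # Standard merge, extended to return the number of inversions
    # between the two sorted halves.
    temp, invCount = [], 0
    index1, index2 = first1, first2
    while index1 <= last1 and index2 <= last2:
        if A[index1] <= A[index2]:
            temp.append(A[index1]); index1 += 1
        else:
            # A[index2] is copied ahead of every remaining element
            # of the first half: last1 - index1 + 1 inversions removed.
            invCount += last1 - index1 + 1
            temp.append(A[index2]); index2 += 1
    temp.extend(A[index1:last1 + 1])
    temp.extend(A[index2:last2 + 1])
    A[first1:last2 + 1] = temp
    return invCount

def merge_sort_count(A, first, last):
    invCount = 0
    if first < last:
        mid = (first + last) // 2
        invCount += merge_sort_count(A, first, mid)
        invCount += merge_sort_count(A, mid + 1, last)
        invCount += merge_count(A, first, mid, mid + 1, last)
    return invCount
```

This counts all inversions while still running in O(n log n) time, since each merge does only O(1) extra work per element.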
Listing inversions
We have just seen that we can count inversions without increasing
the asymptotic running time of Mergesort. Suppose we want to list
inversions. When we remove inversions, we list all inversions
removed:
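A minimal sketch of the listing variant (names illustrative, not from the slides): in the merge, emit each removed inversion as a pair instead of just counting. Note that listing can take Θ(n²) time in the worst case, since there can be that many inversions.

```python
def merge_list_inversions(A, first1, last1, first2, last2):
    # Merge as before, but list each inversion as it is removed:
    # when A[index2] is copied first, it is inverted with every
    # remaining element A[index1..last1] of the first subarray.
    temp, inversions = [], []
    index1, index2 = first1, first2
    while index1 <= last1 and index2 <= last2:
        if A[index1] <= A[index2]:
            temp.append(A[index1]); index1 += 1
        else:
            for j in range(index1, last1 + 1):
                inversions.append((A[j], A[index2]))
            temp.append(A[index2]); index2 += 1
    temp.extend(A[index1:last1 + 1])
    temp.extend(A[index2:last2 + 1])
    A[first1:last2 + 1] = temp
    return inversions
```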
[Figure: the two subarrays with markers first1, index1, last1 and first2, index2, last2, and the temp array with its tempindex marker]