UNIT - 5 Advanced Algorithm PDF
LINEAR PROGRAMMING
Linear programming (LP) is a method for achieving the optimum outcome under requirements represented by linear relationships. LP is an optimization technique for a system of linear constraints and a linear objective function. The objective function defines the quantity to be optimized, and the goal of linear programming is to find the values of the variables that maximize or minimize the objective function subject to the linear constraints. In general, an LP model is based on the following assumptions:
1. Proportionality: The contribution of each decision variable to the objective function and to each constraint is directly proportional to the value of that variable.
2. Additivity: The total value of the objective function and of each constraint function is obtained by adding up the individual contributions from each variable.
3. Divisibility: The decision variables are allowed to take on any real numerical values within some
range specified by the constraints. That is, the variables are not restricted to integer values. When
fractional values do not make a sensible solution, such as the number of flights an airline should have
each day between two cities, the problem should be formulated and solved as an integer program.
4. Certainty: We assume that the parameter values in the model are known with certainty, or are at least treated that way. The optimal solution obtained is optimal for the specific problem formulated; if the parameter values are wrong, then the resulting solution is of little value. In practice, the assumptions of proportionality and additivity need the greatest care and are the most likely to be violated by the modeler. With experience, we recognize when integer solutions are needed and when the variables must be modeled explicitly.
Solution
The quantities that the manager controls are the amounts of each grain to feed each sheep daily. We
define xj = the number of pounds of grain j (j = 1, 2, 3) to feed each sheep daily. Note that the units of
measure are completely specified. In addition, the variables are expressed on a per sheep basis. If we
minimize the cost per sheep, we minimize the cost for any group of sheep. The daily feed cost per sheep
will be
the sum over the grains j of (cost per lb of grain j) × (lb of grain j fed to each sheep daily)
That is, the objective function is to
Minimize z = 41x1 + 36x2 + 96x3
Why can’t the manager simply make all the variables equal to zero? This keeps costs at zero, but the
manager would have a flock of dead sheep, because there are minimum constraints that must be
satisfied. The values of the variables must be chosen so that the number of units of nutrient A consumed
daily by each sheep is equal to or greater than 110. Expressing this in terms of the variables yields
20x1 + 30x2 + 70x3 ≥ 110
The constraints for the other nutrients are
10x1 + 10x2 ≥ 18
50x1 + 30x2 ≥ 90
6x1 + 2.5x2 + 10x3 ≥ 14 and,
finally, all xj ≥ 0.
The optimal solution to this problem (obtained using a computer software package) is x1 = 0.595, x2 = 2.008, x3 = 0.541, and z = 148.6 cents.
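As a minimal sketch (not part of the original notes), the feed-mix LP above can be solved with SciPy's linprog; the coefficients are taken from the formulation just given, and since linprog works with "less than or equal" rows, each ≥ constraint is negated.

# A sketch: solving the feed-mix LP with SciPy (coefficients from the text above).
from scipy.optimize import linprog

c = [41, 36, 96]                      # cost per lb of grains 1..3 (cents)
A_ub = [[-20, -30, -70],              # nutrient A: 20x1 + 30x2 + 70x3 >= 110
        [-10, -10,   0],              # 10x1 + 10x2 >= 18
        [-50, -30,   0],              # 50x1 + 30x2 >= 90
        [ -6, -2.5, -10]]             # 6x1 + 2.5x2 + 10x3 >= 14
b_ub = [-110, -18, -90, -14]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print(res.x, res.fun)                 # roughly (0.595, 2.008, 0.541), z about 148.6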
THE GEOMETRY OF LINEAR PROGRAMS
The characteristic that makes linear programs easy to solve is their simple geometric structure. Let’s
define some terminology. A solution for a linear program is any set of numerical values for the
variables. These values need not be the best values and do not even have to satisfy the constraints or
make sense. For example, in the Healthy Pet Food problem, M = 25 and Y = -800 is a solution, but it
does not satisfy the constraints, nor does it make physical sense. A feasible solution is a solution that
satisfies all of the constraints. The feasible set or feasible region is the set of all feasible solutions.
Finally, an optimal solution is the feasible solution that produces the best objective function value
possible.
[Figure: nested sets showing that the optimal solutions lie within the feasible solutions, which lie within the set of all solutions]
The above figure shows the relationships among these types of solutions. Let’s use the Healthy Pet
Food example to show the geometry of linear programs and to show how two-variable problems can be
solved graphically. The linear programming formulation for the Healthy Pet Food problem is:
Maximize z = 0.65M + 0.45Y
Subject to 2M + 3Y ≤ 400,000
3M + 1.5Y ≤ 300,000
M ≤ 90,000
M, Y ≥0
The fundamental theorem of linear programming reduces to a finite value the number of feasible
solutions that need to be evaluated. One solution strategy might be to identify the coordinates of
every extreme point and then evaluate the objective function at each. The one that produces the best
objective function value is the optimum. In practice, this approach is not efficient because the
number of extreme points can be very large for real problems with hundreds or thousands of
variables and constraints.
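To make the extreme-point idea concrete, here is a small sketch (not from the original text) that brute-forces the corner points of the Healthy Pet Food feasible region by intersecting pairs of constraint boundaries, keeping the feasible intersections, and evaluating the objective at each; the data are taken from the formulation above.

# A sketch of "evaluate the objective at every extreme point" for the
# Healthy Pet Food LP.  Every constraint (including M >= 0 and Y >= 0) is
# written as a row a1*M + a2*Y <= b; corners are intersections of two
# constraint boundaries that satisfy all constraints.
import itertools
import numpy as np

constraints = [
    (2.0, 3.0, 400_000.0),   # 2M + 3Y <= 400,000
    (3.0, 1.5, 300_000.0),   # 3M + 1.5Y <= 300,000
    (1.0, 0.0, 90_000.0),    # M <= 90,000
    (-1.0, 0.0, 0.0),        # -M <= 0, i.e. M >= 0
    (0.0, -1.0, 0.0),        # -Y <= 0, i.e. Y >= 0
]

def feasible(p, tol=1e-6):
    return all(a1 * p[0] + a2 * p[1] <= b + tol for a1, a2, b in constraints)

best = None
for (a1, a2, b1), (c1, c2, b2) in itertools.combinations(constraints, 2):
    A = np.array([[a1, a2], [c1, c2]])
    if abs(np.linalg.det(A)) < 1e-12:
        continue                              # parallel boundaries: no corner point
    p = np.linalg.solve(A, [b1, b2])
    if feasible(p):
        z = 0.65 * p[0] + 0.45 * p[1]
        if best is None or z > best[0]:
            best = (z, p)
print(best)   # best corner: z = 77,500 at M = 50,000, Y = 100,000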
The simplex algorithm begins by identifying an initial extreme point of the feasible set. The
algorithm then looks along each edge intersecting at the extreme point and computes the net effect
on the objective function if we were to move along the edge.
If the objective function value does not improve by moving along at least one of these edges, it can
be proved that the extreme point is optimal. If movement along one or more of the edges improves
the objective function value, we move along one of these edges until we reach a new extreme point.
We repeat the previous steps: checking along each edge intersecting at the extreme point and then
either stopping or sliding along another edge that improves the objective function value.
If an NP-hard problem can be solved in polynomial time, then all NP-complete problems can be
solved in polynomial time.
All NP-complete problems are NP-hard, but not all NP-hard problems are NP-complete.
The class of NP-hard problems is very rich in the sense that it contains many problems from a wide variety of disciplines.
The machines executing such operations are allowed to choose any one of these outcomes subject to a termination condition.
This leads to the concept of non-deterministic algorithms.
To specify such algorithms in SPARKS, we introduce three statements:
Choice(S): arbitrarily chooses one of the elements of the set S.
Failure: signals an unsuccessful completion.
Success: signals a successful completion.
Whenever there is a set of choices that leads to a successful completion then one such set of
choices is always made and the algorithm terminates.
A non-deterministic algorithm terminates unsuccessfully if and only if there exists no set of
choices leading to a successful signal.
Nondeterministic algorithms
A non-deterministic algorithm consists of two phases:
Phase 1: guessing
Phase 2: checking
If the checking stage of a non-deterministic algorithm is of polynomial time complexity, then this algorithm is called an NP (non-deterministic polynomial) algorithm.
NP problems (these must be decision problems), e.g.: searching, MST, sorting, the satisfiability problem (SAT), the travelling salesperson problem (TSP).
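As an illustration (not from the original notes), the two phases can be mimicked by replacing Choice with a random pick; the sketch below "guesses" an index for the searching problem and then checks it in polynomial time.

# A sketch of the guessing and checking phases of a non-deterministic
# algorithm, using the searching problem as an example.  Choice is
# simulated with a random pick; a true non-deterministic machine would
# always make a successful choice whenever one exists.
import random

def nondeterministic_search(a, key):
    j = random.randrange(len(a))      # Phase 1: guessing -- Choice(0 .. n-1)
    if a[j] == key:                   # Phase 2: checking -- polynomial time
        return ("Success", j)
    return ("Failure", None)

print(nondeterministic_search([7, 3, 9, 3], 9))   # may signal Success or Failure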
Example of NP-Complete and NP-Hard Problem
Chromatic Number Decision Problem (CNP)
i. A coloring of a graph G = (V, E) is a function f : V → {1, 2, …, k} defined for every vertex i ∈ V.
ii. If (u, v) ∈ E, then f(u) ≠ f(v).
iii. The CNP is to determine whether G has a coloring for a given k.
iv. Satisfiability with at most three literals per clause reduces to the chromatic number problem, so CNP is NP-hard.
CNF-satisfiability with at most three literals per clause is NP-hard. If each clause is restricted to have at most two literals, then CNF-satisfiability is polynomially solvable.
Generating optimal code for a parallel assignment statement is NP-hard. However, if the expressions are restricted to be simple variables, then optimal code can be generated in polynomial time.
Generating optimal code for level-one directed acyclic graphs is NP-hard, but optimal code for trees can be generated in polynomial time. Determining whether a planar graph is three-colorable is NP-hard, whereas determining whether it is two-colorable is a polynomial-time problem (one only has to check whether the graph is bipartite).
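As a small illustration (not part of the original notes), two-colorability can be decided in polynomial time with a breadth-first search that alternates colors and reports a conflict if an edge ends up with the same color on both ends.

# A sketch: deciding two-colorability (bipartiteness) by BFS coloring.
from collections import deque

def is_two_colorable(graph):
    """graph: dict mapping each vertex to a list of neighbours."""
    color = {}
    for start in graph:                      # handle disconnected graphs
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # give the neighbour the other color
                    queue.append(v)
                elif color[v] == color[u]:   # same color on both ends of an edge
                    return False
    return True

print(is_two_colorable({1: [2], 2: [1, 3], 3: [2]}))        # True  (a path)
print(is_two_colorable({1: [2, 3], 2: [1, 3], 3: [1, 2]}))  # False (a triangle)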
General Definitions
P, NP, NP-hard, NP-easy and NP-complete - polynomial-time reduction - examples of NP-complete problems.
P - Decision problems that can be solved in polynomial time, i.e., that can be solved “efficiently”.
NP - Decision problems whose “YES” answer can be verified in polynomial time, if we already
have the proof (or witness).
Co-NP - Decision problems whose “NO” answer can be verified in polynomial time, if we already have the proof (or witness). E.g., the satisfiability problem (SAT): given a Boolean formula, is it possible to assign the inputs x1...xn so that the formula evaluates to TRUE?
If the answer is YES with a proof (i.e., an assignment of input values), then we can check the proof in polynomial time (so SAT is in NP). We may not be able to check a NO answer in polynomial time. (Nobody really knows.)
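For instance (a sketch, not from the original notes), verifying a YES certificate for SAT just means plugging the proposed assignment into the formula and evaluating it, which takes time linear in the formula size; the CNF encoding used below is an assumed convention for illustration.

# A sketch of polynomial-time verification of a SAT certificate.
# Convention (assumed for illustration): a CNF formula is a list of clauses,
# each clause a list of literals; literal +i means xi, -i means NOT xi.

def verify_sat(cnf, assignment):
    """assignment: dict mapping variable index i to True/False."""
    for clause in cnf:
        if not any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            return False        # some clause has no true literal
    return True

# (x1 OR NOT x2) AND (x2 OR x3)
cnf = [[1, -2], [2, 3]]
print(verify_sat(cnf, {1: True, 2: False, 3: True}))   # True: a valid witness
print(verify_sat(cnf, {1: False, 2: False, 3: False})) # False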
NP-hard - A problem is NP-hard if and only if a polynomial-time algorithm for it implies a polynomial-time algorithm for every problem in NP. NP-hard problems are at least as hard as problems in NP.
NP-complete - A problem is NP-complete if it is NP-hard and is an element of NP.
If a polynomial time algorithm exists for any of these problems, all problems in NP would be
polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-
completeness is important for both theoretical and practical reasons.
Definition of NP-Completeness
A language B is NP-complete if it satisfies two conditions
B is in NP
Every A in NP is polynomial time reducible to B.
If a language satisfies the second property, but not necessarily the first one, the language B is known
as NP-Hard. Informally, a search problem B is NP-Hard if there exists some NP-
Complete problem A that Turing reduces to B.
NP-hard problems cannot be solved in polynomial time unless P = NP. If a problem is proved to be NPC, there is no need to waste time trying to find an efficient exact algorithm for it. Instead, we can focus on designing approximation algorithms.
NP-Complete Problems
Following are some NP-Complete problems, for which no polynomial time algorithm is known.
Determining whether a graph has a Hamiltonian cycle
Determining whether a Boolean formula is satisfiable, etc.
NP-Hard Problems
The following problems are NP-Hard
The circuit-satisfiability problem
Set Cover
Vertex Cover
Travelling Salesman Problem
In this context, we will now show that TSP is NP-Complete.
TSP is NP-Complete
The traveling salesman problem consists of a salesman and a set of cities. The salesman has to visit
each one of the cities starting from a certain one and returning to the same city. The challenge of the
problem is that the traveling salesman wants to minimize the total length of the trip
Proof
To prove TSP is NP-Complete, first we have to prove that TSP belongs to NP. Given a certificate (a proposed tour), we check that the tour contains each vertex exactly once, compute the total cost of the edges of the tour, and check that the cost is at most the given bound. This can be completed in polynomial time. Thus TSP belongs to NP.
Secondly, we have to prove that TSP is NP-hard. To prove this, one way is to show that Hamiltonian
cycle ≤p TSP (as we know that the Hamiltonian cycle problem is NP-complete).
Assume G = (V, E) to be an instance of Hamiltonian cycle.
Hence, an instance of TSP is constructed. We create the complete graph G' = (V, E'), where
E' = {(i, j) : i, j ∈ V and i ≠ j}.
The cost function is defined as follows:
t(i, j) = 0 if (i, j) ∈ E, and t(i, j) = 1 otherwise.
Now, suppose that a Hamiltonian cycle h exists in G. It is clear that the cost of each edge
in h is 0 in G' as each edge belongs to E. Therefore, h has a cost of 0 in G'. Thus, if graph G has a
Hamiltonian cycle, then graph G' has a tour of 0 cost.
Conversely, assume that G' has a tour h' of cost at most 0. The costs of edges in E' are 0 and 1 by definition, so each edge of h' must have cost 0, since the cost of h' is 0. We therefore conclude that h' contains only edges in E. We have thus proven that G has a Hamiltonian cycle if and only if G' has a tour of cost at most 0. Hence TSP is NP-complete.
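A minimal sketch (not part of the original text) of this reduction: given the vertices and edges of G, build the complete graph and the 0/1 cost function t, so that a Hamiltonian cycle in G corresponds exactly to a TSP tour of cost 0 in G'.

# A sketch of the Hamiltonian-cycle-to-TSP reduction described above.
from itertools import combinations

def reduce_hamiltonian_to_tsp(vertices, edges):
    edge_set = {frozenset(e) for e in edges}
    complete_edges = list(combinations(vertices, 2))        # E' = all pairs i != j
    cost = {frozenset(e): (0 if frozenset(e) in edge_set else 1)
            for e in complete_edges}                         # t(i, j) as in the proof
    return complete_edges, cost

def tour_cost(tour, cost):
    # cost of visiting the vertices in `tour` order and returning to the start
    return sum(cost[frozenset((tour[i], tour[(i + 1) % len(tour)]))]
               for i in range(len(tour)))

V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (4, 1)]        # G has the Hamiltonian cycle 1-2-3-4-1
_, t = reduce_hamiltonian_to_tsp(V, E)
print(tour_cost([1, 2, 3, 4], t))           # 0: a zero-cost TSP tour exists
print(tour_cost([1, 3, 2, 4], t))           # 2: this order uses edges not in E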
APPROXIMATION ALGORITHM:
Introduction:
An approximation algorithm is a way of dealing with NP-completeness for optimization problems. The goal of an approximation algorithm is to come as close as possible to the optimum value in a reasonable amount of time, which is at most polynomial time. Such algorithms are called approximation algorithms or heuristic algorithms.
o For the traveling salesperson problem, the optimization problem is to find the shortest cycle, and
the approximation problem is to find a short cycle.
o For the vertex cover problem, the optimization problem is to find the vertex cover with fewest
vertices and the approximation problem is to find the vertex cover with few vertices.
Performance Ratios:
Suppose we work on an optimization problem where every solution carries a cost. An Approximate
Algorithm returns a legal solution, but the cost of that legal solution may not be optimal.
For example, suppose we are looking for a minimum-size vertex cover (VC). An approximation algorithm returns a VC for us, but the size (cost) may not be minimum.
Another example: suppose we are looking for a maximum-size independent set (IS). An approximation algorithm returns an IS for us, but the size (cost) may not be maximum. Let C be the cost of the solution returned by an approximation algorithm and C* the cost of the optimal solution. We say the approximation algorithm has an approximation ratio P(n) for an input of size n, where max(C/C*, C*/C) ≤ P(n).
Intuitively, the approximation ratio measures how far the approximate solution is from the optimal solution. A large (small) approximation ratio means the solution is much worse than (more or less the same as) an optimal solution.
Observe that P(n) is always ≥ 1; if the ratio does not depend on n, we may simply write P. Therefore, a 1-approximation algorithm gives an optimal solution. Some problems have polynomial-time approximation algorithms with small constant approximation ratios, while others have best-known polynomial-time approximation algorithms whose approximation ratios grow with n.
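As a quick worked example (the numbers are chosen here purely for illustration, not taken from the text): if an approximation algorithm returns a vertex cover of size C = 6 on an instance whose optimal cover has size C* = 3, then max(C/C*, C*/C) = 6/3 = 2, so on that instance the algorithm is within a factor of 2 of optimal.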
Vertex Cover
A Vertex Cover of a graph G is a set of vertices such that each edge in G is incident to at least one of
these vertices. The decision vertex-cover problem was proven NPC. Now, we want to solve the optimization version of the vertex cover problem, i.e., we want to find a minimum-size vertex cover of a given graph.
We call such vertex cover an optimal vertex cover C*.
An approximate algorithm for vertex cover:
Approx-Vertex-Cover (G = (V, E))
{
C = ∅; E' = E;
While E' is not empty
{
Pick any edge (u, v) in E';
Add u and v to C;
Remove from E' all edges incident to u or v;
}
Return C;
}
The idea is to take edges (u, v) one by one, put both vertices into C, and remove all the edges incident to u or v. We carry on until all edges have been removed. C is a VC. But how good is C?
For the example graph in the accompanying figure, the algorithm returns VC = {b, c, d, e, f, g}.
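A short Python sketch (not part of the original notes) of the same procedure; the edge-list encoding of the graph is an assumption made for illustration.

# A sketch of the 2-approximation for vertex cover described above.
def approx_vertex_cover(edges):
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]                 # pick any edge (u, v)
        cover.update((u, v))                # add both endpoints to C
        # remove every edge incident to u or v
        remaining = [e for e in remaining if u not in e and v not in e]
    return cover

edges = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "e"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]
print(approx_vertex_cover(edges))   # a cover of size 6 for this edge list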
Traveling-salesman Problem
In the traveling salesman problem, a salesman must visit n cities. We can say that the salesman wishes to make a tour, or Hamiltonian cycle, visiting each city exactly once and finishing at the city he starts from. There is a non-negative cost c(i, j) to travel from city i to city j.
The goal is to find a tour of minimum cost. We assume that every two cities are connected. Such
problems are called Traveling-salesman problem (TSP).
We can model the cities as a complete graph of n vertices, where each vertex represents a city. It can be shown that TSP is NPC.
If we assume the cost function c satisfies the triangle inequality, then we can use the following
approximate algorithm.
Triangle inequality
Let u, v, w be any three vertices; then c(u, w) ≤ c(u, v) + c(v, w).
One important observation to develop an approximate solution is if we remove an edge from H*, then
the tour becomes a spanning tree.
Algorithm:
Approx-TSP (G= (V, E))
{ 1. Compute a MST T of G;
2. Select any vertex r to be the root of the tree;
3. Let L be the list of vertices visited in a preorder tree walk of T;
4. Return the Hamiltonian cycle H that visits the vertices in the order L;
}
Traveling-Salesman Problem
Intuitively, Approx-TSP first makes a full walk of the MST T, which visits each edge exactly twice. To create a Hamiltonian cycle from the full walk, it bypasses some vertices (which corresponds to taking a shortcut).
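A compact sketch (not from the original text) of Approx-TSP: build a minimum spanning tree with Prim's algorithm, take a preorder walk from an arbitrary root, and return that vertex order as the tour; the distance-matrix input format is assumed for illustration.

# A sketch of Approx-TSP: MST (Prim) + preorder walk, for a symmetric
# cost matrix that satisfies the triangle inequality.
import heapq

def approx_tsp(dist):
    n = len(dist)
    # --- Prim's algorithm: build the MST as parent -> children lists ---
    in_tree = [False] * n
    children = {v: [] for v in range(n)}
    heap = [(0, 0, None)]                  # (edge cost, vertex, parent)
    while heap:
        _, u, parent = heapq.heappop(heap)
        if in_tree[u]:
            continue
        in_tree[u] = True
        if parent is not None:
            children[parent].append(u)
        for v in range(n):
            if not in_tree[v]:
                heapq.heappush(heap, (dist[u][v], v, u))
    # --- preorder walk of the MST, rooted at vertex 0 ---
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour                            # visiting order of the Hamiltonian cycle

dist = [[0, 2, 5, 3],                      # Manhattan distances: triangle inequality holds
        [2, 0, 3, 5],
        [5, 3, 0, 2],
        [3, 5, 2, 0]]
print(approx_tsp(dist))                    # [0, 1, 2, 3] for this cost matrix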
RANDOMIZED ALGORITHM
What is a Randomized Algorithm?
An algorithm that uses random numbers to decide what to do next anywhere in its logic is called a randomized algorithm.
For example, in Randomized Quick Sort we use a random number to pick the next pivot (or we randomly shuffle the array), and in Karger's algorithm we randomly pick an edge.
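A brief sketch (not from the original notes) of Randomized Quick Sort in which the pivot is chosen uniformly at random; it is written in a simple out-of-place style for clarity.

# A sketch of Randomized Quick Sort: the pivot is chosen uniformly at
# random, which makes the expected running time O(n log n) on any input.
import random

def randomized_quick_sort(a):
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                       # random pivot
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return randomized_quick_sort(less) + equal + randomized_quick_sort(greater)

print(randomized_quick_sort([54, 26, 93, 17, 77, 31, 44, 55, 20]))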
The average over all evaluated running times is the expected worst-case time complexity. The facts below are generally helpful in the analysis of such algorithms: linearity of expectation, and the expected number of trials until success.
How many times does the while loop run before finding a central pivot?
The probability that a randomly chosen element is a central pivot is 1/2.
Therefore, the expected number of times the while loop runs is 2.
Thus, the expected time complexity of step 2 is O(n).
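To justify the figure of 2 above: if each attempt independently succeeds with probability p = 1/2, the number of attempts follows a geometric distribution, so the expected number of trials is E = sum over k ≥ 1 of k · p · (1 − p)^(k−1) = 1/p = 2. Since each attempt scans the n elements once to test whether the chosen pivot is central, the expected work of step 2 is 2 · O(n) = O(n).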
INTERIOR POINT METHOD
Interior-point methods (also referred to as barrier methods or IPMs) are a certain class
of algorithms that solve linear and nonlinear convex optimization problems.
Example: John von Neumann suggested an interior-point method of linear programming, which was
neither a polynomial-time method nor an efficient method in practice. In fact, it turned out to be slower
than the commonly used simplex method.
An interior point method was discovered by Soviet mathematician I. I. Dikin in 1967 and reinvented in the U.S. in the mid-1980s. In 1984, Narendra Karmarkar developed a method for linear programming called Karmarkar's algorithm, which runs in provably polynomial time and is also very efficient in practice.
It enabled solutions of linear programming problems that were beyond the capabilities of the simplex
method. In contrast to the simplex method, it reaches an optimal solution by traversing the interior of
the feasible region. The method can be generalized to convex programming based on a self-
concordant barrier function used to encode the convex set.
Any convex optimization problem can be transformed into minimizing (or maximizing) a linear
function over a convex set by converting to the epigraph form.
There is a special class of such barriers (self-concordant barrier functions) that can be used to encode any convex set. They guarantee that the number of iterations of the algorithm is bounded by a polynomial in the dimension and the accuracy of the solution.
Karmarkar's breakthrough revitalized the study of interior-point methods and barrier problems,
showing that it was possible to create an algorithm for linear programming characterized
by polynomial complexity and, moreover, that was competitive with the simplex method.
Khachiyan's ellipsoid method was already a polynomial-time algorithm; however, it was too slow to be of practical interest.
Consider a linear program of the form: minimize c^T x subject to Ax ≤ b. (1)
The logarithmic barrier function associated with (1) is
B(x, µ) = c^T x − µ Σi ln(bi − ai^T x),
where ai^T denotes the i-th row of A. Here µ is a small positive scalar, sometimes called the “barrier parameter”. As µ converges to zero, the minimum of B(x, µ) should converge to a solution of (1).
The barrier function gradient is
∇B(x, µ) = c + µ Σi ai / (bi − ai^T x).
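A toy sketch (not part of the original text, with an example problem assumed purely for illustration): minimize the barrier function of a tiny one-variable LP by plain gradient descent, shrinking µ between rounds, so the iterates stay strictly inside the feasible region and approach the LP optimum.

# A toy barrier-method sketch in one dimension (assumed example):
# minimize x subject to 1 <= x <= 3.  The exact optimum is x = 1.
import math

def barrier(x, mu):
    # B(x, mu) = x - mu * [ln(x - 1) + ln(3 - x)]
    return x - mu * (math.log(x - 1.0) + math.log(3.0 - x))

def barrier_gradient(x, mu):
    return 1.0 - mu * (1.0 / (x - 1.0) - 1.0 / (3.0 - x))

x, mu = 2.0, 1.0                       # start strictly inside the feasible region
for _ in range(8):                     # outer loop: shrink the barrier parameter
    for _ in range(500):               # inner loop: gradient descent on B(., mu)
        x -= 0.1 * mu * barrier_gradient(x, mu)
    mu *= 0.5
print(round(x, 3))                     # close to the LP optimum x = 1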
1. For a product of N numbers, if we have to subtract a constant K from one of them so that the product takes its maximum value, subtract it from the largest value, provided that (largest value − K) is greater than 0. If we have to subtract a constant K so that the product takes its minimum value, subtract it from the smallest value, provided that (smallest value − K) is greater than 0.
2. Goldbach's conjecture: Every even integer greater than 2 can be expressed as the sum of two primes.
3. Perfect numbers or Amicable numbers: Perfect numbers are those numbers which are equal to the
sum of their proper divisors. Example: 6 = 1 + 2 + 3
4. Lychrel numbers: numbers that do not form a palindrome when repeatedly reversed and added to themselves. For example, 47 is not a Lychrel number, since 47 + 74 = 121, which is a palindrome.
5. Lemoine’s Conjecture : Any odd integer greater than 5 can be expressed as a sum of an odd prime
(all primes other than 2 are odd) and an even semi-prime. A semi-prime number is a product of two
prime numbers. This is called Lemoine’s conjecture.
6. Fermat’s Last Theorem: According to the theorem, no three positive integers a, b, c satisfy the equation a^n + b^n = c^n for any integer value of n greater than 2. For n = 1 and n = 2, the equation has infinitely many solutions.
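As a small sketch (not in the original notes), fact 2 can be checked computationally for small even numbers; the helper names below are made up for illustration.

# A sketch: finding a Goldbach decomposition (fact 2) for one even number.
def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def goldbach_pair(even_n):
    # return one pair of primes summing to even_n, if such a pair exists
    for p in range(2, even_n // 2 + 1):
        if is_prime(p) and is_prime(even_n - p):
            return p, even_n - p
    return None

print(goldbach_pair(28))   # (5, 23)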
RECENT TRENDS IN PROBLEM SOLVING USING SEARCHING
AND SORTING TECHNIQUES IN DATA STRUCTURES:
Searching
Searching is the algorithmic process of finding a particular item in a collection of items. A search
typically answers either True or False as to whether the item is present. It turns out that there are many
different ways to search for the item. What we are interested in here is how these algorithms work and
how they compare to one another.
Analysis of Sequential Search
To analyze searching algorithms, we need to decide on a basic unit of computation. Recall that this is
typically the common step that must be repeated in order to solve the problem. For searching, it makes
sense to count the number of comparisons performed. Each comparison may or may not discover the
item we are looking for. In addition, we make another assumption here.
The list of items is not ordered in any way; the items have been placed randomly into the list. In other words, the probability that the item we are looking for is in any particular position is exactly the same for each position of the list. If the item is not in the list, the only way to know it is to compare it against every item present. If there are n items, then the sequential search requires n comparisons to discover that the item is not there. In the case where the item is in the list, the analysis is not so straightforward.
There are actually three different scenarios that can occur.
In the best case we will find the item in the first place we look, at the beginning of the list. We
will need only one comparison. In the worst case, we will not discover the item until the very last
comparison, the nth comparison.
What about the average case? On average, we will find the item about halfway into the list; that is, we will compare against n/2 items. Recall, however, that as n gets large the coefficients, no matter what they are, become insignificant in our approximation, so the complexity of the sequential search is O(n).
In summary, when the item is present, the sequential search requires one comparison in the best case, n in the worst case, and about n/2 on average; when the item is not present, it requires n comparisons. We assumed earlier that the items in our collection had been randomly placed so that there is no relative order between the items. What would happen to the sequential search if the items were ordered in some way? Would we be able to gain any efficiency in our search technique? Assume that the list of items was constructed so that the items are in ascending order, from low to high.
If the item we are looking for is present in the list, the chance of it being in any one of the 𝑛 positions is
still the same as before. We will still have the same number of comparisons to find the item. However,
if the item is not present, there is a slight advantage. The figure shows this process as the algorithm looks for the item 50. Notice that items are still compared in sequence until 54. At this point, however, we know something extra. Not only is 54 not the item we are looking for, but no other elements beyond 54 can
work either since the list is sorted. In this case, the algorithm does not have to continue looking through
all of the items to report that the item was not found. It can stop immediately.
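A sketch (consistent with, but not copied from, the original notes) of sequential search on an ordered list with the early stop described above.

# Sequential search on a sorted list: stop as soon as we pass a value
# larger than the one we are looking for.
def ordered_sequential_search(a_list, item):
    for value in a_list:
        if value == item:
            return True
        if value > item:       # the list is sorted, so the item cannot appear later
            return False
    return False

test_list = [0, 1, 2, 8, 13, 17, 19, 32, 42, 54]
print(ordered_sequential_search(test_list, 50))   # False, stops when it reaches 54
print(ordered_sequential_search(test_list, 13))   # True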
The Binary Search
It is possible to take greater advantage of the ordered list if we are clever with our comparisons. In the
sequential search, when we compare against the first item, there are at most 𝑛 − 1 more items to look
through if the first item is not what we are looking for. Instead of searching the list in sequence, a
binary search will start by examining the middle item. If that item is the one we are searching for, we
are done.
If it is not the correct item, we can use the ordered nature of the list to eliminate half of the remaining
items. If the item we are searching for is greater than the middle item, we know that the entire lower
half of the list as well as the middle item can be eliminated from further consideration. The item, if it is
in the list, must be in the upper half. We can then repeat the process with the upper half. Start at the
middle item and compare it against what we are looking for. Again, we either find it or split the list in
half, therefore eliminating another large part of our possible search space. Figure shows how this
algorithm can quickly find the value 54.
Before we move on to the analysis, we should note that this algorithm is a great example of a divide and
conquer strategy. Divide and conquer means that we divide the problem into smaller pieces, solve the
smaller pieces in some way and then reassemble the whole problem to get the result. When we perform
a binary search of a list, we first check the middle item. If the item we are searching for is less than the
middle item, we can simply perform a binary search of the left half of the original list. Likewise, if the
item is greater, we can perform a binary search of the right half. Either way, this is a recursive call to
the binary search function passing a smaller list.
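A recursive sketch (not copied from the original listing) of the binary search just described, using Python's slice operation to pass the half-list on each call.

# A sketch of recursive binary search using slicing, as described above.
def binary_search(a_list, item):
    if not a_list:
        return False                        # base case: nothing left to search
    midpoint = len(a_list) // 2
    if a_list[midpoint] == item:
        return True
    if item < a_list[midpoint]:
        return binary_search(a_list[:midpoint], item)      # search the left half
    return binary_search(a_list[midpoint + 1:], item)      # search the right half

test_list = [0, 1, 2, 8, 13, 17, 19, 32, 42, 54]
print(binary_search(test_list, 54))   # True
print(binary_search(test_list, 3))    # False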
Sorting
Sorting is the process of placing elements from a collection in some kind of order. For example, a list of
words could be sorted alphabetically or by length. A list of cities could be sorted by population, by area,
or by zip code. There are many, many sorting algorithms that have been developed and analyzed. This
suggests that sorting is an important area of study in computer science. Sorting a large number of items
can take a substantial amount of computing resources. Like searching, the efficiency of a sorting
algorithm is related to the number of items being processed. For small collections, a complex sorting
method may be more trouble than it is worth. The overhead may be too high. On the other hand, for
larger collections, we want to take advantage of as many improvements as possible. In this section we
will discuss several sorting techniques and compare them with respect to their running time. Before
getting into specific algorithms, we should think about the operations that can be used to analyze a
sorting process. First, it will be necessary to compare two values to see which is smaller (or larger). In
order to sort a collection, it will be necessary to have some systematic way to compare values to see if
they are out of order. The total number of comparisons will be the most common way to measure a sort
procedure. Second, when values are not in the correct position with respect to one another, it may be
necessary to exchange them. This exchange is a costly operation and the total number of exchanges will
also be important for evaluating the overall efficiency of the algorithm.
1. Bubble Sort:
The bubble sort makes multiple passes through a list. It compares adjacent items and exchanges those
that are out of order. Each pass through the list places the next largest value in its proper place. In
essence, each item “bubbles” up to the location where it belongs. Figure shows the first pass of a bubble
sort. The shaded items are being compared to see if they are out of order. If there are 𝑛 items in the list,
then there are 𝑛 − 1 pairs of items that need to be compared on the first pass. It is important to note that
once the largest value in the list is part of a pair, it will continually be moved along until the pass is
complete. At the start of the second pass, the largest value is now in place. There are 𝑛 − 1 items left to
sort, meaning that there will be 𝑛 − 2 pairs. Since each pass places the next largest value in place, the
total number of passes necessary will be 𝑛 − 1. After completing the 𝑛 − 1 passes, the smallest item
must be in the correct position with no further processing required. The code below shows the complete
bubble_sort function. It takes the list as a parameter and modifies it by exchanging items as necessary.
The exchange operation, sometimes called a “swap,” is slightly different in Python than in most other programming languages. Typically, swapping two elements in a list requires a temporary storage location (an additional memory location); an assignment sequence such as temp = a_list[i]; a_list[i] = a_list[j]; a_list[j] = temp will exchange the ith and jth items in the list. Without the temporary storage, one of the values would be overwritten.
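The listing referred to above is not reproduced in these notes; the following is a standard bubble sort sketch consistent with the description, offered as an assumed reconstruction rather than the original code.

# A sketch of bubble sort as described above: n-1 passes, each pass
# "bubbling" the next largest value into place with pairwise exchanges.
def bubble_sort(a_list):
    for pass_num in range(len(a_list) - 1, 0, -1):
        for i in range(pass_num):
            if a_list[i] > a_list[i + 1]:
                # simultaneous assignment: no temporary variable needed in Python
                a_list[i], a_list[i + 1] = a_list[i + 1], a_list[i]

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]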
2. Selection Sort:
The selection sort improves on the bubble sort by making only one exchange for every pass through the
list. In order to do this, a selection sort looks for the largest value as it makes a pass and after
completing the pass, places it in the proper location. As with a bubble sort, after the first pass, the
largest item is in the correct place. After the second pass, the next largest is in place. This process
continues and requires 𝑛 − 1 passes to sort 𝑛 items, since the final item must be in place after the (𝑛 −
1)st pass. Figure shows the entire sorting process. On each pass, the largest remaining item is selected
and then placed in its proper location. The first pass places 93, the second pass places 77, the third
places 55, and so on. The function is shown below.
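The function mentioned above is likewise not shown in these notes; here is a selection sort sketch matching the description (the largest remaining item is selected on each pass), offered as an assumed reconstruction.

# A sketch of selection sort: on each pass, find the largest remaining
# value and exchange it into its final position (one exchange per pass).
def selection_sort(a_list):
    for fill_slot in range(len(a_list) - 1, 0, -1):
        max_position = 0
        for location in range(1, fill_slot + 1):
            if a_list[location] > a_list[max_position]:
                max_position = location
        a_list[fill_slot], a_list[max_position] = a_list[max_position], a_list[fill_slot]

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
selection_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]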
The selection sort makes the same number of comparisons as the bubble sort and is therefore also O(n²). However, due to the reduction in the number of exchanges, the selection sort typically executes faster in benchmark studies. In fact, for our list, the bubble sort makes 20 exchanges, while the selection sort makes only 8.
3. Insertion Sort:
The insertion sort, although still O(n²), works in a slightly different way. It always maintains a sorted sub-list
in the lower positions of the list. Each new item is then “inserted” back into the previous sub-list such
that the sorted sub-list is one item larger. Figure shows the insertion sorting process. The shaded items
represent the ordered sub-lists as the algorithm makes each pass. We begin by assuming that a list with
one item (position 0) is already sorted. On each pass, one for each item from position 1 through n − 1, the current item is
checked against those in the already sorted sub-list. As we look back into the already sorted sub-list, we
shift those items that are greater to the right. When we reach a smaller item or the end of the sub-list,
the current item can be inserted. Figure below shows the fifth pass in detail. At this point in the
algorithm, a sorted sub-list of five items consisting of 17, 26, 54, 77, and 93 exists. We want to insert 31
back into the already sorted items. The first comparison against 93 causes 93 to be shifted to the right.
77 and 54 are also shifted. When the item 26 is encountered, the shifting process stops and 31 is placed
in the open position. Now we have a sorted sub-list of six items. The implementation of insertion sort
shows that there are again n − 1 passes to sort n items. The iteration starts at position 1 and moves
through position n − 1, as these are the items that need to be inserted back into the sorted sub-lists. Line
8 performs the shift operation that moves a value up one position in the list, making room behind it for
the insertion. Remember that this is not a complete exchange as was performed in the previous
algorithms. The maximum number of comparisons for an insertion sort is the sum of the first n−1
integers. Again, this is O(n²). However, in the best case, only one comparison needs to be done on each
pass. This would be the case for an already sorted list. One note about shifting versus exchanging is also
important. In general, a shift operation requires approximately a third of the processing work of an
exchange since only one assignment is performed. In benchmark studies, insertion sort will show very
good performance.
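The implementation discussed above (including the shift referred to as "line 8") is not included in these notes; the sketch below is a standard insertion sort consistent with that description, so its exact line numbering may differ from the original listing.

# A sketch of insertion sort: grow a sorted sub-list at the front,
# shifting larger items to the right to open a slot for the current value.
def insertion_sort(a_list):
    for index in range(1, len(a_list)):
        current_value = a_list[index]
        position = index
        while position > 0 and a_list[position - 1] > current_value:
            a_list[position] = a_list[position - 1]   # the shift operation
            position -= 1
        a_list[position] = current_value               # insert into the open slot

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
insertion_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]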
4. Shell Sort:
The shell sort, sometimes called the “diminishing increment sort,” improves on the insertion sort by
breaking the original list into a number of smaller sub-lists, each of which is sorted using an insertion
sort. The unique way that these sub-lists are chosen is the key to the shell sort. Instead of breaking the
list into sub-lists of contiguous items, the shell sort uses an increment i, sometimes called the gap, to
create a sub-list by choosing all items that are i items apart. This can be seen in figure. This list has nine
items. If we use an increment of three, there are three sub-lists, each of which can be sorted by an
insertion sort. After completing these sorts, we get the list shown in figure. Although this list is not
completely sorted, something very interesting has happened. By sorting the sub-lists, we have moved
the items closer to where they actually belong. Figure shows a final insertion sort using an increment of
one; in other words, a standard insertion sort. Note that by performing the earlier sub-list sorts, we have
now reduced the total number of shifting operations necessary to put the list in its final order. For this
case, we need only four more shifts to complete the process. The way in which the increments are chosen is the unique feature of the shell sort. The shell_sort function shown below uses a different set of increments: in this case, we begin with n/2 sub-lists; on the next pass, n/4 sub-lists are sorted. Eventually, a single list is sorted with the basic insertion sort. The figure shows the first sub-lists for our example using this increment.
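The shell_sort listing itself does not appear in these notes; the sketch below follows the description (the gap starts at n/2 and is halved each time, with a gapped insertion sort on each sub-list) and is offered as an assumed reconstruction.

# A sketch of shell sort: sort gapped sub-lists with insertion sort,
# halving the gap until it reaches 1 (a standard insertion sort).
def shell_sort(a_list):
    gap = len(a_list) // 2
    while gap > 0:
        for start in range(gap):
            gap_insertion_sort(a_list, start, gap)
        gap //= 2

def gap_insertion_sort(a_list, start, gap):
    for i in range(start + gap, len(a_list), gap):
        current_value = a_list[i]
        position = i
        while position >= gap and a_list[position - gap] > current_value:
            a_list[position] = a_list[position - gap]
            position -= gap
        a_list[position] = current_value

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
shell_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]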
5. The Merge Sort:
Merge sort uses a divide and conquer strategy to improve the performance of sorting
algorithms. Merge sort is a recursive algorithm that continually splits a list in half. If the list is empty or
has one item, it is sorted by definition (the base case). If the list has more than one item, we split the list
and recursively invoke a merge sort on both halves. Once the two halves are sorted, the fundamental
operation, called a merge, is performed. Merging is the process of taking two smaller sorted lists and
combining them together into a single, sorted, new list. Figure below shows our familiar example list as
it is being split by merge sort. Figure below also shows the simple lists, now sorted, as they are merged
back together. The merge sort function shown below begins by asking the base case question. If the
length of the list is less than or equal to one, then we already have a sorted list and no more processing
is necessary. If, on the other hand, the length is greater than one, then we use the Python slice operation
to extract the left and right halves. It is important to note that the list may not have an even number of
items. That does not matter, as the lengths will differ by at most one.
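The merge_sort function referenced above is not reproduced here; this sketch follows the description (slice into halves, recurse, then merge) and is an assumed reconstruction rather than the original listing.

# A sketch of merge sort: split with slicing, sort each half recursively,
# then merge the two sorted halves back into the original list.
def merge_sort(a_list):
    if len(a_list) <= 1:
        return                                   # base case: already sorted
    mid = len(a_list) // 2
    left_half, right_half = a_list[:mid], a_list[mid:]
    merge_sort(left_half)
    merge_sort(right_half)

    i = j = k = 0                                # merge step
    while i < len(left_half) and j < len(right_half):
        if left_half[i] <= right_half[j]:
            a_list[k] = left_half[i]; i += 1
        else:
            a_list[k] = right_half[j]; j += 1
        k += 1
    while i < len(left_half):
        a_list[k] = left_half[i]; i += 1; k += 1
    while j < len(right_half):
        a_list[k] = right_half[j]; j += 1; k += 1

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
merge_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]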
6. Quick Sort:
The quick sort uses divide and conquer to gain the same advantages as the merge sort, while not using
additional storage. As a trade-off, however, it is possible that the list may not be divided in half. When
this happens, we will see that performance is diminished. A quick sort first selects a value, which is
called the pivot value. Although there are many different ways to choose the pivot value, we will simply
use the first item in the list. The role of the pivot value is to assist with splitting the list. The actual
position where the pivot value belongs in the final sorted list, commonly called the split point, will be
used to divide the list for subsequent calls to the quick sort. Figure below shows that 54 will serve as
our first pivot value. Since we have looked at this example a few times already, we know that 54 will
eventually end up in the position currently holding 31. The partition process will happen next. It will
find the split point and at the same time move other items to the appropriate side of the list, either less
than or greater than the pivot value.
Partitioning begins by locating two position markers – let’s call them left_mark and right_mark – at the
beginning and end of the remaining items in the list (positions 1 and 8 in Figure). The goal of the
partition process is to move items that are on the wrong side with respect to the pivot value while also
converging on the split point. Figure shows this process as we locate the position of 54.We begin by
incrementing left_mark until we locate a value that is greater than the pivot value. We then decrement
right_mark until we find a value that is less than the pivot value. At this point we have discovered two
items that are out of place with respect to the eventual split point. For our example, this occurs at 93 and
20. Now we can exchange these two items and then repeat the process again. At the point where
right_mark becomes less than left_mark, we stop. The position of right_mark is now the split point. The
pivot value can be exchanged with the contents of the split point and the pivot value is now in place. In
addition, all the items to the left of the split point are less than the pivot value and all the items to the
right of the split point are greater than the pivot value. The list can now be divided at the split point and
the quick sort can be invoked recursively on the two halves. The quick_sort function shown below
invokes a recursive function, quick_sort_helper. quick_sort_helper begins with the same base case as
the merge sort. If the length of the list is less than or equal to one, it is already sorted. If it is greater,
then it can be partitioned and recursively sorted. The partition function implements the process
described earlier.
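The quick_sort and partition listings are not included in these notes; the sketch below mirrors the description (first item as pivot, left_mark and right_mark converging on the split point) and is an assumed reconstruction.

# A sketch of quick sort with the first item as pivot, partitioning with
# converging left_mark / right_mark markers as described above.
def quick_sort(a_list):
    quick_sort_helper(a_list, 0, len(a_list) - 1)

def quick_sort_helper(a_list, first, last):
    if first < last:
        split_point = partition(a_list, first, last)
        quick_sort_helper(a_list, first, split_point - 1)
        quick_sort_helper(a_list, split_point + 1, last)

def partition(a_list, first, last):
    pivot_value = a_list[first]
    left_mark, right_mark = first + 1, last
    done = False
    while not done:
        while left_mark <= right_mark and a_list[left_mark] <= pivot_value:
            left_mark += 1
        while right_mark >= left_mark and a_list[right_mark] >= pivot_value:
            right_mark -= 1
        if right_mark < left_mark:
            done = True                      # markers have crossed: split point found
        else:
            a_list[left_mark], a_list[right_mark] = a_list[right_mark], a_list[left_mark]
    # put the pivot at the split point
    a_list[first], a_list[right_mark] = a_list[right_mark], a_list[first]
    return right_mark

data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
quick_sort(data)
print(data)   # [17, 20, 26, 31, 44, 54, 55, 77, 93]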