Analysis of Algorithms

Algorithms as a technology
 Suppose computers were infinitely fast and computer memory was free. Would you have any reason
to study algorithms? The answer is yes, if for no other reason than that you would still like to
demonstrate that your solution method terminates and does so with the correct answer.
 If computers were infinitely fast, any correct method for solving a problem would do. You would
probably want your implementation to be within the bounds of good software engineering practice
(i.e., well designed and documented), but you would most often use whichever method was the
easiest to implement.
 Of course, computers may be fast, but they are not infinitely fast. And memory may be cheap, but it
is not free. Computing time is therefore a bounded resource, and so is space in memory. These
resources should be used wisely, and algorithms that are efficient in terms of time or space will help
you do so.

Efficiency
 Algorithms devised to solve the same problem often differ dramatically in their efficiency. These
differences can be much more significant than differences due to hardware and software.
 As an example, we will see two algorithms for sorting. The first, known as insertion sort, takes time
roughly equal to c1n² to sort n items, where c1 is a constant that does not depend on n. That is, it
takes time roughly proportional to n². The second, merge sort, takes time roughly equal to c2n lg n,
where lg n stands for log₂ n and c2 is another constant that also does not depend on n. Insertion sort
usually has a smaller constant factor than merge sort, so that c1 < c2. We shall see that the constant
factors can be far less significant in the running time than the dependence on the input size n. Where
merge sort has a factor of lg n in its running time, insertion sort has a factor of n, which is much
larger. Although insertion sort is usually faster than merge sort for small input sizes, once the input
size n becomes large enough, merge sort's advantage of lg n vs. n will more than compensate for the
difference in constant factors. No matter how much smaller c1 is than c2, there will always be a
crossover point beyond which merge sort is faster.
 For a concrete example, let us pit a faster computer (computer A) running insertion sort against a
slower computer (computer B) running merge sort. They each must sort an array of one million
numbers. Suppose that computer A executes one billion instructions per second and computer B
executes only ten million instructions per second, so that computer A is 100 times faster than
computer B in raw computing power. To make the difference even more dramatic, suppose that the
world's craftiest programmer codes insertion sort in machine language for computer A, and the
resulting code requires 2n² instructions to sort n numbers. (Here, c1 = 2.) Merge sort, on the other
hand, is programmed for computer B by an average programmer using a high-level language with an
inefficient compiler, with the resulting code taking 50n lg n instructions (so that c2 = 50). To sort one
million numbers, computer A takes

2 · (10⁶)² instructions / 10⁹ instructions per second = 2000 seconds,

while computer B takes

50 · 10⁶ · lg 10⁶ instructions / 10⁷ instructions per second ≈ 100 seconds.
 By using an algorithm whose running time grows more slowly, even with a poor compiler, computer
B runs 20 times faster than computer A! The advantage of merge sort is even more pronounced when
we sort ten million numbers: where insertion sort takes approximately 2.3 days, merge sort takes less
than 20 minutes. In general, as the problem size increases, so does the relative advantage of merge
sort.
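 The arithmetic above is easy to reproduce. The short Python sketch below is only an illustration: the function names are made up here, and the constants c1 = 2, c2 = 50 and the two instruction rates are simply the assumptions of this example, not measured values.

import math

def insertion_sort_seconds(n, instructions_per_second=1e9, c1=2):
    # Computer A running insertion sort: roughly c1 * n^2 instructions.
    return c1 * n**2 / instructions_per_second

def merge_sort_seconds(n, instructions_per_second=1e7, c2=50):
    # Computer B running merge sort: roughly c2 * n * lg(n) instructions.
    return c2 * n * math.log2(n) / instructions_per_second

for n in (10**6, 10**7):
    print(n, insertion_sort_seconds(n), merge_sort_seconds(n))

 For n = 10⁶ this prints roughly 2000 seconds versus 100 seconds, and for n = 10⁷ roughly 200,000 seconds (about 2.3 days) versus 1160 seconds (under 20 minutes), matching the figures above.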
Algorithms and other technologies
 The example shows that algorithms, like computer hardware, are a technology. Total system
performance depends on choosing efficient algorithms as much as on choosing fast hardware. Just as
rapid advances are being made in other computer technologies, they are being made in algorithms as
well.
 You might wonder whether algorithms are truly that important on contemporary computers in light
of other advanced technologies, such as
o hardware with high clock rates, pipelining, and superscalar architectures,
o easy-to-use, intuitive graphical user interfaces (GUIs),
o object-oriented systems, and
o local-area and wide-area networking.
 The answer is yes. Although there are some applications that do not explicitly require algorithmic
content at the application level (e.g., some simple web-based applications), most also require a
degree of algorithmic content on their own. For example, consider a web-based service that
determines how to travel from one location to another. Its implementation would rely on fast
hardware, a graphical user interface, wide-area networking, and also possibly on object orientation.
 However, it would also require algorithms for certain operations, such as finding routes (probably
using a shortest-path algorithm), rendering maps, and interpolating addresses.
 Moreover, even an application that does not require algorithmic content at the application level relies
heavily upon algorithms. Does the application rely on fast hardware? The hardware design used
algorithms. Does the application rely on graphical user interfaces? The design of any GUI relies on
algorithms. Does the application rely on networking? Routing in networks relies heavily on
algorithms. Was the application written in a language other than machine code? Then it was
processed by a compiler, interpreter, or assembler, all of which make extensive use of algorithms.
Algorithms are at the core of most technologies used in contemporary computers.
 Furthermore, with the ever-increasing capacities of computers, we use them to solve larger problems
than ever before. As we saw in the above comparison between insertion sort and merge sort, it is at
larger problem sizes that the differences in efficiencies between algorithms become particularly
prominent.
 Having a solid base of algorithmic knowledge and technique is one characteristic that separates the
truly skilled programmers from the novices. With modern computing technology, you can
accomplish some tasks without knowing much about algorithms, but with a good background in
algorithms, you can do much, much more.

Pseudocode vs. Real Code


 Here we shall typically describe algorithms as programs written in a pseudocode that is similar in
many respects to C, Pascal, or Java.
 What separates pseudocode from "real" code is that in pseudocode, we employ whatever expressive
method is most clear and concise to specify a given algorithm. Sometimes, the clearest method is
English, so do not be surprised if you come across an English phrase or sentence embedded within a
section of "real" code.
 Another difference between pseudocode and real code is that pseudocode is not typically concerned
with issues of software engineering. Issues of data abstraction, modularity, and error handling are
often ignored in order to convey the essence of the algorithm more concisely.

Insertion sort
 Input: A sequence of n numbers a1, a2, …, an.
 Output: A permutation (reordering) a'1, a'2, …, a'n of the input sequence such that a'1 ≤ a'2 ≤ … ≤ a'n.
 The numbers that we wish to sort are also known as the keys. We start with insertion sort, which is
an efficient algorithm for sorting a small number of elements. Insertion sort works the way many
people sort a hand of playing cards. We start with an empty left hand and the cards face down on the
table. We then remove one card at a time from the table and insert it into the correct position in the
left hand. To find the correct position for a card, we compare it with each of the cards already in the
hand, from right to left, as illustrated in the figure. At all times, the cards held in the left hand are sorted,
and these cards were originally the top cards of the pile on the table.

 Our pseudocode for insertion sort is presented as a procedure called INSERTION-SORT, which
takes as a parameter an array A[1 … n] containing a sequence of length n that is to be sorted. (The
number n of elements in A is denoted by length[A].) The input numbers are sorted in place: the
numbers are rearranged within the array A, with at most a constant number of them stored outside the
array at any time. The input array A contains the sorted output sequence when INSERTION-SORT is
finished.
INSERTION-SORT(A)
1 for j ← 2 to length[A]
2     do key ← A[j]
3         ▹ Insert A[j] into the sorted sequence A[1 … j - 1].
4         i ← j - 1
5         while i > 0 and A[i] > key
6             do A[i + 1] ← A[i]
7                i ← i - 1
8         A[i + 1] ← key
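 For readers who want to run the algorithm as real code, here is a minimal Python translation of the pseudocode (a sketch, not part of the original text; Python lists are 0-indexed, so the loop bounds shift down by one relative to lines 1-8 above):

def insertion_sort(a):
    # Sort the list a in place and return it, mirroring INSERTION-SORT(A).
    for j in range(1, len(a)):        # pseudocode line 1: j = 2 to length[A]
        key = a[j]                    # line 2
        i = j - 1                     # line 4
        while i >= 0 and a[i] > key:  # line 5
            a[i + 1] = a[i]           # line 6
            i -= 1                    # line 7
        a[i + 1] = key                # line 8
    return a

 For example, insertion_sort([5, 2, 4, 6, 1, 3]) returns [1, 2, 3, 4, 5, 6], the array used in the figure below.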

Loop invariants and the correctness of insertion sort


 The figure below shows how this algorithm works for A = <5, 2, 4, 6, 1, 3>. The index j indicates the
"current card" being inserted into the hand. At the beginning of each iteration of the "outer" for loop,
which is indexed by j, the subarray consisting of elements A[1 … j - 1] constitutes the currently sorted
hand, and elements A[j + 1 … n] correspond to the pile of cards still on the table. In fact, elements
A[1 …j - 1] are the elements originally in positions 1 through j - 1, but now in sorted order. We state
these properties of A[1 … j -1] formally as a loop invariant:
 At the start of each iteration of the for loop of lines 1-8, the subarray A[1 … j - 1] consists of the
elements originally in A[1 …j - 1] but in sorted order.

Figure: The operation of INSERTION-SORT on the array A = <5, 2, 4, 6, 1, 3>. Array indices appear above
the rectangles, and values stored in the array positions appear within the rectangles. (a)-(e) The iterations of
the for loop of lines 1-8. In each iteration, the black rectangle holds the key taken from A[j], which is
compared with the values in shaded rectangles to its left in the test of line 5. Shaded arrows show array
values moved one position to the right in line 6, and black arrows indicate where the key is moved to in line
8. (f) The final sorted array.

 We use loop invariants to help us understand why an algorithm is correct. We must show three things
about a loop invariant:
o Initialization: It is true prior to the first iteration of the loop.
o Maintenance: If it is true before an iteration of the loop, it remains true before the next
iteration.
o Termination: When the loop terminates, the invariant gives us a useful property that helps
show that the algorithm is correct.
 When the first two properties hold, the loop invariant is true prior to every iteration of the loop. Note
the similarity to mathematical induction, where to prove that a property holds, you prove a base case
and an inductive step. Here, showing that the invariant holds before the first iteration is like the base
case, and showing that the invariant holds from iteration to iteration is like the inductive step.
 The third property is perhaps the most important one, since we are using the loop invariant to show
correctness. It also differs from the usual use of mathematical induction, in which the inductive step
is used infinitely; here, we stop the "induction" when the loop terminates.
 Let us see how these properties hold for insertion sort.
o Initialization: We start by showing that the loop invariant holds before the first loop
iteration, when j = 2. The subarray A[1 … j - 1], therefore, consists of just the single element
A[1], which is in fact the original element in A[1]. Moreover, this subarray is sorted (trivially,
of course), which shows that the loop invariant holds prior to the first iteration of the loop.
o Maintenance: Next, we tackle the second property: showing that each iteration maintains the
loop invariant. Informally, the body of the outer for loop works by moving A[ j - 1], A[ j - 2],
A[ j - 3], and so on by one position to the right until the proper position for A[ j] is found
(lines 4-7), at which point the value of A[j] is inserted (line 8). A more formal treatment of
the second property would require us to state and show a loop invariant for the "inner" while
loop. At this point, however, we prefer not to get bogged down in such formalism, and so we
rely on our informal analysis to show that the second property holds for the outer loop.
o Termination: Finally, we examine what happens when the loop terminates. For insertion
sort, the outer for loop ends when j exceeds n, i.e., when j = n + 1. Substituting n + 1 for j in
the wording of loop invariant, we have that the subarray A[1 … n] consists of the elements
originally in A[1 … n], but in sorted order. But the subarray A[1 … n] is the entire array!
Hence, the entire array is sorted, which means that the algorithm is correct.
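 To make the three-part argument concrete, the sketch below (an illustrative aid, not part of the original pseudocode) checks the loop invariant mechanically: before each iteration of the outer loop it asserts that the prefix holds the original first j - 1 elements in sorted order, and after termination it asserts that the whole array is sorted.

def insertion_sort_checked(a):
    original = list(a)                      # keep a copy of the input
    for j in range(1, len(a)):
        # Invariant: a[0 .. j-1] holds the elements originally in
        # a[0 .. j-1], but in sorted order (0-indexed version of the text).
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    # Termination: with j conceptually equal to n + 1, the invariant says the
    # whole array is the original elements in sorted order.
    assert a == sorted(original)
    return a

 Running insertion_sort_checked on any list raises no assertion error, which is an empirical (not a formal) confirmation of the invariant.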

Analysis of insertion sort
 The time taken by the INSERTION-SORT procedure depends on the input: sorting a thousand
numbers takes longer than sorting three numbers. Moreover, INSERTION-SORT can take different
amounts of time to sort two input sequences of the same size depending on how nearly sorted they
already are. In general, the time taken by an algorithm grows with the size of the input, so it is
traditional to describe the running time of a program as a function of the size of its input. To do so,
we need to define the terms "running time" and "size of input" more carefully.
 The best notion for input size depends on the problem being studied. For many problems, such as
sorting or computing discrete Fourier transforms, the most natural measure is the number of items in
the input, for example, the array size n for sorting. Sometimes, it is more appropriate to describe the
size of the input with two numbers rather than one. For instance, if the input to an algorithm is a
graph, the input size can be described by the numbers of vertices and edges in the graph.
 The running time of an algorithm on a particular input is the number of primitive operations or
"steps" executed. It is convenient to define the notion of step so that it is as machine independent as
possible. For the moment, let us adopt the following view. A constant amount of time is required to
execute each line of our pseudocode. One line may take a different amount of time than another line,
but we shall assume that each execution of the ith line takes time ci , where ci is a constant.
 In the following discussion, our expression for the running time of INSERTION-SORT will evolve
from a messy formula that uses all the statement costs ci to a much simpler notation that is more
concise and more easily manipulated. This simpler notation will also make it easy to determine
whether one algorithm is more efficient than another.
 We start by presenting the INSERTION-SORT procedure with the time "cost" of each statement and
the number of times each statement is executed. For each j = 2, 3, …, n, where n = length[A], we let tj
be the number of times the while loop test in line 5 is executed for that value of j. When a for or
while loop exits in the usual way (i.e., due to the test in the loop header), the test is executed one
time more than the loop body. We assume that comments are not executable statements, and so they
take no time.

INSERTION-SORT(A)                                      cost   times
1 for j ← 2 to length[A]                               c1     n
2     do key ← A[j]                                    c2     n - 1
3         ▹ Insert A[j] into the sorted
          sequence A[1 … j - 1].                       0      n - 1
4         i ← j - 1                                    c4     n - 1
5         while i > 0 and A[i] > key                   c5     Σ_{j=2..n} tj
6             do A[i + 1] ← A[i]                       c6     Σ_{j=2..n} (tj - 1)
7                i ← i - 1                             c7     Σ_{j=2..n} (tj - 1)
8         A[i + 1] ← key                               c8     n - 1
 The running time of the algorithm is the sum of running times for each statement executed; a
statement that takes ci steps to execute and is executed n times will contribute cin to the total running
time. To compute T(n), the running time of INSERTION-SORT, we sum the products of the cost and
times columns, obtaining

T(n) = c1n + c2(n - 1) + c4(n - 1) + c5 Σ_{j=2..n} tj + c6 Σ_{j=2..n} (tj - 1)
       + c7 Σ_{j=2..n} (tj - 1) + c8(n - 1).
 Even for inputs of a given size, an algorithm's running time may depend on which input of that size
is given. For example, in INSERTION-SORT, the best case occurs if the array is already sorted. For
each j = 2, 3, …, n, we then find that A[i] ≤ key in line 5 when i has its initial value of j - 1. Thus tj = 1
for j = 2, 3, …, n, and the best-case running time is

T(n) = c1n + c2(n - 1) + c4(n - 1) + c5(n - 1) + c8(n - 1)
     = (c1 + c2 + c4 + c5 + c8)n - (c2 + c4 + c5 + c8).
 This running time can be expressed as an + b for constants a and b that depend on the statement
costs ci ; it is thus a linear function of n.
 If the array is in reverse sorted order, that is, in decreasing order, the worst case results. We must
compare each element A[j] with each element in the entire sorted subarray A[1 … j - 1], and so tj = j
for j = 2, 3, …, n. Noting that

Σ_{j=2..n} j = n(n + 1)/2 - 1   and   Σ_{j=2..n} (j - 1) = n(n - 1)/2,

 we find that in the worst case, the running time of INSERTION-SORT is

T(n) = c1n + c2(n - 1) + c4(n - 1) + c5(n(n + 1)/2 - 1) + c6(n(n - 1)/2) + c7(n(n - 1)/2) + c8(n - 1)
     = (c5/2 + c6/2 + c7/2)n² + (c1 + c2 + c4 + c5/2 - c6/2 - c7/2 + c8)n - (c2 + c4 + c5 + c8).
 This worst-case running time can be expressed as an² + bn + c for constants a, b, and c that again
depend on the statement costs ci; it is thus a quadratic function of n.
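 These best- and worst-case formulas can be checked empirically by counting how many times the test of line 5 is evaluated. The following sketch (an illustrative experiment, not from the text) instruments the Python version of insertion sort:

def count_while_tests(a):
    # Return the total number of evaluations of the line-5 test,
    # i.e., the sum of tj over j = 2 .. n.
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1                          # one evaluation of the test
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]                     # shift element right (line 6)
            i -= 1
        a[i + 1] = key
    return tests

n = 1000
print(count_while_tests(list(range(n))))         # sorted input: n - 1 = 999
print(count_while_tests(list(range(n, 0, -1))))  # reverse sorted: n(n+1)/2 - 1 = 500499

 The sorted input gives tj = 1 for every j (a linear total), while the reverse-sorted input gives tj = j (a quadratic total), exactly as in the analysis above.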

Worst-case and average-case analysis


 In our analysis of insertion sort, we looked at both the best case, in which the input array was already
sorted, and the worst case, in which the input array was reverse sorted. For other algorithms, we shall
usually concentrate on finding only the worst-case running time, that is, the longest running time for
any input of size n.
 We give three reasons for this orientation.
o The worst-case running time of an algorithm is an upper bound on the running time for any
input. It gives us a guarantee that the algorithm will never take any longer. We need not make
some educated guess about the running time and hope that it never gets much worse.
o For some algorithms, the worst case occurs fairly often. For example, in searching a database
for a particular piece of information, the searching algorithm's worst case will often occur
when the information is not present in the database. In some searching applications, searches
for absent information may be frequent.

o The "average case" is often roughly as bad as the worst case. Suppose that we randomly
choose n numbers and apply insertion sort. How long does it take to determine where in
subarray A[1 … j - 1] to insert element A[j]? On average, half the elements in A[1 … j - 1] are
less than A[j], and half the elements are greater. On average, therefore, we check half of the
subarray A[1 … j - 1], so tj = j/2. If we work out the resulting average-case running time, it
turns out to be a quadratic function of the input size, just like the worst-case running time.
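 A quick experiment supports this claim. The sketch below (again only an illustration) reuses the count_while_tests function from the earlier sketch, averages the number of line-5 tests over random permutations, and compares the result with n²/4:

import random

def average_while_tests(n, trials=20):
    # Average number of line-5 tests over random permutations of size n,
    # using count_while_tests from the sketch above.
    total = 0
    for _ in range(trials):
        a = list(range(n))
        random.shuffle(a)
        total += count_while_tests(a)
    return total / trials

for n in (500, 1000, 2000):
    print(n, average_while_tests(n), n * n / 4)

 The averages come out close to n²/4, a quadratic function of n, just like the worst case.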

Order of growth
 We used some simplifying abstractions to ease our analysis of the INSERTION-SORT procedure.
First, we ignored the actual cost of each statement, using the constants ci to represent these costs.
Then, we observed that even these constants give us more detail than we really need: the worst-case
running time is an² + bn + c for some constants a, b, and c that depend on the statement costs ci. We
thus ignored not only the actual statement costs, but also the abstract costs ci.
 We shall now make one more simplifying abstraction. It is the rate of growth, or order of growth, of
the running time that really interests us. We therefore consider only the leading term of a formula
(e.g., an²), since the lower-order terms are relatively insignificant for large n. We also ignore the
leading term's constant coefficient, since constant factors are less significant than the rate of growth
in determining computational efficiency for large inputs. Thus, we write that insertion sort, for
example, has a worst-case running time of Θ(n²) (pronounced "theta of n-squared").
 We usually consider one algorithm to be more efficient than another if its worst-case running time
has a lower order of growth. Due to constant factors and lower-order terms, this evaluation may be in
error for small inputs. But for large enough inputs, a Θ(n²) algorithm, for example, will run more
quickly in the worst case than a Θ(n³) algorithm.
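 A small numeric table makes the point. With arbitrary illustrative constants a, b, and c, the leading term an² accounts for almost all of the value of an² + bn + c once n is large:

# Arbitrary constants chosen only for illustration.
a, b, c = 2, 100, 1000

for n in (10, 100, 1000, 10000):
    full = a * n**2 + b * n + c      # the exact quadratic formula
    leading = a * n**2               # the leading term alone
    print(n, full, leading, round(leading / full, 4))

 At n = 10 the leading term is only about 9% of the total, but by n = 10,000 it is more than 99% of it, which is why the Θ-notation keeps only that term.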
