Lec 01 Why Ds

The document discusses the importance of studying data structures and algorithms. It explains key concepts like data, problems, algorithms, and how structured data helps solve problems more efficiently. Binary search, a more efficient search algorithm that exploits the structure of sorted data, is presented as an example.

Uploaded by

Nikunj Jayas
Copyright
© © All Rights Reserved

CS213/293 Data Structure and Algorithms 2023

Lecture 1: Why study data structures?

Instructor: Ashutosh Gupta

IITB India

Compile date: 2023-08-09

cbna CS213/293 Data Structure and Algorithms 2023 Instructor: Ashutosh Gupta IITB India 1
What is data?

Things are not data, but information about them is data.

Example 1.1
Age of people, height of trees, price of stocks, and number of likes.

Data is big!
We are living in the age of big data!

*Image is from the Internet.

Exercise 1.1
1. Estimate the number of messages exchanged for status level in Whatsapp.
2. How much text data was used to train ChatGPT?
We need to work on data

We process data to solve our problems.

Example 1.2
1. Predict the weather
2. Find a webpage
3. Recognize fingerprint

Disorganized data will need a lot of time to process.

Exercise 1.2
How much time do we need to find an element in an array?

Problems

A problem is a pair of an input specification and an output specification.

Example 1.3
The problem of search consists of the following specifications
▶ Input specification: an array S of elements and an element e
▶ Output specification: position of e in S if it exists. If not found, return -1.
Note: output specifications refer to the variables in the input specifications.

Algorithms

An algorithm solves a given problem.


▶ Input ∈ Input specifications
▶ Output ∈ Output specifications

Input → Algorithm → Output

Note: there can be many algorithms to solve a problem.

Exercise 1.3
1. What is an algorithm?
2. How is it different from a program?

Commentary: An algorithm is a step-by-step process that processes a small amount of data in each step and eventually computes the output. The formal definition of the
algorithm will be presented to you in CS310. It took the genius of Alan Turing to give the precise definition of an algorithm.
Example: an algorithm for search

Example 1.4
int search(int *S, int n, int e) {
    // n is the length of the array S
    // We are looking for element e in S
    for (int i = 0; i < n; i++) {
        if (S[i] == e) {
            return i;
        }
    }
    return -1; // Not found
}
Exercise 1.4
How much time will it take to run the above algorithm if e is not in S?
Commentary: Answer: We count memory accesses, arithmetic operations (including comparisons), assignments, and jumps. The loop in the program will iterate n times. In each iteration, there will be one memory access S[i], three arithmetic operations i<n, S[i] == e, and i++, and two jumps. At the initialization, there is an assignment i=0. For the loop exit, there will be one more comparison and jump.
Time = $n\,T_{\text{Read}} + (3n + 2)\,T_{\text{Arith}} + (2n + 1)\,T_{\text{jump}} + T_{\text{return}}$
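The count above can be checked mechanically. The following is a minimal sketch, not part of the original slides: `OpCount` and `search_counted` are hypothetical names, and the sketch assumes that the single assignment i=0 is tallied together with the arithmetic operations, which is what the term (3n + 2) suggests.

```c
#include <assert.h>

/* Operation counters; the grouping mirrors the commentary above.
   Assumption: the assignment i = 0 is counted under "arith". */
typedef struct {
    long reads;   /* memory accesses S[i] */
    long arith;   /* comparisons, increments, and the i = 0 assignment */
    long jumps;   /* conditional branches at the loop head and the if */
} OpCount;

/* Linear search that tallies operations while it runs.
   Returns the position of e in S, or -1 if e is absent. */
int search_counted(const int *S, int n, int e, OpCount *c) {
    assert(n >= 0);
    c->reads = c->arith = c->jumps = 0;
    c->arith++;                 /* initialization i = 0 */
    int i = 0;
    for (;;) {
        c->arith++;             /* comparison i < n */
        c->jumps++;             /* branch at the loop head */
        if (!(i < n)) break;
        c->reads++;             /* read S[i] */
        c->arith++;             /* comparison S[i] == e */
        c->jumps++;             /* branch of the if */
        if (S[i] == e) return i;
        c->arith++;             /* increment i++ */
        i++;
    }
    return -1;
}
```

For an array of n = 5 elements that does not contain e, the counters end at 5 reads, 17 = 3·5 + 2 arithmetic operations, and 11 = 2·5 + 1 jumps.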
Data needs structure
Storing data as a pile of stuff will not work. We need structure.

Example 1.5
Store files in the order of the year. How do we store data at IIT Bombay Hospital?
Structured data helps us solve problems faster

We can exploit the structure to design efficient algorithms to solve our problems.

The goal of this course!

Example: search on well-structured data

Example 1.6
Let us consider the problem of search consisting of the following specifications
▶ Input specification: a non-decreasing array S and an element e
▶ Output specification: Position of e in S. If not found, return −1.

Example: search on well-structured data

Let us see how we can exploit the structured data!

Let us try to search for 68 in the following array.

Index:  0  1  2  3  4  5  6  7  8  9 10
Value: 11 21 35 46 49 60 68 73 81 90 91

▶ Look at the middle point of the array.
▶ Since the value at the middle point is less than 68, we search only in the upper half of the array.
▶ We have halved our search space.
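The halving steps above can be traced in code. This is an illustrative sketch only: `binary_search_probes` and its `probes` output parameter are hypothetical additions used to count how many elements are inspected, and the window is assumed to be the half-open range [first, last).

```c
#include <assert.h>

/* Binary search that also reports how many elements it inspected.
   S must be sorted in non-decreasing order. The probes counter is
   an illustrative addition, not part of the search itself. */
int binary_search_probes(const int *S, int n, int e, int *probes) {
    assert(n >= 0);
    int first = 0, last = n;          /* search window is [first, last) */
    *probes = 0;
    while (first < last) {
        int mid = first + (last - first) / 2;
        (*probes)++;
        if (S[mid] == e) return mid;  /* found */
        if (S[mid] > e) {
            last = mid;               /* keep the lower half */
        } else {
            first = mid + 1;          /* keep the upper half */
        }
    }
    return -1;                        /* not found */
}
```

On the 11-element array above, searching for 68 inspects indices 5, 8, 7, and 6 in that order, so it finds the element at index 6 after 4 probes, where a linear scan would need 7.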

A better search
Example 1.7
int BinarySearch(int *S, int n, int e) {
    // S is a sorted array
    int first = 0, last = n;
    int mid = (first + last) / 2;
    while (first < last) {
        if (S[mid] == e) return mid;
        if (S[mid] > e) {
            last = mid;
        } else {
            first = mid + 1;
        }
        mid = (first + last) / 2;
    }
    return -1;
}

Exercise 1.5
Let $n = 2^{k-1}$. How much time will it take to run the above algorithm if S[0] > e?

Commentary: Answer: There will be k iterations. In each iteration, the function will follow the same path. In each iteration, there will be
▶ a memory access S[mid] (why only one?),
▶ five arithmetic operations first < last, S[mid] == e, S[mid] > e, first+last, and ../2,
▶ one assignment last = mid (why?),
▶ three jumps because of two ifs and a loop exit.
For loop exit, there will be one additional comparison and a jump at the loop head. In the initialization section, we have two assignments and two arithmetic operations.
Time = $k\,T_{\text{Read}} + (6k + 5)\,T_{\text{Arith}} + (3k + 1)\,T_{\text{jump}} + T_{\text{return}}$
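The claim that the loop runs k times when S[0] > e can be checked with a small sketch. The function name `iterations_when_smaller` is hypothetical. Because S[0] > e implies S[mid] > e at every probe of a sorted array, the sketch does not need the array at all and only replays the updates to first and last.

```c
#include <assert.h>

/* Counts the loop iterations of BinarySearch for an array of length n
   under the assumption S[0] > e, so that every probe takes the
   S[mid] > e branch and executes last = mid. */
int iterations_when_smaller(int n) {
    assert(n >= 0);
    int first = 0, last = n, count = 0;
    while (first < last) {
        int mid = (first + last) / 2;
        last = mid;     /* S[mid] > e always holds in this case */
        count++;
    }
    return count;
}
```

For n = 8 = 2^3 the window shrinks as 8, 4, 2, 1, 0, which is k = 4 iterations; for n = 1024 = 2^10 the sketch returns 11.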
Topic 1.1

Big-O notation

How much resource does an algorithm need?

There can be many algorithms to solve a problem.

Some are good and some are bad.

Good algorithms are efficient in


▶ time and
▶ space.

Our method of measuring time is cumbersome and machine-dependent.

We need approximate counting that is machine independent.

Commentary: Sometimes there is a trade-off between time and space. For example, inefficient linear search only needed one extra integer, but binary search needed three
extra integers. The difference of two integers may be a very minor issue, but it illustrates the trade-off.
Input size

An algorithm may have different running times for different inputs.

How do we think about comparing algorithms?

We define the rough size of the input, usually in terms of important parameters of input.

Example 1.8
In the problem of search, we say that the number of elements in the array is the input size.

Please note that the size of individual elements is not considered. (why?)

Commentary: Ideally, the number of bits in the binary representation of input is the size, which is too detailed and cumbersome to handle. In the case of search, we assume that elements are drawn from the space of size $2^{32}$ and can be represented using 32 bits. Therefore, the type of the element was int.
Best/Average/Worst case

For a given size of inputs, we may further make the following distinction.
1. Best case: Shortest running time for some input.
2. Worst case: Longest running time for some input.
3. Average case: Average running time on all the inputs of the given size.

Exercise 1.6
How can we modify almost any algorithm to have a good best-case running time?

Example: Best/Average/Worst case
Example 1.9
int BinarySearch(int *S, int n, int e) {
    // S is a sorted array
    int first = 0, last = n;
    int mid = (first + last) / 2;
    while (first < last) {
        if (S[mid] == e) return mid;
        if (S[mid] > e) {
            last = mid;
        } else {
            first = mid + 1;
        }
        mid = (first + last) / 2;
    }
    return -1;
}

In BinarySearch, let $n = 2^{k-1}$.
1. Best case: e == S[n/2]
   $T_{\text{Read}} + 6\,T_{\text{Arith}} + T_{\text{return}}$
2. Worst case: $e \notin S$
   We have seen the worst case.
3. Average case: ≈ Worst case
   Most often the loop will iterate k times. (why?)

Commentary: Analyzing the average case is hard. We will mostly focus on worst-case analysis. For some important algorithms, we will do an average time analysis.
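One way to get a feel for the average case is to count probes for every element that is present. The sketch below makes two simplifying assumptions that the slides do not state: only successful searches are considered, and each element of S is equally likely to be the target. The names `probes_for` and `total_probes_all_hits` are hypothetical, and the elements of S are assumed to be distinct.

```c
#include <assert.h>

/* Number of elements inspected by binary search for e in sorted S. */
static int probes_for(const int *S, int n, int e) {
    int first = 0, last = n, probes = 0;
    while (first < last) {
        int mid = (first + last) / 2;
        probes++;
        if (S[mid] == e) return probes;
        if (S[mid] > e) last = mid; else first = mid + 1;
    }
    return probes;
}

/* Sum of the probe counts over every element stored in S.
   Dividing the result by n gives the average for successful searches. */
int total_probes_all_hits(const int *S, int n) {
    assert(n >= 0);
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += probes_for(S, n, S[i]);
    }
    return total;
}
```

For 7 distinct sorted elements, one element needs 1 probe, two need 2, and four need 3, for a total of 17 and an average of about 2.4, which is close to the worst case of 3 because more than half of the elements sit at the deepest level.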
Asymptotic behavior

For short inputs, an algorithm may use a shortcut for better running time.

To avoid such false comparisons, we look at the behavior of the algorithm in the limit.

Ignore hardware-specific details


▶ Round numbers: 100000000000001 ≈ 100000000000000
▶ Ignore coefficients: $3k\,T_{\text{Arith}} \approx k$

Big-O notation: approximate measure

Definition 1.1
Let f and g be functions $\mathbb{N} \to \mathbb{N}$. We say $f(n) \in O(g(n))$ if there are constants $c > 0$ and $n_0$ such that

$f(n) \le c\,g(n)$ for all $n \ge n_0$.

▶ In the limit, $c\,g(n)$ will dominate $f(n)$
▶ We say $f(n)$ is $O(g(n))$

Exercise 1.7
Which of the following are true statements?
▶ $5n + 8 \in O(n)$
▶ $5n + 8 \in O(n^2)$
▶ $5n^2 + 8 \in O(n)$
▶ $n^2 + n \in O(n^2)$
▶ $500000000000000000000000n^2 \in O(n^2)$
▶ $50n^2 \log n + 60n^2 \in O(n^2 \log n)$
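Definition 1.1 asks for a pair of witnesses (c, n0). The sketch below tests a proposed pair on a finite range. It is only an aid for experimenting: a finite check can refute a pair of witnesses, but it can never prove membership in O(g). All names (`witness_holds`, `f_linear`, `f_quadratic`, `g_n`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long u64;

/* Returns true if f(n) <= c * g(n) for every n in [n0, limit].
   Assumes limit is small enough that c * g(n) does not overflow. */
bool witness_holds(u64 (*f)(u64), u64 (*g)(u64), u64 c, u64 n0, u64 limit) {
    for (u64 n = n0; n <= limit; n++) {
        if (f(n) > c * g(n)) {
            return false;   /* the pair (c, n0) fails at this n */
        }
    }
    return true;
}

u64 f_linear(u64 n)    { return 5 * n + 8; }      /* 5n + 8   */
u64 f_quadratic(u64 n) { return 5 * n * n + 8; }  /* 5n^2 + 8 */
u64 g_n(u64 n)         { return n; }              /* n        */
```

For example, `witness_holds(f_linear, g_n, 13, 1, 100000)` returns true, consistent with 5n + 8 ≤ 13n for all n ≥ 1, while the same pair (13, 1) fails for `f_quadratic` already at n = 2.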

Example: Big-O of the worst case of BinarySearch

Example 1.10

In BinarySearch, let $n = 2^{k-1}$.

1. Worst case: $e \notin S$
   $k\,T_{\text{Read}} + (6k + 5)\,T_{\text{Arith}} + (3k + 1)\,T_{\text{jump}} + T_{\text{return}} \in O(k)$

Since $k = \log n + 1$, therefore $k \in O(\log n)$.

Therefore, the worst-case running time of BinarySearch is $O(\log n)$.
We may also say BinarySearch is $O(\log n)$.
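The step from k to log n can be spelled out against Definition 1.1. The derivation below assumes base-2 logarithms; the witnesses c = 2 and n0 = 2 are one possible choice, not the only one.

```latex
n = 2^{k-1}
\;\Longrightarrow\; \log_2 n = k - 1
\;\Longrightarrow\; k = \log_2 n + 1 \le 2\log_2 n
\quad \text{for all } n \ge 2 .
```

The last inequality holds because $\log_2 n \ge 1$ whenever $n \ge 2$, so $k \in O(\log n)$ with $c = 2$ and $n_0 = 2$.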

Exercise 1.8
Prove that if $f \in O(g)$ and $g \in O(h)$, then $f \in O(h)$.

What does Big-O say?

Expresses the approximate number of operations executed by the program as a function of input
size

Hierarchy of algorithms
▶ O(log n) algorithm is better than O(n)
▶ We say $O(\log n) < O(n) < O(n^2) < O(2^n)$

May hide large constants!!
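To make the hierarchy concrete, the following sketch evaluates one representative function from each class at the same input size. The helper names `floor_log2`, `square`, and `two_to_the` are hypothetical, and the logarithm is rounded down to an integer.

```c
#include <assert.h>

typedef unsigned long long u64;

/* floor(log2(n)), defined for n >= 1 */
u64 floor_log2(u64 n) {
    assert(n >= 1);
    u64 k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}

/* n^2 */
u64 square(u64 n) {
    return n * n;
}

/* 2^n, valid only for n < 64 so that the shift does not overflow */
u64 two_to_the(u64 n) {
    assert(n < 64);
    return 1ULL << n;
}
```

At n = 20 the four values are 4, 20, 400, and 1048576, which shows how quickly the classes separate even for small inputs.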

Complexity of a problem

The complexity of a problem is the complexity of the best-known algorithm for the problem.

Exercise 1.9
What is the complexity of the following problem?
▶ sorting an array: $O(n^2)$ ✗
▶ matrix multiplication: $O(n^3)$ ✗
Side note: Best algorithm is still not known.

Exercise 1.10
What is the best-known complexity for the above problems?

Θ-Notation

Definition 1.2 (Tight bound)


Let f and g be functions $\mathbb{N} \to \mathbb{N}$. We say $f(n) \in \Theta(g(n))$ if there are constants $c_1 > 0$, $c_2 > 0$, and $n_0$ such that

$c_1\,g(n) \le f(n) \le c_2\,g(n)$ for all $n \ge n_0$.

There are more variations of the above definition. Please look at the end.

Names of complexity classes

▶ Constant: O(1)
▶ Logarithmic: $O(\log n)$
▶ Linear: $O(n)$
▶ Quadratic: $O(n^2)$
▶ Polynomial: $O(n^k)$ for some given k
▶ Exponential: $O(2^n)$

Topic 1.2

Problem

Problem: Compute the exact running time of insertion sort.
Exercise 1.11
The following is the code for insertion sort. Compute the exact worst-case running time of the
code in terms of n and the cost of doing various machine operations.
for (int j = 1; j < n; j++) {
    int key = A[j];          // element to insert into the sorted prefix A[0..j-1]
    int i = j - 1;
    while (i >= 0) {
        if (A[i] > key) {
            A[i + 1] = A[i]; // shift the larger element one place right
        } else {
            break;
        }
        i--;
    }
    A[i + 1] = key;          // place key in its sorted position
}
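As a starting point for the exercise, the loop can be wrapped in a function that counts one kind of operation. This sketch counts only how often the test A[i] > key is evaluated; the function name `insertion_sort_count` and the counter are hypothetical additions, and the full answer still requires tallying reads, assignments, and jumps in the same way.

```c
#include <assert.h>

/* Insertion sort on A[0..n-1]. Returns the number of times the
   comparison A[i] > key was evaluated (an illustrative counter). */
long insertion_sort_count(int *A, int n) {
    assert(n >= 0);
    long comparisons = 0;
    for (int j = 1; j < n; j++) {
        int key = A[j];
        int i = j - 1;
        while (i >= 0) {
            comparisons++;
            if (A[i] > key) {
                A[i + 1] = A[i];   /* shift the larger element right */
            } else {
                break;
            }
            i--;
        }
        A[i + 1] = key;
    }
    return comparisons;
}
```

On a reversed array of 5 elements the inner test runs 1 + 2 + 3 + 4 = 10 times, which is n(n-1)/2, while an already sorted array of the same length needs only n - 1 = 4 comparisons.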
Problem: additions and multiplication

Exercise 1.12
What is the time complexity of binary addition and multiplication? How much time does it take to
do unary addition?

Commentary: Solution: Assume two numbers A and B. In binary representation, their lengths (number of bits) are m and n. Then the time complexity of binary addition
would be O(m + n). This is because we can start from the right end and add (keeping carry in mind) from right to left. Each bit requires an O(1) computation since there are
only 8 combinations (2 each for bit 1, bit 2, and carry). Since the length of a number N in bits is log N, the time complexity is O(log A + log B) = O(log(AB)) = O(m + n).
Similarly, we can analyze long multiplication. The time complexity of multiplication is O(log A × log B) = O(mn). There are better algorithms than long multiplication
that have better time complexity. For example, Karatsuba’s algorithm. Unary addition is the concatenation of inputs. To produce the output, the algorithm needs to write out the concatenated string, therefore O(A + B).
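The right-to-left addition described in the solution can be sketched as follows. The representation is an assumption made for illustration: each number is a little-endian array of bits stored one per int, and `add_bits` is a hypothetical name. The loop runs max(m, n) times with constant work per bit, which is within the O(m + n) bound stated above.

```c
#include <assert.h>

/* Adds two numbers given as little-endian bit arrays (index 0 holds
   the least significant bit). out must have room for max(m, n) + 1
   bits. Returns the number of bits written to out. */
int add_bits(const int *a, int m, const int *b, int n, int *out) {
    assert(m >= 0 && n >= 0);
    int len = (m > n) ? m : n;
    int carry = 0;
    for (int i = 0; i < len; i++) {
        int x = (i < m) ? a[i] : 0;   /* missing high bits count as 0 */
        int y = (i < n) ? b[i] : 0;
        int sum = x + y + carry;      /* one of the 8 input combinations */
        out[i] = sum % 2;
        carry = sum / 2;
    }
    out[len] = carry;                 /* final carry-out, possibly 0 */
    return len + 1;
}
```

Adding 5 (bits 1, 0, 1) and 3 (bits 1, 1) writes the four bits 0, 0, 0, 1, which is 8 in little-endian order.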
Problem: hierarchy of complexity

Exercise 1.13
Given $f(n) = a_0 n^0 + \dots + a_d n^d$ and $g(n) = b_0 n^0 + \dots + b_e n^e$ with $d > e$ and $a_d > 0$ (why?), show that $f(n) \notin O(g(n))$.
Commentary: Solution: Let us begin by assuming the proposition is false, that is, $f(n) \in O(g(n))$. By definition, there exist constants $c$ and $n_0$ such that $\forall n \ge n_0,\ f(n) \le c\,g(n)$. Hence, we have

$$\forall n \ge n_0,\quad a_0 n^0 + \dots + a_d n^d \le c\,(b_0 n^0 + \dots + b_e n^e)$$

$$\forall n \ge n_0,\quad \sum_{i=0}^{e} (a_i - c\,b_i)\,n^i + a_{e+1} n^{e+1} + \dots + a_d n^d \le 0$$

Dividing by $n^d$ (positive for $n \ge 1$) and taking the limit, every term of degree below $d$ vanishes, so

$$\lim_{n \to \infty} \frac{1}{n^d}\left(\sum_{i=0}^{e} (a_i - c\,b_i)\,n^i + a_{e+1} n^{e+1} + \dots + a_d n^d\right) = a_d \le 0$$

Since $a_d > 0$, this is a contradiction. Source: Milind notes.

Order of functions

Exercise 1.14
▶ If $f(n) \le F(n)$ and $G(n) \ge g(n)$ (in order sense) then show that $\frac{f(n)}{G(n)} \le \frac{F(n)}{g(n)}$.
▶ Is $f(n)$ the same order as $f(n)\,|\sin(n)|$?

Commentary: Source Milind notes

Topic 1.3

Extra slides: More on complexity

Ω notation

Definition 1.3 (Lower bound)


Let f and g be functions $\mathbb{N} \to \mathbb{N}$. We say $f(n) \in \Omega(g(n))$ if there are constants $c > 0$ and $n_0$ such that

$c\,g(n) \le f(n)$ for all $n \ge n_0$.

Small-o, ω notation

Definition 1.4 (Strict Upper bound)

Let f and g be functions $\mathbb{N} \to \mathbb{N}$. We say $f(n) \in o(g(n))$ if for each real $c > 0$, there is $n_0$ such that

$f(n) \le c\,g(n)$ for all $n \ge n_0$.

Definition 1.5 (Strict Lower bound)

Let f and g be functions $\mathbb{N} \to \mathbb{N}$. We say $f(n) \in \omega(g(n))$ if for each real $c > 0$, there is $n_0$ such that

$c\,g(n) \le f(n)$ for all $n \ge n_0$.

Size of functions

We can define the order over functions using the above notations
▶ f (n) ∈ O(g (n)) implies f (n) ≤ g (n)
▶ f (n) ∈ o(g (n)) implies f (n) < g (n)
▶ f (n) ∈ Ω(g (n)) implies f (n) ≥ g (n)
▶ f (n) ∈ ω(g (n)) implies f (n) > g (n)
▶ f (n) ∈ Θ(g (n)) implies f (n) = g (n)

End of Lecture 1
