
DATA STRUCTURES AND ALGORITHMS
Abrahem P. Anqui
Subject Details
■ Course Description
– An overview of data structure concepts: arrays, stacks, queues, trees, and graphs. Discussion of various implementations of these data objects, programming styles, and run-time representations. The course also examines algorithms for sorting, searching, and some graph algorithms. Algorithm analysis and efficient code design are also discussed.
■ Learning Outcomes
– Explain and utilize linked lists, stacks, queues, and trees.
– Incorporate algorithmic design know-how and data structures to create reliable and structured programs.
– Describe the design and performance of various searching and sorting algorithms.
– Use advanced object-oriented concepts such as abstract base classes, friend classes, and operator overloading in the implementation of data structures.
Introduction to Data Structures and Algorithms
What Are Data Structures and Algorithms
Good For?
■ A data structure is an arrangement of data in a computer’s
memory (or sometimes on a disk). Data structures include arrays,
linked lists, stacks, binary trees, and hash tables, among others.
■ Algorithms manipulate the data in these structures in various
ways, such as searching for a particular data item and sorting the
data.
Overview of Data Structure
■ A data structure is a systematic way to organize data in order to use it
efficiently. The following are the foundational terms of a data structure.
– Interface − Each data structure has an interface. The interface
represents the set of operations that a data structure supports. An
interface only provides the list of supported operations, the type of
parameters they can accept, and the return type of these operations.
– Implementation − Implementation provides the internal representation of
a data structure. The implementation also provides the definition of the
algorithms used in the operations of the data structure.
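To make the distinction concrete, here is a minimal Python sketch (the Stack class and its method names are illustrative, not part of the slides): the push, pop, and peek methods form the interface, while the underlying Python list is one possible implementation.

class Stack:
    """Interface: push, pop, peek. Implementation: a Python list."""

    def __init__(self):
        self._items = []            # internal representation (implementation detail)

    def push(self, value):
        self._items.append(value)   # place a new element on top of the stack

    def pop(self):
        return self._items.pop()    # remove and return the top element

    def peek(self):
        return self._items[-1]      # return the top element without removing it

s = Stack()
s.push(10)
s.push(20)
print(s.pop())   # 20
print(s.peek())  # 10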
Characteristics of a Data Structure
■ Correctness − A data structure implementation should implement its interface correctly.
■ Time Complexity − The running time or execution time of the operations of the data structure must be as small as possible.
■ Space Complexity − The memory usage of a data structure operation should be as small as possible.
Need for Data Structure
■ As applications become more complex and data-rich, there are three common problems that applications face nowadays.
– Data Search − Consider an inventory of 1 million (10^6) items in a store. If the application has to search for an item, it must look through the 1 million (10^6) items every time, slowing down the search. As the data grows, search becomes slower.
– Processor Speed − Processor speed, although very high, becomes a limiting factor if the data grows to billions of records.
– Multiple Requests − As thousands of users can search the data simultaneously on a web server, even a fast server can fail while searching the data.
■ To solve the above-mentioned problems, data structures come to the rescue. Data can be organized in a data structure in such a way that not all items need to be examined, and the required data can be found almost instantly.
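As a small illustration of this idea (the item names below are hypothetical, not from the slides), the sketch contrasts a linear scan over a list with a lookup in a Python dictionary, which is organized as a hash table and finds an item without examining every entry.

# Inventory stored as a plain list: searching means scanning item by item.
inventory_list = [("item-%d" % i, i * 10) for i in range(1_000_000)]

def find_in_list(name):
    for item_name, price in inventory_list:   # up to 1,000,000 comparisons
        if item_name == name:
            return price
    return None

# The same data organized as a dictionary (hash table): near-instant lookup.
inventory_dict = dict(inventory_list)

def find_in_dict(name):
    return inventory_dict.get(name)            # roughly constant time on average

print(find_in_list("item-999999"))   # slow: walks almost the whole list
print(find_in_dict("item-999999"))   # fast: a single hash lookup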
Execution Time Cases
■ There are three cases that are usually used to compare various data structures’
execution times in a relative manner.
– Worst Case − This is the scenario where a particular data structure
operation takes the maximum time it can take. If an operation’s worst-case
time is ƒ(n) then this operation will not take more than ƒ(n) time, where
ƒ(n) represents the function of n.
– Average Case − This is the scenario depicting the average execution time of an operation of a data structure. If an operation takes ƒ(n) time in execution, then m operations will take mƒ(n) time.
– Best Case − This is the scenario depicting the least possible execution time of an operation of a data structure. If an operation's best-case time is ƒ(n), then the actual operation may take more time than ƒ(n), but never less.
Basic Terminology
■ Data − Data are values or sets of values.
■ Data Item − Data item refers to a single unit of values.
■ Group Items − Data items that are divided into sub-items are called Group Items.
■ Elementary Items − Data items that cannot be divided are called Elementary Items.
■ Attribute and Entity − An entity is that which contains certain attributes or
properties, which may be assigned values.
■ Entity Set − Entities of similar attributes form an entity set.
■ Field − Field is a single elementary unit of information representing an attribute of
an entity.
■ Record − A record is a collection of field values of a given entity.
■ File − A file is a collection of records of the entities in a given entity set.
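As a small illustration tying these terms together (the student data below is hypothetical, not from the slides): each key is a field representing an attribute, each value is a data item, the whole dictionary is a record, and a list of such records plays the role of a file for the student entity set.

# One entity (a student) described by its attributes.
student_record = {            # a record: the field values of one entity
    "name": "Ana Cruz",       # "name" is a field; its value is an elementary item
    "age": 19,                # "age" is a field; its value is an elementary item
    "address": {              # a group item: it can be divided into sub-items
        "city": "Davao",
        "zip": "8000",
    },
}

# A file: a collection of records for the entities in one entity set.
student_file = [
    student_record,
    {"name": "Ben Uy", "age": 20, "address": {"city": "Cebu", "zip": "6000"}},
]

print(len(student_file), "records in the student file")   # 2 records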
Overview of Algorithms
■ An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a certain order to get the desired output. Algorithms are generally created independently of underlying languages, i.e. an algorithm can be implemented in more than one programming language.
■ Many of the algorithms we’ll discuss apply directly to specific data structures. For most data
structures, you need to know how to
– Insert a new data item.
– Search for a specified item.
– Delete a specified item.
■ From the data structure point of view, the following are some important categories of algorithms
– Search − Algorithm to search an item in a data structure.
– Sort − Algorithm to sort items in a certain order.
– Insert − Algorithm to insert an item in a data structure.
– Update − Algorithm to update an existing item in a data structure.
– Delete − Algorithm to delete an existing item from a data structure.
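A minimal sketch of these operation categories, using Python's built-in list as the data structure (the values are arbitrary):

items = [3, 7, 1]

items.append(9)       # Insert: add a new data item
found = 7 in items    # Search: check whether a specified item exists
items.sort()          # Sort: arrange the items in a certain order
items[0] = 2          # Update: change an existing item
items.remove(9)       # Delete: remove a specified item

print(items, found)   # [2, 3, 7] True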
Characteristics of an Algorithm
■ Not all procedures can be called an algorithm. An algorithm should have the following
characteristics
– Unambiguous − The algorithm should be clear and unambiguous. Each of its steps (or
phases), and their inputs/outputs should be clear and must lead to only one meaning.
– Input − An algorithm should have 0 or more well-defined inputs.
– Output − An algorithm should have 1 or more well-defined outputs and should match the
desired output.
– Finiteness − Algorithms must terminate after a finite number of steps.
– Feasibility − The algorithm should be feasible with the available resources.
– Independent − An algorithm should have step-by-step directions, which should be
independent of any programming code.
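For instance, the short procedure below (an illustrative example, not from the slides) exhibits these characteristics: one well-defined input, one well-defined output, unambiguous steps, and termination after a finite number of iterations.

def find_max(values):
    # Input: a non-empty list of numbers. Output: the largest number.
    largest = values[0]
    for v in values[1:]:   # each step is unambiguous and the loop is finite
        if v > largest:
            largest = v
    return largest         # terminates once every value has been examined

print(find_max([4, 11, 7, 2]))   # 11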
How to Write an Algorithm?
■ There are no well-defined standards for writing algorithms. Rather, it
is problem and resource-dependent. Algorithms are never written to
support a particular programming code.
■ As we know, all programming languages share basic code constructs such as loops (do, for, while), flow control (if-else), etc. These common constructs can be used to write an algorithm.
■ We usually write algorithms in a step-by-step manner, but that is not always the case. Algorithm writing is a process that is carried out after the problem domain is well-defined; that is, we should know the problem domain for which we are designing a solution.
Example
■ Let's try to learn algorithm writing by using an example.
■ Problem − Design an algorithm to add two numbers and display the result.
– I: 2 Numbers
– P: add
– O: Sum

■ Algorithms tell programmers how to code the program. For this problem, the algorithm can be written step by step as shown below.
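One common way to write the steps (a sketch based on the problem statement above) is:

Step 1 − START
Step 2 − Declare three variables a, b and c
Step 3 − Read the values of a and b
Step 4 − Add the values of a and b and store the result in c
Step 5 − Display c
Step 6 − STOP

For illustration only, the same steps translate into a few lines of Python:

a = int(input())   # read the first number
b = int(input())   # read the second number
c = a + b          # add the two numbers and store the result in c
print(c)           # display the sum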
■ There are multiple ways to solve a problem using a computer program. For
instance, there are several ways to sort items in an array. You can use
merge sort, bubble sort, insertion sort, etc. All these algorithms have their own
pros and cons. An algorithm can be thought of as a procedure or formula to
solve a particular problem. The question is, which algorithm to use to solve a
specific problem when there exist multiple solutions to the problem?

■ Algorithm analysis refers to analyzing the complexity of different algorithms and finding the most efficient algorithm to solve the problem at hand. Big-O notation is the standard mathematical notation used to describe the complexity of an algorithm.
Why is Algorithm Analysis Important?
■ To understand why algorithm analysis is important, we will take the help of a simple example. Suppose a manager gives two of his employees the task of designing an algorithm in Python that calculates the factorial of a number entered by the user.
■ The algorithm developed by the first employee (E1) looks like this:

def fact(n):
    product = 1
    for i in range(n):
        product = product * (i + 1)
    return product

print(fact(5))

Notice that the algorithm simply takes an integer as an argument. Inside the fact function, a variable named product is initialized to 1. A loop executes from 1 to n, and during each iteration the value in product is multiplied by the number being iterated by the loop, and the result is stored back in the product variable. After the loop finishes, the product variable contains the factorial.

■ Similarly, the second employee (E2) also developed an algorithm that calculates the factorial of a number, this time using a recursive function, as shown below:

def fact2(n):
    if n == 0:
        return 1
    else:
        return n * fact2(n - 1)

print(fact2(5))
■ The manager has to decide which algorithm to use. To do so, he has to find the complexity of the algorithm. One way to do so is by
finding the time required to execute the algorithms.
■ In a Jupyter notebook, you can use the %timeit magic command followed by the function call to find the time taken by the function to execute. Look at the following script:
– %timeit fact(50)
■ Output:
– 9 µs ± 405 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
– The output says that the algorithm takes 9 microseconds (plus/minus 405 nanoseconds) per loop.
■ Similarly, execute the following script:
– %timeit fact2(50)
■ Output:
– 15.7 µs ± 427 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
■ The second algorithm, involving recursion, takes 15.7 microseconds (plus/minus 427 nanoseconds) per loop.
■ The execution time shows that the first algorithm is faster compared to the second algorithm involving recursion. This example
shows the importance of algorithm analysis. In the case of large inputs, the performance difference can become more significant.
Algorithm Analysis with Big-O
Notation
■ Big-O notation is a metric used to describe algorithm complexity. Basically, Big-O notation signifies the relationship between the input to the algorithm and the steps required to execute the algorithm. It is denoted by a capital "O" followed by a pair of parentheses. Inside the parentheses, the relationship between the input and the steps taken by the algorithm is expressed using "n".
■ For instance, if there is a linear relationship between the input and the steps taken by the algorithm to complete its execution, the Big-O notation used will be O(n). Similarly, the Big-O notation for quadratic functions is O(n^2).
The following are some of the most common Big-O functions: O(1) (constant), O(log n) (logarithmic), O(n) (linear), O(n log n), O(n^2) (quadratic), and O(2^n) (exponential).
To get an idea of how Big-O notation is calculated, let's take a look at some examples of constant, linear, and quadratic complexity.
Constant Complexity (O(C))
■ The complexity of an algorithm is said to be constant if the steps required to complete the execution of the algorithm remain constant, irrespective of the number of inputs. Constant complexity is denoted by O(c), where c can be any constant number.
■ Let's write a simple algorithm in Python that finds the square of the first item in the list and then prints it on the screen.

def constant_algo(items):
    result = items[0] * items[0]
    print(result)

constant_algo([4, 5, 6, 8])

In the above script, irrespective of the input size, or the number of items in the input list items, the algorithm performs only 2 steps: finding the square of the first element and printing the result on the screen. Hence, the complexity remains constant.
Constant Complexity (O(C))
If you draw a line plot with the varying size of the items input on the x-axis and the number of steps on the y-axis, you will get a straight line. To visualize this, execute the following script:

import matplotlib.pyplot as plt

x = [2, 4, 6, 8, 10, 12]
y = [2, 2, 2, 2, 2, 2]

plt.plot(x, y, 'b')
plt.xlabel('Inputs')
plt.ylabel('Steps')
plt.title('Constant Complexity')
plt.show()
Linear Complexity (O(n))
■ The complexity of an algorithm is said to be linear if the steps required to complete the execution of the algorithm increase or decrease linearly with the number of inputs. Linear complexity is denoted by O(n).
■ In this example, let's write a simple program that displays all items in the list to the console:

def linear_algo(items):
    for item in items:
        print(item)

linear_algo([4, 5, 6, 8])

The complexity of the linear_algo function is linear in the above example, since the number of iterations of the for loop will be equal to the size of the input items array. For instance, if there are 4 items in the items list, the for loop will be executed 4 times, and so on.
Linear Complexity (O(n))
■ The plot for linear complexity, with inputs on the x-axis and the number of steps on the y-axis, is as follows:

import matplotlib.pyplot as plt

x = [2, 4, 6, 8, 10, 12]
y = [2, 4, 6, 8, 10, 12]

plt.plot(x, y, 'b')
plt.xlabel('Inputs')
plt.ylabel('Steps')
plt.title('Linear Complexity')
plt.show()
Another point to note here is that, in the case of a huge number of inputs, the constants become insignificant. For instance, take a look at the following script:

def linear_algo(items):
    for item in items:
        print(item)

    for item in items:
        print(item)

linear_algo([4, 5, 6, 8])

In the script above, there are two for loops that iterate over the input items list. Therefore the complexity of the algorithm becomes O(2n); however, for an extremely large number of items in the input list, twice infinity is still equal to infinity, so we can ignore the constant 2 (since it is ultimately insignificant) and the complexity of the algorithm remains O(n).

We can further verify and visualize this by plotting the inputs on the x-axis and the number of steps on the y-axis, as shown below:

import matplotlib.pyplot as plt

x = [2, 4, 6, 8, 10, 12]
y = [4, 8, 12, 16, 20, 24]

plt.plot(x, y, 'b')
plt.xlabel('Inputs')
plt.ylabel('Steps')
plt.title('Linear Complexity')
plt.show()
Quadratic Complexity (O(n^2))
■ The complexity of an algorithm is said to be quadratic when the steps required to execute the algorithm are a quadratic function of the number of items in the input. Quadratic complexity is denoted as O(n^2). Take a look at the following example to see a function with quadratic complexity:

def quadratic_algo(items):
    for item in items:
        for item2 in items:
            print(item, ' ', item2)

quadratic_algo([4, 5, 6, 8])

In the script above, you can see that we have an outer loop that iterates through all the items in the input list, and then a nested inner loop, which again iterates through all the items in the input list. The total number of steps performed is n * n, where n is the number of items in the input array.
Quadratic Complexity (O(n^2))
■ The following graph plots the number of inputs vs the steps for an algorithm with quadratic complexity.
■ In the previous examples, we saw that only one function was being performed on the input. What if multiple functions are being performed on the input? Take a look at the following example.

def complex_algo(items):

    for i in range(5):                 # O(5)
        print("Python is awesome")

    for item in items:                 # O(n)
        print(item)

    for item in items:                 # O(n)
        print(item)

    print("Big O")                     # O(1)
    print("Big O")                     # O(1)
    print("Big O")                     # O(1)

complex_algo([4, 5, 6, 8])

In the script above, several tasks are being performed: first, a string is printed 5 times on the console using the print statement; next, we print the input list twice on the screen; and finally, another string is printed three times on the console. To find the complexity of such an algorithm, we need to break the algorithm code down into parts and find the complexity of the individual pieces.

Let's break our script down into individual parts. In the first part we have:

for i in range(5):
    print("Python is awesome")

The complexity of this part is O(5), since five constant steps are performed in this piece of code irrespective of the input. Next, we have:

for item in items:
    print(item)

We know the complexity of the above piece of code is O(n). Similarly, the complexity of the following piece of code is also O(n):

for item in items:
    print(item)

Finally, in the following piece of code, a string is printed three times, hence the complexity is O(3):

print("Big O")
print("Big O")
print("Big O")

To find the overall complexity, we simply have to add these individual complexities:

O(5) + O(n) + O(n) + O(3)

Simplifying the above, we get:

O(8) + O(2n)

We said earlier that when the input (which has length n in this case) becomes extremely large, the constants become insignificant, i.e. twice or half of infinity is still infinity. Therefore, we can ignore the constants. The final complexity of the algorithm will be O(n).
Worst vs Best Case Complexity
■ Usually, when someone asks you about the complexity of an algorithm, they are asking about the worst-case complexity. To understand best-case and worst-case complexity, look at the following script:

def search_algo(num, items):
    for item in items:
        if item == num:
            return True
        else:
            pass

nums = [2, 4, 6, 8, 10]
print(search_algo(2, nums))

In the script above, we have a function that takes a number and a list of numbers as input. It returns True if the passed number is found in the list of numbers, otherwise it returns None. If you search for 2 in the list, it will be found in the first comparison. This is the best case for the algorithm: the searched item is found at the first searched index. The best-case complexity, in this case, is O(1). On the other hand, if you search for 10, it will only be found at the last searched index. The algorithm will have to search through all the items in the list, hence the worst-case complexity becomes O(n).
Space Complexity
■ In addition to the time complexity, where you count the number of steps required to complete the execution of an algorithm, you can also find the space complexity, which refers to the amount of memory you need to allocate during the execution of a program. Have a look at the following example:

def return_squares(n):
    square_list = []
    for num in n:
        square_list.append(num * num)
    return square_list

nums = [2, 4, 6, 8, 10]
print(return_squares(nums))

In the script above, the function accepts a list of integers and returns a list with the corresponding squares of the integers. The algorithm has to allocate memory for the same number of items as in the input list. Therefore, the space complexity of the algorithm becomes O(n).
Watch these videos as supplemental discussion
Synthesize the following:
■ Time Complexity:
https://www.youtube.com/watch?v=V42FBiohc6c&list=PL2_aWCzGMAwI9HK8
YPVBjElbLbI3ufctn
■ Big-O Notation: https://www.youtube.com/watch?v=__vX2sjlpXU
https://www.youtube.com/watch?v=D6xkbGLQesk

1. Discuss the concept
2. Summarize the video
3. Discuss your learning
Group Dynamics
■ Group yourselves into threes.
– Two members will each write an algorithm (a different one per member).
– One member will verify which algorithm is better using time complexity.
■ Problem: Separate the odd and even numbers based on the user's input, then add all the even numbers and all the odd numbers.
Answer
■ Algo with Complexity

Declare variables value, sumE, sumO
Get value from user
For i in value:
    if i modulus 2 = 0
        print i
        sumE = sumE + i
Print sumE
For i in value:
    if i modulus 2 not equal to 0
        print i
        sumO = sumO + i
Print sumO

Complexity: O(n)

■ Code (Java)
import java.util.Scanner;
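As a reference, here is a minimal Python sketch of the pseudocode above (an illustration only; it assumes the user enters a single upper-bound value and the numbers 1 through value are classified):

value = int(input("Enter a value: "))
sum_even = 0
sum_odd = 0

for i in range(1, value + 1):       # first pass: even numbers
    if i % 2 == 0:
        print(i)
        sum_even = sum_even + i
print("Sum of even numbers:", sum_even)

for i in range(1, value + 1):       # second pass: odd numbers
    if i % 2 != 0:
        print(i)
        sum_odd = sum_odd + i
print("Sum of odd numbers:", sum_odd)

# Two linear passes over the numbers: O(2n), which simplifies to O(n).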
