DS Intro

Uploaded by

Ravikumar P
Copyright
© © All Rights Reserved

Introduction to Data Structures

Virtual lab – IIIT Hyderabad


Objectives
 Be familiar with problem solving.
 Be able to develop (and implement) algorithms.
 Be able to trace algorithms.
 Be able to select appropriate data structures and algorithms
for given problems.
Algorithm Specification
• An algorithm is a finite set of instructions that accomplishes a particular task.
• Criteria
o input: zero or more quantities are externally supplied
o output: at least one quantity is produced
o definiteness: each instruction is clear and unambiguous
o finiteness: the algorithm terminates after a finite number of steps
o effectiveness: every instruction is basic enough to be carried out
• A program does not have to satisfy the finiteness criterion.
From the data structure point of view, the following are some important categories of algorithms −
Search − Algorithm to search for an item in a data structure.
Sort − Algorithm to sort items in a certain order.
Insert − Algorithm to insert an item in a data structure.
Update − Algorithm to update an existing item in a data structure.
Delete − Algorithm to delete an existing item from a data structure.
Data abstraction
 Data Type
A data type is a collection of objects and a set of operations that act on those objects.
• For example, the data type int consists of the objects {0, +1, -1, +2, -2, …} (bounded by the machine's integer range) and the operations +, -, *, /, and %.
 The data types of C
 The basic data types: char, int, float and double
 The group data types: array and struct
 The pointer data type
 The user-defined types
Data abstraction
• Abstract Data Type
o An abstract data type (ADT) is a data type that is organized in such a way that the specification of the objects and the operations on the objects is separated from the representation of the objects and the implementation of the operations.
Data Types & Data Structures
• Structured data types: can be broken into component parts, e.g. an object, array, set, file, etc.
Example: a student object, in which each field is a component part.
    Name:   AHMAD
    Age:    20
    Branch: CSC
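The student object above can be sketched in C as a struct, where each member is one component part. The member names and array sizes below are assumptions made for illustration; only the field values come from the slide.

```c
#include <string.h>

/* A structured data type: each member is one component part. */
struct student {
    char name[16];    /* size chosen for illustration */
    int  age;
    char branch[8];   /* size chosen for illustration */
};

/* Builds the example record from the slide (AHMAD, 20, CSC). */
struct student make_example_student(void)
{
    struct student s;
    strcpy(s.name, "AHMAD");
    s.age = 20;
    strcpy(s.branch, "CSC");
    return s;
}
```

Individual parts are then reached with the member operator, for example `s.age`.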

Types of Data Structures
• Array
• Linked List
• Queue
• Stack
• Tree
There are many more; these are only a few. We'll learn these data structures in great detail!
The Need for Data Structures
• Goal: to organize data
• Criteria: to facilitate efficient
o storage of data
o retrieval of data
o manipulation of data
• Design issue:
o select and design appropriate data types
(This is the main motivation to learn and understand data structures.)
Algorithm Specification
• Recursive algorithms
o Functions can call themselves (direct recursion).
o They may also call other functions that invoke the calling function again (indirect recursion).
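A minimal sketch of both kinds of recursion in C. The function names (`factorial`, `is_even`, `is_odd`) are illustrative choices, not taken from the slides.

```c
#include <stdbool.h>

/* Direct recursion: factorial calls itself. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)
        return 1;                 /* base case stops the recursion */
    return n * factorial(n - 1);
}

/* Indirect recursion: is_even calls is_odd, which calls is_even again. */
bool is_odd(unsigned int n);      /* forward declaration */

bool is_even(unsigned int n)
{
    if (n == 0)
        return true;
    return is_odd(n - 1);
}

bool is_odd(unsigned int n)
{
    if (n == 0)
        return false;
    return is_even(n - 1);
}
```

In both cases a base case is what lets the algorithm meet the finiteness criterion.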
Algorithm Analysis
• The efficiency of an algorithm can be analyzed at two different stages: before implementation and after implementation.
• A Priori Analysis −
o This is a theoretical analysis of an algorithm.
o Efficiency is measured by assuming that all other factors, such as processor speed, are constant and have no effect on the implementation.
• A Posteriori Analysis −
o This is an empirical analysis of an algorithm.
o The selected algorithm is implemented in a programming language and run.
o In this analysis, actual statistics such as running time and space required are collected.
Algorithm Complexity
• Suppose X is an algorithm and n is the size of the input data. The time and space used by X are the two main factors that decide its efficiency.
• Time Factor − Time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.
• Space Factor − Space is measured by counting the maximum memory space required by the algorithm.
• The complexity of an algorithm, f(n), gives the running time and/or the storage space required by the algorithm in terms of the input size n.
Space Complexity
• Space complexity of an algorithm represents the amount of memory space required by the algorithm.
• The space required by an algorithm is equal to the sum of the following two components −
o A fixed part: the space required to store data and variables that are independent of the size of the problem.
  For example, simple variables and constants used, program size, etc.
o A variable part: the space required by variables whose size depends on the size of the problem.
  For example, dynamic memory allocation, recursion stack space, etc.
Space complexity S(P) of any algorithm P is
S(P) = C + S_P(I),
where C is the fixed part and S_P(I) is the variable part of the algorithm, which depends on the instance characteristic I.

Algorithm: SUM(A, B)
Step 1 - START
Step 2 - C ← A + B + 10
Step 3 - STOP
• Here we have three variables (A, B, and C) and one constant (10).
• Hence S(P) = 1 + 3.
• The actual space depends on the data types of the variables and the constant, so each count is multiplied by the size of the corresponding type.
Time Complexity
• Time complexity of an algorithm represents the amount of time required by the algorithm to run to completion.
• Time requirements can be defined as a numerical function T(n), where T(n) can be measured as the number of steps, provided each step consumes constant time.
• For example, addition of two n-bit integers takes n steps.
• Consequently, the total computational time is T(n) = c ∗ n, where c is the time taken for the addition of two bits.
• Here, we observe that T(n) grows linearly as the input size increases.
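To make step counting concrete, here is a hedged sketch of an iterative sum instrumented with a step counter. The `steps` parameter and the choice of which statements count as one step are assumptions for illustration; under them the function performs 2n + 3 steps, which is again linear in n.

```c
/* Hypothetical instrumented sum: *steps is incremented once per
   counted step, assuming each counted step takes constant time. */
float sum_list(const float list[], int n, long *steps)
{
    float total = 0;
    (*steps)++;                 /* initialisation: 1 step */
    for (int i = 0; i < n; i++) {
        (*steps)++;             /* loop test that succeeds: n steps */
        total += list[i];
        (*steps)++;             /* addition: n steps */
    }
    (*steps)++;                 /* final loop test that fails: 1 step */
    (*steps)++;                 /* return: 1 step */
    return total;               /* total count: 2n + 3 */
}
```

Doubling n roughly doubles the count, which is what linear growth of T(n) means in practice.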
Performance analysis
[Figure: iterative function to sum a list of numbers, annotated with steps per execution]
What is a good algorithm?
 It must be correct
 It must be finite (in terms of time and size)

 It must terminate

 It must be unambiguous

 It must be space and time efficient

A program is an instance of an algorithm, written in some specific programming language.
A simple algorithm
 Problem: Find maximum of a, b, c
 Algorithm
 Input = a, b, c
 Output = max
 Process
o Let max = a
o If b > max then
 max = b
o If c > max then
 max = c
o Display max

Order is very important!!!
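The algorithm above translates almost line for line into C. This is a sketch; the function name `max_of_three` is an illustrative choice, and the result is returned rather than displayed so it can be reused.

```c
/* Returns the largest of a, b and c, following the steps in order:
   start with a, then compare against b, then against c. */
int max_of_three(int a, int b, int c)
{
    int max = a;
    if (b > max)
        max = b;
    if (c > max)
        max = c;
    return max;
}
```

The initialisation `max = a` must come first: the comparisons are only meaningful once max holds a value.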


Algorithm development: Basics
 Clearly identify:
 what output is required?
 what is the input?
 What steps are required to transform input into
output
o Needs problem solving skills
o A problem can be solved in many different ways
o Which solution, amongst the different possible
solutions is optimal?
How to express an algorithm?
• An algorithm is a sequence of steps to solve a problem
• We need a way to express this sequence of steps
1. Natural language (NL) is an obvious choice, but not a good choice. Why?
o NLs are notoriously ambiguous (unclear)
2. Programming language (PL) is another choice, but again not a good choice. Why?
o An algorithm should be PL independent
• We need some balance
o We need PL independence
o We need clarity
o Pseudo-code provides the right balance
What is Pseudo-code?
• Pseudo-code is a shorthand way of describing a computer program
• Rather than using the specific syntax of a computer language, more general wording is used
• It is a mixture of NL and PL expressions, in a systematic way
• Using pseudo-code, it is easier for a non-programmer to understand the general

Pseudo-code: General Guidelines
• Use PL constructs that are consistent with modern high-level languages, e.g. C++, Java, ...
• Use appropriate comments for clarity
• Be simple and precise


Components of Pseudo-code
Expressions
• Standard mathematical symbols are used
o Left arrow sign (←) as the assignment operator in assignment statements
o Equal sign (=) as the equality relation in Boolean expressions
o For example
    Sum ← 0
    Sum ← Sum + 5
What is the final value of Sum?


Components of Pseudo-code (cont.)
• Decision structures (if-then-else logic)
o if condition then
      true-actions
  [else
      false-actions]
o We use indentation to indicate which actions belong to the true-actions and which to the false-actions
o For example
    if marks > 50 then
        print “Congratulations, you have passed!”
    else
        print “Sorry, you have failed!”
    end if
What will be the output if marks are equal to 75?


Components of Pseudo-code (cont.)
• Loops (Repetition)
o Pre-condition loops: while loops
    while condition do actions
o We use indentation to indicate which actions belong to the loop body
o For example
    while counter < 5 do
        print “Welcome to CSE!”
        counter ← counter + 1
    end while
What will be the output if counter is initialised to 0? To 7?


Components of Pseudo-code (cont.)
• Loops (Repetition)
o Pre-condition loops: for loops
    for variable-increment-definition do actions
o For example
    for counter ← 0; counter < 5; counter ← counter + 2 do
        print “Welcome to CSE!”
    end for
What will be the output?


Components of Pseudo-code (cont.)
• Loops (Repetition)
o Post-condition loops: do loops
    do actions while condition
o For example
    do
        print “Welcome to CSE!”
        counter ← counter + 1
    while counter < 5
What will be the output if counter was initialised to 10?
The body of a post-condition loop must execute at least once
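The three loop forms can be tried out in C. In this sketch each function prints the greeting and also returns how many times the body ran; the function names and the returned count are additions for illustration, not part of the pseudo-code.

```c
#include <stdio.h>

/* Pre-condition loop: the test comes first, so the body may run zero times. */
int while_greetings(int counter)
{
    int printed = 0;
    while (counter < 5) {
        puts("Welcome to CSE!");
        counter = counter + 1;
        printed++;
    }
    return printed;
}

/* for loop from the slide: counter takes the values 0, 2 and 4. */
int for_greetings(void)
{
    int printed = 0;
    for (int counter = 0; counter < 5; counter = counter + 2) {
        puts("Welcome to CSE!");
        printed++;
    }
    return printed;
}

/* Post-condition loop: the test comes last, so the body runs at least once. */
int do_greetings(int counter)
{
    int printed = 0;
    do {
        puts("Welcome to CSE!");
        counter = counter + 1;
        printed++;
    } while (counter < 5);
    return printed;
}
```

Calling `while_greetings(7)` prints nothing, whereas `do_greetings(10)` still prints once, which is the practical difference between the two loop kinds.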


Components of Pseudo-code (cont.)
 Comments
 /* Multiple line comments go here. */
 // Single line comments go here
 Some people prefer braces {}, for comments
Algorithm Design: Practice
 Example 1: Determining even/odd number
 A number divisible by 2 is considered an even number, while a
number which is not divisible by 2 is considered an odd number.
Write pseudo-code to display first N odd/even numbers.

 Example 2: Computing Weekly Wages


 Gross pay depends on the pay rate and the number of hours
worked per week. However, if you work more than 40 hours, you
get paid time-and-a-half for all hours worked over 40. Write the
pseudo-code to compute gross pay given pay rate and hours
worked
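For Example 2, one possible C rendering of the wage rule is sketched below. The function and parameter names are assumptions, and the 40-hour limit and 1.5 multiplier come directly from the problem statement.

```c
/* Gross pay with time-and-a-half for every hour worked beyond 40. */
double gross_pay(double pay_rate, double hours)
{
    const double regular_limit = 40.0;

    if (hours <= regular_limit)
        return pay_rate * hours;

    /* the first 40 hours at the normal rate, the rest at 1.5 times the rate */
    return pay_rate * regular_limit
         + 1.5 * pay_rate * (hours - regular_limit);
}
```

For instance, 45 hours at a rate of 10 gives 400 for the regular hours plus 75 for the five overtime hours.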
Even/Odd Numbers
Input range
for num←0; num<=range; num←num+1 do
    if num % 2 = 0 then
        print num is even
    else
        print num is odd
    endif
endfor
Homework
1. Write an algorithm to find the largest of a set of numbers.
2. Write an algorithm in pseudo-code that finds the average of n numbers.
For example, the numbers are [4, 5, 14, 20, 3, 6].
Definition
• A data structure is a representation of the logical relationship existing between individual elements of data.
• In other words, a data structure is a way of organizing all data items that considers not only the elements stored but also their relationship to each other.
Introduction
• A data structure affects the design of both the structural and functional aspects of a program.
Program = Algorithm + Data Structure
• An algorithm is a step-by-step procedure to solve a particular problem.
Introduction
• That means an algorithm is a set of instructions written to carry out certain tasks, and the data structure is the way of organizing the data with their logical relationships retained.
• To develop a program from an algorithm, we should select an appropriate data structure for that algorithm.
• Therefore, an algorithm and its associated data structures form a program.
Classification of Data Structure
• Data structures are normally divided into two broad categories:
o Primitive Data Structure
o Non-Primitive Data Structure
Classification of Data Structure
Data structure
o Primitive DS: Integer, Float, Character, Pointer
o Non-Primitive DS
Classification of Data Structure
Non-Primitive DS
o Linear List: Array, Linked List, Stack, Queue
o Non-Linear List: Graph, Trees


Primitive Data Structure
• These are basic structures that are directly operated upon by machine instructions.
• In general, they have different representations on different computers.
• Integers, floating-point numbers, character constants, string constants, pointers, etc. fall in this category.
Non-Primitive Data Structure
• These are more sophisticated data structures.
• They are derived from the primitive data structures.
• Non-primitive data structures emphasize the structuring of a group of homogeneous (same type) or heterogeneous (different type) data items.
Non-Primitive Data Structure
• Lists, stacks, queues, trees and graphs are examples of non-primitive data structures.
• The design of an efficient data structure must take into account the operations to be performed on it.
Non-Primitive Data Structure
• The most commonly used operations on data structures are broadly categorized into the following types:
o Create
o Selection
o Updating
o Searching
o Sorting
o Merging
o Destroy or Delete
Difference between them
• A primitive data structure is generally a basic structure that is usually built into the language, such as an integer or a float.
• A non-primitive data structure is built out of primitive data structures linked together in meaningful ways, such as a linked list, binary search tree, AVL tree, graph, etc.
Description of various Data Structures: Arrays
• An array is defined as a finite set of homogeneous elements, i.e. data items of the same type.
• This means an array can contain one type of data only: all integers, all floating-point numbers, or all characters.
Arrays
• A simple array declaration is as follows:
    int arr[10];
• Here int specifies the data type of the elements the array stores.
• “arr” is the name of the array, and the number specified inside the square brackets is the number of elements the array can store; this is also called the size or length of the array.
Arrays
• Following are some of the concepts to be remembered about arrays:
o An individual element of an array can be accessed by specifying the name of the array, followed by an index or subscript inside square brackets.
o The first element of the array has index zero [0]. This means the first and last elements are specified as arr[0] and arr[9] respectively.
Arrays
• The elements of an array are always stored in consecutive (contiguous) memory locations.
• The number of elements that can be stored in an array, that is, the size of the array or its length, is given by the following equation:
(upper bound - lower bound) + 1
Arrays
• For the above array it would be (9 - 0) + 1 = 10, where 0 is the lower bound of the array and 9 is the upper bound.
• An array can always be read or written through a loop.
• A one-dimensional array requires one loop for reading and another for writing.
Arrays
• For example: reading an array
    for (i = 0; i <= 9; i++)
        scanf("%d", &arr[i]);
• For example: writing an array
    for (i = 0; i <= 9; i++)
        printf("%d ", arr[i]);
Arrays
• Reading or writing a two-dimensional array requires two nested loops.
• Similarly, an array of N dimensions requires N nested loops.
• Some common operations performed on arrays are:
o Creation of an array
o Traversing an array
Arrays
o Insertion of a new element
o Deletion of a required element
o Modification of an element
o Merging of arrays
Lists
• A list (linear linked list) can be defined as a collection of a variable number of data items.
• Lists are the most commonly used non-primitive data structures.
• An element of a list must contain at least two fields: one for storing data or information and the other for storing the address of the next element.
• Because addresses are stored in pointers, the address field must be of pointer type.
Lists
• Technically, each such element is referred to as a node; therefore a list can be defined as a collection of nodes, as shown below:
[Figure: linear linked list. Head points to the first of three nodes, AAA, BBB and CCC; each node has an information field and a pointer field.]
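A node with an information field and a pointer field might be declared in C as sketched below. The names `node`, `push_front`, `list_length` and `free_list` are illustrative assumptions, and the information field is stored as a string pointer only to mirror the AAA, BBB, CCC labels.

```c
#include <stdlib.h>

struct node {
    const char  *info;   /* information field */
    struct node *next;   /* pointer field: address of the next node */
};

/* Inserts a new node at the front of the list.
   Returns the new head, or the unchanged head if allocation fails. */
struct node *push_front(struct node *head, const char *info)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->info = info;
    n->next = head;
    return n;
}

/* Walks the pointer fields until NULL and counts the nodes. */
int list_length(const struct node *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Releases every node in the list. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Pushing "CCC", then "BBB", then "AAA" onto an empty list (head set to NULL) yields the three-node list from the figure, with the last node's pointer field equal to NULL.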


Lists
 Types of linked lists:
 Single linked list
 Doubly linked list
 Single circular linked list
 Doubly circular linked list
Stack
• A stack is also an ordered collection of elements, like an array, but it has a special feature: deletion and insertion of elements can be done only at one end, called the top of the stack (TOP).
• Due to this property it is also called a last-in, first-out (LIFO) data structure.
Stack
• It can be thought of as a stack of plates placed on a table at a party: a guest always takes a fresh plate from the top, and new plates are placed onto the stack at the top.
• It is a non-primitive data structure.
• When an element is inserted into a stack or removed from it, its base remains fixed while the top of the stack changes.
Stack
• Insertion of an element into a stack is called PUSH, and deletion of an element from a stack is called POP.
• The figure below shows how these operations take place on a stack:
[Figure: STACK, with PUSH and POP both acting at the top]
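PUSH and POP can be sketched with an array-based stack in C. The capacity, the struct layout and the function names are assumptions for illustration; the point is that both operations touch only the top.

```c
#include <stdbool.h>

#define STACK_CAPACITY 100   /* illustrative fixed size */

struct stack {
    int items[STACK_CAPACITY];
    int top;                 /* index of the top element, -1 when empty */
};

void stack_init(struct stack *s)
{
    s->top = -1;
}

/* PUSH: returns false on overflow (stack already full). */
bool stack_push(struct stack *s, int value)
{
    if (s->top == STACK_CAPACITY - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* POP: returns false on underflow (stack empty). */
bool stack_pop(struct stack *s, int *value)
{
    if (s->top == -1)
        return false;
    *value = s->items[s->top--];
    return true;
}
```

Pushing 10, 20, 30 and then popping three times returns 30, 20, 10, which is the last-in, first-out order.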
Stack
• A stack can be implemented in two ways:
o Using arrays (static implementation)
o Using pointers (dynamic implementation)


Queue
• Queues are first-in, first-out (FIFO) data structures.
• In a queue, new elements are added at one end, called the REAR, and elements are always removed from the other end, called the FRONT.
• People standing in a railway reservation line are an example of a queue.
Queue
• Each new person joins at the end of the line, and people whose reservations are confirmed leave the line from the front.
• The figure below shows how the operations take place on a queue:
    10   20   30   40   50
    front               rear
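A hedged sketch of an array-based queue in C follows. It uses a circular buffer so that freed slots at the front can be reused; that design choice, the capacity and all names are assumptions rather than something stated on the slides.

```c
#include <stdbool.h>

#define QUEUE_CAPACITY 8     /* illustrative fixed size */

struct queue {
    int items[QUEUE_CAPACITY];
    int front;               /* index of the next element to remove */
    int count;               /* number of elements currently stored */
};

void queue_init(struct queue *q)
{
    q->front = 0;
    q->count = 0;
}

/* Adds at the rear; returns false if the queue is full. */
bool enqueue(struct queue *q, int value)
{
    if (q->count == QUEUE_CAPACITY)
        return false;
    int rear = (q->front + q->count) % QUEUE_CAPACITY;
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Removes from the front; returns false if the queue is empty. */
bool dequeue(struct queue *q, int *value)
{
    if (q->count == 0)
        return false;
    *value = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Enqueuing 10, 20, 30, 40, 50 and then dequeuing returns 10 first, matching the front of the figure.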
Queue
• A queue can be implemented in two ways:
o Using arrays (static implementation)
o Using pointers (dynamic implementation)


Trees
• A tree can be defined as a finite set of data items (nodes).
• A tree is a non-linear data structure in which data items are arranged hierarchically rather than in a linear sequence.
• A tree represents the hierarchical relationship between various elements.
Trees
• In trees:
o There is a special data item at the top of the hierarchy called the root of the tree.
o The remaining data items are partitioned into a number of mutually exclusive subsets, each of which is itself a tree, called a subtree.
o In data structures a tree is drawn growing downwards, unlike natural trees, which grow upwards.
Trees
• The tree structure organizes the data into branches, which relate the information.
          A  (root)
       B       C
      D E     F G
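One way to represent such a tree in C is with nodes that point to their children. The sketch below assumes a binary tree in which D and E are the children of B, and F and G are the children of C; the figure does not show the edges, so that shape is an assumption, as are the names used.

```c
#include <stddef.h>

struct tree_node {
    char data;
    struct tree_node *left;    /* root of the left subtree, or NULL */
    struct tree_node *right;   /* root of the right subtree, or NULL */
};

/* Number of nodes: the root plus the sizes of both subtrees. */
int tree_size(const struct tree_node *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_size(root->left) + tree_size(root->right);
}

/* Number of levels: 0 for an empty tree, 1 for a single node. */
int tree_height(const struct tree_node *root)
{
    if (root == NULL)
        return 0;
    int l = tree_height(root->left);
    int r = tree_height(root->right);
    return 1 + (l > r ? l : r);
}
```

Both functions are recursive because each subtree is itself a tree, exactly as in the definition.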
Graph
• A graph is a mathematical non-linear data structure capable of representing many kinds of physical structures.
• It has found application in geography, chemistry and the engineering sciences.
• Definition: A graph G(V, E) is a set of vertices V and a set of edges E.
Graph
• An edge connects a pair of vertices and may have a weight, such as a length, a cost or another measure associated with it.
• Vertices of the graph are shown as points or circles, and edges are drawn as arcs or line segments.
Graph
• Example of graph:
[Figure: [a] directed and [b] undirected graphs, and a weighted graph; vertices are labelled v1 to v5 and edge weights include 6, 8, 9, 10, 11 and 15]
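Directed, undirected and weighted graphs can all be stored in an adjacency matrix. The sketch below is one possible representation, not the one used in the figure; the vertex count, the convention that 0 means "no edge", and the function names are assumptions.

```c
#include <string.h>

#define MAX_VERTICES 5
#define NO_EDGE 0            /* assumption: real edge weights are non-zero */

struct graph {
    /* weight[u][v] is the weight of the edge from u to v, or NO_EDGE */
    int weight[MAX_VERTICES][MAX_VERTICES];
};

void graph_init(struct graph *g)
{
    memset(g->weight, 0, sizeof g->weight);
}

/* Directed edge: only the from-to entry is set. */
void add_directed_edge(struct graph *g, int from, int to, int w)
{
    g->weight[from][to] = w;
}

/* Undirected edge: stored in both directions. */
void add_undirected_edge(struct graph *g, int u, int v, int w)
{
    g->weight[u][v] = w;
    g->weight[v][u] = w;
}

/* Number of edges leaving vertex v. */
int out_degree(const struct graph *g, int v)
{
    int degree = 0;
    for (int j = 0; j < MAX_VERTICES; j++)
        if (g->weight[v][j] != NO_EDGE)
            degree++;
    return degree;
}
```

An unweighted graph fits the same structure by using weight 1 for every edge.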
Graph
 Types of Graphs:
 Directed graph

 Undirected graph

 Simple graph

 Weighted graph

 Connected graph

 Non-connected graph
Performance analysis
• Definition: [Big “oh”]
o f(n) = O(g(n)) iff there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
• Definition: [Omega]
o f(n) = Ω(g(n)) (read as “f of n is omega of g of n”) iff there exist positive constants c and n0 such that f(n) ≥ c·g(n) for all n ≥ n0.
• Definition: [Theta]
o f(n) = Θ(g(n)) (read as “f of n is theta of g of n”) iff there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
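As a worked illustration of the three definitions, take f(n) = 3n + 2. This function and the constants chosen are an example added here, not taken from the slides; other valid constants exist.

```latex
\begin{align*}
3n + 2 &\le 4n \quad \text{for all } n \ge 2
  && \Rightarrow\; f(n) = O(n) \text{ with } c = 4,\; n_0 = 2 \\
3n + 2 &\ge 3n \quad \text{for all } n \ge 1
  && \Rightarrow\; f(n) = \Omega(n) \text{ with } c = 3,\; n_0 = 1 \\
3n \le 3n + 2 &\le 4n \quad \text{for all } n \ge 2
  && \Rightarrow\; f(n) = \Theta(n) \text{ with } c_1 = 3,\; c_2 = 4,\; n_0 = 2
\end{align*}
```

The first inequality holds because 3n + 2 ≤ 4n is equivalent to n ≥ 2.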
Big-O Notation (O-notation)
• Big-O notation represents an upper bound on the running time of an algorithm. It is therefore most often used to state the worst-case complexity of an algorithm.
f(n) = O(g(n)) iff there exist positive constants c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
• In other words, f(n) belongs to the set O(g(n)) if there exists a positive constant c such that f(n) lies between 0 and c·g(n) for sufficiently large n.
• For every n ≥ n0, the running time of the algorithm does not exceed c·g(n).
• Since it bounds the worst-case running time of an algorithm, it is widely used in analysis, as we are usually most interested in the worst-case scenario.
Omega Notation (Ω-notation)
• Omega notation represents a lower bound on the running time of an algorithm. It is therefore often used to state the best-case complexity of an algorithm.
f(n) = Ω(g(n)) (read as “f of n is omega of g of n”) iff there exist positive constants c and n0 such that f(n) ≥ c·g(n) for all n ≥ n0.
• In other words, f(n) belongs to the set Ω(g(n)) if there exists a positive constant c such that f(n) lies on or above c·g(n) for sufficiently large n.
• For every n ≥ n0, the algorithm requires at least c·g(n) time.
Theta Notation (Θ-notation)
• Theta notation bounds the running time of an algorithm from both above and below. It is often used when analyzing the average-case complexity of an algorithm.
f(n) = Θ(g(n)) (read as “f of n is theta of g of n”) iff there exist positive constants c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.
• In other words, f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and c2 such that f(n) can be sandwiched between c1·g(n) and c2·g(n) for sufficiently large n.
• If f(n) lies between c1·g(n) and c2·g(n) for all n ≥ n0, then g(n) is said to be an asymptotically tight bound for f(n).
Steps to calculate O for a program:
• Break the program into smaller segments.
• Find the number of operations performed in each segment (in terms of the input size), assuming the input is such that the program takes the maximum time, i.e. the worst-case scenario.
• Add up all the operations and simplify; call the result f(n).
• Remove all the constants and keep the term with the highest order, because as n tends to infinity the constants and lower-order terms in f(n) become insignificant. If the remaining function is g(n), the big-O notation is O(g(n)).
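The steps above can be traced on a small hypothetical function. The code below is a sketch written for this purpose: it has three segments and simply counts one operation per executed statement of interest, so the value it returns is f(n) itself.

```c
/* Returns the number of counted operations for input size n. */
long count_operations(int n)
{
    long ops = 0;

    ops++;                           /* segment 1: constant setup, 1 operation */

    for (int i = 0; i < n; i++)      /* segment 2: single loop, n operations */
        ops++;

    for (int i = 0; i < n; i++)      /* segment 3: nested loops, n * n operations */
        for (int j = 0; j < n; j++)
            ops++;

    return ops;                      /* f(n) = n^2 + n + 1 */
}
```

Adding the segments gives f(n) = n² + n + 1; dropping the constant and the lower-order term n leaves g(n) = n², so the function is O(n²).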
Steps to calculate Ω for a program
• Break the program into smaller segments.
• Find the number of operations performed in each segment (in terms of the input size), assuming the input is such that the program takes the least amount of time.
• Add up all the operations and simplify; call the result f(n).
• Remove all the constants and choose the term having the least order, or any other function that is always less than f(n) as n tends to infinity; call it g(n). Then Omega (Ω) of f(n) is Ω(g(n)). Any such g(n) is a valid lower bound, but the highest-order term of f(n) gives the tightest one.
Steps to calculate Θ for a program
• Break the program into smaller segments.
• Find all types of inputs and calculate the number of operations each takes to execute.
• Make sure that the input cases are equally distributed.
• Find the sum of all the calculated values and divide it by the total number of inputs. If the function of n obtained after removing all the constants is g(n), then in Θ notation it is represented as Θ(g(n)).
