UNIT-1-Data Structures

Uploaded by

Hiragar Harshil
Copyright
© © All Rights Reserved

UNIT-1: INTRODUCTION TO DATA STRUCTURE

Data Management concepts


 A program is said to be efficient when it executes in minimum time and with minimum
memory space.
 In order to write efficient programs, we need to apply certain data management concepts.
 The concept of data management includes activities like
o Data collection,
o Organization of data into appropriate structures, and
o Developing and maintaining routines for quality assurance.
Data Structure:
 A data structure is basically a group of data elements that are put together under one name,
and which defines a particular way of storing and organizing data in a computer so
that it can be used efficiently.
 Data structures are used in almost every program or software system.
 Some common examples of data structures are arrays, linked lists, queues, stacks, binary
trees, and hash tables.
 Data structures are widely applied in the following areas:
o Compiler design
o Operating system
o Statistical analysis
o Numerical analysis
o Artificial intelligence
o Graphics

Terminologies related to Data Structures


Data:
 The term data means a value or set of values.
 It specifies either the value of a variable or a constant
(e.g., marks of students, name of an employee, address of a customer, value of pi, etc.).
 Data can be a number, symbol, character or any kind of information.
 Data is a collection of facts or values in raw, unprocessed form.
Information:
 When data is processed it becomes information.
 The logical and meaningful form of data is called information.


Record
 A record is a collection of data items.
 For example, the name, address, course, and marks obtained are individual data items. But
all these data items can be grouped together to form a record.
File
 A file is a collection of related records.
 For example, if there are 60 students in a class, then there are 60 records of the students.
All these related records are stored in a file.
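
The record and file ideas above can be sketched in C as a structure and an array of structures. This is only an illustrative sketch: the names Student, class_file, copy_field and set_student, the field sizes, and the choice to keep the "file" in memory rather than on disk are all assumptions, not part of the original notes.

```c
#include <string.h>

#define CLASS_SIZE 60            /* class strength taken from the example above */

/* One record: related data items about a single student (hypothetical layout). */
struct Student {
    char name[50];
    char address[100];
    char course[30];
    int  marks;
};

/* A "file" modelled in memory as an array of related records. */
struct Student class_file[CLASS_SIZE];

/* Copy a string into a fixed-size field, always leaving it terminated. */
static void copy_field(char *dst, size_t size, const char *src)
{
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}

/* Fill record number idx (0-based). Returns 1 on success, 0 if idx is invalid. */
int set_student(int idx, const char *name, const char *address,
                const char *course, int marks)
{
    if (idx < 0 || idx >= CLASS_SIZE)
        return 0;
    struct Student *s = &class_file[idx];
    copy_field(s->name, sizeof s->name, name);
    copy_field(s->address, sizeof s->address, address);
    copy_field(s->course, sizeof s->course, course);
    s->marks = marks;
    return 1;
}
```

Each call to set_student fills one record, and the whole array plays the role of the file of 60 student records.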

Data types primitive and non-primitive


 The data type of a variable determines the kind of data the variable can hold. The C language
supports two kinds of data types.
1. Primitive Data Types: Primitive data types are the most basic data types that are used
for representing simple values such as integers, float, characters, etc.
2. Derived Data Types: The data types that are derived from the primitive or built-in data
types are referred to as derived data types such as array, structure, union and pointer.

Types of Data Structures


 Data structures are generally categorized into two classes: primitive and non-primitive data
structures.

[Figure 1.1: Classification of Data Structures]


Primitive and Non-primitive Data Structures


 Primitive data structures are the fundamental data types which are supported by a
programming language.
 Some basic data types are integer, real, character, and boolean.
 Non-primitive data structures are those data structures which are created using primitive
data structures.
 Examples of such data structures include linked lists, stacks, trees, and graphs.
 Non-primitive data structures can further be classified into two categories: linear and non-
linear data structures.

Linear and Non-linear Structures


 If the elements of a data structure are stored in a linear or sequential order, then it is a linear
data structure.
 Examples include arrays, linked lists, stacks, and queues.
 If the elements of a data structure are not stored in a sequential order, then it is a non-linear
data structure.
 The relationship of adjacency is not maintained between elements of a non-linear data
structure.
 Examples include trees and graphs.
We will now introduce all these data structures in brief.

Arrays
 An array is a collection of similar data elements. These data elements have the same data
type.
 The elements of the array are stored in consecutive memory locations and are referenced
by an index (also known as the subscript).
 In C, arrays are declared using the following syntax:
type name[size];

For example,
int marks[10];

 The above statement declares an array marks that contains 10 elements.


 In C, the array index starts from zero.
 This means that the 10 elements of the array marks are indexed from 0 to 9.
 The first element will be stored in marks[0], second element in marks[1], so on and so forth.
Therefore, the last element, that is the 10th element, will be stored in marks[9].
 In the memory, the array will be stored as shown in Fig. 1.2.

[Fig. 1.2: Memory representation of an array of 10 elements]


 Arrays are generally used when we want to store a large amount of data of the same type.
 But they have the following limitations:
o Arrays are of fixed size.
o Data elements are stored in contiguous memory locations which may not be always
available.
o Insertion and deletion of elements can be problematic because of shifting of elements
from their positions.
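
The shifting cost mentioned in the last limitation can be seen in a small sketch of insertion into an array. The function name insert_at and its parameters are hypothetical, chosen only for illustration.

```c
/* Insert value at index pos in arr, which currently holds *n elements and
 * has room for capacity elements. Returns 1 on success, 0 on failure. */
int insert_at(int arr[], int *n, int capacity, int pos, int value)
{
    if (*n >= capacity || pos < 0 || pos > *n)
        return 0;                      /* array is full or position is invalid */
    for (int i = *n; i > pos; i--)     /* shift later elements one place right */
        arr[i] = arr[i - 1];
    arr[pos] = value;
    (*n)++;
    return 1;
}
```

Inserting at index 0 of an array that already holds n elements moves all n of them, which is why insertion into a large array can be expensive.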

Linked Lists
 A linked list is a very flexible, dynamic data structure in which elements (called nodes)
form a sequential list.
 In a linked list, each node is allocated space as it is added to the list.
 Every node in the list points to the next node in the list.
 Therefore, in a linked list, every node contains the following two types of data:
 The value of the node (data)
 A pointer or link to the next node in the list (address)
 The last node in the list contains a NULL pointer to indicate that it is the end or tail of the
list.
 Since the memory for a node is dynamically allocated when it is added to the list, the total
number of nodes that may be added to a list is limited only by the amount of memory
available.
 Figure 1.3 shows a linked list of seven nodes.


[Figure 1.3: Simple linked list]

Advantage: Easier to insert or delete data elements


Disadvantage: Slow search operation and requires more memory space
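
A minimal sketch of a singly linked list in C, assuming integer data. The names Node, insert_front, list_length and free_list are illustrative and do not come from the notes.

```c
#include <stdlib.h>

/* Each node holds a value and a link to the next node. */
struct Node {
    int data;
    struct Node *next;
};

/* Allocate a node and place it at the front of the list.
 * Returns the new head, or the old head if allocation fails. */
struct Node *insert_front(struct Node *head, int value)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->data = value;
    n->next = head;
    return n;
}

/* Count nodes by following the links until the NULL pointer at the tail. */
int list_length(const struct Node *head)
{
    int count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node of the list. */
void free_list(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Inserting at the front changes only two pointers and shifts nothing, whereas finding a value means walking the links one node at a time, which matches the advantage and disadvantage listed above.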

Stacks
 A stack is a linear data structure in which insertion and deletion of elements are done at
only one end, which is known as the top of the stack.
 Stack is called a last-in, first-out (LIFO) structure because the last element which is added
to the stack is the first element which is deleted from the stack.
 In the computer’s memory, stacks can be implemented using arrays or linked lists. Figure
1.4 shows the array implementation of a stack.
 Every stack has a variable top associated with it.
 top stores the position of the topmost element of the stack (its index in an array implementation, or its address in a linked-list implementation).
 It is this position from where the element will be added or deleted.
 There is another variable MAX, which is used to store the maximum number of elements
that the stack can store.

 If top = –1 (or top = NULL in a linked-list implementation), then the stack is empty, and
 If top = MAX–1, then the stack is full.

[Figure 1.4: Array representation of a stack]

 In Fig.1.4, top = 4, so insertions and deletions will be done at this position.


 Here, the stack can store a maximum of 10 elements where the indices range from 0–9.
 In the above stack, five more elements can still be stored.
 A stack supports three basic operations: push, pop, and peep (also commonly called peek).


 The push operation adds an element to the top of the stack. The pop operation removes the
element from the top of the stack. And the peep operation returns the value of the topmost
element of the stack (without deleting it).
 However, before inserting an element in the stack, we must check for overflow conditions.
 An overflow occurs when we try to insert an element into a stack that is already full.
 Similarly, before deleting an element from the stack, we must check for underflow
conditions.
 An underflow condition occurs when we try to delete an element from a stack that is already
empty.
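
The three operations and the overflow and underflow checks can be sketched with an array as follows. This is one possible arrangement, not the only one: it represents the empty stack with top = -1, since an array index cannot be NULL, and the return-code convention (1 for success, 0 for failure) is an assumption made for this example.

```c
#define MAX 10          /* maximum number of elements the stack can store */

int stack[MAX];
int top = -1;           /* -1 means the stack is empty in this sketch */

/* Add value on top. Returns 0 on overflow (stack already full). */
int push(int value)
{
    if (top == MAX - 1)
        return 0;
    stack[++top] = value;
    return 1;
}

/* Remove the topmost element into *value. Returns 0 on underflow. */
int pop(int *value)
{
    if (top == -1)
        return 0;
    *value = stack[top--];
    return 1;
}

/* Copy the topmost element into *value without removing it. */
int peep(int *value)
{
    if (top == -1)
        return 0;
    *value = stack[top];
    return 1;
}
```

After five successful pushes top equals 4, as in the figure described above, and five more elements can still be pushed before push reports overflow.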

Queues
 A queue is a first-in, first-out (FIFO) data structure in which the element that is inserted
first is the first one to be taken out.
 The elements in a queue are added at one end called the rear and removed from the other
end called the front.
 Like stacks, queues can be implemented by using either arrays or linked lists.
 Every queue has front and rear variables that point to the position from where deletions and
insertions can be done, respectively.
 Consider the queue shown in Fig. 1.5

[Figure 1.5: Array representation of a queue]


 Here, front = 0 and rear = 5.
 If we want to add one more value to the list, say, if we want to add another element with
the value 45, then the rear would be incremented by 1 and the value would be stored at the
position pointed by the rear.
 The queue, after the addition, would be as shown in Fig. 1.6. Here, front = 0 and rear = 6.

[Figure 1.6: Queue after insertion of a new element]


 Every time a new element is to be added, we will repeat the same procedure.
 Now, if we want to delete an element from the queue, then the value of front will be
incremented.
 Deletions are done only from this end of the queue. The queue after the deletion will be as
shown in Fig.1.7.

[Figure 1.7: Queue after deletion of an element]

 However, before inserting an element in the queue, we must check for overflow conditions.
 An overflow occurs when we try to insert an element into a queue that is already full.

A queue is full when rear = MAX–1, where MAX is the size of the queue, that is, the maximum
number of elements the queue can hold.
Note that we write MAX–1 because the index starts from 0.
 Similarly, before deleting an element from the queue, we must check for underflow
conditions.
 An underflow condition occurs when we try to delete an element from a queue that is
already empty.
If front = –1 and rear = –1 (or front = NULL and rear = NULL in a linked-list implementation), then there is no element in the queue.
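
A minimal array-based sketch of these queue operations follows. It marks the empty queue with front = rear = -1 because array indices start at 0, and the names enqueue and dequeue with their return codes are assumptions made for illustration. It is a simple linear queue, so slots freed by deletions are not reused until the queue becomes empty.

```c
#define MAX 10              /* maximum number of elements in the queue */

int queue[MAX];
int front = -1, rear = -1;  /* -1 means the queue is empty in this sketch */

/* Insert value at the rear. Returns 0 on overflow (rear == MAX-1). */
int enqueue(int value)
{
    if (rear == MAX - 1)
        return 0;
    if (front == -1)        /* first element: front now points to it */
        front = 0;
    queue[++rear] = value;
    return 1;
}

/* Delete the element at the front into *value. Returns 0 on underflow. */
int dequeue(int *value)
{
    if (front == -1)
        return 0;
    *value = queue[front];
    if (front == rear)      /* last element removed: reset to empty */
        front = rear = -1;
    else
        front++;
    return 1;
}
```

Inserting increments rear and deleting increments front, mirroring the steps described for Figures 1.5 to 1.7.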

Trees
 A tree is a non-linear data structure which consists of a collection of nodes arranged in a
hierarchical order. One of the nodes is designated as the root node, and the remaining nodes
can be partitioned into disjoint sets such that each set is a sub-tree of the root.
 The simplest form of a tree is a binary tree.
 A binary tree consists of a root node and left and right sub-trees, where both sub-trees are
also binary trees.
 Each node contains a data element, a left pointer which points to the left sub-tree, and a
right pointer which points to the right sub-tree.
 The root element is the topmost node, which is pointed to by a ‘root’ pointer.


 If root = NULL then the tree is empty.


 Figure 1.8 shows a binary tree, where R is the root node and T1 and T2 are the left and
right sub-trees of R.
 If T1 is non-empty, then T1 is said to be the left successor of R.
 Likewise, if T2 is non-empty, then it is called the right successor of R.
 In Fig. 1.8, node 2 is the left child and node 3 is the right child of the root node 1.
 Note that the left sub-tree of the root node consists of the nodes 2, 4, 5, 8, and 9.
 Similarly, the right sub-tree of the root node consists of the nodes 3, 6, 7, 10, 11, and 12.

[Figure 1.8: Binary tree]


Advantage: Provides quick search, insert, and delete operations
Disadvantage: Complicated deletion algorithm
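
The node layout described above (data, left pointer, right pointer) can be sketched in C as follows. The names TreeNode, new_node and count_nodes are hypothetical, and counting nodes is just one simple operation chosen to show how the recursive structure is used.

```c
#include <stdlib.h>

/* A binary tree node: a data element plus links to the two sub-trees. */
struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Create a node with no children. Returns NULL if allocation fails. */
struct TreeNode *new_node(int data)
{
    struct TreeNode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->left = n->right = NULL;
    }
    return n;
}

/* Count the nodes of the tree rooted at root; an empty tree (NULL) has 0. */
int count_nodes(const struct TreeNode *root)
{
    if (root == NULL)
        return 0;
    return 1 + count_nodes(root->left) + count_nodes(root->right);
}
```

Because both sub-trees are themselves binary trees, the same function is applied to the left and right children, which is the usual pattern for tree algorithms.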

Graphs
 A graph is a non-linear data structure which is a collection of vertices (also called nodes)
and edges that connect these vertices.
 In a tree structure, nodes can have any number of children but only one parent. A graph, on
the other hand, relaxes all such restrictions.
 Figure 1.9 shows a graph with five nodes.

[Figure 1.9: Graph]


 A node in the graph may represent a city and the edges connecting the nodes can represent
roads.
 A graph can also be used to represent a computer network where the nodes are workstations
and the edges are the network connections.
 Note that unlike trees, graphs do not have any root node. Rather, every node in the graph
can be connected with any other node in the graph.
 When two nodes are connected via an edge, the two nodes are known as neighbours.
 For example, in Fig. 1.9, node A has two neighbours: B and D.

Advantage: Best models real-world situations


Disadvantage: Some algorithms are slow and very complex
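
One common way to store a small graph is an adjacency matrix. The sketch below assumes five vertices A to E numbered 0 to 4 and an undirected graph; the names adj, add_edge and degree are illustrative, and only the edges A-B and A-D are taken from the text, since the remaining edges of Fig. 1.9 are not reproduced here.

```c
#define N 5                 /* vertices A, B, C, D, E are numbered 0..4 */

int adj[N][N];              /* adj[u][v] == 1 means u and v are neighbours */

/* Connect u and v with an undirected edge. */
void add_edge(int u, int v)
{
    adj[u][v] = 1;
    adj[v][u] = 1;
}

/* Number of neighbours of vertex u. */
int degree(int u)
{
    int count = 0;
    for (int v = 0; v < N; v++)
        count += adj[u][v];
    return count;
}
```

After add_edge(0, 1) and add_edge(0, 3), degree(0) is 2, matching the statement that node A has two neighbours, B and D.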

What is Data Type? Differentiate between Data Type and Data Structure.
 A data type is a classification of data which tells the compiler or interpreter how the
programmer intends to use the data. Most programming languages support various types of
data, including integer, real, character or string, and Boolean.
Data Type | Data Structure
A data type is the kind or form of a variable used throughout the program. It specifies that the variable can be assigned only values of the given data type. | A data structure is a collection of different kinds of data. The entire collection can be represented as one object and used throughout the program.
Implementation through data types is a form of abstract implementation. | Implementation through data structures is called concrete implementation.
Can hold values and not data, so it is dataless. | Can hold different kinds and types of data within a single object.
Values can be assigned directly to variables of the data type. | Data is assigned to the data structure object using a set of algorithms and operations such as push and pop.
No problem of time complexity. | Time complexity comes into play when working with data structures.
Examples: int, float, double | Examples: stacks, queues, trees


Abstract Data Type


 An abstract data type (ADT) is the way we look at a data structure, focusing on what it
does and ignoring how it does its job.
 For example, stacks and queues are perfect examples of an ADT.
 We can implement both these ADTs using an array or a linked list. This demonstrates the
‘abstract’ nature of stacks and queues.
 It is a technique of hiding the internal details from the user and only showing the necessary
details to the user.
 It provides a blueprint of a data structure:
ADT = Type + Function Names + Behaviour of each function.
 To further understand the meaning of an abstract data type, we will break the term into
‘data type’ and ‘abstract’, and then discuss their meanings.

Data type
 Data type of a variable is the set of values that the variable can take.
 Basic data types in C include int, char, float, and double.
 When we talk about a primitive type (built-in data type), we actually consider two things:
1. A data item with certain characteristics and
2. The permissible operations on that data.
For example, an int variable can contain any whole-number value from –32768 to 32767
(assuming a 2-byte int) and can be operated on with the operators +, –, *, and /.
 Therefore, when we declare a variable of an abstract data type (e.g., stack or a queue),
we also need to specify the operations that can be performed on it.

Abstract
 The word ‘abstract’ means considered apart from the detailed specifications or
implementation.
 The end-user is not concerned about the details of how the methods carry out their tasks.
They are only aware of the methods that are available to them and are only concerned about
calling those methods and getting the results. They are not concerned about how they work.
 For example, when we use a stack or a queue, the user is concerned only with the type of
data and the operations that can be performed on it. They should just know that to work
with stacks, they have push() and pop() functions available to them.


Example of ADT
Example-1:
AbstractDataType Stack
{
Instances: A stack is a collection of elements in which insertion and deletion of elements
are done at one end, called the top.
Operations
1. Push(): By this operation one can push elements onto the stack. Before performing
push we should check whether stack is full or not.
2. Pop(): By this operation one can remove elements from the stack. Before
performing pop we should check whether stack is empty or not.
}

Example-2:
AbstractDataType queue
{
Instances: A queue is a collection of elements in which elements are inserted at one
end, called the rear, and deleted from the other end, called the front.
Operations
1. queue_full(): Checks whether queue is full or not.
2. queue_empty(): Checks whether queue is empty or not.
3. queue_insert(): Inserts the element in queue from rear end.
4. queue_delete(): Deletes the element from the queue by front end.
}
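
To illustrate the "abstract" part, here is a sketch of the Stack ADT implemented with a linked list instead of an array. The type name Stack and the function names stack_init, stack_empty, stack_push and stack_pop are assumptions made for this example; the point is that code calling push and pop does not need to know which implementation lies underneath.

```c
#include <stdlib.h>

struct StackNode {
    int data;
    struct StackNode *next;
};

/* The ADT as seen by the user: a handle plus the operations below. */
typedef struct {
    struct StackNode *top;   /* NULL means the stack is empty */
} Stack;

void stack_init(Stack *s)        { s->top = NULL; }
int  stack_empty(const Stack *s) { return s->top == NULL; }

/* Push value on top. Returns 0 only if memory cannot be allocated. */
int stack_push(Stack *s, int value)
{
    struct StackNode *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = value;
    n->next = s->top;
    s->top = n;
    return 1;
}

/* Pop the top element into *value. Returns 0 on underflow. */
int stack_pop(Stack *s, int *value)
{
    if (stack_empty(s))
        return 0;
    struct StackNode *n = s->top;
    *value = n->data;
    s->top = n->next;
    free(n);
    return 1;
}
```

A linked implementation has no fixed MAX, so the "stack full" check of the ADT becomes a check for failed memory allocation, while the behaviour seen by the caller stays last-in, first-out.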

Advantages of ADT:
 Abstract data type in data structure makes it very easy for us to use the complex data
structures along with their complex functions.
 By using abstract data types, we can also customize any data structure depending on how
we plan to use that particular data structure.
 Abstract data type in data structure follows the concept of reusability of a code. This means
that we don't have to write a particular piece of code again and again. We can just create
an abstract data type and we can use it by simply calling the functions present in it.


Define Algorithm. Explain the Characteristics of Algorithm.


 The word algorithm means "a set of finite rules or instructions to be followed in
calculations or other problem-solving operations".

Characteristics:
1. Input: Each algorithm is supplied with zero or more external quantities.
 The expected inputs of an algorithm must be well-defined to ensure its correctness,
predictability, and repeatability.
 Well-defined inputs ensure that the algorithm's behaviour is deterministic, which means,
that the same input will always produce the same output.
 Unambiguous inputs help prevent incorrect implementations and misunderstanding of the
algorithm's requirements.
2. Output: Each algorithm must produce at least one quantity.
 The outputs of an algorithm should be well-defined to ensure that the algorithm produces
the intended and accurate result for a given set of inputs.
 It avoids ambiguity and guarantees that the algorithm solves the problem correctly.
 It is also easy to verify the correctness of the algorithm's implementation.
3. Each algorithm should have Definiteness.
 Ambiguity in the algorithm's description can lead to incorrect implementations and
unreliable results. That is why it is important for an algorithm to be unambiguous.
4. Each algorithm should have Finiteness.
 The algorithm should end after a finite amount of time, and it should have a limited number
of instructions.
 A finite algorithm ensures that it will eventually stop executing and produce a result.
 An infinite algorithm would never reach a conclusion, which is impractical in real-world
scenarios where computation cannot be performed infinitely.


5. Each algorithm should have Effectiveness.


 An algorithm should be feasible because feasibility indicates that it is practical and capable
of being executed within reasonable constraints and resources.
 It should not contain any redundant step which could make the algorithm ineffective.

Fundamental of Algorithm Analysis.


 Analyzing an algorithm means calculating/predicting the resources that the algorithm
requires.
 Two most important resources are computing time (time complexity) and storage space
(space complexity).
 By analyzing some of the candidate algorithms for a problem, the most efficient one can
be easily identified.
 The efficiency of algorithm can be specified using time complexity and space complexity.
a) Time Complexity: The amount of time taken by an algorithm to run. [It indicates
whether an algorithm is slow or fast.]
b) Space Complexity: The amount of space (memory) taken by an algorithm. [It indicates
whether an algorithm requires more or less space.]

Calculate Space Complexity


Example 1:
int sum(int x, int y, int z)
{
    int t = x + y + z;
    return t;
}
Variable | Size (if int = 2 bytes)
x | 2
y | 2
z | 2
t | 2
temp (variable to store the returned value) | 2
Total | 10 bytes

The function requires a constant amount of space: O(1).


Example 2 :
int sum(int a[], int n)
{
    int r = 0;
    for (int i = 0; i < n; i++)
    {
        r += a[i];
    }
    return r;
}
Variable | Size (if int = 2 bytes)
n | 2
a[] | 2n
r | 2
i | 2
temp | 2
Total | 8 + 2n bytes

Counting the n elements of the array a, the space required grows linearly with n: O(n).

Calculate Time Complexity

Frequency Count: The frequency count denotes how many times a particular
statement is executed (used for calculating time complexity).

Case-1 : Single Loop

for (int i = 0; i < n; i++)   // loop body executes n times
{
    x = y + z;                // takes constant time c
}
T(n) = c*n
Hence, T(n) = O(n).


Case-2 : Nested Loop


for (int i = 0; i < n; i++)       // outer loop body executes n times
{
    for (int j = 0; j < n; j++)   // inner loop body executes n times per outer iteration
    {
        x = y + z;                // takes constant time c
    }
}
T(n) = c*n^2, hence T(n) = O(n^2).
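
The frequency counts for Case-1 and Case-2 can be checked empirically by counting how often the loop body runs. The function names single_loop_count and nested_loop_count are hypothetical, and both loops use the condition i < n so that the body runs exactly n times per loop.

```c
/* Number of times the body of a single loop over 0..n-1 executes. */
long single_loop_count(int n)
{
    long count = 0;
    for (int i = 0; i < n; i++)
        count++;                    /* stands in for the constant-time statement */
    return count;
}

/* Number of times the body of two nested loops over 0..n-1 executes. */
long nested_loop_count(int n)
{
    long count = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            count++;
    return count;
}
```

For n = 10 the single loop body runs 10 times and the nested body runs 100 times, in line with T(n) = c*n and T(n) = c*n^2. With the condition i <= n instead, each loop would run n + 1 times, which changes the count but not the order of growth.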

Case-3 : Sequential Statements


Sequencing means putting one instruction after another.
Example:

i = 10;
printf("%d", i);
i = 10 + 5;

Here each instruction executes only once. So frequency count = 1+1+1 = 3 which is
CONSTANT.

Hence, T(n)=O(1).

What are the Worst Case, Best Case and Average Case with respect to Algorithm Efficiency?
(Measurement of Complexity of an Algorithm)
Worst Case :
In the worst-case analysis, we calculate the upper bound (Maximum Running Time) on the
running time of an algorithm. We must know the case that causes a maximum number of
operations to be executed.
Best Case:
In the best-case analysis, we calculate the lower bound(Minimum Running Time) on the
running time of an algorithm. We must know the case that causes a minimum number of
operations to be executed.


Average Case:
In average case analysis, we take all possible inputs and calculate the computing time for all of
the inputs. Sum all the calculated values and divide the sum by the total number of inputs.

Linear search is a method for finding a particular value from the given list.
• The algorithm checks each element, one at a time and in sequence, until the desired element
is found.
• Linear search is the simplest search algorithm.
 Given an Array a[5] :

2 9 3 1 8

Search 1 in Given Array


 Comparing value of ith index with the given element one by one, until we get the required
element or end of the array

The required element in the given array can be found at,


1. At the first position
Best Case: the minimum number of comparisons (one) is required
2. Anywhere after the first position
Average Case: an average number of comparisons is required
3. At the last position, or not found at all
Worst Case: the maximum number of comparisons (n) is required
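
The three cases can be observed with a small sketch of linear search that also reports how many comparisons it made. The function name linear_search and the comparisons out-parameter are assumptions added for illustration.

```c
/* Search key in a[0..n-1]. Returns its index, or -1 if it is not present.
 * The number of element comparisons made is stored in *comparisons. */
int linear_search(const int a[], int n, int key, int *comparisons)
{
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;
    }
    return -1;
}
```

For the array 2 9 3 1 8, searching for 2 takes one comparison (best case), searching for 1 takes four comparisons and returns index 3, and searching for 8 or for a value that is absent takes all five comparisons (worst case).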


What are Asymptotic Notations? Explain types of Asymptotic Notations.


 Asymptotic Notation is used to describe the running time of an algorithm - how much time
an algorithm takes with a given input, n.
 There are three different notations: big O, big Theta (Θ), and big Omega (Ω).
 Asymptotic notation is a shorthand way to represent the time complexity.
 This is also known as an algorithm’s growth rate.
 Asymptotic Notations are used,
1. To characterize the complexity of an algorithm.
2. To compare the performance of two or more algorithms solving the same problem.
1. Big ‘O’ Notation :
The Big-oh notation is denoted by “O”. It is a method of representing the upper bound of an
algorithm’s running time.
The function f(n) = O(g(n)) [read as "f of n is big-oh of g of n"] if and only if there exist a positive
constant c and a natural number n0 such that
f(n) ≤ c*g(n) for all n ≥ n0


2. Big Omega Notation :


 The Omega notation is denoted by “Ω”. It is a method of representing the lower bound of an
algorithm’s running time.
 The function f(n) = Ω(g(n)), if there is a constant c > 0 and a natural number n0 such that
c*g(n) ≤ f(n) for all n ≥ n0

3. Big Theta Notation :


 The Theta notation is denoted by “Θ”. By this method the running time is bounded between an upper
bound and a lower bound.
 The function f(n) = Θ(g(n)), if there are constants c1, c2 > 0 and a natural number n0 such
that
c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0
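
As a worked illustration of the three definitions, take the function f(n) = 3n + 2 with g(n) = n. The function and the constants below are chosen only for this example; other valid choices of c and n0 exist.

```latex
\begin{align*}
3n + 2 &\le 4n          && \text{for all } n \ge 2 && \Rightarrow\ f(n) = O(n)      && (c = 4,\ n_0 = 2) \\
3n     &\le 3n + 2      && \text{for all } n \ge 1 && \Rightarrow\ f(n) = \Omega(n) && (c = 3,\ n_0 = 1) \\
3n     &\le 3n + 2 \le 4n && \text{for all } n \ge 2 && \Rightarrow\ f(n) = \Theta(n) && (c_1 = 3,\ c_2 = 4,\ n_0 = 2)
\end{align*}
```

The first inequality holds because 3n + 2 ≤ 4n is equivalent to n ≥ 2. Since f(n) has both an upper and a lower bound of the same order, it is Θ(n).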
