DATA STRUCTURES

RAJESHWARI
ASST.PROF.
DEPT.OF AIML
DEFINITION
• A data structure is a representation of the logical relationship existing between
individual elements of data.
• A data structure is a way of organizing all data items that considers not only the
elements stored but also their relationship to each other.
• Data is a representation of facts or concepts in an organized manner.
• Data may be stored, communicated, interpreted, or processed.
• In short, data is a piece of information or simply a set of values, and
data as such may not convey any meaning.
• For example, numbers, names and marks are data. Consider the two data
items “Rama” and 10. On their own, “Rama” and 10 do not convey any meaning to
the reader.
• Information is defined as a collection of data from which conclusions may be
drawn. So, information is a subset of data which, when interpreted, conveys meaning
that people can understand. Hence, information can be considered a
message that is received and understood.
• Ex 1: The output of the computer may be “Rama scored 10 marks”. This sentence
is information because it conveys meaning to the reader.
INTRODUCTION
• A data structure affects the design of both the structural and
functional aspects of a program.
Program = Algorithm + Data Structure
• An algorithm is a step-by-step procedure to solve
a particular problem.
NOW, LET US SEE “HOW COMPUTERS
REPRESENT DATA?”
 The data are represented by the state of electronic switches.
 A switch with ON state represents 1 and a switch with OFF state represents
0 (zero).
• Thus, all digital computers use the binary number system to represent data.
So, the data that we input to the computer is converted into 0’s and 1’s.
• Letters, digits, punctuation marks, sound and even pictures are
represented using 1’s and 0’s.
NOW, LET US SEE “HOW CAN DATA REPRESENTED USING 1’S AND 0’S
BE GROUPED OR MEASURED?”
THE DATA REPRESENTED CAN BE GROUPED OR MEASURED USING THE
FOLLOWING UNITS:
“DEFINE DATA STRUCTURES”
• Definition: The study of how data is collected and
stored in memory, how efficiently the data is
organized in memory, how efficiently the data can be
retrieved and manipulated, and the possible ways in
which different data items are logically related is called
data structures.
CLASSIFICATION OF DATA
STRUCTURE
• Data structures are normally divided into two broad
categories:
• Primitive data structures: Integer, Float, Character, Pointer
• Non-primitive data structures:
  • Linear lists: Array, Stack, Queue, Linked list
  • Non-linear lists: Trees, Graph
PRIMITIVE DATA STRUCTURE
• These are basic structures that are directly operated upon by machine
instructions.
• In general, they have different representations on different computers.
• Integer: An integer is a whole number without any decimal point or
fractional part.
• No extra characters are allowed other than the ‘+’ and ‘-’ signs. If ‘+’ or ‘-’ is
present, it should precede the number.
• Integers are stored in binary and are often written in hexadecimal. Negative
numbers are normally represented using 2’s complement.
• Based on the sign, integers are classified into:
  1. Unsigned integer
  2. Signed integer
• Floating-point number: Floating-point constants are base 10 numbers
with a fractional part, such as 10.5.
• Character constants: Character data is a primitive data structure. A wide
variety of character sets (also called alphabets) are handled by computers.
The two widely used character sets are:
  1. ASCII (American Standard Code for Information Interchange)
  2. EBCDIC (Extended Binary Coded Decimal Interchange Code)
• String: a sequence of characters
• Pointers: A pointer is a special variable which
contains the address of a memory location. Using
this pointer, the data can be accessed.
NON-PRIMITIVE DATA
STRUCTURE
• These are more sophisticated data structures.
• They are derived from the primitive data structures.
• Non-primitive data structures emphasize the structuring of a group of
homogeneous (same type) or heterogeneous (different type) data items.
• Data structures that cannot be manipulated directly by machine
instructions are called non-primitive data structures.
• Linear data structures: A data structure whose elements are
stored in a sequential or linear order is called a linear data structure.
• Non-linear data structures: A data structure whose elements are not
stored in a sequential or linear order is called a non-linear data structure.
• Lists, stacks, queues, trees and graphs are examples of non-primitive
data structures.
• The design of an efficient data structure must take into account the
operations to be performed on the data structure.
• The most commonly used operations on data structures are
broadly categorized into the following types:
• Create
• Selection
• Updating
• Searching
• Sorting
• Merging
• Destroy or Delete
DIFFERENCE BETWEEN THEM
• A primitive data structure is generally a basic structure that is
usually built into the language, such as an integer or a float.
• A non-primitive data structure is built out of primitive data
structures linked together in meaningful ways, such as a
linked list, binary search tree, AVL tree or graph.
NON-PRIMITIVE DS:
• Arrays: An array is a special and very
powerful data structure in the C language.
• An array is a collection of similar data items.
All elements of the array share a common
name.
• Each element in the array can be accessed
by its subscript (or index).
• An array is used to store, process and print a large
amount of data using a single variable.


ARRAYS
• Following are some of the concepts to be remembered
about arrays:
• An individual element of an array can
be accessed by specifying the name of the
array, followed by the index or subscript
inside square brackets.
• The first element of the array has index
zero [0]. For an array of 10 elements, the first and
last elements are specified as arr[0]
and arr[9] respectively.
• For an array indexed from 0 to 4, the number of elements is
(4-0)+1 = 5, where 0 is the lower bound of the array and 4 is the
upper bound of the array.
• An array can always be read or written through a loop. A
one-dimensional array requires one loop for reading and another
for writing.
• Reading or writing a two-dimensional array
requires two nested loops. Similarly, an array of N dimensions
requires N nested loops.
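To make the "one loop per dimension" rule concrete, here is a minimal sketch in C. The name sum_matrix and the sizes ROWS and COLS are chosen only for this illustration; they do not come from the slides.

```c
#define ROWS 2   /* illustrative sizes, not from the slides */
#define COLS 3

/* Sum every element of a two-dimensional array: one loop per dimension. */
int sum_matrix(int m[ROWS][COLS])
{
    int total = 0;
    for (int i = 0; i < ROWS; i++) {        /* outer loop walks the rows */
        for (int j = 0; j < COLS; j++) {    /* inner loop walks the columns */
            total += m[i][j];
        }
    }
    return total;
}
```

A three-dimensional array would add a third nested loop in the same way.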
• The declaration of an array is as follows:
int arr[10];
• Here, int specifies the data type of the elements the array
stores.
• “arr” is the name of the array, and the number specified inside the
square brackets is the number of elements the array can store.
This is also called the size or length of the array.
• The elements of an array are always
stored in consecutive (contiguous)
memory locations.
• The number of elements that can be stored
in an array, that is, the size of the array or its
length, is given by the following equation:
(upper bound - lower bound) + 1
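Both points can be checked with a short sketch in C. The function names size_from_bounds and byte_gap are hypothetical helpers written for this illustration only.

```c
#include <stddef.h>

/* Number of elements from the bounds: (upper bound - lower bound) + 1. */
int size_from_bounds(int lower, int upper)
{
    return (upper - lower) + 1;
}

/* Distance in bytes between arr[i] and arr[i + 1].
   Because array elements are contiguous, this equals sizeof(int). */
ptrdiff_t byte_gap(const int *arr, int i)
{
    return (const char *)&arr[i + 1] - (const char *)&arr[i];
}
```

For int arr[10], size_from_bounds(0, 9) gives 10, which matches sizeof(arr) / sizeof(arr[0]).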
For example: Reading an array
for (i = 0; i <= 9; i++)
    scanf("%d", &arr[i]);
For example: Writing an array
for (i = 0; i <= 9; i++)
    printf("%d ", arr[i]);
• Stack: A stack is a special type of linear data structure
where elements are inserted at one end and deleted from the
same end.
• Using this approach, the last element inserted is the first element to be
deleted, and hence a stack is also called a Last In First Out (LIFO) data
structure.
• The stack s = {a0, a1, a2, ..., an-1} is pictorially represented as shown
below:
• The elements are inserted into the stack in the order a0, a1, a2, ..., an-1.
• That is, we insert a0 first, a1 next and so on. The item an-1 is inserted
last. Since it is on top of the stack, it is the first item to be deleted.
STACK
• Insertion of an element into a stack is called PUSH and
deletion of an element from a stack is called POP.
• The figure below shows how these operations take place on
a stack:
[Figure: STACK, with PUSH and POP both acting at the top]
THE VARIOUS OPERATIONS
PERFORMED ON STACK ARE:
• Insert: An element is inserted at the top end.
The insertion operation is called the push operation.
• Delete: An element is deleted from the top end only.
The deletion operation is called the pop operation.
• Overflow: Check whether the stack is full or not.
• Underflow: Check whether the stack is empty or not.
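The four operations above can be sketched with an array-based stack in C. This is one possible layout, not necessarily the one the slides had in mind; the type Stack, the capacity STACK_SIZE and all function names are assumptions made for the example.

```c
#include <stdbool.h>

#define STACK_SIZE 5            /* capacity chosen only for this sketch */

typedef struct {
    int items[STACK_SIZE];
    int top;                    /* index of the top element, -1 when empty */
} Stack;

void stack_init(Stack *s)           { s->top = -1; }
bool stack_is_full(const Stack *s)  { return s->top == STACK_SIZE - 1; } /* overflow check */
bool stack_is_empty(const Stack *s) { return s->top == -1; }             /* underflow check */

/* PUSH: insert at the top end; returns false on overflow. */
bool stack_push(Stack *s, int item)
{
    if (stack_is_full(s)) return false;
    s->items[++s->top] = item;
    return true;
}

/* POP: delete from the top end; returns false on underflow. */
bool stack_pop(Stack *s, int *out)
{
    if (stack_is_empty(s)) return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 10, 20 and 30 and then popping returns 30 first, which is the LIFO behaviour described above.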
QUEUE:
• A queue is a special type of linear data structure
where elements are inserted at one end and deleted from the other
end.
• The end at which new elements are added is called the rear and the end from which
elements are deleted is called the front.
• Using this approach, the first element inserted is the first element to be deleted,
and hence a queue is also called a First In First Out (FIFO) data structure.
• Suppose the items are inserted into a queue in the
order 10, 50 and 20. The variable q is used
as an array to hold these elements.
• Item 10 is the first element inserted. So, the variable
front is used as the index of the first element.
• Item 20 is the last element inserted. So, the variable rear
is used as the index of the last element.
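A minimal sketch of such a queue in C follows. The names q, front and rear come from the text above; the struct Queue, the capacity QUEUE_SIZE and the function names are assumptions for the example, and this simple linear layout does not reuse slots freed at the front.

```c
#include <stdbool.h>

#define QUEUE_SIZE 5    /* capacity chosen only for this sketch */

typedef struct {
    int q[QUEUE_SIZE];  /* array holding the elements */
    int front;          /* index of the first element */
    int rear;           /* index of the last element, -1 before any insertion */
} Queue;

void queue_init(Queue *qu)           { qu->front = 0; qu->rear = -1; }
bool queue_is_empty(const Queue *qu) { return qu->front > qu->rear; }
bool queue_is_full(const Queue *qu)  { return qu->rear == QUEUE_SIZE - 1; }

/* Insert at the rear end; returns false if the queue is full. */
bool queue_insert(Queue *qu, int item)
{
    if (queue_is_full(qu)) return false;
    qu->q[++qu->rear] = item;
    return true;
}

/* Delete from the front end; returns false if the queue is empty. */
bool queue_delete(Queue *qu, int *out)
{
    if (queue_is_empty(qu)) return false;
    *out = qu->q[qu->front++];
    return true;
}
```

After inserting 10, 50 and 20, front is 0 and rear is 2, and the first deletion returns 10.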


LISTS
• A list (linear linked list) can be defined as a collection of a variable number of
data items.
• Lists are the most commonly used non-primitive data structures.
• Linked lists: A linked list is a data structure which is a collection of zero or more
nodes where each node is connected to the next node.
• If each node in the list has only one link, it is called a singly linked list.
• If each node has two links, one containing the address of the next node and the other
containing the address of the previous node, it is called a doubly linked list.
• Each node in a singly linked list has two fields, namely:
  info: This field is used to store the data or information to be manipulated.
  link: This field contains the address of the next node.
• An element of a list must contain at least two fields, one for storing data or
information and the other for storing the address of the next element.
• Since the link field stores an address, it must be of pointer type.
• Technically each such element is referred to as a node;
therefore a list can be defined as a collection of nodes as
shown below:
[Figure: Linear linked list. Head points to the node AAA, which links to BBB, which links to CCC. Each node has an information field and a pointer field.]
• Types of linked lists:
• Singly linked list
• Doubly linked list
• Singly circular linked list
• Doubly circular linked list
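The node layout described above (an info field and a link field) can be sketched in C as follows. The field names info and link come from the text; the type name Node, the helper functions and the choice of int data are assumptions made for the example.

```c
#include <stdlib.h>

typedef struct node {
    int info;            /* data or information to be manipulated */
    struct node *link;   /* address of the next node, NULL at the end of the list */
} Node;

/* Insert a new node at the front of the list and return the new head.
   If allocation fails, the list is returned unchanged. */
Node *insert_front(Node *head, int item)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) return head;
    n->info = item;
    n->link = head;
    return n;
}

/* Count the nodes by following the link fields until NULL. */
int list_length(const Node *head)
{
    int count = 0;
    for (const Node *p = head; p != NULL; p = p->link) count++;
    return count;
}

/* Release every node of the list. */
void free_list(Node *head)
{
    while (head != NULL) {
        Node *next = head->link;
        free(head);
        head = next;
    }
}
```

A doubly linked list would add a second pointer field holding the address of the previous node.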
TREES
• A tree can be defined as a finite set of data items (nodes).
• A tree is a non-linear data structure in which data
items are arranged in a hierarchy (and, in some trees such as binary search trees, in sorted order).
• A tree represents the hierarchical relationship between
various elements.
• In trees:
• There is a special data item at the top of the hierarchy called the
root of the tree.
• The remaining data items are partitioned into a number of
mutually exclusive subsets, each of which is itself a tree,
called a subtree.
• In data structures, a tree always grows downwards from the
root, unlike natural trees, which grow upwards.
• The tree structure organizes the data into branches,
which relate the information.

        A (root)
    B       C
  D   E   F   G
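A tree like the one pictured can be sketched in C with one node type and a few recursive helpers. The slides do not say which node is the parent of D, E, F and G; the usage below assumes B is the parent of D and E, and C is the parent of F and G. The names TreeNode, new_node, count_nodes, tree_levels and free_tree are hypothetical.

```c
#include <stdlib.h>

typedef struct tnode {
    char info;                    /* data item stored in the node */
    struct tnode *left, *right;   /* subtrees; NULL when absent */
} TreeNode;

/* Allocate a node with the given subtrees; returns NULL if allocation fails. */
TreeNode *new_node(char info, TreeNode *left, TreeNode *right)
{
    TreeNode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->info = info;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* A tree is a root plus its subtrees, so counting is naturally recursive. */
int count_nodes(const TreeNode *root)
{
    if (root == NULL) return 0;
    return 1 + count_nodes(root->left) + count_nodes(root->right);
}

/* Number of levels from the root down to the deepest node. */
int tree_levels(const TreeNode *root)
{
    if (root == NULL) return 0;
    int l = tree_levels(root->left);
    int r = tree_levels(root->right);
    return 1 + (l > r ? l : r);
}

/* Release the subtrees first, then the node itself. */
void free_tree(TreeNode *root)
{
    if (root == NULL) return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}
```

Building the pictured tree under that assumption gives 7 nodes on 3 levels.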
GRAPH
• A graph is a mathematical non-linear data structure
capable of representing many kinds of physical structures.
• It has found application in geography, chemistry and
engineering sciences.
• Definition: A graph G(V, E) is a set of vertices V and a
set of edges E.
• An edge connects a pair of vertices and may have a
weight such as length, cost or another measure,
depending on what the graph represents.
• Vertices of the graph are shown as points or circles and
edges are drawn as arcs or line segments.
• Example of graph:
[Figure: (a) Directed weighted graph and (b) Undirected graph]
• Types of graphs:
• Directed graph
• Undirected graph
• Simple graph
• Weighted graph
• Connected graph
• Non-connected graph
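One common way to store G(V, E) is an adjacency matrix, sketched below in C. The slides do not prescribe a representation, so this choice, the names Graph, MAX_VERTICES and the helper functions are all assumptions for the illustration. Vertices are numbered from 0 and a weight of 0 is taken to mean "no edge".

```c
#define MAX_VERTICES 5   /* capacity chosen only for this sketch */

typedef struct {
    int n;                                   /* number of vertices in use */
    int weight[MAX_VERTICES][MAX_VERTICES];  /* weight[u][v]; 0 means no edge */
} Graph;

/* Start with n vertices and no edges (n must not exceed MAX_VERTICES). */
void graph_init(Graph *g, int n)
{
    g->n = n;
    for (int u = 0; u < MAX_VERTICES; u++)
        for (int v = 0; v < MAX_VERTICES; v++)
            g->weight[u][v] = 0;
}

/* Directed edge: only the entry from -> to is set. */
void add_directed_edge(Graph *g, int from, int to, int w)
{
    g->weight[from][to] = w;
}

/* Undirected edge: the entry is set in both directions. */
void add_undirected_edge(Graph *g, int u, int v, int w)
{
    g->weight[u][v] = w;
    g->weight[v][u] = w;
}

/* Number of edges leaving vertex v. */
int out_degree(const Graph *g, int v)
{
    int count = 0;
    for (int u = 0; u < g->n; u++)
        if (g->weight[v][u] != 0) count++;
    return count;
}
```

The difference between directed and undirected graphs shows up as an asymmetric versus a symmetric matrix.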
DATA STRUCTURE OPERATIONS
• Creating: The process of repeatedly adding data items to a list
is called creating a list.
• Inserting: The process of adding a data item to a list is called inserting.
• Deleting: The process of removing a data item from a list is called deleting.
• Searching: The process of determining whether a particular item is present in
a list is called searching.
• Sorting: The process of arranging data items in either ascending
or descending order is called sorting.
• Merging: The process of combining two sorted lists into a
single sorted list is called merging.
• Traversing: The process of accessing each data item exactly once so that it
can be processed and manipulated is called traversing.
OPERATIONS ON ARRAYS

• The various operations that can be performed on arrays
are shown below:
• Traversing
• Inserting
• Deleting
• Searching
• Sorting
TRAVERSING

• Visiting or accessing each item in the array is called
traversing the array. Here, each element is accessed in
linear order either from left to right or from right to left.
• If we want to read n data items from the keyboard, the
following statement can be used:
for (i = 0; i <= n-1; i++)
{
    scanf("%d", &a[i]);
}
• Similarly, to display n data items stored in the array,
replace scanf() by printf() as shown below:
for (i = 0; i < n; i++)
{
    printf("%d ", a[i]);
}
 Now, the function to
INSERTING AN ITEM INTO AN ARRAY
(BASED ON THE POSITION)
• “How to insert an item into an unsorted array
based on the position?”
• Problem statement: Given an array a consisting
of n elements, it is required to insert an item at
the specified position, say pos.
• Design: An item can be inserted into the array
by considering the various situations shown
below:
• Step 1: Elements are present (invalid position):
This case can be pictorially represented as
shown below (an array holding 7 elements in positions 0 to 6):
• Can you insert an item at the 8th position
onwards in the above array?
• No, we cannot insert, since it is an invalid
position. That is, if pos is greater than 7 (the number of elements, n) or if
pos is less than 0, the position is invalid.
• The code for this case can be written as
shown below:
• Step 2: Make room for the item to be inserted
at the specified position: Consider the following
list with 7 elements and item 60 to be inserted
at position 3.
• We have to make room for the item to be
inserted at position 3. This can be done by
moving the elements 30, 80, 70 and 90 from
positions 6, 5, 4, 3 into new positions 7, 6, 5, 4
respectively, that is, towards the right by one position,
starting from the last element, as shown below:
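The two steps above, followed by storing the item and increasing the count, can be put together as a minimal sketch in C. The function name insert_at_pos and its parameter order are assumptions for this example, and the caller is assumed to have room for at least n + 1 elements in the array.

```c
/* Insert item at position pos in array a holding n elements.
   Returns the new number of elements, or n unchanged if pos is invalid.
   Assumption: the array has capacity for at least n + 1 elements. */
int insert_at_pos(int a[], int n, int item, int pos)
{
    /* Step 1: reject an invalid position. */
    if (pos < 0 || pos > n)
        return n;

    /* Step 2: make room by shifting right, starting from the last element. */
    for (int i = n - 1; i >= pos; i--)
        a[i + 1] = a[i];

    a[pos] = item;      /* place the item in the freed slot */
    return n + 1;       /* one more element than before */
}
```

With the elements 90, 70, 80 and 30 in positions 3 to 6, inserting 60 at position 3 moves them to positions 4 to 7, as described in Step 2. Shifting from the last element backwards matters: shifting from pos forwards would overwrite values before they are copied.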
DELETING AN ITEM FROM AN ARRAY
(BASED ON THE POSITION)
• “How to delete an item from an unsorted array based on the position?”
• Problem statement: Given an array a consisting of n elements, it is required to delete
the item at the specified position, say pos.
• Design: An item can be deleted from the array by considering the various situations
shown below:
• Step 1: Elements are present (invalid position): This case can be pictorially
represented as shown below (an array holding 7 elements in positions 0 to 6):
• Can you delete an item from the 7th position onwards in the above
array?
• No, we cannot delete, since it is an invalid position. That is, if pos is
greater than or equal to 7 (the number of elements, n) or if pos is less than 0, the position is
invalid.
• Step 2: Display the item to be deleted: Consider the following list with 7 elements and
let the position pos be 3.
• The item at position pos can be accessed by writing a[pos] and it can be displayed
using the printf() function as shown below:
printf("Item deleted = %d\n", a[pos]);
• Step 3: Remove the item from the array: To remove the element at the given
position, move all the elements from the 4th position onwards towards the left by one position using
the following statements:
a[3] = a[4];
a[4] = a[5];
a[5] = a[6];
• In general:
a[i - 1] = a[i] for i = 4 to 6, that is,
for i = pos+1 to n-1
• This activity can be written as a loop over i from pos+1 to n-1.
• Step 4: Update the number of elements in the array: After deleting an
item, the number of items in the array should be decremented by 1.
• It can be done using the following statement:
return n - 1;
FUNCTION TO DELETE AN ITEM FROM
THE SPECIFIED POSITION IN THE
ARRAY
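The steps of the design can be assembled into one function, sketched here in C. The name delete_at_pos and the wording of the error message are assumptions for this illustration; the printf() line and the return n - 1 statement follow the text.

```c
#include <stdio.h>

/* Delete the item at position pos from array a holding n elements.
   Returns the new number of elements, or n unchanged if pos is invalid. */
int delete_at_pos(int a[], int n, int pos)
{
    /* Step 1: reject an invalid position. */
    if (pos < 0 || pos >= n) {
        printf("Invalid position\n");
        return n;
    }

    /* Step 2: display the item to be deleted. */
    printf("Item deleted = %d\n", a[pos]);

    /* Step 3: move the elements after pos towards the left by one position. */
    for (int i = pos + 1; i <= n - 1; i++)
        a[i - 1] = a[i];

    /* Step 4: one element fewer. */
    return n - 1;
}
```

A call such as n = delete_at_pos(a, n, 3); keeps the caller's count in step with the array contents.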
SORTING ARRAY ELEMENTS
• First, we shall see “What is sorting?”
• Definition: Programmers often work with large
amounts of data, and it may be necessary to arrange them in ascending
or descending order. This process of arranging the given elements so
that they are in ascending or descending order is called sorting.
For example, consider the unsorted elements:
10, 50, 25, 20, 15
• After sorting them in ascending order, we get the following list: 10, 15,
20, 25, 50
• After sorting them in descending order, we get the following list: 50,
25, 20, 15, 10
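The slides do not name a sorting algorithm at this point, so the sketch below uses bubble sort purely as an illustration; any correct sorting method gives the same result. The function name bubble_sort and the ascending flag are assumptions for the example.

```c
/* Sort a[0..n-1] in place using bubble sort.
   ascending != 0 sorts in ascending order, ascending == 0 in descending order. */
void bubble_sort(int a[], int n, int ascending)
{
    for (int pass = 0; pass < n - 1; pass++) {
        /* After each pass the last (pass + 1) elements are in their final place. */
        for (int i = 0; i < n - 1 - pass; i++) {
            int out_of_order = ascending ? (a[i] > a[i + 1]) : (a[i] < a[i + 1]);
            if (out_of_order) {
                int tmp = a[i];       /* swap the adjacent pair */
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}
```

Applied to 10, 50, 25, 20, 15, the ascending call produces 10, 15, 20, 25, 50 and the descending call produces 50, 25, 20, 15, 10.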


LINEAR SEARCH (SEQUENTIAL
SEARCH)
• Now, let us see “What is linear search?”
• Definition: A linear search, also called sequential search, is a simple searching
technique. In this technique, we search for a given key in the list in linear
order (sequential order), i.e., one element after the other from the first element to the last
element or vice versa. The search may be successful or unsuccessful. If the key is
present, we say the search is successful; otherwise, the search is unsuccessful. For
example, consider the following array (5 elements):
• Successful search: If the search key is 25, it is present in the above list and we
return its position 3 indicating “Successful search”.
• Unsuccessful search: If the search key is 50, it is not present in the above list and
we return -1 indicating “Unsuccessful search”.
• Design: Now, let us see how to search for an item.
• Step 1: Identify parameters to function: We have to search for the key item in an
array a consisting of n elements. So, the inputs must be key, array a and n.
• The key is compared with a[i] for i = 0, 1, 2 and so on. Once the value of i is greater
than or equal to 5 (in general, n), it is an indication that the item is not present and
we return -1. The code for this can be written as:
return -1; /* Unsuccessful search */
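Putting the design together, a linear search can be sketched in C as below. The function name linear_search and the parameter order (key, a, n) follow the inputs listed in Step 1 but are otherwise assumptions for this example.

```c
/* Search for key in a[0..n-1], one element after the other.
   Returns the position of the first match, or -1 if key is not present. */
int linear_search(int key, const int a[], int n)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == key)
            return i;       /* successful search */
    }
    return -1;              /* unsuccessful search: i reached n */
}
```

For a hypothetical array 10, 15, 20, 25, 30, searching for 25 returns position 3 and searching for 50 returns -1.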
Advantages of linear search
• Very simple approach
• Works well for small arrays
• Can be used when the elements are not sorted
Disadvantages of linear search
• Less efficient if the array size is large
• If the elements are already sorted, linear
search is not the most efficient choice.
BINARY SEARCH

• To overcome the disadvantages of linear search, we use binary search.
Now, let us see
“What is binary search? What is the concept used in binary search?”
• Definition: A binary search is a simple and very efficient searching
technique which can be applied if the items to be compared are either in
ascending order or descending order.
• The general idea used in binary search is similar to the way we search for
the telephone number of a person in the telephone directory. Obviously, we
do not use linear search.
• Instead, we open the book at the middle and the name is compared with
the entry at the middle of the book.
• If the name is found, the corresponding telephone number is retrieved and
the search stops.
• If the name to be searched is less than the middle element, search the
left half; otherwise, search the right half.
• The procedure is repeated until the key item is found or there are no more items left to search.
• Design: Once we know the concept of binary search, the next question is
“How to search for key in a list of elements?”
• Step 1: Identify parameters to function: We have to search for the key item in an
array a consisting of n elements. So, the inputs must be key, array a and n.
• Step 2: Return type: We return the position of the key item if found;
otherwise, we return -1. Note that the position is an integer and -1 is also
an integer. So, the return type is int.
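The idea can be sketched in C as follows, assuming the array is sorted in ascending order. The function name binary_search and the local names low, high and mid are assumptions for this example.

```c
/* Search for key in a[0..n-1], which must be sorted in ascending order.
   Returns the position of key, or -1 if key is not present. */
int binary_search(int key, const int a[], int n)
{
    int low = 0;
    int high = n - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;   /* middle of the current range */

        if (a[mid] == key)
            return mid;                     /* found: stop searching */
        if (key < a[mid])
            high = mid - 1;                 /* continue in the left half */
        else
            low = mid + 1;                  /* continue in the right half */
    }
    return -1;                              /* range is empty: key not present */
}
```

Each comparison halves the range still to be searched, which is why this is much faster than linear search on large sorted arrays.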
POINTERS

• Definition: A variable which holds the address of another
variable or a memory location is called a pointer
variable. For example, if p is a pointer variable, it can
hold the address of variable i. The physical memory
representation and logical memory representation are
shown below:
“What are the steps to be followed to use
pointers?”
• Now, let us see “What is a NULL pointer?” A NULL pointer is
defined as a special pointer value that points
nowhere in memory. It does not refer to any valid variable and must not be dereferenced.
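A short sketch in C of using a pointer safely is given below. The names p and i come from the definition above; the helper set_through_pointer is hypothetical and exists only to show the NULL check.

```c
#include <stddef.h>

/* Store value in the variable that p points to.
   Returns 1 on success, 0 if p is NULL (a NULL pointer must not be dereferenced). */
int set_through_pointer(int *p, int value)
{
    if (p == NULL)
        return 0;
    *p = value;     /* access the variable through its address */
    return 1;
}
```

Typical use follows three steps: declare the variables (int i = 5; int *p;), make the pointer hold an address (p = &i;), and access the data through the pointer (*p).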
MEMORY ALLOCATION FUNCTIONS

“What is static memory allocation?”
If the memory space to be allocated for various variables is decided at compilation time itself,
then the memory space cannot be expanded to accommodate more data or reduced to
accommodate less data. In this technique, once the size of the memory space to be allocated is
fixed, it cannot be altered during execution time.
This is called “static memory allocation”.
For example, consider the following declaration:
• “What is dynamic memory allocation?”
• Dynamic memory allocation is the process of allocating
memory space during execution time (i.e., run time).
This allocation technique uses predefined functions to
allocate and release memory for data during execution
time. So, if there is an unpredictable storage
requirement, the dynamic allocation technique is
used. Dynamic allocation is used when we create
dynamic arrays, linked lists, trees and so on.
MALLOC(SIZE)
“What is the purpose of using malloc?”
This function allows the program to allocate a block of memory space as
and when required and the exact amount needed during execution.
The size of the block is the number of bytes specified in the parameter.
The syntax is shown below:
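In standard C the function is declared in stdlib.h as void *malloc(size_t size). A minimal sketch of its use follows; the wrapper name make_int_array is hypothetical.

```c
#include <stdlib.h>

/* Allocate a block large enough for n ints.
   The contents are uninitialized; returns NULL if the allocation fails. */
int *make_int_array(size_t n)
{
    return malloc(n * sizeof(int));   /* size is given in bytes */
}
```

The caller should always check the result against NULL before using the block, and release it with free() when it is no longer needed.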
CALLOC(N, SIZE)
“What is the purpose of using calloc?”

This function is used to allocate multiple blocks of memory. Here, calloc
stands for contiguous allocation of multiple blocks and is mainly used to
allocate memory for arrays.
The number of blocks is determined by the first parameter n. The size of
each block is equal to the number of bytes specified in the parameter size.
Thus, the total number of bytes allocated is n*size and all bytes are
initialized to 0. The syntax is shown below:
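In standard C the function is declared in stdlib.h as void *calloc(size_t n, size_t size). A minimal sketch follows; the wrapper name make_zeroed_array is hypothetical.

```c
#include <stdlib.h>

/* Allocate n blocks of sizeof(int) bytes each, all bytes set to 0.
   Returns NULL if the allocation fails. */
int *make_zeroed_array(size_t n)
{
    return calloc(n, sizeof(int));
}
```

Unlike a block obtained from malloc(), every element of this array can be read immediately and will be 0.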
DIFFERENCE BETWEEN MALLOC() AND
CALLOC()
REALLOC(PTR, SIZE)
“What is the purpose of using realloc?”
Before using this function, the memory should have been allocated using
malloc() or calloc().
Sometimes, the allocated memory may not be sufficient and we may require
additional memory space.
Sometimes, the allocated memory may be much larger and we want to reduce
the size of allocated memory. In both situations, the size of allocated memory
can be changed using realloc() and the process is called reallocation of
memory.
The reallocation is done as shown below:

realloc() changes the size of the block by extending or shrinking the memory at
the end of the block.
If the existing memory can be extended in place, the ptr value will not change.
If the memory cannot be extended, this function allocates a completely new
block, copies the contents of the existing memory block into the new memory
block and then frees the old memory block.
The syntax is shown below:
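In standard C the function is declared in stdlib.h as void *realloc(void *ptr, size_t size). Because it returns NULL on failure while leaving the old block valid, a common pattern is to store the result in a temporary first. The helper name resize_int_array is hypothetical.

```c
#include <stdlib.h>

/* Resize the block *arr so that it holds new_n ints (new_n must be > 0).
   Returns 1 on success. On failure returns 0 and leaves *arr untouched. */
int resize_int_array(int **arr, size_t new_n)
{
    int *tmp = realloc(*arr, new_n * sizeof(int));
    if (tmp == NULL)
        return 0;       /* the old block is still valid and still owned by the caller */
    *arr = tmp;         /* the block may have moved, so update the caller's pointer */
    return 1;
}
```

Writing ptr = realloc(ptr, size) directly would lose the only pointer to the old block if the call failed.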
FREE(PTR)
“What is the purpose of using free()?”

This function is used to de-allocate (or free) the allocated block of memory
which is allocated using the functions calloc(), malloc() or realloc().
It is the responsibility of a programmer to de-allocate memory whenever it is
not required by the program and initialize ptr to NULL.
The syntax is shown below:
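In standard C the function is declared in stdlib.h as void free(void *ptr). The sketch below combines the call with the advice above to reset the pointer to NULL; the helper name release is hypothetical.

```c
#include <stdlib.h>

/* Free the block that *pp points to and reset the caller's pointer to NULL,
   so that the stale address cannot be used by mistake afterwards. */
void release(int **pp)
{
    free(*pp);      /* free(NULL) is allowed and does nothing */
    *pp = NULL;
}
```

Calling release(&p) twice is harmless, whereas calling free(p) twice on the same non-NULL address is an error.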
STRINGS
1. Basic terminology
• Each programming language contains a set of characters:
• Letters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
• Digits: 0 1 2 3 4 5 6 7 8 9
• Special characters: + - * / ( ) [ ] { } # $ and so on.
• All the above characters are used to communicate with the computer.
• “What is a string?”
• Definition: A string is a sequence of characters obtained by concatenating
various characters of a character set. In the C language, a sequence of characters
enclosed within double quotes is called a string. In C, a string is implemented as an
array of characters. A string constant in C is always terminated by a null character (‘\0’).
A null-terminated string is the only type of string defined by the C language.
• For example, the string “VIVEKANANDA” is stored in memory as shown below:
V I V E K A N A N D A \0
(11 characters followed by the null character, 12 bytes in all)
STORING STRINGS
 Variable length storage structure:
 As the name implies, variable-length strings don’t have a pre-defined
length.
 The storage structure for a string can expand or shrink to accommodate any
size of data. But, there should be a mechanism to indicate the end of the
data.
• In a variable-length string, the string ends with a delimiter such as $.
• In the C language, the string ends with the null character (denoted by \0).
 Linked storage structure:
 In most of the word processing applications, the strings are represented
using linked lists.
 Using linked lists (discussed in chapter 8 and 9), inserting/deleting a
character/word is much easier.
 The string “MITHIL” can be represented using linked list as shown below:
STRING OPERATIONS
• Substring: A substring is a string obtained by extracting a part of a given
string, given the position and length of the substring (positions are counted from 1).
• For example, SUBSTRING(“TO BE OR NOT TO BE”, 4, 5) = “BE OR”
• Indexing: The process of finding the position of a pattern string in a given
text t is called indexing. It is also called pattern matching.
• Let the text be t = “RAMA IS THE KING OF AYODHYA” and the pattern string
pattern be “KING”. Then,
INDEX(t, pattern) = 13
• Concatenation: The process of appending the second string to the end of the
first string is called concatenation. We can denote concatenation
by the symbol “+”.
• For example, let the first string be s1 = “SEETA” and the second string be s2 = “ RAMA”
(with a leading space). Then,
s1 + s2 = “SEETA RAMA”
• Length: The number of characters in a string is called the length of the string.
For example, Length(“RAMA”) = 4
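These four operations can be sketched in C on top of the standard string.h functions strncpy(), strstr(), strcpy(), strcat() and strlen(). The wrapper names substring, index_of and concat are hypothetical; positions are counted from 1 to match the examples above, and index_of is assumed to return -1 when the pattern is absent.

```c
#include <string.h>

/* SUBSTRING(s, pos, len): copy len characters of s, starting at the
   1-based position pos, into out. Assumes pos and len lie within s
   and that out has room for len + 1 characters. */
void substring(const char *s, int pos, int len, char *out)
{
    strncpy(out, s + (pos - 1), (size_t)len);
    out[len] = '\0';                 /* terminate the result */
}

/* INDEX(t, pattern): 1-based position of the first occurrence of
   pattern in t, or -1 if it does not occur. */
int index_of(const char *t, const char *pattern)
{
    const char *p = strstr(t, pattern);
    return (p != NULL) ? (int)(p - t) + 1 : -1;
}

/* s1 + s2: write s1 followed by s2 into out.
   Assumes out has room for both strings and the null character. */
void concat(const char *s1, const char *s2, char *out)
{
    strcpy(out, s1);
    strcat(out, s2);
}
```

Length needs no wrapper: strlen("RAMA") is 4, because the terminating null character is not counted.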
STRING HANDLING FUNCTIONS
• Pattern matching algorithms
• First, let us see “What is pattern matching? What are the different pattern
matching techniques?”
• Definition: Given a string called pattern p with n characters and another
string called text t with m characters, where n ≤ m, it is required to search for
the pattern string p in the text string t. If the search is successful, return the
position of the first occurrence of pattern string p in the text string t.
Otherwise, return -1. This process of searching for a pattern string in a
given text string is called pattern matching.
• Pattern matching can be achieved using the following methods:
• Brute force pattern matching algorithm
• Checking end indices first and then proceeding with the brute force method
• Using a finite state machine (deterministic finite automaton or DFA), also called a
pattern matching table
• Knuth-Morris-Pratt (KMP) pattern matching algorithm
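Of the methods listed, the brute force approach is the simplest to sketch in C: try every starting position in the text and compare the pattern character by character. The function name brute_force_match is an assumption for this example, and it returns a 0-based position, following C array indexing.

```c
#include <string.h>

/* Brute force pattern matching.
   Returns the 0-based position of the first occurrence of pattern p
   in text t, or -1 if p does not occur in t. */
int brute_force_match(const char *p, const char *t)
{
    int n = (int)strlen(p);     /* pattern length */
    int m = (int)strlen(t);     /* text length */

    /* Only starting positions that leave room for the whole pattern are tried. */
    for (int i = 0; i + n <= m; i++) {
        int j = 0;
        while (j < n && t[i + j] == p[j])
            j++;                /* characters match so far */
        if (j == n)
            return i;           /* all n characters matched */
    }
    return -1;                  /* unsuccessful search */
}
```

In the worst case this compares about n characters at each of roughly m starting positions, which is the cost that the DFA and KMP methods are designed to reduce.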
