unit-1 (1)
RAJESHWARI
ASST.PROF.
DEPT.OF AIML
DEFINITION
A data structure is a representation of the logical relationships existing between
individual elements of data.
A data structure is a way of organizing all data items that considers not only the
elements stored but also their relationship to each other.
Data is a representation of facts or concepts in an organized manner.
The data may be stored, communicated, interpreted, or processed.
In short, data is a piece of information or simply a set of values, and
data by itself may not convey any meaning.
For example, numbers, names, marks, etc. are data. Consider the two data
items “Rama” and 10. On their own, “Rama” and 10 do not convey any meaning to
the reader.
Information is defined as a collection of data from which conclusions may be
drawn. So, information is a subset of data which, when interpreted, conveys meaning
that people can understand. Hence, information can be considered as a
message that is received and understood.
Ex 1: The output of the computer may be “Rama scored 10 marks”. This sentence
is information because it conveys meaning to the reader.
INTRODUCTION
Data structures affect the design of both the structural and
functional aspects of a program.
Program = Algorithm + Data Structure
An algorithm is a step-by-step procedure to solve
a particular problem.
NOW, LET US SEE “HOW COMPUTERS
REPRESENT DATA?”
The data are represented by the state of electronic switches.
A switch with ON state represents 1 and a switch with OFF state represents
0 (zero).
Thus, all digital computers use the binary number system to represent data.
So, the data that we input to the computer is converted into 0’s and 1’s.
Letters, digits, punctuation marks, sound and even pictures are
represented using 1’s and 0’s.
Thus, computers represent data using the binary digits 0 and 1, i.e., in the binary
number system.
NOW, LET US SEE “HOW CAN THE DATA REPRESENTED USING 1’S AND 0’S
BE GROUPED OR MEASURED?”
THE DATA REPRESENTED CAN BE GROUPED OR MEASURED USING THE
FOLLOWING UNITS:
“DEFINE DATA STRUCTURES”
Definition: The study of how data is collected and
stored in memory, how efficiently the data is
organized in memory, how efficiently the data can be
retrieved and manipulated, and the possible ways in
which different data items are logically related is called
data structures.
CLASSIFICATION OF DATA
STRUCTURE
Data structures are normally divided into two broad
categories:
Primitive Data Structure
Non-Primitive Data Structure
[Figure: data structures classified into Primitive DS and Non-Primitive DS]
Non-Primitive DS
for (i = 0; i <= 9; i++)
    printf("%d", arr[i]);
ARRAYS
Reading or writing a two-dimensional array
requires two loops. Similarly, an array of N
dimensions requires N loops.
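The two-loop pattern can be sketched as follows. This is a minimal illustration only; the array size (ROWS, COLS) and the function names print_2d and sum_2d are assumptions, not taken from the slides.

```c
#include <stdio.h>

#define ROWS 2   /* assumed size, for illustration only */
#define COLS 3

/* Print a ROWS x COLS array: the outer loop walks the rows,
   the inner loop walks the columns of the current row. */
void print_2d(int m[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++)
            printf("%d ", m[i][j]);
        printf("\n");
    }
}

/* Add up every element using the same two nested loops. */
int sum_2d(int m[ROWS][COLS])
{
    int total = 0;
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            total += m[i][j];
    return total;
}
```

A three-dimensional array would add a third nested loop in the same way.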
Stack: A stack is a special type of data structure (linear data structure)
where elements are inserted from one end and elements are deleted from the
same end.
Using this approach, the Last element Inserted is the First element to be
deleted Out, and hence, stack is also called Last In First Out (LIFO) data
structure.
The stack s = {a0, a1, a2, ……, an-1} is pictorially represented as shown
below:
The elements are inserted into the stack in the order a0, a1, a2, ……, an-1.
That is, we insert a0 first, a1 next and so on. The item an-1 is inserted at the
end. Since it is on top of the stack, it is the first item to be deleted.
STACK
Insertion of element into stack is called PUSH and
deletion of element from stack is called POP.
The figure below shows how the operations take place on
a stack:
[Figure: PUSH and POP operations on a stack]
THE VARIOUS OPERATIONS
PERFORMED ON STACK ARE:
Insert: An element is inserted at the top end.
The insertion operation is called the push operation.
Delete: An element is deleted from the top end only.
not.
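The push and pop operations can be sketched in C as follows. This is a minimal array-based sketch; the names Stack, STACK_SIZE, stack_init, push and pop, and the convention of returning 1 for success and 0 for failure, are assumptions made for illustration.

```c
#define STACK_SIZE 5   /* assumed capacity */

typedef struct {
    int items[STACK_SIZE];
    int top;               /* index of the top element, -1 when empty */
} Stack;

void stack_init(Stack *s)
{
    s->top = -1;
}

/* Push: insert item at the top end. Returns 0 if the stack is full. */
int push(Stack *s, int item)
{
    if (s->top == STACK_SIZE - 1)
        return 0;                  /* overflow */
    s->items[++s->top] = item;
    return 1;
}

/* Pop: delete the item at the top end and store it in *item.
   Returns 0 if the stack is empty. */
int pop(Stack *s, int *item)
{
    if (s->top == -1)
        return 0;                  /* underflow */
    *item = s->items[s->top--];
    return 1;
}
```

Pushing 10, 20 and 30 and then popping three times yields 30, 20 and 10, which is the LIFO order.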
QUEUE:
A queue is a special type of data structure (linear data structure)
where elements are inserted from one end and elements are deleted from the other
end.
The end at which new elements are added is called the rear and the end from which
elements are deleted is called the front.
Using this approach, the First element Inserted is the First element to be deleted Out,
and hence, queue is also called First In First Out (FIFO) data structure.
The items are inserted into the queue in the
order 10, 50 and 20. The variable q is used
as an array to hold these elements.
Item 10 is the first element inserted. So, the variable
various elements.
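Insertion at the rear and deletion from the front can be sketched as follows. This is a minimal linear-array sketch; apart from the array q, the names (Queue, QUEUE_SIZE, queue_init, insert_rear, delete_front) and the return-code convention are assumptions made for illustration.

```c
#define QUEUE_SIZE 5   /* assumed capacity */

typedef struct {
    int q[QUEUE_SIZE];  /* array holding the elements */
    int front;          /* index of the next element to delete */
    int rear;           /* index of the last element inserted, -1 when empty */
} Queue;

void queue_init(Queue *qu)
{
    qu->front = 0;
    qu->rear = -1;
}

/* Insert item at the rear end. Returns 0 if no room is left. */
int insert_rear(Queue *qu, int item)
{
    if (qu->rear == QUEUE_SIZE - 1)
        return 0;
    qu->q[++qu->rear] = item;
    return 1;
}

/* Delete the item at the front end and store it in *item.
   Returns 0 if the queue is empty. */
int delete_front(Queue *qu, int *item)
{
    if (qu->front > qu->rear)
        return 0;
    *item = qu->q[qu->front++];
    return 1;
}
```

Inserting 10, 50 and 20 and then deleting three times returns 10, 50 and 20, which is the FIFO order.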
TREES
In trees:
There is a special data item at the top of hierarchy called the
Root of the tree.
The remaining data items are partitioned into a number of
mutually exclusive subsets, each of which is itself a tree,
called a subtree.
In data structures, a tree always grows downwards,
unlike natural trees, which grow upwards.
The tree structure organizes the data into branches,
which relate the information.
        A (root)
    B           C
  D   E       F   G
GRAPH
A graph is a mathematical non-linear data structure
capable of representing many kinds of physical structures.
It has found applications in Geography, Chemistry and
Engineering sciences.
Definition: A graph G(V,E) is a set of vertices V and a
set of edges E.
An edge connects a pair of vertices and may have a
weight, such as length, cost or another measure
associated with the graph.
Vertices on the graph are shown as points or circles and
[Figure: example graphs on vertices v1 to v5 with edge weights 6, 8, 9, 10, 11 and 15]
shown below
Step 2: Make room for the item to be inserted
at the specified position: Consider the following
list with 7 elements and item 60 to be inserted
at position 3.
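Making room and inserting can be sketched as follows. This is a minimal sketch; the function name insert_at, the capacity parameter and the convention of returning the new element count (or -1 on failure) are assumptions made for illustration.

```c
/* Insert item at index pos in array a holding n elements.
   capacity is the total number of slots available in a.
   Returns the new element count, or -1 if the array is full
   or pos is not a valid position. */
int insert_at(int a[], int n, int capacity, int pos, int item)
{
    if (n >= capacity || pos < 0 || pos > n)
        return -1;

    /* Make room: shift elements right, starting from the last one. */
    for (int i = n - 1; i >= pos; i--)
        a[i + 1] = a[i];

    a[pos] = item;
    return n + 1;
}
```

Shifting starts from the last element so that no value is overwritten before it has been copied.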
Can you delete an item from the 7th position onwards in the above
array?
No, we cannot, since it is an invalid position. That is, if pos is
greater than or equal to 7 or if pos is less than 0, the position is
invalid.
Step 2: Display the item to be deleted: Consider the following list with 7 elements and
let the position pos be 3.
The item at position pos can be accessed by writing a[pos] and it can be displayed
using the printf() function as shown below:
printf("Item deleted = %d\n", a[pos]);
Remove the item from the array: Removing an element at the given
position can be illustrated using the following figure.
Move all the elements from the 4th position onwards towards the left by one position using
the following statements:
a[3] = a[4];
a[4] = a[5];
a[5] = a[6];
In general,
a[i - 1] = a[i] for i = 4 to 6,
that is, for i = pos+1 to n-1
Now, the code for above activity can be written as shown below:
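A minimal sketch of the validity check and the shifting loop described above. The function name delete_at and the convention of returning the new element count (or -1 for an invalid position) are assumptions made for illustration.

```c
/* Delete the item at index pos from array a holding n elements.
   Returns the new element count, or -1 if pos is invalid. */
int delete_at(int a[], int n, int pos)
{
    if (pos < 0 || pos >= n)
        return -1;                 /* invalid position */

    /* Shift every element after pos one place to the left. */
    for (int i = pos + 1; i <= n - 1; i++)
        a[i - 1] = a[i];

    return n - 1;
}
```

With n = 7 and pos = 3 the loop runs for i = 4 to 6, matching the three assignments listed above.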
20, 25, 50
After sorting them in descending order, we get the following list: 50,
Successful search: If the search key is 25, it is present in the above list and we
return its position 3, indicating “Successful search”.
Unsuccessful search: If the search key is 50, it is not present in the above list and
we return -1, indicating “Unsuccessful search”.
Design: Now, let us see how to search for an item.
Step 1: Identify parameters to function: We have to search for the key item in an
array a consisting of n elements. So, the input must be key, array a and n. So,
once the value of i is greater than or equal to 5, it is an indication that the item is not present and
we return -1. The code for this can be written as:
return -1; /*Unsuccessful search */
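Putting the steps together, linear search can be sketched as follows. The function name linear_search is an assumption; the parameters key, a and n follow the design above.

```c
/* Search for key in array a of n elements.
   Returns the index of the first match, or -1 if key is not present. */
int linear_search(int key, const int a[], int n)
{
    for (int i = 0; i < n; i++) {
        if (a[i] == key)
            return i;              /* successful search */
    }
    return -1;                     /* unsuccessful search */
}
```

For an assumed list 10, 15, 20, 25, 30 (not the list from the slides), searching for 25 returns 3 and searching for 50 returns -1.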
Advantages of linear search
Very simple approach
Works well for small arrays
Used to search when the elements are not
sorted
Disadvantages of linear search
Less efficient if the array size is large
If the elements are already sorted, linear
search is not efficient.
BINARY SEARCH
The general idea used in binary search is similar to the way we search for
the telephone number of a person in the telephone directory. Obviously, we
do not use linear search.
Instead, we open the book from the middle and the name is compared with
the element at the middle of the book.
If the name is found, the corresponding telephone number is retrieved and
the searching has to be stopped.
If the name to be searched is less than the middle element, search towards the
left; otherwise, search towards the right.
The procedure is repeated until the key item is found or no elements remain to be searched.
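The idea above can be sketched in C as follows. This is a minimal iterative sketch that assumes the array is sorted in ascending order; the function name binary_search and the variable names low, high and mid are assumptions made for illustration.

```c
/* Search for key in array a of n elements sorted in ascending order.
   Returns the index of key, or -1 if key is not present. */
int binary_search(int key, const int a[], int n)
{
    int low = 0;
    int high = n - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;   /* middle of the current range */

        if (key == a[mid])
            return mid;                     /* found */
        if (key < a[mid])
            high = mid - 1;                 /* search towards the left */
        else
            low = mid + 1;                  /* search towards the right */
    }
    return -1;                              /* range is empty: not found */
}
```

Each comparison halves the range that remains to be searched, which is why this approach suits large sorted arrays better than linear search.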
Design: Once we know the concept of binary search, the next question is
“How to search for key in a list of elements?”
Step 1: Identify parameters to function: We have to search for the key item in an
array a consisting of n elements. So, the input must be key, array a and n. So,
realloc() changes the size of the block by extending or deleting the memory at
the end of the block.
If the existing memory can be extended in place, the value of ptr does not change.
If the memory cannot be extended, this function allocates a completely new
block, copies the contents of the existing memory block into the new memory
block and then frees the old memory block.
The syntax is shown below:
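The standard library declares the function in <stdlib.h> as void *realloc(void *ptr, size_t size). A minimal sketch of a typical call follows; the helper name resize_block is hypothetical and the element type int is an assumption.

```c
#include <stdlib.h>

/* Resize a block of ints to hold new_count elements.
   Returns the resized block, which may be at a new address,
   or NULL on failure, in which case the old block is still
   valid and must still be freed by the caller. */
int *resize_block(int *ptr, size_t new_count)
{
    /* Assign to a temporary so that the old pointer is not lost
       if realloc() fails and returns NULL. */
    int *tmp = realloc(ptr, new_count * sizeof(int));
    return tmp;
}
```

The caller should replace its own pointer only after checking that the returned value is not NULL.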
FREE(PTR)
“What is the purpose of using free()?”
This function is used to de-allocate (or free) the allocated block of memory
which is allocated using the functions calloc(), malloc() or realloc().
It is the responsibility of a programmer to de-allocate memory whenever it is
not required by the program and initialize ptr to NULL.
The syntax is shown below:
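The standard library declares the function in <stdlib.h> as void free(void *ptr). A minimal sketch of freeing a block and resetting the pointer follows; the helper name release_block is hypothetical.

```c
#include <stdlib.h>

/* Free the block that *pp points to and reset the caller's pointer
   to NULL so that it cannot be used by mistake afterwards. */
void release_block(int **pp)
{
    free(*pp);     /* free(NULL) is allowed and does nothing */
    *pp = NULL;
}
```

Because the pointer is reset to NULL, calling the helper a second time on the same pointer is harmless.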
STRINGS
1. Basic terminology
Each programming language contains a set of characters:
Letters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Digits: 0 1 2 3 4 5 6 7 8 9
Special characters: + - * / ( ) [ ] { } # $ and so on.
All the above character sets are used to communicate with the computer.
“What is a string?”
Definition: A string is a sequence of characters obtained by concatenating
various characters of a character set. In the C language, a sequence of characters
enclosed within double quotes is called a string. In C, a string is implemented as an
array of characters. A string constant in C is always terminated by a null character (\0).
A null-terminated string is the only type of string defined by the C language.
For example, the string “VIVEKANANDA” is stored in memory as shown below:
STORING STRINGS
Variable length storage structure:
As the name implies, variable-length strings don’t have a pre-defined
length.
The storage structure for a string can expand or shrink to accommodate any
size of data. But, there should be a mechanism to indicate the end of the
data.
In a variable-length string, the string ends with a delimiter such as $.
In the C language, the string ends with the null character (denoted by \0).
Linked storage structure:
In most of the word processing applications, the strings are represented
using linked lists.
Using linked lists (discussed in chapter 8 and 9), inserting/deleting a
character/word is much easier.
The string “MITHIL” can be represented using linked list as shown below:
STRING OPERATIONS
Substring: A substring is a string obtained by extracting part of a given
string, specified by the position and length of the substring.
For example, SUBSTRING (“TO BE OR NOT TO BE”, 4, 5) = “BE OR”
Indexing: The process of finding the position of pattern string in a given
text t is called indexing. It is also called pattern matching.
Let the text be t = “RAMA IS THE KING OF AYODHYA” and let the pattern string
pattern be “KING”. Then,
INDEX(t, pattern) = 13
Concatenation: The process of appending the second string to the end of
first string is called concatenation. We can denote the concatenation symbol
by “+”.
For example, let the first string be s1 = “SEETA” and the second string be
s2 = “ RAMA”. Then,
s1 + s2 = “SEETA RAMA”
Length: The number of characters in a string is called length of the string.
For example, Length(“RAMA”) = 4
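The substring and indexing operations can be sketched in C as follows. This is a minimal sketch; the function names substring and index_of, the 1-based positions (chosen to match the examples above) and the return value -1 for a pattern that is not found are assumptions made for illustration.

```c
#include <string.h>

/* Copy len characters of s, starting at 1-based position pos, into out.
   out must have room for at least len + 1 characters.
   Returns 1 on success, 0 if the requested part lies outside s. */
int substring(const char *s, int pos, int len, char *out)
{
    int n = (int)strlen(s);
    if (pos < 1 || len < 0 || pos - 1 + len > n)
        return 0;
    memcpy(out, s + pos - 1, (size_t)len);
    out[len] = '\0';               /* terminate the result */
    return 1;
}

/* Return the 1-based position of the first occurrence of pattern in t,
   or -1 if pattern does not occur. */
int index_of(const char *t, const char *pattern)
{
    const char *p = strstr(t, pattern);
    return p ? (int)(p - t) + 1 : -1;
}
```

Concatenation and length map directly to the standard functions strcat() and strlen(); with strcat() the destination array must be large enough to hold the combined string.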
STRING HANDLING FUNCTIONS
Pattern matching algorithms
First, let us see “What is pattern matching? What are different pattern
matching techniques?”
Definition: Given a string called pattern p with n characters and another
string called text t with m characters, where n ≤ m, it is required to search for
the pattern string p in the text string t. If the search is successful, return the
position of the first occurrence of pattern string p in the text string t.
Otherwise, return -1. This process of searching for a pattern string in a
given text string is called pattern matching.
The pattern matching can be achieved using the following methods:
Brute force pattern matching algorithm
Checking end indices first and then proceeding with the brute force method
Using Finite State Machine (Deterministic finite automaton or DFA) called
pattern matching table
Knuth Morris Pratt (KMP) pattern matching algorithm
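The first method in the list, brute force matching, can be sketched as follows. This is a minimal sketch; the function name brute_force_match is an assumption, and the position returned is a 0-based index following the C array convention.

```c
#include <string.h>

/* Search for pattern p in text t by trying every starting position.
   Returns the 0-based index of the first occurrence, or -1 if p
   does not occur in t. */
int brute_force_match(const char *t, const char *p)
{
    int m = (int)strlen(t);        /* length of the text */
    int n = (int)strlen(p);        /* length of the pattern */

    for (int i = 0; i + n <= m; i++) {
        int j = 0;
        /* Compare the pattern with the text starting at position i. */
        while (j < n && t[i + j] == p[j])
            j++;
        if (j == n)
            return i;              /* all n characters matched */
    }
    return -1;                     /* pattern not found */
}
```

In the worst case this compares about n characters at each of the m - n + 1 starting positions, which is the cost the table-based and KMP methods aim to reduce.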