DS Unit 1

The document provides an introduction to data structures. It defines data structures as a way of organizing data to allow for effective operations. It discusses the basic concepts of data representation including data, data items, entities, records, and files. It also defines abstract data types and describes basic data structures like arrays, linked lists, stacks, queues, trees and graphs. It explains linear and non-linear data structures and common operations like traversing, searching, insertion, deletion, sorting and merging. Finally, it provides details about arrays, linked lists, and dynamic memory allocation.


UNIT I

INTRODUCTION TO DATA STRUCTURES

A data structure is a way of collecting and organizing data so that operations can be performed on that data effectively.

In simple terms, data structures are structures programmed to store ordered data, so that various operations can be performed on it easily.

DATA REPRESENTATION

Data: The term 'data' simply refers to a value or a set of values. These values may represent anything about something: the roll number of a student, marks, the name of an employee, the address of a person, etc.
Data item: A data item refers to a single unit of value, e.g. the roll number of a student, marks, the name of an employee, or the address of a person. Data items that can be divided into sub-items are called group items (e.g. address, date, name), whereas those that cannot be divided into sub-items are called elementary items (e.g. roll number, marks, city, pin code).

Entity: entities with similar attributes (e.g. all employees of an organization) form an entity set.
Information: processed data; data with a given attribute.

Field is a single elementary unit of information representing an attribute of an entity

Record is the collection of field values of a given entity

File is the collection of records of the entities in a given entity set

Name  Age  Sex  Roll Number  Branch
A     17   M    109cs0132    CSE
B     18   M    109ee1234    EE

ABSTRACT DATA TYPE (ADT)

An abstract data type (ADT) refers to a set of data values and associated operations that are specified precisely, independent of any particular implementation. With an ADT, we know what a specific data type can do, but how it actually does it is hidden: the implementation is simply hidden from the user.

BASIC TYPES OF DATA STRUCTURES


Anything that can store data can be called a data structure; hence Integer, Float, Boolean, Char, etc. are all data structures. They are known as primitive data structures.

We also have some complex data structures, which are used to store large and connected data. Some examples of abstract data structures are:

● Array

● Linked List

● Stack

● Queue

● Tree

● Graph

All these data structures allow us to perform different operations on data. We select these data
structures based on which type of operation is required.

LINEAR AND NON-LINEAR DATA STRUCTURES


Linear data structures organize their data elements in a linear fashion, where data elements are
attached one after the other. Linear data structures are very easy to implement, since the memory of the
computer is also organized in a linear fashion. Some commonly used linear data structures are arrays,
linked lists, stacks and queues.
In nonlinear data structures, data elements are not organized in a sequential fashion. Data
structures like multidimensional arrays, trees, graphs, tables and sets are some examples of widely used
nonlinear data structures.

OPERATIONS ON DATA STRUCTURES

The data in the data structures are processed by certain operations. The particular data structure
chosen largely depends on the frequency of the operation that needs to be performed on the data
structure. The Operations are:

• Traversing
• Searching
• Insertion
• Deletion
• Sorting
• Merging

(1) Traversing: Accessing each record exactly once so that certain items in the record may be
processed.

(2) Searching: Finding the location of a particular record with a given key value, or finding
the location of all records which satisfy one or more conditions.

(3) Inserting: Adding a new record to the structure.


(4) Deleting: Removing the record from the structure.

(5) Sorting: Arranging the data or records in some logical order (ascending or descending).

(6) Merging: Combining the records of two different sorted files into a single sorted file.
ARRAYS

An array is a container which can hold a fixed number of items, and these items should be of the same type. Most data structures make use of arrays to implement their algorithms. The following terms are important to understand the concept of an array.

● Element − each item stored in an array is called an element.

● Index − each location of an element in an array has a numerical index which is used to identify
the element.

Array Representation
For an array representation, the following points are important to consider.

● Index starts with 0.

● If the array length is 6, it can store 6 elements.

● Each element can be accessed via its index. For example, if the element stored at index 5 is 3, fetching index 5 yields 3.

Array Types:

There are majorly three types of arrays:


1. One-dimensional array (1-D arrays)
2. Two-dimensional (2D) array
3. Three-dimensional array

1. One-dimensional array (1-D arrays) :


You can imagine a 1d array as a row, where elements are stored one after another.

1D array

Syntax for Declaration of Single Dimensional Array


Below is the syntax to declare the single-dimensional array
data_type array_name[array_size];
where,
● data_type: the type of data in each array block.
● array_name: the name of the array, used to refer to it.
● array_size: the number of blocks of memory the array is going to have.
For example:
int nums[5];

2. Two-dimensional (2D) array:


Multidimensional arrays can be considered as an array of arrays or as a matrix consisting of rows
and columns.

2D array

Syntax for Declaration of Two-Dimensional Array


Below is the syntax to declare the Two-dimensional array
data_type array_name[sizeof_1st_dimension][sizeof_2nd_dimension];
where,
● data_type: the type of data in each array block.
● array_name: the name of the array, used to refer to it.
● sizeof_dimension: the number of blocks of memory the array is going to have in the corresponding dimension.
For example:
int nums[5][10];
3. Three-dimensional array:
A 3-D multidimensional array contains three dimensions, so it can be considered an array of two-dimensional arrays.
3D array

Syntax for Declaration of Three-Dimensional Array


Below is the syntax to declare the three-dimensional array:
data_type array_name[sizeof_1st_dimension][sizeof_2nd_dimension][sizeof_3rd_dimension];
where,
● data_type: the type of data in each array block.
● array_name: the name of the array, used to refer to it.
● sizeof_dimension: the number of blocks of memory the array is going to have in the corresponding dimension.
For example:
int nums[5][10][2];

Basic Operations of Array:

Following are the basic operations supported by an array.


● Traverse − print all the array elements one by one.

● Insertion − add an element at given index.

● Deletion − delete an element at given index.

● Search − search an element using given index or by value.

● Update − update an element at given index.

Applications of Array :

Storing and accessing data: Arrays are used to store and retrieve data in a specific order. For example,
an array can be used to store the scores of a group of students, or the temperatures recorded by a weather
station.

Sorting: Arrays can be used to sort data in ascending or descending order. Sorting algorithms such as
bubble sort, merge sort, and quick sort rely heavily on arrays.

Searching: Arrays can be searched for specific elements using algorithms such as linear search and
binary search.

Matrices: Arrays are used to represent matrices in mathematical computations such as matrix
multiplication, linear algebra, and image processing.

Stacks and queues: Arrays are used as the underlying data structure for implementing stacks and queues,
which are commonly used in algorithms and data structures.
Graphs: Arrays can be used to represent graphs in computer science. Each element in the array represents
a node in the graph, and the relationships between the nodes are represented by the values stored in the
array.

Dynamic programming: Dynamic programming algorithms often use arrays to store intermediate results
of sub problems in order to solve a larger problem.

DYNAMIC MEMORY ALLOCATION:

Dynamic memory allocation is the process of assigning memory space during execution (run time). Its main advantage is that it lets a program request memory when the amount needed is not known beforehand.

INTRODUCTION TO LISTS:

A list is an ordered data structure that stores elements sequentially; elements can be accessed by their index. Depending on the programming language being used, a list can store elements of the same or of different data types.

LINKED LISTS

A linked list is a linear and very common data structure consisting of a group of nodes in a sequence, where each node is divided into two parts: its own data and the address of the next node, so that the nodes form a chain. Linked lists are used to create trees and graphs.
Commonly used operations on a linked list:
The following operations are performed on a linked list.
● Insertion: the insertion operation can be performed in four ways:
  ○ insertion in an empty list
  ○ insertion at the beginning of the list
  ○ insertion at the end of the list
  ○ insertion in between the nodes
● Deletion: the deletion operation can be performed in three ways:
  ○ deleting from the beginning of the list
  ○ deleting from the end of the list
  ○ deleting a specific node
● Display: this operation displays the elements of a linked list.

Types of Linked Lists


Singly Linked List : Singly linked lists contain nodes which have a data part as well as an address
part i.e. next, which points to the next node in sequence of nodes. The operations we can perform on
singly linked lists are insertion, deletion and traversal.

Doubly Linked List : In a doubly linked list, each node contains two links the first link points to the
previous node and the next link points to the next node in the sequence.
Circular Linked List : In the circular linked list the last node of the list contains the address of the
first node and forms a circular chain.

Linked Lists vs Arrays:

Arrays occupy contiguous memory of a fixed size and allow direct access by index, but insertion and deletion require shifting elements. Linked lists grow and shrink at run time and allow insertion at the front without shifting, but elements must be reached by traversing the chain, and each node needs extra space for a pointer.
Applications of Linked Lists

● Linked lists are used to implement stacks, queues, graphs, etc.

● Linked lists let you insert elements at the beginning and end of the list.

● With linked lists we don't need to know the size in advance.


ALGORITHMS BASICS

An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a certain order to get the desired output. Algorithms are generally created independently of underlying languages, i.e. an algorithm can be implemented in more than one programming language.

An algorithm is a finite set of instructions or logic, written in order, to accomplish a certain predefined task. An algorithm is not the complete code or program; it is just the core logic (solution) of a problem, which can be expressed either as an informal high-level description, as pseudocode, or using a flowchart.

From the data structure point of view, the following are some important categories of algorithms:

• Search − Algorithm to search an item in a data structure.


• Sort − Algorithm to sort items in certain order
• Insert − Algorithm to insert item in a data structure
• Update − Algorithm to update an existing item in a data structure
• Delete − Algorithm to delete an existing item from a data structure

Characteristics of an Algorithm

Not all procedures can be called an algorithm. An algorithm should have the below mentioned
characteristics −

• Unambiguous − An algorithm should be clear and unambiguous. Each of its steps (or phases), and their inputs/outputs, should be clear and must lead to only one meaning.

• Input − An algorithm should have 0 or more well-defined inputs.

• Output − An algorithm should have 1 or more well-defined outputs, which should match the desired output.

• Finiteness − An algorithm must terminate after a finite number of steps.

• Feasibility − An algorithm should be feasible with the available resources.

• Independent − An algorithm should have step-by-step directions that are independent of any programming code.
Algorithm Analysis

An algorithm is said to be efficient and fast, if it takes less time to execute and consumes less memory
space. The performance of an algorithm is measured on the basis of following properties:

1. Time Complexity

Suppose X is an algorithm and n is the size of input data, the time and space used by the Algorithm X are
the two main factors which decide the efficiency of X.

• Time Factor − Time is measured by counting the number of key operations, such as comparisons in a sorting algorithm.
• Space Factor − Space is measured by counting the maximum memory space required by the algorithm.

The complexity of an algorithm f(n) gives the running time and / or storage space required by the
algorithm in terms of n as the size of input data.

2. Space Complexity

The space complexity of an algorithm represents the amount of memory space required by the algorithm during the course of its execution. Space complexity must be taken seriously for multi-user systems and in situations where limited memory is available.

Space required by an algorithm is equal to the sum of the following two components −
• A fixed part: the space required to store certain data and variables that are independent of the size of the problem, for example simple variables, constants, and the program code itself.

• A variable part: the space required by variables whose size depends on the size of the problem, for example dynamically allocated memory and the recursion stack.
An algorithm generally requires space for following components:

• Instruction Space: It is the space required to store the executable version of the program. This
space is fixed, but varies depending upon the number of lines of code in the program.

• Data Space: It is the space required to store all constant and variable values.

• Environment Space: It is the space required to store the environment information needed to
resume the suspended function.

The space complexity S(P) of any algorithm P is S(P) = C + Sp(I), where C is the fixed part and Sp(I) is the variable part, which depends on the instance characteristic I. Following is a simple example that tries to explain the concept.
ASYMPTOTIC NOTATIONS:

The main idea of asymptotic analysis is to have a measure of efficiency of algorithms that doesn’t
depend on machine specific constants, and doesn’t require algorithms to be implemented and time
taken by programs to be compared. Asymptotic notations are mathematical tools to represent time
complexity of algorithms for asymptotic analysis. The following 3 asymptotic notations are mostly
used to represent time complexity of algorithms.

1) Θ Notation:

The theta notation bounds a function from above and below, so it defines exact asymptotic behavior. A simple way to get the Theta notation of an expression is to drop low-order terms and ignore leading constants. For example, consider the following expression: 3n³ + 6n² + 6000 = Θ(n³).

Dropping lower-order terms is always fine because there will always be an n0 after which Θ(n³) beats Θ(n²), irrespective of the constants involved. For a given function g(n), we denote by Θ(g(n)) the following set of functions:

Θ(g(n)) = { f(n) : there exist positive constants c1, c2 and n0 such that 0 <= c1*g(n) <= f(n) <= c2*g(n) for all n >= n0 }

The above definition means that if f(n) is theta of g(n), then the value of f(n) is always between c1*g(n) and c2*g(n) for large values of n (n >= n0). The definition of theta also requires that f(n) must be non-negative for values of n greater than n0.

2. Big O Notation:
The Big O notation defines an upper bound of an algorithm; it bounds a function only from above. For example, consider the case of Insertion Sort. It takes linear time in the best case and quadratic time in the worst case. We can safely say that the time complexity of Insertion Sort is O(n²). Note that O(n²) also covers linear time. If we used Θ notation to represent the time complexity of Insertion Sort, we would have to use two statements for the best and worst cases:

1. The worst-case time complexity of Insertion Sort is Θ(n²).

2. The best-case time complexity of Insertion Sort is Θ(n).


The Big O notation is useful when we only have upper bound on time complexity of an algorithm.
Many times we easily find an upper bound by simply looking at the algorithm.

O(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= f(n) <= c*g(n) for all n >= n0 }


3.Ω Notation:

Just as Big O notation provides an asymptotic upper bound on a function, Ω notation provides an asymptotic lower bound. Ω notation can be useful when we have a lower bound on the time complexity of an algorithm. Since the best-case performance of an algorithm is generally not useful, the Omega notation is the least used of the three.
For a given function g(n), we denote by Ω(g(n)) the following set of functions:

Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 <= c*g(n) <= f(n) for all n >= n0 }

RECURSION:
The process in which a function calls itself directly or indirectly is called recursion
and the corresponding function is called a recursive function.

Using a recursive algorithm, certain problems can be solved quite easily. Examples of such problems are the Towers of Hanoi (TOH), inorder/preorder/postorder tree traversals, DFS of a graph, etc.

Properties of Recursion:
● Performing the same operations multiple times with different inputs.
● In every step, we try smaller inputs to make the problem smaller.
● A base condition is needed to stop the recursion; otherwise an infinite loop will occur.

A Mathematical Interpretation
Consider the problem of determining the sum of the first n natural numbers. There are several ways of doing this, but the simplest approach is simply to add the numbers starting from 1 up to n.

Approach (1) – simply adding one by one:
f(n) = 1 + 2 + 3 + … + n

There is another, recursive way of representing this.

Approach (2) – recursive adding:
f(n) = 1            for n = 1
f(n) = n + f(n-1)   for n > 1

The difference between approach (1) and approach (2) is that in approach (2) the function f() is called inside the function itself. This phenomenon is named recursion, and a function containing recursion is called a recursive function. In the end, this is a great tool for programmers to code some problems in an easier and more efficient way.

What is the base condition in recursion?


In the recursive program, the solution to the base case is provided and the solution to the
bigger problem is expressed in terms of smaller problems.

int fact(int n)
{
    if (n <= 1)      /* base case */
        return 1;
    else
        return n * fact(n - 1);
}

In the above example, the base case for n <= 1 is defined, and a larger value of a number can be solved by converting it to a smaller one until the base case is reached.
