DS Notes Unit 1

Unit 1 covers fundamental concepts of data structures, including elementary data organization, types of data structures (linear, hierarchical, graph, and advanced), and key operations such as insertion, deletion, and searching. It also discusses arrays in detail, including their characteristics, types, and applications, as well as sorting algorithms like Insertion Sort and Selection Sort. The document emphasizes the importance of data structures in efficient data management and algorithm design.


Unit 1 - Data Structures

Elementary Data Organization: Elementary data organization refers to the fundamental methods and principles used to arrange and structure data in a way that makes it accessible, manageable, and efficient to use. This concept is foundational in computer science, particularly in the fields of data management, storage, and processing.

Data Structure: A data structure is a specialized format for organizing, managing, and storing data in a computer so that it can be accessed and modified efficiently. Data structures are fundamental to designing efficient algorithms and systems, as they provide a way to handle large amounts of data effectively.

Linear Data Structures

 Arrays: A collection of elements identified by index, where each element is of the same type. Arrays have a fixed size.
 Linked Lists: A sequence of nodes where each node contains data
and a reference (or link) to the next node. Variants include singly
linked lists, doubly linked lists, and circular linked lists.
 Stacks: A collection of elements with Last In, First Out (LIFO)
access. Operations include push (insert) and pop (remove).
 Queues: A collection of elements with First In, First Out (FIFO)
access. Operations include enqueue (insert) and dequeue (remove).

Hierarchical Data Structures

 Trees: A hierarchical structure consisting of nodes, with a single root node and child nodes forming a parent-child relationship. Examples include:
o Binary Trees: Each node has at most two children.
o Binary Search Trees (BST): A binary tree where each
node's left child is less than the node, and the right child is
greater.
o Heaps: A specialized tree-based structure that satisfies the
heap property (min-heap or max-heap).
o Tries: A tree used for storing strings, where each node represents a character.

Graph Data Structures

 Graphs: Consist of nodes (vertices) connected by edges. Graphs can be:
o Directed Graphs (Digraphs): Edges have a direction.
o Undirected Graphs: Edges do not have a direction.
o Weighted Graphs: Edges have weights/costs associated with
them.
o Unweighted Graphs: Edges do not have weights.

Advanced Data Structures


 Heaps: A specialized tree-based data structure that satisfies the
heap property. Used in priority queues and for efficient sorting
(heap sort).
 Balanced Trees: Trees that maintain a balanced height to ensure
operations like insertion, deletion, and lookup are performed in
logarithmic time. Examples include AVL trees and Red-Black trees.
 B-Trees: A self-balancing tree data structure that maintains sorted
data and allows searches, sequential access, insertions, and
deletions in logarithmic time. Commonly used in databases and file
systems.

Key Operations on Data Structures

 Insertion: Adding a new element.


 Deletion: Removing an element.
 Traversal: Accessing each element in the data structure, often to
perform a computation.
 Searching: Finding a specific element.
 Sorting: Arranging elements in a specific order.

Applications of Data Structures

 Arrays and Linked Lists: Used for implementing other data structures and algorithms.
 Stacks and Queues: Useful in scenarios like expression
evaluation, backtracking, and scheduling tasks.
 Trees and Graphs: Crucial for hierarchical data representation,
network routing algorithms, and finding shortest paths.
 Hash Tables: Efficient for lookups, such as in databases and
caching mechanisms.

Arrays: An array is a data structure consisting of a collection of elements, each identified by an array index or key. Arrays are commonly used in programming to store multiple values of the same type in a single, contiguous block of memory.

Characteristics of Arrays

1. Fixed Size: The size of an array is defined when it is created and cannot be changed. This means the number of elements it can hold is fixed.

2. Homogeneous Elements: All elements in an array are of the same data type (e.g., all integers, all floats, all characters).

3. Contiguous Memory Allocation: Elements of an array are stored in contiguous memory locations, which allows for efficient indexing and retrieval of elements.

4. Indexed Access: Each element in an array can be accessed using its index. Array indices typically start from 0 in most programming languages (e.g., the first element is at index 0, the second at index 1).

Types of Arrays

1. One-dimensional Arrays (1D Arrays): A simple linear array where elements are arranged in a single row. For example:

int[] numbers = {1, 2, 3, 4, 5};

Example: Given the base address of an array A[1300 … 1900] as 1020 and the size of each element as 2 bytes in memory, find the address of A[1700].

Solution:

Given:

 Base address (B) = 1020

 Lower bound (LB) = 1300

 Size of each element (W) = 2 bytes

 Index of element (not value) = 1700

Formula used:
Address of A[Index] = B + W * (Index – LB)
Address of A[1700] = 1020 + 2 * (1700 – 1300)
= 1020 + 2 * (400)
= 1020 + 800
Address of A[1700] = 1820
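The 1-D address formula above can be sketched as a small Java method (the class and method names are illustrative, not from the notes):

```java
public class ArrayAddress {
    // Address of A[index] = base + width * (index - lowerBound)
    static int address1D(int base, int width, int index, int lowerBound) {
        return base + width * (index - lowerBound);
    }

    public static void main(String[] args) {
        // Worked example from the notes: A[1300 .. 1900], base 1020, 2-byte elements
        System.out.println(address1D(1020, 2, 1700, 1300)); // 1820
    }
}
```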

Calculate the address of any element in the 2-D array:

A 2-dimensional array can be defined as an array of arrays. 2-dimensional arrays are organized as matrices, which can be represented as a collection of rows and columns: array[M][N], where M is the number of rows and N is the number of columns.

Row Major Order:

Row-major ordering assigns successive elements, moving across a row and then down to the next row, to successive memory locations. In simple terms, the elements of the array are stored in a row-wise fashion. To find the address of an element using row-major order, use the following formula:

Address of A[I][J] = B + W * ((I – LR) * N + (J – LC))

I = row subscript of the element whose address is to be found,
J = column subscript of the element whose address is to be found,
B = base address,
W = storage size of one element in the array (in bytes),
LR = lower limit of the row/start row index of the matrix (assume 0 if not given),
LC = lower limit of the column/start column index of the matrix (assume 0 if not given),
N = number of columns in the matrix.

Example: Given an array arr[1 … 10][1 … 15] with base address 100 and an element size of 1 byte in memory, find the address of arr[8][6] using row-major order.

Solution:

Given:
Base address B = 100
Storage size of one element W = 1 byte
Row subscript of the element I = 8
Column subscript of the element J = 6
Lower limit of row index LR = 1
Lower limit of column index LC = 1
Number of columns N = Upper bound – Lower bound + 1 = 15 – 1 + 1 = 15

Formula:
Address of A[I][J] = B + W * ((I – LR) * N + (J – LC))

Solution:
Address of A[8][6] = 100 + 1 * ((8 – 1) * 15 + (6 – 1))
= 100 + 1 * ((7) * 15 + (5))
= 100 + 1 * (110)
Address of A[8][6] = 210

Column Major Order:

If the elements of an array are stored column-wise, moving down a column and then to the next column, the array is in column-major order. To find the address of an element using column-major order, use the following formula:
Address of A[I][J] = B + W * ((J – LC) * M + (I – LR))

I = row subscript of the element whose address is to be found,
J = column subscript of the element whose address is to be found,
B = base address,
W = storage size of one element in the array (in bytes),
LR = lower limit of the row/start row index of the matrix (assume 0 if not given),
LC = lower limit of the column/start column index of the matrix (assume 0 if not given),
M = number of rows in the matrix.

Example: Given an array arr[1 … 10][1 … 15] with a base address of 100 and an element size of 1 byte in memory, find the address of arr[8][6] using column-major order.

Solution:

Given:
Base address B = 100
Storage size of one element W = 1 byte
Row subscript of the element I = 8
Column subscript of the element J = 6
Lower limit of row index LR = 1
Lower limit of column index LC = 1
Number of rows M = Upper bound – Lower bound + 1 = 10 – 1 + 1 = 10

Formula used:
Address of A[I][J] = B + W * ((J – LC) * M + (I – LR))
Address of A[8][6] = 100 + 1 * ((6 – 1) * 10 + (8 – 1))
= 100 + 1 * ((5) * 10 + (7))
= 100 + 1 * (57)
Address of A[8][6] = 157
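Both addressing formulas can be sketched in Java; the methods simply evaluate the formulas above, and all names are illustrative:

```java
public class MatrixAddress {
    // Row-major: B + W * ((I - LR) * N + (J - LC)), where N = number of columns
    static int rowMajor(int b, int w, int i, int j, int lr, int lc, int n) {
        return b + w * ((i - lr) * n + (j - lc));
    }

    // Column-major: B + W * ((J - LC) * M + (I - LR)), where M = number of rows
    static int colMajor(int b, int w, int i, int j, int lr, int lc, int m) {
        return b + w * ((j - lc) * m + (i - lr));
    }

    public static void main(String[] args) {
        // Worked examples from the notes: arr[1..10][1..15], base 100, 1-byte elements
        System.out.println(rowMajor(100, 1, 8, 6, 1, 1, 15)); // 210
        System.out.println(colMajor(100, 1, 8, 6, 1, 1, 10)); // 157
    }
}
```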

Multi-dimensional Arrays: Arrays of arrays, where each element is itself an array. The most common type is the two-dimensional array (2D array), which can be visualized as a matrix or table with rows and columns.

Example:

int[][] matrix = {

{1, 2, 3},

{4, 5, 6},

{7, 8, 9}
};

Operations on Arrays

1. Accessing Elements: Elements are accessed using their index. For example, numbers[2] retrieves the third element of a 1D array numbers.

2. Modifying Elements: Elements can be updated using their index. For example, numbers[2] = 10 changes the third element to 10.

3. Traversing Arrays: Iterating through each element, often using loops.

Example:

for (int i = 0; i < numbers.length; i++) {
    System.out.println(numbers[i]);
}
4. Insertion and Deletion: Since arrays have a fixed size, inserting or deleting elements requires creating a new array and copying the elements. This is a costly operation in terms of time complexity.
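A minimal Java sketch of such an insertion, allocating a new array one element larger and shifting the tail right (class and method names are illustrative):

```java
import java.util.Arrays;

public class ArrayInsert {
    // Insert value at position pos by copying into a new, larger array.
    static int[] insertAt(int[] a, int pos, int value) {
        int[] out = new int[a.length + 1];
        System.arraycopy(a, 0, out, 0, pos);                      // elements before pos
        out[pos] = value;                                         // the new element
        System.arraycopy(a, pos, out, pos + 1, a.length - pos);   // shifted tail
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(insertAt(new int[]{1, 2, 4, 5}, 2, 3))); // [1, 2, 3, 4, 5]
    }
}
```

Deletion works the same way in reverse: copy into an array one element smaller, skipping the removed index.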

Advantages of Arrays

1. Efficient Access: Accessing elements by index is very fast, with O(1) time complexity.
2. Memory Efficiency: Arrays use a contiguous block of memory,
which can be more memory-efficient compared to other data
structures like linked lists.

Disadvantages of Arrays

1. Fixed Size: Once an array is created, its size cannot be changed. This can lead to wasted memory if the array is not fully utilized, or it can require creating a new array if more space is needed.
2. Costly Insertions and Deletions: Adding or removing elements
can be expensive because it may require shifting elements and
resizing the array.

Applications of Arrays

1. Storing Collections of Data: Arrays are used to store lists of items, such as numbers, characters, or objects.
2. Implementing Other Data Structures: Arrays can be used to
implement other data structures, like stacks, queues, and heaps.
3. Matrix Operations: 2D arrays are used to represent matrices in
mathematical computations.
4. Data Storage in Programming: Arrays are fundamental in various
programming tasks, from simple data storage to complex algorithm
implementations.

Insertion Sort
Insertion Sort is a simple sorting algorithm that builds the final sorted array (or
list) one item at a time. It is much less efficient on large lists than more
advanced algorithms such as quicksort, heapsort, or merge sort.
How It Works:
1. Start from the second element: Assume that the first element is already
sorted.
2. Compare the current element with the elements in the sorted part of the
array.
3. Shift all the elements that are greater than the current element to the
right by one position.
4. Insert the current element in its correct position.
5. Repeat the process for all elements in the array.
Algorithm Steps:
Given an array A of n elements:
1. Start with the second element (index i = 1).
2. Compare the element A[i] with A[i-1], A[i-2], and so on.
3. Shift elements to the right until you find the correct position for A[i].
4. Insert A[i] in the correct position.
5. Move to the next element and repeat until the entire array is sorted.
Example:
Consider the array: [5, 2, 9, 1, 5, 6]
Step-by-Step Insertion Sort:
1. Initial Array: [5, 2, 9, 1, 5, 6]
2. Pass 1: (i = 1)
o Compare 2 with 5. Since 2 < 5, shift 5 to the right.
o Insert 2 in the first position.
o Array becomes: [2, 5, 9, 1, 5, 6]
3. Pass 2: (i = 2)
o Compare 9 with 5. Since 9 > 5, no changes are made.
o Array remains: [2, 5, 9, 1, 5, 6]
4. Pass 3: (i = 3)
o Compare 1 with 9, 5, and 2. Shift them to the right.
o Insert 1 in the first position.
o Array becomes: [1, 2, 5, 9, 5, 6]
5. Pass 4: (i = 4)
o Compare 5 with 9. Since 5 < 9, shift 9 to the right.
o Insert 5 in the correct position.
o Array becomes: [1, 2, 5, 5, 9, 6]
6. Pass 5: (i = 5)
o Compare 6 with 9. Since 6 < 9, shift 9 to the right.
o Insert 6 in the correct position.
o Array becomes: [1, 2, 5, 5, 6, 9]
Final Sorted Array: [1, 2, 5, 5, 6, 9]
Time Complexity:
 Best Case: O(n) [When the array is already sorted]
 Average and Worst Case: O(n²) [When the array is in reverse order or
unsorted]
Advantages:
 Simple to implement.
 Efficient for small data sets.
 Stable (does not change the relative order of equal elements).
Disadvantages:
 Not suitable for large data sets.
 Higher time complexity compared to more advanced sorting algorithms
like Quick Sort or Merge Sort.
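The steps above can be sketched as a short Java method (a minimal illustrative implementation; names are not from the notes):

```java
import java.util.Arrays;

public class InsertionSort {
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift elements greater than key one position to the right
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key; // insert key in its correct position
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 9, 1, 5, 6}; // the example array from the notes
        insertionSort(a);
        System.out.println(Arrays.toString(a)); // [1, 2, 5, 5, 6, 9]
    }
}
```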

Selection Sort
Selection Sort is a simple comparison-based sorting algorithm. It works by
repeatedly finding the minimum (or maximum) element from the unsorted
portion of the list and moving it to the sorted portion.
How It Works:
1. Start with the first element: Assume the first element is the minimum.
2. Compare this element with the rest of the elements in the list.
3. Find the smallest element in the unsorted portion.
4. Swap the smallest element with the first element.
5. Move the boundary between the sorted and unsorted portions one step
to the right.
6. Repeat the process for all elements in the list.
Algorithm Steps:
Given an array A of n elements:
1. Start with the first element (index i = 0).
2. Assume A[i] is the minimum element.
3. Compare A[i] with every element A[j] where j ranges from i+1 to n-1.
4. If a smaller element is found, update the index of the minimum element.
5. Swap A[i] with the smallest element found in step 4.
6. Move to the next element and repeat until the entire array is sorted.
Example:
Consider the array: [29, 10, 14, 37, 13]
Step-by-Step Selection Sort:
1. Initial Array: [29, 10, 14, 37, 13]
2. Pass 1: (i = 0)
o Find the smallest element in the array [29, 10, 14, 37, 13], which is
10.
o Swap 10 with 29.
o Array becomes: [10, 29, 14, 37, 13]
3. Pass 2: (i = 1)
o Find the smallest element in the remaining array [29, 14, 37, 13],
which is 13.
o Swap 13 with 29.
o Array becomes: [10, 13, 14, 37, 29]
4. Pass 3: (i = 2)
o Find the smallest element in the remaining array [14, 37, 29], which
is 14.
o No need to swap since 14 is already in place.
o Array remains: [10, 13, 14, 37, 29]
5. Pass 4: (i = 3)
o Find the smallest element in the remaining array [37, 29], which is
29.
o Swap 29 with 37.
o Array becomes: [10, 13, 14, 29, 37]
6. Pass 5: (i = 4)
o The last element is already in place.
o Array remains: [10, 13, 14, 29, 37]
Final Sorted Array: [10, 13, 14, 29, 37]
Time Complexity:
 Best, Average, and Worst Case: O(n²) [As the algorithm always makes
n-1 comparisons for every pass]
Advantages:
 Simple to implement.
 Performs well on small lists.
 In-place sorting algorithm (doesn't require extra space).
Disadvantages:
 Inefficient for large lists.
 Unstable (it might change the relative order of equal elements).
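The algorithm steps above can be sketched in Java as follows (a minimal illustrative implementation; names are not from the notes):

```java
import java.util.Arrays;

public class SelectionSort {
    static void selectionSort(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            // Find the index of the smallest element in the unsorted portion
            for (int j = i + 1; j < a.length; j++) {
                if (a[j] < a[min]) min = j;
            }
            // Swap the smallest element into position i
            int tmp = a[i];
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {29, 10, 14, 37, 13}; // the example array from the notes
        selectionSort(a);
        System.out.println(Arrays.toString(a)); // [10, 13, 14, 29, 37]
    }
}
```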

Bubble Sort

Bubble Sort is a simple comparison-based sorting algorithm. It repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The process is repeated until the list is sorted.

How It Works:
1. Start at the beginning of the list.
2. Compare adjacent elements. If the current element is greater than the
next one, swap them.
3. Move to the next pair and repeat the comparison and swap if necessary.
4. Continue to the end of the list. At the end of each pass, the largest
element will "bubble up" to its correct position.
5. Repeat the process for the remaining unsorted portion of the list until no
swaps are needed.
Algorithm Steps:
Given an array A of n elements:
1. Start with the first element (index i = 0).
2. Compare A[i] with A[i+1].
o If A[i] > A[i+1], swap them.
3. Move to the next element and repeat the comparison and swap process
for the entire array.
4. After each complete pass, the largest unsorted element will be in its
correct position at the end of the list.
5. Repeat the passes until no swaps are made during a pass, indicating the
list is sorted.
Example:
Consider the array: [5, 3, 8, 4, 2]
Step-by-Step Bubble Sort:
1. Initial Array: [5, 3, 8, 4, 2]
2. Pass 1:
o Compare 5 and 3. Swap them.
o Array becomes: [3, 5, 8, 4, 2]
o Compare 5 and 8. No swap needed.
o Compare 8 and 4. Swap them.
o Array becomes: [3, 5, 4, 8, 2]
o Compare 8 and 2. Swap them.
o Array becomes: [3, 5, 4, 2, 8]
o End of Pass 1: Largest element (8) is in its correct position.
3. Pass 2:
o Compare 3 and 5. No swap needed.
o Compare 5 and 4. Swap them.
o Array becomes: [3, 4, 5, 2, 8]
o Compare 5 and 2. Swap them.
o Array becomes: [3, 4, 2, 5, 8]
o End of Pass 2: Second largest element (5) is in its correct position.
4. Pass 3:
o Compare 3 and 4. No swap needed.
o Compare 4 and 2. Swap them.
o Array becomes: [3, 2, 4, 5, 8]
o End of Pass 3: Third largest element (4) is in its correct position.
5. Pass 4:
o Compare 3 and 2. Swap them.
o Array becomes: [2, 3, 4, 5, 8]
o End of Pass 4: Fourth largest element (3) is in its correct position.
6. Pass 5:
o No swaps needed; the array is sorted.
Final Sorted Array: [2, 3, 4, 5, 8]
Time Complexity:
 Best Case: O(n) [When the array is already sorted and no swaps are
needed]
 Average and Worst Case: O(n²) [When the array is in reverse order or
unsorted]
Advantages:
 Simple to understand and implement.
 In-place sorting algorithm (doesn't require extra space).
 Stable (does not change the relative order of equal elements).
Disadvantages:
 Inefficient for large data sets.
 Higher time complexity compared to more advanced sorting algorithms.
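A minimal Java sketch of the passes described above, including the early exit when a pass makes no swaps (names are illustrative):

```java
import java.util.Arrays;

public class BubbleSort {
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            boolean swapped = false;
            // After each pass the largest unsorted element bubbles to the end,
            // so the inner loop can stop one position earlier each time.
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                    swapped = true;
                }
            }
            if (!swapped) break; // no swaps means the array is already sorted
        }
    }

    public static void main(String[] args) {
        int[] a = {5, 3, 8, 4, 2}; // the example array from the notes
        bubbleSort(a);
        System.out.println(Arrays.toString(a)); // [2, 3, 4, 5, 8]
    }
}
```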

Merge Sort

Merge Sort is a divide-and-conquer sorting algorithm that breaks the list into smaller sublists until each sublist contains a single element. It then merges the sublists back together in sorted order.

How It Works:
1. Divide the unsorted list into two approximately equal halves.
2. Recursively sort each half by applying the merge sort algorithm to them.
3. Merge the two halves back together in a sorted manner to produce the
final sorted list.
Algorithm Steps:
Given an array A of n elements:
1. If the list contains 1 or 0 elements, it is already sorted; return it.
2. Divide the array into two halves: left_half and right_half.
3. Recursively apply merge sort to left_half.
4. Recursively apply merge sort to right_half.
5. Merge the sorted left_half and right_half into a single sorted list.
Example:
Consider the array: [38, 27, 43, 3, 9, 82, 10]
Step-by-Step Merge Sort:
1. Initial Array: [38, 27, 43, 3, 9, 82, 10]
2. Step 1: Divide the array into halves:
o Left half: [38, 27, 43]
o Right half: [3, 9, 82, 10]
3. Step 2: Recursively sort the left half [38, 27, 43]:
o Divide into: [38] and [27, 43]
o Sort [27, 43] by dividing into [27] and [43] (already sorted).
o Merge [27] and [43] to get [27, 43].
o Merge [38] and [27, 43] to get [27, 38, 43].
4. Step 3: Recursively sort the right half [3, 9, 82, 10]:
o Divide into: [3, 9] and [82, 10]
o Sort [3, 9] by dividing into [3] and [9] (already sorted).
o Merge [3] and [9] to get [3, 9].
o Sort [82, 10] by dividing into [82] and [10] (already sorted).
o Merge [10] and [82] to get [10, 82].
o Merge [3, 9] and [10, 82] to get [3, 9, 10, 82].
5. Step 4: Merge the sorted halves [27, 38, 43] and [3, 9, 10, 82]:
o Compare elements and merge to get the final sorted array [3, 9, 10,
27, 38, 43, 82].
Final Sorted Array: [3, 9, 10, 27, 38, 43, 82]
Time Complexity:
 Best, Average, and Worst Case: O(n log n) [The time complexity is the
same for all cases since the algorithm always divides the list into halves
and merges them back together]
Advantages:
 Efficient for large data sets.
 Stable (does not change the relative order of equal elements).
 Works well with linked lists and external storage (e.g., sorting large files).
Disadvantages:
 Requires additional space proportional to the size of the list (O(n) extra
space).
 Slower than some algorithms (like quicksort) for smaller datasets.
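The divide-and-merge steps above can be sketched in Java (a minimal illustrative implementation that returns a new sorted array; names are not from the notes):

```java
import java.util.Arrays;

public class MergeSort {
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) return a; // base case: 0 or 1 elements is already sorted
        int mid = a.length / 2;
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        return merge(left, right);
    }

    static int[] merge(int[] l, int[] r) {
        int[] out = new int[l.length + r.length];
        int i = 0, j = 0, k = 0;
        while (i < l.length && j < r.length) {
            out[k++] = (l[i] <= r[j]) ? l[i++] : r[j++]; // <= keeps the sort stable
        }
        while (i < l.length) out[k++] = l[i++]; // drain leftovers
        while (j < r.length) out[k++] = r[j++];
        return out;
    }

    public static void main(String[] args) {
        int[] a = {38, 27, 43, 3, 9, 82, 10}; // the example array from the notes
        System.out.println(Arrays.toString(mergeSort(a))); // [3, 9, 10, 27, 38, 43, 82]
    }
}
```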

Radix Sort

Radix Sort is a non-comparative integer sorting algorithm that sorts numbers by processing individual digits, from the least significant digit (LSD) to the most significant digit (MSD) or vice versa. Radix sort is particularly useful when sorting numbers or strings where each element has a fixed number of digits or characters.

How It Works:
1. Start from the least significant digit (LSD): Sort the numbers based
on this digit.
2. Move to the next digit (next more significant digit) and repeat the
sorting process.
3. Continue this process until all digits have been processed.
4. The list will be sorted after the final pass.
Radix sort typically uses a stable subroutine, like counting sort, to sort the digits.
Algorithm Steps:
Given an array of integers:
1. Determine the maximum number in the array to know the number of
digits (let's call it d).
2. Start with the least significant digit (LSD) and sort the array based on this
digit.
3. Move to the next digit and repeat the sorting process.
4. Continue until all digits have been processed.
Example:
Consider the array: [170, 45, 75, 90, 802, 24, 2, 66]
Step-by-Step Radix Sort:
1. Initial Array: [170, 45, 75, 90, 802, 24, 2, 66]
2. Step 1: Sort based on the least significant digit (units place):
o Arrange numbers based on the last digit: [170, 90, 802, 2, 24, 45,
75, 66]
3. Step 2: Sort based on the next digit (tens place):
o Arrange numbers based on the tens digit: [802, 2, 24, 45, 66, 170,
75, 90]
4. Step 3: Sort based on the most significant digit (hundreds place):
o Arrange numbers based on the hundreds digit: [2, 24, 45, 66, 75,
90, 170, 802]
Final Sorted Array: [2, 24, 45, 66, 75, 90, 170, 802]
Time Complexity:
 Best, Average, and Worst Case: O(nk), where n is the number of
elements and k is the number of digits in the largest number.
Advantages:
 Linear time complexity under certain conditions (when the range of digits
is fixed and not too large).
 Stable (does not change the relative order of equal elements).
Disadvantages:
 Requires additional space for sorting based on each digit.
 Not suitable for non-integer data (unless modified) and may not be as
efficient as comparison-based sorts for small datasets.
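A minimal Java sketch of LSD radix sort for non-negative integers, using the stable counting-sort subroutine mentioned above (names are illustrative):

```java
import java.util.Arrays;

public class RadixSort {
    static void radixSort(int[] a) {
        int max = Arrays.stream(a).max().orElse(0);
        // One stable counting-sort pass per digit, least significant first
        for (int exp = 1; max / exp > 0; exp *= 10) {
            countingSortByDigit(a, exp);
        }
    }

    static void countingSortByDigit(int[] a, int exp) {
        int[] out = new int[a.length];
        int[] count = new int[10];
        for (int v : a) count[(v / exp) % 10]++;
        for (int d = 1; d < 10; d++) count[d] += count[d - 1]; // prefix sums give end positions
        for (int i = a.length - 1; i >= 0; i--) {               // iterate backwards for stability
            out[--count[(a[i] / exp) % 10]] = a[i];
        }
        System.arraycopy(out, 0, a, 0, a.length);
    }

    public static void main(String[] args) {
        int[] a = {170, 45, 75, 90, 802, 24, 2, 66}; // the example array from the notes
        radixSort(a);
        System.out.println(Arrays.toString(a)); // [2, 24, 45, 66, 75, 90, 170, 802]
    }
}
```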
Linear Search
Linear Search is a simple search algorithm that checks every element in a list
sequentially until it finds the target value or reaches the end of the list. It's
straightforward and doesn't require the list to be sorted.
How It Works:
1. Start at the first element of the list.
2. Compare the target value with the current element.
3. If the target value matches the current element, the search is successful,
and the index of the element is returned.
4. If the target value does not match, move to the next element.
5. Continue this process until the target value is found or the list ends.
6. If the target value is not found in the list, the search is unsuccessful, and a
special value (like -1 or null) is returned.
Algorithm Steps:
Given an array A of n elements and a target value x:
1. Start with the first element (index i = 0).
2. Compare A[i] with x.
o If A[i] == x, return i.
o If A[i] != x, move to the next element.
3. Repeat step 2 until the end of the array.
4. If x is not found, return a special value (like -1).
Example:
Consider the array: [10, 23, 45, 70, 11, 15]
Search for the target value 70:
1. Initial Array: [10, 23, 45, 70, 11, 15]
2. Step 1: Compare 10 with 70. (Not a match)
3. Step 2: Compare 23 with 70. (Not a match)
4. Step 3: Compare 45 with 70. (Not a match)
5. Step 4: Compare 70 with 70. (Match found)
Result: The target value 70 is found at index 3.
Search for the target value 25:
1. Initial Array: [10, 23, 45, 70, 11, 15]
2. Step 1: Compare 10 with 25. (Not a match)
3. Step 2: Compare 23 with 25. (Not a match)
4. Step 3: Compare 45 with 25. (Not a match)
5. Step 4: Compare 70 with 25. (Not a match)
6. Step 5: Compare 11 with 25. (Not a match)
7. Step 6: Compare 15 with 25. (Not a match)
Result: The target value 25 is not found in the array, so the search returns -1.
Time Complexity:
 Best Case: O(1) [When the target value is the first element]
 Worst and Average Case: O(n) [When the target value is the last
element or not present in the list]
Advantages:
 Simple to implement and understand.
 Works on unsorted data.
Disadvantages:
 Inefficient for large lists compared to more advanced search algorithms
like binary search.
 Suboptimal when the list is sorted, since binary search would be much faster.
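The algorithm steps above can be sketched as a short Java method (names are illustrative):

```java
public class LinearSearch {
    // Returns the index of target in a, or -1 if it is not present.
    static int linearSearch(int[] a, int target) {
        for (int i = 0; i < a.length; i++) {
            if (a[i] == target) return i; // match found
        }
        return -1; // reached the end without a match
    }

    public static void main(String[] args) {
        int[] a = {10, 23, 45, 70, 11, 15}; // the example array from the notes
        System.out.println(linearSearch(a, 70)); // 3
        System.out.println(linearSearch(a, 25)); // -1
    }
}
```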
Binary Search
Binary Search is an efficient search algorithm that works on sorted lists. It
repeatedly divides the search interval in half to locate the target value. Since it
eliminates half of the elements in each step, it is much faster than linear search,
especially for large datasets.
How It Works:
1. Start by defining the search interval: Set two pointers, low (the start
of the list) and high (the end of the list).
2. Find the middle element of the current search interval.
3. Compare the target value with the middle element:
o If the target value matches the middle element, the search is
successful, and the index of the middle element is returned.
o If the target value is less than the middle element, adjust the high
pointer to search in the left half.
o If the target value is greater than the middle element, adjust the
low pointer to search in the right half.
4. Repeat the process with the new search interval until the target value is
found or the search interval is empty (i.e., low > high).
5. If the search interval becomes empty, the target value is not in the list,
and a special value (like -1) is returned.
Algorithm Steps:
Given a sorted array A of n elements and a target value x:
1. Initialize low = 0 and high = n - 1.
2. While low <= high, repeat the following steps:
o Calculate mid = (low + high) / 2.
o Compare A[mid] with x.
o If A[mid] == x, return mid.
o If A[mid] < x, set low = mid + 1.
o If A[mid] > x, set high = mid - 1.
3. If low exceeds high, return -1 (indicating the target value is not found).
Example:
Consider the sorted array: [2, 3, 4, 10, 40]
Search for the target value 10:
1. Initial Array: [2, 3, 4, 10, 40]
2. Step 1: Set low = 0, high = 4. Calculate mid = (0 + 4) / 2 = 2.
o Compare A[2] = 4 with 10. (Not a match)
o Since 10 > 4, search in the right half. Set low = 3.
3. Step 2: Set low = 3, high = 4. Calculate mid = (3 + 4) / 2 = 3.
o Compare A[3] = 10 with 10. (Match found)
Result: The target value 10 is found at index 3.
Search for the target value 5:
1. Initial Array: [2, 3, 4, 10, 40]
2. Step 1: Set low = 0, high = 4. Calculate mid = (0 + 4) / 2 = 2.
o Compare A[2] = 4 with 5. (Not a match)
o Since 5 > 4, search in the right half. Set low = 3.
3. Step 2: Set low = 3, high = 4. Calculate mid = (3 + 4) / 2 = 3.
o Compare A[3] = 10 with 5. (Not a match)
o Since 5 < 10, search in the left half. Set high = 2.
4. Step 3: Now, low = 3 and high = 2 (which means low > high), so the
search ends.
Result: The target value 5 is not found in the array, so the search returns -1.
Time Complexity:
 Best Case: O(1) [When the target value is the middle element]
 Worst and Average Case: O(log n) [Because the list is halved with each
step]
Advantages:
 Much faster than linear search, especially for large, sorted datasets.
 Efficient in terms of time complexity.
Disadvantages:
 The list must be sorted before performing a binary search.
 Not suitable for linked lists or data structures where direct access to
elements is not possible.
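The iterative procedure above can be sketched in Java as follows; the only change from the notes' formula is computing the midpoint as low + (high − low) / 2, which avoids integer overflow for very large arrays (names are illustrative):

```java
public class BinarySearch {
    // Returns the index of target in sorted array a, or -1 if it is not present.
    static int binarySearch(int[] a, int target) {
        int low = 0, high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2; // overflow-safe midpoint
            if (a[mid] == target) return mid;
            else if (a[mid] < target) low = mid + 1;  // search the right half
            else high = mid - 1;                      // search the left half
        }
        return -1; // interval became empty: target not present
    }

    public static void main(String[] args) {
        int[] a = {2, 3, 4, 10, 40}; // the example array from the notes
        System.out.println(binarySearch(a, 10)); // 3
        System.out.println(binarySearch(a, 5));  // -1
    }
}
```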

Hash Function
A hash function is an algorithm that takes an input (or "key") and returns a fixed-size value, typically a numerical hash code, that serves as a compact identifier for the input data. Because the output range is smaller than the input space, different inputs can occasionally produce the same hash code (a collision). Hash functions are widely used in applications such as data storage, retrieval, and cryptography.
How It Works:
1. Input: The hash function takes an input, which can be data of any size
(e.g., a string, file, or number).
2. Processing: The hash function processes this input using a specific
algorithm to produce a hash value.
3. Output: The hash value is a fixed-size, typically shorter than the input, and
is used as a unique identifier for the original data.
Properties of a Good Hash Function:
 Deterministic: The same input will always produce the same output hash.
 Fast Computation: The hash function should be able to process data
quickly.
 Uniform Distribution: The hash values should be evenly distributed across
the possible output range to minimize collisions.
 Pre-image Resistance: It should be difficult to reverse the process (i.e., to
find the original input given only the hash value).
 Collision Resistance: It should be difficult to find two different inputs that
produce the same hash value.
Applications of Hash Functions:
 Hash Tables: Used to store and retrieve data quickly. Hash functions map
data to specific locations (buckets) in the hash table.
 Data Integrity: Hash functions can be used to verify the integrity of data.
For example, if two hash values match, the data has not been altered.
 Cryptography: Hash functions are used in cryptographic algorithms to
secure data, passwords, and digital signatures.
Example of a Hash Function:
Consider a simple hash function h(x) = x % 10, where x is an integer and % is
the modulus operator.
 Input: x = 35
 Hash Value: h(35) = 35 % 10 = 5
Here, the hash function returns 5 as the hash value for the input 35. This means
that if 35 is stored in a hash table, it would be placed in the bucket
corresponding to the index 5.
Example in a Hash Table:
Suppose we have the following inputs: [21, 14, 18, 35, 42]
Using the hash function h(x) = x % 10:
 h(21) = 21 % 10 = 1 → Store 21 in bucket 1.
 h(14) = 14 % 10 = 4 → Store 14 in bucket 4.
 h(18) = 18 % 10 = 8 → Store 18 in bucket 8.
 h(35) = 35 % 10 = 5 → Store 35 in bucket 5.
 h(42) = 42 % 10 = 2 → Store 42 in bucket 2.
The hash table would look like this:

Index:  0    1    2    3    4    5    6    7    8    9
Value:  –    21   42   –    14   35   –    –    18   –

Collision
When multiple inputs produce the same hash value, it results in a collision in a
hash table. Collision resolution strategies are methods used to handle these
collisions so that the hash table can still efficiently store and retrieve data. Here
are the common collision resolution strategies:
1. Chaining
In chaining, each bucket of the hash table points to a linked list (or another data
structure like a dynamic array) that contains all elements that hash to the same
index.
 How It Works:
o If a collision occurs, the new element is added to the linked list at
the corresponding index.
o When searching for an element, you hash the key to find the correct
bucket and then search through the linked list in that bucket.
 Advantages:
o Simple to implement.
o No need to worry about running out of space in a bucket.
o Can handle a large number of collisions effectively.
 Disadvantages:
o Requires additional memory for pointers in the linked lists.
o Performance degrades to O(n) in the worst case if many elements
hash to the same bucket.
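A minimal Java sketch of chaining, using the h(x) = x mod table-size function from the example above; the class and method names are illustrative:

```java
import java.util.LinkedList;

public class ChainedHashTable {
    private final LinkedList<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    int hash(int key) { return key % buckets.length; }          // h(x) = x mod table size

    void insert(int key) { buckets[hash(key)].add(key); }       // collisions append to the chain

    boolean contains(int key) { return buckets[hash(key)].contains(key); }

    public static void main(String[] args) {
        ChainedHashTable t = new ChainedHashTable(10);
        for (int k : new int[]{21, 14, 18, 35, 42, 25}) t.insert(k); // 35 and 25 collide in bucket 5
        System.out.println(t.contains(35) && t.contains(25)); // true
    }
}
```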

Sparse Matrices and Their Representations


A sparse matrix is a matrix in which most of the elements are zero. In contrast,
a dense matrix has few or no zero elements. Sparse matrices are common in
scientific computing, data science, and engineering because many real-world
problems result in matrices where only a few elements are non-zero.
Representations of Sparse Matrices
To efficiently store and manipulate sparse matrices, several representations are
used. These representations focus on storing only the non-zero elements along
with their positions.
1. Coordinate List (COO) Format
The COO format is one of the simplest representations of a sparse matrix. It
stores a list of the non-zero elements along with their row and column indices.

 Storage:
o A list (or array) for row indices.
o A list (or array) for column indices.
o A list (or array) for non-zero values.

Example:
Consider the sparse matrix:

0 0 3 0
0 0 0 4
5 0 0 0

In COO format, the non-zero elements are stored as three parallel lists:
 Row: [0, 1, 2]
 Column: [2, 3, 0]
 Values: [3, 4, 5]

2. Compressed Sparse Row (CSR) Format

The CSR format stores the non-zero values row by row, together with their column indices and a row-pointer array marking where each row begins in the values list.

 The CSR representation of the same matrix would be:
 Values: [3, 4, 5]
 ColumnIndex: [2, 3, 0]
 RowPointer: [0, 1, 2, 3]
Here:
 RowPointer[0] = 0 (row 0 starts at index 0 in Values[])
 RowPointer[1] = 1 (row 1 starts at index 1 in Values[])
 RowPointer[2] = 2 (row 2 starts at index 2 in Values[])
 Advantages:
 Efficient for row-wise matrix operations.
 Saves space compared to COO format.
 Disadvantages:
 Complex to implement compared to COO format.
 Insertion of elements can be costly as the data must be shifted.
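A minimal Java sketch of looking up an element from CSR arrays matching the Values/ColumnIndex/RowPointer example above (class and method names are illustrative):

```java
public class CsrMatrix {
    // CSR arrays for the example matrix: values, their column indices,
    // and row pointers (row i spans VALUES[ROW_PTR[i] .. ROW_PTR[i+1])).
    static final int[] VALUES = {3, 4, 5};
    static final int[] COL_INDEX = {2, 3, 0};
    static final int[] ROW_PTR = {0, 1, 2, 3};

    // Look up element (row, col); returns 0 if the element is not stored.
    static int get(int row, int col) {
        for (int k = ROW_PTR[row]; k < ROW_PTR[row + 1]; k++) {
            if (COL_INDEX[k] == col) return VALUES[k];
        }
        return 0; // implicit zero of the sparse matrix
    }

    public static void main(String[] args) {
        System.out.println(get(0, 2)); // 3
        System.out.println(get(2, 0)); // 5
        System.out.println(get(1, 1)); // 0
    }
}
```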
