DS Notes Unit 1
Characteristics of Arrays
Types of Arrays
Example: Find the address of A[1700], given base address B = 1020, element size W = 2 bytes, and lower bound LB = 1300.
Solution:
Given:
Base address B = 1020
Storage size of one element W = 2 bytes
Lower bound of the array index LB = 1300
Index of the element whose address is to be found = 1700
Formula used:
Address of A[Index] = B + W * (Index – LB)
Address of A[1700] = 1020 + 2 * (1700 – 1300)
= 1020 + 2 * (400)
= 1020 + 800
Address of A[1700] = 1820
Example (row-major order):
Solution:
Given:
Base address B = 100
Storage size of one element W = 1 byte
Row subscript of the element whose address is to be found I = 8
Column subscript of the element whose address is to be found J = 6
Lower bound of the row index LR = 1
Lower bound of the column index LC = 1
Number of columns in the matrix N = Upper Bound – Lower Bound + 1
= 15 – 1 + 1
= 15
Formula:
Address of A[I][J] = B + W * ((I – LR) * N + (J – LC))
Solution:
Address of A[8][6] = 100 + 1 * ((8 – 1) * 15 + (6 – 1))
= 100 + 1 * ((7) * 15 + (5))
= 100 + 1 * (110)
Address of A[8][6] = 210
Example (column-major order):
Solution:
Given:
Base address B = 100
Storage size of one element W = 1 byte
Row subscript of the element whose address is to be found I = 8
Column subscript of the element whose address is to be found J = 6
Lower bound of the row index LR = 1
Lower bound of the column index LC = 1
Number of Rows given in the matrix M = Upper Bound – Lower Bound + 1
= 10 – 1 + 1
= 10
Formula used:
Address of A[I][J] = B + W * ((J – LC) * M + (I – LR))
Address of A[8][6] = 100 + 1 * ((6 – 1) * 10 + (8 – 1))
= 100 + 1 * ((5) * 10 + (7))
= 100 + 1 * (57)
Address of A[8][6] = 157
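A small Java sketch of all three address formulas (method and parameter names are illustrative, not from the notes):

// 1-D: Address of A[index] = B + W * (index - LB)
static long address1D(long base, int elemSize, int index, int lowerBound) {
    return base + (long) elemSize * (index - lowerBound);
}

// Row-major: B + W * ((i - LR) * N + (j - LC)), N = number of columns
static long addressRowMajor(long base, int elemSize, int i, int j,
                            int lr, int lc, int numCols) {
    return base + (long) elemSize * ((i - lr) * numCols + (j - lc));
}

// Column-major: B + W * ((j - LC) * M + (i - LR)), M = number of rows
static long addressColMajor(long base, int elemSize, int i, int j,
                            int lr, int lc, int numRows) {
    return base + (long) elemSize * ((j - lc) * numRows + (i - lr));
}

// address1D(1020, 2, 1700, 1300)            gives 1820
// addressRowMajor(100, 1, 8, 6, 1, 1, 15)   gives 210
// addressColMajor(100, 1, 8, 6, 1, 1, 10)   gives 157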
int[][] matrix = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9}
};
Operations on Arrays
Traversal: Visiting each element in order, for example:
for (int i = 0; i < numbers.length; i++)
    System.out.println(numbers[i]);
Insertion and Deletion: Since arrays have a fixed size, inserting or deleting
elements requires creating a new array and copying the elements. This is a
costly operation in terms of time complexity.
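For instance, a minimal sketch of insertion by allocating a new array and copying (the helper name is illustrative):

// Insert value at position pos by copying into a larger array: O(n)
static int[] insert(int[] arr, int pos, int value) {
    int[] result = new int[arr.length + 1];
    for (int i = 0; i < pos; i++) result[i] = arr[i];              // copy prefix
    result[pos] = value;                                            // place new element
    for (int i = pos; i < arr.length; i++) result[i + 1] = arr[i]; // copy suffix, shifted right
    return result;
}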
Advantages of Arrays
Disadvantages of Arrays
Applications of Arrays
Insertion Sort
Insertion Sort is a simple sorting algorithm that builds the final sorted array (or
list) one item at a time. It is much less efficient on large lists than more
advanced algorithms such as quicksort, heapsort, or merge sort.
How It Works:
1. Start from the second element: Assume that the first element is already
sorted.
2. Compare the current element with the elements in the sorted part of the
array.
3. Shift all the elements that are greater than the current element to the
right by one position.
4. Insert the current element in its correct position.
5. Repeat the process for all elements in the array.
Algorithm Steps:
Given an array A of n elements:
1. Start with the second element (index i = 1).
2. Compare the element A[i] with A[i-1], A[i-2], and so on.
3. Shift elements to the right until you find the correct position for A[i].
4. Insert A[i] in the correct position.
5. Move to the next element and repeat until the entire array is sorted.
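A minimal Java implementation of these steps might look like this (illustrative sketch, not from the original notes):

static void insertionSort(int[] a) {
    for (int i = 1; i < a.length; i++) {   // start from the second element
        int key = a[i];
        int j = i - 1;
        // Shift elements of the sorted part that are greater than key
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;                    // insert key in its correct position
    }
}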
Example:
Consider the array: [5, 2, 9, 1, 5, 6]
Step-by-Step Insertion Sort:
1. Initial Array: [5, 2, 9, 1, 5, 6]
2. Pass 1: (i = 1)
o Compare 2 with 5. Since 2 < 5, shift 5 to the right.
o Insert 2 in the first position.
o Array becomes: [2, 5, 9, 1, 5, 6]
3. Pass 2: (i = 2)
o Compare 9 with 5. Since 9 > 5, no changes are made.
o Array remains: [2, 5, 9, 1, 5, 6]
4. Pass 3: (i = 3)
o Compare 1 with 9, 5, and 2. Shift them to the right.
o Insert 1 in the first position.
o Array becomes: [1, 2, 5, 9, 5, 6]
5. Pass 4: (i = 4)
o Compare 5 with 9. Since 5 < 9, shift 9 to the right.
o Insert 5 in the correct position.
o Array becomes: [1, 2, 5, 5, 9, 6]
6. Pass 5: (i = 5)
o Compare 6 with 9. Since 6 < 9, shift 9 to the right.
o Insert 6 in the correct position.
o Array becomes: [1, 2, 5, 5, 6, 9]
Final Sorted Array: [1, 2, 5, 5, 6, 9]
Time Complexity:
Best Case: O(n) [When the array is already sorted]
Average and Worst Case: O(n²) [When the array is in reverse order or
unsorted]
Advantages:
Simple to implement.
Efficient for small data sets.
Stable (does not change the relative order of equal elements).
Disadvantages:
Not suitable for large data sets.
Higher time complexity compared to more advanced sorting algorithms
like Quick Sort or Merge Sort.
Selection Sort
Selection Sort is a simple comparison-based sorting algorithm. It works by
repeatedly finding the minimum (or maximum) element from the unsorted
portion of the list and moving it to the sorted portion.
How It Works:
1. Start with the first element: Assume the first element is the minimum.
2. Compare this element with the rest of the elements in the list.
3. Find the smallest element in the unsorted portion.
4. Swap the smallest element with the first element.
5. Move the boundary between the sorted and unsorted portions one step
to the right.
6. Repeat the process for all elements in the list.
Algorithm Steps:
Given an array A of n elements:
1. Start with the first element (index i = 0).
2. Assume A[i] is the minimum element.
3. Compare A[i] with every element A[j] where j ranges from i+1 to n-1.
4. If a smaller element is found, update the index of the minimum element.
5. Swap A[i] with the smallest element found in step 4.
6. Move to the next element and repeat until the entire array is sorted.
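These steps might be implemented in Java as follows (illustrative sketch):

static void selectionSort(int[] a) {
    for (int i = 0; i < a.length - 1; i++) {
        int minIndex = i;                          // assume a[i] is the minimum
        for (int j = i + 1; j < a.length; j++) {
            if (a[j] < a[minIndex]) minIndex = j;  // found a smaller element
        }
        int tmp = a[i];                            // swap a[i] with the smallest found
        a[i] = a[minIndex];
        a[minIndex] = tmp;
    }
}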
Example:
Consider the array: [29, 10, 14, 37, 13]
Step-by-Step Selection Sort:
1. Initial Array: [29, 10, 14, 37, 13]
2. Pass 1: (i = 0)
o Find the smallest element in the array [29, 10, 14, 37, 13], which is
10.
o Swap 10 with 29.
o Array becomes: [10, 29, 14, 37, 13]
3. Pass 2: (i = 1)
o Find the smallest element in the remaining array [29, 14, 37, 13],
which is 13.
o Swap 13 with 29.
o Array becomes: [10, 13, 14, 37, 29]
4. Pass 3: (i = 2)
o Find the smallest element in the remaining array [14, 37, 29], which
is 14.
o No need to swap since 14 is already in place.
o Array remains: [10, 13, 14, 37, 29]
5. Pass 4: (i = 3)
o Find the smallest element in the remaining array [37, 29], which is
29.
o Swap 29 with 37.
o Array becomes: [10, 13, 14, 29, 37]
6. Pass 5: (i = 4)
o The last element is already in place.
o Array remains: [10, 13, 14, 29, 37]
Final Sorted Array: [10, 13, 14, 29, 37]
Time Complexity:
Best, Average, and Worst Case: O(n²) [The algorithm always scans the
entire unsorted portion on every pass, making about n(n–1)/2
comparisons in total regardless of the initial order]
Advantages:
Simple to implement.
Performs well on small lists.
In-place sorting algorithm (doesn't require extra space).
Disadvantages:
Inefficient for large lists.
Unstable (it might change the relative order of equal elements).
Bubble Sort
Bubble Sort is a simple comparison-based sorting algorithm that repeatedly steps
through the list, compares adjacent elements, and swaps them if they are in the
wrong order.
How It Works:
1. Start at the beginning of the list.
2. Compare adjacent elements. If the current element is greater than the
next one, swap them.
3. Move to the next pair and repeat the comparison and swap if necessary.
4. Continue to the end of the list. At the end of each pass, the largest
element will "bubble up" to its correct position.
5. Repeat the process for the remaining unsorted portion of the list until no
swaps are needed.
Algorithm Steps:
Given an array A of n elements:
1. Start with the first element (index i = 0).
2. Compare A[i] with A[i+1].
o If A[i] > A[i+1], swap them.
3. Move to the next element and repeat the comparison and swap process
for the entire array.
4. After each complete pass, the largest unsorted element will be in its
correct position at the end of the list.
5. Repeat the passes until no swaps are made during a pass, indicating the
list is sorted.
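A possible Java version of these steps, including the early-exit check from step 5 (illustrative sketch):

static void bubbleSort(int[] a) {
    for (int pass = 0; pass < a.length - 1; pass++) {
        boolean swapped = false;
        // After each pass the largest unsorted element is in place,
        // so the inner loop can stop one position earlier each time.
        for (int i = 0; i < a.length - 1 - pass; i++) {
            if (a[i] > a[i + 1]) {
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
                swapped = true;
            }
        }
        if (!swapped) break;  // no swaps in a pass: the array is sorted
    }
}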
Example:
Consider the array: [5, 3, 8, 4, 2]
Step-by-Step Bubble Sort:
1. Initial Array: [5, 3, 8, 4, 2]
2. Pass 1:
o Compare 5 and 3. Swap them.
o Array becomes: [3, 5, 8, 4, 2]
o Compare 5 and 8. No swap needed.
o Compare 8 and 4. Swap them.
o Array becomes: [3, 5, 4, 8, 2]
o Compare 8 and 2. Swap them.
o Array becomes: [3, 5, 4, 2, 8]
o End of Pass 1: Largest element (8) is in its correct position.
3. Pass 2:
o Compare 3 and 5. No swap needed.
o Compare 5 and 4. Swap them.
o Array becomes: [3, 4, 5, 2, 8]
o Compare 5 and 2. Swap them.
o Array becomes: [3, 4, 2, 5, 8]
o End of Pass 2: Second largest element (5) is in its correct position.
4. Pass 3:
o Compare 3 and 4. No swap needed.
o Compare 4 and 2. Swap them.
o Array becomes: [3, 2, 4, 5, 8]
o End of Pass 3: Third largest element (4) is in its correct position.
5. Pass 4:
o Compare 3 and 2. Swap them.
o Array becomes: [2, 3, 4, 5, 8]
o End of Pass 4: Fourth largest element (3) is in its correct position.
6. Pass 5:
o No swaps needed; the array is sorted.
Final Sorted Array: [2, 3, 4, 5, 8]
Time Complexity:
Best Case: O(n) [When the array is already sorted and no swaps are
needed]
Average and Worst Case: O(n²) [When the array is in reverse order or
unsorted]
Advantages:
Simple to understand and implement.
In-place sorting algorithm (doesn't require extra space).
Stable (does not change the relative order of equal elements).
Disadvantages:
Inefficient for large data sets.
Higher time complexity compared to more advanced sorting algorithms.
Merge Sort
Merge Sort is an efficient divide-and-conquer sorting algorithm that splits the
list into halves, sorts each half recursively, and merges the sorted halves.
How It Works:
1. Divide the unsorted list into two approximately equal halves.
2. Recursively sort each half by applying the merge sort algorithm to them.
3. Merge the two halves back together in a sorted manner to produce the
final sorted list.
Algorithm Steps:
Given an array A of n elements:
1. If the list contains 1 or 0 elements, it is already sorted; return it.
2. Divide the array into two halves: left_half and right_half.
3. Recursively apply merge sort to left_half.
4. Recursively apply merge sort to right_half.
5. Merge the sorted left_half and right_half into a single sorted list.
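One way to express these steps in Java, called as mergeSort(a, 0, a.length - 1) (illustrative sketch; the merge helper allocates a temporary array):

static void mergeSort(int[] a, int left, int right) {
    if (left >= right) return;             // 0 or 1 element: already sorted
    int mid = left + (right - left) / 2;
    mergeSort(a, left, mid);               // sort left half
    mergeSort(a, mid + 1, right);          // sort right half
    merge(a, left, mid, right);            // merge the two sorted halves
}

static void merge(int[] a, int left, int mid, int right) {
    int[] tmp = new int[right - left + 1];
    int i = left, j = mid + 1, k = 0;
    while (i <= mid && j <= right)         // take the smaller head element
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid) tmp[k++] = a[i++];    // copy any leftovers from the left
    while (j <= right) tmp[k++] = a[j++];  // copy any leftovers from the right
    System.arraycopy(tmp, 0, a, left, tmp.length);
}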
Example:
Consider the array: [38, 27, 43, 3, 9, 82, 10]
Step-by-Step Merge Sort:
1. Initial Array: [38, 27, 43, 3, 9, 82, 10]
2. Step 1: Divide the array into halves:
o Left half: [38, 27, 43]
o Right half: [3, 9, 82, 10]
3. Step 2: Recursively sort the left half [38, 27, 43]:
o Divide into: [38] and [27, 43]
o Sort [27, 43] by dividing into [27] and [43] (already sorted).
o Merge [27] and [43] to get [27, 43].
o Merge [38] and [27, 43] to get [27, 38, 43].
4. Step 3: Recursively sort the right half [3, 9, 82, 10]:
o Divide into: [3, 9] and [82, 10]
o Sort [3, 9] by dividing into [3] and [9] (already sorted).
o Merge [3] and [9] to get [3, 9].
o Sort [82, 10] by dividing into [82] and [10] (already sorted).
o Merge [10] and [82] to get [10, 82].
o Merge [3, 9] and [10, 82] to get [3, 9, 10, 82].
5. Step 4: Merge the sorted halves [27, 38, 43] and [3, 9, 10, 82]:
o Compare elements and merge to get the final sorted array [3, 9, 10,
27, 38, 43, 82].
Final Sorted Array: [3, 9, 10, 27, 38, 43, 82]
Time Complexity:
Best, Average, and Worst Case: O(n log n) [The time complexity is the
same for all cases since the algorithm always divides the list into halves
and merges them back together]
Advantages:
Efficient for large data sets.
Stable (does not change the relative order of equal elements).
Works well with linked lists and external storage (e.g., sorting large files).
Disadvantages:
Requires additional space proportional to the size of the list (O(n) extra
space).
Slower than some algorithms (like quicksort) for smaller datasets.
Radix Sort
Radix Sort is a non-comparison sorting algorithm that sorts integers digit by
digit, applying a stable sort at each digit position.
How It Works:
1. Start from the least significant digit (LSD): Sort the numbers based
on this digit.
2. Move to the next digit (next more significant digit) and repeat the
sorting process.
3. Continue this process until all digits have been processed.
4. The list will be sorted after the final pass.
Radix sort typically uses a stable subroutine, like counting sort, to sort the digits.
Algorithm Steps:
Given an array of integers:
1. Determine the maximum number in the array to know the number of
digits (let's call it d).
2. Start with the least significant digit (LSD) and sort the array based on this
digit.
3. Move to the next digit and repeat the sorting process.
4. Continue until all digits have been processed.
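A possible Java version using counting sort on each digit, assuming non-negative integers (illustrative sketch):

static void radixSort(int[] a) {
    int max = 0;
    for (int v : a) max = Math.max(max, v);        // largest value fixes digit count
    for (int exp = 1; max / exp > 0; exp *= 10)    // one pass per digit, LSD first
        countingSortByDigit(a, exp);
}

static void countingSortByDigit(int[] a, int exp) {
    int[] output = new int[a.length];
    int[] count = new int[10];
    for (int v : a) count[(v / exp) % 10]++;                // histogram of this digit
    for (int d = 1; d < 10; d++) count[d] += count[d - 1];  // prefix sums = end positions
    for (int i = a.length - 1; i >= 0; i--) {               // backwards keeps it stable
        int d = (a[i] / exp) % 10;
        output[--count[d]] = a[i];
    }
    System.arraycopy(output, 0, a, 0, a.length);
}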
Example:
Consider the array: [170, 45, 75, 90, 802, 24, 2, 66]
Step-by-Step Radix Sort:
1. Initial Array: [170, 45, 75, 90, 802, 24, 2, 66]
2. Step 1: Sort based on the least significant digit (units place):
o Arrange numbers based on the last digit: [170, 90, 802, 2, 24, 45,
75, 66]
3. Step 2: Sort based on the next digit (tens place):
o Arrange numbers based on the tens digit: [802, 2, 24, 45, 66, 170,
75, 90]
4. Step 3: Sort based on the most significant digit (hundreds place):
o Arrange numbers based on the hundreds digit: [2, 24, 45, 66, 75,
90, 170, 802]
Final Sorted Array: [2, 24, 45, 66, 75, 90, 170, 802]
Time Complexity:
Best, Average, and Worst Case: O(nk), where n is the number of
elements and k is the number of digits in the largest number.
Advantages:
Linear time complexity under certain conditions (when the range of digits
is fixed and not too large).
Stable (does not change the relative order of equal elements).
Disadvantages:
Requires additional space for sorting based on each digit.
Not suitable for non-integer data (unless modified) and may not be as
efficient as comparison-based sorts for small datasets.
Linear Search
Linear Search is a simple search algorithm that checks every element in a list
sequentially until it finds the target value or reaches the end of the list. It's
straightforward and doesn't require the list to be sorted.
How It Works:
1. Start at the first element of the list.
2. Compare the target value with the current element.
3. If the target value matches the current element, the search is successful,
and the index of the element is returned.
4. If the target value does not match, move to the next element.
5. Continue this process until the target value is found or the list ends.
6. If the target value is not found in the list, the search is unsuccessful, and a
special value (like -1 or null) is returned.
Algorithm Steps:
Given an array A of n elements and a target value x:
1. Start with the first element (index i = 0).
2. Compare A[i] with x.
o If A[i] == x, return i.
o If A[i] != x, move to the next element.
3. Repeat step 2 until the end of the array.
4. If x is not found, return a special value (like -1).
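In Java, these steps could look like this (illustrative sketch):

static int linearSearch(int[] a, int x) {
    for (int i = 0; i < a.length; i++) {
        if (a[i] == x) return i;   // match found: return its index
    }
    return -1;                     // reached the end without a match
}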
Example:
Consider the array: [10, 23, 45, 70, 11, 15]
Search for the target value 70:
1. Initial Array: [10, 23, 45, 70, 11, 15]
2. Step 1: Compare 10 with 70. (Not a match)
3. Step 2: Compare 23 with 70. (Not a match)
4. Step 3: Compare 45 with 70. (Not a match)
5. Step 4: Compare 70 with 70. (Match found)
Result: The target value 70 is found at index 3.
Search for the target value 25:
1. Initial Array: [10, 23, 45, 70, 11, 15]
2. Step 1: Compare 10 with 25. (Not a match)
3. Step 2: Compare 23 with 25. (Not a match)
4. Step 3: Compare 45 with 25. (Not a match)
5. Step 4: Compare 70 with 25. (Not a match)
6. Step 5: Compare 11 with 25. (Not a match)
7. Step 6: Compare 15 with 25. (Not a match)
Result: The target value 25 is not found in the array, so the search returns -1.
Time Complexity:
Best Case: O(1) [When the target value is the first element]
Worst and Average Case: O(n) [When the target value is the last
element or not present in the list]
Advantages:
Simple to implement and understand.
Works on unsorted data.
Disadvantages:
Inefficient for large lists compared to more advanced search algorithms
like binary search.
Wasteful when the list is already sorted, since binary search would be much faster.
Binary Search
Binary Search is an efficient search algorithm that works on sorted lists. It
repeatedly divides the search interval in half to locate the target value. Since it
eliminates half of the elements in each step, it is much faster than linear search,
especially for large datasets.
How It Works:
1. Start by defining the search interval: Set two pointers, low (the start
of the list) and high (the end of the list).
2. Find the middle element of the current search interval.
3. Compare the target value with the middle element:
o If the target value matches the middle element, the search is
successful, and the index of the middle element is returned.
o If the target value is less than the middle element, adjust the high
pointer to search in the left half.
o If the target value is greater than the middle element, adjust the
low pointer to search in the right half.
4. Repeat the process with the new search interval until the target value is
found or the search interval is empty (i.e., low > high).
5. If the search interval becomes empty, the target value is not in the list,
and a special value (like -1) is returned.
Algorithm Steps:
Given a sorted array A of n elements and a target value x:
1. Initialize low = 0 and high = n - 1.
2. While low <= high, repeat the following steps:
o Calculate mid = (low + high) / 2.
o Compare A[mid] with x.
o If A[mid] == x, return mid.
o If A[mid] < x, set low = mid + 1.
o If A[mid] > x, set high = mid - 1.
3. If low exceeds high, return -1 (indicating the target value is not found).
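A Java version of these steps might look like this (illustrative sketch; computing mid as low + (high - low) / 2 is a common overflow-safe equivalent of (low + high) / 2):

static int binarySearch(int[] a, int x) {
    int low = 0, high = a.length - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // middle of the current interval
        if (a[mid] == x) return mid;       // match found
        else if (a[mid] < x) low = mid + 1;  // search the right half
        else high = mid - 1;                 // search the left half
    }
    return -1;                             // interval empty: not found
}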
Example:
Consider the sorted array: [2, 3, 4, 10, 40]
Search for the target value 10:
1. Initial Array: [2, 3, 4, 10, 40]
2. Step 1: Set low = 0, high = 4. Calculate mid = (0 + 4) / 2 = 2.
o Compare A[2] = 4 with 10. (Not a match)
o Since 10 > 4, search in the right half. Set low = 3.
3. Step 2: Set low = 3, high = 4. Calculate mid = (3 + 4) / 2 = 3.
o Compare A[3] = 10 with 10. (Match found)
Result: The target value 10 is found at index 3.
Search for the target value 5:
1. Initial Array: [2, 3, 4, 10, 40]
2. Step 1: Set low = 0, high = 4. Calculate mid = (0 + 4) / 2 = 2.
o Compare A[2] = 4 with 5. (Not a match)
o Since 5 > 4, search in the right half. Set low = 3.
3. Step 2: Set low = 3, high = 4. Calculate mid = (3 + 4) / 2 = 3.
o Compare A[3] = 10 with 5. (Not a match)
o Since 5 < 10, search in the left half. Set high = 2.
4. Step 3: Now, low = 3 and high = 2 (which means low > high), so the
search ends.
Result: The target value 5 is not found in the array, so the search returns -1.
Time Complexity:
Best Case: O(1) [When the target value is the middle element]
Worst and Average Case: O(log n) [Because the list is halved with each
step]
Advantages:
Much faster than linear search, especially for large, sorted datasets.
Efficient in terms of time complexity.
Disadvantages:
The list must be sorted before performing a binary search.
Not suitable for linked lists or data structures where direct access to
elements is not possible.
Hash Function
A hash function is an algorithm that takes an input (or "key") of any size and
returns a fixed-size value, typically a numerical hash code, that serves as a
compact, near-unique identifier for the input data. Hash functions are widely
used in applications such as data storage, retrieval, and cryptography.
How It Works:
1. Input: The hash function takes an input, which can be data of any size
(e.g., a string, file, or number).
2. Processing: The hash function processes this input using a specific
algorithm to produce a hash value.
3. Output: The hash value is fixed-size, typically shorter than the input, and
is used as a compact identifier for the original data.
Properties of a Good Hash Function:
Deterministic: The same input will always produce the same output hash.
Fast Computation: The hash function should be able to process data
quickly.
Uniform Distribution: The hash values should be evenly distributed across
the possible output range to minimize collisions.
Pre-image Resistance: It should be difficult to reverse the process (i.e., to
find the original input given only the hash value).
Collision Resistance: It should be difficult to find two different inputs that
produce the same hash value.
Applications of Hash Functions:
Hash Tables: Used to store and retrieve data quickly. Hash functions map
data to specific locations (buckets) in the hash table.
Data Integrity: Hash functions can be used to verify the integrity of data.
For example, if two hash values match, the data has not been altered.
Cryptography: Hash functions are used in cryptographic algorithms to
secure data, passwords, and digital signatures.
Example of a Hash Function:
Consider a simple hash function h(x) = x % 10, where x is an integer and % is
the modulus operator.
Input: x = 35
Hash Value: h(35) = 35 % 10 = 5
Here, the hash function returns 5 as the hash value for the input 35. This means
that if 35 is stored in a hash table, it would be placed in the bucket
corresponding to the index 5.
Example in a Hash Table:
Suppose we have the following inputs: [21, 14, 18, 35, 42]
Using the hash function h(x) = x % 10:
h(21) = 21 % 10 = 1 → Store 21 in bucket 1.
h(14) = 14 % 10 = 4 → Store 14 in bucket 4.
h(18) = 18 % 10 = 8 → Store 18 in bucket 8.
h(35) = 35 % 10 = 5 → Store 35 in bucket 5.
h(42) = 42 % 10 = 2 → Store 42 in bucket 2.
The hash table would look like this:
Bucket:  0    1    2    3    4    5    6    7    8    9
Value:   –    21   42   –    14   35   –    –    18   –
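A minimal sketch of this placement in Java (illustrative; no collision handling):

int[] keys = {21, 14, 18, 35, 42};
Integer[] table = new Integer[10];       // 10 buckets, initially empty (null)
for (int k : keys) {
    table[k % 10] = k;                   // h(x) = x % 10 picks the bucket
}
// table[1] == 21, table[2] == 42, table[4] == 14, table[5] == 35, table[8] == 18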
Collision
When multiple inputs produce the same hash value, it results in a collision in a
hash table. Collision resolution strategies are methods used to handle these
collisions so that the hash table can still efficiently store and retrieve data. Here
are the common collision resolution strategies:
1. Chaining
In chaining, each bucket of the hash table points to a linked list (or another data
structure like a dynamic array) that contains all elements that hash to the same
index.
How It Works:
o If a collision occurs, the new element is added to the linked list at
the corresponding index.
o When searching for an element, you hash the key to find the correct
bucket and then search through the linked list in that bucket.
Advantages:
o Simple to implement.
o No need to worry about running out of space in a bucket.
o Can handle a large number of collisions effectively.
Disadvantages:
o Requires additional memory for pointers in the linked lists.
o Performance degrades to O(n) in the worst case if many elements
hash to the same bucket.
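A compact Java sketch of chaining with an array of linked lists (class and method names are illustrative):

import java.util.LinkedList;

class ChainedHashTable {
    private final LinkedList<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int size) {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    private int hash(int key) { return Math.floorMod(key, buckets.length); }

    void insert(int key) {
        buckets[hash(key)].add(key);             // collisions just extend the list
    }

    boolean contains(int key) {
        return buckets[hash(key)].contains(key); // scan only the one bucket's list
    }
}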
Sparse Matrix Representations
A sparse matrix stores only its non-zero elements. In the COO (coordinate
list) format, each non-zero element is kept as a (row, column, value) triple.
Storage:
o A list (or array) for row indices.
o A list (or array) for column indices.
o A list (or array) for non-zero values.
The CSR (Compressed Sparse Row) format instead stores the non-zero values row
by row, with a pointer array marking where each row begins.
Example:
Consider the following sparse matrix:
0 0 3 0
0 0 0 4
5 0 0 0
The CSR representation would be:
Values: [3, 4, 5]
ColumnIndex: [2, 3, 0]
RowPointer: [0, 1, 2, 3]
Here:
RowPointer[0] = 0 (row 0 starts at index 0 in Values[])
RowPointer[1] = 1 (row 1 starts at index 1 in Values[])
RowPointer[2] = 2 (row 2 starts at index 2 in Values[])
RowPointer[3] = 3 (one past the last non-zero; also the total number of non-zeros)
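A small Java sketch of row-wise traversal over this CSR data (illustrative):

int[] values      = {3, 4, 5};
int[] columnIndex = {2, 3, 0};
int[] rowPointer  = {0, 1, 2, 3};

// Print the non-zero entries of each row: row r occupies
// positions rowPointer[r] .. rowPointer[r + 1] - 1 in values[]
for (int row = 0; row < rowPointer.length - 1; row++) {
    for (int k = rowPointer[row]; k < rowPointer[row + 1]; k++) {
        System.out.println("A[" + row + "][" + columnIndex[k] + "] = " + values[k]);
    }
}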
Advantages:
Efficient for row-wise matrix operations.
Saves space compared to COO format.
Disadvantages:
Complex to implement compared to COO format.
Insertion of elements can be costly as the data must be shifted.