Understanding Data Structures and ADTs

The document provides an overview of data structures and algorithms, focusing on key concepts such as Abstract Data Types (ADT), their principles of abstraction and encapsulation, and examples like the Queue ADT. It differentiates between linear and non-linear data structures, explains ephemeral vs. persistent data structures, and introduces Big-O notation for algorithm efficiency. Additionally, it describes the Divide and Conquer strategy in algorithm design, using Merge Sort as a detailed example.

Data Structures

Introduction to Data Structures and Algorithms
(2024 Pattern) (PCC-201-COM)

1 Define Abstract Data Type (ADT) with an example.

Definition and Core Principles


An Abstract Data Type (ADT) is a high-level, logical description of a data type,
defined by a set of possible values and a set of operations on those values, completely
independent of its underlying implementation. The core idea is abstraction, which
means separating the interface (what the data type does) from
the implementation (how it is actually coded).

An ADT is guided by two main principles:

1. Abstraction: Focusing on the essential characteristics and behavior while


hiding unnecessary details. The user of an ADT interacts with a simple, well-
defined interface without needing to know the complex internal logic.
2. Encapsulation: Bundling the data and the operations that manipulate that
data into a single unit. This prevents accidental modification of the data from
outside its defined operations, ensuring data integrity.

Detailed Example: The Queue ADT


A Queue is a fundamental ADT that operates on the First-In, First-Out
(FIFO) principle. Think of a line of people waiting for a ticket; the first person to
get in line is the first person to be served.
Formal Specification of a Queue ADT:
• Values: A collection of elements stored in a linear sequence.
• Operations:
o enqueue(item): Adds a new item to the rear (end) of the queue.
o dequeue(): Removes and returns the item from the front (start) of the
queue.
o front(): Returns the item at the front of the queue without removing it.
o isEmpty(): Returns true if the queue contains no elements, otherwise
returns false.

Join Telegram Channel - [Link] | @SPPU Engineers 1



o size(): Returns the number of elements currently in the queue.

Implementation vs. Abstraction:


The power of the ADT is that the user only needs to know
about enqueue and dequeue. The internal implementation is hidden. Here are two
possible ways to implement the Queue ADT:
• Implementation 1: Using an Array (Static)
o A fixed-size array is used to store the queue elements.
o Two integer variables, front and rear, are used as pointers to track the
start and end of the queue.
o enqueue involves adding an element at the rear index and
incrementing rear.
o dequeue involves returning the element at the front index and
incrementing front.
o Limitation: The queue has a fixed capacity and can become full.

• Implementation 2: Using a Linked List (Dynamic)


o A series of nodes are linked together. Each node contains a data
element and a pointer to the next node.
o Two pointers, front and rear, point to the first and last nodes in the list.
o enqueue involves creating a new node and adding it after the
current rear node.
o dequeue involves removing the node pointed to by front and
updating front to the next node.
o Advantage: The queue's size is dynamic and limited only by
available memory.

Regardless of which implementation is chosen, a programmer using the Queue ADT
would write exactly the same code (e.g., my_queue.enqueue(10) and item =
my_queue.dequeue()). The ADT provides a clean, reusable, and maintainable way
to work with data structures.
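To make this concrete, here is a minimal Python sketch of the linked-list implementation described above. The class name LinkedQueue and its internals are illustrative choices for this sketch, not a standard library API:

```python
class LinkedQueue:
    """Linked-list implementation of the Queue ADT (FIFO)."""

    class _Node:
        def __init__(self, data):
            self.data = data
            self.next = None

    def __init__(self):
        self._front = None   # points to the first node (dequeue end)
        self._rear = None    # points to the last node (enqueue end)
        self._count = 0

    def enqueue(self, item):
        node = self._Node(item)
        if self._rear is None:       # queue was empty
            self._front = node
        else:
            self._rear.next = node   # link after the current rear
        self._rear = node
        self._count += 1

    def dequeue(self):
        if self._front is None:
            raise IndexError("dequeue from empty queue")
        item = self._front.data
        self._front = self._front.next   # advance front to the next node
        if self._front is None:          # queue became empty
            self._rear = None
        self._count -= 1
        return item

    def front(self):
        if self._front is None:
            raise IndexError("front of empty queue")
        return self._front.data

    def is_empty(self):
        return self._count == 0

    def size(self):
        return self._count
```

An array-backed version could expose these same five methods, and calling code such as q.enqueue(10) would not need to change — which is exactly the point of the ADT.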

2 Differentiate between Linear and Non-linear data structures.

• Fundamental Definition:
o Linear: A data structure where elements are arranged in a sequential or linear order, one after the other. Each element is directly connected to its previous and next element.
o Non-Linear: A data structure where elements are organized in a hierarchical or networked fashion. An element can be connected to one or more other elements, representing complex relationships.
• Logical Arrangement:
o Linear: Represents a one-to-one relationship between adjacent elements (except for the first and last).
o Non-Linear: Represents one-to-many (e.g., Tree) or many-to-many (e.g., Graph) relationships between elements.
• Memory Representation:
o Linear: Can be implemented using contiguous memory blocks (like arrays) or non-contiguous blocks linked by pointers (like linked lists). The logical arrangement remains sequential.
o Non-Linear: Almost always implemented using non-contiguous memory blocks linked by pointers, as the relationships are too complex for a simple contiguous layout.
• Traversal Method:
o Linear: Simple and straightforward. The entire structure can be traversed in a single pass, from the first element to the last.
o Non-Linear: Traversal is more complex and often recursive. It may require specific algorithms like Depth-First Search (DFS) or Breadth-First Search (BFS) to visit all nodes; a single pass is often insufficient.
• Implementation Complexity:
o Linear: Generally simpler to conceptualize and implement.
o Non-Linear: Can be significantly more complex to implement due to the need to manage multiple pointers and intricate relationships between nodes.
• Primary Use Cases:
o Linear: Storing sequential data; implementing simple lists, stacks, and queues.
o Non-Linear: Modeling real-world hierarchical systems (file systems, organization charts), networks (social networks, computer networks), and decision trees.
• Examples:
o Linear: Array (a collection of elements in contiguous memory); Linked List (a chain of nodes connected by pointers); Stack (a LIFO structure); Queue (a FIFO structure).
o Non-Linear: Tree (a hierarchical structure with a root node and child nodes); Graph (a collection of nodes/vertices connected by edges); Heap (a specialized tree-based structure); Trie (a tree used for storing and retrieving strings).

3 What are Persistent and Ephemeral data structures?

Ephemeral Data Structures


An ephemeral data structure is the conventional type of data structure where an
update operation modifies the structure in-place, thereby destroying its previous
state. The word "ephemeral" means short-lived, which reflects the fact that old
versions of the structure vanish after a modification.

• Key Characteristic: Only the most current version of the data structure is
accessible at any given time.


• How it Works: When you perform an operation like changing a value or


adding a node, the memory representing the old state is directly overwritten
or altered.
• Example: Consider a simple array in Python.

my_list = [10, 20, 30]


# The list object refers to a specific state in memory.

my_list[1] = 99 # In-place update


# Now my_list is [10, 99, 30]. The original state [10, 20, 30] is gone forever.

This is the default behavior in most imperative programming languages.

Persistent Data Structures


A persistent data structure is one that preserves its previous versions when it is
modified. Instead of destroying the old state, an update operation creates a new
version of the structure while keeping the old version(s) accessible and unchanged.

• Key Characteristic: All previous versions of the data structure remain


available for querying or use.
• How it Works: To avoid copying the entire structure on every change
(which would be very inefficient), persistent data structures often use a
technique called path copying. When an update occurs, only the nodes on
the path to the changed element are copied. The new nodes point to the
newly created data and also share pointers to the unchanged parts of the old
structure.
• Example: A persistent stack.
o Imagine you have a stack S1 = (Node C -> Node B -> Node A).
o When you push a new element D onto S1, you don't change S1.
Instead, you create a new stack S2.
o S2 consists of a new Node D whose "next" pointer points to Node
C (the head of S1).
o Now you have access to both S1 (the original stack) and S2 (the new
stack) without having duplicated Nodes A, B, and C.
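The S1/S2 scenario above can be sketched in Python with a node-sharing stack. The class name PersistentStack is an illustrative choice for this sketch:

```python
class PersistentStack:
    """Immutable stack: push/pop return new versions; old versions survive."""

    class _Node:
        def __init__(self, data, next_node):
            self.data = data
            self.next = next_node

    def __init__(self, head=None):
        self._head = head

    def push(self, item):
        # Share the entire existing chain; only one new node is allocated.
        return PersistentStack(self._Node(item, self._head))

    def peek(self):
        if self._head is None:
            raise IndexError("peek at empty stack")
        return self._head.data

    def pop(self):
        # Returns (top_item, new_stack); the old stack is untouched.
        if self._head is None:
            raise IndexError("pop from empty stack")
        return self._head.data, PersistentStack(self._head.next)


s1 = PersistentStack().push("A").push("B").push("C")
s2 = s1.push("D")   # a new version; s1 is unchanged
# s1 still has C on top, s2 has D on top, and nodes A, B, C are shared.
```

Because push allocates exactly one node and shares the rest, keeping every version costs O(1) extra memory per update rather than a full copy.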

• Applications:
o Functional Programming: Languages like Haskell and Clojure use
persistent data structures to enforce immutability.
o Version Control Systems: Git uses a persistent graph structure to
store the history of commits.
o Undo/Redo Functionality: Text editors can maintain a history of
document states using a persistent structure.


4 Explain Big-O notation with a simple example.

Concept and Purpose


Big-O Notation (O) is a mathematical notation used in computer science to
describe the asymptotic upper bound of an algorithm's runtime or space usage. In
simpler terms, it describes the worst-case performance of an algorithm as its
input size (n) becomes very large.

The purpose of Big-O is to provide a high-level, standardized way to compare the


efficiency of algorithms, ignoring machine-specific details like CPU speed or
memory latency. It answers the question: "How does the number of operations
scale as the input grows?"

Formal Definition
A function f(n) is said to be O(g(n)) (read as "f of n is Big-O of g of n") if there
exist two positive constants, c and n₀, such that for all input sizes n ≥ n₀, the
following inequality holds true:

0 ≤ f(n) ≤ c * g(n)
Where:
• f(n) is the actual complexity function of the algorithm (e.g., the exact
number of operations).
• g(n) is a simpler function that represents the growth rate (e.g., n, n², log n).
• c is a constant multiplier.
• n₀ is an input size beyond which the relationship holds true.

Detailed Example: Finding the Maximum Element in an Array


Let's analyze an algorithm that finds the largest number in an array of size n.

def find_max(arr):
n = len(arr) # 1 operation (assignment)
if n == 0: # 1 operation (comparison)
return None

max_val = arr[0] # 1 operation (assignment)


for i in range(1, n): # Loop runs n-1 times
if arr[i] > max_val: # 1 comparison inside loop
max_val = arr[i] # 1 assignment (in worst case)

return max_val # 1 operation (return)


Step 1: Derive the exact complexity function f(n)


Let's count the operations in the worst case (when the array is sorted in ascending
order, so the assignment max_val = arr[i] happens every time).
• n = len(arr): 1
• max_val = arr[0]: 1
• Loop for i in range(1, n):
o Comparisons (arr[i] > max_val): n-1 times
o Assignments (max_val = arr[i]): n-1 times (worst case)
• return max_val: 1
Total operations f(n) = 1 + 1 + (n-1) + (n-1) + 1 = 2n + 1.

Step 2: Show that f(n) = 2n + 1 is O(n)


Our f(n) = 2n + 1 and our guess for g(n) is n. We need to find c and n₀ such that 2n
+ 1 ≤ c * n for all n ≥ n₀.
Let's try c = 3:
2n + 1 ≤ 3n
1 ≤ 3n − 2n
1 ≤ n
This inequality holds true for all n ≥ 1. So, we found our constants: c = 3 and n₀ = 1.
Since we found such constants, we have formally proven that the time complexity
of this algorithm is O(n). This means its runtime grows linearly with the number of
elements in the array.
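The constants found above can also be checked numerically. This is only a spot check over sample values of n, not a substitute for the algebraic proof:

```python
def f(n):
    # Exact worst-case operation count of find_max, derived above.
    return 2 * n + 1

def g(n):
    # Our guessed growth function.
    return n

c, n0 = 3, 1
# Verify 0 <= f(n) <= c * g(n) for a sample of inputs n >= n0.
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 100_000))
```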

5 Describe Divide and Conquer strategy in algorithm design.

Conceptual Overview
The Divide and Conquer (D&C) strategy is a powerful and widely used
algorithm design paradigm for solving complex problems. Its core philosophy is to
systematically break down a large, intimidating problem into smaller, more
manageable sub-problems, solve these sub-problems recursively, and then combine
their solutions to build the solution for the original problem. This approach often
leads to highly efficient and elegant algorithms.

The strategy is formally characterized by three distinct steps:

1. Divide: In this step, the main problem is partitioned into two or more
smaller, independent sub-problems. Critically, these sub-problems are
smaller instances of the same original problem. For an input of size n, this
often means creating two sub-problems of size n/2. This process of division
is applied recursively.


2. Conquer: This is the base case of the recursion. The division process
continues until the sub-problems become so small that they can be solved
directly, or "trivially." For instance, in a sorting algorithm, an array of size 1
is considered inherently sorted, thus forming the base case.
3. Combine (or Merge): After the sub-problems have been solved
(conquered), their solutions are methodically combined to produce the
solution for the original, larger problem. The efficiency of the entire Divide
and Conquer
algorithm often hinges on the efficiency of this combination step.

Detailed Example: Merge Sort


Merge Sort is the quintessential example of the Divide and Conquer strategy. Its
goal is to sort an array of n elements.

Let's sort the array: [8, 3, 12, 5, 9, 2]

1. The Divide Phase:

The algorithm recursively splits the array in half until it cannot be divided any
further (i.e., we have arrays of size 1).

• Initial Problem: [8, 3, 12, 5, 9, 2]


• Level 1 Split: [8, 3, 12] and [5, 9, 2]
• Level 2 Split: [8, 3], [12] and [5, 9], [2]
• Level 3 Split (Base Cases): [8], [3], [12] and [5], [9], [2]
At this point, the "Conquer" step is complete, as each sub-array of size 1 is, by
definition, sorted.

2. The Combine (Merge) Phase:


Now, the algorithm works its way back up, merging the sorted sub-arrays into
larger sorted arrays. The merge operation takes two sorted lists and combines them
into a single, larger sorted list.

• Merge Level 3 results:


o merge([8], [3]) -> [3, 8]
o [12] is already sorted.
o merge([5], [9]) -> [5, 9]
o [2] is already sorted.
o State after this level: [3, 8], [12], [5, 9], [2]
• Merge Level 2 results:
o merge([3, 8], [12]) -> [3, 8, 12]
o merge([5, 9], [2]) -> [2, 5, 9]
o State after this level: [3, 8, 12], [2, 5, 9]


• Final Merge (Level 1 results):


o merge([3, 8, 12], [2, 5, 9])
▪ Compare 3 and 2. Take 2. Result: [2]
▪ Compare 3 and 5. Take 3. Result: [2, 3]
▪ Compare 8 and 5. Take 5. Result: [2, 3, 5]
▪ Compare 8 and 9. Take 8. Result: [2, 3, 5, 8]
▪ Compare 12 and 9. Take 9. Result: [2, 3, 5, 8, 9]
▪ Only 12 is left. Take 12. Result: [2, 3, 5, 8, 9, 12]

Final Sorted Array: [2, 3, 5, 8, 9, 12]


This strategy is highly efficient, typically resulting in an O(n log n) time
complexity, making it suitable for large-scale sorting.
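The Divide, Conquer, and Combine steps walked through above can be written as a short Python sketch:

```python
def merge_sort(arr):
    # Conquer (base case): a list of length 0 or 1 is already sorted.
    if len(arr) <= 1:
        return arr
    # Divide: split the list into two halves.
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Combine: merge the two now-sorted halves.
    return merge(left, right)

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # One list is exhausted; append the remainder of the other.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

merge_sort([8, 3, 12, 5, 9, 2])  # → [2, 3, 5, 8, 9, 12]
```

With mid = len(arr) // 2, the worked example splits [8, 3, 12, 5, 9, 2] into [8, 3, 12] and [5, 9, 2], exactly as in the Level 1 split above.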

6 Analyze time complexity of the following code using step count method:

for(int i=0; i<n; i++) {
    for(int j=0; j<n; j++) {
        cout << i+j;
    }
}

Analysis using Step Count Method


The goal is to create a function T(n) that represents the total number of elementary
operations performed by this code snippet as a function of the input size n.

1. Decompose the Code into Elementary Operations:


• Outer Loop Initialization: int i=0 (1 operation)
• Outer Loop Condition: i<n (n+1 comparisons)
• Outer Loop Increment: i++ (n increments)
• Inner Loop Initialization: int j=0 (1 operation, executed n times)
• Inner Loop Condition: j<n (n+1 comparisons, executed n times)
• Inner Loop Increment: j++ (n increments, executed n times)
• Innermost Statement: cout << i+j; (1 operation, executed n*n times)

2. Calculate the Frequency of Each Operation:


• int i=0: Executes 1 time.
• i<n: Executes n+1 times (the final, failing check terminates the loop).
• i++: Executes n times.
• int j=0: Executes n times (once for each outer loop iteration).
• j<n: Executes n * (n+1) times.
• j++: Executes n * n times.
• cout << i+j;: Executes n * n times.


3. Formulate the Total Cost Function T(n):


T(n) = 1 (outer init) + (n+1) (outer cond) + n (outer incr) + n (inner init) + n(n+1)
(inner cond) + n*n (inner incr) + n*n (cout)
T(n) = 1 + (n+1) + n + n + (n² + n) + n² + n²
T(n) = 3n² + 4n + 2

4. Apply Asymptotic Analysis to Find the Big-O Complexity:


To find the Big-O complexity, we simplify the function T(n) = 3n² + 4n + 2 by
following two rules:
1. Keep the Highest Order Term: The term with the largest exponent
as n approaches infinity dominates the function's growth. Here, 3n² is the
highest order term.
2. Discard Constant Coefficients: The constant multiplier does not affect the
rate of growth. We discard the 3 from 3n².

The simplified function becomes n².


Therefore, the time complexity of the given code snippet is O(n²). This means the
runtime grows quadratically with the input size n.
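The derived count can be cross-checked by instrumenting a Python version of the loop, charging one step per initialization, condition check, increment, and output statement (including the inner-loop initialization, which runs n times). The helper name step_count is illustrative:

```python
def step_count(n):
    """Count elementary steps of the doubly nested loop for input size n."""
    steps = 0
    steps += 1            # int i = 0 (outer init)
    i = 0
    while True:
        steps += 1        # i < n (counted even for the final, failing check)
        if not i < n:
            break
        steps += 1        # int j = 0 (inner init, once per outer iteration)
        j = 0
        while True:
            steps += 1    # j < n
            if not j < n:
                break
            steps += 1    # cout << i + j
            steps += 1    # j++
            j += 1
        steps += 1        # i++
        i += 1
    return steps

# Matches T(n) = 3n² + 4n + 2 for every sampled n.
assert all(step_count(n) == 3 * n * n + 4 * n + 2 for n in range(50))
```

Since the dominant term is still 3n², the Big-O classification O(n²) is unaffected by the lower-order terms.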

7 State the step count method for calculating time complexity.


The step count method is a formal, analytical technique used to determine the
precise time complexity of an algorithm. It involves breaking down the algorithm
into its fundamental operations and counting the number of times each operation is
executed. The final result is a function, T(n), that represents the total number of
steps, which is then simplified using asymptotic notation.

The formal steps are as follows:


1. Identify the Input Parameter: Determine the variable that represents the
size of the input, typically denoted as n. For an array, n would be the number
of elements; for a matrix, it might be the number of rows or columns.
2. Decompose into Basic Operations: Break the entire algorithm down into
individual, elementary steps. A basic operation is one whose execution time
is considered constant (e.g., assignment, arithmetic operation, comparison,
function return).

3. Determine the Frequency of Each Operation: For each elementary step,


calculate how many times it is executed as a function of the input
parameter n.
o For simple statements, the frequency is 1.
o For statements inside a loop, the frequency is the number of times the
loop iterates multiplied by the frequency of the statement itself.
o For nested loops, these frequencies are multiplied together.


4. Sum the Frequencies: Add up the frequencies of all elementary operations


to get the total cost function, T(n). This function gives the exact number of
steps the algorithm takes for an input of size n.
T(n) = (Frequency of Op 1) + (Frequency of Op 2) + ...

5. Apply Asymptotic Analysis: Simplify the function T(n) to its dominant


term to express the complexity in Big-O, Theta, or Omega notation. This
involves discarding lower-order terms and constant coefficients to focus on
the overall rate of growth.

8 Explain asymptotic notations: Big-O, Theta (Θ), and Omega (Ω) with graphical representation.

Introduction to Asymptotic Analysis


Asymptotic analysis is the process of describing the limiting behavior of a
function, which in computer science is typically the runtime or space usage of an
algorithm. We use specific notations—Big-O, Theta, and Omega—to classify
algorithms based on how their resource consumption grows as the input size (n)
approaches infinity. This provides a high-level understanding of an algorithm's
efficiency, abstracting away from machine-specific constants.

a) Big-O Notation (O): The Upper Bound


• Concept: Big-O notation describes the asymptotic upper bound on an
algorithm's complexity. It provides a guarantee that the algorithm's
performance will not be worse than a certain rate of growth. It is the most
commonly used notation because it describes the worst-case scenario,
which is critical for performance guarantees.
• Formal Definition: A function f(n) is O(g(n)) if there exist positive
constants c and n₀ such that 0 ≤ f(n) ≤ c * g(n) for all n ≥ n₀.
• Intuitive Meaning: "For a sufficiently large input n, the function f(n) will
grow no faster than the function g(n), multiplied by some constant c."
• Graphical Representation:


Imagine a 2D graph with the x-axis representing Input Size (n) and the y-axis
representing Time/Operations.

1. Draw an arbitrary curve for the actual runtime function f(n).


2. Draw another curve for c * g(n) that lies above f(n).
3. The point n₀ is marked on the x-axis. To the right of n₀, the f(n) curve
will never cross above the c * g(n) curve. c * g(n) acts as a "ceiling"
for f(n).

b) Omega Notation (Ω): The Lower Bound


• Concept: Omega notation describes the asymptotic lower bound. It
provides a guarantee that an algorithm's performance will not be better than
a certain rate of growth. It is useful for describing the best-case scenario.
• Formal Definition: A function f(n) is Ω(g(n)) if there exist positive
constants c and n₀ such that 0 ≤ c * g(n) ≤ f(n) for all n ≥ n₀.
• Intuitive Meaning: "For a sufficiently large input n, the function f(n) will
grow at least as fast as the function g(n), multiplied by some constant c."
• Graphical Representation:


On a similar graph:
1. Draw the curve for the actual runtime function f(n).
2. Draw another curve for c * g(n) that lies below f(n).
3. To the right of the point n₀, the f(n) curve will never dip below the c *
g(n) curve. c * g(n) acts as a "floor" for f(n).

c) Theta Notation (Θ): The Tight Bound


• Concept: Theta notation describes an asymptotic tight bound. It is the
most precise notation, used when an algorithm's growth rate is bounded both
from above and below by the same function. It is often used to describe
the average-case or a precise worst-case complexity.
• Formal Definition: A function f(n) is Θ(g(n)) if there exist positive
constants c₁, c₂, and n₀ such that 0 ≤ c₁ * g(n) ≤ f(n) ≤ c₂ * g(n) for all n ≥ n₀.
This is equivalent to stating that f(n) is both O(g(n)) and Ω(g(n)).
• Intuitive Meaning: "For a sufficiently large input n, the growth of f(n) is
trapped or 'sandwiched' between two scaled versions of g(n)."
• Graphical Representation:


1. Draw the curve for the actual runtime function f(n).


2. Draw two curves for the bounding functions: c₁ * g(n) (the floor)
and c₂ * g(n) (the ceiling).
3. To the right of n₀, the f(n) curve will always remain in the channel
between c₁ * g(n) and c₂ * g(n).

9 Discuss differences between Static and Dynamic data structures with at least two examples.

• Size:
o Static: Fixed at compile-time. Cannot be changed during execution.
o Dynamic: Flexible. Can grow or shrink at runtime.
• Memory Allocation:
o Static: Memory is allocated before program execution begins (e.g., on the stack or in the data segment).
o Dynamic: Memory is allocated from the heap during program execution.
• Flexibility:
o Static: Inflexible. Can lead to memory wastage or overflow.
o Dynamic: Highly flexible and efficient in memory usage.
• Access Speed:
o Static: Generally faster access due to contiguous memory allocation.
o Dynamic: Can be slower due to non-contiguous memory and pointer traversal.
• Example 1:
o Static: Array (int arr[50]; in C/C++).
o Dynamic: Linked List, where nodes are added or removed dynamically.
• Example 2:
o Static: Structs with fixed-size members.
o Dynamic: Vector (like std::vector in C++), which is a dynamic array.

10 A website wants to sort a list of 100,000 products by price. Which algorithm and strategy would you choose? Justify using Divide and Conquer with diagram or pseudocode.

Problem Statement
A website wants to sort a list of 100,000 products by price. Which algorithm and
strategy would you choose? Justify your choice using the Divide and Conquer
strategy and provide an illustrative diagram or pseudocode.

Executive Summary
• Chosen Strategy: Divide and Conquer
• Chosen Algorithm: Merge Sort
This combination is the most professional and robust choice for a large-scale,
production-level application like an e-commerce website due to its guaranteed
efficiency and predictable performance.


Detailed Justification
The decision-making process for selecting a sorting algorithm for a large dataset (n
= 100,000) must prioritize performance, scalability, and reliability.

Why the Divide and Conquer Strategy is the Correct Approach


The Divide and Conquer strategy is perfectly suited for this problem because it
excels at handling large inputs by breaking them down.
• Scalability: Instead of tackling the entire 100,000-item list at once, it breaks
it into smaller, more manageable sub-problems (e.g., two lists of 50,000,
then four lists of 25,000, and so on).
• Efficiency: This recursive splitting leads to algorithms with O(n log
n) complexity.
o Calculation: For n = 100,000, log₂(n) ≈ 16.6. Therefore, n log n ≈
100,000 * 16.6 ≈ 1.66 million operations.
o Implication: This is a massive improvement over the roughly 10 billion
operations (100,000² = 10¹⁰) that an O(n²) algorithm would require, and it
can be executed in well under a second on modern hardware, meeting user
expectations for responsiveness.

Illustration using Pseudocode (Merge Sort)


Here is the detailed pseudocode for Merge Sort, which clearly demonstrates the
Divide, Conquer, and Combine steps.

The Main mergeSort Function (The "Divide" Part)


This function recursively splits the list of products until it has lists of one, which
are inherently sorted.
FUNCTION mergeSort(productList):
// CONQUER (Base Case): A list with one or zero products is already sorted.
IF count of productList <= 1 THEN
RETURN productList
END IF

// DIVIDE: Find the middle point and split the list into two halves.
middle_index = floor(count of productList / 2)
left_half = sublist of productList from start to middle_index
right_half = sublist of productList from middle_index to end

// Recursively sort both halves.


sorted_left_half = mergeSort(left_half)


sorted_right_half = mergeSort(right_half)

// COMBINE: Merge the two now-sorted halves back into one sorted list.
RETURN merge(sorted_left_half, sorted_right_half)
END FUNCTION

The merge Helper Function (The "Combine" Part)


This function takes two sorted lists and merges them into a single, final sorted list.
FUNCTION merge(left_list, right_list):
// Create an empty list to store the final sorted result.
sorted_list = new empty list

// Loop as long as there are elements in both lists to compare.


WHILE left_list is not empty AND right_list is not empty:
// Compare the products based on their price.
IF price of first product in left_list <= price of first product in right_list THEN
// Move the cheaper product from the left list to the sorted list.
move first product from left_list to the end of sorted_list
ELSE
// Move the cheaper product from the right list to the sorted list.
move first product from right_list to the end of sorted_list
END IF
END WHILE

// At this point, one of the lists is empty. Append all remaining


// products from the other list (which are already sorted).
append all remaining products from left_list to the end of sorted_list
append all remaining products from right_list to the end of sorted_list

RETURN sorted_list
END FUNCTION

11 Google Calendar schedules meetings throughout the day. Apply Greedy strategy to schedule maximum number of meetings from a given list:
Meeting A: 9 AM – 10 AM
Meeting B: 9:30 AM – 11 AM
Meeting C: 10 AM – 11:30 AM
Meeting D: 11:30 AM – 1 PM
Select optimal set of meetings without overlap.

This problem is a classic example of the Activity Selection Problem, which can
be optimally solved using a Greedy Strategy.

The Greedy Choice


The most effective greedy strategy here is to always select the meeting that
finishes earliest. This choice is locally optimal because it frees up our schedule as
quickly as possible, maximizing the time available for subsequent meetings.

Given Meetings:
• Meeting A: 9:00 AM – 10:00 AM
• Meeting B: 9:30 AM – 11:00 AM
• Meeting C: 10:00 AM – 11:30 AM
• Meeting D: 11:30 AM – 1:00 PM

Step-by-Step Application of the Greedy Algorithm:

Step 1: Sort the meetings by their finish times in ascending order.


1. A (Finishes at 10:00)
2. B (Finishes at 11:00)
3. C (Finishes at 11:30)
4. D (Finishes at 13:00)

Step 2: Initialize the solution and track the last finish time.
• Selected Meetings = {}
• Last Finish Time = -∞ (or the beginning of the day)

Step 3: Iterate through the sorted list and make greedy choices.
• Consider Meeting A (9:00 - 10:00):
o Its start time (9:00) is after the last finish time (-∞).
o Action: Select Meeting A.
o Update State: Selected Meetings = {A}, Last Finish Time = 10:00.
• Consider Meeting B (9:30 - 11:00):
o Its start time (9:30) is before the last finish time (10:00). It overlaps.


oAction: Reject Meeting B.


• Consider Meeting C (10:00 - 11:30):
o Its start time (10:00) is greater than or equal to the last finish time
(10:00).
o Action: Select Meeting C.
o Update State: Selected Meetings = {A, C}, Last Finish Time = 11:30.
• Consider Meeting D (11:30 - 1:00):
o Its start time (11:30) is greater than or equal to the last finish time
(11:30).
o Action: Select Meeting D.
o Update State: Selected Meetings = {A, C, D}, Last Finish Time =
13:00.

Final Result:
The optimal set of meetings that can be scheduled without overlap is {A, C, D}, for
a total of 3 meetings.
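The steps above can be condensed into a few lines of Python. The function name max_meetings and the minutes-from-midnight time encoding are choices made for this sketch:

```python
def max_meetings(meetings):
    """Greedy activity selection: always take the meeting that finishes earliest.

    meetings: list of (name, start, finish) with times in minutes from midnight.
    """
    selected = []
    last_finish = float("-inf")
    # Step 1: sort by finish time, ascending.
    for name, start, finish in sorted(meetings, key=lambda m: m[2]):
        # Steps 2-3: keep a meeting only if it starts at or after
        # the finish time of the last selected meeting.
        if start >= last_finish:
            selected.append(name)
            last_finish = finish
    return selected

meetings = [
    ("A", 9 * 60,      10 * 60),       # 9:00 - 10:00
    ("B", 9 * 60 + 30, 11 * 60),       # 9:30 - 11:00
    ("C", 10 * 60,     11 * 60 + 30),  # 10:00 - 11:30
    ("D", 11 * 60 + 30, 13 * 60),      # 11:30 - 13:00
]
max_meetings(meetings)  # → ['A', 'C', 'D']
```

The sort dominates the running time, so the whole procedure is O(n log n) for n meetings.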

12 Classify the following algorithms based on their time complexities (Linear, Quadratic, Logarithmic, etc.) and justify:
• Binary Search
• Bubble Sort
• Merge Sort
• Insertion Sort

• Binary Search
o Complexity: Logarithmic - O(log n)
o Justification: At each step, the algorithm discards half of the remaining
search space. This repeated halving means the number of steps required
grows logarithmically with the input size n.

• Bubble Sort
o Complexity: Quadratic - O(n²)
o Justification: It uses two nested loops. The outer loop runs n times, and
for each of those, the inner loop also runs approximately n times, leading
to n*n comparisons in the worst case.

• Merge Sort
o Complexity: Log-Linear - O(n log n)


o Justification: The array is recursively divided in half, which takes log


n levels of recursion. At each level, the merge operation requires
scanning through all n elements. This results in n * log n operations.

• Insertion Sort
o Complexity: Quadratic - O(n²)
o Justification: In the worst case (a reverse-sorted array), for each
element i, the algorithm has to shift all i preceding elements. This leads
to 1 + 2 + 3 + ... + (n-1) operations, which sums to a quadratic function.
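The halving behaviour that makes Binary Search O(log n) is visible in a standard iterative sketch (it assumes the input list is already sorted):

```python
def binary_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1   # discard the left half
        else:
            hi = mid - 1   # discard the right half
    return -1              # search space exhausted: not found
```

Each iteration discards half of the remaining range [lo, hi], so at most about log₂(n) + 1 iterations are needed.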

13 Explain the concept of Abstract Data Type (ADT). How is it different from a data structure? Give two examples. (Focuses more deeply on ADT)

The Concept of an Abstract Data Type (ADT)


An Abstract Data Type (ADT) is a high-level, theoretical model that defines a
data type purely in terms of its logical properties and behavior. It acts as
a blueprint or contract that specifies:

1. A collection of data.
2. A set of operations that can be performed on that data.

The defining characteristic of an ADT is abstraction. It deliberately hides the
implementation details from the user, focusing solely on the "what" (what
operations are available and what they do) rather than the "how" (how those
operations are coded and how the data is physically organized in memory). This
principle is also known as information hiding or encapsulation, where the data is
protected and can only be accessed through its defined public interface (the
operations).

Think of an ADT as the user manual and control panel for a complex machine like
a car. The driver (the user) interacts with a simple interface—a steering wheel,
pedals, and a gearstick. They don't need to know the intricate details of the internal
combustion engine, the transmission system, or the electrical wiring to operate the
car effectively. The car's internal complexity (the implementation) is hidden.

The Crucial Difference: ADT vs. Data Structure


The terms ADT and data structure are often used interchangeably, but they
represent two different levels of abstraction. The distinction is fundamental to
computer science.

• An ADT is the logical description—the idea, the model, the contract.


• A Data Structure is the concrete, physical implementation of that idea.

Feature          | Abstract Data Type (ADT)                       | Data Structure
-----------------|------------------------------------------------|------------------------------------------------
Level of Concern | Logical Level (Conceptual)                     | Physical Level (Concrete)
Focus            | What it does (behavior, operations, interface) | How it's done (memory layout, algorithms for operations)
Nature           | A blueprint or a mathematical model            | A real implementation built from the blueprint
Implementation   | Is implementation-independent                  | Is a specific, concrete implementation

One ADT can be implemented by many different data structures. The choice of
which data structure to use depends on the specific performance requirements
(time vs. space) of the application.

Detailed Examples
Example 1: The List ADT
• ADT Description: A List is an ADT that represents a finite, ordered
sequence of elements. It defines operations for accessing, adding, and
removing elements at specific positions.
• Key Operations:
o add(index, item): Inserts an item at a specific index.
o remove(index): Removes the item at a specific index.
o get(index): Retrieves the item at a specific index.
o size(): Returns the number of items in the list.
• Possible Data Structure Implementations:

1. Array-Based List (e.g., ArrayList in Java):


▪ How it works: Uses a contiguous block of memory (an array) to
store elements.
▪ Trade-offs: get(index) is very fast (O(1)) due to direct memory
access. However, add and remove can be slow (O(n)) because they
may require shifting all subsequent elements. Memory can be wasted
if the array is not full.

2. Linked List:
▪ How it works: Uses a chain of nodes, where each node contains an
element and a pointer to the next node.
▪ Trade-offs: add and remove (especially at the ends) are very fast
(O(1)) as they only involve changing pointers. However, get(index) is
slow (O(n)) because it requires traversing the list from the beginning.
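The same List interface backed by two different data structures can be sketched as follows. This is an illustrative Python sketch, not the Java classes named above; the class and method names are ours.

```python
class ArrayList:
    """List ADT backed by a contiguous array (a Python list here)."""
    def __init__(self):
        self._data = []
    def add(self, index, item):
        self._data.insert(index, item)   # O(n): may shift later elements
    def get(self, index):
        return self._data[index]         # O(1): direct access
    def size(self):
        return len(self._data)

class LinkedList:
    """List ADT backed by a chain of nodes."""
    class _Node:
        def __init__(self, item, nxt=None):
            self.item, self.next = item, nxt
    def __init__(self):
        self._head, self._count = None, 0
    def add(self, index, item):
        if index == 0:
            self._head = self._Node(item, self._head)   # O(1) at the front
        else:
            prev = self._head
            for _ in range(index - 1):                  # O(n) traversal
                prev = prev.next
            prev.next = self._Node(item, prev.next)
        self._count += 1
    def get(self, index):
        node = self._head
        for _ in range(index):                          # O(n) traversal
            node = node.next
        return node.item
    def size(self):
        return self._count

# Both implementations satisfy the same ADT contract:
for lst in (ArrayList(), LinkedList()):
    lst.add(0, "b"); lst.add(0, "a"); lst.add(2, "c")
    print([lst.get(i) for i in range(lst.size())])      # → ['a', 'b', 'c']
```

Client code written against the ADT operations (`add`, `get`, `size`) works unchanged with either implementation; only the performance profile differs.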

Example 2: The Dictionary (or Map) ADT


• ADT Description: A Dictionary is an ADT that stores a collection of
unique key-value pairs. It is designed for efficient lookup of a value when
its corresponding key is known.
• Key Operations:
o put(key, value): Associates a value with a key. If the key already
exists, its value is updated.
o get(key): Retrieves the value associated with a given key.
o remove(key): Deletes the key-value pair for a given key.
o containsKey(key): Checks if a key exists in the dictionary.
• Possible Data Structure Implementations:
1. Hash Table:
▪ How it works: Uses a hash function to compute an index into an array
of buckets, from which the desired value can be found.
▪ Trade-offs: Provides extremely fast average-case performance
for put, get, and remove (O(1)). However, worst-case performance can
be slow (O(n)), and it does not maintain the keys in any sorted order.
2. Balanced Binary Search Tree (e.g., Red-Black Tree):
▪ How it works: Stores key-value pairs in nodes of a tree that is
automatically kept balanced to ensure its height is logarithmic.
▪ Trade-offs: Provides a guaranteed worst-case performance for put, get,
and remove of O(log n), which is very good and predictable. It also has
the added benefit of maintaining the keys in sorted order, allowing for
efficient range queries.
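A hash-table-backed Dictionary can be sketched by wrapping Python's built-in dict (itself a hash table); the class name and snake_case method names are illustrative:

```python
class HashDictionary:
    """Dictionary ADT backed by a hash table (Python's built-in dict)."""
    def __init__(self):
        self._table = {}
    def put(self, key, value):
        self._table[key] = value     # average O(1); updates if key exists
    def get(self, key):
        return self._table[key]      # average O(1) lookup by hashed key
    def remove(self, key):
        del self._table[key]
    def contains_key(self, key):
        return key in self._table

d = HashDictionary()
d.put("apple", 3)
d.put("apple", 5)                    # existing key: value is updated
print(d.get("apple"))                # → 5
print(d.contains_key("banana"))      # → False
```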

14 Write a short note on Greedy Strategy. What are the essential characteristics
that make a problem suitable for the greedy approach?
A Short Note on Greedy Strategy
The Greedy Strategy is an algorithm design paradigm used for optimization
problems. Its core philosophy is to build a solution step-by-step by making a
sequence of choices. At each step, it makes the choice that seems most

beneficial at that particular moment—a locally optimal choice—without any
regard for past choices or future consequences. The algorithm "greedily" consumes
the best available option at each stage, hoping this series of short-sighted decisions
will lead to a correct globally optimal solution.

The process is typically iterative: start with an empty solution, and at each step,
add the best possible piece to it until a complete solution is formed. Once a choice
is made, it is irrevocable; the algorithm never backtracks to reconsider it. A
famous analogy is a cashier making change, who always gives back the largest
denomination coin possible without exceeding the required amount.
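The cashier analogy can be sketched directly. The denominations below are illustrative; note that this greedy approach is only guaranteed optimal for "canonical" coin systems such as this one, not for arbitrary denominations:

```python
def make_change(amount, denominations):
    """Greedy change-making: always take the largest coin that fits."""
    coins = []
    for coin in sorted(denominations, reverse=True):
        while amount >= coin:        # greedily take this coin while it fits
            coins.append(coin)
            amount -= coin
    return coins

print(make_change(87, [1, 2, 5, 10, 20, 50]))   # → [50, 20, 10, 5, 2]
```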

While simple and often very fast, the greedy approach does not work for all
optimization problems. It can lead to incorrect, suboptimal solutions if the problem
does not have a specific underlying structure.

Essential Characteristics for a Greedy Problem

For a greedy algorithm to be guaranteed to produce a globally optimal solution, the
problem must exhibit two key properties:

1. The Greedy Choice Property


This is the most crucial characteristic. It states that a globally optimal solution can
be arrived at by making a sequence of locally optimal choices. In other words, the
choice that seems best at the current moment, without considering the results of
any subproblems, must be part of some globally optimal solution. You don't need to
"look ahead" to see if a choice will be detrimental later on; the immediate best
choice is always safe to take.

• Example (Activity Selection): The greedy choice is to pick the activity that
finishes earliest. This is a safe choice because it leaves the maximum
possible time for all other activities, thus keeping the door open for an
optimal solution for the rest of the problem.

2. Optimal Substructure
A problem exhibits optimal substructure if an optimal solution to the overall
problem contains within it optimal solutions to its subproblems. This property is
also a hallmark of dynamic programming.

For a greedy algorithm, this means that after making the first greedy choice, the
remaining problem is a smaller version of the original problem that can also be
solved optimally. The combination of the greedy choice and the optimal solution to
the remaining subproblem yields an optimal solution for the original problem.

• Example (Activity Selection): Let S be the set of all activities. The greedy
choice is to select the activity a₁ with the earliest finish time. Let S' be the set
of all activities in S that are compatible with a₁ (i.e., they start
after a₁ finishes). The Optimal Substructure property guarantees that an
optimal solution to the original problem S is {a₁} combined with an optimal
solution to the subproblem S'.
