DAA Exam Preparation Questions
An optimal binary search tree minimizes the expected search cost for a given set of keys, each associated with a probability of being searched. For an example instance with n=4, keys (a1,a2,a3,a4)=(do,if,int,while), success weights P=(3,3,1,1), and failure weights Q=(2,3,1,1,1), the optimal tree is found not by enumerating every possible tree, but by dynamic programming: a table of optimal costs is filled for every contiguous range of keys, combining smaller optimal subtrees into larger ones. Because the recurrence weighs each subtree by its total search weight, frequently searched keys naturally end up near the root, minimizing the average search path length. The optimal overall cost is read from the table entry covering the full key range.
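The table-filling process above can be sketched as follows, a minimal implementation in the Horowitz–Sahni style where p holds the success weights and q the failure weights (function and variable names are illustrative, not from the source):

```python
def obst_cost(p, q):
    """Optimal BST by dynamic programming.

    p: success weights p1..pn (0-indexed list), q: failure weights q0..qn.
    Returns (minimal weighted cost, root table), where root[i][j] is the
    1-based index of the root key for the subproblem covering keys i+1..j.
    """
    n = len(p)
    # w[i][j]: total weight of keys i+1..j plus surrounding failure weights
    w = [[0] * (n + 1) for _ in range(n + 1)]
    c = [[0] * (n + 1) for _ in range(n + 1)]      # c[i][j]: optimal cost
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        w[i][i] = q[i]                             # empty range: failure weight only
    for length in range(1, n + 1):                 # solve subproblems by size
        for i in range(n - length + 1):
            j = i + length
            w[i][j] = w[i][j - 1] + p[j - 1] + q[j]
            # try each key k in the range as root; children are smaller subproblems
            best, best_k = min((c[i][k - 1] + c[k][j], k) for k in range(i + 1, j + 1))
            c[i][j] = w[i][j] + best
            root[i][j] = best_k
    return c[0][n], root

# the example instance from the text: keys (do, if, int, while)
cost, root = obst_cost([3, 3, 1, 1], [2, 3, 1, 1, 1])
# cost is 32, and root[0][4] is 2, i.e. 'if' is the root of the optimal tree
```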
The FIFO (First-In-First-Out) branch and bound strategy processes nodes of the state space tree through a queue, exploring them in the order they were generated without considering their cost-effectiveness. This strategy is systematic and guarantees eventual coverage of all nodes necessary to find the optimal solution. However, it may not reach optimal solutions quickly, because it does not use cost information to prioritize the most promising nodes. As a result, it can explore a large number of suboptimal paths, especially when the search space is vast and optimal nodes are not among the early-generated states, increasing time and computational cost.
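As a concrete sketch of the queue-based exploration, here is a FIFO branch and bound for the 0/1 knapsack problem (a problem chosen only for illustration; the bound used, "total value of all remaining items", is deliberately crude):

```python
from collections import deque

def knapsack_fifo_bb(values, weights, capacity):
    """FIFO branch and bound for 0/1 knapsack (illustrative sketch)."""
    n = len(values)
    # suffix[i]: total value of items i..n-1, a simple upper bound on
    # what any node that still has items i..n-1 undecided can add
    suffix = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + values[i]
    best = 0
    # each node: (index of next item to decide, value so far, weight so far)
    queue = deque([(0, 0, 0)])
    while queue:
        i, value, weight = queue.popleft()   # FIFO: oldest node first
        best = max(best, value)
        if i == n or value + suffix[i] <= best:
            continue                          # leaf, or bound cannot beat best
        if weight + weights[i] <= capacity:   # branch: include item i
            queue.append((i + 1, value + values[i], weight + weights[i]))
        queue.append((i + 1, value, weight))  # branch: exclude item i
    return best

# knapsack_fifo_bb([60, 100, 120], [10, 20, 30], 50) returns 220
```

Note how the queue never reorders nodes by promise: a least-cost (best-first) variant would replace the deque with a priority queue keyed on the bound.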
Kruskal’s algorithm builds a minimum spanning tree by first sorting all the edges of the graph in increasing order of weight. It then repeatedly takes the smallest edge that does not form a cycle with the tree built so far, stopping when the tree has exactly V-1 edges, where V is the number of vertices. The algorithm is well suited to sparse graphs and runs in O(E log E) time, where E is the number of edges, dominated by the edge sort. Efficient disjoint-set (union-find) data structures track the subsets of vertices as the tree grows, and performance improves further when union by rank and path compression are applied.
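The steps above can be sketched as follows; this minimal version uses a small inline union-find (names are illustrative):

```python
def kruskal(num_vertices, edges):
    """Minimum spanning tree by Kruskal's algorithm.

    edges: list of (weight, u, v) tuples with 0-indexed vertices.
    Returns (total_weight, list of (u, v, weight) edges in the MST).
    """
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False                    # same component: edge forms a cycle
        if rank[ra] < rank[rb]:             # union by rank
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    total, mst = 0, []
    for w, u, v in sorted(edges):           # O(E log E) sort dominates
        if union(u, v):
            total += w
            mst.append((u, v, w))
            if len(mst) == num_vertices - 1:
                break                       # tree complete with V-1 edges
    return total, mst
```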
Branch and bound algorithms systematically explore the solution space by evaluating explicit (generated) and implicit (potential) states, typically using a cost function to guide which state to explore next. This can involve least-cost search as well as FIFO (queue-based) and LIFO (stack-based) exploration orders. Backtracking, on the other hand, uses a simpler depth-first exploration of the state space tree and is often applied to constraint satisfaction problems where only feasibility is checked, as in the n-queens and sum of subsets problems. Branch and bound additionally uses cost comparisons to prune sub-optimal paths, making it better suited to optimization problems.
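To make the contrast concrete, here is a backtracking solver for n-queens: notice that pruning relies only on feasibility checks (no queen attacked), with no cost function at all:

```python
def n_queens(n):
    """Count solutions to n-queens by depth-first backtracking."""
    solutions = 0
    cols, diag1, diag2 = set(), set(), set()   # occupied columns and diagonals

    def place(row):
        nonlocal solutions
        if row == n:                           # all rows filled: one solution
            solutions += 1
            return
        for col in range(n):
            # feasibility check alone prunes the subtree (no cost function)
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue
            cols.add(col); diag1.add(row - col); diag2.add(row + col)
            place(row + 1)                     # descend depth-first
            cols.remove(col); diag1.remove(row - col); diag2.remove(row + col)

    place(0)
    return solutions

# n_queens(4) returns 2; n_queens(8) returns 92
```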
The weighted union technique significantly enhances the efficiency of disjoint set operations, UNION and FIND, by always attaching the smaller tree under the root of the larger tree. This keeps the trees shallow, bounding FIND at O(log n) on its own. When combined with path compression, which flattens the tree whenever FIND is called, the amortized cost per operation drops to nearly O(1) (more precisely, O(α(n)), where α is the very slowly growing inverse Ackermann function). This improvement makes implementations more efficient across a variety of applications, including Kruskal's algorithm for minimum spanning trees.
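A minimal sketch of both optimizations together, here using union by size as the weighting criterion (class and attribute names are illustrative):

```python
class DisjointSet:
    """Disjoint set with weighted union (by size) and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.size = [1] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression: repoint to root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False               # already in the same set
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra            # attach smaller tree under larger root
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```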
NP-Complete problems are a class of computational problems for which no polynomial-time solutions are known, yet any proposed solution can be verified in polynomial time. These problems are significant because they are the hardest problems in NP: a polynomial-time algorithm for any one NP-Complete problem would imply P equals NP, a major unsolved question in computer science. A common example is the Travelling Salesperson Problem (TSP), whose decision version ("is there a tour of cost at most k visiting each city exactly once and returning to the origin?") is NP-Complete. The significance lies in the universality of NP-Complete problems in computational theory, where they serve as benchmarks for complexity and influence how algorithms are developed for real-world problem-solving.
Quick Sort uses a divide-and-conquer approach, and its time complexity depends mainly on how well the pivot partitions the array. In the best case, where the pivot divides the array into two equal halves each time, the time complexity is O(n log n); the average case behaves similarly, also O(n log n). In the worst case, where the pivot is always the smallest or largest element (as with already sorted input and a naive first- or last-element pivot), the complexity degrades to O(n^2). These worst cases are rare in practice thanks to strategies like random pivot selection, which preserve average-case behavior. Quick Sort's efficiency is further aided by its in-place partitioning, which minimizes additional memory overhead.
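A minimal in-place sketch using random pivot selection with Lomuto partitioning (one of several standard partition schemes):

```python
import random

def quicksort(arr):
    """In-place quicksort with a random pivot (Lomuto partition)."""

    def sort(lo, hi):
        if lo >= hi:
            return
        # random pivot guards against the O(n^2) sorted-input worst case
        p = random.randint(lo, hi)
        arr[p], arr[hi] = arr[hi], arr[p]
        pivot, i = arr[hi], lo
        for j in range(lo, hi):          # partition around the pivot value
            if arr[j] <= pivot:
                arr[i], arr[j] = arr[j], arr[i]
                i += 1
        arr[i], arr[hi] = arr[hi], arr[i]
        sort(lo, i - 1)                  # recurse on both halves
        sort(i + 1, hi)

    sort(0, len(arr) - 1)
    return arr

# quicksort([5, 3, 8, 1, 2]) returns [1, 2, 3, 5, 8]
```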
The dynamic programming approach to the Travelling Salesperson Problem (TSP), known as the Held-Karp algorithm, breaks the problem into simpler subproblems and stores their solutions. It uses a state-space representation in which each state is a pair consisting of a set of visited vertices and the current vertex, and builds tours by extending partial paths one edge at a time. The minimal cost is computed recursively, and memoization ensures that each subproblem is solved only once, eliminating redundancy. The limitation is the exponential time complexity O(n^2 * 2^n), which grows rapidly with the number of vertices. Although far more efficient than the naive O(n!) enumeration, it remains impractical for very large graphs.
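A compact sketch of the Held-Karp recurrence, encoding each visited-vertex set as a bitmask (variable names are illustrative):

```python
def held_karp(dist):
    """Held-Karp DP for TSP. dist: n x n cost matrix. Returns min tour cost."""
    n = len(dist)
    INF = float("inf")
    # dp[mask][v]: cheapest path starting at city 0, visiting exactly the
    # cities in `mask`, and ending at city v
    dp = [[INF] * n for _ in range(1 << n)]
    dp[1][0] = 0                               # only city 0 visited, cost 0
    for mask in range(1 << n):
        if not mask & 1:
            continue                           # every state includes city 0
        for last in range(n):
            if dp[mask][last] == INF or not mask & (1 << last):
                continue
            for nxt in range(n):               # extend the path by one edge
                if mask & (1 << nxt):
                    continue
                new_mask = mask | (1 << nxt)
                cost = dp[mask][last] + dist[last][nxt]
                if cost < dp[new_mask][nxt]:
                    dp[new_mask][nxt] = cost
    full = (1 << n) - 1
    # close the tour by returning to city 0
    return min(dp[full][v] + dist[v][0] for v in range(1, n))
```

The dp table has 2^n * n states and each is extended in O(n), giving the O(n^2 * 2^n) time bound stated above.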
Asymptotic notations describe the limiting behavior of an algorithm's time or space complexity as the input size approaches infinity. They allow the efficiency of algorithms to be evaluated and compared independently of lower-level factors such as language or machine. The main types are Big O, which gives an upper bound on growth (commonly used to state worst-case complexity); Big Omega (Ω), which gives a lower bound (commonly associated with best-case analysis); and Big Theta (Θ), a tight bound that applies when the upper and lower bounds match. By ignoring constant factors and lower-order terms, these notations let developers focus on the factors that dominate efficiency, guiding algorithm selection and optimization.
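For reference, the three bounds have the following standard formal definitions (stated here for nonnegative functions of input size n):

```latex
f(n) = O(g(n)) \iff \exists\, c > 0,\ n_0 : 0 \le f(n) \le c\, g(n) \quad \forall\, n \ge n_0

f(n) = \Omega(g(n)) \iff \exists\, c > 0,\ n_0 : f(n) \ge c\, g(n) \quad \forall\, n \ge n_0

f(n) = \Theta(g(n)) \iff f(n) = O(g(n)) \ \text{and}\ f(n) = \Omega(g(n))
```

For example, 3n^2 + 5n is Θ(n^2): take c = 4 and n0 = 5 for the upper bound and c = 3, n0 = 1 for the lower bound.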
Strassen’s matrix multiplication algorithm is an improvement over the conventional O(n^3) technique. It reduces the number of multiplications needed to multiply two n x n matrices by using divide and conquer to split each matrix into smaller submatrices. For 2x2 blocks, Strassen's method uses seven multiplications and 18 addition/subtraction operations, compared with the eight multiplications of the naive method. Applied recursively, this yields a time complexity of approximately O(n^2.81), a significant improvement for large matrices. Its benefit is greatest on large inputs, where the reduced multiplication count outweighs the added bookkeeping of managing the submatrix calculations.
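The seven products for the 2x2 base case can be written out directly; this sketch uses scalar entries (in the full algorithm, each letter would be an (n/2) x (n/2) submatrix and the multiplications would recurse):

```python
def strassen_2x2(A, B):
    """Strassen's seven products for 2x2 matrices.

    Uses 7 multiplications and 18 additions/subtractions,
    versus 8 multiplications and 4 additions for the naive method.
    """
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    # recombine the seven products into the four result entries
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

# strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) returns [[19, 22], [43, 50]]
```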