B.Tech AIML 5th Semester CSE University Questions
Ans. In AI, learning agents are considered the most powerful type of agent because they can improve
their performance over time through experience. Unlike simple reflex agents or model-based agents,
learning agents use historical data to update their knowledge and decision-making strategies, making
them adaptable and more robust in complex environments.
Ans. A rational agent in AI is an agent that acts to achieve the best possible outcome or, when there
is uncertainty, the best expected outcome. It takes actions based on its knowledge, environment, and
capabilities to maximize its performance measure. Rationality is context-dependent, meaning an
agent's rational behavior depends on its goals, knowledge, and the information it receives from its
environment.
Ans. Informed Search (Heuristic Search): Uses additional information (heuristics) to find
a solution more efficiently. A popular example is the A* algorithm, which uses a heuristic
function to estimate the cost of reaching the goal from a given node, allowing it to explore
promising paths first.
Example: In a map navigation problem, the distance between two cities could be used as a
heuristic to guide the search.
Uninformed Search (Blind Search): Does not use any domain-specific knowledge to
search for a solution. It explores nodes without guidance. A common example is Breadth-
First Search (BFS), which systematically explores all nodes level by level.
Example: In solving a maze, BFS explores all paths evenly until it finds the exit.
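The contrast can be sketched in a few lines of Python (the graph and heuristic values below are invented for illustration): on the same small graph, uninformed BFS expands more nodes than a greedy heuristic search before reaching the goal.

```python
from collections import deque
import heapq

# Hypothetical city graph and assumed straight-line-distance heuristic.
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['E'], 'D': ['G'], 'E': ['G'], 'G': []}
h = {'A': 3, 'B': 2, 'C': 1, 'D': 1, 'E': 1, 'G': 0}

def bfs(start, goal):
    # Uninformed: expand level by level, counting expansions.
    frontier, seen, expanded = deque([start]), {start}, 0
    while frontier:
        node = frontier.popleft()
        expanded += 1
        if node == goal:
            return expanded
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)

def greedy(start, goal):
    # Informed: always expand the node with the lowest heuristic value.
    frontier, seen, expanded = [(h[start], start)], {start}, 0
    while frontier:
        _, node = heapq.heappop(frontier)
        expanded += 1
        if node == goal:
            return expanded
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (h[nxt], nxt))

print(bfs('A', 'G'), greedy('A', 'G'))  # BFS expands 6 nodes, greedy only 4
```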
d) State two differences between propositional logic and first order logic.
Ans. Expressiveness:
Propositional Logic: Deals with simple, atomic propositions that are either true or
false. It cannot express relationships between objects.
First-Order Logic (FOL): More expressive, allowing quantification over objects and
the relationships between them (e.g., "All humans are mortal").
Variables:
Propositional Logic: Does not include variables, only propositions (e.g., P, Q).
First-Order Logic: Includes variables and quantifiers like "for all" (∀) and "there
exists" (∃), which enable it to represent more complex statements (e.g., ∀x Human(x)
→ Mortal(x)).
Ans. Uncertainty in reasoning refers to situations where the outcome of actions or the truth of
statements is not known with certainty due to incomplete, noisy, or ambiguous information. In AI,
probabilistic reasoning methods like Bayesian networks or fuzzy logic are often used to handle
uncertainty, allowing the system to make predictions or decisions with a certain level of confidence,
rather than relying on binary true/false logic.
Ans. Worst Ordering: Occurs when alpha-beta pruning is applied, but the nodes are
evaluated in an order that provides minimal pruning. This results in exploring more nodes,
leading to a performance close to that of a basic minimax algorithm without pruning.
Example: In a chess game tree, if the least favorable moves are evaluated first, the algorithm
has to explore a larger portion of the tree.
Ideal Ordering: Occurs when nodes are evaluated in the most efficient order, allowing
maximum pruning of branches. This drastically reduces the number of nodes to be evaluated,
significantly improving the algorithm’s efficiency.
Example: If the most favorable moves are evaluated first, large portions of the tree are
pruned, speeding up the decision-making process.
Ans. Unification is the process of making two logical expressions identical by finding a
substitution for variables that allows this. It is used in first-order logic and Prolog for pattern
matching in logical reasoning.
Example: to unify P(x, y) with P(Alice, z), we substitute x = Alice and y = z, making both expressions equivalent to P(Alice, z). This allows the inference mechanism to treat them as the same logical statement.
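A minimal unification sketch in Python. The representation is made up for illustration: variables are lowercase strings, constants are capitalized strings, and predicates are tuples like ('P', 'x', 'y'); the occurs check is omitted for brevity.

```python
def is_var(t):
    # A term is a variable if it is a lowercase string (illustrative convention).
    return isinstance(t, str) and t[:1].islower()

def unify(a, b, subst):
    if subst is None:
        return None                      # an earlier step already failed
    if a == b:
        return subst
    if is_var(a):
        return unify_var(a, b, subst)
    if is_var(b):
        return unify_var(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for ai, bi in zip(a, b):         # unify argument by argument
            subst = unify(ai, bi, subst)
            if subst is None:
                return None
        return subst
    return None                          # mismatched constants or arities

def unify_var(v, t, subst):
    if v in subst:
        return unify(subst[v], t, subst)
    if is_var(t) and t in subst:
        return unify(v, subst[t], subst)
    return {**subst, v: t}               # bind the variable

# Unifying P(x, y) with P(Alice, z) yields {x: Alice, y: z}.
print(unify(('P', 'x', 'y'), ('P', 'Alice', 'z'), {}))
```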
h) Is Bayesian network supervised or unsupervised? Justify your answer.
Ans. A Bayesian network can be used in both supervised and unsupervised learning contexts,
depending on how it is applied.
Supervised learning: When the Bayesian network is used to model the conditional
probabilities between features (input variables) and a known target (output variable),
it functions in a supervised manner. For instance, when training a Bayesian network
to predict the likelihood of a disease given symptoms, the network is trained on
labeled data (symptoms and known outcomes).
Unsupervised learning: A Bayesian network can also be used to model the joint
probability distribution of a set of variables without any predefined labels or outputs.
In this case, it functions as an unsupervised learning tool, learning the relationships
between variables.
Thus, Bayesian networks are flexible and can be applied to both supervised and
unsupervised learning tasks, depending on the problem setup and whether labeled data is
available.
Ans. Statistical learning is a framework for understanding and modeling data using statistics
and probability. Key concepts include:
1. Training Data: A dataset used to train the model by fitting the relationships between
input variables (features) and the target variable (label).
2. Model: A mathematical or computational construct that represents the relationship
between input variables and output. Examples include linear regression models,
decision trees, and neural networks.
3. Loss Function: A measure of how well the model’s predictions match the actual
outcomes. Common loss functions include Mean Squared Error (MSE) for regression
tasks and Cross-Entropy Loss for classification.
4. Regularization: Techniques to prevent overfitting by penalizing overly complex
models (e.g., L1, L2 regularization).
5. Bias-Variance Tradeoff: Describes the trade-off between the model’s complexity
(variance) and the error due to simplifying assumptions (bias). Finding the right
balance minimizes prediction error.
6. Generalization: The ability of a model to perform well on unseen data (test data), not
just on the training data. A well-generalized model avoids overfitting.
7. Probability Distributions: Statistical learning often involves understanding
probability distributions (e.g., Gaussian, Bernoulli) to describe data and uncertainty.
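Concept 3 (loss functions) can be illustrated directly; the numbers below are toy values chosen only for illustration.

```python
import math

# Mean Squared Error for a regression task.
y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.0, 3.0]
mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Binary cross-entropy for a classification task (labels 1/0, predicted probabilities).
labels = [1, 0, 1]
probs = [0.9, 0.2, 0.8]
ce = -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
          for y, p in zip(labels, probs)) / len(labels)

print(round(mse, 4), round(ce, 4))
```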
PART-II
Ans. Artificial Intelligence (AI) and Machine Learning (ML) are closely related fields, but
they are not the same.
AI is the broader concept of creating intelligent systems that can perform tasks
typically requiring human intelligence. This includes reasoning, learning, decision-
making, and understanding natural language. AI covers a wide range of techniques,
from rule-based systems to robotics.
ML is a subset of AI that focuses on developing algorithms that allow machines to
learn from data. Rather than being explicitly programmed for specific tasks, ML
models learn from patterns in data and improve their performance over time. ML is a
key technique that powers many AI systems.
Relationship: AI is the overall goal (building intelligent systems), and ML is one of the key
tools used to achieve that goal by enabling systems to learn and adapt.
b)Explain with example what is Means-Ends-Analysis.
Ans. Means-Ends Analysis (MEA) is a problem-solving technique that repeatedly measures the
difference between the current state and the goal state, then applies an operator that reduces
this difference, solving any subproblems that arise along the way.
Example: Suppose you want to travel from City A to City B, but you don’t have a direct
flight. The MEA process involves finding the intermediate steps (such as connecting flights)
that reduce the difference between where you are now (City A) and your final goal (City B).
c)Describe the concept of a multi-agent system and elucidate the advantages and
challenges associated with coordination and interaction among multiple agents within
an environment.
Ans. A Multi-Agent System (MAS) consists of multiple agents that interact with each other
within a shared environment. Each agent in the system is autonomous, meaning it can make
its own decisions based on its perception of the environment. The agents in an MAS may
work collaboratively, competitively, or independently, depending on the system’s design and
the problem it aims to solve.
Advantages:
1. Scalability: MAS systems can handle complex tasks by distributing the workload among
multiple agents, which can improve efficiency and performance.
2. Robustness: Since the system consists of multiple agents, the failure of one agent may not
critically affect the overall system.
3. Parallelism: Agents can work in parallel, which can significantly reduce the time required to
solve certain problems.
Challenges:
1. Coordination: Ensuring that agents coordinate their actions effectively to achieve a common
goal can be complex, especially in collaborative settings.
2. Communication Overhead: If agents need to share information frequently, it can lead to
high communication overhead, affecting performance.
3. Conflicting Objectives: In competitive environments, agents may have conflicting goals,
making it difficult to achieve global optimization.
4. Resource Management: Allocating and sharing resources between agents can be a challenge
in resource-limited environments.
d)What is best first search? Explain its advantages over BFS and DFS with a
suitable example.
Ans. Best-First Search is a search algorithm that selects the most promising node to
explore based on a given evaluation function (often a heuristic). It combines elements of both
depth-first and breadth-first search by using a priority queue to prioritize nodes that are
likely to lead to the goal.
Breadth-First Search (BFS) explores all nodes level by level, which can be inefficient
when the search space is large.
Depth-First Search (DFS) explores nodes deep into the search tree but might get stuck in
deep branches without finding a solution.
Best-first search addresses these issues by prioritizing nodes that seem closer to the goal,
reducing the number of explored nodes and speeding up the search process.
Example: In pathfinding, if you're looking for the shortest path from a start point to a goal,
breadth-first search would explore all paths evenly, while depth-first search might get stuck
exploring one path too deeply. Best-first search instead uses a heuristic, such as the estimated
distance to the goal (e.g., Euclidean distance in a grid), to prioritize nodes closer to the goal,
which can find a solution faster.
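A greedy best-first search on a tiny grid can be sketched as follows (the grid and start/goal cells are made up; Manhattan distance to the goal is the evaluation function):

```python
import heapq

# 0 = free cell, 1 = wall.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
start, goal = (0, 0), (2, 3)

def best_first(start, goal):
    def h(cell):
        # Manhattan distance: the evaluation function guiding the search.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
    frontier = [(h(start), start, [start])]
    seen = {start}
    while frontier:
        _, cell, path = heapq.heappop(frontier)   # lowest-h node first
        if cell == goal:
            return path
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            ok = 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0
            if ok and (nr, nc) not in seen:
                seen.add((nr, nc))
                heapq.heappush(frontier, (h((nr, nc)), (nr, nc), path + [(nr, nc)]))

print(best_first(start, goal))
```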
e)Is the A* algorithm able to find a suitable solution from the state space graph of a
problem? Justify your answer with suitable explanation.
Ans. Yes, the A* algorithm is designed to find the optimal solution in a state space graph if
the heuristic used is admissible (it never overestimates the cost to reach the goal) and
consistent (for every node n and each neighbor n', h(n) ≤ c(n, n') + h(n'), i.e., the heuristic
never decreases by more than the step cost along any edge).
Justification:
A* combines heuristics (to estimate the cost to the goal) with the actual cost so far.
The cost function in A* is f(n)=g(n)+h(n), where:
o g(n) is the cost from the start node to node n.
o h(n) is the heuristic estimate of the cost from n to the goal.
Because A* expands the most promising nodes first (those with the lowest f(n)), it efficiently
finds the shortest path. If the heuristic is admissible, A* guarantees finding the optimal
solution.
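A compact A* sketch (graph, step costs, and heuristic values are hypothetical; the heuristic is admissible by construction, so the returned path is optimal):

```python
import heapq

graph = {  # weighted edges of a hypothetical state space
    'S': {'A': 1, 'B': 4},
    'A': {'B': 2, 'G': 6},
    'B': {'G': 3},
    'G': {},
}
h = {'S': 4, 'A': 3, 'B': 2, 'G': 0}  # admissible: never overestimates

def a_star(start, goal):
    # Frontier entries are (f, g, node, path) with f(n) = g(n) + h(n).
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        for nxt, cost in graph[node].items():
            ng = g + cost
            if ng < best_g.get(nxt, float('inf')):  # found a cheaper route
                best_g[nxt] = ng
                heapq.heappush(frontier, (ng + h[nxt], ng, nxt, path + [nxt]))
    return None

print(a_star('S', 'G'))  # optimal cost 6 via S -> A -> B -> G
```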
Backward Chaining:
Process: Starts with the goal and works backward by applying inference rules to see if
the facts can support the goal.
Use Case: Used in goal-driven systems where the system tries to prove a hypothesis.
Example: To diagnose a disease, the system starts with a possible disease (goal) and
checks if the symptoms (facts) support the diagnosis.
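Backward chaining can be sketched in a few lines (the rules and facts are hypothetical, mirroring the diagnosis example above):

```python
# Each rule maps a conclusion to the premises that support it.
rules = {
    'flu': ['fever', 'cough'],
    'fever': ['high_temperature'],
}
facts = {'high_temperature', 'cough'}  # known observations

def prove(goal):
    # Work backward from the goal: a goal holds if it is a known fact,
    # or if some rule concludes it and all of that rule's premises hold.
    if goal in facts:
        return True
    premises = rules.get(goal)
    if premises is None:
        return False
    return all(prove(p) for p in premises)

print(prove('flu'))  # True: fever is derivable, cough is a fact
```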
g)Explain the working of alpha-beta pruning with example. How it is different than
minimax algorithm?
Ans. Alpha-beta pruning is an optimization technique for the minimax algorithm that
reduces the number of nodes evaluated in the search tree. It works by "pruning" branches that
cannot influence the final decision, effectively ignoring parts of the tree that do not need to be
explored.
Alpha represents the best value that the maximizer can guarantee.
Beta represents the best value that the minimizer can guarantee.
As the algorithm traverses the tree, if it finds that a certain move will lead to a worse
outcome than a previously evaluated move, it stops exploring that branch.
Example: In a game tree, suppose we are evaluating moves in a chess game. If one move
clearly leads to a better outcome for the opponent, alpha-beta pruning will stop evaluating
further moves along that branch because the opponent would never allow us to reach that
position.
Minimax Algorithm: Explores all nodes of the game tree, which can be
computationally expensive.
Alpha-Beta Pruning: Skips unnecessary nodes and thus reduces the search space,
making the process faster without affecting the outcome.
Both algorithms aim to find the optimal move, but alpha-beta pruning does so more
efficiently by eliminating unpromising branches.
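The difference can be demonstrated on one small game tree (leaf values are invented): both algorithms return the same value, but alpha-beta visits fewer leaves.

```python
import math

# Root is a MAX node choosing among three MIN nodes with hypothetical leaf values.
tree = [[3, 5], [2, 9], [0, 1]]
visited = []  # records every leaf evaluated

def minimax(node, maximizing):
    if isinstance(node, int):
        visited.append(node)
        return node
    vals = [minimax(ch, not maximizing) for ch in node]
    return max(vals) if maximizing else min(vals)

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    if isinstance(node, int):
        visited.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for ch in node:
        val = alphabeta(ch, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, val)
            alpha = max(alpha, val)
        else:
            best = min(best, val)
            beta = min(beta, val)
        if beta <= alpha:
            break  # prune: this branch cannot affect the final decision
    return best

visited.clear(); v1 = minimax(tree, True); n1 = len(visited)
visited.clear(); v2 = alphabeta(tree, True); n2 = len(visited)
print(v1, n1, v2, n2)  # same value, fewer leaves with pruning
```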
i)Compare and contrast propositional logic and first-order logic in terms of expressive
power and representational capabilities. Provide examples to highlight scenarios where
each logic type is more suitable for knowledge representation.
Ans. Propositional Logic (PL):
Expressive Power: Propositional logic deals with simple, atomic propositions that
can either be true or false. It does not allow for the expression of relations between
objects or the use of variables. For example, in propositional logic, a fact like "The
sky is blue" would be represented as a single atomic proposition, such as P, where P
stands for "The sky is blue".
Representational Capabilities: PL can only handle specific facts and relationships
between propositions using logical connectives (AND, OR, NOT). It lacks the ability
to represent objects, properties, and relationships between objects.
Example: If we want to represent the fact that "It is raining" and "The ground is wet",
we might use two propositions:
o R: It is raining.
o W: The ground is wet.
o We can then form a logical expression such as: R → W (If it rains, the ground
will be wet).
First-Order Logic (FOL):
Expressive Power: FOL, also known as predicate logic, extends propositional logic
by allowing the use of quantifiers (e.g., ∀ for "for all" and ∃ for "there exists") and
predicates that can express relations between objects. It allows for more detailed
representations involving objects, properties, and relations between objects.
Representational Capabilities: FOL can express general rules and relationships
involving objects. For example, it can represent "All humans are mortal" using
variables and quantifiers: ∀x (Human(x) → Mortal(x)).
Example: To represent the fact that "All humans are mortal" and "Socrates is a
human", we can use:
o ∀x (Human(x) → Mortal(x)) (All humans are mortal).
o Human(Socrates) (Socrates is a human).
o From this, we can deduce Mortal(Socrates) (Socrates is mortal).
Propositional Logic is suitable for simple reasoning tasks involving specific facts
without the need for representing relations between different objects or using
variables. It is appropriate when the knowledge base consists of concrete, unchanging
propositions.
First-Order Logic is more powerful and suitable when dealing with complex systems
that involve relationships between multiple objects, or when general rules need to be
expressed. It is useful in domains like artificial intelligence, where reasoning about
objects and their properties is essential (e.g., "If a person is a parent, they have a
child").
j)What is the difference between neural net learning and genetic learning? Explain with
suitable examples.
Ans. Neural Net Learning: A neural network learns by iteratively adjusting its connection
weights to minimize a loss function, typically using gradient-based methods such as
backpropagation. Example: training a network to recognize handwritten digits by repeatedly
correcting its prediction errors on labeled examples.
Genetic Learning: A genetic algorithm learns by evolving a population of candidate
solutions through selection, crossover, and mutation, guided by a fitness function. Example:
evolving a game-playing strategy by keeping and recombining the best-performing candidates
across generations.
Key Differences: Neural net learning refines a single model continuously using error
gradients, whereas genetic learning searches over a population of solutions using evolutionary
operators; neural networks require a differentiable objective, while genetic algorithms only
need a fitness score for each candidate.
k)What are the characteristics of Rote Learning? Is it good or bad? Justify your
answer.
Ans. Rote learning is memorization through repetition: information is stored exactly as
presented, without generalization or deeper understanding. Rote learning can be both good
and bad depending on the context:
Good:
o Quick Memorization: Rote learning is useful for tasks that require memorization of
facts, formulas, or procedures that don’t necessarily require deep understanding. For
example, memorizing multiplication tables, vocabulary, or periodic table elements
can be helpful for quick recall.
o Efficiency in Recalling Information: In some cases, being able to quickly recall
information is necessary (e.g., memorizing phone numbers, dates, or historical facts
for a quiz).
o Foundation for Further Learning: In some cases, rote learning can provide a
foundation for deeper learning. For example, memorizing basic math operations can
serve as a foundation for understanding more complex mathematical concepts.
Bad:
o Lack of Understanding: Rote learning discourages critical thinking and
understanding. It often leads to a shallow grasp of information, making it difficult to
apply knowledge to new contexts.
o Poor Long-Term Retention: Information learned by rote may be easily forgotten
over time, especially if not reinforced by deeper learning methods.
o Not Suitable for Complex Concepts: For complex subjects (e.g., problem-solving,
reasoning, or creative thinking), rote learning is generally ineffective because it does
not promote conceptual understanding or flexible thinking.
Justification: Rote learning has its place in certain situations where memorization is the
primary goal, but for more complex tasks that require problem-solving, understanding, or
application of knowledge, rote learning is not ideal. It is best used in combination with more
meaningful learning strategies, like active learning or conceptual understanding.
Ans. Maximum Likelihood Estimation (MLE) is a statistical method used to estimate the
parameters of a probabilistic model. It finds the values of the parameters that maximize the
likelihood function, i.e., it finds the set of parameters that makes the observed data most
probable.
Steps in Maximum Likelihood Estimation:
1. Define the Likelihood Function: The likelihood function represents the probability of
observing the given data as a function of the model parameters.
2. Maximize the Likelihood: The goal is to find the parameter values that maximize this
likelihood function. Often, instead of maximizing the likelihood directly, we maximize the
log-likelihood (because it simplifies the math).
3. Estimate Parameters: The parameters that maximize the likelihood are considered the most
likely values given the data.
Example:
Suppose we are trying to estimate the probability p of a coin landing heads in a biased coin
flip experiment. We perform n trials, and observe k heads. The outcome of each trial can be
modeled as a Bernoulli random variable, and the likelihood function is based on the binomial
distribution.
Let:
p = the unknown probability of heads,
n = the number of trials,
k = the number of heads observed.
The likelihood function for the probability p, given the observed data, is:
L(p) = C(n, k) · p^k · (1 − p)^(n − k)
The binomial coefficient C(n, k) does not depend on p, so we focus on maximizing:
p^k · (1 − p)^(n − k)
To simplify, we take the natural logarithm of the likelihood function, giving us the log-
likelihood:
log L(p) = k · log p + (n − k) · log(1 − p) + constant
To find the value of p that maximizes this log-likelihood, we take the derivative with
respect to p and set it to zero:
k/p − (n − k)/(1 − p) = 0, which gives p = k/n
Thus, the maximum likelihood estimate for p is simply the proportion of heads observed, k/n.
Interpretation:
In this example, MLE gives the most likely estimate for the probability of flipping heads
based on the observed data. If, for instance, you flipped a coin 100 times and got 60 heads,
MLE would estimate that the probability p of heads is 0.60.
Application:
MLE is widely used in machine learning and statistics for parameter estimation in models
such as:
Logistic regression
Gaussian mixture models
Hidden Markov models
MLE's strength lies in its ability to provide a consistent and asymptotically unbiased estimate
of parameters, given sufficient data.
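The coin-flip derivation can be checked numerically: a grid search over p (a sketch, with n and k chosen arbitrarily) peaks exactly at k/n.

```python
import math

n, k = 100, 60  # 100 flips, 60 heads (arbitrary illustration)

def log_likelihood(p):
    # log L(p) = k log p + (n - k) log(1 - p), dropping the constant term.
    return k * math.log(p) + (n - k) * math.log(1 - p)

grid = [i / 1000 for i in range(1, 1000)]       # p in (0, 1)
p_hat = max(grid, key=log_likelihood)           # grid point maximizing log L
print(p_hat)  # 0.6, matching the closed-form estimate k/n
```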
PART-III
Q3) a) Explain the structure of a typical intelligent agent, breaking down its
components such as agent program, percept sequence and actuators. Explain how these
components interact to achieve intelligent behaviour in an agent.
Ans. An intelligent agent is an autonomous entity that perceives its environment through
sensors, takes actions using actuators, and aims to achieve certain goals. The structure of a
typical intelligent agent can be broken down into several key components:
1. Agent Program:
o The agent program is the core logic that controls the agent's behavior. It is a
function that maps from percept sequences (inputs received over time) to actions.
The agent program processes the percepts, makes decisions, and chooses actions to
perform.
o The program can be simple (like a reflex agent) or more complex, involving
reasoning, learning, and planning.
2. Percept Sequence:
o This refers to the complete history of everything the agent has perceived since it
was activated. The percept sequence can vary in complexity depending on the
agent's design.
o It serves as the input to the agent program and helps in decision-making.
o Examples of percepts include sensor data, camera input, sound, temperature, etc.
3. Sensors:
o Sensors are the physical components or mechanisms that gather information from
the environment.
o They can range from cameras, microphones, or any sensory device that helps the
agent perceive its surroundings.
4. Actuators:
o Actuators are mechanisms through which the agent interacts with the environment.
o They allow the agent to perform actions like movement (e.g., motors), manipulation
(e.g., robotic arms), or communication (e.g., speakers or displays).
5. Environment:
o The environment is where the agent operates. It provides the input (percepts) and is
affected by the agent’s actions.
CROSS
+ROADS
DANGER
Ans. CSPs involve variables, domains for these variables, and constraints that must be
satisfied. The goal here is to solve the problem involving the words:
CROSS
ROADS
DANGER
This can be modeled as a cryptarithmetic puzzle. In such puzzles, each letter represents a
unique digit (0-9), and the task is to find the digits that satisfy the arithmetic sum. The
equation is:
CROSS+ROADS=DANGER
Step-by-Step Approach:
Assign values to each letter so that the equation holds. Typically, these types of problems are
solved through backtracking or constraint propagation methods.
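A full solver would backtrack over digit assignments with constraint propagation; the sketch below only encodes the constraints and verifies one known solution (96233 + 62513 = 158746):

```python
def value(word, digits):
    # Convert a word to its number under the digit assignment.
    return int(''.join(str(digits[ch]) for ch in word))

def satisfies(digits):
    distinct = len(set(digits.values())) == len(digits)   # all-different constraint
    no_leading_zero = all(digits[w[0]] != 0 for w in ('CROSS', 'ROADS', 'DANGER'))
    adds_up = value('CROSS', digits) + value('ROADS', digits) == value('DANGER', digits)
    return distinct and no_leading_zero and adds_up

solution = {'C': 9, 'R': 6, 'O': 2, 'S': 3, 'A': 5, 'D': 1, 'N': 8, 'G': 7, 'E': 4}
print(satisfies(solution))  # 96233 + 62513 = 158746
```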
FOPL Translation of "Every athlete is strong and intelligent":
∀x(Athlete(x)→(Strong(x)∧Intelligent(x)))
Ans. To prove that Sachin succeeds in his career using resolution, we follow a step-by-step
process: convert the premises to clausal form, negate the goal, and resolve clauses until a
contradiction (the empty clause) is derived.
We are tasked with proving Succeeds(Sachin). To use resolution, we first negate this goal:
¬Succeeds(Sachin)
1. ∀x(Athlete(x)→(Strong(x)∧Intelligent(x)))
∀x(¬Athlete(x)∨(Strong(x)∧Intelligent(x)))
∀x((¬Athlete(x)∨Strong(x))∧(¬Athlete(x)∨Intelligent(x)))
¬Athlete(x)∨Strong(x)
¬Athlete(x)∨Intelligent(x)
2. ∀x((Strong(x)∧Intelligent(x))→Succeeds(x))
∀x(¬(Strong(x)∧Intelligent(x))∨Succeeds(x))
∀x((¬Strong(x)∨¬Intelligent(x))∨Succeeds(x))
¬Strong(x)∨¬Intelligent(x)∨Succeeds(x)
3. GoodRunner(Sachin)
This just remains as is, and it's assumed from the problem that GoodRunner(Sachin)
implies that Sachin is an athlete:
Athlete(Sachin)
The resulting clause set is:
1. ¬Athlete(Sachin)∨Strong(Sachin)
2. ¬Athlete(Sachin)∨Intelligent(Sachin)
3. ¬Strong(Sachin)∨¬Intelligent(Sachin)∨Succeeds(Sachin)
4. Athlete(Sachin) (from GoodRunner(Sachin))
5. ¬Succeeds(Sachin) (negated goal)
Resolving clause 4 with clause 1 gives Strong(Sachin).
Resolving clause 4 with clause 2 gives Intelligent(Sachin).
Resolving these with clause 3 gives Succeeds(Sachin), which resolves with clause 5
(¬Succeeds(Sachin)) to produce the empty clause, a contradiction.
Conclusion
Since we derived a contradiction, we conclude that Sachin does indeed succeed in his career.
Q5) a) What are the causes of uncertainty in real world? Explain the need of
probabilistic reasoning in AI with justification.
Ans. Causes of uncertainty in the real world include incomplete knowledge of the
environment, noisy or faulty sensor data, nondeterministic outcomes of actions, partial
observability, and ambiguous or conflicting information. Under these conditions an AI system
cannot rely on strict true/false logic; it must represent degrees of belief and update them as
new evidence arrives.
Justification:
In AI systems, especially when working with dynamic environments or uncertain inputs (like
predicting stock market prices or diagnosing diseases), probabilistic reasoning helps make
more robust decisions. It allows AI to quantify the level of certainty in its predictions and can
adjust beliefs when new information is available, leading to better and more adaptive
performance.
b)State Bayes’ theorem in artificial intelligence. Explain briefly how Bayes’ theorem
updates the prediction of an event with respect to the addition of new evidence.
Ans. Bayes’ theorem states:
P(H|E) = P(E|H) · P(H) / P(E)
Where:
H is the hypothesis and E is the observed evidence.
P(H|E) is the posterior probability of the hypothesis given the evidence.
P(E|H) is the likelihood of the evidence if the hypothesis holds.
P(H) is the prior probability of the hypothesis.
P(E) is the total probability of the evidence.
When new evidence is added, the current posterior becomes the new prior and the theorem
is applied again, so predictions are updated incrementally as information arrives.
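The update can be illustrated numerically; the probabilities below are assumed purely for illustration (a rare condition and an imperfect test):

```python
# Prior P(H): 1% of patients have the condition.
p_h = 0.01
# Likelihood P(E|H): the test detects it 95% of the time.
p_e_h = 0.95
# False-positive rate P(E|not H): 10%.
p_e_not_h = 0.10

# Total probability of a positive test, P(E).
p_e = p_e_h * p_h + p_e_not_h * (1 - p_h)
# Posterior P(H|E) via Bayes' theorem.
posterior = p_e_h * p_h / p_e
print(round(posterior, 4))  # a positive test raises the belief from 1% to about 8.8%
```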
Q6) a) What are the two main classes of statistical learning? Explain with examples.
Write the applications of statistical learning.
Ans.
1. Supervised Learning:
o Definition: Supervised learning involves learning a function that maps input
data (features) to a target output (labels) based on a labeled dataset.
o Example: Classifying emails as spam or not spam based on a labeled training
dataset.
o Common Algorithms: Decision trees, support vector machines, neural
networks.
2. Unsupervised Learning:
o Definition: In unsupervised learning, the algorithm learns patterns or
structures from data without explicit labels. It focuses on finding hidden
relationships or clusters in the data.
o Example: Clustering customers into different segments based on purchasing
behavior.
o Common Algorithms: k-means clustering, hierarchical clustering, principal
component analysis (PCA).
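A toy unsupervised example in plain Python: one-dimensional k-means with k = 2 on made-up customer-spend values (the starting centers are chosen so both clusters stay non-empty).

```python
# Two obvious spending clusters: low spenders near 1, high spenders near 8.
data = [1.0, 1.2, 0.8, 8.0, 8.5, 7.9]

def kmeans_1d(data, c1, c2, steps=10):
    for _ in range(steps):
        # Assignment step: each point joins its nearest center.
        g1 = [x for x in data if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in data if abs(x - c1) > abs(x - c2)]
        # Update step: recompute each center as its cluster mean.
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

print(kmeans_1d(data, 0.0, 10.0))  # centers converge near 1.0 and 8.13
```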
b)Explain the architecture of rule based expert system with neat sketch. Describe the
functions of each block.
Ans. A rule-based expert system is an AI system that applies rules to input data to draw
conclusions or make decisions. The architecture typically consists of the following
components:
1. Knowledge Base
Function: Contains the domain-specific knowledge in the form of rules, facts, and heuristics.
Each rule is often in an "IF-THEN" format (e.g., "IF condition THEN conclusion").
Example: "IF the temperature is above 38°C AND the patient has a cough THEN the diagnosis
is flu."
2. Inference Engine
Function: The core component that applies logical reasoning to the knowledge base to infer
new facts or solutions. It evaluates which rules apply to the given data and fires the
appropriate rules to reach a conclusion.
Example: In a medical diagnosis system, the inference engine matches the patient's
symptoms with rules to suggest possible diseases.
3. Working Memory
Function: Stores the current state of information, including the input data and any
intermediate conclusions drawn during the reasoning process. It keeps track of facts that are
dynamically updated during the reasoning.
Example: Patient's symptoms, medical history, and newly inferred data during diagnosis.
4. User Interface
Function: Facilitates communication between the user and the expert system. Users provide
input (e.g., symptoms), and the system gives output (e.g., diagnosis) based on the reasoning
of the inference engine.
Example: A doctor entering patient details into the system, and the system outputting
possible diagnoses.
5. Explanation Facility
Function: Provides explanations for the conclusions drawn by the system. It helps users
understand why a particular decision or conclusion was reached.
Example: "The system diagnosed flu because the patient had a high fever and cough."
Function: Allows the system to be updated with new knowledge. Domain experts can add,
modify, or delete rules to keep the system up-to-date with the latest information.
Example: Adding new rules related to a recently discovered disease.
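The knowledge base, inference engine, and working memory can be sketched together in a few lines (the rule content is hypothetical, following the flu example above):

```python
# Knowledge base: IF-THEN rules as (set of conditions, conclusion) pairs.
rules = [
    ({'temp_above_38', 'cough'}, 'flu'),
    ({'flu'}, 'prescribe_rest'),
]
# Working memory: facts supplied through the user interface.
working_memory = {'temp_above_38', 'cough'}

# Inference engine: forward chaining, firing rules until no new facts appear.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= working_memory and conclusion not in working_memory:
            working_memory.add(conclusion)
            changed = True

print(sorted(working_memory))  # includes the inferred 'flu' and 'prescribe_rest'
```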