Unit 1
ARTIFICIAL INTELLIGENCE
AND
EXPERT SYSTEMS (CSE209)
• Computer Vision: Enables machines to interpret visual information from the world, such as images or videos.
• Robotics: Integrates AI with robotics, allowing machines to perceive their environment and act within it.
• Predictive Maintenance: AI predicts when vehicles and machinery are likely to fail, improving safety.
• Voice Assistants: AI-driven voice assistants like Siri, Alexa, and Google Assistant interpret and respond to spoken commands.
• Anomaly Detection: AI detects unusual patterns in data that may indicate security threats.
• Pollution Control: AI systems analyze data from sensors to detect and predict pollution levels.
• The Logic Theorist (1955-1956) by Allen Newell and Herbert A. Simon was designed to prove mathematical theorems and is widely regarded as the first AI program.
• John McCarthy coined the term "Artificial Intelligence" in 1956 and organized the Dartmouth Conference, which established AI as a formal field of study.
The Early Enthusiasm and Challenges (1956-1970s)
• Symbolic AI (1950s-1970s): AI research focused on symbolic systems that
manipulated abstract symbols to perform reasoning and problem-solving.
Programs like SHRDLU (a natural language understanding system) and ELIZA (an
early chatbot) were developed.
• Expert Systems: AI began to be applied to specialized domains with the
development of expert systems like DENDRAL (for chemical analysis) and MYCIN
(for medical diagnosis).
• Limitations and Criticisms: By the late 1970s, the limitations of symbolic AI became
apparent, especially in dealing with real-world ambiguity, complexity, and learning.
AI entered a period of skepticism known as the "AI winter."
Here is a famous demonstration of SHRDLU: [2]
Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I DON'T UNDERSTAND WHICH
PYRAMID YOU MEAN.
Person (changing their mind): Find a block which is
taller than the one you are holding and put it into the
box.
Computer: BY "IT", I ASSUME YOU MEAN THE
BLOCK WHICH IS TALLER THAN THE ONE I AM
HOLDING.
Computer: OK.
Person: What does the box contain?
Computer: THE BLUE PYRAMID AND THE BLUE
BLOCK.
Person: What is the pyramid supported by?
Computer: THE BOX.
Person: How many blocks are not in the box?
Computer: FOUR OF THEM.
The Rise of Machine Learning (1980s-1990s)
• Neural Networks: Inspired by the structure of the human brain, neural networks
gained renewed interest in the 1980s. Although early models like Perceptrons had
limitations, the development of backpropagation algorithms led to more powerful
models.
• Machine Learning: AI research shifted focus towards data-driven approaches.
Algorithms for learning from data, such as decision trees, support vector machines, and
clustering algorithms, became central to AI.
• AI Winter and Recovery: The over-promising of AI capabilities in the 1970s led to
funding cuts and an AI winter in the 1980s. However, the field recovered in the 1990s
with advances in machine learning and more realistic expectations.
The Modern Era (2000s-Present)
• Deep Learning: The 2010s saw a revolution in AI with the rise of deep learning, which
involves neural networks with many layers (deep neural networks). This led to
breakthroughs in image recognition, speech recognition, and natural language
processing.
• Big Data and AI: The availability of massive amounts of data and increased
computational power allowed AI models to achieve unprecedented accuracy and
capabilities in various tasks.
• AI in Everyday Life: AI technologies like voice assistants (Siri, Alexa), recommendation
systems (Netflix, Amazon), and autonomous vehicles became part of everyday life.
• Ethical and Social Implications: The rapid advancement of AI raised concerns about privacy, bias, job
displacement, and the ethical use of AI, leading to increased interest in AI ethics and governance.
• Generative AI: Recently, AI models that can generate content, such as GPT (used in ChatGPT) for text
generation, DALL-E for image generation, and others for music, video, and more, have opened new
creative and practical possibilities.
1.Game Trees:
1. Used in game-playing algorithms like chess, where the exact depth of the winning
move is unknown.
2.Pathfinding:
1. Used in pathfinding problems where memory usage is a concern and the goal's
depth is unknown.
Hill Climbing is a search strategy that continuously moves in the direction of increasing
value (uphill) in order to reach a peak, i.e., the best available solution.
•Mechanism: Hill Climbing evaluates the neighboring states of the current state and moves
to a neighbor with a better value.
•Variants:
• Steepest-Ascent Hill Climbing: Evaluates all neighbors and moves to the best one.
• Stochastic Hill Climbing: Randomly selects a neighbor among the better ones.
How Hill Climbing Works:
1.Start with an Initial Solution:
1. Begin with an arbitrary solution to the problem (usually chosen at random).
2.Generate Neighboring Solutions:
1. Generate a set of neighboring solutions by making small changes to the current solution.
3.Evaluate and Compare:
1. Evaluate the neighboring solutions based on a given evaluation or objective function.
2. Compare the values of these neighboring solutions.
4.Move to the Best Neighbor:
1. If one of the neighbors has a better evaluation value (higher for maximization problems or lower for
minimization problems), move to that neighbor and make it the current solution.
2. If no neighboring solution is better than the current solution, terminate the search.
5.Repeat:
1. Continue the process until you reach a point where no neighboring solution is better than the current one
(local optimum) or until a certain number of iterations have been completed.
1. Evaluate the initial state
2. Loop until a solution is found or there are no operators left
1. Select and apply a new operator
2. Evaluate the new state
3. If it is the goal, quit
4. If the new state is better than the current state, make it the new current state
Example:
Imagine you are trying to find the highest point on a mountain (the optimal solution) but
can only see the elevation of the current point and its immediate neighbors.
•Step 1: Start at a random point on the mountain.
•Step 2: Look at the elevation of the neighboring points.
•Step 3: Move to the neighboring point with the highest elevation, if it is higher than your current point.
•Step 4: Repeat until you can’t find a neighboring point with a higher elevation than your
current point.
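The procedure above can be sketched as a minimal steepest-ascent hill climbing loop in Python. The objective function and neighbor generator below are illustrative assumptions, not part of the original example:

```python
def hill_climb(objective, start, neighbors, max_iters=1000):
    """Steepest-ascent hill climbing: repeatedly move to the best
    neighboring solution until no neighbor improves on the current one."""
    current = start
    for _ in range(max_iters):
        # Generate neighboring solutions by small changes to the current one.
        candidates = neighbors(current)
        # Evaluate all neighbors and pick the best.
        best = max(candidates, key=objective)
        if objective(best) <= objective(current):
            return current          # no neighbor is better: local optimum
        current = best              # move to the better neighbor
    return current

# Toy problem: maximize f(x) = -(x - 3)^2, whose single peak is at x = 3.
f = lambda x: -(x - 3) ** 2
step = 0.5
result = hill_climb(f, start=0.0, neighbors=lambda x: [x - step, x + step])
print(result)  # climbs 0.0 -> 0.5 -> ... -> 3.0, the maximum
```

Because the toy objective has a single peak, the climb reaches the global maximum; on a multimodal function the same loop would stop at whichever local optimum it reaches first.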
Types of Hill Climbing:
1.Simple Hill Climbing:
1. Considers only one neighbor at a time and moves to that neighbor if it improves
the solution.
2.Steepest-Ascent Hill Climbing:
1. Considers all neighbors and moves to the one with the best improvement.
3.Stochastic Hill Climbing:
1. Considers a random neighbor and moves to it if it improves the solution, adding
a level of randomness to the process.
Advantages:
1.Simplicity:
1. Easy to implement and understand, requiring only a simple evaluation function and a method to
generate neighbors.
2.Efficiency:
1. Works well for small problems and in situations where a solution is better than no solution.
3.Memory Usage:
1. Requires minimal memory, as it only needs to store the current solution and its evaluation.
4.Local Search:
1. Ideal for local search problems where the goal is to improve an existing solution rather than find
a global optimum.
Disadvantages:
1.Local Optima:
1. Hill climbing can get stuck in local optima, where the algorithm terminates at a solution that is not
the global optimum.
2.No Guarantee of Global Optimum:
1. The algorithm does not guarantee finding the global optimum, especially in complex or large
search spaces.
3.Plateau Problem:
1. If the algorithm reaches a plateau (a flat area in the search space where all neighboring solutions
have the same value), it may struggle to make progress.
4.Ridge Problem:
1. The algorithm may have difficulty climbing ridges, where the optimal solution requires moving in
a direction that initially decreases the evaluation function.
Challenges:
•Local Maxima: The algorithm can get stuck at local maxima where no
neighboring state is better, but the global maximum is far away.
•Plateaus and Ridges: Flat or steep regions in the search space can hinder
progress.
Example: Optimizing a function by iteratively adjusting input values to
maximize output, like adjusting the parameters of a machine learning model.
Simulated Annealing
Simulated Annealing is a probabilistic search strategy that allows the
algorithm to escape local maxima by occasionally accepting worse solutions.
•Mechanism: Inspired by the annealing process in metallurgy, where materials
are slowly cooled to remove defects. The algorithm starts with a high
"temperature," allowing it to explore the search space freely, and gradually
lowers the temperature, reducing the likelihood of accepting worse solutions.
Simulated Annealing (SA) is a probabilistic optimization algorithm inspired by the
annealing process in metallurgy. It is used to find approximate solutions to
optimization problems, particularly those with a large search space. Here's a step-by-
step breakdown:
1. Initialization
•Start with an initial solution S and evaluate its cost (or objective function) E(S).
•Set an initial temperature T (a high value) and a cooling schedule (a function that
determines how T decreases over time).
2. Generate a Neighbor
•Create a new candidate solution S′ by making a small random change to the current
solution S.
3. Acceptance Criterion
•If the new solution S′ is better (i.e., E(S′)<E(S)), accept it as the new
current solution.
•If the new solution is worse, accept it with a probability
P = exp(−(E(S′) − E(S))/T). This probability decreases as the temperature T decreases,
allowing the algorithm to "escape" local minima early on but become more
conservative as it progresses.
4. Cooling Schedule
•Update the temperature T according to the cooling schedule, typically by
reducing T slightly (e.g., T = α × T, where α is a factor less than 1).
5. Stopping Criterion
•The algorithm stops when a certain condition is met, such as:
• The temperature T drops below a predefined threshold.
• A maximum number of iterations is reached.
• The system stabilizes, meaning that no better solutions are found over
several iterations.
6. Output
•The best solution found during the process is considered the approximate solution to the
optimization problem.
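Steps 1-6 can be sketched as a compact Python implementation. The cost function, neighbor move, and cooling parameters below are illustrative assumptions, not prescribed values:

```python
import math
import random

def simulated_annealing(cost, start, neighbor, T=10.0, alpha=0.95, T_min=1e-3):
    """Minimize `cost` starting from `start`. `neighbor` proposes a random
    nearby solution; worse moves are accepted with probability
    exp(-(E(S') - E(S)) / T), which shrinks as T cools."""
    random.seed(0)                     # fixed seed for a reproducible run
    current, best = start, start
    while T > T_min:                   # stopping criterion: temperature threshold
        candidate = neighbor(current)
        delta = cost(candidate) - cost(current)
        # Accept improvements always; accept worse moves probabilistically.
        if delta < 0 or random.random() < math.exp(-delta / T):
            current = candidate
        if cost(current) < cost(best):
            best = current             # remember the best solution seen so far
        T *= alpha                     # cooling schedule: T = alpha * T
    return best

# Toy problem: minimize a bumpy function with several local minima.
g = lambda x: x * x + 10 * math.cos(x)
sol = simulated_annealing(g, start=6.0, neighbor=lambda x: x + random.uniform(-1, 1))
```

Note that the function returns the best solution encountered during the run (step 6), not necessarily the final `current` state, since the last accepted move may have been a worse one.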
Summary of Key Concepts
•Initial Solution: Starting point of the algorithm.
•Temperature: Controls the likelihood of accepting worse solutions; high temperature
allows more exploration, while low temperature focuses on exploitation.
•Cooling Schedule: Dictates how the temperature decreases over time.
•Acceptance Probability: Allows the algorithm to escape local optima by accepting worse
solutions with a certain probability.
•Stopping Criterion: Determines when the algorithm should terminate.
•Characteristics:
• Effective at avoiding local maxima, especially in complex search spaces.
• Convergence: The algorithm converges to a solution as the temperature
decreases, ideally reaching the global maximum.
Example: Used in combinatorial optimization problems like the Traveling
Salesman Problem, where the goal is to find the shortest route visiting a set
of cities.
How Simulated Annealing (SA) and
Hill Climbing (HC) are different from each other
•Local optima: HC only ever moves to better neighbors, so it terminates at the first local optimum it reaches; SA occasionally accepts worse solutions, which lets it escape local optima.
•Randomness: HC (apart from its stochastic variant) is greedy and deterministic; SA is probabilistic, controlled by a temperature that decreases over the run.
•Result quality: HC is faster on simple landscapes; SA is more likely to approach the global optimum in complex search spaces.
5. Best-First Search
Best-First Search is a search strategy that explores paths in the
search space by selecting the path that appears to be the best based
on a heuristic.
•Mechanism: The algorithm uses a priority queue to keep track of
nodes, prioritizing nodes with the lowest heuristic value h(n). It
expands the most promising node first.
•Variants: The most common variant is Greedy Best-First Search,
which only considers the heuristic value when selecting nodes to
expand.
How Best-First Search Works:
1.Initialization:
•Start by placing the initial node in a priority queue.
•The priority queue is ordered based on an evaluation function, often denoted
as f(n).
2.Expand Node:
•Dequeue the node with the lowest evaluation function value.
•If this node is the goal, the search ends successfully.
3.Generate Successors:
•Generate all possible successors (or neighbors) of the current node.
•Calculate their evaluation function values and insert them into the priority
queue.
4.Repeat:
•Continue expanding the node with the lowest evaluation function value until
the goal node is found or the priority queue is empty (indicating failure).
Evaluation Function:
The evaluation function f(n) is typically based on heuristics, which
estimate the cost to reach the goal from node n. It can be represented
as:
•f(n) = h(n), where h(n) is the heuristic function estimating the cost
from n to the goal.
Let Open be a priority queue containing the initial node:
Loop:
Remove the node with the lowest f value from Open
If the node is the Goal, stop and return success
Else generate all successors of the node and insert each generated node into Open according to its f (heuristic) value
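The loop above corresponds to Greedy Best-First Search with f(n) = h(n). A minimal Python sketch follows, using a hypothetical one-dimensional state space (nodes are integers, successors are n − 1 and n + 1) purely for illustration:

```python
import heapq

def greedy_best_first(start, goal, successors, h):
    """Greedy Best-First Search: always expand the node whose heuristic
    value h(n) is smallest, using a priority queue (min-heap) as Open."""
    open_list = [(h(start), start)]          # priority queue ordered by h(n)
    came_from = {start: None}                # parent pointers for the path
    while open_list:
        _, node = heapq.heappop(open_list)   # node with the lowest heuristic
        if node == goal:                     # goal test on expansion
            path = []
            while node is not None:          # reconstruct path via parents
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for succ in successors(node):
            if succ not in came_from:        # skip already-generated nodes
                came_from[succ] = node
                heapq.heappush(open_list, (h(succ), succ))
    return None                              # Open is empty: failure

# Heuristic: distance to the goal node 5 on the number line.
path = greedy_best_first(0, 5, lambda n: [n - 1, n + 1], lambda n: abs(5 - n))
print(path)  # [0, 1, 2, 3, 4, 5]
```

The heuristic alone steers expansion here; since it never accounts for path cost already incurred, the result is fast but not guaranteed optimal in general.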
Advantages:
1.Efficient Search:
• By focusing on the most promising paths, Best-First Search can
quickly find a solution, especially in large search spaces.
2.Flexibility:
• The choice of heuristic function allows the algorithm to be adapted for
different types of problems.
3.Informed Search:
• Uses domain-specific knowledge to make informed decisions on
which path to explore next.
Disadvantages:
1.No Guarantee of Optimality:
1. Best-First Search does not guarantee finding the optimal solution unless the
evaluation function is designed to do so.
2.May Get Stuck in Local Optima:
1. The algorithm may get stuck exploring a suboptimal path if the heuristic function
misleads it.
3.Memory Usage:
1. Like other graph search algorithms, it can consume a significant amount of
memory if the search space is large.
Applications:
•Pathfinding:
• Used in games, robotics, and other domains where finding the shortest path is
crucial.
•Artificial Intelligence:
• Applied in AI for problem-solving, such as in the development of intelligent agents
and decision-making systems.
•Optimization Problems:
• Suitable for solving optimization problems where the goal is to find the best solution
among many possible ones.
Characteristics:
•Efficient in finding solutions, especially when a good heuristic is available.
•Not guaranteed to be optimal, as it can be misled by local minima.
Example: Finding the shortest path in a map where the heuristic could be the
straight-line distance to the destination.
6. A* Algorithm
The A* Algorithm is an informed search strategy that combines the strengths
of both Dijkstra's algorithm and Best-First Search by considering both the cost
to reach a node and the estimated cost to reach the goal.
•Mechanism: A* uses the function f(n) = g(n) + h(n), where:
•g(n) is the cost to reach the current node n from the start.
•h(n) is the heuristic estimate from n to the goal.
How A* Works:
The A* algorithm uses two main components in its evaluation function: g(n), the cost from the start, and h(n), the heuristic estimate to the goal.
1.Initialization:
•Create an open list containing the start node, with f(start) = g(start) + h(start).
•Create an empty closed list to track the nodes that have already been evaluated.
2.Process the Open List:
•While the open list is not empty, extract the node with the lowest f(n) value.
•If this node is the goal, reconstruct the path and return it as the solution.
3.Generate Successors:
•For the current node, generate all possible successor nodes (neighbors).
•For each successor:
•Calculate the tentative g value (g(successor) = g(current) + cost(current, successor)).
•Calculate the heuristic value h(successor).
•Calculate f(successor) = g(successor) + h(successor).
4.Update Open List:
• If the successor node is not in the open list, add it with its f value.
• If the successor is already in the open list but with a higher f value,
update its f value to the lower value and set its parent to the current node.
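The steps above can be sketched as a compact A* implementation. The small weighted graph and heuristic table below are illustrative assumptions for demonstration:

```python
import heapq

def a_star(start, goal, successors, h):
    """A* search: expand the node minimizing f(n) = g(n) + h(n), where g is
    the cost so far and h the heuristic estimate of the cost to the goal."""
    open_list = [(h(start), 0, start)]       # entries are (f, g, node)
    g_cost = {start: 0}
    came_from = {start: None}
    while open_list:
        f, g, node = heapq.heappop(open_list)
        if g > g_cost.get(node, float("inf")):
            continue                         # stale queue entry; skip it
        if node == goal:
            path = []
            while node is not None:          # walk parent pointers to start
                path.append(node)
                node = came_from[node]
            return path[::-1], g
        for succ, step_cost in successors(node):
            tentative_g = g + step_cost      # g(successor) = g(current) + cost
            if tentative_g < g_cost.get(succ, float("inf")):
                g_cost[succ] = tentative_g   # found a cheaper route to succ
                came_from[succ] = node
                heapq.heappush(open_list, (tentative_g + h(succ), tentative_g, succ))
    return None, float("inf")                # open list exhausted: no path

# Hypothetical weighted graph; h is a simple admissible estimate per node.
graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)],
         "C": [("D", 1)], "D": []}
h = {"A": 3, "B": 2, "C": 1, "D": 0}
path, cost = a_star("A", "D", lambda n: graph[n], lambda n: h[n])
print(path, cost)  # ['A', 'B', 'C', 'D'] 3
```

Note how the direct route A→C (cost 4) is bypassed: A* discovers the cheaper route through B because it compares tentative g values and keeps the lower one, exactly as described in step 4.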
Advantages:
1.Optimality:
•A* is guaranteed to find an optimal solution when the heuristic h(n) is
admissible, i.e., it never overestimates the true cost to the goal.
2.Efficiency:
•A* is more efficient than uninformed search algorithms like Breadth-
First Search because it uses heuristics to prioritize paths that appear
to lead directly to the goal.
3.Flexibility:
•A* can be adapted for different types of problems by changing
the heuristic function.
Disadvantages:
1.Memory Usage:
1. A* can consume a significant amount of memory, especially for large search
spaces, because it stores all generated nodes in the open and closed lists.
2.Heuristic Dependence:
1. The efficiency of A* is highly dependent on the quality of the heuristic function. A
poorly chosen heuristic can lead to inefficient search.
3.Computational Cost:
1. While A* is generally efficient, the computational cost can be high, particularly
when the heuristic is complex or the search space is large.
Applications:
•Pathfinding:
• Used in video games, robotics, and navigation systems to find the shortest
path.
•Artificial Intelligence:
• Applied in AI to solve complex problems such as puzzle solving (e.g., the 8-
puzzle problem) and game playing.
•Robotics:
• Used in autonomous robots to plan paths and avoid obstacles.
7. Constraint Satisfaction
Constraint Satisfaction Problems (CSPs) involve finding a solution that satisfies a
set of constraints or conditions.
•Components:
• Variables: The entities that need to be assigned values.
• Domains: The possible values that each variable can take.
• Constraints: The rules that restrict the values the variables can take.
Approaches to Solving CSPs:
1.Backtracking:
1. A depth-first search approach where variables are assigned values one at a time. If a
variable assignment violates a constraint, the algorithm backtracks and tries a different
value.
2.Forward Checking:
1. While assigning a value to a variable, the algorithm checks ahead to see if the remaining
variables can still be assigned values that satisfy the constraints. If not, it backtracks
immediately.
3.Constraint Propagation:
1. This technique involves simplifying the problem by iteratively applying constraints to
reduce the domains of the variables. A common method is Arc Consistency, which
removes from a variable's domain any value that is inconsistent with the binary
constraints linking it to neighboring variables.
•Heuristics:
•Minimum Remaining Values (MRV): Select the variable with the fewest legal
values remaining in its domain.
•Degree Heuristic: Select the variable involved in the largest number of constraints
on other unassigned variables.
•Least Constraining Value: Assign the value that imposes the fewest constraints on
the remaining variables.
•Local Search:
•A technique where an initial solution is generated, and then the algorithm iteratively
makes local changes to reduce the number of violated constraints. Hill Climbing and
Simulated Annealing are common local search methods used in CSPs.
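The backtracking approach (1) can be sketched in Python on a tiny, hypothetical map-coloring CSP: three mutually adjacent regions, each to be colored so that neighbors differ:

```python
def backtrack(assignment, variables, domains, consistent):
    """Backtracking search for a CSP: assign variables one at a time,
    and undo (backtrack) whenever an assignment violates a constraint."""
    if len(assignment) == len(variables):
        return assignment                    # every variable is assigned
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        assignment[var] = value
        if consistent(assignment):           # do the constraints still hold?
            result = backtrack(assignment, variables, domains, consistent)
            if result is not None:
                return result
        del assignment[var]                  # violation or dead end: backtrack
    return None                              # no value works for this variable

# Toy CSP: color a triangle of regions X, Y, Z with 3 colors so that
# adjacent regions differ (constraints are the three edges).
variables = ["X", "Y", "Z"]
domains = {v: ["red", "green", "blue"] for v in variables}
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z")]
ok = lambda a: all(a[u] != a[v] for u, v in edges if u in a and v in a)
solution = backtrack({}, variables, domains, ok)
print(solution)  # {'X': 'red', 'Y': 'green', 'Z': 'blue'}
```

The trace illustrates the key idea: when Z cannot take any color consistent with X and Y, the search would undo earlier choices; here a consistent assignment is found after rejecting Z = red and Z = green.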
Advantages of CSPs:
1.General Framework:
1.CSPs provide a general framework that can be applied to a wide variety of
problems across different domains.
2.Efficiency:
1.Techniques like backtracking and constraint propagation can solve
problems efficiently, especially with the use of heuristics.
3.Structured Search:
1.CSPs allow for structured problem-solving where constraints guide the
search process, reducing the search space.
Disadvantages of CSPs:
1.Scalability:
1. For very large or complex problems, CSPs can become computationally expensive,
particularly if there are many variables and constraints.
2.Complex Constraints:
1. Handling complex constraints, such as non-binary constraints or constraints
involving multiple variables, can be challenging and may require sophisticated
techniques.
3.Local Optima in Local Search:
1. Local search techniques may get stuck in local optima, which may not be the best
overall solution.
Applications of CSPs:
1.Scheduling:
1. Assigning tasks to time slots or resources while satisfying constraints like deadlines, resource
availability, and task dependencies.
2.Puzzle Solving:
1. Problems like Sudoku, crossword puzzles, and the 8-queens problem are classic examples of CSPs.
3.Resource Allocation:
1. Allocating resources such as bandwidth, manpower, or materials while adhering to constraints like
budget limits or availability.
4.Configuration Problems:
1. Configuring products or systems that must meet specific requirements and constraints, such as
setting up a computer system with compatible components.