
Heuristic Search algorithms &

Game Playing
By:
Vijeta Rani
Delhi Technological University

©Delhi Technological University 2024


Types of search techniques
• Uninformed vs informed search

• Informed search in AI is a type of search algorithm that uses additional information to guide the search process, allowing for more efficient problem-solving compared to uninformed search algorithms.

• This information is obtained from a heuristic function that estimates how close a state is to the goal state.
Types of search techniques
Uninformed search examples:
1. DFS (depth-first search)
2. BFS (breadth-first search)
3. Brute force search, etc.
Informed search examples:
1. Generate and Test
2. Branch and Bound
3. Hill climbing
4. Best first search
5. A* search
6. AO* search
7. Constraint satisfaction
8. Minmax search
9. Alpha-beta pruning, etc.
Brute Force Search
• A brute force algorithm is a simple, comprehensive search
strategy that systematically explores every option until a
problem’s answer is discovered.

• Many problems are solved in day-to-day life using the brute force strategy, for example, exploring all the paths to a nearby market to find the shortest one.
Pros of Brute force search
• This algorithm is a guaranteed way to find the correct solution
• It is a generic method and not limited to any specific domain of
problems.
• This method is ideal for solving small and simple problems.
• It is known for its simplicity and can serve as a comparison
benchmark.
Cons of Brute force search
• The brute force approach is inefficient: for realistic problem sizes, the order of growth can reach O(N!) or worse.
• This method relies more on compromising the power of a
computer system for solving a problem than on a good
algorithm design.
• Brute force algorithms are slow.
• Brute force algorithms are not constructive or creative compared to algorithms built with other design paradigms.
Generate and Test search
• Generate and Test Search combines depth-first search with
backtracking.
• It systematically generates all possible solutions and tests
them.
• If a solution is found, it stops; otherwise, it continues
generating and testing.
• It's like searching for an exhibit randomly in the British
Museum.
• The heuristic function ranks alternatives to guide the search.
Algorithm for Generate and Test
1 Generate a possible solution (e.g., a point or a path in the
problem space).
2 Test if it's an actual solution by comparing it to acceptable
goal states.
3 If a solution is found, stop; otherwise, repeat the process.
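A minimal Python sketch of this loop (the generator/test split, the city data, and all names here are illustrative assumptions, not from the slides):

```python
import itertools

def generate_and_test(generator, is_goal):
    """Generate candidate solutions one at a time and test each;
    stop at the first candidate that passes the test."""
    for candidate in generator:     # step 1: generate a possible solution
        if is_goal(candidate):      # step 2: test it against the goal criterion
            return candidate        # step 3: a solution is found, stop
    return None                     # generator exhausted: no solution

# Illustrative use: find an ordering of cities whose path length is <= 10.
cities = {"A": (0, 0), "B": (3, 4), "C": (6, 0)}

def path_length(order):
    pts = [cities[c] for c in order]
    return sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

print(generate_and_test(itertools.permutations(cities),
                        lambda order: path_length(order) <= 10))
```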
Properties of Good Generators
• Complete: They generate all possible solutions.

• Non-Redundant: Avoid duplicate solutions.

• Informed: Use knowledge about the search space.


More points on Generate and Test
• The most straightforward way to implement systematic
generate and test is as a depth first search tree with
backtracking.

• This approach can be effective but may not work well for
complex problems. Combining it with other techniques can
improve efficiency.
Backtracking
• Backtracking is a problem-solving algorithmic technique that involves finding a solution incrementally by trying different options and undoing them if they lead to a dead end.

• A backtracking algorithm works by recursively exploring all possible solutions to a problem.
Backtracking Algorithm steps
1. Choose an initial solution.
2. Explore all possible extensions of the current solution.
3. If an extension leads to a solution, return that solution.
4. If an extension does not lead to a solution, backtrack to the
previous solution and try a different extension.
5. Repeat steps 2 to 4 until all possible solutions have been
explored.
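These steps translate almost directly into code. A minimal sketch, using the classic N-Queens problem as the example (the problem choice and all names are assumptions for illustration):

```python
def solve_n_queens(n):
    """Place queens column by column; undo a placement (backtrack)
    whenever it leads to a dead end."""
    placement = []                     # placement[i] = row of the queen in column i

    def safe(row):
        col = len(placement)
        return all(r != row and abs(r - row) != abs(c - col)
                   for c, r in enumerate(placement))

    def extend():
        if len(placement) == n:        # every column filled: a solution (step 3)
            return list(placement)
        for row in range(n):           # explore all extensions (step 2)
            if safe(row):
                placement.append(row)  # choose an extension
                result = extend()
                if result:
                    return result
                placement.pop()        # dead end: backtrack and retry (step 4)
        return None                    # no extension works at this level

    return extend()

print(solve_n_queens(8))   # first solution found: [0, 4, 7, 5, 2, 6, 1, 3]
```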
Branch and Bound Method
• It works by dividing the problem into smaller subproblems, or
branches, and then eliminating certain branches based on
bounds on the optimal solution.
• This process continues until the best solution is found or all
branches have been explored.
• Branch and Bound is commonly used in problems like the
traveling salesman and job scheduling.
Characteristics of Branch and Bound
• Optimal solution: The algorithm is designed to find the optimal
solution to an optimization problem by searching the solution
space in a systematic way.

• Upper and lower bounds: The algorithm uses upper and lower
bounds to reduce the size of the search space and eliminate
subproblems that cannot contain the optimal solution.
Characteristics of Branch and Bound
• Pruning: The algorithm prunes the search tree by eliminating
subproblems that cannot contain the optimal solution or are
not worth exploring further.

• Backtracking: The algorithm uses backtracking to return to a previous node in the search tree when a dead end is reached or when a better solution is found.
Applications of Branch and Bound
• Traveling salesman problem
• Knapsack problem
• Resource allocation
• Network optimization
• Game-playing algorithms like chess, tic-tac-toe, or the 15-puzzle.
Branch and Bound Method for TSP
• Begin generating complete paths, keeping track of the shortest path found so far. Give up exploring any partial path as soon as its length becomes greater than the shortest complete path found so far.
Branch and Bound Example
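The worked example on the original slide is a figure and is not reproduced here. As a stand-in, here is a minimal Python sketch of the scheme just described for TSP, where the length of the shortest complete tour found so far acts as the bound (the distance matrix is made-up sample data):

```python
import math

def tsp_branch_and_bound(dist):
    """Extend partial tours depth-first; abandon a partial tour as soon
    as its length already exceeds the best complete tour found so far."""
    n = len(dist)
    best_len, best_tour = math.inf, None

    def extend(tour, length):
        nonlocal best_len, best_tour
        if length >= best_len:                    # bound: prune this branch
            return
        if len(tour) == n:                        # complete tour: close the loop
            total = length + dist[tour[-1]][tour[0]]
            if total < best_len:
                best_len, best_tour = total, tour[:]
            return
        for city in range(n):                     # branch on the next city
            if city not in tour:
                tour.append(city)
                extend(tour, length + dist[tour[-2]][city])
                tour.pop()                        # backtrack

    extend([0], 0)                                # fix city 0 as the start
    return best_tour, best_len

d = [[0, 10, 15, 20],
     [10, 0, 35, 25],
     [15, 35, 0, 30],
     [20, 25, 30, 0]]
print(tsp_branch_and_bound(d))    # ([0, 1, 3, 2], 80)
```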
Advantages of Branch and Bound
• Optimal solution: This algorithm finds the best answer to an optimization problem by methodically searching the solution space.
• Reduces search space: The algorithm uses lower and upper bounds to cut down the size of the search space and discard sub-problems that cannot contain the best answer.
• Proven performance: This algorithm is successful in locating the best solutions to challenging optimization problems.
• Incremental improvement: The algorithm starts with an initial bound and iteratively improves it until an optimized solution is found.
Disadvantages of Branch and Bound
• Exponential time complexity: Its worst-case time complexity is
exponential in the size of the input, making it unsuitable for handling
complex optimization issues.
• Memory-intensive: To store the search tree and the current best
answer, the method needs a lot of memory.
• Sensitivity to problem-specific bounds: How well the method performs depends on the quality of the problem-specific bounds used, and it can be challenging to discover good bounds.
• Limited scalability: This technique may not scale effectively for
problems with huge search spaces.
Greedy Algorithms
Greedy algorithms are a class of algorithms that make locally
optimal choices at each step with the hope of finding a global
optimum solution.
In these algorithms, decisions are made based on the
information available at the current moment without
considering the consequences of these decisions in the future.
The key idea is to select the best possible choice at each step,
leading to a solution that may not always be the most optimal
but is often good enough for many problems.
Steps for Creating a Greedy Algorithm
The steps to define a greedy algorithm are:
1. Define the problem: Clearly state the problem to be solved
and the objective to be optimized.
2. Identify the greedy choice: Determine the locally optimal
choice at each step based on the current state.
3. Make the greedy choice: Select the greedy choice and update
the current state.
4. Repeat: Continue making greedy choices until a solution is
reached.
Greedy Algorithm Examples
• Fractional Knapsack: Optimizes the value of items that can be
fractionally included in a knapsack with limited capacity.
• Dijkstra’s algorithm: Finds the shortest path from a source
vertex to all other vertices in a weighted graph.
• Kruskal’s algorithm: Finds the minimum spanning tree of a
weighted graph.
• Huffman coding: Compresses data by assigning shorter codes
to more frequent symbols.
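As a concrete instance of the greedy pattern, here is a small sketch of the fractional knapsack (the item data and names are illustrative):

```python
def fractional_knapsack(items, capacity):
    """Greedy choice: repeatedly take as much as possible of the item
    with the best value-to-weight ratio."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1],
                                reverse=True):        # best ratio first
        if capacity <= 0:
            break
        take = min(weight, capacity)                  # whole item or a fraction
        total += value * take / weight
        capacity -= take
    return total

# (value, weight) pairs with capacity 50: the greedy answer, 240.0,
# is provably optimal for the fractional variant.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))
```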
Limitations of Using a Greedy Algorithm
• Greedy algorithms may not always find the best possible
solution.
• The order in which the elements are considered can
significantly impact the outcome.
• Greedy algorithms focus on local optimizations and may miss
better solutions that require considering a broader context.
• Greedy algorithms are not applicable to problems where the
greedy choice does not lead to an optimal solution.
Dynamic Programming
• Dynamic Programming (DP) is a method used in mathematics
and computer science to solve complex problems by breaking
them down into simpler subproblems.
• By solving each subproblem only once and storing the results,
it avoids redundant computations, leading to more efficient
solutions for a wide range of problems.
• It is particularly effective for problems that exhibit overlapping subproblems and optimal substructure.
How Does Dynamic Programming (DP) Work?
• Identify Subproblems: Divide the main problem into smaller subproblems.
• Store Solutions: Solve each subproblem and store the solution
in a table or array.
• Build Up Solutions: Use the stored solutions to build up the
solution to the main problem.
• Avoid Redundancy: By storing solutions, DP ensures that each
subproblem is solved only once, reducing computation time.
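A minimal sketch of these four ideas, using the Fibonacci sequence as the subproblem structure (memoization is one of several ways to implement DP):

```python
from functools import lru_cache

@lru_cache(maxsize=None)       # store each subproblem's solution once computed
def fib(n):
    """Each fib(k) is solved exactly once and reused thereafter, so the
    naive exponential recursion becomes linear in n."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)   # build up from stored subproblem solutions

print(fib(50))   # 12586269025, computed with only ~50 distinct subproblems
```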
Applications of Dynamic Programming
• Longest Common Subsequence (LCS): Finds the longest
common subsequence between two strings.
• Shortest Path in a Graph: Finds the shortest path between two
nodes in a graph.
• Knapsack Problem: Determines the maximum value of items
that can be placed in a knapsack with a given capacity.
• Matrix Chain Multiplication: Optimizes the order of matrix
multiplication to minimize the number of operations.
• Fibonacci Sequence: Calculates the nth Fibonacci number.
Advantages of Dynamic Programming (DP)
• Avoids recomputing the same subproblems multiple times,
leading to significant time savings.
• Ensures that the optimal solution is found by considering all
possible combinations.
• Breaks down complex problems into smaller, more manageable
subproblems.
Dynamic Programming vs Branch and Bound
Dynamic programming:
• Constructs the solution in the form of a table.
• Solves all possible instances of a problem of size n.
• Does not require a bounding function.
• After constructing the table, it must be traced back to find the solution sequence.
Branch and Bound:
• Constructs the solution in the form of a tree.
• Only solves promising instances from the set of instances at any given point.
• Needs to compute and apply a bounding function at each node.
• The solution sequence is implicit; a leaf node of the tree is the final solution.
Dynamic Programming vs Greedy approach
Hill Climbing
• Hill climbing is a variant of generate and test.
• In a pure generate and test procedure, the test function answers only yes or no.
• But in Hill climbing procedure, an evaluation function works as
a heuristic function, which provides an estimate of how close a
given state is to a goal state.
• The difference between depth-first search and hill climbing is that in the latter, the children of a node are sorted by the remaining distance from the goal.
Hill Climbing
• It belongs to the family of local search algorithms and is often used in
optimization problems where the goal is to find the best solution from a
set of possible solutions.
• The algorithm starts with an initial solution and then iteratively makes
small changes to it in order to improve the solution. These changes are
based on a heuristic function that evaluates the quality of the solution.
• The algorithm continues to make these small changes until it reaches a
local maximum, meaning that no further improvement can be made
with the current set of moves.
• Hill climbing belongs to the class of greedy algorithms.
Hill climbing on a surface of states
(In the surface-of-states picture, height is defined by the evaluation function.)
Variations of hill climbing algorithm
1. Simple Hill climbing: It examines the neighboring nodes one
by one and selects the first neighboring node which optimizes
the current cost as the next node.
2. Steepest Ascent Hill Climbing: The algorithm evaluates all the
possible moves from the current solution and selects the one
that leads to the best improvement.
3. Simulated annealing: It is a probabilistic variation of Hill
Climbing that allows the algorithm to occasionally accept
worse moves in order to avoid getting stuck in local maxima.
Simple Hill climbing algorithm
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise,
continue with the initial state as the current state.
2. Loop until a solution is found or until there are no new operators left to be applied to
the current state.
a) Select an operator that has not yet been applied to the current state and apply it to
produce a new state.
b) Evaluate the new state.
i) If it is the goal state, then return it and quit.
ii) If it is not a goal state, but it is better than the current state, then make it the
current state.
iii) If it is not better than the current state, then continue in the loop.
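A compact Python rendering of this algorithm (a sketch; `neighbors`, `value`, and `is_goal` are illustrative names, and higher `value` is assumed to be better):

```python
def simple_hill_climbing(initial, neighbors, value, is_goal):
    """Take the FIRST neighbor that improves on the current state;
    stop at a goal, or when no operator yields an improvement."""
    current = initial
    while not is_goal(current):                    # step 1 / step 2 loop
        for candidate in neighbors(current):       # step 2a: apply an operator
            if is_goal(candidate):                 # step 2b(i)
                return candidate
            if value(candidate) > value(current):  # step 2b(ii): first improvement
                current = candidate
                break
        else:
            return current       # step 2b(iii) for every operator: stuck
    return current

# Illustrative use: climb toward the peak of f(x) = -(x - 7)^2 over integers.
f = lambda x: -(x - 7) ** 2
print(simple_hill_climbing(0, lambda x: [x - 1, x + 1], f,
                           lambda x: f(x) == 0))   # 7
```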
Algorithm for Steepest Ascent Hill climbing
1. Evaluate the initial state. If it is also a goal state, then return it and quit.
Otherwise, continue with the initial state as the current state.

2. Loop until a solution is found or until a complete iteration produces no change to the current state:
a) Let SUCC be a state such that any possible successor of the current state
will be better than SUCC.
b) For each operator that applies to the current state do:
i) Apply the operator and generate a new state.
ii) Evaluate the new state. If it is a goal state, then return it and quit. If not,
then compare it to SUCC.
Algorithm for Steepest Ascent Hill climbing
contd...
iii) If it is better then set SUCC to this state.
iv) If it is not better, then leave SUCC alone.
c) If the SUCC is better than the current state, then set current state to SUCC.
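The same skeleton, adapted so that every successor is evaluated before moving (a sketch under the same assumptions as the simple hill climbing code above):

```python
def steepest_ascent(initial, neighbors, value, is_goal):
    """Evaluate ALL successors each iteration and move to the best one
    (SUCC); stop when an iteration produces no change."""
    current = initial
    while True:
        if is_goal(current):
            return current
        succ = None                              # SUCC in the algorithm above
        for candidate in neighbors(current):     # step 2b: every operator
            if is_goal(candidate):
                return candidate
            if succ is None or value(candidate) > value(succ):
                succ = candidate                 # step 2b(iii): best successor
        if succ is None or value(succ) <= value(current):
            return current                       # no improvement: local maximum
        current = succ                           # step 2c: move to SUCC
```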
Hill climbing example
Start state (h = -4):      Goal state (h = 0):
2 8 3                      1 2 3
1 6 4                      8   4
7   5                      7 6 5

The slide's figure expands the search tree from the start state: at each step the children of the current state are scored, and the best move is taken, improving h (through values such as -3 and -2) until it reaches 0 at the goal.
f(n) = -(number of tiles out of place)
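The evaluation function used in this example is easy to state in code (representing boards as flattened tuples, with 0 for the blank, is an assumption of this sketch):

```python
def f(state, goal):
    """f(n) = -(number of tiles out of place); the blank (0) is not counted."""
    return -sum(1 for s, g in zip(state, goal) if s != g and s != 0)

start = (2, 8, 3, 1, 6, 4, 7, 0, 5)   # the start board above, row by row
goal  = (1, 2, 3, 8, 0, 4, 7, 6, 5)   # the goal board above
print(f(start, goal))                 # -4, matching the start state's score
```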
Problems with hill climbing
• Local Maxima: peaks that aren't the highest point in the space; the search halts at such a local maximum.
• Plateaus: the space has a broad flat region that gives the search algorithm no direction (random walk).
• Ridges: flat like a plateau, but with drop-offs to the sides; steps to the North, East, South, and West may go down, but a step to the NW may go up.
Solutions to the hill climbing problems
To overcome these problems, use one or a combination of the following methods:

• Backtrack to some earlier state and try going in a different direction.
• Make a big jump in some direction to get to a new section of the search space; this is the way to escape a plateau.
• Try several paths at the same time, applying two or more rules before testing; this is the way to circumvent ridges.
Advantages of Hill Climbing algorithm
1. It is a simple and intuitive algorithm that is easy to understand
and implement.
2. It can be used in a wide variety of optimization problems,
including those with a large search space and complex constraints.
3. It is often very efficient in finding local optima, making it a good
choice for problems where a good solution is needed quickly.
4. The algorithm can be easily modified and extended to include
additional heuristics or constraints.
Disadvantages of Hill Climbing algorithm
1. Hill Climbing can get stuck in local optima, meaning that it
may not find the global optimum of the problem.
2. The algorithm is sensitive to the choice of initial solution, and
a poor initial solution may result in a poor final solution.
3. Hill Climbing does not explore the search space very
thoroughly, which can limit its ability to find better solutions.
4. It may be less effective than other optimization algorithms,
such as genetic algorithms or simulated annealing, for certain
types of problems.
Simulated annealing
Simulated annealing is a variation of hill climbing in which, at the beginning of the process, some downhill moves may be made. The idea is to do enough exploration of the whole space early on so that the final solution is relatively insensitive to the starting state. This should lower the chances of getting caught at a local maximum, a plateau, or a ridge.
Simulated annealing
The algorithm for simulated annealing is only slightly different
from the simple hill climbing procedure. The three differences
are:
1. The annealing schedule must be maintained.
2. Moves to worse states may be accepted.
3. It is a good idea to maintain, in addition to the current state,
the best state found so far. Then, if the final state is worse than
the earlier state (because of bad luck in accepting moves to
worse states), then the earlier state is still available.
Simulated annealing algorithm
1. Evaluate the initial state. If it is also a goal state, then return it and quit. Otherwise,
continue with the initial state as the current state.

2. Initialize BEST-SO-FAR to the current state.

3. Initialize T according to the annealing schedule.

4. Loop until a solution is found or until there are no new operators left to be applied to the current state.
a) Select an operator that has not yet been applied to the current state and apply it to
produce a new state.
Simulated annealing algorithm contd...
b) Evaluate the new state. Compute
ΔE = (value of current) - (value of new state)
i) If the new state is the goal state, then return it and quit.
ii) If it is not a goal state but is better than the current state, then make it the
current state. Also set BEST-SO-FAR to this new state.
iii) If it is not better than the current state, then make it the current state with probability p′ = e^(−ΔE/T). This step is usually implemented by invoking a random number generator to produce a number in the range [0, 1). If that number is less than p′, then the move is accepted. Otherwise do nothing.
c) Revise T as necessary according to the annealing schedule.
5. Return BEST-SO-FAR, as the answer.
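A sketch of this procedure in Python (maximizing; the annealing schedule, iteration cap, and test function are illustrative assumptions):

```python
import math
import random

def simulated_annealing(initial, neighbors, value, schedule):
    """schedule(t) returns the temperature T at iteration t; annealing
    stops when T is effectively zero."""
    current = best_so_far = initial               # step 2: track the best state
    for t in range(1, 10_000):
        T = schedule(t)                           # steps 3 and 4c
        if T <= 1e-9:
            break
        new = random.choice(neighbors(current))   # step 4a: apply an operator
        delta_e = value(current) - value(new)     # ΔE = current - new
        if delta_e <= 0:                          # new state at least as good
            current = new
            if value(current) > value(best_so_far):
                best_so_far = current
        elif random.random() < math.exp(-delta_e / T):
            current = new             # worse move, accepted with p' = e^(-ΔE/T)
    return best_so_far                            # step 5

# Illustrative use: maximize f(x) = -(x - 7)^2 with exponential cooling.
f = lambda x: -(x - 7) ** 2
print(simulated_annealing(0, lambda x: [x - 1, x + 1], f,
                          schedule=lambda t: 100 * 0.95 ** t))   # usually 7
```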
Best First Search Algorithm
• The idea of Best First Search is to use an evaluation function to decide which adjacent node is most promising and then explore it.
• This algorithm may jump around in the search space, always expanding the node with the minimum evaluation function value.
• In hill climbing, only the children are compared for minimum
evaluation function, but in best first search, even the
previously untraversed nodes are also compared.
• It is a branch and bound algorithm.
Best First Search Algorithm
1. Create two lists: OPEN list and CLOSED list

2. Put the starting node (initial state) on the OPEN list.

3. While the OPEN list is not empty:
a) Find the node with the least evaluation function value on the OPEN list, call it "BESTNODE"
b) Pop BESTNODE off the OPEN list. Push BESTNODE on the CLOSED list.
c) See if BESTNODE is the goal node. If so, return Solution.
d) Generate successors of the BESTNODE.
Best First Search Algorithm contd...
e) For each SUCCESSOR, do
(i) If the SUCCESSOR was not generated before, evaluate it, add it to the OPEN list, and record its parent.
(ii) If it has been generated before, change the parent if this new path is better than the previous one. In that case, update the cost of getting to this node and to any successors that this node may already have.
(iii) End of For Loop
f) End of While Loop

4. If OPEN is empty, return Failure.
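A Python sketch of this OPEN/CLOSED scheme, using a heap as the priority queue (step e(ii), re-parenting when a better path is found, is omitted here for brevity; the h value of S below is an assumption, since the trace that follows does not give one):

```python
import heapq

def best_first_search(start, successors, h, is_goal):
    open_list = [(h(start), start)]          # OPEN, ordered by evaluation value
    parent = {start: None}
    closed = set()                           # CLOSED
    while open_list:
        _, best = heapq.heappop(open_list)   # steps 3a/3b: BESTNODE, least value
        if best in closed:
            continue
        closed.add(best)
        if is_goal(best):                    # step 3c: rebuild path via parents
            path = []
            while best is not None:
                path.append(best)
                best = parent[best]
            return path[::-1]
        for succ in successors(best):        # steps 3d/3e
            if succ not in closed and succ not in parent:
                parent[succ] = best          # record its parent
                heapq.heappush(open_list, (h(succ), succ))
    return None                              # step 4: OPEN empty, failure

# The graph from the trace below, with h values per node (h of S assumed).
graph = {"S": "ABC", "A": "DE", "B": "FG", "C": "H", "H": "IJ", "I": "KLM"}
h_val = {"S": 10, "A": 3, "B": 6, "C": 5, "D": 9, "E": 8, "F": 12, "G": 14,
         "H": 7, "I": 5, "J": 6, "K": 1, "L": 0, "M": 2}
print(best_first_search("S", lambda n: graph.get(n, ""), h_val.get,
                        lambda n: n == "L"))    # ['S', 'C', 'H', 'I', 'L']
```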


Best First Search Example
Step 1: BESTNODE = S; successors (A:3), (B:6), (C:5); OPEN = (A:3), (B:6), (C:5); CLOSED = S
Step 2: BESTNODE = (A:3); successors (D:9), (E:8); OPEN = (B:6), (C:5), (D:9), (E:8); CLOSED = S, (A:3)
Step 3: BESTNODE = (C:5); successors (H:7); OPEN = (B:6), (D:9), (E:8), (H:7); CLOSED = S, (A:3), (C:5)
Step 4: BESTNODE = (B:6); successors (F:12), (G:14); OPEN = (D:9), (E:8), (H:7), (F:12), (G:14); CLOSED = S, (A:3), (C:5), (B:6)
Step 5: BESTNODE = (H:7); successors (I:5), (J:6); OPEN = (D:9), (E:8), (F:12), (G:14), (I:5), (J:6); CLOSED = S, (A:3), (C:5), (B:6), (H:7)
Step 6: BESTNODE = (I:5); successors (K:1), (L:0), (M:2); OPEN = (D:9), (E:8), (F:12), (G:14), (J:6), (K:1), (L:0), (M:2); CLOSED = S, (A:3), (C:5), (B:6), (H:7), (I:5)
Step 7: BESTNODE = (L:0); search stops as L is the goal; OPEN = (D:9), (E:8), (F:12), (G:14), (J:6), (K:1), (M:2); CLOSED = S, (A:3), (C:5), (B:6), (H:7), (I:5), (L:0)
Difference between steepest ascent hill
climbing and Best first search
• In Hill climbing, one move is selected and all the others are rejected, never to be reconsidered, whereas in Best first search, one move is selected but the others are kept around so that they can be revisited later if the selected path becomes less promising.

• Further, the best available state is selected in best first search, even
if the state has a value that is lower than the value of the state that
was just explored. In contrast, hill climbing stops if there are no
successor states with better values than the current state.
Analysis of Best first search
• The worst-case time complexity of Best First Search is O(n log n), where n is the number of nodes: in the worst case we may have to visit all nodes before reaching the goal, and the priority queue (implemented as a min- or max-heap) takes O(log n) per insert and remove operation.
• The performance of the algorithm depends on how well the
cost or evaluation function is designed.
A* Search
• In best first search, the heuristic used was an evaluation function, which estimates the distance from a node to the goal.
• Apart from the evaluation function, a cost function can be used. This function indicates the amount of resources (energy, space, time, etc.) spent on reaching a node from the start node.
• While the evaluation function value deals with the future, the cost function value deals with the past.
• The sum of the evaluation function and cost function values is called the fitness number.
A* Search
A* is a Branch and Bound Search with an estimate of remaining
distance (best first Search) combined with principle of Dynamic
Programming
f(x) = g(x) + h(x)
Where,
g: measure of the cost of getting from the initial node to the current node (the sum of the costs of applying each rule along the best path)
h: estimate of the cost of getting from the current node to the goal node
A* Search Algorithm
1. Create the OPEN list
Create the CLOSED list

2. Put the starting node on the OPEN list (you can set its g to zero and calculate its f = 0 + h).

3. While the OPEN list is not empty:
a) Find the node with the least f on the OPEN list, call it "BESTNODE"
b) Pop BESTNODE off the OPEN list. Push BESTNODE on the CLOSED list.
A* Search Algorithm contd...
c) See if BESTNODE is the goal node i.e. BESTNODE.h = 0. If so, return
Solution.
d) Generate BESTNODE's SUCCESSORs and set their parents to BESTNODE
e) for each SUCCESSOR
i) Compute both g and h for SUCCESSOR
SUCCESSOR.g = BESTNODE.g + cost of getting from BESTNODE to the
SUCCESSOR.
SUCCESSOR.h = distance from goal to the SUCCESSOR
SUCCESSOR.f = SUCCESSOR.g + SUCCESSOR.h
A* Search Algorithm contd...
ii) If a node representing the same state as SUCCESSOR is on the OPEN list with a lower g than SUCCESSOR, skip this SUCCESSOR
iii) If a node representing the same state as SUCCESSOR is on the CLOSED list with a lower g than SUCCESSOR, skip this SUCCESSOR
iv) Otherwise, add the node SUCCESSOR to the OPEN list
v) end (for loop)

f) end (while loop)

4. If OPEN is empty, return Failure.
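A sketch of this algorithm in Python (the graph and heuristic below are made-up sample data; duplicate states are handled by skipping any popped node whose state was already expanded, which covers steps e(ii) and e(iii) when h is consistent):

```python
import heapq
import itertools

def a_star(start, successors, h, is_goal):
    """successors(n) yields (neighbor, step_cost) pairs; the OPEN list is
    ordered by f = g + h."""
    tie = itertools.count()                        # tie-breaker for equal f
    open_list = [(h(start), next(tie), 0, start, None)]
    expanded = {}                                  # CLOSED: best g per state
    parent = {}
    while open_list:
        f, _, g, node, par = heapq.heappop(open_list)
        if node in expanded:                       # already expanded more cheaply
            continue
        expanded[node], parent[node] = g, par
        if is_goal(node):                          # rebuild the path via parents
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g
        for succ, cost in successors(node):
            if succ not in expanded:
                new_g = g + cost                   # SUCCESSOR.g = BESTNODE.g + cost
                heapq.heappush(open_list,
                               (new_g + h(succ), next(tie), new_g, succ, node))
    return None                                    # OPEN empty: failure

edges = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)],
         "B": [("G", 3)], "G": []}
h_est = {"S": 4, "A": 3, "B": 2, "G": 0}
print(a_star("S", lambda n: edges[n], h_est.get, lambda n: n == "G"))
# (['S', 'A', 'B', 'G'], 6)
```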


A* Search Example 1
Step 1: OPEN = (S:7); CLOSED = empty
Step 2: BESTNODE = (S:7); successors (A:12), (B:16), (C:16); OPEN = (A:12), (B:16), (C:16); CLOSED = (S:7)
Step 3: BESTNODE = (A:12); successors (B:18, skipped: already on OPEN with lower f), (D:14); OPEN = (B:16), (C:16), (D:14); CLOSED = (S:7), (A:12)
Step 4: BESTNODE = (D:14); successors (H:32), (I:33), (F:13); OPEN = (B:16), (C:16), (H:32), (I:33), (F:13); CLOSED = (S:7), (A:12), (D:14)
Step 5: BESTNODE = (F:13); successors (G:23); OPEN = (B:16), (C:16), (H:32), (I:33), (G:23); CLOSED = (S:7), (A:12), (D:14), (F:13)
Step 6: BESTNODE = (B:16); successors (D:30, skipped: already on CLOSED with lower f); OPEN = (C:16), (H:32), (I:33), (G:23); CLOSED = (S:7), (A:12), (D:14), (F:13), (B:16)
Step 7: BESTNODE = (C:16); successors (D:24, skipped), (F:16, skipped), (E:34); OPEN = (H:32), (I:33), (G:23), (E:34); CLOSED = (S:7), (A:12), (D:14), (F:13), (B:16), (C:16)
Step 8: BESTNODE = (G:23); search stops as G is the goal; OPEN = (H:32), (I:33), (E:34); CLOSED = (S:7), (A:12), (D:14), (F:13), (B:16), (C:16), (G:23)
A* Search Example 2
Step 1: OPEN = (A:10); CLOSED = empty
Step 2: BESTNODE = (A:10); successors (B:14), (F:9); OPEN = (B:14), (F:9); CLOSED = (A:10)
Step 3: BESTNODE = (F:9); successors (G:9), (H:13); OPEN = (B:14), (G:9), (H:13); CLOSED = (A:10), (F:9)
Step 4: BESTNODE = (G:9); successors (I:8); OPEN = (B:14), (H:13), (I:8); CLOSED = (A:10), (F:9), (G:9)
Step 5: BESTNODE = (I:8); successors (E:15), (H:12), (J:10); OPEN = (B:14), (H:12, replaced via the new, better path), (E:15), (J:10); CLOSED = (A:10), (F:9), (G:9), (I:8)
Step 6: BESTNODE = (J:10); search stops as J is the goal; OPEN = (B:14), (H:12), (E:15); CLOSED = (A:10), (F:9), (G:9), (I:8), (J:10)
More points about A*
• Is the A* algorithm always admissible? Not necessarily.

• By admissible, one means that the algorithm is guaranteed to find an optimal path if one exists. This holds only when the heuristic function never overestimates the distance to the goal node.
Problem Reduction
• In this method, a complex problem is broken down or decomposed into a set of primitive sub-problems. Solutions to these sub-problems are easily obtained.

• OR graphs: In OR graphs, the solution to the problem is obtained by solving any one of the sub-problems.

• AND graphs: In AND graphs, the solution to the problem is obtained by solving all of the sub-problems.
AND-OR Graphs
• Most problems in AI do not reduce to pure AND or pure OR graphs. A combination of AND and OR arcs exists, which is called an AND-OR graph.

• In order to find solutions in an AND-OR graph, we need an algorithm similar to best first search, but with the ability to handle the AND arcs appropriately.
Example AND-OR graph
Searching AND-OR Graphs
• Best first search or the A* algorithm is not adequate for searching AND-OR graphs, because beneath an AND arc, all branches must be solved in order to arrive at a solution.
AO* Algorithm working
• The AO* method divides any given difficult problem into a smaller group of problems that are then resolved using the AND-OR graph concept.

• The evaluation function in AO* looks like this:
f(n) = g(n) + h(n), i.e., actual cost + estimated cost,
where
f(n) = the estimated total cost of a solution through node n,
g(n) = the actual cost from the initial node to the current node,
h(n) = the estimated cost from the current node to the goal state.
It should be noted that in this example, the cost of each edge is 1, and the heuristic cost to reach the goal node from each node of the graph is given in the walkthrough below.
Forward Propagation
• First, we begin from node A and calculate each of the OR side
and AND side paths. The OR side path P(A-B) = g(B) + h(B) = 1 +
5 = 6, where 1 is the cost of the edge between A and B, and 5 is
the estimated cost from B to the goal node.
• The AND side path P(A-C-D) = g(C) + h(C) + g(D) + h(D) = 1 + 3 +
1 + 4 = 9, where the first 1 is the cost of the edge between A
and C, 3 is the estimated cost from C to the goal node, the
second 1 is the cost of the edge between A and D, and 4 is the
estimated cost from D to the goal node.
• Since the cost of P(A-B) is the minimum cost path, we proceed
on this path in the next step.
• Here, someone may ask why we do not stop here since we
have already found the minimum cost path from A to the goal
node. The answer is that such a path may not be the correct
minimum cost path because we have made our calculations
based on the heuristics down to only one level. However, the
given graph has provided us with a deeper level whose
calculations may update the achieved values.
Reaching the Last Level and Back Propagation
• In this step we continue on the P(A-B) from B to its successor
nodes i.e., E and F, where P(B-E) = 1 + 10 = 11 and P(B-F) = 1 +
11 = 12. Here, P(B-E) has a lower cost and would be chosen.
• Now, we have reached the bottom of the graph where no more
level is given to add to our information. Therefore, we can do
the backpropagation and correct the heuristics of upper levels.
In this vein, the updated h(B) = P(B-E) = 11, and as a
consequence the updated P(A-B) = g(B) + updated h(B) = 1 + 11
= 12.
• Now, we can see that P(A-C-D) with a cost of 9 is lower than
the updated P(A-B) with a cost of 12. Therefore, we need to
proceed on this path to find the minimum cost path from A to
the goal node.

• It is worth mentioning that if the updated P(A-B) had a lower cost than P(A-C-D), then we would be done, and no more calculations would be required.
Correcting the Path from Start Node
• In this step, we do the calculations for the AND side path, i.e.,
P(A-C-D), and first explore the paths attached to node C. In this
node again we have an OR side where P(C-G) = 1 + 3 = 4, and
an AND side where P(C-H-I) = 1 + 0 + 1 + 0 = 2, and as a
consequence the updated h(C) = 2.

• Also, the updated h(D) = 2, since P(D-J) = 1 + 1 = 2. With these updated values for h(C) and h(D), the updated P(A-C-D) = 1 + 2 + 1 + 2 = 6.
• This updated P(A-C-D) with the cost of 6 is still less than the
updated P(A-B) with the cost of 12, and therefore, the
minimum cost path from A to the goal node goes from P(A-C-
D) by the cost of 6. We are done.
Difference between the A* and AO* algorithms
• A* always gives the optimal solution, but AO* does not guarantee the optimal solution.
• Once AO* finds a solution, it does not explore all possible paths, whereas A* explores all promising paths.
• When compared to the A* algorithm, the AO* algorithm uses less memory.
• Unlike the A* algorithm, the AO* algorithm cannot go into an endless loop.
Constraint Satisfaction
• In constraint satisfaction, the goal is to discover some problem state that satisfies a given set of constraints.

• Examples of constraint satisfaction problems are cryptarithmetic puzzles.

• It is a two-step process. First, constraints are discovered and propagated as far as possible throughout the system. Then, if there is still no solution, search begins.
Example: Cryptarithmetic puzzle
To perform letter arithmetic, each letter must be assigned a unique decimal digit.
Problem:
  SEND      CROSS
+ MORE    + ROADS
 MONEY     DANGER
Constraints :
• All alphabets used are to have different numeric values.
• Since addition operation is involved, rules of addition are to be adhered
to.
Initial Problem State
S=? M = ? C1=?
E=? O = ? C2=?
N=? R = ? C3=?
D=? Y=?
• Apply constraint inference rules to generate any relevant new constraints.
• Apply the letter assignment rules to perform all assignments required by the current set of constraints. Then choose another rule to generate an additional assignment, which will, in turn, generate new constraints at the next cycle.
• Of course, at each cycle there may be several choices of rules to apply. A few useful heuristics can help to select the best rule to apply first.
Carries: C4 C3 C2 C1 (leftmost to rightmost column)
  S E N D
+ M O R E
---------
M O N E Y
• M = C4 = 1, since the sum of two single-digit numbers plus a carry cannot exceed 19.
• From the leftmost letter column, S + M + C3 = O + 10·C4. Since this column sum must be at least 10 and at most 11, S = 8 or 9 and O = 0 or 1; M is already 1, so O = 0.
• From the next column, N = E + O + C2 = E + C2, so N = E or E + 1; since N ≠ E, N = E + 1 and hence C2 = 1.
• For C2 = 1, the tens column requires N + R + C1 > 9, i.e., N + R > 8; and since N = E + 1 must be a single digit, E ≠ 9.
• At this point, no more conclusions can be generated by constraint propagation; the units column only gives Y = D + E (mod 10), so guessing begins.

Useful heuristics can help to select the best rule to apply first. If one letter has only two possible values and another has six, there is a better chance of guessing right on the first than on the second.
This procedure can be implemented as a depth-first search.
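The search step can also be done by brute force over digit assignments. A sketch (far slower than the constraint-propagation method described above, but it shows the generate-and-test core; the puzzle-string parsing convention is an assumption of this sketch):

```python
from itertools import permutations

def solve(puzzle="SEND + MORE == MONEY"):
    """Try digit assignments until the arithmetic constraint holds.
    Runs in a few seconds for 8 distinct letters."""
    words = puzzle.replace("+", " ").replace("==", " ").split()
    letters = sorted(set("".join(words)))          # distinct letters, <= 10
    leading = {w[0] for w in words}                # leading letters cannot be 0
    for digits in permutations(range(10), len(letters)):
        env = dict(zip(letters, digits))           # one candidate assignment
        if any(env[l] == 0 for l in leading):
            continue
        nums = [int("".join(str(env[c]) for c in w)) for w in words]
        if sum(nums[:-1]) == nums[-1]:             # test: addends equal the sum
            return env
    return None

print(solve())
# {'D': 7, 'E': 5, 'M': 1, 'N': 6, 'O': 0, 'R': 8, 'S': 9, 'Y': 2}
```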
Game Playing
• Making computers play games.
• The rules of games are limited. Hence, extensive amounts of domain-specific knowledge are seldom needed.
• There are two major components of a game playing program:
1. Plausible move generator: It expands or generates only selected moves, as it is not possible to examine all the moves.
2. Static evaluation function generator: Based on heuristics, this generates the static evaluation function value. A higher value means a higher probability of a win.
Game Playing strategies
• Games can be classified as:
1. Single person playing
2. Multi person playing
• Single person games:
→ Examples are Rubik's cube, the 8-tile puzzle, etc.
→ For these, Best first search or the A* algorithm can be used.
Game Playing strategies
• Two person games:
→ Examples are chess, checkers, tic-tac-toe, etc.
→ Each person tries to outsmart the opponent. Each has their own way of evaluating the situation, and since each player tries to obtain the maximum benefit, best first search or the A* algorithm does not serve the purpose.
→ The basic characteristic of the strategy must be look-ahead in nature, i.e., explore the tree two or more levels downwards and choose the optimal move.
Two player Game playing methods
1. Minmax strategy
2. Minmax strategy with alpha-beta pruning

• The minimax search procedure is a depth-first, depth-limited search procedure. It is used for games like chess and tic-tac-toe.

• Minmax strategy with alpha-beta pruning uses two threshold values, called alpha and beta.
Min max Strategy
The minimax algorithm uses two functions:

• MOVEGEN: It generates all the possible moves that can be made from the current position.

• STATICEVALUATION: It returns a value indicating the goodness of a position from the viewpoint of the two players.
Min max Strategy
Minimax is an algorithm for two-player games, so we call one player the MAXIMIZER and the other the MINIMIZER. Both players try to maximize their own benefit and minimize the opponent's benefit.
It is assumed that the maximizer makes the first move, though this is not necessary.
The maximizer tries to move to the position where the static evaluation function value is maximum. The minimizer tries to move towards a minimum value.
Min max Strategy with 3 levels
Working of previous example
Four levels are generated. The values of nodes H, I, J, K, L, M, N, and O are provided by the STATICEVALUATION function. Level 3 is a maximizing level, so all nodes of level 3 take the maximum values of their children. Level 2 is a minimizing level, so all its nodes take the minimum values of their children. This process continues up the tree. The value of A is 23, which means A should choose the move to C to win.
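A compact sketch of the procedure (here `children` plays the role of MOVEGEN and `static_eval` the role of STATICEVALUATION; the toy tree is made-up data, not the figure from the slide):

```python
def minimax(node, depth, maximizing, children, static_eval):
    kids = children(node)
    if depth == 0 or not kids:              # depth limit or leaf: evaluate
        return static_eval(node)
    if maximizing:                          # maximizing level: best child value
        return max(minimax(k, depth - 1, False, children, static_eval)
                   for k in kids)
    return min(minimax(k, depth - 1, True, children, static_eval)
               for k in kids)               # minimizing level: worst child value

# Toy two-level game tree as nested lists; leaves are static evaluations.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, 2, True,
              children=lambda n: n if isinstance(n, list) else [],
              static_eval=lambda n: n))     # 3: MIN yields 3, 2, 2; MAX picks 3
```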
Adding Alpha-Beta Cutoffs
• This is a modification of the minmax strategy, also called alpha-beta pruning.

• It requires maintaining two threshold values:
1. One representing a lower bound on the value that a maximizing node may ultimately be assigned (called alpha).
2. Another representing an upper bound on the value that a minimizing node may be assigned (called beta).
Adding Alpha-Beta Cutoffs
• The value of alpha is treated as a reference point for the Maximizer. Any node whose value is greater than alpha is accepted, and all nodes with a lower value than alpha are rejected.

• Similarly, beta is a reference point for the Minimizer. Any node whose value is less than beta is accepted, and all nodes with a greater value than beta are rejected.
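The minimax sketch above with the two thresholds added (on the same toy tree, the result is identical, but the leaves 4 and 6 are never evaluated):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, static_eval):
    kids = children(node)
    if depth == 0 or not kids:
        return static_eval(node)
    if maximizing:
        value = float("-inf")
        for k in kids:
            value = max(value, alphabeta(k, depth - 1, alpha, beta, False,
                                         children, static_eval))
            alpha = max(alpha, value)        # raise the maximizer's lower bound
            if alpha >= beta:
                break                        # cutoff: MIN will avoid this branch
        return value
    value = float("inf")
    for k in kids:
        value = min(value, alphabeta(k, depth - 1, alpha, beta, True,
                                     children, static_eval))
        beta = min(beta, value)              # lower the minimizer's upper bound
        if alpha >= beta:
            break                            # cutoff: MAX will avoid this branch
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, 2, float("-inf"), float("inf"), True,
                lambda n: n if isinstance(n, list) else [], lambda n: n))   # 3
```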
Advantages of Game Playing in AI
• Advancement of AI: Game playing has been a driving force behind the development
of artificial intelligence and has led to the creation of new algorithms and
techniques that can be applied to other areas of AI.
• Education and training: Game playing can be used to teach AI techniques and
algorithms to students and professionals, as well as to provide training for military
and emergency response personnel.
• Research: Game playing is an active area of research in AI and provides an
opportunity to study and develop new techniques for decision-making and problem-
solving.
• Real-world applications: The techniques and algorithms developed for game playing
can be applied to real-world applications, such as robotics, autonomous systems,
and decision support systems.
Limitations of Game Playing in AI
• Limited scope: The techniques and algorithms developed for
game playing may not be well-suited for other types of
applications and may need to be adapted or modified for
different domains.
• Computational cost: Game playing can be computationally
expensive, especially for complex games such as chess or Go,
and may require powerful computers to achieve real-time
performance.
