A Star Material

The A* search algorithm is a best-first search technique that finds the shortest path between nodes in a graph. It combines the cost of getting from the starting node to the current node (a) with an estimate of the cost to get from the current node to the goal (b) to determine which node to expand next. This evaluation function (f=a+b) aims to find the lowest total cost path. A* uses a priority queue and tries paths with lower f values first to find the optimal solution efficiently. It is guaranteed to find the shortest path if the heuristic is admissible and consistent.

Uploaded by

Aditya Iyer
Copyright
© All Rights Reserved

A* Search Algorithm:

The A* search algorithm is a branch-and-bound search that adds an estimate of the remaining distance (best-first
search) and applies the principle of dynamic programming. It is one of the most popular techniques for finding
paths and traversing graphs. It traverses the graph by moving towards the most promising node, as defined by
some rule or heuristic. Unlike breadth-first search and depth-first search, which expand nodes without using
any information about them, a best-first algorithm evaluates all the candidate nodes and expands the best one.
To calculate the fitness of a node, the algorithm requires a heuristic evaluation function, which depends on
the description of the goal, the information gathered up to the current node and, most importantly, knowledge
about the problem domain. The most common form of this methodology uses a heuristic that only estimates how
close the current position is to the destination node. This approach is called greedy best-first search or
pure heuristic search. To select the most suitable node efficiently, a priority queue of nodes is used.
A* is a best-first search technique that employs a heuristic function to find a path from the initial position
to the destination. The algorithm is mainly used on weighted graphs, and tries to find the least-cost path from
the starting position to the destination. At each iteration, A* has to decide which of the candidate paths to
extend. It does so by considering the cost of the path to a candidate node and the cost of reaching the
destination from that node. The cost from the candidate node to the destination isn't known in advance and
hence has to be estimated. The evaluation function used by A* is therefore given by the following equation:
cost(n)=f(n)=a(n)+b(n) where n is the current position on the path, a(n) is the cost of the path from the
initial state to the current position n, and b(n) is the estimated least cost of going from the current
position n to the destination. The search terminates when a path from the initial position to the destination
has been obtained or when no more paths can be extended. The definition of the heuristic function b depends on
the problem domain. If b(n) never overestimates the true remaining cost from n, the algorithm is guaranteed to
find the least-cost path from the starting position to the destination.
A* uses a heap (priority queue) to keep track of the nodes discovered so far and to efficiently pop the node
that has the minimum f value. At each iteration, the location with the minimum f value is popped from the heap,
the a and f values of its unvisited neighbors are updated, and those neighbors are added as children of the
current location in this best-first approach. A* continues to iterate until there are no more locations in the
heap or the destination is the node with the minimum f value. For the destination node the value of b is zero,
since the cost of reaching a node from itself is zero. Therefore, the value of f for the goal node is equal to
a(destination node), that is, the cost of the least-cost path from the start node to the goal node.
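
The heap-driven loop described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name a_star, the dictionary-based graph format, and the small test graph are all assumptions made for the example, and stale heap entries are simply skipped instead of being updated in place.

```python
import heapq

def a_star(graph, b, start, goal):
    """Return (cost, path) of a least-cost path, or None if goal is unreachable.

    graph: dict mapping node -> list of (neighbor, edge_cost) pairs (assumed format)
    b:     dict mapping node -> estimated remaining cost, with b[goal] == 0
    """
    a = {start: 0}                  # best known cost from start to each node
    parent = {start: None}
    heap = [(b[start], start)]      # entries are (f, node) with f = a + b
    closed = set()

    while heap:
        f, node = heapq.heappop(heap)    # node with the minimum f value
        if node in closed:
            continue                     # stale entry of an already expanded node
        if node == goal:
            path = []
            while node is not None:      # follow parent links back to start
                path.append(node)
                node = parent[node]
            return a[goal], path[::-1]
        closed.add(node)
        for nxt, cost in graph.get(node, []):
            new_a = a[node] + cost
            if nxt not in a or new_a < a[nxt]:
                a[nxt] = new_a
                parent[nxt] = node
                closed.discard(nxt)      # a cheaper path reopens the node
                heapq.heappush(heap, (new_a + b[nxt], nxt))
    return None

# Hypothetical graph: the cheapest route S -> A -> B -> G costs 4.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)], "G": []}
b = {"S": 3, "A": 2, "B": 1, "G": 0}     # never overestimates the true remaining cost
print(a_star(graph, b, "S", "G"))        # (4, ['S', 'A', 'B', 'G'])
```

When the goal is popped, its f value equals a(goal) because b(goal) is 0, which is the point made in the last two sentences above.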

Methods to determine b(n):


We can either calculate the exact cost of reaching the destination from the current node (which can be
time-consuming) or approximate the value of b using some methodology.

Exact value of b(n):


1. The exact distances for all node pairs have to be computed beforehand.
2. If there are no blocking entities in the path between the current position and the final node, the distance
or cost can be computed by using Euclidean or Manhattan distances.

Approximated value of b(n):


1. Manhattan Distance:
b(n) = |current.x − destination.x| + |current.y − destination.y|
2. Diagonal Distance:
b(n) = max(|current.x − destination.x|, |current.y − destination.y|)
3. Euclidean Distance:
b(n) = sqrt((current.x − destination.x)^2 + (current.y − destination.y)^2)
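
As a small sketch, the three approximations can be written as Python functions. Representing a position as an (x, y) tuple and the function names are assumptions made for this example only.

```python
import math

def manhattan(current, destination):
    """Sum of the absolute differences in x and y."""
    return abs(current[0] - destination[0]) + abs(current[1] - destination[1])

def diagonal(current, destination):
    """Larger of the absolute differences in x and y."""
    return max(abs(current[0] - destination[0]), abs(current[1] - destination[1]))

def euclidean(current, destination):
    """Straight-line distance between the two positions."""
    return math.hypot(current[0] - destination[0], current[1] - destination[1])

start, goal = (0, 0), (3, 4)
print(manhattan(start, goal), diagonal(start, goal), euclidean(start, goal))  # 7 4 5.0
```

Which of these is a safe underestimate depends on the moves allowed: on a grid with unit-cost moves in four directions the Manhattan distance never overestimates, while the diagonal distance suits grids where a diagonal step costs the same as a straight one.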

Overestimation and Underestimation of b(n):


If b(n) is a perfect estimator of the cost of reaching the goal from the current node, then A* search will
converge immediately to the goal node, with no wasted search. If the value of b(n) is taken to be 0, the search
is controlled entirely by the value of a(n). If the value of a(n) is also 0, the search becomes random. If b(n)
is 0 and every move costs 1, so that a(n) is the depth of n, A* search acts just like breadth-first search.
Underestimating b(n) can spend moves on paths that make no progress towards the goal. Thus underestimating
b(n) can lead to wasted effort: A* search has to backtrack after a certain point, having spent computation in
a direction which gave no progress. The problem with overestimation is that it can block the A* search
algorithm from finding the best possible path; it may instead yield a solution which isn't optimal. The
simplest way to guarantee that b(n) doesn't overestimate is to set the value of b(n) to 0. But then we are back
to breadth-first search, which is admissible but not efficient.
Underestimation doesn't block A* from finding the optimal path but can lead to wasted computation on moves
which make no progress (figure 9). Overestimation, on the other hand, can lead to a sub-optimal solution
rather than the best possible path (figure 10). In figure 9, by underestimating the value of the heuristic
function, we first make progress through node B and then G. Here, G has an f value of 6, which is equal to C's
value, so we proceed further to H. H has an f value of 7, which is now worse than the value of node C, so we
backtrack to the path through C and eventually reach the optimal solution. Hence by underestimating the value
of the heuristic function we have wasted some effort, but we are guaranteed to find an optimal solution. In
figure 10, the path chosen is B -> E -> F -> G, whereas the path through node D could have been the optimal
one. By overestimating the heuristic value of D, we make D look so bad that we may find some other, worse
solution without ever expanding D. Thus, by overestimating the value of the heuristic function, we may not
always find an optimal solution.
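
The effect of under- and overestimation can also be reproduced on a tiny graph. The sketch below is an illustration under stated assumptions: the four-node graph and the three heuristic tables are hypothetical and are not the graphs of figures 9 and 10, and the helper a_star is a compact variant that reports which nodes were expanded.

```python
import heapq

def a_star(graph, b, start, goal):
    """Return (cost, expanded), where expanded lists nodes in expansion order."""
    a = {start: 0}
    heap = [(b[start], start)]
    expanded = []
    while heap:
        f, node = heapq.heappop(heap)
        if f > a[node] + b[node]:
            continue                  # stale entry: a cheaper path was found later
        if node == goal:
            return a[node], expanded
        expanded.append(node)
        for nxt, cost in graph[node]:
            new_a = a[node] + cost
            if nxt not in a or new_a < a[nxt]:
                a[nxt] = new_a
                heapq.heappush(heap, (new_a + b[nxt], nxt))
    return None, expanded

# Hypothetical graph: S -> B -> G costs 2 (optimal), S -> A -> G costs 4.
graph = {"S": [("A", 1), ("B", 1)], "A": [("G", 3)], "B": [("G", 1)], "G": []}

zero = {"S": 0, "A": 0, "B": 0, "G": 0}   # extreme underestimate
good = {"S": 2, "A": 3, "B": 1, "G": 0}   # exact remaining costs
over = {"S": 2, "A": 1, "B": 6, "G": 0}   # overestimates B (true cost 1)

print(a_star(graph, zero, "S", "G"))   # (2, ['S', 'A', 'B']): optimal, extra work on A
print(a_star(graph, good, "S", "G"))   # (2, ['S', 'B']): optimal, no wasted expansion
print(a_star(graph, over, "S", "G"))   # (4, ['S', 'A']): sub-optimal, B never expanded
```

With the overestimating table, B looks so expensive (f = 7) that the goal is reached through A first, which mirrors the situation described for node D.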

Figure 9: Underestimating the heuristic function


Figure 10: Overestimating the heuristic function
The heuristic is called monotone or consistent if it satisfies the following condition for each edge:
b(x) ≤ e(x, y) + b(y), where e(x, y) is the cost of the edge from node x to node y. With such a heuristic,
no node ever needs to be expanded more than once, so the algorithm can be implemented very efficiently.
The optimality of the algorithm relies heavily on how well the approximation mechanism in place estimates
b(n). Admissible heuristics are by nature optimistic, in the sense that according to them the cost to solve
the problem is never more than what it actually is. For example, if the heuristic is the Euclidean
(straight-line) distance from the current position to the destination and path costs are distances travelled,
then it is admissible, because no path can be shorter than the straight line. Admissibility is thus the first
condition that must be met in order for the algorithm to be optimal.
Consistency (monotonicity) also needs to be satisfied in order for the algorithm to be optimal. However, this
condition is required only when A* is applied as a graph search, that is, when expanded nodes are kept in a
closed list and are not expanded again. According to the condition, a heuristic is consistent if, for every
node n and every neighbor n' obtained by an action, the estimated cost of reaching the goal from n is less than
or equal to the sum of the cost of reaching n' from n and the estimated cost of reaching the goal from n'.
This idea is derived from the triangle inequality, where the sum of any two sides cannot be less than the
third side. Here the triangle is formed by n, n' and the goal state closest to n, denoted G(n). For an
admissible heuristic this inequality makes sense: if there were a path from n to G(n) via n' whose cost was
cheaper than b(n), this would violate the admissibility of b(n), since b(n) is a lower bound on the cost of
reaching G(n). Every consistent heuristic is admissible, but not every admissible heuristic is consistent.
This shows that consistency is a stronger condition than admissibility.
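
Both conditions can be checked mechanically on a small explicit graph. The following is a sketch under assumptions: the helper names, the dictionary graph format and the example values are hypothetical, and the exact remaining costs are obtained with Dijkstra's algorithm run backwards from the goal.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Exact least cost from every node to goal (Dijkstra on reversed edges)."""
    reverse = {}
    for x, edges in graph.items():
        for y, cost in edges:
            reverse.setdefault(y, []).append((x, cost))
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue                      # stale heap entry
        for prev, cost in reverse.get(node, []):
            if prev not in dist or d + cost < dist[prev]:
                dist[prev] = d + cost
                heapq.heappush(heap, (d + cost, prev))
    return dist

def is_admissible(graph, b, goal):
    """b(n) never exceeds the true remaining cost of any node that can reach goal."""
    exact = true_costs_to_goal(graph, goal)
    return all(b[n] <= exact[n] for n in exact)

def is_consistent(graph, b, goal):
    """b(goal) = 0 and b(x) <= e(x, y) + b(y) for every edge (x, y)."""
    return b[goal] == 0 and all(
        b[x] <= cost + b[y] for x, edges in graph.items() for y, cost in edges
    )

graph = {"S": [("A", 1)], "A": [("G", 3)], "G": []}   # true costs: S = 4, A = 3
b1 = {"S": 3, "A": 2.5, "G": 0}   # consistent, hence admissible
b2 = {"S": 4, "A": 1, "G": 0}     # admissible, but b(S) > e(S, A) + b(A)
print(is_admissible(graph, b1, "G"), is_consistent(graph, b1, "G"))   # True True
print(is_admissible(graph, b2, "G"), is_consistent(graph, b2, "G"))   # True False
```

The second table never overestimates, yet the estimate drops by 3 across an edge of cost 1, so it is admissible without being consistent.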
A* search uses an open list and a closed list, which keep track of the nodes that are still to be expanded and
the nodes that have already been expanded, respectively. Initially, the start node is inserted into the open
list. In each iteration the element with the minimum f value is extracted from the open list and moved to the
closed list, and the a values of all nodes adjacent to it (its successors) are calculated. If a node for the
same position as a successor already exists in the open list with a lower or equal a value, we skip the
successor. If the successor has the lower a value, we reset the parent pointer of the old node and update its
a and f values. If a node for the same position exists in the closed list, we likewise skip the successor
unless it has the lower a value; in that case we reset the parent pointer, update the a and f values of the
old node, and propagate the improvement to its children using depth first search. While traversing the
children, we follow the same procedure and check for better paths by comparing the a value of the path already
recorded for each child with that of the path we are following to it. A successor that is in neither list is
added to the open list.

A* search pseudocode:

initialization: open ← {initial node}, closed ← empty, a(initial node) = 0, f(initial node) = b(initial node)

while open ≠ empty
do
    remove the node with the lowest f from open, add it to closed and call it best-node
    if best-node = goal
    then
        return the path obtained by following the parent links back from goal
    generate all successors of best-node
    for each successor
        compute its a value through best-node: a(best-node) + e(best-node, successor)
        if successor ∈ open
        then
            if a(successor) < a(matching)
            then
                make best-node the parent of the matching node
                change the values of a and f for the matching node
            else discard successor
        else if successor ∈ closed
        then
            if a(successor) < a(matching)
            then
                make best-node the parent of the matching node
                change the values of a and f for the matching node
                propagate the changes to its children using depth first search
            else discard successor
        else
            set the parent of successor to best-node
            calculate f(successor) = a(successor) + b(successor) and append successor to open
    end for
report failure: open is empty and no path to goal exists

Example 1:
Consider the following graph search problem in figure 11. Here the heuristic values for each node are already
given and the edges are labelled with their costs. Node f is the goal state; f(x) denotes the f value of node x.

Figure 11: A* Search Example


h(a)=4, h(b)=2, h(c)=4, h(d)=4.5, h(e)=2 (h is the heuristic estimate, written b(n) in the sections above)
A* Approach:
1. Explored Nodes = {start} Unexplored Nodes = {a,b,c,d,e,f}
a and d are the neighbors of the start node, hence their ‘f’ values can be computed.
f(a) = 1.5 + 4 = 5.5
f(d) = 2 + 4.5 = 6.5
a is selected.
2. Explored Nodes = {start, a} Unexplored Nodes = {b,c,d,e,f}
b is the neighbor of a.
f(b) = 3.5 + 2 = 5.5
f(d) = 6.5
b is selected.
3. Explored Nodes = {start, a, b} Unexplored Nodes = {c,d,e,f}
c is the neighbor of b.
f(c) = 6.5 + 4 = 10.5
f(d) = 6.5
d is selected.
4. Explored Nodes = {start, a, b, d} Unexplored Nodes = {c,e,f}
e is the neighbor of d.
f(e) = 5 + 2 = 7
f(c) = 10.5
e is selected.
5. Explored Nodes = {start, a, b, d, e} Unexplored Nodes = {c, f}
f is the neighbor of e.
f(f) = 7 + 0 = 7
f(c) = 10.5
f is selected.
6. f reached, hence terminate.
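
The walkthrough above can be replayed with a short sketch of the open/closed-list procedure. Several things here are assumptions made for illustration: the edge costs are inferred from the a values used in the steps (start-a 1.5, a-b 2, b-c 3, start-d 2, d-e 3, e-f 2), the edge from c to f is left out because its cost is not stated in the text, and the heuristic values of start and of the goal f are taken as 0. Instead of propagating improvements with depth first search, this version reopens a closed node when a cheaper path to it is found.

```python
def a_star(graph, b, start, goal):
    """Open/closed-list A*. Returns (cost, path, order of expansion) or None."""
    a = {start: 0}
    parent = {start: None}
    open_list = {start}
    closed = set()
    order = []

    while open_list:
        # pick the open node with the lowest f = a + b (ties broken by name)
        best = min(open_list, key=lambda n: (a[n] + b[n], n))
        open_list.remove(best)
        closed.add(best)
        order.append(best)
        if best == goal:
            path, node = [], goal
            while node is not None:       # follow parent links back to start
                path.append(node)
                node = parent[node]
            return a[goal], path[::-1], order
        for succ, cost in graph.get(best, []):
            new_a = a[best] + cost
            if succ in a and new_a >= a[succ]:
                continue                  # old path is at least as good: discard
            a[succ] = new_a
            parent[succ] = best
            closed.discard(succ)          # cheaper path to an expanded node: reopen
            open_list.add(succ)
    return None

# Edge costs inferred from the a values in the steps above (c -> f omitted).
graph = {
    "start": [("a", 1.5), ("d", 2)],
    "a": [("b", 2)], "b": [("c", 3)], "c": [],
    "d": [("e", 3)], "e": [("f", 2)], "f": [],
}
b = {"start": 0, "a": 4, "b": 2, "c": 4, "d": 4.5, "e": 2, "f": 0}

cost, path, order = a_star(graph, b, "start", "f")
print(cost, path)   # 7 ['start', 'd', 'e', 'f']
print(order)        # ['start', 'a', 'b', 'd', 'e', 'f']
```

The expansion order matches the explored sets of steps 1 to 5. Node c is never expanded because its f value of 10.5 is larger than the final cost of 7, so the missing c-f edge cannot change the result.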

Example 2: Show the step-by-step generation of the A* search tree to reach the goal state 'L' in the tree of
figure 12. Assume that if a duplicate of a node exists in the graph, the duplicate was generated before the
copy of the node that lies on the current path.
Legend of figure 12: f = (a+b), A = Initial State, L = Goal State. In the steps of the solution each f value
is written as a pair with the b value first and the a value second.
Figure 12: A* search example

Ans. (Figure 13)

Indicates the Solution path


Figure 13: Solution for A* search
1. Choose C as the BESTNODE since it has the lowest f value, i.e. (3+1).
2. Choose G as the BESTNODE since it has the lowest f value, i.e. (2+2).
3. Since G is an already generated node, compare the a values of the two copies, i.e. OLD and the SUCCESSOR
of BESTNODE 'C', in order to decide whether the old path to SUCCESSOR is better or the path via the
BESTNODE.
4. On comparing the 'a' values, i.e. 3 via F and 2 via the current path through 'C', we find that the
current path via 'C' is cheaper. Hence we reset the parent link of 'G' to 'C' and update its a and f
values accordingly.
5. Now, update the a and f values of the successors of 'G' through a depth first search traversal. Thus the
values of J, K and L become (3+3), (1+3) and (0+4) respectively.
6. Now, choose K since it has the lower f value and expand K to generate L.
7. Since L is an already generated node that has not been expanded, compare the a values of both paths, i.e.
OLD and the SUCCESSOR of BESTNODE 'K', in order to decide whether the old path to SUCCESSOR
'L' is better or the path via the BESTNODE.
8. On comparing the 'a' values, i.e. 6 via R and 4 via the current path through 'K', we find that the
current path via 'K' is cheaper. Hence we reset the parent link of 'L' to 'K' and update its a and f
values accordingly.
9. Since L has no successors, we don't need to do a depth first traversal to update the f values of its
children.

Hence, the GOAL STATE 'L' is reached with an f value of (0+4), i.e. 4, through the path (ACGKL),
thereby generating the A* search tree.
