AI Lab
SOLDIER INSTITUTE OF
ENGINEERING & TECHNOLOGY
SESSION – (2019-2023)
PRACTICAL FILE OF ARTIFICIAL
INTELLIGENCE LAB
(BTCS 605-18)
INDEX
SR. NO.  AIM                                                               SIGNATURE
1.       Write a program to conduct uninformed and informed search.
2.       Write a program to conduct game search.
3.       Write a program to construct a Bayesian network from given data.
4.       Write a program to infer from the Bayesian network.
5.       Write a program to run value and policy iteration in a grid world.
6.       Write a program to do reinforcement learning in a grid world.
return 0;
}
2. Depth-first Search
Depth-first search isa recursive algorithm for traversing a tree or graph
data structure. It is called the depth-first search because it starts from the
root node and follows each path to its greatest depth node before
moving to the next path. DFS uses a stack data structure for its
implementation. The process of the DFS algorithm is similar to the BFS
algorithm
EXPERIMENT NO. 1
Output of BFS:
Following is Breadth First Traversal (starting from vertex 2)
2 0 3 1
Code for DFS:
#include <bits/stdc++.h>
using namespace std;
// Graph class represents a directed graph
// using adjacency list representation
class Graph {
public:
map<int, bool> visited;
map<int, list<int> > adj;
// function to add an edge to graph
void addEdge(int v, int w);
// DFS traversal of the vertices
// reachable from v
void DFS(int v);
};
void Graph::addEdge(int v, int w)
{
adj[v].push_back(w); // Add w to v's list.
}
void Graph::DFS(int v)
{
// Mark the current node as visited and
// print it
visited[v] = true;
cout << v << " ";
// Recurse for all the vertices adjacent
// to this vertex
list<int>::iterator i;
for (i = adj[v].begin(); i != adj[v].end(); ++i)
if (!visited[*i])
DFS(*i);
}
// Driver code
int main()
{
// Create a graph given in the above diagram
Graph g;
g.addEdge(0, 1);
g.addEdge(0, 2);
g.addEdge(1, 2);
g.addEdge(2, 0);
g.addEdge(2, 3);
g.addEdge(3, 3);
cout << "Following is Depth First Traversal (starting from vertex 2)\n";
g.DFS(2);
return 0;
}
Informed Search:
Informed search algorithms have information about the goal state, which
helps in more efficient searching. This information is obtained from a
heuristic function that estimates how close a state is to the goal state.
The main informed search algorithm is given below:
1. A* Search Algorithm
A* search is the most commonly known form of best-first search. It uses a
heuristic function h(n), the estimated cost to reach the goal from node n,
and g(n), the cost to reach node n from the start state. It combines
features of UCS and greedy best-first search, which lets it solve the
problem efficiently. The A* search algorithm finds the shortest path
through the search space using the heuristic function. This search
algorithm expands a smaller search tree and provides the optimal result
faster. The A* algorithm is similar to UCS except that it evaluates nodes
by g(n) + h(n) instead of g(n).
Code for A* algorithm
#include <iostream>
#include "source/AStar.hpp"
int main()
{
AStar::Generator generator;
// Set 2d map size.
generator.setWorldSize({25, 25});
// You can use a few heuristics : manhattan, euclidean or octagonal.
generator.setHeuristic(AStar::Heuristic::euclidean);
generator.setDiagonalMovement(true);
std::cout << "Generate path ... \n";
// This method returns vector of coordinates from target to source.
auto path = generator.findPath({0, 0}, {20, 20});
for(auto& coordinate : path) {
std::cout << coordinate.x << " " << coordinate.y << "\n";
}
}
OUTPUT OF A* algorithm

OUTPUT of DFS:
Following is Depth First Traversal (starting from vertex 2)
2 0 1 3
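Because the C++ A* listing above depends on the external source/AStar.hpp header, it will not compile on its own. For reference, here is a small self-contained Python sketch of the same idea, A* on a 2-D grid with a Euclidean heuristic; the grid, start and goal values are illustrative assumptions, not part of the original program.

import heapq
import math

def astar(grid, start, goal):
    # A* on a 2-D grid; cells with value 1 are walls, moves are 4-connected.
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Euclidean distance heuristic to the goal
        return math.hypot(goal[0] - cell[0], goal[1] - cell[1])

    open_heap = [(h(start), 0.0, start)]   # entries are (f = g + h, g, cell)
    parent = {start: None}
    best_g = {start: 0.0}

    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Reconstruct the path by walking the parent links backwards
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > best_g.get(cell, float("inf")):
            continue  # stale queue entry
        for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nbr = (cell[0] + dr, cell[1] + dc)
            if 0 <= nbr[0] < rows and 0 <= nbr[1] < cols and grid[nbr[0]][nbr[1]] == 0:
                ng = g + 1
                if ng < best_g.get(nbr, float("inf")):
                    best_g[nbr] = ng
                    parent[nbr] = cell
                    heapq.heappush(open_heap, (ng + h(nbr), ng, nbr))
    return None  # no path found

# Illustrative 5x5 map with a small wall; start at top-left, goal at bottom-right
grid = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 1, 0],
        [1, 1, 0, 1, 0],
        [0, 0, 0, 0, 0]]
print(astar(grid, (0, 0), (4, 4)))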
EXPERIMENT NO. 2
AIM : Write a program to conduct game search.
THEORY : Game playing was one of the first tasks undertaken in Artificial
Intelligence. Game theory in AI dates back to around 1950, almost from the
days when computers became programmable. The very first game tackled in AI
was chess. Pioneers in the field of game theory in AI were Konrad Zuse (the
inventor of the first programmable computer and the first programming
language), Claude Shannon (the inventor of information theory), Norbert
Wiener (the creator of modern control theory), and Alan Turing. Since then,
there has been steady progress in the standard of play, to the point that
machines have defeated human champions (although not every time) in chess
and backgammon, and are competitive in many other games.
Types of Game
1. Perfect Information Game: a game in which the player knows all the
possible moves of himself and the opponent, and their results.
E.g. Chess.
2. Imperfect Information Game: a game in which the player does not know all
the possible moves of the opponent.
E.g. Bridge, since all the cards are not visible to the player.
Mini-Max Algorithm in Artificial Intelligence:
The mini-max algorithm is a recursive, backtracking algorithm used in
decision-making and game theory. It provides an optimal move for the player
assuming that the opponent is also playing optimally. The mini-max
algorithm uses recursion to search through the game tree and is mostly used
for game playing in AI, such as Chess, Checkers, Tic-Tac-Toe, Go, and
various other two-player games. The algorithm computes the minimax decision
for the current state. In this algorithm two players play the game: one is
called MAX and the other is called MIN. The two players are opponents of
each other; each tries to obtain the maximum benefit for itself while
leaving the opponent the minimum, so MAX selects the maximized value and
MIN selects the minimized value. The minimax algorithm performs a
depth-first search to explore the complete game tree: it proceeds all the
way down to the terminal nodes of the tree and then backs the values up the
tree as the recursion unwinds.
Code for minimax algorithm:
// A simple C++ program to find the maximum score that the maximizing player can get.
#include <bits/stdc++.h>
using namespace std;

// Returns the optimal value the maximizer can obtain.
// depth is the current depth in the game tree, nodeIndex the index of the
// current node in scores[], isMax is true when the maximizer moves, and h
// is the height of the game tree.
int minimax(int depth, int nodeIndex, bool isMax, int scores[], int h)
{
    // Terminating condition: a leaf node is reached
    if (depth == h)
        return scores[nodeIndex];

    // Maximizer's move: take the maximum attainable value
    if (isMax)
        return max(minimax(depth + 1, nodeIndex * 2, false, scores, h),
                   minimax(depth + 1, nodeIndex * 2 + 1, false, scores, h));

    // Minimizer's move: take the minimum attainable value
    else
        return min(minimax(depth + 1, nodeIndex * 2, true, scores, h),
                   minimax(depth + 1, nodeIndex * 2 + 1, true, scores, h));
}

// A utility function to find log base 2 of n
int log2(int n)
{
    return (n == 1) ? 0 : 1 + log2(n / 2);
}

// Driver code
int main()
{
    // Example leaf scores (illustrative values); the count must be a power of 2.
    int scores[] = { 3, 5, 2, 9, 12, 5, 23, 23 };
    int n = sizeof(scores) / sizeof(scores[0]);
    int h = log2(n);
    int res = minimax(0, 0, true, scores, h);
    cout << "The optimal value is : " << res << endl;
    return 0;
}
EXPERIMENT NO. 2
Output:

EXPERIMENT NO. 3
AIM: Write a program to construct a Bayesian network from given data.
THEORY: A Bayesian network models the probabilistic relationships between a
set of causes and the evidence they produce (for example, the possible
causes of a computer failure). The goal is to calculate the posterior
conditional probability distribution of each of the possible unobserved
causes given the observed evidence, i.e. P[Cause | Evidence].
Data Set:
Title: Heart Disease Databases
The Cleveland database contains 76 attributes, but all published
experiments refer to using a subset of 14 of them. In particular, the
Cleveland database is the only one that has been used by ML researchers to
this date. The "Heartdisease" field refers to the presence of heart disease
in the patient. It is integer valued from 0 (no presence) to 4.

Database     0    1   2   3   4   Total
Cleveland    164  55  36  35  13  303

Attribute Information:
1. age: age in years
2. sex: sex (1 = male; 0 = female)
3. cp: chest pain type (1 = typical angina; 2 = atypical angina;
   3 = non-anginal pain; 4 = asymptomatic)
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
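A sketch of how such a network could be constructed from this data with pgmpy is shown below. The file name heart.csv, the column names and the chosen edge structure are illustrative assumptions, not the original program.

import pandas as pd
from pgmpy.models import BayesianNetwork
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.inference import VariableElimination

# Load the Cleveland data (assumed to be saved locally as heart.csv with the
# commonly used attribute names, including a 'heartdisease' column).
data = pd.read_csv('heart.csv')

# An assumed, hand-picked structure over the categorical attributes; continuous
# attributes such as age or chol would need to be discretized before use.
model = BayesianNetwork([('sex', 'heartdisease'),
                         ('cp', 'heartdisease'),
                         ('fbs', 'heartdisease'),
                         ('heartdisease', 'restecg')])

# Learn the conditional probability tables from the data.
model.fit(data, estimator=MaximumLikelihoodEstimator)
print(model.get_cpds('heartdisease'))

# Query the fitted network: distribution of heart disease given chest pain type 2.
infer = VariableElimination(model)
print(infer.query(['heartdisease'], evidence={'cp': 2}))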
Joint Probability
The probability of two (or more) events occurring together is known as the
joint probability. The joint probability distribution over two or more
random variables gives the probability of each combination of their values.
For example, the joint probability of events A and B is expressed formally
as
P(A ^ B)
P(A, B)
where P denotes probability, and the conjunction "and" is written either
with the wedge operator "^" (or the intersection symbol) or, in some
notations, with a comma ",". For independent events, the joint probability
of A and B is calculated by multiplying the probability of event A by the
probability of event B.
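As a small worked illustration (the numbers are made up for this example only), the snippet below computes the joint probability of two independent events and reads a joint probability out of a full joint table:

# Joint probability of two independent events: P(A, B) = P(A) * P(B)
p_a = 0.6          # e.g. P(rain), illustrative value
p_b = 0.5          # e.g. P(coin shows heads), independent of rain
print("P(A, B) =", p_a * p_b)   # 0.3

# With a full joint distribution table, P(A, B) is read off directly.
# Keys: (A, B) with A, B in {True, False}.
joint = {(True, True): 0.2, (True, False): 0.4,
         (False, True): 0.3, (False, False): 0.1}
print("P(A=True, B=True) =", joint[(True, True)])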
Posterior Probability
In Bayesian statistics, the posterior probability of a random event or an
uncertain proposition is its conditional probability given the relevant
evidence or background. "Posterior", in this context, means "after taking
into account the evidence relevant to the particular case being examined".
The probability distribution of an unknown quantity, treated as a random
variable and conditioned on the data obtained from an experiment or survey,
is known as the posterior probability distribution.
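A short numerical sketch of Bayes' rule, P(Cause | Evidence) = P(Evidence | Cause) P(Cause) / P(Evidence), with prior and likelihood values chosen only for illustration:

# Bayes' rule: posterior = likelihood * prior / evidence
prior = 0.01            # P(disease), assumed prevalence
likelihood = 0.95       # P(positive test | disease), assumed sensitivity
false_positive = 0.05   # P(positive test | no disease), assumed

# Total probability of observing a positive test
evidence = likelihood * prior + false_positive * (1 - prior)

posterior = likelihood * prior / evidence
print("P(disease | positive test) =", round(posterior, 4))  # about 0.161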
Inferencing with Bayesian Network
In this demonstration, we'll use Bayesian networks to solve the well-known
Monty Hall problem. Let me explain the Monty Hall problem to those of you
who are unfamiliar with it:
This problem involves a game show in which a contestant must choose one of
three doors, one of which conceals a prize. After the contestant has chosen
a door, the show's host (Monty) opens one of the remaining doors that is
empty and asks the contestant if he wants to switch to the other unopened
door. The decision is whether to keep the current door or switch to the
other one. It is preferable to switch, because the prize is more likely to
be behind the other door. To resolve this ambiguity, let's model the
problem with a Bayesian network.
CODE :
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import pylab as plt
# Defining Bayesian Structure
model = BayesianNetwork([('Guest', 'Host'), ('Price', 'Host')])
# Defining the CPDs:
cpd_guest = TabularCPD('Guest', 3, [[0.33], [0.33], [0.33]])
cpd_price = TabularCPD('Price', 3, [[0.33], [0.33], [0.33]])
cpd_host = TabularCPD('Host', 3, [[0, 0, 0, 0, 0.5, 1, 0, 1, 0.5],
[0.5, 0, 1, 0, 0, 0, 1, 0, 0.5],
[0.5, 1, 0, 1, 0.5, 0, 0, 0, 0]],
evidence=['Guest', 'Price'], evidence_card=[3, 3])
# Associating the CPDs with the network structure.
model.add_cpds(cpd_guest, cpd_price, cpd_host)
model.check_model()
# Inferring the posterior probability
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
posterior_p = infer.query(['Host'], evidence={'Guest': 2, 'Price': 2})
print(posterior_p)
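Given the evidence Guest = 2 and Price = 2, the host can open neither the guest's door nor the prize door, so the query above assigns probability 0.5 each to Host = 0 and Host = 1, and probability 0 to Host = 2.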
EXPERIMENT NO. 4
AIM: Write a program to infer from the Bayesian network.
OUTPUT:
EXPERIMENT NO. 5
AIM: Write a program to run value and policy iteration in a grid world
THEORY : Value Iteration
With the tools we have explored until now, a new question arises: why do we
need to consider an initial policy at all? The idea of the value iteration
algorithm is that we can compute the value function without a policy.
Instead of letting a policy \pi dictate which actions are selected, we
select the actions that maximize the expected reward:

v_{k+1}(s) = \max_a \sum_{s', r} p(s', r \mid s, a) \, [ r + \gamma \, v_k(s') ]
A policy is a mapping from states to probabilities of selecting each possible action. If the
agent is following policy \pi at time t, then \pi(a \mid s) is the probability that A_t = a if S_t = s.

The value function of a state s under a policy \pi, denoted v_\pi(s), is the expected return when
starting in s and following \pi thereafter:

v_\pi(s) = \mathbb{E}_\pi \left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \mid S_t = s \right]

Similarly, the action value function gives the expected return when taking an action a in
state s and following \pi thereafter:

q_\pi(s, a) = \mathbb{E}_\pi \left[ \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} \mid S_t = s, A_t = a \right]

The Bellman equation expresses the value of each state in terms of the values of its successor states:

v_\pi(s) = \sum_a \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a) \, [ r + \gamma \, v_\pi(s') ]

The Bellman optimality equations give the optimal policy of choosing specific actions in
specific states to achieve the maximum reward and reach the goal efficiently. They are given as

v_*(s) = \max_a \sum_{s', r} p(s', r \mid s, a) \, [ r + \gamma \, v_*(s') ]

The Bellman equations cannot be used directly in goal-directed problems, and dynamic
programming is used instead, where the value functions are computed iteratively.
In the problem below, the grid has 2 terminal states, shown at the corners. There are four
possible actions in each state: up, down, right and left. If an action in a state would take the
agent out of the grid, the agent remains in the same state. The reward for any transition is
R_t = -1, except transitions into the terminal states at the corners, which have a reward of 0.
The policy is a uniform random policy with all actions being equiprobable, each with a
probability of 1/4 = 0.25.
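The cells that follow implement policy iteration. As a companion to the theory above, here is a minimal self-contained value-iteration sketch for the same 4x4 grid (two terminal corner states, reward -1 per move), using the Bellman optimality update. It is an illustration, not part of the original notebook.

import numpy as np

gridSize = 4
gamma = 1.0
actions = [[-1, 0], [1, 0], [0, 1], [0, -1]]       # up, down, right, left
terminationStates = [[0, 0], [gridSize - 1, gridSize - 1]]

def step(state, action):
    # Terminal states absorb with zero reward
    if state in terminationStates:
        return state, 0
    nxt = [state[0] + action[0], state[1] + action[1]]
    # Moves that leave the grid keep the agent in place
    if -1 in nxt or gridSize in nxt:
        nxt = state
    return nxt, -1

V = np.zeros((gridSize, gridSize))
theta = 1e-4
while True:
    delta = 0.0
    for i in range(gridSize):
        for j in range(gridSize):
            # Bellman optimality update: V(s) = max_a [ r + gamma * V(s') ]
            best = max(r + gamma * V[s[0], s[1]]
                       for s, r in (step([i, j], a) for a in actions))
            delta = max(delta, abs(best - V[i, j]))
            V[i, j] = best
    if delta < theta:
        break

print(V)   # optimal state values: 0 at the corners, -1, -2, -3 elsewhere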
1. Gridworld-1
In [1]:
import numpy as np
import random
In [2]:
gamma = 1 # discounting rate
gridSize = 4
rewardValue = -1
terminationStates = [[0,0], [gridSize-1, gridSize-1]]
actions = [[-1, 0], [1, 0], [0, 1], [0, -1]]
numIterations = 1000
The actionValue function returns the next state for a given action in a state, together with the accrued reward.
In [3]:
def actionValue(initialPosition, action):
    if initialPosition in terminationStates:
        finalPosition = initialPosition
        reward = 0
    else:
        # Compute final position
        finalPosition = np.array(initialPosition) + np.array(action)
        reward = rewardValue
        # If the action moves the finalPosition out of the grid, stay in same cell
        if -1 in finalPosition or gridSize in finalPosition:
            finalPosition = initialPosition
            reward = rewardValue
    # print(finalPosition)
    return finalPosition, reward
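The cells that define the list of states and the helper functions policy_evaluate and greedify_policy are not reproduced here. A minimal version consistent with the calls in the cells below could look like the following (the exact definitions in the original notebook may differ):

# All grid positions, including the two terminal corner states
states = [[i, j] for i in range(gridSize) for j in range(gridSize)]

# Characters used to display the greedy action for each state,
# in the same order as actions = [up, down, right, left]
actionChars = ['u', 'd', 'r', 'l']

def policy_evaluate(states, actions, gamma, valueMap):
    # One synchronous sweep of iterative policy evaluation for the
    # equiprobable random policy (probability 0.25 for each action)
    newValueMap = np.zeros((gridSize, gridSize))
    for state in states:
        value = 0
        for action in actions:
            finalPosition, reward = actionValue(state, action)
            value += 0.25 * (reward + gamma * valueMap[finalPosition[0], finalPosition[1]])
        newValueMap[state[0], state[1]] = value
    return newValueMap

def greedify_policy(state, pi, pi1, gamma, valueMap):
    # Make the policy greedy with respect to the current value function
    qValues = []
    for action in actions:
        finalPosition, reward = actionValue(state, action)
        qValues.append(reward + gamma * valueMap[finalPosition[0], finalPosition[1]])
    best = int(np.argmax(qValues))
    pi[state[0], state[1]] = best
    pi1[state[0], state[1]] = actionChars[best]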
In [10]:
def improve_policy(pi, pi1, gamma, valueMap):
    policy_stable = True
    for state in states:
        old = pi[state].copy()
        # Greedify policy for state
        greedify_policy(state, pi, pi1, gamma, valueMap)
        if not np.array_equal(pi[state], old):
            policy_stable = False
    print(pi)
    print(pi1)
    return pi, pi1, policy_stable
In [11]:
def policy_iteration(gamma, theta):
    valueMap = np.zeros((gridSize, gridSize))
    pi = np.ones((gridSize, gridSize)) / 4
    pi1 = np.chararray((gridSize, gridSize))
    pi1[:] = 'a'
    policy_stable = False
    print("here")
    while not policy_stable:
        valueMap = policy_evaluate(states, actions, gamma, valueMap)
        pi, pi1, policy_stable = improve_policy(pi, pi1, gamma, valueMap)
    return valueMap, pi, pi1
In [12]:
theta=0.1
valueMap, pi,pi1 = policy_iteration(gamma, theta)
[[0. 3. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 1.]
[0. 0. 2. 0.]]
[[b'u' b'l' b'u' b'u']
[b'u' b'u' b'u' b'u']
[b'u' b'u' b'u' b'd']
[b'u' b'u' b'r' b'u']]
[[0. 3. 3. 0.]
[0. 0. 0. 1.]
[0. 0. 1. 1.]
[0. 2. 2. 0.]]
[[b'u' b'l' b'l' b'u']
[b'u' b'u' b'u' b'd']
[b'u' b'u' b'd' b'd']
[b'u' b'r' b'r' b'u']]
[[0. 3. 3. 1.]
[0. 0. 1. 1.]
[0. 0. 1. 1.]
[0. 2. 2. 0.]]
[[b'u' b'l' b'l' b'd']
[b'u' b'u' b'd' b'd']
[b'u' b'u' b'd' b'd']
[b'u' b'r' b'r' b'u']]
[[0. 3. 3. 1.]
[0. 0. 1. 1.]
[0. 0. 1. 1.]
[0. 2. 2. 0.]]
[[b'u' b'l' b'l' b'd']
[b'u' b'u' b'd' b'd']
[b'u' b'u' b'd' b'd']
[b'u' b'r' b'r' b'u']]
EXPERIMENT NO. 6
AIM : Write a program to do reinforcement learning in a grid world.
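One common way to do this is tabular Q-learning on the same 4x4 grid world used above. The sketch below is an illustration, not the original program; the learning rate, exploration rate and episode count are assumed values.

import numpy as np
import random

gridSize = 4
gamma = 1.0
alpha = 0.1          # learning rate (illustrative)
epsilon = 0.1        # exploration rate (illustrative)
actions = [[-1, 0], [1, 0], [0, 1], [0, -1]]      # up, down, right, left
terminationStates = [[0, 0], [gridSize - 1, gridSize - 1]]

def step(state, action):
    # Same dynamics as the grid world above: -1 per move, stay in place at walls
    nxt = [state[0] + action[0], state[1] + action[1]]
    if -1 in nxt or gridSize in nxt:
        nxt = state
    return nxt, -1

Q = np.zeros((gridSize, gridSize, len(actions)))

for episode in range(5000):
    state = [random.randint(0, gridSize - 1), random.randint(0, gridSize - 1)]
    while state not in terminationStates:
        # Epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(len(actions))
        else:
            a = int(np.argmax(Q[state[0], state[1]]))
        nxt, reward = step(state, actions[a])
        # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = reward + gamma * np.max(Q[nxt[0], nxt[1]])
        Q[state[0], state[1], a] += alpha * (target - Q[state[0], state[1], a])
        state = nxt

# The value map is the maximum Q-value in each state; terminal states stay 0
valueMap = Q.max(axis=2)
print(valueMap)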
Output:
The valueMap shows the optimal value of each state, from which the optimal path from any state can be read off.