AI - ML Lab Manual
Accredited by NAAC
KOVILVENNI - 614403
DEPARTMENT OF ELECTRONICS AND COMMUNICATION
ENGINEERING
LAB MANUAL
COURSE OBJECTIVES:
LIST OF EXPERIMENTS
TOTAL : 30 PERIODS
COURSE OUTCOMES
At the end of this course, the students will be able to:
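Exercise 1 Implementation of Breadth First Search and Depth First Search
Aim: To write a Python program that implements the Breadth First Search and Depth First Search algorithms.
Breadth First Search:
Algorithm: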
Step1: Initialize a queue and a set to keep track of visited nodes.
Step2: Enqueue the starting node into the queue and mark it as visited.
Step3: While the queue is not empty, dequeue a node from the queue.
Step4: Process the dequeued node.
Step5: Iterate through the neighbors of the dequeued node.
Step6: If a neighbor has not been visited, enqueue it into the queue and mark it as visited.
Program 1:
graph = {
    '5': ['3', '7'],
    '3': ['2', '4'],
    '7': ['8'],
    '2': [],
    '4': ['8'],
    '8': []
}
visited = []  # List of nodes that have already been discovered
queue = []    # FIFO queue of nodes waiting to be processed

def bfs(visited, graph, node):  # Function for BFS
    visited.append(node)
    queue.append(node)
    while queue:  # Loop until every reachable node has been processed
        m = queue.pop(0)  # Dequeue the oldest node
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')  # Function call
Output:
Following is the Breadth-First Search
5 3 7 2 4 8
Depth First Search:
Depth-First Search is a recursive algorithm that uses backtracking. It follows each branch as deep as possible
and backtracks only when the current node has no unvisited neighbours.
1. Start by pushing any one of the graph's vertices onto the stack.
2. Pop the top item of the stack and add it to the visited list.
3. Build a list of the vertices adjacent to that vertex and push the ones that are not in the visited list onto
the stack.
4. Repeat steps 2 and 3 until the stack is empty.
Program 2 :
graph = {
    '5': ['3', '7'],
    '3': ['2', '4'],
    '7': ['8'],
    '2': [],
    '4': ['8'],
    '8': []
}
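The listing for Program 2 contains only the graph definition. A minimal sketch of a recursive traversal that could complete it is given here. The function name dfs, the visited set and the order list are assumptions and not part of the original listing; the recursion plays the role of the explicit stack described in the steps.

```python
# Same adjacency list as in Program 2.
graph = {
    '5': ['3', '7'],
    '3': ['2', '4'],
    '7': ['8'],
    '2': [],
    '4': ['8'],
    '8': []
}

def dfs(graph, node, visited=None, order=None):
    """Recursive depth-first traversal; returns the nodes in visiting order."""
    if visited is None:
        visited, order = set(), []
    if node not in visited:
        visited.add(node)
        order.append(node)
        print(node)
        # Go as deep as possible along each neighbour before trying the next one.
        for neighbour in graph[node]:
            dfs(graph, neighbour, visited, order)
    return order

print("Following DFS")
dfs_order = dfs(graph, '5')
```

Starting from '5', this sketch visits the nodes in the order 5, 3, 2, 4, 8, 7.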
Output:
Following DFS
5
3
2
4
8
7
Result:
Thus the Python Program for implementing BFS & DFS is executed successfully.
Exercise 2 Implementation of A* search & Memory bounded A* search
A* Search:
1. Initialize Open and Closed Sets: Create two sets, often implemented as priority queues, to keep track of
nodes that are being considered for expansion (Open Set) and nodes that have already been visited (Closed
Set).
2. Initialize Start Node: Create a node representing the initial state. Set its cost from the start node (g) to 0 and
its heuristic value (h) to an estimate of the cost to reach the goal from this node.
3. Add Start Node to Open Set: Add the start node to the Open Set.
4. Repeat Until Open Set is Empty or Goal is Found:
a. Pop the node with the lowest f value (f = g + h) from the Open Set. This node will be the current
node.
b. If the current node is the goal node, reconstruct the path from the start node to the goal node and
return it.
c. Otherwise, move the current node from the Open Set to the Closed Set.
d. Generate successor nodes of the current node and calculate their costs (g) and heuristic values (h).
e. For each successor node:
i. If the successor node is already in the Closed Set and the new path is not better, skip it.
ii. If the successor node is not in the Open Set or the new path is better, update the cost and
heuristic values, set the current node as its parent, and add it to the Open Set.
5. If the Open Set becomes empty and the goal node has not been found, the search fails (goal is unreachable).
6. Return the path from the start node to the goal node, if found.
Program:
def astaralgo(start_nod, stop_nod):
    open_set = {start_nod}   # Nodes discovered but not yet expanded
    closed_set = set()       # Nodes already expanded
    g = {}                   # Cost of the best known path from the start to each node
    parents = {}             # Predecessor of each node on that path
    g[start_nod] = 0
    parents[start_nod] = start_nod
    while len(open_set) > 0:
        n = None
        # Pick the open node with the lowest f = g + h
        for v in open_set:
            if n is None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v
        if n == stop_nod or get_neighbours(n) is None:
            pass
        else:
            for (m, weight) in get_neighbours(n):
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                else:
                    # A cheaper path to m was found: update it and reopen m if needed
                    if g[m] > g[n] + weight:
                        g[m] = g[n] + weight
                        parents[m] = n
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)
        if n is None:
            print("path doesn't exist")
            return None
        if n == stop_nod:
            # Walk back through the parents to rebuild the path
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_nod)
            path.reverse()
            print("path found:{}".format(path))
            return path
        open_set.remove(n)
        closed_set.add(n)
    print("path doesn't exist !")
    return None

def get_neighbours(v):
    if v in graph_nod:
        return graph_nod[v]
    else:
        return None

def heuristic(n):
    h_dist = {'a': 11, 'b': 6, 'c': 5, 'd': 7, 'e': 3, 'f': 6, 'g': 5, 'h': 3, 'i': 1, 'j': 0}
    return h_dist[n]

graph_nod = {
    'a': [('b', 6), ('f', 3)],
    'b': [('a', 6), ('c', 3), ('d', 2)],
    'c': [('b', 3), ('d', 1), ('e', 5)],
    'd': [('b', 2), ('c', 1), ('e', 8)],
    'e': [('c', 5), ('d', 8), ('i', 5), ('j', 5)],
    'f': [('a', 3), ('g', 1), ('h', 7)],
    'g': [('f', 1), ('i', 3)],
    'h': [('f', 7), ('i', 2)],
    'i': [('e', 5), ('g', 3), ('h', 2), ('j', 3)]
}
astaralgo('a', 'j')
Output:
Result:
Thus the Python Program for implementing A* search is executed successfully.
Exercise 3 Implementation of Naive Bayes Classifier model
Algorithm :
Naive Bayes classifier calculates the probability of an event in the following steps:
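The individual steps are not listed in this copy of the manual. A commonly used formulation, given here as an assumption rather than as the manual's own procedure, is: estimate the prior P(c) of each class, estimate the conditional probability P(x_i | c) of each attribute value, multiply these together, and choose the class with the largest product. The small weather table below is hypothetical data used only for illustration, and no smoothing is applied.

```python
from collections import Counter, defaultdict

# Hypothetical training rows: (outlook, windy, play)
rows = [
    ("sunny", "f", "no"), ("sunny", "t", "no"), ("overcast", "f", "yes"),
    ("rainy", "f", "yes"), ("rainy", "t", "no"), ("overcast", "t", "yes"),
    ("sunny", "f", "yes"), ("rainy", "f", "yes"),
]

def train(rows):
    """Count classes and, per (attribute index, class), the attribute values."""
    class_counts = Counter(r[-1] for r in rows)
    value_counts = defaultdict(Counter)
    for *attrs, label in rows:
        for i, value in enumerate(attrs):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts

def classify(attrs, class_counts, value_counts):
    """Return the most probable class and the unnormalised score of each class."""
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total                                  # prior P(c)
        for i, value in enumerate(attrs):
            score *= value_counts[(i, label)][value] / n   # P(x_i | c)
        scores[label] = score
    return max(scores, key=scores.get), scores

class_counts, value_counts = train(rows)
prediction, scores = classify(("rainy", "f"), class_counts, value_counts)
print(prediction, scores)
```

An attribute value that never occurs with a class drives that class's score to zero; Laplace smoothing is the usual remedy and is left out here to keep the sketch short.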
Program :
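from functools import reduce
import pandas as pd
import pprint

class Classifier():
    data = None
    class_attr = None
    priori = {}
    cp = {}
    hypothesis = None

    def __init__(self, filename=None, class_attr=None):
        self.data = pd.read_csv(filename, sep=',', header=0)
        self.class_attr = class_attr

    def calculate_priori(self):
        # Prior probability of each class = rows with that class / total rows
        class_values = list(set(self.data[self.class_attr]))
        class_data = list(self.data[self.class_attr])
        for i in class_values:
            self.priori[i] = class_data.count(i) / float(len(class_data))
        print("Priori Values: ", self.priori)

    # The definition of this method is missing from the listing. The body below is an
    # assumed reconstruction that estimates P(value | class) by counting rows.
    def calculate_conditional_probabilities(self, hypothesis):
        for i in self.priori:
            rows = self.data[self.data[self.class_attr] == i]
            self.cp[i] = {attr: (rows[attr] == value).sum() / float(len(rows))
                          for attr, value in hypothesis.items()}
        pprint.pprint(self.cp)

    # The "def classify" line is also missing from the listing; only its loop survives.
    def classify(self):
        for i in self.cp:
            print(i, "==>", reduce(lambda x, y: x*y, self.cp[i].values()) * self.priori[i])

if __name__ == "__main__":
    c = Classifier(filename="form.csv", class_attr="Play")
    c.calculate_priori()
    c.hypothesis = {"Outlook": 'Rainy', "Temp": 'Mild', "Humidity": 'Normal', "Windy": 't'}
    c.calculate_conditional_probabilities(c.hypothesis)
    c.classify()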
Output:
Priori Values: {'no': 0.35714285714285715, 'yes': 0.6428571428571429}
Result:
Thus the Python Program for implementing Naive Bayes classification is executed successfully.
Exercise 4 Implementation of Bayesian Network
Description:
A Bayesian Network is a Probabilistic Graphical Modelling (PGM) technique that uses probability to
reason under uncertainty. Also known as Belief Networks, Bayesian Networks represent the dependencies
between random variables with Directed Acyclic Graphs (DAG): each node is a variable, and each edge
links a variable to one that depends on it.
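As a small illustration of how a DAG and its conditional probability tables yield a posterior, the sketch below evaluates a three-node network (Rain, Sprinkler, WetGrass) by plain enumeration, without any library. The network and every probability in it are hypothetical values chosen for the example; they are unrelated to the heart disease data used in the program.

```python
from itertools import product

# Hypothetical network: Rain -> Sprinkler, and (Sprinkler, Rain) -> WetGrass.
p_rain = {True: 0.2, False: 0.8}
# p_sprinkler[rain][sprinkler] = P(Sprinkler = sprinkler | Rain = rain)
p_sprinkler = {True: {True: 0.01, False: 0.99},
               False: {True: 0.4, False: 0.6}}
# p_wet[(sprinkler, rain)] = P(WetGrass = True | Sprinkler, Rain)
p_wet = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.8, (False, False): 0.0}

def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's CPD given its parents."""
    p_w = p_wet[(sprinkler, rain)]
    return p_rain[rain] * p_sprinkler[rain][sprinkler] * (p_w if wet else 1 - p_w)

def posterior_rain_given_wet():
    """P(Rain = True | WetGrass = True), summing out Sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
    return numerator / evidence

p = posterior_rain_given_wet()
print("P(Rain | WetGrass) = %.4f" % p)
```

Enumeration grows exponentially with the number of variables, which is why the program uses variable elimination instead.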
Program :
import numpy as np
import pandas as pd
import csv
from pgmpy.estimators import MaximumLikelihoodEstimator
from pgmpy.models import BayesianNetwork
from pgmpy.inference import VariableElimination
heartDisease = pd.read_csv('heart.csv')
heartDisease = heartDisease.replace('?',np.nan)
model = BayesianNetwork([('age', 'heartdisease'), ('gender', 'heartdisease'),
                         ('exang', 'heartdisease'), ('cp', 'heartdisease'),
                         ('heartdisease', 'restecg'), ('heartdisease', 'chol')])
print('\nLearning CPD using Maximum likelihood estimators')
model.fit(heartDisease,estimator=MaximumLikelihoodEstimator)
print('\n Inferencing with Bayesian Network:')
HeartDiseasetest_infer = VariableElimination(model)
q2=HeartDiseasetest_infer.query(variables=['heartdisease'],evidence={'cp':2})
print(q2)
Output:
Result:
Thus the Python Program for implementing Bayesian Network is executed successfully.
Exercise 5 Implementing Regression model
Description:
Linear regression is a regression model that estimates the relationship between one independent
variable and one dependent variable using a straight line.
Algorithm:
Step 1: Import the packages and classes that you need.
Step 2: Provide data to work with and, if necessary, apply appropriate transformations.
Step 3: Create a regression model and fit it with existing data.
Step 4: Check the results of model fitting to know whether the model is satisfactory.
Step 5: Apply the model for predictions.
[Linear Regression]
Program:
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    n = np.size(x)  # Number of observations
    m_x = np.mean(x)
    m_y = np.mean(y)
    # Cross-deviation and deviation about x
    SS_xy = np.sum(y*x) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x
    # Slope (b_1) and intercept (b_0) of the fitted line
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x
    return (b_0, b_1)

def plot_regression_line(x, y, b):
    # Plot the observed points and the fitted line
    plt.scatter(x, y, color="m", marker="o", s=30)
    y_pred = b[0] + b[1]*x
    plt.plot(x, y_pred, color="g")
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

def main():
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([1, 3, 2, 5, 7, 8, 8, 9, 10, 12])
    b = estimate_coef(x, y)
    print("Estimated coefficients:\nb_0 = {} \nb_1 = {}".format(b[0], b[1]))
    plot_regression_line(x, y, b)

if __name__ == "__main__":
    main()
Output:
Estimated coefficients:
b_0 = 1.2363636363636363
b_1 = 1.1696969696969697
[Logistic Regression]
Description:
Logistic regression is a data analysis technique that models the relationship between one or more input
variables and a categorical outcome. It uses this relationship to predict the outcome for new inputs. The
prediction usually has a finite number of outcomes, like yes or no.
Program:
import numpy
from sklearn import linear_model
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1,1)
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
logr = linear_model.LogisticRegression()
logr.fit(X,y)
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))
print(predicted)
Output: [0]
[Least Square Regression]
Description:
Least-squares regression is a technique used in regression analysis for ML models. It finds the best-fit
line, that is, the line that minimizes the sum of the squared differences between the observed and predicted
values, to describe the connection between the dependent and independent variables.
Program:
def calculateB(x, y, n):
    # Slope of the least-squares line
    sx = sum(x)
    sy = sum(y)
    sxsy = 0
    sx2 = 0
    for i in range(n):
        sxsy += x[i] * y[i]
        sx2 += x[i] * x[i]
    b = (n * sxsy - sx * sy) / (n * sx2 - sx * sx)
    return b

def leastRegLine(X, Y, n):
    b = calculateB(X, Y, n)
    # Use the exact means; truncating them with int() would distort the intercept
    meanX = sum(X) / n
    meanY = sum(Y) / n
    a = meanY - b * meanX
    print("Regression line:")
    print("Y = ", '%.3f' % a, " + ", '%.3f' % b, "*X", sep="")

X = [95, 85, 80, 70, 60]
Y = [90, 80, 70, 65, 60]
n = len(X)
leastRegLine(X, Y, n)
Output:
Regression line:
Y = 5.685 + 0.863*X
[Bayesian Regression]
Bayesian linear regression is a type of conditional modeling in which the mean of one variable is
described by a linear combination of other variables, with the goal of obtaining the posterior probability of the
regression coefficients.
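No program for this part survives in this copy. As a minimal sketch, and only under the assumption that scikit-learn's BayesianRidge estimator is an acceptable stand-in, the example below fits a Bayesian linear model to synthetic data and reports a predictive standard deviation alongside the predicted mean. The data (y = 2x + 1 plus noise) are hypothetical.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge

# Hypothetical synthetic data: y = 2x + 1 with Gaussian noise
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0 + rng.normal(scale=0.5, size=50)

model = BayesianRidge()
model.fit(X, y)

# return_std=True also returns the standard deviation of the predictive distribution
mean_pred, std_pred = model.predict(np.array([[5.0]]), return_std=True)
print("coefficient:", model.coef_[0], "intercept:", model.intercept_)
print("prediction at x = 5:", mean_pred[0], "+/-", std_pred[0])
```

The standard deviation is what distinguishes this from ordinary least squares: each prediction carries an estimate of its own uncertainty.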
Program:
Output:
Result:
Thus the Python Program for implementing Regression Model is executed successfully.
Exercise 6a. Decision Tree Classification
Program:
Output:
Result:
Thus the Python Program for implementing Decision tree classification is executed successfully.
Exercise 6b. Random Forest Classifier
Random forest Classifier is a commonly-used machine learning algorithm, which combines the output
of multiple decision trees to reach a single result. Its ease of use and flexibility have fueled its adoption,
as it handles both classification and regression problems.
Program:
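from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

# X and y are not defined in the original listing; the Iris dataset is assumed here.
X, y = load_iris(return_X_y=True)

rf = RandomForestClassifier(n_estimators=100,
                            max_depth=3,
                            min_samples_leaf=4,
                            bootstrap=True,
                            n_jobs=-1,
                            random_state=0)
rf.fit(X, y)

# Each fitted tree is available as rf.estimators_[index]; draw the first one.
plot_tree(rf.estimators_[0], filled=True)
plt.show()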
Result:
Thus the Python Program for implementing Random Forest Classifier is executed successfully.
Exercise 7: SVM Models
Aim:
The aim of this Python code is to demonstrate how to use the scikit-learn library to
train support vector machine (SVM) models for classification tasks.
Algorithm:
1. Load a dataset using the pandas library
2. Split the dataset into training and testing sets using train_test_split function from
scikit-learn
3. Train three SVM models with different kernels (linear, polynomial, and RBF) using
SVC function from scikit-learn
4. Predict the test set labels using the trained models
5. Evaluate the accuracy of the models using the accuracy_score function from scikit-learn
6. Print the accuracy of each model
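The steps above can be sketched as follows. The manual does not name the dataset, so the Iris data bundled with scikit-learn is assumed here in place of a CSV file loaded with pandas; the 70/30 split and random_state value are also assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Steps 1-2: load the (assumed) dataset and split it into training and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Steps 3-6: train one SVM per kernel, predict the test labels, report accuracy
accuracies = {}
for kernel in ("linear", "poly", "rbf"):
    model = SVC(kernel=kernel)
    model.fit(X_train, y_train)
    accuracies[kernel] = accuracy_score(y_test, model.predict(X_test))
    print(kernel, "Accuracy:", accuracies[kernel])
```

Comparing the three kernels on the same split shows whether a non-linear decision boundary is actually needed for the data at hand.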
Program:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
Output:
Accuracy: 0.9777777777777777
Result:
Thus the program for building SVM models has been executed successfully and the output is verified.
Exercise 8: Implementation of Ensembling techniques
Aim:
The aim of ensembling is to combine the predictions of multiple individual models,
known as base models, in order to produce a final prediction that is more accurate and
reliable than any individual model. This exercise uses bagging.
Algorithm:
1. Load the dataset and split it into training and testing sets.
2. Choose the base models to be included in the ensemble.
3. Train each base model on the training set.
4. Combine the predictions of the base models using the chosen ensembling technique
(bagging).
5. Evaluate the performance of the ensemble model on the testing set.
6. If the performance is satisfactory, deploy the ensemble model for making predictions
on new data.
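A minimal sketch of these steps with scikit-learn's BaggingClassifier is given below. The synthetic dataset parameters, the number of estimators and the 5-fold cross-validation are assumptions made for the example, so the accuracy it prints is not expected to match the figures recorded under Output.

```python
from numpy import mean, std
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

# Step 1: a synthetic binary classification dataset (assumed parameters)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=5)
print(X.shape, y.shape)

# Steps 2-4: bagging trains each base model (a decision tree by default) on a
# bootstrap sample and combines their predictions by voting
model = BaggingClassifier(n_estimators=10, random_state=1)

# Step 5: estimate accuracy with cross-validation
scores = cross_val_score(model, X, y, scoring='accuracy', cv=5)
print('Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))

# Step 6: fit on all the data and predict the class of one row
model.fit(X, y)
predicted = model.predict(X[:1])
print('Predicted Class: %d' % predicted[0])
```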
Program:
Output:
(1000, 20) (1000,)
Accuracy: 0.862 (0.042)
Predicted Class: 1
Result:
Thus the program for Bagging has been executed successfully and the output is verified.
Exercise 9: Implementation of Clustering Techniques
Aim:
The aim of clustering is to find patterns and structure in data that may not be
immediately apparent, and to discover relationships and associations between data points.
Algorithm:
1. Data preparation: The first step is to prepare the data that we want to cluster. This may involve
data cleaning, normalization, and feature extraction, depending on the type and quality of the data.
2. Choosing a distance metric: The next step is to choose a distance metric or similarity measure
that will be used to determine the similarity between data points. Common distance metrics
include Euclidean distance, Manhattan distance, and cosine similarity.
3. Choosing a clustering algorithm: There are many clustering algorithms available, each with its
own strengths and weaknesses. Some popular clustering algorithms include K-Means,
Hierarchical clustering, and DBSCAN.
4. Choosing the number of clusters: Depending on the clustering algorithm chosen, we may need to
specify the number of clusters we want to form. This can be done using domain knowledge or by
using techniques such as the elbow method or silhouette analysis.
5. Cluster assignment: Once the clusters have been formed, we need to assign each data point to its
nearest cluster based on the chosen distance metric.
6. Interpretation and evaluation: Finally, we need to interpret and evaluate the results of the
clustering algorithm to determine if the clustering has produced meaningful and useful insights.
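Step 4 mentions silhouette analysis as a way to choose the number of clusters. A minimal sketch of that idea is shown below on synthetic data; the three blob centres, the range of K values tried and the random_state values are assumptions made for the example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical data: three well-separated groups of points
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=1.0, random_state=0)

# Fit K-Means for several values of K and record the mean silhouette score
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The K with the highest score gives the most compact, best separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print("Best K:", best_k)
```

Unlike the elbow method, which relies on judging a bend in a curve by eye, the silhouette score gives a single number per K that can be compared directly.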
Program:
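import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Load the customer data into a DataFrame
customer_df = pd.read_csv('customer_data.csv')
# Check the first 5 rows
print(customer_df.head())

# Age against spending score
plt.scatter(customer_df["Age"],
            customer_df["Spending Score (1-100)"])
plt.xlabel("Age")
plt.ylabel("Spending Score (1-100)")
plt.show()

# Age against annual income
plt.scatter(customer_df["Age"],
            customer_df["Annual Income (k$)"])
plt.xlabel("Age")
plt.ylabel("Annual Income (k$)")
plt.show()

# Keep only the columns used for clustering and standardize them
relevant_cols = ["Age", "Annual Income (k$)", "Spending Score (1-100)"]
customer_df = customer_df[relevant_cols].copy()
scaler = StandardScaler()
scaler.fit(customer_df)
scaled_data = scaler.transform(customer_df)

def find_best_clusters(df, maximum_K):
    clusters_centers = []  # Inertia of the fitted model for each K
    k_values = []
    # The loop header is missing from the listing; range(1, maximum_K) is assumed.
    for k in range(1, maximum_K):
        kmeans_model = KMeans(n_clusters=k)
        kmeans_model.fit(df)
        clusters_centers.append(kmeans_model.inertia_)
        k_values.append(k)
    return clusters_centers, k_values

def generate_elbow_plot(clusters_centers, k_values):
    # The body is missing from the listing; a plain inertia-versus-K plot is assumed.
    plt.plot(k_values, clusters_centers, 'o-')
    plt.xlabel("Number of clusters (K)")
    plt.ylabel("Inertia")
    plt.show()

# The call that builds these two lists is missing; an upper limit of 12 is assumed.
clusters_centers, k_values = find_best_clusters(scaled_data, 12)
generate_elbow_plot(clusters_centers, k_values)

kmeans_model = KMeans(n_clusters=5)
kmeans_model.fit(scaled_data)
customer_df["clusters"] = kmeans_model.labels_
print(customer_df.head())
plt.scatter(customer_df["Spending Score (1-100)"],
            customer_df["Annual Income (k$)"],
            c=customer_df["clusters"])
plt.show()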
Result:
Thus the K-Means program is executed successfully and output is verified.
Exercise 10: Implementation of EM algorithm
Aim :
To implement Expectation- Maximization algorithm in Python.
What is an EM algorithm?
Algorithm:
1st Step: The first step is to initialize the parameter values. The system is also given incomplete
observed data, under the assumption that the data come from a specific model.
2nd Step: This step is known as Expectation or E-step. It uses the observed data and the current
parameter values to estimate the values of the missing or incomplete data. The E-step primarily updates
the latent variables.
3rd Step: This step is known as Maximization or M-step. It uses the complete data obtained from the
2nd step to update the parameter values. The M-step primarily updates the hypothesis.
4th Step: The last step is to check whether the values of the latent variables are converging. If they
are, stop the process; otherwise, repeat from the 2nd step until convergence occurs.
Program :
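Before the two-dimensional program, the four steps above can be illustrated with a shorter one-dimensional sketch: a mixture of two Gaussians fitted with NumPy only. The synthetic data, the initial values and the convergence tolerance are assumptions made for this example and are not taken from the manual's program.

```python
import numpy as np

# Hypothetical 1-D data drawn from two Gaussians centred at 0 and 6
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.0, 300)])

def gaussian_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# 1st step: initialise the parameters (means, variances, mixing weights)
mu = np.array([data.min(), data.max()])
var = np.array([data.var(), data.var()])
weight = np.array([0.5, 0.5])

prev_ll = -np.inf
for iteration in range(200):
    # 2nd step (E-step): responsibility of each component for each point
    dens = weight * gaussian_pdf(data[:, None], mu, var)   # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # 3rd step (M-step): re-estimate the parameters from the responsibilities
    nk = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / nk
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / nk
    weight = nk / len(data)
    # 4th step: stop once the log-likelihood no longer changes
    ll = np.log(dens.sum(axis=1)).sum()
    if abs(ll - prev_ll) < 1e-6:
        break
    prev_ll = ll

print("means:", mu)
print("variances:", var)
print("weights:", weight)
```

The program that follows applies the same E-step and M-step to two-dimensional data, where each component has a mean vector and a covariance matrix.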
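import random  # for random.choice
import numpy as np  # import numpy
from numpy.linalg import inv  # for matrix inverse
import matplotlib.pyplot as plt  # import matplotlib.pyplot for plotting framework
from scipy.stats import multivariate_normal  # for generating pdf

m1 = [1, 1]  # consider a random mean and covariance value
m2 = [7, 7]
cov1 = [[3, 2], [2, 3]]
cov2 = [[2, -1], [-1, 2]]
# Generating 200 samples for each mean and covariance
x = np.random.multivariate_normal(m1, cov1, size=(200,))
y = np.random.multivariate_normal(m2, cov2, size=(200,))
d = np.concatenate((x, y), axis=0)

plt.figure(figsize=(10, 10))
plt.scatter(d[:, 0], d[:, 1], marker='o')
plt.axis('equal')
plt.xlabel('X-Axis', fontsize=16)
plt.ylabel('Y-Axis', fontsize=16)
plt.title('Ground Truth', fontsize=22)
plt.grid()
plt.show()

# Initial guesses: two random data points as means, the overall covariance for both
m1 = random.choice(d)
m2 = random.choice(d)
cov1 = np.cov(np.transpose(d))
cov2 = np.cov(np.transpose(d))
pi = 0.5

x1 = np.linspace(-4, 11, 200)
x2 = np.linspace(-4, 11, 200)
X, Y = np.meshgrid(x1, x2)
Z1 = multivariate_normal(m1, cov1)
Z2 = multivariate_normal(m2, cov2)
pos = np.empty(X.shape + (2,))  # a new array of given shape and type, without initializing entries
pos[:, :, 0] = X; pos[:, :, 1] = Y

## Expectation step
# Only the final "return" of this function survives in the listing. The rest is an
# assumed reconstruction: eval1[i] is the responsibility of the second component
# for point i, which is the convention the Maximization step relies on.
def Estep(lis1):
    m1, m2, cov1, cov2, pi = lis1
    pt1 = multivariate_normal.pdf(d, mean=m1, cov=cov1)
    pt2 = multivariate_normal.pdf(d, mean=m2, cov=cov2)
    w1 = pi * pt2
    w2 = (1 - pi) * pt1
    eval1 = w1 / (w1 + w2)
    return(eval1)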
## Maximization step
def Mstep(eval1):
    # Weighted means: component 1 uses (1 - eval1), component 2 uses eval1
    num_mu1, din_mu1, num_mu2, din_mu2 = 0, 0, 0, 0
    for i in range(0, len(d)):
        num_mu1 += (1-eval1[i]) * d[i]
        din_mu1 += (1-eval1[i])
        num_mu2 += eval1[i] * d[i]
        din_mu2 += eval1[i]
    mu1 = num_mu1/din_mu1
    mu2 = num_mu2/din_mu2
    # Weighted covariances; np.outer replaces the deprecated np.matrix product
    num_s1, din_s1, num_s2, din_s2 = 0, 0, 0, 0
    for i in range(0, len(d)):
        q1 = d[i] - mu1
        num_s1 += (1-eval1[i]) * np.outer(q1, q1)
        din_s1 += (1-eval1[i])
        q2 = d[i] - mu2
        num_s2 += eval1[i] * np.outer(q2, q2)
        din_s2 += eval1[i]
    s1 = num_s1/din_s1
    s2 = num_s2/din_s2
    # Mixing weight of component 2
    pi = sum(eval1)/len(d)
    lis2 = [mu1, mu2, s1, s2, pi]
    return(lis2)
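def plot(lis1):
    mu1 = lis1[0]
    mu2 = lis1[1]
    s1 = lis1[2]
    s2 = lis1[3]
    Z1 = multivariate_normal(mu1, s1)
    Z2 = multivariate_normal(mu2, s2)
    pos = np.empty(X.shape + (2,))  # a new array of given shape and type, without initializing entries
    pos[:, :, 0] = X; pos[:, :, 1] = Y
    # The listing is cut off here. The lines below are an assumed completion that
    # draws the data together with the contours of the two fitted densities.
    plt.figure(figsize=(10, 10))
    plt.scatter(d[:, 0], d[:, 1], marker='o')
    plt.contour(X, Y, Z1.pdf(pos), colors="r", alpha=0.5)
    plt.contour(X, Y, Z2.pdf(pos), colors="b", alpha=0.5)
    plt.axis('equal')
    plt.grid()
    plt.show()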
Exercise 11: Building Simple Neural Network Models
Aim:
The aim of building simple neural network (NN) models is to create a basic architecture that
can learn patterns from data and make predictions based on the input. This can involve defining the
structure of the NN, selecting appropriate activation functions, and tuning the hyperparameters to
optimize the performance of the model.
Algorithm:
1. Data preparation: Preprocess the data to make it suitable for training the NN. This may involve
normalizing the input data, splitting the data into training and validation sets, and encoding the output
variables if necessary.
2. Define the architecture: Choose the number of layers and neurons in the NN, and define the
activation functions for each layer. The input layer should have one neuron per input feature, and the
output layer should have one neuron per output variable.
3. Initialize the weights: Initialize the weights of the NN randomly, using a small value to avoid
saturating the activation functions.
4. Forward propagation: Feed the input data forward through the NN, applying the activation
functions at each layer, and compute the output of the NN.
5. Compute the loss: Calculate the error between the predicted output and the true output, using a
suitable loss function such as mean squared error or cross-entropy.
6. Backward propagation: Compute the gradient of the loss with respect to the weights, using the
chain rule and backpropagate the error through the NN to adjust the weights.
7. Update the weights: Adjust the weights using an optimization algorithm such as stochastic
gradient descent or Adam, and repeat steps 4-7 for a fixed number of epochs or until the performance
on the validation set stops improving.
8. Evaluate the model: Test the performance of the model on a held-out test set and report the
accuracy or other performance metrics.
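Steps 3 to 7 can be sketched in a few lines of NumPy. The example below is an illustration under stated assumptions, not the manual's program: the toy dataset (logical OR), the 2-4-1 layer sizes, the sigmoid activations, the mean squared error loss and the learning rate are all choices made for the sketch, and the validation and test splits of steps 1 and 8 are omitted because the dataset has only four rows.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data (hypothetical): the logical OR of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Steps 2-3: 2 inputs -> 4 hidden neurons -> 1 output, small random weights
W1 = rng.normal(scale=0.5, size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros((1, 1))

learning_rate = 0.5
losses = []
for epoch in range(5000):
    # Step 4: forward propagation
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Step 5: mean squared error loss
    losses.append(float(np.mean((out - y) ** 2)))
    # Step 6: backward propagation with the chain rule
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    grad_W2 = h.T @ d_out
    grad_b2 = d_out.sum(axis=0, keepdims=True)
    d_h = (d_out @ W2.T) * h * (1 - h)
    grad_W1 = X.T @ d_h
    grad_b1 = d_h.sum(axis=0, keepdims=True)
    # Step 7: gradient descent update
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1

print("first loss:", losses[0], "last loss:", losses[-1])
print("predictions:", out.ravel())
```

Plain full-batch gradient descent is used so that each step maps onto one line of the algorithm; an optimizer such as Adam would replace only the update in step 7.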
Program:
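from numpy import exp, array, random, dot

class NeuralNetwork():
    def __init__(self):
        # Seed the random number generator, so it generates the same numbers
        # every time the program runs.
        random.seed(1)
        # We model a single neuron, with 3 input connections and 1 output connection.
        # We assign random weights to a 3 x 1 matrix, with values in the range -1 to 1
        # and mean 0.
        self.synaptic_weights = 2 * random.random((3, 1)) - 1

    # Apart from the adjustment line and its comments, the methods below are missing
    # from the listing; they are an assumed reconstruction using a sigmoid activation.
    def __sigmoid(self, x):
        return 1 / (1 + exp(-x))

    def __sigmoid_derivative(self, x):
        return x * (1 - x)

    def think(self, inputs):
        return self.__sigmoid(dot(inputs, self.synaptic_weights))

    def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
        for iteration in range(number_of_training_iterations):
            output = self.think(training_set_inputs)
            error = training_set_outputs - output
            # Multiply the error by the input and again by the gradient of the Sigmoid curve.
            # This means less confident weights are adjusted more.
            # This means inputs, which are zero, do not cause changes to the weights.
            adjustment = dot(training_set_inputs.T, error * self.__sigmoid_derivative(output))
            self.synaptic_weights += adjustment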
Output :
Random starting synaptic weights:
[[-0.16595599]
[ 0.44064899]
[-0.99977125]]
New synaptic weights after training:
[[ 9.67299303]
[-0.2078435 ]
[-4.62963669]]
Considering new situation [1, 0, 0] -> ?:
[0.99993704]
Result:
Thus the simple Neural Network is built and executed successfully.
Exercise 12: Building of Deep Neural Networks
Deep Learning is a part of machine learning that deals with algorithms inspired by the structure and
function of the human brain. It uses artificial neural networks to build intelligent models and solve complex
problems. We mostly use deep learning with unstructured data.
Program :
from numpy import loadtxt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# load the dataset
dataset = loadtxt('pima-indians-diabetes.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_shape=(8,), activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10)
# evaluate the keras model
_, accuracy = model.evaluate(X, y)
print('Accuracy: %.2f' % (accuracy*100))
# fit the keras model on the dataset without progress bars
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# evaluate the keras model
_, accuracy = model.evaluate(X, y, verbose=0)
Output :
Accuracy: 75.00
Result :
Thus the Deep Neural network is built and executed successfully.