
UDAYA SCHOOL OF ENGINEERING

UDAYA NAGAR, VELLAMODI


Kanyakumari District

Regulation-2021
Department of Computer Science and Engineering
CS3491 ARTIFICIAL INTELLIGENCE & MACHINE LEARNING LAB

RECORD NOTE BOOK

Register No.

Name :
Semester : IV
Subject Name : ARTIFICIAL INTELLIGENCE & MACHINE LEARNING LAB
Subject Code : CS3491
UDAYA SCHOOL OF ENGINEERING

UDAYA NAGAR, VELLAMODI


Kanyakumari District

Certified that this is the Bonafide Record of work done by


Mr. / Miss . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . of the Fourth
Semester in Computer Science and Engineering of this college, in the
ARTIFICIAL INTELLIGENCE & MACHINE LEARNING LAB (CS3491)
during 2024-2025 in partial fulfillment of the requirements of the B.E. / B.Tech
degree course of the Anna University, Chennai.

Staff – in – Charge Head of the Department

University Reg. No . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
University Examination held on . . . . . . . . . . . . . . . . . . . .

Internal Examiner External Examiner


SL NO DATE NAME OF THE EXPERIMENT MARKS SIGN.
Ex No: 1 IMPLEMENTATION OF UNINFORMED SEARCH ALGORITHMS (BFS & DFS)

Aim:
To implement uninformed search algorithms such as BFS and DFS.

Algorithm (BFS):
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state)
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Program:
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = []   # List for visited nodes
queue = []     # Initialize a queue

def bfs(visited, graph, node):           # function for BFS
    visited.append(node)
    queue.append(node)
    while queue:                         # loop to visit each node
        m = queue.pop(0)
        print(m, end=" ")
        for neighbour in graph[m]:
            if neighbour not in visited:
                visited.append(neighbour)
                queue.append(neighbour)

# Driver Code
print("Following is the Breadth-First Search")
bfs(visited, graph, '5')                 # function calling

Output:
Following is the Breadth-First Search
5 3 7 2 4 8
Algorithm (DFS):
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until STACK is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbours of N that are in the ready state (whose STATUS = 1) and set their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
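The steps above describe DFS with an explicit stack, while the program that follows uses recursion. As a complementary illustration only (not part of the original manual program), a minimal iterative sketch using a stack over the same adjacency-list dictionary is given below; the name dfs_iterative is illustrative.

# Iterative DFS with an explicit stack (illustrative sketch; uses the same
# adjacency-list dictionary as the programs in this experiment)
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

def dfs_iterative(graph, start):
    visited = set()
    stack = [start]                      # the stack holds waiting nodes
    while stack:
        node = stack.pop()               # pop the top node N
        if node not in visited:
            print(node, end=" ")
            visited.add(node)
            # push unvisited (ready state) neighbours of N
            for neighbour in reversed(graph[node]):
                if neighbour not in visited:
                    stack.append(neighbour)

dfs_iterative(graph, '5')                # prints: 5 3 2 4 8 7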
Program:
# Using a Python dictionary to act as an adjacency list
graph = {
    '5' : ['3','7'],
    '3' : ['2', '4'],
    '7' : ['8'],
    '2' : [],
    '4' : ['8'],
    '8' : []
}

visited = set()   # Set to keep track of visited nodes of the graph

def dfs(visited, graph, node):           # function for DFS
    if node not in visited:
        print(node)
        visited.add(node)
        for neighbour in graph[node]:
            dfs(visited, graph, neighbour)

# Driver Code
print("Following is the Depth-First Search")
dfs(visited, graph, '5')

Output:
Following is the Depth-First Search
5
3
2
4
8
7

Result:
Thus the uninformed search algorithms such as BFS and DFS have been executed
successfully and the output got verified.
Ex No: 2 IMPLEMENTATION OF INFORMED SEARCH ALGORITHMS (A*)

Aim:

To implement the informed search algorithm A*.

Algorithm (A*):

1. Initialize the open list.
2. Initialize the closed list; put the starting node on the open list (you can leave its f at zero).
3. While the open list is not empty:
   a) find the node with the least f on the open list, call it "q"
   b) pop q off the open list
   c) generate q's successors and set their parents to q
   d) for each successor:
      i)  if the successor is the goal, stop the search
      ii) else, compute both g and h for the successor:
          successor.g = q.g + distance between successor and q
          successor.h = heuristic distance from the successor to the goal
          successor.f = successor.g + successor.h
      iii) if a node with the same position as the successor is in the OPEN list with a lower f, skip this successor
      iv) if a node with the same position as the successor is in the CLOSED list with a lower f, skip this successor; otherwise, add the node to the open list
      (end for loop)
   e) push q on the closed list
   (end while loop)
Program:
def aStarAlgo(start_node, stop_node):
    open_set = set(start_node)
    closed_set = set()
    g = {}                      # store distance from the starting node
    parents = {}                # parents contains an adjacency map of all nodes
    # distance of the starting node from itself is zero
    g[start_node] = 0
    # start_node is the root node, i.e. it has no parent nodes,
    # so start_node is set as its own parent
    parents[start_node] = start_node
    while len(open_set) > 0:
        n = None
        # node with the lowest f() is found
        for v in open_set:
            if n == None or g[v] + heuristic(v) < g[n] + heuristic(n):
                n = v
        if n == stop_node or Graph_nodes[n] == None:
            pass
        else:
            for (m, weight) in get_neighbors(n):
                # nodes 'm' not in the open or closed set are added to the open set,
                # and n is set as their parent
                if m not in open_set and m not in closed_set:
                    open_set.add(m)
                    parents[m] = n
                    g[m] = g[n] + weight
                # for each node m, compare its distance from start, i.e. g(m),
                # to the distance from start through node n
                else:
                    if g[m] > g[n] + weight:
                        # update g(m)
                        g[m] = g[n] + weight
                        # change parent of m to n
                        parents[m] = n
                        # if m is in the closed set, remove it and add it to the open set
                        if m in closed_set:
                            closed_set.remove(m)
                            open_set.add(m)
        if n == None:
            print('Path does not exist!')
            return None

        # if the current node is the stop_node,
        # then we begin reconstructing the path from it to the start_node
        if n == stop_node:
            path = []
            while parents[n] != n:
                path.append(n)
                n = parents[n]
            path.append(start_node)
            path.reverse()
            print('Path found: {}'.format(path))
            return path
        # remove n from the open set and add it to the closed set
        # because all of its neighbours were inspected
        open_set.remove(n)
        closed_set.add(n)
    print('Path does not exist!')
    return None

# define a function to return the neighbours of the passed node
# and their distances
def get_neighbors(v):
    if v in Graph_nodes:
        return Graph_nodes[v]
    else:
        return None

# for simplicity we'll consider the heuristic distances as given,
# and this function returns the heuristic distance for each node
def heuristic(n):
    H_dist = {
        'A': 11,
        'B': 6,
        'C': 5,
        'D': 7,
        'E': 3,
        'F': 6,
        'G': 5,
        'H': 3,
        'I': 1,
        'J': 0
    }
    return H_dist[n]

#Describe your graph here


Graph_nodes = {
'A': [('B', 6), ('F', 3)],
'B': [('A', 6), ('C', 3), ('D', 2)],
'C': [('B', 3), ('D', 1), ('E', 5)],
'D': [('B', 2), ('C', 1), ('E', 8)],
'E': [('C', 5), ('D', 8), ('I', 5), ('J', 5)],
'F': [('A', 3), ('G', 1), ('H', 7)],
'G': [('F', 1), ('I', 3)],
'H': [('F', 7), ('I', 2)],
'I': [('E', 5), ('G', 3), ('H', 2), ('J', 3)],
}

aStarAlgo('A', 'J')
Output:

Path found: ['A', 'F', 'G', 'I', 'J']

Result:
Thus the program to implement the informed search algorithm (A*) has been executed
successfully and the output got verified.
Ex No: 3 IMPLEMENT NAÏVE BAYES MODELS

Aim:

To implement the Naïve Bayes classifier algorithm on the scikit-learn wine dataset and
evaluate its prediction accuracy.

Algorithm:

Steps in Naïve Bayes Classifier Algorithm:

1. Read the training dataset T;


2. Calculate the mean and standard deviation of the predictor variables in
each class;
3. Repeat: calculate the probability of fi using the Gaussian density equation for each
class, until the probabilities of all predictor variables (f1, f2, f3, ..., fn) have been
calculated.
4. Calculate the likelihood for each class;
5. Get the greatest likelihood;
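For reference, the Gaussian density used in step 3 is the standard formula (it is not written out in the manual):

P(f_i \mid c) = \frac{1}{\sqrt{2\pi\sigma_c^{2}}} \exp\left( -\frac{(f_i - \mu_c)^{2}}{2\sigma_c^{2}} \right)

where \mu_c and \sigma_c are the mean and standard deviation of the feature f_i in class c, as computed in step 2.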
Program:
#Import scikit-learn dataset library
from sklearn import datasets
#Load dataset
wine = datasets.load_wine()
# print the names of the 13 features
print("Features: ", wine.feature_names)
# print the label type of wine(class_0, class_1, class_2)
print("Labels: ", wine.target_names)
# print the shape of the data (features)
print(wine.data.shape)
# print the wine data features (top 5 records)
print(wine.data[0:5])
# print the wine labels (0: class_0, 1: class_1, 2: class_2)
print(wine.target)
# Import train_test_split function
from sklearn.model_selection import train_test_split
# Split dataset into training set and test set
X_train, X_test, y_train, y_test = train_test_split (wine.data, wine.target, test_size=0.3,
random_state=109)
# 70% training and 30% test
#Import Gaussian Naive Bayes model
from sklearn.naive_bayes import GaussianNB
#Create a Gaussian Classifier
gnb = GaussianNB()
#Train the model using the training sets
gnb.fit(X_train, y_train)
#Predict the response for test dataset
y_pred = gnb.predict(X_test)
# Evaluating model
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics
# Model Accuracy
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Output:

Display features and labels in the dataset:


Features: ['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids',
'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines',
'proline']
Labels: ['class_0' 'class_1' 'class_2']

Display the shape of the dataset:


(178, 13)

Display the top 5 records in the dataset:


[[1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]
[1.320e+01 1.780e+00 2.140e+00 1.120e+01 1.000e+02 2.650e+00 2.760e+00
2.600e-01 1.280e+00 4.380e+00 1.050e+00 3.400e+00 1.050e+03]
[1.316e+01 2.360e+00 2.670e+00 1.860e+01 1.010e+02 2.800e+00 3.240e+00
3.000e-01 2.810e+00 5.680e+00 1.030e+00 3.170e+00 1.185e+03]
[1.437e+01 1.950e+00 2.500e+00 1.680e+01 1.130e+02 3.850e+00 3.490e+00
2.400e-01 2.180e+00 7.800e+00 8.600e-01 3.450e+00 1.480e+03]
[1.324e+01 2.590e+00 2.870e+00 2.100e+01 1.180e+02 2.800e+00 2.690e+00
3.900e-01 1.820e+00 4.320e+00 1.040e+00 2.930e+00 7.350e+02]]

Display the labels in the dataset:


[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]

Model Accuracy:
Accuracy: 0.9074074074074074

Result:
Thus the program to classify the wine dataset with the Naïve Bayes classifier algorithm has
been executed successfully and the output got verified.
Ex No: 4 IMPLEMENT BAYESIAN NETWORKS

Aim:
To construct a Bayesian network and perform probabilistic inference on it using the pgmpy
library, demonstrated on the Monty Hall problem (Guest, Price, Host).

Algorithm:

1. Define the structure of the Bayesian network as a set of directed edges between variables.
2. Define the conditional probability distribution (CPD) of each variable given its parents.
3. Associate the CPDs with the network structure and validate the model.
4. Perform inference on the network (e.g. by variable elimination) to compute the posterior
probability of the query variable given the evidence.
Install Packages
pip install pgmpy
pip install networkx

Program
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
import networkx as nx
import pylab as plt
# Defining Bayesian Structure
model = BayesianNetwork([('Guest', 'Host'), ('Price', 'Host')])
# Defining the CPDs:
cpd_guest = TabularCPD('Guest', 3, [[0.33], [0.33], [0.33]])
cpd_price = TabularCPD('Price', 3, [[0.33], [0.33], [0.33]])
cpd_host = TabularCPD('Host', 3, [[0, 0, 0, 0, 0.5, 1, 0, 1, 0.5],
[0.5, 0, 1, 0, 0, 0, 1, 0, 0.5],
[0.5, 1, 0, 1, 0.5, 0, 0, 0, 0]],
evidence=['Guest', 'Price'], evidence_card=[3, 3])
# Associating the CPDs with the network structure.
model.add_cpds(cpd_guest, cpd_price, cpd_host)
model.check_model()

Output:
True

Program
# Infering the posterior probability
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
posterior_p = infer.query(['Host'], evidence={'Guest': 2, 'Price': 2})
print(posterior_p)

Output:
+---------+-------------+
| Host    |   phi(Host) |
+=========+=============+
| Host(0) |      0.5000 |
+---------+-------------+
| Host(1) |      0.5000 |
+---------+-------------+
| Host(2) |      0.0000 |
+---------+-------------+
Program
nx.draw(model, with_labels=True)
plt.savefig('model.png')
plt.close()

Output:
(The Bayesian network graph drawn with networkx is displayed and saved as model.png.)
Result:
Thus the program to construct a Bayesian network and perform inference on it has been
executed successfully and the output got verified.
Ex No: 5. BUILD REGRESSION MODELS

Aim:

To build regression models such as locally weighted linear regression and plot the
necessary graphs.

Algorithm:
1. Read the Given data Sample to X and the curve (linear or non-linear) to Y
2. Set the value for Smoothening parameter or free parameter say τ
3. Set the bias /Point of interest set x0 which is a subset of X
4. Determine the weight matrix using :
   w(x, x_0) = \exp\left( -\frac{(x - x_0)^{2}}{2\tau^{2}} \right)
5. Determine the value of the model term parameter β using:
   \hat{\beta}(x_0) = (X^{T} W X)^{-1} X^{T} W y
6. Prediction = x0*β.
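Note that the formula in step 4 uses a Gaussian kernel, while the program below uses the classical LOWESS tricube weights. As a hedged illustration of the Gaussian-kernel version described by the algorithm (the function and variable names here are illustrative and not part of the manual's program), a minimal sketch is:

import numpy as np

def locally_weighted_regression(x0, X, y, tau):
    # add a bias column so the local model has an intercept term
    Xb = np.c_[np.ones(len(X)), X]               # shape (n, 2)
    x0b = np.r_[1, x0]                           # query point with bias term
    # step 4: Gaussian kernel weights w = exp(-(x - x0)^2 / (2 tau^2))
    W = np.diag(np.exp(-(X - x0) ** 2 / (2 * tau ** 2)))
    # step 5: weighted normal equations, beta = (X^T W X)^-1 X^T W y
    beta = np.linalg.pinv(Xb.T @ W @ Xb) @ Xb.T @ W @ y
    # step 6: prediction at x0
    return x0b @ beta

# example usage on a noisy sine curve (the same kind of data the program below uses)
X = np.linspace(0, 2 * np.pi, 100)
y = np.sin(X) + 0.3 * np.random.randn(100)
tau = 0.5
y_pred = np.array([locally_weighted_regression(x0, X, y, tau) for x0 in X])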
Program:

from math import ceil
import numpy as np
from scipy import linalg

def lowess(x, y, f, iterations):
    n = len(x)
    r = int(ceil(f * n))
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
    w = (1 - w ** 3) ** 3
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iterations):
        for i in range(n):
            weights = delta * w[:, i]
            b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
            A = np.array([[np.sum(weights), np.sum(weights * x)],
                          [np.sum(weights * x), np.sum(weights * x * x)]])
            beta = linalg.solve(A, b)
            yest[i] = beta[0] + beta[1] * x[i]

        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta ** 2) ** 2

    return yest

import math
n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
iterations = 3
yest = lowess(x, y, f, iterations)

import matplotlib.pyplot as plt

plt.plot(x, y, "r.")
plt.plot(x, yest, "b-")
Output:
(Scatter plot of the noisy sine data in red dots with the fitted LOWESS curve drawn in blue.)
Result
Thus the program to build regression models has been executed successfully and the
output got verified.
Ex No: 6 BUILD LOGISTIC REGRESSION MODELS

Aim:

To build logistic regression models with suitable datasets.

Algorithm:

1. Import the required libraries


2. Read and understand the data
3. Exploratory data analysis
4. Data preparation
5. Building logistic regression models
6. Making predictions on test set
7. Assigning scores as per predicted probability values
Program:
# importing libraries
import statsmodels.api as sm
import pandas as pd
# loading the training dataset
data = pd.read_csv('pima_diabetes.csv', index_col = 0)
# defining the dependent and independent variables
Xtrain = data[['Glucose', 'BloodPressure', 'SkinThickness', 'Insulin', 'BMI',
'DiabetesPedigreeFunction','Age']]
ytrain = data[['Outcome']]
# building the model and fitting the data
log_reg = sm.Logit(ytrain, Xtrain).fit()
# printing the summary table
print(log_reg.summary())
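Steps 6 and 7 of the algorithm (making predictions and assigning probability scores) are not shown in the program above. A minimal hedged sketch of that part, assuming the fitted log_reg model and a test DataFrame Xtest with the same predictor columns (Xtest is an assumed name, not defined in the manual):

# predicted probabilities from the fitted statsmodels Logit model
pred_prob = log_reg.predict(Xtest)            # values between 0 and 1
# convert probabilities to class labels with a 0.5 cut-off
pred_label = (pred_prob >= 0.5).astype(int)
print(pred_prob[:5])
print(pred_label[:5])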

Output:

Result
Thus the program to build logistic regression models has been executed successfully and
the output got verified.
Ex No: 7 BUILD DECISION TREES
Aim :
To build decision trees with suitable datasets
Algorithm :
1. It begins with the original set S as the root node
2. On each iteration of the algorithm, it iterates through every unused attribute of the set
S and calculates the Entropy (H) and Information Gain (IG) of that attribute.
3. It then selects the attribute which has the smallest entropy or largest information gain
4. The set S is then split by the selected attribute to produce a subset of the data.
5. The algorithm then recurses on each subset, considering only attributes never
selected before.
Program:
import pandas
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
df = pandas.read_csv("data.csv")
print("Input:")
print(df.head(5))
d = {'UK':0,'USA':1,'N':2}
df['Nationality'] = df['Nationality'].map(d)
d = {'YES':1, 'NO':0}
df['Go'] = df['Go'].map(d)
print("Transformed Data:")
print(df.head(5))
features = ['Age','Experience','Rank','Nationality']
X = df[features]
y = df['Go']
dtree = DecisionTreeClassifier()
dtree = dtree.fit(X,y)
print(dtree.predict([[40,10,6,1]]))
print("[1]means 'Go'")
print("[0]means 'NO'")
DATA SET : (data.csv)
Age Experience Rank Nationality Go
36 10 9 UK NO
42 12 4 USA NO
23 4 6 N NO
52 4 4 USA NO
43 21 8 USA YES
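The program imports the tree module but does not draw the fitted tree. A minimal hedged sketch for visualising it, assuming the dtree and features defined above and that matplotlib is installed:

import matplotlib.pyplot as plt
from sklearn import tree

# draw the trained decision tree with feature names and class colouring
plt.figure(figsize=(10, 6))
tree.plot_tree(dtree, feature_names=features, class_names=['NO', 'YES'], filled=True)
plt.show()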
Output:

Result
Thus the program to build decision trees has been executed successfully and
the output got verified.
Ex No: 8 BUILD RANDOM FORESTS

Aim:
To implement the concept of random forest with suitable dataset from real world problems.
Algorithms:
1. State the questions and determine required data.
2. Acquire the data in an accessible format.
3. Identify and correct missing data points/anomalies as required.
4. Prepare the data for the machine learning model.
5. Establish a baseline model that you aim to exceed.
6. Train the model on the training data.
7. Make predictions on the test data.
8. Compare predictions to the known test set targets and calculate performance metrics.
9. If performance is not satisfactory, adjust the model, acquire more data, or try a different
modeling technique.
10. Interpret model and report results visually and numerically.
Program:

# Pandas is used for data manipulation


import pandas as pd
# Read in data and display first 5 rows
features = pd.read_csv('temps.csv')
features.head(5)

print('The shape of our features is:', features.shape)

# Descriptive statistics for each column


features.describe()

# One-hot encode the data using pandas get_dummies


features = pd.get_dummies(features)
# Display the first 5 rows of the last 12 columns
features.iloc[:,5:].head(5)

import numpy as np
# Labels are the values we want to predict
labels = np.array(features['actual'])
# Remove the labels from the features
# axis 1 refers to the columns
features= features.drop('actual', axis = 1)
# Saving feature names for later use
feature_list = list(features.columns)
# Convert to numpy array
features = np.array(features)

# Using Skicit-learn to split data into training and testing sets


from sklearn.model_selection import train_test_split
# Split the data into training and testing sets
train_features, test_features, train_labels, test_labels = train_test_split(features, labels, test_size =
0.25, random_state = 42)

print('Training Features Shape:', train_features.shape)


print('Training Labels Shape:', train_labels.shape)
print('Testing Features Shape:', test_features.shape)
print('Testing Labels Shape:', test_labels.shape)
# Import the model we are using
from sklearn.ensemble import RandomForestRegressor
# Limit depth of tree to 3 levels
rf_small = RandomForestRegressor(n_estimators=10, max_depth = 3)
# Train the model on training data
rf_small.fit(train_features, train_labels)

# Extract one small tree from the forest
from sklearn.tree import export_graphviz
import pydot

tree_small = rf_small.estimators_[5]
# Save the tree as a png image
export_graphviz(tree_small, out_file = 'small_tree.dot', feature_names = feature_list,
                rounded = True, precision = 1)
(graph, ) = pydot.graph_from_dot_file('small_tree.dot')
graph.write_png('small_tree.png')

# Use the forest's predict method on the test data


predictions = rf_small.predict(test_features)
# Calculate the absolute errors
errors = abs(predictions - test_labels)
# Print out the mean absolute error (mae)
print('Mean Absolute Error:', round(np.mean(errors), 2), 'degrees.')

# Calculate mean absolute percentage error (MAPE)


mape = 100 * (errors / test_labels)
# Calculate and display accuracy
accuracy = 100 - np.mean(mape)
print('Accuracy:', round(accuracy, 2), '%.')
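A common follow-up, not included in the manual's program, is to inspect which features the forest relies on. A minimal hedged sketch using the rf_small model and feature_list defined above:

# pair each feature name with its importance score and sort in descending order
importances = list(zip(feature_list, rf_small.feature_importances_))
importances.sort(key=lambda pair: pair[1], reverse=True)
for name, score in importances:
    print('{:25s} {:.3f}'.format(name, score))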

Output:

The shape of our features is: (348, 12)

Training Features Shape: (261, 17)


Training Labels Shape: (261,)
Testing Features Shape: (87, 17)
Testing Labels Shape: (87,)

RandomForestRegressor(max_depth=3, n_estimators=10)

Mean Absolute Error: 4.0 degrees.

Accuracy: 93.73 %.

Result
Thus the program to build random forests has been executed successfully and
the output got verified.
Ex no:9 BUILD SVM MODELS

Aim:
To create a machine learning model which classifies the given dataset using support vector
machine algorithm.
Algorithm:
1. First and foremost, load the appropriate sklearn modules and classes.
2. Next, perform feature scaling. The reason for feature scaling is to make sure that the data
for different features are in the same range. The StandardScaler class of
sklearn.preprocessing is used.
3. Next, instantiate an SVC and fit the model. The SVC class of the sklearn.svm module
is used.
4. Finally, measure the model performance. The code for doing this is given at the end of
the program below.
Program:
import pandas
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

data = pandas.read_csv("vector.csv")
print("Input: ")
print(data.head(10))

training_set, test_set = train_test_split(data, test_size = 0.3, random_state=1)

x_train = training_set.iloc[:,0:2].values
y_train = training_set.iloc[:,2].values
x_test = test_set.iloc[:,0:2].values
y_test = test_set.iloc[:,2].values

classifier = SVC(kernel='linear', random_state=1)


classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)
test_set["prediction"] = y_pred
print("Output")
print(test_set)

cm = confusion_matrix(y_test, y_pred)
accuracy = float(cm.diagonal().sum()/len(y_test))
print("\nAccuracy of SVM for the given dataset: ", accuracy)

Dataset
Output:

Result
Thus the program to build SVM models has been executed successfully and the
output got verified.
Ex No: 10 IMPLEMENT ENSEMBLING TECHNIQUES

Aim:
To implement ensembling techniques such as the voting classifier, bagging, and AdaBoost
using Python in a Jupyter environment.
Algorithm:
1. Split the train dataset into n parts.
2. A base model is fitted on n-1 parts and predictions are made for the nth part. This is done
for each of the n parts of the train set.
3. The base model is then fitted on the whole train dataset.
4. This model is used to predict the test dataset.
5. The steps 2 to 4 are repeated for another base model which results in another set of
predictions for the train and test dataset.
6. The predictions on train dataset are used as a feature to build the new model.
7. This final model is used to make the predictions on test dataset.
Program:
#Implement VotingClassifier
#Importing necessary libraries:
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

#Creating dataset:
X, y = make_moons(n_samples=500, noise=0.30)
X_train, X_test, y_train, y_test = train_test_split(X, y)

#Initializing the models:


log = LogisticRegression()
rnd = RandomForestClassifier(n_estimators=100)
svm = SVC()
voting = VotingClassifier(
estimators=[('logistics_regression', log), ('random_forest', rnd), ('support_vector_machine', svm)],
voting='hard')

#Fitting training data:


voting.fit(X_train, y_train)

#prediction using test data


for clf in (log, rnd, svm, voting):
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(clf.__class__.__name__, accuracy_score(y_test, y_pred))

#Implement BaggingClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

bagging_clf = BaggingClassifier(
DecisionTreeClassifier(), n_estimators=250,
max_samples=100, bootstrap=True, random_state=101)
#Fitting training data:
bagging_clf.fit(X_train, y_train)
#prediction using test data
y_pred = bagging_clf.predict(X_test)
print(accuracy_score(y_test, y_pred))

#Implement AdaBoostClassifier
from sklearn.ensemble import AdaBoostClassifier
adaboost_clf = AdaBoostClassifier(
DecisionTreeClassifier(max_depth=1), n_estimators=200,
algorithm="SAMME.R", learning_rate=0.5, random_state=42)

#Fitting training data:


adaboost_clf.fit(X_train, y_train)

#prediction using test data


y_pred = adaboost_clf.predict(X_test)
accuracy_score(y_test, y_pred)
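The algorithm above actually describes stacking rather than voting, bagging, or boosting. For completeness, a minimal hedged sketch of stacking with scikit-learn's StackingClassifier, reusing the X_train, X_test, y_train, y_test split and the imports from the programs above:

#Implement StackingClassifier (illustrative sketch, not part of the original manual program)
from sklearn.ensemble import StackingClassifier

stacking_clf = StackingClassifier(
    estimators=[('decision_tree', DecisionTreeClassifier(max_depth=3)),
                ('svc', SVC())],
    final_estimator=LogisticRegression(),   # meta-model fitted on out-of-fold predictions
    cv=5)                                   # the n-part split described in steps 1-2

#Fitting training data:
stacking_clf.fit(X_train, y_train)
#prediction using test data
y_pred = stacking_clf.predict(X_test)
print(accuracy_score(y_test, y_pred))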

Output:

#For VotingClassifier
LogisticRegression 0.848
RandomForestClassifier 0.88
SVC 0.896
VotingClassifier 0.896

#For BaggingClassifier
0.888

#For AdaBoostClassifier
0.864

Result
Thus the program to implement ensembling techniques has been executed successfully
and the output got verified.
Ex No: 11 IMPLEMENT CLUSTERING ALGORITHMS

Aim:
To implement the k-means clustering algorithm on a sample dataset and visualise the resulting clusters.
Algorithm:
1. Select the number K of clusters.
2. Initialise K centroids.
3. Assign each data point to its nearest centroid, using the Euclidean distance.
4. Recompute each centroid as the mean of the data points assigned to it.
5. Repeat steps 3 and 4 until the cluster assignments no longer change (or a maximum
number of iterations is reached).
6. Our model is ready.
Program:
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

data = {'x': [25,34,22,27,33,33,31,22,35,34,67,54,57,43,50,57,59,52,65,47,
              49,48,35,33,44,45,38,43,51,46],
        'y': [79,51,53,78,59,74,73,57,69,75,51,32,40,47,53,36,35,58,59,50,
              25,20,14,12,20,5,29,27,8,7]}

df = pd.DataFrame(data, columns=['x', 'y'])

kmeans = KMeans(n_clusters=3).fit(df)
centroids = kmeans.cluster_centers_
print(centroids)

plt.scatter(df['x'], df['y'], c=kmeans.labels_.astype(float), s=50, alpha=0.5)
plt.scatter(centroids[:, 0], centroids[:, 1], c='red', s=50)
plt.show()
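Step 1 of the algorithm asks for the number of clusters K. A common hedged way of choosing it, not part of the manual's program, is the elbow method on the same df:

# plot the within-cluster sum of squares (inertia) for several values of K;
# the "elbow" of the curve suggests a reasonable number of clusters
inertias = []
k_values = range(1, 10)
for k in k_values:
    inertias.append(KMeans(n_clusters=k).fit(df).inertia_)
plt.plot(k_values, inertias, 'bo-')
plt.xlabel('K')
plt.ylabel('Inertia')
plt.show()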

Output:
(The three centroid coordinates are printed, followed by a scatter plot of the points coloured by cluster with the centroids marked in red.)
Result
Thus the program to implement clustering algorithms has been executed successfully and
the output got verified.
Ex No: 12 IMPLEMENT GMM ALGORITHMS

Aim:
To implement the Gaussian Mixture Model (GMM) algorithm for clustering using the Iris dataset.

Algorithm:
Initialise the parameters θ randomly and repeat the following until convergence.
E-step:
    Compute q(h) = p(H = h | E = e; θ) for each h
    Create fully observed weighted examples (h, e) with weight q(h)
M-step:
    Run maximum likelihood estimation on the weighted examples to get the new θ
Program:
import matplotlib.pyplot as plt
from sklearn import datasets
import sklearn.metrics as sm
import pandas as pd
import numpy as np
%matplotlib inline
# import some data to play with
iris = datasets.load_iris()

#print("\n IRIS DATA :",iris.data);


#print("\n IRIS FEATURES :\n",iris.feature_names)
#print("\n IRIS TARGET :\n",iris.target)
#print("\n IRIS TARGET NAMES:\n",iris.target_names)
# Store the inputs as a Pandas Dataframe and set the column names
X = pd.DataFrame(iris.data)
#print(X)
X.columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
#print(X.columns)
#print("X:",x)
#print("Y:",y)
y = pd.DataFrame(iris.target)
y.columns = ['Targets']
# Set the size of the plot
plt.figure(figsize=(14,7))
# Create a colormap
colormap = np.array(['red', 'lime', 'black'])
# Plot Sepal
plt.subplot(1, 2, 1)
plt.scatter(X.Sepal_Length,X.Sepal_Width, c=colormap[y.Targets], s=40)
plt.title('Sepal')
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length,X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Petal')

# GMM
from sklearn import preprocessing
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns = X.columns)
xs.sample(5)
from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
y_cluster_gmm
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')
# Accuracy
sm.accuracy_score(y, y_cluster_gmm)
# Confusion Matrix
sm.confusion_matrix(y, y_cluster_gmm)

Output:

array([[50, 0, 0],
[ 0, 5, 45],
[ 0, 50, 0]], dtype=int64)

Result
Thus the program to implement the GMM algorithm has been executed successfully and the
output got verified.
Ex No: 13 BUILD DEEP LEARNING NEURAL NETWORK MODELS

Aim:
To build and train a deep learning neural network model that classifies handwritten digits
from the MNIST dataset using TensorFlow and Keras.

Algorithm:
1. Choose the dataset.
2. Prepare the dataset for training.
3. Create training data.
4. Shuffle the dataset.
5. Assigning Labels and features.
6. Normalizing X and converting labels to categorical data.
7. Split X and Y for use in the neural network.
8. Define, compile and train the model.
9. Accuracy and scope of the model.
Program:
import tensorflow as tf
from tensorflow import keras
mnist_data = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist_data.load_data()
print(x_test.shape)
print(x_train.shape)
x_train, x_test = x_train / 255.0, x_test / 255.0
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(input_shape=(28,28)),
tf.keras.layers.Dense(128,activation='relu'),
tf.keras.layers.Dropout(0.2),
tf.keras.layers.Dense(10,activation='softmax')])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)
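As a short illustration of using the trained model (not part of the manual's program), a hedged sketch that predicts the digit for the first test image:

import numpy as np

# predict class probabilities for the first test image and pick the most likely digit
probabilities = model.predict(x_test[:1])
predicted_digit = np.argmax(probabilities, axis=1)[0]
print("Predicted digit:", predicted_digit, " True label:", y_test[0])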

Output:
(10000, 28, 28)
(60000, 28, 28)
Epoch 1/5
1875/1875 [==============================] - 7s 3ms/step - loss: 0.0672 - accuracy:
0.9793
Epoch 2/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0578 - accuracy:
0.9811
Epoch 3/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0528 - accuracy:
0.9825
Epoch 4/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0500 - accuracy:
0.9833
Epoch 5/5
1875/1875 [==============================] - 6s 3ms/step - loss: 0.0453 - accuracy:
0.9844
313/313 [==============================] - 1s 3ms/step - loss: 0.0697 - accuracy: 0.9797
[0.06965507566928864, 0.9797000288963318]

Result
Thus the program to build a deep learning neural network model has been executed
successfully and the output got verified.
