ML LAB Manual
Since there
are 5 school days in a week, the probability that it is Friday is 20%. What is the
probability that a student is absent given that today is Friday? Apply Bayes' rule in
Python to get the result. (Ans: 15%)
Explanation:
=================================
F : Friday
A : Absent
P(F) = 20% = 0.2
Then, by Bayes' rule,
P(A | F) = P(A ∩ F) / P(F)
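The calculation can be sketched in Python as follows. The joint probability P(A ∩ F) = 0.03 is an assumption here: it is the value implied by the stated answer (15%) together with P(F) = 20%, and the variable names are illustrative.

```python
# Probability that today is Friday (5 school days in a week)
p_friday = 0.20
# ASSUMPTION: joint probability of "Friday and absent", inferred from
# the stated answer (15%) and P(F) = 20%
p_absent_and_friday = 0.03

# Conditional probability: P(A | F) = P(A ∩ F) / P(F)
p_absent_given_friday = p_absent_and_friday / p_friday

print("P(A | F) = {:.0%}".format(p_absent_given_friday))
```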
Output:
Explanation:
===> First, you need to create a table (students) in the MySQL database (SampleDB).
---> Open a command prompt and execute the following command to enter the
MySQL prompt.
---> Then execute the following commands at the MySQL prompt to create the
table in the database.
===> Next, open a command prompt and execute the following command to install
the mysql.connector package (distributed on PyPI as mysql-connector-python),
which connects Python to the MySQL database.
===============================
Source Code :
===============================
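import mysql.connector
# Connect to the SampleDB database; host, user and password below are
# placeholders, replace them with your own MySQL credentials
myconn = mysql.connector.connect(host="localhost", user="root",
                                 password="your_password", database="SampleDB")
cur = myconn.cursor()
# Fetch and display every row of the students table
cur.execute("SELECT * FROM students")
result = cur.fetchall()
for x in result:
    print(x)
myconn.commit()
myconn.close()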
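The same connect / cursor / execute / fetchall pattern can be tried without a MySQL server. The sketch below swaps in Python's built-in sqlite3 module as a stand-in for MySQL; the column names and rows of the students table are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# HYPOTHETICAL schema and rows for the students table
cur.execute("CREATE TABLE students (sid INTEGER PRIMARY KEY, sname TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO students (sid, sname, age) VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 21)],
)
conn.commit()  # make the inserts permanent

# Read the rows back in a fixed order and display them
cur.execute("SELECT * FROM students ORDER BY sid")
result = cur.fetchall()
for row in result:
    print(row)

conn.close()
```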
Aim: Implement k-nearest neighbours classification using Python.
Explanation:
===> To run this program you need to install the sklearn module (the scikit-learn
package).
===> Open a command prompt and execute the following command to install the
sklearn module.
In this program we use the iris dataset, which is split into a training set (70%)
and a test set (30%).
Each sample in the iris dataset has four feature values, for example [5.4 3.4 1.7 0.2].
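To show what the classifier does internally, here is a minimal from-scratch sketch of k-nearest neighbours with majority voting. It is not the scikit-learn implementation; the training rows are a handful of samples copied from this program's sample output, and the function name knn_predict is illustrative.

```python
import math
from collections import Counter

# Labelled samples copied from the sample output of this program
# (sepal length, sepal width, petal length, petal width)
train = [
    ([4.6, 3.4, 1.4, 0.3], "setosa"),
    ([4.9, 3.1, 1.5, 0.1], "setosa"),
    ([5.4, 3.4, 1.5, 0.4], "setosa"),
    ([6.1, 3.0, 4.6, 1.4], "versicolor"),
    ([6.3, 3.3, 4.7, 1.6], "versicolor"),
    ([6.4, 3.2, 4.5, 1.5], "versicolor"),
    ([6.9, 3.2, 5.7, 2.3], "virginica"),
]

def knn_predict(train, query, k=3):
    """Return the majority label among the k nearest training points."""
    # Sort training points by Euclidean distance to the query
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Vote among the k closest labels
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

print(knn_predict(train, [5.0, 3.3, 1.4, 0.3]))
```

With so few training rows this only illustrates the voting step; the program itself fits the classifier on the full 70% training split.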
Source Code :
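import random
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# Loading data
data_iris = load_iris()
label_target = data_iris.target_names
print()
print("*"*30)
# Display ten randomly chosen samples with their class labels
for i in range(10):
    rn = random.randint(0,120)
    print(data_iris.data[rn],"===>",label_target[data_iris.target[rn]])
X = data_iris.data
y = data_iris.target
# Split the data into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
print("The Training dataset length:", len(X_train))
print("The Testing dataset length:", len(X_test))
try:
    nn = int(input("Enter number of neighbors :"))
    knn = KNeighborsClassifier(nn)
    knn.fit(X_train, y_train)
    # Accuracy on the held-out test set
    print("The Score is :", knn.score(X_test, y_test))
    # Read four comma-separated feature values and convert them to floats
    test_data = input("Enter Test Data :").split(",")
    for i in range(len(test_data)):
        test_data[i] = float(test_data[i])
    print()
    v = knn.predict([test_data])
    print("Predicted output is :", label_target[v])
except ValueError:
    print("Please supply valid input.")
OUTPUT: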
******************************
[4.6 3.4 1.4 0.3] ===> setosa
[6.1 3. 4.6 1.4] ===> versicolor
[6.3 3.3 4.7 1.6] ===> versicolor
[4.9 3.1 1.5 0.1] ===> setosa
[6.9 3.2 5.7 2.3] ===> virginica
[6.4 3.2 4.5 1.5] ===> versicolor
[5.4 3.4 1.5 0.4] ===> setosa
[5.9 3.2 4.8 1.8] ===> versicolor
[5.4 3. 4.5 1.5] ===> versicolor
[7. 3.2 4.7 1.4] ===> versicolor
The Training dataset length: 105
The Testing dataset length: 45
Enter number of neighbors :10
The Score is : 0.9777777777777777
Enter Test Data :5.0,3.3,1.4,0.3
Predicted output is : ['setosa']
Aim: Given the following data, which specify classifications for nine
combinations of VAR1 and VAR2, predict a classification for a case where
VAR1 = 0.906 and VAR2 = 0.606, using the result of k-means clustering with 3
means (i.e., 3 centroids).
Explanation:
===> To run this program you need to install the sklearn module (the scikit-learn
package).
===> Open a command prompt and execute the following command to install the
sklearn module.
Finally, you need to predict the class for VAR1 = 0.906 and VAR2 = 0.606.
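The clustering step can be sketched without any library as plain Lloyd's k-means. This is only an illustration under stated assumptions: the nine (VAR1, VAR2) points below are hypothetical placeholders, not the data of the exercise, and the helper names nearest and kmeans are made up for this sketch. Seeding the centroids with the first k points keeps the run deterministic.

```python
import math

# HYPOTHETICAL (VAR1, VAR2) points, for illustration only; the first
# three are used as the initial centroids
points = [
    (0.2, 1.8), (1.5, 0.8), (1.7, 1.6),
    (0.4, 1.3), (0.5, 1.7), (1.6, 0.4),
    (1.3, 0.6), (1.4, 1.5), (1.8, 1.8),
]

def nearest(centroids, p):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], p))

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: group points by nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # assignments no longer change
            break
        centroids = new_centroids
    return centroids

centroids = kmeans(points, k=3)
cluster = nearest(centroids, (0.906, 0.606))
print("Centroids:", centroids)
print("Cluster index for VAR1=0.906, VAR2=0.606:", cluster)
```

To turn the cluster index into a classification, one would then take the majority class among the labelled points that fall in the same cluster.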
Source Code :
OUTPUT:
Input attributes are (from left to right) income, recreation, job, status, age-group,
home-owner. Find the unconditional probability of 'golf' and the conditional
probability of 'single' given 'medRisk' in the dataset
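Explanation:
Unconditional probability of 'golf':
P(golf) = number of 'golf' records / total number of records
= 4 / 10
= 0.4
Conditional probability of 'single' given 'medRisk':
---> S : single
---> MR : medRisk
P(S | MR) = P(S ∩ MR) / P(MR)
P(S ∩ MR) = number of records that are both medRisk and single / total number of
records
= 2 / 10 = 0.2 and
P(MR) = number of medRisk records / total number of records
= 3 / 10 = 0.3
P(S | MR) = 0.2 / 0.3
= 0.66666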
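As a quick check of the arithmetic, the same ratio can be computed with exact fractions. This is a small sketch using the counts stated above (2 medRisk-and-single records, 3 medRisk records, 10 records in total); the variable names are illustrative.

```python
from fractions import Fraction

total_records = 10
p_single_and_medrisk = Fraction(2, total_records)  # P(S ∩ MR)
p_medrisk = Fraction(3, total_records)             # P(MR)

# P(S | MR) = P(S ∩ MR) / P(MR)
p_single_given_medrisk = p_single_and_medrisk / p_medrisk

print(p_single_given_medrisk)         # exact fraction
print(float(p_single_given_medrisk))  # decimal approximation
```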
Source Code :
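total_Records=10
# unconditional probability of 'golf'
numGolfRecords=4
unConditionalprobGolf=numGolfRecords / total_Records
print("Unconditional probability of golf =", unConditionalprobGolf)
# conditional probability of 'single' given 'medRisk'
numMedRiskSingle=2
numMedRisk=3
probMedRiskSingle=numMedRiskSingle/total_Records
probMedRisk=numMedRisk/total_Records
conditionalProb=(probMedRiskSingle/probMedRisk)
print("Conditional probability of single given medRisk =", conditionalProb)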
OUTPUT:
Explanation:
===> To run this program you need to install the pandas module; the program also
imports numpy and matplotlib.
===> To install, open a command prompt and execute the following command.
Source Code :
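import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
dataFrame = pd.read_csv('Age_Income.csv')
age = dataFrame['Age']
income = dataFrame['Income']
# number of points
num = np.size(age)
# mean of age and income
mean_age = np.mean(age)
mean_income = np.mean(income)
# cross-deviation of age and income, and deviation of age about its mean
CD_ageincome = np.sum(income*age) - num*mean_income*mean_age
CD_ageage = np.sum(age*age) - num*mean_age*mean_age
# regression coefficients (slope b1 and intercept b0)
b1 = CD_ageincome / CD_ageage
b0 = mean_income - b1*mean_age
# to display coefficients
print("Estimated Coefficients:")
print("b0 =", b0)
print("b1 =", b1)
# plot the data points and the fitted regression line
plt.scatter(age, income, color="b", marker="o")
response_Vec = b0 + b1*age
plt.plot(age, response_Vec, color="r")
# Placing labels
plt.xlabel('Age')
plt.ylabel('Income')
# To display plot
plt.show()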
Dataset (Age_Income.csv):
Age Income
34 40000
23 30000
67 15000
20 15000
24 25000
23 22000
45 34000
54 50000
43 38000
34 30000
40 40000
33 56000
46 44000
56 45000
19 20000
OUTPUT:
Estimated Coefficients:
b0 = 22586.705594785453
b1 = 294.4731124388917
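The coefficients can be cross-checked without pandas or numpy. The sketch below applies the usual least-squares formulas to the fifteen (Age, Income) rows listed in the table above; the variable names s_xy and s_xx are illustrative.

```python
# Ages and incomes copied from the Age / Income table
ages = [34, 23, 67, 20, 24, 23, 45, 54, 43, 34, 40, 33, 46, 56, 19]
incomes = [40000, 30000, 15000, 15000, 25000, 22000, 34000, 50000,
           38000, 30000, 40000, 56000, 44000, 45000, 20000]

n = len(ages)
mean_age = sum(ages) / n
mean_income = sum(incomes) / n

# Sum of cross-deviations and sum of squared deviations of age
s_xy = sum((x - mean_age) * (y - mean_income) for x, y in zip(ages, incomes))
s_xx = sum((x - mean_age) ** 2 for x in ages)

b1 = s_xy / s_xx                  # slope
b0 = mean_income - b1 * mean_age  # intercept

print("b0 =", b0)
print("b1 =", b1)
```

Up to floating-point rounding, this gives the same b0 and b1 as the program output.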
AIM: Implement the Naive Bayes classifier to classify English text using Python.
Explanation:
Naive Bayes classifiers are a collection of classification algorithms based on
Bayes' Theorem. It is not a single algorithm but a family of algorithms that all
share a common principle: every pair of features being classified is assumed to be
independent of each other, given the class.
The dataset is divided into two parts, namely the feature matrix and the
response/target vector.
• The feature matrix (X) contains all the vectors (rows) of the dataset, in which each
vector consists of the values of the features. The number of features is d, i.e.
X = (x1, x2, ..., xd).
• The response/target vector (y) contains the value of the class/group variable for each
row of the feature matrix.
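The independence assumption can be made concrete with a small from-scratch multinomial Naive Bayes. This is a sketch, not the scikit-learn implementation: the four-sentence corpus is hypothetical, word counts stand in for TF-IDF features, and Laplace (add-one) smoothing is an assumed choice.

```python
import math
from collections import Counter, defaultdict

# HYPOTHETICAL toy corpus, for illustration only
train = [
    ("the engine and the brakes of the car", "autos"),
    ("new car with a fast engine", "autos"),
    ("the pitcher threw the ball", "baseball"),
    ("a home run won the baseball game", "baseball"),
]

def fit(docs):
    """Count class frequencies and per-class word frequencies."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for text, label in docs:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return class_counts, word_counts, vocab

def predict(model, text):
    """Pick the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # log prior: log P(y)
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # naive assumption: add log P(x_i | y) for each word independently
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
print(predict(model, "fast car engine"))
print(predict(model, "the baseball game"))
```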
Source Code
print("NAIVE BAYES ENGLISH TEST CLASSIFICATION")
import numpy as np, pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix, accuracy_score
sns.set() # use seaborn plotting style
# Load the dataset
data = fetch_20newsgroups()
# Get the text categories
text_categories = data.target_names
# define the training set
train_data = fetch_20newsgroups(subset="train", categories=text_categories)
# define the test set
test_data = fetch_20newsgroups(subset="test", categories=text_categories)
print("We have {} unique classes".format(len(text_categories)))
print("We have {} training samples".format(len(train_data.data)))
print("We have {} test samples".format(len(test_data.data)))
# Optionally look at one sample document (index 5)
#print(test_data.data[5])
# Build the model
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Train the model using the training data
model.fit(train_data.data, train_data.target)
# Predict the categories of the test data
predicted_categories = model.predict(test_data.data)
print(np.array(test_data.target_names)[predicted_categories])
OUTPUT:
NAIVE BAYES ENGLISH TEST CLASSIFICATION
We have 20 unique classes
We have 11314 training samples
We have 7532 test samples
['rec.autos' 'sci.crypt' 'alt.atheism' ... 'rec.sport.baseball'
'comp.sys.ibm.pc.hardware' 'soc.religion.christian']
AIM: Implement an algorithm to demonstrate the significance of the Genetic
Algorithm in Python.
ALGORITHM:
1. Individuals in the population compete for resources and mate.
2. Those individuals who are successful (fittest) then mate to create more
offspring than others.
3. Genes from the “fittest” parents propagate throughout the generation; that is,
sometimes parents create offspring that are better than either parent.
1) Selection Operator: The idea is to give preference to the individuals with good
fitness scores and allow them to pass their genes to the successive generations.
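One common way to give preference to fitter individuals is tournament selection, sketched below. This is an illustrative alternative and not necessarily the selection scheme used by the program in this manual; the population, the target string and the helper names are hypothetical. Lower fitness is treated as better, as with a score that counts mismatches against a target.

```python
import random

def tournament_select(population, fitness, k=3, rng=random):
    """Pick k random individuals and return the fittest of them.

    Lower fitness is better here, matching a score that counts
    mismatches against a target.
    """
    contenders = rng.sample(population, k)
    return min(contenders, key=fitness)

# HYPOTHETICAL population: candidate strings scored against a target
target = "abc"
population = ["abc", "abd", "xyz", "axc", "zzz"]

def mismatches(candidate):
    """Number of positions where the candidate differs from the target."""
    return sum(a != b for a, b in zip(candidate, target))

rng = random.Random(0)  # fixed seed so the run is repeatable
parents = [tournament_select(population, mismatches, k=3, rng=rng) for _ in range(4)]
print(parents)
```

A larger tournament size k increases the selection pressure, because the best individual in the population is more likely to be among the contenders.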
Source Code
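# Python3 program to create target string, starting from
# random string using Genetic Algorithm
import random
# Number of individuals in each generation (assumed value)
POPULATION_SIZE = 100
# Valid genes
GENES = '''abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP
QRSTUVWXYZ 1234567890, .-;:_!"#%&/()=?@${[]}'''
# Target string to be generated (inferred from the sample output below)
TARGET = "I love GeeksforGeeks"
class Individual(object):
    '''
    Class representing individual in population
    '''
    def __init__(self, chromosome):
        self.chromosome = chromosome
        self.fitness = self.cal_fitness()
    @classmethod
    def mutated_genes(cls):
        '''
        create random genes for mutation
        '''
        return random.choice(GENES)
    @classmethod
    def create_gnome(cls):
        '''
        create chromosome or string of genes
        '''
        return [cls.mutated_genes() for _ in range(len(TARGET))]
    def mate(self, par2):
        '''
        Perform mating and produce new offspring. The 45/45/10 split
        between parent 1, parent 2 and mutation is an assumed choice.
        '''
        child_chromosome = []
        for gp1, gp2 in zip(self.chromosome, par2.chromosome):
            # random probability
            prob = random.random()
            if prob < 0.45:
                child_chromosome.append(gp1)
            elif prob < 0.90:
                child_chromosome.append(gp2)
            else:
                child_chromosome.append(self.mutated_genes())
        return Individual(child_chromosome)
    def cal_fitness(self):
        '''
        Calculate fitness score: the number of characters in the
        string which differ from the target string.
        '''
        fitness = 0
        for gs, gt in zip(self.chromosome, TARGET):
            if gs != gt:
                fitness += 1
        return fitness
# Driver code
def main():
    # current generation
    generation = 1
    found = False
    # create the initial population
    population = [Individual(Individual.create_gnome())
                  for _ in range(POPULATION_SIZE)]
    while not found:
        # sort the population in increasing order of fitness score
        population = sorted(population, key=lambda x: x.fitness)
        # a fitness of 0 means the target string has been reached
        if population[0].fitness <= 0:
            found = True
            break
        # elitism (assumed 10%): the fittest individuals pass unchanged
        s = int((10 * POPULATION_SIZE) / 100)
        new_generation = population[:s]
        # the fittest half mates to produce the remaining 90%
        s = int((90 * POPULATION_SIZE) / 100)
        half = POPULATION_SIZE // 2
        for _ in range(s):
            parent1 = random.choice(population[:half])
            parent2 = random.choice(population[:half])
            new_generation.append(parent1.mate(parent2))
        population = new_generation
        print("Generation: {}\tString: {}\tFitness: {}".format(
            generation,
            "".join(population[0].chromosome),
            population[0].fitness))
        generation += 1
    print("Generation: {}\tString: {}\tFitness: {}".format(
        generation,
        "".join(population[0].chromosome),
        population[0].fitness))
if __name__ == '__main__':
    main()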
OUTPUT:
Generation: 1 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
Generation: 4 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
.
.
.
.
.
.
.
.
.
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 69 String: I love GeeWsforGeeks Fitness: 1
ALGORITHM:
Backpropagation is the most widely used algorithm for training artificial neural networks. In the
simplest scenario, the architecture of a neural network consists of some
sequential layers, where the layer numbered i is connected to the layer numbered
i+1. The layers can be classified into 3 classes:
1. Input
2. Hidden
3. Output
Usually, each neuron in the hidden layer uses an activation function like sigmoid
or rectified linear unit (ReLU). This helps to capture the non-linear relationship
between the inputs and their outputs. The neurons in the output layer also use
activation functions like sigmoid (for regression) or SoftMax (for classification). To
train a neural network, there are 2 passes (phases):
• Forward
• Backward
The forward and backward phases are repeated for a number of epochs. In each
epoch, the following occurs:
1. The inputs are propagated from the input to the output layer.
2. The network error is calculated.
3. The error is propagated from the output layer to the input layer.
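The backward pass relies on the chain rule, and it can be sanity-checked numerically. The sketch below uses a single sigmoid neuron with two inputs and a squared error; the inputs and target (0.1, 0.4, 0.7) are the ones used by the program in this section, while the weights 0.5 and 0.5 are arbitrary illustrative values. It compares the chain-rule gradient with a finite-difference estimate.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def loss(w1, w2, x1, x2, target):
    """Squared error of a single sigmoid neuron with two inputs."""
    return (sigmoid(w1 * x1 + w2 * x2) - target) ** 2

# Inputs and target; the weights are arbitrary illustrative values
x1, x2, target = 0.1, 0.4, 0.7
w1, w2 = 0.5, 0.5

# Backward pass via the chain rule:
# dE/dw1 = dE/dpred * dpred/ds * ds/dw1
s = w1 * x1 + w2 * x2
pred = sigmoid(s)
analytic = 2 * (pred - target) * pred * (1 - pred) * x1

# Finite-difference estimate of the same derivative
eps = 1e-6
numeric = (loss(w1 + eps, w2, x1, x2, target)
           - loss(w1 - eps, w2, x1, x2, target)) / (2 * eps)

print("analytic:", analytic)
print("numeric: ", numeric)
```

If the two values agree to several decimal places, the three factors of the gradient have been multiplied correctly.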
SOURCE CODE:
import numpy
import matplotlib.pyplot as plt
def sigmoid(sop):
    return 1.0/(1+numpy.exp(-1*sop))
def error(predicted, target):
    return numpy.power(predicted-target, 2)
def error_predicted_deriv(predicted, target):
    return 2*(predicted-target)
def sigmoid_sop_deriv(sop):
    return sigmoid(sop)*(1.0-sigmoid(sop))
def sop_w_deriv(x):
    return x
def update_w(w, grad, learning_rate):
    return w - learning_rate*grad
x1=0.1
x2=0.4
target = 0.7
learning_rate = 0.01
w1=numpy.random.rand()
w2=numpy.random.rand()
print("Initial W : ", w1, w2)
predicted_output = []
network_error = []
old_err = 0
for k in range(80000):
    # Forward Pass
    y = w1*x1 + w2*x2
    predicted = sigmoid(y)
    err = error(predicted, target)
    predicted_output.append(predicted)
    network_error.append(err)
    # Backward Pass
    g1 = error_predicted_deriv(predicted, target)
    g2 = sigmoid_sop_deriv(y)
    g3w1 = sop_w_deriv(x1)
    g3w2 = sop_w_deriv(x2)
    gradw1 = g3w1*g2*g1
    gradw2 = g3w2*g2*g1
    w1 = update_w(w1, gradw1, learning_rate)
    w2 = update_w(w2, gradw2, learning_rate)
    #print(predicted)
plt.figure()
plt.plot(network_error)
plt.title("Iteration Number vs Error")
plt.xlabel("Iteration Number")
plt.ylabel("Error")
plt.show()
plt.figure()
plt.plot(predicted_output)
plt.title("Iteration Number vs Prediction")
plt.xlabel("Iteration Number")
plt.ylabel("Prediction")
plt.show()
OUTPUT:
Initial W : 0.9662947849081247 0.9049871680451135