ML LAB Manual


Aim: The probability that it is Friday and that a student is absent is 3%. Since there
are 5 school days in a week, the probability that it is Friday is 20%. What is the
probability that a student is absent given that today is Friday? Apply Bayes' rule in
Python to get the result. (Ans: 15%)

Explanation:
=================================
F : Friday
A : Absent

Based on the given problem statement,

The probability that it is Friday and that a student is absent is 3%


i.e
P(A ∩ F)= 3% = 3 / 100 = 0.03

and

The probability that it is Friday is 20%


i.e

P(F)=20% = 20/100 = 0.2

Then,

The probability that a student is absent given that today is Friday


P(A ∣ F)

By the definition of conditional probability (from which Bayes' rule follows), we have

P(A ∣ F) = P(A ∩ F) / P(F) = 0.03 / 0.2 = 0.15 = 15%


Source Code :

# The probability that it is Friday and that a student is absent is 3%
pAF = 0.03
print("The probability that it is Friday and that a student is absent :", pAF)
# The probability that it is Friday is 20%
pF = 0.2
print("The probability that it is Friday : ", pF)
# The probability that a student is absent given that today is Friday
pResult = pAF / pF
# Display the result
print("The probability that a student is absent given that today is Friday : ", pResult * 100, "%")

Output:

The probability that it is Friday and that a student is absent : 0.03


The probability that it is Friday : 0.2
The probability that a student is absent given that today is Friday : 15.0 %
Aim: Extract the data from database using python

Explanation:

===> First, you need to create a table (students) in a MySQL database (SampleDB).

---> Open a command prompt and execute the following command to enter the
MySQL prompt.

--> mysql -u root -p

Then execute the following commands at the MySQL prompt to create the database
and the table.

--> create database SampleDB;

--> use SampleDB;

--> CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10), age INT);

--> INSERT INTO students VALUES('s521','Jhon Bob',23);

--> INSERT INTO students VALUES('s522','Dilly',22);

--> INSERT INTO students VALUES('s523','Kenney',25);

--> INSERT INTO students VALUES('s524','Herny',26);

===> Next, open a command prompt and execute the following command to install the
MySQL Connector/Python package, which provides the mysql.connector module used to
connect to the MySQL database from Python.

--> pip install mysql-connector-python (Windows and Linux)

--> sudo apt-get install python3-mysql.connector (alternative on Debian/Ubuntu)

Source Code :
import mysql.connector

# Create the connection object (adjust user and password to your MySQL setup)
myconn = mysql.connector.connect(host="localhost", user="root",
                                 passwd="", database="SampleDB")

# Create the cursor object
cur = myconn.cursor()

# Execute the query
cur.execute("select * from students")

# Fetch all rows from the cursor object
result = cur.fetchall()

print("Student Details are :")

# Print each row
for x in result:
    print(x)

# Close the cursor and the connection
# (no commit is needed because SELECT does not modify data)
cur.close()
myconn.close()
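
If a MySQL server is not at hand, the same connect / cursor / execute / fetchall sequence can be tried with Python's built-in sqlite3 module. The following is a minimal sketch under that assumption, not the MySQL program itself: it swaps MySQL for an in-memory SQLite database, reuses the students rows from the explanation above, and the helper name fetch_students is hypothetical.

```python
import sqlite3


def fetch_students():
    """Create the students table in an in-memory SQLite database and return all rows."""
    # ":memory:" keeps the database in RAM, so nothing is written to disk
    conn = sqlite3.connect(":memory:")
    try:
        cur = conn.cursor()
        cur.execute("CREATE TABLE students (sid VARCHAR(10), sname VARCHAR(10), age INT)")
        rows = [("s521", "Jhon Bob", 23), ("s522", "Dilly", 22),
                ("s523", "Kenney", 25), ("s524", "Herny", 26)]
        # "?" placeholders let the driver handle quoting of the values
        cur.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)
        conn.commit()  # INSERT changes data, so a commit is appropriate here
        cur.execute("SELECT * FROM students")
        return cur.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print("Student Details are :")
    for row in fetch_students():
        print(row)
```

Each row comes back as a tuple, which is also how the MySQL cursor in the program above returns its rows.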
Aim: Implement k-nearest neighbours classification using python

Explanation:

===> To run this program you need to install the scikit-learn package (imported as sklearn).

===> Open a command prompt and execute the following command to install it:

---> pip install scikit-learn

In this program, we use the iris dataset, which is split into a training set (70%)
and a test set (30%).

The iris dataset contains the following features

---> sepal length (cm)

---> sepal width (cm)

---> petal length (cm)

---> petal width (cm)

The Sample data in iris dataset format is [5.4 3.4 1.7 0.2]

Where 5.4 ---> sepal length (cm)

3.4 ---> sepal width (cm)

1.7 ---> petal length (cm)

0.2 ---> petal width (cm)

Source Code :

# Import necessary modules
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
import random

# Loading data
data_iris = load_iris()

# To get list of target names
label_target = data_iris.target_names

print()
print("Sample Data from Iris Dataset")
print("*"*30)

# Display 10 randomly chosen samples from the iris dataset
for i in range(10):
    rn = random.randrange(len(data_iris.data))
    print(data_iris.data[rn], "===>", label_target[data_iris.target[rn]])

# Create feature and target arrays
X = data_iris.data
y = data_iris.target

# Split into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

print("The Training dataset length: ", len(X_train))
print("The Testing dataset length: ", len(X_test))

try:
    nn = int(input("Enter number of neighbors :"))
    knn = KNeighborsClassifier(n_neighbors=nn)
    knn.fit(X_train, y_train)

    # Display the accuracy on the test set
    print("The Score is :", knn.score(X_test, y_test))

    # Get test data from the user as four comma-separated values,
    # e.g. 5.0,3.3,1.4,0.3
    test_data = [float(value) for value in input("Enter Test Data :").split(",")]
    print()

    v = knn.predict([test_data])
    print("Predicted output is :", label_target[v])
except ValueError:
    print("Please supply valid input......")

OUTPUT:

Sample Data from Iris Dataset

******************************
[4.6 3.4 1.4 0.3] ===> setosa
[6.1 3. 4.6 1.4] ===> versicolor
[6.3 3.3 4.7 1.6] ===> versicolor
[4.9 3.1 1.5 0.1] ===> setosa
[6.9 3.2 5.7 2.3] ===> virginica
[6.4 3.2 4.5 1.5] ===> versicolor
[5.4 3.4 1.5 0.4] ===> setosa
[5.9 3.2 4.8 1.8] ===> versicolor
[5.4 3. 4.5 1.5] ===> versicolor
[7. 3.2 4.7 1.4] ===> versicolor
The Training dataset length: 105
The Testing dataset length: 45
Enter number of neighbors :10
The Score is : 0.9777777777777777
Enter Test Data :5.0,3.3,1.4,0.3
Predicted output is : ['setosa']
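
To make the classifier's behaviour concrete, the following is a minimal pure-Python sketch of the k-nearest-neighbours idea (Euclidean distance plus a majority vote among the k closest training points). It is an illustration, not scikit-learn's implementation: the function name knn_predict is hypothetical, and the ten training rows are simply the sample rows printed in the output above.

```python
from collections import Counter
import math

# The ten sample rows shown in the output above: (features, label)
TRAIN = [
    ([4.6, 3.4, 1.4, 0.3], "setosa"),
    ([6.1, 3.0, 4.6, 1.4], "versicolor"),
    ([6.3, 3.3, 4.7, 1.6], "versicolor"),
    ([4.9, 3.1, 1.5, 0.1], "setosa"),
    ([6.9, 3.2, 5.7, 2.3], "virginica"),
    ([6.4, 3.2, 4.5, 1.5], "versicolor"),
    ([5.4, 3.4, 1.5, 0.4], "setosa"),
    ([5.9, 3.2, 4.8, 1.8], "versicolor"),
    ([5.4, 3.0, 4.5, 1.5], "versicolor"),
    ([7.0, 3.2, 4.7, 1.4], "versicolor"),
]


def knn_predict(train, query, k):
    """Return the most common label among the k training points nearest to query."""
    # Sort the training points by Euclidean distance to the query and keep k of them
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of those k neighbours
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(knn_predict(TRAIN, [5.0, 3.3, 1.4, 0.3], k=3))
```

With only ten training rows this is far less reliable than the 105-row model above; it is meant only to show where the prediction comes from.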
Aim: Given the following data, which specify classifications for nine
combinations of VAR1 and VAR2, predict a classification for a case where
VAR1=0.906 and VAR2=0.606, using the result of k-means clustering with 3
means (i.e., 3 centroids)

Explanation:
===> To run this program you need to install the scikit-learn package (imported as sklearn).

===> Open a command prompt and execute the following command to install it:

---> pip install scikit-learn

In this program, we are going to use the following data

VAR1    VAR2    CLASS
1.713   1.586   0
0.180   1.786   1
0.353   1.240   1
0.940   1.566   0
1.486   0.759   1
1.266   1.106   0
1.540   0.419   1
0.459   1.799   1
0.773   0.186   1

Next, we need to apply k-means clustering with 3 means (i.e., 3 centroids).

Finally, we need to predict the class for VAR1=0.906 and VAR2=0.606.

Source Code :

from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1.713, 1.586], [0.180, 1.786], [0.353, 1.240],
              [0.940, 1.566], [1.486, 0.759], [1.266, 1.106],
              [1.540, 0.419], [0.459, 1.799], [0.773, 0.186]])
y = np.array([0, 1, 1, 0, 1, 0, 1, 1, 1])

# KMeans is unsupervised: fit() accepts y only for API consistency and ignores it
kmeans = KMeans(n_clusters=3, random_state=0).fit(X, y)

print("The input data is ")
print("VAR1 \t VAR2 \t CLASS")
for val, label in zip(X, y):
    print(val[0], "\t", val[1], "\t", label)
print("="*20)

# To get test data from the user
print("The Test data to predict ")
VAR1 = float(input("Enter Value for VAR1 :"))
VAR2 = float(input("Enter Value for VAR2 :"))
test_data = [VAR1, VAR2]
print("="*20)

# Note: predict() returns the index (0, 1 or 2) of the nearest centroid,
# which is a cluster number and not necessarily a CLASS value
print("The predicted Class is : ", kmeans.predict([test_data]))

OUTPUT:

The input data is


VAR1 VAR2 CLASS
1.713 1.586 0
0.18 1.786 1
0.353 1.24 1
0.94 1.566 0
1.486 0.759 1
1.266 1.106 0
1.54 0.419 1
0.459 1.799 1
0.773 0.186 1
====================
The Test data to predict
Enter Value for VAR1 :0.906
Enter Value for VAR2 :0.606
====================
The predicted Class is : [0]
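
Because k-means returns cluster indices rather than CLASS values, one common way to obtain a classification is to label each cluster with the majority class of its training points and then report the label of the cluster the new case falls into. The following is a minimal sketch of that mapping, not part of the program above: the function name majority_class_per_cluster is hypothetical, and the cluster assignments in the example are made up for illustration (with scikit-learn they would come from the fitted model's labels_ attribute).

```python
from collections import Counter


def majority_class_per_cluster(cluster_ids, classes):
    """Map each cluster index to the most frequent class among its members."""
    members = {}
    for cluster, label in zip(cluster_ids, classes):
        members.setdefault(cluster, []).append(label)
    # On a tie, the class encountered first within the cluster is kept
    return {cluster: Counter(labels).most_common(1)[0][0]
            for cluster, labels in members.items()}


# CLASS column from the table above
classes = [0, 1, 1, 0, 1, 0, 1, 1, 1]
# Hypothetical cluster assignment for the nine rows (illustration only)
cluster_ids = [0, 1, 1, 0, 2, 0, 2, 1, 2]

cluster_to_class = majority_class_per_cluster(cluster_ids, classes)
print(cluster_to_class)
```

Assuming a fitted model named kmeans as in the program above, the class for the test case could then be looked up with cluster_to_class[kmeans.predict([[0.906, 0.606]])[0]].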
Aim: The following training examples map descriptions of individuals onto high,
medium and low credit-worthiness. Input attributes are (from left to right) income,
recreation, job, status, age-group, home-owner. Find the unconditional probability
of 'golf' and the conditional probability of 'single' given 'medRisk' in the dataset

income   recreation   job          status    age-group   home-owner      risk
medium   skiing       design       single    twenties    no          ->  highRisk
high     golf         trading      married   forties     yes         ->  lowRisk
low      speedway     transport    married   thirties    yes         ->  medRisk
medium   football     banking      single    thirties    yes         ->  lowRisk
high     flying       media        married   fifties     yes         ->  highRisk
low      football     security     single    twenties    no          ->  medRisk
medium   golf         media        single    thirties    yes         ->  medRisk
medium   golf         transport    married   forties     yes         ->  lowRisk
high     skiing       banking      single    thirties    yes         ->  highRisk
low      golf         unemployed   married   forties     yes         ->  highRisk


Explanation:

In the given data set,

----> The total number of records is 10.

----> The number of records that contain 'golf' is 4.

----> Then, the unconditional probability of golf

= The number of records that contain 'golf' / total number of records

= 4 / 10

= 0.4

To find the Conditional probability of single given medRisk,

---> S : single

---> MR : medRisk

---> By the definition of conditional probability (from which Bayes' rule follows), we have

P(S ∣ MR) = P(S ∩ MR) / P(MR)

Based on the given problem statement,

P(S ∩ MR) = The number of records that are both single and medRisk / total number of
records

= 2 / 10 = 0.2 and

P(MR) = The number of records with medRisk / total number of records

= 3 / 10 = 0.3

Then, the conditional probability of single given medRisk is

P(S ∣ MR) = 0.2 / 0.3

≈ 0.6667

Source Code :

total_Records = 10
numGolfRecords = 4

# Unconditional probability of 'golf'
unConditionalprobGolf = numGolfRecords / total_Records
print("Unconditional probability of golf: ={}".format(unConditionalprobGolf))

# Conditional probability of 'single' given 'medRisk'
numMedRiskSingle = 2
numMedRisk = 3
probMedRiskSingle = numMedRiskSingle / total_Records
probMedRisk = numMedRisk / total_Records
conditionalProb = probMedRiskSingle / probMedRisk
print("Conditional probability of single given medRisk: = {}".format(conditionalProb))

OUTPUT:

Unconditional probability of golf: =0.4

Conditional probability of single given medRisk: = 0.6666666666666667
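
The program above uses counts that were tallied by hand. As a cross-check, the following sketch derives the same two probabilities directly from the ten training examples. The names RECORDS, probability and conditional_probability are hypothetical helpers introduced here for illustration.

```python
# Columns: income, recreation, job, status, age-group, home-owner, risk
RECORDS = [
    ("medium", "skiing", "design", "single", "twenties", "no", "highRisk"),
    ("high", "golf", "trading", "married", "forties", "yes", "lowRisk"),
    ("low", "speedway", "transport", "married", "thirties", "yes", "medRisk"),
    ("medium", "football", "banking", "single", "thirties", "yes", "lowRisk"),
    ("high", "flying", "media", "married", "fifties", "yes", "highRisk"),
    ("low", "football", "security", "single", "twenties", "no", "medRisk"),
    ("medium", "golf", "media", "single", "thirties", "yes", "medRisk"),
    ("medium", "golf", "transport", "married", "forties", "yes", "lowRisk"),
    ("high", "skiing", "banking", "single", "thirties", "yes", "highRisk"),
    ("low", "golf", "unemployed", "married", "forties", "yes", "highRisk"),
]


def probability(records, event):
    """Fraction of records for which event(record) is true."""
    return sum(1 for r in records if event(r)) / len(records)


def conditional_probability(records, event, given):
    """P(event | given): restrict to records matching 'given', then count 'event'."""
    matching = [r for r in records if given(r)]
    return sum(1 for r in matching if event(r)) / len(matching)


p_golf = probability(RECORDS, lambda r: r[1] == "golf")
p_single_given_med = conditional_probability(
    RECORDS,
    event=lambda r: r[3] == "single",
    given=lambda r: r[6] == "medRisk",
)
print("P(golf) =", p_golf)
print("P(single | medRisk) =", p_single_given_med)
```

Restricting to the medRisk records first and counting 'single' among them gives 2 / 3, the same value as dividing P(S ∩ MR) by P(MR).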


Aim: Implement linear regression using python

Explanation:

===> To run this program you need to install the pandas module

---> The pandas module is used to read CSV files

===> To install it, open a command prompt and execute the following command

---> pip install pandas

Then you need to install the matplotlib module

---> The matplotlib module is used to plot graphs

===> To install it, open a command prompt and execute the following command

---> pip install matplotlib

Finally, you need to create a dataset file called "Age_Income.csv" with the columns Age and Income.

Source Code :

import pandas as pd

import numpy as np

import matplotlib.pyplot as plt

# To read data from Age_Income.csv file

dataFrame = pd.read_csv('Age_Income.csv')

# To place data in to age and income vectors

age = dataFrame['Age']

income = dataFrame['Income']

# number of points
num = np.size(age)

# To find the mean of age and income vector

mean_age = np.mean(age)

mean_income = np.mean(income)

# calculating cross-deviation and deviation about age

CD_ageincome = np.sum(income*age) - num*mean_income*mean_age

CD_ageage = np.sum(age*age) - num*mean_age*mean_age

# calculating regression coefficients

b1 = CD_ageincome / CD_ageage

b0 = mean_income - b1*mean_age

# to display coefficients

print("Estimated Coefficients :")

print("b0 = ",b0,"\nb1 = ",b1)

# To plot the actual points as scatter plot

plt.scatter(age, income, color = "b",marker = "o")

# TO predict response vector

response_Vec = b0 + b1*age

# To plot the regression line

plt.plot(age, response_Vec, color = "r")

# Placing labels

plt.xlabel('Age')
plt.ylabel('Income')

# To display plot

plt.show()

INPUT DATA: Age_Income.csv

Age,Income
34,40000
23,30000
67,15000
20,15000
24,25000
23,22000
45,34000
54,50000
43,38000
34,30000
40,40000
33,56000
46,44000
56,45000
19,20000

OUTPUT:

Estimated Coefficients:
b0 = 22586.705594785453
b1 = 294.4731124388917
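
The coefficients can be cross-checked without pandas or NumPy. The following sketch applies the same least-squares formulas as the program above (cross-deviation divided by the deviation about age) to the Age_Income.csv values written out as plain lists. The function name fit_line is hypothetical.

```python
# Age and Income columns from Age_Income.csv
ages = [34, 23, 67, 20, 24, 23, 45, 54, 43, 34, 40, 33, 46, 56, 19]
incomes = [40000, 30000, 15000, 15000, 25000, 22000, 34000, 50000,
           38000, 30000, 40000, 56000, 44000, 45000, 20000]


def fit_line(xs, ys):
    """Return (b0, b1) of the least-squares line y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-deviation of x and y, and deviation of x about its mean
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * mean_x * mean_y
    s_xx = sum(x * x for x in xs) - n * mean_x * mean_x
    b1 = s_xy / s_xx
    b0 = mean_y - b1 * mean_x
    return b0, b1


b0, b1 = fit_line(ages, incomes)
print("b0 =", b0)
print("b1 =", b1)
```

Up to floating-point rounding, the printed values should agree with the coefficients reported in the output above.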
AIM: Implement Naive Bayes Theorem to Classify the English Text using python

Explanation:
Naive Bayes classifiers are a collection of classification algorithms based on
Bayes' Theorem. It is not a single algorithm but a family of algorithms that all
share a common principle: every pair of features being classified is assumed to be
independent of each other, given the class.

The dataset is divided into two parts, namely, the feature matrix and the
response/target vector.
• The feature matrix (X) contains all the vectors (rows) of the dataset, in which each
vector consists of the values of the features. The number of features is d, i.e.
X = (x1, x2, x3, ..., xd).
• The response/target vector (y) contains the value of the class/group variable for each
row of the feature matrix.
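
To show what the independence assumption buys, the following is a minimal pure-Python sketch of a multinomial Naive Bayes text classifier: the score of a class is the log prior plus the sum of the log probabilities of each word given that class. It is an illustration under stated assumptions, not scikit-learn's MultinomialNB: it uses raw word counts with add-one (Laplace) smoothing, the function names train_nb and predict_nb are hypothetical, and the four training sentences are made up.

```python
from collections import Counter, defaultdict
import math


def train_nb(documents):
    """documents: list of (text, label). Returns the counts the classifier needs."""
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocabulary = set()
    for text, label in documents:
        class_docs[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_docs, word_counts, vocabulary


def predict_nb(model, text):
    """Return the class with the highest log posterior score for text."""
    class_docs, word_counts, vocabulary = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log P(class)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # log P(word | class) with add-one (Laplace) smoothing
            score += math.log((word_counts[label][word] + 1)
                              / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training sentences (illustration only)
docs = [("engine car wheel", "autos"), ("car road driver", "autos"),
        ("planet orbit rocket", "space"), ("rocket launch orbit", "space")]
model = train_nb(docs)
print(predict_nb(model, "car engine road"))
print(predict_nb(model, "orbit rocket"))
```

Because the features are treated as independent given the class, the per-word terms are simply added in log space; no joint distribution over word combinations is ever estimated.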

Source Code
print("NAIVE BAYES ENGLISH TEST CLASSIFICATION")
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import confusion_matrix, accuracy_score
sns.set()  # use seaborn plotting style

# Load the dataset
data = fetch_20newsgroups()
# Get the text categories
text_categories = data.target_names
# Define the training set
train_data = fetch_20newsgroups(subset="train", categories=text_categories)
# Define the test set
test_data = fetch_20newsgroups(subset="test", categories=text_categories)
print("We have {} unique classes".format(len(text_categories)))
print("We have {} training samples".format(len(train_data.data)))
print("We have {} test samples".format(len(test_data.data)))
# To look at one sample document (here the one at index 5), uncomment:
# print(test_data.data[5])

# Build the model: TF-IDF features followed by multinomial Naive Bayes
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
# Train the model using the training data
model.fit(train_data.data, train_data.target)
# Predict the categories of the test data
predicted_categories = model.predict(test_data.data)
print(np.array(test_data.target_names)[predicted_categories])

# Plot the confusion matrix
mat = confusion_matrix(test_data.target, predicted_categories)
sns.heatmap(mat.T, square=True, annot=True, fmt="d",
            xticklabels=train_data.target_names,
            yticklabels=train_data.target_names)
plt.xlabel("true labels")
plt.ylabel("predicted label")
plt.show()
print("The accuracy is {}".format(accuracy_score(test_data.target,
                                                 predicted_categories)))

OUTPUT:
NAIVE BAYES ENGLISH TEST CLASSIFICATION
We have 20 unique classes
We have 11314 training samples
We have 7532 test samples
['rec.autos' 'sci.crypt' 'alt.atheism' ... 'rec.sport.baseball'
'comp.sys.ibm.pc.hardware' 'soc.religion.christian']
AIM: Implement an algorithm to demonstrate the significance of the Genetic
Algorithm in Python.

ALGORITHM:
1. Individuals in the population compete for resources and mates.

2. Those individuals who are successful (fittest) then mate to create more
offspring than others.
3. Genes from the “fittest” parents propagate throughout the generation; that is,
sometimes parents create offspring that are better than either parent.

4. Thus each successive generation is better suited to its environment.

Operators of Genetic Algorithms


Once the initial generation is created, the algorithm evolves the generation using
the following operators:

1) Selection Operator: The idea is to give preference to the individuals with good
fitness scores and allow them to pass their genes to the successive generations.

2) Crossover Operator: This represents mating between individuals. Two
individuals are selected using the selection operator and crossover sites are chosen
randomly. Then the genes at these crossover sites are exchanged, thus creating a
completely new individual (offspring).

3) Mutation Operator: The key idea is to insert random genes in the offspring to
maintain diversity in the population and avoid premature convergence.
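
The crossover and mutation operators described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the program in this section: it shows single-point crossover (one randomly chosen site, after which the parents' genes are swapped) and a per-gene mutation with a fixed rate. The function names single_point_crossover and mutate, the parent strings and the 0.25 rate are all hypothetical.

```python
import random


def single_point_crossover(parent1, parent2, rng):
    """Exchange the tails of two equal-length chromosomes at a random site."""
    site = rng.randrange(1, len(parent1))  # crossover site, never at the ends
    child1 = parent1[:site] + parent2[site:]
    child2 = parent2[:site] + parent1[site:]
    return child1, child2


def mutate(chromosome, genes, rate, rng):
    """Replace each gene by a random one from genes with probability rate."""
    return [rng.choice(genes) if rng.random() < rate else gene
            for gene in chromosome]


rng = random.Random(0)  # fixed seed so that a run is repeatable
p1 = list("AAAAAAAA")
p2 = list("BBBBBBBB")
c1, c2 = single_point_crossover(p1, p2, rng)
print("".join(c1), "".join(c2))
print("".join(mutate(c1, "xyz", 0.25, rng)))
```

Passing the random generator in as an argument keeps the operators free of global state, which makes them easy to test with a fixed seed.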

Source Code
# Python3 program to create target string, starting from
# random string using Genetic Algorithm
import random

# Number of individuals in each generation
POPULATION_SIZE = 100

# Valid genes
GENES = '''abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ 1234567890, .-;:_!"#%&/()=?@${[]}'''

# Target string to be generated
TARGET = "I love GeeksforGeeks"


class Individual(object):
    ''' Class representing individual in population '''

    def __init__(self, chromosome):
        self.chromosome = chromosome
        self.fitness = self.cal_fitness()

    @classmethod
    def mutated_genes(cls):
        ''' Create random genes for mutation '''
        return random.choice(GENES)

    @classmethod
    def create_gnome(cls):
        ''' Create chromosome or string of genes '''
        return [cls.mutated_genes() for _ in range(len(TARGET))]

    def mate(self, par2):
        ''' Perform mating and produce new offspring '''
        # chromosome for offspring
        child_chromosome = []
        for gp1, gp2 in zip(self.chromosome, par2.chromosome):
            # random probability
            prob = random.random()

            # if prob is less than 0.45, insert gene from parent 1
            if prob < 0.45:
                child_chromosome.append(gp1)
            # if prob is between 0.45 and 0.90, insert gene from parent 2
            elif prob < 0.90:
                child_chromosome.append(gp2)
            # otherwise insert random gene (mutate), for maintaining diversity
            else:
                child_chromosome.append(self.mutated_genes())

        # create new Individual (offspring) using generated chromosome
        return Individual(child_chromosome)

    def cal_fitness(self):
        ''' Calculate fitness score: the number of characters in the
        string which differ from the target string (0 is a perfect match). '''
        fitness = 0
        for gs, gt in zip(self.chromosome, TARGET):
            if gs != gt:
                fitness += 1
        return fitness


# Driver code
def main():
    # current generation
    generation = 1
    found = False
    population = []

    # create initial population
    for _ in range(POPULATION_SIZE):
        gnome = Individual.create_gnome()
        population.append(Individual(gnome))

    while not found:
        # sort the population in increasing order of fitness score
        population = sorted(population, key=lambda x: x.fitness)

        # if the best individual has fitness score 0, the target
        # has been reached, so break the loop
        if population[0].fitness <= 0:
            found = True
            break

        # otherwise generate new offspring for the new generation
        new_generation = []

        # perform elitism: the fittest 10% of the population
        # goes to the next generation unchanged
        s = int((10*POPULATION_SIZE)/100)
        new_generation.extend(population[:s])

        # individuals from the fittest 50% of the population
        # mate to produce the remaining 90% of the offspring
        s = int((90*POPULATION_SIZE)/100)
        for _ in range(s):
            parent1 = random.choice(population[:50])
            parent2 = random.choice(population[:50])
            child = parent1.mate(parent2)
            new_generation.append(child)

        population = new_generation

        print("Generation: {}\tString: {}\tFitness: {}".format(
            generation, "".join(population[0].chromosome),
            population[0].fitness))

        generation += 1

    # print the generation in which the target string was reached
    print("Generation: {}\tString: {}\tFitness: {}".format(
        generation, "".join(population[0].chromosome),
        population[0].fitness))


if __name__ == '__main__':
    main()

OUTPUT:
Generation: 1 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 2 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: 7tqS0( ?_X1.f:890{zL Fitness: 18
Generation: 3 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
Generation: 4 String: N;B54eL2BTIAf6NG3}Tz Fitness: 17
...
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 65 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 66 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 67 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 68 String: I love GeeWsforGeeks Fitness: 1
Generation: 69 String: I love GeeWsforGeeks Fitness: 1

AIM: Implement an algorithm to demonstrate the Back Propagation Algorithm in Python

ALGORITHM:
Back propagation is the most widely used algorithm for training artificial neural
networks. In the simplest scenario, the architecture of a neural network consists of
some sequential layers, where the layer numbered i is connected to the layer numbered
i+1. The layers can be classified into 3 classes:

1. Input

2. Hidden

3. Output

Usually, each neuron in the hidden layer uses an activation function like sigmoid
or rectified linear unit (ReLU). This helps to capture the non-linear relationship
between the inputs and their outputs. The neurons in the output layer also use
activation functions, for example sigmoid (for binary classification, or for targets
in the range 0 to 1) or SoftMax (for multi-class classification). To
train a neural network, there are 2 passes (phases):

• Forward

• Backward

The forward and backward phases are repeated for a number of epochs. In each
epoch, the following occurs:

1. The inputs are propagated from the input to the output layer.

2. The network error is calculated.

3. The error is propagated from the output layer to the input layer.
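
For the smallest possible case, the backward pass in step 3 can be written out with the chain rule. The derivation below is a sketch that assumes the network used in the source code of this section: a single neuron with two inputs and no bias, a sigmoid activation, and a squared error. The symbols s (sum of products), p (prediction), t (target) and η (learning rate) are notation introduced here.

```latex
\begin{align*}
s &= w_1 x_1 + w_2 x_2, \qquad p = \sigma(s) = \frac{1}{1 + e^{-s}}, \qquad E = (p - t)^2 \\[4pt]
\frac{\partial E}{\partial w_i}
  &= \frac{\partial E}{\partial p}\cdot\frac{\partial p}{\partial s}\cdot\frac{\partial s}{\partial w_i}
   = 2\,(p - t)\cdot \sigma(s)\bigl(1 - \sigma(s)\bigr)\cdot x_i, \qquad i = 1, 2 \\[4pt]
w_i &\leftarrow w_i - \eta\,\frac{\partial E}{\partial w_i}
\end{align*}
```

The three factors correspond to the functions error_predicted_deriv, sigmoid_sop_deriv and sop_w_deriv in the source code, and the last line corresponds to update_w.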

SOURCE CODE:
import numpy
import matplotlib.pyplot as plt

def sigmoid(sop):
    return 1.0/(1+numpy.exp(-1*sop))

def error(predicted, target):
    return numpy.power(predicted-target, 2)

def error_predicted_deriv(predicted, target):
    return 2*(predicted-target)

def sigmoid_sop_deriv(sop):
    return sigmoid(sop)*(1.0-sigmoid(sop))

def sop_w_deriv(x):
    return x

def update_w(w, grad, learning_rate):
    return w - learning_rate*grad

x1 = 0.1
x2 = 0.4
target = 0.7
learning_rate = 0.01
w1 = numpy.random.rand()
w2 = numpy.random.rand()
print("Initial W : ", w1, w2)
predicted_output = []
network_error = []

for k in range(80000):
    # Forward Pass
    y = w1*x1 + w2*x2
    predicted = sigmoid(y)
    err = error(predicted, target)

    predicted_output.append(predicted)
    network_error.append(err)

    # Backward Pass (chain rule)
    g1 = error_predicted_deriv(predicted, target)
    g2 = sigmoid_sop_deriv(y)

    g3w1 = sop_w_deriv(x1)
    g3w2 = sop_w_deriv(x2)

    gradw1 = g3w1*g2*g1
    gradw2 = g3w2*g2*g1

    # Update the weights
    w1 = update_w(w1, gradw1, learning_rate)
    w2 = update_w(w2, gradw2, learning_rate)
    # print(predicted)

# Plot the error against the iteration number
plt.figure()
plt.plot(network_error)
plt.title("Iteration Number vs Error")
plt.xlabel("Iteration Number")
plt.ylabel("Error")
plt.show()

# Plot the prediction against the iteration number
plt.figure()
plt.plot(predicted_output)
plt.title("Iteration Number vs Prediction")
plt.xlabel("Iteration Number")
plt.ylabel("Prediction")
plt.show()

OUTPUT:
Initial W : 0.9662947849081247 0.9049871680451135
