Pattern Recognition Lab

The document contains details about experiments conducted as part of a Pattern Recognition Lab course. It includes 8 programs: 1. Reads images and calculates basic statistics like mean, mode, standard deviation. 2. Implements naive Bayesian classifier on a CSV dataset and computes accuracy. 3. Constructs a Bayesian network using medical data to diagnose heart patients. 4. Implements Bayes' theorem and formula. 5. Performs data analysis on a given dataset. 6. Implements KNN on an image dataset. 7. Implements K-means clustering. 8. Implements PCA (principal component analysis).

Uploaded by

Prashant Kumar

Pattern Recognition Lab

CAL – 302

B.Tech 3rd Year


SEMESTER: 6th
Session: 2021-2022

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


SHARDA UNIVERSITY, GREATER NOIDA

Submitted By: Harsh Tiwari


Roll No: 190101124
System ID: 2019644021
Group: 1
Submitted to: Dr. Vijendra Singh
INDEX
S.No  Title of the Experiment  Date  Signature

1  Assuming a set of images that need to be classified, read the images and calculate basic statistics such as mean, mode, standard deviation, etc.
2  Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
3  Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients.
PROGRAM 1

Experiment 1: Assuming a set of images that need to be classified, read the images and calculate basic statistics such as mean, mode, standard deviation, etc.

Code:
from google.colab import files
uploaded = files.upload()
!pwd
import os
!pip install albumentations==0.4.6

import numpy as np
import pandas as pd
import torch
import torchvision
from torch.utils.data import Dataset,DataLoader
import albumentations as A
from albumentations.pytorch import ToTensorV2
import cv2
from tqdm import tqdm
import matplotlib.pyplot as plt
%matplotlib inline
device = torch.device('cpu')
import requests
os.environ['KAGGLE_CONFIG_DIR'] = "/content"
!kaggle competitions download -c cassava-leaf-disease-classification
from zipfile import ZipFile

# specifying the zip file name
file_name = "cassava-leaf-disease-classification.zip"

# opening the zip file in READ mode
with ZipFile(file_name, 'r') as zf:
    # printing all the contents of the zip file
    zf.printdir()

    # extracting all the files
    print('Extracting all the files now...')
    zf.extractall()
    print('Done!')

# assumption: df2 holds the competition's train.csv, which has an 'image_id' column
df2 = pd.read_csv('train.csv')
df2.head()
class LeafData(Dataset):

    def __init__(self, data, directory, transform = None):
        self.data = data
        self.directory = directory
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):

        # import
        path = os.path.join(self.directory, self.data.iloc[idx]['image_id'])
        # cv2.imread returns BGR; converting to RGB is a separate step
        image = cv2.imread(path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # augmentations
        if self.transform is not None:
            image = self.transform(image = image)['image']

        return image

num_workers = 4
image_size = 512
batch_size = 8

augs = A.Compose([A.Resize(height = image_size,
                           width = image_size),
                  A.Normalize(mean = (0, 0, 0),
                              std = (1, 1, 1)),
                  ToTensorV2()])

# dataset
image_dataset = LeafData(data = df2,
                         directory = 'train_images/',
                         transform = augs)

# data loader
image_loader = DataLoader(image_dataset,
                          batch_size = batch_size,
                          shuffle = False,
                          num_workers = num_workers,
                          pin_memory = True)

# display the first batch of images
for batch_idx, inputs in enumerate(image_loader):
    fig = plt.figure(figsize = (14, 7))
    for i in range(8):
        ax = fig.add_subplot(2, 4, i + 1, xticks = [], yticks = [])
        plt.imshow(inputs[i].numpy().transpose(1, 2, 0))
    break

psum = torch.tensor([0.0, 0.0, 0.0])
psum_sq = torch.tensor([0.0, 0.0, 0.0])

# loop through images
for inputs in tqdm(image_loader):
    psum += inputs.sum(axis = [0, 2, 3])
    psum_sq += (inputs ** 2).sum(axis = [0, 2, 3])

# pixel count
count = len(df2) * image_size * image_size

# mean and std
total_mean = psum / count
total_var = (psum_sq / count) - (total_mean ** 2)
total_std = torch.sqrt(total_var)

# output
print('mean: ' + str(total_mean))
print('std: ' + str(total_std))
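
The listing above obtains the standard deviation from running sums, using Var = E[x^2] - (E[x])^2, but does not compute the mode that the experiment statement also asks for. The following is a minimal sketch of all three statistics in plain Python. The pixel values are hypothetical and stand in for a single image channel; they are not taken from the cassava dataset.

```python
from collections import Counter
import math

# Hypothetical 8-bit pixel intensities standing in for one image channel.
pixels = [12, 40, 40, 40, 200, 200, 255, 13]

# Running sums, accumulated the same way as psum and psum_sq.
psum = 0.0
psum_sq = 0.0
for p in pixels:
    psum += p
    psum_sq += p ** 2

count = len(pixels)
mean = psum / count
var = psum_sq / count - mean ** 2   # population variance: E[x^2] - (E[x])^2
std = math.sqrt(var)

# Mode: the most frequent intensity value.
mode, mode_count = Counter(pixels).most_common(1)[0]

print('mean:', mean)
print('std: ', std)
print('mode:', mode, '(seen', mode_count, 'times)')
```

For real images the same Counter could be fed with the flattened pixel values of each channel, at the cost of one pass over the data.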
PROGRAM 2
Experiment 2: Write a program to implement the naïve Bayesian
classifier for a sample training data set stored as a .CSV file. Compute
the accuracy of the classifier, considering few test data sets.

Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
wine=datasets.load_wine()
print(wine)

print(wine.feature_names)

print(wine.target_names)

X=pd.DataFrame(wine['data'])
print(X.head())

y = wine.target
print(y)

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(wine.data, wine.target, test_size=0.30, random_state=100)

from sklearn.naive_bayes import GaussianNB


gnb = GaussianNB()
gnb.fit(X_train,y_train)
y_pred = gnb.predict(X_test)
print(y_pred)

from sklearn import metrics


print(metrics.accuracy_score(y_test,y_pred))

from sklearn.metrics import confusion_matrix


cm=np.array(confusion_matrix(y_test,y_pred))
cm
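
To make the classifier's decision rule visible, here is a rough from-scratch sketch of Gaussian naive Bayes: for each class it estimates a prior and a per-feature mean and variance, then picks the class with the highest log posterior. The helper names (fit_gaussian_nb, predict_one) and the small two-feature data set are hypothetical. The smoothing constant is an assumption of this sketch and is not how scikit-learn's GaussianNB applies its own variance smoothing.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive (assumption).
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_one(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # log of the Gaussian density N(v; m, var)
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical two-feature training and test data (not the wine dataset).
X_train = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 0.5], [3.2, 0.7], [2.9, 0.4]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[1.1, 2.0], [3.1, 0.6]]
y_test = [0, 1]

model = fit_gaussian_nb(X_train, y_train)
y_pred = [predict_one(model, row) for row in X_test]

# Accuracy: fraction of test samples whose predicted label matches the true one.
accuracy = sum(p == t for p, t in zip(y_pred, y_test)) / len(y_test)
print(y_pred, accuracy)
```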
PROGRAM 3
Write a program to construct a Bayesian network considering medical
data. Use this model to demonstrate the diagnosis of heart patients.

Code:
!pip install pgmpy

import pandas as pd
data = pd.read_csv('heart.csv')
from pgmpy.models import BayesianNetwork
names = "A,B,C,D,E,F,G,H,I,J,K,L,M,RESULT"
names = names.split(",")
len(names)

data.head()

model = BayesianNetwork([('age', 'sex'), ('trestbps', 'chol'),
                         ('restecg', 'thalach'), ('exang', 'target')])
model.fit(data)
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
print(infer)

# the evidence value should be a state of 'age' that actually occurs in heart.csv
q = infer.query(variables=['target'], evidence={'age': 28})

print(q)
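
As a rough illustration of what fitting does for a single edge such as ('exang', 'target'): with maximum-likelihood estimation, each entry of the conditional probability table is a ratio of counts. The records below are hypothetical and only stand in for the two relevant columns of heart.csv. The sketch uses plain Python rather than pgmpy.

```python
from collections import Counter

# Hypothetical (exang, target) records; the real values come from heart.csv.
records = [(0, 1), (0, 1), (0, 0), (0, 1), (1, 0), (1, 0), (1, 1), (1, 0)]

pair_counts = Counter(records)
parent_counts = Counter(exang for exang, _ in records)

# Maximum-likelihood CPT: P(target = t | exang = e) = N(e, t) / N(e)
cpt = {(e, t): pair_counts[(e, t)] / parent_counts[e]
       for e in parent_counts for t in (0, 1)}

# Marginal P(target = 1) = sum over e of P(target = 1 | e) * P(e)
p_target_1 = sum(cpt[(e, 1)] * parent_counts[e] / len(records)
                 for e in parent_counts)

print(cpt)
print('P(target=1) =', p_target_1)
```

In the network defined above, 'target' has 'exang' as its only parent, so evidence on a node in a different component, such as 'age', would not be expected to change the distribution of 'target'.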
PROGRAM – 4

Write a program to implement Bayes' Theorem and its formula.

CODE:
def bayes_theorem(p_b, p_g_given_b, p_g_given_not_b):

    # calculate P(not B)
    not_b = 1 - p_b

    # calculate P(G)
    p_g = p_g_given_b * p_b + p_g_given_not_b * not_b

    # calculate P(B|G)
    p_b_given_g = (p_g_given_b * p_b) / p_g
    return p_b_given_g

#P(B)
p_b = 1/7

# P(G|B)
p_g_given_b = 1

# P(G|notB)
p_g_given_not_b = 2/3

# calculate P(B|G)
result = bayes_theorem(p_b, p_g_given_b, p_g_given_not_b)

# print result
print('P(B|G) = %.2f%%' % (result * 100))

import pandas as pd

df = pd.read_csv('cereal.csv')

df
col1 = df.calories
col2 = df.potass
col3 = df.sodium

# treat the scaled column values of every second row as probabilities
for i in range(1, 21, 2):
    p_a = col1[i] / 1000
    p_b_given_a = col2[i] / 1000
    p_b_not_a = col3[i] / 1000
    result = bayes_theorem(p_a, p_b_given_a, p_b_not_a)
    print(result)
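
For reference, the first call above, with P(B) = 1/7, P(G|B) = 1 and P(G|not B) = 2/3, can be worked through by hand. The denominator is the law of total probability for P(G), exactly as computed inside bayes_theorem:

```latex
\begin{aligned}
P(B \mid G)
  &= \frac{P(G \mid B)\,P(B)}{P(G \mid B)\,P(B) + P(G \mid \neg B)\,P(\neg B)} \\
  &= \frac{1 \cdot \tfrac{1}{7}}{1 \cdot \tfrac{1}{7} + \tfrac{2}{3} \cdot \tfrac{6}{7}}
   = \frac{1/7}{5/7}
   = \frac{1}{5} = 20\%
\end{aligned}
```

This agrees with the value the program prints for the first example, P(B|G) = 20.00%.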
PROGRAM – 5

Write a program to perform Data Analysis on a given Dataset

CODE:

from google.colab import files


uploaded = files.upload()

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("cereal.csv")

df

print(df.weight)
print(df.columns)

x =df.calories
print(x.var())
print(x.std())
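print(df[['weight']].idxmax())
# restrict the mean to numeric columns; the dataset also has text columns
df.mean(axis = 0, numeric_only = True)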
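
One detail worth keeping in mind when reading the output of x.var() and x.std(): pandas reports the sample statistic by default (divisor n - 1), not the population statistic (divisor n). The sketch below shows the difference in plain Python. The calorie values are hypothetical and are not read from cereal.csv.

```python
import statistics

# Hypothetical calorie values; the real column comes from cereal.csv.
calories = [70, 120, 70, 50, 110]

n = len(calories)
mean = sum(calories) / n
squared_dev = sum((c - mean) ** 2 for c in calories)

sample_var = squared_dev / (n - 1)   # divisor n - 1, the pandas default
population_var = squared_dev / n     # divisor n, i.e. ddof=0 in pandas

# The standard library offers both variants for cross-checking.
print(sample_var, statistics.variance(calories))
print(population_var, statistics.pvariance(calories))
```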
PROGRAM – 6

Write a program to implement KNN on an image dataset

CODE:

from sklearn import datasets


digits = datasets.load_digits()
x = digits.data
y = digits.target
import pandas as pd
df = pd.DataFrame(data = y, columns = ['targets'])
df

x.shape
y.shape
digits.images.shape
digits.images[0]

import matplotlib.pyplot as plt


plt.imshow(digits.images[0],cmap = plt.cm.gray_r)
plt.axis('off')
plt.title('Number: ' + str(y[0]))
plt.show()

figure, axes = plt.subplots(3, 10, figsize = (15, 6))

for ax, image, number in zip(axes.ravel(), digits.images, y):
    ax.axis('off')
    ax.imshow(image, cmap = plt.cm.gray_r)
    ax.set_title('Number: ' + str(number))

image = digits.images[0]
print('original image data = ')
print(image)
print()
image_flattened = image.ravel()
print("flattened image = ")
print(image_flattened)

print('feature data for a sample= ')


print(x[0])
print()

print('Feature data for all samples is a two-dimensional array with one 64-value row per 8-by-8 image')
print(x)

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=99, stratify = y)

from sklearn.neighbors import KNeighborsClassifier


knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(x_train,y_train)

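y_pred = knn.predict(x_test)
y_pred

from sklearn.metrics import classification_report

report = classification_report(y_test, y_pred)
print(report)

import seaborn as sns
from sklearn.metrics import confusion_matrix

# confusion matrix of true vs. predicted digits
confusion = confusion_matrix(y_test, y_pred)
s = sns.heatmap(confusion, annot = True, cmap = 'nipy_spectral_r')
s.set_title('Confusion matrix for the digits dataset')
plt.show()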
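
The classifier above is used as a black box. As a minimal sketch of the underlying idea, assuming Euclidean distance and an unweighted majority vote among the k nearest training samples, the prediction step can be written in a few lines of plain Python. The function name knn_predict and the tiny flattened "images" are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical flattened 2x2 "images" (4 pixel values each), two classes.
train_X = [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0],
           [15, 16, 14, 16], [16, 15, 16, 15], [14, 16, 16, 16]]
train_y = ['dark', 'dark', 'dark', 'bright', 'bright', 'bright']

pred_dark = knn_predict(train_X, train_y, [1, 1, 0, 0])        # mostly dark query
pred_bright = knn_predict(train_X, train_y, [15, 15, 15, 15])  # mostly bright query
print(pred_dark, pred_bright)
```

Flattening each 8-by-8 digit into 64 values, as digits.data already does, is what lets a distance between two images be computed this way.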
PROGRAM – 7

Write a program to implement K-Means Clustering.

CODE:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
plt.scatter(X[:,0], X[:,1])

# within-cluster sum of squares (WCSS) for k = 1..10
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)

plt.plot(range(1, 11), wcss)
plt.title('Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()

# final model with the number of clusters suggested by the elbow plot
kmeans = KMeans(n_clusters=4, init='k-means++', max_iter=300, n_init=10, random_state=0)
pred_y = kmeans.fit_predict(X)
plt.scatter(X[:,0], X[:,1])
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red')
plt.show()
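
The quantity plotted in the elbow curve, WCSS, is the sum of squared distances from each point to its assigned cluster center. The sketch below shows plain Lloyd iterations (assign, then update) and the WCSS computation in pure Python. The function name kmeans, the six sample points and the choice of initial centers are all assumptions made for illustration. The listing above uses k-means++ initialisation instead.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations; returns final centers, labels and WCSS."""
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)   # keep an empty cluster's center in place
        if new_centers == centers:
            break                         # converged: no center moved
        centers = new_centers
    wcss = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, wcss

# Hypothetical 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]

# Initial centers are simply the first and last point (an assumption).
centers, labels, wcss = kmeans(points, [points[0], points[-1]])
print(centers, labels, wcss)
```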
PROGRAM – 8

Write a program to implement PCA (Principal Component Analysis)

CODE:

import pandas as pd
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# load dataset into Pandas DataFrame
df = pd.read_csv(url, names=['sepal length', 'sepal width', 'petal length', 'petal width', 'target'])

df.head()

from sklearn.preprocessing import StandardScaler


features = ['sepal length', 'sepal width', 'petal length', 'petal width']
# Separating out the features
x = df.loc[:, features].values
# Separating out the target
y = df.loc[:,['target']].values
# Standardizing the features
x = StandardScaler().fit_transform(x)
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data = principalComponents,
                           columns = ['principal component 1', 'principal component 2'])
finalDf = pd.concat([principalDf, df[['target']]], axis = 1)
import matplotlib.pyplot as plt

fig = plt.figure(figsize = (8,8))
ax = fig.add_subplot(1,1,1)
ax.set_xlabel('Principal Component 1', fontsize = 15)
ax.set_ylabel('Principal Component 2', fontsize = 15)
ax.set_title('2 component PCA', fontsize = 20)
targets = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
colors = ['r', 'g', 'b']

# one scatter series per class so that each gets its own colour
for target, color in zip(targets, colors):
    indicesToKeep = finalDf['target'] == target
    ax.scatter(finalDf.loc[indicesToKeep, 'principal component 1'],
               finalDf.loc[indicesToKeep, 'principal component 2'],
               c = color,
               s = 50)
ax.legend(targets)
ax.grid()

pca.explained_variance_ratio_
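
The explained variance ratio reported above is each eigenvalue of the covariance matrix of the standardized data divided by the sum of all eigenvalues. As a minimal sketch under simplifying assumptions, the two-feature case can be worked out in plain Python, because the eigenvalues of a symmetric 2x2 matrix have a closed form. The measurements below are hypothetical and are not the iris data.

```python
import math

# Hypothetical two-feature measurements (not the iris data).
x1 = [2.0, 4.0, 6.0, 8.0]
x2 = [1.0, 3.0, 2.0, 6.0]
n = len(x1)

def standardize(values):
    """Rescale to zero mean and unit variance (population standard deviation)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

z1, z2 = standardize(x1), standardize(x2)

# Entries of the 2x2 covariance matrix [[a, b], [b, c]] of the standardized data.
a = sum(v * v for v in z1) / (n - 1)
c = sum(v * v for v in z2) / (n - 1)
b = sum(u * v for u, v in zip(z1, z2)) / (n - 1)

# Eigenvalues of a symmetric 2x2 matrix in closed form, largest first.
mid = (a + c) / 2
half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
eigenvalues = [mid + half_gap, mid - half_gap]

# Share of the total variance carried by each principal component.
explained_variance_ratio = [ev / sum(eigenvalues) for ev in eigenvalues]
print(explained_variance_ratio)
```

With two standardized features the ratios reduce to (1 + |r|) / 2 and (1 - |r|) / 2, where r is the correlation between the features, so strongly correlated features put most of the variance on the first component.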
