
A Laboratory Manual for

Pattern Recognition
(3171613)

B.E. Semester 7
(Information Technology)

Directorate of Technical Education, Gandhinagar, Gujarat
Government Engineering College, Bhavnagar
Certificate

This is to certify that Mr./Ms. ______________________________, Enrollment No. ______________ of B.E. Semester ____ Information Technology of this Institute (GTU Code: ______) has satisfactorily completed the Practical / Tutorial work for the subject Pattern Recognition (3171613) for the academic year 2024-25.

Place:
Date:

Name and Sign of Faculty member

Head of the Department



Preface

The main purpose of any laboratory, practical, or field work is to enhance the required skills and to create in students the ability to solve real-world problems by developing the relevant competencies in the psychomotor domain. Keeping this in view, GTU has designed a competency-focused, outcome-based curriculum for its engineering degree programs in which sufficient weightage is given to practical work. This underlines the importance of skill enhancement among students, and it encourages students, instructors, and faculty members to utilize every moment of the time allotted for practicals to achieve the intended outcomes by actually performing the experiments, rather than treating them as study-only exercises. For the effective implementation of a competency-focused, outcome-based curriculum, every practical must be carefully designed to serve as a tool to develop and enhance, in every student, the relevant competencies required by industry. Such psychomotor skills are very difficult to develop through the traditional chalk-and-board delivery method in the classroom. Accordingly, this lab manual is designed to focus on industry-defined, relevant outcomes rather than the old practice of conducting practicals merely to prove concepts and theory.

By using this lab manual, students can go through the relevant theory and procedure in advance of the actual session, which creates interest and gives them a basic idea prior to performance; this in turn strengthens the intended outcomes. Each experiment in this manual begins with the competency, industry-relevant skills, course outcomes, and practical outcomes (objectives). Students are also made aware of the safety measures and necessary precautions to be taken while performing the practical.

This manual also provides guidelines to faculty members to facilitate student-centric lab activities in each experiment by arranging and managing the necessary resources, so that students follow the procedures with the required safety and necessary precautions to achieve the outcomes. It also indicates, through rubrics, how students will be assessed.

Pattern Recognition is a course that deals with the mathematical foundations of recognizing patterns in data, including probability theory, linear algebra, and Bayesian decision theory. It provides a platform for students to demonstrate classification, clustering, feature selection, and feature representation techniques. Students also learn dimensionality reduction, linear discriminant functions, neural networks, and non-metric methods such as decision trees.

Utmost care has been taken while preparing this lab manual; however, there is always scope for improvement. We therefore welcome constructive suggestions for improvement and notification of any errors.

Practical – Course Outcome matrix

Course Outcomes (COs):


After learning the course, the students should be able to:

CO1: Understand the mathematical foundations of pattern recognition, including probability theory, linear algebra, and Bayesian decision theory.
CO2: Understand the pattern recognition concepts of classification, clustering, feature selection, and feature representation.
CO3: Become aware of the theoretical issues involved in pattern recognition system design, such as the curse of dimensionality.
CO4: Develop programming skills in implementing pattern recognition algorithms and analyzing their performance using evaluation metrics on real-world problems such as speech recognition and image classification.

Sr. No. | Objective(s) of Experiment | CO1 | CO2 | CO3 | CO4

1. Apply Bayes' theorem on a given case study by conducting probability calculations using independence, conditional, and joint probability. | √ | | |
2. Implement minimum-error-rate classification and decision surfaces. Use discriminant functions to classify and analyze normal density and discrete features. | √ | √ | √ |
3. Apply unsupervised learning and clustering methods on the WINE dataset using KMeans, hierarchical, and Gaussian mixture models. Validate the results using criterion functions such as the silhouette score and the Calinski-Harabasz index. | √ | √ | √ |
4. Implement dimensionality reduction techniques such as Principal Component Analysis (PCA) and Fisher Discriminant Analysis (FDA) on MNIST data and visualize the data. | √ | √ | √ |
5. Implement linear discriminant functions using gradient descent procedures, the Perceptron algorithm, and Support Vector Machines (SVM). | | √ | √ | √
6. Design and train a Multilayer Perceptron (MLP) feed-forward neural network for a classification task using the CIFAR-10 dataset. | | √ | √ | √
7. Implement Recurrent Neural Networks (RNNs) for sequential data analysis. | | √ | √ | √
8. Non-metric methods for pattern classification: analyzing nominal data using decision trees. | | √ | √ | √

Industry Relevant Skills

The following industry-relevant competencies are expected to be developed in the students by undertaking the practical work of this laboratory:
1. Deciding the type of pattern recognition algorithm that best suits an industry-defined problem.
2. Configuring and developing a solution to an industry-defined problem.

Guidelines for Faculty members


1. Teachers should provide guidelines and a demonstration of the practical to the students, covering all its features.
2. Teachers shall explain the basic concepts/theory related to the experiment to the students before starting each practical.
3. Involve all the students in the performance of each experiment.
4. Teachers are expected to share the skills and competencies to be developed in the students and to ensure that these skills and competencies are developed after the completion of the experimentation.
5. Teachers should give students the opportunity for hands-on experience after the demonstration.
6. Teachers may impart additional knowledge and skills to the students, even if not covered in the manual, that are expected of the students by the concerned industry.
7. Give practical assignments and assess students' performance based on the assigned tasks, checking whether the work is carried out as per the instructions.
8. Teachers are expected to refer to the complete curriculum of the course and follow the guidelines for its implementation.

Instructions for Students


1. Students are expected to listen carefully to all the theory classes delivered by the faculty members and to understand the COs, the content of the course, the teaching and examination scheme, the skill set to be developed, etc.
2. Students shall organize the work in groups and keep a record of all observations.
3. Students shall develop the maintenance skills expected by industry.
4. Students shall attempt to develop related hands-on skills and build confidence.
5. Students shall develop the habit of generating ideas, innovations, skills, etc., beyond those included in the scope of this manual.
6. Students shall refer to technical magazines and data books.
7. Students should develop the habit of submitting the experimentation work as per the schedule and should be well prepared for it.

Index
(Progressive Assessment Sheet)

Sr. No. | Objective(s) of Experiment | Page No. | Date of performance | Date of submission | Assessment Marks | Sign. of Teacher with date | Remarks

1. Apply Bayes' theorem on a given case study by conducting probability calculations using independence, conditional, and joint probability.
2. Implement minimum-error-rate classification and decision surfaces. Use discriminant functions to classify and analyze normal density and discrete features.
3. Apply unsupervised learning and clustering methods on the WINE dataset using KMeans, hierarchical, and Gaussian mixture models. Validate the results using criterion functions such as the silhouette score and the Calinski-Harabasz index.
4. Implement dimensionality reduction techniques such as Principal Component Analysis (PCA) and Fisher Discriminant Analysis (FDA) on MNIST data and visualize the data.
5. Implement linear discriminant functions using gradient descent procedures, the Perceptron algorithm, and Support Vector Machines (SVM).
6. Design and train a Multilayer Perceptron (MLP) feed-forward neural network for a classification task using the CIFAR-10 dataset.
7. Implement Recurrent Neural Networks (RNNs) for sequential data analysis.
8. Non-metric methods for pattern classification: analyzing nominal data using decision trees.

Total

Experiment No: 0

1. Vision & Mission

1.1.1 Vision of DTE

• To provide globally competitive technical education;


• Remove geographical imbalances and inconsistencies;
• Develop student friendly resources with a special focus on girls’ education
and support to weaker sections;
• Develop programs relevant to industry and create a vibrant pool of technical professionals.

1.2.1 Vision of L.D.College of Engineering

To contribute to the sustainable development of the nation through achieving excellence in technical education and research, while facilitating the transformation of students into responsible citizens and competent professionals.

1.2.2 Mission of L.D.College of Engineering

• To impart affordable and quality education in order to meet the needs of industries and
achieve excellence in teaching-learning process.
• To create a conducive research ambience that drives innovation and nurtures research-oriented
scholars and outstanding professionals.
• To collaborate with other academic & research institutes as well as industries in order to
strengthen education and multidisciplinary research.
• To promote equitable and harmonious growth of students, academicians, staff, society and
industries, thereby becoming a center of excellence in technical education.
• To practice and encourage high standards of professional ethics, transparency and
accountability.

1.3.1 Vision of Information Technology Department, L.D.College of Engineering

To shape the young minds of aspiring Information Technology engineers to become the
front runner in the sustainable technological growth of our country, conserving its rich cultural
heritage and catering to its socioeconomic needs.

1.3.2 Mission of Information Technology Department, L.D.College of Engineering

• Bringing an innovative approach to the teaching-learning process to produce competent Information Technology engineers.
• Providing opportunities and the necessary exposure to young engineers to develop themselves into responsible professionals.
• Infusing lifelong learning ability in aspiring minds with a view to making them sensitive towards their social responsibilities.

2. Program outcomes as prescribed by NBA

Engineering Graduates will be able to:

1. Engineering knowledge: Apply the knowledge of mathematics, science, engineering


fundamentals, and an engineering specialization to the solution of complex engineering problems.

2. Problem analysis: Identify, formulate, review research literature, and analyze complex
engineering problems reaching substantiated conclusions using first principles of mathematics,
natural sciences, and engineering sciences.

3. Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for
the public health and safety, and the cultural, societal, and environmental considerations.

4. Conduct investigations of complex problems: Use research-based knowledge and research


methods including design of experiments, analysis and interpretation of data, and synthesis of the
information to provide valid conclusions.

5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with
an understanding of the limitations.

6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess
societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the
professional engineering practice.

7. Environment and sustainability: Understand the impact of the professional engineering solutions
in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.

8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms
of the engineering practice.

9. Individual and team work: Function effectively as an individual, and as a member or leader in
diverse teams, and in multidisciplinary settings.

10. Communication: Communicate effectively on complex engineering activities with the


engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive clear
instructions.

11. Project management and finance: Demonstrate knowledge and understanding of the engineering
and management principles and apply these to one’s own work, as a member and leader in a team,
to manage projects and in multidisciplinary environments.

12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning in the broadest context of technological change.

3. PSOs of the Information Technology Department, L.D.College of Engineering

The Information Technology engineers of L. D. College of Engineering will be able to:
1. Apply the detailed knowledge of code optimization to complex application problems.
2. Write programs with strong skill set with standard coding practices.
3. Assess risk and vulnerability through standard security practices.

4. PEOs of the Information Technology Department, L.D.College of Engineering

1. Pursue a professional career in the field of Information Technology engineering and excel in it.
2. Enhance their knowledge by continuing higher education and research.
3. Work as torchbearers in a multidisciplinary environment to bring innovation and to improve existing technology as entrepreneurs.
4. Keep pace with the cutting-edge scenario of the field with a view to contributing to social and environmental needs in efficient ways.

5. Course outcomes of Pattern Recognition course

CO1: Understand the mathematical foundations of pattern recognition, including probability theory, linear algebra, and Bayesian decision theory.
CO2: Understand the pattern recognition concepts of classification, clustering, feature selection, and feature representation.
CO3: Become aware of the theoretical issues involved in pattern recognition system design, such as the curse of dimensionality.
CO4: Develop programming skills in implementing pattern recognition algorithms and analyzing their performance using evaluation metrics on real-world problems such as speech recognition and image classification.
Experiment No: 1

Apply Bayes theorem on given case study by conducting probability calculations


using independence, conditional and joint probability.
Date:

Competency and Practical Skills:


Understanding the basic concepts of probability theory, including conditional probability,
independence, and joint probability, is essential to grasp the underlying principles behind Bayes'
theorem.

Relevant CO: CO1

Objectives:
(a) Solidify the understanding of probability and gain an intuitive grasp of how different variables relate to each other.
(b) Improve Python programming skills.
(c) Apply Bayes' theorem to a real-world problem.

Equipment/Instruments: Desktop/laptop

Theory:
Ref : https://www.geeksforgeeks.org/bayes-theorem/
Case study:
A hospital wants to screen patients for a rare disease that affects 1 in 10,000 people. The screening test has a sensitivity of 95% (meaning that it correctly identifies 95% of people who have the disease) and a specificity of 99% (meaning that it correctly identifies 99% of people who do not have the disease). However, the test is not perfect and produces false positive and false negative results. If a patient tests positive for the disease, what is the probability that they actually have the disease?
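
Working the numbers by hand before coding, using the figures above: by Bayes' theorem,

P(disease | positive) = P(positive | disease) × P(disease) / P(positive),
where P(positive) = P(positive | disease) × P(disease) + P(positive | no disease) × P(no disease)
                  = 0.95 × 0.0001 + 0.01 × 0.9999 ≈ 0.010094.

Hence P(disease | positive) ≈ 0.000095 / 0.010094 ≈ 0.0094, i.e., well under 1%, which the program below should reproduce.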
Procedure:
1) Open a Python programming terminal.
2) Apply Bayes' theorem to the case study given above.
3) Calculate the probability that a patient who tests positive actually has the disease.

Code:
# Prior probability of having the disease
p_disease = 0.0001

# Sensitivity (true positive rate)
p_positive_given_disease = 0.95

# Specificity (true negative rate)
p_negative_given_no_disease = 0.99

# Complement of the prior: probability of not having the disease
p_no_disease = 1 - p_disease

# False positive rate
p_positive_given_no_disease = 1 - p_negative_given_no_disease

# Total probability of testing positive (law of total probability)
p_positive = (p_positive_given_disease * p_disease) + (p_positive_given_no_disease * p_no_disease)

# Probability of having the disease given a positive test (Bayes' theorem)
p_disease_given_positive = (p_positive_given_disease * p_disease) / p_positive

# Print the result
print("Probability of having the disease given a positive test: {:.4f}".format(p_disease_given_positive))

Observations:

Quiz: (Sufficient space to be provided for the answers)


1. What is the difference between independent and dependent events?

2. If the probability of event A is 0.4 and the probability of event B is 0.6, what is the probability
of both events happening simultaneously if they are independent?
a) 0.24
b) 0.1
c) 0.6
d) 0.4
Suggested Reference: https://machinelearningmastery.com/bayes-theorem-for-machine-learning/

References used by the students: (Sufficient space to be provided)


Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0

Experiment No: 2

Implement minimum-error-rate classification and decision surfaces. Use discriminant functions to classify and analyze normal density and discrete features on the given case study.
Date:

Competency and Practical Skills:


• Ability to apply statistical methods for classification of data.
• Ability to analyze and interpret classification results.
• Ability to evaluate classification performance using appropriate metrics.
• Implementing minimum-error-rate classification algorithms using discriminant functions, and creating decision surfaces to classify data into different classes.
• Analyzing and interpreting classification results using confusion matrices and ROC curves.

Relevant CO: CO1, CO2, CO3

Objectives:
(a) To understand the concept of minimum-error-rate classification and decision surfaces.
(b) To learn how to use discriminant functions for classification and analysis of data.
(c) To understand the concept of normal density and its importance in classification.
(d) To learn how to implement classification algorithms using Python programming language.

Equipment/Instruments: Desktop/laptop

Theory:
https://machinelearningmastery.com/linear-discriminant-analysis-for-machine-learning/
https://machinelearningmastery.com/linear-discriminant-analysis-with-python/

Procedure:
In this lab, we will learn how to implement minimum-error-rate classification using discriminant functions. Discriminant functions classify observations into different classes based on a set of predictors. We will use the iris dataset to demonstrate how to implement them.

Dataset Description:
The iris dataset contains 150 observations of iris flowers. There are three different species of iris
flowers: setosa, versicolor, and virginica. The dataset has four predictors: sepal length, sepal width,
petal length, and petal width.

Step 1: Download dataset from kaggle & Load the dataset.


Step 2: Split the dataset into training and testing sets
Step 3: Train the discriminate function
Step 4: Make predictions on the testing set.
Step 5: Evaluate the performance of the model using confusion matrix.

Code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

#loading dataset
df = pd.read_csv("Iris.csv")
df.head()
df.describe()

#visualize dataset
sns.pairplot(df, hue='Species')

#Separating input col and output col-- slicing of data


data = df.values

x= data[:,1:5]
y= data[:,5]
print(y)

#Splitting Training and testing dataset


from sklearn.model_selection import train_test_split
X_train, x_test, y_train, y_test = train_test_split(x,y, test_size=0.2)

#Model Building using Logistic Regression


from sklearn.linear_model import LogisticRegression
model_LR = LogisticRegression()
model_LR.fit(X_train, y_train)

prediction = model_LR.predict(x_test)

#Calculate Accuracy
from sklearn.metrics import accuracy_score
print (accuracy_score(y_test, prediction)*100)

for i in range(len(prediction)):
    print(f"Actual_value: {y_test[i]}, Predicted_value: {prediction[i]}")

Observations:


#Accuracy Score:

#Output Result :

Quiz: (Sufficient space to be provided for the answers)


1. What is a normal density function?
a) A function that represents the density of a feature in a normal distribution
b) A function that represents the distribution of a feature in a normal distribution
c) A function that represents the correlation between two features in a normal distribution
d) A function that represents the joint probability of two features in a normal distribution
2. What is the goal of minimum-error-rate classification?
a) To classify data with the highest accuracy
b) To classify data with the lowest error rate
c) To classify data with the highest precision
d) To classify data with the highest recall

Suggested Reference:

https://www.analyticsvidhya.com/blog/2021/05/bayesian-decision-theory-discriminant-functions-and-normal-densitypart-3/

References used by the students: (Sufficient space to be provided)


Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0


Experiment No: 03

Apply unsupervised learning and clustering methods on a WINE dataset using KMeans,
Hierarchical, and Gaussian mixture models. The results will be validated using criterion
functions such as silhouette score and Calinski-Harabasz index.

Date:

Competency and Practical Skills:


• Students will learn about unsupervised learning techniques such as KMeans, hierarchical
clustering, and Gaussian mixture models. They will understand how these techniques can
be used to discover patterns and structures in data without any prior knowledge of class
labels.
• Students will learn about different clustering algorithms, including KMeans, hierarchical
clustering, and Gaussian mixture models, and will be able to choose the appropriate
algorithm for a given dataset.
• Students will also learn how to use popular Python libraries such as scikit-learn, pandas,
and matplotlib for data analysis and visualization.

Relevant CO: CO1, CO2, CO3

Objectives:
(a) To learn how to use different techniques to transform and preprocess data to make it suitable for clustering.
(b) To learn how to interpret and visualize clustering results to gain insights into the underlying data structure.

Equipment/Instruments:
Desktop/laptop with Materials:
Python programming environment
NumPy and Scikit-learn libraries

Theory:
Ref : https://www.analyticsvidhya.com/blog/2019/10/gaussian-mixture-models-clustering/

Procedure:
1. Load the WINE dataset using the scikit-learn library.
2. Preprocess the data by scaling the features to have zero mean and unit variance.
3. Split the data into training and testing sets.
4. Implement the KMeans clustering algorithm by setting the number of clusters and fitting
the model to the training data.
5. Predict the cluster assignments for the testing data using the trained KMeans model.
6. Evaluate the clustering results using the silhouette score and Calinski-Harabasz index.
7. Apply steps 1 to 6 for the hierarchical clustering algorithm and the Gaussian mixture model.
8. Evaluate the clustering results using the silhouette score and Calinski-Harabasz index.


9. Compare the clustering results obtained from the different algorithms.

Code:
from sklearn.datasets import load_wine

data = load_wine()
X = data.data # Features
y = data.target # Target labels

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)

# 4. Implement the KMeans clustering algorithm


from sklearn.cluster import KMeans

n_clusters = 3 # Number of clusters (since there are three wine classes)


kmeans = KMeans(n_clusters=n_clusters, random_state=0)
kmeans.fit(X_train)

y_pred = kmeans.predict(X_test)

# 6. Evaluate the clustering results using the silhouette score and Calinski-Harabasz index
from sklearn.metrics import silhouette_score, calinski_harabasz_score

silhouette_avg = silhouette_score(X_test, y_pred)


calinski_harabasz = calinski_harabasz_score(X_test, y_pred)

print("KMeans - Silhouette Score:", silhouette_avg)


print("KMeans - Calinski-Harabasz Index:", calinski_harabasz)

# 7. Apply steps 1 to 6 for Hierarchical clustering algorithm and Gaussian mixture model
from sklearn.cluster import AgglomerativeClustering
from sklearn.mixture import GaussianMixture

# Hierarchical clustering

# AgglomerativeClustering has no separate predict(), so cluster the test set directly
agg_clustering = AgglomerativeClustering(n_clusters=n_clusters)
y_agg_pred = agg_clustering.fit_predict(X_test)

# Gaussian Mixture Model


gmm = GaussianMixture(n_components=n_clusters, random_state=0)
gmm.fit(X_train)
y_gmm_pred = gmm.predict(X_test)

# 8. Evaluate the clustering results for Hierarchical clustering and Gaussian mixture model
silhouette_agg = silhouette_score(X_test, y_agg_pred)
calinski_harabasz_agg = calinski_harabasz_score(X_test, y_agg_pred)

silhouette_gmm = silhouette_score(X_test, y_gmm_pred)


calinski_harabasz_gmm = calinski_harabasz_score(X_test, y_gmm_pred)

print("Hierarchical Clustering - Silhouette Score:", silhouette_agg)


print("Hierarchical Clustering - Calinski-Harabasz Index:", calinski_harabasz_agg)
print("Gaussian Mixture Model - Silhouette Score:", silhouette_gmm)
print("Gaussian Mixture Model - Calinski-Harabasz Index:", calinski_harabasz_gmm)

# 9. Compare the clustering results obtained from the different algorithms


print("Comparison of Clustering Results:")
print("KMeans - Silhouette Score:", silhouette_avg)
print("Hierarchical Clustering - Silhouette Score:", silhouette_agg)
print("Gaussian Mixture Model - Silhouette Score:", silhouette_gmm)
print("KMeans - Calinski-Harabasz Index:", calinski_harabasz)
print("Hierarchical Clustering - Calinski-Harabasz Index:", calinski_harabasz_agg)
print("Gaussian Mixture Model - Calinski-Harabasz Index:", calinski_harabasz_gmm)

import matplotlib.pyplot as plt

# KMeans
plt.figure(figsize=(12, 4))
plt.subplot(131)
plt.scatter(X_test[:, 0], X_test[:, 1], c=y_pred, cmap='viridis')
plt.title("KMeans Clustering")

# Hierarchical clustering
plt.subplot(132)

plt.scatter(X_test[:, 0], X_test[:, 1], c=y_agg_pred, cmap='viridis')


plt.title("Hierarchical Clustering")

# Gaussian Mixture Model


plt.subplot(133)
plt.scatter(X_test[:, 0], X_test[:, 1], c=y_gmm_pred, cmap='viridis')
plt.title("Gaussian Mixture Model")

plt.show()
Observation:

Quiz: (Sufficient space to be provided for the answers)


1. How do you decide the optimal number of clusters in KMeans?
Finding the optimal number of clusters in KMeans can be like searching for the sweet spot in a
hidden treasure hunt. One popular method is the "Elbow Method"—you plot the sum of
squared distances from each point to its assigned cluster center and look for an "elbow" in the

graph where the reduction in this value slows down. That's often the right number of clusters.
Silhouette analysis is another approach: it measures how similar each point is to its own cluster
compared to others. The higher the silhouette score, the better the clustering.
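
A minimal sketch of the elbow and silhouette checks described above, assuming the scaled features X_scaled from the experiment code (the range of k values is illustrative):

from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Inertia (sum of squared distances) drops as k grows; look for the "elbow"
for k in range(2, 11):
    km = KMeans(n_clusters=k, random_state=0, n_init=10).fit(X_scaled)
    score = silhouette_score(X_scaled, km.labels_)
    print(f"k={k}: inertia={km.inertia_:.1f}, silhouette={score:.3f}")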

2. Explain the difference between agglomerative and divisive hierarchical clustering.


Agglomerative and divisive hierarchical clustering are like opposite ends of a spectrum.
Agglomerative clustering starts with each data point as its own cluster and then
successively merges the closest pairs of clusters until you end up with a single cluster or a pre-
defined number of clusters. Think of it as building up a family tree starting with individuals.
Divisive clustering, on the other hand, starts with all data points in a single cluster and then
recursively splits them into smaller clusters. It's like pruning a big tree down to its branches
and leaves.
In essence, agglomerative is bottom-up, while divisive is top-down.

3. How does the initialization of KMeans algorithm affect the final clustering results?
The initialization of the KMeans algorithm can have a big impact on the final clustering results.
KMeans starts with initial guesses for the cluster centroids, and if those guesses are poorly
chosen, it can lead to suboptimal clusters. This is because KMeans can get stuck in local
optima, meaning it might not find the best possible clusters.
A common way to improve initialization is by using the KMeans++ algorithm. It spreads out
the initial centroids in a more informed way, helping to lead to better and more stable
clustering results.

4. Explain the concept of Gaussian mixture model and how it is used in clustering.
Gaussian Mixture Models (GMMs) are like KMeans' sophisticated cousin. They assume that
data points are generated from a mixture of several Gaussian distributions with unknown
parameters. Essentially, instead of just assigning each data point to a single cluster, GMM
assigns a probability of belonging to each cluster.
This is useful in clustering because it allows for soft clustering, where a data point can belong
to multiple clusters with different probabilities. This flexibility can model more complex data
distributions and overlap between clusters, which KMeans can't handle as well.
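
The soft assignments described above can be inspected directly; a small sketch, reusing the fitted gmm and X_test from the experiment code:

# Each row sums to 1: the probabilities of the sample belonging to each cluster
probs = gmm.predict_proba(X_test)
print(probs[:5].round(3))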

Suggested Reference:

1. https://towardsdatascience.com/unsupervised-learning-with-k-means-clustering-generate-color-palettes-from-images-94bb8e6a1416

2. https://neptune.ai/blog/clustering-algorithms

References used by the students: (Sufficient space to be provided)

Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0


Experiment No: 4
To implement dimensionality reduction techniques such as Principal Component Analysis (PCA) and Fisher Discriminant Analysis (FDA) on MNIST data and visualize the data.

Date:

Competency and Practical Skills:


• Programming skills in Python, including libraries such as NumPy and scikit-learn.
• Familiarity with Jupyter Notebook or any Python IDE for writing and executing code.
• Understanding of basic machine learning concepts, including classification, optimization,
and evaluation metrics.

Relevant CO: CO1, CO2, CO3

Objectives:
(a) Gain practical experience in implementing dimensionality reduction techniques on real data.
(b) Analyze the strengths and limitations of PCA and FDA in terms of class separability and the variance they capture.

Equipment/Instruments:
Desktop/laptop with Materials:
Python programming environment
NumPy and Scikit-learn libraries

Theory:
Ref :
https://medium.com/machine-learning-researcher/dimensionality-reduction-pca-and-lda-
6be91734f567

Procedure:
Step 1: Load the MNIST dataset. The MNIST dataset contains 70,000 handwritten digit images, each of size 28 x 28 pixels. The dataset is divided into 60,000 training images and 10,000 testing images.

Step 2: Preprocess the dataset. The dataset needs to be preprocessed before applying any machine learning algorithms; the preprocessing steps include normalization and flattening.

Step 3: Apply PCA. PCA is a popular technique for dimensionality reduction that uses eigenvalue decomposition to reduce the dimensionality of the data. In this step, we apply PCA to the MNIST dataset and visualize the data in two dimensions using scatter plots.

Step 4: Apply FDA. FDA is a supervised dimensionality reduction technique that finds the linear combination of features maximizing the separation between classes. In this step, we apply FDA to the MNIST dataset and visualize the data in two dimensions using scatter plots.


Step 5: Compare the results


Finally, we will compare the results obtained from PCA and FDA and discuss their advantages
and disadvantages.

Code:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.datasets import fetch_openml

#load mnist dataset


mnist = fetch_openml("mnist_784")
X, y = mnist.data / 255.0, mnist.target

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit(X, y).transform(X)

plt.figure(figsize=(10, 5))

plt.subplot(1, 2, 1)
plt.title('PCA')
for i in range(10):
    plt.scatter(X_pca[y == str(i)][:, 0], X_pca[y == str(i)][:, 1], label=str(i))
plt.legend()
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')

plt.subplot(1, 2, 2)
plt.title('FDA')
for i in range(10):
    plt.scatter(X_lda[y == str(i)][:, 0], X_lda[y == str(i)][:, 1], label=str(i))
plt.legend()
plt.xlabel('LDA 1')
plt.ylabel('LDA 2')

plt.tight_layout()
plt.show()

# Compare the results

# Explained variance captured by the two retained components of each method
explained_variance_pca = np.sum(pca.explained_variance_ratio_)
explained_variance_lda = np.sum(lda.explained_variance_ratio_)

print(f'Explained Variance (PCA): {explained_variance_pca:.2f}')
print(f'Explained Variance (FDA): {explained_variance_lda:.2f}')


Observation:

Quiz: (Sufficient space to be provided for the answers)


1. In a dataset with 100 features, you decide to use PCA to reduce the dimensionality to 10. After performing PCA, you notice that the first principal component explains 60% of the variance in the data. What can you infer about the remaining principal components? How would you determine the number of principal components to keep?
Ans:
The first principal component explaining 60% of the variance suggests that it captures the majority of the information in the dataset; the remaining 40% of the variance is spread across the other nine principal components.
To determine the number of principal components to keep, you could use the cumulative explained variance: look for the smallest number of components that explains a desired fraction of the total variance, often 95% or 99%. This way you capture most of the important information while still reducing the dimensionality significantly, as sketched below.
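
A minimal sketch of that selection rule, assuming X is the MNIST feature matrix from the experiment code (the 0.95 threshold is illustrative):

import numpy as np
from sklearn.decomposition import PCA

pca_full = PCA().fit(X)  # keep all components
cumulative = np.cumsum(pca_full.explained_variance_ratio_)
# Smallest number of components whose cumulative explained variance reaches 95%
n_components = int(np.searchsorted(cumulative, 0.95)) + 1
print(n_components)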


Suggested Reference:

References used by the students: (Sufficient space to be provided)

Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0

Experiment No: 5
Implement linear discriminant functions using gradient descent procedures, the Perceptron algorithm, and Support Vector Machines (SVM).

Competency and Practical Skills:


• Ability to evaluate and compare the performance of different linear discriminant
algorithms.
• Proficiency in preparing and preprocessing datasets for classification tasks.
Relevant CO: CO2, CO3, CO4
Objectives:
1. Students will be able to understand the concept of linear discriminate functions and their
role in classification tasks.
2. Students will be able to interpret and analyze the results to draw meaningful conclusions
regarding the performance of different algorithms.
Equipment/Instruments:
Desktop/laptop with Materials:
Python programming environment
NumPy and Scikit-learn libraries
Theory:
Ref : https://www.geeksforgeeks.org/ml-linear-discriminant-analysis/
Procedure:
1. Dataset Preparation:
• Load a dataset suitable for binary classification, such as Breast Cancer dataset.
• Split the dataset into features (X) and labels (y).
• Normalize the features if necessary using techniques like feature scaling or standardization.
• Split the dataset into training and testing sets using a suitable ratio (e.g., 70% for training, 30% for
testing).
2. Implementing Gradient Descent:
• Create a class for gradient descent linear discriminant function (e.g., GradientDescentLDF).
• Initialize the weight vector and bias term to zeros.
• Set hyperparameters such as learning rate, maximum iterations, and convergence threshold.
• Implement the fit() method to update the weight vector and bias term using gradient descent.
Iterate until convergence or reaching the maximum number of iterations.
• Implement the predict() method to predict the class of new inputs based on the learned parameters.
• Train the model by calling fit() on the training set.
• Evaluate the performance of the model on the testing set using appropriate metrics (e.g., accuracy,
confusion matrix).
3. Implementing Perceptron Algorithm:
• Create a class for the Perceptron linear discriminant function (e.g., PerceptronLDF).
• Initialize the weight vector to zeros.
• Set hyperparameters such as learning rate and maximum iterations.
• Implement the fit() method to update the weight vector using the Perceptron algorithm. Iterate until
convergence or reaching the maximum number of iterations.
• Implement the predict() method to predict the class of new inputs based on the learned parameters.
• Train the model by calling fit() on the training set.
• Evaluate the performance of the model on the testing set using appropriate metrics (e.g., accuracy,
confusion matrix).
4. Implementing SVM Algorithm:
• Use the SVM implementation from a machine learning library like scikit-learn (e.g., LinearSVC).
• Set hyperparameters such as penalty parameter (C) and maximum iterations.
• Train the model by calling the fit() method on the training set.
• Evaluate the performance of the model on the testing set using appropriate metrics (e.g., accuracy,
confusion matrix).
5. Performance Evaluation and Comparison:
• Calculate and compare the performance metrics (e.g., accuracy, precision, recall, F1-score) of the
implemented algorithms on the testing set.
• Analyze the results to determine the strengths and limitations of each algorithm.
• Compare the performance of the gradient descent, Perceptron, and SVM algorithms in terms of accuracy, convergence, and computational efficiency.

Code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# read dataset from URL


url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
cls = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'Class']
dataset = pd.read_csv(url, names=cls)

# divide the dataset into features and target variable


X = dataset.iloc[:, 0:4].values
y = dataset.iloc[:, 4].values

# Preprocess the dataset and divide into train and test


sc = StandardScaler()
X = sc.fit_transform(X)
le = LabelEncoder()
y = le.fit_transform(y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# apply Linear Discriminant Analysis


lda = LinearDiscriminantAnalysis(n_components=2)
X_train = lda.fit_transform(X_train, y_train)
X_test = lda.transform(X_test)

# plot the scatterplot


plt.scatter(
X_train[:,0],X_train[:,1],c=y_train,cmap='rainbow',
alpha=0.7,edgecolors='b'
)

# classify using random forest classifier


classifier = RandomForestClassifier(max_depth=2, random_state=0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# print the accuracy and confusion matrix


print('Accuracy : ' + str(accuracy_score(y_test, y_pred)))
conf_m = confusion_matrix(y_test, y_pred)
print(conf_m)
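
The listing above demonstrates LDA followed by a random-forest classifier; for the Perceptron and SVM steps of the procedure, a minimal sketch using scikit-learn's built-in implementations, reusing X_train, X_test, y_train, and y_test from above (hyperparameter values are illustrative):

from sklearn.linear_model import Perceptron
from sklearn.svm import LinearSVC
from sklearn.metrics import accuracy_score

# Perceptron: mistake-driven updates of a linear decision boundary
perc = Perceptron(max_iter=1000, eta0=0.1, random_state=0)
perc.fit(X_train, y_train)
print('Perceptron accuracy : ' + str(accuracy_score(y_test, perc.predict(X_test))))

# Linear SVM: maximum-margin linear classifier
svm = LinearSVC(C=1.0, max_iter=5000)
svm.fit(X_train, y_train)
print('LinearSVC accuracy : ' + str(accuracy_score(y_test, svm.predict(X_test))))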

Observations:

Quiz: (Sufficient space to be provided for the answers)


1. How does the choice of learning rate in the gradient descent algorithm impact the
convergence and accuracy of the linear discriminant function implemented in the above
lab?
Ans:
Learning rate in gradient descent is a big deal. A high learning rate can make the
algorithm converge faster, but it risks overshooting the minimum and never quite settling
down. Too low a learning rate, and you're moving at a snail's pace—you'll get to the
minimum, but you might have to wait forever.
In terms of accuracy for the linear discriminant function, a good learning rate ensures that
the function is well-optimized and doesn't get stuck in local minima. So, the right balance is
key.

2. How does the non-linearly separable nature of the dataset affect the performance and
convergence of the Perceptron algorithm compared to gradient descent and SVM
algorithms?

Ans:
Non-linearly separable data is tough terrain for the Perceptron algorithm. It struggles because it cannot find a single hyperplane to separate the data, leading to no convergence and poor performance.
Gradient descent algorithms, particularly in combination with more complex models like neural networks, can navigate this better because they can capture non-linear relationships.
Support Vector Machines (SVMs) excel here: they use kernel tricks to transform the data into a higher dimension where it becomes linearly separable. This way, SVMs find the optimal hyperplane and converge effectively; the kernel option is sketched below.
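
A minimal illustration of that kernel trick, assuming the same train/test arrays as in this experiment (the RBF kernel and C value are illustrative):

from sklearn.svm import SVC

# The RBF kernel implicitly maps the data into a higher-dimensional space
rbf_svm = SVC(kernel='rbf', gamma='scale', C=1.0)
rbf_svm.fit(X_train, y_train)
print('RBF SVM accuracy : ' + str(rbf_svm.score(X_test, y_test)))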

References used by the students: (Sufficient space to be provided)

Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0

Experiment No: 6
Design and train a Multilayer Perceptron (MLP) feed-forward neural network for a classification task using the CIFAR-10 Dataset.
Competency and Practical Skills:
1. Enhances competencies in neural network design, implementation, data preprocessing,
model training and optimization, performance evaluation, hyperparameter tuning, and
result analysis.
2. It develops practical skills in utilizing deep learning frameworks, handling real-world
datasets, and making informed decisions for classification tasks.
Relevant CO: CO2, CO3, CO4
Objectives:
1. Train the MLP network using the CIFAR-10 dataset and evaluate its performance in classifying the images.
2. Analyze the impact of hyperparameters and network architecture on performance: experiment with different hyperparameters, such as learning rate, batch size, and number of hidden neurons, and observe their effect on the MLP network's performance.
Equipment/Instruments:
Computer/laptop with Python and the necessary libraries (such as TensorFlow, Keras, or PyTorch) installed. CIFAR-10 dataset (readily available in Keras or other libraries).
Theory:
Ref : https://towardsdatascience.com/implementing-a-deep-neural-network-for-the-cifar-10-
dataset-c6eb493008a5
Procedure:
1. Dataset Selection: The CIFAR-10 dataset is a widely used dataset for image classification
tasks. It contains 60,000 color images of size 32x32 pixels, belonging to 10 different classes
(such as airplanes, cars, cats, etc.). The dataset is divided into 50,000 training images and
10,000 testing images.
2. Data Preprocessing: Import the CIFAR-10 dataset using the Keras library. The dataset is
already preprocessed, so you can skip this step.
3. Network Architecture Design: Design the architecture of the MLP network for image
classification. Since CIFAR-10 images are relatively small, a simple MLP with fully
connected layers can be used. Determine the number of neurons in the input layer based on the
image size (32x32x3). Decide on the number of hidden layers, the number of neurons in each
layer, and the activation functions to be used. You can start with a few hidden layers and
gradually increase the complexity if needed.
4. Network Implementation: Implement the MLP network using a deep learning framework such
as Keras or TensorFlow. Define the network architecture by specifying the number of layers,
the number of neurons in each layer, and the activation functions. Set the appropriate input
shape to match the image size (32x32x3).
5. Training and Testing: Split the CIFAR-10 dataset into training and testing sets (usually 80:20
or 70:30 ratio). Use the training set to train the MLP network. Select an appropriate optimizer
(e.g., stochastic gradient descent) and a suitable loss function (e.g., categorical cross-entropy)
for multi-class classification. Train the network for a specified number of epochs, observing
the training loss and accuracy.
6. Hyperparameter Tuning: Experiment with different hyperparameters to improve the network's performance. Adjust hyperparameters such as learning rate, batch size, number of hidden neurons, and number of epochs. Observe the impact of these hyperparameters on the network's accuracy and convergence. You can also explore techniques like regularization or dropout to mitigate overfitting.

7. Performance Evaluation: Evaluate the trained MLP network on the testing set. Calculate and
analyze various performance metrics such as accuracy, precision, recall, and F1-score to
assess the network's effectiveness in classifying CIFAR-10 images. Additionally, generate a
confusion matrix to visualize the classification results and identify any class-specific
performance issues.
8. Comparison and Discussion: Compare the performance of the MLP network with other
classification algorithms applied to the CIFAR-10 dataset. Discuss the strengths and
weaknesses of the MLP approach for the given image classification task. Analyze the impact
of different hyperparameters and network architectures on the performance.
9. Visualization: Visualize the training progress by plotting the training loss and accuracy curves
over the epochs. Additionally, visualize the network's learned features or extract intermediate
layer outputs to gain insights into how the MLP network is processing the CIFAR-10 images.
10. Documentation and Analysis: Summarize the experimental setup, results, and findings in a comprehensive report. Discuss the accuracy achieved by the MLP network and the impact of different hyperparameters. Analyze the strengths and limitations observed.
Code:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

(train_images, train_labels), (test_images, test_labels) = cifar10.load_data()

# Step 3: Network Architecture Design


model = Sequential()
model.add(Flatten(input_shape=(32, 32, 3))) # Input layer
model.add(Dense(512, activation='relu')) # Hidden layer 1
model.add(Dense(256, activation='relu')) # Hidden layer 2
model.add(Dense(10, activation='softmax')) # Output layer (10 classes)

# Step 4: Network Implementation


model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Step 5: Training and Testing


train_images = train_images / 255.0 # Normalize pixel values
test_images = test_images / 255.0
train_labels = to_categorical(train_labels, num_classes=10) # One-hot encoding
test_labels = to_categorical(test_labels, num_classes=10)

history = model.fit(train_images, train_labels, validation_split=0.2, epochs=10, batch_size=128)

# Step 7: Performance Evaluation


test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')

# Step 9: Visualization

import matplotlib.pyplot as plt

# Plot training loss and accuracy


plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Validation Loss')

plt.subplot(1, 2, 2)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend()
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Training and Validation Accuracy')
plt.show()
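
Step 7 of the procedure also asks for a confusion matrix; a minimal sketch, reusing model, test_images, and test_labels from the listing above:

import numpy as np
from sklearn.metrics import confusion_matrix

# Convert one-hot targets and softmax outputs back to class indices
y_true = np.argmax(test_labels, axis=1)
y_pred = np.argmax(model.predict(test_images), axis=1)
print(confusion_matrix(y_true, y_pred))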

Observations:


Quiz: (Sufficient space to be provided for the answers)



1. How does increasing the number of hidden layers and neurons affect the
performance of the MLP network trained on the CIFAR-10 dataset?
Ans:
Increasing the number of hidden layers and neurons in a Multi-Layer Perceptron (MLP)
network can have a couple of effects:
• Capacity to Learn: With more layers and neurons, the network can capture more complex patterns and structures in the CIFAR-10 dataset. It can potentially achieve higher accuracy and better representation of the data.
• Risk of Overfitting: However, adding too many layers and neurons can lead to
overfitting, where the network performs exceptionally well on training data but poorly
on unseen test data.
• Computational Complexity: More layers and neurons increase the computational
resources required for training—longer training times and more memory usage.
• Optimization Difficulty: As the network becomes deeper and wider, it can be harder to
train. Issues like vanishing and exploding gradients can make convergence a
challenge.
So, it's about balancing complexity and performance to avoid overfitting and ensure generalization; one common mitigation, dropout, is sketched below.
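
A minimal sketch of adding dropout regularization to the network from this experiment (the 0.5 rate is illustrative):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout

model = Sequential()
model.add(Flatten(input_shape=(32, 32, 3)))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))  # randomly zero half the activations during training
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))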

2. Does using different activation functions in the hidden layers of the MLP network have an impact on the classification accuracy when training on the CIFAR-10 dataset?

Ans:
Absolutely. Activation functions are like the secret sauce of neural networks; different functions can significantly affect how well your MLP network learns and classifies.
• ReLU (Rectified Linear Unit) is popular because it helps mitigate the vanishing gradient problem, enabling deep networks to learn better. It often performs well with image data like CIFAR-10.
• Sigmoid and Tanh functions can struggle in deep networks because they saturate, leading to vanishing gradients. They can still be useful for specific applications where their properties align well with the data.
• Leaky ReLU and ELU (Exponential Linear Unit) offer variations that address the "dying ReLU" problem by allowing a small gradient when the input is negative.
Experimenting with different activation functions can lead to different performance outcomes, so it's worth trying a few to see what works best for your specific dataset and network architecture.

References used by the students: (Sufficient space to be provided)

Rubric wise marks obtained:

Criterion | Clear (Good) | Average/Partial | Poor/Not at all
Understanding of problem statement | 5 | 3 | 0
Flow of program/logic | 3 | 1 | 0
Error-free / generates output | 2 | 1 | 0

Experiment No: 7
Implementing Recurrent Neural Networks (RNNs) for Sequential Data Analysis.

Competency and Practical Skills:


Students will implement and analyze Recurrent Neural Networks (RNNs) for various sequential data analysis tasks, enabling them to apply these techniques effectively to real-world problems.
Relevant CO: CO2, CO3, CO4
Objectives:
Develop a comprehensive understanding of RNNs, gain practical skills in implementing and fine-tuning RNN models, and apply them effectively to real-world sequential data analysis tasks.
Equipment/Instruments:
Computer with Python and the chosen deep learning framework installed.
LibriSpeech dataset or the TIMIT dataset for sequential data analysis
Theory:
Ref :
https://www.analyticsvidhya.com/blog/2022/03/a-brief-overview-of-recurrent-neural-networks-
rnn/

Procedure:
1. Prepare the dataset:
▪ Datasets such as the LibriSpeech dataset or the TIMIT dataset can be used for
speech recognition tasks where the sequential data consists of audio signals.
▪ Preprocess the dataset by cleaning, normalizing, and transforming it into a suitable
format.
▪ Split the dataset into training and testing sets.
2. Implement an RNN model using the chosen deep learning framework:
▪ Import the necessary libraries and modules.
▪ Load the dataset into the program.
▪ Encode the sequential data into a suitable numerical representation, such as one-
hot encoding or word embeddings.
▪ Split the dataset into input sequences and corresponding target sequences.
▪ Split the data into training and testing sets.
3. Configure the RNN architecture:
▪ Select the appropriate RNN layer type (basic RNN, LSTM, or GRU) based on the
problem and dataset.
▪ Determine the number of RNN layers and their parameters, such as the number of
units or hidden states.
▪ Set other hyperparameters, such as learning rate, batch size, and number of
epochs.
▪ Define the loss function, optimizer, and evaluation metrics for training the model.
4. Train the RNN model:
▪ Initialize the RNN model with the defined architecture.
▪ Train the model using the training dataset.
▪ Monitor the training progress, such as loss convergence and model performance.
5. Evaluate the trained model:
▪ Use the trained RNN model to make predictions on the testing dataset.
▪ Calculate performance metrics, such as accuracy, precision, recall, and F1-score.
▪ Visualize the model's performance using appropriate plots, such as accuracy
curves or confusion matrices.
6. Fine-tune the model:
▪ Experiment with different hyperparameters, such as learning rate, number of hidden units, or dropout rate, to observe their impact on performance.
▪ Iterate the training and evaluation process to find the optimal configuration.


Code:
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

# Load a small sample from the Speech Commands dataset
dataset, info = tfds.load('speech_commands', split='train[:1%]',
                          with_info=True, as_supervised=True)
num_classes = info.features['label'].num_classes

# Preprocess the dataset: convert raw audio into MFCC feature sequences
def preprocess(audio, label):
    audio = tf.cast(audio, tf.float32)
    # Pad clips shorter than 1 second (16000 samples at a 16 kHz sample rate)
    audio = tf.pad(audio, [[0, tf.maximum(0, 16000 - tf.shape(audio)[0])]])
    audio = audio / 32768.0  # Normalize 16-bit PCM samples to [-1, 1]

    # STFT parameters: 25 ms frames (400 samples) with a 10 ms hop (160 samples)
    frame_length = 400
    frame_step = 160
    fft_length = 512

    # Compute the STFT and the magnitude spectrogram
    stft = tf.signal.stft(audio, frame_length=frame_length,
                          frame_step=frame_step, fft_length=fft_length)
    spectrogram = tf.abs(stft)

    # Project onto 13 mel bins, take the log, and derive MFCCs
    mel_weight_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=13, num_spectrogram_bins=fft_length // 2 + 1,
        sample_rate=16000)
    mel_spectrogram = tf.tensordot(spectrogram, mel_weight_matrix, 1)
    log_mel_spectrogram = tf.math.log(mel_spectrogram + 1e-6)
    mfccs = tf.signal.mfccs_from_log_mel_spectrograms(log_mel_spectrogram)
    return mfccs, label  # mfccs shape: (frames, 13), as the LSTM expects

dataset = dataset.map(preprocess).batch(1)

# Build the RNN model: two stacked LSTMs feeding a softmax over the word classes
model = Sequential()
model.add(LSTM(128, input_shape=(None, 13), return_sequences=True))
model.add(LSTM(64))
model.add(Dense(num_classes, activation='softmax'))

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()

# Train the model
model.fit(dataset, epochs=1)

# Evaluate the model (on the same sample here; use a held-out split in practice)
loss, accuracy = model.evaluate(dataset)
print(f'Loss: {loss}, Accuracy: {accuracy}')

Observation:


Quiz: (Sufficient space to be provided for the answers)


1. Discuss the challenges and considerations involved in selecting appropriate
hyperparameters for RNN models. How would you approach the task of hyperparameter
tuning to optimize the performance of an RNN model?
Ans:
Choosing the right hyperparameters for RNNs can be a bit of a balancing act, involving
both art and science.
Key challenges include:
• Overfitting: Larger models might overfit your data, capturing noise rather than the
underlying pattern.
• Training Time: Too many parameters can significantly slow down training.
• Gradient Issues: Vanishing and exploding gradients can make training unstable.
Hyperparameter Tuning Approaches:
• Grid Search: Systematically explores a range of values, but it's computationally expensive.
• Random Search: Samples a subset of the parameter space; often more efficient.
• Bayesian Optimization: Uses past evaluations to choose the next set of hyperparameters
more intelligently.
• Manual Tuning: Based on experience and intuition.
You might start with basic configurations, such as learning rate, number of layers, and
units per layer, and then fine-tune using one of the above methods. A good balance helps
the RNN model learn effectively without getting bogged down.
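
The sketch below illustrates the random-search idea with a small LSTM; train_ds and val_ds are placeholder dataset names, and the parameter ranges and 12-class output are assumptions for illustration:

import random
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Build an RNN for a given (units, learning_rate) configuration
def build_rnn(units, learning_rate, num_classes=12):
    model = Sequential([
        LSTM(units, input_shape=(None, 13)),
        Dense(num_classes, activation='softmax')
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

best_acc, best_config = 0.0, None
for _ in range(5):  # sample five random configurations
    units = random.choice([32, 64, 128])
    lr = 10 ** random.uniform(-4, -2)  # log-uniform learning-rate sample
    model = build_rnn(units, lr)
    model.fit(train_ds, epochs=2, verbose=0)    # train_ds: assumed training set
    _, acc = model.evaluate(val_ds, verbose=0)  # val_ds: assumed validation set
    if acc > best_acc:
        best_acc, best_config = acc, (units, lr)
print("Best (units, lr):", best_config, "validation accuracy:", best_acc)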

2. Discuss the significance of performance evaluation metrics, such as accuracy,


precision, recall, and F1-score, in the context of sequential data analysis. Which
metrics are most suitable for different types of sequential data analysis tasks?
Ans:
Evaluation metrics are essential because they tell you how well your model is performing
and help you compare different models.
Accuracy measures the proportion of correct predictions. While straightforward, it can be
misleading for imbalanced datasets where most data belongs to one class.
Precision and Recall dive deeper. Precision is the ratio of true positives to all predicted
positives, showing how many selected items are relevant. Recall is the ratio of true
positives to all actual positives, indicating how many relevant items are selected.
F1-Score balances precision and recall. It's particularly useful when you need a single
metric to balance both concerns, especially with imbalanced datasets.
For sequential data:
• Precision and Recall are crucial in tasks like anomaly detection, where false positives
and false negatives have different costs.
• F1-Score is handy for balancing these in complex tasks like language modeling or
time-series forecasting.
Each metric shines in its specific context.
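
For reference, a minimal sketch of computing these metrics with scikit-learn; the label vectors here are made-up placeholders:

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 1, 0, 1, 1]  # hypothetical ground-truth labels
y_pred = [0, 1, 0, 0, 1, 1]  # hypothetical model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1-score :", f1_score(y_true, y_pred))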

Rubric wise marks obtained:


Criterion                            Clear (Good)   Average/Partial   Poor/Not at all
Understanding of problem statement        5               3                 0
Flow of program/logic                     3               1                 0
Error free / Generates output             2               1                 0

Experiment No: 8
Non-metric Methods for Pattern Classification: Analyzing Nominal Data using Decision
Trees.
Relevant CO: CO1, CO2, CO3, CO4
Objectives:
• Gain hands-on experience in implementing decision tree algorithms for classification or
regression tasks on datasets with nominal data.
• Discuss the strengths and limitations of decision trees for analyzing nominal data and
compare them to other machine learning algorithms.

Equipment/Instruments:
Computer with Python installed.
Car Evaluation Dataset: This dataset includes attributes related to the evaluation of car features,
such as buying price, maintenance cost, number of doors, and luggage capacity. It can be used to
predict the acceptability of a car using decision trees.
Decision tree libraries (e.g., scikit-learn in Python).
Theory:
Ref: https://towardsdatascience.com/decision-trees-d07e0f420175

Procedure:
1. Data Loading and Exploration:
• Load the Car Evaluation Dataset: This dataset includes attributes related to the evaluation
of car features, such as buying price, maintenance cost, number of doors, and luggage
capacity. It can be used to predict the acceptability of a car using decision trees
• Perform initial data exploration, including checking for missing values and
understanding the distribution of the target variable.
• Explore the non-numeric or nominal attributes and identify any unique categories or
patterns.
2. Data Preprocessing:
• Handle missing values: Decide on an appropriate strategy for handling missing values,
such as imputation or removal.
• Encode categorical variables: Convert non-numeric or nominal attributes into a numerical
format suitable for decision tree analysis. This can be done using techniques like label
encoding or one-hot encoding.
3. Dataset Split:
Split the preprocessed dataset into training and testing sets. The recommended split is usually 70-30
or 80-20 for training and testing, respectively.
4. Decision Tree Implementation:
Import the necessary decision tree libraries in the chosen programming language (e.g., scikit-learn
in Python). Define the decision tree model and set any desired hyperparameters (e.g., maximum
tree depth, minimum samples for a split). Fit the decision tree model to the training data using the
appropriate function or method.
5. Model Training and Evaluation:
• Once the decision tree model is fitted to the training data, evaluate its performance on the
testing data.
• Calculate relevant evaluation metrics such as accuracy, precision, recall, or mean squared
error (for regression tasks).
• Interpret the results and analyze the model's effectiveness in handling non-numeric or
nominal data.
6. Model Visualization and Interpretation:
• Visualize the decision tree model to understand the learned decision rules and important
features.
• Use libraries or functions to generate visual representations of the decision tree, such as tree
28
Pattern Recognition (3171613) 220213116007
diagrams or rule sets.
• Interpret the decision rules and discuss the significance of each node and branch in the tree.
7. Fine-tuning and Optimization:
• Experiment with different hyperparameters to optimize the decision tree model's
performance. For example, vary the tree depth or the minimum number of samples required
for a split.
• Evaluate the model's performance after each adjustment and compare the results to
determine the optimal hyperparameter settings.
Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text, export_graphviz
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import LabelEncoder

# 1. Data Loading and Exploration
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data"
column_names = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]
data = pd.read_csv(url, names=column_names)

print(data.head())
print(data.tail())
data.info()

# Check for missing values
print(data.isnull().sum())

# Explore the distribution of the target variable
print(data["class"].value_counts())

# Explore non-numeric attributes
for column in data.select_dtypes(include=['object']).columns:
    print(column, ":", data[column].unique())

# 2. Data Preprocessing
# Encode categorical variables using label encoding
label_encoders = {}
for column in data.select_dtypes(include=['object']).columns:
    le = LabelEncoder()
    data[column] = le.fit_transform(data[column])
    label_encoders[column] = le

# 3. Dataset Split
X = data.drop("class", axis=1)
y = data["class"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# 4. Decision Tree Implementation
# Define the decision tree model with hyperparameters
decision_tree = DecisionTreeClassifier(max_depth=5)

# Fit the model to the training data
decision_tree.fit(X_train, y_train)

# 5. Model Training and Evaluation
y_pred = decision_tree.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred))

# 6. Model Visualization and Interpretation
tree_rules = export_text(decision_tree, feature_names=X.columns.tolist())
print("Decision Tree Rules:\n", tree_rules)

from IPython.display import Image
import graphviz

# Visualize the decision tree and save it to a file
dot_data = export_graphviz(
    decision_tree, out_file=None, feature_names=X.columns.tolist(),
    class_names=label_encoders["class"].classes_, filled=True, rounded=True,
    special_characters=True
)
graph = graphviz.Source(dot_data)
graph.format = 'png'  # Choose the format you prefer (e.g., 'png', 'pdf', 'svg')
graph.render("car_decision_tree", view=False)

# Display the visualization in the notebook
Image(filename="car_decision_tree.png")  # Adjust the filename if you chose another format

Observations:


Quiz: (Sufficient space to be provided for the answers)


1. Explain how the choice of attribute selection measure, such as Gini index or
information gain, can impact the construction and performance of decision trees on
datasets with non-numeric or nominal data. Discuss the strengths and weaknesses of
each measure and provide examples of scenarios where one measure might be more
suitable than the other.
Ans:
The choice between the Gini index and information gain can indeed influence the
construction and performance of decision trees.
• Gini Index measures the impurity of a dataset. A lower Gini index indicates a
purer node. It's computationally less expensive and can handle splits better with
categorical attributes but might be biased towards attributes with more levels.
• Information Gain is based on entropy and calculates the reduction in randomness
or disorder. It's more robust and works well with nominal data, providing clearer
insights. However, it's computationally more intensive.
Example:
• For datasets with non-numeric, high-cardinality features (many unique values), the
Gini Index might be faster and simpler.
• For datasets requiring nuanced decisions where interpretability is key, information
gain could provide more meaningful splits.
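
In scikit-learn the two measures map directly onto the criterion parameter, as this small sketch shows (assuming the X_train/X_test split from the experiment code above):

from sklearn.tree import DecisionTreeClassifier

gini_tree = DecisionTreeClassifier(criterion='gini', max_depth=5, random_state=42)
entropy_tree = DecisionTreeClassifier(criterion='entropy', max_depth=5, random_state=42)  # information gain

gini_tree.fit(X_train, y_train)
entropy_tree.fit(X_train, y_train)

print("Gini accuracy    :", gini_tree.score(X_test, y_test))
print("Entropy accuracy :", entropy_tree.score(X_test, y_test))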

2. Describe the concept of ensemble learning and its relevance to decision trees.
Discuss two popular ensemble techniques, namely Bagging and Boosting, and
explain how they can enhance the performance of decision tree models.
Ans:
Ensemble learning is about combining multiple models to create one strong predictive
model. It leverages the idea that a group of weak learners can come together to form a robust
model.
Bagging (Bootstrap Aggregating) involves training multiple versions of a model on
different subsets of the data (drawn with replacement). By averaging the predictions, it
reduces variance and helps the model generalize better.
Boosting focuses on training models sequentially. Each model tries to correct the errors of
the previous one. This iterative process reduces bias and improves the model's performance
on difficult cases.
So, bagging focuses on reducing overfitting by averaging, while boosting hones in on
reducing errors and refining the model iteratively. Both techniques supercharge decision
trees, making them more accurate and resilient.
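
Below is a hedged sketch of both techniques wrapped around decision trees, again assuming the X_train/X_test split from the experiment code (note: scikit-learn versions before 1.2 use base_estimator instead of estimator):

from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

base = DecisionTreeClassifier(max_depth=3)  # deliberately shallow weak learner

# Bagging: many trees on bootstrap samples, predictions aggregated by voting
bagging = BaggingClassifier(estimator=base, n_estimators=50, random_state=42)
bagging.fit(X_train, y_train)

# Boosting: trees trained sequentially, each reweighting the previous one's errors
boosting = AdaBoostClassifier(estimator=base, n_estimators=50, random_state=42)
boosting.fit(X_train, y_train)

print("Bagging accuracy :", bagging.score(X_test, y_test))
print("Boosting accuracy:", boosting.score(X_test, y_test))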

Rubric wise marks obtained:


Criterion                            Clear (Good)   Average/Partial   Poor/Not at all
Understanding of problem statement        5               3                 0
Flow of program/logic                     3               1                 0
Error free / Generates output             2               1                 0
