Machine Learning Lab Manual 2024-25

Bharati Vidyapeeth's College of Engineering

New Delhi

LAB MANUAL

Department: Computer Science and Engineering
Academic Year: 2024-25
Semester: 7th
Subject Name: Machine Learning Lab
Subject Code: ML-407P
Faculty Name: Dr. Rakhi

INSTITUTE VISION

To be an institute of excellence that provides quality technical education and research to
create competent graduates for serving industry and society.

INSTITUTE MISSION

M1: To impart quality technical education through a dynamic teaching-learning environment.
M2: To promote research and innovation activities which give opportunities for life-long
learning in the context of academia and industry.
M3: To build up industry-institute links through partnerships and collaborative
developmental works.
M4: To inculcate work ethics and commitment in graduates for their future endeavors to
serve society.

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING

VISION OF DEPARTMENT

To develop as a centre of excellence in computer science and engineering education and
research so as to produce globally competent professionals with a sense of social
responsibility.

MISSION OF DEPARTMENT

The mission of the CSE Department is to:

M1: Impart technical knowledge in Computer Science and Engineering with
state-of-the-art infrastructure.
M2: Provide a conducive environment for the holistic development of graduates.
M3: Inculcate leadership qualities, teamwork and strong ethical values among the
graduates.
M4: Promote a research culture and industry-academia collaboration to
strengthen innovation.

PROGRAM EDUCATIONAL OBJECTIVES (PEO)

Following are the three CSE Department Program Educational Objectives (PEOs):

PEO1: To produce graduates with in-depth knowledge of Computer Science and
Engineering to contribute towards innovation, research and excellence in higher studies.
PEO2: To inculcate life-long learning skills in graduates enabling them to adapt to
changing technologies, modern tools and work in teams.
PEO3: To produce ethically responsible graduates who are involved in transforming the
society by providing suitable engineering solutions.

The undergraduate CSE program has the following Program Outcomes (POs) and
Program Specific Outcomes (PSOs).

PROGRAM OUTCOMES (PO)

1. Engineering knowledge: Apply the knowledge acquired in mathematics, science, engineering
for the solution of complex engineering problems.
2. Problem analysis: Identify research gaps, formulate and analyze complex engineering
problems drawing substantiated conclusions using basic knowledge of mathematics, natural
sciences and engineering sciences.
3. Design/development of solutions: Design solutions for the identified complex engineering
problems as well as develop solutions that meet the specified needs for the public health and
safety, and the cultural, societal and environmental considerations.
4. Conduct investigations of complex problems: Use research-based knowledge and research
methods, including design of experiments, analysis and interpretation of data and synthesis of
the information to provide valid conclusions.

5. Modern tool usage: Work on the latest technologies, resources and software tools including
prediction and modelling to complex engineering activities with an understanding of their
limitations.
6. The engineer and society: Apply the basic acquired knowledge to measure societal, health,
safety, legal and cultural issues and identify the consequential responsibilities relevant to
professional engineering practice.
7. Environment and Sustainability: Comprehend the impact of professional engineering
solutions in the context of society and the environment, and demonstrate the need for and
knowledge of sustainable development.
8. Ethics: Apply ethical principles and commit to the professional ethics, responsibilities and
norms of engineering practice.
9. Individual and team work: Function effectively as an individual, and as a member or leader
in diverse teams, and in multidisciplinary settings.
10. Communication: Communicate effectively on complex engineering activities with the
engineering community and with society at large, such as, being able to comprehend and write
effective reports and design documentation, make effective presentations, and give and receive
clear instructions.
11. Project management and finance: Demonstrate knowledge and understanding of the
engineering and management principles and apply these to one’s own work, as a member and
leader in a team, to manage projects and in multidisciplinary environments.
12. Life-long learning: Recognize the need for, and have the preparation and ability to engage in
independent and life-long learning.

PROGRAM SPECIFIC OUTCOMES (PSO)

PSO1: Ability to apply fundamentals of computational mathematics and algorithmic
formulations to solve the real-time challenges of computer engineering encountered in
research and industry.
PSO2: Capability to design and develop software and hardware applications using the logical,
analytical, and programming skills learnt, while also following professional and social
ethics.

TABLE OF CONTENTS

S. No.  Contents
1       Course details
        1.1 Course objectives
        1.2 Course outcomes
        1.3 CO-PO/PSO mapping
        1.4 Evaluation scheme
        1.5 Guidelines/rubrics for continuous assessment
        1.6 Lab safety instructions
        1.7 Instructions for students while writing an experiment in the lab file
2       List of experiments and content beyond syllabus
3       Experimental setup details for the course
4       Experiment details
5       Course exit survey

1. COURSE DETAILS
1.1 COURSE OBJECTIVES

● To understand the need for machine learning.
● To learn about regression and feature selection.
● To understand classification algorithms.
● To learn clustering algorithms.

1.2 COURSE OUTCOMES

At the end of the course, the student will be able to:

ML-407P.1: To formulate machine learning problems.
           Bloom level: Remember, Analyze
ML-407P.2: Learn about regression and feature selection techniques and develop applications based on the same.
           Bloom level: Remember, Understand, Apply, Evaluate, Create
ML-407P.3: Apply machine learning techniques such as classification to practical applications.
           Bloom level: Understand, Remember, Apply, Analyze
ML-407P.4: Apply clustering algorithms to develop various practical applications.
           Bloom level: Understand, Remember, Apply, Create

1.3 MAPPING COURSE OUTCOMES (CO) AND PROGRAM OUTCOMES (PO)/PROGRAM SPECIFIC OUTCOMES (PSO)

CO    PO1  PO2  PO3  PO4  PO5  PO6  PO7  PO8  PO9  PO10  PO11  PO12  PSO1  PSO2
CO1   3    3    3    3    3    2    2    -    -    -     -     2     2     2
CO2   3    3    3    3    3    2    2    -    -    -     -     2     2     2
CO3   3    3    3    3    3    2    2    -    -    -     -     2     2     2
CO4   3    3    3    3    3    2    2    -    -    -     -     2     2     2

1.4 EVALUATION SCHEME

Laboratory Components    Internal    External
Marks                    40          60
Total Marks              100

1.5 GUIDELINES FOR CONTINUOUS ASSESSMENT FOR EACH EXPERIMENT

● Attendance and performance in a minimum of eight experiments and Content Beyond Syllabus – 20 marks
  ⮚ Each experiment will carry a weight of 20 marks
    • Experiment performance [5 Marks]
    • File [10 Marks]
    • Attendance [5 Marks]

● Internal lab examination and viva voce – 20 marks

The rubrics for experiment execution, lab file and viva voce are given below:

Experiment Marks details:

Status                                       Marks
Completed and executed perfectly             4-5
Completed but partially executing            2-3
Logically incorrect program or errors        1
Unacceptable efforts/Absent                  0

File Marks details:

Status                                                  Marks
File contents & checked timely                          9-10
File contents & checked not timely (after one week)     7-8
File contents & checked after two weeks                 0-2

Attendance/Viva-Voce Marks details:

Status                   Marks
Viva (Good)              4-5
Viva (Average)           1-3
Viva (Unsatisfactory)    0

Note: Viva Voce Questions for each experiment should be related to
Course Outcomes.

1.6 Safety Guidelines/Rules for laboratory

1.7 Format for students while writing an experiment in the lab file

Experiment No:

Aim:

Course Outcome:

Software used:

Theory:

Flowchart/Algorithm/Code:

Results:

Expected Outcome attained: YES/NO

2. LIST OF EXPERIMENTS AS PER GGSIPU

Sr. No.  Title of Lab Experiment                                               CO
1        Introduction to Jupyter IDE and its libraries pandas and NumPy        CO1
2        Program to demonstrate Simple Linear Regression                       CO2
3        Program to demonstrate Logistic Regression                            CO2
4        Program to demonstrate Decision Tree – ID3 Algorithm                  CO2
5        Program to demonstrate k-Nearest Neighbor flowers classification      CO3
6        Program to demonstrate Naïve Bayes Classifier                         CO3
7        Program to demonstrate PCA and LDA on Iris dataset                    CO3
8        Program to demonstrate DBSCAN clustering algorithm                    CO4

CONTENT BEYOND SYLLABUS

Sr. No.  Name of Experiment                                                    CO
1        Program to demonstrate K-Medoid clustering algorithm                  CO4
2        Program to demonstrate K-Means Clustering Algorithm on Handwritten    CO4
         Dataset

3. EXPERIMENTAL SETUP DETAILS FOR THE COURSE

Software Requirements:
● Python
● Anaconda

Minimum Hardware Requirements

Dual Core based PC with 2 GB RAM

4. EXPERIMENTAL DETAILS

Experiment No.-1

Introduction to JUPYTER IDE and its libraries Pandas and NumPy

Jupyter Notebook
The Jupyter Notebook is a powerful tool for interactively developing and presenting data
science projects. This experiment walks you through how to use Jupyter Notebooks for data science
projects and how to set them up on your local machine.
First, though: what is a “notebook”?
A notebook integrates code and its output into a single document that combines visualizations,
narrative text, mathematical equations, and other rich media. In other words: it’s a single document
where you can run code, display the output, and also add explanations, formulas, charts, and make your
work more transparent, understandable, repeatable, and shareable.
Using Notebooks is now a major part of the data science workflow at companies across the globe. If
your goal is to work with data, using a Notebook will speed up your workflow and make it easier to
communicate and share your results.
Best of all, as part of the open source Project Jupyter, Jupyter Notebooks are completely free. You can
download the software on its own, or as part of the Anaconda data science toolkit.
Although many different programming languages can be used in Jupyter Notebooks, we will use
Python, and pandas specifically, for this lab. If you have experience with another language, the Python in
this experiment shouldn't be too cryptic, and it will still help you get Jupyter Notebooks set up locally.
Jupyter Notebooks can also act as a flexible platform for getting to grips with pandas and Python. We will
cover the basics of installing Jupyter and creating your first notebook, along with pandas and NumPy.
Installation
The easiest way for a beginner to get started with Jupyter Notebooks is by installing Anaconda.
Anaconda is the most widely used Python distribution for data science and comes pre-loaded with all
the most popular libraries and tools. Some of the biggest Python libraries included in Anaconda include
NumPy, pandas, and Matplotlib, though the full 1000+ list is exhaustive. Anaconda thus lets us hit the
ground running with a fully stocked data science workshop without the hassle of managing countless
installations or worrying about dependencies and OS-specific (read: Windows-specific) installation
issues.
To get Anaconda, simply:
1. Download the latest version of Anaconda for Python 3.8.
2. Install Anaconda by following the instructions on the download page and/or in the executable.
If you are a more advanced user with Python already installed and prefer to manage your packages
manually, you can just use pip:
pip3 install jupyter

Creating Your First Notebook

In this section, we are going to learn to run and save notebooks, familiarize ourselves with their structure,
and understand the interface. We will also meet some core terminology that leads towards a practical
understanding of how to use Jupyter Notebooks by yourself and sets us up for the next
section, which walks through an example data analysis and brings everything we learn here to life.

Running Jupyter

On Windows, you can run Jupyter via the shortcut Anaconda adds to your start menu, which will open
a new tab in your default web browser that should look something like the following screenshot. This isn't
a notebook just yet, but don't panic! There's not much to it. This is the Notebook Dashboard,
specifically designed for managing your Jupyter Notebooks. Think of it as the launchpad for exploring,
editing and creating your notebooks.

Be aware that the dashboard will give you access only to the files
and sub-folders contained within Jupyter's start-up directory (i.e., where Jupyter or Anaconda is
installed). However, the start-up directory can be changed. It is also possible to start the dashboard on
any system via the command prompt (or terminal on Unix systems) by entering the command jupyter
notebook; in this case, the current working directory will be the start-up directory.

With Jupyter Notebook open in your browser, you may have noticed that the URL for the dashboard is
something like [Link]. Localhost is not a website; it indicates that the content is being served from
your local machine: your own computer. Jupyter's Notebooks and dashboard are web apps, and Jupyter
starts up a local Python server to serve these apps to your web browser, making it essentially
platform-independent and opening the door to easier sharing on the web. (If you don't understand this
yet, don't worry. The important point is just that although Jupyter Notebooks opens in your browser,
it is being hosted and run on your local machine. Your notebooks aren't actually on the web until you
decide to share them.)

The dashboard's interface is mostly self-explanatory, though we will come
back to it briefly later. So what are we waiting for? Browse to the folder in which you would like to
create your first notebook, click the "New" drop-down button in the top-right and select "Python 3":

Hey presto, here we are! Your first Jupyter Notebook will open in a new tab. Each notebook uses its
own tab because you can open multiple notebooks simultaneously. If you switch back to the dashboard,
you will see the new file [Link] and you should see some green text that tells you your
notebook is running.
What is an ipynb File?
The short answer: each .ipynb file is one notebook, so each time you create a new notebook, a new
.ipynb file will be created. The longer answer: each .ipynb file is a text file that describes the contents
of your notebook in a format called JSON. Each cell and its contents, including image attachments that
have been converted into strings of text, is listed therein along with some metadata. You can edit this
yourself, if you know what you are doing, by selecting "Edit > Edit Notebook Metadata" from the
menu bar in the notebook. You can also view the contents of your notebook files by selecting "Edit"
from the controls on the dashboard. However, the key word there is can. In most cases, there's no reason
you should ever need to edit your notebook metadata manually.
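
To make the JSON structure concrete, the sketch below builds a minimal notebook as a Python dictionary, serialises it with the standard json module and reads the cells back. The field names follow the version 4 notebook format as it is commonly documented; the cell contents are made up for illustration, and a real notebook usually carries more metadata.

```python
import json

# A minimal notebook: one Markdown cell and one code cell (illustrative content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My first notebook"]},
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello World!')"],
        },
    ],
}

# This string is the kind of text that is stored in an .ipynb file.
text = json.dumps(notebook, indent=1)

# Reading it back gives the same structure; list each cell's type.
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Because the file is plain JSON, any text editor or script can inspect it, which is also why notebooks can be kept under version control.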
The Notebook Interface
Now that you have an open notebook in front of you, its interface will hopefully not look entirely alien.
After all, Jupyter is essentially just an advanced word processor.

Why not take a look around? Check out the menus to get a feel for it, and especially take a few moments to
scroll down the list of commands in the command palette, which is the small button with the keyboard
icon (or Ctrl + Shift + P). There are two fairly prominent terms that you should notice, which are
probably new to you: cells and kernels. They are key both to understanding Jupyter and to what makes it
more than just a word processor. Fortunately, these concepts are not difficult to understand.
● A kernel is a “computational engine” that executes the code contained in a notebook
document.
● A cell is a container for text to be displayed in the notebook or code to be executed by
the notebook’s kernel.
Cells
We’ll return to kernels a little later, but first let’s come to grips with cells. Cells form the body of a
notebook. In the screenshot of a new notebook in the section above, that box with the green outline is an
empty cell. There are two main cell types that we will cover:

● A code cell contains code to be executed in the kernel. When the code is run, the notebook
displays the output below the code cell that generated it.
● A Markdown cell contains text formatted using Markdown and displays its output in-place
when the Markdown cell is run.
The first cell in a new notebook is always a code cell.
Let’s test it out with a classic hello world example: Type print('Hello World!') into the cell and click the
run button in the toolbar above or press Ctrl + Enter.
The result should look like this:
print('Hello World!')
Hello World!
When we run the cell, its output is displayed below and the label to its left will have changed from In [ ]
to In [1]. The output of a code cell also forms part of the document, which is why you can see it in this
article. You can always tell the difference between code and Markdown cells because code cells have
that label on the left and Markdown cells do not. The “In” part of the label is simply short for “Input,”
while the label number indicates when the cell was executed on the kernel — in this case the cell was
executed first. Run the cell again and the label will change to In [2] because now the cell was the second
to be run on the kernel. It will become clearer why this is so useful later on when we take a closer look at
kernels. From the menu bar, click Insert and select Insert Cell Below to create a new code cell
underneath your first and try out the following code to see what happens. Do you notice anything
different?
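
A cell along the following lines fits this step. It is a minimal sketch that assumes the standard library's time.sleep is used for the pause; the variable names are illustrative. Nothing is printed, and the cell's label shows In [*] while it runs.

```python
import time

start = time.perf_counter()
time.sleep(3)                              # pause for three seconds; nothing is printed
elapsed = time.perf_counter() - start      # an assignment produces no cell output
```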

This cell doesn’t produce any output, but it does take three seconds to execute. Notice how Jupyter
signifies when the cell is currently running by changing its label to In [*]. In general, the output of a cell
comes from any text data specifically printed during the cell’s execution, as well as the value of the last
line in the cell, be it a lone variable, a function call, or something else. For example:

'Hello, Tim!'
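
An output like the one above could come from a cell such as the following sketch. The function name say_hello is hypothetical and used only for illustration; the point is that the value of the last line is displayed even though nothing is explicitly printed.

```python
def say_hello(recipient):
    """Return a greeting for the given name (hypothetical helper)."""
    return 'Hello, {}!'.format(recipient)

# In a notebook, the value of this last expression is shown as the cell's output.
say_hello('Tim')
```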

NumPy
NumPy is the backbone of Machine Learning in Python. It is one of the most important libraries in
Python for numerical computations. It adds support to core Python for multi-dimensional arrays (and
matrices) and fast vectorized operations on these arrays. The present-day NumPy library is a successor of
an earlier library, Numeric, which was created by Jim Hugunin and some other developers. Travis
Oliphant, Anaconda's president and co-founder, took the Numeric library as a base and added a lot of
modifications to launch the present-day NumPy library in 2005. It is a major open source project and is
one of the most popular Python libraries. It is used in almost all Machine Learning and scientific
computing libraries. NumPy's popularity is reflected in the fact that it is readily available as a standard
package on major platforms such as Linux distributions and macOS.

NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The array
object in NumPy is called ndarray, and it comes with many supporting functions that make working with
it easy. Arrays are used very frequently in data science, where speed and resources are very
important.
NumPy arrays are stored in one contiguous block of memory, unlike lists, so processes can access and
manipulate them very efficiently. This behavior is called locality of reference in computer science, and
it is the main reason why NumPy is faster than lists. NumPy is also optimized to work with the latest CPU
architectures.
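
As a small illustration of the difference, the sketch below squares the same numbers once with a plain Python list and once with a single vectorized ndarray operation, and checks that the array is stored contiguously. The data and variable names are made up for the example, and no timing figures are claimed here.

```python
import numpy as np

values = list(range(1000))

# Pure Python: an explicit loop over the list elements.
squares_list = [v * v for v in values]

# NumPy: one vectorized operation applied to the whole array at once.
arr = np.array(values)
squares_arr = arr * arr

print(arr.flags['C_CONTIGUOUS'])   # whether the data sits in one contiguous block
print(squares_arr[:5])

# Both approaches give the same numbers.
same = squares_arr.tolist() == squares_list
```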
Installation of NumPy
If you have Python and PIP already installed on a system, then installation of NumPy is very easy. Install
it using this command:
C:\Users\Your Name>pip install numpy
If this command fails, then use a Python distribution that already has NumPy installed, such as
Anaconda or Spyder.
Import NumPy
Once NumPy is installed, import it in your applications by adding the import keyword:
import numpy
Now NumPy is imported and ready to use.

Example:
import numpy
arr = numpy.array([1, 2, 3, 4, 5])
print(arr)
Output:
[1 2 3 4 5]
NumPy as np
NumPy is usually imported under the np alias.
Example:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print(arr)
Output:
[1 2 3 4 5]

Pandas
Pandas is a Python library used for working with data sets. It has functions for analyzing, cleaning, exploring, and
manipulating data. The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was
created by Wes McKinney in 2008.
Why Use Pandas?
Pandas allows us to analyze big data and make conclusions based on statistical theories. Pandas can clean
messy data sets, and make them readable and relevant. Relevant data is very important in data science.
What Can Pandas Do?
Pandas gives you answers about the data, such as:
● Is there a correlation between two or more columns?
● What is the average value?
● What is the max value?
● What is the min value?
Pandas is also able to delete rows that are not relevant or that contain wrong values, such as empty or NULL
values. This is called cleaning the data.
Where is the Pandas Codebase?
The source code for Pandas is located at this GitHub repository: [Link]
Installation of Pandas
If you have Python and PIP already installed on a system, then installation of Pandas is very easy. Install it
using this command:
C:\Users\Your Name>pip install pandas
If this command fails, then use a Python distribution that already has Pandas installed, such as Anaconda or
Spyder.
Import Pandas
Once Pandas is installed, import it in your applications by adding the import keyword:
import pandas
Example:
import pandas
mydataset = { 'cars': ["BMW", "Volvo", "Ford"], 'passings': [3, 7, 2] }
myvar = pandas.DataFrame(mydataset)
print(myvar)
Output:
    cars  passings
0    BMW         3
1  Volvo         7
2   Ford         2
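
The questions listed under "What Can Pandas Do?" can be sketched in a few lines. The example below uses a small made-up table (the column names hours and score are assumptions for illustration) to drop a row with a missing value and then compute the average, maximum, minimum and correlation.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing value in the "hours" column.
df = pd.DataFrame({
    "hours": [1.0, 2.0, 3.0, np.nan, 5.0],
    "score": [10.0, 20.0, 30.0, 40.0, 50.0],
})

# Cleaning: remove rows that contain empty (NaN) values.
clean = df.dropna()

# Simple summary questions about the cleaned data.
mean_score = clean["score"].mean()
max_score = clean["score"].max()
min_score = clean["score"].min()

# Correlation between the two columns.
corr = clean["hours"].corr(clean["score"])

print(clean)
print(mean_score, max_score, min_score, corr)
```

Here dropna removes the whole row that has a missing value; whether to drop or fill such rows depends on the dataset.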

Sample Viva Questions:

PANDAS
Q1. What is Pandas?
Q2. What are the Different Types of Data Structures in Pandas?
Q3. List Key Features of Pandas.
Q4. What is Series in Pandas?
Q5. What are the Different Ways to Create a Series?
Q6. How can we Create a Copy of the Series?
Q7. What is a DataFrame in Pandas?
Q8. What are the Different ways to Create a DataFrame in Pandas?
Q9. How to Read Data into a DataFrame from a CSV file?
Q10. How to access the first few rows of a dataframe?
Q11. What is Reindexing in Pandas?
Q12. How to Select a Single Column of a DataFrame?
Q13. How to Rename a Column in a DataFrame?
Q14. How to add an Index, Row, or Column to an Existing Dataframe?
Q15. How to Delete an Index, Row, or Column from an Existing DataFrame?
Q16. How to set the Index in a Pandas DataFrame?
Q17. How to Reset the Index of a DataFrame?
Q18. How to Find the Correlation Using Pandas?
Q19. How to Iterate over Dataframe in Pandas?
Q20. What are the Important Conditions to keep in mind before Iterating?

NUMPY
Q1. What is NumPy? Why should we use it?
Q2. How do you convert Pandas DataFrame to a NumPy array?
Q3. How do you concatenate 2 NumPy arrays?
Q4. How do you multiply 2 NumPy array matrices?
Q5. How is arr[:,0] different from arr[:
Q6. How do we check for an empty array (or zero elements array)?
Q7. How do you count the frequency of a given positive value appearing in the NumPy array?
Q8. How is [Link]() different from [Link]() in NumPy?
Q9. How can you reverse a NumPy array?
Q10. How do you find the data type of the elements stored in the NumPy arrays?
Q11. What are ways of creating 1D, 2D and 3D arrays in NumPy?
Q12. What are ndarrays in NumPy?
Q13. How are NumPy arrays better than Python’s lists?
Q14. Why is NumPy preferred over Matlab, Octave, Idl or Yorick?

Experiment No.-2
Program to demonstrate Simple Linear Regression

To code a simple linear regression model using StatsModels we will require
NumPy, pandas, matplotlib, and statsmodels.

Here is a quick overview of these libraries:

● NumPy: used to perform mathematical operations, mainly on multi-dimensional arrays.
● pandas: used for data manipulation and analysis.
● matplotlib: a plotting library that works with NumPy arrays.
● statsmodels: used to explore data, estimate statistical models and perform statistical tests.

Import the relevant libraries

In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
import seaborn as sns
sns.set()

Load the data

In [ ]:
data = pd.read_csv('1.01. Simple linear [Link]')
In [ ]:
data
In [ ]:
[Link]()

Define the dependent and the independent variables

In [ ]:
y = data['GPA']
x1 = data['SAT']

Explore the data

In [ ]:
plt.scatter(x1,y)
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('GPA', fontsize = 20)
plt.show()

Regression itself
In [ ]:
x = sm.add_constant(x1)
results = sm.OLS(y,x).fit()
results.summary()
In [ ]:
plt.scatter(x1,y)
yhat = 0.0017*x1 + 0.275
fig = plt.plot(x1,yhat, lw=4, c='orange', label ='regression line')
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('GPA', fontsize = 20)
plt.show()
In [ ]:
plt.scatter(x1,y)
yhat = 0.0017*x1 + 0
fig = plt.plot(x1,yhat, lw=4, c='green', label ='regression line')
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('GPA', fontsize = 20)
plt.xlim(0)
plt.ylim(0)
plt.show()
In [ ]:
plt.scatter(x1,y)
yhat = 0*x1 + 0.275
fig = plt.plot(x1,yhat, lw=4, c='red', label ='regression line')
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('GPA', fontsize = 20)
plt.show()
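
The plotting cells above hard-code a slope of 0.0017 and an intercept of 0.275. As a minimal sketch of where such numbers come from, the block below computes the ordinary least squares estimates for a single predictor in closed form with NumPy. The SAT and GPA values are synthetic, chosen only for illustration, and are not the lab's dataset; with statsmodels, the same estimates would normally be read from results.params rather than typed in by hand.

```python
import numpy as np

# Synthetic, illustrative data (not the lab's CSV file).
sat = np.array([1600.0, 1700.0, 1800.0, 1900.0, 2000.0])
gpa = np.array([3.0, 3.2, 3.3, 3.5, 3.7])

# Closed-form OLS for y = b0 + b1 * x:
#   b1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   b0 = mean_y - b1 * mean_x
x_mean, y_mean = sat.mean(), gpa.mean()
b1 = np.sum((sat - x_mean) * (gpa - y_mean)) / np.sum((sat - x_mean) ** 2)
b0 = y_mean - b1 * x_mean

yhat = b0 + b1 * sat        # fitted values on the regression line
residuals = gpa - yhat      # the part of GPA the line does not explain

print(b1, b0)
```

With an intercept in the model, the residuals sum to zero (up to rounding), which is a quick sanity check on any fitted line.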
In [ ]:

Output:

Sample viva questions:
1. What is the difference between a population regression line and a sample regression line?
2. What are the assumptions of a linear regression model?
3. What are outliers? How do you detect and treat them? How do you deal with outliers in a linear
regression model?
4. How do you determine the best fit line for a linear regression model?
5. What is the difference between simple and multiple linear regression?
6. What is linear Regression Analysis?
7. What is multicollinearity and how does it affect linear regression analysis?
8. What is the difference between linear regression and logistic regression?
9. What are the common types of errors in linear regression analysis?
10. What is the difference between a dependent and independent variable in linear regression?
11. What is an interaction term in linear regression and how is it used?
12. What is the difference between biased and unbiased estimates in linear regression?
13. How do you measure the strength of a linear relationship between two variables?
14. What is linear regression, and how does it work?
15. What is the difference between linear regression and non-linear regression?
16. What are the common techniques used to improve the accuracy of a linear regression model?
17. What is a residual in linear regression and how is it used in model evaluation?
18. What is the difference between a parametric and non-parametric regression model?
19. What are the assumptions of the ordinary least squares method for linear regression?

Ref:
[Link]

Experiment No.-3
Program to demonstrate Logistic Regression

Experiment No.- 4
Program to demonstrate Decision Tree – ID3 Algorithm

Experiment No.- 5
Program to demonstrate k‐Nearest Neighbor flowers classification

Experiment No.- 6
Program to demonstrate Naïve‐ Bayes Classifier

Experiment No.- 7
Program to demonstrate PCA and LDA on Iris dataset

Experiment No.- 8
Program to demonstrate DBSCAN clustering algorithm

Experiment No.- 9
Program to demonstrate K‐Medoid clustering algorithm

Experiment No.- 10
Program to demonstrate K‐Means Clustering Algorithm on Handwritten
Dataset

COURSE EXIT SURVEY

BHARATI VIDYAPEETH COLLEGE OF ENGINEERING, NEW DELHI

Department of Computer Science and Engineering
Course Exit Survey
2023- 2024

Subject Name: Machine Learning Lab Subject Code: ML-407P

Semester: 6th

Please rate how well you understood the course (Tick the most appropriate option)
(1 – Poor, 2- Good, 3- Excellent)

ML-407P.1 How would you rate your ability to formulate machine learning problems?

1. 2. 3.

ML-407P.2 To what extent have you learned about regression and feature selection techniques and developed
applications based on them?
1. 2. 3.

ML-407P.3 To what extent can you apply machine learning techniques such as classification to practical
applications?
1. 2. 3.

ML-407P.4 To what extent can you apply clustering algorithms to develop various practical applications?
1. 2. 3.

Suggestions to improve the teaching methodology:

Overall, how do you rate your understanding of the subject (tick whichever is applicable)
1. Below 50%. 2. 50%-70%. 3.70%-90% 4. Above 90%

Name of student
Enrolment number Signature

31
32

Common questions

The application of machine learning in engineering practices can contribute to more efficient and innovative solutions by automating complex tasks, optimizing processes, and enabling predictive analytics. Machine learning algorithms can analyze vast amounts of data to uncover patterns, inform decision-making, and enhance system performance. This capability leads to innovative solutions, such as intelligent design automation, predictive maintenance, and improved decision-support systems that streamline operations and increase productivity in engineering projects .

Jupyter Notebooks support iterative and interactive data analysis by allowing users to run code in an executable format and see the results immediately. This interactivity facilitates experimental and exploratory analysis, as users can quickly modify code, test hypotheses, and visualize outputs. The notebook’s structure, including code cells and Markdown, enables a seamless integration of narrative text, which helps document the analytical process and makes the workflow transparent and repeatable .

Project management knowledge equips engineers to work more effectively in multidisciplinary environments by providing skills in coordination, time management, and resource allocation. This knowledge helps engineers understand the diverse perspectives and expertise within a team, leading to more cohesive collaboration. Project management principles, such as setting clear objectives, risk management, and efficient communication, are essential for aligning team efforts towards common goals and ensuring successful project delivery in complex, multidisciplinary settings .

Effective communication in engineering involves clear, concise, and accurate dissemination of complex information through reports, presentations, and instructions. This ensures all stakeholders, including team members and the broader community, are well-informed about project details. Effective communication can lead to increased project success by facilitating coordination, reducing misunderstandings, and fostering collaborative problem-solving .

Jupyter Notebooks integrate code, output, visualizations, and narrative text into a single document, making them a powerful tool for collaboration in data science. They facilitate the sharing of clear and reproducible analysis, allowing team members to easily understand and build upon each other's work. This transparency and reproducibility are crucial for verifying results and making collaborative projects more effective .

Understanding legal and cultural issues allows engineers to design solutions that are culturally sensitive, legally compliant, and socially responsible. By taking into account these factors, engineers can create products and systems that meet regulatory standards, respect cultural norms, and address the real-world challenges faced by communities. This comprehensive awareness can prevent legal conflicts, reduce project risks, and ensure the long-term acceptance and success of engineering solutions within society .

Professional ethics ensure that engineering teams adhere to principles such as honesty and integrity, fostering trust and collaboration among team members. This ethical foundation is crucial for sustainable projects, which often require transparent decision-making and accountability to both society and the environment. Adherence to ethical guidelines helps prevent conflicts of interest, ensures compliance with legal and environmental standards, and enhances the project's credibility and acceptance by the public.

Jupyter Notebooks improve the learning curve for new data scientists by offering an interactive environment where code, output, visualizations, and documentation coexist. This format helps beginners systematically understand data analysis workflows by providing immediate feedback and facilitating hands-on practice. The ability to annotate code with explanatory text and visualizations makes complex concepts more accessible, aiding in deeper comprehension and retention of data science principles.

Sustainable engineering practices can profoundly impact societal development by promoting resource efficiency, reducing environmental impact, and ensuring the long-term well-being of communities. These practices encourage the use of renewable resources and minimize waste, leading to healthier ecosystems and economies. Additionally, sustainable engineering addresses social equity by considering the needs of diverse populations, thereby contributing to more inclusive and resilient communities.

NumPy enhances performance and efficiency in computational tasks by providing support for large multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on these data structures. NumPy arrays are implemented in C, allowing them to process data more quickly than Python's built-in lists. These efficiency gains arise from NumPy's ability to perform operations on entire arrays at once, without an explicit Python-level loop, which minimizes interpreter overhead and accelerates execution.
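As a minimal sketch of the whole-array idea described above, the snippet below squares the same numbers twice: once with a Python list comprehension and once with a single NumPy expression. The sample data and variable names are assumptions chosen for illustration; they do not come from the manual.

```python
import numpy as np

# Hypothetical sample data: the integers 0..999
values = list(range(1000))

# Built-in list: each element is squared by an explicit Python-level loop
squared_list = [v * v for v in values]

# NumPy array: one expression applies the multiplication to every element
arr = np.array(values)
squared_arr = arr * arr

# Both approaches produce the same numbers
same_result = squared_list == squared_arr.tolist()

# Reductions such as sum() also run over the whole array in one call
total = int(squared_arr.sum())
```

Timing the two versions on a larger input (for example with the standard-library `timeit` module) is one way to observe the speed difference on a given machine; the actual ratio depends on the array size and hardware.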


Common questions

How can the application of machine learning in engineering practices contribute to more efficient and innovative solutions?

How do the features of Jupyter Notebooks support iterative and interactive data analysis?

In what ways does project management knowledge equip engineers to work more effectively in multidisciplinary environments?

What are the key elements of effective communication in engineering practices, and how do they impact project success?

What role do Jupyter Notebooks play in enhancing collaboration in data science teams?

How does an understanding of legal and cultural issues improve an engineer's ability to design solutions that meet societal needs?

How can professional ethics influence the effectiveness of engineering teams working on sustainable projects?

How do Jupyter Notebooks improve the learning curve for new data scientists in understanding data analysis workflows?

In what ways can sustainable engineering practices influence societal development?

How does numpy enhance performance and efficiency in computational tasks compared to Python’s built-in lists?
