ML LabManual (1)

The document is a lab manual for a Machine Learning course at RNS First Grade College, detailing a series of programming tasks for BCA 6th semester students. It includes instructions for setting up Python and essential libraries, implementing various machine learning algorithms using scikit-learn, and visualizing data with libraries like Matplotlib and Seaborn. The manual covers practical implementations of k-NN, linear regression, decision trees, and K-Means clustering, along with data handling and preprocessing techniques.


RNS FIRST GRADE COLLEGE AUTONOMOUS

(Affiliated to Bangalore University and NAAC Accredited with ‘A’ Grade) Dr. Vishnuvardhan Road,
Channasandra, R R Nagara, Bengaluru – 560 098

Department of Computer Science

Machine Learning Lab Manual (BCA 6th Sem)

LIST OF PROGRAMS

1. Install and set up Python and essential libraries like NumPy and pandas.

2. Introduce scikit-learn as a machine learning library.

3. Install and set up scikit-learn and other necessary tools.

4. Write a program to load and explore a dataset from .csv and Excel files using pandas.

5. Write a program to visualize the dataset to gain insights using Matplotlib or Seaborn by plotting scatter plots and bar charts.

6. Write a program to handle missing data, encode categorical variables, and perform feature scaling.

7. Write a program to implement a k-Nearest Neighbours (k-NN) classifier using scikit-learn, train the classifier on the dataset, and evaluate its performance.

8. Write a program to implement a linear regression model for regression tasks and train the model on a dataset with continuous target variables.

9. Write a program to implement a decision tree classifier using scikit-learn, visualize the decision tree, and understand its splits.

10. Write a program to implement K-Means clustering and visualize the clusters.

1. Install and set up Python and essential libraries like NumPy and pandas.

Installation of Python

Step 1: Search for Python

Click on the official website link: https://www.python.org/downloads/

Step 2: Select Version to Install Python

Choose the latest version for Windows.

Step 3: Downloading the Python Installer

• Once you have downloaded the installer, open the .exe file.
• Enable users to run Python from the command line by checking the "Add python.exe to PATH" checkbox.
• Click on "Install Now" to start the installation.

Step 4: Verify the Python Installation in Windows

Go to Command Prompt and type the command "python -V" or "python --version". You can see the
installed version of Python on your system.

Step 5: Check the pip version

Go to Command Prompt and type the command "pip -V" or "pip --version". You can see the
installed version of pip on your system.

Installation of essential packages NumPy and pandas

NumPy is an open-source Python library that facilitates efficient numerical operations on large
quantities of data. Several NumPy functions can be used directly on pandas DataFrames. Most
importantly, pandas is built on top of NumPy, which means NumPy is required for pandas to work.

Pandas is a very popular library for working with data (its goal is to be the most powerful and
flexible open source tool, and in our opinion, it has reached that goal). DataFrames are at the
center of pandas. A DataFrame is structured like a table or spreadsheet. The rows and the
columns both have indexes, and you can perform operations on rows or columns separately.

Step 1: Open Command Prompt (CMD).

Step 2: Type the commands:
pip install numpy
pip install pandas

Step 3: Print the versions of NumPy and pandas that were installed.
Go to a Python script or Jupyter notebook and type:
import numpy
import pandas
print(numpy.__version__)
print(pandas.__version__)

Step 4: Check for any updates to the packages.

Type the commands:
pip install --upgrade numpy
pip install --upgrade pandas
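
Once both packages are installed, the DataFrame description above can be tried out. The following is a minimal sketch, not part of the original manual; the table contents (student names and marks) are made up for illustration. It shows an operation along columns, an operation along rows, and the NumPy array that pandas returns for the underlying values.

```python
import numpy as np
import pandas as pd

# Hypothetical marks table; the names and numbers are made up for illustration.
df = pd.DataFrame(
    {"maths": [70, 85, 90], "physics": [60, 75, 95]},
    index=["asha", "ravi", "meena"],
)

# Column-wise operation: mean of each subject.
column_means = df.mean(axis=0)

# Row-wise operation: total marks of each student.
row_totals = df.sum(axis=1)

# to_numpy() returns the table values as a NumPy array.
values = df.to_numpy()

print(column_means)
print(row_totals)
print(type(values), values.shape)
```

Here axis=0 works down each column and axis=1 works across each row, which is the row/column distinction described above.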

2. Introduce scikit-learn as a machine learning library.

Scikit-learn (sklearn) is a widely used and robust library for machine learning in Python. It
provides a selection of efficient tools for machine learning and statistical modeling, including
classification, regression, clustering and dimensionality reduction, via a consistent interface
in Python. This library, which is largely written in Python, is built upon NumPy, SciPy and
Matplotlib.

Features

Rather than focusing on loading, manipulating and summarizing data, the scikit-learn library is
focused on modeling the data. Some of the most popular groups of models and tools provided by
sklearn are as follows:

• Supervised learning algorithms: almost all the popular supervised learning algorithms, such as Linear Regression, Support Vector Machine (SVM) and Decision Tree, are part of scikit-learn.
• Unsupervised learning algorithms: it also has the popular unsupervised learning algorithms, from clustering, factor analysis and PCA (Principal Component Analysis) to unsupervised neural networks.
• Clustering: a family of unsupervised models used for grouping unlabeled data.
• Cross-validation: used to check the accuracy of supervised models on unseen data.
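
The consistent interface and cross-validation mentioned above can be sketched as follows. This example is not part of the original manual; the choice of the built-in iris dataset, the two models and 5 folds are assumptions made for illustration. Both models are passed to the same cross_val_score call without any model-specific code.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Built-in dataset, chosen here only for illustration.
X, y = load_iris(return_X_y=True)

# Two different supervised models that share the same estimator interface.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

mean_scores = {}
for name, model in models.items():
    # 5-fold cross-validation: train on 4 folds, score on the held-out fold.
    scores = cross_val_score(model, X, y, cv=5)
    mean_scores[name] = scores.mean()
    print(name, round(mean_scores[name], 3))
```

Because every estimator exposes the same fit/predict methods, swapping one model for another usually means changing a single line.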

3. Install and set up scikit-learn and other necessary tools.

As introduced in the previous section, scikit-learn provides tools for classification,
regression, clustering and dimensionality reduction via a consistent interface in Python.

Install scikit-learn using pip: open your terminal or command prompt and run the following
command:
pip install -U scikit-learn

To verify your installation, you can use the following commands.

To see which version of scikit-learn is installed and where:
python -m pip show scikit-learn

To see all installed packages:
python -m pip freeze

To print the installed versions from a Python script or Jupyter notebook:
import sklearn
import numpy
import pandas
print(sklearn.__version__)
print(numpy.__version__)
print(pandas.__version__)

4. Write a program to load and explore a dataset from .csv and Excel files using pandas.

import pandas as pd

def load_data(file):
    # Choose the pandas reader based on the file extension
    if file.endswith('.csv'):
        df = pd.read_csv(file)
    elif file.endswith('.xlsx'):
        df = pd.read_excel(file)
    else:
        print("Unsupported file format. Please provide a CSV or Excel file.")
        return

    # Column names, data types and non-null counts
    print("Dataset information:")
    df.info()
    # First row of the dataset
    print("\nTop rows of the dataset:")
    print(df.head(1))

# Change this to the path of your CSV or Excel file
file = 'train.csv'
load_data(file)
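
A few more exploration calls are often useful once the data is loaded. The sketch below is not part of the original manual: it reads a small, made-up CSV from an in-memory string (io.StringIO) so that it runs without any file on disk, and the column names are hypothetical.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """name,age,city
Asha,21,Bengaluru
Ravi,,Mysuru
Meena,23,Bengaluru
"""

df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)                    # (number of rows, number of columns)
print(df.dtypes)                   # data type of each column
print(df.describe())               # summary statistics of numeric columns
print(df.isnull().sum())           # number of missing values per column
print(df["city"].value_counts())   # frequency of each category
```

With a real file, replacing io.StringIO(csv_text) with the file path gives the same calls on your own data.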
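
5. Write a program to visualize the dataset to gain insights using Matplotlib or
Seaborn by plotting scatter plots and bar charts.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Reading the dataset (the code assumes columns named 'Survived' and 'Sex')
df = pd.read_csv('train.csv')
print(df.head(2))

# Setting the plot theme before drawing any figure
sns.set_theme(style="darkgrid")

# Plotting a pair plot: scatter plots for each pair of numeric columns
sns.pairplot(df, hue='Survived')
plt.show()

# Plotting a bar chart of passenger counts per 'Survived' class, split by 'Sex'
sns.countplot(x='Survived', data=df, hue='Sex')
plt.show()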
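
The task statement also mentions plain scatter plots and bar charts. The following is a minimal Matplotlib-only sketch, not part of the original manual; the small table of ages, fares and sexes is made up for illustration, and the output file name is hypothetical. The non-interactive "Agg" backend is selected so the script also runs on machines without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical passenger records, made up for illustration.
df = pd.DataFrame({
    "Age": [22, 38, 26, 35, 54, 2],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 21.07],
    "Sex": ["male", "female", "female", "female", "male", "male"],
})

fig, (ax_scatter, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Scatter plot: relationship between two numeric columns.
ax_scatter.scatter(df["Age"], df["Fare"])
ax_scatter.set_xlabel("Age")
ax_scatter.set_ylabel("Fare")
ax_scatter.set_title("Age vs Fare")

# Bar chart: number of rows in each category.
counts = df["Sex"].value_counts()
ax_bar.bar(counts.index, counts.values)
ax_bar.set_xlabel("Sex")
ax_bar.set_ylabel("Count")
ax_bar.set_title("Passengers by Sex")

fig.tight_layout()
fig.savefig("scatter_and_bar.png")  # hypothetical output file name
```

In a notebook, plt.show() can be used instead of fig.savefig once the backend line is removed.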


6. Write a program to handle missing data, encode categorical variables, and
perform feature scaling.

#importing necessary libraries

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

#Reading the dataset

df = pd.read_csv('train.csv')
print(df.head(3))

#HANDLING MISSING VALUES
print(df.isnull().sum())

#Dropping the "Cabin" column as it contains mostly null values
df = df.drop(columns='Cabin')

#Replacing the missing values in the "Age" column with the mean value
df['Age'] = df['Age'].fillna(df['Age'].mean())

#Replacing the missing values in the "Embarked" column with the mode value
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])

print(df.isnull().sum().sum())
# Output: 0

#ENCODING CATEGORICAL FEATURES

df.info()

#Dropping unnecessary columns
df = df.drop(columns=['PassengerId', 'Name', 'Ticket'])

#Using LabelEncoder to encode categorical features as integers
le = LabelEncoder()
df['Sex'] = le.fit_transform(df['Sex'])
df['Embarked'] = le.fit_transform(df['Embarked'])

df.info()

#FEATURE SCALING
#Splitting input and output
X = df.drop(columns=['Survived'])
y = df['Survived']

print(X.head())

#Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#Using StandardScaler to scale the features
#(fit on the training set only, then apply the same scaling to the test set)
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

#Displaying scaled data as a DataFrame
scaled_df = pd.DataFrame(X_train, columns=X.columns)
print(scaled_df.head())
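
To see what each preprocessing step does to the values, the same three steps can be run on a tiny table. This sketch is not part of the original manual; the four records are made up for illustration and no file is needed. After scaling, each column should have mean 0 and standard deviation 1.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical records with one missing value, made up for illustration.
df = pd.DataFrame({
    "Age": [20.0, np.nan, 40.0, 30.0],
    "Sex": ["male", "female", "female", "male"],
})

# Fill the missing age with the column mean: (20 + 40 + 30) / 3 = 30.
df["Age"] = df["Age"].fillna(df["Age"].mean())

# LabelEncoder assigns integers to the classes in sorted order.
le = LabelEncoder()
df["Sex"] = le.fit_transform(df["Sex"])

# Standardise: subtract the column mean and divide by the standard deviation.
scaled = StandardScaler().fit_transform(df[["Age", "Sex"]])

print(df)
print(le.classes_)
print(scaled.mean(axis=0), scaled.std(axis=0))
```

LabelEncoder is used here to mirror the program above; for features with more than two unordered categories, one-hot encoding is a common alternative.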


7. Write a program to implement a k-Nearest Neighbours (k-NN) classifier using
scikit-learn, train the classifier on the dataset, and evaluate its performance.

#importing necessary libraries

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

#importing iris dataset from sklearn and splitting input and output
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

#Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

#Implementing k-NN classifier model
knn_model = KNeighborsClassifier(n_neighbors=3)
knn_model.fit(X_train, y_train)
y_pred = knn_model.predict(X_test)

#Checking performance metrics
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

print("Classification Report:")
print(classification_report(y_test, y_pred))

Output:
Accuracy: 1.0
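
The number of neighbours is a choice, and it can be instructive to see how the test accuracy changes with it. The sketch below is not part of the original manual; the list of k values is an assumption made for illustration, and the split settings mirror the program above.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train one classifier per candidate k and record its test accuracy.
accuracy_by_k = {}
for k in [1, 3, 5, 7, 9]:
    model = KNeighborsClassifier(n_neighbors=k)
    model.fit(X_train, y_train)
    accuracy_by_k[k] = accuracy_score(y_test, model.predict(X_test))

for k, acc in accuracy_by_k.items():
    print(f"k={k}: accuracy={acc:.3f}")
```

On a dataset this small the differences are usually minor; on larger or noisier data, cross-validation on the training set is a more reliable way to pick k than a single test split.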


8. Write a program to implement a linear regression model for regression tasks
and train the model on a dataset with continuous target variables.

#importing necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

#Reading dataset
df = pd.read_csv('Boston.csv')
print(df.head(3))

#Splitting input and output
X = df.drop(columns=['medv'])
y = df['medv']

#Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape)   # Output: (404, 13)
print(y_train.shape)   # Output: (404,)
print(X_test.shape)    # Output: (102, 13)
print(y_test.shape)    # Output: (102,)

#Performing linear regression on all input features
model = LinearRegression()

#Fitting model
model.fit(X_train, y_train)

#Prediction
y_pred = model.predict(X_test)

#Scatter plot for actual vs predicted data points
sns.regplot(x=y_test, y=y_pred, scatter_kws={'s': 10}, line_kws={'color': 'red'})
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('Actual vs Predicted Values')
plt.show()

#Calculating error via performance metrics
print('Root Mean Squared error:(RMSE)', np.sqrt(mean_squared_error(y_test, y_pred)))
print('R2-Square:', r2_score(y_test, y_pred))

Output:
Root Mean Squared error:(RMSE) 4.300630200615773
R2-Square: 0.7789207451814409
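
To see what the fitted model actually learns, it helps to fit it on data where the true relationship is known. The sketch below is not part of the original manual; the synthetic data, the coefficients 3 and -2 and the intercept 5 are all assumptions chosen for illustration. Since the data has no noise, the model should recover these values almost exactly.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic data with a known linear relationship and no noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

model = LinearRegression().fit(X, y)
y_pred = model.predict(X)

# Same metrics as in the program above, here computed on the training data.
rmse = np.sqrt(mean_squared_error(y, y_pred))
r2 = r2_score(y, y_pred)

print("Coefficients:", model.coef_)
print("Intercept:", model.intercept_)
print("RMSE:", rmse, "R2:", r2)
```

On real data such as the housing dataset above, model.coef_ and model.intercept_ can be read in the same way, one coefficient per input column.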


9. Write a program to implement a decision tree classifier using scikit-learn,
visualize the decision tree, and understand its splits.

#importing necessary libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score

#importing iris dataset from sklearn and splitting input and output
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

#Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

#Performing Decision Tree Classifier
#(random_state is fixed so that the same tree is built on every run)
dtc = DecisionTreeClassifier(random_state=42)
dtc.fit(X_train, y_train)
y_pred = dtc.predict(X_test)

#Checking accuracy
acc = accuracy_score(y_test, y_pred)
print("Accuracy:", acc)

Output:
Accuracy: 1.0

#Visualizing decision tree
plt.figure(figsize=(12, 8))
plot_tree(dtc, feature_names=iris.feature_names,
          class_names=iris.target_names, filled=True)
plt.show()
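
Besides the plot, the splits can also be read as text rules, which makes it easier to follow how a sample travels from the root to a leaf. The sketch below is not part of the original manual; limiting the tree to max_depth=2 and training on the full iris data are assumptions made to keep the printed rules short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules short (depth limit is illustrative).
tree = DecisionTreeClassifier(max_depth=2, random_state=42)
tree.fit(iris.data, iris.target)

# export_text prints one line per split or leaf, indented by depth.
rules = export_text(tree, feature_names=iris.feature_names)
print(rules)
print("Depth:", tree.get_depth(), "Leaves:", tree.get_n_leaves())
```

Each line of the form "feature <= threshold" is one split; samples satisfying it follow the first branch and the rest follow the second.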

10. Write a program to implement K-Means clustering and visualize the clusters.

make_blobs is a synthetic data generator, especially useful for clustering and classification
algorithms. It generates isotropic Gaussian blobs. An isotropic Gaussian blob essentially
means that the data points are distributed in a circular (spherical, for multi-dimensional data)
shape around the centroid.

#importing necessary libraries

import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

#Generating sample data
X, y = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=42)

#Creating a K-Means clusterer with 4 clusters
#(n_init is set explicitly because its default differs between scikit-learn versions)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
kmeans.fit(X)

#Getting cluster labels
labels = kmeans.predict(X)

#Plotting the data with cluster labels
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=100,
            c='red', label='Centroids')
plt.title('K-Means Clustering')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()
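
In the program above the number of clusters is known because the data was generated with 4 centers. When it is not known, one common heuristic is the elbow method: fit K-Means for several values of k and look at where the inertia (within-cluster sum of squared distances) stops dropping sharply. The sketch below is not part of the original manual; the range of k values is an assumption made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as in the program above.
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=0.8, random_state=42)

# Fit K-Means for k = 1..7 and record the inertia of each fit.
inertia_by_k = {}
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=42)
    km.fit(X)
    inertia_by_k[k] = km.inertia_

for k, inertia in inertia_by_k.items():
    print(f"k={k}: inertia={inertia:.1f}")
```

Plotting inertia against k with plt.plot makes the bend easier to spot; the elbow is a heuristic and does not always point to a single clear value.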
