Ml Cyber Lab

The document outlines a series of laboratory exercises focused on machine learning, covering topics such as data manipulation, linear regression, supervised and unsupervised learning, and model evaluation. Each section includes objectives, tasks, and Python code solutions to facilitate hands-on learning. The exercises aim to provide practical experience with key machine learning concepts and techniques.

Uploaded by

nfsunotess
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views16 pages

Ml Cyber Lab

The document outlines a series of laboratory exercises focused on machine learning, covering topics such as data manipulation, linear regression, supervised and unsupervised learning, and model evaluation. Each section includes objectives, tasks, and Python code solutions to facilitate hands-on learning. The exercises aim to provide practical experience with key machine learning concepts and techniques.

Uploaded by

nfsunotess
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 16

Machine Learning

Laboratory Exercises
Kuldeep. J. Purohit
November 25, 2024

Contents
1 Data Manipulation and Statistical Analysis
2 Solving Linear Equations Using Python
3 Working with Vectors and Matrices in Machine Learning
4 Implementing Linear Regression from Scratch
5 Introduction to AI and Machine Learning
6 Data Preprocessing and Feature Engineering
7 Supervised Learning: Classification and Regression
8 Unsupervised Learning: Clustering and Dimensionality Reduction
9 Model Evaluation and Hyperparameter Tuning

1 Data Manipulation and Statistical Analysis
Objective
Apply Python programming skills to manipulate data using lists, dictionaries, and the
pandas library. Perform statistical operations such as mean, median, mode, standard
deviation, and variance.

Tasks
1. Create a dictionary representing a dataset of students with their names, ages, and
scores. Convert it into a pandas DataFrame and display the data.

2. Perform statistical analysis on the Score column:


• Mean
• Median
• Mode
• Standard deviation
• Variance
3. Visualize the distribution of Score using a histogram.

Solution
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Data dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [23, 22, 24, 23, 22],
    'Score': [88, 92, 85, 91, 89]
}

# Convert to DataFrame
df = pd.DataFrame(data)
print(df)

# Calculate statistical measures
# (np.std and np.var use the population formulas, ddof=0, by default)
mean_score = np.mean(df['Score'])
median_score = np.median(df['Score'])
mode_score = df['Score'].mode()[0]
std_dev_score = np.std(df['Score'])
variance_score = np.var(df['Score'])

# Print statistics
print(f"Mean: {mean_score}")
print(f"Median: {median_score}")
print(f"Mode: {mode_score}")
print(f"Standard Deviation: {std_dev_score}")
print(f"Variance: {variance_score}")

# Plot histogram
plt.hist(df['Score'], bins=5, color='blue', alpha=0.7)
plt.title('Distribution of Scores')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()
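
One detail is worth checking when computing the standard deviation and variance: NumPy's np.std and np.var default to the population formulas (divide by n), while the pandas Series.std and Series.var methods default to the sample formulas (divide by n − 1). The following is a minimal sketch that reuses the Score values from this exercise; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

scores = pd.Series([88, 92, 85, 91, 89], name="Score")

# NumPy defaults to the population formulas (ddof=0, divide by n).
pop_var = np.var(scores.to_numpy())
pop_std = np.std(scores.to_numpy())

# pandas defaults to the sample formulas (ddof=1, divide by n - 1).
sample_var = scores.var()
sample_std = scores.std()

print(f"Population variance: {pop_var:.2f}, std: {pop_std:.3f}")
print(f"Sample variance:     {sample_var:.2f}, std: {sample_std:.3f}")
```

Either convention can be requested explicitly through the ddof argument, which is the safer choice when results from the two libraries are compared.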

2 Solving Linear Equations Using Python


Objective
Use Python and NumPy to solve a system of linear equations and understand matrix
operations.

Tasks
1. Represent the system of equations 2x + 3y = 5 and x − y = 1 as a matrix equation
Ax = b.

2. Solve for x and y using numpy.linalg.solve().

3. Verify the solution by substituting x and y back into the original equations.

Solution
import numpy as np

# Coefficient matrix A
A = np.array([[2, 3], [1, -1]])

# Constants vector b
b = np.array([5, 1])

# Solve for x and y
solution = np.linalg.solve(A, b)
print(f"Solution: x = {solution[0]}, y = {solution[1]}")

# Verify the solution: multiplying A by the solution should reproduce b
check = np.dot(A, solution)
print(f"Verification (A times solution): {check}")
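
The verification step can also be made explicit rather than checked by eye. The sketch below is one possible way to do it, assuming the same system as in the tasks: it checks that the determinant is nonzero (so a unique solution exists) and uses np.allclose to compare A times the solution against b up to floating-point rounding.

```python
import numpy as np

A = np.array([[2.0, 3.0], [1.0, -1.0]])
b = np.array([5.0, 1.0])

# A nonzero determinant means the system has exactly one solution.
det_A = np.linalg.det(A)

solution = np.linalg.solve(A, b)
x, y = solution

# Substitute back: A @ solution should reproduce b up to rounding error.
is_valid = np.allclose(A @ solution, b)

print(f"det(A) = {det_A:.1f}")
print(f"x = {x:.1f}, y = {y:.1f}")
print(f"Solution verified: {is_valid}")
```

Solving by hand gives the same values: substituting x = y + 1 into 2x + 3y = 5 yields y = 0.6 and x = 1.6.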

3 Working with Vectors and Matrices in Machine Learning

Objective
Understand and perform basic vector and matrix operations, which are foundational to
machine learning algorithms.

Tasks
1. Create vectors and perform:

• Dot product
• Element-wise addition
• Cross product
2. Create matrices and perform:

• Matrix multiplication
• Transpose
• Inverse (if invertible)
3. Compute eigenvalues and eigenvectors of a random 3 × 3 matrix.

Solution
import numpy as np

# Vector operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

dot_product = np.dot(v1, v2)
elementwise_addition = v1 + v2
cross_product = np.cross(v1, v2)

print(f"Dot Product: {dot_product}")
print(f"Element-wise Addition: {elementwise_addition}")
print(f"Cross Product: {cross_product}")

# Matrix operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_multiplication = np.dot(A, B)
transpose_A = np.transpose(A)
inverse_A = np.linalg.inv(A)  # raises LinAlgError if A is singular

print(f"Matrix Multiplication:\n{matrix_multiplication}")
print(f"Transpose of A:\n{transpose_A}")
print(f"Inverse of A:\n{inverse_A}")

# Eigenvalues and eigenvectors
random_matrix = np.random.rand(3, 3)
eigenvalues, eigenvectors = np.linalg.eig(random_matrix)

print(f"Eigenvalues:\n{eigenvalues}")
print(f"Eigenvectors:\n{eigenvectors}")
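
The results of the inverse and eigen-decomposition can be sanity-checked against their definitions. The sketch below is an illustration under stated assumptions (the seed and variable names are arbitrary): each eigenvector column v should satisfy M v = λ v, and a matrix times its inverse should give the identity.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
M = rng.random((3, 3))

eigenvalues, eigenvectors = np.linalg.eig(M)

# Column i of `eigenvectors` pairs with eigenvalues[i], so M @ V should equal
# V with each column scaled by its eigenvalue (broadcasting does the scaling).
lhs = M @ eigenvectors
rhs = eigenvectors * eigenvalues
eig_ok = np.allclose(lhs, rhs)

# The inverse of an invertible matrix satisfies A @ inv(A) = I.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
inv_ok = np.allclose(A @ np.linalg.inv(A), np.eye(2))

print(f"Eigen-decomposition consistent: {eig_ok}")
print(f"Inverse consistent: {inv_ok}")
```

A random real matrix can have complex eigenvalues; np.allclose handles complex arrays, so the check works in that case as well.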

4 Implementing Linear Regression from Scratch

Objective
Implement a simple linear regression algorithm using Python.

Tasks
1. Generate synthetic data for y = 2x + 1 with random noise.
2. Visualize the data using matplotlib.
3. Implement the linear regression formula:
$$\theta = (X^T X)^{-1} X^T y$$

4. Make predictions and evaluate using Mean Squared Error (MSE).

Solution
import numpy as np
import matplotlib.pyplot as plt

# Generate synthetic data
np.random.seed(42)
X = np.random.rand(100, 1) * 10
y = 2 * X + 1 + np.random.randn(100, 1)

# Visualize the data
plt.scatter(X, y, color='blue')
plt.title('Generated Data')
plt.xlabel('X')
plt.ylabel('y')
plt.show()

# Linear regression: prepend a column of ones for the intercept term
X_bias = np.c_[np.ones((X.shape[0], 1)), X]
theta = np.linalg.inv(X_bias.T.dot(X_bias)).dot(X_bias.T).dot(y)

# Predictions
y_pred = X_bias.dot(theta)

# MSE
mse = np.mean((y - y_pred) ** 2)
print(f"Mean Squared Error: {mse}")

# Plot the regression line
plt.scatter(X, y, color='blue')
plt.plot(X, y_pred, color='red')
plt.title('Linear Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.show()
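
The closed-form formula in task 3 can be evaluated in more than one way. Forming the inverse explicitly is fine for this small example, but it is usually considered less numerically robust than solving the linear system directly. The sketch below compares three routes on data generated like the exercise's; it uses NumPy's newer Generator API, which is an assumption of this sketch and produces different random numbers from np.random.seed(42).

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((100, 1)) * 10
y = 2 * X + 1 + rng.standard_normal((100, 1))

# Design matrix with a leading column of ones for the intercept.
X_bias = np.c_[np.ones((X.shape[0], 1)), X]

# 1. Explicit inverse, exactly as in the formula.
theta_inv = np.linalg.inv(X_bias.T @ X_bias) @ X_bias.T @ y

# 2. Solve (X^T X) theta = X^T y without forming the inverse.
theta_solve = np.linalg.solve(X_bias.T @ X_bias, X_bias.T @ y)

# 3. Least-squares solver applied to the original system.
theta_lstsq, *_ = np.linalg.lstsq(X_bias, y, rcond=None)

intercept, slope = theta_lstsq.ravel()
print(f"Intercept: {intercept:.3f}, slope: {slope:.3f}")
```

All three estimates should agree closely here, and both should land near the true values of 1 (intercept) and 2 (slope) used to generate the data.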

5 Introduction to AI and Machine Learning


Objective: Understand the basic concepts of AI and ML, including definitions and types
of learning. Explore the applications of AI and ML in real-world scenarios.

Tasks
1. Research and Present AI Applications: List at least 5 applications of AI and
ML in different domains (e.g., healthcare, finance, transportation). Write a
brief explanation of how AI/ML is used in each application.

2. Classify Types of Learning: Describe and compare supervised learning, unsupervised
learning, and reinforcement learning. For each type of learning, provide a
real-world example. Create a table summarizing the types of learning.

3. Hands-on Task: Load a simple dataset (e.g., the Iris dataset) using scikit-learn
and visualize the features.
from sklearn import datasets
import matplotlib.pyplot as plt

# Load Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Scatter plot of the first two features
plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis')
plt.title("Iris Dataset")
plt.xlabel("Feature 1 (Sepal length)")
plt.ylabel("Feature 2 (Sepal width)")
plt.show()
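
Before plotting, it can help to inspect what the dataset contains. The sketch below prints the shape, feature names, and class balance of the Iris data. The presence of the label array y is what makes this a supervised-learning dataset; an unsupervised method would use X alone.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Number of samples and features.
print(f"Shape of X: {X.shape}")

# Human-readable names for the columns of X and the values of y.
print(f"Features: {iris.feature_names}")
print(f"Classes: {list(iris.target_names)}")

# Number of samples per class label.
class_counts = np.bincount(y)
print(f"Samples per class: {class_counts}")
```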

6 Data Preprocessing and Feature Engineering


Objective: Learn about data cleaning, feature selection, and feature normalization.

Tasks
1. Data Cleaning: Load a dataset (e.g., Titanic dataset from Kaggle). Check for
missing values and apply methods to handle them (e.g., fill with mean or drop
rows).
import pandas as pd

# Load Titanic dataset
df = pd.read_csv('titanic.csv')

# Check for missing values
print(df.isnull().sum())

# Fill missing 'Age' values with the mean
# (plain assignment avoids the chained inplace fillna that newer pandas versions warn about)
df['Age'] = df['Age'].fillna(df['Age'].mean())
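
The two handling strategies named in this task, filling with the mean and dropping rows, have different effects on the dataset size. The sketch below contrasts them on a small made-up DataFrame; the values are hypothetical stand-ins for the Titanic columns, not real data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Titanic 'Age' and 'Fare' columns.
toy = pd.DataFrame({
    "Age": [22.0, np.nan, 30.0, np.nan, 40.0],
    "Fare": [7.25, 71.28, 8.05, 53.10, 8.46],
})

# Count missing values per column.
missing_per_column = toy.isnull().sum()

# Strategy 1: keep every row, replace missing ages with the column mean.
filled = toy.assign(Age=toy["Age"].fillna(toy["Age"].mean()))

# Strategy 2: discard rows where 'Age' is missing.
dropped = toy.dropna(subset=["Age"])

print(missing_per_column)
print(f"Rows after filling: {len(filled)}, after dropping: {len(dropped)}")
```

Filling keeps all rows but pulls the distribution toward the mean; dropping keeps only observed values but shrinks the dataset.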

2. Feature Normalization: Normalize the numerical features using MinMaxScaler
or StandardScaler.
from sklearn.preprocessing import StandardScaler

# Normalize 'Age' and 'Fare' columns
scaler = StandardScaler()
df[['Age', 'Fare']] = scaler.fit_transform(df[['Age', 'Fare']])

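
The two scalers named in this task rescale data differently: MinMaxScaler maps each feature to the range [0, 1], while StandardScaler shifts it to zero mean and unit variance. A minimal sketch on a made-up single-column array illustrates the difference.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Made-up values; scikit-learn scalers expect a 2D array (samples x features).
values = np.array([[10.0], [20.0], [30.0], [40.0], [50.0]])

# Min-max scaling: (x - min) / (max - min).
minmax = MinMaxScaler().fit_transform(values)

# Standardization: (x - mean) / std, using the population standard deviation.
standard = StandardScaler().fit_transform(values)

print(f"Min-max scaled: {minmax.ravel()}")
print(f"Standardized:   {np.round(standard.ravel(), 3)}")
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range; standardization is a common default for models that assume roughly centered inputs.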
3. Feature Selection: Use correlation analysis or feature importance (e.g., decision
trees) to select relevant features.
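import seaborn as sns
import matplotlib.pyplot as plt

# Calculate correlation matrix
# (numeric columns only; text columns such as names cannot be correlated)
corr = df.corr(numeric_only=True)

# Plot the heatmap of correlations
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()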
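
The task also mentions feature importance from decision trees as an alternative to correlation analysis. The sketch below shows one way this might look, using the Iris dataset instead of Titanic so that it runs without a local CSV file; that substitution and the random_state value are assumptions of this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()

# Fit a decision tree; random_state fixes tie-breaking between equally good splits.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(iris.data, iris.target)

# feature_importances_ holds one non-negative score per feature, summing to 1.
importances = dict(zip(iris.feature_names, tree.feature_importances_))

# Rank features from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Features with scores near zero contribute little to the tree's splits and are candidates for removal, though importance scores from a single tree can be unstable.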

7 Supervised Learning: Classification and Regression

Objective: Apply supervised learning algorithms to classification and regression tasks.

Tasks
1. Regression with Linear Regression: Use the Boston Housing dataset from scikit-
learn to perform linear regression and predict house prices. Evaluate the model
using Mean Squared Error (MSE).
# Note: load_boston was removed in scikit-learn 1.2, so this listing requires
# an older release. On newer versions, substitute another regression dataset
# such as fetch_california_housing.
from sklearn.datasets import load_boston
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Load dataset
boston = load_boston()
X = boston.data
y = boston.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")


2. Classification with Logistic Regression: Use the Iris dataset for classification
with Logistic Regression. Evaluate the model using accuracy and confusion matrix.
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Load dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Train the model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")
print(f"Confusion Matrix:\n{confusion_matrix(y_test, y_pred)}")
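
Accuracy and the confusion matrix are closely related: in scikit-learn's layout, rows are true classes and columns are predicted classes, so correct predictions sit on the diagonal. The sketch below uses small made-up label lists (not model output) to show how accuracy and per-class recall can be read off the matrix.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 0]

# Rows correspond to true classes, columns to predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Accuracy is the share of samples on the diagonal.
accuracy_from_cm = np.trace(cm) / cm.sum()

# Recall per class: diagonal entry divided by the row total.
per_class_recall = np.diag(cm) / cm.sum(axis=1)

print(cm)
print(f"Accuracy: {accuracy_from_cm:.3f}")
print(f"Recall per class: {np.round(per_class_recall, 3)}")
```

Per-class recall exposes weaknesses that a single accuracy figure can hide, for example a class that is frequently confused with another.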

8 Unsupervised Learning: Clustering and Dimensionality Reduction

Objective: Implement unsupervised learning techniques like clustering and dimensionality
reduction.

Tasks
1. Clustering with K-Means: Apply K-Means clustering on the Iris dataset and
visualize the clusters.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

# Iris feature matrix, as loaded in the earlier exercises
X = load_iris().data

# Apply KMeans clustering
kmeans = KMeans(n_clusters=3, random_state=42)
y_kmeans = kmeans.fit_predict(X)

# Visualize the clusters
plt.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
            s=200, c='red', marker='x')
plt.title("K-Means Clustering")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
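
Because Iris comes with species labels, the clustering can be compared against them, even though K-Means itself never sees the labels. Cluster IDs are arbitrary, so a permutation-invariant score is needed; the sketch below uses the adjusted Rand index as one such option. Setting n_init=10 explicitly is an assumption of this sketch, chosen to keep behavior consistent across scikit-learn versions.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

X, y = load_iris(return_X_y=True)

# Cluster without using the labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# Adjusted Rand index: 1.0 means identical groupings, values near 0 mean
# agreement no better than chance. It ignores how cluster IDs are numbered.
ari = adjusted_rand_score(y, labels)

print(f"Adjusted Rand index vs. species labels: {ari:.3f}")
print(f"Inertia (within-cluster sum of squares): {kmeans.inertia_:.2f}")
```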

2. Dimensionality Reduction with PCA: Apply Principal Component Analysis
(PCA) to reduce the dimensions of the Iris dataset and visualize it in 2D.
from sklearn.decomposition import PCA

# X, y, and plt come from the earlier listings

# Apply PCA to reduce to 2 components
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Visualize the reduced data
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap='viridis')
plt.title("PCA - Iris Data")
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.show()
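
A 2D projection is only useful if the two components retain most of the variation in the data. The sketch below checks this through the explained_variance_ratio_ attribute of a fitted PCA object; the data is loaded again here so that the example is self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Fraction of the total variance captured by each principal component.
ratios = pca.explained_variance_ratio_
total_retained = ratios.sum()

print(f"Reduced shape: {X_pca.shape}")
print(f"Variance explained per component: {ratios.round(3)}")
print(f"Total variance retained: {total_retained:.3f}")
```

For Iris, the first component alone captures most of the variance, which is why the species separate reasonably well in the 2D plot.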

9 Model Evaluation and Hyperparameter Tuning


Objective: Evaluate models using cross-validation and tune hyperparameters for better
performance.

Tasks
1. Model Evaluation with Cross-Validation: Apply cross-validation to evaluate
the performance of a classification model (e.g., SVM or Random Forest).
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

# Apply cross-validation
rf = RandomForestClassifier(n_estimators=100)
scores = cross_val_score(rf, X, y, cv=5)
print(f"Cross-validated accuracy: {scores.mean()}")

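
A single mean score hides how much the folds disagree. The sketch below reports the standard deviation as well and passes an explicit StratifiedKFold splitter with shuffling and a fixed seed so that runs are repeatable; the seed values are arbitrary choices made for this sketch.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)

# Stratified folds keep the class proportions the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Fixing random_state makes the forest itself deterministic.
rf = RandomForestClassifier(n_estimators=100, random_state=42)

# One accuracy score per fold.
scores = cross_val_score(rf, X, y, cv=cv)

print(f"Fold accuracies: {scores.round(3)}")
print(f"Mean accuracy: {scores.mean():.3f} (std: {scores.std():.3f})")
```

A large spread across folds suggests the estimate is sensitive to how the data is split, which matters when comparing models whose mean scores are close.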

2. Hyperparameter Tuning with GridSearchCV: Use GridSearchCV to tune the
hyperparameters of an SVM classifier (e.g., C and gamma).
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# X_train and y_train come from the earlier train-test split

# Define parameter grid and model
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}
svm = SVC()

# Apply GridSearchCV
grid_search = GridSearchCV(svm, param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Display best hyperparameters
print(f"Best Hyperparameters: {grid_search.best_params_}")
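
The best parameters alone do not say how well the tuned model generalizes. The sketch below is a self-contained variant, assuming the Iris data and a 70/30 split similar to the earlier exercise; it reports the cross-validated score of the best combination and then evaluates the refitted model on the held-out test set, which played no part in the search.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# With the default refit=True, the best combination is retrained on all of
# X_train once the search finishes.
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Mean cross-validated accuracy of the best parameter combination.
cv_accuracy = grid_search.best_score_

# Accuracy of the refitted best model on data the search never saw.
test_accuracy = grid_search.score(X_test, y_test)

print(f"Best hyperparameters: {grid_search.best_params_}")
print(f"Cross-validated accuracy: {cv_accuracy:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Keeping the test set out of the search avoids an optimistic estimate, since the cross-validated score was itself used to pick the parameters.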
