0% found this document useful (0 votes)

8 views

ML LAB34

The document describes a series of experiments involving data processing and machine learning techniques applied to housing data and a linear regression model. It includes steps for data imputation, anomaly detection, standardization, normalization, and encoding, followed by the implementation of a gradient descent algorithm for linear regression. The final results include predictions from the model and statistical calculations related to height and weight data.

Uploaded by

Manav Purswani

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

8 views

ML LAB34

Uploaded by

Manav Purswani

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 29

Experiment 1

import pandas as pd
data = pd.read_csv('large_housing_data_mumbai.csv')
print("Original Data:")
print(data.head())

Original Data:
House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 NaN 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 NaN 49899606.0 Worli NaN

#Imputation
#Handle missing values using median for numerical columns and the most
frequent value for categorical columns.
from sklearn.impute import SimpleImputer
num_features = ['Bedrooms', 'Size (sq ft)', 'Price (INR)', 'Year_Built']
cat_features = ['Location']
num_imputer = SimpleImputer(strategy='median')
data[num_features] = num_imputer.fit_transform(data[num_features])
cat_imputer = SimpleImputer(strategy='most_frequent')
data[cat_features] = cat_imputer.fit_transform(data[cat_features])
print("\nData After Imputation:")
print(data.head())

Data After Imputation:

House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 3.0 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 1702.5 49899606.0 Worli 2012.0

#Anomaly Detection
#Detect anomalies in the dataset. Here, we use Z-scores to identify anomalies
in the Price (INR) column.
from scipy import stats
z_scores = stats.zscore(data[num_features])
data['Anomaly'] = (abs(z_scores) > 3).any(axis=1) # Mark anomalies
print("\nData After Anomaly Detection:")
print(data.head())
#Rule-Based Anomaly Detection
#simple rules where:
#A house with less than 1000 sq ft should have 1 to 2 bedrooms.
#A house with 1000-2000 sq ft should have 2 to 4 bedrooms.
#A house with more than 2000 sq ft should have 3 or more bedrooms.
def is_bedroom_size_reasonable(row):
if row['Size (sq ft)'] < 1000:
return 1 <= row['Bedrooms'] <= 2
elif row['Size (sq ft)'] <= 2000:
return 2 <= row['Bedrooms'] <= 4
else:
return row['Bedrooms'] >= 3
data['Bed_Size_Anomaly'] = ~data.apply(is_bedroom_size_reasonable, axis=1)
print("\nData After Rule-Based Anomaly Detection:")
print(data.head())

Data After Anomaly Detection:

House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 4.0 855.0 31356226.0 Juhu 2002.0
1 2 5.0 1847.0 27775439.0 Andheri 2004.0
2 3 3.0 2363.0 37325149.0 Bandra 2000.0
3 4 5.0 626.0 6147116.0 South Mumbai 2002.0
4 5 5.0 1702.5 49899606.0 Worli 2012.0

Anomaly
0 False
1 False
2 False
3 False
4 False

Data After Rule-Based Anomaly Detection:

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True

#Standardization
#Standardize numerical features so they have a mean of 0 and a standard
deviation of 1.
from sklearn.preprocessing import StandardScaler
# Standardize numericals
scaler = StandardScaler()
data[num_features] = scaler.fit_transform(data[num_features])
print("\nData After Standardization:")
print(data.head())

Data After Standardization:

House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 0.710719 -1.231366 -0.017529 Juhu -1.432248
1 2 1.432261 0.194650 -0.110953 Andheri -1.124866
2 3 -0.010823 0.936408 0.138203 Bandra -1.739631
3 4 1.432261 -1.560557 -0.675243 South Mumbai -1.432248
4 5 1.432261 -0.013071 0.466275 Worli 0.104664

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True

#Normalization
#Normalize numerical features to fit within the range [0, 1]
from sklearn.preprocessing import MinMaxScaler

normalizer = MinMaxScaler()
data[num_features] = normalizer.fit_transform(data[num_features])
print("\nData After Normalization:")
print(data.head())

Data After Normalization:

House_ID Bedrooms Size (sq ft) Price (INR) Location Year_Built \
0 1 0.75 0.142055 0.058226 Juhu 0.090909
1 2 1.00 0.540128 0.050309 Andheri 0.181818
2 3 0.50 0.747191 0.071422 Bandra 0.000000
3 4 1.00 0.050161 0.002493 South Mumbai 0.090909
4 5 1.00 0.482143 0.099222 Worli 0.545455

Anomaly Bed_Size_Anomaly
0 False True
1 False True
2 False False
3 False True
4 False True

#Encoding
#One-Hot Encode the categorical feature Location.
from sklearn.preprocessing import OneHotEncoder
# One-Hot Encoding for 'Location'
encoder = OneHotEncoder(sparse=False)
encoded_location = encoder.fit_transform(data[['Location']])
encoded_df = pd.DataFrame(encoded_location,
columns=encoder.get_feature_names_out(['Location']))

data_encoded = pd.concat([data, encoded_df], axis=1).drop('Location', axis=1)

print("\nData After Encoding:")

print(data_encoded.head())

Data After Encoding:

House_ID Bedrooms Size (sq ft) Price (INR) Year_Built Anomaly \
0 1 0.75 0.142055 0.058226 0.090909 False
1 2 1.00 0.540128 0.050309 0.181818 False
2 3 0.50 0.747191 0.071422 0.000000 False
3 4 1.00 0.050161 0.002493 0.090909 False
4 5 1.00 0.482143 0.099222 0.545455 False

Bed_Size_Anomaly Location_Andheri Location_Bandra Location_Borivali \

0 True 0.0 0.0 0.0
1 True 1.0 0.0 0.0
2 False 0.0 1.0 0.0
3 True 0.0 0.0 0.0
4 True 0.0 0.0 0.0

Location_Juhu Location_Malad Location_Pali Hill Location_South Mumbai

\
0 1.0 0.0 0.0 0.0
1 0.0 0.0 0.0 0.0
2 0.0 0.0 0.0 0.0
3 0.0 0.0 0.0 1.0
4 0.0 0.0 0.0 0.0

Location_Worli
0 0.0
1 0.0
2 0.0
3 0.0
4 1.0

/usr/local/lib/python3.10/dist-
packages/sklearn/preprocessing/_encoders.py:975: FutureWarning: `sparse` was
renamed to `sparse_output` in version 1.2 and will be removed in 1.4.
`sparse_output` is ignored unless you leave `sparse` to its default value.
warnings.warn(
Experiment 2
import numpy as np

class GradientDescentMSE:
def __init__(self, lr=0.01, n_iters=1000):
self.lr = lr
self.n_iters = n_iters
self.x1 = None
self.x2 = None

def fit(self, X, y):

# Initialize parameters
self.x1 = np.random.randn() # Initialize x1
self.x2 = np.random.randn() # Initialize x2

for _ in range(self.n_iters):
# Compute predictions
y_pred = self.x1 * X[:, 0] + self.x2 * X[:, 1]

# Compute gradients for MSE

grad_x1 = (2/len(y)) * np.sum((y_pred - y) * X[:, 0])
grad_x2 = (2/len(y)) * np.sum((y_pred - y) * X[:, 1])

# Update parameters
self.x1 = self.x1 - self.lr * grad_x1
self.x2 = self.x2 - self.lr * grad_x2

return self.x1, self.x2

def objective_function(self, X):

return self.x1 * X[:, 0] + self.x2 * X[:, 1]

# Example dataset with two features

X = np.array([[0.5, 1.0], [1.0, 2.0], [1.5, 2.5], [2.0, 3.5]]) # Features
y = np.array([1.5, 2.5, 3.0, 4.0]) # True values

# Initialize and run gradient descent

gd_mse = GradientDescentMSE(lr=0.01, n_iters=1000)
final_x1, final_x2 = gd_mse.fit(X, y)
final_predictions = gd_mse.objective_function(X)

print(f"Final x1: {final_x1}, Final x2: {final_x2}")

print(f"Final Predictions: {final_predictions}")

Final x1: -1.3189740745133114, Final x2: 1.9351908822144432

Final Predictions: [1.27570384 2.55140769 2.85951609 4.13521994]
Experiment 3
import pandas as pd
import numpy as np
file_path = '/ml-linear-reg.csv'
data = pd.read_csv(file_path)
#display data
print(data)

Height Weight
0 151 63
1 174 81
2 138 56
3 186 91
4 128 47
5 136 57
6 179 76
7 163 72
8 152 62
9 131 48

#mean of X (Height) and Y (Weight)

x_mean = np.mean(data['Height'])
y_mean = np.mean(data['Weight'])

#Display
print(f"Mean of Height (x_mean): {x_mean}")
print(f"Mean of Weight (y_mean): {y_mean}")

Mean of Height (x_mean): 153.8

Mean of Weight (y_mean): 65.3

# Calculate xi - x_bar and yi - y_bar

data['xi-xbar'] = data['Height'] - x_mean
data['yi-ybar'] = data['Weight'] - y_mean

#Display xi - x_bar and yi - y_bar

print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar']])

Height Weight xi-xbar yi-ybar

0 151 63 -2.8 -2.3
1 174 81 20.2 15.7
2 138 56 -15.8 -9.3
3 186 91 32.2 25.7
4 128 47 -25.8 -18.3
5 136 57 -17.8 -8.3
6 179 76 25.2 10.7
7 163 72 9.2 6.7
8 152 62 -1.8 -3.3
9 131 48 -22.8 -17.3
# Calculate product of (xi - x_bar) and (yi - y_bar)
data['(xi-xbar)*(yi-ybar)'] = data['xi-xbar'] * data['yi-ybar']

# Display product of (xi - x_bar) and (yi - y_bar)

print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar', '(xi-xbar)*(yi-
ybar)']])

Height Weight xi-xbar yi-ybar (xi-xbar)*(yi-ybar)

0 151 63 -2.8 -2.3 6.44
1 174 81 20.2 15.7 317.14
2 138 56 -15.8 -9.3 146.94
3 186 91 32.2 25.7 827.54
4 128 47 -25.8 -18.3 472.14
5 136 57 -17.8 -8.3 147.74
6 179 76 25.2 10.7 269.64
7 163 72 9.2 6.7 61.64
8 152 62 -1.8 -3.3 5.94
9 131 48 -22.8 -17.3 394.44

# Calculate square of (xi - x_bar)

data['sq(xi-xbar)'] = data['xi-xbar'] ** 2

# Display square of (xi - x_bar)

print(data[['Height', 'Weight', 'xi-xbar', 'yi-ybar', '(xi-xbar)*(yi-ybar)',
'sq(xi-xbar)']])

Height Weight xi-xbar yi-ybar (xi-xbar)*(yi-ybar) sq(xi-xbar)

0 151 63 -2.8 -2.3 6.44 7.84
1 174 81 20.2 15.7 317.14 408.04
2 138 56 -15.8 -9.3 146.94 249.64
3 186 91 32.2 25.7 827.54 1036.84
4 128 47 -25.8 -18.3 472.14 665.64
5 136 57 -17.8 -8.3 147.74 316.84
6 179 76 25.2 10.7 269.64 635.04
7 163 72 9.2 6.7 61.64 84.64
8 152 62 -1.8 -3.3 5.94 3.24
9 131 48 -22.8 -17.3 394.44 519.84

# Calculate sum of square of (xi - x_bar)

sum_sq_xi_xbar = np.sum(data['sq(xi-xbar)'])

# Display sum of square of (xi - x_bar)

print(f"Sum of square of (xi - x_bar): {sum_sq_xi_xbar}")

Sum of square of (xi - x_bar): 3927.6000000000004

# Calculate sum of (xi - x_bar) * (yi - y_bar)

sum_xiyi_xbar_ybar = np.sum(data['(xi-xbar)*(yi-ybar)'])

# Display sum of (xi - x_bar) * (yi - y_bar)

print(f"Sum of (xi - x_bar) * (yi - y_bar): {sum_xiyi_xbar_ybar}")
Sum of (xi - x_bar) * (yi - y_bar): 2649.6

# Calculate b1 (slope)
b1 = sum_xiyi_xbar_ybar / sum_sq_xi_xbar

# Display b1 (slope)
print(f"Slope (b1): {b1}")

Slope (b1): 0.6746104491292392

# Calculate b0 (intercept)
b0 = y_mean - b1 * x_mean

# Display b0 (intercept)
print(f"Intercept (b0): {b0}")

Intercept (b0): -38.45508707607699

# Define a function to predict weight from height

def predict(height):
return b1 * height + b0

# Example prediction
height_new = 160
weight_prediction = predict(height_new)
print(f'Predicted weight for height {height_new} cm is
{weight_prediction:.2f} kg')

Predicted weight for height 160 cm is 69.48 kg

Experiment 4
import numpy as np
1
The sigmoid function is defined as 𝜙(𝑧) = 1+𝑧−𝑧

$\therefore \phi (\hat{y}) = \frac{1}{1 + e^(-\hat{y})} $

$\therefore \hat{y} = \frac{1}{1 + e^(-wx+b)} $
def sigmoid(x):
return 1/(1+np.exp(-x))

class LogisticRegression():

def init(self, lr=0.01, n_iters=1000):

self.lr = lr
self.n_iters = n_iters
self.weights = None
self.bias = None

def fit(self, X, y):

n_samples, n_features = X.shape
self.weights = np.zeros(n_features)
self.bias = 0

for _ in range(self.n_iters):
linear_pred = np.dot(X, self.weights) + self.bias
y_pred = sigmoid(linear_pred) # Logistic addition. Rest all is
Linear Regression.

dw = (1/n_samples) * np.dot(X.T, (y_pred - y))

db = (1/n_samples) * np.sum(y_pred - y)

self.weights = self.weights - self.lr*dw

self.bias = self.bias - self.lr*db

def predict(self, X):

linear_pred = np.dot(X, self.weights) + self.bias
y_pred = sigmoid(linear_pred)
class_pred = [0 if y<=0.5 else 1 for y in y_pred]
return class_pred

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets
import matplotlib.pyplot as plt

bc = datasets.load_breast_cancer()
X, y = bc.data, bc.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=1234)

model = LogisticRegression(lr=0.01)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)

def accuracy(y_pred, y_test):

return np.sum(y_pred==y_test)/len(y_test)

acc = accuracy(y_pred, y_test)

print(acc)

0.9210526315789473

C:\Users\rohra\AppData\Local\Temp\ipykernel_19392\4033946986.py:2:
RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))

import pandas as pd

results = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})

print(results)

Actual Predicted
0 1 1
1 1 1
2 1 1
3 1 1
4 1 1
.. ... ...
109 1 1
110 0 0
111 1 0
112 0 0
113 0 0

[114 rows x 2 columns]

Experiment 5
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("penguins_size.csv")
df.head()

# EDA
# Missing Data
df.info()

df.isna().sum()

# What percentage are we dropping?

100*(10/344)
df = df.dropna()
df.info()

df.head()

df['sex'].unique()

df['island'].unique()

df = df[df['sex']!='.']
# Feature Engineering
pd.get_dummies(df)
pd.get_dummies(df.drop('species',axis=1),drop_first=True)

# Train and Test split

X = pd.get_dummies(df.drop('species',axis=1),drop_first=True)
y = df['species']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# Decision Tree Classifier
from sklearn.tree import DecisionTreeClassifier
model = DecisionTreeClassifier()
model.fit(X_train,y_train)
base_pred = model.predict(X_test)
from sklearn.metrics import confusion_matrix, classification_report, ConfusionMatrixDisplay
import matplotlib.pyplot as plt

# Generate confusion matrix

y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
# Display the confusion matrix
disp = ConfusionMatrixDisplay(confusion_matrix=cm)
disp.plot()

plt.show()

print(classification_report(y_test,base_pred))

model.feature_importances_
pd.DataFrame(index=X.columns,data=model.feature_importances_,columns=['Feature
Importance'])

# Visualize the tree

from sklearn.tree import plot_tree
plt.figure(figsize=(12,8))
plot_tree(model);

from sklearn.tree import plot_tree

import matplotlib.pyplot as plt
# Convert X.columns to a list
plt.figure(figsize=(12, 8), dpi=150)
plot_tree(model, filled=True, feature_names=X.columns.tolist())

plt.show()

def report_model(model):
model_preds = model.predict(X_test)
print(classification_report(y_test,model_preds))
print('\n')
plt.figure(figsize=(12,8),dpi=150)
plot_tree(model,filled=True,feature_names=X.columns);
pruned_tree = DecisionTreeClassifier(max_depth=2)
pruned_tree.fit(X_train,y_train)
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt

def report_model(model):
# Print classification report if needed (e.g., precision, recall, etc.)
print('\n')

# Convert X.columns to a list before passing to plot_tree

plt.figure(figsize=(12, 8), dpi=150)
plot_tree(model, filled=True, feature_names=X.columns.tolist())
plt.show()

pruned_tree = DecisionTreeClassifier(max_leaf_nodes=3)
pruned_tree.fit(X_train,y_train)
report_model(pruned_tree)

entropy_tree = DecisionTreeClassifier(criterion='entropy')
entropy_tree.fit(X_train,y_train)
report_model(entropy_tree)
Experiment 6
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import LabelEncoder

# Load dataset from CSV

data = pd.read_csv('iris.csv')

print(data)

# Use only the first two features for training and visualization
X = data.iloc[:, :2].values # First two features
y = data.iloc[:, -1].values # Target variable (last column)

# Encode target labels (species) to numeric values

label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split the data into training and test sets

X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3,
random_state=42)

# Create and train the SVM model with RBF kernel

svm_rbf = SVC(kernel='rbf', gamma='auto')
svm_rbf.fit(X_train, y_train)

# Make predictions
y_pred = svm_rbf.predict(X_test)

# Accuracy
accuracy = accuracy_score(y_test, y_pred) * 100
print(f"Accuracy: {accuracy:.4f}\n")

# Classification report (formatted as a DataFrame)

report_dict = classification_report(y_test, y_pred,
target_names=label_encoder.classes_, output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()
print("Classification Report:\n", report_df)

# Confusion matrix
conf_matrix = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(conf_matrix)
print()

# Visualize the Confusion Matrix as a heatmap

plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix, annot=True, fmt='d', cmap='Blues',
xticklabels=label_encoder.classes_, yticklabels=label_encoder.classes_)
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.title('Confusion Matrix Heatmap')
plt.show()

print()

# Visualize the decision boundary (for 2D data)

def plot_decision_boundary(X, y, model):
h = .02 # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.coolwarm)

plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o',
cmap=plt.cm.coolwarm)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('SVM Decision Boundary with RBF Kernel')
plt.show()

# Plot decision boundary using the test set

plot_decision_boundary(X_test, y_test, svm_rbf)
Experiment 7
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load dataset from CSV

data = pd.read_csv('iris.csv')

# Use only the first two features for training and visualization
X = data.iloc[:, :2].values # First two features
y = data.iloc[:, -1].values # Target variable (last column)

# Encode target labels (species) to numeric values

label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)

# Split the data into training and test sets

X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.3,
random_state=42)

# 1. SVM Model
svm_rbf = SVC(kernel='rbf', gamma='auto', probability=True)
svm_rbf.fit(X_train, y_train)
y_pred_svm = svm_rbf.predict(X_test)

# 2. Random Forest Model (Bagging)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)

# 3. AdaBoost Model (Boosting)

ada = AdaBoostClassifier(base_estimator=SVC(kernel='linear', probability=True),
n_estimators=50, random_state=42)
ada.fit(X_train, y_train)
y_pred_ada = ada.predict(X_test)

# Combine predictions using majority voting

y_pred_ensemble = np.array([y_pred_svm, y_pred_rf, y_pred_ada])
y_pred_final = np.array([np.bincount(x).argmax() for x in y_pred_ensemble.T])

# Accuracy for each model

for model_name, y_pred in zip(['SVM', 'Random Forest', 'AdaBoost', 'Ensemble'],
[y_pred_svm, y_pred_rf, y_pred_ada, y_pred_final]):
accuracy = accuracy_score(y_test, y_pred)
print(f"{model_name} Accuracy: {accuracy:.4f}\n")

# Classification report for the ensemble model

report_dict = classification_report(y_test, y_pred_final, target_names=label_encoder.classes_,
output_dict=True)
report_df = pd.DataFrame(report_dict).transpose()
print("Ensemble Classification Report:\n", report_df)

# Confusion matrix for the ensemble model

conf_matrix = confusion_matrix(y_test, y_pred_final)
print("\nEnsemble Confusion Matrix:")
print(conf_matrix)

# Visualize the Confusion Matrix as a heatmap for the ensemble model

# Visualize the decision boundary for the ensemble model

Z = model.predict(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.contourf(xx, yy, Z, alpha=0.8, cmap=plt.cm.coolwarm)

plt.scatter(X[:, 0], X[:, 1], c=y, edgecolors='k', marker='o', cmap=plt.cm.coolwarm)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Ensemble Decision Boundary with Bagging and Boosting')
plt.show()

# Since we can't train an ensemble model directly, we just plot the decision boundary using the
SVM model
plot_decision_boundary(X_test, y_test, svm_rbf)
Experiment 8
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Step 1: Load the Iris dataset from a CSV file

data = pd.read_csv('iris.csv')

# Assuming the last column is the target label and the rest are features
X = data.iloc[:, :-1].values # Features (all rows, all columns except the last)
y = data.iloc[:, -1].values # Target (all rows, last column)

# Map string labels to integers

label_mapping = {'setosa': 0, 'versicolor': 1, 'virginica': 2}
y_numeric = np.array([label_mapping[label] for label in y])

# Step 2: Standardize the data

scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Step 3: Calculate the covariance matrix

cov_matrix = np.cov(X_std.T)

# Step 4: Calculate the eigenvalues and eigenvectors

eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

# Step 5: Sort the eigenvalues and eigenvectors

sorted_indices = np.argsort(eigenvalues)[::-1]
eigenvalues_sorted = eigenvalues[sorted_indices]
eigenvectors_sorted = eigenvectors[:, sorted_indices]

# Step 6: Select the number of principal components

n_components = 2
eigenvectors_subset = eigenvectors_sorted[:, :n_components]

# Step 7: Transform the data

X_pca = X_std.dot(eigenvectors_subset)

# Step 8: Visualize the results

plt.figure(figsize=(8, 6))
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_numeric, cmap='viridis', edgecolor='k', s=100)
plt.xlabel('Principal Component 1')
plt.ylabel('Principal Component 2')
plt.title('PCA of Iris Dataset')
plt.grid()
plt.show()

Property-Management-Contract (1) JMC
100% (4)
Property-Management-Contract (1) JMC
4 pages
Norma IPC 2222B-2020
0% (2)
Norma IPC 2222B-2020
3 pages
Multiple - Linear - Regression - AirBNB - Student - File0.2 - New (1) .Ipynb - Colaboratory
No ratings yet
Multiple - Linear - Regression - AirBNB - Student - File0.2 - New (1) .Ipynb - Colaboratory
8 pages
Compal LA-F951P DH5VF DH7VF DH53F DH73F REV 1A (1.0) - Acer Nitro AN515-52
100% (5)
Compal LA-F951P DH5VF DH7VF DH53F DH73F REV 1A (1.0) - Acer Nitro AN515-52
67 pages
Predicting Home Prices in Bangalore
No ratings yet
Predicting Home Prices in Bangalore
18 pages
Introduction To Machine Learning (ML) With Sklearn
No ratings yet
Introduction To Machine Learning (ML) With Sklearn
10 pages
Faisal Nadeem (SAP# 30601)
No ratings yet
Faisal Nadeem (SAP# 30601)
7 pages
Normialization Dataset
No ratings yet
Normialization Dataset
7 pages
T2_summary_VHA
No ratings yet
T2_summary_VHA
14 pages
20BECE30146 ML Pratical2
No ratings yet
20BECE30146 ML Pratical2
3 pages
Kaggle Machine Learning
No ratings yet
Kaggle Machine Learning
6 pages
Linear Regression Analysis - Polynomial Regression
No ratings yet
Linear Regression Analysis - Polynomial Regression
25 pages
Quantam - Learning - Colaboratory
No ratings yet
Quantam - Learning - Colaboratory
13 pages
f3683849-7ca6-4854-8f96-af11b6e837ec
No ratings yet
f3683849-7ca6-4854-8f96-af11b6e837ec
20 pages
House Price Prediction Models
No ratings yet
House Price Prediction Models
16 pages
Document From Jahnavi
No ratings yet
Document From Jahnavi
20 pages
Copy of Project 4 _ House Price Prediction.ipynb - Colab
No ratings yet
Copy of Project 4 _ House Price Prediction.ipynb - Colab
5 pages
Real_Estate_Price_Prediction_Model
No ratings yet
Real_Estate_Price_Prediction_Model
33 pages
a
No ratings yet
a
2 pages
DL_1
No ratings yet
DL_1
11 pages
The Data Science Process
100% (1)
The Data Science Process
53 pages
Import As Import As From Import: "Mean Squared Errors: "
No ratings yet
Import As Import As From Import: "Mean Squared Errors: "
1 page
Bi El
No ratings yet
Bi El
26 pages
ML lab manual
No ratings yet
ML lab manual
25 pages
Setup: Chapter 2 - End-To-End Machine Learning Project
No ratings yet
Setup: Chapter 2 - End-To-End Machine Learning Project
31 pages
Machine Learning Lab Manual (1)
No ratings yet
Machine Learning Lab Manual (1)
33 pages
AAAAAAAAAAAAAAAAAAAAAAAAA
No ratings yet
AAAAAAAAAAAAAAAAAAAAAAAAA
41 pages
Week 12
No ratings yet
Week 12
2 pages
Regression Algorithm
No ratings yet
Regression Algorithm
9 pages
Multiple - Linear - Regression - AirBNB - Solution-0.2 - New - Ipynb - Colaboratory
No ratings yet
Multiple - Linear - Regression - AirBNB - Solution-0.2 - New - Ipynb - Colaboratory
11 pages
House - Price - Prediction
No ratings yet
House - Price - Prediction
16 pages
Housepriceprediction ML 221104055342 Fb5109ae
No ratings yet
Housepriceprediction ML 221104055342 Fb5109ae
17 pages
Capstone Project Report
No ratings yet
Capstone Project Report
187 pages
Exp1a
No ratings yet
Exp1a
5 pages
02 End To End Machine Learning Project
No ratings yet
02 End To End Machine Learning Project
26 pages
Predicting House Prices Using Regression Techniques: Problem Statement: Problems Faced During Buying A House
No ratings yet
Predicting House Prices Using Regression Techniques: Problem Statement: Problems Faced During Buying A House
20 pages
The Boston Housing Dataset
100% (1)
The Boston Housing Dataset
4 pages
ML Record
No ratings yet
ML Record
15 pages
vertopal.com_housing_linear
No ratings yet
vertopal.com_housing_linear
3 pages
Project PDF
No ratings yet
Project PDF
13 pages
Pattern - Recognition - 3 - Code With Output
No ratings yet
Pattern - Recognition - 3 - Code With Output
7 pages
Presentation 1
No ratings yet
Presentation 1
2 pages
Exercise3 Solution
No ratings yet
Exercise3 Solution
19 pages
Emllab
No ratings yet
Emllab
6 pages
California Housing Price Prediction .
No ratings yet
California Housing Price Prediction .
1 page
Linear Reg
No ratings yet
Linear Reg
25 pages
Mlext
No ratings yet
Mlext
1 page
Xgboost
No ratings yet
Xgboost
12 pages
1722414346054
No ratings yet
1722414346054
18 pages
House Price Prediction
No ratings yet
House Price Prediction
1 page
Machine Learning Laboratory
No ratings yet
Machine Learning Laboratory
23 pages
q1
No ratings yet
q1
2 pages
Tarea - Prediccion de Casas en California
No ratings yet
Tarea - Prediccion de Casas en California
5 pages
Boston Housing Solutions
No ratings yet
Boston Housing Solutions
3 pages
ML Lecture # 04 Multiple Regression
No ratings yet
ML Lecture # 04 Multiple Regression
29 pages
LAB-Skill Advanced Course Machine Learning With Python Experiments
No ratings yet
LAB-Skill Advanced Course Machine Learning With Python Experiments
23 pages
P04 The Regression Pipeline - Preprocessing Ans
No ratings yet
P04 The Regression Pipeline - Preprocessing Ans
19 pages
GianluigiDeRubertis 228766
No ratings yet
GianluigiDeRubertis 228766
9 pages
One Hot Encoding
No ratings yet
One Hot Encoding
12 pages
California 1673295505
No ratings yet
California 1673295505
18 pages
DMV - 3 - Jupyter Notebook
No ratings yet
DMV - 3 - Jupyter Notebook
2 pages
ML 1st Program
No ratings yet
ML 1st Program
3 pages
Let's Play with Excel
From Everand
Let's Play with Excel
Anurag Pandey
No ratings yet
DC_ANS-1
No ratings yet
DC_ANS-1
66 pages
Distributed_Systems_Cleaned_Question_Bank
No ratings yet
Distributed_Systems_Cleaned_Question_Bank
5 pages
DC_ANS
No ratings yet
DC_ANS
73 pages
Project_Management_Module_Wise_Questions_With_Repetitions
No ratings yet
Project_Management_Module_Wise_Questions_With_Repetitions
2 pages
Distributed_Systems_Updated_Question_Bank
No ratings yet
Distributed_Systems_Updated_Question_Bank
3 pages
DC_Final_Sem[1]
No ratings yet
DC_Final_Sem[1]
142 pages
Data Warehousing and Mining Techmax Semester 6 Computer Engineering
No ratings yet
Data Warehousing and Mining Techmax Semester 6 Computer Engineering
109 pages
MIS
No ratings yet
MIS
55 pages
Goa plan
No ratings yet
Goa plan
2 pages
management-consultant-resume-example
No ratings yet
management-consultant-resume-example
1 page
BC
No ratings yet
BC
67 pages
ADS ans
No ratings yet
ADS ans
18 pages
Setting Goals To Success
No ratings yet
Setting Goals To Success
23 pages
PACOM 1076 Intelligent 2 Door Controller Datasheet
No ratings yet
PACOM 1076 Intelligent 2 Door Controller Datasheet
2 pages
Ndovi v Standard Bank Judgement edited
No ratings yet
Ndovi v Standard Bank Judgement edited
24 pages
Appendices 11.1 Market Research
No ratings yet
Appendices 11.1 Market Research
5 pages
full tuition fee reimbursement by GOVT-160322
No ratings yet
full tuition fee reimbursement by GOVT-160322
30 pages
Answer Key CHAPTER 24
No ratings yet
Answer Key CHAPTER 24
1 page
Loctite 97152 Manual
No ratings yet
Loctite 97152 Manual
48 pages
Bulk Fabric Handling Process For Direct Import
No ratings yet
Bulk Fabric Handling Process For Direct Import
7 pages
Hawai Math Pp1 Qs Post-mock - Set 01
No ratings yet
Hawai Math Pp1 Qs Post-mock - Set 01
15 pages
Key Stage 1 & 2 Planning Template
No ratings yet
Key Stage 1 & 2 Planning Template
4 pages
PI33 Supported Device List
No ratings yet
PI33 Supported Device List
266 pages
Invitation
No ratings yet
Invitation
5 pages
Kidney Transplant
No ratings yet
Kidney Transplant
30 pages
Form 4562
No ratings yet
Form 4562
2 pages
Budgeting, Budget Process, Implementation and Performance Evaluation and Control
No ratings yet
Budgeting, Budget Process, Implementation and Performance Evaluation and Control
8 pages
Proposal
No ratings yet
Proposal
2 pages
C1125D6
No ratings yet
C1125D6
4 pages
TD62783APG, TD62783AFWG: 8Ch High Voltage Source Driver
No ratings yet
TD62783APG, TD62783AFWG: 8Ch High Voltage Source Driver
10 pages
A General Method of Controlled Bidirectional Quantum Teleportation of Qudit State
No ratings yet
A General Method of Controlled Bidirectional Quantum Teleportation of Qudit State
7 pages
Profile: Enis Gemil: Personal Data
No ratings yet
Profile: Enis Gemil: Personal Data
4 pages
Related Party Disclosures
No ratings yet
Related Party Disclosures
2 pages
Single Entry
No ratings yet
Single Entry
5 pages
Area / Perimeter Worksheet: Draw A Rectangle With An Area of 40 Square Units. Find The Area of This Rectangle
No ratings yet
Area / Perimeter Worksheet: Draw A Rectangle With An Area of 40 Square Units. Find The Area of This Rectangle
2 pages
COMPRESS Client List
No ratings yet
COMPRESS Client List
6 pages
Differences Between Fundamental Research and Action Research
No ratings yet
Differences Between Fundamental Research and Action Research
4 pages
Forward Lease Sukuk in Islamic Capital Mrkts - Lahsane N Kabir Hassan
No ratings yet
Forward Lease Sukuk in Islamic Capital Mrkts - Lahsane N Kabir Hassan
288 pages
Shallow Groundwater Investigation Using Time-Domain Electromagnetic (TEM) Method at Itay El-Baroud, Nile Delta, Egypt
No ratings yet
Shallow Groundwater Investigation Using Time-Domain Electromagnetic (TEM) Method at Itay El-Baroud, Nile Delta, Egypt
12 pages