
TOPICS

MACHINE LEARNING LAB RECORD – 21MDS48

SUBMITTED BY
LIKKITHA S
71762132023

SUBMITTED TO
MRS. D. SUDHA DEVI

COIMBATORE INSTITUTE OF TECHNOLOGY

30 JUNE 2023
EX NO DATE TOPICS

1 09/03/23 SIMPLE LINEAR REGRESSION

2 16/03/23 FIND S ALGORITHM

3 30/03/23 CANDIDATE ELIMINATION ALGORITHM

4 30/03/23 MULTIPLE LINEAR REGRESSION

5 18/04/23 POLYNOMIAL REGRESSION

6 20/04/23 LOGISTIC REGRESSION

7 18/05/23 GAUSSIAN NAÏVE BAYES MODEL

8 23/05/23 BERNOULLI NAÏVE BAYES MODEL

9 23/05/23 MULTINOMIAL NAÏVE BAYES MODEL

10 23/05/23 K-NEAREST NEIGHBOR MODEL

11 25/05/23 K-MEANS CLUSTERING MODEL

12 30/05/23 HIERARCHICAL CLUSTERING MODEL

13 01/06/23 PRINCIPAL COMPONENT ANALYSIS

14 06/06/23 DECISION TREE CLASSIFIER

15 08/06/23 RANDOM FOREST

16 13/06/23 SUPPORT VECTOR MACHINE

EX NO 01 SIMPLE LINEAR REGRESSION
DATE: 21.03.2023

PROBLEM STATEMENT:
Predicting the cost of homes in any rural area has become a significant difficulty for
construction companies. In order to anticipate the cost of dwellings in Coimbatore for a
specific square foot, the least squares method must be used.

PROBLEM ANALYSIS:
In this machine learning problem, we aim to build a model to predict the chance of admission
to a graduate school based on various features. The dataset provided contains information
about different applicants, including their GRE scores, TOEFL scores, university ratings,
statement of purpose (SOP) scores, letter of recommendation (LOR) scores, undergraduate
CGPA, research experience, and their corresponding chances of admission. The objective is
to create a model that can predict the likelihood of an applicant's admission based on their
profile. We want to determine the relationship between the various features and the chance of
admission and use this information to make accurate predictions for new, unseen applicants.
As a first step, the code in this exercise uses simple linear regression with a single
predictor: it fits CGPA against GRE Score by the least squares method.
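
The least squares estimates of the slope and intercept can be checked on a small synthetic sample before running them on the admission data. The following is a minimal sketch; the function name fit_line and the sample values are illustrative assumptions, not part of the record.

```python
import numpy as np

def fit_line(x, y):
    """Return (slope, intercept) of the least squares line y = c + m * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    m = np.sum(dx * (y - y.mean())) / np.sum(dx ** 2)
    # Intercept: the fitted line passes through the point of means
    c = y.mean() - m * x.mean()
    return m, c

# Hypothetical sample that lies exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2 * x + 1
m, c = fit_line(x, y)
print(m, c)
```

Because the sample lies exactly on a line, the recovered slope and intercept should agree with the generating values up to floating-point error.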

SAMPLE DATASET:
CODE 1 - FROM SCRATCH:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('adm_data.csv')
X = data['GRE Score'].values
Y = data['CGPA'].values
data.head()
mean_x = np.mean(X)
mean_y = np.mean(Y)
n = len(X)
numer = 0
denom = 0
for i in range(n):
    numer += (X[i] - mean_x) * (Y[i] - mean_y)
    denom += (X[i] - mean_x) ** 2
m = numer / denom
c = mean_y - (m * mean_x)
# Printing coefficients
print("Coefficients")
print(m, c)
max_x = np.max(X) + 30
min_x = np.min(X) - 30
x = np.linspace(min_x, max_x, 1000)
y=c+m*x
plt.plot(x, y, color='#58b970', label='Regression Line')
plt.scatter(X, Y, c='#ef5423', label='Scatter Plot')
plt.xlabel('GRE Score')
plt.ylabel('CGPA')
plt.legend()
plt.show()
# Root mean squared error of the fitted line
rmse = 0
for i in range(n):
    y_pred = c + m * X[i]
    rmse += (Y[i] - y_pred) ** 2
rmse = np.sqrt(rmse/n)
print("RMSE")
print(rmse)
# R2 score: 1 - (residual sum of squares / total sum of squares)
ss_tot = 0
ss_res = 0
for i in range(n):
    y_pred = c + m * X[i]
    ss_tot += (Y[i] - mean_y) ** 2
    ss_res += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_res/ss_tot)
print("R2 Score")
print(r2)

OUTPUT:
CODE 2 - USING LIBRARIES:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

dataset = pd.read_csv('adm_data.csv')
X = dataset['GRE Score'].values
Y = dataset['CGPA'].values
# scikit-learn expects 2-D arrays of shape (n_samples, n_features)
x = np.reshape(X, (-1, 1))
y = np.reshape(Y, (-1, 1))
dataset.head()

# Hold out 30% of the rows for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.30, random_state = 0)
regressor = LinearRegression()
regressor.fit(x_train, y_train)

plt.scatter(x_train, y_train, color = 'red')
plt.plot(x_train, regressor.predict(x_train), color = 'blue')
plt.title('GRE Score vs CGPA (Training set)')
plt.xlabel('GRE Score')
plt.ylabel('CGPA')
plt.show()

# RMSE and R2 score on the training set
y_pred = regressor.predict(x_train)
print(np.sqrt(mean_squared_error(y_train, y_pred)))
print(r2_score(y_train, y_pred))
# RMSE and R2 score on the test set
y_pred = regressor.predict(x_test)
print(np.sqrt(mean_squared_error(y_test, y_pred)))
print(r2_score(y_test, y_pred))

OUTPUT:

PLOT:
EX NO 02 FIND S ALGORITHM
DATE: 23.03.2023

PROBLEM STATEMENT:
To implement Find S algorithm to find the specific hypothesis that fits all the positive
examples.

SAMPLE DATASET:

SOURCE CODE:
import pandas as pd
import numpy as np
data = pd.read_csv("ws.csv")
print(data,"\n")
d = np.array(data)[:,:-1]
print("\n The attributes are: ",d)
target = np.array(data)[:,-1]
print("\n The target is: ",target)
def train(c,t):
    # Start from the first positive example
    for i, val in enumerate(t):
        if val == "Yes":
            specific_hypothesis = c[i].copy()
            break
    # Generalize every attribute that differs in a later positive example
    for i, val in enumerate(c):
        if t[i] == "Yes":
            for x in range(len(specific_hypothesis)):
                if val[x] != specific_hypothesis[x]:
                    specific_hypothesis[x] = '?'
                else:
                    pass
    return specific_hypothesis
print("\n The final hypothesis is:",train(d,target))

OUTPUT:

INFERENCE:
The Find-S algorithm finds the maximally specific hypothesis. It considers only the positive
examples and returns a single maximally specific hypothesis for the given set of training examples.
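
The generalization step of Find-S can be traced on a small in-memory dataset without a CSV file. This is a minimal sketch; the function name find_s and the four toy examples are assumptions made for illustration and are not the contents of ws.csv.

```python
def find_s(examples, labels, positive="Yes"):
    """Return the maximally specific hypothesis covering all positive examples."""
    hypothesis = None
    for attrs, label in zip(examples, labels):
        if label != positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            # The first positive example is the initial hypothesis
            hypothesis = list(attrs)
        else:
            # Keep attributes that agree, generalize the rest to '?'
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attrs)]
    return hypothesis

# Hypothetical toy data: (sky, temperature, humidity, wind)
examples = [
    ["Sunny", "Warm", "Normal", "Strong"],
    ["Sunny", "Warm", "High", "Strong"],
    ["Rainy", "Cold", "High", "Strong"],
    ["Sunny", "Warm", "High", "Strong"],
]
labels = ["Yes", "Yes", "No", "Yes"]
result = find_s(examples, labels)
print(result)
```

Only the humidity attribute differs among the positive examples, so it is the only one generalized to '?'.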

EX NO 03 CANDIDATE ELIMINATION
DATE: 28.03.2023
PROBLEM STATEMENT:
The aim of the Candidate Elimination algorithm is to learn a hypothesis that approximates the
target concept based on a set of training examples and a hypothesis space. It seeks to find the
most specific and general hypotheses that can accurately classify the training examples and
generalize to unseen instances, thereby effectively narrowing down the hypothesis space. The
algorithm aims to provide a concise and accurate representation of the target concept using a
minimal set of hypotheses.

SAMPLE DATASET:

SOURCE CODE:
import numpy as np
import pandas as pd
data = pd.read_csv('candidate elimination.csv')
concepts = np.array(data.iloc[:,0:-1])
print("\nInstances are:\n",concepts)
target = np.array(data.iloc[:,-1])
print("\nTarget Values are: ",target)
def learn(concepts, target):
    # Start the specific boundary at the first instance
    specific_h = concepts[0].copy()
    print("\nInitialization of specific_h and general_h")
    print("\nSpecific Boundary: ", specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print("\nGeneric Boundary: ",general_h)
    for i, h in enumerate(concepts):
        print("\nInstance", i+1 , "is ", h)
        if target[i] == "yes":
            print("Instance is Positive ")
            # Generalize the specific boundary where the instance disagrees
            for x in range(len(specific_h)):
                if h[x]!= specific_h[x]:
                    specific_h[x] ='?'
                    general_h[x][x] ='?'
        if target[i] == "no":
            print("Instance is Negative ")
            # Specialize the general boundary to exclude the negative instance
            for x in range(len(specific_h)):
                if h[x]!= specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("Specific Boundary after ", i+1, "Instance is ", specific_h)
        print("Generic Boundary after ", i+1, "Instance is ", general_h)
        print("\n")
    # Drop rows of the general boundary that are still fully unconstrained
    unconstrained = ['?'] * len(specific_h)
    general_h = [g for g in general_h if g != unconstrained]
    return specific_h, general_h
s_final, g_final = learn(concepts, target)
print("Final Specific_h: ", s_final, sep="\n")
print("Final General_h: ", g_final, sep="\n")
OUTPUT:
INFERENCE:
The Candidate Elimination algorithm finds all hypotheses that are consistent with the given
training examples. Unlike the Find-S algorithm, it processes both negative and positive
examples, eliminating any inconsistent hypothesis.
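
What it means for a hypothesis to be consistent with the training examples can be made concrete with a small check. This is a minimal sketch; the helper names matches and is_consistent and the toy examples are assumptions for illustration, not part of the record's dataset.

```python
def matches(hypothesis, instance):
    """True if every attribute is '?' or equals the instance's value."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, instance))

def is_consistent(hypothesis, examples, labels, positive="yes"):
    """A hypothesis is consistent if it covers exactly the positive examples."""
    return all(matches(hypothesis, x) == (label == positive)
               for x, label in zip(examples, labels))

# Hypothetical toy data: (sky, temperature, humidity, wind)
examples = [
    ["Sunny", "Warm", "Normal", "Strong"],
    ["Sunny", "Warm", "High", "Strong"],
    ["Rainy", "Cold", "High", "Strong"],
]
labels = ["yes", "yes", "no"]

specific = ["Sunny", "Warm", "?", "Strong"]  # covers both positives only
general = ["Sunny", "?", "?", "?"]           # still excludes the negative
too_general = ["?", "?", "?", "Strong"]      # also covers the negative

for h in (specific, general, too_general):
    print(h, is_consistent(h, examples, labels))
```

Hypotheses lying between the specific and general boundaries pass this check, while one that also covers a negative example is eliminated.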

EX NO 04 MULTIPLE LINEAR REGRESSION

DATE: 13.04.2023

PROBLEM STATEMENT:
Predicting house prices in outlying areas has become a major problem for construction
companies. Our problem here is to predict the prices of houses in Coimbatore from the given
square feet, number of bedrooms, and age using multiple linear regression.

PROBLEM ANALYSIS:

Here we will develop and evaluate the performance and the predictive power of a
model trained and tested on data collected from houses in Coimbatore’s suburbs. Once we get
a good fit, we will use this model to predict the monetary value of a house located anywhere
in Coimbatore. A model like this would be very valuable for real estate agents and
construction companies, who could make use of the information on a daily basis. Our data set
consists of four columns (Area, Bedrooms, Age of the home, Price) and thirty rows. We analyse
the problem by applying the least squares method to the given dataset. We separate the dataset
into training and test data, train the MLR model on the training dataset, predict the test
results, and visualize them. The R square value obtained from the MLR model tells us how well
the model fits the data.

SAMPLE DATASET:
CODE 1 - FROM SCRATCH:

import numpy as np
import pandas as pd

data = pd.read_csv(r'C:\Users\Selva Vignesh\Downloads\Houses.csv')

# Predictors (area, bedrooms, homeage) and target (price)
X = data[['area', 'bedrooms', 'homeage']].values
Y = data['price'].values
n = len(Y)
mean_y = np.mean(Y)

# Prepend a column of ones so that b[0] is the intercept
X_b = np.column_stack((np.ones(n), X))

# Least squares estimate from the normal equations (X^T X) b = X^T Y
b = np.linalg.solve(X_b.T @ X_b, X_b.T @ Y)

# Printing coefficients (b[1:] are the slopes for area, bedrooms, homeage)
print("Coefficients")
print(*b[1:])

# Root mean squared error of the fitted model
rmse = 0
for i in range(n):
    y_pred = X_b[i] @ b
    rmse += (Y[i] - y_pred) ** 2
rmse = np.sqrt(rmse/n)
print("RMSE")
print(rmse)

# R2 score: 1 - (residual sum of squares / total sum of squares)
ss_tot = 0
ss_res = 0
for i in range(n):
    y_pred = X_b[i] @ b
    ss_tot += (Y[i] - mean_y) ** 2
    ss_res += (Y[i] - y_pred) ** 2
r2 = 1 - (ss_res/ss_tot)
print("R2 Score")
print(r2)
OUTPUT:
Coefficients
97.26987365 -68.67032333 6.78195499
RMSE
245.0986827382367
R2 Score
0.984526627789891

INFERENCE:

From the MLR scratch model we get an RMSE value of 245.09868 and an R square
value of 0.98, from which we can infer that the model explains about 98% of the variance in
price and has performed very well. The equation obtained here is price = 97.2698 * area
- 68.6703 * bedrooms + 6.7819 * homeage, plus an intercept term. This model can be improved
further by training it on a larger dataset.

CODE 2 - USING LIBRARIES:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score

data = pd.read_csv(r'C:\Users\Selva Vignesh M\Desktop\Houses.csv')
dt = pd.DataFrame(data, columns = ['area','bedrooms','homeage','price'])

# Fit on the full dataset to report the intercept and coefficients
x = dt[['area','bedrooms','homeage']]
y = dt['price']
reg = LinearRegression()
reg.fit(x, y)
print("Intercept: ", reg.intercept_)
print("Coefficients: ", reg.coef_)

# Extracting independent variables (area, bedrooms, homeage)
x = dt.iloc[:,:-1].values
print(x)
# Extracting dependent variable (price)
y = dt.iloc[:,3:].values
print(y)

# Train on 80% of the rows and evaluate on the remaining 20%
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 0)
lr = LinearRegression()
lr.fit(x_train, y_train)
pred_train_lr = lr.predict(x_train)
pred_test_lr = lr.predict(x_test)
print("RMSE and r-square for train set:")
print(np.sqrt(mean_squared_error(y_train,pred_train_lr)))
print(r2_score(y_train,pred_train_lr))
print("RMSE and r-square for test set:")
print(np.sqrt(mean_squared_error(y_test,pred_test_lr)))
print(r2_score(y_test,pred_test_lr))
OUTPUT:
Intercept: 3046.2225659549213
Coefficients: [ 97.26987365 -68.67032333 6.78195499]
RMSE and r-square for train set:
243.14341291104748
0.9842721957215742
RMSE and r-square for test set:
214.29771477177258
0.9922371865519531

INFERENCE:

From the model we can infer that the training dataset gave an RMSE value of
243.14341 with an R square value of 0.98, and the test dataset gave an RMSE of 214.29771
with an R square value of 0.99. The model explains about 99% of the variance in price on the
test set, which indicates that it fits both the training and the test data well.

Now let’s try to give an input to the model and see how it predicts the price. Here the
input is the area in square feet.
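
A prediction for a new house can be sketched directly from the intercept and coefficients printed in the Code 2 output. The house values below (1000 square feet, 2 bedrooms, 5 years old) and the helper name predict_price are hypothetical; all three features are supplied because the fitted model uses all three.

```python
import numpy as np

# Intercept and coefficients as printed in the Code 2 output
intercept = 3046.2225659549213
coef = np.array([97.26987365, -68.67032333, 6.78195499])  # area, bedrooms, homeage

def predict_price(area, bedrooms, homeage):
    """Evaluate the fitted linear equation for one house."""
    features = np.array([area, bedrooms, homeage], dtype=float)
    return intercept + coef @ features

# Hypothetical input: 1000 sq ft, 2 bedrooms, 5 years old
price = predict_price(1000, 2, 5)
print("Predicted price:", price)
```

With a fitted scikit-learn model in memory, the same value would normally be obtained by calling its predict method on a one-row array with the features in the same order.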
EX NO 05 POLYNOMIAL REGRESSION
DATE: 20.04.2023

PROBLEM STATEMENT:
The objective of this problem statement is to provide comprehensive support to 50 startups by
addressing their key challenges and enabling their growth and success. The problem is to
develop a program that supports the growth of 50 startups by addressing their critical needs
and challenges. The program should encompass various areas and provide tailored solutions
to meet the unique requirements of each startup.

PROBLEM ANALYSIS:
Here we will develop and evaluate the performance and the predictive power of a model
trained and tested on data collected from a company. Once we get a good fit, we will use this
model to predict the salary of an employee based on their position. The data set consists of
five columns (R&D Spend, Administration, Marketing Spend, State, Profit) and fifty rows. We
separate the dataset into training and test data, train the model on the training dataset,
predict the test results, and visualize them. By analysing the R square value, we can
assess how well the model fits the data.
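
Polynomial regression is ordinary least squares applied to powers of the input. The following is a minimal sketch of that idea on synthetic data; the function name fit_polynomial and the generating polynomial are assumptions chosen so the result is easy to check.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least squares coefficients [b0, b1, ..., bd] for y = b0 + b1*x + ... + bd*x^d."""
    # Design matrix with columns x^0, x^1, ..., x^degree
    X = np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coeffs

# Hypothetical data generated from y = 3 + 2x + 0.5x^2
x = np.arange(1, 11)
y = 3 + 2 * x + 0.5 * x ** 2
coeffs = fit_polynomial(x, y, degree=2)
print(coeffs)
```

np.linalg.lstsq is used here instead of an explicit matrix inverse because it is more stable when the powers of x are strongly correlated.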

SAMPLE DATASET:
CODE 1 - FROM SCRATCH:
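import numpy as np
import pandas as pd

# Assumed to be the same Position_Salaries file that Code 2 reads
data = pd.read_csv("/kaggle/input/position-salaries/Position_Salaries.csv")
y = data['Salary']
n = int(input("Enter the degree:"))
xn = data['Level']
# Add one column per power of Level, from x^0 (intercept) to x^n
for i in range(n + 1):
    data[f'x{i}'] = xn ** i
xval = np.array(data.iloc[:, -(n + 1):], dtype=float)
yval = np.array(y, dtype=float)
# Least squares coefficients from the normal equations
coeffs = np.linalg.inv(xval.T @ xval) @ xval.T @ yval
print('Dimensions of coeff matrix:', coeffs.shape)
print('Coefficients:', coeffs)
new = int(input("Enter the value of x to be predicted:"))
# Build the same powers for the new input
X_new = np.array([new ** i for i in range(n + 1)], dtype=float)
y_pred = X_new.T @ coeffs
print('Prediction:', y_pred)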

OUTPUT:
y = -38494.26 + 66878.12x + 287369.29x^2 + 460744.27x^3
RMSE: 100912.45186113848
R2 score: 0.8737535471595872
CODE 2 - USING LIBRARIES:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

data = pd.read_csv("/kaggle/input/position-salaries/Position_Salaries.csv")
x = data.iloc[:, 1:2]
y = data['Salary']
# Expand Level into polynomial features up to degree 4
poly_reg = PolynomialFeatures(degree = 4)
x_poly = poly_reg.fit_transform(x)
# Fit an ordinary linear regression on the expanded features
linear = LinearRegression()
linear.fit(x_poly, y)
y_pred = linear.predict(x_poly)
rmse = np.sqrt(mean_squared_error(y, y_pred))
print('RMSE:', rmse)
r2_scr = r2_score(y, y_pred)
print('R2 SCORE:', r2_scr)

OUTPUT:
Salary = -195333.33333333337 + 80878.78787878789 * Level
RMSE : 163388.73519272613
R2 score: 0.6690412331929895
PLOT:
plt.scatter(x,y,color='Black')
plt.plot(x,linear.predict(x_poly),color='Red')
plt.xlabel('Levels')
plt.ylabel('Salary')

INFERENCE:
Using the polynomial regression model, we have predicted the salaries of the employees of a
company. The model has computed an R square score of 0.99739, so it explains about 99.7% of
the variance in salary.
EX NO 06 LOGISTIC REGRESSION
DATE: 18.05.2023

PROBLEM STATEMENT:
The task is to implement logistic regression in Python both from scratch and with
built-in library functions, using NumPy and pandas.

PROBLEM ANALYSIS:
The dataset has two variables: the SAT score achieved by a student (input feature) and a
binary variable indicating whether the student was admitted (output label). Our goal is to
build a logistic regression model that can accurately predict the admission decision based on
the SAT scores. Logistic regression is a binary classification algorithm commonly used for
predicting binary outcomes, such as "admitted" or "not admitted" in our case. It models the
relationship between the input features and the probability of the output label using a
logistic function. To train the logistic regression model, we will split the dataset into two
parts: a training set and a test set. The training set will be used to train the model, and
the test set will be used to evaluate its performance. We will use evaluation metrics such as
accuracy, precision, recall, and F1-score to assess the performance of our logistic regression
model. These metrics will help us understand how well the model predicts the admission
decisions based on the SAT scores. Once the logistic regression model is trained, we can use
it to make predictions on new, unseen data. Given a student's SAT score, the model will output
the probability of being admitted. We can then apply a threshold (e.g., 0.5) to classify the
student as admitted or not admitted.
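
The thresholding step and the four evaluation metrics named above can be sketched without any library classifier. This is a minimal illustration; the function name classification_metrics, the raw scores, and the true labels are hypothetical values chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic function mapping any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels coded as 0/1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical linear scores (b0 + b1 * x) for five students
probs = sigmoid(np.array([-2.0, -0.5, 0.3, 1.5, 2.0]))
y_pred = (probs >= 0.5).astype(int)   # apply the 0.5 threshold
y_true = np.array([0, 1, 1, 1, 0])
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

scikit-learn provides accuracy_score, precision_score, recall_score and f1_score for the same purpose; the hand-written version only makes the definitions explicit.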

SAMPLE DATASET:
CODE 1 - FROM SCRATCH:
import numpy as np
import pandas as pd
# The dataset file is not named in this record. `data` is assumed to be a DataFrame
# with the columns 'SAT' and 'Admitted' used in Code 2, with 'Admitted' coded as 0/1.
X_raw = data['SAT'].values
y = data['Admitted'].values
# Normalize the independent variable (optional but recommended)
x_mean, x_std = np.mean(X_raw), np.std(X_raw)
X = (X_raw - x_mean) / x_std
# Add a column of ones to X for the bias term
X = np.column_stack((np.ones(len(X)), X))
# Define the sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
# Define the cost function
def cost_function(X, y, theta):
    m = len(y)
    h = sigmoid(np.dot(X, theta))
    cost = (-1/m) * np.sum(y*np.log(h) + (1-y)*np.log(1-h))
    return cost
# Define the gradient descent function
def gradient_descent(X, y, theta, alpha, num_iterations):
    m = len(y)
    cost_history = []
    for i in range(num_iterations):
        h = sigmoid(np.dot(X, theta))
        gradient = (1/m) * np.dot(X.T, (h - y))
        theta -= alpha * gradient
        cost = cost_function(X, y, theta)
        cost_history.append(cost)
    return theta, cost_history
# Set the learning rate and number of iterations
learning_rate = 0.01
num_iterations = 1000
# Initialize the parameters (theta)
theta = np.zeros(X.shape[1])
# Run gradient descent to train the model
theta_optimized, cost_history = gradient_descent(X, y, theta, learning_rate, num_iterations)
# Print the optimized parameters (theta)
print("Optimized parameters (theta):", theta_optimized)
# theta_optimized = np.array([0.28904728, 1.85989727])  # parameters from an earlier run
# Example test data: bias term plus the SAT score scaled like the training data
X_test = np.array([1, (1600 - x_mean) / x_std])
# Predict the admission using the optimized parameters
h_test = sigmoid(np.dot(X_test, theta_optimized))
prediction = 1 if h_test >= 0.5 else 0
# Calculate the accuracy on the training set
y_pred_train = sigmoid(np.dot(X, theta_optimized))
y_pred_train = np.round(y_pred_train) # Round the predictions to 0 or 1
accuracy = np.mean(y_pred_train == y) * 100

# Print the predicted admission and accuracy
print("Predicted admission:", prediction)
print("Accuracy on the training set:", accuracy)

OUTPUT:
CODE 2 - USING LIBRARIES:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Create the dependent and independent variables
# (`data` is assumed to be the DataFrame with the 'SAT' and 'Admitted' columns)
y = data['Admitted']
x1 = data['SAT']
X = x1.values.reshape(-1, 1)
# Create an instance of LogisticRegression model
logreg = LogisticRegression()
# Fit the model to the training data
logreg.fit(X, y)
X_test = np.array([[1600]]) # Your test data
# Predict the admission for the test data
y_pred = logreg.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y, logreg.predict(X))
# Print the predicted admission and accuracy
print("Predicted admission:", y_pred[0])
print("Accuracy:", accuracy)
plt.scatter(x1,y, color='C0')
# Don't forget to label your axes!
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('Admitted', fontsize = 20)
plt.show()
# Fit the same model with statsmodels to plot the logistic curve
x = sm.add_constant(x1)
reg_log = sm.Logit(y,x)
results_log = reg_log.fit()
def f(x,b0,b1):
    return np.array(np.exp(b0+x*b1) / (1 + np.exp(b0+x*b1)))
f_sorted = np.sort(f(x1,results_log.params.iloc[0],results_log.params.iloc[1]))
x_sorted = np.sort(np.array(x1))
plt.scatter(x1,y,color='C0')
plt.xlabel('SAT', fontsize = 20)
plt.ylabel('Admitted', fontsize = 20)
plt.plot(x_sorted,f_sorted,color='C8')
plt.show()

OUTPUT:

PLOT:
INFERENCE:
We have implemented logistic regression both from scratch and with built-in functions.
The accuracies of the two implementations differ only slightly, which indicates that the
from-scratch model agrees closely with the library model.

EX NO 07 GAUSSIAN NAÏVE BAYES MODEL

DATE: 23.05.2023

PROBLEM STATEMENT:
The task is to implement the Gaussian Naive Bayes classifier in Python both from scratch
and with built-in library functions, using NumPy and pandas.

PROBLEM ANALYSIS:
The code begins by importing the necessary libraries: pandas, numpy, and scikit-learn's
train_test_split, GaussianNB, and accuracy_score. It uses the pandas library to read
the CSV file 'fraud_oracle.csv' and store it in a DataFrame called 'df'. The code drops the
'PolicyNumber' and 'RepNumber' columns from the DataFrame using the drop() method. The
code selects four features ('DriverRating', 'Deductible', 'Age', 'WeekOfMonth') as input
features (X) for the model and assigns the 'FraudFound_P' column as the target variable
(y). The code splits the dataset into training and testing sets using the train_test_split()
function from scikit-learn. It assigns 80% of the data for training (X_train, y_train) and 20%
for testing (X_test, y_test). The random_state parameter is set to 42 for reproducibility. The
code prints the accuracy score obtained from the accuracy_score() function.
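
A Gaussian Naive Bayes score multiplies one density per feature, and with many features the product can underflow to zero. A common alternative is to sum log-densities instead. The following is a minimal sketch of that variant; the function names and the two-class toy parameters are assumptions for illustration, not values from the fraud dataset.

```python
import numpy as np

def gaussian_log_pdf(x, mean, std):
    """Log of the normal density N(x; mean, std^2), evaluated elementwise."""
    return -0.5 * np.log(2 * np.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def log_posterior(row, prior, means, stds):
    """Unnormalized log posterior: log prior plus the sum of per-feature log-likelihoods."""
    return np.log(prior) + np.sum(gaussian_log_pdf(row, means, stds))

# Hypothetical per-class parameters for two continuous features
means_0, stds_0 = np.array([0.0, 0.0]), np.array([1.0, 1.0])
means_1, stds_1 = np.array([3.0, 3.0]), np.array([1.0, 1.0])
row = np.array([0.2, -0.1])  # a sample lying close to the class 0 means

lp0 = log_posterior(row, 0.5, means_0, stds_0)
lp1 = log_posterior(row, 0.5, means_1, stds_1)
predicted = 0 if lp0 > lp1 else 1
print(lp0, lp1, predicted)
```

Comparing log posteriors gives the same predicted class as comparing the products, because the logarithm is monotonic.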
SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
# Load the dataset
df = pd.read_csv('/kaggle/input/vehicle-claim-fraud-detection/fraud_oracle.csv')

# Drop irrelevant columns

df.drop(['PolicyNumber', 'RepNumber'], axis=1, inplace=True)

# Convert categorical variables to numerical

df['Sex'] = df['Sex'].map({'Male': 0, 'Female': 1})
df['MaritalStatus'] = df['MaritalStatus'].map({'Single': 0, 'Married': 1})
df = pd.get_dummies(df, columns=['Make', 'AccidentArea', 'DayOfWeekClaimed',
'MonthClaimed', 'VehicleCategory', 'PolicyType', 'AgentType', 'AddressChange_Claim'])
train, test = train_test_split(df, test_size=0.2, random_state=42)
# Separate the fraudulent and non-fraudulent claims in the training set
fraud_train = train[train['FraudFound_P'] == 1]
nonfraud_train = train[train['FraudFound_P'] == 0]

# Compute the prior probabilities

prior_fraud = len(fraud_train) / len(train)
prior_nonfraud = len(nonfraud_train) / len(train)

# Compute the likelihoods for each feature and class

likelihood_fraud = {}
likelihood_nonfraud = {}

# Every column except the target is used as a feature
feature_cols = [col for col in train.columns if col != 'FraudFound_P']

for col in feature_cols:
    if train[col].dtype == 'float64':
        # Continuous feature: store (mean, standard deviation) per class
        likelihood_fraud[col] = (np.mean(fraud_train[col]), np.std(fraud_train[col]))
        likelihood_nonfraud[col] = (np.mean(nonfraud_train[col]), np.std(nonfraud_train[col]))
    else:
        # Discrete feature: store the relative frequency of each value per class
        likelihood_fraud[col] = dict(fraud_train[col].value_counts(normalize=True))
        likelihood_nonfraud[col] = dict(nonfraud_train[col].value_counts(normalize=True))
from sklearn.metrics import accuracy_score, confusion_matrix

# Make predictions on the testing set
predictions = []

for index, row in test.iterrows():
    # Compute the posterior probabilities
    posterior_fraud = prior_fraud
    posterior_nonfraud = prior_nonfraud

    for col in feature_cols:
        if test[col].dtype == 'float64':
            # Gaussian density with the class mean and standard deviation
            likelihood_f = np.exp(-(row[col]-likelihood_fraud[col][0])**2 /
                (2*likelihood_fraud[col][1]**2)) / (np.sqrt(2*np.pi)*likelihood_fraud[col][1])
            likelihood_nf = np.exp(-(row[col]-likelihood_nonfraud[col][0])**2 /
                (2*likelihood_nonfraud[col][1]**2)) / (np.sqrt(2*np.pi)*likelihood_nonfraud[col][1])
        else:
            # Relative frequency of the observed value, or 0 if unseen in training
            if row[col] in likelihood_fraud[col]:
                likelihood_f = likelihood_fraud[col][row[col]]
            else:
                likelihood_f = 0

            if row[col] in likelihood_nonfraud[col]:
                likelihood_nf = likelihood_nonfraud[col][row[col]]
            else:
                likelihood_nf = 0

        posterior_fraud *= likelihood_f
        posterior_nonfraud *= likelihood_nf

    # Make the prediction
    if posterior_fraud > posterior_nonfraud:
        predictions.append(1)
    else:
        predictions.append(0)

# Evaluate the model
accuracy = accuracy_score(test['FraudFound_P'], predictions)
confusion = confusion_matrix(test['FraudFound_P'], predictions)

print("Accuracy:", accuracy)
print("Confusion Matrix:\n", confusion)
OUTPUT:

CODE 2 - USING LIBRARIES:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
df = pd.read_csv('/kaggle/input/vehicle-claim-fraud-detection/fraud_oracle.csv')
df.drop(['PolicyNumber', 'RepNumber'], axis=1, inplace=True)
X = df[['DriverRating','Deductible','Age','WeekOfMonth']]
y = df['FraudFound_P']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
print("accuracy:",accuracy_score(y_test,y_pred))

OUTPUT:

INFERENCE:
We have implemented the Gaussian Naive Bayes classifier both from scratch and with built-in
functions. The accuracies of the two implementations are found to be the same, which
indicates that the from-scratch implementation agrees with the library model.

EX NO 08 BERNOULLI NAÏVE BAYES MODEL

DATE: 23.05.2023

PROBLEM STATEMENT:
The task is to implement the Bernoulli Naive Bayes classifier in Python both from scratch
and with built-in library functions, using NumPy and pandas.

PROBLEM ANALYSIS:
The code starts by importing the necessary libraries: NumPy, pandas, and scikit-learn
modules. The Titanic dataset is read from a CSV file using the pd.read_csv() function and
stored in the dataset variable. The categorical variables in the dataset are encoded using the
LabelEncoder from scikit-learn. The apply() function is used to apply the encoding to
object-type columns, while numerical columns are left unchanged. The encoded features are
stored in the X_encoded variable. The target variable is also encoded using the LabelEncoder,
and the encoded labels are stored in the y_encoded variable. The dataset is split into training
and testing sets using the train_test_split() function from scikit-learn. The training set
consists of 75% of the data, while the testing set contains the remaining 25%. The random state
is set to 42 for reproducibility. An instance of the BernoulliNB class is created and assigned
to the bnb variable. The model is then trained on the training data using the fit() method. The
accuracy of the model is calculated by comparing the predicted labels (y_pred) with the actual
labels from the testing set (y_test) using the accuracy_score() function. The accuracy score is
printed to the console.

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
data = pd.read_csv('/kaggle/input/bernoulli-naive-bayes/titanic_prediction.csv')
X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
class BernoulliNaiveBayes:
    def __init__(self):
        self.class_probabilities = None
        self.feature_probabilities = None
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self.class_probabilities = {}
        self.feature_probabilities = {}
        classes, class_counts = np.unique(y, return_counts=True)
        total_samples = n_samples
        # Prior probability of each class
        for i in range(len(classes)):
            class_name = classes[i]
            class_probability = class_counts[i] / total_samples
            self.class_probabilities[class_name] = class_probability
        # P(feature == 1 | class) with add-one (Laplace) smoothing
        for feature in range(n_features):
            self.feature_probabilities[feature] = {}
            for class_name in classes:
                # Rows of X that belong to this class
                class_samples = X[y == class_name]
                feature_counts = np.sum(class_samples[:, feature] == 1)
                feature_probability = (feature_counts + 1) / (len(class_samples) + 2)
                self.feature_probabilities[feature][class_name] = feature_probability
    def predict(self, X):
        y_pred = []
        for sample in X:
            class_probabilities = {}
            for class_name, class_probability in self.class_probabilities.items():
                feature_probabilities = self.feature_probabilities
                # Multiply the per-feature probabilities (features assumed binary 0/1)
                for feature, feature_value in enumerate(sample):
                    if feature_value == 0:
                        feature_probability = 1 - feature_probabilities[feature][class_name]
                    else:
                        feature_probability = feature_probabilities[feature][class_name]
                    if class_name not in class_probabilities:
                        class_probabilities[class_name] = feature_probability
                    else:
                        class_probabilities[class_name] *= feature_probability
                # Weight by the class prior
                class_probabilities[class_name] *= class_probability
            predicted_class = max(class_probabilities, key=class_probabilities.get)
            y_pred.append(predicted_class)
        return y_pred
naive_bayes = BernoulliNaiveBayes()
naive_bayes.fit(X_train, y_train)
y_pred = naive_bayes.predict(X_test)
accuracy = np.mean(np.array(y_pred) == y_test)
print("Accuracy:", accuracy)

OUTPUT:

CODE 2 - USING LIBRARIES:

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import BernoulliNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
dataset = pd.read_csv("/kaggle/input/bernoulli-naive-bayes/titanic_prediction.csv")
label_encoder = LabelEncoder()
X_encoded = dataset.iloc[:, :-1].apply(
    lambda x: label_encoder.fit_transform(x) if x.dtype == "object" else x)
y_encoded = label_encoder.fit_transform(dataset.iloc[:, -1])
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y_encoded, test_size=0.25, random_state=42)
bnb = BernoulliNB()
bnb.fit(X_train, y_train)
y_pred = bnb.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

OUTPUT:

INFERENCE:
We have implemented the Bernoulli Naive Bayes classifier both from scratch and with built-in
functions, and computed the accuracy of each implementation using the NumPy and pandas
libraries in Python.
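
The from-scratch classifier estimates each feature probability as (count + 1) / (n + 2), which is add-one (Laplace) smoothing. A minimal sketch of that estimate in isolation follows; the function name bernoulli_feature_prob and the toy columns are assumptions for illustration.

```python
import numpy as np

def bernoulli_feature_prob(feature_column, alpha=1.0):
    """Smoothed estimate of P(feature == 1) from a binary column."""
    col = np.asarray(feature_column)
    # alpha pseudo-counts are added for each of the two outcomes (0 and 1)
    return (np.sum(col == 1) + alpha) / (len(col) + 2 * alpha)

# Hypothetical columns for one class
all_ones = [1, 1, 1, 1]
all_zeros = [0, 0, 0]
p_ones = bernoulli_feature_prob(all_ones)
p_zeros = bernoulli_feature_prob(all_zeros)
print(p_ones, p_zeros)
```

Smoothing keeps every estimate strictly between 0 and 1, so a feature value never seen with a class cannot force the whole product of probabilities to zero.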

EX NO 09 MULTINOMIAL NAÏVE BAYES MODEL

DATE: 23.05.2023

PROBLEM STATEMENT:
The task is to implement the Multinomial Naive Bayes classifier in Python both from scratch
and with built-in library functions, using NumPy and pandas.

PROBLEM ANALYSIS:
The necessary libraries are imported, including pandas, which is used for data manipulation,
and various modules from scikit-learn for the machine learning tasks. The code reads a CSV
file containing heart attack data and assigns it to the variable data. It then separates the
features (X) from the target variable (y). The dataset is split into training and testing sets
using the train_test_split function from scikit-learn. The testing set size is set to 20% of
the data, and a random state of 42 is used for reproducibility. A Multinomial Naive Bayes
classifier is instantiated using MultinomialNB() and trained on the training set using the fit
method. The trained classifier is used to make predictions on the test set (X_test) using the
predict method. The accuracy of the classifier is calculated by comparing the predicted values
(y_pred) with the actual target values (y_test) using the accuracy_score function. The
accuracy is then printed.
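
A multinomial Naive Bayes model treats each feature as a non-negative count and estimates, per class, the share of the total count that falls on each feature. The following is a minimal sketch with Laplace smoothing; the function names and the toy count matrix are assumptions for illustration and are not taken from the heart attack dataset.

```python
import numpy as np

def fit_multinomial_nb(X, y, alpha=1.0):
    """Return (classes, log priors, log feature probabilities) for count features."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    # Per-class feature totals with add-alpha smoothing
    counts = np.array([X[y == c].sum(axis=0) for c in classes]) + alpha
    log_likelihood = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_likelihood

def predict_multinomial_nb(model, X):
    """Pick the class with the highest log posterior for each row of X."""
    classes, log_prior, log_likelihood = model
    scores = np.asarray(X, dtype=float) @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]

# Hypothetical count data: class 0 favours feature 0, class 1 favours feature 1
X_toy = [[3, 0, 1], [2, 0, 0], [0, 3, 1], [0, 2, 2]]
y_toy = [0, 0, 1, 1]
model = fit_multinomial_nb(X_toy, y_toy)
preds = predict_multinomial_nb(model, [[4, 0, 0], [0, 4, 1]])
print(preds)
```

This formulation assumes non-negative count-like features; continuous measurements are usually a better match for a Gaussian likelihood.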

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import csv
import math

# Load the dataset
dataset = []
with open('/kaggle/input/heart-attack-analysis-prediction-dataset/heart.csv', 'r') as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        dataset.append(row)

# Remove the header
header = dataset[0]
dataset = dataset[1:]

# Convert string values to float
for i in range(len(dataset)):
    dataset[i] = [float(x) for x in dataset[i]]

# Split the dataset into features and target
X = [row[:-1] for row in dataset]
y = [row[-1] for row in dataset]

# Function to split the dataset based on the target variable
def split_dataset(X, y, target_value):
    X_subset = []
    y_subset = []
    for i in range(len(X)):
        if y[i] == target_value:
            X_subset.append(X[i])
            y_subset.append(y[i])
    return X_subset, y_subset

# Function to calculate the normal (Gaussian) probability density of a value
# given a mean and standard deviation
def calculate_probability(value, mean, stdev):
    exponent = math.exp(-((value - mean) ** 2 / (2 * stdev ** 2)))
    return (1 / (math.sqrt(2 * math.pi) * stdev)) * exponent

# Function to train the Naive Bayes model.
# Note: each feature is summarised by a per-class mean and standard deviation,
# so the likelihood used in this listing is Gaussian rather than a multinomial count model.
def train(X_train, y_train):
    # Separate the dataset by class
    separated = {}
    for i in range(len(X_train)):
        features = X_train[i]
        target = y_train[i]
        if target not in separated:
            separated[target] = []
        separated[target].append(features)

    # Calculate the mean and standard deviation for each feature by class
    summaries = {}
    for target, features in separated.items():
        summaries[target] = []
        for i in range(len(features[0])):
            values = [row[i] for row in features]
            mean = sum(values) / len(values)
            stdev = math.sqrt(sum([(x - mean) ** 2 for x in values]) / len(values))
            summaries[target].append((mean, stdev))

    return summaries

# Function to make predictions using the trained model
def predict(X_test, summaries):
    predictions = []
    for features in X_test:
        probabilities = {}
        for target, class_summaries in summaries.items():
            # Start from 1: no class prior is applied, so classes are treated as equally likely
            probabilities[target] = 1
            for i in range(len(class_summaries)):
                mean, stdev = class_summaries[i]
                value = features[i]
                probabilities[target] *= calculate_probability(value, mean, stdev)

        # Select the class with the highest probability
        best_class = None
        best_probability = -1
        for target, probability in probabilities.items():
            if best_class is None or probability > best_probability:
                best_class = target
                best_probability = probability

        predictions.append(best_class)

    return predictions

# Split the dataset into training and testing sets
# (sequential 80/20 split with no shuffling, so the result depends on the row order of the file)
split_ratio = 0.8
split_point = int(split_ratio * len(dataset))
X_train = X[:split_point]
y_train = y[:split_point]
X_test = X[split_point:]
y_test = y[split_point:]

# Train the model

model = train(X_train, y_train)

# Make predictions on the test set

predictions = predict(X_test, model)

# Calculate accuracy
correct_predictions = sum(1 for pred, true in zip(predictions, y_test) if pred == true)
accuracy = correct_predictions / len(y_test)
print("Accuracy:", accuracy)

OUTPUT:
CODE 2 - USING LIBRARIES:

import pandas as pd
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

data = pd.read_csv('/kaggle/input/heart-attack-analysis-prediction-dataset/heart.csv')
data.head()

# Separate the features (all columns but the last) from the target (last column)
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the Naive Bayes classifier

nb_classifier = MultinomialNB()
nb_classifier.fit(X_train, y_train)

# Make predictions on the test set

y_pred = nb_classifier.predict(X_test)

# Evaluate the accuracy of the classifier

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

OUTPUT:
INFERENCE:
We implemented the Multinomial Naive Bayes classifier both from scratch and with built-in
library functions in Python, using the numpy and pandas libraries, and computed the
accuracy of each implementation.

EX NO 10 K-NEAREST NEIGHBOR MODEL

DATE: 25.05.2023

PROBLEM STATEMENT:
Implement the KNN classifier in Python, both from scratch and with built-in library
functions, using numpy and pandas.

SAMPLE DATASET:
CODE 1 - FROM SCRATCH:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # KNN is a lazy learner: fitting only stores the training data
        self.X_train = X
        self.y_train = y

    def euclidean_distance(self, x1, x2):
        return np.sqrt(np.sum((x1 - x2) ** 2))

    def predict(self, X):
        y_pred = []
        for x_test in X:
            # Distance from the test point to every training point
            distances = []
            for x_train in self.X_train:
                dist = self.euclidean_distance(x_test, x_train)
                distances.append(dist)

            # Labels of the k closest training points
            indices = np.argsort(distances)[:self.k]
            k_nearest_labels = self.y_train[indices]

            # Majority vote among the k nearest labels
            unique_labels, counts = np.unique(k_nearest_labels, return_counts=True)
            predicted_label = unique_labels[np.argmax(counts)]
            y_pred.append(predicted_label)
        return np.array(y_pred)

data = pd.read_csv("/kaggle/input/iris-flower-dataset/IRIS.csv")
data.head()

x = data.iloc[:, :-1].values
y = data.iloc[:, -1].values
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)

# Feature scaling
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# K-NN classification
knn = KNNClassifier()  # k is left at the class default
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# The same accuracy computed directly with numpy
accuracy = np.sum(y_pred == y_test) / len(y_test)
print("Accuracy:", accuracy)

OUTPUT:

CODE 2 - USING LIBRARIES:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Load the Iris dataset

dataset=pd.read_csv("/kaggle/input/iris-flower-dataset/IRIS.csv")
dataset.head()

# Split the dataset into features and labels

X = dataset.iloc[:, :-1]
y = dataset.iloc[:, -1]

# Split the dataset into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create the KNN model

knn = KNeighborsClassifier(n_neighbors=3)

# Fit the model on the training data

knn.fit(X_train, y_train)

# Predict the labels for the test data

y_pred = knn.predict(X_test)

# Calculate the accuracy of the model

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

OUTPUT:
INFERENCE:
We implemented the KNN classifier both from scratch and with built-in library functions in
Python, using the numpy and pandas libraries, and computed the accuracy of each
implementation on the Iris dataset.

EX NO 11 K-MEANS CLUSTERING MODEL

DATE: 30.05.2023

PROBLEM STATEMENT:
The problem at hand is to cluster a given dataset into k distinct groups using the K-Means
Clustering algorithm. The dataset consists of various data points/features, and the goal is to
identify natural groupings or patterns within the data. The number of clusters, 'k', needs to be
determined based on the nature of the data or domain knowledge.
PROBLEM ANALYSIS:
The K-Means Clustering model can be developed in a programming language or with a machine
learning library that supports the algorithm. The implementation needs three steps:
initializing the centroids, assigning each data point to its nearest centroid, and updating
the centroids iteratively until they stop changing. The performance of the clustering model
can be evaluated using metrics such as the silhouette score or the average distance between
data points and their cluster centroids, which indicate the quality and coherence of the
obtained clusters. Plotting the data points and their respective clusters helps to understand
the structure and patterns within the dataset.
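
The two evaluation measures named above can be sketched in a few lines of pure Python. This is a minimal illustration, assuming at least two clusters and points given as coordinate tuples. The function names (mean_distance_to_centroid, silhouette_score) and the toy data are hypothetical and do not come from the lab record; the silhouette function here is a hand-written version, not scikit-learn's.

```python
import math


def mean_distance_to_centroid(points, labels, centroids):
    """Average Euclidean distance between each point and its own cluster centroid."""
    total = sum(math.dist(p, centroids[label]) for p, label in zip(points, labels))
    return total / len(points)


def silhouette_score(points, labels):
    """Mean silhouette value over all points (assumes at least two clusters)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)

    scores = []
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other points of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two well-separated pairs of points
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
toy_labels = [0, 0, 1, 1]
toy_centroids = [(0, 0.5), (10, 0.5)]
print("Mean distance to centroid:", mean_distance_to_centroid(toy_points, toy_labels, toy_centroids))
print("Silhouette score:", silhouette_score(toy_points, toy_labels))
```

A silhouette value close to 1 suggests compact, well-separated clusters, while values near 0 suggest overlapping clusters. Computing either measure for several values of k is one way to compare candidate cluster counts.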

CODE 1 - FROM SCRATCH:

import numpy as np
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

class KMeansClustering:
    def __init__(self, num_clusters, max_iterations=100):
        self.K = num_clusters
        self.max_iterations = max_iterations
        self.centroids = None
        self.clusters = None

    def initialize_random_centroids(self, X):
        # Pick K distinct data points as the initial centroids
        indices = np.random.choice(X.shape[0], self.K, replace=False)
        return X[indices].astype(float)

    def calculate_distance(self, x1, x2):
        return np.sqrt(np.sum((x1 - x2) ** 2))

    def create_clusters(self, X, centroids):
        # Assign every point index to the cluster of its nearest centroid
        clusters = [[] for _ in range(self.K)]
        for i in range(X.shape[0]):
            distances = [self.calculate_distance(X[i], centroid) for centroid in centroids]
            cluster_idx = np.argmin(distances)
            clusters[cluster_idx].append(i)
        return clusters

    def calculate_new_centroids(self, X, clusters):
        # New centroid = mean of the points assigned to the cluster
        centroids = np.zeros((self.K, X.shape[1]))
        for cluster_idx, cluster in enumerate(clusters):
            if len(cluster) > 0:
                new_centroid = np.mean(X[cluster], axis=0)
                centroids[cluster_idx] = new_centroid
        return centroids

    def fit(self, X):
        self.centroids = self.initialize_random_centroids(X)
        for _ in range(self.max_iterations):
            self.clusters = self.create_clusters(X, self.centroids)
            prev_centroids = np.copy(self.centroids)
            self.centroids = self.calculate_new_centroids(X, self.clusters)
            # Stop once the centroids no longer move
            if np.all(prev_centroids == self.centroids):
                break
        return self.clusters, self.centroids

    def predict(self, X):
        distances = np.zeros((X.shape[0], self.K))
        for i in range(X.shape[0]):
            for j in range(self.K):
                distances[i, j] = self.calculate_distance(X[i], self.centroids[j])
        return np.argmin(distances, axis=1)

    def plot_clusters(self, X, clusters):
        colors = ['r', 'g', 'b', 'c', 'm', 'y']
        for cluster_idx, cluster in enumerate(clusters):
            # One scatter call per cluster instead of one per point
            plt.scatter(X[cluster, 0], X[cluster, 1], color=colors[cluster_idx])
        for centroid in self.centroids:
            plt.scatter(centroid[0], centroid[1], marker='x', color='k', s=100)
        plt.show()

# Example usage
np.random.seed(10)
num_clusters = 3
X, _ = make_blobs(n_samples=1000, n_features=2, centers=num_clusters)

kmeans = KMeansClustering(num_clusters)
clusters, centroids = kmeans.fit(X)
predictions = kmeans.predict(X)
kmeans.plot_clusters(X, clusters)

OUTPUT:
CODE 2 - USING LIBRARIES:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
import warnings
warnings.filterwarnings("ignore")
# Generate sample data
np.random.seed(10)
num_clusters = 3
X, _ = make_blobs(n_samples=1000, n_features=2, centers=num_clusters)
# Perform K-means clustering using scikit-learn
kmeans = KMeans(n_clusters=num_clusters)
kmeans.fit(X)
# Get cluster labels and centroids
labels = kmeans.labels_
centroids = kmeans.cluster_centers_
# Plot the clusters and centroids
colors = ['r', 'g', 'b', 'c', 'm', 'y']
for i in range(num_clusters):
    plt.scatter(X[labels == i, 0], X[labels == i, 1], c=colors[i])
plt.scatter(centroids[:, 0], centroids[:, 1], marker='x', color='k', s=100)
plt.show()

OUTPUT:

INFERENCE:
We implemented K-Means clustering both from scratch and with built-in library functions.
Both methods plot the data points colored by cluster together with the centroids, and from
these plots we are able to identify the clusters.

EX NO 12 HIERARCHICAL CLUSTERING MODEL

DATE: 30.05.2023

PROBLEM STATEMENT:
Implement the hierarchical clustering model in Python, both from scratch and with built-in
library functions, using numpy and pandas.

PROBLEM ANALYSIS:
The code imports the required libraries, including numpy, scikit-learn's
AgglomerativeClustering class and make_blobs function, and matplotlib for plotting. The code
generates a synthetic dataset using the make_blobs function. It creates 50 samples
distributed among 3 clusters, with a specified standard deviation. An instance of
AgglomerativeClustering is created with the desired number of clusters (3 in this case). The
fit method is then called on the clustering object, which performs the clustering on the given
dataset. The scatter function from matplotlib is used to create a scatter plot of the data
points. Each point is colored based on its assigned cluster label. The xlabel, ylabel, and title
functions set the labels and title for the plot. The centroids of each cluster are computed by
taking the mean of the points in that cluster. For each unique label in the clustering labels,
the code retrieves the points belonging to that cluster and calculates the centroid. These
centroids are then plotted on the scatter plot as red crosses. The show function is called to
display the plot with the clustered data and marked centroids.
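
Agglomerative clustering repeatedly merges the two closest clusters, and what counts as "closest" depends on the linkage criterion. The from-scratch listing in this exercise merges on the mean pairwise distance (average linkage), whereas scikit-learn's AgglomerativeClustering uses Ward linkage by default, so the two methods are not guaranteed to produce identical clusters. The sketch below compares three common linkage distances on toy data; the function name linkage_distance and the sample points are illustrative assumptions, not part of the lab record.

```python
import math


def linkage_distance(cluster_a, cluster_b, method="average"):
    """Distance between two clusters of points under a given linkage criterion."""
    pairwise = [math.dist(p, q) for p in cluster_a for q in cluster_b]
    if method == "single":      # closest pair of points
        return min(pairwise)
    if method == "complete":    # farthest pair of points
        return max(pairwise)
    if method == "average":     # mean over all cross-cluster pairs
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"Unknown linkage method: {method}")


# Hypothetical toy clusters
cluster_a = [(0, 0), (0, 2)]
cluster_b = [(3, 0), (3, 2)]
for method in ("single", "average", "complete"):
    print(method, linkage_distance(cluster_a, cluster_b, method))
```

Single linkage is never larger than average linkage, which in turn is never larger than complete linkage, because they are the minimum, mean and maximum of the same set of pairwise distances.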

CODE 1 - FROM SCRATCH:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

def euclidean_distance(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

def hierarchical_clustering(X, n_clusters):
    num_samples = X.shape[0]
    distances = np.zeros((num_samples, num_samples))

    # Calculate pairwise distances (both triangles, so lookups do not depend on index order)
    for i in range(num_samples):
        for j in range(i + 1, num_samples):
            distances[i, j] = euclidean_distance(X[i], X[j])
            distances[j, i] = distances[i, j]

    # Initialize clusters: every sample starts in its own cluster
    clusters = [[i] for i in range(num_samples)]

    # Perform clustering
    while len(clusters) > n_clusters:
        min_dist = np.inf
        merge_indices = (0, 0)

        # Find the closest clusters (average linkage: mean of all pairwise distances)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cluster1 = clusters[i]
                cluster2 = clusters[j]
                dist = np.mean(distances[np.ix_(cluster1, cluster2)])
                if dist < min_dist:
                    min_dist = dist
                    merge_indices = (i, j)

        # Merge the closest clusters
        merged_cluster = clusters[merge_indices[0]] + clusters[merge_indices[1]]
        clusters = [c for idx, c in enumerate(clusters)
                    if idx not in merge_indices] + [merged_cluster]

    # Calculate and return centroids
    centroids = []
    for cluster in clusters:
        cluster_points = X[cluster]
        centroid = np.mean(cluster_points, axis=0)
        centroids.append(centroid)
    return clusters, centroids

# Generate sample data
np.random.seed(0)
X, y = make_blobs(n_samples=50, centers=3, random_state=0, cluster_std=0.5)

# Perform hierarchical clustering
clusters, centroids = hierarchical_clustering(X, n_clusters=3)

# Plotting the clusters and centroids
colors = ['red', 'blue', 'green']
for i, cluster in enumerate(clusters):
    points = X[cluster]
    plt.scatter(points[:, 0], points[:, 1], color=colors[i])
    centroid = centroids[i]
    plt.scatter(centroid[0], centroid[1], marker='x', color='black', s=100)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Hierarchical Clustering with Centroid Markers')
plt.show()

OUTPUT:

CODE 2 - USING LIBRARIES:

import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate sample data

np.random.seed(0)
X, y = make_blobs(n_samples=50, centers=3, random_state=0, cluster_std=0.5)

# Perform hierarchical clustering

clustering = AgglomerativeClustering(n_clusters=3)
clustering.fit(X)

# Plotting the clusters and centroids

plt.scatter(X[:, 0], X[:, 1], c=clustering.labels_, cmap='viridis')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Hierarchical Clustering')

# Calculate and mark centroids

centroids = []
for label in np.unique(clustering.labels_):
    cluster_points = X[clustering.labels_ == label]
    centroid = np.mean(cluster_points, axis=0)
    centroids.append(centroid)
    plt.scatter(centroid[0], centroid[1], marker='x', s=100, color='red')
plt.show()

OUTPUT:
INFERENCE:
We implemented hierarchical clustering both from scratch and with built-in library functions.
Both methods plot the data points colored by cluster together with the cluster centroids, and
from these plots we are able to identify the clusters.

EX NO 13 PRINCIPAL COMPONENT ANALYSIS

DATE: 01.06.2023

PROBLEM STATEMENT:
Implement Principal Component Analysis (PCA) in Python, both from scratch and with built-in
library functions, using numpy and pandas.

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

data = pd.read_csv('/kaggle/input/iris-flower-dataset/IRIS.csv')
data.head()

X = data.iloc[:, :-1].values
y = data.iloc[:, -1].values

class PCA:
    def __init__(self, n_components):
        self.n_components = n_components
        self.components = None
        self.mean = None

    def fit(self, X):
        # Center the data and remember the mean for later transforms
        self.mean = np.mean(X, axis=0)
        X_centered = X - self.mean

        # Compute the covariance matrix
        covariance_matrix = np.cov(X_centered, rowvar=False)

        # Perform eigendecomposition (eigh is intended for symmetric matrices
        # and returns real eigenvalues)
        eigenvalues, eigenvectors = np.linalg.eigh(covariance_matrix)

        # Sort eigenvectors by decreasing eigenvalue
        indices = np.argsort(eigenvalues)[::-1]
        sorted_eigenvalues = eigenvalues[indices]
        sorted_eigenvectors = eigenvectors[:, indices]

        # Select the top n_components eigenvectors
        self.components = sorted_eigenvectors[:, :self.n_components]

    def transform(self, X):
        # Center the data with the mean learned in fit
        X_centered = X - self.mean

        # Project the data onto the selected components
        transformed_data = np.dot(X_centered, self.components)
        return transformed_data

pca = PCA(n_components=2)
pca.fit(X)
transformed_data = pca.transform(X)
print("Original data shape:", X.shape)
print("Transformed data shape:", transformed_data.shape)

# Map species names to integers so they can be used as plot colors
species_map = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
color_labels = [species_map[label] for label in y]

OUTPUT:

PLOT 1:
plt.figure(figsize=(10, 4))
plt.subplot(1, 2, 1)
plt.scatter(X[:, 0], X[:, 1], c=color_labels)
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.title('Original Data')
PLOT 2:
plt.subplot(1, 2, 2)
plt.scatter(transformed_data[:, 0], transformed_data[:, 1], c=color_labels)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('Transformed Data')

plt.tight_layout()
plt.show()

CODE 2 - USING LIBRARIES:

from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca.fit(X)
pca_samples = pca.transform(X)
df = pd.DataFrame(pca_samples)
df.head()
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X,y, test_size=0.3, shuffle=True)
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(x_train, y_train)
y_pred = classifier.predict(x_test)
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

OUTPUT:

PLOT:
plt.subplot(1, 2, 2)
plt.scatter(pca_samples[:, 0], pca_samples[:, 1], c=color_labels)
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.title('Transformed Data')
plt.tight_layout()
plt.show()
INFERENCE:
We implemented Principal Component Analysis both from scratch and with built-in library
functions. Both methods plot the data projected onto the first two principal components, and
from these plots we are able to see how the three Iris species are distributed in the reduced
space.

EX NO 14 DECISION TREE CLASSIFIER

DATE: 06.06.2023

PROBLEM STATEMENT:
Implement the Decision Tree classifier in Python, both from scratch and with built-in library
functions, using numpy and pandas.

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import pandas as pd
import numpy as np

# Load the dataset

data = pd.read_csv("/kaggle/input/breast-cancer-wisconsin-data/data.csv")

# Drop unnecessary columns

data.drop(['id', 'Unnamed: 32'], axis=1, inplace=True)

# Convert the diagnosis column to numeric values (0 for benign, 1 for malignant)
data['diagnosis'] = data['diagnosis'].map({'M': 1, 'B': 0})

# Split the dataset into features and target variable

X = data.drop('diagnosis', axis=1)
y = data['diagnosis']
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Decision Tree Classifier

class DecisionTreeClassifier:
    def __init__(self, max_depth=None):
        self.max_depth = max_depth

    def fit(self, X, y):
        self.X = X
        self.y = y
        self.n_classes = len(np.unique(y))
        self.n_features = X.shape[1]
        self.tree = self.build_tree()

    def build_tree(self):
        # Append the labels as the last column so rows and labels are split together
        Xy = np.concatenate((self.X, self.y[:, np.newaxis]), axis=1)
        return self.recursive_build(Xy)

    def recursive_build(self, Xy, depth=0):
        # Stop when the depth limit is reached
        if self.max_depth is not None and depth >= self.max_depth:
            return self.get_leaf_node(Xy)

        # Stop when all samples in the node share one class
        if np.unique(Xy[:, -1]).size == 1:
            return self.get_leaf_node(Xy)

        best_split = self.get_best_split(Xy)
        if best_split is None:
            return self.get_leaf_node(Xy)

        left_child = self.recursive_build(best_split['left'], depth + 1)
        right_child = self.recursive_build(best_split['right'], depth + 1)

        return {
            'feature_index': best_split['feature_index'],
            'threshold': best_split['threshold'],
            'left': left_child,
            'right': right_child
        }

    def get_leaf_node(self, Xy):
        # A leaf stores how many samples of each class reached it
        leaf_node = {'class_counts': np.bincount(Xy[:, -1].astype(int))}
        return leaf_node

    def get_best_split(self, Xy):
        best_split = None
        best_gini = 1.0
        for feature_index in range(self.n_features):
            feature_values = np.unique(Xy[:, feature_index])
            for threshold in feature_values:
                left = Xy[Xy[:, feature_index] <= threshold]
                right = Xy[Xy[:, feature_index] > threshold]

                # A split that leaves one side empty separates nothing
                if left.shape[0] == 0 or right.shape[0] == 0:
                    continue

                gini = self.calculate_gini(left, right)

                if gini < best_gini:
                    best_gini = gini
                    best_split = {
                        'feature_index': feature_index,
                        'threshold': threshold,
                        'left': left,
                        'right': right
                    }
        return best_split

    def calculate_gini(self, left, right):
        left_counts = np.bincount(left[:, -1].astype(int), minlength=self.n_classes)
        right_counts = np.bincount(right[:, -1].astype(int), minlength=self.n_classes)

        left_size = left.shape[0]
        right_size = right.shape[0]
        total_size = left_size + right_size

        # Gini impurity of each side: 1 - sum of squared class proportions
        gini_left = 1.0 - sum((left_counts[i] / left_size) ** 2 for i in range(self.n_classes))
        gini_right = 1.0 - sum((right_counts[i] / right_size) ** 2 for i in range(self.n_classes))

        # Size-weighted average of the two impurities
        gini = (left_size / total_size) * gini_left + (right_size / total_size) * gini_right
        return gini

    def predict(self, X):
        return np.array([self.traverse_tree(x, self.tree) for x in X])

    def traverse_tree(self, x, node):
        # Leaf: predict the most frequent class
        if 'class_counts' in node:
            return np.argmax(node['class_counts'])

        if x[node['feature_index']] <= node['threshold']:
            return self.traverse_tree(x, node['left'])
        else:
            return self.traverse_tree(x, node['right'])

# Create an instance of the Decision Tree Classifier

dt_classifier = DecisionTreeClassifier(max_depth=5)

# Fit the classifier to the training data

dt_classifier.fit(X_train.values, y_train.values)

# Predict the test data

y_pred = dt_classifier.predict(X_test.values)

# Calculate the accuracy

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test.values, y_pred)
print(f"Accuracy: {accuracy}")

OUTPUT:

CODE 2 - USING LIBRARIES:

import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
dataset = pd.read_csv("/kaggle/input/breast-cancer-wisconsin-data/data.csv")
dataset.head()
dataset = dataset.drop(["id"], axis = 1)
dataset = dataset.drop(["Unnamed: 32"], axis = 1)
dataset.diagnosis = [1 if i == "M" else 0 for i in dataset.diagnosis]
x = dataset.drop(["diagnosis"], axis = 1)
y = dataset.diagnosis.values
# Min-max scale each column to the range [0, 1]
x = (x - x.min()) / (x.max() - x.min())
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.3, random_state = 42)
from sklearn.tree import DecisionTreeClassifier
dt = DecisionTreeClassifier()
dt.fit(x_train, y_train)
dt.score(x_test, y_test)

OUTPUT:

INFERENCE:
We implemented the Decision Tree classifier both from scratch and with built-in library
functions, and the accuracy of each method is displayed, using the NumPy and pandas libraries.

EX NO 15 RANDOM FOREST
DATE: 08.06.2023
PROBLEM STATEMENT:
Implement the Random Forest classifier in Python, both from scratch and with built-in library
functions, using numpy and pandas.

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

import numpy as np
import pandas as pd

class Randomforestclassifier:
    def __init__(self, num_trees=100, max_features=None, max_depth=None):
        self.num_trees = num_trees
        self.max_features = max_features
        self.max_depth = max_depth
        self.trees = []

    def fit(self, X, y):
        X = np.array(X)  # Convert X to a NumPy array
        y = np.array(y)
        num_samples = len(X)
        num_features = len(X[0])
        self.trees = []

        for _ in range(self.num_trees):
            # Randomly select a subset of features
            if self.max_features:
                selected_features = np.random.choice(num_features, self.max_features,
                                                     replace=False)
            else:
                selected_features = np.arange(num_features)
            X_subset = X[:, selected_features]

            # Randomly select a subset of samples (bootstrap aggregating)
            indices = np.random.choice(num_samples, num_samples, replace=True)
            X_bootstrap = X_subset[indices]
            y_bootstrap = y[indices]

            # Create a decision tree using the bootstrap samples
            tree = DecisionTreeClassifier(max_depth=self.max_depth)
            tree.fit(X_bootstrap, y_bootstrap)

            # Keep the feature indices so prediction uses the same columns
            self.trees.append((tree, selected_features))

    def predict(self, X):
        X = np.array(X)  # Convert X to a NumPy array
        predictions = []

        for tree, selected_features in self.trees:
            predictions.append(tree.predict(X[:, selected_features]))

        # Voting for the majority class (rounding the mean assumes binary 0/1 labels)
        predictions = np.array(predictions)
        return np.round(np.mean(predictions, axis=0))

class DecisionTreeClassifier:
    def __init__(self, max_depth=None):
        self.max_depth = max_depth
        self.tree = None

    def fit(self, X, y):
        X = np.array(X)  # Convert X to a NumPy array
        y = np.array(y)
        self.tree = self.build_tree(X, y)

    def predict(self, X):
        X = np.array(X)  # Convert X to a NumPy array
        predictions = [self.predict_sample(x, self.tree) for x in X]
        return predictions

    def predict_sample(self, sample, node):
        # Leaf node: return the stored class
        if 'class' in node:
            return node['class']

        feature_value = sample[node['feature']]

        if feature_value <= node['value']:
            return self.predict_sample(sample, node['left'])
        else:
            return self.predict_sample(sample, node['right'])

    def build_tree(self, X, y, depth=0):
        # Base case: all samples have the same class
        if len(np.unique(y)) == 1:
            return {'class': y[0]}

        # Base case: maximum depth reached, so predict the majority class
        if self.max_depth is not None and depth >= self.max_depth:
            return {'class': np.argmax(np.bincount(y))}

        # Find the best split point
        best_feature, best_value = self.find_best_split(X, y)

        # Handle the case where no split improves on the parent node
        if best_feature is None or best_value is None:
            return {'class': np.argmax(np.bincount(y))}

        # Recursive splitting
        left_indices = np.where(X[:, best_feature] <= best_value)[0]
        right_indices = np.where(X[:, best_feature] > best_value)[0]

        left_tree = self.build_tree(X[left_indices], y[left_indices], depth + 1)
        right_tree = self.build_tree(X[right_indices], y[right_indices], depth + 1)

        return {'feature': best_feature, 'value': best_value, 'left': left_tree, 'right': right_tree}

    def find_best_split(self, X, y):
        best_gain = 0
        best_feature = None
        best_value = None

        # Try every observed value of every feature as a threshold
        for feature in range(X.shape[1]):
            values = np.unique(X[:, feature])
            for value in values:
                gain = self.calculate_gain(X, y, feature, value)
                if gain > best_gain:
                    best_gain = gain
                    best_feature = feature
                    best_value = value

        return best_feature, best_value

    def calculate_gain(self, X, y, feature, value):
        parent_entropy = self.calculate_entropy(y)

        left_indices = np.where(X[:, feature] <= value)[0]
        right_indices = np.where(X[:, feature] > value)[0]

        # A split that leaves one side empty gives no information gain
        if len(left_indices) == 0 or len(right_indices) == 0:
            return 0

        left_entropy = self.calculate_entropy(y[left_indices])
        right_entropy = self.calculate_entropy(y[right_indices])

        left_weight = len(left_indices) / len(X)
        right_weight = len(right_indices) / len(X)

        # Information gain = parent entropy minus the weighted child entropies
        gain = parent_entropy - (left_weight * left_entropy) - (right_weight * right_entropy)
        return gain

    def calculate_entropy(self, y):
        classes, class_counts = np.unique(y, return_counts=True)
        class_probs = class_counts / len(y)
        # The small constant avoids log2(0)
        entropy = -np.sum(class_probs * np.log2(class_probs + 1e-10))
        return entropy

from sklearn.metrics import accuracy_score

# x_train, x_test, y_train and y_test are assumed to come from the train_test_split in Code 2
rf_classifier = Randomforestclassifier(num_trees=100, max_features=3, max_depth=5)
rf_classifier.fit(x_train, y_train)
y_pred = rf_classifier.predict(x_test)
accuracy_score(y_test, y_pred)

OUTPUT:

CODE 2 - USING LIBRARIES:

import pandas as pd
import numpy as np
import warnings
warnings.filterwarnings('ignore')
df = pd.read_csv('/kaggle/input/full-filled-brain-stroke-dataset/full_data.csv')
df.head()
X = df.drop(['stroke'],axis=1)
y = df['stroke']
X= pd.get_dummies(X)
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X,y, test_size=0.3, shuffle=True)

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
classifier= RandomForestClassifier(n_estimators= 10, criterion="entropy")
classifier.fit(x_train, y_train)
y_pred=classifier.predict(x_test)
accuracy_score(y_test,y_pred)

OUTPUT:

INFERENCE:
We implemented the Random Forest classifier both from scratch and with built-in library
functions, and the accuracy of each method is displayed, using the NumPy and pandas libraries.

EX NO 16 SUPPORT VECTOR MACHINE

DATE: 13.06.2023
PROBLEM STATEMENT:
Implement the Support Vector Machine in Python, both from scratch and with built-in library
functions, using numpy and pandas.

SAMPLE DATASET:

CODE 1 - FROM SCRATCH:

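import numpy as np
from sklearn.metrics import accuracy_score

class SVM:
    def __init__(self, learning_rate=0.001, lambda_param=0.01, n_iters=1000):
        self.lr = learning_rate
        self.lambda_param = lambda_param
        self.n_iters = n_iters
        self.w = None
        self.b = None

    def fit(self, X, y):
        n_samples, n_features = X.shape

        # Map the labels to {-1, +1}
        y_ = np.where(y <= 0, -1, 1)

        # initialize weights
        self.w = np.zeros(n_features)
        self.b = 0

        for _ in range(self.n_iters):
            for idx, x_i in enumerate(X):
                # True when the sample lies on the correct side of the margin
                condition = y_[idx] * (np.dot(x_i, self.w) - self.b) >= 1
                if condition:
                    # Only the regularization term contributes to the gradient
                    self.w -= self.lr * (2 * self.lambda_param * self.w)
                else:
                    # Hinge loss is active: update weights and bias
                    self.w -= self.lr * (2 * self.lambda_param * self.w - np.dot(x_i, y_[idx]))
                    self.b -= self.lr * y_[idx]

    def predict(self, X):
        # Returns -1 or +1 (0 only if a point lies exactly on the boundary)
        approx = np.dot(X, self.w) - self.b
        return np.sign(approx)

# y_test and y_pred are assumed to come from a train/test split and a predict() call
# that are not part of this listing
acc = accuracy_score(y_test, y_pred) * 100
print('Accuracy of the model: {0}%'.format(acc))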

OUTPUT:

CODE 2 - USING LIBRARIES:

import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings('ignore')
dataset = pd.read_csv('/kaggle/input/support-vector-machine/Social_Network_Ads.csv')
dataset.head(5)
dataset=dataset.drop(['User ID'],axis=1)

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder()
dataset['Gender'] = le.fit_transform(dataset['Gender'])
# Features are every column except the target column 'Purchased'
X = dataset.drop(['Purchased'], axis=1)
y = dataset['Purchased']
from sklearn.model_selection import train_test_split
X_train , X_test , y_train , y_test = train_test_split(X , y ,test_size=0.3, random_state=0)
from sklearn.svm import LinearSVC
clf = LinearSVC()
clf.fit(X_train , y_train)
y_pred = clf.predict(X_test)
from sklearn.metrics import accuracy_score , classification_report
acc = accuracy_score(y_test,y_pred)*100
print('Accuracy of the model: {0}%'.format(acc))

OUTPUT:

INFERENCE:
We implemented the SVM both from scratch and with built-in library functions, and the accuracy
of each method is displayed, using the NumPy and pandas libraries.

51 pages
Machine Learning Lab Manual
No ratings yet
Machine Learning Lab Manual
30 pages
CANDIDATE-ELIMINATION Learning Algorithm
0% (1)
CANDIDATE-ELIMINATION Learning Algorithm
3 pages
MLT(1)
No ratings yet
MLT(1)
18 pages
Amit MLT1
No ratings yet
Amit MLT1
22 pages
ML Lab Manual Devansh (1)
No ratings yet
ML Lab Manual Devansh (1)
57 pages
ML NEW Final Format
No ratings yet
ML NEW Final Format
37 pages
Ex 1 in ML
No ratings yet
Ex 1 in ML
4 pages
Screenshot 2023-12-07 at 11.07.49 AM
No ratings yet
Screenshot 2023-12-07 at 11.07.49 AM
14 pages
Machine Learning Lab Manual (15CSL76)
No ratings yet
Machine Learning Lab Manual (15CSL76)
30 pages
Candidate Elimination Algorithm
No ratings yet
Candidate Elimination Algorithm
3 pages
AIML
No ratings yet
AIML
12 pages
Lab Manual
No ratings yet
Lab Manual
25 pages
PESIT Bangalore South Campus: Vii Semester Lab Manual Subject: Machine Learning
No ratings yet
PESIT Bangalore South Campus: Vii Semester Lab Manual Subject: Machine Learning
31 pages
Machine Learning Laboratory 18CSL76: Institute of Technology and Management
No ratings yet
Machine Learning Laboratory 18CSL76: Institute of Technology and Management
49 pages
Machine learning
No ratings yet
Machine learning
27 pages
ML Lab Manual
No ratings yet
ML Lab Manual
26 pages
DOC-20250509-WA0027.
No ratings yet
DOC-20250509-WA0027.
34 pages
Profound Python Data Science
From Everand
Profound Python Data Science
Onder Teker
No ratings yet
A Comprehensive Guide to Digital Transformation in the Oil and Gas Industry
No ratings yet
A Comprehensive Guide to Digital Transformation in the Oil and Gas Industry
12 pages
Syllabus (CS361)
No ratings yet
Syllabus (CS361)
3 pages
AI Referral Project Report
No ratings yet
AI Referral Project Report
11 pages
Interview Questions To Ask A Data Scientist Xobin Downloaded
No ratings yet
Interview Questions To Ask A Data Scientist Xobin Downloaded
8 pages
Research Paper
No ratings yet
Research Paper
51 pages
Case Study 2
No ratings yet
Case Study 2
3 pages
CSC408 Case Study - Mar23 - With Rubric
No ratings yet
CSC408 Case Study - Mar23 - With Rubric
5 pages
Dataset: Camel: Train Data
No ratings yet
Dataset: Camel: Train Data
3 pages
Asset-V1 MITx+6.86x+3T2020+typeasset+blockslides Lecture2 Compressed
No ratings yet
Asset-V1 MITx+6.86x+3T2020+typeasset+blockslides Lecture2 Compressed
21 pages
Mindscapetravellers Riddhi Tushita Anandita1
No ratings yet
Mindscapetravellers Riddhi Tushita Anandita1
8 pages
Analyzing Large Scale Human Mobility Dat
No ratings yet
Analyzing Large Scale Human Mobility Dat
23 pages
X22-Artificial Intelligence Enabled Energy-Efficient Heating, Ventilation and Air
No ratings yet
X22-Artificial Intelligence Enabled Energy-Efficient Heating, Ventilation and Air
27 pages
Pharma Industry 4 0 Preparing For The Smart Factories 1636727193
No ratings yet
Pharma Industry 4 0 Preparing For The Smart Factories 1636727193
9 pages
Shubham Thesis PDF
No ratings yet
Shubham Thesis PDF
63 pages
Knime - Words To Wisdom
100% (2)
Knime - Words To Wisdom
177 pages
Deps 087669
No ratings yet
Deps 087669
14 pages
next gen farming
No ratings yet
next gen farming
34 pages
Download full Introduction to Deep Learning Using R: A Step-by-Step Guide to Learning and Implementing Deep Learning Models Using R Taweh Beysolow Ii ebook all chapters
100% (3)
Download full Introduction to Deep Learning Using R: A Step-by-Step Guide to Learning and Implementing Deep Learning Models Using R Taweh Beysolow Ii ebook all chapters
62 pages
Compatible Final Proofread AI Fraud Detection For FinTech
No ratings yet
Compatible Final Proofread AI Fraud Detection For FinTech
58 pages
Uddin et al (2023)
No ratings yet
Uddin et al (2023)
21 pages
Artificial Intelligence Assignment!!!
No ratings yet
Artificial Intelligence Assignment!!!
13 pages
C2C - Predictive Analysis of Student Campus Placement PDF
No ratings yet
C2C - Predictive Analysis of Student Campus Placement PDF
16 pages
Data Analytics - 4 Manuscripts - Data Science For Beginners, Data Analysis With Python, SQL Computer Programming For Beginners, Statistics For Beginners
100% (1)
Data Analytics - 4 Manuscripts - Data Science For Beginners, Data Analysis With Python, SQL Computer Programming For Beginners, Statistics For Beginners
481 pages
sem2
No ratings yet
sem2
2 pages
Speech Recognition
No ratings yet
Speech Recognition
16 pages
ID 431 Anodot Ultimate Guide To Building A Machine Learning Outlier Detection System Part III
No ratings yet
ID 431 Anodot Ultimate Guide To Building A Machine Learning Outlier Detection System Part III
20 pages
Applied Machine Learning for Health and Fitness: A Practical Guide to Machine Learning with Deep Vision, Sensors and IoT Kevin Ashley pdf download
100% (4)
Applied Machine Learning for Health and Fitness: A Practical Guide to Machine Learning with Deep Vision, Sensors and IoT Kevin Ashley pdf download
59 pages
Deep Learning Unit 1..
No ratings yet
Deep Learning Unit 1..
21 pages
Cross-Validation, Regularization, and Principal Components Analysis (PCA)
No ratings yet
Cross-Validation, Regularization, and Principal Components Analysis (PCA)
47 pages
AI-DrivenWarehouseAutomationAComprehensiveReviewofSystems
No ratings yet
AI-DrivenWarehouseAutomationAComprehensiveReviewofSystems
12 pages