3. Machine Learning
Machine Learning
Chapters
1. Introduction to Machine Learning
2. Regression
3. Classification
4. Clustering
5. Principal Component Analysis
Chapter 1
1. Introduction to Machine Learning
ML Terminology
• Variables / Features
• These are the columns of the dataset; the dataset may come from files, databases, and other sources
• Independent Variable
• It is used in the equation to find the output (pattern)
• It is also known as the Predictor
• Dependent Variable
• It is the output of the equation
• It is also known as the Response / Target
• Actual Value
• The dependent variable value from the dataset
• Predicted Value
• The dependent variable value from the equation
• Error
• The difference between the actual and predicted values
• Accuracy Metric
• A value/measure used to evaluate how well the machine learning model is trained
Regression
• Regression:
• The dependent variable is continuous, for example the salary of an employee
• Regression Techniques:
• Linear Regression
• Predictor and Response variables are linearly related
• Simple Linear Regression
• Multiple Linear Regression
• Non-Linear Regression
• Predictor and Response variables are non-linearly related
• Polynomial Regression
Classification
• Classification:
• The dependent variable is categorical, for example whether a mail is spam or not
• Classification Techniques:
• Logistic Regression
• Decision Tree
• Support Vector Machine
• K-Nearest Neighbor
• Naïve Bayes
• Random Forest (ensemble technique)
Reinforcement Learning
• Reinforcement Learning
• The machine is trained using rewards and penalties
• A reward is a positive point
• A penalty is a negative point
2. Regression
• $y = \beta_0 + \beta_1 x + \epsilon$
• Where:
• y is the dependent variable
• x is the independent variable
• $\beta_0$ is the intercept
• $\beta_1$ is the slope or coefficient
• $\epsilon$ is the error term or residual
• $y = mx + c$
• Where:
• y is the dependent variable
• x is the independent variable
• c is the intercept
• m is the slope or coefficient
OLS Method
$y = 11x + 12$
OLS
import pandas as pd
import statsmodels.api as sm

# loading the dataset
emp_ds = pd.read_csv('data/Emp_Salary.csv')
x = emp_ds[['YearsExperience']]
y = emp_ds.iloc[:, -1]

# adding the constant (intercept) term and fitting the OLS model
x = sm.add_constant(x)
model = sm.OLS(y, x).fit()
model.summary()
SLR Walkthrough
Importing
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.metrics import r2_score
SLR Walkthrough
Loading data
#loading data from csv file
file_path = 'data/Emp_Salary.csv'
emp_ds = pd.read_csv(file_path)
SLR Walkthrough
Handling na values
#finding na values
emp_ds.isna().sum()
Checking Relation
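The relation check on this slide was shown as a chart; a minimal sketch of such a plot, assuming the Salary column as the response (the matplotlib import is added here):

from matplotlib import pyplot as plt

# scatter plot to check the relation between experience and salary
plt.scatter(emp_ds['YearsExperience'], emp_ds['Salary'])
plt.xlabel('YearsExperience')
plt.ylabel('Salary')
plt.show()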
Splitting x and y
x = emp_ds[['YearsExperience']].values
y = emp_ds.iloc[:, -1].values
# splitting into train and test sets (the split call is reconstructed; the ratio is illustrative)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)
print(x_train.shape, x_test.shape)
Building Model
#training model
slr_model = LinearRegression()
slr_model.fit(x_train, y_train)
#finding parameters
print(f'Coef : {slr_model.coef_} \nIntercept : {slr_model.intercept_}')
Evaluating Model
# evaluating the model on the test set
y_pred = slr_model.predict(x_test)
print('R2 Score : ', r2_score(y_test, y_pred))
Finding outliers
# the function body is reconstructed; an IQR-based rule is assumed here
def find_outliers(data):
    q1, q3 = data.quantile(0.25), data.quantile(0.75)
    iqr = q3 - q1
    outliers = data[(data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)]
    return outliers.to_list()
Deleting outliers
# deleting outliers from the dataset (the drop step is reconstructed)
outliers = find_outliers(emp_ds['Salary'])
emp_ds = emp_ds[~emp_ds['Salary'].isin(outliers)]
# after re-splitting and re-fitting the model on the cleaned data, evaluate again
y_pred = slr_model.predict(x_test)
print('R2 Score : ', r2_score(y_test, y_pred))
MLR Walkthrough
import seaborn as sns
from matplotlib import pyplot as plt

# loading the advertisements dataset and checking feature correlations
adv_ds = pd.read_csv('data/Advertisments.csv')
adv_ds.head()
sns.heatmap(adv_ds.corr(), annot=True)
plt.show()
Splitting Dataset
x = adv_ds.iloc[:,:-1]
y = adv_ds.iloc[:,-1]
# splitting into train and test sets (this step is reconstructed; the ratio is illustrative)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)
# fitting the multiple linear regression model and predicting
mlr_model = LinearRegression()
mlr_model.fit(x_train, y_train)
y_pred = mlr_model.predict(x_test)
Finding Multicollinearity
# delete all variables that have a VIF value of more than 5 and rebuild the model
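The VIF computation itself is not preserved on the slide; a minimal sketch using statsmodels' variance_inflation_factor (the threshold of 5 comes from the comment above):

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

# computing the VIF for every predictor in x
vif = pd.DataFrame()
vif['feature'] = x.columns
vif['VIF'] = [variance_inflation_factor(x.values, i) for i in range(x.shape[1])]
print(vif)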
Residuals Normality
import statsmodels.api as sm

# residuals are the differences between actual and predicted values
residuals = y_test - y_pred
sm.qqplot(residuals)
plt.show()
Activity
• Build a regression model to predict house price on the Real Estate dataset
Polynomial Regression
• Linear regression is not suitable for data with a non-linear relation
• Polynomial regression is an extension of linear regression using an nth-degree polynomial
# loading the employee grade/salary dataset
emp_ds = pd.read_csv('data/Emp_Grade_Salary.csv')
emp_ds.head()
from sklearn.preprocessing import PolynomialFeatures

# x, y are assumed to be split from emp_ds (e.g., Grade vs Salary) on an earlier slide
# converting x to polynomial features
poly_conv = PolynomialFeatures(degree=2, include_bias=False)
x_poly = poly_conv.fit_transform(x)
# splitting the polynomial features into train and test sets (this step is reconstructed)
x_train, x_test, y_train, y_test = train_test_split(x_poly, y, test_size=0.4, random_state=101)
# building the model
pr_model = LinearRegression()
pr_model.fit(x_train, y_train)
plt.figure(figsize=(4,4))
plt.scatter(x, y, label='Actual Data')
plt.plot(x, pr_model.predict(x_poly), color='g', label='Regression Line')
plt.title('Grade vs Salary')
plt.xlabel('Grade')
plt.ylabel('Salary')
plt.legend()
plt.show()
Polynomial Regression
• Finding the best degree is the key challenge in polynomial regression
• We check different degree values, starting from 2 up to n, as in the loop below
• Select the degree with the best score or minimum error
train_errors = []
test_errors = []
for d in range(1, 10):
    poly_conv = PolynomialFeatures(degree=d, include_bias=False)
    x_poly = poly_conv.fit_transform(x)
    x_train, x_test, y_train, y_test = train_test_split(x_poly, y, test_size=0.4, random_state=101)
    model = LinearRegression()
    model.fit(x_train, y_train)
    train_pred = model.predict(x_train)
    test_pred = model.predict(x_test)
    train_RMSE = np.sqrt(mean_squared_error(y_train, train_pred))
    test_RMSE = np.sqrt(mean_squared_error(y_test, test_pred))
    train_errors.append(train_RMSE)
    test_errors.append(test_RMSE)
# plotting the training and testing error for each tried degree
steps = range(len(train_errors))
plt.plot(steps, train_errors, label='Training Error')
plt.plot(steps, test_errors, label='Testing Error')
plt.xlabel('Steps')
plt.ylabel('Error')
plt.legend()
plt.show()
• Underfitting:
• The model performs poorly on the testing set, and also on the training set
• Bias is high and variance is low
• Avoiding underfitting:
• Increasing the training time of the model
• Increasing the number of features
Cross-Validation
• The model is trained and evaluated on different combinations of train and test sets drawn from the same dataset
• It is commonly performed as k-fold cross-validation
from sklearn.model_selection import KFold, cross_val_score

# 5-fold cross-validation of a linear regression model, scored with R2
lm = LinearRegression()
k_folds = KFold(n_splits=5, shuffle=True, random_state=100)
scores = cross_val_score(lm, x, y, scoring='r2', cv=k_folds)
np.mean(np.absolute(scores))
Regularization
• One of the most crucial ideas in machine learning is regularization.
• It is a method for preventing the model from overfitting by adding a penalty term to the loss function.
• By lowering the magnitude of the coefficients, this strategy keeps all variables or features in the model.
• Consequently, it keeps the model's generality and accuracy.
• The coefficients of the features are regularized, i.e. shrunk toward zero.
• In the regularization approach, we preserve the same number of features while reducing their magnitude.
• A small term, scaled by the lambda hyperparameter, is introduced into the loss/cost function; this term is called the penalty.
• Types of Regularization:
• Ridge Regularization
• Lasso Regularization
Ridge Regularization
• It is also known as L2 Regularization
• The penalty term is lambda multiplied by the sum of the squared coefficients
• Equation as follows:
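The equation image from the original slide is not preserved; the standard form of the ridge cost function is:

$$\text{cost} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j=1}^{p}\beta_j^2$$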
Ridge Regression
from sklearn.linear_model import Ridge

# fitting ridge regression with a fixed alpha and evaluating on the test set
ridge_model = Ridge(alpha=10)
ridge_model.fit(x_train, y_train)
y_pred = ridge_model.predict(x_test)
MAE = mean_absolute_error(y_test, y_pred)
RMSE = np.sqrt(mean_squared_error(y_test, y_pred))
from sklearn.linear_model import RidgeCV

# searching for the best alpha with cross-validation
ridge_cv_model = RidgeCV(alphas=range(1, 101, 5), scoring='neg_mean_absolute_error')
ridge_cv_model.fit(x_train, y_train)
ridge_cv_model.alpha_
Activity
• Build a ridge regression model to predict house price on the Real Estate dataset
Lasso Regularization
• It is also known as L1 Regularization
• The penalty term is lambda multiplied by the sum of the absolute values of the coefficients
• Equation as follows:
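The equation image from the original slide is not preserved; the standard form of the lasso cost function is:

$$\text{cost} = \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 + \lambda \sum_{j=1}^{p}\left|\beta_j\right|$$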
Lasso Regression
from sklearn.linear_model import Lasso

# fitting lasso regression with a fixed alpha and evaluating on the test set
lasso_model = Lasso(alpha=100)
lasso_model.fit(x_train, y_train)
y_pred = lasso_model.predict(x_test)
MAE = mean_absolute_error(y_test, y_pred)
RMSE = np.sqrt(mean_squared_error(y_test, y_pred))
from sklearn.linear_model import LassoCV

# searching for the best alpha with cross-validation
lasso_cv_model = LassoCV(eps=0.1, n_alphas=100, cv=5)
lasso_cv_model.fit(x_train, y_train)
lasso_cv_model.alpha_
Activity
• Build a lasso regression model to predict house price on the Real Estate dataset
Chapter 3
Classification
Introduction to Classification
• Classifying samples into groups is called classification
• In classification, the dependent variable has categorical values, such as yes or no
• If the dependent variable has only two categorical values, the problem is a binary classification problem
• If the dependent variable has more than two categorical values, the problem is a multiclass classification problem
Classification Techniques
• Logistic Regression
• Decision Tree
• K Nearest Neighbor
• Support Vector Machine
• Naïve Bayes
• Ensemble Methods
• Random Forest
• Gradient Boosting
Logistic Regression
• Despite the name "regression", it is a classification technique (algorithm)
• It is a probabilistic model
• It uses MLE (Maximum Likelihood Estimation) to estimate its parameters
• It uses a linear model (equation) internally to predict the labels (dependent variable)
• The linear model is transformed into a non-linear model by applying a function called the sigmoid
• It returns values between 0 and 1 (probability values) for the samples
• The sigmoid function is given below:

$$f(x) = \frac{1}{1 + e^{-x}}$$

• Here e is the base of the natural logarithm, with value approximately 2.718
https://www.vcalc.com/wiki/vCalc/Sigmoid+Function
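A small Python sketch of the sigmoid function (illustrative, not from the slides):

import numpy as np

def sigmoid(x):
    # maps any real value into the (0, 1) range
    return 1 / (1 + np.exp(-x))

print(sigmoid(np.array([-5, -2, 0, 2, 5])))  # values approach 0 on the left and 1 on the right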
• We introduce a decision surface (threshold) to classify a sample; the default is 0.5.
• For example, in binary classification, if the sigmoid value is greater than or equal to 0.5 the sample is classified as 1, else as 0

$$\text{cost} = \frac{1}{n}\sum_{i=1}^{n} -\left[y_i \log(f(x_i)) + (1 - y_i)\log(1 - f(x_i))\right]$$
X     Y    Sigmoid    Threshold    Y'
-5    0    0.02       0.5          0
-2    0    0.17       0.5          0
10    1    1          0.5          1
20    1    1          0.5          1
1     0    0.69       0.5          1
18    1    1          0.5          1
LogisticRegression Class
• Parameters, attributes, and methods of sklearn's LogisticRegression class
Importing
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
Loading Dataset
# loading the bank marketing dataset (the file path here is illustrative)
bank_ds = pd.read_csv('data/bank.csv')
bank_ds.info()
x and y split
x = bank_ds[['age']]
y = bank_ds['y']
Build Model
# splitting into train and test sets (this step is reconstructed; the ratio is illustrative)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
log_model = LogisticRegression()
log_model.fit(x_train, y_train)
log_model.coef_, log_model.intercept_
Evaluating Model
y_pred = log_model.predict(x_test)
accuracy_score(y_test, y_pred)
Accuracy Score
• It is the ratio of correctly predicted labels to the total number of labels
• Its value ranges from 0 to 1
• 0 means all samples are wrongly predicted
• 1 means all samples are correctly predicted
• 0.5 means only 50 percent of the observations are correctly predicted
Confusion Matrix
• It is an n-by-n square matrix with a detailed prediction breakdown for each class label
• It shows how many samples are correctly and wrongly predicted for each class
• It changes for different threshold values
• Precision: (how many predicted positives are actually correct)
• It is the ratio between TP and TP + FP
• $\text{Precision}(P) = \frac{TP}{TP + FP}$
• Recall: (how many actual positives are predicted)
• It is the ratio between TP and TP + FN
• $\text{Recall}(R) = \frac{TP}{TP + FN}$
• F1 Score: the harmonic mean of precision and recall
• $F_1 = 2 \cdot \frac{P \times R}{P + R}$
• False Positive Rate (used together with recall to draw the ROC curve)
• $FPR = \frac{FP}{FP + TN}$
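These metrics are also available in sklearn.metrics; a minimal, self-contained sketch with illustrative labels:

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1, 0]   # actual labels (illustrative)
y_hat  = [0, 1, 1, 1, 0, 0]   # predicted labels (illustrative)
print('Precision :', precision_score(y_true, y_hat))  # TP / (TP + FP)
print('Recall    :', recall_score(y_true, y_hat))     # TP / (TP + FN)
print('F1 Score  :', f1_score(y_true, y_hat))         # 2PR / (P + R)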
MC Logistic Model
#importing required packages
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import seaborn as sns
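The slide that loads the iris data is not preserved; a minimal sketch of that step (mirroring the GaussianNB walkthrough later in these slides):

# loading iris and splitting into x (features) and y (target)
iris_ds = load_iris()
x = iris_ds.data
y = iris_ds.target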
#splitting dataset into train test split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=.2, random_state=0)
# fitting the multiclass logistic regression model (this step is reconstructed)
mc_log_model = LogisticRegression(max_iter=200)
mc_log_model.fit(x_train, y_train)
y_pred = mc_log_model.predict(x_test)
# displaying the confusion matrix
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True)
# classification report
print(classification_report(y_test, y_pred))
Activity
• Build a multiclass logistic regression model on the digits dataset from the sklearn package
K-Nearest Neighbor
• Distance measures used to find the nearest neighbors:
• Euclidean distance
• $\sqrt{\sum (p_1 - p_2)^2}$
• Manhattan distance
• $\sum |p_1 - p_2|$
• Minkowski distance
• $\left(\sum |p_1 - p_2|^p\right)^{1/p}$
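A small numpy sketch of these distance measures between two points (illustrative values):

import numpy as np

p1 = np.array([1.0, 2.0, 3.0])
p2 = np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(np.sum((p1 - p2) ** 2))
manhattan = np.sum(np.abs(p1 - p2))
p = 3
minkowski = np.sum(np.abs(p1 - p2) ** p) ** (1 / p)
print(euclidean, manhattan, minkowski)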
• Disadvantages:
• It is difficult to find the best k value
• The computational cost is high
KNeighborsClassifier Class
• Parameters and methods of sklearn's KNeighborsClassifier class
KNN Walkthrough
KNN Model
#importing required packages
from sklearn import datasets as dss
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
# loading a dataset through the dss alias (the exact dataset is not preserved on the
# slide; the breast cancer dataset is assumed here)
cancer_ds = dss.load_breast_cancer()
x, y = cancer_ds.data, cancer_ds.target
# displaying shape of x
x.shape
# splitting and fitting the KNN model (these steps are reconstructed;
# the split ratio and k=5 are assumptions)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
knn_model = KNeighborsClassifier(n_neighbors=5).fit(x_train, y_train)
# Evaluating KNN model
y_pred = knn_model.predict(x_test)
Activity
• Build a KNN model on the digits dataset from the sklearn package
Decision Tree
• It is a tree-like structure used to make decisions
• It can be used for regression as well as classification
• Decision Tree algorithms:
• ID3
• C4.5
• CART
ASM Techniques
• ASM stands for Attribute Selection Measures
• These techniques select the feature used to split when creating the tree
• ASM Techniques:
• Entropy
• Information Gain
• Gini Index
Entropy
• It measures the randomness in the information being processed
• Higher entropy means more randomness of classes
• Lower entropy means less randomness of classes
• The equation is as follows:
• $\text{Entropy}(T) = \sum_{i=1}^{c} -P_i \log_2 P_i$
Information Gain
• It measures the purity of the classes at a node after a split
• If the information gain is high, the node contains mostly one class
• If the information gain is low, the node contains a mix of all classes
• The equation is as follows:
• $IG(T, X) = \text{Entropy}(T) - \text{Entropy}(T, X)$
Gini Index
• It also measures the purity of the classes at a node
• It works opposite to information gain: lower Gini values indicate purer nodes
• The equation is as follows:
• $\text{Gini} = 1 - \sum_{i=1}^{c} (P_i)^2$
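A small sketch computing entropy and the Gini index for a set of class labels (illustrative, not from the slides):

import numpy as np

def entropy(labels):
    # class probabilities from the label counts
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return np.sum(-p * np.log2(p))

def gini(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1 - np.sum(p ** 2)

print(entropy(['yes', 'yes', 'no', 'no']), gini(['yes', 'yes', 'no', 'no']))  # 1.0 and 0.5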
DecisionTreeClassifier Class
• Parameters, attributes, and methods of sklearn's DecisionTreeClassifier class
DT Classifier Model
#importing required packages
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score,confusion_matrix, classification_report
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
import seaborn as sns
DT Classifier Model
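The slide that loads the data and fits the decision tree is not preserved here; a minimal sketch of those steps, assuming the iris dataset (as in the imports above) and default parameters:

# loading iris and splitting into train/test sets (split ratio is illustrative)
iris_ds = load_iris()
x, y = iris_ds.data, iris_ds.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)

# fitting the decision tree and predicting on the test set
dt_model = DecisionTreeClassifier(random_state=0)
dt_model.fit(x_train, y_train)
y_pred = dt_model.predict(x_test)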
# classification report
print(classification_report(y_test, y_pred))
#displaying tree
plt.figure(figsize=(20,20))
plot_tree(dt_model, class_names=['one', 'two', 'three'])
plt.show()
Activity
• Build a decision tree model on the digits dataset from the sklearn package
Decision Tree
• Advantages:
• Easy to read and interpret
• Less data cleaning required
• Disadvantages:
• Easily Overfits
• Unstable nature
Decision Tree
• Avoiding Overfitting:
• Pruning (see the sketch after this list)
• Pre-Pruning
• Post-Pruning
• Ensemble
• Random Forest
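As a sketch, pre-pruning can be expressed through sklearn constructor parameters such as max_depth and min_samples_leaf, and post-pruning through cost-complexity pruning (ccp_alpha); the parameter values below are illustrative and x_train, y_train are reused from the walkthrough above:

from sklearn.tree import DecisionTreeClassifier

# pre-pruning: limit the tree while it grows
pre_pruned = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5)

# post-pruning: grow fully, then prune using cost-complexity pruning
path = DecisionTreeClassifier().cost_complexity_pruning_path(x_train, y_train)
post_pruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2])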
Naïve Bayes
• It is a supervised machine learning algorithm based on Bayes' theorem
• It is mainly used for classification problems on high-dimensional datasets
• It is a probabilistic model
• Most of the time it is used for text classification
• Naïve means assuming that all features are independent of each other
• It uses Bayes' theorem (law)
• Bayes' law is as follows:
• $P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}$
• Where
• P(A|B) is posterior probability
• P(B|A) is likelihood probability
• P(A) is prior probability
• P(B) is marginal probability
• Types of Naïve Bayes:
• Gaussian
• The features follow a normal distribution
• Multinomial
• The data follows a multinomial distribution
• Bernoulli
• Similar to multinomial, but the features have boolean values
GaussianNB Class
• Attributes and methods of sklearn's GaussianNB class
GaussianNB Model
# importing required libraries
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score
from sklearn.model_selection import train_test_split
# loading the iris dataset (this step is reconstructed) and splitting into x and y
iris_ds = load_iris()
x = iris_ds.data
y = iris_ds.target
# splitting and fitting the Gaussian naive Bayes model (these steps are
# reconstructed; the split ratio is illustrative)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
gnb_model = GaussianNB().fit(x_train, y_train)
# evaluating the model
y_pred = gnb_model.predict(x_test)
acc_score = accuracy_score(y_test, y_pred)
Support Vector Machine
• Polynomial Kernel: The polynomial kernel function transforms the data into a higher-dimensional space using a polynomial function.
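As a sketch, the kernel is chosen through SVC's kernel parameter (the values below are illustrative):

from sklearn.svm import SVC

# polynomial kernel of degree 3 vs. the default RBF kernel
poly_svc = SVC(kernel='poly', degree=3)
rbf_svc = SVC(kernel='rbf')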
SVC Walkthrough
SVC
#importing required libraries
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.metrics import classification_report
125
3. Classification
SVC Walkthrough
SVC
# loading the breast cancer dataset (this step is reconstructed) and getting x and y
cancer_ds = datasets.load_breast_cancer()
x = cancer_ds.data
y = cancer_ds.target
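The slides that split the data and fit the SVC model are not preserved; a minimal sketch of those steps (the split ratio and the default kernel are assumptions):

from sklearn.model_selection import train_test_split

# splitting, fitting, and predicting
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
svc_model = SVC()
svc_model.fit(x_train, y_train)
y_pred = svc_model.predict(x_test)
cls_rpt = classification_report(y_test, y_pred)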
print(cls_rpt)
Ensemble Techniques
• Ensemble techniques are machine learning methods that combine multiple models to
improve the accuracy and robustness of the predictions.
• The predictions of these models are combined to make the final prediction.
• Boosting:
• A sequence of models is trained on the same data, with each model focusing on
the samples that the previous model got wrong.
• The predictions of these models are combined to make the final prediction.
• Stacking:
• The predictions of multiple models are combined using another model, called a meta-model, to make the final prediction.
• Ensemble techniques can improve the accuracy and robustness of the predictions, reduce
overfitting, and handle noisy or missing data.
• However, they can also increase the complexity and computational cost of the model.
• Ensemble techniques:
• Random Forest:
• Random Forest is a type of bagging technique
• Gradient Boosting
• Gradient Boosting is a type of boosting technique
• AdaBoost:
• AdaBoost is a type of boosting technique
Random Forest
# loading the breast cancer dataset (this step is reconstructed) and getting x and y
cancer_ds = datasets.load_breast_cancer()
x = cancer_ds.data
y = cancer_ds.target
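The slides that import, split, and fit the random forest are not preserved; a minimal sketch of those steps (parameter values are assumptions):

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# splitting, fitting, and predicting
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
rf_model = RandomForestClassifier(n_estimators=100, random_state=0)
rf_model.fit(x_train, y_train)
y_pred = rf_model.predict(x_test)
cls_rpt = classification_report(y_test, y_pred)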
print(cls_rpt)
Gradient Boosting
# Loading cancer dataset from sklearn package
cancer_ds = datasets.load_breast_cancer()
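The slides that split the data and fit the gradient boosting model are not preserved; a minimal sketch of those steps (parameter values are assumptions):

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# getting x and y, splitting, fitting, and predicting
x, y = cancer_ds.data, cancer_ds.target
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=0)
gb_model = GradientBoostingClassifier(random_state=0)
gb_model.fit(x_train, y_train)
y_pred = gb_model.predict(x_test)
cls_rpt = classification_report(y_test, y_pred)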
print(cls_rpt)
4. Clustering
Clustering
• Clustering is an unsupervised machine learning technique
• It is used for grouping similar data points together based on their characteristics or features
• The goal of clustering is to find natural groups or clusters in the data, without prior knowledge of the group labels
• Clustering algorithms typically operate by measuring the similarity between data points and assigning them to groups based on their similarity
• Types of clustering algorithms:
• K-means clustering:
• It partitions the data into k clusters based on their similarity.
• Hierarchical clustering:
• It creates a hierarchy of clusters by recursively merging or splitting clusters based
on their similarity.
• Density-based clustering
• It identifies clusters based on areas of high density in the data.
K-means clustering
• The goal of k-means clustering is to partition a set of observations into k clusters in such a way that the points within each cluster are as similar as possible.
• The points across different clusters are as dissimilar as possible.
• The k-means algorithm works by randomly initializing k cluster centers, and then iteratively assigning each data point to the nearest cluster center based on its distance.
• The algorithm then re-computes the cluster centers based on the new assignments, and repeats the process until convergence.
K-Means Walkthrough
K-Means
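The code slides for this walkthrough are only partially preserved; below is a minimal sketch, assuming a dataset generated with make_blobs and the elbow method used to pick k (the variable names centers and i_wss are kept to match the fragments that follow):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from matplotlib import pyplot as plt

# generating a toy dataset with two blobs
x, _ = make_blobs(n_samples=500, centers=[[1, 1], [3, 3]], cluster_std=0.4, random_state=0)

# elbow method: within-cluster sum of squares (inertia) for k = 1..10
centers = range(1, 11)
i_wss = []
for k in centers:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(x)
    i_wss.append(km.inertia_)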
# plotting the number of clusters against the within-cluster sum of squares (elbow method)
plt.plot(centers, i_wss)
plt.show()
# fitting the final model (k=2 chosen from the elbow plot; this step is reconstructed)
k_means = KMeans(n_clusters=2, n_init=10, random_state=0).fit(x)
# Visualizing clusters
plt.scatter(x[:, 0], x[:, 1], c=k_means.labels_)
plt.show()
Hierarchical clustering
• Hierarchical clustering starts with each data point as a separate cluster and then iteratively
merges clusters based on the distance between them, until all data points are contained in a
single cluster.
• Agglomerative clustering :
• Agglomerative clustering starts with each data point
as a separate cluster and iteratively merges the
closest pairs of clusters until all data points are
contained in a single cluster.
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.datasets import make_blobs, make_circles
from matplotlib import pyplot as plt
#Generating Dataset
centers = [[1, 1], [3, 3]]
ds1 = make_blobs(n_samples=500, centers=centers, cluster_std=0.4, random_state=0)
ds2 = make_circles(n_samples=500, noise=0.1, factor=0.2)
# fitting agglomerative clustering on the circles dataset (ds2)
agg_clstr = AgglomerativeClustering(n_clusters=2)
x = ds2[0]
agg_clstr.fit(x)
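A short visualization of the resulting clusters (illustrative, mirroring the K-means plot above):

# colouring each point by its assigned cluster label
plt.scatter(x[:, 0], x[:, 1], c=agg_clstr.labels_)
plt.show()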
Density-based clustering
• Density-based clustering is a clustering technique that identifies
clusters based on the density of data points in the feature space.
• The key parameter in density-based clustering is the minimum number of data points required to
form a cluster, known as the minimum cluster size or the minimum points threshold.
• DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is the one popular
density-based clustering algorithm.
• DBSCAN works by defining a radius around each data point and counting the number of data
points within that radius.
• A point is considered to be a core point if there are at least a specified minimum number of points
(the minimum points threshold) within its radius.
• If a point is not a core point but is within the radius of a core point, it is considered a border
point. All other points that do not meet either of these criteria are classified as noise points.
• DBSCAN then forms clusters by connecting core points that are within each other's radius, and
any border points that are within the radius of a core point.
• DBSCAN also allows for the detection of noise points, which are data points that do not belong to any cluster.
DBSCAN Walkthrough
DBSCAN
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.datasets import make_blobs, make_circles
from matplotlib import pyplot as plt
#Generating Dataset
centers = [[1, 1], [3, 3]]
ds1 = make_blobs(n_samples=500, centers=centers, cluster_std=0.4, random_state=0)
ds2 = make_circles(n_samples=500, noise=0.1, factor=0.2)
# fitting DBSCAN on the circles dataset (ds2); eps is the neighbourhood radius
dbs = DBSCAN(eps=0.2, min_samples=5)
x = ds2[0]
dbs.fit(x)
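A short visualization of the DBSCAN result (illustrative, mirroring the earlier cluster plots):

# colouring each point by its DBSCAN label (-1 marks noise points)
plt.scatter(x[:, 0], x[:, 1], c=dbs.labels_)
plt.show()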
5. Principal Component Analysis
• Standardize the data by subtracting the mean and dividing by the standard deviation.
• Calculate the covariance matrix of the standardized data.
• Calculate the eigenvectors and eigenvalues of the covariance matrix.
• Choose the first k eigenvectors with the largest eigenvalues to form the basis of the
lower-dimensional subspace.
• Multiply the standardized data by the eigenvectors
• Select the k components (a numpy sketch of these steps follows below)
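A numpy sketch of these steps (illustrative, separate from the sklearn walkthrough that follows):

import numpy as np
from sklearn import datasets

x = datasets.load_iris().data                  # (n_samples, n_features)
x_std = (x - x.mean(axis=0)) / x.std(axis=0)   # 1. standardize
cov = np.cov(x_std, rowvar=False)              # 2. covariance matrix
eig_vals, eig_vecs = np.linalg.eigh(cov)       # 3. eigenvalues / eigenvectors
order = np.argsort(eig_vals)[::-1]             # 4. largest eigenvalues first
k = 2                                          # number of components to keep
x_pca = x_std @ eig_vecs[:, order[:k]]         # 5. project onto the chosen eigenvectors
print(x_pca.shape)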
PCA Walkthrough
PCA
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# loading iris, standardizing it, and projecting onto the first two principal components
iris_ds = datasets.load_iris()
x = iris_ds.data
pca_2 = PCA(n_components=2)
x_std = StandardScaler().fit_transform(x)
pca_2_x = pca_2.fit_transform(x_std)