Chapter 5
Machine Learning
Khalid Ahmed AlAfandy
https://orcid.org/0000-0003-1465-4446
ENSA, Abdelmalek Essaadi University, Morocco
Hicham Omara
Abdelmalek Essaadi University, Morocco
Mohamed Lazaar
ENSIAS, Mohammed V University in Rabat, Morocco
Mohammed Al Achhab
NTT, ENSATE, Abdelmalek Essaadi University, Tetouan, Morocco
ABSTRACT
This chapter provides a comprehensive explanation of machine learning: an introduction, history, theory and types, problems, and how these problems can be solved. It then presents some of the machine learning algorithms most used in image classification, ending with the evaluation metrics used to assess the performance of learning models. The open-source libraries mentioned in this chapter ease the coding needed to build any learning model with machine learning.
INTRODUCTION
Advances in machine learning and deep learning are causing a paradigm shift in almost every field (Shinde & Shah, 2018; Bowling et al., 2006). The relationship between artificial intelligence, machine learning, and deep learning is depicted in Figure 1.
Machine learning is a branch of artificial intelligence that relies on using real data to train computers, qualifying them to make strong predictions for a particular data type, as a human expert would, without explicit programming. That is, it is a process of importing dataset features and exporting output classes or results, depending on the type of machine learning algorithm used. These data features can be linear data, images, videos, audio, or any other data type used in human life. So, machine learning is an attempt to make computers learn as humans do, using data commonly found in our lives (Alpaydin, 2020). Machine learning can be divided into two types: supervised and unsupervised. Supervised machine learning is based on training computers with given data that have known correct outputs, whereas unsupervised machine learning builds learning models from given data without outputs or clear results.
Supervised machine learning can be categorized into two branches: classification and regression. A regression problem is based on predictions within continuous outputs, where there is a relation between inputs and outputs through a continuous function. A classification problem is based on predictions with discrete outputs, where the outputs are limited to two or more known categories or classes (Alloghani et al., 2020). A learning model must rely on a mathematical model, called the hypothesis function, which differs according to the learning type. In supervised machine learning, a cost function or loss function, built according to the mathematical model used, must be employed throughout the training process; it must be minimized to achieve a high-accuracy prediction model. The lowest cost value is achieved by updating the model parameters. Thus, the main goal in building a high-performance learning model is selecting the model parameter values that result in the lowest cost function. There are two main problems that can occur in learning models, over-fitting and under-fitting, and there are several ways to solve them (Alpaydin, 2020; Alloghani et al., 2020).
This chapter outlines machine learning's history, types, and problems, and how these problems can be solved; it then presents some of the most used machine learning algorithms and the open-source libraries that ease the coding needed to build any learning model, ending with the evaluation metrics used in learning model performance assessment.
Figure 1. The relation between artificial intelligence, machine learning, and deep learning (Shinde & Shah, 2018; Bowling et al., 2006)
Machine learning started in 1943 with the first mathematical model of neural networks, presented in (Mcculloch & Pitts, 1990). In 1950, Arthur Samuel developed a computer program for playing checkers; he initiated alpha-beta pruning, which measures the chance of winning, to overcome the low memory available at the time, and then designed the scoring function called the minimax algorithm, which is used in game programming to this day (ElNaqa & Murphy, 2015). Arthur Samuel defined machine learning in 1959 as "the field of study that gives computers the ability to learn without being explicitly programmed" (ElNaqa & Murphy, 2015). In 1965, Aleksei Ivakhnenko and Valentin Lapa created a hierarchical representation of polynomial activation function neural networks trained with the GMDH. It is widely regarded as the first multi-layer perceptron, and Ivakhnenko is frequently referred to as the "Father of Deep Learning" (Ivakhnenko & Lapa, 1966). In 1979, Kunihiko Fukushima created a hierarchical multilayered network for pattern recognition that served as inspiration for convolutional neural networks (Fukushima et al., 1983). In 1998, the problem of learning was defined by Tom Mitchell: "a computer program is said to learn from experience E with respect to some task T and some performance measure P" (Mitchell, 2006). ImageNet, a vast database of labeled images, was founded by Fei-Fei Li in 2009 (J. Deng et al., 2009).
The machine learning process is the operation of importing dataset features and exporting output classes or results, depending on the type of machine learning algorithm used. Machine learning algorithms are divided into two categories: supervised and unsupervised (Khanum et al., 2015). Figure 2 shows the types of machine learning algorithms.
Supervised learning relies on a given dataset with known correct outputs, where there is relevance between the input and the output. Supervised learning models are sorted into regression and classification models. A regression model tries to predict results within a continuous output; that is, it tries to map input variables to some continuous function. A classification model tries to predict results in a discrete output; that is, it tries to map input variables into discrete categories or classes (Sen et al., 2020).
Unsupervised machine learning counts on given data that do not have any labels, so the learning algorithm attempts to find some structure in the given data. After the learning algorithm succeeds in structuring the unlabeled data, it groups these data into separate clusters. Hence, the main unsupervised machine learning algorithm is called the clustering algorithm (Smola & Vishwanathan, 2008). The k-means algorithm is the most commonly used clustering algorithm. It is based on dividing m objects into k clusters where each object is affiliated with the cluster with the nearest mean. This approach ideally results in k distinct clusters with the greatest possible variation. The best number of clusters k that leads to the greatest variation is not known a priori, so it must be reckoned from the data. Given the dataset input features X = {x_1, …, x_m}, the goal of the k-means algorithm is to structure the dataset into k clusters such that every point in a cluster is more similar to the points from its own cluster than to the points from other clusters. To achieve this goal, define the prototype vectors μ_1, …, μ_k and an indicator variable r_ij which is 1 if and only if x_i is assigned to cluster j. To cluster the dataset, we minimize the following objective function J(r, μ), which minimizes the distance of each point from its prototype vector (Smola & Vishwanathan, 2008).
J(r, μ) = (1/2) Σ_{i=1}^{m} Σ_{j=1}^{k} r_ij ‖x_i − μ_j‖²  (1)

where r = {r_ij}, μ = {μ_j}, and ‖·‖ denotes the usual Euclidean norm.
To achieve a high-performance learning model, the value of the objective function J(r, μ) in (1) must be minimized; to build this model, the r and μ values must be found. In practice, it is very difficult to minimize J(r, μ) with respect to both r and μ simultaneously, so a two-stage strategy must be adopted. The first stage is to determine r while fixing μ: the nearest prototype for x_i is found by setting r_ij = 1 if
j = argmin_{j′} ‖x_i − μ_j′‖²  (2)

and r_ij = 0 otherwise. The second stage is to determine μ while fixing r, by setting the derivative of J with respect to μ_j to zero:

Σ_i r_ij (x_i − μ_j) = 0  (3)

which gives

μ_j = Σ_i r_ij x_i / Σ_i r_ij  (4)
where Σ_i r_ij counts the number of points assigned to cluster j, so μ_j is simply set to the sample mean of the points assigned to cluster j. Figure 3 shows unsupervised learning (Smola & Vishwanathan, 2008).
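To make the two-stage strategy concrete, the following minimal Python/NumPy sketch alternates the assignment step (2) and the mean-update step (4); the iteration count, the random-point initialization, and the toy data are illustrative assumptions, not part of the source.

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    # Minimal k-means sketch following Eqs. (1)-(4)
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # initial prototypes
    for _ in range(n_iters):
        # Stage 1: fix mu, assign each point to its nearest prototype (Eq. 2)
        dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        r = dists.argmin(axis=1)
        # Stage 2: fix r, move each prototype to its cluster mean (Eq. 4)
        mu = np.array([X[r == j].mean(axis=0) if np.any(r == j) else mu[j]
                       for j in range(k)])
    return r, mu

# Toy usage: two well-separated blobs should recover two clusters
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
labels, centers = kmeans(X, k=2)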
Supervised machine learning is based on given data with known correct results or outputs. It can be categorized into two types: regression and classification. In regression learning models, the given dataset inputs and outputs must have a mathematical relation where the outputs are predicted within a continuous function. In classification learning models, the inputs and outputs must also have a mathematical relation, but the outputs are predicted within a discrete function. In both types a mathematical model, called the hypothesis function h_θ(x), is necessary; it can differ according to the learning model type (regression or classification), the dataset used, and the learning algorithm. The model parameters {θ_0, …, θ_n} must be well selected and updated according to the dataset input features {x_1, …, x_n} to achieve the lowest cost or loss value, calculated using the cost function J (Verdhan, 2020). The main difference between regression and classification algorithms is the hypothesis function; the cost function and the updating of model parameters are almost the same. Figure 4 shows supervised learning (regression and classification).
In regression, the hypothesis function can be calculated for each input in the
dataset inputs by using the model parameters and the input features. This hypothesis
function can be a linear function or any other mathematical function according to the
nature of the dataset used (Gutenbrunner et al., 1993; Shanthamallu et al., 2017). It can be calculated by (Gutenbrunner et al., 1993; Shanthamallu et al., 2017):

h_θ(x) = θ_0 + θ_1 x_1 + θ_2 x_2 + … + θ_n x_n  (5)

where h_θ(x) is the hypothesis function, {θ_0, …, θ_n} are the model parameters, and {x_1, …, x_n} are the n input features of a dataset input.
It can be reformulated in matrix multiplication form as (Gutenbrunner et al., 1993; Shanthamallu et al., 2017):

h_θ(x) = [θ_0 θ_1 θ_2 … θ_n] [1 x_1 x_2 … x_n]^T = θ^T x  (6)
In classification, the hypothesis function can be calculated for each input in the dataset using the model parameters and the input features. The most used hypothesis function in classification is the sigmoid function (Shanthamallu et al., 2017). The hypothesis function can be calculated by (Shanthamallu et al., 2017):

h_θ(x) = g(z) = 1 / (1 + e^{−z})  (7)

z = θ^T x  (8)
The sigmoid function g(z) outputs a real number in the (0, 1) interval, as shown in Figure 5. If the output is less than 0.5, the classification output is 0; if the output is greater than or equal to 0.5, the classification output is 1. In the case of more than two classes, where y = {0, 1, 2, …, n}, the problem is divided into n+1 binary classification problems; among the class prediction outputs, the highest probability for a class means that y belongs to that class (Shanthamallu et al., 2017; Zhao et al., 2010).
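As a small illustration (not from the source), the sigmoid hypothesis of (7)-(8) and the 0.5 decision threshold can be written directly:

import numpy as np

def sigmoid(z):
    # Eq. (7): g(z) = 1 / (1 + e^(-z)), with outputs in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def predict(theta, x):
    # Eqs. (7)-(8): class 1 if g(theta^T x) >= 0.5, else class 0
    return int(sigmoid(theta @ x) >= 0.5)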
Through the training stage in supervised machine learning, two outputs are utilized to build the learning model: the correct, predefined dataset outputs and the hypothesis outputs; the cost function is based on these two outputs. Thus, the cost function for regression learning can be built using the hypothesis function and the dataset's known correct outputs as (F. Lubis et al., 2014):
J_regression = (1/2m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)})²  (9)

Adding the regularization term gives:

J_regression = (1/2m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)})² + (λ/2m) Σ_{j=1}^{n} θ_j²  (10)
where λ (lambda) is the regularization parameter; it determines how much the costs of the theta parameters θ are inflated. Note that the selection of the λ value is based on intuition.
The cost function for classification learning uses the log function because of the
sigmoid function. It can be built by (Zhao et al., 2010):
J_classification = −(1/m) Σ_{i=1}^{m} [y^{(i)} log(h_θ(x^{(i)})) + (1 − y^{(i)}) log(1 − h_θ(x^{(i)}))]  (11)
The regularization term can be added as done in regression in (10) (Zhao et al., 2010):
J_classification = −(1/m) Σ_{i=1}^{m} [y^{(i)} log(h_θ(x^{(i)})) + (1 − y^{(i)}) log(1 − h_θ(x^{(i)}))] + (λ/2m) Σ_{j=1}^{n} θ_j²  (12)
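A minimal sketch of both regularized costs follows, assuming X carries a leading column of ones (so θ_0 is the bias) and, as in (10) and (12), θ_0 is not regularized; the function names are illustrative.

import numpy as np

def j_regression(theta, X, y, lam=0.0):
    # Eq. (10); lam = 0 reduces it to Eq. (9)
    m = len(y)
    err = X @ theta - y
    reg = (lam / (2 * m)) * np.sum(theta[1:] ** 2)  # skip theta_0
    return (err @ err) / (2 * m) + reg

def j_classification(theta, X, y, lam=0.0):
    # Eq. (12); lam = 0 reduces it to Eq. (11)
    m = len(y)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    reg = (lam / (2 * m)) * np.sum(theta[1:] ** 2)
    return -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m + reg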
To achieve the lowest cost value, the model parameters that realize the minimum loss must be estimated. Training begins by initializing these parameters with random values, which cannot obtain the minimum loss for the learning model, so the parameters must be updated to reach the targeted minimum loss value. This updating is done using gradient descent, which iterates the process until the minimum loss is reached (Ruder, 2016). The parameters are updated by (Ruder, 2016):
θ_j = θ_j − α ∂J/∂θ_j  (13)
where α is the learning rate, J is the cost function value, and θ_j ∈ {θ_0, …, θ_n}. Note that the selection of the α value is based on intuition; it is preferred to select a small value less than 1 (Ruder, 2016). Using (9), the updated parameters for regression learning can be calculated by (Ruder, 2016):
θ_j = θ_j − α ∂/∂θ_j [(1/2m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)})²]  (14)
θ_j = θ_j − α (1/m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)}) ∂h_θ(x^{(i)})/∂θ_j  (15)

θ_j = θ_j − (α/m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)}) x_j^{(i)}  (16)

Similarly, using (11), the updated parameters for classification learning can be calculated by (Ruder, 2016):
θ_j = θ_j − (α/m) ∂/∂θ_j Σ_{i=1}^{m} [−y^{(i)} log(h_θ(x^{(i)})) − (1 − y^{(i)}) log(1 − h_θ(x^{(i)}))]  (17)

Evaluating this derivative yields the same update form as in regression:

θ_j = θ_j − (α/m) Σ_{i=1}^{m} (h_θ(x^{(i)}) − y^{(i)}) x_j^{(i)}  (18)
In summary, a supervised learning model is built and assessed through the following steps (a minimal code sketch follows the list):

1. Divide the dataset into a training set (80% of the dataset records) and a test set (20% of the records).
2. Initialize the model parameters to random values.
3. Calculate the hypothesis function using the model parameters and the training dataset input features.
4. Calculate the cost function, whether using regularization or not.
5. Update the model parameter values using gradient descent.
6. Repeat steps 3 to 5 until the lowest possible cost value is achieved.
7. Calculate the hypothesis function using the final model parameter values and the test dataset input features.
8. Compare the hypothesis outputs with the test dataset's known correct results to assess the learning model performance.
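The sketch below walks through steps 1-8 for a logistic-regression classifier; the learning rate, iteration count, and split ratio are assumptions for illustration, not prescribed by the source.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def build_model(X, y, alpha=0.1, n_iters=2000):
    m, n = X.shape
    X = np.hstack([np.ones((m, 1)), X])          # add bias feature x0 = 1
    split = int(0.8 * m)                         # step 1: 80/20 split
    X_tr, y_tr = X[:split], y[:split]
    X_te, y_te = X[split:], y[split:]
    theta = np.random.randn(n + 1) * 0.01        # step 2: random init
    for _ in range(n_iters):                     # step 6: repeat steps 3-5
        h = sigmoid(X_tr @ theta)                # step 3: hypothesis, Eq. (7)
        eps = 1e-12                              # avoid log(0) in the cost
        cost = -np.mean(y_tr * np.log(h + eps)
                        + (1 - y_tr) * np.log(1 - h + eps))  # step 4, Eq. (11)
        theta -= alpha * (X_tr.T @ (h - y_tr)) / len(y_tr)   # step 5, Eq. (18)
    preds = sigmoid(X_te @ theta) >= 0.5         # step 7: test hypotheses
    return theta, np.mean(preds == y_te)         # step 8: test accuracy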
The over-fitting problem occurs when the learning model fits the training data almost perfectly after the training process but cannot predict the test data well. In this problem, the training set cost function value will be low while the validation set cost function value will be much greater than the training set cost function value. There are three main solutions to this problem. The first solution is reducing the training features; the features to be removed must be carefully selected so as not to affect the data required for training. This selection can be done manually or by a model selection algorithm. The second solution is increasing the training set data, either by getting more training data or by using data augmentation. Data augmentation selects some data and adds the selected data as new data after applying mathematical or spatial modifications to it, as explained in the Data Augmentation section below. The third solution is regularization, where the regularization term is added to the cost function calculations through the training process (Hawkins, 2004).
The under-fitting problem occurs when the learning model fails to predict even the training data well after the training process. In this problem, both the training set and the validation set cost function values will be high. There are three main solutions to this problem. The first solution is increasing the training features, which can be done by selecting some features and adding them as new features after a mathematical modification such as squaring or any other mathematical function. The second solution is increasing the training iterations, since stopping training too soon can also result in an under-fit learning model. The third solution is decreasing the regularization parameter λ value if this value is high (Jabbar & Khan, 2015).
DATA AUGMENTATION
One way of preventing over-fitting problems is increasing the training data, but sometimes no additional data is available for training, or providing additional data has a high cost; data augmentation is the solution in this situation. Data augmentation, which is commonly used in computer vision, is a technique for increasing the amount of data by adding significantly changed copies of either existing data or new synthetic data derived from existing data (Shorten & Khoshgoftaar, 2019). The most well-known sort of data augmentation is image data augmentation, which entails transforming images in the training dataset into altered copies that belong to the same class as the original image. Transforms include shifts, flips, zooms, color modification, random cropping, rotation, noise injection, and many other operations from the field of image editing. Typically, image data augmentation is applied only to the training dataset, not the validation or test datasets. This differs from data preparation tasks like image resizing and pixel scaling, which must be carried out uniformly across all datasets that interact with the model (Shorten & Khoshgoftaar, 2019).
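As a minimal sketch (library pipelines such as those in TF offer far richer transforms), random flips, rotations, and noise injection can be applied to a training image; the transform choices and noise level here are illustrative assumptions.

import numpy as np

def augment(image, rng=None):
    # Return a randomly transformed copy of an (H, W, C) uint8 image
    rng = rng or np.random.default_rng()
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                       # horizontal flip
    out = np.rot90(out, k=int(rng.integers(0, 4)))  # random 90-degree turn
    out = out + rng.normal(0.0, 5.0, out.shape)     # noise injection
    return np.clip(out, 0, 255).astype(image.dtype)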
The Naïve Bayes Algorithm

The Naïve Bayes algorithm is a machine learning algorithm which acts as a classifier. This classifier is based on the Bayes theorem, one of the fundamental probability theorems. The Bayes theorem can be represented with a simple mathematical formula as (19) (Taheri & Mammadov, 2013):

P(A|B) = P(B|A) P(A) / P(B)  (19)
where P(A|B) is the probability of event A occurring given that B is true, P(B|A)
is the probability of event B occurring given that A is true, and P(A) and P(B) are
the probabilities of observing A and B respectively without any given condition.
So, if we have a given dataset to build a learning model using the Naïve Bayes algorithm, with input X having input features (x_1, x_2, x_3, …, x_n), where X = (x_1, x_2, x_3, …, x_n), and output y, the mathematical model of the Naïve Bayes learning algorithm can be represented as (Taheri & Mammadov, 2013):
P(y|X) = P(X|y) P(y) / P(X)  (20)
Using the input features, the Naïve Bayes learning algorithm can be represented mathematically, for n features, as (Taheri & Mammadov, 2013):

P(y|x_1, x_2, x_3, …, x_n) = [P(x_1|y) P(x_2|y) P(x_3|y) … P(x_n|y) P(y)] / [P(x_1) P(x_2) P(x_3) … P(x_n)]  (21)
By looking at the dataset and substituting the values into the equation, the values for each term can be obtained. The denominator does not change for any of the entries in the dataset; it remains constant. As a result, the denominator can be eliminated and proportionality can be injected as (Taheri & Mammadov, 2013):
P(y|x_1, x_2, x_3, …, x_n) ∝ P(y) ∏_{i=1}^{n} P(x_i|y)  (22)
The output class (the value of the y variable) is the one with the maximum probability, as (Taheri & Mammadov, 2013):
y = argmax_y P(y) ∏_{i=1}^{n} P(x_i|y)  (23)
There are three main types of Naïve Bayes classifier: the Multinomial Naïve Bayes, the Bernoulli Naïve Bayes, and the Gaussian Naïve Bayes (Singh et al., 2019; T. Wang & W. Li, 2010).
The Multinomial Naïve Bayes classifier is usually used for document classifications.
It is used for discrete counts (Singh et al., 2019).
The Bernoulli Naïve Bayes classifier is useful in binary classification. One of its most used applications is text classification using a 'bag of words' paradigm, in which the 1s and 0s represent "word appears in the document" and "word does not appear in the document," respectively (Singh et al., 2019).
The Gaussian Naïve Bayes classifier works by using a Gaussian distribution for the continuous values associated with each feature. So, the feature likelihood is considered to be Gaussian, and the conditional probability can be calculated by (T. Wang & W. Li, 2010):

P(X) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}  (24)

where μ is the mean value and σ is the standard deviation value of the X features; they can be given by (T. Wang & W. Li, 2010):

μ = (1/n) Σ_{i=1}^{n} x_i  (25)

σ = [(1/(n−1)) Σ_{i=1}^{n} (x_i − μ)²]^{0.5}  (26)
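A compact sketch of the Gaussian classifier described by (23)-(26) follows, using per-class feature means and standard deviations and an argmax over posteriors; working in log-space is an implementation choice (to avoid numerical underflow), not part of the source.

import numpy as np

class GaussianNaiveBayes:
    def fit(self, X, y):
        # Per-class priors P(y) and Gaussian parameters (Eqs. 25-26)
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.sigma = np.array([X[y == c].std(axis=0, ddof=1) for c in self.classes])
        self.prior = np.array([np.mean(y == c) for c in self.classes])
        return self

    def predict(self, X):
        # log P(y) + sum_i log P(x_i|y) with the Gaussian likelihood (Eq. 24);
        # the argmax over classes implements Eq. (23)
        log_lik = (-0.5 * ((X[:, None, :] - self.mu) / self.sigma) ** 2
                   - np.log(self.sigma * np.sqrt(2.0 * np.pi)))
        log_post = np.log(self.prior) + log_lik.sum(axis=2)
        return self.classes[log_post.argmax(axis=1)]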
The KNN Algorithm

The KNN algorithm is a supervised machine learning algorithm that can be used for classification and regression (S. Zhang et al., 2017). It is based on calculating the distances between a query and all the examples in the data, selecting the specified number K of examples closest to the query, and then voting for the most frequent label (in the case of classification) or averaging the labels (in the case of regression) (S. Zhang et al., 2017; A. Lubis & M. Lubis, 2020). The K value is determined by iteration and testing, but it can depend on the neighbors: K can have a high value in the case of more neighbors and a small value in the case of fewer neighbors. Note that in the case of K = N, where N is the number of classes, the over-fitting problem can occur (S. Zhang et al., 2017).
The Euclidean distance D_e(x, y) is the square root of the sum of the squared differences between the coordinates of n points, as (A. Lubis & M. Lubis, 2020; Wu et al., 2002):

D_e(x, y) = √(Σ_{i=1}^{n} (x_i − y_i)²)  (27)
The Manhattan distance D_m(x, y) is the sum of the absolute values of the differences between the coordinates of n points, as (A. Lubis & M. Lubis, 2020):

D_m(x, y) = Σ_{i=1}^{n} |x_i − y_i|  (28)
The Hamming distance D_h(x, y) counts the coordinates at which two given points differ, as (A. Lubis & M. Lubis, 2020; Wu et al., 2002):

D_h(x, y) = Σ_{i=1}^{n} |x_i − y_i|  (29)

with |x_i − y_i| = 0 if x_i = y_i, and 1 if x_i ≠ y_i  (30)
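A short sketch of KNN classification with the Euclidean distance of (27) follows; the K value and the majority-vote tie-breaking are assumptions for illustration.

import numpy as np
from collections import Counter

def knn_classify(X_train, y_train, query, k=5):
    # Distances from the query to every training example (Eq. 27)
    dists = np.sqrt(((X_train - query) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]           # the K closest examples
    votes = Counter(np.asarray(y_train)[nearest])
    return votes.most_common(1)[0][0]         # most frequent label wins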
The DT Algorithm
Accuracy depends on the tree design and the feature selection (Farid et al., 2014).
The SVM Algorithm

The SVM classifier's cost function can be derived from the regularized logistic regression cost in (12) as (Chauhan et al., 2019; Vapnik, 2000; Koda et al., 2018):
J_SVM = −Σ_{i=1}^{m} [y^{(i)} log(h_θ(x^{(i)})) + (1 − y^{(i)}) log(1 − h_θ(x^{(i)}))] + (λ/2) Σ_{j=1}^{n} θ_j²  (31)
Multiplying the two terms (cost and regularization) of (31) by C, where C = 1/λ is called the penalty parameter of the SVM classifier model, gives (Chauhan et al., 2019; Vapnik, 2000; Koda et al., 2018):
J_SVM = −C Σ_{i=1}^{m} [y^{(i)} log(h_θ(x^{(i)})) + (1 − y^{(i)}) log(1 − h_θ(x^{(i)}))] + (1/2) Σ_{j=1}^{n} θ_j²  (32)
So the hypothesis function for the SVM classifier can be represented as (Chauhan et al., 2019; Vapnik, 2000; Koda et al., 2018):

h_θ(x) = 1 if θ^T x ≥ 0, and 0 if θ^T x < 0  (33)
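In practice the SVM classifier is rarely hand-coded; a minimal Scikit-learn sketch using the penalty parameter C discussed above might look as follows. The toy data, the C value, and the 80/20 split are assumptions for illustration.

from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Hypothetical toy data: 200 samples, 4 features, 2 classes
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# C is the penalty parameter of Eq. (32); a linear kernel matches theta^T x
clf = SVC(C=1.0, kernel="linear").fit(X[:160], y[:160])
print(clf.score(X[160:], y[160:]))  # accuracy on the held-out 20%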
The ANNs
The ANNs algorithm is a machine learning approach which can act as a supervised or unsupervised machine learning algorithm; it can be utilized as a regressor or a classifier too. The use of ANNs as a supervised classifier relies on the form of biological neural networks; the ANNs approach can be defined as algorithms that attempt to imitate the human brain (Shanmuganathan, 2016; AlAfandy et al., 2019). The structure of an ANN depends on the information that flows through the network. ANNs are deemed nonlinear applied mathematical information modeling tools whenever complicated relationships between inputs and outputs are forged. ANNs are formed of a sequence of layers; every layer contains a set of neurons. The input layer is the first layer and the output layer is the last layer; the internal layers are treated as hidden layers. Neurons in the preceding and succeeding layers are connected by weighted connections known as the weights (Srivastava et al., 2012). The accuracy and performance of ANNs depend heavily on the network structure and the hyper-parameter values. The ANN processing rate is high, but the network takes an enormous time for training and also needs a huge memory with advanced hardware; on the other hand, there is some stiffness in setting the network structure. Figure 9 shows the ANNs approach (Shanmuganathan, 2016; Srivastava et al., 2012).
An ANN structure with multiple layers, where each layer contains multiple nodes, has an input layer X = A^{[0]}, hidden layers A^{[1]} to A^{[L−1]}, and an output layer ŷ = A^{[L]} (Shanmuganathan, 2016; Srivastava et al., 2012; Bengio et al., 2017):

X = A^{[0]} = [x_1, x_2, x_3, …, x_n]^T and A^{[l]} = [a_1^{[l]}, a_2^{[l]}, a_3^{[l]}, …, a_{k[l]}^{[l]}]^T, l = {1, 2, …, L}  (34)

where L is the number of ANN layers, n is the number of input features, and k^{[l]} is the number of nodes in the lth layer.
Then, the lth layer output can be calculated as (Bengio et al., 2017):

A^{[l]} = ∅^{[l]}(Z^{[l]})  (35)

Z^{[l]} = W^{[l]} A^{[l−1]} + b^{[l]}  (36)

So, the right vector dimensions are W^{[l]} = (k^{[l]}, k^{[l−1]}), b^{[l]} = (k^{[l]}, 1), and A^{[l]} = (k^{[l]}, 1). The ∅^{[l]} is the activation function of the lth layer (Bengio et al., 2017). Then, the output of the ANN structure is calculated as (Bengio et al., 2017):

ŷ = A^{[L]}  (38)
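The forward pass of (34)-(38) can be sketched as follows; the sigmoid activation and the toy 3-2-1 layer sizes are assumptions for illustration.

import numpy as np

def forward(X, weights, biases, phi=lambda z: 1.0 / (1.0 + np.exp(-z))):
    # A[0] = X; then A[l] = phi(Z[l]) with Z[l] = W[l] A[l-1] + b[l]
    # (Eqs. 35-36); the final activation is y_hat = A[L] (Eq. 38).
    A = X
    for W, b in zip(weights, biases):
        A = phi(W @ A + b)
    return A

# Toy usage: one input column vector through a 3-2-1 network
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 3)), rng.standard_normal((1, 2))]  # (k[l], k[l-1])
biases = [np.zeros((2, 1)), np.zeros((1, 1))]                         # (k[l], 1)
y_hat = forward(rng.standard_normal((3, 1)), weights, biases)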
Open source is an expression referring to open-source software: code that is designed to be publicly accessible for free, so anyone can see, modify, and distribute it. In machine learning, many researchers routinely open source their work on the Internet, such as on GitHub. There are also open-source libraries for machine learning such as TF and Scikit-learn. These libraries are available and easy to use with widely used programming languages such as MATLAB and Python.
TF is an end-to-end open-source platform for creating machine learning and deep learning applications, created by the Google Brain team. It is a symbolic math package that performs numerous tasks involving DNN training and inference using dataflow and differentiable programming (Gad, 2018).
Scikit-learn was created as a Google Summer of Code project in 2007 by David Cournapeau. It is a Python-based machine learning package that includes supervised and unsupervised machine learning approaches. It is distributed with several Linux distributions and is licensed under a liberal simplified BSD license, allowing academic and commercial use (Pedregosa et al., 2011).
EVALUATION METRICS

There are several evaluation metrics for assessing the performance of classification algorithms; some of them assess the performance of each class prediction, and others assess the predictive performance of the whole classifier.
This section illustrates the confusion matrix, precision, recall, and F1-score, which are used to assess the performance of each class prediction, and the OA and the kappa coefficient, which are used to assess the predictive performance of the whole classifier (X. Deng et al., 2016; AlBeladi & Muqaibel, 2018; Banko, 1998; W. Li et al., 2017; C. Liu et al., 2007; Cohen, 1960).
The confusion matrix summarizes the prediction results for a class in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN).
As a result of this confusion matrix, the precision, recall, F1-score, and OA can be calculated (X. Deng et al., 2016; AlBeladi & Muqaibel, 2018).
Precision depicts the proportion of predicted samples in a class that actually belong to that class, out of all predicted samples in that class; it can be expressed as (X. Deng et al., 2016; AlBeladi & Muqaibel, 2018):

Precision = TP / (TP + FP)  (39)
Recall depicts the proportion of predicted samples in a class that actually belong to that class, out of all actual samples in that class; it can be expressed as (X. Deng et al., 2016; AlBeladi & Muqaibel, 2018):

Recall = TP / (TP + FN)  (40)
The F1-score is the harmonic mean of precision and recall; it can be expressed as (AlBeladi & Muqaibel, 2018):

F1-score = 2 × (Precision × Recall) / (Precision + Recall)  (41)
The OA
The OA basically tells us what percentage of the reference sites were correctly mapped out of all of them. The OA is usually reported as a percentage, with 100% accuracy indicating that all reference sites were properly categorized. OA is the simplest metric to compute and comprehend, but it only provides basic accuracy information to map users and producers. The OA is the major classification accuracy measure. The OA is calculated as (Banko, 1998; W. Li et al., 2017; AlAfandy et al., 2020b):

OA = (TP + TN) / (TP + FP + FN + TN)  (42)
The Kappa Coefficient

A statistical test is used to calculate the kappa coefficient, which assesses the correctness of a categorization. Kappa is a metric that measures how well a categorization performed compared to assigning values at random. The kappa coefficient can be anything between -1 and 1. A classification with a value of 0 is no better than a random categorization; if the value is negative, the categorization is much poorer than random. A value close to 1 suggests that the classification is better than chance. The kappa coefficient (κ) is calculated as (C. Liu et al., 2007; Cohen, 1960):

κ = (p_o − p_e) / (1 − p_e)  (43)
where p_o is the observed agreement and p_e is the expected agreement by chance:

p_o = (TP + TN) / (TP + FP + FN + TN)  (44)

p_e = p_yes + p_no  (45)

p_yes = [(TP + FP) / (TP + FP + FN + TN)] × [(TP + FN) / (TP + FP + FN + TN)]  (46)

p_no = [(FN + TN) / (TP + FP + FN + TN)] × [(FP + TN) / (TP + FP + FN + TN)]  (47)
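All of the above metrics follow directly from the four confusion-matrix counts; the sketch below covers the binary case (the function name is illustrative).

def evaluation_metrics(tp, fp, fn, tn):
    # Per-class and whole-classifier metrics, Eqs. (39)-(47)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)                            # Eq. (39)
    recall = tp / (tp + fn)                               # Eq. (40)
    f1 = 2 * precision * recall / (precision + recall)    # Eq. (41)
    oa = (tp + tn) / total                                # Eq. (42), same as p_o in (44)
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)     # Eq. (46)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)      # Eq. (47)
    p_e = p_yes + p_no                                    # Eq. (45)
    kappa = (oa - p_e) / (1 - p_e)                        # Eq. (43)
    return {"precision": precision, "recall": recall, "f1": f1,
            "oa": oa, "kappa": kappa}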
REFERENCES
AlAfandy, K. A., Omara, H., Lazaar, M., & Al Achhab, M. (Eds.). (2019). Artificial
Neural Networks Optimization and Convolution Neural Networks to Classifying
Images in Remote Sensing: A Review. Proceeding of The 4th International Conference
on Big Data and Internet of Things (BDIoT’19). 10.1145/3372938.3372945
AlAfandy, K. A., Omara, H., Lazaar, M., & Al Achhab, M. (2020a). Investment of
Classic Deep CNNs and SVM for Classifying Remote Sensing Images. Advances in
Science Technology and Engineering Systems Journal, 5(5), 652–659. doi:10.25046/aj050580
AlAfandy, K. A., Omara, H., Lazaar, M., & Al Achhab, M. (2020b). Using Classic
Networks for Classifying Remote Sensing Images: Comparative Study. Advances in
Science Technology and Engineering Systems Journal, 5(5), 770–780. doi:10.25046/aj050594
AlBeladi, A. A., & Muqaibel, A. H. (2018). Evaluating Compressive Sensing
Algorithms in Through-the-wall Radar via F1-score. International Journal of Signal
and Imaging Systems Engineering, 11(3), 164–171. doi:10.1504/IJSISE.2018.093268
Alloghani, M., Al-Jumeily, D., Mustafina, J., Hussain, A., & Aljaaf, A. J. (2020). A
Systematic Review on Supervised and Unsupervised Machine Learning Algorithms
for Data Science. In M. Berry, A. Mohamed, & B. Yap (Eds.), Supervised and
Unsupervised Learning for Data Science. Unsupervised and Semi-Supervised
Learning. Springer. doi:10.1007/978-3-030-22475-2_1
Alpaydin, E. (2020). Introduction to Machine Learning. MIT Press.
Banko, G. (1998). A Review of Assessing the Accuracy of Classifications of Remotely
Sensed Data and of Methods Including Remote Sensing Data in Forest Inventory.
International Institution for Applied Systems Analysis (IIASA).
Bengio, Y., Goodfellow, I., & Courville, A. (2017). Deep Learning. MIT press.
Bowling, M., Furnkranz, J., Graepel, T., & Musick, R. (2006). Machine Learning and Games. Machine Learning, Springer, 63(3), 211–215. doi:10.1007/s10994-006-8919-x
Chauhan, V. K., Dahiya, K., & Sharma, A. (2019). Problem Formulations and Solvers in Linear SVM: A Review. Artificial Intelligence Review, Springer, 52(2), 803–855. doi:10.1007/s10462-018-9614-6
Cohen, J. (1960). A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20(1), 37–46. doi:10.1177/001316446002000104
J. Deng, W. Dong, R. Socher, L. Li, K. Li, & L. Fei-Fei (Eds.). (2009). ImageNet: A
Large-Scale Hierarchical Image Database. In Proceeding of the 2009 IEEE Conference
on Computer Vision and Pattern Recognition. IEEE. 10.1109/CVPR.2009.5206848
Deng, X., Liu, Q., Deng, Y., & Mahadevan, S. (2016). An Improved Method
to Construct Basic Probability Assignment Based on the Confusion Matrix for
Classification Problem. Information Sciences, Elsevier, 340-341, 250–261.
doi:10.1016/j.ins.2016.01.033
ElNaqa, I., & Murphy, M. J. (2015). What is Machine Learning? In I. Issam ElNaqa
& M. J. Murphy (Eds.), Machine Learning in Radiation Oncology (pp. 3–11).
Springer. doi:10.1007/978-3-319-18305-3_1
Farid, D. M., Zhang, L., Rahman, C. M., Hossain, M. A., & Strachan, R. (2014). Hybrid Decision Tree and Naive Bayes Classifiers for Multi-class Classification Tasks. Expert Systems with Applications, Elsevier, 41(4), 1937–1946. doi:10.1016/j.eswa.2013.08.089
Fukushima, K., Miyake, S., & Ito, T. (1983). Neocognitron: A Neural Network Model
for a Mechanism of Visual Pattern Recognition. IEEE Transactions on Systems,
Man, and Cybernetics, SMC-13(5), 826–834. doi:10.1109/TSMC.1983.6313076
APPENDIX
Table 1.