ML-ToC
Preface iv
Acknowledgements vii
QR Code Content Details viii
10.7 Radial Basis Function Neural Network 297
10.8 Self-Organizing Feature Map 301
10.9 Popular Applications of Artificial Neural Networks 306
10.10 Advantages and Disadvantages of ANN 306
10.11 Challenges of Artificial Neural Networks 307
11. Support Vector Machines 312
11.1 Introduction to Support Vector Machines 312
11.2 Optimal Hyperplane 314
11.3 Functional and Geometric Margin 316
11.4 Hard Margin SVM as an Optimization Problem 319
11.4.1 Lagrangian Optimization Problem 320
11.5 Soft Margin Support Vector Machines 323
11.6 Introduction to Kernels and Non-Linear SVM 326
11.7 Kernel-based Non-Linear Classifier 330
11.8 Support Vector Regression 331
11.8.1 Relevance Vector Machines 333
12. Ensemble Learning 339
12.3.2 Cascading 347
12.4 Sequential Ensemble Models 347
12.4.1 AdaBoost 347
13. Clustering Algorithms 361
13.1 Introduction to Clustering Approaches 361
13.2 Proximity Measures 364
13.3 Hierarchical Clustering Algorithms 368
13.3.1 Single Linkage or MIN Algorithm 368
13.3.2 Complete Linkage or MAX or Clique 371
13.3.3 Average Linkage 371
13.3.4 Mean-Shift Clustering Algorithm 372
13.4 Partitional Clustering Algorithm 373
13.5 Density-based Methods 376
13.6 Grid-based Approach 377
13.7 Probability Model-based Methods 379
13.7.1 Fuzzy Clustering 379
13.7.2 Expectation-Maximization (EM) Algorithm 380
13.8 Cluster Evaluation Methods 382
14. Reinforcement Learning 389
14.1 Overview of Reinforcement Learning 389
14.7 Model-based Learning (Passive Learning) 402
14.8 Model Free Methods 406
14.8.1 Monte-Carlo Methods 407
14.8.2 Temporal Difference Learning 408
14.9 Q-Learning 409
14.10 SARSA Learning 410
15. Genetic Algorithms 417
15.1 Overview of Genetic Algorithms 417
15.2 Optimization Problems and Search Spaces 419
15.3 General Structure of a Genetic Algorithm 420
15.4 Genetic Algorithm Components 422
15.4.1 Encoding Methods 422
15.4.2 Population Initialization 424
15.4.3 Fitness Functions 424
15.4.4 Selection Methods 425
15.4.5 Crossover Methods 428
15.4.6 Mutation Methods 429
15.5 Case Studies in Genetic Algorithms 430
15.5.1 Maximization of a Function 430
15.5.2 Genetic Algorithm Classifier 433
15.6 Evolutionary Computing 433
15.6.1 Simulated Annealing 433
15.6.2 Genetic Programming 434
16. Deep Learning 439
16.1 Introduction to Deep Neural Networks 439
16.2 Introduction to Loss Functions and Optimization 440
16.3 Regularization Methods 442
16.4 Convolutional Neural Networks 444
16.5 Transfer Learning 451
16.6 Applications of Deep Learning 451
16.6.1 Robotic Control 451
16.6.2 Linear Systems and Non-linear Dynamics 452
16.6.3 Data Mining 452
16.6.4 Autonomous Navigation 453
16.6.5 Bioinformatics 453
16.6.6 Speech Recognition 453
16.6.7 Text Analysis 454
16.7 Recurrent Neural Networks 454
16.8 LSTM and GRU 457
Bibliography 463
Index 472
About the Authors 480
Related Titles 481
Introduction to Machine Learning
“Computers are able to see, hear and learn. Welcome to the future.”
— Dave Waters
Machine Learning (ML) is a promising and flourishing field. It enables the top management of an
organization to extract knowledge from the data stored in the organization's various archives and
thereby facilitates decision making. Such decisions can help organizations design new products,
improve business processes, and develop decision support systems.
Learning Objectives
• Explore the basics of machine learning
• Introduce types of machine learning
• Provide an overview of machine learning tasks
• State the components of the machine learning algorithm
• Explore the machine learning process
• Survey some machine learning applications
2. The second reason is that the cost of storage has fallen. Hardware costs have also dropped.
Therefore, it is now easier to capture, process, store, distribute, and transmit digital
information.
3. The third reason for the popularity of machine learning is the availability of complex
algorithms. Especially with the advent of deep learning, many algorithms are now available for
machine learning.
With the popularity and ready adoption of machine learning by business organizations, it
has become a dominant technology trend. Before starting the machine learning journey, let
us establish these terms: data, information, knowledge, intelligence, and wisdom. A knowledge
pyramid is shown in Figure 1.1.
[Figure 1.1: Knowledge pyramid. Surviving labels, from the top: Wisdom; Intelligence (applied knowledge); Knowledge (condensed information).]
[Figure 1.2: (a) A Learning System for Humans (b) A Learning System for Machine Learning. Labels in (a): Experience, Humans, Decisions. Labels in (b): Data, Database, Learning program, Model.]
Often, the quality of data determines the quality of experience and, therefore, the quality of
the learning system. In statistical learning, the relationship between the input x and output y is
modeled as a function in the form y = f(x). Here, f is the learning function that maps the input x
to output y. Learning of function f is the crucial aspect of forming a model in statistical learning.
In machine learning, this is simply called mapping of input to output.
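As a minimal sketch of learning the function f from data, the following example assumes that f is a straight line, y = a*x + b, and estimates a and b by ordinary least squares. The data values and the names `fit_line`, `xs`, and `ys` are hypothetical and chosen only for illustration; the linear form is an assumption, not something the text prescribes.

```python
# Hypothetical training data: inputs x and outputs y that follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Estimate a and b in y = a*x + b by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


a, b = fit_line(xs, ys)


def f(x):
    """The learned mapping from input x to output y."""
    return a * x + b


print(f(5.0))  # prediction for an input that was not in the training data
```

Here the learned f is the model: it summarizes the five data points in two numbers and can be applied to new inputs.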
The learning program summarizes the raw data in a model. Formally stated, a model is an
explicit description of patterns within the data in the form of:
1. Mathematical equation
2. Relational diagrams like trees/graphs
3. Logical if/else rules, or
4. Groupings called clusters
In summary, a model can be a formula, a procedure, or a representation that can generate
decisions from data. The difference between a pattern and a model is that the former is local and
applies only to certain attributes, whereas the latter is global and fits the entire dataset. For
example, a model can help determine whether a given email is spam or not. The point is that the
model is generated automatically from the given data.
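To illustrate a model being generated automatically from data, the sketch below learns a single if/else rule for the spam example. It assumes, purely for illustration, that each email is described by one number (a count of suspicious words) and that the rule has the form "spam if count >= t". The data and the names `emails`, `learn_threshold`, and `is_spam` are hypothetical.

```python
# Hypothetical labelled emails: (number of suspicious words, label is spam?).
emails = [(0, False), (1, False), (2, False), (5, True), (7, True), (9, True)]


def learn_threshold(data):
    """Return the cut-off t that misclassifies the fewest training emails
    under the rule 'spam if count >= t'."""
    candidates = sorted({count for count, _ in data})
    best_t, best_errors = None, len(data) + 1
    for t in candidates:
        # Count emails whose predicted label differs from the true label.
        errors = sum((count >= t) != label for count, label in data)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


threshold = learn_threshold(emails)


def is_spam(count):
    """The learned model: a logical if/else rule with a data-driven cut-off."""
    return count >= threshold


print(threshold, is_spam(6), is_spam(1))
```

The cut-off is not written by hand; it is chosen by the program from the labelled examples, which is the sense in which the model is generated from data.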
Tom Mitchell, another pioneer of AI, defines machine learning as follows: “A computer
program is said to learn from experience E, with respect to task T and some performance measure P,
if its performance on T measured by P improves with experience E.” The important components of this
definition are experience E, task T, and performance measure P.
For example, the task T could be detecting an object in an image. The machine can gain
knowledge of the object from a training dataset of thousands of images. This is called experience E.
So, the focus is on using this experience E for the task T of object detection. The ability of the
system to detect the object is measured by performance measures such as precision and recall. Based
on the performance measures, course correction can be done to improve the performance of the system.
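The performance measure P can be made concrete with a small sketch. It uses the standard definitions precision = TP / (TP + FP) and recall = TP / (TP + FN), where TP, FP, and FN are the counts of true positives, false positives, and false negatives. The labels below are hypothetical (1 means "object present", 0 means "object absent"), and the function name `precision_recall` is chosen for illustration.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly detected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed objects
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth and detector output for eight images.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Tracking these two numbers as more training images are added is one way to check whether performance on T improves with experience E.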
Models in computer systems are equivalent to human experience. Experience is based on data.
Humans gain experience by various means. They gain knowledge by rote learning. They observe
others and imitate them. Humans gain a lot of knowledge from teachers and books. We learn many
things by trial and error. Once the knowledge is gained, when a new problem is encountered, humans
search for similar past situations, formulate heuristics, and use them for prediction.
But, in systems, experience is gathered by these steps:
1. Collection of data
2. Once data is gathered, abstract concepts are formed out of that data. Abstraction is used
to generate concepts. This is equivalent to humans’ idea of objects; for example, we have
some idea of what an elephant looks like.
3. Generalization converts the abstraction into an actionable form of intelligence.
It can be viewed as an ordering of all possible concepts. So, generalization involves ranking
concepts, inferring from them, and forming heuristics, an actionable aspect of
intelligence. Heuristics are educated guesses for all tasks. For example, if one runs on
encountering danger, it is the result of human experience, that is, of heuristics formation.
In machines, it happens the same way.
4. Heuristics normally work, but occasionally they may fail too. This is not the fault
of heuristics, as a heuristic is just a ‘rule of thumb’. Course correction is done by taking
evaluation measures. Evaluation checks the thoroughness of the models and applies
course correction, if necessary, to generate better formulations.
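The four steps above can be sketched in code under some loud assumptions: abstraction is represented here as one mean value per label, generalization as a nearest-mean prediction rule, and evaluation as accuracy on held-out examples. This is only one possible reading of the steps, not the book's prescribed method, and the data and the names `abstract`, `generalize`, and `evaluate` are hypothetical.

```python
# Step 1: collection of data. Hypothetical (weight in kg, label) observations.
train = [(4000.0, "elephant"), (5000.0, "elephant"), (6000.0, "elephant"),
         (300.0, "horse"), (400.0, "horse"), (500.0, "horse")]
held_out = [(4500.0, "elephant"), (450.0, "horse"), (5500.0, "elephant")]


def abstract(data):
    """Step 2: abstraction. Summarize each label by its mean weight."""
    totals = {}
    for weight, label in data:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + weight, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def generalize(concepts):
    """Step 3: generalization. Turn the concepts into an actionable rule:
    predict the label whose mean weight is closest to the input."""
    def predict(weight):
        return min(concepts, key=lambda label: abs(weight - concepts[label]))
    return predict


def evaluate(predict, data):
    """Step 4: evaluation. Fraction of held-out examples predicted correctly."""
    return sum(predict(w) == label for w, label in data) / len(data)


concepts = abstract(train)
predict = generalize(concepts)
accuracy = evaluate(predict, held_out)
print(concepts, accuracy)
```

If the accuracy were low, course correction could mean collecting more data or choosing a richer abstraction than a single mean per label.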
[Figure: surviving labels: Machine learning, Deep learning]
1.3.2 Machine Learning, Data Science, Data Mining, and Data Analytics
Data science is an ‘umbrella’ term that encompasses many fields. Machine learning starts with
data. Therefore, data science and machine learning are interlinked. Machine learning is a branch
of data science. Data science deals with the gathering of data for analysis. It is a broad field that
includes: