Review of Applied Machine Learning
This book is compiled from the author's class notes for courses taught on the subject,
and will be very useful as an undergraduate textbook. The focus is on describing the
techniques needed to use machine learning tools, rather than on in-depth coverage of
the underlying theory. The content is split into six sections covering classification,
high-dimensional data, clustering, regression, graphical models, and deep networks.
Classification is the basic step: identifying the label of an item from a set of
features. The first section provides a good account of the Bayes classifier and the
support vector machine. Training error is the error rate on the examples used to train
the classifier, while test error is the error rate on other examples. A small gap
between training error and test error is desirable, so bounds on the probability that
the gap becomes large are studied.
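The gap between training and test error can be illustrated with a minimal sketch (pure Python, on hypothetical synthetic data, not an example from the book): a 1-nearest-neighbor classifier has zero training error by construction, while its test error on fresh examples from the same overlapping classes is higher.

```python
import random

random.seed(0)

def make_data(n):
    """n examples per class: two overlapping 1-D Gaussian classes (synthetic)."""
    return [(random.gauss(c, 0.7), c) for _ in range(n) for c in (0, 1)]

def predict_1nn(train, x):
    """1-nearest-neighbor: return the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def error_rate(train, examples):
    return sum(predict_1nn(train, x) != y for x, y in examples) / len(examples)

train, test = make_data(50), make_data(50)
train_err = error_rate(train, train)  # zero: each point is its own nearest neighbor
test_err = error_rate(train, test)    # higher: the classes overlap
```

Because the classes overlap, no classifier can drive the test error to zero here, so the gap the section's bounds address appears immediately.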
When items in the data contain multiple features, the dataset is modeled as a set of
d-dimensional vectors. The behavior of such high-dimensional data is non-intuitive,
and the second section handles this phenomenon, known as the curse of dimension:
most points lie close to any decision boundary, yet are also far apart from one
another. Techniques that treat the dataset as a collection of blobs, or clusters,
where points that are close together belong to the same blob, are demonstrated. In
most high-dimensional datasets, many eigenvalues of the covariance matrix are very
small, so techniques that build a reasonably accurate lower-dimensional model using
principal components analysis are shown.
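One non-intuitive effect of high dimension can be checked in a minimal sketch (pure Python, synthetic uniform data, not from the book): as the dimension grows, pairwise distances between random points concentrate, so their relative spread (standard deviation over mean) shrinks.

```python
import math
import random

random.seed(0)

def pairwise_spread(d, n=200):
    """Relative spread (std/mean) of pairwise distances between n random
    points drawn uniformly from the unit cube [0, 1]^d."""
    pts = [[random.random() for _ in range(d)] for _ in range(n)]
    dists = [math.dist(p, q) for i, p in enumerate(pts) for q in pts[i + 1:]]
    mean = sum(dists) / len(dists)
    var = sum((x - mean) ** 2 for x in dists) / len(dists)
    return math.sqrt(var) / mean

spread_low = pairwise_spread(2)    # 2-D: distances vary a lot
spread_high = pairwise_spread(100) # 100-D: nearly all distances look alike
```

In 100 dimensions almost every point is roughly the same distance from every other, which is why intuitions built in two or three dimensions fail.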
Clustering is the topic of the third section, and multiple algorithms are discussed.
The agglomerative method starts by treating each data item as its own cluster, then
merges clusters recursively to arrive at a good clustering. The divisive method starts
with the entire dataset as one cluster, then splits clusters recursively. The iterative
k-means method is described as the go-to clustering algorithm here, along with the
more general EM (expectation maximization) algorithm, which can handle cases where
some data is missing.
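The k-means iteration can be sketched minimally in pure Python (1-D, synthetic two-blob data assumed for illustration; real implementations handle d dimensions and smarter initialization):

```python
import random

random.seed(1)

def kmeans_1d(data, k, iters=20):
    """Plain k-means on 1-D data: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = random.sample(data, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Keep a center in place if its cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return sorted(centers)

# Two well-separated synthetic blobs, centered near 0 and 10.
data = ([random.gauss(0, 1) for _ in range(50)] +
        [random.gauss(10, 1) for _ in range(50)])
centers = kmeans_1d(data, 2)
```

EM generalizes this: instead of a hard nearest-center assignment, each point is given soft responsibilities, which is what lets it cope with missing data.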
Whereas classification predicts a class from a data item, regression models predict a
value, and are useful for comparing trends in the data. This is the topic of the
fourth section. Regression using linear models is described, along with
transformations to improve performance. To find sets of independent variables that
predict effectively, greedy search methods are described: forward stagewise
regression adds variables, whereas the backward method removes them, until the
change makes the regression worse. Boosting is another greedy technique, in which an
optimal predictor is built incrementally from less ambitious predictors.
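The simplest linear model, a line fit by least squares, has a closed-form solution; a minimal sketch (pure Python, with hypothetical noiseless data so the fit recovers the true coefficients exactly):

```python
def fit_line(xs, ys):
    """Least-squares fit of y ~ a*x + b using the closed-form solution:
    a = cov(x, y) / var(x), b = mean(y) - a * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
         sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

# Noiseless data from y = 3x + 1; the fit recovers a = 3, b = 1.
xs = [0, 1, 2, 3, 4]
ys = [3 * x + 1 for x in xs]
a, b = fit_line(xs, ys)
```

Forward and backward selection wrap a fit like this in a greedy loop, adding or removing candidate variables as long as the fit keeps improving.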
The fifth section describes models that can be represented as graphs, such as HMMs
(hidden Markov models), and techniques to learn them. The CRF (conditional random
field) is another model that considers joint probabilities. The learning algorithm for
a CRF repeatedly computes the current best inferred set of hidden states and adjusts
the cost functions so that the desired sequence scores better than the current best.
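For an HMM, the best inferred set of hidden states is found by the Viterbi algorithm; a minimal sketch (pure Python, with a hypothetical two-state weather model invented for illustration):

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for an observation
    sequence, via dynamic programming over path probabilities."""
    # best[s] = (probability of best path ending in state s, that path)
    best = {s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        best = {s: max(
            ((p * trans_p[prev][s] * emit_p[s][o], path + [s])
             for prev, (p, path) in best.items()),
            key=lambda t: t[0]) for s in states}
    return max(best.values(), key=lambda t: t[0])[1]

states = ["rainy", "sunny"]
start_p = {"rainy": 0.5, "sunny": 0.5}
trans_p = {"rainy": {"rainy": 0.7, "sunny": 0.3},
           "sunny": {"rainy": 0.3, "sunny": 0.7}}
emit_p = {"rainy": {"walk": 0.1, "umbrella": 0.9},
          "sunny": {"walk": 0.8, "umbrella": 0.2}}
path = viterbi(["umbrella", "umbrella", "walk"], states,
               start_p, trans_p, emit_p)
```

Here the inferred sequence is rainy, rainy, sunny: the umbrella observations pull toward the rainy state, and the final walk flips the last state to sunny.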
A neural network is made of a stack of layers of units, where each unit accepts a set
of inputs and a set of parameters, and produces a number that is a non-linear function
of the inputs and those parameters. When the number of layers is large, the network
is called a deep network. Such networks make excellent classifiers and are the topic
of the last section. Networks are trained by descent on a loss function, with
backpropagation used to evaluate the gradient. To improve training, a gradient
scaling method is described. Image classification is the interesting centerpiece of
this section. Another application shown here is an encoder that produces a
low-dimensional code from higher-dimensional data, trained along with a decoder that
recovers the data from the code.
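Descent on a loss with a hand-derived gradient can be sketched in the one-unit case (pure Python, assuming a squared-error loss and a hypothetical OR task; backpropagation is this same chain-rule computation applied layer by layer):

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# OR truth table: inputs (x1, x2) -> target.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

w = [0.0, 0.0]  # weights
b = 0.0         # bias
lr = 1.0        # learning rate

def loss():
    """Mean squared error of the single sigmoid unit over the dataset."""
    return sum((sigmoid(w[0] * x1 + w[1] * x2 + b) - y) ** 2
               for (x1, x2), y in data) / len(data)

loss_before = loss()
for _ in range(500):  # gradient descent epochs
    for (x1, x2), y in data:
        out = sigmoid(w[0] * x1 + w[1] * x2 + b)
        # Chain rule: d(loss)/d(pre-activation) = 2*(out - y)*out*(1 - out)
        g = 2 * (out - y) * out * (1 - out)
        w[0] -= lr * g * x1
        w[1] -= lr * g * x2
        b -= lr * g
loss_after = loss()
```

The loss decreases steadily and the unit ends up firing for any input containing a 1. An autoencoder trains two such stacks at once: the encoder compresses and the decoder is penalized for failing to reconstruct the input.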
Overall, this is a great textbook that will be very useful for students and practitioners.
The material here is designed to be read from beginning to end. Basic skills in linear
algebra, probability, statistics, and programming are expected of readers. Each
chapter is organized very well, with clear descriptions and examples. Key ideas are
highlighted in text boxes, which offer an excellent summary of the main body.