Machine Learning- UNIT I (1)
Outcome
1. Recognize the characteristics of machine learning that make it useful to real-world problems.
2. Able to use regularized regression and Classification algorithms.
3. Evaluate machine learning algorithms and model selection.
4. Understand scalable machine learning and machine learning for IoT.
5. Understand Deep learning and Expert systems.
UNIT-I
Outline
• Introduction to Machine Learning
• Types of Machine Learning Algorithms
• Supervised Learning
• Unsupervised learning
• Reinforcement Learning
• Classification of Machine Learning Concept
• Distance Based Machine learning Methods
• K-Nearest Neighbor (kNN)
Outline cont….
• Introduction to Clustering Techniques
• Possible Applications
• Requirements of clustering algorithm
• Problems associated with using Clustering Technique
• Types of Clustering Methods
• Clustering Strategies.
Introduction to Machine Learning
• The idea behind Machine Learning is that data is fed to machines to train them, letting them learn on their own with minimal human intervention.
• Machine learning was first introduced in 1959 by Arthur Samuel, who described it as “a field of study that gives computers the ability to learn without being explicitly programmed”; that is, imbuing machines with knowledge without hard-coding it.
The Artificial Intelligence View: Learning is central to human knowledge and intelligence, and, likewise, it is also essential for building intelligent machines. Years of effort in AI have shown that trying to build intelligent computers by programming all the rules cannot be done; automatic learning is crucial. For example, we humans are not born with the ability to understand language; we learn it, and it makes sense to try to have computers learn language instead of trying to program it all in.
The Software Engineering View: Machine learning allows us to program computers by example, which can be easier than writing code the traditional way.
The Stats (statistics) View: Machine learning is the marriage of computer science and statistics: computational techniques are applied to statistical problems. Machine learning has been applied to a vast number of problems in many contexts, beyond the typical statistics problems. Machine learning is often designed with different considerations than statistics (e.g., speed is often more important than accuracy).
History of Machine Learning
• It was in the 1940s when the first manually operated computer system,
ENIAC (Electronic Numerical Integrator and Computer), was invented.
• At that time the word “computer” was being used as a name for a human
with intensive numerical computation capabilities, so, ENIAC was called a
numerical computing machine!
• You may say this has nothing to do with learning. Wrong: from the beginning, the idea was to build a machine able to emulate human thinking and learning.
• In the 1950s, we see the first computer game program claiming to be able to beat the
checkers world champion. This program helped checkers players a lot in improving their skills!
• Around the same time, Frank Rosenblatt invented the Perceptron which was a very, very
simple classifier but when it was combined in large numbers, in a network, it became a powerful
monster.
• Thanks to statistics, machine learning became very famous in the 1990s. The intersection
of computer science and statistics gave birth to probabilistic approaches in AI. This shifted the
field further toward data-driven approaches. Having large-scale data available, scientists started
to build intelligent systems that were able to analyze and learn from large amounts of data.
Need for Machine Learning
• Machine learning is a field that grew out of artificial intelligence (AI). Through the application of AI, humans set out to build better and more intelligent machines, but they were unable to program them for more complex and constantly evolving challenges.
• Then came the realization that the only way to achieve more advanced tasks was to let the machine learn from its own input.
• If big data and cloud computing are gaining importance for their contributions,
machine learning also deserves recognition for helping data scientists analyze large
chunks of data via an automated process that saves time and effort.
• The techniques used for data mining have been around for years, but they're not
effective without the power to run algorithms. When you run deep learning with
access to better data, the output leads to dramatic advances, which is why there's
a need for machine learning.
Why is Machine Learning so important?
Working Process of Machine Learning
Machine Learning Applications
• There are many uses of Machine Learning in various fields. These fields have different applications of Supervised, Unsupervised and Reinforcement learning.
Types of Machine Learning
Machine learning can be classified into 3 types of algorithms:
▪ Supervised Learning
▪ Unsupervised Learning
▪ Reinforcement Learning
Supervised Learning
• In supervised learning, algorithms are trained using labeled data, where the input and the output are known.
• The data is fed to the learning algorithm as a set of inputs along with the corresponding outputs, and the algorithm learns by comparing its actual output with the correct outputs to find errors. It then modifies the model accordingly.
• The raw data is divided into two parts, i.e. training data and testing data.
• Supervised learning uses the patterns found in the labeled data to predict the label values of additional, unlabeled data.
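The error-driven learning loop and the train/test split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a prescribed method: the toy data (outputs equal to twice the input), the single-parameter model `w * x`, the 80/20 split and the learning rate `lr` are all hypothetical choices made for the example.

```python
import random

# Hypothetical labeled data: each input x comes with its known output y = 2x.
data = [(float(x), 2.0 * x) for x in range(1, 11)]

# Divide the raw data into training data and testing data (80/20, an assumed ratio).
random.seed(0)
random.shuffle(data)
train, test = data[:8], data[8:]

w = 0.0      # the single model parameter; the prediction is w * x
lr = 0.01    # learning rate (assumed value)

for _ in range(200):
    for x, y in train:
        error = w * x - y      # compare the actual output with the correct output
        w -= lr * error * x    # modify the model so the error shrinks

# Evaluate on data the model never saw during training.
test_error = sum(abs(w * x - y) for x, y in test) / len(test)
print(round(w, 3), round(test_error, 3))
```

The testing data plays no part in the updates, so the final error gives a rough idea of how the learned model behaves on new inputs.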
Reinforcement Learning
• The objective is for the agent to take actions that maximise the expected reward over a given measure of time. The agent will reach the goal much quicker by following a good policy. So the purpose of reinforcement learning is to learn the best policy.
• It is closely related to dynamic programming and trains algorithms using a system of reward and punishment.
• With reinforcement learning, the algorithm discovers through trial and error which actions yield the greatest rewards.
• Reinforcement learning is frequently used for robotics, gaming, and navigation.
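Trial-and-error learning from rewards can be illustrated with a very small example. The sketch below uses an epsilon-greedy strategy on a three-action "bandit" problem; this particular technique is swapped in purely for illustration and is not named in the slides. The reward probabilities, the exploration rate `epsilon` and the number of trials are assumed values.

```python
import random

random.seed(1)
# Hypothetical environment: three actions with hidden success probabilities.
reward_prob = [0.2, 0.5, 0.8]

counts = [0, 0, 0]         # how often each action has been tried
values = [0.0, 0.0, 0.0]   # running average reward observed per action
epsilon = 0.1              # fraction of steps spent exploring (assumed)

for _ in range(5000):
    if random.random() < epsilon:
        action = random.randrange(3)                      # trial: try a random action
    else:
        action = max(range(3), key=lambda a: values[a])   # use the best action so far
    # Reward of 1 on success, 0 otherwise (the "punishment" is a missing reward).
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]

best_action = max(range(3), key=lambda a: values[a])
print(best_action, [round(v, 2) for v in values])
```

The agent is never told which action is best; it only sees rewards, and its estimates in `values` gradually steer it toward the action that pays off most often.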
Summary of all Machine Learning Types
• Supervised Learning – Train Me!
Distance Based Machine Learning Methods
• Euclidean space
• Euclidean distance
• Manhattan distance
• Minkowski distance
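The distance measures listed above can be written down directly. The sketch below is a plain-Python illustration with hypothetical function names; it also shows that the Minkowski distance of order p reduces to the Manhattan distance for p = 1 and to the Euclidean distance for p = 2.

```python
import math

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Straight-line distance in Euclidean space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minkowski(a, b, p):
    """Minkowski distance of order p (p = 1: Manhattan, p = 2: Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

a, b = (0.0, 0.0), (3.0, 4.0)
print(manhattan(a, b))   # 7.0
print(euclidean(a, b))   # 5.0
print(minkowski(a, b, 1), minkowski(a, b, 2))
```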
K-Nearest Neighbor (KNN) Algorithm
• K-Nearest Neighbor is a type of Supervised Machine Learning algorithm.
• The K-NN algorithm can be used for Regression as well as for Classification, but it is mainly used for Classification problems.
• The K-NN algorithm uses ‘feature similarity’ between the new data point and the available data points, and puts the new data point into the category whose available points it most resembles.
• The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can be easily classified into a well-suited category by using the K-NN algorithm.
• The following two properties define KNN well:
• Non-parametric learning algorithm − KNN is a non-parametric learning algorithm because it does not assume anything about the distribution of the underlying data.
• Lazy learning algorithm − KNN is a lazy learning algorithm because it does not have a specialized training phase and uses all the training data at classification time.
• At the training phase the KNN algorithm just stores the dataset; when it gets new data, it classifies that data into the category that is most similar to the new data.
• With the help of K-NN, we can easily identify the category or class of a particular data point.
• The working of the K-NN algorithm:
• Step-1: Select the number K of neighbors.
• Step-2: Calculate the Euclidean distance from the new data point to the stored data points.
• Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
• Step-4: Among these K neighbors, count the number of data points in each category.
• Step-5: Assign the new data point to the category for which the number of neighbors is maximum.
• For example, with K = 3, if the 3 nearest neighbors are from category A, the new data point is assigned to category A.
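The working of K-NN can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `knn_classify` and the toy data with categories A and B are hypothetical, and ties between categories are not handled specially.

```python
import math
from collections import Counter

def knn_classify(train, new_point, k):
    """train is a list of (features, category) pairs; returns the majority category."""
    # Euclidean distance from the new point to every stored point.
    distances = [(math.dist(features, new_point), category)
                 for features, category in train]
    # Step-3: take the K nearest neighbors.
    nearest = sorted(distances, key=lambda d: d[0])[:k]
    # Step-4: count the data points in each category.
    votes = Counter(category for _, category in nearest)
    # Step-5: assign the category with the largest number of neighbors.
    return votes.most_common(1)[0][0]

# Hypothetical toy data with two categories.
train = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
         ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B")]
print(knn_classify(train, (2.0, 2.0), k=3))   # A
```

Note that there is no training step at all: the "model" is simply the stored list `train`, which is what makes K-NN a lazy learner.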
Procedure to select the value of K in the K-NN Algorithm
• Below are some points to remember while selecting the value of K in the K-NN algorithm:
• There is no particular way to determine the best value for "K", so we need to try several values and keep the one that performs best. A commonly used starting value for K is 5.
• A very low value for K, such as K=1 or K=2, can be noisy and makes the model sensitive to outliers.
• Large values for K smooth out noise, but a K that is too large can blur the boundaries between categories and pull in points from other categories.
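One way to "try some values" of K is to measure, for each candidate, how often K-NN classifies known points correctly. The sketch below uses leave-one-out evaluation, which is an assumed choice for illustration; the helper names and the toy data (two groups plus one deliberately mislabeled point) are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, point, k):
    nearest = sorted(train, key=lambda item: math.dist(item[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Classify each point using all the other points and report the hit rate."""
    hits = 0
    for i, (point, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, point, k) == label
    return hits / len(data)

# Hypothetical data: group A, group B, and one noisy "B" label inside group A.
data = [((1.0, 1.0), "A"), ((1.0, 2.0), "A"), ((2.0, 1.0), "A"), ((2.0, 2.0), "A"),
        ((6.0, 6.0), "B"), ((6.0, 7.0), "B"), ((7.0, 6.0), "B"), ((7.0, 7.0), "B"),
        ((1.4, 1.4), "B")]   # the outlier

scores = {k: leave_one_out_accuracy(data, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)   # first K with the highest accuracy
print(scores, best_k)
```

On this toy data K=1 is misled by the single outlier for every point in group A, while K=3 and K=5 outvote it, which mirrors the remark that very low values of K are sensitive to noise.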
Advantages and Disadvantages of KNN Algorithm
• Advantages:
• It is simple to implement.
• It is robust to noisy training data when K is chosen large enough.
• It can be more effective if the training data is large.
• Disadvantages:
• The value of K always needs to be determined, which can sometimes be complex.
• The computation cost is high because the distance from the new data point to every training sample must be calculated.
Introduction to Clustering Techniques
• Clustering is basically a type of unsupervised learning method.
• It groups an unlabeled dataset by finding similar patterns, such as shape, size, color, behavior, etc., and divides the data according to the presence or absence of those patterns.
Requirements of Clustering Algorithm
• Discovery of clusters with arbitrary shape − The clustering algorithm should be capable of detecting clusters of arbitrary shape. It should not be bounded to only distance measures that tend to find spherical clusters of small size.
• Insensitivity to the order of input records − Some clustering algorithms cannot incorporate newly inserted data, and some are sensitive to the order of input data. It is important to develop algorithms that are insensitive to the order of input.
• High dimensionality − The clustering algorithm should be able to handle not only low-dimensional data but also high-dimensional spaces.
Problems associated with using Clustering Technique
2. Dealing with a large number of dimensions and a large number of data items can be problematic because of time complexity;
4. If an obvious distance measure doesn’t exist we must “define” it, which is not always easy, especially in multi-dimensional spaces;
5. The result of the clustering algorithm (that in many cases can be arbitrary itself) can be interpreted in different ways.
Types of Clustering Methods
• Clustering methods are used to identify groups of similar objects in multivariate data sets collected from fields such as marketing, bio-medical and geo-spatial.
• There are different types of clustering methods, including:
• Partitioning methods
• Hierarchical methods
• Density-based methods
• Grid-based methods
• Model-based methods
• Subspace methods
• Graph based methods
Partitioning Method
• From a data set of n objects, a partitioning method constructs k partitions of the data, where each partition represents a cluster and k ≤ n.
• That is, it divides the data into k groups such that each group must contain at least one object.
• In other words, partitioning methods conduct one-level partitioning on data sets.
• To find clusters with complex shapes and for very large data sets,
partitioning-
Partitioning Method
• Algorithms under Partitioning Method:
• K-means: The main objective of the K-Means algorithm is to minimize the sum of squared distances between the points and their respective cluster centroid.
• K-medoids: A medoid can be defined as the point in the cluster whose total dissimilarity to all the other points in the cluster is minimum.
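The difference between a centroid and a medoid is easy to see on a small cluster. The sketch below is illustrative only: the function names and the sample cluster are hypothetical, and Euclidean distance is assumed as the dissimilarity measure.

```python
import math

def centroid(points):
    """Coordinate-wise mean of the points (need not be an actual data point)."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def within_cluster_sum(points, center):
    """Sum of squared Euclidean distances from the points to the center."""
    return sum(math.dist(p, center) ** 2 for p in points)

def medoid(points):
    """Actual data point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

cluster = [(1.0, 1.0), (2.0, 1.0), (3.0, 1.0), (4.0, 1.0), (15.0, 1.0)]
print(centroid(cluster))                               # (5.0, 1.0)
print(within_cluster_sum(cluster, centroid(cluster)))  # 130.0
print(medoid(cluster))                                 # (3.0, 1.0)
```

The far-away point at x = 15 drags the centroid to a location where no data lies, while the medoid stays on a real data point near the bulk of the cluster.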
Hierarchical methods
• A hierarchical clustering method works by grouping data objects into a tree of clusters.
• The agglomerative approach, also called the bottom-up approach, starts with each object forming a separate group. It successively merges the objects or groups close to one another, until all the groups are merged into one (the topmost level of the hierarchy), or a termination condition holds.
• The divisive approach, also called the top-down approach, starts with all the objects in the same cluster. In each successive iteration, a cluster is split into smaller clusters, until eventually each object is in its own cluster, or a termination condition holds.
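The agglomerative (bottom-up) approach can be sketched on one-dimensional data. This is a simplified illustration under assumptions the slides do not state: the distance between two groups is taken as the distance between their closest members (single linkage), and the termination condition is a target number of clusters, `num_clusters`.

```python
def agglomerative(points, num_clusters):
    """Bottom-up clustering of 1-D points using single linkage (assumed)."""
    clusters = [[p] for p in points]       # each object starts as a separate group
    while len(clusters) > num_clusters:    # termination condition
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters.pop(j)           # j > i, so index i stays valid
        clusters[i].extend(merged)         # merge the two closest groups
    return [sorted(c) for c in clusters]

print(agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], num_clusters=3))
# [[1.0, 1.5], [5.0, 5.2], [9.0]]
```

Calling it with `num_clusters=1` keeps merging until a single group remains, which corresponds to the topmost level of the hierarchy.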
• Algorithms under Hierarchical method:
• BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies): It is designed for clustering a large amount of numeric data by integrating hierarchical clustering and other clustering methods such as iterative partitioning.
Density-based methods
• To discover clusters with arbitrary shape, density-based clustering methods have been developed.
• The clusters are modeled as dense regions in the data space, separated by sparse regions.
• These methods have good accuracy and the ability to merge two clusters.
• Algorithms under Density-based method:
• DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN grows clusters according to a density-based connectivity analysis.
• OPTICS (Ordering Points To Identify the Clustering Structure): OPTICS extends DBSCAN to produce a cluster ordering obtained from a wide range of parameter settings.
• DENCLUE (DENsity-based CLUstEring): It clusters objects based on a set of density distribution functions.
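How DBSCAN grows clusters through density-based connectivity can be sketched as follows. This is a simplified, unoptimized illustration rather than a production implementation: the parameter names `eps` (neighborhood radius) and `min_pts` (minimum neighbors for a dense point, counting the point itself) and the toy data are assumptions made for the example.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: cluster ids 0, 1, ... or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1                  # not dense: provisionally noise
            continue
        cluster_id += 1                     # i is a dense (core) point: new cluster
        labels[i] = cluster_id
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id      # former noise becomes a border point
            if labels[j] is not None:
                continue                    # already assigned, do not expand again
            labels[j] = cluster_id
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:        # j is dense too: keep growing the cluster
                queue.extend(nbrs)
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (5, 5)]
print(dbscan(points, eps=1.5, min_pts=3))   # [0, 0, 0, 0, 1, 1, 1, -1]
```

The isolated point at (5, 5) has no dense neighborhood and is reported as noise, and the number of clusters is never specified in advance.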
Grid-based methods
• The grid-based clustering approach uses a multi-resolution grid data structure.
• It quantizes the object space into a finite number of cells that form a grid structure on which all of the operations for clustering are performed.
• The main advantage of the approach is its fast processing time, which is typically independent of the number of data objects, yet dependent on only the number of cells in each dimension of the grid.
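Quantizing the object space into grid cells can be sketched in a few lines. The example assumes a uniform grid with square cells of side `cell_size` and treats a cell as dense when it holds at least two points; both choices, like the sample points, are hypothetical and only meant to show the idea.

```python
from collections import Counter

def grid_cells(points, cell_size):
    """Map each point to the index of its grid cell and count points per cell."""
    return Counter(tuple(int(c // cell_size) for c in p) for p in points)

points = [(0.2, 0.3), (0.7, 0.9), (0.4, 0.1), (2.5, 2.5), (2.9, 2.1), (4.2, 0.3)]
cells = grid_cells(points, cell_size=1.0)

# Later clustering steps work on the cells, not on the individual points.
dense = [cell for cell, count in cells.items() if count >= 2]   # assumed threshold
print(cells)
print(dense)   # [(0, 0), (2, 2)]
```

Once the counts are stored, the cost of the remaining work depends on how many cells there are rather than on how many points were read.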
• Algorithms under Grid-based method:
• STING (Statistical Information Grid): The algorithm can be used on spatial queries. The spatial area is divided into rectangular cells, which are represented by a hierarchical structure. Statistical information regarding the attributes in each grid cell is pre-computed and stored.
• CLIQUE (CLustering In QUEst): It is a simple grid-based method for finding density-based clusters in subspaces.
Model-based methods
• Model-based methods are based on probability models, such as the finite mixture model for probability densities.
Subspace methods
• … are similar to each other in a subspace. The similarity is often captured by conventional measures such as distance or density. For example, the CLIQUE algorithm is a subspace clustering method.
• Top-down approaches start from the full space and search smaller and smaller subspaces
recursively. Top-down approaches are effective only if the subspace of a cluster can be
determined by the local neighborhood.
Graph based methods
• To find clusters in a graph, visualize cutting the graph into pieces, each piece being a cluster, such that the
vertices within a cluster are well connected and the vertices in different clusters are connected in a much weaker way.
• The size of the cut is the number of edges in the cut set. For weighted graphs, the size of a cut is the sum of the weights of the edges in the cut set.
• In graph theory and some network applications, a minimum cut is of importance. A cut is minimum if the cut’s size is
not greater than any other cut’s size. There are polynomial time algorithms to compute minimum cuts of graphs.
• There are two kinds of methods for clustering graph data, which address some challenges such as High computational
cost, High dimensionality, Sparsity, etc. One uses clustering methods for high-dimensional data, while the other