Session 3 Types of Machine Learning (1)
1. Supervised learning
2. Unsupervised learning
3. Reinforcement Learning
Supervised Learning
1 What is it?
• Supervised learning is an approach based on the use of labeled data.
• Labeled data is a set of known data samples with corresponding known target
outputs.
• This data is used to build a model that can predict future outputs.
• In supervised learning, the algorithm is provided with labeled training data and
learns to predict outputs from inputs.
Supervised Learning
1 What is it?
In supervised learning, the algorithm is provided with labeled training data and learns to predict
outputs from inputs.
2 How it works
The algorithm learns a mapping function from the input to the output, and can then apply that
function to new, unseen data.
3 Common Tasks
Common supervised learning tasks include classification (predicting categories) and
regression (predicting numerical values).
Supervised Learning Techniques
• Supervised ML algorithms usually take a limited set of labeled data and build models that can
make reasonable predictions for new data.
• We can split supervised learning algorithms into two main groups:
• Classification techniques and
• Regression techniques
Supervised Learning
Classification
• The most successful kinds of machine learning algorithms are those that automate decision-
making processes by generalizing from known examples.
• In supervised learning, the user provides the algorithm with pairs of inputs and desired outputs
(labels), and the algorithm finds a way to produce the desired output given an input.
• In particular, the algorithm is able to create an output for an input it has never seen before without
any help from a human.
• Going back to our example of spam classification, using machine learning, the user provides the
algorithm with a large number of emails (which are the input), together with information about
whether any of these emails are spam (which is the desired output). Given a new email, the
algorithm will then produce a prediction as to whether the new email is spam.
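The spam example above can be sketched in a few lines. This is only a minimal illustration, assuming a tiny Naive Bayes classifier over word counts; the emails, the labels "spam" and "ham", and the helper names `train` and `predict` are all made up for this sketch and are not part of the course material.

```python
import math
from collections import Counter

# Hypothetical labeled training data: (email text, desired output) pairs.
TRAINING_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("cheap loans win big", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count words per label and how often each label occurs."""
    word_counts = {}          # label -> Counter of words seen with that label
    label_counts = Counter()  # label -> number of emails with that label
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_emails = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_emails)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one (Laplace) smoothing avoids log(0) for unseen words.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_EMAILS)
print(predict(model, "claim your free prize"))      # spam
print(predict(model, "project meeting on monday"))  # ham
```

The two test emails never appear in the training data, which is the point of the exercise: the model generalizes from the labeled examples to inputs it has not seen before.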
Supervised Learning
Classification
• Machine learning algorithms that learn from input/output pairs are called supervised learning
algorithms because a “teacher” provides supervision to the algorithms in the form of the desired
outputs for each example that they learn from.
• Classification models are applied in speech and text recognition, object identification in images,
credit scoring, and others.
• Typical algorithms for creating classification models are Support Vector Machine (SVM), decision
tree approaches, k-nearest neighbors (KNN), logistic regression, Naive Bayes, and neural
networks.
• The following sessions describe the details of each of these algorithms.
Supervised Learning
Classification
• In classification, the goal is to predict a class label, which is a choice from a predefined list of
possibilities.
• Classification is sometimes separated into binary classification, which is the special case of
distinguishing between exactly two classes, and multiclass classification, which is classification
between more than two classes.
• You can think of binary classification as trying to answer a yes/no question. For example,
classifying emails as either spam or not spam is a binary classification problem.
• In this binary classification task, the yes/no question being asked would be “Is this email spam?”
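To make the binary/multiclass distinction concrete, here is a small sketch using k-nearest neighbors, one of the algorithms listed earlier. The fruit measurements and the function name `knn_predict` are hypothetical, and the features are left unscaled for brevity, so treat this as an illustration rather than a recommended setup.

```python
import math
from collections import Counter


def knn_predict(examples, point, k=3):
    """Predict the majority label among the k training points nearest to point."""
    by_distance = sorted(examples, key=lambda ex: math.dist(ex[0], point))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical data: (weight in grams, diameter in cm) -> class label.
FRUITS = [
    ((150, 7.0), "apple"), ((160, 7.5), "apple"), ((155, 7.2), "apple"),
    ((10, 2.0), "grape"), ((12, 2.2), "grape"), ((9, 1.8), "grape"),
    ((1200, 20.0), "melon"), ((1300, 22.0), "melon"), ((1250, 21.0), "melon"),
]

# Multiclass: the label is chosen from three predefined possibilities.
print(knn_predict(FRUITS, (158, 7.3)))
print(knn_predict(FRUITS, (1220, 20.5)))

# Binary: the same code, but only two classes remain in the training data.
APPLE_OR_GRAPE = [ex for ex in FRUITS if ex[1] != "melon"]
print(knn_predict(APPLE_OR_GRAPE, (11, 2.1)))
```

The classifier itself does not change between the two settings; only the list of possible labels in the training data does.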
Supervised Learning
Regression
• For regression tasks, the goal is to predict a continuous number, or a floating-point number in
programming terms (or real number in mathematical terms).
• Predicting a person’s annual income from their education, their age, and where they live is an
example of a regression task.
• When predicting income, the predicted value is an amount, and can be any number in a given
range.
• Another example of a regression task is predicting the yield of a corn farm given attributes such as
previous yields, weather, and number of employees working on the farm.
• The yield again can be an arbitrary number.
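A regression task like the income example can be sketched with ordinary least squares on a single feature. The numbers below are invented (and deliberately lie on a straight line so the result is easy to check by hand), and `fit_line` is a hypothetical helper name; real income data would be noisy and would use more than one input.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: y is approximated by slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of education -> annual income (in thousands).
years_of_education = [10, 12, 14, 16, 18]
annual_income = [30.0, 36.0, 42.0, 48.0, 54.0]

slope, intercept = fit_line(years_of_education, annual_income)

# The prediction is a continuous number, not a choice from a list of classes.
predicted_income = slope * 15 + intercept
print(slope, intercept, predicted_income)
```

Unlike a classifier, the fitted model can return any value along the line, including values that never occurred in the training data.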
Supervised Learning
Regression - Examples
• We have seen that regression models predict continuous responses such as changes in
temperature or values of currency exchange rates.
• Regression models are applied in algorithmic trading, forecasting of electricity load, revenue
prediction, and others.
• Creating a regression model usually makes sense when the outputs in the labeled data are real
numbers.
• Typical algorithms for creating regression models are linear and multivariate regression,
polynomial regression, and stepwise regression.
• We can use decision tree techniques and neural networks to create regression models too.
• The following sessions describe the details of some of these algorithms.
Unsupervised Learning
• Unsupervised learning algorithms do not use labeled datasets.
• They create models that use intrinsic relations in data to find hidden patterns that they can use for
making predictions.
• The most well-known unsupervised learning technique is clustering.
• Clustering involves dividing a given set of data into a limited number of groups according to some
intrinsic properties of data items.
• Clustering is applied in market research, different types of exploratory analysis,
deoxyribonucleic acid (DNA) analysis, image segmentation, and object detection.
• Typical algorithms for creating models for performing clustering are k-means, k-medoids, Gaussian
mixture models, hierarchical clustering, and hidden Markov models.
• Some of these algorithms are explained in the following sessions of this unit.
Unsupervised Learning
• Unsupervised learning subsumes all kinds of machine learning where there is no known output,
no teacher to instruct the learning algorithm.
• In unsupervised learning, the learning algorithm is just shown the input data and asked to extract
knowledge from this data.
Unsupervised Learning
What is it?
In unsupervised learning, the algorithm is given unlabeled data and has to find inherent patterns and
structures.
How it works
The algorithm explores the data to discover hidden insights without being given specific targets or
labels.
Common Tasks
Clustering, dimensionality reduction, and anomaly detection are common unsupervised learning tasks.
Unsupervised Learning
Clustering
• A method of grouping a set of objects in such a way that objects in the same group (cluster) are
more similar to each other than to those in other groups.
• Common Algorithms:
• K-means
• Hierarchical Clustering
• DBSCAN
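As a rough sketch of how clustering groups similar objects without any labels, the following implements the basic k-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points). The points, the choice of starting centroids, and the fixed iteration count are assumptions made for illustration; practical implementations use better initialization and a convergence check.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Basic k-means: alternate between assigning points and updating centroids."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points.
        for i in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignments


# Hypothetical unlabeled data: two visually separate groups of 2-D points.
POINTS = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Deliberately poor start: both initial centroids come from the same group.
centroids, assignments = kmeans(POINTS, initial_centroids=[POINTS[0], POINTS[1]])
print(assignments)
print(centroids)
```

No labels are supplied anywhere: the two groups emerge purely from the distances between the points.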
Unsupervised Learning
Association
• Technique used to find relationships between variables in large databases, often used in market
basket analysis.
• Common Algorithms:
• Apriori
• Eclat
• FP-Growth
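Association algorithms such as the ones listed above search for rules using measures like support and confidence. The sketch below only computes those two measures for a single hand-picked rule; it is not an implementation of Apriori, Eclat, or FP-Growth. The shopping baskets and function names are invented for illustration.

```python
# Hypothetical market-basket data: each set is one customer's transaction.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent): support of both over support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Rule under test: customers who buy diapers also buy beer.
rule_support = support({"diapers", "beer"}, TRANSACTIONS)
rule_confidence = confidence({"diapers"}, {"beer"}, TRANSACTIONS)
print(round(rule_support, 2), round(rule_confidence, 2))
```

Here three of the five baskets contain both items, and three of the four baskets with diapers also contain beer, so the rule has support 0.6 and confidence 0.75.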
Unsupervised Learning
Dimensionality Reduction
• Technique used to reduce the number of input variables in a dataset, simplifying the model without
losing significant information.
• Common Algorithms:
• Principal Component Analysis (PCA)
• t-Distributed Stochastic Neighbor Embedding (t-SNE)
• Linear Discriminant Analysis (LDA); note that LDA uses class labels, so it is a supervised technique
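To illustrate reducing the number of input variables, the sketch below applies the idea behind PCA to 2-D data and keeps a single coordinate per point. It is a simplified illustration under stated assumptions: it handles only two input variables, uses power iteration instead of a full eigendecomposition, and the data and function names are invented.

```python
import math


def first_principal_component(points, iterations=100):
    """Return the unit direction of greatest variance and the mean-centered points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    # Power iteration: repeatedly multiply by the covariance matrix and normalize.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        wx = cxx * vx + cxy * vy
        wy = cxy * vx + cyy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return (vx, vy), centered


def project(centered, direction):
    """Replace each 2-D point by its single coordinate along direction."""
    dx, dy = direction
    return [x * dx + y * dy for x, y in centered]


# Hypothetical data: two strongly correlated input variables.
DATA = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

direction, centered = first_principal_component(DATA)
reduced = project(centered, direction)
print(direction)
print(reduced)
```

Because the two variables move together, one coordinate along the diagonal direction preserves almost all of the variation in the original pairs.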
Unsupervised Learning
Anomaly Detection
• Technique used to identify rare items, events, or observations which raise suspicions by differing
significantly from the majority of the data.
• Common Algorithms:
• Isolation Forest
• One-Class SVM
• Local Outlier Factor (LOF)
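The idea of flagging observations that differ significantly from the majority can be shown with a much simpler rule than the algorithms listed above: a z-score threshold. This is not Isolation Forest, One-Class SVM, or LOF. The sensor readings, the threshold of 2.0, and the name `find_anomalies` are assumptions for this sketch.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values lying more than threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / spread > threshold]


# Hypothetical sensor readings: most are close to 10, one is far off.
READINGS = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1, 25.0, 10.0]

anomalies = find_anomalies(READINGS)
print(anomalies)
```

A single extreme value inflates both the mean and the standard deviation, which is one reason the dedicated algorithms above are often preferred in practice.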
Unsupervised Learning
Applications of Anomaly Detection
• Fraud Detection:
• Identifying fraudulent credit card transactions
• Network Security:
• Detecting unusual patterns indicating potential security breaches
• Manufacturing:
• Identifying defects in products
• Healthcare:
• Detecting rare diseases or abnormalities in medical images
Reinforcement Learning
Reinforcement Learning (RL) is a powerful machine learning
technique that enables agents to learn optimal behaviors
through trial-and-error interactions with an environment.
Environment
The world in which the agent operates and receives feedback in the form of
rewards or punishments.
Rewards
The feedback signal that the agent uses to learn which actions lead to
desirable outcomes.
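The interaction between agent, environment, and rewards can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: a five-cell corridor with a reward only at the right end, an agent that explores by choosing actions uniformly at random, and hypothetical values for the learning rate and discount factor. Q-learning is one RL algorithm among many and is not named in the text above.

```python
import random

N_STATES = 5         # cells 0..4 on a line; cell 4 is the goal
ACTIONS = (-1, +1)   # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)  # walls at both ends
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values by trial and error using the Q-learning update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # safety cap on episode length
            # Trial and error: try a random action and observe the feedback.
            action_index = rng.randrange(len(ACTIONS))
            next_state, reward, done = step(state, ACTIONS[action_index])
            # Target = immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action_index] += alpha * (target - q[state][action_index])
            state = next_state
            if done:
                break
    return q


q_table = train()

# Greedy policy: in each non-goal cell, pick the action with the highest value.
greedy_policy = [
    ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
    for row in q_table[:GOAL]
]
print(greedy_policy)
```

Although the agent only ever acts at random during training, the reward signal is enough for the learned values to favor moving right in every cell, which is the shortest route to the goal.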