
Chapter 5 – Advanced ML and Deep Learning

Introduction
Ensemble Learning ML

 Ensemble techniques in machine learning involve combining multiple
individual models (often referred to as base models or weak learners) to
improve overall predictive performance.

 These techniques leverage the concept that combining diverse models
can often produce more accurate and robust predictions than using a
single model.

2
Ensemble Learning ML

 Ensemble methods are widely used across various machine learning
tasks, including classification, regression, and anomaly detection.

3
Bagging

 Bagging, short for Bootstrap Aggregating, is an ensemble learning
technique used to improve the stability and accuracy of machine
learning algorithms.

 It works by combining the predictions of multiple base models, each
trained on a subset of the original dataset.

4
How Bagging Works

 Bagging starts by creating multiple bootstrap samples from the
original dataset.
 Bootstrap sampling involves randomly selecting samples from the
original dataset with replacement.
 This means that some instances may be selected multiple times, while
others may not be selected at all.
 Base Model Training: Once the bootstrap samples are created, a base
model (e.g., a decision tree) is trained on each bootstrap sample, as
sketched below.
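The bootstrap-and-train step can be written in a few lines. This is a minimal
illustration, assuming NumPy and scikit-learn are available; the arrays X and y and
the number of estimators are placeholders, not data from the slides:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_bagged_trees(X, y, n_estimators=10, seed=42):
    """Train one decision tree per bootstrap sample of (X, y)."""
    # X and y are assumed to be NumPy arrays.
    rng = np.random.default_rng(seed)
    models = []
    n = len(X)
    for _ in range(n_estimators):
        # Sample n row indices with replacement: some rows repeat, others are left out.
        idx = rng.integers(0, n, size=n)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models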

5
Bagging

 Bagging, short for Bootstrap Aggregating, is an ensemble learning
technique used to improve the stability and accuracy of machine
learning algorithms.

 It works by combining the predictions of multiple base models, each
trained on a subset of the original dataset.

 Voting or Averaging: After training all the base models, bagging
combines their predictions through a voting mechanism (for
classification tasks) or averaging (for regression tasks).

6
Bagging ML

Bagging helps to reduce overfitting by reducing the variance of the model.

7
Boosting

 Boosting is another ensemble learning technique used to improve the performance
of machine learning models.

 Unlike bagging, which trains each base model independently, boosting trains base
models sequentially, with each subsequent model focusing more on the instances
that were misclassified by the previous models.

 The succeeding models are dependent on the previous model.

8
How boosting works
A. Base Model Training:
Boosting starts by training a base model (often a weak learner) on the original dataset.
B. Instance Weighting:
After the first model is trained, the misclassified instances are given higher weights, while the
correctly classified instances are given lower weights. This allows subsequent models to focus more on
the difficult instances.
C. Sequential Training:
The subsequent models are trained sequentially, with each model focusing more on the instances that
were misclassified by the previous models. The weights of the instances are adjusted after each model
is trained.

9
Boosting cont..
D. Combining Predictions:
Finally, the predictions of all the models are combined through a weighted sum or voting
to produce the final prediction.

Examples: AdaBoost (Adaptive Boosting) and Gradient Boosting.

10
Bagging - Example
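A small worked example of bagging, sketched with scikit-learn's BaggingClassifier;
the synthetic dataset and parameter choices are illustrative assumptions only:

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 base models (decision trees by default), each trained on a bootstrap sample;
# their predictions are combined by majority vote for classification.
bag = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0)
bag.fit(X_train, y_train)
print("Bagging accuracy:", bag.score(X_test, y_test))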

11
AdaBoost - Example

Adaptive boosting, or AdaBoost, is one of the simplest boosting algorithms. Usually,
decision trees are used for modelling, as in the sketch below.
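A minimal sketch with scikit-learn's AdaBoostClassifier (the synthetic dataset and the
number of estimators are assumptions for illustration; by default the base models are
one-level decision trees):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new tree focuses more on instances the previous trees misclassified,
# and the final prediction is a weighted combination of all the trees.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X_train, y_train)
print("AdaBoost accuracy:", ada.score(X_test, y_test))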

12
Unsupervised Learning
➔ Unsupervised learning aims to find the underlying structure or the distribution of
data. We want to explore the data to find some intrinsic structures in it.

➔ The model itself finds the hidden patterns and insights in the given data.

➔ Unsupervised learning cannot be directly applied to a regression or classification
problem, because the training data has no output labels.

➔ Unsupervised learning is much like the way a human learns to think from their own
experiences, which makes it closer to real AI.

➔ Unsupervised learning works on unlabeled and uncategorized data, which makes
unsupervised learning more important.

13
Unsupervised Learning Tasks

Clustering: Clustering is a method of grouping objects into clusters such that objects with the
most similarities remain in the same group.

Association: An association rule is an unsupervised learning method used for finding
relationships between variables in large databases. It determines the sets of items that occur
together in the dataset.

Dimensionality Reduction: Reducing the number of features (variables) in a dataset while
preserving important information.
14
Unsupervised Learning algorithms

Below is a list of some popular unsupervised learning algorithms:

➔ K-means clustering

➔ Principal component analysis

15
K-means clustering

16
K-means Algorithm

Definition: K-Means is a partitioning clustering algorithm that separates a dataset
into K distinct, non-overlapping subsets (clusters).

Working Principle

Initialization: Randomly select K data points as the initial cluster centers.

Assignment: Assign each data point to the cluster whose center is nearest.

Update Centers: Recalculate each cluster center as the mean of the data points
assigned to that cluster.

Repeat: Iterate the Assignment and Update steps until convergence (when cluster
assignments stabilize).

17
Common Terms
Given k, the k-means algorithm is implemented in 5 steps:

https://domino.ai/blog/getting-started-with-k-means-clustering-in-python
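A minimal from-scratch sketch of the iterative procedure described on the previous slide
(initialize, assign, update, repeat), assuming NumPy; X is taken to be an
(n_samples, n_features) array:

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: pick centers, assign points, recompute means, repeat."""
    rng = np.random.default_rng(seed)
    # Initialization: choose k distinct data points as the starting centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment: each point joins the cluster with the nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: each center becomes the mean of the points assigned to it.
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Convergence: stop when the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers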

18
Python Implementation
EX: Problem Statement:

A retail store wants to gain insights about its customers and build a system that can
cluster customers into different groups, as sketched below.
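One way the clustering system might be sketched with scikit-learn's KMeans; the file
name customers.csv and the columns Annual Income and Spending Score are assumed
for illustration, not taken from the slides:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer data: file name and columns are placeholders.
df = pd.read_csv("customers.csv")
X = df[["Annual Income", "Spending Score"]].values

# Scale the features so both contribute equally to the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Group customers into 5 segments; in practice k would be chosen (e.g. with the elbow method).
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
df["Segment"] = kmeans.fit_predict(X_scaled)
print(df.groupby("Segment").size())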

19
Introduction to Deep Learning

Deep Learning

20
What is Deep Learning

 Deep learning is a subset of machine learning, which itself is a subset of
artificial intelligence (AI).

 Deep learning models are inspired by the structure and function of the
human brain, specifically neural networks, and they are designed to
automatically learn features and patterns from large amounts of data.

 The output from each preceding layer is taken as input by each of the
successive layers.

21
Deep Learning Cont..

 Deep learning models are capable of learning the relevant features
themselves, requiring only a little guidance from the programmer, and are
very helpful in addressing the problem of dimensionality.
 Deep learning algorithms are used especially when we have a huge
number of inputs and outputs.
 Deep learning is a collection of statistical machine learning techniques
for learning feature hierarchies, based on artificial neural networks.

22
Deep Learning Implementation

 Deep learning is implemented with the help of deep networks, which are
simply neural networks with multiple hidden layers.

23
Artificial Neural Network

 Neural networks process information in a way similar to the human
brain, and these networks actually learn from examples; you cannot
program them to perform a specific task.

 They learn only from past experiences and examples.

 An Artificial Neural Network is biologically inspired by the neural
networks that constitute the human brain.

24
Artificial Neural Network

25
Input Layer

 This layer consists of neurons that directly accept the input features of
the data.
 It does not perform any transformations or computations. Instead, it acts as
a channel to pass the raw data into the network.
 The primary role of the input layer is to serve as the entry point for the
input data into the neural network.
 For example, if the input is an image of 28x28 pixels, the input layer will
have 784 neurons (one for each pixel).

26
Hidden Layer

 Hidden layers are where the actual computation and learning take
place.

 These layers consist of neurons that apply weights and biases to the input,
followed by an activation function to introduce non-linearity.

 There can be one or multiple hidden layers, and each layer transforms
the input data into a more abstract and useful representation.

27
Output - Layer

 The output layer is the final layer of the neural network that produces the
output prediction.

 The number of neurons in this layer corresponds to the number of
output classes or, in the case of regression tasks, the number of values to
predict.

 In a classification task, it typically uses a softmax activation function to
provide class probabilities.

 In regression tasks, it may use a linear activation function.

28
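To make the three layer types concrete, here is a minimal sketch of such a network for the
28x28-pixel example, assuming Keras (tensorflow.keras) is installed; the hidden-layer
sizes are illustrative choices, not values from the slides:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # Input layer: 784 values, one per pixel of a 28x28 image; no computation here.
    keras.Input(shape=(784,)),
    # Hidden layers: weights and biases followed by a non-linear activation (ReLU).
    layers.Dense(128, activation="relu"),
    layers.Dense(64, activation="relu"),
    # Output layer: 10 neurons with softmax giving class probabilities for 10 classes.
    layers.Dense(10, activation="softmax"),
])
model.summary()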
Perceptrons

 The neural network's basic unit is called a perceptron.

 A perceptron can be defined as a single-layer neural network that
classifies linearly separable data.

 It consists of four major components.

 It was introduced by Frank Rosenblatt in 1957 and is used for binary
classification tasks.

29
Perceptron Parts

A. Input Features (x): These are the inputs to the perceptron.

 Each input is associated with a feature of the data.

 For example, in an image recognition task, each pixel value can be an
input feature.

B. Weights (w): Each input feature is multiplied by a corresponding
weight. Weights are parameters that the perceptron learns during training.

 They determine the importance of each input feature in making the
prediction.

30
Perceptron Parts
C. Bias (b):

 The bias term is added to the weighted sum of inputs. It allows the activation
function to be shifted left or right, enabling the perceptron to make more flexible
decisions.

 Bias helps adjust the curve of the activation function so as to produce a precise
output.

D. Activation Function: The weighted sum of the inputs plus the bias is passed through
an activation function. The activation function determines the output of the perceptron
based on this sum.


31
Perceptron Parts
 For a basic perceptron, the activation function is usually a step function (Heaviside step
function), which outputs 1 if the weighted sum is greater than a certain threshold and 0
otherwise.

 Activation functions introduce non-linearity, allowing networks to learn complex patterns
and relationships in data.
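Putting the four components together, a minimal sketch of a single perceptron's forward
pass with a step activation (NumPy assumed; the weights and bias below are illustrative
values, not learned ones):

import numpy as np

def perceptron_predict(x, w, b):
    """Weighted sum of inputs plus bias, passed through a Heaviside step function."""
    z = np.dot(w, x) + b          # weighted sum of inputs plus bias
    return 1 if z > 0 else 0      # step activation: 1 above the threshold, 0 otherwise

# Illustrative values only: three input features with hand-picked weights and bias.
x = np.array([1.0, 0.5, -0.2])
w = np.array([0.4, -0.6, 0.9])
b = 0.1
print(perceptron_predict(x, w, b))   # prints 1 for this particular choice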

32
Perceptron Parts

33
Types of Activation function

34
Types of Activation function

35
Types of Activation function

36
Types of Activation function

37
When to use which Activation function
Sigmoid: Use for binary classification tasks where the output needs to be interpreted as a
probability between 0 and 1.

Hyperbolic Tangent (Tanh): Use in hidden layers of neural networks for zero-centered
activations and when outputs need to be in the range (-1, 1).

Rectified Linear Unit (ReLU): Use in hidden layers of deep neural networks for faster
convergence and to overcome the vanishing gradient problem.

Leaky ReLU: Use to address the "dying ReLU" problem in deep neural networks where some
neurons become inactive during training.

Softmax: Use in the output layer of neural networks for multi-class classification tasks where
outputs need to be interpreted as probabilities and the sum of probabilities equals 1.
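The activation functions above can be written directly in NumPy; this is a small
illustrative sketch (the leaky-ReLU slope of 0.01 is a common but assumed choice):

import numpy as np

def sigmoid(z):                      # outputs in (0, 1), usable as probabilities
    return 1 / (1 + np.exp(-z))

def tanh(z):                         # zero-centered outputs in (-1, 1)
    return np.tanh(z)

def relu(z):                         # cheap to compute; helps against vanishing gradients
    return np.maximum(0, z)

def leaky_relu(z, alpha=0.01):       # small negative slope avoids the "dying ReLU" problem
    return np.where(z > 0, z, alpha * z)

def softmax(z):                      # multi-class probabilities that sum to 1
    e = np.exp(z - np.max(z))        # subtract the max for numerical stability
    return e / e.sum()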

38
When to use which Activation function

39
Perceptron Summary

40
Simple implementation
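As one possible simple implementation, a perceptron trained on the AND gate with the
classic perceptron learning rule; the learning rate and number of epochs are assumed
values for this sketch:

import numpy as np

# AND-gate training data: the output is 1 only when both inputs are 1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
lr = 0.1          # assumed learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        pred = 1 if np.dot(w, xi) + b > 0 else 0
        # Perceptron learning rule: nudge weights and bias toward the correct output.
        w += lr * (target - pred) * xi
        b += lr * (target - pred)

print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])   # expected: [0, 0, 0, 1]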

41
Thank you. Had a great time!

42
