Machine Learning: Dr. Windhya Rankothge (PhD, UPF Barcelona)
Machine Learning
• In data science, an algorithm is a sequence of statistical processing steps.
• In machine learning, algorithms are 'trained' to find patterns and features in massive amounts of data in order to make decisions and predictions based on new data.
• The better the algorithm, the more accurate the decisions and predictions will become as it processes more data.
Machine Learning
[Figure: Past: training data is used to learn a model/predictor. Future: the model/predictor is used to predict on testing data.]
Machine Learning
Using and improving the model
Supervised Learning Algorithms
• Supervised learning is where you have input variables (X) and an output variable (Y), and you use an algorithm to learn the mapping function from the input to the output:
Y = f(X)
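A minimal Python sketch of learning such a mapping, assuming scikit-learn is available; the iris dataset and logistic regression model are arbitrary illustrative choices, not part of the lecture material:

```python
# Fit an approximation of f on labelled (X, Y) pairs, then predict Y for new inputs.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Hold out part of the data to stand in for "future" (unseen) inputs.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)        # learn the mapping f: X -> Y from training data
y_pred = model.predict(X_test)     # apply the learned f to new inputs
print("Test accuracy:", model.score(X_test, y_test))
```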
Supervised Learning Algorithms
Classification: accurately assign test data into specific categories.
Supervised Learning Algorithms
Naïve Bayes
• A statistical classification technique based on Bayes' Theorem.
• One of the simplest and fastest supervised learning algorithms.
• The Naive Bayes classifier assumes that the effect of a particular feature in a class is independent of the other features.
• For example, a loan applicant is desirable or not depending on his/her income, previous loan and transaction history, age, and location.
• Even if these features are interdependent, they are still considered independently.
• This assumption simplifies computation, and that is why it is considered naive.
Naïve Bayes
• P(h): the probability of hypothesis h being true (regardless of the data). This is known as the prior probability of h.
• P(D): the probability of the data (regardless of the hypothesis). This is known as the prior probability of D (the evidence).
• P(h|D): the probability of hypothesis h given the data D. This is known as the posterior probability.
• P(D|h): the probability of data D given that hypothesis h is true. This is known as the likelihood.
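These quantities are tied together by Bayes' Theorem, which gives the posterior in terms of the likelihood and the priors:

$$ P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)} $$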
Naïve Bayes
• Assume we have a bunch of emails that we want to classify as spam or not spam.
• Our dataset has 15 Not Spam emails and 10 Spam emails. Some analysis was done, and the frequency of each word in each class was recorded.
Naïve Bayes
Exploring some probabilities:
• P(Dear|Not Spam) = 8/34
• P(Visit|Not Spam) = 2/34
• P(Dear|Spam) = 3/47
• P(Visit|Spam) = 6/47
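As a concrete illustration, here is a minimal Python sketch of the Naive Bayes scoring step. The message "Dear Visit" is a hypothetical example chosen because both word counts appear above; the class priors 15/25 and 10/25 come from the 15 Not Spam and 10 Spam emails, and the word totals 34 and 47 are the denominators of the probabilities above:

```python
# word -> (count in Not Spam, count in Spam), taken from the probabilities above
word_counts = {"Dear": (8, 3), "Visit": (2, 6)}
total_not_spam_words, total_spam_words = 34, 47

p_not_spam = 15 / 25   # prior P(Not Spam)
p_spam = 10 / 25       # prior P(Spam)

def score(message_words, class_index, total_words, prior):
    """Multiply the prior by P(word | class) for every word (naive independence)."""
    p = prior
    for word in message_words:
        p *= word_counts[word][class_index] / total_words
    return p

message = ["Dear", "Visit"]          # hypothetical message, for illustration only
print("Score(Not Spam):", score(message, 0, total_not_spam_words, p_not_spam))
print("Score(Spam):    ", score(message, 1, total_spam_words, p_spam))
# The class with the larger score is the prediction.
```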
Naïve Bayes
So, using Bayes' Theorem, suppose we want to classify the message "Hello friend". But P(Hello friend | Not Spam) = 0, because the exact phrase "Hello friend" never occurs in our dataset (the recorded frequencies are for single words, not whole sentences). Likewise, P(Hello friend | Spam) = 0, which would make the scores for both spam and not spam zero, and that has no meaning!
Naïve Bayes
But wait! We said that Naive Bayes assumes that `the features we use to predict the target are independent`, so we can treat each word on its own, as sketched below.
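Under that independence assumption, the probability of the whole message factorizes into per-word probabilities:

$$ P(\text{Hello friend} \mid \text{Spam}) = P(\text{Hello} \mid \text{Spam}) \times P(\text{friend} \mid \text{Spam}) $$

and, in general, for a message with words x_1, ..., x_n and a class c:

$$ P(x_1, \dots, x_n \mid c) = \prod_{i=1}^{n} P(x_i \mid c) $$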
Naïve Bayes
Now let's calculate the probability of being spam using the same procedure, multiplying the prior P(Spam) by the per-word probabilities.
Supervised Learning Algorithms
Support Vector Machines (SVM)
• Typically leveraged for classification problems (it can be used for regression too), SVM constructs a hyperplane where the distance between the two classes of data points is at its maximum.
• This hyperplane is known as the decision boundary, separating the classes of data points (e.g., oranges vs. apples) on either side of the plane.
Support Vector Machines (SVM)
• Plot each data item as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a particular coordinate.
• Perform classification by finding the hyperplane that differentiates the two classes very well.
Which hyperplane?
Support Vector Machines (SVM)
• The vector points closest to the hyperplane are known as the support vectors, because only these points contribute to the result of the algorithm; the other points do not.
• The distance between the hyperplane and the closest support vector points is called the margin.
• We would like to choose the hyperplane that maximizes the margin between the classes, as in the sketch below.
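A minimal Python sketch of fitting such a maximum-margin classifier, assuming scikit-learn; the two synthetic blobs of points are stand-ins for the two classes:

```python
# Fit a linear SVM and inspect the hyperplane and its support vectors.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two synthetic clusters stand in for the two classes (e.g., oranges vs. apples).
X, y = make_blobs(n_samples=100, centers=2, random_state=0)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print("Support vectors per class:", clf.n_support_)
print("Hyperplane coefficients (w):", clf.coef_)
print("Intercept (b):", clf.intercept_)
```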
Support Vector Machines (SVM)
• Maximizing the margin is equivalent to minimizing the loss (minimizing misclassification).
• The loss function that SVM uses is known as the hinge loss, shown below.
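With labels encoded as y ∈ {−1, +1} and decision function f(x) = w·x + b, the hinge loss and the soft-margin objective it appears in are usually written as (standard formulation, stated here for reference):

$$ \ell\bigl(y, f(x)\bigr) = \max\bigl(0,\; 1 - y\, f(x)\bigr) $$

$$ \min_{w,\,b}\;\; \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \max\bigl(0,\; 1 - y_i\,(w \cdot x_i + b)\bigr) $$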
Support Vector Machines (SVM)
• SVM has a technique called the kernel trick.
• Kernels are functions that take a low-dimensional input space and transform it into a higher-dimensional space.
• This converts a non-separable problem into a separable problem, as in the sketch below.
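A minimal Python sketch of this, assuming scikit-learn: the concentric-circles data are not linearly separable in two dimensions, but an RBF kernel SVM separates them by implicitly working in a higher-dimensional space:

```python
# Compare a linear-kernel SVM with an RBF-kernel SVM on non-linearly-separable data.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print("Linear kernel training accuracy:", linear_svm.score(X, y))  # typically near chance
print("RBF kernel training accuracy:   ", rbf_svm.score(X, y))     # typically near 1.0
```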
Supervised Learning Algorithms
K-Nearest Neighbors (K-NN)
• Can be used for regression as well as for classification, but it is mostly used for classification problems.
• A non-parametric algorithm, which means it does not make any assumptions about the underlying data.
K-Nearest Neighbors (K-NN)
• Research has shown that no single optimal number of neighbors (k) suits all kinds of data sets.
• Each dataset has its own requirements.
• With a small number of neighbors, noise has a higher influence on the result, while a large number of neighbors makes the algorithm computationally expensive (see the sketch below).
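A minimal Python sketch comparing a few values of k, assuming scikit-learn; the iris dataset and the particular k values are arbitrary illustrative choices:

```python
# Train K-NN classifiers with different k and compare their held-out accuracy.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for k in (1, 5, 15):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)   # fit() essentially memorizes the training data (non-parametric)
    print(f"k={k:2d}  test accuracy: {knn.score(X_test, y_test):.3f}")
```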
Supervised Learning Algorithms
Linear Regression
• Performs the task of predicting a dependent variable value (y) based on a given independent variable (x).
• So, this regression technique finds a linear relationship between x (input) and y (output), as in the sketch below.
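A minimal Python sketch, assuming scikit-learn and NumPy; the data are synthetic, generated from a known line plus noise, purely for illustration:

```python
# Fit a linear regression model and recover the slope and intercept of the underlying line.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(50, 1))             # independent variable
y = 3.0 * x[:, 0] + 2.0 + rng.normal(0, 1, 50)   # dependent variable: roughly y = 3x + 2

model = LinearRegression().fit(x, y)
print("Learned slope:    ", model.coef_[0])      # close to 3
print("Learned intercept:", model.intercept_)    # close to 2
print("Prediction at x=4:", model.predict([[4.0]])[0])
```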