Machine Learning
ABDELA AHMED, PhD
© University of Gondar, 2022
Course outline for ML
Course Description
This course is an introduction to machine learning (ML).
It covers a set of topics that will help anyone master the
most important algorithms and concepts in
machine learning, including deep learning
and many other techniques, and build effective
learning systems.
Objective
To build computer systems that learn from
experience and that are capable of adapting to their
environments
Upon completing this course, you should be able
to:
Explain and differentiate between classical and modern
machine learning techniques
Identify potential application areas where machine
learning techniques can be useful
Implement a solution and evaluate the results
Contents: has six main chapters
1. Overview of Machine Learning
Introduction to AI and ML
Machine Learning vs Datamining
Machine Learning vs Statistical learning
Applications of Machine learning
Challenges in Machine learning
Basic Mathematical Concepts for AI and ML
2. Classic Machine Learning
Classification and regression
Basic steps of classification
Logistic and linear regressions
K-nearest neighbor (KNN)
Decision Tree
Naïve Bayes
Support Vector Machine (SVM)
Data preprocessing and representations
Evaluation methods
3. Deep Learning
Artificial Neural Network
Convolutional Neural Network
Deep Boltzmann Machine
Deep Belief Network
Autoencoders
Recurrent Neural Network
Long Short Term Memory
4. Clustering
Overview of unsupervised learning
Partition based clustering
k-means
K-Medoid
Hierarchical clustering
AGNES-agglomerative
Density-based clustering
Grid-based clustering
Model-based clustering
Evaluation of cluster quality
5. Reinforcement learning
Introduction to reinforcement learning
Bandit problems and Online learning
Overview of Some popular reinforcement learning
algorithms (Q-learning, Deep Q-learning, Policy gradients,
Actor Critic, and PPO )
Application of Reinforcement learning
Challenges of Reinforcement learning
6. Advanced topics
Semi-supervised learning (S3VMs)
Generative Adversarial Networks
Dimensionality reduction techniques (PCA, LDA, FA)
Handling Imbalanced Datasets
Handling multimodal Inputs
Ensemble Learning
Transfer Learning
Interpretability of Deep Learning Models
Course Prerequisites
In order to be successful in this course, you will need a working
knowledge of the following:
Fundamental understanding of
Calculus (partial derivatives),
Linear Algebra (vector/matrix manipulations, properties),
Basic statistics (probability, common distributions, Bayes Rule, mean,
median, mode and maximum likelihood)
Familiarity with programming on a Python development
environment and other ML frameworks
Run deep learning models on your local computer and remote
machines such as Google Colab
Please install these tools:
NumPy
pandas
Matplotlib
scikit-learn
Resources: Books
Tom M. Mitchell, Machine Learning, McGraw Hill
Bishop, Pattern Recognition and Machine Learning,
Springer, 2006
Ethem Alpaydin, Introduction to Machine Learning, The
MIT Press, 2nd Edn., 2010
LeCun, Y., Bengio, Y., and Hinton, G. Deep Learning.
Nature 521: 436-444, 2015.
Marc Toussaint, Maths for Intelligent Systems, 2017.
...
Resources: Journals
Journal of Machine Learning Research [Link]
Machine Learning
Neural Computation
Neural Networks
IEEE Transactions on Neural Networks
IEEE Transactions on Pattern Analysis and Machine
Intelligence
Annals of Statistics
Journal of the American Statistical Association
...
Resources: Conferences
International Conference on Machine Learning (ICML)
[Link]
European Conference on Machine Learning (ECML)
[Link]
Neural Information Processing Systems (NeurIPS)
[Link]
International Conference on Learning Representations (ICLR)
[Link]
Conference on Computer Vision and Pattern Recognition (CVPR)
[Link]
Uncertainty in Artificial Intelligence (UAI)
[Link]
Computational Learning Theory (COLT)
[Link]
International Joint Conference on Artificial Intelligence (IJCAI)
[Link]
International Conference on Neural Networks (Europe)
[Link]
...
Resources: Datasets
UCI Repository:
[Link]
UCI KDD Archive:
[Link]
Statlib: [Link]
Delve: [Link]
Resources: Top ML researchers
CHAPTER 1:
Overview of ML
In this overview, we will explain
Artificial Intelligence (AI), Machine Learning (ML)
and Deep Learning (DL)
how DL helps solve classical ML limitations
types of Machine Learning algorithms
workflows of Machine Learning
application areas of Machine Learning
challenges of Machine Learning
What do you know about Machine learning?
How can we solve a specific problem?
As computer scientists, we write a program that encodes
a set of rules that are useful to solve the problem
Figure: How can we make a robot cook?
AI breakthroughs: what do you know about ML?
In many cases, it is very difficult to specify those rules,
e.g., given a picture, determine whether there is a cat in the
image
As of 2015, computers can be trained to
perform better on this task than humans using the
latest ML techniques
As of 2016, computers have achieved
near-human performance for machine translation
“About 100 years ago, electricity transformed every
major industry. AI has advanced to the point where it
has the power to transform… every major sector in
coming years.” - Andrew Ng, Stanford University
Definitions
The real question is: what is learning?
Using past experience to improve future performance
For a machine, experiences come in the form of data
Learning simply means incorporating information from
the training examples into the system
Learning systems are not directly programmed to solve
a problem; instead, they develop their own program based on
examples of how they should behave
trial-and-error experience from trying to solve the problem
What do you know about Machine Learning
Machine learning is the study and construction of
programs that are not explicitly programmed, but learn
patterns as they are exposed to more data over time.
Instead of writing a program by hand, we collect lots of
examples that specify the correct output for the given
input.
A machine learning algorithm then takes these
examples and produces a program that does the job.
The program produced by the learning algorithm may look
very different from a typical hand-written program. It may
contain millions of numbers.
If we do it right, the program works for new cases as well as
the ones we trained it on.
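As a minimal sketch of this idea, the snippet below fits a straight line to a handful of input-output examples with NumPy and then applies the learned "program" (here just two numbers) to an input it has not seen. The data and the names `xs`, `ys` and `learned_program` are made up for illustration.

```python
import numpy as np

# Hypothetical examples: each input x comes with its correct output y.
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 3.0, 5.0, 7.0])  # generated by y = 2x + 1

# The "learning algorithm": a least-squares fit of a straight line.
slope, intercept = np.polyfit(xs, ys, deg=1)


def learned_program(x):
    """The program produced by learning is just the two fitted numbers."""
    return slope * x + intercept


# Apply it to a new case that was not among the training examples.
prediction = learned_program(10.0)
print(round(prediction, 3))
```

A real model may contain millions of such numbers instead of two, but the principle is the same.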
Definitions
AI: the branch of computer science dealing
with the simulation of intelligent behavior
in computers.
Any program that can sense, reason, act, and
adapt.
This includes rule-based systems, which do not
learn as more data comes in.
ML: the study of programs that are not
explicitly programmed, but instead
learn patterns from data.
Algorithms whose performance improves as they
are exposed to more data over time.
DL: a subset of ML that uses multilayered
neural networks to analyze
patterns in vast amounts of data, with a
structure that is similar to the human
neural system.
What is ML: summary
Arthur Samuel (1959) defined machine learning as “a
sub-field of computer science that gives computers
the ability to learn without being explicitly
programmed.”
It means that ML is able to perform a specified task
without being directly told how to do it.
These programs learn from repeatedly seeing data,
rather than being explicitly programmed by humans.
Example:
Decide whether emails are spam or not spam.
We would start off with a dataset in which a bunch of emails
are labeled spam vs not spam.
These emails are preprocessed and fed through an ML algorithm
that learns the patterns for spam vs not spam
Once the ML algorithm is trained, we can use it to make predictions as new
emails come in
So we train on this labeled dataset, and then run the model in
production: as new emails come in, we predict spam vs not spam
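The spam workflow above can be sketched with scikit-learn. This is only an illustration: the six emails and their labels are invented, and word counts with a Naïve Bayes classifier are one possible choice of preprocessing and algorithm, not the only one.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labeled dataset (1 = spam, 0 = not spam).
emails = [
    "win a free prize now",
    "free money claim your prize",
    "cheap loans win cash now",
    "meeting agenda for monday",
    "please review the project report",
    "lunch with the team on friday",
]
labels = [1, 1, 1, 0, 0, 0]

# Preprocessing (word counts) followed by the learning algorithm.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(emails, labels)

# "Production": predict spam vs not spam for new, unseen emails.
new_emails = ["claim your free cash prize", "project meeting on monday"]
predictions = model.predict(new_emails)
print(predictions)
```

With a real mailbox you would need far more examples and a held-out test set to judge how well the model works.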
ML terminology
In this example, we learn to classify wine quality from a set of
measurement features and wine type
Fixed acidity  Volatile acidity  Citric acid  Density  pH    Alcohol  Type   Quality
7.4            1.9               0            0.9978   3.51  9.4      red    Poor
8.5            0.28              0.56         0.9969   3.3   10.5     red    Excellent
6.7            0.24              0.3          0.9919   3.04  11.3     white  Excellent
8.1            0.27              0.41         0.9908   2.99  12       white  Poor
7.3            0.65              0            0.9946   3.39  10       red    Excellent
7              0.27              0.36         1.001    3     8.8      white  Excellent
7.8            0.88              0            0.9968   3.2   9.8      red    Poor
ML terminology
Features (explanatory variables)
Attributes of the data used for prediction
Number of features to predict target variable?
Target
Category or value that we are trying to predict (here, the Quality column of the wine table)
Example (Observation)
A single data point within the data (one row)
Number of examples?
Label
The target value for a single data point
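The two questions above can be checked by loading the wine table into pandas. This is a small sketch; the column names such as `fixed_acidity` are identifiers chosen here for illustration, and the values are typed in from the table.

```python
import pandas as pd

# The wine table from the slides, typed in by hand.
columns = ["fixed_acidity", "volatile_acidity", "citric_acid",
           "density", "pH", "alcohol", "type", "quality"]
rows = [
    [7.4, 1.9, 0.0, 0.9978, 3.51, 9.4, "red", "Poor"],
    [8.5, 0.28, 0.56, 0.9969, 3.3, 10.5, "red", "Excellent"],
    [6.7, 0.24, 0.3, 0.9919, 3.04, 11.3, "white", "Excellent"],
    [8.1, 0.27, 0.41, 0.9908, 2.99, 12.0, "white", "Poor"],
    [7.3, 0.65, 0.0, 0.9946, 3.39, 10.0, "red", "Excellent"],
    [7.0, 0.27, 0.36, 1.001, 3.0, 8.8, "white", "Excellent"],
    [7.8, 0.88, 0.0, 0.9968, 3.2, 9.8, "red", "Poor"],
]
wine = pd.DataFrame(rows, columns=columns)

X = wine.drop(columns="quality")  # features: every column except the target
y = wine["quality"]               # target column

n_examples, n_features = X.shape  # one row per example, one column per feature
first_label = y.iloc[0]           # label: the target value of a single example
print(n_features, n_examples, first_label)  # 7 7 Poor
```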
What is ML: summary
A widely accepted formal definition by Tom Mitchell
(1997, professor at Carnegie Mellon University):
A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P,
if its performance at tasks in T, as measured by P,
improves with experience E.
According to this definition, we can reformulate the
previous email classification problem as
the task of identifying spam messages (task T) using the data
of previously labeled email messages (experience E) through a
machine learning algorithm, with the goal of improving the
accuracy of future spam labeling (performance measure P)
Types of ML
Supervised Learning
Unsupervised Learning
Semi-supervised Learning
Reinforcement Learning
Types of ML: Supervised vs Unsupervised
                       Dataset                       Goal                        Example
Supervised Learning    Has a target column           Make predictions            Fraud detection
Unsupervised Learning  Doesn’t have a target column  Find structure in the data  Customer segmentation
Supervised Learning:
classification vs regression
Unsupervised Learning
Clustering
Dimensionality reduction
Classic ML example: fraud detection
Suppose you wanted to identify fraudulent credit
card transactions
Detecting fraud is a common Machine Learning problem
You can define your features to be
transaction time,
transaction amount,
transaction location,
category of purchase.
The algorithm could learn which feature combinations
suggest unusual activity
Structured data with intuitive features like these is a
good fit for traditional Machine Learning
Classic ML limitations
Suppose you wanted to determine if an image is
of a cat or a dog.
What features would you use?
For images, the data is numerical:
it encodes the color of each
individual pixel within the image
So each pixel could be used as a feature
But even a small image of
256 by 256 pixels has over
65,000 pixels.
Sixty-five thousand pixels mean 65,000 features,
which is a huge number of features to work
with
Another issue is that by treating each pixel as an
individual feature, you lose its spatial relationship to the
pixels around it
In other words, the information in a pixel makes
sense only relative to its surrounding pixels
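The arithmetic and the loss of spatial structure can be seen in a few lines of NumPy. This is a sketch on a randomly generated image, used only as a stand-in for a real photo.

```python
import numpy as np

# A hypothetical 256 x 256 grayscale image with random pixel intensities.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(256, 256))

# Classic ML models expect one flat feature vector per example.
features = image.reshape(-1)
n_features = features.size
print(n_features)  # 65536

# Two vertically adjacent pixels end up a full row apart in the flat vector,
# so the model no longer sees that they are neighbors.
row, col = 10, 20
index_here = row * 256 + col
index_below = (row + 1) * 256 + col
gap = index_below - index_here
print(gap)  # 256
```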
This is where deep learning comes in.
Deep learning techniques can learn
these features on their own and
combine pixels to capture these spatial
relationships
Deep learning is a type of ML that involves using very
complicated models called “deep neural networks”
DL models determine the best representation of the
original data; in classic ML, humans must do this.
DL vs Classic ML with Example
Classic Machine Learning
Step 1: Determine features (feature detection)
Step 2: Feed them through model (ML classifier algorithm)
Output: “John”
Deep Learning
Steps 1 and 2 are combined into 1 step
Tasks that require Machine Learning:
What makes a 2?
Tasks that benefit from machine learning:
cooking
Learning – a two step process
Model construction
A training set is used to create the model.
The model is represented as classification rules, decision
trees, or mathematical formula
Training Data -> Classification Algorithms -> Classifier (Model)
Training data:
NAME  RANK            YEARS  TENURED
Mike  Assistant Prof  3      no
Mary  Assistant Prof  7      yes
Bill  Professor       2      yes
Jim   Associate Prof  7      yes
Dave  Assistant Prof  6      no
Anne  Associate Prof  3      no
Classifier (Model):
IF rank = ‘professor’ OR years > 6 THEN tenured = ‘yes’
Model usage
The test set is used to see how well the model works for classifying
future or unknown objects
Testing Data -> Classifier (Model)
Testing data:
NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes
Unseen data: (Jeff, Professor, 4)
Tenured?
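The model-usage step can be traced in plain Python. The sketch below hard-codes the rule from the training slide (it does not learn it) and applies it to the test set and to the unseen example; the function name `predict_tenured` is chosen here for illustration.

```python
def predict_tenured(rank, years):
    """Classification rule from the slide: professor OR more than 6 years."""
    return "yes" if rank == "Professor" or years > 6 else "no"


# Test set from the slide: (name, rank, years, actual tenured value).
test_set = [
    ("Tom", "Assistant Prof", 2, "no"),
    ("Merlisa", "Associate Prof", 7, "no"),
    ("George", "Professor", 5, "yes"),
    ("Joseph", "Assistant Prof", 7, "yes"),
]

# Compare predictions with the known answers to estimate accuracy.
correct = sum(
    predict_tenured(rank, years) == actual
    for _, rank, years, actual in test_set
)
accuracy = correct / len(test_set)
print(accuracy)  # 0.75 (the rule misclassifies Merlisa)

# Apply the model to unseen data: (Jeff, Professor, 4).
jeff = predict_tenured("Professor", 4)
print(jeff)  # yes
```

The rule fits all six training rows but gets one of the four test rows wrong, which is why a model is judged on data it was not built from.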
Challenges in Machine Learning
Data
Obtaining large amounts of data
Unclear data acquisition and representation issues
Protection of security, integrity, and privacy
Handling high-dimensionality
Handling noise, incomplete and imbalanced data
Computational resources
CPU, GPU, Cloud
Algorithms
Selection of algorithms
Efficiency and scalability of machine learning algorithms
Degree of interpretability
Basic steps in Machine Learning
1. Problem statement
What problem are you trying to solve?
2. Data collection
What data do you need to solve it?
3. Data pre-processing
How should you clean the data so your model can use it?
4. Feature engineering
Select representative features to improve performance
5. Modeling
Build a model to solve your problem
6. Validation
Did you solve the problem?
7. Deployment
Put it into production
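Steps 1 to 6 can be sketched end to end with scikit-learn. The example below is only an illustration under stated assumptions: it uses the built-in iris dataset as a stand-in problem, feature scaling as the pre-processing step, and logistic regression as the model. Feature engineering is skipped because the dataset already has four clean features, and deployment is out of scope.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Steps 1-2: the problem (classify iris species) and the collected data.
X, y = load_iris(return_X_y=True)

# Hold out data so validation uses examples unseen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Steps 3 and 5: pre-processing (scaling) and modeling in one pipeline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 6: validation on the held-out test set.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```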
Applications
Association
Supervised Learning
Classification
Regression
Unsupervised Learning
Reinforcement Learning
Learning Associations
Basket analysis:
P (Y | X ) probability that somebody who buys X also
buys Y where X and Y are products/services.
Example: P ( chips | beer ) = 0.7
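A conditional probability like this can be estimated by counting baskets. The sketch below uses a small invented list of baskets, so its result (0.75) differs from the 0.7 in the example above.

```python
# Hypothetical shopping baskets, one set of products per customer.
baskets = [
    {"beer", "chips"},
    {"beer", "chips", "salsa"},
    {"beer", "bread"},
    {"chips", "salsa"},
    {"beer", "chips", "bread"},
    {"milk", "bread"},
]


def conditional_probability(baskets, x, y):
    """Estimate P(y | x) as count(x and y) / count(x)."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        raise ValueError(f"no basket contains {x!r}")
    with_both = [b for b in with_x if y in b]
    return len(with_both) / len(with_x)


p = conditional_probability(baskets, "beer", "chips")
print(p)  # 0.75: three of the four beer baskets also contain chips
```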
Classification
Example: Credit scoring
Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2
THEN low-risk ELSE high-risk
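The discriminant translates directly into code. In this sketch the threshold values `THETA1` and `THETA2` are arbitrary placeholders; in practice a learning algorithm would choose them from labeled customer data.

```python
def credit_risk(income, savings, theta1, theta2):
    """Discriminant: low-risk only if both thresholds are exceeded."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"


# Hypothetical thresholds; a trained model would estimate these from data.
THETA1 = 30_000  # income threshold
THETA2 = 5_000   # savings threshold

first = credit_risk(45_000, 8_000, THETA1, THETA2)
second = credit_risk(45_000, 2_000, THETA1, THETA2)
print(first)   # low-risk
print(second)  # high-risk
```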
Applications
Transportation: autonomous cars, automated
tracking, shipping, search and rescue
Communication: language translation
Healthcare: enhancing diagnosis, drug discovery
Industry: factory automation, precision agriculture
Finance: fraud detection, algorithmic trading
Energy: oil and gas exploration, conservation
Government: defense, safety and security, smarter
cities
and more...
Unsupervised Learning
Learning “what normally happens”
No output
Clustering: Grouping similar instances
Example applications
Customer segmentation in CRM
Image compression: Color quantization
Bioinformatics: Learning motifs
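As a minimal sketch of clustering for customer segmentation, the snippet below runs k-means from scikit-learn on synthetic customers. The two features (annual spend, visits per month) and the two generated groups are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: (annual spend, visits per month), two loose groups.
rng = np.random.default_rng(0)
low = rng.normal(loc=[20.0, 2.0], scale=1.0, size=(20, 2))
high = rng.normal(loc=[80.0, 10.0], scale=1.0, size=(20, 2))
customers = np.vstack([low, high])

# No labels are given; k-means groups similar instances on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
labels = kmeans.labels_
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which customers end up in the same group.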
Reinforcement Learning
Learning a policy: A sequence of outputs
No supervised output but delayed reward
Credit assignment problem
Game playing
Robot in a maze
Multiple agents, partial observability, ...
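The "robot in a maze" idea can be sketched with tabular Q-learning on a tiny one-dimensional corridor. Everything here is an illustrative assumption: the five-cell corridor, the reward of 1 at the last cell, the purely random exploration, and the values of `ALPHA` and `GAMMA`.

```python
import numpy as np

N_STATES = 5            # corridor cells 0..4; the reward sits in cell 4
LEFT, RIGHT = 0, 1
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Move one cell; reward 1 only when the goal cell is reached."""
    nxt = max(state - 1, 0) if action == LEFT else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, 2))
for _ in range(500):  # episodes
    state, done = 0, False
    while not done:
        action = int(rng.integers(2))  # explore with random actions
        nxt, reward, done = step(state, action)
        # Q-learning update: the delayed reward propagates backwards,
        # assigning credit to earlier moves.
        target = reward + (0.0 if done else GAMMA * Q[nxt].max())
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = nxt

# Learned policy: the greedy action in each non-goal cell.
policy = Q[:-1].argmax(axis=1)
print(policy)
```

No cell is ever labeled with the correct move; the agent only sees a reward at the end, and the learned policy should still prefer moving right in every cell.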
Project: guideline
Competition between ML algorithms.
You will be given some data for training an ML system, and you
will try to develop the best method
You should prepare slides for a 10-minute presentation, with 5
minutes for the question and answer session
The final report of your project should consist of Abstract,
Introduction, Statement of the problem, Research Questions,
Objectives, Related Works, Results and Discussions,
Conclusion and Future works, and References.
Final report submission and presentation of your work are due 1 week
before the end of class
Projects which have well-designed experiments and a thorough
analysis of the results will score higher.
A clear, well-written report will also help you earn a
good grade
Project: list of potential topics - category 1
1. Drug Prediction System
2. Prediction of Heart Disease Presence
3. Prediction of Cervical Cancer
4. Water quality Prediction
Project: list of potential topics - category 2
1. Breast Cancer Classification from mammography
images
2. Brain tumor classification from MRI images
3. Brain tumor classification from Histology images
4. Covid-19 detection from X-ray images
5. Tuberculosis classification from CT images
Assignment: topics
1. Support Vector Machine
2. Naïve Bayes and Random Forest
3. Deep Boltzmann Machine
4. Deep Belief Network
5. Deep Learning with Autoencoder
6. Density-based Clustering (DBSCAN)
7. Grid-based Clustering (STING)
8. Deep Reinforcement Learning
9. Semi-supervised Learning
Provisional Calendar
Week-1: June 6-12
Chapter-1: Overview of ML
Chapter-2: Regression algorithms
Week-2: June 13-19
Chapter-2: KNN, DT, Evaluation metrics
Week-3: June 20-26
Chapter-3: NVB, SVM, ANN,CNN
Week-4 : June 27- July 3
Chapter-3: RNN, LSTM, DBM, DBN, Autoencoder
Week-5: July 4 – 10
Chapter-4: K-means, K-medoids, AGNES, Density-based, Grid-
based
Week-6: July 11-17
Chapter-5: Reinforcement learning
Chapter-6: Dimensionality reduction techniques, Ensemble
learning, Transfer learning, Interpretability, Handling Imbalanced
and multimodal inputs, Semi-supervised learning
Week 7 : July 18-24
Revision
Week 8 : July 25 - August 6
Project and Exam preparation
Summary
Machine Learning
Study of algorithms that
improve their performance
at some task
with experience
Factors that have contributed to the current state
of Machine Learning:
bigger data sets,
faster computers,
open source packages, and
a wide range of neural network architectures.