Lecture 1 Intro

This document provides an introduction to machine learning including definitions, types of machine learning systems, applications, and examples. It discusses supervised learning and classification/regression tasks. Key topics covered include spam filtering, game of Go, DeepMind, and quotes about the potential of and concerns with machine learning.

CSM 6405: Symbolic ML II

Lecture 1: Introduction

Prof. Dr. Md. Rakib Hassan
Dept. of Computer Science and Mathematics,
Bangladesh Agricultural University.
Email: rakib@bau.edu.bd
Books
❖ Machine Learning
❑ Tom M Mitchell
❖ Artificial Intelligence: A Modern Approach
❑ Stuart Russell and Peter Norvig
❖ The Hundred-Page Machine Learning Book
❑ Andriy Burkov
❖ Hands-On Machine Learning with Scikit-Learn and TensorFlow:
Concepts, Tools, and Techniques to Build Intelligent Systems
❑ Aurélien Géron
❖ Pattern Recognition and Machine Learning (Information
Science and Statistics)
❑ Christopher M. Bishop
❖ Deep Learning
❑ Ian Goodfellow and Yoshua Bengio

PROFESSOR DR. MD. RAKIB HASSAN 2

Books (Cont.)
❖ Data Science from Scratch
❑ Joel Grus

❖ Deep Learning with Python

❑ Francois Chollet

❖ Machine Learning: An Algorithmic Perspective

❑ Stephen Marsland

❖ Python Machine Learning

❑ Sebastian Raschka and Vahid Mirjalili

❖ Learning from Data

❑ Yaser S. Abu-Mostafa, Malik Magdon-Ismail, and Hsuan-Tien
Lin


Resources
❖ Machine Learning by Andrew Ng (Coursera)
❑ http://coursera.org

❖ Geoffrey Hinton’s Neural Network and Deep Learning

❑ http://www.cs.toronto.edu/~hinton/

❖ Scikit-Learn’s user guide

❑ https://scikit-learn.org/stable/user_guide.html

❖ Dataquest interactive tutorials

❑ https://www.dataquest.io

❖ Deep learning
❑ http://deeplearning.net/

❖ Competitions
❑ https://www.kaggle.com/


Introduction
❖ Machine learning (ML) teaches computers to do what
comes naturally to humans and animals:
❑ learn from experience/data

❖ Machine learning algorithms use computational
methods to “learn” information directly from data
without relying on a predetermined equation as a
model.

❖ The algorithms adaptively improve their performance
as the number of samples available for learning
increases.


Applications of ML
❖ Image processing and computer vision, for face
recognition, motion detection, and object detection
❖ Computational biology, for tumor detection, drug
discovery, and DNA sequencing
❖ Detecting transactions that are likely to be fraudulent
❖ Energy production, for price and load forecasting
❖ Understanding human learning (brain, real AI)


Applications of ML (Cont.)
❖ Ranking web search results
❖ Recognizing faces
❖ Smartphone’s speech recognition
❖ Song or movie recommendations
❖ Self-driving cars
❖ Autonomous weapons
❖ Beating humans in games (e.g., Go)


Game of Go
❖ Originated in China 3,000 years ago.

❖ The rules of the game are simple but it is a game of
profound complexity
❑ 10^170 possible board configurations - more than the number of
atoms in the known universe - making Go a googol (10^100)
times more complex than Chess.

❖ AlphaGo
❑ https://deepmind.com/research/alphago/


DeepMind
❖ DeepMind Technologies is a British artificial
intelligence company founded in September 2010,
acquired by Google in 2014 and currently owned by
Alphabet Inc.

❖ A more general program, AlphaZero, beat the most
powerful programs playing Go, chess and shogi
(Japanese chess) after a few days of play against itself
using reinforcement learning.


Quotes about ML
❖ Professor Stephen Hawking:
❑ “The development of full artificial intelligence could spell the
end of the human race.”

❖ Billionaire Elon Musk has said that he thinks AI is the
“biggest existential threat” to the human race.


Machine Learning (ML) Definition
❖ Machine Learning: Field of study that gives
computers the ability to learn without being explicitly
programmed.
❑ Arthur Samuel (1959)

❖ Well-posed Learning Problem: A computer program is
said to learn from experience E with respect to some
task T and some performance measure P, if its
performance on T, as measured by P, improves with
experience E.
❑ Tom Mitchell (1997)


Example of ML
❖ Definition of ML: “A computer program is said to
learn from experience E with respect to some task T
and some performance measure P, if its performance
on T, as measured by P, improves with experience E.”

❖ Example: Suppose your email program watches which
emails you do or do not mark as spam, and based on
that learns how to better filter spam. What is the task
T in this setting?


Example of ML (Cont.)
❖ Example: Suppose your email program watches which
emails you do or do not mark as spam, and based on
that learns how to better filter spam. What is the task
T in this setting?
❑ Classifying emails as spam or not spam (ham).
❑ Watching you label emails as spam or not spam.
❑ The number (or fraction) of emails correctly classified as
spam/not spam.
❑ None of the above—this is not a machine learning problem.


Example of ML (Cont.)
❖ Example: Suppose your email program watches which
emails you do or do not mark as spam, and based on
that learns how to better filter spam. What is the task
T in this setting?
❑ Classifying emails as spam or not spam. (T)
❑ Watching you label emails as spam or not spam. (E)
❑ The number (or fraction) of emails correctly classified as
spam/not spam. (P)
❑ None of the above—this is not a machine learning problem.
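The performance measure P from this example can be made concrete with a short sketch. The function name `accuracy` and the sample labels are illustrative assumptions, not part of the lecture.

```python
def accuracy(predicted, actual):
    """Performance measure P: fraction of emails classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical data: the labels the user gave (experience E) ...
actual = ["spam", "ham", "ham", "spam", "ham"]
# ... and the labels the filter assigned (task T).
predicted = ["spam", "ham", "spam", "spam", "ham"]

p = accuracy(predicted, actual)  # 4 of the 5 emails are classified correctly
print(f"P = {p:.2f}")
```

Under Mitchell's definition, the program "learns" if this number tends to go up as more labeled emails (E) are observed.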


Spam Classification

❖ Traditional approach

❖ Machine learning approach


Traditional Approach

❖ In this approach, the program will likely become a
long list of complex rules - pretty hard to maintain.


Traditional Approach (Cont.)
❖ If spammers notice that all their emails containing
“4U” are blocked, they might start writing “For U”
instead.

❖ A spam filter using traditional programming
techniques would need to be updated to flag “For U”
emails.

❖ If spammers keep working around your spam filter,
you will need to keep writing new rules forever.
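A minimal sketch of the traditional approach, to show why the rule list keeps growing. The blocked phrases and the function name `is_spam_rule_based` are hypothetical; a real filter would have far more rules.

```python
# Hand-written rules: phrases the programmer has decided indicate spam.
# These phrases are hypothetical examples.
BLOCKED_PHRASES = ["4u", "free money", "act now"]


def is_spam_rule_based(email_text, blocked_phrases=BLOCKED_PHRASES):
    """Flag an email if it contains any hand-written blocked phrase."""
    text = email_text.lower()
    return any(phrase in text for phrase in blocked_phrases)


caught = is_spam_rule_based("Great deals 4U today")     # matches the "4u" rule
evaded = is_spam_rule_based("Great deals For U today")  # no rule matches

# The programmer has to notice the new trick and add a rule by hand.
updated_rules = BLOCKED_PHRASES + ["for u"]
caught_after_update = is_spam_rule_based("Great deals For U today", updated_rules)

print(caught, evaded, caught_after_update)
```

Every new spammer workaround requires another manual edit to the rule list, which is the maintenance burden described above.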


Machine Learning Approach

❖ Automatically learns which words and phrases are good predictors
of spam by detecting unusually frequent patterns of words in the
spam examples compared to the ham examples
❖ The program is much shorter, easier to maintain, and most likely
more accurate.


Machine Learning Approach (Cont.)

❖ A spam filter based on Machine Learning techniques
automatically notices that “For U” has become
unusually frequent in spam flagged by users, and it
starts flagging such emails without your intervention.
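The idea of learning which phrases are unusually frequent in spam can be sketched as follows. This is a deliberately simple frequency-ratio heuristic chosen for illustration, not the method of any particular spam filter; the example emails, the `min_ratio` value, and all function names are assumptions.

```python
from collections import Counter


def bigrams(text):
    """Return the set of adjacent lowercase word pairs in a text."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def learn_spam_phrases(examples, min_ratio=2.5):
    """Keep word pairs whose smoothed rate in spam is at least
    min_ratio times their smoothed rate in ham."""
    spam_counts, ham_counts = Counter(), Counter()
    n_spam = n_ham = 0
    for text, label in examples:
        if label == "spam":
            spam_counts.update(bigrams(text))
            n_spam += 1
        else:
            ham_counts.update(bigrams(text))
            n_ham += 1
    learned = set()
    for phrase, count in spam_counts.items():
        spam_rate = (count + 1) / (n_spam + 2)  # add-one smoothing
        ham_rate = (ham_counts[phrase] + 1) / (n_ham + 2)
        if spam_rate >= min_ratio * ham_rate:
            learned.add(phrase)
    return learned


def is_spam_learned(text, learned_phrases):
    """Flag an email that contains any learned spam phrase."""
    return bool(bigrams(text) & learned_phrases)


# Hypothetical emails flagged by users.
old_examples = [
    ("cheap pills 4u now", "spam"), ("win cash 4u now", "spam"),
    ("big prizes 4u now", "spam"), ("hot deals 4u now", "spam"),
    ("lunch at noon tomorrow", "ham"), ("meeting notes attached", "ham"),
    ("see you at lunch", "ham"), ("thanks for the update", "ham"),
]
# Spammers switch to "for u"; users flag the new emails as spam.
new_examples = [
    ("cheap pills for u", "spam"), ("win cash for u", "spam"),
    ("big prizes for u", "spam"), ("hot deals for u", "spam"),
]

before = learn_spam_phrases(old_examples)
after = learn_spam_phrases(old_examples + new_examples)
print(sorted(before), sorted(after))
```

No rule was edited by hand: retraining on the newly flagged emails is enough for "for u" to appear among the learned phrases.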


Why Machine Learning?
❖ Machine Learning is great for:
❑ Problems for which existing solutions require a lot of hand-
tuning or long lists of rules: one Machine Learning algorithm
can often simplify code and perform better.
❑ Complex problems for which there is no good solution at all
using a traditional approach: the best Machine Learning
techniques can find a solution.
❑ Fluctuating environments: a Machine Learning system can
adapt to new data.
❑ Getting insights about complex problems and large amounts
of data.


Why Machine Learning? (Cont.)

❖ Machine Learning can also help humans learn.


Types of Machine Learning Systems
❖ Whether or not they are trained with human supervision
❑ supervised, unsupervised, semi-supervised, and Reinforcement Learning

❖ Whether or not they can learn incrementally on the fly
❑ online versus batch learning

❖ Whether they work by simply comparing new data points to
known data points, or instead detect patterns in the training
data and build a predictive model, much like scientists do
❑ instance-based versus model-based learning

❖ These are not exclusive. They can be combined in different
ways.
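The difference between instance-based and model-based learning can be illustrated with a small sketch. The single feature (a count of suspicious words per email), the data, and the function names are hypothetical; the instance-based learner is a 1-nearest-neighbor rule, and the model-based learner reduces the training set to a single threshold parameter.

```python
def nearest_neighbor_predict(train, x):
    """Instance-based: compare x to every stored example and
    copy the label of the closest one."""
    _, label = min(train, key=lambda example: abs(example[0] - x))
    return label


def fit_threshold(train):
    """Model-based: summarize the training data as one parameter,
    the midpoint between the two class means."""
    ham = [x for x, label in train if label == "ham"]
    spam = [x for x, label in train if label == "spam"]
    return (sum(ham) / len(ham) + sum(spam) / len(spam)) / 2


def model_predict(threshold, x):
    """Use only the learned parameter; the training data is no longer needed."""
    return "spam" if x > threshold else "ham"


# Hypothetical training set: (suspicious word count, label)
train = [(0, "ham"), (1, "ham"), (2, "ham"),
         (7, "spam"), (8, "spam"), (9, "spam")]

threshold = fit_threshold(train)
print(nearest_neighbor_predict(train, 6), model_predict(threshold, 6))
```

The instance-based predictor must keep the whole training set around at prediction time, while the model-based one keeps only `threshold`.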


Supervised/Unsupervised Learning
❖ Machine Learning systems can be classified according
to the amount and type of supervision they get
during training.

❖ There are four major categories:

❑ Supervised learning
❑ Unsupervised learning
❑ Semi-supervised learning and
❑ Reinforcement Learning


Supervised Learning

❖ In supervised learning, the training data you feed to
the algorithm includes the desired solutions, called
labels.

Figure: A labeled training set for supervised learning (e.g., spam classification)


Supervised Learning Tasks
❖ Classification
❑ Example: Spam filter
o It is trained with many example emails along with their class (spam or ham), and
it must learn how to classify new emails.

❖ Regression
❑ Predicting a target numeric value, such as the price of a car,
given a set of features (mileage, age, brand, etc.) called
predictors.
❑ To train the system, many examples of cars, including both
their predictors and their labels (i.e., their prices) are
provided.
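The regression task can be sketched with ordinary least squares on a single predictor. The mileage and price figures are made up for illustration, and using only one feature is a simplifying assumption; real car-price models would use several predictors.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training examples: predictor (mileage in thousands of km)
# and label (price in thousands of dollars).
mileage = [10.0, 20.0, 30.0, 40.0]
price = [18.0, 16.0, 14.0, 12.0]

slope, intercept = fit_simple_linear_regression(mileage, price)

# Predict the price of a new car with 25,000 km on it.
predicted_price = slope * 25.0 + intercept
print(slope, intercept, predicted_price)
```

Here the labels are numeric prices rather than classes, which is what distinguishes regression from classification.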


Regression

Figure: Regression

❖ Some regression algorithms can be used for classification as
well, and vice versa.
❑ For example, Logistic Regression is commonly used for classification, as it
can output a value that corresponds to the probability of belonging to a
given class (e.g., 20% chance of being spam).
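How Logistic Regression turns a numeric score into a class probability can be sketched as below. The weights, bias, and feature meanings are hypothetical placeholders rather than trained values; only the logistic (sigmoid) function itself is standard.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def spam_probability(features, weights, bias):
    """Linear score passed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def classify(features, weights, bias, threshold=0.5):
    """Turn the probability into a class label."""
    return "spam" if spam_probability(features, weights, bias) >= threshold else "ham"


# Hypothetical (untrained) parameters for two features:
# number of links in the email and number of exclamation marks.
weights = [0.9, 0.4]
bias = -3.0

p = spam_probability([1, 1], weights, bias)
label = classify([1, 1], weights, bias)
print(f"{p:.2f}", label)
```

A score of `math.log(0.25)` maps to a probability of exactly 0.2, which corresponds to the "20% chance of being spam" case mentioned above.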


Supervised Learning Algorithms
❖ k-Nearest Neighbors
❖ Linear Regression
❖ Logistic Regression
❖ Support Vector Machines (SVMs)
❖ Decision Trees and Random Forests
❖ Neural networks
❑ Some neural network architectures can be unsupervised, such
as autoencoders and restricted Boltzmann machines.
❑ They can also be semisupervised, such as in deep belief
networks and unsupervised pretraining.


Unsupervised Learning

❖ In unsupervised learning, the training data is
unlabeled.
❑ The system tries to learn without a teacher.

Figure: An unlabeled training set for unsupervised learning


Unsupervised Learning Algorithms
❖ Clustering
❑ k-Means
❑ Hierarchical Cluster Analysis (HCA)
❑ Expectation Maximization
❖ Visualization and dimensionality reduction
❑ Principal Component Analysis (PCA)
❑ Kernel PCA
❑ Locally-Linear Embedding (LLE)
❑ t-distributed Stochastic Neighbor Embedding (t-SNE)
❖ Association rule learning
❑ Apriori
❑ Eclat


Unsupervised - Clustering
❖ Tries to detect groups of similar patterns
❑ Example: A clustering algorithm can detect groups of similar
visitors of a website without any help. For example, it might
notice that 40% of the visitors are males who love comic
books and generally read the blog in the evening, while 20%
are young sci-fi lovers who visit during the weekends, and so
on.

❖ A hierarchical clustering algorithm can also subdivide
each group into smaller groups.
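A minimal k-Means sketch shows how groups can be found without labels. The visitor features (two arbitrary numeric measurements per visitor) and the starting centroids are hypothetical; the starting centroids are passed in explicitly so that the run is deterministic, whereas practical implementations usually choose them randomly.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2D points: alternate between assigning each point
    to its nearest centroid and moving each centroid to the mean of its points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step (an empty cluster keeps its old centroid).
        centroids = [
            (sum(x for x, _ in cl) / len(cl), sum(y for _, y in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data: two numeric features per website visitor.
visitors = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]

centroids, clusters = kmeans(visitors, centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

The algorithm is never told which visitor belongs to which group; the grouping emerges from distances alone.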


Unsupervised - Clustering (Cont.)

Figure: Clustering


Unsupervised - Visualization Algorithm
❖ You feed the algorithm a lot of complex, unlabeled data,
and it outputs a 2D or 3D representation of the input data
that can easily be plotted.

Figure: Example of a t-SNE visualization highlighting semantic clusters

(animals are well separated from vehicles, horses are close to deer but far from birds, and so on.)


Unsupervised - Visualization Algorithm (Cont.)
❖ These algorithms try to preserve as much structure as
they can (e.g., trying to keep separate clusters in the
input space from overlapping in the visualization), so
you can understand how the data is organized and
perhaps identify unsuspected patterns.


Unsupervised - Dimensionality Reduction
❖ The goal is to simplify the data without losing too
much information.
❑ One way to do this is to merge several correlated features into
one.
o For example, a car’s mileage may be strongly correlated with its age, so the
dimensionality reduction algorithm will merge them into one feature that
represents the car’s wear and tear.
o This is called feature extraction.
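Merging two correlated features into one can be sketched with a two-dimensional version of Principal Component Analysis. The car data is made up, the name `wear_and_tear` is illustrative, and the closed-form angle formula used here only applies to the two-feature case; general PCA would use an eigendecomposition or SVD.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Rescale to zero mean and unit variance so both features are comparable."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def first_principal_component(xs, ys):
    """Unit vector along the direction of greatest variance for 2D data,
    using the closed form for a 2x2 covariance matrix."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    angle = 0.5 * math.atan2(2 * cov, var_x - var_y)
    return math.cos(angle), math.sin(angle)


# Hypothetical cars: mileage in thousands of km and age in years.
mileage = [20, 40, 60, 80, 100]
age = [1, 3, 2, 5, 4]

z_mileage, z_age = standardize(mileage), standardize(age)
dx, dy = first_principal_component(z_mileage, z_age)

# Project each car onto the principal direction: one feature instead of two.
wear_and_tear = [dx * m + dy * a for m, a in zip(z_mileage, z_age)]
print(dx, dy)
print(wear_and_tear)
```

Because the two standardized features are positively correlated, the principal direction weights them equally, so the merged feature is essentially their scaled sum.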


Anomaly Detection
❖ The system is trained with normal instances, and
when it sees a new instance it can tell whether it
looks like a normal one or whether it is likely an
anomaly.

❖ Examples:
❑ Detecting unusual credit card transactions to prevent fraud
❑ Catching manufacturing defects
❑ Automatically removing outliers from a dataset before feeding
it to another learning algorithm.
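A simple way to sketch anomaly detection is to summarize the normal instances by their mean and standard deviation and flag anything too far away. The transaction amounts, the threshold of 3 standard deviations, and the function names are assumptions for illustration; production systems use richer models.

```python
from statistics import mean, stdev


def fit_normal_profile(normal_values):
    """Train on normal instances only: remember their mean and spread."""
    return mean(normal_values), stdev(normal_values)


def is_anomaly(value, profile, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) / sigma > threshold


# Hypothetical normal credit card transaction amounts.
normal_amounts = [20.0, 25.0, 22.0, 30.0, 27.0, 24.0, 26.0, 23.0]
profile = fit_normal_profile(normal_amounts)

typical = is_anomaly(28.0, profile)   # close to the usual amounts
unusual = is_anomaly(500.0, profile)  # far outside the usual range
print(typical, unusual)
```

The same check could be used to drop outliers from a dataset before passing it to another learning algorithm.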


Anomaly Detection (Cont.)

Figure: Anomaly detection


Association Rule Learning
❖ The goal is to dig into large amounts of data and
discover interesting relations between attributes.

❖ Example:
❑ Suppose you own a supermarket. Running an association rule
learning algorithm on your sales logs may reveal that people who
purchase barbecue sauce and potato chips also tend to buy steak.
Thus, you may want to place these items close to each other.
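The strength of a rule such as {barbecue sauce, potato chips} -> {steak} is commonly measured by its support and confidence. The sketch below computes both for a handful of made-up transactions; it does not implement Apriori or Eclat, which search for such rules efficiently.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent,
    the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical sales log: one set of items per purchase.
transactions = [
    {"barbecue sauce", "potato chips", "steak"},
    {"barbecue sauce", "potato chips", "steak", "soda"},
    {"barbecue sauce", "potato chips"},
    {"bread", "milk"},
    {"steak", "bread"},
]

rule_support = support(transactions, {"barbecue sauce", "potato chips", "steak"})
rule_confidence = confidence(transactions, {"barbecue sauce", "potato chips"}, {"steak"})
print(rule_support, rule_confidence)
```

In this toy log the rule holds in 2 of 5 purchases overall, and in 2 of the 3 purchases that include both barbecue sauce and potato chips.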


Semisupervised Learning

❖ Some algorithms can deal with partially labeled training
data, usually a lot of unlabeled data and a little bit of
labeled data.
❑ This is called semisupervised learning

Figure: Semisupervised learning


Semisupervised Learning (Cont.)
❖ Example:
❑ Google Photos automatically recognizes that the same
person A shows up in photos 1, 5, and 11, while another
person B shows up in photos 2, 5, and 7. This is the
unsupervised part of the algorithm (clustering).

❑ Now all the system needs is for you to tell it who these people
are. Just one label per person, and it is able to name everyone
in every photo, which is useful for searching photos.

❑ Sometimes it is necessary to provide a few labels per person
and manually clean up some clusters.
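The labeling step of the photo example can be sketched as follows, assuming the clustering step has already grouped faces into clusters A and B as described above. The names "Alice" and "Bob" and the function `name_photos` are hypothetical, and this is only an illustration of the idea, not how any photo service is implemented.

```python
def name_photos(photo_clusters, cluster_names):
    """Spread one label per cluster to every photo containing that cluster.

    photo_clusters: photo id -> set of face-cluster ids found in that photo
    cluster_names:  face-cluster id -> name supplied by the user
    """
    return {
        photo: sorted(cluster_names.get(cluster, "unknown") for cluster in clusters)
        for photo, clusters in photo_clusters.items()
    }


# Output of the unsupervised part (assumed): which face clusters appear where.
photo_clusters = {1: {"A"}, 2: {"B"}, 5: {"A", "B"}, 7: {"B"}, 11: {"A"}}

# The supervised part: a single label per person (hypothetical names).
cluster_names = {"A": "Alice", "B": "Bob"}

named = name_photos(photo_clusters, cluster_names)
print(named)
```

Two labels are enough to name the people in all five photos, which is the appeal of combining a little labeled data with a lot of unlabeled data.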


Semisupervised Learning (Cont.)
❖ Most semisupervised learning algorithms are
combinations of unsupervised and supervised
algorithms.

❖ Example:
❑ Deep belief networks (DBNs) are based on unsupervised
components called restricted Boltzmann machines (RBMs)
stacked on top of one another.
❑ RBMs are trained sequentially in an unsupervised manner,
and then the whole system is fine-tuned using supervised
learning techniques.


Reinforcement Learning
❖ The learning system, called an agent, can observe the
environment, select and perform actions, and get
rewards in return (or penalties in the form of negative
rewards).

❖ It must then learn by itself what is the best strategy,
called a policy, to get the most reward over time.

❖ A policy defines what action the agent should choose
when it is in a given situation.
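The agent, reward, and policy ideas can be sketched with tabular Q-learning on a tiny corridor world. Q-learning is one specific Reinforcement Learning algorithm chosen here for illustration (the lecture does not name one), and the environment, reward values, and hyperparameters are all assumptions.

```python
import random

N_STATES = 5      # corridor cells 0..4; cell 4 is the goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience gathered by random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)  # explore by acting randomly
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()

# Policy: in each non-goal state, pick the action with the highest learned value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The learned policy tells the agent to move right in every cell, since that is the shortest way to the rewarded goal cell.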


Reinforcement Learning (Cont.)

Figure: Reinforcement learning


Reinforcement Learning (Cont.)
❖ Examples:
❑ Many robots implement Reinforcement Learning algorithms
to learn how to walk.

❑ DeepMind’s AlphaGo program is also a good example of
Reinforcement Learning:
o It made the headlines in March 2016 when it beat the world champion Lee Sedol
at the game of Go.
o It learned its winning policy by analyzing millions of games, and then playing
many games against itself.
o Note that learning was turned off during the games against the champion -
AlphaGo was just applying the policy it had learned.


Applications of Unsupervised Learning
❖ Organize computing clusters
❖ Social network analysis
❖ Market segmentation
❖ Market research
❖ Astronomical data analysis
❖ Gene sequence analysis
❖ Object recognition


Problem
❖ Of the following examples, which would you address
using an unsupervised learning algorithm? (Check all
that apply.)
❑ Given email labeled as spam/not spam, learn a spam filter.
❑ Given a set of news articles found on the web, group them
into sets of articles about the same story.
❑ Given a database of customer data, automatically discover
market segments and group customers into different market
segments.
❑ Given a dataset of patients diagnosed as either having
diabetes or not, learn to classify new patients as having
diabetes or not.


Problem (Cont.)
❖ Of the following examples, which would you address
using an unsupervised learning algorithm? (Check all
that apply.)
❑ Given email labeled as spam/not spam, learn a spam filter.
(supervised)
❑ Given a set of news articles found on the web, group them
into sets of articles about the same story. (unsupervised)
❑ Given a database of customer data, automatically discover
market segments and group customers into different market
segments. (unsupervised)
❑ Given a dataset of patients diagnosed as either having
diabetes or not, learn to classify new patients as having
diabetes or not. (supervised)


Definitions
❖ Training set
❑ Examples that the system uses to learn.

❖ Training instance (or sample)
❑ Each training example is called a training instance (or sample).

❖ Data mining
❑ Applying ML techniques to dig into large amounts of data can
help discover patterns that were not immediately apparent.


Definitions (Cont.)
❖ Attribute and Feature
❑ In Machine Learning, an attribute is a data type (e.g.,
“Mileage”), while a feature has several meanings depending
on the context, but generally means an attribute plus its value
(e.g., “Mileage = 15,000”).

❑ Many people use the words attribute and feature
interchangeably, though.
