
Machine Learning: Overview

Jihoon Yang

Machine Learning Research Laboratory


Department of Computer Science & Engineering
Sogang University



Machine Learning

Algorithms (computation, information processing) provide for the study of cognition and life what calculus provided for physics

We have a theory of intelligent behavior when we have precise information processing models (computer programs) that produce such behavior

We will have a theory of learning when we have precise information processing models of learning (computer programs that learn from experience)



Why should machines learn?
Intelligent behavior requires knowledge
Explicitly specifying the knowledge needed for specific tasks is hard, and often infeasible
Some tasks are best specified by examples (e.g. medical diagnosis, credit risk assessment)
Buried in large volumes of data are useful predictive relationships (data mining)
Machine learning is most useful when
  The structure of the task is not well understood but a representative dataset is available
  The task (or its parameters) changes dynamically
If we can program computers to learn from experience, we can
  Dramatically enhance the usability of software (e.g. personalized information assistants)
  Dramatically reduce the cost of software development (e.g. for medical diagnosis)
  Automate data-driven discovery (e.g. bioinformatics, social informatics)
ML Applications

Medical diagnosis/image analysis (e.g. pneumonia)
Spam filtering, fraud detection (e.g. credit cards, phone calls)
Search and recommendation (e.g. Google, Amazon)
Automatic speech recognition & speaker verification
Locating/tracking/identifying objects in images & videos (e.g. faces)
Printed and handwritten text parsing
Driving computer players in games
Computational molecular biology (e.g. gene expression analysis)
Autonomous driving
...



ML in Context

(figure)


What is ML?

A program M is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance on tasks in T, as measured by P, improves with experience E in an environment Z (a code sketch of these ingredients follows the examples)

Examples
1 T : cancer diagnosis
E : a set of diagnosed cases
P: accuracy of diagnosis on new cases
Z : noisy measurements, occasionally misdiagnosed training cases
M: a program that runs on a general purpose computer

2 T : annotating protein sequences with function labels
E : a data set of annotated protein sequences
P: score on a test set not seen during training (e.g. accuracy of annotations)

3 T : driving on the interstate
E : a sequence of sensor measurements and driving actions recorded while observing an expert driver
P: mean distance traveled before an error, as judged by a human expert
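
These ingredients can be written down directly. A minimal sketch in Python (all names are illustrative, not from the slides), bundling T, E, and P as a plain record:

from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple

@dataclass
class LearningProblem:
    task: str                          # T, e.g. "cancer diagnosis"
    experience: List[Tuple[Any, Any]]  # E, a set of labeled cases
    performance: Callable[[Sequence, Sequence], float]  # P, scores predictions

problem = LearningProblem(
    task="cancer diagnosis",
    experience=[("case 1", "benign"), ("case 2", "malignant")],
    performance=lambda preds, truth: sum(p == t for p, t in zip(preds, truth)) / len(truth),
)
print(problem.performance(["benign", "benign"], ["benign", "malignant"]))  # 0.5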



Canonical learning problems
Supervised learning: given examples of inputs and corresponding desired outputs, predict outputs on future inputs (see the sketch after this list)
  Classification/Regression
  Time series prediction
  To address the labor-intensive labeling issue:
    Semi-SL: combines a small amount of labeled data with a large amount of unlabeled data during training via pseudo-labeling
    Self-SL or Unsupervised pre-training: labels are created by the algorithm, rather than provided externally by a human; solves “pretext” tasks that produce good features for downstream tasks
Unsupervised learning: given only inputs, automatically discover representations, features, structures, etc.
  Clustering/Outlier detection
  Compression
Reinforcement learning: given sequences of inputs, actions from a fixed set, and scalar rewards/punishments, learn to select actions in a way that maximizes expected reward
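A minimal sketch of the supervised setting (assuming scikit-learn is available; the dataset and model choice are arbitrary):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic labeled examples: inputs X with desired outputs y
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out "future inputs" that the learner never sees during training
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression().fit(X_train, y_train)  # learn from the examples
print("accuracy on unseen inputs:", model.score(X_test, y_test))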
Machine Learning

Learning involves synthesis or adaptation of computational structures:
Classifiers
Functions
Logic Programs
Rules
Grammars
Probability distributions
Action policies

ML = Inference + Data Structures + Algorithms



Learning input-output functions

Target function f : unknown to the learner; f ∈ F
Learner’s hypothesis about what f might be: h ∈ H, the hypothesis space
Instance space X : domain of f, h
Output space Y : range of f, h
Example: an ordered pair (x, y) where x ∈ X and f(x) = y ∈ Y
F and H may or may not be the same!
Training set E : a multiset of examples
Learning algorithm L: a procedure which, given some E, outputs an h ∈ H (see the sketch below)
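
A minimal sketch of these objects in Python (the types and the deliberately naive learner are my own, not from the slides):

from typing import Callable, Iterable, Tuple

Instance = float                             # x ∈ X
Output = float                               # y ∈ Y
Example = Tuple[Instance, Output]            # (x, y) with f(x) = y
Hypothesis = Callable[[Instance], Output]    # h ∈ H

def L(E: Iterable[Example]) -> Hypothesis:
    """A naive learner: fit y = w*x through the origin by least squares."""
    pairs = list(E)
    num = sum(x * y for x, y in pairs)
    den = sum(x * x for x, _ in pairs) or 1.0
    w = num / den
    return lambda x: w * x

h = L([(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)])  # training set E
print(h(4.0))  # prediction on a new instance, approximately 8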


Training regime
  Batch
  Online
  Distributed
    Vertical fragmentation
    Horizontal fragmentation

Noise
  Attribute noise
  Classification noise
  Both



Inductive learning

Premise: a hypothesis (e.g. a classifier) that is consistent with a sufficiently large number of representative training examples is likely to accurately classify novel instances drawn from the same universe

We can prove this is an optimal approach (under appropriate assumptions)

When the number of examples is limited, the learner needs to be smarter (e.g. find a concise hypothesis that is consistent with the data)



Measuring classifier performance

(figure)

N: total number of instances in the data set
TP(c): True Positives for class c, FP(c): False Positives for class c
TN(c): True Negatives for class c, FN(c): False Negatives for class c
TP: True Positives over all classes

Accuracy = TP / N
Precision(c) = TP(c) / (TP(c) + FP(c))
Recall/Sensitivity(c) = TP(c) / (TP(c) + FN(c))
Specificity(c) = TN(c) / (TN(c) + FP(c))
False Alarm(c) = FP(c) / (TP(c) + FP(c)) = 1 − Precision(c)
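
A minimal sketch of these per-class metrics in Python (the function and the example counts are my own):

def metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Per-class metrics from raw true/false positive/negative counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),            # a.k.a. sensitivity
        "specificity": tn / (tn + fp),
        "false_alarm": fp / (tp + fp),       # = 1 - precision
    }

# e.g. 40 true positives, 10 false positives, 45 true negatives, 5 false negatives
print(metrics(tp=40, fp=10, tn=45, fn=5))
# accuracy 0.85, precision 0.8, recall ~0.889, specificity ~0.818, false alarm 0.2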



Inductive bias

Consider a concept learning algorithm L for the set of instances X. Let c be an arbitrary concept defined over X, and let Dc = {⟨x, c(x)⟩} be an arbitrary set of training examples of c. Let L(xi, Dc) denote the classification assigned to the instance xi by L after training on the data Dc.

The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training examples Dc

(∀xi ∈ X)[(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]

In other words, it is the set of assumptions that, together with the training data, deductively justifies the classifications assigned by the learner to future instances



Function learning and bias

(figure)


Learning and bias

Suppose H = the set of all n-input Boolean functions, and suppose the learner is unbiased. Then

|H| = 2^(2^n)

since there are 2^n possible input vectors and each can independently be assigned output 0 or 1

HV = version space: the subset of H not yet ruled out by the learner
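
A minimal sketch of the version space (my own construction): enumerate all 2^(2^n) Boolean functions for n = 2 and keep only those consistent with some observed examples.

from itertools import product

n = 2
inputs = list(product([0, 1], repeat=n))   # 2^n = 4 possible input vectors

# One hypothesis per assignment of an output bit to every input: 2^(2^n) = 16
H = [dict(zip(inputs, bits)) for bits in product([0, 1], repeat=len(inputs))]
print(len(H))   # 16

examples = [((0, 0), 0), ((1, 1), 1)]      # training examples seen so far
HV = [h for h in H if all(h[x] == y for x, y in examples)]
print(len(HV))  # 4 hypotheses remain consistent: the version space shrinks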


Weaker bias
→ more open to experience, flexible
→ more expressive hypothesis representation

Occam’s razor
Simple hypotheses are preferred: a linear fit is preferred to a quadratic fit when both fit the training examples about equally well

Learning in practice requires a trade-off between the complexity of the hypothesis and the goodness of fit; how this trade-off is made affects the learner’s ability to generalize
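
A minimal sketch of the trade-off (assuming NumPy; the data and degrees are arbitrary): on noisy linear data, a degree-9 polynomial typically fits the training set better yet generalizes worse than a linear fit.

import numpy as np

rng = np.random.default_rng(0)

def sample(m):
    """Draw m points from a linear source with additive noise."""
    x = rng.uniform(-1, 1, m)
    return x, 2 * x + rng.normal(0, 0.2, m)

x_tr, y_tr = sample(15)    # small training set
x_te, y_te = sample(200)   # fresh data from the same source

for degree in (1, 9):
    coeffs = np.polyfit(x_tr, y_tr, degree)
    train_mse = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    print(degree, "train MSE:", round(train_mse, 4), "test MSE:", round(test_mse, 4))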
