This document discusses computational learning theory and frameworks for analyzing learning algorithms. It introduces the Probably Approximately Correct (PAC) learning model and the mistake bound model. The PAC model examines sample complexity, computational complexity, and the conditions under which a learning algorithm can output a hypothesis that has low error. The mistake bound model analyzes the number of mistakes a learning algorithm makes during training. Key concepts discussed include VC-dimension, sample complexity bounds for finite and infinite hypothesis spaces, and analyzing learning algorithms like the weighted majority algorithm within these frameworks.


Machine Learning: Lecture 8

Computational Learning Theory
(Based on Chapter 7 of Mitchell, T., Machine Learning, 1997)

1
Overview
• Are there general laws that govern learning?
• Sample Complexity: How many training examples are needed for a learner to converge (with high probability) to a successful hypothesis?
• Computational Complexity: How much computational effort is needed for a learner to converge (with high probability) to a successful hypothesis?
• Mistake Bound: How many training examples will the learner misclassify before converging to a successful hypothesis?
• These questions will be answered within two analytical frameworks:
  • The Probably Approximately Correct (PAC) framework
  • The Mistake Bound framework


2
Overview (Cont’d)
• Rather than answering these questions for individual learners, we will answer them for broad classes of learners. In particular, we will consider:
  • The size or complexity of the hypothesis space considered by the learner.
  • The accuracy to which the target concept must be approximated.
  • The probability that the learner will output a successful hypothesis.
  • The manner in which training examples are presented to the learner.
3
The PAC Learning Model
• Definition: Consider a concept class C defined over a set of instances X of length n and a learner L using hypothesis space H. C is PAC-learnable by L using H if for all c ∈ C, distributions D over X, ε such that 0 < ε < 1/2, and δ such that 0 < δ < 1/2, learner L will, with probability at least (1 - δ), output a hypothesis h ∈ H such that errorD(h) ≤ ε, in time that is polynomial in 1/ε, 1/δ, n, and size(c). (A sketch estimating errorD(h) follows below.)

4
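Here errorD(h) denotes the probability, over instances x drawn according to D, that h(x) disagrees with c(x). The short Python sketch below (my own illustration, not part of the lecture; the toy sampler, concept, and hypothesis are assumptions) estimates this error by sampling:

import random

def estimated_error(h, c, draw_instance, num_samples=10_000):
    """Monte Carlo estimate of error_D(h) = Pr_{x ~ D}[h(x) != c(x)]."""
    disagreements = 0
    for _ in range(num_samples):
        x = draw_instance()
        if h(x) != c(x):
            disagreements += 1
    return disagreements / num_samples

# Toy example: instances are 5-bit vectors drawn uniformly (this plays the role of D),
# the target concept c is "the first bit is 1", and the hypothesis h always predicts 1.
draw = lambda: [random.randint(0, 1) for _ in range(5)]
c = lambda x: x[0]
h = lambda x: 1
print(estimated_error(h, c, draw))   # close to 0.5: h errs whenever the first bit is 0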
Sample Complexity for Finite Hypothesis Spaces
• Given any consistent learner, the number of examples sufficient to assure that any hypothesis will be probably (with probability (1 - δ)) approximately (within error ε) correct is m ≥ (1/ε)(ln|H| + ln(1/δ)) (evaluated numerically in the sketch below).
• If the learner is not consistent, m ≥ (1/(2ε²))(ln|H| + ln(1/δ)).
• Conjunctions of Boolean literals are also PAC-learnable, and m ≥ (1/ε)(n·ln3 + ln(1/δ)).
• k-term DNF expressions are not PAC-learnable because, even though they have polynomial sample complexity, their computational complexity is not polynomial.
• Surprisingly, however, k-term CNF is PAC-learnable.
5
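As a quick illustration of the finite-hypothesis-space bounds above, here is a minimal Python sketch (my own; the function names and the example values of ε, δ, and n are assumptions, not from the lecture):

import math

def sample_size_consistent(h_size, eps, delta):
    """m >= (1/eps) * (ln|H| + ln(1/delta)) for a consistent learner."""
    return math.ceil((1.0 / eps) * (math.log(h_size) + math.log(1.0 / delta)))

def sample_size_agnostic(h_size, eps, delta):
    """m >= (1/(2*eps^2)) * (ln|H| + ln(1/delta)) when no consistent hypothesis is assumed."""
    return math.ceil((1.0 / (2 * eps ** 2)) * (math.log(h_size) + math.log(1.0 / delta)))

# Conjunctions of up to n Boolean literals: |H| = 3^n, so ln|H| = n * ln 3.
n, eps, delta = 10, 0.1, 0.05
print(sample_size_consistent(3 ** n, eps, delta))   # bound for Boolean conjunctions
print(sample_size_agnostic(3 ** n, eps, delta))     # weaker bound without consistency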
Sample Complexity for Infinite Hypothesis Spaces I: VC-Dimension
• The PAC learning framework has two disadvantages:
  • It can lead to weak bounds.
  • A sample complexity bound cannot be established for infinite hypothesis spaces.
• We introduce new ideas for dealing with these problems:
  • Definition: A set of instances S is shattered by hypothesis space H iff for every dichotomy of S there exists some hypothesis in H consistent with this dichotomy (see the sketch below).
  • Definition: The Vapnik-Chervonenkis dimension, VC(H), of hypothesis space H defined over instance space X is the size of the largest finite subset of X shattered by H. If arbitrarily large finite subsets of X can be shattered by H, then VC(H) = ∞.
6
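To make shattering concrete, the following Python sketch (my own illustration, not from the lecture) checks whether a set of real-valued points is shattered by the class of closed intervals [a, b]; it shows that any 2 distinct points can be shattered but 3 points cannot, so this class has VC dimension 2.

from itertools import product

def interval_consistent(points, labels):
    """Is there an interval [a, b] that labels exactly the positive points as 1?"""
    positives = [x for x, y in zip(points, labels) if y == 1]
    if not positives:
        return True  # an interval placed away from all points handles the all-negative case
    lo, hi = min(positives), max(positives)
    # [lo, hi] is the tightest candidate; it must not capture any negatively labeled point.
    return all(not (lo <= x <= hi) for x, y in zip(points, labels) if y == 0)

def shattered_by_intervals(points):
    """True iff every dichotomy of the points is realized by some interval."""
    return all(interval_consistent(points, labels)
               for labels in product([0, 1], repeat=len(points)))

print(shattered_by_intervals([1.0, 2.0]))        # True: 2 points are shattered
print(shattered_by_intervals([1.0, 2.0, 3.0]))   # False: the dichotomy (1, 0, 1) fails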
Sample Complexity for Infinite Hypothesis Spaces II
• Upper bound on sample complexity, using the VC-dimension (evaluated in the sketch below):
  m ≥ (1/ε)(4·log2(2/δ) + 8·VC(H)·log2(13/ε))
• Lower bound on sample complexity, using the VC-dimension:
  Consider any concept class C such that VC(C) ≥ 2, any learner L, and any 0 < ε < 1/8 and 0 < δ < 1/100. Then there exists a distribution D and target concept in C such that if L observes fewer examples than
  max[(1/ε)·log(1/δ), (VC(C) - 1)/(32ε)],
  then with probability at least δ, L outputs a hypothesis h having errorD(h) > ε.
7
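For a sense of scale, this small Python sketch (my own; the example values of VC(H), ε, and δ are assumptions) evaluates the upper bound above:

import math

def vc_upper_bound(vc_dim, eps, delta):
    """m >= (1/eps) * (4*log2(2/delta) + 8*VC(H)*log2(13/eps))."""
    return math.ceil((1.0 / eps) * (4 * math.log2(2.0 / delta)
                                    + 8 * vc_dim * math.log2(13.0 / eps)))

# Example: linear separators in the plane have VC dimension 3.
print(vc_upper_bound(3, 0.1, 0.05))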
VC-Dimension for Neural Networks
• Let G be a layered directed acyclic graph with n input nodes and s ≥ 2 internal nodes, each having at most r inputs. Let C be a concept class over R^r of VC dimension d, corresponding to the set of functions that can be described by each of the s internal nodes. Let CG be the G-composition of C, corresponding to the set of functions that can be represented by G. Then VC(CG) ≤ 2ds·log(es), where e is the base of the natural logarithm.
• This theorem can help us bound the VC-dimension of a neural network and thus its sample complexity (see [Mitchell, p. 219] and the sketch below)!
8
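As a rough numeric illustration (my own, not from the slides): if each internal node is a linear threshold unit with r inputs, the class of functions it computes has VC dimension d = r + 1, and the theorem gives the bound computed below. The base of the logarithm is not stated on the slide; the sketch assumes the natural logarithm.

import math

def vc_bound_layered_network(d, s):
    """VC(CG) <= 2*d*s*log(e*s) for a layered DAG with s internal nodes, each drawn
    from a function class of VC dimension d (log taken as natural log here)."""
    return 2 * d * s * math.log(math.e * s)

r, s = 10, 5                  # 10 inputs per internal unit, 5 internal units
d = r + 1                     # VC dimension of linear threshold units over R^r
print(vc_bound_layered_network(d, s))   # upper bound on VC(CG)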
The Mistake Bound Model of Learning
• The Mistake Bound framework differs from the PAC framework in that it considers learners that receive a sequence of training examples and that predict, upon receiving each example, its target value.
• The question asked in this setting is: “How many mistakes will the learner make in its predictions before it learns the target concept?”
• This question is significant in practical settings where learning must be done while the system is in actual use.
9
Optimal Mistake Bounds
• Definition: Let C be an arbitrary nonempty concept class. The optimal mistake bound for C, denoted Opt(C), is the minimum over all possible learning algorithms A of M_A(C), where M_A(C) denotes the maximum number of mistakes algorithm A makes to learn any concept in C, over all possible training sequences:
  Opt(C) = min_{A ∈ learning algorithms} M_A(C)
• For any concept class C, the optimal mistake bound is bounded as follows:
  VC(C) ≤ Opt(C) ≤ log2(|C|)

10
A Case Study: The Weighted-Majority Algorithm
ai denotes the ith prediction algorithm in the pool A of algorithms, wi denotes the weight associated with ai, and β (0 ≤ β < 1) is a fixed factor by which the weight of an algorithm that errs is reduced. (A runnable version follows below.)
• For all i, initialize wi ← 1
• For each training example <x, c(x)>
  • Initialize q0 and q1 to 0
  • For each prediction algorithm ai
    • If ai(x) = 0 then q0 ← q0 + wi
    • If ai(x) = 1 then q1 ← q1 + wi
  • If q1 > q0 then predict c(x) = 1
  • If q0 > q1 then predict c(x) = 0
  • If q0 = q1 then predict 0 or 1 at random for c(x)
  • For each prediction algorithm ai in A do
    • If ai(x) ≠ c(x) then wi ← β·wi


11
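Below is a minimal runnable Python version of this procedure (my own sketch; the class name and the representation of predictors as callables are assumptions, not from the lecture). Calling predict and then update on each incoming example mirrors the online protocol assumed by the mistake bound model.

import random

class WeightedMajority:
    """Weighted-Majority over a pool of binary (0/1) predictors."""

    def __init__(self, predictors, beta=0.5):
        self.predictors = predictors           # list of callables: x -> 0 or 1
        self.weights = [1.0] * len(predictors)
        self.beta = beta                       # down-weighting factor, 0 <= beta < 1

    def predict(self, x):
        q0 = sum(w for a, w in zip(self.predictors, self.weights) if a(x) == 0)
        q1 = sum(w for a, w in zip(self.predictors, self.weights) if a(x) == 1)
        if q1 > q0:
            return 1
        if q0 > q1:
            return 0
        return random.randint(0, 1)            # break ties at random

    def update(self, x, label):
        # Multiply the weight of every predictor that got this example wrong by beta.
        for i, a in enumerate(self.predictors):
            if a(x) != label:
                self.weights[i] *= self.beta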
Relative Mistake Bound for the Weighted-Majority Algorithm
• Let D be any sequence of training examples, let A be any set of n prediction algorithms, and let k be the minimum number of mistakes made by any algorithm in A for the training sequence D. Then the number of mistakes over D made by the Weighted-Majority algorithm using β = 1/2 is at most 2.4(k + log2 n).
• This theorem can be generalized to any 0 ≤ β < 1, where the bound becomes
  (k·log2(1/β) + log2 n) / log2(2/(1 + β))
  (evaluated in the sketch below).
12
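As a sanity check (my own illustration, with arbitrary example values for k and n), the snippet below evaluates the general bound and shows that β = 1/2 gives roughly the 2.4(k + log2 n) bound quoted above:

import math

def wm_mistake_bound(k, n, beta):
    """(k*log2(1/beta) + log2(n)) / log2(2/(1+beta)): the relative mistake bound."""
    return (k * math.log2(1.0 / beta) + math.log2(n)) / math.log2(2.0 / (1.0 + beta))

k, n = 10, 16
print(wm_mistake_bound(k, n, 0.5))    # about 2.41 * (k + log2 n)
print(2.4 * (k + math.log2(n)))       # the specialized bound, for comparison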
