
COMPUTER SCIENCE AND ENGINEERING

COURSE DESCRIPTION FORM

Course Title: MACHINE LEARNING
Course Code: CS601PC
Regulation: R18-JNTUH
Course Structure: Lectures 4 | Tutorials 01 | Practicals 3 | Credits 4
Course Faculty: Mr. K. LAKSHMINARAYANA

I. COURSE OVERVIEW:

This course introduces machine learning techniques such as decision tree learning and Bayesian learning, develops an understanding of computational learning theory, studies pattern comparison techniques, and covers basic machine learning algorithms and their applications.

II. PREREQUISITE(S):
Level: UG | Credits: 4 | Periods/Week: 4+1 | Prerequisites: Probabilities, statistics, data structures

III. MARKS DISTRIBUTION:

Sessional Marks: 25 | University End Exam Marks: 75 | Total Marks: 100

Mid Semester Test
There shall be two mid-term examinations. Each mid-term examination consists of a subjective test and an objective test.
The subjective test is for 10 marks, of 60 minutes duration. It shall contain 4 questions; the student has to answer 2 questions, each carrying 5 marks.
The objective test is for 10 marks, of 20 minutes duration. It consists of 10 multiple-choice and 10 objective-type questions; the student has to answer all the questions, each carrying half a mark.
The first mid-term examination shall be conducted for the first two and a half units of the syllabus, and the second mid-term examination shall be conducted for the remaining portion.

Assignment
Five marks are allotted for assignments. There shall be two assignments in every theory course. Marks shall be awarded considering the average of the two assignments in each course.

IV. EVALUATION SCHEME:

S.No | Component | Duration | Marks
1 | I Mid Examination | 80 minutes | 20
2 | I Assignment | - | 5
3 | II Mid Examination | 80 minutes | 20
4 | II Assignment | - | 5
5 | External Examination | 3 hours | 75

V. COURSE OBJECTIVES:

I. To explain machine learning techniques such as decision tree learning and Bayesian learning.
II. To understand computational learning theory.
III. To study pattern comparison techniques.

VI. COURSE OUTCOMES:

CO.1. Understand the concepts of computational intelligence such as machine learning.
CO.2. Understand the algorithms of artificial neural networks and how to apply them in real-time applications.
CO.3. Acquire the skill to apply machine learning techniques to address real-time problems in different areas.
CO.4. Understand advanced machine learning concepts such as reinforcement learning and genetic algorithms, and their applications.
CO.5. Acquire analytical skills on machine learning concepts.
VII. HOW PROGRAM OUTCOMES ARE ASSESSED:

Program Outcomes | Level | Proficiency assessed by

PO1 Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems. | H | Exercises

PO2 Problem analysis: Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences. | H | Exercises

PO3 Design/development of solutions: Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations. | H | Assignments

PO4 Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions. | S | Projects

PO5 Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations. | H | Mini Projects

PO6 The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice. | N | --

PO7 Environment and sustainability: Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for, sustainable development. | N | --

PO8 Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice. | N | --

PO9 Individual and team work: Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings. | H | Projects

PO10 Communication: Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions. | N | --

PO11 Project management and finance: Demonstrate knowledge and understanding of the engineering and management principles and apply these to one's own work, as a member and leader in a team, to manage projects and in multidisciplinary environments. | S | Projects

PO12 Life-long learning: Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change. | N | --

N - None | S - Supportive | H - Highly Related


MAPPING COURSE OBJECTIVES LEADING TO THE ACHIEVEMENT OF PROGRAM OUTCOMES

Course Objectives vs Program Outcomes (PO1–PO12) and Program Specific Outcomes (PSO1–PSO3):

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
I H H H H
II H S S S
III H S S S
IV H S S S
V H S S S

S - Supportive | H - Highly Related

Prepared by: K. Lakshminarayana

HOD, COMPUTER SCIENCE AND ENGINEERING


R18 B.TECH. COMPUTER SCIENCE & ENGG.

CS601PC: MACHINE LEARNING

B.TECH III Year II-Sem.    L T P C
                           3 1 0 4

Prerequisites: Data Structures, knowledge of statistical methods

Course Objectives:

1. To explain machine learning techniques such as decision tree learning and Bayesian learning.
2. To understand computational learning theory.
3. To study pattern comparison techniques.

Course Outcomes:
1. Understand the concepts of computational intelligence such as machine learning.
2. Acquire the skill to apply machine learning techniques to address real-time problems in different areas.
3. Understand neural networks and their usage in machine learning applications.

UNIT-I
Introduction - Well-posed learning problems, designing a learning system, perspectives and issues in machine learning.
Concept learning and the general-to-specific ordering – introduction, a concept learning task, concept learning as search, Find-S: finding a maximally specific hypothesis, version spaces and the candidate elimination algorithm, remarks on version spaces and candidate elimination, inductive bias.
Decision Tree Learning – introduction, decision tree representation, appropriate problems for decision tree learning, the basic decision tree learning algorithm, hypothesis space search in decision tree learning, inductive bias in decision tree learning, issues in decision tree learning.

UNIT-II
Artificial Neural Networks-1 – Introduction, neural network representation, appropriate problems for neural network learning, perceptrons, multilayer networks and the back-propagation algorithm.
Artificial Neural Networks-2 – Remarks on the back-propagation algorithm, an illustrative example: face recognition, advanced topics in artificial neural networks.
Evaluating Hypotheses – Motivation, estimating hypothesis accuracy, basics of sampling theory, a general approach for deriving confidence intervals, difference in error of two hypotheses, comparing learning algorithms.

UNIT-III
Bayesian learning – Introduction, Bayes theorem, Bayes theorem and concept learning, maximum likelihood and least-squared error hypotheses, maximum likelihood hypotheses for predicting probabilities, minimum description length principle, Bayes optimal classifier, Gibbs algorithm, Naïve Bayes classifier, an example: learning to classify text, Bayesian belief networks, the EM algorithm.
Computational learning theory – Introduction, probably learning an approximately correct
hypothesis, sample complexity for finite hypothesis space, sample complexity for infinite hypothesis
spaces, the mistake bound model of learning.
Instance-Based Learning- Introduction, k-nearest neighbour algorithm, locally weighted regression,
radial basis functions, case-based reasoning, remarks on lazy and eager learning.
UNIT-IV
Genetic Algorithms – Motivation, Genetic algorithms, an illustrative example, hypothesis
space search, genetic programming, models of evolution and learning, parallelizing genetic
algorithms.
Learning Sets of Rules – Introduction, sequential covering algorithms, learning rule sets: summary,
learning First-Order rules, learning sets of First-Order rules: FOIL, Induction as inverted deduction,
inverting resolution.
Reinforcement Learning – Introduction, the learning task, Q-learning, nondeterministic rewards
and actions, temporal difference learning, generalizing from examples, relationship to dynamic
programming.
UNIT-V
Analytical Learning-1- Introduction, learning with perfect domain theories: PROLOG-EBG,
remarks on explanation-based learning, explanation-based learning of search control
knowledge.
Analytical Learning-2-Using prior knowledge to alter the search objective, using prior knowledge to
augment search operators. Combining Inductive and Analytical Learning – Motivation, inductive-
analytical approaches to learning, using prior knowledge to initialize the hypothesis.

TEXTBOOKS:
1. Machine Learning – Tom M. Mitchell, - MGH

REFERENCES:
1. Machine Learning: An Algorithmic Perspective, Stephen Marsland, Taylor & Francis
Machine Learning Session Planner

S.No | Class | Topic | Text/Ref Book | Date Planned | Date Conducted
UNIT-1
1 | Introduction to course
2 | Course Objectives and Course outcomes
3 | LH1 | Well-posed learning problems, designing a learning system | T1
4 | LH2 | Perspectives and issues in machine learning | T1
5 | LH3 | A concept learning task, concept learning as search | T1
6 | LH4 | Find-S: finding a maximally specific hypothesis | T1
7 | LH5 | Version spaces and the candidate elimination algorithm | T1
8 | LH6 | Remarks on version spaces | T1
9 | LH7 | Candidate elimination, inductive bias | T1
10 | LH8 | Decision tree representation, appropriate problems for decision tree learning | T1
11 | LH9 | The basic decision tree learning algorithm | T1
12 | LH10 | Hypothesis space search in decision tree learning | T1
13 | LH11 | Inductive bias in decision tree learning, issues in decision tree learning | T1
14 | PPT-1
15 | TS1 | Test 1
16 | ALP | Active Learning - Collaborative Learning
UNIT-2
17 | LH12 | Neural network representation, appropriate problems for neural network learning | T1
18 | LH13 | Perceptrons, multilayer networks | T1
19 | LH14 | The back-propagation algorithm | T1
20 | LH15 | Advanced topics in artificial neural networks | T1
21 | LH16 | Motivation, estimating hypothesis accuracy | T1
22 | LH17 | Basics of sampling theory | T1
23 | LH18 | Basics of sampling theory (contd.) | T1
24 | LH19 | General approach for deriving confidence intervals | T1
25 | LH20 | Difference in error of two hypotheses | T1
26 | LH21 | Comparing learning algorithms | T1
27 | PPT-2
28 | TS2 | Test 2
29 | ALP | Active Learning - Flipped Class
UNIT-3
30 | LH22 | Bayesian learning – Introduction, Bayes theorem | T1
31 | LH23 | Bayes theorem and concept learning | T1
32 | LH24 | Maximum likelihood and least-squared error hypotheses | T1
33 | LH25 | Maximum likelihood hypotheses for predicting probabilities | T1
34 | LH26 | Minimum description length principle, Bayes optimal classifier | T1
35 | LH27 | Gibbs algorithm, Naïve Bayes classifier | T1
36 | LH28 | Example: learning to classify text, Bayesian belief networks, EM algorithm | T1
37 | LH29 | Probably learning an approximately correct hypothesis | T1
38 | LH30 | Sample complexity for finite and infinite hypothesis spaces | T1
39 | LH31 | Mistake bound model of learning | T1
40 | LH32 | Instance-based learning – Introduction | T1
41 | LH33 | k-nearest neighbour algorithm | T1
42 | LH34 | Locally weighted regression, radial basis functions | T1
43 | LH35 | Case-based reasoning, remarks on lazy and eager learning | T1
44 | PPT-3
45 | TS3 | Test 3
46 | ALP | Active Learning - Poster Presentation
UNIT-4
47 | LH36 | Genetic Algorithms – Motivation, genetic algorithms, an illustrative example | T1
48 | LH37 | Hypothesis space search, genetic programming | T1
49 | LH38 | Models of evolution and learning, parallelizing genetic algorithms | T1
50 | LH39 | Learning Sets of Rules – Introduction | T1
51 | LH40 | Sequential covering algorithms | T1
52 | LH41 | Learning rule sets: summary, learning first-order rules | T1
53 | LH42 | Learning sets of first-order rules: FOIL | T1
54 | LH43 | Induction as inverted deduction, inverting resolution | T1
55 | LH44 | Reinforcement Learning – Introduction, the learning task | T1
56 | LH45 | Q-learning, nondeterministic rewards and actions | T1
57 | LH46 | Temporal difference learning, generalizing from examples | T1
58 | LH47 | Relationship to dynamic programming | T1
59 | PPT-4
60 | TS4 | Test 4
61 | ALP | Active Learning - Collaborative Learning
UNIT-5
62 | LH48 | Analytical Learning-1 – Learning with perfect domain theories | T1
63 | LH49 | PROLOG-EBG, remarks on explanation-based learning | T1
64 | LH50 | Explanation-based learning of search control knowledge | T1
65 | LH51 | Analytical Learning-2 – Using prior knowledge to alter the search objective | T1
66 | LH52 | Using prior knowledge to augment search operators | T1
67 | LH53 | Combining Inductive and Analytical Learning – Motivation | T1
68 | LH54 | Inductive-analytical approaches to learning | T1
69 | LH55 | Using prior knowledge to initialize the hypothesis | T1
70 | PPT-5
71 | TS5 | Test 5
72 | ALP | Active Learning - Think Pair
INDEX

MACHINE LEARNING (A.Y: 2020-21)

S.No. | Topic
1. | Lecture Notes
2. | Assignment questions and answers
3. | University questions
4. | Objective questions and answers
5. | Tutorial questions and answers
6. | Interview questions
7. | NPTEL References
8. | Course Learning outcomes
9. | UNIT TEST Question paper
10. | Seminar questions
11. | Bloom's taxonomy
12. | Active Learning
13. | GATE questions
14. | Real time applications


1.0 Machine Learning Introduction

Machine learning deals with how to program computers to learn, that is, to improve automatically with experience.
Imagine computers learning from medical records which treatments are most effective for new diseases,
houses learning from experience to optimize energy costs based on the particular usage patterns of their
occupants, or personal software assistants learning the evolving interests of their users in order to highlight
especially relevant stories from the online morning newspaper. A successful understanding of how to make
computers learn would open up many new uses of computers and new levels of competence and
customization. For problems such as speech recognition, algorithms based on machine learning outperform
all other approaches that have been attempted to date. It seems inevitable that machine learning will play
an increasingly central role in computer science and computer technology.
1.1 WELL-POSED LEARNING PROBLEMS

Definition of learning: A computer program is said to learn from experience E with respect
to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P,
improves with experience E.

A checkers learning problem:

Task T: playing checkers


Performance measure P: percent of games won against opponents
Training experience E: playing practice games against itself

A handwriting recognition learning problem:


Task T: recognizing and classifying handwritten words within images
Performance measure P: percent of words correctly classified
Training experience E: a database of handwritten words with given classifications

A robot driving learning problem:


Task T: driving on public four-lane highways using vision sensors
Performance measure P: average distance traveled before an error (as judged by human overseer)
Training experience E: a sequence of images and steering commands recorded while observing a human driver.

1.2 DESIGNING A LEARNING SYSTEM


Choosing the Training Experience
Choosing the Target Function
Choosing a Representation for the Target Function
Choosing a Function Approximation Algorithm
The Final Design.

Choosing the Training Experience:

The first design choice we face is to choose the type of training experience from which our system will
learn. The type of training experience available can have a significant impact on success or failure of the
learner.

One key attribute is whether the training experience provides direct or indirect feedback regarding the
choices made by the performance system. A second important attribute of the training experience is the
degree to which the learner controls the sequence of training examples. A third important attribute of the
training experience is how well it represents the distribution of examples over which the final system
performance P must be measured. In general, learning is most reliable when the training examples follow a
distribution similar to that of future test examples.

Choosing the Target Function

The next design choice is to determine exactly what type of knowledge will be learned and how this will
be used by the performance program. Let us begin with a checkers-playing program that can generate the
legal moves from any board state. The program needs only to learn how to choose the best move from
among these legal moves.

Let us call this function ChooseMove and use the notation ChooseMove : B → M to indicate that this
function accepts as input any board from the set of legal board states B and produces as output some move
from the set of legal moves M.

An alternative target function and one that will turn out to be easier to learn in this setting-is an evaluation
function that assigns a numerical score to any given board state. Let us call this target function V and again
use the notation V : B → R to denote that V maps any legal board state from the set B to some real value.

1. if b is a final board state that is won, then V(b) = 100
2. if b is a final board state that is lost, then V(b) = -100
3. if b is a final board state that is drawn, then V(b) = 0
4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can be reached starting from b and playing optimally until the end of the game.

Choosing a Representation for the Target Function

Choosing a Function Approximation Algorithm
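Mitchell's running example makes these two choices concrete by representing the target function as a linear combination of board features, V_hat(b) = w0 + w1*x1 + ... + w6*x6, and by tuning the weights with the least mean squares (LMS) rule. The following Python sketch illustrates the idea; the board encoding is assumed rather than implemented, the six feature values are hypothetical, and the learning rate 0.1 follows the book's suggestion.

# A minimal sketch of the linear evaluation function and the LMS weight-update
# rule from the checkers example. We assume each board is already given as its
# six feature values x1..x6 (counts of pieces, kings, and threatened pieces).

def v_hat(weights, x):
    # V_hat(b) = w0 + w1*x1 + ... + w6*x6
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))

def lms_update(weights, x, v_train, eta=0.1):
    # For each weight: w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i
    error = v_train - v_hat(weights, x)
    inputs = [1.0] + list(x)  # x0 = 1 pairs with the constant weight w0
    return [w + eta * error * xi for w, xi in zip(weights, inputs)]

weights = [0.0] * 7                  # w0..w6, initially zero
board_features = [3, 0, 2, 1, 0, 0]  # hypothetical x1..x6 for one board
weights = lms_update(weights, board_features, v_train=100.0)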

The Final Design
The final design of our checkers learning system can be naturally described by four distinct program modules that represent the central components in many learning systems. These four modules, summarized in Figure 1.1, are the Performance System, the Critic, the Generalizer, and the Experiment Generator.

1.3 PERSPECTIVES AND ISSUES IN MACHINE LEARNING

One useful perspective on machine learning is that it involves searching a very large space of possible
hypotheses to determine one that best fits the observed data and any prior knowledge held by the learner.

Issues in Machine Learning


The field of machine learning is concerned with answering questions such as the following:
• What algorithms exist for learning general target functions from specific training examples?
• How much training data is sufficient?
• Can prior knowledge be helpful even when it is only approximately correct?
• What is the best strategy for choosing a useful next training experience, and how does the choice of this strategy alter the complexity of the learning problem?
• What is the best way to reduce the learning task to one or more function approximation problems?
• How can the learner automatically alter its representation to improve its ability to represent and
learn the target function?

1.4 A CONCEPT LEARNING TASK

Consider the example task of learning the target concept "days on which my friend Aldo enjoys his favorite
water sport." Table 2.1 describes a set of example days, each represented by a set of attributes. The
attribute EnjoySport indicates whether or not Aldo enjoys his favorite water sport on this day. The task is
to learn to predict the value of EnjoySport for an arbitrary day, based on the values of its other attributes.

What hypothesis representation shall we provide to the learner in this case? Let us begin by considering a
simple representation in which each hypothesis consists of a conjunction of constraints on the instance
attributes. In particular, let each hypothesis be a vector of six constraints, specifying the values of the six
attributes Sky, AirTemp, Humidity, Wind, Water, and Forecast. For each attribute, the hypothesis will
either indicate by a "?" that any value is acceptable, specify a single required value (e.g., Warm), or
indicate by a "∅" that no value is acceptable. For example, the hypothesis that Aldo enjoys his favorite
sport only on cold days with high humidity is represented by ⟨?, Cold, High, ?, ?, ?⟩.

1.5 CONCEPT LEARNING AS SEARCH

The goal of this search is to find the hypothesis that best fits the training examples. Consider, for example,
the instances X and hypotheses H in the EnjoySport learning task. Given that the attribute Sky has three
possible values, and that AirTemp, Humidity, Wind, Water, and Forecast each have two possible values,
the instance space X contains exactly 3·2·2·2·2·2 = 96 distinct instances. A similar calculation shows that
there are 5·4·4·4·4·4 = 5120 syntactically distinct hypotheses within H.
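Both counts are quick to verify; a two-line Python check:

# Sky has 3 values and the other five attributes have 2 each; adding "?" and
# the empty constraint gives 5 and 4 choices per attribute, respectively.
instances = 3 * 2**5    # 3*2*2*2*2*2 = 96 distinct instances
hypotheses = 5 * 4**5   # 5*4*4*4*4*4 = 5120 syntactically distinct hypotheses
print(instances, hypotheses)  # 96 5120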

Most practical learning tasks involve much larger, sometimes infinite, hypothesis spaces. To illustrate the
general-to-specific ordering, consider the two hypotheses h1 = ⟨Sunny, Warm, ?, Strong, ?, ?⟩ and
h2 = ⟨Sunny, ?, ?, ?, ?, ?⟩. Now consider the sets of instances that are classified positive by h1 and by h2.
Because h2 imposes fewer constraints on the instance, it classifies more instances as positive. In fact, any
instance classified positive by h1 will also be classified positive by h2. Therefore, we say that h2 is more
general than h1.
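This relation can be stated operationally. A minimal Python sketch, assuming hypotheses are six-tuples in which "?" accepts any value (h1 and h2 follow the example above):

def more_general_or_equal(h2, h1):
    # h2 is at least as general as h1 if, attribute by attribute, h2 either
    # accepts anything ("?") or imposes exactly the same constraint as h1.
    return all(c2 == "?" or c2 == c1 for c2, c1 in zip(h2, h1))

h1 = ("Sunny", "Warm", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 imposes fewer constraints
print(more_general_or_equal(h1, h2))  # False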

1.6 FIND-S: FINDING A MAXIMALLY SPECIFIC HYPOTHESIS

The key property of the FIND-S algorithm is that for hypothesis spaces described by conjunctions of
attribute constraints (such as H for the EnjoySport task), FIND-S is guaranteed to output the most specific
hypothesis within H that is consistent with the positive training examples. Its final hypothesis will
also be consistent with the negative examples provided the correct target concept is contained in H, and
provided the training examples are correct.
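A minimal Python sketch of FIND-S under this representation; the four training examples are the EnjoySport data from the textbook, and encoding instances as plain tuples is an assumption of the sketch:

def find_s(examples):
    # examples: list of (instance, label) pairs; instances are attribute tuples.
    h = None                   # start from the most specific hypothesis
    for x, label in examples:
        if not label:          # FIND-S ignores negative examples
            continue
        if h is None:
            h = list(x)        # first positive example: copy it verbatim
        else:
            # Minimally generalize: keep matching constraints, relax the rest.
            h = [hc if hc == xc else "?" for hc, xc in zip(h, x)]
    return tuple(h) if h else ("∅",) * 6

train = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(train))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')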

1.7 VERSION SPACES AND THE CANDIDATE-ELIMINATION ALGORITHM

A second approach to concept learning, the CANDIDATE-ELIMINATION algorithm, addresses several of
the limitations of FIND-S. The key idea in CANDIDATE-ELIMINATION is to output a description of the
set of all hypotheses consistent with the training examples.

CANDIDATE-ELIMINATION Algorithm:

The CANDIDATE-ELIMINATION algorithm computes the version space containing all hypotheses from H
that are consistent with an observed sequence of training examples. It begins by initializing the version
space to the set of all hypotheses in H; that is, by initializing the G boundary set to contain the most
general hypothesis in H,

G0 ← {⟨?, ?, ?, ?, ?, ?⟩}

and initializing the S boundary set to contain the most specific (least general) hypothesis,

S0 ← {⟨∅, ∅, ∅, ∅, ∅, ∅⟩}

These two boundary sets delimit the entire hypothesis space, because every other hypothesis in H is both
more general than S0 and more specific than G0.
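A short sketch of the two initial boundary sets and of the minimal generalization of S that a positive example triggers (the specialization of the G boundary on negative examples is omitted here):

G0 = {("?",) * 6}    # the most general hypothesis: classifies everything positive
S0 = {("∅",) * 6}    # the most specific hypothesis: classifies nothing positive

def generalize_s(s, x):
    # Minimally generalize s so that it covers the positive instance x.
    if all(c == "∅" for c in s):
        return tuple(x)  # the first positive example replaces the empty hypothesis
    return tuple(c if c == xc else "?" for c, xc in zip(s, x))

s = next(iter(S0))
s = generalize_s(s, ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"))
s = generalize_s(s, ("Sunny", "Warm", "High", "Strong", "Warm", "Same"))
print(s)  # ('Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same')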

Notice that the algorithm is specified in terms of operations such as computing minimal generalizations
and specializations of given hypotheses, and identifying nonminimal and nonmaximal hypotheses.

The fourth training example further generalizes the S boundary of the version space. It also results in
removing one member of the G boundary, because this member fails to cover the new positive example.
This last action results from the first step under the condition "If d is a positive example" in the algorithm
shown in Table 2.7. After processing these four examples, the boundary sets S4 and G4 delimit the
version space of all hypotheses consistent with the set of incrementally observed training examples. The
entire version space, including those hypotheses bounded by S4 and G4, is shown in Figure 2.7.

1.8 REMARKS ON VERSION SPACES AND CANDIDATE-ELIMINATION

The version space learned by the CANDIDATE-ELIMINATION algorithm will converge toward the hypothesis
that correctly describes the target concept, provided (1) there are no errors in the training examples, and (2)
there is some hypothesis in H that correctly describes the target concept.

A Biased Hypothesis Space


To illustrate, consider again the EnjoySport example in which we restricted the hypothesis space to
include only conjunctions of attribute values. Because of this restriction, the hypothesis space is unable to
represent even simple disjunctive target concepts such as "Sky = Sunny or Sky = Cloudy."
In fact, given the following three training examples of this disjunctive hypothesis, our algorithm would
find that there are zero hypotheses in the version space.

1.9 DECISION TREE REPRESENTATION

Decision tree learning is one of the most widely used and practical methods for inductive inference. It is a
method for approximating discrete-valued functions that is robust to noisy data and capable of learning
disjunctive expressions. Decision tree learning is a method for approximating discrete-valued target
functions, in which the learned function is represented by a decision tree. Learned trees can also be re-
represented as sets of if-then rules to improve human readability.

These learning methods are among the most popular of inductive inference algorithms and have been
successfully applied to a broad range of tasks from learning to diagnose medical cases to learning to assess
credit risk of loan applicants.

Figure 3.1 illustrates a typical learned decision tree. This decision tree classifies Saturday mornings
according to whether they are suitable for playing tennis. For example, the instance

⟨Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong⟩

would be sorted down the leftmost branch of this decision tree and would therefore be classified as a
negative instance (i.e., the tree predicts that PlayTennis = no). This tree and the examples in Table 3.2
are used to illustrate the ID3 learning algorithm.

In general, decision trees represent a disjunction of conjunctions of constraints on the attribute values of
instances. Each path from the tree root to a leaf corresponds to a conjunction of attribute tests, and the tree
itself to a disjunction of these conjunctions. For example, the decision tree shown in Figure 3.1
corresponds to the expression

(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)

1.10 APPROPRIATE PROBLEMS FOR DECISION TREE LEARNING

Decision tree learning is generally best suited to problems with the following characteristics:
• Instances are represented by attribute-value pairs.

• Instances are described by a fixed set of attributes (e.g., Temperature) and their values (e.g., Hot).

• The target function has discrete output values. The decision tree in Figure 3.1 assigns a boolean
classification (e.g., yes or no) to each example.

• Disjunctive descriptions may be required. As noted above, decision trees naturally represent
disjunctive expressions.
• The training data may contain errors. Decision tree learning methods are robust to errors, both
errors in classifications of the training examples and errors in the attribute values that describe
these examples.
• The training data may contain missing attribute values.

Decision tree learning has therefore been applied to problems such as learning to classify medical patients
by their disease, equipment malfunctions by their cause, and loan applicants by their likelihood of
defaulting on payments. Such problems, in which the task is to classify examples into one of a discrete set
of possible categories, are often referred to as classification problems.

1.11 THE BASIC DECISION TREE LEARNING ALGORITHM

The central choice in the ID3 algorithm is selecting which attribute to test at each node in the tree. We
would like to select the attribute that is most useful for classifying examples. What is a good quantitative
measure of the worth of an attribute? We will define a statistical property, called information gain, that
measures how well a given attribute separates the training examples according to their target classification.
ID3 uses this information gain measure to select among the candidate attributes at each step while growing
the tree.

To illustrate, suppose S is a collection of 14 examples of some boolean concept, including
9 positive and 5 negative examples (we adopt the notation [9+, 5-] to summarize such a
sample of data). Then the entropy of S relative to this boolean classification is

Entropy([9+, 5-]) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

where, in general, Entropy(S) = -p(+) log2 p(+) - p(-) log2 p(-) for the proportions of positive and
negative examples in S.

INFORMATION GAIN MEASURES THE EXPECTED REDUCTION IN ENTROPY

Information gain is precisely the measure used by ID3 to select the best attribute at each
step in growing the tree.
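Both entropy and information gain, Gain(S, A) = Entropy(S) - Σv (|Sv|/|S|) · Entropy(Sv), are easy to compute directly. A small Python sketch; the split counts in the last line are assumed to correspond to the Wind attribute of the PlayTennis data in Table 3.2:

import math

def entropy(pos, neg):
    # Entropy(S) = -p+ log2 p+ - p- log2 p-  (a term is 0 when its class is empty)
    total = pos + neg
    e = 0.0
    for c in (pos, neg):
        if c:
            p = c / total
            e -= p * math.log2(p)
    return e

def info_gain(parent, splits):
    # splits: list of (pos, neg) counts, one per value of the attribute A.
    total = sum(p + n for p, n in splits)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in splits)
    return entropy(*parent) - remainder

print(f"{entropy(9, 5):.3f}")                        # 0.940
print(f"{info_gain((9, 5), [(6, 2), (3, 3)]):.3f}")  # 0.048 (Wind: Weak vs Strong)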

To illustrate the operation of ID3, consider the learning task represented by the training
examples of Table 3.2. Here the target attribute PlayTennis, which can have values yes or
no for different Saturday mornings, is to be predicted based on other attributes of the
morning in question.
Figure 3.4 illustrates the computations of information gain for the next step in growing the decision tree.
The final decision tree learned by ID3 from the 14 training examples of Table 3.2 is shown in Figure 3.1.

1.12 HYPOTHESIS SPACE SEARCH IN DECISION TREE LEARNING

The hypothesis space searched by ID3 is the set of possible decision trees. ID3 performs a simple-to-complex,
hill-climbing search through this hypothesis space, beginning with the empty tree, then
considering progressively more elaborate hypotheses in search of a decision tree that correctly classifies
the training data. The evaluation function that guides this hill-climbing search is the information gain
measure. This search is depicted in Figure 3.5.

1.13 INDUCTIVE BIAS IN DECISION TREE LEARNING


Inductive bias is the set of assumptions that, together with the training data, deductively justify the
classifications assigned by the learner to future instances. It chooses the first acceptable tree it encounters
in its simple-to-complex, hill-climbing search through the space of possible trees. Roughly speaking, then,
the ID3 search strategy
(a) selects in favor of shorter trees over longer ones, and
(b) selects trees that place the attributes with highest information gain closest to the root.

The inductive bias of ID3 is thus a preference for certain hypotheses over others (e.g., for shorter
hypotheses), with no hard restriction on the hypotheses that can be eventually enumerated. This form of
bias is typically called a preference bias (or, alternatively, a search bias). Given that some form of
inductive bias is required in order to generalize beyond the training data, which type of inductive bias shall
we prefer: a preference bias or a restriction bias?

Typically, a preference bias is more desirable than a restriction bias, because it allows the learner to work
within a complete hypothesis space that is assured to contain the unknown target function. In contrast, a
restriction bias that strictly limits the set of potential hypotheses is generally less desirable, because it
introduces the possibility of excluding the unknown target function altogether.

1.14 ISSUES IN DECISION TREE LEARNING

Avoiding Overfitting the Data

Handling Training Examples with Missing Attribute Values


In certain cases, the available data may be missing values for some attributes. For example, in a medical
domain in which we wish to predict patient outcome based on various laboratory tests, it may be that the
lab test Blood-Test-Result is available only for a subset of the patients. In such cases, it is common to
estimate the missing attribute value based on other examples for which this attribute has a known value.

Handling Attributes with Differing Costs


In some learning tasks the instance attributes may have associated costs. For example, in learning to
classify medical diseases we might describe patients in terms of attributes such as Temperature,
BiopsyResult, Pulse, BloodTestResults, etc. These attributes vary significantly in their costs, both in terms
of monetary cost and cost to patient comfort. In such tasks, we would prefer decision trees that use low-
cost attributes where possible, relying on high-cost attributes only when needed to produce reliable
classifications.
UNIT-I Assignment Questions

S.No | Question | Taxonomy Level | Course Outcome
1 | Define machine learning and mention its popular applications. | TL1 | CO1
2 | List and explain the steps for designing a learning system. | TL2 | CO1
3 | Explain the Find-S and Candidate-Elimination algorithms. | TL2 | CO1
4 | Explain decision tree learning algorithms. | TL2 | CO1
5 | Discuss issues in decision tree learning. | TL2 | CO1
UNIVERSITY QUESTIONS

S.No | Description | BTL | COs
1 | Explain the two uses of features in machine learning. (JNTUH-2020) | TL1 | CO1
2 | List the problems that can be solved with machine learning. (JNTUH-2020) | TL2 | CO1
3 | Find the least general conjunctive generalization of two conjunctions, employing internal disjunction. (JNTUH-2020) | TL3 | CO1
4 | Explain in detail about decision trees with an example. (JNTUH-2020) | TL3 | CO1
5 | Which disciplines have their influence on machine learning? Explain with examples. (JNTUH-2019) | TL3 | CO1
6 | Contrast the hypothesis space search in ID3 and the candidate elimination algorithm. (JNTUH-2019) | TL3 | CO1
7 | Illustrate the impact of overfitting in a typical application of decision tree learning. (JNTUH-2019) | TL3 | CO1
8 | List and explain the basic design issues of a learning system. (JNTUH-2019) | TL2 | CO1
9 | Explain the following: i. most general consistent hypothesis; ii. closed concepts in a path through the hypothesis space. (JNTUH-2020) | TL2 | CO1
10 | List the applications of neural networks in machine learning. (JNTUH-2020) | TL1 | CO1
OBJECTIVE QUESTIONS

1. The field of machine learning is concerned with how to construct computer programs that automatically
improve with ______
a) copying b) experience c) pasting d) None

2. Which of the following are successful machine learning applications ______


a) Data Mining b) Handwritten Recognition c) Speech Recognition d) All

3. Machine learning concepts came from which of the following fields ______
a) Artificial Intelligence b) Statistics c) Biology d) All

4. Any computer program that improves its performance at some task through experience is called ______
a) Experience b) Recognition c) Learning d) Classification

5. A well-defined learning problem requires which of the following features ______
a) class of tasks b) measure of performance c) source of experience d) All

6. The design choice that determines exactly what type of knowledge will be learned is called ______
a) Target function b) Training c) Function approximation d) Final design

7. The learning algorithm adjusts the weights to best fit the set of ____ data
a) Training b) Test c) set of objects d) None

8. Any instance classified positive by hypothesis h1 will also be classified positive by h2; then h2 is
more ______ h1
a) specific than b) general than c) similar than d) None

9. Which of the following algorithm is more robust to noisy training data ______
a) Find-S b) Candidate-Elimination c) Decision-Tree d) All

10. Which of the following measure used by Decision Trees ______


a) Information Gain b) F1-Score c) Precision d) Recall
TUTORIAL QUESTIONS

WEEK-I | Ref
1. What is a well-posed learning problem? | 1.1
2. Write short notes on designing a learning system. | 1.2
3. List perspectives and issues in machine learning. | 1.3
4. List and explain applications of machine learning. | 1.0
5. Differentiate AI, ML and DL. | 1.0

WEEK-II (Chapter 2) | Ref
1. What is a concept learning task? | C 2.1
2. Discuss concept learning as search. | C 2.2
3. Discuss the Find-S algorithm with an example. | C 2.3
4. Discuss the Candidate-Elimination algorithm with an example. | C 2.4
5. What is inductive bias? | C 2.5

WEEK-III (Chapter 3) | Ref
1. What is a decision tree and its representation? | C 3.2
2. List appropriate problems for the decision tree algorithm. | C 3.3
3. Write the decision tree (DT) algorithm. | C 3.4
4. Discuss hypothesis space search in DT learning. | C 3.5
5. List issues in decision tree learning. | C 3.7

PLACEMENT QUESTIONS WITH KEY

1.What’s the trade-off between bias and variance?


Answer: Bias is error due to erroneous or overly simplistic assumptions in the learning algorithm you’re
using. This can lead to the model underfitting your data, making it hard for it to have high predictive
accuracy and for you to generalize your knowledge from the training set to the test set.

Variance is error due to too much complexity in the learning algorithm you’re using. This leads to the
algorithm being highly sensitive to high degrees of variation in your training data, which can lead your
model to overfit the data. You’ll be carrying too much noise from your training data for your model to be
very useful for your test data.

2. What is deep learning, and how does it contrast with other machine learning algorithms?
Answer: Deep learning is a subset of machine learning that is concerned with neural networks: how to use
backpropagation and certain principles from neuroscience to more accurately model large sets of
unlabelled or semi-structured data. In that sense, deep learning represents an unsupervised learning
algorithm that learns representations of data through the use of neural nets.

3. What’s the difference between a generative and discriminative model?

Answer: A generative model will learn categories of data while a discriminative model will simply learn
the distinction between different categories of data. Discriminative models will generally outperform
generative models on classification tasks.

4. How is a decision tree pruned?

Answer: Pruning is what happens in decision trees when branches that have weak predictive power are
removed in order to reduce the complexity of the model and increase the predictive accuracy of a decision
tree model. Pruning can happen bottom-up and top-down, with approaches such as reduced error pruning
and cost complexity pruning.

Reduced error pruning is perhaps the simplest version: replace each node with its most popular class, and
if that does not decrease predictive accuracy, keep the node pruned. While simple, this heuristic actually
comes pretty close to an approach that would optimize for maximum accuracy.

5. How do you handle missing or corrupted data in a dataset?

Answer: You could find missing/corrupted data in a dataset and either drop those rows or columns, or
decide to replace them with another value.

In Pandas, there are two very useful methods, isnull() and dropna(), that will help you find columns of
data with missing or corrupted data and drop those values. If you want to fill the invalid values with a
placeholder value (for example, 0), you could use the fillna() method.
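A brief illustration of those pandas methods; the DataFrame and its column names are invented for the example:

import numpy as np
import pandas as pd

df = pd.DataFrame({"age": [25, np.nan, 31], "income": [50.0, 60.0, np.nan]})
print(df.isnull().sum())   # number of missing values in each column
cleaned = df.dropna()      # drop every row that contains a missing value
filled = df.fillna(0)      # or replace missing values with a placeholder (0)
print(filled)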
NPTEL References UNIT-1

1. https://onlinecourses.nptel.ac.in/noc19_cs52/preview

2. https://www.youtube.com/watch?v=UE4UQjn6SOM

3. https://www.youtube.com/watch?v=VMoPY9Wimi4

4. https://towardsdatascience.com/decision-tree-ba64f977f7c3

Course Learning Outcomes (UNIT-1)

1. Understand machine learning applications.

2. Ability to design a learning system.

3. Understanding of hypothesis generation for given training data.

4. Ability to generate a decision tree for given training data.


TEST PAPER SET-1

Short Answer Questions (2M)


1. List the basic design issues to machine learning.
2. How to use entropy as an evaluation function?
3. What is the essential difference between analytical and inductive learning methods?
4. What is a decision tree?
5. What is a discriminative probabilistic model?

Essay Writing Questions (12M)


6. Define Machine learning and mention its popular applications?
7. List and Explain steps for designing a learning system?
8. Explain Find-S and Candidate-Elimination algorithms?
9. Explain Decision Tree Learning algorithms?
10. Discuss issues in Decision Tree learning?
TEST PAPER SET-2

Short Answer Questions (2M)


1. What is the representational power of a perceptron?
2. List the applications of neural networks in machine learning.
3. What are the functions used in decision trees?
4. What is descriptive learning?
5. What are the steps involved in designing a learning system?

Essay Writing Questions (12M)


6. Explain the two uses of features in machine learning.
7. Find least general conjunctive generalization of two conjunctions, employing internal
disjunction.
8. List the problems that can be solved with machine learning
9. Explain in detail about Decision Tree with an example.
10. Which disciplines have their influence on machine learning? Explain with examples.

Seminar topics:

S.No Seminar Topic


1. Importance of Machine learning in Real life applications.
2. How to design a learning system
3. What are the issues in machine learning
4. Discuss general-to-specific ordering of hypotheses with an example.
5. Inductive bias in decision tree learning.
BLOOM'S TAXONOMY UNIT-1

Checkers Game Analysis


Checkers Rules and Gameplay

Checkers is a fun, challenging, and relatively easy to learn game.

Game Pieces and Board

Checkers is a board game played between two people on an 8x8 checkered board like the one shown below.
Each player has 12 pieces that are like flat round disks that fit inside each of the boxes on the board. The
pieces are placed on every other dark square and then staggered by rows, as shown on the board.

Each Checkers player has different colored pieces. Sometimes the pieces are black and red or red and white.

Taking a Turn

Typically the darker color pieces move first. Each player takes their turn by moving a piece. Pieces are always
moved diagonally and can be moved in the following ways:

➢ Diagonally in the forward direction (towards the opponent) to the next dark square.

➢ If there is one of the opponent's pieces next to a piece and an empty space on the other side, you jump your
opponent and remove their piece. You can do multiple jumps if they are lined up in the forward direction. ***
note: if you have a jump, you have no choice but to take it.

King Pieces

The last row is called the king row. If you get a piece across the board to the opponent's king row, that piece
becomes a king. Another piece is placed onto that piece so it is now two pieces high. King pieces can move in both
directions, forward and backward. Once a piece is kinged, the player must wait until the next turn to jump out of the
king row.
Winning the Game

You win the game when the opponent has no more pieces or can't move (even if he/she still has pieces). If neither
player can move then it is a draw or a tie.

Checkers Strategy and Tips

Sacrifice 1 piece for 2: you can sometimes bait or force the opponent to take one of your pieces enabling you to then
take 2 of their pieces.

Pieces on the sides are valuable because they can't be jumped.

Don't bunch all your pieces in the middle or you may not be able to move, and then you will lose.

Try to keep your pieces on the back row or king row for as long as possible, to keep the other player from gaining a
king.

Plan ahead and try to look at every possible move before you take your turn.

Practice: if you play a lot against a lot of different players, you will get better.
Active Learning Unit-1
One type of machine learning algorithm is the decision tree, a classification algorithm that comes under
supervised learning.

What problems can a decision tree solve?

It can solve two types of problems, as the sketch below shows:

1. Classification: classify based on an if-then condition. Ex: if a flower's color is red then it is a rose;
if it is white then it is a lily.
2. Regression: a regression tree is used when the target is continuous, e.g., predicting the price of a house.
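A minimal sketch of both uses with scikit-learn, assuming it is available; the tiny datasets are purely illustrative:

from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Classification: predict the flower from one color feature (0 = red, 1 = white).
clf = DecisionTreeClassifier()
clf.fit([[0], [0], [1], [1]], ["rose", "rose", "lily", "lily"])
print(clf.predict([[0]]))   # ['rose']

# Regression: predict a continuous price from house size.
reg = DecisionTreeRegressor()
reg.fit([[50], [80], [120]], [100.0, 160.0, 250.0])
print(reg.predict([[80]]))  # [160.]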
GATE Questions with Answers
Real Time Applications

Following are Successful Applications of Machine Learning:

• Traffic Alerts
• Social Media
• Transportation and Commuting
• Products Recommendations
• Virtual Personal Assistants
• Self Driving Cars
• Dynamic Pricing
• Google Translate
• Online Video Streaming
• Fraud Detection

• Face Recognition

• Handwritten Recognition

• Speech Recognition
