
PART B - UNIT 1
1. Discuss the different types of machine learning.

Based on the methods and the way of learning, machine learning is mainly divided into four
types:

1. Supervised Machine Learning


2. Unsupervised Machine Learning
3. Semi-Supervised Machine Learning
4. Reinforcement Learning

1. Supervised Machine Learning


As its name suggests, supervised machine learning is based on supervision. In the supervised
learning technique, we train the machine using a "labelled" dataset, and based on this
training, the machine predicts the output. Here, the labelled data specifies that some of the
inputs are already mapped to outputs. More precisely, we first train the machine with inputs
and their corresponding outputs, and then we ask the machine to predict outputs for a test
dataset.

Let's understand supervised learning with an example. Suppose we have an input dataset of
cat and dog images. First, we train the machine to recognize the images using features such
as the shape and size of the tail, the shape of the eyes, colour, and height (dogs are taller,
cats are smaller). After training, we input the picture of a cat and ask the machine to
identify the object and predict the output. Since the machine is now well trained, it checks
all the features of the object, such as height, shape, colour, eyes, ears, and tail, finds
that it is a cat, and places it in the Cat category. This is how a machine identifies objects
in supervised learning.

The main goal of the supervised learning technique is to map the input variable (x) to the
output variable (y). Some real-world applications of supervised learning are risk assessment,
fraud detection, spam filtering, etc.
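
For illustration, a minimal Python sketch of this train-then-predict workflow (assuming
scikit-learn is available; the features, numbers, and labels below are made up to mirror the
cat/dog example):

# Minimal supervised-learning sketch (illustrative; assumes scikit-learn is installed).
# Each row is [height_cm, tail_length_cm]; label 0 = cat, 1 = dog (made-up values).
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X = [[25, 30], [28, 28], [60, 35], [55, 40], [24, 27], [65, 38]]
y = [0, 0, 1, 1, 0, 1]

# Train on the labelled data, then predict on held-out test data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)
print(model.predict(X_test))  # predicted labels (0 = cat, 1 = dog) for unseen examples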

Categories of Supervised Machine Learning

Supervised machine learning can be classified into two types of problems, which are given
below:

o Classification
o Regression

2. Unsupervised Machine Learning


Unsupervised learning differs from the supervised learning technique; as its name suggests,
there is no need for supervision. In unsupervised machine learning, the machine is trained
using an unlabelled dataset and predicts the output without any supervision.

In unsupervised learning, the models are trained with data that is neither classified nor
labelled, and the model acts on that data on its own.

The main aim of the unsupervised learning algorithm is to group or categorize the
unsorted dataset according to similarities, patterns, and differences. Machines are
instructed to find hidden patterns in the input dataset.

Let's take an example to understand this more precisely: suppose there is a basket of fruit
images, and we input it into the machine learning model. The images are totally unknown to
the model, and the task of the machine is to find patterns and categories among the objects.

The machine discovers patterns and differences on its own, such as differences in colour and
shape, and predicts the output when it is tested with the test dataset.
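
For illustration, a minimal clustering sketch (assuming scikit-learn; the "colour" and
"shape" feature values are made up): the model receives no labels and groups the points
purely by similarity.

# Minimal unsupervised-learning sketch (illustrative; assumes scikit-learn).
from sklearn.cluster import KMeans

# Each row: [colour_score, shape_score] for an unlabelled fruit image (made-up values).
X = [[1.0, 0.9], [0.9, 1.1], [5.0, 4.8], [5.2, 5.1], [1.1, 1.0], [4.9, 5.0]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # cluster id per image, discovered without any labels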

Categories of Unsupervised Machine Learning

Unsupervised Learning can be further classified into two types, which are given below:

o Clustering
o Association

3. Semi-Supervised Learning
Semi-supervised learning is a type of machine learning algorithm that lies between
supervised and unsupervised machine learning. It represents the intermediate ground
between supervised learning (with labelled training data) and unsupervised learning (with
no labelled training data) and uses a combination of labelled and unlabelled datasets
during the training period.

Although semi-supervised learning is the middle ground between supervised and unsupervised
learning and operates on data that contains a few labels, the data mostly consists of
unlabelled examples. Labels are costly to obtain, so in practice an organization may have
only a few of them. Semi-supervised learning thus differs from both supervised and
unsupervised learning, which are defined by the presence or absence of labels.

The concept of semi-supervised learning was introduced to overcome the drawbacks of
supervised and unsupervised learning algorithms. The main aim of semi-supervised learning
is to make effective use of all the available data, rather than only the labelled data as
in supervised learning. Initially, similar data is clustered using an unsupervised learning
algorithm, and this clustering then helps to turn the unlabelled data into labelled data.
This is worthwhile because labelled data is comparatively more expensive to acquire than
unlabelled data.

We can picture these algorithms with an example. Supervised learning is where a student is
under the supervision of an instructor at home and at college. If that student analyses the
same concept on their own, without any help from the instructor, it comes under unsupervised
learning. Under semi-supervised learning, the student first studies the concept under the
guidance of a college instructor and then revises it by themselves.
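
For illustration, a minimal self-training sketch of this idea (assuming scikit-learn and
NumPy; the data, the choice of logistic regression, and the 0.9 confidence threshold are
all illustrative): a model trained on the few labelled points pseudo-labels the unlabelled
ones it is confident about, then retrains on the combined set.

# Minimal semi-supervised (self-training) sketch; all values are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

X_labelled = np.array([[0.0], [0.2], [2.8], [3.0]])
y_labelled = np.array([0, 0, 1, 1])
X_unlabelled = np.array([[0.1], [0.3], [2.7], [2.9], [1.5]])

model = LogisticRegression().fit(X_labelled, y_labelled)

# Pseudo-label the unlabelled points the model is confident about, then retrain.
# (With real data this step is typically iterated until no confident points remain.)
proba = model.predict_proba(X_unlabelled)
confident = proba.max(axis=1) > 0.9
X_aug = np.vstack([X_labelled, X_unlabelled[confident]])
y_aug = np.concatenate([y_labelled, model.predict(X_unlabelled)[confident]])
model = LogisticRegression().fit(X_aug, y_aug)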

4. Reinforcement Learning
Reinforcement learning works on a feedback-based process in which an AI agent (a software
component) automatically explores its surroundings by trial and error: taking actions,
learning from experience, and improving its performance. The agent is rewarded for each good
action and punished for each bad one, so the goal of a reinforcement learning agent is to
maximize the rewards.
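
For illustration, a minimal tabular Q-learning sketch (the environment, a 5-cell corridor
with a reward at the far end, and all hyperparameters are made up): the agent explores by
trial and error and updates its value estimates from the reward feedback.

# Tabular Q-learning sketch; environment and hyperparameters are illustrative.
import random

n_states = 5
Q = [[0.0, 0.0] for _ in range(n_states)]        # Q[state][action]; action 0=left, 1=right
alpha, gamma, epsilon = 0.5, 0.9, 0.2            # learning rate, discount, exploration

def choose(s):
    # Explore randomly with probability epsilon (this also breaks ties), else act greedily.
    if random.random() < epsilon or Q[s][0] == Q[s][1]:
        return random.randrange(2)
    return 0 if Q[s][0] > Q[s][1] else 1

for episode in range(300):
    s = 0
    for _ in range(100):                         # step cap so every episode terminates
        a = choose(s)
        s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
        r = 1.0 if s2 == n_states - 1 else 0.0   # reward (good action) only at the goal
        # Q-update: move the estimate toward reward + discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if s == n_states - 1:
            break

print([q.index(max(q)) for q in Q[:-1]])  # learned action per state; typically all 1s (right)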

2. How to choose a function approximation algorithm while designing the learning system?
Discuss.
3. Define Well-Posed problem. Illustrate any two examples for Well-Posed problems.
•Definition
A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E.

•A Checkers Learning Problem


–Task T: playing checkers
–Performance measure P: percent of games won against opponents
–Training experience E: playing practice games against itself

•A Handwriting Recognition Learning Problem


–Task T: recognizing and classifying handwritten words within images
–Performance measure P: percent of words correctly classified
–Training experience E: a database of handwritten words with given classifications

•A Robot Driving Learning Problem


–Task T: driving on public four-lane highways using vision sensors
–Performance measure P: average distance traveled before an error (as judged by human
overseer)
–Training experience E: a sequence of images and steering commands recorded while observing
a human driver

4. For the given data find a maximally specific hypothesis using Find-S
algorithm:

S. No.  Origin  Manufacturer  Color  Year  Type  Class

1       JP      HO            Blue   1980  Eco   Yes
2       JP      TO            Green  1970  Spo   No
3       JP      TO            Blue   1990  Eco   Yes
4       USA     AU            Red    1980  Eco   No
5       JP      HO            White  1980  Eco   Yes
6       JP      TO            Green  1980  Eco   Yes
7       HP      HO            Red    1980  Eco   No

Using the Find-S algorithm, here is the solution:

h0 = <ϕ, ϕ, ϕ, ϕ, ϕ>

h1 = <JP, HO, Blue, 1980, Eco>

h2 = h1

h3 = <JP, ?, Blue, ?, Eco>

h4 = h3

h5 = <JP, ?, ?, ?, Eco>

h6 = <JP, ?, ?, ?, Eco> -> final maximally specific hypothesis

5. Explain Perspectives and Issues of Machine Learning.


Perspectives in Machine Learning: One useful perspective on machine learning is that it
involves searching a very large space of possible hypotheses to determine the one that best
fits the observed data and any prior knowledge held by the learner; the learning algorithm
explores this hypothesis space much as a search algorithm explores a space of candidate
solutions.

Issues in Machine Learning:


 What algorithms should be used? What algorithms exist for learning general target
functions from specific training examples? In what settings will particular algorithms
converge to the desired function, given sufficient training data? Which algorithms perform
best for which types of problems and representations?
 How much training data and testing data are sufficient? What general bounds can be found
to relate the confidence in learned hypotheses to the amount of training experience and the
character of the learner's hypothesis space?
 When and how can prior knowledge held by the learner guide the process of generalizing
from examples? Can prior knowledge be helpful even when it is only approximately correct?
 What kinds of methods should be used? What is the best strategy for choosing a useful
next training experience, and how does the choice of this strategy alter the complexity of
the learning problem?
 Which methods should be used to reduce learning overhead? What is the best way to reduce
the learning task to one or more function approximation problems? Put another way, what
specific functions should the system attempt to learn? Can this process itself be automated?
 Which methods should be used for which types of data?
 How can the learner automatically alter its representation to improve its ability to
represent and learn the target function?
6. Discuss the concepts of learning as search.
- Concept learning can be viewed as the task of searching through a large space of hypotheses
implicitly defined by the hypothesis representation.
- The goal of this search is to find the hypothesis that best fits the training examples.
For example, consider the instances X and hypotheses H in the EnjoySport learning task.
• The attribute Sky has three possible values, and AirTemp, Humidity, Wind, Water, and
Forecast each have two possible values, so the instance space X contains exactly
 3 · 2 · 2 · 2 · 2 · 2 = 96 distinct instances, and there are
 5 · 4 · 4 · 4 · 4 · 4 = 5120 syntactically distinct hypotheses within H.
• Every hypothesis containing one or more "∅" symbols represents the empty set of instances;
that is, it classifies every instance as negative. Hence there are only
 1 + (4 · 3 · 3 · 3 · 3 · 3) = 973 semantically distinct hypotheses.
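These counts can be verified with a few lines of Python:

# Quick arithmetic check of the EnjoySport counts.
instances = 3 * 2 * 2 * 2 * 2 * 2       # Sky has 3 values, the other five have 2
syntactic = 5 * 4 * 4 * 4 * 4 * 4       # each attribute also allows "?" and "∅"
semantic = 1 + 4 * 3 * 3 * 3 * 3 * 3    # all "∅"-hypotheses collapse into one
print(instances, syntactic, semantic)   # 96 5120 973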
General-to-Specific Ordering of Hypotheses
Consider the two hypotheses:
h1 = (Sunny, ?, ?, Strong, ?, ?)
h2 = (Sunny, ?, ?, ?, ?, ?)
Consider the sets of instances that are classified positive by h1 and by h2:
• h2 imposes fewer constraints on the instance, so it classifies more instances as positive.
• Any instance classified positive by h1 will also be classified positive by h2. Therefore,
h2 is more general than h1.
Given hypotheses hj and hk, hj is more-general-than-or-equal-to hk if and only if any
instance that satisfies hk also satisfies hj.
Definition: Let hj and hk be Boolean-valued functions defined over X. Then hj is
more-general-than-or-equal-to hk (written hj ≥g hk) if and only if
(∀x ∈ X)[(hk(x) = 1) → (hj(x) = 1)]
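
For illustration, this relation can be checked mechanically for conjunctive hypotheses like
h1 and h2 above (a sketch; "?" stands for any value and None for the empty constraint ϕ):

# Minimal check of the more-general-than-or-equal relation for conjunctive hypotheses.
def more_general_or_equal(hj, hk):
    # hj >=g hk iff, attribute by attribute, hj's constraint is no stricter than hk's:
    # "?" matches anything, and None (the empty constraint) is stricter than everything.
    return all(a == '?' or a == b or b is None for a, b in zip(hj, hk))

h1 = ('Sunny', '?', '?', 'Strong', '?', '?')
h2 = ('Sunny', '?', '?', '?', '?', '?')
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False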

• In the figure, the box on the left represents the set X of all instances, and the box on
the right the set H of all hypotheses.
• Each hypothesis corresponds to some subset of X: the subset of instances that it classifies
positive.
• The arrows connecting hypotheses represent the more-general-than relation, with the arrow
pointing toward the less general hypothesis.
• Note that the subset of instances characterized by h2 subsumes the subset characterized by
h1; hence h2 is more-general-than h1.
7. Explain the Find-S: Finding a Maximally Specific Hypothesis.
The Find-S algorithm is a basic concept learning algorithm in machine learning. It finds the
most specific hypothesis that fits all the positive examples; note that the algorithm
considers only the positive training examples. Find-S starts with the most specific
hypothesis and generalizes it each time it fails to classify an observed positive training
example. Hence, the Find-S algorithm moves from the most specific hypothesis toward more
general hypotheses, stopping at the maximally specific hypothesis consistent with the
positive examples.
Steps involved in Find-S:
1. Start with the most specific hypothesis:
h = {ϕ, ϕ, ϕ, ϕ, ϕ, ϕ}
2. Take the next example; if it is negative, make no changes to the hypothesis.
3. If the example is positive and the current hypothesis is too specific, update the current
hypothesis to a more general one.
4. Keep repeating the above steps until all the training examples have been processed.
5. After all the training examples are processed, we have the final hypothesis, which can be
used to classify new examples.
Algorithm:

1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x:
      For each attribute constraint ai in h:
            If the constraint ai is satisfied by x,
                  then do nothing;
            else replace ai in h by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
EXAMPLE:
Time     Weather  Temperature  Company  Humidity  Wind    Goes
Morning  Sunny    Warm         Yes      Mild      Strong  Yes
Evening  Rainy    Cold         No       Mild      Normal  No
Morning  Sunny    Moderate     Yes      Normal    Normal  Yes
Evening  Sunny    Cold         Yes      High      Strong  Yes
Looking at the dataset, we have six attributes and a final attribute that defines whether an
example is positive or negative. In this case, Yes is a positive example, meaning the person
will go for a walk.

After the first positive example, the hypothesis is:

h0 = {'Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong'}

Starting from this hypothesis, we consider each remaining example one by one, but only the
positive ones:

h1 = {'Morning', 'Sunny', '?', 'Yes', '?', '?'}

h2 = {'?', 'Sunny', '?', 'Yes', '?', '?'}

We replaced each attribute value that differed with '?' to obtain the resultant hypothesis.
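
For illustration, a direct Python implementation of Find-S on this dataset (a sketch; None
stands for the most specific constraint ϕ):

# Find-S sketch: generalize the hypothesis minimally over the positive examples only.
def find_s(examples):
    n = len(examples[0][0])
    h = [None] * n                      # step 1: most specific hypothesis
    for x, label in examples:
        if label != 'Yes':              # step 2: ignore negative examples
            continue
        for i in range(n):              # step 3: minimally generalize each constraint
            if h[i] is None:
                h[i] = x[i]
            elif h[i] != x[i]:
                h[i] = '?'
    return h

data = [
    (('Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong'), 'Yes'),
    (('Evening', 'Rainy', 'Cold', 'No', 'Mild', 'Normal'), 'No'),
    (('Morning', 'Sunny', 'Moderate', 'Yes', 'Normal', 'Normal'), 'Yes'),
    (('Evening', 'Sunny', 'Cold', 'Yes', 'High', 'Strong'), 'Yes'),
]
print(find_s(data))  # ['?', 'Sunny', '?', 'Yes', '?', '?']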

8. Apply candidate elimination algorithm for the given dataset and find version spaces.

Solution:
S0: (0, 0, 0, 0, 0) Most Specific Boundary
G0: (?, ?, ?, ?, ?) Most Generic Boundary
The first example is negative. The hypothesis at the specific boundary is consistent with it,
so we retain it; the hypothesis at the generic boundary is inconsistent, so we write all of
its minimal specializations by replacing one "?" at a time with an attribute value that
excludes the negative example.
S1: (0, 0, 0, 0, 0)
G1: (Many, ?, ?, ?, ?), (?, Big, ?, ?, ?), (?, Medium, ?, ?, ?), (?, ?, ?, Exp, ?),
(?, ?, ?, ?, One), (?, ?, ?, ?, Few)
The second example is positive. The hypothesis at the specific boundary is inconsistent, so
we minimally generalize the specific boundary; consistent hypotheses at the generic boundary
are retained and inconsistent ones are removed.
S2: (Many, Big, No, Exp, Many)
G2: (Many, ?, ?, ?, ?), (?, Big, ?, ?, ?), (?, ?, ?, Exp, ?), (?, ?, ?, ?, Many)
The third example is positive. The hypothesis at the specific boundary is inconsistent, so we
minimally generalize the specific boundary; consistent hypotheses at the generic boundary are
retained and inconsistent ones are removed.
S3: (Many, ?, No, Exp, ?)
G3: (Many, ?, ?, ?, ?), (?, ?, ?, Exp, ?)
The fourth example is positive. The hypothesis at the specific boundary is inconsistent, so
we minimally generalize the specific boundary; consistent hypotheses at the generic boundary
are retained and inconsistent ones are removed.
S4: (Many, ?, No, ?, ?)
G4: (Many, ?, ?, ?, ?)
The version space learned by the Candidate-Elimination algorithm for the given dataset is:
(Many, ?, No, ?, ?), (Many, ?, ?, ?, ?)
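
For illustration, a compact Python sketch of the Candidate-Elimination boundary updates (the
dataset from the question is not reproduced above, so the usage note at the end is
hypothetical; a full implementation would also prune S and G against each other):

def covers(h, x):
    # True if hypothesis h is satisfied by instance x ("?" matches anything).
    return all(c == '?' or c == v for c, v in zip(h, x))

def more_general(h1, h2):
    # h1 >=g h2 (None stands for the empty, most specific constraint).
    return all(a == '?' or a == b or b is None for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = {tuple([None] * n)}        # most specific boundary
    G = {tuple(['?'] * n)}         # most general boundary
    for x, positive in examples:
        if positive:
            G = {g for g in G if covers(g, x)}
            new_S = set()
            for s in S:
                if covers(s, x):
                    new_S.add(s)
                else:
                    # Minimal generalization of s that covers x.
                    h = tuple(v if c is None else (c if c == v else '?')
                              for c, v in zip(s, x))
                    if any(more_general(g, h) for g in G):
                        new_S.add(h)
            S = new_S
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                else:
                    # Minimal specializations of g that exclude x.
                    for i, c in enumerate(g):
                        if c == '?':
                            for v in domains[i]:
                                if v != x[i]:
                                    h = g[:i] + (v,) + g[i + 1:]
                                    if any(more_general(h, s) for s in S):
                                        new_G.add(h)
            G = new_G
    return S, G

# Hypothetical usage with the question's table: S, G = candidate_elimination(examples,
# domains), where examples is a list of ((attr1, ..., attr5), is_positive) pairs and
# domains[i] lists attribute i's possible values.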

9. Explain the basic decision tree learning algorithm and hypothesis space search in decision
tree learning.

The basic decision tree learning algorithm is a machine learning method used for classification
and regression tasks. It constructs a tree-like model of decisions and their possible consequences,
represented as branches and leaves, respectively. Here's an overview of the algorithm:

Data Preparation: The first step is to prepare the training data, which consists of labeled
examples. Each example is a set of input features and their corresponding target output. For
example, in a classification problem, the features could be attributes of an object, and the target
output could be the class label of the object.

Tree Construction: The algorithm starts with an empty tree and aims to build it iteratively. At
each step, it selects the best feature to split the data based on certain criteria, such as information
gain or Gini index. The selected feature becomes the decision node of the tree, and the data is
divided into subsets based on its possible attribute values.

Recursive Splitting: The algorithm recursively applies the splitting process to each subset of data
created from the previous step. This process continues until certain termination conditions are
met. For example, the algorithm might stop if all the examples in a subset belong to the same
class or if the tree reaches a maximum depth.

Leaf Node Assignment: Once the splitting process terminates for a subset, a leaf node is assigned
to the corresponding subset. The leaf node represents the predicted output value or class label. In
a classification problem, the majority class in the subset is typically chosen as the predicted class
label.

Pruning (Optional): After the tree is fully constructed, an optional pruning step can be performed
to reduce overfitting. Pruning involves removing unnecessary branches or nodes that do not
improve the accuracy on a validation set.

Now, let's discuss hypothesis space search in decision tree learning. The hypothesis space refers
to the set of all possible decision trees that can be constructed from the given input features and
target outputs. It represents the space of possible solutions that the algorithm explores during
learning. The algorithm searches this space to find the best decision tree that fits the training
data.

The search in the hypothesis space involves evaluating different combinations of decision nodes
and splitting criteria to construct decision trees. The algorithm aims to find the tree that best
represents the underlying patterns and relationships in the data.

The search process typically involves the following steps:


Feature Selection: The algorithm considers different features as potential decision nodes and
evaluates their effectiveness in splitting the data. Various metrics, such as information gain or
Gini index, are used to measure the quality of a feature in terms of its ability to discriminate
between different classes.

Splitting Criteria: Once a feature is selected, the algorithm explores different splitting criteria to
divide the data into subsets. The criteria measure the homogeneity or impurity of the subsets
resulting from the split. The goal is to find the criteria that maximize the information gain or
minimize the impurity, indicating a more informative and accurate split.

Recursive Exploration: The search process recursively explores different combinations of
features and splitting criteria for each subset of data. This exploration continues until
termination conditions, such as reaching a maximum depth or a minimum number of examples per
leaf node, are met.

Evaluation and Selection: Throughout the search process, the algorithm evaluates the
constructed trees using metrics such as accuracy or error rate on the training data. It keeps
track of the best-performing tree based on these metrics and selects it as the final decision
tree model.

The hypothesis space search in decision tree learning is, in practice, a greedy exploration
of different combinations of features and splitting criteria rather than an exhaustive one:
at each node the single best split is chosen and never revisited, so the result is a locally
rather than globally optimal tree. The complexity of the search depends on the number of
features, the size of the training data, and the chosen termination conditions.
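
For illustration, here is how the entropy and information-gain criteria mentioned above can
be computed (a minimal sketch for discrete-valued features):

# Entropy and information gain as used by ID3-style feature selection (sketch).
from collections import Counter
from math import log2

def entropy(labels):
    # H = -sum(p * log2(p)) over the class proportions.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(rows, labels, feature_index):
    # Gain = H(parent) - weighted sum of H(child subsets) after the split.
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(sub) / len(labels) * entropy(sub) for sub in subsets.values())
    return entropy(labels) - remainder

# Example: a binary feature that perfectly separates the classes has maximal gain.
rows = [('a',), ('a',), ('b',), ('b',)]
labels = ['yes', 'yes', 'no', 'no']
print(information_gain(rows, labels, 0))  # 1.0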
10. Construct decision tree using ID3 algorithm for the following data.

Note: The decision tree algorithm works with discrete-valued data, but here the age attribute
values are real-valued, so we first need to convert them to discrete/binary values, i.e. age
will be <=43 or >43.

Decision tree using the ID3 algorithm.

                      Age
                    /     \
               <= 43       > 43
                  |           |
             Likes Dog   Likes Gravity
              /     \      /      \
             0       1    0        1
           (No)   (Yes) (No)    (Yes)

This decision tree classifies whether a person is going to be an astronaut based on their
age, whether they like dogs, and whether they like gravity. The root node splits the data on
age, with 43 as the threshold.

If a person's age is less than or equal to 43, the decision tree checks whether the person likes
dogs. If the person does not like dogs (Likes Dog = 0), the prediction is "No" for going to be an
astronaut. If the person likes dogs (Likes Dog = 1), the prediction is "Yes" for going to be an
astronaut.

If a person's age is greater than 43, the decision tree checks whether the person likes gravity. If
the person does not like gravity (Likes Gravity = 0), the prediction is "No" for going to be an
astronaut. If the person likes gravity (Likes Gravity = 1), the prediction is "Yes" for going to be
an astronaut.
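
For illustration, the learned tree can be written directly as a small predicate (the
attribute names mirror the description above and are purely illustrative):

# The tree above as a plain function: each if-branch is one decision node.
def is_astronaut(age, likes_dogs, likes_gravity):
    if age <= 43:
        return 'Yes' if likes_dogs == 1 else 'No'
    return 'Yes' if likes_gravity == 1 else 'No'

print(is_astronaut(30, 1, 0))  # Yes: age <= 43 and likes dogs
print(is_astronaut(50, 0, 0))  # No: age > 43 and does not like gravity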
