Instance Based Learning
Unit-4
Naveen Pragallapati
Core Idea:
1. Given a query instance, the algorithm finds the k nearest neighbours from the training data (based on a distance metric).
2. The predicted class or value for the query instance is based on the majority class (for classification) or the average value (for regression) of those neighbours.
Algorithm (k-NEAREST NEIGHBOR, discrete-valued target function):

Training algorithm:
For each training example (x, f(x)), add the example to the list training_examples.

Classification algorithm:
Given a query instance $x_q$ to be classified,
1. Let $x_1, \ldots, x_k$ denote the k instances from training_examples that are nearest to $x_q$.
2. Return
   $\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$
   where $V$ is the set of target values and $\delta(a, b) = 1$ if $a = b$, else $\delta(a, b) = 0$.
Note:
The k-NEAREST NEIGHBOR algorithm is easily adapted to approximating continuous-valued target functions. To approximate a real-valued target function $f : \mathbb{R}^n \to \mathbb{R}$, we replace the final line of the above algorithm by the line

$\hat{f}(x_q) \leftarrow \dfrac{\sum_{i=1}^{k} f(x_i)}{k}$
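A minimal Python sketch of both the discrete- and continuous-valued cases (the function name `knn_predict` and the use of NumPy are illustrative choices, not part of the original notes):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3, regression=False):
    """Plain k-NN: majority vote for classification, mean of f(x_i) for regression."""
    # Distance from the query to every stored training instance
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]                 # indices of the k closest instances
    if regression:
        # Continuous-valued target: replace the arg-max line with the mean
        return np.mean(y_train[nearest])
    # Discrete-valued target: return the most common class among the k neighbours
    return Counter(y_train[nearest].tolist()).most_common(1)[0][0]
```

Calling it with `regression=True` corresponds exactly to the averaged form above.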
A Note on Terminology
Much of the literature on nearest-neighbour methods and weighted local regression uses a terminology that has arisen from the field of statistical pattern recognition. In reading that literature, it is useful to know the following terms:
1. Regression means approximating a real-valued target function.
2. Residual is the error $\hat{f}(x) - f(x)$ in approximating the target function.
3. Kernel function is the function of distance that is used to determine the weight of each training example. In other words, the kernel function is the function $K$ such that $w_i = K(d(x_i, x_q))$.
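Two kernel functions that are commonly used in this role (the function names and the width parameter `sigma` below are illustrative choices):

```python
import numpy as np

def inverse_square_kernel(d):
    # Weight shrinks rapidly with distance; the small constant guards against d == 0
    return 1.0 / (d ** 2 + 1e-12)

def gaussian_kernel(d, sigma=1.0):
    # Smooth weight that decays with distance; sigma controls how local the influence is
    return np.exp(-(d ** 2) / (2 * sigma ** 2))
```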
Distance Metrics:
The choice of distance metric significantly impacts the performance of k-NN. Common distance metrics include:
1. Euclidean Distance (used for continuous data):
   $d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left(a_r(x_i) - a_r(x_j)\right)^2}$
   where $a_r(x)$ denotes the value of the r-th attribute of instance $x$.
We can distance-weight the instances for real-valued target functions in a similar fashion, replacing the final line of the algorithm in this case by

$\hat{f}(x_q) \leftarrow \dfrac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$
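A sketch of this distance-weighted form, assuming the common inverse-square weight $w_i = 1/d(x_q, x_i)^2$ (any other kernel function could be substituted):

```python
import numpy as np

def weighted_knn_regress(X_train, y_train, x_query, k=3):
    """Distance-weighted k-NN regression: f_hat(x_q) = sum(w_i * f(x_i)) / sum(w_i)."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    if dists[nearest[0]] == 0:
        # Query coincides with a training instance: return its target value directly
        return y_train[nearest[0]]
    w = 1.0 / dists[nearest] ** 2                   # inverse-square distance weights
    return np.sum(w * y_train[nearest]) / np.sum(w)
```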
Example Problem 1:
Given a dataset with three attributes: Age, Income, and Credit Score. Predict whether a person will buy a product (Yes/No).
Steps:
1. Compute the Euclidean distance between the query instance and all training points.
2. Choose k = 3 (find the 3 nearest neighbours).
3. Use majority voting to assign a class label (Yes/No).
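The original data table is not reproduced here, so the sketch below invents a small training set purely for illustration; the attributes are also rescaled to [0, 1], since otherwise Income and Credit Score would dominate the Euclidean distance.

```python
import numpy as np
from collections import Counter

# Hypothetical training data: [Age, Income (thousands), Credit Score]
X_train = np.array([
    [25, 40, 600],
    [35, 60, 650],
    [45, 80, 700],
    [20, 20, 500],
    [50, 90, 720],
])
y_train = np.array(["No", "Yes", "Yes", "No", "Yes"])
x_query = np.array([30, 55, 640])

# Step 1: Euclidean distances on features scaled to [0, 1]
lo, hi = X_train.min(axis=0), X_train.max(axis=0)

def scale(X):
    return (X - lo) / (hi - lo)

dists = np.linalg.norm(scale(X_train) - scale(x_query), axis=1)

# Step 2: k = 3 nearest neighbours
nearest = np.argsort(dists)[:3]

# Step 3: majority vote on their labels
print(Counter(y_train[nearest].tolist()).most_common(1)[0][0])
```

With these invented values the three nearest neighbours vote Yes, No, Yes, so the query is classified as Yes.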
Example Problem 2:
Solve Example Problem 1 above using weighted k-NN.
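Continuing the snippet from Example Problem 1 (so `dists`, `nearest`, and `y_train` refer to the arrays defined there), a weighted vote replaces the simple majority; the inverse-square weight is an assumed choice:

```python
def weighted_vote(dists, labels, nearest):
    """Each neighbour contributes w_i = 1 / d_i^2 to its own class instead of one vote."""
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0.0) + 1.0 / (dists[i] ** 2 + 1e-12)
    return max(votes, key=votes.get)

print(weighted_vote(dists, y_train, nearest))
```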
Example Problem 3:
Summary:
1. k-NN is a powerful, non-parametric algorithm that works well for classification and regression.
2. Its main drawback is the high computational cost during prediction, especially with large datasets.
3. k-NN performs better when the feature space is small and the relevant features are carefully selected.
Training Algorithm for Locally Weighted Regression (LWR) Using Gradient Descent:
Using a local linear approximation $\hat{f}(x) = w_0 + w_1 a_1(x) + \cdots + w_n a_n(x)$, the weights are fitted separately for each query point $x_q$ by repeating the gradient-descent update

$\Delta w_j = \eta \sum_{x \,\in\, k \text{ nearest nbrs of } x_q} K(d(x_q, x)) \,\big(f(x) - \hat{f}(x)\big)\, a_j(x)$

(where $\eta$ is the learning rate and $K$ the kernel function) until the weights converge.
Summary:
This algorithm fits a local linear model for each query point by iteratively updating the weights $w_j$ using gradient descent. The kernel function ensures that only nearby points have significant influence, making the regression localized. The training process continues until the model converges, after which it can predict values based on the optimized weights.
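A compact sketch of this procedure (the Gaussian kernel, learning rate, and fixed epoch count below are assumed choices for illustration, not prescribed by the notes):

```python
import numpy as np

def lwr_predict(X_train, y_train, x_query, lr=0.05, sigma=1.0, epochs=500):
    """Fit a separate linear model per query, weighting errors by a Gaussian kernel."""
    # Gaussian kernel: nearby training points get weights close to 1, distant ones near 0
    d = np.linalg.norm(X_train - x_query, axis=1)
    K = np.exp(-(d ** 2) / (2 * sigma ** 2))

    Xb = np.hstack([np.ones((len(X_train), 1)), X_train])   # prepend bias term
    w = np.zeros(Xb.shape[1])                                # local weight vector

    for _ in range(epochs):
        err = y_train - Xb @ w                               # residuals f(x) - f_hat(x)
        grad = -(K * err) @ Xb / len(X_train)                # kernel-weighted gradient
        w -= lr * grad                                       # gradient-descent update
    return np.r_[1.0, x_query] @ w                           # evaluate local model at x_q
```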
Summary:
1. Select centres: Place RBF neurons at key points (possibly through clustering).
2. Calculate activations: Use the Gaussian kernel to compute how much influence each RBF neuron has for a given input.
3. Optimize weights: Use least squares to find the weights that minimize the prediction error.
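A minimal sketch of such a network (here the centres are a random subset of the training points rather than cluster centres, and the output weights come from a least-squares fit; all hyper-parameters are assumptions):

```python
import numpy as np

def train_rbf(X, y, n_centers=5, sigma=1.0, seed=0):
    """Tiny RBF network: choose centres, compute Gaussian activations, solve for weights."""
    rng = np.random.default_rng(seed)

    # 1. Select centres: a random subset of training points (k-means is a common alternative)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]

    # 2. Calculate activations: Gaussian kernel of the distance to each centre
    def activations(X):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        return np.exp(-(d ** 2) / (2 * sigma ** 2))

    # 3. Optimize weights: linear least squares from hidden activations to targets
    Phi = np.hstack([np.ones((len(X), 1)), activations(X)])   # bias + hidden activations
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

    def predict(X_new):
        Phi_new = np.hstack([np.ones((len(X_new), 1)), activations(X_new)])
        return Phi_new @ w
    return predict

# Usage: approximate a noisy-free 1-D function
X = np.linspace(-3, 3, 40).reshape(-1, 1)
y = np.sin(X).ravel()
f = train_rbf(X, y, n_centers=8, sigma=0.7)
print(f(np.array([[0.5]])))
```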
Figure: A radial basis function network. Each hidden unit produces an activation determined by a Gaussian function centered at some instance $x_u$; its activation will therefore be close to zero unless the input $x$ is near $x_u$. The output unit produces a linear combination of the hidden-unit activations. Although the network shown here has just one output, multiple output units can also be included.
Diagram Summary
In the diagram, you would typically see:
1. Input nodes connected to each hidden node (RBF neuron).
2. RBF neurons in the hidden layer, each receiving the input vector and computing its activation.
3. Weights associated with each connection from hidden neurons to the output layer.
4. A single output node that aggregates the weighted activations to produce the predicted value.
This architecture enables RBF networks to model nonlinear relationships by combining local approximations (via Gaussian kernels) with a global linear combination at the output layer.
Case-based reasoning (CBR) is used in domains such as:
1. Medical diagnosis, where patient cases help diagnose similar future patients.
2. Technical support and troubleshooting, where past solutions can be adapted for new issues.
3. Legal reasoning, where previous legal cases inform judgments in new cases.
Summary
CBR's strength lies in its ability to adapt previous knowledge directly to new problems, which is especially powerful in contexts where cases do not follow a strict generalization rule. It is ideal for tasks where exceptions are common or complex adaptations are needed.
Example
Imagine CADET's library includes a case of a small irrigation pump with a flow rate of 10 liters per minute and a pressure of 5 psi. The new problem requires a pump with 10 liters per minute but with a higher pressure of 8 psi.
1. Retrieve: CADET retrieves the small irrigation pump case, recognizing that it meets the flow rate requirement.
2. Reuse: CADET reuses much of the design, such as the general structure and configuration.
3. Revise: To meet the higher-pressure requirement, CADET modifies the pump by increasing the impeller size or using a more powerful motor, ensuring it can achieve 8 psi.
4. Retain: The revised pump design is stored as a new case with specifications for a 10 L/min, 8 psi water pump.
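The retrieve-reuse-revise-retain cycle in this example can be caricatured in a few lines; the case library, similarity measure, and revision rule below are invented purely for illustration and are not part of CADET:

```python
# Hypothetical case library: each case is (flow_lpm, pressure_psi, design)
library = [(10, 5, "small irrigation pump"), (40, 20, "booster pump")]

def solve(flow, pressure):
    # Retrieve: nearest case under a simple (assumed) similarity over the two specs
    case = min(library, key=lambda c: abs(c[0] - flow) + abs(c[1] - pressure))
    design = case[2]                                   # Reuse the retrieved design
    if pressure > case[1]:                             # Revise: adapt for higher pressure
        design += " with larger impeller / stronger motor"
    library.append((flow, pressure, design))           # Retain the solved case
    return design

print(solve(10, 8))
```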
Adaptation Techniques in CADET
CADET's adaptation is based on both similarity metrics and specific engineering rules.

Lazy Learning vs. Eager Learning:
1. Generalization:
Eager learning emphasizes generalization, aiming to abstract patterns from the training data.
Lazy learning focuses on memorization, retaining the original instances for later use.
2. Time Complexity:
Eager learners require more time and computational resources during the training phase, as they need to analyse and construct a model.
Lazy learners are quick to train, as they simply store the training data, but may require more time to make predictions since they analyse the stored data at query time.
3. Memory Usage:
Eager learning typically uses less memory at query time since it works with a model rather than storing all instances.
Lazy learning may require significant memory if the training dataset is large, as it must keep all instances accessible for querying.
4. Flexibility:
Eager learners can sometimes struggle with changes in the underlying data distribution, as retraining the model is necessary.
Lazy learners can adapt to changes in the data more easily since they can incorporate new instances dynamically during prediction.
5. Performance and Application Context:
Eager learning methods may perform better in scenarios where the dataset is large and a general model can effectively capture the relationships in the data.
Lazy learning may excel in cases where data is sparse or when predictions must be made based on local relationships in the data.
Use Cases:
Eager learning is often used in applications where the cost of computation during training can be justified by the need for fast predictions, such as in online services.
Lazy learning is useful in applications where real-time updates are critical, such as recommendation systems that adapt to user preferences.
Conclusion:
Both lazy and eager learning methods have their advantages and disadvantages, and the choice between them often depends on the specific problem context, the size and nature of the dataset, and the computational resources available. Understanding these concepts is crucial for selecting appropriate learning algorithms in practical machine learning applications.