Module-5_Notes_13-12-2024.docx
Syllabus
Clustering: Introduction, Types of clustering, partitioning methods of clustering
(k means, k-medoids), Hierarchical methods
Textbook-3 Chapter 13
Machine Learning by Vincy Joseph and Anuradha Srinivasaraghavan,
ISBN-10: 8126578513, ISBN-13: 9788126578511
Instance-based learning: introduction, k-nearest neighbour learning (review),
locally weighted regression, radial basis function, case-based reasoning
From Javatpoint
Types of Clustering Methods
Clustering methods are broadly divided into hard clustering (each data point belongs
to only one group) and soft clustering (a data point may belong to more than one group).
Beyond this distinction, various other approaches to clustering exist. Below are the main
clustering methods used in machine learning:
1. Partitioning Clustering
2. Density-Based Clustering
3. Distribution Model-Based Clustering
4. Hierarchical Clustering
5. Fuzzy Clustering
Partitioning Clustering
It is a type of clustering that divides the data into non-hierarchical groups. It is also
known as the centroid-based method. The most common example of partitioning
clustering is the K-Means Clustering algorithm.
In this type, the dataset is divided into a set of k groups, where k is the pre-defined
number of groups. The cluster centres are created in such a way that each data point is
closer to its own cluster centre than to the centroid of any other cluster.
Density-Based Clustering
The density-based clustering method connects highly dense areas into clusters, so
arbitrarily shaped clusters can be formed as long as the dense regions are connected.
The algorithm identifies regions of high density in the dataset and joins them into
clusters; the dense areas in data space are separated from each other by sparser areas.
These algorithms can face difficulty in clustering the data points if the dataset has
varying densities and high dimensions.
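A minimal DBSCAN sketch follows; the eps and min_samples values and the synthetic data are illustrative assumptions:

```python
# Minimal DBSCAN sketch: density-based clustering with noise detection.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(1)
# A dense blob plus a few scattered outliers.
X = np.vstack([rng.normal(0, 0.3, (80, 2)), rng.uniform(-4, 4, (10, 2))])

db = DBSCAN(eps=0.5, min_samples=5).fit(X)

# Points labelled -1 are treated as noise (not assigned to any cluster).
n_clusters = len(set(db.labels_)) - (1 if -1 in db.labels_ else 0)
print("clusters found:", n_clusters)
print("noise points:", int(np.sum(db.labels_ == -1)))
```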
Distribution Model-Based Clustering
In the distribution model-based clustering method, the data are divided based on the
probability that they belong to a particular distribution. The grouping is done by
assuming one or more distributions, most commonly the Gaussian distribution.
The example of this type is the Expectation-Maximization Clustering
algorithm that uses Gaussian Mixture Models (GMM).
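A minimal sketch of EM-based clustering with scikit-learn's GaussianMixture, on assumed synthetic data:

```python
# Minimal EM / Gaussian Mixture Model sketch (distribution model-based clustering).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, (60, 2)), rng.normal(4, 1.5, (60, 2))])

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

labels = gmm.predict(X)            # hard assignment: most likely component
probs = gmm.predict_proba(X[:3])   # soft assignment: probability per component
print(labels[:10])
print(probs.round(3))
```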
Hierarchical Clustering
Hierarchical clustering can be used as an alternative to partitioning clustering, since
there is no requirement to pre-specify the number of clusters to be created. In this
technique, the dataset is divided into clusters to create a tree-like structure, which is
also called a dendrogram. The desired number of clusters is then obtained by cutting the
tree at the appropriate level. The most common example of this method is
the Agglomerative Hierarchical algorithm.
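A minimal agglomerative clustering sketch (scikit-learn; the synthetic data and the linkage choice are assumptions):

```python
# Minimal agglomerative (bottom-up) hierarchical clustering sketch.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.5, (30, 2)), rng.normal(3, 0.5, (30, 2))])

# Cutting the tree at 2 clusters; the full dendrogram can also be built
# with scipy.cluster.hierarchy (see the linkage sketch later in these notes).
agg = AgglomerativeClustering(n_clusters=2, linkage="average").fit(X)
print(agg.labels_[:10])
```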
Fuzzy Clustering
Fuzzy clustering is a type of soft method in which a data object may belong to more
than one group or cluster. Each data point has a set of membership coefficients, which
depend on its degree of membership in each cluster. The Fuzzy C-means algorithm is an
example of this type of clustering; it is sometimes also known as the Fuzzy k-means
algorithm.
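The following is a minimal from-scratch sketch of the Fuzzy C-Means update equations (illustrative only; in practice a library such as scikit-fuzzy would normally be used, and the data here are assumed synthetic points):

```python
# Minimal from-scratch Fuzzy C-Means sketch (soft clustering).
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    # Random initial membership matrix, rows normalised to sum to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]            # membership-weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-10
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U = 1.0 / np.sum((dist[:, :, None] / dist[:, None, :]) ** (2 / (m - 1)), axis=2)
    return centers, U

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(5, 1, (40, 2))])
centers, U = fuzzy_c_means(X)
print(centers)          # cluster centres
print(U[:3].round(2))   # membership coefficients of the first three points
```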
Clustering Algorithms
Clustering algorithms can be grouped according to the cluster models explained
above. Many clustering algorithms have been published, but only a few are commonly
used. The choice of algorithm depends on the kind of data being used: some algorithms
require the number of clusters to be specified in advance, whereas others work by
finding the minimum distance between observations in the dataset.
Here we are discussing mainly popular Clustering algorithms that are widely used in
machine learning:
1. K-Means algorithm: The k-means algorithm is one of the most popular clustering
algorithms. It partitions the samples into clusters of roughly equal variance. The
number of clusters must be specified in advance. It is fast and computationally
cheap, with roughly linear complexity O(n).
2. Mean-shift algorithm: The mean-shift algorithm tries to find dense areas within a
smooth density estimate of the data points. It is an example of a centroid-based
model that works by updating candidate centroids to be the mean of the points
within a given region.
3. DBSCAN Algorithm: It stands for Density-Based Spatial Clustering of
Applications with Noise. It is an example of a density-based model similar to
the mean-shift, but with some remarkable advantages. In this algorithm, the
areas of high density are separated by the areas of low density. Because of this,
the clusters can be found in any arbitrary shape.
4. Expectation-Maximization Clustering using GMM: This algorithm can be
used as an alternative to the k-means algorithm, or in cases where k-means may
fail. In GMM, the data points are assumed to be Gaussian distributed.
5. Agglomerative Hierarchical algorithm: The Agglomerative hierarchical
algorithm performs the bottom-up hierarchical clustering. In this, each data
point is treated as a single cluster at the outset and then successively merged.
The cluster hierarchy can be represented as a tree-structure.
6. Affinity Propagation: It differs from the other clustering algorithms in that it does
not require the number of clusters to be specified. In this method, data points
exchange messages in pairs until convergence. Its O(N²T) time complexity (N data
points, T iterations) is the main drawback of this algorithm. A minimal usage sketch
is shown below.
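The sketch (scikit-learn; the synthetic data and default parameters are assumptions):

```python
# Minimal Affinity Propagation sketch: no need to specify the number of clusters.
import numpy as np
from sklearn.cluster import AffinityPropagation

rng = np.random.default_rng(5)
X = np.vstack([rng.normal(0, 0.5, (30, 2)),
               rng.normal(4, 0.5, (30, 2)),
               rng.normal(8, 0.5, (30, 2))])

ap = AffinityPropagation(random_state=0).fit(X)
print("clusters found:", len(ap.cluster_centers_indices_))
print(ap.labels_[:10])
```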
From ChatGPT
Clustering in Machine Learning
Clustering is an unsupervised learning technique used in Machine Learning (ML) to
group similar data points into clusters or groups. It is particularly useful when labels
or outputs are not available, allowing the algorithm to infer the structure within the
data. Clustering is used in various applications such as market segmentation,
document categorization, image compression, and anomaly detection.
Key Features of Clustering
1. Unsupervised Learning: No predefined labels or categories; patterns are
discovered based on the data.
2. Similarity/Dissimilarity: Clusters are formed based on similarity or
dissimilarity metrics (e.g., Euclidean distance, Cosine similarity).
3. Group Representation: Each cluster represents a subset of data with similar
characteristics.
2. Hierarchical Clustering
• Description: Builds a hierarchy of clusters using a tree-like structure called a
dendrogram.
• Types:
o Agglomerative (Bottom-Up):
▪ Starts with each data point as a separate cluster.
▪ Merges the closest pairs of clusters iteratively until one cluster
remains.
o Divisive (Top-Down):
▪ Starts with one large cluster and divides it into smaller clusters
recursively.
• Distance Metrics: Linkage methods determine how to merge or divide
clusters:
o Single Linkage: Minimum distance between clusters.
o Complete Linkage: Maximum distance between clusters.
o Average Linkage: Average distance between clusters.
• Pros:
o Does not require specifying the number of clusters in advance.
o Useful for visualizing cluster structures.
• Cons:
o Computationally expensive for large datasets.
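To illustrate the linkage methods listed above, here is a minimal SciPy sketch that builds the merge tree with single, complete, and average linkage and then cuts it into two clusters (the data and the cut level are assumptions):

```python
# Minimal scipy sketch: hierarchical clustering with different linkage methods.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 0.4, (20, 2)), rng.normal(3, 0.4, (20, 2))])

for method in ("single", "complete", "average"):
    Z = linkage(X, method=method)                     # build the merge tree (dendrogram)
    labels = fcluster(Z, t=2, criterion="maxclust")   # cut it into 2 clusters
    print(method, np.bincount(labels))
```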
3. Density-Based Clustering
• Description: Identifies clusters based on areas of high data density and
separates low-density areas as noise.
• Algorithm Examples:
o DBSCAN (Density-Based Spatial Clustering of Applications with
Noise):
▪ Groups points that are closely packed together.
▪ Handles noise and outliers effectively.
o OPTICS (Ordering Points to Identify the Clustering Structure):
▪ Extends DBSCAN by handling varying densities.
• Pros:
o Can find arbitrarily shaped clusters.
o Handles noise and outliers well.
• Cons:
o Struggles with varying density clusters.
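A minimal OPTICS sketch with scikit-learn, on assumed data containing two very different densities:

```python
# Minimal OPTICS sketch: like DBSCAN, but better suited to varying densities.
import numpy as np
from sklearn.cluster import OPTICS

rng = np.random.default_rng(7)
# One tight cluster and one much looser cluster.
X = np.vstack([rng.normal(0, 0.2, (60, 2)), rng.normal(5, 1.5, (60, 2))])

opt = OPTICS(min_samples=10).fit(X)
print("labels found (-1 means noise):", np.unique(opt.labels_))
```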
4. Model-Based Clustering
• Description: Assumes the data is generated by a mixture of underlying
probability distributions (e.g., Gaussian distributions) and fits the data to these
models.
• Algorithm Examples:
o Gaussian Mixture Models (GMM):
▪ Uses a probabilistic model to represent each cluster as a Gaussian
distribution.
▪ Determines the likelihood of data points belonging to each
cluster.
• Pros:
o Handles overlapping clusters.
o Provides a probabilistic measure of membership.
• Cons:
o Assumes a specific distribution, which may not always fit the data.
5. Grid-Based Clustering
• Description: Divides the data space into a finite number of cells (grids) and
then clusters the cells based on the density of data points.
• Algorithm Examples:
o STING (Statistical Information Grid):
▪ Summarizes statistical data in grids and performs clustering
hierarchically.
o CLIQUE (Clustering In QUEst):
▪ Combines grid-based and density-based clustering for high-
dimensional data.
• Pros:
o Efficient for large datasets.
o Handles high-dimensional data.
• Cons:
o Sensitive to grid size.
6. Spectral Clustering
• Description: Uses the eigenvalues of a similarity matrix to reduce dimensions
and group data points in lower-dimensional space.
• Algorithm Examples:
o Constructs a similarity graph from the dataset.
o Performs clustering using graph partitioning techniques.
• Pros:
o Effective for non-convex clusters.
o Handles complex data structures.
• Cons:
o Computationally expensive for large datasets.
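A minimal spectral clustering sketch with scikit-learn on non-convex (concentric-circle) data, where a centroid-based method would typically fail; the parameter values are assumptions:

```python
# Minimal spectral clustering sketch on non-convex data.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_circles

# Two concentric circles: not separable by centroid-based methods like K-Means.
X, _ = make_circles(n_samples=200, factor=0.4, noise=0.05, random_state=0)

sc = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                        n_neighbors=10, random_state=0).fit(X)
print(np.bincount(sc.labels_))   # roughly 100 points per ring
```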
Clustering is a versatile and powerful tool for exploring and understanding data, and
the choice of clustering method should align with the data characteristics and desired
outcomes.
7. Image Processing: Clustering can be used to group similar images together,
classify images based on content, and identify patterns in image data.
8. Genetics: Clustering is used to group genes that have similar expression
patterns and identify gene networks that work together in biological processes.
9. Finance: Clustering is used to identify market segments based on customer
behavior, identify patterns in stock market data, and analyze risk in investment
portfolios.
10. Customer Service: Clustering is used to group customer inquiries and
complaints into categories, identify common issues, and develop targeted
solutions.
11. Manufacturing: Clustering is used to group similar products together,
optimize production processes, and identify defects in manufacturing
processes.
12. Medical diagnosis: Clustering is used to group patients with similar symptoms
or diseases, which helps in making accurate diagnoses and identifying effective
treatments.
13. Fraud detection: Clustering is used to identify suspicious patterns or
anomalies in financial transactions, which can help in detecting fraud or other
financial crimes.
14. Traffic analysis: Clustering is used to group similar patterns of traffic data,
such as peak hours, routes, and speeds, which can help in improving
transportation planning and infrastructure.
15. Social network analysis: Clustering is used to identify communities or groups
within social networks, which can help in understanding social behavior,
influence, and trends.
16. Cybersecurity: Clustering is used to group similar patterns of network traffic
or system behavior, which can help in detecting and preventing cyberattacks.
17. Climate analysis: Clustering is used to group similar patterns of climate data,
such as temperature, precipitation, and wind, which can help in understanding
climate change and its impact on the environment.
18. Sports analysis: Clustering is used to group similar patterns of player or team
performance data, which can help in analyzing player or team strengths and
weaknesses and making strategic decisions.
19. Crime analysis: Clustering is used to group similar patterns of crime data,
such as location, time, and type, which can help in identifying crime hotspots,
predicting future crime trends, and improving crime prevention strategies.
Comparison of K-Means and K-Medoids:

Definition
K-Means: A partition-based clustering algorithm that minimizes the sum of squared distances between data points and cluster centroids.
K-Medoids: A partition-based clustering algorithm that minimizes the sum of distances between data points and the most centrally located data point (the medoid) in each cluster.

Cluster Representation
K-Means: Centroids, the geometric centre of the cluster, which may not be an actual data point.
K-Medoids: Medoids, actual data points from the dataset, representing the cluster centre.

Distance Metric
K-Means: Typically uses Euclidean distance.
K-Medoids: Can use various distance metrics (e.g., Manhattan, Euclidean).

Objective Function
K-Means: Minimize the sum of squared distances of points from their cluster centroids.
K-Medoids: Minimize the sum of absolute distances of points from their cluster medoids.

Robustness to Outliers
K-Means: Sensitive to outliers, as centroids can be skewed by extreme values.
K-Medoids: Robust to outliers, since medoids are actual data points and are not affected by extreme values.

Data Type Suitability
K-Means: Works best with numerical data where centroids are meaningful.
K-Medoids: Suitable for numerical and categorical data, as medoids can represent any data type.

Algorithm Complexity
K-Means: Generally faster: O(n × k × i), where n is the number of data points, k is the number of clusters, and i is the number of iterations.
K-Medoids: Slower due to the exhaustive search for medoids: O(n² × k × i).

Initialization
K-Means: Requires initialization of k centroids (random selection or algorithms like K-Means++).
K-Medoids: Requires initialization of k medoids (often done by randomly selecting k points).

Convergence
K-Means: Converges faster, but might get stuck in local minima due to sensitivity to initialization.
K-Medoids: Converges more reliably to a solution since medoids are data points, but can take more iterations.

Scalability
K-Means: Scalable to large datasets; efficient with high-dimensional data.
K-Medoids: Less scalable due to computational complexity with large datasets.

Cluster Shape
K-Means: Assumes clusters are spherical and evenly sized.
K-Medoids: Can handle arbitrarily shaped clusters better in some cases.

Applications
K-Means: Image compression, document clustering, customer segmentation.
K-Medoids: Medical diagnosis (categorical data), robust clustering in noisy datasets, gene expression analysis.

Advantages
K-Means: Fast and efficient for large datasets; easy to implement and understand.
K-Medoids: Handles noise and outliers better; works with mixed data types.

Disadvantages
K-Means: Sensitive to outliers; limited to numerical data; assumes spherical clusters.
K-Medoids: Computationally intensive; slower for large datasets.
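To make the contrast concrete, here is a minimal from-scratch K-Medoids sketch (alternating assignment and medoid update, using Manhattan distance). It is illustrative only; a library implementation such as scikit-learn-extra's KMedoids would normally be preferred, and the synthetic data are an assumption:

```python
# Minimal from-scratch K-Medoids sketch (alternating assignment / medoid update).
import numpy as np

def k_medoids(X, k=2, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    # Pairwise Manhattan distances; k-medoids can work with any distance metric.
    D = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    medoids = rng.choice(n, size=k, replace=False)
    labels = np.argmin(D[:, medoids], axis=1)
    for _ in range(iters):
        labels = np.argmin(D[:, medoids], axis=1)   # assign points to nearest medoid
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members) == 0:
                continue
            # New medoid: the member minimising total distance to its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[j] = members[np.argmin(costs)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, labels

rng = np.random.default_rng(8)
X = np.vstack([rng.normal(0, 1, (40, 2)), rng.normal(6, 1, (40, 2))])
medoids, labels = k_medoids(X)
print("medoid points (actual data points):\n", X[medoids])
```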
Algorithm: Locally Weighted Regression (LWR)
Description: Fits a local model for each query instance based on nearby training data.
Use Cases: Regression problems where relationships vary across the space.
o Performance degrades as the number of features increases, making it
hard to define meaningful distances.
4. No Abstraction:
o Doesn't provide insights or a generalized model of the data.
Instance-based approaches can construct a different approximation to the target function
for each distinct query instance that must be classified. In fact, many techniques construct
only a local approximation to the target function that applies in the neighborhood of the new
query instance, and never construct an approximation designed to perform well over the
entire instance space. This has significant advantages when the target function is very
complex, but can still be described by a collection of less complex local approximations.
How Instance-Based Learning Works
1. Training Phase:
• No explicit training occurs.
• The algorithm stores the entire training dataset or a subset of it.
2. Prediction Phase:
• When a new instance (query) is presented, the algorithm compares it to
the stored instances.
• Based on the similarity between the query and the training instances, the
algorithm predicts the output (classification or regression).
3. Learning Algorithm:
• Relies on a distance function to measure how close the query instance
is to the stored examples.
• Common algorithms use techniques like nearest neighbor or weighted
contributions from neighbors.
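A minimal k-nearest-neighbour sketch with scikit-learn illustrates this: the fit step merely stores the examples, and distances are computed when a query arrives (the dataset and the value of k are assumptions):

```python
# Minimal k-NN sketch: all "learning" is deferred to prediction time.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# "Training" only stores the examples; distances are computed at query time.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print("test accuracy:", knn.score(X_te, y_te))
```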
Advantages of Instance-Based Learning
1. Simple Implementation:
o Requires minimal effort for training since it stores instances and defers
learning to prediction time.
2. Flexibility:
o Can adapt quickly to new data as it doesn't rely on a fixed model.
3. Handles Complex Relationships:
o Effective for non-linear and highly variable data distributions.
4. Intuitive:
o Easy to interpret since predictions are based directly on stored examples.
o Identify characters or objects by comparing with known examples.
4. Text Classification:
o Classify documents or emails (e.g., spam detection).
Generalization
Instance-based (lazy) learning: Local; predictions depend on nearby data points.
Model-based (eager) learning: Global; captures general trends in the data.

Adaptability
Instance-based (lazy) learning: High; can quickly incorporate new data.
Model-based (eager) learning: Low; requires retraining to incorporate new data.

Computation
Instance-based (lazy) learning: Expensive during prediction (similarity computation).
Model-based (eager) learning: Expensive during training (model optimization).
Linear Regression is a supervised learning algorithm used for computing linear relationships
between input (X) and output (Y). Ordinary linear regression fits a linear hypothesis to the
data and adjusts its parameters so as to minimize a squared-error cost over the whole
training set.
This algorithm cannot be used for making predictions when there exists a non-linear
relationship between X and Y; in such cases, locally weighted linear regression is used.
Locally Weighted Linear Regression:
From ML | Locally weighted Linear Regression - GeeksforGeeks
NOTE: For Locally Weighted Linear Regression, the data must always be available on
the machine as it doesn’t learn from the whole set of data in a single shot. Whereas, in
Linear Regression, after training the model the training set can be erased from the
machine as the model has already learned the required parameters.
Points to remember:
1. Locally weighted linear regression is a supervised learning algorithm.
2. It is a non-parametric algorithm.
3. There is no training phase; all the work is done during the testing phase, i.e., while
making predictions.
4. The dataset must always be available for predictions.
5. Locally weighted regression methods are a generalization of k-Nearest Neighbour.
6. In locally weighted regression, an explicit local approximation of the target function
is constructed for each query instance.
7. The local approximation of the target function takes a simple form, such as a
constant, linear, or quadratic function, combined with localized kernel (weighting)
functions.
A minimal from-scratch sketch of LWLR is given below.
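This sketch uses a Gaussian kernel; the bandwidth tau and the synthetic data are assumptions, not from the notes:

```python
# Minimal from-scratch locally weighted linear regression (LWLR) sketch.
import numpy as np

def lwlr_predict(x_query, X, y, tau=0.5):
    # Design matrix with a bias column.
    Xb = np.column_stack([np.ones(len(X)), X])
    xq = np.array([1.0, x_query])
    # Gaussian weights: training points near the query get high weight.
    w = np.exp(-((X - x_query) ** 2) / (2 * tau ** 2))
    W = np.diag(w)
    # Weighted least squares, solved fresh for every query point.
    theta = np.linalg.pinv(Xb.T @ W @ Xb) @ Xb.T @ W @ y
    return xq @ theta

rng = np.random.default_rng(9)
X = np.linspace(0, 10, 100)
y = np.sin(X) + rng.normal(0, 0.1, 100)   # clearly non-linear data

print(lwlr_predict(2.0, X, y))   # close to sin(2.0), roughly 0.91
print(lwlr_predict(5.0, X, y))   # close to sin(5.0), roughly -0.96
```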
From Locally Weighted Linear Regression - Javatpoint
Applications of Locally Weighted Linear Regression
1. Time Series Analysis: LWLR is particularly useful in time series analysis,
where the relationship between variables may change over time. By adapting to
the local patterns and trends, LWLR can capture the dynamics of time-varying
data and make accurate predictions.
2. Anomaly Detection: LWLR can be employed for anomaly detection in various
domains, such as fraud detection or network intrusion detection. By identifying
deviations from the expected patterns in a localized manner, LWLR helps
detect abnormal behavior that may go unnoticed using traditional regression
models.
3. Robotics and Control Systems: In robotics and control systems, LWLR can
be utilized to model and predict the behavior of complex systems. By adapting
to local conditions and variations, LWLR enables precise control and decision-
making in dynamic environments.
Benefits of Locally Weighted Linear Regression
1. Improved Predictive Accuracy: By considering local patterns and
relationships, LWLR can capture subtle nuances in the data that might be
overlooked by global regression models. This results in more accurate
predictions and better model performance.
2. Flexibility and Adaptability: LWLR can adapt to different regions of the
dataset, making it suitable for complex and non-linear relationships. It offers
flexibility in capturing local variations, allowing for more nuanced analysis and
insights.
3. Interpretable Results: Despite its adaptive nature, LWLR still provides
interpretable results. The localized models offer insights into the relationships
between variables within specific regions of the data, aiding in the
understanding of complex phenomena.
Locally Weighted Linear Regression in Python | by Suraj Verma | Towards Data
Science
A better-modelled picture of LWLR is given in the article linked above.
What is your intuition, and what happens at the junctions/transitions from one linear
segment to another?
Radial basis function
From Radial Basis Functions: Types, Advantages, and Use Cases | HackerNoon
All of you please listen to the audio of the article by the creator of this
webpage
This is an introductory article explaining the basic intuition, mathematical idea & scope of
radial basis functions in the development of predictive machine learning models.
Table of Contents
1. Introduction
2. Basic intuition of a Radial Basis Function
3. Types of Radial Basis Function
4. The concept of the RBF Network
5. Scope & Advantages of RBF
6. Conclusion
7. References
Introduction
In machine learning, problem-solving based on hyperplane-based algorithms heavily depends
upon the distribution of the data points in the space. However, it is a known fact that real-
world data rarely follows theoretical assumptions.
There are many transformation functions that can convert the natural shape of the data
points into theoretically recommended distributions while preserving the hidden patterns of
the data. The radial basis function is one such well-known function, discussed in many
machine learning textbooks. In this article, we will learn about the basic intuition, types, and
usage of the radial basis function.
The Basic Intuition of a Radial Basis Function
The radial basis function is a mathematical function that takes a real-valued input and
outputs a real value based on the distance between the input point, projected into space,
and a fixed reference point (the centre) placed elsewhere.
This function is popularly used in many machine learning and deep learning algorithms such
as Support Vector Machines, Artificial Neural Networks, etc.
Let us understand the concept and the usage of this mathematical function.
In real-time, whenever we solve complex machine learning problems using algorithms such
as SVM, we need to project all of our data points in an imaginary multidimensional space
where each feature will be a dimension.
Let's assume we have a classification problem to predict whether a student will pass or fail
the examination.
Let’s consider that our data points look like this where-
• The green colour represents the students who passed the examination
• The red colour represents the students who failed the examination
Now, SVM will create a hyperplane that travels through these 3 dimensions in order to
differentiate the failed and passed students-
So, technically, the model now understands that every data point that falls on one side of
the hyperplane belongs to the students who passed the exam, and vice versa.
In our example it was easy to create the hyperplane because a linear, straight hyperplane
was enough to separate the two categories. But in real, complex projects these assumptions
are often violated. In particular, when there are hundreds of independent variables, the data
points are rarely linearly separable, which makes it difficult to create an optimal hyperplane.
In such scenarios, researchers usually apply the Radial basis function to each of the data
points so that they will be able to pass a linear hyperplane across the data points to easily
solve the problem.
Consider that our data points are looking like this in the space-
It is clear that we cannot use a linear hyperplane such that it can group the data points
according to their classes.
Researchers often project such data points into a much higher-dimensional space so that
the distances between them increase, and then apply a function (RBF or another function)
to build a hyperplane. However, moving to higher dimensions is not mandatory; it is
ultimately the decision of the statistician or researcher who understands the patterns in
the data.
Next, we have to mark an imaginary point in the space like this wherever we need.
After that, we draw concentric circles around this imaginary point. The distance between
the centre and any data point positioned on the boundary of a circle is called the radius.
The value of a radial basis function therefore depends only on this radius.
Gaussian Radial Basis Function
A widely used radial basis function is the Gaussian: φ(r) = exp(−(εr)²), where
• r is the distance of a point from the chosen centre
• ε is a constant (the shape parameter) that controls how quickly the function decays
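A small numpy sketch of this Gaussian RBF, evaluated around an assumed centre point:

```python
# Minimal numpy sketch of the Gaussian radial basis function
# phi(r) = exp(-(eps * r)^2), evaluated around a chosen centre point.
import numpy as np

def gaussian_rbf(x, centre, eps=1.0):
    r = np.linalg.norm(x - centre, axis=-1)   # distance of each point from the centre
    return np.exp(-(eps * r) ** 2)

centre = np.array([0.0, 0.0])
points = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0]])
# Values shrink smoothly towards 0 as points move away from the centre.
print(gaussian_rbf(points, centre, eps=0.5))
```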
Intuitively, these functions carry out two different processes on the data points in the
space: expansion and compression.
After the expansion and compression, the data points would have been transformed like this-
Now, we can easily construct a linear hyperplane that can classify the data points like this-
The Concept of the RBF Network
Sometimes, RBFs are also used within artificial neural networks that have one hidden layer.
In such networks, RBFs serve as the activation functions of the hidden layer. Besides the
hidden layer, there is an input layer containing one neuron per feature variable, and the
output layer forms a weighted sum of the hidden-layer outputs to produce the network
output.
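A minimal RBF-network sketch under these assumptions (centres chosen by K-Means, Gaussian activations, and a linear output layer fitted by least squares; the data are synthetic and the parameter values are illustrative):

```python
# Minimal RBF-network sketch: Gaussian hidden layer + linear output layer.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(10)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.05, 200)

# Hidden-layer centres: a common (but not mandatory) choice is K-Means centroids.
centres = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X).cluster_centers_
eps = 1.0

def hidden(X):
    r = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-(eps * r) ** 2)            # one Gaussian activation per centre

H = np.column_stack([np.ones(len(X)), hidden(X)])    # add a bias unit
w, *_ = np.linalg.lstsq(H, y, rcond=None)             # fit the output weights

pred = H @ w
print("mean squared error:", np.mean((pred - y) ** 2))
```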
This function is available as an inbuilt library in most data science-oriented programming
languages such as Python or R. Hence, it is easy to implement this once you understand the
theoretical intuition. I have added the links to some of the advanced materials in the
references section where you can deep dive into the complex calculations if you are
interested.
References
1. Radial Basis Functions, Wikipedia.
2. Radial Basis Function Networks (archived 2014-04-23 at the Wayback Machine).
3. Broomhead, David H.; Lowe, David (1988). "Multivariable Functional Interpolation
and Adaptive Networks". Complex Systems. 2: 321–355.
4. Powell, Michael J. D. (1977). "Restart procedures for the conjugate gradient
method". Mathematical Programming. 12 (1): 241–254. doi:10.1007/bf01593790.
5. Sahin, Ferat (1997). A Radial Basis Function Approach to a Color Image
Classification Problem in a Real-Time Industrial Application (M.Sc. thesis). Virginia
Tech. p. 26. hdl:10919/36847. ("Radial basis functions were first introduced by
Powell to solve the real multivariate interpolation problem.")
Case-based reasoning
What is case-based reasoning?
Case-based reasoning (CBR) is a problem-solving approach in artificial intelligence and
cognitive science that uses past solutions to solve similar new problems. It is an experience-
based technique that adapts previously successful solutions to new situations. The process is
primarily memory-based, modeling the reasoning process on the recall and application of past
experiences.
The CBR process generally involves four steps:
1. Retrieval: Gathering from memory an experience closest to the current problem.
2. Reuse: Suggesting a solution based on the experience and adapting it to meet the
demands of the new problem.
3. Revision: Evaluating the use of the solution in the new context.
4. Retaining: Storing this new problem-solving method in the memory system.
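A minimal, hypothetical Python sketch of the retrieve-reuse-retain loop; the tiny help-desk case base and the string-similarity measure below are assumptions for illustration only:

```python
# Minimal case-based reasoning sketch: retrieve, reuse, retain.
from difflib import SequenceMatcher

case_base = [
    {"problem": "printer not responding over network", "solution": "restart print spooler"},
    {"problem": "laptop battery drains quickly",       "solution": "reduce screen brightness"},
    {"problem": "email attachments fail to upload",    "solution": "clear browser cache"},
]

def similarity(a, b):
    # A simple text-similarity measure standing in for a real case-matching function.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def solve(new_problem):
    best = max(case_base, key=lambda c: similarity(c["problem"], new_problem))  # Retrieval
    solution = best["solution"]                                                 # Reuse
    # Revision (checking the solution in the new context) would happen here.
    case_base.append({"problem": new_problem, "solution": solution})            # Retaining
    return solution

print(solve("network printer is not printing"))
```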
CBR differs from other AI approaches, such as knowledge-based systems, in that it doesn't
rely solely on general knowledge of a problem domain or making associations. Instead, it
employs the specific knowledge of previously experienced, concrete problem situations. This
approach offers incremental, sustained learning as each time a problem is solved, a new
experience is retained and can be reused.
CBR is used in various areas, including pattern recognition, diagnosis, troubleshooting, and
planning. It's considered easier to maintain compared to rule-based expert systems. However,
it's important to note that while CBR is a powerful method for computer reasoning, it also has
its limitations and is not suitable for all types of problems.
What are the benefits of using case-based reasoning?
1. Ease of Knowledge Acquisition — CBR simplifies the process of knowledge
acquisition, as it relies on specific instances of problem-solving rather than abstract
rules or models.
2. Efficiency and Quality — It can improve the efficiency and quality of problem-
solving by adapting solutions that have been successful in the past.
3. Flexibility — CBR is adaptable to a wide range of tasks and domains, making it a
versatile tool in various fields.
4. Human-Like Reasoning — It allows machines to reason more like humans by
understanding and applying knowledge from past cases.
5. Learning Capability — CBR systems can learn incrementally as each new case is
solved and retained, enhancing their problem-solving capabilities over time.
6. Ease of Maintenance — Compared to rule-based systems, CBR systems are
generally easier to maintain because they do not require extensive rule management.
7. Intuitive Approach — The process of CBR is intuitive and mirrors human problem-
solving by using precedents, which can make development and maintenance easier.
What are some of the challenges associated with case-based reasoning?
1. Handling Large Case Bases — CBR can struggle with managing and searching
through large case bases efficiently.
2. Dynamic Domain Problems — CBR may not be suitable for problems in dynamic
domains where the conditions change rapidly.
3. Storage and Processing — Storing a large number of cases can require significant
storage space, and finding similar cases can be time-consuming.
4. Case Creation — Cases may need to be manually created, which can be labor-
intensive and error-prone.
5. Adaptation Challenges — Adapting retrieved cases to new problems can be difficult,
especially if the new problem is significantly different from past cases.
6. Robustness — CBR systems may lack robustness, as the absence of even one piece
of data can disrupt the retrieval process.
How can case-based reasoning be used in AI applications?
Case-based reasoning (CBR) can be utilized in various AI applications due to its ability to
solve problems by adapting solutions from similar past cases. Here are some ways CBR is
applied:
1. Diagnosis — In healthcare, CBR can assist in diagnosing diseases by comparing
current patient data with historical cases.
2. Financial Decision Making — Financial institutions use CBR for loan approvals,
risk assessments, and investment strategies by analyzing past financial cases.
3. Legal Reasoning — CBR aids in legal case analysis by referencing similar past legal
cases to inform decisions.
4. Customer Support — Help-desk systems employ CBR to provide solutions to
customer issues based on previously resolved cases.
5. Manufacturing — Advanced manufacturing processes can benefit from CBR by
troubleshooting and process control based on past incidents.
6. E-commerce — CBR can enhance self-service and e-commerce applications by
personalizing recommendations based on customer history.
What is the future of case-based reasoning?
The future of CBR looks promising due to its flexibility, accuracy, and simplicity, which
make it an attractive AI approach for various domains. It is expected to grow in popularity
and be increasingly used in areas such as:
• Self-service Applications — CBR can power self-service systems in e-commerce,
providing personalized experiences based on past user interactions.
• Web Applications — The adaptability of CBR to new areas like web applications
suggests its potential for broader application in online services.
• Risk Monitoring and Defense — The efficiency of CBR in risk monitoring and
defense indicates its continued relevance in sectors requiring high-stakes decision-
making.
• AI Accessibility — The relative simplicity of CBR makes it accessible for businesses
and organizations new to AI, suggesting its role in democratizing AI usage.