Machine Learning
Machine learning is the study of algorithms that learn patterns from data in order to make predictions or decisions without being explicitly programmed for the task.
Examples:
1. Email Spam Filtering
2. Recommendation Systems
3. Image Recognition
Supervised Learning
Definition:
Supervised learning is a type of machine learning where the algorithm is trained
on labeled data. The training data consists of input-output pairs, and the goal is for
the algorithm to learn a mapping from inputs to outputs that generalizes to unseen inputs.
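A minimal sketch of this idea, fitting a linear mapping with NumPy's least-squares solver; the data points below are made up for illustration and are not from these notes:

import numpy as np

# Hypothetical labeled data: each row of X is an input, y holds the target outputs.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 3.9, 6.2, 8.1])  # roughly y = 2x

# Append a bias column and solve for the weights that minimize squared error.
X_b = np.hstack([X, np.ones((X.shape[0], 1))])
weights, *_ = np.linalg.lstsq(X_b, y, rcond=None)

slope, intercept = weights
print(f"learned mapping: y = {slope:.2f} * x + {intercept:.2f}")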
Common Algorithms:
1. Linear Regression: Predicts a continuous target variable based on linear
relationships between input features.
2. Neural Networks: Composed of interconnected nodes (neurons) that process
input features and learn complex patterns.
3. Decision Trees: Models decisions and their possible consequences using a tree-
like graph.
4. Support Vector Machines (SVM): Finds the hyperplane that best separates
different classes in the input feature space.
5. K-Nearest Neighbors (KNN): Classifies a data point based on the majority class
among its k nearest neighbors (a minimal sketch follows this list).
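As referenced in item 5, here is a minimal from-scratch sketch of KNN classification, assuming numeric features and Euclidean distance; the training points and labels are made up for illustration:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Distance from the query point to every training point.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Take the labels of the k closest points and predict the majority class.
    nearest = y_train[np.argsort(dists)[:k]]
    return Counter(nearest).most_common(1)[0][0]

X_train = np.array([[1, 1], [1, 2], [6, 6], [7, 7]])
y_train = np.array(['a', 'a', 'b', 'b'])
print(knn_predict(X_train, y_train, np.array([2, 1])))  # prints 'a'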
Applications:
- Email spam detection
- Image and speech recognition
- Medical diagnosis
- Stock price prediction
Unsupervised Learning
Definition:
Unsupervised learning involves training algorithms on data that is not labeled. The
goal is to infer the natural structure present within a set of data points.
Common Algorithms:
1. K-Means Clustering: Partitions data into k distinct clusters based on feature
similarity (see the sketch after this list).
2. Hierarchical Clustering: Builds a hierarchy of clusters using either a bottom-up
(agglomerative) or top-down (divisive) approach.
3. Principal Component Analysis (PCA): Reduces the dimensionality of data by
transforming it into a new set of orthogonal components.
4. Anomaly Detection: Identifies outliers or unusual data points that differ
significantly from the majority of the data.
5. Autoencoders: Neural networks used for data compression and noise reduction
by learning efficient codings of input data.
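As referenced in item 1, a minimal from-scratch sketch of k-means (Lloyd's algorithm) on made-up 2-D points; the cluster count, iteration cap, and seed are illustrative choices, and the sketch assumes no cluster ever becomes empty:

import numpy as np

def kmeans(X, k=2, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centroids at k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centroids, axis=2)
        labels = np.argmin(dists, axis=1)
        # Move each centroid to the mean of its assigned points.
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids

X = np.array([[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 8.5]])
labels, centroids = kmeans(X, k=2)
print(labels)     # the two low points share one cluster, the two high points the other
print(centroids)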
Applications:
- Customer segmentation
- Market basket analysis
- Anomaly detection in network security
- Dimensionality reduction for data visualization (see the PCA sketch below)
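As noted in the last item, a minimal PCA-for-visualization sketch using NumPy's SVD, projecting made-up 3-D data onto its top two principal components:

import numpy as np

X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 0.1],
              [2.2, 2.9, 0.4],
              [1.9, 2.2, 0.3]])

Xc = X - X.mean(axis=0)                # center the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
X_2d = Xc @ Vt[:2].T                   # coordinates along the top 2 components
print(X_2d)                            # 2-D points ready for a scatter plot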
Key Differences
- Training data: supervised learning uses labeled input-output pairs; unsupervised learning uses unlabeled data.
- Goal: supervised learning learns a mapping from inputs to known outputs; unsupervised learning discovers structure hidden in the data.
- Evaluation: supervised models are scored against the known labels (e.g., accuracy or error); unsupervised results are judged by internal criteria such as cluster cohesion.
- Typical tasks: classification and regression versus clustering, dimensionality reduction, and anomaly detection.
Version Space
Version space is a concept in machine learning that represents the subset of hypotheses that
are consistent with the observed training examples. It is used in the context of supervised
learning to narrow down the set of possible hypotheses to those that can correctly classify the
training data.
Key Concepts:
1. Hypothesis Space (H):
- The set of all possible hypotheses that can be formed using the features and values present
in the dataset.
2. Version Space (VS):
- The subset of the hypothesis space that is consistent with all the training examples. It is
represented as the space between the most specific hypothesis (S) and the most general
hypothesis (G).
3. General Boundary (G):
- The set of maximally general hypotheses that are consistent with the training data: they cover all positive examples while excluding all negative examples.
4. Specific Boundary (S):
- The set of maximally specific hypotheses consistent with the training data: any further specialization would fail to cover some positive example.
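A minimal sketch of how these concepts can be represented in code, assuming conjunctive hypotheses over the (Color, Size, Shape) attributes used in the example below, with '?' as a wildcard; the function names are illustrative:

def covers(hypothesis, instance):
    # A hypothesis covers an instance when every attribute matches or is '?'.
    return all(h == '?' or h == x for h, x in zip(hypothesis, instance))

def consistent(hypothesis, examples):
    # Consistent = covers every positive example and no negative example.
    return all(covers(hypothesis, x) == label for x, label in examples)

examples = [
    (('Red', 'Small', 'Round'), True),    # Apple
    (('Green', 'Large', 'Oval'), False),  # Not Apple
]
print(consistent(('Red', '?', 'Round'), examples))  # True: inside the version space
print(consistent(('?', '?', '?'), examples))        # False: covers a negative example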
Example
Let's use the following training examples:
1. Positive Example: (Red, Small, Round) -> Apple
2. Negative Example: (Green, Large, Oval) -> Not Apple
3. Positive Example: (Red, Large, Round) -> Apple
4. Negative Example: (Red, Small, Oval) -> Not Apple
Step-by-Step Version Space Update:
Example 1: (Red, Small, Round) -> Apple
• Generalize Specific Hypothesis (S):
• S: Initially the most specific hypothesis, covering no instances.
• After observing the positive example (Red, Small, Round), S must cover at least this fruit:
• S: {(Red, Small, Round)}
• General Hypothesis (G):
• G: Initially the most general hypothesis, {(?, ?, ?)}, covering every instance.
• A positive example never forces G to specialize; (?, ?, ?) already covers (Red, Small, Round), so G is unchanged:
• G: {(?, ?, ?)}
Example 2: (Green, Large, Oval) -> Not Apple
• Specialize General Hypothesis (G):
• G: {(?, ?, ?)} wrongly covers this negative example, so it is replaced by its minimal specializations that exclude (Green, Large, Oval) while remaining more general than S:
• G: {(Red, ?, ?), (?, Small, ?), (?, ?, Round)}
• Specific Hypothesis (S):
• S: {(Red, Small, Round)}
• Negative examples never force S to change, and S already excludes (Green, Large, Oval), so S is unchanged.
Example 3: (Red, Large, Round) -> Apple
• Generalize Specific Hypothesis (S):
• S: {(Red, Small, Round)}
• S must also cover (Red, Large, Round); the minimal generalization replaces the size attribute with a wildcard:
• S: {(Red, ?, Round)}
• Prune General Hypothesis (G):
• G: {(Red, ?, ?), (?, Small, ?), (?, ?, Round)}
• Every member of G must cover the new positive example; (?, Small, ?) does not cover (Red, Large, Round) and is removed:
• G: {(Red, ?, ?), (?, ?, Round)}
Example 4: (Red, Small, Oval) -> Not Apple
• Specialize General Hypothesis (G):
• G: {(Red, ?, ?), (?, ?, Round)}
• (Red, ?, ?) wrongly covers (Red, Small, Oval); its minimal specialization that excludes this negative while still covering S is (Red, ?, Round). (?, ?, Round) already excludes the negative and is kept:
• G: {(Red, ?, Round), (?, ?, Round)}
• Specific Hypothesis (S):
• S: {(Red, ?, Round)}
• S already excludes (Red, Small, Oval), so S is unchanged.
Final Version Space:
After processing all four examples, the boundaries are:
• Specific Boundary (S):
• {(Red, ?, Round)}
• General Boundary (G):
• {(Red, ?, Round), (?, ?, Round)}
The version space is every hypothesis between S and G, here {(Red, ?, Round), (?, ?, Round)}: the data is consistent both with "a red, round fruit of any size" and with "any round fruit", so more training examples would be needed before the learner converges to a single hypothesis. A runnable sketch of these updates follows.
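To make the update rules concrete, here is a hedged from-scratch sketch of the Candidate Elimination algorithm that reproduces the trace above; the helper names and the explicit attribute domains are illustrative assumptions, and boundary pruning is kept to the minimum this example needs:

def covers(hypothesis, example):
    # A hypothesis covers an example when every attribute matches or is '?'.
    return all(h == '?' or h == e for h, e in zip(hypothesis, example))

def min_generalize(s, example):
    # Minimally generalize S so it covers a new positive example.
    if s is None:  # first positive example: adopt it outright
        return tuple(example)
    return tuple(si if si == ei else '?' for si, ei in zip(s, example))

def min_specializations(g, s, example, domains):
    # Minimal specializations of g that exclude a negative example
    # while staying at least as general as S.
    specs = []
    for i, gi in enumerate(g):
        if gi == '?':
            for value in domains[i]:
                if value != example[i]:  # this value excludes the negative
                    cand = g[:i] + (value,) + g[i + 1:]
                    if s is not None and covers(cand, s):
                        specs.append(cand)
    return specs

domains = [('Red', 'Green'), ('Small', 'Large'), ('Round', 'Oval')]
training = [
    (('Red', 'Small', 'Round'), True),
    (('Green', 'Large', 'Oval'), False),
    (('Red', 'Large', 'Round'), True),
    (('Red', 'Small', 'Oval'), False),
]

S = None                # most specific boundary: covers nothing yet
G = [('?', '?', '?')]   # most general boundary: covers everything

for x, positive in training:
    if positive:
        S = min_generalize(S, x)
        G = [g for g in G if covers(g, x)]  # drop G members missing the positive
    else:
        new_G = []
        for g in G:
            if covers(g, x):  # g wrongly covers the negative: specialize it
                new_G.extend(min_specializations(g, S, x, domains))
            else:
                new_G.append(g)
        G = new_G
    print(f"S = {S}, G = {G}")

# Final line printed: S = ('Red', '?', 'Round'),
#                     G = [('Red', '?', 'Round'), ('?', '?', 'Round')]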