Unit IV: Ensemble & Unsupervised Learning
Ensemble Learning
Ensemble learning combines the predictions of several base models to achieve better accuracy and robustness than any single model. Common combination strategies include:
1. **Voting**: Multiple models predict independently, and the majority class is selected (a minimal sketch follows this list).
2. **Error-Correcting Output Codes (ECOC)**: Decomposes a multi-class problem into multiple binary classification problems.
3. **Bagging (Bootstrap Aggregating)**: Trains models independently on different bootstrap samples of the data and aggregates their results.
4. **Boosting**: Trains models sequentially, with each model correcting the errors of the previous ones.
5. **Stacking**: Combines the outputs of base learners using another model (a meta-learner) that makes the final prediction.
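As a minimal sketch of the voting idea, assuming scikit-learn and a synthetic dataset (the base models and data here are illustrative choices, not from the notes):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed only for illustration.
X, y = make_classification(n_samples=200, random_state=0)

# Three different base models vote; with voting="hard", the majority
# class among their predictions is returned.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
clf.fit(X, y)
print(clf.predict(X[:5]))
```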
Bagging: Random Forest
A Random Forest is a bagging ensemble of Decision Trees: each tree is trained on a bootstrap sample of the data, and a random subset of features is considered at each split.
### **Advantages**:
- Reduces overfitting.
- Works well with high-dimensional data.
- Can be used for feature importance ranking.
### **Disadvantages**:
- Requires more computational power.
- Loses interpretability compared to individual Decision Trees.
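A minimal Random Forest sketch, assuming scikit-learn and synthetic data (hyperparameters are illustrative); it also demonstrates the feature-importance ranking mentioned above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, assumed only for illustration.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Each tree is trained on a bootstrap sample; class predictions are
# aggregated by majority vote across the trees.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X, y)

# feature_importances_ ranks features by their average impurity reduction.
for i, importance in enumerate(rf.feature_importances_):
    print(f"feature {i}: {importance:.3f}")
```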
Boosting: AdaBoost
AdaBoost trains weak learners (typically decision stumps) sequentially, re-weighting the training samples after each round so that later learners focus on the examples earlier ones misclassified.
### **Advantages**:
- Reduces bias, improving weak classifiers.
- More accurate than bagging for complex datasets.
### **Disadvantages**:
- Sensitive to noise in the dataset.
- Slower training due to sequential model building.
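A minimal AdaBoost sketch, assuming scikit-learn and synthetic data (the number of estimators is an illustrative choice):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Synthetic data, assumed only for illustration.
X, y = make_classification(n_samples=500, random_state=0)

# By default the weak learner is a depth-1 decision tree (a stump).
# Each boosting round up-weights the samples the previous stumps
# misclassified, so later stumps concentrate on the hard cases.
ada = AdaBoostClassifier(n_estimators=50, random_state=0)
ada.fit(X, y)
print(ada.score(X, y))
```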
Unsupervised Learning
Clustering: Introduction
Clustering groups unlabeled data points so that points within a cluster are more similar to each other than to points in other clusters.
Hierarchical Clustering
Hierarchical clustering builds a tree of nested clusters, typically bottom-up (agglomerative): starting from single-point clusters, the two closest clusters are merged at each step. The merge history can be visualized as a dendrogram.
### **Advantages**:
- No need to predefine the number of clusters.
- Dendrograms provide visual insights.
### **Disadvantages**:
- Computationally expensive for large datasets.
- Sensitive to noise and outliers.
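A minimal agglomerative clustering sketch, assuming scikit-learn, SciPy, and Matplotlib with synthetic data (the linkage method and cluster count are illustrative):

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic data, assumed only for illustration.
X, _ = make_blobs(n_samples=50, centers=3, random_state=0)

# Bottom-up clustering: repeatedly merge the two closest clusters
# (Ward linkage minimizes within-cluster variance at each merge).
labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)
print(labels[:10])

# The full merge history, visualized as a dendrogram.
Z = linkage(X, method="ward")
dendrogram(Z)
plt.show()
```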
K-Means Clustering
K-Means partitions the data into k clusters by repeatedly assigning each point to its nearest centroid and recomputing each centroid as the mean of its assigned points.
### **Advantages**:
- Fast and scalable for large datasets.
- Works well when clusters are well-separated.
### **Disadvantages**:
- Sensitive to initial cluster centers.
- Does not handle outliers well.
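A minimal K-Means sketch, assuming scikit-learn and synthetic data (k and the number of restarts are illustrative):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data, assumed only for illustration.
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# n_init=10 restarts the algorithm from several random initializations
# and keeps the best run, which mitigates the sensitivity to initial
# cluster centers noted above.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)
print(km.cluster_centers_)
```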
Dimensionality Reduction
Dimensionality reduction techniques reduce the number of features while preserving as much of the important information as possible. Principal Component Analysis (PCA), the most common linear technique, projects the data onto the directions of maximum variance.
### **Advantages**:
- Reduces noise and redundancy.
- Speeds up model training.
### **Disadvantages**:
- Can lose interpretability.
- Assumes linearity (for PCA).
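A minimal PCA sketch, assuming scikit-learn and synthetic data (the feature count and number of components are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data, assumed only for illustration.
X, _ = make_classification(n_samples=200, n_features=10, random_state=0)

# Standardize first: PCA directions depend on feature scale.
X_scaled = StandardScaler().fit_transform(X)

# Project 10 features down to the 2 orthogonal directions (principal
# components) that capture the most variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)
print(pca.explained_variance_ratio_)
```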