Random Forest
• Boosting (for contrast with the Bagging used by Random Forest):
• - Models are trained sequentially.
• - Each model tries to correct the errors of the previous one (see the sketch below).
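A minimal sketch of the boosting idea, assuming scikit-learn and NumPy are available; the data and hyperparameters are hypothetical. Each decision stump is fit to the residual errors left by the ensemble built so far, so every new model corrects its predecessors.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))           # hypothetical feature
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 200)   # hypothetical target

learning_rate = 0.1
prediction = np.zeros_like(y)
stumps = []
for _ in range(100):
    residual = y - prediction                   # errors of the ensemble so far
    stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
    prediction += learning_rate * stump.predict(X)   # correct those errors
    stumps.append(stump)

print("training MSE:", np.mean((y - prediction) ** 2))
```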
What is Random Forest?
• Random Forest is an ensemble algorithm based on Decision Trees.
• It uses Bagging to combine the output of multiple decision trees.
• Each tree is trained on a random subset of the data and of the features.
• The final prediction is the majority vote (classification) or the average (regression) of the trees, as in the sketch below.
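A minimal end-to-end sketch using scikit-learn's RandomForestClassifier; the synthetic dataset and the parameter choices are assumptions for illustration, not part of the original slides.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 100 trees, each trained on a bootstrap sample of the rows and a random
# subset of features at every split; prediction is the majority vote.
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                bootstrap=True, random_state=42)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```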
Random Forest Algorithm
• Step 1: Draw N bootstrap samples (random samples with replacement) from the dataset.
• Step 2: Build a Decision Tree on each bootstrap sample.
• Step 3: At each node, choose the best split from a random subset of the features.
• Step 4: Grow each tree fully (or until a stopping criterion is met).
• Step 5: Aggregate the results of all trees by majority vote or averaging; a from-scratch sketch follows.
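The steps above can be mirrored in a short from-scratch sketch. This is an illustration under stated assumptions, not a definitive implementation: scikit-learn's DecisionTreeClassifier handles Steps 2 and 4, with max_features="sqrt" supplying the per-split random feature subset of Step 3; the names fit_forest and predict_forest are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_forest(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))   # Step 1: bootstrap the rows
        tree = DecisionTreeClassifier(
            max_features="sqrt",                     # Step 3: random feature subset
            random_state=int(rng.integers(1_000_000)))
        tree.fit(X[idx], y[idx])                     # Steps 2 and 4: grow the tree
        trees.append(tree)
    return trees

def predict_forest(trees, X):
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    # Step 5: majority vote over the trees, one column of votes per sample
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
```

Applied to the data from the previous sketch, fit_forest(X_train, y_train) followed by predict_forest(trees, X_test) reproduces the bag-and-vote behaviour of RandomForestClassifier in miniature.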
Solved Example (Conceptual)
• Assume a dataset with two features, Age and Income.
• We want to predict whether a person will buy a product.
• Random Forest builds multiple decision trees, each from a different sample of rows and subset of features.
• Each tree gives a prediction (Yes/No).
• The final output is the majority vote of all trees; a runnable toy version follows.
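A runnable version of this example. The eight Age/Income rows and their Yes/No labels are entirely made up for illustration; the point is to see the individual trees vote.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# columns: [Age, Income]; labels: 1 = buys, 0 = does not buy (made-up data)
X = np.array([[25, 30000], [40, 80000], [35, 60000], [50, 120000],
              [23, 20000], [45, 90000], [30, 40000], [60, 150000]])
y = np.array([0, 1, 1, 1, 0, 1, 0, 1])

forest = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)
person = np.array([[38, 70000]])      # new person to classify

votes = [int(t.predict(person)[0]) for t in forest.estimators_]
print("per-tree votes:", votes)       # one Yes(1)/No(0) vote per tree
# scikit-learn averages the trees' class probabilities; with fully grown
# trees this coincides with the majority vote described above.
print("majority vote:", forest.predict(person)[0])
```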
Advantages and Disadvantages
• Advantages:
• - High accuracy.
• - Scales to large datasets and tolerates missing values.
• - Reduces overfitting compared to a single decision tree.
• - Works well for both classification and regression.
• Disadvantages:
• - Less interpretable than a single decision tree.
• - Slower to train and predict as the number of trees grows.
• - Large forests can consume significant memory.
Applications of Random Forest
• Medical Diagnosis (e.g., disease prediction).
• Fraud Detection in banking.
• Customer Segmentation in marketing.
• Product Recommendation Systems.
• Credit Scoring and Risk Analysis.
Thank You
• This concludes the session on the Random Forest algorithm.
• Feel free to ask questions.