Introduction to Machine Learning (Slides 1-9)
Definition: Teaching computers to learn patterns from data without being explicitly programmed
(Arthur Samuel, 1959).
Example: Spam filtering using labeled emails.
Applications: Pattern recognition, stock market prediction, self-driving cars, NLP, recommendation systems, etc.
Types:
1. Supervised Learning: Learn from labeled data; the machine learns the pattern of the data from known input-output pairs.
2. Unsupervised Learning: Discover hidden patterns in unlabeled data. The machine tries to identify the structural relations within the data and then uses them to group (cluster) the examples and to perform dimensionality reduction, which cuts down redundant features.
3. Reinforcement Learning: Learn via trial and error, i.e., learning from feedback.
Machine Learning Workflow (Slides 10-12)
1. Collect and understand data. The columns are referred to as features of the data, and the
rows are referred to as examples. The output can be binary (yes/no) or multi-class.
2. Prepare data: clean missing values (fill with the mean, median, or most frequent value), standardize formats, resolve ambiguous values (e.g., uppercase vs. lowercase), and normalize so that all features share the same range; otherwise the machine is biased toward features with large numeric values.
3. Train a model.
4. Test the model.
5. Improve accuracy.
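The data-preparation step above can be sketched in plain Python (toy data invented for illustration, not from the slides): fill missing values with the column mean, then min-max normalize so every feature lands in the same [0, 1] range.

```python
def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def min_max_normalize(column):
    """Rescale values to [0, 1] so no feature dominates by sheer magnitude."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

ages = [20, None, 40]             # one feature with a missing value
ages = impute_mean(ages)          # -> [20, 30.0, 40]
scaled = min_max_normalize(ages)  # -> [0.0, 0.5, 1.0]
```

In practice a library imputer/scaler would be used instead, but the arithmetic is exactly this.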
Supervised Learning (Slides 13-20)
Tasks: 1) Regression (predict continuous values).
2) Classification (categorize data into discrete classes).
Generalization: the model's ability to make correct predictions on data it sees for the first time, i.e., data it was not trained on.
Examples:
o Regression: Predicting house prices.
o Classification approaches:
o Rule-based: memorizes rules rather than understanding the data, so it suffers from overfitting and high redundancy; decision trees are the remedy here.
o Instance-based (lazy learning): stores all the training data; when a query arrives, it looks through the stored data and answers based on it.
o Experience-based: learns a model from the data, so when a query arrives it answers immediately (examples: neural networks, decision trees).
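Instance-based (lazy) learning can be illustrated with a 1-nearest-neighbor sketch (toy data and function names invented here, not from the slides): the "model" is just the stored examples, and answering a query means scanning them for the closest one.

```python
def predict_1nn(train, query):
    """train: list of (feature_vector, label) pairs.
    Returns the label of the stored example closest to the query."""
    def dist(a, b):
        # squared Euclidean distance is enough for picking the minimum
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = min(train, key=lambda example: dist(example[0], query))
    return nearest[1]

train = [((1.0, 1.0), "spam"), ((8.0, 9.0), "ham")]
print(predict_1nn(train, (2.0, 0.5)))  # closest to (1.0, 1.0) -> "spam"
```

Note that no work happens at "training" time; all the cost is deferred to query time, which is exactly what "lazy" means.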
Decision Trees (Slides 24-67)
Purpose: A non-parametric supervised learning method for classification and regression.
Structure:
o Internal nodes: Test attributes.
o Branches: Attribute values.
o Leaves: Output classes.
Types: 1) Classification Tree (classifies examples into categories).
2) Regression Tree (predicts a numeric value).
Learning Algorithm: ID3 uses entropy and information gain to split the data; it is the basic algorithm for building decision trees.
Key Topics:
1. Entropy & Information Gain:
o Measure how well a split improves data classification.
o Example: Tennis dataset.
2. Algorithms:
o ID3: Basic decision tree method.
o C4.5/C5: Enhancements like handling continuous attributes, pruning.
3. Continuous Values: Handle continuous attributes with threshold splits (e.g., Temperature > 75).
4. Overfitting: Trees may become too complex and fail to generalize.
o Solutions: Early stopping and pruning.
5. Pruning:
o Simplifies trees by removing unnecessary splits.
o Balances fit and complexity using validation data.
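Entropy and information gain, as ID3 uses them, can be computed directly. The sketch below uses a tiny invented dataset in the spirit of the tennis example (not the slides' exact numbers): entropy is H(S) = -Σ p_i log2 p_i, and the gain of an attribute is the entropy drop after splitting on it.

```python
import math

def entropy(labels):
    """H(S) = -sum of p * log2(p) over the class proportions in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n)
                for c in (labels.count(v) for v in set(labels)))

def information_gain(rows, attr, labels):
    """Gain(S, A) = H(S) - sum over values v of |S_v|/|S| * H(S_v)."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        gain -= len(subset) / n * entropy(subset)
    return gain

rows = [{"wind": "weak"}, {"wind": "weak"}, {"wind": "strong"}, {"wind": "strong"}]
labels = ["yes", "yes", "no", "yes"]
print(round(entropy(labels), 3))                         # 0.811
print(round(information_gain(rows, "wind", labels), 3))  # 0.311
```

ID3 greedily picks the attribute with the highest gain at each node; C4.5 refines this with gain ratio, continuous-attribute thresholds, and pruning.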
Final Notes (Slides 68-69)
Credits: Based on Machine Learning by Tom M. Mitchell.
Decision trees can approximate noisy data, handle missing values, and be converted into if-then rules.
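The conversion to if-then rules mentioned above is mechanical: each root-to-leaf path becomes one rule. A minimal sketch, assuming a nested-dict tree encoding invented here for illustration:

```python
def to_rules(node, conds=()):
    """Walk a tree of the form {attribute: {value: subtree_or_leaf}} and
    emit one 'IF ... THEN ...' rule per root-to-leaf path."""
    if not isinstance(node, dict):  # reached a leaf: emit the finished rule
        return ["IF " + " AND ".join(conds) + " THEN " + node]
    (attr, branches), = node.items()
    rules = []
    for value, child in branches.items():
        rules += to_rules(child, conds + (attr + "=" + value,))
    return rules

tree = {"outlook": {"sunny": {"humidity": {"high": "no", "normal": "yes"}},
                    "overcast": "yes"}}
for rule in to_rules(tree):
    print(rule)  # e.g. IF outlook=sunny AND humidity=high THEN no
```

The resulting rule set is equivalent to the tree but can be pruned rule-by-rule, which is one reason the conversion is useful.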