Introduction to Machine Learning 9
Definition: Teaching computers to learn patterns from data without being explicitly programmed
(Arthur Samuel, 1959).
Example: Spam filtering using labeled emails.
Applications: Pattern recognition, stock market prediction, self-driving cars, NLP, recommendation
systems, etc.
Types:
1. Supervised Learning: Learn from labeled data. The machine discovers the pattern in the data.
2. Unsupervised Learning: Discover hidden patterns in unlabeled data. The goal is to identify the
structural relationships among the data points, then use them to group the data and to apply
dimensionality reduction, which cuts down on redundant data.
3. Reinforcement Learning: Learn via trial and error, using feedback.
Workflow:
1. Collect and understand data. The columns are referred to as features of the data, and the
rows are referred to as examples. The output can be binary (yes/no) or multi-class.
2. Prepare data: fill in missing values (with the mean, median, or most frequent value), standardize
formats, and resolve ambiguous values (e.g., uppercase vs. lowercase). Normalization is required so
that all features share the same range; otherwise the model is biased toward features with large
numbers.
3. Train a model.
4. Test the model.
5. Improve accuracy.
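The data preparation step can be sketched in plain Python. This is a minimal illustration, assuming each feature is a list of numbers in which None marks a missing value. The helper names impute_mean and min_max_normalize and the sample columns are hypothetical. Missing values are filled with the mean, then min-max normalization puts every feature on the range 0 to 1.

```python
from statistics import mean


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def min_max_normalize(column):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Hypothetical feature columns on very different scales.
ages = [20, None, 40, 60]
incomes = [30000, 50000, None, 70000]

ages_clean = min_max_normalize(impute_mean(ages))
incomes_clean = min_max_normalize(impute_mean(incomes))
print(ages_clean)
print(incomes_clean)
```

After this step both features span the same range, so neither dominates just because its raw numbers are larger.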
Generalization: The ability of the machine to make predictions on data it sees for the first time
(data it was not trained on).
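Generalization is commonly estimated by holding out part of the data and measuring accuracy only on that part. The sketch below assumes a toy dataset of (feature, label) pairs and uses a 1-nearest-neighbor predictor purely as a stand-in model. The function names and the data are hypothetical.

```python
import random


def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction that training never sees."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Predict the label of the training example whose feature is closest to x."""
    nearest = min(train, key=lambda example: abs(example[0] - x))
    return nearest[1]


def accuracy(train, test):
    """Fraction of held-out examples whose label is predicted correctly."""
    correct = sum(1 for x, y in test if predict_1nn(train, x) == y)
    return correct / len(test)


# Hypothetical data: small feature values are "no", large ones are "yes".
data = [(x, "no") for x in range(0, 10)] + [(x, "yes") for x in range(20, 30)]
train, test = train_test_split(data)
print(f"train size={len(train)}, test size={len(test)}")
print(f"held-out accuracy={accuracy(train, test):.2f}")
```

The accuracy on the held-out examples, not on the training examples, is the number that reflects generalization.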
Learning Algorithm: ID3 uses entropy and information gain to split data. It is the basic algorithm
for building decision trees.
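Entropy and information gain can be computed directly from label counts. Entropy is H(S) = -sum(p_i * log2(p_i)) over the classes, and the gain of an attribute is H(S) minus the size-weighted entropy of the subsets produced by splitting on it. The sketch below assumes rows stored as dictionaries. The toy dataset and the function names are hypothetical, and only the attribute-selection step of ID3 is shown, not the full recursive tree construction.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy dataset.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

gains = {a: information_gain(rows, a, "play") for a in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains, "-> split on", best)
```

Here "outlook" separates the labels perfectly while "windy" tells nothing about them, so ID3 would pick "outlook" for the root split.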
Key Topics:
2. Algorithms:
o ID3: Basic decision tree method.
o C4.5/C5: Enhancements over ID3, such as handling continuous attributes and pruning.
3. Continuous Values: Split on a threshold (e.g., value <= t) instead of on every distinct value,
which would fragment the data and encourage overfitting.
4. Overfitting: Trees may become too complex and fail to generalize.
o Solutions: Early stopping and pruning.
5. Pruning:
o Simplifies trees by removing unnecessary splits.
o Balances fit and complexity using validation data.
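A threshold split on a continuous attribute, combined with a simple early-stopping check, might look like the sketch below. It assumes candidate thresholds are the midpoints between consecutive distinct values and scores them with plain information gain (C4.5 itself uses a gain-ratio criterion, which is not reproduced here). The names best_threshold and MIN_GAIN, the cutoff value, and the data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary split `value <= threshold`.

    Candidates are midpoints between consecutive distinct sorted values.
    Returns (None, 0.0) when no split improves on the unsplit node.
    """
    base = entropy(labels)
    distinct = sorted(set(values))
    best_t, best_gain = None, 0.0
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical continuous attribute (temperature) with yes/no labels.
temperatures = [15, 18, 21, 24, 27, 30]
labels = ["no", "no", "no", "yes", "yes", "yes"]

threshold, gain = best_threshold(temperatures, labels)

MIN_GAIN = 0.1  # assumed early-stopping cutoff, chosen only for illustration
if threshold is not None and gain >= MIN_GAIN:
    print(f"split on temperature <= {threshold} (gain {gain:.2f})")
else:
    print("stop: make this node a leaf")
```

Refusing splits whose gain falls below a cutoff is one form of early stopping. Pruning works the other way around: the tree is grown first, then splits that do not help on validation data are removed.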