AI Lecture 9
ARTIFICIAL INTELLIGENCE
DR. MANAL TANTAWI
ASSOCIATE PROFESSOR
SCIENTIFIC COMPUTING DEPARTMENT
FACULTY OF COMPUTER & INFORMATION SCIENCES
AIN SHAMS UNIVERSITY
PART 9
• SPAM FILTERING
Spam: all emails you don’t want to receive and have not asked to receive.
Reinforcement learning
MACHINE LEARNING WORKFLOW
• Collect and understand data
• Train a model
• Test the model
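The three workflow steps can be sketched end-to-end. The toy weather/play dataset, the trivial most-common-label "model", and the train/test split below are all invented for illustration:

```python
# The three-step workflow on a toy dataset (all values invented):
# collect data, "train" a trivial most-common-label model, then
# test it on held-out examples.
from collections import Counter

# 1) Collect and understand data: (feature, label) pairs
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
        ("rain", "yes"), ("rain", "yes"), ("overcast", "yes")]
train, held_out = data[:4], data[4:]

# 2) Train a model: remember the most common label per feature value
model = {}
for feature, label in train:
    model.setdefault(feature, Counter())[label] += 1

def predict(feature):
    return model[feature].most_common(1)[0][0]

# 3) Test the model: accuracy on examples not seen during training
accuracy = sum(predict(f) == y for f, y in held_out) / len(held_out)
print(accuracy)  # 1.0 on this tiny example
```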
Example
PREPARE DATA
Real-world data is never ideal to work with. Data might be sourced from different systems and different organizations, which may have different standards and rules for data integrity. There are always missing data, columns with the same value for all examples, empty columns, inconsistent data, categorical data that needs to be encoded, and data in a format that is difficult to work with for the algorithms that we want to use.
Example of a dataset with missing and ambiguous values
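A hedged sketch of a few of the fixes listed above (filling missing values with the column mean, dropping a constant column, encoding a categorical column as integers), on an invented three-row dataset:

```python
# Common data-preparation steps on a tiny invented dataset:
# a missing value, a constant column, and a categorical column.
rows = [
    {"outlook": "sunny", "temp": 40,   "id": 1},
    {"outlook": "rain",  "temp": None, "id": 1},  # missing value
    {"outlook": "sunny", "temp": 30,   "id": 1},  # "id" never varies
]

# Fill missing numeric values with the column mean
temps = [r["temp"] for r in rows if r["temp"] is not None]
mean_temp = sum(temps) / len(temps)
for r in rows:
    if r["temp"] is None:
        r["temp"] = mean_temp

# Drop columns whose value is identical for all examples
constant = {k for k in rows[0] if len({r[k] for r in rows}) == 1}
rows = [{k: v for k, v in r.items() if k not in constant} for r in rows]

# Encode the categorical column as integers
codes = {v: i for i, v in enumerate(sorted({r["outlook"] for r in rows}))}
for r in rows:
    r["outlook"] = codes[r["outlook"]]

print(rows)
```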
SUPERVISED LEARNING
• SUPERVISED LEARNING (SL)
➢ Is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples.
Two main tasks: Regression and Classification
REGRESSION (PREDICTING HOUSE PRICES)
Linear vs. non-linear
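As a minimal illustration of the linear case, ordinary least squares on one feature (house size). All numbers are invented, and this toy data happens to be exactly linear:

```python
# Ordinary least squares for one feature (house size -> price);
# the data is made up and happens to lie exactly on a line.
sizes  = [50, 80, 100, 120, 150]      # square meters
prices = [150, 240, 300, 360, 450]    # price units

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n

# slope = covariance(x, y) / variance(x); intercept from the means
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
         / sum((x - mean_x) ** 2 for x in sizes))
intercept = mean_y - slope * mean_x

def predict(size):
    return intercept + slope * size

print(predict(90))  # 270.0
```

A non-linear relationship (e.g. price flattening out for very large houses) would need a more flexible model, such as the regression trees discussed later.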
Rule-based learning
➢ Experience-based learning: it infers a function from labeled training data consisting of a set of training examples, which can be used for classifying new (unseen) examples (generalization).
RULE-BASED LEARNING
Corresponding Rules:
• If it is Sunny, temperature is Hot, humidity is High, and Wind is Weak, then I will not play.
• If it is Sunny, temperature is Hot, humidity is High, and Wind is Strong, then I will not play.
• If it is Overcast, temperature is Hot, humidity is High, and Wind is Weak, then I will play.
…
• If it is Rain, temperature is Mild, humidity is High, and Wind is Strong, then I will not play.
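A few of the rules above, transcribed directly as a rule-based classifier. Inputs not covered by any rule return None, previewing the "no rule activated" drawback discussed next:

```python
# A rule-based classifier: the system just matches the given rules.
# Only some of the rules are encoded here; uncovered inputs return
# None, illustrating that sometimes no rule fires at all.
def play_tennis(outlook, temp, humidity, wind):
    rules = [
        (("Sunny", "Hot", "High", "Weak"), "No"),
        (("Sunny", "Hot", "High", "Strong"), "No"),
        (("Overcast", "Hot", "High", "Weak"), "Yes"),
        (("Rain", "Mild", "High", "Strong"), "No"),
    ]
    for condition, decision in rules:
        if (outlook, temp, humidity, wind) == condition:
            return decision
    return None  # no rule activated

print(play_tennis("Sunny", "Hot", "High", "Weak"))    # No
print(play_tennis("Rain", "Cool", "Normal", "Weak"))  # None (uncovered)
```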
DRAWBACKS OF RULE-BASED SYSTEMS
• NO LEARNING: THE SYSTEM JUST USES THE SET OF RULES GIVEN BY THE KNOWLEDGE ENGINEER.
• SOMETIMES MORE THAN ONE RULE IS ACTIVATED, OR NONE OF THEM.
• FOR LARGE DATASETS, THERE WILL BE TOO MANY LONG RULES.
• RULES CONTAIN REDUNDANT AND UNNECESSARY CONDITIONS.
• OVERFITTING (LOW OR ZERO TRAINING ERROR BUT HIGH TEST ERROR).
DECISION TREES
• DECISION TREE LEARNING IS A METHOD FOR APPROXIMATING DISCRETE-VALUED TARGET FUNCTIONS THAT IS ROBUST TO NOISY DATA AND CAPABLE OF LEARNING DISJUNCTIVE EXPRESSIONS.
• LEARNED TREES CAN ALSO BE RE-REPRESENTED AS SETS OF IF-THEN RULES TO IMPROVE HUMAN READABILITY.
• DECISION TREES ARE A NON-PARAMETRIC SUPERVISED LEARNING METHOD USED FOR BOTH CLASSIFICATION AND REGRESSION TASKS.
DECISION TREE FOR PLAY TENNIS
Corresponding Rules:
• If it is Sunny and humidity is High, then I will not play.
• If it is Sunny and humidity is Normal, then I will play.
• If it is Overcast, then I will play.
• If it is Raining and wind is Strong, then I will not play.
• If it is Raining and wind is Weak, then I will play.
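The tree above, transcribed as nested if-then logic. Unlike the flat rule list of the rule-based system, temperature never matters and each path tests at most two attributes:

```python
# The learned play-tennis tree as nested if-then logic.
# Root split: Outlook; then Humidity (Sunny branch) or Wind (Rain branch).
def play_tennis(outlook, humidity, wind):
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Overcast":
        return "Yes"
    # remaining case: outlook == "Rain"
    return "No" if wind == "Strong" else "Yes"

print(play_tennis("Sunny", "Normal", "Weak"))  # Yes
print(play_tennis("Rain", "High", "Strong"))   # No
```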
CONTINUE…
DECISION TREES
Regression Tree (predicts a numeric value)
DECISION TREES (LOGIC FUNCTIONS)
DECISION TREES LEARNING FROM DATA
HOW CAN WE FIND THE BEST TREE?
Splits are scored with entropy: Entropy(S) = −Σᵢ pᵢ log₂ pᵢ, where pᵢ is the probability of class i. The attribute whose split gives the highest information gain (the largest drop in entropy) is chosen. In the tennis data, Humidity is better than Wind.
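The claim can be checked numerically with entropy and information gain. The class counts below follow the classic 14-example play-tennis dataset (9 Yes / 5 No overall), which these slides appear to use:

```python
# Entropy and information gain for the Humidity and Wind splits of
# the 14-example play-tennis data (9 Yes / 5 No).
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

def info_gain(parent, children):
    n = sum(parent)
    return entropy(parent) - sum(sum(ch) / n * entropy(ch) for ch in children)

# Humidity: High branch has 3 Yes / 4 No, Normal branch has 6 Yes / 1 No
gain_humidity = info_gain([9, 5], [[3, 4], [6, 1]])
# Wind: Weak branch has 6 Yes / 2 No, Strong branch has 3 Yes / 3 No
gain_wind = info_gain([9, 5], [[6, 2], [3, 3]])

print(gain_humidity > gain_wind)  # True: Humidity is the better split
```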
TENNIS EXAMPLE
✓ winner
Temperature | 40 | 10 | 35 | 15 | 25 | 30
Play Tennis | No | No | Yes | No | Yes | Yes
Split node: Temperature > t?, with branches Yes / No.
Step 2:
For i = 1 … N−1:
    Consider the split tᵢ = (vᵢ + vᵢ₊₁) / 2
    Compute the information gain for the threshold split A ≥ tᵢ
Choose the t* with the highest information gain.
THRESHOLD SPLIT SELECTION ALGORITHM
After sorting:
Temperature | 10 | 15 | 25 | 30 | 35 | 40
Play Tennis | No | No | Yes | Yes | Yes | No
Candidate thresholds at the label changes: (15 + 25)/2 = 20 and (35 + 40)/2 = 37.5.
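Both steps (sort, then score every midpoint threshold) can be sketched as:

```python
# Threshold split selection: sort by the numeric attribute, form a
# candidate threshold at each midpoint between consecutive values,
# and keep the one with the highest information gain.
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n)
                for c in set(labels))

def best_threshold(values, labels):
    pairs = sorted(zip(values, labels))              # Step 1: sort by value
    vs = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    best_t, best_gain = None, -1.0
    for i in range(len(vs) - 1):                     # Step 2: try midpoints
        if vs[i] == vs[i + 1]:
            continue                                 # no boundary here
        t = (vs[i] + vs[i + 1]) / 2
        left  = [y for v, y in pairs if v < t]
        right = [y for v, y in pairs if v >= t]
        gain = entropy(ys) - (len(left) / len(ys)) * entropy(left) \
                           - (len(right) / len(ys)) * entropy(right)
        if gain > best_gain:                         # keep t* with top gain
            best_t, best_gain = t, gain
    return best_t

temperature = [40, 10, 35, 15, 25, 30]
play = ["No", "No", "Yes", "No", "Yes", "Yes"]
print(best_threshold(temperature, play))  # 20.0
```

On this data the winning threshold is 20, the first of the two label-change midpoints above.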
➢ Early Stopping: stop the learning algorithm before the tree becomes too complex.
[Plot: validation error vs. tree complexity]
EARLY STOPPING
Disadvantage:
Too short-sighted: we may miss "good" splits that occur right after "useless" splits.
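The short-sightedness is easy to demonstrate on XOR-style data, where every single split has zero information gain even though two splits classify perfectly. The tiny dataset below is constructed for exactly that purpose:

```python
# Why early stopping is short-sighted: on XOR-style data every single
# split has zero information gain, so a "stop when no split helps"
# rule quits immediately, although two splits would be perfect.
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(labels.count(c) / n * log2(labels.count(c) / n)
                for c in set(labels))

# XOR data: play = Yes exactly when the two binary features differ
data = [((0, 0), "No"), ((0, 1), "Yes"), ((1, 0), "Yes"), ((1, 1), "No")]
labels = [y for _, y in data]

gains = []
for feature in (0, 1):
    left  = [y for x, y in data if x[feature] == 0]  # half the data
    right = [y for x, y in data if x[feature] == 1]  # other half
    gain = entropy(labels) - 0.5 * entropy(left) - 0.5 * entropy(right)
    gains.append(gain)

print(gains)  # [0.0, 0.0]: neither split looks useful on its own
```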
PRUNING
L(T) is the number of leaf nodes in tree T.
• WHICH TREE IS SIMPLER?
The tree with L(T) = 3 is simpler than the tree with L(T) = 5.
BALANCE FIT AND COMPLEXITY
➢ Want to balance:
how well the tree fits the data and the complexity of the tree.
TOTAL COST
Total cost: C(T) = Error(T) + λ · L(T), where λ ≥ 0 trades off fit against complexity.
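A minimal sketch of the trade-off, with made-up error rates and leaf counts and an assumed λ = 0.05: the bigger tree fits better, but the complexity penalty makes the smaller tree cheaper overall.

```python
# Total cost C(T) = Error(T) + lambda * L(T); all numbers are
# assumptions chosen to illustrate the trade-off.
def total_cost(error, num_leaves, lam):
    return error + lam * num_leaves

big_tree   = total_cost(error=0.10, num_leaves=5, lam=0.05)  # better fit
small_tree = total_cost(error=0.14, num_leaves=3, lam=0.05)  # fewer leaves
print(small_tree < big_tree)  # True: the simpler tree wins here
```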
PRUNING ALGORITHM
“Undo” the splits on T_smaller
REPEAT FOR EVERY SPLIT
DECISION TREES PRUNING ALGORITHM
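One iteration of that loop, sketched under assumed numbers: undoing a 3-leaf subtree into a single leaf raises training error slightly (0.10 to 0.12) but removes two leaf-node penalties, so with λ = 0.05 the pruned version has lower total cost.

```python
# One pruning step: compare the total cost of keeping a subtree
# against "undoing" its splits into one leaf, and keep the cheaper.
# All error values and lambda are assumptions for illustration.
def leaves(tree):
    # Count leaf nodes: internal nodes are dicts, leaves are labels
    if not isinstance(tree, dict):
        return 1
    return sum(leaves(child) for child in tree.values())

def total_cost(error, num_leaves, lam):
    return error + lam * num_leaves          # C(T) = Error(T) + lambda * L(T)

def prune_step(subtree_error, leaf_error, subtree, lam):
    keep = total_cost(subtree_error, leaves(subtree), lam)
    undo = total_cost(leaf_error, 1, lam)    # subtree replaced by one leaf
    return "prune" if undo <= keep else "keep"

# A hypothetical 3-leaf subtree splitting on humidity level
subtree = {"High": "No", "Normal": "Yes", "Low": "Yes"}
print(prune_step(0.10, 0.12, subtree, lam=0.05))  # prune
```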
CREDIT FOR