Week 8 - Understanding the Decision Tree
Dr Mariam Adedoyin-Olowe
[email protected]
Outline
• Recap on Classification Techniques
• Overview of Decision Trees
• How Decision Trees Work
• Decision Tree Components
• Advantages of Decision Trees
• Common Use Cases
• Decision Tree Example
• Conclusion
• Classification
– Assign a new data record to one of several predefined
groups or classes
– Given that X and Y are known to belong together, find
other items in the same group
Example of a Classification Task
• Suppose you are assessing data on individual customers’
financial backgrounds and purchase history
Visual representation:
A tree-like model that makes decisions based on input
features.
How Decision Trees Work
• Decision-making process: Sequentially split the data based
on features to create a tree structure.
• Presentation Forms
– “if, then” statements (decision rules)
– graphically - decision trees
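The "if, then" form can be sketched in code. This is a minimal illustration that uses the Refund / MarSt / TaxInc tree shown later in these slides. The function name `cheat_rules` is hypothetical, and sending an income of exactly 80K to the "Yes" rule is an assumption, since the slides only label the branches "< 80K" and "> 80K".

```python
def cheat_rules(refund: str, marital_status: str, taxable_income_k: float) -> str:
    """Return the predicted Cheat label by checking the rules in order."""
    # Rule 1: IF Refund = Yes THEN Cheat = No
    if refund == "Yes":
        return "No"
    # Rule 2: IF Refund = No AND MarSt = Married THEN Cheat = No
    if marital_status == "Married":
        return "No"
    # Rule 3: IF Refund = No AND MarSt is Single or Divorced
    #         AND TaxInc < 80K THEN Cheat = No
    if taxable_income_k < 80:
        return "No"
    # Rule 4: remaining case (80K or more; the boundary is an assumption)
    return "Yes"


print(cheat_rules("No", "Married", 80))  # No
```

Each rule corresponds to one path from the root of the tree to a leaf, which is why the two presentation forms are interchangeable.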
Decision Tree Components
• Works like a flow chart
• Looks like an upside-down tree
• Root Node: The starting point of
the tree.
• Decision Nodes: Nodes that split
the data based on a certain
feature.
• Leaf Nodes: Terminal nodes that
represent the final decision or
outcome.
• Branches: Connect nodes and
represent the decision path.
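One possible way to hold these components in a program is sketched below. The `Node` class and `count_leaves` helper are hypothetical names chosen for illustration, and the example tree is the Refund / MarSt / TaxInc tree from these slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    # A decision node tests `feature`; a leaf node carries `label` instead.
    feature: Optional[str] = None
    label: Optional[str] = None
    # Branches: each outcome of the test maps to the child node it leads to.
    branches: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return not self.branches


def count_leaves(node: Node) -> int:
    """Count the terminal nodes reachable from `node`."""
    if node.is_leaf():
        return 1
    return sum(count_leaves(child) for child in node.branches.values())


# Root node: the starting point, here a test on Refund.
root = Node(feature="Refund", branches={
    "Yes": Node(label="NO"),
    "No": Node(feature="MarSt", branches={
        "Married": Node(label="NO"),
        "Single, Divorced": Node(feature="TaxInc", branches={
            "< 80K": Node(label="NO"),
            "> 80K": Node(label="YES"),
        }),
    }),
})

print(count_leaves(root))  # 4
```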
How DT Works
• Are they all pure? (all yes or all no)
• If yes: stop
Surviving rows of the training table (columns as named in the tree: Tid, Refund, MarSt, TaxInc, Cheat):
9 No Married 75K No
10 No Single 90K Yes
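The purity check can be written in a few lines. This is a minimal sketch: `is_pure` is a hypothetical helper name, and the labels used are the Cheat values of records 9 and 10 from the slide.

```python
def is_pure(labels):
    """True when every record in the subset carries the same class label.

    An empty subset is treated as pure here, which is an assumption.
    """
    return len(set(labels)) <= 1


print(is_pure(["No", "No", "No"]))  # True
print(is_pure(["No", "Yes"]))       # False (records 9 and 10 together)
```

When the check returns True, the node becomes a leaf and splitting stops. Otherwise the subset is split again on another attribute.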
How DT Works
• A decision tree builds a classification model in the
form of a tree structure.
• It breaks a dataset down into smaller and smaller
subsets while the associated decision tree is
incrementally developed.
• The final result is a tree in which:
– Each internal node denotes a test on an
attribute
– Each branch represents an outcome of
the test
– Each leaf node represents a class label or
class distribution
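Breaking a dataset into smaller subsets can be sketched as a partition on one attribute. The helper name `split_by` is hypothetical, and the two records are the only training rows that survive in these slides (Tid 9 and 10), with column names taken from the tree labels.

```python
from collections import defaultdict


def split_by(records, attribute):
    """Group records by their value for `attribute`, one subset per value."""
    subsets = defaultdict(list)
    for record in records:
        subsets[record[attribute]].append(record)
    return dict(subsets)


records = [
    {"Tid": 9, "Refund": "No", "MarSt": "Married", "TaxInc": 75, "Cheat": "No"},
    {"Tid": 10, "Refund": "No", "MarSt": "Single", "TaxInc": 90, "Cheat": "Yes"},
]

by_marst = split_by(records, "MarSt")
for value, subset in by_marst.items():
    # Each subset becomes the data seen by one branch of the MarSt test.
    print(value, [r["Cheat"] for r in subset])
```

Each subset is handed to a child node, where the same procedure is repeated until the subsets are pure.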
[Figure: decision tree with the root node, internal nodes, leaf nodes and a branch labelled]
Decision Tree Classification Task
Apply Model to Test Data
Test record: Refund = No, MarSt = Married, TaxInc = 80K, Cheat = ?
[Figure, repeated across slides to trace the test record through the tree, starting at the root:]
Refund
– Yes: NO
– No: MarSt
   – Married: NO
   – Single, Divorced: TaxInc
      – < 80K: NO
      – > 80K: YES
Result: the record follows Refund = No, then MarSt = Married, and reaches the leaf NO. Assign Cheat to “No”.
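Applying the model to the test record can be sketched as a walk from the root to a leaf. The dictionary layout and the names `apply_model` and `taxinc_branch` are hypothetical. Treating an income of exactly 80K as the "> 80K" branch is an assumption, because the slide labels only "< 80K" and "> 80K".

```python
def taxinc_branch(record):
    # Assumption: exactly 80K is sent down the "> 80K" branch.
    return "< 80K" if record["TaxInc"] < 80 else "> 80K"


# Leaves are plain strings; decision nodes are dicts with a test and children.
TAXINC = {"test": "TaxInc", "branch_of": taxinc_branch,
          "children": {"< 80K": "NO", "> 80K": "YES"}}
MARST = {"test": "MarSt", "branch_of": lambda r: r["MarSt"],
         "children": {"Married": "NO", "Single": TAXINC, "Divorced": TAXINC}}
ROOT = {"test": "Refund", "branch_of": lambda r: r["Refund"],
        "children": {"Yes": "NO", "No": MARST}}


def apply_model(node, record):
    """Follow the branches matching `record` until a leaf label is reached."""
    path = []
    while isinstance(node, dict):
        branch = node["branch_of"](record)
        path.append(f'{node["test"]} = {branch}')
        node = node["children"][branch]
    return node, path


test_record = {"Refund": "No", "MarSt": "Married", "TaxInc": 80}
label, path = apply_model(ROOT, test_record)
print(path)   # ['Refund = No', 'MarSt = Married']
print(label)  # NO
```

The TaxInc test is never reached for this record, because the Married branch already ends in a leaf.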
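[Figure: classification workflow with the labels Training Set, Model, Apply Model and Decision. The training table has columns Tid, Attrib1, Attrib2, Attrib3, Class; one surviving row reads 6, No, Medium, 60K, No.]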