Fuzzy Decision Trees
Professor J. F. Baldwin
Attribute   Row for fi      Row for fi+1
A1          a11             a11
A2          a12             a12
...
An          a1n             a1n
T           fi              fi+1
Pr          p1 fi(t1)       p1 fi+1(t1)
Repeat for each row, collecting equivalent rows and adding their probabilities.
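The row expansion and merging described above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function names, the row layout `(attributes, target value, probability)` and the two-set partition of [0, 1] are all assumptions made for the example.

```python
from collections import defaultdict


def reduce_database(rows, memberships):
    """Expand each row over the fuzzy sets of its target value.

    rows: iterable of (attributes, t, p), where attributes is a hashable
          tuple, t is the target value and p is the row probability.
    memberships: assumed helper returning {fuzzy_set_name: membership of t}.
    Rows that end up with the same attributes and fuzzy set are merged by
    adding their probabilities.
    """
    reduced = defaultdict(float)
    for attributes, t, p in rows:
        for fuzzy_set, mu in memberships(t).items():
            if mu > 0.0:
                # New row probability is p * f_i(t), as in the table above.
                reduced[(attributes, fuzzy_set)] += p * mu
    return dict(reduced)


def two_set_memberships(t):
    """Hypothetical partition of [0, 1] into 'small' and 'large'."""
    return {"small": 1.0 - t, "large": t}


rows = [
    (("a", "b"), 0.25, 0.5),
    (("a", "b"), 0.75, 0.5),
]
reduced = reduce_database(rows, two_set_memberships)
```

Both example rows share the same attribute values, so their contributions to each fuzzy set are added and the reduced database has two rows instead of four.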
[Figure: fuzzy sets g1 to g5 on the universe of attribute Ai, with points a, b, c and d marked on the Ai axis]
Reduced database
Fuzzy ID3
Using the training set Tr' and the one-attribute reduced database for each continuous attribute, we can use the ID3 method given previously to determine the decision tree for predicting or classifying the target, and also to post-prune it.
We modify the stopping condition: do not expand node N if, for that node,
$S = -\sum_{T} \Pr(T) \ln \Pr(T)$
is less than some value v.
Node N will then have a probability distribution over the g_i.
You can also limit the depth of the tree to some value, for example expanding the tree only to depth 4.
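A minimal sketch of this stopping rule, assuming S is the Shannon entropy of the node's target distribution (natural logarithm, summed over target values with a leading minus sign). The function names and the default values of `v` and `max_depth` are illustrative assumptions, not values from the lecture.

```python
import math


def entropy(distribution):
    """Entropy S = -sum_T Pr(T) ln Pr(T) of a {target: probability} dict."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0.0)


def should_expand(distribution, depth, v=0.1, max_depth=4):
    """Expand a node only if it is still uncertain and not too deep.

    v and max_depth are hypothetical defaults chosen for illustration.
    """
    return entropy(distribution) >= v and depth < max_depth
```

A node whose distribution is concentrated on one target value has entropy near 0 and is left as a leaf, while an evenly split node is expanded until the depth limit is reached.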
A new case will propagate through many branches of the tree, arriving at leaf node N_j with probability $\alpha_j$, determined by multiplying the probabilities of all branches on the path to N_j.
Let the distributions for the leaf nodes be $N_j : \{t_i : \beta_{ij}\}$.
The overall distribution is $\{t_i : \sum_j \alpha_j \beta_{ij}\}$.
When the leaves carry distributions over the fuzzy sets f_i, a new case again arrives at each leaf node N_j with probability $\alpha_j$, the product of the probabilities of all branches on the path to N_j.
Let the distributions for the leaf nodes be $N_j : \{f_i : \beta_{ij}\}$.
The overall distribution is $\{f_i : \sum_j \alpha_j \beta_{ij}\}$.
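The propagation of a new case can be sketched as a recursive walk that multiplies branch probabilities on the way down and adds weighted leaf distributions. The nested-dictionary tree layout, the key names and the example numbers are assumptions made for this illustration only.

```python
from collections import defaultdict


def propagate(node, case, weight=1.0, totals=None):
    """Return the overall target distribution for a new case.

    node: either a leaf {"distribution": {target: prob}} or an internal
          node {"attribute": name, "branches": {fuzzy_set: child}}.
    case: {attribute: {fuzzy_set: probability of taking that branch}}.
    weight: product of branch probabilities on the path so far.
    """
    if totals is None:
        totals = defaultdict(float)
    if "distribution" in node:
        # Leaf: add its distribution, weighted by the path probability.
        for target, prob in node["distribution"].items():
            totals[target] += weight * prob
        return totals
    branch_probs = case[node["attribute"]]
    for fuzzy_set, child in node["branches"].items():
        p = branch_probs.get(fuzzy_set, 0.0)
        if p > 0.0:
            propagate(child, case, weight * p, totals)
    return totals


# Hypothetical one-level tree and case.
tree = {
    "attribute": "x",
    "branches": {
        "small": {"distribution": {"L": 0.8, "I": 0.2}},
        "large": {"distribution": {"L": 0.1, "I": 0.9}},
    },
}
case = {"x": {"small": 0.25, "large": 0.75}}
overall = dict(propagate(tree, case))
```

Here the case reaches the two leaves with probabilities 0.25 and 0.75, so the overall probability of L is 0.25 * 0.8 + 0.75 * 0.1.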
[Figure: profit example with income and outgoing universes on [0, 1], fuzzy set label small, tree node INCOME, and leaf annotation profit : 0.165]
Two crisp sets on each universe can give at most 50% accuracy.
We would require 16 crisp sets on each universe to give the same accuracy as a partition into two fuzzy sets.
Profit: 94.14% correct
Ellipse Example
[Figure: legal and illegal regions of the ellipse example, axes up to 1.5]
The X and Y universes are each partitioned into 5 fuzzy sets:
about_-1.5 = [-1.5:1, -0.75:0]
about_-0.75 = [-1.5:0, -0.75:1, 0:0]
about_0 = [-0.75:0, 0:1, 0.75:0]
about_0.75 = [0:0, 0.75:1, 1.5:0]
about_1.5 = [0.75:0, 1.5:1]
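The notation above can be read as lists of (value, membership) points joined by straight lines. The following sketch evaluates such a set by linear interpolation; the function name and the choice to return 0 outside the listed range are assumptions for illustration.

```python
def membership(x, points):
    """Piecewise-linear membership from a list of (value, membership) points.

    Outside the listed range the membership is assumed to be 0.
    """
    if x < points[0][0] or x > points[-1][0]:
        return 0.0
    for (x0, m0), (x1, m1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            # Linear interpolation between the two neighbouring points.
            return m0 + (m1 - m0) * (x - x0) / (x1 - x0)
    return 0.0


# The five fuzzy sets listed above, as (value, membership) points.
PARTITION = {
    "about_-1.5": [(-1.5, 1.0), (-0.75, 0.0)],
    "about_-0.75": [(-1.5, 0.0), (-0.75, 1.0), (0.0, 0.0)],
    "about_0": [(-0.75, 0.0), (0.0, 1.0), (0.75, 0.0)],
    "about_0.75": [(0.0, 0.0), (0.75, 1.0), (1.5, 0.0)],
    "about_1.5": [(0.75, 0.0), (1.5, 1.0)],
}

# Memberships of the point x = 0.25 in every set of the partition.
degrees = {name: membership(0.25, pts) for name, pts in PARTITION.items()}
```

For x = 0.25 only about_0 and about_0.75 have non-zero membership, and the two degrees add up to 1, which is what lets them be treated as branch probabilities.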
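[Figure: fuzzy decision tree for the ellipse example, with branches labelled by fuzzy sets such as about_0, about_0.75 and about_1.5, and leaf distributions over L (legal) and I (illegal). Branch labels and leaf values in extraction order: L:0.1352 I:0.8648; L:0.8131 I:0.1869; about_0 L:1 I:0; about_0.75 L:0.8178 I:0.1822; about_1.5 L:0.1327 I:0.8673; about_1.5 L:0.0109 I:0.9891; about_0.75 L:0.3629 I:0.6371; about_0 L:0.5090 I:0.4910; about_0.75 L:0.3455 I:0.6545; about_1.5 L:0.0131 I:0.9869]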
((0 0)(0.0092 0.0092)(0.3506 0.3506) (0.5090 0.5090)(0.3455 0.3455)(0.0131 0.0131) (0.1352 0.1352)(0.8131 0.8131)(1 1) (0.8178 0.8178)(0.1327 0.1327)(0.0109 0.0109) (0.3629 0.3629)(0.5090 0.5090)(0.3455 0.3455) (0.0131 0.0131)(0 0))
Results
The above tree was tested on 960 points forming a regular grid on $[-1.5, 1.5]^2$, giving 99.168% correct classification.
[Figure: the control surface for the positive quadrant]
Iris Classification
Data: 3 classes (Iris-Setosa, Iris-Versicolor and Iris-Virginica), 50 instances of each class.
Attributes:
1. sepal length in cm, universe [4.3, 7.9]
2. sepal width in cm, universe [2, 4.4]
3. petal length in cm, universe [1, 6.9]
4. petal width in cm, universe [0.1, 2.5]
Fuzzy partition of 5 fuzzy sets on each universe.
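One way to build a 5-set partition on a universe is sketched below. The lecture does not say where the peaks of the Iris fuzzy sets lie, so the evenly spaced triangular sets used here are purely an assumption, as are the function name and the set names for attribute 1.

```python
def triangular_partition(low, high, names):
    """Build triangular fuzzy sets with evenly spaced peaks on [low, high].

    Assumption: peaks are evenly spaced; the source does not specify this.
    Returns {name: [(value, membership), ...]} with neighbouring sets
    crossing so that memberships sum to 1 everywhere on the universe.
    """
    n = len(names)
    step = (high - low) / (n - 1)
    peaks = [low + i * step for i in range(n)]
    partition = {}
    for i, name in enumerate(names):
        points = []
        if i > 0:
            points.append((peaks[i - 1], 0.0))  # left foot
        points.append((peaks[i], 1.0))          # peak
        if i < n - 1:
            points.append((peaks[i + 1], 0.0))  # right foot
        partition[name] = points
    return partition


# Hypothetical set names for attribute 1 (sepal length, universe [4.3, 7.9]).
names = ["v_small1", "small1", "med1", "large1", "v_large1"]
sepal_length_sets = triangular_partition(4.3, 7.9, names)
```

With this assumption the peaks for sepal length fall at roughly 4.3, 5.2, 6.1, 7.0 and 7.9 cm; a data-driven choice of peaks would change only the `peaks` list.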
[Figure: fuzzy decision tree for the Iris data. Nodes are attribute numbers, branches are fuzzy sets or groups of fuzzy sets such as v_small1, med2 and {small1, med1, large1}, and leaves carry three-class distributions such as (0 0.95 0.05) and (0 0 1).]
The decision tree was generated to a maximum depth of 4, giving a tree of 161 branches. This gave an accuracy of 81.25% on the training set and 79.9% on the test set.
Diabetes Tree
[Figure: fuzzy decision tree for the diabetes data. Nodes are attribute numbers (2, 3, 5, 6, 7 and 8 appear), branches are fuzzy sets or groups such as v_small2, medium8 and {medium3, large3}, and leaves carry distributions over nd and d such as (nd:0.99 d:0.01) and (nd:0.22 d:0.78).]
= [-1:1, 0:0]
= [-1:0, 0:1, 0.380647:0]
= [0:0, 0.380647:1, 0.822602:0]
= [0.380647:0, 0.822602:1, 1:0]
= [0.822602:0, 1:1]
[Figure: control surface]