Fuzzy Decision Trees

This document describes fuzzy decision trees and their use for classification and prediction problems. It discusses how continuous attributes are handled by partitioning them into fuzzy sets and translating the problem into a classification task. Decision trees are constructed using a modified ID3 algorithm and can be used to classify or predict new cases based on propagating probabilities at each node. Several examples are provided to illustrate the approach, including for ellipse classification, iris classification, diabetes prediction, and predicting the sine of xy values.

Professor J. F. Baldwin

Classification and Prediction


For classification the universe of the target attribute T is a discrete set. For prediction the universe of the target attribute is continuous.

For prediction, use a fuzzy partition f1, f2, f3, f4, f5 of the target universe, defined over the points a, b, c, d, e. Arrange the fuzzy sets so that there are an equal number of training data points in each of the intervals [a, b], [b, c], [c, d], [d, e].

(Figure: triangular fuzzy sets f1 to f5 on the target universe T, with peaks at a, b, c, d, e.)

Target translation for prediction


Training set Tr (one row shown):

A1     A2     ...   An     T     Pr
a11    a12    ...   a1n    t1    p1

Translated training set Tr'. This is now a classification set: each row of Tr gives two rows of Tr', one for each of the two target fuzzy sets fi, fi+1 with non-zero membership at t1:

A1     A2     ...   An     T       Pr
a11    a12    ...   a1n    fi      p1 fi(t1)
a11    a12    ...   a1n    fi+1    p1 fi+1(t1)

Repeat for each row, collecting equivalent rows and adding their probabilities.
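As a rough sketch of this translation step (the partition points and function names below are illustrative assumptions, not notation from the notes), each target value t has at most two non-zero memberships fi(t), fi+1(t), and the row's probability is split accordingly:

```python
# Sketch of the target translation step; names and partition points are
# illustrative assumptions, not taken from the notes.

def triangular_memberships(t, points):
    """Memberships of t in a triangular fuzzy partition f0, f1, ... whose
    peaks sit at the sorted points; at most two memberships are non-zero."""
    for i in range(len(points) - 1):
        lo, hi = points[i], points[i + 1]
        last = (i == len(points) - 2)
        if lo <= t < hi or (last and t == hi):
            w = (t - lo) / (hi - lo)
            return {k: m for k, m in ((i, 1.0 - w), (i + 1, w)) if m > 0.0}
    return {}

def translate_row(attrs, t, p, target_points):
    """Map one row (a1,...,an, t, p) of Tr to rows of the classification set Tr'."""
    return [(attrs, f_index, p * mu)
            for f_index, mu in triangular_memberships(t, target_points).items()]

# Example: target universe [a, e] = [0, 4] with b = 1, c = 2, d = 3.
print(translate_row(("a11", "a12"), 1.3, 1.0, [0, 1, 2, 3, 4]))
# -> [(('a11', 'a12'), 1, ~0.7), (('a11', 'a12'), 2, ~0.3)]
```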

Preparing one attribute reduced database for continuous attribute


For a continuous attribute Ai, work from Tr' if prediction, Tr if classification. Sum out all the other attributes to obtain the marginal distribution over Ai and the target:

Pr(Ai, T) = Σ over A1, ..., A(i-1), A(i+1), ..., An of Pr(A1, ..., An, T)

Choose the number of fuzzy sets for Ai, say g1, g2, g3, g4, g5, defined over points a, b, c, d, e on the universe of Ai, with an equal number of data points in each interval. Expressing Ai over the gi gives the one-attribute reduced database with columns Ai, T, Pr.

(Figure: triangular fuzzy sets g1 to g5 on the universe of Ai, with peaks at a, b, c, d, e.)
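A minimal sketch of the same idea in code, reusing the triangular_memberships helper from the earlier sketch: place the partition points by equal frequency, then sum out the other attributes to obtain Pr(Ai, T). The function names are illustrative, not from the notes.

```python
from collections import defaultdict

def equal_frequency_points(values, n_sets=5):
    """Place n_sets peak points a, b, c, d, e so that neighbouring points
    enclose (roughly) equal numbers of the observed attribute values."""
    ordered = sorted(values)
    return [ordered[round(k * (len(ordered) - 1) / (n_sets - 1))]
            for k in range(n_sets)]

def reduced_database(rows, i, points):
    """Marginalise Pr(A1, ..., An, T) down to Pr(Ai, T), expressing the
    continuous attribute Ai over the fuzzy sets g1, ..., g5."""
    pr = defaultdict(float)
    for attrs, target, p in rows:                      # rows of Tr or Tr'
        for g, mu in triangular_memberships(attrs[i], points).items():
            pr[(g, target)] += p * mu                  # sums out the other attributes
    return dict(pr)
```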

Fuzzy ID3
Using the training set Tr' and the one-attribute reduced database for all continuous attributes, we can use the method of ID3 previously given to determine the decision tree for predicting or classifying the target, together with post pruning.

We modify the stopping condition. Do not expand node N if

S = -Σ over T of Pr(T) ln Pr(T)

for that node is less than some value v. Node N will have a probability distribution {gi : αi}. You can also limit the depth of the tree to some value, for example expand the tree to depth 4.
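Read as an entropy threshold, the stopping test might be sketched as follows, assuming each node stores its target distribution as a dictionary of probabilities (the threshold value 0.3 is only an example):

```python
import math

def node_entropy(target_dist):
    """S = -sum over T of Pr(T) ln Pr(T) for a node's target distribution."""
    return -sum(p * math.log(p) for p in target_dist.values() if p > 0.0)

def should_expand(target_dist, v=0.3, depth=0, max_depth=4):
    """Expand node N only while its entropy is at least v and the depth limit allows."""
    return node_entropy(target_dist) >= v and depth < max_depth

# A nearly pure node is not expanded (S is about 0.13 here):
print(should_expand({"legal": 0.97, "illegal": 0.03}))   # False
```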

Evaluating new case for classification


(Figure: fuzzy sets g1, g2, ..., gn on the universe of Ai.)

An attribute value for a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.

A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability αj, determined by multiplying the probabilities of all branches on the path to Nj. Let the distributions for the leaf nodes be Nj : {ti : βij}. The overall distribution is

{ti : Σj αj βij}

Decision: choose tk where MAXi {Σj αj βij} = Σj αj βkj.
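A sketch of this evaluation step, under the assumption that the tree is stored as nested ('split', attribute, children) and ('leaf', distribution) tuples; the toy tree and its numbers are illustrative only:

```python
from collections import defaultdict

def classify(node, case):
    """Propagate a new case through the tree.  node is ('leaf', {t: beta}) or
    ('split', attr, {g: child}); case[attr] is a distribution over {gi}
    with at most two non-zero probabilities for a continuous attribute."""
    overall = defaultdict(float)

    def walk(n, alpha):                       # alpha: probability of reaching n
        if n[0] == "leaf":
            for t, beta in n[1].items():
                overall[t] += alpha * beta    # alpha_j * beta_ij
        else:
            _, attr, children = n
            for g, p in case[attr].items():   # branch probabilities
                if p > 0.0 and g in children:
                    walk(children[g], alpha * p)

    walk(node, 1.0)
    return max(overall, key=overall.get), dict(overall)

# Toy tree splitting on X and then Y (structure and numbers are illustrative):
tree = ("split", "X", {
    "about_0": ("leaf", {"legal": 1.0, "illegal": 0.0}),
    "about_0.75": ("split", "Y", {
        "about_0": ("leaf", {"legal": 0.509, "illegal": 0.491}),
        "about_0.75": ("leaf", {"legal": 0.3455, "illegal": 0.6545})})})
case = {"X": {"about_0": 0.6, "about_0.75": 0.4}, "Y": {"about_0": 1.0}}
print(classify(tree, case))   # ('legal', {'legal': ~0.80, 'illegal': ~0.20})
```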

Evaluating new case for prediction


(Figure: fuzzy sets g1, g2, ..., gn on the universe of Ai.)

As for classification, an attribute value for a continuous attribute will have a probability distribution over {gi}, with only two non-zero probabilities.

A new case will propagate through many branches of the tree, arriving at leaf node Nj with probability αj, determined by multiplying the probabilities of all branches on the path to Nj. Let the distributions for the leaf nodes be Nj : {fi : βij}. The overall distribution is

{fi : Σj αj βij}

predicted value = Σi v(fi) (Σj αj βij)

where v(fi) is the defuzzified value (for example the peak) of the target fuzzy set fi.
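Only the final step differs from classification: the combined distribution over the target fuzzy sets is defuzzified. A small sketch, taking v(fi) to be the peak point of each fi (one possible choice, assumed here):

```python
def predict(overall, centres):
    """overall: {f_i: sum_j alpha_j * beta_ij};  centres: {f_i: v(f_i)},
    taken here as the peak point of each target fuzzy set (an assumption)."""
    return sum(centres[f] * p for f, p in overall.items())

# e.g. with the sin XY target classes and their peak points:
centres = {"class_1": -1.0, "class_2": 0.0, "class_3": 0.380647,
           "class_4": 0.822602, "class_5": 1.0}
print(predict({"class_3": 0.7, "class_4": 0.3}, centres))   # about 0.513
```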

Fuzzy Sets important for Data Mining


Partition each universe (INCOME and OUTGOING) with two fuzzy sets {small, large}.

(Figure: decision tree on INCOME and OUTGOING with leaf values large profit: 0.874, profit: 0.543, profit: 0.543, profit: 0.165, together with the learnt profit surface over income and outgoing, 94.14% correct.)

Two crisp sets on each universe can give at most only 50% accuracy. We would require 16 crisp sets on each universe to give the same accuracy as a two-fuzzy-set partition.
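The gain over crisp sets comes from the graded memberships themselves; a tiny sketch, assuming income has been normalised to [0, 1]:

```python
def small_large(x):
    """Memberships of a normalised value x in the partition {small, large} of [0, 1]."""
    return {"small": 1.0 - x, "large": x}

print(small_large(0.8))   # {'small': ~0.2, 'large': 0.8}, a graded value, not a single bit
```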

Ellipse Example
(Figure: the legal region (inside the ellipse) and illegal region (outside) on [-1.5, 1.5]².)

X, Y universes each partitioned into 5 fuzzy sets:

about_-1.5  = [-1.5:1, -0.75:0]
about_-0.75 = [-1.5:0, -0.75:1, 0:0]
about_0     = [-0.75:0, 0:1, 0.75:0]
about_0.75  = [0:0, 0.75:1, 1.5:0]
about_1.5   = [0.75:0, 1.5:1]

Tree learnt on 126 random points from [-1.5, 1.5]².
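Each about_ set is written as a list of point:membership pairs, so a membership value for any x can be recovered by linear interpolation, as in this sketch (the helper name is an assumption):

```python
def membership(x, pairs):
    """Linearly interpolate a membership from sorted (point, value) pairs.
    Outside the listed points 0 is returned, so the end sets about_-1.5 and
    about_1.5 would need their flat shoulders handled separately if required."""
    for (p0, m0), (p1, m1) in zip(pairs, pairs[1:]):
        if p0 <= x <= p1:
            return m0 + (m1 - m0) * (x - p0) / (p1 - p0)
    return 0.0

about_0 = [(-0.75, 0.0), (0.0, 1.0), (0.75, 0.0)]
print(membership(0.3, about_0))   # ~0.6
```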

Tree for Ellipse example


(Figure: decision tree branching on X and, within each X branch, on Y, with a distribution over legal (L) and illegal (I) at each leaf. The leaf distributions are:)

X branch      Y branch      L        I
about_-1.5    (any)         0        1
about_-0.75   about_-1.5    0.0092   0.9908
about_-0.75   about_-0.75   0.3506   0.6494
about_-0.75   about_0       0.5090   0.4910
about_-0.75   about_0.75    0.3455   0.6545
about_-0.75   about_1.5     0.0131   0.9869
about_0       about_-1.5    0.1352   0.8648
about_0       about_-0.75   0.8131   0.1869
about_0       about_0       1        0
about_0       about_0.75    0.8178   0.1822
about_0       about_1.5     0.1327   0.8673
about_0.75    about_-1.5    0.0109   0.9891
about_0.75    about_-0.75   0.3629   0.6371
about_0.75    about_0       0.5090   0.4910
about_0.75    about_0.75    0.3455   0.6545
about_0.75    about_1.5     0.0131   0.9869
about_1.5     (any)         0        1

General Fril Rule


((Classification = legal) if (
  ((X is about_-1.5))
  ((X is about_-0.75) & (Y is about_-1.5))
  ((X is about_-0.75) & (Y is about_-0.75))
  ((X is about_-0.75) & (Y is about_0))
  ((X is about_-0.75) & (Y is about_0.75))
  ((X is about_-0.75) & (Y is about_1.5))
  ((X is about_0) & (Y is about_-1.5))
  ((X is about_0) & (Y is about_-0.75))
  ((X is about_0) & (Y is about_0))
  ((X is about_0) & (Y is about_0.75))
  ((X is about_0) & (Y is about_1.5))
  ((X is about_0.75) & (Y is about_-1.5))
  ((X is about_0.75) & (Y is about_-0.75))
  ((X is about_0.75) & (Y is about_0))
  ((X is about_0.75) & (Y is about_0.75))
  ((X is about_0.75) & (Y is about_1.5))
  ((X is about_1.5)))) :

((0 0) (0.0092 0.0092) (0.3506 0.3506)
 (0.5090 0.5090) (0.3455 0.3455) (0.0131 0.0131)
 (0.1352 0.1352) (0.8131 0.8131) (1 1)
 (0.8178 0.8178) (0.1327 0.1327) (0.0109 0.0109)
 (0.3629 0.3629) (0.5090 0.5090) (0.3455 0.3455)
 (0.0131 0.0131) (0 0))

Results

The above tree was tested on 960 points forming a regular grid on [-1.5, 1.5]², giving 99.168% correct classification.

(Figure: the control surface for the positive quadrant.)

Iris Classification
Data: 3 classes, Iris-Setosa, Iris-Versicolor and Iris-Virginica, with 50 instances of each class.

Attributes:
1. sepal length in cm, universe [4.3, 7.9]
2. sepal width in cm, universe [2, 4.4]
3. petal length in cm, universe [1, 6.9]
4. petal width in cm, universe [0.1, 2.5]

Fuzzy partition of 5 fuzzy sets on each universe.

Iris Decision tree


(Figure: fuzzy decision tree for the iris data, branching mainly on attributes 3 and 4 (petal length and petal width), with attributes 1 and 2 used deeper in the tree, and a probability distribution over the three classes at each leaf, for example v_small3 → (1 0 0) and v_large3 → (0 0 1).)

Gives 98.667% accuracy on test data.

Diabetes in Pima Indians


Diabetes mellitus in the Pima Indian population living near Phoenix, Arizona. 5 fuzzy sets used for each attribute.

Data: 768 females over 21 years, split into 384 training and 384 test cases, with two classes (not diabetic, diabetic).

Attributes:
1. Number of times pregnant
2. Plasma glucose concentration
3. Diastolic blood pressure
4. Triceps skin fold thickness
5. 2-hour serum insulin
6. Body mass index
7. Diabetes pedigree function
8. Age

The decision tree was generated to a maximum depth of 4, giving a tree of 161 branches. This gave an accuracy of 81.25% on the training set and 79.9% on the test set. With the forward pruning algorithm the tree complexity is halved to 80 branches; this reduced tree gives an accuracy of 80.46% on the training set and 78.38% on the test set. Post pruning reduces the complexity to 28 branches, giving 78.125% on the training set and 78.9% on the test set.

Diabetes Tree
(Figure: fuzzy decision tree for the Pima Indian problem, branching on attribute 2 (plasma glucose concentration) at the root and then on attributes 3, 5, 6, 7 and 8 deeper in the tree, with a distribution over not diabetic (nd) and diabetic (d) at each leaf, for example nd:0.99 d:0.01.)

Decision Tree for Pima Indian Problem

SIN XY Prediction Example


X, Y universes each partitioned into 10 fuzzy sets on [0, 3]:

about_0      = [0:1, 0.333333:0]
about_0.3333 = [0:0, 0.333333:1, 0.666667:0]
about_0.6667 = [0.333333:0, 0.666667:1, 1:0]
about_1      = [0.666667:0, 1:1, 1.33333:0]
about_1.333  = [1:0, 1.33333:1, 1.66667:0]
about_1.667  = [1.33333:0, 1.66667:1, 2:0]
about_2      = [1.66667:0, 2:1, 2.33333:0]
about_2.333  = [2:0, 2.33333:1, 2.66667:0]
about_2.6667 = [2.33333:0, 2.66667:1, 3:0]
about_3      = [2.66667:0, 3:1]

The database consists of 528 triples (X, Y, sin XY) where the pairs (X, Y) form a regular grid on [0, 3]².

The target sin XY is partitioned into 5 fuzzy classes:

class_1 = [-1:1, 0:0]
class_2 = [-1:0, 0:1, 0.380647:0]
class_3 = [0:0, 0.380647:1, 0.822602:0]
class_4 = [0.380647:0, 0.822602:1, 1:0]
class_5 = [0.822602:0, 1:1]
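A short sketch of how such a database could be generated; the 22 by 24 grid shape is an assumption, since the notes give only the total of 528 triples and the range [0, 3]:

```python
import math

# Sketch of the (X, Y, sin XY) database: 528 triples with the pairs (X, Y)
# on a regular grid over [0, 3]^2.  The 22 x 24 grid shape is an assumption.
xs = [3.0 * i / 21 for i in range(22)]
ys = [3.0 * j / 23 for j in range(24)]
triples = [(x, y, math.sin(x * y)) for x in xs for y in ys]
print(len(triples))   # 528
```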

Fuzzy ID3 decision tree with 100 branches


Percentage error of 4.22% on a regular test set of 1023 points.

(Figure: the learnt control surface for sin XY.)
