CSET301 LabW8L2

Decision Tree Visualization

Decision Tree
A decision tree is one of the most powerful and popular tools for classification and prediction. It is a flowchart-like tree structure in which each internal node tests a feature against a threshold (chosen to best separate the classes), each branch represents an outcome of that test, and each leaf node holds a class label, as sketched below.
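As a minimal sketch of this structure, the top of the Iris tree trained later in this notebook reduces to nested threshold tests (thresholds taken from the exported tree below; the two lower branches are simplified to their majority classes):

# Hand-written fragment mirroring the top of the fitted Iris tree.
# Feature order: [sepal length, sepal width, petal length, petal width].
def classify_iris(sample):
    if sample[2] <= 2.45:        # internal node: threshold test on petal length
        return 0                 # leaf: setosa
    elif sample[3] <= 1.75:      # branch: petal width test
        return 1                 # simplified leaf: mostly versicolor
    else:
        return 2                 # simplified leaf: mostly virginica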

In [1]:
from matplotlib import pyplot as plt # For plotting
from sklearn import datasets # For loading standard datasets
from sklearn.tree import DecisionTreeClassifier # To run decision tree model
from sklearn import tree # To visualize decision trees

Iris Dataset Description:


Classes: 3
Samples per class: 50
Samples total: 150
Dimensionality: 4
Source: https://archive.ics.uci.edu/ml/datasets/iris

Quick Tip: sklearn.datasets ships several toy datasets; the package also has helpers to fetch larger datasets commonly used by the machine learning community, as sketched below.
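For instance (a sketch only, not run in this lab; it downloads data on first call), fetch_openml pulls a named dataset from openml.org:

# Fetch a larger community dataset by name (requires internet access)
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
print(mnist.data.shape)  # expected: (70000, 784)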

In [2]:
# Prepare the data
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [3]:
# Initialize the model
clf = tree.DecisionTreeClassifier()
# Fit the model
clf.fit(iris.data, iris.target)

Out[3]: DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',
                               max_depth=None, max_features=None, max_leaf_nodes=None,
                               min_impurity_decrease=0.0, min_impurity_split=None,
                               min_samples_leaf=1, min_samples_split=2,
                               min_weight_fraction_leaf=0.0, presort='deprecated',
                               random_state=None, splitter='best')

Task
Train your own decision tree and experiment with the following hyper-parameters, then state your observations for at least 15 different hyper-parameter settings. The following are only some of the parameters:

Must read: https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html


max_depth: The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain fewer than min_samples_split samples.
min_samples_split: The minimum number of samples required to split an internal node.
min_samples_leaf: The minimum number of samples required to be at a leaf node. This may have the effect of smoothing the model, especially in regression.
random_state: Controls the randomness of the estimator.
Write a function to calculate the accuracy (a minimal sketch follows this task description).

Print the accuracy for each hyper-parameter setting used, in the following format:
1. PARAMS[random_state=1, max_depth=....] , Accuracy=0.97
2. PARAMS[random_state=42, min_samples_split=....] , Accuracy=0.94
...
Perform the same set of activities on a different dataset: https://gist.github.com/kudaliar032/b8cf65d84b73903257ed603f6c1a2508
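A minimal sketch of the requested accuracy function, assuming y_true and y_pred are equal-length array-likes of class labels:

import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))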

In [4]:
# Initialise and then fit the classifier
clf = tree.DecisionTreeClassifier()
clf.fit(X, y)

Out[4]: DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',
                               max_depth=None, max_features=None, max_leaf_nodes=None,
                               min_impurity_decrease=0.0, min_impurity_split=None,
                               min_samples_leaf=1, min_samples_split=2,
                               min_weight_fraction_leaf=0.0, presort='deprecated',
                               random_state=None, splitter='best')

In [5]:
# Gives a text representation of the trained decision tree
text_representation = tree.export_text(clf)
print(text_representation)

|--- feature_2 <= 2.45
|   |--- class: 0
|--- feature_2 >  2.45
|   |--- feature_3 <= 1.75
|   |   |--- feature_2 <= 4.95
|   |   |   |--- feature_3 <= 1.65
|   |   |   |   |--- class: 1
|   |   |   |--- feature_3 >  1.65
|   |   |   |   |--- class: 2
|   |   |--- feature_2 >  4.95
|   |   |   |--- feature_3 <= 1.55
|   |   |   |   |--- class: 2
|   |   |   |--- feature_3 >  1.55
|   |   |   |   |--- feature_0 <= 6.95
|   |   |   |   |   |--- class: 1
|   |   |   |   |--- feature_0 >  6.95
|   |   |   |   |   |--- class: 2
|   |--- feature_3 >  1.75
|   |   |--- feature_2 <= 4.85
|   |   |   |--- feature_1 <= 3.10
|   |   |   |   |--- class: 2
|   |   |   |--- feature_1 >  3.10
|   |   |   |   |--- class: 1
|   |   |--- feature_2 >  4.85
|   |   |   |--- class: 2

In [6]:
# Save the above text representation to a file
with open("decision_tree.log", "w") as fout:
    fout.write(text_representation)
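For more readable output, export_text also accepts the real feature names; a small variation on the cell above:

# Same text export, labelled with Iris feature names instead of feature_0..feature_3
print(tree.export_text(clf, feature_names=iris.feature_names))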

How to Visualize Decision Trees using Matplotlib


Scikit-learn version >=0.21.0 allows Decision Trees to be plotted with matplotlib using 'sklearn.tree.plot_tree'

In [7]:
# Visualize the results using sklearn's plot_tree
# See the documentation for modifying fonts: https://scikit-learn.org/stable/modules/generated/sklearn.tree.plot_tree.html
fig = plt.figure(figsize=(25, 20))
_ = tree.plot_tree(clf,
                   feature_names=iris.feature_names,
                   class_names=iris.target_names,
                   filled=True)

In the figure above, the color of each node represents its majority class.

In [8]:
# TODO: Write accuracy function here
import sklearn.metrics as metrics
from sklearn.model_selection import train_test_split

# Hold out 25% of the data for testing
X_s_train, X_s_test, y_s_train, y_s_test = train_test_split(X, y, test_size=0.25, random_state=6)
y_pred = clf.predict(X_s_test)

print("Accuracy:", metrics.accuracy_score(y_s_test, y_pred))

Accuracy: 1.0
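Note that clf was fitted on all 150 samples in In [4], so scoring it on a subset of that same data is optimistic (hence the perfect accuracy). A leakage-free sketch fits a fresh tree on the training split only:

# Fit on the training split, then score on the held-out split
clf_holdout = DecisionTreeClassifier(random_state=6)
clf_holdout.fit(X_s_train, y_s_train)
print("Held-out accuracy:", metrics.accuracy_score(y_s_test, clf_holdout.predict(X_s_test)))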

In [9]:
# TODO: Print 15 hyperparam settings along with accuracy
import sklearn.metrics as metrics

# One fixed split, reused for every setting (random_state=6 keeps it reproducible)
X_s_train, X_s_test, y_s_train, y_s_test = train_test_split(X, y, test_size=0.25, random_state=6)

for i in range(5, 13):
    clf1 = DecisionTreeClassifier(criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=i)
    clf1.fit(X, y)
    y_pred = clf1.predict(X_s_test)
    print('PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=' + str(i) + ']',
          'Accuracy:', metrics.accuracy_score(y_s_test, y_pred))

for i in range(35, 43):
    clf1 = DecisionTreeClassifier(criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=i)
    clf1.fit(X, y)
    y_pred = clf1.predict(X_s_test)
    print('PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=' + str(i) + ']',
          'Accuracy:', metrics.accuracy_score(y_s_test, y_pred))

PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=5] Accuracy: 0.9473684210526315
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=6] Accuracy: 1.0
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=7] Accuracy: 0.9736842105263158
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=8] Accuracy: 0.9210526315789473
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=9] Accuracy: 0.9473684210526315
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=10] Accuracy: 0.9473684210526315
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=11] Accuracy: 1.0
PARAMS[criterion="gini", splitter="random", max_leaf_nodes=10, min_samples_leaf=5, max_depth=12] Accuracy: 0.9210526315789473
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=35] Accuracy: 0.9210526315789473
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=36] Accuracy: 0.8947368421052632
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=37] Accuracy: 0.9210526315789473
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=38] Accuracy: 0.868421052631579
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=39] Accuracy: 0.9736842105263158
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=40] Accuracy: 0.8157894736842105
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=41] Accuracy: 0.9210526315789473
PARAMS[criterion="entropy", splitter="random", min_samples_split=4, max_leaf_nodes=5, min_samples_leaf=5, max_depth=42] Accuracy: 0.9210526315789473
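The sweep above can also be written by iterating over parameter dictionaries, which makes the requested numbered PARAMS[...] format easy to print; a sketch using the same train/test split:

# Sketch: list the settings once, then fit and report each in turn
settings = [
    {"random_state": 1, "max_depth": 3},
    {"random_state": 42, "min_samples_split": 4},
    {"criterion": "entropy", "max_leaf_nodes": 5},
    # ... extend to at least 15 settings
]
for idx, params in enumerate(settings, start=1):
    model = DecisionTreeClassifier(**params)
    model.fit(X_s_train, y_s_train)
    acc = metrics.accuracy_score(y_s_test, model.predict(X_s_test))
    print(f"{idx}. PARAMS[{params}], Accuracy={acc:.2f}")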

In [10]:
# Save the figure
fig.savefig("decision_tree.png")

How to visualize decision trees using graphviz


If you get a runtime error with graphviz, refer to

https://stackoverflow.com/questions/35064304/runtimeerror-make-sure-the-graphviz-executables-are-on-your-systems-path-aft

Graph visualization is a way of representing structural information as diagrams of abstract graphs and networks.

In [11]:
import graphviz

# DOT data - graphviz accepts data in DOT format, so we convert our tree into a compatible format
dot_data = tree.export_graphviz(clf, out_file=None,
                                feature_names=iris.feature_names,
                                class_names=iris.target_names,
                                filled=True)

# Draw graph
graph = graphviz.Source(dot_data, format="png")
graph

Out[11]: [Rendered graphviz tree. Root: petal length (cm) <= 2.45, gini = 0.667, samples = 150, value = [50, 50, 50]. The True branch is a pure setosa leaf (50 samples); the False branch splits on petal width (cm) <= 1.75, then on petal length (cm) <= 4.95 / 4.85, petal width (cm) <= 1.65 / 1.55, sepal width (cm) <= 3.1, and sepal length (cm) <= 6.95, ending in pure leaves labelled versicolor and virginica.]

In [12]:
graph.render("decision_tree_graphviz")

Out[12]: 'decision_tree_graphviz.png'

Resources
https://mljar.com/blog/visualize-decision-tree/ (source code)
https://towardsdatascience.com/visualizing-decision-trees-with-python-scikit-learn-graphviz-matplotlib-1c50b4aa68dc
https://explained.ai/decision-tree-viz/
https://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html

In [ ]:
