Classification Using Decision Tree on Audit Dataset Through R
Abstract
Introduction
Methodology
Dataset Description
Result Snapshot
Conclusion
References
1. Abstract:
Classification is a type of supervised learning. It is a data mining technique that specifies the
class to which data elements belong and predicts a class for a new input. It is mainly used
when the output takes finite, discrete values. Classification is used to predict group
membership for data instances. It has a wide range of applications, such as medical disease
diagnosis, credit card rating, artificial intelligence, and document categorization.
The Decision Tree algorithm falls under supervised learning, which is a part of machine
learning. It is a classic tree-structured algorithm used for both classification and
regression. It can be used with both categorical and numerical data. Categorical data
represent attributes such as name or gender, whereas numerical data represent attributes
such as age or temperature.
For a decision tree, input data is given to the algorithm and a corresponding tree structure
is generated, which helps in decision making. A decision tree is easy to understand, useful in
data exploration, and requires relatively little data cleaning.
This research paper uses the Audit dataset for classification. It applies the Decision Tree
algorithm to the Audit dataset to find the results and obtain the decision tree for the
dataset. Decision Tree algorithms are suitable for both categorical and numerical data, so
this research focuses mainly on the Decision Tree classification algorithm as applied to the
Audit dataset.
Keywords:
Data mining, Naïve Bayes, Decision tree, Classification, Classification Techniques.
2. Introduction:
In order to discover useful knowledge from a given dataset, a data miner applies data mining
algorithms. The data miner is able to find various kinds of information underlying the data.
A decision tree is a flowchart-like tree structure in which each internal node represents a
test on an attribute, each branch represents an outcome of the test, and each external (leaf)
node holds a class label. Given a tuple X, the attribute values of the tuple are tested against
the decision tree. A path is traced from the root node to a leaf node, which gives the
predicted class for that tuple. A decision tree is easy to convert into classification rules.
It is a predictive model that maps observations about an item to conclusions about the item's
target value.
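The path tracing and rule conversion described above can be sketched in a few lines. This is a minimal illustration in Python (the paper itself uses R), not the paper's implementation. The `Leaf` and `Node` classes, the thresholds, and the class labels are all hypothetical; only the attribute names `Score_A` and `Money Value` come from the Audit dataset columns.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class label held by an external (leaf) node


@dataclass
class Node:
    attribute: str               # attribute tested at this internal node
    threshold: float             # the test is: value <= threshold
    left: Union["Node", Leaf]    # branch taken when the test is true
    right: Union["Node", Leaf]   # branch taken when the test is false


def predict(tree, x):
    """Trace a path from the root to a leaf and return the leaf's label."""
    while isinstance(tree, Node):
        tree = tree.left if x[tree.attribute] <= tree.threshold else tree.right
    return tree.label


def rules(tree, conditions=()):
    """Turn every root-to-leaf path into one IF-THEN classification rule."""
    if isinstance(tree, Leaf):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN class = {tree.label}"]
    test = f"{tree.attribute} <= {tree.threshold}"
    negated = f"{tree.attribute} > {tree.threshold}"
    return (rules(tree.left, conditions + (test,))
            + rules(tree.right, conditions + (negated,)))


# Hypothetical tree: thresholds and labels are invented for illustration.
tree = Node("Score_A", 0.3,
            Leaf("no risk"),
            Node("Money Value", 5.0, Leaf("no risk"), Leaf("risk")))

x = {"Score_A": 0.6, "Money Value": 12.0}  # hypothetical tuple X
print(predict(tree, x))
for rule in rules(tree):
    print(rule)
```

Each leaf yields exactly one rule, so this three-leaf tree produces three rules, and the tuple above follows the path that ends in the `risk` leaf.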
There are two different types of decision tree. Common decision tree terminology includes:
Pruning: The process of removing sub-nodes from the tree is called pruning.
Branch/Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
Parent and Child Node: A node that divides into sub-nodes is called the parent node of those
sub-nodes, and the sub-nodes are the child nodes of that parent.
3. Methodology:
This research was implemented in R, using a decision tree to predict the result set of the
given dataset. It tests the data of the Audit dataset and finds the result for the dataset.
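The paper does not state which splitting criterion was used to grow the tree. As a hedged illustration, the Python sketch below assumes Gini impurity, a common choice for classification trees, and shows how a threshold on one numerical attribute could be selected. The function names and the sample values are hypothetical and are not taken from the Audit dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue                      # equal values cannot be separated
        threshold = (lo + hi) / 2         # midpoint between neighbouring values
        left = [y for v, y in pairs if v <= threshold]
        right = [y for v, y in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical attribute values and class labels (1 = risky firm, 0 = not risky).
score_a = [0.2, 0.2, 0.4, 0.6, 0.6, 0.6]
risk = [0, 0, 0, 1, 1, 1]
print(best_threshold(score_a, risk))
```

On this toy data the split at 0.5 separates the two classes perfectly, so its weighted impurity is 0. A full tree builder would repeat this search over every attribute and recurse into each resulting subset.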
4. Dataset Description:
Columns:
PARA_A
Score_B
Numbers
Money Value
District Loss
History
SCORE
Detection Risk
Sector Score
Score_A
Risk_B
Score_C
Score_MV
Prob
PROB
Inherent Risk
Audit Risk
Location ID
Risk_A
Total
Risk_C
Risk_D
Risk_E
Risk_F
Control Risk
Risk
6. Result Snapshot:
7. Conclusion:
This research paper presents a case study of an audit company in India. The case study
examines the application of machine learning techniques (classification) to predict
fraudulent firms during audit planning. It applies the Decision Tree algorithm to the Audit
dataset and obtains a tree as the output for the dataset.
The Decision Tree algorithm is not stable, because small changes in the data can cause large
changes in the tree structure, so future work will use a neural network.
8. References:
[1] Hooda, Nishtha, Seema Bawa, and Prashant Singh Rana. "Fraudulent Firm Classification: A
Case Study of an External Audit." Applied Artificial Intelligence 32.1 (2018): 48-64.
[7] Rish, Irina. "An empirical study of the naive Bayes classifier." IJCAI 2001 workshop on
empirical methods in artificial intelligence. Vol. 3. No. 22. New York: IBM, 2001.