
DWDM LAB Manual SVEC-16

This document describes two experiments conducted in Weka to preprocess data and create a decision tree. The first experiment uses various filters to handle missing data through marking, removing, and imputing missing values. The second experiment trains a decision tree classifier on bank data using the J48 algorithm to classify customers.

Uploaded by

Pottli Siddhu

Title: WEKA SVEC/CSE/EXPT-DWDM

SREE VIDYANIKETHAN ENGINEERING COLLEGE(AUTONOMOUS)


Sree Sainath Nagar, A. Rangampet – 517 102

Department of Computer Science and Engineering


III B. Tech – II Semester
DATA WAREHOUSING AND DATA MINING LAB (16BT61531)

WEKA----WEEK 1
Aim:
To preprocess data in Weka with simple experiments:
a) Handling missing data (both nominal and numerical)
b) All types of normalization (min-max, z-score, decimal scaling)
c) Sampling.
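
The three normalization methods named in the aim can be sketched in plain Python as below. This is only an illustration of the formulas, not Weka's code; the function names and the sample values are made up for this example. In Weka itself, min-max scaling and z-score scaling are available as the Normalize and Standardize filters under unsupervised.attribute.

```python
import math

def min_max(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]

def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def decimal_scaling(values):
    """Divide by 10**j, where j is the smallest integer with max(|v|) / 10**j < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]

# Made-up sample values, for illustration only.
sample = [200.0, 300.0, 400.0, 600.0, 1000.0]
print(min_max(sample))
print(z_score(sample))
print(decimal_scaling(sample))
```
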
DESCRIPTION:

A) Mark Missing Values


1. Open the Weka Explorer.

2. Load the Pima Indians onset of diabetes dataset.

3. Click the “Choose” button for the Filter and select NumericCleaner; it is under unsupervised.attribute.NumericCleaner.

Weka Select Numeric Cleaner Data Filter

4. Click on the filter to configure it.

5. Set attributeIndices to 6, the index of the mass attribute.

6. Set minThreshold to 0.1E-8 (close to zero), which is the minimum value allowed for the attribute.

7. Set minDefault to NaN, which is unknown and will replace values below the threshold.
8. Click the “OK” button on the filter configuration.

9. Click the “Apply” button to apply the filter.

Click “mass” in the “attributes” pane and review the details of the “selected attribute”. Notice that the 11 attribute values that were formerly set to 0 are now marked as Missing.

Weka Missing Data Marked

In this example we marked values below a threshold as missing.
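
What the filter does in this recipe can be sketched in a few lines of Python. This is a minimal illustration, not Weka's implementation: the function name, the column layout and the four data rows are assumptions made up for the example. The threshold 1e-9 is the same value as the 0.1E-8 entered for minThreshold.

```python
import math

def mark_below_threshold(rows, column, min_threshold, min_default=math.nan):
    """Return a copy of rows in which values below min_threshold
    in the given column are replaced by min_default (NaN = missing)."""
    cleaned = []
    for row in rows:
        new_row = list(row)  # copy, so the input rows are not modified
        if new_row[column] < min_threshold:
            new_row[column] = min_default
        cleaned.append(new_row)
    return cleaned

# Made-up rows: [plasma glucose, mass]. A mass of 0 is not a plausible measurement.
rows = [[148.0, 33.6], [85.0, 0.0], [183.0, 23.3], [89.0, 0.0]]

marked = mark_below_threshold(rows, column=1, min_threshold=1e-9)
missing = sum(1 for r in marked if math.isnan(r[1]))
print("values marked as missing:", missing)
```
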


You could just as easily mark them with a specific numerical value. You could also
mark values missing between an upper and lower range of values.
Next, let’s look at how we can remove instances with missing values from our
dataset.
Remove Missing Data
Now that you know how to mark missing values in your data, you need to learn
how to handle them.
A simple way to handle missing data is to remove those instances that have one
or more missing values.
You can do this in Weka using the RemoveWithValues filter.
Continuing on from the above recipe to mark missing values, you can remove
missing values as follows:
1. Click the “Choose” button for the Filter and select RemoveWithValues; it is
under unsupervised.instance.RemoveWithValues.
Weka Select RemoveWithValues Data Filter

2. Click on the filter to configure it.


3. Set attributeIndex to 6, the index of the mass attribute.
4. Set matchMissingValues to “True”.
5. Click the “OK” button to use the configuration for the filter.
6. Click the “Apply” button to apply the filter.
Click “mass” in the “attributes” section and review the details of the “selected
attribute”.
Notice that the 11 instances whose mass value was marked Missing have been removed
from the dataset.
Weka Missing Values Removed

Note, you can undo this operation by clicking the “Undo” button.
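
The effect of removing instances with a missing value can be sketched in Python as follows. This is an illustrative sketch only; the function name and the sample rows are made up, and NaN stands in for Weka's Missing marker.

```python
import math

def remove_with_missing(rows, column):
    """Keep only the rows whose value in the given column is not missing (NaN)."""
    return [row for row in rows if not math.isnan(row[column])]

# Made-up rows: [plasma glucose, mass], with two masses already marked as missing.
rows = [[148.0, 33.6], [85.0, math.nan], [183.0, 23.3], [89.0, math.nan]]

kept = remove_with_missing(rows, column=1)
print(len(rows) - len(kept), "instances removed,", len(kept), "kept")
```

Note that whole instances are dropped, so the values of every other attribute in those rows are lost as well.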
Impute Missing Values
Instances with missing values do not have to be removed; you can instead replace
the missing values with some other value.
This is called imputing missing values.
It is common to impute missing values with the mean of the numerical
distribution. You can do this easily in Weka using the ReplaceMissingValues
filter.
Continuing on from the first recipe above to mark missing values, you can
impute the missing values as follows:
1. Click the “Choose” button for the Filter and select ReplaceMissingValues;
it is under unsupervised.attribute.ReplaceMissingValues.
Weka ReplaceMissingValues Data Filter

2. Click the “Apply” button to apply the filter to your dataset.

Click “mass” in the “attributes” section and review the details of the “selected attribute”.

Notice that the 11 attribute values that were marked Missing have been set to the mean value of the

distribution.

Weka Imputed Values
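
Mean imputation for a single numeric column can be sketched in Python as below. The function name and the sample rows are assumptions for illustration, and NaN stands in for Weka's Missing marker. Weka's ReplaceMissingValues filter works on all attributes at once, using the mean for numeric attributes and the mode for nominal ones; this sketch covers only the numeric case for one column.

```python
import math

def impute_mean(rows, column):
    """Return a copy of rows where missing (NaN) values in the given column
    are replaced by the mean of the observed values in that column."""
    observed = [row[column] for row in rows if not math.isnan(row[column])]
    mean = sum(observed) / len(observed)
    return [
        [mean if (i == column and math.isnan(v)) else v for i, v in enumerate(row)]
        for row in rows
    ]

# Made-up rows: [plasma glucose, mass], with two masses marked as missing.
rows = [[148.0, 30.0], [85.0, math.nan], [183.0, 20.0], [89.0, math.nan]]

imputed = impute_mean(rows, column=1)
print(imputed)
```
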


EXPERIMENT-2

Aim: To create a decision tree by training on a data set using the Weka mining tool.

Tools/Apparatus: Weka mining tool.

mbinations of values in the historical data.

Procedure:

1) Open Weka GUI Chooser.

2) Select EXPLORER present in Applications.

3) Select Preprocess Tab.

4) Click “Open file” and browse to the file “bank.csv” that is already stored on the system.

5) Go to Classify tab.

6) Click the “Choose” button to pick a classifier. The C4.5 algorithm is used here; its Java implementation in Weka is called J48.

7) Select J48 under trees.

9) Select Test options “Use training set”

10) If needed, select the class attribute.

11) Click “Start”.

12) The output details now appear in the “Classifier output” pane.

13) Right-click the entry in the result list and select the “Visualize tree” option.

Sample output:
The decision tree constructed by using the implemented C4.5 algorithm
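
C4.5 chooses the attribute to split on by gain ratio: the information gain of the split divided by the entropy of the split itself. The Python sketch below shows that calculation for nominal attributes only. It is not J48's code (J48 also handles numeric attributes, missing values and pruning), and the attribute names and the four records are hypothetical, not taken from bank.csv.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the nominal attribute `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)  # entropy of the partition sizes
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical records: two yes/no attributes and a yes/no class label.
married = ["YES", "YES", "NO", "NO"]
car = ["YES", "NO", "YES", "NO"]
label = ["YES", "YES", "NO", "NO"]

print("gain ratio of married:", gain_ratio(married, label))
print("gain ratio of car:", gain_ratio(car, label))
```

In this toy data the first attribute separates the classes perfectly and the second not at all, so a C4.5-style learner would place the first attribute at the root.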
