Iot Domain Analyst-Ece3502: Data Analytics Using Weka For Weather Land Related Data

1) The document describes an experiment using the Weka data mining software to analyze weather and leaf datasets. 2) Weka was used to preprocess the datasets, visualize the data distributions, and apply machine learning classifiers like J48 decision trees, random forests, and Naive Bayes. 3) Classification accuracies between 57-77% were achieved depending on the classifier and dataset, indicating the models learned meaningful patterns in the data.

Uploaded by

sai manikanta

Name: M. SAIMANIKANTA Reg no: 18BEC1314

SLOT: L37+L38 FACULTY: DR VELMATHI G

EXPERIMENT NUMBER: - 3 DATE: - 18/02/2021

IOT DOMAIN ANALYST- ECE3502

AIM: Data Analytics using Weka for weather land related data.

THEORY:
WEKA

The workbench for machine learning

Weka is tried and tested open source machine learning software that can be
accessed through a graphical user interface, standard terminal applications, or a
Java API. It is widely used for teaching, research, and industrial applications,
contains a plethora of built-in tools for standard machine learning tasks, and
additionally gives transparent access to well-known toolboxes such as scikit-
learn, R, and Deeplearning4j.
WEKA: Weka (Waikato Environment for Knowledge Analysis) is a popular suite of
machine learning software written in Java, developed at the University of
Waikato, New Zealand. Weka is free software available under the GNU General
Public License. The Weka workbench contains a collection of visualization tools
and algorithms for data analysis and predictive modeling, together with graphical
user interfaces for easy access to this functionality.
Weka is a collection of machine learning algorithms for solving real-world data
mining problems. It is written in Java and runs on almost any platform. The
algorithms can either be applied directly to a dataset or called from your own Java
code.
The original non-Java version of Weka was a TCL/TK front-end to (mostly third-
party) modeling algorithms implemented in other programming languages, plus
data preprocessing utilities in C, and a Makefile-based system for running
machine learning experiments. This original version was primarily designed as a
tool for analyzing data from agricultural domains, but the more recent fully Java-
based version (Weka 3), for which development started in 1997, is now used in
many different application areas, in particular for educational purposes and
research.
Advantages of Weka include:
• Free availability under the GNU General Public License
• Portability, since it is fully implemented in the Java programming language
and thus runs on almost any modern computing platform
• A comprehensive collection of data preprocessing and modeling techniques
• Ease of use due to its graphical user interfaces

DESIGN AND PROCEDURE:


1) Download and install the Weka software on a laptop and open it.
2) Open a new Explorer window in Weka.
3) Download the dataset from the given link:

https://archive.ics.uci.edu/ml/datasets/leaf

4) Open the downloaded file.

5) The following graph shows the outlook attribute of the weather data. The
counts are 5 for sunny, 4 for overcast and 5 for rainy.
6) The following graph shows how often we go outside to play (blue) and how
often we do not (red). The count is 9 for going out to play and 5 for not going
out to play.
7) The following graph shows the humidity attribute. The Normalize filter was
applied so that the minimum maps to 0 and the maximum maps to 1. After
normalization the mean is 0.537 and the standard deviation (StdDev) is 0.332.
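
The rescaling in step 7 can be sketched in a few lines of Python. This is only an illustration of min-max normalization, not Weka's own code. It assumes the humidity values are those of the 14-instance weather.numeric.arff sample file that ships with Weka, because the report does not list the raw values. The function name min_max_normalize is made up for this sketch.

```python
from statistics import mean, stdev

# Humidity column of the 14-instance weather.numeric.arff sample file
# (an assumption: the report does not list the raw values it used).
HUMIDITY = [85, 90, 86, 96, 80, 70, 65, 95, 70, 80, 70, 90, 75, 91]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


normalized = min_max_normalize(HUMIDITY)
norm_mean = mean(normalized)
norm_std = stdev(normalized)  # sample standard deviation (n - 1 denominator)

print(f"mean = {norm_mean:.3f}, stdDev = {norm_std:.3f}")
```

Under that assumption the sketch gives a mean of 0.537 and a standard deviation of 0.332, the same figures read from the Weka panel in step 7.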

OUTPUT:-
1) The following shows the trees.J48 classifier evaluated with 10-fold cross
validation. It learns from the data set and predicts whether to go out to play
or not. The accuracy of this classification is 64.28%, which is moderately
good.
2) The following shows the trees.RandomTree classifier evaluated with 10-fold
cross validation on the same prediction task. The accuracy of this
classification is 64.28%, which is moderately good.
3) The following shows a functions classifier evaluated with 10-fold cross
validation on the same prediction task. The accuracy of this classification is
57.1429%, which is not good.
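
With only 14 instances, each accuracy above corresponds to a whole number of correctly classified instances. The sketch below checks that arithmetic: 64.28% is consistent with 9 of 14 correct and 57.1429% with 8 of 14. These counts are inferred from the percentages; the report does not state them. The sketch also computes the accuracy of always predicting the most frequent class, a common reference point that the report itself does not use. The helper names are invented for this example.

```python
from collections import Counter

# Class labels of the 14-instance weather data: 9 "yes" and 5 "no",
# matching the counts read from the class histogram in the procedure.
LABELS = ["yes"] * 9 + ["no"] * 5


def accuracy_percent(correct, total):
    """Percentage of instances that were classified correctly."""
    return 100.0 * correct / total


def majority_baseline_percent(labels):
    """Accuracy of always predicting the most frequent class."""
    _, count = Counter(labels).most_common(1)[0]
    return accuracy_percent(count, len(labels))


total = len(LABELS)
print(f"9 of {total} correct: {accuracy_percent(9, total):.4f}%")
print(f"8 of {total} correct: {accuracy_percent(8, total):.4f}%")
print(f"majority-class baseline: {majority_baseline_percent(LABELS):.4f}%")
```

If the inferred counts are right, the two tree classifiers match the accuracy of always answering "yes" and the third classifier falls below it. On a data set this small, one instance changes the accuracy by about 7 percentage points, so the differences should be read cautiously.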
Result:
The weather dataset was analyzed, and different classification techniques for
machine learning were studied with the help of Weka.
Name: M. SAIMANIKANTA Reg no: 18BEC1314
SLOT: L37+L38 FACULTY: DR VELMATHI G

EXPERIMENT NUMBER: - 4 DATE: - 25/02/2021

IOT DOMAIN ANALYST- ECE3502

AIM: Data Analytics using Weka for leaf related data.

THEORY:
WEKA

The workbench for machine learning

Weka is tried and tested open source machine learning software that can be
accessed through a graphical user interface, standard terminal applications, or a
Java API. It is widely used for teaching, research, and industrial applications,
contains a plethora of built-in tools for standard machine learning tasks, and
additionally gives transparent access to well-known toolboxes such as scikit-
learn, R, and Deeplearning4j.
WEKA: Weka (Waikato Environment for Knowledge Analysis) is a popular suite of
machine learning software written in Java, developed at the University of
Waikato, New Zealand. Weka is free software available under the GNU General
Public License. The Weka workbench contains a collection of visualization tools
and algorithms for data analysis and predictive modeling, together with graphical
user interfaces for easy access to this functionality.
Weka is a collection of machine learning algorithms for solving real-world data
mining problems. It is written in Java and runs on almost any platform. The
algorithms can either be applied directly to a dataset or called from your own Java
code.
The original non-Java version of Weka was a TCL/TK front-end to (mostly third-
party) modeling algorithms implemented in other programming languages, plus
data preprocessing utilities in C, and a Makefile-based system for running
machine learning experiments. This original version was primarily designed as a
tool for analyzing data from agricultural domains, but the more recent fully Java-
based version (Weka 3), for which development started in 1997, is now used in
many different application areas, in particular for educational purposes and
research.
Advantages of Weka include:
• Free availability under the GNU General Public License
• Portability, since it is fully implemented in the Java programming language
and thus runs on almost any modern computing platform
• A comprehensive collection of data preprocessing and modeling techniques
• Ease of use due to its graphical user interfaces

DESIGN AND PROCEDURE:


1) Download and install the Weka software on a laptop and open it.
2) Open a new Explorer window in Weka.
3) Download the dataset from the given link:

https://archive.ics.uci.edu/ml/datasets/leaf

4) Take the CSV file from the download and edit it as needed. For example:

We deleted the unwanted rows from the dataset.

Rows in the dataset can also be exchanged, and so on.

5) Open the dataset CSV file in Notepad and put it in the correct format for
Weka (ARFF). The header of an ARFF file looks like the following Iris example:


% 1. Title: Iris Plants Database
% 2. Sources:

@RELATION iris

@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}

The data section of the ARFF file looks like the following:

@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa

After that, save the file in the .arff format.
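
The manual conversion in step 5 can also be scripted. The following is a minimal sketch, not part of the original procedure: it reads CSV text with a header row and writes the @RELATION, @ATTRIBUTE and @DATA sections shown above. The function name csv_to_arff and its nominal parameter are invented for this example. It assumes every column is numeric unless listed as nominal, and that no value contains commas or spaces, which would need quoting in ARFF.

```python
import csv
import io


def csv_to_arff(csv_text, relation, nominal=None):
    """Convert CSV text with a header row into ARFF text.

    Columns named in `nominal` are declared as nominal attributes whose
    value set is collected from the data; all other columns are NUMERIC.
    """
    nominal = set(nominal or [])
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if r]
    header, data = rows[0], rows[1:]

    lines = ["@RELATION " + relation, ""]
    for i, name in enumerate(header):
        if name in nominal:
            # Collect distinct values in order of first appearance.
            values = []
            for row in data:
                if row[i] not in values:
                    values.append(row[i])
            lines.append("@ATTRIBUTE " + name + " {" + ",".join(values) + "}")
        else:
            lines.append("@ATTRIBUTE " + name + " NUMERIC")
    lines += ["", "@DATA"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


SAMPLE_CSV = """sepallength,sepalwidth,petallength,petalwidth,class
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
"""

arff_text = csv_to_arff(SAMPLE_CSV, "iris", nominal=["class"])
print(arff_text)
```

Because the nominal value set is collected from the rows, the sample above declares only Iris-setosa for the class attribute. A full data file would list every class that occurs in it.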


6) The following graph shows the distribution of the elongation attribute over
all 340 leaf instances. The normalized mean elongation of the leaves is 0.514,
the minimum is 0.108 and the maximum is 0.948.
7) The following graph shows the distribution of the smoothness attribute over
all 340 leaf instances. The normalized mean smoothness of the leaves is 0.018,
the minimum is 0.001 and the maximum is 0.073. Most of the leaves are smooth.
8) The following graph shows the distribution of the contrast attribute over
all 340 leaf instances. The normalized mean contrast of the leaves is 0.125,
the minimum is 0.033 and the maximum is 0.281.
OUTPUT:-
1) The following shows the trees.RandomForest classifier evaluated with 10-fold
cross validation to predict the constant of the leaves. The accuracy of this
classification is 77.35%, which is good.
2) The following shows the trees.RandomTree classifier evaluated with 10-fold
cross validation to predict the constant of the leaves. The accuracy of this
classification is 65.29%, which is moderately good.
3) The following shows the MultiClassClassifier evaluated with 10-fold cross
validation to predict the constant of the leaves. The accuracy of this
classification is 4.70%, which is the worst case.
4) The following shows the trees.J48 classifier evaluated with 10-fold cross
validation to predict the constant of the leaves. The accuracy of this
classification is 61.17%, which is moderately good.
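
The 10-fold cross validation used for all four results can be sketched as follows. This is a simplified illustration with an invented helper, k_fold_indices, and an arbitrary seed. It shuffles the 340 instance indices and splits them into 10 folds, each used once for testing while the other nine are used for training. Weka's Explorer additionally stratifies the folds by class, which this sketch does not attempt.

```python
import random


def k_fold_indices(n_instances, k=10, seed=1):
    """Split instance indices into k folds and yield (train, test) pairs."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(340, k=10))
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train)} test={len(test)}")
```

Each instance is tested exactly once, so a reported accuracy is the fraction of all 340 instances that were classified correctly across the ten test folds.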

Result:

The leaf dataset was analyzed, and different classification techniques for
machine learning were studied with the help of Weka.
