
IoT Based Smart Water Quality Monitoring System

This document describes an IoT-based smart water quality monitoring system. The system uses sensors to measure water temperature, pH, turbidity and electric conductivity. The sensor data is collected with an Arduino board and transmitted to a desktop application for analysis and monitoring of water quality. Machine learning algorithms including SVM, logistic regression, fast forest and averaged perceptron are used to analyze the sensor data and classify water quality. The system aims to monitor water sources and detect changes in water quality in real time.

Uploaded by

Emran Emon

IoT Based

Smart Water Quality Monitoring System

Samia Islam
ID : 2015-1-60-102
Monira Mukta
ID : 2015-1-60-116
Md. Emon Miea
ID : 2015-1-60-085

A thesis submitted in partial fulfillment of the requirements for the
degree of Bachelor of Science in Computer Science and Engineering

Department of Computer Science and Engineering

East West University

Dhaka-1212, Bangladesh

December, 2018
Declaration

We hereby declare that the work presented in this thesis is the outcome of the investigation performed by us
under the supervision of Surajit Das Barman, Senior Lecturer, Department of Computer Science and Engineering,
East West University. We also declare that no part of this thesis/project has been or is being submitted elsewhere
for the award of any degree or diploma.

Countersigned

........................
(Surajit Das Barman)
Supervisor

Signature

........................
(Samia Islam)
(ID : 2015-1-60-102)

Signature

........................
(Monira Mukta)
(ID : 2015-1-60-116)

Signature

........................
(Md. Emon Miea)
(ID : 2015-1-60-085)

Letter of Acceptance

This thesis report entitled “Smart Water Quality Monitoring [SWQM] System” submitted by Samia Islam (ID:
2015-1-60-102), Monira Mukta (ID: 2015-1-60-116) and Md. Emon Miea (ID: 2015-1-60-085) to the
Department of Computer Science and Engineering, East West University is accepted by the department in partial
fulfillment of the requirements for the award of the degree of Bachelor of Science in Computer Science and
Engineering in December 2018.

Supervisor

............................
Surajit Das Barman

Senior Lecturer,

Department of Computer Science and Engineering, East West University.

Chairperson

..................
(Dr. Ahmed Wasif Reza)

Chairperson and Associate Professor,

Department of Computer Science and Engineering, East West University

Table of contents

Declaration
Table of contents
List of Figures
List of Tables
Abstract
Chapter 1
Introduction
1.1 Overview and Motivation
1.2 Thesis Objective
Chapter 2
Literature Review
2.1 Smart Water Quality Monitoring [SWQM] System
2.2 Existing Work on WQM
Chapter 3
Research Methodology
3.1 Overview of the System
3.2 Circuit Diagram and Description of Working Principle
3.3 Individual connection between different sensors and Arduino
3.4 Flowchart of Arduino Programming
3.5 List of Equipment
3.6 Picture of Hardware setup
3.7 Algorithm for Data Analysis
3.7.1 SVM [Support Vector Machine] Binary Classification
3.7.2 Logistic Regression Binary Classification
3.7.3 Fast Forest Binary Classification
3.7.4 AveragedPerceptron Binary Classification
3.8 Flow Chart
3.9 Developed Apps
3.10 Full Experimental Setup
Chapter 4
Analysis and Results
4.1 Instrumental Analysis
4.2 Analysis
4.3 Results
4.3.1 Result from Physical Analysis
4.3.2 Result Analysis with ML
Chapter 5
Conclusion
Future Work
Reference

List of Figures

Fig. 1: Schematic of the proposed SWQM system
Fig. 2: Circuit Diagram of proposed system
Fig. 3: Temperature Sensor’s connection to Arduino
Fig. 4: pH Sensor’s connection to Arduino
Fig. 5: EC Sensor’s connection to Arduino
Fig. 6: Turbidity Sensor’s connection to Arduino
Fig. 7: Flowchart of Arduino Programming
Fig. 8: Snapshot of hardware setup
Fig. 9: Flowchart of proposed app
Fig. 10: Initial state of developed app
Fig. 11: Arduino is reading data from water by sensors
Fig. 12: Prediction shown in the app
Fig. 13: Graphical representation of Temperature
Fig. 14: Graphical representation of Electric Conductivity
Fig. 15: Graphical representation of pH
Fig. 16: Graphical representation of Turbidity
Fig. 17: Snapshot of Fast Forest Accuracy
Fig. 18: Snapshot of SVM Binary Classifier Accuracy
Fig. 19: Snapshot of Averaged Perceptron Binary Classifier Accuracy
Fig. 20: Snapshot of Logistic Regression Binary Classifier Accuracy
Fig. 21: Graph of accuracy comparison
Fig. 22: Snapshot of app’s predicted result
Fig. 23: Snapshot of app’s predicted logs

List of Tables

Table 1: List of analytical instruments
Table 2: The sources of the water are mainly from three categories. Here WHO GV = World Health Organization Guide Value. [8]
Table 3: Accuracy Comparison of Binary Classifiers

Abstract

Monitoring water quality is important because of its significant impact on human health and the
ecosystem. The aim of this project is to develop an IoT-based smart water quality monitoring
(SWQM) system that aids in continuous measurement of water conditions based on four physical
parameters, i.e., temperature, pH, turbidity and conductivity. Four sensors are connected to an
Arduino UNO to detect the corresponding water parameters. Data extracted from the sensors are
transmitted to a purpose-built desktop application. Based on the measured results, the proposed
SWQM system analyzes the water parameters using machine learning approaches to classify
whether the test water sample is suitable for human consumption or not.

Chapter 1
Introduction
Water is the most essential element for sustaining any living being. We use water in every sphere
of life, such as drinking, washing, industry, agriculture and food processing. No other substance
can take the place of water. As the population grows day by day, the chance of polluting this
element also increases. Regrettably, this valuable element is not unlimited, so it must be used
properly.

Moreover, the quality of water is a big issue in modern science. Water is being polluted in several
ways. As a consequence, various serious problems are arising, such as skin diseases, global warming
and scarcity of pure drinking water for living beings. Hence, checking the quality of water is a
major issue. In the developing world, 90% of all wastewater still goes untreated into local
rivers and streams. Some countries, with roughly a third of the world's population, also suffer from
medium or high water stress, and 17 of these extract more water annually than is recharged
through their natural water cycles. The strain not only affects surface freshwater bodies like rivers
and lakes, but also degrades groundwater resources.

1.1 Overview and Motivation


This natural resource is becoming scarcer in certain places, and its availability is a major social
and economic concern. Currently, about a billion people around the world routinely drink
unhealthy water. At the 2003 G8 Evian summit, most countries accepted the goal of halving by 2015
the number of people worldwide who do not have access to safe water and sanitation.
Even if this difficult goal is met, it will still leave more than an estimated half a billion people
without access to safe drinking water and over a billion without access to adequate sanitation. Poor
water quality and bad sanitation are deadly; some five million deaths a year are caused by polluted
drinking water. The World Health Organization estimates that safe water could prevent 1.4 million
child deaths from diarrhea each year. [1]

On the other hand, manual checking of water quality is expensive and time consuming. To reduce
these constraints, we worked on automated water quality analysis. Although several works have
addressed this problem, they were not fully satisfactory and reliable. To get accurate and reliable
results, we introduce a new approach based on machine learning algorithms.

Many parameters can be used to check water quality, but the major factors are:

• pH level
• Turbidity
• Carbon dioxide level
• Oxygen level
• Conductivity
• Temperature
• Arsenic
• Escherichia coli

Because of limited time and resources, we worked with only a few of these parameters:
temperature [2][3], turbidity [4], pH and conductivity. These parameters are sensed
through various sensors.

Our system's functionality depends on the following steps:

• Collecting real-time data through various sensors
• Creating a database
• Checking the accuracy of the data
• Analyzing the data
• Applying machine learning algorithms to the collected data

1.2 Thesis Objective


The main objectives of the proposed project are:
1. To develop a smart water quality monitoring (SWQM) system using the IoT platform.
2. To classify the test water samples by analyzing the extracted data using machine
learning approaches.

Chapter 2
Literature Review

This chapter discusses water quality monitoring systems and briefly describes various approaches
to water quality monitoring using IoT techniques. It also discusses existing work on water
quality monitoring.

2.1 Smart Water Quality Monitoring [SWQM] System


Water quality monitoring is a process by which we measure the parameters of water. Water has
different measurable properties and constituents, such as temperature, pH, suspended solid particles
and dissolved ions (anions or cations). The main purpose of monitoring water quality is to
verify that all these parameters are within a tolerable range. Water quality can be monitored
manually or by smart (digital) means. When monitoring relies on physical and chemical testing,
it is called manual. In the smart or digital process, different sensors are used along with IoT.
WQM is an important issue nowadays because drinkable water is being polluted by human
activity and other causes. A well-designed model for a water quality monitoring system would
therefore be very helpful for people.

2.2 Existing Work on WQM


The authors of [5] propose a new approach: developing sensor nodes for real-time, in-pipe
monitoring that assess water quality on the fly and calculate the amount of water
delivered. The main sensor node incorporates several in-pipe electrochemical and optical sensors,
with emphasis on low cost, lightweight implementation and reliable long-term operation.
This type of implementation is suitable for large-scale deployments, enabling a sensor network
approach for water consumers, water companies and authorities. Based on selected
parameters, a sensor array is developed along with several microsystems for analog signal
conditioning, processing, logging and remote presentation of data. A real-time water quality
monitoring system using a broker-less publisher and subscriber architecture is
developed in [6]. Sensors measure water metrics including temperature, pH and
dissolved oxygen level. All the collected data are stored in a database and used to analyze
water quality. To complete the experiment, the relationship among temperature, pH and dissolved
oxygen is analyzed, and the experiment concludes that water temperature is inversely
proportional to pH and dissolved oxygen level. The article in [7] introduces a smart water quality
monitoring system for Fiji that uses IoT and remote sensing to monitor, collect and
analyze data from remote locations. Another smart solution for water quality monitoring
is described in [8].

In this thesis, we develop a smart water quality monitoring system. We collect data from
different water sources using various sensors. The values of the different parameters are shown on
the computer and saved in Excel. We design a desktop application that reads the data
automatically and predicts whether the water is drinkable or not. The next chapter describes
the whole procedure by which we developed our system.

Chapter 3
Research Methodology
3.1 Overview of the System
The SWQM system reads data from a water sample through sensors attached to a microcontroller and
analyzes the data with an ML algorithm to predict the water quality.

Fig. 1: Schematic of the proposed SWQM system.


To build our system, we need four types of sensors and other hardware, including an
Arduino. The sensors read temperature, electric conductivity, turbidity and pH. For
application development, we used Visual Studio 2017 IDE, the .NET Framework, Windows Forms,
ML.NET, an SQLite database, etc. Moreover, machine learning algorithms are applied in the
application to make the result more accurate and reliable.

3.2 Circuit Diagram and Description of Working Principle

Fig. 2: Circuit Diagram of proposed system

The circuit is built on a breadboard with an Arduino UNO and four sensors: a digital
temperature sensor, an analog EC (electric conductivity) sensor, an analog turbidity sensor and an
analog pH sensor. Each sensor needs a 5 V supply and a ground connection. Therefore, we made a
common 5 V node and a common GND node on the breadboard.

The power pins of all the sensors are connected to the common voltage node, and the
grounds of the sensors are connected to the common ground. Each sensor has an output pin known
as the data pin. The data pin of the EC sensor is connected to analog pin A0, the data pin of the
turbidity sensor to analog pin A3, and the data pin of the pH sensor to analog
pin A5. The data pin of the temperature sensor is connected to digital pin 5.

3.3 Individual connection between different sensors and Arduino
a) Temperature sensor connected to Arduino

Fig. 3: Temperature Sensor’s connection to Arduino

b) pH sensor connected to Arduino

Fig. 4: pH Sensor’s connection to Arduino

c) EC sensor connected to Arduino

Fig. 5: EC Sensor’s connection to Arduino

d) Turbidity sensor connected to Arduino

Fig. 6: Turbidity Sensor’s connection to Arduino

3.4 Flowchart of Arduino Programming

Start
Arduino Pin Setup
LOOP:
    String Data = Null
    Read Temperature & Add to Data String
    Read Electric Conductivity & Add to Data String
    Read Turbidity & Add to Data String
    Read pH & Add to Data String
    Print Data String to the Serial Port

Fig. 7: Flowchart of Arduino Programming

This flowchart shows the working procedure of the microcontroller. In the Arduino
program, we used several libraries to read data, namely DFRobot_EC, EEPROM,
OneWire and DallasTemperature. First, the program initializes the pin configuration. It then builds
a data string combining the four parameters and prints it to the serial port at an interval of
800 milliseconds.
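
On the receiving side, each serial line has to be split back into the four parameters. The sketch below is only an illustration in Python (the thesis application itself is built on .NET): it assumes the data string is comma-separated and that the fields arrive in the order of Fig. 7 (temperature, electric conductivity, turbidity, pH). The separator and the names `Reading` and `parse_reading` are assumptions, not taken from the thesis code.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    """One set of sensor values, in the order used by the Fig. 7 flowchart."""
    temperature: float
    conductivity: float
    turbidity: float
    ph: float


def parse_reading(line: str, sep: str = ",") -> Reading:
    """Split one serial line into four floats (the separator is an assumption)."""
    parts = [p.strip() for p in line.strip().split(sep)]
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {line!r}")
    temperature, conductivity, turbidity, ph = (float(p) for p in parts)
    return Reading(temperature, conductivity, turbidity, ph)


# Made-up example line, as it might arrive from the serial port.
print(parse_reading("25.4,0.62,0.0,7.1\n"))
```

Rejecting lines with the wrong number of fields guards against partial lines, which can occur when reading starts in the middle of a transmission.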

3.5 List of Equipment


The equipment needed to carry out this research is listed below:

i) Arduino UNO
ii) Turbidity Sensor (SEN0189)
iii) Electric Conductivity Meter (DFR0300)
iv) Waterproof DS18B20 Digital Temperature Sensor (DFR0198)
v) pH Sensor (SEN0161)
vi) Cable connection
vii) Breadboard etc.

3.6 Picture of Hardware setup

Fig. 8: Snapshot of hardware setup

3.7 Algorithm for Data Analysis
An algorithm is a step-by-step method of solving a problem. It is commonly used for data
processing, calculation and other related computer and mathematical operations. A well-chosen
algorithm ensures that the system performs the given task in the best possible manner. We used
different machine learning (ML) algorithms to predict water quality from the measured parameters.

In our research, we applied four algorithms to our collected data set:

a) Fast Forest Binary Classifier Algorithm
b) Linear SVM Binary Classifier Algorithm
c) Averaged Perceptron Binary Classifier Algorithm
d) Logistic Regression Binary Classifier Algorithm

A brief explanation of these four algorithms is given below to show how they work.

3.7.1 SVM [Support Vector Machine] Binary Classification


Linear SVM is an extremely fast machine learning algorithm for solving multiclass or
binary classification problems on very large data sets; it implements an original
proprietary version of a cutting plane algorithm for designing a linear SVM. [9]
In ML, SVMs are supervised learning models with associated learning algorithms that analyze data
for classification and regression analysis. SVM gives its best output on high-dimensional
data, and it also works well on small data sets. Because logistic regression has some limitations,
we also consider SVM. In SVM, the classes are separated by a wide margin. The equation is given
below:

$$H_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}} \qquad (1)$$
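
Equation (1) has the form of the logistic (sigmoid) function applied to the linear score θᵀx, so its value always lies between 0 and 1. The following Python sketch evaluates it for illustration; the function name `hypothesis` and the example values are assumptions and do not come from the thesis application.

```python
import math


def hypothesis(theta, x):
    """Evaluate equation (1): 1 / (1 + exp(-theta^T x))."""
    z = sum(t * xi for t, xi in zip(theta, x))  # linear score theta^T x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form for negative scores; avoids overflow in exp().
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Made-up parameter and feature vectors.
print(hypothesis([0.5, -0.25], [2.0, 4.0]))  # the score is 0 here, so the value is 0.5
```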

3.7.2 Logistic Regression Binary Classification
Logistic regression is a binary classification algorithm. It predicts a binary answer based on different
features. Variants of it also handle outcomes that are not binary.
Logistic regression can be binomial, ordinal or multinomial. Binomial or binary logistic regression
deals with situations in which the observed outcome for a dependent variable can have only two
possible types, “0” and “1”.
Multinomial logistic regression deals with situations where the outcome can have three or more
possible types that are not ordered. Ordinal logistic regression deals with dependent variables that
are ordered.
One may begin to understand logistic regression by first considering a logistic model with given
parameters, then seeing how the coefficients can be estimated from data. Consider a model with two
predictors, $x_1$ and $x_2$; these may be continuous variables or indicator functions for binary
variables (taking value 0 or 1). Then the general form of the log-odds $L$ is:

$$L = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \qquad (2)$$

and the corresponding odds $O$ are:

$$O = b^{\beta_0 + \beta_1 x_1 + \beta_2 x_2} \qquad (3)$$

where $b$ is the base of the logarithm and exponent. [10]
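
As a small worked example of equations (2) and (3), the Python sketch below computes the log-odds and the odds for two predictors, and then the probability p = O / (1 + O), which is the standard way to turn odds into a probability. The coefficient values are made up for illustration, and taking e as the default base b is an assumption.

```python
import math


def log_odds(beta, x1, x2):
    """Equation (2): L = beta0 + beta1*x1 + beta2*x2."""
    b0, b1, b2 = beta
    return b0 + b1 * x1 + b2 * x2


def odds(beta, x1, x2, base=math.e):
    """Equation (3): O = b ** L, with base b (e by default, an assumption)."""
    return base ** log_odds(beta, x1, x2)


def probability(beta, x1, x2, base=math.e):
    """Probability of the positive class, p = O / (1 + O)."""
    o = odds(beta, x1, x2, base)
    return o / (1.0 + o)


beta = (-3.0, 1.0, 2.0)  # hypothetical coefficients, not fitted to the thesis data
print(log_odds(beta, 1.0, 1.0))     # -3 + 1 + 2 = 0.0
print(odds(beta, 1.0, 1.0))         # e ** 0 = 1.0
print(probability(beta, 1.0, 1.0))  # 1 / (1 + 1) = 0.5
```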

3.7.3 Fast Forest Binary Classification


This is a machine learning approach commonly known as Random Forest. In ML.NET it is named the
Fast Forest Binary Classifier. This algorithm gives the best prediction result on the dataset we
measured through the sensors.
A decision is made at each node of the binary tree data structure based on a measure of similarity
that maps each instance recursively through the branches of the tree until the appropriate leaf node
is reached and the output decision is returned.
Fast forest is an implementation of random forest and quantile regression forest that uses a
regression tree learner. The model consists of an ensemble of decision trees. Each
tree in a decision forest outputs a Gaussian distribution by way of prediction [11].
An aggregation is performed over the ensemble of trees to find a Gaussian distribution closest to
the combined distribution for all trees in the model. The pseudocode of this algorithm is given as [12]:

Algorithm: Random Forest

Precondition: A training set S := (x1, y1), . . . , (xn, yn), features F, and number of trees in
forest B.

function RandomForest(S, F)
    H ← ∅
    for i ∈ 1, . . . , B do
        S(i) ← a bootstrap sample from S
        hi ← RandomizedTreeLearn(S(i), F)
        H ← H ∪ {hi}
    end for
    return H
end function

function RandomizedTreeLearn(S, F)
    At each node:
        f ← very small subset of F
        Split on best feature in f
    return the learned tree
end function
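
The pseudocode above can be turned into a small runnable sketch. The Python version below is only an illustration under stated assumptions: it is not the ML.NET Fast Forest implementation, it scores splits with Gini impurity, it takes the "very small subset" of features to be about the square root of the number of features, and it caps tree depth at 5. The toy data and all function names are hypothetical. Class labels must not be tuples, because tuples are used to represent internal tree nodes.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def gini(labels):
    """Gini impurity of a list of labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def randomized_tree_learn(sample, features, rng, max_depth=5):
    """Grow one tree; each node only tries a small random subset of features."""
    labels = [y for _, y in sample]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority(labels)  # leaf node: a plain label
    k = max(1, int(len(features) ** 0.5))  # subset size (assumption)
    best = None
    for f in rng.sample(features, k):
        for threshold in sorted({x[f] for x, _ in sample}):
            left = [(x, y) for x, y in sample if x[f] <= threshold]
            right = [(x, y) for x, y in sample if x[f] > threshold]
            if not right:
                continue  # this threshold does not split the sample
            score = (len(left) * gini([y for _, y in left])
                     + len(right) * gini([y for _, y in right])) / len(sample)
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return majority(labels)
    _, f, threshold, left, right = best
    return (f, threshold,
            randomized_tree_learn(left, features, rng, max_depth - 1),
            randomized_tree_learn(right, features, rng, max_depth - 1))


def random_forest(sample, features, n_trees, seed=0):
    """Train n_trees trees, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        bootstrap = [rng.choice(sample) for _ in sample]  # draw with replacement
        forest.append(randomized_tree_learn(bootstrap, features, rng))
    return forest


def predict(forest, x):
    """Majority vote over the predictions of all trees."""
    votes = []
    for tree in forest:
        while isinstance(tree, tuple):  # descend until a leaf label is reached
            f, threshold, left, right = tree
            tree = left if x[f] <= threshold else right
        votes.append(tree)
    return majority(votes)


# Made-up toy data: two features per sample, two classes.
toy = [((v, 10.0 * v), "low") for v in (1.0, 2.0, 3.0, 4.0)]
toy += [((v, 10.0 * v), "high") for v in (8.0, 9.0, 10.0, 11.0)]
forest = random_forest(toy, features=[0, 1], n_trees=25)
print(predict(forest, (1.0, 10.0)), predict(forest, (11.0, 110.0)))
```

Training each tree on a different bootstrap sample and a different feature subset makes the trees disagree in their errors, which is what lets the majority vote be more reliable than any single tree.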

3.7.4 AveragedPerceptron Binary Classification
The perceptron is a classification algorithm that makes its predictions based on a linear function of
an instance's feature values $f_0, f_1, \ldots, f_{D-1}$. The prediction is given by the sign of
$\sum_{i=0}^{D-1} w_i f_i$, where $w_0, w_1, \ldots, w_{D-1}$ are the weights computed by the algorithm.

It is an online algorithm: it processes the instances in the training set one at a time. The weights
are initialized to 0 or to some random values. For each example in the training set, the value of
$\sum_{i=0}^{D-1} w_i f_i$ is computed. If this value has the same sign as the label of the current
example, the weights remain the same. If they have opposite signs, the weight vector is updated by
either adding or subtracting the feature vector (adding if the label is positive, subtracting if it is
negative). In a generalization of this algorithm, the weights are updated by adding the
feature vector multiplied by the learning rate. In the averaged perceptron, each weight vector is
stored together with a weight that counts the number of iterations it survived. [13]
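
The update rule described above fits in a few lines. The following Python sketch is an illustration only, not the ML.NET trainer: labels are assumed to be -1 or +1, there is no bias term (matching the sum over w_i*f_i in the text), and a score of exactly 0 is treated as a mistake so that training can start from all-zero weights. Adding the current weights to a running sum after every example is equivalent to weighting each weight vector by the number of iterations it survived. The function names and data are hypothetical.

```python
def train_averaged_perceptron(examples, epochs=10, learning_rate=1.0):
    """examples: list of (features, label) pairs with label in {-1, +1}."""
    dim = len(examples[0][0])
    w = [0.0] * dim        # current weight vector
    w_sum = [0.0] * dim    # running sum of the weight vector after each example
    count = 0
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # wrong sign (or zero score): update the weights
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
            w_sum = [s + wi for s, wi in zip(w_sum, w)]
            count += 1
    return [s / count for s in w_sum]  # averaged weights


def predict(weights, x):
    """Sign of the weighted sum of the features."""
    return 1 if sum(wi * xi for wi, xi in zip(weights, x)) > 0 else -1


# Made-up, linearly separable toy data.
data = [((2.0, 1.0), 1), ((1.0, 3.0), 1), ((-1.0, -2.0), -1), ((-2.0, -1.0), -1)]
weights = train_averaged_perceptron(data)
print(weights)
print(predict(weights, (1.0, 1.0)), predict(weights, (-1.0, -1.0)))
```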
For better understanding, the procedure for implementing these algorithms with the ML.NET
framework is given below:
• Create a pipeline
• Add the dataset to the pipeline
• Assign numeric values to text
• Create the feature columns (temperature, conductivity, pH, turbidity)
• Add the learner to the pipeline (the best algorithm)
• Add the predicted label column
• Train the model
• Make predictions using the model
3.8 Flow Chart
The flow chart given below describes the functionality of the desktop application. First, the user
selects the serial port. The sensors measure the temperature, pH level,
electric conductivity and turbidity of the water. The data read are then compared with the WHO
levels, and the water quality can be predicted with ML. Lastly, the application reports whether the
water is drinkable or not. When the “Read Data” button is clicked, the program reads the selected
port continuously for the data-reading period. This period was set to 5 seconds so that the latest
stable reading can be extracted from the microcontroller.

Fig. 9: Flowchart of proposed app
3.9 Developed Apps
An app performs a group of coordinated functions, tasks and activities for the benefit of its users. It
reduces the user's time and effort and makes the task easy. In our research, a desktop app named
“Sprinkle-Water Quality Checker” was developed on the .NET platform to calculate and analyze the
data. Data collected through the various sensors are fed into this app for further work.

Fig. 10: Initial state of developed app

This app takes the input and predicts an output by applying ML at the back end. ML
helps to find the best prediction. The “Sprinkle Water Quality Checker” takes all of the
input data and is then ready to apply the algorithm for prediction. There is also an option to compare
the most recently read data with the WHO standard and check whether the water is drinkable or not.

3.10 Full Experimental Setup

Fig. 11: Arduino is reading data from water by sensors

Fig. 12: Prediction shown in the app

Chapter 4
Analysis and Results
4.1 Instrumental Analysis
The parameters of water are measured through sensors listed in Table 1 below.

Table 1: List of analytical instruments


Parameter | Analytical Method | Instrument
pH | Instrumental, calculated from the output voltage of the pH sensor | pH sensor, SKU: SEN0161
Conductivity | Instrumental, calculated from the output voltage of the conductivity sensor | Electric Conductivity Meter, DFR0300
Turbidity | Instrumental, calculated from the output voltage of the turbidity sensor | Turbidity sensor, SKU: SEN0189
Temperature | Instrumental, calculated from the output voltage of the digital temperature sensor | Waterproof DS18B20 Digital Temperature Sensor, DFR0198

4.2 Analysis
We have divided the analysis into two parts:

• Physical Analysis
• Machine Learning Algorithm Analysis

4.2.1 Physical Analysis

We collected data from different sources, such as tap water, river water and some drinkable water
sources. We used different procedures and measured the water parameters with different sensors,
reading the temperature, pH, electric conductivity and turbidity of the water. In total, we
collected approximately 60 samples.

[Plot: Temperature (vertical axis, 0 to 50) against sample number (horizontal axis, 1 to 59)]

Fig. 13: Graphical representation of Temperature

Fig. 13 shows the measured temperature of the water samples.
Most of the samples were at normal temperature, and some were hot or cold.
Most values lie in the range 24 to 27; a few outlying values fall outside this range.

[Plot: Conductivity (vertical axis, 0 to 25) against sample number (horizontal axis, 1 to 59)]

Fig. 14: Graphical representation of Electric Conductivity

This graph is drawn from the data collected from the different sources with the
electric conductivity sensor. Most values lie in the range 0 to 1; some values exceed
this range.

[Plot: pH (vertical axis, 0 to 12) against sample number (horizontal axis, 1 to 59)]

Fig. 15: Graphical representation of pH

Neutral water has a pH of 7, but most of the samples we collected have a pH above 9. From this we
conclude that these samples are basic (alkaline). According to the WHO standard, the safe range
of pH is 6.5 to 8.5. When we mixed some salt into the water, we measured a pH of around 7.

[Chart: Turbidity (y-axis, 0 to 3500) plotted against sample index (x-axis, 1 to 59)]

Fig. 16: Graphical representation of Turbidity

Turbidity is another important indicator of water quality. When the tap or filtered water was
clean, we measured a turbidity of 0.0. When the water did not look clean and fresh, we measured
higher turbidity values, up to a maximum of 3000 (Table 2).

4.2.2 Analysis with ML Algorithm


The dataset is divided into two categories based on our daily considerations:
• Drinkable
• Not Drinkable

We chose the “Fast Forest” algorithm for our desktop app because it gives the best output among
the four algorithms described earlier.

This algorithm gives the best accuracy of the four: 100% on our test data, which is the highest
value possible. We therefore selected it for our app, “Sprinkle Water Quality Checker”.

The output of the “Fast Forest Binary Classifier” algorithm is given below:

Fig. 17: Snapshot of Fast Forest Accuracy


We also applied the SVM (Support Vector Machine) algorithm. However, its accuracy was 80%, which
was not satisfactory. The output of this algorithm is given below:

Fig. 18: Snapshot of SVM Binary Classifier Accuracy

The output of the “Averaged Perceptron Binary Classifier” algorithm was also not satisfactory: it
gives 80% accuracy. Its output is given below:

Fig. 19: Snapshot of Averaged Perceptron Binary Classifier Accuracy

Another algorithm is the “Logistic Regression Binary Classifier”. Its output is given below:

Fig. 20: Snapshot of Logistic Regression Binary Classifier Accuracy

4.3 Results
The results are divided into two categories:

• Result from physical analysis
• Result from analysis with ML

4.3.1 Result from Physical Analysis

Table 2: Summary statistics of the measured parameters. The water sources fall into three
categories. WHO GV = World Health Organization Guide Value.

Parameter          Source           N    Mean ± St. Dev      Min     Median   Max       WHO GV          % not within GV
Temperature (°C)   All              60   25.98 ± 3.95        16      25.09    44.63     -               -
                   Tap Water        23   25.41 ± 0.77        24.11   25.11    27.2      -               -
                   Biased Water     16   27.4 ± 7.51         16      24.6     44.63     -               -
                   Drinking Water   21   25.52 ± 0.92        23      25.76    27        -               -
Conductivity       All              60   2.31 ± 4.95         0.12    0.44     20.44     0.3-0.8 mS/cm   71.67
                   Tap Water        23   1.02 ± 1.7          0.12    0.34     6.45      0.3-0.8 mS/cm   69.57
                   Biased Water     16   4.6 ± 7.72          0.12    0.57     20.44     0.3-0.8 mS/cm   81.25
                   Drinking Water   21   1.97 ± 4.26         0.12    0.48     19.89     0.3-0.8 mS/cm   66.67
pH                 All              60   9.03 ± 0.58         7.3     9.15     10.26     6.5-8.5         80
                   Tap Water        23   9.1 ± 0.6           7.3     9.23     9.88      6.5-8.5         82.61
                   Biased Water     16   8.87 ± 0.65         7.66    8.9      10.26     6.5-8.5         68.75
                   Drinking Water   21   9.07 ± 0.51         7.99    9.12     9.89      6.5-8.5         85.71
Turbidity          All              60   550.73 ± 921.21     0       0        3000      <5 NTU          31.67
                   Tap Water        23   236.22 ± 628.55     0       0        2023      <5 NTU          13.04
                   Biased Water     16   1080.06 ± 1142.18   0       989.58   3000      <5 NTU          56.25
                   Drinking Water   21   491.89 ± 865.37     0       0        2902.89   <5 NTU          33.33

Table 2 presents the mean, standard deviation, minimum, median, and maximum of the data for each
parameter. It also lists the source of the data, where N is the number of samples. For
temperature, there is no “WHO GV” (World Health Organization Guide Value) [8]: since temperature
varies from place to place, no guide value is given.
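
The per-source statistics of the kind reported in Table 2 can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the readings are hypothetical values rather than the thesis data, the helper name `summarize` is made up for this example, and it assumes the sample (not population) standard deviation, which the thesis does not specify.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Return N, mean, sample standard deviation, min, median, and max."""
    return {
        "N": len(values),
        "mean": mean(values),
        "st_dev": stdev(values),  # sample standard deviation (assumption)
        "min": min(values),
        "median": median(values),
        "max": max(values),
    }


# Hypothetical temperature readings in °C, grouped by source (not the thesis data).
samples = {
    "Tap Water": [24.5, 25.0, 25.5, 26.0],
    "Drinking Water": [23.0, 25.0, 27.0],
}

# The "All" row pools the readings of every source.
all_values = [v for group in samples.values() for v in group]

for source, values in [("All", all_values)] + list(samples.items()):
    s = summarize(values)
    print(
        f"{source:15s} N={s['N']:2d} mean={s['mean']:.2f} ± {s['st_dev']:.2f} "
        f"min={s['min']} median={s['median']} max={s['max']}"
    )
```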

For conductivity, the WHO guide value is 0.3-0.8 mS/cm. In the physical analysis, 71.67% of all
samples fall outside this range: 69.57% of the “Tap Water” samples, 81.25% of the “Biased Water”
samples, and 66.67% of the “Drinking Water” samples.

For pH, 80% of the 60 samples fall outside the guide value. For turbidity, the figure is
31.67%.
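
The percentages above are the share of samples that fall outside the WHO guide value. A minimal Python sketch of that calculation follows. The readings are hypothetical, the function name `percent_outside` is illustrative, and treating the bounds as inclusive is an assumption (for a strict limit such as "<5 NTU", a reading of exactly 5 would need separate handling).

```python
def percent_outside(values, low=None, high=None):
    """Percentage of values below `low` or above `high`.

    Either bound may be None (no limit on that side). Values equal to a
    bound are counted as within the guide value (assumption).
    """
    if not values:
        raise ValueError("no samples given")
    outside = sum(
        1
        for v in values
        if (low is not None and v < low) or (high is not None and v > high)
    )
    return 100.0 * outside / len(values)


# Hypothetical readings, not the thesis data.
ph_readings = [7.3, 9.1, 9.2, 8.4, 9.9]
turbidity_readings = [0.0, 0.0, 989.58, 3.2]

print(percent_outside(ph_readings, low=6.5, high=8.5))  # 3 of 5 outside
print(percent_outside(turbidity_readings, high=5))      # 1 of 4 outside
```

Applied to the figures in Table 2, a failure rate of 80% over 60 samples corresponds to 48 samples outside the pH guide value.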

4.3.2 Result Analysis with ML


In our thesis, we applied four algorithms to our collected dataset:

• Fast Forest Binary Classifier Algorithm
• Linear SVM Binary Classifier Algorithm
• Averaged Perceptron Binary Classifier Algorithm
• Logistic Regression Binary Classifier Algorithm

All four algorithms produce predictions, but their accuracy differs. We consider an algorithm
efficient if it predicts the test data more accurately than the other models and can therefore be
deployed successfully.
We therefore applied the four algorithms to the same dataset and determined the accuracy of each.
The accuracy comparison is shown below:

Table 3: Accuracy Comparison of Binary Classifiers


Binary Classifier      F1 score   AUC       Accuracy
Fast Forest            100.00%    100.00%   100.00%
Linear SVM             75.00%     100.00%   80.00%
Logistic Regression    75.00%     100.00%   80.00%
Averaged Perceptron    75.00%     100.00%   80.00%

The comparison is easier to see in the graph below:

[Bar chart: F1 score, AUC, and Accuracy (y-axis, 0.00% to 120.00%) for Fast Forest, Linear SVM,
Logistic Regression, and Averaged Perceptron]

Fig. 21: Graph of accuracy comparison

Here, F1 score is a machine learning metric used to measure a test’s accuracy. It is the harmonic
mean of precision and recall, and its range is [0, 1]. It reflects both how precise the
classifier is (how many of the instances it labels positive are actually positive) and how robust
it is (it does not miss a significant number of positive instances).

A classifier with high precision but low recall is very accurate on the instances it flags, but
it misses a large number of instances that are difficult to classify. The greater the F1 score,
the better the performance of our model.
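
To make the relationship between precision, recall, and F1 score concrete, the Python sketch below computes all three from confusion-matrix counts. The counts are hypothetical: they are not taken from the experiment in this thesis, which does not report a confusion matrix, and were chosen only because they produce an F1 score of 0.75 together with an accuracy of 0.80. The function name `precision_recall_f1` is likewise illustrative.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean (F1 score)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a 10-sample test set (not the thesis data).
tp, fp, fn, tn = 3, 0, 2, 5

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"F1={f1:.2f} accuracy={accuracy:.2f}")
```

With these counts the classifier never raises a false alarm (precision 1.00) but misses two of the five positive samples (recall 0.60), which is the high-precision, low-recall case described above.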

AUC stands for “Area Under the Curve”, where the curve is the ROC (receiver operating
characteristic) curve. It is one of the most widely used evaluation metrics for binary
classification problems. The AUC of a classifier is equal to the probability that the classifier
will rank a randomly chosen positive example higher than a randomly chosen negative example.
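
The ranking interpretation of AUC can be computed directly by comparing every positive example with every negative one. The Python sketch below does this on hypothetical labels and scores (not data from this thesis); the function name `auc` is illustrative, and ties are counted as half a win, which is a common convention. It also shows that a classifier can rank perfectly (AUC of 1.0) and still misclassify samples at a fixed decision threshold, assumed here to be 0.5.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test set: 5 positive and 5 negative samples.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.45, 0.4, 0.3, 0.25, 0.2, 0.1, 0.05]

# Every positive outranks every negative, so the ranking is perfect.
print("AUC:", auc(labels, scores))

# At an assumed threshold of 0.5, two positives fall below the cut-off.
predicted = [1 if s >= 0.5 else 0 for s in scores]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
print("accuracy:", accuracy)
```

This is one way a model can combine an AUC of 100% with an accuracy of 80%; whether it explains the figures reported for any particular classifier here is not established.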

From the above comparison (Fig. 21), we can see that the Fast Forest algorithm has the highest
accuracy. The SVM, Logistic Regression, and Averaged Perceptron algorithms each give an accuracy
of 80.00%, which is lower than that of the Fast Forest algorithm.

Hence, the app uses the best algorithm to predict the result. The result is displayed as follows:

Fig. 22: Snapshot of app’s predicted result


The application keeps a log file. A user who wants to see previous results can look at the
output logs.

Fig. 23: Snapshot of app’s predicted logs

Chapter 5
Conclusion

The ultimate goal of this project is to observe the quality of water samples by designing a
smart water quality monitoring (SWQM) device, implemented on an IoT platform, that can measure
four specific physical water parameters (temperature, pH, turbidity, and conductivity) and analyze
the extracted data using machine learning approaches. As our experiment is limited to examining
four water parameters, a total of 60 samples are used to evaluate the prediction accuracy.
Compared to the Linear SVM, Logistic Regression, and Averaged Perceptron binary classifiers, the
Fast Forest algorithm performs best, providing 100% accuracy.

Future Work
The proposed work provides good accuracy for a small dataset covering four parameters of water
samples. Further work can use a larger dataset and take into account the levels of chemical
parameters present in the water samples to improve the system’s effectiveness.

References

[1] Water. Available: https://en.wikipedia.org/wiki/Water#cite_note-30

[2] Water Quality. Available: http://www.cotf.edu/ete/modules/waterq3/WQassess4h.html

[3] International Decade for Action “Water for Life” 2005-2015. Available: http://www.un.org/waterforlifedecade/sanitation.shtml

[4] Turbidity. Available: https://www.lenntech.com/turbidity.htm#ixzz3R3yPreK7

[5] Automated sensor network for monitoring and detection of impurity in drinking water system. Available: http://www.ijraset.com/fileserve.php?FID=1615

[6] Pranata, Alif Akbar, Jae Min Lee, and Dong Seong Kim. "Towards an IoT-based water quality monitoring
system with brokerless pub/sub architecture." Local and Metropolitan Area Networks (LANMAN), 2017 IEEE
International Symposium on. IEEE, 2017.

[7] Prasad, A. N., et al. "Smart water quality monitoring system." Computer Science and Engineering (APWC
on CSE), 2015 2nd Asia-Pacific World Congress on. IEEE, 2015.

[8] Geetha, S., and S. Gouthami. "Internet of things enabled real time water quality monitoring system." Smart
Water 2.1 (2016): 1.

[9] Support Vector Machine. Available: https://en.wikipedia.org/wiki/Support_vector_machine

[10] Logistic Regression. Available: https://en.wikipedia.org/wiki/Logistic_regression

[11] Fast Forest Algorithm. Available: https://docs.microsoft.com/en-us/machine-learning-server/python-reference/microsoftml/rx-fast-forest

[12] RandomForest. Available: http://pages.cs.wisc.edu/~matthewb/pages/notes/pdf/ensembles/RandomForests.pdf

[13] Average Perceptron. Available: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.legacy.trainers.averagedperceptronbinaryclassifier?view=ml-dotnet
