LEAF PATHOGEN IDENTIFICATION USING MACHINE LEARNING
A PROJECT REPORT
Submitted by
JEYAPRIYA S (510120205006)
RAMYA A (510120205011)
of
BACHELOR OF ENGINEERING
in
INFORMATION TECHNOLOGY
MAY 2024
ANNA UNIVERSITY :: CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “LEAF PATHOGEN IDENTIFICATION
USING MACHINE LEARNING” is the bonafide work of “S. JEYAPRIYA
(510120205006), A. RAMYA (510120205011)” who carried out the project
work under my supervision.
SIGNATURE (HEAD OF THE DEPARTMENT)
Mrs. S. SHARMILA, MCA, M.E., (Ph.D)
Department of Information Technology
Adhiparasakthi College of Engineering
G.B. Nagar, Kalavai - 632506

SIGNATURE (SUPERVISOR)
Mrs. S. SHARMILA, MCA, M.E., (Ph.D)
Department of Information Technology
Adhiparasakthi College of Engineering
G.B. Nagar, Kalavai - 632506
ACKNOWLEDGEMENT
ABSTRACT
Rice is one of the staple foods of the world, alongside wheat and various pulses, but its production is hampered by various kinds of rice diseases. One of the main difficulties is that it is laborious for farmers in remote areas to identify rice leaf diseases without expert support. In this work, an automated system is proposed for diagnosing three common rice leaf diseases (Brown spot, Leaf blast, and Bacterial blight), so that pesticides and/or fertilizers can be applied appropriately. A Convolutional Neural Network is used to extract features from the leaf of the plant, and visual contents (colour, texture, and shape) are used as features for classification of these diseases. The type of rice leaf disease is then recognized by a Support Vector Machine. The resulting system can help agriculture-related people and organizations to take appropriate action against these diseases.
CHAPTER NO   TITLE
             ABSTRACT
             TABLE OF CONTENTS
5            SOFTWARE DESCRIPTION
             5.1 Development Tools and Techniques
6            MODULES
             6.1 Problem Definition
             6.2 Project Overview
             6.3 Module Description
             6.3.1 Data Collection Module
             6.3.2 Data Preprocessing Module
             6.3.3 Model Selection and Training Module
7            SYSTEM TESTING
             7.1 Introduction
             7.2 Testing Methodologies
             7.2.1 Unit Testing
             7.2.2 System Testing
             7.2.3 Performance Testing
             7.3 Test Cases
8            SYSTEM IMPLEMENTATION
             8.1 Purpose
9            CONCLUSION AND FUTURE ENHANCEMENTS
             9.1 Conclusion
             9.2 Scope For Future Development
10           APPENDICES
             10.1 Source code
             10.2 Screen Shots
11           REFERENCES
List of Abbreviations
MAPE Mean Absolute Percentage Error
EA Evolutionary Algorithm
CHAPTER 1
INTRODUCTION
The agricultural industries started looking for new ways to enhance food production as a
result of the expanding population, climatic changes, and poor governance. Researchers are
working to create new, capable, and distinctive technologies for producing outcomes with
great efficiency. A huge harvest can be produced using a variety of approaches in the
agricultural sector. Precision agriculture is the most recent and beneficial of them all. Based
on the data gathered, farmers can use the precision agriculture technology to gain insights for
increasing agricultural yield. Precision agriculture can be utilised for a variety of purposes,
including the identification of plant pests, the detection of weeds, the production of yield, the
detection of plant diseases, etc. Agricultural produce is categorised to prevent losses in yield and quantity. If this evaluation or classification is not carried out properly, it has major effects on plants and harms the quality or productivity of the corresponding product. Diseases in crops are causing several problems, including low productivity and financial losses to ranchers and farmers.
Leaf disease identification is a crucial aspect of plant health management and agricultural
sustainability. It involves the identification and classification of various diseases that affect the
leaves of plants, such as fungi, bacteria, viruses, and nutrient deficiencies. Early detection and
accurate diagnosis of these diseases are essential for timely intervention and effective control
measures. Various methods, including visual inspection, symptomatology, and advanced
diagnostic tools like molecular techniques and machine learning algorithms, are employed to
identify leaf diseases. Proper identification allows farmers and gardeners to implement
appropriate treatment strategies, including chemical treatments, cultural practices, and biological
control methods, to mitigate the spread of disease and ensure the health and productivity of
plants.
1.1 BACKGROUND
Image processing techniques are therefore required in order to diagnose diseases automatically. They aid both in accurately diagnosing diseases and in categorising them, and they make early and accurate identification feasible. The productivity and the quality of the yield both rise as a result, and human labour is greatly reduced. Many agricultural applications, such as the classification of various diseases and the identification of plant leaflets, make extensive use of image processing techniques.
In general, image processing techniques have following steps to identify the diseases:
• Image acquisition
• Image pre-processing
• Segmentation
• Feature Extraction
• Classification.
Different methods have been used for segmentation and feature extraction, and various classifiers are available for the classification step; a minimal sketch of the overall pipeline follows.
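This sketch assumes Pillow and NumPy are available; the file name is a placeholder, a crude green-channel threshold stands in for segmentation, and mean colour stands in for feature extraction:

import numpy as np
from PIL import Image

# 1. Image acquisition: load a leaf photograph (placeholder path)
img = Image.open("leaf.jpg")

# 2. Pre-processing: resize and convert to a NumPy array
rgb = np.asarray(img.resize((224, 224)), dtype=np.float32)

# 3. Segmentation: a threshold on the green channel separates leaf from background
mask = rgb[:, :, 1] > 100

# 4. Feature extraction: mean colour of the segmented leaf region
features = rgb[mask].mean(axis=0)

# 5. Classification: the features would be passed to a trained classifier here
print("colour features:", features)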
The major goal of this study is to develop a model that can distinguish between healthy and diseased crop leaves and, in the event that the crop has a disease, to identify which disease it is. This study used 54,306 photos, including pictures of diseased and healthy plant leaves, to train a convolutional neural network model to identify 14 crop species, 26 diseases, and 38 classes. On a held-out test set, this trained model has an accuracy of 99.35%. In this process, the collected leaves are analysed using a number of resnet18 models, trained using the transfer-learning approach: the first layers can be used to identify different leaf types, while the later layers can be used to screen for potential plant diseases. Deep learning produces results with higher accuracy, which may be used to diagnose crop diseases and to analyse an image down to its smallest pixel-level components, a level of detail that is impossible to study with the human eye. A transfer-learning sketch in this style follows.
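This is a minimal PyTorch sketch, assuming torchvision is available; freezing the backbone, the 38-class head, and the single training step are illustrative assumptions rather than the report's exact configuration:

import torch
import torch.nn as nn
from torchvision import models

# Load a resnet18 pretrained on ImageNet and freeze its backbone
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with a new head for 38 classes (assumed count)
model.fc = nn.Linear(model.fc.in_features, 38)

# Only the new classification head is optimised
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a random batch, purely for illustration
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 38, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()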
1.3.1 PURPOSE
In this methodology, plant diseases are classified and detected using machine learning
techniques and image processing techniques, respectively. Supervised learning and
unsupervised learning are the two main classifications of machine learning. The label values
in the supervised learning method are known. Regression and classification are two examples
of supervised learning techniques. The label values in the unsupervised learning approach are unknown; clustering and association are examples of unsupervised learning techniques. Several fields, including computing, biology, marketing, medical diagnosis, game playing, etc., can benefit from machine learning. Many methods, including Naive Bayes, k-means clustering, support vector machines, artificial neural networks, decision trees, and random forests, among others, are supported by machine learning. Data collection, dataset organisation, feature extraction, pre-processing, feature selection, selecting and implementing machine learning algorithms, and execution evaluation are some of the fundamental phases in machine learning; the sketch below runs through these phases on a toy dataset.
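This sketch assumes scikit-learn is installed; its bundled iris data stands in for leaf features purely for illustration:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Data collection and organisation: a labelled feature matrix
X, y = load_iris(return_X_y=True)

# Split into training and test portions
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Select and implement a machine learning algorithm (a support vector machine)
clf = SVC(kernel='rbf')
clf.fit(X_train, y_train)

# Execution evaluation on held-out data
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))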
1.3.2 SCOPE
Beyond the arts, image processing has more utilitarian applications, and it now serves several specialised functions in agricultural applications.
CHAPTER 3
SYSTEM ANALYSIS
• Using the public dataset, which contains 54,306 photos of diseased and healthy plant leaves taken under controlled conditions, the current system trains a model to recognise 14 unique crops and 26 diseases.
• The ResNet algorithm was used in the existing system paper. It produced highly accurate findings and was able to identify more diseases from different crops. The ResNet approach utilized several parameters, including weight decay, gradient clipping, and learning-rate scheduling.
• The experiment made use of a Kaggle data collection. The 87k RGB photos of healthy and diseased crop leaves in this collection are arranged into 38 different classes. An 80% share of this dataset is used for the training task and 20% for testing; a further directory of 33 test images was created later for prediction purposes.
• Residual networks easily pick up on the underlying data patterns, which can result in overfitting and subpar generalisation.
• Large quantities of memory are needed to hold the parameters and weights required by residual networks.
• Training a residual network typically requires weeks, making it virtually unworkable in practical applications.
CHAPTER 4
PROPOSED SYSTEM
The main objective of this project is to create a model that can differentiate between healthy and diseased crop leaves and, in the event that the crop has a disease, to determine which disease it is. This study examined 70,295 plant photos, including those of the tomato, blueberry, orange, peach, corn (maize), potato, raspberry, soybean, and strawberry. The referenced dataset is taken from the well-known public source Kaggle. For the purpose of identifying and categorising plant diseases in the suggested system, we employed the InceptionV3 architecture. The initial phase of the system is dataset loading; this dataset contains photographs of both healthy and diseased plants. The second step is preprocessing: at this point only the relevant data is kept, and irregular and noisy data are removed from the dataset. The next step is feature extraction, which greatly benefits image classification and has several uses; this work found that morphological features give better results than other features. The dataset can also be used to identify rare plants and rare diseases. The classification technique is the next phase of the system, and at the final stage users can recognise and describe the plant infection. The proposed system achieved a validation accuracy of 89.21% and a training accuracy of 91.34%.
• The suggested method provides an accurate identification strategy for plant leaf diseases with good accuracy and leverages auxiliary classifiers as regularizers; a sketch of this setup follows.
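This is a minimal Keras sketch of such an InceptionV3 classifier, assuming TensorFlow/Keras; the 224x224 input size and 38 classes follow the dataset described later, while the pooling-plus-dense head is an illustrative assumption:

from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# ImageNet-pretrained InceptionV3 backbone without its top classifier
base = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained layers

# New classification head for the 38 plant-disease classes
x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)  # head size is an assumption
outputs = Dense(38, activation='softmax')(x)

model = Model(base.input, outputs)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()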
CHAPTER 5
SOFTWARE DESCRIPTION
The hardware requirements may serve as the basis for a contract for the implementation of the system and should therefore be a complete and consistent specification of the whole system. They are used by software engineers as the starting point for the system design. They state what the system should do, not how it should be implemented.
Hardware Requirements:
• Processor: Intel i3 or equivalent.
The software requirements document is the specification of the system. It should include both a definition and a specification of requirements. It is a statement of what the system should do rather than how it should do it. The software requirements provide a basis for creating the software requirements specification.
Software Requirements:
• OS: Windows 10 / 11, 64 bit.
• IDE: PyCharm / VS Code.
The dataset required for training the model is the Kaggle plant-leaf disease dataset described in section 5.2.1.
In terms of data types and control structures, Python offers a flexible and powerful set of features. It supports a variety of data types, including integers, floating-point numbers, complex numbers, strings, lists, tuples, and dictionaries. These data types allow developers to handle different kinds of data effectively, whether it is numerical data, text, or structured collections of items. Functions play a pivotal role in Python programming, allowing developers to encapsulate reusable blocks of code; by defining functions, developers can write modular and maintainable code, promoting code efficiency and readability. Python itself is a general-purpose, interpreted, garbage-collected, dynamically typed language; its execution model is described in detail under Platform Knowledge below.
Python is an Integrated Language: Python can easily be integrated with other languages like C, C++, etc.
Interpreted Language: Python is an interpreted language because Python code is executed line by line. Unlike languages such as C, C++, and Java, there is no need to compile Python code, which makes it easier to debug. The source code of Python is converted into an intermediate form called bytecode.
Large Standard Library: Python has a large standard library that provides a rich set of modules and functions, so you do not have to write your own code for every single thing. There are many libraries present in Python, covering regular expressions, unit testing, web browsers, etc.
Dynamically Typed Language: Python is a dynamically typed language. The type (for example int, double, long) of a variable is decided at run time, not in advance, so because of this feature there is no need to specify the type of a variable.
5.5 TOOLS AND LIBRARIES
PANDAS
Pandas is an open-source Python library that provides high-performance, easy-to-use data structures (most notably the DataFrame) and data-analysis tools.
FEATURES OF PANDAS
• Easy handling of missing data (represented as NaN) in both floating-point and non-floating-point data
• Size mutability: columns can be inserted into and deleted from DataFrames and higher-dimensional objects
• Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data in computations
• Powerful, flexible group-by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data
• Easy conversion of ragged, differently indexed data in other Python and NumPy data structures into DataFrame objects
• Intelligent label-based slicing, fancy indexing, and subsetting of large data sets
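A small sketch of some of these features in use, with hypothetical data (pandas and NumPy assumed installed):

import numpy as np
import pandas as pd

# A DataFrame with missing data represented as NaN
df = pd.DataFrame({
    'crop': ['rice', 'rice', 'wheat', 'wheat'],
    'yield_t': [2.1, np.nan, 3.4, 3.0],
})

# Handle the missing value explicitly
df['yield_t'] = df['yield_t'].fillna(df['yield_t'].mean())

# Size mutability: insert and delete columns
df['region'] = 'south'
del df['region']

# Split-apply-combine: mean yield per crop
print(df.groupby('crop')['yield_t'].mean())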
PIL
The Python Imaging Library (PIL) is a free, open-source library that allows users to open, manipulate, and save images in various formats. PIL is not part of Python's standard library, but it provides a wide range of capabilities for image-processing tasks, including opening and saving image files, manipulating image data, applying filters and transformations, resizing, and cropping. It has been built and tested with Python 2.0 and newer, on Windows, Mac OS X, and major Unix platforms. Pillow was announced as the replacement for PIL for future usage; it supports a large number of image file formats, including BMP, PNG, JPEG, and TIFF, and the library encourages adding support for newer formats by creating new file decoders. To install it, use the Python package manager pip: open your command prompt or terminal and run the command "pip install pillow".
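A short sketch of typical Pillow operations; the file names are placeholders:

from PIL import Image, ImageFilter

# Open an image file (placeholder path)
img = Image.open("leaf.jpg")

# Resize, crop, and apply a filter
small = img.resize((224, 224))
patch = small.crop((10, 10, 100, 100))
blurred = patch.filter(ImageFilter.GaussianBlur(radius=2))

# Save the result in another format
blurred.save("leaf_patch.png")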
FLASK
The features listed below belong to Flask, the lightweight Python web framework used in the appendix to serve the trained model through a web interface.
FEATURES OF FLASK
• Development server and debugger
• Integrated support for unit testing
• RESTful request dispatching
• Uses Jinja templating
• Support for secure cookies (client-side sessions)
• Unicode-based
• Complete documentation
PYTORCH
PyTorch is an open-source machine learning framework based on the Torch library, used for
applications such as computer vision and natural language processing, originally developed by
Meta AI and now part of the Linux Foundation umbrella. It is recognized as one of the two most
popular machine learning libraries alongside TensorFlow, offering free and open-source
software released under the modified BSD license. PyTorch is written in Python and is
compatible with popular Python libraries like NumPy, SciPy, and Cython. It is used in
applications like: Image recognition, Language processing, Reinforcement learning, and Natural
language classification.
PyTorch is known for its flexibility and ease of use. It uses dynamic computation graphs, which are more flexible than static graphs: dynamic graphs allow users to interleave construction and evaluation of the graph, as the example below shows. PyTorch is also known for its speed; it uses the Torch library's GPU support to accelerate training and inference.
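A small sketch of the dynamic-graph style, assuming PyTorch is installed: the graph is built as ordinary Python control flow runs, and gradients are computed afterwards:

import torch

# Construction and evaluation are interleaved: this is ordinary Python code
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2
if y > 1:                # normal control flow participates in the graph
    z = y * 3
else:
    z = y + 1

# Backpropagate through whatever graph was actually built
z.backward()
print(x.grad)            # dz/dx = d(3x^2)/dx = 6x = 12.0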
FEATURES OF PYTORCH
• Tensor computation (similar to NumPy) with strong GPU acceleration
• Automatic differentiation through the autograd system
• Dynamic computation graphs built at run time
• Interoperability with NumPy, SciPy, and Cython
4.3 PLATFORM KNOWLEDGE
Introduction to Python:
Python was created by Guido van Rossum, who began developing it in 1989. Python is a simple programming language, so learning it is not difficult even for the inexperienced. It is a general-purpose programming language that is growing in popularity for data science: businesses all around the world are using Python to extract insights from their data and achieve a competitive advantage, and it offers convenient ways to modify and store data as well as useful data-science tools for starting your own analysis.
What is Python?
Python is a general-purpose, interpreted programming language. It is garbage-collected and dynamically typed. Although compilation is a step, Python is an interpreted language rather than a compiled one: Python code written in .py files is first compiled to bytecode, which is then stored in .pyc or .pyo files. The source code is converted to bytecode rather than machine code (as in C++); bytecode is a low-level collection of instructions that can be carried out by an interpreter. A well-liked benefit of interpreted languages is that they are platform-independent: Python bytecode can be run on any platform (Windows, macOS, etc.) as long as the virtual machine and the Python version are the same. Dynamic typing is an additional benefit: in statically typed languages like C++, you must define the variable type, and any inconsistency, such as adding a string and an integer, is verified at compile time. Memory allocation in earlier programming languages required a lot of manual work: variables had to be removed from memory once they were no longer needed or referenced anywhere else in the application. In Python, the garbage collector does this for you.
4.4.2 METHODS
A supervised learning algorithm learns from labelled training data and helps predict outcomes for unseen data. It is an extremely reliable and accurate procedure.
4.5 SOFTWARE SPECIFICATION
The first step in learning how to program in Python is to install or update Python on your computer. There are numerous ways to install Python: you can use a package manager, download official distributions from Python.org, or install customised distributions for embedded devices, scientific computing, and the Internet of Things. Since official distributions are typically the best option for getting started, this section concentrates on them.
Step 2: Install the Python app.
Once you have decided which version to install, the steps to finish the installation from the Microsoft Store are as follows:
1. Choose Get.
2. Wait for the application to download. When the download is complete, the Get button is replaced by a button labelled Install on my devices.
3. Choose the devices on which you want to finish the installation, and click Install on my devices.
4. To begin the installation, click Install Now and then OK.
5. If the installation was successful, you will see the message "This product is installed" at the top of the Microsoft Store page.
Install From the Full Installer: For professional developers who need a full-featured Python development environment, installing from the full installer is the right choice. It offers more customization and control over the installation than installing from the Microsoft Store.
After selecting and downloading an installer, double-click the downloaded file to launch it and follow the dialogue prompts.
Building Python from source (Linux):
1. Once you have the necessary requirements and the TAR file, unpack the source into a directory. The commands below add a new directory called Python-3.8.4 to the one you are in:
$ tar xvf Python-3.8.4.tgz
$ cd Python-3.8.4
2. To prepare the build, run the ./configure tool:
$ ./configure --enable-optimizations --with-ensurepip=install
The --enable-optimizations flag makes Python run about 10% faster but can add twenty or thirty minutes to the compilation time; --with-ensurepip=install installs pip alongside the interpreter.
3. Build Python. Dividing the build into parallel jobs speeds up compilation, although this phase can still take a while:
$ make -j 8
4. Installing your new Python version is the last step. To avoid replacing the system Python, use the altinstall target; you must run as root since you are installing into /usr/bin:
$ sudo make altinstall
Modifying an install
Python features can be added or removed using the Windows Programs and Features
tool once the programme has been installed. To launch the installer in maintenance mode,
select the Python entry and select "Uninstall/Change." By changing the checkboxes, you can
add or delete features; checkboxes that are left blank will not install or remove anything. In
order to edit some options, such as the install directory, you must first completely uninstall
Python and then reinstall it.
With the current settings, "Repair" will check all the files that should be installed and
will replace any that have been deleted or changed. With the exception of the Python
Launcher for Windows, which has its own entry in the program's registry, "Uninstall" will
delete Python completely.
Python Application:
Users of an application created in Python are not required to be aware of that fact.
In this scenario, a private version of Python could be included in an installation package using
the embedded distribution. There are two choices, depending on how clear (or, conversely,
how professional) it should be. Although it requires some engineering, using a dedicated
executable as a launcher gives users the most transparent experience. With a customised
launcher, the presence of Python is not immediately apparent: icons can be changed, company
and version information can be set, and file associations function as expected. Most of the
time, a customised launcher should be able to call Py_Main by passing it a hard-coded
command line. The simplest method is to supply a batch file or automatically produced
shortcut that launches python.exe or pythonw.exe directly and passes all necessary command-
line arguments. Users could find it challenging to distinguish this programme from other
active Python processes or file associations because it will appear to be Python rather than its
true name in this scenario.
The latter method requires that packages be installed as folders in addition to the
Python interpreter to make sure they are accessible via the path. Packages can be found in
other locations with the customised launcher since the search route can be specified before
the application is launched.
Embedding Python:
The embedded Python distribution can be used to create scripts for native code applications,
which frequently need them. The majority of the application is often written in native code,
and a small portion uses python.exe or python3.dll directly. To offer a loadable Python
interpreter in either situation, extracting the embedded distribution to a subfolder of the
application installation is sufficient.
Packages can be installed anywhere, just like applications, since search routes can be specified
before the interpreter is initialised. Nevertheless, there are no significant differences between a
standard installation and using an embedded distribution.
Customization:
The launcher will look for two .ini files: py.ini in the "application data" directory for the current user (the directory provided by running the Windows function SHGetFolderPath with CSIDL_LOCAL_APPDATA), and py.ini in the launcher's directory. The same .ini files are used by both the "console" version of the launcher (py.exe) and the "windows" version (pyw.exe).
The customization specified in the "application directory" takes precedence over the customization specified next to the executable, allowing a user to override commands in that global .ini file even without write access to the .ini file next to the launcher.
Customizing default Python versions
In some cases, a version qualifier can be included in a command to dictate which
version of Python will be used by the command. A version qualifier starts with a major
version number and can optionally be followed by a period (‘.’) and a minor version specifier.
Furthermore it is possible to specify if a 32 or 64 bit implementation shall be requested by
adding “-32” or “-64”. For example, a shebang line of #!python has no version qualifier,
while #!python3 has a version qualifier which specifies only a major version.
The environment variable PY_PYTHON can be set to define the default version qualifier if no version qualifiers are detected in a command. If it is not set, "3" is used as the default. Any value that can be supplied on the command line, including "3", "3.7", "3.7-32", or "3.7-64", can be specified via the variable. (Note that only launchers from Python 3.7 or newer support the "-64" option.)
The environment variable PY_PYTHON{major} (where major is the current major version qualifier as established above) can be set to specify the full version if no minor version qualifier is discovered. If no such option is present, the launcher will enumerate the installed Python versions and use the most recent minor release found for the major version. On 64-bit Windows, when the same (major.minor) Python version is installed as both 32-bit and 64-bit, the 64-bit version will always take precedence. This is true for both the 32-bit and the 64-bit launcher: if a 64-bit Python installation of the specified version is available, even the 32-bit launcher will choose to use it. This is done so that the launcher's behaviour can be predicted from the versions present on the computer alone, without regard to their installation order (i.e., without knowing whether a 32-bit or 64-bit version of Python and the corresponding launcher was installed last). As shown above, a version specifier's behaviour can be altered by adding an optional "-32" or "-64" suffix.
Examples:
• If no pertinent arguments are given, the commands python and python2 will use the most
recent installation of Python 2.x, whereas the command python3 will use the most recent
installation of Python 3.x.
• Because the versions are fully defined, the commands python3.1 and python2.7 will not
examine any options at all.
• The python and python3 commands will both utilise the most recent installation of Python 3 if PY_PYTHON=3.
• The commands python and python3 will both utilise 3.1 explicitly if PY_PYTHON=3 and PY_PYTHON3=3.1. In addition to environment variables, the same settings can be configured in the .INI file used by the launcher. The section in the INI file is called [defaults], and the key name is the same as the environment variable without the leading PY_ prefix (note that the key names in the INI file are case-insensitive). The contents of an environment variable override values specified in the INI file.
• When running python.exe or any other .exe in the main Python directory (either from an installed version or directly from the PCbuild directory), the core path is determined and the core paths in the registry are ignored. Other "application paths" in the registry are always read.
• When Python is hosted in another .exe (different directory, embedded via COM, etc.), the "Python Home" will not be determined, so the core path from the registry is used. Other "application paths" in the registry are always read.
• If Python cannot find its home and there are no registry values (frozen .exe, very odd installation configuration), you get a path with some default, but relative, directories.
Microsoft SQL Server Analysis Services has replaced OLAP Services as a feature in SQL Server version 7.0; the phrase "Analysis Services" has taken the place of "OLAP Services", and a new data-mining section has been added to Analysis Services. Microsoft SQL Server Meta Data Services is the new name for the Repository component included in SQL Server version 7.0; the term "Meta Data Services" now refers to the component as a whole, while "repository" refers only to the repository engine found within Meta Data Services.
The main database objects are:
1. TABLE
2. QUERY
3. FORM
4. REPORT
5. MACRO
1) TABLE:
a) Design View
b) Datasheet View
A) Design View
To build or modify the structure of a table, we work in the table design view.
B) Datasheet View
To add, edit, or analyse the data itself, we work in the table's datasheet view mode.
2) QUERY:
3) FORM:
To see and modify data in a database record, utilise a form. A form only presents the
data we want to see and in the format we want to see it in. Forms employ well-known
controls like textboxes and checkboxes. This facilitates data entry and display. Forms can be
used in a variety of views. There are primarily two views: the Design View and the Form
View.
Working in the form's design view allows us to create or alter the form's structure.
Textboxes, choice buttons, graphs, and images are just a few of the controls that may be
added to the form and tied to specific fields in a table or query.
4) REPORT:
Information from the database is viewed and printed using reports. The report can
group records into numerous levels, compute totals, and calculate an average by
simultaneously comparing values from numerous records. Also, the report is eye-catching and
different due to our ability to customise its size and look.
5) MACRO:
A macro is a collection of procedures. Each command in a macro performs a specific task, such as producing a report or opening a form. We create macros to speed up and automate time-consuming routine chores.
There are numerous characteristics that define SQL procedures. The SQL Procedural Language statements and features found in SQL procedures support building control-flow logic around both standard static and dynamic SQL statements. SQL procedures:
• Are supported across the board by the entire DB2 family of database products, many, if not all, of which support DB2 Version 9.
• Are simple to implement, because they use a strongly typed, high-level language.
• Let you return multiple result sets to the caller or to a client application.
IMAGE TYPES
Outside of MATLAB, images can be colour, grayscale, or black and white; within MATLAB, however, there are four different kinds of images. Black-and-white images are known as binary images, containing 1 for white and 0 for black. Grayscale images, with values between 0 and 255 or between 0 and 1, are known as intensity images. Coloured images can be represented as indexed images or RGB images.
An RGB image consists of three planes: the first contains the red component, the second the green, and the third the blue. Hence, the matrix for a 640 by 480 RGB image will be 640 by 480 by 3. An indexed image is a different approach to representing coloured images, storing indices into a colormap rather than colour values directly.
• RGB Image to Binary Image (im2bw), RGB Image to Indexed Image, and RGB Image to Intensity Image; a Python equivalent using Pillow is sketched below.
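These MATLAB-style conversions have close equivalents in Pillow; a minimal sketch (the file name is a placeholder):

from PIL import Image

img = Image.open("leaf.jpg")                        # RGB image

gray = img.convert("L")                             # RGB -> intensity (grayscale)
binary = gray.point(lambda p: 255 if p > 128 else 0, mode="1")   # grayscale -> binary
indexed = img.convert("P", palette=Image.ADAPTIVE)  # RGB -> indexed

binary.save("leaf_bw.png")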
KEY FEATURES
Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimisation, and numerical integration; 2-D and 3-D graphics functions for visualising data; tools for building specialised graphical user interfaces; and a development environment for managing code, files, and data.
Today's organizations gather a lot of data from numerous sources. Raw data is
insufficient, though. In order to make effective business decisions, you must examine data for
practical insights. Effective data collection, storage, and processing are essential for accurate
data analysis. Different datasets require different tools for optimal analysis, and there are
several database technologies and data processing tools available. Data modeling gives you a
chance to understand your data and make the right technology choices to store and manage
this data. In the same way an architect designs a blueprint before constructing a house,
business stakeholders design a data model before they engineer database solutions for their
organization.
Conceptual data models provide a broad perspective on the data. They explain: what data the system includes; its qualities, conditions, or constraints; the business rules it pertains to; the best way to arrange the data; and the needs for security and data integrity.
CHAPTER 5
SYSTEM DESIGN
(a) Input image, (b) binary image with noise, (c) noise-free binary image of the diseased leaf
Figure 5.1.2 Process Flow Diagram
Image Acquisition:
• Images were saved in JPEG format at 1280 x 960 pixels.
Pre-processing:
• Pre-processing techniques use a small neighbourhood of pixels in the input image to compute a new brightness value for the output image.
Segmentation:
• Image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.
• Perform grey-level thresholding and convert the image to binary form.
Morphological operations:
• The output image is produced by comparing the value of each pixel with the values of its neighbours.
5.2 MODULES
• Dataset
• Importing the required libraries
• Retrieving and resizing the images
• Splitting the dataset
• Building and training the model
• Apply the model and plot the graphs for accuracy and loss
5.2.1 MODULES DESCRIPTION:
Dataset:
We created a system in the first module to obtain the input dataset for training and testing. The dataset is kept in the model folder. There are 70,295 photos of plants in the dataset, including tomatoes, blueberries, cherries, apples, grapes, oranges, bell peppers, potatoes, raspberries, soybeans, strawberries, and corn (maize). The referenced dataset is taken from the well-known public source Kaggle. The dataset's link is provided below.
Link to Kaggle dataset: https://data.mendeley.com/datasets/tywbtsjrjv/1
Importing the required libraries:
In the following module, we import the libraries our system requires to identify plant diseases. Python is the language we use. Before anything else, we import keras (for creating the main model), sklearn (for splitting the training and test data), PIL (for converting the images into arrays of numbers), numpy (for array operations), matplotlib (for plotting), and tensorflow.
We load the images together with their labels. The photos are then resized to (224, 224), since they all need to be the same size for recognition, and converted into a numpy array.
Splitting the dataset:
The dataset is split into train and test portions: 80% train data and 20% test data, as the sketch below illustrates.
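This is a minimal sketch of the loading, resizing, and splitting step; the folder-per-class directory layout and the helper name are assumptions:

import os
import numpy as np
from PIL import Image
from sklearn.model_selection import train_test_split

def load_dataset(root="dataset"):   # assumed folder-per-class layout
    images, labels = [], []
    for label, class_dir in enumerate(sorted(os.listdir(root))):
        for name in os.listdir(os.path.join(root, class_dir)):
            img = Image.open(os.path.join(root, class_dir, name)).convert("RGB")
            images.append(np.asarray(img.resize((224, 224)), dtype=np.float32) / 255.0)
            labels.append(label)
    return np.stack(images), np.array(labels)

X, y = load_dataset()
# 80% train, 20% test, as described above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)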
The major goal of Inception v3 is to consume less computing power by altering the original
Inception designs. Inception Networks (Google Net/Inception v1) have proven to be more
computationally effective than VGGNet in terms of the amount of parameters the network
generates and the cost-effectiveness of the effort required (memory and other resources). It is
important to take care not to lose the computational benefits while making changes to an
Inception Network. Due to the unknown effectiveness of the new network, it becomes
difficult to modify an Inception network for various use cases. Many network optimisation methods have been proposed in the Inception v3 model to relax these restrictions.
Inception v3 Architecture:
1. Factorized Convolutions:
By lowering the number of parameters in a network, this technique reduces the computational cost. It also monitors the efficiency of the network.
2. Smaller convolutions:
Training is unquestionably sped up by swapping out larger convolutions for smaller ones. A 5×5 filter has 25 parameters; replacing it with two 3×3 filters results in only 18.
Figure 5.2.2 Inception v3 Architecture
(3×3 + 3×3 = 18 parameters.) A 3×3 convolution can be seen in the middle of the figure, and a fully connected layer below it. Because both 3×3 convolutions can share weights among themselves, the amount of computation can be decreased, as the sketch below shows.
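A small Keras sketch that verifies this parameter arithmetic (bias terms are disabled and a single-channel input is assumed so the counts match 25 versus 18):

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv2D

inp = Input(shape=(32, 32, 1))

# One 5x5 convolution: 5*5 = 25 weights per filter
big = Model(inp, Conv2D(1, 5, use_bias=False)(inp))

# Two stacked 3x3 convolutions: 3*3 + 3*3 = 18 weights, same receptive field
x = Conv2D(1, 3, use_bias=False)(inp)
small = Model(inp, Conv2D(1, 3, use_bias=False)(x))

print(big.count_params())    # 25
print(small.count_params())  # 18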
3. Asymmetric convolutions:
A 3×3 convolution can be replaced by a 1×3 convolution followed by a 3×1 convolution, further reducing the number of parameters.
4. Auxiliary classifier:
During training, a small CNN is placed between layers, and the loss it incurs is added to the loss of the main network. Whereas GoogLeNet used auxiliary classifiers to train a deeper network, in Inception v3 an auxiliary classifier serves as a regularizer.
5. Grid size reduction:
Grid size reduction is usually done by pooling operations. However, to combat the bottlenecks of computational cost, a more efficient technique is proposed.
The model will be built and applied with the fit function, with 10 batches. The graphs for accuracy and loss will then be plotted, as sketched below. An average validation accuracy of 91.00% was obtained, and for the test set we achieved an accuracy of 89.00%.
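This is a minimal sketch of the training-and-plotting step, continuing the hypothetical model and data names from the earlier sketches; the epoch count is an assumption:

import matplotlib.pyplot as plt

# Train the compiled model; 'model', X_train, etc. come from the sketches above
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    batch_size=10, epochs=10)

# Plot accuracy and loss curves from the training history
for metric in ("accuracy", "loss"):
    plt.figure()
    plt.plot(history.history[metric], label="train")
    plt.plot(history.history["val_" + metric], label="validation")
    plt.title(metric)
    plt.legend()
plt.show()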
Saving the Trained Model: the initial step, when you are ready to use the model in a production-ready environment, is to save your trained and tested model into a .h5 or .pkl file, using a library like pickle. Next, we import the module and dump the model into a .h5 file, as sketched below.
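In Keras this step is a single call; a short sketch (the file name plants.h5 matches the appendix code):

from tensorflow.keras.models import load_model

# Persist the trained model to a single HDF5 file
model.save("plants.h5")

# Later, e.g. inside the Flask app, reload it for inference
model = load_model("plants.h5")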
5.3 LOGICAL DIAGRAM
5.3.1 SYSTEM ARCHITECTURE:
5.3.2 UML DIAGRAM:
Figure 5.3.2 UML Diagram
[System architecture: plant image dataset → pre-processing and feature selection → InceptionV3 architecture model → predicted results (plant disease classification), with loss and model accuracy evaluated against the training dataset.]
In the Unified Modeling Language (UML), a use case diagram is a specific kind of
behavioural diagram that results from and is defined by a use-case analysis. Its objective is to
provide a graphical picture of a system's functionality in terms of actors, their objectives
(expressed as use cases), and any dependencies among those use cases. A use case diagram's
primary objective is to identify which system functions are carried out for which actor. The
system's actors can be represented by their roles.
Input                        Output
Dataset acquisition          Feature extraction
Input image dataset          Classification
In the Unified Modeling Language (UML), a sequence diagram is a type of interaction diagram that demonstrates how, and in what order, processes interact with one another. It is a construct of a Message Sequence Chart. Sequence diagrams are sometimes also called event diagrams, event scenarios, or timing diagrams.
In the Unified Modeling Language, activity diagrams can be used to describe the business and operational step-by-step workflows of components in a system. An activity diagram shows the overall flow of control.
[Activity diagram: Input dataset → Preprocessing → Training → Model: InceptionV3]
CHAPTER 6
Microservices architectures are not a brand-new method of developing software, but rather a synthesis of several effective and established ideas, such as:
• Automatic scaling by unit of use
• Pay-for-value billing approach
• Built-in availability and fault tolerance
• No infrastructure to provision or manage
# Flask application serving the trained model. The imports were missing from
# the original listing and have been reconstructed; the upload route around
# the predict_label() call was also lost and is sketched below with assumed
# route, form-field, and template names.
import numpy as np
from flask import Flask, render_template, request
from tensorflow.keras.models import load_model
from tensorflow.keras.metrics import AUC
from tensorflow.keras.preprocessing import image

app = Flask(__name__)

dependencies = {
    'auc_roc': AUC
}

verbose_name = {
    0: 'Apple___Apple_scab',
    1: 'Apple___Black_rot',
    2: 'Apple___Cedar_apple_rust',
    3: 'Apple___healthy',
    4: 'Blueberry___healthy',
    5: 'Cherry_(including_sour)___Powdery_mildew',
    6: 'Cherry_(including_sour)___healthy',
    7: 'Corn_(maize)___Cercospora_leaf_spot Gray_leaf_spot',
    8: 'Corn_(maize)___Common_rust_',
    9: 'Corn_(maize)___Northern_Leaf_Blight',
    10: 'Corn_(maize)___healthy',
    11: 'Grape___Black_rot',
    12: 'Grape___Esca_(Black_Measles)',
    13: 'Grape___Leaf_blight_(Isariopsis_Leaf_Spot)',
    14: 'Grape___healthy',
    15: 'Orange___Haunglongbing_(Citrus_greening)',
    16: 'Peach___Bacterial_spot',
    17: 'Peach___healthy',
    18: 'Pepper,_bell___Bacterial_spot',
    19: 'Pepper,_bell___healthy',
    20: 'Potato___Early_blight',
    21: 'Potato___Late_blight',
    22: 'Potato___healthy',
    23: 'Raspberry___healthy',
    24: 'Soybean___healthy',
    25: 'Squash___Powdery_mildew',
    26: 'Strawberry___Leaf_scorch',
    27: 'Strawberry___healthy',
    28: 'Tomato___Bacterial_spot',
    29: 'Tomato___Early_blight',
    30: 'Tomato___Late_blight',
    31: 'Tomato___Leaf_Mold',
    32: 'Tomato___Septoria_leaf_spot',
    33: 'Tomato___Spider_mites Two-spotted_spider_mite',
    34: 'Tomato___Target_Spot',
    35: 'Tomato___Tomato_Yellow_Leaf_Curl_Virus',
    36: 'Tomato___Tomato_mosaic_virus',
    37: 'Tomato___healthy'}

model = load_model('plants.h5')

def predict_label(img_path):
    # Load the image, scale pixel values to [0, 1], and add a batch axis
    test_image = image.load_img(img_path, target_size=(224, 224))
    test_image = image.img_to_array(test_image) / 255.0
    test_image = test_image.reshape(1, 224, 224, 3)
    predict_x = model.predict(test_image)
    classes_x = np.argmax(predict_x, axis=1)
    return verbose_name[classes_x[0]]

@app.route("/")
@app.route("/first")
def first():
    return render_template('first.html')

@app.route("/login")
def login():
    return render_template('login.html')

# Reconstructed upload handler (route, field, and template names are assumptions)
@app.route("/submit", methods=['POST'])
def get_output():
    img = request.files['my_image']
    img_path = "static/" + img.filename
    img.save(img_path)
    predict_result = predict_label(img_path)
    return render_template("prediction.html", prediction=predict_result, img_path=img_path)

@app.route("/chart")
def chart():
    return render_template('chart.html')

if __name__ == '__main__':
    app.run(debug=True)
<html lang="en">
<head>
<meta charset="utf-8">
<title>plant</title>
<link href="https://cdnjs.cloudflare.com/ajax/libs/font-
awesome/5.10.0/css/all.min.css" rel="stylesheet">
<link href="../static/lib/owlcarousel/assets/owl.carousel.min.css"
rel="stylesheet">
</head>
<body>
</a>
<span class="navbar-toggler-icon"></span>
</button>
</div>
</a>
</div>
</div>
</nav>
</div>
</div>
<p class="m-0">Login</p>
</div>
</div>
</div>
<div class="col-lg-6">
</div>
</div>
<div class="col-lg-9">
<head>
<script>
addEventListener("load", function () { setTimeout(hideURLbar, 0); }, false);
function hideURLbar() { window.scrollTo(0, 1); }
function login() {
alert("Login Success!");
}
</script>
</head>
<body id="page-top">
<div class="row">
<div class="control-group">
<br>
<br>
<label class="control-label" for="username"><b>Username</b></label>
<div class="controls">
<br>
<br>
<div class="control-group">
<!-- Password-->
<div class="controls">
<input type="password" id="pwd" name="pwd" placeholder="" class="formcontrol"> </div>
</div>
<div class="control-group">
<div class="controls">
<br>
<br>
<br>
<input type="button" class="btn btn-primary" value="Login" style="margin-left: 80px"
onclick="login()">
</div>
</div>
</div>
</div>
</div>
</section>
</body>
</div>
</div>
</div>
</div>
<script src="https://code.jquery.com/jquery-3.4.1.min.js"></script>
<script
src="https://stackpath.bootstrapcdn.com/bootstrap/4.4.1/js/bootstrap.bundle.min
.js"></script>
<script src="../static/lib/easing/easing.min.js"></script>
<script src="../static/lib/waypoints/waypoints.min.js"></script>
<script src="../static/lib/owlcarousel/owl.carousel.min.js"></script>
<script src="../static/lib/isotope/isotope.pkgd.min.js"></script>
<script src="../static/lib/lightbox/js/lightbox.min.js"></script>
<script src="../static/mail/jqBootstrapValidation.min.js"></script>
<script src="../static/mail/contact.js"></script>
<script src="../static/js/main.js"></script>
</body>
</html>
6.1.2 SCREENSHOTS
Figure 6.1.2.1 Home page
Figure 6.2.2.3 Login page
Figure 6.2.2.4 Image 1 Preview page
Figure 6.2.2.5 Image 1 Prediction page
Figure 6.2.2.6 Image 2 Preview page
Figure 6.2.2.7 Image 2 Prediction page
Figure 6.2.2.8 Image 3 Preview page
Figure 6.2.2.9 Image 3 Prediction page
Figure 6.2.2.10 Model Accuracy
CHAPTER 7
SYSTEM TESTING
Testing is done to look for mistakes. Testing is the process of looking for any flaws or
weaknesses in a piece of work. It offers a means of testing whether parts, sub-assemblies,
assemblies, and/or a finished product perform properly. It is the process of testing software to
make sure that it satisfies user expectations and meets requirements without failing in an
unacceptable way. Several test types exist. Every test type responds to a certain testing
requirement.
Designing test cases for unit testing ensures that the internal programme logic is
working correctly and that programme input results in legitimate outputs. It is important to
verify the internal code flow and all decision branches. It is the testing of the application's
separate software components. Before integration, it is done following the completion of each
individual unit. This is an invasive structural test that depends on understanding how it was
built. Unit tests carry out fundamental tests at the component level and examine a particular
configuration of a system, application, or business process.
Field testing will be performed manually and functional tests will be written in detail.
Software components that have been merged are tested in integration tests to see if
they genuinely operate as a single programme. Testing is event-driven and focuses more on
the fundamental result of screens or fields. Even though the individual components were
successful in unit testing, integration tests indicate that the combination of the components is
accurate and consistent. Integration testing is especially designed to highlight issues that
result from combining components. The incremental testing of two or more integrated
software components on a single platform known as "software integration testing" is done to
induce failures brought on by interface flaws.
Functional tests provide systematic demonstrations that functions tested are available
as specified by the business and technical requirements, system documentation and user
manuals.
System testing ensures that the integrated software system as a whole complies
with specifications. In order to provide known and predictable outcomes, it tests a setup. The
configuration-oriented system integration test is an illustration of system testing. System
testing is based on process flows and descriptions, with an emphasis on integration points and
linkages to pre-driven processes.
White box testing is a type of testing in which the software tester is familiar with the inner workings, structure, and language of the software, or at least knows what it is intended to do. It is purposeful: it is used to test areas that cannot be reached from a black-box level.
CHAPTER 8
SYSTEM IMPLEMENTATION
The system design phase of the system life cycle is the most creative and challenging.
The process of specifying how a system will fulfil the needs found during system analysis is
known as system design. As the analyst creates a logical system design, they precisely define
the user requirements, which effectively dictates how information enters and leaves the
system as well as the necessary data source. The first step in the design process is to decide
how and in what format the output will be created. Second, input data and master files need to
be created to meet output specifications. The system flow chart, which serves as the
foundation for the coding phase, will be completed at the conclusion of the design phase.
A system flowchart can display a high level picture of how a system is organised. The
rectangular boxes indicate original software assemblages. The system designers who
must create the entire system architecture of hardware and software to implement the
user requirements frequently find this graphic to be very helpful. The following
scenario is one in which system flowcharts could be helpful modelling tools: the
creation of the user implementation model, which comes after the system analyst's
work. The user, the system analyst, and the implementation team now debate the
implementation limitations that must be placed on the system, such as establishing the
human interface and the automation border.
8.2 INPUT DESIGN
Information supplied to the system is what is referred to as the input of a system. The
system uses this for later processing in order to gather useful information that aids in
decision-making. The process of translating user-oriented inputs to a computer-based
representation is known as input design. The design of the entire system, which
includes input, calls for specific consideration. The most frequent cause of errors in
data processing is inaccurate input data. Errors made by data-entry operators can be reduced through careful input design. Data entry must be reviewed for accuracy and error detection. The proper error message must be shown, and the user should not be permitted to type any data that is invalid.
8.3 OUTPUT DESIGN
The most significant and immediate source of information for the user is the computer output. Efficient and understandable output design improves the system's relationship with the user and aids decision-making. The output design was actively studied throughout the study phase. Its goal is to specify the content and format of all publications and reports in a way that is both aesthetically pleasing and functional. Once the output requirements are known, factors such as system compatibility and response-time requirements should be taken into account when choosing the output device.
CHAPTER 9
CONCLUSION & FUTURE SCOPE
9.1 CONCLUSION
In the Asia-Pacific Region, where 56% of the world's population lives, food production provides and supports more than 90% of global consumption. In many countries, the demand for food is expected to grow more quickly than production. An unavoidable problem is how to increase production from the current level of 524 million tonnes annually to 700 million tonnes by 2025, while using less land, less water, less labour, and fewer agrochemicals. Alternative solutions to the problem of uneven and vertical development have their own advantages and disadvantages. In light of this situation, bridging the yield gap to deliver more appears to be the most promising way forward.
In this study, a paradigm was developed for evaluating bacterial blight, blast, sheath rot, and black collared spot diseases. Image processing techniques such as segmentation and feature extraction, together with two classifiers, were used to set up the classification algorithm. Colour and area-aware shape features were extracted in the feature extraction step and used as input to the classifier. Although the analysis above has shown the general nature of the predicted interactions, it is also crucial to evaluate the reliability of each predicted PPI. To evaluate the consistency of each predicted interaction, two p-values were introduced in this work, taking into account GO terms and the correlation between each predicted interaction and its expression. A separate database was used for training on each disease.
SVM outperforms all currently used machine learning techniques, as well as conventional regression methods, for detecting plant diseases. We have also developed an SVM-based web server for blast prediction, a first of its type globally, to support the plant science community and agriculturalists in their decision-making processes. This is another step along the way.
The accuracy achieved with the two classifiers, k-NN and MDC, is 87.02% and 89.23%, respectively, for the proposed systems for the four diseases stated above. The proposed system was compared with a few existing approaches aimed specifically at the same diseases and was found to be superior in terms of time complexity, precision, and the number of diseases covered.
REFERENCES
[1] Aakanksha Rastogi, Ritika Arora, and Shanu Sharma, "Leaf Disease Detection and Grading using Computer Vision Technology & Fuzzy Logic," 2nd International Conference on Signal Processing and Integrated Networks (SPIN), 2015.
[2] Asma Akhtar, Aasia Khanum, Shoab A. Khan, and Arslan Shaukat, "Automated Plant Disease Analysis (APDA): Performance Comparison of Machine Learning Techniques," 11th International Conference on Frontiers of Information Technology, IEEE, 2013, pp. 60-65.
[3] A. Caglayan, O. Guclu, and A. B. Can, "A Plant Recognition Approach Using Shape and Color Features in Leaf Images," International Conference on Image Analysis and Processing, Springer, Berlin, Heidelberg, September 2013, pp. 161-170.
[4] K. Elangovan and S. Nalini, "Plant Disease Classification Using Image Segmentation and SVM Techniques," International Journal of Computational Intelligence Research, vol. 13, no. 7, 2017, pp. 1821-1828.
[5] Godliver Owomugisha, John A. Quinn, Ernest Mwebaze, and James Lwasa, "Automated Vision-Based Diagnosis of Banana Bacterial Wilt Disease and Black Sigatoka Disease," Proceedings of the 1st International Conference on the Use of Mobile ICT in Africa, 2014.
[6] Gilbert Gutabaga Hungilo, Gahizi Emmanuel, and Andi W. R. Emanuel, "Image Processing Techniques for Detecting and Classification of Plant Disease: A Review," Proceedings of the 2019 International Conference on Intelligent Medicine and Image Processing, 2019, pp. 48-52.
[7] J. G. A. Barbedo, "Digital Image Processing Techniques for Detecting, Quantifying and Classifying Plant Diseases," SpringerPlus, vol. 2, no. 660, 2013, pp. 1-12.
[8] Sharada P. Mohanty, David P. Hughes, and Marcel Salathé, "Using Deep Learning for Image-Based Plant Disease Detection," Frontiers in Plant Science, vol. 7, 2016, p. 1419.
[9] P. R. Rothe and R. V. Kshirsagar, "Cotton Leaf Disease Identification Using Pattern Recognition Techniques," International Conference on Pervasive Computing (ICPC), 2015.