Final Report TLM
A Project Report on
“TRASH DETECTION AND CLASSIFICATION USING DEEP
LEARNING”
Submitted in partial fulfillment of the requirement for the award of the degree of
Bachelor of Engineering
In
Computer Science and Engineering
Submitted by
DHANANJAY K R(4NN19CS007)
PRADYUMNA S(4NN19CS021)
SHREENIDHI TH(4NN19CS028)
CHANDANA LC(4NN20CS407)
Under the Guidance of
Mr. AJAY A V
Assistant Professor
Dept. of CS & Engineering
2022-2023
NIE Institute of Technology, Mysuru
(Affiliated to Visvesvaraya Technological University, Belagavi)
EXTERNAL VIVA
DECLARATION
We undertake that the project work entitled “TRASH DETECTION AND CLASSIFICATION
USING DEEP LEARNING” has been carried out by us under the guidance of Mr. AJAY A V,
Asst. Professor, Department of Computer Science and Engineering, NIEIT, Mysuru-18, in
partial fulfillment of the requirement for the award of the degree of Bachelor of Engineering in
Computer Science and Engineering of Visvesvaraya Technological University, Belagavi-590018.
The project has been our original work and has not formed the basis for the award of any
degree, associateship, fellowship, or any other similar titles.
Dhananjay K R (4NN19CS007)
Pradyumna S (4NN19CS021)
Shreenidhi T H (4NN19CS028)
Chandana L C (4NN20CS407)
ACKNOWLEDGEMENT
The satisfaction and euphoria that accompany the successful completion of any task would be
incomplete without the mention of the people who made it possible, whose consistent guidance and
encouragement crowned our efforts with success.
We consider ourselves proud to be a part of the NIE Institute of Technology Mysuru family, the
institution which stood by us in all our endeavors.
We wish to express our gratitude to Dr. Rohini Nagapadma, Principal, NIE Institute of Technology,
Mysuru, for providing a congenial working environment.
We express our sincere thanks to Dr. USHA M S, Associate Professor & Head, Department of
Computer Science and Engineering, for her support and encouragement.
We would like to thank our guide, Mr. AJAY A V, Assistant Professor, Department of Computer
Science and Engineering, for his inspiration, guidance, constant supervision, direction, and
discussions in the successful completion of this project work.
We are thankful to the teaching and non-teaching staff of the Department of Computer Science and
Engineering for the cooperation extended towards this work.
Finally, our heartfelt gratitude to our family members, relatives, and friends for their constant support,
motivation, and encouragement throughout this project. Above all, we would like to thank God,
Almighty, for having showered his blessings on us.
Dhananjay K R(4NN19CS007)
Pradyumna S (4NN19CS021)
Shreenidhi T H (4NN19CS028)
Chandana L C (4NN20CS407)
ABSTRACT
The problem of effective disposal of the trash generated by people has rightfully attracted major interest
from various sections of society. Recently, deep learning solutions have been proposed to design
automated mechanisms to segregate waste. However, most datasets used for this purpose are not
adequate. We have used a dataset containing 14,263 images across 7 different classes, including
medical-waste and e-waste classes that are not included in any other existing dataset. We also
experimented with transfer learning-based models trained on our dataset to evaluate its generalizability
and achieved a remarkable accuracy of 98.47%. Experimental evaluation on benchmark datasets has
shown very promising results.
We propose a trash classification model based on a deep convolutional neural network to help
identify trash categories. Data augmentation and normalization are carried out to address the small
size of the dataset and the varying sizes of the images. We have used Global Average
Pooling to provide a form of regularization that reduces overfitting by creating a more generalized
representation of the input. By reducing the spatial dimensions, the model is forced to focus on the
most important features of the input while discarding the less important ones. This helps prevent the
model from memorizing specific details of the training data, which can lead to poor performance on
unseen data.
Also, we have compared our main model with other models by building 4 different models using
various algorithms and methods. Finally, we conclude which model is more accurate in
classifying waste into 7 different classes.
TABLE OF CONTENTS
CHAPTER NO. TITLE PAGE NO.
ACKNOWLEDGEMENT I
ABSTRACT II
LIST OF CONTENTS III
LIST OF FIGURES V
LIST OF TABLES VI
CHAPTER 1: INTRODUCTION 1
1.1 Introduction 1
1.1.1 Introduction to Machine Learning 2
1.1.2 Introduction to Transfer Learning 3
1.2 Introduction to the project 5
1.2.1 Features 5
1.3 The motivation behind the project 6
1.4 Scope of the project 6
1.5 Organization of the project 7
CHAPTER 2: LITERATURE SURVEY 8
2.1 Table of Comparison 10
CHAPTER 3: SYSTEM REQUIREMENTS 11
3.1 Platform and Language Used 11
3.2 Software Requirements 13
3.3 Hardware Requirements 13
CHAPTER 4: SYSTEM ANALYSIS 14
4.1 Existing System 15
4.2 Analysis of the proposed system 16
CHAPTER 5: SYSTEM DESIGN 17
5.1 Design and Methodology of the proposed project 17
5.2 Proposed Dataset 18
5.3 System Architecture 21
5.4 Functional Requirements 22
5.4.1 Product function 23
5.4.2 General Constraints 23
CHAPTER 6: SYSTEM IMPLEMENTATION 24
6.1 DFD 0 24
6.2 DFD 1 24
6.3 Activity Diagram 25
6.4 Use case Diagram 26
6.5 Methodology 27
6.6 CNN 29
6.6.1 CNN Architecture 29
6.6.2 CNN Algorithm 29
6.7 ReLU Activation 31
6.8 Global Average Pooling 32
6.9 Pseudo Code 33
CHAPTER 7: SYSTEM TESTING 34
7.1 Testing methodologies 34
7.2 Case Studies 35
7.3 Test Cases 36
CHAPTER 8: EXPERIMENTAL RESULTS AND ANALYSIS 37
CHAPTER 9: SNAPSHOTS 39
CHAPTER 10: CONCLUSION AND FUTURE SCOPE 44
REFERENCES 45
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
INTRODUCTION
1.1 INTRODUCTION
The past decade has evidenced an explosive increase in the amount of trash generated every
day by urban populations. However, the current practice of directly discarding items into the
trashcan is highly unsustainable, particularly when raw materials are finite resources that will
eventually be exhausted. Recent surveys have reported that, in Delhi, roughly 80% of the waste sent
to landfills daily could be recycled. Waste generation is estimated to increase from 64-72 million
tonnes per year to 125 million tonnes by 2031. Untreated waste that includes recyclable non-biodegradable waste
lies for years at dumpsites where land areas are allocated for disposal of residual waste. Currently,
the two major forms of waste disposal are incineration, where the object is simply burned, and
landfilling, where the object is dumped in a predetermined spot. Of these two methods, landfilling
can be said to be far more popular, due to the lack of machinery or investment required to start
operations. Similarly, with the COVID-19 pandemic, there has been a sharp increase in the number
of surgical masks, gloves, and other such items discarded as litter, often without any regard to
mandated bio-hazardous material disposal procedures. Such hazardous items must be handled with
appropriate care, to protect waste-disposal workers and to prevent environmental contamination. To reduce
the amount of waste sent to landfills, there is a need for automated systems for the segregation of the
same. However, manual sorting is a tedious and often hazardous process, which does not scale well,
being labor- and cost-prohibitive. The trash sorting machines currently employed work on physical
sorting processes, not intelligent computerized methods, which can help in more accurate class-wise
segregation. Specifically, deep neural models trained on a trash image dataset with a large variety of
classes can help classify trash objects based on their type and boost recyclable product reclamation
and productivity.
Segregation of waste materials is important for a sustainable society. Initially, segregation was done
by hand. This became tedious as the amount of waste grew with the population, so there is a need for
a system that can automatically sort the waste. Such a system would also be more reliable, since
workers do not sort the waste completely. Waste segregation implies separating waste into dry and
wet. Dry waste includes wood, plastic, metal, and glass. Wet waste commonly refers to organic waste,
for the most part produced by eating establishments, and is heavy owing to its moisture content.
Waste can likewise be separated on the basis of being biodegradable or non-biodegradable.
Machine learning (ML) is the scientific study of algorithms and statistical models that computer
systems use to effectively perform a specific task without using explicit instructions, relying on
patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning
algorithms build a mathematical model of sample data, known as "training data", to make
predictions or decisions without being explicitly programmed to perform the task. Machine learning
algorithms are used in a wide variety of applications, such as email filtering and computer vision,
where it is infeasible to develop an algorithm of specific instructions for performing the task.
Machine learning is closely related to computational statistics, which focuses on making predictions
using computers. The study of mathematical optimization delivers methods, theory, and application
domains to the field of machine learning. Data mining is a field of study within machine learning
that focuses on exploratory data analysis through unsupervised learning. In its application across
business problems, machine learning is also referred to as predictive analytics.
Machine learning tasks are classified into several broad categories. In supervised learning, the
algorithm builds a mathematical model from a set of data that contains both the inputs and the
desired outputs. For example, if the task were determining whether an image contained a certain
object, the training data for a supervised learning algorithm would include images with and without
that object (the input), and each image would have a label (the output) designating whether it
contained the object. In special cases, the input may be only partially available, or restricted to
special feedback.
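The object-labeling setting above can be made concrete with a toy model. The sketch below is a minimal nearest-centroid classifier in NumPy; the data and the model are made up purely for illustration and are not part of this project:

```python
import numpy as np

# Labeled training data: each input comes with its desired output,
# exactly the setting described for supervised learning
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])

# "Training" builds a simple mathematical model: the centroid of each class
centroids = np.array([X_train[y_train == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    """Classify a new input by its nearest class centroid."""
    return int(np.linalg.norm(centroids - np.asarray(x), axis=1).argmin())

print(predict([0.1, 0.0]))  # 0: close to the class-0 examples
print(predict([1.0, 0.9]))  # 1: close to the class-1 examples
```

The same input/output structure carries over to image classification; only the model and feature representation change.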
Semi-supervised learning algorithms develop mathematical models from incomplete training data,
where a portion of the sample input doesn't have labels. Supervised learning algorithms include
classification and regression. Classification algorithms are used when the outputs are restricted to a
limited set of values, and regression algorithms are used when the outputs may have any numerical
value within a range. Similarity learning is an area of supervised machine learning closely related to
regression and classification, but the goal is to learn from examples using a similarity function that
measures how similar or related two objects are. It has applications in ranking, recommendation
systems, visual identity tracking, face verification, and speaker verification.
Unsupervised learning algorithms take a set of data that contains only inputs, and find structure in
the data, like grouping or clustering of data points. The algorithms therefore learn from data that
has not been labeled, classified, or categorized. Instead of responding to feedback, unsupervised
learning algorithms identify commonalities in the data and react based on the presence or absence of
such commonalities in each new piece of data. A central application of unsupervised learning is in
the field of density estimation in statistics, though unsupervised learning encompasses other domains
involving summarizing and explaining data features.
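The grouping idea can be sketched with a tiny hand-written k-means loop on made-up one-dimensional data (illustrative only, not this project's code):

```python
import numpy as np

# Unlabeled data: only inputs, no outputs -- two obvious groups on a line
points = np.array([0.9, 1.0, 1.1, 7.9, 8.0, 8.1])

# k-means with k=2: repeatedly assign each point to its nearest centroid,
# then move each centroid to the mean of the points assigned to it
centroids = np.array([0.0, 10.0])  # rough initial guesses
for _ in range(5):
    assign = np.abs(points[:, None] - centroids[None, :]).argmin(axis=1)
    centroids = np.array([points[assign == k].mean() for k in range(2)])

print(centroids)  # the two cluster centers found without any labels
```

No labels were used at any point; the structure (two clusters near 1.0 and 8.0) is discovered from the inputs alone.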
Reinforcement learning is an area of machine learning concerned with how software agents ought to
take actions in an environment to maximize some notion of cumulative reward. Due to its generality,
the field is studied in many other disciplines, such as game theory, control theory, operations
research, information theory, simulation-based optimization, multi-agent systems, swarm
intelligence, statistics, and genetic algorithms. In machine learning, the environment is typically
represented as a Markov Decision Process (MDP). Many reinforcement learning algorithms use
dynamic programming techniques. Reinforcement learning algorithms do not assume knowledge of
an exact mathematical model of the MDP, and are used when exact models are infeasible.
Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a game
against a human opponent.
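As a concrete illustration, the sketch below runs tabular Q-learning on a hypothetical five-state chain MDP; the environment, rewards, and hyperparameters are all invented for this example and are not part of this project:

```python
import numpy as np

# Toy MDP: states 0..4 on a line, actions 0 = left, 1 = right.
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))
alpha, gamma, eps = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(500):                   # episodes
    s = int(rng.integers(GOAL))        # start in a random non-terminal state
    for _ in range(50):                # steps per episode
        # epsilon-greedy: mostly exploit current Q, sometimes explore
        a = int(rng.integers(N_ACTIONS)) if rng.random() < eps else int(Q[s].argmax())
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * Q[s2].max())
        Q[s, a] += alpha * (target - Q[s, a])   # Q-learning update
        s = s2
        if done:
            break

policy = Q[:GOAL].argmax(axis=1)   # greedy action in each non-terminal state
```

No model of the MDP's transitions is given to the learner; it discovers from sampled experience that moving right in every state maximizes cumulative reward.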
We humans are very good at transferring knowledge between tasks. Whenever we encounter a new
problem or task, we recognize it and apply relevant knowledge from our previous learning
experiences, which makes the work easier and faster to finish. For instance, suppose you know how
to ride a bicycle and are asked to ride a motorbike, which you have never done before. Your
experience with the bicycle will come into play for tasks like balancing and steering, making things
easier than they would be for a complete beginner. Such transfer is very useful in real life, as it
makes us more capable and allows us to accumulate experience. Following the same approach, the
term Transfer Learning was introduced in the field of machine learning. This approach involves
using knowledge learned in one task to solve a problem in a related target task. While most machine
learning is designed to address a single task, the development of algorithms that facilitate transfer
learning is a topic of ongoing interest in the machine-learning community.
Transfer learning is a technique in machine learning where a model trained on one task is used as the
starting point for a model on a second task. This can be useful when the second task is similar to the
first task, or when there is limited data available for the second task. By using the learned features
from the first task as a starting point, the model can learn more quickly and effectively on the second
task. This can also help to prevent overfitting, as the model will have already learned general
features that are likely to be useful in the second task.
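The idea can be sketched numerically. In the toy example below, a fixed random projection stands in for the pretrained, frozen feature extractor (an assumption made purely for illustration; in a real setup these weights would come from a network trained on a large source task), and only a small new head is fitted on the limited data of the second task:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained, frozen feature extractor: a fixed random
# projection followed by ReLU, never updated during "fine-tuning"
W_frozen = rng.standard_normal((4, 8))

def extract_features(x):
    return np.maximum(x @ W_frozen, 0.0)

# Target task: a small labeled dataset (limited data, as described above)
X = rng.standard_normal((40, 4))
y = (X[:, 0] > 0).astype(float)

# Only the new head is trained: a least-squares fit on the frozen
# features (plus a bias column)
F = np.hstack([extract_features(X), np.ones((len(X), 1))])
w_head, *_ = np.linalg.lstsq(F, y, rcond=None)

train_acc = ((F @ w_head > 0.5).astype(float) == y).mean()
```

Because the feature extractor is reused rather than learned, the only parameters fitted on the small second-task dataset are the head's weights, which is what makes transfer learning effective with limited data.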
Many deep neural networks trained on images have a curious phenomenon in common: in the early
layers of the network, the model learns low-level features, like detecting edges, colors, and
variations in intensity. Such features appear not to be specific to a particular dataset or task: no
matter whether the image is being processed to detect a lion or a car, these same low-level features
must be detected. They occur regardless of the exact cost function or image dataset. Thus, features
learned in one task, such as detecting lions, can be reused in other tasks, such as detecting humans.
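This reuse of low-level features can be made concrete: the kernel below is a hand-written Sobel-style vertical-edge filter, the kind of pattern early convolutional layers tend to learn on their own. The tiny image and the naive convolution loop are purely illustrative:

```python
import numpy as np

# A tiny grayscale image with a vertical edge: dark left half, bright right half
img = np.zeros((5, 6))
img[:, 3:] = 1.0

# Sobel-style vertical-edge kernel: responds to left-to-right intensity change
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

def conv2d_valid(image, k):
    """Naive 'valid' 2-D convolution (correlation) for illustration."""
    kh, kw = k.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * k)
    return out

response = conv2d_valid(img, kernel)
print(response[0])  # strongest response where the window straddles the edge
```

The same kernel responds to the edge no matter what object the image depicts, which is why such early-layer features transfer across tasks.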
1.2 INTRODUCTION TO THE PROJECT
Our main aim in the project is to classify waste into different categories. Currently, the two major
forms of waste disposal are incineration, where the object is simply burned, and landfilling, where
the object is dumped in a predetermined spot. Of these two methods, landfilling can be said to be far
more popular, due to the lack of machinery or investment required to start operations. However,
while incineration, even in the most controlled circumstances, does produce air pollution, landfills
are much more harmful. Landfills are known to pollute the groundwater in the region, due to toxic
chemicals leaching into the water reservoirs. Fires are also known to break out spontaneously in
such sites, due to the presence of highly flammable items. Landfills are also quite an eyesore and
give off a foul smell as one approaches the area.
To reduce the amount of waste sent to landfills, there is a need for automated systems for the
segregation of the same. However, manual sorting is a tedious and often hazardous process, which
does not scale well, being labor- and cost-prohibitive. The trash sorting machines employed
currently work on physical sorting processes, not intelligent computerized methods, which can help
in more accurate classwise segregation. Specifically, deep neural models trained on a trash image
dataset with a large variety of classes can help classify trash objects based on their type and boost
recyclable product reclamation and productivity.
1.2.1 FEATURES
● Efficient waste management without harm to the environment or human life.
● Avoids human intervention in waste segregation.
● More efficient and accurate detection.
● Includes more classes, along with subcategories.
● It is the most optimized process developed so far.
1.3 MOTIVATION BEHIND THE WORK
We, as students of Computer Science and Engineering, see it as our duty, and
responsibility even, to contribute to this ever-growing and challenging field. However, being a
student of computer science does not only mean having the ability to code but also understanding
and coordinating with other disciplines of engineering to learn and interface with systems outside
our own. This work is motivated by a growing population base that will continue to put pressure on
the existing waste management system. With new technologies and new medical inventions, the
amount of new kinds of waste generated has also increased drastically. With the pandemic arriving
in the 21st century, surgical waste disposal and its separation have become a great challenge for our
society. Most surgical waste is harmful and disease-carrying, so segregating this type of waste
manually can affect human health. Likewise, with new technological inventions, the amount of
e-waste generated has also increased. The need of the hour is therefore an optimized and efficient
waste disposal system that works without human interference.
In recent years, the majority of the world's population has come to reside in cities, and many
advancements have been made in managing urban areas. With the new idea of the Smart City, many
manual activities have been replaced with IT solutions. One such area is waste management, which
plays an important role in the administration of a good city.
● Waste is generated in large amounts daily in big cities; for example, Delhi generates around
12,000 tonnes of waste every day. It therefore becomes a primary task for the governing
organization to manage the waste effectively and accurately.
● The Trash Detection model can be incorporated into an IoT device that uses ML to segregate
the waste, making waste management easier and quicker.
● In the future, this model can be integrated with quantum computers so that waste can be
segregated at a very fine level.
● It is also easy to add new classes and retrain the model without many changes to the existing
model.
1.5 ORGANIZATION OF THE PROJECT
The sequence of chapters and their hierarchical arrangement play a pivotal role in structuring the
project report properly and interlinking the vital elements of the report in the best possible format.
This project report is organized into the following chapters:
● Introduction
● Literature survey
● System requirements
● System analysis
● System design
● System implementation
● System Testing
● Snapshots
● Conclusion & Future Scope
● References
Introduction: provides the background information about the project and the basic idea of what this
project is expected to do.
Literature survey: gives a detailed study of all the existing systems and their disadvantages.
System requirements: gives a detailed description of the system requirements, including both hardware
and software.
System analysis: provides a detailed description of the system analysis, why it is required, the
method of analysis of the existing system, the proposed system, and its components.
System design: describes how the system is going to be designed and how exactly the system would
be developed.
System implementation: It is all about the implementation part of the project that describes the
critical coding of the project.
Testing: gives information about testing the project in real-time scenarios and determines the
efficiency of the system.
Snapshots: It consists of snapshots of software and hardware modules.
Conclusion & Future scope: summarizes the work and lists the extensions that could be made to this project.
References: consists of the papers, books, and websites we have referred to.
CHAPTER 2
LITERATURE SURVEY
Managing solid waste is both necessary and challenging due to rapid urbanization and population
growth. Dustbins are the first stage of waste collection and management, but they are not enough,
owing to people not using them properly, improper placement, inadequate management systems, and
the lack of automatic cleaning of the area surrounding the dustbin. To address these issues, efforts
have been made to provide prior information on the status of the garbage present in a particular
dustbin, as well as software applications to show the locations of dustbins in nearby areas. GIS is
also being used to collect the status of the garbage in the dustbin. [1]
Currently, the world generates 2.01 billion tonnes of municipal solid waste annually, which does
huge damage to the ecological environment. Recycling is becoming an essential part of a sustainable
society, but it requires the selection, classification, and processing of recycled materials.
Computerized recycling is now of great value to an industrial and information-based society, with
both environmental and financial benefits. Deep learning can be used to solve garbage image
classification with high accuracy, but it requires a larger and more precisely categorized data supply
for more intricate situations. [2]
Monitoring and cleanliness assessment of trash areas in urban scenes rely on manual inspection and
photographic records, leading to human intervention and cumbersome problems. Smart cities can
provide an automatic detection method to help alleviate urban trash problems. Deep network
architectures provide the best-in-class performance in terms of accuracy, scalability, and adaptability,
and this project proposes the idea of detecting waste or trash using deep learning. [3]
Segregation of waste materials is important for a sustainable society, and this paper proposes a
Convolutional Neural Network (CNN) to classify common wastes into six different types. The
system will be used at a minimum level to separate the waste materials, with the goal of opening the
dustbin for the desired garbage object. Machine learning can be used to categorize waste into
degradable and non-degradable categories, with no need for human intervention. [4]
The development of machine vision technology provides a method to solve the problem of garbage
classification and identification. The difficulty of pattern recognition is that there is no applicable
formula to solve all problems. Existing methods include extracting scale-invariant features and solving
image-matching problems through feature-matching algorithms. An improved geometric hashing
algorithm improves matching accuracy and speed in complex scenes. [5]
The Ministry of Housing and Urban-Rural Development issued the "Notice on Comprehensively
Carrying Out Domestic Waste Classification Work in Cities at Prefecture Levels and Above in China"
in 2019 to solve the environmental damage caused by urban domestic garbage. Deep learning models
with CNN as the core have been used to improve image recognition and classification. This paper uses
OpenCV and TensorFlow to build a VGG-16 convolutional neural network to classify domestic
garbage into hazardous, kitchen waste, other garbage, and recyclable garbage. However, the accuracy
of this project still needs improvement. [6]
The rapid development of industry has produced a large amount of garbage, which is difficult to
distinguish and sort, and time- and labor-consuming to handle. Incineration alone is not suitable for
eliminating the garbage, and people's awareness of garbage types is poor. Garbage accumulation
brings difficulties in sorting, making it difficult to classify garbage quickly and effectively. This
research uses a convolutional neural network model in deep learning to solve garbage classification
effectively. Experimental results achieved an expected goal, recognition accuracy is satisfactory, and
garbage classification is more effective. However, the experiment may take longer and the accuracy
rate may be improved. The next research direction is to improve and optimize the model. [7]
Trash classification has become a major issue, and our project aims to provide an automated waste
sorting tool to help save the environment and ease the process for residents. We increased the accuracy
of CNN to classify recyclables and trash, reaching stable test accuracies of 70% to 80%. The highest
accuracy was 79.94% with Model 2 (SVM as the last layer) using partial data augmentation. In the
future, we hope to explore other models and use transfer learning to achieve higher accuracy. [8]
2.1 TABLE OF COMPARISON
Sl.No | Title | Year | Techniques Used | Advantages | Limitations
1 | Trash Box: Trash Detection and Classification using Quantum Transfer Learning | 2022 | Transfer Learning, Quantum Computing | More accurate and better performing than other benchmark datasets. | Performs better only with specific transfer learning models.
2 | Garbage Classification Using Deep Learning Techniques | 2022 | Trash classification using convolutional neural networks | Ensures the best way for waste management and speeds up the segregation process with higher accuracy. | The model still needs a larger and more precisely categorized data supply for more intricate situations.
CHAPTER 3
SYSTEM REQUIREMENTS
A System Requirements Specification (SRS), a requirements specification for a software
system, is a description of the behavior of a system to be developed and may include a set of use
cases that describe interactions the users will have with the software. In addition, it also contains
non-functional requirements. Non-functional requirements impose constraints on the design or
implementation (such as performance engineering requirements, quality standards, or design constraints).
Software requirements specification establishes the basis for agreement between customers and
contractors or suppliers (in market-driven projects, these roles may be played by the marketing and
development divisions) on what the software product is to do as well as what it is not expected to do.
Software requirements specification permits a rigorous assessment of requirements before design
and reduces later redesign. It should also provide a realistic basis for estimating product costs, risks,
and schedules.
Computer OS: Windows 10
1. Platform used:
● Jupyter: open-source software, open standards, and services for interactive computing
across dozens of programming languages. Project Jupyter has developed and supported
the interactive computing products Jupyter Notebook, Jupyter Hub, and Jupyter Lab,
the next-generation version of Jupyter Notebook.
● TensorFlow: an open-source machine learning library. One of the key features of
TensorFlow is its ability to create and manipulate tensors, which are multi-dimensional
arrays that can be used to represent data, including images, audio, and text. TensorFlow
also includes a powerful set of tools for building and training machine learning models,
including a high-level API called Keras, as well as lower-level APIs for more advanced users.
In addition to its core library, TensorFlow also has several related projects, including
TensorFlow Extended (TFX) for building end-to-end machine learning pipelines,
TensorFlow Lite for deploying models on mobile and embedded devices, and
TensorFlow.js for running models in the browser.
● Windows 10: it is a personal computer operating system that was produced by
Microsoft as part of the Windows NT family of operating systems.
2. Language Used:
● Python: Python is a high-level, interpreted programming language that was created in
the late 1980s by Guido van Rossum. It is known for its simplicity, readability, and ease
of use, and has become one of the most popular programming languages in the world.
Python is a versatile language that can be used for a wide variety of tasks, including
web development, scientific computing, data analysis, machine learning, and more. It
has a large and active community of developers who have created a vast ecosystem of
libraries and frameworks that make it easy to accomplish many tasks with just a few
lines of code.
Python is an open-source language, which means that its source code is freely available
for anyone to use and modify. It is also cross-platform, meaning that it can be run on a
variety of operating systems, including Windows, Mac, and Linux.
CHAPTER 4
SYSTEM ANALYSIS
System analysis is the process of studying an activity or a procedure (such as a business process) in
order to define its goals or purposes and to discover the operations and procedures for
accomplishing them most efficiently. The process is an explicit, formal inquiry carried out to help a
decision maker (usually in business) identify a better course of action and make a better decision
than might otherwise have been made. Systems analysis comprises the study of sets of interacting
entities, including computer systems analysis. The field of system analysis is closely related to
requirements analysis and operations research.
The principal objective of the systems-analysis phase is the specification of what the system needs to
do to meet the requirements of end users. In the systems-design phase, such specifications are
converted to a hierarchy of charts that define the data required and the processes to be carried out on
the data so that they can be expressed as instructions for a computer program. The process of system
analysis consists of various steps:
● Collecting Information about current systems: This step involves the collection of all
information about the current systems by research, observations about the behavior of the
systems in various scenarios, and reviews about the systems by the users.
● Identifying the inputs, outputs, and processes of the current systems: Every system has
inputs and outputs, and the systems analyst needs to identify the data input to the present
system and the data output given by it. This is because any new system that is designed will
have to deal with similar inputs and outputs as the present system. For similar reasons, the
systems analyst also has to understand how the present system works (the processes: which
do what, and when).
● Identifying issues in the current systems: No system is perfect, and it is the job of the systems
analyst to identify where the problems in the current system lie. If these problems can be
fixed, the system will work more smoothly, be more efficient and, in the case of a business,
be more profitable.
● New system requirements specifications: After the problems with the present system are
understood, the system analyst can begin to plan how the new system will fix those problems.
The systems analyst specifies a list of requirements for the new system ('requirements' here simply
means targets or aims). This list is usually called the Requirements Specification.
● Type of hardware and software required: The systems analysts will now need to decide
what hardware and software will be required for the new system. The Hardware requirements
would include how many computers to use, what type of network to use, how many servers are
required, etc. The software requirements would include decisions such as whether to use
ready-made off-the-shelf software or custom-written software, etc.
Another dataset, called TACO (Trash Annotations in Context), targets trash detection and
classification. The images in TACO are manually labeled and segmented according to a hierarchical
taxonomy to train and evaluate object detection algorithms. While the dataset is crowdsourced, it
currently has only 1,500 annotated images, which is fewer than TrashNet.
Several works have explored the effectiveness of machine and deep learning models in the context of
classifying waste objects. Adedeji et al. [9] used the ResNet-50 transfer learning model to classify
trash objects on the TrashNet dataset. To compensate for the small number of images, the authors
augmented the dataset using techniques such as shearing and scaling. Finally, a multi-class SVM
model performs the classification. With this model, the authors achieved an accuracy of 87% on the
TrashNet dataset. Azis et al. [10] used a simple CNN model to classify trash objects, built on the
Inception-v3 transfer learning model. The training dataset was the Plastic Model Detection dataset
[11], which contained 2,400 images. To simulate real-world conditions, the authors used a Raspberry
Pi as their processor. Further, they changed the input size of the Inception model to 299x299, as that
is the default image dimension of the Raspberry Pi camera, and reported an accuracy of 92.5%.
To overcome the above problems, we built our own dataset, which contains various trash objects in
diverse environments. The images in the dataset were classified into 7 classes - medical waste,
e-waste, glass, plastic, cardboard, paper, and metal. Furthermore, these classes are divided into
subclasses to facilitate the distinction between various trash objects, and to enable further research in
this field.
This dataset is then trained on a CNN and on a CNN with Global Average Pooling (GAP), and their
accuracies are determined and compared. A new model is then built using transfer learning, with
MobileNet V2 as the base model. With this technique, higher classification accuracy is observed, and
the resulting model is the most optimized of the models evaluated.
CHAPTER 5
SYSTEM DESIGN
System design is the process of defining the architecture, components, modules, interfaces, and data
for a system to satisfy specified requirements. Systems design could be seen as the application of
systems theory to product development. There is some overlap with the disciplines of systems
analysis, systems architecture, and systems engineering.
● Preparation of the dataset
● Data Preprocessing
The collected datasets are pre-processed using different techniques:
1. Handling missing values
2. Data integration
3. Data reduction
4. Data transformation
● The collected datasets are split into training and test sets, and the training data is passed on for modeling.
● Then the model is built using CNN and TL.
● Check for the accuracy of each algorithm.
● Predict the result.
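The split-and-evaluate steps above can be sketched in plain Python; the file names and the 80/20 split ratio are illustrative assumptions, not values fixed by the report:

```python
import random

def split_train_test(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle (image, label) pairs and split them into train and test sets."""
    data = list(zip(samples, labels))
    random.Random(seed).shuffle(data)
    cut = int(len(data) * (1 - test_ratio))
    return data[:cut], data[cut:]

def accuracy(predictions, truths):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(predictions, truths)) / len(truths)

# Hypothetical file names spread over the report's 7 top-level classes.
classes = ["medical waste", "e-waste", "glass", "plastic",
           "cardboard", "paper", "metal"]
samples = [f"img_{i}.jpg" for i in range(10)]
labels = [classes[i % 7] for i in range(10)]
train, test = split_train_test(samples, labels)
print(len(train), len(test))  # 8 2
```

In the real pipeline the trained model's predictions on the test set would be fed to `accuracy` to compare the algorithms.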
We have used our own dataset that contains various trash objects in diverse environments. The
images in the dataset were classified into 7 classes - medical waste, e-waste, glass, plastic, cardboard,
paper, and metal. Furthermore, these classes are divided into subclasses to facilitate the distinction
between various trash objects.
We prepared the dataset by extracting images of trash objects through a comprehensive search on the
web. For this purpose, a batch-download tool called WFdownloader was used, which allowed us to
download images in bulk from Google Images, among a variety of other sources. The downloaded
images were then manually cropped and sorted into their respective classes.
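Once images are sorted into per-class folders, the dataset can be indexed by folder name. A minimal sketch (the folder and file names are hypothetical, created here in a temporary directory purely for demonstration):

```python
from pathlib import Path
import tempfile

def index_dataset(root):
    """Map each class folder name to the image files sorted into it."""
    index = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            index[class_dir.name] = sorted(
                p.name for p in class_dir.iterdir()
                if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
            )
    return index

# Build a tiny fake dataset tree to demonstrate.
root = tempfile.mkdtemp()
for cls, files in {"plastic": ["a.jpg", "b.png"], "glass": ["c.jpg"]}.items():
    d = Path(root) / cls
    d.mkdir()
    for f in files:
        (d / f).touch()

print(index_dataset(root))  # {'glass': ['c.jpg'], 'plastic': ['a.jpg', 'b.png']}
```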
FIGURE 5.2: Distribution of images in Dataset
Table 5.1: Dataset class-wise statistics

Class                  Images
Laptops                   398
Smartphones               218
Medicines                 404
Miscellaneous metal        33
5.3 SYSTEM ARCHITECTURE
Stage 1: The dataset is prepared by collecting images (data gathering). The features of the images are
then extracted, and data conversion takes place.
Stage 2:
Data Cleaning:
The data can have many irrelevant and missing parts. To handle this part, data cleaning is done. It
involves handling missing data, noisy data, etc.
Missing Data:
This situation arises when some values are missing in the data. It can be handled in various ways,
some of which are:
1. Ignore the tuples: This approach is suitable only when the dataset we have is quite large and
multiple values are missing within a tuple.
2. Fill in the missing values: There are various ways to do this. You can choose to fill the
missing values manually, with the attribute mean, or with the most probable value.
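Both strategies can be sketched with pandas on a toy metadata table (the column names and values below are illustrative, not the report's actual data):

```python
import pandas as pd

# A toy metadata table with gaps.
df = pd.DataFrame({
    "width":  [640.0, None, 800.0, 640.0],
    "height": [480.0, 480.0, None, 480.0],
    "label":  ["glass", "metal", None, "paper"],
})

# Strategy 1: ignore (drop) tuples with missing values -- suitable
# only when the dataset is large.
dropped = df.dropna()

# Strategy 2: fill numeric gaps with the attribute mean and the
# categorical gap with the most probable (modal) value.
filled = df.copy()
for col in ["width", "height"]:
    filled[col] = filled[col].fillna(filled[col].mean())
filled["label"] = filled["label"].fillna(filled["label"].mode()[0])

print(len(dropped), filled.isna().sum().sum())  # 2 0
```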
Stage 3:
The data obtained from Stage 2 is then used to train the transfer learning (TL) model; the result is
analyzed and shown in a graph using a Python plotting library. The system architecture is a
conceptual model that defines the structure, behavior, and views of a system. A system architecture
comprises the system components and the sub-systems developed, which work together to
implement the overall system.
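The analyze-and-plot step can be sketched with Matplotlib; the per-epoch accuracy values below are placeholder numbers for illustration, not the report's measured results:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display required
import matplotlib.pyplot as plt

# Illustrative per-epoch validation accuracies for the models compared.
epochs = list(range(1, 6))
history = {
    "CNN":            [0.52, 0.61, 0.66, 0.70, 0.73],
    "CNN + GAP":      [0.48, 0.57, 0.62, 0.65, 0.68],
    "MobileNetV2 TL": [0.85, 0.91, 0.94, 0.96, 0.97],
}

for name, acc in history.items():
    plt.plot(epochs, acc, marker="o", label=name)
plt.xlabel("Epoch")
plt.ylabel("Validation accuracy")
plt.title("Model comparison (illustrative values)")
plt.legend()
plt.savefig("model_comparison.png")
```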
● Data preprocessing: The dataset is passed into the preprocessing step
1. Input: trash image dataset
2. Process: Preprocessing finds missing values and also removes unneeded features
3. Output: preprocessed dataset
4. Error handling: an error is reported if the input file is not valid
● Feature selection: Selection of relevant data from the dataset
1. Input: preprocessed dataset
2. Process: Only the important data that is required is selected
3. Output: the selected data is displayed
● Splitting of the data: Training data and test data
1. Input: feature-selected data
2. Process: The data is split into a train set and a test set
3. Output: The dataset is displayed as a train set and a test set; it is then run against the
specific algorithms and performance analysis is carried out.
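The feature-selection module can be illustrated with a simple variance filter; the threshold and the toy feature matrix are assumptions for the sketch, not the project's actual selection criterion:

```python
import numpy as np

def select_features(X, min_variance=1e-3):
    """Keep only columns whose variance exceeds a threshold;
    near-constant features carry little information for the model."""
    keep = X.var(axis=0) > min_variance
    return X[:, keep], keep

# 4 samples x 3 features; the middle feature is constant and is dropped.
X = np.array([[1.0, 5.0, 0.2],
              [2.0, 5.0, 0.4],
              [3.0, 5.0, 0.1],
              [4.0, 5.0, 0.3]])
X_selected, kept = select_features(X)
print(X_selected.shape)  # (4, 2)
```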
CHAPTER 6
SYSTEM IMPLEMENTATION
Implementation is the realization of an application, or execution of a plan, idea, model, design,
specification, standard, algorithm, or policy. System implementation generally benefits from high
levels of user involvement and management support. User participation in the design and operation of
information systems has several positive results.
6.1 DFD 0
6.2 DFD 1
6.3 ACTIVITY DIAGRAM
6.4 USE CASE DIAGRAM
6.5 METHODOLOGY
In this section we discuss the methodology designed for implementing a new multi-class trash
detection and classification model. Experiments are performed on the newly built dataset, which
consists of large-scale images across multiple categories. Initially, the CNN model is implemented on
the dataset, since CNN is one of the most widely used deep learning algorithms for image
recognition; this provides a baseline for comparing accuracy with the TLM models. Traditional
transfer learning models are then used to validate their performance on the new dataset in
comparison to other standard datasets.
FIGURE 6.6: Implementation Diagram
6.6 CNN
Convolutional Neural Network (CNN) is an extended version of the artificial neural network
(ANN), predominantly used to extract features from grid-like data such as images.
The Convolutional layer applies filters to the input image to extract features, the Pooling layer
downsamples the image to reduce computation, and the fully connected layer makes the final
prediction. The network learns the optimal filters through backpropagation and gradient descent.
1. Input layer: The raw image pixels form the input to the network.
2. Convolutional layer: The output of the input layer is passed through one or more convolutional
layers, each consisting of multiple filters. Each filter generates a new feature map by applying
the convolution operation to the previous layer's feature map.
3. Activation function: An activation function is applied to the output of each convolutional layer
to introduce non-linearity into the network. Common activation functions include ReLU,
sigmoid, and tanh.
4. Pooling layer: After the activation function, the output is downsampled using a pooling
operation. This helps reduce the spatial dimensions of the feature maps and makes the network
more computationally efficient. Common pooling operations include max pooling and average
pooling.
5. Fully connected layer: The final feature maps are flattened into a one-dimensional vector and
passed through one or more fully connected layers. These layers are similar to the layers in a
traditional neural network and are used to perform classification or regression tasks.
6. Output layer: The final layer of the CNN produces the network's output, which can be a single
value or a vector of probabilities.
7. Training: The CNN is trained using a dataset of labeled examples. The weights of the network
are adjusted using an optimization algorithm, such as stochastic gradient descent, to minimize
the difference between the predicted output and the true output.
8. Evaluation: The trained CNN is evaluated on a separate dataset to measure its performance.
This helps determine if the network has overfitted or underfit the training data and can guide
further adjustments to the network's architecture and hyperparameters.
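The convolution, activation, and pooling steps enumerated above can be traced on a tiny example; the 4x4 input and the 2x2 vertical-edge filter are toy values chosen only so the arithmetic is easy to follow:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as CNN layers compute it)."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(0.0, x)          # the activation step

def max_pool(x, size=2):
    """Downsample by taking the maximum of each size x size patch."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 "image"
kernel = np.array([[-1.0, 1.0], [-1.0, 1.0]])      # vertical-edge filter
feature_map = relu(conv2d(image, kernel))          # conv + activation -> 3x3
pooled = max_pool(feature_map)                     # pooling -> 1x1
print(pooled)  # [[2.]]
```

In a trained CNN the filter weights are not fixed like this; they are learned through backpropagation as described in step 7.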
6.7 RELU ACTIVATION
The Rectified Linear Unit is the most commonly used activation function in deep learning models. The
function returns 0 if it receives any negative input, but for any positive value x it returns that value
back.
So it can be written as
f(x)=max(0,x).
In other words, for any input value x, the ReLU function outputs the maximum of 0 and x. This means
that if x is positive, then the function outputs x, and if x is negative, then the function outputs 0. The
ReLU function is therefore a piecewise linear function with a slope of 1 for positive values of x.
The main advantage of the ReLU function is that it is computationally efficient and easy to implement.
It also helps prevent the vanishing gradient problem that can occur in deep neural networks. The
vanishing gradient problem occurs when the gradient of the activation function becomes very small as
it is propagated through the network during backpropagation, which can slow down or prevent the
training of the network. ReLU, being a non-saturating function, does not have this problem, since its
gradient is either 0 or 1, depending on the input value.
However, ReLU has one drawback: it can lead to "dead neurons". A neuron is considered "dead" if its
output is always 0, which can happen if its input is always negative. When this happens, the gradient
of the neuron becomes 0, and it no longer contributes to the learning process. To mitigate this problem,
several variants of ReLU have been proposed, such as Leaky ReLU, PReLU, and ELU, which
introduce small non-zero slopes for negative inputs.
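The definition f(x) = max(0, x) and the leaky variant described above can be written directly in NumPy (the 0.01 slope is a common default, assumed here):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # f(x) = max(0, x)

def leaky_relu(x, alpha=0.01):
    """Variant keeping a small slope for negative inputs, so neurons
    with always-negative input cannot go completely 'dead'."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))        # negatives clamped to 0, positives unchanged
print(leaky_relu(x))  # negatives scaled by 0.01 instead of clamped
```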
Global Average Pooling (GAP): instead of downsampling patches of the input feature map, global
pooling downsamples the entire feature map to a single value per channel.
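The effect is easy to see numerically; the batch and feature-map sizes below are arbitrary toy values:

```python
import numpy as np

# A batch of CNN feature maps: (batch, height, width, channels).
feature_maps = np.arange(2 * 4 * 4 * 3, dtype=float).reshape(2, 4, 4, 3)

# Flattening keeps every spatial position: 4*4*3 = 48 values per sample.
flattened = feature_maps.reshape(2, -1)

# Global average pooling collapses each entire 4x4 map to a single
# value per channel: just 3 values per sample.
gap = feature_maps.mean(axis=(1, 2))

print(flattened.shape, gap.shape)  # (2, 48) (2, 3)
```

The much smaller GAP output is why it reduces parameters in the layers that follow it.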
6.8 PSEUDO CODE
Pseudo code is an informal program description that does not contain code syntax or underlying
technology considerations. In this section, pseudo-code is written to ensure that programmers
understand a project's software and hardware requirements and align code accordingly.
PSEUDO CODE FOR LOADING THE MOBILENET V2 MODEL AND ADDING TOP
LAYERS
➢ The base model is loaded with its top layers excluded; since the top layers are used for
prediction, we do not want our model to predict the base model's original classes.
➢ The base model is then extended with new top layers that predict our desired classes.
➢ The base layers are frozen, as we do not need to train them again.
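A minimal Keras sketch of the steps the pseudo-code describes; the input size, the single dense top layer, and weights=None (used here instead of the pretrained "imagenet" weights purely to avoid a download) are assumptions, not the report's exact configuration:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

# Load the base model WITHOUT its top (prediction) layers, since we do
# not want it to predict its original ImageNet classes.
base = MobileNetV2(input_shape=(224, 224, 3), include_top=False, weights=None)

# Freeze the base layers: they are not trained again.
base.trainable = False

# Add new top layers that predict our 7 desired trash classes.
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(7, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```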
CHAPTER 7
SYSTEM TESTING
System testing is any activity aimed at evaluating an attribute or capability of a program or system and
determining that it meets its required results. Although crucial to software quality and widely deployed
by programmers and testers, software testing still remains an art, due to limited understanding of the
principles of software. The difficulty in software testing stems from the complexity of software: we
cannot completely test a program of even moderate complexity.
Testing is more than just debugging. Software testing is an investigation conducted to provide
stakeholders with information about the quality of the product or service under test. Test techniques
include, but are not limited to the process of executing a program or application with the intent of
finding software bugs (errors or other defects). The purpose of testing can be quality assurance,
verification and validation, or reliability estimation. Software testing can be stated as the process of
validating and verifying that a computer program/application/product meets the requirements that
guided its design and development and works as expected.
Software testing, depending on the testing method employed, can be implemented at any time in the
software development process. Traditionally most of the test effort occurs after the requirements have
been defined and the coding process has been completed, but in the agile approaches, most of the test
effort is ongoing. As such, the methodologies of testing include the following:
● Integration Testing: It involves the testing of integrated modules to verify combined
functionality after integration. Modules tested are typically code modules, individual
applications, client and server applications on a network, etc. This type of testing is especially
relevant to client/server and distributed systems.
● System Testing: The entire system is tested as per the requirements. It is a type of Functional
testing that is based on overall requirements specifications and covers all combined parts of a
system.
● Acceptance Testing: Normally this type of testing is done to verify if the system meets the
customer-specified requirements. In engineering and its various sub-disciplines, acceptance
testing is a test conducted to determine if the requirements of a specification are met. It may
involve chemical tests, physical tests, or performance tests.
In case study research, the "case" being studied may be an individual, an organization, or something
abstract, such as a claim, a proposition, or an argument; such a case can be the subject of many
research methods, not just case study research. Case studies may involve both qualitative and
quantitative research methods.
This project has undergone many case studies, both prior to design and after implementation to check
the effectiveness of the operation. Case studies were conducted in the analysis phase to understand the
scenarios under which such a system will be useful. Case studies after the implementation phase were
mainly for verification and validation, and how well the system performs over a period of time.
Test cases are shown in the table below. Table 7.1 consists of the test case, the condition being
checked, the expected result, and the obtained result.

Table 7.1: Test case

Sl. No.:          1
Test Case:        Load Data
Condition:        Load trash datasets in the form of images
Error Handling:   If the data is not loaded, an error message is shown
Expected Result:  Datasets are loaded with Pandas
Obtained Result:  The data is loaded successfully
Status:           Pass
CHAPTER 8
EXPERIMENTAL RESULTS AND ANALYSIS
We measured the performance of our dataset by using it to train state-of-the-art deep neural models
like ResNet-34, ResNet-50, ResNet-101, DenseNet-121, and VGG-19. Table II shows the results
obtained by the classical transfer learning models for the TrashBox dataset. We observed that the
ResNet101 model achieved the best results among the considered models for this classification task.
This model was able to classify the trash objects with a training accuracy of 98.86%, validation
accuracy of 98.29%, and testing accuracy of 98.47%.
We evaluated the dataset on various models, such as CNN and TLM. With CNN we observed an
accuracy of around 73%. To obtain higher accuracy we then tried CNN with GAP, but observed a
reduction in accuracy. Finally, we moved to transfer learning models.
A notable property of our model is that it does not need a huge number of epochs to reach high
accuracy; we achieved higher accuracy with fewer epochs. We can therefore conclude that our model
is optimized and highly accurate.
FIGURE 8.1: Result Comparison
CHAPTER 9
SNAPSHOTS
FIGURE 9.2: Creating Labels for Dataset
FIGURE 9.4: Data Preprocessing
FIGURE 9.6: Creating Data Generators
FIGURE 9.8: Making Predictions
CHAPTER 10
CONCLUSION AND FUTURE SCOPE
A new, comprehensive trash classification dataset was put together, consisting of 14,263 images across
various classes. Our dataset is larger than any existing trash classification dataset and also contains
unique classes such as medical waste and e-waste, hence making the dataset quite diverse in nature.
Upon running an optimized ResNet-101 model, an accuracy of 98.47% was achieved. We hope that
our dataset becomes a valuable resource to other researchers in their work.
In the future, this model can be combined with quantum computing and optimized further; Quantum
Transfer Learning could be a game changer in this field. Quantum transfer learning models have been
applied to the trash classification problem, along with optimizations in the form of parallelization
strategies to speed up their training, which improved accuracy and decreased training time.
This model can also be integrated with IoT, making waste segregation quicker and safer. The process
can be completely mechanized, with no human involvement. Such a model can help large
metropolitan cities manage waste in the most efficient manner.
This model can further be integrated with a framework that segregates waste automatically without
sensors, the machine itself deciding whether an item should be categorized as degradable or
non-degradable. Since such a framework works autonomously, there is no need for human
intervention to oversee it or to carry out any unpleasant task. The framework is currently limited by
objects that look like metal but are not. In the future, the framework may be upgraded for better
detection of waste by utilizing advanced machine learning algorithms.
REFERENCES
[1] S. Lahiry. (2017) Percentage of land-filled waste recyclable in Delhi. [Online]. Available:
https://www.downtoearth.org.in/blog/waste/indias-challenges-in-waste-management-56753
[2] G. Thung and M. Yang, “Classification of trash for recyclability status,” 2016. [Online]. Available:
https://github.com/garythung/trashnet
[3] P. F. Proença and P. Simões, “TACO: Trash annotations in context for litter detection,” arXiv
preprint arXiv:2003.06975, 2020.
[4] O. Adedeji and Z. Wang, “Intelligent waste classification system using deep learning convolutional
neural network,” Procedia Manufacturing, vol. 35, pp. 607–612, 2019, the 2nd International
Conference on Sustainable Materials Processing and Manufacturing, SMPM 2019, 8-10 March 2019,
Sun City, South Africa.
[5] F. A. Azis, H. Suhaimi, and E. Abas, “Waste classification using convolutional neural network,” in
Proceedings of the 2020 International Conference on Information Technology and Computer
Communications, ser. ITCC 2020. Association for Computing Machinery, 2020, p. 9–13.
[6] S. Ali and Thung, “Plastic detection model,”
https://github.com/antiplasti/Plastic-Detection-Model, 2022.
[7] A. Masand, S. Chauhan, M. Jangid, R. Kumar, and S. Roy, “ScrapNet: An efficient approach to
trash classification,” IEEE Access, vol. 9, pp. 130947–130958, 2021.
[8] C. Shi, C. Tan, T. Wang, and L. Wang, “A waste classification method based on a multilayer hybrid
convolution neural network,” Applied Sciences, vol. 11, no. 18, 2021. [Online]. Available:
https://www.mdpi.com/2076-3417/11/18/8572
[9] N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd,
“Continuous-variable quantum neural networks,” Phys. Rev. Research, vol. 1, p. 033063, Oct 2019.
[10] A. Mari, T. R. Bromley, J. Izaac, M. Schuld, and N. Killoran, “Transfer learning in hybrid
classical-quantum neural networks,” Quantum, vol. 4, p. 340, Oct 2020.
[11] H. Mogalapalli, M. Abburi, B. Nithya, and S. K. Bandreddi, “Classical–quantum transfer learning
for image classification,” SN Computer Science, vol. 3, no. 1, 2021.
[12] R. S. Mor, K. S. Sangwan, S. Singh, A. Singh, and M. Kharub, “E-waste management for
environmental sustainability: an exploratory study,” Procedia CIRP, vol. 98, pp. 193–198, 2021, 28th
CIRP Conference on Lifecycle Engineering, March 10–12, 2021, Jaipur, India.