Template - Mini Project Report
A MINI PROJECT REPORT
Submitted by
in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE
APRIL 2023
BONAFIDE CERTIFICATE
Certified that this project report titled “PROJECT TITLE” is a bonafide work of
“TEAM MEMBER NAME (REG NO.)” who carried out the project work under
my supervision.
Professor Designation
ABSTRACT
ACKNOWLEDGEMENT
We thank the almighty GOD for the abundant blessings showered on us. We extend our
deepest love and gratitude to our dear parents who built up our career and backed us up in life.
We thank our management and our Principal Dr. P Deiva Sundari, for the opportunities
given to us for our career development.
We would like to thank all other faculty members of the Department of Computer Science
& Engineering for their help and advice throughout our time on this campus.
Finally, we are thankful to all our friends and all others who encouraged us and helped
us in doing this project.
Student’s Name
Reg. No
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENT
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
1. INTRODUCTION
1.1 OVERVIEW
1.2 OBJECTIVE
1.3 MOTIVATION
2. RELATED WORK
3. SYSTEM ANALYSIS
3.3.1 Java
3.3.2 Python
3.3.4 Firebase
4. SYSTEM DESIGN
4.2 MODULES
7.1 SCREENSHOTS
8.1 CONCLUSION
REFERENCES
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
AR Augmented Reality
ML Machine Learning
DL Deep Learning
CHAPTER 1
INTRODUCTION
1.1 OVERVIEW
Virtual learning, a unique and advanced method of learning, has been developed and
adopted in many educational institutions to spark students' interest in the concepts they
have to learn, in a much easier way than the conventional method of learning through books
and notes. Many studies have shown that when students learn through a virtual and
interactive medium, i.e. through interactive smartphone applications, videos, games and
pictures, they tend to retain what they learn for a much longer time and are able to
understand even complex concepts easily.
The incorporation of the Machine Learning module along with the Augmented
Reality module enables the user to scan any image or object that falls under a particular
domain, and the Augmented Reality module will play the relevant video or other
augmented media content on the plane of the scanned image. To improve the accuracy of the
Machine Learning image classification model, a large data set can be used to train the model;
the size of the training data set directly affects the accuracy of the trained Machine Learning
image classification model.
1.2 OBJECTIVE
The goal of this project is to help students learn their academic lessons in an easier
way, by making the conventional way of learning a fun and interactive one, through the
incorporation of Augmented Reality and Machine Learning in an Android application.
The proposed solution is the incorporation of a Machine Learning model that can
classify any image by comparing the scanned image with the dataset used to train the image
classification model. This can be used to negate the issues caused to the Augmented Reality
module by unfavourable lighting conditions, or situations in which the user doesn't have the
material in which the predefined images are present. The user can scan any image relevant to
the predefined image and the application will recognise that image, classify it automatically
and play the relevant augmented media content.
1.3 MOTIVATION
Learning is a process that involves a person staying focused on the subject at hand and
going through the subject matter in order to acquire information on that particular subject;
it is a major part of any student's life. Though learning something new is often exciting, the
medium through which a person learns heavily influences the effectiveness of what he/she
learns.
From the first time a child is enrolled in a school, he or she is taught many educational
concepts and is asked to learn those concepts through books and class notes. This
archaic method of learning can hold the attention of a student only for a short time, which is
not enough to learn difficult concepts easily. Studies have been carried out for centuries to
find the most effective way of learning, and almost every study has shown that learning
through visual cues such as images and videos is the most effective.
The graph shown in Figure 1.1 is the result of a case study on the effects of
using virtual learning. The study showed that the students' attention span on a
particular concept increased, their confidence in the concept they learnt improved,
and their satisfaction that they had learnt something greatly increased.
This has led us to develop the application LoC (Lens of Coeus), which incorporates
an Augmented Reality module to make learning an interesting process and a Machine Learning
module to aid the Augmented Reality module.
Learning, a process that is inevitable and never-ending, has a huge impact on every
person's life. The world is changing, and so are the methods and procedures
involved in the educational ecosystem. The archaic way of learning has never been
interactive and fun for students, because plain text in books and
notes is neither interesting nor eye-catching. According to various studies that have been carried
out on the impact of virtual learning, graphical content such as images and videos is more
effective in helping students remember things than plain text. With the help of
emerging technologies like Augmented Reality and Machine Learning, the proposed system
uses an Android mobile application to scan the images in a textbook through the camera,
recognize them by image recognition and superimpose a related video in the user's real
world with the help of Augmented Reality.
The main challenge that this application faces is the availability of the dataset
needed to train the Machine Learning model that classifies the scanned images.
The number of images used to train a particular image classification model directly
influences the accuracy of that model: the larger the number of training images, the higher
the accuracy of the trained model.
The problem here is the process of collecting the required datasets. Collecting a dataset
is a tedious task and often results in unfavourable outcomes, such as images in an incorrect
format, image sizes that affect the response time of the model if they are too large, and
image resolutions that have to follow a certain standard; otherwise the model will
take a longer time to analyse the image and extract its features.
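One common way to avoid these issues is to standardize the collected images before training. The following is a minimal sketch in Python, assuming the Pillow library and hypothetical raw_images/ and processed_images/ folders along with an assumed 224x224 target resolution; the exact preprocessing used in LoC is not specified in this report.

from pathlib import Path
from PIL import Image

RAW_DIR = Path("raw_images")        # assumed location of the collected images
OUT_DIR = Path("processed_images")  # assumed output folder for the cleaned data set
TARGET_SIZE = (224, 224)            # assumed standard resolution for the classifier

OUT_DIR.mkdir(exist_ok=True)

for path in RAW_DIR.glob("*"):
    try:
        with Image.open(path) as img:
            # Convert every image to a single colour format and a fixed resolution
            img = img.convert("RGB").resize(TARGET_SIZE)
            img.save(OUT_DIR / (path.stem + ".jpg"), "JPEG")
    except OSError:
        # Skip files that are not valid images (incorrect format)
        print("Skipping", path)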
The rest of this report is organized as follows. Chapter 2 presents the related work and
Chapter 3 presents the system analysis. Chapter 4 describes the system design
for the proposed system and the module description for each module. Chapter 5 describes the
system implementation and the various algorithms. Chapter 6 describes the system
testing, which consists of different test cases for the various modules. Chapter 7 presents the
various results and the discussion of the outputs obtained. Chapter 8 consists of the
conclusion and the future work for the project. Appendix 1 consists of the source code
and software implementations. Appendix 2 consists of the screenshots of the various
output screens. Appendix 3 consists of the technical paper presentation and the
references.
CHAPTER 2
RELATED WORK
The literature review of the papers that were referenced, and of those that serve as the base
paper and supporting papers, provides a detailed description of the state of the
implementation of Augmented Reality in the field of virtual learning. This literature review
outlines the impact of virtual learning on students' attention span, their level of understanding
of a concept, and their confidence in the concepts they learnt through the virtual
learning module. It also describes in detail the use of Machine Learning to recognise
images and of Augmented Reality to augment media content. Conclusively, this literature
review makes it clear that at present there is no effective virtual learning module that
can be adopted widely by all students while also being sophisticated enough to be
able to help students learn what they want to learn.
Bhaskar Kapoor et al. [1] have published a paper on Augmented Reality Based Book
Visualization Using Marker Based Technique, which describes Augmented Reality (AR) as a
class of computer displays that add virtual information to a user's sensory
perceptions. AR usually focuses on see-through devices, worn on the head, that
overlay graphics and text on the user's view of the environment. In layman's terms, it overlays
graphics over a real-world environment in real time. Procuring the right information at the
right time and at the right place is the key in such applications.
Figure 2.1 shows the block diagram of AR. What makes augmented
reality different is how the information is presented: not on a separate display but
along with the user's perceptions. An AR system supplements the real world with virtual
(computer-generated) objects that appear to coexist in the same space as the real world. This
concept was used to create an interactive AR book using an AR application. The study
aimed to implement Augmented Reality on images or diagrams of photosynthesis, the water
cycle and pollution by recognizing the images as markers and adding their 3-D view on top of
the images in the device upon recognition. This way, a person develops more interest in
his/her studies. The person also has the option to study from a video to make
understanding the subject easier.
The intrinsic motivation theory was used to explain motivation in the context of
learning. The attention, relevance, confidence and satisfaction (ARCS) model, shown in
Figure 2.2, guided the understanding of the impact of augmented reality on student
motivation, and the Instructional Materials Motivation Survey was used to design the
research instrument. The impact on student learning motivation was measured by comparing
the learning motivation of students before and after using an AR mobile application, using
pre-usage and post-usage questionnaires.
The study, as shown in Table 2.1, found that the virtual learning approach
to learning new concepts increased the attention span of students by 30.72%, increased the
confidence that students had in the concepts they learnt by 10.74%, increased the
level of satisfaction that students felt after learning a concept through the visual learning
module by 12.50%, and improved the overall effect on the students' learning process by
14.43%. It also decreased relevance by 3.26%, but further analysis stated that this decrease in
relevance is insignificant compared to the other aspects that were affected positively.
[Table 2.1: pre-usage and post-usage scores with the percentage difference for each ARCS factor]
Mohd Azlan Abu et al. [2] have conducted a study on image classification based on Deep
Learning and TensorFlow. Their research studies image classification using a deep neural
network (DNN), also known as Deep Learning, with the TensorFlow framework. Python is
used as the programming language because it works together with the TensorFlow framework.
It is a framework for image classification in which deep neural networks are
applied. There are four phases throughout this process, as shown in Figure 2.3, and
each of the phases is discussed. Each of the phases uses TensorFlow as the
open-source software and Python as its programming language. The process then
continues by collecting some images (inputs), applying the DNN, and finally classifying
all images into their groups.
The flowchart of image classification is shown in Figure 2.4. The input data
mainly consists of the flowers category, in which five types of flowers were
used in the paper. The deep neural network (DNN) was chosen as the best option for the
training process because it produced a high percentage of accuracy. Results were
discussed in terms of the accuracy of the image classification in percentage: roses reached
90.585%, and the same applies to the other types of flowers, where the average result was
90% and above.
This study helped the research team conclude that image classification models
based on deep neural networks are the most efficient, as they produce the most accurate
predictions. Deep neural networks work similarly to how the human brain processes
information, resulting in highly accurate predictions.
The literature review of the papers that were referenced, and of those that serve as the base
paper and supporting papers, helped to narrow down the main challenges that the proposed
application might face. It has also served as a guide for developing the application in
the right way, using the necessary tools. The inferences drawn from the literature survey are
summarized in Table 2.2.
CHAPTER 3
SYSTEM ANALYSIS
Learning is a process that involves a person staying focused on the subject at hand and
going through the subject matter in order to acquire information on that particular
subject; it is a major part of any student's life. Though learning something new is often
exciting, the medium through which a person learns heavily influences the effectiveness of
what he/she learns. Studies have been carried out for centuries to find the most effective
way of learning, and almost every study has shown that learning through visual cues such as
images and videos is the most effective. It provides a few main advantages, such
as a longer attention span, a much higher level of interest, easier understanding of concepts
and the ability to retain information for a longer period of time; the conventional way of
learning through notebooks and class notes lags behind in these aspects. Though learning is
an interesting process on its own, the attention span under the conventional way of learning is
very short, and students easily get distracted by even the tiniest things in their surroundings. This
greatly affects the quality of knowledge that they gain through the process of learning. The
level of understanding of a student in a particular concept determines how long they retain
the information gained through the process of learning. If a student doesn't
understand what has been learnt, recollecting the topics later will be an
impossible task, and the conventional method of learning isn't enough to rectify these issues.
The proposed system aims at developing an Android-based mobile application that
incorporates Augmented Reality and Machine Learning. The proposed system can scan an
image using the camera in the mobile phone; the image is analysed by the Machine
Learning image classification module, where a TensorFlow model recognizes what the image
is about. This information is sent to the Augmented Reality module, which uses the classified
image's label to trigger the relevant video to be played on the 2D plane, using ARCore
(an SDK powered by Google).
3.3.1 Java
Object Oriented - In Java, everything is an Object. Java can be easily extended since it
is based on the Object model.
Interpreted - Java byte code is translated on the fly to native machine instructions and is
not stored anywhere. The development process is more rapid and analytical since the
linking is an incremental and light-weight process.
3.3.2 Python
Easy to Learn and Use - Python is easy to learn and use. It is developer-friendly and
high level programming language.
Extensible - It implies that other languages such as C/C++ can be used to compile the
code, which can then be used further in our Python code.
Large Standard Library - Python has a large and broad standard library and provides a rich set
of modules and functions for rapid application development.
• Support KOTLIN
3.3.4 Firebase
• Cloud Firestore
• Firebase Storage
• ML Kit
CHAPTER 4
SYSTEM DESIGN
Figure 4.1 depicts the basic architecture of the proposed system LoC. The main
components present in the architecture of LoC are Capture Image, Image Analysis, Target
Tracker, Triggered Event and AR Image/Video Output. The first component in this
architecture represents the process of capturing the image and the last component
represents the process of playing the augmented media content over the tracked target layout.
The overall flow depicted in the architecture diagram is as follows. The reader scans an
image on a physical plane with a smartphone camera. The data of the scanned image is
analysed by the machine learning module against the images in Firebase that were used to
train the image classification model. The scanned image is then classified and labelled, and the
target layout of the scanned image is identified by measuring the edges of the scanned image.
The label of the classified image is used to trigger the relevant video content to
play on the layout of the target image. The augmented media content that has to be played is
retrieved from Firebase and played on top of the physical image that was scanned by
the reader.
4.2 MODULES
The proposed system consists of three main modules, namely the Machine Learning
module, the Augmented Reality module and the Database module.
The image of the object is fetched through the camera of a smartphone; this image is
analysed by the Machine Learning image recognition algorithm and its result is input to the
Augmented Reality module. The class label of the recognised image is used to look for
any corresponding triggers. When a match is found, the Target Tracker tracks
the layout of the recognised image and outputs the relevant augmented video, image or
audio for the user to see.
The workflow of the Machine Learning module consists of several sub-processes:
training data, an unknown image as an input, analysis of that image and finally classification
of that image, as shown in Figure 4.3.
The first step in this module is to write a Machine Learning algorithm that can
recognize images and classify them automatically. This Machine Learning algorithm is
trained using a training data set that consists of a number of classes and several hundred
images under each class. The algorithm is written in such a way that it is able to recognize the
shapes and colour patterns in images; once it recognizes the pattern and shape of an image under
a particular class, it learns what object it is by looking at the label of the class that
the image belongs to. Once the algorithm has run through the training data set, it can be tested
using a test data set, which gives the accuracy of the algorithm at its current state. The
accuracy of its recognition depends on the number of images that it processed in the training
set. Once a satisfactory level of accuracy has been achieved, the algorithm can be used to
recognize any new image. The input image fed to the algorithm is analysed for
shapes and colour patterns; if a match is found, it outputs the class that it belongs to,
which is the object recognized in the image.
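A minimal sketch of this training step is given below, using TensorFlow/Keras in Python. The directory layout (dataset/<class name>/*.jpg), the image size, the network architecture and the saved file name are assumptions made for illustration; the report does not document the exact model used in LoC.

import tensorflow as tf

# Assumed layout: dataset/<class_name>/<image>.jpg, one folder per class
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.2, subset="training", seed=42,
    image_size=(224, 224), batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.2, subset="validation", seed=42,
    image_size=(224, 224), batch_size=32)

num_classes = len(train_ds.class_names)

# A small illustrative network; the actual LoC classifier is not documented here
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(num_classes),
])

model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

# Accuracy improves as the model sees more labelled training images
history = model.fit(train_ds, validation_data=val_ds, epochs=10)

model.save("loc_classifier.keras")  # kept for later evaluation and on-device conversion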
The database used in the proposed system is Firebase, whose architecture is shown in
Figure 4.4. Firebase is a mobile and web application development platform developed by
Firebase, Inc., which was later acquired by Google. Firebase is used to store the trained
Machine Learning model and to load that trained model into the Augmented Reality
application. It is also used to store the resource set. Whenever the AR module scans an image,
the ML module analyses that image to find a match; when a match is found, the relevant
AR event is triggered, and Firebase is accessed to retrieve
the relevant video for the AR module to play.
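As an illustration of this role, the sketch below uploads a trained classifier and a resource video to Firebase Storage using the firebase_admin Python SDK. The bucket name, file names and service-account path are placeholders, and the report does not specify whether Firebase Storage or Firebase ML custom model hosting is used; this is only one possible arrangement.

import firebase_admin
from firebase_admin import credentials, storage

# Placeholder service-account key and bucket name
cred = credentials.Certificate("serviceAccountKey.json")
firebase_admin.initialize_app(cred, {"storageBucket": "loc-demo.appspot.com"})

bucket = storage.bucket()

# Trained image classification model, later downloaded by the Android application
bucket.blob("models/classifier.tflite").upload_from_filename("classifier.tflite")

# Resource set: the augmented video played when the matching label is triggered
bucket.blob("resources/snow_leopard.mp4").upload_from_filename("snow_leopard.mp4")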
The UML diagrams discussed below are the use case diagram, class diagram, sequence
diagram and activity diagram.
Figure 4.5 is the use case diagram for the proposed system. It visualizes how the
user, or in this case the reader, interacts with the application. The reader scans an image with
a smartphone. The scanned image is then analysed by the machine learning module. The ML
module compares the scanned image with the images it was trained on, which are
present in Firebase, in order to classify the scanned image. After classifying the
image, it is given a corresponding label. The AR module uses the label that was
assigned to the scanned image as a trigger. The AR module plots the target's layout by
identifying the edges of the scanned image. The trigger causes the AR module to play
the relevant augmented media content on the scanned image's layout. This augmented media
content is visible on the display screen of the reader's smartphone.
Figure 4.6 is the class diagram for the proposed system. It describes the structure
of the proposed system by showing the system's classes, their attributes, their operations and the
relationships among the different objects. The classes present are Vision, Training Set,
Resource Set and AR Tracker. The attributes of the class Vision are Type of Reader, which is
of type String, and Image, which is of type Buffered Image; Vision's function is to scan an
image. The attributes of the class Training Set are Image, which is of type Buffered Image,
and Image Label, which is of type String; the function of the Training Set is to analyse the
scanned image. The attributes of the Resource Set are Label Match, which is of type
Boolean, and Video, which is of type File; its function is to find a match for the label. The
attributes of the class AR Tracker are Image Plot, which is of type Integer, Trigger Event,
which is of type Boolean, AR Content, which is of type String, and Video, which is of
type File; the functions of the AR Tracker class are plotting the target layout and playing
the relevant augmented media content on the plotted target layout. The classes AR Tracker
and Resource Set share a dependency relationship, since AR Tracker depends on the
label provided by Resource Set to trigger the relevant augmented media.
Figure 4.7 depicts the object interactions in the proposed system, arranged in time
sequence. It depicts the objects and classes involved in the scenario and the sequence of
messages exchanged between the objects needed to carry out the functionality of the scenario.
The objects in the sequence diagram are Reader, MLKit, ARCore and Firebase.
The lifeline of the Reader object has two execution occurrences: the first depicts the
process of image recognition through the camera, and the second represents when the image
data is sent to the MLKit object and when the relevant augmented video is received from the
ARCore object. The lifeline of the MLKit object consists of one execution occurrence, which
represents when the recognised image's data is compared with the resource set in the Firebase
database to find a matching label. The lifeline of the ARCore object consists of three
execution occurrences: the first represents when the AR content relevant to the classified
image is obtained from the Firebase object, the second depicts when the target layout is
plotted, and the third represents the initiation of the process that sends the relevant augmented
media content back to the Reader object. The Firebase object's lifeline consists of one
execution occurrence, which represents when the analysed image data is received from the
Reader object and when the relevant AR content is sent to ARCore.
Figure 4.8 is the activity diagram for the proposed system; it graphically
depicts the workflow of stepwise activities and actions, with support for decision branches.
The workflow of the system, as depicted by the activity diagram, is as follows: the
reader launches the camera in the application on a smartphone, and the camera is used to capture the
image. The data of the scanned image is analysed by the Machine Learning module; if a match
is found, it is classified and labelled, otherwise the image has to be captured again. The
labelled image data is sent to the AR module, which tracks the target image layout and tries to find the
image plots; if the plots are found, the relevant augmented media is played over that target
layout, otherwise the image is tracked again to get the target plots. The workflow ends after
the relevant augmented media content has been played.
CHAPTER 5
SYSTEM IMPLEMENTATION
I. Scan an image.
II. The data of the scanned image is sent to the image classification
machine learning algorithm in the form of a binary array.
III. The image classification algorithm looks for a pattern match in the
training data set.
a) If a match is found, the label of that image is sent to the AR module.
IV. The label from the image classification module is used to play the
corresponding augmented video.
The functioning of the system starts when the reader opens the application and points
the smartphone camera at an image. The camera scans the image to extract its features,
such as high contrasts, colour variations, shapes and size, and the data is made ready to be
sent to the next module.
The data of the scanned image is compressed and sent to the image classification
machine learning algorithm in the form of a binary array. The image classification algorithm
recognises the features in the scanned image by processing the binary array data, which is in
matrix form. The numbers in the binary array denote the contrast and the depth of the colours
in the image. This information is used by the image classification algorithm to analyse the
image data for certain features.
The image classification algorithm, after having extracted the data about features
such as contrast, colour intensity variation, colour depth and shape in the scanned
image, searches through the trained image classification model's dataset and its corresponding
labels to find a pattern or feature that matches the one in the scanned image.
If a match is found, the scanned image is labelled with the label of the matching image
and that label is sent to the Augmented Reality module. If a corresponding label is
not found for the particular features in the scanned image, the image classification
module throws a 'No match found' error message.
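The classification step described above can be sketched as follows in Python, assuming the Keras model saved in the earlier training sketch; the class names, confidence threshold and the way the 'No match found' case is raised are illustrative assumptions, since the report does not give the exact decision rule used in LoC.

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("loc_classifier.keras")    # assumed file name
class_names = ["lion_tailed_macaque", "snow_leopard"]         # assumed label order

def classify(image_path, threshold=0.6):
    # Load the scanned image and convert it to an array (the "binary array" above);
    # the model's Rescaling layer normalizes the pixel values internally
    img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
    arr = tf.keras.utils.img_to_array(img)[np.newaxis, ...]

    probs = tf.nn.softmax(model.predict(arr)[0]).numpy()
    best = int(np.argmax(probs))

    # Only accept confident matches; otherwise report that no match was found
    if probs[best] < threshold:
        raise ValueError("No match found")
    return class_names[best]

# The returned label is what the AR module uses to trigger the relevant video
print(classify("scanned_page.jpg"))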
The Augmented Reality module, once it receives the label of the scanned image from
the image classification machine learning module, uses that label to find the augmented
media content that corresponds to that label. The Augmented Reality module also
extracts the geometrical layout information of the image. This helps the Augmented Reality
module to overlay any media content that has to be played right on top of the layout of the
scanned image.
Once the necessary augmented media content is found, it is overlaid on top of the
scanned image and played for the reader to view.
I. A set of training images is defined and labelled based on the class that they
belong to.
II. The main features are extracted from the training images and correlated with their
labels.
III. The image classification model is run through the training data set multiple times to
increase its accuracy, and the model is saved once a satisfactory level of accuracy is
reached.
IV. A testing data set consisting of a new set of images with no labels is created.
V. The saved image classification model is run on the testing data set to measure its
accuracy.
VI. If the result from testing the developed model is satisfactory, the model is saved as
the final product (a minimal sketch of steps V and VI is given after this list).
VII. The final product is used to classify the images that are scanned by the reader.
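The following sketch illustrates steps V and VI: evaluating the saved classifier on a held-out test set and, if the accuracy is acceptable, exporting it for use on the device. The test-set directory, the accuracy threshold and the use of a TensorFlow Lite export are assumptions made for illustration; the report does not state how the final LoC model was packaged.

import tensorflow as tf

model = tf.keras.models.load_model("loc_classifier.keras")

# Held-out test images, arranged one folder per class (assumed layout)
test_ds = tf.keras.utils.image_dataset_from_directory(
    "test_dataset", image_size=(224, 224), batch_size=32, shuffle=False)

loss, accuracy = model.evaluate(test_ds)
print("Test accuracy:", accuracy)

# Step VI: keep the model as the final product only if it is accurate enough
if accuracy >= 0.90:  # assumed satisfaction threshold
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    with open("classifier.tflite", "wb") as f:
        f.write(tflite_model)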
III. The scanned image's information is sent to the ML module, which gives the
label of the scanned image.
IV. The Augmented Reality module uses that label to trigger the corresponding
event.
The following are the functional components used in coding the application LoC:
TensorFlow
Keras
Pandas
Matplotlib
NumPy
ARCore
Unity
5.2.1 TensorFlow
TensorFlow is an open-source machine learning framework developed by Google; in LoC it
is used to build and train the image classification model.
5.2.2 Keras
In addition to standard neural networks, Keras has support for convolutional and
recurrent neural networks. It supports other common utility layers like dropout, batch
normalization, and pooling.
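As a brief illustration of these layer types, the following Keras snippet defines a small convolutional network with pooling, batch normalization and dropout; it is an illustrative sketch with an arbitrary input size and class count, not the network used in LoC.

from tensorflow import keras
from tensorflow.keras import layers

# Small example CNN showing convolutional, pooling, batch-normalization and dropout layers
model = keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),  # 10 example classes
])
model.summary()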
Keras allows users to productize deep models on smartphones (iOS and Android), on
the web, or on the Java Virtual Machine. It also allows the use of distributed training of
deep-learning models on clusters of graphics processing units (GPUs) and tensor processing
units (TPUs), principally in conjunction with CUDA.
5.2.3 Pandas
Pandas provides tools for reading and writing data between in-memory data structures and
different file formats.
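For example, training statistics can be moved between an in-memory DataFrame and a CSV file in a few lines; the file name and the accuracy values below are only illustrative placeholders.

import pandas as pd

# Record per-epoch accuracy in a DataFrame and round-trip it through a CSV file
df = pd.DataFrame({"epoch": [1, 2, 3], "accuracy": [0.71, 0.84, 0.91]})
df.to_csv("training_log.csv", index=False)

restored = pd.read_csv("training_log.csv")
print(restored.describe())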
5.2.4 Matplotlib
Matplotlib is a plotting library for the Python programming language and its
numerical mathematics extension NumPy. It provides an object-oriented API for embedding
plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or
GTK+. In LoC, matplotlib is used to represent the accuracy of the model, graphically.
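A minimal sketch of that use is given below; the accuracy values are made-up placeholders standing in for the values taken from the Keras training history.

import matplotlib.pyplot as plt

# Example per-epoch accuracies; in LoC these would come from the training history
train_acc = [0.62, 0.74, 0.81, 0.86, 0.90]
val_acc = [0.60, 0.70, 0.78, 0.82, 0.85]

plt.plot(train_acc, label="training accuracy")
plt.plot(val_acc, label="validation accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.savefig("model_accuracy.png")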
5.2.5 NumPy
NumPy is a library for the Python programming language, adding support for large,
multi-dimensional arrays and matrices, along with a large collection of high-level
mathematical functions to operate on these arrays. NumPy targets the CPython reference
implementation of Python.
Using NumPy in Python gives functionality comparable to MATLAB, since they are
both interpreted, and they both allow the user to write fast programs as long as most
operations work on arrays or matrices instead of scalars.
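For instance, normalizing a whole batch of image arrays is a single vectorized operation rather than an explicit loop over pixels; the batch shape below is illustrative.

import numpy as np

# A batch of 32 RGB images as one array; normalize every pixel in one operation
batch = np.random.randint(0, 256, size=(32, 224, 224, 3), dtype=np.uint8)
normalized = batch.astype(np.float32) / 255.0
print(normalized.shape, normalized.max())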
5.2.6 Scikit-learn
Scikit-learn is a free software machine learning library for the Python programming
language. It features various classification, regression and clustering algorithms, including
support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is
designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
Scikit-learn is largely written in Python, and uses numpy extensively for high-
performance linear algebra and array operations. Furthermore, some core algorithms are
written in Cython to improve performance. Support vector machines are implemented by a
Cython wrapper around LIBSVM; logistic regression and linear support vector machines by a
similar wrapper around LIBLINEAR. In such cases, extending these methods with Python
may not be possible. Scikit-learn integrates well with many other Python libraries, such as
matplotlib and plotly for plotting, numpy for array vectorization, pandas dataframes, scipy,
and many more.
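As a small illustration (not part of LoC's actual pipeline), a scikit-learn classifier can be trained on feature vectors in a few lines, here using the library's built-in digits data set.

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Toy example: classify the built-in digits data set with a support vector machine
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC().fit(X_train, y_train)
print("Test accuracy:", clf.score(X_test, y_test))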
5.2.7 ARCore
ARCore uses three key capabilities to integrate virtual content with the real world as
seen through the phone's camera:
Motion tracking allows the phone to understand and track its position relative to the
world.
Environmental understanding allows the phone to detect the size and location of all
types of surfaces: horizontal, vertical and angled surfaces such as the ground, a coffee
table or walls.
Light estimation allows the phone to estimate the environment's current lighting
conditions.
5.2.8 Unity
In LoC, Unity engine is used to model the augmented reality part of the application.
CHAPTER 6
SYSTEM TESTING
• Functional Testing
• Non-Functional Testing
Functional testing validates the software system against the functional
requirements. It checks each function of the application by providing the required input and
verifying the output. It can be executed first, using manual or automation testing tools.
Functional testing includes:
• Integration Testing
• Security Testing
• Performance Testing
• Usability Testing
Integration testing is a process where the individual blocks are combined and tested as a
group. It is used to test the data flow against the design so that the system can operate
successfully. System testing then checks the combined behaviour and
validates whether the requirements are implemented correctly.
The final step involves validation testing, which determines whether the software
functions as the user expects. The end user, rather than the system developer, conducts this
test; most software developers use a process called alpha and beta testing to uncover defects that only
the end user seems able to find. The completion of the entire project is based on the full
satisfaction of the end users.
An acceptance test is performed by the client and verifies whether the end-to-end
flow of the system is as per the business requirements and the needs of the
end user. The client accepts the software only when all the features and functionalities work as
expected. It is the last phase of the testing, after which the software goes into production.
This is also called User Acceptance Testing (UAT).
In black box testing, the application is exercised only through its inputs and outputs: an
image is scanned and the resulting augmented output is checked, without examining the
internal files of the system or the changes made to them to produce the required output.
White box testing is the vice versa of black box testing. It observes the internal
variables during testing, which gives a clear idea of what is going on during the execution of the
system, so the point at which a bug occurs becomes clear and the bug can be removed.
CHAPTER 7
OUTPUT AND EXPLANATION
7.1 SCREENSHOTS
The application doesn't have a separate home page; instead, the camera is launched directly
when the application is launched. Figure 7.1 is an example of the working of
the application: the camera has been pointed at an image, and so the relevant video content has
started playing.
Figures 7.2 and 7.3 are other output instances of the application LoC. In
Figure 7.2, a picture of a snow leopard was scanned by the camera, and it can be seen that the
application has played the relevant video. Figure 7.3 is an instance of the application playing
a relevant video of a lion-tailed macaque.
CHAPTER 8
CONCLUSION AND FUTURE WORK
8.1 CONCLUSION
LoC is a new educational assistant that integrates Augmented Reality and Machine
Learning in an android application, to make a learning module that delivers educative content
through interactive medium such as images and video in order to improve the amount of
quality education that a student receives. The android application LoC is built around the
principle that students learn much quicker when there’s an interactive medium such as an
image or a video is involved. The educational system can be revolutionized by implementing
interactive learning modules in everyday classes that students take up, but this level of
integration can be achieved only to small procedural integrations and so it will take a lot of
time.
In the future, Lens of Coeus can be further improved to make the application more
efficient and to achieve the following aspects.
The Lens of Coeus, at its current state, can recognise every image in a particular book
because the image classifier in LoC has been trained with the images that are present in
the book. When a reader scans an image, the features in that image are extracted, and based on
the extracted features the image is labelled and classified. Sometimes the scanned image
will be of very poor quality as a result of very poor lighting conditions; in such cases, the
extracted features might not match any class. Images can only be classified if a class for
that particular image exists in the training data set. If no such class is present, the image
won't be classified and an error/exception will be thrown.
In the future, the application LoC could be developed to such an extent that it will be
able to classify any object or image that the reader scans under any lighting conditions with a
smartphone. To make this possible, a very large training data set has to be defined and labelled.
If the image classifier in LoC is trained with such a training data set, any image/object that
the reader scans under any lighting condition will be classified, and the relevant educational
media content will be augmented on top of the scanned image/object.
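One way to approximate varied lighting conditions without collecting an enormous data set is to augment the existing training images, for example by randomly varying their brightness and contrast. This technique is not described in the report; the sketch below, using tf.image, is only a possible direction, with an assumed dataset/ folder layout.

import tensorflow as tf

# Labelled training images, one folder per class (assumed layout, as in the earlier sketch)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", image_size=(224, 224), batch_size=32)

def augment_lighting(images, labels):
    images = images / 255.0  # scale pixels to [0, 1] before adjusting
    # Randomly vary brightness and contrast to imitate poor lighting conditions
    images = tf.image.random_brightness(images, max_delta=0.2)
    images = tf.image.random_contrast(images, lower=0.7, upper=1.3)
    return tf.clip_by_value(images, 0.0, 1.0), labels

augmented_ds = train_ds.map(augment_lighting)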
The image classifier module in LoC has been developed in such a way that it can
recognise any image related to a particular concept in a textbook used by schools. If a
particular image that the reader has scanned isn't present in any textbook, the application
will simply return an error/exception stating that there is no match for that particular image.
To eliminate this issue, an image classifier model with a wide variety of classes, a
large image data set for each class and the correct label for each image in the training
data set must be created. An image classifier model trained with such a training
data set will be able to classify any image/object, whether it is a concept in a book used in a
school or in any other educational institute.
The image classification module and the augmented reality module require heavy
computational power; such computational power is easily available in a computer or a laptop,
whereas in a smartphone such heavy computational power is hard to come by.
Insufficient computational power might lead to many issues in the application, such as:
• Incorrect labelling
• Overheating issues
To negate these issues, the application must be developed in such a way that it doesn't
put all the processing stress on the smartphone's processor. The application must be optimised
to relieve the processor of some computational stress. This will help increase the
performance of the application.
Augmented Reality and image classification are two modules that each demand heavy
computational power on their own, and when they are combined the power and battery
consumption is large. LoC is designed to require low processing power and battery to work
properly, but the battery consumption can be further reduced to help readers/users use
the application for a much longer duration without having to charge the smartphone
frequently.
APPENDIX 1
SAMPLE CODE
MainActivity.java
package com.example.arimages;

import androidx.appcompat.app.AppCompatActivity;

import android.media.MediaPlayer;
import android.net.Uri;
import android.os.Bundle;

import com.google.ar.core.Anchor;
import com.google.ar.core.AugmentedImage;
import com.google.ar.core.Frame;
import com.google.ar.core.TrackingState;
import com.google.ar.sceneform.AnchorNode;
import com.google.ar.sceneform.FrameTime;
import com.google.ar.sceneform.Scene;
import com.google.ar.sceneform.math.Vector3;
import com.google.ar.sceneform.rendering.Color;
import com.google.ar.sceneform.rendering.ExternalTexture;
import com.google.ar.sceneform.rendering.ModelRenderable;

import java.util.Collection;

public class MainActivity extends AppCompatActivity {

    private CustomArFragment arFragment;
    private Scene scene;
    private ExternalTexture texture;
    private MediaPlayer mediaPlayer;
    private ModelRenderable renderable;
    private boolean isImageDetected = false;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // External texture onto which the video frames are rendered.
        texture = new ExternalTexture();

        // The augmented video; the raw resource name is assumed, the original listing omitted it.
        mediaPlayer = MediaPlayer.create(this, R.raw.video);
        mediaPlayer.setSurface(texture.getSurface());
        mediaPlayer.setLooping(true);

        // Flat "video screen" model whose material shows the external texture.
        ModelRenderable
                .builder()
                .setSource(this, Uri.parse("video_screen.sfb"))
                .build()
                .thenAccept(modelRenderable -> {
                    modelRenderable.getMaterial().setExternalTexture("videoTexture", texture);
                    // Chroma-key colour; the exact value was truncated in the original listing.
                    modelRenderable.getMaterial().setFloat4("keyColor",
                            new Color(0.01843f, 1.0f, 0.098f));
                    renderable = modelRenderable;
                });

        arFragment = (CustomArFragment)
                getSupportFragmentManager().findFragmentById(R.id.arFragment);
        scene = arFragment.getArSceneView().getScene();
        scene.addOnUpdateListener(this::onUpdate);
    }

    // Called on every frame; looks for a tracked augmented image and plays the video once.
    private void onUpdate(FrameTime frameTime) {
        if (isImageDetected)
            return;
        Frame frame = arFragment.getArSceneView().getArFrame();
        Collection<AugmentedImage> augmentedImages =
                frame.getUpdatedTrackables(AugmentedImage.class);
        for (AugmentedImage image : augmentedImages) {
            if (image.getTrackingState() == TrackingState.TRACKING) {
                // Must match the name registered in CustomArFragment's image database ("tar").
                if (image.getName().equals("tar")) {
                    isImageDetected = true;
                    playVideo(image.createAnchor(image.getCenterPose()),
                            image.getExtentX(), image.getExtentZ());
                    break;
                }
            }
        }
    }

    // Anchors the video screen on the detected image and starts playback.
    private void playVideo(Anchor anchor, float extentX, float extentZ) {
        AnchorNode anchorNode = new AnchorNode(anchor);
        mediaPlayer.start();
        texture.getSurfaceTexture().setOnFrameAvailableListener(surfaceTexture -> {
            anchorNode.setRenderable(renderable);
            texture.getSurfaceTexture().setOnFrameAvailableListener(null);
        });
        // Scale the video screen to the physical size of the tracked image.
        anchorNode.setWorldScale(new Vector3(extentX, 1f, extentZ));
        scene.addChild(anchorNode);
    }
}
CustomArFragment.java
package com.example.arimages;

import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.Bundle;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.FrameLayout;

import androidx.annotation.Nullable;

import com.google.ar.core.AugmentedImageDatabase;
import com.google.ar.core.Config;
import com.google.ar.core.Session;
import com.google.ar.sceneform.ux.ArFragment;

public class CustomArFragment extends ArFragment {

    @Override
    protected Config getSessionConfiguration(Session session) {
        Config config = new Config(session);
        config.setUpdateMode(Config.UpdateMode.LATEST_CAMERA_IMAGE);

        // Build the augmented-image database from the target image;
        // the drawable resource name is assumed, the original listing omitted it.
        Bitmap image = BitmapFactory.decodeResource(getResources(), R.drawable.target);
        AugmentedImageDatabase aid = new AugmentedImageDatabase(session);
        aid.addImage("tar", image);
        config.setAugmentedImageDatabase(aid);

        this.getArSceneView().setupSession(session);
        return config;
    }

    @Override
    public View onCreateView(LayoutInflater inflater, @Nullable ViewGroup container,
                             @Nullable Bundle savedInstanceState) {
        FrameLayout frameLayout =
                (FrameLayout) super.onCreateView(inflater, container, savedInstanceState);

        // Hide the default plane-discovery hand animation and instruction view.
        getPlaneDiscoveryController().hide();
        getPlaneDiscoveryController().setInstructionView(null);
        return frameLayout;
    }
}
APPENDIX 2
SCREENSHOTS