
PROJECT TITLE

A MINI PROJECT REPORT

Submitted by

TEAM MEMBER (REG. NO.)

in partial fulfillment for the award of the

degree of

BACHELOR OF ENGINEERING

IN

COMPUTER SCIENCE

KCG COLLEGE OF TECHNOLOGY, KARAPAKKAM

ANNA UNIVERSITY: CHENNAI 600 025

APRIL 2023

ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report titled “PROJECT TITLE” is a bonafide work of

“TEAM MEMBER NAME (REG NO.)” who carried out the project work under

my supervision.

Dr. Cloudin S Supervisor Name

HEAD OF THE DEPARTMENT SUPERVISOR

Professor Designation

Department of CSE Department of CSE

KCG College of Technology KCG College of Technology

Karapakkam, Chennai – 600 097 Karapakkam, Chennai – 600 097

INTERNAL EXAMINER EXTERNAL EXAMINER



ABSTRACT

Font- Times New Roman

Font Size – 14pts

Text Align- Justify



ACKNOWLEDGEMENT

We thank the almighty GOD for the abundant blessings showered on us. We extend our
deepest love and gratitude to our dear parents who built up our career and backed us up in life.

We thank our management and our Principal Dr. P Deiva Sundari, for the opportunities
given to us for our career development.

We feel indebted to the Head of the Department, Dr. Cloudin S, Department of Information Technology, KCG College of Technology, for all his encouragement, which has sustained our labour and efforts.

We express our deepest gratitude to the Internal Guide Name, Designation, Department of Computer Science & Engineering, KCG College of Technology, for his/her valuable guidance, ideas and support.

We would like to thank all the other faculty members of the Department of Computer Science & Engineering for their help and advice throughout our time on this campus.

Finally, we are thankful to all our friends and all others who encouraged us and helped
us in doing this project.

Student’s Name
Reg. No

TABLE OF CONTENTS

CHAPTER NO. TITLE PAGE NO.

ABSTRACT iii

ACKNOWLEDGEMENT iv

LIST OF TABLES ix

LIST OF FIGURES x

LIST OF ABBREVIATIONS xi

1. INTRODUCTION 1

1.1 OVERVIEW 1

1.2 OBJECTIVE 2

1.3 MOTIVATION 2

1.4 PROBLEM DEFINITION 4

1.5 ORGANIZATION OF THE REPORT 4

2. RELATED WORK 6

2.1 LITERATURE REVIEW 6

2.2 WHAT IS AUGMENTED REALITY? 6

2.3 EFFECTS OF VIRTUAL LEARNING 7

2.4 IMAGE CLASSIFICATION USING DL 9

2.5 INFERENCE OF LITERATURE SURVEY 11



3. SYSTEM ANALYSIS 13

3.1 PROBLEM DEFINITION 13

3.2 PROPOSED SOLUTION 13

3.3 SOFTWARE REQUIREMENTS 14

3.3.1 Java 14

3.3.2 Python 15

3.3.3 Android Studio 16

3.3.4 Firebase 17

4. SYSTEM DESIGN 18

4.1 SYSTEM ARCHITECTURE 18

4.2 MODULES 19

4.2.1 Augmented Reality Module 19

4.2.2 Machine Learning Module 20

4.2.3 Database Module 22

4.3 UML DIAGRAMS 22

4.3.1 Use Case Diagram 23

4.3.2 Class Diagram 24

4.3.3 Sequence Diagram 25

4.3.4 Activity Diagram 26

5. SYSTEM IMPLEMENTATION 28

5.1 BASE ALGORITHM 28

5.1.1 Explanation of the Algorithm 28

5.1.2 Image Classification Module 29

5.1.3 Augmented Reality Module 30

5.2 FUNCTIONAL COMPONENTS 30

6. SYSTEM TESTING 35

6.1 TESTING OBJECTIVES 35

6.2 TYPES OF TESTING 35

6.3 COMMON SOFTWARE TESTING TYPES 36

6.4 TEST CASE STRATEGIES 37

6.5 TEST CASES 38

7. OUTPUT AND EXPLANATION 40

7.1 SCREENSHOTS 40

8. CONCLUSION AND FUTURE WORK 42

8.1 CONCLUSION 42

8.2 FUTURE WORK 42

8.2.1 Independent Image Recognition 42

8.2.2 Much Wider Field of Recognition 43

8.2.3 Better Management of Processing Power 43

8.2.4 Reduced Battery Consumption 44

APPENDIX 1: SAMPLE CODE 45

APPENDIX 2: SCREEN SHOTS 53

APPENDIX 3: PAPER PUBLICATION 56

REFERENCES 62

LIST OF TABLES

TABLE NO. TITLE PAGE NO.

2.1 Mean Values for ARCS Model 9

2.2 Inference of the Literature Survey 11

3.1 Software Requirements 14

6.1 Test Case 1 38

6.2 Test Case 1 – Result and Status 38



LIST OF FIGURES

FIGURE NO. TITLE PAGE NO.

1.1 Comparison of mean values 3

2.1 Block diagram of AR 7

2.2 ARCS Model 8

2.3 Block diagram of Image Classification 9

2.4 Flowchart of Image Classification 10

4.1 System Architecture 18

4.2 Block diagram for AR module 20

4.3 Flowchart for ML module 21

4.4 Firebase Architecture 22

4.5 Use Case Diagram 23

4.6 Class Diagram 24

4.7 Sequence Diagram 25

4.8 Activity Diagram 26

7.1 Media Augmented on the Image 1 41

7.2 Media Augmented on the Image 2 42

7.3 Recognising the Image 42



LIST OF ABBREVIATIONS

AR Augmented Reality

ML Machine Learning

DL Deep Learning

CHAPTER 1

INTRODUCTION

1.1 OVERVIEW

Virtual Learning, a unique and advanced method of learning, has been developed and adopted in many educational institutions to spark students' interest in the concepts they have to learn, in a much easier way than the conventional method of learning through books and notes. Many studies have shown that when students learn through virtual and interactive media, i.e. through interactive applications on smart phones, videos, games and pictures, they tend to retain the things they learn for a much longer time and are able to understand even complex concepts easily.

The Virtual Learning approach to learning concepts consists of an interactive smart phone application, a video explaining the concept under consideration, or a short slideshow of pictures explaining each stage of the process involved in that concept. The Virtual Learning domain incorporates two main technological concepts to make the students' learning process easier: Augmented Reality and Machine Learning. Most Virtual Learning applications have only the Augmented Reality module and do not incorporate the Machine Learning part; they function with just the Augmented Reality module.

Augmented Reality is a cutting-edge technology that superimposes digital information on the user's view of the real world. It works on computer-vision-based recognition algorithms to augment graphics, images, video and sound in the real world using a device's camera. In order to play the corresponding graphics, image or video, the user must predefine the target area, such as images, QR codes, etc. It tracks a plane and plays the interactive augmented media that the user has selected on that plane. With just the Augmented Reality module, the user can only play a set of predefined augmented media content.

The incorporation of the Machine Learning module along with the Augmented Reality module enables the user to scan any image or object that falls under a particular domain, and the Augmented Reality module will play the relevant video or other augmented media content on the plane of the scanned image. To improve the accuracy of the Machine Learning image classification model, a large data set can be used to train the model; the size of the training data set directly affects the accuracy of the trained image classification model.

1.2 OBJECTIVE

The goal of this project is to help students to learn their academic lessons in an easier
way, by making the conventional way of learning a fun and interactive one, through the
incorporation of Augmented Reality and Machine Learning in an android application.

The existing interactive Virtual Learning models incorporate Augmented Reality in an Android application, which makes them very interactive, but those applications lag behind in one aspect: they can play augmented media content only for those images that have been predefined. This makes them less useful when the user needs to learn about a particular concept but the book in which the predefined images are present is not at his disposal, or when the lighting conditions are not good enough for the Augmented Reality model to recognise the scanned image as a predefined one.

The proposed solution is the incorporation of a Machine Learning model that can classify any image by comparing the scanned image with the dataset used to train the image classification model. This can be used to negate the issues caused to the Augmented Reality module by unfavourable lighting conditions or situations in which the user does not have the material in which the predefined images are present. The user can scan any image relevant to the predefined image, and the application will recognise that image, classify it automatically and play the relevant augmented media content.

1.3 MOTIVATION

Learning is a process that involves a person staying focused on the subject at hand and going through the subject matter in order to acquire information on it; it is a major part of any student's life. Though learning something new is often exciting, the medium through which a person learns heavily influences the effectiveness of what he/she learns.

Right from the first time a kid is enrolled in a school, he is taught many educational concepts and is asked to learn those concepts through books and class notes. This archaic method of learning can hold the attention of a student only for a short time, which is not enough to learn difficult concepts easily. Studies have long been carried out to find the most effective way of learning, and almost every study has shown that learning through visual cues such as images and videos is the most effective.

Figure 1.1: Comparison of mean values

The graph shown in Figure 1.1 is the result of a case study on the effects of using Virtual Learning. The study showed that the students' attention span on a particular concept increased, their confidence in the concept they learnt improved, and their satisfaction with having learnt something greatly increased.

This has led us to develop the application LoC (Lens of Coeus), which incorporates an Augmented Reality module to make learning an interesting process and a Machine Learning module to aid the Augmented Reality module.

1.4 PROBLEM DEFINITION

Learning, a process that is inevitable and never-ending, has a huge impact on every person's life. The world is changing in its own way, and so are the methods and procedures involved in the educational ecosystem. The archaic way of learning has never been interactive and fun for students, because plain text in books and notes is not interesting or eye-catching. According to various studies carried out on the impact of virtual learning, graphical content such as images and videos is more effective in helping students remember things than plain text. With the help of emerging technologies like Augmented Reality and Machine Learning, the proposed system uses an Android mobile application to scan the images in a textbook through the camera, recognise them by image recognition, and superimpose a related video on the user's real world with the help of Augmented Reality.

The main challenge that this application faces is the availability of the dataset needed to train the Machine Learning model to classify the images that are scanned. The number of images used to train a particular image classification model directly influences the accuracy of that model: if a very large number of images is used, the accuracy of the trained model increases.

The problem here is the process of collecting the required datasets. Collecting a dataset is a tedious task and often runs into problems such as images in an incorrect format, image sizes that are too large and affect the response time of the model, and image resolutions that do not follow a consistent standard, in which case the model takes a longer time to analyse the image and extract its features.

1.5 ORGANIZATION OF THE REPORT

This report is organized as follows. Chapter 2 covers the work related to the development of the Android mobile app using AR and ML. Chapter 3 presents the system analysis, which covers the problem definition, the various software components and the use cases. Chapter 4 describes the system design, which consists of the system architecture of the proposed system and a description of each module. Chapter 5 covers the system implementation and the various algorithms. Chapter 6 describes the system testing, which consists of different test cases for the various modules. Chapter 7 presents the various results and the discussion of the outputs obtained. Chapter 8 presents the conclusion and the future work for the project. Appendix 1 contains the source code and software implementation, Appendix 2 contains the screenshots of the various output screens, and Appendix 3 contains the technical paper publication. The references are listed at the end.

CHAPTER 2

RELATED WORK

2.1 LITERATURE REVIEW

The literature review of the papers that were referenced, including those that serve as the base paper and the supporting papers, provides a detailed description of the state of the implementation of Augmented Reality in the field of Virtual Learning. This literature review outlines the impact of Virtual Learning on students' attention span, their level of understanding of a concept, and their confidence in the concepts they learnt through a Virtual Learning module. It also describes in detail the use of Machine Learning to recognise images and of Augmented Reality to augment media content. Conclusively, this literature review makes it clear that at present there is no effective Virtual Learning module that can be adopted widely by all students while also being sophisticated enough to help students learn what they want to learn.

2.2 WHAT IS AUGMENTED REALITY?

Augmented Reality is a cutting-edge technology that superimposes digital information on the user's view of the real world. It works on computer-vision-based recognition algorithms to augment graphics, images, video and sound in the real world using a device's camera. In order to play the corresponding graphics, image or video, the user must predefine the target area, such as images, QR codes, etc. It tracks a plane and plays the interactive augmented media that the user has selected on that plane.

Bhaskar Kapoor et al. [1] published a paper on Augmented Reality based book visualization using a marker-based technique, suggesting that augmented reality (AR) refers to computer displays that add virtual information to a user's sensory perceptions. AR usually focuses on see-through devices, worn on the head, that overlay graphics and text on the user's view of the environment. In layman's terms, it overlays graphics over a real-world environment in real time. Procuring the right information at the right time and at the right place is the key in such applications.

Figure 2.1: Block diagram of AR

Figure 2.1 above shows the block diagram of AR. What makes augmented reality different is how the information is presented: not on a separate display, but along with the user's perceptions. An AR system augments the real world with virtual (computer-generated) objects that appear to coexist in the same space as the real world. This concept was used to create an interactive AR book using an AR application. The study aimed to implement Augmented Reality on images or diagrams of photosynthesis, the water cycle and pollution by recognising the images as markers and adding a 3-D view on top of the images in the device upon recognition. This way, a person develops more interest towards his/her studies. The person also has the option to study from a video to make the understanding of the subject easier.

2.3 EFFECTS OF VIRTUAL LEARNING

Kevin Johnston et al. [3] conducted research to monitor the impact of an Augmented Reality application on the learning motivation of students. They note that research on augmented reality applications in education is still at an early stage, and that there is a lack of research on the effects and implications of augmented reality in the field of education. The purpose of their research was to measure and understand the impact of an augmented reality mobile application on the learning motivation of undergraduate health science students at the University of Cape Town. It extended their previous research, which looked specifically at the impact of augmented reality technology on student learning motivation.

Figure 2.2: ARCS Model

The intrinsic motivation theory was used to explain motivation in the context of learning. The attention, relevance, confidence, and satisfaction (ARCS) model, shown in Figure 2.2, guided the understanding of the impact of augmented reality on student motivation, and the Instructional Materials Motivation Survey was used to design the research instrument. The impact on student learning motivation was measured by comparing the learning motivation of students before and after using the AR mobile application, using pre-usage and post-usage questionnaires.

As shown in Table 2.1, the study found that the Virtual Learning approach to learning new concepts increased the students' attention by 30.72%, increased the confidence the students had in the concepts they learnt by 10.74%, increased the level of satisfaction the students felt after learning a concept through the visual learning module by 12.50%, and improved the overall effect on the students' learning process by 14.43%. It also decreased the relevance score by 3.26%, but further analysis indicated that this decrease in relevance is insignificant when compared to the other aspects that were affected positively.

Table 2.1: Mean values for ARCS Model

                 PRE-USAGE    POST-USAGE    PERCENTAGE DIFFERENCE

ATTENTION          2.93          3.83         30.72% increase

RELEVANCE          3.37          3.26          3.26% decrease

CONFIDENCE         2.98          3.30         10.74% increase

SATISFACTION       2.96          3.33         12.50% increase

OVERALL            3.05          3.49         14.43% increase
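The percentage differences in Table 2.1 are computed relative to the pre-usage mean, i.e. percentage difference = (post-usage mean - pre-usage mean) / pre-usage mean x 100. For example, attention: (3.83 - 2.93) / 2.93 x 100 = 30.72% increase; relevance: (3.26 - 3.37) / 3.37 x 100 = -3.26%, i.e. a 3.26% decrease.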

2.4 IMAGE CLASSIFICATION USING DEEP LEARNING

Mohd Azlan Abu et al. [2] conducted a study on image classification based on deep learning and TensorFlow. Their research examines image classification using a deep neural network (DNN), also known as deep learning, with the TensorFlow framework. Python is used as the programming language because it works well with the TensorFlow framework.

Figure 2.3: Block diagram of Image Classification



Figure 2.3 shows the framework of image classification in which deep neural networks are applied. There are four phases throughout this process, as shown in Figure 2.3, and each phase uses TensorFlow as the open-source software and Python as its programming language. The process starts by collecting the input images, then applies the DNN, and finally all images are classified into their groups.

Figure 2.4: Flowchart of Image Classification

The flowchart of image classification is shown in Figure 2.4. The input data mainly focuses on the flowers category, in which five types of flowers were used in the paper. The deep neural network (DNN) was chosen as the best option for the training process because it produced a high percentage of accuracy. The results were discussed in terms of the image classification accuracy in percentage: roses reached 90.585%, and the same held for the other types of flowers, where the average result was 90% and above.

This study helped the research team conclude that image classification models based on deep neural networks are the most efficient, as they give the most accurate predictions. Deep neural networks are loosely modelled on the way the human brain processes information, which contributes to the accuracy of their predictions.
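As a rough illustration of the approach described in [2], the following Python sketch builds a small deep neural network image classifier with TensorFlow and Keras. The layer sizes, input size and the five-class flower setup are assumptions for illustration only, not details taken from the cited paper.

    import tensorflow as tf

    # Assumed setup: five flower classes and 180x180 RGB input images.
    NUM_CLASSES = 5
    IMG_SHAPE = (180, 180, 3)

    # A small fully connected deep neural network, standing in for the
    # DNN classifier described in the cited study.
    model = tf.keras.Sequential([
        tf.keras.layers.Rescaling(1.0 / 255, input_shape=IMG_SHAPE),  # normalise pixel values
        tf.keras.layers.Flatten(),                                    # image matrix -> feature vector
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),     # one probability per class
    ])

    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])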

2.5 INFERENCE OF LITERATURE SURVEY

The literature review of the papers that were referenced and those that serve as base
paper and supporting paper helped to narrow down the main challenges that the proposed
application might face. Also it has served as a guide for the development of the application in
the right way, using the necessary tools. From the Table 2.2, the following were inferred from
the literature survey.

Table 2.2: Inference of the Literature Survey

AUTHOR NAME            INFERENCE

Bhaskar Kapoor [1]     AR-enabled books are more interactive, attractive and beneficial to the students. Uses 3D- and 2D-based AR.

Mohd Azlan Abu [2]     Using deep learning, the accuracy and efficiency of image classification are improved.

Kevin Johnston [3]     An AR mobile application improves the learning motivation of students and also contributes towards academic achievements.

Augmented Reality is a technological concept that deals with overlaying digital/augmented media content over a physical real-world object. Markers are used to track an image, and the image recognition module in the backend checks for a match in the predefined set of images and plays the corresponding augmented media content when a match is found. The geometrical plot of the scanned image has to be accurate so that the relevant augmented media content is played right on top of that scanned image's plane. Machine Learning is an advanced computing concept that has developed recently; it enables a computing machine to learn the way humans do. A Deep Neural Network is a Machine Learning concept that incorporates the idea of neurons: each neuron fires when it is used to compute a certain task, and all the neurons collectively act as the brain of a Machine Learning model, helping it achieve the most accurate predictions.

CHAPTER 3

SYSTEM ANALYSIS

3.1 PROBLEM DEFINITION

Learning is a process that involves a person staying focused on the subject at hand and going through the subject matter in order to acquire information on it; it is a major part of any student's life. Though learning something new is often exciting, the medium through which a person learns heavily influences the effectiveness of what he/she learns. Studies have long been carried out to find the most effective way of learning, and almost every study has shown that learning through visual cues such as images and videos is the most effective. It provides a few main benefits, such as a longer attention span, a much higher level of interest, easy understanding of concepts and the ability to retain information for a longer period of time; the conventional way of learning through textbooks and class notes lags behind in these aspects. Though learning is an interesting process on its own, the attention span under the conventional way of learning is very short, and students easily get distracted by even the tiniest things in their surroundings. This greatly affects the quality of knowledge that they gain through the process of learning. The level of understanding a student has of a particular concept determines how long they retain the information gained through the process of learning. If a student does not understand what has been learnt, recollecting the topics that have been learnt will be an impossible task; the conventional method of learning is not enough to rectify these issues.

3.2 PROPOSED SOLUTION

The proposed system aims at developing an Android-based mobile application that incorporates Augmented Reality and Machine Learning. The proposed system scans an image using the camera in the mobile phone; the image is analysed by the Machine Learning image classification module, in which a TensorFlow model recognises what the image is about. This information is sent to the Augmented Reality module, which uses the classified image's label to trigger the relevant video to be played on the 2D plane, using ARCore (an SDK provided by Google).

3.3 SOFTWARE REQUIREMENTS

Table 3.1: Software Requirements

PROGRAMMING LANGUAGE JAVA

SCRIPTING LANGUAGE Python

IDE Android Studio 3.6.1

STORAGE Firebase 19.0

3.3.1 Java

Java is a general-purpose programming language that is class-based, object-oriented, and designed to have as few implementation dependencies as possible. It is intended to let application developers write once, run anywhere (WORA), meaning that compiled Java code can run on all platforms that support Java without the need for recompilation. Java applications are typically compiled to bytecode that can run on any Java virtual machine (JVM) regardless of the underlying computer architecture. The syntax of Java is similar to C and C++, but it has fewer low-level facilities than either of them.

Following are the notable features of Java:

Object Oriented - In Java, everything is an Object. Java can be easily extended since it
is based on the Object model.
Platform Independent - Unlike many other programming languages including C and C++, when Java is compiled it is not compiled into platform-specific machine code but into platform-independent byte code. This byte code is distributed over the web and interpreted by the Java Virtual Machine (JVM) on whichever platform it is being run on.

Portable - Being architecture-neutral and having no implementation-dependent aspects of the specification makes Java portable. The compiler in Java is written in ANSI C with a clean portability boundary, which is a POSIX subset.

Multithreaded - With Java's multithreaded feature it is possible to write programs that can perform many tasks simultaneously. This design feature allows developers to construct interactive applications that run smoothly.

Interpreted - Java byte code is translated on the fly to native machine instructions and is
not stored anywhere. The development process is more rapid and analytical since the
linking is an incremental and light-weight process.

Dynamic - Java is considered to be more dynamic than C or C++ since it is designed to adapt to an evolving environment. Java programs can carry an extensive amount of run-time information that can be used to verify and resolve accesses to objects at run time.

3.3.2 Python

Python is an interpreted, high-level, general-purpose programming language.


Created by Guido van Rossum and first released in 1991, Python's design philosophy
emphasizes code readability with its notable use of significant whitespace. Its language
constructs and object-oriented approach aim to help programmers write clear, logical
code for small and large-scale projects. Python is dynamically typed and garbage-
collected. It supports multiple programming paradigms, including structured
(particularly, procedural), object-oriented, and functional programming. Python is often
described as a "batteries included" language due to its comprehensive standard library.

Python provides lots of features that are listed below.

Easy to Learn and Use - Python is easy to learn and use. It is developer-friendly and a high-level programming language.

Interpreted Language - Python is an interpreted language, i.e. the interpreter executes the code line by line. This makes debugging easy and thus suitable for beginners.

Cross-platform Language - Python can run equally on different platforms such as Windows, Linux, Unix, Macintosh, etc. So we can say that Python is a portable language.

Object-Oriented Language - Python supports object-oriented programming, with the concepts of classes and objects.

Extensible - Other languages such as C/C++ can be used to write and compile parts of the code, which can then be used further in our Python code.

Large Standard Library - Python has a large and broad library and provides a rich set of modules and functions for rapid application development.

3.3.3 Android Studio

Android Studio is the official integrated development environment (IDE) for Google's Android operating system, built on JetBrains' IntelliJ IDEA software and designed specifically for Android development. It is available for download on Windows, macOS and Linux based operating systems. It is a replacement for the Eclipse Android Development Tools (ADT) as the primary IDE for native Android application development. Android Studio was announced on May 16, 2013 at the Google I/O conference. It was in early access preview stage starting from version 0.1 in May 2013, then entered beta stage starting from version 0.8, which was released in June 2014. The first stable build was released in December 2014, starting from version 1.0.

The features of Android Studio are:

• Instant App Run

• Visual Layout Editor

• Intelligent Code Editor

• Help to Connect with Firebase

• Support for Kotlin

3.3.4 Firebase

Firebase is a mobile and web application development platform developed by Firebase, Inc. in 2011 and acquired by Google in 2014. As of March 2020, the Firebase platform has 19 products, which are used by more than 1.5 million apps. In October 2017, Firebase launched Cloud Firestore, a real-time document database, as the successor to the original Firebase Realtime Database.

The features of Firebase are:

• Firebase Realtime Database

• Cloud Firestore

• Firebase Storage

• ML Kit

CHAPTER 4

SYSTEM DESIGN

4.1 SYSTEM ARCHITECTURE

Figure 4.1: System Architecture

Figure 4.1 depicts the basic architecture of the proposed system, LoC. The main components present in the architecture of LoC are Capture Image, Image Analysis, Target Tracker, Triggered Event and AR Image/Video output. The first component in this architecture represents the process of capturing the image and the last component represents the process of playing the augmented media content over the tracked target layout. The overall flow depicted in the architecture diagram is as follows. The reader scans an image on a physical plane with a smart phone camera. The data of the scanned image is analysed by the machine learning module against the images in Firebase that were used to train the image classification model. The scanned image is then classified and labelled, and the target layout of the scanned image is identified by measuring the edges of the scanned image. The label of the classified image is used as a trigger for the relevant video content to play on the layout of the target image. The augmented media content that has to be played is retrieved from Firebase and is played on top of the physical image that was scanned by the reader.

4.2 MODULES

The proposed system consists of the following three main modules, namely

• Augmented Reality Module

• Machine Learning Module

• Database Module

4.2.1 Augmented Reality Module

Augmented Reality is a technology that superimposes a digital image on a real world


object or plane and provides an interactive experience. It is a relatively new and emerging
technology and is being rapidly adapted in many fields. In the android application that is
proposed in this paper, Augmented Reality is used in such a way that it gets the recognized
image from the Machine Learning module as input and a corresponding event is triggered. An
event here is the process of playing a video that is relevant to the image that was recognised.
There are several steps in the Augmented Reality module, input from the Machine Learning
module, analyzing the image to see if it is a trigger, that would trigger an event, fetching the
event to be triggered and displaying the output to the user. The Figure 4.2, depicts the steps
involved in this module.
20

Figure 4.2: Block diagram for AR module

The image of the object is captured through the camera of a smart phone; this image is analysed by the Machine Learning image recognition algorithm and given as input to the Augmented Reality module. The class label of the recognised image is used to look for any corresponding triggers that are present. When a match is found, the Target Tracker tracks the layout of the recognised image and outputs the relevant augmented video, image or audio for the user to see.

4.2.2 Machine Learning Module

The workflow of the Machine Learning module consists of several sub processes,
Training data, an unknown image as an input, analysis of that image and finally classification
of that image, as show in the Figure 4.3.
21

Figure 4.3: Flowchart for ML module

The first step in this module is to write a Machine Learning algorithm that can recognise images and classify them automatically. This Machine Learning algorithm is trained using a training data set that consists of a number of classes and several hundred images under each class. The algorithm is written in such a way that it is able to recognise the shapes and colour patterns in images; once it recognises the pattern and shape of an image under a particular class, it learns what object it is by looking at the label of the class that the image belongs to. Once the algorithm has run through the training data set, it can be tested using a test data set, which gives the accuracy of the algorithm in its current state. The accuracy of its recognition depends on the number of images it processed during training. Once a satisfactory level of accuracy has been achieved, the algorithm can be used to recognise any new image. The input image fed to the algorithm is analysed for shapes and colour patterns, and if a match is found, the algorithm outputs the class that the image belongs to, which corresponds to the object recognised in the image.
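A minimal sketch of the training-data part of this workflow, assuming TensorFlow/Keras and a directory of training images organised into one sub-folder per class, is shown below. The folder name, image size and split ratio are illustrative assumptions, not values taken from the report.

    import tensorflow as tf

    # Each sub-folder of "dataset/" is treated as one class; the folder
    # name becomes the label the model learns for that class.
    train_ds = tf.keras.utils.image_dataset_from_directory(
        "dataset/", validation_split=0.2, subset="training",
        seed=42, image_size=(180, 180), batch_size=32)
    test_ds = tf.keras.utils.image_dataset_from_directory(
        "dataset/", validation_split=0.2, subset="validation",
        seed=42, image_size=(180, 180), batch_size=32)

    class_names = train_ds.class_names  # labels learnt from the folder names

    # model.fit(train_ds, validation_data=test_ds, epochs=10) would then
    # train a classifier such as the one sketched in Section 2.4.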

4.2.3 Database Module

Figure 4.4: Firebase Architecture

The database used in the proposed system is Firebase; its architecture is shown in Figure 4.4. Firebase is a mobile and web application development platform originally developed by Firebase, Inc. and later acquired by Google. In LoC, Firebase is used to store the trained Machine Learning model and to load that trained model into the Augmented Reality application. It is also used to store the resource set. Whenever the AR module scans an image, the ML module analyses that image to find a match; when a match is found, the relevant AR video is triggered, and when an AR event is triggered, Firebase is accessed to retrieve the relevant video for the AR module to play.
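As an illustration only, the trained model file and a video resource could be pushed to Firebase Storage with the Firebase Admin SDK for Python as sketched below. The bucket name, file names and the use of this particular SDK are assumptions for illustration, not details from the report.

    import firebase_admin
    from firebase_admin import credentials, storage

    # Initialise the Admin SDK with a service-account key and a storage bucket.
    cred = credentials.Certificate("service-account.json")
    firebase_admin.initialize_app(cred, {"storageBucket": "loc-app.appspot.com"})
    bucket = storage.bucket()

    # Upload the trained image classification model and one AR video resource.
    bucket.blob("models/classifier.tflite").upload_from_filename("classifier.tflite")
    bucket.blob("resources/photosynthesis.mp4").upload_from_filename("photosynthesis.mp4")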

4.3 UML DIAGRAMS

Unified Modelling Language (UML) is a general-purpose development and modelling language in the field of software engineering that is intended to provide a standard way to visualise the design of a system.

The UML diagrams that are discussed below are,

• Use Case diagram

• Class diagram

• Sequence diagram

• Activity diagram

4.3.1 Use Case Diagram

Figure 4.5: Use Case Diagram

Figure 4.5 is the use case diagram for the proposed system. It visualises how the user, or in this case the Reader, interacts with the application. The Reader scans an image with a smartphone. The scanned image is then analysed by the machine learning module. The ML module compares the scanned image with the images it was trained with, which are present in Firebase, in order to classify the scanned image with a label. After classifying the image, it is given the corresponding label. The AR module uses the label assigned to the scanned image as a trigger. The AR module plots the target's layout by identifying the edges of the scanned image. The trigger causes the AR module to play the relevant augmented media content on the scanned image's layout. This augmented media content is visible on the display screen of the reader's smartphone.

4.3.2 Class Diagram

Figure 4.6: Class Diagram

Figure 4.6 is the class diagram for the proposed system. It describes the structure of the proposed system by showing the system's classes, their attributes, their operations and the relationships among the different objects. The classes present are Vision, Training Set, Resource Set and AR Tracker. The attributes of the class Vision are Type of Reader, which is of type String, and Image, which is of type Buffered Image; Vision's function is to scan an image. The attributes of the class Training Set are Image, which is of type Buffered Image, and Image Label, which is of type String; the function of the Training Set is to analyse the scanned image. The attributes of the Resource Set are Label Match, which is of type Boolean, and Video, which is of type File; its function is to find a match for the label. The attributes of the class AR Tracker are Image Plot, which is of type Integer, Trigger Event, which is of type Boolean, AR Content, which is of type String, and Video, which is of type File; the functions of the AR Tracker class are plotting the target layout and playing the relevant augmented media content on the plotted target layout. The classes AR Tracker and Resource Set share a dependency relationship, since the AR Tracker depends on the label provided by the Resource Set to trigger the relevant augmented media to play.

4.3.3 Sequence Diagram

Figure 4.7: Sequence Diagram

Figure 4.7 depicts the object interactions in the proposed system, arranged in time sequence. It shows the objects and classes involved in the scenario and the sequence of messages exchanged between the objects needed to carry out the functionality of the scenario. The objects in the sequence diagram are Reader, MLKit, ARCore and Firebase. The lifeline of the Reader object has two execution occurrences: the first depicts the process of image recognition through the camera, and the second represents when the image data is sent to the MLKit object and when the relevant augmented video is received back from the ARCore object. The lifeline of the MLKit object consists of one execution occurrence, which represents when the recognised image's data is compared with the resource set in the Firebase database to find a matching label. The lifeline of the ARCore object consists of three execution occurrences: the first represents when the AR content relevant to the classified image is obtained from the Firebase object, the second depicts when the target layout is plotted, and the third represents the initiation of the process that sends the relevant augmented media content back to the Reader object. The Firebase object's lifeline consists of one execution occurrence, which represents when the analysed image data is received from the Reader object and when the relevant AR content is sent to ARCore.

4.3.4 Activity Diagram

Figure 4.8: Activity Diagram


Figure 4.8 is the activity diagram for the proposed system; it graphically depicts the workflow of stepwise activities and actions, with support for decision branches. The workflow of the system, as depicted by the activity diagram, is as follows. The reader launches the camera application on a smartphone and the camera is used to capture the image. The data of the scanned image is analysed by the Machine Learning module; if a match is found, the image is classified and labelled, otherwise the image has to be captured again. The labelled image data is sent to the AR module, which tracks the target image layout and tries to find the image plots; if the plots are found, the relevant augmented media is played over that target layout, otherwise the image is tracked again to get the target plots. The workflow ends after the relevant augmented media content has been played.

CHAPTER 5

SYSTEM IMPLEMENTATION

5.1 BASE ALGORITHM

I. Scan an image.

II. The data of the scanned image is sent to the image classification machine learning algorithm in the form of a binary array.

III. The image classification algorithm looks for a pattern match in the training data set.

a) If a match is found, the label of that image is sent to the AR module.

b) Else, a 'no match found' error message is thrown.

IV. The label from the image classification module is used to play the corresponding augmented video.
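The following Python sketch outlines the flow of the base algorithm. The function names and the mapping from labels to video files are placeholders for illustration, not the actual LoC implementation; the AR playback step is represented only by the value handed back to the AR module.

    import numpy as np

    # Placeholder mapping from a predicted class label to an augmented video.
    LABEL_TO_VIDEO = {"photosynthesis": "photosynthesis.mp4",
                      "water_cycle": "water_cycle.mp4"}

    def classify_frame(model, frame, class_names):
        """Steps II/III: send the scanned image (a preprocessed HxWx3 array) to the classifier."""
        batch = np.expand_dims(frame, axis=0)          # shape (1, H, W, 3)
        probabilities = model.predict(batch)[0]        # one probability per class
        return class_names[int(np.argmax(probabilities))]

    def handle_scan(model, frame, class_names):
        """Steps III/IV: map the predicted label to an AR event, or report no match."""
        label = classify_frame(model, frame, class_names)
        video = LABEL_TO_VIDEO.get(label)
        if video is None:
            raise LookupError("no match found")        # Step III(b)
        return video                                   # handed to the AR module (Step IV)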

5.1.1 Explanation of the Algorithm

The functioning of the system starts when the reader opens the application and points the smart phone camera at an image. The camera scans the image to extract its features, such as high contrasts, colour variations, shapes and size, and the data is made ready to be sent to the next module.

The data of the scanned image is compressed and sent to the image classification machine learning algorithm in the form of a binary array. The image classification algorithm recognises the features in the scanned image by processing the binary array data, which is in matrix form. The numbers in the binary array denote the contrast of the colours in the image and the depth of the colour in the image. This information is used by the image classification algorithm to analyse the image data for certain features.

The image classification algorithm, after having extracted the data about features such as contrasts, colour intensity variations, colour depth and shape in the scanned image, searches through the trained image classification model's dataset and the corresponding labels to find a pattern or feature that matches the one in the scanned image.

If a match is found, the scanned image is labelled with the label of the image that matched, and that label is sent to the Augmented Reality module. If a corresponding label is not found for the particular features in the scanned image, the image classification module throws a 'No match found' error message.

The Augmented Reality module, once it receives the label of the scanned image from the image classification machine learning module, uses that label to find the corresponding augmented media content that correlates with that label. The Augmented Reality module also extracts the geometrical layout information of the image; this helps it overlay any media content that has to be played right on top of the layout of the scanned image. Once the necessary augmented media content is found, it is overlaid on top of the scanned image and played for the reader to view.
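To make the binary-array representation described above concrete, the sketch below loads a scanned image and converts it into the normalised numeric matrix that the classifier consumes. The file name and target size are illustrative assumptions.

    import tensorflow as tf

    # Load the scanned image and resize it to the classifier's input size.
    image = tf.keras.utils.load_img("scanned_page.jpg", target_size=(180, 180))

    # Convert to a numeric array: one entry per pixel, three colour channels.
    array = tf.keras.utils.img_to_array(image)      # shape (180, 180, 3), values 0-255
    array = array / 255.0                           # normalise colour intensities to 0-1

    print(array.shape)  # (180, 180, 3): the matrix form analysed for features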

5.1.2 Image Classification Module

I. A set of training images is defined, and the images are labelled based on the class that they belong to.

II. The main features are extracted from the training images and are correlated with their
labels.

III. The image classification model is run through the training data set multiple times to
increase its accuracy and the model is saved once a satisfactory level of accuracy is
reached.

IV. A testing data set consisting of a new set of unlabelled images is created.

V. The saved image classification model is run through the testing data set to measure its
accuracy.
30

VI. If the result from testing the developed model is satisfactory, the model is saved as
the final product.

VII. The final product is used to classify the images that are scanned by the reader.
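A minimal Keras sketch of steps III to VI is given below, assuming the model, train_ds and test_ds objects from the earlier sketches in Sections 2.4 and 4.2.2. The epoch count, accuracy threshold and file name are illustrative assumptions.

    # Step III: run the model through the training data set multiple times (epochs).
    model.fit(train_ds, epochs=10)

    # Steps IV-V: measure accuracy on a separate, unseen testing data set.
    loss, accuracy = model.evaluate(test_ds)
    print(f"test accuracy: {accuracy:.2%}")

    # Step VI: save the model as the final product once the accuracy is satisfactory.
    if accuracy >= 0.90:
        model.save("loc_classifier.h5")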

5.1.3 Augmented Reality Module

I. The reader scans an image with the smartphone's camera.

II. The scanned image is tracked by extracting the edge plots/boundary information of the image.

III. The scanned image's information is sent to the ML module, which returns the label of the scanned image.

IV. The Augmented Reality module uses that label to trigger the corresponding event.

V. The corresponding augmented media content is overlaid on top of the scanned image.

5.2 FUNCTIONAL COMPONENTS OF CODING

The following are the functional components of coding in the application LoC,

• TensorFlow

• Keras

• Pandas

• Matplotlib

• NumPy

• Sklearn / Scikit-Learn

• ARCore

• Unity

5.2.1 TensorFlow

TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state of the art in ML and lets developers easily build and deploy ML-powered applications. In LoC's image classification module, TensorFlow plays a major role: it is used to develop the image classification model.

5.2.2 Keras

Keras is an open-source neural-network library written in Python. It is capable of running on top of TensorFlow, Microsoft Cognitive Toolkit, R, Theano, or PlaidML. Designed to enable fast experimentation with deep neural networks, it focuses on being user-friendly, modular, and extensible.

Keras contains numerous implementations of commonly used neural-network building blocks such as layers, objectives, activation functions and optimizers, along with a host of tools that make working with image and text data easier, simplifying the coding necessary for writing deep neural network code.

In addition to standard neural networks, Keras has support for convolutional and
recurrent neural networks. It supports other common utility layers like dropout, batch
normalization, and pooling.

Keras allows users to productize deep models on smartphones (iOS and Android), on
the web, or on the Java Virtual Machine. It also allows use of distributed training of deep-
learning models on clusters of Graphics processing units (GPU) and tensor processing units
(TPU) principally in conjunction with CUDA.

5.2.3 Pandas

In computer programming, pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

The features of Pandas are,

• DataFrame object for data manipulation with integrated indexing.

• Tools for reading and writing data between in-memory data structures and different file formats.

• Data alignment and integrated handling of missing data.

• Reshaping and pivoting of data sets.

• Label-based slicing, fancy indexing, and subsetting of large data sets.

• Data structure column insertion and deletion.

• Group-by engine allowing split-apply-combine operations on data sets.

• Data set merging and joining.

• Hierarchical axis indexing to work with high-dimensional data in a lower-dimensional data structure.

• Time series functionality: date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting and lagging.

• Data filtration.
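The report does not state exactly how pandas is used in LoC; as one plausible illustration only, a DataFrame could hold the catalogue that links each class label to its augmented video resource, using the label-based filtering listed above. The labels and file names below are hypothetical.

    import pandas as pd

    # Hypothetical catalogue of class labels and their AR video resources.
    catalogue = pd.DataFrame({
        "label": ["photosynthesis", "water_cycle", "pollution"],
        "video": ["photosynthesis.mp4", "water_cycle.mp4", "pollution.mp4"],
    })

    # Label-based filtering of the catalogue.
    match = catalogue[catalogue["label"] == "water_cycle"]
    print(match["video"].iloc[0])   # water_cycle.mp4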

5.2.4 Matplotlib

Matplotlib is a plotting library for the Python programming language and its
numerical mathematics extension NumPy. It provides an object-oriented API for embedding
plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or
GTK+. In LoC, matplotlib is used to represent the accuracy of the model, graphically.
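A small sketch of how the model's accuracy could be plotted with matplotlib is given below. The per-epoch values shown are placeholders; in practice they would come from the history returned by model.fit.

    import matplotlib.pyplot as plt

    # Placeholder per-epoch accuracy values; normally taken from
    # history = model.fit(...); history.history["accuracy"].
    epochs = [1, 2, 3, 4, 5]
    accuracy = [0.62, 0.74, 0.81, 0.86, 0.90]

    plt.plot(epochs, accuracy, marker="o")
    plt.xlabel("Epoch")
    plt.ylabel("Training accuracy")
    plt.title("Image classification model accuracy")
    plt.show()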

5.2.5 NumPy

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays. NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents. NumPy addresses the slowness problem partly by providing multidimensional arrays and functions and operators that operate efficiently on arrays; this requires rewriting some code, mostly inner loops, using NumPy.

Using NumPy in Python gives functionality comparable to MATLAB since they are
both interpreted, and they both allow the user to write fast programs as long as most
operations work on arrays or matrices instead of scalars.

5.2.6 SKLearn/ Scikit - Learn

Scikit-learn is a free software machine learning library for the Python programming
language. It features various classification, regression and clustering algorithms including
support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is
designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Scikit-learn is largely written in Python, and uses numpy extensively for high-
performance linear algebra and array operations. Furthermore, some core algorithms are
written in Cython to improve performance. Support vector machines are implemented by a
Cython wrapper around LIBSVM; logistic regression and linear support vector machines by a
similar wrapper around LIBLINEAR. In such cases, extending these methods with Python
may not be possible. Scikit-learn integrates well with many other Python libraries, such as
matplotlib and plotly for plotting, numpy for array vectorization, pandas dataframes, scipy,
and many more.
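The report does not specify which scikit-learn routines LoC relies on; one plausible use, sketched below with made-up predictions, is evaluating the classifier's predictions against the true test labels.

    from sklearn.metrics import accuracy_score, confusion_matrix

    # Hypothetical true labels and model predictions for a small test batch.
    y_true = ["rose", "daisy", "tulip", "rose", "daisy"]
    y_pred = ["rose", "daisy", "rose", "rose", "daisy"]

    print(accuracy_score(y_true, y_pred))                  # 0.8
    print(confusion_matrix(y_true, y_pred,
                           labels=["rose", "daisy", "tulip"]))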

5.2.7 ARCore

ARCore is Google's platform for building augmented reality experiences. Using different APIs, ARCore enables your phone to sense its environment, understand the world and interact with information. Some of the APIs are available across Android and iOS to enable shared AR experiences.

ARCore uses three key capabilities to integrate virtual content with the real world as
seen through your phone's camera:

• Motion tracking allows the phone to understand and track its position relative to the world.

• Environmental understanding allows the phone to detect the size and location of all types of surfaces: horizontal, vertical and angled surfaces like the ground, a coffee table or walls.

• Light estimation allows the phone to estimate the environment's current lighting conditions.

5.2.8 Unity

Unity is a cross-platform game engine developed by Unity Technologies, first announced and released in June 2005 at Apple Inc.'s Worldwide Developers Conference as a Mac OS X-exclusive game engine. As of 2018, the engine had been extended to support more than 25 platforms. The engine can be used to create three-dimensional, two-dimensional, virtual reality, and augmented reality games, as well as simulations and other experiences. The engine has been adopted by industries outside video gaming, such as film, automotive, architecture, engineering and construction.

In LoC, Unity engine is used to model the augmented reality part of the application.

CHAPTER 6

SYSTEM TESTING

6.1 TESTING OBJECTIVES

Testing is a set of activities that can be planned in advance and conducted systematically. For this reason, a template for software testing, that is, a set of steps into which we can place specific test case design techniques and testing methods, should be defined for the software process. Testing often accounts for more effort than any other software engineering activity. If it is conducted haphazardly, time is wasted, unnecessary effort is expended and, even worse, errors sneak through undetected. It therefore seems reasonable to establish a systematic strategy for testing software.

6.2 TYPES OF TESTING

In this methodology we use two types of testing:

• Functional Testing

• Non-Functional Testing

6.2.1 Functional Testing

Functional testing validates the software system against the functional requirements. It checks each function of the application by providing the required input and verifying the output. It is executed first and uses manual or automated testing tools. This testing includes:

• Integration Testing

• Security Testing

6.2.2 Non-Functional Testing

Non-functional testing is used to check the performance, usability and reliability of the software. It can be performed only after functional testing and checks how well the product works. It is difficult to carry out manually. Non-functional testing can improve the user experience to a large extent, which has an impact on the quality of the software. This testing includes:

• Performance Testing

• Usability Testing

6.3 COMMON SOFTWARE TESTING TYPES

6.3.1 Integration Testing

Integration testing is a process in which the individual blocks are combined and tested as a group. It is used to verify the data flow across the design so that the system can operate successfully. Integration testing thus checks the combined behaviour of the modules and validates whether the requirements are implemented correctly.

6.3.2 Performance Testing

Performance testing is used to check the speed, reliability and scalability of the software. It is done to provide stakeholders with information about their application and can demonstrate how the system meets its pre-defined performance criteria. The cost of performance testing is usually more than made up for by improved customer satisfaction.

6.3.3 Validation Testing

The final step involves validation testing, which determines whether the software functions as the user expects. The end user, rather than the system developer, conducts this test; most software developers carry out a process called alpha and beta testing to uncover defects that only the end user seems able to find. The completion of the entire project is based on the full satisfaction of the end users.

6.3.4 Acceptance Testing

An acceptance test is performed by the client and verifies whether the end-to-end flow of the system is as per the business requirements and the needs of the end user. The client accepts the software only when all the features and functionalities work as expected. It is the last phase of testing, after which the software goes into production. This is also called User Acceptance Testing (UAT).

6.3.5 Backend Testing

Whenever an input or data is entered in the front-end application, it is stored in the database, and the testing of such a database is known as database testing or backend testing. There are different databases like SQL Server, MySQL and Oracle. Database testing involves testing the table structure, schema, stored procedures, data structure and so on. In backend testing the GUI is not involved; testers are directly connected to the database with proper access and can easily verify data by running a few queries on the database.

6.4 TEST CASE STRATEGIES

6.4.1 Black Box Testing

In black box testing, the application is exercised only through its inputs and outputs: an image is scanned and the displayed output is checked against the expected result. The testing does not look at the internal files in the system or at the changes made to them to produce the required output.

6.4.2 White Box Testing

It is the converse of black box testing: the internal variables are observed during testing. This gives a clear idea about what is going on during the execution of the system. The point at which a bug occurred was therefore clear, and the bug was removed.

6.5 TEST CASES

Table 6.1: Test Case 1

Project Name: Lens of Coeus – AR + ML based Educational Assistant

Test Case ID: 01                          Test Designed by: Josh Sathyajith A

Test Priority: High                       Test Designed date: 25/04/2020

Modular Name: Output Check                Test Executed by: Bezeleel Samraj A & Manoah Edwin Paul

Test Title: Verify Output Displayed       Test Executed date: 25/04/2020

Description: Testing the AR output precision

Table 6.2: Test Case 1 – Result and Status

Step 1: Test Steps: Scan large sized Alphabets. Test Data: Image from Kindergarten books. Expected Result: Object video related to the particular Alphabet. Actual Result: Object video related to the particular Alphabet. Status: Pass.

Step 2: Test Steps: Scan an image. Test Data: Image from school textbook. Expected Result: Video related to the image for school student level. Actual Result: Video related to the image for school student level. Status: Pass.

Step 3: Test Steps: Scan an image. Test Data: Image from Engineering textbook. Expected Result: Video related to the image at Engineering level. Actual Result: Video related to the image at Engineering level. Status: Pass.

Step 4: Test Steps: Scan an image. Test Data: Image from Magazine. Expected Result: Video related to the image from the Magazine. Actual Result: Video related to the image from the Magazine. Status: Pass.

Step 5: Test Steps: Scan an image. Test Data: Image from Newspaper. Expected Result: Latest video related to the image from the Newspaper. Actual Result: Latest video related to the image from the Newspaper. Status: Pass.

Step 6: Test Steps: Scan an erroneous image. Test Data: Outdated or invalid image. Expected Result: No video to be displayed. Actual Result: No video to be displayed. Status: Pass.
40

CHAPTER 7

OUTPUT AND EXPLANATION

7.1 SCREENSHOTS

Figure 7.1: Media Augmented on the Image 1

The application does not have a separate home page; instead, the camera is launched directly
when the application is opened. Figure 7.1 is an example of the working of the application: the
camera has been pointed at an image, and the relevant video content has started playing.

Figure 7.2: Media Augmented on the Image 2

Figures 7.2 and 7.3 show further output instances of the application LoC. In Figure 7.2, a
picture of a snow leopard has been scanned by the camera, and it can be seen that the application
has played the relevant video. Figure 7.3 is an instance of the application playing a relevant
video of a lion-tailed macaque.

Figure 7.3: Recognising the Image



CHAPTER 8

CONCLUSION AND FUTURE WORK

8.1 CONCLUSION

LoC is a new educational assistant that integrates Augmented Reality and Machine
Learning in an Android application to create a learning module that delivers educative content
through interactive media such as images and video, in order to improve the quality of the
education that a student receives. The Android application LoC is built around the principle
that students learn much more quickly when an interactive medium such as an image or a video is
involved. The educational system can be revolutionised by implementing interactive learning
modules in the everyday classes that students take, but this level of integration can currently
be achieved only through small, procedural steps and will therefore take a considerable amount
of time.

8.2 FUTURE WORK

In the future, Lens of Coeus can be improved further to make the application more
efficient and to achieve the following aspects:

• Better Image Recognition

• Much wider field of recognition

• Better management of processing power

• Reduced battery consumption

8.2.1 Independent Image Recognition

In its current state, Lens of Coeus can recognise every image in a particular book
because the image classifier in LoC has been trained with the images that are present in that
book. When a reader scans an image, the features in the image are extracted, and based on the
extracted features the image is labelled and classified. Sometimes the scanned image will be of
very poor quality because of poor lighting conditions; in such cases the extracted features might
not match any class. An image can only be classified if a class for that particular image exists
in the training data set. If no such class is present, the image will not be classified and an
error/exception will be thrown.

In the future, the application LoC could be developed to the extent that it will be able
to classify any object or image that the reader scans with a smartphone, under any lighting
conditions. To make this possible, a very large training data set has to be defined and
labelled. If the image classifier in LoC is trained with such a data set, any image or object
that the reader scans, under any lighting condition, will be classified and the relevant
educative media content will be augmented on top of it.
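
As one possible way of handling the unmatched-class case more gracefully, the following sketch
assumes a TensorFlow Lite classifier (the report does not specify which classifier library LoC
uses, so this is an illustrative assumption); instead of throwing an exception, an image whose
best score falls below a confidence threshold is reported as unrecognised, so the application can
simply show no video:

// Hypothetical sketch: classify a scanned image and reject low-confidence results.
// Assumes a TensorFlow Lite model file and a label list; not part of the current LoC code.
import org.tensorflow.lite.Interpreter;
import java.io.File;
import java.nio.ByteBuffer;
import java.util.List;

public class ImageClassifierSketch {
    private static final float CONFIDENCE_THRESHOLD = 0.6f;  // assumed cut-off value
    private final Interpreter interpreter;
    private final List<String> labels;

    public ImageClassifierSketch(File modelFile, List<String> labels) {
        this.interpreter = new Interpreter(modelFile);
        this.labels = labels;
    }

    // Returns the label of the best class, or null when no class is confident enough,
    // so the caller can show "no match" instead of throwing an exception.
    public String classify(ByteBuffer preprocessedImage) {
        float[][] scores = new float[1][labels.size()];
        interpreter.run(preprocessedImage, scores);

        int best = 0;
        for (int i = 1; i < labels.size(); i++) {
            if (scores[0][i] > scores[0][best]) best = i;
        }
        return scores[0][best] >= CONFIDENCE_THRESHOLD ? labels.get(best) : null;
    }
}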

8.2.2 Much Wider Field of Recognition

The image classifier module in LoC has been developed in such a way that it can
recognise any image related to a particular concept in a textbook used by schools. If a
particular image that the reader has scanned isn’t present in any textbook, then the application
will just return an error/exception stating that there is no match for that particular image.

To eliminate this issue, an image classifier model must be trained with a wide variety of
classes, a large image data set for each class, and the correct label for every image in the
training data set. An image classifier model that has been trained with such a data set will be
able to classify any image or object, whether it appears in a book used in schools or in any
other educational institute, for that matter.
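
In the nearer term, the field of recognition of the existing ARCore pipeline shown in Appendix 1
can be widened simply by registering more reference images in the AugmentedImageDatabase. The
sketch below assumes that the additional reference pictures are shipped as drawable resources;
the resource ids and image names used here are illustrative only:

// Sketch: register several reference images instead of the single "tar" image.
// The drawable ids and names below are assumptions made for illustration only.
int[] drawableIds = { R.drawable.tar, R.drawable.snow_leopard, R.drawable.macaque };
String[] imageNames = { "tar", "snow_leopard", "macaque" };

AugmentedImageDatabase aid = new AugmentedImageDatabase(session);
for (int i = 0; i < drawableIds.length; i++) {
    Bitmap bitmap = BitmapFactory.decodeResource(getResources(), drawableIds[i]);
    aid.addImage(imageNames[i], bitmap);   // each name can later be matched in onUpdate()
}
config.setAugmentedImageDatabase(aid);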

8.2.3 Better Management of Processing Power

The image classification module and the augmented reality module both require heavy
computational power. Such computational power is easily available in a computer or a laptop,
whereas in a smartphone it is hard to come by. Insufficient computational power might lead to
several issues in the application, including:

• Not recognising an image

• Incorrect labelling

• Unable to play augmented content



• Overheating issues

• Too much battery consumption

To negate these issues, the application must be developed in such a way that it does not
put all of the processing stress on the smartphone's processor. The application must be optimised
to relieve the processor of some of the computational load, which will help increase the
performance of the application.
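
One simple optimisation along these lines is to keep the heavy classification work off the UI
thread. The sketch below assumes a classifyFrame() helper and a showResult() callback, neither of
which exists in the current code; it only illustrates moving the work onto a background executor
inside MainActivity:

// Sketch: members that could be added to MainActivity so heavy classification work runs
// on a background thread and the UI/AR rendering thread is not blocked.
// Requires java.util.concurrent.ExecutorService and java.util.concurrent.Executors imports.
// classifyFrame() and showResult() are assumed placeholder methods, not part of the current code.
private final ExecutorService backgroundExecutor = Executors.newSingleThreadExecutor();

private void classifyInBackground(final Bitmap cameraFrame) {
    backgroundExecutor.execute(() -> {
        String label = classifyFrame(cameraFrame);       // heavy work off the UI thread
        runOnUiThread(() -> showResult(label));          // only the result touches the UI
    });
}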

8.2.4 Reduced Battery Consumption

Augmented reality and image classification are both computationally demanding modules on
their own, and when they are combined the power and battery consumption is large. LoC is designed
to require comparatively low processing power and battery to work properly, but the battery
consumption can be reduced further so that readers can use the application for a prolonged
duration without having to charge the smartphone frequently.
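
A small, concrete example of such a saving, sketched under the assumption that MainActivity keeps
the mediaPlayer field shown in Appendix 1, is to stop video playback whenever the activity leaves
the foreground; the Sceneform ArFragment already pauses the AR session in its own onPause, so
only the MediaPlayer handling needs to be added:

// Sketch: stop video playback while the app is in the background to save battery.
// Added to MainActivity; mediaPlayer is the field already defined in Appendix 1.
@Override
protected void onPause() {
    super.onPause();
    if (mediaPlayer != null && mediaPlayer.isPlaying()) {
        mediaPlayer.pause();          // avoid decoding video while the app is not visible
    }
}

@Override
protected void onDestroy() {
    super.onDestroy();
    if (mediaPlayer != null) {
        mediaPlayer.release();        // free the codec and surface resources
        mediaPlayer = null;
    }
}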

APPENDIX 1

SAMPLE CODE

MainActivity.java

package com.example.arimages;

import androidx.appcompat.app.AppCompatActivity;

import android.media.MediaPlayer;
import android.net.Uri;
import android.os.Bundle;

import com.google.ar.core.Anchor;
import com.google.ar.core.AugmentedImage;
import com.google.ar.core.Frame;
import com.google.ar.core.TrackingState;
import com.google.ar.sceneform.AnchorNode;
import com.google.ar.sceneform.FrameTime;
import com.google.ar.sceneform.Scene;
import com.google.ar.sceneform.math.Vector3;
import com.google.ar.sceneform.rendering.Color;
import com.google.ar.sceneform.rendering.ExternalTexture;
import com.google.ar.sceneform.rendering.ModelRenderable;

import java.util.Collection;

public class MainActivity extends AppCompatActivity {

    private ExternalTexture texture;
    private MediaPlayer mediaPlayer;
    private CustomArFragment arFragment;
    private Scene scene;
    private ModelRenderable renderable;
    private boolean isImageDetected = false;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // Prepare an external texture and a looping media player for the educative video.
        texture = new ExternalTexture();
        mediaPlayer = MediaPlayer.create(this, R.raw.video);
        mediaPlayer.setSurface(texture.getSurface());
        mediaPlayer.setLooping(true);

        // Build the "video screen" model and bind the video texture and key colour to it.
        ModelRenderable
                .builder()
                .setSource(this, Uri.parse("video_screen.sfb"))
                .build()
                .thenAccept(modelRenderable -> {
                    modelRenderable.getMaterial().setExternalTexture("videoTexture", texture);
                    modelRenderable.getMaterial().setFloat4("keyColor",
                            new Color(0.01843f, 1f, 0.098f));
                    renderable = modelRenderable;
                });

        arFragment = (CustomArFragment)
                getSupportFragmentManager().findFragmentById(R.id.arFragment);
        scene = arFragment.getArSceneView().getScene();
        scene.addOnUpdateListener(this::onUpdate);
    }

    // Called on every frame; looks for a tracked reference image and starts the video once.
    private void onUpdate(FrameTime frameTime) {
        if (isImageDetected)
            return;

        Frame frame = arFragment.getArSceneView().getArFrame();
        Collection<AugmentedImage> augmentedImages =
                frame.getUpdatedTrackables(AugmentedImage.class);

        for (AugmentedImage image : augmentedImages) {
            if (image.getTrackingState() == TrackingState.TRACKING) {
                // The name must match the one registered in CustomArFragment ("tar").
                if (image.getName().equals("tar")) {
                    isImageDetected = true;
                    playVideo(image.createAnchor(image.getCenterPose()),
                            image.getExtentX(), image.getExtentZ());
                    break;
                }
            }
        }
    }

    // Anchors the video screen on the detected image, scales it to the image size
    // and starts playback.
    private void playVideo(Anchor anchor, float extentX, float extentZ) {
        mediaPlayer.start();
        AnchorNode anchorNode = new AnchorNode(anchor);

        // Attach the renderable only after the first video frame is available.
        texture.getSurfaceTexture().setOnFrameAvailableListener(surfaceTexture -> {
            anchorNode.setRenderable(renderable);
            texture.getSurfaceTexture().setOnFrameAvailableListener(null);
        });

        anchorNode.setWorldScale(new Vector3(extentX, 1f, extentZ));
        scene.addChild(anchorNode);
    }
}

CustomArFragment.java

package com.example.arimages;

import android.content.Context;
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.Bundle;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.FrameLayout;

import androidx.annotation.Nullable;

import com.google.ar.core.AugmentedImageDatabase;
import com.google.ar.core.Config;
import com.google.ar.core.Session;
import com.google.ar.sceneform.ux.ArFragment;

public class CustomArFragment extends ArFragment {

    @Override
    protected Config getSessionConfiguration(Session session) {
        Config config = new Config(session);
        config.setUpdateMode(Config.UpdateMode.LATEST_CAMERA_IMAGE);

        // Register the reference picture that ARCore should recognise and track.
        AugmentedImageDatabase aid = new AugmentedImageDatabase(session);
        Bitmap image = BitmapFactory.decodeResource(getResources(), R.drawable.tar);
        aid.addImage("tar", image);
        config.setAugmentedImageDatabase(aid);

        this.getArSceneView().setupSession(session);
        return config;
    }

    @Override
    public View onCreateView(LayoutInflater inflater, @Nullable ViewGroup container,
                             @Nullable Bundle savedInstanceState) {
        FrameLayout frameLayout = (FrameLayout) super.onCreateView(inflater, container,
                savedInstanceState);

        // Hide the plane-discovery hand animation; this application only tracks images.
        getPlaneDiscoveryController().hide();
        getPlaneDiscoveryController().setInstructionView(null);
        return frameLayout;
    }
}

APPENDIX 2

SCREENSHOTS

Figure A2.1: Opening Camera



Figure A2.2: Media Augmented
