Image Classification
Candidate’s Declaration
I hereby declare that the work presented in this report entitled "Image Classification", in partial fulfilment of the requirements for the award of the degree of Bachelor of Technology in Computer Science and Engineering/Information Technology, submitted in the Department of Computer Science & Engineering and Information Technology, is an authentic record of our own work.
This is to certify that the above statement made by the candidate is true to the best of my
knowledge.
(Supervisor Signature)
Dr. Rakesh Kanji
Assistant Professor (SG)
Computer Science Engineering & Information Technology
Dated:
Acknowledgement
Any serious and lasting achievement cannot be accomplished without the help, guidance and co-operation of the many people involved in the work.
First and foremost, we would like to express our gratitude to Prof. Dr. Samir Dev Gupta, Head of the Department of Computer Science & Engineering and Information Technology, Jaypee University of Information Technology, for providing us the opportunity to carry out this work as our final-year project. It gives us immense pleasure to express our deepest gratitude and thanks to Dr. Rakesh Kanji, Assistant Professor (SG), Department of Computer Science & Engineering and Information Technology, for not only imparting his knowledge but also for his constant supervision, advice and guidance throughout the project, without which this project would not have been possible.
We would also like to thank all the other faculty of the department at Jaypee University of Information Technology. Not only did they teach us and make us capable enough to undertake this project, but they were always there in the hour of need and provided all the help, facilities and co-operation required for the completion of our project.
A special mention to Ravi Raina Sir, who assisted us in the project lab and guided us through all the minor issues.
Last but not least, we would like to express our thanks to our parents and family members for their support at every step of our lives.
Table of Contents
1) Introduction
1.1) Introduction
1.3) Objectives
1.4) Methodology
2) Literature Survey
3) System Development
3.1) Model Development
List of Abbreviations
IC – Image Classification
OD – Object Detection
CV – Computer Vision
SC – Supervised Classification
UC – Unsupervised Classification
FR – Face Recognition
FI – Face Identification
List of Figures
Figure 19: Sample Image LBP
Abstract
Image Classification is widely utilized for face recognition and object detection, in which Face Recognition is a widely utilized biometric method due to its natural and non-intrusive approach. Recently, deep learning networks using Triplet Loss have become a common framework for person identification and verification. In this report, we present a new method for selecting appropriate hard negatives for training with Triplet Loss. We show that incorporating pairs which would otherwise have been discarded yields better accuracy and performance. We also applied the Adaptive Moment Estimation algorithm to mitigate the risk of early convergence due to the additional hard-negative pairs. We achieved an accuracy of 0.968 with OpenFace, and we observed much lower accuracy with LBPH.
Chapter 1 INTRODUCTION
1.1 INTRODUCTION
IC plays a significant role in our day-to-day life and in various fields such as information security and biometric optimization. IC is a technique that includes image processing, extracting key features, and matching those key features against a specific image. With modern IC methods, we can obtain information regarding a particular image more quickly than anything previously designed, and we can apply it to routine experiments, traffic identification, security, medical hardware, FR and various other fields.
far we are concerned. The accuracy, efficiency and performance of our model depend solely upon how well organized our dataset is: the more structured and complete it is, the better the outcomes will be. The process commonly follows this path: first we see how well our model performs on unknown data, that is, data it has not seen before, and to do this we hold out a set which is later used for the validation of all of the above.
the classes for these pictures using the trained learning model.
procedure of the model is that it extracts the face from a particular input image; the face is then separated from the whole input image and stored in a pickle file. In the next step, we take the extracted feature vector from the input (test) picture. These so-called key points are taken into consideration, and we correct the white balance, contrast and alignment; after all the above steps, the improved input image is put into our dataset. The dataset (gallery picture) contains the same arrangement of features previously extracted and stored during the enrolment phase, when the validation of all improved images takes place.
1.3 OBJECTIVES
and then we have combined that bundle with Haar Cascade and OpenFace, and using different procedures we have compared the accuracy of the model.
1.4 METHODOLOGY
Figure 1: The machine learning process
1 Feature Elimination
2 Feature Extraction
then at the time of extraction we formed 6 "new" independent variables. Those 6 newly formed variables are combinations of the previous 6 "old" variables.
Figure 3: Cartesian with modified dataset
1 L1 norms
2 L2 norms
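The report does not list code for these norms; as a minimal illustrative sketch in Python (the vector values below are our own example, not the project's data):

```python
def l1_norm(v):
    # L1 norm: sum of absolute values (Manhattan length)
    return sum(abs(x) for x in v)

def l2_norm(v):
    # L2 norm: square root of the sum of squares (Euclidean length)
    return sum(x * x for x in v) ** 0.5

# Example: the vector (3, 4)
v = [3, 4]
print(l1_norm(v))  # 7
print(l2_norm(v))  # 5.0
```

The L1 norm is more robust to outliers, while the L2 norm penalizes large components more strongly; regularization based on either is a common way to constrain feature weights.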
Figure 4: Object Based Classification
1.5 ORGANIZATION
CHAPTER 3: It contains the entire system design, in which the system design structure, the system architecture and the algorithms used are discussed.
CHAPTER 4: It covers all the results obtained so far, with screenshots. We have explained the concept of the different types of algorithms, along with the detailed pseudo code.
Chapter 2 LITERATURE SURVEY
2.1 [1]
Author's Name: Tianmei Guo, Jiwen Dong, Henjian Li, Yunxing Gao
Abstract:
This is a basic research paper published by Tianmei Guo, Jiwen Dong, Henjian Li and Yunxing Gao. In this citation the authors explain that IC has a huge impact on the field of CV, plays a very crucial part, and has an important role in our daily lives. IC involves a procedure which incorporates preprocessing of images, image segmentation, key-feature extraction and matching for identification. Thanks to the newest image-classification procedures, we can obtain image information faster than before, and we can apply it to systematic experiments, rush-hour congestion identification, safety, medical assistance, face recognition and many other areas. In an era where DL is growing so fast, feature extraction and classification are already being united within the learning framework, which has overcome many traditional feature-selection difficulties. In the previous decade, the optimization of CNNs has chiefly concerned the following aspects.
In this paper the authors propose a simple yet very effective CNN for image classification. On the basis of this CNN, the authors also scrutinize many different procedures for setting the learning rate, and propose diverse optimization techniques for solving highly parametric difficulties that revolve around different image-classification tasks.
1. Convolution Layer
2. Pooling Layer
3. Fully-Connected Layer
Figure 5: Architecture of LeNet
1. Convolution Layer
This layer is like the brain of the CNN; internally it has many local connections and heavily shared weights. The purpose of the convolution layer is to learn feature representations of the inputs. As noted previously, a CNN layer comprises more than a few feature maps.
2. Pooling Layer
The sampling process is closely related to fuzzy filtering. This layer has the responsibility of secondary feature extraction. Pooling is always placed between two convolution layers. The kernel size and its stride usually govern the dimensions of the pooling layer.
3. Fully-Connected Layer
The classifier of a CNN consists of at least one fully connected layer. No spatial information is preserved in fully connected layers. The last fully connected layer is followed by an output layer. For classification tasks, SoftMax regression is usually utilized because it produces a well-behaved probability distribution over the outputs.
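The three layer types above can be illustrated with a toy sketch in plain Python. This is not the network used in this project; the image and kernel values are invented for illustration:

```python
import math

def conv2d(image, kernel):
    """Valid 2-D convolution (no padding, stride 1) over one feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)] for i in range(out_h)]

def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling: secondary feature extraction."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def softmax(logits):
    """SoftMax output: a probability distribution over the classes."""
    m = max(logits)                      # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

image = [[1, 2, 0, 1],
         [0, 1, 3, 1],
         [2, 2, 1, 0],
         [1, 0, 0, 2]]
edge = [[1, 0], [0, -1]]                 # tiny 2x2 difference kernel
fmap = conv2d(image, edge)               # 3x3 feature map
pooled = max_pool2d(fmap)                # 1x1 after 2x2 pooling
probs = softmax([2.0, 1.0, 0.1])         # class probabilities summing to 1
```

A real CNN stacks many such layers and learns the kernel weights by back-propagation; the fully connected part flattens the pooled maps before the SoftMax output.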
2.2 [2]
Abstract:
classification based on the content of the image. In this paper, the authors investigate image classification using deep learning. The traditional techniques used for image classification are part of the field of artificial intelligence (AI) formally known as machine learning. Machine learning comprises a feature-extraction module that extracts the significant features, for example edges and textures, and a classification module that classifies based on the features extracted. The fundamental limitation of machine learning is that it can only extract a certain set of handcrafted features from images and is unable to learn discriminating features from the training set of data. This limitation is redressed by using deep learning. Deep learning (DL) is a sub-area of AI, capable of gaining knowledge through its own method of computing. A deep learning model is designed to continually analyse data with a composition of several algorithms, similar to how a person would make judgements. To achieve this, deep learning uses a layered structure expressed as an artificial neural network (ANN). The design of an ANN is modelled with the help of the biological neural network of the human brain. This makes deep learning generally more capable than standard machine-learning models.
Four test pictures (sea anemone, indicator, cystoscope and radio measuring instrument) are chosen from the Alex-Net database for the testing and validation of image classification using deep learning. The convolutional neural network of the Alex-Net architecture is utilized for the classification task. From the experiments, it is seen that the pictures are classified correctly in every case, even for difficult test pictures, which shows the adequacy of the deep-learning algorithm.
Figure 7: Alex-Net Architecture
2.3 [3]
Abstract:
This is a basic research paper published by A. Vailaya, M.A.T. Figueiredo, A.K. Jain and Hong-Jiang Zhang. In this citation the authors describe how to join numerous two-class classifiers into a single hierarchical classifier. Grouping images into (semantically) meaningful classes using very low-level visual features is a difficult and significant issue for content-based image retrieval. Using binary Bayesian classifiers, they attempt to capture high-level concepts from low-level image features under the constraint that each test image belongs to exactly one of the classes. In particular, they consider the hierarchical classification of vacation images: at the highest level, images are labelled indoor or outdoor; outdoor images are additionally classified as city or landscape; finally, a subset of landscape images is sorted into sunset, forest and mountain groups. They demonstrate that a small vector quantizer (whose optimal size is selected using a modified MDL criterion) can be used to model the class-conditional densities of the features required by the Bayesian system. The classifiers have been designed and evaluated on a database of 6,931 vacation photographs. Their framework achieved a classification accuracy of approximately 90% for indoor/outdoor, approximately 95% for city/landscape, approximately 96% for sunset/forest-and-mountain, and 96% for forest/mountain classification problems. They further develop a learning strategy to incrementally train the classifiers as additional data become available. They also show preliminary results for feature reduction using clustering procedures.
2.4 [4]
Abstract:
Figure 8: Dataset of above illustration
2.5 [5]
Abstract:
this with an enormous training dataset; however, a heuristic for our smaller dataset is to reduce the size of the input space by normalizing faces so that the eyes, nose and mouth appear at similar locations in each picture. If we proceed with tests related to classification, this process is capable of using a support vector machine, and this scheme is commonly employed to match real-time images against the assembled dataset.
Figure 10: OpenFace vs VGG
Figure 11: Layers in OpenFace
2.6 [6]
Abstract:
to separate the positive pair from the negative by a distance margin. The thumbnails are tight crops of the face region; no 2D or 3D alignment, other than scale and translation, is performed.
They have taken four datasets and, except for Labeled Faces in the Wild and YouTube Faces, they evaluated their method on the face verification task.
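The triplet loss with a distance margin described here can be sketched in a few lines of Python. This is an illustrative toy version; the margin value and embedding vectors below are our own example, not the project's trained values:

```python
def squared_distance(u, v):
    # Squared Euclidean distance between two embedding vectors
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that pushes the negative at least `margin` farther
    from the anchor than the positive is."""
    return max(0.0, squared_distance(anchor, positive)
                    - squared_distance(anchor, negative) + margin)

def is_hard_negative(anchor, positive, negative, margin=0.2):
    """A negative that still violates the margin contributes a non-zero
    loss, so it is worth keeping for training."""
    return triplet_loss(anchor, positive, negative, margin) > 0.0

a, p = [0.0, 0.0], [0.1, 0.0]          # same identity: close together
easy_n, hard_n = [3.0, 0.0], [0.2, 0.0]
print(is_hard_negative(a, p, easy_n))   # False: already well separated
print(is_hard_negative(a, p, hard_n))   # True: still inside the margin
```

Hard-negative selection of this kind is what the abstract refers to: triplets whose loss is already zero teach the network nothing, while those still inside the margin drive the training.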
Chapter 3 SYSTEM DEVELOPMENT
Design
The design of any system and problem depends upon the system and therefore on the procedure of phases in which it is formed. The design of our problem mainly depends upon the size of the database (a greater number of database images is directly proportional to the accuracy of the model). Going through all the procedures has revealed that different techniques, and combinations of these approaches, can be applied in the development of a new face recognition model. From among the many possible procedures, based on the results we obtained, we have chosen to utilize a combination of statistics-based approaches: Haar cascade and dlib for the face detection part, and OpenFace for the face recognition part. The principal aim of this project is to ensure its smooth operation and address reliability matters. Our approach for this project is given below.
Figure 12: Haar Cascade based detection
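Haar-cascade detection evaluates many rectangular features quickly by means of an integral image, which makes the sum of any rectangle available in four lookups. As an illustrative sketch (our own toy code, not OpenCV's internals; the pixel values are invented):

```python
def integral_image(img):
    """ii[y][x] = sum of img over the rectangle [0..y-1] x [0..x-1]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = (img[y][x] + ii[y][x + 1]
                                + ii[y + 1][x] - ii[y][x])
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of any w x h rectangle at (x, y) using four lookups."""
    return (ii[y + h][x + w] - ii[y][x + w]
            - ii[y + h][x] + ii[y][x])

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 3, 3))  # 45: the whole image
print(rect_sum(ii, 1, 1, 2, 2))  # 28: bottom-right block (5+6+8+9)
```

A Haar feature is simply the difference of two or more such rectangle sums, so a cascade can test thousands of features per window in real time.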
Input Design
Figure 13: Face Detection Algorithm
For this process we take face segmentation as the first step of the FD part, because it reduces computational time; RGB is used only to determine the face colour, and RGB colour plays a very small part in FD. The white balance of a picture varies from one place to another because the lighting at different places differs. This type of situation produces non-skin substances in places where skin objects should be detected.
Figure 17: Model Approach
After the model approach is done, the last step is to place the face representation. After placing the image, it is preprocessed with a histogram task so that we can separate the face representation from the background, and we convert it into grayscale. The picture grid is then resized and placed in a vectorized frame of 30 x 30.
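The resizing step can be illustrated with a simple nearest-neighbour sketch. This is a toy stand-in for the actual preprocessing code; the grid values are invented:

```python
def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grayscale grid to out_h x out_w."""
    in_h, in_w = len(img), len(img[0])
    return [[img[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)] for i in range(out_h)]

face = [[10, 20], [30, 40]]          # toy 2x2 "face" patch
grid = resize_nearest(face, 4, 4)    # upsampled to a fixed 4x4 frame
print(len(grid), len(grid[0]))       # 4 4
```

In the same way, every detected face in the project is brought to a fixed 30 x 30 frame so that the feature vectors all have the same length.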
Algorithm
LBPH
Although it seems like an extremely basic task for us, it has proved to be a complex task for a computer, as there are numerous factors that can impair the precision of the methods, for instance: light variation, low resolution and occlusion, among others.
revolves around matching the feature we extract: if that feature matches one we had stored in our dataset, then we label the face. This relation is basically facialfeature x N, where N is the number of relations in the dataset.
Algorithm(Pseudo Code)
Figure 18: FR Model Approach
Step by Step
Try not to stress over the parameters at the present time, you
will comprehend them subsequent to perusing the following
stages
2 Training the Algorithm: First, we have to train the algorithm. To do so, we need to use a dataset with the facial photos of the people we want to recognize. We also need to set an ID (it may be a number or the name of the person) for each image, so the algorithm will use this information to recognize an input picture and give you an output. Photos of the same individual must have the same ID. With the training set already built, let us see the algorithm's computational steps.
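The LBP operator underlying LBPH can be sketched for a single 3x3 neighbourhood as follows. This is an illustrative toy version: the clockwise bit ordering is one common convention, not necessarily the exact one used by OpenCV, and the patch values are invented:

```python
def lbp_code(patch):
    """LBP code of the centre pixel of a 3x3 patch: each neighbour,
    visited clockwise from the top-left, contributes one bit that is
    set when the neighbour is >= the centre value."""
    center = patch[1][1]
    # clockwise neighbour coordinates starting at the top-left
    coords = [(0, 0), (0, 1), (0, 2), (1, 2),
              (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (y, x) in enumerate(coords):
        if patch[y][x] >= center:
            code |= 1 << (7 - bit)
    return code

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_code(patch))  # 143: neighbours >= 6 set their bits
```

LBPH then histograms these codes over image regions and concatenates the histograms into the feature vector that is matched against the dataset.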
Figure 19: Sample Image LBP
the first picture.
3.1.2 Open-Face
Design
Figure 22: Face Embeddings
Isolating Face from a Noisy Background
For any FR application, our main motive and first move is to take the image and separate each face from the noisy background, so that as a result we get every unknown face found in the image separately. An FR application should be able to deal with every situation, whether good or bad: the user may face lighting issues, white-balance issues, or issues with the position in which the user's image is placed, and the FR has to deal with any of these. That is why dlib, combined with the applications of OpenCV, is more than enough to handle all of this at once. It is solely the duty of dlib to handle this area: it has to recognize facial fiducial points so that this process can handle the position of each and every user image.
Figure 24: Isolating Face
Preprocessing
After we are done locating each and every face in a user input image, the main part comes: preprocessing the user input we have got, where the major concern is to find all the faces that are in it. The major concerns here are unpredictable, bad illumination and translating the user input to grayscale to get a faster, more reliable model and features.
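The grayscale conversion mentioned above can be sketched as a weighted sum of the RGB channels. This is illustrative code using the common BT.601 luminance weights, not the project's actual pipeline; the pixel values are invented:

```python
def to_grayscale(rgb_image):
    """Luminance-weighted RGB -> grayscale (ITU-R BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b)
             for (r, g, b) in row] for row in rgb_image]

image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
print(to_grayscale(image))
```

Working on a single grayscale channel instead of three colour channels cuts the data to be processed by two-thirds, which is why this step speeds up the model.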
Figure 25: Affine Transformation
Figure 26: Sample Mean Landmarks
Classification
Figure 27: Classification Approach
The programming language we used for the whole process is Python 3.6.
Here in the command line, the output shows all the results which are matched with the labelled images.
After they are matched with the labelled images, the classifier searches for those facial features among those in the whole dataset, and the number displayed here shows us that this face was located at index N in the dataset.
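The index-N lookup described above amounts to a nearest-neighbour search over the stored embeddings; a minimal sketch (the embedding values are invented for illustration):

```python
def nearest_index(query, dataset_embeddings):
    """Return the index N of the stored embedding closest to the query
    (squared Euclidean distance), i.e. the dataset entry we report."""
    best_i, best_d = -1, float("inf")
    for i, emb in enumerate(dataset_embeddings):
        d = sum((a - b) ** 2 for a, b in zip(query, emb))
        if d < best_d:
            best_i, best_d = i, d
    return best_i

dataset = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # stored face embeddings
print(nearest_index([0.55, 0.45], dataset))      # 2: closest stored face
```

In practice a distance threshold is usually added so that faces far from every stored embedding are reported as unknown rather than mapped to the nearest index.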
Figure 29: Image Classification 1 with Haar Cascade
Our classifier was successfully able to predict multiple faces, and it can detect up to 8-10 people in the frame. It might be helpful in determining unknown people in a large group.
4.2 Using OpenFace
Firstly, we load the data into embeddings in such a way that we have all the anchors in the dataset; these anchors are going to be used when we match against the labelled images. The classifier then searches for those facial features among those in the whole dataset, and the number displayed shows us that this face was located at index N in the dataset.
After storing the embeddings in the dataset, we have to load the data encodings, load the dataset, and load the Caffe model which is going to be used in the process; this step takes a lot of time, depending upon the configuration of our system.
We were using
Figure 34: FaceNet Model with accuracy
[Chart: Classification Accuracy of LBPH vs OpenFace for dataset sizes of 10, 25, 50 and 100 images]
[Chart: Training Time of LBPH vs OpenFace for dataset sizes of 10, 25, 50 and 100 images]
Chapter 5 CONCLUSIONS
5.1 CONCLUSION
2 The bulk of available information is constantly growing, and issues are experienced by classification systems: they are expected to give excellent results in record time regardless of this growth in data volume.
5.2 Application
REFERENCES
[1] K. Fukushima, S. Miyake, Neocognitron: A self-organizing neural
network model for a mechanism of visual pattern recognition, in:
Competition and cooperation in neural nets, 1982, pp. 267–285.
[6] X.-X. Niu, C. Y. Suen, A novel hybrid cnn–svm classifier for recognizing
handwritten digits, Pattern Recognition 45 (4) (2012) 1318–1325.
V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in:
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2015, pp. 1–9.
[11] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image
recognition, in: Proceedings of the IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), 2016, pp. 770–778.
APPENDICES
Working of LBPH
Loading Embeddings in OpenFace
JAYPEE UNIVERSITY OF INFORMATION TECHNOLOGY, WAKNAGHAT
PLAGIARISM VERIFICATION REPORT
Date: 15/07/2020
Type of Document: B.Tech Project Report
Name: Rishabh Agarwal
Department: CSE
Enrolment No.: 161336
Contact No.: 9761760383
E-mail: [email protected]
Name of the Supervisor: Dr. Rakesh Kanji
Title of the Project Report (in capital letters): IMAGE CLASSIFICATION
UNDERTAKING
I undertake that I am aware of the plagiarism-related norms/regulations; if I am found guilty of any plagiarism or copyright violation in the above thesis/report even after the award of the degree, the University reserves the right to withdraw/revoke my degree/report. Kindly allow me to avail the plagiarism verification report for the document mentioned above.
Complete Thesis/Report Pages Detail:
Total No. of Pages = 65
Total No. of Preliminary pages = 53
Total No. of pages accommodating bibliography/references = 12
(Signature of Student)
FOR DEPARTMENT USE
We have checked the thesis/report as per norms and found the Similarity Index at 7%. Therefore, we are forwarding the complete thesis/report for the final plagiarism check. The plagiarism verification report may be handed over to the candidate.
Checked by: Name & Signature, Librarian
Please send your complete thesis/report in (PDF) with Title Page, Abstract and Chapters in (Word File) through the supervisor at [email protected]
JAYPEE UNIVERSITY OF INFORMATION TECHNOLOGY, WAKNAGHAT
PLAGIARISM VERIFICATION REPORT
Date: 15/07/2020
Type of Document: B.Tech Project Report
Name: Varun Choudhary
Department: CSE
Enrolment No.: 161271
Contact No.: 8894518242
E-mail: [email protected]
Name of the Supervisor: Dr. Rakesh Kanji
Title of the Project Report (in capital letters): IMAGE CLASSIFICATION
UNDERTAKING
I undertake that I am aware of the plagiarism-related norms/regulations; if I am found guilty of any plagiarism or copyright violation in the above thesis/report even after the award of the degree, the University reserves the right to withdraw/revoke my degree/report. Kindly allow me to avail the plagiarism verification report for the document mentioned above.
Complete Thesis/Report Pages Detail:
Total No. of Pages = 65
Total No. of Preliminary pages = 53
Total No. of pages accommodating bibliography/references = 12
(Signature of Student)
FOR DEPARTMENT USE
We have checked the thesis/report as per norms and found the Similarity Index at 7%. Therefore, we are forwarding the complete thesis/report for the final plagiarism check. The plagiarism verification report may be handed over to the candidate.
Checked by: Name & Signature, Librarian
Please send your complete thesis/report in (PDF) with Title Page, Abstract and Chapters in (Word File) through the supervisor at [email protected]