Documentation
A Project Report
of
Master of Computer Applications
by
Chinni Parameswari Pranavi (2201600172)
Under the supervision of
Dr. A. C. Priya Ranjani
Assistant Professor
K L E F, Green Fields
2023-24
KONERU LAKSHMAIAH EDUCATION FOUNDATION
DECLARATION
CERTIFICATE
This is to certify that the Project Report entitled “Facilitating Efficient, Secure, Verifiable Searchable Symmetric Encryption”, being submitted by Chinni Parameswari Pranavi in partial fulfillment of the requirements for the award of Master of Computer Applications in Computer Science and Applications to the K L Education Foundation, is a record of bonafide work carried out under our guidance and supervision.
The results embodied in this report have not been copied from any other department, university, or institute.
ACKNOWLEDGEMENTS
The satisfaction that accompanies the successful completion of any task would be incomplete without mentioning the people who made it possible and whose constant guidance and encouragement crown all efforts with success.
I am very thankful to my project guide, Dr. A. C. Priya Ranjani, Associate Professor, for the continuous support and encouragement in completing the project work. Without this guidance, the project could not have been completed.
I express my heartfelt gratitude to Dr. G. Krishna Mohan, Head of the Department of Computer Science and Applications, for providing adequate facilities, ways and means by which I was able to complete this project.
I also express my heartfelt gratitude to Dr. Subrahmanyam, Professor and Principal of the College of Sciences, for providing adequate facilities, ways and means by which I could complete this project. Last but not least, I thank all the Teaching and Non-Teaching Staff of our department, and especially my classmates and friends, for their support in the completion of this project.
ABSTRACT
In this rapidly evolving digital world, it has become increasingly difficult to distinguish
between an original image and a manipulated image. New tools, which are technologically
advanced and are easily accessible, are being used to modify an image to meet one’s sinister
purposes. Rampant counterfeiting of images has been used to create distrust among people.
This necessitates the need for forensic digital analysis of images. This project proposes a
method for the verification of images. This method is used to detect the copy move
modifications within an image, using the discrete cosine transform. The features extracted from these coefficients help us to obtain transfer (shift) vectors, which are clustered; from these clusters it is possible to determine whether copy-move forgery has been performed on an image.
The test results obtained from benchmark datasets illustrate the effectiveness of the proposed
method.
CHAPTER 1
INTRODUCTION
As the famous saying goes, “A picture is worth a thousand words”, and this has never
been truer than in today’s visually oriented society. Currently, images are used in many
common areas such as teaching, journalism, jurisprudence, medicine, advertising, art, etc.
Driven by social media networks and instant messaging applications, multimedia content is
the primary source of internet traffic.
Besides all of this, the continuous improvement of the cameras incorporated in mobile devices, together with the evolution of image editing tools, has made it easier to manipulate an image with excellent results and share it with the world through the Internet
in no time. Although manipulated images have been around for decades and are present in
many sectors (politics, cinema, press, legal, etc.), nowadays the availability of resources to
share information makes image tampering dangerous, making people think that what they are
seeing is the truth.
Regarding the legal use of multimedia content, photograph and video evidence can be
extremely useful in legal proceedings. Potential evidence is everywhere, thanks to the
proliferation of surveillance cameras, smartphones, tablets, and social media networks.
Nevertheless, any image and video to be admissible in court must meet two basic
requirements: relevance and authenticity. For the evidence to be relevant, it must be
sufficiently useful to prove something important in a trial, which means that it must either
support or undermine the truth during the legal proceedings. And to be authenticated, the
evidence must accurately represent its subject. Over the years, image editing tools have been
perfected, offering better results and simplifying their functionality. It is now relatively
simple to make more realistic tampered multimedia files, such as images and videos, without
leaving any noticeable shreds of evidence.
This leaves a challenging task for the forensic analyst to validate the authenticity of
images as it is almost impossible for the human naked eye to distinguish between the forged
image and the real one. In this matter, this project proposes a method which performs
detection of copy-move alterations in an image using the discrete cosine transform. The
characteristics obtained from these coefficients allow us to
obtain transfer vectors, which are grouped together, and, through the use of a tolerance
threshold, it is possible to determine whether or not there are regions copied and pasted
within the analyzed image.
Our experiments were carried out with manipulated images from public databases
widely used in literature and they demonstrate the efficiency of the proposed method.
1.1.1 IMAGE RETOUCHING
Image retouching can be considered the least harmful kind of digital image forgery. Image retouching does not significantly change an image; instead, it enhances or reduces certain features of an image. This technique is popular among magazine photo editors. It can be said that almost every magazine cover employs this technique to enhance certain features of an image so that it is more attractive, ignoring the fact that such enhancement is ethically wrong. Fig 1.1 shows an original image of a lady's face and the same face with enhanced effects applied to it.
1.1.2 IMAGE SPLICING
Image splicing combines regions from two or more images into a single composite image. In Fig 1.2, the shark is copied and pasted below the helicopter in the base image; this copy-paste operation from one image into another forms a spliced image.
1.1.3 COPY MOVE FORGERY
Copy-move attack is more or less similar to image splicing, in view of the fact that both techniques modify certain regions of a base image with other image content. However, instead of using an external image as the source, the copy-move attack uses a portion of the original base image as its source. In other words, the source and the destination of the modification originate from the same image. In a copy-move attack, parts of the original
image are copied, moved to a desired location, and pasted. This is usually done in order to
conceal certain details or to duplicate certain aspects of an image. Blurring is usually applied
along the border of the modified region to reduce the effect of irregularities between the
original and pasted region. In Fig 1.4, the original image is shown. In the original image,
there are shrubs, jeep, and a car. In Fig 1.3, the forged image is shown. In this image, an area
from the forest region is copied and pasted on to the area where the jeep is present. In this
way, copy move forgery is done.
1.1.4 MORPHING
It is a special effect in motion pictures and animations that changes one image or
shape into another through a seamless transition. Most often it is used to depict one person
turning into another through technological means or as part of a fantasy or surreal sequence.
Fig 1.5 is an example of morphing.
1.2 COPY MOVE FORGERY DETECTION APPROACHES
The different copy-move forgery detection approaches are represented pictorially in the form of a flowchart in Fig 1.6:
1.2.1 BRUTE FORCE
This is the simplest (in principle) and most obvious approach. The brute-force technique comprises two methods: the autocorrelation method and the exhaustive search method.
In the exhaustive search, the image and its circularly shifted version are overlaid, looking for closely matching image segments. Let us assume that $x_{ij}$ is the pixel value of a grayscale image of size M×N at position (i, j). The following differences are examined:
$$|x_{ij} - x_{(i+k)\bmod M,\ (j+l)\bmod N}|, \quad k = 0, 1, \dots, M-1,\ l = 0, 1, \dots, N-1, \text{ for all } i \text{ and } j.$$
It is easy to see that comparing $x_{ij}$ with its cyclical shift $[k, l]$ is the same as comparing $x_{ij}$ with its cyclical shift $[k', l']$, where $k' = M - k$ and $l' = N - l$. Thus, it suffices to inspect only those shifts $[k, l]$ with $1 \le k \le M/2$, $1 \le l \le N/2$, cutting the computational complexity by a factor of 4. Fig 1.7 shows a test image and its circular shift.
For each shift $[k, l]$, the differences $\Delta x_{ij} = |x_{ij} - x_{(i+k)\bmod M,\ (j+l)\bmod N}|$ are
calculated and thresholded with a small threshold t. The threshold selection is problematic,
because in natural images, a large amount of pixel pairs will produce differences below the
threshold t. However, according to our requirements we are only interested in connected
segments of certain minimal size. Thus, the thresholded difference ∆𝑥𝑖𝑗 is further processed
using the morphological opening
operation. The image is first eroded and then dilated with the neighborhood size
corresponding to the minimal size of the copy-moved area (in experiments, the 10×10
neighborhood was used). The opening operation successfully removes isolated points. Although the exhaustive search is conceptually the simplest approach, it is computationally very expensive.
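As a rough illustration of the exhaustive search, the following Python sketch (using NumPy and OpenCV, which are listed later among the project's libraries; the threshold and minimal segment size are assumed values) overlays the image with its circular shifts, thresholds the differences, and removes isolated points with a morphological opening:

import cv2
import numpy as np

def exhaustive_search(gray, t=10, min_block=10):
    # Illustrative sketch only: t and min_block are assumed parameter values.
    M, N = gray.shape
    kernel = np.ones((min_block, min_block), np.uint8)   # neighborhood for the opening
    detections = []
    # It suffices to test shifts with 1 <= k <= M/2 and 1 <= l <= N/2 (see text above).
    for k in range(1, M // 2 + 1):
        for l in range(1, N // 2 + 1):
            shifted = np.roll(np.roll(gray, k, axis=0), l, axis=1)        # circular shift [k, l]
            diff = np.abs(gray.astype(np.int32) - shifted.astype(np.int32))
            mask = (diff < t).astype(np.uint8)                             # nearly matching pixels
            opened = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)        # drop isolated points
            if opened.any():
                detections.append((k, l))
    return detections

The quadratic number of shifts examined here is exactly why this approach is considered computationally expensive.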
The autocorrelation of an image x of size M×N is defined by the formula
$$r_{k,l} = \sum_{i=1}^{M} \sum_{j=1}^{N} x_{i,j}\, x_{i+k,\,j+l}, \qquad k = 0, \dots, M-1,\ l = 0, \dots, N-1.$$
The autocorrelation can be efficiently implemented using the Fourier transform, utilizing the fact that
$$r = x \ast \hat{x}, \qquad \hat{x}_{ij} = x_{M+1-i,\ N+1-j}, \quad i = 1, \dots, M,\ j = 1, \dots, N.$$
Thus, we have
$$r = F^{-1}\{F(x)\, F(\hat{x})\},$$
where F denotes the Fourier transform.
The logic behind the detection based on autocorrelation is that the original and copied
segments will introduce peaks in the autocorrelation for the shifts that correspond to the
copied-moved segments. However, because natural images contain most of their power in
low frequencies, if the autocorrelation r is computed directly for the image itself, r would
have very large peaks at the image corners and their neighborhoods. Thus, we compute the
autocorrelation not from the image directly, but from its high pass filtered version.
Although this method is simple and does not have a large computational complexity, it often fails to detect the forgery unless the size of the forged area is at least one quarter of the linear image dimensions.
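A minimal sketch of the autocorrelation-based detection in Python (NumPy and OpenCV; the Laplacian high-pass filter used here is an assumption, chosen only to suppress the low-frequency energy mentioned above):

import cv2
import numpy as np

def autocorrelation_map(gray):
    # High-pass filter the image, then compute r = F^-1{ F(x) F(x_hat) } via the FFT.
    hp = cv2.Laplacian(gray.astype(np.float32), cv2.CV_32F)
    F = np.fft.fft2(hp)
    r = np.real(np.fft.ifft2(F * np.conj(F)))   # for real images, F(x_hat) = conj(F(x))
    return r                                    # large off-origin peaks suggest copied segments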
1.2.2 KEY POINT BASED METHODS
SIFT, introduced by Lowe, and SURF key point features are widely used in CMFD.
SIFT computation is based on the following steps: scale-space extrema detection, key point
localization, orientation assignment and key point descriptors.
In one approach, SIFT key points are used for CMFD. For the experiments, images are collected from
the internet. In this method, SIFT descriptors are extracted and matched by calculating
Euclidean distance between descriptor vectors. It is observed that due to the scale and rotation
invariant properties of SIFT this method is robust against rotation and scaling. However, it
still needs some improvements in robustness against low signal-to-noise ratio and detecting
forgeries of small-sized smooth regions.
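As a brief illustration of this kind of key point matching, the sketch below uses OpenCV's SIFT implementation to match an image against itself (the distance threshold is an assumed value, and this is not the exact procedure of the cited work):

import cv2

def sift_self_matches(gray, max_distance=100):
    # Extract SIFT key points and descriptors, then match the image against itself;
    # the first match of each descriptor is its trivial self-match, so the second is used.
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(descriptors, descriptors, k=2)
    pairs = []
    for m, n in matches:
        if n.distance < max_distance:    # assumed acceptance threshold on Euclidean distance
            pairs.append((keypoints[n.queryIdx].pt, keypoints[n.trainIdx].pt))
    return pairs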
In another approach, SURF descriptors are extracted from the forged image and matching is performed
between the subsets of the descriptor. It is observed that the method is fast as well as reliable
in small-sized images. However, localization of the forgery is not performed. In a further method, the Discrete Cosine
Transform (DCT) and SURF are combined for CMFD. This method is tested on
uncompressed color images database (UCID). First, DCT coefficients are analyzed for
double JPEG compression effect and features are extracted by applying SURF.
It is observed that this method is able to detect and locate the forgery in the tampered
image. However, the experiment is performed only on one small dataset. In yet another method, the Fourier–Mellin Transform (FMT) and SURF are combined; the image is divided into non-flat and flat regions, after which SURF is applied to the non-flat regions and FMT to the flat regions to detect the forgery.
1.2.3 BLOCK BASED METHODS
The block-based approach starts by partitioning the tampered image into overlapping or
non-overlapping blocks. This division is often followed by robust features extraction from
each block and features matching in block pairs. In the matching step, the block features are
sorted or arranged using appropriate data structures and forgery decision is based on the
similarity of adjacent block feature pairs. There are two types of block-based methods: the exact match and the robust match techniques.
1.2.3.1 EXACT MATCH
The first algorithm described in this section is for identifying those segments in the
image that match exactly. Even though the applicability of this tool is limited, it may still be
useful for forensic analysis. It also forms the basis of the robust match detailed in the next
section. In the beginning, the user specifies the minimal size of the segment that should be
considered for match. Let us suppose that this segment is a square with B×B pixels. The
square is slid by one pixel along the image from the upper left corner right and down to the
lower right corner. For each position of the B×B block, the pixel values from the block are
extracted by columns into a row of a two-dimensional array A with $B^2$ columns and $(M-B+1)(N-B+1)$ rows.
Each row corresponds to one position of the sliding block. Two identical rows in the
matrix A correspond to two identical B×B blocks. To identify the identical rows, the rows of
the matrix A are lexicographically ordered (as $B \times B$ integer tuples). This can be done in $MN \log_2(MN)$ steps. The matching rows are easily searched by going through all MN rows of the
ordered matrix A and looking for two consecutive rows that are identical.
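A compact Python sketch of the exact match (the block size B = 8 here is an assumed value) builds the $(M-B+1)(N-B+1) \times B^2$ matrix of flattened blocks, sorts its rows lexicographically, and reports identical consecutive rows:

import numpy as np

def exact_match(gray, B=8):
    # One row of A per B x B block position; identical rows mean identical blocks.
    M, N = gray.shape
    rows, positions = [], []
    for i in range(M - B + 1):
        for j in range(N - B + 1):
            rows.append(gray[i:i + B, j:j + B].flatten())
            positions.append((i, j))
    A = np.array(rows)
    order = np.lexsort(A.T[::-1])                 # lexicographic order of the rows
    matches = []
    for a, b in zip(order[:-1], order[1:]):
        if np.array_equal(A[a], A[b]):            # two consecutive, identical rows
            matches.append((positions[a], positions[b]))
    return matches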
1.2.3.2 ROBUST MATCH
The idea for the robust match detection is similar to the exact match except we do
not order and match the pixel representation of the blocks but their robust representation that
consists of quantized DCT coefficients. The quantization steps are calculated from a user-
specified parameter Q. This parameter is equivalent to the quality factor in JPEG
compression, i.e., the Q factor determines the quantization steps for DCT transform
coefficients. Because higher values of the Q- factor lead to finer quantization, the blocks
must match more closely in order to be identified as similar. Lower values of the Q-factor
produce more matching blocks, possibly some false matches. The detection begins in the
same way as in the exact match case.
The image is scanned from the upper left corner to the lower right corner while
sliding a B×B block. For each block, the DCT transform is calculated, the DCT coefficients
are quantized and stored as one row in the matrix A. The matrix will have $(M-B+1)(N-B+1)$ rows and $B^2$ columns, as in the exact match case. The rows of A
are lexicographically sorted as before. The remainder of the procedure, however, is different.
Because quantized values of DCT coefficients for each block are now being compared
instead of the pixel representation, the algorithm might find too many matching blocks (false
matches). Thus, the algorithm also looks at the mutual positions of each matching block pair
and outputs a specific block pair only if there are many other matching pairs in the same
mutual position (they have the same shift vector). Towards this goal, if two consecutive rows
of the sorted matrix A are found, the algorithm stores the positions of the matching blocks in
a separate list (for example, the coordinates of the upper left pixel of a block can be taken as
its position) and increments a shift-vector counter C. Formally, let $(i_1, i_2)$ and $(j_1, j_2)$ be the positions of the two matching blocks. The shift vector between the two matching blocks is calculated as
$$s = (s_1, s_2) = (i_1 - j_1,\ i_2 - j_2).$$
Because the shift vectors $-s$ and $s$ correspond to the same shift, the shift vector $s$ is normalized, if necessary, by multiplying by $-1$ so that $s_1 \ge 0$. For each matching pair of blocks, we increment the normalized shift-vector counter C by one:
$$C(s_1, s_2) = C(s_1, s_2) + 1.$$
The shift vectors are calculated, and the counter C incremented for each pair of
consecutive matching rows in the sorted matrix A. The shift-vector counter C is initialized to zero
before the algorithm starts. At the end of the matching process, the counter C indicates the
frequencies with which different normalized shift vectors occur. Then the algorithm finds all
normalized shift vectors s(1), s(2), …, s(K), whose occurrence exceeds a user-specified
threshold T: C(s(r)) > T for all r = 1, …, K. For all normalized shift vectors, the matching
blocks that contributed to that specific shift vector are colored with the same color and thus
identified as segments that might have been copied and moved. The value of the threshold T
is related to the size of the smallest segment that can be identified by the algorithm. Larger
values may cause the algorithm to miss some not-so-closely matching blocks, while too small
a value of T may introduce too many false matches. We repeat that the Q factor controls the
sensitivity of the algorithm to the degree of matching between blocks, while the block size
B and threshold T control the minimal size of the segment that can be detected. For the robust
match, we have decided to use a larger block size, B=16, to prevent too many false matches
(larger blocks have larger variability in DCT coefficients).
However, this larger block size means that a 16×16 quantization matrix must be used
instead of simply using the standard quantization matrix of JPEG. We have found out from
experiments that all AC DCT coefficients for 16×16 blocks are on average 2.5 times larger
than for 8×8 blocks, and the DC term is twice as big.
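The following Python sketch summarizes the robust match described above (using OpenCV's cv2.dct; a single uniform quantization step derived from the Q factor is used here as a simplification, and all parameter values are assumptions):

import cv2
import numpy as np
from collections import defaultdict

def robust_match(gray, B=16, q_step=16.0, T=50):
    # Quantized-DCT block features, lexicographic sorting and shift-vector counting.
    M, N = gray.shape
    feats, positions = [], []
    for i in range(M - B + 1):
        for j in range(N - B + 1):
            block = gray[i:i + B, j:j + B].astype(np.float32)
            feats.append(np.round(cv2.dct(block) / q_step).flatten())   # quantized coefficients
            positions.append((i, j))
    A = np.array(feats)
    order = np.lexsort(A.T[::-1])                                        # lexicographic row order
    counter, pairs = defaultdict(int), defaultdict(list)
    for a, b in zip(order[:-1], order[1:]):
        if np.array_equal(A[a], A[b]):
            s = np.subtract(positions[a], positions[b])
            if s[0] < 0 or (s[0] == 0 and s[1] < 0):
                s = -s                                                   # normalize so that s1 >= 0
            counter[tuple(s)] += 1
            pairs[tuple(s)].append((positions[a], positions[b]))
    # keep only the shift vectors whose occurrence exceeds the threshold T
    return {s: p for s, p in pairs.items() if counter[s] > T}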
1.3 DISCRETE COSINE TRANSFORM
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of
a sum of cosine functions oscillating at different frequencies. The DCT, first proposed by
Nasir Ahmed in 1972, is a widely used transformation technique in signal processing and
data compression. It is used in most digital media, including digital images, digital video,
digital audio, digital television, digital radio, and speech coding. DCTs are also important to
numerous other applications in science and engineering, such as digital signal processing,
telecommunication devices, reducing network bandwidth usage, and spectral methods for the
numerical solution of partial differential equations.
The use of cosine rather than sine functions is critical for compression, since it turns out
(as described below) that fewer cosine functions are needed to approximate a typical signal,
whereas for differential equations the cosines express a particular choice of boundary
conditions. In particular, a DCT is a Fourier-related transform similar to the discrete Fourier
transform (DFT) but using only real numbers. The DCTs are generally related to Fourier
Series coefficients of a periodically and symmetrically extended sequence whereas DFTs are
related to Fourier Series coefficients of only periodically extended sequences. DCTs are
equivalent to DFTs of roughly twice the length, operating on real data with even symmetry
(since the Fourier transform of a real and even function is real and even), whereas in some
variants the input and/or output data are shifted by half a sample. There are eight standard
DCT variants, of which four are common.
The most common variant of discrete cosine transform is the type-II DCT, which is often
called simply "the DCT". This was the original DCT as first proposed by Ahmed. Its inverse,
the type-III DCT, is correspondingly often called simply "the inverse DCT" or "the IDCT".
Two related transforms are the discrete sine transform (DST), which is equivalent to
a DFT of real and odd functions, and the modified discrete cosine transform (MDCT), which
is based on a DCT of overlapping data. Multidimensional DCTs (MD DCTs) are developed
to extend the concept of DCT to MD signals. There are several algorithms to compute MD
DCT. A variety of fast algorithms have been developed to reduce the computational
complexity of implementing DCT. One of these is the integer DCT (IntDCT), an integer
approximation of the standard DCT.
DCT compression, also known as block compression, compresses data in sets of discrete
DCT blocks. DCT blocks can have a number of sizes, including 8x8 pixels for the standard
DCT, and varied integer DCT sizes between 4x4 and 32x32 pixels. The DCT has a strong
"energy compaction" property, capable of achieving high quality at high data compression
ratios. However, blocky compression artifacts can appear when heavy DCT compression is
applied.
The two-dimensional DCT of an M-by-N matrix A is defined as
$$B_{pq} = \alpha_p \alpha_q \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{mn} \cos\frac{\pi(2m+1)p}{2M} \cos\frac{\pi(2n+1)q}{2N},$$
where $0 \le p \le M-1$ and $0 \le q \le N-1$, with $\alpha_p = 1/\sqrt{M}$ for $p = 0$ and $\sqrt{2/M}$ otherwise, and $\alpha_q = 1/\sqrt{N}$ for $q = 0$ and $\sqrt{2/N}$ otherwise.
The DCT has the property that, for a typical image, most of the visually significant
information about the image is concentrated in just a few coefficients of the DCT. For this
reason, the DCT is often used in image compression applications. For example, the DCT is at
the heart of the international standard lossy image compression algorithm known as JPEG.
(The name comes from the working group that developed the standard: the Joint
Photographic Experts Group.)
The main reasons why the DCT is used are that it gives a better approximation with fewer coefficients compared to other transforms, and that less storage is required to represent the image features.
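The energy-compaction property can be illustrated with a short sketch (OpenCV's cv2.dct on a smooth 8x8 block; the block values are made up for the example): keeping only a few low-frequency coefficients still reconstructs the block with a small error.

import cv2
import numpy as np

block = np.float32(np.tile(np.arange(8), (8, 1)) * 10)   # a smooth 8x8 ramp as a stand-in block
coeffs = cv2.dct(block)                                   # forward 2-D DCT (type-II)

mask = np.zeros_like(coeffs)
mask[:2, :2] = 1                                          # keep only 4 low-frequency coefficients
approx = cv2.idct(coeffs * mask)                          # reconstruct from the truncated DCT

print("mean absolute error:", float(np.mean(np.abs(block - approx))))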
CHAPTER 2
This project proposes a method for the verification of images. This method is used to
detect the copy move modifications within an image, using the discrete cosine transform.
The features extracted from these coefficients help us to obtain transfer (shift) vectors, which are clustered; from these clusters it is possible to determine whether copy-move forgery has been performed on an image. The test results obtained from benchmark datasets illustrate the
effectiveness of the proposed method.
CHAPTER 3
EXISTING METHODS
The copy-move technique is another popular method used today for image forgery,
where a region of an image is used to hide another region from the same image. The existence
of two identical regions is not ordinary in natural images; thus, this property can be used to
detect this type of manipulation. Even after applying post-processing operations such as edge smoothing, blurring, and noise addition to eliminate visible traces of manipulation, there will still be two extremely similar regions in the manipulated image. In the literature, a large number of copy-move forgery detection methods have been proposed. Nevertheless, all of these methods can be classified into two main categories: block-based and key point-based methods [1,2]. Among them, one of the most widely used ways to detect copy-move forgery is the block matching algorithm. In this algorithm, the image is divided into overlapping
blocks, and the blocks are compared to find the duplicated region. Fig 3.1 shows a general
scheme of a block matching algorithm.
Fridrich et al. [3] proposed a method based on the discrete cosine transform (DCT) to identify copy-move forgery. The method splits the image into overlapping blocks of 16 × 16 pixels. Then, the DCT coefficient characteristics are extracted from each block and these coefficients are sorted lexicographically. After the lexicographical sorting, comparable blocks are distinguished, and the duplicated regions are found. Fridrich et al.
introduced one of the first techniques that use DCT to identify copy-move forgeries on
images.
Popescu et al. [4] introduced a technique to recognize duplicate regions within images.
Popescu’s algorithm employs principal components analysis (PCA) rather than DCT. The algorithm
uses PCA on small fixed-size image blocks, and then each block is lexicographically ordered. This
method has proved highly effective at recognizing copy-move forgeries.
Kang et al. [5] used singular value decomposition (SVD) to distinguish the modified areas in a picture. By applying SVD, a feature vector is extracted and its dimensionality reduced. Then, identical blocks are identified by the use of a lexicographic classification. Kang's method was demonstrated to be robust and effective, and the results of the experiments prove the efficacy of the method.
Huang et al. [6] introduced a method to identify copy-move manipulation in digital images by applying the SIFT algorithm. The authors presented the SIFT computation algorithm together with a block matching function. This method gives great results even when the image is noisy or compressed.
In Bo X et al. [7], a scheme based on speeded up robust features (SURF) was proposed; SURF key point characteristics are better than SIFT in that they cope better with post-processing operations such as brightness and blur variations. However, key point-based methods present a problem of visual output, because the copied and pasted regions are marked only by lines and points that do not give a clear and intuitive visual effect.
Amerini et al. [8] proposed a method based on SIFT. The proposed method can identify copied regions in images, and it can also detect which geometric transformation was applied. Because the copied region of the image looks the same as the original, the key points extracted in the duplicated region will be identical to those in the original. This method is also effective on images compressed with a low quality factor.
Table 3.1 presents a summary of the copy-move detection techniques analyzed by comparing
their results in terms of accuracy.
Table 3.1: Comparison of Existing Methods
CHAPTER 4
SYSTEM REQUIREMENTS
The necessary tools needed to execute this project are listed and explained in this chapter. The system requirements are as follows:
Python 3
NumPy
TensorFlow
OpenCV (cv2)
Matplotlib
Seaborn
Keras
scikit-learn (sklearn)
ast
4.1 VISUAL STUDIO CODE
Visual Studio Code is a lightweight but powerful source code editor which runs on
your desktop and is available for Windows, macOS and Linux. It comes with built-in support
for JavaScript, TypeScript and Node.js and has a rich ecosystem of extensions for other
languages (such as C++, C#, Java, Python, PHP, Go) and runtimes (such as .NET and Unity).
4.1.1 INSTALLATION
Fig 4.1.3: Setup Installation III
Fig 4.1.5: Completion of Setup Installation
4) Alternatively, we can also download a Zip archive, extract it and run Code from
there.
4.2 PYTHON
Guido van Rossum began working on Python in the late 1980s as a successor to the
ABC programming language and first released it in 1991 as Python
0.9.0. Python 2.0 was released in 2000 and introduced new features such as list
comprehension, cycle-detecting garbage collection, reference counting, and Unicode
support.
Python 3.0, released in 2008, was a major revision that is not completely backward
compatible with earlier versions. Python 2 was discontinued with version
2.7.18 in 2020. Python consistently ranks as one of the most popular programming
languages.
After installing VS Code, install the Python extension for VS Code from the Visual
Studio Marketplace. For additional details on installing extensions, see Extension
Marketplace. The Python extension is named Python and it's published by Microsoft. Along
with the Python extension, a Python interpreter is installed. Fig 4.1.7 is the snapshot where Python is installed from VS Code.
Fig 4.1.7: Installing Python from VS code
4.3 LIBRARIES USED AND THEIR INSTALLATION
Libraries such as NumPy, TensorFlow, OpenCV, Matplotlib, Seaborn, Keras, scikit-learn, and ast are used. They are explained below:
4.3.1 NUMPY
NumPy is a Python library used for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices. NumPy was created in
2005 by Travis Oliphant. It is an open-source project, and you can use it freely. NumPy
stands for Numerical Python.
4.3.2 TENSORFLOW
TensorFlow is an open-source machine learning framework developed by Google. It provides an end-to-end ecosystem for building, training, and deploying machine learning and deep learning models, and it serves as the backend for Keras in this project.
4.3.3 OPENCV
OpenCV is a huge open-source library for computer vision, machine learning, and image processing, and it now plays a major role in real-time operation, which is very important in today's systems. By using it, one can process images and videos to identify objects, faces, or even human handwriting. When integrated with various libraries, such as NumPy, Python is capable of processing the OpenCV array structure for analysis. To identify image patterns and their various features, we use vector spaces and perform mathematical operations on these features. The first OpenCV version was 1.0.
OpenCV is released under a BSD license and hence it’s free for both academic and
commercial use. It has C++, C, Python and Java interfaces and supports Windows, Linux,
Mac OS, iOS and Android. When OpenCV was designed, the main focus was real-time applications and computational efficiency. Everything is written in optimized C/C++ to take advantage of multi-core processing.
4.3.4 MATPLOTLIB
Matplotlib is a comprehensive Python library for creating static, animated, and interactive visualizations. In this project it is used to plot the training curves and the sample prediction outputs.
4.3.5 SEABORN
Seaborn is a statistical data visualization library built on top of Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics, such as heatmaps.
4.3.6 KERAS
Keras is a high-level neural networks API written in Python and is designed to be user-
friendly, modular, and extensible. It provides a way to easily build, train, and deploy deep
learning models. Keras can run on top of TensorFlow, Theano, or Microsoft Cognitive
Toolkit (CNTK). In recent versions, Keras has been integrated tightly with TensorFlow,
making TensorFlow an excellent backend for Keras.
4.3.7 SKLEARN
Scikit-learn (sklearn) is an open-source machine learning library for Python that provides tools for classification, regression, clustering, and model evaluation. In this project it supplies the confusion matrix and classification report used to evaluate the trained model.
4.3.8 AST
The ‘ast’ module in Python (abstract syntax trees) is a powerful library that allows you
to work with and manipulate Python code as abstract syntax trees. It's commonly used for tasks
such as code analysis, transformation, and generation. Here's an overview of how you can use
the ‘ast’ module in Google Colab to analyze and work with Python code.
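For instance, a minimal snippet (hypothetical, not taken from the project code) parses a source string and lists the function names it defines:

import ast

source = "def add(a, b):\n    return a + b\n"
tree = ast.parse(source)                        # build the abstract syntax tree
names = [node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)]
print(names)                                    # ['add']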
In an activated environment, use python and pip to invoke the Python interpreter and
the package manager. In other words, inside a Python 3.7 environment using python will
invoke the Python 3.7 interpreter.
Now, let’s install NumPy, TensorFlow and Matplotlib in the above environment:
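A typical invocation (package names assumed to match the system requirements listed earlier) is:

pip install numpy tensorflow opencv-python matplotlib seaborn scikit-learn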
CHAPTER 5
PROPOSED METHOD
5.1 ALGORITHM
A deep learning model refers to a type of artificial neural network (ANN) that consists of
multiple layers (or "depth") of interconnected nodes, allowing it to learn complex patterns and
representations from data. Deep learning models are a subset of machine learning models and
have gained significant popularity and success in recent years, particularly in tasks like image
recognition, natural language processing, and reinforcement learning.
Here are key characteristics and components of deep learning models:
1. Multiple Layers:
Deep learning models typically consist of multiple layers stacked on top of each other,
where each layer performs specific computations on the input data.
Common layers include dense (fully connected) layers, convolutional layers (for spatial data
like images), recurrent layers (for sequential data like text), and more.
2. Hierarchical Representation:
Deep learning models learn hierarchical representations of data. Lower layers capture
low-level features (e.g., edges, textures), while higher layers learn more abstract and complex
features (e.g., object parts, semantics).
3. Activation Functions:
Each layer in a deep learning model usually includes an activation function (e.g., ReLU,
sigmoid, tanh) that introduces non-linearity into the model, enabling it to learn complex
mappings.
4. Training and Backpropagation:
Deep learning models are trained using optimization techniques like gradient descent and backpropagation. Backpropagation computes gradients of the loss function with respect to the model parameters, allowing for iterative updates of the weights to minimize the loss.
5. Loss Functions and Optimizers:
Deep learning models are trained to minimize a defined loss function, which measures the difference between predicted outputs and true targets. Optimization algorithms like stochastic gradient descent (SGD), Adam, or RMSprop are used to update the model weights during training.
6. Architectural Variants:
Deep learning models come in various architectures designed for different tasks.
Examples include Convolutional Neural Networks (CNNs) for image-related tasks, Recurrent
Neural Networks (RNNs) for sequential data, and Transformer architectures for natural
language processing.
7. Regularization Techniques:
To prevent overfitting (where a model learns the training data too well but fails to
generalize), deep learning models use techniques like dropout, L1/L2 regularization, and batch
normalization.
8. Transfer Learning:
Deep learning models can leverage pre-trained models (transfer learning) by fine-
tuning them on specific tasks. This approach is common when training data is limited or
computational resources are constrained.
9. Deployment:
Trained deep learning models can be deployed on various platforms, including cloud
servers, edge devices (like smartphones), and embedded systems, depending on computational
requirements and latency constraints.
5.1.2 MOBILENET ALGORITHM
The MobileNet algorithm can be complex due to its architecture involving depthwise separable convolutions and other design elements. However, the simplified flowchart below outlines the general structure and flow of operations in a MobileNet-like convolutional neural network (CNN). It focuses on the key steps involved in processing an input image through the MobileNet architecture.
Note that this flowchart is a high-level representation and does not capture all the nuances of the MobileNet design, but it gives a basic understanding of the process.
Start
|
|--- Input: Image (WxHxC)
|
|--- Preprocessing:
| - Resize Image to Target Size (e.g., 224x224x3)
| - Normalize Pixel Values (e.g., scale to [0, 1] or [-1, 1])
|
|--- Convolutional Layers:
| - Depthwise Separable Convolution Blocks:
| For each block:
| |
| |--- Depthwise Convolution:
| | - Apply Depthwise Convolution (3x3 kernel, stride=1, padding=same)
| | - Batch Normalization
| | - ReLU Activation
| |
| |--- Pointwise Convolution:
| | - Apply Pointwise Convolution (1x1 kernel, stride=1, padding=same)
| | - Batch Normalization
| | - ReLU Activation
|
|--- Pooling and Global Average Pooling:
| - Apply Average Pooling (e.g., 7x7 kernel, stride=1)
| - Flatten Output into a 1D Vector
|
|--- Fully Connected Layers (Optional):
| - Dense Layers (Fully Connected)
| - Dropout (Optional for regularization)
| - Output Layer (e.g., Softmax for classification)
|
|--- Output: Class Probabilities (Predictions)
|
End
Explanation of Flowchart Components:
Input: Represents the input image to the MobileNet model.
Preprocessing: Typical preprocessing steps performed on the input image, such as resizing and
normalization.
Convolutional Layers: This section outlines the core building blocks of MobileNet:
Pooling and Global Average Pooling: Applied to reduce spatial dimensions and extract features.
Fully Connected Layers: Optional dense layers for further feature extraction and classification.
Output: Final output of the model, typically class probabilities for different categories.
This flowchart captures the essence of MobileNet's architecture, which is characterized by its efficient use of depthwise separable convolutions to reduce computational complexity while maintaining performance. In practice, MobileNet implementations may include additional optimizations, such as skip connections, linear bottlenecks, or specialized layers (e.g., squeeze-and-excitation blocks in MobileNetV3), which are not explicitly depicted in this simplified flowchart.
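As a small illustration of the depthwise separable block described above, the sketch below uses standard Keras layers (this is not MobileNet's exact configuration; the filter count and input shape are assumptions):

import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    # One MobileNet-style block: depthwise 3x3 convolution followed by a pointwise 1x1 convolution.
    x = layers.DepthwiseConv2D(3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(pointwise_filters, 1, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
outputs = depthwise_separable_block(inputs, pointwise_filters=64)
tf.keras.Model(inputs, outputs).summary()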
CHAPTER 6
CODE
1. Importing Libraries
import cv2
import os
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, GlobalAveragePooling2D
from tensorflow.keras.layers import Conv2D, ZeroPadding2D, MaxPooling2D
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
2. Data Collection
For this project, we use images as the dataset: fake and real images of humans.
data=r'/content/drive/MyDrive/project_dataset/Dataset'
3. Data preprocessing
batch_size=32
img_height=224
img_width=224
4. Splitting of dataset
train_datagen = ImageDataGenerator(validation_split=0.2)   # assumed definition; the split value is illustrative
test_datagen = ImageDataGenerator()
# Train
train = train_datagen.flow_from_directory(data, target_size=(img_height, img_width),
                                           class_mode="categorical", batch_size=batch_size, subset="training")
# Test
test = train_datagen.flow_from_directory(data, target_size=(img_height, img_width),
                                          class_mode="categorical", batch_size=batch_size, shuffle=False)
5. Model implementation
classess = train.num_classes
classess

Mobilenet = MobileNet(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for i in Mobilenet.layers:
    i.trainable = False          # freeze the pre-trained MobileNet layers

def main_model(t1_model, classess):
    m_model = t1_model.output
    m_model = GlobalAveragePooling2D()(m_model)
    m_model = Dense(1024, activation='relu')(m_model)
    m_model = Dense(1024, activation='relu')(m_model)
    m_model = Dense(512, activation='relu')(m_model)
    m_model = Dense(classess, activation='softmax')(m_model)
    return m_model

combining_model = main_model(Mobilenet, classess)
model = Model(inputs=Mobilenet.input, outputs=combining_model)
model.compile(optimizer="Adam",loss="categorical_crossentropy",metrics=["accuracy"])
model.summary()
earlystop=EarlyStopping(patience=10)
learning_rate_reduce=ReduceLROnPlateau(monitor="val_accuracy",min_lr=0.001)
callbacks=[earlystop,learning_rate_reduce]
history=model.fit(train,validation_data=test,epochs=3)
6. Training the model and saving the trained model into a model file as h5 file.
model.save("mobilent_project.h5")
7. Confusion matrix and classification report
def plot_confusion_matrix(cm, target_names, title='Confusion Matrix', normalize=False):
    # Assumed signature and heatmap plotting; the normalization logic is from the original listing.
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        cm = np.around(cm, decimals=2)
        cm[np.isnan(cm)] = 0.0
        print("Normalized confusion matrix")
    else:
        print("Not normalized confusion matrix")
    tresh = cm.max() / 2   # kept from the original listing (threshold for cell label colouring)
    sns.heatmap(cm, annot=True, cmap='Blues', xticklabels=target_names, yticklabels=target_names)
    plt.title(title)
    plt.show()
from sklearn.metrics import confusion_matrix, classification_report   # import missing in the original listing
target_names = list(train.class_indices.keys())                        # assumed: class labels from the generator

Y_pred = model.predict(test)
y_pred = np.argmax(Y_pred, axis=-1)
print('Confusion Matrix')
cm = confusion_matrix(test.classes, y_pred)
plot_confusion_matrix(cm, target_names, title='Confusion Matrix')
print('Classification Report')
print(classification_report(test.classes, y_pred, target_names=target_names))
8. Plotting the training and validation accuracy and loss
train_accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
train_loss = history.history['loss']
val_loss = history.history['val_loss']
epochs = range(len(train_accuracy))
plt.figure(figsize=(8,8))
plt.subplot(1,2,1)
plt.plot(epochs,train_accuracy,'b',label='Training_accuracy')
plt.plot(epochs,val_accuracy,'r',label='Validation_accuracy')
plt.title('Training and Validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend(['train','val'], loc='lower right')
plt.subplot(1,2,2)
plt.plot(epochs,train_loss,'b',label='Training_loss')
plt.plot(epochs,val_loss,'r',label='Validation_loss')
plt.title('Training and Validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend(['train','val'], loc='upper right')
plt.show()
9. Final prediction
Testing with single images and predicting whether they are fake or real.
labels=train.class_indices
final_labels={v: k for k, v in labels.items()}
final_labels
def predict_image(image_path, from_test_dir=False):
    # Assumed function header, image loading and preprocessing; the body below is from the original listing.
    img = cv2.imread(image_path)
    img = cv2.resize(img, (img_width, img_height))
    result = model.predict(np.expand_dims(img, axis=0))
    result_dict = dict()
    for key in list(final_labels.keys()):
        result_dict[final_labels[key]] = result[0][key]
    sorted_results = {k: v for k, v in sorted(result_dict.items(),
                                              key=lambda item: item[1], reverse=True)}
    if not from_test_dir:
        print('=' * 50)
        for label in sorted_results.keys():
            print("{}: {}%".format(label, sorted_results[label] * 100))
    final_result = dict()
    final_result[list(sorted_results.keys())[0]] = sorted_results[list(sorted_results.keys())[0]] * 100
    return final_result
final_result = predict_image(r'/content/drive/MyDrive/project_dataset/Dataset/Original/3417.jpg', False)
print("Final Result: ", final_result)
CHAPTER 7
DATASETS AND EXPERIMENT SETUP
In this chapter, the proposed algorithm is evaluated on benchmark datasets and its performance is analyzed.
7.1 DATASETS
The datasets that have been used to evaluate the performance of the proposed method
are shown in Table 7.1.
Table 7.1: Datasets
The CMFD GRIP Dataset by Cozzolino et al. [10] (hereinafter referred to as D1) is a dataset composed of 80 images with realistic copy-move forgeries. All these images have a size of 768 × 1024 pixels, while the forgeries have arbitrary shapes, aimed at obtaining visually satisfactory results.
The CoMoFoD database [11] (hereinafter referred to as D2) has 260 image sets, with 200 images in the small image category (512 × 512). The following transformations are applied:
Translation: a copied region is only translated to the new location without performing any transformation.
Rotation: a copied region is rotated and translated to the new location.
Scaling: a copied region is scaled and translated to the new location.
Distortion: a copied region is distorted and translated to the new location. The distortion added to the dataset's images can be noise adding, image blurring, brightness change, colour reduction, contrast adjustment, or a combination of two or more distortions on a copied region before moving it to the new location.
Ardizzone et al. [12] provide a copy-move forgery dataset (hereinafter referred to as D3) which contains medium-sized images, almost all 1000 × 700 or 700 × 1000. This dataset contains 50 uncompressed images with simply translated copies and 46 uncompressed images with 11 different types of rotation around the angle zero in the range of [-25, 25] with a step of 5, and 11 scaling factors in the range of [0.75, 1.25] with a step of 0.05.
The CMH dataset (hereinafter referred to as D4) was created by [13] and comprises 108 realistic cloning images. Each image is stored in the PNG format (which does not modify pixel values) and has a resolution varying from 845 × 634 pixels (the smallest) to 1296 × 972 pixels (the biggest). The dataset contains four groups of images:
23 images where the cloned area was only copied and then moved (simple case);
25 images with a rotation of the duplicated area (orientations in the range of 90
degrees and 180 degrees);
25 images with a resizing of the cloned area (scaling factors between 80 and 154%);
35 images involving rotation and resizing altogether.
7.2 EXPERIMENT SETUP
In all the experiments carried out, Python has been used as a programming language,
due to its great flexibility to perform data analysis. The characteristics of the equipment in
which the experiments were carried out are presented in Table 7.2. These are essential factors
to consider since the execution times of the different tests vary according to the resources
available.
Table 7.2: Characteristics of the experimental equipment
Resources          Features
Operating System   Windows 11
Memory (RAM)       8 GB
Processor          Intel Core i5 (10th generation)
Graphics Card      NVIDIA GTX 1650 (4 GB)
Storage            1 TB SSD
CHAPTER 8
PERFORMANCE ANALYSIS
8.1 PERFORMANCE METRICS
In this chapter, the performance metrics required to analyze the results are described, and the parameters used in the algorithm are illustrated.
True Positive (TP): Instances that are actually positive and are predicted by the model
as positive.
False Positive (FP): Instances that are actually negative but are predicted by the model
as positive.
True Negative (TN): Instances that are actually negative and are predicted by the model
as negative.
False Negative (FN): Instances that are actually positive but are predicted by the model
as negative.
                         Predicted
                   Positive   Negative
Actual   Positive     TP         FN
         Negative     FP         TN
In order to test our algorithm, we used the four datasets described before, and the metrics
used to quantify its accuracy were the precision, recall, and F1 scores.
The precision, P, is the probability that a detected region is indeed a forged region, and the formula to calculate the precision is the following:
$$P = \frac{TP}{TP + FP}$$
where TP is the number of true positive pixels and FP the number of false positive pixels detected by the algorithm.
The recall, R, also known as the True Positive Rate (TPR), measures the ability of the algorithm to find all the positive samples; its formula is the following:
$$R = \frac{TP}{TP + FN}$$
where TP is the number of true positives and FN the number of false negatives. The F1 score is the harmonic mean of precision and recall:
$$F_1 = \frac{2PR}{P + R}$$
where P is the precision and R the recall obtained by the algorithm over each analyzed image.
Support (or Sample Size): The support for each class is simply the number of
occurrences of that class in the actual dataset. It is the sum of true positive (TP) and false
negative (FN) values for a particular class. Essentially, it shows how many instances of each
class there are in the dataset.
S=TP+FN
Where:
TP is the number of true positive instances (instances correctly predicted as belonging to the
class).
FN is the number of false negative instances (instances incorrectly predicted as not
belonging to the class when they actually do).
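As a small worked example (hypothetical counts, not results from this project), the metrics above can be computed directly from TP, FP and FN:

# Hypothetical counts for one class
TP, FP, FN = 80, 10, 20

precision = TP / (TP + FP)                               # 80 / 90  = 0.889
recall = TP / (TP + FN)                                  # 80 / 100 = 0.800
f1 = 2 * precision * recall / (precision + recall)       # about 0.842
support = TP + FN                                        # 100

print(precision, recall, f1, support)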
8.2 PARAMETERS USED
BLOCK_SIZE: Block size of each window. It provides an 8×8 block of the image.
QF: Quality factor of the quantization matrix, which decides the quantization process and is passed to the quantization matrix.
Shift_thresh: The threshold on the result, which prevents false detections and yields a cleaner result.
Stride: The step of the sliding window; it is the iterable that loops through the image for comparisons.
Q_8x8: The quantization matrix produced by the process is stored in this variable.
The values of the parameters assigned in the above code are depicted in Table 8.1.
Table 8.1: Parameters
CHAPTER 9
RESULTS
The outputs obtained by executing the code are shown below:
Fig 9.1: Sample Output I
Original: 56.919342279434204%
Forged: 43.080660700798035%
Fig 9.2: Sample Output II
Original: 69.67321634292603%
Forged: 30.326786637306213%
Final Result: {'Original': 69.67321634292603}
Fig 9.3: Sample Output III
Original: 66.18366837501526%
Forged: 33.81633460521698%
Fig 9.4: Sample Output IV
Forged: 74.34220314025879%
Original: 25.657793879508972%
Final Result: {'Forged': 74.34220314025879}
Fig 9.5: Sample Output V
Forged: 65.21817445755005%
Original: 34.78182554244995%
Fig 9.6: Sample Output VI
Forged: 50.94483494758606%
Original: 49.05516803264618%
Final Result: {'Forged': 50.94483494758606}
CONFUSION MATRIX RESULT:
TRAINING AND VALIDATION RESULT:
CLASSIFICATION REPORT RESULT:
CHAPTER 10
CONCLUSION AND FUTURE SCOPE
10.1 CONCLUSION
During the development of this work, experiments were performed using the
proposed algorithm against four different datasets widely used in the literature. This group of images contained different formats, sizes, and additional transformations applied on top of the copy-move manipulation.
10.2 FUTURE SCOPE
As the famous saying quoted at the beginning of this project goes, “A picture is worth a thousand words”. Therefore, having fast and reliable algorithms to analyze the integrity of an image is necessary. Nowadays, the fast and easy ways to share images, together with the ease of use of professional image editing tools, make forgeries harder to detect.
As the world is getting digitalized rapidly, there will be a continuous rise in image forgeries, and this algorithm therefore holds great significance.
10.3 REFERENCES
[1] Park CS, Choeh JY (2018) Fast and robust copy-move forgery detection based on scale-
space representation. Multimed Tools Appl 77(13):16795–16811
[3] Fridrich J, Soukal D, Lukas J (2003) Detection of copy move forgery in digital images.
In: Proceedings of the digital forensic research workshop. Binghamton,
New York, pp 5–8
[4] Popescu AC, Farid H (2004) Exposing digital forgeries by detecting duplicated image
regions. Department of Computer Science, vol 646
[5] Kang X, Wei S (2008) Identifying tampered regions using singular value decomposition
in digital image forensics. In: 2008 international conference on computer science and
software engineering, vol 3, pp 926–930. https://doi.org/10.1109/CSSE.2008.876
[6] Huang H, Guo W, Zhang Y (2008) Detection of copy-move forgery in digital images
using SIFT algorithm. In: 2008 IEEE Pacific-Asia workshop on computational intelligence
and industrial application, vol 2, pp 272–276. https://doi.org/10.1109/PACIIA.2008.240
[8] Amerini I, Ballan L, Caldelli R, Del Bimbo A, Serra G (2011) A SIFT-based forensic
method for copy-move attack detection and transformation recovery. IEEE Trans Inf
Forensics Secur 6(3):1099–1110
[9] Zhao J, Guo J (2013) Passive forensics for copy-move image forgery using a method
based on DCT and SVD. Forensic Sci Int 233(1):158–166
[11] Tralic D, Zupancic I, Grgic S, Grgic M (2013) Comofod–new database for copy- move
forgery detection. In: Proceedings ELMAR-2013. IEEE, pp 49–54
[13] Silva E, Carvalho T, Ferreira A, Rocha A (2015) Going deeper into copy-move forgery
detection: exploring image telltales via multi-scale analysis and voting processes. J Vis
Commun Image Represent 29:16–32
[14] Alkawaz MH, Sulong G, Saba T, Rehman A (2018) Detection of copy-move image
forgery based on discrete cosine transform. Neural Comput Appl 30(1):183–192
[15] Armas Vega EA, Gonzalez Fernandez E, Sandoval Orozco AL, García Villalba LJ (2020) Copy-move forgery detection technique based on discrete cosine transform blocks features. Neural Computing and Applications
[16] https://en.wikipedia.org/wiki/NumPy
[17] https://en.wikipedia.org/wiki/SciPy
[18] https://en.wikipedia.org/wiki/OpenCV
[19] https://en.wikipedia.org/wiki/Matplotlib
[20] https://en.wikipedia.org/wiki/Discrete_cosine_transform