CPP Report
Project Report
On
“Malware Detection on
Android Smartphone”
Partial Fulfillment of the Requirement for the Diploma in Computer Engineering,
By
1) Ghungarde Rutuja Shivaji [2214640024]
Guided By
CERTIFICATE
This is to certify that the project work entitled
“Malware Detection on
Android Smartphone”
is
Submitted by
1) Ghungarde Rutuja Shivaji [2214640024]
Date:
Place: Chas, Nimblak By Pass Road, Ahmednagar,
I hereby take this opportunity to express my heartfelt gratitude towards the people
whose help was very useful in completing my dissertation work on the topic of
“Malware Detection on Android Smartphone.” Inspiration and guidance are
invaluable in all aspects of life, and I am grateful for the encouragement and
sympathetic attitude which I received from my respected project guide, Prof. Hole
P.P., whose guidance and encouragement contributed greatly to the completion of
this thesis work.
I would like to thank all faculty members, Prof. Hole P.P., HOD of the Computer
Engineering Department, and all my friends and well-wishers for their co-operation
and support in making this thesis work successful.
I would also like to thank our Principal, Prof. Gadakh R.S., for his warm support
and for providing all necessary facilities to us. Under these responsible personalities, I have
been able to complete my thesis efficiently, in time and with success.
In the modern era, millions of mobile apps are available, making it hard for users to
distinguish fraudulent ones, especially on platforms like the Play Store. Detecting new
malware and malicious variants has been challenging. This system proposes a method for
feature extraction from Java source code, using Keywords Correlation Distance (KCD) to
analyze key elements such as API calls, Android permissions, and common parameters in
malware code. By applying SVM, the system can adapt to identify new and existing malware,
combining behavioral characteristics to enhance detection accuracy.
CONTENT PAGE
Chapter 1 INTRODUCTION
1.2 Motivation
3.2 Description
3.3 Aim
3.4 Objectives
3.5 Features
3.6 Advantages
Chapter 4 METHODOLOGY
4.2 Modules
Chapter 6 CONCLUSION
REFERENCES
CHAPTER:1
INTRODUCTION
Smartphones play a more and more important role in daily life. There is no doubt that Android has become the
most popular platform for smartphones today. This trend has attracted the attention of attackers,
and more and more malicious applications have emerged in the official and alternative Android
marketplaces. Malware is a blend of two words, malicious and software. It
is software that is installed on a computer system for malicious purposes, without the
knowledge of the computer owner. It may be used to collect important information or to gain
access to computer systems. The seriousness of malicious software ranges from annoying users
with ads to stealing important data. With the advent of the Internet era, smartphones
are also getting more and more popular around the world, especially smartphones with
apps.
When an app was scanned, its code was compared against a database of known malware
signatures, and if a match was found, the app was flagged as malware.
permissions requested by an app. Apps that requested permissions beyond what was
It has been highlighted that among all mobile malware, the share of Android-based malware is
higher than 46 percent and still growing rapidly. Given the rampant growth of Android malware,
there is a pressing need to effectively mitigate or defend against it.
This system proposes a method for feature extraction from Java source code, using
Keywords Correlation Distance (KCD) to analyze key elements such as API calls, Android
permissions, and common parameters in malware code. By applying SVM, the system can
adapt to identify new and existing malware, combining behavioral characteristics to enhance
detection accuracy.
• Malware detection in mobile apps provides critical security benefits, protecting user data,
• These advantages contribute to a safer, more reliable mobile environment, benefiting both
users and developers while supporting compliance with regulatory standards and
All projects are feasible, given unlimited resources and infinite time. But the development
of software is plagued by the scarcity of resources and difficult delivery dates. It is prudent to
evaluate the feasibility of the project at the earliest possible time. Three key considerations are
involved in feasibility analysis.
1.4.1 TECHNICAL FEASIBILITY:
Technical feasibility centres on the existing system (hardware, software, etc.) and to
what extent it can support the proposed addition. If the budget is a serious constraint, then the
• Labor: The cost of team members’ wages and time working on the project
• Materials and equipment: Physical tools, software, legal permits, and the like
If the project is a go, the project manager must devise a budget based on the cost
estimation document, allocating resources properly. Managing that budget is key to the
project’s success. If certain pieces of the project end up costing more or less than anticipated,
the project manager will need to manage the risk and reallocate funds as necessary.
People are inherently resistant to change, and computers have been known to facilitate
change. It is understandable that the introduction of a candidate system requires special effort
to educate, sell, and train the staff on new ways of conducting business.
CHAPTER:2
LITERATURE SURVEY
Mobile malware is a constant threat for Android users. As these devices become
increasingly important in our daily lives, it is of the utmost importance to ensure their
Static Analysis
Manifest.xml file and methods and APIs used in the applications. The authors extracted the
features from 650 applications, divided into 325 malware applications representing 89 malware
additional permissions and applied a method known as privilege separation, which extracts
the advertising component from the main functionality component of the Application.
Data Dependency
Author: Yongfeng Li, Tong Shen, Xin Sun, Xuerui Pan, and Bing Mao
The authors proposed DroidADDMiner, an efficient and precise system to detect, classify and
characterize Android malware. It extracts features based on data dependency between sensitive
APIs: it extracts the API data dependence paths embedded in an app to construct feature vectors
for machine learning.
CHAPTER:3
SCOPE OF THE PROJECT
Because of the rampant growth of Android malware, there is a pressing need to develop a system
that effectively mitigates or defends against it. To detect whether or not the app possesses
the characteristics of a benign app, and to relate the app's features to the features that are
3.2. DESCRIPTION:
Android malware has increased significantly in recent years. It has been highlighted that
among all mobile malware, the share of Android-based malware is higher than 46 percent and
still growing rapidly. Given the rampant growth of Android malware, there is a pressing need
to effectively mitigate or defend against it. With the popularity of Android devices, more
and more Android malware is produced every year. How to filter out malicious apps is a
serious problem for app markets. Analyzing applications in order to identify malicious ones
is a current major concern in information security. In contrast to the traditional feature extraction
method based on binary programs, this paper presents a method for feature extraction from Java
source code. The method uses the Keywords Correlation Distance to compute the correlation
between key codes such as API calls, Android permissions, the common parameters, and the
common keywords in Android malware source code. Then SVM is applied so that the
system can adapt to new malicious software samples, so as to
To develop a system that effectively mitigates malware. Malware detection focuses on identifying the
features of malicious apps by using machine learning techniques to recognize and model the
3.4. OBJECTIVES:
3.5. FEATURES:
• Malware detection in mobile apps provides critical security benefits, protecting user data,
• These advantages contribute to a safer, more reliable mobile environment, benefiting both
users and developers while supporting compliance with regulatory standards and
3.6 ADVANTAGES:
Prevents Data Theft: Malware detection helps prevent unauthorized access to sensitive
information such as contacts, messages, photos, and financial data, safeguarding users
Prevents Unauthorized Access: Malware detection systems can identify and block apps
that attempt to exploit vulnerabilities in the operating system or other apps, preventing
based on keywords correlation distance, which is different from the traditional method based
on binary programs. In this method, Java code is extracted from the APK file, keyword extraction
is performed, and the permissions in the Android manifest file are checked. Second, we use a feature vector
to describe malicious software features, including not only APIs but also the Android
common parameters, common packages, etc. Third, we give a malware detection method through
SVM based on the feature vector set, which can detect new malware and malicious software variants.
Manifest.xml file in the root directory. Then we use the open source software dex2jar and
observation of malware:
i. Android Permission
ii. Activity Action Intent Parameter
iii. Broadcast Intent Action Constant
iv. The commonly used Package Name
v. API Call
c) Statistic keywords: In this module we record the frequency and location of every
keyword in each class of the APK and in the configuration file, store the information in a
matrix, and then use the Keywords Correlation Distance algorithm to calculate the distance between
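The bookkeeping in this module can be sketched in a few lines of Python. The report does not state the KCD formula, so the distance used here (the mean line gap from each occurrence of one keyword to the nearest occurrence of another) is only an assumed stand-in, and the keyword list and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword list; the report names keyword categories, not entries.
KEYWORDS = ["SEND_SMS", "getDeviceId", "sendTextMessage"]


def keyword_positions(source_lines, keywords=KEYWORDS):
    """Record where each keyword occurs (0-based line numbers).

    The frequency of a keyword is the length of its position list.
    """
    positions = defaultdict(list)
    for line_no, line in enumerate(source_lines):
        for keyword in keywords:
            if keyword in line:
                positions[keyword].append(line_no)
    return dict(positions)


def correlation_distance(pos_a, pos_b):
    """Assumed stand-in for KCD, not the report's formula.

    Mean gap, in lines, from each occurrence of keyword A to the nearest
    occurrence of keyword B. Returns infinity if either keyword is absent.
    """
    if not pos_a or not pos_b:
        return float("inf")
    return sum(min(abs(a - b) for b in pos_b) for a in pos_a) / len(pos_a)


def distance_matrix(positions, keywords=KEYWORDS):
    """Square matrix of pairwise distances, in keyword order."""
    return [
        [correlation_distance(positions.get(a, []), positions.get(b, []))
         for b in keywords]
        for a in keywords
    ]
```

Under this assumed definition, a smaller value means the two keywords tend to appear close together in the same class, and the rows of the matrix can be flattened into a feature vector.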
e) Training module: We use LIBSVM, a library for Support Vector Machines, to train the model.
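LIBSVM's command-line tools read training data in a sparse text format, one sample per line: a label followed by `index:value` pairs with 1-based, ascending indices. As a minimal sketch, assuming a label of 1 for malware and -1 for benign apps and a dense list of feature values per sample (both conventions and the helper names are assumptions, not part of the report), samples could be serialized as follows.

```python
def to_libsvm_line(label, features):
    """Format one sample as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are omitted,
    as the sparse format allows.
    """
    pairs = [f"{i}:{v:g}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([str(label)] + pairs)


def write_libsvm_file(path, samples):
    """Write (label, features) pairs to a file, one sample per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, features in samples:
            handle.write(to_libsvm_line(label, features) + "\n")
```

A file written this way can be passed to LIBSVM's `svm-train` tool; the file names and the split into training and testing samples are left to the caller.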
(c) Put the training samples and testing samples under the project directory; also you
a) NETWORK
b) PHONESTATE
c) SYSTEMINFO
d) GPSLOCATION
e) WRITESTORAGE
f) BLUETOOTH
g) SMS
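As an illustration of how these categories could be turned into features, the sketch below reads the `uses-permission` entries from a manifest and produces one 0/1 flag per category. It assumes the manifest has already been decoded to text XML (the copy inside an APK is binary). The mapping from category to permissions is an assumption made for this example, and SYSTEMINFO is left empty because the report does not say which permissions it covers.

```python
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree uses for android:* attributes.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Assumed mapping from category to permissions; not taken from the report.
CATEGORY_PERMISSIONS = {
    "NETWORK": {"android.permission.INTERNET",
                "android.permission.ACCESS_NETWORK_STATE"},
    "PHONESTATE": {"android.permission.READ_PHONE_STATE"},
    "SYSTEMINFO": set(),  # left empty: the report does not define it
    "GPSLOCATION": {"android.permission.ACCESS_FINE_LOCATION",
                    "android.permission.ACCESS_COARSE_LOCATION"},
    "WRITESTORAGE": {"android.permission.WRITE_EXTERNAL_STORAGE"},
    "BLUETOOTH": {"android.permission.BLUETOOTH",
                  "android.permission.BLUETOOTH_ADMIN"},
    "SMS": {"android.permission.SEND_SMS",
            "android.permission.READ_SMS",
            "android.permission.RECEIVE_SMS"},
}
CATEGORIES = list(CATEGORY_PERMISSIONS)


def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return {name for name in names if name}


def category_vector(permissions):
    """One 0/1 flag per category, in the order of CATEGORIES."""
    return [1 if permissions & CATEGORY_PERMISSIONS[c] else 0 for c in CATEGORIES]
```

The resulting vector can be appended to the keyword features before training.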
Purpose: The main purpose for preparing this document is to give a general insight into the
analysis and requirements of the existing system or situation and for determining the operating
Scope: This document plays a vital role in the software development life cycle (SDLC), and it
describes the complete requirements of the system. It is meant for use by the developers and
will be the basis during the testing phase. Any changes made to the requirements in the future will
Functional user requirements may be high-level statements of what the system should do, but
functional system requirements should also clearly describe the system services in
detail. The following are the key fields, which should be part of the functional requirements:
Usability: This relates to how easily people can use your app. A measure of usability could
be the time it takes for end users to become familiar with your app’s functions, without
training or help.
Reliability: This is the percentage of time that your app works correctly to deliver the desired
Performance: This is essentially how fast your app works. A performance requirement for
Responsiveness: This requirement ensures that your app is ready to respond to a user’s input
2. Hard Disk: 32 GB
4. Android Device 3.
SDLC used in this Project
The Waterfall Model was the first process model to be introduced. It
is also referred to as a linear-sequential life cycle model. It is very simple to understand and
use. In a waterfall model, each phase must be completed fully before the next phase can
begin. This type of model is basically used for projects which are small and have
no uncertain requirements. At the end of each phase, a review takes place to determine if the
project is on the right path and whether or not to continue or discard the project. In this model,
testing starts only after the development is complete. In the waterfall model, phases do not
overlap.
CHAPTER:5
DETAILS OF DESIGNS, WORKING & PROCESS
UML is an acronym that stands for Unified Modeling Language. Simply put, UML is a modern
approach to modeling and documenting software. It is based on diagrammatic representations
of software components. As the old proverb says: “a picture is worth a thousand words”. By using
visual representations, we are able to better understand possible flaws or errors in software or
business processes.
The UML Class diagram is a graphical notation used to construct and visualize object
oriented systems. A class diagram in the Unified Modeling Language (UML) is a type of static
structure diagram that describes the structure of a system by showing the system's:
classes,
their attributes,
A use case describes how a user uses a system to accomplish a particular goal. A use
case diagram consists of the system, the related use cases and actors and relates these to each
other to visualize: what is being described? (system), who is using the system? (actors) and
what do the actors want to achieve? (use cases), thus, use cases help ensure that the correct
system is developed by capturing the requirements from the user's point of view.
A use case is a list of actions or event steps typically defining the interactions between
a role of an actor and a system to achieve a goal. A use case is a useful technique for
identifying, clarifying, and organizing system requirements. A use case is made up of a set of
possible sequences of interactions between systems and users that defines the features to be
implemented and the resolution of any errors that may be encountered.
Fig. 5.4 Use Case Diagram
5.3.3.2. CLASS DIAGRAM:
Class diagram is a static diagram. It represents the static view of an application. Class
diagram is not only used for visualizing, describing, and documenting different aspects of a
system but also for constructing executable code of the software application.
Class diagram describes the attributes and operations of a class and also the constraints
imposed on the system. The class diagrams are widely used in the modeling of object-oriented
systems because they are the only UML diagrams which can be mapped directly to
object-oriented languages.
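To illustrate that mapping, the sketch below turns one hypothetical class box (name, attributes, operations) into Python. The class name `ApkSample`, its attributes and its operations are assumptions made for this example and are not taken from the report's class diagram.

```python
from dataclasses import dataclass, field


@dataclass
class ApkSample:
    """Hypothetical class box: the name compartment becomes the class name."""

    # Attribute compartment -> typed fields.
    package_name: str
    permissions: set = field(default_factory=set)

    # Operation compartment -> methods.
    def add_permission(self, permission: str) -> None:
        """Record one permission requested by the app."""
        self.permissions.add(permission)

    def requests(self, permission: str) -> bool:
        """Report whether the app requests the given permission."""
        return permission in self.permissions
```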
Activity diagram focuses on the flow of control from activity to activity. It shows the workflow of
our model. The above figure shows the activity states, transitions, loops, decision nodes and concurrent
activities used by our proposed system.
One sequence diagram typically represents a single use case scenario or flow of
events. Sequence diagrams are an excellent way of documenting usage scenarios and both
capturing required objects early in analysis and verifying object use later in design. The diagrams
show the flow of messages from one object to another, and as such correspond to the methods and
events supported by a class/object.
KCD. Then we combine the features into a keyword feature vector. Finally, learning and decision-making
are performed by SVM to detect new malware and malicious variants. This system is different from
conventional methods. Experiments will show the method is effective and efficient in
[1] M. Leeds, M. Keffeler, T. Atkison, “Examining Features for Android Malware Detection”,
Computer Science Department, University of Alabama, Tuscaloosa, AL, USA, Int'l Conf.
Security and Management SAM'17, ISBN: 1-60132-467-7, 2017.
[2] Saba Arshad, Abid Khan, Munam Ali Shah, Mansoor Ahmed, “Android Malware Detection &
Protection: A Survey”, (IJACSA) International Journal of Advanced Computer Science and
Applications, Vol. 7, No. 2, 2016.
[3] Ahmed H. Mostafa, Marwa M. A. Elfattah and Aliaa A. A. Youssif, “An Intelligent
Methodology for Malware Detection in Android Smartphones Based Static Analysis”,
International Journal of Communication, 2016.
[4] Lynn M. Batten, Veelasha Moonsamy and Moutaz Alazab, “Smartphone Applications, Malware
and Data Theft”, Springer Science+Business Media Singapore, 2016.
[5] Yongfeng Li, Tong Shen, Xin Sun, Xuerui Pan, and Bing Mao, “Detection, Classification
and Characterization of Android Malware Using API Data Dependency”, Institute for
Computer Sciences, Social Informatics and Telecommunications Engineering, 2015.
ANNEXURE II
Evaluation Sheet for the Micro Project
Academic Year : 2024-2025
Sr. No. | Student Name | Marks out of 6 for performance in group activity (D5 Col. 8) | Marks out of 4 for performance in oral/presentation (D5 Col. 9) | Total out of 10
1 | Ghungarde Rutuja Shivaji | | |