Paper 1: Title: Summary:: Data Mining Techniques For Digital Forensics Analysis

Uploaded by

Tahir Hussain

Paper 1:

Title: Data Mining Techniques For Digital Forensics Analysis


Summary:
This paper examines how data mining techniques can improve digital forensics analysis.
Digital (computer) forensics is concerned with gathering, analyzing and reporting
evidence. The paper discusses the importance of data and how companies can use it to
grow in the marketplace. It covers the different types of digital forensics along with the
critical steps of an investigation: assessment, acquisition, authentication and analysis.
The main focus is on data mining techniques that extract interesting structures and
patterns from data, from which relationships among the data can also be identified.

Research Gap:
Applying digital forensics can be difficult and unwieldy. Investigations can take a long
time and often involve law enforcement agencies and detectives. Using data mining
techniques such as clustering, association rule mining, deviation detection and
classification alongside digital forensics tools helps prepare data for analysis.
Moreover, the patterns and structures found in the data can help identify crime patterns
and keep investigators one step ahead of offenders.

Novel Contribution:
Data mining techniques and algorithms are used together with digital forensics tools and
techniques. The specific algorithm is Apriori, which is used to find large item sets in
the data. A new system is proposed that works as follows:

1: Collect Evidence

2: Perform digital forensics analysis

3: Run data mining algorithms on the output of the digital forensics analysis, along with
investigator queries

4: Present the results as a report in a GUI

Tools and Parameters Used:


Data mining techniques and algorithms are used along with digital forensics tools. The
Apriori algorithm is used to find relations and patterns and to predict outputs. The
parameters are large item sets or datasets.
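
The level-wise search that Apriori performs can be sketched as follows. This is a minimal
illustration, not the paper's implementation: the function name apriori, the min_support
parameter and the session data are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support_count} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        prev = list(current)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        current = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                current[c] = n
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical evidence records: each set lists artifacts seen in one session.
sessions = [
    {"usb_mount", "file_copy", "log_cleared"},
    {"usb_mount", "file_copy"},
    {"usb_mount", "file_copy", "log_cleared"},
    {"remote_login", "log_cleared"},
]
result = apriori(sessions, min_support=2)
```

With min_support=2, the artifact remote_login is dropped at the first level, so no larger
item set containing it is ever counted. That pruning is what keeps the search tractable.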

Limitations and Drawbacks:


This research has some limitations. The major one concerns the Apriori algorithm, which
can become very slow when there are many item sets to process. Another is that data
mining is not trivial: it requires expert knowledge, which can make it very costly.
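
To see why the number of item sets matters, the short calculation below counts how many
item sets can exist over n distinct items. It is a worst-case bound, not a measurement
from the paper; Apriori's pruning usually keeps the real count far lower, but a low
support threshold pushes it toward this bound. The function names are illustrative.

```python
from math import comb


def candidate_counts(n_items):
    """Number of possible k-item sets for k = 1..n_items."""
    return [comb(n_items, k) for k in range(1, n_items + 1)]


def total_itemsets(n_items):
    """All non-empty item sets over n_items distinct items: 2**n - 1."""
    return 2 ** n_items - 1


for n in (10, 20, 30):
    print(n, total_itemsets(n))
```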
---------------------------------------------------------------------------

Paper 2:
Title: Improving Digital Forensics Through Data Mining
Summary:
This research addresses the challenges faced by digital forensics analysts. It can be very
hard for an analyst to handle a complex investigation and a large amount of data at the
same time, especially when forensics tools fail to provide enough information or to
identify relationships among data sets. Here, data mining techniques can be used to
analyze textual data. The data mining tool WEKA is used for data analysis and for finding
predictive patterns. The main focus is on using a data mining tool on
textual/unstructured data to draw conclusions.

Research Gap:
Forensics analysts must conduct investigations while dealing with large amounts of data.
This consumes a lot of time and makes the analyst's work more complex. Data mining
techniques and algorithms can ease this: they analyze the data, produce results that
include patterns and structures in the data, and make them available for analysis. All
the work is based on the Enron scandal, one of the biggest audit failures in US corporate
history.

Novel Contribution:
Data mining techniques are used along with a mining algorithm. The data mining tool WEKA
analyzes large amounts of textual data and provides predictive modeling. Unstructured
data, i.e. emails, are used to gain useful information. The approach is largely
traditional: data is collected in unstructured form and stored in a database. Data mining
techniques are then applied to the textual data to reach meaningful conclusions.

Tools and Parameters:


WEKA, a data mining tool, is used in this system along with clustering techniques. The
Enron scandal is used as the case study. Electronic messages of Enron employees serve as
the unstructured data for analysis. Results are gathered in clusters, which include
messages and words. The most important and frequent words are separated into specific
tables for conclusions.
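
The last step, pulling the most frequent words of each cluster into a table, can be
sketched with the Python standard library. This is not WEKA and not the paper's code: the
stop-word list, the cluster names and the messages are invented for illustration, and the
clustering itself is assumed to have been done already.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "to", "of", "and", "is", "in", "for", "on", "we"}


def top_words(messages, n=3):
    """Return the n most frequent non-stop-words across the given messages."""
    counts = Counter()
    for text in messages:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical clusters of already-grouped messages (not real Enron emails).
clusters = {
    "cluster_0": [
        "Move the losses to the partnership",
        "The partnership hides the losses",
    ],
    "cluster_1": ["Lunch on Friday?", "Friday lunch is on me"],
}
tables = {name: top_words(msgs, n=2) for name, msgs in clusters.items()}
```

Each entry in tables is a list of (word, count) pairs, which plays the role of the
per-cluster word tables described above.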

Limitations and Drawbacks:


The data mining tool WEKA is used in this system. It works well if the data set is small,
but with a large data set WEKA can easily run out of memory. Another drawback is that the
system is only helpful at the start of an investigation: it has to be run first, and the
investigation, which is based on its conclusions and predictions, cannot begin until its
results are ready. This is time-consuming.

--------------------------------------------------------------------------

Paper 3:
Title: Digital Forensics And Cyber Crime Data Mining
Summary:
The purpose of this research was to combine digital forensics with cyber crime data
mining techniques. The researchers discuss digital forensics along with specific
forensics techniques, which they later combine in their proposed system. The digital
forensics techniques used are file system forensics, network forensics analysis and
network traffic analysis, while the cyber crime data mining techniques used are entity
extraction, clustering, deviation detection and association rules. The crime data mining
algorithm CDMA is applied to the data collected by these techniques to produce detailed
reports and identify crime patterns.

Research Gap:
It is difficult to do forensics on large and complex data, especially when the data comes
from various domains. To make this simpler and more manageable, digital forensics
techniques are combined with cyber crime data mining techniques and the CDMA algorithm to
obtain results and crime patterns.

Novel Contribution:
A new tool is proposed that works on data gathered by digital forensics techniques and
crime data mining techniques. It uses the crime data mining algorithm CDMA, which
presents results as detailed reports and bar charts. It also identifies crime patterns so
that system administrators can minimize and overcome system vulnerabilities. The tool has
a 3-tier architecture; the three layers are file system analysis, network analysis and
database.

Tools and Parameters:


CDMA, a crime data mining algorithm, is used. It works on data collected from the file
system, i.e. files, and from network analysis, i.e. evidence from network traffic and log
files. The collected data is then stored in a database. Online Transaction Processing
(OLTP) and Online Analytical Processing (OLAP) techniques are applied to the stored data
to produce outputs and decisions in the form of reports.
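
The store-then-aggregate flow can be illustrated with an in-memory SQLite database. This
is only a sketch of the general idea: the evidence table, its columns and the sample rows
are assumptions, and the GROUP BY query stands in for an OLAP-style roll-up rather than
reproducing the paper's CDMA algorithm.

```python
import sqlite3

# In-memory database standing in for the system's database tier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evidence (source TEXT, host TEXT, event TEXT)")

# Hypothetical findings from the two analysis layers.
rows = [
    ("file_system", "pc-01", "deleted_file"),
    ("file_system", "pc-01", "hidden_file"),
    ("network", "pc-01", "port_scan"),
    ("network", "pc-02", "port_scan"),
    ("network", "pc-02", "failed_login"),
]
# Transactional inserts (the OLTP side).
conn.executemany("INSERT INTO evidence VALUES (?, ?, ?)", rows)
conn.commit()

# Analytical roll-up (the OLAP side): findings per host and source.
report = conn.execute(
    "SELECT host, source, COUNT(*) FROM evidence "
    "GROUP BY host, source ORDER BY host, source"
).fetchall()
```

The report rows could then feed the detailed reports or bar charts that the tool produces.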

Limitations and Drawbacks:


This system has two dependencies, file system analysis and network analysis, and both
must be completed to get results. If one fails, the whole system fails, which is a major
drawback. Also, the CDMA algorithm is not very efficient at handling large amounts of
data.

---------------------------------------------------------------------------

Paper 4:

Title: Data Mining Based Crime-Dependent Triage In Digital Forensics Analysis

Summary:
This research discusses cyber crime and computer forensics, along with the difficulties
an investigator can face while gathering data and finding evidence on seized devices. A
new system is proposed for digital forensics investigations. It is based on data mining
techniques and Knowledge Management Theory (KMT), which together form the foundation of
"POST MORTEM" triage. The system derives intelligence from unstructured data and then
creates "CLASS", the dependent variable of the model, along with its relation to the
independent variables. The independent variables can be system configuration files,
installed files, statistics, browser history and the event log. The identified class
variable is then used to determine whether a computer was used to commit crimes such as
child pornography, terrorism, hacking or copyright violations.

Research Gap:
Gathering evidence from hard disks, PCs and other storage media becomes very complex
during digital forensics investigations. It can take a lot of time and effort to achieve
results. Data mining based crime-dependent triage can be used along with digital
forensics techniques to save time and simplify investigations.

Novel Contribution:
A new system is proposed that combines digital forensics techniques and data mining with
Knowledge Management Theory (KMT) and an algorithm. Data mining and KMT together form the
theoretical foundation for "POST MORTEM" triage. The triage model comprises four phases:
forensics acquisition, feature extraction, priority definition and the triaging matrix.
The class variable is then extracted from the triaging matrix.

Tools and Parameters:


The POST MORTEM triage system is used with digital forensics techniques and KMT. A case
study, Servizio Polizia Postale e delle Comunicazioni, is taken as a sample. In the
forensics acquisition phase, the parameters used are system configuration files,
installed software, the user's browser history and event logs. The results are stored in
a two-dimensional matrix. In the feature extraction phase, relevant data is extracted
from the previous results. In the priority definition phase, the timeline of interest is
checked along with crime-related features to gather more evidence. The final data
structures are then assembled into the triaging matrix, which is processed by the model
algorithm to obtain the specific results.
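
A two-dimensional matrix of this kind can be pictured as one row per seized device and
one column per extracted feature. The sketch below only builds such a matrix. The feature
names and counts are hypothetical, and the paper's own model algorithm, which processes
the matrix, is not reproduced here.

```python
# Hypothetical feature names; the paper's actual feature set is not listed here.
FEATURES = [
    "config_changes",
    "installed_programs",
    "browser_history_hits",
    "event_log_alerts",
]


def build_triage_matrix(devices):
    """Rows are devices, columns follow FEATURES; a missing feature counts as 0."""
    return [[device.get(name, 0) for name in FEATURES] for device in devices]


# Invented per-device feature counts, as might come out of feature extraction.
seized = [
    {"config_changes": 2, "installed_programs": 40, "browser_history_hits": 7},
    {"installed_programs": 12, "event_log_alerts": 3},
]
matrix = build_triage_matrix(seized)
```

Fixing the column order in FEATURES keeps every row comparable, which any downstream
classifier would need.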

Limitations and Drawbacks:

The algorithm used in this system is not a commercial or standard one; it is the authors'
own. It has not been confirmed whether it can process terabytes of data efficiently and
correctly, which is a major drawback. Also, evidence and results must be presented in
court, so the method used in an investigation must meet a high, commercially accepted
standard. Because the system is based on a non-standard algorithm, it cannot be relied on
until it is based on a standard, commercially proven algorithm.

--------------------------------------------------------------------------

Paper 5:

Title: Framework for Live Digital Forensics Using Data Mining

Summary:
This study discusses the digital forensics process and its areas in detail, together with
data mining and cyber crime mining methods. It highlights the rise in cyber crime that
accompanied the emergence of information and communication technology. A framework
composed of digital forensics techniques and crime data mining is proposed, built on two
algorithms: K-Means and Apriori. The system also works in real time, for which the tools
WinPcap, Jpcap and WMIC are used.

Research Gap:
Digital forensics faces challenges when a huge amount of data must be processed.
Consumer-grade computers now hold large amounts of data, which increases the potential
size of forensics investigations. From this perspective, more machines and human
resources will be needed, and digital forensics professionals must secure them to tackle
complex investigations.

Novel Contribution:
A new framework is proposed that combines digital forensics techniques and crime data
mining with the K-Means and Apriori algorithms. The specific forensics techniques used
are memory forensics analysis, file system analysis and network forensics analysis. The
framework uses two sets, i.e. a training set and a test set. Clustering is used at the
end to obtain the conclusions and predictions.

Tools and Parameters:


A framework based on forensics techniques and crime data mining algorithms is used to
obtain predictions. The parameters are the data gathered with digital forensics
techniques. Evidence is collected from memory forensics analysis, file system analysis
and network analysis. From this evidence, two sets are generated using the K-Means and
Apriori algorithms. The K-Means algorithm then derives the value of K from the training
set and the test set. In the final phase, K is passed to the clustering process to obtain
the results and predictions. The same process is followed with the Apriori algorithm,
which takes both the training set and the test set. By comparing both sets, it produces
the item sets to which association rules are applied to obtain predictions.
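
The clustering step itself can be sketched in plain Python. This is a generic K-Means
loop, not the framework's code: k is passed in directly rather than derived from training
and test sets, the first k points serve as the initial centroids so the run is
repeatable, and the sample points are invented.

```python
def kmeans(points, k, iterations=20):
    """Cluster tuples of floats; returns (centroids, assignment per point)."""
    centroids = [points[i] for i in range(k)]  # deterministic start: first k points
    assignment = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])),
            )
            for pt in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, assignment


# Hypothetical two-feature evidence points forming two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignment = kmeans(points, k=2)
```

Here the low-valued points land in cluster 0 and the high-valued points in cluster 1. On
real evidence data the result depends heavily on the choice of k and of the starting
centroids.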

Limitations and Drawbacks:


The Apriori algorithm has an inherent drawback: it becomes very slow when processing
larger item sets. The K-Means algorithm assumes that the clusters are spherical and that
all variables have the same variance. If either condition does not hold, K-Means will not
work properly: it will not fail, but it will not achieve its best results.
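
The K-Means assumption follows from the objective the algorithm minimizes. In its
standard textbook form (the symbols are the usual ones, not taken from the paper), with
clusters C_1, ..., C_K and centroids mu_j, it is the within-cluster sum of squared
Euclidean distances:

```latex
J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

Because the squared Euclidean distance weights every variable equally, the objective
favors compact, roughly spherical clusters of similar spread. Elongated clusters, or
variables on very different scales, can therefore be split or merged incorrectly.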

---------------------------------------------------------------------------
