Document Analysis
INTRODUCTION TO
DOCUMENT ANALYSIS
Definition: Document analysis refers to
the forensic examination of digital files
to uncover crucial information such as
file origin, authenticity, and
modifications.
Importance in Forensics: In digital
investigations, documents can provide
evidence of fraud, insider threats,
intellectual property theft, etc.
Document analysis involves recovering,
verifying, and analyzing the content and
properties of files.
COMMON TYPES OF DOCUMENTS
IN INVESTIGATIONS:
Text documents (Word, PDF, etc.)
Spreadsheets (Excel)
Presentation files (PowerPoint)
Emails (analyzed in detail in Chapter 10)
FILE IDENTIFICATION
What is File Identification? The process of determining
the type and structure of a file based on its extension,
format, and contents.
A misidentified file could be malicious or intentionally
disguised to evade detection.
File Signatures and Extensions: Each file has a
signature (often found in the first few bytes of data) that
identifies its format.
Extensions like .docx, .pdf, and .xlsx are visual cues, but
they can be altered to mislead.
Forensic tools can determine the true format from the
signature, regardless of a falsified extension.
File Headers: These are critical as they
provide technical details about the file
format. For example, a PDF file begins
with the bytes %PDF.
A mismatch between file header and
extension raises red flags in forensic
investigations.
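The header-versus-extension check described above can be sketched in Python. This is a minimal illustration, not a replacement for a forensic tool: the signature table covers only a few common formats, and the names `SIGNATURES`, `EXPECTED_EXTENSIONS`, `identify`, `extension_matches`, and `check_file` are hypothetical helpers introduced for this example.

```python
import os
from typing import Optional

# A few well-known file signatures (magic bytes). Illustrative, not exhaustive.
SIGNATURES = {
    b"%PDF": "pdf",
    b"PK\x03\x04": "zip",  # ZIP container, also used by .docx, .xlsx, .pptx
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole2",  # legacy .doc, .xls, .ppt
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}

# Extensions that are plausible for each detected format (assumed mapping).
EXPECTED_EXTENSIONS = {
    "pdf": {".pdf"},
    "zip": {".zip", ".docx", ".xlsx", ".pptx"},
    "ole2": {".doc", ".xls", ".ppt"},
    "jpeg": {".jpg", ".jpeg"},
    "png": {".png"},
}


def identify(header: bytes) -> Optional[str]:
    """Return the format whose signature starts the header, or None."""
    for magic, kind in SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return None


def extension_matches(filename: str, header: bytes) -> Optional[bool]:
    """True/False if the extension fits the signature, None if unknown."""
    kind = identify(header)
    if kind is None:
        return None
    extension = os.path.splitext(filename)[1].lower()
    return extension in EXPECTED_EXTENSIONS[kind]


def check_file(path: str) -> Optional[bool]:
    """Read the first bytes of a file on disk and run the same check."""
    with open(path, "rb") as handle:
        return extension_matches(path, handle.read(8))
```

A result of `False` (for example, a file named `report.docx` whose first bytes are `%PDF`) is the kind of mismatch that would warrant a closer look. A result of `None` only means the signature is not in this small table.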
UNDERSTANDING METADATA
Definition of Metadata: Metadata is "data about
data," providing critical information about the
document's history, usage, and attributes.
Types of Metadata:
Substantive Metadata: Information regarding
formatting, fonts, and layout. This type of metadata
is useful in intellectual property cases.
Embedded Metadata: Information that applications
store, such as edit history, author, and time stamps.
Custom Metadata: User-defined fields such as
‘Author,’ ‘Last Modified By,’ ‘Document Title,’ etc.
Uses of Metadata in Forensics:
Establish ownership and authorship of a document.
Trace document revisions, which is crucial in
fraud cases or contract disputes.
Support the chain of custody by showing how
and when a file was accessed, modified, or
transferred.
Example: A Word document’s metadata might
reveal that it was last edited by a different
person than claimed or that it was created
much earlier than alleged.
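As a rough sketch of how such fields can be read, the following Python uses only the standard library. It relies on the fact that a .docx file is a ZIP archive whose core properties are stored in `docProps/core.xml`. The helper names `parse_core_properties` and `read_docx_metadata` are hypothetical, and the field selection is an assumption about what an examiner would want first.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of an OOXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Friendly name -> element to look up (a small, illustrative selection).
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def parse_core_properties(xml_bytes: bytes) -> dict:
    """Extract selected core properties; missing fields map to None."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        result[name] = element.text if element is not None else None
    return result


def read_docx_metadata(path: str) -> dict:
    """Open a .docx as a ZIP archive and parse docProps/core.xml."""
    with zipfile.ZipFile(path) as archive:
        return parse_core_properties(archive.read("docProps/core.xml"))
```

Comparing `creator` with `last_modified_by`, or `created` with the date a party claims, is the kind of check the example above describes. These values are stored inside the file and can be edited, so they should be treated as leads rather than proof.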
METADATA ANALYSIS
TOOLS
Microsoft Office Suite Example: In Microsoft Word,
tab.
bad clusters).
portion of disk space in a file cluster that may store hidden data.
files.
How to Identify Hidden Data:
Hex editors and forensic tools like X-Ways
Forensics can be used to examine file headers,
search slack space, and find hidden information.
Data carving tools retrieve hidden fragments of
data by scanning storage media for specific file
signatures.
Example:
A forensic investigator may discover secret
messages hidden in an image file through
steganalysis.
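The signature-scanning idea behind data carving can be sketched as follows. This is a deliberately naive illustration that looks for JPEG start and end markers in a raw byte buffer. The function name `carve_jpegs` is hypothetical, and real carving tools also handle fragmentation, embedded thumbnails, and many more formats.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus first marker byte
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(data: bytes) -> list:
    """Return (offset, bytes) pairs for each start..end span found in data."""
    carved = []
    position = 0
    while True:
        start = data.find(JPEG_START, position)
        if start == -1:
            break
        end = data.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # header without a footer: stop rather than guess a length
        end += len(JPEG_END)
        carved.append((start, data[start:end]))
        position = end
    return carved
```

In practice `data` would come from a disk image or a region of slack space read in binary mode. Each carved span should then be validated (for example, by trying to open it as an image), because the two-byte end marker can also occur by chance inside other data.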
DOCUMENT MANAGEMENT
SYSTEMS (DMS)
Definition: Document Management Systems are
platforms that store, manage, and track digital
documents, often used by businesses and
organizations.
Role in Forensic Investigations: DMS logs track
user access, modifications, and document history.
These logs can provide critical audit trails,
especially in cases involving intellectual property
disputes or insider threats.
Features for Forensic Investigations:
Version control: Helps in identifying when
documents were modified and by whom.
Audit trails: Show every interaction with a
document, including access, edits, and deletions.
Backup logs: Useful for recovering lost or
deleted files.
Forensic Tools for DMS:
AccessData FTK: Integrates DMS audit trails into
investigations by indexing and searching through
large document repositories.
CHALLENGES AND RISKS
Antiforensic Techniques: Criminals may employ
antiforensic techniques to destroy or alter metadata
or to hide documents, such as using tools to wipe
metadata or scramble MAC (modified, accessed,
created) timestamps.
Example: CCleaner is a common cleanup tool that can
remove temporary files and usage traces, which may
be used to cover tracks.
Risks of Relying on Metadata: Metadata can be
easily altered or deleted by experienced users.
Investigators should be cautious when drawing
conclusions based solely on metadata, as it can be
manipulated.
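To illustrate how little effort timestamp manipulation takes, the sketch below backdates a file's modification and access times with Python's standard `os.utime` call. The helper names `mac_times`, `backdate`, and `demo` are hypothetical, and the demo works on a throwaway temporary file.

```python
import os
import tempfile


def mac_times(path: str) -> dict:
    """Read file system timestamps as seconds since the epoch."""
    info = os.stat(path)
    return {
        "modified": info.st_mtime,
        "accessed": info.st_atime,
        # st_ctime is platform dependent: metadata change time on Unix,
        # creation time on Windows.
        "changed_or_created": info.st_ctime,
    }


def backdate(path: str, timestamp: float) -> None:
    """Set both access and modification time to an arbitrary moment."""
    os.utime(path, (timestamp, timestamp))


def demo(timestamp: int = 946684800):  # 2000-01-01 00:00:00 UTC
    """Create a temporary file, backdate it, and return (before, after)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "contract.txt")
        with open(path, "w") as handle:
            handle.write("draft")
        before = mac_times(path)["modified"]
        backdate(path, timestamp)
        after = mac_times(path)["modified"]
    return before, after
```

After `backdate`, the file appears to have been last modified in the year 2000 even though it was written moments ago, which is why timestamps should be corroborated against independent sources.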
Best Practices to Overcome These
Challenges: Combine metadata
analysis with other forensic techniques
such as file hashing, network logs, and
file system analysis to build a stronger
case.
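File hashing, mentioned above, can be sketched with Python's standard `hashlib` module. The choice of SHA-256 is an assumption for this example, and the helper names `sha256_of_stream`, `sha256_of_bytes`, and `sha256_of_file` are hypothetical.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks so large evidence files fit in memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def sha256_of_bytes(data: bytes) -> str:
    """Convenience wrapper for data that is already in memory."""
    return sha256_of_stream(io.BytesIO(data))


def sha256_of_file(path: str) -> str:
    """Hash a file on disk, e.g. at acquisition and again before analysis."""
    with open(path, "rb") as handle:
        return sha256_of_stream(handle)
```

If the digest recorded at acquisition equals the digest computed later, the content has not changed in between, regardless of what the file's timestamps or other metadata say. A hash covers content only, so it complements metadata analysis rather than replacing it.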
Case Study Overview: Example of a digital
forensics case where document analysis was
critical (e.g., an intellectual property dispute or a
fraud case).
Details of the Investigation:
Forensic techniques used to identify altered documents.
Use of metadata to establish timelines and
authorship.
Outcome: How document analysis led to a legal
resolution.
Key Lessons:
The importance of accurate data handling.
How metadata, temporary files, and hidden data
can provide critical evidence.