
Article

Forensic Investigation Capabilities of Microsoft Azure: A Comprehensive Analysis and Its
Significance in Advancing Cloud Cyber Forensics
Zlatan Morić *, Vedran Dakić, Ana Kapulica and Damir Regvart

Department of Cybersecurity, Algebra University, Gradiscanska 24, 10000 Zagreb, Croatia;
[email protected] (V.D.); [email protected] (A.K.); [email protected] (D.R.)
* Correspondence: [email protected]

Abstract: This article delves into Microsoft Azure’s cyber forensic capabilities, focusing on the unique
challenges in cloud security incident investigation. Cloud services are growing in popularity, and
Azure’s shared responsibility model, multi-tenant nature, and dynamically scalable resources offer
unique advantages and complexities for digital forensics. These factors complicate forensic evidence
collection, preservation, and analysis. Data collection, logging, and virtual machine analysis are
covered, considering physical infrastructure restrictions and cloud data transience. It evaluates Azure-
native and third-party forensic tools and recommends methods that ensure effective investigations
while adhering to legal and regulatory standards. It also describes how AI and machine learning
automate data analysis in forensic investigations, improving speed and accuracy. This integration
advances cyber forensic methods and sets new standards for future innovations. Unified Audit Logs
(UALs) in Azure are examined, focusing on how Azure Data Explorer and Kusto Query Language
(KQL) can effectively parse and query large datasets and unstructured data to detect sophisticated
cyber threats. The findings provide a framework for other organizations to improve forensic analysis,
advancing cloud cyber forensics while bridging theoretical practices and practical applications,
enhancing organizations’ ability to combat increasingly sophisticated cybercrime.
Keywords: cyber forensics; Microsoft Azure; cloud security; forensic tools; network security

Citation: Morić, Z.; Dakić, V.; Kapulica, A.; Regvart, D. Forensic Investigation Capabilities
of Microsoft Azure: A Comprehensive Analysis and Its Significance in Advancing Cloud Cyber
Forensics. Electronics 2024, 13, 4546. https://doi.org/10.3390/electronics13224546

Academic Editors: Jinwen Liang, Chuan Zhang, Fuyuan Song and Wen Wu

Received: 4 October 2024
Revised: 16 November 2024
Accepted: 18 November 2024
Published: 19 November 2024

Copyright: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open
access article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Cyber forensics is a crucial and rapidly advancing field that deals with the intricacies
of investigating cybercrime in the modern era. In the face of increasingly advanced and
widespread cyber threats, the importance of cyber forensics in detecting, safeguarding,
examining, and presenting digital evidence has reached unprecedented levels. Cyber
forensics utilizes scientific methodologies and sophisticated technological instruments to
investigate various incidents, including cyber-attacks, data breaches, online fraud, and
cyberbullying. This field is vital for law enforcement agencies, private sector security teams,
and the judicial system, as it not only provides the essential evidence to convict criminals
but also formulates strategies to prevent future occurrences, instilling a sense of hope and
optimism in the fight against cybercrime.
The increase in cyber threats and complex attack methods has brought attention to
significant difficulties in cyber forensics. Forensic investigators must keep up with the
latest tools and methodologies to stay ahead of cybercriminals who constantly change
their techniques. Integrating artificial intelligence and machine learning is necessary to
enhance forensic capabilities and address modern cyber threats’ complexities, as traditional
forensic methods are often inadequate. These sophisticated technologies can automate the
examination of vast amounts of data, detect patterns, and offer more profound insights into
cyber incidents, ultimately resulting in enhanced accuracy and efficiency in investigations.

Electronics 2024, 13, 4546. https://doi.org/10.3390/electronics13224546 https://www.mdpi.com/journal/electronics



Cloud forensics involves addressing challenges about data ownership, jurisdiction,
and the ever-changing nature of cloud storage.
This paper examines the present state of cyber forensics, highlighting the significance
of ongoing improvements in forensic tools and methodologies to stay updated with the
constantly evolving cyber threat landscape. This paper explores incorporating artificial
intelligence to improve forensic capabilities, simplify the investigative process, and tackle
the urgent requirements for precision and effectiveness. This paper also examines the
distinct difficulties various digital environments present and the need for specialized
forensic methodologies to tackle these challenges effectively.
Moreover, this paper emphasizes the crucial requirement for standardized protocols
and comprehensive frameworks in cyber forensics. The absence of consistency and
standardization in forensic procedures frequently hinders the effectiveness and dependability
of investigations. Implementing standardized protocols can enhance the uniformity and
acceptability of digital evidence in legal proceedings, thereby bolstering the credibility
of cyber forensics. The discipline can gain more acknowledgment and dependability
in the legal and cybersecurity communities by ensuring forensic methods’ strength and
widespread acceptance.

2. Related Works
Cyber forensics is a constantly evolving discipline that tackles the difficulties presented
by cybercrime. In a 2020 paper, Yadav [1] comprehensively examines the field, highlighting
its significance and the diverse methodologies employed in cybercrime investigations. The
author emphasizes considerable obstacles, such as the pressing need for ongoing enhancements
to forensic tools and methods to keep pace with ever-changing cyber threats. He proposes
that future research should prioritize the integration of artificial intelligence to enhance
forensic capabilities and streamline the investigative process, ultimately improving
accuracy and efficiency in cyber forensics.
Rakha [2] explores the scientific techniques used in cyber forensics to discover digital
evidence that can be used in legal proceedings. The analysis highlights essential challenges,
such as the absence of uniformity and comprehensive structures, that impede the efficiency
of forensic investigations. The authors propose that future research should focus on
developing standardized protocols to improve the reliability and acceptability of digital
evidence in legal proceedings. By ensuring the robustness and universal acceptance of
forensic methods, the credibility of cyber forensics will be strengthened.
The effectiveness of cyber investigations heavily relies on the usability of digital
forensics tools, as Baafi [3] correctly states. He explores the usability concerns associated
with different tools and emphasizes the significance of choosing tools based on user
requirements. The article addresses obstacles such as integrating tools and usability across
various user backgrounds. The author suggests that future research should prioritize enhancing
user interfaces and optimizing forensic tools to make them more accessible and efficient
for users with different levels of expertise. This will ultimately improve the usability and
effectiveness of cyber forensics.
A paper by Yerriswamy and Vehumadhava from 2022 emphasizes the crucial significance
of cyber forensic tools in examining digital crimes. The authors highlight the
importance of enacting statutory laws and implementing security measures to bolster
forensic investigations. They identify challenges associated with tool selection based on
financial constraints and level of expertise. Further investigation should prioritize the
development of economical and adaptable forensic instruments and the establishment of
uniform protocols for their application in different investigative situations [4].
Dunsin et al. [5] examine the utilization of machine learning algorithms in cybersecurity
and cyber forensics. The authors discuss the difficulties in guaranteeing these
algorithms’ inclusiveness, genuineness, and efficacy. They propose that future work
should investigate deep learning and computational intelligence to improve
forensic analysis and create more sophisticated, adaptable algorithms that can effectively
address the intricacies of cyber threats.
In their paper from 2022, Ekhande et al. [6] focus on evaluating the efficacy of deep
learning, specifically convolutional neural networks (CNNs), in digital forensics. The
authors emphasize the difficulties in modifying deep learning models for different forensic
tasks and propose that future research should focus on improving the precision of these
models. Using digital images and sound analysis techniques in forensics can significantly
enhance their classification, increasing reliability and efficiency.
In their paper from 2022, Javed et al. [7] thoroughly examine the most advanced
tools, techniques, and difficulties in digital forensics. The authors emphasize the need for
further investigation into the integration and efficacy of forensic tools, proposing that future
research should prioritize the development of standardized and interoperable forensic
toolkits. This will enhance the efficiency of addressing emerging cyber threats and optimize
the overall effectiveness of forensic investigations.
Digital forensics is a subdivision of forensic science that employs digital data as
evidence in investigations and legal proceedings, state Kaur and Kaur [8]. This paper
comprehensively examines the processes and models involved in digital forensics and the
necessity for standardized procedures. The text emphasizes the difficulties in guaranteeing
the admissibility and dependability of evidence. Further research should focus on creating
comprehensive models for digital forensic investigations that can effectively tackle
emerging digital threats and enhance forensic techniques.
A paper by Perilli et al. [9] discusses using data derived from electronic devices
as admissible evidence in legal proceedings, explicitly emphasizing the significance of
metadata within digital photographs. The article discusses the difficulties in extracting
data and protecting it from cyber-attacks. The authors propose that future research
should enhance forensic techniques for metadata analysis and establish resilient cybersecurity
measures to safeguard against diverse forms of cyber-attacks, guaranteeing digital
evidence’s authenticity.
A paper by McCluskey et al. [10] discusses the development of computer forensics,
focusing on the necessity of a uniform curriculum and professional benchmarks. Additionally,
it emphasizes the difficulties in choosing appropriate tools and following forensic
procedures. The authors propose that future research should establish universally recognized
criteria, enhance forensic education, and guarantee forensic methods’ reliability and
widespread acceptance across various jurisdictions.
In a paper from 2021, Malik [11] explores the use of software in cybercrimes and the
challenges posed by increasing reliance on internet-enabled devices. The author discusses
various cyber-attacks, including phishing, software piracy, and denial of service, and
the mechanisms these attacks exploit. It highlights tools like Kali Linux, Ophcrack, and
Safeback, explaining their dual use in ethical (white-hat) and unethical (black-hat) contexts.
The paper also emphasizes the role of forensic science in combating cybercrime, detailing
tools like EnCase and Safeback, which forensic experts use to retrieve and analyze data
from compromised systems. EnCase is noted for its ability to access hard drives, recover data,
and provide evidence for legal proceedings, while MD5sum is highlighted for ensuring data
integrity and verifying digital evidence. The paper concludes that a lack of public awareness
about cybersecurity facilitates cybercrime, stressing the need for education and vigilance
to mitigate risks. Challenges include understanding complex software algorithms and the
dual-use nature of these tools, which make addressing cybercrime increasingly complex.
Kumar et al. [12] examine cross-site scripting (XSS) attacks and digital forensic models
for gathering evidence. The authors discuss the difficulties in detecting and analyzing these
attacks and suggest that future research should focus on developing more efficient forensic
models and methodologies for collecting evidence. Implementing this measure will improve
the capacity to investigate and bring legal action against cybercrimes more efficiently.
The same study by Malik [11] also comprehensively examines software utilized in cybercrimes
and forensic tools employed for investigative purposes. The author analyzes the difficulties
associated with comprehending and combating different cybercrime techniques and
proposes that future research should prioritize the creation of all-encompassing frameworks
for forensic analysis and the enhancement of software tools to detect cybercrime. These
technological advancements will improve the efficiency and efficacy of cyber forensics.
In his 2021 paper, Kolesnyk [13] focuses on the role of digital evidence in preventing IT
crimes and on the forensic support provided in this regard. The author emphasizes the
difficulties encountered during investigations due to cybercrimes’ distinctive characteristics.
He proposes that future studies should enhance forensic methodologies, incorporate
cutting-edge technologies, and devise specialized techniques to augment the collection and
analysis of evidence in cybercrime investigations.
In their 2020 paper, Qadir and Varol examine how machine learning improves digital
forensics by facilitating the examination of large, varied datasets. They also examine how
machine learning may forecast and anticipate criminal behavior through the analysis of
historical data, hence enhancing efficiency and precision in recognizing possible dangers
in digital contexts [14]. Similar conclusions are echoed by Dunsin et al., who evaluated
sophisticated AI and ML methodologies in digital forensics, emphasizing applications
such as data recovery and cybercrime timeline reconstruction. This study offers a comprehensive
evaluation of the advantages and drawbacks of AI in enhancing digital forensics
procedures [5].
In their 2023 paper, Kumar and Kumar examine the application of artificial intelligence
and machine learning in financial institutions to detect cyber threats. They emphasize
how bespoke ML models improve security by forecasting potential threats, illustrating the
relevance of AI-driven solutions in delicate digital forensic contexts [15].
A paper from 2021 written by Rajendiran et al. examines several machine learning
applications in cyber forensics, particularly in addressing enormous data difficulties. It
emphasizes the essential function of machine learning in automating evidence processing
and decision-making, therefore assisting investigators in efficiently managing large data
volumes [16].
This paper is structured as follows: In the next section, we will describe some basic
terms and technologies related to cyber forensics. In the following sections, we will review
the scope of the threats, emerging trends, and new techniques in cyber forensics. We will
discuss some legal aspects, case studies, and real-world applications. Then, we will discuss
potential areas of future research and finish with the paper’s conclusion.

3. Cyber Forensics Fundamentals


Cyber forensics is the scientific field focused on identifying, collecting, preserving,
analyzing, and presenting electronic evidence in a manner appropriate for legal proceedings.
It entails the utilization of investigative and analytical methodologies to reveal and record
digital activities, encompassing data breaches and cybercrimes. The domain connects technology
and law enforcement, highlighting the necessity of preserving evidence integrity via
standardized protocols and instruments. Its fundamental principles encompass the chain
of custody, safeguarding data integrity, and employing rigorous forensic methodologies
to achieve precise and dependable conclusions. The first computer crimes occurred in
the early 1980s, spawning cyber forensics [17]. To meet the complexity of modern digital
surroundings, the area has expanded to include computers, mobile devices, cloud systems,
and social media platforms.
Fundamental processes related to cyber forensics are presented in Figure 1:
Figure 1. Cyber forensics process diagram (four phases: Identification, Preservation, Analysis, Presentation).

To provide a better understanding of these processes, they can be defined as follows [18]:
• The Identification Phase of the cyber forensics process entails recognizing and localizing
potential sources of digital evidence. Investigators delineate the parameters of
the inquiry and identify pertinent data assets, including hard drives, network logs,
emails, or other electronic records that may harbor essential information;
• During preservation, efforts concentrate on securing and isolating digital evidence to
maintain its integrity and prevent alteration. This phase generally entails generating
backups of the original data and utilizing hashing methods to confirm the integrity
of the evidence. Utilizing specialized instruments guarantees the preservation of the
original data without alteration;
• The analysis phase involves meticulously examining and interpreting the collected
evidence. Investigators employ sophisticated forensic instruments to reveal patterns,
anomalies, or essential data points. This encompasses recovering deleted files,
decrypting encrypted data, and mapping digital activities to reconstruct events;
• Ultimately, the findings are systematically organized and prepared for dissemination
during the presentation phase. Investigators produce succinct reports encapsulating
the evidence, frequently employing visual aids such as timelines or activity diagrams.
These materials are formatted for court proceedings or investigative briefings, ensuring
comprehensibility and legal admissibility.
These principles guarantee that the evidence remains unaltered and preserves its
integrity throughout the forensic process.
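The hashing step of the preservation phase can be illustrated with a minimal sketch. It assumes the evidence is available as a local file and uses SHA-256 from Python's standard library. The function names (`sha256_of_file`, `verify_copy`, `demo`) and the choice of SHA-256 are illustrative assumptions, not a procedure prescribed by the cited work.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original_path, copy_path):
    """Return True when the working copy is bit-for-bit identical to the original."""
    return sha256_of_file(original_path) == sha256_of_file(copy_path)


def demo():
    """Hypothetical walk-through: create evidence, copy it, verify it, then tamper."""
    workdir = tempfile.mkdtemp()
    try:
        original = os.path.join(workdir, "evidence.bin")
        working_copy = os.path.join(workdir, "evidence_copy.bin")
        with open(original, "wb") as handle:
            handle.write(b"sample log data\n" * 1000)
        shutil.copyfile(original, working_copy)
        intact = verify_copy(original, working_copy)
        # Simulate an alteration of the working copy after acquisition.
        with open(working_copy, "ab") as handle:
            handle.write(b"x")
        tamper_detected = not verify_copy(original, working_copy)
        return intact, tamper_detected
    finally:
        shutil.rmtree(workdir)
```

In practice, the digest of the original would be recorded at acquisition time and stored with the chain-of-custody record, so that any later copy can be checked against it.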

4. Scope of Cyber Threats
The range of cyber threats is vast and perpetually evolving with technological progress.
Cyber threats affect organizations and individuals, resulting in considerable financial,
reputational, and operational harm. Organizations face ransomware attacks, which can disrupt
operations by encrypting essential data, and data breaches that compromise sensitive
customer information, resulting in diminished trust and possible legal ramifications. Cyber
threats to individuals manifest as phishing attacks, identity theft, and malware infections,
compromising personal data and devices, resulting in data loss and privacy infringements.
Phishing attacks can facilitate unauthorized access to individual accounts, exposing
sensitive information that may be exploited.

The global character of cyber threats is apparent in their extensive influence across
various geographical areas. Cyber-attacks’ escalating frequency and complexity underscore
the imperative of implementing stringent cybersecurity protocols. Organizations and
individuals must stay alert as these threats increase in complexity. Adversaries utilize more
sophisticated attack methods that necessitate equally advanced defense mechanisms for
effective counteraction. Let us first discuss the implications of cyber threats for organizations and individuals.

4.1. Impact on Organizations and Individuals


Organizations can suffer severe consequences because of cyber threats. The consequences
encompass financial losses, harm to reputation, disruptions to operations, and
legal ramifications. As an illustration, ransomware attacks can suspend business operations
by encrypting vital data, resulting in substantial periods of inactivity and financial detriment.
Malware infections can potentially jeopardize personal data and devices, resulting in
the loss of data and unauthorized surveillance [19]. In addition, data breaches can reveal
confidential customer data, leading to a loss of customer confidence and potential legal
consequences [20]. Individuals are susceptible to cyber threats, which expose them to the
possibility of identity theft, financial harm, and privacy violations. Phishing attacks have
the potential to result in unauthorized entry into personal accounts and the exposure of
sensitive information.

4.2. Statistical Overview


These statistics provide a clear indication of the widespread occurrence and seriousness
of cyber threats on a global scale [21]:
• Cybercrime victim count: The annual count of cybercrime victims is around 556 million,
which corresponds to over 1.5 million victims per day and about 18 victims per second;
• Financial impact: The yearly cost of worldwide cybercrime is approximately
USD 100 billion;
• Types of cyber-attacks: The prevalent forms of cyber-attacks are malware, accounting
for 50% of incidents, followed by criminal insiders at 33%, and theft of data-bearing
devices at 28%;
• Geographic impact: Russia and the U.S. play a substantial role in malware attacks,
accounting for 39.4% and 19.7% of global malware, respectively;
• Frequency of attacks: The US Navy encounters more than 110,000 cyber-attacks per
hour, emphasizing the ongoing danger presented by cyber adversaries.
The statistics highlight the urgent need for robust cybersecurity measures to safeguard
organizations and individuals from growing threats [22].
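The per-day and per-second figures in the first bullet follow from the annual count by simple division. The short sketch below reproduces that arithmetic, assuming a 365-day year; the variable names are illustrative.

```python
ANNUAL_VICTIMS = 556_000_000     # annual victim count quoted in the list above
DAYS_PER_YEAR = 365              # assumption: non-leap year
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

victims_per_day = ANNUAL_VICTIMS / DAYS_PER_YEAR
victims_per_second = victims_per_day / SECONDS_PER_DAY

print(f"{victims_per_day:,.0f} victims per day")
print(f"{victims_per_second:.1f} victims per second")
```

This gives roughly 1.52 million victims per day and 17.6 per second, consistent with the rounded figures of over 1.5 million and 18.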

5. Current and Emerging Trends in Cyber Forensics


Cyber forensics is rapidly developing due to the constant progress of technology and
the growing complexity of cyber threats. Current developments in cyber forensics involve
incorporating artificial intelligence and machine learning to improve investigative abilities,
utilizing blockchains to guarantee data integrity, and adopting cloud forensics to manage
the increasing prevalence of cloud-based services.
These emerging patterns are transforming the methods used by forensic specialists to
collect, examine, and present digital evidence, providing innovative tools and approaches
to tackle intricate cybercrimes better. With the increasing prevalence of cyber threats, the
importance of cyber forensics in upholding cybersecurity and aiding legal proceedings is
growing significantly.

Application of AI
Artificial intelligence and machine learning are revolutionizing forensic capabilities
by augmenting the velocity and precision of data analysis. Artificial intelligence, specifically
deep learning (DL), employs neural networks to replicate human decision-making
processes, allowing forensic investigators to manage large volumes of data effectively.
The creation of the Deep Learning Cyber-Forensics (DLCF) framework demonstrates the
incorporation of artificial intelligence into cyber forensics. This framework utilizes deep
learning algorithms to automate the procedures of evidence acquisition, preservation,
analysis, and interpretation. Classification algorithms categorize extensive datasets and
identify pertinent evidence, whereas clustering techniques reveal concealed patterns and
relationships within the data [18]. Through artificial intelligence, forensic investigations are
enhanced, enabling investigators to concentrate on the most crucial elements of the case.
Artificial intelligence and machine learning are crucial in enhancing cyber forensics by
facilitating the quick analysis of extensive information and revealing concealed patterns,
particularly in cloud environments like Microsoft Azure. AI models, especially deep learning,
enhance investigators’ ability to automate anomaly detection, identify intricate cyber
risks with increased accuracy, and expedite incident response. Utilizing Azure Machine
Learning and Azure Data Explorer, forensic teams may anticipate potential threat vectors
and acquire critical insights that enhance and accelerate security responses. Deep learning
algorithms, adept at discerning intricate patterns among vast datasets, aid investigators
in uncovering previously unnoticed threat indicators. AI-driven algorithms in Azure Sentinel
enhance forensic operations by aggregating and categorizing various threat types, enabling
more effective incident triage. This comprehensive method improves the precision of
forensic inquiries while markedly reducing human error and investigative delays.
Consequently, AI and ML technologies improve the reliability and speed of forensic
responses, strengthening cybersecurity measures in cloud-based forensic operations.
The application of artificial intelligence (AI) in forensic investigations is aligned
seamlessly with the traditional forensic process outlined in Figure 1, which encompasses
the phases of identification, preservation, analysis, and presentation. AI technologies
augment each phase, streamline workflows, and enhance the accuracy and efficiency
of investigations.
1. Identification Phase: AI technologies accelerate the identification of evidence by
automating the detection of anomalous behaviors, compromised systems, and potential
attack vectors. For instance, anomaly detection algorithms are deployed in Security
Information and Event Management (SIEM) systems like Azure Sentinel, allowing
for the real-time identification of outliers such as suspicious login attempts or
unexpected geographic access patterns. These capabilities directly enhance the speed and
precision of the Identification Phase.
2. Preservation Phase: In this phase, AI facilitates the automated preservation of digital
evidence. Predictive models are leveraged to flag high-risk activities, prompting the
immediate snapshotting of volatile data and log capture to ensure forensic soundness.
Cloud-native platforms such as Azure Sentinel can be integrated with automated
workflows, allowing for the preservation of relevant data streams, thereby mitigating
the risk of evidence tampering or loss in dynamic environments.
3. Analysis Phase: AI plays a transformative role in this phase, where processing large
datasets with speed and accuracy is invaluable. Machine learning models can establish
user behavior baselines and detect deviations indicative of insider threats or
advanced persistent threats (APTs). For example, unsupervised learning models in
Azure Machine Learning Studio can detect rare patterns or low-frequency events
that are often critical to forensic investigations.
4. Presentation Phase: AI enhances this phase by automating reporting and visualization.
AI tools synthesize complex data into concise visualizations, incident timelines, and
actionable insights, aiding legal and operational teams. When integrated with Azure
ML Studio, tools such as Power BI provide dashboards that clearly present the findings
to technical and non-technical stakeholders.
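The identification and analysis items above describe flagging unexpected geographic access and low-frequency events against a behavioral baseline. The sketch below illustrates that idea with the standard library only. It is not how Azure Sentinel or Azure Machine Learning implement detection, and the record fields (`user`, `country`, `operation`), the sample values, and the rarity threshold are all hypothetical.

```python
from collections import Counter, defaultdict


def build_country_baseline(history):
    """Map each user to the set of countries seen in historical sign-ins."""
    baseline = defaultdict(set)
    for event in history:
        baseline[event["user"]].add(event["country"])
    return baseline


def flag_unexpected_locations(baseline, new_events):
    """Return events whose country was never seen before for that user."""
    return [
        event
        for event in new_events
        if event["country"] not in baseline.get(event["user"], set())
    ]


def rare_operations(events, max_share=0.05):
    """Return operations whose share of all events is at or below max_share."""
    counts = Counter(event["operation"] for event in events)
    total = sum(counts.values())
    return sorted(op for op, n in counts.items() if n / total <= max_share)


# Hypothetical sample data for illustration only.
history = [{"user": "alice", "country": "HR", "operation": "UserLoggedIn"}] * 40
history.append({"user": "bob", "country": "DE", "operation": "Add service principal"})
new_events = [
    {"user": "alice", "country": "HR", "operation": "UserLoggedIn"},
    {"user": "alice", "country": "BR", "operation": "UserLoggedIn"},
]

baseline = build_country_baseline(history)
suspicious = flag_unexpected_locations(baseline, new_events)
rare = rare_operations(history)
```

A set-membership baseline is deliberately simple. A production system would weigh factors such as travel plausibility, device, and time of day rather than treating every new country as suspicious.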
AI-driven technologies bridge identification, preservation, analysis, and presentation,
allowing forensic teams to respond to incidents with greater agility, accuracy, and scalability.
This alignment ensures that AI is regarded not merely as an auxiliary tool but as a central
component in modernizing forensic practices, empowering organizations to meet the
demands of increasingly sophisticated and dynamic cyber environments.

6. Methodologies in Azure Forensics


In the evolving landscape of cyber forensics, particularly within cloud infrastructures
such as Microsoft Azure, adapting and refining forensic methodologies to address unique
challenges is critical. This section introduces advanced forensic methodologies applied to
Microsoft Entra ID (formerly Azure Active Directory) and highlights practical applications
and innovations in cloud forensics.
Forensic data must be collected and analyzed with precision and care. Forensic data
collection in Entra ID rests on the effective aggregation and analysis of logs. Log
management must be configured deliberately so that the relevant data are captured, which
is crucial for monitoring activities and detecting anomalies. Entra ID logs are collected
primarily through Azure Monitor and Azure Log Analytics. These tools allow forensic
experts to collect, analyze, and manage vast amounts of logging data.
The data collected typically include user activities, authentication requests, and con-
figuration changes within the Azure environment. Forensic analysts utilize Azure Data
Explorer, a highly scalable data analytics service, to ingest, store, and query large datasets
from various sources, including the Unified Audit Logs and Entra ID Identity Protection logs.
Exploring intricate patterns and potentially malicious activities in these data is vital.
The methodology for using Azure Data Explorer in forensics is as follows. Azure Data
Explorer plays a pivotal role in forensic investigations because it can analyze large-scale
structured and unstructured data. The process begins by establishing an Azure Data
Explorer cluster and creating a database tailored to forensic needs. Once the database is
operational, forensic data, including exported Unified Audit Logs and outputs from Entra
ID Identity Protection, are ingested.
Forensic analysts use Kusto Query Language (KQL) to run advanced queries on the data. The queries
are designed to uncover hidden patterns and extract actionable insights from the data. For
instance, analysts may search for signs of compromised identities or unauthorized access
attempts by querying risk detections and Sign-in Logs. This systematic approach ensures a
thorough examination of data, which aids in accurately identifying security incidents.
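As a minimal sketch of the kind of query described above, the following KQL correlates risk detections with sign-in records for the same user. Both tables are inline `datatable` samples so that the query can be run in any Azure Data Explorer database. The table names `RiskDetections` and `Signins`, all column names, and all values are illustrative assumptions and do not reflect the schema of an actual export.

```kusto
// Hypothetical sample of exported Identity Protection risk detections.
let RiskDetections = datatable(DetectedTime:datetime, UserPrincipalName:string, RiskEventType:string, RiskLevel:string)
[
    datetime(2024-05-01 08:15:00), "userA", "unfamiliarFeatures", "high",
    datetime(2024-05-01 09:40:00), "userB", "anonymizedIPAddress", "medium",
    datetime(2024-05-01 11:00:00), "userC", "unfamiliarFeatures", "low"
];
// Hypothetical sample of exported sign-in records.
let Signins = datatable(SigninTime:datetime, UserPrincipalName:string, IPAddress:string, Location:string)
[
    datetime(2024-05-01 08:14:30), "userA", "203.0.113.42", "UnknownRegion1",
    datetime(2024-05-01 12:00:00), "userA", "198.51.100.10", "ExpectedRegion1",
    datetime(2024-05-01 09:39:50), "userB", "198.51.100.77", "SuspiciousRegionX",
    datetime(2024-05-01 11:00:10), "userC", "198.51.100.20", "ExpectedRegion2"
];
RiskDetections
// Keep only detections that are at least medium risk.
| where RiskLevel in ("medium", "high")
| join kind=inner Signins on UserPrincipalName
// Keep only sign-ins within ten minutes of the detection.
| where SigninTime between ((DetectedTime - 10m) .. (DetectedTime + 10m))
| project DetectedTime, UserPrincipalName, RiskEventType, RiskLevel, SigninTime, IPAddress, Location
| order by DetectedTime asc
```

In a real investigation, the two `let` statements would be replaced by the tables into which the exports were ingested, and the ten-minute window would be tuned to the environment.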
These methodologies can be illustrated by a forensic investigation that followed a
real-world breach scenario. Analysts leveraged Azure Data Explorer to sift through
gigabytes of log data to identify the origin of the breach. Unusual access patterns were
queried from the logs and cross-referenced with known indicators of compromise, allowing
for the swift pinpointing of malicious activities and the affected accounts.
Several challenges and considerations apply. Azure provides comprehensive tools for
cloud forensics; however, several challenges remain. The dynamic nature of cloud
environments often produces transient data, which can complicate the preservation of
evidence. Furthermore, under the shared responsibility model in cloud computing, both
providers and customers must understand their roles in maintaining forensic readiness.
Preserving the integrity of the forensic process in such a setting requires constant
updates to forensic methodologies and tools.

Challenges and Solutions in Cloud Forensics


Implementing forensic methodologies in cloud environments, such as Microsoft Azure,
presents unique challenges. These challenges stem from the cloud’s inherent
characteristics, including its dynamic nature, multi-tenancy, and data distribution
across global infrastructures. This section outlines common real-world challenges
encountered during cloud forensics and the innovative solutions developed to address them.

Challenge 1: Data Volatility and Temporary Resources: In cloud environments, the
transient nature of resources, such as virtual machines and containers, can result in
volatile data that may be lost when these resources are terminated. This volatility poses
significant challenges for forensic data preservation and collection. To mitigate the risk
of data loss, forensic teams employ automated tools that trigger snapshots and backups of
virtual machines and containers when a potential incident is detected. Azure Automation
and Azure Logic Apps can automate the response to alerts, allowing data to be captured
before resources are scaled down or terminated. This automation is designed to ensure that
even temporary data are captured and stored for forensic analysis, in line with the
principles of digital evidence preservation.
Challenge 2: Multi-Tenancy and Data Segregation: The multi-tenant architecture of
Azure, although efficient for resource utilization, complicates forensic investigations. The
co-location of data from multiple clients may raise concerns regarding data privacy and the
potential for accidental exposure during a forensic investigation. To address these concerns,
Azure employs strict access controls and encryption to segregate customer data securely.
Forensic tools designed for use in Azure, including Azure Security Center, are equipped
with features that respect tenant boundaries and ensure that investigations are confined
to the data owned by the entity under investigation. Furthermore, Azure’s compliance
with international standards such as ISO 27001 [23] ensures adherence to global privacy
regulations in forensic activities.
Challenge 3: Management and Integrity of Logs: Effective log management is critical for
forensic investigations; however, managing logs and assuring their integrity within a
distributed environment is inherently challenging. The completeness, accuracy, and
immutability of log data are vital for ensuring forensic validity. Azure offers solutions
such as Azure Monitor and Azure Sentinel, which facilitate comprehensive log collection
across all Azure services. Additionally, immutable storage options help ensure the
integrity of these logs. Azure Monitor collects a wide array of telemetry data, which are
then processed and analyzed by Azure Sentinel, Azure’s cloud-native SIEM system. These
tools support advanced query capabilities and automated response actions, enhancing the
efficacy and efficiency of forensic investigations.
A notable application of these solutions was observed while investigating unautho-
rized access to an Azure-based application. The forensic team utilized Azure Sentinel to
analyze log data and detect abnormal access patterns. The team leveraged Sentinel’s ma-
chine learning capabilities to identify and isolate suspicious activities quickly, significantly
reducing the impact of the incident.
Looking ahead, forensic readiness must continue to be enhanced. As cloud environments
evolve, the strategies for forensic investigations must also adapt. Future advancements
are expected to focus on strengthening real-time data analysis capabilities and developing
more sophisticated AI-driven tools for anomaly detection. Incorporating forensic readiness
into the design of cloud architectures is crucial, so that systems are inherently equipped
to facilitate forensic investigations.

7. Case Study
Case studies offer essential insights into the practical application of cyber forensics,
demonstrating how theoretical concepts are utilized to combat complex cyber threats. By
examining incidents, such as ransomware attacks or supply chain breaches, investigators
can identify weaknesses in current cybersecurity frameworks and formulate more effective
defensive strategies. These real-world instances demonstrate the forensic techniques
employed and underscore the significance of prompt detection, comprehensive investigation,
and post-incident evaluation to avert future breaches. This case study examines the
intricacies and obstacles organizations encounter in addressing cyber-attacks and
illustrates how utilizing sophisticated forensic tools and methodologies in cloud
environments such as Azure can alleviate damage, restore data, and strengthen defenses
against emerging threats. This section delves into the practical applications of cyber
forensics by analyzing detailed case studies.
7.1. Forensic Investigation of a Ransomware Attack on an Azure-Hosted Service
A medium-sized enterprise encountered a significant disruption due to a ransomware
attack that encrypted vital data across its Azure-hosted services. This incident hindered
operational capabilities, and severe data leakage risks were also posed. Consequently, a
forensic investigation team was engaged to analyze the breach, pinpoint the attack
vectors, and devise strategies to prevent future incidents.
Figure 2 comprehensively illustrates the forensic analysis process undertaken in
response to a ransomware attack on Azure-hosted services. The systematic approach, from
the initial engagement of the forensic team through to the final mitigation and prevention
strategies, is outlined in Figure 2:

Figure 2. Mapping forensic workflow of ransomware attack.

In this case study, the forensic team within Azure Sentinel employed a structured,
multi-step analysis to investigate a ransomware breach initiated via a phishing email. The
investigation was conducted through systematic log consolidation, targeted KQL queries,
and behavioral analysis.
7.1.1. Data Collection and Consolidation
The forensic analysis was initiated by identifying key log sources within the Azure
environment, which could provide critical insights into the suspected ransomware breach.
The primary logs gathered included Azure Activity Logs, Sign-in Logs, and security
events. The logs were essential for tracing user and system activities, access attempts,
and configuration changes across the environment. Additionally, customized logs were
extracted from specific virtual machines linked to the affected systems, with
application-level events being captured that could potentially reveal the attacker’s
behavior. The data collection process was carried out in phases, beginning with the initial
aggregation of log data from each identified source. The native capabilities of Azure
Sentinel were configured to automate the aggregation process, thereby ensuring data
continuity and real-time updates as new logs were generated. Pre-processing steps were
undertaken to maintain consistency across these various log sources. The normalization of
timestamp formats, the standardization of field names, and the removal of redundant data
entries were involved, which facilitated smoother correlation during analysis. To
streamline data ingestion and reduce manual intervention, predefined connectors were set
up in Azure Sentinel to continuously collect and update log data from the Azure Activity
Logs, security events, and Sign-in Logs. These connectors ensured that up-to-date
information was provided by each log source, which allowed for the effective monitoring
and analysis of data by the forensic team. A strong foundation for conducting comprehensive
forensic investigations with consistent, reliable data was laid by this integration,
coupled with automated ingestion.
7.1.2. Specific KQL Queries and Indicators
Suspicious activities linked to the ransomware breach were identified using targeted
Kusto Query Language queries within Azure Sentinel. Specific anomalies and indicators
of compromise (IoCs) that could signify unauthorized access or malicious activity were
detected by crafting these queries.
The primary query focused on unusual login locations to identify logins from regions
outside the organization’s expected operational boundaries. The detection of potential
account takeovers was facilitated by filtering sign-ins from high-risk geolocations:
SigninLogs
| where Location !in ("ExpectedRegion1", "ExpectedRegion2")
| summarize Count = count() by UserPrincipalName, Location, IPAddress
| where Count > 1

The following table represents the output of the query, highlighting users with multiple
suspicious sign-in attempts from unexpected regions:
UserPrincipalName Location IPAddress Count
[email protected] "UnknownRegion1" 203.0.113.42 3
[email protected] "SuspiciousRegionX" 198.51.100.77 5
[email protected] "HighRiskRegion" 192.0.2.33 8

Anomalous login activities were identified by isolating users who accessed the system
from unexpected regions outside the operational zones. Multiple logins by
al-[email protected] from “UnknownRegion1” were noted, which could indicate user
travel or credential misuse. Additionally, [email protected] logged in five times from
“SuspiciousRegionX”, flagged in prior threat intelligence. Furthermore,
finance_ad-[email protected] was observed to have eight logins from “HighRiskRegion”,
which raises significant concern due to its administrative privileges. Immediate actions
were taken, including investigating associated IP addresses for malicious activity,
verifying login legitimacy through user travel or VPN logs, and resetting credentials if
unauthorized access was confirmed. During the Identification Phase of the forensic
workflow, these anomalies were identified, and critical insights were provided for further
machine learning-based analysis to mitigate and prevent similar unauthorized access in
the future.
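The location filter can be tried without access to a workspace by running it against inline sample data. The sketch below is only an illustration: the `SampleSignins` table, the `ExpectedRegions` allowlist, and every value in them are assumptions. It also records the first and last time each combination was seen, which is not part of the original query but can help when verifying logins against travel or VPN records.

```kusto
// Hypothetical allowlist of regions where the organization operates.
let ExpectedRegions = dynamic(["ExpectedRegion1", "ExpectedRegion2"]);
// Inline sample standing in for the SigninLogs table.
let SampleSignins = datatable(TimeGenerated:datetime, UserPrincipalName:string, Location:string, IPAddress:string)
[
    datetime(2024-05-01 08:00:00), "userA", "ExpectedRegion1", "198.51.100.10",
    datetime(2024-05-01 08:05:00), "userA", "UnknownRegion1", "203.0.113.42",
    datetime(2024-05-01 08:07:00), "userA", "UnknownRegion1", "203.0.113.42",
    datetime(2024-05-01 09:00:00), "userB", "ExpectedRegion2", "198.51.100.20",
    datetime(2024-05-01 09:30:00), "userB", "UnknownRegion1", "203.0.113.50"
];
SampleSignins
// Drop sign-ins from regions on the allowlist.
| where Location !in (ExpectedRegions)
| summarize Count = count(), FirstSeen = min(TimeGenerated), LastSeen = max(TimeGenerated)
    by UserPrincipalName, Location, IPAddress
// Keep only combinations seen more than once.
| where Count > 1
```

With this sample, only the `userA` sign-ins from "UnknownRegion1" pass both filters. The single out-of-region sign-in by `userB` is dropped by the `Count > 1` condition, which shows that this threshold trades sensitivity for less noise.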
The aggregation of login attempts from non-standard locations was conducted, enabling
the identification of accounts accessed from unfamiliar or high-risk regions, which may
suggest compromised credentials. A further inquiry was directed towards failed login
attempts, as repeated failures may indicate brute-force attacks or unauthorized access
attempts. Monitoring for a threshold number of failed logins within a specified timeframe
allowed for the isolation of accounts experiencing repeated access attempts:
SecurityEvent
| where EventID == 4625 // Failed login event ID
| summarize FailedAttempts = count() by Account, IPAddress = IpAddress
| where FailedAttempts > 5

The query was conducted to highlight accounts that exhibited multiple failed login
attempts, which may indicate possible brute-force attempts or attempts to access accounts
with elevated privileges:
Account IPAddress FailedAttempts
[email protected] 203.0.113.42 7
[email protected] 198.51.100.77 10
[email protected] 192.0.2.24 6

This query analyzes Windows security event logs (Event ID 4625) to detect accounts
with an unusually high number of failed login attempts. This pattern often signals
brute-force attacks or credential-stuffing attempts where attackers repeatedly try to
access an account using guessed or stolen credentials. The query aggregates failed
attempts by user account and originating IP address, isolating instances where more than
five failures are logged. The results provide critical evidence for identifying
unauthorized access attempts, which can help prevent account compromise and inform
subsequent forensic analysis.
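A natural follow-up, sketched here under stated assumptions, is to check whether any account that exceeded the failure threshold later logged on successfully from the same IP address, which would suggest that a guessing attempt eventually succeeded. The `SampleEvents` table and its values are invented for illustration. Event ID 4624 is the Windows successful logon event, and the `IpAddress` column name follows the `SecurityEvent` table.

```kusto
// Inline sample standing in for the SecurityEvent table.
let SampleEvents = datatable(TimeGenerated:datetime, EventID:int, Account:string, IpAddress:string)
[
    datetime(2024-05-01 10:00:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:01:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:02:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:03:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:04:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:05:00), 4625, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:06:00), 4624, "acctA", "203.0.113.42",
    datetime(2024-05-01 10:10:00), 4625, "acctB", "198.51.100.77",
    datetime(2024-05-01 10:11:00), 4624, "acctB", "198.51.100.77"
];
// Accounts and source IPs with more than five failed logons.
let Failed = SampleEvents
    | where EventID == 4625
    | summarize FailedAttempts = count(), LastFailure = max(TimeGenerated) by Account, IpAddress
    | where FailedAttempts > 5;
SampleEvents
| where EventID == 4624 // Successful logon event ID
| join kind=inner Failed on Account, IpAddress
// Keep only successes that occur after the last recorded failure.
| where TimeGenerated > LastFailure
| project Account, IpAddress, FailedAttempts, LastFailure, SuccessTime = TimeGenerated
```

In this sample, `acctA` is returned because six failures are followed by a success from the same address, while `acctB` is not, since it has only one failure.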
To detect privilege escalations, the team implemented a query to monitor additions to
high-privilege groups, such as administrators or other critical roles. The identification
of newly added accounts within privileged groups was facilitated by this query, which
serves as a standard indicator of an attacker’s attempt to gain elevated access:

AuditLogs
| where OperationName == "Add member to role"
| summarize AddedMembers = count() by TargetAccount, RoleName
| where AddedMembers > 0

This query was designed to filter for role changes indicating privilege escalation,
particularly when new accounts were added to high-level administrative groups:

TargetAccount RoleName AddedMembers


[email protected] Global Admin 2
[email protected] Security Group Admin 1
[email protected] Directory Readers 3

The output revealed accounts and roles involved in privilege changes, such as
fi-[email protected], where two members were added to the highly sensitive Global
Admin role, raising concerns about unauthorized escalation. Similarly, [email protected]
added a member to the Security Group Admin role, suggesting potential lateral movement,
while [email protected] added three members to the Directory Readers role, possibly for
reconnaissance. Immediate actions included verifying the legitimacy of added accounts,
auditing their activities, and revoking unauthorized access. These findings, aligned with
the Identification Phase, provide critical insights for further investigation and
correlation with system activity logs to assess potential compromise of sensitive systems.
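The query above treats `TargetAccount` and `RoleName` as flat columns. In a Log Analytics workspace, the target and the role of an assignment are typically nested inside the dynamic `TargetResources` column of `AuditLogs`, so they may need to be extracted first. The following sketch shows one way to do this on inline sample data. The nested field layout used here (`userPrincipalName`, `modifiedProperties`, `Role.DisplayName`) is an assumption that should be checked against real records, and the account names are hypothetical.

```kusto
// Inline sample shaped like AuditLogs records; the nested layout is an assumption.
let SampleAudit = datatable(TimeGenerated:datetime, OperationName:string, InitiatedBy:dynamic, TargetResources:dynamic)
[
    datetime(2024-05-01 10:00:00), "Add member to role",
        dynamic({"user": {"userPrincipalName": "adminX"}}),
        dynamic([{"userPrincipalName": "newUser1", "modifiedProperties": [{"displayName": "Role.DisplayName", "newValue": "Global Administrator"}]}]),
    datetime(2024-05-01 10:05:00), "Add member to role",
        dynamic({"user": {"userPrincipalName": "adminX"}}),
        dynamic([{"userPrincipalName": "newUser2", "modifiedProperties": [{"displayName": "Role.DisplayName", "newValue": "Global Administrator"}]}]),
    datetime(2024-05-01 11:00:00), "Update user",
        dynamic({"user": {"userPrincipalName": "adminY"}}),
        dynamic([{"userPrincipalName": "newUser3", "modifiedProperties": []}])
];
SampleAudit
| where OperationName == "Add member to role"
// Pull the acting and the affected account out of the nested columns.
| extend Actor = tostring(InitiatedBy.user.userPrincipalName),
         TargetAccount = tostring(TargetResources[0].userPrincipalName)
// Expand the modified properties and keep the one that names the role.
| mv-expand Prop = TargetResources[0].modifiedProperties
| where tostring(Prop.displayName) == "Role.DisplayName"
| extend RoleName = tostring(Prop.newValue)
| summarize AddedMembers = dcount(TargetAccount), Members = make_set(TargetAccount) by Actor, RoleName
```

Grouping by the acting account and the role, rather than by the target alone, makes it easier to see who granted access and how many accounts were affected.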
7.1.3. Steps for Anomaly Detection
The forensic team employed a combination of thresholds, parameters, and behavioral
analysis techniques within Azure Sentinel to pinpoint suspicious activities associated with
the ransomware attack. Specific detection criteria were set, allowing for systematically
isolating abnormal behaviors.
Multiple failed login attempts within a short period may indicate a brute-force attack.
The team established a threshold of five failed login attempts within an hour for each
account. An alert was triggered if the threshold was exceeded, indicating that the account
was potentially compromised. An example of a KQL query is presented below:

SecurityEvent
| where EventID == 4625 // Event ID 4625 signifies a failed login attempt
| summarize FailedAttempts = count() by Account, IPAddress = IpAddress, bin(TimeGenerated, 1h)
| where FailedAttempts > 5

Each one-hour time bin counts the number of failed login attempts per account and
IP address. Any account that exceeds five failed attempts per hour is flagged for further
investigation, which aids in the team’s effective detection of brute-force attempts.
Electronics 2024, 13, 4546 The code for the Anomaly Detection Phase was enhanced by the addition of temporal
13 of 32
granularity and contextual parameters, which are intended to improve the precision and
relevance of detected anomalies. The bin(TimeGenerated, 1h) function was introduced in
the code, allowing for the aggregation of failed login attempts into specific time intervals.
the code, allowing for the aggregation of failed login attempts into specific time intervals.
This enables the detection of rapid, concentrated bursts of activity that are characteristic
This enables the detection of rapid, concentrated bursts of activity that are characteristic
ofbrute-force
of brute-force or or credential-stuffing
credential-stuffingattacks.
attacks.It Itis is
allowed
allowed forfor
forensic teams
forensic to identify
teams not
to identify
just overall patterns of suspicious behavior but also time-bound anomalies
not just overall patterns of suspicious behavior but also time-bound anomalies that may that may indi-
cate an ongoing
indicate an ongoing attack. Additionally,
attack. it hasitbeen
Additionally, has observed that filtering
been observed by specific
that filtering thresh-
by specific
thresholds (e.g., more than five failed attempts per hour) results in noise reduction,awith
olds (e.g., more than five failed attempts per hour) results in noise reduction, with focusa
placed on significant deviations from baseline behavior. This query
focus placed on significant deviations from baseline behavior. This query is made more is made more effective
for uncovering
effective sophisticated,
for uncovering time-sensitive
sophisticated, threats, such
time-sensitive as targeted
threats, such account compromise
as targeted account
attempts, and
compromise actionable
attempts, andinsights are insights
actionable providedare forprovided
deeper investigation and automated
for deeper investigation and
alerting. alerting.
automated
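The binning-and-threshold logic described above can be sketched outside Sentinel. The following Python snippet is only an illustration of what the query computes, not part of the team's tooling: the sample records, the account names, and the function name flag_bruteforce are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical failed-logon records (EventID 4625): (account, ip, timestamp).
# "alice" fails seven times between 09:00 and 09:36; "bob" fails once per hour.
FAILED_LOGINS = [
    ("alice", "203.0.113.7", datetime(2024, 11, 14, 9, minute))
    for minute in range(0, 42, 6)
]
FAILED_LOGINS += [
    ("bob", "198.51.100.4", datetime(2024, 11, 14, 9, 5)),
    ("bob", "198.51.100.4", datetime(2024, 11, 14, 10, 5)),
]


def flag_bruteforce(events, threshold=5):
    """Count failures per (account, ip, one-hour bin); keep bins above the threshold."""
    counts = Counter(
        # Truncating to the hour plays the role of bin(TimeGenerated, 1h).
        (account, ip, ts.replace(minute=0, second=0, microsecond=0))
        for account, ip, ts in events
    )
    return {key: n for key, n in counts.items() if n > threshold}


flagged = flag_bruteforce(FAILED_LOGINS)
for (account, ip, hour_bin), n in flagged.items():
    print(account, ip, hour_bin.isoformat(), n)
```

Because the bins are fixed to clock hours, a burst that straddles an hour boundary is split across two bins and may stay under the threshold in both; a sliding window would be one way to address this if it matters for a given investigation.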
Logins from various geographic locations within a brief time frame are considered a strong indicator of account compromise. This query identifies accounts logging in from regions not commonly associated with the user’s activity. An example of a KQL query is presented below:

SigninLogs
| summarize LoginLocations = makeset(Location) by UserPrincipalName,
bin(TimeGenerated, 1h)
| where array_length(LoginLocations) > 1

Login events were grouped by user and time bin, followed by creating a list of unique login locations for each hour. If a user logged in from multiple locations within a one-hour timeframe, the account was flagged. This method highlights impossible or suspicious travel patterns, revealing the potential for account takeovers.
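The grouping step described above amounts to collecting the set of distinct locations per user and hour. A minimal Python sketch of that idea follows; the user names, location codes, and the function name multi_location_logins are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sign-in records: (user, location, timestamp).
SIGNINS = [
    ("user_a", "HR", datetime(2024, 11, 14, 8, 5)),
    ("user_a", "US", datetime(2024, 11, 14, 8, 40)),
    ("user_b", "HR", datetime(2024, 11, 14, 8, 10)),
    ("user_b", "HR", datetime(2024, 11, 14, 8, 50)),
]


def multi_location_logins(events):
    """Return {(user, hour_bin): sorted locations} for bins with more than one location."""
    locations = defaultdict(set)
    for user, location, ts in events:
        hour_bin = ts.replace(minute=0, second=0, microsecond=0)
        locations[(user, hour_bin)].add(location)  # a set keeps unique values only
    return {key: sorted(locs) for key, locs in locations.items() if len(locs) > 1}


suspicious = multi_location_logins(SIGNINS)
for (user, hour_bin), locs in suspicious.items():
    print(user, hour_bin.isoformat(), locs)
```

In this sample only user_a is reported, since user_b signs in twice from the same location. Note that this check treats any two locations in the same hour as suspicious; it does not estimate travel distance or speed.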
Unexpected additions to privileged groups, such as “Domain Admins” or “Enterprise Admins,” indicate attempts at privilege escalation. The team created a query to monitor modifications within the group, specifically focusing on tracking new accounts added to these high-privilege groups. An example of a KQL query is presented as follows:

AuditLogs
| where OperationName == "Add member to role"
| where TargetResource in ("Domain Admins", "Enterprise Admins", "Cert Publishers")
| summarize count() by TargetAccount, RoleName, TimeGenerated

This query filters operations involving adding members to specific high-privilege roles. Each flagged account was scrutinized for signs of privilege escalation or unauthorized access.
The team established baselines for each account’s typical login patterns to capture anomalous behavior. Standard login times, locations, and access frequency for each user were identified by examining logins over two weeks. A review was initiated for deviations from this baseline. An example of a KQL query is presented below:
SigninLogs
| where TimeGenerated > ago(14d)
| summarize AvgLoginTime = avg(datetime_part("Hour", TimeGenerated)) by
UserPrincipalName
| join kind=inner (SigninLogs | where TimeGenerated > ago(1d)) on UserPrincipalName
| where datetime_part("Hour", TimeGenerated) > AvgLoginTime + 3 or
datetime_part("Hour", TimeGenerated) < AvgLoginTime - 3

Each user’s average login time was established over 14 days, and recent logins (within the past day) were compared against this baseline. Logins recorded more than three hours outside the user’s typical range were flagged as anomalies:

UserPrincipalName       AvgLoginTime   LoginTime (Hour)   TimeGenerated
[email protected]         10             3                  2024-11-14T03:15:00Z
john.doe@company.com    15             22                 2024-11-14T22:30:00Z
[email protected]         9              1                  2024-11-14T01:10:00Z

The login behaviors were analyzed by comparing recent login times to the average login time of each user over the past 14 days. Logins that occurred significantly outside the user’s typical time frame were identified, utilizing a deviation threshold of three hours. For example, it was noted that [email protected] logged in at 3:15 a.m., which was significantly earlier than the average login time of 10:00 a.m., suggesting that unauthorized access or unusual activity may have occurred. It was observed that john.doe@company.com logged in at 10:30 p.m., which deviated from the average of 3:00 p.m., while [email protected] logged in at 1:10 a.m., which was outside the normal range of activity for this high-value account. Further investigation is warranted for these anomalies to validate the legitimacy of the logins and assess potential threats. Focusing on deviations in login patterns enhances anomaly detection by uncovering behavioral outliers that could signify account compromise or insider threats.
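The baseline comparison can be illustrated with a short Python sketch that averages each user's login hour and flags recent logins more than three hours away from it. The data, the user name, and the function name login_hour_anomalies are assumptions made for the example; the team's actual implementation is the KQL query shown earlier.

```python
from datetime import datetime
from statistics import mean

# Hypothetical 14-day baseline (shortened to three logins) and recent logins.
BASELINE = {
    "user_a": [
        datetime(2024, 11, 1, 9, 0),
        datetime(2024, 11, 2, 10, 0),
        datetime(2024, 11, 3, 11, 0),
    ]
}
RECENT = [
    ("user_a", datetime(2024, 11, 14, 3, 15)),   # far from the usual hour
    ("user_a", datetime(2024, 11, 14, 11, 0)),   # within the usual range
]


def login_hour_anomalies(baseline, recent, tolerance_hours=3):
    """Return (user, average_hour, login_hour) for logins outside the tolerance."""
    avg_hour = {user: mean(ts.hour for ts in stamps) for user, stamps in baseline.items()}
    return [
        (user, avg_hour[user], ts.hour)
        for user, ts in recent
        if user in avg_hour and abs(ts.hour - avg_hour[user]) > tolerance_hours
    ]


anomalies = login_hour_anomalies(BASELINE, RECENT)
print(anomalies)
```

One limitation of this approach, which applies equally to any plain average of clock hours, is that it ignores wraparound at midnight: logins at 23:00 and 01:00 average to 12:00. For users who regularly work around midnight, a circular mean or a per-hour frequency profile would likely give a more meaningful baseline.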
By combining threshold-based queries with behavioral baselines, the forensic team effectively narrowed down potential indicators of compromise. These methods highlighted critical points for investigation and provided a reproducible framework for detecting anomalies tied to account takeover, brute-force attacks, and privilege escalation.
7.1.4. Phishing Email Analysis

The initial point of compromise in the ransomware attack was identified, leading the forensic team to investigate the presence of phishing emails, which are frequently utilized to gain unauthorized access. Several steps were involved in this analysis, beginning with the detection of phishing emails and then the attacker’s use of stolen credentials.
The phishing email was identified through a thorough examination of email logs within Microsoft 365 conducted by the team. Potential phishing messages were narrowed down by filtering for emails that contained known malicious indicators, such as links or attachments that threat intelligence feeds had previously flagged. An example KQL query for Microsoft 365 Defender is provided below:

EmailEvents
| where ThreatTypes has "Phish"
| where AttachmentType in ("zip", "exe", "docm") or UrlThreatCount > 0
| project Timestamp, SenderFromAddress, RecipientEmailAddress, Urls, AttachmentType

This query was designed to identify emails classified as phishing attempts, with a particular emphasis on those that contained high-risk attachments or malicious URLs. Once identified, emails, particularly those sent to users in sensitive roles, were flagged for further inspection:

Timestamp              SenderFromAddress        RecipientEmailAddress   Urls                           AttachmentType
2024-11-14T08:30:00Z   attacker@malicious.com   [email protected]         ["http://malicious-url.com"]   zip
2024-11-14T09:15:00Z   spammer@fraudulent.net   john.doe@company.com    []                             exe
2024-11-14T10:00:00Z   hacker@phishing.org      [email protected]         ["http://phish-site.org"]      docm

Phishing emails flagged with “Phish” in their threat types were identified by the query, with a focus on those containing high-risk attachments such as .zip, .exe, or .docm, or malicious URLs. An email was received by [email protected] from attacker@malicious.com containing a .zip file and a suspicious URL, while john.doe@company.com was targeted by spammer@fraudulent.net with an executable attachment (.exe). A phishing email was received by [email protected] from hacker@phishing.org, which contained a .docm file and a phishing URL. Potential threats were highlighted by these findings, warranting immediate actions such as the blocking of sender domains, the quarantining of emails, and the alerting of recipients to mitigate risks like credential theft or malware infections.
Following the identification of the phishing email, the subsequent step was to trace the utilization of stolen credentials through this initial compromise. Login events associated with the user who received the phishing email were examined, focusing on tracking IP addresses, locations, and times of login attempts. An example of a KQL query that was utilized for tracking login patterns is presented below:
SigninLogs
| where UserPrincipalName == "<compromised_user>"
| summarize LoginCount = count() by Location, IPAddress, bin(TimeGenerated, 1h)
| where LoginCount > 1

The login attempts associated with the compromised account were isolated and grouped by location and IP address. Following the phishing compromise, anomalous login locations or frequencies revealed unauthorized access patterns.
The team traced lateral movements within the network, along with evidence of compromised credentials, by monitoring abnormal access or privilege escalation activities associated with the compromised account. For instance, any unusual additions to privileged groups by this account were identified as indicators of further compromise. An example KQL query for changes in privileged groups by a compromised account is presented below:
AuditLogs
| where OperationName == "Add member to role"
| where Initiator == "<compromised_user>"
| summarize ActionsCount = count() by TargetResource, RoleName, TimeGenerated

This query explicitly tracks privilege changes initiated by the compromised user. The addition of accounts to high-level roles, such as administrative groups, was flagged, as it may indicate attempts by an attacker to escalate privileges.
To strengthen the analysis, the email metadata and headers were examined to verify the origin and authenticity of the phishing email. Analyzing fields such as Received, Return-Path, and Message-ID allowed the team to confirm the email’s external origin and identify any spoofing attempts. This examination validated that the compromised account access was related to the phishing vector.
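As an illustration of the header checks mentioned above, the following Python sketch parses a raw message with the standard library and compares the domains found in the From, Return-Path, and Message-ID fields. The sample message, its addresses, and the function names are hypothetical; the article does not state which tooling the team used for this step.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical raw message; all addresses use reserved example domains.
RAW_EMAIL = """\
Received: from mail.malicious.example (mail.malicious.example [203.0.113.9])
\tby mx.company.example; Thu, 14 Nov 2024 08:30:00 +0000
Return-Path: <bounce@malicious.example>
From: "IT Support" <support@company.example>
Message-ID: <abc123@malicious.example>
Subject: Password reset required

Please open the attached file.
"""


def domain_of(address):
    """Return the lower-cased domain part of an address header value."""
    return parseaddr(address)[1].rpartition("@")[2].lower()


def header_findings(raw):
    """Summarize origin-related headers; assumes all three fields are present."""
    msg = message_from_string(raw)
    from_domain = domain_of(msg["From"])
    return_domain = domain_of(msg["Return-Path"])
    msgid_domain = msg["Message-ID"].strip().strip("<>").rpartition("@")[2].lower()
    return {
        "from_domain": from_domain,
        "return_path_domain": return_domain,
        "message_id_domain": msgid_domain,
        "received_hops": len(msg.get_all("Received", [])),
        # A mismatch is a hint worth reviewing, not proof of spoofing.
        "possible_spoofing": from_domain != return_domain,
    }


findings = header_findings(RAW_EMAIL)
print(findings)
```

A differing Return-Path domain is common for legitimate bulk-mail services as well, so a result like this would normally be weighed together with the Received chain and any available authentication results rather than treated as conclusive on its own.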

7.1.5. Reproducible Timeline or Workflow


The forensic team documented a detailed timeline and workflow outlining the entire
process, from data collection to incident confirmation, to ensure the investigation could be
replicated and validated. The structured approach facilitated explicit event tracking and
allowed other investigators to follow the same methodology in similar breach scenarios.
The investigation was initiated by configuring Azure Sentinel to ingest data from
various sources, including Azure Activity Logs, Sign-in Logs, security events, and custom
logs from relevant virtual machines. Data ingestion automation was implemented to
capture new logs in real time, ensuring that all pertinent activity was documented as the
investigation progressed. (It should be confirmed that all data sources were connected and
actively feeding into Azure Sentinel before the initiation of any queries. The comprehensive
visibility of the environment was ensured from the outset.)
With data in place, the team executed targeted KQL queries to identify specific indicators
of compromise. Queries focused on failed login attempts, privilege changes, and unusual
login locations. The detection of each anomaly was accompanied by timestamping and
logging, which formed the initial indicators of potential compromise. (Flagged anomalies,
including suspicious logins or privilege escalations, should be cross-referenced with known
attack patterns or indicators of compromise (IoCs) to ensure alignment with the case-specific
indicators identified in the data.)

The origin of unusual login activities was traced back to a phishing email by analyzing
Microsoft 365 email logs. The metadata and header information of the phishing email
were examined to validate its role in the attack. Upon identification, the team tracked the
use of stolen credentials within the compromised account, with monitoring conducted
for signs of lateral movement and privilege escalation. (It should be verified that the
compromised account demonstrated a precise sequence of events originating from the
phishing email. The phishing email was established as the initial breach point, and the
sequence of unauthorized access attempts was validated.)
Access to high-value resources was subsequently gained using the compromised
credentials. Changes to privileged groups and lateral movements across systems were
monitored, allowing for the establishment of how the attacker advanced within the network.
(It is to be confirmed that changes in group memberships or resource access were directly
correlated with the activities of the compromised account. The link between the initial
phishing compromise and subsequent unauthorized access attempts was reinforced.)
The forensic team compiled a timeline to document the attack sequence clearly, which
included each major event: initial phishing email compromise, first unauthorized login,
privilege escalations, and any lateral movements. The timeline was composed of timestamps,
log sources, query outputs, and validation points for each phase. (The timeline
will be subjected to a final review, incorporating cross-validated findings from all queries
and log sources. It should be ensured that the collected log data or query results directly
support each event in the timeline.)
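Compiling such a timeline is essentially a merge of events from several log sources ordered by timestamp. The sketch below shows one way this could look in Python; apart from the 08:30 delivery time taken from the phishing table, the timestamps, descriptions, and helper names are hypothetical.

```python
from datetime import datetime, timezone


def utc(value):
    """Parse an ISO 8601 string (without offset) and mark it as UTC."""
    return datetime.fromisoformat(value).replace(tzinfo=timezone.utc)


# Each event is (timestamp, log source, description); values are illustrative.
EMAIL_EVENTS = [(utc("2024-11-14T08:30:00"), "EmailEvents", "Phishing email delivered")]
SIGNIN_EVENTS = [(utc("2024-11-14T09:10:00"), "SigninLogs", "First unauthorized login")]
AUDIT_EVENTS = [(utc("2024-11-14T09:45:00"), "AuditLogs", "Account added to privileged role")]


def build_timeline(*sources):
    """Merge events from all sources and sort them chronologically."""
    events = [event for source in sources for event in source]
    return sorted(events, key=lambda event: event[0])


timeline = build_timeline(AUDIT_EVENTS, SIGNIN_EVENTS, EMAIL_EVENTS)
for ts, source, description in timeline:
    print(ts.isoformat(), source, description)
```

Keeping the log source next to each entry makes it straightforward to trace every line of the timeline back to the query output that supports it, which is the purpose of the validation points described above.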
The investigation was organized into a reproducible timeline with defined validation
points, ensuring that each phase was thoroughly documented and could be independently
verified by the team. This structured workflow facilitated a transparent forensic investigation,
allowing for precise evidence tracking from the initial compromise to the full breach
confirmation, thereby providing a robust framework for future incident response efforts.

7.1.6. Conclusions Based on Findings


Upon completing the investigation, the forensic team reached conclusions based on the
evidence gathered, corroborated findings, and observed indicators of compromise (IoCs).
The systematic approach employed throughout the investigation facilitated the connection
of each anomaly with the attack timeline, establishing a transparent and verifiable narrative
of the incident.
Each identified event—from the initial compromise of the phishing email to privilege
escalations and lateral movements—was cross-referenced with relevant log data to ensure
consistency across sources. For instance, the phishing email identified in Microsoft 365
was found to be directly linked to subsequent unauthorized login attempts that Azure
Sentinel’s Sign-in Logs flagged. A clear cause-and-effect relationship was established by
this correlation, with the phishing email being identified as the point of initial compromise.
Compromised credentials obtained through the phishing email were traced to privilege
escalations, during which the attacker added accounts to high-privilege groups. Sequential
analysis confirmed that the attacker leveraged stolen credentials to gain additional access,
resulting in a coherent narrative of the attack progression.
Focusing on specific anomalies could help detect similar phishing-based compromises
early. The team’s effective tracing of unauthorized access patterns was facilitated by
monitoring email logs for flagged phishing attempts and behavioral analysis of account
activity. Other organizations can replicate this approach by implementing similar logging
and query strategies, particularly identifying deviations from user baselines and tracking
changes to high-privilege accounts.
The team concluded that a proactive approach to anomaly detection is recommended,
emphasizing the importance of continuous monitoring and data correlation across Microsoft 365
and Azure Sentinel. Establishing behavioral baselines for high-value accounts
and setting alert thresholds for privilege changes were highlighted as critical early breach
detection and containment strategies.

The team proposed further refinement of the existing detection rules and thresholds
to improve incident response readiness, emphasizing monitoring abnormal login
attempts, multi-region access within short intervals, and privilege changes. Furthermore,
integrating threat intelligence feeds was advised to identify known phishing domains or IP
addresses associated with malicious activities, which could enhance the early detection of
phishing attacks.
The team recommended implementing similar workflows and validation steps as
standard operating procedures in future incidents, enhancing repeatability and ensuring a
structured, comprehensive response.
After this attack, an overview of the incident was initiated, and the disruption caused
by the ransomware was highlighted. Key phases were progressed through, including data
collection, in-depth Azure tool analysis, and identifying the initial attack vector via phishing.
The containment and mitigation efforts are detailed in subsequent sections, including
resetting compromised credentials and implementing just-in-time VM access. The outcomes
and recommendations are highlighted in the flowchart, which includes enhancements in
security training and the integration of advanced threat protection features. Proactive
monitoring and regular security assessments are underscored to fortify the organization
against future threats. This visual representation effectively encapsulates the entire process,
which provides a clear roadmap of the actions taken and the lessons learned from the
cybersecurity incident.
The initial attack vector was ascertained, and the subsequent maneuvers executed
by the attackers were analyzed. The extent of the impact on the Azure-hosted resources
was assessed, and recommendations to address vulnerabilities and enhance the security
framework were provided. The goals aimed to provide a comprehensive understanding of
the attack’s penetration and progression, evaluate the damage inflicted on the cloud infrastructure,
and develop robust measures to bolster the enterprise’s cybersecurity defenses
against future threats.
A systematic approach was employed in the forensic investigation, utilizing Azure’s
suite of forensic tools and encompassing several vital phases. Initially, Azure Activity Logs
were gathered to trace the operations performed by the attackers. Detailed telemetry data
indicative of potential malicious activities was captured using Azure Monitor Logs, and
snapshots of affected virtual machines were taken to preserve their states during the attack
for further offline analysis. Log data were consolidated and analyzed using Azure Sentinel
during the analysis phase. The team employed KQL to identify anomalies, including
irregular login attempts and unexpected external connections. A comprehensive timeline
of events was developed to delineate the sequence of attacker activities, from the initial
breach through network lateral movements to the deployment of ransomware.
The forensic analysis indicated that the initial breach was facilitated by a phishing
email that compromised an employee’s credentials, which were subsequently exploited due
to inadequate conditional access policies to obtain elevated privileges. Immediate strategies
for containment and mitigation were recommended, including resetting compromised
credentials, enforcing multi-factor authentication for all users, and updating firewall rules
to limit unusual outbound traffic. Furthermore, it was advised that Azure Security Center’s
just-in-time VM access feature be implemented to minimize the attack surface by strictly
limiting VM access to necessary instances.
The investigation determined that the root cause of the breach was a combination of
social engineering through phishing and misconfigured security controls. Due to rapid
detection and responsive measures, the impact of the ransomware was confined to a
limited portion of the company’s Azure environment. To prevent future incidents, several
recommendations were made, including enhanced security training for employees, routine
audits of Azure configurations, and the integration of sophisticated threat protection
features within Azure. These measures aim to strengthen the organization’s security
posture and reduce the likelihood of similar breaches in the future.
Electronics 2024, 13, 4546 18 of 32

The investigation underscored the importance of proactive monitoring and anomaly
detection using Azure tools, which are essential for the early identification of potential
threats. Additionally, it highlighted the need for regular security posture assessments using
the Azure Security Benchmark to ensure compliance with optimal security practices across
all Azure resources. These lessons learned emphasize the necessity of maintaining vigilance
and adhering to best practices to safeguard against future cyber threats effectively.

7.2. Enhancing Azure Forensics with Artificial Intelligence


Organizations’ migration of critical infrastructure to the cloud has resulted in an
exponential increase in the complexity and volume of security data, necessitating a more
sophisticated approach to forensic analysis. It has been observed that traditional rule-
based security mechanisms frequently encounter challenges in adapting to the dynamic
and evolving nature of cyber threats within cloud environments, where subtle behavioral
changes and low-frequency anomalies may serve as indicators of sophisticated attacks.
Artificial intelligence and machine learning, which are recognized as essential tools for
modern forensic investigations, address these challenges by enabling security teams to
detect and respond to incidents in near real time with increased accuracy.
The cloud-native Security Information and Event Management and Security Orchestra-
tion, Automation, and Response platform, Azure Sentinel, is enhanced through AI-driven
capabilities, improving the forensic workflow. Integrating data from multiple sources, ap-
plying machine learning models for adaptive anomaly detection, and leveraging advanced
analytics through Azure Machine Learning (Azure ML) Studio result in a comprehensive
and proactive approach to threat detection and incident response by Azure Sentinel.
The application of AI-enhanced forensic capabilities is demonstrated by present-
ing a case study involving a sophisticated security incident within a financial organiza-
tion. In this scenario, an attacker gained unauthorized access to a high-value account
([email protected]) through a phishing email. Subsequently, suspicious activ-
ities were observed, including anomalous logins from unusual locations and attempted
privilege escalation.
The forensic investigation was conducted across three primary stages, with AI-driven
tools in Azure Sentinel being utilized to detect, analyze, and respond to the incident.
• Data Collection and Aggregation: The investigation was initiated through the col-
lection of data from various sources, which included Azure Activity Logs, Azure
Monitor, and Microsoft 365 logs. These sources provide detailed records of login
attempts, access control changes, and system performance metrics. The centralization
of these data in Azure Sentinel resulted in a unified view of the environment for the
forensic team, which facilitated the identification of suspicious patterns that may have
remained undetected in isolated data silos.
• Machine Learning Models for Anomaly Detection: Machine learning models were applied
to the data aggregated in Azure Sentinel to establish behavioral baselines for the
compromised account and other critical entities. Sentinel’s built-in anomaly detection models
were designed to identify deviations in login times, geographic locations, and access
frequency, flagging unusual activities that may indicate potential unauthorized access.
In this case, the anomaly detection models quickly highlighted the unusual login
behavior and privilege escalation attempts associated with the compromised account.
• Advanced Anomaly Detection with Azure Machine Learning Studio: The forensic team
developed a custom anomaly detection model in Azure Machine Learning Studio to
further enhance detection accuracy. The model, based on Isolation Forests, was tailored
to detect low-frequency anomalies specific to the organization’s operations. It was
deployed as a real-time scoring endpoint, allowing Azure Sentinel to continuously score
new login events for potential compromise and thereby enhancing the team’s detection
of complex or previously unknown attack patterns.
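To make the Isolation Forest idea concrete, the following is a minimal, self-contained Python sketch of the algorithm applied to login events. It is not the team's model and does not use any Azure Machine Learning API; the two features (login hour and distance from the usual location in kilometres), the sample data, and all function names are assumptions made for illustration.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively isolate points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    varying = [f for f in range(len(points[0]))
               if min(p[f] for p in points) != max(p[f] for p in points)]
    if not varying:
        return {"size": len(points)}
    feature = rng.choice(varying)
    low = min(p[feature] for p in points)
    high = max(p[feature] for p in points)
    split = rng.uniform(low, high)
    left = [p for p in points if p[feature] < split]
    right = [p for p in points if p[feature] >= split]
    return {"feature": feature, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}


def path_length(tree, point, depth=0):
    """Depth at which a point lands, adjusted for the unsplit leaf size."""
    if "feature" not in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["feature"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Return one score in (0, 1) per point; higher means easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2 ** (-sum(path_length(t, p) for t in trees) / (n_trees * norm))
            for p in points]


# Hypothetical features: (login hour, distance from usual location in km).
normal = [(8 + i % 10, 3.0 * (i % 7)) for i in range(60)]
logins = normal + [(3, 8000.0)]  # a 3 a.m. login from far away
scores = anomaly_scores(logins)
suspect = max(range(len(logins)), key=scores.__getitem__)
print("most anomalous login:", logins[suspect])
```

Points that random splits separate after only a few steps receive scores closer to 1, which is why rare, low-frequency events stand out without any labeled training data.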
This case study explores Azure Sentinel’s AI capabilities, highlighting the empowerment
of forensic teams to adapt to evolving threats, respond to incidents with agility, and
improve the overall security posture of cloud-centric environments. This process
establishes the critical role of each component of the AI-enhanced forensic workflow: data
collection, anomaly detection, and custom model deployment. Actionable insights and
automated responses were provided to forensic investigators, significantly reducing the
time required to detect and mitigate security incidents.
As demonstrated in this case study, this structured and adaptive approach enables
organizations to leverage Azure Sentinel and Azure Machine Learning Studio to transform
forensic practices from reactive to proactive. A resilient and scalable defense framework was
built, which can address sophisticated threats in today’s cyber landscape. Azure Sentinel
and Azure Machine Learning Studio are fundamental parts of the Azure AI-enhanced
cyber forensic investigation in Azure, as shown in Figure 3:

[Figure 3 diagram: Azure Activity Logs and Azure Monitor feed Data Factory, which feeds
ML Learning Studio (Unsupervised Learning, Supervised Learning); the results flow to
Azure Security Center and Azure Sentinel.]

Figure 3. AI-enhanced cyber forensic investigation in Azure.

The process was initiated by collecting data from Azure Activity Logs and Azure
Monitor, which record detailed information regarding user and system activities. The data
were aggregated and processed through Azure Data Factory, a pre-processing step
that prepares the data for further analysis. The data were then directed to Azure Machine
Learning Studio, where they were subjected to unsupervised and supervised learning
techniques. Unsupervised learning algorithms detected patterns and anomalies without
prior data labeling. In contrast, supervised learning models classified these anomalies into
predefined categories, such as potential security threats. The outcomes of these analyses
were subsequently integrated into Azure Security Center and Azure Sentinel. Azure
Security Center utilized the information to enhance threat protection and improve security
management across Azure services. The refined data were used by Azure Sentinel, a
cloud-native SIEM system, to monitor, detect, and respond to threats in real time, thereby
ensuring a comprehensive and proactive cybersecurity posture. The detection and response
to potential cyber threats are streamlined through this integrated approach, while resource
allocation and strategic focus within the firm’s cybersecurity operations are also optimized.

7.2.1. Data Collection Procedures


Effective forensic analysis within cloud environments begins with the systematic collection
and aggregation of data from diverse sources. Azure Sentinel centralizes security-related
telemetry by aggregating logs from various cloud-native and hybrid resources, including
Azure Activity Logs, Azure Monitor, Microsoft 365, and other critical sources. The
centralization of data collection is essential for providing a holistic view of user activities,
system events, and network interactions, enabling a more complete and accurate
forensic investigation.
Azure Activity Logs provide records of all changes to Azure resources, including
administrative actions, access modifications, and resource creation or deletion events.
These logs are critical for tracking configuration changes, identifying unauthorized
administrative actions, and pinpointing potential attack vectors within the Azure
environment. For example, an attempt to escalate privileges by modifying role assignments
can be traced through the operation records in Azure Activity Logs. An example KQL query
is provided for the identification of suspicious role assignment changes:

AzureActivity
| where OperationNameValue == "MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/WRITE"
| where ActivityStatusValue == "Succeeded"
| where Caller != "[email protected]" // Exclude known authorized users
| project TimeGenerated, Caller, ResourceId, OperationNameValue

This query isolates successful role assignment changes, filtering out authorized users.
It is beneficial for detecting unauthorized role modifications that may indicate privilege
escalation attempts:

TimeGenerated         Caller                       ResourceId                     OperationNameValue
2024-11-14T09:45:00Z  attacker@unknown.com         /subscriptions/resourceGroup1  MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/WRITE
2024-11-14T10:30:00Z  suspicious_user@example.net  /subscriptions/resourceGroup2  MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/WRITE
2024-11-14T11:15:00Z  external_admin@phishing.com  /subscriptions/resourceGroup3  MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/WRITE

Unauthorized role assignment changes in Azure resources were detected by filtering
for successful “MICROSOFT.AUTHORIZATION/ROLEASSIGNMENTS/WRITE” operations
that were not performed by known authorized users. Potential privilege escalation
activities were highlighted by the results, including modifications of roles in
resourceGroup1 by attacker@unknown.com, changes made in resourceGroup2 by
suspicious_user@example.net, and updates to roles in resourceGroup3 by
external_admin@phishing.com. These unauthorized actions may indicate that accounts have
been compromised or that malicious activity has occurred, necessitating immediate
investigation to validate the legitimacy of the role assignments and revoke unauthorized
changes. The query pinpoints critical security risks and supports the forensic workflow
during the Identification Phase by flagging anomalous privilege changes for further
investigation and mitigation.
Azure Monitor provides a comprehensive suite of logs capturing resource performance
metrics, diagnostic data, and health status across Azure resources. For forensic purposes,
Azure Monitor can be configured to capture specific events, including high CPU usage,
network traffic spikes, or unauthorized access attempts on virtual machines (VMs), which
may indicate suspicious activity. For instance, if an attacker attempts to exfiltrate data,
abnormal network traffic can be identified by analyzing the Network Security Group (NSG)
flow logs that Azure Monitor provides. An example KQL query for the detection of abnormal
network traffic through Azure Monitor is provided below:

AzureNetworkAnalytics_CL
| where ResourceType == "NetworkSecurityGroupFlowEvent"
| where Direction == "Inbound" and Action == "Deny"
| summarize TrafficCount = count() by RemoteIP, bin(TimeGenerated, 1h)
| where TrafficCount > 100

This query detects inbound denied traffic from external IP addresses, which may
suggest attempted reconnaissance or access from suspicious sources. Unusual traffic
patterns were identified, allowing the forensic team to focus on high-risk IPs and
potentially compromised resources:

RemoteIP TimeGenerated TrafficCount


203.0.113.42 2024-11-14T10:00:00Z 150
198.51.100.77 2024-11-14T11:00:00Z 200
192.0.2.24 2024-11-14T12:00:00Z 120

High volumes of denied inbound traffic targeting Azure resources were identified by
this query, with a focus on remote IPs that generated over 100 denied requests within
one-hour intervals. The results highlight potential reconnaissance or attack attempts, with
150 denied requests generated by 203.0.113.42 at 10:00 a.m., 200 by 198.51.100.77 at
11:00 a.m., and 120 by 192.0.2.24 at 12:00 p.m. This pattern suggests that scanning
activities or unauthorized access attempts were blocked by network security rules.
Immediate actions were taken, including the investigation of flagged IPs for malicious
activity, the enhancement of network security rules, and the monitoring of any associated
suspicious behavior. The Identification Phase is supported by this analysis, which allows
for the detection of potential external threats and the mitigation of their impact before
successful intrusions can escalate.
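The hourly thresholding described above can also be reproduced on exported flow records. The Python sketch below is illustrative only; the record layout, field names, and the `flag_noisy_sources` helper are hypothetical, and the sample events are synthetic.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def flag_noisy_sources(flow_events, threshold=100):
    """Count denied inbound flows per (remote IP, hour) and keep the heavy hitters."""
    counts = Counter()
    for event in flow_events:
        if event["direction"] == "Inbound" and event["action"] == "Deny":
            # Truncate the timestamp to the start of its one-hour bin.
            hour = event["time"].replace(minute=0, second=0, microsecond=0)
            counts[(event["remote_ip"], hour)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


base = datetime(2024, 11, 14, 10, 0, tzinfo=timezone.utc)


def make_events(ip, action, count):
    """Generate synthetic inbound flow records spaced 20 seconds apart."""
    return [{"remote_ip": ip, "direction": "Inbound", "action": action,
             "time": base + timedelta(seconds=20 * i)} for i in range(count)]


events = (make_events("203.0.113.42", "Deny", 150)     # above the threshold
          + make_events("192.0.2.24", "Deny", 40)      # below the threshold
          + make_events("198.51.100.77", "Allow", 170))  # allowed traffic is ignored

flagged = flag_noisy_sources(events)
for (ip, hour), n in flagged.items():
    print(ip, hour.isoformat(), n)
```

The threshold of 100 mirrors the query; in practice it would need tuning against the normal volume of denied traffic for the environment.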
User activities within the Microsoft 365 suite, including email access, file downloads,
and login attempts, are captured by Microsoft 365 logs. These logs are considered
invaluable in forensic investigations, particularly in cases that involve phishing or
unauthorized file access. For instance, it can be revealed through Microsoft 365 logs
whether a compromised account was utilized to send phishing emails or access sensitive
files, thereby assisting in tracing the attacker’s movements and potential data exposure
points. An example KQL query for phishing detection in Microsoft 365 logs is provided below:

EmailEvents
| where ThreatTypes has "Phish"
| project Timestamp, SenderFromAddress, RecipientEmailAddress, Urls, AttachmentType

This query filters emails flagged as phishing attempts. It displays sender and recipient
information and associated URLs or attachments. Identifying the initial phishing vector in
a compromise and assessing potential exposure are essential.
The integration of on-premises security logs, including Active Directory (AD) logs
and endpoint protection data, is supported by Azure Sentinel, thereby facilitating a hybrid
view of security events. The value of this capability is particularly noted in environments
where Azure Sentinel monitors both cloud-based and on-premises resources. Forensic
investigators utilize security events from Active Directory to track user authentication
attempts, account lockouts, and changes to group memberships. These logs facilitate
the detection of lateral movement or privilege escalation across cloud and on-premises
resources. An example KQL query for the detection of suspicious logins in the Active
Directory is given below:

SecurityEvent
| where EventID == 4625 // Event ID 4625 indicates failed login attempts
| where LogonType == 10 // LogonType 10 signifies a remote interactive login
| project TimeGenerated, Account, IpAddress, LogonType, Status

This query captures failed remote login attempts, often associated with brute-force
attacks or credential-stuffing attempts on accounts accessible from external networks. Such
data facilitate the identification of external threats attempting to access internal resources.
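Once such events are exported, grouping them by source address helps separate repeated attempts against one account from attempts spread across many accounts. The Python sketch below is a simplified illustration; the record keys, the threshold of ten attempts, and the pattern labels are assumptions, and the labels are a triage heuristic rather than a definitive classification.

```python
from collections import defaultdict


def summarize_failed_logons(events, min_attempts=10):
    """Group failed remote interactive logons by source IP.

    Returns {ip: (attempts, distinct_accounts, pattern)} for noisy sources.
    """
    per_ip = defaultdict(lambda: {"attempts": 0, "accounts": set()})
    for event in events:
        # Event ID 4625 = failed logon; logon type 10 = remote interactive.
        if event["event_id"] == 4625 and event["logon_type"] == 10:
            entry = per_ip[event["ip"]]
            entry["attempts"] += 1
            entry["accounts"].add(event["account"])

    findings = {}
    for ip, entry in per_ip.items():
        if entry["attempts"] >= min_attempts:
            distinct = len(entry["accounts"])
            pattern = "single-account" if distinct == 1 else "multi-account"
            findings[ip] = (entry["attempts"], distinct, pattern)
    return findings


# Synthetic records for illustration.
events = (
    [{"event_id": 4625, "logon_type": 10, "account": "CORP\\alice", "ip": "198.51.100.7"}] * 12
    + [{"event_id": 4625, "logon_type": 10, "account": f"CORP\\user{i}", "ip": "203.0.113.9"}
       for i in range(15)]
    + [{"event_id": 4625, "logon_type": 10, "account": "CORP\\bob", "ip": "192.0.2.5"}] * 3
    + [{"event_id": 4624, "logon_type": 10, "account": "CORP\\carol", "ip": "198.51.100.7"}] * 20
)

findings = summarize_failed_logons(events)
for ip, (attempts, accounts, pattern) in findings.items():
    print(ip, attempts, accounts, pattern)
```

A single-account pattern is more consistent with brute forcing one credential, while a multi-account pattern is more consistent with credential stuffing; either finding still needs to be confirmed against other evidence.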
7.2.2. Data Transformation and Preparation Using Azure Data Factory
In AI-driven forensic investigations, data must undergo transformation and preparation
before machine learning models can use them effectively. Azure Data Factory (ADF)
provides a powerful, cloud-based ETL (extract, transform, load) service that automates
data workflows across multiple sources and environments. The orchestration of data
movement, transformation, and preparation through ADF allows forensic teams to optimize
data for analysis in Azure Machine Learning Studio. This section focuses on the support
provided by ADF for data readiness in machine learning through transforming, cleansing,
and structuring data, which is aimed at enhancing model accuracy and efficiency.
The preparation of data for forensic analysis and machine learning is often character-
ized by complexity, necessitating a variety of transformations for the standardization and
cleansing of data obtained from multiple sources. ADF addresses these challenges through
the following capabilities:
• ADF enables the seamless integration of data from cloud-based and on-premises
sources. Data extraction from services such as Azure SQL Database, Azure Blob
Storage, Microsoft 365 logs, and third-party security solutions is supported. Forensic
teams utilize this capability to consolidate data from all relevant sources into a single
pipeline for consistent processing.
• A range of data transformation functions, including data cleaning, aggregation, type
conversion, and data enrichment, is provided by ADF and is considered essential
for forensic analysis. Raw security logs are refined through these transformations,
resulting in a reduction in noise and an improvement in the signal-to-noise ratio. Field
formats can also be standardized through transformations, ensuring compatibility
with Azure ML Studio models.
• Scalability and Scheduling: It has been observed that data workflows within ADF
can be scheduled to execute at defined intervals or triggered in real time. The critical
nature of this scheduling capability in forensic analysis is highlighted by the necessity
for continuous data preparation, which ensures that machine learning models are
provided with the latest data. The scalability of ADF enables the processing of large
volumes of forensic data without compromising performance.
In the forensic workflow, a typical ADF pipeline consists of a series of transformation
steps that prepare data for use in anomaly detection models. The example presented here
is a pipeline that transforms login event data for a machine learning model that detects
abnormal login activities; it includes steps for data cleaning, normalization, feature
engineering, and storage in Azure Blob Storage for access by Azure ML Studio.
The extraction of login data from Azure Activity Logs, Microsoft 365, and external
security logs is initiated as the first step. The Copy Data activity of ADF is utilized to move
data from these sources into a staging area, typically in Azure Blob Storage. The capture of
fields, including UserPrincipalName, Timestamp, Location, and EventType, characterizes
the initial extraction. In the Data Cleaning step, invalid or irrelevant records, such as empty
fields or failed logs unrelated to security, are removed. Filters can be applied to ADF’s
data flows to exclude specific EventType values, ensuring that only relevant login events
are processed. Data normalization involves the standardization of time zones to UTC,
while the validation of IP addresses is conducted to ensure conformity to correct formats.
The normalization of these fields facilitates the alignment of data from different sources.
New features were developed to enable anomaly detection within Azure ML Studio. For
example, LoginHour is calculated from Timestamp, LoginFrequency is determined by
counting occurrences per user, and LocationCategory is derived based on the location field
(e.g., trusted or untrusted regions). The transformed data are subsequently loaded into
an Azure Blob Storage container. This structured data repository provides the input for
training and testing machine learning models in Azure ML Studio.
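The cleaning, normalization, and feature-engineering steps described above can be sketched in plain Python. This is an illustration of the logic only, not an ADF data flow definition; the record layout, the trusted-region names, and the `prepare_login_features` helper are hypothetical, and timestamps are assumed to be ISO 8601 strings that carry a UTC offset.

```python
import ipaddress
from collections import Counter
from datetime import datetime, timezone

LOGIN_EVENTS = {"LoginAttempt", "SignIn"}
TRUSTED_REGIONS = {"TrustedRegion1", "TrustedRegion2"}  # hypothetical allow-list


def is_valid_ip(value):
    """Return True when the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def prepare_login_features(records):
    cleaned = []
    for rec in records:
        # Cleaning: drop incomplete, irrelevant, or malformed records.
        if not rec.get("UserPrincipalName") or not rec.get("Timestamp"):
            continue
        if rec.get("EventType") not in LOGIN_EVENTS:
            continue
        if not is_valid_ip(rec.get("IPAddress", "")):
            continue
        # Normalization: convert offset-aware timestamps to UTC.
        ts_utc = datetime.fromisoformat(rec["Timestamp"]).astimezone(timezone.utc)
        cleaned.append((rec, ts_utc))

    # Feature engineering: daily login count per user, hour, location category.
    daily = Counter((rec["UserPrincipalName"], ts.date()) for rec, ts in cleaned)
    return [
        {
            "UserPrincipalName": rec["UserPrincipalName"],
            "TimestampUTC": ts.isoformat(),
            "LoginHour": ts.hour,
            "LoginFrequency": daily[(rec["UserPrincipalName"], ts.date())],
            "LocationCategory": "Trusted" if rec.get("Location") in TRUSTED_REGIONS
            else "Untrusted",
            "IPAddress": rec["IPAddress"],
        }
        for rec, ts in cleaned
    ]


sample = [
    {"UserPrincipalName": "alice", "Timestamp": "2024-11-14T23:30:00-05:00",
     "EventType": "SignIn", "Location": "TrustedRegion1", "IPAddress": "203.0.113.5"},
    {"UserPrincipalName": "alice", "Timestamp": "2024-11-15T09:00:00+01:00",
     "EventType": "LoginAttempt", "Location": "Elsewhere", "IPAddress": "198.51.100.2"},
    {"UserPrincipalName": "", "Timestamp": "2024-11-15T09:00:00+00:00",
     "EventType": "SignIn", "Location": "TrustedRegion1", "IPAddress": "192.0.2.1"},
    {"UserPrincipalName": "carol", "Timestamp": "2024-11-15T09:00:00+00:00",
     "EventType": "Heartbeat", "Location": "TrustedRegion1", "IPAddress": "192.0.2.2"},
    {"UserPrincipalName": "dave", "Timestamp": "2024-11-15T09:00:00+00:00",
     "EventType": "SignIn", "Location": "TrustedRegion2", "IPAddress": "999.1.1.1"},
]

features = prepare_login_features(sample)
for row in features:
    print(row)
```

Counting logins per UTC day after normalization keeps the frequency feature consistent for users whose events arrive from sources in different time zones.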
A transformation example is presented, wherein ADF’s data flow is utilized to
standardize timestamps, generate login frequency features, and categorize login locations:

// Sample transformation steps within ADF's Data Flow
let rawData = ActivityLogs
    | where EventType in ("LoginAttempt", "SignIn")
    | project UserPrincipalName, Timestamp, Location, IPAddress;
let normalizedData = rawData
    | extend TimestampUTC = todatetime(Timestamp) // KQL datetime values are held in UTC
    | extend LoginHour = datetime_part("Hour", TimestampUTC) // Extract hour of login
    | extend LoginDay = bin(TimestampUTC, 1d) // Day bucket used for the frequency feature
    | extend LocationCategory = iif(Location in ("TrustedRegion1", "TrustedRegion2"),
        "Trusted", "Untrusted");
let dailyFrequency = normalizedData
    | summarize LoginFrequency = count() by UserPrincipalName, LoginDay; // Count daily logins per user
let preparedData = normalizedData
    | join kind=inner dailyFrequency on UserPrincipalName, LoginDay // Keep per-event columns alongside the count
    | project UserPrincipalName, TimestampUTC, LoginHour, LoginFrequency,
        LocationCategory, IPAddress;
preparedData

During the data transformation process, timestamps are normalized to a standardized
UTC format, which ensures consistency across events from various sources and enables
accurate chronological analysis. The specific hour of each login is extracted to facilitate
the detection of unusual login times, as certain anomalies may be more detectable based
on time-of-day patterns. Additionally, the daily frequency of logins per user is calculated,
which serves as a feature for identifying abnormally high login attempts that may indicate
potential compromise. Logins are categorized as “Trusted” or “Untrusted” based on
predefined trusted regions. This categorization allows for flagging logins from suspicious
or unexpected locations, which may necessitate further investigation.
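The four transformations described above can also be sketched in pandas, which may be convenient for checking the engineered features locally before they are consumed by a model. This is a minimal sketch and not the authors' ADF data flow: the function name engineer_login_features, the TRUSTED_REGIONS set, and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical trusted regions; the article only names placeholders.
TRUSTED_REGIONS = {"TrustedRegion1", "TrustedRegion2"}


def engineer_login_features(raw: pd.DataFrame) -> pd.DataFrame:
    """Derive TimestampUTC, LoginHour, LoginFrequency and LocationCategory."""
    out = raw.copy()
    # Parse timestamps and convert every offset to UTC.
    out["TimestampUTC"] = pd.to_datetime(out["Timestamp"], utc=True)
    # Hour of day (0-23) of each login, in UTC.
    out["LoginHour"] = out["TimestampUTC"].dt.hour
    # Number of logins by the same user on the same UTC calendar day.
    login_day = out["TimestampUTC"].dt.floor("D").rename("LoginDay")
    out["LoginFrequency"] = (
        out.groupby(["UserPrincipalName", login_day])["TimestampUTC"].transform("count")
    )
    # Trusted/Untrusted label derived from the location field.
    out["LocationCategory"] = out["Location"].map(
        lambda loc: "Trusted" if loc in TRUSTED_REGIONS else "Untrusted"
    )
    return out[["UserPrincipalName", "TimestampUTC", "LoginHour",
                "LoginFrequency", "LocationCategory", "IPAddress"]]


# Illustrative input rows (made up for this sketch).
sample = pd.DataFrame({
    "UserPrincipalName": ["a@example.com", "a@example.com", "b@example.com"],
    "Timestamp": ["2024-05-01T03:15:00+02:00",
                  "2024-05-01T09:00:00+00:00",
                  "2024-05-01T22:40:00+00:00"],
    "Location": ["TrustedRegion1", "Elsewhere", "TrustedRegion2"],
    "IPAddress": ["10.0.0.1", "203.0.113.7", "10.0.0.2"],
})
features = engineer_login_features(sample)
print(features)
```

Keeping one row per login event, rather than one row per user and day, preserves LoginHour and IPAddress alongside the daily count.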
Upon data transformation completion, the pipeline’s final step is loading the prepared
data into a storage location accessible by Azure ML Studio, such as Azure Blob Storage
or Azure Data Lake. The loading phase renders the transformed data readily accessible
for training and inference within machine learning workflows. The automation of this
ETL process by ADF results in Azure ML models consistently having up-to-date data,
eliminating the need for manual intervention. An example of an automated ADF pipeline
configuration for continuous data loading is given below:

{"name": "ForensicDataPipeline",
"properties": {"activities": [{"name": "Copy Login Data",
"type": "Copy",
"inputs": ["RawLoginData"],
"outputs": ["StagingLoginData"],
"typeProperties": {"source": {"type": "BlobSource"},
"sink": {"type": "BlobSink"}
}
},
{"name": "Data Transformation",
"type": "DataFlow",
"dataFlow": {"name": "LoginEventTransformFlow"}
},
{"name": "Load to ML Storage",
"type": "Copy",
"inputs": ["TransformedLoginData"],
"outputs": ["MLReadyData"],
"typeProperties": {"sink": {"type": "AzureBlob"}
}
}],
"scheduler": {"frequency": "Hour","interval": 1}
}
}

This pipeline configuration is designed to facilitate automated, hourly execution of
the ETL steps. Raw login data are moved through staging, transformation, and, ultimately,
storage for access by ML models.
7.2.3. Utilizing Data for Enhanced Anomaly Detection


Data transformed and standardized by Azure Data Factory enable Azure Machine
Learning Studio to be a powerful tool for deploying advanced machine learning models
tailored for anomaly detection in forensic analysis. In this scenario, the data prepared by
ADF, which included features such as login time, frequency, and geographic categorization,
were utilized to train a model that identifies unusual login activities that may signify
unauthorized access or potential compromise.
Forensic teams use Azure ML Studio to design, train, and deploy machine learning
models that detect patterns and outliers in complex, multi-source datasets. Anomalies that
do not fit the established baseline of user behavior can be identified using algorithms such
as Isolation Forests or Autoencoders within Azure ML Studio. Login attempts that occur at
unusual hours, originate from untrusted regions, or exhibit an unusually high frequency
may be classified as anomalies, which warrant further investigation.
Transformed data are leveraged by the model to identify deviations in login behavior
based on historical patterns, thus enabling a sophisticated layer of security monitoring
that surpasses conventional threshold-based alerts. The integration of ADF and Azure ML
Studio was observed to enhance the forensic capabilities of Azure Sentinel through the
automation of anomaly detection, the reduction in false positives, and the enablement of
real-time analysis of potential threats.
Data that have been transformed are stored in Azure Blob Storage, which Azure ML
Studio can access to train the machine learning model. In this case, an Isolation
Forest model, well suited for identifying outliers, was employed to detect abnormal login
activities. Anomalies were isolated by this model through the random partitioning of data,
enabling differentiation between normal and suspicious behaviors without the reliance on
labeled training data.
Example Code for Accessing Transformed Data and Training an Isolation Forest Model
in Azure ML Studio:

# Import necessary libraries
import io

import pandas as pd
from azure.storage.blob import BlobServiceClient
from sklearn.ensemble import IsolationForest

# Define Blob Storage credentials
blob_service_client = BlobServiceClient.from_connection_string("your_connection_string")
container_name = "transformed-data"
blob_name = "login_events.csv"

# Access transformed data from Azure Blob Storage
blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
downloaded_blob = blob_client.download_blob().readall()
data = pd.read_csv(io.BytesIO(downloaded_blob))

# Select features for training the model. LocationCategory is a string column,
# so it is encoded numerically (0 = Trusted, 1 = Untrusted) before fitting.
features = data[['LoginHour', 'LoginFrequency']].copy()
features['LocationCategory'] = (data['LocationCategory'] == 'Untrusted').astype(int)

# Initialize and train Isolation Forest model
model = IsolationForest(contamination=0.01, random_state=42)
model.fit(features)

# Predict anomalies (1 for normal, -1 for anomaly)
data['Anomaly'] = model.predict(features)
data['AnomalyScore'] = model.decision_function(features)

# Filter and display anomalies for review
anomalies = data[data['Anomaly'] == -1]
print("Detected anomalies:")
print(anomalies[['UserPrincipalName', 'LoginHour', 'LoginFrequency',
                 'LocationCategory', 'AnomalyScore']])

A connection to Azure Blob Storage was established to access the transformed login
events data stored in login_events.csv. The cleaned and engineered features created by
Azure Data Factory were contained within this CSV file to facilitate preparation for
machine learning analysis. Subsequently, relevant features (LoginHour, LoginFrequency,
and LocationCategory) were selected as inputs for the model. These features were
selected due to their ability to capture essential behavioral attributes of user login activity,
enabling the model to establish standard behavior patterns and detect deviations.
An Isolation Forest model was initialized with a contamination level of 0.01, indicating
that approximately 1% of the data are expected to be anomalous. The selected features
were utilized for training this model, facilitating the identification of outliers by isolating
observations that deviate from the baseline of regular activity. Following the training
process, the model scores each data point, with a score approaching −1 indicative of a
greater likelihood of abnormal behavior. Anomalies identified through observations were
isolated in a separate dataset for further analysis, thereby providing forensic investigators
with targeted insights into potentially suspicious login activities:

UserPrincipalName LoginHour LoginFreq Loc_Category AnomalyScore


[email protected] 3 15 Untrusted −0.75
[email protected] 22 2 Trusted −0.82
[email protected] 1 50 Untrusted −0.88

This Python code demonstrates how to use an Isolation Forest machine learning model
to detect anomalies in login behavior. The transformed data, stored in Azure Blob Storage,
were accessed and loaded into a Pandas DataFrame. Key features, including LoginHour,
LoginFrequency, and LocationCategory, were used to train the model. The Isolation Forest
algorithm assigned anomaly scores to each record, with lower scores indicating higher
suspicion levels. For example, [email protected] logged in at an unusual hour
(3 a.m.) with a high login frequency from an untrusted location, resulting in an anomaly
score of −0.75. Similarly, [email protected] exhibited an unusually high login
frequency of 50 from an untrusted location, scoring −0.88. These flagged anomalies can
be further investigated to validate potential threats, providing actionable insights for the
Anomaly Detection Phase of the forensic workflow.
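The sign conventions of the scores discussed above can be checked on synthetic data. The sketch below is not the study's dataset: the baseline distribution, the injected outlier, and the 0/1 encoding of LocationCategory are assumptions chosen for illustration. It shows the three outputs that scikit-learn's IsolationForest exposes: predict (1 or -1), decision_function (negative for anomalies), and score_samples, whose negation lies in (0, 1] and grows with abnormality.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic baseline (assumed, not from the study): daytime logins,
# a handful of logins per day, trusted location encoded as 0.
normal = np.column_stack([
    rng.normal(13.0, 2.0, size=200),   # LoginHour
    rng.normal(3.0, 1.0, size=200),    # LoginFrequency
    np.zeros(200),                     # LocationCategory: 0 = Trusted
])
# One injected outlier: 3 a.m., 50 logins, untrusted location encoded as 1.
outlier = np.array([[3.0, 50.0, 1.0]])
X = np.vstack([normal, outlier])

model = IsolationForest(contamination=0.01, random_state=42).fit(X)

labels = model.predict(X)               # 1 = normal, -1 = anomaly
signed = model.decision_function(X)     # negative values are anomalies
risk = -model.score_samples(X)          # in (0, 1]; higher = more anomalous

print("outlier label:", labels[-1])
print("outlier decision_function:", round(float(signed[-1]), 3))
print("outlier risk score:", round(float(risk[-1]), 3))
```

A downstream consumer that expects higher values to mean greater risk can use the negated score_samples output, whereas decision_function is convenient when only the sign matters.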
Upon completion of the model training, Azure ML Studio facilitates the deployment
of the model as a real-time scoring endpoint. This endpoint can be integrated with Azure
Sentinel to monitor login events continuously. New login events from Azure Sentinel are
sent to the model for scoring, and alerts for forensic teams are triggered by events identified
as abnormal.
Events are continuously scored in real time, facilitating forensic analysts’ quick iden-
tification and response to potential compromises, such as unauthorized access attempts.
This integration streamlines the forensic workflow, enabling proactive threat detection to
be conducted in an automated and scalable manner.
Example Code for Deploying the Model in Azure ML Studio:

from azureml.core import Environment, Model, Workspace
from azureml.core.model import InferenceConfig
from azureml.core.webservice import AciWebservice

# Connect to Azure ML Workspace
ws = Workspace.from_config()

# Register the model
model = Model.register(workspace=ws, model_name="anomaly_detection_model",
                       model_path="path/to/model.pkl")

# Define the inference environment (environment.yml is a placeholder conda
# specification listing scikit-learn and the other scoring dependencies)
myenv = Environment.from_conda_specification(name="anomaly-env",
                                             file_path="environment.yml")

# Define inference configuration
inference_config = InferenceConfig(entry_script="score.py", environment=myenv)

# Define deployment configuration
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

# Deploy the model as a web service
service = Model.deploy(workspace=ws, name="anomaly-detection-service",
                       models=[model], inference_config=inference_config,
                       deployment_config=deployment_config)
service.wait_for_deployment(show_output=True)

# Get scoring URI
scoring_uri = service.scoring_uri
print("Deployed service scoring URI:", scoring_uri)

The deployment code first establishes a connection to an Azure Machine Learning
workspace, which enables access to the machine learning environment necessary for
managing and deploying models. The trained Isolation Forest model is registered within the
workspace and deployed as a real-time scoring service utilizing an Azure Container
Instance (ACI). The deployment comprises a script designated as score.py, within which
the inference logic determines the scoring of incoming login events by the model. Upon
deployment, a unique scoring URI is assigned to the service. The model is called in real
time by Azure Sentinel through this URI, with new login events being sent to the deployed
service for immediate scoring and anomaly detection.
7.2.4. Leveraging Machine Learning Models from Azure ML Studio to Enhance Azure Sentinel
Integrating machine learning models from Azure Machine Learning (Azure ML) Studio
into Azure Sentinel significantly enhances the capabilities of Azure Sentinel’s forensic
and threat detection processes. Azure Sentinel applies advanced, custom-built models
trained on historical data to identify anomalies, detect subtle patterns, and respond
proactively to emerging threats more accurately than rule-based systems. In this integration,
the analytical backbone is provided by Azure ML Studio. At the same time, the operational
layer is constituted by Azure Sentinel, which applies the model’s insights to real-time
security events.
The native detection capabilities of Azure Sentinel are primarily based on predefined
rules and anomaly detection models that emphasize common indicators of compromise.
However, custom models developed in Azure ML Studio allow for tailoring detection
algorithms to unique environments, resulting in models that exhibit increased sensitivity to
specific behaviors or nuanced attack patterns. Low-frequency but high-impact anomalies,
such as abnormal login times, atypical access locations, or unusual resource access patterns,
are detectable by these models, which often signify advanced persistent threats or insider
activity. Through this integration, Azure Sentinel can leverage machine learning to process
large volumes of incoming security events, with only the most relevant threats being
flagged for immediate investigation.
Upon training and deploying the machine learning model as a real-time scoring
endpoint in Azure ML Studio, Azure Sentinel can utilize the model to score new security
events continuously. The likelihood of abnormal login attempts can be evaluated through
the model’s scoring. Sentinel’s alerting mechanism can prioritize high-risk events based
on the model’s scoring, ensuring that security teams focus on incidents with the highest
probability of compromise.
Azure Sentinel collects real-time login and activity logs from multiple sources,
including Azure AD, Microsoft 365, and on-premises AD logs. The data are routed to the
deployed model in Azure ML Studio. The scoring of each event is conducted by the model,
utilizing features including login hour, login frequency, and location category. Events
are assigned an anomaly score, with higher scores indicative of greater risk. When the
anomaly score surpasses a predefined threshold, an alert is raised by Azure Sentinel, which
provides security teams with contextual information regarding the flagged event and its
corresponding anomaly score.
The following sample code is presented to illustrate the utilization of Azure Logic
Apps for calling the deployed model in Azure ML Studio and processing the scoring results
in Azure Sentinel:

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "actions": {
      "Get_Sentinel_Event": {
        "type": "Http",
        "inputs": {
          "method": "GET",
          "uri": "https://api.securitycenter.microsoft.com/api/incidents",
          "headers": {"Authorization": "Bearer {{Sentinel_API_Token}}"}
        }
      },
      "Score_Event_With_Model": {
        "type": "Http",
        "inputs": {
          "method": "POST",
          "uri": "https://<your-ml-endpoint>.azurewebsites.net/score",
          "body": {
            "UserPlName": "@{body('Get_Sentinel_Event')?['properties']?['userPrincipalName']}",
            "LoginHour": "@{body('Get_Sentinel_Event')?['properties']?['loginHour']}",
            "LoginFrequency": "@{body('Get_Sentinel_Event')?['properties']?['loginFrequency']}",
            "LocationCat": "@{body('Get_Sentinel_Event')?['properties']?['locationCategory']}"
          },
          "headers": {"Content-Type": "application/json"}
        }
      },
      "Raise_Alert": {
        "type": "If",
        "expression": {"greater": ["@{body('Score_Event_With_Model')?['AnomalyScore']}", 0.7]},
        "actions": {
          "Create_Sentinel_Alert": {
            "type": "Http",
            "inputs": {
              "method": "POST",
              "uri": "https://api.securitycenter.microsoft.com/api/alerts",
              "body": {
                "title": "High-Risk Login Detected",
                "description": "Anomaly score exceeded threshold",
                "severity": "High",
                "properties": {
                  "UserPrincipalName": "@{body('Score_Event_With_Model')?['UserPrincipalName']}",
                  "AnomalyScore": "@{body('Score_Event_With_Model')?['AnomalyScore']}"
                }
              },
              "headers": {"Authorization": "Bearer {{Sentinel_API_Token}}",
                          "Content-Type": "application/json"}
            }
          }
        }
      }
    }
  }
}

A login event is retrieved from Azure Sentinel through its REST API at the beginning
of the workflow. Essential fields such as UserPrincipalName, LoginHour, LoginFrequency,
and LocationCategory are included in this event, serving as inputs for the machine learning
model. Upon data extraction, they are transmitted to the deployed model endpoint in
Azure ML Studio, where the model computes an AnomalyScore. The likelihood that the
login event is suspicious is represented by this score, with higher scores indicating a greater
probability of abnormal behavior. When the AnomalyScore exceeds a predefined threshold
of 0.7, the workflow triggers a high-risk alert in Azure Sentinel. Contextual information
from the model’s output is incorporated into this alert, enabling a focused and informed
investigation into the flagged event by the security team:

Incident ID  UPName                     LogHour  LogFreq  LocCat     AnomalyScore  Alert Title               Severity
12345        alice.jones@company.com    3        15       Untrusted  0.85          High-Risk Login Detected  High
67890        finance_admin@company.com  1        50       Untrusted  0.92          High-Risk Login Detected  High

For instance, a login was performed by [email protected] at 3 a.m. with a
high frequency from an untrusted location, which resulted in an anomaly score of 0.85 and
triggered a “High-Risk Login Detected” alert. A highly suspicious pattern was exhibited
by [email protected], with an anomaly score of 0.92. The automated process is
designed to enhance real-time anomaly detection, with the Incident Response Phase being
streamlined by providing actionable alerts based on advanced machine learning analysis,
enabling security teams to address threats more efficiently.
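The alerting rule applied to the scored events can be expressed compactly in Python, which may help when unit-testing the decision logic outside the Logic App. The sketch below is an illustration under stated assumptions: the function name build_alert and the shape of the input dictionary are hypothetical, while the 0.7 threshold and the alert fields follow the workflow described above.

```python
ALERT_THRESHOLD = 0.7  # threshold stated in the workflow description


def build_alert(scored_event, threshold=ALERT_THRESHOLD):
    """Return an alert payload if the score exceeds the threshold, else None."""
    score = float(scored_event["AnomalyScore"])
    if score <= threshold:
        return None
    return {
        "title": "High-Risk Login Detected",
        "description": "Anomaly score exceeded threshold",
        "severity": "High",
        "properties": {
            "UserPrincipalName": scored_event.get("UserPrincipalName"),
            "AnomalyScore": score,
        },
    }


# Example with made-up scored events.
for event in ({"UserPrincipalName": "u1@example.com", "AnomalyScore": 0.85},
              {"UserPrincipalName": "u2@example.com", "AnomalyScore": 0.42}):
    print(event["UserPrincipalName"], "->", build_alert(event))
```

Converting the score to a float before the comparison avoids comparing a string with a number when the score arrives as text in a JSON payload.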
The integration of real-time scoring from Azure ML Studio into Azure Sentinel is
associated with several advantages in the forensic investigation workflow:
• Enhanced detection accuracy is achieved using a custom-trained machine learning
model, which allows for identifying nuanced patterns that are challenging to capture
using traditional rule-based methods. False positives are reduced, allowing analysts
to focus on events that represent genuine threats.
• The capability for real-time scoring facilitates the immediate analysis of incoming
events, generating alerts for the security team within moments of an abnormal event.
The time available for an attacker to exploit compromised accounts or escalate
privileges within the system is minimized.
• The manual workload is reduced by implementing automated scoring and alert
generation, which streamline the detection process and diminish the necessity for manual
analysis of each event. By automating routine evaluations, forensic teams may allocate
resources to more complex investigations, improving overall efficiency.
Integrating Azure Machine Learning Studio models with Azure Sentinel enables forensic
teams to proactively respond to complex cyber threats. Combining data transformation
from Azure Data Factory with custom machine learning models in Azure ML Studio
enhances Azure Sentinel’s ability to detect advanced threats, including abnormal login
behaviors and atypical access patterns. The integration is associated with improvements
in detection accuracy and timeliness, and a scalable, adaptive forensic workflow that can
evolve alongside an organization’s security needs is established.
7.2.5. Benefits of Machine Learning and Artificial Intelligence in Azure Security Center
Integrating machine learning and artificial intelligence within Azure Security Center
is a powerful enhancement to traditional security approaches. The detection, analysis, and
response to threats by Azure Security Center are enhanced by the capabilities of ML and AI,
resulting in greater speed, accuracy, and adaptability compared to rule-based systems alone.
The transformation of Azure Security Center into a proactive security solution is achieved
by applying advanced techniques that enable continuous learning from data, adaptation to
evolving threats, and reductions in the burden on human analysts. The primary benefits of
leveraging ML and AI in Azure Security Center are outlined below.
The ability of ML and AI to improve the accuracy of threat detection is considered one
of the most significant benefits. Traditional security methods, including signature-based
detection and static rule sets, often struggle to detect new or evolving threats that do
not conform to predefined patterns. In contrast, ML models trained on historical data can
establish a baseline of normal behavior for users, devices, and applications. These models
utilize behavioral baselines to detect subtle deviations indicative of potential security
incidents, including abnormal login patterns, unexpected access locations, and unusual
data transfers. Anomalies that fall outside typical usage patterns are identified, resulting in
a reduction in false positives by ML-driven detection, which aids security teams in focusing
on high-priority threats.
The capabilities of AI and ML facilitate the dynamic adaptation of Azure Security
Center to new threats through continuous learning. As data are subject to change over
time, the periodic retraining of the models within Azure ML Studio on fresh data is
facilitated, allowing for the refinement of detection capabilities and the maintenance of
effectiveness against emerging threats. For example, it has been observed that as user
behaviors shift—whether due to seasonal variations, remote work trends, or organizational
changes—baseline updates are made to ML models to accommodate these patterns. The
likelihood of false alarms from legitimate changes is reduced through this adaptability
while the system maintains vigilance to actual threats.
Integrating real-time anomaly detection significantly enhances the responsiveness of
Azure Security Center. AI-powered models deployed in Azure ML Studio analyze incoming
security events immediately, enabling Azure Security Center to detect and respond to
suspicious activity as it occurs. For instance, when a login event deviates from an
established baseline in terms of time, location, or frequency, the model scores it as
potentially abnormal. Azure Security Center raises an alert if the anomaly score surpasses
a specified threshold, which enables instant notifications to be sent to security teams.
This real-time analysis shortens the window in which attackers can exploit
vulnerabilities and accelerates the response, thereby mitigating potential damage.
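The scoring-and-threshold step can be sketched as follows. This is a rule-weighted stand-in for a trained model, intended only to show how deviations in time, location, and frequency might combine into one score that is compared against a threshold; the field names, weights, and threshold value are assumptions and are not taken from Azure Security Center.

```python
from dataclasses import dataclass

@dataclass
class LoginEvent:
    user: str
    hour: int               # hour of day, 0-23
    country: str
    logins_last_hour: int

@dataclass
class UserBaseline:
    usual_hours: range
    usual_countries: frozenset
    max_logins_per_hour: int

ALERT_THRESHOLD = 60  # hypothetical threshold on a 0-100 scale

def anomaly_score(event, baseline):
    """Add a fixed weight for each dimension that deviates from the baseline."""
    score = 0
    if event.hour not in baseline.usual_hours:
        score += 30   # unusual time
    if event.country not in baseline.usual_countries:
        score += 40   # unusual location
    if event.logins_last_hour > baseline.max_logins_per_hour:
        score += 30   # unusual frequency
    return score

def should_alert(event, baseline, threshold=ALERT_THRESHOLD):
    """Raise an alert only when the score surpasses the threshold."""
    return anomaly_score(event, baseline) > threshold

baseline = UserBaseline(range(8, 19), frozenset({"HR"}), max_logins_per_hour=3)
normal_login = LoginEvent("alice", hour=9, country="HR", logins_last_hour=1)
odd_login = LoginEvent("alice", hour=3, country="BR", logins_last_hour=1)
```

Here `should_alert(normal_login, baseline)` is false and `should_alert(odd_login, baseline)` is true, because the second event deviates in both time and location.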
In Azure Security Center, machine learning models provide a mechanism for automatically
prioritizing threats based on risk scores. Anomaly scores are assigned to events,
allowing incidents to be ranked based on their likelihood of being genuine security threats.
Automated alert prioritization relieves security analysts of manual review, allowing them
to concentrate on high-severity threats and complex investigations. A login attempt with a
high anomaly score, suggesting a significant likelihood of compromise, would be assigned a
high-priority alert. In contrast, events assessed as lower risk would receive lower
priorities. By reducing false positives and automatically ranking alerts, ML and AI improve
the efficiency and productivity of security teams.
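A minimal sketch of score-based prioritization is given below. The priority cut-offs, alert identifiers, and scores are hypothetical values chosen for illustration; the actual ranking logic in Azure Security Center is not described in this paper.

```python
def priority(score):
    """Map an anomaly score in [0, 1] to a priority label (assumed cut-offs)."""
    if score >= 0.8:
        return "high"
    if score >= 0.5:
        return "medium"
    return "low"

def rank_alerts(alerts):
    """Sort alerts by descending anomaly score and attach a priority label."""
    ranked = sorted(alerts, key=lambda alert: alert["score"], reverse=True)
    return [{**alert, "priority": priority(alert["score"])} for alert in ranked]

# Hypothetical alerts produced by an upstream anomaly model.
alerts = [
    {"id": "A1", "score": 0.35},
    {"id": "A2", "score": 0.92},
    {"id": "A3", "score": 0.61},
]
ranked = rank_alerts(alerts)
```

Analysts would then work through `ranked` from the top, so the alert most likely to represent a compromise is reviewed first.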
As organizations grow and their digital environments increase in complexity, traditional
security methods may struggle to scale effectively. The scalability of ML- and AI-driven
solutions in Azure Security Center enables continuous monitoring across large, distributed,
and hybrid environments. ML models process and analyze vast volumes of data from multiple
sources, including cloud resources, on-premises systems, and third-party integrations.
This capability allows all parts of an organization’s infrastructure to be monitored
uniformly, irrespective of scale. Furthermore, automated ML workflows allow Azure Security
Center to scale without compromising the precision and timeliness of threat detection.
Advanced persistent threats (APTs) are frequently characterized by stealth and longevity,
which makes them difficult to detect using traditional rule-based approaches. ML
models within Azure Security Center enhance the ability to identify sophisticated threats
by uncovering subtle patterns and correlations that may not be immediately apparent. For
example, an APT may involve low-frequency but high-risk activities, such as privilege
escalations followed by unauthorized data access. AI-powered models can correlate
these events over time, linking seemingly benign activities into a coherent attack chain
and alerting security teams before the attacker’s objective is achieved. Proactive
detection minimizes the dwell time of threats within the environment, which reduces
the potential for data breaches and operational disruption.
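The correlation of events over time can be sketched with a simple rule: a privilege escalation followed by sensitive data access from the same account within a fixed window is linked into one chain. This is a deterministic illustration rather than a learned model; the event fields, action names, account names, and seven-day window are all assumptions.

```python
from datetime import datetime, timedelta

def find_attack_chains(events, window=timedelta(days=7)):
    """Link each sensitive data access to a prior escalation by the same account."""
    chains = []
    last_escalation = {}  # account -> most recent escalation event
    for event in sorted(events, key=lambda e: e["time"]):
        if event["action"] == "privilege_escalation":
            last_escalation[event["account"]] = event
        elif event["action"] == "sensitive_data_access":
            escalation = last_escalation.get(event["account"])
            if escalation and event["time"] - escalation["time"] <= window:
                chains.append((escalation, event))
    return chains

# Hypothetical audit events; on their own, none of them looks alarming.
events = [
    {"account": "svc-backup", "action": "privilege_escalation",
     "time": datetime(2024, 3, 1, 2, 15)},
    {"account": "alice", "action": "sensitive_data_access",
     "time": datetime(2024, 3, 2, 10, 0)},
    {"account": "svc-backup", "action": "sensitive_data_access",
     "time": datetime(2024, 3, 4, 23, 40)},
]
chains = find_attack_chains(events)
```

Only the two `svc-backup` events form a chain, since the other account accessed data without a preceding escalation; a production system would correlate many more event types and use learned rather than fixed relationships.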
Integrating artificial intelligence and machine learning into Azure Sentinel is a
transformative approach to forensic analysis in cloud environments, enabling proactive and
adaptive responses to increasingly sophisticated cyber threats. By centralizing data from
diverse sources and applying machine learning for adaptive anomaly detection, Azure
Sentinel improves the detection of subtle and complex attack vectors that traditional
rule-based methods often overlook. This case study exemplifies how AI-driven forensics
enables faster and more accurate detection, analysis, and response to incidents, and it
showcases the platform’s potential to support robust security monitoring. A structured and
adaptive forensic workflow, which incorporates advanced anomaly detection models in Azure
Machine Learning Studio, helps organizations move from a reactive to a proactive
cybersecurity posture. Azure Sentinel is positioned as a critical tool within this
AI-enhanced framework for constructing resilient, scalable, and efficient defenses that
address the unique challenges presented by today’s cloud-centric cyber landscape.

8. Future Works
Cyber forensics offers a wide range of potential research areas, which are constantly
expanding due to the rapid progress of technology and the growing complexity of cyber
threats. An important field of study involves incorporating artificial intelligence and
machine learning to improve forensic analysis and detect potential threats. Artificial
intelligence and machine learning can automate the examination of vast amounts of data, detect
patterns, and forecast potential risks. As a result, they greatly enhance the effectiveness
and precision of forensic investigations. AI can be utilized to create advanced tools for
identifying and examining malware, phishing attacks, and other cyber threats. These
threats are increasingly intricate and challenging to detect using conventional methods [24].
Cloud forensics is an essential field that requires further investigation. As cloud
computing becomes more widely used, there is an increasing demand for developing
forensic tools and techniques to manage cloud environments efficiently. This encompasses
tackling obstacles such as obtaining, safeguarding, and examining data in distributed
and virtualized cloud infrastructures. Research in this field could concentrate on creating
standardized protocols for cloud forensics and efficient tools for gathering and analyzing
evidence from cloud service providers [7].
The field of cyber forensics research holds great promise for the future. Key focus
areas include incorporating artificial intelligence and machine learning to improve
analysis and identify threats. These research directions will not only tackle existing
challenges but also anticipate and prepare for the changing landscape of cyber threats,
ensuring that forensic experts are adequately equipped to safeguard digital environments.

9. Conclusions
The rapid advancement of digital technologies and their integration into enterprise
environments underscores an urgent need for robust cybersecurity measures. This paper
draws on a comprehensive year-long examination of cloud forensics practices within
Azure AD and contributes to the field of cyber forensics. It highlights the critical role
of structured forensic methodologies and advanced tools in understanding and mitigating
cyber threats, particularly in cloud environments.
This paper has detailed the application of Azure’s forensic tools in a real-world scenario
involving a ransomware attack, providing a systematic approach to forensic investigations
in cloud infrastructures. Its scientific contributions are the detailed analysis of the
attack vectors, the forensic methodologies employed, and the integration of Azure’s
sophisticated monitoring tools, such as Azure Log Analytics and Azure Sentinel. These
tools have proven essential for incident management and for developing preventive
strategies that enhance organizational resilience against cyber threats.
Furthermore, this paper contributes to the academic discussion on using UALs in cloud
environments. It explores the complexities of UALs’ schema and the challenges of analyzing
unstructured data. It demonstrates how to effectively parse and query large datasets using
Kusto and Azure Data Explorer. This exploration has significant implications
for future forensic practices, as it provides a framework for other organizations to enhance
their log analysis capabilities, thereby improving the ability to detect and respond to
sophisticated cyber threats.
In addition, integrating Microsoft Cloud App Security (MCAS) into the forensic
strategy has expanded the understanding of activity logging across cloud services. This
adaptation offers a blueprint for other forensic researchers and practitioners to extend
monitoring reach across multiple platforms, ensuring comprehensive coverage of all
potential threat vectors.
Incorporating artificial intelligence and machine learning into Azure’s forensic
capabilities marks a significant evolution in cyber forensic techniques, especially in cloud
settings. By automating intricate data analytics and improving threat prediction
capabilities, AI accelerates incident response and enhances forensic accuracy in identifying
complex cyber threats. This study underscores AI’s capacity to strengthen cyber forensic
methodologies, enabling a more agile and adaptive security framework. These innovations
support improved security preparedness and a more robust cloud forensics methodology,
establishing new cybersecurity and digital forensics standards.

Author Contributions: Conceptualization, Z.M.; Methodology, V.D. and A.K.; Validation, V.D. and
A.K.; Formal analysis, Z.M. and A.K.; Investigation, A.K. and D.R.; Resources, Z.M. and D.R.;
Writing—original draft, V.D. and Z.M.; Writing—review and editing, V.D. and D.R.; Supervision,
Z.M.; Project administration, Z.M. All authors have read and agreed to the published version of
the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Data are contained within this article.
Conflicts of Interest: The authors declare no conflicts of interest.

References
1. Yadav, S. Cyber Forensics. In Advances in Digital Crime, Forensics, and Cyber Terrorism; IGI Global: Hershey, PA, USA, 2020;
pp. 1–15. [CrossRef]
2. Allah Rakha, N. Cybercrime and the Law: Addressing the Challenges of Digital Forensics in Criminal Investigations. Mex. Law
Rev. 2024, 16, 23–54. [CrossRef]
3. Baafi, P.O. Tools For Cyber Forensics. Adv. Multidiscip. Sci. Res. J. Publ. 2022, 1, 285–290. [CrossRef]
4. Yerriswamy, K.; Venumadhava, G.S. Cyber Forensic Tools and Its Application in the Investigation of Digital Crimes: Preventive
Measures with Case Studies. Int. J. Sci. Res. 2022, 11, 71–73. [CrossRef]
5. Dunsin, D.; Ghanem, M.C.; Ouazzane, K.; Vassilev, V. A Comprehensive Analysis of the Role of Artificial Intelligence and Machine
Learning in Modern Digital Forensics and Incident Response. Forensic Sci. Int. Digit. Investig. 2023, 48, 301675. [CrossRef]
6. Ekhande, S.; Patil, U.; Kulhalli, K.V. Review on effectiveness of deep learning approach in digital forensics. Int. J. Electr. Comput.
Eng. (IJECE) 2022, 12, 5481–5592. [CrossRef]
7. Javed, A.R.; Ahmed, W.; Alazab, M.; Jalil, Z.; Kifayat, K.; Gadekallu, T.R. A Comprehensive Survey on Computer Forensics:
State-of-the-Art, Tools, Techniques, Challenges, and Future Directions. IEEE Access 2022, 10, 11065–11089. [CrossRef]
8. Kaur, R.; Kaur, A. Digital Forensics. Int. J. Comput. Appl. 2012, 50, 5–9. [CrossRef]
9. Perilli, M.; De Bonis, M.; Gallo, C. Computer Forensics and Cyber Attacks. In Handbook of Research on Cyber Crime and Information
Privacy; IGI Global: Hershey, PA, USA, 2020; pp. 132–150. [CrossRef]
10. McCluskey, Q.R.; Chowdhury, M.M.; Latif, S.; Kambhampaty, K. Computer Forensics: Complementing Cyber Security. In
Proceedings of the 2022 IEEE International Conference on Electro Information Technology (eIT), Mankato, MN, USA, 19–21 May
2022; IEEE: Piscataway, NJ, USA, 2022. [CrossRef]
11. Malik, N. A Study on Different Software Used to Perform Cyber Crime. Int. J. Res. Appl. Sci. Eng. Technol. 2021, 9, 879–882.
[CrossRef]
12. Kumar, S.; Pathak, S.K.; Singh, J. A Comprehensive Study of XSS Attack and the Digital Forensic Models to Gather the Evidence.
ECS Trans. 2022, 107, 7153–7163. [CrossRef]
13. Kolesnyk, V. Forensic Analysis of Crimes in the Sphere of Information Technologies. Inf. Secur. Pers. Soc. State 2021, 31–33,
124–136. [CrossRef] [PubMed]
14. Qadir, A.M.; Varol, A. The Role of Machine Learning in Digital Forensics. In Proceedings of the 2020 8th International Symposium
on Digital Forensics and Security (ISDFS), Beirut, Lebanon, 1–2 June 2020; pp. 1–5. [CrossRef]
15. Kumar, D.; Kumar, K.P. Artificial Intelligence Based Cyber Security Threats Identification in Financial Institutions Using Machine
Learning Approach. In Proceedings of the 2023 2nd International Conference for Innovation in Technology (INOCON), Bangalore,
India, 3–5 March 2023; pp. 1–6. [CrossRef]
16. Rajendiran, K.; Kannan, K.; Yu, Y. Applications of Machine Learning in Cyber Forensics. In Advances in Digital Crime, Forensics,
and Cyber Terrorism; IGI Global: Hershey, PA, USA, 2021; pp. 29–46. [CrossRef]
17. Vedre, S.; Parulekar, W. Digital Forensic and Role of Computers in Digital Forensic. Int. J. Creat. Res. Thoughts (IJCRT) 2022, 10,
in press.
18. Karie, N.M.; Kebande, V.R.; Venter, H.S. Diverging Deep Learning Cognitive Computing Techniques into Cyber Forensics. Forensic
Sci. Int. Synerg. 2019, 1, 61–67. [CrossRef] [PubMed]
19. Cisar, P.; Cisar, S.; Bosnjak, S. Cybercrime and Digital Forensics—Technologies and Approaches. In DAAAM International
Scientific Book 2014; DAAAM International: Vienna, Austria, 2014; pp. 525–542. [CrossRef]
20. Svoboda, J.; Lukas, L. Sources of Threats and Threats in the Cyber Security. In DAAAM International Scientific Book 2019; DAAAM
International: Vienna, Austria, 2019; pp. 321–330. [CrossRef]
21. Kafol, C.; Bregar, A. Cyber Security—Building a Sustainable Protection. In DAAAM International Scientific Book 2017; DAAAM
International: Vienna, Austria, 2017; pp. 81–90. [CrossRef]
22. Al Fahdi, M.; Clarke, N.L.; Furnell, S.M. Challenges to Digital Forensics: A Survey of Researchers & Practitioners Attitudes and
Opinions. In Proceedings of the 2013 Information Security for South Africa, IEEE, Johannesburg, South Africa, 14–16 August
2013; pp. 1–8. [CrossRef]
23. ISO/IEC 27001:2022/Amd 1:2024; Information Security, Cybersecurity and Privacy Protection—Information Security Management
Systems—Requirements. ISO, International Electrotechnical Commission (IEC): Geneva, Switzerland, 2024.
24. Fakiha, B. Enhancing Cyber Forensics with AI and Machine Learning: A Study on Automated Threat Analysis and Classification.
Int. J. Saf. Secur. Eng. 2023, 13, 701–707. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
