Amoeba: An Autonomous Backup and Recovery
SSD for Ransomware Attack Defense
Donghyun Min*, Donggyu Park*, Jinwoo Ahn*,
Ryan Walker†, Junghee Lee†, Sungyong Park*, Youngjae Kim*
Presenter: Donghyun Min
Feb 19, 2019 @ HPCA’19
Ransomware
- Ransomware encrypts the user's data and demands a ransom fee.
- Result: huge financial loss.
Damage of Ransomware Attack
- Many areas are suffering from damage
• Public institutions
• Government, industry
Ransomware-related damage cost will
reach $20 billion by 2021!
How to Defend against Ransomware Attack
- Backup method
[Figure: an original copy and its backup copy.]
How to Defend against Ransomware Attack
- Approach 1: Host-level backup
- Backup on Local File system
- Backup on Remote machine
- Approach 2: Device-level backup
- FlashGuard [CCS’17]
- SSD-Insider [ICDCS’18]
- Amoeba [CAL’18]
Approach 1: Host-level Backup
- Backup inside the file system: data and its backup share the host's local storage.
- Backup on a remote machine: the host machine sends a backup of its data over the network to the remote machine's local storage.
Limitations:
1. Extra storage space is required.
2. Ransomware with kernel privilege can disable the backup process.
Approach 2: Device-level Backup
- FlashGuard [CCS’17]
- SSD-Insider [ICDCS’18]
- Both are SSDs with a backup mechanism built on out-of-place update.
Opportunities: Out-of-Place Update in an SSD
[Figure: an SSD. The Flash Translation Layer (FTL) holds the address translation table of (LPN, PPN) entries, here (10, 2) and (20, 3); NAND flash memory holds physical blocks made of physical pages, each marked VALID or INVALID.]
- Ransomware encrypts File(A), which rewrites LPN 10.
- Flash does not update the page in place. The FTL performs an out-of-place update: the new data goes to physical page 4, the entry (10, 2) becomes (10, 4), and physical page 2 is marked INVALID.
- The invalid page is actually the original page, so it can serve as a backup for recovery.
1. We can save storage space because no additional backup space is required.
2. Device-level backup can be more secure because the backup copy cannot be seen by the ransomware application.
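The out-of-place update described above can be illustrated with a toy page-mapped FTL. This is a minimal sketch, not the FTL of any real SSD: the class `TinyFTL`, its fields, and the page numbering are all hypothetical. It only shows that after an overwrite, the original contents remain in the invalidated physical page.

```python
from dataclasses import dataclass, field

VALID, INVALID, FREE = "VALID", "INVALID", "FREE"

@dataclass
class TinyFTL:
    """Toy page-mapped FTL; every name here is hypothetical."""
    num_pages: int
    mapping: dict = field(default_factory=dict)  # LPN -> PPN
    state: list = field(default_factory=list)    # per-PPN page state
    data: list = field(default_factory=list)     # per-PPN page contents

    def __post_init__(self):
        self.state = [FREE] * self.num_pages
        self.data = [None] * self.num_pages

    def write(self, lpn, payload):
        """Out-of-place update: program a free page, then invalidate the old one."""
        new_ppn = self.state.index(FREE)  # raises ValueError when no page is free
        self.data[new_ppn] = payload
        self.state[new_ppn] = VALID
        old_ppn = self.mapping.get(lpn)
        if old_ppn is not None:
            self.state[old_ppn] = INVALID  # old contents stay in flash until erased
        self.mapping[lpn] = new_ppn
        return old_ppn

ftl = TinyFTL(num_pages=8)
ftl.write(10, b"File(A) plaintext")
old = ftl.write(10, b"File(A) encrypted by ransomware")
# The original data is still readable from the invalidated physical page.
original = ftl.data[old]
```

In a real device the invalid page survives only until garbage collection erases its block, which is why a backup mechanism has to protect it explicitly.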
Challenges
[Figure: the same SSD. Ransomware encrypts File(A) while a normal user overwrites File(B) at LPN 20.]
- The ransomware update leaves the original page of File(A) as an INVALID page that is kept as a backup.
- The normal user's overwrite is also an out-of-place update: the entry (20, 3) becomes (20, 5), and the old page is kept as a backup as well.
- If every invalid page is kept, backup pages accumulate across the flash.
The SSD should keep invalid pages as backups only for updates made by ransomware.
Summary: Limitations of Previous Works [CCS’17, ICDCS’18]
1. Lack of accurate ransomware detection algorithms
- Detection relies solely on the I/O access pattern (e.g., write intensity)
→ False Positive (FP) → GC overhead
→ False Negative (FN) → Recovery failure
2. High unnecessary space overhead due to the lack of an intelligent backup mechanism
- Unnecessary backup pages increase GC overhead.
[Figure: both user writes and ransomware writes leave backup pages behind, so backup pages accumulate.]
Our Approach [Amoeba, CAL’18]
1. We use a content-based detection technique to achieve a high ransomware detection rate.
[Figure: the bytes of the written data (\xCE, \xC5, \x06, ...) are inspected.]
2. We implement an intelligent backup management mechanism to minimize the space overhead of backup pages.
[Figure: user writes and ransomware writes with the backup pages they leave.]
Challenge 1: How to Apply Content-based Detection at High Speed
- Content-based detection offers a high ransomware detection rate, but it is time-consuming: computing similarity and entropy requires comparing the old and new data, which takes substantial computation.
[Figure: old data A and new data A’ are compared to compute similarity and entropy.]
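To make the two content-based indicators concrete, here is a minimal sketch. The slides do not define the exact metrics, so both functions are assumptions: entropy is the standard Shannon entropy over byte occurrences, and `byte_similarity` is a simple position-wise match ratio chosen only for illustration.

```python
import math
from collections import Counter

def shannon_entropy(page: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0), from byte occurrences."""
    if not page:
        return 0.0
    counts = Counter(page)
    n = len(page)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def byte_similarity(old: bytes, new: bytes) -> float:
    """Fraction of positions holding the same byte (an assumed stand-in metric)."""
    length = max(len(old), len(new))
    if length == 0:
        return 1.0
    same = sum(a == b for a, b in zip(old, new))
    return same / length

old_page = b"A" * 4096             # highly regular original page
new_page = bytes(range(256)) * 16  # every byte value appears equally often

old_entropy = shannon_entropy(old_page)
new_entropy = shannon_entropy(new_page)
similarity = byte_similarity(old_page, new_page)
```

Both functions touch every byte of the old and new pages, which is the per-write cost this challenge refers to.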
Challenge 2: How to Accurately Detect Ransomware Attacks
- A ransomware detection algorithm needs to consider all three indicators together to achieve a high detection rate.
[Figure: write intensity comes from the I/O access pattern; similarity and entropy come from comparing old data A with new data A’.]
- If only write intensity is used, the detector often confuses the I/O access patterns of normal requests with those of ransomware attacks.
- If only similarity and entropy are used, the detector cannot distinguish legitimate applications that use compression or the PGP cryptographic library from ransomware attacks.
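The point about compression can be checked with a small experiment. This sketch uses Python's standard `zlib` module as the legitimate application; the input text, seed, and size are arbitrary choices for illustration. It covers compression only, not PGP.

```python
import math
import random
import zlib
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, computed from the byte histogram."""
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

rng = random.Random(0)  # fixed seed so the run is repeatable
alphabet = "abcdefghijklmnopqrstuvwxyz "
plain = "".join(rng.choice(alphabet) for _ in range(65536)).encode()
compressed = zlib.compress(plain, 9)  # a legitimate, non-malicious rewrite

plain_entropy = shannon_entropy(plain)
compressed_entropy = shannon_entropy(compressed)
print(f"plain: {plain_entropy:.2f} bits/byte, compressed: {compressed_entropy:.2f} bits/byte")
```

The compressed output has much higher entropy than the text it came from and shares almost no bytes with it, so a detector that looks at content alone would see the same signature as an encrypting attack.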
Challenge 3: How to Minimize Backup Space Overhead
[Figure: user writes and ransomware writes both leave backup pages.]
We should be able to identify, among all backup pages, only those necessary for recovery.
Amoeba:
An Autonomous Backup and Recovery SSD
for Ransomware Attack Defense
Amoeba System Architecture
[Figure: a host machine attached to an SSD. The SSD contains a DRAM buffer, NAND flash, and an SSD controller with a DRAM controller, a Flash Translation Layer (FTL), a flash controller, and the Amoeba DMA.]
Components:
- Amoeba DMA
- Ransomware Attack Risk Indicator (RARI)
- Intelligent Backup Mechanism
1. Amoeba DMA Engine
- Amoeba DMA engine for computing similarity and entropy
- Basic DMA (existing DMA)
[Figure: a write request arrives with new data in the DRAM buffer while the old data sits in NAND flash; the SSD controller contains an internal DMA.]
- Amoeba DMA
[Figure: the Amoeba DMA replaces the internal DMA. As the new data (from the DRAM buffer) and the old data (from NAND flash) pass through it, it counts the occurrence of bytes and computes similarity and entropy.]
- The calculation delay can be hidden.
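One way to picture how the calculation delay can be hidden is to accumulate the statistics while the data is being transferred, burst by burst, instead of in a separate pass afterwards. The following is a software sketch of that idea only; the class `TransferStats`, the 512-byte burst size, and the similarity metric are assumptions, not the hardware design.

```python
import math

class TransferStats:
    """Accumulates statistics while page data streams past (hypothetical)."""

    def __init__(self):
        self.counts = [0] * 256  # occurrence of each byte value in the new data
        self.same = 0            # positions where old and new bytes match
        self.total = 0           # bytes seen so far

    def feed(self, old_chunk: bytes, new_chunk: bytes) -> None:
        """Process one burst; assumes both chunks have equal length."""
        for o, n in zip(old_chunk, new_chunk):
            self.counts[n] += 1
            self.same += o == n
        self.total += len(new_chunk)

    def entropy(self) -> float:
        """Shannon entropy of the new data in bits per byte."""
        return -sum(c / self.total * math.log2(c / self.total)
                    for c in self.counts if c)

    def similarity(self) -> float:
        """Fraction of byte positions unchanged between old and new data."""
        return self.same / self.total

old_page = bytes(4096)             # all-zero original page
new_page = bytes(range(256)) * 16  # uniformly distributed new page
stats = TransferStats()
for off in range(0, 4096, 512):    # 512-byte bursts, an arbitrary size
    stats.feed(old_page[off:off + 512], new_page[off:off + 512])
```

When the last burst has been fed, both indicators are already available, so no extra pass over the page is needed.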
2. Ransomware Attack Risk Indicator (RARI)
- We establish a model that combines three indicators (write intensity, similarity, and entropy) to form a RARI value.
[Figure: for a suspicious write request, each indicator value (similarity, entropy, intensity) enters the RARI computation, which outputs a probability as the RARI. Labels: "Threshold (fixed) = 0.5f", "Machine Learning", "Host Machine".]
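The slides do not give the RARI model itself, so the following is only a sketch of how three indicators could be combined into one probability. The logistic form, the weights, and the bias are hypothetical; the 0.5 threshold is the fixed value shown on the slide, and applying it to the final probability is an assumption of this sketch.

```python
import math

THRESHOLD = 0.5  # fixed threshold shown on the slide

# Hypothetical weights and bias; a real model would be trained offline.
WEIGHTS = {"intensity": 3.0, "similarity": -4.0, "entropy": 4.0}
BIAS = -3.0

def rari(intensity: float, similarity: float, entropy_bits: float) -> float:
    """Map the three indicators to a probability in (0, 1) with a logistic model.

    intensity and similarity are expected in [0, 1]; entropy_bits in [0, 8].
    """
    z = (BIAS
         + WEIGHTS["intensity"] * intensity
         + WEIGHTS["similarity"] * similarity
         + WEIGHTS["entropy"] * (entropy_bits / 8.0))
    return 1.0 / (1.0 + math.exp(-z))

def is_suspicious(score: float) -> bool:
    """Flag a write when its RARI exceeds the fixed threshold."""
    return score > THRESHOLD

# Heavy writing, almost nothing preserved, near-random content.
attack = rari(intensity=0.9, similarity=0.01, entropy_bits=7.9)
# Light writing, mostly unchanged content, moderate entropy.
edit = rari(intensity=0.2, similarity=0.95, entropy_bits=4.5)
```

With these made-up weights the first request scores well above the threshold and the second well below it. The values carry no meaning beyond showing how the indicators pull the score in different directions.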
3. Intelligent Backup Control Mechanism
- We can accurately identify the pages that must be kept as backups using RARI values. Thus, we need to maintain only one backup page per logical page.
[Figure: user writes and ransomware writes with the backup pages that are kept.]
We can completely eliminate unnecessary backup pages in an SSD.
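The space saving can be sketched by counting how many old pages must be held back from garbage collection under two policies. The function `retained_backups`, the event format, and the RARI scores below are hypothetical and serve only to illustrate the one-backup-per-logical-page idea.

```python
def retained_backups(writes, keep_all, threshold=0.5):
    """Count old pages held back from GC after a sequence of overwrites.

    writes:   list of (lpn, rari_score) overwrite events.
    keep_all: True models retaining every invalidated page;
              False keeps one backup per LPN, only for suspicious writes.
    """
    if keep_all:
        return len(writes)
    protected = set()
    for lpn, score in writes:
        if score > threshold:
            protected.add(lpn)  # at most one backup page per logical page
    return len(protected)

# Two normal overwrites of LPN 20, two suspicious overwrites of LPN 10,
# and one normal overwrite of LPN 30 (scores are made up).
events = [(20, 0.05), (20, 0.10), (10, 0.97), (10, 0.99), (30, 0.02)]

naive = retained_backups(events, keep_all=True)
selective = retained_backups(events, keep_all=False)
```

Here the naive policy pins five pages while the selective policy pins one, and the pages that are not pinned remain ordinary invalid pages that garbage collection can reclaim.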
3. Intelligent Backup Control Mechanism
- Recovery Procedure
[Figure: a NAND block in the SSD before and after a recovery request. Before: pages in state VALID and BACKUP. After: the VALID pages that have a backup become INVALID, and the BACKUP pages become VALID.]
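The state change in the recovery figure can be written out as a small routine. This is a sketch under assumed data structures: the dictionaries `mapping`, `backup_of`, and `state` and the page numbers are hypothetical, not the device's internal tables.

```python
VALID, BACKUP, INVALID = "VALID", "BACKUP", "INVALID"

def recover(mapping, backup_of, state):
    """Roll each attacked logical page back to its backup physical page.

    mapping:   LPN -> PPN of the current (encrypted) page
    backup_of: LPN -> PPN of the preserved original page
    state:     PPN -> page state, updated in place
    """
    for lpn, backup_ppn in backup_of.items():
        encrypted_ppn = mapping[lpn]
        state[encrypted_ppn] = INVALID  # the encrypted copy becomes garbage
        state[backup_ppn] = VALID       # the original copy is live again
        mapping[lpn] = backup_ppn
    backup_of.clear()                   # no backups remain after recovery

mapping = {10: 4, 11: 5, 12: 6}
backup_of = {10: 0, 11: 1, 12: 2}
state = {0: BACKUP, 1: BACKUP, 2: BACKUP, 3: VALID, 4: VALID, 5: VALID, 6: VALID}
recover(mapping, backup_of, state)
```

After the call, the logical pages point at their original physical pages again, and page 3, which had no backup, is untouched.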
Evaluation Methodology
- MSR Disksim SSD Simulator
- Workload
• Linux Erebus ransomware
• User’s normal I/O
- Simulation setup
• SSD Occupancy: 20%, 40%, 80%
• Page size: 8 KB; pages per block: 128
- Comparison
• Baseline: SSD without backup mechanism
• FlashGuard
• SSD-Insider
• Amoeba
Result 1: Average Response Time
[Figure: bar chart of normalized average response time (ms) for Baseline, FlashGuard[1], FlashGuard[2], FlashGuard[4], FlashGuard[8], SSD-Insider, and Amoeba at SSD page occupancies of 20%, 40%, and 80%. Labelled bar values: 4.0546063 and 1.0817913.]
In the worst case, the response time of Amoeba increased by only 8% compared to the baseline.
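The 8% figure can be checked from the labelled bar value, assuming that 1.0817913 is Amoeba's worst-case response time normalized to a baseline of 1.0. The slide does not say which bar carries which label, so that pairing is an assumption.

```python
baseline = 1.0            # normalized baseline response time
amoeba_worst = 1.0817913  # labelled bar value, assumed to be Amoeba's worst case

increase_pct = (amoeba_worst - baseline) / baseline * 100
print(f"{increase_pct:.1f}%")  # prints 8.2%
```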
Result 2: Detection Accuracy
[Figure: bar chart of the number of occurrences of False Positive (FP), False Negative (FN), and Total for FlashGuard, SSD-Insider, and Amoeba. Labelled values: 11.11%, 2.79%, 0.68%. Annotations: "Decrease by 23 %", "Decrease by 4.5 %".]
Amoeba has less than 1% false detection.
Conclusion
• We presented Amoeba: An Autonomous Backup and Recovery SSD for Ransomware Attack Defense.
• Implemented the Amoeba DMA hardware engine to run the content-based detection algorithm.
• Proposed a Ransomware Attack Risk Indicator (RARI) metric.
• Provided an intelligent backup and recovery mechanism.
Thank you
Q&A
Donghyun Min
mdh38112@[Link]
Sogang University, South Korea
Backup slides 1: GC Calls
[Figure: bar chart of the number of GC calls for Baseline, FlashGuard[1], FlashGuard[2], FlashGuard[4], FlashGuard[8], SSD-Insider, and Amoeba at SSD page occupancies of 20%, 40%, and 80%.]
Backup slides 3: Recovery fail rate
[Figure: bar chart of the occurrence of recovery failures at First Recovery, Second Recovery, and Total for FlashGuard, SSD-Insider, and Amoeba.]