
Architectural Support For High Speed Protection of Memory Integrity and Confidentiality in Multiprocessor Systems

This document discusses architectural support for protecting memory integrity and confidentiality in multiprocessor systems. It proposes a platform-oriented security architecture that uses encryption and message authentication codes (MACs) to protect data integrity and confidentiality during sharing between processors in a shared memory multiprocessor system. The key aspects of the proposed architecture include using a MAC tree to authenticate physical memory blocks, encrypting cache lines during sharing using a one-time pad generated from a process key and bus sequence number, and allowing speculative execution using unverified data while authenticating it in the background. The document evaluates the proposed architecture using an extended multiprocessor simulator.

Uploaded by

larryshi
Copyright
© Attribution Non-Commercial (BY-NC)

Architectural Support for High Speed Protection of Memory Integrity and Confidentiality in Multiprocessor Systems

Weidong Shi, Hsien-Hsin (Sean) Lee, Mrinmoy Ghosh, Chenghuai Lu
Georgia Institute of Technology, Atlanta, GA 30332

Types of Security Attacks


Software-based attacks
- Software reverse engineering, disassembly
- Software patching

Hardware-based physical attacks
- Tracing the system from the system bus or a peripheral bus
- Differential power/timing analysis
- Building fake devices, device spoofing (MOD chip)
- Modifying RAM
- Replaying bus signals, injecting fake bus signals
- Triggering fake interrupts

[Figure: XBOX with a MOD-chip installed.] The MOD-chip is a low-cost bus snooping and spoofing device widely used to break XBOX security.
Shared-Memory MP Security Architecture 2

Cracking the XBOX


[Figure: XBOX attack setup. Diagram labels: P-III; Nbridge + GPU; Hyper-Transport; South Bridge; Secret Key; BIOS Flash (some BIOS code is encrypted); FPGA-based bus tracer attached through a socket soldered over the HT bus by hackers, used to find out the key; MOD chip (PCB with microcontroller and Flash memory), used for BIOS hijacking.]

Low-cost FPGA-based bus snooping device

Motivation
Unsolved issues of prior security measures
- Uni-processor based security model
- Protected memory cannot be shared
- Large space and performance overhead for security support
- Some designs trade away some security for better performance

Our Work
Protect integrity and confidentiality in a Shared-memory Multiprocessor platform

Agenda
- Uni-processor Security Architecture
- Platform-oriented Security Architecture
- Architectural Support for Shared Memory Integrity and Confidentiality
- Evaluation
- Conclusions

Insecure Uni-Processor Architecture


[Figure: uni-processor system block diagram. Labels: Processor (Processor Core, Caches); North Bridge (Mem Controller); RAM; South Bridge; Ethernet; Mouse; Keyboard; Disk.]

Secure Uni-Processor Architecture


[Figure: the same block diagram split into a Trusted Domain and an UnTrusted Domain. Labels: Secure Processor (Processor Core, Caches); North Bridge (Mem Controller); RAM; South Bridge; Ethernet; Mouse; Keyboard; Disk.]

Secure Uni-Processor Architecture


[Figure: secure uni-processor block diagram split into a Trusted Domain and an UnTrusted Domain. Labels: Secure Processor (Processor Core, Caches, Root Signature, Crypto Engine); MAC hash tree; North Bridge (Mem Controller); RAM (encrypted data & MAC code); South Bridge; Ethernet; Mouse; Keyboard; Disk.]

Not directly applicable to a Shared-memory Multiprocessor system

Basics: Integrity Check (MAC Authentication)


[Figure: MAC authentication. The sender feeds the N-bit plaintext and the secret key into a hash/encryption function to produce an M-bit MAC. The receiver recomputes the M-bit MAC with the same secret key and function; a mismatch raises an exception.]

- Sender and receiver share the same secret key
- Data tampering is detected using a Message Authentication Code (MAC)
- Any attempt by an adversary to modify data or forge a valid authentication code is detected with overwhelming probability
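The sender/receiver check described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the hardware design from the talk: HMAC-SHA-256 stands in for the unspecified "hash/encryption" function, `MAC_BYTES` (M = 128 bits) is an assumed size, and the key and data values are hypothetical.

```python
import hashlib
import hmac

MAC_BYTES = 16  # assumed M = 128 bits; the slide leaves M unspecified


def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender side: compute an M-bit MAC over the plaintext."""
    return hmac.new(key, message, hashlib.sha256).digest()[:MAC_BYTES]


def check_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the MAC and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)


key = b"hypothetical-shared-secret"   # shared by sender and receiver
block = bytes(range(32))              # hypothetical 32-byte data block
tag = make_mac(key, block)

# An adversary flips one bit of the data in transit.
tampered = bytes([block[0] ^ 0x01]) + block[1:]

intact_ok = check_mac(key, block, tag)        # receiver accepts
tampered_ok = check_mac(key, tampered, tag)   # receiver would raise an exception
```

A receiver would raise the "Exception" shown in the figure whenever `check_mac` returns `False`.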

Platform-oriented Security Architecture


[Figure: Processor 1 (PE 1) through Processor n (PE n), each with a Processor Core, Caches, and a Crypto Engine, exchange encrypted data and encrypted MACs with each other and with the North Bridge (PE 0), which contains a Crypto Engine and a MAC Tree Cache and connects to RAM. Callout: "Need to be protected".]

Cache-to-Cache
- send encrypted data first, followed by the encrypted MAC
- receiver decrypts the data and verifies its integrity

Cache-to-Memory
- send encrypted data and MAC to the Nbridge
- Nbridge decrypts the data, verifies its integrity, updates the MAC tree, and stores the encrypted data to the RAM

Protection on the RAM MAC Tree


[Figure: MAC tree. A Root MAC sits above intermediate MAC nodes, whose leaves cover 32B RAM blocks.]

- An M-ary MAC (message authentication code) tree protects physical memory integrity dynamically (e.g., against replay attacks).
- The root MAC is a signature of the protected memory space.
- The root MAC is kept inside the North Bridge.
- Frequently accessed MAC tree nodes are cached inside the Nbridge.
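The following Python sketch shows why keeping only the root MAC in trusted storage is enough to catch a replay of stale RAM contents. It is an illustration under stated assumptions: `ARITY = 4`, the HMAC-SHA-256 node function, `KEY`, and the helper names `build_tree` and `verify_block` are all hypothetical; the slides only specify an M-ary tree over 32B blocks with the root held in the North Bridge.

```python
import hashlib
import hmac

BLOCK = 32   # RAM block size shown on the slide
ARITY = 4    # assumed M; the slide only says "M-ary"
KEY = b"hypothetical-nbridge-key"


def mac(data: bytes) -> bytes:
    return hmac.new(KEY, data, hashlib.sha256).digest()


def build_tree(blocks):
    """Return the tree as a list of levels: leaf MACs first, root level last."""
    levels = [[mac(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([mac(b"".join(prev[i:i + ARITY]))
                       for i in range(0, len(prev), ARITY)])
    return levels


def verify_block(index, block, levels, trusted_root):
    """Recompute the path from one block up to the root and compare."""
    node = mac(block)
    for level in levels[:-1]:
        group = index - index % ARITY
        siblings = list(level[group:group + ARITY])
        siblings[index % ARITY] = node   # use the recomputed value, not the stored one
        node = mac(b"".join(siblings))
        index //= ARITY
    return hmac.compare_digest(node, trusted_root)


blocks = [bytes([i]) * BLOCK for i in range(16)]
levels = build_tree(blocks)              # stored in untrusted RAM
root = levels[-1][0]                     # kept inside the North Bridge
ok_before = verify_block(5, blocks[5], levels, root)

# Legitimate write: block 5 changes, tree and trusted root are updated.
old_block, old_levels = blocks[5], levels
blocks[5] = b"\xff" * BLOCK
levels = build_tree(blocks)
root = levels[-1][0]

# Replay attack: adversary restores the old block and the old tree nodes.
replay_ok = verify_block(5, old_block, old_levels, root)
fresh_ok = verify_block(5, blocks[5], levels, root)
```

A real design would update only the path from the written block to the root instead of rebuilding the whole tree; the full rebuild here just keeps the sketch short.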

Platform-oriented Security Architecture


[Figure: Processor 1 (PE 1) through Processor n (PE n), each with a Processor Core, Caches, and a Crypto Engine, exchange encrypted data and encrypted MACs with each other and with the North Bridge (PE 0), which contains a Crypto Engine and a MAC Tree Cache and connects to RAM.]

Cache-to-Cache
- send encrypted data first, followed by the encrypted MAC
- receiver decrypts the data and verifies its integrity

Cache-to-Memory
- send encrypted data and MAC to the Nbridge
- Nbridge decrypts the data, verifies its integrity, updates the MAC tree, and stores the encrypted data to the RAM

Memory-to-Cache
- Nbridge reads encrypted data and MAC from the RAM
- Nbridge decrypts the data, verifies its MAC, re-encrypts the data, and puts the encrypted data and MAC on the shared bus
- receiver decrypts the data and verifies its integrity

Platform-oriented Security Architecture


- Physical memory (RAM) authentication
  - MAC tree
- Protected data sharing
  - Encryption using the bus sequence number and the process key
- Authentication speculative execution (ASE)

Basics: Counter Mode Encryption


[Figure: counter-mode encryption of block A. Sender and receiver each feed "Init. Counter + 0" and the secret key into a block cipher or cryptographic hash to produce the same pseudo-random pad. The sender XORs the pad with Plaintext A to get Ciphertext A; the receiver XORs the pad with Ciphertext A to recover Plaintext A.]

To send a data sequence securely:
- Sender and receiver share a secret key and an initial counter value.
- A pseudo-random pad is generated deterministically.
- The counter value does not need to be a secret.

Basics: Counter Mode Encryption


[Figure: counter-mode encryption of block B. Sender and receiver each feed "Init. Counter + 1" and the secret key into a block cipher or cryptographic hash to produce the same pseudo-random pad. The sender XORs the pad with Plaintext B to get Ciphertext B; the receiver XORs the pad with Ciphertext B to recover Plaintext B.]

Counter values increment coherently for both parties in a predetermined sequence.
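The two counter-mode slides can be summarized with a short Python sketch. It assumes HMAC-SHA-256 as the keyed "block cipher or cryptographic hash" and limits blocks to 32 bytes; the `Party` class, key, and counter values are hypothetical names chosen for illustration.

```python
import hashlib
import hmac


def pad(key: bytes, counter: int, length: int) -> bytes:
    """Pseudo-random pad derived from the secret key and the public counter."""
    out = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return out[:length]   # sketch handles blocks of up to 32 bytes


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class Party:
    def __init__(self, key: bytes, init_counter: int):
        self.key = key
        self.counter = init_counter

    def process(self, block: bytes) -> bytes:
        """Encrypt or decrypt one block (same XOR), then advance the counter."""
        result = xor(block, pad(self.key, self.counter, len(block)))
        self.counter += 1
        return result


key = b"hypothetical-shared-key"
sender = Party(key, init_counter=1000)
receiver = Party(key, init_counter=1000)

plaintext_a = b"block A: 16 byte"
plaintext_b = b"block B: 16 byte"

ciphertext_a = sender.process(plaintext_a)    # uses Init. Counter + 0
ciphertext_b = sender.process(plaintext_b)    # uses Init. Counter + 1
recovered_a = receiver.process(ciphertext_a)
recovered_b = receiver.process(ciphertext_b)
```

Because the pad depends only on the key and the counter, both parties can compute it before the data arrives, which is the property the later OTP pre-computation slides rely on.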

How to Encrypt each Transaction?


[Figure: the 256-bit process key and the bus sequence number feed a cryptographic hash, which produces a One-Time-Pad (OTP). The OTP is combined with the cache line to produce the encrypted data.]

OTP generation inputs
- Bus sequence number
- Process key

Bus sequence number
- a 64-bit secret
- initialized after the system is booted
- shared by all the parties connected to the shared bus
- incremented after each transaction

All PEs on the shared bus snoop each bus transaction.

The OTP can be pre-computed based on an approximate range of bus sequence numbers.
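As a rough model of the scheme on this slide, the Python sketch below derives each pad from the process key and the current bus sequence number, and has every PE advance its own copy of the sequence number when it snoops a transaction. Several details are assumptions made for the example: SHA-256 as the cryptographic hash, XOR as the way the OTP is combined with the cache line (following the counter-mode slides), a 32-byte cache line, and the `PE` and `bus_transfer` names.

```python
import hashlib

LINE_BYTES = 32             # assumed cache line size (matches the 32B RAM block)
SEQ_MASK = (1 << 64) - 1    # 64-bit bus sequence number


def otp(process_key: bytes, bus_seq: int) -> bytes:
    """One-time pad for a single bus transaction."""
    return hashlib.sha256(process_key + bus_seq.to_bytes(8, "big")).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class PE:
    def __init__(self, process_key: bytes, initial_seq: int):
        self.process_key = process_key
        self.bus_seq = initial_seq

    def encrypt_for_bus(self, line: bytes) -> bytes:
        return xor(line, otp(self.process_key, self.bus_seq))

    def decrypt_from_bus(self, payload: bytes) -> bytes:
        return xor(payload, otp(self.process_key, self.bus_seq))

    def snoop(self) -> None:
        """Every PE sees every transaction and advances its sequence number."""
        self.bus_seq = (self.bus_seq + 1) & SEQ_MASK


def bus_transfer(sender, receiver, all_pes, line):
    payload = sender.encrypt_for_bus(line)
    plain = receiver.decrypt_from_bus(payload)
    for pe in all_pes:
        pe.snoop()
    return payload, plain


process_key = bytes(range(32))   # hypothetical 256-bit process key
pes = [PE(process_key, 0x1234ABCD0000) for _ in range(3)]
line = b"\xaa" * LINE_BYTES

payload1, plain1 = bus_transfer(pes[0], pes[1], pes, line)
payload2, plain2 = bus_transfer(pes[0], pes[1], pes, line)
```

Sending the same cache line twice yields two different ciphertexts because the sequence number has advanced, and the third PE stays synchronized even though it took part in neither transfer.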

Generating Process Key & Bus Sequence Number


[Figure: generation of the process key and the initial bus sequence number. Two AES encryption blocks, each using the session key. Input labels: "Secret Constant" (burned inside each PE) and "Process unique ID" (by secure kernel). Output labels: "Process Key" and "Initial Bus Sequence Number" (initiated every time the system boots).]

The bus sequence number works similarly to the counter in counter-mode encryption.

Session Key Generation (Distribution)


[Figure: session key generation during system boot. Processors PE0, PE1, ..., PE n-1 and the secure memory controller (PE n) each broadcast a random number and receive the random numbers from the others. The random numbers from PE0 through PEn, together with a secret hash key (burned inside each PE, same for each PE), are hashed with SHA256 to produce a 128-bit session key.]
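A possible reading of this boot-time exchange, sketched in Python: every PE ends up with the same ordered list of random numbers and the same burned-in secret, so each can derive the same session key locally. How the secret hash key and the random numbers are combined is not stated on the slide; using HMAC-SHA-256 keyed with the secret and truncating the digest to 128 bits is an assumption, as are the function and variable names.

```python
import hashlib
import hmac
import secrets

SECRET_HASH_KEY = b"hypothetical-burned-in-secret"   # same in every PE


def session_key(secret_hash_key: bytes, random_numbers) -> bytes:
    """Derive a 128-bit session key from all broadcast random numbers."""
    material = b"".join(random_numbers)        # ordered by PE index
    digest = hmac.new(secret_hash_key, material, hashlib.sha256).digest()
    return digest[:16]                         # assumed truncation to 128 bits


# Each PE (including the memory controller) broadcasts one random number.
num_pes = 4
broadcast = [secrets.token_bytes(16) for _ in range(num_pes)]

# Every PE has snooped the same broadcasts, so every PE derives the same key.
key_at_pe0 = session_key(SECRET_HASH_KEY, broadcast)
key_at_pe3 = session_key(SECRET_HASH_KEY, broadcast)

# An observer who saw the broadcasts but lacks the burned-in secret gets a different key.
key_at_observer = session_key(b"guess", broadcast)
```

Since every PE contributes fresh randomness at each boot, the session key changes from one boot to the next even though the burned-in secret stays fixed.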

Protected Data Sharing Operations


[Figure: protected data sharing between Processor A and Processor B. On each side, the 256-bit process key and the bus sequence number feed a cryptographic hash that produces an OTP (one-time pad). Processor A combines its OTP with the data block to produce the encrypted data; Processor B combines the same OTP with the encrypted data to recover the data block.]

OTP Pre-computing
[Figure: OTP generation takes the process key and the latest bus sequence number (+1, +2, +3, ...) and fills an OTP queue: OTP(0x1234abcd0000), OTP(0x1234abcd0001), OTP(0x1234abcd0002), ..., OTP(0x1234abcd001e), OTP(0x1234abcd001f). The PE sends a request for bus ownership to the bus arbitration logic on the shared bus. Ownership is granted with current bus sequence number = 0x1234abcd001e, and OTP(0x1234abcd001e) is taken from the queue for the data to be transmitted.]

- OTP generation is on the critical path.
- We can pre-compute the OTPs needed in the neighborhood of the latest bus sequence number.
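The pre-computation idea can be sketched as a small lookup structure in Python. The window size of 32 entries, the SHA-256 pad function, and the `OtpQueue` class with its `refill` and `take` methods are assumptions made for this example; the sequence numbers reuse the values shown in the figure.

```python
import hashlib


def otp(process_key: bytes, bus_seq: int) -> bytes:
    return hashlib.sha256(process_key + bus_seq.to_bytes(8, "big")).digest()


class OtpQueue:
    """Holds pads for sequence numbers near the latest one seen on the bus."""

    def __init__(self, process_key: bytes, window: int = 32):
        self.process_key = process_key
        self.window = window
        self.pads = {}

    def refill(self, latest_seq: int) -> None:
        """Off the critical path: drop stale pads, compute the upcoming ones."""
        self.pads = {s: p for s, p in self.pads.items() if s >= latest_seq}
        for seq in range(latest_seq, latest_seq + self.window):
            if seq not in self.pads:
                self.pads[seq] = otp(self.process_key, seq)

    def take(self, granted_seq: int):
        """On bus grant: return (pad, hit). A miss falls back to the slow path."""
        pad = self.pads.pop(granted_seq, None)
        if pad is not None:
            return pad, True
        return otp(self.process_key, granted_seq), False


process_key = bytes(range(32))            # hypothetical 256-bit process key
queue = OtpQueue(process_key)
queue.refill(0x1234ABCD0000)              # latest bus sequence number observed

pad_hit, hit = queue.take(0x1234ABCD001E)     # ownership granted inside the window
pad_miss, miss_hit = queue.take(0x1234ABCD0040)   # outside the window
```

A hit means the pad is already available when bus ownership is granted, so only the combination with the data remains on the critical path; a miss pays the full hash latency. Sequence-number wraparound is ignored in this sketch.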

OTP Pre-Computing
[Figure: same data path as in "Protected Data Sharing Operations". On both Processor A and Processor B, the 256-bit process key and the bus sequence number feed a cryptographic hash that produces the OTP (one-time pad); the OTP converts the data block to encrypted data on the sending side and back to the data block on the receiving side.]

Split Transaction of Data and MAC


[Figure: Processors A, B, and C on a shared bus. Sequence Authentication Buffer fields: ID, MAC, Valid, Verified, OTP. Example bus traffic, with data and MAC sent as separate transactions: Data(id, seq), Data(id+1, seq+1), MAC(id-3, seq-3), Data(id+2, seq+2), MAC(id, seq), ...]

Authentication Speculative Execution (ASE)


Performance side:
- Allow execution to continue using unverified data
- Allow execution to continue using results derived from unverified data

Security side:
- Under counter mode, instructions and data may be altered by hackers. Authentication has to be performed in a timely fashion to prevent attacks that flip individual bits of encrypted data/instructions.
- Memory state should not be altered using results of unverified data
- An instruction fetch should not be issued to memory if it is determined by control flow that uses unverified data
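The rule that memory state must not change based on unverified data can be illustrated with a toy Python model. It is a software analogy only, not the hardware mechanism from the talk: the `SAB` and `Core` classes, the idea of tagging each register with the set of buffer entries it depends on, and the in-order store queue are all assumptions made for the sketch.

```python
class SAB:
    """Toy authentication buffer: tracks which fetched lines are verified."""

    def __init__(self):
        self.verified = {}
        self.next_tag = 1

    def allocate(self) -> int:
        tag = self.next_tag
        self.next_tag += 1
        self.verified[tag] = False
        return tag

    def mark_verified(self, tag: int) -> None:
        self.verified[tag] = True

    def all_verified(self, tags) -> bool:
        return all(self.verified[t] for t in tags)


class Core:
    def __init__(self, sab: SAB):
        self.sab = sab
        self.regs = {}            # name -> (value, set of SAB tags it depends on)
        self.memory = {}
        self.pending_stores = []  # stores waiting for authentication

    def load(self, dst: str, value: int) -> int:
        """Speculatively use loaded data before its MAC has been checked."""
        tag = self.sab.allocate()
        self.regs[dst] = (value, frozenset({tag}))
        return tag

    def alu(self, dst: str, fn, *srcs: str) -> None:
        values = [self.regs[s][0] for s in srcs]
        tags = frozenset().union(*(self.regs[s][1] for s in srcs))
        self.regs[dst] = (fn(*values), tags)   # result inherits all source tags

    def store(self, addr: int, src: str) -> None:
        value, tags = self.regs[src]
        self.pending_stores.append((addr, value, tags))
        self.drain()

    def drain(self) -> None:
        """Commit stores in order, but only once all their sources are verified."""
        while self.pending_stores and self.sab.all_verified(self.pending_stores[0][2]):
            addr, value, _ = self.pending_stores.pop(0)
            self.memory[addr] = value


sab = SAB()
core = Core(sab)
t3 = core.load("r3", 10)
core.alu("r4", lambda a: a * 3, "r3")     # continues on unverified data
core.store(0x100, "r4")
held_back = 0x100 not in core.memory      # store waits for authentication

sab.mark_verified(t3)                     # MAC check completes in the background
core.drain()
committed = core.memory[0x100]
```

Computation proceeds immediately on the loaded value, while the store is the only point that waits; that is the performance/security split listed on the slide.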

ASE
Example instruction sequence:

    0: r3 = (addr1)
    1: r4 = r3*const1
    2: r5 = r4+const2
    3: r6 = (addr2)
    4: if (r5<r6) {
    5: } else {
    6:   r7 = r6 + r1 }
    7: (addr3) = r7

[Figure: dataflow graph of the sequence above. "Load r3" and "Load r6" produce values tagged with Sequential Authentication Buffer (SAB) tags; the tags propagate to the dependent registers r4, r5, and r7 (tag values shown in the figure: 1, 2, 3). A status table lists each MAC as Fetched and then Verified. Annotations: at the branch r5<r6, "Wait if Icache miss"; at "Save r7", "Wait until all the data sources are verified".]

Evaluation Methodology
- RSIM multiprocessor (MP) simulator
- Benchmarks: SPLASH, SPLASH-2
- Modified the RSIM simulator to support bus-snooping cache coherence
- Added an accurate DRAM model
- Added shared memory support
- Implemented a North Bridge simulator with MAC tree authentication
- Extended the processor model to support performance simulation of the proposed protection, including speculative authentication

Non-Speculative (AIO) vs. ASE


[Figure: two bar charts, "Authentication Performance (2P)" and "Authentication Performance (4P)". Y-axis: normalized IPC (0 to 1.2). Benchmarks: fft, lu, radix, quicksort, water, mp3d, and the average. Series: AIO and ASE.]

ASE outperforms non-speculative in-order execution (AIO) by 80% on both 2-processor (2P) and 4-processor (4P) systems.

Data Confidentiality
[Figure: bar chart, "Performance of Protection on Confidentiality (4P)". Y-axis: normalized IPC (0 to 1). Benchmarks: fft, lu, radix, quicksort, water, mp3d, and the average. Series: no cache, 8KB seq# cache, 32KB seq# cache.]

- 40 to 55% performance loss compared to no security support
- The more cache-to-cache transactions, the faster the execution, due to OTP pre-computation
- With a sequence number cache, memory-to-cache operations can be accelerated by ~30%

Conclusions
- Proposed a security scheme to protect confidentiality and integrity of shared memory in snoop-bus multiprocessor systems.
- Proposed a number of techniques to minimize the overhead caused by security protection, including:
  - Physical memory (RAM) authentication
  - Shared bus sequence number based encryption
  - Split transmission of data and MAC
  - Authentication Speculative Execution that does not violate the rules of authentication safety
- Lightweight secure processor design with novel security design features (offload to the North Bridge).

Questions & Answers & Entertaining


That's All Folks!
