Data Security
Data
• Computer asset
“Data is a precious thing and will last longer than the
systems themselves”
Tim Berners-Lee
• Why is securing data so difficult?
– Data can flow and rest anywhere unless it is controlled
11/14/2022
Sensitive Data
• Any information that isn’t public or unclassified
• Information that an organization needs to protect
– Due to its value to the organization
– To comply with existing laws and regulations
Sensitive Data
Personally Identifiable Information (PII)
• Any information that can identify an individual.
• NIST Special Publication (SP) 800-122 provides a more formal definition
Proprietary data
• Any data that helps an organization maintain a competitive edge
• Copyrights, patents, and trade secret laws provide a level of protection
for proprietary data
• Many criminals don’t pay attention to copyrights, patents, and laws
Data States
• Data exists in one of three states: at rest, in motion, or in use
Data at Rest
• Any data stored on media such as system hard drives, external USB
drives, storage area networks (SANs), and backup tapes, etc.
• More vulnerable
– Threats can reach it over our systems and networks, and through physical access to
the device
Example
– Personal Health Information (PHI) breaches
• An employee left backup tapes containing PHI on some 4.9 million patients
unattended in his car
Data at Rest
• Solutions for protecting data at rest
– Hash
– Message Authentication Codes
– Encryption (disk encryption, e.g., BitLocker in Windows 10)
Data at Rest
One-way Hash Functions
• Produces a fixed-size output (the hash, hash value, fingerprint, or message
digest) from a message of arbitrary length
• One-way Hash Properties:
– One-way: Given a hash value h, it is difficult to find m such that hash(m) = h
– Collision resistant: Difficult to find distinct m1 and m2 such that hash(m1) =
hash(m2)
• Common One-way Hash Functions:
– MD (Message Digest) series: MD2, MD4, MD5, and MD6
– SHA (Secure Hash Algorithm) series: SHA-0, SHA-1 (160 bit),
SHA-2 (SHA-256 and SHA-512), and SHA-3
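The fixed-size property can be sketched with Python's standard hashlib module. The helper name digest_sizes and the sample messages are illustrative assumptions, not part of the slides.

```python
import hashlib

# A subset of the algorithms named above that hashlib provides.
ALGORITHMS = ["sha1", "sha256", "sha512", "sha3_256"]

def digest_sizes(message: bytes) -> dict:
    """Return the digest length in bits for each algorithm (hypothetical helper)."""
    return {name: hashlib.new(name, message).digest_size * 8 for name in ALGORITHMS}

short = digest_sizes(b"a")                # 1-byte message
long_ = digest_sizes(b"a" * 1_000_000)    # 1 MB message

# The digest length depends only on the algorithm, not on the message length.
print(short)
print(short == long_)
```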
Data at Rest
MD One-Way Hash Functions
• MD stands for Message Digest
• Includes MD2, MD4, MD5, and MD6
• Status of Algorithms:
– MD2, MD4 - Severely broken (obsolete)
– MD5 - Collision resistance property broken, one-way property not
broken
– MD6 - Developed in response to the NIST call for proposals for SHA-3
Data at Rest
SHA One-way Hash Function
• Published by NIST
• Includes SHA-0, SHA-1, SHA-2, and SHA-3
• Status of Algorithms:
– SHA-0: withdrawn due to flaw
– SHA-1: Designed by NSA; collision attack found in 2017
– SHA-2: Designed by NSA; includes SHA-256 and SHA-512; no significant attack found yet
– SHA-3: Released in 2015
Data at Rest
How a One-Way Hash Algorithm Works
• Based on a compression function h() applied to fixed-size input
blocks (Mi)
– A 1-bit change in the input produces unpredictable changes in the output
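The effect of a 1-bit input change can be observed directly. The following is a minimal sketch using SHA-256 from hashlib; the sample message and the helper bit_difference are assumptions made for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"data security"
# Flip the lowest bit of the first byte: a 1-bit change in the input.
flipped = bytes([original[0] ^ 0x01]) + original[1:]

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(flipped).digest()

changed = bit_difference(h1, h2)
print(f"{changed} of {len(h1) * 8} output bits changed")
```

For a well-behaved hash function, roughly half of the output bits are expected to change.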
Data at Rest
openssl
Data at Rest
Programming Languages support
• Many languages, including C/C++, Python, SQL, and PHP,
provide support for hash functions
• Language specific:
– MySQL - SHA2 function
– Python - Use hashlib package
– C - Use functions from openssl/sha.h header
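As a small illustration of the Python option, hashlib can hash a message in one call or incrementally with update(). The sample message is an assumption; both forms are expected to give the same digest.

```python
import hashlib

message = b"Data is a precious thing"

# One-shot: hash the whole message at once.
one_shot = hashlib.sha256(message).hexdigest()

# Incremental: feed the message in pieces with update().
h = hashlib.sha256()
h.update(b"Data is ")
h.update(b"a precious thing")
incremental = h.hexdigest()

print(one_shot)
print(one_shot == incremental)
```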
Data at Rest
Applications of One-Way Hash Functions
• Integrity Verification
• Password Verification
• Trusted Timestamping
• Blockchain and Bitcoin
Integrity Verification
• Changing one bit of the original data changes the hash value
• Usage examples:
– Detect changes in system files
– Save the hash value of each file, rather than the entire file (impractical), in a
safe place
– Detect whether a file downloaded from a website is corrupted
– Many file download sites publish a hash value for each file
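A file integrity check along these lines might look like the sketch below. The helper names file_sha256 and is_unmodified are hypothetical, and a temporary file stands in for a system file or download.

```python
import hashlib
import os
import tempfile

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def is_unmodified(path: str, trusted_hash: str) -> bool:
    """Compare the current hash of a file with a hash saved earlier."""
    return file_sha256(path) == trusted_hash

# Demo on a temporary file (assumed stand-in for a real file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"original contents")
    demo_path = tmp.name

trusted = file_sha256(demo_path)            # the value kept in a safe place
before = is_unmodified(demo_path, trusted)  # file not yet touched

with open(demo_path, "ab") as f:            # simulate tampering or corruption
    f.write(b"!")
after = is_unmodified(demo_path, trusted)

os.remove(demo_path)
print(before, after)
```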
Data at Rest
Message Authentication Code (MAC)
• Attach a tag (MAC) to the data – one way to verify the integrity of the
data
• Ways to generate a MAC
– Attach a one-way hash of the data to the data
– Weakness: an attacker who modifies the data can re-compute the hash
– Use a secret key K, shared between sender and receiver, in the hash
– Hash(K || M)
– A man-in-the-middle (MITM) cannot compute the hash without the secret key
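The difference between the two approaches can be sketched as follows. The key, the messages, and the helper names plain_tag and keyed_tag are assumptions for illustration; Hash(K || M) is shown only because the slide names it, and it is vulnerable to length extension with hashes such as SHA-256.

```python
import hashlib

def plain_tag(message: bytes) -> bytes:
    """Unkeyed hash: anyone, including an attacker, can recompute it."""
    return hashlib.sha256(message).digest()

def keyed_tag(key: bytes, message: bytes) -> bytes:
    """Hash(K || M) as written on the slide (hypothetical helper name)."""
    return hashlib.sha256(key + message).digest()

key = b"shared-secret-key"           # assumed example key
forged = b"transfer 9999 to Eve"     # message substituted by the attacker

# Unkeyed: the attacker replaces both the message and its tag.
attacker_plain = plain_tag(forged)
receiver_accepts_plain = plain_tag(forged) == attacker_plain

# Keyed: without K the attacker can only guess, so the tag does not verify.
attacker_keyed = keyed_tag(b"wrong-guess", forged)
receiver_accepts_keyed = keyed_tag(key, forged) == attacker_keyed

print(receiver_accepts_plain, receiver_accepts_keyed)
```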
Data at Rest
Message Authentication Code (MAC)
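• Length extension attack on MAC
– Given Hash(K || M), an attacker can compute the hash of K || M extended with
extra data, without knowing K (for hashes such as MD5, SHA-1, and SHA-256)
• Key and message need to be mixed properly before computing the hash
• Simple concatenation (K || M) is not secure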
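One standard way to mix key and message properly is HMAC, which the slide does not name; it is swapped in here as an assumption about what "mixed properly" refers to. A minimal sketch with Python's hmac module, using hypothetical helper names make_tag and verify_tag:

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared-secret-key"            # assumed example key
message = b"transfer 10 to Bob"
tag = make_tag(key, message)

ok = verify_tag(key, message, tag)                      # genuine message
extended = verify_tag(key, message + b" and Eve", tag)  # altered message

print(ok, extended)
```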
Data at Rest
Encryption
• Most operating systems provide tools to encrypt individual files or entire
volumes
– Disk encryption (e.g., BitLocker in Windows 10)
– Third-party software is also available
• With the increase in processor power, encryption causes no noticeable decrease in the
performance of computers that use it
• NIST Special Publication 800-111
– Guide to Storage Encryption Technologies for End User Devices,
provides a good, if somewhat dated (2007), approach to this topic
Data in Motion
• Any data transmitted over a network
– Data transmitted over an internal network using wired or wireless
methods
– Data transmitted over public networks such as the internet
• Protection
– Transport Layer Security (TLS version 1.2 and later)
– VPN
Data in Motion
TLS
• Relies on digital certificates to certify the identity of one or both
endpoints
• Supports multiple cipher suites
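A client-side configuration reflecting these points might look like the sketch below, which uses Python's standard ssl module. It only builds a context and opens no connection; the variable names are assumptions.

```python
import ssl

# The default context verifies the server certificate and its hostname.
context = ssl.create_default_context()

# Refuse protocol versions older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

# List the cipher suites this context is willing to negotiate.
cipher_names = [c["name"] for c in context.get_ciphers()]

print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
print(len(cipher_names), "cipher suites enabled")
```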
Data in Motion
VPN
• Allows users to create a secure, private network over a public
network such as the Internet
• This is achieved by:
– Designating a host (the VPN server) on the network
– Requiring outside computers to authenticate to the VPN server and go through it
to reach the hosts inside the private network
– Exposing only the VPN server to the outside, while the internal computers remain
protected by firewalls
Data in Motion
VPN
IPSec Tunneling
TLS/SSL Tunneling
Data in Use
• Data in memory or temporary storage buffers, while an
application is using it
• In most operating systems today, the data must be decrypted
before it is used
– Because an application can’t process encrypted data
– This exposes the data to side-channel attacks against memory shared by multiple
processes
Data in Use
Side Channel Attacks
• Safe software infrastructure does not mean safe execution
• Information leaks because of the underlying hardware
• Any attack based on information gained from the physical implementation of a
system (process), rather than theoretical weaknesses in the algorithms
• Such attacks are enabled by the microarchitectural design of the CPU and rely on
information gained from the implementation of the computer system
– Cache: monitors how long data accesses take and infers whether or
not the data was in the cache
– Timing: monitors the time the machine takes to perform various
computations
– Power-monitoring: monitors the power the hardware consumes during
various computations
– …
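A timing side channel can be illustrated with a comparison routine that stops at the first mismatch. In this sketch a step counter stands in for elapsed time so the result is deterministic; the secret, the guesses, and the helper leaky_equals are assumptions.

```python
import hmac

def leaky_equals(secret: bytes, guess: bytes):
    """Early-exit comparison; returns (equal, bytes_compared).

    The amount of work depends on how many leading bytes are correct,
    which is what a timing attacker would measure.
    """
    steps = 0
    if len(secret) != len(guess):
        return False, steps
    for s, g in zip(secret, guess):
        steps += 1
        if s != g:
            return False, steps
    return True, steps

secret = b"hunter2!"
_, steps_bad = leaky_equals(secret, b"XXXXXXXX")    # wrong at the first byte
_, steps_close = leaky_equals(secret, b"huntXXXX")  # first four bytes correct

print(steps_bad, steps_close)  # → 1 5

# hmac.compare_digest is designed so that its running time does not
# depend on where the inputs differ.
constant_time_result = hmac.compare_digest(secret, b"huntXXXX")
```

A guess with more correct leading bytes takes more steps, so an attacker who can measure the running time can recover the secret byte by byte.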
Data in Use
Cache Side Channel Attacks
• The side channel comes from monitoring how quickly data can be
accessed from the cache
– Data that is accessed quickly => stored in the cache
– Data that is accessed slowly => stored in main memory
• Meltdown and Spectre attacks
– Allow a program to read memory holding the secrets of other
programs and the operating system
– Take advantage of three design features of modern processors
• Out-of-order Execution
• Speculative Execution
• Caching
Data in Use
Out-of-order Execution
• Allows a high-performance microprocessor to begin executing
instructions as soon as their operands are ready
• Avoids a class of stalls that occur when the data needed to
perform an operation is unavailable
• Example
char secret = *(char*) 0xffffffff81a000e0;  // read of a kernel address from user mode; raises a fault
printf("%c\n", secret);                     // transient instruction: may execute before the fault takes effect
Data in Use
Speculative Execution
• A technique used by modern CPUs to speed up performance
• The CPU may execute certain tasks ahead of time, "speculating"
that they will be needed and complete them
– If the tasks are required, a speed-up is achieved, because the work is
already complete
– If the tasks are not required, changes made by the tasks are reverted and
the results are ignored
• Example
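int data[] = {1, 2, 3, 4};
size_t data_size = 4;
size_t input = 1000;           // out-of-bounds index
int secret = 0;
if (input < data_size) {       // the CPU may speculate that this check passes
    secret = data[input];      // executed speculatively; the result is later discarded
}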
Data in Use
Flush + Reload
• Flush the memory you control from the cache
(using clflush)
• Let the victim (or user) program run and access the memory you
control at a location that depends on the secret
• Reload the elements of the controlled memory and measure how
quickly each one is accessed; the fast one was cached by the victim
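The three steps can be walked through with a toy simulation. This is not a real attack: the cache is modeled as a Python set, and the latencies, the threshold, and all names (ToyCache, victim, flush_reload) are assumptions chosen for illustration.

```python
CACHE_HIT_CYCLES = 40     # assumed latency of a cached access
CACHE_MISS_CYCLES = 300   # assumed latency of a main-memory access
THRESHOLD = 100           # assumed cutoff separating hits from misses

class ToyCache:
    """A set of cached indices standing in for the CPU cache."""

    def __init__(self):
        self.lines = set()

    def flush(self, index: int) -> None:
        """Evict one element from the cache (the role of clflush)."""
        self.lines.discard(index)

    def access(self, index: int) -> int:
        """Return the simulated latency and leave the element cached."""
        latency = CACHE_HIT_CYCLES if index in self.lines else CACHE_MISS_CYCLES
        self.lines.add(index)
        return latency

def victim(cache: ToyCache, secret: int) -> None:
    """The victim touches the shared array at a secret-dependent index."""
    cache.access(secret)

def flush_reload(cache: ToyCache, secret: int, n: int = 256) -> list:
    for i in range(n):          # 1. flush every element we control
        cache.flush(i)
    victim(cache, secret)       # 2. let the victim run
    # 3. reload every element and keep the ones that come back fast
    return [i for i in range(n) if cache.access(i) < THRESHOLD]

recovered = flush_reload(ToyCache(), secret=94)
print(recovered)  # → [94]
```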