The MD6 Hash Function (aka "Pumpkin Hash"), Ronald L. Rivest, MIT CSAIL

The document describes the MD6 hash function. It summarizes the design considerations that went into MD6, including making it resistant to differential and side-channel attacks. MD6 uses a tree-based mode of operation that works well in parallel and allows for sequential and hybrid modes. The compression function uses only basic bitwise operations. Software and hardware implementations show MD6 to be efficient. Security analysis provides proofs that MD6 maintains cryptographic properties and is resistant to known attacks.

Uploaded by

Razer Cicak
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
50 views56 pages

The MD6 Hash Function Ronald L. Rivest Mit Csail (Aka "Pumpkin Hash")

The document describes the MD6 hash function. It summarizes the design considerations that went into MD6, including making it resistant to differential and side-channel attacks. MD6 uses a tree-based mode of operation that works well in parallel and allows for sequential and hybrid modes. The compression function uses only basic bitwise operations. Software and hardware implementations show MD6 to be efficient. Security analysis provides proofs that MD6 maintains cryptographic properties and is resistant to known attacks.

Uploaded by

Razer Cicak
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPT, PDF, TXT or read online on Scribd
You are on page 1/ 56

The MD6 Hash Function

(aka Pumpkin Hash)


Ronald L. Rivest
MIT CSAIL
CRYPTO 2008

MD6 Team

Dan Bailey
Sarah Cheng
Christopher Crutchfield
Yevgeniy Dodis
Elliott Fleming
Asif Khan
Jayant Krishnamurthy
Yuncheng Lin
Leo Reyzin
Emily Shen
Jim Sukha
Eran Tromer
Yiqun Lisa Yin

Juniper Networks
Cilk Arts
NSF

Outline
Introduction
Design considerations
Mode of Operation
Compression Function
Software Implementations
Hardware Implementations
Security Analysis

MD5 was designed in 1991
Same year WWW announced
Clock rates were 33MHz
Requirements:
{0,1}* → {0,1}^d for digest size d
Collision-resistance
Preimage resistance
Pseudorandomness

What's happened since then?
Lots
What should a hash function (MD6) look like today?

NIST SHA-3 competition!
Input: 0 to 2^64 - 1 bits, size not known in advance
Output sizes 224, 256, 384, 512 bits
Collision-resistance, preimage resistance, second preimage resistance, pseudorandomness, ...
Simplicity, flexibility, efficiency, ...
Due Halloween '08

Design Considerations / Responses

Wang et al. break MD5 (2004)
Differential cryptanalysis (re)discovered by Biham and Shamir (1990). Considers step-by-step "difference" (XOR) between two computations
Applied first to block ciphers (DES)
Used by Wang et al. to break collision-resistance of MD5
Many other hash functions broken similarly; others may be vulnerable

So MD6 is
provably resistant to differential attacks (more on this later)

Memory is now "plentiful"
Memory capacities have increased 60% per year since 1991
Chips have 1000 times as much memory as they did in 1991
Even "embedded" processors typically have at least 1KB of RAM

So MD6 has
Large input message block size:
512 bytes (not 512 bits)
This has many advantages

Parallelism has arrived

Uniprocessors have hit the wall
Clock rates have plateaued, since power usage is quadratic or cubic with clock rate:
P = VI = V^2/R = O( freq^2 ) (roughly)
Instead, number of cores will double with each generation: tens, hundreds (thousands!) of cores coming soon
[Figure: 16, 64, 256 cores]

So MD6 has
Bottom-up tree-based mode of
operation (like Merkle-tree)
4-to-1 compression ratio at each node

Which works very well in parallel
Height is log_4( number of nodes )
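
The logarithmic height can be sketched numerically. This is a minimal illustration, assuming one level-1 node per 512-byte block and up to four children per parent; the function name `tree_levels` is hypothetical and not part of MD6.

```python
BLOCK_BYTES = 512   # MD6 data block size, as stated on the slides
FAN_IN = 4          # 4-to-1 compression at each node, as stated on the slides


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def tree_levels(message_bytes: int) -> int:
    """Number of tree levels needed to reduce a message to one root node.

    Assumption: level 1 has one node per 512-byte block (at least one node,
    even for an empty message); each higher level groups up to four children.
    """
    nodes = max(1, ceil_div(message_bytes, BLOCK_BYTES))
    levels = 1
    while nodes > 1:
        nodes = ceil_div(nodes, FAN_IN)
        levels += 1
    return levels


for size in (512, 2048, 1 << 20, 1 << 30):
    print(f"{size:>12} bytes -> {tree_levels(size)} levels")
```

Under these assumptions the level count grows by one each time the message gets four times longer, which is why storage proportional to height stays small even for very large inputs.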

But most CPUs are small
Most biomass is bacteria
Storage proportional to tree height may be too much for some CPUs

So MD6 has
Alternative sequential mode
[Figure: sequential mode, starting from IV]
(Fits in 1KB RAM)

Actually, MD6 has
a smooth sequence of alternative modes: from purely sequential to purely hierarchical. L parallel layers followed by a sequential layer, 0 ≤ L ≤ 64
Example: L=1:
[Figure: one parallel layer followed by a sequential layer starting from IV]

Hash functions often "keyed"
Salt for password, key for MAC, variability for key derivation, theoretical soundness, etc.
Current modes are "post-hoc"

So MD6 has
Key input K of up to 512 bits
K is input to every compression function

Generate-and-paste attacks
Kelsey and Schneier (2004), Joux (2004), ...
Generate sub-hash and fit it in somewhere
Has advantage proportional to size of initial computation

So MD6 has
1024-bit intermediate (chaining) values
root truncated to desired final length
Location (level,index) input to each node
[Figure: tree nodes labeled (2,0), (2,1), (2,2), (2,3)]

Extension attacks
Hash of one message useful to compute hash of another message (especially if keyed):
H( K || A || B ) = H( H( K || A ) || B )
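
The identity above can be reproduced with a toy iterated hash. This is a sketch only: `toy_compress` and `toy_iterated_hash` are hypothetical stand-ins built on truncated SHA-256, with no padding and no final-block marker, and they are not MD6 or any real MAC.

```python
import hashlib

BLOCK = 16  # toy block size in bytes (an assumption for this sketch)


def toy_compress(chain: bytes, block: bytes) -> bytes:
    """Stand-in compression function (truncated SHA-256), not MD6's."""
    return hashlib.sha256(chain + block).digest()[:BLOCK]


def toy_iterated_hash(message: bytes, start: bytes = bytes(BLOCK)) -> bytes:
    """Plain iterated hash with no padding and no final-block marker."""
    if len(message) % BLOCK:
        raise ValueError("toy hash expects whole blocks")
    chain = start
    for offset in range(0, len(message), BLOCK):
        chain = toy_compress(chain, message[offset:offset + BLOCK])
    return chain


key, a, b = b"K" * BLOCK, b"A" * BLOCK, b"B" * BLOCK
tag = toy_iterated_hash(key + a)            # the attacker sees only this value
forged = toy_iterated_hash(b, start=tag)    # computed without knowing the key
assert forged == toy_iterated_hash(key + a + b)
```

Because the published digest is exactly the internal chaining state, anyone holding it can keep hashing; the extension works here precisely because nothing distinguishes the final compression call from an intermediate one.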

So MD6 has
"Root bit" (aka "z-bit" or "pumpkin bit") input to each compression function:
[Figure: z-bit is True at the root node]
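
A toy tree hash shows how the location and the root bit enter every node. This is a sketch under stated assumptions: `toy_compress` is a hypothetical stand-in built on SHA-256, the 64-byte chunk size is chosen for brevity, and the byte layout of the header is invented for illustration rather than taken from MD6.

```python
import hashlib

CHUNK = 64   # toy leaf size in bytes (MD6 itself uses 512-byte blocks)
FAN_IN = 4   # children per parent node


def toy_compress(key: bytes, level: int, index: int,
                 is_root: bool, data: bytes) -> bytes:
    """Stand-in node function: key, location and root bit are all hashed in."""
    header = (len(key).to_bytes(1, "big") + key
              + level.to_bytes(1, "big") + index.to_bytes(7, "big")
              + bytes([1 if is_root else 0]))
    return hashlib.sha256(header + data).digest()


def toy_tree_hash(message: bytes, key: bytes = b"") -> bytes:
    """Bottom-up 4-ary tree; only the last node is called with is_root=True."""
    inputs = [message[i:i + CHUNK] for i in range(0, len(message), CHUNK)] or [b""]
    level = 1
    while True:
        is_root = len(inputs) == 1
        outputs = [toy_compress(key, level, i, is_root, d)
                   for i, d in enumerate(inputs)]
        if is_root:
            return outputs[0]
        # Each parent consumes the concatenated outputs of up to four children.
        inputs = [b"".join(outputs[i:i + FAN_IN])
                  for i in range(0, len(outputs), FAN_IN)]
        level += 1


print(toy_tree_hash(bytes(range(256)), key=b"demo key").hex())
```

In this toy, the final digest always comes from a call with the root bit set, so it can never coincide by construction with an inner-node input of a longer message, which is the intuition behind the z-bit defence against extension.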

Side-channel attacks
Timing attacks, cache attacks
Operations with data-dependent timing or data-dependent resource usage can produce vulnerabilities.
This includes data-dependent rotations, table lookups (S-boxes), some complex operations (e.g. multiplications), ...

So MD6 uses
Operations on 64-bit words
The following operations only:
XOR ⊕
AND ∧
SHIFT by fixed amounts: x >> r, x << l
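
The two fixed shifts are combined with XOR in each step, and that combination is invertible on 64-bit words. The sketch below illustrates this; the names `mix` and `unmix` are hypothetical, and the shift amounts in the usage line are arbitrary examples.

```python
MASK64 = (1 << 64) - 1


def mix(x: int, r: int, l: int) -> int:
    """x ^= x >> r, then x ^= x << l, on a 64-bit word with fixed shifts."""
    x ^= x >> r
    x ^= (x << l) & MASK64
    return x


def unmix(y: int, r: int, l: int) -> int:
    """Invert mix(): each xorshift is undone by XOR-ing all its shifted copies."""
    # Undo the left xorshift: x = y ^ (y << l) ^ (y << 2l) ^ ...
    x, s = y, l
    while s < 64:
        x ^= (y << s) & MASK64
        s += l
    # Undo the right xorshift on the intermediate value z.
    z, s = x, r
    while s < 64:
        x ^= z >> s
        s += r
    return x


word = 0x0123456789ABCDEF
assert unmix(mix(word, 10, 11), 10, 11) == word
```

Since the shift amounts are constants, neither direction branches on data or indexes a table with data, which is the side-channel property the slide is after.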

Security needs vary
Already recognized by having different digest lengths d (for MD6: 1 ≤ d ≤ 512)
But it is useful to have reduced-strength versions for analysis, simple applications, or different points on speed/security curve.

So MD6 has
A variable number r of rounds. ( Each round is 16 steps. )
Default r depends on digest size d :
r = 40 + (d/4)

d    160  224  256  384  512
r     80   96  104  136  168

But r is also an (optional) input.
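
The default round count follows directly from the formula on this slide. A minimal sketch, assuming integer division for digest sizes that are not multiples of 4 (the slide only tabulates multiples of 4) and using the hypothetical name `default_rounds`:

```python
STEPS_PER_ROUND = 16  # from the slide: each round is 16 steps


def default_rounds(digest_bits: int) -> int:
    """Default number of rounds r = 40 + d/4 for a digest of d bits."""
    if not 1 <= digest_bits <= 512:
        raise ValueError("MD6 digest sizes range from 1 to 512 bits")
    return 40 + digest_bits // 4


for d in (160, 224, 256, 384, 512):
    r = default_rounds(d)
    print(f"d={d:3d}  r={r:3d}  steps={r * STEPS_PER_ROUND}")
```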

MD6 Compression function

Compression function inputs
64 word (512 byte) data block: message, or chaining values
8 word (512 bit) key K
1 word U = (level, index)
1 word V = parameters:
    Data padding amount
    Key length (0 ≤ keylen ≤ 64 bytes)
    z-bit (aka "root bit" aka "pumpkin bit")
    L (mode of operation height-limit)
    digest size d (in bits)
    Number r of rounds
74 words total

Prepend Constant + Map + Chop
Prepend: const (15 words) + key+UV (8+2 words) + data (64 words) = 89 words
Map: 1-1 map, 89 words → 89 words
Chop: 89 words → 16 words
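
The word counts in this pipeline can be checked with a small layout sketch. The helper names `prepend` and `chop` are hypothetical, U and V are treated as opaque 64-bit words, and the all-zero constant in the usage lines is a placeholder rather than the constant MD6 actually uses.

```python
def prepend(const, key, u, v, data):
    """Assemble the 89-word input: 15 const + 8 key + U + V + 64 data words."""
    for name, words, expected in (("const", const, 15),
                                  ("key", key, 8),
                                  ("data", data, 64)):
        if len(words) != expected:
            raise ValueError(f"{name} must be {expected} words, got {len(words)}")
    return list(const) + list(key) + [u, v] + list(data)


def chop(words, keep=16):
    """Keep the last 16 words of the mapped state as the chaining value."""
    return list(words[-keep:])


state = prepend(const=[0] * 15,      # placeholder constant, not MD6's
                key=[0] * 8,
                u=0, v=0,            # opaque location and parameter words
                data=list(range(64)))
print(len(state), len(chop(state)))  # 89 words in, 16 words kept
```

The caller-supplied part is 8 + 2 + 64 = 74 words, matching the total on the inputs slide; the 15 constant words bring the state to 89.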

Simple compression function:
Input: A[ 0 .. 88 ] of A[ 0 .. 16r + 88 ]
for i = 89 to 16r + 88 :
    x = S_i ⊕ A[ i-17 ] ⊕ A[ i-89 ]
          ⊕ ( A[ i-18 ] ∧ A[ i-21 ] )
          ⊕ ( A[ i-31 ] ∧ A[ i-67 ] )
    x = x ⊕ ( x >> r_i )
    A[i] = x ⊕ ( x << l_i )
return A[ 16r + 73 .. 16r + 88 ]
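
The loop on this slide translates almost line for line into Python. This is a sketch, not the reference implementation: the shift tables are the values I believe the MD6 specification uses and should be verified before any real use, the round constants follow the recurrence given on the backup slide at the end of the deck, and the name `compress_loop` is hypothetical.

```python
MASK64 = (1 << 64) - 1
N = 89       # words of state visible to each step
STEPS = 16   # steps per round

# Assumed per-step shift amounts (believed to match the MD6 specification;
# verify against the spec before relying on them).
RIGHT_SHIFTS = (10, 5, 13, 10, 11, 12, 2, 7, 14, 15, 7, 13, 11, 7, 6, 12)
LEFT_SHIFTS = (11, 24, 9, 16, 15, 9, 27, 15, 6, 2, 29, 8, 15, 5, 31, 9)


def round_constants(rounds):
    """S_0 and S_{j+1} = (S_j <<< 1) xor (S_j and mask), per the backup slide."""
    s, mask, out = 0x0123456789ABCDEF, 0x7311C2812425CFA0, []
    for _ in range(rounds):
        out.append(s)
        s = (((s << 1) | (s >> 63)) & MASK64) ^ (s & mask)
    return out


def compress_loop(a, rounds):
    """Extend A[0..88] by 16*rounds words and return the last 16 of them."""
    if len(a) != N:
        raise ValueError("expected 89 input words")
    A = [w & MASK64 for w in a]
    S = round_constants(rounds)
    for i in range(N, STEPS * rounds + N):
        step = i - N
        x = (S[step // STEPS] ^ A[i - 17] ^ A[i - 89]
             ^ (A[i - 18] & A[i - 21])
             ^ (A[i - 31] & A[i - 67]))
        x ^= x >> RIGHT_SHIFTS[step % STEPS]
        x ^= (x << LEFT_SHIFTS[step % STEPS]) & MASK64
        A.append(x)
    return A[-16:]   # same as A[16r+73 .. 16r+88]


digest_words = compress_loop(list(range(N)), rounds=104)
print(" ".join(f"{w:016x}" for w in digest_words[:2]), "...")
```

The input list here is arbitrary test data, so the printed words are not an MD6 digest of anything; a real caller would build the 89 words from the constant, key, U, V and data block.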

Constants
Taps 17, 18, 21, 31, 67 optimize diffusion
Constants S_i defined by simple recurrence; change at end of each 16-step round
Shift amounts repeat each round (best diffusion of 1,000,000 such tables):

step   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
r_i   10   5  13  10  11  12   2   7  14  15   7  13  11   7   6  12
l_i   11  24   9  16  15   9  27  15   6   2  29   8  15   5  31   9

Large Memory (sliding window)
Array of 16r + 89 64-bit words.
Each computed as function of preceding 89 words.
Last 16 words computed are output.

Small memory (shift register)
Shift-register of 89 words (712 bytes)
Data moves right to left
[Figure: 89-word shift register with feedback taps, S_i and shifts]
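
The small-memory variant keeps only the most recent 89 words. The sketch below uses a bounded `collections.deque` as the shift register; the function name is hypothetical, and the round constants and shift tables in the usage lines are placeholders for illustration, not MD6's values.

```python
from collections import deque

MASK64 = (1 << 64) - 1


def compress_shift_register(a, rounds, round_consts, right_shifts, left_shifts):
    """Same recurrence as the array version, but with only 89 words of state."""
    if len(a) != 89:
        raise ValueError("expected 89 input words")
    reg = deque((w & MASK64 for w in a), maxlen=89)  # reg[k] holds A[i-89+k]
    for step in range(16 * rounds):
        x = (round_consts[step // 16]
             ^ reg[89 - 17] ^ reg[0]
             ^ (reg[89 - 18] & reg[89 - 21])
             ^ (reg[89 - 31] & reg[89 - 67]))
        x ^= x >> right_shifts[step % 16]
        x ^= (x << left_shifts[step % 16]) & MASK64
        reg.append(x)   # the oldest word falls off the left end
    return list(reg)[-16:]


# Placeholder parameters, chosen only to exercise the code.
demo_consts = [0x0123456789ABCDEF + j for j in range(12)]
demo_right = list(range(1, 17))
demo_left = list(range(16, 0, -1))
print(len(compress_shift_register(list(range(89)), 12,
                                  demo_consts, demo_right, demo_left)))
```

A deque is convenient here but indexes its middle in linear time; a fixed 89-entry circular buffer with a moving start index would be the more natural choice on a small microcontroller.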

Software Implementations

Software implementations

Simplicity of MD6:
Same implementation for all digest sizes.
Same implementation for SHA-3 Reference or SHA-3 Optimized Versions.
Only optimization is loop-unrolling (16 steps within one round).

NIST SHA-3 Reference Platforms

            32-bit       64-bit
MD6-160     44 MB/sec    97 MB/sec
MD6-224     38 MB/sec    82 MB/sec
MD6-256     35 MB/sec    77 MB/sec
MD6-384     27 MB/sec    59 MB/sec
MD6-512     22 MB/sec    49 MB/sec
SHA-512     38 MB/sec    202 MB/sec

Multicore efficiency
[Figure: multicore performance of MD6-256 and SHA-256]
Cilk!

Efficiency on a GPU
Standard $100 NVidia GPU
375 MB/sec on one card

8-bit processor (Atmel)
With L=0 (sequential mode), uses less than 1KB RAM.
20 MHz clock
110 msec/comp. fn for MD6-224 (gcc actual)
44 msec/comp. fn for MD6-224 (assembler est.)

Hardware Implementations

FPGA Implementation (MD6-512)
Xilinx XUP FPGA (14K logic slices)
5.3K slices for round-at-a-time
7.9K slices for two-rounds-at-a-time
100MHz clock
240 MB/sec (two-rounds-at-a-time)
(Independent of digest size due to memory bottleneck)

Security Analysis

Generate and paste attacks (again)
Because compression functions are "location-aware", attacks that do speculative computation hoping to "cut and paste" it in somewhere don't work.

Property-Preservations

Theorem. If f is collision-resistant, then MD6^f is collision-resistant.
Theorem. If f is preimage-resistant, then MD6^f is preimage-resistant.
Theorem. If f is a FIL-PRF, then MD6^f is a VIL-PRF.
Theorem. If f is a FIL-MAC and root node effectively uses distinct random key (due to z-bit), then MD6^f is a VIL-MAC.
(See thesis by Chris Crutchfield.)

Indifferentiability (Maurer et al. '04)
Variant notion of indistinguishability appropriate when distinguisher has access to inner component (e.g. mode of operation MD6^f / comp. fn f).
[Figure: MD6^f with a FIL RO, or a VIL RO?]

Indifferentiability (I)
Theorem. The MD6 mode of operation is indifferentiable from a random oracle.
Proof: Construct simulator for compression function that makes it consistent with any VIL RO and MD6 mode of operation
Advantage: 2q^2 / 2^1024
where q = number of calls (measured in terms of compression function calls).

Indifferentiability (II)
Theorem. MD6 compression function f is indifferentiable from a FIL random oracle (with respect to random permutation π).
Proof: Construct simulator S for π and π^-1 that makes it consistent with FIL RO and comp. fn. construction.
Advantage: q / 2^1024 + 2q^2 / 2^4672
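
To get a feel for how small the two indifferentiability bounds are, they can be evaluated in log scale. A minimal sketch, using hypothetical function names and working with base-2 logarithms because the raw values underflow floating point:

```python
import math


def log2_mode_advantage(q: int) -> float:
    """log2 of 2*q^2 / 2^1024, the bound quoted for the mode of operation."""
    return 1 + 2 * math.log2(q) - 1024


def log2_compression_advantage(q: int) -> float:
    """log2 of q / 2^1024 + 2*q^2 / 2^4672, the bound quoted for f."""
    first = math.log2(q) - 1024
    second = 1 + 2 * math.log2(q) - 4672
    hi, lo = max(first, second), min(first, second)
    gap = hi - lo
    # log2(2^hi + 2^lo) = hi + log2(1 + 2^-gap); the correction vanishes
    # numerically once the gap is large.
    correction = math.log2(1 + 2.0 ** -gap) if gap < 1000 else 0.0
    return hi + correction


for log_q in (64, 128, 256):
    q = 1 << log_q
    print(f"q = 2^{log_q}: mode 2^{log2_mode_advantage(q):.0f}, "
          f"compression 2^{log2_compression_advantage(q):.0f}")
```

The 1024 in both bounds is the width of the chaining value in bits, and 4672 = 89 × 64 − 1024 is the number of state bits discarded by the chop.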

SAT-SOLVER attacks
Code comp. fn. as set of clauses, try to find inverse or collision with Minisat
With many days of computing:
Solved all problems of 9 rounds or less.
Solved some 10- or 11-round ones.
Never solved a 12-round problem.
Note: 11 rounds ≈ 2 rotations (passes over data)

Statistical tests
Measure influence of an input bit on all output bits; use Anderson-Darling A*^2 test on set of influences.
Can't distinguish from random beyond 12 rounds.

Differential attacks don't work
Theorem. Any standard differential attack has less chance of finding collision than standard birthday attack.
Proof. Determine lower bound on number of active AND gates in 15 rounds using sophisticated backtracking search and days of computing. Derive upper bound on probability of differential path.

Differential attacks (cont.)
Compare birthday bound BB with our lower bound LB on work for any standard differential attack.
(Gives adversary fifteen rounds for message modification, etc.)
These bounds can be improved

d     r     BB      LB
160   80    2^80    2^104
224   96    2^112   2^130
256   104   2^128   2^150
384   136   2^192   2^208
512   168   2^256   2^260
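
The comparison in this table can be checked mechanically: the birthday bound for a d-bit digest is 2^(d/2), and every listed lower bound should exceed it. A small sketch, with the LB exponents copied from the slide and the dictionary name `BOUNDS` chosen for illustration:

```python
# digest size d -> (rounds r, log2 of the differential lower bound LB),
# copied from the table on this slide.
BOUNDS = {
    160: (80, 104),
    224: (96, 130),
    256: (104, 150),
    384: (136, 208),
    512: (168, 260),
}

for d, (r, log2_lb) in BOUNDS.items():
    log2_bb = d // 2                 # birthday bound is 2^(d/2)
    margin = log2_lb - log2_bb       # how far LB sits above BB, in bits
    print(f"d={d:3d} r={r:3d}  BB=2^{log2_bb:<3d} LB=2^{log2_lb:<3d} margin={margin} bits")
```

The margin shrinks as the digest grows, from 24 bits at d = 160 down to 4 bits at d = 512, which fits the slide's remark that the bounds can be improved.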

Choosing number of rounds
We don't know how to break any security properties of MD6 for more than 12 rounds.
For digest sizes 224 to 512, MD6 has 96 to 168 rounds.
Current defaults probably conservative.
Current choice allows proof of resistance to differential cryptanalysis.

Summary
MD6 is:
Arguably secure against known attacks (including differential attacks)
Relatively simple
Highly parallelizable
Reasonably efficient

THE END

MD6
03744327e1e959fbdcdf7331e959cb2c28101166

Round constants S_i
Since they only change every 16 steps, let S_j be the round constant for round j.
S_0 = 0x0123456789abcdef
S_{j+1} = (S_j <<< 1) ⊕ (S_j ∧ mask)
mask = 0x7311c2812425cfa0
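
The recurrence can be run directly. A minimal sketch, assuming `<<<` denotes a 64-bit left rotation; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1
S0 = 0x0123456789ABCDEF
RECURRENCE_MASK = 0x7311C2812425CFA0


def rotl64(x: int, n: int) -> int:
    """Rotate a 64-bit word left by n bits (0 < n < 64)."""
    return ((x << n) | (x >> (64 - n))) & MASK64


def round_constants(rounds: int) -> list:
    """Return [S_0, S_1, ..., S_{rounds-1}] from the recurrence on the slide."""
    constants, s = [], S0
    for _ in range(rounds):
        constants.append(s)
        s = rotl64(s, 1) ^ (s & RECURRENCE_MASK)
    return constants


for j, s in enumerate(round_constants(4)):
    print(f"S_{j} = 0x{s:016x}")
```

Generating the constants on the fly takes one rotate, one AND and one XOR per round, so no table of constants needs to be stored.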
