Lecture.3

The document provides an introduction to cryptography, covering fundamental concepts such as binary systems, modular arithmetic, and types of ciphers including steganography and cryptography. It explains the Caesar cipher and transposition ciphers, as well as techniques for cryptanalysis using letter frequency analysis. The document emphasizes the relationship between human ingenuity and the ability to create and resolve ciphers.


INTRODUCTION TO

CRYPTOGRAPHY
“All communication involves some sort of encoding of messages”—John Pierce, An Introduction to Information Theory: Symbols, Signals and Noise
“It may be roundly asserted that human ingenuity cannot concoct a cipher which human ingenuity cannot resolve.”—Edgar Allan Poe, “A Few Words on Secret Writing,” Graham’s Magazine, July 1841, 19:33-38
Five Minute Binary Refresher
• Binary and Decimal systems
• Decimal number line (Base 10): 0,1,2,3,4,5,6,7,8,9,?
• Binary number line (Base 2): 0,1,?
• How to count to 10 in binary?
• The math is actually fun (it’s like playing blackjack):
• The powers of 2 are 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, etc.
• To manually convert any number to binary, we simply count the additive powers:
• 53 = (32 + 16 + 4 + 1) = (1 x 32) + (1 x 16) + (0 x 8) + (1 x 4) + (0 x 2) + (1 x 1) = 110101
• To verify: python3 -c "print(bin(53))" | cut -c 3-
• 86 = (64 + 16 + 4 + 2) = (1 x 64) + (0 x 32) + (1 x 16) + (0 x 8) + (1 x 4) + (1 x 2) + (0 x 1) = 1010110
• To verify: python3 -c "print(bin(86))" | cut -c 3-
• To reverse: 1010110 = (0 x 2^0) + (1 x 2^1) + (1 x 2^2) + (0 x 2^3) + (1 x 2^4) + (0 x 2^5) + (1 x 2^6) = 0 + 2 + 4 + 0 + 16 + 0 + 64 = 86
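The manual powers-of-two method above can be sketched in a few lines of Python (the helper names `to_binary` and `from_binary` are illustrative, not from the course materials; the built-in `bin()` does the same job):

```python
def to_binary(n):
    """Convert a non-negative integer to a binary string by counting additive powers of 2."""
    if n == 0:
        return "0"
    bits = []
    power = 1
    while power * 2 <= n:
        power *= 2              # find the largest power of 2 that fits in n
    while power >= 1:
        if n >= power:          # this power "fits": emit a 1 and subtract it
            bits.append("1")
            n -= power
        else:
            bits.append("0")
        power //= 2
    return "".join(bits)

def from_binary(s):
    """Reverse direction: sum the powers of 2 whose bit is set."""
    return sum(2 ** i for i, bit in enumerate(reversed(s)) if bit == "1")

print(to_binary(53))            # 110101
print(to_binary(86))            # 1010110
print(from_binary("1010110"))   # 86
```

As the slide's verification commands suggest, `to_binary(n)` should always agree with `bin(n)` minus its `0b` prefix.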
Exclusive Or Refresher
• The Exclusive Or (XOR) operation states:
• 0⨁0=0
• 1⨁1=0
• 1⨁0=1
• 0⨁1=1
• Thus:
• α⨁α=0
• (α ⨁ β) ⨁ β = α
• Thus, the binary equivalent of hex 5 (0x05, 5d) is 0101, which, when XORd with 1111 (hex
0x0F, 15d) yields: 1010, or hex A (decimal 10), or (1010)2 = 0x0A
• The binary equivalent of hex E (0x0E, 14d) is 1110, when XORd with 1111 yields 0001, or hex 1
(decimal 1), i.e. (0001)2 = 0x01
• (10101010)2 = (AA)16 = 0xAA; (01010101)2 = (55)16
• 10101010 ⨁ 01010101 = 11111111 (0xFF) [0xAA ⨁ 0x55 = 0xFF]
• Example: view and execute: src.lecture/lecture.3/xor/xor.py
  echo -n 'CAESAR' | xxd -b
  echo -n 'ATTACK' | xxd -b
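The XOR identities above (α ⨁ α = 0 and (α ⨁ β) ⨁ β = α) can be checked directly in Python, using the same hex values as the slide (a quick sketch, not the course's xor.py):

```python
# The hex examples from the slide: 0x05 XOR 0x0F and 0x0E XOR 0x0F.
print(format(0x05 ^ 0x0F, '04b'))   # 1010 -> 0x0A
print(format(0x0E ^ 0x0F, '04b'))   # 0001 -> 0x01
print(hex(0xAA ^ 0x55))             # 0xff

# The self-inverse property, which is the basis of XOR-based ciphers:
key = 0b01010101
msg = 0b10101010
ct = msg ^ key
assert ct ^ key == msg              # (msg ^ key) ^ key recovers msg
assert key ^ key == 0               # anything XORed with itself is 0
```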
Modular Arithmetic
• Modular arithmetic is a way we can make a finite field closed under defined operations,
including addition, subtraction, multiplication and division
• Modular arithmetic is concerned with integers, aka whole numbers
• Take 17 / 5 = 3 with a remainder of 2
• It is sometimes useful to focus on the remainder of division, so, we say that 17 mod 5 ≡ 2,
that is to say, 17 modulo 5 is congruent to 2
• In 17 mod 5 ≡ 2, 5 is known as the modulus
• The modulus allows us to keep the remainder within a certain range (N – 1) [viz. 0 – 4]:
range: {0...4} range: {0...4} range: {0...4} range: {0...4}
0 mod 5 ≡ 0 5 mod 5 ≡ 0 10 mod 5 ≡ 0 15 mod 5 ≡ 0
1 mod 5 ≡ 1 6 mod 5 ≡ 1 11 mod 5 ≡ 1 16 mod 5 ≡ 1
2 mod 5 ≡ 2 7 mod 5 ≡ 2 12 mod 5 ≡ 2 17 mod 5 ≡ 2 R<5
3 mod 5 ≡ 3 8 mod 5 ≡ 3 13 mod 5 ≡ 3 18 mod 5 ≡ 3
4 mod 5 ≡ 4 9 mod 5 ≡ 4 14 mod 5 ≡ 4 19 mod 5 ≡ 4
Modular Arithmetic
• If two numbers have the same remainder on division by N, we regard them as equivalent modulo N: x ≡ y (mod N) (e.g., 6/5, 11/5 and 16/5 each leave remainder 1, thus 6, 11 and 16 are all equivalent modulo 5)
• If N is a positive integer, arithmetic modulo N uses only the integers in the series 0, 1, 2, 3, ...
N-1, that is to say, integers from 0 to N-1
• In the US, we use modular arithmetic every day, with a 12-hour clock wherein we use
addition modulo 12 (10 PM + 6 hours = (10+6) - 12 = 4 AM, or by counting six hours
around the 12-hour clock: 11, 12, 1, 2, 3, 4)
• We also use modular arithmetic every day in calculating the days of the week (N = 7) and
for determining even vs odd (we don’t even need Euclid) (N=2)
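The everyday examples above map directly onto Python's `%` operator (a small sketch; the clock arithmetic adjusts for hours being numbered 1 through 12 rather than 0 through 11):

```python
# Remainders stay within the range 0 .. N-1:
print(17 % 5)                        # 2, i.e., 17 mod 5 is congruent to 2
print([n % 5 for n in range(10)])    # remainders cycle: 0,1,2,3,4,0,1,2,3,4

# 12-hour clock: 10 PM plus 6 hours (shift to 0-based, mod, shift back)
print((10 + 6 - 1) % 12 + 1)         # 4, i.e., 4 AM

# Parity is just arithmetic modulo 2:
print(86 % 2)                        # 0 -> even
```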
Modular Arithmetic
• Modulo Definition (per Gauss):
Let a, r, m ∈ ℤ (where ℤ is a set of all integers) and m > 0. We write

a ≡ r mod m
(a is congruent to r modulo m)

If m evenly divides a – r (without a remainder)


m is called the modulus and r is called the remainder

• For example, 42 = (4 x 9) + 6 (four 9’s make 36, to which 6 is added, yielding 42).
• a = 42
• r=6
• m=9
• We say 42 ≡ 6 mod 9; where 9 | 42 – 6 (36) : 36 / 9 = 4 (‘|’ means “evenly divides”)
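Gauss's divisibility condition can be checked directly in code (`congruent` is a throwaway helper name, not from the course materials):

```python
def congruent(a, r, m):
    """a ≡ r (mod m) iff m evenly divides a - r."""
    return (a - r) % m == 0

print(congruent(42, 6, 9))    # True: 9 | (42 - 6), i.e., 9 | 36
print(congruent(12, 3, 9))    # True: 9 | 9
print(congruent(12, 21, 9))   # True: 9 | -9
print(congruent(12, -6, 9))   # True: 9 | 18
print(congruent(13, 3, 9))    # False
```

Note the function accepts negative remainders too, matching the equivalence-class discussion that follows.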
Equivalence Classes
• For every given modulus m and number a, there are infinitely many valid remainders.
• For example [a ≡ r mod m iff m | (a-r): (42 – 6) / 9 = 4]:
• 12 ≡ 3 mod 9, 3 is a valid remainder since 9 | (12 – 3) (read: 9 divides (12 – 3), 9 divides 9)
• 12 ≡ 21 mod 9, 21 is a valid remainder since 9 | (21 – 3) (9 divides 18)
• 12 ≡ –6 mod 9, –6 is a valid remainder since 9 | (–6 – 3) (9 divides –9)
• The set of numbers:
{...,-24, -15, -6, 3, 12, 21, 30,...}
form what is known as an equivalence class. There are eight other such classes just for the
modulus 9:
{...,-27,-18,-9,0,9,18,27,...},{...,-26,-17,-8,1,10,19,28,...}, ...,{...,-19,-10,-1,8,17,26,35,...}
• For a given modulus m, it does not matter which element from a class we choose for a
given computation, as they are all equivalent.
• This property of equivalence classes has major positive implications in modern
cryptography, especially concerning modular exponentiation...
INTRODUCTION TO
CRYPTOGRAPHY
“Human ingenuity cannot concoct a cipher which human ingenuity cannot
resolve.”—Edgar Allan Poe, “A Few Words on Secret Writing,” Graham’s Magazine,
July 1841, 19:33-38
Fundamental Types
• Steganography (στεγανός means covered and γράφειν means to write)
• Message is not visible, but remains plaintext if anyone can uncover it...
• Demaratus, a Greek exile, sent a warning back to Greece about a forthcoming attack by Xerxes King of
the Persians by writing it directly on the wooden backing of a wax tablet before applying beeswax, thus
the beeswax covered the writing.
• Herodotus tells the story of Histaiaeus who sent a message to Aristagoras of Miletus to revolt against
the Persian King, by shaving the head of his most trusted servant, "marking" the message onto his scalp,
then sending him on his way once his hair had regrown
• Other classical (and modern) uses involve invisible ink, ancient uses of the thithymallus plant as
reported by Pliny the Elder (d. AD 79, Vesuvius) and modern uses include invisible ink kits for kids
• 20th century uses include German Abwehr agents using 1 millimeter diameter “microdots”, which used
subminiature photography to embed of a page of documents into a ‘dot’ that looked to the naked eye
as a period.
• Modern uses of steganography including hiding messages in graphical images, and including Hewlett-
Packard color laser printers add tiny yellow dots to each page, which contain encoded printer serial
numbers and date and time stamps
Fundamental Types
• Cryptography (κρυπτός means hidden and γράφειν means to write)
• Message is modified into some form of meaningless characters but may be quite
visible (as in the Scytale leather belt worn by Spartan spies)

The Captain Midnight Code-o-Graph, from Ovaltine, 1938-1949
The Scytale (σκυτάλη) of Sparta, where the key is the diameter of the baton
• Substitution ciphers
• Transposition ciphers
Caesar Substitution Cipher
• In classical cryptography, there were four main types of substitution ciphers:
• Simple Substitution (as in cryptograms in newspapers)
• Homophonic Substitution (a single letter maps to a set of potential ciphertext values, e.g. the letter “A”
could map to multiple numbers)
• Polygramatic substitution (where blocks of characters are encrypted together)
• Polyalphabetic Substitution (repetitive key encryption, e.g. Vigenère cipher)
• The Caesarean Cipher (aka an additive cipher) is a simple substitution cipher meaning that
a given letter is “shifted” forward a certain number of letters in an alphabet.
• Classic Caesarean Cipher is described by Julius Caesar in the Gallic Wars, which used a
simple shift-3 (add 3) algorithm, where A=>D, B=>E, C=>F, etc.:

ABCDEFGHIKLMNOPQRSTVXYZ
• So in the shift-3 cipher, “CAESAR” would encrypt as: “FDHXDV”
Caesar Cipher
• Suetonius in his Lives of the Caesars (2nd century A.D.), describes the Caesar Cipher with a
rotation of 3 characters
• The end letters X, Y and Z could not be shifted down, so they would “roll back” to the top of
the alphabet, X becoming A and Y becoming B and Z becoming C.
• To decrypt the ciphertext, simply reverse the direction of the algorithm (“F” becomes “C”)
So:
xi , yi , k ∈ { 0...22 }, in alphabet ℤ23
Encryption: yi = (xi + k) mod 23
Decryption: xi = (yi - k) mod 23

ABCDEFGHIKLMNOPQRSTVXYZ
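The y = (x + k) mod 23 formulas above can be sketched over the 23-letter classical Latin alphabet shown (an illustration, assuming the alphabet string exactly as printed on the slide):

```python
# Classical Latin alphabet: no J, U, or W, so the modulus is 23.
ALPHABET = "ABCDEFGHIKLMNOPQRSTVXYZ"

def caesar(text, k):
    """Shift each letter k positions (mod 23); negative k decrypts."""
    return "".join(ALPHABET[(ALPHABET.index(c) + k) % 23] for c in text)

ct = caesar("CAESAR", 3)
print(ct)                 # FDHXDV, matching the shift-3 example
print(caesar(ct, -3))     # CAESAR: decryption just reverses the shift
```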
Transposition Cipher
• In a transposition cipher the plaintext remains the same, but the order of the characters
is shuffled around at a “random” line length
• This line length serves as the “key” to solving the cipher
• Words are written without spacing with a fixed line length (right margin) and then
wrapped onto the next line without respect to words, such as the following with a “key”
of 9:
THISISTHE
TEXTTHATI
AMSHOWING
• Once written thus, the text is transposed as columns, reading down from left to right,
yielding the ciphertext: TTAHEMIXSSTHITOSHWTAIHTNEIG
• Cryptanalysis involves letter frequency analysis
• Running the ciphertext through a second transposition can improve the safety of the
encryption
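The columnar read-off described above can be sketched as follows (a minimal illustration; `transpose` is a hypothetical helper name, and decryption by a second transposition works here because the example text fills the 3 x 9 rectangle exactly):

```python
def transpose(plaintext, key):
    """Write the text in rows of length `key`, then read columns top to bottom."""
    rows = [plaintext[i:i + key] for i in range(0, len(plaintext), key)]
    return "".join(row[c] for c in range(key) for row in rows if c < len(row))

pt = "THISISTHETEXTTHATIAMSHOWING"
ct = transpose(pt, 9)
print(ct)                   # TTAHEMIXSSTHITOSHWTAIHTNEIG
print(transpose(ct, 3))     # the original text: transposing the 9x3 undoes the 3x9
```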
In-Class Exercise
• Let’s use a brute force method on the ciphertext “zayeldan”...which would brute force
as:
Shift Result (mod 26)
1. abzfmebo
2. bcagnfcp
3. cdbhogdq
4. decipher
5. …
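The brute-force table above can be generated in a few lines (a sketch over the ordinary 26-letter alphabet; spotting the English word is left to the reader):

```python
ct = "zayeldan"
candidates = {}
for k in range(1, 26):
    # rotate every letter forward by k positions, wrapping mod 26
    guess = "".join(chr((ord(c) - ord('a') + k) % 26 + ord('a')) for c in ct)
    candidates[k] = guess
    print(k, guess)     # shift 4 reads as an English word: "decipher"
```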
Cryptanalysis Using Letter-Frequency
• Statistical letter frequency analysis uses a standard English letter frequency tabulation to
render the ordering of English characters from most to least common, yielding (in order):
etaoinshrdlcumwfgypbvkjxqz
• Knowing the properties of a rotation cipher (i.e., a consistent shift), we can conclude that
probably the most common letter of the ciphertext corresponds to the letter ‘e’:
Cryptanalysis Using Letter-Frequency
• Assume ciphertext:
"qzfc dnzcp lyo dpgpy jplcd lrz zfc qlespcd mczfrse qzces zy estd nzyetypye, l
yph yletzy, nzynptgpo ty wtmpcej, lyo opotnlepo ez esp aczazdtetzy esle lww xpy
lcp ncplepo pbflw. yzh hp lcp pyrlrpo ty l rcple ntgtw hlc, epdetyr hspespc esle
yletzy, zc lyj yletzy dz nzynptgpo lyo dz opotnlepo, nly wzyr pyofcp. hp lcp xpe
zy l rcple mleewp-qtpwo zq esle hlc. hp slgp nzxp ez opotnlep l azcetzy zq esle
qtpwo, ld l qtylw cpdetyr awlnp qzc eszdp hsz spcp rlgp esptc wtgpd esle esle
yletzy xtrse wtgp. te td lwezrpespc qteetyr lyo aczapc esle hp dszfwo oz
estd. mfe, ty l wlcrpc dpydp, hp nly yze opotnlep hp nly yze nzydpnclep hp nly
yze slwwzh estd rczfyo. esp mclgp xpy, wtgtyr lyo oplo, hsz decfrrwpo spcp, slgp
nzydpnclepo te, qlc lmzgp zfc azzc azhpc ez loo zc opeclne. esp hzcwo htww wteewp
yzep, yzc wzyr cpxpxmpc hsle hp dlj spcp, mfe te nly ypgpc qzcrpe hsle espj oto
spcp. te td qzc fd esp wtgtyr, clespc, ez mp opotnlepo spcp ez esp fyqtytdspo
hzcv hstns espj hsz qzfrse spcp slgp esfd qlc dz yzmwj loglynpo. te td clespc qzc
fd ez mp spcp opotnlepo ez esp rcple eldv cpxltytyr mpqzcp fd esle qczx espdp
szyzcpo oplo hp elvp tyncpldpo opgzetzy ez esle nlfdp qzc hstns espj rlgp esp
wlde qfww xpldfcp zq opgzetzy esle hp spcp strswj cpdzwgp esle espdp oplo dslww
yze slgp otpo ty glty esle estd yletzy, fyopc Rzo, dslww slgp l yph mtces zq
qcppozx lyo esle rzgpcyxpye zq esp apzawp, mj esp apzawp, qzc esp apzawp, dslww
yze apctds qczx esp plces"
• Frequencies obtained from the above ciphertext: {'p': 165, 'e': 126, 'l': 102, 'z': 93, 's': 80, 'c': 79, 'y': 77, 't': 68, 'o':
58, 'd': 44, 'w': 42, 'n': 31, 'h': 28, 'q': 27, 'r': 27, 'g': 24, 'f': 21, 'a': 15, 'm': 14, 'x': 13, 'j': 10, 'v': 3, 'b': 1, 'R': 1}
• Shift ‘p’ to ‘e’ mod 26 is: 15 => shift ‘q’ to ‘f’; shift ‘z’ to ‘o’; shift ‘f’ to ‘u’; shift ‘c’ to ‘r’, etc.
• cat letter.frequency.py |tail -4
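The attack above can be sketched end to end: count letters, assume the most frequent ciphertext letter stands for 'e', and apply the implied shift (an illustration, not the course's letter.frequency.py):

```python
from collections import Counter

def crack_rotation(ciphertext):
    """Guess the rotation by mapping the most common letter to 'e'."""
    text = ciphertext.lower()
    counts = Counter(c for c in text if c.isalpha())
    top = counts.most_common(1)[0][0]        # most frequent ciphertext letter
    shift = (ord('e') - ord(top)) % 26       # shift that sends it to 'e'
    return "".join(
        chr((ord(c) - ord('a') + shift) % 26 + ord('a')) if c.isalpha() else c
        for c in text)

# 'p' dominates, so shift = ('e' - 'p') mod 26 = 15, as on the slide:
print(crack_rotation("qzfc dnzcp lyo dpgpy jplcd lrz"))
# four score and seven years ago
```

As the slides note, the attack needs enough ciphertext for the frequency counts to be statistically meaningful; on a very short sample the most common letter may not be 'e'.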
Cryptanalysis Using Letter-Frequency
“qzfc dnzce lyo degey jelcd lrz zfc qltsecd mczfrst qzcts zy tstd
nzyttyeyt, l yeh ylttzy, nzynetgeo ty wtmectj, lyo oeotnlteo tz tse
aczazdtttzy tslt lww xey lce ncelteo ebflw. yzh he lce eyrlreo ty l
rcelt ntgtw hlc, tedttyr hsetsec tslt ylttzy, zc lyj ylttzy dz nzynetgeo
lyo dz oeotnlteo, nly wzyr eyofce. he lce xet zy l rcelt mlttwe-qtewo zq
tslt hlc. he slge nzxe tz oeotnlte l azcttzy zq tslt qtewo, ld l qtylw
cedttyr awlne qzc tszde hsz sece rlge tsetc wtged tslt tslt ylttzy xtrst
wtge. tt td lwtzretsec qttttyr lyo aczaec tslt he dszfwo oz tstd. mft,
ty l wlcrec deyde, he nly yzt oeotnlte he nly yzt nzydenclte he nly yzt
slwwzh tstd rczfyo. tse mclge xey, wtgtyr lyo oelo, hsz dtcfrrweo sece,
slge nzydenclteo tt, qlc lmzge zfc azzc azhec tz loo zc oetclnt. tse
hzcwo htww wtttwe yzte, yzc wzyr cexexmec hslt he dlj sece, mft tt nly
yegec qzcret hslt tsej oto sece. tt td qzc fd tse wtgtyr, cltsec, tz me
oeotnlteo sece tz tse fyqtytdseo hzcv hstns tsej hsz qzfrst sece slge
tsfd qlc dz yzmwj loglyneo. tt td cltsec qzc fd tz me sece oeotnlteo tz
tse rcelt tldv cexltytyr meqzce fd tslt qczx tsede szyzceo oelo he tlve
tynceldeo oegzttzy tz tslt nlfde qzc hstns tsej rlge tse wldt qfww
xeldfce zq oegzttzy tslt he sece strswj cedzwge tslt tsede oelo dslww
yzt slge oteo ty glty tslt tstd ylttzy, fyoec Rzo, dslww slge l yeh mtcts
zq qceeozx lyo tslt rzgecyxeyt zq tse aezawe, mj tse aezawe, qzc tse
aezawe, dslww yzt aectds qczx tse elcts”
• Frequencies obtained from the above ciphertext: {'p': 165, 'e': 126, 'l': 102, 'z': 93, 's': 80, 'c': 79, 'y': 77, 't': 68, 'o': 58,
'd': 44, 'w': 42, 'n': 31, 'h': 28, 'q': 27, 'r': 27, 'g': 24, 'f': 21, 'a': 15, 'm': 14, 'x': 13, 'j': 10, 'v': 3, 'b': 1, 'R': 1}
• Shift ‘p’ to ‘e’ mod 26 is: 15 => shift ‘q’ to ‘f’; shift ‘z’ to ‘o’; shift ‘f’ to ‘u’; shift ‘c’ to ‘r’, etc.
• python3 src.lecture/lecture.3/caesar/letter.frequency.py
Cryptanalysis Using Letter-Frequency
• Assume plaintext:
"Stately, plump Buck Mulligan came from the stairhead, bearing
a bowl of lather on which a mirror and a razor lay crossed. A
yellow dressinggown, ungirdled, was sustained gently behind him
on the mild morning air. He held the bowl aloft and intoned:
Introibo ad altare Dei. Halted, he peered down the dark
winding stairs and called out coarsely: —Come up, Kinch! Come
up, you fearful jesuit!"
• Visit http://rumkin.com/tools/cipher/caesar.php and paste text above and choose a
shift factor
• Paste ciphertext into src.lecture/lecture.3/caesar/tmp.py
• Note you need enough text (i.e., characters) to provide statistical significance
RANDOMNESS
“Anyone who considers arithmetical methods of producing random digits is, of
course, in a state of sin.”—John von Neumann
“Well, that was random...”
Bitmap generated by a True Random Number Generator (TRNG) vs. bitmap generated with a language-based (PHP) Pseudo-Random Number Generator (PRNG)

• Properties of randomness
• Which number is more random, A or B?
• A) 10010110
• B) 99999999
• E.g. src.lecture/lecture.3/random/ntr.py
• Why does randomness matter?
A State of Sin:
“I can see the penguin.”—Marsh Ray, Microsoft, commenting on the Electronic Code Book (ECB) mode of the Advanced Encryption Standard (AES)
Randomness as a Probability Distribution
• A probability distribution lists the outcomes of a randomized process where each
outcome is assigned a probability, or likelihood of occurrence
• In a probability distribution, the total probability of one outcome from the distribution
occurring is 100% (1.0) (i.e. the sum of all probabilities is 1.0):
p1 + p2 + p3 + ... + pN = 1.0
• Coin tossing distribution is ½ for heads and ½ for tails, where ½ + ½ = 1.0
• A uniform distribution occurs when all probabilities are equally likely to occur, i.e., none is
more likely to occur than any other in the distribution
• So, if a 256-bit key is picked uniformly at random, each of the 2^256 possible keys in the
distribution should have a probability of (1/2)^256
• Note: Trying to guess one key out of the 2^256 possible keys is about a 1 in 1.2 x 10^77
proposition, which is about the number of atoms in the universe...
• So why then is ntr.py not random?
Entropy
• Entropy is the measure of uncertainty or disorder in a system.
• Colloquially, entropy is expressed as: “Wow, I didn’t see that coming”
• Good news is we can calculate the entropy of a probability distribution using the binary
logarithm:
• Given p1, p2 , p3,...,pN:
-p1 x log2(p1) - p2 x log2(p2) - ... - pN x log2(pN)
• A random distribution of 128-bit keys would have the following entropy:
2^128 x (-2^-128 x log2(2^-128)) = -log2(2^-128) = 128 bits
• Thus, the entropy of any uniformly distributed n-bit string will be n bits
• In a non-uniformly distributed n-bit string, we would expect the entropy to be < n bits
• Say a coin is slightly weighted such that heads has a probability of ¼ and tails has a
probability of ¾ (remember ¼ + ¾ = 1), the entropy of this biased coin toss would be:
-(3/4) x log2 (3/4) – (1/4) x log2 (1/4) ≈ -(3/4) x (-0.415) – (1/4) x (-2) ≈ 0.81 bits
• Notice, you have no way of predicting a fair coin toss, but in our weighted coin, you
have some grounds for “betting” for tails...
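The entropy formula above is easy to evaluate directly (a sketch; `entropy` is an illustrative helper name):

```python
from math import log2

def entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p)) over the distribution."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))     # 1.0 bit: a fair coin is maximally uncertain
print(entropy([0.25, 0.75]))   # ~0.811 bits: the biased coin from the slide
print(entropy([1/8] * 8))      # 3.0 bits: a uniform 3-bit string has 3 bits of entropy
```

The last line illustrates the general claim: a uniform distribution over 2^n outcomes always has exactly n bits of entropy.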
Random Number Generation
• So, a cryptosystem needs randomness to be unpredictable, that is to say, to be hard...
• So, we need a component that will provide our randomness, whose job is to return a
number at random when asked to do so
• Such a component is called a Random Number Generator (RNG)
• An RNG has a couple of features:
• A source of entropy, provided by the RNG
• A cryptographic algorithm to produce high-quality random bits from the source of
entropy, which is provided by a pseudorandom number generator (PRNG)
Coin Flipping
• Imagine flipping a coin
• E.g., flipping a (balanced) coin 2 times to generate a single 2-bit sequence, and you
record a ‘1’ if it’s heads and a ‘0’ if it’s tails…
• Let’s say you flip it once and it’s tails, and on the second flip it’s heads: ‘01’
• The probability of reproducing the precise sequence (tails followed by heads) is (1/2)^2
(25%), i.e., your odds are 1 in 4 that you will reproduce that exact sequence
• Now imagine we up the game, and flip the coin three times, getting
“heads/tails/heads”, or ‘101’
• The probability of reproducing the new precise sequence (heads, then tails, then heads)
is (1/2)^3 (.125), i.e., your odds are 1 in 8 that you will reproduce that exact
sequence
• Notice that every time we increase the exponent (number of tosses) by one, we halve
the odds, that is, we make reproducing it twice as unlikely…
Types of Random Number Generators
• True Random Number Generators (TRNG)
• Output is practically non-repeatable
• E.g., flipping a coin 50 times to generate a single 50-bit sequence: the probability of
reproducing the precise sequence is (1/2)^50 (8.8E-16), i.e., your odds are 1 in
1,125,899,906,842,624, or 1 in 2^50 (note 1 in 2^51 is 2,251,799,813,685,248)
• 1 in 2^50 is roughly the estimated number of years until the planets detach from stars
if the universe is open
• True RNGs are dependent on physical processes, such as coin flipping, dice rolling,
semiconductor noise, clock jitter in digital circuits, or quantum random number
generators that rely on randomness from quantum mechanics such as radioactive
decay, vacuum fluctuations, photon polarization, etc.
• TRNGs produce truly random bits rather slowly and intermittently, based on mostly
analog sources, and although their output is nondeterministic, their entropy is not
guaranteed
Pseudo-Random Number Generators
• Cryptographic pseudorandom number generators (PRNGs) generate randomness by reliably
producing many artificial random bits from a few true random bits
• Pseudorandom Number Generators (PRNG)
• Computed from an initial seed value, generally recursive
• PRNGs are generally always available, whereas some RNGs may stop working (i.e. should the
mouse stop moving, or keystrokes stop, etc.)
• By contrast, C-PRNGs produce random-ish bits from digital sources in a nondeterministic way
and can be delivered with maximum entropy
• A C-PRNG takes in random bits from a RNG periodically and generates an entropy pool, which
is a large memory buffer that serves as the PRNG’s source of random bits
• Every time a C-PRNG takes in a new RNG input, it mixes up the pool to provide statistically
unbiased bits
• A PRNG uses a deterministic random bit generator (DRBG) that turns pool bits into a longer
sequence of bits always ensuring that the DRBG never receives the same input twice
Pseudo-Random Number Generators
• Cryptographic PRNGs (C-PRNGs) support two security capabilities:
• Backtracking Resistance (previously-generated bits are impossible to
recover/reconstruct, even given access to the current pools if hacked)
• Predictive Resistance (future bits to be delivered from the PRNG are impossible to
predict, even given access to the current pools if hacked)
• BTW, most programming language-based PRNGs are non-cryptographic, meaning they are
based on an algorithm (Mersenne Twister), which will efficiently produce uniformly
distributed random bits, but which, alas, are predictable given a few bits produced and
recorded (due to a simple linear combination of a single XOR operation with bitwise
operations with a couple of constants)
• This means that you don’t want to do hard-core crypto work using them, including
libc’s rand() and drand48(), PHP’s rand(), Python’s and Ruby’s random(), etc.
• Another way to put it is programming language-based random() functions are in a state
of sin
The Unix Pseudo-Random Number Generator device
• On Unix systems, there is a device file /dev/urandom that will produce cryptographic
pseudo-random output
• It does this by aggregating raw entropy from various sources, including timing-based
jitter, hardware interrupts, Intel random instructions, and hardware-specific sources of
entropy (e.g., Apple’s Secure Enclave hardware RNG)
• 256 bits is 2^5 = 32 bytes
• Execute: dd if=/dev/urandom bs=32 count=1 2>/dev/null | xxd -b
• Output is 256 pseudo-random bits
• Let’s make that output a little easier to use...
• Execute: /src.lecture/lecture.3/random/PRNGbitgen.sh 256
• The result is equivalent to tossing a coin 256 times to generate a single 256-bit sequence,
where the probability of reproducing the precise sequence is (1/2)^256 (8.6E-78), i.e., your odds
of reproduction are 1 in
115,792,089,237,316,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,
or roughly 1 in 2^256
HASHING
Hash Functions
• Hash functions take any sized input and produce a fixed-width output, often called a hash
value or digest.
• Hash functions may or may not involve a cryptographic key
• Unlike other ciphers that try to encrypt data so it can’t be read by unauthorized parties,
hash functions act to ensure the integrity of data...to ensure that the original data has not
changed, with no plan (or capability!) to ever decrypt the resulting ciphertext
• An example of this is the early Unix password system in /etc/passwd...a user logging in
would enter their password, it would be hashed, and the hash would be compared to the
hash stored in /etc/passwd, if they matched, the password entered would be considered
“valid”
• In this sense, hash digests are often called digital fingerprints or message digests, as hash
functions often condense much larger text into smaller hashes
• Hash functions have several applications, including the creation of digital signatures to ensure
the integrity of data (which we will talk about in more detail...)
Hash Functions
• Hash values of a secure hash function are unpredictable and repeatable
• More formally, we can say a general hash function is a mathematical function such that:
• Its input can be of any size
• It produces a fixed-length output (varies according to algorithm)
• It is efficiently computable (hashing any n-bit string should run in O(n) time, linear in
the length of the input)
• We will discuss properties, but essentially a hash function will produce different output even if a
single bit of the original data has changed:
• $ echo -n 'a' | xxd -b
00000000: 01100001  a
• $ echo -n 'c' | xxd -b
00000000: 01100011  c
• $ src.lecture/lecture.3/hashing/hash-256.py 'a'
SHA256: CA978112CA1BBDCAFAC231B39A23DC4DA786EFF8147C4E72B9807785AFEE48BB
• $ src.lecture/lecture.3/hashing/hash-256.py 'c'
SHA256: 2E7D2C03A9507AE265ECF5B5356885A53393A2029D241394997265A1A25AEFC6
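The same single-bit avalanche effect can be demonstrated with Python's standard hashlib (which produces the same digests as the course's hash-256.py):

```python
import hashlib

# 'a' (01100001) and 'c' (01100011) differ in one bit, yet their
# SHA-256 digests are completely unrelated:
for msg in (b'a', b'c'):
    print(msg, hashlib.sha256(msg).hexdigest())
# a -> ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb
# c -> 2e7d2c03a9507ae265ecf5b5356885a53393a2029d241394997265a1a25aefc6
```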
Hash Functions
• More precisely (and for our purposes), we can say that a cryptographically secure hash
function has four main properties:
• A truly random one-way function that offers:
• Collision resistance—it’s hard to discover any two different strings that hash to the same
value, i.e., we cannot easily find x and y such that x != y && H(x) = H(y)
• Pre-image resistance (aka hiding)—cannot go back from hash to original message, hashing
is one-way, (a hash function H is said to be hiding if when a secret value r [sometimes
called a nonce, or number used once] is chosen from a probability distribution that has
high min-entropy, then given H(r || x), (r concatenated with x) it is computationally
infeasible to find (recover) x)
• Second-preimage resistance—it’s hard to find a message that produces the same hash as
a particular other message, which would imply weak collision resistance
• Puzzle friendliness (added for Blockchain...)
One thing is guaranteed in life...collisions
• There are three things certain in life: death, taxes, and the existence of hash collisions...
• But a good hashing algorithm is one where no matter how hard we try, we can’t find
any collisions...

[Figure: an effectively unbounded space of possible inputs mapping onto only 2^256 possible outputs]
• Example:
src.lecture/lecture.3/hashing$ python3 hash-256.py 'a'
python3 -c 'print(bin(int("CA978112CA1BBDCAFAC231B39A23DC4DA786EFF8147C4E72B9807785AFEE48BB",16)))'
echo -n [copied binary output from above...ignoring '0b' prefix] | wc -c
The Pigeonhole Principle
• Also known as Dirichlet’s drawer principle
• If we have 9 pigeonholes for pigeons and 10 pigeons, we will
necessarily have two pigeons in a single compartment
• Since the output of every hash function has a fixed-bit length,
say n bits, there are only 2n possible output values, while at the
same time the number of inputs to the hash function is infinite
• Since the possibility of collisions is ineluctable, the best we can do
is ensure that they cannot be found in practice, that is, easily...
• Thus, a strong hash function should be designed such that, given x1 and h(x1), it is
computationally infeasible to discover some x2 ≠ x1 such that h(x1) = h(x2)
• Note that it is always possible that Mallory (a malicious active attacker) could “luck out”
and pick x2 out of “thin air” and yield h(x1) = h(x2), but we can make this
computationally infeasible by ensuring that our output length is at least 80 bits (flip our
coin 80+ times).
It gets worse...The Birthday Paradox
• How many people are needed at your birthday party such that there is about a 50%
chance that some two people there share a birthday (Month and Day only)?
• The naïve answer is that it would take about 365/2 or 183 people before you hit a 50%
probability that someone shares your birthday...let’s see...
• The probability, given just me, that no one else shares my birthday is 100%,
since there is no one else to collide with my birthday
• If there’s just me and one other person, the probability of no collisions is:
P(no collisions) = (1 – 1/365) = .997
• If there’s just me and two other people, the probability of no collisions is:
P(no collisions) = (1 – 1/365) * (1 – 2/365) = .992
• So, given N people, how long does it take to get to a probability of < 50% that
no two people share a birthday?
P(no collisions) = (1 – 1/365) * (1 – 2/365) * (1 – 3/365) * ... * (1 – (N–1)/365)
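The chain of factors above can be multiplied out directly until the no-collision probability drops below 50% (a sketch of the classic any-pair birthday computation):

```python
# Multiply (1 - k/365) factors, one per additional person, until the
# probability that *no* two people share a birthday falls under 50%.
p = 1.0
n = 1                      # start with just me: probability 100%
while p >= 0.5:
    p *= 1 - n / 365       # admit the (n+1)-th guest
    n += 1
print(n, round(p, 3))      # 23 people suffice: p ≈ 0.493
```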
It gets worse...The Birthday Paradox
• How many people are needed at your birthday party such that there is about a 50%
chance that some two people there share a birthday (Month and Day only)?
• The naïve answer is that it would take about 365/2 or 183 people to hit a 50%
probability...let’s see...
• The probability, given just me, that no one else shares my birthday is 100%,
since there is no one else to collide with my birthday
• If there’s just me and one other person, the probability of no collisions is:
P(no collisions) = (1 – 1/365) = .997
• If there’s just me and two other people, the probability of no collisions is:
P(no collisions) = (1 – 1/365) * (1 – 2/365) = .992
• So, given N people, how long does it take to get to a probability of < 50% that
no two people share a birthday?
P(no collisions) = (1 – 1/365) * (1 – 2/365) * (1 – 3/365) * ... * (1 – (N–1)/365)

Probability   # People
100.0%        1
99.7%         2
99.2%         3
98.4%         4
97.3%         5
96.0%         6
94.4%         7
92.6%         8
90.5%         9
88.3%         10
85.9%         11
83.3%         12
80.6%         13
77.7%         14
74.7%         15
71.6%         16
68.5%         17
65.3%         18
62.1%         19
58.9%         20
55.6%         21
52.4%         22
49.3%         23
Finding Collisions
• It turns out that the unfortunate consequence of the birthday paradox is that the number
of messages we need to hash in order to find a collision is roughly equal to the square
root of the number of possible output values, or about sqrt(2^n) = 2^(n/2) instead of 2^n,
as one might expect
• After a little Taylor Series hocus pocus, we can compute that we would have to hash t
n-bit input values to find a collision, where lambda (λ) is the probability of a collision:

t ≈ 2^((n+1)/2) x sqrt(ln(1/(1 - λ)))

• For example, assume we want to find a collision for a hypothetical hash function H()
which produces 80 bits of output. For a collision probability (λ) of approximately 50%,
we would expect to have to hash about t ≈ 2^((80+1)/2) x sqrt(ln(1/(1 - 0.5))) ≈ 2^40.2,
roughly 1.3 trillion attempts, before finding a collision...
• This can be accomplished in a little over one day with a mac desktop computer...
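The birthday-bound estimate above can be checked numerically (a sketch; `collision_attempts` is an illustrative helper name):

```python
from math import log, sqrt

def collision_attempts(n_bits, lam):
    """Expected hashes to find a collision with probability lam in an n-bit hash."""
    return 2 ** ((n_bits + 1) / 2) * sqrt(log(1 / (1 - lam)))

t = collision_attempts(80, 0.5)
print(f"{t:.3e}")   # on the order of 1.3 trillion hashes for an 80-bit hash
```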
It gets worse...The Birthday Paradox
• Formally, a hash function H is said to be collision-resistant if it is computationally
infeasible to find two values, x and y, such that x ≠ y and H(x) = H(y)
• Since we assume collisions cannot be found in practice, if we know
that H(x) = H(y), it’s safe to assume that x = y
• Security against collisions can be enhanced by simply enlarging the number of bits in
the hashed output
• The following table shows the number of hashes needed for a collision for different
hash function output lengths in bits for two different collisions probabilities of λ = 50%
and λ = 90%
Hash Output Length in Bits
λ      128-bit   160-bit   256-bit   384-bit   512-bit
0.5    2^65      2^81      2^129     2^193     2^257
0.9    2^67      2^82      2^130     2^194     2^258
Not worth the effort for Bitcoin
• As of June 1, 2024, the total mining hash rate is averaging about 720,000,000 TH/s
(TeraHashes, i.e., trillions of hashes, the Bitcoin network is performing per second)
• That’s 720.0 quintillion hashes (720.0 x 10^18) (or 720 ExaHashes) being performed by the
bitcoin network each and every second
• At this rate, if one were trying to perform a collision attack one would only need to
calculate 2^128 hashes (2^(256/2)), which on an annual basis yields:
2^128 / (720.0 x 10^18 x 86400 x 365.25) ≈ 1.5 x 10^10 years
• By comparison, our universe is only about 1.4 x 10^10 years old
• So, a brute force attack at the current hashrate would currently take about 1.1 times the
current best-estimate age of the universe
• I think we’re good for another week...
Preimage Resistance and Second Preimage Resistance

[Figure: two one-way diagrams: preimage resistance (given a hash output, find any
preimage P) and second preimage resistance (given P1, find a different P2 that
hashes to the same value)]

• Preimage Resistance: Given H, one can never deduce P from C,
i.e., it is computationally infeasible, given H(P) = C, to recover P = H^-1(C)
• Second Preimage Resistance: Given H and a message P1, it is computationally
infeasible to find a second message P2 ≠ P1 such that H(P1) = H(P2) = C
The Workhorse: SHA-256
• MD5 was warned about collisions in 1996 and a team of Chinese cryptanalysts broke it
in 2005, and today collisions can be found in a few seconds...
• In 1993, The National Security Agency (NSA) created the 160-bit SHA-0 (Secure Hash
Algorithm Number Zero) hash function, which was standardized by the National
Institute of Standards and Technology (NIST)
• In 1995, the NSA released SHA-1 to fix an unidentified security flaw in SHA-0
• In 1998, a couple of researchers discovered how to reduce the number of operations to
find collisions in SHA-0, leading to collisions in less than an hour
• So far, SHA-256 remains a secure hash algorithm, offering 256-bit preimage resistance
(and, via the birthday bound, 128-bit collision resistance).
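Python's standard hashlib exposes SHA-256 directly; a quick sketch showing that two inputs differing by a single character produce completely unrelated 256-bit digests:

```python
import hashlib

d1 = hashlib.sha256(b"Secure Hash Algorithm").hexdigest()
d2 = hashlib.sha256(b"Secure Hash Algorithn").hexdigest()  # last letter changed

print(d1)
print(d2)
# Both digests are 64 hex characters (256 bits), yet share no obvious relationship
```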
Hash Functions – Ins and Outs
• How do you hash something that is 4 Megabytes into a 256-bit string (32 bytes)?
• Cryptographers realized that the simplest way to hash a large message is to split it up
into chunks and process each chunk consecutively using a similar algorithm.
• This strategy is known as iterative hashing, and it has two main forms:
• You can iterate a compression function, one that transforms an input into a smaller
output, via the Merkle-Damgård construction. (The vast majority of hash functions
use Merkle-Damgård, including MD4, the very popular MD5 (16 bytes (128 bits), broken
in 2005!), SHA-1, SHA-256, SHA-512 and others.)
• You can use a sponge function that changes an input into an output of the same size, so
that different inputs give different permuted outputs, executing an
algorithm of “absorbing” and “squeezing”
SHA-256 Hash Function
• The Initialization Vector is just some number you grab from a standards
document...that’s all those static numbers you see in the source code for SHA-256...
• Theorem: If compression function C is collision free, the hash function is likewise
collision free (although oddly not vice-versa)
• The Merkle-Damgård construction splits the message into blocks of identical size and
commingles these blocks with an advancing internal hash state using a compression
function
[Figure: the message is split into 512-bit blocks, with the final N-bit block padded;
starting from the Initialization Vector, each block passes through the compression
function in turn, and the final output is the hash]
Hash Functions – Compression Details
• What pops out of the machine is the final hashed value of a fixed-bit size (64, 256, 512)
• The algorithms work with fixed size chunks of the message, usually 512-bit or 1024-bit
blocks
• Any “leftover” bits of the final block are padded with a ‘1’ followed by as many 0’s as
are required to fill up the block size (e.g., 512 bits), with the last 64 bits reserved for
the binary representation of the original message’s length in bits
• For example, assume we have 8 leftover bits ‘10101010’; the length, 8, in binary is
1000 (cf. echo 'obase=2;8' | bc), so the padded block becomes:
10101010 leftover => 101010101000000000000...0000000000001000
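The padding rule can be sketched at the bit-string level; this is a simplified model (real implementations work on bytes), but it reproduces the example above:

```python
def md_pad(bits):
    """Pad a bit-string to a 512-bit block boundary: append '1', then zeros,
    then the original bit-length as a 64-bit binary integer."""
    padded = bits + "1"
    while (len(padded) + 64) % 512 != 0:
        padded += "0"
    return padded + format(len(bits), "064b")

block = md_pad("10101010")
print(len(block))    # 512
print(block[-8:])    # 00001000  (the original length, 8, in binary)
```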
[Figure: H0, the 256-bit Initialization Vector, enters the first compression step along
with 512-bit message block M1, producing chaining value H1; H1 and M2 produce H2;
H2 and M3 produce H3, and so on]

M1, M2, M3, etc. are advancing pieces of the original message being hashed

H0 is the Initialization Vector (IV) of the internal state; Hn etc. are chaining values representing the advancing state of the hash
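A toy sketch of this chaining idea, with hashlib's SHA-256 standing in for the compression function (the real compression function is a fixed 768-bit-to-256-bit primitive, not a full hash; this only illustrates the structure):

```python
import hashlib

def compress(state, block):
    # Stand-in compression: 32-byte state + 64-byte block -> new 32-byte state
    return hashlib.sha256(state + block).digest()

def chain_hash(message):
    # Simplified: assumes `message` is already padded to a 64-byte multiple
    state = bytes(32)  # stand-in Initialization Vector H0
    for i in range(0, len(message), 64):
        state = compress(state, message[i:i + 64])  # H1, H2, H3, ...
    return state

digest = chain_hash(b"M" * 192)  # three 512-bit blocks
print(len(digest))               # 32 bytes (256 bits)
```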
The Workhorse: SHA-256
• SHA-256 uses the M-D construction with a compression function that repeatedly takes a
768-bit input (a 512-bit message block plus the 256-bit internal state) and produces a
256-bit output.
• echo -n "0012ab8dc588ca9d5787dde7eb29569da63c3a238c" | openssl sha256
0a96742670166ddf3a9cd6c88c11ea24ea2c0f6920d3f99049f53c89c67c9eea
• Effectively, a digital fingerprint: even a 1,553,444-byte message digests to
256 bits (32 bytes), every time
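The same fingerprint property is easy to see with hashlib (the large input size here just mirrors the figure above):

```python
import hashlib

tiny = hashlib.sha256(b"abc").digest()
huge = hashlib.sha256(b"\x00" * 1_553_444).digest()  # ~1.5 MB of input

# No matter the input size, SHA-256 always emits 32 bytes (256 bits)
print(len(tiny), len(huge))   # 32 32
```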
Message Digests
• A cryptographic hash function is one type of hash function that uses
an algorithm that maps data of varying size to a bit-string of a fixed size (the hash)
• Given our properties of collision-resistant hash functions:
• If H(x) = H(y), we may assume x = y, and this allows us to use hash functions as a
message digest
• Remember hash functions do not encrypt the data (i.e., you cannot “decrypt” the hashed
message digest back to the original)
• The input data is commonly called the message, and the output (the hash value or simply
the “hash”) is often called the message digest or simply the digest
• If two messages have the same resulting digest, then given our properties of hash
functions, we can be confident that the two original messages are identical.
Message Digests
• Message digests have proven effective in a variety of contexts:
• Password storage (cf. crypt in UNIX systems, which leverages MD5, Blowfish, SHA-256,
etc., and Microsoft’s pathetic LANMAN)
• Proof of file integrity (MD5) for network data transfers
• Data Integrity: Revision control systems such as Git, Mercurial, etc. use message
digests not for security but to identify revisions and to ensure that the data has not
changed due to accidental corruption
• Bitcoin
The Workhorse: SHA-256
• Use the live dynamic sha256 at: https://passwordsgenerator.net/sha256-hash-generator/
• Enter the text: Hello MPCS 56600 Peeps!
• See the hashed result:
1DD635FF84BDED73C744DC9CF1915D9C013E32302598FF75A7BA0B9CF00C0674
• $ echo -n 'Hello MPCS 56600 Peeps!' | openssl dgst -sha256
(stdin)= 1dd635ff84bded73c744dc9cf1915d9c013e32302598ff75a7ba0b9cf00c0674

• You can find examples of SHA-256 functions in multiple languages in my pub directory
located here: ~mark/pub/56600/src.lecture/lecture.3/sha256/examples/*
CLASSIFICATION OF CIPHERS
Encryption and Decryption using Stream and Block ciphers
Basic (Relevant) Cryptographic Vocabulary
• Types of Ciphers:
• Stream Ciphers (work on individual bits discretely and individually)
• Classical ciphers
xi , yi , si ∈ { 0, 1 }
Encryption: yi = esi(xi) ≡ xi + si mod 2
Decryption: xi = dsi(yi) ≡ yi + si mod 2
• Block Ciphers (we’ll get to these...)
• Modern Ciphers
• Symmetric Encryption
• Asymmetric Encryption
Symmetric Ciphers
• In symmetric ciphers, there is a single key that is used to both encrypt and decrypt
• The plaintext is encrypted using a key, and the ciphertext is decrypted using the very
same key
• Needless to say, the key needs to be kept hidden, which is why symmetric ciphers are
sometimes called secret key ciphers
• Examples of modern symmetric ciphers include DES, 3DES, International Data
Encryption Algorithm (IDEA), and the Advanced Encryption Standard (AES)
STREAM CIPHERS
Stream Ciphers
• Most stream ciphers are word-based or character-based, and the
encryption occurs “word by word” or “character by character”
• Both the One-Time Pad and the Vigenère cipher are examples of
stream ciphers
• The most (in)famous example of a stream cipher was the electro-
mechanical stream cipher Enigma Machine

Turing’s Hut 8
The Enigma Machine
• Rotor orientation: Each of the three rotors could be set in one of 26
orientations, therefore 263 (17,576) settings...
• Rotor arrangements: Each of the three scramblers could be positioned in
any of six orders: 3! = 6
• Plugboard: The number of ways of connecting (swapping) any six pairs of
letters out of 26: 100,391,791,500 different combinations (26! / (14! × 6! × 2^6))
• The total number of possible keys is the product of these three numbers:
• 17,576 x 6 x 100,391,791,500 = 10,586,916,764,424,000....over ten
quadrillion...a lot.
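The keyspace arithmetic can be verified directly:

```python
import math

rotor_orientations = 26 ** 3          # 17,576 rotor settings
rotor_orders = math.factorial(3)      # 6 ways to arrange three rotors
# Choosing and wiring six plugboard letter-pairs out of 26:
plugboard = math.factorial(26) // (math.factorial(14) * math.factorial(6) * 2 ** 6)

total = rotor_orientations * rotor_orders * plugboard
print(f"{total:,}")   # 10,586,916,764,424,000
```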

[Photos: Alan Turing, and Alan Turing’s 17th cousin]
Turing’s Bombe Machine
• Alan Turing designed a machine that would decrypt an Enigma-
encrypted message, with one caveat...
• A crib (hint) was needed to reduce the combinations to something
manageable
Modern Stream Ciphers
• Modern stream ciphers are bit-based in that they encipher bit-by-bit
• Stream ciphers take plaintext and apply a keystream sequence to it
• As a simplistic example, let’s imagine a rule statement such that “if there is a 1 in
keystream[x], then change the bit in plaintext[x]. If there is a 0 in the keystream[x], then leave
plaintext[x] unchanged.”
• Let plaintext = 1100101 and keystream = 1000110
• Plaintext[0] is a 1 and keystream[0] is a 1. 1 means change it. Plaintext[0] becomes a 0.
• Plaintext[1] is a 1 and keystream[1] is a 0. 0 means no change. Plaintext[1] remains a 1, etc.
• The resulting ciphertext becomes: 0100011
• Note our rule statement above is essentially the XOR operation, thus we have:
• Where, C = Ciphertext, P = Plaintext, and K = Key:
• Ci = Pi ⨁ Ki
• Pi = Ci ⨁ Ki

• Cf. once again xor.py
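The keystream rule above is just XOR, as noted, so one function both encrypts and decrypts (a bit-string sketch in the spirit of xor.py):

```python
def xor_stream(bits, keystream):
    # Flip each plaintext bit wherever the corresponding keystream bit is 1
    return "".join(str(int(p) ^ int(k)) for p, k in zip(bits, keystream))

ciphertext = xor_stream("1100101", "1000110")
print(ciphertext)                          # 0100011
print(xor_stream(ciphertext, "1000110"))   # 1100101 -- XOR is its own inverse
```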


Modern Stream Ciphers
• The key to designing good stream ciphers is designing a good binary keystream sequence
generator
• Needless to say, there’s more advanced mathematics involved in generating good stream
cipher keys (mobile networks, WiFi networks, etc.)
• The advantage of stream ciphers is that over noisy or unreliable channels, if a bit or two goes
“rogue”, only a tiny fraction of the total stream is affected, as each plaintext bit is determined
by only one ciphertext bit (not at all true of block ciphers)
• Another general advantage of stream ciphers is their speed and ease of use
• Thus, stream ciphers are often useful for encrypting digitized speech such as in the GSM
mobile phone network (A5/1), much software including Lotus Notes and SSL (RC-4)
BLOCK CIPHERS
Block Ciphers
• Block ciphers are an important subset of symmetric algorithms
• A block cipher is an algorithm that encrypts data by breaking the data into smaller
fixed-size chunks, and then encrypts the chunks one after another
• If a chunk of input happens not to fit the fixed size required by the algorithm, the
algorithm supplies additional bits to enlarge the chunk to the fixed size (usually 0’s)
• We should note that an n-bit string can be divided up into different sized blocks. More
on this in a minute.
• Encryption is sequential and not parallelizable
• There was a problem with repeated data in early attempts at block ciphers: in such cases
the resulting ciphertext could be the same, depending upon the repeated text and the
size of the block, and codebooks could be created to attack encrypted text
• This problem is partially resolved by modern block ciphers by adding an Initialization
Vector (IV) to a block, which is simply more random bits, along with increased key size
Block Representations
• We should note that an n-bit string can be divided up into different size blocks. For
example, a 12-bit string can be divided up into two 6-bit blocks, three 4-bit blocks, or
four 3-bit blocks, each of which may in turn represent a different number
• Take for example the 12-bit binary string 100111010110, 0x9D6, or decimal 2,518
• Any binary sequence of length n can be regarded as representing an integer from 0 to
2n-1
• So, if we choose a block length of n = 4, it can be written as a sequence of integers in
the range of 0 to 24-1, 15 (as 1111 = 15)
• Likewise, if we choose a block length of n = 6, it can be written as a sequence of integers
in the range of 0 to 26-1, 63 (as 111111 = 63)
• If we break this 12-bit binary string, 100111010110, down into 3-bit blocks we have
100=4, 111=7, 010=2, and 110=6, yielding a sequence of 4 7 2 6
• However, if we break the same binary string 100111010110 down into 4-bit blocks, we
have 1001=9, 1101=13, and 0110=6, yielding a sequence of 9 13 6
Properties of Symmetric Block Ciphers
• Diffusion: A small change in the plaintext (i.e., a single character) should produce an
unpredictable change in the ciphertext (note that a Caesar Cipher does not have the
diffusion property...), think breadth of obfuscation
• Confusion: If an attacker is conducting an exhaustive key search, then there should be
no indication that they are ‘nearing’ the correct key (does an exhaustive search of a
Caesar Ciphertext indicate that you are ‘nearing’ the correct key?), think depth of
obfuscation
• Modern block ciphers use a method called cipher block chaining (CBC) which leverages
multiple loops of encryption (cf. US Patent 4074066, 1976) and requires each new block
to be encrypted to be ‘linked’ to the previous block
• Doing so complicates the possibility of using a dictionary of known plaintext-ciphertext
blocks (such as would exist in an Electronic Code Book)
Block Ciphers
• The general block cipher process is as follows:
1) Random bits create an Initialization Vector (IV)
2) The IV is XOR’d with the first chunk of plaintext data
3) The result is encrypted with a key from the key table
4) This result is then XOR’d with the next chunk of data
5) That result is encrypted with a key from the key table
6) The last two steps are repeated until all the plaintext has been consumed
Encryption and Decryption with a Block Cipher
• Each block of ciphertext (other than C0) is derived by encrypting the XOR of the next
plaintext block with the previous ciphertext block, so, for example, C5 is the result of
encrypting P5 ⨁ C4
• To encrypt: Ci = EK(Pi ⨁ Ci-1), where C0 = IV
• To decrypt: Pi = DK(Ci) ⨁ Ci-1, where C0 = IV
• If we encrypt with a different Initialization Vector each time, the CBC mode becomes a
probabilistic encryption scheme (as opposed to a weak deterministic encryption scheme)
• For this reason, we want to use an Initialization Vector that is a nonce, that is, a
Number used ONCE
• Usually, we will choose a pseudo-random number as our Initialization Vector, and it
is this randomness injection that induces our probabilistic encryption
Block Ciphers
• The result of using a Cipher-Block Chaining mode is that because all the blocks are
chained together, not only does ciphertext yi depend on plaintext block xi but it also
depends on all previous plaintext blocks as well
• Essentially the ciphertext yi which is the result of the encryption of plaintext block xi is
itself fed back to the cipher input and XOR’d with the succeeding plaintext block xi+1
• This XOR sum is then encrypted, yielding the ciphertext yi+1 which then can be used (after
some shifting and/or rotating per the algorithm) for encrypting plaintext block xi+2 so on...
• When decrypting a ciphertext block yi in CBC mode, we simply reverse the two operations
by applying the decryption function e-1(), yielding more formally:
Encryption (first block): y1 = ek(x1 ⨁ IV)
Encryption (second on...): yi = ek(xi ⨁ yi-1), i ≥ 2
Decryption (first block): x1 = ek⁻¹(y1) ⨁ IV
Decryption (second on...): xi = ek⁻¹(yi) ⨁ yi-1, i ≥ 2
• Where e() is a block cipher of block size b, and xi and yi are bit strings of length b,
and IV is a nonce of length b
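The CBC equations above can be exercised with a toy block cipher (here, plain XOR with the key; utterly insecure, but it shows the chaining structure):

```python
import os

B = 16  # block size b in bytes

def e(block, key):           # toy e_k: XOR with the key (NOT a real cipher)
    return bytes(x ^ k for x, k in zip(block, key))

d = e                        # XOR is its own inverse, so e^-1 == e here

def cbc_encrypt(plaintext, key, iv):
    out, prev = [], iv
    for i in range(0, len(plaintext), B):
        prev = e(bytes(a ^ b for a, b in zip(plaintext[i:i + B], prev)), key)
        out.append(prev)                 # y_i = e_k(x_i XOR y_{i-1})
    return b"".join(out)

def cbc_decrypt(ciphertext, key, iv):
    out, prev = [], iv
    for i in range(0, len(ciphertext), B):
        y = ciphertext[i:i + B]
        out.append(bytes(a ^ b for a, b in zip(d(y, key), prev)))  # x_i = d_k(y_i) XOR y_{i-1}
        prev = y
    return b"".join(out)

key, iv = os.urandom(B), os.urandom(B)   # the IV is our nonce
msg = b"identical block!" * 2            # two identical plaintext blocks
ct = cbc_encrypt(msg, key, iv)
print(ct[:B] != ct[B:])                  # chaining hides the repetition
print(cbc_decrypt(ct, key, iv) == msg)   # True
```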
PUBLIC KEY CRYPTOGRAPHY
“The term public-key cryptography is used interchangeably with asymmetric
cryptography; they both denote exactly the same thing and are used
synonymously.”–Paar and Pelzl, Understanding Cryptography
Asymmetric Ciphers
• Asymmetric ciphers use two different (but mathematically related) keys to encrypt and
decrypt, as the public key is mathematically derived from the private key
• A public key is used (by anyone with access to it) to encrypt something, that can only be
read (decrypted) using your private key
• For this reason, asymmetric ciphers are part of what has become known as public key
cryptography
• Modern examples of asymmetric ciphers include RSA
• Diffie-Hellman (and Merkle) came up with a mechanism for sharing public keys
Introduction
• While symmetric cryptography has been used since ancient times over thousands of
years, public-key cryptography is relatively new
• Recent releases of declassified information out of Britain reveal that researchers James
Ellis, Clifford Cocks and Malcolm Williamson from the UK’s Government
Communications Headquarters (GCHQ) discovered it as early as 1972
• But it was publicly developed by Whitfield Diffie, Martin Hellman and Ralph Merkle in
1976
Symmetric vs. Asymmetric Ciphers
• Symmetric ciphers (unless we’re merely using one-way hash functions like message
digests) require that both the sender and receiver share a commonly-known key
• Indeed, prior to the 1970’s, symmetric ciphers were your only choice in cryptography
• The implication here is vast: there must be mutual trust between the two parties that
share the same key
• Let’s say that again: in symmetric cryptography, there must be mutual trust between the two
parties that share the same key
• The core idea of public key cryptography is that each party has their own public key and a
corresponding private key
• They may share their public keys, but their private keys they must keep perfectly secure
• Mathematical algorithms exist for creating key pairs such that it is practically impossible
to deduce the private key from its associated public key
Public Key Cryptography leverages Asymmetric Ciphers
• Again, asymmetric ciphers use two different (but mathematically related) keys to encrypt
and decrypt
• A public key is used (by anyone with access to it) to encrypt something, that can only be
read (decrypted) using your private key
• For this reason, asymmetric ciphers are part of what has become known as public key
cryptography
• Modern examples of asymmetric ciphers include RSA
Public and Private Keys
• Alice would use Bob’s public key to encrypt data, and would then send the ciphertext to
Bob
• Bob would then use his algorithmically-related private key to decrypt the message that
Alice sent him using his public key
• Any message encrypted with the public key of party P1 can be decrypted only with the
private key of party P1
• Therefore, Alice must make very sure she is using Bob’s actual public key to encrypt the
message, because if she were to mistakenly use Oscar’s (an Opponent) public key and
Oscar were to intercept Alice’s message to Bob, two dire results would ensue:
• Bob could not read the confidential message from Alice
• Oscar could!
• Thus, all public keys need protection in the sense that their authenticity must be
ensured, precisely because the ciphertext does not include any metadata about the
sender
Public and Private Keys
• Following Auguste Kerckhoffs’ desideratum, we must assume that both the algorithm and
the encryption key are public
• Oscar is faced with the task of trying to deduce the original message from the ciphertext
(unless of course Alice made a mistake and used Oscar’s public key to encrypt, in which
case the task is trivial)
• Public Key cryptography is built upon the assumption that Oscar will know both the
public key and the algorithm used, and the algorithm’s goal is to make this knowledge
practically useless, that is to say, to make its deduction computationally infeasible
• Most public key algorithms are block ciphers that regard the plaintext as a sequence of
large integers and rely on the difficulty of solving a particular mathematical problem for
their security (such as prime factorization or discrete logarithm problems)
• The most famous of these block cipher algorithms was created by Rivest, Shamir and
Adleman in 1978 and is known as RSA
• RSA leverages prime factorization as its mathematical problem
RSA in a Nutshell
• There is a publicly known number N that is the product of two prime numbers, whose
value is kept secret
• This value N is simultaneously both the block size and the public and private key size
• These prime numbers must be kept secret, for anyone who knows the two numbers can
calculate the private key from the associated public key
• Thus, the block size (i.e., N), must be large enough so that Oscar cannot deduce the
primes, which is to say, no attacker can factorize N
• Let’s say N = 15. What are the prime factors of 15? That’s right, 3 and 5. Bingo!
Knowing both your public key and the algorithm, Oscar just got your private key.
• But for a large-enough value of N, finding the primes is currently computationally
infeasible (it amounts to solving the integer factorization problem with huge numbers)
• For RSA and associated algorithms, a value of N of at least 640 bits is expected, and
values of 1024 bits (about 309 decimal digits) and 2048 bits (about 617 decimal digits)
are not uncommon
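A toy RSA key pair built from the classic (insecurely small) textbook primes makes the mechanics concrete; real moduli are hundreds of digits long:

```python
# Toy RSA -- p and q are far too small to be secure; for illustration only
p, q = 61, 53
N = p * q                    # 3233: the public modulus (and block size)
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: e*d = 1 (mod phi)

m = 65                       # a plaintext "block", an integer < N
c = pow(m, e, N)             # encrypt with the public key (N, e)
print(pow(c, d, N) == m)     # True: decrypting with (N, d) recovers m
```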
RSA in a Nutshell
• Remember that adding a single bit to the length of a key has the effect of doubling the
number of possible keys in the output, effectively doubling the computational problem
of cracking the key (and finding collisions)
• Because of the sizes of the keys which results in slower computational encryption, public
key algorithms tend not to be used for document encryption (cf. hash functions and
message digests) but rather for:
• Key exchange
• Digital signatures
The Current RSA Challenge
• For each number N, there exists prime numbers p and q such that:
N = p x q. Find these two primes, given only N.
• To date, the largest factored RSA number is 232 decimal digits long (768 bits), during which
CPU time spent on finding the two factors by a collection of parallel computers amounted to
approximately 2000 years of computation on a single-core 2.2 GHz AMD Opteron-based computer
• The RSA-2048 (617 decimal digits) Challenge’s cash prize for its factorization is: $200,000:

251959084756578934940271832400483985714292821262040320277771378360436
620207759555626401852588078440691829064124951508218929855914917618450
280848912007284499268739280728777673597141834727026189637501497182469
116507761337985909570009733045974880842840179742910064245869181719511
874612151517265463228221686998754918242243363725908514186546204357679
842338718477444792073993423658482382428119816381501067481045166037730
605620161967625613384414360383390441495263443219011465754445417842402
092461651572335077870774981712577246796292638635637328991215483143816
7899885040445364023527381951378636564391212010397122822120720357
