0% found this document useful (0 votes)
3 views

notes

The document outlines the principles of data and system security, focusing on various cryptographic methods including symmetric and asymmetric key cryptography, and discusses historical and modern ciphers such as the Vigenere cipher and One Time Pad. It also covers concepts like perfect secrecy, Shannon's information theory, and key distribution methods. Additionally, it explains access control mechanisms and random number generation techniques used in cryptography.

Uploaded by

Aditya Nirmale
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views

notes

The document outlines the principles of data and system security, focusing on various cryptographic methods including symmetric and asymmetric key cryptography, and discusses historical and modern ciphers such as the Vigenere cipher and One Time Pad. It also covers concepts like perfect secrecy, Shannon's information theory, and key distribution methods. Additionally, it explains access control mechanisms and random number generation techniques used in cryptography.

Uploaded by

Aditya Nirmale
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 40

CS 745 : Principles of Data and System Security

Scribed by: Kavin Arvind and Nikil S

Contents

Lecture - 01 ..................................................................2
Lecture - 02 ..................................................................3
Lecture - 03 ..................................................................5
Lecture - 04 ..................................................................7
Lecture - 05 ..................................................................9
Lecture - 06 ..................................................................11
Lecture - 07 ..................................................................15
Lecture - 08 ..................................................................18
Lecture - 09 ..................................................................20
Lecture - 10 ..................................................................21
Lecture - 11 ..................................................................23
Lecture - 12 ..................................................................25
Lecture - 13 ..................................................................27
Lecture - 14 ..................................................................28
Lecture - 15 ..................................................................29
Lecture - 16 ..................................................................31
Lecture - 18 ..................................................................32
Lecture - 18 ..................................................................34
Lecture - 20 ..................................................................35
Lecture - 21 ..................................................................37
Lecture - 22 ..................................................................38
Lecture - 23 ..................................................................39

1
Lecture - 01
Topic: History of Cryptography

Shifted cipher
Each letter is Shifted by k and sent. Eg- ”A” is written as ”A”+k (Shifted by k letters) and sent.
This is easy to decode as only 26 ( or 36 (if 0-9 nos are included)) possible k are there and thus its
easy to check each possibility.

Rolling by wooden stick


A paper is rolled on to a stick and text is written. If seen normally, the letters would look fully
shuffled, but if its rolled in the same way as it was written, it can be decoded.
eg- ”MY NAME IS X” is written like

M M X
Y A E S
N I

Thus its crypted as MMXYAESNI.

Mono-Substitution cipher
We have a table where each letter is mapped to other letters and text is ciphered according to that.
Here, we have here 26! ways of mapping and so its very difficult to try different possibilities.
This seems like an optimal solution, but there is a problem. In an average english text, each letter
has a specific frequency of repetition.
Say letter ”A” is coded to letter ”K” (randomly). So frequency of letter K would be same as of the
letter ”A” in a normal text. So by this way, cipher text could possibly be decrypted.

2
Lecture - 02
Topic: History of Cryptography(Continuation.)

Homophonic Cipher
The main problem of Mono-Substitution cipher is that, a character was substituted with only one
alphabet and so the frequency didn’t change.
What if its substituted with many characters to equalize the frequencies?
Say S = {A, B, ..Z, 0, 1, ..9, , α, β, γ, ...} has usable symbols.
Say letter ”A” has frequency x%. We allot 100 x
×|S| number of symbols and are randomly substituted
in the cipher text in place of ”A”. This uniforms/balences the frequency among all the symbols and
hence difficult to decrypt by frequency method.
But here, storing the mapping, encrypting, and decrypting are difficult.

Vigenere’s Cipher
What if we substitute ”A” by any of the letters strategically? Vigenere created a table as shown
below.

A B C D ..
A B C D E ..
B C D E F ..
.
.

A keyword is chosen and correspondingly added to the text encrypt it. Eg - Thus here, according

Actual text M Y N A M E I S X
keyword R O S E R O S E R
Cipher .. F

to the position, same letter is encrypted to different letters and thus the frequencies are balenced.
Is it a good method then?
Words like ”THE”, ”IS”, etc repeat so much in english that its very likely that it is encrypted to the
same cipher text due to same relative position w.r.t keyword. Calculating the repeated strings in
ciphertext and observing the distance between them will give insigts about the length of keyword.
Length of keyword would be a factor of those distances and can be found out(say l). Now, characters
1, 1 + l, 1 + 2l, .. are derived from same column of the table. Hence they are like monostituted and
now, frequencies can be calculated out to find the keyletters and hence keyword.

Mordern Cryptography
Shannon’s Cipher
ξ = (E, D) is a cipher system where E(m, k) = c(m is message, k is key, c is cipher text) is encyption
funtion, and D(c, k) = m is decryption funtion.

3
One Time Pad
Say ml is a message of bits of length l, and key k l is key of same length generated randomly.

E(m, k) = ml ⊕ k l = c

D(c, k) = cl ⊕ k l
= ml ⊕ k l ⊕ k l
= ml

Provided key is generated completely random, and no part of key is known to Eavesdropper, they
can’t decrypt it as probability of c being 0 or 1 is independent of message itself. I.e,

P rob(cipher = c|msg = m) = P rob(cipher = c|msg = m0 )

Hence, it is safe. Disadvantages:

• key is as big as message(or more)

• key should be sent safely. Otherwise its easily decrypted.

If key length is more, either its padded at the end and xored, or key is taken till the length of
message and xored.
In general, if its not a bit string, the encryption can be taken as sum modulus like:

E(m, k) = ml + k l (mod n) = c (if n=2, its just xor)

D(c, k) = cl − k l (mod n)
= ml + k l − k l (mod n)
l
=m (mod n)

4
Lecture - 03
Topic: Perfect Secrecy and Shannon’s information Theory

Perfectly secrecy
OTP
For a message to be perfectly secret, the Eavesdropper should not be able to get any extra infor-
mation from the ciphertext. So,

P (M = m|C = c) = P (M = m) [message = m, and ciphertext = c]


Pc (m) = P (m)
P (M = m|C = c) P (C = c|M = m)
=
P (M = m) P (C = c)
P (C = c|M = m)
=P 0 0
m0 ∈M P (C = c|M = m )P (M = m )

P (C = c|M = m0 ) = P (K ⊕ m0 = c|M = m0 )
 

 = P (K = c ⊕ m0 |M = m0 ) 

1
 
= l [as key is selected randomly, probability that its c ⊕ m0 is 1/2l ]
2

P (M = m|C = c) P (C = c|M = m)
=P 0 0
P (M = m) m0 ∈M P (C = c|M = m )P (M = m )
1/2l
=P l 0
m0 ∈M (1/2 )P (M = m )
1
=P 0
m0 ∈M P (M = m )
1
=
1
P (M = m|C = c) = P (M = m)

Hence proved that it is perfectly secret.


But, what happens if key is repeated? Say a message said ”Fire the gun” to a soldier which was
ciphered to c using key k, though an Eavesdropper technically doesn’t know the key, now he would
see the soldier firing after getting message and so he can guess the message. Using the ciphertext,
he can get the key = message ⊕ cipher and if same key is used again, he would guess the message.
Thus key can be used just once.
Also, if M = m1 =0 a0 , m2 =0 ab0 and,
if c =0 x0 , Pc (m1 ) = 1 and Pc (m2 ) = 0 (This method reveals length of the message)
if c =0 xy 0 , Pc (m1 ) = 0 and Pc (m2 ) = 1.

5
Substitution cipher
If M = m1 =0 aa0 , m2 =0 ab0 and,
if c =0 xx0 , Pc (m1 ) = 1 and Pc (m2 ) = 0.
if c =0 xy 0 , Pc (m1 ) = 0 and Pc (m2 ) = 1.
Thus its not perfectly secret.

Addition OTP

D(c, k) = cl − k l (mod n)
= m + k − kl
l l
(mod n)
= ml (mod n)

Proof is very similar to as OTP.

Shannon’s information Theory


”No class on Friday” has more information/importance than ”There is class on Friday” because
having no class is a rare thing, and need to informed importantly. Having class is a regular thing
and it doesn’t carry much info. So,
1
information ∝
probability of occurance
1
Inf o(x) ∝
P (x)

Entropy of a message distibution(X) is defined as:


X
H(X) = − P (x) log2 (P (x))
x∈X
X 1
= P (x) log2 ( )
P (x)
x∈X

Entropy is max when each of the messages has equal probability i.e, they are more uncertain.
Conditonal entropy of X, given Y is:
X 1
HY (X) = P (x, y) log2 ( )
Py (x)
X,Y
X X 1
= P (y) P (x) log2 ( )
Py (x)
Y X

If C is the cipher text, and if HC (M ) ≈ 0, then its easily breakable as it is not that uncertain.

6
Lecture - 04
Topic: Key distributions

Symmetric Key Cryptography


The methods we have seen so far including Substitution cipher, OTP, etc, are Symmetric key
Cryptography as both the sender and receiver needs the same key to encrypt and decrypt the
message. Here, the main difficulty was to exchage keys between both parties safely.
Its two types are Stream Cipher and Block cipher which we’ll see later.

Assymetric key Cryptography


Here, a pair of Keys Ek and Dk are created by a party which are related to each other in some sense.

Method 1: Secrecy Ensured


Lets say, person A wants to send a message to person B. Now, person B created the pair of keys
EB and DB and sends EB publically. So, now A encrypts message m using EB to send cipher c.
Here, since DB is known only by person B, just B can decrypt it and no one else. Thus here,
secracy is ensured. But, cipher c can be tapped and some other message m0 can be encrypted to c0
using public key EB by an Eavesdropper and sent. So, here, authenticity is not ensured.

m EB c c DB m

EB (m) DB (EB (m)) = m

Method 2: Authenticity Ensured


Lets say, person A wants to send a message to person B. Now, person A created the pair of keys
EA and DA and sends EA publically. So, now A encrypts message m using DA to send cipher c.
S Here, since EA is known by everyone including B, he can decrypt it using EA . Thus here,
authenticity is ensured as DA is private to person A and only he/she can encrypt it. But, cipher c
can be decrypted by littrally everyone as EA is known publically. So security is not ensured.

m DA c c EA m

DA (m) EA (DA (m)) = m

Method 3: Both Ensured


What if we combine both the above methods to ensure both..
Lets say, person A wants to send a message to person B.
Both of them creates pairs of keys EA , DA and EB , DB (and, EB , EA are public).

7
m DA EB c c DB EA m

DA (m) EB (DA (m)) DB (EB (DA (m))) EA (DB (EB (DA (m))))

= DA (m) =m

This suffices both authenticity and secracy. Here, its noteworthy that algorithm is known by
everyone unlike the historical methods and just that the keys are kept secret.

Access Control
Eg- MAC, DAC, RBAC, ABAC, etc are algorithms/methods used to enable access control. Lets
understand this with an eg:
Lets say there is data stored in a database where only specific users can read and special users
can edit it. Also, people should not be able to delete or scatter the information. So, readers and
modifiers should have their own specific keys.

Random Number Generation


Its of two types:

True Random Number generator(TRNG)


This is based on actual random events such as some hardware that changes drastically with outside
conditions. It is truely random and can’t be predicted.

Pseudo Random Number generator(PRNG)


This is done algorithmic and could be predicted based on its previous values.

8
Lecture - 05
Topic: Symmetric Crytopgraphy(Stream Cipher, LFSR)

Weekly Test 1 Solutions


Question 2
Given :
Length of code = 128 bits
Cost of a processor = Rs.1000
Cap on cost of processors is Rs.10 crores. The performance of the processors is 10ns/code and
follows Moore’s law, i.e. it doubles in 24 months. The code is expected to be broken in 7 days.

Solution:
n
In n years, performance will increase by a factor of 2 2 . Therefore,

10 × 10−9 s 10crore
2128 codes × n = processors × 7days × 24hrs × 60min × 60s
22 1000
On solving, n turns out to be approximately 130 years.

Symmetric Encryption: Stream Cipher


The input is a stream of bits and the encryption takes place one bit at a time. It is easy to compute.
e.g.: One Time Pad which encrypts as follows

ci = mi ⊕ ki
ci = mi + ki mod 2

To use this encryption we need a random number generator to obtain k. True random numbers can
be generated from physical phenomena. Rather, We are gonna generate pseudo-random numbers,
i.e., random numbers generated algorithmically. One such method is LFSR.

Linear Feedback Shift Register


LFSR is a shift register whose input bit is a linear function of two or more of the previous output
bits.

1 0 0 Output

K2 K1 K0

9
The sequence generated by the given circuit is

Ki+3 = Ki+1 ⊕ Ki
Ki+3 = Ki+1 + Ki mod 2

The expression for the input bit of an LFSR can be represented by a polynomial of degree n (n =
number of registers), known as the characteristic polynomial. For example, the polynomial of the
above LFSR circuit is

x3 = x + 1 mod 2
x3 + x + 1 mod 2 = 0
n
Such a polynomial is called primitive if it is a factor of x2 −1 + 1 mod 2. A primitve polyno-
mial generates a maximum length cycle of register values for given number of registers (cycle length
= 2n − 1, every pattern except 0). For example, the above polynomial generates the following series:

100, 010, 101, 110, 111, 011, 001


of length 23 − 1 = 7. Note that the polynomial is a factor of x7 + 1 mod 2.
n −1
x2 + 1 mod 2 = (x + 1) × (x3 + x + 1) × (x3 + x2 + 1) mod 2

To preserve randomness, the length of the cycle should be maximized, and so a primitive polynomial
is preferred. Otherwise, we will end up generating a cyle of length less than 2n − 1.

Here, the key depends on


• Characteristic polynomial

• Seed (Initial Value of the registers)

Programmable LFSR
An LFSR circuit of degree n that can configured to adopt any characteristic polynomial of the same
degree. Here’s an example of a programmable LFSR of degree 4.

K3 K2 K1 K0 Output

h3 h2 h1

⊕ ⊕ ⊕

The sequence generated by this would depend on the keys h1,h2,h3 as shown:

Ki+4 = h3 Ki+3 + h2 Ki+2 + h1 Ki+1 + Ki mod 2

Due to its linear nature, an LFSR can be easily broken. We shall introduce non-linearity by using
AND and OR gates in the circuit which will be covered in the next class :)

10
Lecture - 06
Topic: DES, Feistel Cipher

Stream Cipher - Short Summary

ki

m ⊕ ci

Block Cipher
Bundle of bits are fed.

k
k 56 bit

m c
m c DES
64 bit 64 bit

Principle

m confusion diffusion c

The text undergoes several iterations of confu-


sion and diffusion.

Confusion
Confusion + Diffusion = Security ;)
Relation between the text message and cipher-
text is obscure.

Diffusion
Change in one bit in the plaintext influences
multiple bits in the ciphertext.

11
IP

1 58

1 40

IP −1

1 40

1 58

Feistel Cipher

L0 R0

k0

+ f

L1 R1

• Left half(L0) is encrypted in one round and put on the right side(R1).

• Right half(R0) is simply copied to the left side(L1).

• The function f depends on the right half (here, R0) and the corresponding key of that round.

12
Function f (Ki+1 , Ri )

Ri (32bit)

Expand

⊕ Ki+1 (48 bits from 56 bits key)

48bit

6bit 6bit 6bit

S1 S2 S8

4bit 4bit 4bit

32bits

f (Ki+1 , Ri )

Expansion
1 bit may expand to 1 or 2 bits.

1 2 3 32

1 2 3 4 5 48

Substitution
Substitution is done based on 8 tables(S1 to S8) of size 4×16. The 6 bit input is used to navigate
through the substitution table. The first and last bit indicates the row and the 4 bits in the middle
indicate the column.

Key Generation
The Key is of 56 bits, outta which every 7th bit is redundant and thus discarded after shifting(to
be precise, its rotation) as shown below.

13
C0 D0
28bits 28bits
1 2 3 n
(shif t1) (shif t1)
Shift by 1

n 1 2 n−1
C1 D1

Note : The left half and right half of first key are C0 and D0 , respectively. Similarly, the second
key comprises of C1 and D1 .

Thus after rotating, we discard the 7th bits to get 48 bit keys for every corresponding round.

• The keys for each round are generated by shifting from the preceding key.

• In rounds 1, 2, 9, and 16, shift occurs with an offset of 1, while for the remaining rounds, the
offset is 2.

14
Lecture - 07
Topic: Decryption of DES and Assymetric Cryptography
Last class we saw about encryption in DES. Now lets see how to decrypt it.

Decryption in DES
For decryption we just follow the exact same enryption method in reverse order. You may think
that we need to calculate f −1 which had expansion and contraction using s-boxes. But actually, we
don’t need its inverse and we can use the same f function.

Ri+1 = Li ⊕ f (ki , Ri )
Li ⊕ Ri+1 = Li ⊕ Li ⊕ f (ki , Ri )
Li ⊕ Ri+1 = f (ki , Ri ) [Ri = Li+1 ]
Li ⊕ Ri+1 = f (ki , Li+1 ) [Ri = Li+1 ]
Li = Ri+1 ⊕ f (ki , Li+1 )

Thus, we get Li from Ri+1 and Li+1 .

L1 R1

k0

f +

L0 R0

As we have done the reverse, for 1 round, if we do it for 16 rounds, the final ouptut comes in R0
and L0 in reverse order and thus it should be reversed to get the original message.
Lets prove that its in reverse order. [d- denotes for bitstrings during decryption, and e- denotes

15
encryption]

Ld0 = R16
e

R0d = Le16 = R15


e

Ld1 = R0d = R15


e

R1d = Ld0 ⊕ f (R0d , k1d )


= Ld0 ⊕ f (R15
e e
, k16 )
e e e
= R16 ⊕ f (R15 , k16 )
= Le15 ⊕ f (R15
e e
, k16 e
) ⊕ f (R15 e
, k16 )
= Le15

thus, R1d = Le15 and similarly, Ld1 = R15


e . so just the right and left are just reversed.

Problem with DES, and subsequent methods


Here, since the key length is just 64 bits, brute force method was very much possible, and so other
methods such as AES, DES were introduced.

AES(Advanced Encryption Standard)


AES is a block cipher which encrypts 128 bits at a time. Its key length are 128 bits, 192, or 256
in different different methods. Correspondigly, 10 rounds, 12, and 14 rounds of encryption happens
respectively.
Unlike DES, here every operation is byte(8 bits) wise and not bit wise.
In every round, the following happens:

• byte Substitution(by passing thorough s-blocks and stuffs)

• Row shift (for diffusion)

• Mix column (for diffusion)

• Key addition(xor with key)

Since they are byte wise operation, (i.e, 0,1,2,..,255) finite field algebra would be useful to understand
AES. Galois field theory (GF(256)) - prof suggested to read this on our own.

3DES or TDES(triple DES)


It can be done is two ways as shown:

m DES DES DES c

k1 k2 k3

16
m DES DES −1 DES c

k1 k2 k3

RivestShamirAdleman (RSA) algorithm


Its an Assymetric type encryption. The method is as follows.

• get two large prime numbers p and q (let p, q > 21024 )

• find n = pq

• euler’s function φ(n) = (p − 1)(q − 1)

• select public key e such that, gcd(e, φ(n) = 1)

• obtain private key d which is = e−1 (mod φ(n)). Here, inverse element exist because gcd(e, φ(n)) =
1.

private info - d,p,q


public info - e,N

encrypt: c = me (mod n)

decrypt:

cd (mod n) = (me )d (mod n)


= med (mod n)
qφ(n)+1
=m (mod n)
qφ(n)
=m .m (mod n)
= m ;fermat’s little theorem

fermat’s little theorem - aφ(n) (mod n) = 1.

Generally, Assymetric key is computationally very expensive and thus very difficult to send the
entire message through Assymetric method. Thus, keys of Symmetric key method are sent using
Assymetric method and messages are sent with Symmetric key encryption henceforth.

17
Lecture - 08
Topic: RSA algorithm and DiffieHellman Key Exchange

Breaking RSA
The main difficulty of breaking RSA algorithm lies in factorisation of n. As seen in previous lecture,
if we could find p and q from n, then φ(n) can be found out. Using e (Public key), d can be found
and thus could be decrypted by any Eavesdropper. Thus factorisation is the only difficulty here..
In fact, such a factorisation is an open problem, if factorised, awards are awaiting.. See link

DiffieHellman Key Exchange


Decide on prime number p and a primive root α(Primive root will be explained later. For now,
assume its a number from 1 to p − 1). Both these are public info.

A chooses a private key a from 2,3,..,p − 2 A chooses a private key b from 2,3,..,p − 2
(Later would be explained why 1 and (Later would be explained why 1 and
p − 1 are not included.) p − 1 are not included.)
Calculate public key A = αa (mod p) Calculate public key B = αb (mod p)
Send A Send B

Recieve B. Recieve A.
KAB = B a (mod p) = αab (mod p) KAB = Aa (mod p) = αab (mod p)

Now, using KAB , Symmetric Cryptography techniques can be applied to send long messages.

Breaking DHP(DiffieHellman Problem)


Eavesdropper knows α, p, A, B. The only unknown things are a, b.

A = αa (mod p)
a = logαA (mod p)

Solving this Discrete Logarithm Problem(DLP) is computationally very hard.

Group Theory
A group is a set of elements and an operator * such that it follows the following properties:
• Closure property: if a, b ∈ G, a ∗ b = c ∈ G
• Associativity
• existance of neutral element(identity element)
• existance of inverse element
(Commutativity is not necessary but if there, its called an abelian group)

18
How to choose α
Here, lets define a group Z∗p = {i : i ∈ 0, 1, .., p − 1, gcd(i, p) = 1} with operator as multiplication in
(mod p).
If p is a prime number, then Z∗p = {i : i ∈ 1, .., p − 1, gcd(i, p) = 1}
Euler’s function, φ(p) = |Z∗p | = p − 1 for a prime number.
Fermat’s little theorem states that, aφ(n) (mod n) = 1 for any n. So here, αφ(p) (mod p) = αp−1
(mod p) = 1
i.e, on powering α repeats in a cycle of length, a factor of φ(p) = p − 1. If the length is exactly
p − 1, we say its a primitive root of p. We want the length to be as long as possible to maximise
randomness, and number of possiblities of αn .
Eg - in Z∗11 , 2 is a primitive root whereas, 3 is not.. 3 has a cycle of 5, and 2 has a cycle of 10 (=
11-1). Try it out.

Why are a and b chosen from {2,3,..,p − 2}, i.e, 1 and p − 1 are ignored?
If a = 1, A = α1 (mod p) = α. Since A and α are known, its easily seen that they’re equal and so
a = 1 can be found. Then, KAB = B a = B simply. Thus key get knowned.
If a = p-1, A = αp−1 (mod p) = 1(Fermat’s little theorem). Since A = 1, its easily seen that
they’re equal and so a = p − 1 can be found, provided α is a primitve root. Otherwise, a will be a
factor of p−1 which also could be found easily ig.. Then, KAB = B a simply. Thus key get knowned.

19
Lecture - 09
Topic: Diffie-Hellman Key Exchange Protocol (DHD), Elgamal

Diffie-Hellman Key Exchange


Say there are two parties A and B that want to communicate:
Public info : p, α

Party A Party B
Kprivate(A) = a ∈ 2, 3, ..., p − 2 Kprivate(B) = b ∈ 2, 3, ..., p − 2
Kpublic(A) = A = αa (mod p) Kpublic(B) = B = αb (mod p)

A
−−−−→ KAB = Ab (mod p)
= αab (mod p)
B
KAB = B a (mod p) ←−−−−
= αab (mod p)
c
c = m.KAB (mod p) −−−−→ m = c.KAB (mod p)

Elgamal(1985)

20
Lecture - 10
Topic: Digital Signatures - RSA

Security Objective(CIA)
• Confidentiality is reqd. Encryption using a key ensures confidentiality in fact as they can be
decrypted only people with the key.

• Integrity: Message shold not be tampered in between. We should be able to identify if its
tampered or not.

• Message Authentication: It should be provable that only the intended sender sent the
message and no one else.

• Non-repudiation: If someone sent a message, later it can be proved that none other than
him/her sent it. Eg- signature. Like if signed, later they cannot deny that they didn’t send
it.

• Access Control: Who all can read the message, who all can modify it, etc should be able to
be controled.

Digital Signatures
Similar to normal Signatures but are done on digital documents. It provides the above parameters
including authentication and non-repudiation.

How to create digital signatures?


Say a document carrying message m is there and signature s. Message and signature are from ”Mes-
sage space” and ”Signature space” resp (basically superset of messages and signatures). Person A
is the owner and B checks the signature.
B calculates the sign of the document using key k i.e, Sig(m) and checks if its same as S or not. If
its same, then signature is verified and otherwise no.
If its document is tampered, calculated signature won’t be same and could be identified that its
tampered.
But this does not ensure Non-repudiation as anyone can find Sig(m).

To include Non-repudiation, Assymetric Key distribution can be used, where private key is used to
find S and it can only be verified by public key and signature cannot be calculated by it.

Digital Signature using RSA


Private info(i.e owner): d, p, q, n = pq, signature S = md (mod n). Now, e = d−1 (mod φ(n)),
n, and S are sent

Public info(i.e verifier): m ,n , S, e.


S e (mod n) is calculated and if its equal to m, then its verified and not verified otherwise.

21
What does attacker wants to do?:
Message is known to everyone(like everyone knows A send 1000 rs to C), but what an attacker
wants to do is that, he wants to change the message to m0 (say change 1000 to 10000 rs). Now, to
calculate signature S 0 , d is required. Say he made up some random d. Since, φ(n) is unknown, e
cannot be calculated, and thus e can’t be send.

Digital Signatures using Elgamal

• Setup Phase: p, α are choosed and are available pub-


lically

• Owner chooses d from {2,..,p − 2} and Kpub = β = αd


(mod p) is calculated and sent. This is called Long
Term Public Key. This is not changed for every mes- Known info - p, α, Public key
sage the owner sends. Its maintained across all the β, message m, signature (r, s).
messages he/she sends. Verification: is αm (mod p)
same as β r .αs (mod p)? If
• Ephimeral key KE chosen from {2,..,p − 2} such that yes, its verified.
gcd(KE , p − 1) = 1. Ephimeral keys are generated
newly for each message the owner sends.

• Signature is calculated. like r = αKE (mod p) and


s = m − d.r (mod p). The pair (r, s) is the signature.

Correctness

β r .αs (mod p) = β r .αm−d.r (mod p)


d.r m−d.r
=α .α (mod p)
m
=α (mod p)

Now, if someone wants to tamper the message, say they wants to create signature of the owner of
different message m0 .
r = αKE can be calculated by choosing some random KE . But to calculate s, d is required which
can’t be solved from the eqation β = αd (mod p) due to discrete logarithmic problem.

Disadvantage
The signature size is double the size of RSA as we are sending both r and s as signatures.

22
Lecture - 11
Topic: Digital Signatures Algorithm

Digital Signature Algorithm


This is currently widely used methof for Digit Signatures. This algorithm reduces the size of the
signature from 2048 to 320 bits and 4096 to 422 bits as compared to Elgamal method.

• Choose p s.t. 21023 < p < 21024

• Choose q, a factor of φ(p) = p − 1 s.t. 2159 < q < 2160

• Choose α s.t. ord(Zp ) = q.


i.e, α is the generator of a subgroup of Z∗p of order q.
In other words, there exists a subgroup of Z∗p such that • Verification
{e : eq = 1 (mod p)} and α is the primitive root of the
subgroup. We’ll discuss this in detail later. • Compute

• Kpr = d ∈ {0, 1, 2, 3, ..., q − 1}. w = S −1 (mod q)


µ1 = w.SHA(m) (mod q)
• Choose Kpub = β = αd (mod q)
µ2 = w.r (mod q)
• Public parameters transfered to B are : Kpub = β, p, v = (α .β µ2
µ1
(mod p)) (mod q)
q ,α

• Ephimeral key KE ∈ {0, 1, ..., q − 1}.


• If v ≡ r, then the signature is valid.
• Signature : Else invalid
r = αKE (mod q)
s = (SHA(m) + dr)KE−1 (mod q)
Assume SHA to be a funtion which converts any mes-
sage to a 64 bit value. We’ll discuss this later.

• SHA(m),(r,s) sent to B.

r, s are 160 bits each, and thus, the signature (r, s) is of 320 bits (much smaller as compared to
Elgamal method of 1024 + 1024 = 2048 bits.)

Correctness

s = (SHA(m) + d.r)KE−1 (mod q)


s.s −1
= (SHA(m) + d.r)KE−1 s−1 (mod q)
1 = (SHA(m) + d.r)KE−1 s−1 (mod q)
−1
s .(SHA(m) + d.r) = KE (mod q)

23
So, s−1 .(SHA(m) + d.r) = θq + KE .
−1 SHA(m) −1 r
v = αs βs (mod p) (mod q)
s−1 SHA(m) s−1 r.d
=α α (mod p) (mod q)
s−1 SHA(m)+s−1 r.d
=α (mod p) (mod q)
s−1 (SHA(m)+r.d)
=α (mod p) (mod q)
= αθq+KE (mod p) (mod q)
q θ KE
= (α ) .α (mod p) (mod q)
= αKE (mod p) (mod q) [as αq = 1 (mod p)]
KE
=α (mod q)
=r

Here too, as d is unknown signature can’t be created by a random person other than the owner.

Clearing some concepts


Subgroup of Z∗p such that {e : eq = 1 (mod p)} and α is the generator of the subgroup. Here, first
notice that its a group on its own as the properties such as Closure, Associativity, etc exists.
Does inverse element exist?
We know that e−1 exits in Z∗p . We just need to prove e−1 is in the subset provided e is in it.

(e−1 )q = x (mod p)
q −1 q q
e .(e ) = e .x (mod p)
−1 q
(e.e ) = 1.x (mod p)
1=x (mod p)
x=1

Thus proved e−1 is in the subset. Thus proved that this subset is actually a group.
So just choose a primitive root α from this subgroup.

SHA is a hashing funtion which creates any message to a 64 bit value. It hashes so widely such
that, practically every message hashes to a different value. Though through pigeion hole principle,
there will exist messages to hashing to same value mathematically, practically we can assume they
hash to different values.

Why do we hash the funtion? why can’t we use the message at itself?
First, hashing increases confusion in message, i.e, even if just a small change is made in message,
its hash would be completely different. Secondly, messages can be arbitarily long, and so to have it
standard, we hash it to 64 bit value.

24
Lecture - 12
Topic: Digital Signatures Algorithm

Key Exchange
Two parties A and B wants to exchange key between them to use Symmetric Cryptography from
then as Assymetric algorithm is computationally expensive to send the whole message.
Passive Eavesdropping: Just can listen on the common commutication channel between two
parties and cannot send data.

Active Eavesdropping
Say an intruder is Active eavesdropper, i.e, he/she can listen to the common communication channel
and also write or transfer messages.
Such an attack is called Man In The Middle attack. Take example of DHKE between people A
and B. So, when A = αa (mod p) and B are transfered from each other, the intruder reads them,
modify them as A0 and B 0 and send to each other.

R chooses a private key r1


from 2,3,..,p − 2
Calculate public key A0 = αr1
(mod p)
A chooses a private key a R chooses a private key r2 A chooses a private key b
from 2,3,..,p − 2 from 2,3,..,p − 2 from 2,3,..,p − 2
Calculate public key A = αa Calculate public key B 0 = αr2 Calculate public key B = αb
(mod p) (mod p) (mod p)
Send A Send A0 to person B Send B
Send B 0 to person A

Recieve B 0 . Recieve A0 .
KAB 0 = B 0a (mod p) = αa.r1 Recieve A. KA0 B = Aa (mod p) = αa.r2
(mod p) Recieve B. (mod p)
KAB 0 = Aa (mod p) = αa.r1
(mod p)
KA0 B = Aa (mod p) = αa.r2
(mod p)

Both the parties can’t even identify that such an intruder is present in between. Person A assumes
B 0 was sent by actual person B and vice versa. Once, such a thing is done, the intruder can not
only read the messages sent between them, but also can modify and send messages between them.
Very similar attack is possible even in Elgamal. This problem arrises because, there is no authen-
tication for any message. To resolve authentication, one may think we can use digital signatures.
But, that is also prone to such a middle man attack.

25
Centralized Agency (CA)
Thus some authorised middle man is required to authenticate both and build trust. Such an agency
is called Centralized Agency (CA). Such a system with CA is called PKI i.e Public Key Infrastruc-
ture. The above problem just requires people to send an ID along with message to authenticate.The
centralized agency produces a certificate for whatever it sends and its ID. When we use https, we
see a lock symbol in the browser. Its one kind of a certificate.
Now, for a person A to communicate, it asks for a certificate to CA by sending (A = αa (mod p), IDA )
(IDA is some ID sent by it like IP address or serial number of processor or something which is unique
across world). Now, CA produces a signature(SA = SignKprivate,CA (A, IDA )) on them using its own
private key (Kprivate,CA ) and sends it back along with its public key(Kpublic,CA ).

A chooses a private key a from 2,3,..,p − 2


Calculate public key A = αa (mod p) B knows the public key of CA and so it
Send (A, IDA , SA ) can verify (A, IDA ) using SA . Thus B can
be sure that its sent from A only and not
from intruder. Now follow same steps for
Recieve B. Check it with CA. our regular algorithm.
KAB = B a (mod p) = αa.b (mod p)

Now, if intruder wants to do something, it can’t give fake signature of CA about A = αa (mod p)
and ID as private key of CA is not known to anyone. Of cource if CA’s private key is known, he/she
can break anything on their way. Assuming its not possible, now communication is assumed secure.

26
Lecture - 13
Topic: Security Services

Security Services
• Confidentiality

• Authentication

• Integrity (Not tampered by third party)

• Non-repudiation

• Access Control

• ...

Confidentiality can be ensured by Encryption.


Authentication can be ensured by Digital Signatures.
Integrity can be ensured by Diffusion.
Non-repudiation can be prevented by Digital Signatures.

Access Control is our next topic of discussion :)

Elgamal
Vulnerable to Man in the Middle Attack.
Solution to this? Certification Authority as we saw in last class.
The signature of CA is obtained by providing Public key A and IDA . Third party cannot obtain
the certification in the name of A as their identity is verified at multiple levels like Govt ID, emails,
phonecalls, etc. Also, the certification comes with a validity period, and should be renewed after
that period. Nowadays, the certificate comes along with the device when it is bought.
There is no one centralised CA. The main CA is divided into multiple sub CAs. This is done as if
the centralised CA is compromised, then all the data in the world is at risk.
This kind of security is used in SSL, TSL Protocols.
SSL - Secure socket Layer
TLS - Transport layer security
We have four levels of security:

• Application

• Transportation Layer

• Network Layer

• DLL

• Physical Layer

27
Lecture - 14
Topic: Security Protocols(Guest Lecture)
Have you ever noticed broken lock symbol in unsafe websites? What are these? How do we actually
send messages?

Digital certificate
Every device would have a digital certificate which contains information like the person’s iden-
tity(eg- name, address, organisation), public key, some serial number, validaity dates, certification
authority’s identity, its digital signature, etc which would be verified by the CA(Certificatation
Authority). So indirectly, every user trusts the CA (or whoever is the issuer of the certificate). All
the personal information would be stored in a data stucture, and certificate would be issued by it.
But CA cannot certify all over the world right? Traffic would grow exponentially. So, CA can also
allot people, and give a certificate that, this firm can issue certification. So end users can believe
certificate issued by either CA, or by firms approved by CA. Actually, we can choose organisations
that we trust for certification issuing. Many organisations, banks do this. Try this at room using
the slides.

Protocol against Middle Man Attack


Say person Alpha wants to send message to Beta and there is a middle man Gamma trying to
decrypt it. Each have a certificate from CA for their public key. With the public key of the CA,
anyone can verify the certificate. Now, everyone knows everyone’s public key, their cerificates. So if
Alpha wants to send to Beta, he can use Beta’s public key so that only Beta can decrypt it. When
we make connection with google with https, we authenticate google using its certificate. See the
slides, for SSL protocol where exchange of messages to google is clearly mentioned.
This is a broadcast system where each person can see anyone’s message transaction with everyone
having certificate and public keys. But, can an internet service provider(ISP like Jio/ Airtel)
interfere in between? Actually, every internet packet is through ISP only, and so they can do
anything they want. SSL certificates are for websites mainly. For emails, its seperate, for whatsapp,
or facebook, it different it seems.

28
Lecture - 15
Topic: Security Protocols(Guest Lecture)

Revoking a certificate
Say you have your physical id card. And the institute wants to revoke it before the day of expiry.
Since its physical, it can be broken, and revoked, but how to invalidate a digital certificate. We
can’t delete the certificate simply as its known by everyone. Also, if a person is offline, he won’t
Recieve the info that its revoked. We can’t maintain a revocation list as there might be many
certificates revoked everywhere, and its very difficult. There is a heirarchy of certification agencies
which makes it more difficult.

Replay Attack
Say a car is unlocked by a key with IR transmitter. It can be listened, and replayed, or re created
to unlock the car. This kind of attack is called replay attack. To avoid this, we can include time
stamp along with the message.

So, see both the problems mentioned above. A certificate comes along with a freshness term to
represent a time stamp at which a certificate is created.

Authorization / Access Control


We have seen before what both means. We need to somehow maintain permissions for each person
to view/edit any data. There are two types of Access control:

DAC: Discresionary access control.


The owner has the full control over who can read the file, who can update/modify, etc.

MAC: Mandatory access control.


Irrespective of who is the owner, the sytem administrator has to enforce in system level. He
need to design a policy and needs a mechanism to enforce it. There would be system, and
people with an account in it can login on it. Based on his username or something, the system
administrator should enforce that for that user, some of the files are available.

Subject(principle),Object(resources),Action
Matrix: Here we have a matrix with columns as objects, and rows as subjects(users). The matrix
simply indicates the permission each user has over the files(objects).
Suppose user 2 wants to read file 3, the matrix says only write, so he cant read file 3. This Matrix
is simply to maintain access control.
This is not an efficient way of maintaining access control - Sir

A more systematic way to maintain these permissions is by creating an access control list, for each
file s list of the permissions each user have on it. The list can also be maintained capability wise,
that is a list of permissions of particular user one every file.

29
Confused deputy problem
A compiler has read for

solution
Capability-wise list is safer because ... hmm... not sure

30
Lecture - 16
Topic: DAC and MAC

ACM(access control matrix)


This we have seen last class that permissions of every file and every user are stored in a matrix.

ACL(Access control list)


This is a big list which

Confidentiality bases access control model


Some of such models are:

Bella La Pudela model(BLP)


So we have subjects(users) S, and objects(files) O. Now, we have different levels of secracy of an
object. They are - top secret, secret, confidential, and unclassified(public). Eg- some persons email
is confidential, asc report is secret, and some other very important thing is top secret. These 4
categoried can be given to the objects. Then we assign a clearence label of a particular subject i.e,
what all level files can he/she wants. Now, if a person has clearence level of top secret, then he
littrally can see any resources they want. Coming down the heirarchy, the resources will be limited.
This kind of policy is called “NO READS UP”, i.e a person can’t read files of higher sensitive
information than their clearence level. Here, for writing, we follow something called “NO WRITES
UP” where you can’t write files of lower priveldge level. Thus, here, even though a public writes on
a secret file, it won’t be able to be read by other public people. Thus confidentiality is maintained.
This may seem stupid as a public person has access to change secret info. But here, we don’t care
about other things, and we only care about confidentiality.

Integrity
In the previous method, we just implement “NO WRITES UP” and “NO READS DOWN”. This
can be observed to ensure integrity, i.e, un-authorised people can’t write on secret files.

31
Lecture - 18
Topic: DAC and MAC
Previous class we discussed about information flow. There, we imagine a machine giving output
according to input, and knowing the output, will we be able to find the inputs or not was the
question.

Lattice based information flow model


Here, there is a direction of information flow, and based on whether its legitimate or not, etc, we can
model it as a latice. We construct a partial order where the directed edges indicates the legitimate
flow of information. Eg, if x ≤ y, then its legitimate if information flow happens from x to y. We

Partial order

define two operators in latice, namely lowest upper bound (⊕), and greatest lower bound (⊕).

Example
Say we write y = x + z, p = y + t, so there is a info flow from x → y(x ≤ y), z → y(z ≤ y) but,
there is no info flow from y → p 1 .
Some of the properties of this model are :-
1. reflexive (A ≤ A)
2. transitive. If info flows from A to B, and B to C, we can say it flows from A to C.
1
This is because, there would have been a value of y which we know, but after its over written by x and z. So now
using p, we can infer only about the new value, and not the previous one.

32
3. Anti-Symmetric. If info flows from A to B, and B to A, only possibility is that A = B.

RWFM
Its read write flow model.

Information Leaking
What if the algorithm is correct, but its implemented wrongly which leads to information leakage.
This is called a Bug. We can identify this by methods like Static Analysis. We can dry run(i.e,
mentally run it and see for implementation bugs) the code.

33
Lecture - –
Topic: Mid sem solutions

1. (a) integrity
(b) availability, confidentiality
(c) integrity / confidentiality
(d) availability, confidentiality

2. 21 bits

34
Lecture - 20
Topic: Browser Security
Mostly, whatever website you login to, its asks for either its own login, or authentication, or google’s
account, etc. So, we believe google, facebook to keep our login passwords safely, and so we get in
through google credentials. So now, what is the conversation, or policy between google and that
particular website?
Google gives an API, and whichever website uses it, google lends its authentication to it. Also here,
google is a witness for your activites in the website. It actually, now can see how many times you
logged in, etc and so it can give suggestions accordingly.

Cookies
Its a mechanism to not ask the user again and again for our preferences. Eg- if you choose your
preferred language, it’ll be stored in cookie, and the next time you login, it knows the preference.
Cookie, is a file, stored in a browser side to store such things. It may also contain your passwords.
So is it possible for it to get stolen?

• If we use a shared computer, since the cookie is stored in the computer only, if another user
uses it, it can get leaked.

Cookies are encrypted today to avoid the stealing.


Though they are encrypted, there are some ways to break it. They are

XSS(cross site scripting)


its a type of computer vulnerability typically executed with the help of web applications through
breaches in user’s web browser. Eg- if some website takes input, you can input a javascript code
which modifies the javascript of the website and thus crashing it.

To avoid this, if we see some script like input, we ignore it. Nowadays, most websited so this, but
an unaware web developer could be a harm for its user.

35
Reflected XSS
There are two methods of providing the input to the website. They are GET and POST methods.
In GET method, the information is directly given in the url and the web’s code reads from the url
and takes it as input and run accordingly. Eg- instead of searching for the string “data security” in
google’s tab, you can direcly give the url you’re seaching for itself as https://www.google.com/search?q=data+se-
curity.
So, in this method, lets say a hacker gives you a link with input as a javascript, then if you click on
it, the website starts working accoding to the hacker’s contol.

CSRF
Typical practical scenario will be like, if you login in some website though your username, pwd, etc
and you log out, the session should expire and authentication should end. But if csrf is not handled
properly and you click back button, from your history, you can see again your personal data in
website even after logging out.

36
Lecture - 21
Control Flow

• Control Flow Integrity Check (CFI)


• Malware Detection
• Side Channel Based Attacks

Malware Detection
Signatures

Behaviour
Behaviour of a program is .

Side Channel Based Attacks


Ciphers used today are vulnerable after quantum computers. Post Quantum Cryptography is lattice
based cryptography.
Hardware is designed to offer maximum performance while consuming minimal power. Earlier per-
formance was the primary goal, but these days it is minimal power consumption.

Performance increase per year vs time

• 20% /year - before 1985/86


• 50% /year - 1985 - 2003/2004 (motto : faster is better)
• 20% /year - after 2003 (motto : more is better)
But memory accesses are significantly slow (about 100 times slower) which grew at about 6%/year.
Caches were introduced to enhance memory access.
It was believed Control flow was the bottleneck. Actually, while a program runs, 10% of the code
executes for 90% time. This can be prevented by speculating and predicting the direction after a
conditional.

Once the control flow is fixed, data flow became the bottle neck.

Various performance enhancements:


• Speculation of branch direction.
• Speculation of Data.
• Out of order execution.
• Use of Caches.
• Multithreaded architecture support.

37
Lecture - 22
Topic: Malwares
Since every laptop nowadays are connected to internet, malware is common to occur. It means the
ability to sneak in another person’s system. Malicious code is a code aimed to harm a system’s
usage or the user. Categories:

• Virus

• Worm

• Backdoor

• Trojans - spying and entering a system with some other identity. Almost 70% of the malware
attacks are trojans only.

There is a requirement to avoid these as these attacks may be as costly as billions of USD. From the
past malware attack data, we observe that though number of such attacks are growing exponentially,
the total number of distict families of attacks grow only linealy with time. This is a good news that
we just need to prevent attacks of a family.
Phishing is another common attack where spam emails are sent to steal login credentials of users.

Modelling Malware
A computer virus is a program that can infect ... - a formal defenition was given(refer to slides).

Possibility of a virus infection is based on the theory of self-reproducing automata and not based
on system damage or anything.

Theorem 0.1 For every defence mechanism, there exists a virus that escapes it and for every virus,
there exists a defence mechanism for it.

Key defenitions:

• family- all viruses having a same code base belong to same family

• variant- virusus which are just slightly modified from others.

Methods to prevent virus


Using signatures for programs. We can allow only programs of known signature to run and not run
otherwise. The problem with this is that the number of virus is also so much and checking them one
by one would be a very difficult job. In fact, virus detectors like McFree, etc have a full database
of signatures of viruses and it checks regularly.
A better option would be to detect malware families and not individual instances. If you see for
individual instances is that, if you just add a “no operation statement” in your code, the signature,
file size, and everything would change though the behaviour would be exactly the same.
To detect code pattens, AI could be used which learnt already existing virus’s code bases, and
identify the new variant viruses.

38
Lecture - 23
Topic: Slides revision in moodle

Trust anchor: The root of CA is completely trusted to be true, and its trust cannot be derived. It is assumed
to be genuine.

UI/UX symbols: if its green, its secure. If it reads “not secure!” in grey, then it means it was once trusted, and
maybe the certificate has expired. “not secure” in red means it was never trusted.

Wildcard DC: Wildcard Digital certificate is given for a whole domain rather than individual websites. Eg-
if “*iitb.ac.in” is verified, anyone having website in that domain is verified.

Types: Certificate pinning or stapling- Some certificates are hardcoded and cannot be revoked. Eg -
hardware devices, Proof carrying code, some mobile apps, etc.

CRL (Certificate Revocation List)


A CRL will list all the certificates that have been either suspended or revoked. CRLs are published
by Certificate Authorities. Generally, the root certificate for a Certificate Authority will specify the
location of that CA’s CRL.

OCSP (Online Certificate Status Protocol)


As you login to a website, first we need to check its authenticity with the CA. Checking this can
be done is OCSP protocol. OCSP Stapling improves performance by positioning a digitally-signed
and time-stamped version of the OCSP response directly on the webserver. This stapled OCSP
response is then refreshed at predefined intervals set by the CA. The stapled OCSP response allows
the web server to include the OCSP response within the initial SSL handshake, without the need
for the user to make a separate external connection to the CA. OCSP (Online Certificate Status
Protocol) is one of two common schemes used to maintain the security.

If the server’s certificate returns true upon OCSP (Online Certificate Status Protocol) check, it
means that the certificate is valid according to the issuing Certificate Authority (CA). The termi-
nation of the protocol would typically occur during the SSL/TLS handshake process, specifically
during the Certificate Verification step. During the SSL/TLS handshake, after the client receives
the server’s certificate, it performs a series of checks to verify the certificate’s authenticity, includ-
ing checking the certificate’s expiration date, verifying the signature, and optionally checking the
certificate’s revocation status using OCSP or Certificate Revocation Lists (CRLs). If the OCSP
check returns true, indicating that the certificate has not been revoked, the certificate is considered
valid, and the handshake can proceed.
OCSP is usually used in transactions where the validity of digital certificates needs to be verified in
real-time. This is particularly important in situations where immediate assurance of a certificate’s
validity is required, such as in online banking, e-commerce transactions, secure email communica-
tion, and accessing secure websites (HTTPS). By checking the certificate’s status via OCSP, parties
involved in the transaction can ensure that the certificate has not been revoked since it was issued,
thereby enhancing the security of the communication channel.

39
Perspectives
Perspectives is a decentralized approach to securely identifying Internet servers. Network notaries
are like CAs found across the internet and carry the digital ceritficates of various websites. Once
a user accesses a website the browser verifies the digital certificate of the website and compares it
with the previous certificates tracked by the trusted network notaries. If they differ then we know
there is a Man in the Middle attack.

Certificate Pinning by OS/Browser


Explanation: Certificate pinning is a security mechanism used by web browsers, operating systems,
and applications to prevent man-in-the-middle attacks and certificate spoofing. By pinning (or
associating) a website’s digital certificate with its public key or cryptographic hash, the browser
or OS can ensure that subsequent connections to the website only accept certificates signed by the
pinned key. This helps protect against attacks where an attacker presents a fraudulent certificate
signed by a compromised or rogue CA.

40

You might also like