Chapter 1 (Introduction)
Introduction to Cryptography
The word cryptography comes from two Greek words meaning “secret writing” and is the art
and science of information hiding. This field is very much associated with mathematics and
computer science with application in many fields like computer security, electronic commerce,
telecommunication, etc.
In ancient times, cryptography referred mostly to encryption, the mechanism that converts
readable plaintext into unreadable (incomprehensible) text, i.e. ciphertext, and to
decryption, the opposite process, i.e. conversion of ciphertext back to
plaintext. Although cryptography in the past was concerned mainly with message confidentiality
(encryption), today it also covers the study and practice of
authentication, digital signatures, integrity checking, key management, etc.
Encryption mostly provides secrecy for a message being transmitted over a communication
network. This is called confidentiality of the message. Only the sender and the intended
receiver know the key and can decipher the message.
Cryptology
The combined study of cryptography and cryptanalysis is known as cryptology, though most of
the time the terms cryptography and cryptology are used interchangeably.
Objective of cryptography
Encryption is the process of encoding a message so that its meaning is not obvious, i.e.
converting information from one form to some other unreadable form using an algorithm
called a cipher with the help of a secret value called a key. The text being converted is called
plaintext and the converted text is called ciphertext.
Alternatively, the terms encode and decode or encipher and decipher are used instead of encrypt
and decrypt. That is, we say that we encode, encrypt, or encipher the original message to hide its
meaning. Then, we decode, decrypt, or decipher it to reveal the original message.
Plaintext -> [Encryption] -> Ciphertext -> [Decryption] -> Original Plaintext
Fig: Encryption-Decryption
Encryption techniques have been in use for a very long time, as can be seen from the
technique called Caesar’s cipher, which Julius Caesar used to pass information to his soldiers.
Encryption has also been used extensively for military purposes, to conceal
information from the enemy. Nowadays encryption is used to achieve confidentiality in
many areas such as communication, internet banking, digital rights management, etc.
Key
Cipher
Cryptosystem
Cryptographic systems are characterized along three independent dimensions:
- The type of operations used for transforming plaintext to ciphertext. All encryption
algorithms are based on two general principles: substitution, in which each element in the
plaintext (bit, letter, group of bits or letters) is mapped into another element, and transposition, in
which elements in the plaintext are rearranged. The fundamental requirement is that no
information be lost (that is, that all operations are reversible). Most systems, referred to as
product systems, involve multiple stages of substitutions and transpositions.
- The number of keys used. If both sender and receiver use the same key, the system is referred
to as symmetric, single-key, secret-key, or conventional encryption. If the sender and receiver
use different keys, the system is referred to as asymmetric, two-key, or public-key encryption.
- The way in which the plaintext is processed. A block cipher processes the input one block of
elements at a time, producing an output block for each input block. A stream cipher processes the
input elements continuously, producing output one element at a time, as it goes along.
Substitution Cipher
In substitution ciphers the letters are systematically replaced by other letters or symbols.
1. Caesar Cipher
It is a simple monoalphabetic shift cipher in which each letter is replaced by the letter three
positions ahead (the actual Caesar cipher), using circular alphabetic ordering, i.e. the letter after Z is A.
The Caesar cipher is easily broken, even with ciphertext only. One can attack the ciphertext
by exhaustive search, trying all possible keys until the right one is found. Exhaustive search
is well suited to a small key space, and the Caesar cipher has only 26 possible keys.
Another approach is statistical analysis, in which we compare the letter frequencies of the
ciphertext with a 1-gram (single-letter frequency) model of English.
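The shift and the exhaustive search described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function names (caesar_shift, encrypt, decrypt, brute_force) are hypothetical and chosen only for this example.

```python
import string

ALPHABET = string.ascii_uppercase


def caesar_shift(text, key):
    """Shift each letter A-Z forward by `key` positions, wrapping after Z."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)


def encrypt(plaintext, key=3):
    """Classical Caesar cipher uses key = 3."""
    return caesar_shift(plaintext, key)


def decrypt(ciphertext, key=3):
    """Decryption is a shift in the opposite direction."""
    return caesar_shift(ciphertext, -key)


def brute_force(ciphertext):
    """Exhaustive search: the candidate plaintext for every key 0..25."""
    return [(k, decrypt(ciphertext, k)) for k in range(26)]


if __name__ == "__main__":
    c = encrypt("ATTACK AT DAWN")
    print(c)  # DWWDFN DW GDZQ
    for k, guess in brute_force(c):
        print(k, guess)
```

Because there are only 26 candidates, a reader can pick out the meaningful plaintext by eye from the brute_force output.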
Caesar’s Problem
The main problem with Caesar’s cipher is that the key space is too small, so the key can be found by
exhaustive search. In addition, statistical frequencies are not concealed well: the ciphertext
letter frequencies look too much like those of regular English, only shifted. One solution is to
increase the key length (for example, by using multiple letters in the key) so that cryptanalysis gets harder.
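The statistical weakness can be illustrated with a short sketch that recovers the shift by comparing letter counts against English single-letter frequencies. The rounded frequency table and the chi-squared scoring are assumptions of this example (the notes only mention a 1-gram model), and the function names are hypothetical.

```python
from collections import Counter

# Approximate relative frequencies (percent) of letters in English text.
# These rounded values are an assumption of this sketch.
ENGLISH_FREQ = {
    "A": 8.2, "B": 1.5, "C": 2.8, "D": 4.3, "E": 12.7, "F": 2.2, "G": 2.0,
    "H": 6.1, "I": 7.0, "J": 0.15, "K": 0.8, "L": 4.0, "M": 2.4, "N": 6.7,
    "O": 7.5, "P": 1.9, "Q": 0.1, "R": 6.0, "S": 6.3, "T": 9.1, "U": 2.8,
    "V": 1.0, "W": 2.4, "X": 0.15, "Y": 2.0, "Z": 0.07,
}


def shift_letters(text, key):
    """Shift letters A-Z by `key` positions (a negative key shifts back)."""
    return "".join(
        chr((ord(c) - 65 + key) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


def chi_squared(text):
    """Distance between the letter counts of `text` and the English 1-gram model."""
    letters = [c for c in text if "A" <= c <= "Z"]
    n = len(letters)
    if n == 0:
        return float("inf")
    counts = Counter(letters)
    return sum(
        (counts.get(ch, 0) - n * f / 100) ** 2 / (n * f / 100)
        for ch, f in ENGLISH_FREQ.items()
    )


def guess_key(ciphertext):
    """Return the shift whose decryption looks most like English."""
    return min(range(26), key=lambda k: chi_squared(shift_letters(ciphertext, -k)))
```

The attack needs a reasonable amount of ciphertext; on very short messages the frequency estimate is unreliable and exhaustive search is the safer choice.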
Transposition Cipher
In transposition ciphers the letters are systematically rearranged, so that the positions of the
letters change and the text becomes garbled.
2. Rail-Fence Cipher
The Rail Fence Cipher is a form of transposition cipher that derives its name from the way in
which it is encoded. In the rail fence cipher, the plaintext is written downwards and diagonally
on successive "rails" of an imaginary fence, moving up again when the bottom rail is reached.
When the top rail is reached, the message is written downwards again, until the whole plaintext is
written out. The message is then read off in rows.
For example, using 3 "rails" and the message 'WE ARE DISCOVERED FLEE AT ONCE', the
encipherer writes out:
W . . . E . . . C . . . R . . . L . . . T . . . E
. E . R . D . S . O . E . E . F . E . A . O . C .
. . A . . . I . . . V . . . D . . . E . . . N . .
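The zigzag layout above can be reproduced programmatically. The following is a minimal sketch with hypothetical function names; it assumes that spaces and punctuation are dropped before encryption, as in the grid shown.

```python
def rail_indices(length, rails):
    """Rail number (0 = top) visited at each position of the zigzag."""
    if rails < 2:
        return [0] * length
    # One full down-and-up cycle, e.g. [0, 1, 2, 1] for 3 rails.
    cycle = list(range(rails)) + list(range(rails - 2, 0, -1))
    return [cycle[i % len(cycle)] for i in range(length)]


def rail_fence_encrypt(plaintext, rails=3):
    """Write the letters along the zigzag, then read the rails row by row."""
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    idx = rail_indices(len(letters), rails)
    # A stable sort by rail keeps left-to-right order within each rail.
    order = sorted(range(len(letters)), key=lambda i: idx[i])
    return "".join(letters[i] for i in order)


def rail_fence_decrypt(ciphertext, rails=3):
    """Put each ciphertext letter back at its zigzag position."""
    idx = rail_indices(len(ciphertext), rails)
    order = sorted(range(len(ciphertext)), key=lambda i: idx[i])
    plain = [""] * len(ciphertext)
    for ch, pos in zip(ciphertext, order):
        plain[pos] = ch
    return "".join(plain)


if __name__ == "__main__":
    c = rail_fence_encrypt("WE ARE DISCOVERED FLEE AT ONCE", 3)
    print(c)                         # WECRLTEERDSOEEFEAOCAIVDEN
    print(rail_fence_decrypt(c, 3))  # WEAREDISCOVEREDFLEEATONCE
```

Reading the three rows of the grid from top to bottom gives the same ciphertext as the first print statement.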
Similarly, with 3 "rails" and the message THIS IS THE PLAINTEXT, the encipherer writes
out the following (the diagonal moves are not shown here; the letters are simply written down the
rails one column at a time):
TSTPIE
HIHLNX
ISEATT
The ciphertext is T S T P I E H I H L N X I S E A T T
The rail fence cipher is not very strong: the number of
practical keys is small enough that a cryptanalyst can try them all by hand. To decrypt this
simplified form, we work out the number of letters to be skipped. If the number of rails is n, the key is
(total letters in ciphertext) / n, so in our example n = 3 and the key is 18/3 = 6, i.e. skip 6 letters from the
letter you are reading every time to get the plaintext (remember to go circular: when the count
runs past the end, continue from the start with the next unread letter). Number the ciphertext
letters 1 to 6 repeatedly:
T S T P I E H I H L N X I S E A T T
1 2 3 4 5 6 1 2 3 4 5 6 1 2 3 4 5 6
Selecting the letters with index 1 gives THI. Choosing the letters with index 2 next gives SIS.
Continue like this until all the characters have been read off, which gives THISISTHEPLAINTEXT.
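The simplified variant and its skip-based decryption can be sketched with Python slicing. The function names are hypothetical, and the sketch assumes the message length is an exact multiple of the number of rails, as in the 18-letter example.

```python
def simple_rail_encrypt(plaintext, rails=3):
    """Write letters down the rails one column at a time, then read each rail."""
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    return "".join("".join(letters[r::rails]) for r in range(rails))


def simple_rail_decrypt(ciphertext, rails=3):
    """Read every `skip`-th letter, where skip = len(ciphertext) / rails.

    Assumes len(ciphertext) is an exact multiple of `rails`.
    """
    skip = len(ciphertext) // rails
    return "".join(ciphertext[i::skip] for i in range(skip))


if __name__ == "__main__":
    c = simple_rail_encrypt("THIS IS THE PLAINTEXT", 3)
    print(c)                          # TSTPIEHIHLNXISEATT
    print(simple_rail_decrypt(c, 3))  # THISISTHEPLAINTEXT
```

The slice ciphertext[i::skip] picks out exactly the letters that share index i + 1 in the numbering shown above.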
The Vigenère cipher is like the Caesar cipher, but uses a phrase as the key. For example, for the
message THE BOY HAS THE BALL and the key VIG, repeat the key over the plaintext and encipher
each letter with the Caesar cipher selected by its key letter:
key    VIGVIGVIGVIGVIGV
plain  THEBOYHASTHEBALL
cipher OPKWWECIYOPKWIRG
With the key letters along the top of the table and the plaintext letters down the left, decryption
is performed by finding the position of the ciphertext letter in the column corresponding to the
key letter, and then taking the label of the row in which it appears as the plaintext letter. For
example, in column V (key letter), the ciphertext letter O appears in row T, which is taken as the
first plaintext letter. The second letter is decrypted by looking up P in column I of the table; it
appears in row H, which is taken as the plaintext letter. This process continues until we have found
the plaintext letters for all the ciphertext letters.
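The table lookup is equivalent to adding (or, for decryption, subtracting) the key letter's position modulo 26. A minimal sketch follows, using a hypothetical function name and assuming letters are numbered A = 0 to Z = 25 and non-letters are dropped.

```python
from itertools import cycle


def vigenere(text, key, decrypt=False):
    """Shift each letter by the matching key letter; the key repeats as needed."""
    sign = -1 if decrypt else 1
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    out = []
    for ch, k in zip(letters, cycle(key.upper())):
        shift = sign * (ord(k) - ord("A"))
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


if __name__ == "__main__":
    c = vigenere("THE BOY HAS THE BALL", "VIG")
    print(c)                                 # OPKWWECIYOPKWIRG
    print(vigenere(c, "VIG", decrypt=True))  # THEBOYHASTHEBALL
```

The first two ciphertext letters, O and P, match the column V and column I lookups described above.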
5. Playfair Cipher
The best-known multiple-letter encryption cipher is the Playfair, which treats digrams in the
plaintext as single units and translates these units into ciphertext digrams.
The Playfair algorithm is based on the use of a 5 x 5 matrix of letters constructed using a
keyword. With the keyword MONARCHY, the matrix is:
M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z
The matrix is constructed by filling in the letters of the keyword (minus duplicates) from left to
right and from top to bottom, and then filling in the remainder of the matrix with the remaining
letters in alphabetic order. Plaintext is encrypted two letters at a time, according to the following
rules:
1. Repeating plaintext letters that are in the same pair are separated with a filler letter, such as
x, so that balloon would be treated as ba lx lo on.
2. Two plaintext letters that fall in the same row of the matrix are each replaced by the letter to
the right, with the first element of the row circularly following the last. For example, ar is
encrypted as RM.
3. Two plaintext letters that fall in the same column are each replaced by the letter beneath,
with the top element of the column circularly following the last. For example, mu is
encrypted as CM.
4. Otherwise, each plaintext letter in a pair is replaced by the letter that lies in its own row and
the column occupied by the other plaintext letter. Thus, hs becomes BP and ea becomes IM
(or JM, as the encipherer wishes).
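The matrix construction and the four rules can be sketched as follows. This is an illustrative sketch with hypothetical function names; it assumes I and J share one cell (J is stored as I), that X is the filler letter, and that a lone final letter is also padded with the filler, a case the rules above do not spell out.

```python
def build_matrix(keyword):
    """5 x 5 Playfair matrix as a flat list of 25 letters (J is stored as I)."""
    cells = []
    for ch in keyword.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch
        if "A" <= ch <= "Z" and ch not in cells:
            cells.append(ch)
    return cells


def make_digrams(plaintext, filler="X"):
    """Rule 1: split into pairs, separating repeated letters with the filler."""
    letters = ["I" if c == "J" else c for c in plaintext.upper() if "A" <= c <= "Z"]
    pairs, i = [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append((a, letters[i + 1]))
            i += 2
        else:
            # Repeated letter in the pair, or a lone final letter (assumption).
            pairs.append((a, filler))
            i += 1
    return pairs


def playfair_encrypt(plaintext, keyword):
    m = build_matrix(keyword)
    out = []
    for a, b in make_digrams(plaintext):
        ra, ca = divmod(m.index(a), 5)
        rb, cb = divmod(m.index(b), 5)
        if ra == rb:      # rule 2: same row, take the letter to the right
            out += [m[ra * 5 + (ca + 1) % 5], m[rb * 5 + (cb + 1) % 5]]
        elif ca == cb:    # rule 3: same column, take the letter beneath
            out += [m[((ra + 1) % 5) * 5 + ca], m[((rb + 1) % 5) * 5 + cb]]
        else:             # rule 4: own row, the other letter's column
            out += [m[ra * 5 + cb], m[rb * 5 + ca]]
    return "".join(out)


if __name__ == "__main__":
    for pair in ("ar", "mu", "hs", "ea"):
        print(pair, playfair_encrypt(pair, "MONARCHY"))  # RM, CM, BP, IM
```

With the keyword MONARCHY the sketch reproduces the four worked examples given in the rules.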
6. Hill Cipher
Another interesting multiletter cipher is the Hill cipher, developed by the mathematician Lester
Hill in 1929. The encryption algorithm takes m successive plaintext letters and substitutes for
them m ciphertext letters. The substitution is determined by m linear equations in which each
character is assigned a numerical value (a = 0, b = 1 ... z = 25).
For example, consider the plaintext "paymoremoney" and use the encryption key
K=( )
The first three letters of the plaintext, pay, are represented by the vector P = (15, 0, 24).
Each block of m plaintext letters is encrypted as
C = E(K, P) = KP mod 26
As with Playfair, the strength of the Hill cipher is that it completely hides single-letter
frequencies. Indeed, with Hill, the use of a larger matrix hides more frequency information. Thus
a 3 x 3 Hill cipher hides not only single-letter but also two-letter frequency information.
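The relation C = KP mod 26 can be tried out with a short sketch. The 3 x 3 key matrix below is an assumption of this example (a commonly cited textbook key), since the section does not list the entries of K; the function name is hypothetical. Each plaintext block is treated as a column vector multiplied by K.

```python
# Assumed example key; the entries of K are not given in the text.
KEY = [
    [17, 17, 5],
    [21, 18, 21],
    [2, 2, 19],
]


def hill_encrypt(plaintext, key):
    """Encrypt blocks of m letters as C = K P mod 26, with a = 0 ... z = 25."""
    m = len(key)
    nums = [ord(c) - ord("a") for c in plaintext.lower() if "a" <= c <= "z"]
    if len(nums) % m:
        raise ValueError("plaintext length must be a multiple of the block size")
    out = []
    for i in range(0, len(nums), m):
        block = nums[i:i + m]
        # One linear equation (one matrix row) per ciphertext letter.
        for row in key:
            out.append(sum(k * p for k, p in zip(row, block)) % 26)
    return "".join(chr(c + ord("A")) for c in out)


if __name__ == "__main__":
    print(hill_encrypt("pay", KEY))           # LNS
    print(hill_encrypt("paymoremoney", KEY))  # LNSHDLEWMTRW
```

For the block pay = (15, 0, 24), the three row sums are 375, 819 and 486, which reduce modulo 26 to 11, 13 and 18, i.e. LNS. Decryption would need the inverse of K modulo 26, which is not shown here.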
Introduction
Malicious logic is a set of instructions that cause a site's security policy to be violated.
Malicious code refers to a broad category of software threats to our networks and systems.
Perhaps the most sophisticated threats to computer systems are presented by malicious
code that exploits vulnerabilities in those systems. Any code that modifies or destroys
data, steals data, allows unauthorized access, exploits or damages a system, or does something
that the user did not intend is called malicious code. There are various types of malicious code
we will encounter, including viruses, Trojan horses, logic bombs, and worms.
A computer program is a sequence of instructions written to achieve a desired functionality;
the program is termed malicious when its sequence of instructions is used to intentionally
cause adverse effects to the system. In other words, we cannot call just any “bug” malicious
code. Malicious code is also called a programmed threat. The following figure provides an
overall taxonomy of malicious code.
- Independent: self-contained programs that can be scheduled and run by the operating
system.
- Needs host program: essentially fragments of programs that cannot exist
independently of some actual application program, utility or system program.
- Homogeneity – e.g. when all computers in a network run the same OS, anyone who can break
that OS can break into any computer running it.
- Defects – most systems contain errors which may be exploited by malware.
- Unconfirmed code – code from a floppy disk, CD-ROM or USB device may be
executed without the user’s agreement.
- Over-privileged users – some systems allow all users to modify their internal structures.
- Over-privileged code – most popular systems allow code executed by a user all the rights of
that user.
Trojan Horse
A worm is a program that can replicate itself and send copies from computer to computer across
network connections.
A Trojan Horse is a program with an overt (documented or known) effect and a covert
(undocumented or unexpected) effect. Dan Edwards was the first to use this term.
In other words, it is a program that appears useful but contains
hidden code that, when invoked, performs some unwanted or harmful actions.
Example: In the example above, the overt purpose is to list the files in a directory. The covert
purpose is to create a shell that is setuid to the user executing the script. Hence, this program is a
Trojan horse.
Example: A program named "waterfalls.scr" serves as a simple example of a Trojan horse. The
author claims it is a free waterfall screensaver. When run, it instead unloads hidden programs,
scripts, or any number of commands without the user's knowledge or consent.
A propagating Trojan horse (also called a replicating Trojan horse) is a Trojan horse that
creates a copy of itself.
Trojan horses are classified according to how they breach systems and the
damage they cause. The seven main types of Trojan horses are:
Computer Worms
A computer virus infects other programs. A variant of the virus is a program that spreads from
computer to computer, producing copies of itself on each one. A computer worm is a program
that copies itself from one computer to another. Unlike a virus, it does not need to attach itself to
an existing program. Worms spread by exploiting vulnerabilities in operating systems.
A worm uses computer networks to replicate itself. It searches for servers with security holes
and copies itself there. It then begins the search and replication process again.
Research into computer worms began in the mid-1970s. Shoch and Hupp developed distributed
programs to do computer animations, broadcast messages, and perform other computations.
These programs probed workstations. If a workstation was idle, the worm copied a segment
onto the system. The segment was given data to process and communicated with the worm's
controller. When any activity other than the segment's began on the workstation, the segment
shut down.
Example: The Father Christmas worm was interesting because it was a form of macro worm.
Distributed in 1987 and designed for IBM networks, it was an electronic letter
instructing the recipient to save it and run it as a program. The program drew a Christmas tree and
printed “Merry Christmas!” It also checked the recipient's address book and list of previously
received email and sent copies of itself to each address. The worm quickly overwhelmed the IBM
networks and forced the networks and systems to be shut down. It had the characteristics of a macro
worm because it was written in a high-level job control language, which the IBM systems interpreted.
The Nachi family of worms, for example, tried to download and install patches from Microsoft's
website to fix vulnerabilities in the host system — by exploiting those same vulnerabilities. In
practice, although this may have made these systems more secure, it generated considerable
network traffic, rebooted the machine in the course of patching it, and did its work without the
consent of the computer's owner or user.
In 1982, at the Xerox PARC research center, a worm was created to find idle machines. It was
used to distribute workloads and was not malicious, so worms can also be helpful.
Types of Worms:
Computer Viruses
When a Trojan horse can propagate freely and insert a copy of itself into another file, it
becomes a computer virus. A computer virus is a program that inserts itself into one or more files
and then performs some (possibly null) action. A computer virus works in two phases. The first
phase, in which the virus inserts itself into a file, is called the insertion phase. The second phase,
in which it performs some action, is called the execution phase. The following pseudo-code
fragment shows how a simple computer virus works.
beginvirus:
    if spread-condition then begin
        for some set of target files do begin
            if target is not infected then begin
                determine where to place virus instructions
                copy instructions from beginvirus to endvirus into target
                alter target to execute added instructions
            end;
        end;
    end;
    perform some action(s)
    goto beginning of infected program
endvirus:
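Cryptanalysis is the study of methods for obtaining the meaning of encrypted information without
access to the secret information which is normally required to do so. Typically, this involves
finding a secret key.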
Cryptanalysis can be performed under a number of assumptions about how much can be
observed or found out about the system under attack. It is normally assumed that the general
algorithm is known; this is Kerckhoffs' principle of "the enemy knows the system". There can be
many types of attacks, and broadly we categorize them as attack models: