Lecture 4 Hash Functions

Hash functions are mathematical functions that convert arbitrary-length input data into fixed-length hash values, essential for information security applications. They possess features like fixed-length output and efficiency, and must meet requirements such as pre-image resistance and collision resistance. Popular hash functions include MD5, SHA-1, and SHA-2, with applications in password storage, data integrity checks, and digital signatures, which provide authentication, integrity, and non-repudiation in cryptographic communications.

Uploaded by

justuscheson
Copyright
© © All Rights Reserved

LECTURE 4 HASH FUNCTIONS

Hash functions are extremely useful and appear in almost all information security applications.
A hash function is a mathematical function that converts a numerical input value into another
compressed numerical value. The input to the hash function is of arbitrary length, but the output is
always of fixed length.
Values returned by a hash function are called message digests or simply hash values. The following
picture illustrates a hash function.

Features of Hash Functions


The typical features of hash functions are −
• Fixed Length Output (Hash Value)
o A hash function converts data of arbitrary length to a fixed length. This process is often
referred to as hashing the data.
o The hash is much smaller than the input data, hence hash functions are sometimes
called compression functions.
o Since a hash is a smaller representation of larger data, it is also referred to as a digest.
o A hash function with an n-bit output is referred to as an n-bit hash function. Popular hash
functions generate values between 160 and 512 bits.
• Efficiency of Operation
o Generally, for any hash function h with input x, computation of h(x) is a fast operation.
o Computationally, hash functions are much faster than symmetric encryption.
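The fixed-length property can be checked with a minimal sketch using Python's standard hashlib module. The sample inputs are made up for illustration; the choice of SHA-256 and of the four functions listed is an assumption, not part of the lecture.

```python
import hashlib

# Inputs of very different lengths (illustrative values only).
inputs = [b"", b"a", b"hash functions" * 1000]

for data in inputs:
    digest = hashlib.sha256(data).digest()
    # SHA-256 is a 256-bit hash function: 32 bytes out, whatever goes in.
    print(len(data), "bytes in ->", len(digest) * 8, "bits out")

# Output size in bits of some well-known hash functions.
digest_bits = {name: hashlib.new(name).digest_size * 8
               for name in ("md5", "sha1", "sha256", "sha512")}
print(digest_bits)
```

Whatever the input length, the digest length depends only on the hash function chosen.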

Requirements for a Hash Function


1. H can be applied to a block of data of any size.
2. H produces a fixed-length output.
3. H(x) is relatively easy to compute for any given x, making both hardware and software
implementations practical.
4. For any given value h, it is computationally infeasible to find x such that H(x) = h. This is
sometimes referred to in the literature as the one-way property.
5. For any given block x, it is computationally infeasible to find y ≠ x such that H(y) = H(x). This
is sometimes referred to as weak collision resistance.
6. It is computationally infeasible to find any pair (x, y) with x ≠ y such that H(x) = H(y). This is
sometimes referred to as strong collision resistance.
The first three properties are requirements for the practical application of a hash function to
message authentication. The fourth property, the one-way property, states that it is easy to
generate a code given a message but virtually impossible to generate a message given a code.
The fifth property guarantees that an alternative message hashing to the same value as a given
message cannot be found. This prevents forgery when an encrypted hash code is used. The sixth
property refers to how resistant the hash function is to a type of attack known as the birthday
attack.
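To see why the sixth property depends on the output length, here is a rough sketch of a birthday-style collision search. It uses a deliberately weak toy hash, the first 16 bits of SHA-256, so the search finishes quickly. The names tiny_hash and find_collision are hypothetical helpers introduced only for this illustration.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, n_bits: int = 16) -> int:
    """Toy hash: the first n_bits of SHA-256 (deliberately weak)."""
    full = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return full >> (256 - n_bits)

def find_collision(n_bits: int = 16):
    """Return (x, y, attempts) with x != y and tiny_hash(x) == tiny_hash(y)."""
    seen = {}                      # hash value -> first message that produced it
    for i in count():
        msg = str(i).encode()
        h = tiny_hash(msg, n_bits)
        if h in seen:              # two different messages, same hash value
            return seen[h], msg, i + 1
        seen[h] = msg

x, y, attempts = find_collision(16)
print(x, y, attempts)
```

For an n-bit hash a collision is expected after roughly 2^(n/2) trials, far fewer than the 2^n trials needed to hit one particular hash value. This is why collision resistance requires a long output.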
Design of Hashing Algorithms
At the heart of hashing is a mathematical function that operates on two fixed-size blocks of data to
create a hash code. This hash function forms part of the hashing algorithm.
The size of each data block varies depending on the algorithm. Typically the block sizes are from 128
bits to 512 bits. The following illustration demonstrates the hash function −

A hashing algorithm involves rounds of the above hash function, like a block cipher. Each round takes an
input of a fixed size, typically a combination of the most recent message block and the output of the
last round.
This process is repeated for as many rounds as are required to hash the entire message. A schematic of
the hashing algorithm is depicted in the following illustration −

The hash value of the first message block becomes an input to the second hash operation, the output of
which alters the result of the third operation, and so on. A change in any block therefore propagates
through every later round. This is known as the avalanche effect of hashing.
The avalanche effect results in substantially different hash values for two messages that differ by even
a single bit of data.
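The avalanche effect can be observed directly. The sketch below flips a single bit of an arbitrary sample message and counts how many of the 256 output bits of SHA-256 change. The message text and the helper bit_difference are assumptions made for this example.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

m1 = b"transfer 100 shillings"
flipped = bytearray(m1)
flipped[0] ^= 0x01                 # flip exactly one bit of the message
m2 = bytes(flipped)

d1 = hashlib.sha256(m1).digest()
d2 = hashlib.sha256(m2).digest()
changed = bit_difference(d1, d2)
print(f"{changed} of 256 digest bits changed")
```

For a well-designed hash function, about half of the output bits are expected to change.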
It is important to distinguish between the hash function and the hashing algorithm. The hash function
generates a hash code by operating on two blocks of fixed-length binary data.
The hashing algorithm is a process for using the hash function, specifying how the message will be
broken up and how the results from previous message blocks are chained together.
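The split between hash function and hashing algorithm can be sketched as follows. This is a toy construction for illustration only, not a real or secure design. The function compress stands in for the hash function on two fixed-size blocks (it simply borrows SHA-256), while pad and toy_hash play the role of the hashing algorithm. All names, the block sizes and the padding rule are assumptions.

```python
import hashlib

BLOCK_SIZE = 64   # bytes per message block (assumed for illustration)
STATE_SIZE = 32   # bytes of chaining value passed between rounds

def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in 'hash function' operating on two fixed-size blocks."""
    assert len(state) == STATE_SIZE and len(block) == BLOCK_SIZE
    return hashlib.sha256(state + block).digest()

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes and the 8-byte message length to fill whole blocks."""
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % BLOCK_SIZE)
    return padded + len(message).to_bytes(8, "big")

def toy_hash(message: bytes) -> bytes:
    """The 'hashing algorithm': break up the message and chain the rounds."""
    state = b"\x00" * STATE_SIZE               # fixed initial value
    padded = pad(message)
    for i in range(0, len(padded), BLOCK_SIZE):
        # Each round combines the latest block with the previous round's output.
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state

print(toy_hash(b"abc").hex())
```

Because every round feeds into the next, changing any block changes the final value.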

Popular Hash Functions


Message Digest (MD)
MD5 was the most popular and widely used hash function for quite some years.
• The MD family comprises the hash functions MD2, MD4, MD5 and MD6. MD5 is specified in
RFC 1321. It is a 128-bit hash function.
• MD5 digests have been widely used in the software world to provide assurance about the integrity of
a transferred file. For example, file servers often provide a pre-computed MD5 checksum for the
files, so that a user can compare the checksum of the downloaded file to it.
• In 2004, collisions were found in MD5. An analytical attack was reported to succeed in about
an hour using a computer cluster. This collision attack compromised MD5, and
hence it is no longer recommended for use.

Secure Hash Algorithm (SHA)


The SHA family comprises four algorithms: SHA-0, SHA-1, SHA-2, and SHA-3. Though from the
same family, they are structurally different.
• The original version, SHA-0, a 160-bit hash function, was published by the National Institute
of Standards and Technology (NIST) in 1993. It had a few weaknesses and did not become very
popular. Later, in 1995, SHA-1 was designed to correct the alleged weaknesses of SHA-0.
• SHA-1 was for many years the most widely used of the SHA hash functions. It has been employed in
several widely used applications and protocols, including Secure Socket Layer (SSL) security.
• In 2005, a method was found for uncovering collisions for SHA-1 faster than brute force,
making the long-term employability of SHA-1 doubtful. A practical collision was publicly
demonstrated in 2017.
• The SHA-2 family has four further variants, SHA-224, SHA-256, SHA-384, and SHA-512,
named after the number of bits in their hash value. No successful attacks have yet been reported
on the SHA-2 hash functions.
• Though SHA-2 is a strong hash function and differs significantly from SHA-1, its basic design
still follows that of SHA-1. Hence, NIST called for new competitive hash function designs.
• In October 2012, NIST chose the Keccak algorithm as the new SHA-3 standard. Keccak
offers many benefits, such as efficient performance and good resistance to attacks.

Applications of Hash Functions


There are two direct applications of hash function based on its cryptographic properties.
Password Storage
Hash functions provide protection to password storage.
• Instead of storing passwords in the clear, almost all logon processes store the hash values of
passwords in a file.
• The password file consists of a table of pairs of the form (user id, h(P)).
• The process of logon is depicted in the following illustration −

• An intruder can only see the hashes of passwords, even if he has accessed the password file. He can
neither log on using the hash nor derive the password from the hash value, since the hash function
possesses the property of pre-image resistance.
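The (user id, h(P)) table and the logon check can be sketched as below. The user names and passwords are invented, and plain SHA-256 is used only to mirror h(P) in the text. Deployed systems usually add a per-user salt and a deliberately slow function such as hashlib.pbkdf2_hmac or hashlib.scrypt, which this sketch leaves out.

```python
import hashlib
import hmac

def h(password: str) -> str:
    """Hash a password with SHA-256, standing in for h(P) in the text."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# The password file: a table of (user id, h(P)) pairs. Example data is made up.
password_file = {"alice": h("correct horse"), "bob": h("hunter2")}

def logon(user_id: str, password: str) -> bool:
    """Hash the supplied password and compare it with the stored hash."""
    stored = password_file.get(user_id)
    if stored is None:
        return False
    # Constant-time comparison of the stored and freshly computed hashes.
    return hmac.compare_digest(stored, h(password))

print(logon("alice", "correct horse"), logon("alice", "guess"))
```

The file never contains a password itself, only hash values.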

Data Integrity Check


The data integrity check is one of the most common applications of hash functions. It is used to generate
checksums on data files. This application provides assurance to the user about the correctness of the data.
The process is depicted in the following illustration −
The integrity check helps the user to detect any changes made to the original file. It does not, however,
provide any assurance about originality. The attacker, instead of modifying the file data, can replace the
entire file, compute an altogether new hash and send both to the receiver. This integrity check application
is useful only if the user is sure about the originality of the file.
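A minimal sketch of the checksum comparison, and of its limitation, follows. The file contents are placeholder byte strings and the helper names checksum and is_intact are assumptions. SHA-256 is chosen here instead of MD5 because MD5 is no longer recommended.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Hex checksum of a file's contents."""
    return hashlib.sha256(data).hexdigest()

def is_intact(received: bytes, expected: str) -> bool:
    """Recompute the checksum of the received data and compare."""
    return checksum(received) == expected

original = b"contents of the original file"
published = checksum(original)            # obtained over a trusted channel

# A modified file no longer matches the published checksum.
print(is_intact(original, published))                     # intact copy
print(is_intact(b"contents of a modified file", published))

# Limitation: an attacker who replaces BOTH the file and the checksum
# produces a pair that passes the same test.
forged = b"a completely different file"
forged_checksum = checksum(forged)
print(is_intact(forged, forged_checksum))
```

The check is therefore only as trustworthy as the source of the expected checksum.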

Cryptography Digital Signatures


Digital signatures are the public-key primitives of message authentication. In the physical
world, it is common to use handwritten signatures on handwritten or typed messages. They are
used to bind signatory to the message.
Similarly, a digital signature is a technique that binds a person/entity to the digital data. This
binding can be independently verified by receiver as well as any third party.
A digital signature is a cryptographic value that is calculated from the data and a secret key
known only by the signer.
In the real world, the receiver of a message needs assurance that the message belongs to the sender,
and the sender should not be able to repudiate the origination of that message. This requirement is
crucial in business applications, since the likelihood of a dispute over exchanged data is very high.

Model of Digital Signature


As mentioned earlier, the digital signature scheme is based on public key cryptography. The
model of digital signature scheme is depicted in the following illustration −

The following points explain the entire process in detail −


• Each person adopting this scheme has a public-private key pair.
• Generally, the key pairs used for encryption/decryption and signing/verifying are
different. The private key used for signing is referred to as the signature key and the
public key as the verification key.
• The signer feeds data to the hash function and generates a hash of the data.
• The hash value and signature key are then fed to the signature algorithm, which produces the
digital signature on the given hash. The signature is appended to the data and then both are sent
to the verifier.
• The verifier feeds the digital signature and the verification key into the verification
algorithm. The verification algorithm gives some value as output.
• The verifier also runs the same hash function on the received data to generate a hash value.
• For verification, this hash value and the output of the verification algorithm are compared.
Based on the comparison result, the verifier decides whether the digital signature is valid.
• Since the digital signature is created with the private key of the signer and no one else can have
this key, the signer cannot repudiate signing the data in future.
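The steps above can be sketched with textbook RSA on the hash value. This is a toy illustration under stated assumptions: the key is built from two small known Mersenne primes, there is no padding scheme, and the hash is simply reduced modulo n, so it is not secure and is not how production signature libraries work. The names sign, verify and hash_to_int are hypothetical helpers.

```python
import hashlib

# Toy key pair from two known Mersenne primes (far too small for real use).
p, q = 2**61 - 1, 2**89 - 1
n = p * q
e = 65537                                  # public exponent
d = pow(e, -1, (p - 1) * (q - 1))          # private exponent (Python 3.8+)

signature_key = (n, d)       # private: known only to the signer
verification_key = (n, e)    # public: available to any verifier

def hash_to_int(data: bytes, modulus: int) -> int:
    """Hash the data and map the digest to an integer below the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def sign(data: bytes, key) -> int:
    """Signer: hash the data, then apply the signature key to the hash."""
    modulus, private_exp = key
    return pow(hash_to_int(data, modulus), private_exp, modulus)

def verify(data: bytes, signature: int, key) -> bool:
    """Verifier: compare the verification output with a fresh hash of the data."""
    modulus, public_exp = key
    return pow(signature, public_exp, modulus) == hash_to_int(data, modulus)

sig = sign(b"pay 100 to Bob", signature_key)
print(verify(b"pay 100 to Bob", sig, verification_key))
print(verify(b"pay 900 to Bob", sig, verification_key))
```

Verification needs only the public verification key, so any third party can repeat the check.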

Importance of Digital Signature


Out of all cryptographic primitives, the digital signature using public key cryptography is
considered a very important and useful tool for achieving information security.
Apart from the ability to provide non-repudiation of a message, the digital signature also provides
message authentication and data integrity.
• Message authentication − When the verifier validates the digital signature using the public key
of a sender, he is assured that the signature has been created only by the sender, who possesses the
corresponding secret private key, and by no one else.
• Data Integrity − In case an attacker has access to the data and modifies it, the digital
signature verification at the receiver end fails. The hash of the modified data and the output
provided by the verification algorithm will not match. Hence, the receiver can safely reject the
message, assuming that data integrity has been breached.
• Non-repudiation − Since it is assumed that only the signer has knowledge of the
signature key, only he can create a unique signature on given data. Thus the receiver can
present the data and the digital signature to a third party as evidence if any dispute arises in the
future.
By adding public-key encryption to the digital signature scheme, we can create a cryptosystem that
can provide the four essential elements of security, namely − Privacy, Authentication, Integrity,
and Non-repudiation.

Encryption with Digital Signature


In many digital communications, it is desirable to exchange encrypted messages rather than
plaintext to achieve confidentiality. In a public key encryption scheme, the public (encryption) key
of the receiver is available in the open domain, and hence anyone can spoof a sender's identity and
send an encrypted message to the receiver.
This makes it essential for users employing PKC for encryption to seek digital signatures along
with encrypted data to be assured of message authentication and non-repudiation.
This can be achieved by combining digital signatures with an encryption scheme. Let us briefly
discuss how to achieve this requirement. There are two possibilities: sign-then-encrypt and
encrypt-then-sign.
However, a cryptosystem based on sign-then-encrypt can be exploited by the receiver to spoof the
identity of the sender and send that data to a third party. Hence, this method is not preferred. The
process of encrypt-then-sign is more reliable and widely adopted. This is depicted in the
following illustration −

The receiver after receiving the encrypted data and signature on it, first verifies the signature
using sender’s public key. After ensuring the validity of the signature, he then retrieves the data
through decryption using his private key.

ELECTRONIC MAIL SECURITY: PRETTY GOOD PRIVACY (PGP)


PGP provides the confidentiality and authentication service that can be used for electronic mail
and file storage applications.
PGP has grown explosively and is now widely used. A number of reasons can be cited for
this growth.
• It is available free worldwide in versions that run on a variety of platforms.
• It is based on algorithms that have survived extensive public review and are considered
extremely secure: for example, RSA, DSS and Diffie-Hellman for public key encryption; CAST-128,
IDEA and 3DES for conventional encryption; and SHA-1 for hash coding.
• It has a wide range of applicability.
• It was not developed by, nor is it controlled by, any governmental or standards organization.

Operational description
The actual operation of PGP consists of five services: authentication, confidentiality,
compression, e-mail compatibility and segmentation.
1. Authentication
The sequence for authentication is as follows:
• The sender creates the message.
• SHA-1 is used to generate a 160-bit hash code of the message.
• The hash code is encrypted with RSA using the sender’s private key and the result is
prepended to the message.
• The receiver uses RSA with the sender’s public key to decrypt and recover the hash code.
• The receiver generates a new hash code for the message and compares it with the decrypted
hash code. If the two match, the message is accepted as authentic.

2. Confidentiality
Confidentiality is provided by encrypting messages to be transmitted or to be stored locally as
files. In both cases, the conventional encryption algorithm CAST-128 may be used. The 64-bit
cipher feedback (CFB) mode is used.
In PGP, each conventional key is used only once. That is, a new key is generated as a random
128-bit number for each message. Thus, although this is referred to as a session key, it is in
reality a one-time key. To protect the key, it is encrypted with the receiver’s public key.
The sequence for confidentiality is as follows:
• The sender generates a message and a random 128-bit number to be used as a session key for
this message only.
• The message is encrypted using CAST-128 with the session key.
• The session key is encrypted with RSA, using the receiver’s public key, and is prepended to
the message.
• The receiver uses RSA with its private key to decrypt and recover the session key.
• The session key is used to decrypt the message.
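The confidentiality sequence can be sketched with standard-library Python only. Two loud assumptions apply. First, the symmetric cipher is a stand-in: a SHA-256 counter keystream XORed with the data, not CAST-128, which the standard library does not provide. Second, the receiver's RSA key is a toy key built from two known Mersenne primes with no padding. The function names are hypothetical, and nothing here is secure enough for real use.

```python
import hashlib
import secrets

# Toy RSA key pair for the receiver (insecure, illustration only).
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))          # Python 3.8+

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher (NOT CAST-128): XOR with a hash-based keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], pad))
    return bytes(out)

def pgp_like_encrypt(message: bytes, receiver_public):
    modulus, public_exp = receiver_public
    session_key = secrets.token_bytes(16)              # random 128-bit one-time key
    ciphertext = keystream_xor(session_key, message)   # encrypt message with session key
    wrapped = pow(int.from_bytes(session_key, "big"), public_exp, modulus)
    return wrapped, ciphertext                         # wrapped key travels with the message

def pgp_like_decrypt(wrapped: int, ciphertext: bytes, receiver_private) -> bytes:
    modulus, private_exp = receiver_private
    session_key = pow(wrapped, private_exp, modulus).to_bytes(16, "big")
    return keystream_xor(session_key, ciphertext)      # XOR again to decrypt

wrapped_key, ct = pgp_like_encrypt(b"meet at noon", (n, e))
print(pgp_like_decrypt(wrapped_key, ct, (n, d)))
```

Only the short session key goes through the slow public-key operation. The message itself is handled by the faster symmetric cipher.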

Confidentiality and authentication


Here both services may be used for the same message.
First, a signature is generated for the plaintext message and prepended to the message. Then the
plaintext plus the signature is encrypted using CAST-128 and the session key is encrypted using
RSA.

3. Compression
As a default, PGP compresses the message after applying the signature but before encryption.
This has the benefit of saving space for both e-mail transmission and for file storage.
The signature is generated before compression for two reasons:
• It is preferable to sign an uncompressed message so that one can store only the uncompressed
message together with the signature for future verification. If one signed a compressed
document, then it would be necessary either to store a compressed version of the message for
later verification or to recompress the message when verification is required.
• Even if one were willing to generate dynamically a recompressed message for verification,
PGP’s compression algorithm presents a difficulty. The algorithm is not deterministic;
various implementations of the algorithm achieve different tradeoffs in running speed versus
compression ratio and, as a result, produce different compressed forms.
Message encryption is applied after compression to strengthen cryptographic security. Because
the compressed message has less redundancy than the original plaintext, cryptanalysis is more
difficult.
The compression algorithm used is ZIP.
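The second reason can be illustrated with Python's zlib module, which implements DEFLATE, the compression method commonly used in ZIP. In this sketch, two zlib compression levels stand in for two different implementations. That substitution, the sample message and the variable names are assumptions.

```python
import hashlib
import zlib

message = b"attack at dawn. " * 64

# 1. The hash that gets signed is computed over the UNCOMPRESSED message.
digest = hashlib.sha1(message).digest()

# 2. Compression follows. Different speed/ratio tradeoffs give different bytes.
fast = zlib.compress(message, 1)
small = zlib.compress(message, 9)
print(len(message), len(fast), len(small), fast == small)

# 3. Both forms decompress to the same message, so a hash taken before
#    compression verifies no matter which compressed form was transmitted.
for compressed in (fast, small):
    recovered = zlib.decompress(compressed)
    print(hashlib.sha1(recovered).digest() == digest)
```

Had the hash been taken over the compressed bytes, the two forms would have produced two different hash codes for the same message.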

4. e-mail compatibility
Many electronic mail systems only permit the use of blocks consisting of ASCII text. To
accommodate this restriction, PGP provides the service of converting the raw 8-bit binary stream
to a stream of printable ASCII characters. The scheme used for this purpose is radix-64
conversion. Each group of three octets of binary data is mapped into four ASCII characters. For
example, consider the 24-bit (3-octet) raw sequence 00100011 01011100 10010001. We can express
this input in blocks of 6 bits to produce 4 ASCII characters.
001000 110101 110010 010001   (decimal values 8, 53, 50, 17)
I      1      y      R        => corresponding ASCII characters
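The worked example can be reproduced in a few lines. The sketch regroups the three octets into 6-bit values by hand, looks them up in the standard radix-64 (Base64) alphabet, and cross-checks the result against Python's base64 module. The variable names are chosen for this illustration.

```python
import base64

raw = bytes([0b00100011, 0b01011100, 0b10010001])    # the 3 octets from the example

bits = "".join(f"{octet:08b}" for octet in raw)       # 24 bits as one string
groups = [bits[i:i + 6] for i in range(0, 24, 6)]     # four 6-bit groups
values = [int(g, 2) for g in groups]                  # each group as a number 0..63

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
encoded = "".join(ALPHABET[v] for v in values)

print(groups, values, encoded)
print(base64.b64encode(raw).decode("ascii"))          # library result for comparison
```

Every 3 octets become 4 characters, so radix-64 conversion expands the message by about one third.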

5. Segmentation and reassembly


E-mail facilities often are restricted to a maximum length. E.g., many of the facilities accessible
through the internet impose a maximum length of 50,000 octets. Any message longer than that
must be broken up into smaller segments, each of which is mailed separately.
To accommodate this restriction, PGP automatically subdivides a message that is too large into
segments that are small enough to send via e-mail. The segmentation is done after all the other
processing, including the radix-64 conversion. At the receiving end, PGP must strip off all e-mail
headers and reassemble the entire original block before performing the other steps.
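Segmentation and reassembly amount to slicing and concatenation, as the sketch below shows. The 50,000-octet limit is the figure quoted above. The function names and the test message are assumptions, and the stripping of e-mail headers at the receiving end is not modelled.

```python
MAX_SEGMENT = 50_000   # maximum segment length in octets, as quoted in the text

def segment(data: bytes, limit: int = MAX_SEGMENT) -> list:
    """Split already-processed (radix-64 converted) data into mailable segments."""
    return [data[i:i + limit] for i in range(0, len(data), limit)]

def reassemble(segments) -> bytes:
    """Receiver side: join the segments, in order, back into the original block."""
    return b"".join(segments)

processed = b"A" * 120_001            # stand-in for a long radix-64 message
parts = segment(processed)
print(len(parts), [len(part) for part in parts])
print(reassemble(parts) == processed)
```

Since segmentation happens last at the sender, reassembly must happen first at the receiver, before any other processing step.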

PGP Operation Summary:

A message consists of three components.

• Message component – includes the actual data to be transmitted, as well as the filename and a
timestamp that specifies the time of creation.
• Signature component – includes the following:
o Timestamp – time at which the signature was made.
o Message digest – hash code.
o Leading two octets of message digest – to enable the recipient to determine if the correct public
key was used to decrypt the message digest.
o Key ID of sender’s public key – identifies the public key.
• Session key component – includes the session key and the identifier of the recipient’s public key.

PGP – Issues
There were questions of legality, but PGP may now be legally used by anyone in the world:
• noncommercial use in US/Canada with the licenced MIT version
• commercial use in US/Canada with the Viacrypt version
• noncommercial use outside the US is probably legal with the (non-US sourced) international
version
• commercial use outside the US requires an IDEA licence for the international version

Security in Practice - SNMP


• SNMP is a widely used network management protocol.
• It comprises:
o management station
o management agent with its management information base (MIB)
o linked by the network management protocol (GET, SET)
• SNMP v1 lacks any security (GET and SET, where available, are open to anyone).
• SNMP v2 includes security extensions for:
o message authentication (keyed MD5)
o message secrecy (DES)
• Security is based on the SNMPv2 party (sender and receiver roles):
o used for access control and key management
o all associated information stored in a party MIB
• It assumes synchronised clocks (within a set interval).

User Authentication
• User authentication (identity verification):
o convince the system of your identity
o before it can act on your behalf
• Sometimes the computer is also required to verify its identity to the user.
• User authentication is based on three methods:
o what you know
o what you have
o what you are
All of them involve some validation of the information supplied against a table of possible values
based on the user's claimed identity.

S/MIME
S/MIME (Secure/Multipurpose Internet Mail Extension) is a security enhancement to the MIME
Internet e-mail format standard, based on technology from RSA Data Security. S/MIME is
defined in a number of documents, most importantly RFCs 3369, 3370, 3850 and 3851.
