UNIT - 2 Notes
HASH FUNCTIONS
• Hash functions are extremely useful and appear in almost all information
security applications.
• A hash function is a mathematical function that converts a numerical input
value into another compressed numerical value.
• The input to the hash function is of arbitrary length but output is always of
fixed length.
• Values returned by a hash function are called message digests or simply hash
values.
FEATURES OF HASH FUNCTIONS
• Fixed Length Output (Hash Value)
• A hash function converts data of arbitrary length to a fixed length. This
process is often referred to as hashing the data.
• The hash is generally much smaller than the input data, hence hash functions are
sometimes called compression functions.
• Efficiency of Operation
• For any hash function h with input x, computation of h(x) is a fast
operation.
• Computationally, hash functions are much faster than symmetric
encryption.
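The fixed-length property can be checked with a short sketch. SHA-256 from Python's standard hashlib module is used here only as a convenient example of a hash function; the notes do not name a specific one, and the helper name digest_hex is made up for illustration.

```python
import hashlib


def digest_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_input = b"a"
long_input = b"a" * 1_000_000  # one million bytes

# Both digests have 64 hex characters (256 bits), whatever the input length.
print(len(digest_hex(short_input)), len(digest_hex(long_input)))
```

Changing even one byte of the input gives a different digest, which is why the digest can stand in for the data in integrity checks.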
1. Division Method:
This is the simplest method to generate a hash value. The hash
function divides the key k by M and uses the remainder obtained.
Formula:
h(K) = k mod M
Here,
k is the key value, and
M is the size of the hash table.
M is best chosen as a prime number, since that tends to distribute the keys more
uniformly. The hash value depends only on the remainder of the
division.
Example:
k = 12345
M = 95
h(12345) = 12345 mod 95 = 90
k = 1276
M = 11
h(1276) = 1276 mod 11 = 0
Pros:
1. This method works for any value of M, although some values distribute keys better than others.
2. The division method is very fast since it requires only a single division
operation.
Cons:
1. This method leads to poor performance since consecutive keys map to
consecutive hash values in the hash table.
2. Sometimes extra care should be taken to choose the value of M.
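The division method can be sketched in a few lines. The function name division_hash is made up for illustration; the values are the ones from the examples above.

```python
def division_hash(k: int, m: int) -> int:
    """Division-method hash: the remainder of key k divided by table size m."""
    if m <= 0:
        raise ValueError("table size m must be positive")
    return k % m


print(division_hash(12345, 95))  # 90
print(division_hash(1276, 11))   # 0

# Consecutive keys land in consecutive slots, the weakness noted under Cons.
print([division_hash(k, 11) for k in range(100, 105)])  # [1, 2, 3, 4, 5]
```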
2. Mid Square Method:
The mid-square method is a very good hashing method. It involves two steps to
compute the hash value:
1. Square the value of the key k, i.e. compute k^2.
2. Extract the middle r digits as the hash value.
Formula:
h(K) = middle r digits of (k x k)
Here,
k is the key value, and
r is the number of digits kept, chosen to fit the size of the hash table.
Pros:
1. The performance of this method is good as most or all digits of the key value
contribute to the result. This is because all digits in the key contribute to
generating the middle digits of the squared result.
2. The result is not dominated by the distribution of the top digit or bottom digit
of the original key value.
Cons:
1. The size of the key is a limitation of this method: if the key is large,
its square has about twice as many digits.
2. Another disadvantage is that collisions can still occur, although they can be
reduced.
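A minimal sketch of the mid-square method follows. The notes do not say how to pick the "middle" when it cannot be centered exactly, so the window is shifted one position to the left in that case; this tie-break and the name mid_square_hash are assumptions made for illustration.

```python
def mid_square_hash(k: int, r: int) -> int:
    """Mid-square hash: square k and keep the middle r digits of the result."""
    squared = str(k * k)
    if len(squared) <= r:
        return int(squared)           # fewer than r digits: keep them all
    start = (len(squared) - r) // 2   # left-biased middle (assumption)
    return int(squared[start:start + r])


print(mid_square_hash(60, 2))      # 60 * 60 = 3600 -> middle digits "60"
print(mid_square_hash(12345, 3))   # 12345^2 = 152399025 -> middle digits "399"
```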
3. Digit Folding Method:
This method involves two steps:
1. Divide the key value k into a number of parts, i.e. k1, k2, k3, …, kn, where
each part has the same number of digits except for the last part, which can have
fewer digits than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry
if any.
Formula:
k = k1, k2, k3, k4, ….., kn
s = k1+ k2 + k3 + k4 +….+ kn
h(K)= s
Here,
s is obtained by adding the parts of the key k
Example:
k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
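The folding steps can be sketched as below. "Ignoring the last carry" is interpreted here as keeping only the lowest part_size digits of the sum; that reading, and the name folding_hash, are assumptions.

```python
def folding_hash(k: int, part_size: int = 2) -> int:
    """Digit-folding hash: split k into digit groups, add them, drop the carry."""
    digits = str(k)
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    total = sum(parts)
    # Keep only part_size digits, i.e. ignore any carry out of the sum.
    return total % (10 ** part_size)


print(folding_hash(12345))    # 12 + 34 + 5 = 51
print(folding_hash(987654))   # 98 + 76 + 54 = 228 -> carry dropped -> 28
```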
4. Multiplication Method
This method involves the following steps:
1. Choose a constant value A such that 0 < A < 1.
2. Multiply the key value with A.
3. Extract the fractional part of kA.
4. Multiply the result of the above step by the size of the hash table i.e. M.
5. The resulting hash value is obtained by taking the floor of the result obtained
in step 4.
Formula:
h(K) = floor (M (kA mod 1))
Here,
M is the size of the hash table.
k is the key value.
A is a constant value.
Example:
k = 12345 A = 0.357840 M = 100
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ] = 53
Pros:
The advantage of the multiplication method is that it can work with any value of A
between 0 and 1, although some values tend to give better results
than others.
Cons:
The multiplication method is generally most suitable when the table size is a power
of two, since computing the index from the key using
multiplication hashing is then very fast.
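The five steps of the multiplication method can be sketched as follows. The default constant A is simply the one used in the worked example above, and the function name is made up for illustration. Floating-point arithmetic is used, so very large keys may lose precision.

```python
import math


def multiplication_hash(k: int, m: int, a: float = 0.357840) -> int:
    """Multiplication-method hash: floor(m * fractional part of k * a)."""
    if not 0 < a < 1:
        raise ValueError("a must satisfy 0 < a < 1")
    fractional = (k * a) % 1.0       # steps 2 and 3: fractional part of k * A
    return math.floor(m * fractional)  # steps 4 and 5: scale by M, take floor


print(multiplication_hash(12345, 100))  # 53, matching the worked example
```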
HASHING PATTERNS
There are five different patterns of hashing techniques:
Independent hashing
2. Repeated Hashing: A hash function transforms input data into a hash
value; this hash value is then given as input again to produce another
hash value.
Repeated Hashing
3. Combined Hashing: It produces a single hash value for more
than one chunk of data: the chunks are first combined and the result is hashed once.
This technique is used when the data size is very small, because combining the chunks
itself costs computing power before the hash value is generated. As you may have noticed,
it is similar to repeated hashing (during the first hashing).
Combined hashing
4. Sequential hashing: Sequential hashing updates a hash value as
soon as new data appears, using combined and repeated hashing together.
The existing hash value is merged with the newly arrived input data and then
hashed to get the updated hash value. This hashing pattern is valuable when you
require a single hash value at any time and wish to trace its development back to the
arrival of new data.
Sequential hashing
5. Hierarchical Hashing: Hierarchical hashing uses combined hashing to create
pairs of hash values which enable the creation of hierarchy. The goal of the
pattern of hierarchical hashing is to create a single hash value for a multitude of
data chunks in a similar way as combined hashing. Compared to combined
hashing, hierarchical hashing has an efficiency advantage as the combined data is
formed by hash values which are fixed in size, hence, reducing the required
computational power and required time.
Hierarchical Hashing
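The patterns named above can be sketched with one small function each. SHA-256 and all function names are choices made for illustration, and the handling of an unpaired value in hierarchical hashing (it is carried up unchanged) is an assumption, since the notes do not cover that case.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """The underlying hash function (SHA-256, chosen for illustration)."""
    return hashlib.sha256(data).digest()


def independent(chunks: List[bytes]) -> List[bytes]:
    """Independent hashing: every chunk gets its own hash value."""
    return [h(chunk) for chunk in chunks]


def repeated(data: bytes, rounds: int = 2) -> bytes:
    """Repeated hashing: feed each hash value back into the hash function."""
    value = data
    for _ in range(rounds):
        value = h(value)
    return value


def combined(chunks: List[bytes]) -> bytes:
    """Combined hashing: join the chunks first, then hash once."""
    return h(b"".join(chunks))


def sequential(chunks: List[bytes]) -> bytes:
    """Sequential hashing: merge the running hash with each new chunk."""
    running = b""
    for chunk in chunks:
        running = h(running + chunk)
    return running


def hierarchical(chunks: List[bytes]) -> bytes:
    """Hierarchical hashing: hash pairs of hash values until one remains."""
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = independent(chunks)
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            # An unpaired value is carried up unchanged (assumption).
            next_level.append(h(pair[0] + pair[1]) if len(pair) == 2 else pair[0])
        level = next_level
    return level[0]


print(hierarchical([b"a", b"b", b"c", b"d"]).hex())
```

In the hierarchical case every combination step hashes two fixed-size values, which is the efficiency advantage described above.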
DIGITAL SIGNATURE:
Digital Signature in Cryptography is a value calculated from the data
along with a secret key that only the signer is aware of. The receiver needs to be
assured that the message belongs to the sender. This is crucial in businesses as the
chances of disputes over data exchange are high.
2. Signing Algorithms
Signing Algorithms make one-way hashes of the data that has to be
signed. Then they encrypt the hash value using the signature key. The encrypted
hash along with the other information is the Digital Signature.
1. Message Authentication
The private key is only known to the sender. The verifier can use the public
key of the sender to validate that the Digital Signature was created by the
sender.
2. Data Integrity
If the data is tampered with at any point, the hash of the modified data
will no longer match the output of the verification algorithm. Due to this, the
receiver will reject the message and treat it as a breach of data integrity.
3. Non-repudiation
The signer is the only one who is aware of the signature key so, naturally,
they are the only ones who can create a specific signature. Whenever there is
a dispute, the data along with the Digital Signature can be presented as
evidence.
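The hash-then-sign idea can be illustrated with a deliberately tiny textbook RSA key. This is only a sketch: the key (p = 61, q = 53) is far too small to be secure, real schemes add padding and use keys of thousands of bits, and the function names are made up for illustration. The digest is reduced modulo N only so that it fits the toy key.

```python
import hashlib

# Textbook RSA toy key. Never use numbers this small in practice.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent, used by the verifier
D = 2753      # private exponent (the signature key); E * D = 1 mod 3120


def digest_as_int(message: bytes) -> int:
    """Hash the message and squeeze the digest into the toy key's range."""
    full = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return full % N


def sign(message: bytes) -> int:
    """Signer: transform the hash with the private key."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verifier: undo the transform with the public key, compare with the hash."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"Pay 10 units to Bob"
signature = sign(message)
print(verify(message, signature))            # True: signature matches
print(verify(message, (signature + 1) % N))  # False: altered signature
```

Only the holder of D can produce a value that passes verify, which is the basis for the authentication and non-repudiation properties listed above.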
| ECDSA (Elliptic Curve Digital Signature Algorithm) | RSA (Rivest-Shamir-Adleman) |
|---|---|
| ECDSA is a relatively newer algorithm. | RSA is a relatively older algorithm. It has been extensively studied and used for decades. |
| ECDSA relies on the mathematics of elliptic curves over finite fields. | RSA is based on the computational difficulty of factoring the product of two large prime numbers. |
| Requires a smaller key size for the same level of security. | Requires a larger key size for the same level of security. |
| ECDSA is computationally efficient and requires less processing power. | RSA is more computationally intensive compared to ECDSA. |
| The generation and storage of ECDSA keys are generally faster and require less storage space. | RSA key generation is more time-consuming and resource-intensive. |
MEMORY-HARD ALGORITHM
• In cryptography, a memory-hard algorithm is designed to make
computational tasks memory-bound, meaning that they require a significant
amount of memory to compute.
• The goal of memory-hard algorithms is to reduce the efficiency
of attacks that use specialized hardware.
• Memory-hard algorithms are commonly used in password hashing functions,
where the goal is to protect user passwords in case of a database breach.
• These algorithms make it computationally expensive to compute the hash
function, thus increasing the time and resources required to crack passwords.
• One of the most well-known memory-hard algorithms is Argon2, which was
the winner of the Password Hashing Competition in 2015.
APPLICATION OF MEMORY-HARD ALGORITHMS
1. Password Hashing:
Memory-hard algorithms are commonly used for password hashing, where
the goal is to protect user passwords in case of a data breach.
By making the computation memory-bound, these algorithms increase the
time and resources required to crack passwords. They help mitigate the impact of
password leaks by slowing down attackers and making it more difficult to recover
the original passwords.
4. Hardware Resistance:
Memory-hard algorithms are designed to resist attacks from specialized hardware.
By relying on memory access patterns and requiring a large amount of memory,
these algorithms make it harder to achieve significant performance gains through
specialized hardware optimizations.
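A password-hashing sketch with a memory-hard function is shown below. It uses scrypt rather than Argon2, because scrypt ships with Python's standard hashlib (when Python is built against a suitable OpenSSL) while Argon2 needs a third-party package. The cost parameters and function names are illustrative assumptions, not recommendations.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative cost parameters: memory use is roughly 128 * N_COST * BLOCK_SIZE
# bytes, i.e. about 16 MiB here.
N_COST = 2 ** 14
BLOCK_SIZE = 8
PARALLELISM = 1


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for the password using scrypt."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=N_COST,
        r=BLOCK_SIZE,
        p=PARALLELISM,
        maxmem=64 * 1024 * 1024,
        dklen=32,
    )
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)


salt, stored_key = hash_password("correct horse")
print(verify_password("correct horse", salt, stored_key))
print(verify_password("wrong guess", salt, stored_key))
```

Each guess by an attacker has to pay the same memory cost, which is what limits the gain from specialized hardware.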
ZERO-KNOWLEDGE PROOF
A zero-knowledge proof is a way of proving the validity of a statement
without revealing the statement itself. The ‘prover’ is the party trying to prove a
claim, while the ‘verifier’ is responsible for validating the claim.
A zero-knowledge protocol is a method by which one party (the prover)
can prove to another party (the verifier) that something is true, without revealing
any information apart from the fact that this specific statement is true.
ELEMENTS OF ZERO-KNOWLEDGE
In basic form, a zero-knowledge proof is made up of three
elements: witness, challenge, and response.
Witness: With a zero-knowledge proof, the prover wants to prove knowledge of
some hidden information. The secret information is the “witness” to the proof, and
the prover's assumed knowledge of the witness establishes a set of questions that
can only be answered by a party with knowledge of the information. Thus, the
prover starts the proving process by randomly choosing a question, calculating the
answer, and sending it to the verifier.
Challenge: The verifier randomly picks another question from the set and asks the
prover to answer it.
Response: The prover accepts the question, calculates the answer, and returns it to
the verifier. The prover’s response allows the verifier to check if the former really
has access to the witness. To ensure the prover isn’t guessing blindly and getting
the correct answers by chance, the verifier picks more questions to ask. By
repeating this interaction many times, the possibility of the prover faking
knowledge of the witness drops significantly until the verifier is satisfied.
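The witness, challenge, and response steps can be sketched with a Schnorr-style identification exchange. This is one classical interactive protocol, used here as an illustration and not necessarily the construction these notes have in mind. The modulus, base, and function names are toy assumptions and are not secure parameters.

```python
import secrets

P = 2 ** 127 - 1   # a Mersenne prime used as a toy modulus (assumption)
G = 3              # base (assumption)
ORDER = P - 1      # exponents can be reduced modulo P - 1


def keygen():
    """Prover's secret witness x and the public value y = G^x mod P."""
    x = secrets.randbelow(ORDER - 1) + 1
    return x, pow(G, x, P)


def commit():
    """Prover picks a random r and sends the commitment t = G^r mod P."""
    r = secrets.randbelow(ORDER - 1) + 1
    return r, pow(G, r, P)


def challenge() -> int:
    """Verifier picks a random challenge c."""
    return secrets.randbelow(2 ** 64)


def respond(x: int, r: int, c: int) -> int:
    """Prover answers with s = r + c * x, which needs the witness x."""
    return (r + c * x) % ORDER


def check(y: int, t: int, c: int, s: int) -> bool:
    """Verifier accepts if G^s equals t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


def run_rounds(x: int, y: int, rounds: int = 10) -> bool:
    """Repeat commit, challenge, response; accept only if every round passes."""
    for _ in range(rounds):
        r, t = commit()
        c = challenge()
        if not check(y, t, c, respond(x, r, c)):
            return False
    return True


x, y = keygen()
print(run_rounds(x, y))
```

The verifier only ever sees t, c, and s; the random r masks x in the response, so the witness itself is not sent.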
Example:
In this example, you and a competitor discover that you are buying the same
materials from the same supplier. You want to find out if you are paying the same
price per kilogram. However, there isn’t enough trust between the two of you to
divulge the prices you are each paying, and you are also contractually bound to not
share this information.
Assuming the market rate for the materials can only be 100, 200, 300 or 400 per
kilogram, we can set up a zero-knowledge proof for this situation. Let’s follow
these steps to explain the idea:
1. You and a competitor want to know if you are paying the same price without
revealing how much each of you is paying.
2. We obtain 4 lockable lockboxes, each with a small slot that can take only a piece
of paper. They are labelled 100, 200, 300, and 400 for the price per kilogram, and
placed in a secure, private room.
3. You go into the room alone first. Since you are paying 200 per kilogram, you
take the key from the lockbox that is labelled 200 and destroy the keys for the
other boxes. You leave the room.
4. Your competitor goes into the room alone with 4 pieces of paper, 1 with a check,
and 3 with crosses. Because your competitor is paying 300 per kilogram, they slide
the paper with a check inside the lockbox that is labelled 300, and slide the papers
with crosses into the other lockboxes. They leave the room.
5. After they leave, you can return with your key that can only open the lockbox
labelled 200. You find a piece of paper with a cross on it, so now you know that
your competitor is not paying the same amount as you.
6. Your competitor returns and sees that you have a piece of paper with a cross on
it, so now they also know that you are not paying the same amount as them.
If you get a piece of paper with a check on it, both of you would know that you are
paying the same amount. Since you got the paper with a cross on it, both of you
know that you are not paying the same amount, but also without knowing how
much the other is paying.
Both of you leave knowing only that you are not paying the same amount, but
neither of you has gained knowledge of what the other is paying.
This is another analogy of an interactive zero-knowledge proof with a primitive
semi-range proof. It is important to note that all of the examples have limitations
and have to take on certain assumptions, but they adequately illustrate the ways
they could work.
BLOCKCHAIN :
A blockchain is “a distributed database that maintains a continuously
growing list of ordered records, called blocks.” These blocks are “linked using
cryptography. Each block contains a cryptographic hash of the previous block, a
timestamp, and transaction data.”
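The block structure in this definition can be sketched as a small data class. The field names, the JSON serialization, and the all-zero hash for the first block are assumptions made for illustration; real blockchains define their own block formats.

```python
import hashlib
import json
import time
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    index: int
    timestamp: float
    transactions: List[str]
    previous_hash: str  # cryptographic hash of the previous block

    def hash(self) -> str:
        """SHA-256 over a JSON serialization of all fields of the block."""
        payload = json.dumps(
            [self.index, self.timestamp, self.transactions, self.previous_hash]
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_block(chain: List[Block], transactions: List[str]) -> Block:
    """Create a block that points at the current last block and append it."""
    previous_hash = chain[-1].hash() if chain else "0" * 64
    block = Block(len(chain), time.time(), transactions, previous_hash)
    chain.append(block)
    return block


chain: List[Block] = []
append_block(chain, ["alice -> bob: 5"])
append_block(chain, ["bob -> carol: 2"])
print(chain[1].previous_hash == chain[0].hash())  # True: the blocks are linked
```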
1. Public Blockchains
Public blockchains are open, decentralized networks of computers accessible to
anyone wanting to request or validate a transaction (check for accuracy). Those
(miners) who validate transactions receive rewards.
Public blockchains use proof-of-work or proof-of-stake consensus mechanisms
(discussed later).
Examples: Bitcoin and Ethereum (ETH) blockchains.
2. Private Blockchains
Private blockchains are not open; they have access restrictions. People who want to
join require permission from the system administrator. They are typically governed
by one entity, meaning they’re centralized.
Example: Hyperledger is a private, permissioned blockchain.
4. Sidechains
A sidechain is a blockchain running parallel to the main chain. It allows users to
move digital assets between two different blockchains and improves scalability and
efficiency.
Example : Liquid Network.
ADVANTAGE OF BLOCKCHAIN OVER CONVENTIONAL
DISTRIBUTED DATABASE :
Blockchain offers several advantages over conventional distributed databases. Here
are some key advantages:
1. Decentralization and Trust: Conventional distributed databases typically
rely on a central authority or trusted third party to validate and maintain the
database. In contrast, blockchain operates in a decentralized manner, where
multiple participants or nodes collectively validate and agree on the
transactions. This decentralized approach eliminates the need for a central
authority, reduces the risk of single points of failure, and enhances trust
among participants.
2. Immutable and Tamper-Resistant: Blockchain provides immutability and
tamper-resistance of data. Once a transaction is recorded on the blockchain
and confirmed by the network, it becomes nearly impossible to alter or
delete that transaction. The use of cryptographic hash functions and
consensus mechanisms ensures the integrity of the data stored in the
blockchain.
3. Transparency and Auditability: Blockchain offers transparency by
providing all participants with access to the same set of data. Every
transaction recorded on the blockchain is visible to the participants, enabling
them to independently verify and audit the data. This transparency enhances
accountability, reduces fraud, and fosters trust among participants.
4. Security: Blockchain employs advanced cryptographic techniques to secure
transactions and data. Each transaction is digitally signed and verified,
ensuring the authenticity and integrity of the data. The distributed nature of
blockchain makes it resilient against attacks and makes it difficult for
malicious actors to manipulate the data.
5. Disintermediation and Cost Efficiency: Blockchain eliminates the need for
intermediaries or trusted third parties in transactions. By enabling peer-to-
peer transactions and direct interactions, blockchain reduces costs associated
with intermediaries, such as financial institutions or clearinghouses. This
disintermediation can lead to increased efficiency and cost savings in
various industries.
6. Smart Contracts: Blockchain platforms often support smart contracts,
which are self-executing contracts with the terms of the agreement directly
written into code. Smart contracts automatically execute actions when
predefined conditions are met. This automation eliminates the need for
intermediaries, streamlines processes, and reduces the risk of human error.
7. Data Consistency and Synchronization: In conventional distributed
databases, achieving data consistency and synchronization across multiple
nodes can be challenging. Blockchain provides a shared, distributed ledger
where all participants have access to the same data, ensuring consistency
across the network. The consensus mechanisms employed by blockchain
platforms help maintain the integrity and synchronization of the data.
8. Resilience and Fault Tolerance: Blockchain is designed to be resilient and
fault-tolerant. The distributed nature of the blockchain ensures that even if
some nodes fail or are compromised, the network can continue to operate
and maintain the integrity of the data. This resilience makes blockchain
suitable for applications where data availability and continuity are crucial.
BLOCKCHAIN NETWORK :
A blockchain network is a technical network that
provides ledger and smart contract (chaincode) services to applications.
Primarily, these smart contracts are used to generate transactions, which are
subsequently distributed to every peer node in the network, where they are
immutably recorded on each peer's copy of the ledger. The users of applications may be
end users using client applications.
Blockchain networks are driven by aligned system incentives. A
properly functioning blockchain requires a community of users, node operators,
developers, and miners who work in a mutually beneficial network.
Bitcoin is the largest cryptocurrency by market
capitalization and the best-known use of blockchain technology.
PERMISSIONED NETWORKS :
Permissioned networks are blockchain networks where only
pre-authorized users or organizations can perform write transactions.
They are faster and less expensive, can comply with regulations, and can easily
be maintained.
Pre-verification of the participating parties is mandatory for a permissioned
blockchain and, hence, the transacting parties are known to each other.
Permissioned P2P networks have to guarantee uptime and require a high
level of quality of service on communication links.
The following table highlights the similarities and differences between different
types of blockchain from the permissions perspective:
| Public and Permissionless | Public and Permissioned | Private and Permissionless | Private and Permissioned |
|---|---|---|---|
| Write all and read all. | Write all and read restricted. | Write restricted and read all. | Write restricted and read restricted. |
| Everyone can join, transact, read, and audit. | Everyone can join and transact, but only permissioned users can read and audit. | Everyone can join, nobody can transact, and everyone can read and audit. | Nobody can join, transact, read, and audit. |
| Anyone can download the protocol and participate in validating transactions. | Anyone who meets the predefined criteria can download the protocol and participate in validating transactions. | Anyone in the network can participate and validate transactions. However, this is only within the enterprise. | Only consortium members can validate the transaction. |
The following table highlights the similarities and differences between different
types of blockchain from a transaction and anonymity perspective:
| Public and Permissionless | Public and Permissioned | Private and Permissionless | Private and Permissioned |
|---|---|---|---|
| Everyone will participate in transaction validation, and the validators are not the chosen ones. | Nobody can participate in transaction validation, and the validators are the chosen ones. | Nobody can participate in transaction validation, and the validators are the chosen ones. | Nobody can participate in transaction validation, and the validators are the chosen ones. |
| Truly democratic: full equity. | Full write equity. | Full read equity. | Restricted. |
| Transaction approval is long. It usually takes minutes. | Transaction approval is long. It usually takes minutes. | Transaction approval is short. | Transaction approval is short. |
MINING MECHANISM:
Blockchain mining is used to secure and verify bitcoin transactions.
Mining involves Blockchain miners who add bitcoin transaction data to Bitcoin’s
global public ledger of past transactions. In the ledgers, blocks are secured by
Blockchain miners and are connected to each other forming a chain.
Types of Mining :
The process of mining can get really complex and a regular desktop
or PC cannot cut it. Hence, it requires a unique set of hardware and software that
works well for the user. It helps to have a custom set specific to mining certain
blocks.
2. Pool Mining
In pool mining, a group of users works together to approve the
transaction. Sometimes the difficulty of the puzzle attached to a block makes
it impractical for a single user to solve it alone. So, a group of miners
works as a team to solve it. After the validation of the result, the reward is then
split between all users.
3. Cloud Mining
Cloud mining eliminates the need for computer hardware and
software. It’s a hassle-free method to extract blocks. With cloud mining, handling
all the machinery, order timings, or selling profits is no longer a constant worry.
While it is hassle-free, it has its own set of disadvantages. The
operational functionality is limited with the limitations on bitcoin hashing in
blockchain. The operational expenses increase as the reward profits are low.
Software upgrades are restricted and so is the verification process.
DISTRIBUTED CONSENSUS :
Distributed consensus refers to the process by which participants in a
decentralized network agree on a shared state or order of events without relying on
a central authority. It ensures that all nodes in the network reach a consensus on the
validity and order of transactions or data.
In a distributed consensus protocol, the goal is to achieve agreement
among a set of participants, even in the presence of faults or malicious actors. This
consensus is crucial for maintaining the integrity and trustworthiness of a
distributed system, such as a blockchain.
There are several well-known distributed consensus protocols, including:
1. Proof-of-Work (PoW): PoW is a consensus algorithm used by Bitcoin and
some other cryptocurrencies. Miners compete to solve computationally
intensive puzzles, and the first one to find a valid solution broadcasts it to
the network. Consensus is reached when the majority of participants agree
on the validity of the solution. PoW ensures that the majority of participants
collectively control the network and prevents malicious actors from easily
tampering with the blockchain.
2. Proof-of-Stake (PoS): PoS is an alternative consensus mechanism where
the probability of a participant being chosen to validate new transactions or
create new blocks is based on the number of cryptocurrency tokens they
hold and are willing to "stake" as collateral. This eliminates the need for
extensive computational power, as in PoW. PoS blockchains include
Ethereum (since its 2022 switch from PoW), Cardano, and Tezos.
3. Delegated Proof-of-Stake (DPoS): DPoS is a variant of PoS where a
limited number of participants, known as "delegates," are chosen to validate
transactions and create new blocks on behalf of the entire network. The
selection of delegates is often based on voting by token holders. Examples of
DPoS blockchains include EOS and Tron.
4. Practical Byzantine Fault Tolerance (PBFT): PBFT is a classical
consensus algorithm that works in a permissioned setting, where a fixed set
of known participants is present. It ensures consensus even in the presence
of Byzantine faults (e.g., malicious nodes). PBFT requires a certain
threshold of honest nodes to reach consensus: it tolerates up to f faulty
nodes out of 3f + 1. Early releases of Hyperledger Fabric are an example of a
blockchain framework that employed a PBFT-based protocol.
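Two of the ideas in this list can be sketched briefly: the PoW puzzle as a search for a nonce, and PoS selection as a stake-weighted random draw. Both are simplified models. The "leading zero hex digits" difficulty rule, the function names, and the example stakes are assumptions, not the rules of any particular blockchain.

```python
import hashlib
import random
from typing import Dict, Tuple


def mine(block_data: str, difficulty: int = 3) -> Tuple[int, str]:
    """PoW: try nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def is_valid_proof(block_data: str, nonce: int, difficulty: int = 3) -> bool:
    """Checking a solution takes one hash, while finding it takes many."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def pick_validator(stakes: Dict[str, int], rng: random.Random) -> str:
    """PoS: choose a validator with probability proportional to its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


nonce, digest = mine("block with transactions")
print(nonce, digest)
print(pick_validator({"alice": 60, "bob": 30, "carol": 10}, random.Random()))
```

Raising the difficulty by one hex digit multiplies the expected mining work by 16, while verification stays a single hash.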
MERKLE TREE :
Merkle Trees enable the secure and efficient verification of large
datasets.
A Merkle tree is a binary tree in which the inputs are first placed at
the leaves and then the values of pairs of child nodes are hashed together to
produce a value for the parent node (internal node), until a single hash value
known as a Merkle root is achieved. This structure makes it possible to verify the
integrity of the entire tree quickly, just by checking the Merkle root at the top of the
tree, because if any of the hashes in the tree changes, the Merkle root
will also change.
Integrity of the system can be verified quickly by just looking at the Merkle
root.
In a Merkle tree, there is no requirement to store large amounts of data, only
the hashes of the data, which are fixed-length digests of the large dataset. Due
to this property, the storage and management of a Merkle tree are easy and
efficient, as the hashes take up a very small amount of space.
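A Merkle root computation can be sketched as below. Two details are assumptions the notes do not specify: the leaves are hashed before pairing, and when a level has an odd number of nodes the last one is duplicated (the convention Bitcoin uses). SHA-256 and the function names are chosen for illustration.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    """Hash function used for every node (SHA-256, chosen for illustration)."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> bytes:
    """Hash pairs of nodes level by level until a single root remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node (assumption)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


transactions = [b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(transactions)
tampered = merkle_root([b"tx1", b"tx2", b"tx3", b"tx4-changed"])
print(root.hex())
print(root == tampered)  # False: any change in a leaf changes the root
```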
The exact price of the gas is determined by supply, demand, and network
capacity at the time of the transaction.
The term gas limit refers to the maximum amount of gas a cryptocurrency user is
willing to spend when sending a transaction, or performing a smart contract function,
in the Ethereum blockchain. Fees are calculated in gas units, and the gas limit,
multiplied by the gas price, defines the maximum value that the transaction or
function can "charge" or take from the user. As such, the gas limit works as a
security mechanism that prevents high fees from being incorrectly charged due to a
bug or error in a smart contract.
Some wallets and service providers set the gas price and gas limit
automatically, but in some cases users are also able to adjust them manually,
according to their needs. In general, a regular Ether (ETH) transaction requires
a gas limit of at least 21,000. If the gas price (in Gwei) is set
to a higher level, the operation will generally be confirmed faster, at the cost of
a higher fee. On the other hand, a very low gas price
is risky because the transaction could take too long to be confirmed or
get stuck, and a gas limit that is too low causes the transaction to fail.
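The relationship between gas limit, gas price, and fee can be sketched with simple arithmetic. This follows the simplified model in which the fee is the gas consumed multiplied by the gas price; the 20 Gwei price is an arbitrary example value and the function names are made up for illustration.

```python
WEI_PER_GWEI = 10 ** 9
WEI_PER_ETH = 10 ** 18


def max_fee_wei(gas_limit: int, gas_price_gwei: int) -> int:
    """Upper bound on the fee: the gas limit multiplied by the gas price."""
    return gas_limit * gas_price_gwei * WEI_PER_GWEI


def fee_paid_wei(gas_used: int, gas_limit: int, gas_price_gwei: int) -> int:
    """Fee charged: gas consumed (never more than the limit) times the price.

    If the work needs more gas than the limit, the transaction fails, but
    the gas consumed up to the limit is still charged.
    """
    return min(gas_used, gas_limit) * gas_price_gwei * WEI_PER_GWEI


limit = 21_000   # gas limit of a plain ETH transfer
price = 20       # example gas price in Gwei (assumption)

fee = max_fee_wei(limit, price)
print(fee)                # 420000000000000 wei
print(fee / WEI_PER_ETH)  # 0.00042 ETH
```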
Transaction fees are essential to the blockchain. They serve two main purposes:
It’s worth noting that transaction fees are not the same as the fees
charged by crypto exchanges. Those additional processing fees go directly to the
exchange, while blockchain transaction fees go to the miner of the block.
ANONYMITY:
Anonymity in blockchain refers to the ability of participants within a
blockchain network to transact and interact without revealing their true identities.
Blockchain technology, which underlies cryptocurrencies like Bitcoin, offers
certain degrees of anonymity, but it's essential to understand the nuances and
limitations involved.
1. Pseudonymity: Blockchain transactions are linked to cryptographic
addresses rather than real-world identities. These addresses are random
strings of characters that users can generate for each transaction or use
repeatedly. While these addresses don't reveal personal information, they
still provide a level of traceability within the blockchain.
REWARD :
In the context of blockchain, rewards typically refer to incentives provided to
participants in the network for their contributions or participation. These rewards
can take different forms depending on the specific blockchain protocol or
cryptocurrency involved. Here are a few common types of rewards in blockchain:
CHAIN :
In the context of blockchain technology, a "chain" refers to the series of blocks that
are sequentially linked together. Each block contains a list of transactions or data,
along with a unique identifier called a cryptographic hash, which is generated
based on the contents of the block.
4. Block Validation: Each block in the chain contains a reference (hash) to the
previous block, forming a cryptographic link. This link enables participants
to validate the authenticity and integrity of each block, ensuring that the data
contained within the block has not been modified.
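Block validation through the hash links can be sketched as follows. Blocks are modelled as plain dictionaries hashed via a sorted JSON serialization; the field names, the all-zero first link, and the function names are assumptions for illustration.

```python
import hashlib
import json
from typing import Dict, List


def block_hash(block: Dict) -> str:
    """SHA-256 of the block's contents, serialized with sorted keys."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches: List[List[str]]) -> List[Dict]:
    """Create one block per batch, each storing the hash of its predecessor."""
    chain = []
    previous = "0" * 64  # placeholder link for the first block (assumption)
    for index, transactions in enumerate(batches):
        block = {"index": index, "transactions": transactions,
                 "previous_hash": previous}
        chain.append(block)
        previous = block_hash(block)
    return chain


def is_valid(chain: List[Dict]) -> bool:
    """Recompute every predecessor's hash and compare with the stored link."""
    for prev_block, block in zip(chain, chain[1:]):
        if block["previous_hash"] != block_hash(prev_block):
            return False
    return True


chain = build_chain([["a -> b: 1"], ["b -> c: 2"], ["c -> a: 3"]])
print(is_valid(chain))                       # True
chain[0]["transactions"] = ["a -> b: 1000"]  # tamper with an early block
print(is_valid(chain))                       # False: the next link no longer matches
```

The links alone do not protect the most recent block; its hash has to be known from elsewhere, for example from the other nodes' copies of the chain.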