How to Get the MD5 Hash of a File in C++?
Last Updated: 21 Aug, 2024
MD5 (Message Digest Algorithm 5) is a hash function that produces a 128-bit hash value, usually written as a 32-character hexadecimal number. It is no longer considered cryptographically secure, because practical collision attacks exist, but it is still used for non-security purposes such as checksums and detecting accidental file corruption. In this article, we will learn how to calculate the MD5 hash of a file in C++.
MD5 was designed by Ronald Rivest in 1991 as an improvement over the earlier MD4 algorithm. It takes an input (or message) of any length and produces a fixed-size, 128-bit hash value. The hash is deterministic, and even a small change in the input produces a significantly different hash. The hash is not guaranteed to be unique, however: two different inputs can produce the same hash (a collision).
MD5 Algorithm to get MD5 Hash of a File in C++
The MD5 algorithm processes a variable-length input message into a fixed-length 128-bit output in the following key steps.
1. Padding the Bits
The first step in the MD5 algorithm is to pad the original message so that its length (in bits) is congruent to 448 modulo 512. A single '1' bit is appended, followed by as many '0' bits as required to reach that length. Padding is always performed: if the message length is already congruent to 448 modulo 512, a full 512 bits of padding are added.
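The padding rule can be checked with a little arithmetic. The sketch below works in whole bytes (448 bits = 56 bytes, 512 bits = 64 bytes), which is the usual case when hashing files; the function name md5PaddingBytes is a hypothetical helper, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of padding bytes MD5 adds before the
// 64-bit length field, for a message of `messageBytes` bytes.
// The result is always in the range 1..64, because the 0x80 byte
// (a '1' bit followed by seven '0' bits) is appended unconditionally.
std::uint64_t md5PaddingBytes(std::uint64_t messageBytes) {
    const std::uint64_t used = messageBytes % 64;  // bytes used in the last block
    const std::uint64_t pad = (used < 56) ? (56 - used) : (120 - used);
    assert(pad >= 1 && pad <= 64);
    assert((messageBytes + pad) % 64 == 56);  // i.e. 448 bits modulo 512
    return pad;
}
```

For example, a 3-byte message needs 53 padding bytes, while a 56-byte message (already 448 bits) needs a full 64.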
2. Appending the Length
After padding, append the original length of the message in bits (before padding) as a 64-bit little-endian integer. If the length exceeds 2^64 bits, only the low-order 64 bits are used. This makes the total length of the padded message an exact multiple of 512 bits, which the block processing step requires.
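Steps 1 and 2 together can be sketched as a single function that returns the fully padded message. This is an illustration for byte-oriented input only, and the name md5Pad is a hypothetical helper introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns `message` followed by the MD5 padding
// and the 64-bit little-endian bit length.
std::vector<std::uint8_t> md5Pad(const std::string& message) {
    std::vector<std::uint8_t> out(message.begin(), message.end());
    const std::uint64_t bitLength =
        static_cast<std::uint64_t>(message.size()) * 8;

    // Step 1: a single '1' bit (0x80), then '0' bits up to 448 mod 512.
    out.push_back(0x80);
    while (out.size() % 64 != 56) {
        out.push_back(0x00);
    }

    // Step 2: original length in bits, low-order byte first.
    for (int i = 0; i < 8; ++i) {
        out.push_back(static_cast<std::uint8_t>(bitLength >> (8 * i)));
    }

    assert(out.size() % 64 == 0);  // a whole number of 512-bit blocks
    return out;
}
```

Padding the 3-byte message "abc" gives exactly one 64-byte block, with 0x80 at index 3 and the value 24 (the bit length) at index 56.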
3. Initialize MD Buffer
MD5 uses four 32-bit variables (A, B, C, D) to store the intermediate and final hash values. These are initialized with specific constants:
- A = 0x67452301
- B = 0xEFCDAB89
- C = 0x98BADCFE
- D = 0x10325476
4. Processing Each 512-bit Block
The message is divided into 512-bit blocks, and each block is processed using a series of bitwise operations and additions modulo 2^32 on the four variables (A, B, C, D). Each block is treated as sixteen 32-bit little-endian words and goes through 64 operations, grouped into four rounds of 16, each of which updates the values of A, B, C, and D.
- Nonlinear Functions (F, G, H, I): Each round uses one bitwise function of B, C, and D: F = (B AND C) OR (NOT B AND D), G = (B AND D) OR (C AND NOT D), H = B XOR C XOR D, and I = C XOR (B OR NOT D).
- Constants and Sine Table Values: Each of the 64 operations adds its own 32-bit constant, the integer part of 2^32 × |sin(i)| for i = 1 to 64, with i in radians.
- Circular Left Rotation: The result is rotated left (bits shifted out on the left re-enter on the right) by a number of bits defined per operation.
- Add the Result to the Buffer: The output of each operation is added back to one of the buffer values (A, B, C, D).
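The building blocks listed above can be written as small C++ functions. This is a sketch for illustration; the names md5F, md5G, md5H, md5I, rotl, sineConstant, and md5Step are hypothetical, and the sine constants are computed at run time here, whereas real implementations normally hard-code the table.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

using u32 = std::uint32_t;

// The four nonlinear functions, one per round.
inline u32 md5F(u32 b, u32 c, u32 d) { return (b & c) | (~b & d); }
inline u32 md5G(u32 b, u32 c, u32 d) { return (b & d) | (c & ~d); }
inline u32 md5H(u32 b, u32 c, u32 d) { return b ^ c ^ d; }
inline u32 md5I(u32 b, u32 c, u32 d) { return c ^ (b | ~d); }

// Circular left rotation of a 32-bit word by s bits (0 < s < 32).
inline u32 rotl(u32 x, int s) {
    assert(s > 0 && s < 32);
    return (x << s) | (x >> (32 - s));
}

// Constant for operation i (0-based): floor(2^32 * |sin(i + 1)|).
inline u32 sineConstant(int i) {
    return static_cast<u32>(
        std::floor(std::fabs(std::sin(i + 1.0)) * 4294967296.0));
}

// One operation: combine A with the function output f, the message
// word m and the constant k, rotate, and add B. The result becomes
// the new B; unsigned arithmetic wraps modulo 2^32 automatically.
inline u32 md5Step(u32 a, u32 b, u32 f, u32 m, u32 k, int s) {
    return b + rotl(a + f + k + m, s);
}
```

F acts as a bitwise selector: where a bit of B is 1 it picks the bit from C, otherwise the bit from D.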
5. Updating the Buffers
Within a block, each operation updates the values of A, B, C, and D. After all 64 operations, the resulting values are added (modulo 2^32) to the values that A, B, C, and D held before the block was processed. The running state therefore depends on every block processed so far.
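Steps 4 and 5 can be combined into one function that processes a single 64-byte block. The sketch below follows the widely published MD5 pseudocode; md5ProcessBlock and Md5State are hypothetical names, and computing the sine constants inside the loop is a simplification made for brevity.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>

using u32 = std::uint32_t;
using Md5State = std::array<u32, 4>;  // {A, B, C, D}

// Hypothetical helper: applies the 64 operations to one 64-byte block
// and adds the result back into `state`.
void md5ProcessBlock(Md5State& state, const std::uint8_t* block) {
    assert(block != nullptr);
    // Rotation amounts: one row per round, repeating every 4 operations.
    static const int shifts[4][4] = {
        {7, 12, 17, 22}, {5, 9, 14, 20}, {4, 11, 16, 23}, {6, 10, 15, 21}};

    // Decode the block into sixteen little-endian 32-bit words.
    u32 m[16];
    for (int j = 0; j < 16; ++j) {
        m[j] = u32(block[4 * j]) | (u32(block[4 * j + 1]) << 8) |
               (u32(block[4 * j + 2]) << 16) | (u32(block[4 * j + 3]) << 24);
    }

    u32 a = state[0], b = state[1], c = state[2], d = state[3];
    for (int i = 0; i < 64; ++i) {
        u32 f;
        int g;  // index of the message word used by this operation
        switch (i / 16) {
            case 0:  f = (b & c) | (~b & d); g = i;                break;
            case 1:  f = (b & d) | (c & ~d); g = (5 * i + 1) % 16; break;
            case 2:  f = b ^ c ^ d;          g = (3 * i + 5) % 16; break;
            default: f = c ^ (b | ~d);       g = (7 * i) % 16;     break;
        }
        const u32 k = static_cast<u32>(
            std::floor(std::fabs(std::sin(i + 1.0)) * 4294967296.0));
        const int s = shifts[i / 16][i % 4];
        const u32 sum = a + f + k + m[g];
        a = d;
        d = c;
        c = b;
        b = b + ((sum << s) | (sum >> (32 - s)));
    }

    // Step 5: add this block's result to the previous state (mod 2^32).
    state[0] += a;
    state[1] += b;
    state[2] += c;
    state[3] += d;
}
```

Starting from the initial constants and feeding the single padded block of an empty message (0x80 followed by 63 zero bytes) leaves A = 0xd98c1dd4, which serializes to the first four bytes d4 1d 8c d9 of the well-known empty-input digest.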
6. Produce the Final Hash Value
After all blocks are processed, the values of A, B, C, and D are concatenated in that order to produce the final 128-bit hash value. Each 32-bit value is written in little-endian order, that is, low-order byte first.
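The little-endian output order is easy to get wrong, so here is a small sketch of the final serialization. The function name md5DigestToHex is a hypothetical helper.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: writes A, B, C, D in order, each word
// low-order byte first, as 32 lowercase hexadecimal characters.
std::string md5DigestToHex(const std::array<std::uint32_t, 4>& state) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(32);
    for (std::uint32_t word : state) {
        for (int i = 0; i < 4; ++i) {
            const unsigned byte = (word >> (8 * i)) & 0xFFu;
            hex.push_back(digits[byte >> 4]);
            hex.push_back(digits[byte & 0x0Fu]);
        }
    }
    assert(hex.size() == 32);
    return hex;
}
```

Serializing the four initial constants this way gives 0123456789abcdeffedcba9876543210, which shows why those constants look scrambled when written as 32-bit numbers.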
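The program in this article uses the MD5 implementation provided by OpenSSL. Before compiling it, make sure you have the following:
- OpenSSL Library: You need to have OpenSSL and its development headers installed on your system. You can install it using a package manager like apt for Linux or brew for macOS, or you can download it from the OpenSSL website.
- C++ Compiler: Make sure you have a C++ compiler installed, such as g++. The program must be linked against OpenSSL's libcrypto, for example with the -lcrypto option of g++.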
C++ Program to Implement MD5 Algorithm
The following program uses OpenSSL's MD5() function to compute the MD5 hash of a string and of a file in C++.
C++
// C++ program to get the MD5 hash of a file
#include <fstream>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>
#include <openssl/md5.h>
using namespace std;

// Function to print the MD5 hash in hexadecimal format
void print_MD5(const unsigned char *md, size_t size = MD5_DIGEST_LENGTH){
    for (size_t i = 0; i < size; i++){
        cout << hex << setw(2) << setfill('0') << (int)md[i];
    }
    // hex and setfill are sticky, so restore the defaults for later output
    cout << dec << setfill(' ') << endl;
}

// Function to compute and print MD5 hash of a given string
void computeMD5FromString(const string &str){
    unsigned char result[MD5_DIGEST_LENGTH];
    // Note: MD5() is deprecated since OpenSSL 3.0 but is still available
    MD5(reinterpret_cast<const unsigned char *>(str.data()), str.length(), result);
    cout << "MD5 of '" << str << "' : ";
    print_MD5(result);
}

// Function to compute and print MD5 hash of a file
void computeMD5FromFile(const string &filePath){
    // Open in binary mode, positioned at the end so tellg() gives the size
    ifstream file(filePath, ios::in | ios::binary | ios::ate);
    if (!file.is_open()){
        cerr << "Error: Cannot open file: " << filePath << endl;
        return;
    }
    // Get file size (tellg() returns -1 on failure)
    streamoff fileSize = file.tellg();
    if (fileSize < 0){
        cerr << "Error: Cannot determine size of file: " << filePath << endl;
        return;
    }
    cout << "File size: " << fileSize << " bytes" << endl;
    // Allocate memory to hold the entire file (freed automatically)
    vector<char> memBlock(static_cast<size_t>(fileSize));
    // Read the file into memory
    file.seekg(0, ios::beg);
    if (!file.read(memBlock.data(), fileSize)){
        cerr << "Error: Cannot read file: " << filePath << endl;
        return;
    }
    file.close();
    // Compute the MD5 hash of the file content
    unsigned char result[MD5_DIGEST_LENGTH];
    MD5(reinterpret_cast<const unsigned char *>(memBlock.data()), memBlock.size(), result);
    cout << "MD5 of file '" << filePath << "' : ";
    print_MD5(result);
}

int main(){
    // Example 1: Compute and print MD5 hash of a string
    string inputString = "12345";
    computeMD5FromString(inputString);
    // Example 2: Compute and print MD5 hash of a file
    // (example.txt is assumed to contain the 4 characters "test", no newline)
    string filePath = "example.txt";
    computeMD5FromFile(filePath);
    return 0;
}
Output
MD5 of '12345' : 827ccb0eea8a706c4c34a16891f84e7b
File size: 4 bytes
MD5 of file 'example.txt' : 098f6bcd4621d373cade4e832627b4f6
The file output above assumes that example.txt contains exactly the four characters test with no trailing newline. A file with different contents will show a different size and hash.
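The program above loads the whole file into memory before hashing it, which can be a problem for very large files. One common alternative is to read the file in fixed-size chunks and feed each chunk to an incremental hash. The sketch below shows only the chunked reading part using the standard library; forEachChunk is a hypothetical helper, and the callback is where a call such as OpenSSL's MD5_Update or EVP_DigestUpdate would go.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` in chunks of at most `chunkSize`
// bytes and calls onChunk(data, size) for each one. Returns false if
// the file cannot be opened or a read error occurs before end of file.
bool forEachChunk(const std::string& path, std::size_t chunkSize,
                  const std::function<void(const char*, std::size_t)>& onChunk) {
    assert(chunkSize > 0);
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        return false;
    }
    std::vector<char> buffer(chunkSize);
    while (file) {
        file.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = file.gcount();  // may be a short final chunk
        if (got > 0) {
            onChunk(buffer.data(), static_cast<std::size_t>(got));
        }
    }
    return file.eof();  // the loop should end because of EOF, not an error
}
```

With this approach, memory use stays at one buffer regardless of the file size, and the final digest is obtained after the last chunk has been passed to the hash.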