Verify Integrity of Files Using Digest in Python
Last Updated: 15 Apr, 2025
Data integrity is a critical aspect of file management: it ensures that files remain unaltered during transmission or storage. In Python, one effective way to verify file integrity is to use cryptographic hash functions and their digests. A digest is a fixed-size value produced by a hash function that, for practical purposes, uniquely represents the content of a file. This article walks through verifying the integrity of files using digests in Python, step by step.
What is a Digest in Python?
In Python, a digest is the result of applying a hash function (such as SHA-256 or MD5) to the content of a file. This fixed-size value serves as a practical fingerprint of the file's content. If the content changes, even by a single byte, the digest changes as well, which gives a reliable way to detect alterations. MD5 is no longer collision-resistant, so prefer SHA-256 when deliberate tampering is a concern.
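As a small illustration of that property, the sketch below hashes two byte strings that differ in a single byte. The sample strings are made up for this example; only hashlib from the standard library is used.

```python
import hashlib

# Two made-up inputs that differ only in their final byte.
original = b"integrity check"
modified = b"integrity chech"

digest_original = hashlib.sha256(original).hexdigest()
digest_modified = hashlib.sha256(modified).hexdigest()

# A SHA-256 hex digest is always 64 characters long, whatever the input size.
print(len(digest_original), len(digest_modified))

# The digests differ even though the inputs are almost identical.
print(digest_original == digest_modified)
```

The same idea applies to files: the bytes of the file take the place of the sample strings.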
How To Verify Integrity Of Files Using Digest In Python?
Below is a step-by-step guide to verifying the integrity of files using a digest in Python:
Install Colorama Library
The script uses the Colorama library for colored terminal output. Colorama is not part of the Python standard library, so install it with the following command:
pip install colorama

Step 1: Library Imports
The code below imports the required libraries: argparse for parsing command-line arguments, hashlib for cryptographic hash functions, and sys for system-specific parameters and functions. It also imports init and Fore from Colorama for colored output.
Python3
# Import the necessary libraries required.
import argparse
import hashlib
import sys
# Import init and Fore from the colorama library for colored terminal text.
from colorama import init, Fore
# Initialize colorama so that the color codes also work on Windows terminals.
init()
Step 2: Hash Calculation Function
The code below defines a function, calculate_hash, that takes a file path as an argument and calculates the SHA-256 hash of the file using a hash object. It reads the file in 64KB chunks, so even large files are never loaded into memory all at once, and updates the hash object with each chunk.
Python3
# Define a function to calculate the SHA-256 hash of a file.
def calculate_hash(file_path):
    # Create a SHA-256 hash object.
    sha256_hash = hashlib.sha256()
    # Open the file in binary mode for reading (rb).
    with open(file_path, "rb") as file:
        # Read the file in 64KB chunks to efficiently handle large files.
        while True:
            data = file.read(65536)
            if not data:
                break
            # Update the hash object with the data read from the file.
            sha256_hash.update(data)
    return sha256_hash.hexdigest()
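Download pages sometimes publish digests made with an algorithm other than SHA-256. As a rough sketch, the chunked-reading approach above could be generalized by passing the algorithm name to hashlib.new. The function name calculate_hash_with and its parameters are hypothetical and not part of the original script.

```python
import hashlib


def calculate_hash_with(file_path, algorithm="sha256", chunk_size=65536):
    """Hypothetical variant of calculate_hash that accepts an algorithm name."""
    # hashlib.new raises ValueError if the algorithm name is not supported.
    hash_object = hashlib.new(algorithm)
    with open(file_path, "rb") as file:
        # iter() with a sentinel keeps calling file.read until it returns b"".
        for chunk in iter(lambda: file.read(chunk_size), b""):
            hash_object.update(chunk)
    return hash_object.hexdigest()
```

Calling calculate_hash_with(path) with the default arguments should give the same result as calculate_hash(path), since both feed the same bytes to SHA-256.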
Step 3: Hash Verification Function
Here, a function verify_hash is defined, which takes the path of a downloaded file and an expected hash value as arguments. It calculates the hash of the downloaded file with the calculate_hash function and compares it with the expected hash value, returning True if they match and False otherwise.
Python3
def verify_hash(downloaded_file, expected_hash):
    calculated_hash = calculate_hash(downloaded_file)
    return calculated_hash == expected_hash
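Because verify_hash compares the two strings with ==, an expected hash written in uppercase hex or pasted with stray whitespace would be reported as a mismatch. One possible refinement is sketched below with a hypothetical helper named digests_match (not part of the original script). It normalizes both values and compares them with hmac.compare_digest from the standard library.

```python
import hmac


def digests_match(calculated_hash, expected_hash):
    """Hypothetical helper: compare two hex digests, ignoring case and padding."""
    normalized_calculated = calculated_hash.strip().lower()
    normalized_expected = expected_hash.strip().lower()
    # hmac.compare_digest compares in constant time; str arguments must be ASCII.
    return hmac.compare_digest(normalized_calculated, normalized_expected)


print(digests_match("ABCDEF", " abcdef\n"))  # True
print(digests_match("abcdef", "abcdee"))     # False
```

Constant-time comparison matters little for a public checksum, so a plain == on the normalized strings would serve equally well here.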
Step 4: Command-Line Argument
In this code, a command-line argument parser is created using argparse. Two arguments are defined: -f or --file for the path of the downloaded file, and --hash for the expected hash value. Both are marked as required.
Python3
parser = argparse.ArgumentParser(
    description="Verify the hash of a downloaded file.")
parser.add_argument("-f", "--file", dest="downloaded_file",
                    required=True, help="Path of the downloaded file")
parser.add_argument("--hash", dest="expected_hash",
                    required=True, help="Expected hash value")
args = parser.parse_args()
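To see how these definitions map command-line tokens to attributes, the sketch below builds the same parser inside a hypothetical build_parser function and passes an explicit argument list to parse_args instead of reading sys.argv. The file name example.bin and the hash abc123 are placeholders.

```python
import argparse


def build_parser():
    """Hypothetical wrapper that builds the same parser as the step above."""
    parser = argparse.ArgumentParser(
        description="Verify the hash of a downloaded file.")
    parser.add_argument("-f", "--file", dest="downloaded_file",
                        required=True, help="Path of the downloaded file")
    parser.add_argument("--hash", dest="expected_hash",
                        required=True, help="Expected hash value")
    return parser


# Passing a list lets the parser be tried out without a real command line.
sample_args = build_parser().parse_args(["-f", "example.bin", "--hash", "abc123"])
print(sample_args.downloaded_file)  # example.bin
print(sample_args.expected_hash)    # abc123
```

If a required argument is left out, argparse prints a usage message and exits with status 2.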
Step 5: Argument Validation and Hash Verification
Finally, this part checks that both command-line arguments have values. Because both are marked as required, argparse already rejects a missing argument; the explicit check additionally catches empty strings, printing an error message in red and exiting. If the arguments are present, the code verifies the hash using the verify_hash function and prints a success message in green or a failure message in red, depending on the result.
Python3
if not args.downloaded_file or not args.expected_hash:
    print(
        f"{Fore.RED}[-] Please specify the file to validate and its hash.")
    # Exit with a non-zero status to signal the error to the calling shell.
    sys.exit(1)
if verify_hash(args.downloaded_file, args.expected_hash):
    print(
        f"{Fore.GREEN}[+] Hash verification occurred successfully. The software is original.")
else:
    print(
        f"{Fore.RED}[-] Hash verification has failed, which means the software may have been tampered with or is not original.")
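The script above prints the result but does not report a mismatch through its exit status, and an unreadable file raises an unhandled exception. When the check runs inside another script, distinct return codes can be handy. The sketch below shows one way to do that with a hypothetical check_file helper. The codes 0, 1 and 2 are assumptions chosen for this example, not behavior of the original script.

```python
import hashlib
import sys


def check_file(file_path, expected_hash):
    """Hypothetical helper: return 0 on a match, 1 on a mismatch, 2 if unreadable."""
    sha256_hash = hashlib.sha256()
    try:
        with open(file_path, "rb") as file:
            # Read in 64KB chunks, as in calculate_hash.
            for chunk in iter(lambda: file.read(65536), b""):
                sha256_hash.update(chunk)
    except OSError as error:
        # Covers missing files, permission errors, and similar I/O failures.
        print(f"[-] Could not read {file_path}: {error}", file=sys.stderr)
        return 2
    return 0 if sha256_hash.hexdigest() == expected_hash.strip().lower() else 1
```

A caller could then end the program with sys.exit(check_file(path, expected)), so that shell scripts can react to the outcome.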
Step 6: Run the Command in Terminal
After saving the code above as verify.py, open a command prompt, change to the directory that contains the script, and run it with the following command:
python verify.py -f [path to the file, including its name and extension] --hash [expected SHA-256 hash]
Complete Code
The complete script initializes Colorama for colored text, defines functions to calculate and verify the SHA-256 hash of a file, and uses argparse to read the file path and expected hash from the command line. It then compares the file's hash with the expected value and prints a success or failure message accordingly.
Python3
import argparse
import hashlib
import sys
from colorama import init, Fore

# Initialize colorama for colored terminal text.
init()


# Define a function to calculate SHA-256 hash of a file.
def calculate_hash(file_path):
    sha256_hash = hashlib.sha256()
    with open(file_path, "rb") as file:
        # The := operator requires Python 3.8 or later.
        while (data := file.read(65536)):
            sha256_hash.update(data)
    return sha256_hash.hexdigest()


# Function to verify hash of a downloaded file.
def verify_hash(downloaded_file, expected_hash):
    calculated_hash = calculate_hash(downloaded_file)
    return calculated_hash == expected_hash


# Command-line argument parsing.
parser = argparse.ArgumentParser(description="Verify downloaded file's hash.")
parser.add_argument("-f", "--file", dest="downloaded_file",
                    required=True, help="Path of the downloaded file")
parser.add_argument("--hash", dest="expected_hash",
                    required=True, help="Expected hash value")
args = parser.parse_args()

# Validate arguments and perform hash verification.
if not args.downloaded_file or not args.expected_hash:
    print(
        f"{Fore.RED}[-] Please specify the file and its hash for validation.")
    # Exit with a non-zero status to signal the error to the calling shell.
    sys.exit(1)
if verify_hash(args.downloaded_file, args.expected_hash):
    print(
        f"{Fore.GREEN}[+] Hash verification successful. The file is original.")
else:
    print(
        f"{Fore.RED}[-] Hash verification failed. This may indicate tampering or non-original software.")
Output:
C:\Users\kisha\PycharmProjects\gfg-integrity>python verify.py -f C:\Users\kisha\Downloads\Programs\vlc-3.0.20-win64.exe --hash d8055b6643651ca5b9ad58c438692a481483657f3f31624cdfa68b92e8394a57
[+] Hash verification occurred successfully. The software is original.
Conclusion
In conclusion, this article covered what data integrity means and how a digest, produced by a hash function such as SHA-256 or MD5, can be used to check that a file has not been altered. It also walked through example code that calculates a file's SHA-256 digest and compares it with an expected value to verify the file's integrity.