Working with zip files in Python
Last Updated: 22 Jul, 2021
This article explains how to perform various operations on a zip file using a simple Python program.
What is a zip file?
ZIP is an archive file format that supports lossless data compression. By lossless compression, we mean that the compression algorithm allows the original data to be perfectly reconstructed from the compressed data. So, a ZIP file is a single file containing one or more compressed files, offering an ideal way to make large files smaller and keep related files together.
Why do we need zip files?
- To reduce storage requirements.
- To improve transfer speed over standard connections.
To work with zip files in Python, we will use the built-in Python module zipfile.
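As a quick illustration of the lossless property described earlier, the sketch below writes some bytes into an in-memory archive and reads them back. The member name sample.txt and the sample data are made up for this example, and ZIP_DEFLATED is chosen explicitly because zipfile does not compress unless asked to.

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED

# Highly repetitive sample data (hypothetical), which compresses well.
original = b"hello zip " * 1000

# Write the data into an in-memory archive using DEFLATE compression.
buffer = io.BytesIO()
with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
    archive.writestr("sample.txt", original)

# Read the member back and look up its size metadata.
with ZipFile(buffer, "r") as archive:
    restored = archive.read("sample.txt")
    info = archive.getinfo("sample.txt")

# Lossless: the restored bytes are identical to the original bytes.
print(restored == original)
# Uncompressed size versus compressed size, in bytes.
print(info.file_size, info.compress_size)
```

The compressed size depends on the data; repetitive input like this shrinks a lot, while already-compressed data (for example JPEG images) may barely shrink at all.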
1. Extracting a zip file
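from zipfile import ZipFile

# name of the zip file to extract
file_name = "my_python_files.zip"

# open the zip file in READ mode
with ZipFile(file_name, 'r') as zip:
    # print a table of contents for the archive
    zip.printdir()

    # extract all the files
    print('Extracting all the files now...')
    zip.extractall()
    print('Done!')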
The above program extracts a zip file named "my_python_files.zip" into the current working directory, which is the directory of the Python script when the script is run from its own folder.
When run, the program first prints a table of contents listing each file's name, modification time, and size, followed by the two status messages.

Let us go through the above code piece by piece:
- from zipfile import ZipFile
  ZipFile is a class in the zipfile module for reading and writing zip files. Here we import only the ZipFile class from the zipfile module.
- with ZipFile(file_name, 'r') as zip:
  Here, a ZipFile object is created by calling the ZipFile constructor, which accepts the zip file name and a mode parameter. We open the archive in READ mode and name the object zip. Note that this name shadows Python's built-in zip() function inside the block.
- zip.printdir()
  The printdir() method prints a table of contents for the archive.
- zip.extractall()
  The extractall() method extracts all the contents of the zip file to the current working directory. You can also call the extract() method to extract a single file by specifying its path within the zip file. For example:
  zip.extract('python_files/python_wiki.txt')
  This will extract only the specified file.
If you want to read a specific file without extracting it, you can call read(), which returns the file's contents as bytes:
data = zip.read(name_of_file_to_read)
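The methods discussed above can be tried out together in a minimal, self-contained sketch. It builds a small throwaway archive in a temporary directory first, so the archive name demo.zip, the member other.txt, and the target directories are assumptions made only for illustration. It also uses the optional path argument of extract() and extractall() to choose a destination other than the current working directory.

```python
import os
import tempfile
from zipfile import ZipFile

# Work in a temporary directory; all names below are hypothetical.
work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, "demo.zip")

# Build a small archive to experiment with.
with ZipFile(archive_path, "w") as archive:
    archive.writestr("python_files/python_wiki.txt", "Python is a programming language.")
    archive.writestr("python_files/other.txt", "Some other file.")

with ZipFile(archive_path, "r") as archive:
    # namelist() returns the member paths in the order they were added.
    names = archive.namelist()

    # Extract one member into a chosen directory instead of the current one.
    single_dir = os.path.join(work_dir, "single")
    extracted_path = archive.extract("python_files/python_wiki.txt", path=single_dir)

    # Extract every member into another directory.
    all_dir = os.path.join(work_dir, "all")
    archive.extractall(path=all_dir)

    # read() returns bytes; decode them to get text.
    text = archive.read("python_files/python_wiki.txt").decode("utf-8")

print(names)
print(extracted_path)
print(text)
```

Passing an explicit path keeps extracted files out of the working directory, which is usually safer when the archive contents are not known in advance.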
2. Writing to a zip file
Consider a directory (folder) named python_files that contains files and sub-directories.

Here, we will need to crawl the whole directory and its sub-directories in order to get a list of all file paths before writing them to a zip file.
The following program does this by crawling the directory to be zipped:
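from zipfile import ZipFile
import os

def get_all_file_paths(directory):
    # list that will hold the path of every file found
    file_paths = []

    # walk through the directory and all of its sub-directories
    for root, directories, files in os.walk(directory):
        for filename in files:
            # join the directory path and file name to form the full path
            filepath = os.path.join(root, filename)
            file_paths.append(filepath)

    return file_paths

def main():
    # path to the directory to be zipped
    directory = './python_files'

    # collect all file paths in the directory
    file_paths = get_all_file_paths(directory)

    # print the list of files to be zipped
    print('Following files will be zipped:')
    for file_name in file_paths:
        print(file_name)

    # write the files to a zip file
    with ZipFile('my_python_files.zip', 'w') as zip:
        for file in file_paths:
            zip.write(file)

    print('All files zipped successfully!')

if __name__ == "__main__":
    main()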
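When run, the program prints 'Following files will be zipped:', then the path of every file found under ./python_files, and finally 'All files zipped successfully!'.
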
Let us go through the above code fragment by fragment:
- def get_all_file_paths(directory):
      file_paths = []
      for root, directories, files in os.walk(directory):
          for filename in files:
              filepath = os.path.join(root, filename)
              file_paths.append(filepath)
      return file_paths
  To get all file paths in our directory, we have created this function, which uses the os.walk() function. os.walk() visits the directory and each of its sub-directories in turn; in each iteration, the paths of all files present in the directory being visited are appended to a list called file_paths. In the end, we return all the file paths.
- file_paths = get_all_file_paths(directory)
  Here we pass the directory to be zipped to the get_all_file_paths() function and obtain a list containing all file paths.
- with ZipFile('my_python_files.zip','w') as zip:
  Here, we create a ZipFile object in WRITE mode this time. If a file with this name already exists, WRITE mode overwrites it.
- for file in file_paths:
      zip.write(file)
  Here, we write all the files to the zip file one by one using the write() method.
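One point worth knowing about the program above: by default, ZipFile stores files without compressing them (ZIP_STORED). The sketch below is one possible variation, not the only way to do it. It passes compression=ZIP_DEFLATED to actually compress the data, and uses the arcname argument of write() to control the path stored inside the archive. The temporary directory, the sample files a.txt and sub/b.txt, and their contents are assumptions made for illustration.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED

# Create a small hypothetical directory tree to zip.
work_dir = tempfile.mkdtemp()
source_dir = os.path.join(work_dir, "python_files")
os.makedirs(os.path.join(source_dir, "sub"))
with open(os.path.join(source_dir, "a.txt"), "w") as f:
    f.write("a" * 5000)
with open(os.path.join(source_dir, "sub", "b.txt"), "w") as f:
    f.write("b" * 5000)

archive_path = os.path.join(work_dir, "my_python_files.zip")

# ZIP_DEFLATED turns on compression; the default ZIP_STORED does not compress.
with ZipFile(archive_path, "w", compression=ZIP_DEFLATED) as archive:
    for root, _dirs, files in os.walk(source_dir):
        for filename in files:
            full_path = os.path.join(root, filename)
            # Store a path relative to work_dir instead of the full temp path.
            arcname = os.path.relpath(full_path, start=work_dir)
            archive.write(full_path, arcname=arcname)

# Inspect what was stored: member names and (uncompressed, compressed) sizes.
with ZipFile(archive_path, "r") as archive:
    stored_names = sorted(archive.namelist())
    sizes = {i.filename: (i.file_size, i.compress_size) for i in archive.infolist()}

print(stored_names)
print(sizes)
```

Using arcname keeps machine-specific directory prefixes out of the archive, so it extracts to the same relative layout wherever it is unpacked.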
3. Getting all information about a zip file
from zipfile import ZipFile
import datetime

# name of the zip file to inspect
file_name = "example.zip"

# open the zip file in READ mode
with ZipFile(file_name, 'r') as zip:
    for info in zip.infolist():
        print(info.filename)
        print('\tModified:\t' + str(datetime.datetime(*info.date_time)))
        print('\tSystem:\t\t' + str(info.create_system) + ' (0 = Windows, 3 = Unix)')
        print('\tZIP version:\t' + str(info.create_version))
        print('\tCompressed:\t' + str(info.compress_size) + ' bytes')
        print('\tUncompressed:\t' + str(info.file_size) + ' bytes')
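When run, the program prints, for each file in the archive, its name followed by its modification time, creating system, ZIP version, and compressed and uncompressed sizes.
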
for info in zip.infolist():
Here, the infolist() method returns a list of ZipInfo objects, one for each member of the archive. Each ZipInfo object holds the metadata of a single file.
We can access information such as the file name, the last modification date, the system on which the file was created, the ZIP version, and the size of the file in compressed and uncompressed form.
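The ZipInfo attributes can also be used for simple calculations rather than just printing. The following sketch, which assumes a small in-memory archive with made-up members a.txt and b.txt, totals the compressed and uncompressed sizes and looks up a single member with getinfo().

```python
import io
from zipfile import ZipFile, ZIP_DEFLATED

# Build a small in-memory archive (hypothetical contents).
buffer = io.BytesIO()
with ZipFile(buffer, "w", compression=ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", "a" * 2000)
    archive.writestr("b.txt", "b" * 3000)

with ZipFile(buffer, "r") as archive:
    # infolist() gives one ZipInfo object per archive member.
    infos = archive.infolist()
    total_uncompressed = sum(info.file_size for info in infos)
    total_compressed = sum(info.compress_size for info in infos)

    # getinfo() returns the ZipInfo object for one named member.
    single = archive.getinfo("b.txt")

print(len(infos), total_uncompressed, total_compressed)
print(single.filename, single.file_size)
```

getinfo() raises a KeyError if the archive has no member with the given name, so it is worth checking namelist() first when the name comes from user input.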
This article is contributed by Nikhil Kumar.