Working with zip files in Python
How To Python

Working with zip files in Python

Mishel Shaji
Mishel Shaji

In this tutorial we're going to learn how to create, extract and view zip files in Python using the built-in module named zipfile.

What is a zip file?

It is a file that contains several compressed files and folders. Zip uses several compression algorithms to compress files. The most common one is DEFLATE.

A zip file has these benefits:

  • Reduced file size.
  • Saves storage space.
  • Allows you to encrypt data.

zipfile module

zipfile is a built-in module in Python to work with zip files. It allows you to create, extract and modify files.

Limitations

This module comes with everything you need to work with zip files. But has the following limitations.

  • It cannot handle ZIP files that are more than 4 GiB in size.
  • Cannot create an encrypted zip file.
  • Decompression is extremely slow compared to C.

Exceptions

An exception is an event that affects the normal flow of execution of a program. These are the exceptions that are defined in the zipfile module:

  • zipfile.BadZipFile - The error raised for bad ZIP files. It was introduced in version 3.2. It's alias is zipfile.BadZipfile, for compatibility with older Python versions.
  • zipfile.LargeZipFile - Raised when a ZIP file would require ZIP64 functionality but that has not been enabled.

Classes and methods

Here's a list of classes in zipfile module.

  • zipfile.ZipFile - The class for reading and writing ZIP files.
  • ZipFile.close() - Close the archive file. You must call close() before exiting your program or essential records will not be written.
  • ZipFile.open - (name, mode='r', pwd=None, *, force_zip64=False) Access a member of the archive as a binary file-like object.
  • ZipFile.extract - (member, path=None, pwd=None)Extract a member from the archive to the current working directory; member must be its full name or a ZipInfo object.
  • ZipFile.extractall - (path=None, members=None, pwd=None)Extract all members from the archive to the current working directory. path specifies a different directory to extract to.
  • zipfile.ZipInfo - (filename='NoName', date_time=(1980, 1, 1, 0, 0, 0))Class used to represent information about a member of an archive.
  • zipfile.is_zipfile -(filename)Returns True if filename is a valid ZIP file based on its magic number, otherwise returns False.

Extract a zip file

In this example, we are using the extractAll() method to extract the contents of our zip file.

from zipfile import ZipFile

zip_file = 'myzipfile.zip'
zip = ZipFile(zip_file, 'r')
zip.extractall()
zip.close()
print("Files Extracted")

This will extract the content of our Zip file to the current directory. To extract the content to a different directory, specify the path in extractall() method.

zip.extractall("D:\Extracted")

You can also use this method to open and work with zip files.

from zipfile import ZipFile
zip_file = 'myzipfile.zip'
with ZipFile(zip_file, 'r') as zip: 
    zip.extractall("D:\Extracted")
    print("Files Extracted")

In python the with keyword is used when working with unmanaged resources (like file streams). It allows you to ensure that a resource is cleaned up when the code that uses it finishes running, even if exceptions are thrown.

List contents of a zip file

To list the contents of a Zip file, we can use the printdir() method.

from zipfile import ZipFile
zip_file = 'myzip.zip'
with ZipFile(zip_file, 'r') as zip: 
    zip.printdir()

And here is the result.

File Name                                             Modified             Size
myzip/                                      2019-11-11 23:51:32            0
myzip/assets/                               2019-11-11 23:44:46            0
myzip/assets/bootstrap/                     2019-11-11 23:44:46            0
myzip/assets/bootstrap/css/                 2019-11-11 23:51:32            0
myzip/assets/img/                           2019-11-11 23:44:46            0
myzip/assets/img/1.jpg                      2019-11-11 23:51:32        23358
myzip/assets/img/2.jpg                      2019-11-11 23:51:32        19308
myzip/index.html                            2019-11-11 23:51:32         8161
myzip/post.html                             2019-11-11 23:51:32         9936

Create zip files

Using ZipFile.write() method, we can add files to a zip. In this example, we first create a list named files and stores the files to be zipped in it. Then we write the files as a zip file using the write() method.

from zipfile import ZipFile

# Files to compress
files = [r'E:\file1.txt', r'E:\file2.txt']
# Path to save the zip file
save_to = r"D:\compressed\myzipfile.zip"

with ZipFile('myzipfile.zip', 'w') as zip: 
    for file in files:
        zip.write(file)

We can change the compression algorithm by modifying the code as:

import zipfile
from zipfile import ZipFile

# Files to compress
files = [r'E:\file1.txt', r'E:\file2.txt']
# Path to save the zip file

with ZipFile('myzipfile.zip', 'w', compression= zipfile.ZIP_LZMA) as zip: 
    for file in files:
        zip.write(file)

To zip all files and sub-folders of a directory, us this method.

import os
import zipfile
from zipfile import ZipFile

# Files to compress
files = []
for _root, _directories, _files in os.walk(r'D:\filestozip'):
    for filename in _files: 
        # join the two strings in order to form the full filepath. 
        filepath = os.path.join(_root, filename) 
        files.append(filepath)
print(files)

# Path to save the zip file
save_to = r"D:\myzipfile.zip"

with ZipFile(save_to, 'w', compression= zipfile.ZIP_LZMA) as zip: 
    for file in files:
        zip.write(file)

Wrapping it up

In this post, we learned to create, extract and view contents of a zip file using Python's built in ZipFile module.