Unsupported ZIP compression method (deflation-64-bit) #88

@EarlGlynn

Reproducible example ...

library(archive)
sourceName <- "D:/IRS/IRS990/XML/2021_TEOS_XML_01A.zip"
expandDir  <- "C:/Users/efg/Desktop/Temp/"
xmlFilename <- archive_extract(sourceName, expandDir)
Error: archive_extract.cpp:21 archive_read_data_block(): **Unsupported ZIP compression method (deflation-64-bit)**
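
The "deflation-64-bit" in the message corresponds to compression method 9 (Deflate64) in the ZIP format, as opposed to method 8 (ordinary Deflate). A minimal sketch for checking this in base R follows; `zip_first_method` is a hypothetical helper name, it inspects only the first entry, and it assumes the file starts with a local file header (an empty or self-extracting archive would not).

```r
# Hypothetical helper: return the compression method of the first ZIP entry.
# A local file header starts with the signature 50 4B 03 04, followed by
# version (2 bytes), flags (2 bytes), then the compression method (2 bytes).
zip_first_method <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 10L)
  signature <- as.raw(c(0x50, 0x4b, 0x03, 0x04))
  if (length(header) < 10L || !identical(header[1:4], signature)) {
    stop("file does not start with a ZIP local file header: ", path)
  }
  # The method is a two-byte little-endian integer (low byte first).
  as.integer(header[9]) + 256L * as.integer(header[10])
}
```

Calling `zip_first_method(sourceName)` on an archive would give 0 for stored entries, 8 for Deflate, and 9 for Deflate64.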

You can download 2021_TEOS_XML_01A.zip (465 MB) from
https://apps.irs.gov/pub/epostcard/990/xml/2021/2021_TEOS_XML_01A.zip

The .zip file is from this page:
https://www.irs.gov/charities-non-profits/form-990-series-downloads
Form 990 Series (e-file) XML format, 2021 files

Windows 10 shows 80,000 files extracted from this zip. Unzipping in Windows is quite slow and appears to run in the background, but it eventually extracts all 80,000 files.

7-Zip 23.01 (x64) (https://www.7-zip.org/) uses multiple threads and extracts the 80,000 files in about two minutes.

The IRS introduced these new "TEOS_XML" zip files for years 2021-2023 about a month ago.

All the other .zip files on that page (XML format, years 2015-2019) can be processed with the base R unzip function (utils::unzip). These new "TEOS_XML" zip files fail with unzip, which gives the warning: Warning: internal error in 'unz' code

I really want to use unzip or archive_extract from a Posit Notebook if possible.
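
In the meantime, one possible stopgap is to shell out to 7-Zip from R. This is only a sketch under assumptions: `sevenzip_args` and `extract_with_7z` are hypothetical helper names, and the 7-Zip command-line executable is assumed to be on the PATH as `7z` (otherwise its full path would have to be passed as `exe`).

```r
# Hypothetical helper: build the arguments for "7z x <zip> -o<dir> -y".
# "x" extracts with full paths, "-o" sets the output directory (no space
# before the path), and "-y" answers yes to any overwrite prompts.
sevenzip_args <- function(zipfile, exdir) {
  c("x", shQuote(zipfile), paste0("-o", shQuote(exdir)), "-y")
}

# Hypothetical helper: run 7-Zip and return the extracted file names.
# Assumes the executable named by `exe` can be found on the PATH.
extract_with_7z <- function(zipfile, exdir, exe = "7z") {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  status <- system2(exe, sevenzip_args(zipfile, exdir))
  if (status != 0) {
    stop("7-Zip exited with status ", status)
  }
  list.files(exdir, recursive = TRUE, full.names = TRUE)
}
```

With the paths from the example above this would be called as `xmlFilename <- extract_with_7z(sourceName, expandDir)`. It is not a substitute for Deflate64 support in unzip or archive_extract, since it depends on an external program being installed.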
