ZIP2

Overview

This is a project to develop a portable archive format that addresses the limitations of the popular ZIP file format.

The folks at Info-ZIP list a number of requirements for a new file format, and end with "we're not volunteering". They write:

Note that there are other limitations of the zipfile format beyond its dependence on 16-bit and 32-bit fields. These include its weak encryption, poorly designed "extra field" capability, poorly designed "multi-disk" support, mediocre robustness, support for streaming encoders only as an afterthought, low-precision timestamps (two-second granularity), lack of cross-timezone support (i.e., Universal Time) except as a third-party add-on, lack of any support for "solid" packing of small files, lack of support for alternate character sets (e.g., EBCDIC) and encodings of international characters (e.g., UTF-8), unnecessary redundancy, and merely acceptable (but no longer outstanding) compression efficiency. All of these things suggest that it's time for a brand-new format, not just a few more patches on an aging standard. (And no, we're not volunteering, either--although we have discussed what we consider some of the requirements for such a new format.)
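
To make two of those limitations concrete, here is a minimal Python sketch of the classic ZIP timestamp packing (the MS-DOS date and time words) and of the ceilings implied by 16-bit and 32-bit fields. It is an illustration only: the function and constant names are my own, it is not taken from the ZIP2 reference implementation (which is in Perl), and it says nothing about how ZIP2 itself stores times.

```python
from datetime import datetime

# Ceilings implied by classic ZIP's fixed-width fields.
MAX_ENTRIES_16 = 0xFFFF        # 16-bit entry count: 65,535 entries
MAX_SIZE_32 = 0xFFFFFFFF       # 32-bit size field: just under 4 GiB


def to_dos_time(dt):
    """Pack a datetime into the 16-bit MS-DOS date and time words."""
    dos_date = ((dt.year - 1980) << 9) | (dt.month << 5) | dt.day
    # Only 5 bits are left for seconds, so they are stored divided by two.
    dos_time = (dt.hour << 11) | (dt.minute << 5) | (dt.second // 2)
    return dos_date, dos_time


def from_dos_time(dos_date, dos_time):
    """Unpack the two 16-bit words back into a datetime."""
    return datetime(
        1980 + (dos_date >> 9),
        (dos_date >> 5) & 0x0F,
        dos_date & 0x1F,
        dos_time >> 11,
        (dos_time >> 5) & 0x3F,
        (dos_time & 0x1F) * 2,
    )


original = datetime(2003, 6, 15, 12, 30, 37)
restored = from_dos_time(*to_dos_time(original))
print(original.time(), "->", restored.time())  # 12:30:37 -> 12:30:36
```

The odd second is lost on the round trip, and the value carries no timezone information at all, which is the "two-second granularity" and "lack of cross-timezone support" complaint in the passage above.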

So I stepped up to the plate, and here is my serious attempt. The ZIP2 specification addresses all of these limitations and more.

My original motivation was international filename support, problems with spanning archives, and limitations in extensions. But once I decided to define a new format, I took on many other features and issues.

Here is a short list of design points. See the documentation for a complete discussion.

Deployment

If anyone is going to use this, deployment issues need to be considered as well as technical design issues.

Available Everywhere

A portable reference implementation should be able to work on any machine. Specifically, everything in it should be available for Windows, Unix, and Macintosh. I’m inspired by the work done in creating the PNG file format, which was designed by engineers who wrote working code to make sure the design was sound. I’m all too familiar with standards filled with things that are difficult to implement!

The reference implementation is in Perl, and contains library functions for manipulating the file format records on a primitive level, as well as higher-level semantic functions and finally a full-blown user application. It is useful as a toolkit for experimenting with the file format and extensions, not just as a canned application.

Rather than a single command-line tool, the reference implementation will be designed as a library, so it may be incorporated into front-end programs written by others. Anyone who has an archive product that supports multiple archive formats can then add ZIP2 support very easily.

I’d like to have a C++ implementation, too, but I’m not planning to develop another implementation until the first one is working nicely and the design is thought to be stable. I do envision that a library (.dll or .a file) will be welcomed by those who make fancy front-end programs that manipulate a variety of archive formats, and will also be useful to those who want to use the more primitive Chunk library from an application.

Self-Extracting Archive

Self-extracting files can be sent to people long before ZIP2 extractors catch on. Someone who gets the file won't care how it's put together, since it’s self-contained.

Self-Advertising

The header of a ZIP2 file points to this website, so someone discovering one can figure out what to do with it.
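
As a rough illustration of why this helps, here is a small Python sketch of how someone (or some tool) might pull a URL out of the leading bytes of an unfamiliar file. Everything in it is an assumption for the sake of the example: the function name, the 256-byte scan limit, and especially the sample bytes, which are made up and do not represent the actual ZIP2 header layout (see the file format reference for that). The URL is simply the home address given at the bottom of this page.

```python
import re

# Matches an http(s) URL made of printable, non-space ASCII characters.
URL_PATTERN = re.compile(rb"https?://[\x21-\x7e]+")


def find_url_in_header(data, limit=256):
    """Return the first URL found in the leading bytes, or None."""
    match = URL_PATTERN.search(data[:limit])
    return match.group(0).decode("ascii") if match else None


# Made-up sample bytes for illustration only; NOT the real ZIP2 header layout.
sample = b"\x00\x01HYPOTHETICAL-HEADER see http://www.dlugosz.com\x00\x00\x10\x20payload"
print(find_url_in_header(sample))  # http://www.dlugosz.com
```

The point is only that a plain-text pointer near the start of a file is easy to find with generic tools, even by someone who has never heard of the format.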

ZIP2 File Format Reference



Page content copyright 2003 by John M. Dlugosz. Home: http://www.dlugosz.com, email: john@dlugosz.com