zlib

Zlib Compression interface.

The zlib module provides an API for the zlib library (http://www.zlib.org). It is used to compress and decompress data. The data format is described by RFCs 1950 to 1952.

A typical (compress) usage looks like:

Z = zlib:open(),
ok = zlib:deflateInit(Z,default),

Compress = fun(end_of_data, _Cont) -> [];
              (Data, Cont) ->
                 [zlib:deflate(Z, Data)|Cont(Read(),Cont)]
           end,
Compressed = Compress(Read(),Compress),
Last = zlib:deflate(Z, [], finish),
ok = zlib:deflateEnd(Z),
zlib:close(Z),
list_to_binary([Compressed|Last])
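
The fragment above leaves Read() undefined. As a minimal runnable sketch of the same pattern, the following module takes the chunks as a list instead; the module name zlib_chunked and the function compress_chunks/1 are illustrative assumptions, not part of zlib.

```erlang
-module(zlib_chunked).
-export([compress_chunks/1]).

%% Compress a list of iodata chunks through a single deflate stream.
%% The list stands in for the Read() callback of the fragment above.
compress_chunks(Chunks) ->
    Z = zlib:open(),
    try
        ok = zlib:deflateInit(Z, default),
        Body = [zlib:deflate(Z, Chunk) || Chunk <- Chunks],
        %% An empty input with finish flushes whatever is still pending.
        Last = zlib:deflate(Z, [], finish),
        ok = zlib:deflateEnd(Z),
        iolist_to_binary([Body, Last])
    after
        %% Close the stream even if one of the calls above fails.
        zlib:close(Z)
    end.
```

The result is an ordinary zlib stream, so uncompress/1 should be able to read it back.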

Any function in this module may raise an error, seen as {'EXIT',{Reason,Backtrace}} when trapped with catch, where Reason describes the error. Typical reasons are:

badarg

Bad argument

data_error

The data contains errors

stream_error

Inconsistent stream state

einval

Bad value or wrong function called

{need_dictionary,Adler32}

See inflate/2
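
As a sketch of how these errors can be handled, the wrapper below turns a raised error into a tagged tuple. The names zlib_errors and safe_uncompress/1 are hypothetical helpers introduced here for illustration only.

```erlang
-module(zlib_errors).
-export([safe_uncompress/1]).

%% Return {ok, Binary} on success, or {error, Reason} when zlib
%% raises an error such as the reasons listed above.
safe_uncompress(Data) ->
    try zlib:uncompress(Data) of
        Bin -> {ok, Bin}
    catch
        error:Reason -> {error, Reason}
    end.
```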

Types


zstream() = port()

A zlib stream, see open/0.

zlevel() =
            none | default | best_compression | best_speed | 0..9

zmemlevel() = 1..9

zmethod() = deflated

zstrategy() = default | filtered | huffman_only | rle

zwindowbits() = -15..-8 | 8..47

Normally in the range -15..-8 | 8..15.

Functions


open() -> zstream()

Open a zlib stream.

close(Z) -> ok

Closes the stream referenced by Z.

deflateInit(Z) -> ok

Same as zlib:deflateInit(Z, default).

deflateInit(Z, Level) -> ok

Initialize a zlib stream for compression.

Level decides the compression level to be used: 0 (none) gives no compression at all, 1 (best_speed) gives best speed, and 9 (best_compression) gives best compression.

deflateInit(Z, Level, Method, WindowBits, MemLevel, Strategy) ->
               ok

Initializes a zlib stream for compression.

The Level parameter decides the compression level to be used: 0 (none) gives no compression at all, 1 (best_speed) gives best speed, and 9 (best_compression) gives best compression.

The Method parameter decides which compression method to use. Currently the only supported method is deflated.

The WindowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8 through 15. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit/2 is used. A negative WindowBits value suppresses the zlib header (and checksum) from the stream. Note that the zlib source mentions this only as an undocumented feature.

The MemLevel parameter specifies how much memory should be allocated for the internal compression state. MemLevel=1 uses minimum memory but is slow and reduces compression ratio; MemLevel=9 uses maximum memory for optimal speed. The default value is 8.

The Strategy parameter is used to tune the compression algorithm. Use the value default for normal data, filtered for data produced by a filter (or predictor), huffman_only to force Huffman encoding only (no string match), or rle to limit match distances to one (run-length encoding). Filtered data consists mostly of small values with a somewhat random distribution. In this case, the compression algorithm is tuned to compress them better. The effect of filtered is to force more Huffman coding and less string matching; it is somewhat intermediate between default and huffman_only. rle is designed to be almost as fast as huffman_only, but to give better compression for PNG image data. The Strategy parameter affects only the compression ratio, not the correctness of the compressed output, even if it is not set appropriately.
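
To illustrate the parameters, here is a hedged sketch that produces a raw deflate stream by passing a negative WindowBits, and reads it back with inflateInit/2 using the same value. The module and function names (zlib_raw, raw_deflate/1, raw_inflate/1) are assumptions made for this example.

```erlang
-module(zlib_raw).
-export([raw_deflate/1, raw_inflate/1]).

%% Compress Data without the zlib header and checksum.
raw_deflate(Data) ->
    Z = zlib:open(),
    %% WindowBits = -15: largest window, header and checksum suppressed.
    %% MemLevel = 8 and Strategy = default are the documented defaults.
    ok = zlib:deflateInit(Z, best_compression, deflated, -15, 8, default),
    Out = zlib:deflate(Z, Data, finish),
    ok = zlib:deflateEnd(Z),
    ok = zlib:close(Z),
    iolist_to_binary(Out).

%% Decompress a raw deflate stream produced by raw_deflate/1.
raw_inflate(Data) ->
    Z = zlib:open(),
    %% The negative value tells zlib not to expect a header.
    ok = zlib:inflateInit(Z, -15),
    Out = zlib:inflate(Z, Data),
    ok = zlib:inflateEnd(Z),
    ok = zlib:close(Z),
    iolist_to_binary(Out).
```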

deflate(Z, Data) -> Compressed

  • Z = zstream()
  • Data = iodata()
  • Compressed = iolist()

Same as deflate(Z, Data, none).

deflate(Z, Data, Flush) -> Compressed

  • Z = zstream()
  • Data = iodata()
  • Flush = none | sync | full | finish
  • Compressed = iolist()

deflate/3 compresses as much data as possible, and stops when the input buffer becomes empty. It may introduce some output latency (reading input without producing any output) except when forced to flush.

If the parameter Flush is set to sync, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far. Flushing may degrade compression for some compression algorithms and so it should be used only when necessary.

If Flush is set to full, all output is flushed as with sync, and the compression state is reset so that decompression can restart from this point if previous compressed data has been damaged or if random access is desired. Using full too often can seriously degrade the compression.

If the parameter Flush is set to finish, pending input is processed, pending output is flushed and deflate/3 returns. Afterwards the only possible operations on the stream are deflateReset/1 or deflateEnd/1.

Flush can be set to finish immediately after deflateInit if all compression is to be done in one step.

 
zlib:deflateInit(Z),
B1 = zlib:deflate(Z,Data),
B2 = zlib:deflate(Z,<< >>,finish),
zlib:deflateEnd(Z),
list_to_binary([B1,B2])
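
The effect of Flush set to sync can be sketched as follows: the decompressor recovers everything written so far, although the stream has not been finished. The names zlib_flush and sync_roundtrip/1 are hypothetical and used only for this illustration.

```erlang
-module(zlib_flush).
-export([sync_roundtrip/1]).

%% Deflate Data with a sync flush and inflate the unfinished stream.
sync_roundtrip(Data) ->
    C = zlib:open(),
    D = zlib:open(),
    ok = zlib:deflateInit(C),
    ok = zlib:inflateInit(D),
    %% sync pushes all pending output out, aligned on a byte boundary.
    Part = zlib:deflate(C, Data, sync),
    Plain = iolist_to_binary(zlib:inflate(D, Part)),
    %% Finish the compressor so that deflateEnd/1 does not complain.
    _ = zlib:deflate(C, [], finish),
    ok = zlib:deflateEnd(C),
    ok = zlib:close(C),
    %% The decompressor never saw the end of the stream, so it is
    %% closed directly instead of calling inflateEnd/1.
    ok = zlib:close(D),
    Plain.
```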

deflateSetDictionary(Z, Dictionary) -> Adler32

  • Z = zstream()
  • Dictionary = iodata()
  • Adler32 = integer()

Initializes the compression dictionary from the given byte sequence without producing any compressed output. This function must be called immediately after deflateInit/[1|2|6] or deflateReset/1, before any call of deflate/3. The compressor and decompressor must use exactly the same dictionary (see inflateSetDictionary/2). The Adler-32 checksum of the dictionary is returned.

deflateReset(Z) -> ok

This function is equivalent to deflateEnd/1 followed by deflateInit/[1|2|6], but does not free and reallocate all the internal compression state. The stream will keep the same compression level and any other attributes.
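
A possible use of deflateReset/1 is compressing several independent items with one stream. The sketch below assumes this use case; zlib_reuse, compress_all/1 and compress_one/2 are illustrative names, not part of the module.

```erlang
-module(zlib_reuse).
-export([compress_all/1]).

%% Compress each item as its own zlib stream, reusing one zstream().
compress_all(Items) ->
    Z = zlib:open(),
    ok = zlib:deflateInit(Z, best_speed),
    Out = [compress_one(Z, Item) || Item <- Items],
    %% close/1 releases the stream and everything attached to it.
    ok = zlib:close(Z),
    Out.

compress_one(Z, Item) ->
    %% Start from a clean state; the level set by deflateInit/2 is kept.
    ok = zlib:deflateReset(Z),
    iolist_to_binary(zlib:deflate(Z, Item, finish)).
```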

deflateParams(Z, Level, Strategy) -> ok

Dynamically update the compression level and compression strategy. The interpretation of Level and Strategy is as in deflateInit/6. This can be used to switch between compression and straight copy of the input data, or to switch to a different kind of input data requiring a different strategy. If the compression level is changed, the input available so far is compressed with the old level (and may be flushed); the new level will take effect only at the next call of deflate/3.

Before the call of deflateParams, the stream state must be set as for a call of deflate/3, since the currently available input may have to be compressed and flushed.
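
As a rough sketch of switching between straight copy and compression, the function below stores a header at level none and then compresses the body at best_compression within the same stream. The names zlib_params and mixed/2 are assumptions for this example.

```erlang
-module(zlib_params).
-export([mixed/2]).

%% Store Header uncompressed, then compress Body, in one zlib stream.
mixed(Header, Body) ->
    Z = zlib:open(),
    ok = zlib:deflateInit(Z, none),
    A = zlib:deflate(Z, Header),
    %% Input seen so far is handled with the old level; the new
    %% level applies from the next call of deflate/3.
    ok = zlib:deflateParams(Z, best_compression, default),
    B = zlib:deflate(Z, Body, finish),
    ok = zlib:deflateEnd(Z),
    ok = zlib:close(Z),
    iolist_to_binary([A, B]).
```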

deflateEnd(Z) -> ok

Ends the deflate session and cleans up all data used. Note that this function will throw a data_error exception if the last call to deflate/3 was not called with Flush set to finish.

inflateInit(Z) -> ok

Initialize a zlib stream for decompression.

inflateInit(Z, WindowBits) -> ok

Initializes a decompression session on a zlib stream.

The WindowBits parameter is the base two logarithm of the maximum window size (the size of the history buffer). It should be in the range 8 through 15. The default value is 15 if inflateInit/1 is used. If a compressed stream with a larger window size is given as input, inflate/2 will throw the data_error exception. A negative WindowBits value makes zlib ignore the zlib header (and checksum) from the stream. Note that the zlib source mentions this only as an undocumented feature.

inflate(Z, Data) -> Decompressed

  • Z = zstream()
  • Data = iodata()
  • Decompressed = iolist()

inflate/2 decompresses as much data as possible. It may introduce some output latency (reading input without producing any output).

If a preset dictionary is needed at this point (see inflateSetDictionary/2), inflate/2 throws a {need_dictionary,Adler} exception where Adler is the Adler-32 checksum of the dictionary chosen by the compressor.

inflateChunk(Z, Data) -> Decompressed | {more, Decompressed}

  • Z = zstream()
  • Data = iodata()
  • Decompressed = iolist()

Like inflate/2, but decompresses no more data than will fit in the buffer configured via setBufSize/2. It is useful when decompressing a stream with a high compression ratio, where a small amount of compressed input may expand up to 1000 times. It returns {more, Decompressed} when there is more output available, in which case inflateChunk/1 should be used to read it. It may introduce some output latency (reading input without producing any output).

If a preset dictionary is needed at this point (see inflateSetDictionary below), inflateChunk/2 throws a {need_dictionary,Adler} exception where Adler is the adler32 checksum of the dictionary chosen by the compressor.

walk(Compressed, Handler) ->
    Z = zlib:open(),
    zlib:inflateInit(Z),
    % Limit the size of a single uncompressed chunk to 512 kB
    zlib:setBufSize(Z, 512 * 1024),
    loop(Z, Handler, zlib:inflateChunk(Z, Compressed)),
    zlib:inflateEnd(Z),
    zlib:close(Z).

% More output is pending: hand over this chunk and fetch the next one
loop(Z, Handler, {more, Uncompressed}) ->
    Handler(Uncompressed),
    loop(Z, Handler, zlib:inflateChunk(Z));
% Last chunk: nothing more to read from the stream
loop(_Z, Handler, Uncompressed) ->
    Handler(Uncompressed).

inflateChunk(Z) -> Decompressed | {more, Decompressed}

Reads the next chunk of uncompressed data from a decompression started by inflateChunk/2.

Call this function repeatedly for as long as it returns {more, Decompressed}.

inflateSetDictionary(Z, Dictionary) -> ok

Initializes the decompression dictionary from the given uncompressed byte sequence. This function must be called immediately after a call of inflate/2 if this call threw a {need_dictionary,Adler} exception. The dictionary chosen by the compressor can be determined from the Adler value thrown by the call to inflate/2. The compressor and decompressor must use exactly the same dictionary (see deflateSetDictionary/2).

Example:

unpack(Z, Compressed, Dict) ->
    case catch zlib:inflate(Z, Compressed) of
        {'EXIT',{{need_dictionary,_DictID},_}} ->
            % Supply the dictionary, then resume with no new input
            zlib:inflateSetDictionary(Z, Dict),
            zlib:inflate(Z, []);
        Uncompressed ->
            Uncompressed
    end.
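
For a complete picture, the following sketch pairs deflateSetDictionary/2 with inflateSetDictionary/2 in a full round trip. The module zlib_dict and its functions pack/2 and unpack/2 are hypothetical names chosen for this example, and it assumes the need_dictionary condition is raised as an error, as described at the top of this page.

```erlang
-module(zlib_dict).
-export([pack/2, unpack/2]).

%% Compress Data using Dict as the preset dictionary.
pack(Data, Dict) ->
    Z = zlib:open(),
    ok = zlib:deflateInit(Z),
    %% Must come right after deflateInit, before any deflate call.
    _Adler = zlib:deflateSetDictionary(Z, Dict),
    Out = zlib:deflate(Z, Data, finish),
    ok = zlib:deflateEnd(Z),
    ok = zlib:close(Z),
    iolist_to_binary(Out).

%% Decompress data produced by pack/2 with the same Dict.
unpack(Compressed, Dict) ->
    Z = zlib:open(),
    ok = zlib:inflateInit(Z),
    Out = try zlib:inflate(Z, Compressed)
          catch
              error:{need_dictionary, _Adler} ->
                  ok = zlib:inflateSetDictionary(Z, Dict),
                  %% Resume decompression with no new input.
                  zlib:inflate(Z, [])
          end,
    ok = zlib:inflateEnd(Z),
    ok = zlib:close(Z),
    iolist_to_binary(Out).
```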

inflateReset(Z) -> ok

This function is equivalent to inflateEnd/1 followed by inflateInit/1, but does not free and reallocate all the internal decompression state. The stream will keep attributes that may have been set by inflateInit/[1|2].

inflateEnd(Z) -> ok

Ends the inflate session and cleans up all data used. Note that this function will throw a data_error exception if no end of stream was found (meaning that not all data has been uncompressed).

setBufSize(Z, Size) -> ok

Sets the intermediate buffer size.

getBufSize(Z) -> Size

Gets the size of the intermediate buffer.

crc32(Z) -> CRC

Get the current calculated CRC checksum.

crc32(Z, Data) -> CRC

Calculate the CRC checksum for Data.

crc32(Z, PrevCRC, Data) -> CRC

  • Z = zstream()
  • PrevCRC = integer()
  • Data = iodata()
  • CRC = integer()

Update a running CRC checksum for Data. If Data is the empty binary or the empty iolist, this function returns the required initial value for the CRC.

Crc = lists:foldl(fun(Data,Crc0) ->
                      zlib:crc32(Z, Crc0, Data)
                  end, zlib:crc32(Z,<< >>), Datas)

crc32_combine(Z, CRC1, CRC2, Size2) -> CRC

  • Z = zstream()
  • CRC = CRC1 = CRC2 = Size2 = integer()

Combine two CRC checksums into one. For two binaries or iolists Data1 and Data2, with sizes Size1 and Size2 and CRC checksums CRC1 and CRC2, crc32_combine/4 returns the CRC checksum of [Data1,Data2], requiring only CRC1, CRC2, and Size2.

adler32(Z, Data) -> CheckSum

  • Z = zstream()
  • Data = iodata()
  • CheckSum = integer()

Calculate the Adler-32 checksum for Data.

adler32(Z, PrevAdler, Data) -> CheckSum

  • Z = zstream()
  • PrevAdler = integer()
  • Data = iodata()
  • CheckSum = integer()

Update a running Adler-32 checksum for Data. If Data is the empty binary or the empty iolist, this function returns the required initial value for the checksum.

Adler = lists:foldl(fun(Data,Adler0) ->
                        zlib:adler32(Z, Adler0, Data)
                    end, zlib:adler32(Z,<< >>), Datas)

adler32_combine(Z, Adler1, Adler2, Size2) -> Adler

  • Z = zstream()
  • Adler = Adler1 = Adler2 = Size2 = integer()

Combine two Adler-32 checksums into one. For two binaries or iolists Data1 and Data2, with sizes Size1 and Size2 and Adler-32 checksums Adler1 and Adler2, adler32_combine/4 returns the Adler-32 checksum of [Data1,Data2], requiring only Adler1, Adler2, and Size2.

compress(Data) -> Compressed

  • Data = iodata()
  • Compressed = binary()

Compress data (with zlib headers and checksum).

uncompress(Data) -> Decompressed

  • Data = iodata()
  • Decompressed = binary()

Uncompress data (with zlib headers and checksum).

zip(Data) -> Compressed

  • Data = iodata()
  • Compressed = binary()

Compress data (without zlib headers and checksum).

unzip(Data) -> Decompressed

  • Data = iodata()
  • Decompressed = binary()

Uncompress data (without zlib headers and checksum).

gzip(Data) -> Compressed

  • Data = iodata()
  • Compressed = binary()

Compress data (with gz headers and checksum).

gunzip(Data) -> Decompressed

  • Data = iodata()
  • Decompressed = binary()

Uncompress data (with gz headers and checksum).
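
The three one-shot pairs above can be exercised together. This is only a sketch; zlib_oneshot and roundtrips/1 are illustrative names that do not exist in zlib.

```erlang
-module(zlib_oneshot).
-export([roundtrips/1]).

%% True if Data survives each one-shot compress/decompress pair.
roundtrips(Data) ->
    Bin = iolist_to_binary(Data),
    %% zlib headers and checksum
    Bin =:= zlib:uncompress(zlib:compress(Data)) andalso
    %% no headers, no checksum
    Bin =:= zlib:unzip(zlib:zip(Data)) andalso
    %% gz headers and checksum
    Bin =:= zlib:gunzip(zlib:gzip(Data)).
```

The three formats are not interchangeable: data produced by one function should be read back with its matching counterpart.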