Applications 31 – BZip2


Jarret W. Buse

Many compression utilities exist for Linux. The “bzip2” utility is a block-sorting file compressor, and it may be a topic you need to know for the Linux+ exam.

Block-sorting compression rearranges the characters in a file so that identical characters end up grouped together in longer runs. These runs allow for better compression. The rearrangement is reversible, so the original file can be reconstructed when it is uncompressed.
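The rearranging step can be sketched in a few lines of Python. This is a naive illustration of the block-sorting idea (the Burrows-Wheeler transform), not the code “bzip2” actually runs. The function names and the “$” end marker are assumptions made for this example.

```python
# Naive block-sorting (Burrows-Wheeler) sketch, for illustration only.
# Assumption: "$" marks the end of the text and never occurs in the input.
END = "$"

def bwt(text):
    """Return the last column of the sorted rotations of text + END."""
    s = text + END
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)

def inverse_bwt(last_column):
    """Rebuild the original text by repeatedly prepending and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(ch + row for ch, row in zip(last_column, table))
    # The row that ends with the marker is the original text + END.
    original = next(row for row in table if row.endswith(END))
    return original[:-1]

if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # identical letters end up next to each other
    print(inverse_bwt(transformed))  # the original text comes back
```

Sorting every rotation is far too slow for real files, but it shows why the step is reversible and why identical characters cluster together.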

The “bzip2” program compresses a single file and keeps the filename, but adds the “.bz2” file extension. To compress several files together, first join them with another program such as “tar”, then compress the resulting archive with “bzip2”.
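The same join-then-compress idea can be tried from Python, whose standard “tarfile” module can write a tar archive through bzip2 compression in one step. This is a minimal sketch that does not call the “bzip2” command itself. The helper names and the file names are made up for the example.

```python
import os
import tarfile
import tempfile

def bundle_and_compress(paths, archive_path):
    """Join several files into one tar archive compressed with bzip2."""
    with tarfile.open(archive_path, "w:bz2") as archive:
        for path in paths:
            # Store only the file name, not the full temporary path.
            archive.add(path, arcname=os.path.basename(path))
    return archive_path

def list_archive(archive_path):
    """Return the member names stored in a bzip2-compressed tar archive."""
    with tarfile.open(archive_path, "r:bz2") as archive:
        return archive.getnames()

def demo():
    """Create two small files in a temporary directory and bundle them."""
    with tempfile.TemporaryDirectory() as workdir:
        paths = []
        for name in ("notes.txt", "data.db"):
            path = os.path.join(workdir, name)
            with open(path, "w") as handle:
                handle.write("example content for " + name + "\n")
            paths.append(path)
        archive = bundle_and_compress(paths, os.path.join(workdir, "backup.tar.bz2"))
        return list_archive(archive)

if __name__ == "__main__":
    print(demo())
```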

The syntax for “bzip2” usage:

bzip2 [flags and input files in any order]

The various flags (parameters) follow with examples.

  • -h (--help) – print basic help information about parameters

The “-h” parameter needs no file names or other parameters. Any extra information entered will be ignored. The command to get help for bzip2 is “bzip2 -h”.

  • -d (--decompress) – force decompression

If no parameters are specified, “bzip2” will compress the specified files by default. To decompress files, you must specify the “-d” option. To decompress by default, use the command “bunzip2”, which accepts the same parameters as “bzip2”.

To uncompress an existing “bzip2” file called “compression.bz2” the command would be “bzip2 -d compression.bz2”.

  • -z (--compress) – force compression

To compress files, use no parameter or the “-z” parameter. It may be best to always use the “-z” option when compressing files, to be sure that compression is what will be performed.

The “-z” option is needed most with the command “bunzip2”, where it overrides the default of decompressing. To compress a database called “data.db”, the command would be “bunzip2 -z data.db”.

  • -k (--keep) – keep (don't delete) input files

By default, the original input files are automatically deleted after the compression is completed. If you want to keep the original files intact, use the “-k” option.

To compress the database called “data.db” and keep the database intact after the compression, the command would be “bzip2 -k data.db”.
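The delete-or-keep behaviour can be imitated with Python's standard “bz2” module. The helper “compress_file” and its “keep” argument are hypothetical names for this sketch and are not part of “bzip2”. Unlike the real program, this sketch silently overwrites an existing “.bz2” file.

```python
import bz2
import os
import shutil
import tempfile

def compress_file(path, keep=False):
    """Write path + '.bz2'; remove the original unless keep is True."""
    target = path + ".bz2"
    with open(path, "rb") as source, bz2.open(target, "wb") as compressed:
        shutil.copyfileobj(source, compressed)
    if not keep:
        os.remove(path)  # mirrors the default behaviour described above
    return target

def demo():
    """Compress the same file with and without keeping the original."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.db")
        with open(path, "wb") as handle:
            handle.write(b"record\n" * 500)
        compress_file(path, keep=True)
        kept = os.path.exists(path)          # original still present
        compress_file(path, keep=False)
        removed = not os.path.exists(path)   # original now gone
        return kept, removed

if __name__ == "__main__":
    print(demo())
```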

  • -f (--force) – overwrite existing output files

When the output file already exists, “bzip2” does not overwrite it. To overwrite the existing file, use the “-f” option. When the “-f” option is used, the user will not be prompted before the existing file is overwritten.

The option also works with the “-d” option or with “bunzip2”. To extract the database from “data.db.bz2” and force the overwriting of the existing database, the command would be “bzip2 -df data.db.bz2”.

  • -t (--test) – test compressed file integrity

To verify that a compressed file is correct and has not been corrupted, its integrity can be tested. When compressing critical files, use the “-t” option afterwards to verify the “.bz2” file was created correctly.

NOTE: The “-t” parameter can only be used when specifying existing “bzip2” files.

For any file which does not verify properly, you can use the “bzip2recover” program to attempt to recover data from undamaged sections of the corrupted file. You may not be able to recover a whole file, but portions of the data may be recovered.

To verify the “data.db.bz2” file, the command would be “bzip2 -tv data.db.bz2”. The “-v” option is included to show the message “ok” if the file is not corrupted. Without the “-v” option, no message is shown unless an error is detected.
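An integrity test amounts to decompressing the data and discarding the result, watching only for errors. The sketch below does this with Python's standard “bz2” module rather than with “bzip2 -t” itself. The function name “is_intact” is an assumption for the example.

```python
import bz2

def is_intact(compressed_bytes):
    """Return True if the data decompresses cleanly, otherwise False."""
    try:
        bz2.decompress(compressed_bytes)
    except (OSError, ValueError):
        # OSError: invalid or damaged stream; ValueError: stream cut short.
        return False
    return True

def demo():
    """Check a good stream, a truncated one, and one with a flipped byte."""
    good = bz2.compress(b"critical data\n" * 1000)
    truncated = good[: len(good) // 2]
    damaged = bytearray(good)
    damaged[len(damaged) // 2] ^= 0xFF  # flip every bit of one byte
    return is_intact(good), is_intact(truncated), is_intact(bytes(damaged))

if __name__ == "__main__":
    print(demo())
```

Like “-t”, the check only reports whether the data is intact. It does not try to repair anything.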

  • -c (--stdout) – output to standard out

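The “-c” option writes the output to standard output instead of to a file. Combined with “-d”, it displays the uncompressed contents of a “bzip2” file and leaves the compressed file in place. To easily display the contents, use the “bzcat” command, which is equivalent to “bzip2 -dc” and accepts the same options.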

To see the contents of the “data.db.bz2” file, the command would be “bzcat data.db.bz2”.
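For comparison, here is a rough Python equivalent of what “bzcat” does, built on the standard “bz2” module. It assumes the compressed file holds UTF-8 text. The helper name “cat_bz2” is hypothetical.

```python
import bz2
import io
import os
import sys
import tempfile

def cat_bz2(path, out=None):
    """Write the decompressed text of a .bz2 file to out (stdout by default)."""
    out = sys.stdout if out is None else out
    with bz2.open(path, "rt", encoding="utf-8") as compressed:
        for line in compressed:
            out.write(line)

def demo():
    """Create a small .bz2 file, read it back, and confirm it still exists."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.db.bz2")
        with bz2.open(path, "wt", encoding="utf-8") as handle:
            handle.write("first line\nsecond line\n")
        buffer = io.StringIO()
        cat_bz2(path, out=buffer)
        return buffer.getvalue(), os.path.exists(path)

if __name__ == "__main__":
    print(demo())
```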

  • -q (--quiet) – suppress noncritical error messages

Performs the requested operation, but suppresses noncritical warning messages. Critical errors, such as I/O errors, will still be shown.

To compress a document called “Article2.doc” and prevent noncritical errors from being shown the command would be “bzip2 -q Article2.doc”.

  • -v (--verbose) – be verbose (a 2nd -v gives more)

If more details are needed when performing a task with bzip2, use the “-v” option. For even more detail, use the option “-vv”. These options can be used with any other parameters (other than “-h”, “-L”, “-V”) to provide extra information.

As in the “-t” example, the command “bzip2 -tv data.db.bz2” shows the message “ok” if the file is not corrupted, whereas without “-v” no message is shown unless an error is detected. The parameter “-vv” can be used to cause more output to be displayed. In this case, the message “[1: huff+mtf rt+rld]” could also be shown to display the type of compression.

  • -L (--license) – display software version & license

The “-L” option shows version and license information about the “bzip2” application. Any other parameters will be ignored.

Sample output for the command, “bzip2 -L”, is as follows:

bzip2, a block-sorting file compressor. Version 1.0.6, 6-Sept-2010.

Copyright (C) 1996-2010 by Julian Seward.

This program is free software; you can redistribute it and/or modify
it under the terms set out in the LICENSE file, which is included
in the bzip2-1.0.6 source distribution.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
LICENSE file for more details.

  • -V (--version) – display software version & license

The “-V” option displays the same details as when using the “-L” option.

The command is “bzip2 -V”.

  • -s (--small) – use less memory (at most 2500k)

For systems with a lower amount of Random Access Memory (RAM), “bzip2” can use less memory when compressing or uncompressing. The option can help reduce overhead on a system which may be strained for resources, especially for files with larger block sizes.

To compress the database called “data.db” and use less memory, the command would be “bzip2 -s data.db”.

  • -1 to -9 – set block size to 100k .. 900k

Block size can be declared by setting it at 100,000 to 900,000 bytes. The options are used only for compressing files since the block size is set in the compressed header. When uncompressing a file, the block size is read from the header and any parameter set for the block size is ignored.

The larger the block size of a compressed file being uncompressed, the more memory will be used.

Basically, the larger the file, the larger the block size you should use. Using a large block size on a small file will not make the compressed file any smaller.
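The header behaviour can be checked with Python's standard “bz2” module, whose “compresslevel” argument corresponds to the “-1” to “-9” settings. A bzip2 stream starts with the characters “BZh” followed by the block-size digit. The helper name “block_size_digit” is made up for this sketch.

```python
import bz2

def block_size_digit(compressed_bytes):
    """Return the block-size digit stored in the stream header ('BZh' + digit)."""
    if compressed_bytes[:3] != b"BZh":
        raise ValueError("not a bzip2 stream")
    return int(chr(compressed_bytes[3]))

def demo():
    """Compress a small input at levels 1 and 9 and inspect the results."""
    small = b"a small file\n" * 100  # far below 100,000 bytes
    results = {}
    for level in (1, 9):
        packed = bz2.compress(small, compresslevel=level)
        # No level is given when decompressing; it is read from the header.
        restored = bz2.decompress(packed)
        results[level] = (block_size_digit(packed), len(packed), restored == small)
    return results

if __name__ == "__main__":
    for level, info in demo().items():
        print(level, info)
```

Because this input fits inside a single block at either setting, both runs should produce output of the same size, which is the point made above about small files.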

  • --fast – alias for -1

If the “--fast” option is used, it is the same as using the “-1” block size. Because the block size is small, less memory is used. Despite the name, the gain in speed is usually modest.

To compress the database called “data.db” with a block size of 100,000 bytes, the command would be “bzip2 --fast data.db”.

  • --best – alias for -9

The “--best” option is the same as the “-9” option. Saying it is “best” may not be truly appropriate in every situation. For a better explanation, look back at the “-1 to -9” options.

To compress the database called “data.db” with a block size of 900,000 bytes, the command would be “bzip2 --best data.db”.

When using the command, keep in mind that “bzip2” compresses a file by default. To uncompress a file by default, use the command “bunzip2”, which uses the same options as “bzip2”. To show the contents of a “bzip2” file on stdout, use the command “bzcat”.

Try the various parameters and be aware of how each one works. Get comfortable with the program so that you understand how compressing and uncompressing files works.

