Applications 34 – XZ Compression – Longer Options Part Two

Jarret W. Buse · Sep 14, 2015

The “xz” compression utility has longer options aimed at more advanced users. You may rarely need them, but it is worth knowing that they exist. This is the second article on the longer options.

The syntax for “xz” usage:

xz [parameters] [files]

The various parameters follow with examples.

--memlimit-compress=LIMIT

To set the memory limit for compression, use the “--memlimit-compress” option. The limit is given in bytes, optionally with a suffix such as KiB, MiB or GiB, or as a percentage of total memory (%). To reset the value back to the default, set the limit to “0”. This option is most useful when compressing a large file on a system with limited resources. To compress the “database.db” file using no more than 60% of the RAM, use the command “xz --memlimit-compress=60% database.db”.
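The two ways of writing the limit can be tried on a throwaway file. This is only a sketch: the temporary directory, the generated test data and the “--keep” and “--force” flags are additions for the demo, not part of the option being described.

```shell
# Build a sample file in a temporary directory so no real data is touched.
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Limit given as a percentage of RAM; --keep leaves the input file in place.
xz --keep --memlimit-compress=60% "$workdir/database.db"

# Limit given as an absolute size; --force overwrites the .xz written above.
xz --keep --force --memlimit-compress=200MiB "$workdir/database.db"

ls -l "$workdir"
```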

-M (--memlimit=LIMIT) - set memory usage limit for compression, decompression, or both; LIMIT is in bytes, % of RAM or 0 for defaults.

To set the memory limit for compression, decompression or both, use the “--memlimit” option. The limit is given in bytes, optionally with a suffix such as KiB, MiB or GiB, or as a percentage of total memory (%). To reset the value back to the default, set the limit to “0”. The option is most useful when compressing or uncompressing a large file on a system with limited resources. To uncompress the “database.db.xz” file using no more than 60% of the RAM, use the command “xz -d --memlimit=60% database.db.xz”.
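A quick way to see the limit applied to decompression is a round trip on a test file. The sketch below assumes a generated sample file and uses “--stdout” and “cmp”, which are additions for the demo.

```shell
# Create and compress a sample file.
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"
xz --keep "$workdir/database.db"

# Decompress to stdout with a 60% memory cap and compare with the original.
if xz -d --stdout --memlimit=60% "$workdir/database.db.xz" | cmp -s - "$workdir/database.db"; then
    echo "round trip OK"
fi
```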

--no-adjust - if compression settings exceed the memory usage limit, give an error instead of adjusting the settings downwards

Normally, when the compression settings need more memory than the limit allows, “xz” scales the settings down (for example, by using a smaller dictionary) until they fit. The “--no-adjust” option prevents this: if the memory limit is not enough, an error is generated and the compression stops. To compress the “database.db” file using no more than 80% of the RAM without letting “xz” lower the settings, use the command “xz --memlimit-compress=80% --no-adjust database.db”.
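The difference is easiest to see with a limit that is deliberately too low. In this sketch the 50 MiB limit, the “-9” preset and the file names are choices made for the demo; preset 9 normally wants several hundred MiB.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Without --no-adjust, xz shrinks the settings until they fit in 50 MiB.
xz -9 --keep --memlimit-compress=50MiB "$workdir/database.db"
mv "$workdir/database.db.xz" "$workdir/adjusted.xz"

# With --no-adjust, the same request is refused and nothing is written.
if xz -9 --keep --memlimit-compress=50MiB --no-adjust "$workdir/database.db"; then
    echo "compressed without adjustment"
else
    echo "xz refused: the limit is too low for preset 9"
fi
```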

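--lzma1[=OPTS] - LZMA1, OPTS is a comma-separated list of the options described below

Adds the LZMA1 filter to the filter chain. LZMA1 is a legacy filter that is used almost exclusively with the older “.lzma” file format, which supports only LZMA1; the “.xz” format does not support LZMA1 at all.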

--lzma2[=OPTS] - LZMA2, OPTS is a comma-separated list of the options described below

Adds the LZMA2 filter to the filter chain. LZMA2 is the filter used by the “.xz” format. It is an updated version of LZMA1 that fixes some practical issues; the speed and compression ratio of the two are practically the same.
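As a rough illustration, the same sample file can be written once with each filter. The option values here (preset 6 with a 1 MiB dictionary) are arbitrary picks for the demo, and “--format=lzma” is included because LZMA1 needs the legacy container.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# .xz container: start from preset 6, then override the dictionary size.
xz --keep --lzma2=preset=6,dict=1MiB "$workdir/database.db"

# Legacy .lzma container: --format=lzma together with the LZMA1 filter.
xz --keep --format=lzma --lzma1=preset=6,dict=1MiB "$workdir/database.db"

ls -l "$workdir"
```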

- preset=PRE - reset options to a preset (0-9[e])

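The “preset” option resets all of the LZMA options to the values of a preset, so it is normally given first in the list. A preset is a digit from 0 to 9, optionally followed by “e” for the slower “extreme” variant. Each preset determines the Dictionary Size, Compression Speed, Compressor Memory and Decompressor Memory. The default preset is “6”. The presets are as follows: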

Code:

Preset   DictSize   CompCPU   CompMem   DecMem
-0       256 KiB    0         3 MiB     1 MiB
-1       1 MiB      1         9 MiB     2 MiB
-2       2 MiB      2         17 MiB    3 MiB
-3       4 MiB      3         32 MiB    5 MiB
-4       4 MiB      4         48 MiB    5 MiB
-5       8 MiB      5         94 MiB    9 MiB
-6       8 MiB      6         94 MiB    9 MiB
-7       16 MiB     6         186 MiB   17 MiB
-8       32 MiB     6         370 MiB   33 MiB
-9       64 MiB     6         674 MiB   65 MiB
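To get a feel for the presets, you can compress one file with several of them and compare the sizes. This is a small sketch on generated data; the presets chosen for the loop are arbitrary, and the numbers you get depend entirely on the input.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Compress to stdout with a few presets and count the resulting bytes.
for preset in 0 3 6 6e; do
    size=$(xz --stdout --lzma2=preset="$preset" "$workdir/database.db" | wc -c)
    echo "preset $preset: $size bytes"
done
```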

- dict=NUM - dictionary size (4KiB - 1536MiB; 8MiB)

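The “dict” option sets the Dictionary Size, which defaults to 8 MiB. The dictionary is the history buffer of recently processed uncompressed data in which the encoder searches for repeated sequences. A larger dictionary can improve compression of large files, but it also raises memory usage for both compression and decompression, and a dictionary larger than the uncompressed file is a waste of memory.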
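The effect of the dictionary size shows up clearly when the only repetition in a file lies far back. The sketch below builds such a file from random data; the block size and the two dictionary sizes are assumptions picked for the demo.

```shell
workdir=$(mktemp -d)

# Two identical 200 KB blocks of random data: the only redundancy is the
# second copy, which starts 200 KB after the first.
head -c 200000 /dev/urandom > "$workdir/block"
cat "$workdir/block" "$workdir/block" > "$workdir/data"

# A 4 KiB dictionary cannot reach back to the first copy; 1 MiB can.
small=$(xz --stdout --lzma2=dict=4KiB "$workdir/data" | wc -c)
large=$(xz --stdout --lzma2=dict=1MiB "$workdir/data" | wc -c)

echo "dict=4KiB: $small bytes"
echo "dict=1MiB: $large bytes"
```

With the larger dictionary the second block can be encoded as a reference to the first, so the output should come out at roughly half the size.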

- lc=NUM - number of literal context bits (0-4; 3)

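The “lc” option sets the number of Literal Context bits, that is, how many of the highest bits of the previous uncompressed byte are used as context when encoding the next literal byte. In typical English text, a lower-case letter is usually followed by another lower-case letter, and in US-ASCII all lower-case letters share the same three highest bits (011). With the default of “3”, the encoder can take advantage of this and produce a smaller file. The sum of “lc” and “lp” must not exceed 4.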

- lp=NUM - number of literal position bits (0-4; 0)

The “lp” option sets the number of Literal Position bits, which tells the encoder what alignment to assume in the uncompressed data when encoding literals. The default of “0” assumes no particular alignment and is fine in most cases.

- pb=NUM - number of position bits (0-4; 2)

The “pb” option sets the number of Position Bits, which tells the encoder what alignment to assume in the uncompressed data in general. It defaults to “2”, which means 4-byte alignment (2^2 = 4).
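The three bit settings can be passed together in the option list. The values in this sketch are examples only, chosen to show the syntax and the limit on “lc” plus “lp”; whether they help depends on the data.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# The defaults, spelled out explicitly.
xz --stdout --lzma2=preset=6,lc=3,lp=0,pb=2 "$workdir/database.db" | wc -c

# Assume 4-byte alignment for literals; lc is lowered so that lc + lp <= 4.
xz --stdout --lzma2=preset=6,lc=0,lp=2,pb=2 "$workdir/database.db" | wc -c

# lc + lp may not exceed 4, so xz rejects this combination.
if ! xz --stdout --lzma2=lc=3,lp=2 "$workdir/database.db" > /dev/null 2>&1; then
    echo "lc=3,lp=2 rejected"
fi
```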

- mode=MODE - compression mode (fast, normal; normal)

The “mode” option selects the method used to analyze the data produced by the match finder. The “fast” mode compresses quickly but usually produces a larger file. The “normal” mode spends more time to achieve better compression. The default is “fast” for presets 0-3 and “normal” for presets 4-9, so with the default preset of 6 the mode is “normal”.

- nice=NUM - nice length of a match (2-273; 64)

Sets the “nice” length for a match. Once the algorithm finds a match of at least the “nice” length, it stops looking for a better one. The default setting is 64. Higher values can provide better compression, but compressing the file takes longer.
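The “mode” and “nice” settings are often changed together. The following sketch compares a quick setup with a thorough one on generated data; the specific values (32 and 273) are just picks from the allowed range.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Quick: fast mode, and stop searching at a 32-byte match.
quick=$(xz --stdout --lzma2=preset=6,mode=fast,nice=32 "$workdir/database.db" | wc -c)

# Thorough: normal mode, and keep searching up to the maximum nice length.
thorough=$(xz --stdout --lzma2=preset=6,mode=normal,nice=273 "$workdir/database.db" | wc -c)

echo "fast/nice=32:    $quick bytes"
echo "normal/nice=273: $thorough bytes"
```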

- mf=NAME - match finder (hc3, hc4, bt2, bt3, bt4; bt4)

The Match Finder (mf) has a major effect on encoder speed, memory usage, and compression ratio. Hash Chain (hc) match finders are usually faster than Binary Tree (bt) match finders. The default depends on the preset: 0 uses hc3, 1-3 use hc4, and the rest use bt4, so with the default preset of 6 the match finder is “bt4”. The match finders are as follows:

hc3 - Hash Chain with 2- and 3-byte hashing with a minimum value for nice of 3. The memory usage is dict * 7.5 (if dict <= 16 MiB) or dict * 5.5 + 64 MiB (if dict > 16 MiB)
hc4 - Hash Chain with 2-, 3-, and 4-byte hashing with a minimum value for nice of 4. The memory usage is dict * 7.5 (if dict <= 32 MiB) or dict * 6.5 (if dict > 32 MiB)
bt2 - Binary Tree with 2-byte hashing with a minimum value for nice of 2. The memory usage is dict * 9.5
bt3 - Binary Tree with 2- and 3-byte hashing with a minimum value for nice of 3. The memory usage is dict * 11.5 (if dict <= 16 MiB) or dict * 9.5 + 64 MiB (if dict > 16 MiB)
bt4 - Binary Tree with 2-, 3-, and 4-byte hashing with a minimum value for nice of 4. The memory usage is dict * 11.5 (if dict <= 32 MiB) or dict * 10.5 (if dict > 32 MiB)
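The memory formulas above can be turned into a small helper for quick estimates. This is a rough sketch and not part of “xz”: “mf_mem_mib” is a made-up function name, it takes the dictionary size in MiB, and it covers only the match finder, so the real encoder needs a little more (the preset table lists 94 MiB for preset 6).

```shell
# mf_mem_mib MATCH_FINDER DICT_MIB
# Prints the estimated match finder memory in MiB (hypothetical helper).
mf_mem_mib() {
    awk -v mf="$1" -v d="$2" 'BEGIN {
        if (mf == "hc3")      m = (d <= 16) ? d * 7.5  : d * 5.5 + 64
        else if (mf == "hc4") m = (d <= 32) ? d * 7.5  : d * 6.5
        else if (mf == "bt2") m = d * 9.5
        else if (mf == "bt3") m = (d <= 16) ? d * 11.5 : d * 9.5 + 64
        else if (mf == "bt4") m = (d <= 32) ? d * 11.5 : d * 10.5
        else exit 1   # unknown match finder name
        print m
    }'
}

mf_mem_mib bt4 8     # prints 92  (8 * 11.5)
mf_mem_mib hc3 64    # prints 416 (64 * 5.5 + 64)
```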

- depth=NUM - maximum search depth; 0=automatic (default)

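Specifies the maximum search depth for the Match Finder (mf). The default is the special value 0, which lets the compressor choose a reasonable depth based on “mf” and “nice”. Reasonable values are 4-100 for Hash Chains (hc) and 16-1000 for Binary Trees (bt). Higher values slow down the compression, and values above 1000 can make it extremely slow.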
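The match finder and the search depth go into the same option list. A minimal sketch, with depth values picked from the reasonable ranges purely for illustration:

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Hash chain match finder with a shallow search.
shallow=$(xz --stdout --lzma2=preset=6,mode=fast,mf=hc4,depth=8 "$workdir/database.db" | wc -c)

# Binary tree match finder with a deeper search.
deep=$(xz --stdout --lzma2=preset=6,mf=bt4,depth=200 "$workdir/database.db" | wc -c)

echo "hc4, depth=8:   $shallow bytes"
echo "bt4, depth=200: $deep bytes"
```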

--x86[=OPTS] - x86 BCJ filter (32-bit and 64-bit)
--powerpc[=OPTS] - PowerPC BCJ filter (big endian only)
--ia64[=OPTS] - IA-64 (Itanium) BCJ filter
--arm[=OPTS] - ARM BCJ filter (little endian only)
--armthumb[=OPTS] - ARM-Thumb BCJ filter (little endian only)
--sparc[=OPTS] - SPARC BCJ filter

Valid OPTS for all BCJ filters: start=NUM - start offset for conversions (default=0)

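The above six options add a Branch/Call/Jump (BCJ) filter for the named architecture. A BCJ filter converts relative addresses in machine code into absolute ones, which increases redundancy and helps LZMA2 compress executables and libraries better. Choose the filter that matches the code inside the file being compressed, not the system on which “xz” runs. The alignment, in bytes, of each architecture is:

- x86: 1
- PowerPC: 4
- ARM: 4
- ARM-Thumb: 2
- IA-64: 16
- SPARC: 4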
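A BCJ filter is placed in front of LZMA2 in the filter chain. The sketch below copies the “xz” program itself as a convenient executable to compress; that choice is only for the demo, and on a non-x86 machine the command still runs but the filter is unlikely to help.

```shell
workdir=$(mktemp -d)

# Use the xz binary as sample input (an assumption made for the demo).
cp "$(command -v xz)" "$workdir/program"

# Filter chain: x86 BCJ first, then LZMA2 with its default settings.
xz --keep --x86 --lzma2 "$workdir/program"

# The filters are recorded in the .xz file, so plain -d restores the original.
if xz -d --stdout "$workdir/program.xz" | cmp -s - "$workdir/program"; then
    echo "round trip OK"
fi
```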

--delta[=OPTS] - Delta filter; valid OPTS (valid values; default): dist=NUM - the distance between bytes being subtracted from each other (1-256; 1)
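The Delta filter does not compress anything by itself, so it has to be followed by LZMA2 in the chain. A minimal sketch on generated data; “dist=4” is an arbitrary example value, and whether the filter helps depends on the data (it suits things like uncompressed audio or bitmap images).

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"

# Filter chain: Delta with a distance of 4 bytes, then LZMA2.
xz --keep --delta=dist=4 --lzma2 "$workdir/database.db"

# Decompression needs no extra options; the chain is stored in the file.
if xz -d --stdout "$workdir/database.db.xz" | cmp -s - "$workdir/database.db"; then
    echo "round trip OK"
fi
```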

-Q (--no-warn) – makes any warnings not affect the exit status

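The “-Q” option keeps warnings from changing the exit status: without it, “xz” exits with status 2 when a warning occurs. It does not hide the warnings themselves. To suppress the warning messages, use “-q” (--quiet), and use it twice, “-qq”, to suppress error messages as well; the exit status still reflects any errors. To uncompress the “database.db.xz” file without showing warnings and without letting them affect the exit status, the command would be “xz -dqQ database.db.xz”.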

--robot - use machine-parsable messages (useful for scripts)

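If the messages need to be in a machine-parsable format, for example when “xz” is called from a script, use the “--robot” option. It is meant to be combined with informational options such as “--list”, “--info-memory” and “--version”. To get machine-parsable information about the “database.db.xz” file, the command would be “xz --robot --list database.db.xz”.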
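Robot output is tab-separated, which makes it easy to pick apart in a script. The sketch below assumes the column layout documented in the “xz” man page, where the fifth column of the “file” line is the uncompressed size; the sample file is generated for the demo.

```shell
workdir=$(mktemp -d)
seq 1 200000 > "$workdir/database.db"
xz --keep "$workdir/database.db"

# Version information as KEY=VALUE lines.
xz --robot --version

# Tab-separated listing of the compressed file.
xz --robot --list "$workdir/database.db.xz"

# Extract the uncompressed size from the "file" line.
usize=$(xz --robot --list "$workdir/database.db.xz" | awk -F '\t' '$1 == "file" { print $5 }')
echo "uncompressed size: $usize bytes"
```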

--info-memory - display the total amount of RAM and the currently active memory usage limits, and exit

This option displays how much physical memory (RAM) “xz” thinks the system has. The memory usage limits for compression and decompression are also listed. To see the RAM available to “xz”, the command would be “xz --info-memory”.
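Combined with “--robot”, the same information comes out as a single tab-separated line that a script can read. A small sketch, assuming the first column is the total RAM in bytes as described in the man page:

```shell
# Human-readable report.
xz --info-memory

# Machine-readable form: take the first column (total RAM in bytes).
total_ram=$(xz --robot --info-memory | cut -f1)
echo "RAM seen by xz: $total_ram bytes"
```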

Keep in mind that these are among the less frequently used options of the “xz” command. You may never come across them, except in extreme cases when something “special” is required for compressing or decompressing a file. It is best, as always, to be aware of their existence just in case they are needed.
