How to zip, unzip files and directories in Linux / Unix

Learn how to zip and unzip files and directories in Linux or Unix. Compressing files helps with log management and makes data transfer easier.

Zipping files or directories compresses the data within them using Lempel-Ziv coding (LZ77). This reduces the size of the resulting file. A smaller file needs less storage (useful for log management) and transfers faster (for example over FTP).

On Linux and Unix platforms, gzip is a widely available utility, usually shipped with the OS, that is used to zip and unzip files. In this post, we will see how to zip and unzip files using the gzip utility, with examples.

Compressing files

Zipping files, i.e. compressing them, is what gzip does when run without any option. Pass filename.xyz to the gzip command; it compresses the file and names the result filename.xyz.gz. The point to note here is that gzip removes the original file and keeps the new gz file in its place.

# ll
total 12
-rw-r--r-- 1 root users  40 Jan  3 00:46 file2

# gzip file2

# ll
total 12
-rw-r--r-- 1 root users  63 Jan  3 00:46 file2.gz

Note in the above output that after zipping, the original file file2 is gone from the directory and the compressed archive file2.gz has taken its place.

Wildcards such as * or ? work too, since the shell expands them into a list of file names before gzip runs. You can also supply a list of files, and gzip will compress every file given as an argument.
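If the original file has to stay in place, one common approach is to write the compressed data to standard output with -c and redirect it into a new file. The following is a minimal sketch under stated assumptions: the scratch directory and the file name sample.log are made up for illustration. Newer gzip releases also offer a -k (keep) option for the same purpose, but -c is the more portable choice.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# sample.log is a hypothetical file created only for this demo.
printf 'line one\nline two\nline three\n' > sample.log

# -c sends the compressed stream to stdout, so the original is left untouched.
gzip -c sample.log > sample.log.gz

# Both the original and the archive are now present.
ls -l sample.log sample.log.gz
```

Because the redirect is handled by the shell, the archive can be given any name or written to another directory.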

# gzip file2 file3

# ll
total 12
-rw-r--r-- 1 root users 63 Jan  3 00:46 file2.gz
-rw-r--r-- 1 root users 134 Jan  3 00:46 file3.gz

You can force the operation with the -f option. By default gzip refuses to compress a file that has multiple hard links or whose gz counterpart already exists; -f makes it go ahead in both cases.

gzip also supports the -v option, i.e. verbose mode, which prints the name and compression ratio of each file it processes.
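The multiple-links case can be reproduced in a scratch directory. This is only a sketch, assuming GNU gzip behaviour; the file names report.txt and report-link.txt are hypothetical.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create a demo file and a second hard link to the same data.
printf 'some data\n' > report.txt
ln report.txt report-link.txt

# Without -f, gzip is expected to leave a multiply-linked file unchanged.
gzip report.txt 2>/dev/null || true
if [ -f report.txt ]; then
    echo "report.txt was left uncompressed"
fi

# With -f, the file is compressed despite the extra link.
gzip -f report.txt
ls -l
```

After the forced run, report.txt.gz exists, report.txt is gone, and report-link.txt still holds the uncompressed data, since only the one name was replaced.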

# gzip -v *
file1:   45.7% -- replaced with file1.gz
file2:   22.5% -- replaced with file2.gz
file3:   10.5% -- replaced with file3.gz

In the above example, we zipped all files within a directory using the * wildcard. Verbose mode printed the compression ratio for each file, along with the name of the archive that replaced it.

Compressing directories

Directories can be compressed too, recursively. When the -r option is used, gzip walks the given directory and its whole subtree and zips every file it finds. Each file becomes its own gz file; the directory itself is not turned into a single archive.

# ll /tmp/dir3
total 12
-rw-r--r-- 1 root users  35 Jan  3 00:46 file1
-rw-r--r-- 1 root users  40 Jan  3 00:46 file2
-rw-r--r-- 1 root users 114 Jan  3 00:46 file3

# gzip -r /tmp/dir3

# ll /tmp/dir3
total 12
-rw-r--r-- 1 root users  51 Jan  3 00:46 file1.gz
-rw-r--r-- 1 root users  63 Jan  3 00:46 file2.gz
-rw-r--r-- 1 root users 134 Jan  3 00:46 file3.gz

In the above output, gzip recursively zipped all files within the given directory. This option is helpful when a directory holds hundreds of files.
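To see how -r treats nested directories, here is a small sketch using a made-up tree (the logs and logs/old directories and their file names are assumptions for the demo).

```shell
# Build a small hypothetical directory tree in a throwaway location.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/old"
printf 'aaaa\n' > "$workdir/logs/app.log"
printf 'bbbb\n' > "$workdir/logs/old/app.log.1"

# Compress everything under logs, including the subdirectory.
gzip -r "$workdir/logs"

# Each regular file has been replaced by its own .gz;
# the directory layout itself is unchanged.
find "$workdir/logs" -type f | sort
```

The listing should show app.log.gz and old/app.log.1.gz, one compressed file per original file.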

Checking compressed files

To test the integrity of a compressed archive, use the -t option. If there is a problem with the compressed file, gzip reports it; otherwise it prints nothing and returns you to the shell prompt.

# gzip -t file2.gz
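Since -t is silent on success, scripts usually rely on its exit status instead. A minimal sketch, assuming two demo files: good.txt.gz is a real archive, while bad.txt.gz is plain text with a gz name, standing in for a damaged archive.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# A valid archive.
printf 'payload\n' > good.txt
gzip good.txt

# Not gzip data at all; stands in for a corrupted archive.
printf 'this is not gzip data\n' > bad.txt.gz

# gzip -t exits with status 0 only when the archive passes the check.
for archive in good.txt.gz bad.txt.gz; do
    if gzip -t "$archive" 2>/dev/null; then
        echo "$archive: OK"
    else
        echo "$archive: FAILED"
    fi
done
```

The same if/else pattern can guard a transfer or cleanup step so that it only runs on archives that pass the check.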

You can also view the compression details of a file using the -l option. It shows the compressed and uncompressed sizes, the compression ratio (0% if not known), and the name of the uncompressed file, i.e. the file name before compression.

# gzip -l file2.gz
         compressed        uncompressed  ratio uncompressed_name
                 63                  40  22.5% file2

Uncompressing files

To get the original file back from a compressed archive, use the -d option and pass the gz file as the argument. It works in reverse: it removes the gz file and leaves the original file in the directory. The recursive -r option works with -d too.

# gzip -d file2.gz

# ll
total 12
-rw-r--r-- 1 root users  40 Jan  3 00:46 file2

You can see that file2 is available again and the gz file has been removed. Verbose mode prints more information about the operations being done.

# gzip -v -d *
file1.gz:        45.7% -- replaced with file1
file2.gz:        22.5% -- replaced with file2
file3.gz:        10.5% -- replaced with file3

In the above output, we decompressed three files in the same directory using the * wildcard. Verbose mode shows the compression ratio and the name of the file that replaced each archive after decompression.
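When the archive should be kept, combining -d with -c writes the decompressed data to standard output instead of replacing the gz file. A short sketch, using a made-up file notes.txt in a scratch directory:

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Create and compress a hypothetical demo file.
printf 'hello from the archive\n' > notes.txt
gzip notes.txt

# -d decompresses, -c writes to stdout; notes.txt.gz stays on disk.
gzip -dc notes.txt.gz

# Redirect the output to restore a copy while keeping the archive.
gzip -dc notes.txt.gz > notes-restored.txt
ls -l
```

This is handy for peeking into a compressed log, or piping it to another command, without unpacking it on disk.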