Everything You Need To Know About Tar Files & The Linux Command Line

Tar is an immensely useful archival tool available on most Linux and Unix-based systems. It is used regularly in Linux systems administration at all skill levels. This article tackles a number of common use cases and questions regarding tar. When finished, you will know how to create, compress, extract, modify, and otherwise operate on tar archives in your day-to-day workflows. The sections that follow are organized as a series of questions, each answered in turn.

What is a tar file?

The standard tar file is an archive format common to Linux and Unix-based operating systems. These files take the place of the more widely-known zip file format and are used for both storage and transportation of groups of related files and/or directories between devices.

The tar archive format, just like its zip file cousin, contains an array of individual files and/or directories as well as metadata about them, such as attributes, paths, ownership, and permissions. All files contained within a tar archive can be listed, searched, added, deleted, compressed, and extracted directly on the command line using the tar binary.

What is a tarball?

In the default format, a tar file contains an uncompressed data stream. However, it is more common for these archives to be compressed by an external compression program such as gzip, bzip2, xz, lzip, lzma, lzop, zstd, or compress. A tar archive compressed this way is called a tarball.

What is the difference between a tar file and a tarball?

A standard tar file uses the .tar extension and stores data in an uncompressed data stream. The term tarball specifically refers to a compressed tar archive. Tarballs have a file extension based on the compression used to create them (for example, .tar.gz or .tgz for gzip, the most common). These extensions make it easy to identify which compression program is needed to decompress the file. They come in both long and short forms, and some compression tools support multiple extensions. The following quick reference table lists the tool and associated arguments for each of the common compression types.

Tar Compression Quick Reference

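                gzip        bzip2              xz      lzip     lzma     lzop     zstd     compress
    Option      --gzip      --bzip2            --xz    --lzip   --lzma   --lzop   --zstd   --compress
    Flag        -z          -j                 -J      (none)   (none)   (none)   (none)   -Z
    Extension   .gz         .bz2               .xz     .lz      .lzma    .lzo     .zst     .Z
    Tarball     .tgz .taz   .tz2 .tbz .tbz2    .txz    (none)   .tlz     (none)   .tzst    .taZ

When creating an archive, --auto-compress (-a) selects the compression program from the archive's file extension. When extracting, GNU tar normally detects the compression type on its own.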
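
As a rough illustration of the table, the following sketch creates the same tarball twice, once with the explicit -z flag and once letting -a pick the compressor from the extension. It assumes GNU tar and gzip are installed; the scratch directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and gzip; all names below are made up.
set -eu
work=$(mktemp -d)            # scratch directory for the demo
cd "$work"
mkdir dir1
echo "hello" > dir1/file3

# Explicit compression: -z (--gzip) runs the archive through gzip.
tar -czf explicit.tgz dir1

# Extension-based selection: -a (--auto-compress) chooses gzip from ".tar.gz".
tar -caf auto.tar.gz dir1

# gzip -t succeeds only if each file really is a valid gzip stream.
gzip -t explicit.tgz
gzip -t auto.tar.gz
```

Note that -f stays last in each flag group so that the archive name follows it directly.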

What is the tar syntax? 

When working on the command line it is important to understand all command line arguments being executed. Otherwise we can end up with unexpected and/or potentially damaging results. Tar is no different.

The general syntax for tar is similar to many command line tools and accepts multiple types of arguments. These arguments vary depending on the operation mode being executed. The Tar General Syntax table below gives a basic overview of the syntax tar expects. We will use this key throughout the article as we go over each operation mode.

🛈 Examples in this document employ the -v (--verbose) flag to better illustrate the results of the command line executed. This flag is entirely optional.

Tar General Syntax

    tar OPERATION [OPTIONS…] ARCHIVE ARCHIVE
    tar OPERATION [-f|--file ARCHIVE] [OPTIONS…] [FILES…]
    tar OPERATION [-f|--file ARCHIVE] [OPTIONS…] [MEMBERS…]

Legend

    OPTION     An argument that is not one of the other types:
               OPERATION, FILE, MEMBER, or ARCHIVE.
    OPERATION  Which operation tar will perform, exactly one of the following:
               CATENATE, CREATE, DIFF, DELETE, LIST, UPDATE, or EXTRACT.
    FILE       A path targeting one or more files/dirs on the system outside of tar.
               Supports: relative & full pathing, file globs, and wildcards.
    MEMBER     A path targeting one or more files/dirs inside of a tar ARCHIVE.
               Supports: relative pathing only, file globs, and wildcards.
    ARCHIVE    A path to a tar file on the file system outside of tar.

What are long & short form command line arguments? 

Command line arguments come in both long and short flavors. Throughout this article we will cover syntax and examples of both format types. To differentiate them, we will call the long argument format options and the short form flags. On a technical level these formats are synonyms and are interchangeable in most cases, the exception being flag concatenation.

How to use command line options?

The long format options use a double hyphen prefix (--) coupled with one or more lowercase English words strung together with single hyphens (-). These words are descriptive and easy to recall, making it possible to remember them without referencing the manual.

Consider these examples to illustrate the point:

  • --create is the operation mode to CREATE a new archive.
  • --list is the operation mode to LIST contents of an archive.
  • --no-recursion is the option to DISABLE RECURSION when processing directories.

How to use command line flags?

Short form flags are case-sensitive: -z and -Z, for example, select different compression programs. Each flag consists of a single character prefixed with a single hyphen (-). Ideally the flag is the first letter of its longer cousin. However, because there are only so many letters and some would conflict, some options have to use a different character or capitalization instead. Additionally, some potentially dangerous and/or lesser used options have only a long form.

Examples to consider:

  • -c is the flag for --create which makes sense.
  • -t is the flag for --list, not the expected -l or -L.
  • --delete has no short flag format and only has its long form.

How to use flag concatenation?

Flags are the preferred format when it comes to passing multiple arguments on the command line. Unlike the longer options, flags can be strung together behind a single hyphen (-) prefix to further reduce the length of the overall command line.

Consider combining the short forms of --file (-f) and --create (-c). You can merge these into the single concatenated format -cf ARCHIVE, and tar will parse this into its individual short form components. Because -f takes the ARCHIVE path as its argument, it must come last in the group.
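
To make the equivalence concrete, here is a small sketch that builds the same archive three ways: with long options, with separate flags, and with concatenated flags. The scratch directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the file and archive names are invented for illustration.
set -eu
work=$(mktemp -d)
cd "$work"
echo "sample" > file1

tar --create --file long.tar file1    # long options
tar -c -f separate.tar file1          # separate short flags
tar -cf bundled.tar file1             # concatenated flags; -f is last so ARCHIVE follows it

# Each archive lists the same single member.
tar -tf long.tar
tar -tf separate.tar
tar -tf bundled.tar
```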

How to specify the working tar archive?

All tar operation modes require a target working archive. This is supplied using the --file option (-f), which must be immediately followed by the path of the archive to work on. The basic syntax for the option is as follows:

Option Syntax: FILE

    Long         --file ARCHIVE
    Short        -f ARCHIVE
    Description  Specify the working archive being operated on.

How to create a new tar archive?

When creating a new archive, in addition to the previously mentioned --file option, we must also pass the CREATE operation and supply it with one or more FILE paths to be added to the new ARCHIVE. By default, tar refuses to create an empty archive, so you must supply at least one valid FILE path in order to successfully create the archive.

When specifying FILES to add to your ARCHIVE, any full paths are stripped of their leading forward slash (/). This converts them to relative paths inside the archive, which is an important distinction for operations that target MEMBERS inside an archive. This behavior is a safety precaution: it prevents an archive from overwriting files in another location when extracted.
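
The slash stripping can be observed directly. The sketch below archives a file by its absolute path and then lists the stored member name. It assumes a tar that strips leading slashes by default (GNU tar does); the scratch directory is hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes tar's default handling of absolute paths.
set -eu
work=$(cd "$(mktemp -d)" && pwd)     # absolute path to a scratch directory
echo "data" > "$work/file1"

# Archive the file by absolute path. tar warns on stderr that it is removing
# the leading "/" from member names; the warning is silenced here.
tar -cf "$work/abs.tar" "$work/file1" 2>/dev/null

# The stored member name is the same path without its leading slash.
tar -tf "$work/abs.tar"
```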

If the target archive you wish to create already exists, the original archive will be squashed, which means overwritten without confirmation. So be sure to save any needed existing archives before squashing them with a new archive.

Now let’s review the syntax and examples of the CREATE operation mode.

Operation Mode: CREATE

    Long         tar --create [--file ARCHIVE] [OPTIONS…] FILES…
    Short        tar -cf ARCHIVE [OPTIONS…] FILES…
    Description  Create new ARCHIVE containing all specified FILES…
                 Directories are added recursively unless --no-recursion is supplied.

Example 1A – Create myarchive.tar, populated with dir1 dir2 file1 and file2
bash-4.2$ tar -cvf myarchive.tar dir1 dir2 file1 file2
dir1/
dir1/file3
dir2/
dir2/file4
file1
file2

The example shows the creation of a new tar file named myarchive.tar in the current working directory, populated with dir1, dir2, file1, and file2. We can see from the verbose output that file3 and file4 were also added. This is because directory recursion is enabled by default, so everything looks as expected and we have successfully created the archive.
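
For readers who want to reproduce the example from scratch, the following sketch sets up the same layout in a temporary directory and also shows the effect of --no-recursion. The scratch directory and the shallow.tar name are hypothetical.

```shell
#!/bin/sh
# Sketch only: rebuilds the layout from Example 1A in a scratch directory.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir dir1 dir2
touch dir1/file3 dir2/file4 file1 file2

# -c create, -v verbose, -f archive name; directories are added recursively.
tar -cvf myarchive.tar dir1 dir2 file1 file2

# With recursion disabled, only the directory entry itself is stored.
tar -cvf shallow.tar --no-recursion dir1
```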

How to work with existing archives?

There are a number of additional operations that can be performed when working with existing archives. They range from adding new files, to removing or replacing existing files, or listing files contained within an archive, and (most commonly) extracting files from archives. We will go over each of these operation modes, their syntax, and complete an example of each to further drive these lessons home. 

How to add new/replace existing files inside an archive?

The UPDATE operation is used to add new files to a target archive and to supersede existing ones. When applying the UPDATE operation, you also need to supply one or more FILE paths to add to the specified ARCHIVE. If a FILE path already exists as a MEMBER within the archive and the file you are adding is newer, tar appends the newer copy to the end of the archive, and that copy takes precedence over the older one when the archive is extracted.

Operation Mode: UPDATE

    Long         tar --update [--file ARCHIVE] [OPTIONS…] [FILES…]
    Short        tar -u [-f ARCHIVE] [OPTIONS…] [FILES…]
    Description  Add all FILES… to ARCHIVE, replacing any existing files if newer.
                 Directories are added recursively unless --no-recursion is supplied.

Example 2 – Archive Contents Before
dir1/
dir1/file3
dir2/
dir2/file4
file1
file2
Example 2 – Add dir3/file5 to myarchive.tar
bash-4.2$ tar -uvf myarchive.tar dir3/file5
dir3/file5
Example 2 – Archive Contents After
dir1/
dir1/file3
dir2/
dir2/file4
file1
file2
dir3/file5

Continuing with our previous example, we take our freshly minted myarchive.tar file and use the UPDATE operation to add a single file (file5) contained within dir3. We have specifically added file5 without adding the whole dir3 directory, which is a noteworthy difference: the archive gains a dir3/file5 member but no dir3/ entry of its own.
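
A minimal, self-contained sketch of the UPDATE operation follows. It assumes an uncompressed archive and uses hypothetical file names in a scratch directory.

```shell
#!/bin/sh
# Sketch only: names are invented; the archive must be uncompressed for -u.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir dir3
echo "one" > file1
tar -cf myarchive.tar file1

echo "five" > dir3/file5
# -u (--update) appends dir3/file5 because the archive has no such member yet.
tar -uvf myarchive.tar dir3/file5

# The new member appears at the end of the listing.
tar -tf myarchive.tar
```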

How to delete/remove files from an archive?

Removing a file from an archive requires the DELETE operation mode. Unlike previous modes, DELETE does not have a short form flag. This is a type of safety precaution used for potentially damaging operations, so the full long syntax is required.

Targeting archive MEMBERS for removal requires using relative pathing. You will need to make sure the path supplied to tar does not start with a leading forward slash (/), otherwise, the MEMBERS inside the archive will not be found.

The following example shows how to target file1 for deletion from our archive. The archive contents are listed before and after to show that file1 is indeed no longer present.

Operation Mode: DELETE

    Long         tar --delete [--file ARCHIVE] [OPTIONS…] MEMBERS…
    Short        There is no short option equivalent.
    Description  Remove one or more MEMBERS… from ARCHIVE permanently.
                 Does not operate on compressed archives.

Example 3 – Archive Contents Before
dir1/
dir1/file3
dir2/
dir2/file4
file1
file2
dir3/file5
Example 3 – Remove file1 from myarchive.tar
bash-4.2$ tar --delete -f myarchive.tar file1
Example 3 – Archive Contents After
dir1/
dir1/file3
dir2/
dir2/file4
file2
dir3/file5
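
The DELETE operation can be tried safely in a scratch directory. This sketch assumes GNU tar, since --delete is not offered by every tar implementation, and uses hypothetical file names.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar; works on an uncompressed archive.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir dir1
touch dir1/file3 file1 file2
tar -cf myarchive.tar dir1 file1 file2

# --delete has no short form; MEMBER paths are relative (no leading "/").
tar --delete -f myarchive.tar file1

# file1 no longer appears in the listing.
tar -tf myarchive.tar
```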

How to list files inside a tar archive?

Listing the files contained inside a target archive is another very common task. This is where the --list option, or its -t flag counterpart, comes into play. On its own, without any additional arguments, running the LIST operation will print the full MEMBER list of the archive.

However, you can narrow down this list by supplying full or partial MEMBER paths or globs to existing MEMBERS in the archive. The key item to remember when working with archive MEMBERS is to use relative pathing when refining your selection. So you will almost never start a MEMBER path with a forward slash since any leading forward slashes on a path get removed when added to an archive.

The trick is to remember that when operating inside the tar archive, all pathing is relative.

Operation Mode: LIST

    Long         tar --list [--file ARCHIVE] [OPTIONS…] [MEMBERS…]
    Short        tar -tf ARCHIVE [OPTIONS…] [MEMBERS…]
    Description  List all or some MEMBERS… from ARCHIVE.
                 Arguments are optional.

Example 4 – Listing all files within an existing archive
bash-4.2$ tar -tf myarchive.tar
dir1/
dir1/file3
dir2/
dir2/file4
file2
dir3/file5
Example 5 – List specific files from an archive
bash-4.2$ tar -tf myarchive.tar file2 dir3/file5 file1
file2
dir3/file5
tar: file1: Not found in archive
tar: Exiting with failure status due to previous errors
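
Because LIST returns a failure status when a requested MEMBER is missing, it can double as a membership test in scripts. The following is a sketch under that assumption, with hypothetical file names.

```shell
#!/bin/sh
# Sketch only: names are invented for illustration.
set -u
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir3
touch file2 dir3/file5
tar -cf myarchive.tar file2 dir3/file5

# List everything, then only one selected member.
tar -tf myarchive.tar
tar -tf myarchive.tar dir3/file5

# A missing member makes tar print an error and exit with a non-zero status.
if tar -tf myarchive.tar file1 >/dev/null 2>&1; then
    echo "file1 is in the archive"
else
    echo "file1 is not in the archive"
fi
```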

How to find files in tar? How to search for files inside of tar?

Tar supports file globs, the standard wildcard characters that can be used to refine the selection of MEMBERS within an archive. Commands that accept MEMBERS as an argument can take advantage of these wildcards. The wildcard characters in question are the asterisk (*), which matches any sequence of characters, and the question mark (?), which matches any single character. GNU tar treats these characters as wildcards in MEMBER arguments only when the --wildcards option is supplied; without it, MEMBER names are matched literally. Quote each pattern so the shell passes it to tar unexpanded. The following additional LIST examples show how to use wildcards to find files inside a tar archive.

Example 6 – Listing files from an archive using the question mark ( ? ) wildcard.
bash-4.2$ tar -tf myarchive.tar --wildcards "dir?/"
dir1/
dir1/file3
dir2/
dir2/file4
dir3/file5
Example 7 – Listing files from an archive using the asterisk ( * ) wildcard.
bash-4.2$ tar -tf myarchive.tar --wildcards "d*/file*"
dir1/file3
dir2/file4
dir3/file5

File globs span a wide range of complexity, and the more you learn about them, the more efficient you can be when using tar in your day-to-day workflow. The tar manual covers pattern matching in much greater detail than this article does.
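
As a hedged illustration of pattern matching on MEMBER names, the sketch below combines both wildcard characters in one pattern. It assumes GNU tar, where the --wildcards option enables pattern matching for MEMBER arguments; the file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes GNU tar and its --wildcards option.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir dir1 dir2
touch dir1/file3 dir2/file4 file2
tar -cf myarchive.tar dir1 dir2 file2

# Single quotes keep the shell from expanding the pattern against local files,
# so tar receives it unchanged and matches it against MEMBER names.
tar -tf myarchive.tar --wildcards 'dir?/file*'
```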

How to extract files from an existing tar archive?

Extraction is hands down the most used operation performed on tar archives and tarballs. Syntax-wise, EXTRACT operates exactly the same as the LIST operation. It also supports wildcards and globs, so you can target only the individual files or directories you need to extract.

Operation Mode: EXTRACT

    Long         tar --extract [--file ARCHIVE] [OPTIONS…] [MEMBERS…]
    Short        tar -xf ARCHIVE [OPTIONS…] [MEMBERS…]
    Description  Extract all or some MEMBERS… from ARCHIVE.
                 MEMBERS are optional. Synonyms: --get

Example 8 – Extract all files within an existing archive
bash-4.2$ tar -xvf myarchive.tar
dir1/
dir1/file3
dir2/
dir2/file4
file2
dir3/file5
Example 9 – Extract specific files from an archive
bash-4.2$ tar -xvf myarchive.tar file2 dir3/file5 file1
file2
dir3/file5
tar: file1: Not found in archive
tar: Exiting with failure status due to previous errors

Errors like the ones in this example indicate that a specified file was not present in the target archive. Tar returns an error status but still extracts the rest of the items it found.
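
The partial-success behavior can be checked with a short sketch. It extracts into a separate directory using -C (--directory) so nothing in the working directory is overwritten; the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: names are invented; -C switches directory before extracting.
set -u
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir3 out
echo "five" > dir3/file5
echo "two" > file2
tar -cf myarchive.tar file2 dir3/file5

# file1 is not a member, so tar reports an error on stderr (silenced here),
# yet the members that do exist are still extracted into ./out.
tar -xf myarchive.tar -C out file2 dir3/file5 file1 2>/dev/null
status=$?

echo "tar exit status: $status"
ls -R out
```

Checking the exit status, rather than only the extracted files, is what reveals that something requested was missing.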

How to decompress a tarball?

There are several types of compressed tar file formats. The following table lists these file types, their associated compression tools, and the long and short form arguments needed to decompress each one and extract all archive members.

    Tool      Long Form Extensions                  Short Form Extensions
    gzip      tar -xf myname.tar.gz --gzip          tar -xzf myname.tgz
                                                    tar -xzf myname.taz
    bzip2     tar -xf myname.tar.bz2 --bzip2        tar -xf myname.tz2 --bzip2
                                                    tar -xjf myname.tbz
    xz        tar -xf myname.tar.xz --xz            tar -xJf myname.txz
    lzip      tar -xf myname.tar.lz --lzip          (none)
    lzma      tar -xf myname.tar.lzma --lzma        tar -xf myname.tlz --lzma
    lzop      tar -xf myname.tar.lzo --lzop         (none)
    zstd      tar -xf myname.tar.zst --zstd         tar -xf myname.tzst --zstd
    compress  tar -xf myname.tar.Z --compress       tar -xf myname.taZ --compress
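
To illustrate the gzip row, this sketch builds a small tarball and extracts it twice, once with the short flag and once with the long option. It assumes gzip is installed; the directory and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes gzip is available; names are invented.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir src out_flag out_option
echo "payload" > src/file1
tar -czf myname.tgz src

# The short flag and the long option decompress the tarball the same way.
tar -xzf myname.tgz -C out_flag
tar -xf myname.tgz --gzip -C out_option

# Both extractions yield identical file contents.
cmp out_flag/src/file1 out_option/src/file1
```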

Stephen Oduntan is the founder and CEO of SirsteveHQ, one of the fastest growing independent web hosts in Nigeria. Stephen has been working online since 2010 and has over a decade experience in Internet Entrepreneurship.

Copyright © 2024 SirsteveHQ. All Rights Reserved.