Last modified: Fri Nov 27 09:44:53 2020
Copyright 1991 Bruce Barnett and General Electric Company
Copyright 2013 Bruce Barnett
All rights reserved
You are allowed to print copies of this tutorial for your personal use, and link to this page, but you are not allowed to make electronic copies, or redistribute this tutorial in any form without permission.
Original version written in 1991 and published in the Sun Observer
How to use your system's tape drive (diskette drive?) to store files. How to do your own backups, and why they are important. Other uses of tar.
Accidents and oversights happen. Tapes can be damaged, lost, or mislabeled. Assume your system administrator is top notch. The best administrator can recover your lost data 99% of the time. There is still a small chance that the files you need might not be recovered. Can you afford to duplicate months of effort 1 percent of the time? No.
An experienced programmer learns to be pessimistic. Typically, this important fact is learned the hard way. Perhaps a few hours are lost. Perhaps days. Sometimes months are lost.
Here are some common situations:
Gulp! I scared myself. Excuse me for a few minutes while I load a tape....
Ah! I feel better now. As I was saying, being pessimistic has its advantages.
The "cd" command moves you to your home directory. You could back up any directory the same way.
The tar command, whose name is an abbreviation of "tape archive", copies the current directory, specified by the ".", to the default tape drive. The "c" argument specifies the "create" mode of tar.
You might get an error. Something about device "rmt8" off line. Don't worry. I exaggerated slightly when I said tar was easy to use. The tape device tar uses by default is "/dev/rmt8." There are several types of tape units, and not all can be referred to using that name. Some system administrators will link that name to the actual device, which makes tar easier to use. But if that doesn't work, you need to specify additional arguments to tar.
Most Unix commands follow a certain style when arguments are specified. Tar does not follow this convention, so you must be careful to use tar properly. If the standard were followed, then the following might be an example of dumping the current directory to the 1/2 inch tape cartridge, verbose mode, block size of 20:
Instead, all the flags are in the first argument, and the parameters to those flags follow the first argument, in order of the flags specified:
The same command can be specified in a different way by changing the order of the letters in the first argument:
The only key letter that has a fixed location is the first one, which must specify whether you are reading or writing an archive. The most common key letters, and the functions they perform, are:
Key Letter | Function |
---|---|
c | Create an archive |
x | eXtract an archive |
t | Table of contents |
Some versions of tar require a hyphen before the letter. It is optional on SunOS.
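Here is a small sketch of the key letters and of the way parameters follow the first argument. It writes to ordinary files in a scratch directory instead of a tape device, and the file and directory names are made up for the illustration. I am assuming a tar that still accepts the old style without a hyphen, as GNU tar does.

```shell
# Scratch area so nothing outside it is touched (hypothetical names).
work=$(mktemp -d)
mkdir "$work/src" "$work/out"
echo "hello" > "$work/src/notes.txt"
cd "$work/src"

# c = create, v = verbose, b = blocking factor, f = archive name.
# The parameters follow in the order of the letters: 20 for b, the file for f.
tar cvbf 20 "$work/a.tar" .

# Same command with the letters, and therefore the parameters, reordered.
tar cfvb "$work/b.tar" 20 .

# t = table of contents.
tar tf "$work/a.tar"

# x = extract, here into a separate directory.
cd "$work/out"
tar xf "$work/b.tar"
```

If your version of tar insists on the hyphen, writing the first argument as -cvbf should behave the same way.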
Part of the difficulty in using tar is figuring out which filename to use for which device. If you have a 1/2" tape drive, try
If you have a 1/4" tape cartridge, try
If this doesn't work, then try changing the "8" to a "0." You can also list the devices in the /dev directory and look for one that has the most recent usage:
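One way to get such a listing, as a rough sketch: ask ls to sort by access time. Cutting the list off after fifteen lines is my own choice, not something your system requires.

```shell
# -t sorts by time and -u selects the access time, so the most
# recently used device files are listed first.
ls -lut /dev | head -n 15
```

Tape devices, if there are any, usually have names containing "mt" or "st", but the naming differs from system to system.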
Some Unix systems use different standards for naming magnetic tapes. There might be an "h" at the end of a name for high density. When in doubt, examine the major and minor numbers (using the ls -l command), and read the appropriate manual page, which can be found by searching through the possible entries using
and
Note that the third dump does not use the "no-rewind" name of the device, so that it will rewind when done.
To examine a tape without extracting any files, get a table of contents by using the key letter "t" or "tv" instead of "c". The "v" flag gives a more verbose listing.
If you want to examine the third dump file, you can either use tar twice with the "no-rewind" names, or you can skip forward one or more dump files by using the mt (magnetic tape) command to skip forward 2:
SunOS 4.1 has added a new convenience to the tar command. If you defined an environment variable TAPE:
Your personal workstation probably doesn't have a tape drive connected. This makes creating tar backup files slightly more complicated. If you have an account on a machine with a tape drive, and the directory is NFS mounted, you can just rlogin to the other machine and use tar to back up your directory.
If the directory is not NFS mounted, or it is mounted but you have permission problems accessing your own files, you can use tar, rsh and dd to solve this dilemma. The syntax is confusing, but if you forget, you can use
Here, the output file of tar is -, which tar interprets as standard input if tar is reading a tape, or standard output if tar is creating a tape.
The dd command does a data copy from standard input to the device /dev/rmt0.
This example assumes you can use rsh without requiring a password. You can add your current machine's name to the remote .rhosts file if you get a password prompt when you use rlogin to access this machine.
If you do not have an account, you might be able to have a friend let you use their account temporarily.
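The same plumbing can be tried on one machine before rsh is involved. In this sketch an ordinary file, which I have called tape.img, stands in for the tape device, and the project directory is invented for the example.

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo "int main(void) { return 0; }" > "$work/project/main.c"
cd "$work"

# "f -" sends the archive to standard output; dd copies its standard
# input to the file named by of=. obs=10240 regroups the output into
# 10240-byte blocks, which is 20 blocks of 512 bytes each.
tar cf - project | dd of="$work/tape.img" obs=10240

# Reading back: with the t key letter, "f -" means standard input.
dd if="$work/tape.img" | tar tvf -
```

For a remote tape, the dd half of the pipeline would be the part that runs on the other machine.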
I have to warn you about this because there is no secure way to do this. Whenever you allow someone else access to your account, they can create a program that lets them back into your account without your knowledge. This also holds true whenever someone lets you access their account. That is, if someone allows you to type in a command from their workstation or terminal, you may not be executing the programs you think you are. One of these programs might capture your password!
In an effort to be complete, I will describe how to use a remote tape when your account has a different name on the remote machine. Let's assume your account on the remote machine is "sam". You could add a line to that account's .rhosts file with the following format:
where these match your machine and user name. You could then use the command:
to access the remote account.
The alert reader will realize that this is the same procedure used to allow someone else to have access to a remote account. I must stress that allowing someone else access to your account temporarily gives them the chance to access your account forever. I know of no simple, yet secure way to solve this problem. (I would like electronic mail if you think you know a solution that doesn't require writing a special purpose program.)
Because of Unix's device independence, you can send a tar archive to a pipe or to a Unix file. Both commands create a file called project.tar:
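A sketch of two such commands, assuming a directory named project (the directory name and its contents are my invention; only the archive name comes from the text above):

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo "sample" > "$work/project/README"
cd "$work"

# Name the archive file directly with f ...
tar cf project.tar project

# ... or send the archive to standard output ("-") and let the shell
# redirect it. This overwrites the file made by the first command.
tar cf - project > project.tar

tar tvf project.tar
```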
This is a convenient way to create an archive of a directory and move it to a different disk. This file does take up a lot of room. You can recover some of this room by running compress on the tar file:
This creates a file called project.tar.Z and deletes project.tar. This is a convenient way to save disk space, because a compressed tar file of a directory is typically smaller than the original directory. Once the tar file is created, the original directory can be deleted. The directory can be recreated with these steps:
Tar, by default, restores the original modification times. The m option suppresses this. The time of the last modification is important because make uses this information to keep track of file dependencies. This makes tar a convenient program to copy directories, unlike cp -r. To use tar to copy a directory, this example from the manual pages can be used:
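The whole round trip might look like the sketch below. Many current systems no longer install compress, so I have substituted gzip, which leaves project.tar.gz instead of project.tar.Z. The directory contents are invented for the example.

```shell
work=$(mktemp -d)
mkdir "$work/project"
echo "sample" > "$work/project/README"
cd "$work"

tar cf project.tar project
gzip project.tar      # leaves project.tar.gz and removes project.tar
rm -r project         # the original directory can now be deleted

# Recreate the directory: uncompress to standard output and let tar
# read the archive from standard input ("-").
gzip -dc project.tar.gz | tar xf -
ls -l project
```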
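A sketch of the usual form of that idiom follows. The names fromdir and todir are placeholders, and I have left out the B flag that some manual pages include, since both ends of this pipe are on the same machine.

```shell
work=$(mktemp -d)
mkdir "$work/fromdir" "$work/todir"
echo "data" > "$work/fromdir/file.txt"
# Give the file an old modification time so the effect is visible.
touch -t 202001010000 "$work/fromdir/file.txt"

# Archive fromdir to standard output and unpack that stream in todir.
# p asks tar to keep the recorded permissions.
(cd "$work/fromdir" && tar cf - .) | (cd "$work/todir" && tar xpf -)

ls -l "$work/fromdir" "$work/todir"
```

The listing should show the same date on both copies of file.txt, which is what make needs.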
Tar will archive binary and source files. To speed up the archiving and reduce the archive size, most project directories have a special command in the makefile that deletes all files that can be recreated. This is typically
that may just delete the executable file, object files and possibly a core dump:
Tar has a special flag that helps you make snapshots of a project directory. The F flag tells tar not to include the Source Code Control System (SCCS) directory. Specifying F twice makes the restriction stronger: files with the names core, a.out, errs, and any file ending with .o will also be left out of the tar file. So, to create a backup of a project directory, the following command is used:
This is a convenient way to give someone all the current files needed to build an executable, without giving them every SCCS version of every file.
This is convenient, but it may include files you don't want. Make creates files starting with a comma (",") to keep track of dependencies. Various editors create backup files ending with % or ~. I often keep the original copy of a program with the .orig extension, and old versions with a .old extension. There may be some binary files that you don't want to archive, but don't want to delete either.
The solution is to use the X flag to tar. This flag specifies that the matching argument to tar is a filename that lists files to exclude from the archive. Here is an example:
In this example, find lists all files in the directories, but does not print the directory names explicitly. If you have a directory name in an excluded list, it will also exclude all the files inside the directory. Egrep is then used as a filter to exclude certain files from the archive. Here, egrep is given several regular expressions to match certain files. This expression seems complex but is simple once you understand a few special characters:
The slash is not a special character. However, since no filename can contain a slash, it matches the beginning of a filename, as output by the find command. The vertical bar separates each regular expression. The dollar sign is one of the two regular expression "anchors", and specifies the end of the line, or filename in this case. The other anchor, which specifies the beginning of the line, is "^". But because we are matching filenames output by find, the only filenames that can match "^" are those in the top directory. Normally the dot matches any character in a regular expression. Here, we want to match the actual character ".", which is why the backslash is used to "quote" or "escape" the normal meaning.
A breakdown of the patterns and examples of the files that match these patterns are given below:
Pattern | Matches files | Used by |
---|---|---|
/, | starting with comma | make dependency files |
%$ | ending with % | Textedit backup files |
~$ | ending with ~ | emacs backup files |
\.old$ | ending with .old | old copies |
SCCS | in SCCS directory | Source Code Control System |
/core$ | with name of "core" | core dump |
\.o$ | ending with .o | Object files |
\.orig$ | ending with .orig | programmer to show original version |
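The patterns in the table can be tried out on a throwaway directory. This is only a sketch: the file names are invented, the exclude file is called Exclude by my choice, and I use grep -E, which is the current spelling of egrep. It also assumes your tar accepts the X key letter in the first argument, as GNU tar does.

```shell
work=$(mktemp -d)
mkdir -p "$work/project/SCCS"
cd "$work"
# Two files worth keeping, and a handful that match the patterns.
for f in main.c Makefile main.o core ,deps main.c~ main.c.orig main.c.old SCCS/s.main.c
do
    echo "x" > "project/$f"
done

# List the files, keep only the unwanted names, and save them.
find project -type f -print |
    grep -E '/,|%$|~$|\.old$|SCCS|/core$|\.o$|\.orig$' > Exclude

# The parameters follow the key letters in order:
# project.tar goes with f, Exclude goes with X.
tar cfX project.tar Exclude project
tar tf project.tar
```

Because find was told to print only files, the SCCS directory itself may still appear in the archive as an empty directory.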
Instead of specifying which files are to be excluded, you can specify which files to archive using the -I option. As with the exclude flag, specifying a directory tells tar to include (or exclude) the entire directory. You should also note that the syntax of the -I option is different from that of the typical tar flag. This example dumps all C and make files:
I suggest using find to create the include or exclude file. You can edit it afterwards, if you wish. Extra spaces at the end of any line will cause that file to be ignored.
Another way to debug the output of the find command is to use /dev/null as the output file:
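As a sketch of both steps, here is an include list built with find and then tested against /dev/null. One loud assumption: I have written the option as -T, which is how GNU tar spells "read the names from this file"; on SunOS the option described above is -I. The file names are invented.

```shell
work=$(mktemp -d)
mkdir "$work/project"
cd "$work"
for f in main.c util.c Makefile main.o notes.txt
do
    echo "x" > "project/$f"
done

# Build the include list: C sources and makefiles only.
find project -type f \( -name '*.c' -o -name '[Mm]akefile' \) -print > Include

# Dry run: the verbose listing shows what would be archived,
# while the archive itself is thrown away.
tar cvf /dev/null -T Include

# The real archive, once the list looks right.
tar cf project.tar -T Include
```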
There are times when you want to make an archive of several directories. You may want to archive a source directory and another directory like /usr/local. The natural, but wrong, way to do this is to use the command:
This will archive /usr/local/... as "local/"
There are several ways to specify a directory to archive. If the directory is under the current directory, I have already given the example:
A similar way to specify the same directory is
If you are currently in the directory you want archived, you can type:
Another way to archive the current directory is to type
Here, the shell expands the * character to the files in the current directory. However, it does not match files whose names start with a dot ("."), which is why the previous technique is preferred.
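The difference is easy to see with a scratch directory holding one dot file and one ordinary file (both names are invented for this sketch):

```shell
work=$(mktemp -d)
mkdir "$work/home"
cd "$work/home"
echo "x" > .profile
echo "x" > notes.txt

tar cf "$work/dot.tar" .     # tar walks the directory itself
tar cf "$work/star.tar" *    # the shell expands * to notes.txt only

echo "archive made with . :"
tar tf "$work/dot.tar"
echo "archive made with * :"
tar tf "$work/star.tar"
```

Only the first listing includes .profile.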
This causes a problem when restoring a directory from a tar archive. You may not know if an archive was created using . or the directory name.
I always check the names of the files before restoring an archive:
If the archive loads the files into the current directory, I create a new directory, change to it, and extract the files.
If the archive restores the directory by name, then I restore the files into the current directory.
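That habit can be sketched as a few lines of shell. The archive here is made with "." on purpose, and newdir is a name I picked for the example.

```shell
work=$(mktemp -d)
mkdir "$work/project" "$work/restore"
echo "x" > "$work/project/README"
(cd "$work/project" && tar cf "$work/archive.tar" .)   # made with "."

cd "$work/restore"
# Look at the first name before extracting anything.
first=$(tar tf "$work/archive.tar" | head -n 1)
echo "first entry: $first"

case "$first" in
    ./*|.)
        # Files would land in the current directory: unpack in a fresh one.
        mkdir newdir && cd newdir
        ;;
esac
tar xf "$work/archive.tar"
```

If the first entry had been a directory name instead, the case statement would do nothing and the files would be restored under that name in the current directory.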
If you want to restore a single file, get the pathname of the file as tar knows it, using the t flag. You must specify the exact filename, because filename and ./filename are not the same. You can combine this in one command by using:
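One way to write such a combined command, as a sketch with invented file names: let one tar print the stored name and hand it to a second tar through command substitution.

```shell
work=$(mktemp -d)
mkdir "$work/project" "$work/restore"
echo "one" > "$work/project/main.c"
echo "two" > "$work/project/util.c"
(cd "$work/project" && tar cf "$work/archive.tar" .)

cd "$work/restore"
# The inner tar prints the name exactly as stored (./util.c here);
# the outer tar extracts just that name. The substitution is left
# unquoted on purpose, so names containing spaces would not survive.
tar xvf "$work/archive.tar" $(tar tf "$work/archive.tar" | grep 'util\.c$')
```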
Whenever you use tar to restore a directory, you must always specify some filename. If none is specified, no files are restored.
There is still the problem of restoring a directory whose pathname starts with /. Because tar restores a file to the pathname specified in the archive, you cannot change where the file will be restored. The danger is that you may either overwrite some existing files, or you will not be able to restore the files because you don't have permissions.
You can ask the system administrator to rename a directory and temporarily create a symbolic link pointing to a directory where you can restore the files. Other solutions exist, including editing the tar archive, and creating a new directory structure with a C program executing the chroot(2) system call. Another solution is to get the Free Software Foundation's version of tar that allows you to remap pathnames starting with /. It also allows you to create archives that are too large for a single tape, incremental archives, and a dozen other advantages.
But the best solution is to never create an archive of a directory that starts with / or ~.
To restore a directory from a remote host, use the following command:
Because of the nature of a network, it is difficult to read fixed-size blocks over one. This is why tar has the B flag, which forces it to read from the pipe until a block is completely filled.
This document was translated by troff2html v0.21 on September 22, 2001.