System Administration Toolkit: Build intelligent, unattended scripts
Look at how to create scripts that can record their output, trap and identify errors, and
recover from problems so that they either run correctly or fail with a suitable error
message and report. Building scripts and running them automatically is a task that every good
administrator has to handle, but how do you capture the error output and make intelligent
decisions about how the script should respond to errors? This article addresses these issues.
The default mode of the cron and at commands, for example, is for the output of the script to be
captured and then emailed to the user who ran the script. You don't always want the user to get the
email that cron sends by default (especially if everything ran fine); sometimes the user who ran
the script and the person actually responsible for monitoring the output are different.
Therefore, you need better methods for trapping and identifying errors within the script, and better
methods for communicating problems (and, optionally, successes) to the appropriate person.
Getting the scripts set up correctly is vital; you need to ensure that the script is configured in such
a way that it's easy to maintain and that the script runs effectively. You also need to be able to trap
errors and output from programs and ensure the security and validity of the environment in which
the script executes. Read along to find out how to do all of this.
Some of these elements, such as the execution environment, are straightforward to organize. For
example, you can set the path using the following line in most Bourne-compatible shells (sh, Bash, ksh, and zsh):
PATH=/usr/bin:/bin:/usr/sbin
For directory and file locations, just set a variable at the header of the script. You can then use the
variable in each place where you would have used the filename. For example, when writing to a
log file, you might use Listing 1.
do_something >>$LOGFILE
do_another >>$LOGFILE
By setting the name once and then using the variable, you ensure that you don't get the filename
wrong, and if you need to change the filename, you only need to change it in one place.
Using a single filename and variable also makes it very easy to create a complex filename. For
example, adding a date to your log filename is made easier by using the date command with a
format specification:
DATE=$(date +%Y%m%d.%H%M)
The above command creates a string containing the date in the format YYYYMMDD.HHMM, for
example, 20070524.2359. You can insert that date variable into a filename so that your log file is
tagged according to the date it was created.
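For example, you can embed the date in the log filename like this (the path here is illustrative):

LOGFILE=/var/log/my_app/backup.$DATE.log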
If you are not using a date/time unique identifier in the log filename, it's a good idea to insert some
other unique identifier in case two scripts are run simultaneously. If your script is writing to the
same file from two different processes, you will end up either with corrupted information or missing
information.
All shells provide a unique identifier, based on the shell's process ID, that is accessible through the
special $$ variable. By using a global log variable, you can easily create a unique file to be
used for logging:
LOGFILE=/tmp/$$.err
You can also apply the same global variable principles to directories:
LOGDIR=/var/log/my_app
To ensure that the directories are created, use the -p option for mkdir to create the entire path of
the directory you want to use:
mkdir -p $LOGDIR
Fortunately, mkdir -p won't complain if the directories already exist, which makes it ideal for
use in an unattended script.
Finally, it is generally a good idea to use full path names rather than localized paths in your
unattended scripts so that you can use the previous principles together.
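Putting these principles together, the header of an unattended script might look like this minimal sketch (the paths are illustrative):

# Fixed, known search path
PATH=/usr/bin:/bin:/usr/sbin
# Full path for the log directory; create it if it doesn't exist
LOGDIR=/var/log/my_app
mkdir -p $LOGDIR
# Unique, date-stamped log file for this run
DATE=$(date +%Y%m%d.%H%M)
LOGFILE=$LOGDIR/$$.$DATE.log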
Now that you've set up the environment, let's look at how you can use these principles to help with
your unattended scripts, starting with what happens to their output. As noted earlier, the default
behavior of cron is to capture a script's output and email it to the user who ran the script.
This is less than perfect for a number of reasons. First of all, the user configured to run the
script might not be the same as the person who needs to handle the output. You
might be running the script as root, even though the output of the script or command
needs to go to somebody else. Setting up a general filter or redirection won't work if you want to
send the output of different commands to different users.
The second reason is more fundamental. Unless something goes wrong, it's not necessary
to receive the output from a script. The cron daemon sends you the output from stdout and stderr,
which means that you get a copy of the output even if the script executed successfully.
The final reason concerns the management and organization of the information and output
generated. Email is not always an efficient way of recording and tracking the output from
scripts that run automatically. Maybe you just want to archive the log file from a
successful run, or email a copy of the error log in the event of a problem.
Writing out to a log file can be handled in a number of different ways. The most straightforward
way is to redirect the output of each command to a file, as in this sketch (using the placeholder commands from Listing 1):
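# Each command's standard output is redirected to the log file
do_something >$LOGFILE
do_another >$LOGFILE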
If you want to combine error and standard output into a single file, use numbered redirection:
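# 2>&1 sends standard error (descriptor 2) to the same place as standard output (1)
do_something >$LOGFILE 2>&1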
You might also want to write the information out to separate files:
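# Standard output and standard error are written to different files
do_something >$LOGFILE 2>$ERRFILE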
For multiple commands, the redirections can get complex and repetitive. You must ensure, for
example, that you are appending information to the log file, not overwriting it:
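# Use >> to append so that later commands don't overwrite earlier output
do_something >>$LOGFILE 2>>$ERRFILE
do_another >>$LOGFILE 2>>$ERRFILE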
A simpler solution, if your shell supports it, is to use an inline block for a group of commands, and
then to redirect the output from the block as a whole. The result is that you can rewrite lines
like these:

cd /etc
rsync --delete --recursive . /backups/etc >>$LOGFILE 2>>$ERRFILE

using the following block structure:

{
cd /etc
rsync --delete --recursive . /backups/etc
} >$LOGFILE 2>$ERRFILE
The enclosing braces group the commands into a block, so all the commands in the block are
executed as if part of a single logical unit (no secondary shell is created; the enclosing block is
just treated as a different logical environment). Using the block, you can collectively redirect the
standard and error output for the entire group instead of for each individual command.
For example, Listing 9 shows a more complete script that sets up the environment, executes the
actual commands and bulk of the process, traps the output, and then sends an email with the
output and error information.
# Unique log, error, and report files, following the /tmp/$$ pattern used earlier
LOGFILE=/tmp/$$.log
ERRFILE=/tmp/$$.err
ERRORFMT=/tmp/$$.fmt

{
set -e
cd /shared
rsync --delete --recursive . /backups/shared
cd /etc
rsync --delete --recursive . /backups/etc
} >$LOGFILE 2>$ERRFILE

{
echo "Reported output"
echo
cat $LOGFILE
echo "Error output"
echo
cat $ERRFILE
} >$ERRORFMT 2>&1

# Mail the combined report (the recipient address is illustrative)
mailx -s 'Unattended script report' admin <$ERRORFMT
rm -f $LOGFILE $ERRFILE $ERRORFMT
If you use the block trick and your shell supports shell options (Bash, ksh, and zsh), then you
might want to set some shell options to ensure that the block is terminated correctly on
an error. For example, the -e (errexit) option within Bash ensures that the shell terminates
immediately when a simple command (for example, any external command called through the
script) fails.
In Listing 9, for example, without the set -e line, if the first rsync failed, the block would just
continue and run the next command. However, there are times when you want to stop the moment
a command fails because continuing could be more damaging. With errexit set, the shell
terminates immediately when the first command fails.
There are other options that you might want to set in the same way, although support differs
between shells (and the richer the shell, the better it is, as a rule, at trapping these conditions). In
the Bash shell, for example, -u ensures that any unset variables are treated as an error. This can
be useful to ensure that an unattended script does not try to execute when a required variable has
not been configured correctly.
The -C (noclobber) option ensures that redirection does not overwrite files that already exist, and it
can prevent the script from clobbering files it shouldn't touch (for example, the system files),
unless the script explicitly deletes the original file first.
Each of these options can be set using the set command, and you can use a plus sign before the
option to disable it again.
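For example, in Bash:

set -e    # errexit: terminate on the first failing command
set -u    # treat unset variables as an error
set -C    # noclobber: don't overwrite existing files with redirection
set +C    # a plus sign disables the option again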
Another area where you might want to improve the security and environment of your script is to
use resource limits. Resource limits can be set by the ulimit command, which is generally specific
to the shell, and enable you to limit the size of files, cores, memory use, and even the duration of
the script to ensure that the script does not run away with itself.
For example, you can set CPU time in seconds using the following command:
ulimit -t 600
Although ulimit does not offer complete protection, it helps in those scripts where the potential for
the script to run away with itself, or a program to suddenly use a large amount of memory, might
become a problem.
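For example, a cautious preamble for an unattended script might look like this sketch (the values are arbitrary, and the units of some limits vary between shells):

ulimit -t 600     # no more than 600 seconds of CPU time
ulimit -c 0       # no core dumps
ulimit -f 10240   # limit the maximum size of files the script can create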
Capturing faults
You have already seen how to trap errors and output, and how to create logs that can be emailed
to the appropriate person, but what if you want to be more specific about the errors you catch and
how you respond to them?
Two tools are useful here. The first is the return status from a command, and the second is the
trap command within your shell.
The return status from a command can be used to identify whether a particular command ran
correctly, or whether it generated some sort of error. The exact meaning of a specific return status
code is unique to a particular command (check the man pages), but a generally accepted principle
is that a status of zero means the command executed correctly, and a nonzero status indicates an
error.
For example, imagine that you want to trap an error when trying to create a directory. You can
check the $? variable after mkdir and then email the output, as in this sketch (the recipient address
is illustrative):
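mkdir $LOGDIR
if [ $? -ne 0 ]; then
    # "admin" is an illustrative recipient address
    echo "Could not create $LOGDIR" | mailx -s 'Script setup failed' admin
    exit 1
fi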
Incidentally, you can use the return status inline by chaining commands with the && or || symbols,
which act as 'and' and 'or' type statements. For example, say you want to ensure that the
directory gets created and the command gets executed but, if the directory is not created, the
command does not get executed. You could do that with an if statement, but chaining with && is
simpler (the command name here is illustrative):
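# do_something runs only if mkdir completed successfully
mkdir /tmp/out && do_something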
The above statement basically reads, "Make a directory and, if it completes successfully, also run
the command." In essence, only do the second command if the first completes correctly.
The || symbol works in the opposite way; if the first command does not complete successfully,
then the second is executed. This can be useful for trapping situations where a command would
raise an error but an alternative action can recover from it. For example, when changing to a
directory, you might use the line:
cd /tmp/out || mkdir /tmp/out
This line of code tries to change to the directory and, if that fails (probably because the directory
does not exist), makes it. Furthermore, you can combine these statements. In the previous
example, of course, what you really want to do is change to the directory or, if it doesn't already
exist, create it and then change to it. You can write that in one line:
cd /tmp/out || mkdir /tmp/out && cd /tmp/out
The trap command is a more generalized solution for trapping more serious errors based on the
signals raised when a command fails, such as a core dump, a memory error, or forcible
termination by the kill command.
To use trap, you specify the command or function to be executed when the signal is trapped, and
the signal number or numbers that you want to trap.
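For example, a minimal catch_trap function might report the problem and exit (the recipient address is illustrative):

catch_trap() {
    echo "Script terminated by a signal" | mailx -s 'Script trapped a signal' admin
    exit 1
}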
trap catch_trap 1 2 3 4 5 6 7 8 10 11
sleep 9000
You can trap most signals in this way (although signal 9, SIGKILL, cannot be caught), and it can
be a good way of ensuring that a program that crashes out is caught, trapped, and reported
effectively.
Not every error written to a log is worth reporting, and there is no easy way to decide which ones
are, but you can use a combination of the techniques shown in this article to log errors and
information, read or filter that information, and mail, report, or display it accordingly.
A simple way to do this is to choose which parts of a command's output you report to the
logs. Alternatively, you can post-process the logs to select or filter out the output that you need.
For example, say you have a script that builds a document in the background using the Formatting
Objects Processor (FOP) system from Apache to generate a PDF version of the document.
Unfortunately, in the process, a number of errors are generated about hyphenation. These are
errors that you know about, but they don't affect the output quality. In the script that generates the
file, you can filter these known lines out of the error log and send an email only if any other errors
remain; a sketch, assuming the error output was captured in $ERRFILE (the recipient address is
illustrative):
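# Drop the known, harmless hyphenation messages
grep -v hyphenation $ERRFILE >mailerror.log
# Send a report only if any other errors remain
test -s mailerror.log && mailx -s 'Errors during PDF build' admin <mailerror.log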
If there were no other errors, the mailerror.log file will be empty and no email is sent; otherwise,
an email is sent with the remaining error information.
Summary
In this article, you've looked at how to run commands in an unattended script, capture their
output, and monitor the execution of the different commands in the script. You can log the
information in many ways, for example, on a command-by-command or global basis, and check
and report on the progress.
For error trapping, you can monitor output and result codes, and you can even set up global traps
that identify problems during execution for reporting purposes. The result is a range of options
that handle and report problems for scripts that run on their own, where the ability to recover from
errors and problems is critical.
Related topics
• System Administration Toolkit: Check out other parts in this series.
• Apache FOP (Formatting Objects Processor): This is the world's first print formatter driven by
XSL formatting objects (XSL-FO) and the world's first output independent formatter.
• Read the Wikipedia page on crontab.
• "Scheduling recurring tasks in Java" (Tom White, developerWorks, November 2003): Find
out how to build a simple, general scheduling framework for task execution conforming to an
arbitrarily complex schedule.
• Find out how to program in Bash: "Bash by example, Part 1: Fundamental programming in the
Bourne again shell (bash)" (Daniel Robbins, developerWorks, March 2000); "Bash by example,
Part 2: More bash programming fundamentals" (Daniel Robbins, developerWorks, April 2000);
and "Bash by example, Part 3: Exploring the ebuild system" (Daniel Robbins, developerWorks,
May 2000).
• IBM Redbooks: Different systems use different tools, and Solaris to Linux Migration: A Guide
for System Administrators helps you identify some key tools.
• Check out other articles and tutorials written by Martin Brown.