RD Controls Software Release Note 145.0
RDCS Automatic Backups
John DeVoy
September 20, 1994
This document describes
the nightly and incremental backup procedures
for the disks maintained by the RD/EED controls group.
The intent here is to describe the .com files
that actually do the work,
so that if need be,
they can be maintained by someone other than the author.
The daily routine of changing the tapes is not described here,
but can be found in Software Release Note 109.
Overview
The files associated with the backups are located in
sys$common:[sysmgr.backups].
For ease of management,
a copy of each file is kept in the
backups
directory of each system disk;
you may use either
DUPLICATE and MULTICOPY,
or
DISTRIBUTE
(described below,)
to keep the directories consistent.
The various files are described in the following sections.
AUTO_BACKUP.COM
This is the main procedure of the group.
It is submitted to the system batch queue of each node on which a backup is to be done
(possibly more than once, if the node needs to do multiple backups).
AUTO_BACKUP.COM is controlled by a set of backup logicals
that collectively determine such things as
the list of disks to back up,
the tape drive to use,
whether to verify the backup,
etc.
AUTO_BACKUP takes two parameters.
The first is the name
of a .COM file
(see below)
that,
when run,
is expected to define the backup logicals in the process logical table.
The second parameter is not used by AUTO_BACKUP, but is passed as a parameter to the above .COM file.
The backup logicals are as follows:
- RDCS$BACKUP_DISKS
-
The list of disks to back up.
Invalid or unavailable disks will cause a
(non-fatal)
error message to be printed.
At least one disk must be defined.
- RDCS$BACKUP_TAPE
-
The name of the tape drive to back up the above disks to.
Note that we are no longer using the TAPE$8MM logical;
specify the name of the device explicitly.
- RDCS$BACKUP_PRIORITY
-
Specifies the priority at which the backup job will be run.
The value must be an integer between
0 and 15
(inclusive).
If the value is specified as
``DEFAULT''
then the value 4 will be used.
- RDCS$BACKUP_VERIFY_DAYS
-
Specifies the days of the week on which
``verify''
is turned on during the backup.
If the current day matches one of the days in the list,
then verify is turned on.
As special cases,
the value
``EVERYDAY''
specifies every day of the week,
and
the value
``NONE''
specifies no days.
- RDCS$BACKUP_NOTIFY
-
Specifies a list of users to be mailed a copy of the log file of the backup job when the job finishes.
If this parameter is not specified,
then no mail will be sent.
If you want to use a distribution list,
then you should specify the filename of the list,
preceded by
``@'',
and enclosed in quotes.
Note that this parameter only applies if the program is running in BATCH mode.
Mail will be sent under the following circumstances
(``<suffix>''
refers to the mail subject suffix; see below):
-
The Backup failed.
The subject line of the mail message will be something like:
``Tuesday backup failed <suffix>.''
If the backup tries to fail over,
then the message
``Retry on queue xxx''
will be appended to the subject line
(``xxx''
will be the name of the queue on which the failover job was submitted.)
-
The Backup succeeded,
but one or more tape errors occurred during the backup.
The subject line of the mail message will be something like:
``Tuesday 4 tape errors <suffix>.''
-
One or more disk errors occurred during the backup.
In this case only,
the mail message has no contents
(the information is entirely contained in the subject line of the message.)
If more than one disk had an error,
then a separate message will be sent for each disk.
The subject line of the mail message will be something like:
``25 disk errors on ELMER$DKA0: on node ELMER.''
-
The Backup finished with no errors and the REPORT flag is set.
The subject line of the mail message will be something like:
``Tuesday backup completed <suffix>.''
-
The backup job had to delete a previous backup job before it could allocate the tape.
This applies only if the KILL flag is
``True.''
The log file of the job that was deleted will be mailed.
The subject line of the message will be something like:
``Deleted job AUTO_BACKUP from queue ELMER_SYSTEM <suffix>.''
- RDCS$BACKUP_REPORT
-
If set to
``True,''
then AUTO_BACKUP will mail a copy of the log file when the job finishes.
If set to
``False,''
then the log file is only sent if an error occurs
(see above.)
- RDCS$BACKUP_RETRIES
-
Specifies the number of times the program will attempt to restart the backup if a fatal
(e.g. parity)
error occurs.
There is a five minute delay between retry attempts.
- RDCS$BACKUP_EXCLUDES
-
A list of file specifications to put on the exclude list of an incremental backup.
If the INCREMENTAL flag is not set,
then this parameter will be ignored.
- RDCS$BACKUP_FAILOVER
-
Specifies whether the backup should attempt to fail over if an error occurs.
If specified as
``True,''
then
``FAILOVER''
will be specified as the second parameter to the failover job.
- RDCS$BACKUP_FAILOVER_PARAMS
-
The name of the parameter file that should be specified for the failover job.
Defaults to the same parameter file that was specified as P1 to AUTO_BACKUP.COM.
This file will be invoked with the parameter
``FAILOVER''
(see above.)
This allows the same file to be used for both regular and failover backups.
- RDCS$BACKUP_FAILOVER_QUEUE
-
The name of the queue to which the failover job should be sent.
Must be specified if FAILOVER is
``True.''
- RDCS$BACKUP_RECORD
-
Specifies whether the backup date should be recorded for each file that is backed up.
- RDCS$BACKUP_REWIND
-
Specifies whether the tape should be rewound before the first disk is backed up.
- RDCS$BACKUP_INCREMENTAL
-
Specifies whether the backup should be incremental.
- RDCS$BACKUP_KILL
-
If true, then any batch job that already owns the tape drive will be deleted.
- RDCS$BACKUP_JOURNAL_SUFFIX
-
A suffix to be appended to the journal file name.
The name of the journal file will be the tape label,
followed by the date,
followed by the string specified here
(if any.)
An example might be
``ELMER-NIGHT1''.
- RDCS$BACKUP_COMMENT
-
A string to be used as a comment in the output save set.
Will also be echoed to the log file at the beginning and end of each disk that is backed up.
If the string contains
``!AS,''
it will be replaced with the name of the disk being backed up.
If the string contains
``!AS''
followed by
``!%D,''
the
``!%D''
will be replaced by the current date.
An example might be
``Full backup of !AS on !%D,''
which will be expanded
to something like
``Full backup of USR$DISK1 on 24-AUG-1994.''
- RDCS$BACKUP_MAIL_SUBJECT_SUFFIX
-
A string that will be appended to the end of the subject line of the
mail message that is sent to the list of users specified by the
NOTIFY parameter.
See the description of the NOTIFY parameter,
above,
for descriptions of the subject lines for the various types of messages.
In the subject lines,
the string
``<suffix>''
will be replaced by the value of this parameter.
Note that this parameter is not used for messages that report disk errors.
A typical value for this parameter might be
``on ELMER 1.''
- RDCS$BACKUP_MAIL_PERSONAL_NAME
-
A string that will be used for the personal name qualifier on mail
messages sent to the list of users specified by the NOTIFY parameter.
An example might be
``ELMER 1 Backup Job.''
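As an illustration only, the rules stated above for three of the logicals can be sketched in Python.
This is not the DCL in AUTO_BACKUP.COM;
the function names are hypothetical,
and the comma-separated form of the day list is an assumption,
since the note does not give the list syntax.

```python
DEFAULT_PRIORITY = 4  # the value the note says is used for "DEFAULT"


def resolve_priority(value: str) -> int:
    """Interpret RDCS$BACKUP_PRIORITY: "DEFAULT" or an integer in 0..15."""
    if value.strip().upper() == "DEFAULT":
        return DEFAULT_PRIORITY
    priority = int(value)
    if not 0 <= priority <= 15:
        raise ValueError(f"priority must be between 0 and 15, got {priority}")
    return priority


def verify_today(verify_days: str, today: str) -> bool:
    """Interpret RDCS$BACKUP_VERIFY_DAYS for the given day name."""
    setting = verify_days.strip().upper()
    if setting == "EVERYDAY":
        return True
    if setting == "NONE":
        return False
    # Assumption: the list is comma-separated day names.
    days = [day.strip() for day in setting.split(",")]
    return today.strip().upper() in days


def expand_comment(template: str, disk: str, when: str) -> str:
    """Expand RDCS$BACKUP_COMMENT: !AS becomes the disk name, and a
    !%D that follows !AS becomes the date string."""
    head, marker, tail = template.partition("!AS")
    if not marker:
        return template  # no !AS, so nothing is substituted
    return head + disk + tail.replace("!%D", when, 1)
```

For example, expand_comment("Full backup of !AS on !%D", "USR$DISK1", "24-AUG-1994")
gives the expanded comment shown in the RDCS$BACKUP_COMMENT entry.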
*BACKUP_PARAMS.COM
The backup parameter procedures are called by
AUTO_BACKUP.COM
to define the logicals described above.
A separate procedure is defined for each backup job;
the name of the procedure is passed to
AUTO_BACKUP.COM
as its first parameter.
Each procedure defines the logicals for a different backup job.
The first parameter to the procedure can be used to distinguish between slight variations in a particular job.
Two standard values of this parameter are currently defined:
``RESTART,''
for a job that is being submitted manually,
and
``FAILOVER,''
for a job that is being automatically resubmitted to a different queue.
If this parameter is passed to
AUTO_BACKUP.COM
as P2,
then
AUTO_BACKUP.COM
will
pass it to the backup parameter procedure.
If you want to modify the parameters for a particular backup job,
then you should edit the corresponding backup parameter procedure.
Edit the file on WARNER,
and copy it to all the nodes.
Be sure to remember to copy it to
both
system disks on
WARNER, DISNEY, and RDIV.
On a node where only a single backup is performed
(e.g. MICKEY, MINNIE, or any standalone node,)
the convention is to name these files
``<NODE>_BACKUP_PARAMS.COM,''
where
``<NODE>''
is the name of the node.
On a node where multiple backups are done
(e.g. ELMER, DAFFY, RDIV01)
the convention is to name them
``<NODE>_#_BACKUP_PARAMS.COM,''
where
``<NODE>''
is the name of the node,
and
``#''
is either one of
1, 2, 3,...
(on WARNER,)
or
one of
A, B, C, ...
(on RDIV.)
For an incremental backup,
the file is named
INCREMENTAL_BACKUP.COM
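The naming convention for the nightly parameter procedures can be summarized with a small Python sketch.
The function name is hypothetical and exists only to show how the two patterns differ;
the incremental file does not follow either pattern and is not covered.

```python
def params_filename(node: str, job: str = "") -> str:
    """Build a backup parameter procedure name from the convention.

    node -- the node name, e.g. "MICKEY"
    job  -- the job tag ("1", "2", ... or "A", "B", ...) on nodes that
            run several backups; empty on single-backup nodes
    """
    node = node.upper()
    if job:
        return f"{node}_{job.upper()}_BACKUP_PARAMS.COM"
    return f"{node}_BACKUP_PARAMS.COM"
```

Under this sketch, params_filename("MICKEY") yields MICKEY_BACKUP_PARAMS.COM
and params_filename("ELMER", "1") yields ELMER_1_BACKUP_PARAMS.COM.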
NOTIFY.DIS
This file contains the distribution list of people to notify when a backup fails
(or when it completes,
in the case of failed-over or restarted jobs.)
Putting the distribution list in a file
allows one to avoid the tedium of modifying each backup parameter procedure individually
when changing the notify list.
To change the list of people to be notified,
edit this file,
and
copy
it
to all
nodes.
Be sure to remember to copy it to
both
system disks on
WARNER, DISNEY, and RDIV.
SUBMIT_ALL.COM
This procedure is called by EVERYNITE.COM
to submit all of the backup jobs for a node or cluster.
Rather than have
``n''
submit lines that differ only in a few qualifiers,
SUBMIT_ALL.COM
contains only the code common to all backups,
and reads a data file
(described next)
to get the qualifiers specific to each job.
*AUTO_BACKUP.DAT
The backup data file contains one line for each backup to be performed on a particular node or cluster
(blank and comment lines are ignored.)
Each line should consist of the qualifiers necessary for the submit line for that particular job.
The qualifier list typically consists of the following:
- /after
-
The time after which the backup should be run.
This is particularly significant for the incremental backups.
- /name
-
The name of the job.
If more than one backup job is being submitted to the same queue,
then it is recommended that different job names be specified
(e.g.
``AUTO_BACKUP1''
and
``AUTO_BACKUP2''
on WARNER,
and
``AUTO_BACKUPA''
and
``AUTO_BACKUPB''
on RDIV.)
- /queue
-
The name of the queue to which the job should be submitted.
This will be the system queue of the node that is to run the job.
- /param
-
The first parameter must be the name of the backup parameter procedure
(which is executed from
AUTO_BACKUP.COM
to set up the parameters for the particular job.)
Nightly backups need no more parameters.
For incremental backups,
the second parameter should be either
``NOON''
or
``EVENING.''
The second parameter is passed as the first
(and only)
parameter to the backup parameter procedure.
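To make the relationship between SUBMIT_ALL.COM and the data file concrete,
here is a rough Python model of the read loop.
It is a sketch under stated assumptions, not a transcription of the DCL:
the ``!'' comment prefix, the function names,
and the sample lines (including the times and job names) are all invented for illustration.

```python
# Sample data only; this is not the contents of any real data file.
SAMPLE = """\
! nightly jobs
/after=20:00 /name=AUTO_BACKUP1 /queue=ELMER_SYSTEM /param=(ELMER_1_BACKUP_PARAMS.COM)

/after=12:00 /name=INCREMENTAL /queue=ELMER_SYSTEM /param=(INCREMENTAL_BACKUP.COM,NOON)
"""


def read_backup_jobs(text: str, comment_prefix: str = "!") -> list[str]:
    """Return one qualifier string per job, skipping blank and comment lines."""
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        # Assumption: a comment line starts with "!" as in DCL.
        if not line or line.startswith(comment_prefix):
            continue
        jobs.append(line)
    return jobs


def submit_commands(text: str, procedure: str = "AUTO_BACKUP.COM") -> list[str]:
    """Combine the common part of the submit line with each job's qualifiers."""
    return [f"SUBMIT {qualifiers} {procedure}" for qualifiers in read_backup_jobs(text)]
```

With the sample above, submit_commands(SAMPLE) returns two command strings,
one per non-comment line,
which mirrors the idea that only the qualifiers vary from job to job.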
On standalone nodes,
the convention is to name this file
``<NODE>_AUTO_BACKUP.DAT,''
where
``<NODE>''
is the name of the node.
On a cluster,
the convention is to name the file
``<CLUSTER>_AUTO_BACKUP.DAT,''
where
``<CLUSTER>''
is the cluster alias of the cluster.
To add or delete a backup job,
or to change the way that an existing job is submitted,
add, delete or modify, respectively, a line of this file, and copy it to all nodes.
Be sure to remember to copy it to
both
system disks on
WARNER, DISNEY, and RDIV.
REDO_BACKUP.COM
Submits a single backup job.
The job will be submitted
(for immediate execution)
on the system queue of the node on which
REDO_BACKUP.COM
is run.
The name of the job will be
``REDO_BACKUP.''
The first and second parameters
(if present)
will be used for the
``/param''
qualifier on the submit line.
If the first parameter is not specified,
then if exactly one backup parameter procedure
is found in the backups directory,
that procedure will be used as the default.
Otherwise,
a list of all the backup parameter procedures that were found will be printed,
along with a message that the user should use one as the first parameter.
The second parameter defaults to
``RESTART.''
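The defaulting rules for the two parameters can be modelled in a few lines of Python.
This is only a sketch of the behaviour described above;
the function name and the exception used to represent the printed message are hypothetical.

```python
def choose_params(p1: str, p2: str, found: list[str]) -> tuple[str, str]:
    """Apply the REDO_BACKUP defaults.

    p1, p2 -- the parameters as given (empty string if omitted)
    found  -- backup parameter procedures found in the backups directory
    """
    if not p1:
        if len(found) == 1:
            p1 = found[0]  # exactly one candidate, so use it as the default
        else:
            # Stand-in for printing the list and asking the user to pick one.
            raise LookupError("Specify one of these as P1: " + ", ".join(found))
    return p1, (p2 or "RESTART")
```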
A restarted backup job typically modifies the following parameters from their usual values for a nightly job:
-
Set the verify days to none because we assume we are in a hurry.
-
Set the number of retry attempts to zero, since we can restart manually if necessary.
-
Set the failover to False, again since we are running manually.
-
Set the report flag to True, since we want to know when to swap the tape.
-
Set the kill flag to False, since we don't want to inadvertently kill a job that someone else might have restarted.
-
Set the priority to three, because we are probably running during the day
(however,
this will not prevent the net from saturating.)
-
Set the mail subject suffix and personal name to reflect the fact that this is a restarted job,
as opposed to one that was submitted automatically
(by EVERYNITE.COM.)
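One way to picture these adjustments is as a set of overrides applied on top of the nightly values
when the backup parameter procedure is called with ``RESTART.''
The Python sketch below is illustrative only;
the function name is hypothetical,
and the mail subject suffix and personal name are left out because their values are specific to each job.

```python
# Overrides taken from the list above.
RESTART_OVERRIDES = {
    "RDCS$BACKUP_VERIFY_DAYS": "NONE",  # we assume we are in a hurry
    "RDCS$BACKUP_RETRIES": "0",         # restart manually if necessary
    "RDCS$BACKUP_FAILOVER": "False",    # running manually
    "RDCS$BACKUP_REPORT": "True",       # we want to know when to swap the tape
    "RDCS$BACKUP_KILL": "False",        # do not kill someone else's restart
    "RDCS$BACKUP_PRIORITY": "3",        # probably running during the day
}


def logicals_for(nightly: dict[str, str], p1: str) -> dict[str, str]:
    """Return the logical values for a job, given the procedure's first parameter."""
    logicals = dict(nightly)  # leave the nightly values untouched
    if p1.upper() == "RESTART":
        logicals.update(RESTART_OVERRIDES)
    return logicals
```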
To resubmit a failed backup job,
use
REDO_BACKUP.COM.
Run it on the same node that you want the backup to run on.
If that node runs more than one backup,
then you will have to specify the name of the backup parameter procedure as P1.
Note that it is possible to use
REDO_BACKUP.COM
for customized backups;
simply copy an existing backup parameter procedure,
modify it to suit your needs,
duplicate it
(if on a cluster,)
and specify it as the first parameter to
REDO_BACKUP.COM.
DISTRIBUTE.COM
This is a utility procedure that can be used to distribute files in the backups directory.
The program scans the backups directory in which it is located,
and compares it with the backups directory of every other RDCS system disk.
Any file that has a higher version number in the local directory than in the remote directory
is copied to the remote directory.
Note the following:
this program runs slowly;
it copies all files, not just the ``official'' ones;
the current default directory does not need to be the backups directory;
and
the same effect can be had by using ``DUPLICATE'' and ``MCOPY'' on each file that has changed.
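The comparison rule can be written down as a short Python sketch.
It models only the decision of which files to copy,
not the directory scan or the copy itself.
The function name is hypothetical,
and treating a file that is missing from the remote directory as version 0 is an assumption;
the note does not say how that case is handled.

```python
def files_to_copy(local: dict[str, int], remote: dict[str, int]) -> list[str]:
    """Return the files whose highest local version exceeds the remote one.

    Each mapping goes from file name to highest version number present.
    """
    return sorted(
        name
        for name, version in local.items()
        # Assumption: a file absent from the remote directory counts as version 0.
        if version > remote.get(name, 0)
    )
```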
Keywords:
RDCS,
Backup,
Tape,
8mm
Distribution: