Fermilab ALARM Protocol

Rich Neswold

July 16, 1999

This document describes the ALARM protocol. Most of this information was obtained from ACNET Design Note No. 39.3. Some of it was found in ACNET Design Note 22.28.

Contents

1.  General Comments
2.  As a Front-End Boots
3.  Alarm Receiving Task
4.  Alarm Reporting Task
5.  Alarm Properties

1.   General Comments

2.   As a Front-End Boots

When a front-end boots (or reboots), the list of alarms associated with it needs to be cleared from the consoles and AEOLUS. As it initializes its alarm tasks, it sends a reboot notification request to AEOLUS (typecode FEBT). The reply to this message is delayed until AEOLUS has notified all the consoles and has cleared its own references.

In the current implementation of MOOC, the front-end sends the FEBT message whenever it creates the connection to AEOLUS. This occurs when the front-end boots and each time communication with AEOLUS returns an error. The format of the reply to the FEBT message does not appear to matter; MOOC only checks the status.

A front-end may be known under several logical nodes. This happens when AEOLUS groups a subset of a front-end's devices under a new node. If the front-end needs to support multiple logical nodes, it must send a FEBT request for each logical node it supports.

Table 1: Format of Front-end Boot Message (found in alarmr.h)

Field Name  Field Type     Field Description
typecod     unsigned char  FEBT typecode (currently equal to 9).
unused      unsigned char
mibsn       unsigned char  The minimum basic subsystem number to send BIGCs to.
mabsn       unsigned char  The maximum basic subsystem number to send BIGCs to.
node        unsigned char  Node address of the front-end.
trunk       unsigned char  Trunk of the front-end.

The minimum and maximum basic subsystem numbers give the range of subsystems that should receive a BIGC (a "big clear", described in Alarm Receiving Task). These numbers are front-end specific. Only 8 subsystems are supported (0-7).
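As a rough illustration of the FEBT layout in Table 1, the following sketch fills a six-byte buffer in the listed field order. The function name febt_build, the constant names, and the range check on the subsystem numbers are assumptions made for this example; they are not taken from alarmr.h, and the sketch assumes the six unsigned char fields are sent without padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    FEBT_TYPECODE = 9,  /* "currently equal to 9" per Table 1 */
    FEBT_MSG_SIZE = 6,  /* six unsigned char fields, assuming no padding */
    MAX_SUBSYSTEM = 7   /* subsystems are numbered 0-7 */
};

/*
 * Fill buf (at least FEBT_MSG_SIZE bytes) in the field order of Table 1.
 * Returns the number of bytes written, or 0 if the subsystem range is
 * rejected. The range check is this sketch's own addition (hypothetical).
 */
size_t febt_build(uint8_t *buf, uint8_t mibsn, uint8_t mabsn,
                  uint8_t node, uint8_t trunk)
{
    assert(buf != NULL);
    if (mibsn > mabsn || mabsn > MAX_SUBSYSTEM)
        return 0;
    buf[0] = FEBT_TYPECODE;
    buf[1] = 0;      /* unused */
    buf[2] = mibsn;
    buf[3] = mabsn;
    buf[4] = node;
    buf[5] = trunk;
    return FEBT_MSG_SIZE;
}
```

A front-end known under several logical nodes would presumably call such a routine once per logical node, changing only the node and trunk arguments.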

The last thing a front-end must do when it boots is download the alarm blocks for its devices. The alarm blocks are sent to the front-end through the SETDAT protocol (see Alarm Properties). The task responsible for downloading the alarm blocks varies from front-end to front-end. These tasks append messages to a log file to indicate success or failure. The log files can be found in OP$USR1:[VTEVATRON.LOG]. Since this is not part of the official ALARM protocol and, in fact, is done a little differently on each front-end, we won't go into this any further.

3.   Alarm Receiving Task

A front-end needs to create a task called ALARMR to receive requests from AEOLUS. Currently, the only request that is sent to this task is a "big clear" (BIGC) message. A BIGC request contains a field which indicates which subsystem should have all its alarms marked "good". Intelligent modules will re-report their alarm status. Dumb modules will be put back into alarm when the front-end gets around to checking their value. In either case, the alarm status will eventually be reported to AEOLUS. Subsystems are front-end specific, so the subsystem indicator is front-end specific as well.

Table 2: Format of Big Clear Message (found in alarmr.h)

Field Name  Field Type     Field Description
mtype       unsigned char  BIGC typecode (currently equal to 2).
unused      unsigned char
unused      unsigned char
subs        unsigned char  The subsystem that needs to be cleared.

The BIGC messages can arrive as requests or USMs, so the ALARMR task should be prepared to handle both.
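The BIGC handling described in this section might be sketched as follows. The struct alarm_state bookkeeping and the names bigc_parse and bigc_apply are hypothetical. Treating subs as a subsystem number in the range 0-7, rather than as a bit mask, is an assumption based on the subsystem numbering given for the FEBT message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { BIGC_TYPECODE = 2, BIGC_MSG_SIZE = 4, NUM_SUBSYSTEMS = 8 };

/* Hypothetical per-device bookkeeping kept by the front-end. */
struct alarm_state {
    uint8_t subsystem;  /* front-end specific subsystem number */
    uint8_t bad;        /* nonzero while the device is in alarm */
};

/* Returns the subsystem to clear (0-7), or -1 if msg is not a valid BIGC. */
int bigc_parse(const uint8_t *msg, size_t len)
{
    assert(msg != NULL);
    if (len < BIGC_MSG_SIZE || msg[0] != BIGC_TYPECODE)
        return -1;
    if (msg[3] >= NUM_SUBSYSTEMS)  /* assumes subs is a number, not a mask */
        return -1;
    return msg[3];
}

/* Marks every alarm in subsystem subs "good"; returns how many changed. */
size_t bigc_apply(struct alarm_state *alarms, size_t n, int subs)
{
    size_t i, changed = 0;

    assert(alarms != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (alarms[i].subsystem == subs && alarms[i].bad) {
            alarms[i].bad = 0;
            changed++;
        }
    }
    return changed;
}
```

Because the same four bytes arrive whether the BIGC is a request or a USM, the parsing step can be shared; only the decision to send a reply differs. Devices still out of limits return to the alarm state on the next scan.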

4.   Alarm Reporting Task

The front-ends are responsible for reporting event and exception messages to AEOLUS. A task is set up to do this forwarding. This task periodically scans the devices to see if any have entered an alarm state.

Note:

Charlie has requested that we allow the alarm scan rate to be modifiable by the user. He also thinks that, if it's possible, it would be useful to have a variable scan rate on each device, instead of a single, global rate.

As messages are generated, they should be queued up and sent to AEOLUS no faster than once a second. Sending a "queue overflow" exception is encouraged to indicate that messages may have been lost due to queue size limitations.
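One possible shape for such a queue is a small ring buffer with an overflow flag and a send-time check. Everything here is illustrative: the capacity, the use of a uint32_t as a stand-in for a queued message, the seconds counter passed in by the caller, and the function names are all assumptions, not MOOC's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { QUEUE_CAPACITY = 4, MIN_SEND_INTERVAL_S = 1 };

struct alarm_queue {
    uint32_t items[QUEUE_CAPACITY];  /* placeholder for queued messages */
    size_t   head, count;
    int      overflowed;             /* set when a message was dropped */
    int      has_sent;               /* zero until the first send */
    uint32_t last_send_s;            /* time of the last send, in seconds */
};

/* Queues one item. Returns 0 and records an overflow if the queue is full. */
int queue_push(struct alarm_queue *q, uint32_t item)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) {
        q->overflowed = 1;
        return 0;
    }
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = item;
    q->count++;
    return 1;
}

/*
 * Drains up to max items into out, but only if at least one second has
 * passed since the previous send. *overflow is set when messages were lost
 * since the last drain, so the caller can report a "queue overflow".
 */
size_t queue_drain(struct alarm_queue *q, uint32_t now_s,
                   uint32_t *out, size_t max, int *overflow)
{
    size_t n = 0;

    assert(q != NULL && out != NULL && overflow != NULL);
    *overflow = 0;
    if (q->has_sent && now_s - q->last_send_s < MIN_SEND_INTERVAL_S)
        return 0;
    if (q->count == 0 && !q->overflowed)
        return 0;
    while (n < max && q->count > 0) {
        out[n++] = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
    }
    *overflow = q->overflowed;
    q->overflowed = 0;
    q->has_sent = 1;
    q->last_send_s = now_s;
    return n;
}
```

Dropping the newest message on overflow is only one possible policy; discarding the oldest entry instead would keep the most recent state at the cost of losing history.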

The alarm task should constantly monitor its connection to AEOLUS. If the connection is lost and then restored, the task should send any new messages that may have queued up during the disconnection.

The connection to AEOLUS can be done through network requests (single requests) or USMs. Network requests are preferred since they provide an acknowledgement from the receiver. As with the FEBT reply, the format of the reply isn't important; what matters is whether a reply was received.

The alarm task sends an "event report message" (ERM) to AEOLUS. The format of this message is shown in Table 3.

Table 3: ERM Format (found in alarmr.h)

Field Name  Field Type  Field Description
typecod     char        ERM typecode (either 1 or 14).
nofp        char        Number of ERPs included in the message.
data        short[267]  Holds the "event report packets" (ERPs). Since ERPs are variable-length, this unstructured region is used to hold them.

The task fills the message with event report packets (ERPs) to allow multiple events to be reported in one network transaction. The ERM typecode indicates the type of ERPs being sent. For typecode 1, the ERPs identify the device with an EMC. For typecode 14, the ERPs identify the device by a DIEMC. Devices which report alarms by DIEMC do not need an EMC in the database because a DIEMC contains the device's di. ERPs have the following format:

Table 4: ERP Format (found in alarmr.h)

Field Name  Field Type         Field Description
length      unsigned char      The length of the ERP.
sos         unsigned char      If set to 1, it indicates "status-of-status" in the reading field.
esw         ALARM_FLAGS
emc/diemc   EMC/DIEMC          EMC for typecode 1, DIEMC for typecode 14.
rdg         union              This field holds the current reading. It is a union of signed and unsigned integers and a float.
par         unsigned char[16]  Extra parameters. These are parameters that get displayed in the alarm text string. When AEOLUS displays an alarm message, it gets the text from the device's alarm text property. This text string can contain formatting specifiers (similar to, but not compatible with, printf's format string). The values in this field are used by the formatting string.
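To illustrate how several variable-length ERPs can share one ERM, the sketch below appends opaque ERP byte strings to the data region and counts them in nofp. It rests on assumptions the tables do not settle: that the ERP length field counts bytes and includes itself, that short is 16 bits (so the data region is 534 bytes), and that ERPs are packed back to back. The erm_builder type and its functions are hypothetical and do not appear in alarmr.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    ERM_TYPE_EMC   = 1,       /* ERPs identify the device by EMC */
    ERM_TYPE_DIEMC = 14,      /* ERPs identify the device by DIEMC */
    ERM_DATA_BYTES = 267 * 2  /* short[267], assuming 16-bit shorts */
};

/* Hypothetical helper for assembling one ERM. */
struct erm_builder {
    uint8_t typecod;
    uint8_t nofp;
    uint8_t data[ERM_DATA_BYTES];
    size_t  used;             /* bookkeeping only, not part of the message */
};

void erm_init(struct erm_builder *m, uint8_t typecod)
{
    assert(m != NULL);
    memset(m, 0, sizeof *m);
    m->typecod = typecod;
}

/*
 * Appends one ERP. Returns 1 on success, 0 if the packet is malformed or
 * does not fit, in which case the caller would start a new ERM.
 */
int erm_append(struct erm_builder *m, const uint8_t *erp, size_t erp_len)
{
    assert(m != NULL && erp != NULL);
    /* Assumption: the first byte holds the ERP's total length in bytes. */
    if (erp_len == 0 || (size_t)erp[0] != erp_len)
        return 0;
    /* nofp is a char in Table 3, so stay within the signed char range. */
    if (erp_len > ERM_DATA_BYTES - m->used || m->nofp == INT8_MAX)
        return 0;
    memcpy(m->data + m->used, erp, erp_len);
    m->used += erp_len;
    m->nofp++;
    return 1;
}
```

All ERPs in one message must use the same device identification, since the typecode applies to the whole ERM.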

DIEMCs have the following format:

Table 5: DIEMC Format (found in alarmr.h)

Field Name  Field Type      Field Description
trunk       unsigned char   The trunk the device's FE is on.
node        unsigned char   The node the device's FE is on.
unused      unsigned short  Contains a subsystem mask; unused by MOOC FEs.
di          unsigned int    The device's di (device index).

5.   Alarm Properties

The final piece of the puzzle is that the front-end needs to be able to support the setting and reading of alarm parameters. It does this via alarm properties. The front-end will receive requests to set or read the alarm block through SETDAT or RETDAT, respectively.

The data packet describing alarm parameters is 20 bytes for both analog and digital alarms. These packets have the following formats:

Table 6: Analog Alarm Block Format

Field Name  Field Type        Field Description
flags       unsigned short    This field holds various flags. The assignment of each bit is shown in Alarm Block Flag Bits.
minval      unsigned long     The minimum value the analog device can reach without triggering an alarm.
maxval      unsigned long     The maximum value the analog device can reach without triggering an alarm.
tneeded     unsigned char     This field acts like a simple filter. This represents the number of consecutive samples that have to exceed the limits before the alarm is generated.
tnow        unsigned char     This contains the current number of times that the device exceeded the limits.
ev1         unsigned char     MSbyte of an FTD if not using the default 3 second rate. Can be event or frequency.
ev2         unsigned char     LSbyte of the FTD.
ssinfo      unsigned short    16-bit offset into an array device.
(unused)    unsigned char     Subsystem-specific information.
alarm type  unsigned char     This field indicates the data type used for alarm comparisons.
(unused)    unsigned char[2]  Subsystem-specific information.

Table 7: Digital Alarm Block Format

Field Name  Field Type        Field Description
flags       unsigned short    This field holds various flags. The assignment of each bit is shown in Alarm Block Flag Bits.
nominal     unsigned long     The expected value of the digital device. If its value differs, an alarm is generated.
mask        unsigned long     The value of the device is logically ANDed with this mask. The result is compared to the nominal field.
tneeded     unsigned char     This field acts like a simple filter. This represents the number of consecutive samples that have to differ from the nominal before an alarm is generated.
tnow        unsigned char     This contains the current number of times that the device differed from the nominal.
ev1         unsigned char     MSbyte of an FTD if not using the default 3 second rate. Can be event or frequency.
ev2         unsigned char     LSbyte of the FTD.
ssinfo      unsigned short    16-bit offset into an array device.
(unused)    unsigned char[4]  Subsystem-specific information.
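The tneeded/tnow filter and the two comparisons described in the tables above could be sketched as below. This is a simplified reading, not the front-end code: the limits are treated as signed 32-bit min/max values, tnow is reset to zero by any in-limits sample, and the struct and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Filter state common to both block types (the tneeded and tnow fields). */
struct alarm_filter {
    uint8_t tneeded;  /* consecutive bad samples required for an alarm */
    uint8_t tnow;     /* consecutive bad samples seen so far */
};

/* Returns 1 once `bad` has been observed tneeded times in a row. */
int filter_update(struct alarm_filter *f, int bad)
{
    assert(f != NULL);
    if (!bad) {
        f->tnow = 0;  /* assumption: one good sample resets the count */
        return 0;
    }
    if (f->tnow < f->tneeded)
        f->tnow++;    /* saturate at tneeded so the counter cannot wrap */
    return f->tnow >= f->tneeded;
}

/* Analog check, assuming signed integer min/max limits. */
int analog_sample(struct alarm_filter *f, int32_t minval, int32_t maxval,
                  int32_t reading)
{
    return filter_update(f, reading < minval || reading > maxval);
}

/* Digital check: the masked reading is compared with the nominal field. */
int digital_sample(struct alarm_filter *f, uint32_t nominal, uint32_t mask,
                   uint32_t reading)
{
    return filter_update(f, (reading & mask) != nominal);
}
```

With tneeded set to 2, a single out-of-limits sample leaves the device good, and the alarm is raised only on the second consecutive one.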

Both alarm block structures have a common flags field. The bits in this field are defined as:

Table 8: Alarm Block Flag Bits

Bit Name  Bit Description
DE        Display Event. This alarm block represents an event that should be displayed by the consoles.
LE        Log Event. This alarm block represents an event that should be logged.
EV        Event Bit. If this bit is set, then the alarm block describes an event. Otherwise the alarm block describes an exception condition.
HI        Too High. Set if the reading exceeded the maximum limit.
LO        Too Low. Set if the reading fell below the minimum limit.
K0,K1,K2  Limit Type. These three bits determine the way the alarm limit fields are to be interpreted. 0 means a nominal/tolerance configuration is used. 1 means nominal/percentage and 2 means max/min. Other values are undefined.
AD        Analog/Digital. This indicates whether the alarm is analog or digital.
Q0,Q1     Limit Length. These two bits indicate the length of the values used for nominal and tolerance limits. A value of 3 is undefined.
AI        Abort Inhibit. Inhibits the effect of the abort bit (AB).
AB        Abort Bit. If set, then the beam will be aborted when an alarm occurs.
GB        Good/Bad. If the alarm indicates a "going good" state, then this bit is 0. A "going bad" state has this bit set to 1.
BP        Alarm Bypass. If 0 (?), the alarm will not generate alarms or beam inhibits.
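The three limit types selected by K0-K2 can be pictured as three ways of turning the two limit fields into an acceptable range. The sketch below is speculative in its details: it assumes the first limit field holds the nominal (or minimum) and the second holds the tolerance, percentage, or maximum, and that a percentage is taken relative to the magnitude of the nominal. The document does not state any of this, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum limit_type {
    LIMIT_NOMINAL_TOLERANCE = 0,
    LIMIT_NOMINAL_PERCENT   = 1,
    LIMIT_MIN_MAX           = 2
};

static double magnitude(double x)
{
    return x < 0 ? -x : x;
}

/*
 * Converts the two limit fields into a [lo, hi] range according to the
 * limit type k. Returns 1 on success, 0 for an undefined limit type.
 */
int limits_to_range(int k, double first, double second,
                    double *lo, double *hi)
{
    double tol;

    assert(lo != NULL && hi != NULL);
    switch (k) {
    case LIMIT_NOMINAL_TOLERANCE:
        tol = magnitude(second);                 /* assumed: second = tolerance */
        break;
    case LIMIT_NOMINAL_PERCENT:
        tol = magnitude(first) * magnitude(second) / 100.0;  /* assumed: percent of nominal */
        break;
    case LIMIT_MIN_MAX:
        *lo = first;                             /* assumed: first = minimum */
        *hi = second;
        return 1;
    default:
        return 0;                                /* other values are undefined */
    }
    *lo = first - tol;
    *hi = first + tol;
    return 1;
}
```

Under these assumptions, a nominal of 100 with a tolerance of 5 gives the range 95 to 105, and a nominal of 200 with a percentage of 10 gives 180 to 220.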

In the analog alarm blocks, a field is defined which indicates the data type that is used for comparisons. This is a recent addition to the protocol, so not all front-ends support it. For MOOC systems, this field removes the data-typing responsibility from the programmer and places it on the database maintainer. To support automated populating of the database, the following algorithm is used.
