User Interface to ARD
Frank Nagy John DeVoy
November 15, 1994
This document describes the interface and communication protocols used by the AMP (Alarm Monitor Process) and by alarm display programs that need to communicate with the ARD (Alarm Report Distributor) on the local node. The reader should refer to EPICURE design note 93 for more details on how the ARD itself works.
A display process communicates with the ARD via mailboxes. The ARD maintains a permanent system wide mailbox for receiving requests from display processes. The display process is expected to maintain a temporary mailbox into which the ARD will place Alarm Report Packets (ARP's.) These routines provide a simple interface, hiding from the user the details involved in maintaining the channels to the mailboxes, formatting messages, determining if the ARD is present or not, etc.
The ARP's are divided into two types: Alarm reports which are generated by the AMP and status messages which are generated by the ARD. An alarm report informs the user about the state of a physical device (e.g. the reading for device M00H went into alarm at 09:30.) A status message informs the user about the state of the ARD's and AMP's (e.g. the master ARD on node WEBBY crashed at 10:00.) All ARP's begin with a header that identifies the type of the ARP and contains some common information (figure 1.) (See the definition of the ARBhdr structure in the file epicure_inc:alarms.h for more details.)
The Length field of the header is the total length of the ARP. The Flags field is a set of bits identifying the type of alarm being reported. The Clinks field contains the time of the alarm (see the CVT_ routines in the fermilib library.) DI is the device index of the device being reported. PI is the property index. If the Foreign bit is clear, then the UIC field is the id of the user who set the alarm limits for this alarm. The Severity field indicates the severity of the alarm. If the Foreign bit is set, then the Node field is an array of 32 bytes indicating the node from which the alarm limits were set. Note that the UIC and Node fields are mutually exclusive in that only one has meaning for any given alarm report. C programmers note: The Node string (and all other strings described below) is not null terminated.
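Because the Node field and the other strings in an ARP are not null terminated, a C program has to copy them into a terminated buffer before passing them to routines such as printf() or strcmp(). The following is a minimal sketch of one way to do this. The helper name field_to_cstr and the constant ARP_NODE_LEN are hypothetical, and the assumption that unused bytes are padded with blanks (or NULs) is not stated in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Width of the Node field as stated in the text (hypothetical constant name). */
#define ARP_NODE_LEN 32

/*
 * Copy a fixed-width, non-null-terminated field into a C string.
 * ASSUMPTION: unused trailing bytes are blanks or NULs; they are stripped.
 * dst must have room for at least len + 1 bytes.
 * Returns the length of the resulting string.
 */
size_t field_to_cstr(char *dst, size_t dst_size, const char *field, size_t len)
{
    assert(dst != NULL && field != NULL);
    assert(dst_size > len);

    /* Drop trailing padding. */
    while (len > 0 && (field[len - 1] == ' ' || field[len - 1] == '\0'))
        len--;

    memcpy(dst, field, len);
    dst[len] = '\0';
    return len;
}
```

A caller would declare a buffer of ARP_NODE_LEN + 1 bytes and pass the address of the Node field together with ARP_NODE_LEN. The same helper could serve the 12 character Device field and the other fixed-width strings described later.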
Figure 2 shows the bits defined in the Flags field. The bits are defined as follows:
The Hi bit set means that the analog value of the device in question has exceeded the upper threshold. The Lo bit set means that the analog value has gone below the minimum threshold. The Bad bit set means that the AMP is reporting that the device has entered an alarm state. The Bad bit clear means that the device has recovered from an alarm state. The Cooked bit is always set in alarm reports being sent to the display. The Event bit set means that the alarm report is reporting an event. The Hold bit modifies the Event bit. If the Hold bit is set, then the event remains active until the AMP clears it. Otherwise, the event is transitory, and no record of it is kept after it occurs. The Invalid_data bit set means that some operation by the AMP or ARD regarding the device produced an error status return (e.g. a linkabort.) The specific status code will be included in the data area of the ARP. The Clear bit set means that the device is to be considered to have gone out of alarm, regardless of its previous state. It is also used to clear events that have their Hold bits set. If the device in question is a binary device, then a Clear bit set means that all of the bits associated with that device are to be considered out of alarm. The Status_message bit set means that the ARP contains a status message. In this case all of the other bits are meaningless. Also, the DI and PI fields have no meaning for status messages. If the Foreign bit is set, then the Node field of the header contains the name of the node from which the alarm limits were set. If the Foreign bit is clear, then the UIC field of the header contains the UIC of the user who set the alarm limits.
Figure 3 shows the format of a status message (the status_message bit in the header is set.) The header is not shown. (See the definition of the ARB_STATUS_MESS structure in the file epicure_inc:alarms.h.)
The Typecode field identifies the particular message being sent. Four codes are presently defined:
Figure 4 shows the format of an alarm report for an analog device. The header is not shown. (See the definition of the ARB_ANALOG_VALUE structure in the file epicure_inc:alarms.h.)
The only way to determine whether an alarm report refers to an analog device is to look at the PI field of the header. If the PI has the value DB_C_PRP_READING, DB_C_PRP_SETTING, or DB_C_PRP_READING_ALARM, then it is an analog device. The Scaled Data field contains the reading value of the device in the format of a scaled floating point value. If the Clear bit in the header is set, then this field has no meaning. If the Invalid_data bit is set, then the scaled data should be interpreted as an unsigned long integer; it will contain a status code that gives the reason for the error. The Cooker field has the same meaning as the Node field of a status message. The Device field is a 12 character string containing the name of the device. The Display Data field is described later.
Figure 5 shows the format of an alarm report for a binary device. Again, the header is not shown. (See the definition of the ARB_BINARY_STATUS structure in the file epicure_inc:alarms.h.)
If the PI has the value DB_C_PRP_STATUS or DB_C_PRP_STATUS_ALARM, then the alarm report refers to a binary device. The ARD treats each bit of a binary device as though it were a separate device. Thus, if two bits go into alarm the ARD will send two alarm reports, one for each bit. The exception is when the Clear bit in the header is set; in this case the alarm report refers to all bits of the device. The State field contains 0 or 1; it is the present state of the bit in question. The Bit Number field contains the number of the bit; it ranges from 0 to 63. The Color field contains color information for the current state of the bit. The State Text field is a 7 byte array describing the current state. The Short Name field is an 8 byte array containing the short name of the bit. The previous five fields (State, Bit Number, Color, State Text and Short Name) have no meaning if the Invalid_data or Clear bits are set. If the Invalid_data bit in the header is set, then the first longword will contain the status code (this corresponds to the ``current'' field of the ARB_BINARY_STATUS structure defined in epicure_inc:alarms.h.) The Cooker, Device Name, and Display Data fields are the same as those of an analog report, except that for binary reports the ARD will overwrite the text field of the Display Data structure with the long name text for the bit.
Figure 6 shows the format of the Display Data for analog and binary alarms. (See the definition of the DB_DISPLAY structure defined in epicure_inc:dbuser.h.)
The DI and PI fields are the override device and property. Bit number seven of the Flags field is the logging-enabled flag (the remaining bits are unused.) The Priority field contains the display priority of the device. The Category field contains the category of the device. The Count field contains the number of characters in the Text field, which is a variable length string of up to 32 characters.
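The Text field of the Display Data is a counted string rather than a terminated one, and the logging-enabled flag is a single bit of the Flags field. The sketch below shows how a display program might read both. The names DPY_FLAG_LOGGING, DPY_TEXT_MAX, logging_enabled and display_text are hypothetical, and the code assumes that bit number zero is the least significant bit, which this document does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DPY_TEXT_MAX     32          /* maximum Text length given in the text */
#define DPY_FLAG_LOGGING (1u << 7)   /* ASSUMPTION: bit 0 is the least significant bit */

/* Return nonzero if the logging-enabled flag (bit number seven) is set. */
int logging_enabled(unsigned flags)
{
    return (flags & DPY_FLAG_LOGGING) != 0;
}

/*
 * Copy the counted Text field into a null-terminated C string.
 * count is the value of the Count field; it is clamped to DPY_TEXT_MAX.
 * Returns the number of characters copied.
 */
size_t display_text(char *dst, size_t dst_size, const char *text, size_t count)
{
    assert(dst != NULL && text != NULL);
    if (count > DPY_TEXT_MAX)
        count = DPY_TEXT_MAX;
    assert(dst_size > count);

    memcpy(dst, text, count);
    dst[count] = '\0';
    return count;
}
```

Clamping the count guards against a corrupted Count field overrunning the caller's buffer; a buffer of DPY_TEXT_MAX + 1 bytes is always large enough.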
These are the interface routines used by display processes to communicate with the local ARD. The display process can choose to receive notification of incoming alarms via AST's or event flags. In either case the display process will receive a pointer to a dynamically allocated ARP. It is the display process's responsibility to deallocate the ARP when it is no longer needed; the routine alm_dpy_freearp() is provided for this purpose. The initial sequence of alm_dpy calls made by a display process is expected to be similar to the following:
alm_dpy_setfilter and/or alm_dpy_setuicfilter
All the routines will be put into the EPICURELIB library and will be usable by anyone wishing to write a program to process alarm information. The actual entry names and argument lists of these routines will be documented in EPICURELIB help and that documentation may be different from the information in this document. All the routines return an OpenVMS status as the function value. Optional arguments are enclosed in [ ].
This routine is called to connect a program wishing to process display reports to the ARD. This routine must be called before any other of the alm_dpy routines. The caller can specify either an event flag to be set or an AST routine to be invoked when an ARP is received. The event flag, AST routine and AST parameter may be changed by calling this routine a second (or third, etc.) time (it is not necessary to disconnect first.)
If the display process specifies an event flag to be notified of incoming alarms, then the incoming alarms will be placed in a queue. The display process can retrieve them by calling the routine alm_dpy_getarp(). If the display process specifies an AST routine, then that routine will be called once for each incoming ARP. The calling sequence will be:
(void) user_ast( arp, ast_param )
Where ``arp'' is the pointer to the (dynamically allocated) ARP and ``ast_param'' is the third argument passed to alm_dpy_connect(). If no ast_param argument is passed to alm_dpy_connect(), then zero will be used.
This routine will remove the oldest ARP received from a queue of ARP's. The display process supplies a pointer into which the address of the ARP is placed. If the most recent call to alm_dpy_connect() specified an AST routine, then the queue will always be empty. This routine is meant to be used when an event flag, as opposed to an AST, is being used.
Deallocate the ARP acquired through either an AST routine or the alm_dpy_getarp() function. The display process is responsible for calling this routine when an ARP is no longer needed. The display process must not attempt to deallocate an ARP by using the free() function from the C library or by using the LIB$FREE_VM routine. The user must not access the ARP after calling this routine.
Request that the ARD begin sending ARP's to the display process. After a connect, the ARD does not immediately start sending ARP's; it waits until the display process signals that it is ready. This function serves that purpose. This routine can also be used to counteract an alm_dpy_pause() call, but any pending messages will be forgotten.
Request that the ARD temporarily stop sending ARP's to the display process. While the pause is in effect the ARD places any newly arrived messages in a queue of pending messages. These pending messages will be sent and pause turned off when alm_dpy_resume() is called. Calling alm_dpy_begin() will also turn off pause, but the pending messages will be forgotten. Warning: The queue inside the ARD has a limit of 5000 ARP's (as of this writing.) If this limit is exceeded, then the display process will be disconnected.
Request that the ARD send an ARP for every device currently in alarm. A complete list of status messages will also be sent. This routine is provided so that newly connected display programs can ``catch up'' with the current status of the system. The status messages will always precede the alarm reports. This routine should also be called when a filter is changed, or whenever the display process for any reason thinks that it has lost track of the devices currently in alarm.
Request that the ARD resume sending ARP's after a pause request was sent. Any messages that have been accumulating in the ARD while the pause was in effect will be sent.
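The interaction of alm_dpy_begin(), alm_dpy_pause() and alm_dpy_resume() described above can be summarized as a small state machine. The following sketch is only a model of the documented behavior as seen from one display process; it is not the ARD's implementation, and every name in it is hypothetical. It assumes that ARP's arriving before the first begin request are simply not delivered, which this document does not specify.

```c
#include <assert.h>
#include <stddef.h>

#define ARD_PENDING_LIMIT 5000   /* queue limit quoted in the text ("as of this writing") */

typedef enum { DPY_IDLE, DPY_ACTIVE, DPY_PAUSED, DPY_DISCONNECTED } dpy_mode;

typedef struct {
    dpy_mode mode;
    int      pending;     /* ARP's held inside the ARD while paused */
    long     delivered;   /* ARP's sent on to the display process   */
} dpy_model;

/* After a connect nothing is sent until begin is requested. */
void model_connect(dpy_model *d)
{
    assert(d != NULL);
    d->mode = DPY_IDLE;
    d->pending = 0;
    d->delivered = 0;
}

/* Begin: start delivery; cancels a pause and forgets pending messages. */
void model_begin(dpy_model *d)
{
    assert(d != NULL);
    if (d->mode != DPY_DISCONNECTED) {
        d->pending = 0;
        d->mode = DPY_ACTIVE;
    }
}

/* Pause: newly arrived messages are queued instead of sent. */
void model_pause(dpy_model *d)
{
    assert(d != NULL);
    if (d->mode == DPY_ACTIVE)
        d->mode = DPY_PAUSED;
}

/* Resume: send everything that accumulated, then turn pause off. */
void model_resume(dpy_model *d)
{
    assert(d != NULL);
    if (d->mode == DPY_PAUSED) {
        d->delivered += d->pending;
        d->pending = 0;
        d->mode = DPY_ACTIVE;
    }
}

/* A new ARP for this display process arrives in the ARD. */
void model_arrive(dpy_model *d)
{
    assert(d != NULL);
    if (d->mode == DPY_ACTIVE) {
        d->delivered++;
    } else if (d->mode == DPY_PAUSED) {
        if (d->pending >= ARD_PENDING_LIMIT) {
            /* Limit exceeded: the display process is disconnected. */
            d->pending = 0;
            d->mode = DPY_DISCONNECTED;
        } else {
            d->pending++;
        }
    }
    /* ASSUMPTION: in DPY_IDLE and DPY_DISCONNECTED nothing is delivered. */
}
```

The model makes the practical consequence visible: a display process that pauses should resume (or begin again) well before 5000 ARP's accumulate, and should use begin rather than resume only when it is prepared to lose the pending messages.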
This routine disconnects the display process from the ARD. The mailbox channels are closed and the temporary mailbox is released. After calling this routine, calls to any of the alm_dpy routines, except alm_dpy_connect(), will fail. Alm_dpy_connect() will set up an exit handler to invoke this routine automatically when the display process exits.
Request that the ARD set the category filter. The filter is implemented as a bit array for the 256 categories (32 bytes, or 8 longwords.) If bit i is set in the bit array, then all alarm reports in category i are included in the message stream to the display process. The bit array thus has ``include these categories'' semantics. If this routine is not called, then all messages are sent by default. Note that the filter does not affect status messages - they cannot be filtered out. Bit number zero of the first longword corresponds to category zero. See also the description of the routine alm_dpy_setuicfilter().
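Building the category bit array is a matter of simple index arithmetic. The sketch below shows one way to set and test a category bit in an array of 8 longwords. The type and function names are hypothetical, and the code assumes that bit number zero is the least significant bit of each longword and that categories 32 through 63 continue in the second longword, and so on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ALM_NUM_CATEGORIES   256
#define ALM_FILTER_LONGWORDS (ALM_NUM_CATEGORIES / 32)   /* 8 longwords, 32 bytes */

/* Hypothetical container for the category bit array. */
typedef struct {
    uint32_t bits[ALM_FILTER_LONGWORDS];
} category_filter;

/* Start with no categories included. */
void filter_clear(category_filter *f)
{
    assert(f != NULL);
    memset(f->bits, 0, sizeof f->bits);
}

/* Include alarm reports of the given category in the message stream. */
void filter_include(category_filter *f, unsigned category)
{
    assert(f != NULL);
    assert(category < ALM_NUM_CATEGORIES);
    /* ASSUMPTION: bit 0 is the least significant bit of each longword. */
    f->bits[category / 32] |= (uint32_t)1 << (category % 32);
}

/* Return nonzero if the given category is included. */
int filter_includes(const category_filter *f, unsigned category)
{
    assert(f != NULL);
    assert(category < ALM_NUM_CATEGORIES);
    return (int)((f->bits[category / 32] >> (category % 32)) & 1u);
}
```

A display process interested only in, say, categories 3 and 40 would call filter_clear(), then filter_include() twice, and pass the resulting 32 byte array to alm_dpy_setfilter().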
Request that the ARD set the UIC filter. The UIC filter is implemented as an array of UICs and/or node names. Figure 7 shows the format of a single UIC filter element. (See the definition of the UICFilterArray type in the file epicure_sys_inc:alarmnet.h.)
The display process sends a variable length array of UIC filter elements to the ARD. The ARD compares the UIC or Node field of each message's header (as determined by the Foreign bit) to each of the filter elements. If the message matches at least one of the UIC filter elements, then it is sent to the display process. The UIC filter array thus has ``include alarms from these users'' semantics. Note that the messages are also subject to category filtering; see the description of the routine alm_dpy_setfilter(). The following rules should be followed when creating the UIC filter array:
Request that the display process be notified if the ARD crashes or disconnects from the display process for any reason. The display process can specify either an event flag to be set, or an AST routine to be invoked.
Request that all messages queued and waiting to be delivered by the alm_dpy_getarp() routine be deallocated. Any ARP's already removed from the queue by alm_dpy_getarp() are not affected. This routine is not useful if AST's are being used to receive ARP's. Note that if pause mode is on, this routine will have no effect on any messages that may be pending inside the ARD. To flush the queue inside the ARD, use alm_dpy_begin().
These routines provide the interface between the AMP and the permanent mailbox maintained by the ARD, allowing the AMP to easily send alarm reports to the ARD for distribution to display processes. The same permanent mailbox that is used by the display processes is used by the AMP to pass the alarm reports to the ARD. The AMP UTI also monitors the ARD to keep track of when it is running. Alm_rpt_connect() accepts an AST routine as an optional argument; if provided, it is used to signal the AMP when the connection to the ARD is made and whenever the ARD is restarted. If the ARD is unable to read the mailbox at a fast enough rate to keep up with the AMP, then the UTI will queue the messages until the ARD catches up.
The ARP's sent from the AMP to the ARD are distinguished from those sent from the ARD to the display process by having the Cooked bit of the header clear. Figures 8 and 9 show the format of raw analog and binary alarm reports. (See the definition of the ARB_ANALOG_VALUE and the ARB_BINARY_STATUS structures in the file epicure_inc:alarms.h.)
The headers are the same as the one described in section 1.1, and are not shown. The Raw Value field of an analog ARP is a long integer containing the unscaled reading from the device. If the Invalid_data bit is set, then this field will contain an OpenVMS status value. This field is meaningless if the Clear bit is set. The Current State field (a quadword, implemented as an array of two integers) of a binary ARP contains the current digital reading from the device. The Previous State field contains the digital reading of the previous state from the device. The difference between these two quadwords indicates the bits that are going into or out of alarm. If the Invalid_data bit of the header is set, then the first longword of the Current State field will contain an OpenVMS status value. Both of these fields are meaningless if the Clear bit of the header is set.
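The comparison of the Current State and Previous State quadwords can be illustrated with an exclusive-or over the two longwords. The sketch below lists the bit numbers (0 to 63) that differ and reads the present state of a single bit. The function names are hypothetical, and the code assumes that the first array element holds bits 0 through 31 with bit zero least significant. It only identifies bits that changed; deciding whether a changed bit is entering or leaving alarm requires information not described here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the bit numbers (0..63) that differ between two quadwords,
 * each stored as an array of two longwords.
 * ASSUMPTION: element [0] holds bits 0..31, element [1] holds bits 32..63.
 * out must have room for 64 entries. Returns the number of changed bits.
 */
int changed_bits(const uint32_t current[2], const uint32_t previous[2], int out[64])
{
    int n = 0;

    assert(current != NULL && previous != NULL && out != NULL);
    for (int word = 0; word < 2; word++) {
        uint32_t diff = current[word] ^ previous[word];
        for (int bit = 0; bit < 32; bit++) {
            if ((diff >> bit) & 1u)
                out[n++] = word * 32 + bit;
        }
    }
    return n;
}

/* Return the state (0 or 1) of one bit of a quadword. */
int bit_state(const uint32_t state[2], int bit_number)
{
    assert(state != NULL);
    assert(bit_number >= 0 && bit_number < 64);
    return (int)((state[bit_number / 32] >> (bit_number % 32)) & 1u);
}
```

Each bit number returned by changed_bits() corresponds to one candidate for a separate cooked binary alarm report, since the ARD treats every bit of a binary device as though it were a separate device.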
These routines are intended to be used by the AMP only, and are not provided for use by general users. All of the routines will be put into the EPICURELIB library, but they will not be documented in the EPICURELIB help. All the routines return an OpenVMS status as the function value. Optional arguments are enclosed in [ ].
This routine will open a channel to a permanent, system-wide mailbox read by the ARD process. This routine must be called before any other of the alm_rpt routines. After calling this routine the caller may begin sending alarm reports to the ARD. The caller may specify an AST routine that will be invoked when the connection to the ARD's mailbox is made, and subsequently whenever a restart of the ARD is detected. The AST routine is expected to cause the caller to send a new alarm report to the ARD for every device that is in alarm. If the caller does not want to specify an AST routine, then a value of NULL should be provided. The AST routine may be changed by calling this routine a second time (it is not necessary to disconnect first.) If the ARD's mailbox is full, then the alarm reports will be placed on a queue until the mailbox is emptied.
This routine sends an alarm report to the ARD. The caller provides a pointer to either an ARB_ANALOG_VALUE structure or to an ARB_BINARY_STATUS structure, as defined in epicure_inc:alarms.h. The caller is responsible for making sure that the alarm report contains meaningful data. A new alarm report is dynamically allocated and the data copied into it from the alarm report provided by the caller.
This routine sends a raw analog alarm report to the ARD. The fields of the raw analog alarm report are passed individually and assembled into a complete alarm report that is sent to the ARD.
This routine sends a raw binary alarm report to the ARD. The fields of the raw binary report (except the property index) are passed individually and assembled into a complete alarm report that is sent to the ARD. The property index of the report will be DB_C_PRP_STATUS. If the caller wants to send a report with the property index DB_C_PRP_STATUS_ALARM, then alm_rpt_send_block() must be used.
This routine waits until any messages queued for transmittal to the ARD's input mailbox have been processed and the internal queue is empty. This routine is used to ensure that all raw alarm reports have been transmitted to the ARD before the calling program is allowed to exit. Warning: Do not call this routine from AST level.
This routine flushes the queue of alarm reports waiting to be sent to the ARD. The alarm reports are deallocated. Any messages already in the ARD's mailbox are unaffected. If the caller provided an AST routine to the alm_rpt_connect() routine, then alm_rpt_qflush() is implicitly called just before the AST routine is invoked.
The ARD expects to receive alarm reports, status messages and display requests through its permanent mailbox. Status messages and alarm reports are sent to display processes through temporary mailboxes created by the display processes. All messages to and from the ARD are sent in packets. A packet consists of a header (figure 10,) followed by one or more messages. (See the definition of the MsgHeader structure in the file epicure_sys_inc:alarmnet.h.) The ARD interface routines completely hide the tasks of assembling and disassembling packets from the display processes and the AMP; they only need to deal with single messages.
The Length field of a packet header contains the total length of the packet. The Count field contains the number of messages that follow the header. The Type field contains the type of messages in the packet. The possible values for the Type field are: ALMMSG_K_RAW, ALMMSG_K_COOKED, and ALMMSG_K_REQUEST. This same packet format is used for network messages between the ARD's.
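The packet structure (a header carrying Length, Count and Type, followed by Count messages) can be illustrated with a small packet walker. The sketch below is not the layout of the MsgHeader structure: the field widths (three 16 bit little-endian fields), the field order, and the assumption that every message starts with its own 16 bit total length are all hypothetical choices made so that the example is self-contained. All function names are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL layout: Length, Count, Type as 16-bit little-endian fields. */
#define PKT_HDR_SIZE 6

typedef void (*msg_visitor)(uint16_t type, const unsigned char *msg,
                            uint16_t msg_len, void *ctx);

static uint16_t get_u16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Visit each message in a packet.
 * ASSUMPTION: each message begins with a 16-bit total length.
 * Returns the number of messages, or -1 if the packet is malformed.
 */
int walk_packet(const unsigned char *pkt, size_t pkt_size,
                msg_visitor visit, void *ctx)
{
    assert(pkt != NULL);
    if (pkt_size < PKT_HDR_SIZE)
        return -1;

    uint16_t length = get_u16(pkt);       /* total length of the packet   */
    uint16_t count  = get_u16(pkt + 2);   /* number of messages following */
    uint16_t type   = get_u16(pkt + 4);   /* one type for all messages    */
    if (length < PKT_HDR_SIZE || length > pkt_size)
        return -1;

    size_t off = PKT_HDR_SIZE;
    for (uint16_t i = 0; i < count; i++) {
        if (length - off < 2)
            return -1;                    /* no room for a length field */
        uint16_t msg_len = get_u16(pkt + off);
        if (msg_len < 2 || msg_len > length - off)
            return -1;                    /* message overruns the packet */
        if (visit != NULL)
            visit(type, pkt + off, msg_len, ctx);
        off += msg_len;
    }
    return (int)count;
}

/* Example visitor: add up the lengths of all messages seen. */
void sum_lengths(uint16_t type, const unsigned char *msg,
                 uint16_t msg_len, void *ctx)
{
    (void)type;
    (void)msg;
    *(size_t *)ctx += msg_len;
}
```

Because a packet carries only one type of message, the Type value from the header is simply handed to the visitor along with each message. The bounds checks reflect the fact that Length and Count arrive through a mailbox or over the network and should not be trusted blindly.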
Packets are restricted to one type of message per packet. If the Type field contains ALMMSG_K_RAW, then the packet contains raw alarm reports (as described in section 2.2) from the AMP. If the Type field contains ALMMSG_K_COOKED, then the packet contains cooked alarm reports (as described in section 1.1) being sent to the display process, or over the network to another ARD.
If the Type field contains ALMMSG_K_REQUEST, then the packet contains either request messages from the display process or status messages. These messages consist of a header, possibly followed by additional data. The format of a request message header is shown in figure 11. (See the definition of the ReqHeader and ReqMsg structures in the file epicure_sys_inc:alarmnet.h.)
The Length field contains the total length of the message. In a request message, the Unit # field identifies the unit number of the temporary mailbox owned by a display process. This field is ignored in status messages. In a status message, the Node field contains the name of the node to which the message applies. This field is ignored in request messages. The Type field identifies the particular message being sent.
If the message is a request message from a display process, then the following messages are defined (the sender in this case will be identified by the unit number of the temporary mailbox:)
If the message is a status message, then the following messages are defined:
The AMP UTI is responsible for monitoring the ARD to determine if and when it has died. When the alm_rpt_connect() routine is called by the AMP, or when the ARD is detected restarting, the queue is flushed, an ARDREQ_K_AMPUP message is sent to the ARD, and the AST routine specified in the call to alm_rpt_connect() will be invoked. Note that the current implementation requires that the AMP and ARD be in the same group. This is because the process name of the ARD is used to determine if the ARD is running.
Keywords: Epicure, program, alarm, event, ARD, display, AMP