Research Division EED/Controls Software
Release Note 152.0
The Collection and Logging of Device Data
Robert E. West
September 21, 1995
Introduction
The EPICURE Data Logging System consists of three parts:
- Specification of the devices from which data is to be collected
- Collection of the specified data and writing of the data to disk
- Extraction and plotting of the recorded data
This document describes the second part: the EPICURE data logging utility
that collects and records the data.
Operation
Network Interface
When the data collection process starts, it declares itself to be a network
object so that it may receive commands over the network from other processes.
This interface is used by the data logging editor process to restart a data
acquisition list after a user has made modifications to the devices on the
list. The command RESTART followed by a list number from 1 to 6 tells the
collection process to read the specified list's file of device names and then
rebuild the data acquisition list. Other network commands exist for system
and diagnostic purposes.
In addition to writing to disk the data it has collected, the collection
process may receive data records over the network from other collection
processes and then write these records into its file of collected data.
A process that functions as a file server normally does not also collect
data. Both functions may be performed by the same process, but the workloads
must be balanced to avoid losing data.
Data Acquisition
The data collection process of the EPICURE logging system is configured to
execute
six data acquisition lists each having a maximum of 180 devices. If the
reading, setting, and status properties of each device are to be logged, a
maximum of 540 items may be on a single list.
Tuning parameters may be used to further limit the number of items allowed on
each list. These limit parameters,
along with other control parameters, are read from a file on disk.
The times and/or rates at which the data acquisition
lists collect data are also specified by control parameters read from disk.
The device names and properties to be read by a data acquisition list are
contained in a text file on disk. There are six of these device files, one for
each of the six possible data acquisition lists. When the data collection process
starts, it reads each of these files and creates the corresponding data
acquisition lists. The data logging editor process is used to modify the
contents of these device files and then to tell the data collection process
to read the new file and rebuild the corresponding data acquisition list.
A memory buffer is allocated for the data collected for each item on a
data acquisition list. A memory buffer contains 120 data points.
A data point consists of the timestamp of the device access and the data value
converted to engineering units in floating point format.
When the buffer is full, it is queued to be written to disk and another buffer
is allocated.
If a device is read once a minute, the
buffer is written to disk after two hours. The longer the time between the
device reads, the longer it takes for the buffer to fill and the more data
that would be lost if the system crashed. The buffer may be flushed
to disk after the insertion of a specific number of data points. Because
the data
acquisition lists may be executed at different rates, each list has its own
flush count parameter.
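As a rough illustration of this trade-off, the following Python sketch
computes how long an item's buffer stays in memory before it is written to
disk, which is also the amount of data exposed to a system crash. The function
name is illustrative and not part of the EPICURE software; only the 120-point
buffer size comes from this note.

```python
BUFFER_POINTS = 120  # data points per buffer, as stated in this note


def seconds_until_write(read_interval_s, flush_count=BUFFER_POINTS):
    """Seconds between disk writes of one item's buffer.

    flush_count is the number of new points after which the buffer is
    written; with no intermediate flushing it equals the buffer size.
    """
    if not 1 <= flush_count <= BUFFER_POINTS:
        raise ValueError("flush count must be between 1 and 120")
    return flush_count * read_interval_s


# One read per minute, no intermediate flush: two hours of data at risk.
print(seconds_until_write(60) / 3600)    # 2.0
# Same rate with a flush count of 10: ten minutes of data at risk.
print(seconds_until_write(60, 10) / 60)  # 10.0
```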
The data collection process fills a device buffer with the data as it is read.
This operation takes place at the AST interrupt level
of processing. When the buffer is full, it is queued to be written to disk
at the base level of processing. If the buffers cannot be written to disk as
fast as data is being read, the process eventually runs out of data buffers.
When this occurs, the interrupt processing of data input is disabled until an
adequate number of buffers have been output to disk and are then
available for new data. The result of this overload is a `hole' in the
logged data of all the devices, i.e., a period of time during which no data
has been collected and recorded.
Output to Disk
The EPICURE data logging system is designed to provide circular logging.
The duration of a data acquisition list's data is the length of time that
passes before old data is overwritten with new data. This time period is a
function of (1) the rate at which data is collected by the list, (2) how many
items are being read by the list, and (3) how many disk records the list is
allowed to have.
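The dependence on these three factors can be estimated with a short Python
sketch. It assumes that every record holds a full 120-point buffer (no
intermediate flushing) and that a list's records are shared evenly among its
items; neither assumption is stated in this note, and the function name is
illustrative.

```python
BUFFER_POINTS = 120  # data points per record, as stated in this note


def retention_seconds(max_records, items, interval_s):
    """Approximate time before a list's oldest data is overwritten.

    Assumes full buffers only and an even share of records per item.
    """
    records_per_item = max_records // items
    return records_per_item * BUFFER_POINTS * interval_s


# Hypothetical list: 7200 records, 360 items, one read per minute.
print(retention_seconds(7200, 360, 60) / 3600)  # 40.0 (hours)
```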
The data collected from all six data acquisition lists is written into the
same disk file. This file is organized as an indexed sequential file with
two keys.
The alternate key is the device name followed by the first timestamp
in the buffer. The alternate key may have one or more duplicates and
changes each time a particular
record in the file is overwritten with a full buffer of new data.
The primary key is the number of the data acquisition list (1 to 6)
followed by the record number for this list. The record number starts with
1 and increments until it is greater than the maximum allowed for the particular
list, at which time it is reset to 1.
Control parameters specify the maximum number of records for each list.
The primary key is unique.
Once a particular record is written to the logging file, its primary key never
changes.
The purpose of the primary key is to quickly locate and then overwrite
an earlier data record. Outputting a device
buffer to disk consists of finding the record having the primary key
contained in the buffer and then updating that record with the new data.
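The wrap-around of the record number can be sketched in Python as follows.
This is only an illustration of the behavior described above: the dictionary
stands in for the indexed file, and the function names are hypothetical.

```python
def next_record_number(current, max_records):
    """Increment the record number, resetting to 1 past the list's maximum."""
    return 1 if current >= max_records else current + 1


def write_buffer(log_file, list_number, record_number, buffer):
    """Overwrite the record whose primary key is (list, record number)."""
    log_file[(list_number, record_number)] = buffer


log_file = {}   # stand-in for the indexed file of collected data
record = 1
for buffer in ["b1", "b2", "b3", "b4"]:
    write_buffer(log_file, 2, record, buffer)
    record = next_record_number(record, max_records=3)

# The fourth buffer has overwritten record 1 of list 2.
print(log_file[(2, 1)])  # b4
```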
When the data collection process starts execution, it reads a disk file to determine
where records were last written into the file of collected data.
This is done so that new data will be written at the correct position in the
file, overwriting the oldest data collected by each list. The process
periodically updates this file with the reference number of the record last
written for each data acquisition list.
Performance
A data buffer contains 120 points. The time required to fill the buffer is
120 multiplied by the time between the acquisition of each data point. An RMS
record consists of a single buffer. If there are N items in a data
acquisition list which is being executed every S seconds, then N records must
be output to disk every (120 * S) seconds. Otherwise, some collected data
will not be written to the disk file. A maximum of six data acquisition lists
may be executed, so the total number of records output per second is
specified by the following, where N_i is the number of items on list i and
S_i is the execution interval of list i in seconds:
\[ R = \sum_{i=1}^{6} \frac{N_i}{120 \, S_i} \]
If each list contains the same maximum number N, then this expression
of the number of records output per second may be simplified to the following:
\[ R = \frac{N}{120} \sum_{i=1}^{6} \frac{1}{S_i} \]
The number of seconds to output a single record is specified by the following:
\[ T_{record} = \frac{1}{R} = \frac{120}{N \sum_{i=1}^{6} 1/S_i} \]
Writing a record to the file requires two RMS accesses, a FIND and then an
UPDATE. The number of seconds per RMS access is expressed by the following:
\[ T_{RMS} = \frac{T_{record}}{2} = \frac{60}{N \sum_{i=1}^{6} 1/S_i} \]
If the execution rates for the six lists are one second through six seconds
and each list contains 360 items, then each RMS access must be completed
within 68 milliseconds.
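The 68 millisecond figure can be checked with a few lines of Python. The
sketch uses only the relationships stated in this section (N records every
120 * S seconds per list, two RMS accesses per record); the function names
are illustrative.

```python
BUFFER_POINTS = 120          # data points per record
RMS_ACCESSES_PER_RECORD = 2  # a FIND followed by an UPDATE


def records_per_second(items_per_list, intervals_s):
    """Total records that must be written per second over all lists."""
    return sum(n / (BUFFER_POINTS * s)
               for n, s in zip(items_per_list, intervals_s))


def seconds_per_rms_access(items_per_list, intervals_s):
    """Time available for each RMS access if no data is to be lost."""
    rate = records_per_second(items_per_list, intervals_s)
    return 1.0 / (RMS_ACCESSES_PER_RECORD * rate)


items = [360] * 6                # 360 items on each of the six lists
intervals = [1, 2, 3, 4, 5, 6]   # execution intervals in seconds
print(round(seconds_per_rms_access(items, intervals) * 1000))  # 68
```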
The data collection process calculates various measurements of its
performance. One of these statistics is the average time required for an RMS
access to the file. This time is dependent upon the speed of the VAX CPU,
the physical amount of memory, and the amount of other activity on the system.
An average RMS access time of as low as 50 milliseconds has been observed.
Writing a record to the file requires a FIND operation and then an UPDATE
operation. If the RMS access time is 50 milliseconds,
then the minimum time required to write a record is 100 milliseconds.
If no intermediate flushing of buffers is being done, 10 records containing
120 data points each, or a total of 1200 data points, can be output to disk
each second. If the reading of all the requested properties of all the
devices on all six of the collection process's data acquisition lists is less
than or equal to 1200 data points per
second, the data collection process probably will be able to write to disk
all the collected data. Counts higher than 1200 may result in some
collected data occasionally not being recorded.
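A configuration can be compared against this limit with a small Python
sketch. It assumes full 120-point buffers and uses the observed 50 millisecond
RMS access time as an example input; the function names are illustrative.

```python
def capacity_points_per_second(rms_access_s, accesses_per_record=2,
                               buffer_points=120):
    """Data points per second that can be written to disk."""
    return buffer_points / (accesses_per_record * rms_access_s)


def demand_points_per_second(items_per_list, intervals_s):
    """Data points per second read by all data acquisition lists."""
    return sum(n / s for n, s in zip(items_per_list, intervals_s))


capacity = capacity_points_per_second(0.050)
demand = demand_points_per_second([360] * 6, [1, 2, 3, 4, 5, 6])
print(round(capacity), round(demand))  # 1200 882
print(demand <= capacity)              # True
```

In this example the six lists of 360 items generate about 882 points per
second, which is within the 1200 points per second that a 50 millisecond
access time allows.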
Input Files
Execution Parameters
The logical name DLG_PARAMS_`node' specifies the file containing the
specifications of (1) the file into which the collected data is to be written
and (2) the various data acquisition lists which are to be executed. `Node'
is the name of the VAX node on which the data collection process is executing.
It is anticipated that only a single data collection process will execute on
any one node. However, multiple collection processes are possible
if all the necessary logical names are correctly defined.
The parameter file is an ASCII text file which is read when the data collection
process starts its execution. A network command may be given to the collection
process to read this file at any time after the initial read. However, this is
only necessary if a new version of the file has been created using a text editor.
Each of the six data acquisition lists has its own set of parameters. The
format is the list number (1 to 6) followed by the parameter name followed by
the parameter value or values. The file contains the following information
for each data acquisition list:
- FTD - Frequency Time Descriptor specifying when data is
collected by the list. The format is interval, followed by event number,
followed by event type. If no event number or type is specified, the list
is created using the specified interval as the periodic rate. The list is
synchronized to the Tevatron clock if TEV is specified for the event type.
Otherwise, the list is synchronized to the phase reversal clock.
- MAX_ITEMS - The maximum number of items that may be in the list.
This is the total number of reading, setting, and status properties that can
be read, not the number of device names.
- FLUSH_COUNT - The number of new data points inserted into the buffer
after which the buffer will be written to disk.
The flush count is less than or equal to 120, the maximum number of
data points a buffer can contain. This
parameter results in a partially filled buffer being written to disk to
minimize the amount of data which may be lost in case of a system crash.
- MAX_BLOCKS - The number of records which may be written to disk for
this particular
data acquisition list. This parameter specifies the circular nature of the
data file. When this limit is reached, the record number for this list
is reset to one and this list's record with that record number in its primary
key is overwritten. If this value is too large, records will never be
overwritten and eventually disk space will be exhausted.
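A reader for the per-list parameters might look like the following Python
sketch. The note gives only the field order (list number, parameter name,
value or values), so the whitespace separation and the sample values below are
assumptions, and the function name is hypothetical.

```python
def parse_list_parameters(text):
    """Return {list number: {parameter name: [values]}} for per-list lines."""
    params = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip blank lines and lines not in the per-list form
        list_number = int(fields[0])
        if not 1 <= list_number <= 6:
            raise ValueError("list number must be 1 to 6: " + line)
        params.setdefault(list_number, {})[fields[1].upper()] = fields[2:]
    return params


# Invented sample; the real file layout may differ.
sample = """\
1 FTD 60
1 MAX_ITEMS 360
1 FLUSH_COUNT 10
1 MAX_BLOCKS 3600
"""
print(parse_list_parameters(sample)[1]["FLUSH_COUNT"])  # ['10']
```

Values are kept as lists of strings because an FTD may carry an interval, an
event number, and an event type.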
The text file may also contain the following optional parameters:
- FILE_OUTPUT - If this parameter is set to 1, the collection process
writes the collected data into the associated disk file. This is the default
case.
- NETWORK_OUTPUT - If this parameter is set to 1, the collection process
writes the collected data to a data logging file server.
The logical name DLG_SERVER_`node' specifies the node on which the file
server is executing and the
logical name DLG_SRV_OBJ_`node' specifies the server's network object name.
`Node' is the name of the VAX node on which the data collection process is
executing. Though possible, output to a file server is not expected to be
done if the collection process is writing the data to its own disk file.
- FILE_SERVER - If this parameter is set to 1, the data logging process
functions as a file server. It will not execute any data acquisition lists.
It only receives records of collected data from data logging processes and
then writes them into the logging file. The process could be configured
to serve both as a file server and as a collector of data, but it is not
anticipated that this will be necessary.
- CIRCULAR - The file of collected data is normally treated as a circular
file which
sooner or later overwrites itself. This default case is indicated by this
parameter having a value of 1. If the parameter has a value of 0,
when one of the data acquisition sections is filled, the file is closed and
a new file is opened with the next higher version number.
- MULTIPLE - If this parameter is set to 1, a new version of the file of
logged data is created each midnight.
- SNS_LIMIT - The maximum number of entries which may be put on the queue
of inputs received over the network.
Current Record Numbers
The file of collected data is considered to consist of six sections, one
for each of the six data acquisition lists. Each record has a key which
specifies the number of the list and a record number for this list.
The data collection process periodically writes to disk the current record
number for each data acquisition list. The logical name DLG_BLOCKS_`node'
specifies the file containing the current record numbers. The format of the
file is binary with only the first six words of the file being used.
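Reading the six record numbers could be sketched in Python as shown below.
The note does not give the word size or byte order; the sketch assumes
unsigned 16-bit little-endian words (the VAX definition of a word) and leaves
the format as a parameter in case the file actually uses longwords.

```python
import struct


def read_record_numbers(raw, word_format="<6H"):
    """Decode the first six words of the file into {list number: record}."""
    size = struct.calcsize(word_format)
    if len(raw) < size:
        raise ValueError("file is too short to hold six words")
    values = struct.unpack(word_format, raw[:size])
    return {n: v for n, v in enumerate(values, start=1)}


# Invented contents: six words followed by unused bytes.
raw = struct.pack("<6H", 10, 20, 30, 40, 50, 60) + b"\x00" * 4
print(read_record_numbers(raw)[3])  # 30
```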
Device Names and Properties
The logical name DLG_LIST`n'_`node' specifies a file containing the device
names and properties which are to be read. `N' is the number of the
corresponding data acquisition list (1 to 6) and
`Node' is the name of the VAX node on which the collection process is
executing. A device file is an ASCII file which is created and modified by the
data logging editing process. A record in one of these files consists of the
name of the device followed by one character field for each of the
reading, setting, and status properties, in that order from left to right.
A period in one of these character fields indicates the property does
not exist. An underscore indicates the property exists but is not to be
logged. An asterisk indicates the property is to be logged. Any additional
information a record may contain is not used by the collection process.
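The record format can be illustrated with a short Python parser. It assumes
the device name and the three one-character fields are separated by
whitespace, which this note does not specify; the device name in the example
is invented and the function names are hypothetical.

```python
PROPERTIES = ("reading", "setting", "status")  # left-to-right field order
FLAGS = {".": "absent", "_": "not logged", "*": "logged"}


def parse_device_record(line):
    """Return (device name, {property: state}) for one record."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected a device name and three property fields")
    name, flags = fields[0], fields[1:4]  # any further fields are ignored
    return name, {prop: FLAGS[flag] for prop, flag in zip(PROPERTIES, flags)}


def logged_properties(line):
    """List the properties of a record that are marked to be logged."""
    _, states = parse_device_record(line)
    return [prop for prop in PROPERTIES if states[prop] == "logged"]


print(logged_properties("DEVICE_A * _ ."))  # ['reading']
```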
Output Files
The two primary data acquisition output files contain (1) the collected data
and (2) the current record numbers.
The logical name DLG_COLLECTED_`node' specifies the file containing the
data read by the data collection process, where `node' is the name of the VAX
node on which the process is executing.
The logical name DLG_BLOCKS_`node'
specifies the file into which the current record numbers of all the data
acquisition lists are periodically written.
Each time the collection process restarts
a data acquisition list, it writes a text file whose name has the format
DLG_LIST`n'_`node'.LST, where `n' is the number of the list and `node' is the
name of the VAX node on which the process is executing. A record in this file
contains the name of the device, an indication of the property, the name of
the node, the number of the list, the number of the item in the list, and the
time at which the list was restarted. The property is indicated by a T for
status, an E for setting, and a blank for reading. This same information
along with the FTD of the list is
appended to the end of a history file maintained by the collection process.
The format of the name of the history file is `node'.HIS.
General diagnostic information is written to a file whose name has the format
LOGGING_`node'.OUT. Severe error messages are output to a file whose name has
the format LOGGING_`node'.ERR.
Logical Names
- DLG_LOGGERS - Specifies all the VAX nodes on which data
collection processes are executing and which are accessible to the requesting
user. This logical is used by the data logging extraction process and the
data logging editing process to determine which logging files may be accessed.
- DLG_NET_OBJ_`node' - For the purpose of enabling network communication,
the data collection
process declares itself to be a network object. The process translates this
logical to get its network object name.
- DLG_NODE_`node' - Specifies where the default data
collection process to be associated with `node' is executing.
- DLG_BLOCKS_`node' - current record numbers of the lists
- DLG_COLLECTED_`node' - collected data
- DLG_PARAMS_`node' - FTD and size specifications
- DLG_LIST`n'_`node' - device and property names
- DLG_SERVER_`node' - Specifies the node on which the associated file
server process is executing. This logical name is only used if the
NETWORK_OUTPUT parameter was set to one.
- DLG_SRV_OBJ_`node' - Specifies the network object name of the
associated file server process. This logical name is only used if the
NETWORK_OUTPUT parameter was set to one.
Network Command Input
As part of its initialization, the data logging collection process declares
itself to be a network object so that it may receive commands over the
network. The command most commonly received is RESTART of a specific data
acquisition list. This command is sent by the data logging editing process
upon request of the user after having modified the devices to be read by the
data acquisition list. Other commands exist but are for system and diagnostic
use.
- Restart # (R) - restart the identified data acquisition list. A number
of 0 to 6 may be specified. If 0 is used, all the lists are restarted.
If the list is executing, it is first canceled. The associated disk file
is input, the necessary data structures are built, and then the data
acquisition list is started.
- Stop # (S) - stop the identified data acquisition list. A number
of 0 to 6 may be specified. If 0 is used, all the lists are stopped.
If the list is executing, it is canceled.
- Flush # (F) - queue to be output to disk all the data buffers currently
allocated for the devices associated with the specified data acquisition
list.
- Parameters (P) - read the file of parameters associated with the
specified
data acquisition list. This command enables system management personnel to
modify a parameter file using a text editor and then have the data collection
process read the updated file without having to restart the process. However,
any change in a data acquisition rate will not take effect until the next time
the list is restarted.
- Diagnostic statistics (D) - write to the diagnostic disk file all
performance
statistics. Also return this information over the network to the requesting
process.
- Exit (E) - cancel all data acquisition lists, flush all data buffers to
disk, and then exit.
- Quit (Q) - exit immediately without trying to do any cleanup.
- New output file (N) - close the logging file into which the collected data
is being written and create a new output file.
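The command set above could be decoded along the following lines. The format
of the message sent over the network is not described in this note, so the
sketch assumes a text command whose first letter selects the action, followed
by a list number for Restart, Stop, and Flush. Accepting 0 only for Restart
and Stop is also an assumption, since the note states the "all lists" meaning
of 0 only for those two commands.

```python
COMMANDS = {
    "R": "restart", "S": "stop", "F": "flush", "P": "parameters",
    "D": "diagnostics", "E": "exit", "Q": "quit", "N": "new_output_file",
}
# Commands that take a list number, with the accepted range (assumed).
LIST_NUMBER_RANGE = {"R": (0, 6), "S": (0, 6), "F": (1, 6)}


def parse_command(text):
    """Return (action, list number or None) for one command string."""
    fields = text.split()
    if not fields:
        raise ValueError("empty command")
    letter = fields[0][0].upper()
    if letter not in COMMANDS:
        raise ValueError("unknown command: " + text)
    if letter not in LIST_NUMBER_RANGE:
        return COMMANDS[letter], None
    if len(fields) < 2 or not fields[1].isdigit():
        raise ValueError("command requires a list number: " + text)
    number = int(fields[1])
    low, high = LIST_NUMBER_RANGE[letter]
    if not low <= number <= high:
        raise ValueError("list number out of range: " + text)
    return COMMANDS[letter], number


print(parse_command("R 3"))  # ('restart', 3)
print(parse_command("Q"))    # ('quit', None)
```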
rwest@fsus04.fnal.gov