OOOR90101C
-------
-------
Automated Laboratory Standards:
RESULTS FROM THE SURVEY OF CURRENT TECHNOLOGY
FOR AUTOMATED LABORATORIES
Prepared for:
Office of Information Resources Management
U.S. Environmental Protection Agency
Research Triangle Park, North Carolina 27711
June 15, 1990
Prepared by:
BOOZ-ALLEN & HAMILTON Inc.
4330 East-West Highway
Bethesda, Maryland 20814
(301) 951-2200
Contract No. 68-W9-0037
Computer Sciences Corporation
79 T.W. Alexander Dr.
Research Triangle Park, North Carolina 27709
(919) 541-9287
Contract No. 68-01-7365
-------
Acknowledgments
This report reflects the combined efforts of Computer Sciences
Corporation, Booz-Allen & Hamilton Inc., EPA staff, and outside experts.
Richard Trilling, Will Harrelson, and Trevor Elliott of CSC researched and
prepared the draft for public review. Marguerite Jones, Ronald Ross, and
Lynn Eberhardt of Booz-Allen evaluated the comments and completed this
final report. Numerous EPA staff and outside experts provided substantial
critical reviews and valuable technical comments. Richard Johnson of the
Scientific Systems Staff of EPA's Office of Information Resources
Management directed the contractors' work and managed the review process.
-------
Table of Contents
Executive Summary
Background
    Exhibit 1 - Need for EPA's Automated Laboratory Standards Program
    Exhibit 2 - Considerations in Developing Automated Laboratory Standards
Review of the Literature
Survey of LIMS Vendors
    Development and Administration
    Results of the Survey
Other Sources of New Technology
Conclusions
Glossary
References
Appendix A: Survey Questionnaire
Appendix B: Summary of Results
-------
Executive Summary
The U.S. Environmental Protection Agency (EPA) has initiated a
program to ensure the integrity of computer-resident data in laboratories
performing analyses in support of EPA programs by developing standards for
automated laboratory processes. The activities of these environmental
programs are diverse, and include basic research at EPA's environmental
research centers, environmental sample analyses at EPA's regional
laboratories and contractors' laboratories, and product registration relying on
analytical data submitted by the private sector.
This report investigates the availability of current automated
technology that will provide adequate assurance that computer-resident data
will be reliable. Several vendors of laboratory automation and laboratory
information management systems (LIMS) were surveyed to determine if
standards and controls are available that will ensure the reliability and
validity of the data generated. Additionally, an extensive search of the
literature did not reveal any hardware or software currently on the market
that will guarantee the integrity of the data produced.
Vendors already offer a variety of control techniques such as audit
trails and password protection, and provide customizable systems to meet the
varying needs of each of their customers. Most vendors rely on existing
control features (e.g., password protection and system backup) provided by
operating systems rather than duplicating them.
Of the technological advances identified, the following can be
considered in developing standards for automated laboratories:
• Magnetic ink character recognition (MICR), which permits
characters in labels to be read by magnetic scanners when written
in standard format and standard location
• Optical scanning, which permits the recognition of patterns of
ink, such as those used in bar codes (universal product codes)
• "Smart cards," or credit cards that communicate with a remote
computer from a sample analysis station via an embedded
processing chip.
These technologies can be tailored to the laboratory environment to assist in
data management operations.
-------
Background
The U.S. Environmental Protection Agency (EPA) has initiated a
program to ensure the integrity of computer-resident data in laboratories
performing analyses in support of EPA programs by developing standards for
automated laboratory processes. The possession of sound technical data
provides a fundamental resource for EPA's mission to protect the public
health and environment, regardless of the activities of the specific
environmental programs. The activities of these environmental programs
are diverse, and include basic research at EPA's environmental research
centers, environmental sample analyses at EPA's regional laboratories and
contractors' laboratories, and product registration relying on analytical data
submitted by the private sector.
EPA recognizes that the implementation of an automated laboratory
standards program will require each laboratory to allocate money and time
for the program's execution. Although some may consider this program too
expensive, laboratory managers must consider that by developing and using a
proper standards program, they will achieve a net savings because
information processes will not have to be repeated and expensive mistakes
can be avoided.
Within EPA, the Office of Information Resources Management (OIRM)
has assumed the objective of establishing an automated laboratory standards
program. The need for this program is evidenced by several factors. Exhibit 1
illustrates these factors, which include the rising use of computerized operations
by laboratories, the lack of uniform standards developed or accepted by EPA,
evidence of problems associated with computer-resident data, and the
evolving needs of EPA auditors and inspectors for guidance in evaluating
automated laboratory operations.
Laboratories collecting data for EPA's programs have taken advantage
of advancing technology to streamline the analytical processes. Initially,
automated instrumentation entered the laboratories to increase productivity
and enhance the accuracy of reported results. Then, computers maintaining
data bases of results were used for data management and tracking. Computer
systems were then integrated into more sophisticated laboratory information
management systems (LIMS). Each of these advances necessitates thorough
quality control procedures for data generation, storage, and retrieval to ensure
the integrity of computer-resident data.
[Exhibit 1 - Need for EPA's Automated Laboratory Standards Program: figure not reproduced]
Currently, EPA has no Agency-wide guidelines that laboratories
collecting and evaluating computer-resident data must follow. The
requirements that must be considered in developing automated laboratory
standards come from a variety of sources, as Exhibit 2 illustrates, including
the requirements of the Computer Security Act of 1987 (P.L. 100-235, January 8,
1988) and various EPA program-specific data collection requirements under
Superfund, the Resource Conservation and Recovery Act, the Clean Water
Act, and the Safe Drinking Water Act, among others. Additionally, OIRM has
developed electronic transmission standards and is developing a strategy for
electronic record keeping and electronic reporting standards that will affect
all Agency activities. The development of uniform principles for
automated data in EPA laboratories, regardless of program, will take into
account the common elements of all these data collection activities, and
provide a minimum standard that each laboratory should achieve.
There is increasing evidence of problems associated with the collection
and use of computer-resident laboratory data supporting various EPA
programs. To illustrate, as of November 1989, EPA's Office of the Inspector
General was investigating between 10 and 12 laboratories in Superfund's
Contract Laboratory Program (CLP) for a variety of allegations, including
"time traveling" and instrument calibration violations. In "time traveling,"
sample testing dates are manipulated, by either adjusting the internal clock of
the instrumentation performing the analyses or manipulating the resultant
computer-resident data. (Hazardous waste samples must be assayed within a
prescribed time period or the results may be compromised.) Additionally,
calibration standard results have allegedly been electronically manipulated
and other calibration results substituted when the actual results did not meet
the range specifications of the CLP procedure being followed. If true, these
allegations may be treated as felonies.
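The two conditions described above, a prescribed analysis window and a clock that should only move forward, can be expressed as simple automated checks. The following is a minimal sketch, not an EPA or CLP procedure: the 14-day limit, the function names, and the sample dates are all hypothetical, and the second check assumes the log is stored in the order the runs were acquired.

```python
from datetime import datetime, timedelta

# Hypothetical holding time; real limits depend on the method and program.
HOLDING_TIME = timedelta(days=14)


def within_holding_time(collected, analyzed, limit=HOLDING_TIME):
    """Return True if the analysis falls inside the allowed window."""
    return timedelta(0) <= analyzed - collected <= limit


def clock_setbacks(timestamps):
    """Return indexes of log entries stamped earlier than the entry before them.

    In a log written in acquisition order, such an entry suggests the
    instrument clock was set back between the two runs.
    """
    return [i for i in range(1, len(timestamps))
            if timestamps[i] < timestamps[i - 1]]


# Illustrative data only.
collected = datetime(1989, 11, 1, 9, 0)
run_log = [
    datetime(1989, 11, 20, 10, 0),
    datetime(1989, 11, 20, 11, 0),
    datetime(1989, 11, 10, 11, 30),  # stamped ten days before the prior run
]
late = [t for t in run_log if not within_holding_time(collected, t)]
suspect = clock_setbacks(run_log)
```

A check of this kind can only flag records for review; it cannot by itself establish whether a date was altered deliberately.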
[Exhibit 2 - Considerations in Developing Automated Laboratory Standards: figure not reproduced]
Because the introduction of automation is relatively new and still
evolving, no definitive guidelines for EPA auditors and inspectors have been
developed. Inspectors must be alert to the steps in those procedures used by
laboratories generating and using computer-resident data where the greatest
risk exists. These critical control points determine how much control
should be placed on each step of the process. If adequate controls are not
present at such a point, the remainder of the process cannot correct a
deviation, and the entire process will provide no reliable conclusions.
Automation introduces many new variables into a system, each with its own
set of critical process points. Inspectors must verify that laboratory
management has recognized the various risks and has instituted an
appropriate risk management program.
As part of the EPA's program to ensure the integrity of computer-
resident laboratory data, the Agency is investigating what automated data
processing (ADP) systems exist, what controls and standards are feasible, and
how vendors have identified and/or developed the standards they
implement. Particularly important is whether there have been recent
technological advances of devices or subsystems that provide full assurance of
integrity for computer-resident data.
To investigate these issues, the following activities were performed:
1) Reviewed professional journals to identify articles introducing
or describing such advances
2) Developed and administered a survey to five (5) LIMS vendors
to determine what data integrity control features these
vendors' products provide
3) Conducted telephone interviews with representatives of two
major laboratory instrumentation manufacturers to obtain
information pertaining to the flexibility of the laboratory
instrumentation currently available
4) Explored new technologies used in the banking, retail, and
manufacturing industries that have the potential to enhance
data integrity in the laboratory environment.
This report complements our earlier report that reviews data integrity
data processing standards in automated financial systems (OIRM, 1989b).
These standards include:
• Use of logon/password security
• Data entry verification
• Flagging of changes made and retention of both original and
altered data (audit trails)
• Protecting reliability of data by prohibiting the same person from
both authorizing and allocating funds
• Maintaining hard-copy data outputs.
That report concludes that the financial auditing discipline offers reasonable
levels of assurance of integrity of computer-resident data and recommends
consideration of certain standards used in automated financial systems in
developing standards for automated chemistry laboratories.
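One of the standards listed above, flagging changes while retaining both the original and the altered data, can be illustrated for a laboratory record. The sketch below is a hedged illustration only; the class names, field names, and values are assumptions and do not describe any vendor's system or the approach taken in the earlier report.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One recorded change: who changed which field, from what, to what."""
    user: str
    field_name: str
    old_value: object
    new_value: object
    changed_at: datetime


@dataclass
class SampleRecord:
    """A hypothetical sample record that keeps its own audit trail."""
    sample_id: str
    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set_value(self, user, field_name, new_value):
        """Change a value and append the original and altered data to the trail."""
        old_value = self.values.get(field_name)
        self.audit_trail.append(
            AuditEntry(user, field_name, old_value, new_value,
                       datetime.now(timezone.utc)))
        self.values[field_name] = new_value


# Illustrative use: a second analyst corrects a result.
record = SampleRecord("S-001")
record.set_value("analyst1", "lead_ug_per_l", 12.4)
record.set_value("analyst2", "lead_ug_per_l", 12.9)  # original value stays in the trail
```

Because every change appends an entry rather than overwriting one, a reviewer can see that the reported value was altered, by whom, and what it replaced.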
-------
Review of the Literature
Twelve months (fall 1988 to fall 1989) of professional journals that deal
with laboratory science, laboratory automation, and laboratory information
management were reviewed. These journals include Analytical Chemistry,
American Laboratory, Chemical Industry, Laboratory Practice, and Science.
In this review, searches were made for articles on a variety of topics,
including laboratory automation, laboratory information management,
scientific computing, and related topics. Four articles were found. These
articles were Sandowski, C., and G. Lawler, "A Relational Data Base
Management System for LIMS," American Laboratory 21:3 (March 1989),
pp. 70-79; Merrer, Robert J., and Peter G. Berthrong, "Academic LIMS:
Concept and Practice," American Laboratory 21:3 (March 1989), pp. 36-45;
Megargle, Robert, "Laboratory Information Management Systems," Analytical
Chemistry, 61:9 (May 1989), pp. 612A-621A; and Anon., "Products -
Information Management," Laboratory Practice, 38:5 (May 1989), pp. 87-91.
Library on-line search tools were used at a major university library for
these topic areas and related key words, and no references ("hits") were found.
A similar search using EPA's Online Library System (OLS) was performed,
which included not only articles from professional journals but also all EPA
items registered with the National Technical Information Service (NTIS). In
this search, two references were identified: Dessy, Raymond E., The
Electronic Laboratory (Washington, D.C.: American Chemical Society), and
McDowall, R.D., ed., Laboratory Information Management Systems
(Wilmslow, U.K.: Sigma Press), 1987. The publication by McDowall (1987)
contains two LIMS articles of interest: Mattes, D.C., 1987, "LIMS and Good
Laboratory Practice," and Brown, Elizabeth H., 1987, "Procedures and their
Documentation for a LIMS in a Regulated Environment."
It was concluded from the review of these searches that laboratory
automation and laboratory information management are not yet common
topics, and probably not yet part of the mainstream of laboratory literature.
Further, no existing laboratory standards were identified by the literature
search.
The journals were also reviewed for advertisements for laboratory
automation and/or laboratory information management systems.
Advertisements from the following vendors were found:
PE Nelson
CI Beckman
Varian Associates, Inc.
FIAtron Laboratory Systems
Axiom Systems, Inc.
Laboratory MicroSystems, Inc.
Radian Corporation
Advanced Systems Management, Inc.
VG Instruments, Inc.
Harley Systems, Inc.
It was also known that Hewlett Packard and Digital Equipment
Corporation offer LIMS products. This substantial number shows that even if
the laboratory automation/laboratory information management system topic
is not heavily discussed in professional periodicals, vendors nevertheless
have found a market.
-------
Survey of LIMS Vendors
Development and Administration
The purpose of the survey was to identify LIMS that provide
reasonably high levels of assurance of data integrity. Consequently, the items
included in the questionnaire deal with a variety of ADP controls that reduce
the risk of threats to data integrity.
The survey elicited information about system documentation; security;
data integrity; data reduction and analysis; and backup, archiving, and
recovery. The full questionnaire appears in Appendix A.
In developing the questionnaire, the following sources for topics and
for questionnaire items were consulted:
• A checklist of ADP audit features already being used by EPA in
laboratory site visits to determine which such features are in
place or feasible (OIRM, 1989a)
• Standard systems analysis and design techniques (OIRM, 1987)
• EPA LIMS functional specifications (OIRM, 1988).
The survey was administered by telephone to the following vendors:
CI Beckman
Varian Associates, Inc.
VG Instruments, Inc.
Hewlett Packard
PE Nelson
The only significant problem encountered with survey administration
was that vendors' products were typically highly customizable and therefore
not easily characterized by a structured survey.
-------
Results of the Survey
MAJOR FINDINGS
There are four major findings from the survey:
• Vendors offer a variety of features that can be customized to
provide assurance of data integrity, such as passwords and
records of data changes.
• System vendors offer system-specific data integrity features;
however, there is no required standard set of data integrity
features.
• There is no "magic box" or technological advance to guarantee
absolute data integrity.
• Most vendors rely on existing control features (e.g., password
protection and system backup) provided by operating systems
rather than duplicating them.
The discussion of the findings from the survey is supplemented with
information obtained from telephone conversations with representatives
from two LIMS vendors. The findings are discussed in more detail below.
Customizable Products
Vendors offer a relatively extensive variety of control techniques for
ensuring data integrity. Vendors do not rely on or reference a set of standards
in deciding which feature to deliver. Vendors respond instead to requests
from their customers and deliver a system that provides the data integrity
features specified by their customers.
No Standard Configuration
Presently, the manufacturers of automated laboratory systems design
their instrumentation to incorporate their individual data integrity controls.
With so many manufacturers of this equipment in the marketplace, it stands
to reason that without a universal standard to adhere to, there exists no
standard configuration of data integrity controls.
No "Magic Box"
The vendors surveyed have not made advances in hardware or
software that would guarantee full data integrity. It was thought that vendors
might be making use of optical disk technology, write once/read many
(WORM), or a similar, highly controlled method to minimize risk to data
integrity. Vendors surveyed are not incorporating this type of technology in
their systems. Reasonable levels of data integrity can be achieved through
traditional controls such as re-keying for data entry verification, logon and
password security, and using an audit trail to implement a chain of custody.
After answering all the survey questions, vendors were asked to mention any
additional features in their systems designed for or useful in ensuring
data integrity. Vendors did not identify any additional methods of ensuring
data integrity. It can therefore be assumed that no such methods were
overlooked during the design of the survey.
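Of the traditional controls named above, re-keying for data entry verification is the simplest to illustrate: the same form is keyed twice, independently, and any field on which the two entries disagree is held for resolution. The sketch below assumes each keying is available as a dictionary of field names and entered strings; the function name and the sample values are hypothetical.

```python
def verify_rekeyed(first_pass, second_pass):
    """Compare two independent keyings of the same form.

    Returns a dict of field -> (first, second) for every field whose two
    entries differ or that is missing from one pass.
    """
    mismatches = {}
    for name in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(name)
        b = second_pass.get(name)
        if a != b:
            mismatches[name] = (a, b)
    return mismatches


# Illustrative data: the second operator keyed a different pH value.
first = {"sample_id": "S-001", "ph": "7.2", "analyst": "RT"}
second = {"sample_id": "S-001", "ph": "7.7", "analyst": "RT"}
mismatches = verify_rekeyed(first, second)
```

The check does not decide which entry is correct; it only prevents a record from being accepted until someone consults the source document.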
Confidence is high that the laboratory automation commercial market
in general has not made technological advances in hardware or software that
would guarantee full data integrity.
Vendors Rely on Operating Systems
Vendors will typically rely on existing control features of the operating
system rather than duplicate those control systems. Password access control,
for instance, usually consists of whatever the operating system provides.
Backup typically consists of whatever frequency and medium the operating
system provides.
-------
SUMMARY OF RESULTS
Appendix B presents detailed results of the survey. The systems are
generally capable of doing whatever the customer needs. Even if a system
does not offer a particular option as a feature, it is usually flexible enough to
allow the customer or vendor to customize the system in order to provide
that option through programming or third-party software.
-------
Other Sources of New Technology
The laboratory automation/LIMS vendors do not seem to use new
technology for the laboratory environment. In the review of other
automated systems, however, a number of technological advances were
identified.
To illustrate, banking, retailing, and manufacturing have witnessed
technological advances that have potential for the laboratory environment.
At least three such advances were identified and are discussed below:
• Magnetic ink character recognition (MICR)
• Optical scanning and bar codes (universal product codes)
• Magnetic cards
• "Smart cards."
Technological advances have been made in magnetic ink and in
sensitive scanners capable of MICR. Using this ink and MICR, scannable
checks can be processed by machine because the banking industry has adopted
a standard format for labeling checks with bank and account information.
The magnetic ink, written onto checks in a standard format and standard
location, can be read by most new scanners.
Carrying the scanning technology further, optical scanners can read
merchandise codes that have been written with ink and that not only conform
to a standard location and format but also use a universal product code
that has been developed so that merchandise can be labeled unambiguously.
(Universal product codes are most familiar as "bar codes" on grocery products.
They are read by registers that are really terminals connected to central
processing units; these terminals read the codes from merchandise, keep
running totals of the individual's bill, and may even perform automatic
inventory control and reporting.)
This technology could be adopted for the laboratory environment as a
method of labeling and reading samples that enter a laboratory. Labels could
be affixed to sample containers during sample processing. Scanners would be
installed at every station in the laboratory at which sample identification
information is important. The scanner would read the sample identification
from the physical sample container and pass that information to software.
Software would compare the sample identification information read from the
container with that entered at sample receiving time in order to verify that
results information was being attributed to the proper sample.
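The comparison step described above can be sketched in software. The example below makes two assumptions that are not part of this report: that sample labels use the 12-digit UPC-A format, whose final digit is a check digit computed from the first eleven, and that the identifiers logged at sample receiving are held in a simple lookup table. A laboratory would more likely define its own label scheme; the function names and the receiving entry are hypothetical.

```python
def upc_check_digit(first_eleven):
    """Compute the UPC-A check digit for an 11-digit string."""
    if (len(first_eleven) != 11 or not first_eleven.isascii()
            or not first_eleven.isdigit()):
        raise ValueError("expected 11 digits")
    digits = [int(ch) for ch in first_eleven]
    # Digits in odd positions (1st, 3rd, ...) are weighted by three.
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


def is_valid_upc(code):
    """True if a 12-digit code carries the correct check digit."""
    return (len(code) == 12 and code.isascii() and code.isdigit()
            and upc_check_digit(code[:11]) == int(code[11]))


def verify_scan(scanned_code, station, received):
    """Check a scanned label against the IDs logged at sample receiving."""
    if not is_valid_upc(scanned_code):
        return f"{station}: misread label {scanned_code!r}; rescan"
    if scanned_code not in received:
        return f"{station}: {scanned_code} was never logged at receiving"
    return f"{station}: {scanned_code} matches {received[scanned_code]}"


# Illustrative receiving log with one sample.
received = {"036000291452": "groundwater, site A"}
message = verify_scan("036000291452", "GC-1", received)
```

The check digit catches most single-digit misreads at the scanner, while the lookup catches a correctly read label that does not belong to any sample the laboratory received.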
Magnetic stripes on credit cards provide a bank with information about
an individual's account. Additionally, some methods of transportation
(notably, the Metro subway system in Washington, DC) use magnetic stripes
to record fare information that can be linked to distance and time of day. In
some implementations, "smart cards" -- credit cards with an embedded
processing chip, as well as the traditional magnetic stripe -- can communicate
with and provide additional information to the host computer in a number
of applications.
Card-assisted ADP in the laboratory might work in the following
manner: a physical sample would move through the laboratory and its
identification information would be checked at each station, as described above.
At one or more of these stations, an authorized individual (perhaps a
laboratory director or principal investigator) might insert a magnetic ("mag")
card that would authorize posting of sample information to the data base and
would retrieve from the data base information required for the next posting
(the result of an analysis or the status of the experiment). Without the
intervention of the mag card, the information from the sample could not be
posted to the data base.
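The card-authorized posting described above amounts to a rule that the data base refuses any posting not accompanied by a recognized card. The following is a minimal sketch of that rule under stated assumptions: the class names, the card identifier, and the in-memory list standing in for the data base are all hypothetical, and reading the physical card is outside its scope.

```python
class PostingRefused(Exception):
    """Raised when a result is submitted without an authorizing card."""


class ResultsDatabase:
    """A stand-in for the laboratory data base (in-memory, illustrative)."""

    def __init__(self, authorized_cards):
        # Maps card ID -> cardholder; both are hypothetical values.
        self._authorized = dict(authorized_cards)
        self._postings = []

    def post(self, sample_id, result, card_id=None):
        """Post a result only if an authorized card accompanies it."""
        holder = self._authorized.get(card_id)
        if holder is None:
            raise PostingRefused(
                f"no authorized card presented for {sample_id}")
        self._postings.append((sample_id, result, holder))
        return holder

    def history(self, sample_id):
        """Return earlier postings for a sample, for use at the next station."""
        return [p for p in self._postings if p[0] == sample_id]


# Illustrative use: one posting with a card, one attempted without.
db = ResultsDatabase({"CARD-17": "lab director"})
holder = db.post("S-001", "extraction complete", card_id="CARD-17")
try:
    db.post("S-001", "pH 7.2")
    refused = False
except PostingRefused:
    refused = True
```

Recording the cardholder with each posting also ties the authorization to the audit trail, so a reviewer can see who approved each step.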
Additionally, a smart card containing stored information on a sample,
or on a number of similar samples being run together, could be inserted into
an instrument to record the results of the sample analyses. Smart cards can be
pre-formatted to receive data in any configuration, such as tabular, and are
ideal for transmitting data from remote instrumentation to a central data
management system. Smart cards can be erased and reformatted for use with
a new sample or set of samples, thereby making them more cost effective.
In general, automation technology that uses standards and is
implemented adequately performs its tasks more reliably and perhaps more
cost effectively than could be true of manual performance of the same tasks.
Therefore, the technology described above has significant implications for the
laboratory environment.
-------
Conclusions
Automation technology that uses standards and is implemented
adequately performs its tasks more reliably and perhaps more cost effectively
than could be true of manual performance of the same tasks. After reviewing
available literature and surveying various LIMS vendors, it was determined
that laboratory automation and LIMS commercial vendors have not
developed a standard set of controls that provide full assurance of the
integrity of computer-resident data. Vendors typically deliver systems
customized to fit the specifications of their customers, but there are no
standards that define the default, baseline system each vendor delivers.
These vendors have not made hardware or software advances that guarantee
data integrity. Standards for laboratory automation would provide a
common denominator for software design and other technological advances.
Technological devices developed for a variety of fields have the
potential to be applicable for use in the laboratory setting, but these devices
have had little acceptance in this environment. These devices include the
following:
• Magnetic ink character recognition (MICR)
• Optical scanning and bar codes
• Magnetic cards
• "Smart cards."
Universal product codes (bar codes) have been used for sample identification
in a few laboratories, and acceptance of that technology may be increasing.
It is worth noting, however, that the technological advances in the
banking, retailing, and manufacturing industries can be used only because
each industry has developed standards for use of the technology. The
technology for reading magnetic ink from checks works only because the
banking industry has developed a standard format and a standard location for
writing information onto the checks. Similarly, the retailing and
manufacturing industries have developed standards for the format of
universal product codes.
The results of the survey of five LIMS vendors indicate that the
vendors are not currently standardizing their systems and technology. Until
the time that the vendors voluntarily work in concert or are provided with a
set of standards from outside sources, little progress can be made in
incorporating these techniques into the analytical chemistry laboratories of
concern to EPA.
By tailoring existing technologies to the laboratory setting and by
setting standards for operation of automated equipment, laboratory processes
can produce data with increased efficiency and integrity.
-------
Automated Laboratory Standards Program
GLOSSARY
Application controls - one of the two sets or types of controls recognized by
the auditing discipline. They are specific for each application and include
items such as data entry verification procedures (for instance, re-keying all
input); data base recovery and roll back procedures that permit the data base
administrator to recreate any desired state of the data base; audit trails that not
only assist the data base administrator in recreating any desired state of the
data base, but also provide documentary evidence of a chain of custody for
data; and use of automated reconciliation transactions that verify the final
data base results against the results as reconstructed through the audit trail.
Application software - a program developed, adapted, or tailored to the
specific user requirements for the purpose of data collection, data
manipulation, data output, or data archiving [Drug Information Association].
Audit trail - records of transactions that collectively provide documentary
evidence of processing, used to trace from original transactions forward to
related records and reports or backwards from records and reports to source
transactions. This series of records documents the origination and flow of
transactions processed through a system [Datapro]. Also, a chronological
record of system activities that is sufficient to enable the reconstruction,
reviewing, and examination of the sequence of environments and activities
surrounding or leading to an operation, a procedure, or an event in a
transaction from its inception to final results [NCSC-TG-004].
Auditing - (1) the process of establishing that prescribed procedures and
protocols have been followed; (2) a technique applied during or at the end of a
process to assess the acceptability of the product. [Drug Information
Association]; (3) a function used by management to assess the adequacy of
control [Perry]. That is, auditing is the set of processes that evaluate how well
controls ensure data integrity. As a financial example, auditing would
include those activities that review whether deposits have been attributed to
the proper accounts; for example, providing an individual with a hard-copy
record of the transaction at the time of deposit and sending the individual a
monthly statement that lists all transactions.
Automated laboratory data processing - calculation, manipulation, and
reporting of analytical results using computer-resident data, in either a LIMS
or a personal computer.
Availability - see "data availability."
Back-up - provisions made for the recovery of data files or software, for restart
of processing, or for use of alternative computer equipment after a system
failure or disaster [Drug Information Association].
Change control - ongoing evaluation of system operations and changes
during the production use of a system, to determine when and if repetition of
a validation process or a specific portion of it is necessary. This includes both
the ongoing, documented evaluation, plus any validation testing necessary to
maintain a product in a validated state [Drug Information Association].
Checksum - an error-checking method used in data communications in
which groups of digits are summed, usually without regard for overflow, and
that sum checked against a previously computed sum to verify that no data
digits have been changed [Drug Information Association].
Cipher - a method of transforming a text in order to conceal its meaning.
Confidentiality - see "data confidentiality."
Control - "that which prevents, detects, corrects, or reduces a risk" [Perry,
p. 45], and thus reasonably ensures that data are complete, accurate, and
reliable. For instance, any system that verifies the sample number against
sample identifier information would be a control against inadvertently
assigning results to the wrong sample.
Computer system - a group of hardware components assembled to perform in
conjunction with a set of software programs that are collectively designed to
perform a specific function or group of functions [Drug Information
Association].
Data - a representation of facts, concepts, or instructions in a formalized
manner suitable for communication, interpretation, or processing by human
or automatic means [ISO, as reported by Drug Information Association].
Data availability - the state when data are in the place needed by the user, at
the time the user needs them, and in the form needed by the user
[NCSC-TG-004-88]; the state in which information or services are accessible
on a timely basis to meet mission requirements or to avoid other types of
losses [OMB]. Data stored electronically require a system to be available in
order for users to have access to the data. Data availability can be impacted
by several factors, including system "down time," data encryption, password
protection, and system function access restriction.
Data Base Management System (DBMS) - software that allows one or many
persons to create a data base, modify data in the data base, or use data in the
data base (e.g., reports).
Data base - a collection of data having a structured format.
Data confidentiality - the ability to protect the privacy of data; protecting data
from unauthorized disclosure [OMB].
Data element (field) - contains a value with a fixed size and data type (see
below). A list of data elements defines a data base.
Data integrity - ensuring the prevention of information corruption [modified
from EPA Information Security Manual]; ensuring the prevention of
unauthorized modification [modified from OMB]; ensuring that data are
complete, consistent, and without errors.
Data record - consists of a list of values possessing fixed sizes and data types
for each data element in a particular data base.
Data types - alphanumeric (letters, digits, and special characters), numeric
(digits only), boolean (true or false), and specialized data types such as date.
Electronic data integrity - data integrity protected by a computer system;
automated data integrity refers to the goal of complete and incorruptible
computer-resident data.
Encryption - the translation of one character string into another by means of a
cipher, translation table, or algorithm, in order to render the information
contained therein meaningless to anyone who does not possess the decoding
mechanism [Datapro].
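As an illustration only, the translation-table form of encryption can be sketched with a toy letter-shift cipher. The shift value and function names are assumptions made for the example; a cipher this simple offers no real protection and is shown only to make the definition concrete.

```python
import string


def make_tables(shift):
    """Build the encoding table and its inverse (the decoding mechanism)."""
    letters = string.ascii_uppercase
    k = shift % 26
    rotated = letters[k:] + letters[:k]
    return str.maketrans(letters, rotated), str.maketrans(rotated, letters)


# Hypothetical shift of 3; anyone without the DECODE table sees only
# the translated character string.
ENCODE, DECODE = make_tables(3)


def encrypt(text):
    """Translate upper-cased text through the encoding table."""
    return text.upper().translate(ENCODE)


def decrypt(text):
    """Reverse the translation using the decoding table."""
    return text.translate(DECODE)
```

Characters outside the table (digits, spaces) pass through unchanged, which is one reason a scheme like this would not be adequate for real data.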
Error - accidental mistake caused by human action or computer failure.
Fraud - deliberate human action to cause an inaccuracy.
General controls - one of the two sets or types of controls recognized by the
auditing discipline. These operate across all applications. These would
include developing and staffing a quality assurance program that works
independently of other staff; developing and enforcing documentation
standards; developing standards for data transfer and manipulation, such as
prohibiting the same individual from both performing and approving
sample testing; training individuals to perform data transfers; and developing
hardware controls, such as writing different backup cycles to different disk
packs and developing and enforcing labelling conventions for all cabling.
Integrity - see "data integrity."
G-3
-------
Journaling - recording all significant access or file activity events in their
entirety. Using a journal plus earlier copies of a file, it would be possible to
reconstruct the file at any point and identify the ways it has changed over a
specified period of time [Datapro].
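The reconstruction described above can be sketched as follows. The journal entry layout (time, user, action, key, value) and the function names are hypothetical; the sketch assumes the journal records every write and delete made after the earlier copy was taken.

```python
def reconstruct(backup, journal, as_of):
    """Rebuild the file as of a given time from an earlier copy plus journal."""
    state = dict(backup)  # work on a copy; the earlier copy stays untouched
    for entry in sorted(journal, key=lambda e: e["time"]):
        if entry["time"] > as_of:
            break
        if entry["action"] == "delete":
            state.pop(entry["key"], None)
        else:  # "write"
            state[entry["key"]] = entry["value"]
    return state


def changes_between(journal, start, end):
    """List journal entries made after start and up to and including end."""
    return [e for e in journal if start < e["time"] <= end]
```

With integer timestamps, reconstruct(backup, journal, 2) returns the file as it stood after the second journaled event, and changes_between shows who changed what over a specified period.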
Laboratory Information Management System (LIMS) - automation of
laboratory processes under a single unified system. Data collection, data
analysis, and data reporting are a few examples of laboratory processes that
can be automated.
Password - a unique word or string of characters used to authenticate an
identity. A program, computer operator, or user may be required to submit a
password to meet security requirements before gaining access to data. The
password is confidential, as opposed to the user identification [Datapro].
Quality assurance - (1) a process for building quality into a system; (2) the
process of ensuring that the automated data system meets the user
requirements for the system and maintains data integrity; (3) a planned and
systematic pattern of all actions necessary to provide adequate confidence that
the item or product conforms to established technical requirements
[ANSI/IEEE Std 730-1981, as reported by Drug Information Association].
Raw data - "... any laboratory worksheets, records, memoranda, notes, or
exact copies thereof, that are the result of original observations and activities
of a study and are necessary for the reconstruction and evaluation of that
study. . . "Raw data" may include photographs, microfilm or microfiche
copies, computer printouts, magnetic media, . . . and recorded data from
automated instruments." [40 CFR 792.3] Raw data are the first or primary
recordings of observations or results. Transcribed data (e.g., manually keyed
computer-resident data taken from data sheets or notebooks) are not raw data.
Risk - "the probable result of the occurrence of an adverse event..." [Perry,
p. 45]. An "adverse event" could be either accidental (error) or deliberate
(fraud). An example of an adverse event would be the inaccurate assignment
of an accessionary number to a test sample. Risk, then, would be the
likelihood that the results of an analysis would be attributed to the wrong
sample.
Risk analysis - a means of measuring and assessing the relative
vulnerabilities and threats to a collection of sensitive data and the people,
systems, and installations involved in storing and processing those data. Its
purpose is to determine how security measures can be effectively applied to
minimize potential loss. Risk analyses may vary from an informal,
qualitative review of a microcomputer installation to a formal, fully
quantified review of a major computer center [EPA IRM Policy Manual].
G-4
-------
Security - the protection of computer hardware and software from accidental
or malicious access, use, modification, destruction, or disclosure. Security
also pertains to personnel, data, communications, and the physical protection
of computer installations [Drug Information Association].
System - (1) a collection of people, machines, and methods organized to
accomplish a set of specific functions; (2) an integrated whole that is
composed of diverse, interacting, specialized structures and subfunctions; (3) a
group of subsystems united by some interaction or interdependence,
performing many duties but functioning as a single unit [ANSI N45.2.10,
1973, as reported by Drug Information Association].
System Development Life Cycle (SDLC) - a series of distinct phases through
which development projects progress. An approach to computer system
development that begins with an evaluation of the user needs and
identification of the user requirements and continues through system design,
module design, programming and testing, system integration and testing,
validation, and operation and maintenance, ending only when use of the
system is discontinued [modified from Drug Information Association].
Transaction log (also keystroke capture, report, and replay) - the technique of
recording and storing keystrokes as entered by the user for subsequent replay
to enable the original sequence to be reproduced exactly [Drug Information
Association].
Valid - having legal strength or force, executed with proper formalities,
incapable of being rightfully overthrown or set aside [Black's Law Dictionary].
Validity - legal sufficiency, in contradistinction to mere regularity (being
steady or uniform in course, practice, or occurrence) [Black's Law Dictionary].
G-5
-------
References
Anon. (1989), "Products - Information Management," Laboratory Practice, 38:5
(May 1989), 87-91.
Black, Henry C. (1968), Black's Law Dictionary, Revised Fourth Edition (West
Publishing Co., St. Paul, Minnesota).
Brown, Elizabeth H. (1987), "Procedures and their Documentation for a LIMS
in a Regulated Environment," Pp. 346-358 in R.D. McDowall, ed., Laboratory
Information Management Systems (Wilmslow, U.K.: Sigma Press, 1987).
Datapro Research (1989), Datapro Reports on Information Security (McGraw-
Hill, Inc., Delran, New Jersey).
Dessy, Raymond E. (1985), The Electronic Laboratory (Washington, D.C.:
American Chemical Society, 1985).
Drug Information Association (1988), Computerized Data Systems for
Nonclinical Safety Assessment: Current Concepts and Quality Assurance
(Drug Information Association, Maple Glen, Pennsylvania).
Mattes, D.C. (1987), "LIMS and Good Laboratory Practice," Pp. 332-345 in R.D.
McDowall, ed., Laboratory Information Management Systems (Wilmslow,
U.K.: Sigma Press, 1987).
McDowall, R.D. (1987), ed., Laboratory Information Management Systems
(Wilmslow, U.K.: Sigma Press, 1987).
Megargle, Robert (1989), "Laboratory Information Management Systems,"
Analytical Chemistry, 61:9 (May 1989), 612A-621A.
Merrer, Robert J., and Peter G. Berthrong (1989), "Academic LIMS: Concept
and Practice," American Laboratory 21:3 (March 1989), 36-45.
National Bureau of Standards (1976), Glossary for Computer Systems Security
(U.S. Department of Commerce, FIPS PUB 39).
National Computer Security Center (1988), Glossary of Computer Security
(U.S. Department of Defense, NCSC-TG-004-88, Version 1).
Office of Information Resources Management (1987), EPA Systems Design and
Development Guidance, Vols. A, B, and C (Washington, D.C.: U.S.
Environmental Protection Agency, 1987).
-------
Office of Information Resources Management (1988), "EPA LIMS Functional
Specifications." (Washington, D.C.: U.S. Environmental Protection Agency,
March 1988).
Office of Information Resources Management (1989a), Survey of Laboratory
Automated Data Management Practices (Research Triangle Park, N.C.: U.S.
Environmental Protection Agency, 1989).
Office of Information Resources Management (1989b), Automated Laboratory
Standards: Evaluation of the Use of Automated Financial System Procedures
(Research Triangle Park, N.C.: U.S. Environmental Protection Agency, 1989).
Perry, William E. (1983), Ensuring Data Base Integrity (New York: John Wiley
and Sons, 1983).
Sandowski, C., and G. Lawler (1989), "A Relational Data Base Management
System for LIMS," American Laboratory 21:3 (March 1989), 70-79.
-------
Appendix A
Survey Questionnaire
-------
Interviewer Name Date and Time
Name of Respondent Firm
System Description
1) What kind of system is in use?
(Describe the hardware manufacturer and model)
Manufacturer:
Model:
Name of LIMS Product:
Describe the DBMS and other software in
use by the system
-------
The following questions are to determine what mechanisms are
used to prevent unauthorized access to the system and data.
System Security Yes No
1) Were specific standards or other guidance used
in the design or implementation of security
measures?
If yes, what reference?
2) Does the system require personalized
logon for each user?
3) Does each user have a password?
4) Are any group user identifications
or passwords used by members of
a functional group?
5) How often does the system require
passwords to be changed?
6) Are there established password standards?
7) Does the data management system track
changes to the data?
If so, how ?
8) Does the system automatically flag
data as having been edited?
If so, how ?
-------
9) Is there a record maintained of the
unaltered data?
10) Are there any additional security mechanisms
not covered in the previous questions ?
-------
The next series of questions relate to the documentation
provided to the customer by the vendor about the installed LIMS computer
system.
System Documentation Yes No
1) Does the vendor provide for each
installation/system
a) System Implementation Plan?
b) System Detailed Requirements Document?
c) Software Management Plan?
d) Software Test and Acceptance Plan?
e) Software Preliminary Design Document?
f) Software Detailed Design Document?
g) Software Maintenance Document?
h) Software Operations Document?
i) Software User's Guide?
j) System Integration Test Reports?
2) What additional documentation do you provide the
customer ?
-------
The following series of questions address the data entry
function within the LIMS.
Data Entry Yes No
1) Does the data entry individual use a
personalized logon to access the system?
2) Is there a password required to access
the data entry module?
3) Is the individual entering data
a) from a hardcopy?
b) by prompting the system to access
an existing data file?
c) by prompting the system to access
data directly from another system
or instrument?
4) Does the system alert the data entry
personnel if an error is made in data
entry (i.e., values out of date range
or incorrect flags, etc.)?
5) Does the system prevent entry of incorrect
or out-of-range data?
Are the errors logged ?
6) Does the system prompt the individual
entering data if there are missing fields?
-------
The next series of questions evaluate mechanisms that may be
used to verify the integrity of data as the data is entered into the
system.
Data Verification Yes No
1) Is the screen used for data entry
a) designed to match the forms
used for entering data?
b) convenient for the individual
responsible for data entry?
2) If data is manually entered from
a hardcopy, is the data validated by
a) re-keying by the same person?
b) re-keying by another person?
c) review by same person?
d) review by another person?
3) Does the system verify data entered based on
a) datatype
b) matches against predefined values
c) matches to keys of a preexisting record
d) legal value assigned to wrong
unit of analysis
e) quality control limits
-------
4) Are there additional mechanisms in use to ensure
the quality of the data at the point of entry?
-------
Data Integrity Yes No
1) When data is manually entered into the
data base, if changes are required due
to clerical errors are they made by
a) data entry operator?
b) data entry supervisor?
c) systems group?
d) QA group ?
2) If the data is committed to the data base,
can further changes be made to the data?
3) If a change is made to data after it has
been committed to the data base does the
system maintain a log of
a) who made the change?
b) when the change was made?
c) a record of both the unchanged and
changed data?
4) If data is entered into the central data
base via a data set on a computer-readable
medium, can further changes be made to the data?
-------
5) Is there additional information that you can
provide relating to data integrity on your
product?
10
-------
The next series of questions are directed toward functions in
the system that have the potential to modify or alter the data.
Data Reduction and Analysis Yes No
1) Are the algorithms or formulas used for data
manipulations performed by the system available
in a written format?
2) How many data records are
processed to test each algorithm?
3) Are the analysis test results documented?
4) How many data records are processed
to test each validation algorithm?
5) Are the validation test results documented?
6) Are these checks done
a) during system development?
b) whenever changes are made in the
data base?
c) periodically by quality assurance
staff?
d) through the use of internal
quality control samples?
7) If algorithms or formulas are modified
a) is this documented?
b) is it possible to determine which
data sets were processed with which
version of the calculations?
c) are old results recalculated with new
formulas?
How ?
d) are changes reflected in the detail
design documentation?
11
-------
Data Review Yes No
1) Are there facilities to allow the analyst
to examine and review results data ?
If yes, explain
2) Are there facilities to allow the analyst
to examine and review quality control data ?
If yes, explain
3) Are there facilities to allow the analyst
to examine and review instrument
calibration data ?
If yes, explain
4) Do supervisors need to approve results ?
If so, what facilities are available to
allow the analyst and supervisor to review
and approve results data online?
12
-------
The following questions relate to system backup and recovery
in the event of a failure.
Backups/Archival
1) What areas of the system are backed up ?
2) How often are backups performed?
a) daily?
b) weekly?
c) monthly?
d) other:
3) Are the backups
a) partial?
b) total?
4) Who is authorized to perform system backups?
5) On what media are the backups stored
a) magnetic tapes?
b) disks?
c) diskettes?
d) other:
6) When the system is backed up, is this
documented on the system log ?
7) Are command files written to drive backup
operations?
8) Can data and analysis programs be restored in a logically
related manner so that the results may be regenerated ?
13
-------
Recovery From System Failure Yes No
1) If the system fails due to a power failure
or glitch does the system
a) restart automatically?
b) have a manual restart?
c) other:
2) Does the system lose the data being
processed?
If yes, how much data ?
3) Does the system start from where it
left off?
4) If data is lost, can the system show the loss
and identify which data was lost?
5) Does the system journal ?
6) Is there a recovery procedure for data
retrieval?
7) Is there additional information that you can
provide for data recovery in your system ?
14
-------
The following sections address the issue of record and data
tracking in the LIMS.
Records Tracking Yes No
1) Which of the following records are maintained
on the data system?
a) results of instrument calibrations?
b) results of instrument blanks?
c) results of additional quality control
samples such as duplicates, spikes, etc.?
d) laboratory identification of case
samples?
e) flags associated with problems
found during initial sample
receipt (such as missing client
information, leakage, etc.)?
f) flags associated with quality control
problems?
g) records of individuals who review data?
h) any modifications of data flags made by
data review staff?
i) evidence that data review was completed
and samples were released for reporting?
3) If the data system tracks both case samples
and their associated quality control samples,
is there a pointer used in the system
to link the case sample with
a) standards?
b) blanks?
c) instrument calibrations?
d) instrument conditions?
e) duplicates?
f) spikes?
g) internal standards in sample?
h) surrogate standards in sample?
i) compounds under investigation?
j) unknown compounds found in sample?
15
-------
4) Is it possible using the data system to
change any of these key links (i.e., could
a case sample be linked to a different
quality control set than that with which
it was run)?
If yes, does the system maintain a record
a) of who made the change?
b) of who authorized the change?
c) of both the unchanged and changed
case/quality control link?
5) What additional mechanisms are available for data
and data change tracking in your product ?
16
-------
Records Audit Yes No
1) Does the system perform any of the following
data reduction functions?
a) linear or quadratic reduction for
standard curves?
b) quantitative analysis for unknowns
utilizing formulas derived in a)
c) flagging of data to indicate
i) standards outside of quality
control acceptance criteria?
ii) sample results outside linear
range?
iii) sample results below detection
limits?
iv) sample results below reporting
limits?
v) blanks with compounds above
acceptable limits?
vi) comparison of duplicate results
outside acceptable limits?
vii) comparison of spiked and
non-spiked samples outside acceptable
limits?
viii) other:
17
-------
2) If flags are changed on the system, is there
documentation kept of both the changed and
unchanged flags?
3) Are the flags of sufficient detail to
characterize problems with the data (i.e., a
flag merely setting the sample as invalid
without providing detail as to the nature
of the problem may not be sufficient)?
4) Are technical records maintained on the
data system sufficiently complete as to
allow scientific review of the data?
18
-------
Other
1) Do you have any suggested literature (references, meeting
proceedings, etc.) on these topics?
19
-------
Appendix B
Summary of Results
-------
Features and characteristics offered by all five vendors:
• A personalized logon is required for each user.
• Each user has a password.
• The data base management system tracks changes to the data.
• Edited data are automatically flagged, and a record is maintained of the
unaltered data.
• Data can be entered from a hard copy or an existing data file.
• The system alerts the data entry personnel if a detectable error is made
during data entry.
• The data entry screen can be designed to match the data entry forms.
• The data can be validated by a review by the same person or by a
different person.
• The system can verify data based on: data type, matches against a pre-
defined value, a legal value assigned to wrong unit of analysis, and
quality control limits.
• When data are manually entered into the data base, changes required
due to clerical errors can be made by the data entry operator, the data
entry supervisor, and the quality assurance (QA) group.
• When data are entered into the central data base via a data set on a
computer-readable medium, further changes can be made.
• Algorithms and formulas used for data manipulation are available in
hard copy.
• Analysis and validation test results are documented.
• The analyst has facilities to examine and review results data, quality
control (QC) data, and instrument calibration data.
• Data and analysis programs can be restored in a logically related
manner so results can be regenerated.
• The system starts automatically or manually after a power failure or
interruption.
B-1
-------
• The system loses the data being processed at the time of the failure.
• The system journals.
• There is a recovery procedure for data retrieval.
• The following records are maintained on the data system: results of
additional quality control samples (duplicates, spikes, etc.), laboratory
identification of case samples, and any modifications of data flags made
by a data review staff.
• The system performs the following data reduction functions: linear or
quadratic reduction for standard curves, quantitative analysis for an
unknown utilizing the linear and quadratic formulas, flagging of data
to indicate standards outside of QC-acceptable criteria, and flagging of
data to indicate sample results outside linear range.
• Technical records maintained on the data system are sufficiently
complete for scientific review.
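The data reduction functions named above (linear reduction for a standard curve, quantitation of an unknown, and flagging of results outside the linear range) can be sketched as follows. This is a minimal illustration under stated assumptions, not any vendor's method: the function names, the flag text, and the use of ordinary least squares are choices made for the example.

```python
def fit_line(concentrations, responses):
    """Ordinary least-squares fit of response = slope * conc + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(response, slope, intercept, low, high):
    """Invert the curve for an unknown and flag results outside [low, high]."""
    conc = (response - intercept) / slope
    flag = None
    if conc < low or conc > high:
        flag = "OUTSIDE LINEAR RANGE"  # hypothetical flag text
    return conc, flag
```

Here low and high are taken to be the lowest and highest standard concentrations, so a result is flagged whenever it has to be extrapolated beyond the calibrated range.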
Features and characteristics offered by four of the vendors:
• Groups can have group user identification or passwords.
• The data entry individual uses a personalized logon.
• A password is required to access the data entry module.
• Data can be entered from another system or instrument.
• When data are manually entered into the data base, changes required
due to clerical errors can be made by a systems group.
• Data reduction and analysis checks are done during system
development.
• If algorithms or formulas are modified, it is possible to determine
which set of data was done with which version of the formulas.
• Supervisors need to approve results.
• Command files are written to drive backup operations.
• After a system failure, the system restarts where it left off.
B-2
-------
• The following records are maintained on the data system: results of
instrument calibrations, results of instrument blanks, flags associated
with problems found during initial sample receipt, flags associated
with quality control problems, records of individuals who review data,
and evidence that data review was completed and samples were
released for reporting.
• If the data system tracks both case samples and QC samples, there is a
pointer to link the case sample with: standards, blanks, duplicates,
spikes, internal standards in samples, surrogate standards in samples,
compounds under investigation, and unknown compounds found in
samples.
• System flags data to indicate: sample results below detection limits,
sample results below reporting limits, blanks with compounds above
acceptable limits, comparison of duplicate results outside limits, and
comparison of spiked and non-spiked results outside limits.
• If flags are changed on the system, documentation of both flags is kept.
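Keeping documentation of both the changed and unchanged flags can be sketched as an append-only history attached to each result. The class, field names, and flag text below are hypothetical; the sketch only assumes that each change records who made it, when, and the old and new values.

```python
from datetime import datetime, timezone


class FlaggedResult:
    """A sample result whose flag changes are never overwritten silently."""

    def __init__(self, sample_id, value, flag=None):
        self.sample_id = sample_id
        self.value = value
        self.flag = flag
        self.history = []  # append-only record of flag changes

    def change_flag(self, new_flag, user, when=None):
        """Record who changed the flag, when, and both flag values."""
        self.history.append({
            "user": user,
            "when": when or datetime.now(timezone.utc),
            "old_flag": self.flag,
            "new_flag": new_flag,
        })
        self.flag = new_flag
```

Because the history entry is written before the flag is replaced, the unchanged flag remains available to a later reviewer.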
Features and characteristics offered by three of the vendors:
• System prevents entry of incorrect or out-of-range data.
• System logs errors.
• System prompts for missing fields.
• System verifies data based on matches to keys of a pre-existing record.
• Data reduction and analysis checks are done whenever changes are
made in the data base.
• If algorithms or formulas are modified, it is documented; old results
are recalculated with new formulas, and changes are reflected in
detailed design documentation.
• If the data system tracks both case samples and QC samples, there is a
pointer that links the case sample with instrument calibrations and
instrument conditions.
• Flags are of sufficient detail to characterize the problems with the data.
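The entry-time checks listed above (preventing out-of-range entry, logging errors, and prompting for missing fields) can be sketched in a few lines. The field names, limits, and message wording are hypothetical, chosen only to show how the three behaviors fit together.

```python
def check_entry(entry, limits, error_log):
    """Return (accepted, messages); every message is also written to the log."""
    messages = []
    for field, (low, high) in limits.items():
        value = entry.get(field)
        if value is None:
            # Prompt for a missing field instead of storing a blank.
            messages.append(f"{field}: missing, please enter a value")
        elif not low <= value <= high:
            # Out-of-range values are reported and the entry is refused.
            messages.append(f"{field}: {value} outside {low}-{high}")
    error_log.extend(messages)
    return (not messages), messages
```

An entry is accepted only when no message is produced, so an out-of-range or incomplete entry is reported to the operator, logged, and kept out of the data base.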
B-3
-------
Features and characteristics offered by two of the vendors:
• Data can be validated by a re-key by the same person or another person.
• Further changes can be made after data are committed to the data base.
• Data reduction and analysis checks are done periodically by a QA staff
member and through the use of internal QC samples.
• When the system is backed up, this is documented on the system log.
• If data are lost, the system shows the loss and identifies which data
elements were lost.
B-4
-------