OOOR90101C

-------

-------
               Automated Laboratory Standards:

RESULTS FROM THE SURVEY OF CURRENT TECHNOLOGY
           FOR AUTOMATED LABORATORIES
                       Prepared for:

        Office of Information Resources Management
           U.S. Environmental Protection Agency
        Research Triangle Park, North Carolina 27711

                       June 15, 1990
                                                 Prepared by:

                                BOOZ-ALLEN & HAMILTON Inc.
                                        4330 East-West Highway
                                      Bethesda, Maryland 20814
                                                (301) 951-2200

                                       Contract No. 68-W9-0037
                                  Computer Sciences Corporation
                                         79 T.W. Alexander Dr.
                       Research Triangle Park, North Carolina 27709
                                                (919) 541-9287

                                        Contract No. 68-01-7365

-------
                        Acknowledgments
      This report is the combined effort of Computer Sciences
Corporation, Booz-Allen & Hamilton Inc., EPA staff, and outside experts.
Richard Trilling, Will Harrelson, and Trevor Elliott of CSC researched and
prepared the draft for public review.  Marguerite Jones, Ronald Ross, and
Lynn Eberhardt of Booz-Allen evaluated the comments and completed this
final report. Numerous EPA staff and outside experts provided substantial
critical reviews and  valuable technical comments.  Richard Johnson of the
Scientific Systems  Staff of EPA's  Office  of Information  Resources
Management directed the contractors' work and managed the review process.
                                 -ii-

-------
                         Table of Contents


Executive Summary	iii

Background	1

      Exhibit 1 - Need for EPA's Automated Laboratory
      Standards Program	2

      Exhibit 2 - Considerations in Developing
      Automated Laboratory Standards	4

Review of the Literature	7

Survey of LIMS Vendors	9

      Development and Administration	9

      Results of the Survey	10

Other Sources of New Technology	13

Conclusions	16

Glossary

References

Appendix A:  Survey Questionnaire

Appendix B: Summary of Results
                                -iii-

-------
                        Executive Summary

      The  U.S.  Environmental Protection Agency (EPA)  has initiated a
program to ensure the  integrity of computer-resident data in laboratories
performing analyses in support of EPA programs by developing standards for
automated laboratory processes.   The activities  of  these environmental
programs are diverse, and include basic research at  EPA's environmental
research  centers,  environmental  sample  analyses  at  EPA's  regional
laboratories and  contractors' laboratories, and product registration relying on
analytical data submitted by the private sector.

      This report  investigates the  availability of  current  automated
technology that will provide adequate assurance that computer-resident data
will be  reliable.   Several vendors of laboratory automation and laboratory
information management systems (LIMS) were surveyed  to determine if
standards and controls are  available that will  ensure the reliability  and
validity of the data generated.  Additionally, an extensive search of the
literature did not reveal any hardware  or software currently on the market
that will guarantee the integrity of the data produced.

      Vendors already offer a variety  of control techniques such as audit
trails and password protection, and provide customizable systems to meet the
varying needs of each of their customers.  Most vendors  rely on existing
control features (e.g., password protection  and system backup) provided by
operating systems rather than duplicating them.

      Of  the  technological advances identified, the following can  be
considered in developing standards for automated laboratories:

      •     Magnetic ink character recognition (MICR),  which permits
            characters in labels to be read by magnetic scanners when written
            in standard format and standard location

      •    Optical scanning, which permits the recognition of patterns of
           ink, such as those used in bar codes (universal product codes)
                                 -iv-

-------
      •     "Smart cards," or credit cards that communicate with a remote
            computer from  a sample analysis  station via an embedded
            processing chip.

These technologies can be tailored to the laboratory environment to assist in
data management operations.
                                 -v-

-------
                             Background

      The  U.S.  Environmental Protection Agency (EPA) has initiated a
program to ensure the integrity of computer-resident data in laboratories
performing analyses in support of EPA programs by developing standards for
automated laboratory  processes.   The possession of  sound technical data
provides a fundamental resource  for EPA's mission  to protect the public
health  and  environment,  regardless  of the activities  of the specific
environmental programs.  The activities of these environmental  programs
are diverse,  and include basic research at EPA's environmental research
centers, environmental sample analyses at EPA's regional laboratories and
contractors' laboratories, and product registration relying on analytical data
submitted by the private sector.

      EPA recognizes  that the implementation of an  automated laboratory
standards program will require each laboratory to allocate money and time
for the program's execution.   Although some may consider this program too
expensive, laboratory managers should recognize that developing and using a
proper standards program will yield a net savings, because information
processes will not have to be repeated and expensive mistakes can be
avoided.

      Within EPA, the Office of Information Resources Management (OIRM)
has assumed the objective of establishing an automated laboratory  standards
program. The need for this program is evidenced by several factors. Exhibit 1
illustrates these factors, which include the rising use of computerized operations
by laboratories, the lack of uniform standards developed or accepted by EPA,
evidence of  problems  associated with  computer-resident data,  and the
evolving needs of EPA auditors and inspectors for guidance in evaluating
automated laboratory operations.

      Laboratories collecting data  for EPA's programs  have taken advantage
of advancing technology to streamline  the analytical processes.  Initially,
automated  instrumentation entered the laboratories to increase productivity
and enhance the accuracy of reported results.  Then, computers maintaining
data bases of results were used for  data management and tracking.  Computer
                                  -1-

-------
        [Exhibit 1 - Need for EPA's Automated Laboratory
         Standards Program (figure not reproduced)]
                             -2-

-------
systems were then integrated into more sophisticated laboratory information
management systems (LIMS).  Each of these advances necessitates thorough
quality control procedures for data generation, storage, and retrieval to ensure
the integrity of computer-resident data.

      Currently,  EPA has  no Agency-wide  guidelines that laboratories
collecting  and evaluating computer-resident  data must  follow.   The
requirements that must be considered in developing automated laboratory
standards come from a variety of sources, as Exhibit 2 illustrates, including
the requirement of the Computer Security Act of 1987 (P.L. 100-235, January 8,
1988) and various EPA program-specific data collection requirements under
Superfund, the Resource Conservation and Recovery Act, the Clean Water
Act,  and the Safe Drinking Water Act, among others. Additionally, OIRM has
developed electronic transmission standards and is developing a strategy for
electronic record keeping and electronic reporting  standards that will affect
all Agency activities.   The  development of uniform  principles for
automated data in EPA laboratories,  regardless of program, will take into
account the common elements of all these data collection  activities, and
provide a minimum standard that each laboratory  should achieve.

      There is increasing evidence of problems associated with the collection
and  use of computer-resident  laboratory data  supporting various  EPA
programs.  To illustrate, as of November 1989, EPA's Office of the Inspector
General was investigating between 10 and 12 laboratories  in Superfund's
Contract Laboratory Program (CLP) for a variety of allegations, including
"time traveling" and instrument calibration violations.  In "time traveling,"
sample testing dates are manipulated, by either adjusting the internal clock of
the instrumentation performing the analyses or manipulating the resultant
computer-resident data.  (Hazardous waste samples must be assayed within a
prescribed time period or the results  may be compromised.) Additionally,
calibration standard  results have allegedly  been electronically manipulated
and other  calibration results substituted when the actual results did not meet
the range specifications of the CLP procedure being followed. If true, these
allegations may be treated as felonies.
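
      The two conditions described above (a prescribed holding time and an
instrument clock that should never run backward) can be expressed as simple
automated checks. The following is a minimal sketch only, not a description of
any CLP procedure: the 14-day holding time, the function name, and the shape
of the instrument log are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical holding time; actual limits depend on the method being followed.
HOLDING_TIME = timedelta(days=14)

def check_sample_timing(collected, analyzed, logged_events):
    """Return a list of warnings for one sample.

    collected, analyzed: datetime values reported for the sample.
    logged_events: timestamps in the order the instrument wrote them.
    """
    warnings = []
    if analyzed < collected:
        warnings.append("analysis date precedes collection date")
    elif analyzed - collected > HOLDING_TIME:
        warnings.append("holding time exceeded")
    # A clock that was set back shows up as a timestamp earlier than
    # the entry written just before it.
    for earlier, later in zip(logged_events, logged_events[1:]):
        if later < earlier:
            warnings.append(f"clock moved backward: {earlier} -> {later}")
    return warnings
```

A check of this kind can only flag suspicious records for an inspector; it
cannot by itself establish that dates were manipulated.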
                                  -3-

-------
        [Exhibit 2 - Considerations in Developing
         Automated Laboratory Standards (figure not reproduced)]

-------
      Because the introduction of automation is relatively  new  and still
evolving, no definitive guidelines for EPA auditors and inspectors have been
developed. Inspectors must be alert to the steps in those procedures used by
laboratories generating and using computer-resident data where the greatest
risk exists.  These critical control points indicate the magnitude of control that
should be placed on that  step of the process. If adequate controls are not
present, the remainder of the process cannot correct a deviation, and the
entire process will provide no reliable conclusions.  Automation introduces
many new variables into a system, each with its own set of critical process
points.  Inspectors must verify that laboratory management has  recognized
the various risks and has instituted  an appropriate risk management
program.

      As part of the  EPA's program to  ensure  the integrity of computer-
resident laboratory data, the Agency is investigating  what automated data
processing (ADP) systems exist, what controls and standards are feasible, and
how  vendors  have  identified  and/or developed  the  standards they
implement.  Particularly  important is  whether there  have been  recent
technological advances of devices or subsystems that provide full assurance of
integrity for computer-resident data.

      To investigate these  issues, the following activities were  performed:

      1)    Reviewed professional journals to identify articles introducing
           or describing such advances

      2)    Developed and administered a survey to five (5) LIMS vendors
           to determine what data integrity control features these
           vendors' products provide

      3)    Conducted telephone interviews  with  representatives of two
           major laboratory  instrumentation  manufacturers  to obtain
           information  pertaining to  the  flexibility of the  laboratory
           instrumentation currently available
                                  -5-

-------
      4)     Explored  new  technologies used in the banking, retail, and
            manufacturing  industries that have the potential to enhance
            data integrity in the laboratory environment.

      This report complements our earlier report, which reviews data integrity
standards for data processing in automated financial systems (OIRM, 1989b).
These standards include:

      •     Use of logon/password security

      •     Data entry verification

      •     Flagging of changes made and retention of both original and
            altered data (audit trails)

      •     Protecting reliability of data by prohibiting the same person from
            both authorizing and allocating funds

      •     Maintaining hard-copy data outputs.

That report concludes that the financial auditing discipline offers reasonable
levels of assurance of  integrity of computer-resident data and recommends
consideration of certain standards used in automated  financial systems in
developing standards for automated chemistry laboratories.
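
      The audit trail standard listed above (flagging changes and retaining
both original and altered data) might look as follows when carried over to a
laboratory result. This is a sketch under stated assumptions: the class name,
field names, and record layout are hypothetical and are not drawn from the
earlier report or from any vendor's product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditedResult:
    sample_id: str
    value: float
    history: list = field(default_factory=list)  # earlier values, never deleted

    def amend(self, new_value, user, reason):
        # Keep the old value, who changed it, when, and why.
        self.history.append({
            "old": self.value,
            "new": new_value,
            "user": user,
            "reason": reason,
            "at": datetime.now(timezone.utc),
        })
        self.value = new_value

    @property
    def flagged(self):
        # A result is flagged as soon as it has been changed at least once.
        return bool(self.history)
```

Because the original value is appended to the history before it is replaced,
a reviewer can see every state the result has passed through.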
                                   -6-

-------
                      Review of the Literature

      Twelve months (fall 1988 to fall 1989) of professional journals that deal
with laboratory science, laboratory  automation, and laboratory information
management were reviewed.  These journals include Analytical  Chemistry,
American Laboratory, Chemical  Industry, Laboratory Practice, and Science.

      In this review, searches  were made for articles on a variety of topics,
including  laboratory  automation, laboratory information management,
scientific computing, and related topics.  Four articles were found.  These
articles were  Sandowski, C., and  G. Lawler,  "A Relational Data Base
Management System for LIMS," American Laboratory  21:3  (March  1989),
pp. 70-79;  Merrer,  Robert J.,  and  Peter G.  Berthrong,  "Academic  LIMS:
Concept and Practice," American Laboratory 21:3 (March 1989), pp. 36-45;
Megargle, Robert, "Laboratory  Information Management Systems," Analytical
Chemistry, 61:9 (May  1989),  pp. 612A-621A; and Anon.,  "Products -
Information Management," Laboratory Practice, 38:5 (May 1989), pp. 87-91.

      Library on-line search  tools were used at a major university library  for
these topic areas and related key words, and no references ("hits") were found.
A similar search using EPA's Online Library System (OLS) was performed,
which included not only  articles  from professional journals but also all EPA
items registered with the National Technical Information Service (NTIS).  In
this search, two references were  identified: Dessy, Raymond E., The
Electronic Laboratory (Washington, D.C.:  American Chemical Society) and
McDowall, R.D.  ed.,  Laboratory Information  Management  Systems
(Wilmslow, U.K.: Sigma Press), 1987.  The  publication by McDowall  (1987)
contains two LIMS articles of interest: Mattes, D.C., 1987, "LIMS and Good
Laboratory Practice," and Brown, Elizabeth H., 1987, "Procedures and their
Documentation for a LIMS in a Regulated Environment."

      It was concluded from  the review of these searches that laboratory
automation and laboratory information management are not yet common
topics, and probably not yet  part of  the mainstream of laboratory literature.
Further, no existing laboratory standards  were identified by the literature
search.
                                  -7-

-------
      The journals were also reviewed for advertisements of laboratory
automation  and/or  laboratory  information  management  systems.
Advertisements from the following vendors were found:

      PE Nelson
      CI Beckman
      Varian Associates, Inc.
      FIAtron Laboratory Systems
      Axiom Systems, Inc.
      Laboratory MicroSystems, Inc.
      Radian Corporation
      Advanced Systems Management, Inc.
      VG Instruments, Inc.
      Harley Systems, Inc.

      It was also known that Hewlett Packard and  Digital Equipment
Corporation have LIMS systems.  This substantial number shows that even if
the laboratory automation/laboratory information management system topic
is  not heavily  discussed in professional periodicals, vendors nevertheless
have found a market.
                                 -8-

-------
                      Survey of LIMS Vendors

Development and Administration

      The purpose of the survey was to identify LIMS that provide
reasonably high levels of assurance of data integrity. Consequently, the items
included in the questionnaire deal with a variety of ADP controls that reduce
the risk of threats to data integrity.

      The survey elicits information about system documentation; security;
data integrity; data reduction and analysis;  and backup,  archiving, and
recovery.  The full questionnaire appears in Appendix A.

      In developing the questionnaire, the following sources for topics and
for questionnaire items were consulted:

      •     A checklist of ADP audit features already being  used by EPA in
            laboratory site  visits to determine which such features are in
            place or feasible (OIRM, 1989a)

      •     Standard systems analysis and design techniques  (OIRM, 1987)

      •     EPA LIMS functional specifications (OIRM, 1988).

The survey was administered by telephone to the following vendors:

      CI Beckman
      Varian Associates, Inc.
      VG Instruments, Inc.
      Hewlett Packard
      PE Nelson

      The only significant problem  encountered with survey administration
was that vendors' products were typically highly customizable and therefore
not easily characterized by a structured survey.
                                  -9-

-------
Results of the Survey

MAJOR FINDINGS

      There are four major findings from the survey:

      •     Vendors offer a variety of features that can be customized to
            provide  assurance of data integrity, such  as  passwords and
            records of data changes.

      •     System  vendors offer system-specific data integrity features;
            however, there is no required standard set of data integrity
            features.

      •     There is no "magic box" or technological advance to guarantee
            absolute data integrity.

      •     Most vendors rely on existing control features (e.g., password
            protection and system backup) provided by operating systems
            rather than duplicating them.

The  discussion of  the  findings  from  the  survey is supplemented  with
information  obtained from telephone conversations with  representatives
from two LIMS vendors.  The findings are discussed in more detail below.

Customizable Products

      Vendors  offer a relatively extensive variety of control techniques for
ensuring data integrity. Vendors do not rely on or reference a set of standards
in deciding which feature to deliver.  Vendors respond instead to requests
from their customers and deliver a system that provides the data integrity
features specified by their customers.
                                  -10-

-------
No Standard Configuration

      Presently, the manufacturers of automated laboratory systems design
their instrumentation  to incorporate their individual data integrity controls.
With so many manufacturers of this equipment in the marketplace, it stands
to reason that without a universal  standard to adhere to, there exists no
standard configuration of data integrity controls.

No "Magic Box"

      The  vendors surveyed have  not made  advances  in  hardware  or
software that would guarantee full data integrity.  It was thought that vendors
might  be making use of optical disk technology, write once/read  many
(WORM), or a similar, highly controlled method to minimize risk to data
integrity.  Vendors surveyed are not incorporating this type of  technology in
their systems.  Reasonable levels of data integrity can be achieved through
traditional  controls such  as re-keying for data entry verification, logon and
password security, and using an audit trail to implement a chain of custody.
After answering all the survey questions, vendors were asked to mention any
additional features in their systems designed to ensure, or useful in
ensuring, data integrity.  Vendors did not identify any additional methods of
ensuring data integrity.  It can therefore be assumed that the survey design
did not overlook any such methods.
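
      Of the traditional controls named above, re-keying for data entry
verification is the simplest to illustrate. In the sketch below, two
operators key the same record independently and any field on which they
disagree is reported for review. The function name and the record layout
are assumptions made for illustration.

```python
def verify_rekeyed(first_entry, second_entry):
    """Compare two independent keyings of the same record.

    Each argument maps a field name to the value typed by one operator.
    Returns the sorted list of field names on which the keyings disagree.
    """
    fields = set(first_entry) | set(second_entry)
    return sorted(f for f in fields if first_entry.get(f) != second_entry.get(f))
```

A record would be accepted into the data base only when the returned list
is empty; a field missing from one keying counts as a disagreement.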

      Confidence is high that the laboratory automation commercial market
in general has not made technological advances in hardware or software that
would  guarantee full data integrity.

Vendors Rely on Operating Systems

      Vendors will typically rely on existing control features of  the operating
system rather than duplicate those control systems.  Password access control,
for instance, usually  consists  of whatever the operating system provides.
Backup typically consists of whatever frequency and medium  the operating
system provides.
                                  -11-

-------
SUMMARY OF RESULTS

      Appendix B presents detailed results of the survey.  The systems are
generally capable of doing whatever the customer needs.  Even if a system
does not offer a particular option as a feature, it is usually flexible enough to
allow the customer or vendor to customize the system in order to provide
that option through programming or third-party software.
                                 -12-

-------
                 Other Sources of New Technology

      The laboratory automation/LIMS vendors do not seem to  use new
technology  for  the laboratory environment.   In the review of other
automated systems, however, a number  of technological advances were
identified.

      To illustrate, banking, retailing,  and manufacturing have witnessed
technological advances that have potential for the laboratory  environment.
At least three such advances were identified and are discussed below:

      •     Magnetic ink character recognition (MICR)
      •     Optical scanning and bar codes (universal product codes)
      •     Magnetic cards
      •     "Smart cards."

      Technological advances have  been made in magnetic ink and in
sensitive scanners  capable of MICR.  Using this ink and MICR, scannable
checks can be processed by machine because the banking industry has adopted
a standard format for labeling checks with bank and  account information.
The magnetic ink, written onto checks in  a standard  format and  standard
location, can be read by most new scanners.

      Carrying the scanning technology further, optical scanners  can read
merchandise codes that have been written with ink and that conform not
only to a standard  location and format but also use a universal  product code
that has been developed so that merchandise can be labeled unambiguously.
(Universal product codes are most familiar as "bar codes" on grocery  products.
They are read by registers that are really terminals connected to central
processing units; these terminals read the codes from merchandise, keep
running totals of the individual's bill, and may even perform automatic
inventory control and reporting.)

      This technology could be adopted for the laboratory environment as a
method  of labeling and reading samples that enter a laboratory.  Labels could
be affixed to sample containers during sample processing.  Scanners would be

                                 -13-

-------
installed at every station  in the laboratory at which sample identification
information is important.  The scanner would read the sample identification
from the physical sample  container and pass that information to software.
Software would compare the sample identification  information read from the
container with that entered at sample receiving time in order to verify that
results information was being attributed to the proper sample.
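
      The comparison step described above can be sketched in a few lines.
The example assumes, purely for illustration, that sample labels use the
12-digit UPC-A format, whose final digit is a check digit computed from the
first eleven; this report does not prescribe a label format, and the function
names are hypothetical.

```python
def upc_check_digit(first_eleven):
    """Compute the UPC-A check digit for an 11-digit string."""
    digits = [int(ch) for ch in first_eleven]
    odd_sum = sum(digits[0::2])   # 1st, 3rd, ... 11th digits
    even_sum = sum(digits[1::2])  # 2nd, 4th, ... 10th digits
    return (10 - (3 * odd_sum + even_sum) % 10) % 10

def verify_scan(scanned_code, expected_code):
    """Accept a scan only if it is well formed and matches sample receiving."""
    if len(scanned_code) != 12 or any(ch not in "0123456789" for ch in scanned_code):
        return False
    if upc_check_digit(scanned_code[:11]) != int(scanned_code[11]):
        return False  # label was misread or damaged
    return scanned_code == expected_code
```

The check digit guards against a misread label, while the comparison with the
code entered at sample receiving guards against attributing results to the
wrong sample.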

      Magnetic stripes on credit cards provide a bank with information about
an individual's  account.  Additionally,  some methods of  transportation
(notably, the Metro subway system in Washington, DC) use magnetic stripes
to record fare information that can be linked to distance and time of day.  In
some  implementations, "smart cards"  --  credit cards with an  embedded
processing chip, as well as  the traditional magnetic stripe  -- can communicate
with and provide additional information to the host computer in a number
of applications.

      Card-assisted ADP  in  the laboratory might work in the following
manner: a  physical sample would move through the  laboratory and  its
identification information would be checked at each station, as described above.
At one or  more of  these stations, an  authorized individual  (perhaps a
laboratory director or principal investigator) might enter  a magnetic ("mag")
card that would authorize posting of sample information to the data base and
would retrieve from the data  base information required for the next posting
(the result  of an analysis or the status of the experiment).  Without the
intervention of the mag card, the information from the sample could not be
posted to the data base.
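
      The card-assisted posting rule described above can be sketched as
follows. The card identifiers, the exception name, and the use of an
in-memory dictionary as the "data base" are all assumptions made for
illustration; an actual system would read the card through a hardware reader
and post to the LIMS.

```python
class UnauthorizedPosting(Exception):
    """Raised when a posting is attempted without an authorized card."""

# Hypothetical set of card identifiers issued to authorized staff.
AUTHORIZED_CARDS = {"CARD-0017", "CARD-0042"}

def post_result(database, sample_id, result, card_id=None):
    """Post a result only when an authorized card has been presented."""
    if card_id not in AUTHORIZED_CARDS:
        raise UnauthorizedPosting(f"posting for {sample_id} refused")
    database.setdefault(sample_id, []).append({"result": result, "card": card_id})
    # Return what is on file so the next posting can build on it.
    return database[sample_id]
```

Recording the card identifier with each posting also leaves a record of who
authorized it, which supports the chain of custody.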

      Additionally, a smart card containing stored information on a sample,
or on a number of similar samples being run together, could be inserted into
an instrument to record the results of the sample analyses.  Smart cards can be
pre-formatted  to receive data in any configuration, such  as tabular, and are
ideal  for transmitting data from remote  instrumentation to  a central data
management system.  Smart cards can be erased and reformatted for use with
a new sample or set of samples, thereby making them more cost effective.
                                 -14-

-------
      In general,  automation  technology that  uses  standards  and is
implemented adequately performs its tasks more reliably and perhaps more
cost effectively than could be true of manual performance of the same tasks.
Therefore, the technology described above has significant implications for the
laboratory environment.
                                -15-

-------
                            Conclusions

      Automation technology  that uses  standards  and  is implemented
adequately performs its tasks more reliably  and perhaps more cost effectively
than could be true of manual performance of the same tasks.  After reviewing
available literature and surveying various LIMS vendors, it was determined
that laboratory automation and  LIMS  commercial  vendors have not
developed a standard set  of controls that provide full assurance of the
integrity of computer-resident data.   Vendors typically  deliver systems
customized  to  fit  the specifications of  their  customers, but there are no
standards that  define the  default, baseline system each vendor  delivers.
These vendors have not made hardware or  software advances that guarantee
data integrity.   Standards for  laboratory automation would provide  a
common denominator for software design and  other technological advances.

      Technological devices  developed for a variety of fields have the
potential to be applicable for use in the laboratory setting, but these devices
have had little  acceptance in this environment.  These devices include the
following:

      •     Magnetic ink character recognition (MICR)
      •     Optical scanning and bar codes
      •     Magnetic cards
      •     "Smart cards."

Universal product codes (bar codes) have been used for sample identification
in a few laboratories, and acceptance of that technology may be increasing.

      It  is worth noting, however, that the technological  advances in the
banking, retailing, and manufacturing industries can be used only because
each industry has developed standards for  use  of the technology.  The
technology for  reading magnetic ink from checks  works only  because the
banking industry has developed a standard format and a standard location for
writing  information  onto the checks.   Similarly,  the retailing  and
manufacturing  industries  have developed standards  for the format  of
universal product codes.

                                 -16-

-------
      The results of the survey of five LIMS vendors have indicated that the
vendors are not currently  standardizing their systems and technology.  Until
the time that the vendors  voluntarily work in concert or are provided with a
set of standards  from outside  sources,  little progress  can be made  in
incorporating these techniques into the analytical chemistry laboratories of
concern to EPA.

      By tailoring existing technologies  to the laboratory setting  and by
setting standards for operation of automated equipment, laboratory processes
can produce data with increased efficiency and integrity.
                                  -17-

-------
                Automated Laboratory Standards Program


                              GLOSSARY
Application controls - one of the two sets or types of controls recognized by
the auditing discipline.  They are specific  for each application and include
items such as  data entry verification procedures (for instance, re-keying all
input); data base recovery and roll back procedures that permit the data base
administrator to recreate any desired state of the data base; audit trails that not
only assist the data base administrator in recreating any desired state of the
data base, but also provide documentary evidence of a chain of custody for
data; and use of automated reconciliation  transactions that verify the final
data base results against the results as reconstructed through the audit trail.
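
The automated reconciliation described in this entry can be sketched as a
replay of the audit trail followed by a comparison with the stored results.
The tuple layout of the audit trail and the function names below are
assumptions made for illustration, not a format taken from any cited source.

```python
def replay(audit_trail):
    """Rebuild the data base state implied by an ordered audit trail.

    Each entry is a (sample_id, old_value, new_value) tuple; old_value is
    None for the first recording of a result.
    """
    state = {}
    for sample_id, old_value, new_value in audit_trail:
        if state.get(sample_id) != old_value:
            raise ValueError(f"audit trail gap for {sample_id}")
        state[sample_id] = new_value
    return state

def reconcile(database, audit_trail):
    """Return sample ids whose stored value differs from the replayed one."""
    rebuilt = replay(audit_trail)
    keys = set(database) | set(rebuilt)
    return sorted(k for k in keys if database.get(k) != rebuilt.get(k))
```

An empty list from reconcile indicates that every stored result is accounted
for by the audit trail; any sample listed was changed outside it.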

Application software - a program  developed,  adapted,  or  tailored to the
specific  user  requirements  for  the  purpose  of  data  collection, data
manipulation, data output, or data archiving [Drug Information Association].

Audit trail - records of transactions that collectively provide documentary
evidence of processing, used to trace from original transactions forward to
related records and reports or backwards from records and reports to source
transactions.  This series of records documents  the origination and flow of
transactions processed through a  system [Datapro].  Also, a chronological
record of system  activities that is sufficient to enable the  reconstruction,
reviewing, and examination of the sequence of  environments and activities
surrounding  or leading to an operation, a procedure,  or  an event in  a
transaction from its inception to final results [NCSC-TG-004].

Auditing  - (1) the process of establishing that prescribed procedures and
protocols have been followed; (2) a technique applied during or at the end of a
process  to assess the acceptability of the  product. [Drug Information
Association];  (3) a function used by management to assess the adequacy of
control [Perry].  That is, auditing is the set of processes that evaluate how well
controls  ensure data integrity.  As a financial example, auditing would
include those activities that review whether deposits have been attributed to
the proper accounts; for example,  providing an individual with a  hard-copy
record of the transaction at the time of deposit and sending the individual a
monthly  statement that lists all transactions.

Automated laboratory data processing  - calculation, manipulation, and
reporting of analytical results using computer-resident data, in either a LIMS
or a personal computer.

Availability - see "data availability."
                                  G-1

-------


Back-up - provisions made for the recovery of data files or software, for restart
of processing, or for use  of alternative computer equipment after a system
failure or disaster [Drug Information Association].

Change control -  ongoing evaluation of system operations and  changes
during the production use of a system, to determine when and if repetition of
a validation process or a specific portion of it is necessary.  This includes both
the ongoing, documented  evaluation, plus any validation testing necessary to
maintain a product in a validated state [Drug Information Association].

Checksum - an error-checking  method used in data communications in
which groups of digits are summed, usually without regard for overflow, and
that sum checked against  a previously computed sum to verify that no data
digits have been changed [Drug Information Association].
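
A minimal sketch of the technique, assuming the data are bytes and that overflow is discarded by keeping only the sum modulo 256. The function names and the sample message are illustrative assumptions, not part of the cited definition.

```python
def checksum(data: bytes, modulus: int = 256) -> int:
    """Sum the byte values, discarding any overflow past `modulus`."""
    total = 0
    for b in data:
        total = (total + b) % modulus
    return total


def verify(data: bytes, expected: int) -> bool:
    """Recompute the sum and compare it with the previously computed one."""
    return checksum(data) == expected


message = b"SAMPLE-0042,Pb,4.7"
stored = checksum(message)          # computed before transmission

corrupted = b"SAMPLE-0042,Pb,4.9"   # one digit changed in transit
print(verify(message, stored))
print(verify(corrupted, stored))
```

A plain sum of this kind detects a changed digit but not two digits that have swapped places, since addition is order-independent.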

Cipher - a method of transforming a text in order to conceal its meaning.

Confidentiality - see "data confidentiality."

Control - "that  which prevents, detects, corrects, or reduces a risk" [Perry,
p. 45], and thus reasonably ensures that data  are  complete, accurate,  and
reliable. For instance, any system that verifies the sample  number against
sample identifier information would be a control against inadvertently
assigning results to the wrong sample.
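
The sample-number example can be sketched as follows. This is an illustration under assumed names (registry, assign_result, SampleMismatchError) and an assumed identifier format; it is not drawn from Perry or from any particular LIMS.

```python
from typing import Dict


class SampleMismatchError(ValueError):
    """Raised when a result does not match the registered sample."""


def assign_result(registry: Dict[str, str], results: Dict[str, float],
                  sample_number: str, sample_identifier: str,
                  value: float) -> None:
    """Store a result only if the sample number and identifier agree."""
    registered = registry.get(sample_number)
    if registered is None:
        raise SampleMismatchError(f"unknown sample number {sample_number!r}")
    if registered != sample_identifier:
        raise SampleMismatchError(
            f"{sample_number!r} is registered as {registered!r}, "
            f"not {sample_identifier!r}")
    results[sample_number] = value


# Hypothetical registry: sample number -> identifier information
registry = {"S-0001": "Well 7, 1990-05-02", "S-0002": "Well 9, 1990-05-02"}
results: Dict[str, float] = {}

assign_result(registry, results, "S-0001", "Well 7, 1990-05-02", 4.7)
try:
    assign_result(registry, results, "S-0002", "Well 7, 1990-05-02", 1.3)
except SampleMismatchError as err:
    print("rejected:", err)
```

The control here is preventive: the mismatched result is never stored, so there is nothing to correct afterward.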

Computer system -  a group of hardware components assembled to perform in
conjunction with a set of software programs that are collectively designed to
perform a  specific function or group of functions  [Drug  Information
Association].

Data - a  representation of facts, concepts,  or  instructions in a formalized
manner suitable for communication, interpretation, or processing by human
or automatic means  [ISO, as reported by Drug Information Association].

Data availability - the state when data are in the place needed by the user, at
the time the user needs them, and in the form needed by the user
[NCSC-TG-004-88]; the state in which information or services are accessible
on a timely basis to meet mission requirements or to avoid other types of
losses [OMB]. Data stored electronically are accessible only when the system
that holds them is available. Data availability can be affected by several
factors, including system "down time," data encryption, password protection,
and system function access restriction.

Data Base Management System (DBMS) - software that allows one or many
persons to create a data base, modify data in the data base, or use data in the
data base (e.g., reports).

-------
Data base - a collection of data having a structured format.

Data confidentiality - the ability to protect the privacy of data; protecting data
from unauthorized disclosure [OMB].

Data  element (field) - contains a value with a fixed size and data type (see
below). A list of data elements defines a data base.

Data integrity -  ensuring the prevention of information corruption [modified
from  EPA  Information Security  Manual];  ensuring the  prevention  of
unauthorized modification  [modified  from OMB];  ensuring that data are
complete, consistent, and without errors.

Data record - consists of a list of values possessing fixed sizes and data types
for each data element in a particular data base.

Data  types  - alphanumeric  (letters, digits, and special  characters), numeric
(digits only), boolean (true or false), and specialized data types such as date.
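
The relationship among data elements, data records, and data types can be sketched as below. The schema, field names, and sizes are hypothetical, chosen only to illustrate the definitions above; a real DBMS would enforce these rules itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataElement:
    name: str
    data_type: str  # "alphanumeric", "numeric", or "boolean"
    size: int       # maximum number of characters


def is_valid(element: DataElement, value: str) -> bool:
    """Check one value against its element's size and data type."""
    if len(value) > element.size:
        return False
    if element.data_type == "numeric":      # digits only
        return value.isascii() and value.isdigit()
    if element.data_type == "boolean":      # true or false
        return value in ("true", "false")
    return element.data_type == "alphanumeric"


# A list of data elements defines the (hypothetical) data base.
SCHEMA: List[DataElement] = [
    DataElement("sample_id", "alphanumeric", 8),
    DataElement("replicate", "numeric", 2),
    DataElement("approved", "boolean", 5),
]


def validate_record(record: List[str]) -> List[str]:
    """Return the names of elements whose values fail validation."""
    if len(record) != len(SCHEMA):
        raise ValueError("record length does not match the schema")
    return [e.name for e, v in zip(SCHEMA, record) if not is_valid(e, v)]


print(validate_record(["S-0001", "03", "true"]))
print(validate_record(["S-0001", "3a", "maybe"]))
```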

Electronic data  integrity -  data  integrity protected by a computer system;
automated data  integrity refers to the goal of  complete and incorruptible
computer-resident data.

Encryption - the translation of one character string into another by means of a
cipher, translation table, or algorithm, in order to  render the  information
contained therein meaningless to  anyone who  does not possess the decoding
mechanism [Datapro].
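
For illustration only, the "translation table" form of encryption can be sketched with a simple substitution table. The table below (a rotation by seven positions) is an arbitrary assumption and offers no real security; it shows only the mechanism of translating one character string into another and back.

```python
import string

PLAIN = string.ascii_uppercase + string.digits
# Hypothetical key: the same 36 characters rotated by 7 positions.
CIPHER = PLAIN[7:] + PLAIN[:7]

ENCODE = str.maketrans(PLAIN, CIPHER)
DECODE = str.maketrans(CIPHER, PLAIN)  # the "decoding mechanism"


def encrypt(text: str) -> str:
    """Translate upper-cased text through the substitution table."""
    return text.upper().translate(ENCODE)


def decrypt(text: str) -> str:
    """Reverse the translation using the inverse table."""
    return text.translate(DECODE)


secret = encrypt("SAMPLE 42")
print(secret)
print(decrypt(secret))
```

Characters outside the table (such as the space) pass through unchanged, which is one of several reasons a table of this kind is unsuitable for protecting real data.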

Error - accidental mistake caused by human action or computer failure.

Fraud - deliberate human action to cause an inaccuracy.

General  controls - one of the two sets  or types of controls recognized by the
auditing  discipline. These  operate across  all applications.  These would
include developing  and staffing a quality assurance program that works
independently of other staff; developing and enforcing documentation
standards; developing standards for data transfer and manipulation, such as
prohibiting the  same  individual  from both performing and  approving
sample testing; training individuals to perform  data transfers; and developing
hardware controls, such as writing different backup cycles to different disk
packs and developing and enforcing labelling conventions for all cabling.

Integrity - see "data integrity."

-------


Journaling - recording all significant  access or file activity events in their
entirety.  Using a journal plus earlier copies of a file, it would be possible to
reconstruct the file at any point and identify the ways it has changed over a
specified period of time [Datapro].
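
The reconstruction described above can be sketched as replaying journal entries onto an earlier copy of the file. The entry layout (sequence number, key, new value) is an assumption made for this example and is not prescribed by Datapro.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical journal entry: (sequence number, key, new value).
# A value of None records that the key was deleted.
JournalEntry = Tuple[int, str, Optional[str]]


def reconstruct(snapshot: Dict[str, str], journal: List[JournalEntry],
                up_to: int) -> Dict[str, str]:
    """Replay entries numbered <= up_to onto a copy of the snapshot."""
    state = dict(snapshot)  # never modify the earlier copy itself
    for seq, key, value in sorted(journal, key=lambda e: e[0]):
        if seq > up_to:
            break
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


earlier_copy = {"S-0001": "4.2"}
journal = [
    (1, "S-0002", "1.1"),   # new result added
    (2, "S-0001", "4.7"),   # existing result edited
    (3, "S-0002", None),    # result deleted
]

for point in range(4):
    print(point, reconstruct(earlier_copy, journal, point))
```

Comparing the states at two points shows how the file changed over that period.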

Laboratory Information Management System  (LIMS)  - automation of
laboratory processes under  a single unified  system.  Data collection, data
analysis, and data reporting are a few examples of laboratory processes that
can be automated.

Password - a unique word or string of characters used to authenticate an
identity.  A program, computer operator, or user may be required to submit a
password to meet security requirements before gaining access to data.  The
password is confidential, as opposed to the user identification [Datapro].

Quality  assurance  - (1) a process for building quality into  a system; (2) the
process  of ensuring  that  the  automated  data  system  meets the user
requirements for the system and maintains data integrity; (3) a planned and
systematic pattern of all actions necessary to provide adequate confidence that
the  item or product  conforms  to established technical requirements
[ANSI/IEEE Std 730-1981, as reported by Drug Information Association].

Raw data - "... any laboratory worksheets, records, memoranda, notes, or
exact copies thereof, that are the result of original observations and activities
of a study and are necessary for the reconstruction and evaluation of that
study. . . . 'Raw data' may include photographs, microfilm or microfiche
copies, computer printouts, magnetic media, . . . and recorded data from
automated instruments" [40 CFR 792.3].  Raw data are the first or primary
recordings of observations or results.  Transcribed data (e.g., manually keyed
computer-resident data taken from data sheets or notebooks) are not raw data.

Risk - "the probable result of the occurrence  of an adverse  event..." [Perry,
p. 45].  An "adverse event"  could be either  accidental (error) or deliberate
(fraud). An example of an adverse event would be the inaccurate assignment
of an accessionary  number  to a test  sample.  Risk, then, would be  the
likelihood that the results of an analysis would be attributed to the wrong
sample.

Risk analysis  -  a  means of measuring  and  assessing  the relative
vulnerabilities and threats to a collection of sensitive data  and the people,
systems, and installations involved in storing  and processing those data.  Its
purpose is to determine how security measures can be effectively applied to
minimize potential loss.   Risk analyses may vary from an  informal,
qualitative review of a microcomputer installation to a formal, fully
quantified review of a major computer center [EPA IRM Policy Manual].

-------


Security - the protection of computer hardware and software from accidental
or malicious  access, use, modification, destruction, or disclosure.  Security
also pertains to personnel, data, communications, and the physical protection
of computer installations [Drug Information Association].

System - (1) a collection of people, machines,  and methods organized to
accomplish  a set  of  specific functions; (2) an integrated whole that is
composed of diverse, interacting, specialized structures and subfunctions; (3) a
group of subsystems united by some interaction  or interdependence,
performing many  duties but functioning as a single unit  [ANSI N45.2.10,
1973, as reported by Drug Information Association].

System Development Life Cycle (SDLC) - a series of distinct phases through
which development projects progress.  An approach to computer system
development  that begins  with  an  evaluation of the user  needs and
identification of the user requirements and continues through system design,
module design, programming and testing, system integration and testing,
validation, and operation and maintenance, ending only when  use of  the
system is discontinued [modified from Drug Information Association].

Transaction log (also keystroke capture, report, and replay) - the technique of
recording and storing keystrokes as entered by the user for subsequent replay
to enable the original sequence to be reproduced exactly [Drug Information
Association].

Valid - having legal strength or  force, executed with proper formalities,
incapable of being rightfully overthrown or set aside [Black's Law Dictionary].

Validity - legal sufficiency, in contradistinction to mere regularity (being
steady or uniform in course, practice, or occurrence) [Black's Law Dictionary].

-------
                             References
Anon.  (1989), "Products - Information Management," Laboratory Practice, 38:5
(May 1989), 87-91.

Black, Henry C.  (1968), Black's Law Dictionary, Revised Fourth Edition (West
Publishing Co., St. Paul, Minnesota).

Brown, Elizabeth H. (1987), "Procedures and their Documentation for a LIMS
in a Regulated Environment," Pp. 346-358 in R.D. McDowall, ed., Laboratory
Information Management Systems (Wilmslow, U.K.: Sigma Press, 1987).

Datapro  Research (1989), Datapro Reports on Information Security (McGraw-
Hill, Inc., Delran, New Jersey).

Dessy, Raymond E. (1985), The  Electronic Laboratory (Washington, D.C.:
American Chemical  Society, 1985).

Drug  Information  Association  (1988), Computerized Data Systems for
Nonclinical  Safety  Assessment:  Current  Concepts  and  Quality  Assurance
(Drug Information Association, Maple Glen, Pennsylvania).

Mattes, D.C. (1987),  "LIMS and Good Laboratory Practice," Pp. 332-345 in R.D.
McDowall, ed.,  Laboratory Information  Management Systems (Wilmslow,
U.K.: Sigma Press, 1987).

McDowall, R.D., ed. (1987), Laboratory Information Management Systems
(Wilmslow, U.K.: Sigma Press, 1987).

Megargle, Robert (1989), "Laboratory Information Management Systems,"
Analytical Chemistry, 61:9 (May 1989), 612A-621A.

Merrer, Robert J., and Peter G. Berthrong (1989), "Academic LIMS: Concept
and Practice," American Laboratory, 21:3 (March 1989), 36-45.

National  Bureau of  Standards (1976), Glossary for Computer  Systems Security
(U.S. Department of  Commerce, FIPS PUB 39).

National Computer Security  Center (1988), Glossary  of Computer  Security
(U.S. Department of  Defense, NCSC-TG-004-88, Version  1).

Office of  Information Resources  Management (1987), EPA Systems Design and
Development   Guidance,  Vols.   A, B,  and  C (Washington,  D.C.:   U.S.
Environmental Protection Agency, 1987).

-------
Office of Information Resources Management (1988), "EPA LIMS Functional
Specifications." (Washington, D.C.: U.S. Environmental Protection Agency,
March 1988).

Office of Information Resources Management (1989a), Survey of Laboratory
Automated  Data Management Practices  (Research Triangle Park, N.C.:  U.S.
Environmental Protection Agency, 1989).

Office of Information Resources Management (1989b), Automated Laboratory
Standards:  Evaluation of the Use of Automated Financial System Procedures
(Research Triangle Park, N.C.: U.S. Environmental Protection Agency, 1989).

Perry, William E. (1983), Ensuring Data Base Integrity (New York: John Wiley
and Sons, 1983).

Sandowski, C., and G. Lawler (1989), "A Relational Data Base Management
System for LIMS," American Laboratory, 21:3 (March 1989), 70-79.

-------
    Appendix A




Survey Questionnaire

-------
Interviewer Name	    Date and Time_

Name of Respondent	    Firm	
System Description

1)        What kind of system is in use?
         (Describe the hardware manufacturer and model)

         Manufacturer:
         Model:
         Name of LIMS Product:
         Describe the DBMS and other software in
         use by the system

-------
         The  following  questions  are  to determine what mechanisms are
used to prevent unauthorized access to the system and data.


System Security                                            Yes  No

1)       Were specific  standards or other guidance used    	  	
         in the design  or implementation of security
         measures?

         If yes, what reference?	
2)       Does the system require personalized
         logon for each user?

3)       Does each user have a password?

4)       Are there any group user identification
         or passwords used by members of
         a functional group ?

5)       How often does the system require
         passwords to be changed?
6)        Are there established password standards?

7)        Does the data management system track
             changes to the data?

         If so, how ?
8)        Does the system automatically flag
         data as having been edited?

         If so, how ?

-------
9)        Is there a record maintained of the
          unaltered data?
10)       Are there any additional security mechanisms
         not covered in the previous questions ?

-------
         The next series of questions relates to the documentation
provided to the customer by the vendor about the installed LIMS computer
system.


System Documentation                                       Yes  No

1)       Does the vendor provide for each
         installation/system

             a) System  Implementation Plan?                	

             b) System  Detailed Requirements Document?     	 	

             c) Software Management Plan?                  	 	

             d) Software Test and Acceptance Plan?         	 	

             e) Software Preliminary Design Document?      	 	

             f) Software Detailed Design Document?         	 	

             g) Software Maintenance Document?             	 	

             h) Software Operations Document?              	 	

             i) Software User's Guide?                     	 	

             j) System  Integration Test Reports?           	 	


2)       What additional documentation do you provide the
         customer ?

-------
         The following series of questions addresses the data entry
function within the LIMS.

Data Entry                                                 Yes  No
1)       Does the data entry  individual use a
         personalized logon to access the system?

2)       Is there a password  required to access
         the data entry module?

3)       Is the individual entering data

             a) from a hardcopy?

             b) by prompting  system to access
                 an existing  data  file?

             c) prompting the system to access
                 data directly from another system
                 or instrument?

4)       Does the system alert the data entry
         personnel if an error is made in data
         entry (i.e.,  values  out of date range
         or incorrect flags,  etc.)?

5)       Does the system prevent entry of incorrect
         or out-of-range data?

         Are the errors logged ?

6)       Does the system prompt the individual
         entering data if there are missing fields?

-------
         The next series of questions evaluates mechanisms that may be
used to verify the integrity of data as the data are entered into the
system.


Data Verification                                          Yes No
1)       Is the screen used for data entry

             a) designed to match the forms
                 used for entering data?

             b) convenient for the individual
                 responsible  for data entry?

 2)       If data is manually entered from
         a hardcopy, is the data validated by

             a) re-keying by the same person?

             b) re-keying by another person?

             c) review by same person?

             d) review by another person?


3)       Does the system verify data entered based on

             a) datatype

             b) matches against predefined values

             c) matches to keys of a preexisting record

             d) legal value assigned to wrong
                unit of analysis

             e) quality control limits

-------
4)        Are there additional mechanisms in use to
         ensure the quality of the data at the point of entry?

-------
Data Integrity                                            Yes No


1)       When data is manually entered into the
         data base, if changes are required due
         to clerical errors are they made by

             a) data entry operator?                      	 	

             b) data entry supervisor?                    	

             c) systems group?                            	

             d) QA group ?                                	 	

2)       If the data is committed to the data base,
         can further changes be made to the data?         	 	

3)       If a change is made to data after it has
         been committed to the data base does the
         system maintain a log of

             a) who made the change?                      	 	

             b) when the change was made?                 	 	

             c) a record of both the unchanged and
                changed data?                             	 	

4)       If data is entered into the central data
         base via a data set on a computer readable
         media can further changes be made to the data?   	 	

-------
5)        Is there additional information that you can
         provide relating to data integrity on your
         product?

-------
         The next series of questions is directed toward functions in
the system that have the potential to modify or alter the data.


Data Reduction and Analysis                               Yes  No


1)       Are the algorithms or formulas used for data
         manipulations performed by the system available
         in a written format?
2)       How many data records are
         processed to test each algorithm?
3)       Are the analysis test results documented?

4)       How many data records are processed
         to test each validation algorithm?
5)        Are the validation test results documented?

6)        Are these checks done
             a) during system development?
             b) whenever changes are made in the
                data base?
             c) periodically by quality assurance
                staff?
             d) through the use of internal
                quality control samples?

7)        If algorithms or formulas are modified
             a) is this documented?
             b) is it possible to determine which
                 data sets were processed with which
                 version of the calculations?
             c) are old results recalculated with new
                 formulas?

                 How ?
             d) are changes reflected in the detail
                 design documentation?

-------
Data Review                                                Yes  No


1)       Are there  facilities to allow the analyst
         to examine and review results data ?              	

         If yes, explain	
2)       Are there facilities to allow the analyst
         to examine and review quality control data ?

         If yes, explain	
3)        Are there facilities to allow the analyst
         to examine and review instrument
         calibration data ?

         If yes, explain	
4)        Do supervisors need to approve results ?

         If so, what facilities are available to
         allow the analyst and supervisor to review
         and approve results data online?

-------
         The following questions relate to system backup and recovery
in the event of a failure.

Backups/Archival
         1)  What areas of the system are backed up ?
         2)   How often are backups performed?
                 a) daily?
                 b) weekly?
                 c) monthly?
                 d) other:	
         3)   Are the backups
                 a) partial?
                 b) total?

         4)   Who is authorized to perform system backups?

         5)   On what media are the backups stored
                 a) magnetic tapes?
                 b) disks?
                 c) diskettes?
                 d) other:	
         6)   When the system is backed up, is this
             documented on the system log ?

         7)   Are command files written to drive backup
             operations?

         8)   Can data and analysis programs be restored in a logically
             related manner so that the results may be regenerated ?

-------
Recovery From System Failure                               Yes No
1)       If the system  fails due to a power failure
         or glitch does the system
              a) restart automatically?
              b) have  a manual  restart?
              c) other:	
2)       Does the system lose the data being
         processed?

         If yes, how much data ?
3)       Does the system start from where it
         left off?

4)       If data is lost, can the system show the loss
         and identify which data was lost?

5)       Does the system journal ?

6)       Is there a recovery procedure for data
         retrieval?

7)       Is there additional information that you can
         provide for data recovery in your system ?

-------
         The  following sections address the  issue  of record and data
tracking in the LIMS.


Records Tracking                                           Yes No
1)       Which of the following records are maintained
         on the data system?
             a) results of  instrument calibrations?

             b) results of  instrument blanks?

             c) results of  additional quality control
                samples such as duplicates, spikes, etc.?

             d) laboratory  identification of case
                 samples?

             e) flags associated with problems
                 found during initial sample
                 receipt (such as missing client
                 information, leakage, etc.)?

             f) flags associated with quality control
                 problems?

             g) records of individuals who review data?

             h) any modifications of data flags made by
                 data review staff?

             i) evidence that data review was completed
                 and samples were released for reporting?

3)       If the data system tracks both case samples
         and their associated  quality control samples,
         is there a pointer used in the system
         to link the case sample with
             a) standards?
             b) blanks?
             c) instrument calibrations?
             d) instrument conditions?
             e) duplicates?
             f) spikes?
             g) internal standards in sample?
             h) surrogate standards in sample?
             i) compounds under investigation?
             j) unknown compounds found in sample?

-------
4)       Is it possible using the data system to
         change any of these key links (i.e., could
         a case sample be linked to a different
         quality control set than that with which
         it was run)?

         If yes, does the system maintain a record
             a) of who made the change?
             b) of who authorized the change?
             c) of both the unchanged and changed
                case/quality control link?

5)       What additional mechanisms are available for data
         and data change tracking in your product ?

-------
Records Audit                                             Yes No


1)       Does the system perform any of the following
         data reduction functions?

             a) linear or quadratic reduction for
                 standard curves?                         	 	

             b) quantitative analysis for unknowns
                 utilizing formulas derived in a)         	 	

             c) flagging of data to indicate
                 i)    standards outside of quality
                       control acceptance criteria?       	 	
                 ii)   sample results outside linear
                       range?                             	 	
                 iii)  sample results below detection
                       limits?                            	
                 iv)   sample results below reporting
                        limits?                           	
                 v)    blanks with compounds above
                        acceptable limits?                	 	
                 vi)   comparison of duplicate results
                        outside acceptable limits?        	 	
                 vii)  comparison of spiked and non-spiked
                        samples outside acceptable
                        limits?                           	
                 viii) other:	

-------
2)       If flags are changed on the system, is there
         documentation kept of both the changed and
         unchanged flags?

3)       Are the flags of sufficient detail to
         characterize problems with the data (i.e., a
         flag merely setting the sample as invalid
         without providing detail as to the nature
         of the problem may not be sufficient)?

4)       Are technical records maintained on the
         data system sufficiently complete as to
         allow scientific review of the data?

-------
Other

1)       Do you have any suggested literature (references, meeting
         proceedings, etc.) on these topics?

-------
   Appendix B




Summary of Results

-------
Features and characteristics offered by all five vendors:

•     A personalized logon is required for each user.

•     Each user has a password.

•     The data base management system tracks changes to the data.

•     Data are automatically flagged as having been edited, and a record is
      maintained of the unaltered data.

•     Data can be entered from a hard copy or an existing data file.

•     The system alerts the data entry personnel if a detectable error is made
      during data entry.

•     The data entry screen can be designed to match the data entry forms.

•     The data can be validated by a review by the same person or by a
      different person.

•     The system can verify data based on: data type, matches against a
      predefined value, a legal value assigned to the wrong unit of analysis,
      and quality control limits.

•     When data are manually entered into the data base, changes required
      due to clerical errors can be made by the data entry operator, the data
      entry supervisor,  and the quality assurance (QA) group.

•     When data are entered into the central data base via a data set on a
      computer-readable medium, further changes can be made.

•     Algorithms and formulas used for data manipulation are available in
      hard copy.

•     Analysis and validation test results are documented.

•     The analyst has facilities to examine and review results  data, quality
      control (QC) data, and instrument calibration data.

•     Data and analysis programs can be restored in a logically related
      manner so  results can be regenerated.

•     The system starts  automatically or manually after a power failure or
      interruption.

-------
•     The system loses the data being processed at the time of the failure.

•     The system journals.

•     There is a recovery procedure for data retrieval.

•     The following records are maintained on the data system: results of
      additional quality control samples (duplicates, spikes, etc.), laboratory
      identification of case samples, and any modifications of data flags made
      by a data review staff.

•     The system performs the following data reduction functions:  linear or
      quadratic reduction for standard curves, quantitative analysis for an
      unknown utilizing the linear and quadratic formulas, flagging of data
      to indicate standards outside of QC-acceptable criteria, and flagging of
      data to indicate sample results outside linear range.

•     Technical records maintained on the data system are sufficiently
      complete for  scientific review.
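
As a hedged illustration of the data reduction functions listed above, the sketch below fits a linear standard curve by ordinary least squares, uses the fitted line to quantify an unknown, and flags a result that falls outside the calibrated (linear) range. The calibration values, the flag text, and the function names are assumptions made for this example; the surveyed vendors' actual algorithms were not documented in the survey.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept for y = slope*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(response: float, slope: float, intercept: float,
             low: float, high: float) -> Tuple[float, str]:
    """Invert the standard curve and flag results outside [low, high]."""
    conc = (response - intercept) / slope
    flag = "" if low <= conc <= high else "OUTSIDE LINEAR RANGE"
    return conc, flag


# Hypothetical calibration standards: concentration vs. instrument response
conc_std = [0.0, 1.0, 2.0, 4.0]
resp_std = [0.5, 2.5, 4.5, 8.5]

slope, intercept = fit_line(conc_std, resp_std)
low, high = min(conc_std), max(conc_std)

for response in (6.5, 12.5):
    conc, flag = quantify(response, slope, intercept, low, high)
    print(f"response={response} conc={conc:.2f} {flag}")
```

The flag records why a result is suspect rather than merely marking it invalid, which is the level of detail the questionnaire asks about.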


Features and characteristics offered by four of the vendors:

•     Groups can have group user identification or passwords.

•     The data entry individual uses a personalized logon.

•     A password is required to access the data entry module.

•     Data can be entered from another system or instrument.

•     When data are manually entered into the  data base, changes required
      due to clerical errors can be made by a systems group.

•     Data reduction and analysis checks are done during system
      development.

•     If algorithms or formulas are modified, it is possible to determine
      which set of data was done with which version of the formulas.

•     Supervisors need to approve results.

•     Command  files are  written to drive backup operations.

•     After  a system failure, the system restarts  where it left off.

-------
•     The following records are maintained on the data system:  results of
      instrument calibrations, results of instrument blanks, flags associated
      with problems found during initial sample receipt, flags associated
      with quality control problems, records of individuals who review data,
      and evidence that data review was completed and samples were
      released for reporting.

•     If the data system tracks both case samples and QC samples, there is a
      pointer to link the case sample with: standards, blanks, duplicates,
      spikes, internal standards in samples, surrogate standards in samples,
      compounds under investigation, and unknown compounds found in
      samples.

•     The system flags data to indicate: sample results below detection limits,
      sample results below reporting limits, blanks with compounds above
      acceptable limits, comparison of duplicate results outside limits, and
      comparison of spiked and non-spiked results outside limits.

•     If flags are changed on the system, documentation of both flags is kept.


Features and characteristics offered by three of the vendors:

•     System prevents entry of incorrect or out-of-range data.

•     System logs errors.

•     System prompts for missing fields.

•     System verifies data based on matches to keys of a pre-existing record.

•     Data reduction and analysis checks are done whenever changes are
      made in the data base.

•     If algorithms or formulas are modified, it is documented; old results
      are recalculated with new formulas, and changes are reflected in
      detailed design documentation.

•     If the data system tracks both case samples and QC samples, there is a
      pointer that links the case sample with instrument calibrations and
      instrument conditions.

•     Flags are of sufficient detail to characterize the problems with the data.
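
The first three features in this group (preventing out-of-range entry, logging errors, and prompting for missing fields) can be sketched together as below. The field names, limits, and message wording are hypothetical and serve only to illustrate the behavior; they do not describe any surveyed product.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical field limits: name -> (minimum, maximum)
LIMITS: Dict[str, Tuple[float, float]] = {
    "ph": (0.0, 14.0),
    "temp_c": (-10.0, 60.0),
}
error_log: List[str] = []


def enter(record: Dict[str, Optional[float]]) -> bool:
    """Accept the record only if every field is present and in range."""
    problems = []
    for name, (low, high) in LIMITS.items():
        value = record.get(name)
        if value is None:
            problems.append(f"{name}: missing")
        elif not low <= value <= high:
            problems.append(f"{name}: {value} outside {low}..{high}")
    error_log.extend(problems)  # every rejected entry leaves a trace
    return not problems


print(enter({"ph": 7.0, "temp_c": 25.0}))
print(enter({"ph": 15.2}))
print(error_log)
```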

-------
Features and characteristics offered by two of the vendors:

•     Data can be validated by a re-key by the same person or another person.

•     Further changes can be made after data are committed to the data base.

•     Data reduction and analysis checks are done periodically by a QA staff
      member and through the use of internal QC samples.

•     When the system is backed up, this is documented on the system log.

•     If data are lost, the system shows the loss and identifies which data
      elements were lost.

-------