United States
Environmental Protection
Agency
Environmental Monitoring Systems
Laboratory
Las Vegas NV 89114
Research and Development
EPA-600/S4-83-056  Jan. 1984
Project Summary
Guidelines for Conducting
Single Laboratory Evaluations of
Biological Methods

William D. McKenzie and Theodore A. Olsson III
  The single laboratory test is used to
establish the data quality that can be
achieved within a single laboratory. It
provides a basis for deciding whether or
not a given method merits collaborative
testing  and it more clearly defines a
method's potential for inclusion as part
of an operational monitoring network.
This summary provides a brief descrip-
tion  of  the suggested procedures for
single laboratory testing.
  Phases of the single laboratory test
include identification of procedural vari-
ables that must be carefully controlled
(ruggedness testing), evaluation of
method sensitivity, identification of the
limits of reliable measurement, evalua-
tion  of systematic error  (bias), and
identification of method precision and
accuracy.
  The chemical composition of all sam-
ple material must be verified during the
single laboratory test. Sample material
should have a concentration range, in
the same sample matrix, that would be
encountered if the method was being
routinely used for its intended purpose.
Some phases of the test should make
use of certified reference materials.
  The resulting single laboratory test
data and the revised (ruggedness tested)
method protocol will ultimately be used
as part of the basis for deciding whether
or not to proceed with collaborative
testing.
  This Project Summary was developed
by EPA's Environmental Monitoring
Systems Laboratory, Las Vegas, NV, to
announce key findings of the research
project that is fully documented in a
separate report of the same title (see
Project Report ordering information at
back).
Introduction and Basic Test
Procedures
  This summary provides a brief descrip-
tion of the suggested procedures for
single laboratory testing. These sugges-
tions are presented primarily as guidance
to EPA contract  laboratories that are
involved in the single laboratory testing of
biological methods.
  Single laboratory testing  is used to
establish the data quality that can be
achieved within a single  laboratory. It
provides a basis for deciding whether or
not a given method merits collaborative
testing and it  more clearly defines a
method's potential for inclusion as part of
an operational monitoring network. The
single laboratory test includes identifica-
tion of procedural variables that must be
carefully controlled (ruggedness testing),
evaluation of method sensitivity, identi-
fication of the limits of reliable measure-
ment,  evaluation of  systematic  error
(bias),  and identification of method pre-
cision and accuracy. A complete protocol,
for the method being evaluated, must be
received by the evaluating laboratory prior
to test initiation. All method requirements
and procedural instructions should be
clearly presented. It is obviously important
that these written instructions be tech-
nically correct, complete, and as unam-
biguous as possible. However, the  labo-
ratory  conducting the single laboratory
test is  not usually responsible for the
actual method itself. The laboratory must
strictly follow the method procedures and
method requirements (experimental con-
ditions, reagents,  laboratory equipment,
storage of samples,  maintenance of ex-
perimental organisms, blanks, standards,
replicates, etc.) as they are written in the
protocol.

  For the  purposes  of this guidance
document, biological methods will include
procedures used to analyze biological
tissues and fluids as well as the various
biological tests for toxicity, mutagenicity,
etc. Single  laboratory test objectives will
be somewhat different depending on the
method that is being evaluated. To deter-
mine a method's capability for accuracy
(and for  systematic  error),  the testing
laboratory must have a reference sample
material  and  there must be  a known
response (true value) for the material.
When  a  method calls  for  analysis  of
biological tissue or biological fluid, the
testing  laboratory will  usually have a
reference material for which there is a
true response value, e.g., samples with
certified  compound  concentrations  or
perhaps with certified enzymatic activity
levels. However, when the technique
being evaluated includes a toxicity test or
perhaps an algal assay (population stimu-
lation or inhibition) there is frequently no
"true value"  or "true  response"  and
hence  the method's  single laboratory
capability for accuracy cannot really be
determined. Under these conditions, an
average test response should be acquired
by conducting successive analyses on the
same concentration of the same reference
material. Because of such  differences,
the specific requirements for each single
laboratory test must always be confirmed
with the sponsor before single laboratory
testing begins.
  Ten successive analyses (i.e., acquiring
10 valid responses by following the
method protocol) have been suggested
for several phases of the single laboratory
test. The method precision determination,
for example, requires that 10 successive
independent analyses be conducted on
the same sample material. Multistage
calculations to determine the required
number of analyses might be conducted
during the single laboratory test as more
information becomes available on the
expected variance. However, 10 analyses
will allow the test laboratory to estimate
the standard deviation to within 45
percent of its true value (at the 95 percent
confidence level). The single laboratory
test cost will rapidly increase as more of
these successive analyses are required,
since each additional value must
represent a valid test response and therefore
will include whatever quality control
analyses (blanks, replicates, etc.) are
required in the original method protocol
to ensure a valid test response.
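The 45 percent figure quoted above can be checked with a short simulation. The sketch below is illustrative only: it assumes normally distributed test responses, and the function name `sd_ratio_interval`, the number of simulated trials, and the random seed are choices made here rather than anything specified in the guidance.

```python
import math
import random

def sd_ratio_interval(n=10, trials=20000, coverage=0.95, seed=1):
    """Simulate the central `coverage` range of s / sigma for samples of size n.

    Assumes normally distributed responses with true sigma = 1, so each
    simulated sample standard deviation is already the ratio s / sigma.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        ratios.append(math.sqrt(variance))
    ratios.sort()
    tail = (1.0 - coverage) / 2.0
    low = ratios[int(tail * trials)]
    high = ratios[int((1.0 - tail) * trials) - 1]
    return low, high

low, high = sd_ratio_interval()
print(f"95 percent of estimates fall between {low:.2f} and {high:.2f} times sigma")
```

Under these assumptions, the estimate from 10 analyses falls between roughly 0.55 and 1.45 times the true standard deviation, which is consistent with the "within 45 percent" statement.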
  The single laboratory test cannot really
evaluate a method's scope of applicability,
i.e., the array of sample types or
environmental situations for which the method
would be able to provide useful data.
However, the single laboratory test should
include various  test samples  having a
concentration range, in the same sample
matrix, that would be encountered if the
method was being routinely used for its
intended purpose. Chemical composition
of all sample material must be verified
during the single laboratory test.
  Some phases  of the single laboratory
test should make use of certified refer-
ence materials. These reference materials
are samples that have a known chemical
composition or enzymatic activity level. In
some cases, they might be samples which
are known to produce a certain response
(true value) from a given test method. Any
reference material used should be readily
available to other testing laboratories.
Both  the National Bureau of Standards
and the U.S. Environmental Protection
Agency have certified various  types  of
samples for use as reference materials.
  Before the single laboratory test begins,
the evaluating laboratory must review the
method protocol and make notations
where ambiguous statements are made
or where more detail is needed. These
questions must be resolved with the
sponsor, and, if necessary, a second
protocol prepared, before the actual
laboratory test begins. The single laboratory
test should begin with ruggedness testing,
but it is important that the laboratory
performing the single laboratory test plan
all the assays required in advance in
order to prevent duplication of effort.
Some of the following evaluations may be
performed simultaneously, depending on
the nature of the test methods being
investigated. Therefore, it is important to
make advance preparations for the
efficient use of time and available funds.
  Figure 1 illustrates how the different
analyses can provide data for more than
one portion of the single laboratory test.
This figure assumes that a single test
material is being used, i.e., different
concentrations of the same reference
material. Many reference materials are
available and the type of sample
material(s) used should always be discussed
with the sponsor prior to beginning a
single laboratory test.

Figure 1.  Example of how the evaluation analyses can provide data for multiple portions
           of the single laboratory test.

[Figure: a scale of concentrations of the sample material (i.e., the reference
material), each concentration identified by a letter, with the following
assignments.]

Precision - G

Method Sensitivity - D, E, F, H, I, J

Limits of Reliable Measurement
1. precision: B, L (compared with data from G)
2. sensitivity: B, C, L, K (compared with data from E, F, H, I)

Accuracy - G (10 additional valid responses should be acquired if no true
         response is available, i.e., total of 20).

Systematic Error - D, G, J (these might be used if a true response is
         available; if no true response, then no data will be acquired for
         systematic error).

Note: The ruggedness test would probably use concentration G but would vary
      the experimental procedure for the different analyses. The above chart
      assumes a ruggedness tested/revised protocol is being used. Each letter
      represents 10 successive independent analyses, i.e., 10 respective valid
      responses for the particular technique using different concentrations
      of the reference material.


Test Phases

Ruggedness Testing
  The single laboratory test should first
identify any procedural variables  that
must be carefully controlled. If the given
method is "rugged" it will not be suscep-
tible to the inevitable, modest departures
in routine and the results obtained will
not be altered by these minor variations. If
the results are altered by small procedural
variations, it is important to emphasize in
the protocol that a specific step must be
strictly followed or, in some  cases, to
indicate the  limits of allowable variability.
For example, the ruggedness test results
might  indicate that a  certain (protocol
directed) temperature  requirement of
20°C was a critical  procedure and that
slight variations in this temperature (at a
given phase of the test) would produce an
altered test  result. The method protocol
will then need to be revised to emphasize
that the stated temperature requirement
must be strictly followed and, perhaps, to
provide more detail on any quality control
steps associated with temperature  de-
terminations. Depending on the sponsor's
requirements, additional tests might be
conducted to determine the specific nar-
row range of temperatures that would be
allowable, i.e., a protocol revision noting
that the temperature cannot vary except
by a stated amount. The minor departures
in routine selected for ruggedness testing
do not always need to be variations in the
protocol. Unjustified latitude is sometimes
given in a  protocol simply because of
insufficient detail.
  A single  concentration  of  one  test
material can be used for the ruggedness
test. The suggested approach does  not
seek to study each separate variable in an
individual sequential fashion, but rather
it provides for  the introduction  of several
changes (protocol  variations) simulta-
neously. At  least seven variables should
be selected  which will require that  the
test  laboratory conduct  eight separate
analyses. However, one of these variables
could be a  meaningless variable thus
providing  a  modified  control for  the
ruggedness test itself.  Basically,  the
differences between  the "protocol
directed" result and the "protocol altered"
result for each variable are compared. If
one or two variables are having an effect
on the test result, their respective differ-
ences (directed vs. altered) will be sub-
stantially larger than the group of differ-
ences associated with the other variables.
Most of the modest procedural alterations
should have little effect on the test result
since the variations should only be of a
magnitude that could be made by a quali-
fied  laboratory  following the  written
method protocol.
  Table 1 summarizes a ruggedness test
using seven variables which will therefore
require  eight separate  analyses.  The
varied condition is to be  either slightly
above  or slightly  below  the  "protocol
directed" condition. The "protocol  di-
rected" conditions are designated as A
through G, and the varied conditions are
designated as a through g. The evaluation
is concerned with identifying respective
variations in the final test result due to
the specific procedural differences, i.e.,
A-a, B-b, C-c, D-d, E-e, F-f, and G-g. Each
of the eight trials consists of  a single
analysis conducted using eight respective
aliquots of a single test material. The final
test results are indicated as s, t, u, v, w, x,
y, and z.

Table 1.    Experimental Design for a Seven
           Variable Ruggedness Test*

Experiment    Factor             Analysis
Number        Level              Result
1             A B C D E F G      s
2             A B c D e f g      t
3             A b C d E f g      u
4             A b c d e F G      v
5             a B C d e F g      w
6             a B c d E f G      x
7             a b C D e f G      y
8             a b c D E F g      z

*Based on W. J. Youden, 1969, The Collaborative
 Test, p. 151-158, In Precision Measurement
 and Calibration, H. H. Ku, Editor. U.S.
 Department of Commerce, National Bureau
 of Standards. 436 pp.

  The average of A = (s + t + u + v)/4,
compared with the average of a = (w + x + y
+ z)/4, can serve as a rapid means of
assessing the effect of changing variable
A to a. Since each of the two groups of
four determinations contains the other six
variables, twice at the upper case level
and twice at the lower case level, the
effects of these variables (if present) tend
to cancel out, leaving only the effect of
changing variable A to a. The relative
effect of the other variables can also be
estimated by examining the following
averages:

  B = (s + t + w + x)/4      b = (u + v + y + z)/4
  C = (s + u + w + y)/4      c = (t + v + x + z)/4
  D = (s + t + y + z)/4      d = (u + v + w + x)/4
  E = (s + u + x + z)/4      e = (t + v + w + y)/4
  F = (s + v + w + z)/4      f = (t + u + x + y)/4
  G = (s + v + x + y)/4      g = (t + u + w + z)/4

After tabulating the above averages, the
differences between each respective
variable would be computed, e.g.,

  A - a = (s + t + u + v)/4 - (w + x + y + z)/4

Examination of these respective
differences enables the evaluating laboratory
to assess which variables are probably
affecting the test result. As stated above,
most of the modest procedural alterations
should have little or no effect on the
result. Considerable information can be
gained by merely comparing these
differences (A-a, B-b, C-c, D-d, E-e, F-f, and
G-g). The evaluating laboratory may wish
to conduct multiple tests (repeat analyses)
for each variable combination (e.g., 10
successive independent analyses for each
of the 8 combinations) depending on the
sponsor's requirements. Under these
conditions, if any of the respective
differences between averages is greater
than two times the standard deviation for
each variable (experiment), the testing
laboratory would have another indication
that the particular variable is affecting the
test result. After discussing the results
with the sponsor, additional studies might
be planned to define the limits of
acceptable variation for a particular critical test
procedure.
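The tabulation of averages and differences described above can be sketched in a few lines of Python. This is only an illustration: the names `DESIGN` and `ruggedness_differences` are hypothetical, the factor levels are transcribed from Table 1, and the example results are invented numbers, not data from any study.

```python
# Factor levels for the eight experiments in Table 1 (upper case = protocol
# directed condition, lower case = varied condition).
DESIGN = [
    "ABCDEFG", "ABcDefg", "AbCdEfg", "AbcdeFG",
    "aBCdeFg", "aBcdEfG", "abCDefG", "abcDEFg",
]

def ruggedness_differences(results):
    """Return {variable: directed average - altered average}.

    `results` holds the eight analysis results s, t, u, v, w, x, y, z
    in experiment order.
    """
    if len(results) != len(DESIGN):
        raise ValueError("expected one result per experiment (8 in total)")
    differences = {}
    for position, variable in enumerate("ABCDEFG"):
        directed = [r for row, r in zip(DESIGN, results) if row[position].isupper()]
        altered = [r for row, r in zip(DESIGN, results) if row[position].islower()]
        differences[variable] = sum(directed) / 4 - sum(altered) / 4
    return differences

# Hypothetical results s..z (illustrative numbers only).
example = [12.0, 10.0, 12.0, 10.0, 12.0, 10.0, 12.0, 10.0]
for variable, diff in ruggedness_differences(example).items():
    print(f"{variable}-{variable.lower()}: {diff:+.2f}")
```

In this invented example only the C-c difference is non-zero, which is the pattern the text describes for a single influential variable: one difference stands well apart from the rest.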
  The  number of variables to  select
should  be  discussed with the  sponsor
prior to the  test.  However, reference
tables for designing ruggedness tests are
available when certain numbers of  pro-
cedural variables  are selected. Having
completed the ruggedness evaluation, the
subsequent phases of the single labo-
ratory test (precision, method sensitivity,
etc.) can  then be conducted  using a
revised method procedure.

Method Precision
  The only requirement for the precision
test is to conduct 10 separate determina-
tions on the same  sample (preferably
using a reference material). Each separate
determination must represent a valid test
response as  required by the particular
method protocol. It is also recommended
that the separate  precision determina-
tions be conducted on alternate days, i.e.,
an interval of at least one day between
the completion of one analysis and the
start of the next. The resulting data can be
expressed either as a standard deviation,
a standard error, or  as  a coefficient of
variation.
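As a minimal sketch of the three ways of expressing precision named above, the following computes each from a set of determinations. The function name `precision_summary` and the ten example values are hypothetical and serve only to illustrate the arithmetic.

```python
import math
import statistics

def precision_summary(responses):
    """Summarize precision for a set of valid test responses."""
    n = len(responses)
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)  # sample standard deviation (n - 1)
    return {
        "mean": mean,
        "standard_deviation": sd,
        "standard_error": sd / math.sqrt(n),
        "coefficient_of_variation_percent": 100.0 * sd / mean,
    }

# Ten hypothetical determinations on the same reference material.
responses = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
for name, value in precision_summary(responses).items():
    print(f"{name}: {value:.4f}")
```

The coefficient of variation is expressed here as a percentage of the mean, which makes it convenient for comparing precision across different concentrations.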


Method Accuracy
  To determine a method's single labora-
tory capability for  accuracy  (and for
systematic error), the testing laboratory
must have both  a standard reference
material and  a known method response
(true response) to this reference material.
When  a method calls  for analysis  of
biological tissue or biological fluid, there
will  usually  be a standard reference
material available to the testing labora-
tory, e.g., samples with certified com-
pound  concentrations or perhaps  with
certified enzymatic  activity  levels.  In
these  instances, the method's single
laboratory capability for accuracy can be
assessed by determining the differences
between the  observed single laboratory
result,  using  the reference  sample, and
the known true value. Ten separate deter-
minations (10 valid responses) should be
conducted using a single concentration of
the reference sample material. Each of
the 10 determinations must represent a
valid test response  as  directed in the
particular method protocol. The method
protocol presents whatever requirements
are necessary for replicates, blank sam-
ples, etc.,  in order  to  provide a valid
response. Student's t-test would be
used to determine the significance of the
difference between the observed single
laboratory test result and the known true
value.
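One way to carry out this comparison is a one-sample t-test of the 10 determinations against the certified value. The sketch below assumes that form of the test and a two-sided 95 percent criterion; neither detail is stated in the guidance, and the function name and data are hypothetical.

```python
import math
import statistics

# Two-sided 95 percent critical value of Student's t for 9 degrees of
# freedom (10 determinations), taken from standard tables.
T_CRITICAL_9_DF = 2.262

def accuracy_t_statistic(responses, true_value):
    """One-sample t statistic for the mean response versus the true value.

    Requires at least two responses that are not all identical.
    """
    n = len(responses)
    mean = statistics.mean(responses)
    standard_error = statistics.stdev(responses) / math.sqrt(n)
    return (mean - true_value) / standard_error

# Ten hypothetical determinations and a hypothetical certified value.
responses = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
t = accuracy_t_statistic(responses, true_value=10.2)
significant = abs(t) > T_CRITICAL_9_DF
print(f"t = {t:.3f}; difference significant at 95 percent: {significant}")
```

A statistic larger in magnitude than the critical value suggests that the difference between the observed mean and the true value is unlikely to be due to random variation alone.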
  If the method being single laboratory
tested is a toxicity test or perhaps an algal
assay  test (population stimulation  or
inhibition), there will usually be no "true
response" available for a reference mate-
rial and hence the method's single labo-
ratory  capability for accuracy (or for
systematic error) cannot really be deter-
mined. Under these conditions, the test-
ing  laboratory  should  first  select  a
reference material and  then determine
an  average test response  for  a single
concentration of the reference sample.
When  determining the average test re-
sponse, it is recommended that 20 inde-
pendent determinations (valid  test re-
sponse as indicated by method protocol)
be conducted. While the literature data
base may provide valuable background
information  for  a method's average
response to various sample  types, the
single  laboratory testing group  should
still  conduct these determinations (to
acquire an average test response to  a
reference material) using the ruggedness
tested/revised protocol.
Systematic Error
  A method's capability for (minimizing)
systematic error can be considered as a
part of the method's capability for accu-
racy. If a true response (known value) is
not available for a reference material, the
single laboratory test will not be able to
acquire data on the method's systematic
error. The testing laboratory should prob-
ably remind the sponsor that, under these
conditions, no data  will be provided for
this phase of the  single laboratory test.
Comparison of test data with the results
of a  reference  method can be  used in
certain situations, but for the purposes of
this program, the use of reference meth-
ods is  not considered as part of single
laboratory testing.  It  should also be
remembered that it is the bias of the
method, not the bias of the laboratory that
is being addressed. Single  laboratory
testing does not really address laboratory
bias even though  it  will obviously affect
test results.
  In the case of a bioanalytical method for
which  a  standard reference material is
available, the testing laboratory should
estimate a method's systematic  error by
using various dilutions of the reference
sample.  A single concentration of the
standard reference  material  would  be
aliquoted into at least three equal sam-
ples. Two of these aliquots would then be
diluted to different  total volumes, thus
creating three different sample sizes. Ten
respective independent analyses would
then be conducted on each of the three
sample groups. For some methods  it
might  be preferable to make the three
groups all have the same sample size, i.e.,
different total amount of analyte. Results
from each group (results from two groups
would first be corrected by the respective
dilution factor) would then  be compared
(Student's t-test) with the true value for the
reference material. Any consistently
present systematic error should be evident in
each of these three test groups. Additional
concentrations/dilutions as well as addi-
tional independent determinations would
probably  be  beneficial for the single
laboratory assessment of systematic error
(depending  on the  sponsor's  require-
ments).
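The dilution correction and comparison with the true value can be illustrated as follows. This is a simplified sketch: it reports the mean bias of each dilution-corrected group and checks only whether all groups err in the same direction, which is a rough screen and not a substitute for the t-test described above. All names and numbers are hypothetical, and only two results per group are shown for brevity where the guidance calls for ten.

```python
import statistics

def corrected_bias(group_results, dilution_factor, true_value):
    """Mean of dilution-corrected results minus the certified true value."""
    corrected = [r * dilution_factor for r in group_results]
    return statistics.mean(corrected) - true_value

def consistent_bias(groups, true_value):
    """Return per-group biases and whether they all share one sign.

    `groups` is a list of (results, dilution_factor) pairs; the undiluted
    group has a dilution factor of 1.
    """
    biases = [corrected_bias(res, factor, true_value) for res, factor in groups]
    same_sign = all(b > 0 for b in biases) or all(b < 0 for b in biases)
    return biases, same_sign

# Hypothetical data: undiluted, 2-fold, and 4-fold dilutions of one material.
groups = [
    ([104.0, 106.0], 1.0),
    ([52.0, 53.0], 2.0),
    ([26.0, 26.5], 4.0),
]
biases, same_sign = consistent_bias(groups, true_value=100.0)
print(biases, same_sign)
```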

Method Sensitivity
  For purposes of a single laboratory test,
a method's sensitivity is defined as the
method's capability to detect (or distin-
guish between) small changes in sample
concentration, i.e., concentration of ana-
lyte. A chemically  characterized  refer-
ence material  should be used as sample
material during sensitivity testing.
  Assume concentration A (Figure 2) had
been selected as  the concentration of
sample material used previously  in the
method precision test. For the sensitivity
test, the  laboratory would select  one
concentration greater than (C, Figure 2),
and one  concentration less than  (B,
Figure  2) the concentration used during
the precision test. These concentrations
should probably  be equally distant from
the  precision test  concentration (A,
Figure  2). The laboratory should conduct
10  independent  analyses for each new
concentration (i.e.,  10  separate  valid
responses  acquired by following  the
method protocol). Assuming that the
procedure can distinguish between A and
C, and between  A and B, then the test
laboratory will reduce the concentration
interval by one-half (to C' and B', Figure
2). If the method  is still capable of distin-
guishing between A and C', and between
A and  B', then the test laboratory will
again reduce the sample concentration
interval by one-half (to C" and B"). The 10
independent analyses on these last
respective concentrations (C" and B",
Figure 2) will complete the sensitivity test
even if the method can distinguish
between A and C", and between A and B".
The sponsor can indicate if additional
information on the method's sensitivity is
required and, if so, direct the test
laboratory to continue this process or to repeat
the process using a different reference
material.

Figure 2.  Example of different reference material concentrations used in
           sensitivity testing. [Figure: concentration scale, from highest
           to lowest: C, C', C", A, B", B', B.]
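The sequence of test concentrations in this procedure can be generated as shown below. The sketch assumes the concentrations are placed symmetrically about the precision-test concentration (the text says only that they "should probably be equally distant"), and the function name and numbers are hypothetical. How the laboratory decides that two concentrations have been "distinguished" is left to the method protocol and the sponsor and is not modeled here.

```python
def sensitivity_ladder(precision_concentration, initial_interval, rounds=3):
    """Return (lower, upper) concentration pairs, halving the interval each round.

    With the default of three rounds the pairs correspond to (B, C),
    (B', C') and (B", C") around concentration A.
    """
    if initial_interval <= 0 or initial_interval >= precision_concentration:
        raise ValueError(
            "interval must be positive and smaller than the precision concentration"
        )
    pairs = []
    interval = initial_interval
    for _ in range(rounds):
        pairs.append((precision_concentration - interval,
                      precision_concentration + interval))
        interval /= 2.0
    return pairs

# Hypothetical example: A = 100 units, first interval of 40 units.
for lower, upper in sensitivity_ladder(100.0, 40.0):
    print(f"lower = {lower}, upper = {upper}")
```

Each pair would then receive 10 independent analyses per concentration, as described in the text.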
  A relatively poor method capability for
sensitivity does not necessarily  imply
limited  method  usefulness  or that  the
method would be an unlikely candidate
for collaborative testing. The  intended
purpose of the respective  method must
always  be considered  when reviewing
single laboratory test data.

Limits of Reliable Measurement
  In determining a  method's  limits of
reliable measurement, the single labora-
tory test data may simply verify that the
method capabilities for  sensitivity, preci-
sion,  and accuracy (if applicable) do not
deteriorate at the upper and lower ex-
tremes of the detection range. The same
reference  material should be  used for
these tests as was originally  used during
the method precision and method sensi-
tivity evaluations.
  The single laboratory test is not required
to establish an upper and lower detection
limit.  It is assumed that a sufficient litera-
ture data base for the  method exists to
grossly estimate an upper and lower limit
of detection  using the particular sample
material. The evaluating laboratory should
initially  select two concentrations of the
sample material. One of these concentra-
tions  will be near the upper extreme of
the method's  detection range  and  the
other concentration will be near the lower
extreme of the detection range. Ten anal-
yses would be conducted on each concen-
tration  to provide precision data  (ex-
pressed as a coefficient of variation).  The
coefficient of  variation will frequently
show a dramatic increase at the extreme
limits of detection and, therefore, preci-
sion data provide a distinct indication of
the limits of reliable measurement. Two
additional  concentrations, one  at each
extreme of the estimated response range
should then be selected in order to con-
duct the sensitivity determinations. Thus,
two sample  concentrations at each ex-
treme of the estimated response range
can be compared (in terms of  precision
and sensitivity) with the previously ac-
quired data. If the method is a bioanalyti-
cal technique with an available true value
for the reference material, accuracy data
would also be acquired.
  When using this evaluation plan, a true
limit of reliable  measurement may not
actually be established. However, even
under these conditions, data would still
be available to indicate that the technique
was, or was not, capable  of providing
reliable measurements  at the extreme
concentrations selected. Additional sam-
ple concentrations,  or additional test
substances,  can  also be selected based
on the sponsor's needs.

Conclusions
  The single laboratory test data and the
revised (ruggedness tested) method proto-
col will ultimately be used as part of the
basis  for  deciding  whether  or  not  to
proceed with collaborative testing. If the
technique is selected,  different labora-
tories will analyze aliquots  of the same
sample  material (strictly following  the
method protocol) in order to validate the
method's  performance. Data from  the
collaborative test will be used to deter-
mine the reproducibility (overall between-
laboratory variability) that can be expected
when the procedure is used by different
qualified laboratories.
   William D. McKenzie and Theodore A. Olsson III are with Bioassay Systems
     Corporation, Woburn, MA 01801.
   William W. Sutton is the EPA Project Officer (see below).
   The complete report, entitled "Guidelines for Conducting Single Laboratory
     Evaluations of Biological Methods," (Order No. PB 84-124 841; Cost: $10.00,
     subject to change) will be available only from:
          National Technical Information Service
          5285 Port Royal Road
          Springfield, VA 22161
          Telephone: 703-487-4650
   The EPA Project Officer can be contacted at:
          Environmental Monitoring Systems Laboratory
          U.S. Environmental Protection Agency
          P.O. Box 15027
          Las Vegas, NV 89114

-------
United States
Environmental Protection
Agency
Center for Environmental Research
Information
Cincinnati OH 45268
                                                                 * U.S. GOVERNMENT PRINTING OFFICE: 1984-759-102/826

-------