COLLABORATIVE STUDY
                      of
REFERENCE METHOD FOR THE CONTINUOUS
MEASUREMENT OF CARBON  MONOXIDE IN
     THE ATMOSPHERE (NON-DISPERSIVE
          INFRARED SPECTROMETRY)
                 Herbert C. McKee
                 Ralph E. Childers
               Contract CPA 70-40
              SwRI Project 01-2811

                  Prepared for
        Office of Measurement Standardization
         Division of Chemistry and Physics
       National Environmental Research Center
         Environmental Protection Agency
         Research Triangle  Park, N. C. 27709
                   May 1972

-------

                        Approved
                        Herbert C. McKee
                        Assistant Director
                        Department of Chemistry
                        and Chemical Engineering

-------
                              SUMMARY AND CONCLUSIONS
      This  report presents information obtained in the evaluation and collaborative testing of a reference
method for measuring the carbon monoxide content of the atmosphere.

      This  method was  published  by  the  Environmental Protection Agency in the Federal Register,
April 30, 1971, as the reference method to be used in connection with Federal ambient air quality stan-
dards for carbon monoxide. Following minor editorial changes, the method was republished in the Federal
Register, November 25, 1971. The former publication is reproduced as Appendix A of this report.

      The method is based on the infrared absorption  characteristics of carbon monoxide, using an instru-
ment calibrated with gas  mixtures containing known concentrations of carbon monoxide. A similar method
based  on the same  principle  has been published by the Intersociety Committee as Tentative Method
42101-04-69T in Health Laboratory  Science, January 1970, Part Two, pp 81-86.

      The method published in the  Federal Register was tested, as a part  of this program, by means of a
collaborative test  involving a total of  16  laboratories. The test involved the analysis of both dry and
humidified mixtures of carbon monoxide and air over the concentration range from 0 to 60 mg/m3. A
statistical analysis of the data from 15 laboratories provided the following results:

      •    The checking limit for duplicates is 0.5 mg/m3

      •    The repeatability is 1.6 mg/m3

      •    The reproducibility varies nonlinearly  with concentration with a minimum of 2.3 mg/m3 at a
           concentration of 20 mg/m3 and  ranges as high as 4.3 mg/m3 in the concentration range of 0 to
           60 mg/m3

      •    The minimum detectable sensitivity is 0.3 mg/m3

      •    The compensation for water vapor interference is satisfactory for drying agents and
           refrigeration methods. The use of narrow-band optical filters alone may not provide
           adequate compensation.

      •    The  accuracy is totally  dependent upon the availability of dependable calibration standards.
           Based on  the results of  this collaborative study, the method produces results, on the average,
           2.5 percent high.

      In addition, this report presents other results with respect to the quality of calibration standards and
the minimum number of samples required to establish validity of results within stated limits.

-------
                                   ACKNOWLEDGEMENT
     The authors wish to express appreciation to the Project Officer, Mr. Thomas W. Stanley, and staff
member, Mr. John H. Margeson, of the Office of Measurement Standardization, for assistance in the plan-
ning and execution of the collaborative study.

     The assistance and advice of Scott Research Laboratories, Plumsteadville, Pennsylvania, who prepared
and analyzed the test gases, is acknowledged.

     The assistance and cooperation of the participating laboratories is also  acknowledged with sincere
appreciation for the voluntary efforts of the staff members who represented each organization. The repre-
sentatives and organizations participating in one or more phases of the collaborative test program were as
follows:
     Name                         Organization

     Richard S. Brief             Esso Research and Engineering Company
     R.G. Confer                  Linden, New Jersey

     Franz J. Burmann             Environmental Protection Agency
     Jack A. Bowen                Durham, North Carolina

     Charles A. Cody              Southwest Research Institute
                                  Houston, Texas

     Ronald P. Dubin              Bureau of Air Pollution Control
     Richard Wonderlick           Pennsylvania Department of
                                    Environmental Resources
                                  Harrisburg, Pennsylvania

     Milton Feldstein             Bay Area Air Pollution Control District
                                  San Francisco, California

     Judith Garelick              Bureau of Air Pollution Control
     Diane Berkel                 Nassau Department of Health
                                  Mineola, New York

     Kenneth T. Irwin             Jefferson County, Kentucky
                                  Air Pollution Control District
                                  Louisville, Kentucky

     Norman J. Lewis              Division of Environmental Quality
                                  New Jersey State Department of Health
                                  Trenton, New Jersey

     Peter K. Mueller             Air and Industrial Hygiene Laboratory
     S.G. Kerns                   California State Department of Public
                                    Health
                                  Berkeley, California

     Robert E. Pattison           Air Pollution Control Laboratory
     F.D. Olmstead                Canton City Health Department
                                  Canton, Ohio

     Hisham M. Sa'aid             Air Quality Section
     Roger B. McCann              Kentucky Air Pollution Control
                                    Commission
                                  Frankfort, Kentucky

     R.K. Stevens                 Environmental Protection Agency
     Dwight A. Clay               Research Triangle Park, North Carolina

     Bill Stewart                 Air Pollution Control
     J.S. Payne                   Texas State Department of Health
                                  Austin, Texas

     Philip S. Tow                Sacramento County, California
                                  Air Pollution Control District
                                  Sacramento, California

     Alvin L. Vander Kolk         Standards and Analysis Sections
     Ken Smith                    Michigan Department of Public Health
                                  Lansing, Michigan

     Grant S. Winn                Air Quality Section
     Carl E. Kerr                 Utah State Division of Health
                                  Salt Lake City, Utah

-------
                                  TABLE OF CONTENTS

I.    INTRODUCTION

II.   COLLABORATIVE TESTING OF THE METHOD

     A.    Furnishing Test Samples and Calibration Gases
     B.    Selection of Collaborators
     C.    Collaborative Test Procedure

III.  STATISTICAL DESIGN AND ANALYSIS

     A.    Summary of Design
     B.    Summary of Results

LIST OF REFERENCES

                                 LIST OF ILLUSTRATIONS

Figure

   1       Repeatability and Reproducibility Versus Concentration


                                      LIST OF TABLES

Table

   I       Reference Values for Carbon Monoxide Test Concentrations Used in Collaborative
          Test, Parts per Million

-------
              I. INTRODUCTION

      While carbon monoxide has received less atten-
 tion  than  some  other  contaminants, it is found in
 many of the urban areas of the world. Many different
 sources of carbon  monoxide exist in a typical city,
 but by  far the predominant source is motor vehicles.
 As recent control measures reduce the emissions from
 vehicles, other sources  such  as incinerators  and var-
 ious industrial operations will represent a larger per-
 centage of the total.

      Carbon monoxide has  long been  known to be
 toxic at high concentrations, producing  illness and
 eventually death. At the lower levels of concentration
 found in many urban atmospheres, carbon monoxide
 may act to impair various bodily functions, although
 the exact  exposure conditions  required to  produce
 such effects have not been definitely established.

      Attempts to measure carbon monoxide by a direct
 chemical method, unlike those for most other atmospheric
 contaminants, have met with very limited success.
 For industrial  hygiene purposes, combustion pro-
 cesses and colored indicator tubes  have  been used
 satisfactorily. To measure the lower concentrations of
 interest in  urban  air pollution, however, the only
 satisfactory method which  has  received widespread
 use is based on infrared absorption. Commercial in-
 struments based on this principle have been available
 for many years. The method involves detecting the
 difference in absorption of infrared energy of the at-
 mosphere being  tested  in  a sample  cell and a non-
 absorbing gas in a reference cell. The  difference is
 sensed by selective detectors, sensitive only to carbon
 monoxide, and amplified to provide an output signal.
 The resulting signal is then used to operate a recorder
 which provides  a continuous record of carbon mon-
 oxide levels over a period of time. Under normal at-
 mospheric  conditions,  the  only major interference
 with this method is water vapor, which can  be  over-
 come  through the use of drying  agents or other mea-
 sures as  discussed subsequently.

     In order to obtain  reliable data in measuring
carbon  monoxide  and  other  atmospheric
contaminants, the Environmental Protection Agency
(EPA) Office of Measurement Standardization (OMS)
has been working for some time to develop standard
methods which could be used by all persons making
air quality measurements. Following the development
of a tentative standard method, the final step in the
standardization process  is a collaborative test, or in-
terlaboratory comparison, of the proposed standard
method. This procedure,  also called  "round-robin
testing," has been used  to evaluate many different
methods of  measurement  in  such  diverse fields as
water chemistry, metallurgy, paint and surface coat-
ings, food and related products, and many others. A
test of this nature by a representative group of labora-
tories  is the only way  that the statistical limits of
error inherent in any method can be determined with
sufficient confidence.

     This report presents the results of a collabora-
tive test of the carbon monoxide method conducted
by  Southwest Research Institute and the Office of
Measurement Standardization, together with the  sta-
tistical analysis of the data obtained. In this collabo-
rative  test,   standard  samples  contained  in  high-
pressure cylinders  were  prepared  and  carefully
analyzed to  determine  exact concentration. These
cylinders were  then distributed to a representative
group of laboratories who participated in the test on
a  voluntary  basis.  These  samples  were  analyzed
according to the standard procedure as outlined in
the tentative method, after which the gas cylinders
were returned to the supplier  for reanalysis to again
check the concentration levels. The results  of the  col-
laborative test  were then  analyzed statistically to
determine the accuracy and precision of the proposed
method.

       II. COLLABORATIVE TESTING
              OF THE METHOD

     An important step in the standardization of any
method of measurement is the collaborative testing of
a proposed method to  determine,  on a statistical
basis, the limits of error which can be expected when
the  method  is used  by  a  typical  group  of

-------
investigators. The collaborative,  or  interlaboratory,
test of a method is an indispensable part(1)* of the
development and standardization of an analytical pro-
cedure to insure that (1) the procedure is clear and
complete and that (2) the procedure does give results
with  precision   and  accuracy  in  accord with those
claimed for the  method. Among other organizations,
the   Association  of Official  Analytical  Chemists
(AOAC) and the American Society  for Testing and
Materials (ASTM) have  been  active in the field of
collaborative testing and have published guidelines of
the proper procedure  for conducting  collaborative
tests and evaluating the data obtained.(2-4) Publications
of both organizations were used extensively in
planning and conducting the collaborative tests of
this method to measure carbon monoxide.

      After the evaluation of various methods for pre-
paring test samples, a detailed collaborative test was
undertaken to  obtain the necessary data  to make a
statistical evaluation of the method. This section of
the report describes  the various phases of the test
plan that was developed.

A.   Furnishing Test  Samples and Calibration
      Gases

      Many air  contaminants  must  be  measured at
concentrations  in  the  fractional   parts-per-million
range, and the use of test atmospheres in high-pres-
sure cylinders is not feasible at these low levels due to
reaction or adsorption  effects  which  make  it im-
possible  to maintain an accurately controlled test
concentration.  This is  not true  with carbon mon-
oxide, which is  only of concern at levels in the parts-
per-million range. At these higher levels,  and with
proper precautions, cylinder gas samples can be used
with a reasonable degree of confidence in the stability
of the samples.  The stability should be checked by
periodic reanalysis where possible.

      Test  gases for this collaborative  test were ob-
tained from Scott Research Laboratories, an organiza-
tion with wide experience in the generation, control,
and  analysis  of various gases for  experimental pur-
poses.  For each  test concentration,  a  large master
cylinder containing carbon monoxide in dry synthetic
air was prepared and analyzed accurately by gas-solid
chromatography  using helium ionization detection.
The  chromatograph was calibrated with primary grav-
imetric  gaseous standards prepared  in  glass. From
these master cylinders, smaller cylinders were filled
and  individually analyzed by the  same  method. The
cylinders used had a chromium-molybdenum alloy in-
side  surface of low iron content to minimize the loss
of carbon monoxide which has been reported to be
caused by the formation of iron carbonyl.(5) The
master cylinders were retained and the smaller cylin-
ders  sent to the collaborative test participants.

     At the  conclusion of their analyses, the partici-
pants returned the cylinders to Scott Research Labor-
atories,  who  then reanalyzed the  contents of each
cylinder having sufficient  residual  pressure.  The date
of the first analysis was July 2, 1971; the date of the
reanalysis was January 21, 1972—203 days later. The
results are shown in Table I.  They are shown in parts
per million as reported (divide by 0.873  to convert to
milligrams  per cubic  meter). The results have  not
been converted in Table I in order to eliminate mis-
leading comparisons  because of round-off errors  due
to conversion. The converted values  may be seen in
Table C-II of Appendix C.
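
The conversion stated above can be sketched in a few lines. This is only an illustration of the quoted divide-by-0.873 rule; the function name is hypothetical, and the sample inputs are the three master-cylinder initial analyses from Table I.

```python
# Conversion factor quoted in the text: mg/m3 = ppm / 0.873.
PPM_PER_MG_M3 = 0.873


def ppm_to_mg_m3(ppm: float) -> float:
    """Convert a carbon monoxide concentration from ppm to mg/m3."""
    return ppm / PPM_PER_MG_M3


# Master-cylinder initial analyses from Table I, in parts per million.
for ppm in (7.50, 25.5, 45.5):
    print(f"{ppm:5.2f} ppm = {ppm_to_mg_m3(ppm):5.2f} mg/m3")
```

The converted values fall in the neighborhood of the nominal test concentrations of 8, 30, and 53 mg/m3 used in the collaborative test.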

     Agreement between first and final analyses was
good for all but 3 or 4 of the 48 cylinders used. Most
collaborators completed their work within 30 days of
the  first  analysis; all  work was complete  within
80 days.  Therefore,  if any  corrections were to be
applied, the result would be  much closer to the first
analysis.  No  corrections were applied  and the first
analyses were used as the reference values.

*Superscript numbers in parentheses refer to the List of References.

-------
TABLE I. REFERENCE VALUES FOR CARBON MONOXIDE TEST CONCENTRATIONS
         USED IN COLLABORATIVE TEST, PARTS PER MILLION

             Cylinder      Initial      Final
Assignee     Number        Analysis     Analysis     Change

Master       W-18156        7.50         7.47        -0.03
220          C-658          7.29         7.27        -0.02
222          C-657          7.47          -            -
253          C-656          7.40          -            -
270          C-655          7.33         7.21        -0.12
310          C-654          7.43         7.53         0.10
311          C-653          7.49          -            -
370          C-652          7.48         7.52         0.04
375          C-651          7.20         6.54        -0.66
540          C-650          7.42         7.46         0.04
571          C-649          7.45         7.44        -0.01
780          C-648          7.35         7.33        -0.02
799          C-644          7.36         7.33        -0.03
860          C-647          7.13         5.50        -1.63
920          C-646          7.48         7.28        -0.20
923          C-645          7.48         7.57         0.09
927          C-641          7.36         7.22        -0.14

Master       W-138035      25.5         25.2         -0.3
220          C-674         26.1         26.8          0.7
222          C-673         26.4          -             -
253          C-672         26.3          -             -
270          C-671         26.1         26.2          0.1
310          C-670         26.0         26.0          0.0
311          C-669         26.0         26.3          0.3
370          C-668         26.1          -             -
375          C-667         26.4         26.2         -0.2
540          C-666         26.1         26.1          0.0
571          C-665         26.2         26.4          0.2
780          C-664         26.0         26.0          0.0
799          C-659         26.4         26.2         -0.2
860          C-663         26.1         26.4          0.3
920          C-662         26.0         25.9         -0.1
923          C-661         26.3         25.9         -0.4
927          C-660         26.3         26.3          0.0

Master       W-138036      45.5         45.5          0.0
220          C-690         45.9         45.7         -0.2
222          C-689         45.5          -             -
253          C-688         45.7          -             -
270          C-687         45.6         45.7          0.1
310          C-686         45.6         45.7          0.1
311          C-685         45.9         45.6         -0.3
370          C-684         45.5          -             -
375          C-683         45.7         45.6         -0.1
540          C-682         45.7          -             -
571          C-681         45.7         45.7          0.0
780          C-680         45.6         45.7          0.1
799          C-675         45.7         45.7          0.0
860          C-678         45.7         45.6         -0.1
920          C-677         45.7         45.8          0.1
923          C-676         45.7         45.4         -0.3
927          C-679         45.7         45.2         -0.5

Source: Scott Research Laboratories

     Since the infrared instrument produces a relative
measurement, calibration with standard gases is
necessary in order to convert this measurement to a
measured concentration as described in the method
(Appendix A). This is done by the use of calibration
gases representing 20, 40, 60, and 80 percent of the
range of the instrument. Such calibration gases are
available from a number of commercial suppliers, and
the participants were instructed to obtain the
necessary calibration gases from their usual sources.
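
As a rough illustration of how calibration gases at 20, 40, 60, and 80 percent of range turn a relative reading into a concentration, the sketch below interpolates linearly between calibration points. Everything specific in it is an assumption made for illustration: the 0 to 50 ppm range, the zero point, the recorder readings, and the choice of straight-line interpolation. The governing procedure is the one in Appendix A.

```python
from bisect import bisect_right


def make_calibration(readings, concentrations):
    """Build a reading -> concentration function by linear interpolation
    between calibration points (an assumed, simplified curve form)."""
    points = sorted(zip(readings, concentrations))
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def to_concentration(reading):
        if not xs[0] <= reading <= xs[-1]:
            raise ValueError("reading outside the calibrated range")
        # Index of the upper calibration point bracketing the reading.
        i = min(bisect_right(xs, reading), len(xs) - 1)
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (reading - x0) / (x1 - x0)

    return to_concentration


# Hypothetical data: zero gas plus gases at 20, 40, 60, and 80 percent
# of an assumed 0-50 ppm range, with made-up recorder readings.
cal_ppm = [0.0, 10.0, 20.0, 30.0, 40.0]
cal_reading = [0.0, 22.0, 42.5, 61.5, 79.0]  # percent of chart

calibrate = make_calibration(cal_reading, cal_ppm)
print(calibrate(32.25))  # halfway between the 10 and 20 ppm points
```

Any error in the stated content of the calibration gases passes directly into the converted result.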

      Because  of this calibration  procedure,  the
accuracy of the method is completely  dependent on
the accuracy of the  calibration  gases used.  For this
reason,  all collaborators were instructed to take all
possible precautions to obtain calibration gases of suf-
ficient accuracy and to safeguard these materials from
contamination or deterioration in storage or use.

B.    Selection of Collaborators

      If a collaborative test is to achieve the desired
objective, it is  desirable that the participants in the
test be representative of the large group that will ulti-
mately use the method being tested. Since air pollu-
tion measurements are  of interest to many different
groups,  it was  desirable to include in  the group of
collaborators a  variety of  governmental agencies, uni-
versities, industrial laboratories, and others. The final
selection of participants  included two from federal
laboratories, twelve from state and local air pollution
control  agencies, one from industry, and one from  a
research  institution.  A complete list of the partici-
pants and their affiliation is  given in  the acknowl-
edgement.

      Even more important than the type of labora-
tory is the degree of skill and experience of the per-
sons who participated. Each laboratory was  asked to
assign a person to this test who  had previous experi-
ence with the infrared method for measuring carbon
monoxide and  was competent  in carrying out mea-
surements by this method. This was done because the
emphasis was upon  the capabilities of the method
rather than the performance of the laboratories. Each
laboratory had previous experience in  the use of the
method and thus possessed a satisfactory infrared in-
strument and the necessary equipment  for laboratory
processing of samples and calibration gases.

      For purposes  of familiarization, each partici-
pant was furnished a standard test sample for analysis
prior to  the  actual  collaborative test.  Results from

-------
 these preliminary  runs were used as an approximate
 check on the experience and skill of each participant,
 with  the  intention of eliminating any whose  results
 were  grossly in error, thus indicating a lack of famili-
 arity  or experience with the method. No  such elimi-
 nation was necessary and, therefore, all of the partici-
 pants originally  selected were used  in  the  actual
 collaborative test which followed.

 C.   Collaborative Test Procedure

      After the preliminary familiarization samples
 were  analyzed and the  results obtained, test samples
 for the actual  collaborative test were  distributed  to
 each  participant. The national primary and secondary
 air   quality   standard  for carbon  monoxide  is
 10 mg/m3 for  8 hr or 40 mg/m3  as a maximum 1-hr
 concentration  (both to be  exceeded not  more than
 once  per year). Therefore,  test concentrations were
 selected  to indicate  the variability of the method
 within these ranges. This led to the selection of 8, 30,
 and 53 mg/m3  as test concentrations for purposes of
 collaborative testing.

      In addition to examining the effects of concen-
 tration on precision and accuracy, it was necessary to
 realistically evaluate the effects of humidity  on the
 analysis.  Therefore, in addition to analyzing the dry
 test gases, each was analyzed after humidification.
 The  test  gases were essentially saturated  by passing
 them through a midget impinger containing 15 ml
 distilled water. Losses  of carbon monoxide  due  to
 absorption are negligible.

      In order  to estimate other random effects, each
 of the three concentrations  was analyzed in triplicate
 on each of 3 days under both dry and humid conditions.
 This resulted in a total of 810 separate determinations,
 54 by each reporting laboratory. Section I-B
 of Appendix B contains a more detailed discussion of
 the experiment design, and Figure B-1 graphically
 shows the design.
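
The count of determinations follows directly from the factors named above. The short sketch below enumerates the crossed design as described in this section; the variable names are illustrative only.

```python
from itertools import product

concentrations = (8, 30, 53)          # nominal test levels, mg/m3
conditions = ("dry", "humidified")
days = (1, 2, 3)
replicates = (1, 2, 3)                # triplicate analyses
reporting_labs = 15

# Every combination of level, condition, day, and replicate for one laboratory.
runs_per_lab = list(product(concentrations, conditions, days, replicates))

print(len(runs_per_lab))                    # determinations per laboratory
print(len(runs_per_lab) * reporting_labs)   # determinations in the whole test
```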

      The results of this test series were then used for
statistical analysis, which is described in detail
in Appendix B and summarized in the next section.

          III. STATISTICAL DESIGN
                AND ANALYSIS

      Several fundamental requirements must be met
in order to provide the maximum reliability  of the
collaborative test. First, the  conditions of the  test
must be representative of a specified population; each
factor involved must be a representative sample of a
population about which inferences are to  be drawn.
Second, the collaborative test must be unbiased; pre-
cautions must be taken to avoid the introduction of
any bias in the collaborative  test procedure. It is im-
portant that the collaborators assume a responsibility
to try to eliminate any bias by carefully following the
instructions of the collaborative procedure and the
method. Every  detail  is  important  and  even  the
slightest departure from the specified procedures may
bias  the results. Third, the results of the collaborative
test must be reproducible; that is, the  conditions for
the test should be such that  similar results would be
obtained if the collaborative  test were repeated. The
fourth requirement involves the scope of the test; the
materials and conditions for  which  the  analytical
method was designed must be included in the test.
Finally, the collaborative test must be practical  and
economically feasible. Since  funds and facilities are
never available for an unlimited testing program, it is
necessary to  accept less than the ideal  testing pro-
cedures in  order to accomplish the program.  Thus,
fundamental  requirements may  not  be completely
fulfilled, since any practical  compromise introduces
limitations  on the inferences that can be  drawn. If
pursued  too  far,  compromises from  practical con-
siderations  may  render the collaborative test useless.

      Appendix B contains the complete  and detailed
description of the design and analysis of the formal
collaborative test. The results of Appendix B are sum-
marized in this section.

A.    Summary of Design

      The primary purpose was to establish the reli-
ability of the  method in terms of its precision and
accuracy. More emphasis was  placed on the  quality of
the  method  when  properly  used than  upon  the

-------
performance of the laboratories. At the same time, it
was necessary  to retrieve  information which would
allow  the  investigation  of  other aspects  of  the
method; therefore, intermediate data were obtained
relating to calibration curves.

      The statistical planning of a  program is limited
in  scope and   depends upon what information is
desired. The scope is limited by what a collaborating
laboratory can  conveniently and economically accom-
plish, as well as by the number of collaborators that
can be accommodated. Under these limitations, it was
possible  to  examine the effects of laboratories, con-
centrations,  and  days upon  the precision of the
method  in   addition  to  estimating the  replication
error.

      Of the 16 laboratories that took part in the test
program, 15 satisfactorily  completed the test. These
laboratories constitute  a random  sample  of a rather
large population of experienced laboratories. Three
different concentrations were  analyzed by each labo-
ratory. The  concentrations were nominally 8, 30, and
53 mg/m3. Each of the three concentrations was analyzed
both dry and humidified, in triplicate, on each
of 3 separate days using independently prepared cali-
bration curves. This procedure resulted in a  total of
810 individual determinations.

      The collaborative test was designed to allow the
analysis of the  results using the most efficient statisti-
cal  methods available. The experiment was designed
so that the linear model analysis(3,6-8) could be used.
This analysis, as well as  tests for outlying observa-
tions, is described in Appendix B.

B.    Summary of Results

      1.    Procedural Errors

           Since the emphasis was upon  the quality
of the method  and not upon the performance of the
laboratories, all arithmetic  errors were corrected, and
the   arithmetic  error problem was evaluated quali-
tatively.  Few instances of errors in arithmetic opera-
tions were  noted. The method is relatively  simple
and, consequently, is not vulnerable to arithmetic and
procedural errors.
      2.    Precision Between Replicates

           The replication error (see Section II-A of
Appendix B for detailed definition) was shown to be
independent  of concentration  and humidity. The
standard deviation for variation between replicates is
equal to 0.17 mg/m3. Replication will not materially
assist in  increasing the precision of the method, and
will, in general, be  a waste of time and effort; how-
ever,  replicates are  often  advisable to avoid gross
errors.

           The  checking  limit  for  duplicates   is
0.5 mg/m3; therefore, two replicates differing  by
more than this amount should be considered suspect.
Section II-A of Appendix B contains more details  re-
garding the replication error.
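
In practice, the checking limit can be applied as a simple screen on duplicate readings. The sketch below assumes results are already in mg/m3; the function name and sample values are hypothetical.

```python
CHECKING_LIMIT = 0.5  # mg/m3, checking limit for duplicates quoted in the text


def duplicates_suspect(first: float, second: float,
                       limit: float = CHECKING_LIMIT) -> bool:
    """Return True when two replicate results differ by more than the limit."""
    return abs(first - second) > limit


print(duplicates_suspect(8.1, 8.4))  # difference of 0.3 mg/m3
print(duplicates_suspect(8.1, 8.8))  # difference of 0.7 mg/m3
```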

      3.    Humidity Effects

           The humidity  has  no  measurable effect
upon the precision or accuracy when drying or refri-
geration  methods  are used (see Section 3.1  of  the
method in Appendix A). No data are available for the
saturation method. Optical filters alone do not appear
to be adequate; however, this conclusion is based  on
very limited  data. Section II-B of Appendix B  con-
tains further details regarding humidity effects.

      4.    Precision Between Days

           The standard deviation  for variation be-
tween days for the  same sample includes both  the
replication error and a component for between-days
variation and is equal to 0.47 mg/m3. Two test results
on the same sample on different days by the same
laboratory should not differ by more than 1.3 mg/m3.

           The standard deviation  for variation be-
tween days for different but similar samples includes
an  additional term  to account  for  heterogeneity
between samples. The corresponding standard deviation
is 0.57 mg/m3. If the test results on each of
these samples differ by less than 1.6 mg/m3 (the
repeatability of the method), there is no reason to
believe there is any real difference between them.
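
The limits quoted in this section are consistent with multiplying each standard deviation by about 2.77, that is, 1.96 times the square root of 2, the conventional 95 percent factor for the difference between two results. This section does not state the factor; it is assumed here, and Appendix B should be consulted for the actual derivation.

```python
import math

# Assumed factor (not stated in this section): 1.96 * sqrt(2), about 2.77.
FACTOR = 1.96 * math.sqrt(2)


def difference_limit(std_dev: float) -> float:
    """Largest expected difference between two results, given their standard deviation."""
    return FACTOR * std_dev


# Standard deviations quoted in the text, mg/m3.
std_devs = {
    "between replicates": 0.17,
    "same sample, different days": 0.47,
    "similar samples, different days": 0.57,
}

for label, sd in std_devs.items():
    print(f"{label}: {difference_limit(sd):.1f} mg/m3")
```

Rounded to one decimal place, the three products agree with the checking limit, the between-days limit, and the repeatability given above.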

-------
           Section  III-A-2 of Appendix B presents
more details regarding precision between days; in par-
ticular, a comparison of the means of different popu-
lations each analyzed by the same laboratory.

      5.    Precision Between Laboratories

           The standard deviation for variation between
laboratories includes terms representing additional,
more complex, effects and is equal to the square
root of $V_j(y)$, where $V_j(y)$ is given by

       $V_j(y) = 0.001007\,x_j^2 - 0.0393\,x_j + 1.10$

where the subscript $j$ is attached to signify the
dependence upon the concentration $x_j$, which is the
independent variable.
                            Two  test  results  on  the same sample
                 should  agree  within  the  reproducibility  which is
                 shown plotted versus concentration in Figure 1  along
                 with the repeatability for comparison. If the test re-
                 sults on two different samples differ by less than the
                 reproducibility, there is no reason to believe there is
                 any real difference between them.

                            Section  III-A-3  of Appendix B includes
                 more details regarding the  precision between  labo-
                 ratories.
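
The dependence of the reproducibility on concentration can be sketched from the variance expression above. The multiplier of about 2.77 (1.96 times the square root of 2) is an assumption, since this section does not state it; with that assumption the curve reproduces the values quoted in the Summary.

```python
import math

# Assumed factor (not stated in this section): 1.96 * sqrt(2), about 2.77.
FACTOR = 1.96 * math.sqrt(2)


def between_lab_variance(x: float) -> float:
    """Between-laboratory variance at concentration x (mg/m3), from the text."""
    return 0.001007 * x**2 - 0.0393 * x + 1.10


def reproducibility(x: float) -> float:
    """Reproducibility in mg/m3 under the assumed factor."""
    return FACTOR * math.sqrt(between_lab_variance(x))


# Concentration at which the quadratic variance is smallest.
x_min = 0.0393 / (2 * 0.001007)
print(f"minimum near {x_min:.1f} mg/m3")

for x in (0, 10, 20, 30, 40, 50, 60):
    print(f"{x:2d} mg/m3 -> {reproducibility(x):.1f} mg/m3")
```

Under these assumptions the minimum falls near 20 mg/m3 at about 2.3 mg/m3, and the value at 60 mg/m3 is about 4.3 mg/m3.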

                FIGURE 1. REPEATABILITY AND REPRODUCIBILITY
                          VERSUS CONCENTRATION
     [Plot of the reproducibility and repeatability curves against
     concentration, mg/m3, from 0 to 60.]

           Various statistical methods are available
for the comparison of means or the comparison of a
mean and a fixed value.(9-11) These methods are
straightforward and are applied independently of the
results of this study. That is, whether or not a mean is
significantly different from some fixed value is dependent
upon the actual standard deviation of the
sample  population.  The variance of the sample popu-
lation includes both the variance of the true  values
and  the variance due  to  the measurement method.
A limiting case is discussed in Appendix B under the
assumption  that all variation is  due to the measure-
ment method.  The  case is an extremely unlikely, if
not  impossible, situation; however,  a certain amount
of guidance can be  obtained in terms of the numbers
of observations required to provide a specified degree
of agreement.  These numbers are sufficient only to
compensate for the variation of the method. An addi-
tional quantity, dependent  on  the variation in the
true values, will always be  required. Interested readers
may  refer to Figures B-4  and B-5 and the respective
discussions in  Section III-A-3  of Appendix B  where
two illustrative examples are given.

      6.   Accuracy of the Method

           There is a  statistically significant bias in
the method based upon the results  of this collabora-
tive  test. The  practical significance must be  based
upon other criteria.

           There is an approximately linear relation-
ship with the tendency for results to be, on the average,
2.5 percent high. (See Figure B-6 in Appendix B for a
graphic  illustration.) Since the method uses the same
type materials for calibration as were used for reference
samples in this test,  there remains little doubt that the
inaccuracy results almost entirely from the use of cali-
bration  gases which exhibit significant variation with
respect to their specified content. Since results tend to
be high, the calibration gases must have a tendency to
be correspondingly low.
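
The reasoning above can be illustrated with a minimal sketch. It assumes a perfectly linear analyzer that is zeroed correctly and spanned on a single gas whose true content is lower than its label; the 46.4 mg/m3 span value (80 percent of a 58 mg/m3 full scale) and the function name are illustrative assumptions, not part of the study.

```python
def reading_with_mislabeled_span_gas(true_conc, labeled_span, actual_span):
    """Reading of a linear, correctly zeroed analyzer (illustrative assumption)
    whose span control was set so that a gas actually containing `actual_span`
    reads as `labeled_span`."""
    gain = labeled_span / actual_span
    return true_conc * gain


# Assumed span gas: labeled 46.4 mg/m3, actual content low by the factor 1.025.
labeled = 46.4
actual = labeled / 1.025

true_sample = 30.0  # mg/m3, an accurately known reference sample
reading = reading_with_mislabeled_span_gas(true_sample, labeled, actual)
bias_percent = 100.0 * (reading - true_sample) / true_sample

print(round(reading, 2), round(bias_percent, 1))  # 30.75 2.5
```

Under these assumptions a calibration gas that is low relative to its label produces readings that are high by the same factor at every concentration, which is consistent with the approximately linear bias described above.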

           It cannot be overemphasized that the ac-
curacy of the method is almost totally dependent upon
the  availability of sufficiently  accurate calibration
standards.

           Section  III-B of Appendix B contains fur-
ther  details regarding the accuracy of the method and
an examination of the quality of calibration gases.
     7.    Minimum Detectable Sensitivity

           The minimum detectable sensitivity is de-
fined as "the smallest amount of input concentration
that can be detected as the concentration approaches
zero" (see Addenda B of the method in Appendix A).
The best estimate for this parameter is that based on
two standard deviations (replication error); therefore,
the minimum detectable sensitivity may be taken to
be 0.3 mg/m3. Obviously, it is also affected by other
criteria such as chart range and dimensions, recorder
performance, and instrument response. These charac-
teristics varied widely in the collaborative test.
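
As a short worked check, the estimate can be reproduced from the replication error of 0.17 mg/m3 reported in Appendix B. The function name and the rounding to one decimal place are assumptions made for illustration.

```python
REPLICATION_SD = 0.17  # mg/m3, replication error reported in Appendix B


def minimum_detectable_sensitivity(replication_sd, multiplier=2.0):
    """Two standard deviations of the replication error, as described above."""
    return multiplier * replication_sd


mds = minimum_detectable_sensitivity(REPLICATION_SD)
print(round(mds, 1))  # 0.3
```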
           LIST OF REFERENCES

  1.  Youden, W.J., "The Collaborative Test," Journal
     of the AOAC, Vol 46, No. 1, pp 55-62
     (1963).

  2.  Handbook of  the  AOAC,  Second  Edition,
     October 1, 1966.

  3.  ASTM Manual for Conducting an Interlabora-
     tory  Study  of  a Test  Method, ASTM  STP
     No.  335, Am. Soc. Testing & Mats. (1963).

  4.  1971 Annual Book of ASTM Standards, Part 30,
     Recommended Practice for Developing Precision
     Data on ASTM Methods for Analysis and Testing
     of Industrial Chemicals, ASTM Designation:
     E180-67, pp 403-422.

 5.  Westberg,  Karl; Cohen, Norman; and Wilson,
     K.W.:  "Carbon Monoxide:  Its Role in Photo-
     chemical Smog  Formation," Science, Vol 171,
     No. 3975, pp 1013-1015 (March 12,  1971).

 6.  Mandel, John, The Statistical Analysis of Ex-
     perimental Data,  John  Wiley  &  Sons,  New
     York, Chapter 13, pp 312-362 (1964).

  7.  Mandel, J., "The Measuring Process,"
     Technometrics, 1, pp 251-267 (1959).

-------
8.   Mandel,  J., and  Lashof,  T.W.,  "The  Inter-
    laboratory  Evaluation  of  Testing  Methods,"
    ASTM Bulletin, No. 239, pp 53-61 (1959).


9.   Dixon, Wilfred J.,  and Massey, Frank J., Jr.,
    Introduction to Statistical  Analysis, McGraw-
    Hill  Book  Company,  Inc., New York, Chap-
    ter 9, pp 112-129 (1957).
10.   Duncan, Acheson J.,  Quality  Control  and
     Industrial Statistics, Third Edition, Richard D.
     Irwin, Inc., Homewood, Illinois, Chapters XXV
     and XXVI, pp 473-521 (1965).
11.   Bennett, Carl A., and Franklin, Norman L., Sta-
     tistical Analysis  in Chemistry and the Chemical
     Industry, John  Wiley and  Sons, New  York,
     Chapter 5, pp 149-164 (1954).

-------
                Errata
              Appendix A

  Reference Method for the Continuous
   Measurement of Carbon Monoxide in
     the Atmosphere (Non-Dispersive
         Infrared Spectrometry)

 Page A-1, Section 1.1, lines 4  and 5
delete "split into parallel beams and"

-------
                       APPENDIX A
REFERENCE METHOD FOR THE CONTINUOUS MEASUREMENT
       OF CARBON MONOXIDE IN THE ATMOSPHERE
       (NON-DISPERSIVE INFRARED SPECTROMETRY)
Reproduced from Appendix C, "National Primary and Secondary Ambient Air
Quality  Standards," Federal Register,  Vol  36,  No.  84, Part II,  Friday,
April 30, 1971

-------
                                                  RULES AND REGULATIONS
APPENDIX C—REFERENCE METHOD FOR THE
  CONTINUOUS MEASUREMENT OF CARBON
  MONOXIDE IN THE ATMOSPHERE
  (NON-DISPERSIVE INFRARED SPECTROMETRY)

  1. Principle and Applicability.
  1.1  This method is based on the absorption
of infrared radiation by carbon monoxide.
Energy from a source emitting radiation in
the infrared region is split into parallel
beams and directed through reference and
sample cells. Both beams pass into matched
cells, each containing a selective detector
and CO. The CO in the cells absorbs infrared
radiation only at its characteristic
frequencies and the detector is sensitive to
those frequencies. With a nonabsorbing gas in
the reference cell, and with no CO in the
sample cell, the signals from both detectors
are balanced electronically. Any CO
introduced into the sample cell will absorb
radiation, which reduces the temperature and
pressure in the detector cell and displaces a
diaphragm. This displacement is detected
electronically and amplified to provide an
output signal.
  1.2  This method is applicable to the
determination of carbon monoxide in ambient
air, and to the analysis of gases under
pressure.
  2. Range and Sensitivity.
  2.1  Instruments are available that measure
in the range of 0 to 58 mg./m.3 (0-50
p.p.m.), which is the range most commonly
used for urban atmospheric sampling. Most
instruments measure in additional ranges.
  2.2  Sensitivity is 1 percent of full-scale
response per 0.6 mg. CO/m.3 (0.5 p.p.m.).
  3.  Interferences.
  3.1  Interferences vary between individual
instruments. The effect of carbon dioxide
interference at normal concentrations is
minimal. The primary interference is water
vapor, and with no correction may give an
interference equivalent to as high as 12 mg.
CO/m.3 Water vapor interference can be
minimized by (a) passing the air sample
through silica gel or similar drying agents,
(b) maintaining constant humidity in the
sample and calibration gases by
refrigeration, (c) saturating the air sample
and calibration gases to maintain constant
humidity or (d) using narrowband optical
filters in combination with some of these
measures.
  3.2  Hydrocarbons at ambient levels do
not ordinarily interfere.
  4.  Precision, Accuracy, and Stability.
  4.1  Precision determined with calibration
gases is ±0.5 percent full scale in the 0-58
mg./m.3 range.
  4.2  Accuracy depends on instrument
linearity and the absolute concentrations
of the calibration gases. An accuracy of ±1
percent of full scale in the 0-58 mg./m.3
range can be obtained.
  4.3  Variations in ambient room temperature
can cause changes equivalent to as much as
0.5 mg. CO/m.3 per °C. This effect can be
minimized by operating the analyzer in a
temperature-controlled room. Pressure
changes between span checks will cause
changes in instrument response. Zero drift
is usually less than ±1 percent of full scale
per 24 hours, if cell temperature and
pressure are maintained constant.
  5. Apparatus.
  5.1  Carbon Monoxide Analyzer. Commercially
available instruments should be installed on
location and demonstrated, preferably by the
manufacturer, to meet or exceed
manufacturers specifications and those
described in this method.
  5.2  Sample Introduction System. Pump,
flow control valve, and flowmeter.
  5.3  Filter (In-line). A filter with a
porosity of 2 to 10 microns should be used to
keep large particles from the sample cell.
  5.4  Moisture Control. Refrigeration units
are available with some commercial
instruments for maintaining constant
humidity. Drying tubes (with sufficient
capacity to operate for 72 hours) containing
indicating silica gel can be used. Other
techniques that prevent the interference of
moisture are satisfactory.
  6. Reagents.
  6.1  Zero Gas. Nitrogen or helium containing
less than 0.1 mg. CO/m.3
  6.2  Calibration Gases. Calibration gases
corresponding to 10, 20, 40, and 80 percent
of full scale are used. Gases must be
provided with certification or guaranteed
analysis of carbon monoxide content.
  6.3  Span Gas. The calibration gas
corresponding to 80 percent of full scale is
used to span the instrument.
  7. Procedure.
  7.1  Calibrate the instrument as described
in 8.1. All gases (sample, zero, calibration,
and span) must be introduced into the entire
analyzer system. Figure C1 shows a typical
flow diagram. For specific operating
instructions, refer to the manufacturer's
manual.
                                     FEDERAL REGISTER, VOL. 36, NO.  84—FRIDAY, APRIL 30, 1971
                                                               A-1

-------
  8.  Calibration.
  8.1  Calibration  Curve.  Determine  the
linearity of the  detector response at the
operating flow rate and  temperature.  Pre-
pare a calibration curve and check the curve
furnished with the instrument.  Introduce
zero gas and set the zero  control to indicate
a recorder reading of zero. Introduce span
gas and adjust the span control to indicate
the proper value on the recorder  scale  (e.g.
on  0-58  mg./m.3 scale, set the 46 mg./m.3
standard  at  80  percent  of the recorder
chart). Recheck zero and span until adjust-
ments are  no longer necessary.  Introduce
intermediate calibration gases and plot the
values obtained.  If a smooth curve  is not
obtained,   calibration   gases  may   need
replacement.
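
The arithmetic behind the calibration gas levels and the span setting can be sketched as follows. This is only an illustration for the 0-58 mg./m.3 range; the function names are hypothetical and the method itself prescribes no computation.

```python
def nominal_calibration_gases(full_scale, percents=(10, 20, 40, 80)):
    """Concentrations corresponding to the stated percentages of full scale."""
    return [full_scale * p / 100.0 for p in percents]


def chart_percent(concentration, full_scale):
    """Position on the recorder chart, in percent of full scale,
    assuming a linear 0-to-full-scale chart (illustrative assumption)."""
    return 100.0 * concentration / full_scale


FULL_SCALE = 58.0  # mg./m.3

gases = nominal_calibration_gases(FULL_SCALE)
print([round(g, 1) for g in gases])                 # [5.8, 11.6, 23.2, 46.4]
print(round(chart_percent(46.0, FULL_SCALE), 1))    # 79.3
```

The 46 mg./m.3 standard in the example of 8.1 therefore falls at about 79.3 percent of the chart, which the method rounds to 80 percent.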
  9. Calculations.
  9.1  Determine the concentrations directly
from the calibration curve. No  calculations
are necessary.
  9.2  Carbon  monoxide  concentrations  in
mg./m.3 are converted to p.p.m. as follows:
       p.p.m. CO = mg. CO/m.3 × 0.873
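
A minimal sketch of the conversion in 9.2, using the factor 0.873 given above. The function names are hypothetical, and the inverse conversion is simply the algebraic rearrangement of the same relation.

```python
MG_M3_TO_PPM = 0.873  # conversion factor stated in 9.2


def mg_m3_to_ppm(mg_per_m3):
    """Convert a CO concentration from mg./m.3 to p.p.m."""
    return mg_per_m3 * MG_M3_TO_PPM


def ppm_to_mg_m3(ppm):
    """Inverse of mg_m3_to_ppm (algebraic rearrangement)."""
    return ppm / MG_M3_TO_PPM


print(round(mg_m3_to_ppm(58.0), 1))  # 50.6
print(round(mg_m3_to_ppm(0.6), 1))   # 0.5
print(round(ppm_to_mg_m3(50.0), 1))  # 57.3
```

The first two results agree with the rounded pairs quoted in the method, 58 mg./m.3 (50 p.p.m.) and 0.6 mg./m.3 (0.5 p.p.m.).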

  10. Bibliography.
  The Intech NDIR-CO Analyzer by Frank
McElroy. Presented at the 11th Methods
Conference in Air Pollution, University of
California, Berkeley, Calif., April 1, 1970.
  Jacobs, K. B. et al., J.A.P.C.A. 9, No. 2,
110-114, August 1959.
  MSA LIRA Infrared Gas and Liquid Analyzer
Instruction Book, Mine Safety Appliances Co.,
Pittsburgh, Pa.
  Beckman Instruction 1635B, Models 215A,
315A and 415A Infrared Analyzers, Beckman
Instrument Company, Fullerton, Calif.
  Continuous CO Monitoring System, Model
A 5611, Intertech Corp., Princeton, N.J.
  Bendix—UNOR Infrared Gas Analyzers,
Ronceverte, W. Va.
  A. Suggested Performance Specifications
 for NDIR Carbon Monoxide Analyzers:

 Range (minimum) .....................  0-58 mg./m.3 (0-50 p.p.m.).
 Output (minimum) ....................  0-10, 100, 1,000, 5,000 mv. full scale.
 Minimum detectable sensitivity ......  0.6 mg./m.3 (0.5 p.p.m.).
 Lag time (maximum) ..................  15 seconds.
 Time to 90 percent response
   (maximum) .........................  30 seconds.
 Rise time, 90 percent (maximum) .....  15 seconds.
 Fall time, 90 percent (maximum) .....  15 seconds.
 Zero drift (maximum) ................  3 percent/week, not to exceed
                                        1 percent/24 hours.
 Span drift (maximum) ................  3 percent/week, not to exceed
                                        1 percent/24 hours.
 Precision (minimum) .................  ±0.5 percent.
 Operational period (minimum) ........  3 days.
 Noise (maximum) .....................  ±0.5 percent.
 Interference equivalent (maximum) ...  1 percent of full scale.
 Operating temperature range
   (minimum) .........................  5-40° C.
 Operating humidity range (minimum) ..  10-100 percent.
 Linearity (maximum deviation) .......  1 percent of full scale.

  B. Suggested Definitions of Performance
Specifications:

Range—The minimum and maximum measurement
  limits.
Output—Electrical signal which is proportional
  to the measurement; intended for connection
  to readout or data processing devices.
  Usually expressed as millivolts or milliamps
  full scale at a given impedance.
Full Scale—The maximum measuring limit for a
  given range.
Minimum Detectable Sensitivity—The smallest
  amount of input concentration that can be
  detected as the concentration approaches
  zero.
Accuracy—The degree of agreement between a
  measured value and the true value; usually
  expressed as ± percent of full scale.
Lag Time—The time interval from a step change
  in input concentration at the instrument
  inlet to the first corresponding change in
  the instrument output.
Time to 90 percent Response—The time interval
  from a step change in the input
  concentration at the instrument inlet to a
  reading of 90 percent of the ultimate
  recorded concentration.
Rise Time (90 percent)—The interval between
  initial response time and time to 90 percent
  response after a step increase in the inlet
  concentration.
Fall Time (90 percent)—The interval between
  initial response time and time to 90 percent
  response after a step decrease in the inlet
  concentration.
Zero Drift—The change in instrument output
  over a stated time period, usually 24 hours,
  of unadjusted continuous operation, when the
  input concentration is zero; usually
  expressed as percent full scale.
Span Drift—The change in instrument output
  over a stated time period, usually 24 hours,
  of unadjusted continuous operation, when the
  input concentration is a stated upscale
  value; usually expressed as percent full
  scale.
Precision—The degree of agreement between
  repeated measurements of the same
  concentration, expressed as the average
  deviation of the single results from the
  mean.
Operational Period—The period of time over
  which the instrument can be expected to
  operate unattended, within specifications.
Noise—Spontaneous deviations from a mean
  output not caused by input concentration
  changes.
Interference—An undesired positive or negative
  output caused by a substance other than the
  one being measured.
Interference Equivalent—The portion of
  indicated input concentration due to the
  presence of an interferent.
Operating Temperature Range—The range of
  ambient temperatures over which the
  instrument will meet all performance
  specifications.
Operating Humidity Range—The range of ambient
  relative humidity over which the instrument
  will meet all performance specifications.
Linearity—The maximum deviation between an
  actual instrument reading and the reading
  predicted by a straight line drawn between
  upper and lower calibration points.

[Flow diagram. Labeled elements: sample introduction; span and calibration gas; valve; analyzer system (I.R. analyzer); vent.]
                                                                        Figure C1. Carbon monoxide analyzer flow diagram.
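
Two of the definitions in Addenda B translate directly into short computations. The sketch below is one reading of the wording, not an official procedure: precision is taken as the mean absolute deviation of repeated readings from their mean, and linearity as the largest deviation from the straight line through the lowest and highest calibration points. Function names and the sample readings are hypothetical.

```python
def average_deviation(readings):
    """Precision as defined above: average deviation of the single
    results from the mean (mean absolute deviation)."""
    mean = sum(readings) / len(readings)
    return sum(abs(r - mean) for r in readings) / len(readings)


def linearity(points):
    """Maximum deviation between actual readings and the straight line
    drawn between the lowest and highest calibration points.
    `points` is a list of (concentration, reading) pairs."""
    pts = sorted(points)
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    slope = (y1 - y0) / (x1 - x0)
    return max(abs(y - (y0 + slope * (x - x0))) for x, y in pts)


# Hypothetical readings, for illustration only.
print(round(average_deviation([10.0, 10.2, 9.8]), 3))                    # 0.133
print(round(linearity([(0.0, 0.0), (10.0, 10.5), (50.0, 50.0)]), 3))     # 0.5
```

To compare with the suggested specifications, both results would still have to be expressed as a percentage of full scale.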
                                                               A-2

-------
          APPENDIX B
STATISTICAL DESIGN AND ANALYSIS

-------
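                                  TABLE OF CONTENTS

                                                                                     Page

I.    INTRODUCTION ............................................................  B-1

     A.   Purpose and Scope of the Experiment .................................  B-1
     B.   Design of the Experiment ............................................  B-1
     C.   Presentation of the Data ............................................  B-2

II.   STATISTICAL ANALYSIS ....................................................  B-3

     A.   Replication Error ...................................................  B-3
     B.   Humidity Effects ....................................................  B-5
     C.   Linear Model Analysis ...............................................  B-6

III.  INTERPRETATION OF THE PARAMETERS ........................................  B-12

     A.   Precision of the Method .............................................  B-12
     B.   Accuracy of the Method ..............................................  B-15

LIST OF REFERENCES ............................................................  B-20
                                            B-i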

-------
                                   LIST OF ILLUSTRATIONS

Figure                                                                                        Page

B-1           Design of Carbon Monoxide Method Experiment                                     B-2

B-2           Control Charts for Means, Slopes, and Standard Errors of Estimate for           B-9

B-3           Repeatability and Reproducibility Versus Concentration                          B-12

B-4           Expected Agreement Between Two Means Versus Concentration for Various           B-16

B-5           Expected Agreement Between a Mean and a Fixed Value Versus Concentration
              for Various Numbers of Observations (95 Percent Level of Significance)         B-17

B-6           Bias or Systematic Error Versus Concentration                                   B-18

                                        LIST OF TABLES

Table                                                                                         Page

B-I           Means and Standard Deviations for Each Cell of Data from the Collaborative Test     B-4

B-II          Differences Between Humidified and Dry Test Results                                 B-5

B-III         Test of Hypothesis That the Mean Difference Between Humidified and Dry Samples
              Is Equal to Zero                                                                    B-6

B-IV          Means and Standard Deviations for Each Laboratory for Each Sample                   B-8

B-V           Means, Slopes, and Standard Errors of Estimate for Linear Model Analysis            B-9

B-VI          Analysis of Variance for Linear Model                                               B-10

B-VII         Summary of Results for Variance Components and Derived Quantities for Linear
              Model Analysis                                                                      B-11

B-VIII        Sources of Variability and Their Relative Importance for the Linear Model Analysis  B-11

B-IX          Standard Errors of Estimate for Calibration Curves Prepared from Calibration Gases
              and from Reference Gases                                                            B-19
                                                B-ii

-------
                                            APPENDIX B
                             STATISTICAL DESIGN AND ANALYSIS
              I. INTRODUCTION
      In the  application  of interlaboratory testing
techniques, the first step is to  determine the exact
purpose of the program. There are many, and the
particular one must be established. All subsequent de-
tails of the  program must  be planned keeping the
prime  objective in mind. This appendix describes the
design and analysis of the formal collaborative test of
the Reference  Method for the Continuous Measure-
ment of Carbon Monoxide in the Atmosphere (Non-
Dispersive Infrared Spectrometry).

A.   Purpose and Scope  of the Experiment

      The basic objective of the interlaboratory study
is to derive precise  and usable information about the
variability  of results produced by  the measurement
method. This information is  necessary to establish the
reliability of the method in  terms of its precision and
its accuracy. More emphasis  was placed on the in-
herent  quality of the method  when properly  used
than upon the performance of the laboratories.

      The statistical planning of the program, which
necessarily must be limited  in scope, depends upon
what  information is desired. The scope is limited by
what  a collaborating laboratory can conveniently and
economically accomplish, as well as by the number of
collaborators that can be accommodated. Under these
limitations, it was possible to examine the effects of
laboratories,  concentrations,  days,  and replication
upon  the precision of the method, in addition to esti-
mating the effects of humidity upon the analysis.  The
experiment was designed so  that the analysis of vari-
ance technique could be used.

     A total of 16 laboratories took part in the pro-
gram. An analyst representing each laboratory went
through the  familiarization  phase and subsequently
conducted the formal collaborative testing. These in-
dividuals and  their affiliations have been identified
elsewhere in the main report. These laboratories con-
stitute a random sample  from a rather large popu-
lation of experienced laboratories.

      Three different concentrations were  analyzed
by each laboratory. The concentrations were
nominally 8, 30, and 53 mg/m3. These concentrations
were selected to approximate the low range, the inter-
mediate  range, and the high range  for the method.
Due to variations among the test gas cylinders, it was
not  possible for each laboratory to have test atmo-
spheres  having  the exact values above; however, the
expected concentrations are known  with confidence,
and  the  deviations  of the observed values from the
expected values may be examined.

      In  addition to the analysis of the dry gases,
each of the three concentrations was analyzed  after
humidification according to the technique illustrated
in the main report. This information was for the pur-
pose of testing the effectiveness of the  various mois-
ture compensation options used in the method.

     It was desirable to retrieve information which
would allow the investigation of various steps within
the  method; therefore, emphasis was  placed  upon
obtaining intermediate  data  relating to  calibration
curves, moisture compensation methods, instrument
ranges, instrument models, and sources  of calibration
gas.  As  a result,  a  substantial amount of data was
obtained in addition to the end result of the analyti-
cal procedure.

B.   Design of the Experiment

     A  properly planned collaborative test should
allow the analysis of the  results by the analysis  of
variance technique or by a procedure which
incorporates this technique.(1-5) In general, analysis of
variance techniques are more efficient than the simpler
control chart techniques. Since the cost of statistical
analysis  is small compared to the total  cost involved
in a  collaborative test, it is desirable to  use the most
                                                  B-1

-------
efficient statistical methods available in analyzing the
results. High efficiency in data utilization is impor-
tant if the amount of data is limited.

      The form of the analysis depends upon the sta-
tistical model under  consideration.  Several separate
statistical  analyses were performed in order to deter-
mine the necessary parameters. Each of these analyses
will be described in detail in later subsections.

      The overall design of the experiment can best
be  shown by the diagram in Figure B-1. It can be seen
that one analyst in each of 15 laboratories analyzed,
in triplicate, each of three concentrations, both dry
and humidified, on each of three separate days, re-
sulting in a  total of 810 individual determinations.
Independent calibration curves were used  on each
day. The data are presented appropriately in the next
subsection.  In  collaborative  testing,  two  general
sources of variability can be readily detected. First,
the variability between laboratories can be estimated.
This is frequently the largest source of variability and
is not under the  control of the investigator. Second,
the within-laboratory variability can be estimated.
This source is under the control of the investigator to
the extent that the separate components which make
up this source may  be  identified separately. These
separate  components,  of varying magnitude and im-
portance, may be measured if the proper design has
been  employed.  Alternatively, the separate  sources
              may be confounded or lumped into a single variable
              by  altering the  design.  By employing the  design
              above, separate estimates could be made of the varia-
              bility between days and of the variability between
              replicates.  These  two components,  appropriately
              combined,  constitute the within-laboratory source of
              variability.
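
The size of the design described above can be verified by enumerating its factors. This is a bookkeeping sketch only; the factor labels are illustrative and the variable names are hypothetical.

```python
from itertools import product

laboratories = range(1, 16)                  # 15 laboratories with usable data
days = (1, 2, 3)
concentrations = ("low", "intermediate", "high")
humidities = ("dry", "humidified")
replicates = (1, 2, 3)

# One cell = one laboratory, day, concentration, and humidity combination.
cells = list(product(laboratories, days, concentrations, humidities))
determinations = len(cells) * len(replicates)

# Each cell of three replicates carries 3 - 1 = 2 degrees of freedom
# for estimating the variability between replicates.
df_per_cell = len(replicates) - 1

print(len(cells), determinations, df_per_cell)  # 270 810 2
```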

                   Additional assumptions  and rationale for  each
              of the  analyses listed previously will be stated later as
              the analysis is described and applied. If appropriate,
              the statistical  model will be stated in the respective
              discussion.

              C.   Presentation of the Data

                   The  data  resulting  from  the experiment are
              rather  voluminous; however, it is essential that these
              data be tabulated for future reference. In  addition to
              their necessity as supporting information for the prob-
              lem at hand, the data are also valuable academically as a
              source of data for  the development, evaluation, and
              comparison of new  statistical techniques. Therefore,
              the more voluminous raw data will be found in Appen-
              dix C.  Data subsets and averages will be presented in
              this appendix, as appropriate, along with the discussion
              of the respective statistical analysis.

                   In presenting the data, all identifiable arithmet-
              ic errors have been corrected. The data of Laboratory

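[Tree diagram: laboratories L1, L2, ... L15; within each laboratory, days D1, D2, D3; within each day, concentrations 1, 2, 3; within each concentration, humidities H1, H2; within each humidity, replicates R1, R2, R3.]
            FIGURE B-1.  DESIGN OF CARBON MONOXIDE METHOD EXPERIMENT. L, LABORATORIES;
                     D, DAYS; C, CONCENTRATIONS; H, HUMIDITIES; AND R, REPLICATES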
                                                   B-2

-------
780 have been recalculated omitting an extremely sus-
picious calibration gas. The raw data from 15 of the 16
collaborating laboratories can be seen in Table C-I. One
laboratory failed to report complete and usable results.

      It is not convenient to have a separate subsec-
tion concerning  the tests for and disposition of out-
lying observations. Since there were several statistical
analyses and  several types of outliers, the outlying
observations, if any, will be identified in the respec-
tive analysis.

         II. STATISTICAL ANALYSIS
provides its  own estimate of the replication error
with  two  degrees of  freedom. The desired repli-
cation error  is the combined  estimate of these in-
dividual  estimates, but the question  is whether and
how these  should be combined. Three factors must
be  investigated in order  to  answer  this question.
First, outlying observations must be identified and
dealt  with. Second,  it  must be  determined if  the
replication  error  is  a  function  of  concentration.
Third,  it  must  be determined if  the  replication
error  is  affected  by  humidity. The techniques  for
each  of  these  analyses are discussed in the follow-
ing paragraphs.
      Because this study incorporates several statisti-
cal techniques, each with its own common notation,
certain complications involving consistency of mathe-
matical notation arise. To minimize the confusion, a
foldout notation guide is provided at the end of this
report. The symbols have been categorized according
to  their  respective use; however,  duplicate entries
were avoided where there was no conflict or incon-
sistency. This guide will materially assist the reader
throughout this report and may be left folded out for
ready reference at any time.

A.   Replication Error

      To  avoid ambiguity, the term replication  must
be  explicitly  defined. In the context of this study,
replicates are  defined to be successive determinations
with the  same operator and instrument on the  same
sample within intervals  short enough to avoid change
of  environmental  factors,  and with no intervening
manipulations other than zero adjustment. In general,
this will mean that time intervals between successive
replicates will be  on the  order  of a few to several
minutes. Defined in  this way, the replication  error
will primarily reflect the effects of instrument charac-
teristics such  as sensitivity, response time, and  read-
out  noise.  The effects  of changing  environmental
factors will not be included.

     The determination of the  replication  error is
straightforward from the data in  Table C-l. Each cell
      The means and standard deviations for each cell
are shown in Table B-I. The  data from laboratories
not running  replicates  consistent with the previous
definition are marked with an asterisk and cannot be
used in the estimation  of the replication error. The
data from laboratories  not reporting results to the
nearest tenth of a milligram per cubic meter were also
omitted and  are marked with a dagger. It would not
be consistent  to estimate standard deviations of the
magnitudes  involved from  data  rounded  to the
nearest unit. Inspection of the remaining standard devi-
ations  in  Table B-I reveal several suspicious results.
These remaining data were tested for outliers by Coch-
ran's test*-6) applied to each column of standard devia-
tions in Table B-I. The observations thus identified as
outliers (99 percent level of significance) have been
marked with a double dagger in the table. The values at
the foot of each column of standard deviations indicate
the magnitude of the pooled estimate and its degrees of
freedom for  the respective column.  The pooled esti-
mates were computed according to usual practice/7)
These results indicate that the replication error is not
affected by concentration or by humidity within the
limits included in the test. Statistical tests and regres-
sion analysis, although  hardly necessary, verify this
conclusion. Therefore, all individual estimates may be
pooled into the final estimate for the replication error
σ_e, which is 0.17 mg/m3 (286 degrees of freedom).
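
The pooling step can be illustrated with a short sketch. It assumes that "usual practice" means weighting each variance by its degrees of freedom, and it uses the column-foot values of Table B-I as printed; the function name `pooled_std` is illustrative only.

```python
import math

# (pooled standard deviation, degrees of freedom) printed at the foot of each
# column of Table B-I: low, intermediate, high concentration; dry then humid.
column_estimates = [
    (0.13, 26), (0.20, 24),
    (0.19, 23), (0.15, 25),
    (0.17, 22), (0.20, 23),
]

def pooled_std(estimates):
    """Pool standard deviations, weighting each variance by its degrees of freedom."""
    weighted_ss = sum(df * s ** 2 for s, df in estimates)
    total_df = sum(df for _, df in estimates)
    return math.sqrt(weighted_ss / total_df)

sigma_e = pooled_std(column_estimates)
print(f"pooled replication error = {sigma_e:.2f} mg/m3")
```

Under these assumptions the six column estimates pool to about 0.17 mg/m3, in agreement with the value quoted above.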

     Beyond this point, the  individual values within
each cell are no longer required and all further analyses
of the data are made using only the cell averages.

-------
TABLE B-I. MEANS AND STANDARD DEVIATIONS FOR EACH CELL OF DATA FROM THE
      COLLABORATIVE TEST. First figure in each cell is the mean and the second figure
                            is the standard deviation.

Laboratory           Low Concentration      Intermediate Concentration     High Concentration
Code Number Day    Dry          Humid         Dry          Humid          Dry          Humid
220         1     8.4 0.00     8.6 0.00     30.5 0.06    30.4 0.00     53.2 0.23    53.4 0.06
            2     8.5 0.12     8.6 0.06     30.3 0.21    30.2 0.15     52.3 0.35    52.3 0.35
            3     8.6 0.00     8.7 0.06†    30.5 0.00    30.5 0.00     53.3 0.00    52.8 0.12
222         1     6.9 0.00†    7.0 0.64*    28.6 0.00†   28.6 0.45*    52.7 0.00†   52.7 0.00*
            2     6.9 0.00*    7.4 0.00*    29.0 0.35*   29.8 0.60*    53.1 0.35*   54.0 0.87*
            3     6.9 0.00†    7.2 0.29†    29.0 0.69†   28.1 1.53†    53.5 0.86†   53.3 0.00†
253         1     8.0 0.00†    8.0 0.00†    31.1 0.35†   30.9 0.00†    53.8 0.00†   53.8 0.00†
            2     9.2 0.00†    9.0 0.35†    32.1 0.00†   32.1 0.00†    53.8 0.00†   54.8 0.35†
            3     8.0 0.00†    8.0 0.00†    30.9 0.00†   30.9 0.00†    53.8 0.00†   54.2 0.35†
270         1     7.3 0.64*    7.7 1.33*    30.5 0.64*   31.7 1.33*    56.5 2.66*   56.1 1.96*
            2     8.0 1.15*    8.0 0.00*    31.3 0.69*   32.1 0.00*    55.4 0.64*   55.4 0.64*
            3     8.8 0.69*    8.4 0.69*    31.7 0.69*   31.7 0.69*    55.0 0.00*   55.0 0.00*
310         1     9.0 0.35‡    8.3 0.31     31.3 0.17    30.3 0.23     55.6 0.00    53.9 0.17
            2     9.1 0.17     8.8 0.12     31.2 0.30    30.6 0.25     55.3 0.31    54.5 0.12
            3     9.1 0.31     8.8 0.00     31.3 0.17    31.4 0.17     55.9 0.17    55.3 0.23
311         1     9.1 0.21*    9.5 0.15*    31.4 0.10*   31.2 0.06*    53.4 0.06*   52.8 0.10*
            2     8.6 0.06*    9.3 0.00*    31.5 0.06*   31.2 0.00*    54.3 0.10*   53.4 0.06*
            3     9.5 0.12*    9.7 0.25*    31.0 0.06*   30.9 0.10*    53.5 0.17*   52.9 0.06*
370         1     8.2 0.00     8.3 0.23     30.6 0.00    30.6 0.06     54.4 0.00    54.4 0.00
            2     8.6 0.00     8.6 0.00     30.6 0.00    30.7 0.15     54.2 0.00    54.2 0.06
            3     8.0 0.00     8.1 0.12     30.0 0.35    29.9 0.17     53.4 0.12    53.4 0.12
375         1     7.3 0.00     7.2 0.06     30.3 0.12    29.6 0.23     54.1 0.00    53.4 0.59†
            2     7.2 0.06     7.2 0.06     29.6 0.00    29.6 0.06     53.4 0.20    53.3 0.12
            3     7.7 0.10     7.7 0.06     30.8 0.17    30.4 0.17     54.2 0.06    53.2 0.20
540         1     8.6 0.12     8.6 0.06     30.3 0.29    30.0 0.06     53.1 0.17    53.3 0.00
            2     8.9 0.12     8.8 0.20     30.5 0.15    30.5 0.15     52.8 0.42‡   52.8 0.10
            3     8.9 0.16     8.6 0.06     30.5 0.06    30.5 0.06     52.9 0.06    53.0 0.21
571         1     8.2 0.00     7.9 0.23     29.4 0.32    29.2 0.06     49.7 0.12    49.3 0.38
            2     8.6 0.06     8.6 0.06     30.2 0.35    29.9 0.17     50.8 0.17    50.4 0.30
            3     8.9 0.06     8.7 0.40     30.6 0.17    30.3 0.17     51.7 0.35    50.9 0.23
780         1     8.1 0.10    12.0 0.10     30.7 0.21    33.4 0.35     52.6 0.26    54.3 0.00
            2     8.8 0.15    12.3 0.25     30.5 0.15    33.6 0.17     52.7 0.15    54.5 0.00
            3     8.2 0.25    11.9 0.12     30.5 0.23    32.8 0.17     53.2 0.15    54.3 0.31
799         1     7.9 0.29     8.2 0.00     29.9 0.00    29.8 0.85‡    53.2 0.23    52.4 0.31
            2     8.3 0.31     8.0 0.00     29.5 0.31    29.2 0.60‡    54.3 0.12    50.9 0.17
            3     8.0 0.00     9.0 0.64†    31.1 0.17    31.0 0.17     54.7 0.23    54.4 0.17
860         1     7.4 0.00†    7.4 0.00†    30.4 0.00†   30.4 0.00†    52.7 0.00†   52.7 0.00†
            2     7.4 0.00†    7.4 0.00†    30.4 0.00†   30.4 0.00†    52.7 0.00†   52.7 0.00†
            3     7.4 0.00†    7.4 0.00†    30.4 0.00†   30.4 0.00†    52.7 0.00†   52.7 0.00†
920         1     8.0 0.00†    8.0 0.00†    30.2 0.64†   30.5 0.64†    53.8 0.00†   53.8 0.00†
            2     8.0 0.00†    8.0 0.00†    32.1 0.00†   31.7 0.69†    55.0 0.00†   54.6 0.69†
            3     8.0 0.00†    8.0 0.00†    32.1 0.00†   32.1 0.00†    55.0 0.00†   53.8 0.00†
927         1     8.9 0.00     8.9 0.00     32.1 0.00†   32.1 0.00†    55.0 0.00†   55.0 0.00†
            2     8.9 0.00     8.9 0.00     32.1 0.00†   32.1 0.00†    55.0 0.00†   55.0 0.00†
            3     8.9 0.00     8.9 0.00     32.1 0.00†   32.1 0.00†    55.0 0.00†   55.0 0.00†
Mean                  0.13         0.20          0.19         0.15          0.17         0.20
D.F.                  26           24            23           25            22           23

*Not consecutive replicates.
†Results not reported to nearest 0.1 mg/m3.
‡Outlying observations.

-------
B.    Humidity Effects

      Since humidity is a known interference, the ex-
periment was designed to measure the effectiveness of
the various methods of humidity compensation listed
in the  method (see Section 3.1  of the method in
Appendix A). The design provides two levels for
this factor, one dry and the other essentially
saturated. The technique for this humidification step
has been discussed in the main report.

      Three of  the  four options for humidity com-
pensation listed in the method were used in the col-
laborative test. No potential collaborator reported the
use of option (c) which is "saturating the air sample
and calibration gases to maintain constant humidity."
Therefore,  this  option could  not  be  included. Five
laboratories used option (a) using drying agents, and
six  laboratories used  option  (b) using refrigeration.
Four  laboratories  used option (d) using narrow-band
optical filters. Two of these laboratories used optical
filters alone, and two used optical filters in combina-
tion with other methods. These  laboratories  will be
identified subsequently when  the data are presented.

      The effects of humidity can best be determined
by pairing  the data within days and within concentra-
tions since they are not independent pairs. Statistical
techniques to analyze these differences test whether
the  mean  difference is  significantly different from
zero.(8)  In addition to analysis of all differences to-
gether,  the  three  subclasses  of humidity compensa-
tion methods may be analyzed separately to determine
whether there are differences in the effectiveness of
the respective humidity compensation methods.

      The  data  are  shown in  Table B-II where  the
entries are the differences in the means of the three
replicates for each humidity level. Each entry is iden-
tified according  to its respective humidity compensa-
tion method  as shown  by  the symbols  and their
respective footnotes. These data can be shown to be
not  normally distributed—either  overall  or  within
concentrations.  The  data from  two laboratories, 780
and  799,  make the  major contribution  to non-
normality. These two laboratories used optical filters
and  it  is  obvious   that they   are  not  completely
effective; however, the one large negative  departure
for Laboratory 799 is probably an outlier. The data
for laboratories  using optical filters are  not further
analyzed due to the small number  of laboratories
using this method; however, it appears that  the use of
optical  filters in combination   with  other methods
gives satisfactory results.

    TABLE B-II.  DIFFERENCES BETWEEN HUMIDIFIED AND DRY TEST RESULTS. The figures for each day for each
        concentration for each laboratory are the result of subtracting the dry result from the humidified result, each
                        the average of triplicates. Differences are in milligrams per cubic meter.

Laboratory           Low                  Intermediate               High
Code Number      Concentration           Concentration           Concentration
220 b           0.2   0.1   0.1        -0.1  -0.1   0.0         0.2   0.0  -0.5
222 b           0.1   0.5   0.3         0.0   0.8  -0.9         0.0   0.9  -0.2
253 a           0.0  -0.2   0.0        -0.2   0.0   0.0         0.0   1.0   0.4
270 b           0.4   0.0  -0.4         1.2   0.8   0.0        -0.4   0.0   0.0
310 b          -0.7  -0.3  -0.3        -1.0  -0.6   0.1        -1.7  -0.8  -0.6
311 d           0.4   0.7   0.2        -0.2  -0.3  -0.1        -0.6  -0.9  -0.6
370 b           0.1   0.0   0.1         0.0   0.1  -0.1         0.0   0.0   0.0
375 a,d        -0.1   0.0   0.0        -0.7   0.0  -0.4        -0.7  -0.1  -1.0
540 b           0.0  -0.1  -0.3        -0.3   0.0   0.0         0.2   0.0   0.1
571 a          -0.3   0.0  -0.2        -0.2  -0.3  -0.3        -0.4  -0.4  -0.8
780 d           3.9   3.5   3.7         2.7   3.1   2.3         1.7   1.8   1.1
799 a,d         0.3  -0.3   1.0        -0.1  -0.3  -0.1        -0.8  -3.4  -0.3
860 a           0.0   0.0   0.0         0.0   0.0   0.0         0.0   0.0   0.0
920 a           0.0   0.0   0.0         0.3  -0.4   0.0         0.0  -0.4  -1.2
927 a           0.0   0.0   0.0         0.0   0.0   0.0         0.0   0.0   0.0

a  Passing the air sample through silica gel or similar drying agent.
b  Maintaining constant humidity in the sample and calibration gases by refrigeration.
d  Using narrow-band optical filters in combination with other measures.

     The data for options (a) and (b), which are normally
distributed, are analyzed separately and the
results are summarized in Table B-III. No values of
the t-statistics(8) are significant at the 95 percent
level of significance; therefore, the hypothesis of
mean differences equal to zero is accepted. Both
methods of moisture compensation appear to be
equally satisfactory in comparison with the precision
capabilities of the method.

                      TABLE B-III. TEST OF HYPOTHESIS THAT THE MEAN DIFFERENCE
                        BETWEEN HUMIDIFIED AND DRY SAMPLES IS EQUAL TO ZERO

                                   Drying Agents                      Refrigeration
                                   Concentration*                     Concentration*
Statistic                     A       B       C      All         A       B       C      All
Number of Observations       15      15      15      45         18      18      18      54
Mean Difference           -0.05   -0.07   -0.12   -0.08      -0.01   -0.01   -0.16   -0.06
Standard Deviation         0.10    0.18    0.50    0.31       0.30    0.54    0.53    0.47
t-value                   -1.82   -1.62   -0.93   -1.75      -0.16   -0.04   -1.24   -0.90
Degrees of Freedom           14      14      14      44         17      17      17      53

*A is low concentration, B is intermediate concentration, C is high concentration, and All is all concentrations combined.
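
As an illustration of the paired test, the sketch below recomputes the low-concentration entry for the drying-agent laboratories from the differences listed in Table B-II. The function name `paired_t` is illustrative; the calculation is the ordinary one-sample t-test on the differences.

```python
import math
from statistics import mean, stdev

# Humidified-minus-dry differences (mg/m3) at the low concentration for the
# five laboratories using drying agents only, as listed in Table B-II.
differences = [
    0.0, -0.2, 0.0,    # Laboratory 253
    -0.3, 0.0, -0.2,   # Laboratory 571
    0.0, 0.0, 0.0,     # Laboratory 860
    0.0, 0.0, 0.0,     # Laboratory 920
    0.0, 0.0, 0.0,     # Laboratory 927
]

def paired_t(diffs):
    """Return (mean, std. dev., t, degrees of freedom) for H0: mean difference = 0."""
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                    # sample standard deviation, n - 1 divisor
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1

d_bar, s_d, t_value, dof = paired_t(differences)
print(f"mean = {d_bar:.2f}, sd = {s_d:.2f}, t = {t_value:.2f}, df = {dof}")
```

The result agrees with the tabulated entry for drying agents at the low concentration (mean -0.05, standard deviation 0.10, t = -1.82, 14 degrees of freedom). The magnitude of t is below the two-sided 95 percent critical value of about 2.14 for 14 degrees of freedom.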


      An  analysis  of variance of the differences in
Table B-II (omitting Laboratories 780 and 799) indi-
cates a significant variation between laboratories with
respect to the variation between days. Both the varia-
tion between laboratories and the variation between
days appear to be  dependent upon concentration;
however, the data are  erratic in this respect and the
results are inconclusive.


      Using  the previously determined   replication
error and the  preliminary estimates of the precision
between  days  (0.3, 0.4, and 0.5 mg/m3  for the low,
intermediate, and high concentrations, respectively),
an examination of the significance of the magnitude
of individual differences  can be made. According to
these estimates, differences of less than 0.9, 1.2, and
1.4 mg/m3  for  the low, intermediate, and high con-
centrations,  respectively,  may  be accounted  for
95 percent  of the  time by chance  alone. Excluding
Laboratories 780 and 799, relatively few observations
exceed these amounts.
     The humidity  has  no measurable effect upon
the accuracy of the  method and does not appear to
contribute significantly to the precision.

C.   Linear Model Analysis

     The assumption made in linear model analysis is
that systematic differences  exist between  sets of
measurements made  by different observers in differ-
ent laboratories, and that these systematic  differences
are linear functions of the magnitude of the measurements.
Hence, the technique is called "the linear
model."(1,3-5) The linear model leads to a simple
design, but  requires a special  method of statistical
analysis,  geared to the practical objectives of collabo-
rative tests.

     The general  design is  as follows: to each of p
laboratories, q materials have been sent for test, and
each laboratory  has  analyzed each material n times.
Now, the n determinations made by the ith laboratory
on the jth material constitute what will be denoted as
the "ij cell." The n replicates of any particular cell
are viewed as a random sample from a theoretically
infinite population of measurements within that cell.
The laboratories, however, are not considered as a
random sample from a larger population of laboratories,
but are considered as fixed variables. Therefore, the
inferences involving the variability among laboratories
are limited, at least theoretically, to those
laboratories participating in the test. The set of
values which corresponds to the q materials is viewed
as a fixed variable, but each material is considered to
be a random selection from a population of materials
with the same "value." This model allows for noncon-
stant,  nonrandom differences  between laboratories.
The method is not  as sensitive to  outliers as  is the
conventional analysis of variance where even  a  single
outlier  may  result in  an unusually large  interaction
term.

      This  collaborative  test has a nested design in
order to allow the differentiation between the repro-
ducibility of results  made almost simultaneously and
that of results obtained on different days. The term
replicate in the  paragraph above includes both the
replication error  as it has been previously defined in
this  appendix  as well  as  the  within-laboratory be-
tween-days precision yet to be determined.

      In view of the results of the analysis of humid-
ity effects  discussed in the previous subsection, it is
appropriate to combine  the data from both dry and
humidified test  concentrations  in  this linear model
analysis. Since humidity has  no apparent effect on
either precision or accuracy, there is no reason not to
combine the data.

      The data in Table B-IV provide the basis for the
determination of the between-days precision as well
as for  the  subsequent  linear  model  analysis. The
means and standard  deviations have been computed
as follows:
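     [Equations (B-1), (B-2), and (B-3) are not legible in
the source. Equations (B-1) and (B-2) give the mean and the
standard deviation of each cell over the w days, and
Equation (B-3) defines the values y_ijk used in them.]
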
Each y_ijk is rounded to 0.1 mg/m3 before subsequent
use in Equation (B-1) or (B-2). For the particular
model, there are p = 15 laboratories, q = 6 concentrations
or samples, w = 3 days, and n = 3 replicates. The
values of c_ij are shown in Table C-II of Appendix C
and the values of c_j are 8, 30, and 53 mg/m3 for the
low, intermediate, and high concentrations, respec-
tively. The reference values are the same for both dry
and humidified samples;  hence there are only three
values whereas q is equal to 6.

     The foregoing treatment is necessary in order to
remove  the variations due to the differences in indi-
vidual test  gas concentrations within a given concen-
tration level.  In estimating the replication error or the
between-days precision,   such   treatment  was  not
necessary; however,  it was  required for subsequent
analysis involving variations between laboratories.

     The data in Table B-IV are arranged by column
for samples and by rows for laboratories with two
entries for each cell: the upper is the mean and the
lower is the standard deviation. Inspection of the
standard deviations reveals some suspiciously high
values which must be tested to determine whether
they are outliers. Each standard deviation has
w - 1 degrees of freedom, and each column may be
examined by Cochran's test.(6) One observation thus
identified as an outlier (99 percent level of
significance) has been marked with an asterisk. The values
at the foot of each column show the value for the
pooled estimate for the column and also the respective
degrees of freedom computed in accordance with
usual practice.(7)
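
The statistic behind Cochran's test is simple to compute. The sketch below evaluates it for the humidified high-concentration column of Table B-IV, the column that contains the flagged observation. Only the statistic is calculated; the critical value still has to be taken from a published table (here, 15 variances with 2 degrees of freedom each), and the function name `cochran_c` is illustrative.

```python
# Between-days standard deviations (mg/m3) for the humidified high-concentration
# column of Table B-IV, in the laboratory order of the table.
labs = [220, 222, 253, 270, 310, 311, 370, 375, 540, 571, 780, 799, 860, 920, 927]
std_devs = [0.55, 0.65, 0.50, 0.56, 0.70, 0.32, 0.53, 0.10,
            0.25, 0.82, 0.12, 1.76, 0.53, 0.46, 0.00]

def cochran_c(stds):
    """Cochran's C: the largest variance divided by the sum of all variances."""
    variances = [s ** 2 for s in stds]
    largest = max(variances)
    return largest / sum(variances), variances.index(largest)

c_stat, idx = cochran_c(std_devs)
print(f"C = {c_stat:.3f}, largest variance from Laboratory {labs[idx]}")
```

A single cell that accounts for nearly half of the summed variance, as the Laboratory 799 cell does here, is the kind of value the test is designed to flag.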

     Further examination of the pooled  estimates
for each column by regression analysis indicates that
there  is no significant correlation of standard devia-
tion with  concentration. Therefore,  all individual
estimates may be pooled into  a  single value  equal to
0.45 mg/m3 (178 degrees of  freedom).


TABLE B-IV. MEANS AND STANDARD DEVIATIONS FOR EACH LABORATORY FOR EACH SAMPLE. Upper number in
              each cell is mean and lower number is standard deviation. Values in milligrams per cubic meter.

Laboratory                Dry Concentration              Humidified Concentration
Code Number            Low     Medium     High           Low     Medium     High
220                    8.1      30.5      53.3           8.2      30.5      53.2
                       0.10     0.12      0.55           0.06     0.15      0.55
222                    6.3      28.7      54.0           6.6      28.6      54.2
                       0.00     0.23      0.40           0.20     0.87      0.65
253                    7.9      31.3      54.5           7.8      31.2      55.0
                       0.69     0.64      0.00           0.58     0.69      0.50
270                    7.6      31.3      56.4           7.6      31.9      56.3
                       0.75     0.61      0.78           0.35     0.23      0.56
310                    8.6      31.5      56.4           8.1      31.0      55.4
                       0.06     0.06      0.30           0.29     0.57      0.70
311                    8.5      31.5      54.1           8.9      31.3      53.4
                       0.45     0.26      0.49           0.20     0.17      0.32
370                    7.7      30.5      54.9           7.7      30.5      54.9
                       0.31     0.35      0.53           0.25     0.44      0.53
375                    7.2      30.0      54.6           7.2      29.7      54.0
                       0.26     0.60      0.44           0.29     0.46      0.10
540                    8.3      30.5      53.6           8.2      30.4      53.7
                       0.17     0.12      0.15           0.12     0.29      0.25
571                    8.1      30.1      51.4           7.9      29.8      50.9
                       0.35     0.61      1.00           0.44     0.56      0.82
780                    8.0      30.8      53.6          11.7      33.5      55.2
                       0.38     0.12      0.32           0.21     0.42      0.12
799                    7.7      30.0      54.8           8.0      29.8      53.3
                       0.21     0.83      0.78           0.53     0.92      1.76*
860                    7.2      30.5      53.4           7.2      30.5      53.4
                       0.00     0.00      0.00           0.00     0.00      0.00
920                    7.4      31.7      55.3           7.4      31.6      54.8
                       0.00     1.10      0.69           0.00     0.83      0.46
927                    8.5      32.0      55.7           8.5      32.0      55.7
                       0.00     0.00      0.00           0.00     0.00      0.00
Pooled Estimate        0.34     0.50      0.52           0.30     0.53      0.47
D.F.                   30       30        30             30       30        28

*Outlying observation.

     The value of 0.45 mg/m3 corresponds to the square
root of V(e), which is the replication variance in the
context of the general linear analysis model.(1,3-5) It can be
partitioned into the between-days precision and the
replication error as defined in this study according to
the relationship

$$V(e) = \sigma_D^2 + \frac{\sigma_e^2}{n} \qquad \text{(B-4)}$$

To reduce confusion as much as possible, V(e) will be
used to denote the variance for replication in the
context of linear model analysis, and σ_e² will be used to
denote the variance for replication as defined in this
study. Solving Equation (B-4) with √V(e) = 0.45,
σ_e = 0.17, and n = 3 yields σ_D = 0.44 mg/m3 (161
degrees of freedom) for the value of the standard
deviation for between-days precision. The effects of
environmental factors and calibration procedures are included
in this error term.
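
The arithmetic of this step can be checked with a few lines. The sketch assumes Equation (B-4) has the form V(e) = σ_D² + σ_e²/n, as it is listed among the within-laboratory components in Table B-VII, and that the quoted 0.45 is the square root of V(e); the function name is illustrative.

```python
import math

def between_days_std(sqrt_v_e, sigma_e, n):
    """Solve V(e) = sigma_D**2 + sigma_e**2 / n for sigma_D."""
    return math.sqrt(sqrt_v_e ** 2 - sigma_e ** 2 / n)

sigma_d = between_days_std(sqrt_v_e=0.45, sigma_e=0.17, n=3)
print(f"sigma_D = {sigma_d:.2f} mg/m3")
```

Because the replication variance is divided by n = 3 before it is subtracted, nearly all of V(e) is attributed to the between-days component.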

     Since there was no significant correlation of be-
tween-days precision with concentration, there was no
 need to make any transformation of scale, and the
 following linear model analysis was thus made upon
 the means in Table B-IV.

      On  the  assumption  of linear  relationships
 among the p laboratories, it  follows that  the values
 obtained by  each laboratory are  linearly  related to
 the  corresponding average values  of all  laboratories.
 Each of the  means in Table B-IV may be plotted ver-
 sus  its  respective column  mean. This  should be a
 linear function, and the points corresponding to each
 line may be represented by three parameters: a mean;
 a slope; and  a quantity related to  the  deviation from
 linearity, the standard error of estimate. These param-
 eters are determined by  a least-squares  regression
 analysis, and the results are shown in Table B-V. The
 data of Laboratory 780 have been  eliminated from
 this analysis  because  of the large differences between
 dry  and humidified samples. The linear model analy-
 sis  by itself will reveal  any other laboratory outliers.

    TABLE B-V.  MEANS, SLOPES, AND STANDARD
     ERRORS OF ESTIMATE FOR LINEAR MODEL
      ANALYSIS. (Omitting Laboratory 780) Data in
              milligrams per cubic meter.

Laboratory                           Standard Error
Code Number      Mean      Slope      of Estimate
220             30.63     0.9697         0.13
222             29.73     1.0248         0.74
253             31.28     1.0083         0.34
270             31.85     1.0482         0.26
310             31.83     1.0226         0.44
311             31.28     0.9686         0.37
370             31.03     1.0150         0.27
375             30.45     1.0129         0.32
540             30.78     0.9762         0.16
571             29.70     0.9277         0.44
799             30.60     0.9936         0.59
860             30.37     0.9932         0.35
920             31.37     1.0244         0.47
927             32.07     1.0148         0.20
Mean            30.93     1.0000         0.41*

*Pooled estimate.

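
The regression behind Table B-V can be sketched as follows. The cell means are transcribed from Table B-IV with Laboratory 780 left out, each laboratory's six means are regressed on the six column means, and the standard error of estimate is taken with q - 2 = 4 degrees of freedom. That divisor and the name `fit` are assumptions of this sketch rather than details stated in the text.

```python
import math
from statistics import mean

# Cell means from Table B-IV (mg/m3), Laboratory 780 omitted. Column order:
# dry low, dry medium, dry high, humidified low, humidified medium, humidified high.
cell_means = {
    220: [8.1, 30.5, 53.3, 8.2, 30.5, 53.2],
    222: [6.3, 28.7, 54.0, 6.6, 28.6, 54.2],
    253: [7.9, 31.3, 54.5, 7.8, 31.2, 55.0],
    270: [7.6, 31.3, 56.4, 7.6, 31.9, 56.3],
    310: [8.6, 31.5, 56.4, 8.1, 31.0, 55.4],
    311: [8.5, 31.5, 54.1, 8.9, 31.3, 53.4],
    370: [7.7, 30.5, 54.9, 7.7, 30.5, 54.9],
    375: [7.2, 30.0, 54.6, 7.2, 29.7, 54.0],
    540: [8.3, 30.5, 53.6, 8.2, 30.4, 53.7],
    571: [8.1, 30.1, 51.4, 7.9, 29.8, 50.9],
    799: [7.7, 30.0, 54.8, 8.0, 29.8, 53.3],
    860: [7.2, 30.5, 53.4, 7.2, 30.5, 53.4],
    920: [7.4, 31.7, 55.3, 7.4, 31.6, 54.8],
    927: [8.5, 32.0, 55.7, 8.5, 32.0, 55.7],
}

# The column means over all retained laboratories serve as the abscissa.
column_means = [mean(col) for col in zip(*cell_means.values())]
x_bar = mean(column_means)
sxx = sum((x - x_bar) ** 2 for x in column_means)

def fit(y):
    """Least-squares line of one laboratory's cell means on the column means."""
    y_bar = mean(y)
    slope = sum((x - x_bar) * (v - y_bar) for x, v in zip(column_means, y)) / sxx
    residuals = [v - (y_bar + slope * (x - x_bar)) for x, v in zip(column_means, y)]
    see = math.sqrt(sum(r ** 2 for r in residuals) / (len(y) - 2))
    return y_bar, slope, see

results = {lab: fit(y) for lab, y in cell_means.items()}
for lab, (lab_mean, slope, see) in results.items():
    print(f"{lab}: mean = {lab_mean:.2f}, slope = {slope:.4f}, "
          f"std. error of estimate = {see:.2f}")
```

Because the abscissa is itself the average over laboratories, the slopes necessarily average to exactly one, which is why the last row of Table B-V shows a mean slope of 1.0000.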
      A plot of the lines represented  by the means
and  slopes from Table B-V would result in  a  rela-
tively tight bundle of  straight lines,  each  line rep-
resenting a particular laboratory. Only the  lines for
Laboratories 222 and  571 depart  from the  cluster
enough  to be  recognized; therefore,  the plot was
not  reproduced in this  report. Both the means and
the  slopes  approximate normal distributions,  and
no outliers can be detected in either.

      Inspection of the standard errors of estimate
from Table B-V reveals one suspiciously high value;
however, these standard errors of estimate have an
approximate  chi-square distribution and no outliers
can be identified.

      These data may be more easily compared from
the graphic presentation in Figure B-2 where they
have been sorted into an ascending order relative to
the means. This sorting often reveals effects not
readily visible otherwise. Control limits, based upon
deviation from linearity, are shown for the means and
the slopes. These 95 percent control limits indicate
several points to be "out of control." This indicates
that the differences between laboratories cannot be
accounted for by experimental error alone.
Examination of Figure B-2 reveals which laboratories
showed the greatest departures from the overall
mean, which laboratories showed the greatest
departure from unit slope, and which laboratories were
responsible for the greatest deviations in linearity.
When viewing this figure, it is important to watch for
relationships between the parameters.

FIGURE B-2. CONTROL CHARTS FOR MEANS, SLOPES,
    AND STANDARD ERRORS OF ESTIMATE FOR
            LINEAR MODEL ANALYSIS.
              (Omitting Laboratory 780).
[Horizontal axis: Laboratory Number, in ascending order of means:
571, 222, 860, 375, 799, 220, 540, 370, 311, 253, 920, 310, 270, 927.]

      The next step is an analysis of variance which
was performed according to the technique of
Mandel,(3) and the results are shown in Table B-VI.
The interested reader may consult the appropriate
reference for the theory and details of the analysis.

      The next step in linear model analysis is to
determine whether a correlation exists between the
means and the slopes. Such a correlation, if it exists,
is a valuable feature in the interpretation of the data.
The correlation between these two parameters is
significant at the 90 percent but not at the 95 percent level
of significance; therefore, an approximate correlation
exists, and the slopes and the means are not
completely independent. This significantly positive
correlation indicates a tendency for concurrence of the
lines at a point below the overall mean of
30.9 mg/m3. If the lines were exactly concurrent,
there would exist a particular value of concentration
(the point of concurrence) at which all laboratories
obtained the same result. An F ratio of the
mean square for nonconcurrence to V(η) from
Table B-VI is highly significant; therefore, the
concurrence is not absolute, and there remains a significant
amount of variability between laboratories even at
the point at which all laboratories tend to agree best.
This point lies in the vicinity of zero.
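
The two quantities discussed in this paragraph can be recomputed from the tabulated values. The sketch below takes the laboratory means and slopes from Table B-V and the mean squares from Table B-VI. It uses the ordinary Pearson coefficient, and it assumes that V(η) is the deviation-from-linear mean square (0.1702); the function name `pearson_r` is illustrative.

```python
import math
from statistics import mean

# Laboratory means and slopes from Table B-V (Laboratory 780 omitted).
means = [30.63, 29.73, 31.28, 31.85, 31.83, 31.28, 31.03,
         30.45, 30.78, 29.70, 30.60, 30.37, 31.37, 32.07]
slopes = [0.9697, 1.0248, 1.0083, 1.0482, 1.0226, 0.9686, 1.0150,
          1.0129, 0.9762, 0.9277, 0.9936, 0.9932, 1.0244, 1.0148]

def pearson_r(x, y):
    """Ordinary product-moment correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(means, slopes)

# F ratio of the nonconcurrence mean square (12 degrees of freedom) to the
# deviation-from-linear mean square (52 degrees of freedom), Table B-VI.
f_ratio = 1.6310 / 0.1702

print(f"r = {r:.3f}, F = {f_ratio:.2f}")
```

For 14 pairs (12 degrees of freedom), tabulated two-sided critical values of r are roughly 0.46 at the 90 percent level and 0.53 at the 95 percent level, so a coefficient of about 0.53 lies between the two. This is consistent with the statement above.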
      The variance components may now be  com-
puted  from the data in Table B-VI, and again  the
technique of Mandel(3) was used. A summary of re-
sults for variance components and derived quantities
is shown in Table B-VII.

      It is now necessary to introduce and define the
concept of a test result.^ A test result is defined as
the average  of m replicates, where m is the required
number  of  replicate measurements specified by  the
method. The particular method does not specify  any
more than one replicate; therefore the  value of m is
taken to be one. Thus, a test result  is defined as  a
single measurement  and V(e) is given by Equation
(B-4) with n = m =  1.

      The four sources of variability have been
calculated for several values of concentration and are shown
in Table B-VIII. Also shown are the fractions of the
total variance accounted for by each source.
Comparison of V(e) and V(λ), each of which is constant, would
indicate that the precision of the method could be
improved by decreasing V(e). However, V(e) is largely
composed of the variation between days, which is large
in comparison with the replication error; therefore,
increasing the number of replicates will not materially
assist in improving the precision of the method. The
between-laboratory variability is larger than the
within-laboratory variability throughout the table, which
indicates significant sources of variation between the
laboratories. These sources of variation are undoubtedly
related to the accuracy of the calibration gases used
in the collaborating laboratories.


                         TABLE B-VI. ANALYSIS OF VARIANCE FOR LINEAR MODEL

                                                   Degrees
Source of Variation            Sum of Squares    of Freedom    Mean Square
Laboratories                        42.6987           13           3.2845
Concentrations                   30284.1677            5        6056.8335
Laboratory x Concentration          35.9206           65           0.5526
  Linear                            27.0714           13           2.0824
    Concurrence                      7.4996            1           7.4996
    Nonconcurrence                  19.5718           12           1.6310
  Deviation from Linear              8.8491           52           0.1702

   TABLE B-VII. SUMMARY OF RESULTS FOR VARIANCE
     COMPONENTS AND DERIVED QUANTITIES FOR LINEAR
       MODEL ANALYSIS. Data in milligrams per cubic meter.

                                  Derived from           For Computations
                               Collaborative Test,    Based on a Test Result,
Components                         n = w = 3               n = w = 1
Within Laboratories
  σ_e²                              0.0289                  0.0289
  σ_D²                              0.1936                  0.1936
  V(e) = σ_D² + σ_e²/n              0.2025                  0.2225
  V(λ)                              0.1027                  0.1027
  V(η) = V(λ) + V(e)/w              0.1702                  0.3252
Between Laboratories
  V(μ)                              0.5191                  0.5191
  V(β)                              0.000884                0.000884
  V(δ)                              0.000754                0.000754
  α                                 0.02207                 0.02207
  x̄                                 30.9                    30.9

      The repeatability and reproducibility(9) must
now be defined and computed. The repeatability is "a
quantity that will be exceeded only about five percent
of the time by the difference, taken in absolute
value, of two randomly selected test results obtained
in the same laboratory on a given material."(9) The
reproducibility is "a quantity that will be exceeded
only about five percent of the time by the difference,
taken in absolute value, of two single results made on
the same material in two different, randomly selected
laboratories."(9) These parameters are computed by
the formulas

$$\text{Repeatability} = 2.77\sqrt{V(\eta)} \qquad \text{(B-5)}$$

$$\text{Reproducibility} = 2.77\sqrt{V_j(y)} \qquad \text{(B-6)}$$

where V(η) is given by

$$V(\eta) = V(\lambda) + V(e) \qquad \text{(B-7)}$$

where V(e) is given by Equation (B-4) for n = m = 1
and V_j(y) is given by

$$V_j(y) = \underbrace{(1+\alpha\gamma_j)^2\,V(\mu) + \gamma_j^2\,V(\delta)}_{\text{Between laboratories}} + \underbrace{V(\lambda) + V(e)}_{\text{Within laboratories}} \qquad \text{(B-8)}$$

where the index j is attached to the variance symbol
to signify its dependence upon γ_j, which is given by

$$\gamma_j = x_j - \bar{x} \qquad \text{(B-9)}$$

where x_j is the level of concentration at which V_j(y)
is desired. Substituting the derived values into
Equation (B-8) and simplifying, the following equation is
obtained.

$$V_j(y) = 0.001007\,x_j^2 - 0.0393\,x_j + 1.10 \qquad \text{(B-10)}$$

Users may choose between Equation (B-8) or (B-10)
or the graphic presentation shown in Figure B-3 in
which the repeatability and the reproducibility have
been plotted for a range of values of concentration.
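
A small sketch shows how these quantities could be evaluated at any concentration. It assumes the component form V_j(y) = (1 + αγ_j)² V(μ) + γ_j² V(δ) + V(λ) + V(e) with γ_j = x_j - x̄, takes V(η) = V(λ) + V(e) for a single test result, and uses the test-result column of Table B-VII. The function names are illustrative.

```python
import math

# Variance components from Table B-VII, test-result column (n = w = 1).
V_E, V_LAMBDA = 0.2225, 0.1027        # within laboratories
V_MU, V_DELTA = 0.5191, 0.000754      # between laboratories
ALPHA, X_BAR = 0.02207, 30.9

def repeatability():
    """2.77 times the within-laboratory standard deviation of a test result."""
    return 2.77 * math.sqrt(V_LAMBDA + V_E)

def total_variance(x):
    """Total variance at concentration x, built up from the components."""
    gamma = x - X_BAR
    between = (1 + ALPHA * gamma) ** 2 * V_MU + gamma ** 2 * V_DELTA
    return between + V_LAMBDA + V_E

def total_variance_quadratic(x):
    """The simplified quadratic, Equation (B-10)."""
    return 0.001007 * x ** 2 - 0.0393 * x + 1.10

def reproducibility(x):
    return 2.77 * math.sqrt(total_variance(x))

for x in (10, 30, 50):
    print(f"x = {x} mg/m3: repeatability = {repeatability():.2f}, "
          f"reproducibility = {reproducibility(x):.2f}")
```

Over the range 0 to 60 mg/m3 the component form and the quadratic of Equation (B-10) agree to within rounding of the coefficients, so either can be used.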
                TABLE B-VIII. SOURCES OF VARIABILITY AND THEIR RELATIVE IMPORTANCE
                                   FOR THE LINEAR MODEL ANALYSIS

  x      √V(e)  Pct.*     √V(λ)  Pct.*     √[(1+αγ)²V(μ)]  Pct.*     √[γ²V(δ)]  Pct.*     √V(y)
  0      0.45    19       0.32    10            0.23          5         0.85      67       1.04
  5      0.45    22       0.32    11            0.31         10         0.71      56       0.95
 10      0.45    26       0.32    13            0.39         19         0.57      42       0.89
 15      0.45    28       0.32    14            0.47         31         0.44      27       0.85
 20      0.45    29       0.32    15            0.55         43         0.30      13       0.83
 25      0.45    28       0.32    14            0.63         54         0.16       4       0.85
 30      0.45    25       0.32    13            0.71         62         0.03       0       0.90
 35      0.45    22       0.32    11            0.79         66         0.11       1       0.97
 40      0.45    18       0.32     9            0.86         67         0.25       6       1.06
 45      0.45    15       0.32     8            0.94         66         0.39      11       1.16
 50      0.45    12       0.32     6            1.02         64         0.52      17       1.28
 55      0.45    10       0.32     5            1.10         62         0.66      22       1.40
 60      0.45     9       0.32     4            1.18         60         0.80      27       1.53

*Percent of total variance.
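
One row of the table can be recomputed as a check. The sketch assumes the four sources are V(e), V(λ), (1 + αγ)² V(μ), and γ² V(δ) with γ = x - x̄. It uses the tabulated constant √V(e) = 0.45 and takes the remaining components from Table B-VII; the function name `variability_row` is illustrative.

```python
import math

V_E = 0.45 ** 2                       # constant first column of Table B-VIII, squared
V_LAMBDA = 0.1027                     # Table B-VII
V_MU, V_DELTA = 0.5191, 0.000754      # Table B-VII
ALPHA, X_BAR = 0.02207, 30.9

def variability_row(x):
    """Return ({source: (std. dev., percent of total variance)}, total std. dev.)."""
    gamma = x - X_BAR
    components = {
        "V(e)": V_E,
        "V(lambda)": V_LAMBDA,
        "(1+alpha*gamma)^2 V(mu)": (1 + ALPHA * gamma) ** 2 * V_MU,
        "gamma^2 V(delta)": gamma ** 2 * V_DELTA,
    }
    total = sum(components.values())
    row = {name: (math.sqrt(v), 100.0 * v / total) for name, v in components.items()}
    return row, math.sqrt(total)

row, total_sd = variability_row(0)
for name, (sd, pct) in row.items():
    print(f"{name:26s} sd = {sd:.2f}  pct = {pct:.0f}")
print(f"total sd = {total_sd:.2f}")
```

At x = 0 the last component dominates, since that concentration lies far from the overall mean of 30.9 mg/m3. Near the mean the V(μ) term carries most of the variance.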

-------
              FIGURE B-3. REPEATABILITY AND REPRODUCIBILITY VERSUS CONCENTRATION
[Curves for repeatability and reproducibility plotted against concentration, 0 to 60 mg/m3.]

      III.  INTERPRETATION OF THE
                PARAMETERS

     The results of the previous section may now be
used to answer some fundamental questions, thus
fulfilling the objectives of this collaborative test. Unless
otherwise stated below, a 95 percent level of
significance is assumed.

A.   Precision of the Method

     The most general method to test class means is
the studentized range. If an estimate of the
standard deviation s is based on ν degrees of freedom
and is independent of the class means to be
compared, and if these class means are computed from N
cases and selected from a group of g means, then the
α allowance for any comparison is

$$\bar{x}_1 - \bar{x}_2 = q\,\frac{s}{\sqrt{N}} \qquad \text{(B-11)}$$

where x̄_1 is the highest class mean and x̄_2 is the
lowest class mean. The value of q is obtained from the
appropriate table.(11,13) Interest will center around
g = 2 because most often the interest is in comparing
two class means. In computing checking limits for
duplicates, N is of course equal to 1 and the test is
identical to ASTM recommended practice.(14)

     An obvious limitation is that the means must all
contain the same number of observations. When this
is not the case, the standard normal deviate is
adequate(15) and use can be made of the equation

$$|\bar{x}_1 - \bar{x}_2| = 1.96\,s\,\sqrt{\frac{1}{N_1} + \frac{1}{N_2}} \qquad \text{(B-12)}$$

where |x̄_1 - x̄_2| is the absolute value of the difference
in the two class means x̄_1 and x̄_2, and N_1 and N_2 are
the numbers of observations in x̄_1 and x̄_2, respectively.
The results from this equation are the same as
Equation (B-11) when N = N_1 = N_2 and ν is large.
The results are adequate if N_1 and N_2 are relatively
large (20 or more).

     To test whether the true value of a mean is
lower than a specified fixed value, the maximum
permissible difference is

$$\mu_0 - \bar{x} = 1.645\,\frac{s}{\sqrt{N}} \qquad \text{(B-13)}$$

which is a one-sided test where x̄ is the mean, μ_0 is
the fixed value, and N is the number of observations
in x̄.

      These techniques will be applied as appropriate
 to  the  three  sources  of variation below. The treat-
 ment will be in more depth for the precision between
 laboratories, which is of more practical interest.

      1.   Precision Between Replicates

           We have  already concluded that replica-
 tion will not materially assist  in increasing the preci-
 sion of the method. Replication will, in general, be a
 waste  of time  and effort; however, replicates  are
 often advisable to avoid gross errors. The expression
for the checking limit for duplicates uses $\sigma_e$ and Equation (B-11),
yielding

      $$R_{\max} = 2.77\,\sigma_e = 2.77(0.17) = 0.5$$      (B-14)

where $R_{\max}$ is the maximum permissible range between duplicates. Two
such replicates should be considered suspect if they differ by more than
0.5 mg/m3.
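
The factor 2.77 and the checking limit of Equation (B-14) can be checked numerically. The sketch below assumes that the factor is the two-sided 95 percent normal deviate multiplied by the square root of 2, which is the large-sample value for the range of two results; the function names are illustrative and do not come from the report.

```python
from math import sqrt
from statistics import NormalDist


def range_factor(confidence=0.95):
    """Factor q such that two results differing by more than q*sigma are suspect.

    Assumption: two observations (g = 2) and a well-determined sigma, so the
    difference of two results has standard deviation sigma*sqrt(2).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal deviate
    return z * sqrt(2)


def duplicates_suspect(a, b, sigma_e=0.17, confidence=0.95):
    """Return True when duplicates a and b differ by more than the checking limit."""
    return abs(a - b) > range_factor(confidence) * sigma_e


q = range_factor()       # close to the 2.77 used in Equation (B-14)
r_max = q * 0.17         # checking limit in mg/m3, reported as 0.5 after rounding
print(round(q, 2), round(r_max, 2))
```

With the replication standard deviation of 0.17 mg/m3, duplicates such as 8.4 and 8.6 mg/m3 pass the check, while a pair such as 8.0 and 6.9 mg/m3 would be flagged.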

     2.    Precision Between  Days
           One situation involves within-laboratory comparisons of the same
sample. It is of interest when comparing measured values on the same sample
analyzed on separate days. The estimate of the standard deviation in
Equation (B-11) must now include the variation between days in addition to
the replication error. The expression for $R_{\max}$, the maximum permissible
range between two test results, is

      $$R_{\max} = 2.77\sqrt{V(e)} = 1.3$$      (B-15)

where $V(e)$ is given by Equation (B-4) for n = m = 1. Two such test results
should be considered suspect if they disagree by more than 1.3 mg/m3.
           A separate and distinct case arises for within-laboratory
comparisons of two samples. Suppose it is desired to compare the results from
a single laboratory on two different but similar samples analyzed on different
days. The samples may have the same concentration but may differ in other
interfering properties such as humidity. It must be assumed that the
heterogeneity between the two samples with respect to interfering properties
is essentially the same as that shown in the collaborative test. Therefore,
the estimate of $V(\lambda)$ is the appropriate measure for the possible
heterogeneity of the two samples. Thus, the standard deviation estimate for
Equation (B-11) must now include $V(\lambda)$ as well. The resulting expression
for $R_{\max}$, the maximum permissible range between the test results on each
sample, is

      $$R_{\max} = 2.77\sqrt{V(\lambda) + V(e)} = 1.6$$      (B-16)

Therefore,  the  maximum permissible difference be-
tween a single test  result on each of the  samples is
1.6 mg/m3. If two such test results differ by less than
1.6 mg/m3  there is no reason to believe that there is
any real difference between them.
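
The way the variance components add before the factor 2.77 is applied can be sketched as follows. The combined standard deviation of 0.57 mg/m3 is the value quoted in this appendix for the square root of V(lambda) + V(e); the function names and the pairs of test results are hypothetical and serve only as illustration.

```python
from math import sqrt

RANGE_FACTOR = 2.77  # 95 percent factor for the range of two results


def max_range(*variances):
    """Maximum permissible range when all listed variance components apply."""
    return RANGE_FACTOR * sqrt(sum(variances))


def differ_significantly(result_a, result_b, *variances):
    """Return True when two single test results differ by more than the limit."""
    return abs(result_a - result_b) > max_range(*variances)


# Assumption: V(lambda) + V(e) is taken as 0.57 squared, per the quoted value.
combined_variance = 0.57 ** 2
limit = max_range(combined_variance)  # about 1.6 mg/m3, as in Equation (B-16)
print(round(limit, 2))
```

Passing the components separately, for example `max_range(v_lambda, v_e)`, gives the same limit as passing their sum, which is the point of summing variances rather than standard deviations.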

     There  may also  be some occasions where it
will  be  necessary to compare the means for  each
of two  given sampling  stations,  where each mean
was  obtained by the  same  analyst, and consisted
of a  known number of test results.  The number of
observations  in  each  mean  will  not usually be
equal. Their standard deviations will not  usually be
equal,  and  one  or  both  may   not  be   normally
distributed.  Where  they  are  normally distributed,
standard tests such as the t-test(17) may be  applied.
           A limiting case may be investigated if it is assumed that two
means $\bar{x}_1$ and $\bar{x}_2$ are normally distributed with
$\sigma_1 = \sigma_2 = \sqrt{V(\lambda) + V(e)} = 0.57$ mg/m3. This is an
unlikely, if not impossible, situation which could only result from
absolutely constant concentrations at each of the sampling stations. Under
these assumptions, we may apply Equation (B-12) and obtain

      $$R_{\max} = 1.96\,(0.57)\sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$$      (B-17)

where $R_{\max}$ is the maximum permissible range between means $\bar{x}_1$
and $\bar{x}_2$ containing $N_1$ and $N_2$ observations, respectively. If the
range exceeds $R_{\max}$, the means are significantly different and do not
belong to the same population.
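
A short sketch of the unequal-sample-size comparison follows. It assumes the two-sided 95 percent normal deviate of 1.96 and the standard deviation of 0.57 mg/m3 quoted for the limiting case; the function name and the sample sizes are hypothetical.

```python
from math import sqrt


def max_range_between_means(n1, n2, sigma=0.57, z=1.96):
    """Maximum permissible range between two means with n1 and n2 observations.

    Assumptions: both means are normally distributed with the same standard
    deviation sigma (mg/m3) and z is the two-sided 95 percent normal deviate.
    """
    return z * sigma * sqrt(1.0 / n1 + 1.0 / n2)


unequal = max_range_between_means(20, 30)   # hypothetical sample sizes
equal = max_range_between_means(25, 25)     # reduces to about 2.77*sigma/sqrt(25)
print(round(unequal, 3), round(equal, 3))
```

For equal sample sizes the result agrees with 2.77 times sigma divided by the square root of N, which is the equivalence between the two equations noted earlier in this section.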

           Under the same limiting assumptions, a mean $\bar{x}$ containing
N observations may be compared with some fixed value $\mu_0$ and it may be
stated whether the true value of $\bar{x}$ is less than $\mu_0$.
Equation (B-13) may be applied to this case, resulting in

      $$R_{\max} = -1.645\,\frac{0.57}{\sqrt{N}}$$      (B-18)

where $R_{\max}$ is the maximum permissible range between $\bar{x}$ and
$\mu_0$. If $\bar{x} - \mu_0$ is less than $R_{\max}$, then the true value of
$\bar{x}$ is less than $\mu_0$.

      3.    Precision Between Laboratories

           Probably the most frequent comparison to be made will be that
involving observations of two different laboratories. When a comparison is
made between results obtained in different laboratories, the variance
$V(\lambda)$ is always included in the comparison, regardless of whether this
comparison involves a single material or different materials. While it is
true that the interfering properties for a single material are constant, the
response of different laboratories to the same interfering property may not
necessarily be the same. The variability of this response is exactly what is
measured by $V(\lambda)$. The estimate of the standard deviation for
Equation (B-11) now contains the effects of variations in the means and the
slopes of the response lines for the laboratories. The required estimate is
the square root of $V_j(y)$, which may be obtained from either Equation (B-8)
or (B-10). The resulting expression for $R_{\max}$, the maximum permissible
difference between a test result from each of two different laboratories, is

      $$R_{\max} = 2.77\sqrt{V_j(y)}$$      (B-19)

This comparison is complicated by the dependence of between-laboratory
variability on the concentration. $R_{\max}$ is identical to the
reproducibility given by Equation (B-6) and plotted in Figure B-3. Two such
test results may not be considered to belong to the same population if they
differ by more than $R_{\max}$. Conversely, the two test results are not
significantly different if they differ by less than $R_{\max}$.
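
The dependence of the between-laboratory limit on concentration can be sketched numerically. The polynomial below is the standard deviation expression printed in the caption of Figure B-4; treating it as the square root of the between-laboratory variance used here is an assumption, and the function names are illustrative only.

```python
from math import sqrt


def between_lab_std(conc):
    """Assumed between-laboratory standard deviation in mg/m3.

    Taken from the Figure B-4 caption: (0.001007*x**2 - 0.0393*x + 1.10)**0.5,
    where x is the concentration in mg/m3.
    """
    return sqrt(0.001007 * conc ** 2 - 0.0393 * conc + 1.10)


def reproducibility(conc, factor=2.77):
    """Maximum permissible difference between results from two laboratories."""
    return factor * between_lab_std(conc)


for conc in (8.4, 30.0, 52.3):  # roughly the three levels of the test
    print(conc, round(reproducibility(conc), 2))
```

Under this assumption the limit is about 2.5 mg/m3 near 30 mg/m3 and grows at the highest level, which is the concentration dependence the text refers to.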

           Frequently, it will be  necessary to com-
pare the means for each of two given sampling sta-
tions. Each  mean may be the result of observations
by  one or more different  laboratories.  Each  mean
may contain a different number of observations, each
a  test  result.  Their  standard  deviations will not
usually be equal, and one or both may not be normally distributed. Where
they are normally distributed, standard tests such as the t-test(17) may be
applied.

           Similar to the preceding subsection, a limiting case may be
investigated if it is assumed that the two means $\bar{x}_1$ and $\bar{x}_2$
containing $N_1 = N_2 = N$ observations are normally distributed with
$\sigma_1 = \sigma_2 = \sqrt{V_j(y)}$. Here again, this is an unlikely, if not
impossible, situation which could only result from absolutely constant
concentrations at each sampling station. Nevertheless, a certain amount of
guidance can be derived. If Equation (B-11) is applied to this case, the
result is

      $$R_{\max} = 2.77\sqrt{\frac{V_j(y)}{N}}$$      (B-20)

where $R_{\max}$ is the maximum permissible range between the means
$\bar{x}_1$ and $\bar{x}_2$. If the range exceeds $R_{\max}$, the means are
significantly different and do not belong to the same population.
           Under the same assumptions as above, with the exception that
$N_1$ may not equal $N_2$ but both are relatively large, Equation (B-12) is
used, yielding

      $$R_{\max} = 1.96\sqrt{V_j(y)}\,\sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$$      (B-21)

where $R_{\max}$ is the maximum permissible range between $\bar{x}_1$ and
$\bar{x}_2$. If the range exceeds $R_{\max}$, the means are significantly
different and do not belong to the same population.

           It is interesting to pursue this line of reasoning further in
terms of the number of samples required to detect a specified difference
under the limiting assumptions. Rearranging Equation (B-20) and solving for
N, the result is

      $$N = \left(\frac{2.77}{R_{\max}}\right)^2 V_j(y)$$      (B-22)

This expression now gives the minimum number of observations N for any
desired agreement $R_{\max}$ between two means at any level of concentration
$x_i$. These results are best illustrated in Figure B-4. This figure shows
the agreement versus the concentration level for a family of sample sizes.
Superimposed on the curves are constant percentage agreement lines for
comparison purposes. For example, if agreement better than 5 percent at a
concentration of 15 mg/m3 is desired, a minimum of 10 observations would be
required.

           Under the same limiting assumptions, it is possible to compare a
mean $\bar{x}$ containing N observations with some fixed value $\mu_0$ and be
able to state whether the true value of $\bar{x}$ is less than $\mu_0$.
Equation (B-13) is used for this type of case, yielding

      $$R_{\max} = -1.645\sqrt{\frac{V_j(y)}{N}}$$      (B-23)

where $R_{\max}$ is the maximum permissible range between $\bar{x}$ and
$\mu_0$. If $\bar{x} - \mu_0$ is less than $R_{\max}$, then the true value of
$\bar{x}$ is less than $\mu_0$.

           Rearranging Equation (B-23) and solving for N yields

      $$N = \left(\frac{1.645}{R_{\max}}\right)^2 V_j(y)$$      (B-24)

This equation is exactly analogous to Equation (B-22). N is the minimum
number of observations required to attain the agreement $R_{\max}$ under the
limiting assumptions. Figure B-5, which is analogous to Figure B-4, best
illustrates the resulting relationships. For example, a minimum of two
observations would be required to establish that the true value of $\bar{x}$
is less than 20 mg/m3, while the actual value is 19 mg/m3 (a 5-percent
difference). Stated differently, given a set of two observations with a mean
of 19 mg/m3, there exists a 95 percent confidence that the true mean is less
than 20 mg/m3.
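
The two worked examples in this subsection (about 10 observations for 5-percent agreement at 15 mg/m3, and about two observations to separate 19 from 20 mg/m3) can be reproduced with the sketch below. It assumes that the variance is the polynomial given in the captions of Figures B-4 and B-5 and that the sample-size expressions have the form (factor / agreement) squared times the variance; the function names are illustrative.

```python
def variance_at(conc):
    """Assumed variance in (mg/m3)**2 at concentration conc.

    Taken from the figure captions: 0.001007*x**2 - 0.0393*x + 1.10.
    """
    return 0.001007 * conc ** 2 - 0.0393 * conc + 1.10


def n_for_two_means(conc, agreement, factor=2.77):
    """Observations per mean needed for two means to agree within `agreement`."""
    return (factor / agreement) ** 2 * variance_at(conc)


def n_for_mean_vs_fixed(mu0, agreement, factor=1.645):
    """Observations needed to show a mean lies `agreement` below the fixed value mu0."""
    return (factor / agreement) ** 2 * variance_at(mu0)


n_two_means = n_for_two_means(15.0, 0.05 * 15.0)      # 5 percent of 15 mg/m3
n_one_sided = n_for_mean_vs_fixed(20.0, 20.0 - 19.0)  # 19 versus 20 mg/m3
print(round(n_two_means, 2), round(n_one_sided, 2))
```

The computed values fall very close to 10 and 2, in line with the figures quoted in the text; how a fractional result is rounded to a whole number of observations is left to the user.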

B.    Accuracy of the Method

      In  the  discussion of  accuracy, an additional
concept must  be introduced—the  reference value of
the measured property for the system under consider-
ation.  Mandel(18) discusses  three types of reference
values  of which the "assigned value" applies for  this
collaborative   test.  The  reference  values   for  the
samples included in the study are the values provided
for these samples by the supplier of those samples.
This does not necessarily mean that these values are
considered  absolutely  correct, but it  does mean that
there is a reasonable degree  of confidence in the qual-
ity of such materials from this source.

      If the reference value is represented by R and the mean of the
population of repeated measurements is $\mu$, then the bias or systematic
error is $\mu - R$. The error for an individual measurement x would be
$x - R$. Inaccuracy is thus measured by the magnitude of $\mu - R$ or
$x - R$. A method is accurate if $\mu - R$ is not significantly different
from zero.
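
As a small illustration of the bias definition, the sketch below uses the three Day 1 dry-sample results of Laboratory 220 at the high concentration and its reference value, both taken from Appendix C. Using the sample mean in place of the population mean is an assumption made only for the example, and the function name is hypothetical.

```python
from statistics import mean


def bias(observations, reference):
    """Return (bias, percent bias) of the mean of observations about a reference value."""
    m = mean(observations)
    return m - reference, 100.0 * (m - reference) / reference


# Laboratory 220, dry sample, high concentration, Day 1 (Table C-I-a), in mg/m3
observations = [53.3, 53.3, 52.9]
reference_value = 52.6  # Table C-II, Laboratory 220, high concentration

absolute_bias, percent_bias = bias(observations, reference_value)
print(round(absolute_bias, 2), round(percent_bias, 2))
```

Whether a bias of this size is significant depends on the standard error of the mean, which this sketch does not evaluate.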

       A definite and  statistically  significant   in-
accuracy exists; however,  its practical significance
must be interpreted with  respect to other  criteria.
This inaccuracy is best illustrated in Figure B-6. The departures of
individual laboratory averages from their respective reference values as
well as the departures of overall averages from their respective values have
been plotted versus concentration. The large open circles are overall
averages at each level of concentration in the collaborative test. The small
circles are individual laboratory values. The overall averages at each level
are the mean of 14 laboratory averages, each of which is the average of
18 observations (three replicates on each of three days for each of two
samples). The standard error of the overall means is represented by the
solid lines, and the standard error of the individual means is represented
by the dashed lines. The standard errors have been plotted with reference to
zero so that observations falling outside these lines are significantly
different from zero.

FIGURE B-4. EXPECTED AGREEMENT BETWEEN TWO MEANS VERSUS CONCENTRATION FOR VARIOUS
  NUMBERS OF OBSERVATIONS (95 Percent Level of Significance). EACH MEAN HAS N OBSERVATIONS
  WITH A STANDARD DEVIATION EQUAL TO $(0.001007\,x_i^2 - 0.0393\,x_i + 1.10)^{0.5}$

FIGURE B-5. EXPECTED AGREEMENT BETWEEN A MEAN AND A FIXED VALUE VERSUS CONCENTRATION
  FOR VARIOUS NUMBERS OF OBSERVATIONS (95 Percent Level of Significance). THE MEAN HAS N
  OBSERVATIONS WITH A STANDARD DEVIATION EQUAL TO $(0.001007\,\mu_0^2 - 0.0393\,\mu_0 + 1.10)^{0.5}$

FIGURE B-6. BIAS OR SYSTEMATIC ERROR VERSUS CONCENTRATION

     Several  of  the  individual laboratory  means
 are significantly different from zero, mostly at the
 higher  concentrations and nearly  all on the  high
 side. The overall  means  are  significant at the two
higher  levels  of concentration. This  relationship is
nearly  linear  and the  results tend  to  be,  on  the
 average, 2.5 percent  high.
      The method uses the same  type materials for
calibration as were used for reference samples in this
test. There is little doubt, therefore, that the inaccu-
racy results primarily, if not completely, from the use
of calibration gases which exhibit significant variation
with  respect to their specified  concentration. Since
the results tend to be high, the calibration gases must
have a tendency to be correspondingly low.

      Caution  should be exercised in the use of these
measures  of   accuracy.  Although  calibration  gas
sources  were  randomly  selected,  it is known that
some standards used by different laboratories were
prepared and analyzed at the same time by the same
supplier. Nevertheless, it cannot be overemphasized
that the accuracy of the method is  almost totally
dependent upon the availability of sufficiently accu-
rate standards.

      In order to further examine the accuracy  of the
calibration gases used by the individual  collaborators,
some additional  analyses were made.  The first of
these investigated the individual calibration curves
and  compared them  with calibration curves con-
structed from  the reference sample  data. Since  the
chart readings for each calibration curve  were re-
corded,  the parameters of each curve could be com-
puted. The standard error of estimate was computed
for each calibration curve by a least-squares regression
analysis of chart  readings on calibration gas concen-
tration.  The average standard errors  of estimate for
each laboratory are shown in Table B-IX, where they
have  been grouped into  instrument ranges  and sub-
grouped into calibration gas sources. They are in units
of chart divisions and, for purposes of  comparison, a
chart division  for  the 0 to 58-mg/m3 (0 to 50 ppm)
range is approximately equal  to  0.6 mg/m3,  and a
chart division  for  the higher range is approximately
equal to  1.2 mg/m3. These  standard errors of esti-
mate  are measures of nonlinearity of the calibration
curves.
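
The standard error of estimate for a calibration curve can be sketched as below. The divisor n - 2 is the usual choice for a straight-line fit and is an assumption here, since the report does not state it; the calibration data shown are hypothetical and the function name is illustrative.

```python
from math import sqrt


def standard_error_of_estimate(conc, chart):
    """Least-squares regression of chart readings on concentration.

    Returns (slope, intercept, standard error of estimate), where the standard
    error is sqrt(sum of squared residuals / (n - 2)), in chart divisions.
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(chart) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, chart))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(conc, chart))
    return slope, intercept, sqrt(sse / (n - 2))


# Hypothetical calibration gases (mg/m3) and chart readings (chart divisions)
gas_conc = [0.0, 23.0, 46.0, 69.0, 91.0]
chart_readings = [0.0, 20.5, 40.0, 60.5, 79.0]
slope, intercept, see = standard_error_of_estimate(gas_conc, chart_readings)
print(round(slope, 3), round(intercept, 2), round(see, 2))
```

A value near zero indicates readings that lie almost exactly on a straight line; a single misassigned gas concentration inflates the value for the whole curve.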
      In order to provide individual comparisons, the same analysis was
performed on the reference samples and their respective chart readings as
though they were actually calibration gases. These are also shown in
Table B-IX. Inspection reveals some unusually large values in the
calibration gas data. In the majority of cases, the standard error of
estimate for the calibration gases is larger than the corresponding value
for the reference samples. It is evident that the calibration gases are more
variable than the reference samples; the question remains, to what can this
variability be ascribed?

        TABLE B-IX. STANDARD ERRORS OF ESTIMATE FOR CALIBRATION CURVES
           PREPARED FROM CALIBRATION GASES AND FROM REFERENCE GASES

Laboratory                            Water Vapor     Calibration    Standard Error of Estimate
Code Number  Instrument*  Range†      Compensation‡   Gas Source*    Calibration    Reference
571          A            0-58        a               A              1.0**          1.2**
370          B            0-58        b               B              1.0**          1.1**
375          A            0-58        a&d             B              2.5**          0.7**
799          A            0-58        a&d             C              4.3**          0.9**
310          A            0-58        b               D              1.8**          0.9**
311          A            0-58        d               D              1.2**          1.2**
222          C            0-116       b               E              1.2††          1.9††
220          D            0-116       b               A              0.7††          0.2††
253          B            0-116       a               A              0.4††          0.5††
540          A            0-116       b               A              0.4††          0.3††
860          B            0-116       a               A              0.0††          0.5††
920          A            0-116       a               B              1.6††          0.5††
780          A            0-116       d               C              3.7††          0.4††
927          B            0-116       a               F              1.3††          0.2††
270          A            0-116       b               E              2.4††          0.5††

 *Coded to obscure identity.
 †Milligrams per cubic meter.
 ‡See Section 3.1 of method in Appendix A.
**Chart divisions; 1 chart division is approximately equivalent to 0.6 mg/m3.
††Chart divisions; 1 chart division is approximately equivalent to 1.2 mg/m3.

      To explore this matter further, each calibration
gas was "analyzed," using its respective chart reading
and the "calibration curve" prepared from reference
samples. Such a treatment corresponds to giving the
collaborator  the  concentrations  of  the  reference
samples and asking him to prepare a calibration curve
from them and then analyze his own calibration gases
as if  they were unknown  samples. This is most in-
formative since these "analytical results" may be
compared with the specified value  for the calibration
gases.  Unusually large differences  would point to  a
suspicious calibration gas.
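
The procedure described above can be sketched as follows: fit a straight line through the reference-sample points, read each calibration gas back through that line, and compare the result with the quoted concentration. All numbers, the 5-percent flagging threshold, and the function names are hypothetical; the report does not give the chart readings.

```python
def fit_line(conc, chart):
    """Least-squares line chart = intercept + slope * conc; returns (slope, intercept)."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(chart) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, chart))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def analyze(chart_reading, slope, intercept):
    """Concentration implied by a chart reading on the reference-sample curve."""
    return (chart_reading - intercept) / slope


def percent_difference(analyzed, quoted):
    """Analyzed minus quoted value, as a percentage of the quoted value."""
    return 100.0 * (analyzed - quoted) / quoted


# Hypothetical reference samples (mg/m3) and their chart readings (divisions)
reference_conc = [10.0, 30.0, 50.0]
reference_chart = [8.0, 24.0, 40.0]
slope, intercept = fit_line(reference_conc, reference_chart)

# Hypothetical calibration gas quoted at 23 mg/m3 that read 20.0 chart divisions
analyzed = analyze(20.0, slope, intercept)
difference = percent_difference(analyzed, 23.0)
suspicious = abs(difference) > 5.0  # illustrative threshold only
print(round(analyzed, 1), round(difference, 1), suspicious)
```

In this made-up case the gas analyzes higher than its quoted value by several percent, the kind of result that would single it out for closer inspection.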
     Following this procedure, some of the large standard errors of estimate
can be explained by one suspicious calibration gas. Some notable examples of
differences of more than 10 percent are (1) Laboratory 799, a higher value
than quoted for the 23 mg/m3 (20 ppm) calibration gas, (2) Laboratory 270, a
much lower value than quoted for the 91 mg/m3 (80 ppm) calibration gas,
(3) Laboratory 780, a lower value than quoted for the 46 mg/m3 (40 ppm)
calibration gas, and (4) Laboratory 927, a lower value than quoted for the
23 mg/m3 (20 ppm) calibration gas. Other cases of high standard errors of
estimate could not be attributed to a single suspicious calibration gas.
Several other cases of differences of 10 percent were observed along with
numerous cases of 5-percent differences. Absolute magnitudes of "analyzed"
values minus quoted values ranged from -10.2 to +3.3 mg/m3.

     It should be noted, however, that some laboratories did not use
suspicious calibration points in computing their results for the reference
samples and preferred to use their judgment based on the calibration curve
as a whole, sometimes drawing nonlinear calibration curves. While this
practice can often minimize inaccuracy, it can also lead to worse
situations. For instance, Laboratory 780, upon inspection of its calibration
curve, thought the 91-mg/m3 (80 ppm) point to be out of line and chose not
to use it when actually the 46-mg/m3 (40 ppm) calibration gas was at fault.

     Utmost care should be taken in obtaining high-quality calibration gases
and protecting them from deterioration. If smooth calibration curves are not
obtained, calibration gases may be at fault and should be replaced.

-------
          LIST OF REFERENCES

 1.  ASTM Manual for Conducting an Interlaboratory Study of a Test Method,
     ASTM STP No. 335, Am. Soc. Testing & Mats. (1963).

 2.  1971 Annual Book of ASTM Standards, Part 30, Recommended Practice for
     Developing Precision Data on ASTM Methods for Analysis and Testing of
     Industrial Chemicals, ASTM Designation: E180-67, pp 403-422.

 3.  Mandel, John, The Statistical Analysis of Experimental Data, John Wiley
     & Sons, New York, Chapter 13, pp 312-362 (1964).

 4.  Mandel, J., and Lashof, T.W., "The Interlaboratory Evaluation of Testing
     Methods," ASTM Bulletin, No. 239, pp 53-61 (1959).

 5.  Mandel, John, "The Measuring Process," Technometrics, 1, pp 251-267
     (1959).

 6.  Dixon, Wilfred J., and Massey, Frank J., Jr., Introduction to
     Statistical Analysis, McGraw-Hill Book Company, Inc., New York,
     Chapter 10, pp 180-181 (1957).

 7.  Ibid, Chapter 8, pp 109-110.

 8.  Ibid, Chapter 9, pp 112-129.

 9.  Mandel, John, "Repeatability and Reproducibility," Materials Research
     and Standards, Am. Soc. Testing & Mats., Vol 11, No. 8, p 8
     (August 1971).

10.  Duncan, Acheson J., Quality Control and Industrial Statistics, Third
     Edition, Richard D. Irwin, Inc., Homewood, Illinois, Chapter XXXI,
     pp 632-636 (1965).

11.  Ibid, p 909.

12.  Bennett, Carl A., and Franklin, Norman L., Statistical Analysis in
     Chemistry and the Chemical Industry, John Wiley and Sons, Inc.,
     New York, Chapter 4, p 111 (1954).

13.  Ibid, pp 185-189.

14.  1971 Annual Book of ASTM Standards, op cit, p 411.

15.  Dixon and Massey, op cit, p 120.

16.  Ibid, pp 114-115.

17.  Ibid, pp 123-124.

18.  Mandel, John, The Statistical Analysis of Experimental Data, John Wiley
     & Sons, New York, Chapter 6, pp 104-105 (1964).

-------
         APPENDIX C
TABULATION OF ORIGINAL DATA

-------
TABLE C-I-a. OBSERVED VALUES FOR DRY SAMPLES FOR COLLABORATIVE
 TEST OF CARBON MONOXIDE METHOD, MILLIGRAMS PER CUBIC METER

Laboratory
Code Number        Day 1                 Day 2                 Day 3

                           Low Concentration
220           8.4   8.4   8.4       8.4   8.4   8.6       8.6   8.6   8.6
222           6.9   6.9   6.9       6.9   6.9   6.9       6.9   6.9   6.9
253           8.0   8.0   8.0       9.2   9.2   9.2       8.0   8.0   8.0
270           8.0   6.9   6.9       9.2   8.0   6.9       9.2   9.2   8.0
310           9.4   8.8   8.8       8.9   9.2   9.2       9.4   8.8   9.2
311           8.9   9.3   9.0       8.7   8.6   8.6       9.4   9.6   9.6
370           8.2   8.2   8.2       8.6   8.6   8.6       8.0   8.0   8.0
375           7.3   7.3   7.3       7.2   7.2   7.1       7.6   7.8   7.7
540           8.5   8.5   8.7       9.0   9.0   8.8       8.9   8.9   8.8
571           8.2   8.2   8.2       8.6   8.6   8.7       8.9   8.9   9.0
780           8.2   8.0   8.1       8.9   8.6   8.8       8.0   8.5   8.2
799           7.7   8.2   7.7       8.6   8.2   8.0       8.0   8.0   8.0
860           7.4   7.4   7.4       7.4   7.4   7.4       7.4   7.4   7.4
920           8.0   8.0   8.0       8.0   8.0   8.0       8.0   8.0   8.0
927           8.9   8.9   8.9       8.9   8.9   8.9       8.9   8.9   8.9

                       Intermediate Concentration
220          30.5  30.5  30.6      30.1  30.4  30.4      30.5  30.5  30.5
222          28.6  28.6  28.6      29.2  29.2  28.6      28.6  29.8  28.6
253          30.9  30.9  31.5      32.1  32.1  32.1      30.9  30.9  30.9
270          29.8  30.9  30.9      32.1  30.9  30.9      32.1  32.1  30.9
310          31.2  31.2  31.5      31.5  30.9  31.2      31.2  31.2  31.5
311          31.5  31.4  31.3      31.5  31.4  31.5      30.9  31.0  31.0
370          30.6  30.6  30.6      30.6  30.6  30.6      30.4  29.8  29.8
375          30.2  30.2  30.4      29.6  29.6  29.6      30.9  30.9  30.6
540          30.6  30.1  30.1      30.5  30.7  30.4      30.5  30.6  30.5
571          29.3  29.2  29.8      30.4  30.4  29.8      30.7  30.4  30.7
780          30.9  30.6  30.5      30.7  30.5  30.4      30.2  30.6  30.6
799          29.9  29.9  29.9      29.8  29.2  29.4      31.2  30.9  31.2
860          30.4  30.4  30.4      30.4  30.4  30.4      30.4  30.4  30.4
920          29.8  30.9  29.8      32.1  32.1  32.1      32.1  32.1  32.1
927          32.1  32.1  32.1      32.1  32.1  32.1      32.1  32.1  32.1

                           High Concentration
220          53.3  53.3  52.9      52.7  52.1  52.1      53.3  53.3  53.3
222          52.7  52.7  52.7      53.3  53.3  52.7      53.3  52.7  54.4
253          53.8  53.8  53.8      53.8  53.8  53.8      53.8  53.8  53.8
270          59.6  55.0  55.0      56.1  55.0  55.0      55.0  55.0  55.0
310          55.6  55.6  55.6      55.6  55.2  55.0      55.8  56.1  55.8
311          53.4  53.3  53.4      54.3  54.4  54.2      53.6  53.3  53.6
370          54.4  54.4  54.4      54.2  54.2  54.2      53.5  53.3  53.5
375          54.1  54.1  54.1      53.2  53.4  53.6      54.2  54.2  54.3
540          53.3  53.0  53.0      52.5  53.3  52.7      52.9  52.9  52.8
571          49.6  49.6  49.8      51.0  50.7  50.7      51.5  52.1  51.5
780          52.7  52.8  52.3      52.9  52.6  52.7      53.2  53.0  53.3
799          52.9  53.3  53.3      54.4  54.2  54.4      55.0  54.6  54.6
860          52.7  52.7  52.7      52.7  52.7  52.7      52.7  52.7  52.7
920          53.8  53.8  53.8      55.0  55.0  55.0      55.0  55.0  55.0
927          55.0  55.0  55.0      55.0  55.0  55.0      55.0  55.0  55.0

-------
TABLE C-I-b. OBSERVED VALUES FOR HUMIDIFIED SAMPLES FOR COLLABORATIVE TEST OF
           CARBON MONOXIDE METHOD, MILLIGRAMS PER CUBIC METER

Laboratory
Code Number        Day 1                 Day 2                 Day 3

                           Low Concentration
220           8.6   8.6   8.6       8.6   8.7   8.6       8.7   8.7   8.6
222           7.4   6.3   7.4       7.4   7.4   7.4       6.9   7.4   7.4
253           8.0   8.0   8.0       9.2   9.2   8.6       8.0   8.0   8.0
270           9.2   6.9   6.9       8.0   8.0   8.0       8.0   8.0   9.2
310           8.6   8.0   8.2       8.9   8.7   8.7       8.8   8.8   8.8
311           9.7   9.5   9.4       9.3   9.3   9.3       9.7   9.4   9.9
370           8.6   8.2   8.2       8.6   8.6   8.6       8.0   8.0   8.2
375           7.2   7.2   7.3       7.3   7.2   7.2       7.8   7.7   7.7
540           8.6   8.5   8.6       8.6   9.0   8.8       8.6   8.7   8.6
571           7.8   7.8   8.2       8.6   8.6   8.5       8.2   8.9   8.9
780          11.9  12.1  12.0      12.5  12.3  12.0      12.0  11.8  11.8
799           8.2   8.2   8.2       8.0   8.0   8.0       9.7   8.6   8.6
860           7.4   7.4   7.4       7.4   7.4   7.4       7.4   7.4   7.4
920           8.0   8.0   8.0       8.0   8.0   8.0       8.0   8.0   8.0
927           8.9   8.9   8.9       8.9   8.9   8.9       8.9   8.9   8.9

                       Intermediate Concentration
220          30.4  30.4  30.4      30.4  30.2  30.1      30.5  30.5  30.5
222          28.6  28.1  29.0      29.2  30.4  29.8      27.5  29.8  26.9
253          30.9  30.9  30.9      32.1  32.1  32.1      30.9  30.9  30.9
270          33.2  30.9  30.9      32.1  32.1  32.1      32.1  32.1  30.9
310          30.0  30.4  30.4      30.6  30.9  30.4      31.5  31.2  31.5
311          31.3  31.2  31.2      31.2  31.2  31.2      30.8  30.9  31.0
370          30.6  30.6  30.7      30.9  30.7  30.6      30.1  29.8  29.8
375          29.3  29.7  29.7      29.7  29.6  29.6      30.5  30.5  30.2
540          30.1  30.0  30.0      30.5  30.4  30.7      30.5  30.4  30.5
571          29.2  29.2  29.3      30.1  29.8  29.8      30.4  30.4  30.1
780          33.7  33.4  33.0      33.7  33.4  33.7      32.9  32.9  32.6
799          30.6  29.9  28.9      29.8  29.2  28.6      30.9  30.9  31.2
860          30.4  30.4  30.4      30.4  30.4  30.4      30.4  30.4  30.4
920          29.8  30.9  30.9      30.9  32.1  32.1      32.1  32.1  32.1
927          32.1  32.1  32.1      32.1  32.1  32.1      32.1  32.1  32.1

                           High Concentration
220          53.4  53.4  53.5      52.7  52.1  52.1      52.9  52.7  52.7
222          52.7  52.7  52.7      53.3  53.8  55.0      53.3  53.3  53.3
253          53.8  53.8  53.8      55.0  54.4  55.0      53.8  54.4  54.4
270          58.4  55.0  55.0      55.0  55.0  56.1      55.0  55.0  55.0
310          54.1  53.8  53.8      54.6  54.4  54.6      55.2  55.6  55.2
311          52.7  52.9  52.8      53.4  53.5  53.4      52.9  52.9  52.8
370          54.4  54.4  54.4      54.2  54.3  54.2      53.3  53.3  53.5
375          53.2  54.1  53.0      53.4  53.4  53.2      53.0  53.4  53.2
540          53.3  53.3  53.3      52.7  52.8  52.9      53.2  52.8  52.9
571          49.5  49.6  48.9      50.4  50.1  50.7      51.0  50.6  51.0
780          54.3  54.3  54.3      54.5  54.5  54.5      54.6  54.4  54.0
799          52.3  52.1  52.7      51.0  50.7  51.0      54.6  54.3  54.3
860          52.7  52.7  52.7      52.7  52.7  52.7      52.7  52.7  52.7
920          53.8  53.8  53.8      55.0  55.0  53.8      53.8  53.8  53.8
927          55.0  55.0  55.0      55.0  55.0  55.0      55.0  55.0  55.0

-------
TABLE C-II. REFERENCE VALUES FOR CARBON MONOXIDE TEST
     CONCENTRATIONS USED IN COLLABORATIVE TEST,
            MILLIGRAMS PER CUBIC METER

Laboratory       Low              Intermediate     High
Code Number      Concentration    Concentration    Concentration
220               8.4             29.9             52.6
222               8.6             30.2             52.1
253               8.5             30.1             52.3
270               8.4             29.9             52.2
310               8.5             29.8             52.2
311               8.6             29.8             52.6
370               8.6             29.9             52.1
375               8.2             30.2             52.3
540               8.5             29.9             52.3
571               8.5             30.0             52.3
780               8.4             29.8             52.2
799               8.4             30.2             52.3
860               8.2             29.9             52.3
920               8.6             29.8             52.3
923               8.6             30.1             52.3
927               8.4             30.1             52.3

-------
       NOTATION
Foldout for Ready Reference

(a)  Principal Variables: (may  also be used as sub-
      scripts)


     y       =  measurements,
     L       =  laboratories,
     M      =  materials or concentrations,
     D       =  test days, and
      e       =  replication errors.

(b)   Qualifying Subscripts:

      i        =  a particular laboratory,
     j        =  a particular material,
      k       =  a particular test day, and
      m       =  a particular replication error.

(c)  Number of Levels of Variables:

     p       =  number of laboratories,
     q       =  number of materials,
     w       =  number of test days, and
      n       =  number of replicates.

(d)  Statistical Notation:

              =  the reference value for the jth material for the ith
                 laboratory,
              =  the average of all $c_{ij}$ for material j,
              =  average of all replicates by laboratory i on material j
                 on day k, and
              =  average of all $y_{ijk}$ by laboratory i on material j.

(e)  Measures of Variability:

     $\sigma$ =  population standard deviation,
     s        =  sample estimate of $\sigma$,
     R        =  range (largest measurement minus smallest measurement), and
     V(y)     =  variance of random variable y.

(f)  Analysis of Variance:

     DF       =  degrees of freedom,
     SS       =  sum of squares,
     MS       =  mean square, and
     E(MS)    =  expected value of mean square.

(g)  Regression Analysis:

     x        =  independent variable,
     y        =  dependent variable,
              =  slope of a straight line,
              =  residual (observed value minus fitted value), and
              =  correlation coefficient.

(h)  Linear Model Analysis:

     $x_j$        =  average of all $y_{ij}$ for material j,
     $\bar{x}$    =  average of all $x_j$,
     $\alpha$     =  the slope of the line $\beta_i$ versus $\mu_i$,
     $\beta_i$    =  slope of the line $y_{ij}$ versus $x_j$,
     $\gamma_j$   =  $x_j - \bar{x}$,
     $\delta_i$   =  scatter of the ith point about the line $\beta_i$
                     versus $\mu_i$,
     e            =  replication error,
     $\eta_{ij}$  =  scatter of the jth point for the ith laboratory about
                     the line $y_{ij}$ versus $x_j$,
     $\lambda$    =  that part of $\eta$ which is not accounted for by e, and
     $\mu_i$      =  average of all $y_{ij}$ for laboratory i.

(i)   Qualifying Superscripts:

     $\hat{\ }$        =  a sample estimate of a population parameter,
     $\bar{\ }$        =  a mean, and
     $\bar{\bar{\ }}$  =  a mean of means.

(j)   Hypothesis Testing:

     g           =  number of items from which range is obtained,
     N           =  number of cases from which a mean is computed,
     q           =  a variable that has a studentized range distribution,
     s           =  independent estimate of standard deviation,
     $t_\alpha$  =  the $\alpha$ point of a t distribution,
     z           =  a variable that has a normal distribution with zero mean
                    and unit standard deviation,
     $\alpha$    =  level of significance,
     $\mu$       =  mean of a universe,
     $\mu_0$     =  hypothetical value of $\mu$ that is being tested,
     $\nu$       =  degrees of freedom, and
     $\sigma$    =  population standard deviation.

-------