Environmental Monitoring Series
DEVELOPMENT OF A SYSTEM FOR CONDUCTING
INTER-LABORATORY TESTS
FOR WATER QUALITY
AND EFFLUENT MEASUREMENTS
Environmental Monitoring and Support Laboratory
Office of Research and Development
U.S. Environmental Protection Agency
RESEARCH REPORTING SERIES
Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into nine series. These nine broad cate-
gories were established to facilitate further development and application of en-
vironmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The nine series are:
1. Environmental Health Effects Research
2. Environmental Protection Technology
3. Ecological Research
4. Environmental Monitoring
5. Socioeconomic Environmental Studies
6. Scientific and Technical Assessment Reports (STAR)
7. Interagency Energy-Environment Research and Development
8. "Special" Reports
9. Miscellaneous Reports
This report has been assigned to the ENVIRONMENTAL MONITORING series.
This series describes research conducted to develop new or improved methods
and instrumentation for the identification and quantification of environmental
pollutants at the lowest conceivably significant concentrations. It also includes
studies to determine the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical Informa-
tion Service, Springfield, Virginia 22161.
EPA-600/4-77-031
June 1977
DEVELOPMENT OF A SYSTEM FOR CONDUCTING
INTER-LABORATORY TESTS FOR WATER QUALITY AND
EFFLUENT MEASUREMENTS
by
Arthur C. Green
Robert Naegele
FMC Corporation
Advanced Products Division
San Jose, California 95108
Contract 68-03-2115
Project Officer
Terry C. Covert
Environmental Monitoring and Support Laboratory
Cincinnati, Ohio 45268
ENVIRONMENTAL MONITORING AND SUPPORT LABORATORY
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
CINCINNATI, OHIO 45268
DISCLAIMER
This report has been reviewed by the Environmental Monitor-
ing and Support Laboratory, U.S. Environmental Protection Agency
and approved for publication. Approval does not signify that
the contents necessarily reflect the views and policies of the
U.S. Environmental Protection Agency, nor does mention of trade
names or commercial products constitute endorsement or recom-
mendation for use.
FOREWORD
Environmental measurements are required to determine the
quality of ambient waters and the character of waste effluents.
The Environmental Monitoring and Support Laboratory - Cincinnati,
conducts research to:
• Develop and evaluate techniques to measure the presence
and concentration of physical, chemical, and radiologi-
cal pollutants in water, wastewater, bottom sediments,
and solid waste.
• Investigate methods for the concentration, recovery, and
identification of viruses, bacteria and other micro-
biological organisms in water. Conduct studies to
determine the responses of aquatic organisms to water
quality.
• Conduct an Agency-wide quality assurance program to
assure standardization and quality control systems
for monitoring water and wastewater.
In carrying out its legislated mandates, U.S. EPA requires
water quality and effluent monitoring data from a broad spectrum
of laboratory and field operations—federal, state, local,
contract and private. The quality (precision and accuracy) of
the data generated by the various monitoring activities must be
known if EPA is to use these data to assess pollution trends,
set standards, verify compliance with regulations, and conduct
enforcement actions.
In order for EPA to significantly improve its capability to
assess the validity of the data it receives and uses, its inter-
laboratory testing program must be substantially expanded. This
report developed by FMC Corporation contains a formalized system
to assure that all laboratories and all necessary measurements
are continually evaluated as to their performance and reliabil-
ity. The report contains a systematic plan for conducting
interlaboratory tests for water pollution measurements and
establishes their relationship to the overall external quality
control evaluation program.
This report is not an official EPA manual. Rather, it is a
research report that is but one of a series being used as an
input to develop EPA Manuals and Guidelines.
DWIGHT G. BALLINGER, Director
Environmental Monitoring &
Support Laboratory/Cincinnati
ABSTRACT
FMC Corporation has developed a system for evaluating water
pollution data and the laboratories which produce these data.
The system consists of a plan for the design and implementation
of an interlaboratory test program. A pilot test program was
included to evaluate and to verify the complete program.
Investigations of ongoing interlaboratory testing programs
were conducted, and deficiencies were identified in their design
and in the procedures by which they were conducted. The conclu-
sions and recommendations presented in the report are supported
by an extensive literature review of previous interlaboratory
tests and their methods for experimental design and test data
analyses. Additionally, 18 EPA, state and private laboratories
were visited to receive their comments regarding difficulties
and deficiencies in interlaboratory test programs in general.
This report was submitted in fulfillment of Contract No.
68-03-2115 by FMC Corporation under the sponsorship of the U.S.
Environmental Protection Agency. This work covers a period from
July 16, 1974, to April 15, 1976.
CONTENTS
Page
Foreword iii
Abstract iv
Figures vi
Tables vii
Abbreviations and Symbols viii
Acknowledgements x
I Introduction 1
II Summary 4
III Conclusions 5
IV Recommendations 7
V Literature Survey 8
VI Field Investigation 29
VII Data Analysis and Evaluation 34
VIII Interlaboratory Test - Program Plan 46
IX Pilot Program 56
X References 89
XI Interlaboratory Test Programs - Bibliography 98
XII Appendix 127
FIGURES
Number Page
5-1 Percent of Insoluble Residue 17
7-1 Information Flow Model 35
7-2 Error Diagram 37
7-3 Sources of Error in Measurement 38
7-4 Normality Test and Outliers Treatment 43
8-1 Inter-Laboratory Test Program 48
9-1 Youden's Plot, Cu, Samples 1 & 4 70
9-2 Youden's Plot, Cu, Samples 2 & 3 71
9-3 Youden's Plot, Cu, Samples 5 & 6 72
9-4 Youden's Plot, Zn, Samples 1 & 4 73
9-5 Youden's Plot, Zn, Samples 2 & 3 74
9-6 Youden's Plot, Zn, Samples 5 & 6 75
9-7 Relative Errors Distribution, Cu 77
9-8 Relative Errors Distribution, Zn 78
9-9 Thompson's Ranking Scores for 16 Laboratories 85
9-10 Mean Score of (M-n) Laboratories 86
TABLES
Number Page
5-1 Median Values of Sample Constituents (Table I of Reference
46) 19
5-2 Water-Insoluble Nitrogen Results (Table 6 of Reference 1).... 20
5-3 Data for Two Different Methods 24
5-4 Eight Combinations of Seven Factors Used to Test Ruggedness
of an Analytical Method (Table 8 of Reference 1) 26
5-5 Measurement of H2O in Phosphoric Acid 27
6-1 Agencies Visited During Field Investigation 32
7-1 Lab Training and Data Evaluation 45
9-1 Sample Compositions for Pilot Test Program 57
9-2 Basic Study Data for EPA Method (µg/l) 60
9-3 Sample Statistics: Sample 1 63
9-4 Sample Statistics: Sample 2 64
9-5 Sample Statistics: Sample 3 65
9-6 Sample Statistics: Sample 4 66
9-7 Sample Statistics: Sample 5 67
9-8 Sample Statistics: Sample 6 68
9-9 Results of Thompson's System 80
9-10 Results of Youden's Ranking 82
12-1 Standard Methods for Chemical Analysis of Water: List of
Approved Test Procedures 128
LIST OF ABBREVIATIONS AND SYMBOLS
( 1) μ = True mean (the expected value of a population X:
     μ = E[X]).

( 2) σ² = True variance (the expected value of the square of the
     difference between X and μ: σ² = E[(X − μ)²]).

( 3) X̄ = Sample mean (X̄ = (1/n) Σ X_i, where X_i, i = 1, 2, ..., n,
     are the results and n is the number of results).

( 4) M = Median (halfway point in the results when they have been
     arranged in order of magnitude: the middle result of an
     odd number of results, or the average of the middle two
     for an even number).

( 5) Accuracy (the correctness of a measurement, or the degree
     of correspondence between the results and the true
     value (actual amount added)).

( 6) Precision (the reproducibility of sample results, or the
     degree of agreement among the results).

( 7) E_m = Mean error (the average difference, with regard to sign,
     between the results and the true value; equivalently,
     the difference between the mean of the results and the
     true value (T.V.): E_m = X̄ − T.V.).

( 8) E_r = Relative error (the mean error expressed as a percentage
     of the true value: E_r = [(X̄ − T.V.)/T.V.] × 100).

( 9) S² = Sample variance (the sum of squared differences between
     the measurements and the sample mean X̄, divided by n − 1,
     where n is the number of results).

(10) S = Sample standard deviation (the square root of the sample
     variance).

(11) SD_r = Relative standard deviation (also called coefficient
     of variation; the sample standard deviation normalized by
     the sample mean: SD_r = (S/X̄) × 100).

(12) R = Range (the difference between the largest and smallest
     results in the measurements).

(13) t = Student's t statistic (t = √n (X̄ − μ)/S).

(14) UCL = Upper confidence limit (the limit below which the true
     mean μ will lie with probability 1 − α, where α is the
     probability that the UCL does not bound the true mean:
     UCL = X̄ + t(α/2)·S/√n, where t(α/2) is the upper α/2 point
     of Student's t distribution).

(15) LCL = Lower confidence limit (the lower counterpart of the
     UCL: LCL = X̄ − t(α/2)·S/√n).

(16) TL = Tolerance limits (limits within which one can state that
     a proportion P of the entire population will lie. The upper
     and lower tolerance limits are given by X̄ ± KS, where K is
     the factor for two-sided tolerance limits for normal
     populations; the value of K depends upon the chosen values
     of the confidence coefficient γ and the proportion P).
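The definitions above translate directly into short computations.
The following minimal sketch (in Python, a modern notation that is
not part of the original report; the sample results, true value,
and t value are hypothetical) computes the quantities defined in
items (3) and (7) through (15):

    import math

    def summarize(results, true_value, t_crit):
        # t_crit is the upper alpha/2 point of Student's t for n - 1
        # degrees of freedom, taken from a t table by the caller.
        n = len(results)
        mean = sum(results) / n                                # X-bar, item (3)
        var = sum((x - mean) ** 2 for x in results) / (n - 1)  # S^2, item (9)
        sd = math.sqrt(var)                                    # S, item (10)
        half_width = t_crit * sd / math.sqrt(n)
        return {
            "mean": mean,
            "mean error": mean - true_value,                   # E_m, item (7)
            "relative error": (mean - true_value) / true_value * 100,  # E_r
            "std deviation": sd,
            "relative std deviation": sd / mean * 100,         # SD_r, item (11)
            "range": max(results) - min(results),              # R, item (12)
            "UCL": mean + half_width,                          # item (14)
            "LCL": mean - half_width,                          # item (15)
        }

    # Hypothetical example: five determinations of a 50 ug/l sample;
    # t_crit = 2.776 is the upper 2.5% point for 4 degrees of freedom.
    print(summarize([48.0, 52.5, 49.1, 51.2, 50.3], 50.0, 2.776))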
ACKNOWLEDGMENTS
The authors wish to convey their appreciation to Mr. Terry
Covert, Project Officer, as well as Mr. Paul Britton and Mr.
Edward Berg, all of the Environmental Monitoring and Support
Laboratory, Cincinnati, Ohio, who have worked closely with us
in bringing this project to a successful completion.
We wish to acknowledge the invaluable assistance furnished
during our field investigation by Mr. James H. Finger, EPA,
Region IV, Athens, Georgia; Mr. David Payne, EPA, Region V,
Chicago, Illinois; Mr. Thompson, EPA-NERC, Research Triangle
Park, North Carolina; and Mr. William Kelley, National Institute
of Occupational Safety and Health, Cincinnati, Ohio.
The authors express their gratitude to the many Federal,
state, and private laboratories whose advice and recommenda-
tions are incorporated into this project.
SECTION I
INTRODUCTION
NATURE OF THE PROBLEM
The role of the analytical laboratory is to provide qualita-
tive and quantitative data that accurately describe the character-
istics or the concentration of constituents in the sample sub-
mitted to the laboratory.
On the basis of the laboratory data, far-reaching decisions
are often made. Water quality standards are set to establish
satisfactory conditions for a given water use. Legal action is
required by pollution control authorities when laboratory results
indicate a violation of the standard. In wastewater analyses,
the laboratory data define the treatment plant influent, the ef-
fectiveness of the treatment process, and the final load imposed
upon the receiving water resources. Decisions on process changes,
plant modification, or even the construction of a new facility
may be based upon the results of laboratory analyses. The value
and progress of research and development efforts depend, to a
large measure, upon the validity of the laboratory results. In
many cases, the protection of public health and the preservation
of the nation's environmental resources are dependent upon the
accuracy of laboratory analyses.
Because of the importance of laboratory analyses and the re-
sulting actions which they produce, a program to insure the re-
liability of data is essential. An established routine control
program applied to every analytical test is important in assuring
the reliability of the final results. Furthermore, it is criti-
cal that analytical results between individual laboratories be
accurate and precise. The additional variance between labora-
tories requires an established interlaboratory testing program
to monitor and control individual laboratory performance. Once
this performance is established as acceptable, then comparison
of analytical results between laboratories can be meaningful and
significant. Standardization of methods between cooperating
laboratories is important in order to remove the methodology as
a variable in comparison or joint use of data between laborator-
ies. This is particularly important when laboratories are pro-
viding data to a common data bank or when several laboratories
are cooperating in joint field surveys.
Under the charter of the U.S. Environmental Protection Agency
(EPA), the Office of Research and Development coordinates the
collection of water quality data to determine compliance with
water quality standards, to provide information for planning of
water resources development, to determine the effectiveness of
pollution abatement procedures, and to assist in research acti-
vities. To a large extent, the success of the EPA pollution
control program rests upon the reliability of the information
provided by the data collection activities.
The Environmental Monitoring and Support Laboratory (EMSL),
Cincinnati, is responsible for insuring the reliability of
physical, chemical, biological, and microbiological data gathered
in water treatment and wastewater pollution control activities
of the EPA.
The Quality Assurance Branch, Environmental Monitoring and
Support Laboratory, Cincinnati, presently conducts formal inter-
laboratory studies among EPA laboratories to evaluate methods
selected by EPA for its method manuals. Other federal, state,
university, and industrial laboratories are accommodated in these
round-robin studies on a voluntary basis. The studies carry
deadlines and conclude with reports distributed to all partici-
pants. Reference samples are also furnished without charge to
interested governmental, industrial, commercial, and private
laboratories for their within-laboratory quality control pro-
grams. However, there is no certification or other formal
evaluation function resulting from their use.
Presently the EPA has no system for conducting interlabora-
tory tests to confirm laboratory proficiency. In the absence of
such a system, certain doubts are raised as to the validity of
the results reported by the Agency. Variances between labora-
tories are sources of errors which may have significant effects
on the validity of the final data results.
Laboratories cooperating in joint survey programs or those
providing results to a common data bank, such as STORET*, must
maintain acceptable quality control to insure that analytical
results between laboratories are in good agreement in accuracy
and precision. The variance between laboratories must be held
to an acceptable minimum if the final results are to be
valid.
Because of the importance of water pollution data and the
resulting actions they produce, it is essential that a dynamic
system be developed and implemented by the EPA to conduct
interlaboratory testing for evaluation of water and wastewater
quality data and the laboratories producing these data.

*STORET is the acronym used to identify the computer-oriented
EPA Water Quality Information System for STOrage and RETrieval
of data and information.
OBJECTIVE
The objective of this program is to provide an interlabora-
tory testing program that will be one element of EPA's quality
control evaluation system to be used for objectively evaluating
the ability of an environmental laboratory, under routine con-
ditions, to analyze samples containing unidentified constituents
in varying quantities, and to produce results that have the
desired precision and accuracy for making valid decisions.
SCOPE OF WORK
To achieve this objective efficiently, this program has
been divided into two phases. Phase I involved the investiga-
tion of existing interlaboratory testing programs using a
literature search and review, followed by a field investigation of
Federal, State, and private laboratories. The data were analyzed
and a preliminary program prepared.
Phase II consisted of final program development and a
detailed program plan to be tested for functionality following
the development of a program specification and method for
testing.
SECTION II
SUMMARY
A dynamic system is developed for evaluating water pollution
data and the laboratories which produce these data. The system
consists of a plan for the design and implementation of an inter-
laboratory test program. A pilot test program is included to
evaluate and to verify the complete program.
Investigation of interlaboratory tests conducted in the past
has identified deficiencies in their design and in the procedures
by which they were conducted. These conclusions are listed in
Section III, and the recommendations which follow from them are
listed in Section IV. The conclusions and recommendations are
supported by an extensive literature review (Section V) of pre-
vious interlaboratory tests and their methods for experimental
design and for test data analysis. Additionally, 18 EPA, State
and private laboratory agencies were visited. Obtained by
questionnaire and by personal interview, the comments, critiques,
and suggestions of these agencies have served to identify major
difficulties and deficiencies in interlaboratory test programs
generally, and some specific causes of their failure to yield
conclusive analytical and proficiency data. These field in-
vestigations are described in Section VI.
The interlaboratory test program developed in this study is
presented in Section VIII. The functions and responsibilities of
each agency, namely, the cognizant EPA offices, the Interlabora-
tory Test Program Manager, and the participating laboratories are
defined. Analytical methods and statistical procedures are
specified for sample preparation, for data analysis, and for pro-
ficiency evaluation.
A pilot test program of limited scope, discussed in Section
IX, is developed to test and to validate the experimental de-
sign and statistical analysis methods which have been selected.
Finally, a list of reference literature and publications is
presented. This source material is extensive, and excerpts from
it have been widely used in the body of the report. The con-
tributions of these authors to the field of interlaboratory test-
ing are hereby acknowledged.
SECTION III
CONCLUSIONS
1. Collaborative tests for methods development are well defined
and are in wide use currently. Interlaboratory testing as a func-
tion of laboratory evaluation is under development and subject to
many differing program objectives, designs, and statistical evalua-
tions.
2. Interlaboratory test programs for proficiency evaluation must
be carefully designed and implemented with adequate control pro-
cedures. Otherwise, the resulting data will be difficult to ana-
lyze and interpret, and meaningful conclusions cannot be derived
regardless of the sophistication of the statistical analysis
procedure.
3. For proficiency evaluation to be effective, the number of
participating laboratories should be as large as possible. This
reduces the uncertainty associated with the test data statistics,
and facilitates the differentiation among laboratories exhibiting
nearly equal performance.
4. The interlaboratory test design must provide as large a num-
ber as possible of experimental data points, and these must be
interrelatable (as, for example, in multiple Youden pairs). In
this manner, the masking effects which result from gross errors
inherent in many test methods can be minimized.
5. Prior to implementation, the interlaboratory test design should
be validated, under controlled conditions, in one or two "reference"
laboratories. This provides a target level of performance and per-
mits ranking individual laboratories according to this target
rather than according to the performance of their population at
large.
6. The statistical methods employed in many prior interlaboratory
tests are incomplete to the extent that they usually assume the
data to be normally distributed yet fail to test and prove this
assumption. Furthermore, they fail to derive confidence limits on
sample means, standard deviations and relative ranking of labora-
tories.
7. Proficiency evaluation should not be limited to the analysis
of individual laboratory test data, but should include evalua-
tion of personnel qualifications, laboratory facilities and equip-
ment, and in-house quality control standards.
8. Analytical results obtained in Method Study 7 (Trace Metals)
show a wide variation in accuracy and bias errors, and seven of
the sixteen laboratories performed in an unacceptable manner,
based upon the proposed evaluation system.
9. When one or more laboratories fails to perform all tests as
directed, or when results are reported as "less than x micrograms
per liter", uniform statistics for each element and sample cannot
be derived. However, even in these cases, an accurate assess-
ment of individual laboratory performance can be obtained from
as few as ten or twelve reported results of all the elements and
concentrations to be tested.
10. Participating laboratories should be notified promptly of
the test results and their individual levels of performance.
Conclusions relating to individual performance should take into
account data recording or transcription errors, equipment
limitations, and procedural deficiencies.
SECTION IV
RECOMMENDATIONS
1. Proficiency evaluation programs should be closely integrated
with the interlaboratory test design, to assure that the two
functions are coherent and that the tests yield all information
required for the evaluation.
2. The interlaboratory test design should include samples com-
pounded from all five constituent groups, so that each laboratory
may be evaluated with respect to all types of tests and test
procedures.
3. Laboratories selected for participation should be chosen
from those which routinely and rigorously employ adequate
quality control procedures. Otherwise, reported test data may
be subject to large errors that degrade the subsequent statistical
analysis.
4. The statistical analysis should be as complete as possible
so that timely and accurate program results may be reported to
the participants.
5. Chemical samples to be used for laboratory proficiency tests
should be compounded at concentration levels near those of prior
method studies, or they should be subjected to tests of precision
by two or three referee laboratories, to obtain target standard
deviations at each concentration. These data are required to
evaluate absolute (as opposed to relative) performance of each
laboratory.
6. In order to standardize proficiency evaluation, the statis-
tical analysis procedure adopted by the EPA should be published
in "User's Manual" format for use by any interlaboratory test
program manager involved in water quality measurements.
SECTION V
LITERATURE SURVEY
SURVEY OF EXISTING SYSTEMS AND METHODS (REF. 1-89)
The literature survey encompassed three
major subjects — the evaluations of laboratory methods in the
environmental field (Analytical Reference Service Reports), the
reports concerning the laboratory accreditation program for
industrial hygiene or environmental health laboratories, and
the publications in the field of statistical methods of collabo-
rative experiments. The results of the literature survey are
summarized briefly in the following:
EVALUATION OF LABORATORY METHODS (REF. 1-36)
Interlaboratory test programs for evaluating analytical
methods, such as those conducted by ASTM, AOAC, and IEPA, have
been under development for more than three decades. The use of
these programs for rating individual laboratory performance is
relatively new. Consequently, most of the literature is primarily
concerned with interlaboratory methods evaluation programs. The
objective of these reports is the exchange of information so that
accurate and precise analytical procedures can be agreed upon and
followed by the laboratories involved (Ref. 2-13). Several typical
failures in methods evaluation programs are cited, which arise
from combinations of the following:
• Interlaboratory studies are poorly designed statistically
and are not optimized.
• Data from such programs are not analyzed to determine
which probability distribution pertains; frequently a cor-
rect parametric analysis is applied to an incorrect (non-
normal probability distribution) data set.
• Methods are not subjected to Youden's ruggedness test
procedures before the interlaboratory test, as evidenced
by the gross disparity of data frequently obtained in
such tests.
These reports have covered mainly the physical characteristics of
water and the testing of water quality and pollution. The results
of the tests by the various laboratories are tabulated and plot-
ted as bar graphs with statistical quantities listed (mean,
standard deviation, confidence limits, number of outliers, etc.).
Not all reports give a complete statistical assessment about the
precision and accuracy of the laboratories involved. In addition
to the statistical inadequacies, there are several practical
pitfalls that are as applicable today as they were in 1959 when
Pierson and Fay (Ref. 14) identified them to be:
• Benefits derivable from the program not fully understood
• Chairman not fully qualified for the task and unaware of
some of the requirements
• Objectives not clearly stated and understood
• Improper selection, preparation, or packaging of samples
• Inadequate written instructions from the chairman to the
participants
• Inadequate statistical design
• Participating laboratories are not adequately instructed
about the method prior to participation. Typically, where
initial practice samples are supplied, there is an insuf-
ficient number to provide adequate experience prior to
analyzing the test samples.
• The number of replicates required is frequently inadequate
to determine the intralaboratory errors.
EVALUATION OF LABORATORY PROFICIENCY
Proficiency evaluation and certification programs for a variety
of specialized analytical laboratories are presently in operation or
proposed by agencies of the Federal Government.
Typically, these programs contain three elements:
• Documentation submitted by laboratories
Personnel qualification and duties
Quality assurance program
Standard analytical methods specified
Facilities and equipment
Records maintained
• Site visits
Periodic visits are made to the laboratory by
specialists who review the documentation and records
and observe laboratory personnel performing analyses.
• Proficiency testing
Laboratories participate in interlaboratory test
programs on a regular basis. Current State-of-the-
art is best described in documentation developed
in promulgating these programs.
Proficiency evaluation and/or certification programs include:
U.S. Department of Health, Education and Welfare - Public
Health Service
• Center for Disease Control - Atlanta, Georgia
A formal program for proficiency testing and accreditation
of clinical laboratories has been in force under the
auspices of the Clinical Laboratories Improvement Act of
1967. In this program the interlaboratory test results
are divided into three ranges. The laboratories whose
results lie within the first and narrowest range are given
a score of 3. The laboratories in the second narrowest
range are scored 2; the laboratories in the third and the
widest range are scored 1. Finally, the ones outside the
widest range are given -1. The passing score is 1. The
laboratories which fail are warned and asked to correct
their measuring procedures. The limits used in determin-
ing the three ranges are based on three factors: (1) the
central 95% of all laboratories under test, (2) reference
lab values representing the true values, and (3) the
clinical requirement, a percentage of the median of
reference lab values. This program also adopts the histo-
gram and chi-square tests as a two-level technique for testing
normality. To test for significant differences among the
laboratories, the program uses a shortcut method in the
analysis of variance.
• Food and Drug Administration - Cincinnati, Ohio
Under the Grade "A" Pasteurized Milk Ordinance (1965), the
Food and Drug Administration conducts a performance evalu-
ation and certification program for state central milk
laboratories, which in turn certify official, commercial, and
dairy-industry laboratories in the individual states. The
approval of the milk laboratories is based on testing done
twice a year, which includes two kinds of testing programs:
(a) Laboratory Survey Program (inspections of facilities,
procedures, results, and records), (b) Split Sample Pro-
gram (a minimum of 10, preferably 12, split samples being
analyzed by each laboratory to show their accuracy as
well as their precision). The statistical evaluation
method includes the following steps:
(1) take the log of the viable counts and assume log
normality.
(2) calculate the average as the estimated mean, then
reject the counts beyond the 3σ range (σ is assumed
known).
(3) recompute the mean and the 1.3σ range; note the
laboratories that are outside the 1.3σ range (75%
of the normal samples).
Presumably the laboratories that are consistently outside
the 1.3σ range should be informed of their deficiencies.
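As an illustration only (the following sketch, in Python, is not
part of the program's own documentation; the plate counts and the
assumed σ are hypothetical), the three screening steps might be
coded as:

    import math

    def screen_counts(counts, sigma):
        # Step (1): take logs of the viable counts, assuming log normality.
        logs = [math.log10(c) for c in counts]
        # Step (2): estimate the mean and reject counts beyond 3*sigma.
        mean = sum(logs) / len(logs)
        kept = [x for x in logs if abs(x - mean) <= 3 * sigma]
        # Step (3): recompute the mean and flag the laboratories outside
        # the 1.3*sigma range, to be informed of their deficiencies.
        mean2 = sum(kept) / len(kept)
        flagged = [i for i, x in enumerate(logs)
                   if abs(x - mean2) > 1.3 * sigma]
        return mean2, flagged

    # Ten hypothetical laboratory counts; sigma is assumed known.
    print(screen_counts([9800, 11200, 10400, 9600, 30500,
                         10100, 9900, 10800, 10300, 9700], 0.08))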
• National Institute for Occupational Safety and Health
(NIOSH) - Cincinnati, Ohio
NIOSH is sponsoring a program being developed by the
American Industrial Hygiene Association for accreditation
of Industrial Hygiene Laboratories. According to the
agreement between AIHA and NIOSH (National Institute for
Occupational Safety and Health), the accreditation program
calls for the laboratories to participate in the PAT 12, 13
(Proficiency Analytical Testing) Program so that the stand-
ards of the testing techniques are met. As a key element
in the accreditation program, PAT program itself has been
under study constantly; for example, NIOSH is developing
a parametric testing method assuming log normality versus
the ranking tests now in use.
In addition, this agency is actively conducting programs
for improving laboratory quality assurance programs and
statistical methods for objectively measuring laboratory
proficiency utilizing interlaboratory testing. NIOSH has
published comprehensive information pertaining to analyti-
cal laboratory operation and quality control procedures.
U.S. Environmental Protection Agency
• Pesticides and Toxic Substances Effects Laboratory National
Environmental Research Center, Research Triangle Park,
North Carolina
This EPA laboratory is responsible for coordinating a
quality assurance program for 74 heterogeneous laboratory
entities, which include EPA, State, and private laboratories
that perform environmental pesticide analysis.
The statistical testing samples (or check samples) are
distributed to the laboratories and the results are
treated in the following steps:
(1) Compute the 95% range from the results and reject
the labs with results outside the 95% range.
(2) Recompute the mean and standard deviation after the
rejections and compute the relative standard devia-
tion, which gives an indication of the overall ac-
curacy and precision of the laboratories as a whole
(percent total error).
(3) Rank the laboratory performance by a 200 point
system - 100 points assigned to full identification
and 100 points assigned to complete quantification.
(4) The laboratories scoring between 150 and 190 are
taken as the ones with some definite problems to be
resolved. The labs scoring below 150 should be ad-
vised to suspend all routine work pending the re-
solution of some very serious problems in measure-
ment.
EPA - Region V
Central Regional Laboratory
Chicago, Illinois and
International Joint Commission
Windsor, Ontario
The Upper Lakes Reference Group has established a program
for determining the accuracy of, and the confidence which can
be placed on, the analytical data being produced by labor-
atories operating under different jurisdictions.
The interlaboratory testing program includes seventeen
laboratories that analyze split samples utilizing a variety
of "standard" methods. Further compounding the problem
is the extremely low contamination levels compared to
typical rivers and harbors samples.
Data are reported using Youden's graphical technique in
addition to the standard statistical evaluations. "True"
value of each sample is also calculated.
In addition to the Federal laboratory evaluation programs,
several states have similar programs.
Typical of these are the programs of California and Illinois.
State of California, Health and Welfare Agency, Department of
Health, Berkeley, California
California has an ongoing program for the certification of
water laboratories. In 1974, the program was redirected toward
more frequent recertification based on higher standards. Labora-
tory status is subject to change to reflect level of performance
and technical facilities.
Interlaboratory testing is an integral part of the evaluation
program. Laboratories with major deviations and/or omissions
which are not correctable within a reasonable time will not be
recertified to the proper authorities, nor will their test data be
accepted by:
California Department of Health
County and City Health Departments
California Water Resources Control Board
California Regional Water Quality Control Boards
Environmental Protection Agency - National Pollutant
Discharge Elimination System
California Department of Fish and Game
Illinois Environmental Protection Agency (IEPA), Springfield,
Illinois
The three laboratories in the Illinois EPA system conduct
interlaboratory testing in order to validate data produced and
confirm the overall quality assurance program in operation.
This agency uses a formal quality control manual that was
developed jointly by the three laboratories in the system using
the following procedure:
(1) Analysts write each individual procedure
(2) Procedures are verified by test
(3) Signed by analyst authorizing procedure
(4) Approved by each laboratory after testing
(5) Procedure published showing effective date
(6) Procedures are subject to revision as required, but
each revision has an effective date.
This program minimizes the laboratory errors due to unauthorized
deviations from standard methods employed.
Twin Cities Round Robin Program, Minneapolis - St. Paul, Minnesota
This volunteer program is composed of governmental, independent
and industrial laboratories involved in the analysis of water and
waste waters. The purpose of this project is to conduct inter-
laboratory tests covering five major parameters (demand, nutrients,
metals, minerals, and special constituents) and then to determine the
correlation of results between laboratories in order to validate each
laboratory's in-process quality control program.
The statistical evaluation consists of:
(1) Preparation of test samples by Youden's nonreplicate
technique, i.e., preparing two similar yet different
samples to be analyzed by each laboratory only once.
(2) Compute the sums and differences of the results by the
laboratories from which the precision and total error
are estimated.
(3) Perform an F test to check the presence of inaccuracy
due to laboratories. Make recommendations if the in-
accuracy (also called systematic error) is indeed present.
Approximately 16 laboratories participate in the program.
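The sums-and-differences computation in step (2) can be sketched
as follows (Python; the paired results are hypothetical, and the
use of the variance ratio as the F statistic follows the standard
Youden treatment, in which the differences reflect only random
error while the sums also reflect laboratory systematic error):

    def var(v):
        # Sample variance with n - 1 in the denominator.
        m = sum(v) / len(v)
        return sum((u - m) ** 2 for u in v) / (len(v) - 1)

    def youden_pair_stats(pairs):
        # pairs: one (x, y) result per laboratory for the two samples.
        d = [x - y for x, y in pairs]   # differences: random error only
        s = [x + y for x, y in pairs]   # sums: random + systematic error
        return var(d), var(s), var(s) / var(d)   # last value: F statistic

    pairs = [(10.2, 9.8), (10.6, 10.1), (9.7, 9.5), (10.9, 10.6),
             (10.1, 9.9), (11.4, 11.0), (9.9, 9.4), (10.4, 10.2)]
    var_d, var_s, f_ratio = youden_pair_stats(pairs)
    print(var_d, var_s, f_ratio)   # refer f_ratio to an F table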
METHODS FOR EVALUATING INTERLABORATORY TEST RESULTS
The use of interlaboratory testing programs to evaluate a
group of heterogeneous analytical laboratories presupposes that
the laboratories are meeting the following standards:
(1) Quality Control Program operating
(2) Trained technicians
(3) Instrumentation suitable for test
(4) Calibration of instruments routine
(5) Standardized measuring procedure
However, if the standard methods of analysis are not free of
determinate error, then the above lab-to-lab deviations from
norms cannot be discriminated from errors inherent in the method.
The methods currently in practice for evaluating inter-
laboratory test results include the basic statistical techniques
such as computation of mean, standard deviation, analysis of
variance, and tests for normality, and the techniques for dis-
cerning and correcting determinate errors such as Youden's two-
sample technique, ranking methods, McFarren's true value tech-
nique, etc. It should be emphasized that once the determinate
errors are detected for a particular laboratory, steps may
consist of (1) training of personnel, (2) recalibration of
instruments, (3) use of blanks, correction factors, or standard
compensation, and (4) improvement of instrument to lower detec-
tion limits.
In the development of a test method, it is an accepted
practice to consider the following factors: sensitivity, uncer-
tainty of the calibration curve, ruggedness, and total error.
These factors plus other statistical techniques are delineated
in the following subsections.
Basic Statistical Techniques (Ref. 37-43)
Before the discussion of the basic statistical techniques
in use, it is necessary to first define the statistical terms
endorsed by ANALYTICAL CHEMISTRY (note: statistical treatment
is suggested when the results reported are based on 5 or
more determinations):
(1) Series. A number of test results which possess common
properties that identify them uniquely.
(2) Mean. The sum of a series of test results divided by
the number in the series. Arithmetic mean is under-
stood.
(3) Precision Data. Measurements which relate to the varia-
tion among the test results themselves; i.e., the
scatter or dispersion of a series of test results,
without assumption of any prior information.
The following measures apply:
- Variance. The sum of squares of deviations of the
test results from the mean of the series after
division by one less than the number of observations.
- Standard Deviation. The square root of the variance.
- Relative Standard Deviation. The standard deviation
of a series of test results as a percentage of the
mean of this series. This term is preferred over
"coefficient of variation."
The statistical methods in current use can be found in the
literature cited earlier: the publications of on-going pro-
grams, textbooks, and handbooks. These sources contain recommended
standard approaches to:
(1) Control charts for quality control
(2) Estimation of standard deviation (σ) from the average
range of measurements
(3) Tests for difference of sample mean X̄ versus population
mean μ (both σ known and unknown)
(4) Tests for difference of sample variance versus popula-
tion variance
(5) Tests for difference of two sample means, X̄1 and X̄2
(both σ known and unknown)
(6) Tests of normality
(7) Analysis of variance
The order of listing is not significant; the nature of the
test suggests the basis for selection, and the methods enumerated
are, as stated earlier, in general use. However, a suspected
pattern of approach is discerned, apparent in part by virtue of
its limitations. Principally, the definition of distribution
seems to be implicit rather than explicit. Although a test of
normality is provided, specification of other methods implies
an underlying assumption of normality. Whether this is a
weakness or not remains to be determined. Second, it appears
that some criterion for sample data classification would be of
utility: should the data be aggregated and treated in toto, or
would some stratification contribute substantially to the
analysis? Third, means for identifying and evaluating sources
and effects of errors in the data are not provided. Fourth,
the information content of control charts is rather limited,
providing a reasonable basis for surmising that errors of precision
or accuracy would be difficult to detect in a timely manner. It is
likely that other more effective means for identifying such
trends can be devised. The application of the analysis of
variance to collaborative testing is elaborated in
detail by Youden. As the number of samples or the number of
laboratories available for certain tests is seldom large, and
the assumption of normality may not always be valid, it may be
necessary to use methods of analysis based upon order statistics
or some other form of nonparametric statistic.
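An explicit normality check is simple to carry out. A minimal
sketch follows (Python with the SciPy library, which is assumed
to be available; the results are hypothetical):

    from scipy import stats

    # Hypothetical results reported by twelve laboratories for one sample.
    results = [4.59, 4.94, 4.80, 4.73, 4.72, 4.80,
               4.45, 4.72, 4.63, 4.88, 4.70, 4.77]

    # Shapiro-Wilk test of normality: a small p-value casts doubt on
    # the normality assumption and suggests order statistics or other
    # nonparametric methods instead.
    w_stat, p_value = stats.shapiro(results)
    print(w_stat, p_value)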
Youden's Two-Sample Technique
The Youden "Two-Sample" technique requires the preparation
of two samples that are similar in nature and fairly close in
concentration. Each laboratory is asked to measure the concen-
tration only once. The measurements are labeled X and Y and
entered in a chart as shown below. Each point in the chart
represents the results from one laboratory.
[Figure 5-1. Percent of insoluble residue (illustration of the
two-sample chart); each point represents the paired results from
one laboratory.]
The pattern made by the points conclusively demonstrates the
major role played by the differing error contributions.
Consider that random errors are really the cause of the
scatter. Then the two determinations may err in being both low,
both high, X low and Y high, and X high and Y low. As random
errors are equally likely to be high or low (from the average),
all four of the possible outcomes just enumerated should be
equally likely. Thus, if a vertical line is drawn through the
average of the X results and a horizontal line is drawn through
the average of the Y results, the paper will be divided into
four quadrants.
These quadrants correspond to the four outcomes, ++, —, -+,
+-, just enumerated. If random errors are responsible for the
scatter, the points corresponding to the laboratories should be
divided equally among the four quadrants. In many hundred of
cases no instance of such equal division has been found (unless
the number of points is very small). The points are always
found dominantly in two quadrants: the ++ or upper right quadrant
and the -- or lower left quadrant. If a laboratory gets a result
that is high (in reference to the consensus) with one material,
it is almost sure to be similarly high with the other material.
The same statement holds for low results. Generally, the points
form an elliptical pattern with the major axis of the ellipse
running diagonally at an angle of 45 degrees to the X-axis.
Nearly always one or more points will be found far out along this
diagonal clearly removed from the major elliptical cluster. The
systematic error for the laboratory supplying the data for this
point is evidently large in comparison with the other collabora-
tors.
If random errors were vanishingly small, the amount high
(or low) would be close to the same for both materials. The points
would, therefore, hug the line closely, the ellipse becoming more
and more elongated. Indeed, the lengths of the perpendiculars drawn
from the points to the 45-degree line are directly related to the
random errors.
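The chart's diagnosis can also be computed directly. In the sketch
below (Python; the paired results are hypothetical), each
laboratory is assigned its quadrant relative to the averages of
the X and Y results, and its perpendicular distance to the
45-degree line, which reflects its random error:

    import math

    def youden_chart(pairs):
        # pairs: one (x, y) result per laboratory.
        n = len(pairs)
        x_bar = sum(x for x, _ in pairs) / n
        y_bar = sum(y for _, y in pairs) / n
        out = []
        for x, y in pairs:
            dx, dy = x - x_bar, y - y_bar
            quadrant = ("+" if dx >= 0 else "-") + ("+" if dy >= 0 else "-")
            # Perpendicular distance from the centered point to the
            # 45-degree line (the line dy = dx).
            out.append((quadrant, abs(dx - dy) / math.sqrt(2)))
        return out

    pairs = [(0.30, 0.28), (0.22, 0.20), (0.35, 0.36), (0.27, 0.24),
             (0.41, 0.43), (0.25, 0.26), (0.19, 0.18), (0.45, 0.38)]
    for lab, (quadrant, dist) in enumerate(youden_chart(pairs), 1):
        print(lab, quadrant, round(dist, 4))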
This technique has been adopted by many research laboratories
and has also been used in interlaboratory testing programs. EPA
Region V, Central Regional Laboratory, Chicago, Illinois; and
Division Laboratories, California State Department of Public Health,
Berkeley, California, use this technique extensively.
The procedure of Youden was used, with some modifications, to
evaluate the results from each laboratory. Youden's method not
only permits the simultaneous evaluation of paired results, but
also has the additional advantage of identifying results affected
by systematic or random errors. The median value was determined
for each constituent of each sample. These medians were used as
the reference values. In addition, to estimate overall precision,
the standard deviation of the joint results was calculated
according to the following formula:

    SD = sqrt[ Σ (d'_i)² / (2(n − 1)) ]

where i = 1, 2, ..., n; n is the number of laboratories;
d'_i = d_i − d̄; d_i is the algebraic difference between the results
for sample 1 and sample 2 reported by laboratory i; and
d̄ = (Σ d_i)/n is the average difference between results for the
two samples.
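A sketch of this estimate follows (Python; the paired results are
hypothetical, and the divisor 2(n − 1), converting pair differences
to a single-result standard deviation, follows the formula above):

    import math

    def youden_sd(pairs):
        # pairs: one (sample 1, sample 2) result per laboratory.
        n = len(pairs)
        d = [x - y for x, y in pairs]
        d_bar = sum(d) / n
        return math.sqrt(sum((di - d_bar) ** 2 for di in d)
                         / (2 * (n - 1)))

    pairs = [(59.9, 94.2), (59.5, 94.8), (60.3, 95.1),
             (58.8, 93.9), (59.7, 94.6), (61.0, 95.5)]
    print(youden_sd(pairs))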
Table 5-1 summarizes the measures used to establish ranges of
acceptable, questionable, and unacceptable performance.
The use of two water samples, analyzed for the same consti-
tuents, permits the application of an effective statistical
technique. This procedure yields valuable data on laboratory per-
formance that can be readily interpreted to the participants. Al-
though a laboratory approval program that has high-quality per-
formance as its goal can lean heavily on the use of reference
samples, real laboratory improvement cannot be expected unless an
appropriate adequate follow-up procedure is also instituted.
TABLE 5-1. MEDIAN VALUES OF SAMPLE
CONSTITUENTS (TABLE 1 OF REFERENCE 46)

                  Number of
                  Laboratories         Median (mg/l)
Constituent       Making Analyses   Sample 1   Sample 2
Calcium                79             59.7       94.6
Magnesium              79             25.7       42.4
Sodium                 73             29.0      106.5
Potassium              71              1.6        2.7
Chloride               91             33.8      168.3
Sulfate                78             42.8      111.0
Fluoride               81              0.84       0.44
Methods for Ranking Laboratories
In addition to the two-sample technique, there are several
other techniques for the determination of performance levels of
the laboratories. A simplified technique is the ranking procedure
described by Youden, which involves the ranking of laboratory
measurements according to actual data reported. For example, if
A, B, C, laboratories report measurements of one sample as 1.5,
1.1, 1.8, then the ranks that the A, B, C laboratories receive
will be 2, 1, 3, respectively. Lab A, receiving the middle rank 2,
is considered most likely to have good performance. This ranking technique is
most useful when a large number of labs and samples are involved.
The following is an example when 10 labs and 5 samples are involved
in a collaborative test where the data are arranged in a two-way
classification scheme shown in the left half of Table 5-2.
TABLE 5-2. WATER-INSOLUBLE NITROGEN
RESULTS (TABLE 6 OF REFERENCE 1)

Column       Results (%) for Samples              Ranks for Samples        Column
No.        1      2      3      4      5        1    2    3    4    5      Score
 7       4.59   1.46   5.64   2.19  27.32       9   5.5   6    4    3      27.5
 8       4.94   1.52   5.68   2.28  26.44       1    1    3    2   10      17
 9       4.80   1.40   5.62   2.12  26.89      3.5  8.5  7.5  6.5   8      34
10       4.73   1.46   5.65   2.09  27.17       5   5.5   5    8    4      27.5
11       4.72   1.51   5.62   2.12  27.00      6.5  2.5  7.5  6.5   6      29
12       4.80   1.51   5.80   2.29  27.48      3.5  2.5   1    1    1       9(a)
13       4.45   1.40   5.45   2.07  27.02      10   8.5  10    9    5      42.5
15       4.72   1.50   5.58   2.27  26.76      6.5   4    9    3    9      31.5
16       4.63   1.32   5.69   2.04  26.92       8   10    2   10    7      37
17       4.88   1.42   5.67   2.16  27.39       2    7    4    5    2      20

(a) Designates unusually low score.
The right half of the table shows the data replaced with
rankings that have been assigned to the laboratories according to
the amounts reported to the referee. The rank 1 is given to the
largest amount, rank 2 to the next largest, and so on. When a
tie occurs between two laboratories for the xth place, each lab-
oratory is assigned the rank x + 1/2. In the case of a triple
tie for the xth place, all three get the rank (x + 1). This
keeps the sum of the ranks equal to n(n + 1)/2, where n is the
number of laboratories.
Each laboratory receives a score equal to the sum of the
ranks it received. For M materials, the smallest possible score
is M and the largest possible score is nM. A laboratory that
reports the highest amount for every one of the M materials gets
the score of nM. Such a score is obviously associated with a
laboratory that consistently gets high results, and the presump-
tion is that this laboratory has a pronounced systematic error.
We need a quantitative measure to pass judgment on the scores.
We wish to know how big (or how small) a score we can reasonably
expect to happen by chance in the total absence of any systematic
errors. The numbers 1 to n may be written on n cards, which are
then shuffled to obtain a random order for the ranks. Repetitions
of the shuffling process will produce a series of random rankings
for the laboratories. The scores will tend to cluster around
the value M(n + l)/2. The statistical distribution of such scores
has been tabulated. When a collaborative test yields scores
in extreme regions, we conclude that a pronounced systematic error
is present in the work of the laboratory with the extreme score.
In the face of such convincing evidence, the laboratory concerned
should be willing to make a thorough search for the source of the
systematic error. The referee may decide, in view of the evidence,
to set aside all the results from this laboratory. Collaborator
12 in the table has a score of 9 as a consequence of getting high
values rather consistently. The allowable score limits for 10
laboratories and 5 materials are 11 and 44. (Ref. 1)
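The scoring procedure is easy to mechanize. The sketch below
(Python; illustrative only) assigns ranks with the tie rule just
described, giving rank 1 to the largest amount, and sums each
laboratory's ranks into a score for comparison against tabulated
limits such as the 11 and 44 quoted above:

    def rank_with_ties(values):
        # Rank 1 goes to the largest value; tied values share the
        # average of the ranks they occupy (x + 1/2 for a double tie
        # at place x, x + 1 for a triple tie).
        order = sorted(range(len(values)), key=lambda i: -values[i])
        ranks = [0.0] * len(values)
        pos = 0
        while pos < len(order):
            tied = [i for i in order[pos:]
                    if values[i] == values[order[pos]]]
            avg = sum(range(pos + 1, pos + len(tied) + 1)) / len(tied)
            for i in tied:
                ranks[i] = avg
            pos += len(tied)
        return ranks

    def scores(results_by_sample):
        # results_by_sample: one list per material, each holding one
        # result per laboratory; returns each laboratory's score.
        per_sample = [rank_with_ties(s) for s in results_by_sample]
        return [sum(col) for col in zip(*per_sample)]

    # Samples 1 and 2 of Table 5-2; scores cluster around
    # M(n + 1)/2 = 11 when only random errors are present.
    print(scores([
        [4.59, 4.94, 4.80, 4.73, 4.72, 4.80, 4.45, 4.72, 4.63, 4.88],
        [1.46, 1.52, 1.40, 1.46, 1.51, 1.51, 1.40, 1.50, 1.32, 1.42],
    ]))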
In extreme cases, most of the laboratories may get approxi-
mately the same ranking on each material so that the scores ap-
proximate the values M, 2M,..., nM. Obviously, this is an indict-
ment of the analytical method; presumably it is either inadequately
written or unacceptably sensitive to the various environments en-
countered in the various laboratories.
Another ranking technique is the one used by the Center for
Disease Control at Atlanta, Georgia, Public Health Service, HEW,
where three ranges are established, as described in Section 1.2.
Scores of 3, 2, and 1 are given to the labs that have reported
measurements lying, respectively, in the narrowest, medium, and
widest ranges.
The labs which lie outside the widest range are given a negative
score of -1. The laboratories ranked below 1 are warned and
asked to correct their measuring procedure.
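A sketch of this three-range scoring follows (Python; the target
value and the range half-widths are hypothetical, standing in for
the three factors on which the program bases its limits):

    def cdc_score(result, target, limits):
        # limits = (narrow, medium, wide): half-widths of the three
        # nested ranges around the target value.
        deviation = abs(result - target)
        narrow, medium, wide = limits
        if deviation <= narrow:
            return 3
        if deviation <= medium:
            return 2
        if deviation <= wide:
            return 1
        return -1   # outside the widest range; the passing score is 1

    # Hypothetical: target 100 with ranges of half-width 5, 10, 15.
    for result in (103, 108, 114, 131):
        print(result, cdc_score(result, 100, (5, 10, 15)))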
A third ranking procedure in practice is the one adopted by
the Pesticides and Toxic Substances Effects Laboratory, National
Environmental Research Center, Research Triangle Park, N.C., EPA.
The testing and ranking procedures are described also in Section
1.2. The ranking procedure is essentially based on three criteria:
1. Identification of all compounds present
2. Correct quantitative assessment
3. Avoidance of reporting compounds not present
A 200-point system is used in actual ranking - 100 points as-
signed to full identification and 100 points assigned to complete
quantification." For example, a laboratory is asked to identify
and measure 5 compounds possibly present in a sample which actually
contains 4 compounds. The laboratory reports 3 correct identifi-
cations, 1 incorrect, and 1 missing. Since there are two incorrect
identifications (1 missing and 1 incorrect), 40 points (2 × 20)
are deducted from the 100 identification points: 100 − 40 = 60.
A "compound quantification point" system is used according to
the following definition:

    Compound quantification pt. = compound pt. value −
        |formulation value − value reported| / standard deviation

Using the same example: since there are 4 compounds, each compound
is assigned 25 points (out of 100 points). If F1 is the formulation
value, R1 is the reported value, and S1 is the standard deviation,
then the compound quantification point (CQP) for the first
compound is:

    CQP1 = 25 − |F1 − R1| / S1

The total CQP will be:

    CQP = Σ CQPi

For instance, the formulation of Dieldrin is 20 pg/µl, yet the
laboratory reports 50 pg/µl. The quantification point is:

    25 − |20 − 50| / 2.5 = 13

where 2.5 is the standard deviation.
The sum of the identification points and quantification
points is the total score of the laboratory. The research center
has set the following ranking standard:
The laboratories scoring between 150 and 190 are taken
as the ones with some definite problems to be resolved.
The labs scoring below 150 should be advised to suspend
all routine work pending the resolution of some very
serious problems in measurement.
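The scoring in the worked example can be sketched as follows
(Python; the compound names and the added aldrin and DDT
formulation values are hypothetical, and the 20-point deduction
per identification error follows the worked example above):

    def pesticide_score(reported, actual, quantified):
        # reported/actual: sets of compound names; quantified:
        # {compound: (formulation F, reported R, std deviation S)}.
        errors = len(actual - reported) + len(reported - actual)
        ident = max(0, 100 - 20 * errors)       # identification points
        per_compound = 100 / len(actual)        # 25 points for 4 compounds
        quant = sum(per_compound - abs(f - r) / s
                    for f, r, s in quantified.values())
        return ident + quant

    reported = {"dieldrin", "aldrin", "ddt", "lindane"}    # 1 incorrect
    actual = {"dieldrin", "aldrin", "ddt", "heptachlor"}   # 1 missed
    quantified = {"dieldrin": (20.0, 50.0, 2.5),  # CQP = 25 - 30/2.5 = 13
                  "aldrin": (10.0, 11.0, 1.0),
                  "ddt": (5.0, 4.5, 0.5)}
    print(pesticide_score(reported, actual, quantified))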
Determination of Acceptable Analytical Methods
In testing an analytical method, it is not a simple task to
determine whether the method is an acceptable one. This is be-
cause the data collected are subject to precision and accuracy
errors. This is an important problem to standard methods commit-
tees if their selection of methods is to be sensible and unbiased.
E. McFarren of the Analytical Reference Service, Bureau of Water
Hygiene, Public Health Service, Cincinnati, Ohio (Ref. 24), has proposed a
method for judging the acceptability of analytical methods, which
is borne out by certain of the ARS Evaluation of Laboratory
Methods Reports, as well as by a recent article by Devine and
Partington (Ref. 23). Clearly, total error has a bearing upon the
statistical analysis of test results. When the total error is
large, the relative evaluation of laboratories becomes statistical-
ly difficult. Furthermore, many of the current standard test
methods may not be acceptable for interlaboratory evaluation, un-
less the inherent errors of the methods have been previously evalu-
ated by suitable "ruggedness" tests. In this case, allowance can
be made for systematic errors in the method.
Obviously, both precision and accuracy (as defined in ANALYT-
ICAL CHEMISTRY) must be considered in judging the acceptability
of an analytical method. The difference, in the case of a
collaborative study, is that the precision as calculated from the
data collected by many laboratories will be somewhat larger be-
cause of differences in reagents, instrument calibrations, glass-
ware calibrations, etc. These latter errors are also random
errors but are in addition to the operator or laboratory random
errors calculated when a series of test results are collected by
only one operator in one laboratory.
The mean error, on the other hand, as calculated for a series
of test results from many laboratories may not bear any relation-
ship to that calculated for a series of test results from one
laboratory. The latter may represent either the method bias, the
laboratory bias, or both. The former, however, since it is an
average of the bias from many laboratories, presumably more truly
represents only the method bias (accuracy).
Using slightly redefined terms for the precision and accuracy
of collaborative data, it is possible by means of suitable statis-
tical tests, such as the F test and the t-test, to determine whether
there is a significant difference in either the precision or the
accuracy of two methods. If there is a significant difference,
then the method that is either more precise or more accurate is,
presumably, the better method. The slightly redefined terms
are (Ref. 22):
1. Mean error - the difference between the average of a
series of test results and the true result
2. Total error - (absolute value of mean error + 2 ×
standard deviation)/(true value), expressed
in percent
The term standard deviation has its regular definition, namely,
the square root of the variance. For example, let us
assume that the following set of data was collected for two
different methods (Table 5-3).
TABLE 5-3. DATA FOR TWO DIFFERENT METHODS

          Number                                          Relative
          of                  Mean      Std.     Rel.     standard
Method    Results    Mean     error     dev.     error    deviation
A           25       1.10     +0.10     0.05     10.0       4.5
B           25       0.90     -0.10     0.05     10.0       5.6
Application of the definition for total error gives:
    A. [0.1 + 2(0.05)] / 1.00 × 100 = 20% total error

    B. [0.1 + 2(0.05)] / 1.00 × 100 = 20% total error
which indicates that both methods are equally precise and
accurate, as it should be. However, if one uses another defi-
nition for total error; for example, total error = relative
error + 2 (rel. standard deviation), the results would be
A. 10 + 2(4.5) = 19% total error
B. 10 + 2(5.6) = 21% total error
and it appears that there is a difference in the two methods,
when actually the methods are equally precise and accurate.
This phenomenon occurs because, for A, the mean is greater than
the true value (1.00), and for B, the mean is less than the true
value. Consequently, one can conclude that the new definition
by McFarren is more accurate in determining the acceptability
of an analytical method. In addition, he also proposed to
divide methods into at least three different classes; namely,
methods that can be rated as excellent or highly satisfactory,
methods that are acceptable provided no better method is avail-
able, and methods that are unacceptable. Since the experience
of ARS has indicated that few methods will qualify even if a
total error as large as 25% is permitted, those methods that
might be considered acceptable only if no better method
is available will have a much larger permissible error, perhaps as great as
50%. Under these conditions, with reasoning similar to the above
example, a relative standard deviation as large as 25% and a
relative error as large as 45% would be acceptable. As can be
seen, however, the permissible relative error is dependent on
the size of the relative standard deviation and on the sum of
the relative error plus two times the relative standard devi-
ation not exceeding 50%. The third category then would be those
methods that have a total error greater than 50% and that would
be judged unacceptable.
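As an illustration of these criteria, the following minimal sketch
computes McFarren's percent total error for the two methods of
Table 5-3 and assigns the three-class rating described above; it is
provided for clarity and is not part of the original study. The 25%
and 50% cutoffs are those quoted in the text.

    def total_error_pct(mean_error, std_dev, true_value):
        """McFarren's total error: (|mean error| + 2s)/true value, in percent."""
        return (abs(mean_error) + 2.0 * std_dev) / true_value * 100.0

    def rating(te_pct):
        # Three-class rating with the 25% and 50% cutoffs quoted in the text
        if te_pct <= 25.0:
            return "excellent or highly satisfactory"
        if te_pct <= 50.0:
            return "acceptable if no better method is available"
        return "unacceptable"

    # Data of Table 5-3 (true value = 1.00 for both methods)
    for method, mean_err, s in [("A", +0.10, 0.05), ("B", -0.10, 0.05)]:
        te = total_error_pct(mean_err, s, true_value=1.00)
        print(f"Method {method}: total error = {te:.0f}% -> {rating(te)}")

Both methods give 20% total error and the same rating, in agreement
with the conclusion above.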
In his paper, as a result of the application of the proposed
criterion for judging the acceptability of analytical methods,
atomic absorption spectrophotometry was found acceptable for the
determination of zinc, chromium, copper, magnesium, manganese,
iron, and silver but unacceptable for the determination of lead
and cadmium. On the other hand, none of the pesticides studied
could be determined satisfactorily by gas chromatography. Objec-
tive reevaluation with the proposed criterion of the methods,
resulted in conclusions essentially in agreement with those
previously determined subjectively.
Techniques for Testing Ruggedness of a Procedure
Once an analytical procedure is shown to be free of accuracy
errors, it is then necessary to test whether the procedure will
be rugged under routine conditions (both intralaboratory and
interlaboratory). That is to say, the procedure should be
insensitive to slight deviations from the normal procedure. A
technique for testing the ruggedness of an analytical procedure
has been developed by Youden, which is used in Reference 20,
Section III. The details of the technique are delineated
as follows:
Let A, B, C, D, E, F, and G denote the nominal values for
seven different factors that might influence the result if their
nominal values are slightly changed. Let their alternative
values be denoted by the corresponding lower case letters a, b,
c, d, e, f, and g. Now the conditions for running a determin-
ation will be completely specified by writing down these seven
letters, each letter being either capital or lower case. There
are 2^7, or 128, different combinations that might be written out.
Fortunately, it is possible to choose a subset of eight of these
combinations that has an elegant balance between capital and
lower case letters.
The particular set of combinations is shown in Table 5-4.
25
-------
TABLE 5-4. EIGHT COMBINATIONS OF SEVEN FACTORS USED
TO TEST RUGGEDNESS OF AN ANALYTICAL METHOD
(TABLE 8 OF REFERENCE 1)

                      Combination or Detn No.
Factor value       1    2    3    4    5    6    7    8
A or a             A    A    A    A    a    a    a    a
B or b             B    B    b    b    B    B    b    b
C or c             C    c    C    c    C    c    C    c
D or d             D    D    d    d    d    d    D    D
E or e             E    e    E    e    e    E    e    E
F or f             F    f    f    F    F    f    f    F
G or g             G    g    g    G    g    G    G    g
Observed result    s    t    u    v    w    x    y    z
The table specifies the values for the seven factors to be
used while running eight determinations. The results of the
analyses are designated by the letters s through z. To find
whether changing factor A to a had an effect, we compare the aver-
age (s + t + u + v)/4 with the average (w + x + y + z)/4. The
table shows that determinations 1, 2, 3, and 4 were run with the
factor at level A and determinations 5, 6, 7, and 8 with the
factor at level a. Observe that this partition gives two groups
of four determinations and that each group contains the other
six factors twice at the capital level and twice at the lower
case level. The effects of these factors, if present, consequently
cancel out, leaving only the effect of changing A to a.
Inspection of the table shows that whenever the eight determi-
nations are split into two groups of four on the basis of one of
the letters, all the other factors cancel out within each group.
Every one of the factors is evaluated by all eight determinations.
The effect of altering G to g, for example, is examined by compar-
ing the average (s + v + x + y)/4, with the average of
(t + U + w + z)/4.
Collect the seven average differences for A - a, B - b,
..., G - g, and list them in order of size. If one or two fac-
tors are having an effect, their differences will be substantially
larger than the group of differences associated with the other
factors. Indeed, this ranking is a direct guide to the method's
sensitivity to modest alterations in the factors. Obviously, a
26
-------
useful method should not be affected by changes that will almost
certainly be encountered between laboratories. If there is no
outstanding difference, the most realistic measure of the analyt-
ical error is given by the seven differences obtained from the
averages for capitals minus the averages for corresponding lower
case letters. Denote these seven differences by Da, Db, ...,
Dg. To estimate the standard deviation, square the differences
and take the square root of 2/7 the sum of their squares. To
check the calculation, compute the standard deviation obtained
from the eight results, s through z. Obtain the mean of the eight
results. Square the eight differences from the mean, sum the
squares, divide by 8 - 1, and take the square root. This estimate
of the analytical error is realistic in that the sort of variation
in operating conditions that will be encountered among several
laboratories has been purposely created within the initiating lab-
oratory. If the standard deviation so found is unsatisfactorily
large, a collaborative test is foredoomed; a collaborative test
should never be undertaken until the method has been subjected to
the abuse described above and satisfactory results obtained in
spite of the abuse (Ref. 1, page 35).
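The arithmetic above reduces to a few lines of code on the design of
Table 5-4. The following is a minimal sketch, not part of the
referenced procedure; the eight results passed to it at the end are
hypothetical placeholders standing in for the observed values s
through z.

    import math

    # Youden ruggedness design of Table 5-4: row i holds +1 where factor i
    # (A..G) is at its capital level in determinations 1..8.
    DESIGN = [
        [+1, +1, +1, +1, -1, -1, -1, -1],  # A/a
        [+1, +1, -1, -1, +1, +1, -1, -1],  # B/b
        [+1, -1, +1, -1, +1, -1, +1, -1],  # C/c
        [+1, +1, -1, -1, -1, -1, +1, +1],  # D/d
        [+1, -1, +1, -1, -1, +1, -1, +1],  # E/e
        [+1, -1, -1, +1, +1, -1, -1, +1],  # F/f
        [+1, -1, -1, +1, -1, +1, +1, -1],  # G/g
    ]

    def ruggedness(results, names="ABCDEFG"):
        """Seven factor effects (capital average minus lower-case average)
        and Youden's standard deviation estimate sqrt((2/7) * sum(D_i^2))."""
        effects = {}
        for name, row in zip(names, DESIGN):
            caps = sum(r for r, sign in zip(results, row) if sign > 0) / 4.0
            lows = sum(r for r, sign in zip(results, row) if sign < 0) / 4.0
            effects[name] = caps - lows
        sd = math.sqrt((2.0 / 7.0) * sum(d * d for d in effects.values()))
        return effects, sd

    # Hypothetical results s..z, for illustration only:
    effects, sd = ruggedness([18.8, 20.6, 19.9, 18.0, 19.5, 19.2, 19.9, 19.9])
    for name, d in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name} - {name.lower()}: {d:+.2f}")
    print(f"estimated standard deviation: {sd:.2f}")

Listing the effects in order of absolute size, as the code does,
reproduces the kind of summary shown after Table 5-5 below.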
The following is an example of the factors involved in a
laboratory measurement of the percent of water in phosphoric acid
samples. Table 5-5 gives the factors and the eight measurements
(Ref. 20, page 10.3.c).
TABLE 5-5. MEASUREMENT OF H2O IN PHOSPHORIC ACID

Factor                                 Value for         Value for
No.  Letter                            Capital Letter    Lower Case Letter
1    A,a   Amount of H2O               ca 2 ml           ca 5 ml
2    B,b   Reaction Time               0 min             15 min
3    C,c   Distillation Rate           2 drops/sec       6 drops/sec
4    D,d   Distillation Time           90 min            45 min
5    E,e   N-heptane                   210 ml            190 ml
6    F,f   Aniline                     8 ml              12 ml
7    G,g   Reagent                     New               Used

Measurements:  s      t      u      v      w      x      y      z
               18.81  20.58  19.90  18.03  19.50  19.16  19.88  19.85
One can proceed to calculate the ruggedness of the procedure
to various factors. For instance, the sensitivity to a slight
variation in reagent used is
(s + v + x + y)/4 - (t + u + w + z)/4 = 18.97 - 19.96 = -0.99
27
-------
Such computations are summarized as follows:
Condition Varied Difference
Reagent -0.99
Aniline -0.83
Distillation Time 0.63
Amount of Water -0.27
Distillation Rate 0.11
Reaction Time 0.09
N-Heptane -0.07
From the summary, one can conclude that the reagent used,
amount of aniline and distillation time exert greater effects on
the analytical result than the other four factors. Therefore,
the individual planning to use this test method must decide
whether to redefine the test procedure prior to using it in
a proficiency testing program. This decision should be based on
results of the test "pre-qualification" activity, which provides
estimates of the accuracy and precision errors inherent in the
method and for the particular samples.
If the differences, for example those shown above, are
small compared with the estimates obtained from the pre-qualifica-
tion, they can be considered acceptable. If they are not, and
hence become significant contributors to bias (accuracy) error,
then the method is not sufficiently rugged for interlaboratory
evaluation.
28
-------
SECTION VI
FIELD INVESTIGATION
Comprehensive meetings were held with AQC coordinators at
selected EPA Regional offices, the EPA National Field Investi-
gation Centers, and NERC laboratories, as well as with other Federal,
State, and private laboratories listed in Table 6-1.
The primary purpose of this survey was to:
• gather data from those agencies now conducting inter-
laboratory testing programs
• analyze and evaluate problem areas in existing programs
• obtain recommendations for alternative approaches
In order to obtain objective data in an orderly manner dur-
ing the field investigation, a questionnaire was prepared and
mailed to the agency to be visited at the time an appointment was
made (Ref. 25). The questionnaire was divided into two sections:
Section I - Interlaboratory Test Programs
The preponderance of interlaboratory test programs, to
date, have been concerned with methods development and
evaluation. The questions in this section were intended
to identify similarities and differences between inter-
laboratory test programs designed for methods evaluation
and programs for rating individual laboratory proficiency.
Section II - Intralaboratory Quality Control Practices
Historically, only those laboratories with effective
quality control programs have performed well in any
interlaboratory test program. The questions in this sec-
tion were designed to determine the level of quality con-
trol to be maintained in an analytical laboratory in
order to maintain the levels of proficiency required for
producing valid data to quantify water pollution.
The questionnaire is included as Appendix B. Due to the wide
variation in purpose of the respondent laboratories, many ques-
tions were not applicable to all laboratories. However, the
29
-------
questions were successful in promoting candid discussions with the
personnel. Each of these interviews produced new insights into
the problems of developing interlaboratory programs for monitoring
laboratory proficiency.
The deficiencies of prior interlaboratory test programs
evaluated are listed in summary form on page 4. Clearly, no
specific program has been deficient in all these respects, although
most of them exhibit deficiencies in several areas.
The selection and preparation of test samples is a prominent
shortcoming. For example, water samples distributed by the
California Department of Health often contain constituent concen-
trations known to be below detection levels when the concentrated
sample is diluted according to instruction. Any subject labora-
tory interested in retaining its certificate will be tempted to
first test the concentrate, then the dilution using extraction
techniques, and compare results.
Typical deficiencies reported during FMC's field investiga-
tion include the following:
1. Timely reporting of interlaboratory test results back to the
participants seldom occurs, resulting in reduced participation
in voluntary programs.
2. Constructive advice on improving analytical capabilities is
rarely furnished.
3. Many interlaboratory studies do not take into consideration
the data parameters required to accomplish valid statistical
analysis.
4. Analytical test procedures used in interlaboratory tests
frequently result in gross disparities in data that can be attri-
buted to the analytical procedure rather than to differences in
personnel or instrument capability.
5. Samples are received without adequate instructional material,
improperly preserved, and/or in quantities insufficient for
analysis.
6. The concentration range of an interlaboratory test is often not
in the normal range of routine testing conducted by the laboratory.
It may be at or below the detection limits of the analytical
instrument unless special procedures are used, or it may be more
concentrated (e.g., the Great Lakes laboratories).
Other problem areas include:
1. Optimization of the frequency of interlaboratory tests for
each of the major categories. For example Dr. Hall of the Center
for Disease Control is attempting to decrease the frequency of
some tests. Other agencies are limited to one or two testing
30
-------
programs each year due to legislative requirements, level of
funding, and/or difficulty in analyzing test data and preparing
evaluation reports.
2. Analysis of test data and documentation of results appears
to be a common problem, particularly when computer program devel-
opment is concurrent with introduction of the interlaboratory
proficiency testing program.
3. Some programs have reverted to Youden techniques in order
to report results in a timely manner. In general, manpower limi-
tations of state and federal agencies preclude adequate followup
with participating laboratories, either from a shortage of experi-
enced personnel or from lack of jurisdiction.
4. Often unusual handling is given to the samples once they are
identified as check samples by the analyst.
5. Trace analysis, or analysis of low-concentration parameters,
is not completely assessed by samples prepared from concentrates.
As a result of this investigation, no recommendations were
made by the participants for alternative approaches to interlab-
oratory testing. However they did recommend that precautions be
taken in data treatment. For example:
Under precautions to be observed in conducting interlaboratory
proficiency testing, the following comments were made: "MDQARL's
Methods Studies do a good job in assessing methods but are insuf-
ficient in parameters and frequency to make any assessment of
laboratory performance," and "MDQARL's check samples are a great
help in monitoring an in-house quality control program, but are
not meant for proficiency testing in an interlaboratory program."
One major insight gained from the discussions during the
field investigation was that no laboratory should waste its time
and money in participating in methods development or performance
evaluation interlaboratory test programs until it has an effective
intralaboratory quality control program in force.
An intralaboratory quality control program should be a docu-
mented program concerned with all aspects of a functional analyti-
cal program, i.e., adherence to sample preparation procedures,
instrument calibration, etc.; precision and accuracy on each
group of samples; instrument stability over time (perhaps by use
of check samples plus instrument use and repair logs); preparation
and use of quality control reports such as computer files and/or
quality control charts.
The objectives and procedures for conducting Methods Develop-
ment and Proficiency Evaluation are individually unique and should
not be confused when developing either program. The first is an
31
-------
impersonal technical evaluation where pride of authorship is
about the only interpersonal relationship. On the other hand,
proficiency testing is highly interpersonal with all the intan-
gible values associated with the more highly developed pro-
cedures of rating individuals on their current ability and
future potential.
TABLE 6-1. AGENCIES VISITED DURING FIELD INVESTIGATION

AGENCY                                            PRINCIPAL CONTACT

Environmental Protection Agency

Environmental Monitoring and Support Laboratory   Mr. Dwight Ballinger, Director
Cincinnati, Ohio 45268

Methods Development and Quality                   Mr. John Winter, Chief
Assurance Research Laboratory                     Quality Assurance Lab. Evaluation
1014 Broadway
Cincinnati, Ohio 45268

Water Supply Research Laboratory                  Earl McFarren, Chief
Taft Laboratory                                   Water Supply Division
4676 Columbia Parkway
Cincinnati, Ohio 45268

National Environmental Research Center
Research Triangle Park, North Carolina 27711

  Division of Atmospheric Surveillance            Mr. Seymour Hochheiser, Chief
  Quality Control Branch

  Pesticides & Toxic Substances Effects           Mr. J. F. Thompson, Chief
  Laboratory, Chemistry Branch

EPA Office of Enforcement and General Counsel

  National Field Investigation Center             Lowell A. Van Den Berg, Deputy Director
  5555 Ridge Avenue                               Dr. Richard Enderoux
  Cincinnati, Ohio                                Carl R. Hirth

  National Field Investigation Center             Dr. T. O. Meiggs, Deputy Director
  Denver Federal Center
  Denver, Colorado 80225

EPA Regional Offices                              Regional Analytical Quality Control
                                                  Coordinators

  Region IV                                       James Finger, Quality Assurance Officer
  Surveillance and Analysis Division
  SE Environmental Research Laboratory
  College Station Road
  Athens, Georgia

  Region V                                        David A. Payne, Quality Assurance Officer
  Central Regional Laboratory
  1819 West Pershing Road
  Chicago, Illinois 60609

  Region VII                                      Dr. Harold G. Brown, Chief
  26 Funston Road                                 Laboratory Branch
  Kansas City, Kansas 66115
32
-------
TABLE 6-1. AGENCIES VISITED DURING FIELD INVESTIGATION
(Continued)

AGENCY                                            PRINCIPAL CONTACT

Region VIII                                       John R. Tilstra, Quality Assurance
Denver Federal Center                             Officer
Denver, Colorado 80225

Department of Health, Education and Welfare

  Public Health Service                           Charles T. Hall, Ph.D.
  Center for Disease Control                      Chief, Proficiency Testing Section
  Atlanta, Georgia 30333                          Licensure & Proficiency Testing Branch

  Public Health Service                           William D. Kelly, Deputy Director
  National Institute for Occupational Safety
  and Health
  1014 Broadway
  Cincinnati, Ohio 45202

  Public Health Service                           James Leslie
  Food and Drug Administration, Bureau of Foods
  Division of Microbiology, Taft Laboratory
  4676 Columbia Parkway
  Cincinnati, Ohio 45226

National Bureau of Standards                      Dr. Joseph M. Cameron, Chief
Office of Measurement Standards
Gaithersburg, Maryland

State Agencies:

  Ohio Environmental Protection Agency            Dr. Edward E. Glod
  1571 Perry Street                               Arnold Westerhold
  Columbus, Ohio 43201

  Illinois Environmental Protection Agency        David Schaeffer, Ph.D.
  2200 Churchill Road
  Springfield, Illinois 62706

Private Activities

  American Council of Independent Laboratories    Mr. Robert Corning, Chairman
  1725 K Street, N.W.                             Water Quality Sub-Committee
  Washington, D.C. 20006                          Cedar Rapids, Iowa

  Twin Cities Round Robin Program                 Mr. William A. O'Connor
  Minneapolis-St. Paul, Minnesota                 SERCO Laboratories
                                                  2982 N. Cleveland Avenue
                                                  Roseville, Minnesota 55113
33
-------
SECTION VII
DATA ANALYSIS AND EVALUATION
OBJECTIVES OF DATA ANALYSIS AND EVALUATION
The objective of data evaluation in this study is to ascertain
the accuracy and precision of the testing methods used by the
various laboratories. After the evaluation, the data can be ana-
lyzed to determine the validity of various test methods and
procedures. Specifically, data analysis and evaluation should
serve two purposes:
A. Detection of error in the chemical determinations
performed
B. Isolation and correction of the source of error
In principle, the "test sample" approach amounts to a cali-
bration of the laboratories involved, very much akin to the
"traceable" calibration of an instrument for physical measure-
ment; if the organization and processes for a given determina-
tion in each laboratory are considered to be an "instrument"
(Ref. 90-106).
GENERAL ANALYSIS AND EVALUATION PROCEDURES
The general procedures to be followed in data analysis and
evaluation should be based on analysis of variance; namely,
analysis of the precision error (or the sum of squares
between the labs). To aid the discussions on analysis and evalu-
ation procedures, and laboratory training methods, one can first
model the testing program as an information flow system where
categorization of measurements with respect to experimental
design can be visualized easily.
INFORMATION MODEL
Figure 7-1 presents the flow diagram of the suggested "base-
line" information model. It is not intended, at this point, to
be definitive, but to provide a framework into which various
sample orderings may be placed. To establish this context, five
stages of information are identified.
34
-------
Figure 7-1. Information flow model. (Flow diagram: 1. the test
sample is distributed; 2. lab 1 through lab N each test it,
recording measures 1, 2, ..., n; 3. each lab computes statistics;
4. the statistics are consolidated and analyzed; 5. the results
are fed back to each lab.)
Stage 1. This is the test sample, containing the various ingre-
dients to be determined in the test. The types and quantities
of the ingredients are assumed to be unknown (at least to the
individuals who will perform the test). It is further assumed
that, within reasonable limits, the method by which the test
sample is apportioned to the various laboratories does not affect
the relative quantitative proportions of the ingredients.
35
-------
Stage 2. This is the laboratory test; that is, the procedure
applied to yield a determination of the types and quantities of
the ingredients contained in the test sample. For each type of
ingredient the sequence of measure; 1, 2, ..., n is recorded
separately; this is the statistical sample, which should be
treated as a member of the type/quantity population for succeed-
ing analyses. (First caveat: if a particular type of ingredient
is subjected to more than one technique of test determination,
the results of different techniques must be treated as separate
samples).
Stage 3. Each laboratory separately computes for each sample of
measurements the statistics outlined on page 15 and following,
at the minimum the steps described as the "mean" and "precision
data." Where relevant information is available to the analyst,
"accuracy data" may also be developed.
Stage 4. This is where the laboratory statistics are compared
to:
a. The standard quantities of the sample, on a laboratory-
by-laboratory basis.
b. The results elicited by each laboratory on a compara-
tive basis.
Precision, accuracy, and test techniques are all subject to
evaluation in this stage, and analytical methods are selected to
provide the most effective means of comparison. Provided that
historical data are available, trends in precision and accuracy
with time may also be examined. This stage, in essence, is the
quality control evaluation of the laboratories performing the
test. Although indicated on the diagram as unique, information
may pass through two or more substages, depending on the sequence
of data consolidation. For example, if 100 labs produced test
data samples, the statistics from sets of 10 labs might be
compared as an intermediate step, and 10 sets of consolidated
results provided to the final analysis. This intermediate analysis
is of particular importance when two or more acceptable test
methods have been specified in the test instructions. In this
case, with group A of the participating laboratories using
method 1, group B using method 2, group C using method 3, etc.,
the group statistics are distinct from each other, and caution
is required in consolidating them.
Stage 5. This stage is a feedback to the participating labora-
tories. Principally, this feedback relates to the laboratory
performance compared to that of the total population of labora-
tories. If the EPA is to elicit and to maintain a cooperative
36
-------
attitude among all laboratories, and this is desirable even if
the intent of the test program is only compulsory participation
as a condition of certification, then the participant must be
given more than his bare "score". He is entitled to know his
relative standing. If he is deficient, he should be told the
nature of the deficiency, so that he may take appropriate action
to rectify it. At the discretion of the EPA test program manager,
he may even be provided with other samples to test and report on.
In summary, feedback should be regarded as a cooperative
attempt by the lab and the analysis center to identify and elimi-
nate causal factors for anomalies in test determinations.
SOURCES OF ERROR (Ref. 79-109)
As the objective of data analysis and evaluation is to detect,
identify, and correct errors in an interlaboratory test, some
examination of sources of errors is in order and some indication
of their impact should be developed. For this purpose, the "test
sample" is assumed to be standard; i.e., the type concentrations
subject to determination can be known only within certain limits
due to "Universe" error. Moreover, because of their rather
specialized nature, the determinations considered are assumed to
be for concentrations reasonably exceeding detection thresholds.
The assumption establishes the general domain of labs that would
undergo training, evaluation, and certification. The following
diagram depicts the relationship among the errors.
Figure 7-2. Error diagram. (Diagram relating precision (random)
errors and systematic (bias and accuracy) errors.)
37
-------
Precision errors are errors randomly distributed about the
mean. The expected value of their sum is equal to zero. System-
atic (bias) errors are not distributed uniformly, and will yield
a non-zero sum. Strictly speaking, bias and accuracy errors are
synonymous; as the terms are used in this report, "accuracy"
errors represent the performance of the individual analyst, while
"bias" errors are characteristic of the laboratory itself and re-
flect the general laboratory environment. Both sources of errors
result in a consistent displacement of test data from the true
value.
The potential sources of error in a measurement imposed by
the Universe are sketched as follows:
Figure 7-3. Sources of error in measurement. (Flow diagram: the
test sample is prepared for analysis (ep); reagents are prepared
(er); test apparatus is adjusted (ea); instruments are calibrated
(ec); the reaction is developed and the reaction product measured;
the observation (eo) and round-off (ed) yield the reported
statistic.)
38
-------
Each of the sources is discussed separately:
1. ep - the error in preparing the test sample for analysis.
This may involve dilution, the so-called "spiking" (which in
essence shifts the precision parameters), or development of a
more analyzable or measurable form.
2. er - the error in preparing chemical reagents. Whether the
reagents are comparatively stable or their nature requires prep-
aration immediately before development of a reaction, it is
considered for this report that (typically) dilution or compound-
ing introduce basic sources of error.
3. ea - the error implicit in the installation and preparation
of test apparatus. This factor is related to Youden's "rugged-
ness" criteria.
4. ec - the error resulting from improper calibration of test
instrumentation.
5. ed - the error associated with round-off of data. Youden
and others have suggested that the last digit be treated as
having an inaccuracy of ±1.
6. eo - the error derived from reading and recording measure-
ments by the analyst; such things as misreading instruments, impro-
perly manipulating apparatus, or inadvertently transposing
digits fall into the category of human error.
To make the statistical analysis tractable, these sources
are generally assumed to be independent of each other; the total
error variance is then the sum of the individual variances:
σ²(total) = σ²(p) + σ²(r) + σ²(a) + σ²(c) + σ²(d) + σ²(o).
It should be noted that the human error listed above can be
a predominating one. The accuracy of a recorded numerical
value is subject to implicit limitation. This does not neces-
sarily occur because of the "significant figures" consideration,
though this appears to have been carelessly handled in a rather
large fraction of the literature reviewed; typically squares or
products of two-digit figures should result in four-digit
figures or conversely roots or quotients result in corresponding
reduction. Of somewhat more import here, however, is the simple
act of interpreting what is recorded. Youden (and others)
have suggested that the digit of least ordinal value (e.g., the
final 6 in 2176) be treated as having an inaccuracy of plus or
minus one (1) digit; the number cited would fall between 2175
and 2177. If an
interpolation of a measurement scale has been performed, the
assumption is valid. On the other hand, if the reading is
direct (as with digital readouts or such devices as analytical
balances), the error must be assumed over a one-digit range;
i.e., as previously cited, 2175.5-2176.5. (This conforms to
39
-------
the principle of "round-off"). The effect of the human element
is not altogether separable. Such things as misreading instru-
ments, improperly manipulating apparatus, or inadvertently
transposing digits fall into the category of human error.
Further, if a quantitative judgement is required, Weber's Law
indicates that the detection threshold for differences is
approximately 2.5% (relative). [Weber's law is a law in
psychology, which has been thought to be the governing factor
in human-initiated errors in reading measurements (Ref. 107)].
D. Meister has performed extensive studies on the effects of
human errors in data collection procedures. The results of
his studies indicate that a substantially high percentage of all
equipment failures (20 to 80 percent) result from human error.
He has also developed probabilistic theories to predict and
measure human errors (Ref. 108, 109). Both Weber's law and
Meister's studies can be used as references as to the extent of
degradation on the accuracy of lab measurements caused by human
errors.
SAMPLE SIZE REQUIREMENTS
The sample size required is generally a function of the
parameters under estimate and the technique of sampling.
Derivations have been carried out by constraining on the con-
fidence intervals of the sample mean or sample variance, which
result in expressions having the following form:
n ≥ f(μ, σ², X, S²)
where μ and σ² are the true mean and variance, and X and S² are
the corresponding sample quantities. For example, by constraining
the confidence interval of the sample mean to a half-width d,
one finds
n ≥ (zσ/d)²
where z is the normal deviate for the chosen confidence level.
By constraining on sample variance, one arrives at a
similar expression but different from above mainly in the con-
fidence interval. The latter constraint yields a more stringent
size requirement. Another significant aspect of constraining on
the sample variance is that it is more meaningful, in that even
poorly designed tests which result in gross errors in the
observed mean still may yield valid estimates of the measure-
ment variances. It should be noted that the two constraints
are equivalent when gross errors do not occur, that is, when
the observed mean lies very close to the true mean.
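A minimal sketch of the mean-based constraint follows; it
illustrates the standard formula rather than a derivation from this
report, and the 95% normal deviate (1.96) and the example values of
σ and d are assumptions.

    import math

    def sample_size_for_mean(sigma, half_width, z=1.96):
        """Smallest n such that the confidence interval of the sample mean,
        X ± z*sigma/sqrt(n), has half-width no larger than half_width."""
        return math.ceil((z * sigma / half_width) ** 2)

    # Example: measurement std. dev. 0.05 units, desired 95% CI of ±0.02:
    print(sample_size_for_mean(sigma=0.05, half_width=0.02))  # -> 25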
40
-------
OUTLIERS PROCESSING
The problem of outliers is a difficult one, especially in
small sample cases where the only basis for rejecting outliers
is the small number of samples which contains the suspected
values. Youden recommends Dixon's approach which is to
compare the gap between the outlier and the nearest value as
a fraction of the total range and to reject the suspected
outlier if the gap is greater than a certain fraction.
(Ref. 1, page 30).
It should be noted that Dixon's approach is a nonparametric
one which is generally not as powerful as a parametric approach.
An alternate approach is not to reject the outliers as detected
but to modify the values of the outliers. Such an amendment
is quite attractive when the sample size is very small or when
one does not know the exact underlying
distribution.
This amendment is called Winsorization (Ref. 42), which can be
demonstrated by an example: consider a small sample, such as
7 labs involved in an interlaboratory study, reporting measure-
ments as follows:
3.0, 4.2, 4.5, 4.7, 4.9, 5.1, 7.9
A Winsorization method with r=1 will make corrections on
the extreme observations as:
4.2, 4.2, 4.5, 4.7, 4.9, 5.1, 5.1
and compute the statistical parameters as usual,
X = 4.67
S = 0.39
Without Winsorization, the results would be:
X = 4.9
S = 1.49
The higher values of the mean and standard deviation are
due to the extreme observations, 3.0 and 7.9.
41
-------
By a similar procedure, one can compare Winsorization with
Dixon's approach, which dictates that the following ratios be
checked:
(7.9 - 5.1)/(7.9 - 3.0) = 0.57
and
(4.2 - 3.0)/(7.9 - 3.0) = 0.24
According to Dixon (Table 8e in Reference 1), a ratio
equal to 0.507 carries a 5% risk of unjust rejection as an
outlier. Since 0.57 is greater than 0.507, one can reject 7.9
as an outlier. But, since 0.24 is less than 0.507, one cannot
reject 3.0. The X and S are then computed based on 6 values,
3.0, 4.2, 4.5, 4.7, 4.9, 5.1
X = 4.4
S = 0.75
This gives a smaller X and larger S than those given by Winsorization.
The important point in doing the Winsorization of data is
that the effects of extreme observations are not completely
thrown out, consequently, the danger of rejecting a lower
estimate is greatly reduced. The efficiency of such a proce-
dure has been shown to be quite high. Moreover, it also
desensitizes the estimate to variations in the tails of the
underlying distribution.
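Both treatments above are easy to state in code. The following
minimal sketch applies Winsorization with r=1 and Dixon's gap-ratio
test to the seven reported values; the 0.507 critical value is the
5% figure quoted from Reference 1.

    import statistics

    def winsorize_r1(values):
        """Winsorization with r=1: pull each extreme observation in to
        the value of its nearest neighbor."""
        v = sorted(values)
        v[0], v[-1] = v[1], v[-2]
        return v

    def dixon_ratios(values):
        """Dixon gap ratios for the highest and lowest observations."""
        v = sorted(values)
        rng = v[-1] - v[0]
        return (v[-1] - v[-2]) / rng, (v[1] - v[0]) / rng

    data = [3.0, 4.2, 4.5, 4.7, 4.9, 5.1, 7.9]

    w = winsorize_r1(data)
    print(round(statistics.mean(w), 2), round(statistics.stdev(w), 2))  # 4.67 0.39

    high, low = dixon_ratios(data)      # 0.57 and 0.24
    print(high > 0.507, low > 0.507)    # True, False: reject only 7.9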
As stated earlier, a parametric outlier detection method
is more powerful than a nonparametric one. For example, Grubbs'
method is found to be more desirable by NIOSH, since log
normality has been established for the data (Ref. 13). This
method essentially uses t statistics: the maximum of the
absolute t values is found and compared with an established
table of critical values. Outliers are subsequently detected
and rejected. For an example of this method, see Section 9.
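A minimal sketch of a one-sided Grubbs test follows; it is an
illustration rather than the NIOSH procedure itself, the t-based
critical-value formula is the standard one, and scipy is assumed to
be available. For log-normal data the values would be
log-transformed first.

    import math
    import statistics
    from scipy.stats import t

    def grubbs_statistic(values):
        """G = max |x - mean| / s, the largest absolute deviation in
        standard-deviation units."""
        m, s = statistics.mean(values), statistics.stdev(values)
        return max(abs(x - m) for x in values) / s

    def grubbs_critical(n, alpha=0.05):
        """Standard one-sided Grubbs critical value from the t distribution."""
        tv = t.ppf(1 - alpha / n, n - 2)
        return (n - 1) / math.sqrt(n) * math.sqrt(tv**2 / (n - 2 + tv**2))

    data = [3.0, 4.2, 4.5, 4.7, 4.9, 5.1, 7.9]
    print(grubbs_statistic(data) > grubbs_critical(len(data)))  # True: flag 7.9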
The general procedure is illustrated in Figure 7-4, and
a discussion of it is contained in Appendix C, Reference 4.
42
-------
Figure 7-4. Normality test and outlier treatment. (Flow chart: the
data are first tested for normality by the Kolmogorov-Smirnov
method; if normality holds, the statistic (outlier - mean) is
computed and tested parametrically; otherwise the suspected outlier
is tested by the Dixon method.)
METHODS FOR LABORATORY TRAINING AND DATA EVALUATION
Based on the literature survey and data management discussed
so far, an outline of the two-phase program is presented here to
conduct interlaboratory tests for water quality and effluent
measurements. The first phase is a training program in which the
laboratories involved are subjected to quality control training
so that lab errors, including precision and accuracy errors,
are minimized. The training procedure may consist of field
visits (inspection) by qualified personnel, a survey of perfor-
mance, and a sample testing program.
A. Laboratory Visits - To be carried out by qualified inspec-
tors to determine the quality of the equipment, procedures,
results, and personnel involved in performing the experi-
ments. Judgments should be made as to the degree of
43
-------
compliance with quality standards. Recommendations should
be made for improvement where required. This survey should
cover all the details of the laboratory facilities, measur-
ing methods, techniques of recording, and personnel quali-
fications. The returns from the survey should be tabulated
and analyzed to identify the problem areas as well as the
differences and similarities among the labs. Recommenda-
tions should be made to resolve the differences and the
problems.
B. Sample Testing Program - Formulated samples should be sent
to participating labs for the identification of compounds
and quantitative analysis. Statistical data analysis
should include the evaluation of means, standard deviations,
relative standard deviations (coefficient of variation) and
percent total error. The relative performance rankings
should also be shown to identify those laboratories where
requirement for improvement is indicated.
This training program should be repeated periodically, twice a
year for example, to ascertain that an appropriate quality level
is maintained by all the laboratories.
The procedures of data evaluation should include the follow-
ing steps:
• Data Screening. This is a step to identify and reject
extreme measurements (outliers), which can be done by Dixon's
method for the small-sample case or by the 95% range method, as
used by the Center for Disease Control and by the Pesticides and
Toxic Substances Effects Laboratory, for the large-sample case.
In addition, the Winsorization technique should be consid-
ered as a candidate method. If there is doubt about the
normality of the population, one may apply a histogram, χ²
testing, or the Kolmogorov-Smirnov goodness-of-fit testing
technique before the rejection of outliers is carried out
(Ref. 3-8).
• Computation of Statistical Values. After the outliers are
rejected, one can proceed to calculate the mean, standard
deviation, percent total error, etc. The limits should
also be calculated for various confidence levels.
• Ranking of Lab Performance. This step is done to provide
an indication of the improvements needed by the laboratories
with poor performance. In this step, a set of reference
laboratories might serve as a yardstick for ranking perfor-
mance. The reference laboratories should be ones known to
have high-quality personnel and facilities, and a long his-
44
-------
technique used by Pesticides and Toxic Substances Effects
Lab and the ranking technique suggested by Youden are two
possible candidates.
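As an illustration of the screening-computation-ranking sequence
just outlined, the following minimal sketch ranks laboratories by
the deviation of their reported means from a reference-laboratory
baseline. The lab names and values are hypothetical, and this is a
simplified stand-in for the ranking techniques cited above.

    # Hypothetical reported means for one constituent:
    reported = {"Lab 1": 4.95, "Lab 2": 4.40, "Lab 3": 5.60, "Lab 4": 4.85}
    ref_mean, ref_sd = 4.90, 0.15   # baseline from the reference laboratories

    # Rank labs by |z|, the deviation from the baseline in reference-SD units.
    scores = {lab: abs(x - ref_mean) / ref_sd for lab, x in reported.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    for rank, (lab, z) in enumerate(ranked, 1):
        print(f"{rank}. {lab}: |z| = {z:.1f}")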
The procedures of these two-phase programs of lab training
and data evaluation are summarized in Table 7-1.
TABLE 7-1. LAB TRAINING AND DATA EVALUATION

Lab Training                             Data Evaluation
1. Lab visit                             1. Data Screen
2. Lab survey                            2. Statistical Computations
3. Sample testing                        3. Performance Ranking

Existing Similar Programs                Existing Similar Programs
(i)   PAT Program                        (i)   All the Programs in LAB TRAINING
(ii)  USDA Milk Lab Program              (ii)  EPA, Analytical Reference Service
(iii) Twin Cities Round Robin Program          Reports
(iv)  Public Health Disease Control Lab  (iii) EPA Surveillance and Analysis
(v)   EPA, Research Triangle Park, PTSEL       Divisions (Georgia, Illinois, etc.)
(vi)  Training Program by Env. Health
      Facilities, Cincinnati, Ohio
45
-------
SECTION VIII
INTERLABORATORY TEST
PROGRAM PLAN
The interlaboratory test program for water quality and
effluent measurements is intended to provide a method for the
periodic assessment of the performance of the 22 EPA laboratories
(and potentially 50 state laboratories) which routinely perform
these measurements. The documents, procedures, and statistical
methods which have been reviewed as a part of this study (see
Sections 5 and 6 above), and information obtained in visits to and
correspondence with laboratories engaged in water and waste water
analysis, together form a well-defined background of experience
and practice within which water quality tests have been performed
during the past 20 years. In spite of the effort which has been
applied to the conduct and analysis of these tests, inadequate
emphasis has been directed toward interlaboratory testing suffic-
ient in scope to yield valid conclusions regarding the natural
environment, the degree and extent of local disturbances (indus-
trial and agricultural), and the validity and consistency of
analytical results produced by testing laboratories which measure
these products.
The following paragraphs of this section describe the pro-
gram and its elements in detail. In summary, the program con-
sists of the following major activities:
1. Selection or Designation of Participants - The test program
manager must determine which laboratories are suitable subjects
for proficiency tests. Presumably, the proficiency tests are to
be used as one criterion in initial certification and periodic
recertification of federal, state and local governmental and
private commercial laboratories which routinely perform water
and waste water quality measurements.
2. Test Schedule - The schedule and frequency of tests,
annually, semi-annually or quarterly, will influence the type
and number of samples to be tested. If performed annually, the
test program must cover representative elements and measurements
of all five groups (Demand, Nutrients, Metals, Minerals, Special).
If performed quarterly, the number of samples and measurements
required for each test may be correspondingly reduced.
46
-------
3. Selection and preparation of Samples - For reasons developed
in Section IX below, sample element concentrations should be
selected at levels for which Method Study results are available.
This constraint arises from the need to have a prior statistical
basis for evaluating absolute performance.
4. Preparation of Instructional Material - Participants must
be instructed to perform all tests, and to report numerical
values for each result, unless the element is reported as "not
detected".
5. Mathematical Analysis - Laboratory proficiency will be
measured as a function of the accuracy of each reported result.
Methods are presented for combining the individual results
reported by each laboratory, in order to assess its overall
performance relative to other participating laboratories and
relative to the standards shown to be achievable by referee
laboratories or Method Study results.
6. Report of Test Results - The test program manager must pre-
pare and distribute a report of the test. This report will
describe the overall test results, and it will contain an insert
for each laboratory which identifies its performance. In the
event that the laboratory performed poorly, the insert material
will contain suggestions or instructions for improvement.
PROGRAM SCOPE
A test program of the magnitude defined herein involves the
integrated activities of several organizations within the EPA,
the test program management function, and the participating
laboratories. Within EPA several ongoing programs are related
to the interlaboratory tests, and these include separate studies
on sample preparation and handling, interlaboratory evaluation
protocol, laboratory certification, and the laboratory data
storage and retrieval system (STORET). The interlaboratory test
program management function is a control activity which involves
many interfaces with other EPA organizations as well as with the
participants being tested.
The interrelations among these activities are illustrated in
Figure 8-1. Each activity is described briefly below, and the
major items are discussed in greater detail in the following
sections of this program plan. Heading numbers correspond to the
block identifiers of Figure 8-1.
47
-------
Figure 8-1. Inter-Laboratory Test Program. (Block diagram
relating: 1. U.S. Environmental Protection Agency - 1.1 STORET data
input requirements, 1.2 sample preparation and preservation,
1.3 inter-laboratory evaluation protocol, 1.4 methods evaluation,
1.5 equipment evaluation, 1.6 experiment evaluation, the
certification program, and data to STORET; 2. Inter-Laboratory Test
Program - 2.1 experiment design, standard data formats, sample
requirements and distribution, test requirements, 2.6 established
lab procedures, test stratification, 2.9 test pre-qualification,
2.10 instructional material, 2.11 sample distribution procedure,
2.12 data quality check, 2.13 statistical analysis of data,
2.14 inter-laboratory test report; 3. Participating Laboratories -
internal quality assurance program, lab error measured, facility,
test procedure, inter-laboratory test performed by labs, data
screened by individual laboratories, and comments by labs
concerning experiments.)
-------
U.S. Environmental Protection Agency
STORET Data Input Requirements—
The interlaboratory test program shall be designed to accom-
modate the input data requirements of the STORET environmental
data processing system. These requirements do not have direct
impact upon the design and conduct of the interlaboratory test
program, but the data recording, reporting and analysis formats
and procedures should be constructed to provide a proper input
format to the STORET system.
Sample Preparation—
The EPA test coordinators who are responsible for the prepa-
ration and distribution of test samples will be able to utilize
the results of the Interlaboratory Experimental Design activity
below to determine the quantities and constituents of samples
which must be supplied for each test. These requirements will
supplement the work separately being performed under EPA Contract
No. 68-03-2075.
Interlaboratory Evaluation Protocol—
This activity, EPA RFP CI-74-0412, is intended to develop a
uniform method for the evaluation of laboratory performance. The
interlaboratory test program will provide the instructional
material for each test (Instructional Materials), which will
serve as a basis for evaluating the capability of any specific
laboratory to perform environmental monitoring procedures.
Evaluation—
Three separate activities comprise the test program evalua-
tion function. The first of these, Methods Evaluation, consists
of the ongoing monitoring and periodic revision of standard
test methods and procedures. Similarly, laboratory equipment
used in the tests is evaluated from time to time to assess the
capability of equipment in general use, and to investigate the
capability and limitations of new equipment introduced into the
field.
The third evaluation function is concerned with the experi-
ment itself. The EPA will examine the overall performance of
all participating laboratories, to determine that the levels
expected from the analysis of the experimental design activity
(Test Stratification and Test Prequalification) are achieved.
If the participants uniformly fail to meet the expectation,
then the test itself becomes suspect, and should be redesigned.
Data to STORET—
This function consists of the collection and preprocessing of
interlaboratory test results, in a format suitable for input and
retrieval in the STORET system.
Laboratory Certification—
In the event that the EPA implements a formal certification
49
-------
program for environmental monitoring laboratories, the initial
certification and periodic proficiency reviews may utilize the
individual and collective results of the interlaboratory test
program. Even if a formal program is not undertaken at this time,
the Laboratory Evaluation Protocol (see Interlaboratory Evalua-
tion Protocol) and the interlaboratory test results will provide
a technical basis for establishing minimum performance criteria
for future use.
Interlaboratory Test Program Activities
Interlaboratory Experiment Design—
The scope of this activity is primarily determined by the
overall requirements for laboratory evaluation and certification,
and the number and types of tests and number of participants are
functions of these factors. However, the detailed design shall
be developed from technical criteria which reflect current prac-
tice and the inherent limitations of test methods and laboratory
facilities.
Inputs to Experimental Design—
Current and proposed procedures constitute the main source
of inputs to the design of interlaboratory tests. These include:
Interlaboratory Test Programs - conducted by EPA, USDA, the PAT
Program, PHS, and other agencies.
Data Formats and Handling - the mathematical and statistical
techniques of data acquisition, reporting and analysis.
Sample Preparation and Distribution - including the determination
of constituents and their concentrations, and handling by the
laboratory.
Test Practice Requirement - as a part of instructional materials
to familiarize laboratory personnel with the desired test pro-
cedure.
Sample Requirements--
As determined by the experimental design, requirements for
samples for each test will be established. These will include
the total quantity and number to be supplied to each participat-
ing laboratory (see Sample Preparation).
Test Stratification—
Some interlaboratory tests will require personnel and equip-
ment capabilities beyond the scope of the typical commercial or
local government laboratory, and will be restricted to a smaller
number of laboratories, typically at the state and federal level.
Because of the smaller number of participants, consideration
must be given in the design itself (for example, a factorial de-
sign) to assure that statistics are properly defined. More
50
-------
general tests will involve a larger number of participants, and
will provide a broader data base.
In both cases, effort will be directed toward minimizing the
time and cost associated with each series of tests, and the con-
sequent burden upon the participants.
Test Prequalification—
Each experiment should be prequalified by performance at a
small number of well-qualified laboratories. This achieves two
purposes. First, any deficiencies in the design itself can be
identified before field experiments are undertaken. Second, the
test results of these laboratories will serve as a target or
baseline for the performance expected of the public or private
laboratories to be evaluated.
Instructional Materials—
These materials will include a statement of the nature and
objective of the test, a detailed procedure for handling and
preparation of the test sample and necessary supplies, and a
statement of special precautions or qualifications, if any.
They will also include forms and specifications for data
recording, and for any mathematical operations which the labora-
tory shall follow using the raw data. Similarly, requirements
for descriptive or commentary text will also be specified.
The required scope of instructional material is typified by
the protocols supplied to participants by the HEW Center for
Disease Control, by the FDA milk testing program, and by EPA
Region V Surveillance and Analysis Division.
Sample Distribution Procedure—
The interlaboratory test manager shall be responsible for
specifying and controlling the preparation and distribution of
samples for each test series, in coordination with the activity
of Sample Preparation.
Data Quality Check—
When test data have been received from all participants, the
data shall be reviewed as submitted to assess quality. Before
performing statistical analysis, the test program manager shall
examine and attempt to resolve apparent anomalies. The disposi-
tion of such results shall be based upon the manager's technical
judgment as to their utility in the combined analysis.
Statistical Analysis of Data—
The techniques to be followed in statistical analysis, appro-
priate to the nature and size of the test, will follow from the
test design (Interlaboratory Experiment Design).
51
-------
These have been discussed in detail in Section 7 of this
report by FMC, and are further elaborated below. Typical methods
are contained in the Quality Control handbook published by NIOSH,
and in the Industrial Hygiene Laboratory Accreditation program
of the Center for Disease Control.
Report--
The final report for each test series will incorporate the
data analysis and experiment analysis described above. It will
also include commentary material submitted by the participants
bearing on the test, the procedure, and the individual results.
Participating Laboratories
The evaluation of individual laboratory performance involves
the assessment of all contributions to error in the test results.
These include random and systematic components of accuracy, pre-
cision and laboratory bias. The major contribution to these
errors are discussed below.
Personnel--
Personnel contributions to laboratory error include those
which, at any level, may lead to or result in obtaining and
reporting test data.
At the administrative level, these may include the misal-
location of the personnel, instructions contrary to the required
protocol, and errors in the processing of paperwork.
At the technical level, personnel errors include errors in
procedure, improper use of materials and equipment, faulty
interpretation of instrumentation and of visual test results,
and mistakes in recording or manipulating test data.
Training—
Deficiencies in indoctrination and training, although they
result in personnel errors, can be separately evaluated. They
can be eliminated by proper briefing as to test objectives and
procedural requirements, familiarization with test materials
and equipment, and indoctrination into the required experi-
mental protocol.
Facility—
Facility deficiencies include lack of required supplies
and equipment, proper allocation of space, heating, lighting
and ventilation, and any other environmental condition which
may contribute to experimental error.
Test Procedure--
It is assumed that approved test procedures will be followed
and that no fundamental errors are incorporated into them.
However, apparent ambiguities may occasionally remain, which,
although understood by a properly qualified operator, may lead
52
-------
to experimental error by less qualified personnel. If such
deficiencies occur, they shall be identified and corrected by
the evaluation of results for each experiment.
Laboratory Error—
Laboratory error is the composite of the preceding four
factors. The separate identification of random errors (pre-
cision) and systematic errors (accuracy and laboratory bias) is
the primary objective of the interlaboratory test program. The
frequency and magnitude of these errors can be at least parti-
ally controlled by an active Quality Assurance program operating
within each participating laboratory.
Interlaboratory Test—
The test shall be performed in complete compliance with
the prescribed protocol. However, the laboratories should be
instructed to perform as nearly as possible in their normal
practice. If more than one analyst participates in the test,
individual records should be reported.
Screening of Data—
Errors in reported test results can be evaluated to some
extent in the laboratory itself by means of data screening.
While some errors may remain, such as failure to detect the
presence of a potential constituent, or human error in inter-
preting test data and results, nevertheless each result should
be examined for "reasonableness." If the result appears
anomalous, then the laboratory should repeat the test to verify
the finding, and report the difficulty to the test program
manager.
PREPARATION AND DISTRIBUTION OF SAMPLES
The test samples used in this program will be constituted
into each of the five groups as conventionally defined:
I Demand
II Nutrients
III Metals
IV Minerals
V Special
Three samples will be prepared for each group and the con-
centrations of each constituent shall be chosen to lie:
1. At or below detection limits unless special instru-
mentation and/or extraction processes are employed.
2. At or slightly above detection limits of a well-instru-
mented laboratory. Extraction may be required in some cases.
53
-------
3. Near normal levels reported for surface waters.
Interferences shall be present in some cases.
The concentrations of each constituent shall also be selec-
ted to permit the mathematical analysis of laboratory results by
Youden pairs and the other statistical methods described in
Sections 4 and 6 of this report.
Before the samples are distributed to all participants, six
variants of each sample will be analyzed by a laboratory selected
as a reference. The first two shall be prepared at 50 percent
below the nominal value of each constituent, the second two at
nominal values, and the third two at 50 percent above the nominal
values. These variants may be constituted simply at different
dilutions; however, the dilution shall be performed prior to
sample distribution by the Interlaboratory Test Program Manager.
The purpose of this "prequalification" is the assessment of
ranges in accuracy and precision to be expected from the results
reported later by all participants.
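The variant scheme can be stated concretely. The following minimal
sketch generates the six prequalification variants for a
hypothetical two-constituent sample; the constituents and nominal
concentrations are placeholders, not pilot-program values.

    # Two variants at 50% below nominal, two at nominal, two at 50% above.
    nominal = {"Cu": 12.0, "Pb": 28.0}   # hypothetical, in µg/liter
    variants = [
        {metal: conc * factor for metal, conc in nominal.items()}
        for factor in (0.5, 0.5, 1.0, 1.0, 1.5, 1.5)
    ]
    for i, v in enumerate(variants, 1):
        print(f"variant {i}: {v}")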
The Test Program Manager shall also prepare instructions for
sample storage, distribution, and handling by the reference lab-
oratory and by all participants.
INSTRUCTIONAL MATERIAL
Instructional material used in this program shall consist of
three types. First, introductory information similar to the sample
letter included in Section 6 for the Pilot Test Program will be
provided. This material shall describe the scope and general
requirements of the interlaboratory test activity and the objec-
tives of the tests. Second, detailed instructions shall accompany
each sample, and these shall include a definition of any require-
ment unique to the sample as well as a specification of the one
or more acceptable EPA or other test methods to be used in the
analysis. Finally, instructions shall be included for the acqui-
sition and recording of the test data, and for reporting general
information or comments from the individual laboratories. Comple-
tion schedules shall be specified as required.
PARTICIPATING LABORATORIES
The various EPA and State laboratories which will participate
in these tests are known to possess widely differing capabilities.
Differences are primarily due to availability of instrumentation
and analytical equipment required for measurements near detection
limits. Undoubtedly, there are also differences which are attrib-
utable to personnel skills and training and to in-house admini-
strative and technical (primarily quality control) procedures.
54
-------
When major differences exist among the capabilities of
participating laboratories and sample concentrations are near
trace levels, analysis of the data is made difficult and
confusing. For any interlaboratory study, the sample con-
centrations should be chosen so that they are well within the
normal detection range. In this way, the capabilities of each
laboratory may be properly assessed and its specific deficien-
cies identified.
55
-------
SECTION IX
PILOT PROGRAM
GENERAL
This section contains a summary and description of a pilot
interlaboratory test program whose objective is the validation
and demonstration of the plan discussed in the preceding sections
of this report. Subsequent to FMC's submittal of a pilot program
in draft form, the Environmental Monitoring and Support Laboratory
redirected the scope of the pilot test activity. The following
paragraphs retain the general outline of work and mathematical
methods as originally submitted. However, these have been
modified in several respects, in part because of the nature of
the data used for laboratory evaluation, and in part as the re-
sults of findings developed in the data analysis.
SAMPLE COMPOSITION
Analysis of trace metals was selected for the pilot test
program since the atomic absorption techniques incorporate pro-
cedures which produce results having the greatest precision and
accuracy among the many procedures used in water chemistry.
The selection of the number and concentrations of the trace
metals samples proposed for the pilot program was based upon
several factors. These included the recognized quality of the
select group of laboratories that will be participating in this
pilot program and the need to obtain an objective differentiation
between the minor differences in their analytical ability.
A total of three pairs of samples, Table 9-1, with discrete
differences in concentrations represents a compromise between the
maximum number of samples that can be analyzed without excessive
analytical time on the part of the participating laboratories and
the minimum number of samples required to prove the statistical
concepts.
The samples are paired in each concentration range in order
to detect bias errors using Youden techniques. A relatively
high spread in the concentrations within each pair of samples at
or near the detection limits of the analytical procedure has
been provided, since it is anticipated that the analytical results
will vary widely around the "true value," and this will allow the
determination of bias error.
The 13 trace metals, from a total of 28 covered by standard
methods, were selected with two factors in mind: their potential
hazard to the environment and their potential for interference
in analysis.
TABLE 9-1. SAMPLE COMPOSITIONS FOR PILOT TEST PROGRAM

                         Trace Level      Medium Level     Normal Water
                         Contamination    Contamination    Contamination
                         (µg/liter)       (µg/liter)       (mg/liter)
                         Sample: 1    2   Sample: 3    4   Sample: 5    6
      Metals                  Low High         Low High         Low High
  1.  Aluminum (Al)             7   23          --   --         0.5  0.9
  2.  Arsenic (As)             20   50          75   90          --   --
  3.  Cadmium (Cd)             --   --          50   75         0.3  0.55
  4.  Chromium (Cr)             1    5          --   --         0.8  1.4
  5.  Copper (Cu)               1    5          12   19          --   --
  6.  Iron (Fe)                --   --           5   12         1.4  3
  7.  Mercury (Hg)            0.5    3          25   50          --   --
  8.  Lead (Pb)                12   28          --   --         2.4  3.2
  9.  Manganese (Mn)           --   --          12   19         0.1  0.2
 10.  Nickel (Ni)              --   --          20   50        0.33  0.70
 11.  Selenium (Se)            10   25          50   70          --   --
 12.  Zinc (Zn)                 1    5           7   22          --   --
 13.  Cobalt (Co)              --   --          10   30        0.22  0.35
The concentration range for each set of pairs was selected
as follows:
Pair 1-2: Metal ions at or below the detection limit unless
          special instrumentation is available and/or an
          extraction process is employed.
Pair 3-4: Metal ions at or slightly above the detection limits
          of a well-instrumented laboratory. Extraction may
          be required in some cases.
Pair 5-6: Concentration ranges that are reported in the litera-
          ture as normal levels encountered in surface water.
          Interferences are present in order to detect finite
          differences between laboratory capabilities.
In conjunction with the analysis of these samples by the
participating laboratories, it is essential that either two
reference laboratories analyze 6 replicates or one reference
laboratory analyze 10 replicates of each of these six samples.
The means and standard deviations of these data will be used as
a baseline for performance evaluation.
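The baseline computation itself is simple; the following minimal Python sketch shows it for one sample, with hypothetical replicate values (not from the report):

    # Baseline from reference-laboratory replicates: the mean and the
    # sample standard deviation become the yardstick for all participants.
    import statistics

    replicates = [302.0, 298.5, 305.0, 300.0, 297.0, 304.0]  # ug/liter

    baseline_mean = statistics.mean(replicates)
    baseline_sd = statistics.stdev(replicates)  # n-1 denominator
    print(f"baseline mean = {baseline_mean:.2f}, SD = {baseline_sd:.2f}")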
INSTRUCTIONAL MATERIAL
Instructional material for the pilot test program is con-
tained in the Appendix. This material includes a cover letter
of general instructions, a list of constituents to be analyzed,
test methods to be used, and forms and instructions for record-
ing and processing test data within the laboratory.
Test and reporting examples are included in this Appendix,
and detailed instruction is provided for filling in the required
data and commentary forms.
DATA ANALYSIS
For reasons of economy and convenience the Environmental
Monitoring and Support Laboratory, responsible for the adminis-
tration and management of the program, supplied FMC with the
raw data results submitted by 18 EPA laboratories for "EPA Method
Study 7, Trace Metals", in lieu of analytical results for the
proposed series of tests. These data have been subjected to the
mathematical treatment described in the pilot program plan.
Method Study Procedure
The objectives of this study and the instructions to be followed
by participating laboratories, issued by the National Environ-
mental Research Center, Analytical Quality Control Laboratory,
Cincinnati, Ohio, are reproduced in Appendix 1. Six sample con-
centrates were distributed, and they were to be analyzed for Al,
As, Cd, Cr, Cu, Fe, Mn, Pb, Se and Zn. The laboratories were
instructed to analyze the samples only for those trace metals
regularly analyzed by the laboratory. Not all laboratories
tested for all metals, and some tested only at certain concen-
trations. This procedure is a valid one when its objective is
the evaluation of analytical methods. It complicates the
evaluation of laboratory results when the objective is assess-
ment of laboratory performance. Consequently, the reported re-
sults for only two metals, Cu and Zn, which 16 laboratories re-
ported at all six concentrations, have been used in the analysis
which follows.
Laboratory Data
The individual results reported by 16 laboratories are
shown in Table 9-2. Data for 7 of the 10 metals to be analyzed
are listed in this table. Laboratory entries "Not detected" and
"Not reported" are shown as a zero. Results obtained when using
extraction methods are not differentiated. True values are
listed for each sample group. Samples 1 and 4, 2 and 3, and
5 and 6 are treated as Youden pairs.
Ordered Data and Sample Statistics
Ordered data (the "less than" entries and zeros are ranked among
the lowest) are shown in Tables 9-3 through 9-8 in increasing
rank. The corresponding laboratory numbers are shown in paren-
theses. Sample statistics were computed by deleting the "less
than" entries and zeros, and these values are tabulated below the
ordered data. It is seen that the sample means so computed are
generally biased high compared to true values, caused by a few
extremely high measurements. This also results in high standard
deviations and large ranges. Relative error, defined as the
difference between sample mean and true value, normalized by the
true value, is also shown in the parentheses.
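The ordering and relative-error computation can be illustrated with a short Python sketch; the input values below are the Sample 1 cadmium column of Table 9-2, with None standing in for the "<250" entry:

    # "Less than" entries and zeros rank lowest and are excluded from
    # the statistics; RE = (mean - true value) / true value.
    import statistics

    true_value = 71.0
    reported = [262.4, 0.0, 0.0, 69.0, 55.0, 90.0, 65.0, 70.0,
                70.0, 63.0, 48.0, 74.0, 65.0, 70.0, None, 71.0]

    usable = [x for x in reported if x]       # drop zeros and "less than"s
    mean = statistics.mean(usable)
    relative_error = (mean - true_value) / true_value
    print(f"M = {mean:.2f}, RE = {relative_error:.2f}")  # M = 82.49, RE = 0.16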
Outlier Processing
When evaluating the analytical method, outliers would ordinarily
be screened out to avoid this bias and dispersion.
TABLE 9-2. BASIC STUDY DATA FOR EPA METHOD STUDY 7 (µg/liter)

Sample 1
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2        262.4   190.0   517.5   645.0   570.0   411.0   589.0
   3          0.    295.0   275.0   900.0   400.0     0.    275.0
   5          0.      0.    327.0   874.0   396.0   437.0   306.0
   6         69.0   366.0   293.0   837.0   350.0   428.0   273.0
   7         55.0   180.0   370.0  6700.0   300.0     0.    246.0
   8         90.0   508.0   306.0   740.0   578.0   767.0   341.0
   9         65.0   365.0   272.0   850.0   385.0   408.0   285.0
  10         70.0   400.0   320.0   660.0   370.0   470.0   290.0
  11         70.0   370.0   300.0   850.0   400.0   420.0   300.0
  12         63.0   380.0   279.0   784.0   420.0   440.0   273.0
  13         48.0   360.0   270.0   720.0     0.    340.0   260.0
  14         74.0   392.0   285.0     0.    285.0   465.0   249.0
  15         65.0   380.0   318.0   840.0   325.0   440.0   270.0
  16         70.0   350.0   280.0     0.    400.0   400.0   260.0
  17       <250.0   400.0   300.0   800.0  <500.0   430.0   270.0
  18         71.0     0.    300.0     0.    387.0   394.0   282.0
True Value   71.0   370.0   302.0   840.0   367.0   426.0   281.0

Sample 4
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2        296.5   179.0   540.5   625.0   584.0   450.0   425.0
   3          0.    350.0   300.0   770.0   450.0     0.    310.0
   5          0.      0.    365.0   734.0   183.0   505.0   344.0
   6         77.0   392.0   326.0   697.0   320.0   469.0   300.0
   7         63.0   190.0   290.0  6000.0   280.0     0.    252.0
   8         77.0   568.0   350.0   610.0   520.0   873.0   359.0
   9         73.0   410.0   340.0   720.0   325.0   447.0   320.0
  10         70.0   360.0   300.0   850.0   400.0   450.0   280.0
  11         80.0   400.0   340.0   680.0   290.0   470.0   310.0
  12         72.0   420.0   303.0   651.0   355.0   490.0   293.0
  13         68.0   430.0   320.0   670.0     0.    420.0   305.0
  14         82.0   418.0   310.0     0.    275.0   519.0   257.0
  15         72.0   400.0   345.0   700.0   370.0   480.0   310.0
  16         60.0    40.0   320.0     0.    300.0   450.0   300.0
  17       <250.0   400.0   300.0   700.0  <500.0   450.0   310.0
  18         76.0     0.    328.0     0.    326.0   430.0   310.0
True Value   78.0   407.0   332.0   700.0   334.0   469.0   310.0
TABLE 9-2 (CONTINUED). BASIC STUDY DATA FOR EPA METHOD STUDY 7 (µg/liter)

Sample 2
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2         64.0    21.0   103.2   528.0   170.0    73.0   110.0
   3          0.    130.0   110.0   450.0  <400.0     0.   <150.0
   5          0.      0.    136.0   399.0    68.0    84.0    83.0
   6         13.0    74.0    60.0   355.0    90.0    85.0    55.0
   7          0.     20.0    50.0  3300.0    50.0     0.     22.0
   8         16.0   < 1.0    83.0   340.0     0.     99.0    66.0
   9         17.0    80.0    54.0   390.0   122.0    82.0    70.0
  10       < 10.0    10.0    10.0     0.     70.0  < 10.0  < 10.0
  11         20.0    70.0    60.0   350.0    93.0    90.0    60.0
  12         15.0    83.0    54.0   314.0   100.0    90.0    58.0
  13       < 10.0    70.0    60.0   350.0     0.     79.0    53.0
  14         24.0    75.0    80.0     0.     82.0   100.0    54.0
  15         14.0    85.0    68.0   350.0    71.0    82.0    48.0
  16         10.0    70.0    60.0     0.    100.0    70.0    50.0
  17       <250.0   100.0   100.0   400.0  <500.0    80.0    60.0
  18         14.0     0.     64.0     0.    100.0    78.0    59.0
True Value   14.0    74.0    60.0   350.0   101.0    84.0    56.0

Sample 3
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2         79.6    30.0   126.2   594.0   140.0    72.0   126.0
   3          0.    100.0   100.0   425.0  <400.0     0.   <125.0
   5          0.      0.    143.0   482.0    56.0   107.0    88.0
   6         14.0    94.0    73.0   426.0    70.0   107.0    68.0
   7         15.0    30.0    70.0  4200.0    70.0     0.     40.0
   8         14.0   115.0    87.0   400.0     0.    140.0    79.0
   9         18.0   105.0    67.0   440.0    95.0   105.0    75.0
  10       < 10.0    20.0    10.0     0.     50.0  < 10.0  < 10.0
  11         20.0    90.0    80.0   430.0    70.0   110.0    80.0
  12         19.0   104.0    68.0   406.0    88.0   115.0    72.0
  13         21.0   100.0    66.0   420.0     0.     98.0    76.0
  14         26.0    92.0    82.0     0.     62.0   121.0    63.0
  15         17.0    90.0    82.0   430.0    82.0   106.0    75.0
  16       < 10.0   100.0    70.0     0.    100.0   100.0    60.0
  17       <250.0   100.0  <100.0   500.0  <500.0    90.0    70.0
  18         19.0     0.     78.0     0.     83.0    94.0    75.0
True Value   18.0    93.0    75.0   438.0    84.0   106.0    70.0
TABLE 9-2 (CONTINUED). BASIC STUDY DATA FOR EPA METHOD STUDY 7 (µg/liter)

Sample 5
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2         19.0     0.     20.0    50.0    62.0     0.     98.0
   3         22.0     0.   <100.0  <100.0  <100.0  <400.0  <150.0
   5          0.      0.     29.0    37.0    50.0    15.0    21.0
   6          2.0     8.0     7.0    26.0    40.0    12.0     4.0
   7          0.      0.     15.0   400.0    37.0     0.      0.
   8          7.0   < 1.0    18.0    48.0     0.     27.0    30.0
   9          4.2  < 30.0     7.0    18.0    33.0     9.0    23.0
  10       < 10.0    70.0    50.0   360.0   140.0    80.0    50.0
  11         10.0    10.0    10.0    20.0    39.0    10.0    10.0
  12          2.0    11.0     9.0    25.0    35.0    10.0    11.0
  13       < 10.0     8.0    13.0    38.0     0.      8.0     9.0
  14          2.0     6.0     9.4     0.     38.0     3.5    13.0
  15          1.0     8.0    12.0    30.0    35.0    12.0     4.0
  16          0.      0.      1.2     0.      4.3     0.      0.8
  17       <250.0  <100.0  <100.0  <100.0  <500.0  < 20.0  < 20.0
  18        < 2.0     0.     16.0     0.     50.0    12.0     9.0
True Value    1.4     7.4     7.5    24.0    37.0    11.0     7.0

Sample 6
Lab No.       Cd      Cr      Cu      Fe      Pb      Mn      Zn
   2         23.6     0.     27.8    38.0    36.0     0.     97.0
   3         22.0  <100.0  <100.0  <100.0  <400.0     0.   <150.0
   5          0.      0.     21.0    19.0    32.0    21.0    13.0
   6          3.0    16.0    13.0    13.0    30.0    18.0     8.0
   7          0.      0.     18.0   200.0    30.0     0.      0.
   8          8.0   < 1.0    19.0    54.0     0.     27.0    38.0
   9          1.7  < 30.0     9.5    10.0    32.0    12.0    18.0
  10       < 10.0    90.0    70.0   420.0    90.0   110.0    70.0
  11          0.     20.0    20.0    10.0    21.0    20.0    10.0
  12          7.0    17.0    13.0    12.0    16.0    15.0    13.0
  13         18.0    15.0    17.0  < 10.0     0.     16.0    11.0
  14          2.8    11.0     4.4     0.     25.0    14.0    14.0
  15          3.0    13.0    14.0    10.0    24.0    16.0    20.0
  16          0.      0.      1.0     0.      3.0     0.      1.2
  17       <250.0  <100.0  <100.0  <100.0  <500.0  < 20.0  < 20.0
  18          7.0     0.     18.0     0.     26.0    16.0    10.0
True Value    2.8    15.0    12.0    10.0    25.0    17.0    11.0
TABLE 9-3. SAMPLE STATISTICS: SAMPLE 1

Ordered data (laboratory numbers in parentheses; "less than" entries and
zeros rank lowest):

Cd: <250(17), 0(5), 0(3), 48(13), 55(7), 63(12), 65(15), 65(9), 69(6),
    70(10), 70(16), 70(11), 71(18), 74(14), 90(8), 262(2)
Cr: 0(5), 0(18), 180(7), 190(2), 295(3), 350(16), 360(13), 365(9), 366(6),
    370(11), 380(12), 380(15), 392(14), 400(17), 400(10), 508(8)
Cu: 270(13), 272(9), 275(3), 279(12), 280(16), 285(14), 293(6), 300(11),
    300(17), 300(18), 306(8), 318(15), 320(10), 327(5), 370(7), 517.5(2)
Fe: 0(14), 0(16), 0(18), 645(2), 660(10), 720(13), 740(8), 784(12),
    800(17), 837(6), 840(15), 850(11), 850(9), 874(5), 900(3), 6700(7)
Pb: <500(17), 0(13), 285(14), 300(7), 325(15), 350(6), 370(10), 385(9),
    387(18), 396(5), 400(16), 400(11), 400(3), 420(12), 570(2), 578(8)
Mn: 0(3), 0(7), 340(13), 394(18), 400(16), 408(9), 411(2), 420(11),
    428(6), 430(17), 437(5), 440(15), 440(12), 465(14), 470(10), 767(8)
Zn: 246(7), 249(14), 260(13), 260(16), 270(15), 270(17), 273(12), 273(6),
    275(3), 282(18), 285(9), 290(10), 300(11), 306(5), 341(8), 589(2)

Sample statistics (zeros and "less than" entries deleted; "99% ranks" are
the ordered positions retained after the 99 percent range screen):

                          Cd      Cr      Cu       Fe      Pb      Mn      Zn
Mean, M                 82.49  352.57  313.28  1246.15  397.57  446.43  298.06
Relative error, RE       0.16   -0.05    0.04     0.48    0.08    0.05    0.06
Standard deviation, SD  54.94   84.10   60.16  1640.62   84.84   97.71   81.00
Range, R               214.40  328.00  247.50  6055.00  293.00  427.00  343.00
99% ranks                4-15    3-16    1-15     4-15    3-16    3-15    1-15

Sample statistics after deleting data beyond 99% range ("Dixon ranks" are
the ordered positions retained by Dixon's outlier test):

Mean, M                 67.50  352.57  299.67   791.67  397.57  421.77  278.67
Relative error, RE      -0.05   -0.05   -0.01    -0.06    0.08   -0.01   -0.01
Standard deviation, SD  10.23   84.10   26.47    83.39   84.84   33.46   24.11
Range, R                42.00  328.00  100.00   255.00  293.00  130.00   95.00
Dixon ranks              6-14    3-16    1-14     4-15    3-14    3-15    1-15

Results of Dixon's outlier test:

Mean, M                 68.56  352.57  294.64   791.67  368.17  421.77  278.67
Relative error, RE      -0.03   -0.05   -0.02    -0.06    0.00   -0.01   -0.01
Standard deviation, SD   3.50   84.10   18.63    83.39   43.59   33.46   24.11
Range, R                11.00  328.00   57.00   255.00  135.00  130.00   95.00
TABLE 9-4. SAMPLE STATISTICS: SAMPLE 2 (layout as in Table 9-3)

Ordered data:

Cd: <250(17), <10(13), <10(10), 0(3), 0(5), 0(7), 10(16), 13(6), 14(15),
    14(18), 15(12), 16(8), 17(9), 20(11), 24(14), 64(2)
Cr: <1(8), 0(5), 0(18), 10(10), 20(7), 21(2), 70(13), 70(16), 70(11),
    74(6), 75(14), 80(9), 83(12), 85(15), 100(17), 130(3)
Cu: 10(10), 50(7), 54(9), 54(12), 60(11), 60(6), 60(13), 60(16), 64(18),
    68(15), 80(14), 83(8), 100(17), 103.2(2), 110(3), 136(5)
Fe: 0(10), 0(14), 0(16), 0(18), 314(12), 340(8), 350(15), 350(13),
    350(11), 355(6), 390(9), 399(5), 400(17), 450(3), 528(2), 3300(7)
Pb: <500(17), <400(3), 0(13), 0(8), 50(7), 68(5), 70(10), 71(15), 82(14),
    90(6), 93(11), 100(16), 100(12), 100(18), 122(9), 170(2)
Mn: <10(10), 0(7), 0(3), 70(16), 73(2), 78(18), 79(13), 80(17), 82(15),
    82(9), 84(5), 85(6), 90(12), 90(11), 99(8), 100(14)
Zn: <150(3), <10(10), 22(7), 48(15), 50(16), 53(13), 54(14), 55(6),
    58(12), 59(18), 60(17), 60(11), 66(8), 70(9), 83(5), 110(2)

Sample statistics:

                          Cd      Cr      Cu       Fe      Pb      Mn      Zn
Mean, M                 20.70   68.31   72.01   627.17   93.00   84.00   60.57
Relative error, RE       0.48   -0.08    0.20     0.79   -0.08    0.00    0.08
Standard deviation, SD  15.70   33.54   29.57   843.71   30.95    8.93   19.54
Range, R                54.00  120.00  126.00  2986.00  120.00   30.00   88.00
99% ranks                7-15    4-16    1-16     5-15    5-16    4-16    3-16

Sample statistics after deleting data beyond 99% range:

Mean, M                 15.89   68.31   72.01   384.18   93.00   84.00   60.57
Relative error, RE       0.13   -0.08    0.20     0.10   -0.08    0.00    0.08
Standard deviation, SD   4.11   33.54   29.57    60.62   30.95    8.93   19.54
Range, R                14.00  120.00  126.00   214.00  120.00   30.00   88.00
Dixon ranks              7-15    4-16    1-16     5-14    5-15    4-16    4-15

Results of Dixon's outlier test:

Mean, M                 15.89   68.31   72.01   369.80   86.00   84.00   59.67
Relative error, RE       0.13   -0.08    0.20     0.06   -0.15    0.00    0.07
Standard deviation, SD   4.11   33.54   29.57    39.44   20.16    8.93    9.64
Range, R                14.00  120.00  126.00   136.00   72.00   30.00   35.00
TABLE 9-5. SAMPLE STATISTICS: SAMPLE 3 (layout as in Table 9-3)

Ordered data:

Cd: <250(17), <10(16), <10(10), 0(5), 0(3), 14(8), 14(6), 15(7), 17(15),
    18(9), 19(12), 19(18), 20(11), 21(13), 26(14), 79.6(2)
Cr: 0(5), 0(18), 20(10), 30(2), 30(7), 90(15), 90(11), 92(14), 94(6),
    100(13), 100(16), 100(17), 100(3), 104(12), 105(9), 115(8)
Cu: <100(17), 10(10), 66(13), 67(9), 68(12), 70(16), 70(7), 73(6),
    78(18), 80(11), 82(15), 82(14), 87(8), 100(3), 126(2), 143(5)
Fe: 0(10), 0(14), 0(16), 0(18), 400(8), 406(12), 420(13), 425(3), 426(6),
    430(11), 430(15), 440(9), 482(5), 500(17), 594(2), 4200(7)
Pb: <500(17), <400(3), 0(13), 0(8), 50(10), 56(5), 62(14), 70(6), 70(7),
    70(11), 82(15), 83(18), 88(12), 95(9), 100(16), 140(2)
Mn: <10(10), 0(7), 0(3), 72(2), 90(17), 94(18), 98(13), 100(16), 105(9),
    106(15), 107(5), 107(6), 110(11), 115(12), 121(14), 140(8)
Zn: <125(3), <10(10), 40(7), 60(16), 63(14), 68(6), 70(17), 72(12),
    75(9), 75(15), 75(18), 76(13), 79(8), 80(11), 88(5), 126(2)

Sample statistics:

                          Cd      Cr      Cu       Fe      Pb      Mn      Zn
Mean, M                 23.87   83.57   80.15   762.75   80.50  105.00   74.79
Relative error, RE       0.33   -0.10    0.07     0.74   -0.04   -0.01    0.07
Standard deviation, SD  18.80   31.61   29.55  1083.78   24.12   16.13   18.58
Range, R                65.60   95.00  133.00  3800.00   90.00   68.00   86.00
99% ranks                6-15    3-16    2-16     5-15    5-16    4-16    3-15

Sample statistics after deleting data beyond 99% range:

Mean, M                 18.30   83.57   80.15   450.27   80.50  105.00   70.85
Relative error, RE       0.02   -0.10    0.07     0.03   -0.04   -0.01    0.01
Standard deviation, SD   3.65   31.61   29.55    56.30   24.12   16.13   11.77
Range, R                12.00   95.00  133.00   194.00   90.00   68.00   48.00
Dixon ranks              6-15    3-16    2-13     5-14    5-16    4-16    4-15

Results of Dixon's outlier test:

Mean, M                 18.30   83.57   76.92   435.90   80.50  105.00   73.42
Relative error, RE       0.02   -0.10    0.03    -0.00   -0.04   -0.01    0.06
Standard deviation, SD   3.65   31.61   10.02    31.58   24.12   16.13    7.59
Range, R                12.00   95.00   34.00   100.00   90.00   68.00   28.00
TABLE 9-6. SAMPLE STATISTICS: SAMPLE 4 (layout as in Table 9-3)

Ordered data:

Cd: <250(17), 0(5), 0(3), 60(16), 63(7), 68(13), 70(10), 72(12), 72(15),
    73(9), 76(18), 77(6), 77(8), 80(11), 82(14), 296.5(2)
Cr: 0(5), 0(18), 40(16), 179(2), 190(7), 350(3), 360(10), 392(6),
    400(11), 400(17), 400(15), 410(9), 418(14), 420(12), 430(13), 568(8)
Cu: 290(7), 300(3), 300(10), 300(17), 303(12), 310(14), 320(16), 320(13),
    326(6), 328(18), 340(9), 340(11), 345(15), 350(8), 365(5), 540.5(2)
Fe: 0(14), 0(16), 0(18), 610(8), 625(2), 651(12), 670(13), 680(11),
    697(6), 700(17), 700(15), 720(9), 734(5), 770(3), 850(10), 6000(7)
Pb: <500(17), 0(13), 183(5), 275(14), 280(7), 290(11), 300(16), 320(6),
    325(9), 326(18), 355(12), 370(15), 400(10), 450(3), 520(8), 584(2)
Mn: 0(3), 0(7), 420(13), 430(18), 447(9), 450(2), 450(16), 450(17),
    450(10), 469(6), 470(11), 480(15), 490(12), 505(5), 519(14), 873(8)
Zn: 252(7), 257(14), 280(10), 293(12), 300(6), 300(16), 305(13), 310(3),
    310(15), 310(11), 310(17), 310(18), 320(9), 344(5), 359(8), 425(2)

Sample statistics:

                          Cd      Cr      Cu       Fe      Pb      Mn      Zn
Mean, M                 89.73  354.07  336.09  1108.23  355.57  493.07  311.56
Relative error, RE       0.15   -0.13    0.01     0.58    0.06    0.05    0.01
Standard deviation, SD  62.44  132.18   58.55  1471.11  104.79  112.77   40.36
Range, R               236.50  528.00  250.50  5390.00  401.00  453.00  173.00
99% ranks                4-15    3-16    1-15     4-15    3-16    3-15    1-15

Sample statistics after deleting data beyond 99% range:

Mean, M                 72.50  354.07  322.47   700.58  355.57  463.85  304.00
Relative error, RE      -0.07   -0.13   -0.03     0.00    0.06   -0.01   -0.02
Standard deviation, SD   6.56  132.18   22.12    64.94  104.79   28.67   27.65
Range, R                22.00  528.00   75.00   240.00  401.00   99.00  107.00
Dixon ranks              4-15    3-16    1-15     4-15    3-16    3-15    1-15

Results of Dixon's outlier test:

Mean, M                 72.50  354.07  322.47   700.58  355.57  463.85  304.00
Relative error, RE      -0.07   -0.13   -0.03     0.00    0.06   -0.01   -0.02
Standard deviation, SD   6.56  132.18   22.12    64.94  104.79   28.67   27.65
Range, R                22.00  528.00   75.00   240.00  401.00   99.00  107.00
TABLE 9-7. SAMPLE STATISTICS: SAMPLE 5 (layout as in Table 9-3)

Ordered data:

Cd: <250(17), <10(13), <10(10), <2(18), 0(16), 0(5), 0(7), 1(15), 2(6),
    2(14), 2(12), 4.2(9), 7(8), 10(11), 19(2), 22(3)
Cr: <100(17), <30(9), <1(8), 0(2), 0(3), 0(5), 0(7), 0(16), 0(18), 6(14),
    8(6), 8(13), 8(15), 10(11), 11(12), 70(10)
Cu: <100(3), <100(17), 1.2(16), 7(6), 7(9), 9(12), 9.4(14), 10(11),
    12(15), 13(13), 15(7), 16(18), 18(8), 20(2), 29(5), 50(10)
Fe: <100(3), <100(17), 0(16), 0(14), 0(18), 18(9), 20(11), 25(12), 26(6),
    30(15), 37(5), 38(13), 48(8), 50(2), 360(10), 400(7)
Pb: <500(17), <100(3), 0(13), 0(8), 4.3(16), 33(9), 35(12), 35(15),
    37(7), 38(14), 39(11), 40(6), 50(5), 50(18), 62(2), 140(10)
Mn: <400(3), <20(17), 0(2), 0(7), 0(16), 3.5(14), 8(13), 9(9), 10(11),
    10(12), 12(6), 12(15), 12(18), 15(5), 27(8), 80(10)
Zn: <150(3), <20(17), 0(7), 0.8(16), 4(6), 4(15), 9(13), 9(18), 10(11),
    11(12), 13(14), 21(5), 23(9), 30(8), 50(10), 98(2)

Sample statistics:

                          Cd      Cr      Cu      Fe      Pb      Mn      Zn
Mean, M                  7.70   17.29   15.47   95.64   46.94   18.05   21.75
Relative error, RE       4.50    1.34    1.06    2.98    0.27    0.64    2.11
Standard deviation, SD   7.84   23.30   12.02  141.26   32.32   21.36   26.47
Range, R                20.90   64.00   48.80  382.00  135.70   76.50   97.20
99% ranks                8-16   10-16    3-15    6-16    5-15    6-15    4-15

Sample statistics after deleting data beyond 99% range:

Mean, M                  7.70   17.29   12.82   95.64   38.48   11.85   15.40
Relative error, RE       4.50    1.34    0.71    2.98    0.04    0.08    1.20
Standard deviation, SD   7.84   23.30    7.03  141.26   14.30    6.14   13.85
Range, R                20.90   64.00   27.80  382.00   57.70   23.50   49.20
Dixon ranks              8-16   10-15    3-15    6-14    6-15    7-14    4-14

Results of Dixon's outlier test:

Mean, M                  7.70    8.50   12.82   32.44   41.90   11.00   12.25
Relative error, RE       4.50    0.15    0.71    0.35    0.13    0.00    0.75
Standard deviation, SD   7.84    1.76    7.03   11.56    9.19    2.20    8.96
Range, R                20.90    5.00   27.80   32.00   29.00    7.00   29.20
TABLE 9-8. SAMPLE STATISTICS: SAMPLE 6 (layout as in Table 9-3)

Ordered data:

Cd: <250(17), <10(10), 0(5), 0(11), 0(16), 0(7), 1.7(9), 2.8(14), 3(6),
    3(15), 7(12), 7(18), 8(8), 18(13), 22(3), 23.6(2)
Cr: <100(3), <100(17), <30(9), <1(8), 0(2), 0(5), 0(16), 0(7), 0(18),
    11(14), 13(15), 15(13), 16(6), 17(12), 20(11), 90(10)
Cu: <100(3), <100(17), 1(16), 4.4(14), 9.5(9), 13(12), 13(6), 14(15),
    17(13), 18(7), 18(18), 19(8), 20(11), 21(5), 27.8(2), 70(10)
Fe: <100(3), <100(17), <10(13), 0(16), 0(14), 0(18), 10(9), 10(11),
    10(15), 12(12), 13(6), 19(5), 38(2), 54(8), 200(7), 420(10)
Pb: <500(17), <400(3), 0(13), 0(8), 3(16), 16(12), 21(11), 24(15),
    25(14), 26(18), 30(7), 30(6), 32(5), 32(9), 36(2), 90(10)
Mn: <20(17), 0(3), 0(7), 0(16), 0(2), 12(9), 14(14), 15(12), 16(13),
    16(15), 16(18), 18(6), 20(11), 21(5), 27(8), 110(10)
Zn: <150(3), <20(17), 0(7), 1.2(16), 8(6), 10(11), 10(18), 11(13),
    13(5), 13(12), 14(14), 18(9), 20(15), 38(8), 70(10), 97(2)

Sample statistics:

                          Cd      Cr      Cu      Fe      Pb      Mn      Zn
Mean, M                  9.61   26.00   18.99   78.60   30.42   25.91   24.86
Relative error, RE       2.43    0.73    0.58    6.86    0.22    0.52    1.26
Standard deviation, SD   8.38   28.37   16.20  133.31   20.73   28.18   27.93
Range, R                21.90   79.00   68.90  410.00   87.00   98.00   95.80
99% ranks                7-16   10-16    3-15    7-16    5-15    6-15    4-15

Sample statistics after deleting data beyond 99% range:

Mean, M                  9.61   26.00   15.06   78.60   25.00   17.50   18.85
Relative error, RE       2.43    0.73    0.26    6.86    0.00    0.03    0.71
Standard deviation, SD   8.38   28.37    7.12  133.31    9.23    4.28   18.40
Range, R                21.90   79.00   26.70  410.00   33.00   15.00   68.80
Dixon ranks              7-16   10-15    3-15    7-14    6-15    6-15    4-13

Results of Dixon's outlier test:

Mean, M                  9.61   15.33   15.06   20.75   27.20   17.50   11.82
Relative error, RE       2.43    0.02    0.26    1.08    0.09    0.03    0.07
Standard deviation, SD   8.38    3.14    7.12   16.43    5.96    4.28    5.24
Range, R                21.90    9.00   26.70   44.00   20.00   15.00   18.80
For laboratory performance evaluation any result, however far
removed it may be from its nearest neighbor or the centroid of
the data, must be retained to arrive at a score for each labora-
tory. Therefore, the raw data at this point were not subjected
to Dixon's or Grubbs' procedures for outlier rejection. A less
restrictive screening procedure was applied, namely rejection of
values lying outside a 99 percent range centered about the sample
mean. The ranks of the ordered results retained after this
procedure are tabulated for each element. Using only the retained
values, the sample mean and standard deviation were recomputed,
and these values are used in the Thompson ranking procedure
described below.
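This screen is easily expressed in code. The Python sketch below assumes the 99 percent range means mean ± 2.576 standard deviations (the two-sided 99 percent normal interval); the report does not spell out the multiplier, so that constant is an editorial assumption:

    # Reject values outside a 99 percent range centered on the sample
    # mean, then recompute the statistics from the retained values.
    import statistics

    def screen_99(values):
        m, sd = statistics.mean(values), statistics.stdev(values)
        lo, hi = m - 2.576 * sd, m + 2.576 * sd
        return [x for x in values if lo <= x <= hi]

    data = [48, 55, 63, 65, 65, 69, 70, 70, 70, 71, 74, 90, 262.4]  # Cd, Sample 1
    kept = screen_99(data)                      # drops 262.4
    print(statistics.mean(kept), statistics.stdev(kept))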
For purposes of comparison, Dixon's outlier processing was
also performed. The integral numbers again indicate the ordered
positions within which the data satisfy the Dixon criterion; in
other words, the data beyond those ordered positions can be
treated as outliers at the 95% confidence level. Statistical
computations carried out after this outlier processing are shown
in the same tables, and they produce relative errors similar to
those resulting from the previous data screening process.
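For illustration, the small-sample form of Dixon's ratio for the highest observation can be sketched as follows; the critical values come from Dixon's published tables (reference 37) and are not reproduced here:

    # Dixon's r10 ratio for the largest value of an ordered sample.
    def dixon_r10_high(sorted_values):
        x = sorted_values
        return (x[-1] - x[-2]) / (x[-1] - x[0])

    data = sorted([48, 55, 63, 65, 65, 69, 70, 70, 70, 71, 74, 90])
    r = dixon_r10_high(data)
    print(f"r10 = {r:.3f}")  # compare against the tabled critical value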
It is emphasized that apparent outliers should not be re-
jected indiscriminately, or at the same significance level one
would use in method evaluation. One should not be surprised to
find large standard deviations or small clusters (2 or 3 points)
of reported results widely separated from the remainder of the
results. Nothing inherent in the test program precludes grossly
poor performance by several participants. Therefore, these
anomalies were deleted only when it was felt their inclusion
would so bias the computation of the mean and standard deviation
as to penalize good performers in subsequent computations.
Youden Two-Sample Plot and Bias Error
Youden two-sample plots were generated by computer and
plotter for three sample pairs: samples 1 and 4, 2 and 3, and 5
and 6. Results for Cu and Zn were plotted, Figures 9-1 through
9-6. The true values of the two samples in each plot are
signified by intersecting lines. It is seen that most laboratories
reported measurements in the first and third quadrants, namely
either both positive errors or both negative errors, and the
points form a significantly elongated ellipse in each plot. The
measurements are biased mostly high at the low concentrations
(samples 5 and 6), which indicates that most laboratories make
positive errors in such cases. Furthermore, there are always one
or two points found far removed from the major cluster, a result
often found by other researchers.
Figure 9-1. Youden's plot, Cu, samples 1 & 4 (F = 11.34).
Figure 9-2. Youden's plot, Cu, samples 2 & 3 (F = 45.23).
Figure 9-3. Youden's plot, Cu, samples 5 & 6 (F = 5.6).
Figure 9-4. Youden's plot, Zn, samples 1 & 4 (F = 5.6).
Figure 9-5. Youden's plot, Zn, samples 2 & 3 (F = 34.64).
Figure 9-6. Youden's plot, Zn, samples 5 & 6 (F = 49.01).
The numerical calculation, the F ratio, suggested by Youden
was also carried out; its value is indicated along with the title
of each plot and varies from 5.6 to 49.01. Since there are 14
to 16 points plotted in each case, for which the critical F ratio
at the 99% confidence level lies between 3.54 and 3.93, the
calculated F ratios clearly indicate bias error in each case. The
critical F ratios cited above are interpolated from the critical
values given by Youden, i.e., 4.16 at 12 degrees of freedom (DF),
3.70 at 14 DF, and 3.37 at 16 DF. It is concluded, therefore,
that for the elements Cu and Zn, bias errors (also called
systematic errors) are definitely present (at the 99% confidence
level). One may further separate the variance of the bias errors
(Sb²) from that of the random errors (Sr²) by using the
relationship

    Sd² = 2Sb² + Sr².

However, because the major concern here is the detection of bias
errors and the ranking of laboratory performance, no effort was
made to estimate the magnitude of the random errors.
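One common formulation of the pair F ratio rotates each (sample A, sample B) point 45 degrees and compares the variance along the line of equal bias with the variance across it; the Python sketch below follows that formulation as an editorial illustration, with made-up pair values, and the report's exact procedure is the one given by Youden (references 1 and 17):

    # Youden-type pair F ratio: between-laboratory (systematic) spread
    # along the 45-degree line versus random spread across it.
    import math
    import statistics

    def youden_f(pairs):
        t = [(a + b) / math.sqrt(2) for a, b in pairs]  # along the line
        d = [(a - b) / math.sqrt(2) for a, b in pairs]  # across the line
        return statistics.variance(t) / statistics.variance(d)

    pairs = [(270, 290), (272, 300), (275, 300), (279, 303)]  # illustrative
    print(f"F = {youden_f(pairs):.2f}")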
Relative Errors
As a first step in ranking laboratory performance,
relative errors (R.E.) were used to discern the differences in
laboratory performance. Relative error is defined as

    R.E. = (Measurement - True Value) / True Value,

namely the normalized error. The R.E.'s for each laboratory at
each concentration were computed for Cu and Zn. These values
were averaged over samples 1, 2, and 5 and over samples 4, 3,
and 6 separately. The two average R.E.'s are treated as co-
ordinate values to be plotted as a Youden two-sample plot. Such
a distribution is shown in Figures 9-7 and 9-8 for Cu and Zn,
respectively. Similar features are found in these two plots:
(a) heavy distribution in the I and III quadrants,
(b) points forming an elongated ellipse, and
(c) one or two points far removed from the center of the
    cluster.
Thompson's Ranking Method
This method provides a quantitative measure of laboratory
performance. It accounts for both errors in measurement and
errors in identification. In this study, it was found that the
Thompson score for measurement error can be modified to better
represent the laboratory performance.
Figure 9-7. Relative errors distribution, Cu (average R.E. of samples
4, 3, 6 plotted against average R.E. of samples 1, 2, 5).
Figure 9-8. Relative errors distribution, Zn (average R.E. of samples
4, 3, 6 plotted against average R.E. of samples 1, 2, 5).
This modification is based on the observation that when the
concentration level is low, the measurements tend to be biased
on the high side and to have a large sample standard deviation
(SD). Since the score is computed by dividing the absolute
difference between the measurement (X) and the true value (TV)
by the sample standard deviation, |X - TV|/SD, the magnitude of
the error is reduced compared to that divided by TV, |X - TV|/TV.
That is to say,

    |X - TV| / SD  <  |X - TV| / TV        when TV < SD.

As a result, Thompson's score under-represents the significance
of the error at low concentrations. On the other hand, at high
concentrations the measurements cluster more closely around TV
and SD < TV, reversing the above inequality. In this case,
Thompson's score does represent the significance of the error.
A modified Thompson's quantification score (PQ) is there-
fore used to rank the laboratory performance, expressed by the
following equation:

    PQ = 100 - (100/m) SUM |Xi - TV| / min(SD, TV),

where m is the number of measurements by each lab and the
denominator is the minimum of SD and TV. The identification
score (PI) is still the same:

    PI = 100 - (100/m) N,    N = number of missed elements.
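A minimal Python sketch of the two scores follows; the per-measurement inputs are hypothetical, and the divisor m = 12 (two elements at six concentrations) is inferred from the scores reported in Table 9-9:

    # Modified Thompson quantification and identification scores.
    def quantification_score(measurements):
        """measurements: list of (x, true_value, sample_sd) triples."""
        m = len(measurements)
        penalty = sum(abs(x - tv) / min(sd, tv) for x, tv, sd in measurements)
        return 100.0 - (100.0 / m) * penalty

    def identification_score(n_missed, m=12):
        """Dock 100/m points for each missed element determination."""
        return 100.0 - (100.0 / m) * n_missed

    print(identification_score(2))  # 83.33, as reported for labs 7 and 10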
Table 9-9 gives the results of such a scoring system. The
full score is 100 points in each case. The total score shown in
the table is the sum of the two individual scores; a full score
in this case is 200 points. It is seen that laboratories 6, 9,
11, 12, 13, 14, 15, 16, and 18 all have scores above 190, whereas
the rest (2, 3, 5, 7, 8, 10, and 17) have scores below 190. The
highest score is achieved by laboratory 6 and the lowest by
laboratory 3.
TABLE 9-9. RESULTS OF THOMPSON'S SYSTEM

Lab No.   Quantification Score   Identification Score   Total Score
   2             29.96                 100.00              129.96
   3             44.75                  50.00               94.75
   5             81.01                 100.00              181.01
   6             97.49                 100.00              197.49
   7             68.61                  83.33              151.94
   8             84.10                 100.00              184.10
   9             93.02                 100.00              193.02
  10             50.16                  83.33              133.49
  11             95.55                 100.00              195.55
  12             95.05                 100.00              195.05
  13             94.46                 100.00              194.46
  14             91.04                 100.00              191.04
  15             94.84                 100.00              194.84
  16             91.86                 100.00              191.86
  17             54.80                  58.33              113.13
  18             96.46                 100.00              196.46
A performance ranking according to the modified Thompson's
score is shown below, with a dotted line separating the two
groups.

Lab No.    Score    Rank
   6       197.5      1
  18       196.5      2
  11       195.6      3
  12       195.1      4
  15       194.8      5
  13       194.5      6
   9       193.0      7
  16       191.9      8
  14       191.0      9
.........................
   8       184.1     10
   5       181.0     11
   7       151.9     12
  10       133.5     13
   2       130.0     14
  17       113.1     15
   3        94.8     16
Youden's Ranking Method
The Youden performance ranking method requires that the
measurements first be ordered and scores assigned to each
laboratory according to its position after the ordering. In the
following computation, a score of one is assigned to the
laboratory reporting the lowest value, a score of two to the
laboratory reporting the second lowest, and so forth. If there
is a T-way tie of measurements, an equal score of S + (T - 1)/2
is given to each of the tied laboratories, where S is the score
that would have been assigned to the lowest of them. For
example, if the measurements after ordering are

    15, 22, 22, 22, 24, 26, 27, 33

then the corresponding scores will be

    1, 3, 3, 3, 5, 6, 7, 8

where the score 3 is computed from 2 + (3 - 1)/2 = 3.
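The tie rule is mechanical enough to sketch in a few lines of Python; this editorial illustration reproduces the worked example above:

    # A T-way tie starting at ordered position S receives the shared
    # score S + (T - 1)/2 for each tied laboratory.
    from collections import Counter

    def youden_rank_scores(values):
        ordered = sorted(values)
        first = {}  # value -> 1-based position of its first occurrence
        for i, v in enumerate(ordered, start=1):
            first.setdefault(v, i)
        ties = Counter(ordered)
        return [first[v] + (ties[v] - 1) / 2 for v in values]

    print(youden_rank_scores([15, 22, 22, 22, 24, 26, 27, 33]))
    # [1.0, 3.0, 3.0, 3.0, 5.0, 6.0, 7.0, 8.0]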
The results of such a ranking scheme are given in Table
9-10, where performance rankings were first computed for the
elements (Cu and Zn) and for the samples (1 to 6) separately.
It is seen that such a breakdown generates less definitive
ranking results.
TABLE 9-10. RESULTS OF YOUDEN'S RANKING

                          By Sample                      By Element     Total
Lab No.     1      4      2      3      5      6        Cu      Zn    Ranking
   2      32.00  32.00  30.00  31.00  30.00  31.00    90.00   96.00    186.0
   3      12.00  13.00  16.50  15.50   3.50   3.50    38.00   26.00     64.0
   5      28.00  29.00  31.00  31.00  27.00  23.50    90.00   79.50    169.5
   6      14.50  14.50  14.50  14.00  10.00  11.50    41.50   37.50     79.0
   7      16.00   2.00   5.00   9.50  13.00  12.50    46.00   12.00     58.0
   8      26.00  29.00  25.00  26.00  27.00  26.00    75.00   84.00    159.0
   9      13.00  24.50  17.50  14.00  17.50  17.00    30.50   73.00    103.5
  10      25.00   6.00   2.50   3.50  31.00  31.00    51.00   48.00     99.0
  11      22.00  21.50  18.00  24.00  17.00  19.50    58.00   64.00    122.0
  12      11.50   9.00  12.50  13.00  16.00  16.00    30.00   48.00     78.0
  13       4.50  14.50  12.50  15.00  17.50  17.00    37.00   44.00     81.0
  14       8.00   8.00  18.00  16.50  18.00  15.00    45.50   38.00     83.5
  15      17.50  23.00  14.00  21.50  14.50  21.00    63.50   48.00    112.5
  16       8.50  13.00  11.50  10.50   7.00   7.00    31.50   26.00     57.5
  17      14.50  13.00  24.50   8.00   3.50   3.50    29.00   38.00     67.0
  18      19.00  20.00  19.00  19.00  19.50  17.00    59.50   54.00    113.5
A laboratory can perform well on one element but not as
well on the other. However, when the separate rankings are
summed to give a total ranking, the results are quite similar to
those given by the modified Thompson's score. As seen in Table
9-10, when the laboratories are ranked by their distance from
the mean score, they are again separable into two distinct
groups, namely laboratories 6, 9, 10, 11, 12, 13, 14, 15, and
18 on the one hand and laboratories 2, 3, 5, 7, 8, 16, and 17 on
the other. The mean score used in the above computation comes
from averaging the lowest possible score, 12, and the highest
possible score, 192 = 16 x 12; i.e.,

    mean score = 12 + (192 - 12)/2 = 102.
A performance ranking based on the total score is thus given
below, with a dotted line separating the high and low groups:

Lab No.   Separation From Mean Score   Rank
   9                + 1.5                1
  10                - 3                  2
  15                +10.5                3
  18                +11.5                4
  14                -18.5                5
  11                +20                  6
  13                -21                  7
   6                -23                  8
  12                -24                  9
..........................................
  17                -35                 10
   3                -38                 11
   7                -44                 12
  16                -44.5               13
   8                +57                 14
   5                +67.5               15
   2                +84                 16
Although the relative ranking of laboratories within each
group is not the same as by Thompson's method, since the
controlling parameter is bias rather than accuracy, the groupings
are identical except that the classifications of laboratories 10
and 16 are interchanged.
LABORATORY EVALUATION
The interlaboratory test program manager is faced with two
tasks in evaluating the data submitted in proficiency tests.
The first task is the assessment of individual laboratory per-
formance relative to the performance of other laboratories.
It has been shown above that two tools are available: Youden's
ranking method and Thompson's ranking method. The first is a
test for systematic error, and the second is a test for accuracy
(and for precision if more than one aliquot is provided).
The second task to be performed is the assessment of per-
formance as a whole: are all laboratories "good", all "bad", or
some "good" and some "bad"? Criteria for classification should
be as quantitative as possible, and conventional statistical
tests (outlier tests, for example) may not necessarily apply.
Inspection of Ranking Results
The reported analytical results for copper and zinc obtained
from EPA Method Study 7, Trace Metals, described in the preced-
ing paragraphs, exhibit an interesting characteristic.
Scores obtained by the modified Thompson's ranking method
were ordered and plotted in the form of a cumulative probability
distribution, as shown in Figure 9-9. If these points deviate
from a straight line, then one suspects that they do not come
from a normally distributed population. From inspection of this
figure, it is concluded that there are in fact two distinct
populations, one with scores in the range 94 to 184 and the
other with scores in the range 190 to 198.
Hence, the test program manager is inclined to conclude
that those laboratories in the lower category performed poorly,
while those in the higher category performed well. (Refer to
the criteria established by NERC, Research Triangle Park, shown
at the bottom of page 23.)
Suppose, however, that all scores approximated the line
with the steeper slope, with a range of scores between approxi-
mately 190 and 198. Would all of them be "acceptable", or would
some lower fractile be classified "unacceptable"? On the other
hand, if all scores were distributed more nearly approximating
the lower-slope (larger standard deviation) distribution, with a
range, say, between 90 and 160, then what are the acceptance
criteria? The test program manager would probably conclude
either that all laboratories are "bad" or that there is some-
thing wrong with the design of the test program. A potential
resolution of this dilemma is shown in Figure 9-10.
In this figure, the mean score of (M - n) laboratories is
plotted as a function of n, where n is the number of lowest-
scoring laboratories deleted from the calculation of the mean
score. The mean score is normalized, from these data, as S/200.
If the remaining mean tends to converge on S/200 = 1.0, then
obviously some laboratories are "good" and some "bad".
Figure 9-9. Thompson's ranking scores for 16 laboratories, plotted as a
cumulative probability distribution (score scale 100 to 200; cumulative
probability scale 0.01 to 99.99 percent).
Figure 9-10. Mean score of (M - n) laboratories (M = 16) versus n, the
number of lowest laboratories deleted; the ratio of the score of the
(n + 1)th lowest laboratory to that mean is also plotted.
If the convergence value is appreciably smaller than 1.0, say
0.7, but convergence is evident, then either all laboratories
are "bad" or the method is "bad"; if convergence is not evident,
then probably the test itself is "bad", since one does expect
that, regardless of circumstances, some laboratories will
perform better than others.
Also plotted in this figure is the ratio of the score of the
(n + 1)th lowest laboratory to the mean score with n laboratories
deleted. This curve illustrates the relative contribution of the
(n + 1)th laboratory to the degradation of the mean of the scores
of the remaining laboratories. Both of the curves illustrate that
laboratories 3, 17, 2, 10 and 7 are clearly unsatisfactory
performers and that laboratories 5 and 8 are marginal performers.
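The Figure 9-10 diagnostic reduces to a short loop; the Python sketch below applies it to the Table 9-9 totals as an editorial illustration:

    # Delete the n lowest scores, then track the normalized mean of the
    # remainder; convergence toward 1.0 suggests a "good" subpopulation.
    scores = sorted([129.96, 94.75, 181.01, 197.49, 151.94, 184.10,
                     193.02, 133.49, 195.55, 195.05, 194.46, 191.04,
                     194.84, 191.86, 113.13, 196.46])

    for n in range(8):
        remaining = scores[n:]
        norm_mean = sum(remaining) / len(remaining) / 200.0  # full score 200
        print(n, round(norm_mean, 3))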
Tests of Significance on Scores
Significance (outlier) tests may also be performed on the
laboratory scores. However, the purpose of the test is not to
assess the probability of dispersion from the mean but rather
from the nominal maximum score. Thus, the interval between the
highest score and the next highest, etc., should be examined.
Dixon's test was applied to the interval between laboratory 14
and laboratory 8:

    (X2 - X1) / (Xk-1 - X1) = (191.04 - 184.10) / (196.46 - 184.10) = 0.561

and the critical value of 0.447 at the 0.05 (95%) significance
level was exceeded.
Using the method of ASTM D 2777,

    T = (Xbar - Xn) / S = (193.387 - 184.10) / 3.818 = 2.432,

where Xbar and S are computed for the top 10 scores only. This
ratio exceeds the critical value of 2.29 at the 5 percent
significance level. Both of these tests are applied from the
top down. Values of Xbar and S for the entire population of 16
laboratories yield meaningless statistics, since the overall
distribution is clearly non-normal.
By either test, the scores of laboratories 5 and 8 appear
not to belong to the higher population, and their performance
should be rated unacceptable if only two classifications are
used.
These tests of significance are expected to yield results
equivalent to the method outlined in "Proposal Performance
Evaluation Plan," EMSL-Cincinnati, 5 December 1975. This method
tests each individual laboratory result against an appropriate
t or chi-square statistic, and rates the acceptability of each
result in terms of statistics derived from previous method
studies. This approach was employed by FMC in its initial
treatment of the test data, and it is the preferred method when
adequate statistics are available for the relative standard
deviation, with the sample true value used as the "mean". It was
observed that, in general, a laboratory which did well on one
sample element usually did well on the rest, and an aggregate
evaluation technique, such as the cumulative score of Thompson's
method, tended both to smooth performance variations and to
yield the same result regarding acceptability.
Other Observations
The data evaluated for copper and zinc exhibit another
characteristic which is to be expected: relative errors are
large at low concentrations and small at high concentrations.
Positive bias is generally evident, and this bias is also highest
at low concentrations.
It is not an objective of this study to perform method
evaluation, although such was the intent of the program from
which the data were obtained. Nevertheless, it is observed that
relative errors ranging from about 100 percent at low con-
centrations to 5 or 10 percent at higher concentrations should
be expected.
It was also observed that the data were, for most samples,
normally distributed. After screening for outliers, the data
were tested by Owen's procedure (D. B. Owen, Handbook of
Statistical Tables, Addison-Wesley, 1962, p. 423), which uses
the Kolmogorov-Smirnov one-sample statistic to test goodness of
fit to a distribution. Basically, an empirical distribution
Fn(X) and an assumed continuous cumulative distribution F(X) are
compared, and the largest deviation between them is checked
against a critical value. For the normality test, the cumulative
distribution function F(X) is that of a normal distribution
(closely related to the complementary error function). The
results of the normality tests are mostly positive when the
measurements are tested element by element and sample by sample.
Out of the total of 42 cases, there are only two (Fe, Sample 6
and Mn, Sample 4) which fail the normality test. These two cases
were log-transformed and tested again, but both fail to meet the
log-normality criterion as well. It is concluded that the data
under study are normally distributed except for the two cases
mentioned above.
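A modern equivalent of this check can be sketched with the one-sample Kolmogorov-Smirnov test, fitting the normal distribution to the screened data; note that this is an editorial illustration (it assumes the SciPy library is available), and that critical values differ slightly from Owen's table when the parameters are estimated from the sample:

    # One-sample K-S test of normality on the screened data.
    import statistics
    from scipy import stats

    data = [48, 55, 63, 65, 65, 69, 70, 70, 70, 71, 74, 90]  # screened Cd
    m, sd = statistics.mean(data), statistics.stdev(data)
    d_stat, p_value = stats.kstest(data, "norm", args=(m, sd))
    print(f"D = {d_stat:.3f}, p = {p_value:.3f}")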
SECTION X
REFERENCES
1. Youden, W. J., "Statistical Techniques for Collaborative
Tests," Manual, Association of Official Analytical Chemists,
1969.
2. "Industrial Hygiene Laboratory Accreditation (587)," Manual,
U.S. Dept. of HEW, National Institute for Occupational
Safety and Health, 1974.
3. "Water-Oxygen Demand Number 2, Study No. 21," HEW, Public
Health Service Publication No. 999-WP-26, 1965.
4. "Water Physics No. 1, Report No. 39," EPA, 1971.
5. "Water Trace Elements No. 2, Study No. 26," HEW, 1966.
6. "Water Chlorine (Residual) No. 2, Report No. 40," EPA, 1971.
7. "Water Nutrients No. 2, Study No. 36," HEW, Public Health
Service Publication No. 2019, 1970.
8. "Water Fluoride No. 3, Study No. 33," HEW, Public Health
Service Publication No. 1895, 1969.
9. "Industrial Hygiene Laboratory Accreditation," Division of
Training, National Institute for Occupational Safety and
Health, HEW, May 1974.
10. "Criteria for Accreditation of Industrial Hygiene Laborato-
ries," American Industrial Hygiene Association Brochure.
11. "Evaluation of Round 30 (74-4) Results," Memo, J. H. Cavender,
Chemical Reference Lab., Public Health Service, HEW, Oct. 21,
1974.
12. "Statistical Protocol for Analysis of Data from PAT Samples,"
W. E. Grouse, DLCD, NIOSH, HEW, July 15, 1974.
"Statistical Protocol for the Analysis of the PAT Data,"
W. E. Grouse, DDCD, NIOSH, Public Health Service, HEW,
August 6, 1974.
14. Pierson, R. H. and Fay, E. A., "Guidelines for Interlaboratory
Testing Programs," Report for Analytical Chemists, Presented
at the 135th National Meeting of The American Chemical Society
at Boston.
15. Greenberg, A. E., "Use of Reference Samples in Evaluating
Water Laboratories," Public Health Reports, Vol. 76, No. 9,
September 1961, pp. 783-787.
16. Mandel, J. & Stiehler, R. D., "Sensitivity - A Criterion
for the Comparison of Methods of Tests," National Bureau
of Standards, Vol. 53, No. 3, Sept. 1954, pp. 155-159.
17. Youden, W. J., "Graphical Diagnosis of Interlaboratory Test
Results," Industrial Quality Control, Vol. 15, July - June,
1958-1959, pp. 24-28.
18. Mandel, J. and Linnig, F. J., "Study of Accuracy in Chemical
Analysis Using Linear Calibration Curves," Analytical Chem-
istry, Vol. 29, No. 5, May 1957, pp. 743-749.
19. Linnig, F. J. & Mandel, J., "Which Measure of Precision?"
Analytical Chemistry, Vol. 36, No. 13, Dec. 1964, pp. 25-32.
20. "Statistical Method - Evaluation and Quality Control for the
Laboratory," Training Course Manual in Computational Analysis,
U.S. Department of HEW, Environmental Health Facilities, 1968.
21. Kramer, H. P. & Kroner, R. C., "Cooperative Studies on
Laboratory Methodology," Journal AWWA, May 1959, pp. 607-613.
22. Greenberg, A. E., Thomas, J. S., Lee, T. W., and Gaffey,
W. R., "Interlaboratory Comparisons in Water Bacteriology,"
Journal American Water Works Association, Vol. 59, No. 2,
Feb. 1967, pp. 237-244.
23. Devine, R. F. and Partington, G. L., "Interference of Sulfate
Ion on SPADNS Colorimetric Determination of Fluoride in
Wastewaters," Envir. Science and Tech., Vol. 9, No. 7,
July 1975, pp. 678-679.
24. McFarren, E. F., et al., "Criterion for Judging Acceptability
of Analytical Methods," Analytical Chemistry, Vol. 42, No.
3, Mar. 1970, pp. 358-365.
25. "Development of a System for Conducting Interlaboratory
Tests for Water Quality and Effluent Measurements," Survey
Questionary, FMC.
26. Wernimont, G., "The Design and Interpretation of Interlab-
    oratory Test Programs," ASTM Bulletin, May 1950, pp. 45-58.
27. Greenberg, A. E., "Water Laboratory Approval Program,"
Presented before the Laboratory Section, American Public
Health Association, Nov. 1, 1960.
28. Lee, T. G., "Interlaboratory Evaluation of Smoke Density
Chamber," NBS.
29. McKee, H. C., Childers, R. E., "Collaborative Study of
Reference Method for the Continuous Measurement of Carbon
Monoxide in the Atmosphere," Southwest Research Institute,
Houston, Texas.
30. Bingham, C. D., Whichard, J., "Evaluation of an Interlab-
oratory Comparison Involving Pyrocarbon and Silicon Carbide-
coated Uranium-Thorium Carbide Beads," USAEC New Brunswick
Laboratory, N.J.
31. Lee T. G. and Huggett C., "Interlaboratory Evaluation of
the Tunnel Test (ASTME 84) Applied to Floor Coverings,"
Inst. for Applied Technology, NBS.
32. Weiss, C. M., and Helms, R. W., "The Interlaboratory Preci-
sion Test, An Eight Laboratory Evaluation of the Provisional
Algal Assay Procedure Bottle Test," Chapel Hill Dept. of
Envir. Sciences and Engineering, North Carolina Univ.
33. Merkle, E. J., et al, "Interlaboratory Comparison of Chemical
Analysis of Uranium Mononitride," Lewis Res. Center, NASA.
34. "Cooperative Evaluation of Techniques for Measuring Hydro-
carbons in Diesel Exhaust," Coordinating Research Council,
Inc., N.Y.
35. "Cooperative Evaluation of Techniques for Measuring Hydro-
carbon in Diesel Exhaust, Phase III," Coordinating Research
Council, Inc., N.Y.
36. McKee, H. C., et al, "Collaborative Study of Reference Method
for Determination of Sulfur Dioxide in the Atmosphere
(Pararosaniline Method)," Southwest Research Inst., Houston,
Texas.
37. Dixon, W. J. and Massey, F. J., Jr., Introduction to Statis-
tical Analysis, (New York: McGraw-Hill, 1957).
38. Brownlee, K. A., Statistical Theory and Methodology in
Science and Engineering,(New York: Wiley, 1960).
39. Cochran, W. G. and Snedecor, G. W., Statistical Methods,
    (Ames, Iowa: Iowa State University Press, 1967).
40. Siegel, S., Nonparametric Statistics for the Behavioral
    Sciences, (New York: McGraw-Hill, 1956).
41. Mace, A. E., Sample Size Determination, R. E. Krieger Co.,
    1974, pp. 35-37 and 56-57.
42. David, H. A., Order Statistics, Wiley & Son Inc., 1970,
p. 193.
43. Bennett, C. A. and Franklin, N. L., Statistical Analysis of
Chemistry and the Chemical Industry, Wiley and Son Inc.,
1954, Ch. 8.
44. Barnett, R. N. and Pinto, C. L., "Evaluation of a System
for Precision Control in the Clinical Laboratory," The
American Journal of Clinical Pathology, Vol. 48, No. 2,
1967, pp. 243-247.
45. Copeland, B. E. "Standard Deviation," American Journal
Clinical Pathology, Vol. 27, 1957, pp. 551-557.
46. Greenberg, A. E., et al, "Chemical Reference Samples in
    Water Laboratories," Journal American Water Works Associa-
    tion, Vol. 61, No. 11, Nov. 1969, pp. 599-602.
47. Griffin, D. F., "Systems Control by Cumulative Sum Method,"
American Journal of Medical Technology, Vol. 34, No. 11,
Nov. 1968, pp. 644-650.
48. Sokal, R. R., and Rohlf, J. F., "Biometry; The Principles
and Practice of Statistics in Biological Research," Freeman
and Comp., San Francisco, Calif.
49. Wernimont, G., "Design and Interpretation of Interlaboratory
    Studies of Test Methods," Analytical Chemistry, Vol. 23,
    No. 11, Nov. 1951, pp. 1572-1576.
50. Willits, C. 0., "Standardization of Microchemical Methods
and Apparatus," Analytical Chemistry, Vol. 23, No. 11,
Nov. 1951, pp. 1565, 1567.
51. McArthur, D. S., et al, "Evaluation of Test Procedures,"
Analytical Chemistry, Vol. 26, No. 6, June 1954, pp.
1012-1018.
52. Czech, F. P., "Simplex Optimized Acetylacetone Method for
Formaldehyde," Journal of AOAC, Vol. 56, No. 6, 1973, pp.
1496-1502.
53. Czech, F. P., "Simplex Optimized J-Acid Method for the
Determination of Formaldehyde," Journal of AOAC, Vol. 56,
No. 6, 1973, pp. 1489-95.
54. Byram, K. V. and Krawczyk, D. F., "Management System for an
    Analytical Chemical Laboratory," American Laboratory, Vol.
    5, No. 1, Jan. 1973, pp. 55-62.
55. Bryam, K. V. and Krawczyk, D. F., "The Use of a Management
System in Operating an Analytical Chemical Laboratory,"
Working Paper, EPA, Pacific Northwest Water Laboratory,
Corvallis, Oregon.
56. Table of Contents, 1973 Book of ASTM Standards, Parts 23
and 30.
57. Mandel J. and Paule, R. C., "Analysis of Interlaboratory
Measurements on the Vapor Pressure of Gold," Inst. for
Material Research, NBS.
58. Ku, H. H., "Precision Measurement and Calibration," NBS.
59. "Correlation of Full-Flow Light Extinction Type Diesel
Smokemeters by a Series of Neutral Density Filters,"
Coordinating Research Council, Inc., N. Y.
60. "Instrumental Analysis of Chemical Pollutants, Training
Manual," Office of Water Programs, EPA.
61. "Methods for Organic Pesticides in Water and Wastewater,"
Analytical Quality Control Laboratory, NERC, Cine., Ohio.
62. "Evaluation of Monitoring Methods and Instrumentation for
Hydrocarbon and Carbon Monoxide in Stationary Source Emis-
sions," Walden Research Corp., Camb., Mass.
63. Bohl, D. R., Sellero, D. E., "Statistical Evaluation of
Selected Analytical Procedures," Mound Laboratory,
Miamisburg, Ohio.
64. Mandel, J. and Paul, R. C., "Standard Reference Material:
Analysis of Interlaboratory Measurements on the Vapor
Pressures of Cadmium and Silver. (Certification of Standard
Reference Materials 746 and 748)," Inst. of Material Research
(401 - 937), NBS.
65. McFarren, E. F., et al, "Water Metals No. 4, Study Number
    30. Report of a Study Conducted by Analytical Reference
    Service," Bureau of Disease Prevention and Environmental
    Control, PHS, Cincinnati, Ohio.
66. Lishka, R. J., Parker, J. H., "Water Surfactant No. 3, Study
    Number 32. Report of a Study Conducted by Analytical
    Reference Service," Bureau of Disease Prevention and En-
    vironmental Control, PHS, Cincinnati, Ohio.
67. Arnett, E. M., "A Chemical Information Center Experimental
Station," Pittsburgh Chemical Information Center, Penn.
68. "Proceedings, Joint Conference on Prevention and Control
of Oil Spills," American Petroleum Institute, N.Y.
69. Ekedahl, G., et al, "Interlaboratory Study of Methods for
Chemical Analysis of Water," Journal WPCF, Vol. 47, No.
4, April 1975, pp. 858-866.
70. "Industrial Hygiene Service Laboratory Quality Control,"
Manual, Technical Report No. 78, U.S. Dept. of HEW, National
Institute for Occupational Safety and Health, updated.
71. Lark, P. D., "Application of Statistical Analysis to
Analytical Data," Analytical Chemistry, Vol. 26, No. 11,
Nov. 1954, pp. 1712-1715.
72. "Quality Control in the Industrial Hygiene Laboratory,"
Manual, U.S. Dept. of HEW, National Institute for Occupational
Safety and Health, 1971.
73. Frazier, R. P., et al, "Establish a Quality Control Program
for a State Environmental Lab.," Water & Sewage Works,
May 1974, pp. 54-75.
74. Frazier, R. P., et al, "Establishing a Quality Control Pro-
gram for a State Environmental Laboratory," Water & Sewage
Works, May 1974, pp. 54-57, 75.
75. Hoffmann, R. G. and Waid, M. F., "The Number Plus Method
of Quality Control of Laboratory Accuracy," The American
Journal of Clinical Pathology, Vol. 40, No. 3, Sept. 1963.
76. Barnett, R..N. and Weinberg, M. S. "Absence of Analytical
Bias in a Quality Control Program," The American Journal
of Clinical Pathology, Vol. 38, No. 5, Nov. 1962, pp. 468-
472.
77. Hoffmann, R. G. and Waid, M. E., "The Quality Control to
Laboratory Precision," American Journal of Clinical Pathol-
ogy, 25: 585-594, 1955.
78. Jennings, E. R. and Levey, S., "The Use of Control Charts
in the Clinical Laboratory," American Journal of Clinical
Pathology, 20: 1059-1066, 1950.
79. Nelson, A. C. Jr. and Smith, F., "Guidelines for Development
    of a Quality Assurance Program. Reference Method for Mea-
    surement of Photochemical Oxidants," Research Triangle Inst.,
    Durham, N.C.
80. "Guideline for Development of A Quality Assurance Program.
Reference Method for the Continuous Measurement of Carbon
Monoxide in the Atmosphere," Research Triangle Inst., Re-
search Triangle Park, N. C.
81. Covell, D. F., "Computer-coupled Quality Control Procedure
for Gamma-ray Scintillation Spectrometry," Naval Radiological
Defense Lab., San Francisco, Calif.
82. Harley, J. H. and Volchok, H. L., "Quality Control in Radio-
    chemical Analysis," USAEC Health and Safety Laboratory,
    N. Y.
83. Ballinger, D. G., et al, "Handbook for Analytical Quality
    Control in Water and Wastewater Laboratory," NERC,
    Cincinnati, Ohio.
84. Robert, S., "Laboratory Quality Control Manual," Kerr Water
Research Center, Ada., Oklahoma.
85. "Review of Current Literature on Analytical Methodology and
Quality Control, Number 22," Analytical Methodology Informa-
tion Center, Battelle Columbus Laboratory, Ohio.
86. "FWPCA Method Study 1. Mineral and Physical Analyses,"
Analytical Quality Control Laboratory, Federal Water Pollu-
tion Control Administration, Cine., Ohio.
87. Meiggs, T. O., "Workshop on Sample Preparation Techniques
for Organic Pollutant Analysis Held at Denver, Colorado,
Oct. 2-4, 1973," National Field Investigation Center, Denver,
Colo.
88. Smith, F., et al, "Guideline for Development of a Quality
    Assurance Program, Vol 1. Determination of Stack Gas Velocity
    and Volumetric Flow Rate (Type-S Pitot Tube)," Research
    Triangle Inst., Durham, N. C.
89. Bailey, L. V., Arnett, L. M., "ASP - Analysis of Synthetics
Program for Quality Control Data," Savannah River Laboratory,
DuPont de Nemours (E. I.) and Comp., Aiken, So. Carolina.
90. "Operational Hydromet Data Management System, Design Charac-
teristics," North American Rockwell Information Systems
Comp., Anaheim, Calif.
91. "Design and Operation of An Information Center of Analytical
Methodology," Battelle Memorial Institute, Columbus, Ohio.
92. "Storage and Retrieval of Water Quality Data, Training
Manual," EPA, Washington D. C.
93. Lewinger, K. L. "Studies in the Analysis of Metropolitan
Water Resource Systems, Vol V: A Method of Data Reduction
for Water Resources Information Storage and Retrieval,"
Water Resources and Marine Sciences Center, Cornell University,
Ithaca, N. Y.
94. Proceedings of Conference on "Toward a Statewide Ground
Water Quality Information System" and "Report of Ground
Water Quality Subcommittee, Citizens Advisory Committee,
Governors Environmental Quality Control," Water Resources
Research Center, University of Minnesota, Minneapolis, Minn.
95. Reynolds, H. D., "An Information System for the Management
of Lake Ontario," Cornell University, Ithaca, N. Y.
96. "Transport and the Biological Effects on Molybdenum in the
Environment," Colorado State University, Fort Collins, Colo.
97. Ward, R. C., "Data Acquisition Systems in Water Quality
Management," Colorado State University, Fort Collins, Colo.
98. Steel, T. D., "The Syslab System for Data Analysis of
Historical Water - Quality Records (Basic Program)," Geo-
logical Survey, Washington D. C.
99. Lehmann, E. J., "Automatic Acquisition of Water Quality
Data," National Technical Information Service, Springfield,
Virg.
100. Bulkley, J. W. and Yaffee, S. L., "Factors Affecting Innova-
tion in Water Quality Management: Implementation of the
1968 Michigan Clean Water Bond Issue," Dept. of Civil
Engineering, University of Michigan, Ann Arbor, Mich.
101. Guenther, G., et al, "Michigan Water Resources Enforcement
and Information System," Water Resources Commission, Dept.
of Natural Resources, Lansing, Michigan.
102. "A National Overview of Existing Coastal Water Quality
Monitoring," Interstate Electronics Corp., Anaheim, Calif.
103. Barrow, D. R., "SIDES: STORET Input Data Editing System,"
Surveillance and Analysis Division, EPA, Athens, Georgia.
104. Ho, C. Y., "Thermophysical and Electronic Properties
Information Analysis Center (TEPIAC): A Continuing System-
atic Program on Tables of Thermophysical and Electronic
Properties of Materials," Thermophysical Properties Research
Center, Purdue University, Lafayette, Indiana.
105. Dubois, D. P., "STORET II: Storage and Retrieval of Data
for Open Water and Land Areas," Div. of Pollution Surveil-
lance, Fed. Water Pollution Control Adm., Washington D. C.
106. Conley, W. and Tipton, A. R., "Part I, A Conceptual Model
for a Terrestrial Ecosystem Perturbed with Sewage Effluent,
With Special Reference to the Michigan State University
Water Quality Management Project. Part II, A Personalized
Bibliographic Retrieval Package for Resource Scientists,"
Dept. of Fisheries and Wildlife, Michigan State University,
East Lansing, Mich.
107. Stevens, S. S., Handbook of Experimental Psychology, John
Wiley, 1951, pp. 35 and 1297.
108. Meister, D., "The Problem of Human-Initiated Failures,"
Proceedings, 8th National Symposium on Reliability and
Quality Control, pp. 234-239, Jan. 9, 1964.
109. Meister, D., "Methods of Predicting Human Reliability in
Man-Machine Systems," Human Factors, 6 (6), 1964.
SECTION XI
INTERLABORATORY TEST PROGRAMS
BIBLIOGRAPHY
A. INTERLABORATORY TESTS
ANALYSIS OF INTERLABORATORY MEASUREMENTS ON THE VAPOR PRESSURE
OF GOLD
National Bureau of Standards, Washington, D.C. Institute for
Materials Research
AUTHOR: Paule, Robert C.; Mandel, John
ABSTRACT: A detailed statistical analysis has been made of re-
sults obtained from a series of interlaboratory measurements on
the vapor pressure of gold. The Gold Standard Reference Material
745 which was used for the measurements has been certified over
the pressure range 10^-8 to 10^-3 atm. The
temperature range corresponding to these pressures is 1300-2100 K.
The gold heat of sublimation at 298 K and the associated standard
error were found to be 87,720 ± 210 cal/mol (367,040 ± 900 J/mol).
Estimates of uncertainty have been calculated for the certified
temperature-pressure values as well as for the uncertainties ex-
pected from a typical single laboratory's measurements. A statis-
tical analysis has also been made for both the second and third
law methods, and for within and between laboratory components
of error. Several notable differences in second and third law
errors are observed.
PRECISION MEASUREMENT AND CALIBRATION. SELECTED NBS PAPERS ON
STATISTICAL CONCEPTS AND PROCEDURES
National Bureau of Standards, Washington, D.C.
AUTHOR: Ku, Harry H.
ABSTRACT: This volume is one of an extended series which brings
together the previously published papers, monographs, abstracts,
and bibliographies by NBS authors dealing with the precision
measurement of specific physical quantities and the calibration
of the related metrology equipment. It deals with methodology
in the generation, analysis, and interpretation of precision
measurement data. It contains 40 reprints assembled in 6 sec-
tions: (1) the measurement process; (2) design of experiments
in calibration; (3) interlaboratory tests; (4) functional re-
lationships; (5) statistical treatment of measurement data;
(6) miscellaneous. Each section is introduced by an interpretive
foreword, and the whole is supplemented by abstracts and selected
references.
INTERLABORATORY EVALUATION OF SMOKE DENSITY CHAMBER
National Bureau of Standards, Washington, D.C. Building Research
Division
AUTHOR: Lee, T. G.
ABSTRACT: Results are reported of an interlaboratory (round-robin)
evaluation of the smoke density chamber method for measuring the
smoke generated by solid materials in fire. A statistical
analysis of the results from 10 material-condition combinations
and 18 laboratories is presented. For the materials tested, the
median coefficient of variation of reproducibility was 7.2% under
non-flaming exposure conditions and 13% under flaming exposure
conditions. A discussion of errors and recommendations for im-
proved procedures based on user experience is given. A tentative
test method description is included as an appendix.
COLLABORATIVE STUDY OF REFERENCE METHOD FOR THE CONTINUOUS MEA-
SUREMENT OF CARBON MONOXIDE IN THE ATMOSPHERE (NON-DISPERSIVE
INFRARED SPECTROMETRY)
Southwest Research Institute, Houston, Texas
AUTHOR: McKee, Herbert C.; Childers, Ralph E.
ABSTRACT: Information obtained in the evaluation and collabora-
tive testing of a reference method for measuring the carbon mon-
oxide content of the atmosphere is presented. The method is
based on the infrared absorption characteristics of carbon mon-
oxide, using an instrument calibrated with gas mixtures contain-
ing known concentrations of carbon monoxide. The method as pub-
lished in the appended "Federal Register" article was tested by
means of a collaborative test involving a total of 16 labora-
tories. The test involved the analysis of both dry and humidified
mixtures of carbon monoxide and air over the concentration range
from 0 to 60 mg/cu m. A statistical analysis of the data of 15
laboratories is presented.
EVALUATION OF AN INTERLABORATORY COMPARISON INVOLVING PYROCARBON
AND SILICON CARBIDE-COATED URANIUM-THORIUM CARBIDE BEADS
USAEC New Brunswick Laboratory, New Jersey
AUTHOR: Bingham, C. D.; Whichard, J.
ABSTRACT: An interlaboratory comparison program was conducted
between six chemistry laboratories and three nondestructive assay
laboratories. The material of interest was pyrocarbon- and sili-
con carbide-coated uranium-thorium carbide beads. Accuracy of
uranium and thorium measurements was ascertained by supplying to
the laboratories uranium oxide and thorium oxide samples contain-
ing known quantities. With one exception, the accuracy of the
chemical analysis of uranium was within a range of 0.5% relative
to the prepared value. Within-laboratory precisions ranged from
0.013 to 0.39% RSD for the mixed oxide samples. Chemical assay
of the beads exhibited a range of nearly ±1% (relative) about the
interlaboratory chemical average for uranium content. Within-
laboratory precisions ranged from 0.03 to 0.33% RSD. Some de-
pendence on sample preparation was evidenced. NDA measurements
on mixed oxides showed biases as high as 3% from the prepared
values. Measurements on coated beads were nearly comparable with
chemical measurements in accuracy.
INTERLABORATORY EVALUATION OF THE TUNNEL TEST (ASTM E 84) APPLIED
TO FLOOR COVERINGS
National Bureau of Standards, Washington, D.C. Institute for Ap-
plied Technology
AUTHOR: Lee, T. G.; Huggett, Clayton
ABSTRACT: Results of an interlaboratory evaluation of the ASTM
E 84 tunnel test method involving eleven laboratories and nine
materials, including four carpets, are reported. Data on flame
spread, smoke, and fuel contribution are analyzed statistically.
Selected physical characteristics of each tunnel are tabulated
and compared relative to specifications in the test method. The
between-laboratory coefficient of variation (reproducibility) in
flame spread classification (FSC) was found to range from 7 to
29% for the four carpets and from 18 to 43% for the other ma-
terials tested. The between-laboratory coefficients of variation
for smoke developed and fuel contribution ranged from 34 to 85%
and from 22 to 117% respectively for all materials tested.
THE INTERLABORATORY PRECISION TEST. AN EIGHT LABORATORY EVALUA-
TION OF THE PROVISIONAL ALGAL ASSAY PROCEDURE BOTTLE TEST
North Carolina University, Chapel Hill Department of Environmental
Sciences and Engineering
AUTHOR: Weiss, Charles M.; Helms, Ronald W.
ABSTRACT: In order to establish the validity of an algal assay
procedure for the determination of algal nutrient levels in sur-
face waters, a suitable protocol was designed and followed by
eight laboratories. This group consisted of one government lab-
oratory, four university laboratories and three industrial labor-
atories. The basic procedure was to evaluate by use of the
"bottle" or batch tost the precision and reproducibility of the
growth response of one test organism, Selenastrum capricornutum,
in four media of varying nutrient strength. The medium was
originally defined for the PAAP test and modified slightly in
subsequent evaluations. The test media of this experiment were
all dilutions of the PAAP medium.
INTERLABORATORY COMPARISON OF CHEMICAL ANALYSIS OF URANIUM
MONONITRIDE
National Aeronautics and Space Administration, Lewis Research
Center, Cleveland, Ohio
AUTHOR: Merkle, E. J.; Davis, W. F.; Halloran, J. T.; Graab, J. W.
ABSTRACT: Analytical methods were established in which the
critical variables were controlled, with the result that accept-
able interlaboratory agreement was demonstrated for the chemical
analysis of uranium mononitride. This was accomplished by using
equipment readily available to laboratories performing metallurgi-
cal analyses. Agreement among three laboratories was shown to be
very good for uranium and nitrogen. Interlaboratory precision
of ±0.04 percent was achieved for both of these elements. Oxygen
was determined to ±15 parts per million (ppm) at the 170-ppm
level. The carbon determination gave an interlaboratory preci-
sion of ±46 ppm at the 320-ppm level.
COOPERATIVE STUDIES ON LABORATORY METHODOLOGY
Journal American Water Works Association 51:607 (May 1959)
AUTHOR: Kramer, H. P.; Kroner, R. C.
ABSTRACT: The Analytical Reference Service of the Robert A. Taft
Sanitary Engineering Center is a voluntary association of member
organizations whose purpose is evaluation of methods in sanitary
engineering. Samples are prepared to guarantee, to the extent
possible, the desired concentrations of constituents. One ali-
quot was chosen at random and analyzed in the Sanitary Engineer-
ing Center to assure that no significant errors were made in
sample preparation and to uncover possible difficulties not anti-
cipated during sample design and preparation.
Sample Type I-A was the second water-minerals test sample issued
in approximately two years. The article summarized the results
of studies made on sample Type I-A for calcium, magnesium, hard-
ness, sulfate and chloride, alkalinity, sodium and potassium.
Results obtained indicate that, in contrast to the determination
of alkalinity, those of calcium, magnesium, hardness, sulfate,
chloride, sodium, and potassium can be performed with a high de-
gree of accuracy. The superiority shown by EDTA methods for
hardness, calcium, and magnesium, and of the mercuric nitrate
method for chloride was noted.
INTERLABORATORY COMPARISONS IN WATER BACTERIOLOGY
Journal American Water Works Association 59:237 (February 1967)
AUTHOR: Greenberg, A. E.; Thomas, J. S.; Lee, T. W.; Gaffey,
W. R.
ABSTRACT: The article describes an experiment designed to exam-
ine whether test results differ between laboratories. The model
used was a four-way, partially nested, mixed model analysis of
variance in which laboratories, media, and water samples were
assumed to be fixed effects, and days represented a random sample
of days. The analysis of variance was performed for three sep-
arate interlaboratory comparisons.
This model makes it possible to evaluate main effects and inter-
actions from one analysis. Test conclusions showed that results
in the laboratory in question were acceptable. The results also
showed several interactions which would bear followup.
CHEMICAL REFERENCE SAMPLES IN WATER LABORATORIES
Journal American Water Works Association 61:599 (November 1969)
AUTHORS: Greenberg, A. E.; Moskowitz, N.; Tamplin, B. R.;
Thomas, J.
ABSTRACT: The article reports results of a single analysis of
two water samples containing different ionic concentrations that
were analyzed for the same constituents. Analytical results on
both samples were received from 92 laboratories approved for
chemical work. Youden's procedure for graphical diagnosis of in-
terlaboratory test results was used, with some modifications, to
evaluate the results from each laboratory. Circles defining
acceptable, questionable, and unacceptable results were drawn.
Thirty-eight laboratories had perfect scores; 54 had one or more
unacceptable results; and two of the 54 had no acceptable results.
In 1961, 29 of 63 laboratories reported unacceptable results for
one or more constituents. Results of the current test indicate
there has been no general improvement in the intervening years.
Lack of improvement was associated with an inadequate follow-
up program.
USE OF REFERENCE SAMPLES IN EVALUATING WATER LABORATORIES
Public Health Reports 76:783 (September 1961)
AUTHOR: Greenberg, A. E.
ABSTRACT: A reference sample was used to evaluate sample results
of approved water laboratories. Laboratories were sent replicate
1-gallon samples of water bottled at a water treatment plant
handling surface water. Analyses for calcium, magnesium, sodium,
potassium, alkalinity, chloride, and sulphate were requested.
Analyses were to be made in duplicate. In the sanitation and
radiation laboratory of the state health department, each of
seven chemists analyzed the reference sample to provide basic
information on its composition and the variability of results.
Comparison of the approved laboratory results with those of the
state health department laboratory showed four sources of varia-
tion in approved laboratories: (a) differences between replicate
samples, (b) differences between laboratories, (c) differences
between analysts, (d) differences between methods. A comparison
of individual approved laboratories with all approved laborator-
ies was made using results falling between the mean and ± 1
standard deviation as acceptable; results between ± 1 and ± 2
standard deviations from the mean as acceptable but question-
able; and results outside the limits of ± 2 standard deviations
from the mean as unacceptable. Twenty-nine of 63 participating
laboratories, nearly half, produced unacceptable results
for one or more constituents. In summary, performance of a small
number of laboratories was generally unacceptable. Performance
of a larger number of laboratories was better, but occasionally
unacceptable. With this information, the state health department
laboratory instituted a follow-up program to rectify those lab-
oratories needing improvement.
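The three-band scheme described above is easy to automate. As an
illustration only (the data below are invented, not drawn from the
study), a Python sketch of the classification:

    import statistics

    def classify(results, all_lab_results):
        # Acceptable within 1 standard deviation of the all-laboratory
        # mean, acceptable but questionable between 1 and 2 standard
        # deviations, unacceptable beyond 2, per the abstract above.
        mean = statistics.fmean(all_lab_results)
        sd = statistics.stdev(all_lab_results)
        labels = []
        for r in results:
            z = abs(r - mean) / sd
            labels.append("acceptable" if z <= 1
                          else "questionable" if z <= 2
                          else "unacceptable")
        return labels

    print(classify([10.1, 12.9, 7.0],
                   all_lab_results=[9.8, 10.2, 10.0, 10.4, 9.6]))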
GRAPHICAL DIAGNOSIS OF INTERLABORATORY TEST RESULTS
Industrial Quality Control 15:24 (May 1959)
AUTHOR: Youden, W. J.
ABSTRACT: The article describes a double sample graphical
analysis scheme for diagnosis of errors in interlaboratory test
results. Samples of two different materials are sent to a number
of laboratories which are asked to make one test on each materi-
al. The two materials should be similar and reasonably close
in magnitude for the property evaluated. Diagnosis of the
configuration of points makes possible identification of situa-
tions where more careful description or modification is required,
erratic work, deviations from specified procedure, and prevalence
of constant errors. A method for estimating standard deviation
from test results is described. The graphical procedure facili-
tates presentation of the results in a convincing manner, thus
avoiding statistical computations.
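Youden's scheme reduces to a few lines of code. The following
Python sketch (laboratory labels and results are invented)
classifies each laboratory's point by its quadrant about the
medians and estimates the random-error standard deviation from the
spread of the differences, using Var(A - B) = 2*sigma^2 for
similar materials:

    import statistics

    def youden_diagnosis(results):
        # results: lab id -> (result on material A, result on material B).
        xs = [a for a, _ in results.values()]
        ys = [b for _, b in results.values()]
        mx, my = statistics.median(xs), statistics.median(ys)
        # Points in the (+,+) or (-,-) quadrants suggest constant
        # (systematic) error; the others suggest random error.
        quadrant = {lab: ("systematic" if (a - mx) * (b - my) > 0
                          else "random")
                    for lab, (a, b) in results.items()}
        diffs = [a - b for a, b in results.values()]
        sigma = statistics.stdev(diffs) / 2 ** 0.5
        return quadrant, sigma

    labs = {"L1": (10.2, 15.1), "L2": (9.8, 14.7), "L3": (11.0, 16.2),
            "L4": (10.1, 14.9), "L5": (9.5, 14.0)}
    print(youden_diagnosis(labs))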
B. ANALYTICAL METHODS EVALUATION
COOPERATIVE EVALUATION OF TECHNIQUES FOR MEASURING HYDROCARBONS
IN DIESEL EXHAUST
Coordinating Research Council, Inc., New York
ABSTRACT: A small diesel engine was shipped to 13 laboratories
in succession, and each laboratory measured exhaust hydrocarbon
concentrations by methods of their own choosing. The standard
deviation of the measured concentrations was on the order of 50%
of the median values. Sources of the variation could be true
differences in the exhaust samples from the engine, differences
among laboratories in taking and handling the samples, and dif-
ferences in instrument responses. Differences in sampling among
laboratories appeared to be a major source of the variation.
COOPERATIVE EVALUATION OF TECHNIQUES FOR MEASURING HYDROCARBONS
IN DIESEL EXHAUST, PHASE III
Coordinating Research Council, Inc., New York
ABSTRACT: Earlier cooperative tests indicated that errors in
measuring hydrocarbon concentrations in diesel exhaust were un-
desirably large. To determine sources of the errors and to
eliminate them, additional tests were conducted on one engine at
a central location with twelve continuous hydrocarbon analyzers.
Results of these tests show that with improvements in equipment
and operating techniques, the precision and reliability of hydro-
carbon measurements are satisfactory for current needs.
CORRELATION OF FULL-FLOW LIGHT EXTINCTION TYPE DIESEL SMOKEMETERS
BY A SERIES OF NEUTRAL DENSITY FILTERS
Coordinating Research Council, Inc., New York
ABSTRACT: The project involved testing twenty-four smokemeters
by fourteen laboratories. The same series of four precalibrated
metallic type neutral density filters was used by each laboratory
in performing the static calibration of their diesel smoke mea-
suring systems. The overall result was that essentially the same
detector response was reported by the laboratories although each
was asked to perform the static calibrations as they normally
would. It may be concluded that the smokemeter results under
static conditions are essentially equivalent and that no gross or
consistent discrepancies could be found.
INSTRUMENTAL ANALYSIS OF CHEMICAL POLLUTANTS. TRAINING MANUAL
Environmental Protection Agency, Washington, D.C. Office of Water
Programs
ABSTRACT: The manual was developed for use by students in train-
ing courses of the Water Quality Office, Environmental Protection
Agency. The report discusses gas, liquid, and thin-layer chroma-
tography, atomic and colorimetric spectral analysis, sampling
methods, and instrument design. A special section for pesticide
analysis of soil or water is also included.
METHODS FOR ORGANIC PESTICIDES IN WATER AND WASTEWATER
National Environmental Research Center, Cincinnati, Ohio.
Analytical Quality Control Laboratory
ABSTRACT: The report presents a general discussion, helpful hints
and suggestions, and precautionary measures required for pesti-
cide analysis. Step by step procedures are given for organo-
chlorine pesticides.
EVALUATION OF MONITORING METHODS AND INSTRUMENTATION FOR HYDRO-
CARBONS AND CARBON MONOXIDE IN STATIONARY SOURCE EMISSIONS
Walden Research Corporation, Cambridge, Massachusetts
ABSTRACT: The report reviews the state of the art of monitoring
methods and instruments for carbon monoxide (CO) and hydrocarbons
(HC) in stationary sources. Emissions are characterized from
boilers, municipal incinerators, gray iron foundries, refineries,
and asphalt batching plants. Manual methods for CO and HC de-
termination are discussed, and monitoring instrumentation is re-
viewed. Nondispersive infrared spectroscopy (NDIR), gas chromato-
graphy, and flame ionization detection are evaluated in laboratory
and pilot plant studies. Field evaluations were conducted on the
reported industries. Calibration procedures, accuracy, and some
results are reported. A computer program for data reduction is
included.
C. STATISTICAL ANALYSIS - QUALITY CONTROL
COLLABORATIVE STUDY OF REFERENCE METHOD FOR DETERMINATION OF
SULFUR DIOXIDE IN THE ATMOSPHERE (PARAROSANILINE METHOD)
Southwest Research Institute, Houston, Texas
AUTHOR: McKee, Herbert C.; Childers, Ralph E.; Saenz, Oscar Jr.
ABSTRACT: The report presents information obtained in the evalu-
ation and collaborative testing of a reference method for mea-
suring the sulfur dioxide content of the atmosphere. The tech-
nique is called the pararosaniline dye method or sometimes the
West-Gaeke method. Different variations of this method have been
used extensively by many laboratories since the original publica-
tion in 1956, and it has been found to be reliable and reasonably
free of interferences. Collaborative tests were performed in-
volving a total of eighteen laboratories. A statistical analysis
of the data of fourteen laboratories provided results based on
the analysis of pure synthetic atmospheres using
the 30-min sampling procedure and the sulfite calibration method
prescribed. Results are also presented with respect to the use
of control samples and reagent blank samples, the minimum number
of samples required to establish validity of results within
stated limits, and the statistical evaluation of various steps in-
cluded in the method. The method can give satisfactory results
only when followed rigorously by experienced laboratory personnel.
The publication of the method in the Federal Register, April 30,
1971, as the reference method to be used in connection with Fed-
eral ambient air quality standards for sulfur dioxide is appended.
GUIDELINES FOR DEVELOPMENT OF A QUALITY ASSURANCE PROGRAM.
REFERENCE METHOD FOR MEASUREMENT OF PHOTOCHEMICAL OXIDANTS
Research Triangle Institute, Durham, North Carolina
AUTHOR: Smith, Franklin; Nelson, A. Carl Jr.
ABSTRACT: Guidelines for the quality control of Federal reference
method for photochemical oxidants are presented. These include:
(1) good operating practices; (2) directions on how to assess
data and qualify data; (3) directions on how to identify trouble
and improve data quality; (4) directions to permit design of
auditing activities; and, (5) procedures which can be used to se-
lect action options and relate them to costs. The document is
designed for use by operating personnel.
GUIDELINES FOR DEVELOPMENT OF A QUALITY ASSURANCE PROGRAM.
REFERENCE METHOD FOR THE CONTINUOUS MEASUREMENT OF CARBON MONOXIDE
IN THE ATMOSPHERE
Research Triangle Institute, Research Triangle Park, North Carolina
ABSTRACT: The report has been prepared for the quality control
of ambient air measurements of carbon monoxide. The purpose of
the document is to provide uniform guidance to all EPA monitoring
activities in the collection, analysis, interpretation, presenta-
tion, and validation of quantitative data. The technique used is
non-dispersive infrared (NDIR) spectrometry.
COMPUTER-COUPLED QUALITY CONTROL PROCEDURE FOR GAMMA-RAY
SCINTILLATION SPECTROMETRY
Naval Radiological Defense Lab, San Francisco, California
AUTHOR: Covell, D. F.
ABSTRACT: Long-term stabilization of instrumental performance is
necessary for gamma-ray scintillation spectrometry whether used
for nuclear spectroscopy studies or for radionuclide identifica-
tion and estimation. This requirement is especially important if
high-precision measurements are to be made on a routine basis.
It is proposed to achieve sufficient stabilization through statis-
tical quality control, a technique used to maintain the quality of
output of a process or system. A quality control procedure was
devised which consists of periodic measurement of a current stand-
ard spectrum and comparison of it, on a channel-by-channel basis
on a computer, with a reference standard spectrum. Significant
differences between the two spectra are interpreted as machine
deviations that require correction. As part of the procedure,
values obtained from this measurement are charted so that current
and past performance can be compared easily. This makes possible
a prompt awareness of unusual changes in performance. Applica-
tion of the technique has resulted in improved stability, im-
proved reliability, and reduced maintenance. Approximately 20
minutes of technician time are required per day to apply this
procedure to a single instrument. Less time per instrument is
required when several instruments are simultaneously controlled.
QUALITY CONTROL IN RADIOCHEMICAL ANALYSIS
USAEC Health and Safety Laboratory, New York; Woods Hole Oceano-
graphic Institution, Massachusetts (USA)
AUTHOR: Harley, J. H.; Volchok, H. L.
ABSTRACT: An ideal system of quality control in radiochemical
analysis is described and some data relating to analysis of sea-
water are presented. Several basic factors which affect the
quality of a radiochemical analysis are: the use of proper
standards for calibration; the use of proper counter efficiencies
and backgrounds; the proper determination of radiochemical re-
covery; correction of results for analytical blank; and the con-
tinual checking of the performance of the overall system for ac-
curacy and precision.
HANDBOOK FOR ANALYTICAL QUALITY CONTROL IN WATER AND WASTEWATER
LABORATORIES
National Environmental Research Center, Cincinnati, Ohio.
Analytical Quality Control Laboratory
AUTHOR: Ballinger, D. G.; Booth, R. L.; Midgett, M. R; Kroner,
R. C.; Kopp, J. F.
ABSTRACT: One of the fundamental responsibilities of management
is the establishment of a continuing program to insure the re-
liability and validity of analytical laboratory and field data
gathered in water treatment and wastewater pollution control activ-
ities. This handbook is addressed to laboratory directors,
leaders of field investigations, and other personnel who bear
responsibility for water and wastewater data. Subject matter of
the handbook is concerned primarily with quality control for
chemical and physical tests and measurements. Sufficient informa-
tion is offered to allow the reader to inaugurate, or to rein-
force, a program of analytical quality control which will empha-
size early recognition, prevention and correction of factors
leading to breakdowns in the validity of data.
LABORATORY QUALITY CONTROL MANUAL
Robert S. Kerr Water Research Center, Ada, Oklahoma
ABSTRACT: The Federal Water Pollution Control Administration
(FWPCA) is concerned about laboratory quality and has initiated
a program of improved effort in that direction. The manual deals
with two areas of that program; statistical analytical quality
control and record keeping. The manual describes statistical
techniques as applied to analytical quality control. It is also
concerned with record keeping as it applies to laboratory pro-
cedures and suggests a method of laboratory record keeping that
should satisfy the most severe critic.
REVIEWS OF CURRENT LITERATURE ON ANALYTICAL METHODOLOGY AND
QUALITY CONTROL, NUMBER 22
Battelle Columbus Laboratories, Ohio, Analytical Methodology
Information Center
ABSTRACT: The report is a compilation of current literature in
the field of water pollution methodology. The contents include
physical and chemical methods, biological methods, microbiologi-
cal methods, methods and performance evaluation, and instrument
development.
GUIDELINES FOR DEVELOPMENT OF A QUALITY ASSURANCE PROGRAM.
REFERENCE METHOD FOR THE DETERMINATION OF SULFUR DIOXIDE IN THE
ATMOSPHERE
Research Triangle Institute, Durham, North Carolina
AUTHOR: Smith, Franklin; Nelson, A. Carl Jr.
ABSTRACT: Guidelines for quality control of the Federal refer-
ence method for sulfur dioxide are presented. These include:
(1) good operating practices, (2) directions on how to assess and
qualify data, (3) directions on how to identify trouble and im-
prove data quality, (4) directions to permit design of auditing
activities, (5) procedures for selecting action options and re-
lating them to costs. This document is not a research report.
It is for use by operating personnel.
FWPCA METHOD STUDY 1: MINERAL AND PHYSICAL ANALYSES
Federal Water Pollution Control Administration, Cincinnati, Ohio.
Analytical Quality Control Laboratory
ABSTRACT: Pairs of synthetic water samples were prepared in
three ranges of concentration for pH, specific conductance, total
dissolved solids, total hardness, sodium, potassium, total acidity/
alkalinity, chloride and sulfate for analysis by FWPCA Official
Interim Methods for Chemical Analysis of Surface Waters. Fifty-
one analysts from twenty FWPCA laboratories and five non-FWPCA
laboratories cooperated in this study. A statistical summary of
the results indicates the precision and accuracy values obtain-
able in routine work.
WORKSHOP ON SAMPLE PREPARATION TECHNIQUES FOR ORGANIC POLLUTANT
ANALYSIS HELD AT DENVER, COLORADO ON 2-4 OCTOBER 1973
National Field Investigations Center-Denver, Colorado
AUTHOR: Meiggs, Theodore O.
ABSTRACT: The emphasis of the workshop was placed upon the
problems of sample collection, extraction, and fractionation
prior to detection of the pollutants of interest by the appropri-
ate detection techniques. Wherever possible, methods or pro-
cedures were stressed that were applicable to the analysis for
general classes of organic compounds as opposed to procedures
for individual compound identification. What follows is a sum-
mation of the techniques discussed at the workshop. Many of
these are currently being used by water laboratories to analyze
industrial effluents, natural waters, bottom sediments, and
aquatic biota for industrial and agricultural organic-chemical
pollutants. In addition, some discussion is provided regarding
analytical quality control in the organic laboratory.
GUIDELINES FOR DEVELOPMENT OF A QUALITY ASSURANCE PROGRAM,
VOLUME I. DETERMINATION OF STACK GAS VELOCITY AND VOLUMETRIC
FLOW RATE (TYPE-S PITOT TUBE)
Research Triangle Institute, Durham, North Carolina
AUTHOR: Smith, Franklin; Wagoner, Denny E.; Nelson, A. Carl Jr.
ABSTRACT: The document presents guidelines for developing a
quality assurance program for the determination of stack gas
velocity and volumetric flow rate using a type-S pitot tube.
The introduction lists the overall objectives for a quality as-
surance program and delineates the program components. The oper-
ations manual sets forth recommended operating procedures to as-
sure the collection of data of high quality and instructions for
performing quality control checks. The manual for a field team
supervisor contains directions for assessing data quality on an
intra-team basis and for collecting the information necessary to
detect and/or identify trouble. The manual for manager of groups
of field teams presents information relative to the test method
(a functional analysis) to identify the important operations
variables and factors, and statistical properties of and pro-
cedures for carrying out auditing procedures for an independent
assessment of data quality.
STATISTICAL EVALUATION OF SELECTED ANALYTICAL PROCEDURES
Mound Laboratory, Miamisburg, Ohio
AUTHOR: Bohl, D. R.; Sellers, D. E.
ABSTRACT: A data evaluation study was conducted to evaluate the
precision and accuracy of analytical procedures. Conventional
statistical formulas were used to evaluate the data. The pro-
cedures evaluated statistically were a potentiometric method for
determining iron and uranium, a volumetric titration of nickel,
and the determination of uranium by controlled-potential colori-
metric and potentiometric titration. The accuracy, standard de-
viation and confidence intervals were calculated using historical
data from these procedures.
STANDARD REFERENCE MATERIALS: ANALYSIS OF INTERLABORATORY MEA-
SUREMENTS ON THE VAPOR PRESSURES OF CADMIUM AND SILVER. (CERTI-
FICATION OF STANDARD REFERENCE MATERIALS 746 AND 748)
National Bureau of Standards, Washington, D.C. Institute for
Materials Research (401 937)
AUTHOR: Paule, Robert C.; Mandel, John
ABSTRACT: Detailed statistical analyses have been made of re-
sults obtained from a series of interlaboratory measurements on
the vapor pressures of cadmium and silver. Standard Reference
Materials 746 (cadmium) and 748 (silver) which were used for the
measurements have been certified over the respective pressure
ranges 10^-11 to 10^-4 atm and 10^-12 to 10^-3 atm.
The temperature ranges corresponding to
these pressures are 350-594 K for cadmium and 800-1600 K for
silver. The heats of sublimation at 298 K and the associated
two standard error limits for cadmium and silver are 26660 plus
or minus 150 cal/mol and 68010 plus or minus 300 cal/mol, re-
spectively. Estimates of uncertainty have been calculated for
the certified temperature-pressure values as well as for the un-
certainties expected from a typical single laboratory's measure-
ments. The statistical analysis has also been made for both the
second and third law methods, and for the within- and between-
laboratory components of error. The uncertainty limits are ob-
served as functions of both the heat of sublimation and the
temperature.
WATER METALS NO. 4, STUDY NUMBER 30. REPORT OF A STUDY CONDUCTED
BY ANALYTICAL REFERENCE SERVICE
Public Health Service, Cincinnati, Ohio. Bureau of Prevention
and Environmental Control
AUTHOR: McFarren, Earl F.; Parker, John H.; Lishka, Raymond J.
ABSTRACT: In the study, three samples containing between 0.005
and 5.0 mg per liter of each of nine metals - zinc, chromium,
copper, magnesium, manganese, silver, lead, cadmium, and iron -
were provided. Each participant was requested to do a single
analysis for each of the metals in each of the three samples by
the provided atomic absorption spectrophotometric method. This
method, depending upon the sensitivity of the instrument (burner,
tube, etc.) available, gave the participant a choice of aspirating
the sample directly into the flame or of chelating with ammonium
pyrrolidine dithiocarbamate and extracting into methyl isobutyl
ketone before aspirating. The results obtained were evaluated
in terms of whether the sensitivity of the method was sufficient
to permit the measurement of the metal with a reasonable degree of
precision and accuracy at the concentration prescribed by drinking
water standards.
WATER SURFACTANT NO. 3, STUDY NUMBER 32. REPORT OF A STUDY
CONDUCTED BY ANALYTICAL REFERENCE SERVICE
Public Health Service, Cincinnati, Ohio. Bureau of Disease
Prevention and Environmental Control
AUTHOR: Lishka, Raymond J.; Parker, John H.
ABSTRACT: In the study each participant was shipped three steri-
lized water samples in disposable 1-quart polyethylene containers.
Sample 1 was composed of filtered river water containing 2.94
mg/liter linear alkylsulfonates (LAS). Sample 2 was tap water
containing 0.48 mg/liter LAS. Sample 3 was distilled water con-
taining 0.27 mg/liter LAS. A small amount of methylene blue and
a copy of the procedure were sent with the samples. The data
indicate no difference in methylene blue obtained from many dif-
ferent suppliers. Results from 111 analysts show good accuracy
and precision for all samples.
THE QUALITY CONTROL OF LABORATORY PRECISION
American Journal of Clinical Pathology 25:585 (May 1955).
AUTHOR: Waid, M. E.; Hoffman, R. G.
ABSTRACT: The paper had four purposes: (1) to propose a method
of using data of patients to evaluate the precision of laboratory
procedures; (2) to illustrate the method with data from two gen-
eral hospitals; (3) to fit frequency distribution curves to these
data and illustrate their applicability; and (4) to demonstrate
that the care of many patients may be affected by results that
have been inaccurately standardized.
The best manner for using the method proposed in this paper is
first to run standards of known concentration through the labora-
tory to insure that the laboratory is functioning properly. When
assurance is gained that the laboratory is functioning properly,
then the test results of the clinical specimens run during this
same period may be used to set up the charts.
The steps in the method are: (1) The numerical value of each
test is recorded. (2) All values for a particular test are added
at the end of each day, or other predetermined period of time.
(3) Arithmetic means for each type of test are computed.
(4) The means obtained are plotted as points on a graph.
(5) Probability limits may be computed to be used as guidelines
for the director of the laboratory.
Data on hemoglobin and red cell counts were tabulated for two
general hospital laboratories. In one hospital, the hemoglobin
tests were restandardized during the period covered by the data.
In the other hospital, a suspected change in the hemoglobin level
was seen. In both cases, the ability of the proposed method to
portray these changes was graphically demonstrated.
The effects on medical practices which resulted from the hemo-
globin restandardization were estimated by tabulating the number
of patients who received transfusions. The transfusion rate was
reduced approximately one-half.
Charts similar to those presented in this paper may be used for
the control of any laboratory procedure.
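A minimal Python sketch of steps (1) through (5) (the daily values
are invented, and the probability limits are taken, as one
plausible choice, at two standard deviations of the daily means):

    import statistics

    def daily_mean_chart(daily_values):
        # Steps (1)-(4): record each day's test values and compute each
        # day's arithmetic mean; step (5): derive probability limits
        # (here mean +/- 2 SD of the daily means) for the chart.
        means = [statistics.fmean(day) for day in daily_values]
        center = statistics.fmean(means)
        spread = statistics.stdev(means)
        lower, upper = center - 2 * spread, center + 2 * spread
        flagged = [i for i, m in enumerate(means)
                   if not lower <= m <= upper]
        return means, (lower, upper), flagged

    days = [[13.9, 14.1, 14.0], [14.2, 14.3],
            [13.8, 14.0, 13.9], [15.1, 15.0]]
    print(daily_mean_chart(days))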
ABSENCE OF ANALYTIC BIAS IN A QUALITY CONTROL PROGRAM
The American Journal of Clinical Pathology 38:468 (November 1962)
AUTHOR: Weinberg, M. S.; Barnett, R. N.
ABSTRACT: The article describes an experiment conducted to de-
termine if the analyst produces incorrect results because of
conscious or unconscious bias toward a known value, such as in
a pool which may be used for several months, with all analysts
aware of the anticipated results.
A single batch of pooled serums that had been in use and for
which sufficient data on reliability had been accumulated was
used in the study. Samples of the batch were used in routine
daily quality control. Another sample was introduced as a blind
sample during July and August in such a way as to prevent knowl-
edge of such a sample by analysts.
An additional study was performed during the same period. Each
technologist was instructed to choose one of the routine clinical
samples for duplicate analysis for each determination requested.
In the study of blind versus known quality control serums intro-
duced into routine clinical chemical determinations, no evidence
was found that the analysts achieved a closer approach to the
average known values or a narrower 3 standard deviation range
for the known samples. Values for duplicate determinations of
unknown specimens were always closer than the comparative values
of blind and known controls. The authors concluded that this
was the result of more exact reproduction of analytic conditions
rather than the effect of bias.
BIOMETRY: THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL
RESEARCH
W. H. Freeman and Company, San Francisco
AUTHOR: Sokal, R. R.; Rohlf, F. J.
ABSTRACT: The abstract includes the table of contents and part
of Appendix 3, Statistical Computer Programs. The computer pro-
grams included, with a brief summary of their outputs, are as
follows:
A3.1 Basic statistics for ungrouped data. Output includes:
mean, median, variance, standard deviation, coefficient of vari-
ation, g1, g2, and the Kolmogorov-Smirnov statistic Dmax result-
ing from a comparison of the observed sample with a normal dis-
tribution based on the sample mean and variance - followed by
their standard errors and 100(1 - α)% confidence intervals
where applicable.
A3.2 Basic statistics for data grouped into a frequency distri-
bution. This program is similar to program A3.1, but is intended
for data grouped into a frequency distribution.
A3.3 Goodness of fit to discrete frequency distributions. Op-
tions are provided for the following computations.
(1) Compute a binomial or Poisson distribution with specified
parameters.
(2) Compute the deviations of an observed frequency distribution
from a binomial or Poisson distribution of specified parameters
or based on appropriate parameters estimated from observed data.
A G-test for goodness of fit is carried out.
(3) A series of up to 10 observed frequency distributions may
be read in and individually tested for goodness of fit to a
specified distribution, followed by a test of homogeneity of the
series of observed distributions.
(4) A specified expected frequency distribution (other than
binomial or Poisson) may be read in and used as the expected
distribution. This may be entered in the form of relative ex-
pected frequencies or simply as ratios (for example 1:2:1). The
maximum number of classes for all cases is thirty. In the binomial
and Poisson cases, the class marks cannot exceed Yi = 29.
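As an illustration of the goodness-of-fit computation in program
A3.3 (this is not the book's own code, and the counts below are
invented), the G statistic against a 1:2:1 expectation:

    import math

    def g_test(observed, expected):
        # G = 2 * sum(O_i * ln(O_i / E_i)); compare against chi-square
        # with (number of classes - 1) degrees of freedom.
        return 2 * sum(o * math.log(o / e)
                       for o, e in zip(observed, expected) if o > 0)

    obs = [24, 53, 23]
    exp = [sum(obs) * r / 4 for r in (1, 2, 1)]   # 1:2:1 ratio
    print(g_test(obs, exp))   # small G: no evidence of lack of fit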
SYSTEMS CONTROL BY CUMULATIVE SUM METHOD
American Journal of Medical Technology 34:644 (November 1968)
AUTHOR: Griffen, D. F.
ABSTRACT: The article describes a system for plotting daily con-
trol data that is most useful where the secondary standards render
recovery values on control or reference samples doubtful. The
system involves subtracting an arbitrary target value from the
daily recovery values of the control. Values for successive
days are added algebraically to the previous day's total so a
running difference from the target value is plotted. No actual
confidence limit lines are drawn as parallels to the target or
datum line. An out-of-control condition may be indicated by
six successive climbing or falling plots, or by the cumulative
sum track forming an angle of 45 degrees or more with the datum
line, provided the chart is scaled so that the linear distance
between two successive vertical scale points equals that between
two successive horizontal points and one such vertical scale
segment represents two standard deviations.
Trends and shifts show up much more dramatically under this system
of charting than they do on the usual X or R chart.
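A brief Python sketch of the cumulative-sum track and the
six-in-a-row signal (the recovery values and target are
hypothetical; the 45-degree criterion depends on the physical
chart scaling and is omitted here):

    def cusum_track(recoveries, target):
        # Running algebraic sum of (daily recovery - target value).
        track, total = [], 0.0
        for r in recoveries:
            total += r - target
            track.append(total)
        return track

    def six_in_a_row(track):
        # Out-of-control signal: six successive climbing (or falling)
        # plotted points, i.e., five steps of the same sign.
        for i in range(len(track) - 5):
            steps = [b - a for a, b in zip(track[i:i + 5],
                                           track[i + 1:i + 6])]
            if all(s > 0 for s in steps) or all(s < 0 for s in steps):
                return True
        return False

    track = cusum_track([99.1, 98.9, 98.8, 98.6, 98.5, 98.3, 98.1],
                        target=99.0)
    print(track, six_in_a_row(track))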
ESTABLISHING A QUALITY CONTROL PROGRAM FOR A STATE ENVIRONMENTAL
LABORATORY
Water and Sewage Works, May 1974, pp. 54 ff.
AUTHOR: Frazier, R. P.; Miller, J. A.; Murray, J. F.; Mauzy,
M. P.; Schaeffer, D. J.; Westerhold, A. F.
ABSTRACT: The article describes five phases in development by
the Illinois Environmental Protection Agency of a quality control
program for regional laboratories. The current program is built
around accuracy quality control charts. To develop these charts,
every seventh sample entering the laboratory is divided into two
portions, one of which is spiked by diluting with deionized water.
By comparing new quality control data with that previously re-
corded, laboratory personnel are able to maintain a check on the
analytical process.
At the end of each month, the quality control information con-
sisting of the paired results on the original and spiked samples
is assembled, entered on special data forms, and submitted to
the data processing section for computer analysis. A summary
report is distributed monthly to each of the laboratories. Using
this information, the individual labs can take action for specif-
ic problems, while the division can take action for general problems.
The quality control program also uses externally prepared refer-
ence samples to provide an independent check of the various
analyses.
Several support programs were initiated, including checking the
level of trace contaminants on bottles used for sample collection,
field preservation, a standards group which prepares standards
for the three laboratories, and an internal laboratory certifica-
tion program to determine compliance of laboratories with
procedures.
MANAGEMENT SYSTEM FOR AN ANALYTICAL CHEMICAL LABORATORY
American Laboratory 5(1):55 (January 1973)
AUTHOR: Krawczyk, D. F.; Byram, K. V.
ABSTRACT: The article describes a sample handling and verifica-
tion system (SHAVES) to facilitate managing the analytical lab-
oratory and to keep its records. It standardizes many laboratory
procedures and automates many clerical tasks.
The principal elements of SHAVES are standardization, error check-
ing, data reporting, and cost allocation. The system standard-
izes requests for analyses, recording of field data, reporting of
laboratory analytical data, and limits of accuracy and precision
for given determinations. The analyst must record all factors
used in computing an analytical result, and the computer uses his
factors to check it. The system detects errors in labeling and
reporting of results. Costs for each analysis derived from the
time and supplies required to perform it are available to the
computer. The laboratory manager uses the monthly cost summary
to make adjustments in financial support from programs using the
laboratory. The system is nearly totally effective in detecting
analytical computational errors.
THE NUMBER PLUS METHOD OF QUALITY CONTROL OF LABORATORY ACCURACY
The American Journal of Clinical Pathology 40:263 (September 1963)
AUTHOR: Hoffman, R. G.; Waid, M. E.
ABSTRACT: The number plus method uses clinical values as a
source of quality control information. The procedure first in-
volves obtaining a substantial number (about 500) of clinical
values for the test in question, the organization of these values
into a frequency distribution and then location of the mode.
Next the percent of all tests that have values above the mode must
be determined. Maintaining the order in which the 500 tests were
made, they must be separated into groups of 50 consecutive tests.
Then the number of tests that have values greater than the mode
is counted. Plot the number of values that are greater than the
mode (number plus) on a control chart. Control limits of any
desired width can be constructed.
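The counting procedure can be sketched in Python (the values are
invented, and the binomial 3-SD limits shown are only one of the
"any desired width" choices the article allows):

    import random
    import statistics

    def number_plus_chart(values, group_size=50):
        # Locate the mode of the accumulated clinical values, then
        # count, per consecutive group of 50 tests, how many exceed it.
        mode = statistics.mode(values)
        p = sum(v > mode for v in values) / len(values)
        counts = [sum(v > mode for v in values[i:i + group_size])
                  for i in range(0, len(values) - group_size + 1,
                                 group_size)]
        center = group_size * p
        sd = (group_size * p * (1 - p)) ** 0.5
        return counts, (center - 3 * sd, center + 3 * sd)

    random.seed(1)
    clinical = [random.randint(90, 110) for _ in range(500)]
    print(number_plus_chart(clinical))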
If the testing procedure was stable during the period over which
the tests were made, then each group of 50 tests should have a
number of plus tests (test values exceeding the mode) which is
within the control limits. Control charts can be kept current
by counting the number of plus test values as each group of 50
tests is completed. A point outside of the control limits, or a
shift of values toward a control limit may indicate a shift in
quality or control of clinical test results, and should be
investigated.
Experience indicates that the procedure is sensitive enough to
detect a shift of sufficient magnitude that it is worth looking
for, but the shifts are small enough that they will not bias
greatly the clinical use of given test results.
One advantage of the number plus method over the reference
standard method is that it uses clinical values, while a control
serum frequently is not handled in the same manner as patient
serums. Factors may influence a change in control serums or
standards which do not apply to patients' specimens.
STANDARD DEVIATION: A PRACTICAL MEANS FOR THE MEASUREMENT AND
CONTROL OF THE PRECISION OF CLINICAL LABORATORY DETERMINATIONS
The American Journal of Clinical Pathology 27:55/ (May 1957)
AUTHOR: Copeland, B. E.
ABSTRACT: Precision is defined as the closeness with which re-
peated analyses agree. The article describes the criteria for
determining a measure of precision to include the following:
The measure of precision can be used in common by all individuals
interested in clinical precision - the pathologist, the techni-
cian, the clinician, the research scientist and the statistician.
The data necessary for the calculation of the measure must be
easy to collect, and the calculation must be easy to perform.
The desired expression of precision must be easily interpreted.
An exchange of letters or a personal interview should not be re-
quired to compare precision of one laboratory with the precision
of another.
The standard deviation is described as a unit of precision which
best fits the above criteria. A method for computing the stand-
ard deviation is described which uses the difference between
duplicate measurements rather than differences from the mean.
Some of the conditions which must be stated to define adequately
the frame of reference of the standard deviation are (1) number
of technicians, (2) one or more days, (3) one or more samples,
(4) concentration level of samples, (5) whether the technician
knows he is being tested, etc.
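The duplicate-difference computation is commonly written
s = sqrt(sum of d_i^2 / 2n), where d_i is the difference within
the i-th duplicate pair and n is the number of pairs; a small
Python sketch with invented duplicates:

    import math

    def sd_from_duplicates(pairs):
        # s = sqrt(sum(d**2) / (2n)): uses within-pair differences
        # rather than deviations from the mean.
        n = len(pairs)
        return math.sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * n))

    print(sd_from_duplicates([(4.1, 4.3), (5.0, 4.8), (3.9, 4.0)]))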
EVALUATION OF A SYSTEM FOR PRECISION CONTROL IN THE CLINICAL
LABORATORY
The American Journal of Clinical Pathology 48:243
AUTHOR: Barnett, R. N.; Pinto, C. L.
ABSTRACT: The article describes a method for quality control of
clinical chemistry based on mixtures of patient samples. From
each group of specimens submitted for analysis, two samples are
selected and labeled A and B. Equal portions of A and B are
mixed to form C, whose true value is (A + B)/2. Mixture C then
becomes a sample which is analyzed with the other members of the
batch. After all analyses are complete, the difference between
the actual value for C and its theoretical value, (A + B)/2 is
recorded. Forty such mixtures are analyzed on separate days and
all the differences recorded. A control chart for these differ-
ences can then be prepared based on average deviations.
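The bookkeeping is simple enough to sketch in Python (the data are
invented; the article charts the differences by average
deviations, so the 3-SD limits used here are a stand-in):

    import statistics

    def mixture_differences(batches):
        # For each batch, samples A and B are mixed to give C, whose
        # theoretical value is (A + B) / 2; record the analyzed value
        # of C minus that theoretical value.
        diffs = [c - (a + b) / 2 for a, b, c in batches]
        center = statistics.fmean(diffs)
        sd = statistics.stdev(diffs)
        return diffs, (center - 3 * sd, center + 3 * sd)

    print(mixture_differences([(8.0, 6.0, 7.1), (9.2, 7.0, 8.0),
                               (5.5, 6.5, 6.1)]))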
There are two disadvantages of the system in comparison with the
conventional pooled plasma. First, it provides no check on accuracy:
a change in reagents, standard, or procedure resulting in a shift
of values is not detected. Second, because patient samples may exhibit
values greatly different from those of a pool, the standard devi-
ation determined from mixed samples might differ greatly from
that of a pool merely because a different range of values was
under study.
The above method was compared with the results found using frozen
pooled serum for 10 commonly performed clinical chemical analyses.
The standard deviations, coefficients of variation, and confidence
limits were found to be close to those achieved by pooled serum
technic. This substantiates the validity of limits of precision
obtained by the use of serum pools.
SENSITIVITY - A CRITERION FOR THE COMPARISON OF METHODS OF TEST
Journal of Research, National Bureau of Standards 53:155 (Febru-
ary 1954)
AUTHOR: Mandel, J.; Stiehler, R. D.
ABSTRACT: In the evaluation of many methods of test, the two
usual criteria - precision and accuracy - are insufficient. Ac-
curacy is applicable only where comparisons with a standard can
be made. Precision, when interpreted as degree of reproducibility,
is not necessarily a measure of merit, because a method may be
highly reproducible merely because it is too crude to detect
small variations.
To obtain a quantitative measure of merit of test methods, a
new concept, sensitivity, is introduced. If M is a measure of
some property Q, and σ its standard deviation, the sensitivity
of M, denoted by ψ, is defined by the relation ψ = (dM/dQ)/σ.
It follows from this definition that the sensitivity of a test
method may or may not be constant for all values of the property
Q.
A statistical test of significance is derived for the ratio of the
sensitivities of alternative methods of test. Unlike the stand-
ard deviation and the coefficient of variation, sensitivity is a
measure of merit that is invariant with respect to any transform-
ation of the measurement, and is therefore independent of the
scale in which the measurement is expressed.
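The definition can be checked numerically. In this hypothetical
Python sketch, two methods measure the same property on scales
differing tenfold, with noise scaling accordingly, and score the
same sensitivity, illustrating the invariance noted above:

    def sensitivity(measure, q, sigma, dq=1e-6):
        # psi = (dM/dQ) / sigma, with the derivative taken numerically.
        dm_dq = (measure(q + dq) - measure(q - dq)) / (2 * dq)
        return dm_dq / sigma

    method_a = lambda q: 2.0 * q     # M on one scale
    method_b = lambda q: 20.0 * q    # same property, scale x10
    print(sensitivity(method_a, 5.0, sigma=0.1))   # ~20
    print(sensitivity(method_b, 5.0, sigma=1.0))   # ~20: invariant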
THE USE OF CONTROL CHARTS IN THE CLINICAL LABORATORY
American Journal of Clinical Pathology 20:1059 (1950)
AUTHOR: Levey, S.; Jennings, E. R.
ABSTRACT: The article describes a study of the use of control
chart methods in a clinical laboratory. The control charts used
were arithmetic mean (X) and range (R). The method used whole
blood and plasma in which the concentration of the substance
estimated was stable over a long period, and in the range of
normal blood values. Two samples each of whole blood and plasma
were tested in the analysis twice a week. The true value of the
concentration of any of the control substances was estimated by
averaging the individual values obtained from the first 20 pairs
analyzed over a period of about a month.
After the analysis was completed, the average and the range were
plotted with the test value as ordinate and the order of test as
abscissa. The statistical limits (three standard deviations)
also were put on the chart.
Control charts were illustrated for urea nitrogen, plasma chlor-
ide, total plasma protein, plasma albumin, and carbon dioxide
combining-power of plasma.
The control chart offers a simple method of checking the result-
ant effect of all factors influencing the accuracy of a test;
e.g., the reagents, standards, time factors, technicians, and in-
struments used in the analysis. It offers a basis for action in
initiating correction of a method that is not functioning proper-
ly. Also, it improves the general accuracy of a laboratory, be-
cause the technicians become control conscious and readily detect
and report a test that is out of control. If the method is out
of control, the chart usually cannot give the reason, and it is
up to the analyst to determine the cause of the difficulty.
Sometimes it is possible to note deterioration of reagents or
standards by observing a trend in a control chart.
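A compact Python rendering of the charting scheme (the duplicate
results are invented; the 20-pair baseline and 3-SD limits follow
the article):

    import random
    import statistics

    def xbar_r_chart(pairs):
        # Estimate the 'true' value from the first 20 pairs, place
        # 3-SD limits around it, then chart each pair's mean (X) and
        # range (R) in order of analysis.
        baseline = [statistics.fmean(p) for p in pairs[:20]]
        center = statistics.fmean(baseline)
        sd = statistics.stdev(baseline)
        limits = (center - 3 * sd, center + 3 * sd)
        points = [(statistics.fmean(p), max(p) - min(p)) for p in pairs]
        return center, limits, points

    random.seed(2)
    pairs = [(random.gauss(15, 0.3), random.gauss(15, 0.3))
             for _ in range(24)]
    center, limits, points = xbar_r_chart(pairs)
    print(round(center, 2), limits)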
D. COMPUTER PROGRAMMING - INFORMATION RETRIEVAL
ASP - ANALYSIS OF SYNTHETICS PROGRAM FOR QUALITY CONTROL DATA
Du Pont de Nemours (E. I.) and Company, Aiken, South Carolina
Savannah River Laboratory
AUTHOR: Bailey, L. V.; Arnett, L. M.
ABSTRACT: The computer program, ASP, which calculates bias, pre-
cision, and other statistics of analytical methods, was written
in FORTRAN IV for use on the IBM System/360-65. The Savannah
River Plant laboratories use ASP monthly and quarterly to evalu-
ate and to report the bias and precision of analyses important
to process control and accountability.
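Not the ASP code itself, but the core statistics it reports for
synthetic (known-value) samples reduce to something like this
Python sketch:

    import statistics

    def bias_and_precision(measured, known):
        # Bias: mean difference of the measurements from the known
        # value.  Precision: standard deviation of the measurements.
        bias = statistics.fmean(m - known for m in measured)
        precision = statistics.stdev(measured)
        return bias, precision

    print(bias_and_precision([10.3, 10.1, 10.4, 10.2], known=10.0))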
A CHEMICAL INFORMATION CENTER EXPERIMENTAL STATION
Pittsburgh Chemical Information Center, Pennsylvania
AUTHOR: Arnett, E. M.
ABSTRACT: Reports are presented by the Principal Investigator and
representatives of the following project task groups: library;
programming; knowledge availability systems center; and behavioral
research group. Each report is self-contained with its own ab-
stract and appendices.
PROCEEDINGS. JOINT CONFERENCE ON PREVENTION AND CONTROL OF OIL
SPILLS
American Petroleum Institute, New York
ABSTRACT: On December 15-17, 1969, a Joint Conference on Preven-
tion and Control of Oil Spills was held under the co-sponsorship
of the American Petroleum Institute and the Federal Water Pollu-
tion Control Administration. The objectives of the conference
were to delineate the overall dimensions of the oil spills
problem, explore the present state of the art of prevention and
control of oil spills, and review the relevant research and de-
velopment efforts of government and private industry, both here
and abroad. The topics discussed include spill prevention, boom
design, mechanical removal, chemical additives, analysis and
sampling, monitoring, beach cleanup, fate of spills, ecological
effects, and oil-spill information retrieval and dissemination.
OPERATIONAL HYDROMET DATA MANAGEMENT SYSTEM. DESIGN CHARACTER-
ISTICS
North American Rockwell Information Systems Company, Anaheim,
California
ABSTRACT: The hydromet system under development will include a
Central data bank operated by the U.S. Corps of Engineers, a
large number of automated hydromet data gathering stations
interfacing with the central data bank, and data retrieval fa-
cilities for interfacing the participating agencies with the data
bank. The Operational Hydromet Data Management System (OHDMS)
will be based in a large scale digital computer with appropriate
large volume digital storage devices and peripherals. It will in-
clude a real-time digital data acquisition subsystem operating in
association with an extensive manual data gathering network and
a diverse user terminal subsystem for retrieval of stored hydromet
data. The present study is structured to include the definition
of the hardware and software characteristics of an integrated
data management system to meet the requirements of each of the
participating federal agencies. A key element in the study is
the detailed definition of the user requirements for each of the
federal participants.
DESIGN AND OPERATION OF AN INFORMATION CENTER ON ANALYTICAL
METHODOLOGY
Battelle Memorial Institute, Columbus, Ohio. Columbus Laboratories
ABSTRACT: The report discusses the design and operation of a
pilot analytical methodology information storage and retrieval
system tailored to the needs of the Analytical Quality Control
Laboratory (AQCL) and other segments of the National Analytical
Methods Development Research Program (NAMDRP). All aspects of
the system are presented.
STORAGE AND RETRIEVAL OF WATER QUALITY DATA. TRAINING MANUAL
Environmental Protection Agency, Washington, D.C. Water Quality
ABSTRACT: STORET is the data storage and retrieval system devel-
oped by and for the EPA and is suited to the needs of
all users of water quality and water resource data. The contents
of the report make up a course which is intended to provide in-
formation and instruction on the STORET system for those persons
directly involved in accumulating, processing and using water
data.
STUDIES IN THE ANALYSIS OF METROPOLITAN WATER RESOURCE SYSTEMS.
VOLUME V: A METHOD OF DATA REDUCTION FOR WATER RESOURCES IN-
FORMATION STORAGE AND RETRIEVAL
Cornell University, Ithaca, New York. Water Resources and Marine
Sciences Center
AUTHOR: Lewinger, K. L.
ABSTRACT: Data storage and retrieval expenses represent a signif-
icant portion of the cost of operating a management information
system. The study focuses on the question of how much data al-
ready collected need be stored for future use, and on methods of
reducing the quantity of data without necessarily reducing the
information content. Several linear interpolation and least
squares methods are explored for achieving data reduction, using
as a means of illustration twenty-three different types of hydro-
logic records. Also discussed are the value of the data, the de-
sired accuracy needed for various water resources studies, and
the costs of data reduction as compared to data storage and
retrieval.
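As one plausible reading of the linear-interpolation approach
(the abstract does not give the study's own algorithms), the
sketch below retains an observation only when linear interpola-
tion from its neighbors would misstate it by more than a chosen
tolerance; the tolerance and the record are hypothetical.

def reduce_series(times, values, tol):
    """Greedy pass: drop points reproducible by linear interpolation."""
    keep = [0]
    for i in range(1, len(times) - 1):
        t0, v0 = times[keep[-1]], values[keep[-1]]
        t1, v1 = times[i + 1], values[i + 1]
        # Estimate the candidate point from the last retained point
        # and the next observation.
        est = v0 + (v1 - v0) * (times[i] - t0) / (t1 - t0)
        if abs(est - values[i]) > tol:
            keep.append(i)
    keep.append(len(times) - 1)
    return keep

# Hypothetical hourly flow record with one spike at t = 3:
times = [0, 1, 2, 3, 4, 5, 6]
flows = [10.0, 10.1, 10.2, 14.0, 10.4, 10.5, 10.6]
print(reduce_series(times, flows, tol=0.2))  # [0, 2, 3, 4, 6]

Points on the smooth stretches are discarded because they can be
regenerated on retrieval; this is exactly the storage-versus-
information trade the study weighs against the desired accuracy
of later analyses.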
PROCEEDINGS OF CONFERENCE ON "TOWARD A STATEWIDE GROUND WATER
QUALITY INFORMATION SYSTEM" AND REPORT OF GROUND WATER QUALITY
SUBCOMMITTEE, CITIZENS ADVISORY COMMITTEE, GOVERNOR'S ENVIRONMENTAL
QUALITY COUNCIL
Minnesota University, Minneapolis. Water Resources Research
Center
ABSTRACT: The following topics were discussed: the natural
quality of ground water in Minnesota, the use of ground water in
Minnesota, hydrogeologic framework for deterioration in ground
water quality, spray disposal of sewage effluent, solid waste
disposal, needs and uses for a ground water quality data system,
water well records and information system needs, subsurface
geologic information system in Minnesota, ground water quality
information system experiences in other states, Federal water in-
formation systems, and relation of ground water quality informa-
tion system and other systems in Minnesota.
AN INFORMATION SYSTEM FOR THE MANAGEMENT OF LAKE ONTARIO
Cornell University, Ithaca, New York
AUTHOR: Reynolds, Huey Dale
ABSTRACT: The first part of this study is concerned with a gen-
eral analysis of information needs for the Experimental Operations
Office (for Lake Ontario management) considering the purposes and
objectives of the office, the boundary of the office, and the
problem areas to be managed by the office. The second part deals
with the theory of information and information systems in general,
to provide a theoretical background. The third part consists of
an analytical framework for an information system, followed by
case studies of two particular areas, namely an economic base
study and water quality control.
TRANSPORT AND THE BIOLOGICAL EFFECTS OF MOLYBDENUM IN THE ENVIRON-
MENT
Colorado State University, Fort Collins
ABSTRACT: The report presents an investigation of the transport
and biological effects of molybdenum in the environment. The
topics covered include: geochemistry of molybdenum, molybdenum
transport in a reservoir, molybdenum toxicity studies in animals,
fate of trace metals in a coal-fired power plant, molybdenum
removal in conventional water and wastewater treatment plants,
accumulation of available molybdenum in agricultural soils, levels
of molybdenum in milk, analytical facilities, effects of dietary
molybdenum on the physiology of the white rat, skeletal biology
of molybdenum, information processing system, methodological
problems in economic analysis of externalities and mineral de-
velopment, perception of alternatives and attribution of respon-
sibility for a water pollution problem, and information storage
and retrieval routines.
DATA ACQUISITION SYSTEMS IN WATER QUALITY MANAGEMENT
Colorado State University, Fort Collins
AUTHOR: Ward, Robert C.
ABSTRACT: The role of routine water quality surveillance was
investigated, including a delineation of the objectives of a
state water quality program based upon the state and federal
laws. Seven specific objectives are listed under the two general
objectives of prevention and abatement: planning, research, aid
programs, technical assistance, regulation, enforcement, and
data collection, processing, and dissemination. Each objective
was broken down into the general activities required for its
accomplishment and the data needed for each activity were identi-
fied. A survey of systems for grab sampling, automatic monitor-
ing, and remote sensing was performed, each data acquisition
technique being analyzed for capabilities, reliability, and cost.
A procedure was developed for designing a state water quality
surveillance program responsive to objectives. Financial and man-
power constraints were considered.
THE SYSLAB SYSTEM FOR DATA ANALYSIS OF HISTORICAL WATER-QUALITY
RECORDS (BASIC PROGRAMS)
Geological Survey, Washington, D.C.
AUTHOR: Steele, Timothy Doak
ABSTRACT: The report documents the basic computer programs com-
prising the SYSLAB system for systematically analyzing histori-
cal water-quality records. The first computer program retrieves
station records for sets of water-quality variables from the sur-
vey's surface-water quality files. The procedure for analyzing
water-quality data commonly has the following sequence: (1) a
summary of basic statistics for each water-quality variable for
the period of record or for shorter time increments, (2) plots
of values of selected data pairs scaled according to the range
of the data and (3) regression relationships based upon the
graphic analysis of the plots. The appropriate SYSLAB computer
program is given for each step in the sequence. Derivation of
regression relationships is particularly applicable for the major
inorganic chemical constituents which frequently are highly
correlated with specific conductance. The report includes a
description of the card set-up format and data input require-
ments for each computer program.
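For step (3), the regression against specific conductance
amounts to an ordinary least-squares fit, which the brief sketch
below illustrates (Python; the SYSLAB programs themselves are
card-input codes and are not reproduced here, and the data values
are hypothetical).

# Least-squares fit of a major inorganic constituent on specific
# conductance, in the spirit of the sequence described above.

def least_squares(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx                  # slope, intercept

# Hypothetical specific conductance (umho/cm) vs. chloride (mg/l):
cond     = [150, 220, 310, 400, 480]
chloride = [12.0, 19.5, 28.0, 37.5, 44.0]
m, b = least_squares(cond, chloride)
print(f"chloride = {m:.3f} * conductance + {b:.2f}")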
AUTOMATIC ACQUISITION OF WATER QUALITY DATA. A BIBLIOGRAPHY
WITH ABSTRACTS
National Technical Information Service, Springfield, Virginia
AUTHOR: Lehmann, Edward J.
ABSTRACT: The NTISearch bibliography contains 51 selected ab-
stracts of research reports retrieved using the NTIS on-line
search system, NTISearch. The abstracts include the techniques
and equipment used to obtain continuous water quality data.
General system management and planning studies are covered.
FACTORS AFFECTING INNOVATION IN WATER QUALITY MANAGEMENT:
IMPLEMENTATION OF THE 1963 MICHIGAN CLEAN WATER BOND ISSUE
Michigan University, Ann Arbor. Department of Civil Engineering
AUTHOR: Yaffee, Steven L.; Bulkley, Jonathan W.
ABSTRACT: This report focuses upon factors affecting innovation
in the implementation of the 1963 Michigan Clean Water Bond Is-
sue. The Joint Legislative Committee on Water Resources Planning
which sized the bond program did not consider nutrient removal
or any treatment beyond secondary in its determination of the
fiscal resources necessary to meet 1980 Water Pollution Control
objectives. Consequently, the fiscal resources were limited from
inception. The net effect of the Clean Water Bond program was to
maintain the 1968 status quo. Factors resisting innovation and
factors enhancing innovation are identified.
An automated information storage/retrieval system for monitoring
wastewater treatment facility funding is developed. Structural
and process changes for future innovation are recommended.
MICHIGAN WATER RESOURCES ENFORCEMENT AND INFORMATION SYSTEM
Michigan Department of Natural Resources, Lansing. Water Re-
sources Commission
AUTHOR: Guenther, Gary; Mincavage, Daniel; Morley, Fred
ABSTRACT: The project demonstrated an interactive federal/state
water pollution control, enforcement, and information system,
including interactive computer graphics as a method of output
presentation. Two systems were interfaced: Michigan's Water
Information System for Enforcement (WISE) and EPA's STORET system.
The WISE system is used to alert enforcement personnel to problems
through exception reporting, and to provide follow-up information
on these problems. STORET is used as a storage and retrieval
system for water quality and inventory information. As informa-
tion enters WISE, certain inputs are coded for storage in STORET.
The interface mechanism is a common numbering system. Because
WISE is modular in design, it can be used in part or in total by
other agencies. The demonstration indicated that careful con-
sideration should be given to the information that will comprise
the computer file. Administrative, procedural, and auditing
techniques should be completely set down before proceeding with
management's commitment to the system. Microfilm should be used
when feasible, both as Computer Output Microfilm (COM) and in
manual files.
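The common-numbering interface lends itself to a simple illus-
tration: records in each system carry the same station number,
so an exception flagged in WISE can pull the supporting water
quality record from STORET. The sketch below is hypothetical in
every detail except that shared key.

# Two record stores keyed by the same station number; the shared
# key is the whole interface. Record layouts here are invented.

wise_actions = {  # enforcement follow-up records
    "MI-0042": {"status": "exception", "note": "permit limit exceeded"},
    "MI-0107": {"status": "closed", "note": "returned to compliance"},
}
storet_quality = {  # ambient water quality records, same key
    "MI-0042": [{"parameter": "BOD", "value": 42.0, "units": "mg/l"}],
    "MI-0107": [{"parameter": "BOD", "value": 6.5, "units": "mg/l"}],
}

for station, action in wise_actions.items():
    if action["status"] == "exception":
        for rec in storet_quality.get(station, []):
            print(station, action["note"], "-",
                  rec["parameter"], rec["value"], rec["units"])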
A NATIONAL OVERVIEW OF EXISTING COASTAL WATER QUALITY MONITORING
Interstate Electronics Corporation, Anaheim, California. Oceanics
Division
ABSTRACT: An overview of coastal water quality monitoring activ-
ity is presented, including an examination of related factors
such as water quality standards, population, waste discharges,
ocean dumping, a survey of data banks at the national level and
others. Data from several inventories pertinent to coastal zone
water quality is summarized to the state and EPA regional level
with extensive descriptions contained in appendices.
SIDES: STORET INPUT DATA EDITING SYSTEM
Environmental Protection Agency, Athens, Georgia. Surveillance
and Analysis Division
AUTHOR: Barrow, David R.
ABSTRACT: The Water Quality Control Information System provides
a broad data management capability for all of EPA's water program
activities. Central to both the program activities
and the data management system is the need to store and retrieve
ambient water quality data. The initial stages of the data man-
agement system were designed to fulfill that basic need. That
was the beginning of STORET. The present report provides docu-
mentation for SIDES, a procedure designed specifically for field
survey data and medium speed terminal, card input applications.
THERMOPHYSICAL AND ELECTRONIC PROPERTIES INFORMATION ANALYSIS
CENTER (TEPIAC): A CONTINUING SYSTEMATIC PROGRAM ON TABLES OF
THERMOPHYSICAL AND ELECTRONIC PROPERTIES OF MATERIALS
Purdue University, Lafayette, Indiana. Thermophysical Properties
Research Center
AUTHOR: Ho, Cho-Yen
ABSTRACT: The final report describes the activities and ac-
complishments of the Thermophysical and Electronic Properties
Information Analysis Center (TEPIAC), which comprises internally
the Thermophysical Properties Information Analysis Center (TPIAC)
and the Electronic Properties Information Center (EPIC). TEPIAC's
activities reported herein include literature search, acquisition,
review, and codification; substance classification and organiza-
tion; operation of a computerized information storage and re-
trieval system; publication of the Thermophysical Properties Re-
search Literature Retrieval Guide Supplement; data extraction
and compilation; data evaluation, correlation, analysis, synthesis,
and generation of recommended reference values; publication of the
TPRC Data Series, state-of-the-art summaries, and critical re-
views; technical and bibliographic inquiry services; and current
awareness and promotion efforts. TPIAC covers 14 thermophysical
properties of all matter at all temperatures. EPIC covers 22
electronic (including also electrical, magnetic, and optical)
properties and property groups of selected material groups at all
temperatures.
STORET II: STORAGE AND RETRIEVAL OF DATA FOR OPEN WATER AND LAND
AREAS
Federal Water Pollution Control Administration, Washington, D.C.
Division of Pollution Surveillance
AUTHOR: Dubois, Donald P.
ABSTRACT: STORET Subsystem II described in this manual consists
of a series of related computer programs designed for the effi-
cient storage and retrieval of data collected in connection with
water quality management programs. The system is intended for
use in handling data collected from large open bodies of water
and from points on land areas which cannot be associated readily
with points on a stream.
PART I. A CONCEPTUAL MODEL FOR A TERRESTRIAL ECOSYSTEM PERTURBED
WITH SEWAGE EFFLUENT, WITH SPECIAL REFERENCE TO THE MICHIGAN STATE
UNIVERSITY WATER QUALITY MANAGEMENT PROJECT. PART II. A PER-
SONALIZED BIBLIOGRAPHIC RETRIEVAL PACKAGE FOR RESOURCE SCIENTISTS
Michigan State University, East Lansing. Department of Fisheries
and Wildlife
AUTHOR: Conley, Walt; Tipton, Alan R.
ABSTRACT: The report is provided in two distinct but intercon-
nected parts. Part I contains discussions of management and de-
sign problems, components of terrestrial ecosystems, and specific
site descriptions, all as they pertain to the sewage effluent
spray program of the Michigan State University Water Quality Man-
agement Project. Part II began as an effort to compile a bibli-
ographic reference file for the above project. This portion
grew into the construction of relevant software, and was built
around a 2500 citation bibliography. The bibliography is
specifically oriented towards sewage effluent treatments, and is
currently operative and available for interested researchers. A
second bibliography is also described in this section.
SECTION XII
APPENDIX
SAMPLE LETTER
Directors of EPA Environmental Laboratories
Gentlemen:
The Environmental Monitoring and Support Laboratory
(EMSL-Cincinnati) has recently issued a contract for the "Development
of a System for Conducting Inter-Laboratory Tests for Water
Quality and Effluent Measurements." A pilot test program is
being conducted to evaluate the validity of the inter-laboratory
test program proposed by the Contractor.
Within the next month (mid-November) you will receive six
chemical reference samples which are being distributed to the
22 EPA laboratories engaged in environmental monitoring.
The constituents to be determined are aluminum, arsenic,
cadmium, copper, iron, mercury, lead, manganese, nickel, selenium,
zinc, and cobalt. Laboratories should analyze for all these
constituents.
The attached table provides an approved list of the standard
methods for the chemical analysis of water. It is assumed
that atomic absorption spectroscopy will be used where it is
available and appropriate for a given element. Since the
concentration of metals in at least one of the samples may be
below the limit of detection, some form of concentration
procedure, such as chelation and extraction with organic
solvents, must be employed before analysis to determine these
levels if flameless atomization is not used.
The sample should be analyzed as received; no dilution is
required. A reporting form is enclosed.
TABLE 12-1. STANDARD METHODS FOR CHEMICAL ANALYSIS OF WATER: LIST OF APPROVED TEST PROCEDURES*

Parameter (mg/1)         Method                                      References (page numbers:
                                                                     Standard methods; ASTM; EPA methods)

Analytical methods for trace metals:

Aluminum-total           Atomic absorption
Aluminum-dissolved       0.45 micron filtration and reference
                         method for total aluminum                   86
Antimony-total           Atomic absorption
Antimony-dissolved       0.45 micron filtration and reference
                         method for total antimony                   86
Arsenic-total            Digestion plus silver diethyldithio-
                         carbamate; atomic absorption
Arsenic-dissolved        0.45 micron filtration and reference
                         method for total arsenic                    86
Barium-total             Atomic absorption
Barium-dissolved         0.45 micron filtration and reference
                         method for total barium                     86
Beryllium-total          Aluminon; atomic absorption
Beryllium-dissolved      0.45 micron filtration and reference
                         method for total beryllium                  86
Boron-total              Curcumin
Boron-dissolved          0.45 micron filtration and reference
                         method for total boron                      86
Cadmium-total            Atomic absorption; colorimetric
Cadmium-dissolved        0.45 micron filtration and reference
                         method for total cadmium                    86
Calcium-total            EDTA titration; atomic absorption
Calcium-dissolved        0.45 micron filtration and reference
                         method for total calcium                    86
Chromium VI              Extraction and atomic absorption;
                         colorimetric
Chromium VI-dissolved    0.45 micron filtration and reference
                         method for total chromium VI                86
Chromium-total           Atomic absorption; colorimetric
Chromium-dissolved       0.45 micron filtration and reference
                         method for total chromium                   86
Cobalt-total             Atomic absorption                           692
Cobalt-dissolved         0.45 micron filtration and reference
                         method for total cobalt                     86
Copper-total             Atomic absorption; colorimetric             210,430; 692,410; 106
Copper-dissolved         0.45 micron filtration and reference
                         method for total copper                     86
Gold-total               Atomic absorption
Iridium-total            Atomic absorption
Lead-dissolved           0.45 micron filtration and reference
                         method for total lead                       86
Magnesium-total          Atomic absorption; gravimetric              210,416,201; 692; 112
Magnesium-dissolved      0.45 micron filtration and reference
                         method for total magnesium                  86
Manganese-total          Atomic absorption                           210; 692; 114
Manganese-dissolved      0.45 micron filtration and reference
                         method for total manganese                  86
Mercury-total            Flameless atomic absorption
Mercury-dissolved        0.45 micron filtration and reference
                         method for total mercury                    86
Molybdenum-total         Atomic absorption
Molybdenum-dissolved     0.45 micron filtration and reference
                         method for total molybdenum                 86
Nickel-total             Atomic absorption; colorimetric             443; 692
Nickel-dissolved         0.45 micron filtration and reference
                         method for total nickel                     86
Osmium-total             Atomic absorption
Palladium-total          Atomic absorption
Platinum-total           Atomic absorption
Potassium-total          Atomic absorption; colorimetric; flame
                         photometric                                 283,285; 326; 115
Potassium-dissolved      0.45 micron filtration and reference
                         method for total potassium                  86
Rhodium-total            Atomic absorption
Ruthenium-total          Atomic absorption
Selenium-total           Atomic absorption
Selenium-dissolved       0.45 micron filtration and reference
                         method for total selenium                   86
Silica-dissolved         0.45 micron filtration and molybdo-
                         silicate-colorimetric                       303; 83; 86,273
Silver-total             Atomic absorption                           210
Silver-dissolved         0.45 micron filtration and reference
                         method for total silver                     86
Sodium-total             Flame photometric; atomic absorption        317; 326; 118
Sodium-dissolved         0.45 micron filtration and reference
                         method for total sodium                     86
Thallium-total           Atomic absorption
Thallium-dissolved       0.45 micron filtration and reference
                         method for total thallium                   86
Tin-total                Atomic absorption
Tin-dissolved            0.45 micron filtration and reference
                         method for total tin                        86
Titanium-total           Atomic absorption
Titanium-dissolved       0.45 micron filtration and reference
                         method for total titanium                   86
Vanadium-total           Atomic absorption; colorimetric             357
Vanadium-dissolved       0.45 micron filtration and reference
                         method for total vanadium                   86
Zinc-total               Atomic absorption; colorimetric             210,444; 692; 120
Zinc-dissolved           0.45 micron filtration and reference
                         method for total zinc                       86

* Federal Register, Vol. 40, No. 111, Monday, June 9, 1975.
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.
EPA-600/4-77-031
3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
Development of a System for Conducting Inter-Laboratory
Tests for Water Quality and Effluent Measurements
5. REPORT DATE
June 1977 (Issuing Date)
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
Arthur C. Green
Robert Naegele
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
FMC Corporation
1105 Coleman Ave.
San Jose, CA 95108
10. PROGRAM ELEMENT NO.
24AUB TASK NO. 5
11. CONTRACT/GRANT NO.
68-03-2115
12. SPONSORING AGENCY NAME AND ADDRESS
Environmental Monitoring & Support Laboratory-Cin.,OH
Office of Research and Development
U.S. Environmental Protection Agency
Cincinnati, Ohio 45268
13. TYPE OF REPORT AND PERIOD COVERED
July 16, 1974 - April 15, 1976
14. SPONSORING AGENCY CODE
EPA/600/06
15. SUPPLEMENTARY NOTES
16. ABSTRACT
FMC Corporation has developed a system for evaluating water pollution data and the
laboratories which produce these data. The system consists of a plan for the design
and implementation of an interlaboratory test program. A pilot test program was
included to evaluate and to verify the complete program.
Investigations of ongoing interlaboratory testing programs were conducted, and
deficiencies were identified in their design and in the procedures by which they
were conducted. The conclusions and recommendations presented in the report are
supported by an extensive literature review of previous interlaboratory tests and
their methods for experimental design and test data analyses. Additionally,
18 EPA, State, and private laboratories were visited to review their comments
regarding difficulties and deficiencies in interlaboratory test programs in
general.
17.
KEY WORDS AND DOCUMENT ANALYSIS
DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
c. COSATI Field/Group
Laboratories
Chemical Laboratories
Quality Control
Acceptable Quality Level
Reproducibility
Statistical Tests
07B
18. DISTRIBUTION STATEMENT
Release to Public
19. SECURITY CLASS (This Report)
Unclassified
20. SECURITY CLASS (This page)
Unclassified
21. NO. OF PAGES
141
22. PRICE
EPA Form 2220-1 (9-73)