EPA
United States Environmental Protection Agency
Office of Research and Development
Washington DC 20460
EPA 600/R-03/075
September 2003
Comparison of Methods
for the Determination of
Alkyl Phosphates in Urine

-------
Comparison of Methods  for the Determination
              of Alkyl Phosphates in Urine
                                   by
                               Ryan R. James
                     Atmospheric Science and Applied Technology
                           Battelle Memorial Institute
                            Columbus, Ohio 43201

                Stephen C. Hern, Gary L. Robertson, Brian A. Schumacher
                       National Exposure Research Laboratory
                             Las Vegas, NV 89193
                           EPA Contract 68-D-99-011
                             EPA Project Officer

                               Ellen W. Streib
                   National Exposure Research Laboratory (MD-56)
                    Research Triangle Park, North Carolina 27711
                       National Exposure Research Laboratory
                        Office of Research and Development
                       U.S. Environmental Protection Agency
                    Research Triangle Park, North Carolina 27711

-------
                                         Notice
       The information in this document has been funded wholly or in part by the United States
Environmental Protection Agency under Contract 68-D-99-011 to Battelle Memorial Institute.  It has
been subjected to the Agency's peer and administrative review and has been approved for
publication as an EPA document. Mention of trade names or commercial products does not
constitute endorsement or recommendation for use.

-------
                                        Foreword
       The mission of the National Exposure Research Laboratory (NERL) is to provide scientific
understanding, information, and assessment tools that will quantify and reduce the uncertainty in
EPA's exposure and risk assessments for environmental stressors. These stressors include
chemicals, biologicals, radiation, and changes in climate, land use, and water use.  The Laboratory's
primary function is to measure, characterize, and predict human and ecological exposure to
pollutants.  Exposure assessments are integral elements in the risk assessment process used to
identify populations and ecological resources at risk.  The EPA relies increasingly on the results of
quantitative risk assessments to support regulations, particularly of chemicals in the environment. In
addition, decisions on research priorities are influenced increasingly by comparative risk assessment
analysis. The utility of the risk-based approach, however, depends on accurate exposure
information. Thus, the mission of NERL is to enhance the Agency's capability for evaluating
exposure of both humans  and ecosystems from a holistic perspective.

       The National Exposure Research Laboratory focuses on four major research areas:
predictive exposure modeling, exposure assessment, monitoring methods, and environmental
characterization. Underlying the entire research and technical support program of the NERL is its
continuing development of state-of-the-art modeling, monitoring, and quality assurance methods to
assure the conduct of defensible exposure assessments with known certainty. The research program
supports its traditional clients - Regional Offices, Regulatory Program Offices, ORD Offices, and
Research Committees - and ORD's Core Research Program in the areas of health risk assessment,
ecological risk assessment, and risk reduction.
                                               Gary J. Foley
                                               Director
                                               National Exposure Research Laboratory

-------
                                         Abstract

       Organophosphorus (OP) pesticides have been used heavily in the United States and have
been detected in dust, handwipes, drinking water, food, and air, indicating human exposure
pathways. Once inside the body, these pesticides are metabolized mostly to one of six alkyl
phosphate compounds: dimethylphosphate, dimethylphosphorodithioate, dimethylphosphorothioate,
diethylphosphate, diethylphosphorothioate, and diethylphosphorodithioate.  These metabolites have
been detected in urine and the quantity of these metabolites in urine has been shown to correlate
with the level of pesticide dose that a person has experienced. Therefore, the measurement of these
urinary metabolites can be used to assess and compare exposure. Unfortunately, this measurement is
not straightforward.  To characterize the performance of four existing analytical methods used to
analyze urine samples for the six urinary alkyl  phosphate metabolites of OP pesticides, an
interlaboratory comparison study was done.

       Thirty-five urine samples fortified with various concentrations of the alkyl phosphate
metabolites were distributed to four laboratories that have developed and  implemented analytical
methods to measure these compounds. The results provided by each laboratory were analyzed by an
analysis of variance (ANOVA) model to satisfy two objectives. The first was to identify those
compounds where statistically significant differences existed (at the 0.05 level) in the reported
measurements between the concentration levels within each laboratory, in order to determine the
approximate detection threshold of each laboratory.  The second was to determine when statistically
significant differences existed in the reported measurements between the analytical methods, in
order to compare the overall performance of the participating laboratories and hence, the different
methods.

       The study  resulted in the following recommendations regarding urinary alkyl phosphate
analyses:
       —Given the variability of the data, especially at low concentrations, care should be used in
       interpreting relatively small differences between samples.
       —Although there is considerable within and between laboratory variability, all of the
       laboratories could distinguish between samples containing low, medium, and high levels of
       alkyl phosphate metabolites.
       —Given the sample to sample variability, especially among the blind replicates, preparing
       and analyzing each sample in duplicate will improve data quality.
       —It is recommended that a performance evaluation sample of known concentration be
       developed and analyzed with each batch of samples to provide assurance the method is
       performing as expected.

       The work reported herein was  performed by Battelle Memorial Institute under U.S.
Environmental Protection Agency Contract 68-D-99-011. Work was completed as of May 15, 2003.

-------
                                        Contents

Foreword
Abstract
Tables
Acknowledgments

       Chapter 1     Introduction
       Chapter 2     Conclusions
       Chapter 3     Experimental Methods
                       Laboratory Participation
                       Materials and Sample Handling
                       Experimental Design
       Chapter 4     Results and Discussion
                       Statistical Differences Between Concentration Levels
                       Accuracy of Reported Concentrations
                       Statistical Differences Between the Laboratories
References
Appendix A   Descriptions of the Participating Laboratories' Methods for Measuring Alkyl
             Phosphates in Urine
Appendix B   Statistical Methods and Results
Appendix C   Raw Data

-------
                                          Tables

1    Common urinary alkyl phosphates
2    Concentrations corresponding to each mix and spiking level
3    Detection limits reported by each participating laboratory
4    Sample testing matrix for each participating laboratory, according to the spiking concentration
     levels associated with each compound mix
5    The randomized order of sample testing specified for each laboratory
6    Summary of concentration level effects for each lab and for each alkyl phosphate target
     compound
7    Lowest spiking level of alkyl phosphate target analytes that were significantly larger than the
     unspiked level
8    Recoveries of alkyl phosphate target analytes in spiked urine samples
9    Range of and average recoveries across all target analytes and all detectable
     concentration levels
10   Lowest spiked concentration of alkyl phosphate target analytes that were significantly larger
     than the unspiked level, the average percent recovery at that spiking level in parentheses, and
     the reported detection limit for each participating laboratory
11   Summary of lab effects at low #2, medium, and high concentration levels

-------
                                  Acknowledgments
       We thank Pacific Toxicology Laboratories (Woodland Hills, CA), the University of
Washington Department of Environmental Health (Seattle, WA), Centers for Disease Control and
Prevention (Atlanta, GA), and Centre de Toxicologie Institut National de Sante Publique du Quebec
(Sainte-Foi, Quebec, Canada) for participating in this study.  Major contributions to the research
effort were also made by Battelle staff members - Robert Lordo, Zhenxu (James) Ma, Donald
Kenny, and Julie Sowry.

-------
                                        Chapter 1
                                      Introduction
       Organophosphorus (OP) insecticides are among the most widely used and frequently
detected pesticides in the U.S. (Lewis et al., 1988; Fortmann et al., 1991; Murphy et al., 1983).
They have been detected in dust, handwipes, drinking water,  food, and air indicating human
exposure pathways. Upon entering the body, most organophosphorus pesticides are metabolized to
yield one or more of the six common alkyl phosphates shown in Table 1.  These metabolites have
been detected in urine and the quantity of these alkyl phosphate metabolites excreted in human urine
has been shown to provide a measure of pesticide dose (Morgan et al., 1977; Franklin et al., 1981;
Bradway et al.,  1977). Therefore, the measurement of these urinary metabolites can be used to
assess and compare exposure. Unfortunately, this measurement is not straightforward; hence, this
study was undertaken to evaluate the existing analytical methods. This study characterized the
performance of four different laboratory methods, described in Appendix A, that were used to
analyze urine samples for the six urinary alkyl phosphate metabolites of OP pesticides. This report
presents the methods and results of this study. For this study, we recruited laboratories that had each
developed and implemented a specific analytical method to analyze urine samples for each of the six
alkyl phosphates listed in Table 1. These samples were fortified with the six alkyl phosphate
compounds at concentration levels unknown to the participating laboratories. The measurement data
generated by each laboratory were analyzed in order to make statistical comparisons of the results
across analysis methods for each of the six alkyl phosphates.

                        Table 1. Common urinary alkyl phosphates

                        Name                          Acronym
                        Dimethylphosphate             DMP
                        Dimethylphosphorodithioate    DMDTP
                        Dimethylphosphorothioate      DMTP
                        Diethylphosphate              DEP
                        Diethylphosphorothioate       DETP
                        Diethylphosphorodithioate     DEDTP

       Chapter 2 summarizes the results of this method comparison study for each participating
laboratory and presents an overview of their performance compared with one another. The
experimental design of the study is described in Chapter 3.  This includes the amounts of alkyl
phosphate metabolites added to the urine  samples and the experimental matrix of urine samples
containing various concentrations of the target analytes.  Chapter 4 explains the results of the
statistical analysis of the concentration data submitted by each participating laboratory. The focus is
on determining significant differences between the concentration levels within each laboratory and
significant differences between the performance of each laboratory at each concentration level. The
ability of each laboratory to measure the target analytes near their reported detection limits is also
discussed. Appendix A provides a summary of the analytical methods used by each laboratory.
Appendix B is the complete report of the statistical analysis of the data, which is summarized and
discussed in Chapter 4. Appendix C contains the analytical results data provided by each
participating laboratory.

-------
                                       Chapter 2
                         Conclusions and Recommendations

 Overall conclusions and recommendations include:
 1)     Given the variability of the data, especially at low concentrations, care should be used in
       interpreting relatively small differences between samples.

2)     Although there is considerable within and between laboratory variability, all of the
       laboratories could distinguish between samples containing low, medium, and high levels of
       alkyl phosphate metabolites.

3)     The DMP results were problematic for all of the laboratories. This may have been related to
       the preparation and handling of the samples prior to shipment to the laboratories rather than a
       problem with the analytical methods.

4)     Given the sample to sample variability, especially among the blind replicates, preparing and
       analyzing each sample in duplicate will improve data quality.

5)     It is recommended that a performance evaluation sample of known concentration be
       developed and analyzed with each batch of samples to provide assurance the method is
       performing as expected.

 Specific conclusions from the statistical analysis
 1)     Lab A reported concentrations that were significantly different from (and higher than) the
       unspiked samples for DMDTP, DEP, and DEDTP at the Low #1 and #2 concentration levels,
       DETP at the Low #2 concentration level, and for all of the target analytes except DMP at the
       Medium and High concentration levels.  The overall average recovery for Lab A was 103%
       with a standard deviation of 39%.

 2)     Lab B reported concentrations that were significantly different from (and higher than) the
       unspiked samples for all the target analytes except DMP  at the Medium and High
       concentration levels. However, with the exception of DEP and the High level concentrations
       of DMTP and DETP, Lab B's recoveries were generally much greater than 100%.

 3)     Lab C reported concentrations that were significantly different from the unspiked samples for
       DEDTP, DMTP, and DETP at the Low #2 concentration level and for all the target analytes
       except DMP at the Medium and High concentration levels. The overall average recovery for
       Lab C was 88% with a standard deviation of 25%.

 4)     Lab D reported concentrations that were significantly different from the unspiked samples
       for DMTP and DETP  at the Low #2 concentration level,  and for all of the target analytes
       except DMP at the Medium and High concentration levels. However, for DMDTP and DEP,
       the High level concentrations were not statistically different from the Medium level.  The
       overall average recovery for Lab D was  100% with a standard deviation of 62%.

-------
5)    Lab A reported concentrations of spiked samples near its detection limits that were
      significantly different from those reported for the unspiked samples for all the target analytes
      except for DMTP. For Lab B, only those reported concentrations well above its reported
      detection limits were found to be significantly different from those reported for the unspiked
      samples for all the target analytes.  Labs C and D reported concentrations of spiked samples
      near their detection limits that were significantly different from those reported for the
      unspiked samples for all the target analytes (Lab D does not measure DEDTP) except for
      DMDTP and DEP.

6)    None of the laboratories at  any spiking level reported concentrations for DMP that were
      significantly different from the concentrations determined in the unspiked urine samples. It
      is unclear why the results for DMP were poor across all four laboratories.  A review of the
      solution preparation records confirmed the addition of DMP to the spiked urine. Additional
      study is recommended to investigate the occurrence of a possible matrix interference in the
      urine that may keep the DMP from being extracted or that causes degradation of DMP in the
      urine.  Also, verification of the purity of the DMP standard used to make the urine solutions
      needs to be done to investigate the possibility that an impure standard was the reason for the
      poor results for DMP.

-------
                                        Chapter 3
                                 Experimental Methods
Laboratory Participation
       Four laboratories agreed to participate in the study, namely Pacific Toxicology Laboratories
(Woodland Hills, CA), the University of Washington Department of Environmental Health (Seattle,
WA), Centers for Disease Control and Prevention (Atlanta, GA), and Centre de Toxicologie Institut
National de Sante Publique du Quebec (Sainte-Foi, Quebec, Canada). In no particular order, they
are identified as Labs A through D in this report. Each of the participating laboratories submitted
information on detection limits, details of their method, required sample size, and costs to
participate. Appendix A provides a description of each laboratory's method for measuring alkyl
phosphates in urine.

Materials and Sample Handling

       Four of the target compounds were available from commercial vendors: DMP (Pfaltz and
Bauer, Waterbury, CT), DEP (Chem Service, West Chester, PA), and DETP and DEDTP (Aldrich
Chemical, Milwaukee, WI). The remaining two test compounds, DMTP and DMDTP, were
obtained from Applichem GmbH, Germany as a custom synthesis.

       All of the test solutions were prepared either from a commercially available pooled urine
(American Biological Technologies, Seguin, TX) or a synthetic urine (formulation obtained from
Centers for Disease Control and Prevention). Samples prepared in the pooled urine were fortified at
only the Medium and High concentration level in order to avoid interference from background levels
of the alkyl phosphate metabolites in the pooled urine.  Samples prepared in the synthetic urine were
fortified with all concentration levels. Unspiked samples were prepared from both the pooled and
synthetic urine, but only those prepared in the synthetic urine were included in the statistical analysis
so significant differences between the unspiked levels and the lower concentration levels could be
evaluated.

       Stock solutions of the alkyl phosphate metabolites were prepared by weighing 10 to 15 mg of
the solid compounds into a weighing boat using an analytical balance (Mettler AE1660). The
calibration of the balance was confirmed with 5 and 100 mg standard weights prior to use and the
exact weight of each target analyte was recorded in a laboratory notebook to the nearest tenth of a
milligram.  The mixture was dissolved in distilled water and two tenfold dilutions were performed to
prepare working stock solutions. Appropriate volumes of the stock solutions were pipetted into each
sample using an Eppendorf pipette.  To assure homogeneity of the samples between the laboratories,
each sample was prepared from a single volume of urine and allocated into the sample containers for
each of the participating laboratories. Unique identifier numbers were assigned to each sample.

       After preparation,  all samples were stored in a freezer at -20°C to protect against degradation
of the alkyl phosphate compounds.  Each set of samples was shipped under dry ice for next-day
delivery to the participating laboratories to ensure that the samples remained frozen during shipment.

-------
Special care was taken to protect against breakage and to conform with all state and federal
regulations for transport of biohazardous material. All of the participating laboratories were
required to store the urine samples in a -20°C freezer prior to analysis. All of the laboratories were
contacted prior to shipment of samples so receipt of the samples on dry ice was ensured. Lab A
received three sets of 35 samples for replicate analysis while Labs B through D received a single set.

Experimental Design

       The six alkyl phosphate target compounds were prepared in two spiking mixtures.  The
compounds in each mix were as follows:

       •      Mix A: DMP, DMDTP, DEP, DEDTP
       •      Mix B: DMTP, DETP

       When spiking the urine samples with a given mix, all compounds within that mix were
represented at the same concentration level. Table 2 shows the four concentration levels of each
mix, plus an unspiked level, that were determined to be sufficient to characterize method
performance for each lab (and, equivalently, each analytical method) over a range of concentration
levels for a given compound and in the company of compounds from the other mix at various
concentration levels. Thus, there were 5x 5 = 25  different types of samples prepared in this study,
corresponding to each combination of the two mixes at the following five concentration levels:

       •      Unspiked (authentic pooled and synthetic urine)
       •      Low #1 (spiked near the detection limit for Lab A; Table 3)
       •      Low #2 (spiked near the detection limit for the other three labs)
       •      Medium (spiked at approximately two to five times the highest detection limit
              reported by the participating laboratories)
       •      High (spiked at 200 µg/L).

Due to a laboratory error during sample preparation, the samples that were supposed to  be spiked
with Mix B at the Low #1 concentration level were instead spiked at the Low #2 level.  Thus, there
were twice as many samples spiked at the Low #2 concentration level for Mix B as at the
unspiked, Medium, and High levels, and no sample was spiked at the Low #1 level for Mix B. Mix
A compounds were spiked at their specified level for each sample, and thus were unaffected by this
laboratory error.
Table 2. Concentrations (µg/L) corresponding to each mix and spiking level

Compound      Low #1 Level    Low #2 Level    Medium Level    High Level
Mix A
  DMP         1.00            2.00            50.0            200
  DMDTP       1.04            2.08            52.0            208
  DEP         1.40            2.80            70.0            280
  DEDTP       1.35            2.70            67.5            270
Mix B
  DMTP        NA(a)           3.82            23.9            191
  DETP        NA              4.44            27.8            222
(a) There was only one low level of Mix B.

       Table 3 presents the reported detection limits for each compound for the four participating
labs. Note that because the Low #1 spike level was considerably below the detection limits of all
but Lab A, samples spiked with Mix A at this level were expected to resemble unspiked samples
(with regard to Mix A) for those labs with higher detection limits. Meanwhile, for Lab A, analysis
of the Mix A compounds at the Low #1 level was expected to provide information on performance at
a level  close to their detection limit.
Table 3. Detection limits reported by each participating laboratory (µg/L)

Laboratory    DMP     DMTP    DMDTP    DEP     DETP    DEDTP
A             0.6     0.5     0.2      0.3     0.3     0.2
B             5       5       10       5       5       10
C             1.6     1       0.8      1       0.9     0.6
D             2.5     2.5     2.5      2.5     2.5     NA(a)
(a) Lab D does not routinely analyze DEDTP.

       The study design addressed the principal statistical objective of the project, which was to
make statistical comparisons of average analytical results across analysis methods for each of the six
compounds. To the extent possible, the design took into account other factors that could have
contributed to differences among the analytical results, such as having different participating
laboratories and having samples with different concentration levels of the compounds, so that
differences among analytical methods could be detected with greater sensitivity.

       The study design required each laboratory to analyze 35 samples, with each of the 25
possible sample types represented by either one or two samples.  These 35 samples consisted of the
following:

       •      2 samples where neither Mix A nor Mix B was present
       •      4 samples where Mix A was not spiked, but Mix B was spiked at one of three spiking
              levels (2 samples spiked with Mix B at the Low #2 concentration level; 1 sample
              spiked with Mix B at each of the medium and high levels)
       •      4 samples where Mix B was not spiked, but Mix A was spiked at one of four spiking
              levels (1 sample spiked with Mix A at each of the Low #1, Low #2, medium, and
              high spiking levels)
       •      7 samples where one mix was spiked at the Low #1 level (Low #2  level for Mix B)
              and the other mix was spiked at either the Low #2, Medium, or High level (1 sample
              for each of these 7 spiking combinations)
       •      18 samples where each mix was spiked at either the Low #2, Medium, or High
              spiking levels (2 samples for each of these 9 spiking combinations)
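
The sample counts listed above can be cross-checked with a short script. The following Python sketch is illustrative only and is not part of the study's procedures; the names LEVELS, REPLICATED, and samples_for are hypothetical. It uses the nominal spiking levels, so Mix B Low #1 is counted as its own level (as in Table 4) even though it was prepared at the Low #2 concentration.

```python
from itertools import product

# Nominal spiking levels applied to each mix (Mix A and Mix B).
LEVELS = ["Unspiked", "Low #1", "Low #2", "Medium", "High"]

# Levels whose Mix A x Mix B combinations were prepared in duplicate.
REPLICATED = {"Low #2", "Medium", "High"}


def samples_for(mix_a, mix_b):
    """Number of samples prepared for one (Mix A level, Mix B level) sample type."""
    if mix_a == "Unspiked" and mix_b == "Unspiked":
        return 2  # two samples with neither mix present
    if mix_a in REPLICATED and mix_b in REPLICATED:
        return 2  # duplicates for the 9 Low #2 / Medium / High combinations
    return 1      # every other sample type is represented once


design = {(a, b): samples_for(a, b) for a, b in product(LEVELS, LEVELS)}

print(len(design))           # 25 sample types
print(sum(design.values()))  # 35 samples per laboratory
```

The two printed values agree with the 25 sample types and the 35 samples per laboratory stated in the text.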

-------
       This information on the numbers and types of samples that each laboratory analyzed is
summarized within the matrix in Table 4.  Each laboratory was directed to test the 35 samples in the
order given in Table 5. The testing order was determined by dividing the 35 samples into two
groups of 17 plus one extra unspiked sample. Within each group of 17, one sample was tested for
each of the 9 sample types where both mixes were represented at either the Low #2, Medium, or
High spiked levels. In addition, the samples represented by asterisks within the matrix in Table 4
were tested within the first group of 17 samples. The concentrations of alkyl phosphates
corresponding to the Low #1, Low #2, Medium, and High spiking levels are listed in Table 2.
Appendix Table B-5 lists the number of samples of each concentration level analyzed as part of this
study.

       Occasionally, the labs provided more than one measurement for a given urine sample,
representing duplicate sample analysis.  The statistical analysis included all reported measurements
except the unspiked pooled urine samples, which were omitted so the performance of the methods near
their reported detection limits could be evaluated in the absence of background levels of the target
analytes. It took into account when measurements were associated with a common sample and the
fact that Lab A analyzed three sets of 35 samples.

Table 4. Sample testing matrix for each participating laboratory, according to the spiking
concentration levels associated with each compound mix

                                      Mix A Spiking Level
Mix B Spiking Level   Unspiked      Low #1        Low #2        Medium        High
Unspiked              2 samples*    1 sample      1 sample*     1 sample      1 sample*
Low #1(a)             1 sample      1 sample*     1 sample      1 sample*     1 sample
Low #2                1 sample*     1 sample      2 samples     2 samples     2 samples
Medium                1 sample      1 sample*     2 samples     2 samples     2 samples
High                  1 sample*     1 sample      2 samples     2 samples     2 samples
(a) Due to a sample preparation error, Mix B Low Level #1 was prepared at the same concentration as Mix B Low Level #2.
* Included among the first 17 samples tested at each laboratory.

-------
Table 5.  The randomized order of sample testing specified for each laboratory

Test      Sample Type                  Test      Sample Type
Number    Mix A       Mix B(a)         Number    Mix A       Mix B(a)
1(b)      High        Unspiked         19        Low #1      Low #2
2         Low #2      Low #2           20        Low #2      Medium
3(b)      Unspiked    Unspiked         21        Low #1      Unspiked
4(b)      High        High             22        High        Low #2
5         Low #2      Unspiked         23        Low #1      High
6(b)      Unspiked    High             24(b)     Unspiked    Medium
7         Medium      Low #1           25(b)     High        Medium
8         Low #1      Medium           26        Medium      Low #2
9(b)      Medium      Medium           27(b)     High        High
10        Low #2      High             28        High        Low #1
11        Unspiked    Low #2           29(b)     Medium      Unspiked
12        Low #2      Medium           30(b)     Medium      High
13        High        Low #2           31        Low #2      High
14(b)     Medium      High             32(b)     Medium      Medium
15        Low #1      Low #1           33        Unspiked    Low #1
16        Medium      Low #2           34        Low #2      Low #2
17(b)     High        Medium           35        Unspiked    Unspiked
18        Low #2      Low #1
(a) Due to a sample preparation error, Mix B Low Level #1 was prepared at the same concentration as Mix B Low Level #2.
(b) Samples prepared in pooled urine; other samples prepared in synthetic urine.

-------
                                        Chapter 4
                                 Results and Discussion
       Each participating laboratory submitted the results of their analyses for statistical evaluation.
The primary statistical objectives of this study were: 1) to identify those compounds where
statistically significant differences existed (at the 0.05 level) in (log-transformed) reported
measurements between the concentration levels within each laboratory in order to determine the
approximate detection threshold of each laboratory and compare that against their reported detection
limits; and 2) to determine when statistically significant differences existed in the (log-transformed)
reported measurements between the analytical methods, in order to compare the overall performance
of the participating laboratories. To satisfy these objectives, an analysis of variance (ANOVA)
model was derived and fitted to the reported measurements. The ANOVA model was fitted
separately for each of the six compounds.  Appendix B gives a detailed description of the ANOVA
model used to analyze the data. The analysis utilized Version 8, Release 8.2, of the SAS® System.
Measurements falling below a laboratory's detection limit were replaced by one-half of the detection
limit prior to the statistical analysis.
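
The data preparation step described above (substituting one-half of the detection limit for measurements below it, then log-transforming) can be sketched as follows. This is a minimal Python illustration, not the SAS code used in the study. The function name prepare_measurement and the example reported values are hypothetical, and the natural logarithm is an assumption because the base of the transformation is not stated here. The detection limit of 10 µg/L is the value Lab B reported for DMDTP in Table 3.

```python
import math


def prepare_measurement(value, detection_limit):
    """Replace a below-detection-limit value with DL/2, then log-transform.

    Both arguments are concentrations in ug/L. The natural log is an
    assumption; the report says only that measurements were log-transformed.
    """
    if value < detection_limit:
        value = detection_limit / 2.0
    return math.log(value)


# Hypothetical reported values, evaluated against a 10 ug/L detection limit.
below_dl = prepare_measurement(3.0, 10.0)   # treated as 5.0 ug/L before the log
above_dl = prepare_measurement(52.0, 10.0)  # used as reported
print(below_dl, above_dl)
```

One consequence of this substitution is that every non-detect from a given laboratory for a given compound enters the analysis as the same value, so differences among such samples cannot contribute to a significant concentration effect.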

Statistical Differences Between Concentration Levels

       Before directly comparing the performance of each laboratory, the approximate detection
limit for each compound was determined from the results provided by each laboratory. For example,
if measurements for samples spiked at the Low #1 and Low #2 concentrations were found to be
statistically equivalent to measurements for unspiked samples, but measurements for samples spiked
at the Medium level were found to differ significantly from the unspiked and low-spiked samples,
then the first concentration level that should be used to compare laboratory performance should be
the Medium level.  In the above example, the concentrations reported for the unspiked, Low #1, and
Low #2 samples are statistically equivalent to non-detectable results, and therefore, any observed
differences between laboratories at these spiking levels are statistically inconsequential.
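
The selection rule in the example above can be written as a small lookup. The Python sketch below is illustrative only; the function name lowest_detectable_level is hypothetical, and the rule assumes that it is enough to find the lowest spiking level that differs significantly from the unspiked level. The two example inputs are the levels shown in Table 6 as differing from unspiked for DMDTP at Lab A and Lab B.

```python
# Spiking levels in increasing order of concentration.
LEVELS = ["L1", "L2", "M", "H"]


def lowest_detectable_level(significant_vs_unspiked):
    """Return the lowest level that differs significantly from unspiked (U).

    `significant_vs_unspiked` is the set of level codes whose pairwise
    comparison against U was significant. Returns None if no level differs.
    """
    for level in LEVELS:
        if level in significant_vs_unspiked:
            return level
    return None


lab_a_dmdtp = {"H", "M", "L2", "L1"}  # Lab A, DMDTP (Table 6)
lab_b_dmdtp = {"H", "M"}              # Lab B, DMDTP (Table 6)

print(lowest_detectable_level(lab_a_dmdtp))  # L1
print(lowest_detectable_level(lab_b_dmdtp))  # M
print(lowest_detectable_level(set()))        # None
```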

       To address the first statistical objective, statistical tests were  performed within the ANOVA
to determine,  within each laboratory, if significant differences existed in the reported concentrations
between the different levels of fortification.  When the ANOVA determined that the effect of the
spiking concentration was significant (i.e., there were statistically significant differences in the
reported results between samples spiked at different concentration levels), then multiple comparison
procedures were performed within the ANOVA to determine which  pairs of concentration levels
differed significantly. Each pairwise comparison of concentration levels of the mix was performed
using the Bonferroni-adjustment method, to ensure that the overall error rate associated with all
pairwise comparisons was no greater than 0.05.  Table 6 displays the results of these statistical tests.
Each cell within the table, corresponding to a given compound and laboratory, lists those pairs of
concentration levels that are statistically different from each other at an overall 0.05 significance
level. For example, if the High level was determined to be significantly different from each of the
unspiked, Low #1, Low #2, and Medium levels, the table would show: H vs. U, L1, L2, M. This
model only reports instances of significant differences among pairs of spiking levels. Appendix
Tables B-4a through B-4f provide the geometric means of the reported concentrations at each
spiking level for each laboratory, as well as other statistical summary parameters that characterize
the distribution of the reported data at a given spiking level.
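
The Bonferroni adjustment used for the pairwise comparisons can be sketched as follows. The level labels follow the abbreviations used in Table 6; the function name is hypothetical. With five spiking levels (Mix A) there are 10 pairs, and with four levels (Mix B) there are 6 pairs, so each individual comparison is tested at 0.05 divided by the number of pairs.

```python
from itertools import combinations

# Spiking levels as abbreviated in Table 6 (U = unspiked).
levels_mix_a = ["U", "L1", "L2", "M", "H"]
levels_mix_b = ["U", "L2", "M", "H"]


def bonferroni_threshold(levels, overall_alpha=0.05):
    """Return all pairs of levels and the per-comparison significance level."""
    pairs = list(combinations(levels, 2))
    return pairs, overall_alpha / len(pairs)


pairs_a, alpha_a = bonferroni_threshold(levels_mix_a)
pairs_b, alpha_b = bonferroni_threshold(levels_mix_b)

print(len(pairs_a), alpha_a)  # number of pairs and per-test level for Mix A
print(len(pairs_b), alpha_b)  # number of pairs and per-test level for Mix B
```

A pair of levels would then be reported in Table 6 only if the p-value for that comparison fell below the per-comparison threshold.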
Table 6.  Summary of concentration level effects for each lab and for each alkyl phosphate
target compound a,b

Compound  Lab    Significant Concentration Level Effect
DMP       Lab A  H vs. U, L1, L2; M vs. L1, L2, U
          Lab B  No significant differences.
          Lab C  H vs. L1, L2, M
          Lab D  No significant differences.
DMDTP     Lab A  H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U; L1 vs. U
          Lab B  H vs. U, L1, L2, M; M vs. U, L1, L2
          Lab C  H vs. U, L1, L2, M; M vs. U, L1, L2
          Lab D  H vs. U, L1, L2; M vs. U, L1, L2
DEP       Lab A  H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U; L1 vs. U
          Lab B  H vs. U, L1, L2, M; M vs. U, L1, L2
          Lab C  H vs. U, L1, L2, M; M vs. U, L1, L2
          Lab D  H vs. U, L1, L2; M vs. U, L1, L2
DEDTP     Lab A  H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U, L1; L1 vs. U
          Lab B  H vs. U, L1, L2, M; M vs. U, L1, L2
          Lab C  H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U
          Lab D  NR c
DMTP      Lab A  H vs. U, L2, M; M vs. U, L2
          Lab B  H vs. U, L2, M; M vs. U, L2
          Lab C  H vs. U, L2, M; M vs. U, L2; L2 vs. U
          Lab D  H vs. U, L2, M; M vs. U, L2; L2 vs. U
DETP      Lab A  H vs. U, L2, M; M vs. U, L2; L2 vs. U
          Lab B  H vs. U, L2, M; M vs. U, L2
          Lab C  H vs. U, L2, M; M vs. U, L2; L2 vs. U
          Lab D  H vs. U, L2, M; M vs. U, L2; L2 vs. U

a F tests were used to test for significant concentration level effects for each lab, where the Benjamini and Hochberg multiple comparison adjustment
method was used to control the overall error rate across all of these tests to be no higher than 0.05. When significant differences among concentration
levels were present for a given lab, pairwise comparisons were made between each pair of concentration levels for the given lab, with each pairwise
comparison performed using the Bonferroni adjustment method to ensure that the overall error rate across the pairwise comparisons was no greater than
0.05. Pairs of concentration levels differing significantly at the Bonferroni-adjusted 0.05 level are identified in parentheses.
b Mix A compounds and Mix B compounds were spiked at five and four concentration levels, respectively.
c No results because this laboratory does not routinely measure DEDTP.

       Few significant differences between spiking levels were observed in the reported results for
DMP among the laboratories, indicating analytical difficulties with this compound (Table 6).  The
Lab A results indicated that only the Medium and High concentration levels were significantly
different from (and higher than) the unspiked sample, but that they were not statistically different
from one another. The results from Lab C indicate that while results for the High level were
significantly different from (and higher than) the Medium, Low #1, and Low #2 levels, results for
the unspiked level were not significantly different from any of the spiking levels, including the High
level. Lab B and D results indicated no significant differences between the five concentration levels.
In addition to these findings, the data in Appendix Table B-4a show that across all laboratories, the
highest individual sample result reported for the High level spike of DMP was 17.5 µg/L, when the
known spiked concentration was 200 µg/L. Similarly, for the Medium level samples, the highest
individual sample result reported was 15.1 µg/L when the known spiked concentration was 50 µg/L.
It is unclear why the results for DMP were so poor across all four laboratories. The addition of
DMP to the spiked urine samples was confirmed by review of the solution preparation records.
One possible explanation is that a matrix interference in the urine samples keeps the DMP from
being extracted and analyzed, through either physical occlusion or degradation that takes place in
the urine matrix.  The purity of the DMP standard used to make the urine solutions also needs to be
verified to investigate the possibility that an impure standard was the reason for the poor results for DMP.

       Beyond DMP, the interpretation of results in Table 6 for the rest of the compounds is
relatively straightforward. For all four laboratories, results for samples spiked at the High  and
Medium levels were significantly different from (and greater than) the results at the Low and
unspiked levels.  Below the Medium concentration level, the results from each laboratory differed
because of the range of detection limits for each target analyte.

       The reported detection limits for Lab A were all below the Low #1 concentration level. For
DMDTP and DEP, Lab A determined the Low #1 and Low #2 concentration levels to be
significantly different from the unspiked samples, but was unable to detect a significant difference
between the Low #1 and Low #2 concentration levels. For DEDTP and DETP, Lab A
determined each possible concentration level to be significantly different from the unspiked samples
and each other.  For DMTP, Lab A was unable to detect a significant difference between the
Medium and Low #2 concentration levels.  The reported detection limits for Lab B were between the
Low #2 and Medium concentration levels.  As expected, Lab B was unable to detect a significant
difference between any of the Low #1 or #2 concentration levels and the unspiked samples. The
detection limits for Labs C and D were at or near the Low #2 concentration level and both performed
similarly.  For DMDTP and DEP, neither of these labs was able to detect a significant difference
between the Low #2, Low #1, or unspiked concentration levels.  Also, they were unable to detect a
significant difference between the Medium and High concentration levels for those two target
analytes. For DEDTP (Lab C only), DMTP, and DETP they both determined each concentration
level to be significantly different from the unspiked samples and each other. With the exception of
DMP, Table 7 gives the lowest spiked level for each target analyte that was significantly different
from the unspiked level.
       In summary, Lab A was able to detect concentrations near its reported detection limits to be
significantly different from the unspiked samples for all the target analytes except for DMTP.
However, Lab A was able to detect a significant difference between the Low #1 and Low #2 spiking
levels for only DEDTP (the Mix B compounds were not spiked at the Low #1 level). Lab B was
able to detect concentrations well above its reported detection limits to be significantly different
from the unspiked samples for all the target analytes. Labs C and D were able to detect
concentrations near their reported detection limits to be significantly different from the unspiked
samples for all the target analytes except for DMDTP and DEP (Lab D does not measure DEDTP).

Table 7. Lowest spiking level of alkyl phosphate target analytes that were significantly larger
than the unspiked level (µg/L) a

Lab     DMDTP      DEP        DEDTP    DMTP     DETP
Lab A   Low #1 b   Low #1 b   Low #1   Medium   Low #2
Lab B   Medium     Medium     Medium   Medium   Medium
Lab C   Medium     Medium     Low #2   Low #2   Low #2
Lab D   Medium     Medium c   NR d     Low #2   Low #2

a All spike levels of DMP were statistically indistinguishable from the unspiked level.
b Result not statistically different from the Low #2 concentration level.
c Result not statistically different from the High concentration level.
d No results because this laboratory does not routinely measure DEDTP.

Accuracy of Reported Concentrations

       An estimate of the accuracy of the results reported by each laboratory relative to the spiking
level of the sample (Table 5) was calculated as follows:

                    \[ \%\,\text{Accuracy} = 100 \left( \frac{\bar{x}}{T} \right) \]

where $\bar{x}$ is the mean measured value (across all reported measurements for a given spiking level by a
given laboratory), and T is the known fortified concentration. Table 8 gives the average recoveries
for concentration levels detectable significantly above the unspiked samples for each lab.  Table 9
summarizes the average recoveries listed in Table 8 by providing the range of these average
recoveries along with their mean and standard deviation. On average, Labs A and C were within
12% of the known concentrations spiked into the urine samples. They also had relatively small
uncertainties around their average recoveries, but the range of their recoveries were from  60% to
180% for Lab A and from 65% to 165% for Lab C.  The accuracy achieved by Lab D was
approximately 100% on average, but its standard deviation is somewhat larger than Labs A and C.
The larger uncertainty is driven by the broad range of recoveries, from 31% to 236%.  Lab B grossly
over-recovered the target analytes in most instances, which resulted in a
very large average recovery.  Furthermore, recoveries for Lab B ranged from 96% through 1,389%.
Overall, the accuracy of Labs A, C, and D was reasonable for this type of analysis, but the rather
poor precision across all the laboratories indicates the difficulty in extracting these target analytes
from urine in a consistent fashion. The two labs that use isotopically labeled internal standards,
Labs A and C, produced more precise results than the other laboratories.

Table 8.  Recoveries of alkyl phosphate target analytes in spiked urine samples

                 % Recovery (number of samples) at four different concentration levels
Analyte  Lab     Low #1       Low #2        Medium       High
DMP      Lab A   - a          -             -            -
         Lab B   -            -             -            -
         Lab C   -            -             -            -
         Lab D   ND b         ND            ND           -
DMDTP    Lab A   167.4 (9)    112.4 (8)     74.4 (16)    74.9 (24)
         Lab B   ND           -             1389 (7)     1292 (8)
         Lab C   -            -             85.3 (8)     81.5 (8)
         Lab D   ND           ND            69.0 (8)     52.7 (8)
DEP      Lab A   177.9 (9)    121.2 (16)    64.8 (24)    60.4 (24)
         Lab B   -            -             95.7 (8)     110.8 (8)
         Lab C   -            -             66.3 (8)     64.9 (8)
         Lab D   ND           ND            45.3 (8)     31.4 (8)
DEDTP    Lab A   120.6 (8)    84.9 (23)     81.3 (24)    85.3 (24)
         Lab B   -            -             414.1 (8)    430.1 (8)
         Lab C   -            164.8 (8)     72.1 (8)     79.3 (8)
         Lab D   NR c         NR            NR           NR
DMTP     Lab A   NS d         -             87.3 (24)    86.4 (24)
         Lab B   NS d         -             233.1 (8)    128.0 (8)
         Lab C   NS d         84.6 (13)     85.7 (8)     86.7 (8)
         Lab D   NS d         235.6 (8)     135.0 (8)    85.4 (8)
DETP     Lab A   NS d         180.0 (39)    88.1 (24)    79.2 (24)
         Lab B   NS d         -             190.6 (7)    110.7 (8)
         Lab C   NS d         103.0 (13)    82.3 (8)     82.0 (8)
         Lab D   NS d         156.6 (11)    116.1 (8)    74.1 (8)

a The dash indicates that results for this spiking level were not significantly different from the unspiked level for this lab, as indicated
in Table 6.
b "ND" indicates that at this concentration level, the laboratory reported all the results to be below their detection limit.
c No results reported because the laboratory does not routinely measure DEDTP.
d "NS" There was no Low #1 level for Mix B.

Table 9.  Range of and average recoveries across all target analytes and all detectable
concentration levels

Lab     Range (%)    Average Recovery ± Standard Deviation (%)
Lab A   60 - 180     103 ± 39
Lab B   96 - 1,389   439 ± 490
Lab C   65 - 165     88 ± 25
Lab D   31 - 236     100 ± 62
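
The percent accuracy calculation and the Table 9 summary can be sketched as follows. The function name and the three replicate values in the first example are hypothetical. The Lab C list holds the thirteen average recoveries printed for Lab C in Table 8. Using the sample standard deviation (n - 1 denominator) is an assumption, although it reproduces the rounded Lab C entry in Table 9.

```python
from statistics import mean, stdev


def percent_accuracy(measured, fortified):
    """100 * (mean measured value / known fortified concentration)."""
    return 100.0 * mean(measured) / fortified


# Hypothetical replicate results (ug/L) for a sample fortified at 52.0 ug/L.
example = percent_accuracy([42.0, 44.0, 46.0], 52.0)

# Lab C average recoveries (%) transcribed from Table 8:
# DMDTP (M, H), DEP (M, H), DEDTP (L2, M, H), DMTP (L2, M, H), DETP (L2, M, H).
lab_c = [85.3, 81.5, 66.3, 64.9, 164.8, 72.1, 79.3,
         84.6, 85.7, 86.7, 103.0, 82.3, 82.0]

summary = {
    "min": min(lab_c),
    "max": max(lab_c),
    "mean": mean(lab_c),
    "sd": stdev(lab_c),  # sample standard deviation (assumed)
}
print(round(example, 1), {k: round(v, 1) for k, v in summary.items()})
```

Rounded to whole percentages, the Lab C summary agrees with the Table 9 entries of 65 to 165 for the range and 88 ± 25 for the average recovery.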
       To summarize the performance of each laboratory near their reported detection limits, as
discussed in the previous section, Table 10 lists the lowest spiking concentrations that were
statistically different from the unspiked samples, the average percent accuracy at that concentration,
and the detection limit for the target analytes at each laboratory.  It shows the difference between the
lowest detectable concentration and the reported detection limit and also how accurate the
measurements were at that concentration level.

Table 10. Lowest spiked concentration (LSC) of alkyl phosphate target analytes that were
significantly larger than the unspiked level, the average percent recovery at that spiking level
in parentheses, and the reported detection limit (DL) for each participating laboratory (all
concentrations in µg/L) a

Analyte  Lab     LSC (% recovery)   DL
DMDTP    Lab A   1.04 b (167)       0.2
         Lab B   52.0 (1,389)       10
         Lab C   52.0 (85)          0.8
         Lab D   52.0 c (69)        2.5
DEP      Lab A   1.40 b (121)       0.3
         Lab B   70.0 (96)          5
         Lab C   70.0 (66)          1
         Lab D   70.0 c (45)        2.5
DEDTP    Lab A   1.35 (121)         0.2
         Lab B   67.5 (414)         10
         Lab C   2.70 (165)         0.6
         Lab D   NR d               NA d
DMTP     Lab A   23.9 (87)          0.5
         Lab B   23.9 (233)         5
         Lab C   3.82 (85)          1
         Lab D   3.82 (236)         2.5
DETP     Lab A   4.44 (180)         0.3
         Lab B   27.8 (191)         5
         Lab C   4.44 (103)         0.9
         Lab D   4.44 (157)         2.5

a All spike levels of DMP were statistically indistinguishable from the unspiked level.
b Result not statistically different from the Low #2 concentration level.
c Result not statistically different from the High concentration level.
d No results because this laboratory does not routinely measure DEDTP.
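
One way to express the gap between the lowest spiked concentration and the reported detection limit is the ratio of the two, sketched here for DMDTP. The values are transcribed from the DMDTP entries of Table 10, in the concentration units of that table; the ratio itself is an illustrative summary and does not appear in the report.

```python
# DMDTP entries transcribed from Table 10.
lsc = {"Lab A": 1.04, "Lab B": 52.0, "Lab C": 52.0, "Lab D": 52.0}
dl = {"Lab A": 0.2, "Lab B": 10.0, "Lab C": 0.8, "Lab D": 2.5}

# Ratio of the lowest spiked concentration found to differ from the
# unspiked level to the laboratory's reported detection limit.
ratio = {lab: round(lsc[lab] / dl[lab], 1) for lab in lsc}
print(ratio)
```

Because only a few spiking levels were used, this ratio is bounded by the spacing of those levels and should be read as an upper bound on the gap rather than as an estimate of it.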

Statistical Differences Between the Laboratories

       The second statistical objective was to investigate the presence of significant differences in
the reported results among laboratories, taking into account the different spiking levels of the
samples. This objective was addressed by performing additional statistical tests within the ANOVA
discussed earlier.  When statistical tests determined that the laboratory effect was significant at a
given spiking level (i.e., there were statistically significant differences in the reported results among
laboratories), then those laboratories whose results at that spiking level were significantly different
from the unspiked level (Table 6) were identified. Among these laboratories, those pairs of
laboratories that differed significantly at that spiking level were identified. Each pairwise
comparison of laboratories was performed using the Bonferroni - adjustment method, to ensure that
the overall error rate associated with all pairwise comparisons (at a given spiking level) was no
greater than 0.05 (Table 11).  Because the Low #1 spiking level was found to differ significantly
from the unspiked level only for Lab A, that concentration level was omitted from the table.
Similarly, no results are included in Table 11 for DMP, as no spiking levels differed significantly
from the unspiked level for any laboratory. The laboratories included in the pairwise comparisons
are noted within each cell.
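
The selection of laboratories for the pairwise comparisons can be sketched as follows, using the lowest significant spiking levels listed in Table 7. The function name is hypothetical, and the sketch assumes that once a spiking level differs from the unspiked level for a laboratory, every higher level does as well, which holds for the results in Table 6.

```python
LEVELS = ["Low #1", "Low #2", "Medium", "High"]

# Lowest spiking level significantly above the unspiked level (Table 7).
# None marks a compound the laboratory does not measure.
LOWEST = {
    "DMDTP": {"Lab A": "Low #1", "Lab B": "Medium", "Lab C": "Medium", "Lab D": "Medium"},
    "DEP":   {"Lab A": "Low #1", "Lab B": "Medium", "Lab C": "Medium", "Lab D": "Medium"},
    "DEDTP": {"Lab A": "Low #1", "Lab B": "Medium", "Lab C": "Low #2", "Lab D": None},
    "DMTP":  {"Lab A": "Medium", "Lab B": "Medium", "Lab C": "Low #2", "Lab D": "Low #2"},
    "DETP":  {"Lab A": "Low #2", "Lab B": "Medium", "Lab C": "Low #2", "Lab D": "Low #2"},
}


def eligible_labs(compound, level):
    """Labs whose results at `level` differed significantly from unspiked."""
    rank = LEVELS.index(level)
    return [
        lab
        for lab, lowest in LOWEST[compound].items()
        if lowest is not None and LEVELS.index(lowest) <= rank
    ]


for compound in LOWEST:
    labs = eligible_labs(compound, "Low #2")
    n_pairs = len(labs) * (len(labs) - 1) // 2  # pairwise comparisons possible
    print(compound, labs, n_pairs)
```

At the Low #2 level this yields Labs A and C for DEDTP, Labs C and D for DMTP, and Labs A, C, and D for DETP, matching the comparisons noted in Table 11. For DMDTP and DEP only Lab A qualifies, so no pairwise comparison is possible.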

       For DMDTP at the  Medium and High spiking levels, Lab B, with average recoveries of
1,389% and 1,292%, was significantly different from the other three laboratories, whose average
recoveries ranged from 53% to 85%. Labs A, C, and D did not differ significantly from each other
at these two spiking levels.

       For DEP, there was no significant difference between laboratories at the Medium
concentration level according to the ANOVA, so no pairwise comparisons of labs were performed.
The average recoveries for all the laboratories ranged from 45% to 96%. At the High concentration
level, where all four laboratories were also compared, Lab B (111% average recovery) differed
significantly only from Lab D (31% average recovery), while Labs A (60% average recovery), C
(65% average recovery), and D did not differ significantly from one another. Also, Lab B did not
differ significantly from Labs A and C.

       For DEDTP at the Low #2 concentration level, only Labs A and C were compared, and there
was no significant difference found between them.  Their average recoveries were 85 and 165%,
respectively.  At the Medium and High concentration levels, when pairs of all three laboratories
analyzing DEDTP were compared, Lab B, which had average recoveries of 414 and 430%, was
significantly different from Labs A and C, whose average recoveries ranged from 72 to 85%.  Labs
A and C were not significantly different from one another.

       For DMTP, only Labs C and D were compared at the Low #2 concentration level and there
was no significant difference found. Their average recoveries were 85% and 236%, respectively.
All four laboratories were compared at the two higher concentration levels. At the Medium level,
Lab A (87% average recovery) was significantly different from Lab B (233% average recovery) but
was not significantly different from Labs C (86% average recovery) and D (135% average recovery).
Labs B, C, and D were not significantly different from one another. At the High level, there was no
significant difference among the four laboratories; their average recoveries ranged from  85% to
128%.

Table 11.  Summary of lab effects at Low #2, Medium, and High concentration levels a

Compound  Concentration Level  Significant Lab Effects
DMP       Low #2               NA b
          Medium               NA
          High                 NA
DMDTP     Low #2               - c
          Medium               Lab B vs Lab A, Lab C, Lab D (comparing all labs)
          High                 Lab B vs Lab A, Lab C, Lab D (comparing all labs)
DEP       Low #2               - c
          Medium               No significant differences in labs
          High                 Lab B vs Lab D (comparing all labs)
DEDTP d   Low #2               No significant pairs of differences (comparing Labs A and C only)
          Medium               Lab B vs Lab A, Lab C (comparing all labs)
          High                 Lab B vs Lab A, Lab C (comparing all labs)
DMTP      Low #2               No significant pairs of differences (comparing Labs C and D only)
          Medium               Lab A vs Lab B (comparing all labs)
          High                 No significant differences in labs
DETP      Low #2               Lab A vs Lab C (comparing Labs A, C, and D only)
          Medium               No significant differences in labs
          High                 No significant differences in labs

a F tests were used to test for significant lab effects at each concentration level of the compound, where the Benjamini and Hochberg
multiple comparison adjustment method was used to control the overall error rate across all of these tests to be no higher than 0.05.
When significant differences among labs were present at a given concentration level, pairwise comparisons were made between each
pair of labs at the given concentration level, with each pairwise comparison performed using a Bonferroni adjustment method to
ensure that the overall error rate across the pairwise comparisons was no greater than 0.05.  Pairs of labs differing significantly at the
Bonferroni-adjusted 0.05 level are identified in parentheses.
b No labs had spiking levels that differed significantly from the unspiked level for DMP (see Table 6).
c The dash indicates that the specified spiking level differed significantly from the unspiked level for either no lab or only one lab,
and so no pairwise comparisons are reported among labs.
d Lab D did not analyze for DEDTP.

       For DETP, at the Low #2 concentration level, Lab A (180% average recovery) was
significantly different from Lab C (103% average recovery) but was not significantly different from
Lab D (157% average recovery). Labs C and D were not significantly different from one another.
At the Medium and High concentration levels, there were no significant differences among the four
laboratories; their average recoveries ranged from 82% to 191%  at the Medium level, and from 74%
to 111% at the High level.

       Overall, Lab A reported concentrations that were significantly different from (and higher
than) the unspiked samples for DMDTP, DEP, and  DEDTP at the Low #1 and #2 concentration
levels, DETP at the Low #2 concentration level, and for all of the target analytes except DMP at the
Medium and High concentration levels.  The range  of average recoveries for Lab A was 60% to
180%. Lab B reported concentrations that were significantly different from (and higher than) the
unspiked samples for all the target analytes except DMP at the Medium  and High concentration
levels. However, with the exception of DEP and the High level concentrations of DMTP and DETP,
Lab B's recoveries were generally much greater than 100%.  Lab C reported concentrations that
were significantly different from the unspiked samples for DEDTP, DMTP, and DETP at the Low
#2 concentration level and for all the target analytes except DMP at the Medium and High
concentration levels. The range of average recoveries for Lab C was 65% to 165%. Lab D reported
concentrations that were significantly different from the unspiked samples for DMTP and DETP at
the Low #2 concentration level, and for all of the target analytes except DMP at the Medium and
High concentration levels. However, for DMDTP and DEP, the High level concentrations were not
statistically different from the Medium level. The range of average recoveries for Lab D was 31% to
236%.

                                       References

Bradway, D.E., Shafik, T.M., and Lores, E.M.  (1977). "Comparison of cholinesterase activity,
residue levels, and urinary metabolite excretion of rats exposed to organophosphorus pesticides." J.
Agric. Food Chem., 25:1353-1358.

Fortmann, R.C., Sheldon, L.S., Smith, D., Perritt, K., and Camann, D.E.  (1991). House Dust/Infant
Pesticides Exposure Study (HIPES).  Final report to EPA, Contract No. 68-02-4544, Research
Triangle Institute and Southwest Research Institute.

Franklin, C.A., Fenske, R.A., Greenhalgh, R., Mathieu, L., Denley, H.V., Leffingwell, J.T., and
Spear, R.C. (1981). "Correlation of urinary pesticide metabolite excretion with estimated dermal
contact in the course of occupational exposure to Guthion."  J. Toxicol. Environ. Health 7:715-731.

Lewis, R.G., Bond, A.E., Johnson, D.E., and Shu, J. P. (1988). "Measurement of atmospheric
concentrations of common household pesticides:  A pilot study." Environ. Monitoring Assessment
10:59-73.

Lin, D.C.K., Melton, R.G, Kopfler, F.C., and Lucas, S.V. (1981). "Glass capillary gas
chromatographic/mass spectrometric analysis of organic concentrates from drinking and advanced
waste treatment waters." In: Advances in the Identification and Analysis of Organic Pollutants in
Water, Vol. 2, Chapter 46 (L.H. Keith,  ed.). Ann Arbor Science Publ., Inc., Ann Arbor, MI. pp.
861-906.

Morgan, D.P., Hetzler, H.L., Slach, E.F., and Lin, L.I. (1977). "Urinary excretion of
paranitrophenol and alkyl phosphates following ingestion of methyl or ethyl parathion by human
subjects." Arch. Environ. Contam. Toxicol. 6:159-173.

Murphy, R.S., Kutz, F.W., and Strassman, S.C. (1983).  "Selected pesticide residues or metabolites
in blood and urine specimens from a general population survey." Environ. Health Perspect. 48:81-
86.

Schattenberg, H.J., III and Hsu, J.-P. (1992). "Pesticide residue survey of produce from 1989 to
1991." J. AOAC Intl. 75:925-933.

                                      Appendix A
                   Descriptions of the Participating Laboratories'
                 Methods for Measuring Alkyl Phosphates in Urine

Lab A
1.  A 4.0 mL aliquot of urine is spiked with 25 µg/L of deuterated DMP, DMDTP, DEP, DMTP, and
DETP and 13C-labeled DEDTP as internal standards.
2.  4 mL of acetonitrile is added to the sample and mixed.
3.  Samples are evaporated at 50°C using a Turbovap apparatus until approximately 4 mL of solution
remains.
4.  An additional 4 mL of acetonitrile is added and the evaporation is repeated using the same
conditions until 2 mL of solution remains.
5.  This step is repeated once more until the urine contents in the tube are totally concentrated.
6.  The concentrated residue is reconstituted with 1 mL of acetonitrile and 50 µL of derivatizing
agent, 1-chloro-3-iodopropane.
7.  The sample is maintained at room temperature for 1 h and then is transferred to a clean test tube.
A few grains of potassium carbonate are added to the sample, and the sample is placed in a heater
block for 2 h at 80°C.
8.  The sample is evaporated in a Turbovap apparatus using the same conditions as described above
until the final volume is 100 µL.
9.  The sample is transferred to an autosampler vial, sealed, and stored at -20°C until analysis.
10. 1 µL of each sample is injected into a triple quadrupole GC/MS outfitted with a 30-m DB-5MS
capillary column (0.25 mm i.d., 0.25 µm film thickness). One quantitation ion is the molecular ion
produced by chemical ionization in the positive ion mode and the other quantitation ion is the
daughter ion produced by the insertion of a collision-induced dissociation gas.

Lab B
1.  1 mL of urine is spiked with fenthion as the internal standard.
2.  Samples are lyophilized and then derivatized with a benzyltolyltriazine reagent.
3.  A saturated salt solution is added to the reaction vessel and the benzyl derivatives are extracted
with cyclohexane.
4.  The solution is then analyzed by GC with a flame photometric detector outfitted with a 30-m DB-210
capillary column (0.53 mm i.d., 1.0 µm film thickness).

Lab C
1.  A 0.5 mL urine sample is spiked with deuterated DMP, DEP, DMTP, and DETP as internal standards.
2.  Acetonitrile is added to the sample and the mixture is centrifuged.
3.  The supernatant liquid is evaporated to dryness and redissolved in pure acetonitrile.
4.  The sample is then derivatized with pentafluorobenzyl bromide along with the addition of potassium
carbonate as a catalyst at 70°C for 2 h.
5.  The derivatized alkyl phosphates are extracted twice with a mixture of dichloromethane in
hexane (8% v/v), filtered on sodium sulphate, and evaporated to 0.2 mL.
6.  The alkyl phosphates are quantified on GC/MS with electron impact ionization.  The GC/MS is
outfitted with a 30-m HP-50+ capillary column (0.25 mm i.d., 0.25 µm film thickness).

Lab D
1.  5 mL of urine is pipetted into a centrifuge tube, 35 mL of acetonitrile is added, and the mixture is
centrifuged at 2500 rpm for 10 min.
2.  The supernatant liquid is decanted into a TurboVap flask and evaporated to 1 mL.
3.  Quantitatively transfer the distillate to a 15 mL screw cap test tube and add 1.5 mL of methanol
to the TurboVap flask.
4.  Add 2 mL acetonitrile to the test tube and a bilayer will form.
5.  Quantitatively transfer methanol from TurboVap flask to test tube, which will cause the formation
of a clear yellow solution and some precipitate.  Add 8 mL of acetone. Vortex.
6.  Centrifuge tubes for 10 minutes at 2500 rpm.
7.  Decant supernatant to a new test tube. Evaporate to dryness under a gentle stream of nitrogen.
Sample residue will be approximately 0.25 mL of yellow oil.
8.  To the oil residue, add 1 mL dehydrated acetone.
9.  Add 20 µL pentafluorobenzyl bromide derivatizing reagent.
10.  Cap and rotate at room temperature for 30 min.  Evaporate to near dryness under gentle stream
of nitrogen.
11.  Add approximately 20 mg of potassium carbonate to the dry residue.
12.  Extract with 10 mL hexane using vortex.
13.  Decant hexane extracts into a TurboVap flask and evaporate to 0.5 mL (thio-containing
phosphates).
14.  Add an additional 20 mg of potassium carbonate to the dry residue. Add 1.0 mL dehydrated
acetonitrile/dimethylformamide (4:1). Pipette 20 µL pentafluorobenzyl bromide derivatizing reagent
into the sample. Vortex.
15.  Cap and derivatize at 90°C for 30 minutes.
16.  Cool samples, add 2 mL water to dissolve remaining potassium carbonate.
17.  Extract residue with 3 x 5 mL hexane.
18.  Combine extract and evaporate to 0.5 mL for analysis (non-thio-containing alkyl phosphates).
19.  Both the thio and non-thio-containing extracts are then analyzed by GC with a pulsed flame
photometric detector outfitted with a 30-m SPB-20 capillary column (0.32 mm i.d., 1.0 µm film
thickness).

                                       Appendix B
                            Statistical Methods and Results

                                 Statistical Analysis Methods

       For each of the six alkyl phosphate compounds, the statistical objectives of this study were 1)
to identify those laboratories (i.e., analytical methods) where statistically significant differences
existed (at the 0.05 level) in reported measurements between spiking levels, and particularly, with
the unspiked level, and 2) to identify those spiking levels for which statistically significant
differences existed (at the 0.05 level) in reported measurements between the laboratories (i.e.,
analytical methods).  To satisfy these objectives, an analysis of variance (ANOVA) model was
derived and fitted to the reported measurements. The model was fitted separately for each
compound, corresponding to a total of six model fits. The data analysis utilized Version 8, Release
8.2, of the SAS® System.

       Note that each laboratory utilized a different analytical method. Thus, this statistical analysis
could not distinguish whether observed differences in results between two analytical methods are
due to differences associated with the methods or to differences associated with the laboratories
performing these methods.

       Descriptive statistics of the reported measurements were calculated within tables and figures
(plots). These statistics include sample size, geometric means, standard deviation, and selected
percentiles.  These summaries and other investigations of the reported measurements concluded that
the ANOVA model would be fitted to the log-transformed measurements. Measurements which the
laboratory reported as zero or less than the detection limit were replaced with one-half of the
detection limit prior to summarizing the measurements and analyzing the measurements using
ANOVA. However, results that specified a particular value that fell below the detection limit were
retained as reported when the data were summarized and analyzed.
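
The handling of non-detects and the log-scale summary described above can be sketched as follows. The function names, the use of `None` to mark a result reported only as less than the detection limit, and all numeric values are hypothetical. Note that a specific reported value is retained even when it falls below the detection limit.

```python
import math
from typing import Optional, Sequence


def prepare(value: Optional[float], dl: float) -> float:
    """Apply the substitution rule before summarizing or fitting.

    None (reported as "< DL") and zero are replaced by one-half of the
    detection limit; any other reported value is kept, even if below DL.
    """
    if value is None or value == 0:
        return dl / 2.0
    return value


def geometric_mean(values: Sequence[float]) -> float:
    """Exponentiated mean of the natural-log-transformed values."""
    logs = [math.log(v) for v in values]
    return math.exp(sum(logs) / len(logs))


# Hypothetical results with a hypothetical detection limit of 2.0.
reported = [None, 0.0, 1.0, 4.0]
dl = 2.0

prepared = [prepare(v, dl) for v in reported]
gm = geometric_mean(prepared)
print(prepared, gm)
```

Working on the log scale is what makes the geometric mean, rather than the arithmetic mean, the natural summary of the reported concentrations.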

       The ANOVA model took the following form:
       \[ Y_{ijkmrs} = \mu + LAB_i + C^I_j + C^O_k + (C^I*C^O)_{jk} + (LAB*C^I)_{ij} + (LAB*C^O)_{ik} + (LAB*C^I*C^O)_{ijk}
                 + SET_{m(i)} + R_{r(ijkm)} + e_{s(ijkmr)} \qquad (1) \]

              ($i=1,\ldots,I$; $j=1,\ldots,J$; $k=1,\ldots,K$; $m=1,\ldots,M_i$; $r=1,\ldots,R_{ijkm}$; $s=1,\ldots,S_{ijkmr}$)

where
   •   $Y_{ijkmrs}$ denotes the log-transformed measurement for the sth analysis performed on the
       physical sample uniquely identified by the combination of subscripts (i,j,k,m,r) (these
       subscripts are more fully defined in the bullets that follow),

   •   $\mu$ is an overall constant,

   •   $LAB_i$ is a fixed effect representing the ith laboratory or, equivalently, the ith analytical method
       ($I=3$ [Lab A, Lab B, Lab C] for DEDTP; $I=4$ [Lab A, Lab B, Lab C, Lab D] for the other
       compounds),

   •   $C^I_j$ is a fixed effect representing the jth spiking level of the mix in which the given compound
       is included ($J=5$ for Mix A [DMP, DMDTP, DEP, and DEDTP, where the spiking levels
       were denoted by Unspiked, Low #1, Low #2, Medium, High]; $J=4$ for Mix B [DMTP and
       DETP, where the spiking levels were denoted by Unspiked, Low, Medium, and High]),

   •   $C^O_k$ is a fixed effect representing the kth spiking level of the mix not containing the given
       compound ($K=5$ if this other mix is Mix A; $K=4$ if this other mix is Mix B),

   •   $SET_{m(i)}$ is a random effect representing the mth set of 35 samples provided to the ith laboratory
       ($M_1=3$ [i.e., for Lab A]; $M_i=1$ otherwise),

   •   $R_{r(ijkm)}$ is a random effect representing the rth sample containing the jth spiking level of the
       mix in which the given compound is included and the kth spiking level of the other mix,
       where the sample is within the mth set of samples analyzed by the ith laboratory (r=1, 2, or 3,
       depending on the sample type defined by the combination (j,k)),

   •   Terms containing asterisks represent interactions of the above effects, and

   •   $e_{s(ijkmr)}$ is random error not attributable to the model, as represented by variability in results
       for duplicate analyses of the same physical sample within a laboratory (where s can range
       from 1 to 4, depending on the specific combination of (i,j,k,m,r)).

Model (1) was fitted using the MIXED procedure in the SAS® System.

       Within the fitted ANOVA for each compound, two sets of statistical tests were performed to
address the two statistical analysis objectives stated above:

       •   F-tests for significant differences among spiking levels for the mix containing the given
          compound, one test for each laboratory.

       •   F-tests for significant differences among laboratories, one test for each spiking level for
          the mix containing the given compound.

These tests were possible because the model included a term representing the interaction of
laboratory and spiking level effects. The significance levels for each F-test in a set (and for a
particular compound) were adjusted using the Benjamini and Hochberg method (available in the SAS®
System), and a test was declared significant when its adjusted significance level was below 0.05.
Thus, any test in Set #1 having an adjusted significance level of less than 0.05 indicated
that significant differences among spiking levels existed for the given laboratory, and any test in
Set #2 having an adjusted significance level of less than 0.05 indicated that significant
differences among laboratories existed at the given spiking level. This adjustment of significance
levels was necessary to ensure that the overall rate of erroneously declaring a given test as
significant within a given set of tests (for a given compound) was no higher than 0.05.
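
       The adjustment can be illustrated with a minimal Python sketch of the standard
Benjamini and Hochberg step-up procedure, which is usually described as controlling the false
discovery rate. This is not the SAS code used for the report; the function name
benjamini_hochberg and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never decrease with rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values, one F-test per laboratory for a single compound.
raw = [0.001, 0.012, 0.030, 0.200]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these made-up inputs, the first three tests would be declared significant and the fourth
would not.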

       When significant differences were observed among spiking levels for a given laboratory (i.e.,
the outcome of a test in Set #1), additional F-tests were performed within the ANOVA to determine
which pairs of spiking levels differed significantly for that laboratory. A Bonferroni multiple
comparisons method was used, meaning that each test needed to have a significance level less than
0.05/T, where T was the total number of pairs of spiking levels, for the given pair of spiking
levels to be declared significantly different (T=10 for Mix A compounds and T=6 for Mix B
compounds). This approach ensured that, for a given laboratory, the overall error rate among all T
pairwise comparisons was no greater than 0.05.
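
       The values of T and the resulting per-test thresholds follow directly from the number of
spiking levels in each mix. The short Python sketch below reproduces that arithmetic; the
function name bonferroni_pairs is hypothetical.

```python
from itertools import combinations

MIX_A_LEVELS = ["Unspiked", "Low #1", "Low #2", "Medium", "High"]
MIX_B_LEVELS = ["Unspiked", "Low", "Medium", "High"]


def bonferroni_pairs(levels, alpha=0.05):
    """Return all unordered pairs of levels and the per-test significance threshold."""
    pairs = list(combinations(levels, 2))
    return pairs, alpha / len(pairs)


pairs_a, threshold_a = bonferroni_pairs(MIX_A_LEVELS)   # T = 10, threshold 0.05/10
pairs_b, threshold_b = bonferroni_pairs(MIX_B_LEVELS)   # T = 6, threshold 0.05/6
```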

       Similarly, when significant differences were observed among laboratories at a given spiking
level (i.e., the outcome of a test in Set #2), F-tests were performed within the ANOVA to determine
which pairs of laboratories differed significantly at that spiking level. Each pairwise comparison
of laboratories was performed using the Bonferroni adjustment method, meaning that each test needed
to have a significance level less than 0.05/T, where T was the total number of pairs of
laboratories of interest, for the given pair of laboratories to be declared significantly
different. While the analyses presented in this appendix considered all possible pairs of
laboratories, the analysis presented in the main body of this report considered only those pairs
where both laboratories reported measurements at the given spiking level that were significantly
different from (and, on average, higher than) the unspiked level, according to the tests described
in the previous paragraph.
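
       The restriction used in the main body of the report can be sketched as a simple filter
on laboratories before pairs are formed. The Python example below is one possible reading of that
rule; the function name eligible_lab_pairs and the test outcomes are hypothetical.

```python
from itertools import combinations


def eligible_lab_pairs(above_unspiked, level):
    """Lab pairs of interest at one spiking level.

    above_unspiked maps each lab to the set of spiking levels whose results were
    significantly different from, and on average higher than, the unspiked level.
    """
    labs = sorted(lab for lab, levels in above_unspiked.items() if level in levels)
    return list(combinations(labs, 2))


# Hypothetical outcome of the spiking-level tests for one compound.
above = {
    "Lab A": {"Medium", "High"},
    "Lab B": {"High"},
    "Lab C": {"Medium", "High"},
    "Lab D": set(),
}
pairs_medium = eligible_lab_pairs(above, "Medium")   # only Lab A and Lab C qualify
threshold = 0.05 / len(pairs_medium)                 # Bonferroni threshold with T = 1
```

Under this reading, T shrinks as laboratories drop out, so the per-test threshold is less strict
than when all possible pairs are compared.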

Data Analysis Results

       For each compound, Table B-1 specifies the number of measurements reported by each
laboratory, by spiking level. In most cases, one measurement was reported for each compound for a
given physical sample. However, occasionally the laboratories reported duplicate measurements for
the same physical sample for certain compounds. These instances are noted in parentheses within
Table B-1.

       For each compound, Table B-2 specifies the number of measurements reported by each
laboratory that were below the laboratory's detection limit (given in Table 1 of the main report).
The total number of measurements and the percentage of measurements below the detection limit are
also specified in this table. These numbers are reported by laboratory and spiking level and
include both individual sample results and duplicate results for the same sample.
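
       Each entry of Table B-2 is a count, a total, and the corresponding percentage. The Python
sketch below shows the arithmetic using the Lab A counts for DMP from that table; the function
name percent_not_detected is hypothetical.

```python
def percent_not_detected(n_not_detected, n_total):
    """Percentage of measurements below the detection limit, rounded to one decimal."""
    return round(100.0 * n_not_detected / n_total, 1)


# Lab A, DMP: 4 of 18 unspiked measurements and 30 of 111 measurements overall.
lab_a_unspiked = percent_not_detected(4, 18)    # 22.2
lab_a_overall = percent_not_detected(30, 111)   # 27.0
```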

       For each compound, Table B-3 summarizes accuracy percentages (i.e., average measurement
divided by the actual spiking level, specified in Table 5 of the main report, and expressed as a
percentage) that are associated with the reported measurements that were above the detection limit.
These percentages are reported by laboratory and spiking level. Note that no duplicate results for
the same sample, and no non-detected results, were used in calculating the average measurement that
is given in the numerator of these accuracy percentages. Cells containing a dash symbol (--) indicate
that no measurements were reported above detection limits.
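
       The accuracy calculation can be sketched in Python as follows. The input is assumed to
hold one result per physical sample (how a single result was chosen when duplicates existed is
not stated here, so that step is left to the caller), and the function name accuracy_percent and
all numbers are hypothetical.

```python
from statistics import mean


def accuracy_percent(results, detection_limit, spiking_level):
    """Accuracy (%) from one result per sample, ignoring non-detected results.

    Returns None when no result exceeds the detection limit (shown as a dash in the table).
    """
    detected = [x for x in results if x > detection_limit]
    if not detected:
        return None
    return 100.0 * mean(detected) / spiking_level


# Hypothetical values: four samples spiked at 50, detection limit 1.0, one non-detect.
acc = accuracy_percent([42.0, 48.0, 0.0, 45.0], detection_limit=1.0, spiking_level=50.0)
```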

       Tables B-4a through B-4f contain descriptive statistics of the reported analytical
measurements, presented by laboratory and spiking level, for each set of samples received by a
laboratory.  Separate tables exist for each compound.  Note that the measurements summarized in
these tables include duplicate measurements taken on the same physical sample.  Measurements
reported as zero or below the detection limit were replaced with one-half of the detection limit. Due
to possible contamination in the pooled urine samples, the following samples were excluded:
test numbers 3, 6, and 24 for compounds in Mix A; and test numbers 1, 3, and 29 for compounds in
Mix B. Because the data were analyzed after taking log transformations, geometric means and
geometric standard deviations, equal to the exponential value of the arithmetic mean and standard
deviation of the log-transformed data, are presented in these tables. The geometric means presented
in these tables for each spiking level and laboratory are presented graphically in Figures B-1 through
B-6, with separate figures for each compound.
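
       The substitution rule and the geometric statistics can be sketched in Python as follows.
The use of the sample (n-1) standard deviation is an assumption, and the function name
geometric_summary and the data are hypothetical.

```python
import math
from statistics import mean, stdev


def geometric_summary(results, detection_limit):
    """Geometric mean and geometric standard deviation after half-limit substitution."""
    # Results reported as zero or below the detection limit become one-half of the limit.
    substituted = [x if x >= detection_limit else detection_limit / 2.0 for x in results]
    logs = [math.log(x) for x in substituted]
    return math.exp(mean(logs)), math.exp(stdev(logs))


# Hypothetical measurements with a detection limit of 0.5; 0.0 stands for a non-detect.
gm, gsd = geometric_summary([2.0, 8.0, 0.0], detection_limit=0.5)
```

Here the non-detect is replaced by 0.25, so the geometric mean is the cube root of
2.0 x 8.0 x 0.25.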

Table B-1.     Numbers of Samples with Analytical Measurements Reported for Mix A and Mix B
              Compounds, by Laboratory and Spiking Level

Mix A Compounds
Laboratory | Unspiked | Low #1 | Low #2 | Medium | High | All Samples
Lab A (Set #1) | 6 | 5 | 8 | 8 | 8 (4 dup. results for 1 sample: DEDTP) | 35
Lab A (Set #2) | 6 (5 for DMDTP) | 5 | 8 (2 dup. results for 1 sample: DMP) | 8 (2 dup. results for 1 sample: DMP) | 8 | 35 (34 for DMDTP)
Lab A (Set #3) | 6 | 5 | 8 (2 dup. results for 1 sample: DMP) | 8 | 8 | 35
Lab B | 6 | 5 | 8 | 8 | 8 | 35
Lab C | 6 | 5 (2 dup. results for 3 samples) | 8 (2 dup. results for 1 sample) | 8 (2 dup. results for 2 samples) | 8 (2 dup. results for 2 samples) | 35
Lab D (a) | 6 | 5 | 8 | 8 (2 dup. results for 2 samples) | 8 (2 dup. results for 1 sample) | 35

Mix B Compounds
Laboratory | Unspiked | Low | Medium | High | All Samples
Lab A (Set #1) | 6 | 13 | 8 | 8 | 35
Lab A (Set #2) | 6 | 13 | 8 | 8 | 35
Lab A (Set #3) | 6 | 13 | 8 | 8 | 35
Lab B | 6 | 13 | 8 | 8 | 35
Lab C | 6 (2 dup. results for 2 samples) | 13 (2 dup. results for 1 sample) | 8 (2 dup. results for 3 samples) | 8 (2 dup. results for 2 samples) | 35
Lab D | 6 (2 dup. results for 2 samples) | 13 | 8 (2 dup. results for 1 sample) | 8 | 35

a. No measurements were reported for DEDTP.

Table B-2.     Number of Not-Detected Analytical Measurements for Each Compound, Calculated by
               Laboratory and Spike Level, with Number of Analytical Measurements and the Not-
               Detected Percentage Given in Parentheses

Each cell gives: # Not-Detected Measurements (Total # Measurements, % of Measurements that are Not-Detected)

Compound = DMP
Laboratory | Unspiked | Low #1 | Low #2 | Medium | High | Overall
Lab A | 4 (18, 22.2) | 6 (15, 40.0) | 10 (26, 38.5) | 4 (25, 16.0) | 6 (27, 22.2) | 30 (111, 27.0)
Lab B | 6 (6, 100) | 4 (5, 80.0) | 6 (8, 75.0) | 7 (8, 87.5) | 2 (8, 25.0) | 25 (35, 71.4)
Lab C | 3 (6, 50.0) | 4 (8, 50.0) | 2 (9, 22.2) | 3 (10, 30.0) | 0 (10, 0.0) | 12 (43, 27.9)
Lab D | 6 (6, 100) | 5 (5, 100) | 8 (8, 100) | 10 (10, 100) | 7 (9, 77.8) | 36 (38, 94.7)

Compound = DMDTP
Laboratory | Unspiked | Low #1 | Low #2 | Medium | High | Overall
Lab A | 11 (18, 61.1) | 7 (15, 46.7) | 10 (26, 38.5) | 1 (25, 4.0) | 3 (27, 11.1) | 32 (111, 28.8)
Lab B | 5 (6, 83.3) | 5 (5, 100) | 5 (8, 62.5) | 1 (8, 12.5) | 0 (8, 0.0) | 16 (35, 45.7)
Lab C | 5 (6, 83.3) | 7 (8, 87.5) | 0 (9, 0.0) | 0 (10, 0.0) | 0 (10, 0.0) | 12 (43, 27.9)
Lab D | 6 (6, 100) | 5 (5, 100) | 8 (8, 100) | 0 (10, 0.0) | 0 (9, 0.0) | 19 (38, 50.0)

Compound = DEP
Laboratory | Unspiked | Low #1 | Low #2 | Medium | High | Overall
Lab A | 7 (18, 38.9) | 6 (15, 40.0) | 10 (26, 38.5) | 1 (25, 4.0) | 3 (27, 11.1) | 27 (111, 24.3)
Lab B | 5 (6, 83.3) | 3 (5, 60.0) | 3 (8, 37.5) | 0 (8, 0.0) | 0 (8, 0.0) | 11 (35, 31.4)
Lab C | 2 (6, 33.3) | 0 (8, 0.0) | 0 (9, 0.0) | 0 (10, 0.0) | 0 (10, 0.0) | 2 (43, 4.7)
Lab D | 6 (6, 100) | 5 (5, 100) | 7 (8, 87.5) | 0 (10, 0.0) | 0 (9, 0.0) | 18 (38, 47.4)

Compound = DEDTP
Laboratory | Unspiked | Low #1 | Low #2 | Medium | High | Overall
Lab A | 12 (18, 66.7) | 7 (15, 46.7) | 3 (26, 11.5) | 1 (25, 4.0) | 0 (27, 0.0) | 23 (111, 20.7)
Lab B | 4 (6, 66.7) | 4 (5, 80.0) | 3 (8, 37.5) | 0 (8, 0.0) | 0 (8, 0.0) | 11 (35, 31.4)
Lab C | 5 (6, 83.3) | 0 (8, 0.0) | 0 (9, 0.0) | 0 (10, 0.0) | 0 (10, 0.0) | 5 (43, 11.6)

Compound = DMTP
Laboratory | Unspiked | Low | Medium | High | Overall
Lab A | 0 (18, 0.0) | 4 (42, 9.5) | 2 (26, 7.7) | 1 (25, 4.0) | 7 (111, 6.3)
Lab B | 3 (6, 50.0) | 6 (13, 46.2) | 0 (8, 0.0) | 0 (8, 0.0) | 9 (35, 25.7)
Lab C | 4 (8, 50.0) | 0 (14, 0.0) | 0 (11, 0.0) | 0 (10, 0.0) | 4 (43, 9.3)
Lab D | 5 (8, 62.5) | 5 (13, 38.5) | 0 (9, 0.0) | 0 (8, 0.0) | 10 (38, 26.3)

Compound = DETP
Laboratory | Unspiked | Low | Medium | High | Overall
Lab A | 4 (18, 22.2) | 3 (42, 7.1) | 2 (26, 7.7) | 1 (25, 4.0) | 10 (111, 9.0)
Lab B | 3 (6, 50.0) | 6 (13, 46.2) | 1 (8, 12.5) | 0 (8, 0.0) | 10 (35, 28.6)
Lab C | 2 (8, 25.0) | 0 (14, 0.0) | 0 (11, 0.0) | 0 (10, 0.0) | 2 (43, 4.7)
Lab D | 6 (8, 75.0) | 2 (13, 15.4) | 0 (9, 0.0) | 0 (8, 0.0) | 8 (38, 21.1)

Note: "Not-detected measurements" are any measurements that fall below a laboratory's reported detection limit for the given compound.

Table B-3.      Accuracy Estimates (%) for Each Compound, Calculated by Laboratory and Spike
                Level, with Number of Analytical Measurements Falling Above the Detection Limit
                Given in Parentheses

Each cell gives: accuracy in percent (# Measurements > Detection Limit)

Compound = DMP
Laboratory | Low #1 | Low #2 | Medium | High
Lab A | 167.4 (9) | 127.6 (14) | 11.7 (20) | 3.5 (21)
Lab B | 1190 (1) | 492.5 (2) | 17.4 (1) | 4.9 (6)
Lab C | 423.7 (3) | 185.5 (6) | 7.8 (6) | 3.6 (8)
Lab D | -- | -- | -- | 2.6 (2)

Compound = DMDTP
Laboratory | Low #1 | Low #2 | Medium | High
Lab A | 112.4 (8) | 74.4 (16) | 70.8 (24) | 74.9 (24)
Lab B | -- | 4635 (3) | 1389 (7) | 1292 (8)
Lab C | 76.9 (1) | 73.0 (8) | 85.3 (8) | 81.5 (8)
Lab D | -- | -- | 69.0 (8) | 52.7 (8)

Compound = DEP
Laboratory | Low #1 | Low #2 | Medium | High
Lab A | 177.9 (9) | 121.2 (16) | 64.8 (24) | 60.4 (24)
Lab B | 535.7 (2) | 742.9 (5) | 95.7 (8) | 110.8 (8)
Lab C | 151.1 (5) | 109.1 (8) | 66.3 (8) | 64.9 (8)
Lab D | -- | 235.7 (1) | 45.3 (8) | 31.4 (8)

Compound = DEDTP
Laboratory | Low #1 | Low #2 | Medium | High
Lab A | 120.6 (8) | 84.9 (23) | 81.3 (24) | 85.3 (24)
Lab B | 10141 (1) | 1518 (5) | 414.1 (8) | 430.1 (8)
Lab C | 88.6 (5) | 164.8 (8) | 72.1 (8) | 79.3 (8)

Compound = DMTP
Laboratory | Low | Medium | High
Lab A | 78.0 (38) | 87.3 (24) | 86.4 (24)
Lab B | 650.0 (7) | 233.1 (8) | 128.0 (8)
Lab C | 84.6 (13) | 85.7 (8) | 86.7 (8)
Lab D | 235.6 (8) | 135.0 (8) | 85.4 (8)

Compound = DETP
Laboratory | Low | Medium | High
Lab A | 180.0 (39) | 88.1 (24) | 79.2 (24)
Lab B | 355.5 (7) | 190.6 (7) | 110.7 (8)
Lab C | 103.0 (13) | 82.3 (8) | 82.0 (8)
Lab D | 156.6 (11) | 116.1 (8) | 74.1 (8)

Note: Accuracy is estimated by (mean/T) x 100%, where "mean" is the arithmetic mean of the analytical measurements falling above the detection limit,
calculated across all samples spiked at the specified level, and T is the actual spiking level.

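Table B-4a.    Descriptive Statistics of Reported Analytical Measurements for DMP (µg/L), Calculated
              by Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 0.9 | 2.6 | 0.3 | 0.3 | 1.4 | 1.7 | 1.7
Lab A | 1 | Low #1 | 5 | 0.6 | 2.3 | 0.3 | 0.3 | 0.5 | 1 | 0
Lab A | 1 | Low #2 | 8 | 1.2 | 2.8 | 0.3 | 0.5 | 1.6 | 1.8 | 7.8
Lab A | 1 | Medium | 8 | 2.5 | 3.8 | 0.3 | 1.2 | 2.7 | 8.1 | 15.8
Lab A | 1 | High | 8 | 3.4 | 3.5 | 0.3 | 1.5 | 4.3 | 10.1 | 14.7
Lab A | 2 | Unspiked | 3 | 1.3 | 1.3 | 1 | 1 | 1.3 | 1.5 | 1.5
Lab A | 2 | Low #1 | 5 | 1.5 | 1.3 | 1.1 | 1.2 | 1.5 | 1.7 | 2.3
Lab A | 2 | Low #2 | 9 | 1.3 | 3.3 | 0.3 | 0.3 | 1.7 | 3.2 | 5.7
Lab A | 2 | Medium | 9 | 2.7 | 3 | 0.3 | 1.8 | 2.6 | 3.4 | 13.8
Lab A | 2 | High | 8 | 3.5 | 3.5 | 0.3 | 0 | 4.6 | 9.7 | 13.5
Lab A | 3 | Unspiked | 3 | 0.3 | 1 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3
Lab A | 3 | Low #1 | 5 | 0.7 | 2.7 | 0.3 | 0.3 | 0.5 | 1.6 | 2.5
Lab A | 3 | Low #2 | 9 | 0.8 | 2.8 | 0.3 | 0.3 | 0.9 | 2.1 | 3
Lab A | 3 | Medium | 8 | 2.4 | 4.4 | 0.3 | 1 | 3.1 | 8.7 | 15.1
Lab A | 3 | High | 8 | 3.9 | 3.8 | 0.3 | 1.9 | 5.3 | 11.4 | 17.5
Lab A | Overall | Unspiked | 9 | 0.7 | 2.2 | 0.3 | 0.3 | 1 | 1.4 | 1.7
Lab A | Overall | Low #1 | 15 | 0.9 | 2.2 | 0.3 | 0.3 | 1.1 | 1.7 | 2.5
Lab A | Overall | Low #2 | 26 | 1.1 | 2.9 | 0.3 | 0.3 | 1.6 | 2.1 | 7.8
Lab A | Overall | Medium | 25 | 2.6 | 3.5 | 0.3 | 1.7 | 2.8 | 3.8 | 15.8
Lab A | Overall | High | 24 | 3.6 | 3.4 | 0.3 | 1.5 | 4.5 | 9.7 | 17.5
Lab B | 1 | Unspiked | 3 | 3.1 | 1.4 | 2.5 | 2.5 | 2.5 | 4.6 | 4.6
Lab B | 1 | Low #1 | 5 | 3.4 | 0 | 2.5 | 2.5 | 2.5 | 2.5 | 11.9
Lab B | 1 | Low #2 | 8 | 3.7 | 1.8 | 2.5 | 2.5 | 2.8 | 4.9 | 13.3
Lab B | 1 | Medium | 8 | 3.2 | 1.6 | 2.5 | 2.5 | 2.5 | 3.7 | 8.7
Lab B | 1 | High | 8 | 7.4 | 1.7 | 2.5 | 5.4 | 9.3 | 11.2 | 11.8
Lab C | 1 | Unspiked | 3 | 4.3 | 18.1 | 0.8 | 0.8 | 0.8 | 120.6 | 120.6
Lab C | 1 | Low #1 | 8 | 1.8 | 2.2 | 0.8 | 0.9 | 1.9 | 3.6 | 5.7
Lab C | 1 | Low #2 | 9 | 2.6 | 2.1 | 0.8 | 2.1 | 3.3 | 4.5 | 5.9
Lab C | 1 | Medium | 10 | 2.1 | 2.2 | 0.8 | 1 | 2.2 | 3.6 | 7.3
Lab C | 1 | High | 10 | 6.8 | 1.4 | 3.2 | 6 | 7 | 8.3 | 10.3
Lab D | 1 | Unspiked | 3 | 1.3 | 1 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #1 | 5 | 1.3 | 1 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #2 | 8 | 1.3 | 1 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Medium | 10 | 1.3 | 1 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | High | 9 | 1.7 | 1.9 | 1.3 | 1.3 | 1.3 | 1.3 | 5.9
All Labs | Overall | Unspiked | 18 | 1.3 | 4 | 0.3 | 0.8 | 1.3 | 1.7 | 120.6
All Labs | Overall | Low #1 | 33 | 1.4 | 2.4 | 0.3 | 1 | 1.3 | 2.5 | 11.9
All Labs | Overall | Low #2 | 51 | 1.6 | 2.6 | 0.3 | 0.9 | 1.7 | 3.1 | 13.3
All Labs | Overall | Medium | 53 | 7 9 | 2.7 | 0.3 | 1.3 | 2.5 | 3.4 | 15.8
All Labs | Overall | High | 51 | 4 | 2.8 | 0.3 | 1.5 | 5.3 | 8.3 | 17.5

Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.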

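Table B-4b.    Descriptive Statistics of Reported Analytical Measurements for DMDTP
              Calculated by Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 0.1 | 1.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1
Lab A | 1 | Low #1 | 5 | 0.3 | 3.7 | 0.1 | 0.1 | 0.1 | 1.0 | 1.2
Lab A | 1 | Low #2 | 8 | 0.4 | 4.6 | 0.1 | 0.1 | 0.5 | 1.8 | 2.5
Lab A | 1 | Medium | 8 | 36.6 | 1.1 | 35.1 | 35.4 | 35.8 | 37.4 | 41.1
Lab A | 1 | High | 8 | 158.4 | 1.0 | 154.4 | 156.5 | 157.9 | 161.0 | 162.4
Lab A | 2 | Unspiked | 3 | 0.2 | 3.4 | 0.1 | 0.1 | 0.1 | 0.8 | 0.8
Lab A | 2 | Low #1 | 5 | 0.4 | 3.4 | 0.1 | 0.1 | 0.8 | 0.8 | 1.2
Lab A | 2 | Low #2 | 8 | 1.0 | 2.5 | 0.1 | 1.1 | 1.4 | 1.5 | 1.5
Lab A | 2 | Medium | 8 | 38.0 | 1.1 | 33.5 | 36.7 | 38.7 | 39.9 | 40.2
Lab A | 2 | High | 8 | 156.8 | 1.1 | 139.2 | 154.6 | 158.1 | 161.8 | 167.8
Lab A | 3 | Unspiked | 3 | 0.1 | 1.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1
Lab A | 3 | Low #1 | 5 | 0.5 | 4.3 | 0.1 | 0.1 | 0.8 | 1.4 | 2.2
Lab A | 3 | Low #2 | 8 | 0.6 | 4.3 | 0.1 | 0.1 | 1.5 | 1.7 | 2.1
Lab A | 3 | Medium | 8 | 35.7 | 1.1 | 32.7 | 33.8 | 36.4 | 37.3 | 38.5
Lab A | 3 | High | 8 | 151.7 | 1.0 | 144.0 | 150.9 | 152.0 | 153.8 | 156.1
Lab A | Overall | Unspiked | 9 | 0.1 | 2.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.8
Lab A | Overall | Low #1 | 15 | 0.4 | 3.5 | 0.1 | 0.1 | 0.8 | 1.2 | 2.2
Lab A | Overall | Low #2 | 24 | 0.6 | 3.7 | 0.1 | 0.1 | 1.4 | 1.5 | 2.5
Lab A | Overall | Medium | 24 | 36.8 | 1.1 | 32.7 | 35.3 | 36.7 | 38.4 | 41.1
Lab A | Overall | High | 24 | 155.6 | 1.0 | 139.2 | 152.7 | 156.1 | 159.8 | 167.8
Lab B | 1 | Unspiked | 3 | 5.0 | 1.0 | 5.0 | 5.0 | 5.0 | 5.0 | 5
Lab B | 1 | Low #1 | 5 | 5.0 | 1.0 | 5.0 | 5.0 | 5.0 | 5.0 | 5
Lab B | 1 | Low #2 | 8 | 13.8 | 4.4 | 5.0 | 5.0 | 5.0 | 46.5 | 196.2
Lab B | 1 | Medium | 8 | 327.1 | 6.2 | 5.0 | 231.7 | 661.6 | 1052.4 | 1163.5
Lab B | 1 | High | 8 | 2249.1 | 1.9 | 1062.1 | 1493.5 | 1667.8 | 4183.0 | 5751.3
Lab C | 1 | Unspiked | 3 | 0.4 | 1.0 | 0.4 | 0.4 | 0.4 | 0.4 | 0.4
Lab C | 1 | Low #1 | 8 | 0.5 | 1.3 | 0.4 | 0.4 | 0.5 | 0.6 | 0.8
Lab C | 1 | Low #2 | 9 | 1.5 | 1.3 | 1.0 | 1.3 | 1.6 | 1.8 | 2.1
Lab C | 1 | Medium | 10 | 43.7 | 1.3 | 20.2 | 42.0 | 45.0 | 52.1 | 60.7
Lab C | 1 | High | 10 | 171.6 | 1.2 | 109.6 | 175.0 | 183.3 | 188.4 | 203
Lab D | 1 | Unspiked | 3 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #1 | 5 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #2 | 8 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Medium | 10 | 34.8 | 1.1 | 30.3 | 32.3 | 33.8 | 38.5 | 42.8
Lab D | 1 | High | 9 | 105.8 | 1.5 | 38.9 | 97.4 | 125.0 | 128.9 | 137.1
All Labs | Overall | Unspiked | 18 | 0.4 | 4.5 | 0.1 | 0.1 | 0.4 | 1.3 | 5
All Labs | Overall | Low #1 | 33 | 0.7 | 3.6 | 0.1 | 0.4 | 0.8 | 1.3 | 5
All Labs | Overall | Low #2 | 49 | 1.3 | 4.7 | 0.1 | 1.2 | 1.5 | 1.9 | 196.2
All Labs | Overall | Medium | 52 | 52.6 | 2.9 | 5.0 | 35.1 | 38.2 | 44.1 | 1163.5
All Labs | Overall | High | 51 | 225.3 | 2.9 | 38.9 | 144.0 | 157.4 | 185.6 | 5751.3

Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.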

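Table B-4c.    Descriptive Statistics of Reported Analytical Measurements for DEP (µg/L), Calculated by
              Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2
Lab A | 1 | Low #1 | 5 | 1.2 | 3.5 | 0.2 | 1.5 | 1.6 | 1.8 | 4.5
Lab A | 1 | Low #2 | 8 | 1.3 | 4.2 | 0.2 | 0.6 | 2.3 | 3.9 | 5.2
Lab A | 1 | Medium | 8 | 46.8 | 1.2 | 40.3 | 40.8 | 44.3 | 52.3 | 65.3
Lab A | 1 | High | 8 | 168.1 | 1.1 | 155.4 | 159.7 | 167.6 | 175.9 | 185.1
Lab A | 2 | Unspiked | 3 | 0.2 | 1.0 | 0.2 | 0.2 | 0.2 | 0.2 | 0.2
Lab A | 2 | Low #1 | 5 | 0.5 | 4.6 | 0.2 | 0.2 | 0.2 | 2.2 | 2.7
Lab A | 2 | Low #2 | 8 | 0.9 | 4.8 | 0.2 | 0.2 | 1.9 | 3.5 | 4.7
Lab A | 2 | Medium | 8 | 44.2 | 1.2 | 33.0 | 38.5 | 43.1 | 50.1 | 64.7
Lab A | 2 | High | 8 | 168.7 | 1.1 | 155.5 | 157.8 | 169.3 | 177.4 | 188
Lab A | 3 | Unspiked | 3 | 0.6 | 3.5 | 0.2 | 0.2 | 0.7 | 1.8 | 1.8
Lab A | 3 | Low #1 | 5 | 0.6 | 4.9 | 0.2 | 0.2 | 0.6 | 1.5 | 6.2
Lab A | 3 | Low #2 | 8 | 1.0 | 5.4 | 0.2 | 0.2 | 1.9 | 4.4 | 7.5
Lab A | 3 | Medium | 8 | 42.6 | 1.2 | 35.8 | 36.7 | 39.5 | 49.0 | 61.9
Lab A | 3 | High | 8 | 169.3 | 1.1 | 150.5 | 159.5 | 174.5 | 177.4 | 184.2
Lab A | Overall | Unspiked | 9 | 0.2 | 2.5 | 0.2 | 0.2 | 0.2 | 0.2 | 1.8
Lab A | Overall | Low #1 | 15 | 0.7 | 4.1 | 0.2 | 0.2 | 1.5 | 2.2 | 6.2
Lab A | Overall | Low #2 | 24 | 1.1 | 4.5 | 0.2 | 0.2 | 2.1 | 3.6 | 7.5
Lab A | Overall | Medium | 24 | 44.5 | 1.2 | 33.0 | 38.5 | 41.4 | 50.1 | 65.3
Lab A | Overall | High | 24 | 168.7 | 1.1 | 150.5 | 159.1 | 170.5 | 177.1 | 188
Lab B | 1 | Unspiked | 3 | 2.5 | 1.0 | 2.5 | 2.5 | 2.5 | 2.5 | 2.5
Lab B | 1 | Low #1 | 5 | 3.9 | 1.8 | 2.5 | 2.5 | 2.5 | 6.3 | 8.7
Lab B | 1 | Low #2 | 8 | 7.7 | 3.2 | 2.5 | 2.5 | 7.5 | 22.5 | 44.2
Lab B | 1 | Medium | 8 | 48.6 | 2.8 | 5.4 | 37.7 | 59.6 | 99.4 | 137
Lab B | 1 | High | 8 | 275.9 | 1.7 | 113.7 | 218.6 | 254.1 | 422.4 | 577
Lab C | 1 | Unspiked | 3 | 2.1 | 12.3 | 0.5 | 0.5 | 0.5 | 38.7 | 38.7
Lab C | 1 | Low #1 | 8 | 1.8 | 1.6 | 1.0 | 1.2 | 1.6 | 2.8 | 3.5
Lab C | 1 | Low #2 | 9 | 2.9 | 1.8 | 1.3 | 1.7 | 2.9 | 4.7 | 5.5
Lab C | 1 | Medium | 10 | 45.8 | 1.2 | 25.0 | 47.2 | 49.0 | 49.1 | 53.4
Lab C | 1 | High | 10 | 184.3 | 1.2 | 124.5 | 191.6 | 196.5 | 204.0 | 206
Lab D | 1 | Unspiked | 3 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #1 | 5 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low #2 | 8 | 1.5 | 1.8 | 1.3 | 1.3 | 1.3 | 1.3 | 6.6
Lab D | 1 | Medium | 10 | 30.8 | 1.2 | 22.2 | 25.7 | 30.9 | 37.0 | 39.5
Lab D | 1 | High | 9 | 80.0 | 1.8 | 16.3 | 85.2 | 98.3 | 103.3 | 118.3
All Labs | Overall | Unspiked | 18 | 0.7 | 4.6 | 0.2 | 0.2 | 0.6 | 1.8 | 38.7
All Labs | Overall | Low #1 | 33 | 1.2 | 3.2 | 0.2 | 1.0 | 1.5 | 2.5 | 8.7
All Labs | Overall | Low #2 | 49 | 1.9 | 4.0 | 0.2 | 1.3 | 2.3 | 4.7 | 44.2
All Labs | Overall | Medium | 52 | 42.3 | 1.6 | 5.4 | 36.7 | 42.9 | 49.4 | 137
All Labs | Overall | High | 51 | 162.5 | 1.6 | 16.3 | 150.5 | 173.8 | 195.9 | 577

Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.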

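Table B-4d.    Descriptive Statistics of Reported Analytical Measurements for DEDTP (ng/L),
              Calculated by Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 0.0 | 3.4 | 0.0 | 0.0 | 0.1 | 0.1 | 0.1
Lab A | 1 | Low #1 | 5 | 0.8 | 3.3 | 0.1 | 0.7 | 1.2 | 1.5 | 2.1
Lab A | 1 | Low #2 | 8 | 1.3 | 1.9 | 0.3 | 1.2 | 1.5 | 1.9 | 2.5
Lab A | 1 | Medium | 8 | 54.0 | 1.1 | 48.1 | 52.0 | 53.5 | 57.2 | 59.5
Lab A | 1 | High | 11 | 162.5 | 2.5 | 22.7 | 224.8 | 239.5 | 250.3 | 283
Lab A | 2 | Unspiked | 3 | 0.1 | 1.0 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1
Lab A | 2 | Low #1 | 5 | 0.3 | 4.3 | 0.1 | 0.1 | 0.1 | 1.0 | 1.9
Lab A | 2 | Low #2 | 8 | 2.4 | 1.5 | 1.5 | 1.7 | 2.3 | 3.1 | 4.5
Lab A | 2 | Medium | 8 | 57.6 | 1.1 | 54.0 | 54.9 | 56.8 | 60.1 | 63.9
Lab A | 2 | High | 8 | 241.9 | 1.1 | 224.8 | 227.3 | 247.4 | 254.1 | 255.3
Lab A | 3 | Unspiked | 3 | 0.2 | 2.6 | 0.1 | 0.1 | 0.1 | 0.5 | 0.5
Lab A | 3 | Low #1 | 5 | 0.3 | 5.5 | 0.1 | 0.1 | 0.1 | 2.0 | 2.6
Lab A | 3 | Low #2 | 8 | 1.9 | 3.3 | 0.1 | 2.0 | 3.0 | 3.2 | 3.7
Lab A | 3 | Medium | 8 | 52.4 | 1.1 | 38.6 | 51.0 | 54.6 | 57.3 | 58.4
Lab A | 3 | High | 8 | 227.5 | 1.1 | 199.8 | 221.8 | 230.2 | 236.8 | 245.3
Lab A | Overall | Unspiked | 9 | 0.1 | 2.6 | 0.0 | 0.1 | 0.1 | 0.1 | 0.5
Lab A | Overall | Low #1 | 15 | 0.4 | 4.2 | 0.1 | 0.1 | 0.7 | 1.9 | 2.6
Lab A | Overall | Low #2 | 24 | 1.8 | 2.3 | 0.1 | 1.5 | 2.1 | 3.0 | 4.5
Lab A | Overall | Medium | 24 | 54.6 | 1.1 | 38.6 | 52.7 | 55.2 | 58.1 | 63.9
Lab A | Overall | High | 27 | 202.0 | 1.8 | 22.7 | 224.9 | 238.4 | 249.4 | 283
Lab B | 1 | Unspiked | 3 | 5.9 | 1.3 | 5.0 | 5.0 | 5.0 | 8.2 | 8.2
Lab B | 1 | Low #1 | 5 | 9.7 | 4.4 | 4.9 | 5.0 | 5.0 | 5.0 | 136.9
Lab B | 1 | Low #2 | 8 | 16.4 | 3.0 | 5.0 | 5.8 | 15.3 | 40.0 | 94.3
Lab B | 1 | Medium | 8 | 235.7 | 1.9 | 101.8 | 117.6 | 292.2 | 429.3 | 456.4
Lab B | 1 | High | 8 | 1034.6 | 1.7 | 525.3 | 748.9 | 833.6 | 1766.9 | 2066.3
Lab C | 1 | Unspiked | 3 | 0.4 | 1.6 | 0.3 | 0.3 | 0.3 | 0.7 | 0.7
Lab C | 1 | Low #1 | 8 | 1.0 | 1.4 | 0.8 | 0.8 | 0.9 | 1.1 | 2.1
Lab C | 1 | Low #2 | 9 | 2.4 | 2.4 | 1.1 | 1.5 | 2.2 | 2.3 | 22.9
Lab C | 1 | Medium | 10 | 47.5 | 1.3 | 29.4 | 39.8 | 51.8 | 55.0 | 63.8
Lab C | 1 | High | 10 | 201.9 | 1.4 | 118.8 | 152.0 | 240.9 | 246.0 | 281.8
All Labs | Overall | Unspiked | 15 | 0.3 | 6.2 | 0.0 | 0.1 | 0.1 | 0.7 | 8.2
All Labs | Overall | Low #1 | 28 | 1.0 | 5.3 | 0.1 | 0.4 | 1.0 | 2.1 | 136.9
All Labs | Overall | Low #2 | 41 | 3.0 | 3.4 | 0.1 | 1.6 | 2.3 | 3.7 | 94.3
All Labs | Overall | Medium | 42 | 69.8 | 2.0 | 29.4 | 52.5 | 55.7 | 62.0 | 456.4
All Labs | Overall | High | 45 | 270.0 | 2.3 | 22.7 | 225.4 | 242.3 | 263.7 | 2066.3

Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.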

Table B-4e.    Descriptive Statistics of Reported Analytical Measurements for DMTP (ng/L), Calculated
              by Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 3.0 | 1.0 | 2.9 | 2.9 | 2.9 | 3.0 | 3
Lab A | 1 | Low | 13 | 2.2 | 2.2 | 0.4 | 1.8 | 2.0 | 3.1 | 9.2
Lab A | 1 | Medium | 8 | 20.7 | 1.2 | 17.4 | 18.0 | 20.5 | 23.3 | 26.2
Lab A | 1 | High | 8 | 167.9 | 1.1 | 155.1 | 161.9 | 165.9 | 173.2 | 188.1
Lab A | 2 | Unspiked | 3 | 1.8 | 1.8 | 1.0 | 1.0 | 1.9 | 3.1 | 3.1
Lab A | 2 | Low | 13 | 1.7 | 1.8 | 0.8 | 1.1 | 1.3 | 2.3 | 4.3
Lab A | 2 | Medium | 8 | 20.0 | 1.2 | 17.1 | 17.2 | 20.1 | 22.5 | 24.8
Lab A | 2 | High | 8 | 164.8 | 1.0 | 152.8 | 159.0 | 167.4 | 170.3 | 173.2
Lab A | 3 | Unspiked | 3 | 1.6 | 1.4 | 1.1 | 1.1 | 2.0 | 2.0 | 2
Lab A | 3 | Low | 13 | 2.6 | 2.4 | 1.0 | 1.1 | 2.2 | 3.8 | 10.3
Lab A | 3 | Medium | 8 | 21.2 | 1.2 | 17.5 | 18.2 | 20.9 | 24.4 | 27.4
Lab A | 3 | High | 8 | 161.8 | 1.0 | 146.3 | 160.7 | 161.9 | 165.8 | 172.2
Lab A | Overall | Unspiked | 9 | 2.0 | 1.6 | 1.0 | 1.9 | 2.0 | 2.9 | 3.1
Lab A | Overall | Low | 39 | 2.1 | 2.2 | 0.4 | 1.1 | 1.9 | 3.8 | 10.3
Lab A | Overall | Medium | 24 | 20.6 | 1.2 | 17.1 | 17.7 | 20.4 | 23.3 | 27.4
Lab A | Overall | High | 24 | 164.8 | 1.1 | 146.3 | 160.9 | 164.8 | 170.0 | 188.1
Lab B | 1 | Unspiked | 3 | 6.0 | 4.5 | 2.5 | 2.5 | 2.5 | 34.0 | 34
Lab B | 1 | Low | 13 | 7.6 | 3.3 | 2.5 | 2.5 | 8.1 | 19.6 | 57
Lab B | 1 | Medium | 8 | 38.3 | 2.7 | 6.2 | 28.1 | 36.8 | 79.0 | 151.8
Lab B | 1 | High | 8 | 214.6 | 1.8 | 81.5 | 142.5 | 274.0 | 334.4 | 373.2
Lab C | 1 | Unspiked | 3 | 0.5 | 1.0 | 0.5 | 0.5 | 0.5 | 0.5 | 0.5
Lab C | 1 | Low | 14 | 3.1 | 1.3 | 2.3 | 2.6 | 3.1 | 3.6 | 5
Lab C | 1 | Medium | 11 | 20.3 | 1.2 | 14.7 | 16.2 | 21.1 | 23.7 | 26.3
Lab C | 1 | High | 10 | 164.9 | 1.2 | 113.7 | 162.0 | 173.5 | 180.9 | 182.7
Lab D | 1 | Unspiked | 3 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low | 13 | 4.2 | 2.7 | 1.3 | 1.3 | 7.0 | 9.7 | 12
Lab D | 1 | Medium | 9 | 31.8 | 1.2 | 19.0 | 32.0 | 32.5 | 35.6 | 38.3
Lab D | 1 | High | 8 | 157.6 | 1.4 | 75.5 | 156.3 | 176.2 | 181.3 | 202.5
All Labs | Overall | Unspiked | 18 | 1.8 | 2.6 | 0.5 | 1.1 | 1.9 | 2.9 | 34
All Labs | Overall | Low | 79 | 3.1 | 2.6 | 0.4 | 1.4 | 2.5 | 6.2 | 57
All Labs | Overall | Medium | 52 | 24.4 | 1.6 | 6.2 | 18.8 | 22.0 | 31.6 | 151.8
All Labs | Overall | High | 50 | 170.7 | 1.3 | 75.5 | 161.3 | 167.6 | 178.8 | 373.2

Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.

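Table B-4f.    Descriptive Statistics of Reported Analytical Measurements for DETP (ng/L), Calculated by
              Spiking Level for Each Laboratory and Across All Laboratories

Lab | Set | Spiking Level | # Measurements | Geom. Mean | Geom. Standard Deviation | Minimum | 25th Percentile | Median | 75th Percentile | Maximum
Lab A | 1 | Unspiked | 3 | 0.3 | 2.0 | 0.2 | 0.2 | 0.3 | 0.6 | 0.6
Lab A | 1 | Low | 13 | 6.4 | 2.0 | 3.3 | 3.7 | 4.0 | 7.9 | 19.7
Lab A | 1 | Medium | 8 | 24.4 | 1.2 | 21.3 | 22.0 | 22.8 | 27.2 | 31.9
Lab A | 1 | High | 8 | 176.2 | 1.0 | 167.1 | 173.5 | 175.2 | 179.8 | 186.1
Lab A | 2 | Unspiked | 3 | 0.3 | 1.7 | 0.2 | 0.2 | 0.4 | 0.4 | 0.4
Lab A | 2 | Low | 13 | 5.9 | 1.5 | 4.1 | 4.5 | 4.6 | 6.8 | 10.8
Lab A | 2 | Medium | 8 | 24.5 | 1.2 | 20.9 | 21.1 | 23.5 | 28.4 | 31.9
Lab A | 2 | High | 8 | 181.9 | 1.1 | 165.9 | 176.7 | 183.6 | 187.7 | 195
Lab A | 3 | Unspiked | 3 | 0.5 | 3.1 | 0.2 | 0.2 | 0.9 | 1.1 | 1.1
Lab A | 3 | Low | 13 | 7.1 | 2.2 | 3.5 | 3.8 | 4.1 | 10.4 | 26.3
Lab A | 3 | Medium | 8 | 23.6 | 1.2 | 20.0 | 20.3 | 22.0 | 27.6 | 31.7
Lab A | 3 | High | 8 | 169.2 | 1.0 | 157.9 | 163.6 | 170.8 | 175.7 | 176.5
Lab A | Overall | Unspiked | 9 | 0.4 | 7 9 | 0.2 | 0.2 | 0.4 | 0.6 | 1.1
Lab A | Overall | Low | 39 | 6.4 | 1.9 | 3.3 | 3.9 | 4.5 | 10.3 | 26.3
Lab A | Overall | Medium | 24 | 24.2 | 1.2 | 20.0 | 21.3 | 22.6 | 28.0 | 31.9
Lab A | Overall | High | 24 | 175.7 | 1.1 | 157.9 | 170.5 | 175.7 | 181.9 | 195
Lab B | 1 | Unspiked | 3 | 6.2 | 4.9 | 2.5 | 2.5 | 2.5 | 38.6 | 38.6
Lab B | 1 | Low | 13 | 6.0 | 2.9 | 0.8 | 2.5 | 9.2 | 12.7 | 31.5
Lab B | 1 | Medium | 8 | 33.7 | 2.6 | 4.9 | 25.7 | 35.5 | 63.9 | 121
Lab B | 1 | High | 8 | 226.6 | 1.6 | 100.2 | 168.6 | 273.6 | 312.7 | 355.8
Lab C | 1 | Unspiked | 3 | 0.6 | 1.5 | 0.5 | 0.5 | 0.5 | 1.0 | 1
Lab C | 1 | Low | 14 | 4.2 | 1.4 | 2.7 | 3.2 | 4.1 | 4.4 | 7.4
Lab C | 1 | Medium | 11 | 22.7 | 1.2 | 15.5 | 21.0 | 23.8 | 25.0 | 30
Lab C | 1 | High | 10 | 182.0 | 1.1 | 137.2 | 182.4 | 186.0 | 191.1 | 195.9
Lab D | 1 | Unspiked | 3 | 1.3 | 1.0 | 1.3 | 1.3 | 1.3 | 1.3 | 1.3
Lab D | 1 | Low | 13 | 5.0 | 2.0 | 1.3 | 4.4 | 5.7 | 7.2 | 12
Lab D | 1 | Medium | 9 | 32.0 | 1.2 | 23.9 | 30.8 | 31.7 | 33.0 | 41.8
Lab D | 1 | High | 8 | 161.2 | 1.3 | 96.4 | 155.6 | 167.1 | 188.3 | 198
All Labs | Overall | Unspiked | 18 | 0.8 | 3.8 | 0.2 | 0.4 | 0.8 | 1.3 | 38.6
All Labs | Overall | Low | 79 | 5.7 | 2.0 | 0.8 | 3.7 | 4.5 | 9.2 | 31.5
All Labs | Overall | Medium | 52 | 26.3 | 1.5 | 4.9 | 22.0 | 24.3 | 31.7 | 121
All Labs | Overall | High | 50 | 181.8 | 1.3 | 96.4 | 168.2 | 179.3 | 190.0 | 355.8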

Figure B-1. Geometric Means of DMP Measurements (µg/L) at Each Mix A Spiking Level, Calculated
            for Each Laboratory
[Plot: logarithmic vertical axis (ticks at 0.1, 1.0, and 10.0) versus Mix A Concentration Level (Unspiked, Low 1, Low 2, Medium, High), with one series per laboratory (A, B, C, D).]
Note: In calculating these statistics, results reported as "below detection limits" were replaced by one-half of the laboratory's detection limit.

Figure B-2. Geometric Means of DMDTP Measurements (ng/L) at Each Mix A Spiking Level,
           Calculated for Each Laboratory
[Plot: logarithmic vertical axis (ticks from 0.1 to 10000.0) versus Mix A Concentration Level (Unspiked, Low 1, Low 2, Medium, High), with one series per laboratory (A, B, C, D).]

Figure B-3. Geometric Means of DEP Measurements (ng/L) at Each Mix A Spiking Level, Calculated
           for Each Laboratory
[Plot: logarithmic vertical axis (ticks from 0.1 to 1000.0) versus Mix A Concentration Level (Unspiked, Low 1, Low 2, Medium, High), with one series per laboratory (A, B, C, D).]

Figure B-4. Geometric Means of DEDTP Measurements (ng/L) at Each Mix A Spiking Level,
           Calculated for Each Laboratory
[Plot: logarithmic vertical axis (ticks up to 10000.0) versus Mix A Concentration Level (Unspiked, Low 1, Low 2, Medium, High), with one series per laboratory.]

Figure B-5. Geometric Means of DMTP Measurements (ng/L) at Each Mix B Spiking Level, Calculated
          for Each Laboratory
[Plot: logarithmic vertical axis (ticks from 0.1 to 1000.0) versus Mix B Concentration Level (Unspiked, Low, Medium, High), with one series per laboratory (A, B, C, D).]

Figure B-6. Geometric Means of DETP Measurements (ng/L) at Each Mix B Spiking Level, Calculated
          for Each Laboratory
[Plot: logarithmic vertical axis (ticks from 0.1 to 1000.0) versus Mix B Concentration Level (Unspiked, Low, Medium, High), with one series per laboratory (A, B, C, D).]

       Results of the statistical analyses, involving fitting ANOVA model (1) to the log-transformed
measurements, are summarized in Tables B-5 through B-7. Table B-5 contains the p-values for the
tests of fixed effects (i.e., lab effects, spiking level effects, and their interactions) included in the
model. The results of statistical tests to further investigate the presence of significant differences
among spiking levels for each laboratory, as well as overall across all laboratories, are presented in
Table B-6.  Table B-7 contains the results of statistical tests for differences among the laboratories at
each spiking level, as well as overall across all spiking levels.  Selected findings from these tables
are found in Tables 6 and 10 of the main report.

Table B-5.     Summary of Tests for Fixed Effects Included in the ANOVA Model, for Each Alkyl
               Phosphate Target Compound

Results of Statistical Test for Fixed Effect (p-values). DMP, DMDTP, DEP, and DEDTP are Mix A compounds; DMTP and DETP are Mix B compounds.

Fixed Effects | DMP | DMDTP | DEP | DEDTP | DMTP | DETP
Lab | 0.0527 | 0.0054 | <0.0001 | <0.0001 | 0.0214 | 0.0265
Mix A Spiking (a) | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001 | 0.0001
Mix B Spiking (b) | <0.0001 | 0.6056 | 0.0049 | 0.9921 | <0.0001 | <0.0001
(Mix A Spiking)*(Mix B Spiking) | 0.2492 | 0.577 | 0.0063 | 0.3527 | 0.0001 | 0.0001
Lab*(Mix A Spiking) (a) | 0.0028 | 0.0069 | 0.0005 | 0.0001 | <0.0001 | <0.0001
Lab*(Mix B Spiking) (b) | 0.0001 | 0.5122 | 0.1910 | 0.2334 | 0.0001 | 0.0001
Lab*(Mix A Spiking)*(Mix B Spiking) | 0.1410 | 0.4290 | 0.3511 | 0.0146 | <0.0001 | <0.0001

a.      For compounds in Mix A, these are the spiking level effects and the interactions of laboratory and spiking levels.
b.      For compounds in Mix B, these are the spiking level effects and the interactions of laboratory and spiking levels.

Table B-6.      Summary of Spiking Level Effects for Each Lab and Overall across all Labs for each
                   Alkyl Phosphate Target Compound (a, b, c)

Significant Spiking Level Effect?
(U = Unspiked, L1 = Low #1, L2 = Low #2, L = Low, M = Medium, H = High)

Mix A

DMP
   Lab A:     Yes (H vs. U, L1, L2; M vs. U, L1, L2)
   Lab B:     No
   Lab C:     Yes (H vs. L1, L2, M)
   Lab D:     No
   All Labs:  Yes (H vs. U, L1, L2, M; M vs. U, L1)

DMDTP
   Lab A:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U; L1 vs. U)
   Lab B:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2)
   Lab C:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2)
   Lab D:     Yes (H vs. U, L1, L2, M vs. U, L1, L2)
   All Labs:  Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. L1, U)

DEP
   Lab A:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U; L1 vs. U)
   Lab B:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2)
   Lab C:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2)
   Lab D:     Yes (H vs. U, L1, L2; M vs. U, L1, L2)
   All Labs:  Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U)

DEDTP
   Lab A:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U, L1; L1 vs. U)
   Lab B:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2)
   Lab C:     Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U)
   Lab D:     --
   All Labs:  Yes (H vs. U, L1, L2, M; M vs. U, L1, L2; L2 vs. U, L1; L1 vs. U)

Mix B

DMTP
   Lab A:     Yes (H vs. U, L, M; M vs. U, L)
   Lab B:     Yes (H vs. U, L, M; M vs. U, L)
   Lab C:     Yes (H vs. U, L, M; M vs. U, L; L vs. U)
   Lab D:     Yes (H vs. U, L, M; M vs. U, L; L vs. U)
   All Labs:  Yes (H vs. U, L, M; M vs. U, L; L vs. U)

DETP
   Lab A:     Yes (H vs. U, L, M; M vs. U, L; L vs. U)
   Lab B:     Yes (H vs. U, L, M; M vs. U, L)
   Lab C:     Yes (H vs. U, L, M; M vs. U, L; L vs. U)
   Lab D:     Yes (H vs. U, L, M; M vs. U, L; L vs. U)
   All Labs:  Yes (H vs. U, L, M; M vs. U, L; L vs. U)

a.   F tests were used to test for significant spiking level effects across all labs at the 0.05 level. When significant differences among the
     spiking levels were present, pairwise comparisons were made between each pair of spiking levels, with each pairwise comparison performed
     using the Bonferroni adjustment method to ensure that the overall error rate across the pairwise comparisons was no greater than 0.05.
     Pairs of spiking levels differing significantly at the Bonferroni-adjusted 0.05 level are identified in parentheses.
b.   F tests were used to test for significant spiking level effects for each lab, where the Benjamini and Hochberg multiple comparison adjustment
     method was used to control the overall error rate across all of these tests to be no higher than 0.05. When significant differences among spiking
     levels were present for a given lab, pairwise comparisons were made between each pair of spiking levels for the given lab, with each pairwise
     comparison performed using the Bonferroni adjustment method to ensure that the overall error rate across the pairwise comparisons was no greater
     than 0.05. Pairs of spiking levels differing significantly at the Bonferroni-adjusted 0.05 level are identified in parentheses.
c.   Mix A compounds and Mix B compounds were spiked at five and four spiking levels, respectively.

-------
Table B-7.      Summary of Lab Effects at each Spiking Level and Overall across all Spiking Levels for
                   each Alkyl Phosphate Target Compound a,b

Mix A
Compound   Spiking Level c   Significant Lab Effect?
DMP        Unspiked          Yes (Lab A vs Lab B, Lab C)
DMP        Low #1            Yes (Lab A vs Lab B)
DMP        Low #2            Yes (Lab A vs Lab B)
DMP        Medium            No
DMP        High              Yes (Lab D vs Lab B, Lab C)
DMP        Overall           No
DMDTP      Unspiked          Yes (Lab B vs Lab A, Lab C; Lab A vs Lab D)
DMDTP      Low #1            Yes (Lab B vs Lab A, Lab C; Lab A vs Lab D)
DMDTP      Low #2            Yes (Lab B vs Lab A, Lab C, Lab D)
DMDTP      Medium            Yes (Lab B vs Lab A, Lab C, Lab D)
DMDTP      High              Yes (Lab B vs Lab A, Lab C, Lab D)
DMDTP      Overall           Yes (Lab A vs Lab B)
DEP        Unspiked          Yes (Lab A vs Lab B, Lab C, Lab D)
DEP        Low #1            Yes (Lab A vs Lab B)
DEP        Low #2            Yes (Lab B vs Lab A, Lab D; Lab A vs Lab C)
DEP        Medium            No
DEP        High              Yes (Lab B vs Lab D)
DEP        Overall           Yes (Lab B vs Lab A, Lab C, Lab D; Lab C vs Lab A)
DEDTP      Unspiked          Yes (Lab B vs Lab A, Lab C; Lab A vs Lab C)
DEDTP      Low #1            Yes (Lab B vs Lab A, Lab C; Lab A vs Lab C)
DEDTP      Low #2            Yes (Lab B vs Lab A, Lab C)
DEDTP      Medium            Yes (Lab B vs Lab A, Lab C)
DEDTP      High              Yes (Lab B vs Lab A, Lab C)
DEDTP      Overall           Yes (Lab B vs Lab A, Lab C)

Mix B
Compound   Spiking Level c   Significant Lab Effect?
DMTP       Unspiked          Yes (Lab B vs Lab A, Lab C, Lab D; Lab A vs Lab C)
DMTP       Low               Yes (Lab A vs Lab B, Lab D; Lab B vs Lab C)
DMTP       Medium            Yes (Lab A vs Lab B)
DMTP       High              No
DMTP       Overall           Yes
DETP       Unspiked          Yes (Lab A vs Lab B, Lab C, Lab D; Lab B vs Lab C, Lab D; Lab C vs Lab D)
DETP       Low               Yes (Lab A vs Lab C)
DETP       Medium            Yes
DETP       High              No
DETP       Overall           Yes

     a  F tests were used to test for significant lab effects across all spiking levels at the 0.05 level. When significant differences among the labs were
        present, pairwise comparisons were made between each pair of labs, with each pairwise comparison performed using the Bonferroni adjustment
        method to ensure that the overall error rate across the pairwise comparisons was no greater than 0.05. Pairs of labs differing significantly at the
        Bonferroni-adjusted 0.05 level are identified in parentheses.
     b  F tests were used to test for significant lab effects at each spiking level of the compound, where the Benjamini and Hochberg multiple comparison
        adjustment method was used to control the overall error rate across all of these tests to be no higher than 0.05. When significant differences
        among labs were present at a given spiking level, pairwise comparisons were made between each pair of labs at the given spiking level, with each
        pairwise comparison performed using the Bonferroni adjustment method to ensure that the overall error rate across the pairwise comparisons was no
        greater than 0.05. Pairs of labs differing significantly at the Bonferroni-adjusted 0.05 level are identified in parentheses.
     c  Mix A compounds and Mix B compounds were spiked at five and four spiking levels, respectively.

-------
       Based upon adjusted significance levels, the spiking level effect was found to be significant
for each laboratory and each compound, except for DMP at Lab D. Thus, pairwise comparisons
were made in each of these instances to identify those pairs of spiking levels that differed
significantly from each other for a given laboratory and compound. The findings of these tests,
presented within Table B-6, were as follows:

       a.     For DMP, the High and Medium levels were significantly different from (and higher
             than) the Unspiked, Low #1, and Low #2 levels, for Lab A.   For Lab C, the High
             spiking level was significantly different from (and higher than) the Low #1, Low #2,
             and Medium levels. In particular, the Unspiked level was not significantly different
             from any other spiking level among Labs B, C, and D, and was significantly different
             only from the High level for Lab A.

       b.     For DMDTP, the High and Medium spiking levels were significantly different from
             (and higher than) the Unspiked, Low #1, and Low #2 levels for each laboratory. Also,
             for Labs A, B, and C, the High spiking level was significantly different from (and
             higher than) the Medium level, and for Lab A, the Unspiked level was significantly
             different from (and lower than) the Low #1 and Low #2 levels.

       c.     For DEP, the High and Medium spiking levels were significantly different from (and
             higher than) the Unspiked, Low #1, and Low #2 levels for each laboratory. Also, for
             Labs A, B, and C, the High spiking level was significantly different from (and higher
             than) the Medium level, and for Lab A, the Unspiked level  was significantly
             different from (and lower than) the Low #1 and Low #2 levels.

       d.     For DEDTP (analyzed by only Labs A, B, and C), for each laboratory, the High and
             Medium spiking levels were significantly different from (and higher than) the
             Unspiked, Low #1, and Low #2 levels. The High spiking level was significantly different
             from (and higher than) the Medium level for Labs A, B, and C. Also for Lab A, the
             Unspiked level was significantly different from (and lower than) the Low #1 and
             Low #2 levels, and for Lab C, the Unspiked level was significantly different from
             (and lower than) the Low #2 level.

       e.     For DMTP, each of the spiking levels was significantly different from each of the
             other levels for all laboratories except Lab A and Lab B, for which the Low and
             Unspiked levels were not significantly different.

       f.     For DETP, each of the spiking levels was significantly different from each of the
             other levels for all laboratories except Lab B, for which the Low and Unspiked
             levels were not significantly different.

       Also based upon adjusted significance levels, the laboratory effect was found to be
significant at certain spiking levels for certain compounds.  In these instances, pairwise comparisons
were made to identify those pairs of laboratories that differed significantly from each other.  The
findings of these tests, presented within Table B-7, were as follows:

       1)     For DMP, laboratory effects were significant at all spiking levels except the Medium
              level. Lab A was significantly different from (and lower than) Lab B at the Unspiked,
              Low #1, and Low #2 levels, and Lab C at the Unspiked level. At the High level, Lab D
              was significantly different from (and lower than) the other three laboratories.

       2)     For DMDTP, laboratory effects were significant at all five spiking levels. Lab B was
              significantly different from (and higher than) Labs A and C at all five spiking levels
              and Lab D at Low #2, Medium, and High levels. In addition, Lab A was significantly
              different from (and lower than) Lab D at the Unspiked and Low #1 levels.

       3)     For DEP, laboratory effects were significant at the Unspiked, Low #1, Low #2, and
              High levels. Lab B was significantly different from (and higher than) Lab A at the
              Unspiked, Low #1, and Low #2 levels, and Lab D at the Low #2 and High levels. In
              addition, Lab A was significantly different from (and lower than) Lab C at the
              Unspiked and Low #2 levels, and Lab D at the Unspiked level.

       4)     For DEDTP, laboratory effects were significant at all five spiking levels. Lab B was
              significantly different from (and higher than) Labs A and C at all five levels. In
              addition, Lab A was significantly different from (and lower than) Lab C at the
              Unspiked and Low #1 levels.

       5)     For DMTP, laboratory effects were significant at the Unspiked, Low, and Medium
              levels. Lab B was significantly different from (and higher than) Lab A at each of
              these three levels, Lab C at the Unspiked and Low levels, and Lab D at the
              Unspiked level. In addition, Lab A was significantly different from (and lower than)
              Lab D at the Low level; Lab A was significantly different from (and higher than) Lab
              C at the Unspiked level; and Lab C was significantly different from (and lower than)
              Lab D at the Unspiked level.

       6)     For DETP, laboratory effects were significant at the Unspiked, Low, and Medium
              levels. At the Unspiked level, all labs were found to be significantly different from
              each other, with the highest value in Lab C, followed by Lab B, then by Labs D and
              A. In addition, Lab A was significantly different from (and higher than) Lab C at
              the Low level. No significant differences between any two labs were observed at the
              Medium level.
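
The Bonferroni step described in the table footnotes can be sketched in Python as follows. This is only an illustration under stated assumptions: the family of comparisons is taken to be all pairs of labs at one spiking level (six pairs for four labs), and the raw p-values and the helper `bonferroni_pairs` are hypothetical, not results or code from this study.

```python
from itertools import combinations


def bonferroni_pairs(p_by_pair, alpha=0.05):
    """Return the pairs whose raw p-value passes the Bonferroni-adjusted cutoff."""
    # Each comparison is tested at alpha divided by the number of comparisons.
    cutoff = alpha / len(p_by_pair)
    return [pair for pair, p in p_by_pair.items() if p <= cutoff]


labs = ["Lab A", "Lab B", "Lab C", "Lab D"]
pairs = list(combinations(labs, 2))  # six pairs for four labs

# Hypothetical raw p-values for the six pairwise comparisons (not report data).
p_values = [0.002, 0.004, 0.20, 0.03, 0.60, 0.009]
p_by_pair = dict(zip(pairs, p_values))

flagged = bonferroni_pairs(p_by_pair)
print(flagged)
```

For DEDTP, which only Labs A, B, and C analyzed, the same approach would give three pairs and a per-comparison cutoff of 0.05/3.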

-------
                              Appendix C
                               Raw Data
Lab A Set 1
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1          4.22   161.41   158.08   239.49     5.79    11.52
2          0.51     <0.2     3.03     2.52     1.38     3.27
3          1.15      0.2     0.67     0.02     1.19     0.56
4         14.74   158.25   177.91   263.67   188.08   174.52
5          0.56      0.2     2.61     1.88     2.94     0.60
6         10.19     0.38    23.22     0.39   167.00   186.10
7          0.65    41.08    44.30    54.15     2.01     7.88
8          0.52      0.2     1.76     0.75    17.61    21.35
9          2.93    35.80    44.23    51.45    21.00    23.62
10         7.76     0.87     4.81     1.21   163.31   172.79
11         1.39      0.2      0.3      0.2     1.83     4.04
12         1.66     1.71     1.94     1.46    17.40    21.94
13         0.35   154.39   169.77    28.48     6.24    19.71
13                                   22.73
13                                  250.29
13                                  224.81
14        12.99    36.54    65.30    59.51   170.09   175.85
15         <0.6     1.01     1.46     1.54     0.39     3.80
16         1.74    35.15    41.09    59.48     3.07     7.73
17         5.99   157.65   161.40   229.26    25.57    30.75
18         1.50     2.49      0.3     1.57     1.87     4.01
19         <0.6     1.21     1.58     1.20     1.97     3.65
20         1.94      0.2     1.03     0.33    18.37    22.07
21         0.97      0.2      0.3      0.2     3.01     0.26
22         1.60   160.59   173.80   283.04     9.24    18.84
23         2.03      0.2     4.49     2.08   160.51   174.11
24         2.60     0.34     3.48     0.73    20.07    22.07
25         4.47   155.69   165.49   243.45    26.24    31.87
26         <0.6    35.66    40.33    52.94     2.07     7.77
27        14.23   162.39   185.15   239.73   176.22   182.87
28         1.50   157.36   155.44   238.90     5.15    17.73
29         2.55    38.31    44.43    54.89     2.18     3.04
30        15.81    35.13    60.08    52.46   164.89   176.67
31         1.74      0.2     5.17     1.97   155.15   167.14
32         3.25    35.75    40.51    48.06    20.96    23.66
33         <0.6      0.2      0.3     0.01     1.75     3.65
34         <0.6     1.89      0.3     1.15     1.21     3.68
35         1.65      0.2      0.3      0.2     2.92      0.3

-------
Lab A Set 2
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1          3.82   156.80   175.18   224.82     5.63    10.09
2          <0.6     0.99     2.22     2.29     1.04     4.46
3          1.23     <0.2     0.79     0.46     1.56     0.43
4         13.53   156.10   167.24   228.15   170.69   186.32
5          <0.6     1.49     1.60     1.52     3.07     0.40
6          9.25             16.15     <0.2   161.73   187.27
7          <0.6    40.03    38.30    57.33     2.15     6.44
8          1.51     0.77     <0.3     <0.2    17.11    21.09
9          2.98    38.02    48.20    54.30    20.09    24.10
10         1.83     1.35     4.74     3.16   169.92   180.97
11         0.97     <0.2     <0.3     <0.2     1.33     4.48
12         3.18     1.44     2.70     2.41    17.21    21.17
13         <0.6   159.35   156.57   245.34     4.26    10.27
14        13.80    38.05    64.67    53.97   166.61   177.70
15         1.14     0.81     <0.3     1.02     1.06     4.58
16         2.58    35.48    46.44    58.10     2.29     6.81
17         5.30   163.31   171.30   249.38    24.12    31.89
18         1.07     1.46     <0.3     2.96     1.17     4.54
19         1.74     <0.2     2.22     1.86     0.89     4.28
20         1.71     <0.2     <0.3     4.52    17.13    20.92
21         2.31     <0.2     <0.3     <0.2     1.92     0.36
22         2.54   153.09   188.02   226.44     4.02    10.81
23         1.24     1.23     2.72     <0.2   173.23   188.18
24         2.27     1.56     4.82     0.36    20.07    22.85
25         6.27   160.33   159.11   255.27    24.78    31.39
26         1.52    40.19    39.66    63.86     1.91     5.72
27        13.12   167.79   179.61   254.09   168.18   195.03
28         1.37   139.20   155.51   254.06     4.09    10.64
29         1.78    39.68    38.66    56.23     2.78     4.39
30        10.44    39.38    52.10    62.02   156.21   175.64
31         4.82     1.15     4.23     1.55   152.76   165.91
31         5.73
32         3.41    33.48    32.97    55.51    20.83    25.35
32         2.59
33         1.34     0.83     <0.3     <0.2     0.85     4.14
34         <0.6     1.52     <0.3     1.88     1.13     4.33
35         1.55     <0.2     <0.3     <0.2     0.97     <0.3

-------
Lab A Set 3
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1          4.46   151.76   174.12   218.65     6.72    10.43
2          <0.6     1.54     <0.3     3.34     0.98     3.80
3          0.81     0.46     0.54     <0.2     2.19     1.05
4         17.48   150.17   176.36   224.88   172.25   176.49
5          <0.6     1.50     1.51     2.98     1.95     0.95
6         12.80     0.49    21.79     1.07   161.53   164.65
7          <0.6    37.78    36.43    58.15     3.80    10.44
8          0.54     1.37     1.46     <0.2    17.47    20.02
9          3.79    38.52    41.49    52.50    22.39    24.30
10         2.56     <0.2     7.47     3.16   160.09   162.50
11         <0.6     <0.2     <0.3     <0.2     1.08     3.89
12         2.05     1.48     2.34     <0.2    17.85    20.07
13         2.26   156.12   174.79   199.83     9.29    22.09
14        15.06    35.94    61.93    58.43   166.89   175.90
15         <0.6     <0.2     <0.3     <0.2     1.11     3.81
16         1.71    33.11    36.94    49.41     3.01     9.06
17         6.28   151.70   150.47   238.38    26.47    30.93
18         <0.6     2.07     <0.3     3.09     1.10     4.07
19         <0.6     <0.2     <0.3     <0.2     1.17     3.74
20         <0.6     <0.2     <0.3     3.73    18.65    21.63
20         1.89
21         1.59     0.76     0.55     1.96     1.07     1.13
22         1.49   154.14   159.16   225.40    10.28    26.32
23         2.53     2.19     6.16     2.62   164.80   173.42
24         2.73     0.65     2.89     <0.2    20.75    20.50
25         6.14   153.45   178.50   235.31    27.39    31.75
26         <0.6    36.81    35.81    53.25     3.38     7.96
27        16.47   152.26   184.20   235.06   162.29   175.55
28         <0.6   143.99   159.84   245.32    10.09    21.81
29         2.78    36.87    37.76    55.98     2.70     3.11
30        13.56    34.48    56.60    56.38   161.33   168.20
31         3.02     1.83     6.47     1.79   146.29   157.86
32         3.36    32.71    41.24    38.57    21.09    22.40
33         <0.6     <0.2     0.66     0.52     1.82     3.52
34         0.87     <0.2     2.42     2.29     2.23     3.95
35         <0.6     <0.2     1.83     <0.2     1.98     <0.3

-------
Lab B
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1         10.40  1529.40   232.30   757.70   130.80   102.20
2            <5      <10       <5      <10       <5       <5
3            <5      <10     3.40     6.50       <5     1.90
4         11.80  1535.30   315.80   810.70   373.20   355.80
5         13.30   196.20    44.20    94.30    34.00    38.60
6          3.90   145.40    13.60   331.80    81.50   100.20
7            <5  1066.60    51.00   456.40     8.10    10.40
8            <5      <10     6.30   136.90    41.30    39.20
9            <5   268.00    28.40   130.90    22.20    22.10
10         3.30      <10    10.20    36.70   229.50   286.50
11         4.60      <10       <5      <10     8.10     9.20
12         3.10    40.50    34.70    20.20    36.00    29.20
13         4.70  1062.10   113.70   525.30    22.80    31.50
14         8.70   195.40    70.90   101.80   188.80   208.00
15           <5      <10       <5     4.90       <5     3.10
16           <5  1038.10   127.80   419.60    16.60     9.90
17        11.20  1800.20   236.80   856.40   151.80   121.00
18           <5      <10       <5     6.60       <5     0.80
19           <5      <10       <5      <10       <5       <5
20           <5      <10     5.40    10.40    37.60    37.90
21           <5      <10       <5      <10       <5       <5
22         6.10  4497.90   529.00  1895.80    41.60    17.90
23        11.90      <10     8.70      <10   348.90   338.90
24           <5      <10     4.20    10.30    33.90    33.00
25         8.20  1457.50   204.90   740.10   116.70    88.50
26           <5  1163.50   137.00   439.00    19.60    12.70
27        11.10  3868.10   271.40  1638.00   318.40   274.70
28           <5  5751.30   577.00  2066.30    57.00    18.90
29           <5   882.60    68.20   353.90    15.10     5.20
30         4.90   440.60    47.00   230.50    96.10   129.10
31         6.40    52.50     9.50    43.30   319.90   272.50
32           <5      <10     5.40   104.20     6.20     4.90
33           <5      <10       <5     8.20       <5     2.90
34           <5      <10       <5      <10       <5       <5
35           <5      <10       <5      <10       <5       <5

-------
Lab C
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1           6.9    196.2    199.6    152.0      3.3      5.6
1           7.0    203.0    204.0    149.0      5.0      9.0
2*         <1.6      1.0      1.7      1.1      2.3      3.0
3           3.7     <0.8      1.9     <0.6       <1      1.5
4          10.3    138.9    146.3    118.8    113.7    137.2
5*          3.3      1.5      5.5      1.2       <1     <0.6
6          10.8     <0.8      8.6     <0.6    165.2    188.3
7*          3.6     20.2     25.0     29.4      3.0      3.3
8*         <1.6     <0.8      1.6      0.9     14.7     15.5
8          <1.6     <0.8      1.0      0.9     15.0     16.0
9          <1.6     60.7     49.7     39.7     21.1     23.1
9           2.0     56.0     47.0     42.0     23.0     25.0
10          5.9      1.2      4.7      2.2    180.9    183.7
11        120.6     <0.8     38.7      0.7      3.5      4.2
12          2.3      1.8      2.9      2.3     21.9     24.8
13*         6.8    109.6    124.5    152.3      3.6      7.4
14          5.9     52.1     53.4     39.8    152.0    195.9
15         <1.6     <0.8      1.5      1.3      3.3      4.3
15         <1.6     <0.8      1.0      0.8      3.0      3.0
16          2.5     42.0     49.0     63.8      3.6      4.1
17          8.3    185.6    204.4    281.8     26.3     27.0
17          6.0    181.0    206.0    243.0     26.0     30.0
18*         2.1      1.3      1.7     22.9      2.6      3.2
19          5.7      0.8      2.6      2.1      3.0      4.0
20          4.2      1.8      2.1      2.1     21.1     23.7
21*         2.9     <0.8      1.3      0.8       <1      1.0
22          7.7    177.1    195.9    281.0      5.0      7.3
23          4.2      0.8      3.5      0.9    181.0    191.1
23          3.0     <0.8      3.0      0.8    176.0    190.0
24         <1.6      0.9      1.7     <0.6     23.7     23.8
25          5.1    188.4    191.6    246.0     18.9     21.0
26          1.7     43.5     49.1     59.1      3.2      4.4
27          9.9    185.9    197.0    242.3    182.7    194.3
28          3.2    175.0    195.5    239.4      4.2      7.3
29         <1.6     46.3     48.9     53.9      1.1      5.1
29         <1.6     41.0     48.0     55.0      2.0      5.0
30          7.3     45.3     49.1     53.3    178.8    182.4
31          4.5      1.6      4.6      2.3    171.0    183.1
31          5.0      2.0      5.0      2.2    162.0    182.0
32          2.5     44.7     47.2     50.2     16.2     24.2
33         <1.6     <0.8       <1     <0.6      2.3      2.7
34         <1.6      2.1      1.3      1.5      2.4      4.2
35         <1.6     <0.8       <1     <0.6       <1     <0.6
* Sample was received at the laboratory in a broken vial, but the sample was still frozen, so the analysis
was completed.  No contamination problems were observed.

-------
Lab D
                     Reported Measurement (ug/L)
Sample
Number      DMP    DMDTP      DEP    DEDTP     DMTP     DETP
1          <2.5   134.60   118.30                7.30     9.20
1          <2.5   128.90    99.00                5.70     9.30
2          <2.5     <2.5     <2.5                <2.5     <2.5
3          <2.5     <2.5     <2.5                7.90     <2.5
4          <2.5   120.40    98.30              183.90   185.60
5          <2.5     <2.5     <2.5                <2.5     <2.5
6          <2.5     <2.5     <2.5              174.50   163.70
7          <2.5    30.30    25.70                <2.5     4.40
8          <2.5     <2.5     <2.5               19.00    23.90
9          <2.5    33.40    37.00               32.20    30.50
10         <2.5     <2.5     <2.5               75.50    96.40
11         <2.5     <2.5     <2.5                7.00     5.50
12         <2.5     <2.5     <2.5               36.60    31.60
13         5.90   128.70   103.30               12.00    12.00
14         <2.5    34.30    39.50              177.90   190.90
15         <2.5     <2.5     <2.5                <2.5     3.60
16         <2.5    42.80    32.10                8.70     9.30
17         <2.5   137.10    92.40               31.20    30.80
18         <2.5     <2.5     <2.5                6.20     5.70
19         <2.5     <2.5     <2.5                9.70     6.70
20         <2.5     <2.5     <2.5               35.60    31.70
21         <2.5     <2.5     <2.5                <2.5     <2.5
22         4.40    97.40   108.40                7.80    11.60
23         <2.5     <2.5     <2.5              202.50   170.50
24         <2.5     <2.5     <2.5               38.30    33.00
25         <2.5   125.00    85.20               32.00    41.80
26         <2.5    38.50    22.20                <2.5     7.20
27         <2.5    95.40    81.30              150.40   158.00
28         <2.5    38.90    16.30                <2.5     <2.5
29         <2.5    40.10    38.60                <2.5     <2.5
29         <2.5    30.40    24.80                <2.5     <2.5
30         <2.5    35.40    28.80              178.70   198.00
31         <2.5     <2.5     6.60              162.20   153.10
32         <2.5    32.30    29.60               33.20    34.90
32         <2.5    32.80    35.30               32.50    32.40
33         <2.5     <2.5     <2.5               10.30     4.40
34         <2.5     <2.5     <2.5               10.30     6.10
35         <2.5     <2.5     <2.5                <2.5     <2.5

-------