Targeted National Sewage Sludge
Survey (TNSSS): Summary Statistics
and Estimates of 95th Percentiles for
84 Additional Analytes
U.S. Environmental Protection Agency
Office of Water (4301T)
Office of Science and Technology
Health and Ecological Criteria Division
1200 Pennsylvania Avenue, NW
Washington, DC 20460
EPA 822-R-21-003
April 2021
Prepared by:
Battelle
505 King Avenue
Columbus, Ohio 43201
EPA Contract No. EP-W-09-024, Work Assignment 4-04
Table of Contents
1. Introduction
1.1 TNSSS Design Overview
1.2 Compounds Analyzed in the TNSSS
2. Approach
2.1 ProUCL Software
2.2 Methods for Testing Distributional Goodness-of-Fit
2.3 Methods for Identifying Statistical Outliers
2.4 Methods for Estimating the 95th Percentile
2.4.1. Detection Limit Substitution Methods
2.4.2. MLE Methods
2.4.3. ROS Substitution Method
2.4.4. Kaplan-Meier Nonparametric Method
2.4.5. Summary of 95th Percentile Calculation Methods
3. Results
4. Key Findings and Conclusions
5. References
List of Tables
Table 1. Numbers of POTWs within the TNSSS, by Average Flow Rate
Table 2. Listing of 34 Analytes Measured in the TNSSS, Whose Measurements Were Subject to In-Depth Statistical Analysis in USEPA (2009a)
Table 3. Listing of 27 Analytes Measured in the TNSSS, With No More than One Detected Outcome from Among the 84 Collected Samples of Treated Biosolids
Table 4. Listing of 84 Analytes Measured in the TNSSS, and the Percentage of Detected Outcomes from Among the 84 Collected Samples of Treated Biosolids
Table 5. Criteria for Classifying Within-Facility Aggregated Measurements as a Detect or Non-Detect in the TNSSS
Table 6. Summary Statistics for Detected Outcomes for 84 Analytes Measured in the TNSSS
Table 7. Summary of 95th Percentiles Calculation Methods in ProUCL
Table 8. Outcome of Goodness-of-Fit Tests, and Estimates of 95th Percentiles Using Various Statistical Methods and Assumptions, for 84 Analytes Measured in the TNSSS
Table 9. Detected Facility Measurements Labeled as Statistical Outliers by Outlier Tests for 84 Analytes Measured in the TNSSS
Table 10. Recommended (Nonparametric) Estimates of the 95th Percentile for the Pharmaceuticals, Steroids, and Hormones, Along with the Maximum Observed Concentration
Table 11. Recommended (Lognormal-Based) Estimates of the 95th Percentile for the 84 Metals, Organics, Anions, and PBDEs, Along with the Maximum Observed Concentration
Table 12. Weighted Summary Statistics and 95th Percentile Estimates for the 84 Analytes, Using Statistical Techniques Applied in the Weighted (Preliminary) Analysis Performed in USEPA (2009a)
Table 13. 95th Percentile Estimates for the Prioritized Analytes, as Reported in USEPA (2009a), and Unweighted Estimates Generated by ProUCL
List of Figures
Figure 1. Histograms of Facility-Specific Concentrations for Non-Prioritized Metals in the TNSSS
Figure 2. Histograms of Facility-Specific Concentrations for Non-Prioritized Organics and Classicals (Anions) in the TNSSS
Figure 3. Histograms of Facility-Specific Concentrations for Non-Prioritized PBDEs in the TNSSS
Figure 4. Histograms of Facility-Specific Concentrations for Non-Prioritized Pharmaceuticals in the TNSSS
Figure 5. Histograms of Facility-Specific Concentrations for Non-Prioritized Steroids/Hormones in the TNSSS
1. INTRODUCTION
1. Introduction
The Targeted National Sewage Sludge Survey (TNSSS) measured the concentrations of a total of 145 analytes in
treated biosolids taken from a statistically representative subset of the nation's Publicly Owned Treatment
Works (POTWs). The TNSSS statistical report (USEPA, 2009a) presented results of in-depth statistical
analyses performed on the measurements of 34 prioritized analytes. This report presents the results of
data analyses performed on measurements for 84 additional analytes, which were not prioritized in 2009.
For each of the 84 analytes, this report assesses the distribution of measurements from the TNSSS and
utilizes an appropriate statistical approach to estimate the 95th percentile of the distribution based on
these data. EPA's ProUCL software serves as the tool for generating these estimates, while accounting
for non-detected outcomes present among the measurements. Detections of 27 analytes were too limited
to conduct statistical analyses; 16 analytes had zero detections and 11 analytes had one detection.
Following a brief overview of the TNSSS and the list of analytes measured in the sampled biosolids,
Section 2 of this report discusses the statistical approaches considered for estimating the 95th percentiles.
Section 3 presents the estimates that were calculated under these approaches. Section 4 presents key
findings and conclusions.
1.1 TNSSS Design Overview
The target population for the TNSSS consisted of 3,337 POTWs that met the following criteria:
Were in full operation in 2002 and/or 2004,
Had flow rates greater than 1 million gallons per day (MGD),
Employed a minimum of secondary treatment (see footnote 1),
Were located in the contiguous United States, and
Were neither privately-owned, non-publicly owned, nor Tribal facilities.
EPA used statistical survey sampling techniques to select POTWs from which to collect biosolids samples
in the TNSSS. Sample collection occurred from August 2006 to March 2007. To ensure that the sampled
facilities covered the entire range of flow rates, the sampling design divided the sample frame into three
strata defined by flow rate. Table 1 shows the three strata and the sample sizes resulting from each one.
USEPA (2009a) contains more detail on the TNSSS design.
Table 1. Numbers of POTWs within the TNSSS, by Average Flow Rate.

Average Flow Rate | Number of POTWs Sampled in the TNSSS
>100 MGD | 8
10 to 100 MGD | 12
1 to 10 MGD | 54
TOTAL | 74

Footnote 1: At a POTW, all wastewater first must go through the primary treatment process, which involves screening and
settling out large particles. The wastewater then moves on to the secondary treatment process, during which organic
matter is removed by allowing bacteria to break down the pollutants.
Within each sampled POTW, EPA collected one grab sample of biosolids for analysis, except in the
following situations where two grab samples were selected:
At six facilities, duplicate grab samples were collected.
For four facilities that each utilized two treatment systems, EPA collected one grab sample from each
system.
Therefore, EPA collected a total of 84 grab samples of treated biosolids in the TNSSS from the 74
sampled POTWs.
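
The sample counts above can be checked with a few lines of arithmetic. The sketch below is only an illustration; the variable names are hypothetical, and the counts are taken from Table 1 and from the two exceptions just described.

```python
# Facilities sampled in each flow-rate stratum (Table 1).
potws_by_stratum = {">100 MGD": 8, "10 to 100 MGD": 12, "1 to 10 MGD": 54}

facilities = sum(potws_by_stratum.values())  # one grab sample per facility
duplicate_facilities = 6    # each contributed one additional (duplicate) sample
two_system_facilities = 4   # each contributed one additional sample (second system)

total_samples = facilities + duplicate_facilities + two_system_facilities
print(facilities, total_samples)
```

The two totals agree with the 74 sampled POTWs and the 84 grab samples stated in the text.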
1.2 Compounds Analyzed in the TNSSS
The TNSSS statistical analysis report (USEPA, 2009a) presents nationally representative estimates of
means and upper percentiles of the concentrations of 34 analytes measured in the biosolids samples.
Table 2 lists these analytes, according to the class of chemicals in which they reside.
Table 2. Listing of 34 Analytes Measured in the TNSSS, Whose Measurements Were Subject to In-Depth Statistical Analysis in USEPA (2009a).

Class | Analytes
Metals | Barium; Beryllium; Manganese; Molybdenum; Silver
Organics | 4-Chloroaniline; Fluoranthene; Pyrene
Classicals (Anions) | Nitrate/Nitrite
PBDEs | BDE-47 (2,2',4,4'-tetrabromodiphenyl); BDE-99 (2,2',4,4',5-pentabromodiphenyl); BDE-153 (2,2',4,4',5,5'-hexabromodiphenyl); BDE-209 (decabromodiphenyl)
Pharmaceuticals | 4-Epitetracycline (ETC); Azithromycin; Carbamazepine; Cimetidine; Ciprofloxacin; Diphenhydramine; Doxycycline; Erythromycin-Total; Fluoxetine; Miconazole; Ofloxacin; Tetracycline (TC); Triclocarban; Triclosan
Steroids and Hormones | Beta Stigmastanol; Campesterol; Cholestanol; Cholesterol; Coprostanol; Epicoprostanol; Stigmasterol

Along with the 34 analytes in Table 2, EPA measured the concentrations of 111 additional analytes in the
biosolids samples. These 111 analytes were not subject to the in-depth data analysis featured in that
report. Table 3 lists 27 of these non-prioritized analytes which had no more than one detected
concentration reported among the 84 collected samples. The lack of a sufficient number of detected,
quantifiable analysis outcomes for characterizing uncertainty in the measurements made it inappropriate
to apply a rigorous statistical analysis to data for these 27 analytes (which were exclusively
pharmaceuticals, steroids, and hormones).
Table 3. Listing of 27 Analytes Measured in the TNSSS, With No More than One Detected Outcome from Among the 84 Collected Samples of Treated Biosolids.

Class | Analytes
Pharmaceuticals | 4-Epianhydrochlortetracycline (EACTC); 4-Epichlortetracycline (ECTC); Albuterol; Anhydrochlortetracycline (ACTC); Carbadox; Cefotaxime; Chlortetracycline (CTC); Clinafloxacin; Cloxacillin; Digoxigenin; Digoxin; Flumequine; Isochlortetracycline (ICTC); Norgestimate; Ormetoprim; Oxacillin; Oxolinic Acid; Penicillin G; Penicillin V; Sulfamerazine; Sulfamethizole; Sulfathiazole; Tylosin; Warfarin
Steroids and Hormones | 17 Alpha-Dihydroequilin; 17 Alpha-Ethinyl-Estradiol; Equilenin

Table 4 lists the remaining 84 analytes and the percentage of collected samples of biosolids in the
TNSSS for which the analytical method yielded a detected outcome for that analyte. This report uses
statistical techniques to estimate the 95th percentile of the concentration of these analytes in treated
biosolids, based on data collected in the TNSSS.
Table 4. Listing of 84 Analytes Measured in the TNSSS, and the Percentage of Detected Outcomes from Among the 84 Collected Samples of Treated Biosolids.

Analyte | Percentage of Detected Outcomes

Metals
Aluminum | 100.0%
Antimony | 86.5%
Arsenic | 100.0%
Boron | 97.3%
Cadmium | 100.0%
Calcium | 100.0%
Chromium | 100.0%
Cobalt | 100.0%
Copper | 100.0%
Iron | 100.0%
Lead | 100.0%
Magnesium | 100.0%
Mercury | 100.0%
Nickel | 100.0%
Phosphorus | 100.0%
Selenium | 100.0%
Sodium | 100.0%
Thallium | 94.6%
Tin | 94.6%
Titanium | 98.6%
Vanadium | 100.0%
Yttrium | 100.0%
Zinc | 100.0%

Organics
2-Methylnaphthalene | 44.6%
Benzo(a)pyrene | 77.0%
Bis(2-ethylhexyl) phthalate | 100.0%

Classicals (Anions)
Fluoride | 100.0%
Water-Extractable Phosphorus | 100.0%

PBDEs
BDE-028 | 100.0%
BDE-066 | 100.0%
BDE-085 | 100.0%
BDE-100 | 100.0%
BDE-138 | 67.9%
BDE-154 | 100.0%
BDE-183 | 100.0%

Pharmaceuticals
1,7-Dimethylxanthine | 5.1%
4-EOTC | 10.3%
4-Epianhydrotetracycline (EATC) | 34.6%
Acetaminophen | 2.6%
Anhydrotetracycline (ATC) | 60.3%
Caffeine | 46.2%
Clarithromycin | 53.8%
Codeine | 24.4%
Cotinine | 44.9%
Dehydronifedipine | 21.8%
Demeclocycline | 3.8%
Diltiazem | 82.1%
Enrofloxacin | 15.4%
Gemfibrozil | 89.7%
Ibuprofen | 62.8%
Lincomycin | 3.8%
Lomefloxacin | 2.6%
Metformin | 7.8%
Minocycline | 43.3%
Naproxen | 51.3%
Norfloxacin | 33.3%
Oxytetracycline (OTC) | 35.9%
Ranitidine | 57.1%
Roxithromycin | 2.6%
Sarafloxacin | 2.6%
Sulfachloropyridazine | 2.6%
Sulfadiazine | 3.9%
Sulfadimethoxine | 6.5%
Sulfamethazine | 2.6%
Sulfamethoxazole | 37.7%
Sulfanilamide | 10.4%
Thiabendazole | 69.2%
Trimethoprim | 29.5%
Virginiamycin | 17.9%

Steroids and Hormones
17 Alpha-estradiol | 6.8%
17 Beta-estradiol | 11.5%
Androstenedione | 41.1%
Androsterone | 65.8%
Beta-Estradiol 3-Benzoate | 23.0%
Beta-Sitosterol | 85.9%
Desmosterol | 66.7%
Equilin | 17.8%
Ergosterol | 61.5%
Estriol | 21.6%
Estrone | 76.7%
Norethindrone | 6.6%
Norgestrel | 5.4%
Progesterone | 22.1%
Testosterone | 23.3%

2. APPROACH
2. Approach
This section describes the statistical methodology for estimating the 95th percentile of the concentration of
the 84 additional analytes (listed in Table 4) in treated biosolids across the POTWs sampled in the
TNSSS.
As noted in Section 2.4.3 of USEPA (2009a), the TNSSS aimed to collect a single sample of final treated
biosolids from a facility; the measurements taken from this single sample represented the facility's
average concentration for the pollutant at a single point in time. Therefore, in the ten instances when a
facility had two biosolids samples collected, either for quality control purposes or because the facility
generated two types of biosolids products, EPA investigated whether the two data values for a given
analyte could be aggregated (averaged) into a single value prior to performing the data review and
analysis. For the statistical analysis of the 34 prioritized analytes (USEPA, 2009a), EPA aggregated data
values within a facility in the following instances:
For all analytes, when the second sample was a field duplicate sample (6 facilities).
For analytes within the classicals, metals, and organics classifications, when the two samples
represented different treatment systems (4 facilities). Aggregation did not occur for other classifications
(i.e., PBDEs, pharmaceuticals, steroids, and hormones) within these facilities because measurements
often differed considerably between samples collected from different systems, especially between solid
and liquid samples.
When data aggregation occurred, the criteria for classifying a facility's aggregated (average) value as a
detect or non-detect result matched those used in USEPA (2009a) and are documented in Table 5.
Table 5. Criteria for Classifying Within-Facility Aggregated Measurements as a Detect or Non-Detect in the TNSSS.

If the two sample data values are ... | The aggregated value is calculated as the ... | The result is classified as a ...
Both detected | Arithmetic average of the measured values | Detect
Both non-detected | Arithmetic average of the sample-specific detection limits | Non-detect
A mixture of detected and non-detected samples | Arithmetic average of the measured value (for the detected sample) and sample-specific detection limit (for the non-detected sample) | Detect

Thus, following this within-facility data aggregation, the 95th percentile was estimated using a set of 74
data values for each of the metals, organics, and classicals, and a maximum set of 78 data values for
PBDEs, pharmaceuticals, steroids, and hormones. (For selected pharmaceuticals, steroids, and
hormones, fewer than 78 data values were available for the calculation, as the laboratory did not report a
value for certain samples/facilities.)
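
The aggregation rules in Table 5 and the resulting dataset sizes can be illustrated with a short sketch. The `Result` type and the `aggregate_pair` function are hypothetical names introduced here for illustration; they are not part of ProUCL or of the TNSSS data files.

```python
from typing import NamedTuple

class Result(NamedTuple):
    value: float     # measured value, or the detection limit for a non-detect
    detected: bool

def aggregate_pair(a: Result, b: Result) -> Result:
    """Average two within-facility results following the Table 5 rules."""
    average = (a.value + b.value) / 2.0
    # The average is a non-detect only when both inputs are non-detects.
    return Result(average, a.detected or b.detected)

# Dataset sizes after aggregation (counts taken from the text).
total_samples = 84
duplicate_pairs = 6     # field duplicates: aggregated for all analyte classes
two_system_pairs = 4    # aggregated only for classicals, metals, and organics
n_metals_organics_classicals = total_samples - duplicate_pairs - two_system_pairs
n_other_classes_max = total_samples - duplicate_pairs

mixed = aggregate_pair(Result(10.0, True), Result(4.0, False))
print(mixed, n_metals_organics_classicals, n_other_classes_max)
```

In the mixed case, the detection limit of the non-detected sample enters the average as if it were a measured value, and the aggregated result is treated as a detect.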
When the laboratory reported a non-detect outcome, it reported the sample-specific detection limit rather
than a measured value for that sample. For a given analyte, different samples could have different
detection limits whose values can overlap the distribution of detected outcomes. Table 6 lists the 84
analytes and some summary statistics on the observed detected measurements and on the reported
detection limits (for non-detects).
Table 6. Summary Statistics for Detected Outcomes for 84 Analytes Measured in the TNSSS.

Analyte | CAS Number | Total N | Detected N | Detected Concentrations: Minimum | Median | Maximum | Mean

Metals (mg/kg)
Aluminum | 7429905 | 74 | 74 | 1,400.00 | 11,200.00 | 57,300.00 | 13,494.66
Antimony | 7440360 | 74 | 64 | 0.45 | 1.71 | 20.50 | 2.53
Arsenic | 7440382 | 74 | 74 | 1.18 | 4.96 | 49.20 | 6.94
Boron | 7440428 | 74 | 72 | 5.70 | 33.00 | 131.00 | 41.48
Cadmium | 7440439 | 74 | 74 | 0.21 | 1.76 | 11.80 | 2.64
Calcium | 7440702 | 74 | 74 | 9,480.00 | 27,000.00 | 243,000.00 | 41,025.41
Chromium | 7440473 | 74 | 74 | 6.74 | 32.68 | 1,160.00 | 80.16
Cobalt | 7440484 | 74 | 74 | 0.87 | 4.59 | 290.00 | 10.73
Copper | 7440508 | 74 | 74 | 115.00 | 456.00 | 1,720.00 | 553.13
Iron | 7439896 | 74 | 74 | 1,580.00 | 15,650.00 | 131,000.00 | 26,252.50
Lead | 7439921 | 74 | 74 | 5.81 | 46.15 | 350.00 | 76.19
Magnesium | 7439954 | 74 | 74 | 713.50 | 4,460.00 | 18,050.00 | 4,956.61
Mercury | 7439976 | 74 | 74 | 0.19 | 0.83 | 7.50 | 1.23
Nickel | 7440020 | 74 | 74 | 7.61 | 23.45 | 526.00 | 48.32
Phosphorus | 7723140 | 74 | 74 | 5,715.00 | 19,300.00 | 69,400.00 | 21,806.49
Selenium | 7782492 | 74 | 74 | 1.10 | 6.20 | 24.20 | 7.00
Sodium | 7440235 | 74 | 74 | 154.00 | 1,017.75 | 26,600.00 | 2,699.97
Thallium | 7440280 | 74 | 70 | 0.02 | 0.13 | 1.68 | 0.18
Tin | 7440315 | 74 | 70 | 7.50 | 36.15 | 522.00 | 49.08
Titanium | 7440326 | 74 | 73 | 18.50 | 86.90 | 4,805.00 | 281.73
Vanadium | 7440622 | 74 | 74 | 2.04 | 12.65 | 617.00 | 36.19
Yttrium | 7440655 | 74 | 74 | 0.70 | 3.89 | 26.30 | 4.82
Zinc | 7440666 | 74 | 74 | 216.00 | 784.00 | 8,550.00 | 970.01

Organics (ug/kg)
2-Methylnaphthalene | 91576 | 74 | 33 | 10.00 | 250.00 | 4,600.00 | 498.12
Benzo(a)pyrene | 50328 | 74 | 57 | 63.00 | 360.00 | 4,000.00 | 810.69
Bis(2-ethylhexyl) phthalate | 117817 | 74 | 74 | 657.35 | 24,000.00 | 310,000.00 | 52,862.48

Anions (mg/kg)
Fluoride | 16984488 | 74 | 74 | 14.70 | 54.10 | 234.00 | 59.42
Water-Extractable Phosphorus | C055 | 74 | 74 | 11.00 | 420.75 | 9,550.00 | 988.08

PBDEs (ng/kg)
BDE 028 | 41318756 | 78 | 78 | 2,200.00 | 8,900.00 | 160,000.00 | 15,348.72
BDE 066 | 189084615 | 78 | 78 | 1,800.00 | 12,000.00 | 110,000.00 | 17,396.79
BDE 085 | 182346210 | 78 | 78 | 3,200.00 | 23,000.00 | 150,000.00 | 27,943.59
BDE 100 | 189084648 | 78 | 78 | 13,000.00 | 120,000.00 | 1,100,000.00 | 150,365.38
BDE 138 | 182677301 | 78 | 53 | 1,900.00 | 7,900.00 | 40,000.00 | 10,247.17
BDE 154 | 207122154 | 78 | 78 | 7,700.00 | 46,500.00 | 440,000.00 | 59,900.00
BDE 183 | 207122165 | 78 | 78 | 2,100.00 | 10,000.00 | 120,000.00 | 16,664.74

Pharmaceuticals (ug/kg)
1,7-Dimethylxanthine | 611596 | 78 | 4 | 1,130.00 | 2,245.00 | 9,580.00 | 3,800.00
4-Epianhydrotetracycline (EATC) | 4465650 | 78 | 27 | 126.00 | 299.00 | 2,160.00 | 434.29
4-Epioxytetracycline (EOTC) | 14206587 | 78 | 8 | 35.70 | 45.80 | 54.90 | 45.60
Acetaminophen | 103902 | 78 | 2 | 1,120.00 | 1,210.00 | 1,300.00 | 1,210.00
Anhydrotetracycline (ATC) | 4496859 | 78 | 47 | 94.30 | 205.00 | 1,960.00 | 330.06
Caffeine | 58082 | 78 | 36 | 72.90 | 262.50 | 1,110.00 | 369.16
Clarithromycin | 81103119 | 78 | 42 | 8.68 | 34.50 | 617.00 | 65.53
Codeine | 76573 | 78 | 19 | 10.70 | 35.80 | 328.00 | 61.28
Cotinine | 486566 | 78 | 35 | 11.40 | 21.00 | 690.00 | 99.36
Dehydronifedipine | 67035227 | 78 | 17 | 3.48 | 5.96 | 21.65 | 7.97
Demeclocycline | 127333 | 78 | 3 | 96.00 | 164.00 | 200.00 | 153.33
Diltiazem | 42399417 | 78 | 64 | 1.81 | 18.25 | 225.00 | 44.45
Enrofloxacin | 93106606 | 78 | 12 | 12.55 | 32.20 | 66.00 | 34.42
Gemfibrozil | 25812300 | 78 | 70 | 12.10 | 115.00 | 2,650.00 | 234.12
Ibuprofen | 15687271 | 78 | 49 | 99.50 | 255.00 | 11,900.00 | 920.67
Lincomycin | 154212 | 78 | 3 | 12.85 | 29.10 | 33.40 | 25.12
Lomefloxacin | 98079517 | 78 | 2 | 33.30 | 36.55 | 39.80 | 36.55
Metformin | 657249 | 77 | 6 | 550.00 | 756.00 | 1,160.00 | 781.50
Minocycline | 10118908 | 67 | 29 | 351.00 | 475.00 | 8,650.00 | 883.40
Naproxen | 22204531 | 78 | 40 | 20.90 | 75.75 | 1,020.00 | 137.37
Norfloxacin | 70458967 | 78 | 26 | 99.30 | 203.00 | 995.50 | 297.30
Oxytetracycline (OTC) | 79572 | 78 | 28 | 21.05 | 62.50 | 467.00 | 83.07
Ranitidine | 66357355 | 77 | 44 | 3.85 | 18.15 | 2,250.00 | 81.98
Roxithromycin | 80214831 | 78 | 2 | 14.30 | 18.33 | 22.35 | 18.33
Sarafloxacin | 98105998 | 78 | 2 | 179.00 | 1,079.50 | 1,980.00 | 1,079.50
Sulfachloropyridazine | 80320 | 77 | 2 | 26.00 | 42.35 | 58.70 | 42.35
Sulfadiazine | 68359 | 77 | 3 | 22.90 | 77.30 | 140.00 | 80.07
Sulfadimethoxine | 122112 | 77 | 5 | 3.58 | 7.35 | 62.20 | 18.30
Sulfamethazine | 57681 | 77 | 2 | 21.50 | 22.35 | 23.20 | 22.35
Sulfamethoxazole | 723466 | 77 | 29 | 3.91 | 12.30 | 651.00 | 43.26
Sulfanilamide | 63741 | 77 | 8 | 191.00 | 1,593.50 | 15,600.00 | 3,651.50
Thiabendazole | 148798 | 78 | 54 | 8.42 | 22.05 | 238.00 | 46.32
Trimethoprim | 738705 | 78 | 23 | 12.65 | 38.90 | 204.00 | 51.37
Virginiamycin | 11006761 | 78 | 14 | 43.50 | 125.25 | 469.00 | 162.56

Steroids/Hormones (ug/kg)
17 Alpha-Estradiol | 57910 | 73 | 5 | 14.45 | 21.90 | 48.80 | 26.13
17 Beta-Estradiol | 50282 | 78 | 9 | 22.00 | 33.20 | 222.25 | 60.89
Androstenedione | 63058 | 73 | 30 | 108.00 | 387.50 | 1,520.00 | 495.15
Androsterone | 53418 | 73 | 48 | 17.65 | 107.50 | 1,030.00 | 157.26
Beta-Estradiol 3-Benzoate | 50500 | 74 | 17 | 30.20 | 145.00 | 1,850.00 | 449.16
Beta-Sitosterol | 83465 | 78 | 67 | 24,400.00 | 260,000.00 | 1,640,000.00 | 333,643.28
Desmosterol | 313042 | 78 | 52 | 2,730.00 | 14,700.00 | 94,400.00 | 19,037.69
Equilin | 474862 | 73 | 13 | 22.30 | 36.75 | 100.30 | 48.31
Ergosterol | 57874 | 78 | 48 | 4,530.00 | 21,700.00 | 91,900.00 | 27,988.33
Estriol | 50271 | 74 | 16 | 7.56 | 77.85 | 232.00 | 79.24
Estrone | 53167 | 73 | 56 | 26.70 | 74.90 | 965.00 | 133.78
Norethindrone | 68224 | 76 | 5 | 21.00 | 41.10 | 1,360.00 | 305.82
Norgestrel | 6533002 | 74 | 4 | 43.80 | 113.75 | 1,300.00 | 392.83
Progesterone | 57830 | 77 | 17 | 143.00 | 757.00 | 1,290.00 | 753.50
Testosterone | 58220 | 73 | 17 | 30.80 | 97.90 | 2,040.00 | 291.79

Section 2.1 introduces the ProUCL software used to calculate the 95th percentile estimates for the 84
analytes in Table 6. Sections 2.2 and 2.3 present goodness-of-fit distributional tests and statistical outlier
tests, respectively, which were used in preparing the datasets for analysis and determining an appropriate
statistical approach for estimating the 95th percentile. Section 2.4 presents brief overviews of these
statistical approaches; the results of applying these approaches to data for the 84 analytes follow in
Section 3.
2.1 ProUCL Software
The analysis in this report utilized Version 4.1.00 of EPA's ProUCL software, an open-source statistical
estimation software tool available for download at https://www.epa.gov/land-research/proucl-software.
ProUCL offers a variety of parametric and nonparametric statistical approaches for calculating estimates
of the upper percentiles of a statistical distribution. These approaches differ in the assumed underlying
distribution of the data and in how non-detects are treated. Most approaches regard non-detects as left-
censored at the reported sample-specific detection limit (i.e., the reported result is known only to fall
below the limit), and some can handle multiple values for detection limits among the non-detects.
EPA considered ProUCL to be an appropriate tool for estimating 95th percentiles for the 84 analytes in this
report for three reasons. First, the 95th percentile was the quantity of interest. Second, ProUCL offers
approaches for handling non-detects that are more rigorous than simple substitution methods and that have
demonstrated good performance in peer-reviewed publications. Third, the reported data in the TNSSS contain
non-detects at multiple detection limits (when non-detects were present) for a given analyte.
ProUCL was designed to analyze environmental concentration data associated with a localized site
characterization. Thus, it was not designed to analyze data from complex sampling designs, such as
those involving stratification or the use of sampling weights. The in-depth statistical analysis performed in USEPA
(2009a) did account for the sampling weights, and thus, generated nationally representative estimates.
The estimates that ProUCL generates for this report are unweighted.
2.2 Methods for Testing Distributional Goodness-of-Fit
ProUCL considers three different data distributions as a basic assumption in its parametric estimation
methods: normal, lognormal, and gamma distributions. Thus, ProUCL offers goodness-of-fit tests for
each of these three distribution models. ProUCL recommends that the results of these tests be reviewed
with histograms or quantile-quantile (Q-Q) plots of the data in order to get a more complete assessment
of distributional goodness-of-fit. These data plots also provide useful information about the presence of
potential outliers and influential data values. Thus, histograms of detected measurements for the
individual analytes can be found at the end of Section 3.
Because of the unknown quantitative value of non-detects, the goodness-of-fit tests were applied only to
the set of detected measurements for each analyte. That is, any non-detects were excluded from the
test.
ProUCL uses the Shapiro-Wilk test for normality (Gilbert, 1987), as well as the Lilliefors test (Dudewicz and
Misra, 1988; Conover, 1999) when the sample size exceeds 50. (ProUCL indicates that the Lilliefors test
performs well for samples of this size, while recognizing that the Shapiro-Wilk test can be applied to
samples with larger sample sizes.) Lognormality tests are equivalent to normality tests performed on
log-transformed data.
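
The equivalence between a lognormality test and a normality test on log-transformed data can be illustrated with the Lilliefors statistic, which is the largest distance between the empirical distribution function and a normal distribution whose mean and standard deviation are estimated from the data. The sketch below is a minimal illustration and not ProUCL's implementation: the function name and the ten data values are hypothetical, the sample is far smaller than the sizes at which ProUCL applies this test, and only the test statistic is computed (a decision would still require critical values from published Lilliefors tables).

```python
import math
from statistics import NormalDist, mean, stdev

def lilliefors_statistic(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Check the gap just after and just before the empirical CDF steps at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

# Hypothetical right-skewed detected concentrations (illustration only).
detects = [12.0, 15.0, 18.0, 22.0, 30.0, 41.0, 55.0, 90.0, 160.0, 420.0]

d_raw = lilliefors_statistic(detects)                         # normality
d_log = lilliefors_statistic([math.log(v) for v in detects])  # lognormality
print(round(d_raw, 3), round(d_log, 3))
```

For right-skewed data such as these, the statistic on the log scale is smaller than on the original scale, which is the pattern expected when a lognormal model describes the data better than a normal model.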
To test for goodness-of-fit to a gamma distribution, ProUCL uses two empirical distribution function
(EDF)-based methods: the Kolmogorov-Smirnov (K-S) test and the Anderson-Darling (A-D) test
(D'Agostino and Stephens, 1986; Stephens, 1970). The critical values for these two test statistics
originate from Monte Carlo simulation experiments.
Conclusions derived solely from goodness-of-fit tests need to be made with caution. The null
hypothesis of these tests is that the given distributional model holds (e.g., normality), and the alternative is
that it does not hold. The outcome of a test is therefore that the distributional model either can or cannot be
rejected based on the data. Thus, failing to reject the given distribution does not mean that the
distribution is the best fit to the data, only that it cannot be outright rejected. Furthermore, the outcome of
the test is highly influenced by the sample size. Fewer data points make it more difficult to reject the
distributional model, thereby making it more likely to conclude that the distribution is reasonable when in
fact it is not, while many data points can result in rejecting the distribution with high likelihood, even when
the distribution is appropriate. The test outcomes can also be influenced by extreme data values. Thus,
these goodness-of-fit tests provide only a general indication of the relevance of a given distributional
assumption.
2.3 Methods for Identifying Statistical Outliers
The presence of outliers among the collected concentration data could distort the estimates of
distributional parameters such as upper percentiles. To identify and assess potential outliers in the
measurements for the 84 analytes, the following outlier detection tests were accessed in ProUCL:
Dixon's Extreme Value test (Dixon, 1953), when the sample size is less than 25.
Rosner's test (Gilbert, 1987), which can detect up to 10 outliers for sample sizes of 25 or more.
The outcomes of both outlier tests are sensitive to the assumption that the data are normally distributed in
the absence of outliers. Therefore, the extent to which normality holds in the data was checked along
with the results of the outlier tests (or, equivalently, lognormality if the tests are performed on
log-transformed data). Furthermore, outlier tests that identify a single (i.e., most extreme) outlier often
suffer from masking effects when multiple outliers are present, as these outliers inflate the standard
deviation, which makes it more difficult to identify the most extreme data point as an outlier. For both
tests, non-detects can be either excluded from the dataset or represented by one-half of the detection
limit. (In this analysis, non-detects were excluded; that is, outlier tests were performed only on detected
outcomes, and thus, the sample size refers to the number of detected outcomes.)
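
The masking effect described above can be seen in a small numerical sketch. The statistic below, the largest absolute deviation from the mean in units of the sample standard deviation, is the first-step quantity used by Rosner-type tests. The function name and the data are hypothetical, and no critical values are computed, so this is an illustration of the effect rather than a reproduction of ProUCL's outlier tests.

```python
from statistics import mean, stdev

def extreme_studentized_deviate(values):
    """Return (statistic, value) for the point farthest from the mean, in sd units."""
    m, s = mean(values), stdev(values)
    farthest = max(values, key=lambda v: abs(v - m))
    return abs(farthest - m) / s, farthest

# Hypothetical detected concentrations (illustration only).
base = [4.1, 4.6, 5.0, 5.2, 5.5, 5.9, 6.3, 6.8, 7.2, 7.7]
one_outlier = base + [30.0]
two_outliers = base + [30.0, 31.0]

r_one, _ = extreme_studentized_deviate(one_outlier)
r_two, _ = extreme_studentized_deviate(two_outliers)
print(round(r_one, 2), round(r_two, 2))
```

Adding a second extreme value inflates the standard deviation, so the statistic for the most extreme point drops even though that point is no less unusual relative to the bulk of the data.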
As always, the decision regarding the proper disposition of outliers (e.g., to include or not to include
outliers in statistical analyses) should consider the extent to which conditions associated with sampling
and analysis, as well as facility conditions on the day of collection, are not typical, and thus, warrant
exclusion. Because no data exclusions could be warranted for such reasons, no outliers were excluded
from the calculation of 95th percentiles based on applying these outlier tests.
2.4 Methods for Estimating the 95th Percentile
ProUCL provides four basic statistical approaches to calculating the 95th percentile. They typically
calculate a 95th percentile as:

$$x_{0.95} = \mu + c\,\sigma \qquad (1)$$

where μ and
2.4.2. MLE Methods
ProUCL utilized MLE methods in two situations:
Under the assumption of normality and at least one non-detect outcome,
Under the assumption of a gamma distribution and 100% detected outcomes.
In the first situation (i.e., normality and at least one non-detect outcome), ProUCL estimates a 95th
percentile using Cohen's MLE method (Cohen, 1950, 1959) for those analytes having data that can
accommodate the method's numerical analysis. Among the 84 analytes, 21 had sufficient data to
calculate MLE estimates for the 95th percentile under the normality assumption. Here, μ and
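Let $x_1 \le x_2 \le \dots \le x_n$ denote the (observed) measured concentrations (the detection limits for non-detects) for n samples for which the
concentrations originate from a common underlying distribution, and assume $y_1 < y_2 < \dots < y_p$ are the p distinct
detected values among them. For $j = 1, \dots, p$, let $n_j$ denote the number of the n reported values that are less than or
equal to $y_j$, and let $m_j$ denote the number of detected measurements equal to $y_j$. Then the cumulative distribution function F(x), as
estimated by the K-M approach, equals the following:

$$
F(x) =
\begin{cases}
1 & \text{if } x \ge y_p \\[4pt]
\displaystyle\prod_{j:\, y_j > x} \frac{n_j - m_j}{n_j} & \text{if } y_1 \le x < y_p \\[4pt]
F(y_1) & \text{if } x_1 \le x < y_1 \\[4pt]
0 & \text{if } 0 \le x < x_1 = y_1 \\[4pt]
\text{undefined} & \text{if } 0 \le x < x_1 < y_1
\end{cases}
$$

The five cases correspond, respectively, to x equaling or exceeding the maximum observed detected value; x falling
within the range of the observed detected values; x falling between the smallest detection limit and the smallest
detected value; x falling below all observed measurements when the smallest value is detected; and x falling below
all observed measurements when the smallest observed value is a non-detect.
Therefore, the estimate F(x) is a step function that is calculated from the highest observed measurement
down to the smallest, as follows:

$$\text{if } y_{p-1} \le x < y_p, \text{ then } F(x) = \frac{n_p - m_p}{n_p}$$

$$\text{if } y_{p-2} \le x < y_{p-1}, \text{ then } F(x) = \frac{n_{p-1} - m_{p-1}}{n_{p-1}} \cdot \frac{n_p - m_p}{n_p}$$

etc.
Note that when x is below the smallest of the n reported measurements ($x_1$), F(x) is undefined if $x_1$ is a
non-detect, and is zero if $x_1$ is a detected value.
Using the estimate F(x) and the set of p detected measurements, the mean of the distribution is estimated
as follows:

$$\hat{\mu} = \sum_{j=1}^{p} y_j \left[ F(y_j) - F(y_{j-1}) \right] \quad (\text{where } F(y_0) = 0) \qquad (2)$$

The variance is estimated as follows:

$$\hat{\sigma}^2 = \sum_{j=1}^{p} y_j^2 \left[ F(y_j) - F(y_{j-1}) \right] - \hat{\mu}^2 \quad (\text{where } F(y_0) = 0) \qquad (3)$$
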
One estimate of the 95th percentile based on the K-M approach is the value of x for which F(x) = 0.95. (If
multiple values of x satisfy this criterion, then any of these values could be chosen, such as the midpoint
or minimum value.)
Alternatively, the 95th percentile could be estimated from K-M estimates and standard normal z-scores by
calculating the K-M mean and variance from Equations (2) and (3) and inserting those values into
Equation 1, letting c=1.645. This alternative approach assumes an underlying normal distribution in the
data.
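
The K-M calculation described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not ProUCL's implementation: the function names (`km_cdf_steps`, `km_mean_variance`, `km_percentile`) and the six-value dataset are hypothetical, and the percentile rule returns the smallest detected value at which the estimated F reaches 0.95, which is one way to handle the case where no x gives F(x) = 0.95 exactly.

```python
def km_cdf_steps(values, detected):
    """Kaplan-Meier CDF for left-censored data.

    values[i] is a measured concentration, or the detection limit when
    detected[i] is False. Returns [(y_j, F(y_j))] for the distinct detected
    values y_1 < ... < y_p, with F(y_p) = 1.
    """
    ys = sorted({v for v, d in zip(values, detected) if d})
    steps = []
    f = 1.0
    # Work from the largest detected value down to the smallest.
    for y in reversed(ys):
        steps.append((y, f))
        n_j = sum(1 for v in values if v <= y)  # all reported values <= y_j
        m_j = sum(1 for v, d in zip(values, detected) if d and v == y)  # detects at y_j
        f *= (n_j - m_j) / n_j
    steps.reverse()
    return steps

def km_mean_variance(steps):
    """K-M mean and variance from the step heights, taking F(y_0) = 0."""
    mean = second_moment = 0.0
    previous = 0.0
    for y, f in steps:
        weight = f - previous
        mean += y * weight
        second_moment += y * y * weight
        previous = f
    return mean, second_moment - mean ** 2

def km_percentile(steps, p=0.95):
    """Smallest detected value whose estimated CDF is at least p (an assumed rule)."""
    for y, f in steps:
        if f >= p:
            return y
    return steps[-1][0]

# Hypothetical data: two non-detects reported at detection limits 1.0 and 2.0.
values = [1.0, 2.0, 2.0, 3.0, 5.0, 8.0]
detected = [False, True, False, True, True, True]

steps = km_cdf_steps(values, detected)
km_mean, km_var = km_mean_variance(steps)
print(steps)
print(km_mean, km_percentile(steps), km_mean + 1.645 * km_var ** 0.5)
```

The last printed value is the normal-based alternative, which inserts the K-M mean and variance into the z-score formula with c = 1.645.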
2.4.5. Summary of 95th Percentile Calculation Methods
Table 7 contains a summary of the statistical methods used in ProUCL to calculate 95th percentiles, as
discussed in Sections 2.4.1 through 2.4.4. Section 3 applies these methods to the measurement data for
the 84 analytes.
Table 7. Summary of 95th Percentiles Calculation Methods in ProUCL.
Normal 95th
Percentile (DL/2
Sub.)
95th percentile calculated from the sample mean and standard deviation (sd) with non-detects
replaced by one-half of the detection limit: mean + 1,645*sd. Here, 1.645 is the 95th
percentile of the standard normal distribution. ProUCL calculates this estimate in all
situations, but does not recommend this substitution method and includes this calculation only
for historic reasons.
Normal 95th
Percentile (MLE)
95th percentile calculated from Cohen's MLE estimates of the mean and standard deviation
(sd): mean + 1,645*sd. ProUCL calculates this value only when at least one non-detected
sample result exists and when a sufficient number of detected sample results exist to perform
the MLE estimation technique.
Lognormal 95th
Percentile (DL/2
Sub.)
95th percentile calculated from the sample mean and standard deviation (sd) of log-
transformed concentrations with non-detects replaced by one-half of the log-transformed
detection limit: (exp[mean + 1.645*sd]). ProUCL calculates this estimate in all situations, but
does not recommend this substitution method and includes this calculation only for historic
reasons.
Lognormal-ROS
95th Percentile
Regression on order statistics (ROS) approach assuming that non-detected outcomes follow a
lognormal distribution. 95th percentile calculated as (exp[mean + 1.645*sd]), where the mean
and standard deviation (sd) are calculated as the sample mean and standard deviation with
non-detects replaced by estimates obtained from a linear regression fitted to detected
measurements paired with lognormal quantiles. ProUCL calculates this value only when at
least one non-detect and three detected sample results exist.
Gamma 95th
Percentile (MLE)
95th percentile calculated by y*theta/2, where y is the 95th percentile of a chi-square
distribution with 2*k degrees of freedom (where k is the MLE of the shape parameter of the
Gamma distribution), and theta is the MLE of the scale parameter of the Gamma distribution.
ProUCL calculates this value only when all sample results are detected.
Gamma-ROS
95th Percentile
Regression on order statistics (ROS) approach assuming that non-detected outcomes follow a
gamma distribution with shape and scale parameters (y and theta, respectively) represented
by their MLEs calculated from detected data. 95th percentile calculated as y*theta/2, with non-
detects replaced by estimates obtained from a linear regression fitted to detected
measurements paired with quantiles from the same gamma distribution. ProUCL calculates
this value only when at least one non-detect and four detected sample results exist.
Nonparametric
95th Percentile
(Order stats.)
95th percentile calculated by the (0.95*n)th order statistic. If (0.95*n) is not an integer, then if l is
the next lowest integer and e = (0.95*n) - l, and if x(k) denotes the kth order statistic, then the 95th
percentile is x(l) + e*(x(l+1) - x(l)). ProUCL calculates this value only when all sample results are
detected.
K-M 95th
Percentile
95th percentile calculated from the Kaplan-Meier estimate of the cumulative distribution
function (Section 2.4.4). ProUCL calculates this value only when at least one non-detected
sample result exists.
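Two of the calculations in Table 7 can be illustrated for a data set in which every result is detected. The Python sketch below is only an illustration of the stated formulas, not ProUCL's implementation. The function names are hypothetical, and the use of the n-1 divisor for the sample standard deviation is an assumption.

```python
import math
import statistics

Z95 = 1.645  # standard normal 95th percentile, as used in the Table 7 formulas


def lognormal_p95(values):
    """exp(mean + 1.645*sd) of the log-transformed values (all results detected)."""
    logs = [math.log(v) for v in values]
    # Assumption: "sample standard deviation" means the n-1 divisor.
    return math.exp(statistics.mean(logs) + Z95 * statistics.stdev(logs))


def order_stat_p95(values, p=0.95):
    """(p*n)th order statistic, interpolating linearly when p*n is not an integer."""
    xs = sorted(values)
    n = len(xs)
    pos = p * n
    l = math.floor(pos)   # next lowest integer
    e = pos - l           # fractional part
    if l < 1:             # too few values to interpolate; fall back to the minimum
        return xs[0]
    # x(l) is xs[l - 1] because Python lists are 0-based
    return xs[l - 1] + e * (xs[l] - xs[l - 1]) if e > 0 else xs[l - 1]


if __name__ == "__main__":
    sample = [1.2, 0.8, 3.4, 2.2, 5.9, 1.1, 0.9, 2.7, 4.1, 1.6]  # made-up values
    print(lognormal_p95(sample), order_stat_p95(sample))
```

When non-detects are present, the substitution, ROS, MLE, or Kaplan-Meier variants described above would be needed instead; they are not covered by this sketch.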
3. Results
Using ProUCL, the methods of Section 2 were applied to the set of concentration data summarized in
Table 6, for each of the 84 non-prioritized analytes having at least two detected outcomes. Table 8
summarizes the results of the tests of goodness-of-fit discussed in Section 2.2 and presents estimates of
the 95th percentile of the underlying distribution of data under the various approaches presented in
Section 2.4 and summarized in Table 7.
For a given analyte, the estimates in Table 8 can vary considerably among the different approaches. In
fact, some of these approaches may not be suitable for estimating an analyte's 95th percentile due to the
data failing to satisfy important underlying assumptions related to the distribution of the data. This section
investigates the distributional properties of the analyte data in order to make a proper decision on an
approach for a final estimate of the 95th percentile for each analyte. To assist the decision-making, the
detected measurements for each analyte are plotted in histograms within Figures 1 through 5 at the end
of this section. (Each bar within these figures represents the number of samples/facilities whose data
values fall within a specified range, with the median of the range specified to the left of the bar.)
Goodness of fit test outcomes. For pollutant measurements in environmental media, lognormal or
gamma distributions are often good models for the underlying concentration distribution, as they cover
only positive values and are skewed toward low values, with long right-hand tails to represent possible
large measurements. Table 8 includes the results of goodness-of-fit tests (described in Section 2.2) for
the normal, lognormal, and gamma distributions when applied to the detected measurements for each of
the 84 analytes. For a given analyte, an "X" is specified in a given column of the table if the distribution
specified in the column heading cannot be rejected at a 0.05 significance level. Thus, if no X is specified
for a given distribution, then the approaches that require this distribution to hold should not be used to
estimate a 95th percentile.
Table 8 shows that when considering the detected observations only, the lognormal and gamma
distributions are most frequently deemed satisfactory for the 84 analytes (i.e., could not be rejected by the
goodness-of-fit tests). However, nearly one-third of the analytes (27) had neither the lognormal nor
gamma distributions as sufficient representations of the observed data based on the outcomes of the
goodness-of-fit tests. Nevertheless, the histograms (Figures 1 through 5) demonstrate a skewed
distribution for most analytes that resembles a lognormal or gamma distribution.
Of the 84 analytes, the majority of the 35 metals, anions, organics, and PBDEs had 100% detected
outcomes, and only one of these analytes was below 50% detected. The lognormal distribution could not
be rejected for 23 of these analytes, the gamma distribution fitted satisfactorily to two additional analytes,
and all three distributions were rejected for the remaining 10 analytes. In USEPA (2009a), a lognormal
assumption was made for the metals, organics, and PBDEs. One could, therefore, recommend using a
lognormal-based approach to calculate 95th percentiles among the metals, classicals, organics, and
PBDEs, given their high detection percentages and to be consistent with the approach taken in USEPA
(2009a). However, for the 12 analytes for which the goodness-of-fit tests for lognormality were rejected, it
would be worthwhile to compare the lognormal-based estimates with those under the nonparametric
approach and note any differences.
Of the 84 analytes, for the 49 pharmaceuticals, steroids, and hormones, the detection percentages were
considerably lower than for the other analytes. Thus, it was more difficult to characterize these
distributions. When identifying a common approach to calculating the 95th percentile across these
analytes, the overall conclusion from the distributional goodness-of-fit tests is that nonparametric
techniques (e.g., Kaplan-Meier) are more appropriate for the pharmaceuticals and steroids/hormones that
have a relatively high proportion of non-detects. This conclusion is consistent with ProUCL's
recommendations for calculating 95% upper confidence limits on the means when detection percentages
were low. It differs, however, from USEPA (2009a), where a lognormal approach was used for the
prioritized pharmaceuticals, steroids, and hormones (for which the detection percentages were higher).
Table 8. Outcome of Goodness-of-Fit Tests, and Estimates of 95th Percentiles Using Various Statistical Methods and Assumptions, for
84 Analytes Measured in the TNSSS.
Columns (left to right): Analyte; n; % Detected; Goodness-of-Fit Test Outcomes (on Detected Results Only): Normal test, Lognormal test, Gamma test; Normal-Based 95th Percentile Estimates: DL/2 Sub., MLE; Lognormal-Based 95th Percentile Estimates: DL/2 Sub., ROS Extrapolation; Gamma-Based 95th Percentile Estimates: MLE, ROS Extrapolation; Nonparametric 95th Percentile Estimates: Order stats., K-M; Minimum 95th Percentile; Obs. Max. Detected Conc.
Metals (mg/kg)
Aluminum
74
100.0%
X
X
29,773
34,255
30,870
29,960
29,773
57,300
Antimony
74
86.5%
X
6.93
7.16
14.9
6.49
9.85
6.89
6.49
20.5
Arsenic
74
100.0%
X
17.9
15.7
16.1
14.0
14.0
49.2
Boron
74
97.3%
X
X
94.0
135
115
112
119
93.5
93.5
131
Cadmium
74
100.0%
6.65
6.68
6.54
8.31
6.54
11.8
Calcium
74
100.0%
X
111,421
100,753
103,717
109,700
100,753
243,000
Chromium
74
100.0%
323
227
253
265
227
1,160
Cobalt
74
100.0%
68.1
22.0
36.0
20.4
20.4
290
Copper
74
100.0%
X
X
1,146
1,298
1,202
1,248
1,146
1,720
Iron
74
100.0%
X
70,814
78,323
71,689
91,795
70,814
131,000
Lead
74
100.0%
X
195
220
201
241
195
350
Magnesium
74
100.0%
X
X
10,402
12,096
11,050
11,945
10,402
18,050
Mercury
74
100.0%
X
3.28
3.09
3.09
3.56
3.09
7.50
Nickel
74
100.0%
197
115
148
189
115
526
Phosphorus
74
100.0%
X
X
40,871
44,114
42,278
40,780
40,780
69,400
Selenium
74
100.0%
X
X
13.8
15.6
14.4
14.5
13.8
24.2
Sodium
74
100.0%
10,899
7,653
8,934
10,128
7,653
26,600
Thallium
74
94.6%
X
X
0.517
0.592
0.439
0.439
0.446
0.515
0.439
1.68
Tin
74
94.6%
155
175
109
108
175
155
108
522
Titanium
74
98.6%
1,555
1,550
675
674
1,063
1,547
674
4,805
Vanadium
74
100.0%
162
101
118
111
101
617
Yttrium
74
100.0%
X
X
11.8
12.8
11.9
14.2
11.8
26.3
Zinc
74
100.0%
X
2,622
2,087
2,178
1,839
1,839
8,550
Organics (ug/kg)
2-Methylnaphthalene
74
44.6%
X
X
1,349
1,124
728
1,334
1,229
728
4,600
Benzo(a)pyrene
74
77.0%
X
2,220
2,838
2,397
2,194
3,252
2,207
2,194
4,000
Bis(2-ethylhexyl) phthalate
74
100.0%
X
X
161,178
266,644
180,771
184,000
161,178
310,000
Classicals (mg/kg)
Fluoride
74
100.0%
X
X
124
135
128
131
124
234
Water-Extractable Phosphorus
74
100.0%
X
3,792
4,910
3,628
3,733
3,628
9,550
PBDEs (ng/kg)
BDE-028
78
100.0%
54,936
38,006
43,681
55,600
38,006
160,000
BDE-066
78
100.0%
X
47,906
45,781
44,914
57,300
44,914
110,000
BDE-085
78
100.0%
X
64,134
69,656
64,202
61,150
61,150
150,000
BDE-100
78
100.0%
X
386,860
387,979
363,164
314,500
314,500
1,100,000
BDE-138
78
67.9%
X
22,310
23,144
19,114
39,832
18,787
18,787
40,000
BDE-154
78
100.0%
X
X
155,163
149,085
143,689
130,000
130,000
440,000
BDE-183
78
100.0%
50,338
41,314
44,090
57,300
41,314
120,000
Pharmaceuticals (ug/kg)
1,7-Dimethylxanthine
78
5.1%
X
X
X
2,439
1,125
107
1,071
2,868
107
9,580
4-EOTC
78
10.3%
X
X
X
40.2
40.0
38.8
31.3
43.5
31.3
54.9
4-Epianhydrotetracycline (EATC)
78
34.6%
X
X
694
839
531
517
872
702
517
2,160
Acetaminophen
78
2.6%
528
406
1,156
406
1,300
Anhydrotetracycline (ATC)
78
60.3%
694
848
638
640
1,144
693
638
1,960
Caffeine
78
46.2%
X
X
604
723
587
585
993
602
585
1,110
Clarithromycin
78
53.8%
X
168
192
117
123
204
168
117
617
Codeine
78
24.4%
X
90.0
48.9
49.5
87.0
89.8
48.9
328
Cotinine
78
44.9%
242
679
125
134
260
241
125
690
Dehydronifedipine
78
21.8%
9.03
8.71
6.95
6.55
10.1
9.42
6.55
21.7
Demeclocycline
78
3.8%
X
X
94.1
83.2
50.5
121
50.5
200
Diltiazem
78
82.1%
X
127
147
182
173
187
126
126
225
Enrofloxacin
78
15.4%
X
X
X
44.1
32.4
26.9
56.0
33.1
26.9
66.0
Gemfibrozil
78
89.7%
X
904
920
885
791
993
900
791
2,650
Ibuprofen
78
62.8%
3,300
3,641
1,515
1,799
3,338
3,291
1,515
11,900
Lincomycin
78
3.8%
X
X
36.7
29.5
18.3
18.7
18.3
33.4
Lomefloxacin
78
2.6%
25.4
18.9
34.6
18.9
39.8
Metformin
77
7.8%
X
X
X
716
742
445
341
709
341
1,160
Minocycline
67
43.3%
2,224
2,167
1,038
1,075
2,226
2,261
1,038
8,650
Naproxen
78
51.3%
X
305
361
248
255
409
305
248
1,020
Norfloxacin
78
33.3%
X
763
426
345
575
448
345
995
Oxytetracycline (OTC)
78
35.9%
136
161
96.6
97.9
176
137
96.6
467
Ranitidine
77
57.1%
469
501
91.3
98.8
271
467
91.3
2,250
Roxithromycin
78
2.6%
12.2
11.0
15.9
11.0
22.4
Sarafloxacin
78
2.6%
775
295
538
295
1,980
Sulfachloropyridazine
77
2.6%
18.6
10.8
32.6
10.8
58.7
Sulfadiazine
77
3.9%
X
X
36.8
13.8
1.11
49.1
1.11
140
Sulfadimethoxine
77
6.5%
X
X
14.0
3.90
0.683
6.86
15.6
0.683
62.2
Sulfamethazine
77
2.6%
14.5
7.68
21.8
7.68
23.2
Sulfamethoxazole
77
37.7%
142
156
31.4
36.1
94.9
142
31.4
651
Sulfanilamide
77
10.4%
X
X
3,650
2,620
451
82.3
2,096
3,715
82.3
15,600
Thiabendazole
78
69.2%
113
124
107
110
177
112
107
238
Trimethoprim
78
29.5%
X
X
76
65.1
50.7
88.1
74.2
50.7
204
Virginiamycin
78
17.9%
X
X
284
234
93.6
167
183
93.6
469
Steroids/Hormones (ug/kg)
17 Alpha-estradiol
73
6.8%
X
X
X
21.5
18.3
21.1
40.3
23.4
18.3
48.8
17 Beta-estradiol
78
11.5%
69.2
41.0
18.8
40.3
67.0
18.8
222
Androstenedione
73
41.1%
X
X
785
1,049
795
736
1,184
774
736
1,520
Androsterone
73
65.8%
X
X
365
442
390
366
579
363
363
1,030
Beta-Estradiol 3-Benzoate
74
23.0%
X
X
652
826
245
192
591
650
192
1,850
Beta-Sitosterol
78
85.9%
X
X
756,638
786,498
3,422,184
1,069,746
1,492,776
751,640
751,640
1,640,000
Desmosterol
78
66.7%
X
X
40,327
46,638
47,737
42,371
73,162
40,145
40,145
94,400
Equilin
73
17.8%
X
X
X
52.6
49.4
34.1
49.7
51.7
34.1
100
Ergosterol
78
61.5%
X
X
51,969
58,455
77,101
60,380
100,132
51,566
51,566
91,900
Estriol
74
21.6%
X
X
X
93.9
121
70.3
51.2
99.0
91.7
51.2
232
Estrone
73
76.7%
376
460
339
328
544
375
328
965
Norethindrone
76
6.6%
X
397
87.4
114
293
293
87.4
1,360
Norgestrel
74
5.4%
X
X
289
65.8
2.29
120
302
2.29
1,300
Progesterone
77
22.1%
X
X
810
731
600
949
797
600
1,290
Testosterone
73
23.3%
X
544
356
273
161
390
526
161
2,040
X: The hypothesis that the given distribution holds cannot be rejected at the 0.05 level.
In those few instances where normality could not be rejected at a 0.05 level (i.e., 11 pharmaceuticals and
steroids/hormones), only a small number of detected outcomes (less than 25%) were available for the
goodness-of-fit test. As a result, for these analytes, there is typically not sufficient power to declare that a
given distributional form is not appropriate, as these tests require the data to demonstrate that the
distribution model does not hold. Thus, normality was not considered to be a viable distributional
assumption for the analytes in Table 8.
Identifying possible statistical outliers. Section 2.3 noted the two outlier tests that ProUCL uses to
identify statistical outliers among a set of detected outcomes: Dixon's test (which identifies a maximum of
one outlier and is applied when the number of detected outcomes is less than 25), and Rosner's test
(which can identify up to 10 outliers and is applied when at least 25 detected outcomes are available).
These tests were applied to the set of log-transformed detected measurements for each of the 84
analytes (as Figures 1 through 5 indicate that the log-measurements are more likely to resemble a normal
distribution compared to the untransformed measurements, and these outlier tests assume normality in
the data being analyzed). When Rosner's test was applied in this analysis, a maximum of five outliers
was specified given the sample sizes.
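The test-selection rule and the kind of statistic that Rosner's test examines can be sketched as follows. This is a rough illustration under stated assumptions and is not ProUCL's code: the function names are hypothetical, the statistic shown is the extreme studentized deviate computed on successively reduced log-transformed data, and the comparison against critical values at the 0.05 level (needed to actually declare an outlier) is deliberately left out.

```python
import math
import statistics


def choose_outlier_test(n_detects):
    """Rule described in the text: Dixon's test below 25 detects, Rosner's otherwise."""
    return "Dixon" if n_detects < 25 else "Rosner"


def esd_statistics(detects, max_outliers=5):
    """Successive extreme-studentized-deviate statistics on log-transformed detects.

    Returns a list of (statistic, candidate value) pairs, most extreme first.
    Critical values are NOT applied here, so nothing is declared an outlier.
    """
    pairs = [(math.log(v), v) for v in detects]
    results = []
    for _ in range(max_outliers):
        if len(pairs) < 3:
            break
        logs = [p[0] for p in pairs]
        mean = statistics.mean(logs)
        sd = statistics.stdev(logs)
        if sd == 0:
            break
        # Index of the log-value farthest from the current mean
        idx = max(range(len(pairs)), key=lambda i: abs(pairs[i][0] - mean))
        results.append((abs(pairs[idx][0] - mean) / sd, pairs[idx][1]))
        del pairs[idx]  # remove the candidate and repeat on the reduced sample
    return results


if __name__ == "__main__":
    values = [10.0, 12.0, 11.0, 9.0, 13.0, 500.0]  # made-up detected measurements
    print(choose_outlier_test(len(values)))
    for stat, value in esd_statistics(values, max_outliers=2):
        print(f"candidate {value}: statistic {stat:.2f}")
```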
Outlier testing resulted in identifying one or more statistical outliers at the 0.05 significance level for 13 of
the 84 analytes. Table 9 lists these analytes and those measurements identified as statistical outliers
(with the ID number for the surveyed facility that was linked to the outcome in parentheses following each
measurement). Because the number of detected outcomes for each of these 13 analytes exceeded 25,
Rosner's test was used to identify the outliers (listed in the last column of Table 9). As a means of
comparison, Table 9 also includes the largest detected measurement for the analyte which was not
labeled as an outlier. The outliers listed in the last column of Table 9 ranged from 50% higher (BDE 028)
to over 14 times higher (Ranitidine) than the analyte's highest non-outlier measurement. These outliers
are clearly visible in the histograms within Figures 1 through 5. Finally, Table 9 indicates that the outliers
are associated with a variety of facilities, and no one facility tends to be the source of many outliers
(which would have suggested a possible issue with that facility which would make its measurements
incompatible with the distribution of measurements from the other facilities).
Table 9. Detected Facility Measurements Labeled as Statistical Outliers by Outlier Tests for 84
Analytes Measured in the TNSSS.
Analyte | Number of Detected Measurements | Highest detected measurement not classified as an outlier | Detected measurements labeled as outliers at a 5% Significance Level (survey ID of facility is in parentheses)
Metals (mg/kg)
Antimony | 64 | 9.89 | 20.5 (20)
Arsenic | 74 | 29.8 | 49.2 (55)
Cobalt | 74 | 23.9 | 97.2 (57); 290 (37)
Nickel | 74 | 255 | 508 (2); 526 (71)
Thallium | 70 | 0.50 | 1.68 (55)
Tin | 70 | 226 | 522 (7)
Titanium | 73 | 732 | 1,930 (4); 4,510 (27); 4,805 (18)
Vanadium | 74 | 190 | 617 (4)
Zinc | 74 | 2,479 | 8,550 (57)
PBDEs (ng/kg)
BDE 028 | 78 | 77,000 | 120,000 (70); 160,000 (48)
Pharmaceuticals (ug/kg)
Minocycline | 29 | 1,590 | 8,650 (9)
Oxytetracycline (OTC) | 28 | 139 | 467 (62)
Ranitidine | 44 | 154 | 2,250 (47)
While the presence of large outliers has the potential for impacting the 95th percentile estimates
considerably, no evidence was apparent to exclude any of the measurements listed in Table 9 from the
calculation of 95th percentile estimates due to quality concerns. However, if other concerns remain for
these outliers, nonparametric approaches tend to be less impacted by the presence of outliers compared
to the approaches that are specific to a distributional model form.
95th percentile estimates for the pharmaceuticals, steroids, and hormones (use of nonparametric
estimation techniques). For the pharmaceuticals, steroids, and hormones, the relatively high non-detect
percentages warranted that the 95th percentile estimates should be based upon nonparametric K-M
techniques (e.g., MLE techniques tend to yield unstable estimates when the percentage of non-detects is
high). Table 10 lists the recommended estimates of the 95th percentile for these analytes, along with their
maximum observed values.
Table 10. Recommended (Nonparametric) Estimates of the 95th Percentile for the
Pharmaceuticals, Steroids, and Hormones, Along with the Maximum Observed Concentration
Analyte | 95th Percentile | Observed Maximum Conc.
Pharmaceuticals (ug/kg)
1,7-Dimethylxanthine | 2,868 | 9,580
4-EOTC | 43.5 | 54.9
4-Epianhydrotetracycline (EATC) | 702 | 2,160
Acetaminophen | 1,156 | 1,300
Anhydrotetracycline (ATC) | 693 | 1,960
Caffeine | 602 | 1,110
Clarithromycin | 168 | 617
Codeine | 89.8 | 328
Cotinine | 241 | 690
Dehydronifedipine | 9.42 | 21.7
Demeclocycline | 121 | 200
Diltiazem | 126 | 225
Enrofloxacin | 33.1 | 66.0
Gemfibrozil | 900 | 2,650
Ibuprofen | 3,291 | 11,900
Lincomycin | 18.7 | 33.4
Lomefloxacin | 34.6 | 39.8
Metformin | 709 | 1,160
Minocycline | 2,261 | 8,650
Naproxen | 305 | 1,020
Norfloxacin | 448 | 995
Oxytetracycline (OTC) | 137 | 467
Ranitidine | 467 | 2,250
Roxithromycin | 15.9 | 22.4
Sarafloxacin | 538 | 1,980
Sulfachloropyridazine | 32.6 | 58.7
Sulfadiazine | 49.1 | 140
Sulfadimethoxine | 15.6 | 62.2
Sulfamethazine | 21.8 | 23.2
Sulfamethoxazole | 142 | 651
Sulfanilamide | 3,715 | 15,600
Thiabendazole | 112 | 238
Trimethoprim | 74.2 | 204
Virginiamycin | 183 | 469
Steroids/Hormones (ug/kg)
17 Alpha-estradiol | 23.4 | 48.8
17 Beta-estradiol | 67.0 | 222
Androstenedione | 774 | 1,520
Androsterone | 363 | 1,030
Beta-Estradiol 3-Benzoate | 650 | 1,850
Beta-Sitosterol | 751,640 | 1,640,000
Desmosterol | 40,145 | 94,400
Equilin | 51.7 | 100
Ergosterol | 51,566 | 91,900
Estriol | 91.7 | 232
Estrone | 375 | 965
Norethindrone | 293 | 1,360
Norgestrel | 302 | 1,300
Progesterone | 797 | 1,290
Testosterone | 526 | 2,040
Note that the 95th percentile estimates (second column of Table 10) are, on average, 43 percent of the
size of the observed maximum concentration (last column). These estimates range from 21 percent
(Ranitidine, which has a large outlier as noted in Table 9) to 94 percent (Sulfamethazine) of the observed
maximum. These estimates tended to be in line with the estimates from other techniques, and more
importantly, do not appear to be underestimates.
95th percentile estimates for the non-prioritized metals, anions, organics, and PBDEs (use of
lognormal estimation techniques). The metals, anions, organics, and PBDEs had relatively high
percentages of detected measurements which tended to be well-represented by a lognormal distribution.
Table 11 lists the recommended lognormal-based estimates of the 95th percentiles for these analytes
along with their maximum observed values. Like the pharmaceuticals, steroids, and hormones in Table
10, the 95th percentiles in Table 11 are 43% of the observed maximum concentrations, on average. They
range from 8% (cobalt, which had two large outliers as noted in Table 9) to 86% (Bis(2-ethylhexyl)
phthalate) of the observed maximum. They are similar in magnitude to the nonparametric estimates for
these analytes.
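The percent-of-maximum figures quoted for Tables 10 and 11 are simple ratios of the recommended 95th percentile to the observed maximum. The short Python sketch below recomputes the four extremes named in the text from the tabulated values; the helper name `percent_of_max` is hypothetical, and only these four rows are included, so the overall averages are not reproduced.

```python
# (analyte, recommended 95th percentile, observed maximum), copied from Tables 10 and 11
rows = [
    ("Ranitidine", 467, 2250),
    ("Sulfamethazine", 21.8, 23.2),
    ("Cobalt", 22.0, 290),
    ("Bis(2-ethylhexyl) phthalate", 266644, 310000),
]


def percent_of_max(p95, observed_max):
    """95th percentile estimate expressed as a percentage of the observed maximum."""
    return 100.0 * p95 / observed_max


for name, p95, observed_max in rows:
    print(f"{name}: {percent_of_max(p95, observed_max):.0f}% of observed maximum")
```

Rounded to whole percentages, these ratios agree with the 21, 94, 8, and 86 percent values cited above.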
Table 11. Recommended (Lognormal-Based) Estimates of the 95th Percentile for the 35 Metals,
Organics, Anions, and PBDEs, Along with the Maximum Observed Concentration.
Analyte | 95th Percentile | Observed Maximum Conc.
Metals (mg/kg)
Aluminum | 34,255 | 57,300
Antimony | 6.49 | 20.5
Arsenic | 15.7 | 49.2
Boron | 112 | 131
Cadmium | 6.68 | 11.8
Calcium | 100,753 | 243,000
Chromium | 227 | 1,160
Cobalt | 22.0 | 290
Copper | 1,298 | 1,720
Iron | 78,323 | 131,000
Lead | 220 | 350
Magnesium | 12,096 | 18,050
Mercury | 3.09 | 7.50
Nickel | 115 | 526
Phosphorus | 44,114 | 69,400
Selenium | 15.6 | 24.2
Sodium | 7,653 | 26,600
Thallium | 0.439 | 1.68
Tin | 108 | 522
Titanium | 674 | 4,805
Vanadium | 101 | 617
Yttrium | 12.8 | 26.3
Zinc | 2,087 | 8,550
Organics (ug/kg)
2-Methylnaphthalene | 728 | 4,600
Benzo(a)pyrene | 2,194 | 4,000
Bis(2-ethylhexyl) phthalate | 266,644 | 310,000
Classicals (mg/kg)
Fluoride | 135 | 234
Water-Extractable Phosphorus | 4,910 | 9,550
PBDEs (ng/kg)
BDE-028 | 38,006 | 160,000
BDE-066 | 45,781 | 110,000
BDE-085 | 69,656 | 150,000
BDE-100 | 387,979 | 1,100,000
BDE-138 | 19,114 | 40,000
BDE-154 | 149,085 | 440,000
BDE-183 | 41,314 | 120,000
Note from Table 8 that only modest differences in the 95th percentile estimates occur between the
lognormal-based and nonparametric approaches for the 12 analytes in Table 11 for which the goodness-
of-fit test for lognormality was rejected. Thus, taking a lognormal approach to estimating 95th percentiles
for each of the analytes in Table 11 is not highly impactful when lognormality is rejected.
Updated 95th percentile estimates for an analyte in the 2009 report having an outlier excluded. The
95th percentile estimates in Tables 10 and 11 utilized all available data without excluding any of the
outliers listed in Table 9. In contrast, USEPA (2009a) presented the 95th percentile estimate for silver
upon excluding one outlier (856 mg/kg) from the calculation. This outlier was suspected to be the result
of an anomaly to normal operations at the POTW, although the value of the sample analysis was
confirmed with the facility (USEPA, 2009b). The 95th percentile estimates for silver were as follows:
95th percentile estimate with outlier excluded: 57 mg/kg (as reported in USEPA, 2009a).
95th percentile estimate with outlier included: 74 mg/kg (a 30 percent increase).
Note that among the other analytes in the 2009 report, one sample measurement for cimetidine and two
sample measurements for fluoxetine were also omitted from estimation in USEPA (2009a), but the
exclusions were due to failing chemical quality assurance criteria rather than classification as a statistical
outlier.
Comparing 95th percentile estimates with estimates that result from applying the analysis
approach used on prioritized analytes in the 2009 report. Table B-7 of USEPA (2009a) included
preliminary estimates of the 95th percentile for the 84 analytes in this report using the statistical
techniques that were applied to the 34 prioritized analytes. Table 12 replicates the estimates from this
table, as a means of comparing to the 95th percentile estimates given in Tables 10 and 11. The 2009
statistical analysis accounted for the survey weights assigned to the sampled POTWs and the survey's
stratified sample design.
Table 12. Weighted Summary Statistics and 95th Percentile Estimates for the 84 Analytes, Using
Statistical Techniques Applied in the Weighted (Preliminary) Analysis Performed in USEPA
(2009a).
Analyte | # Sampled POTWs | Mean | Standard Deviation | Median | 95th Percentile
Metals (mg/kg)
Aluminum | 74 | 13,477.80 | 10,020.66 | 11,200.00 | 34,525.52
Antimony | 74 | 2.26 | 2.99 | 1.42 | 14.18
Arsenic | 74 | 6.76 | 6.84 | 4.95 | 15.13
Boron | 74 | 43.25 | 33.70 | 33.00 | 122.42
Cadmium | 74 | 2.48 | 2.28 | 1.72 | 6.09
Calcium | 74 | 39,539.11 | 39,847.24 | 25,950.00 | 96,371.30
Chromium | 74 | 78.15 | 152.58 | 30.60 | 212.92
Cobalt | 74 | 10.99 | 36.71 | 4.44 | 21.51
Copper | 74 | 558.54 | 368.89 | 449.00 | 1,330.71
Iron | 74 | 24,742.64 | 27,716.08 | 13,250.00 | 71,425.51
Lead | 74 | 73.96 | 73.51 | 44.40 | 210.31
Magnesium | 74 | 4,705.62 | 2,978.38 | 4,300.00 | 11,295.55
Mercury | 74 | 1.27 | 1.29 | 0.83 | 3.20
Nickel | 74 | 47.38 | 92.09 | 22.80 | 108.42
Phosphorus | 74 | 21,668.72 | 11,761.54 | 18,300.00 | 43,262.02
Selenium | 74 | 7.10 | 4.18 | 6.20 | 15.97
Sodium | 74 | 2,873.59 | 5,102.50 | 1,110.00 | 8,344.24
Thallium | 74 | 0.17 | 0.21 | 0.13 | 0.41
Tin | 74 | 43.54 | 40.38 | 36.20 | 102.33
Titanium | 74 | 221.31 | 601.17 | 80.90 | 627.73
Vanadium | 74 | 33.94 | 79.63 | 11.60 | 86.76
Yttrium | 74 | 4.55 | 3.63 | 3.54 | 12.07
Zinc | 74 | 969.77 | 1,054.80 | 759.00 | 2,110.95
Organics (ug/kg)
2-Methylnaphthalene | 74 | 449.04 | 746.50 | 200.00 | 1,111.65
Benzo(A)Pyrene | 74 | 661.00 | 849.06 | 320.00 | 2,259.31
Bis(2-Ethylhexyl) Phthalate | 74 | 48,142.54 | 65,207.23 | 23,000.00 | 226,937.29
Anions (mg/kg)
Fluoride | 74 | 58.20 | 35.87 | 54.20 | 132.68
Water-Extractable Phosphorus | 74 | 1,062.09 | 1,770.57 | 480.00 | 5,012.47
PBDEs (ng/kg)
BDE 28 | 78 | 13,990.24 | 20,783.92 | 8,500.00 | 33,076.02
BDE 66 | 78 | 16,536.70 | 16,088.17 | 12,000.00 | 41,134.17
BDE 85 | 78 | 27,824.89 | 20,202.11 | 23,000.00 | 66,312.15
BDE 100 | 78 | 148,973.10 | 125,545.38 | 120,000.00 | 362,133.61
BDE 138 | 78 | 10,807.30 | 12,722.42 | 7,000.00 | 20,822.02
BDE 154 | 78 | 58,730.15 | 50,756.61 | 49,000.00 | 143,826.47
BDE 183 | 78 | 15,079.78 | 17,215.83 | 10,000.00 | 36,522.57
Pharmaceuticals (ug/kg)
1,7-Dimethylxanthine | 78 | 1,180.46 | 1,088.76 | 986.50 | 1,440.00
4-Epioxytetracycline (EOTC) | 78 | 45.30 | 11.66 | 41.50 | 68.60
4-Epianhydrotetracycline (EATC) | 78 | 251.31 | 301.11 | 140.00 | 797.00
Acetaminophen | 78 | 461.80 | 200.38 | 395.50 | 973.00
Anhydrotetracycline (ATC) | 78 | 262.91 | 283.89 | 153.00 | 680.00
Caffeine | 78 | 231.59 | 239.33 | 103.00 | 881.00
Clarithromycin | 78 | 41.58 | 81.76 | 13.40 | 141.00
Codeine | 78 | 30.63 | 40.75 | 19.90 | 70.40
Cotinine | 78 | 57.97 | 120.79 | 13.20 | 332.00
Dehydronifedipine | 78 | 5.03 | 3.12 | 4.04 | 10.70
Demeclocycline | 78 | 105.97 | 24.36 | 99.20 | 147.00
Diltiazem | 78 | 40.20 | 56.35 | 14.80 | 199.00
Enrofloxacin | 78 | 27.87 | 30.69 | 19.80 | 66.00
Gemfibrozil | 78 | 213.56 | 437.13 | 101.00 | 665.00
Ibuprofen | 78 | 652.80 | 1,703.48 | 143.00 | 2,980.00
Lincomycin | 78 | 30.20 | 27.43 | 19.90 | 85.10
Lomefloxacin | 78 | 22.93 | 15.84 | 19.80 | 33.30
Metformin | 77 | 533.68 | 451.80 | 546.00 | 1,160.00
Minocycline | 67 | 660.76 | 1,090.03 | 433.00 | 1,180.00
Naproxen | 78 | 86.20 | 146.58 | 31.60 | 316.00
Norfloxacin | 78 | 274.57 | 699.46 | 109.00 | 684.00
Oxytetracycline (OTC) | 78 | 57.87 | 53.47 | 43.15 | 113.00
Ranitidine | 77 | 57.66 | 276.53 | 12.50 | 89.80
Roxithromycin | 78 | 8.10 | 9.17 | 4.72 | 22.35
Sarafloxacin | 78 | 293.65 | 718.92 | 91.90 | 1,150.00
Sulfachloropyridazine | 77 | 11.96 | 9.91 | 9.84 | 14.00
Sulfadiazine | 77 | 13.61 | 18.22 | 9.84 | 22.90
Sulfadimethoxine | 77 | 3.57 | 7.67 | 2.01 | 7.35
Sulfamethazine | 77 | 7.38 | 12.57 | 4.02 | 21.50
Sulfamethoxazole | 77 | 21.65 | 81.60 | 4.32 | 67.70
Sulfanilamide | 77 | 536.88 | 2,110.35 | 99.20 | 2,390.00
Thiabendazole | 78 | 36.59 | 49.33 | 16.50 | 137.00
Trimethoprim | 78 | 30.37 | 37.72 | 10.80 | 114.00
Virginiamycin | 78 | 137.50 | 233.05 | 73.30 | 469.00
Steroids and Hormones (ug/kg)
17 Alpha-Estradiol | 73 | 22.54 | 6.45 | 21.40 | 27.20
17 Beta-Estradiol | 78 | 34.33 | 40.48 | 21.50 | 131.00
Androstenedione | 73 | 326.82 | 325.94 | 158.00 | 1,100.00
Androsterone | 73 | 120.29 | 130.72 | 84.90 | 332.00
Beta-Estradiol 3-Benzoate | 74 | 146.80 | 345.64 | 23.20 | 695.00
Beta-Sitosterol | 78 | 291,398.60 | 294,849.73 | 207,000.00 | 885,000.00
Desmosterol | 78 | 15,654.68 | 16,484.25 | 10,800.00 | 38,500.00
Equilin | 73 | 34.77 | 22.37 | 23.00 | 80.60
Ergosterol | 78 | 19,829.93 | 18,535.97 | 12,600.00 | 56,100.00
Estriol | 74 | 38.70 | 38.78 | 24.80 | 128.00
Estrone | 73 | 105.97 | 160.61 | 51.20 | 326.00
Norethindrone | 76 | 101.84 | 338.51 | 22.30 | 146.00
Norgestrel | 74 | 66.94 | 155.02 | 42.00 | 111.00
Progesterone | 77 | 322.37 | 355.78 | 139.00 | 1,260.00
Testosterone | 73 | 162.85 | 270.69 | 95.20 | 511.00
Taken from Table B-7 of USEPA (2009a).
Like the analyses presented in Tables 10 and 11, the weighted analysis estimates presented in Table 12
utilized a nonparametric approach for pharmaceuticals, steroids, and hormones, and a lognormal-based
approach for all other analytes:
The weighted lognormal approach is documented in Section C.1 of Appendix C of USEPA (2009a).
This approach used Cohen's MLE techniques when non-detects were present.
The nonparametric approach is documented in Section C.2 of Appendix C of USEPA (2009a). It
utilized a weighted order statistics approach to identifying the 95th percentile, but substituted non-
detects with the detection limit.
In general, the 95th percentile estimates in the last column of Table 12 compared favorably with the
estimates given in Tables 10 and 11. The following specific findings were noted:
For metals, organics, anions, and PBDEs, the estimates differed on average by about one percent.
The weighted analysis tended to yield larger estimates than the above unweighted analyses. The
largest observed difference was a 54% decrease, from 14.2 to 6.5 mg/kg, in the 95th percentile estimate
for antimony from the weighted analysis estimate to the unweighted estimate in this report.
For pharmaceuticals and steroids/hormones, the difference was about 16 percent, on average. Larger
differences between the two methods were observed, in part due to the nonparametric approach and
the smaller number of detected outcomes compared to the other analytes. The number of analytes with
estimates from the weighted analysis that were lower than the estimates presented in Table 10 was
about equal to the number that had higher estimates.
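The comparison between the weighted estimates of Table 12 and the unweighted estimates of Table 11 amounts to a percent change per analyte. The Python sketch below shows the arithmetic for three metals using values copied from the two tables. Expressing the change relative to the weighted estimate is an assumption, chosen because it reproduces the 54% decrease quoted for antimony; the function name is hypothetical, and the three rows are not sufficient to reproduce the reported averages.

```python
# (analyte, weighted estimate from Table 12, unweighted estimate from Table 11), in mg/kg
pairs = [
    ("Aluminum", 34525.52, 34255.0),
    ("Antimony", 14.18, 6.49),
    ("Arsenic", 15.13, 15.7),
]


def percent_change(weighted, unweighted):
    """Change from the weighted to the unweighted estimate, as a percentage of the weighted one."""
    return 100.0 * (unweighted - weighted) / weighted


changes = {name: percent_change(w, u) for name, w, u in pairs}

for name, change in changes.items():
    direction = "decrease" if change < 0 else "increase"
    print(f"{name}: {abs(change):.1f}% {direction}")
```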
Table 13 lists the 34 prioritized analytes and estimates of the 95th percentile under the in-depth (weighted)
analysis used in USEPA (2009a), as well as both the lognormal-based and nonparametric (unweighted)
approaches used for the non-prioritized analytes in this report. The lognormal-based unweighted
estimates averaged about 7% lower than the weighted estimates for these analytes. (The weighted
analysis for all but one of these analytes was lognormal-based.) The nonparametric unweighted
estimates averaged 17% lower than the weighted estimates. Thus, using techniques that utilize a
lognormal distributional assumption, the 95th percentile estimates differ as a whole in only a minor way
between the weighted and unweighted approaches.
Thus, as a result of this investigation, it is not apparent that accounting for the weighting and stratified
sample design as was done by using the in-depth analysis approach (Table 12) would lead to
considerably different estimates for the 95th percentile compared to the results from the unweighted
analysis that are presented in Tables 10 and 11.
Table 13. 95th Percentile Estimates for the Prioritized Analytes, as Reported in USEPA (2009a),
and Unweighted Estimates Generated by ProUCL.
Analyte | 95th Percentile Estimate as reported in USEPA (2009a) [1] | Unweighted Estimate from ProUCL - Lognormal | Unweighted Estimate from ProUCL - Nonparametric
Metals (mg/kg)
Barium | 1,396 | 1,336 | 1,674
Beryllium | 1.04 | 1.06 | 0.99
Manganese | 4,156 | 4,020 | 3,430
Molybdenum | 40.5 | 40.9 | 43.5
Silver | 57 | 71.5 | 63.6
Organics (ug/kg)
4-Chloroaniline | 4,762 | 3,541 | 2,648
Fluoranthene | 5,256 | 5,774 | 5,374
Pyrene | 6,184 | 6,398 | 6,477
Classicals (mg/kg)
Nitrate/Nitrite | 960 | 473 | 712
PBDEs (ng/kg)
BDE-47 | 1,688,881 | 1,776,508 | 1,575,000
BDE-99 | 1,713,370 | 1,812,193 | 1,530,000
BDE-153 | 166,454 | 170,769 | 150,000
BDE-209 | 7,360,103 | 8,029,037 | 7,606,248
Pharmaceuticals (ug/kg)
4-Epitetracycline (ETC) | 3,787 | 3,513 | 2,470
Azithromycin | 3,172 | 2,689 | 2,484
Carbamazepine | 497 | 468 | 1,317
Cimetidine* | 4,789 | 3,631 | 3,429
Ciprofloxacin | 36,095 | 34,531 | 21,690
Diphenhydramine | 2,696 | 2,662 | 2,005
Doxycycline | 3,082 | 2,348 | 1,988
Erythromycin-Total | 123 | 103 | 82.8
Fluoxetine* | 778 | 688 | 863
Miconazole | 4,652 | 3,643 | 3,417
Ofloxacin | 32,363 | 27,133 | 19,753
Tetracycline (TC) | 4,458 | 4,185 | 2,823
Triclocarban | 131,079 | 144,599 | 95,475
Triclosan | 62,217 | 63,043 | 40,268
Steroids and Hormones (ug/kg)
Beta Stigmastanol | 632,009 | 631,228 | 504,913
Campesterol | 360,119 | 360,990 | 257,550
Cholestanol | 2,629,149 | 2,519,426 | 1,446,500
Cholesterol | 4,369,111 | 3,355,221 | 1,976,463
Coprostanol | 16,626,022 | 16,249,696 | 8,001,500
Epicoprostanol | 5,143,938 | 5,948,141 | 2,716,385
Stigmasterol | 1,157,099 | 365,893 | 281,498
[1] In-depth analysis was based on a lognormal approach for all but nitrate/nitrite, for which a nonparametric approach
was used.
Aluminum
Antimony
Arsenic
20 30 40 50 60 70
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
0 20 30 40 50 60 70
Number of Samples/Facilities
Boron
Cadmium
Calcium
20 30 40 50 60 70
Number of Samples/Facilities
Chromium
10 20 30 40 50 60 70
Number of Samples/Facilities
20 30 40 50 60 70
Number of Samples/Facilities
Cobalt
Copper
20 30 40 50 60 70
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
20 30 40 50 60 70
Number of Samples/Facilities
Iron
Lead
Magnesium
20 30 40 50 60 70
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
20 30 40 50 60 70
Number of Samples/Facilities
Figure 1. Histograms of Facility-Specific Concentrations for Non-Prioritized Metals in the TNSSS.
24
-------
RESULTS
Mercury
Nickel
Phosphorus
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
20 30 40 50
Number of Samples/Facilities
Selenium
Sodium
Thallium
Cone, (mg/kg)
2,000
20 30 40 50 60
Number of Samples/Facilities
Tin
10 20 30 40 50 60 70
Number of Samples/Facilities
Titanium
20 30 40 50
Number of Samples/Facilities
Vanadium
Yttrium
20 30 40 50 60
Number of Samples/Facilities
20 30 40 50
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
Zinc
3,600
4,800
6,000
Number of Samples/Facilities
10 20 30 40 50 60 70
Number of Samples/Facilities
Figure 1. (cont.)
Figure 2. Histograms of Facility-Specific Concentrations for Non-Prioritized Organics and Classicals
(Anions) in the TNSSS.
[Histogram panels, one per analyte: 2-Methylnaphthalene, Benzo(a)pyrene, Bis(2-Ethylhexyl) Phthalate,
Fluoride, Water-Extractable Phosphorus. Axes: concentration and Number of Samples/Facilities.]
Figure 3. Histograms of Facility-Specific Concentrations for Non-Prioritized PBDEs in the TNSSS.
[Histogram panels, one per analyte: BDE 028, BDE 066, BDE 085, BDE 100, BDE 138, BDE 154, BDE 183.
Axes: Conc. (ng/kg) and Number of Samples/Facilities.]
Figure 4. Histograms of Facility-Specific Concentrations for Non-Prioritized Pharmaceuticals in the
TNSSS.
[Histogram panels, one per analyte: 1,7-Dimethylxanthine, 4-Epioxytetracycline (EOTC),
4-Epianhydrotetracycline (EATC), Acetaminophen, Anhydrotetracycline (ATC), Caffeine, Clarithromycin,
Codeine, Cotinine, Dehydronifedipine, Demeclocycline, Diltiazem, Enrofloxacin, Gemfibrozil, Ibuprofen,
Lincomycin, Lomefloxacin, Metformin, Minocycline, Naproxen, Norfloxacin, Oxytetracycline (OTC),
Ranitidine, Roxithromycin, Sarafloxacin, Sulfachloropyridazine, Sulfadiazine, Sulfadimethoxine,
Sulfamethazine, Sulfamethoxazole, Sulfanilamide, Thiabendazole, Trimethoprim, Virginiamycin.
Axes: concentration and Number of Samples/Facilities.]
Figure 5. Histograms of Facility-Specific Concentrations for Non-Prioritized Steroids/Hormones in the
TNSSS.
[Histogram panels, one per analyte: 17 Alpha-Estradiol, 17 Beta-Estradiol, Androstenedione,
Androsterone, Beta-Estradiol 3-Benzoate, Beta-Sitosterol, Desmosterol, Equilin, Ergosterol, Estriol,
Estrone, Norethindrone, Norgestrel, Progesterone, Testosterone. Axes: concentration and Number of
Samples/Facilities.]
4. Key Findings and Conclusions
This report presented estimates of the 95th percentile for 84 additional analytes that were measured in
the treated biosolids sampled within the TNSSS. Each of these 84 analytes had at least two detected
outcomes among the tested biosolids from the sampled facilities. The statistical techniques available in
EPA's ProUCL open-source software tool were applied to yield the 95th percentile estimates. ProUCL was
especially relevant here because measurements in the TNSSS were frequently below detection limits and
because multiple detection limit values were observed for a given analyte.
ProUCL offers rigorous statistical estimation techniques that handle non-detects more appropriately than
simple substitution methods, which replace each non-detect with a fixed value and then treat it as a
detected outcome. These estimation techniques allow for non-detects at multiple detection limits. They
include the Kaplan-Meier nonparametric technique and regression on order statistics (ROS) methods,
which extrapolate values for non-detects from the information available in the detected outcomes.
Conclusions for the 84 analytes:
For the metals, organics, anions, and PBDEs, which tended to have a high prevalence of detected
outcomes, a lognormal-based approach was recommended for estimating the 95th percentile. When
non-detects were present for a given analyte, ROS estimates were assigned to the non-detects. These
ROS estimates were obtained by extrapolating from an ordinary least squares regression line fitted to
the log-transformed detected outcomes and their corresponding normal scores.
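The ROS step described above can be illustrated with a minimal Python sketch. It is not ProUCL's implementation. It assumes a single detection limit that lies below every detected value (so the non-detects occupy the lowest ranks), Blom plotting positions for the normal scores, and a lognormal percentile computed from the mean and standard deviation of the logged detected and imputed values. ProUCL handles multiple detection limits and may use different plotting positions. The function name and the example concentrations are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ros_lognormal_p95(detects, n_nondetects):
    """Simplified lognormal ROS (assumption: all non-detects rank below all detects).

    Returns (imputed non-detect values, estimated 95th percentile).
    """
    if len(detects) < 2:
        raise ValueError("need at least two detected outcomes to fit a line")
    nd = NormalDist()
    n = len(detects) + n_nondetects
    # Normal scores from Blom plotting positions for ranks 1..n (an assumption).
    z = [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    z_nondet, z_det = z[:n_nondetects], z[n_nondetects:]
    # Log-transformed detects, sorted so each pairs with its normal score.
    y = sorted(math.log(x) for x in detects)
    # Ordinary least squares fit of log concentration on normal score.
    z_bar, y_bar = mean(z_det), mean(y)
    sxx = sum((zi - z_bar) ** 2 for zi in z_det)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z_det, y))
    slope = sxy / sxx
    intercept = y_bar - slope * z_bar
    # Extrapolate the fitted line to the normal scores of the non-detects.
    imputed_logs = [intercept + slope * zi for zi in z_nondet]
    # Lognormal 95th percentile from the combined (imputed + detected) logs.
    logs = imputed_logs + y
    p95 = math.exp(mean(logs) + nd.inv_cdf(0.95) * stdev(logs))
    return [math.exp(v) for v in imputed_logs], p95


# Hypothetical concentrations: six detects and four non-detects.
imputed, p95 = ros_lognormal_p95([12.0, 18.5, 25.0, 40.0, 66.0, 150.0], 4)
print(imputed, p95)
```

Because the fitted slope is positive whenever the detected values are not all equal, the imputed values increase with rank and fall below the geometric mean of the detects.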
For the pharmaceuticals, steroids, and hormones, which often had detection percentages below 50%, a
nonparametric Kaplan-Meier approach was recommended for estimating the 95th percentile. With so few
detected outcomes, percentile estimates from parametric approaches were less stable and less
defensible, and goodness-of-fit test outcomes were less certain and inconsistent across analytes.
This is in accord with ProUCL recommendations, which favor nonparametric techniques when detection
percentages are low.
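As a rough illustration of the Kaplan-Meier idea for left-censored data, the Python sketch below estimates the cumulative distribution function at each distinct detected value and reads the percentile as the smallest detected value whose estimated CDF reaches the target. This is one common convention and is an assumption here. ProUCL may derive its Kaplan-Meier-based percentile differently. The function names and data are hypothetical.

```python
def km_cdf_left_censored(values, detected):
    """Kaplan-Meier CDF for left-censored data.

    values   -- measured value for a detect, detection limit for a non-detect
    detected -- True for a detect, False for a non-detect
    Returns [(y, F(y))] for each distinct detected value y, in ascending order.
    """
    if len(values) != len(detected):
        raise ValueError("values and detected must have the same length")
    obs = list(zip(values, detected))
    det_vals = sorted({v for v, d in obs if d})
    cdf = {}
    f = 1.0
    # Work downward from the largest detect: F(y) is the product of
    # (1 - d_j / n_j) over all distinct detected values y_j greater than y.
    for y in reversed(det_vals):
        cdf[y] = f
        n_at_risk = sum(1 for v, _ in obs if v <= y)      # detects and DLs <= y
        d_j = sum(1 for v, d in obs if d and v == y)      # detects equal to y
        f *= 1.0 - d_j / n_at_risk
    return [(y, cdf[y]) for y in det_vals]


def km_percentile(values, detected, p=0.95):
    """Smallest detected value whose estimated CDF is at least p."""
    for y, f in km_cdf_left_censored(values, detected):
        if f >= p:
            return y
    raise ValueError("no detected values supplied")


# Hypothetical data: four non-detects reported at three detection limits.
vals = [0.5, 0.5, 1.0, 1.2, 2.0, 2.0, 3.1, 4.0, 6.5, 9.0]
dets = [False, False, False, True, False, True, True, True, True, True]
print(km_cdf_left_censored(vals, dets))
print(km_percentile(vals, dets, p=0.95))
```

Note that the non-detects enter only through the counts `n_at_risk`, so no value is ever substituted for them, and detection limits may differ from sample to sample.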
While the sample data occasionally contained large measurement values for selected analytes, evidence
was insufficient to warrant excluding these measurements from the analysis. In addition, outliers were
not clustered within particular facilities, nor were they flagged in the survey database with data
qualifiers that would have suggested invalidity. However, it is appropriate to assess how large values
may affect the estimates by performing the analysis both with and without the outliers.
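A with-and-without comparison of this kind might be sketched in Python as follows. The sketch assumes fully detected data and a lognormal 95th percentile, and it takes the indices of the suspected outliers from the caller rather than applying any particular outlier test. The function names and concentrations are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_p95(data):
    """Lognormal 95th percentile from the mean and sd of the logged data."""
    logs = [math.log(x) for x in data]
    return math.exp(mean(logs) + NormalDist().inv_cdf(0.95) * stdev(logs))


def outlier_sensitivity(data, flagged_indices):
    """Compare the estimate with all data against the estimate without flagged values.

    Returns (p95_with, p95_without, relative_change), where relative_change is
    the fraction by which the estimate drops when flagged values are removed.
    """
    flagged = set(flagged_indices)
    kept = [x for i, x in enumerate(data) if i not in flagged]
    p95_with = lognormal_p95(data)
    p95_without = lognormal_p95(kept)
    return p95_with, p95_without, (p95_with - p95_without) / p95_with


# Hypothetical concentrations with one unusually large value at index 6.
conc = [10.0, 12.0, 15.0, 18.0, 22.0, 30.0, 500.0]
print(outlier_sensitivity(conc, [6]))
```

A large relative change signals that the percentile estimate depends heavily on a few values, which is useful context even when those values are retained.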
5. References
Cohen, A. C. Jr. 1950. Estimating the Mean and Variance of Normal Populations from Singly Truncated
and Double Truncated Samples. Ann. Math. Statist., Vol. 21, pp. 557-569.
Cohen, A. C. Jr. 1959. Simplified Estimators for the Normal Distribution When Samples Are Singly
Censored or Truncated. Technometrics, Vol. 1, No. 3, pp. 217-237.
Conover, W.J. 1999. Practical Nonparametric Statistics, 3rd Edition, John Wiley & Sons, New York.
CSC, 2007. CSC's Sampling and Analysis Report for the 2006-2007 Targeted National Sewage Sludge
Survey. Technical report prepared by CSC Environmental Solutions to Rick Stevens, Office of Water,
U.S. Environmental Protection Agency. 30 September 2007.
D'Agostino, R.B. and Stephens, M.A. 1986. Goodness-of-Fit Techniques. Marcel Dekker, Inc.
Dixon, W.J. 1953. Processing data for outliers. Biometrics. Vol. 9, No. 1, pp. 74-89.
Dudewicz, E.J. and Mishra, S.N. 1988. Modern Mathematical Statistics. John Wiley, New York.
Gilbert, R.O. 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold,
New York.
Gilliom, R. H. and Helsel, D.R. 1986. Estimations of Distributional Parameters for Censored Trace Level
Water Quality Data 1. Estimation Techniques. Water Resources Research, 22, pp. 135-146.
Helsel, D.R. 2005. Nondetects and Data Analysis. Statistics for Censored Environmental Data. John
Wiley and Sons, NY.
Helsel, D.R. 1990. Less Than Obvious: Statistical Treatment of Data Below the Detection Limit.
Environmental Science and Technology, Vol. 24, No. 12, pp. 1767-1774.
Kaplan, E.L. and Meier, P. 1958. Nonparametric Estimation from Incomplete Observations. Journal of the
American Statistical Association, Vol. 53, pp. 457-481.
NRC, 2002. Biosolids Applied to Land: Advancing Standards and Practices. Report by the National
Research Council. Washington, DC: National Academy Press. July 2002.
Stephens, M. A. 1970. Use of Kolmogorov-Smirnov, Cramer-von Mises and Related Statistics Without
Extensive Tables. Journal of Royal Statistical Society, Vol. B 32, pp. 115-122.
USEPA. 2009a. Targeted National Sewage Sludge Survey: Statistical Analysis Report. Washington,
DC: Office of Water, U.S. Environmental Protection Agency. EPA-822-R-08-018.
USEPA. 2009b. Targeted National Sewage Sludge Survey: Sampling and Analysis Technical Report.
Washington, DC: Office of Water, U.S. Environmental Protection Agency. EPA-822-R-08-016.