Targeted National Sewage Sludge Survey (TNSSS): Summary Statistics and Estimates of 95th Percentiles for 84 Additional Analytes

U.S. Environmental Protection Agency
Office of Water (4301T)
Office of Science and Technology
Health and Ecological Criteria Division
1200 Pennsylvania Avenue, NW
Washington, DC 20460

EPA 822-R-21-003
April 2021

Prepared by:
Battelle
505 King Avenue
Columbus, Ohio 43201
EPA Contract No. EP-W-09-024, Work Assignment 4-04

Table of Contents

1. Introduction
   1.1 TNSSS Design Overview
   1.2 Compounds Analyzed in the TNSSS
2. Approach
   2.1 ProUCL Software
   2.2 Methods for Testing Distributional Goodness-of-Fit
   2.3 Methods for Identifying Statistical Outliers
   2.4 Methods for Estimating the 95th Percentile
      2.4.1 Detection Limit Substitution Methods
      2.4.2 MLE Methods
      2.4.3 ROS Substitution Method
      2.4.4 Kaplan-Meier Nonparametric Method
      2.4.5 Summary of 95th Percentile Calculation Methods
3. Results
4. Key Findings and Conclusions
5. References

List of Tables

Table 1. Numbers of POTWs within the TNSSS, by Average Flow Rate
Table 2. Listing of 34 Analytes Measured in the TNSSS, Whose Measurements Were Subject to In-Depth Statistical Analysis in USEPA (2009a)
Table 3. Listing of 27 Analytes Measured in the TNSSS, With No More than One Detected Outcome from Among the 84 Collected Samples of Treated Biosolids
Table 4. Listing of 84 Analytes Measured in the TNSSS, and the Percentage of Detected Outcomes from Among the 84 Collected Samples of Treated Biosolids
Table 5. Criteria for Classifying Within-Facility Aggregated Measurements as a Detect or Non-Detect in the TNSSS
Table 6. Summary Statistics for Detected Outcomes for 84 Analytes Measured in the TNSSS
Table 7. Summary of 95th Percentiles Calculation Methods in ProUCL
Table 8. Outcome of Goodness-of-Fit Tests, and Estimates of 95th Percentiles Using Various Statistical Methods and Assumptions, for 84 Analytes Measured in the TNSSS
Table 9. Detected Facility Measurements Labeled as Statistical Outliers by Outlier Tests for 84 Analytes Measured in the TNSSS
Table 10. Recommended (Nonparametric) Estimates of the 95th Percentile for the Pharmaceuticals, Steroids, and Hormones, Along with the Maximum Observed Concentration
Table 11. Recommended (Lognormal-Based) Estimates of the 95th Percentile for the 84 Metals, Organics, Anions, and PBDEs, Along with the Maximum Observed Concentration
Table 12. Weighted Summary Statistics and 95th Percentile Estimates for the 84 Analytes, Using Statistical Techniques Applied in the Weighted (Preliminary) Analysis Performed in USEPA (2009a)
Table 13. 95th Percentile Estimates for the Prioritized Analytes, as Reported in USEPA (2009a), and Unweighted Estimates Generated by ProUCL

List of Figures

Figure 1. Histograms of Facility-Specific Concentrations for Non-Prioritized Metals in the TNSSS
Figure 2. Histograms of Facility-Specific Concentrations for Non-Prioritized Organics and Classicals (Anions) in the TNSSS
Figure 3. Histograms of Facility-Specific Concentrations for Non-Prioritized PBDEs in the TNSSS
Figure 4. Histograms of Facility-Specific Concentrations for Non-Prioritized Pharmaceuticals in the TNSSS
Figure 5. Histograms of Facility-Specific Concentrations for Non-Prioritized Steroids/Hormones in the TNSSS

1. Introduction

The Targeted National Sewage Sludge Survey (TNSSS) collected and analyzed a total of 145 analytes in treated biosolids taken from a statistically representative subset of the nation's Publicly Owned Treatment Works (POTWs). The TNSSS statistical report (USEPA, 2009a) presented results of in-depth statistical analyses performed on the measurements of 34 prioritized analytes.
This report presents the results of data analyses performed on measurements for 84 additional analytes, which were not prioritized in 2009. For each of the 84 analytes, this report assesses the distribution of measurements from the TNSSS and utilizes an appropriate statistical approach to estimate the 95th percentile of the distribution based on these data. EPA's ProUCL software serves as the tool for generating these estimates, while accounting for non-detected outcomes present among the measurements. Detections of 27 analytes were too limited to conduct statistical analyses; 16 analytes had zero detections and 11 analytes had one detection.

Following a brief overview of the TNSSS and the list of analytes measured in the sampled biosolids, Section 2 of this report discusses the statistical approaches considered for estimating the 95th percentiles. Section 3 presents the estimates that were calculated under these approaches. Section 4 presents key findings and conclusions.

1.1 TNSSS Design Overview

The target population for the TNSSS consisted of 3,337 POTWs that met the following criteria:

- Were in full operation in 2002 and/or 2004,
- Had flow rates greater than 1 million gallons per day (MGD),
- Employed a minimum of secondary treatment[1],
- Were located in the contiguous United States, and
- Were neither privately-owned, non-publicly owned, nor Tribal facilities.

EPA used statistical survey sampling techniques to select POTWs from which to collect biosolids samples in the TNSSS. Sample collection occurred from August 2006 to March 2007. To ensure that the sampled facilities covered the entire range of flow rates, the sampling design divided the sample frame into three strata defined by flow rate. Table 1 shows the three strata and the sample sizes resulting from each one. USEPA (2009a) contains more detail on the TNSSS design.

Table 1. Numbers of POTWs within the TNSSS, by Average Flow Rate.
Average Flow Rate | Number of POTWs Sampled in the TNSSS
>100 MGD | 8
10 to 100 MGD | 12
1 to 10 MGD | 54
TOTAL | 74

[1] At a POTW, all wastewater first must go through the primary treatment process, which involves screening and settling out large particles. The wastewater then moves on to the secondary treatment process, during which organic matter is removed by allowing bacteria to break down the pollutants.

Within each sampled POTW, EPA collected one grab sample of biosolids for analysis, except in the following situations where two grab samples were collected:

- At six facilities, duplicate grab samples were collected.
- For four facilities that each utilized two treatment systems, EPA collected one grab sample from each system.

Therefore, EPA collected a total of 84 grab samples of treated biosolids in the TNSSS from the 74 sampled POTWs.

1.2 Compounds Analyzed in the TNSSS

The TNSSS statistical analysis report (USEPA, 2009a) presents nationally representative estimates of means and upper percentiles of the concentrations of 34 analytes measured in the biosolids samples. Table 2 lists these analytes, according to the class of chemicals in which they reside.

Table 2. Listing of 34 Analytes Measured in the TNSSS, Whose Measurements Were Subject to In-Depth Statistical Analysis in USEPA (2009a).
Metals: Barium; Beryllium; Manganese; Molybdenum; Silver
Organics: 4-Chloroaniline; Fluoranthene; Pyrene
Classicals (Anions): Nitrate/Nitrite
PBDEs: BDE-47 (2,2',4,4'-tetrabromodiphenyl); BDE-99 (2,2',4,4',5-pentabromodiphenyl); BDE-153 (2,2',4,4',5,5'-hexabromodiphenyl); BDE-209 (decabromodiphenyl)
Pharmaceuticals: 4-Epitetracycline (ETC); Azithromycin; Carbamazepine; Cimetidine; Ciprofloxacin; Diphenhydramine; Doxycycline; Erythromycin-Total; Fluoxetine; Miconazole; Ofloxacin; Tetracycline (TC); Triclocarban; Triclosan
Steroids and Hormones: Beta Stigmastanol; Campesterol; Cholestanol; Cholesterol; Coprostanol; Epicoprostanol; Stigmasterol

Along with the 34 analytes in Table 2, EPA measured the concentrations of 111 additional analytes in the biosolids samples. These 111 analytes were not subject to the in-depth data analysis featured in USEPA (2009a). Table 3 lists 27 of these non-prioritized analytes which had no more than one detected concentration reported among the 84 collected samples. The lack of a sufficient number of detected, quantifiable analysis outcomes for characterizing uncertainty in the measurements made it inappropriate to apply a rigorous statistical analysis to data for these 27 analytes (which were exclusively pharmaceuticals, steroids, and hormones).

Table 3. Listing of 27 Analytes Measured in the TNSSS, With No More than One Detected Outcome from Among the 84 Collected Samples of Treated Biosolids.
Pharmaceuticals: 4-Epianhydrochlortetracycline (EACTC); 4-Epichlortetracycline (ECTC); Albuterol; Anhydrochlortetracycline (ACTC); Carbadox; Cefotaxime; Chlortetracycline (CTC); Clinafloxacin; Cloxacillin; Digoxigenin; Digoxin; Flumequine; Isochlortetracycline (ICTC); Norgestimate; Ormetoprim; Oxacillin; Oxolinic Acid; Penicillin G; Penicillin V; Sulfamerazine; Sulfamethizole; Sulfathiazole; Tylosin; Warfarin
Steroids and Hormones: 17 Alpha-Dihydroequilin; 17 Alpha-Ethinyl-Estradiol; Equilenin

Table 4 lists the remaining 84 analytes and the percentage of collected samples of biosolids in the TNSSS for which the analytical method yielded a detected outcome for that analyte. This report uses statistical techniques to estimate the 95th percentile of the concentration of these analytes in treated biosolids, based on data collected in the TNSSS.

Table 4. Listing of 84 Analytes Measured in the TNSSS, and the Percentage of Detected Outcomes from Among the 84 Collected Samples of Treated Biosolids.
Analyte | Percentage of Detected Outcomes

Metals
Aluminum | 100.0%
Antimony | 86.5%
Arsenic | 100.0%
Boron | 97.3%
Cadmium | 100.0%
Calcium | 100.0%
Chromium | 100.0%
Cobalt | 100.0%
Copper | 100.0%
Iron | 100.0%
Lead | 100.0%
Magnesium | 100.0%
Mercury | 100.0%
Nickel | 100.0%
Phosphorus | 100.0%
Selenium | 100.0%
Sodium | 100.0%
Thallium | 94.6%
Tin | 94.6%
Titanium | 98.6%
Vanadium | 100.0%
Yttrium | 100.0%
Zinc | 100.0%

Organics
2-Methylnaphthalene | 44.6%
Benzo(a)pyrene | 77.0%
Bis(2-ethylhexyl) phthalate | 100.0%

Classicals (Anions)
Fluoride | 100.0%
Water-Extractable Phosphorus | 100.0%

PBDEs
BDE-028 | 100.0%
BDE-066 | 100.0%
BDE-085 | 100.0%
BDE-100 | 100.0%
BDE-138 | 67.9%
BDE-154 | 100.0%
BDE-183 | 100.0%

Pharmaceuticals
1,7-Dimethylxanthine | 5.1%
4-EOTC | 10.3%
4-Epianhydrotetracycline (EATC) | 34.6%
Acetaminophen | 2.6%
Anhydrotetracycline (ATC) | 60.3%
Caffeine | 46.2%
Clarithromycin | 53.8%
Codeine | 24.4%
Cotinine | 44.9%
Dehydronifedipine | 21.8%
Demeclocycline | 3.8%
Diltiazem | 82.1%
Enrofloxacin | 15.4%
Gemfibrozil | 89.7%
Ibuprofen | 62.8%
Lincomycin | 3.8%
Lomefloxacin | 2.6%
Metformin | 7.8%
Minocycline | 43.3%
Naproxen | 51.3%
Norfloxacin | 33.3%
Oxytetracycline (OTC) | 35.9%
Ranitidine | 57.1%
Roxithromycin | 2.6%
Sarafloxacin | 2.6%
Sulfachloropyridazine | 2.6%
Sulfadiazine | 3.9%
Sulfadimethoxine | 6.5%
Sulfamethazine | 2.6%
Sulfamethoxazole | 37.7%
Sulfanilamide | 10.4%
Thiabendazole | 69.2%
Trimethoprim | 29.5%
Virginiamycin | 17.9%

Steroids and Hormones
17 Alpha-estradiol | 6.8%
17 Beta-estradiol | 11.5%
Androstenedione | 41.1%
Androsterone | 65.8%
Beta-Estradiol 3-Benzoate | 23.0%
Beta-Sitosterol | 85.9%
Desmosterol | 66.7%
Equilin | 17.8%
Ergosterol | 61.5%
Estriol | 21.6%
Estrone | 76.7%
Norethindrone | 6.6%
Norgestrel | 5.4%
Progesterone | 22.1%
Testosterone | 23.3%

2. Approach

This section describes the statistical methodology for estimating the 95th percentile of the concentration of the 84 additional analytes (listed in Table 4) in treated biosolids across the POTWs sampled in the TNSSS.
As noted in Section 2.4.3 of USEPA (2009a), the TNSSS aimed to collect a single sample of final treated biosolids from a facility; the measurements taken from this single sample represented the facility's average concentration for the pollutant at a single point in time. Therefore, in the ten instances when a facility had two biosolids samples collected, either for quality control purposes or because the facility generated two types of biosolids products, EPA investigated whether the two data values for a given analyte could be aggregated (averaged) into a single value prior to performing the data review and analysis. For the statistical analysis of the 34 prioritized analytes (USEPA, 2009a), EPA aggregated data values within a facility in the following instances:

- For all analytes, when the second sample was a field duplicate sample (6 facilities).
- For analytes within the classicals, metals, and organics classifications, when the two samples represented different treatment systems (4 facilities). Aggregation did not occur for other classifications (i.e., PBDEs, pharmaceuticals, steroids, and hormones) within these facilities because measurements often differed considerably between samples collected from different systems, especially between solid and liquid samples.

When data aggregation occurred, the criteria for classifying a facility's aggregated (average) value as a detect or non-detect result matched those used in USEPA (2009a) and are documented in Table 5.

Table 5. Criteria for Classifying Within-Facility Aggregated Measurements as a Detect or Non-Detect in the TNSSS.

If the two sample data values are ... | The aggregated value is calculated as the ... | The result is classified as a ...
Both detected | Arithmetic average of the measured values | Detect
Both non-detected | Arithmetic average of the sample-specific detection limits | Non-detect
A mixture of detected and non-detected samples | Arithmetic average of the measured value (for the detected sample) and sample-specific detection limit (for the non-detected sample) | Detect

Thus, following this within-facility data aggregation, the 95th percentile was estimated using a set of 74 data values for each of the metals, organics, and classicals, and a set of at most 78 data values for PBDEs, pharmaceuticals, steroids, and hormones. (For selected pharmaceuticals, steroids, and hormones, fewer than 78 data values were available for the calculation, as the laboratory did not report a value for certain samples/facilities.)

When the laboratory reported a non-detect outcome, it reported the sample-specific detection limit rather than a measured value for that sample. For a given analyte, different samples could have different detection limits, and these limits can overlap the distribution of detected outcomes. Table 6 lists the 84 analytes and some summary statistics on the observed detected measurements and on the reported detection limits (for non-detects).

Table 6. Summary Statistics for Detected Outcomes for 84 Analytes Measured in the TNSSS.
(Minimum, Median, Maximum, and Mean refer to detected concentrations.)

Analyte | CAS Number | Total N | Detected N | Minimum | Median | Maximum | Mean

Metals (mg/kg)
Aluminum | 7429905 | 74 | 74 | 1,400.00 | 11,200.00 | 57,300.00 | 13,494.66
Antimony | 7440360 | 74 | 64 | 0.45 | 1.71 | 20.50 | 2.53
Arsenic | 7440382 | 74 | 74 | 1.18 | 4.96 | 49.20 | 6.94
Boron | 7440428 | 74 | 72 | 5.70 | 33.00 | 131.00 | 41.48
Cadmium | 7440439 | 74 | 74 | 0.21 | 1.76 | 11.80 | 2.64
Calcium | 7440702 | 74 | 74 | 9,480.00 | 27,000.00 | 243,000.00 | 41,025.41
Chromium | 7440473 | 74 | 74 | 6.74 | 32.68 | 1,160.00 | 80.16
Cobalt | 7440484 | 74 | 74 | 0.87 | 4.59 | 290.00 | 10.73
Copper | 7440508 | 74 | 74 | 115.00 | 456.00 | 1,720.00 | 553.13
Iron | 7439896 | 74 | 74 | 1,580.00 | 15,650.00 | 131,000.00 | 26,252.50
Lead | 7439921 | 74 | 74 | 5.81 | 46.15 | 350.00 | 76.19
Magnesium | 7439954 | 74 | 74 | 713.50 | 4,460.00 | 18,050.00 | 4,956.61
Mercury | 7439976 | 74 | 74 | 0.19 | 0.83 | 7.50 | 1.23
Nickel | 7440020 | 74 | 74 | 7.61 | 23.45 | 526.00 | 48.32
Phosphorus | 7723140 | 74 | 74 | 5,715.00 | 19,300.00 | 69,400.00 | 21,806.49
Selenium | 7782492 | 74 | 74 | 1.10 | 6.20 | 24.20 | 7.00
Sodium | 7440235 | 74 | 74 | 154.00 | 1,017.75 | 26,600.00 | 2,699.97
Thallium | 7440280 | 74 | 70 | 0.02 | 0.13 | 1.68 | 0.18
Tin | 7440315 | 74 | 70 | 7.50 | 36.15 | 522.00 | 49.08
Titanium | 7440326 | 74 | 73 | 18.50 | 86.90 | 4,805.00 | 281.73
Vanadium | 7440622 | 74 | 74 | 2.04 | 12.65 | 617.00 | 36.19
Yttrium | 7440655 | 74 | 74 | 0.70 | 3.89 | 26.30 | 4.82
Zinc | 7440666 | 74 | 74 | 216.00 | 784.00 | 8,550.00 | 970.01

Organics (µg/kg)
2-Methylnaphthalene | 91576 | 74 | 33 | 10.00 | 250.00 | 4,600.00 | 498.12
Benzo(a)pyrene | 50328 | 74 | 57 | 63.00 | 360.00 | 4,000.00 | 810.69
Bis(2-ethylhexyl) phthalate | 117817 | 74 | 74 | 657.35 | 24,000.00 | 310,000.00 | 52,862.48

Anions (mg/kg)
Fluoride | 16984488 | 74 | 74 | 14.70 | 54.10 | 234.00 | 59.42
Water-Extractable Phosphorus | C055 | 74 | 74 | 11.00 | 420.75 | 9,550.00 | 988.08

PBDEs (ng/kg)
BDE 028 | 41318756 | 78 | 78 | 2,200.00 | 8,900.00 | 160,000.00 | 15,348.72
BDE 066 | 189084615 | 78 | 78 | 1,800.00 | 12,000.00 | 110,000.00 | 17,396.79
BDE 085 | 182346210 | 78 | 78 | 3,200.00 | 23,000.00 | 150,000.00 | 27,943.59
BDE 100 | 189084648 | 78 | 78 | 13,000.00 | 120,000.00 | 1,100,000.00 | 150,365.38
BDE 138 | 182677301 | 78 | 53 | 1,900.00 | 7,900.00 | 40,000.00 | 10,247.17
BDE 154 | 207122154 | 78 | 78 | 7,700.00 | 46,500.00 | 440,000.00 | 59,900.00
BDE 183 | 207122165 | 78 | 78 | 2,100.00 | 10,000.00 | 120,000.00 | 16,664.74

Pharmaceuticals (µg/kg)
1,7-Dimethylxanthine | 611596 | 78 | 4 | 1,130.00 | 2,245.00 | 9,580.00 | 3,800.00
4-Epianhydrotetracycline (EATC) | 4465650 | 78 | 27 | 126.00 | 299.00 | 2,160.00 | 434.29
4-Epioxytetracycline (EOTC) | 14206587 | 78 | 8 | 35.70 | 45.80 | 54.90 | 45.60
Acetaminophen | 103902 | 78 | 2 | 1,120.00 | 1,210.00 | 1,300.00 | 1,210.00
Anhydrotetracycline (ATC) | 4496859 | 78 | 47 | 94.30 | 205.00 | 1,960.00 | 330.06
Caffeine | 58082 | 78 | 36 | 72.90 | 262.50 | 1,110.00 | 369.16
Clarithromycin | 81103119 | 78 | 42 | 8.68 | 34.50 | 617.00 | 65.53
Codeine | 76573 | 78 | 19 | 10.70 | 35.80 | 328.00 | 61.28
Cotinine | 486566 | 78 | 35 | 11.40 | 21.00 | 690.00 | 99.36
Dehydronifedipine | 67035227 | 78 | 17 | 3.48 | 5.96 | 21.65 | 7.97
Demeclocycline | 127333 | 78 | 3 | 96.00 | 164.00 | 200.00 | 153.33
Diltiazem | 42399417 | 78 | 64 | 1.81 | 18.25 | 225.00 | 44.45
Enrofloxacin | 93106606 | 78 | 12 | 12.55 | 32.20 | 66.00 | 34.42
Gemfibrozil | 25812300 | 78 | 70 | 12.10 | 115.00 | 2,650.00 | 234.12
Ibuprofen | 15687271 | 78 | 49 | 99.50 | 255.00 | 11,900.00 | 920.67
Lincomycin | 154212 | 78 | 3 | 12.85 | 29.10 | 33.40 | 25.12
Lomefloxacin | 98079517 | 78 | 2 | 33.30 | 36.55 | 39.80 | 36.55
Metformin | 657249 | 77 | 6 | 550.00 | 756.00 | 1,160.00 | 781.50
Minocycline | 10118908 | 67 | 29 | 351.00 | 475.00 | 8,650.00 | 883.40
Naproxen | 22204531 | 78 | 40 | 20.90 | 75.75 | 1,020.00 | 137.37
Norfloxacin | 70458967 | 78 | 26 | 99.30 | 203.00 | 995.50 | 297.30
Oxytetracycline (OTC) | 79572 | 78 | 28 | 21.05 | 62.50 | 467.00 | 83.07
Ranitidine | 66357355 | 77 | 44 | 3.85 | 18.15 | 2,250.00 | 81.98
Roxithromycin | 80214831 | 78 | 2 | 14.30 | 18.33 | 22.35 | 18.33
Sarafloxacin | 98105998 | 78 | 2 | 179.00 | 1,079.50 | 1,980.00 | 1,079.50
Sulfachloropyridazine | 80320 | 77 | 2 | 26.00 | 42.35 | 58.70 | 42.35
Sulfadiazine | 68359 | 77 | 3 | 22.90 | 77.30 | 140.00 | 80.07
Sulfadimethoxine | 122112 | 77 | 5 | 3.58 | 7.35 | 62.20 | 18.30
Sulfamethazine | 57681 | 77 | 2 | 21.50 | 22.35 | 23.20 | 22.35
Sulfamethoxazole | 723466 | 77 | 29 | 3.91 | 12.30 | 651.00 | 43.26
Sulfanilamide | 63741 | 77 | 8 | 191.00 | 1,593.50 | 15,600.00 | 3,651.50
Thiabendazole | 148798 | 78 | 54 | 8.42 | 22.05 | 238.00 | 46.32
Trimethoprim | 738705 | 78 | 23 | 12.65 | 38.90 | 204.00 | 51.37
Virginiamycin | 11006761 | 78 | 14 | 43.50 | 125.25 | 469.00 | 162.56

Steroids/Hormones (µg/kg)
17 Alpha-Estradiol | 57910 | 73 | 5 | 14.45 | 21.90 | 48.80 | 26.13
17 Beta-Estradiol | 50282 | 78 | 9 | 22.00 | 33.20 | 222.25 | 60.89
Androstenedione | 63058 | 73 | 30 | 108.00 | 387.50 | 1,520.00 | 495.15
Androsterone | 53418 | 73 | 48 | 17.65 | 107.50 | 1,030.00 | 157.26
Beta-Estradiol 3-Benzoate | 50500 | 74 | 17 | 30.20 | 145.00 | 1,850.00 | 449.16
Beta-Sitosterol | 83465 | 78 | 67 | 24,400.00 | 260,000.00 | 1,640,000.00 | 333,643.28
Desmosterol | 313042 | 78 | 52 | 2,730.00 | 14,700.00 | 94,400.00 | 19,037.69
Equilin | 474862 | 73 | 13 | 22.30 | 36.75 | 100.30 | 48.31
Ergosterol | 57874 | 78 | 48 | 4,530.00 | 21,700.00 | 91,900.00 | 27,988.33
Estriol | 50271 | 74 | 16 | 7.56 | 77.85 | 232.00 | 79.24
Estrone | 53167 | 73 | 56 | 26.70 | 74.90 | 965.00 | 133.78
Norethindrone | 68224 | 76 | 5 | 21.00 | 41.10 | 1,360.00 | 305.82
Norgestrel | 6533002 | 74 | 4 | 43.80 | 113.75 | 1,300.00 | 392.83
Progesterone | 57830 | 77 | 17 | 143.00 | 757.00 | 1,290.00 | 753.50
Testosterone | 58220 | 73 | 17 | 30.80 | 97.90 | 2,040.00 | 291.79

Section 2.1 introduces the ProUCL software used to calculate the 95th percentile estimates for the 84 analytes in Table 6.
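The facility-level values summarized in Table 6 reflect the within-facility aggregation rule of Table 5. A minimal sketch of that rule is given below. The `Result` type and the `aggregate_pair` function are hypothetical names introduced here for illustration only; they are not part of ProUCL or of the TNSSS data files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One reported result (hypothetical type, for illustration).

    `value` holds the measured concentration for a detect, or the
    sample-specific detection limit for a non-detect.
    """
    value: float
    detected: bool


def aggregate_pair(a: Result, b: Result) -> Result:
    """Combine two within-facility results following the rule in Table 5."""
    # In all three cases of Table 5 the aggregated value is the arithmetic
    # average of the two reported numbers (measured values for detects,
    # detection limits for non-detects).
    mean_value = (a.value + b.value) / 2.0
    # The aggregate is a non-detect only when both inputs are non-detects;
    # a mixture of a detect and a non-detect is classified as a detect.
    return Result(value=mean_value, detected=a.detected or b.detected)


if __name__ == "__main__":
    # Made-up numbers: one detect (10.0) and one non-detect with limit 4.0.
    print(aggregate_pair(Result(10.0, True), Result(4.0, False)))
```

Note that, under this rule, a mixed pair contributes the non-detect's detection limit to the average as if it were a measured value, which is one reason the aggregated value is only an approximation of the facility's concentration.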
Sections 2.2 and 2.3 present goodness-of-fit distributional tests and statistical outlier tests, respectively, which were used in preparing the datasets for analysis and determining an appropriate statistical approach for estimating the 95th percentile. Section 2.4 presents brief overviews of these statistical approaches; the results of applying these approaches to data for the 84 analytes follow in Section 3.

2.1 ProUCL Software

The analysis in this report utilized Version 4.1.00 of EPA's ProUCL software, an open-source statistical estimation software tool available for download at https://www.epa.gov/land-research/proucl-software. ProUCL offers a variety of parametric and nonparametric statistical approaches for calculating estimates of the upper percentiles of a statistical distribution. These approaches differ in the assumed underlying distribution of the data and in how non-detects are treated. Most approaches regard non-detects as left-censored at the reported sample-specific detection limit (i.e., the reported result is known only to fall below the limit), and some can handle multiple values for detection limits among the non-detects.

EPA considered ProUCL to be an appropriate tool for estimating 95th percentiles for the 84 analytes in this report for three reasons: the 95th percentile was the quantity of interest; ProUCL offers approaches for handling non-detects that are more rigorous than simple substitution methods and that have demonstrated good performance in peer-reviewed publications; and the reported data in the TNSSS for a given analyte contain non-detects at multiple detection limits (when non-detects were present).

ProUCL was designed to analyze environmental concentration data associated with a localized site characterization. Thus, it was not designed to analyze data from complex sampling designs, such as stratification or the use of sampling weights.
The in-depth statistical analysis performed in USEPA (2009a) did account for the sampling weights, and thus, generated nationally representative estimates.

2.2 Methods for Testing Distributional Goodness-of-Fit

ProUCL considers three different data distributions as a basic assumption in its parametric estimation methods: normal, lognormal, and gamma distributions. Thus, ProUCL offers goodness-of-fit tests for each of these three distribution models. ProUCL recommends that the results of these tests be reviewed with histograms or quantile-quantile (Q-Q) plots of the data in order to get a more complete assessment of distributional goodness-of-fit. These data plots also provide useful information about the presence of potential outliers and influential data values. Thus, histograms of detected measurements for the individual analytes can be found at the end of Section 3.

Because of the unknown quantitative value of non-detects, the goodness-of-fit tests were applied only to the set of detected measurements for each analyte. That is, any non-detects were excluded from the test.

ProUCL uses the Shapiro-Wilk test for normality (Gilbert, 1987), as well as the Lilliefors test (Dudewicz and Misra, 1988; Conover, 1999) when the sample size exceeds 50. (ProUCL indicates that the Lilliefors test performs well for samples of this size, while recognizing that the Shapiro-Wilk test can also be applied to larger samples.) Lognormality tests are equivalent to normality tests performed on log-transformed data.

To test for goodness-of-fit to a gamma distribution, ProUCL uses two empirical distribution function (EDF)-based methods: the Kolmogorov-Smirnov (K-S) test and the Anderson-Darling (A-D) test (D'Agostino and Stephens, 1986; Stephens, 1970). The critical values for these two test statistics originate from Monte Carlo simulation experiments.

Conclusions derived solely from goodness-of-fit tests need to be made with caution.
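As a concrete illustration of the statement that a lognormality test is a normality test applied to log-transformed data, the sketch below computes a Lilliefors-type statistic (the largest gap between the empirical distribution function and a normal distribution fitted to the data). It is a simplified stand-in written with the Python standard library, not ProUCL's implementation; the function names are hypothetical, and the critical values needed to turn the statistic into a test decision are deliberately omitted.

```python
import math
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(values):
    """Return the Lilliefors-type statistic D for normality of `values`.

    D is the largest vertical distance between the empirical CDF and the
    CDF of a normal distribution whose mean and standard deviation are
    estimated from the same data.
    """
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def lognormal_statistic(detects):
    """Lognormality check: the same statistic on log-transformed detects."""
    return lilliefors_statistic([math.log(v) for v in detects])


if __name__ == "__main__":
    # Made-up detected concentrations (positive values only).
    detects = [12.1, 18.3, 25.0, 40.2, 77.5, 115.0, 310.0]
    print("normal D:   ", lilliefors_statistic(detects))
    print("lognormal D:", lognormal_statistic(detects))
```

A smaller D indicates closer agreement with the assumed distribution; the decision to reject still depends on sample-size-specific critical values, which is one reason the outcome is sensitive to the number of detected measurements.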
The null hypothesis of these tests is that the given distributional model holds (e.g., normality), and the alternative is that it does not hold; the outcome of a test is therefore either that the distributional model is rejected or that it cannot be rejected based on the data. Thus, if one fails to reject the given distribution, this does not mean that the distribution is the best fit to the data, only that it cannot be outright rejected. Furthermore, the outcome of the test is highly influenced by the sample size. Fewer data points make it more difficult to reject the distributional model, thereby making it more likely to conclude that the distribution is reasonable when in fact it is not, while many data points can result in rejecting the distribution with high likelihood, even when the distribution is appropriate. The test outcomes can also be influenced by extreme data values. Thus, these goodness-of-fit tests provide only a general indication of the relevance of a given distributional assumption.

2.3 Methods for Identifying Statistical Outliers

The presence of outliers among the collected concentration data could distort the estimates of distributional parameters such as upper percentiles. To identify and assess potential outliers in the measurements for the 84 analytes, the following outlier detection tests were accessed in ProUCL:

- Dixon's Extreme Value test (Dixon, 1953), when the sample size is less than 25.
- Rosner's test (Gilbert, 1987), which can detect up to 10 outliers for sample sizes of 25 or more.

The outcomes of both outlier tests are sensitive to the assumption that the data are normally distributed in the absence of outliers. Therefore, the extent to which normality holds in the data was checked along with the results of the outlier tests (or equivalently, lognormality if the tests are performed on log-transformed data).
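To make the mechanics of a Rosner-type procedure more tangible, the sketch below computes the sequence of test statistics only: at each step, the most extreme value is expressed in standard-deviation units and then removed before the next step. This is an illustrative simplification under stated assumptions, not ProUCL's implementation; the function name is hypothetical, the critical values that determine whether a candidate is declared an outlier are omitted, and the sample-size conditions quoted above are not enforced.

```python
import math
from statistics import mean, stdev


def rosner_statistics(values, max_outliers):
    """Return a list of (R_i, candidate_i) pairs, one per step.

    R_i is the largest absolute deviation from the mean, divided by the
    standard deviation, after the candidates from earlier steps have been
    removed. Critical values are intentionally not computed here.
    """
    remaining = list(values)
    steps = []
    for _ in range(max_outliers):
        if len(remaining) < 3:
            break  # too few points left for a meaningful statistic
        m, s = mean(remaining), stdev(remaining)
        if s == 0:
            break  # all remaining values are identical
        candidate = max(remaining, key=lambda x: abs(x - m))
        steps.append((abs(candidate - m) / s, candidate))
        remaining.remove(candidate)
    return steps


if __name__ == "__main__":
    # Made-up detected concentrations; the log transform reflects an
    # assumed lognormal model, so candidates are reported on the log scale.
    detects = [4.6, 5.1, 6.3, 7.0, 8.8, 9.4, 290.0]
    for r, log_candidate in rosner_statistics([math.log(v) for v in detects], 2):
        print(round(r, 3), round(math.exp(log_candidate), 2))
```

Because each statistic depends on the mean and standard deviation of the values still in the dataset, the sequence as a whole, rather than the first step alone, is what a multi-outlier procedure evaluates.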
Furthermore, outlier tests used to identify a single (i.e., most extreme) outlier often suffer from masking effects when multiple outliers are present, as these outliers inflate the standard deviation, which makes it more difficult to identify the most extreme data point as an outlier. For both tests, non-detects can be either excluded from the dataset or represented by one-half of the detection limit. (In this analysis, non-detects were excluded; that is, outlier tests were performed only on detected outcomes, and thus, the sample size refers to the number of detected outcomes.)

As always, the decision regarding the proper disposition of outliers (e.g., to include or not to include outliers in statistical analyses) should consider the extent to which conditions associated with sampling and analysis, as well as facility conditions on the day of collection, are not typical, and thus, warrant exclusion. Because no such conditions were identified that would warrant exclusion, no data values flagged by these outlier tests were excluded from the calculation of 95th percentiles.

2.4 Methods for Estimating the 95th Percentile

ProUCL provides four basic statistical approaches to calculating the 95th percentile. They typically calculate a 95th percentile as: where jj. and |