United States Environmental Protection Agency

Final Contaminant Candidate List 3 Chemicals: Classification of the PCCL to CCL

Office of Water (4607M) EPA815-R-09-008 August 2009 www.epa.gov/safewater

EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA 815-R-09-008 August 2009

Table of Contents

1.0 INTRODUCTION TO THE CONTAMINANT CANDIDATE LIST (CCL) CLASSIFICATION PROCESS
    1.1 Principles of Evaluation
    1.2 Developing the Classification Approach
2.0 ATTRIBUTES
    2.1 Health Effects Attributes
        2.1.1 Potency
        2.1.2 Severity
    2.2 Occurrence Attributes
        2.2.1 Prevalence and Magnitude Data Elements
        2.2.2 Prevalence - Calibrating Scales and Scoring
        2.2.3 Evaluation of the Prevalence Protocol
        2.2.4 Magnitude - Calibrating Scales and Scoring
        2.2.5 Persistence-Mobility as a Surrogate Measure for Magnitude
        2.2.6 Persistence-Mobility Data - Calibrating Scales and Scoring
        2.2.7 Evaluation of the Magnitude Protocol
    2.3 Fine Tuning the Protocols
3.0 DEFINITIONS AND OVERVIEW OF THE TRAINING DATA SET
    3.1 Key Considerations
    3.2 Developing Key Components of the Training Data Set
        3.2.1 Attribute Scores
        3.2.2 Making List-Not list Decisions
4.0 PROTOTYPE CLASSIFICATION MODELS AND THE CCL PROCESS
    4.1 Model Training and Development
    4.2 Model Sensitivity Analyses
        4.2.1 Training with subsets of the IDS
        4.2.2 Training after Selected "Outliers" Are Removed From the IDS
        4.2.3 Graphical and Statistical Analyses to Identify Significant Differences in Attribute "Weights" Or Influence on Model Performance
    4.3 Model Performance Testing
    4.4 Evaluating Classification Differences
        4.4.1 Classification Differences Among the Models
        4.4.2 Logical Evaluation of the Models - Graphical Analysis
    4.5 Applying Model Results
        4.5.1 Additive Model Results
        4.5.2 Additive Rank Order Results
5.0 MODEL OUTCOME AND POST MODEL EVALUATION PROCESS
    5.1 PCCL Characterization and Model Results
    5.2 Evaluation of the Modeling Output
        5.2.1 Procedure
        5.2.2 Evaluation Results
    5.3 Post-Model Adjustments to Output
        5.3.1 Using Supplemental Sources to Identify the Data Most Relevant to Drinking Water
        5.3.2 Calculation of a Health-Concentration Ratio for Contaminants with Water Data
        5.3.3 Grouping Contaminants based on Data Certainty
        5.3.4 LD50 Values with Limited Documentation
    5.4 Selecting the Draft CCL 3
    5.5 Summary
6.0 REFERENCES
7.0 APPENDICES
    Appendix A. Attribute Scoring Protocols
    Appendix B. Information Sheets from the TDS Exercises
    Appendix C. Summary of EPA Team TDS Decisions
    Appendix D. Software Sources
    Appendix E. Solutions
    Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results
    Appendix G. PCCL Contaminants with Incomplete Data for Scoring or that had Parent Compounds Scored in Developing the Draft CCL 3

Table of Exhibits

Exhibit 1. Developing an Approach to Process PCCL Chemicals
Exhibit 2. Decile Distribution of RfD Values (mg/kg/day)
Exhibit 3A. Logarithmic Distribution of RfD Values
Exhibit 3B. Logarithmic Distribution of NOAEL Values
Exhibit 3C. Logarithmic Distribution of LOAEL Values
Exhibit 3D. Logarithmic Distribution of LD50 Values
Exhibit 4. Scoring Equations for Potency
Exhibit 5. Logarithmic Distribution of Cancer Potency Values
Exhibit 6. Potency Scores for Chemicals in the Learning Set
Exhibit 7. Potency Scores for Chemicals Not in the Learning Set
Exhibit 8. NRC Severity Scoring Proposal
Exhibit 9. Final Nine-Point Scoring Protocol for Severity
Exhibit 10. Relationship of Data Elements Used to Score Magnitude and Prevalence
Exhibit 11. Comparison of Prevalence Scores for Learning Set Contaminants
Exhibit 12. Comparison of the NRC Magnitude Score with the Ratio of the Health Advisory Guideline to the Concentration in Finished Water
Exhibit 13. Magnitude Concentrations and Scores Derived from Potency Doses
Exhibit 14A. Equal Bins Drinking Water Magnitude Scale (µg/L)
Exhibit 14B. Half Log Option A Drinking Water Magnitude Scale (µg/L)
Exhibit 14C. Half Log Option B Drinking Water Magnitude Scale (µg/L)
Exhibit 15. Magnitude Attribute Scores: Example Contaminants Scored by their Median of Detections Using the Various Approaches in Exhibit 14
Exhibit 16. Mobility and Persistence Data Elements
Exhibit 17. Comparison of Scores derived using the Magnitude Protocol
Exhibit 18. Combinations of low and high attribute scores for the four attributes using Latin Hypercube Sampling
Exhibit 19. Attribute Space for the 101 IDS compared to that for the 202 IDS
Exhibit 20a. QUEST Classifications Based on the Full Training Data Set
Exhibit 20b. QUEST Classifications Based on 5-Fold Cross-Validation
Exhibit 21. Linear Model-estimated versus Team Average Classification for the TDS
Exhibit 22. Relative Weights of Attributes at QUEST Nodes
Exhibit 23. Summary Statistics from MCMC Sample
Exhibit 24. Features of the Three Preferred Models Based on TDS Test Results
Exhibit 25. Decision Comparison Matrix; Weight of Differences
Exhibit 26. Summary of Quaternary Model Decisions
Exhibit 27. Results of 202 Model Classifications and Weighted Misclassifications
Exhibit 28. Summary of Individual Quaternary Model Classifications
Exhibit 29. ANN Model Predictions for the Four Attribute Space
Exhibit 30. MARS Model Predictions for the Four Attribute Space
Exhibit 31. Univariate CART Model Predictions for the Four Attribute Space
Exhibit 32. Linear Model Predictions for the Four Attribute Space
Exhibit 33. Summary Comparison of the Sum of the 3 Model Decisions to the Distribution of EPA Blinded (TDS) Decisions
Exhibit 34. Model Results for the PCCL Chemicals
Exhibit 35. Results of the Model Output Evaluation (Total = 129 chemicals)
Exhibit 36. Formulae used in the CCL 3 Process to Calculate Health Reference Levels (HRLs) from the CCL 3 Potency Data Elements

List of Abbreviations and Acronyms

<           Less than
≤           Less than or equal to
>           Greater than
≥           Greater than or equal to
µg          Microgram, one-millionth of a gram
µg/L        Micrograms per liter
ANN         Artificial Neural Network
ATSDR       Agency for Toxic Substances and Disease Registry
CART        Classification and Regression Tree
CASRN       Chemical Abstracts Service Registry Number
CCL         Contaminant Candidate List
CCL 1       EPA's first Contaminant Candidate List
CCL 2       EPA's second Contaminant Candidate List
CCL 3       EPA's third Contaminant Candidate List
CUS/IUR     Chemical update system/inventory update rule
DBP         Disinfection byproduct
EDWC        Estimated Drinking Water Concentration
EEC         Estimated Environmental Concentration
EPA         United States Environmental Protection Agency
g/day       Grams per day
HRL         Health Reference Level
IOC         Inorganic compound
IRIS        Integrated Risk Information System
kg          Kilogram
L           Liter
LD50        Lethal dose 50; an estimate of a single dose that is expected to cause the death of 50 percent of the exposed animals; it is derived from experimental data.
lbs         Pounds
LOAEL       Lowest observed adverse effect level
MARS        Multivariate Adaptive Regression Splines
MCMC        Markov Chain Monte Carlo
mg          Milligram, one-thousandth of a gram
mg/kg       Milligrams per kilogram body weight
mg/kg/day   Milligrams per kilogram body weight per day
mg/L        Milligrams per liter
N           Number of samples
NAWQA       National water quality assessment (USGS program)
NCOD        National contaminant occurrence database
ND          Not detected (or non-detect)
NDWAC       National Drinking Water Advisory Council
NIRS        National Inorganic and Radionuclide Survey
NOAEL       No observed adverse effect level
NRC         National Research Council
OW          Office of Water
OPP         Office of Pesticide Programs
PBPK        Physiologically Based Pharmacokinetic
PCCL        Preliminary-CCL
PWS         Public water system
QUEST       Quick, Unbiased, Efficient Statistical Tree
RTECS       Registry of Toxic Effects of Chemical Substances
RfD         Reference dose
TDS         Training data set
TRI         Toxics Release Inventory
UCMR        Unregulated Contaminant Monitoring Regulations
UCMR 1      First Unregulated Contaminant Monitoring Regulation
UCMR 2      Second Unregulated Contaminant Monitoring Regulation
UL          Tolerable upper intake level
US          United States of America
USGS        United States Geological Survey

1.0 INTRODUCTION TO THE CONTAMINANT CANDIDATE LIST (CCL) CLASSIFICATION PROCESS

Every five years the United States Environmental Protection Agency (EPA) is required to publish a list of contaminants (1) that are currently unregulated, (2) that are known or anticipated to occur in public water systems, and (3) which may require regulations under the Safe Drinking Water Act (SDWA). This list is known as the Contaminant Candidate List or CCL.

SDWA section 1412(b)(1) requires that in the development of the CCL, EPA consider specific data sources and include the scientific community.
EPA must evaluate substances identified in section 101(14) of the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA) of 1980 and substances registered as pesticides under the Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA). SDWA also requires the Agency to consider the National Contaminant Occurrence Database established under section 1445(g) of SDWA. SDWA directs the Agency to consult with the scientific community, including the Science Advisory Board (SAB). In addition, it directs the Agency to consider the health effects and occurrence information for unregulated contaminants to identify those contaminants that present the greatest public health concern related to exposure from drinking water.

EPA interprets broadly the criterion that contaminants are known or anticipated to occur in public water systems. In evaluating this criterion, EPA considers not only public water system monitoring data, but also data on concentrations in ambient surface and ground waters, releases to the environment (e.g., Toxics Release Inventory), and production. While such data may not establish conclusively that contaminants are known to occur in public water systems, EPA believes these data are sufficient to anticipate that contaminants may occur in public water systems and support their inclusion on the CCL.

The Agency considered adverse health effects that may pose a greater risk to life stages and other sensitive groups that represent a meaningful portion of the population. Adverse health effects associated with infants, children, pregnant women, the elderly, and individuals with a history of serious illness were evaluated. In selecting contaminants for the CCL 3, each of the above requirements was met.

SDWA section 1412(b)(1) also requires EPA to determine whether to regulate at least five contaminants from the CCL every five years.
SDWA specifies that EPA shall regulate a contaminant if the Administrator determines that:

• The contaminant may have an adverse effect on the health of persons;
• The contaminant is known to occur, or there is a substantial likelihood that the contaminant will occur in public water systems with a frequency and at levels of public health concern; and
• In the sole judgment of the Administrator, regulation of such contaminant presents a meaningful opportunity for health risk reduction for persons served by public water systems.

Once contaminants have been placed on the CCL, EPA identifies whether there are any additional data needs or whether there is sufficient information to make a regulatory determination. EPA interprets these criteria for regulatory determination as more rigorous than those used to place contaminants on the CCL.

EPA developed a multi-step approach to select contaminants for the third CCL (CCL 3), which includes the following key steps:

(1) The identification of a broad universe of potential drinking water contaminants (CCL 3 Universe);
(2) A screening process that uses straightforward screening criteria, based on a contaminant's potential to occur in public water systems and thereby pose a potential public health concern, to narrow the universe of contaminants to a Preliminary-CCL (PCCL); and
(3) A structured classification process (e.g., a prototype classification algorithm model) that serves as a tool for objectively comparing data and information and is evaluated along with expert judgment to develop a CCL from the PCCL.

Steps 1 and 2 in the process are described in other support documents: Final CCL 3 Chemicals: Identifying the Universe (EPA, 2009a); and Final CCL 3 Chemicals: Screening to a PCCL (EPA, 2009b).
The purpose of this document is to describe the methodology used to develop the classification process (Step 3) and the process used to select chemicals for the CCL 3.

The PCCL consisted of 561 chemicals that were screened from the CCL 3 Universe. To select contaminants for the CCL 3, EPA used classification models to handle larger, more complex assortments of data in a consistent and reproducible manner. Drawing on EPA's experience and expertise, the classification models were trained on past expert decisions. The algorithms were used to prioritize chemicals, which allowed the final expert evaluation and review to be more objective and efficient. The data and information used to evaluate contaminants on the PCCL are provided in Contaminant Information Sheets available in the CCL 3 docket (EPA-HQ-OW-2007-1189) at www.regulations.gov.

1.1 Principles of Evaluation

In developing the first CCL (CCL 1), the Agency utilized readily available occurrence and health effects information coupled with an expert review process. Following the publication of CCL 1, the Agency sought the advice of the National Research Council (NRC) and National Drinking Water Advisory Council (NDWAC). The panels provided recommendations to guide EPA in creating a more comprehensive and transparent evaluation of potential drinking water contaminants for developing future CCLs.

In light of the NRC and NDWAC recommendations, EPA has reviewed and evaluated a large number of contaminants and their data, developed decision-making protocols using classification algorithm approaches, and included expert review in arriving at decisions to list or not list contaminants on CCL 3. These steps have provided a decision process that is more transparent and reproducible than approaches used for previous CCLs.
The process is driven by the data on individual contaminants and minimizes the bias that may occur with expert panels related to the participants' individual backgrounds and the effects of group dynamics. As experience is gained, the new classification process is likely to evolve and improve for application to future CCLs.

To guide the development of the classification process, EPA identified several key features that the approach addresses.

1. Meaningful Basis for Classification. The classification process must reflect the critical goals of the CCL; that is, it must consider the potential for occurrence in water, the potential for causing adverse health effects, and it must prioritize contaminants based on these criteria. The data supporting the list/no-list decision must be linked back to these three tenets.

2. Incorporating Relevant Data. The most relevant data used for the classification process include health effects data that are appropriate for drinking water exposures, and occurrence data that indicate the nature and spatial extent of potential occurrence in drinking water.

3. Transparent Process for Communication. One goal of the classification approach is to provide a transparent process that can be reviewed by external experts and the public. The attributes and data characterizing the contaminants should be easy to understand, and the decision-making process to list or not list a particular contaminant must be conveyed in a straightforward manner.

4. Reproducibility. A key feature of the classification process is that it should be reproducible. The classification process should always give the same result for the same set of input information.

1.2 Developing the Classification Approach

Based on this framework, EPA developed an approach for classifying potential drinking water contaminants.
An overarching premise in using classification models to prioritize contaminants is that different contaminants can be compared on the basis of similar attributes. The approach ensures that the contaminant attributes reflect the key decision characteristics in deciding whether or not to list a contaminant on the CCL.

The attributes are properties used to categorize contaminants for their potential to occur in drinking water and for their potential to cause adverse health effects. For example, occurrence can be characterized by a contaminant's water concentration data or potential to occur based on its release to the environment. The adverse health effects of contaminants can be characterized using preliminary toxicological data such as median lethal dose (LD50) or more developed values such as oral reference doses (RfDs).

To evaluate, categorize, and prioritize the PCCL contaminants as potential CCL contaminants, EPA integrated various types of data that represent measures of their attributes. This relative assessment across data measures required normalizing the available data by developing a set of attribute scales and scoring mechanisms for the various types of data available for potential drinking water contaminants.

Because this approach and its application are new, EPA developed, tested, and evaluated the results of several classification algorithms to assess whether they are useful, and which ones might provide the best decision support tools. To test and evaluate the process, EPA developed a data set and used it to "train" the classification algorithms. Once the modeling was completed, EPA evaluated the model output based on the compilation of data for a subset of the modeled contaminants; this evaluation assisted in developing a process to use the model output to generate the CCL 3.
The following chapters describe the steps EPA used to develop the components of the classification process, as displayed in Exhibit 1.

Exhibit 1. Developing an Approach to Process PCCL Chemicals

1. Develop Attribute Scoring Protocol
2. Select Training Data Set Contaminants and Make Listing Decisions
3. Score Training Data Set Contaminants with Final Attribute Scoring Protocols
4. Train and Validate Classification Approaches using Training Data Set
5. Post-model evaluation of PCCL chemicals

Iterative Process - The results of training and validation will indicate if areas need further evaluation and refinement. The iterative process may or may not go back to the primary assumptions.

Chapter 2 describes the attributes and scoring protocols. Chapter 3 describes the set of chemicals used to train the classification models, the training data set. Chapter 4 describes how the models were calibrated using the attributes and training data set. Chapter 5 describes the evaluation of the model output and post model processes.

2.0 ATTRIBUTES

Attributes are used to characterize different chemicals on the basis of similar qualities or traits. These qualities or traits represent the anticipated occurrence or adverse health effects of each contaminant. Occurrence and health effects are both represented by different types of data. To evaluate contaminants as potential CCL contaminants, one must be able to establish consistent relationships among the different types of data that represent measures of the attributes. This process involves the need to normalize the available data by developing scales and scoring mechanisms that will accept a variety of input data.

The attributes are properties used to categorize contaminants for their potential to occur in drinking water and for their potential to cause adverse health effects.
For example, occurrence may be characterized by water concentration data or a contaminant's potential to occur based on its release to the environment. The adverse health effects of contaminants may be characterized using preliminary toxicological data such as median lethal dose (LD50) or more developed values such as oral reference doses.

The NRC recommended using the attributes Potency and Severity to describe health effects, and Prevalence and Magnitude to describe occurrence. When occurrence data are not available, the NRC also suggested that environmental fate properties (i.e., Persistence and Mobility) could be used as surrogates to estimate potential for occurrence. EPA agreed that the recommended attributes are appropriate and consistent with data used in past decision-making efforts by EPA's Office of Water (OW).

Throughout the process of evaluating the attributes, it was recognized that a wide range of data elements would have to be used to characterize each attribute. The CCL process involves classifying relatively new and emerging contaminants, and most will not have complete dossiers of data. If the same data were available for all chemicals, their comparison and prioritization would be relatively straightforward. However, the types of data available for unregulated chemical contaminants vary. To enable comparisons among chemicals with differing types of data and information, a scaling system that accepts a variety of input data, yet provides a consistent comparative framework, is needed.
In concert with NRC and NDWAC recommendations, EPA identified the following principles to guide development of the attribute scoring process:

• Attribute scores should increase with concern (e.g., a 10 is of greater concern, 1 of lesser concern);
• There should be sufficient scoring categories to capture the range of data and to discriminate among the data;
• The number of categories should not be so great that they create a false sense of precision;
• Attributes can use different numbers of scoring categories if necessary (e.g., Prevalence could use 1-10, while Severity could use 1-8);
• The possible range of the scores for a given attribute should be the same regardless of the data elements that are used to assign the score for that attribute;
• The data source and data element used for each attribute should consider more direct measures of occurrence or health effects before potential measures, peer-reviewed data before unpublished data, and measured data before modeled data;
• The calibration scale (i.e., the scale relating the range for a data element to the scoring categories) should be established using a representative "universe" of data for each attribute to capture the potential range of values that might be encountered;
• The calibration scale must be set and remain constant throughout the operational process; and
• The scoring approach should be as simple as possible and data should be used with minimal transformations.

Section 2.1 describes the development of the process used to score the health effects attributes, and section 2.2, the approach for the occurrence attributes.

2.1 Health Effects Attributes

Potency and Severity are the two attributes used for evaluating health effects.
As defined in detail below, Potency reflects the lowest dose of a chemical that causes an adverse health effect in a case study report or in a toxicological or epidemiological study. Severity is the adverse health effect associated with the dose that is used as the measure of Potency, and is calibrated based on the health-related significance of the adverse effect (e.g., dermatitis versus cancer). These two attributes are interrelated, in that the Severity is linked to the measure of Potency.

2.1.1 Potency

Potency is a value that indicates the power of a contaminant to cause adverse health effects. In the case of chemicals, that power is apparent in the dose required to cause the most sensitive manifestation of an adverse health effect, or to generate a particular excess cancer risk. Potency for chemicals is reflected in several standard toxicological parameters that are discussed below.

A number of approaches have the potential to be useful in scoring the Potency attribute. However, regardless of the approach selected, the methods require calibrating the scores to normalize the scale. To evaluate the data elements and establish consistent scales, an initial "learning set" of about two hundred chemicals was developed for use in experimentation with approaches to calibration. The chemicals considered included regulated chemicals and unregulated chemicals for which EPA has derived Health Advisories (EPA, 2004). These chemicals are primarily at the high end of the Potency scale. To ensure that the Potency scale covers the full range of conditions that may be encountered (from high to low Potency) in a universe of chemicals, a group of chemicals (nutrients/food additives) that are generally considered relatively non-toxic and have toxicity values that can be compared to health advisories was added to the learning set.
The following toxicity parameters were compiled for the learning set chemicals, and their numeric distribution across the range of values was examined (see the footnotes below for definitions of the terms).

• Reference Dose (RfD)¹ or equivalent
• Cancer potency² (concentration in water equivalent to a 10⁻⁴ cancer risk)
• No Observed Adverse Effect Level (NOAEL)³ and/or Lowest Observed Adverse Effect Level (LOAEL)⁴ associated with the RfD
• Rat oral median Lethal Dose (LD50)⁵

Several approaches to characterize the distribution of values for the different toxicity parameters were employed in this exercise. The approaches are described in the following section. The data for the learning set were obtained from the following sources:

• EPA's Integrated Risk Information System (IRIS)
• EPA's Office of Water Health Advisories Documents⁶
• Registry of Toxic Effects of Chemical Substances (RTECS) (mostly LD50 values)
• Tolerable Upper Intake Levels (ULs) from the Institute of Medicine Dietary Reference Intakes

¹ A Reference Dose (RfD) is an estimate (with uncertainty spanning perhaps an order of magnitude) of a daily exposure to the human population (including sensitive subgroups) that is likely to be without an appreciable risk of deleterious effects during a lifetime. It is expressed in mg/kg/day. The Agency for Toxic Substances and Disease Registry (ATSDR) lifetime Minimal Risk Levels (MRLs), World Health Organization (WHO) Tolerable Daily Intakes (TDIs), WHO and Food and Drug Administration (FDA) Acceptable Daily Intakes (ADIs), and the Institute of Medicine (IOM) nutrient Tolerable Upper Intake Levels (ULs) are roughly equivalent to the RfD.

² For this exercise cancer potency was evaluated as the concentration in drinking water equivalent to an excess cancer risk of one case in 10,000 (10⁻⁴).
This value is given in the Office of Water (OW) Drinking Water Standards and Health Advisories Tables and also is included in all Integrated Risk Information System (IRIS) Summary documents. When the 10⁻⁴ risk value is not available, it can be calculated from a cancer slope factor.

³ NOAEL is a No-Observed-Adverse-Effect Level. It is the highest dose in a toxicological study or a group of studies that has no observed adverse effect.

⁴ LOAEL is a Lowest-Observed-Adverse-Effect Level. It is the lowest dose in a toxicological study or a group of studies that causes an adverse health effect.

⁵ An oral median Lethal Dose (LD50) is an estimate of the oral dose that will cause the death of 50 percent of the exposed animals. LD50 data are based on acute exposures with limited post-exposure observations of the animals for cause of mortality, clinical signs, and gross pathology.

⁶ The 2002 Edition of the Drinking Water Standards and Health Advisories was used for the RfD and 10⁻⁴ risk values.

2.1.1.1 Potency Data - Calibrating Scales and Scoring

Once the data for the learning set of chemicals were collected, they were arrayed and graphically displayed to analyze their range and distribution. For the initial evaluation, the range (in mg/kg/day) was divided into approximately ten equal units (deciles). This distribution was found to be highly skewed, with a large majority of the values falling in the decile of highest toxicity (see Exhibit 2 for an example). Two factors influenced this result. The first factor is that the range of values covered up to twelve orders of magnitude for the parameters evaluated.
The second factor is that the set of contaminants contained both toxic chemicals and those generally regarded as safe (in keeping with the principles), and there are far more toxicological data available in the literature on chemicals considered to be toxic than on those, like the nutrients, that are only weakly toxic. This shifts the volume of data toward the chemicals with higher potencies. Most chemicals that are generally regarded as safe have limited available toxicological data, as their nutritional and commercial uses do not indicate a potential hazard at low to moderate intakes.

Exhibit 2. Decile Distribution of RfD Values (mg/kg/day)
[Histogram; x-axis: RfD (mg/kg/day), in bins from <=0.1 to >0.9.]

The second distribution evaluated was based on logarithms (base 10) of the toxicity parameters rounded to the nearest integer (see Exhibits 3A-3D for examples).

Exhibit 3A. Logarithmic Distribution of RfD Values
[Histogram of rounded Log10 values.]

Exhibit 3B. Logarithmic Distribution of NOAEL Values
[Histogram of rounded Log10 values.]

Exhibit 3C. Logarithmic Distribution of LOAEL Values
[Histogram; x-axis: Round(Log10(LOAEL)).]

Exhibit 3D. Logarithmic Distribution of LD50 Values
[Histogram of rounded Log10 values.]

The decile distribution (Exhibit 2) was found to be undesirable in developing a protocol for scoring Potency because almost all of the chemicals are clustered at one end of the distribution. This does not provide a good distribution of scores for discrimination of differences. With the decile distribution, almost all of the chemicals in the learning set would have a high Potency score of 10.
Very few chemicals would have lower scores.

The distribution based on the rounded Log10 of the toxicity parameter spread the chemical toxicity parameters across the range, and the most frequent Log10 value is approximately in the middle of the range, making the curve roughly log-normal (Exhibits 3A-3D). It was for this reason that the Log10 distribution was selected for development of the scoring equation. The distribution of toxicity values is still somewhat skewed toward higher toxicity scores; however, this is a product of limited available data for the weakly toxic chemicals.

The log-based distribution was used to establish a scoring equation for Potency for each measure of toxicity. This was accomplished by assigning the most frequent (modal) value in the distribution a score of 5 on a 10-point scale and solving an equation for each type of toxicity parameter that would make that distributional value equal a score of 5. For example, in Exhibit 3A (RfD values), the most frequent value is a rounded logarithm of -2 (0.01). The scoring equation for the RfD values was developed as follows:

5 = 10 - (most frequent rounded log + X)
5 = 10 - (-2 + X)
5 = 10 + 2 - X
5 = 12 - X
5 - 12 = -X
-7 = -X
7 = X

Accordingly, the equation for scoring the RfD values is

Score = 10 - (rounded log of RfD + 7)

The scoring equations for the other measures of toxicity were derived from the modal rounded logarithm values of their distributions in a similar fashion. As displayed in Exhibit 3, the position of the modal rounded log differed for each of the measures of toxicity, and necessitated differing equations for each measure. These equations are summarized in Exhibit 4.

Exhibit 4. Scoring Equations for Potency

RfD                  Score = 10 - (Log10 of RfD + 7)
NOAEL                Score = 10 - (Log10 of NOAEL + 4)
LOAEL                Score = 10 - (Log10 of LOAEL + 4)
LD50                 Score = 10 - (Log10 of LD50 + 2)
10⁻⁴ cancer risk¹    Score = 10 - (Log10 of the 10⁻⁴ cancer risk + 6)

¹ The concentration in water for the 10⁻⁴ cancer risk was selected as the measure of potency for carcinogens because this is the value given in the Standards and Drinking Water Health Advisories Tables prepared by OW and also is provided in IRIS Summaries. Changing the reference value to the 10⁻⁶ risk would merely shift the rounded log value and the constant by two integers but would not change the score.

Scores were restricted to whole number values with a maximum of 10 and a minimum of 1. Some distributions for toxicity parameters span a range greater than ten orders of magnitude. EPA decided that calculated scores less than 1 would be given scores of 1 and calculated scores greater than 10 would be given scores of 10, which combines the chemicals at the tails of the distributions. Conversely, for the distributions that covered less than 10 orders of magnitude, no attempt was made to normalize the scores across a range of ten because the learning set is limited and could have been expanded by searching for chemicals that are more toxic than the most toxic substance in the learning set (dioxin, with an RfD of 1 x 10⁻⁹ mg/kg/day) and less toxic than the least toxic chemical in the learning set (phosphorus, with an RfD-equivalent of 57 mg/kg/day derived from the Institute of Medicine (IOM) UL).

However, an adjustment was made to accommodate LD50 values that are reported as greater than a specific numerical dose. In such a case, the highest dose used in the study did not cause death in 50 percent of the tested animals, indicating that the chemical is less toxic than would be indicated by the highest dose tested.
Accordingly, the LD50 equation was modified to accommodate this situation and became:

LD50 Score = 10 - (Log10 of >LD50 + 3)

This change to the LD50 equation decreases the Potency score from that derived from the numeric value of the LD50 by one to accommodate the "greater than" designation. A similar adjustment was made for situations where the NOAEL in a critical study was the highest dose tested.

The distribution for cancer effects is the most skewed of those examined (see Exhibit 5). There are a greater number of chemicals that are more potent carcinogens when compared to those in the modal grouping than there are those that are less potent. This is not unusual because cancer bioassays are costly and there is an incentive to invest resources in studying chemicals that have a high likelihood of being potent carcinogens. No attempt was made to further normalize the cancer scores across a range of 10. For the chemicals in the learning set, the lowest cancer Potency score is 3.

Exhibit 5. Logarithmic Distribution of Cancer Potency Values
[Figure not reproduced. x-axis: Round(Log10(E4)), bins from -2 to 4, More.]

2.1.1.2 Evaluation of the Potency Scoring Protocol

All of the chemicals in the learning set were scored for each toxicity parameter to examine the consistency across scores for the non-cancer measures of Potency. Some examples of this evaluation are provided in Exhibit 6. Since the mechanisms that lead to the development of cancer involve some biological responses that are unique to tumors, the 10^-4 cancer risk values were not included in this comparison. The scores for individual chemicals were compared across the toxicity values, and the agreement between scores was evaluated.

Exhibit 6. Potency Scores for Chemicals in the Learning Set

Chemical                               RfD   NOAEL   LOAEL   LD50
Calcium (Calcium chloride for LD50)    1     ND      4       5
Cyanazine                              6     6       6       6
Dioxin (2,3,7,8-TCDD)                  10    ND      10      4
Hexazinone                             4     5       4       5
Iodine (Sodium iodide for LD50)        5     8       8       4
Methyl ethyl ketone                    3     3       3       5
Methyl parathion                       7     8       7       7
Naphthalene                            5     4       4       5
Phenol                                 4     4       4       5
Vitamin D                              6     9       9       ND

ND = No data

In addition, the scoring equations were applied to selected chemicals that were not in the learning set using data available in the Agency for Toxic Substances and Disease Registry (ATSDR) Toxicological Profiles. Those results are summarized in Exhibit 7. The scores were evaluated for consistency across parameters.

Exhibit 7. Potency Scores for Chemicals Not in the Learning Set

Chemical        RfD-equivalent   NOAEL         LOAEL         LD50
                (mg/kg/day)      (mg/kg/day)   (mg/kg/day)   (mg/kg)
Acrylonitrile   4                5             5             6
Ethion          6                7             6             6
Malathion       5                6             5             5
Endosulfan      6                7             ND            5

ND = No data

The agreement of non-cancer scores across the RfD, NOAEL, LOAEL and LD50 inputs was evaluated. There were 216 chemicals in the learning set; 13.5 percent of those with multiple non-cancer scores had identical scores across all parameters (see cyanazine in Exhibit 6). For 54.6 percent, the scores deviated by 1 integer (see hexazinone in Exhibit 6); 20.5 percent deviated by 2 integers (see methyl ethyl ketone in Exhibit 6). There was a 3-integer deviation for only 9.7 percent, and the majority of those were inorganic compounds (see iodine [sodium iodide] in Exhibit 6). Only 1.6 percent deviated by more than 3 integers (see dioxin in Exhibit 6). Scores deviated by two integers or less for 88.6 percent of the chemicals. The difference between scores for a given compound was greatest for the relatively non-toxic chemicals.
In almost all cases the NOAEL and LOAEL scores were higher than the RfD score, effectively negating the concern that the inclusion of uncertainty factors in the calculation of the RfD would inflate the Potency score. For those chemicals with low uncertainty factors, the NOAEL or LOAEL scores were often 3 or more integers higher than the RfD scores (see calcium chloride and vitamin D in Exhibit 6).

Since most chemicals with RfD values are also likely to have NOAEL, LOAEL, and/or LD50 values, a policy decision was needed with regard to how one should select the parameter used to score for a non-cancer endpoint. Since there is a general consistency among scores, EPA determined that a hierarchy of RfD > NOAEL > LOAEL > LD50 would be used. In cases where a NOAEL is higher than the lowest LOAEL, the LOAEL would be used in its place. This hierarchy gives preference to the Potency value with the richest supporting data set (the RfD or equivalent values) and the lowest ranking to the LD50 because it is a measure of acute rather than chronic toxicity. When comparing cancer and non-cancer scores, it was determined that the end point (cancer or non-cancer) that provided the highest measure of Potency would be used to score the candidate.

Similar to the screening protocols, EPA applied the potency scoring protocol for LOAELs to contaminants with MRDDs. The Agency did conduct additional searches to identify the best available information to characterize the potency and severity for chemicals on the PCCL, including pharmaceuticals. If additional information was not found, the Agency relied on the data used in the screening step.

These evaluations were used to develop the scales and hierarchy of data used in the Potency Scoring Protocol, which is presented in Appendix A.
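The Potency scoring rules described in this section (the Exhibit 4 equations, the limits of 1 and 10, the "greater than" adjustment, and the data hierarchy) can be illustrated with a short Python sketch. It is not EPA's implementation: all function and variable names are hypothetical, the handling of exact half values when rounding is an assumption (the text says only "rounded to the nearest integer"), and applying the same one-point adjustment to a NOAEL that was the highest dose tested is an assumption based on the statement that "a similar adjustment" was made.

```python
import math

# Constants from Exhibit 4: Score = 10 - (rounded Log10 of value + constant).
POTENCY_CONSTANTS = {"RfD": 7, "NOAEL": 4, "LOAEL": 4, "LD50": 2, "cancer_1e-4": 6}


def constant_from_mode(modal_rounded_log):
    """Solve 5 = 10 - (modal_rounded_log + X) for X, as in the RfD derivation."""
    return 5 - modal_rounded_log


def rounded_log10(value):
    """Round Log10(value) to the nearest integer (half values round up: an assumption)."""
    return math.floor(math.log10(value) + 0.5)


def potency_score(measure, value, greater_than=False):
    """Apply an Exhibit 4 equation and restrict the result to the range 1 to 10.

    greater_than=True models a value reported as ">" the highest dose tested;
    the constant is raised by one, which lowers the score by one.
    """
    constant = POTENCY_CONSTANTS[measure]
    if greater_than:
        constant += 1
    raw = 10 - (rounded_log10(value) + constant)
    return max(1, min(10, raw))


def select_noncancer_measure(values):
    """Pick the non-cancer measure using the hierarchy RfD > NOAEL > LOAEL > LD50.

    `values` maps a measure name to its dose value (hypothetical input format).
    A NOAEL that is higher than the LOAEL is passed over in favor of the LOAEL.
    """
    if values.get("RfD") is not None:
        return "RfD"
    noael, loael = values.get("NOAEL"), values.get("LOAEL")
    if noael is not None and (loael is None or noael <= loael):
        return "NOAEL"
    if loael is not None:
        return "LOAEL"
    if values.get("LD50") is not None:
        return "LD50"
    return None


def final_potency_score(noncancer_score, cancer_score):
    """Use whichever end point (cancer or non-cancer) gives the higher score."""
    available = [s for s in (noncancer_score, cancer_score) if s is not None]
    return max(available) if available else None


if __name__ == "__main__":
    print(constant_from_mode(-2))        # constant for the RfD equation
    print(potency_score("RfD", 0.01))    # modal RfD value
    print(potency_score("RfD", 1e-9))    # dioxin RfD from the text
    print(potency_score("RfD", 57))      # phosphorus RfD-equivalent from the text
```

The two learning-set extremes named in the text (dioxin and phosphorus) are useful checks because both fall outside or at the edge of the 1-to-10 range before the limits are applied.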
2.1.2 Severity

Severity refers to the relative impact of an adverse physiological change caused by a xenobiotic chemical in humans or animals on the ability of the human or animal to function and survive in the environment. The sixteenth century physician, Paracelsus, provided the underlying principle for the toxicological sciences with the axiom "the dose makes the poison." Just as toxicity increases with dose, so too does the Severity of the observed effect, in most cases. A low dose effect could be a simple increase in liver weight while the same chemical at a higher dose could cause cirrhosis of the liver. For that reason, the measure of Severity that will be used for scoring in the CCL process is the effect or effects seen at the LOAEL. Restricting Severity scores to the effects occurring at the LOAEL ties them to the data used to derive the Potency score - the type of data likely to be available for CCL candidates. This approach is consistent with the advice provided by the NRC and NDWAC (NRC 2001, NDWAC 2004).

The Severity measures that will be used for CCL scoring differ from those used for Potency, Prevalence, and Magnitude because they are descriptive rather than quantitative. Accordingly, they are less amenable to automation and often require more scientific judgment in their application. The sections that follow describe the approach that was used to derive the scoring protocol for Severity and to evaluate its performance.

2.1.2.1 Severity - Scales and Scoring

In developing the protocol for scoring Severity, EPA began with the system used by the NRC (2001) for their case study on methods for selecting a CCL from a PCCL. The NRC Severity scoring protocol was based on the anticipated clinical impact of the most sensitive endpoint in affected individuals. The NRC prototype for scoring Severity is provided in Exhibit 8.

Exhibit 8. NRC Severity Scoring Proposal

Score   Description
0       No effect
1       Changes in organ weights with minimal clinical significance
2       Biochemical changes with minimal clinical significance
3       Pathology of minimum clinical importance (e.g., fluorosis, warts, common cold)
4       Cellular changes that could lead to disease; minimum functional change
5       Significant functional changes that are reversible (e.g., diarrhea)
6       Irreversible changes; treatable disease
7       Single organ system pathology and function loss
8       Multiple organ system pathology and function loss
9       Disease likely leading to death
10      Death

In trying to apply the NRC Severity prototype using the critical effects from EPA IRIS Health Risk Assessments, EPA toxicologists encountered difficulty because of the clinical components of the prototype. It was difficult to determine clinical outcomes such as function loss, treatability, or potential for mortality from the critical effects identified in IRIS. In addition, some of the features of a clinical progression could be influenced by the availability and affordability of treatment. EPA decided that it would not be appropriate to use a scoring scheme that had economic and environmental justice implications.

The critical effect data for PCCL contaminants will, in most cases, be expressed using terminology very similar to the terminology found in the IRIS database. Accordingly, critical effects of 100 IRIS chemicals were compiled and grouped into categories by EPA toxicologists. These categories were, in turn, used to build a scoring scale that applied some of the rationale reflected in the NRC prototype, but utilized the critical effects information most likely to be available from databases such as IRIS, which eliminated outcome judgments that would confound the scoring process. In this exercise, some difficulties were encountered in scoring Severity, particularly with assigning the middle score categories (3, 4, 5, and 6) and with classifying different types of cancer.
Accordingly, the scoring protocol was modified again to try to provide better discrimination between the effects associated with the middle scores and remove the medical treatment considerations. Two new scoring options were developed. One was a nine-point scheme and the other a five-point scheme.

Testing of the two new scoring schemes was conducted by EPA toxicologists in the Health and Ecological Criteria Division of the Office of Water. Each toxicologist was presented with all the critical effects given in IRIS, with no knowledge of the chemical or chemicals to which they were attached, and the revised scoring protocols. They were asked to independently score the large group of critical effect descriptions. The toxicologists met as a group to compare scores and reach consensus on the score and category that is best suited for each critical effect. The five-point scale was compared to the nine-point scale. After completion of this exercise, the nine-point scale displayed in Exhibit 9 was selected based on its ease of use, more transparent clustering of effects within scoring categories, and consistency across the individual scores assigned by toxicologists.

Exhibit 9. Final Nine-Point Scoring Protocol for Severity

Score 1. Critical effect: No adverse effect.
Score 2. Critical effect: Cosmetic effects. Interpretation: Considers those effects that alter the appearance of the body without affecting structure or functions.
Score 3. Critical effect: Reversible effects; differences in organ weights, body weights or changes in biochemical parameters with minimal clinical significance. Interpretation: Transient, adaptive effects.
Score 4. Critical effect: Cellular/physiological changes that could lead to disorders (risk factors or precursor effects). Interpretation: Considers cellular/physiological changes in the body that are used as indicators of possible adverse systemic damage.
Score 5. Critical effect: Significant functional changes that are reversible or permanent changes of minimal toxicological significance. Interpretation: Considers those disorders in which the removal of chemical exposure will restore health back to prior condition.
Score 6. Critical effect: Significant, irreversible, non-lethal conditions or disorders. Interpretation: Considers those disorders that persist for over a long period of time but do not lead to death.
Score 7. Critical effect: Developmental or reproductive effects leading to major dysfunction. Interpretation: Considers those chemicals that cause developmental effects or that impact the ability of a population to reproduce.
Score 8. Critical effect: Tumors or disorders likely leading to death. Interpretation: Considers chemical exposures that result in a fatal disorder and all types of tumors.
Score 9. Critical effect: Death.

The consensus judgment of the EPA toxicologists was used to construct a compendium of nearly 250 critical effect descriptions grouped by their severity scores (e.g., "Chronic irritation without histopathology changes" equals a score of 3). The final Severity protocol and compendium of critical effects are provided in Appendix A.

The ordering of the nine-point scale, which clusters developmental and reproductive effects at a score of 7 and assigns tumors or disorders likely leading to death a score of 8, became a point of discussion. Some reviewers of the protocol felt that a separation of developmental and reproductive effects by the seriousness of the outcome was better than the clustered approach. This option was discussed during internal review of model outcomes (Chapter 4) by an internal EPA reviewer panel. The Agency reviewers decided that the benefits of the proposed scale outweighed potential drawbacks.
The ability to clearly identify PCCL chemicals with even a slight developmental, reproductive, or tumorigenic effect through their Severity score is a benefit of the Exhibit 9 scoring system.

The scoring scale's "uneven steps" were also noted as a point of concern. A detailed exploration of alternative options, which included the collapse or reordering of the categories, resulted in a consensus judgment to retain the current scale. The current Severity scale works well in providing a meaningful categorization of the array of critical effects. Given the range of critical effects that result from a given exposure, it is not possible to have a consistent difference in the Severity of the outcome between each step on the scale.

2.1.2.2 Evaluation of the Severity Scoring Protocol

The Severity scoring protocol was evaluated using the group of chemicals that were included in the training data set discussed in Chapter 3 of this report. Evaluation criteria included:

• Ease of scoring using the protocol and critical effect compendium
• Correlation of the list or not list decisions made by workgroup members using the written narrative descriptions of the critical effects with those made with the numeric scores
• Outcomes from the algorithm list/no-list decisions (discussed in Chapter 4) using the scored data as compared with the workgroup's decisions based on the descriptive data

During the initial evaluation process several issues were identified. The most challenging issue related to Severity scores derived from LD50 Potency data. According to the scoring protocol, the Severity score for an LD50 Potency value would be based on the outcome of death in the test population and result in a Severity score of 9. The same score of 9 would be given to a LOAEL or RfD from a more chronic study where the critical effect was described as decreased survival or longevity.
When the evaluators' decisions based on descriptive information for both the Potency and Severity were compared to the decisions based on scores, it was apparent that the evaluators looked at the two effects differently. A decrease in survival from a standard chronic study was regarded as a more serious concern than death in an LD50 study, where death is the targeted outcome.

Several options were considered for solving this problem. The simplest option was to have no Severity score for an LD50 based Potency value. Another option was to retrieve the study that was the basis of the LD50 value and use the critical effect and dose for systemic effects observed rather than death. The last option was to look for a Potency value and critical effect from a toxicity study other than an LD50 study.

Experimentation with the three options for Severity based on LD50 values demonstrated that a combination of the second and third options provided a feasible alternative to scoring Severity on the basis of death when the Potency value was an LD50. The option of eliminating the Severity score for an LD50 value was determined to be a poor choice since it fails to make full use of the available data. It was decided that an LD50 based score of 9 would be assigned only when attempts to identify an alternate study and/or pre-mortality effects in the LD50 study failed.

A problem was encountered with critical effect information for LOAELs from the RTECS database. This database summarized all effects without specifying which one was the critical effect. In cases where the original data source was available in the supplemental data, it was consulted to identify which effect was critical. When the supplemental data identified a NOAEL for the critical study, it replaced the RTECS LOAEL.
If the original source could not be accessed, an alternative NOAEL or LOAEL and its critical effect(s) were identified from the supplemental data and replaced the RTECS LOAEL. Two guidelines were applied when choosing the replacement option. In most cases a replacement was made only if the new LOAEL was lower than the RTECS value. However, in some cases the alternate value, although greater than the RTECS LOAEL, was chosen because it was from a study that was higher in quality, more accessible, and more recent than the RTECS citation. In any case where RTECS remained the only source for the data, the score for Severity was based on the most serious of the cluster of effects presented.

Some problems with scoring were encountered in cases where critical effects were not included in the critical effect compendium. The compendium of critical effects descriptors was developed to allow people who were not toxicologists to score chemicals based on Severity. In cases where the scorers could not determine a Severity score, the data were submitted to EPA toxicologists. A minimum of three toxicologists scored the critical effect. The consensus score was determined and the critical effect descriptor and its score were added to the critical effect compendium.

One Severity scoring factor that may have had an effect on the correlation between the classification algorithm-based list/no-list decisions (see Chapter 4) and EPA decisions for the Training Data Set was the numeric Severity score of 8 for carcinogens. The only critical effect to score 8 was carcinogenicity. Workgroup members could easily identify carcinogens by their Severity score and possibly placed more emphasis on this result than the other numeric scores. The classification algorithm was less able to do so, particularly for carcinogens with low Potency values.
For example, in some cases, the algorithm made a "no-list" decision when the Severity score was 8 and the expert evaluators made a "list" decision primarily because of the Severity score's linkage to cancer. This was particularly true in a couple of cases where all the other scored values were identical or close to identical but Severity was a 7 compared to an 8 (cancer). The decisions for the algorithm and EPA matched more closely when Severity was a 7 than when it was an 8, with EPA more likely to choose a list decision for the 8 Severity score than the algorithm.

In most cases, the combination of Potency and Severity scores performed well in EPA exercises used in developing the PCCL to CCL process and the algorithm trials that followed (Chapter 4). Alternative approaches were adopted for dealing with LD50 based Potency values, and critical effect terms that were not initially in the critical effects compendium were added. Finding an alternative to an LD50 Severity score of 9 and consulting supplemental sources for critical effect information increased the effort required to obtain the Severity data, but appeared to function well. These changes are reflected in the Severity Scoring Protocol and Compendium of Critical Effects in Appendix A.

2.2 Occurrence Attributes

The attributes selected to define actual or potential occurrence of contaminants in drinking water are Prevalence and Magnitude. Magnitude is related to the quantity (e.g., concentration) of a contaminant that may be in the environment. Prevalence provides a measure of how widespread the occurrence of the contaminant is in the environment. When direct occurrence data are not available, Persistence and Mobility data are used as surrogate indicators of potential occurrence of a contaminant.
Persistence-Mobility is defined by chemical properties that measure or estimate environmental fate characteristics of a contaminant and affect its likelihood of occurring in the water environment. Similar to the health effects attributes, the occurrence attributes are interrelated. The data sources and the learning sets used to define and scale Magnitude, Prevalence, and Persistence-Mobility, as well as more details about the individual attributes, are described in the following sections.

Unlike the health effects attributes, the data elements used to characterize occurrence are not solely based on a disciplined progressive study of the contaminants. The availability of data from surveys of contaminants in ambient and drinking water, the detection limits of analytical methods, limitations in reporting requirements, as well as indirect measures of potential occurrence needed to be considered and evaluated. Data sources that could provide occurrence data ranged from direct measures of concentrations in water to annual measures of environmental release or production.

The most relevant data for characterizing demonstrated occurrence are monitoring studies or surveys designed to assess national occurrence in drinking water. Finished drinking water occurrence data sources that have been compiled include the Unregulated Contaminant Monitoring Regulations (UCMR), the National Drinking Water Contaminant Occurrence Database (NCOD) (Round 1 and Round 2 unregulated contaminant data), and the National Inorganic and Radionuclide Survey (NIRS).

Finished water occurrence data are not available for many chemicals; therefore, other types of data that provide measures of potential occurrence in public water systems (PWSs) need to be considered. EPA identified national monitoring studies of occurrence in ambient waters, which may be the eventual source waters for drinking water supplies.
Two US Geological Survey (USGS) data sources provide information on source water occurrence for CCL: the National Water Quality Assessment Program (NAWQA) and studies related to the National Reconnaissance of Emerging Contaminants. These sources provide direct measures of occurrence in potential source water and indicate possible occurrence in PWSs.

Many of the chemicals evaluated through the CCL process will not have direct water measurements (finished or ambient). Other available sources that provide data about the potential for drinking water occurrence include:

• the EPA Toxics Release Inventory (TRI), which reports annual volumes of chemicals released from industrial applications and the number of states in which those releases occur;
• the National Center for Food and Agricultural Policy's National Pesticide Use Database, which provides estimates of the amount of pesticide applied and the number of states in which it is applied; and
• EPA's Chemical Update System/Inventory Update Rule (CUS/IUR), a source for annual production volume data under the Toxic Substances Control Act. Note that the CUS/IUR data are categorical (i.e., chemicals are in categories with a range of production values, such as 500,000 to 1,000,000 pounds).

2.2.1 Prevalence and Magnitude Data Elements

A learning data set of 207 chemicals was compiled and used to develop and calibrate scales for scoring the Magnitude and Prevalence attributes. Due to the linkage of the data used, the scaling and scoring evaluations were performed concurrently. The linkage between Magnitude and Prevalence measures is shown in Exhibit 10. The Magnitude measure indicates the median concentration of detections in water or the total pounds of the chemical released into the environment.
The median was selected over the mean because it typically is a more stable estimate of central tendency in environmental occurrence data. Outliers have a strong influence on means, often to the extent that the mean is greater than all but the maximum value (particularly when only detections are used in the calculation). The median of detections was selected over the median of all measurements in water because all measurements would include non-detections. Non-detections signify either that the chemical is not occurring or that the analytical method is unable to measure the chemical below the detection limit. The inclusion of non-detections reduces the median value and, for the majority of environmental chemicals, the median would be a "less than" value (i.e., less than the reporting limit, or a "non-detect" value). This would provide little information and limited discrimination among the chemicals.

Prevalence uses the same data source as Magnitude. The linked Prevalence measure provides an indicator of how widely the contaminant may be present; in general, Prevalence shows the proportion of monitoring sites or states with detections or releases.

Exhibit 10. Relationship of Data Elements Used to Score Magnitude and Prevalence

Magnitude Data: Median concentration of detections from finished water systems.
Prevalence Data: Percent of finished water systems nationally with detections of a contaminant.

Magnitude Data: Median concentration of detections from ambient water sites.
Prevalence Data: Percent of ambient water sites nationally with detections of a contaminant.

Magnitude Data: Amount of total releases nationally in TRI; annual, in pounds.
Prevalence Data: Number of states reporting releases of the chemical in the Toxics Release Inventory.

Sections 2.2.2 and 2.2.3 discuss the approach used to develop and calibrate the scales for scoring Prevalence, and Sections 2.2.4 through 2.2.7 discuss the approach for Magnitude, including the use of Persistence and Mobility scores as a surrogate for Magnitude when Production volume is used for Prevalence.
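The reasoning behind the choice of the median of detections can be illustrated with a small Python sketch. The sample values below are invented for illustration and do not come from any monitoring data set; the function name and the use of None to mark a non-detection are also assumptions made for this example.

```python
import statistics

# Hypothetical concentrations for one contaminant; None marks a non-detection.
SAMPLES = [None, None, None, None, None, None, 0.25, 0.5, 0.75, 40.0]


def magnitude_statistics(samples):
    """Summarize one contaminant's measurements using detections only."""
    detections = [c for c in samples if c is not None]
    nondetects = len(samples) - len(detections)
    return {
        # Magnitude measure described in the text.
        "median_of_detections": statistics.median(detections),
        # Shown for comparison: a single outlier pulls the mean upward.
        "mean_of_detections": statistics.mean(detections),
        # Linked Prevalence-style measure: share of samples with a detection.
        "percent_detections": 100 * len(detections) / len(samples),
        # True when more than half of all measurements are non-detections,
        # so the median of all measurements would itself be a non-detect.
        "median_of_all_is_nondetect": nondetects * 2 > len(samples),
    }


if __name__ == "__main__":
    for name, value in magnitude_statistics(SAMPLES).items():
        print(name, value)
```

In this invented example the single high value makes the mean of detections larger than every detection except the maximum, while the median of detections stays within the bulk of the detected values, and the median of all measurements would be a non-detect.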
2.2.2 Prevalence - Calibrating Scales and Scoring

Prevalence is a measure of a contaminant's occurrence across the United States. It uses measures such as:

• Contaminant detections from Drinking Water Monitoring Programs
• Contaminant detections from Ambient Water Monitoring
• States where pesticides are applied
• States reporting releases of a given chemical to the environment
• Production of commodity chemicals in pounds per year

These Prevalence measures have finite ranges, such as zero to 100 percent of PWSs or 1 to 50 states, depending on the reporting requirements of the available data source. Accordingly, transformations to log-based distributions are not necessary. The scaling analyses for Prevalence focused on establishing groupings of the chemicals across the scoring scale. The analyses began with equal bin distributions. Both 100 percent of sites with detections and 50 states with releases divide equally into ten bins based on deciles.

In the case of Prevalence, the bins provided a fairly good fit to the distribution. However, they still required some adjustment because the equal bins had a tendency to segregate contaminants by type. Contaminants with the highest percent detections, scoring a 9 or 10, were naturally occurring inorganic contaminants. For example, in the National Inorganic and Radionuclide Survey for ground water, ions such as sodium, calcium, and iron were all detected in > 90% of the groundwater systems sampled. Contaminants with the highest releases were mostly the high-use pesticides applied in nearly all the agricultural states or high-use commodity chemicals with reported discharges from manufacturing or distribution sites in a large number of states, such as the benzene, ethyl benzene, toluene, and xylene impurities in petroleum products.
Creating ten equal bins from the number of states with environmental releases resulted in a scale where a Prevalence score of 10 meant that releases had to be reported from 45 or more states. EPA revised the scale for release data so that if more than half the states (25) reported releases, the chemical would receive a Prevalence score of 10, indicating that the contaminant's potential for occurrence was relatively high. The percent of detections in finished and ambient water (i.e., percent of systems/sites) were also adjusted to ensure that the most widely detected organic chemicals received more representative scores when compared to the naturally occurring inorganic compounds (IOCs).

Among occurrence data elements, the linkage between the Prevalence measures and Magnitude measures works well for the water measurements and environmental release measures. It does not work well in the cases when only annual Production data are available. The Production data provide a measure of pounds of a chemical product produced annually in the United States, but these data do not provide a linked measure such as the number of states in which it is produced or used. This production rate represents the commercial importance of the chemical to some extent. Since high production tonnage suggests wide use of a commodity chemical, EPA decided that production data would be used as a measure for likely Prevalence across the country. For example, a chemical produced at a billion pounds per year is more likely to be used and released more widely than a compound produced at only 10,000 pounds per year. Experimentation to examine the correlation of Prevalence scores based on measures of detections in water and the number of states receiving environmental releases, based on production, supported this hypothesis.
Correlations were only fair to good but justified the use of production data as a measure of Prevalence when other data on the spatial spread of a contaminant across the United States are not available.

Following appropriate adjustments to ensure that there was adequate representation of organic and inorganic contaminants across the ten-point scale and a reasonable distribution of the scores based on release data, the Prevalence scoring scales were finalized. The Prevalence scoring protocol is presented in Appendix A.

2.2.3 Evaluation of the Prevalence Protocol

The relationship between production or even environmental release data and the actual occurrence in drinking water is complex. Exhibit 11 shows the scores for several contaminants based on the finalized Prevalence scoring scales. As expected, in some cases the agreement of scores across these differing data elements was not good. For example, a chemical like glyphosate scores very high for environmental release, but its water occurrence scores are very low because of the chemical and physical properties that influence its fate and transport in the environment, restrictions on use locations, and drinking water treatment.

Exhibit 11. Comparison of Prevalence Scores for Learning Set Contaminants

Chemical                    Potable water samples  Total TRI Releases  Pesticide Applications  Production
                            (% PWS detect.)        (# states)          (# states)              (lbs/year)
Calcium                     10                     NA                  NA                      8
Atrazine                    9                      8                   10                      7
Glyphosate                  2                      ND                  10                      NA
Metribuzin                  1                      4                   10                      NA
Toluene                     9                      10                  NA                      9
Trichloroethylene           9                      10                  NA                      8
1,1,2,2-Tetrachloroethane   3                      6                   NA                      7

The contaminants in Exhibit 11 indicate that, when the correlation between possible Prevalence scores is weak, the major difference (e.g., glyphosate) is between the finished water score and the production/release scores. This supported the decision to use a hierarchy of data elements for Prevalence.
Where actual water measurements are available, they are the Prevalence measure of choice because they are the most direct measures of likely occurrence in drinking water. The hierarchy selected for use in scoring Prevalence is as follows:

• Percent of PWSs with detections (national scale data)
• Percent of ambient water sites or samples with detections (national scale data)
• Number of states reporting application of the contaminant as a pesticide
• Number of states reporting releases (total) of the chemical
• Production volume in pounds per year

2.2.4 Magnitude - Calibrating Scales and Scoring

To scale the Magnitude attribute, EPA conducted an evaluation to identify possible correlations among data elements. First, a comprehensive universe of finished water quality data was compiled, including the national occurrence database of regulated contaminants (compiled for the first 6-Year Regulatory Review), the historic data from various unregulated contaminant monitoring programs (noted as NCOD Rounds 1 and 2, above), and the data from MRS. This provided a comprehensive array of data covering the expected distribution range of Magnitude for any new contaminant, ranging from high median concentrations for some naturally occurring inorganic ions or elements to non-detect values for some trace organic chemicals.

The NRC (2001) had initially recommended that Magnitude be scored based on its relationship to Potency. In its pilot study, the NRC proposed that the Magnitude score be the square root of the product of the median concentration score (based on its position in a decile distribution) and the Potency score. A median concentration that fell within the lowest decile of the distribution would receive a 1 and one in the highest decile a 10 for the calculation.
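The NRC calculation can be sketched as follows. The function name is hypothetical, and the reading of the formula as the square root of the product of the two scores is an interpretation, adopted because it is consistent with the NRC scores reported in Exhibit 12. The input scores below are taken from that exhibit.

```python
import math


def nrc_magnitude_score(concentration_score, potency_score):
    """Square root of (decile concentration score x Potency score).

    Hypothetical helper; this reading of the NRC proposal is an
    interpretation that agrees with the scores shown in Exhibit 12.
    """
    return math.sqrt(concentration_score * potency_score)


# (Potency score, median concentration score) as listed in Exhibit 12
exhibit_12_scores = {
    "Aldrin": (10, 1),
    "Hexachlorobutadiene": (7, 1),
    "Manganese": (4, 1),
    "Metribuzin": (5, 1),
    "Naphthalene": (5, 1),
    "Sodium": (1, 10),
}

for name, (potency, concentration) in exhibit_12_scores.items():
    print(name, round(nrc_magnitude_score(concentration, potency), 1))
```

Because the two scores enter the product symmetrically, a high-Potency, low-concentration contaminant and a low-Potency, high-concentration contaminant can receive the same result.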
EPA evaluated the NRC approach to scoring Magnitude and found that it was not feasible for the following reasons:

• The NRC equation cannot be applied when the Magnitude data are based on environmental release or chemical/physical properties.
• A decile distribution for the median concentration values results in low scores for almost all organic chemicals because of the high concentration of geochemical inorganic contaminants present in water (see Exhibit 12).
• Application of the NRC equation did not provide a good measure of relative Magnitude (see aldrin and sodium in Exhibit 12). A high concentration, low Potency combination can receive the same score as a low concentration, high Potency combination.

To examine the efficacy of the NRC approach, EPA applied it to six of the chemicals from CCL 1 for which regulatory determinations had been made and which thus had the Potency and occurrence data necessary to calculate the magnitude scores. The results of that evaluation are summarized in Exhibit 12.

Exhibit 12. Comparison of the NRC Magnitude Score with the Ratio of the Health Advisory Guideline to the Concentration in Finished Water

Contaminant          Potency Benchmark   Potency  Median Concentration  Concentration  Magnitude  Potency Benchmark:
                     (mg/L)              Score    (mg/L)                Score          NRC score  Concentration Ratio [1]
Aldrin               0.000002            10       0.0006                1              3.2        0.003
Hexachlorobutadiene  0.0009              7        0.001                 1              2.6        0.9
Manganese            0.3                 4        0.01                  1              2          30
Metribuzin           0.07                5        0.001                 1              2.2        70
Naphthalene          0.1                 5        0.001                 1              2.2        100
Sodium               120                 1        16.4                  10             3.2        7.3

[1] The Potency Benchmark is the Health Advisory guideline (cancer or non-cancer) for a lifetime exposure for all chemicals except sodium. The guideline for sodium is derived from the recommended dietary intake for sodium in adults, 2.4 g/day ÷ 2 L/day, using a Relative Source Contribution of 10%. The Potency Scores were derived from the RfD-equivalent or 10^-4 cancer risk values.
The concentration scores were obtained by using sodium as the upper level for the range and dividing the range into deciles as recommended by NRC.

As indicated in Exhibit 12, the NRC score does not display a consistent relationship to the ratio of the potency-based drinking water guideline to the median finished water concentration. Aldrin, the contaminant from Exhibit 12 that is present in drinking water at the levels of greatest concern, has the same magnitude score as sodium ion, which is only weakly toxic and not present at a concentration of concern for anyone other than those on very low sodium diets. In addition, as mentioned above, the decile distribution of concentrations resulted in a score of 1 for any contaminant present in water at concentrations lower than 1.6 mg/L (one tenth of the sodium concentration). Given this distribution, only inorganic contaminants are likely to receive intermediate scores on the concentration scale. Because of the observed limitations in the NRC proposed approach, EPA determined that it was not appropriate for scoring Magnitude.

The second approach that was investigated used the Health Reference Level (HRL) to establish the scores for Magnitude. For example, the largest dose that received a Potency score of 10 was converted to a mg/L equivalent using the HRL methodology. Anything less than that concentration received a 1 on the Magnitude scale. Each log-based Potency value was paired with a log-based concentration. A Potency score of 10, when paired with any Magnitude score, would be suggestive of concern because the concentration was greater than the Potency. However, a Potency score of 8 would only give rise to concern if the Magnitude score was 3 or greater (see Exhibit 13).

Exhibit 13. Magnitude Concentrations and Scores Derived from Potency Doses

Potency Score  Potency Range (mg/kg/day)       Concentration equivalent (mg/L)  Magnitude Score
10             0 to 3.16 x 10^-7               0 to 2.2 x 10^-6                 1
9              3.17 x 10^-7 to 3.16 x 10^-6    >2.2 x 10^-6 to 2.2 x 10^-5      2
8              3.17 x 10^-6 to 3.16 x 10^-5    >2.2 x 10^-5 to 2.2 x 10^-4      3

This second approach to relating Potency and Magnitude proved to be unwieldy because the two scales are inversely related. It was also problematic because it could not be used for Potency values based on NOAELs, LOAELs, and LD50s, or Magnitudes that were not expressed in concentration terms. It also did not take into account the differences in the HRL determination process for carcinogens versus non-carcinogens.

EPA next explored a variety of potential scales that could be applied to the finished water concentration data without consideration of Potency. EPA converted the finished water data to a standard unit of measure (µg/L) and evaluated several ranges of concentrations to correspond to magnitude scores. Exhibits 14A through 14C illustrate the comparisons of three of the approaches evaluated for the organic and inorganic contaminants. Exhibit 15 shows the differentiation in scores across the three experimental approaches.

The first approach was to develop scales that utilized the array of compiled Magnitude data and 10 bins with approximately equal numbers of contaminants in each bin, referred to as the equal number bins scale in Exhibit 14A. Equal bins did not provide a good dispersion of scores. Accordingly, various log-scale options were explored. The Magnitude data do not range across as many orders of magnitude as the Potency RfD data, so various semi-logarithmic scales were evaluated to better represent the distribution of values across the scale. In evaluating and developing the calibration scale, the water occurrence data presented a particular challenge because the IOCs tended to skew the results.
Many IOCs result from various anthropogenic processes, but most are of geologic origin as well, and they have relatively high measures for both Prevalence and Magnitude compared to most organic chemicals. Hence, for some of the semi-logarithmic Magnitude scales (e.g., Half-Log Option A), the only chemicals that could score high (e.g., a 10 or 9) would be IOCs. Such a scale would depress the score for organic chemicals that are of equally high concern because of their expectedly lower concentrations.

One approach that EPA evaluated was using different scales for IOCs and organic chemicals; however, having two scales would make the scoring process overly complex. To keep the process straightforward and transparent, it was decided to use one scale for all water data. Accordingly, the scores were distributed across the range of values so that organic contaminants could receive high scores as well as the IOCs. Comparisons and adjustments were made until the current protocols, using a semi-logarithmic scale (Half-Log Option B shown in Exhibit 14C), were selected.

Exhibit 14A. Equal Bins Drinking Water Magnitude Scale (µg/L)
[Figure: counts of organic and inorganic contaminants by category and break points (µg/L)]

Exhibit 14B. Half Log Option A Drinking Water Magnitude Scale (µg/L)
[Figure: counts of organic and inorganic contaminants by category and break points]

Exhibit 14C. Half Log Option B Drinking Water Magnitude Scale (µg/L)
[Figure: counts of organic and inorganic contaminants by category and break points]

Exhibit 15. Magnitude Attribute Scores: Example Contaminants Scored by their Median of Detections Using the Various Approaches in Exhibit 14.
Chemical                    "Bins" Score  Half-Log Option A Score  Half-Log Option B Score
Hexachlorobutadiene         2             2                        5
1,1,2,2-Tetrachloroethane   3             3                        6
Boron                       10            6                        10
Sulfate                     10            10                       10
Antimony                    9             4                        7
Ethylbenzene                6             4                        6
Endothall                   10            6                        9
Methyl ethyl ketone         5             3                        6

When developing the calibration scales for the release data, the ranges of data were similarly arrayed using a scale based on half-log units with a distribution of scores that reflected the distribution of the data in the learning set.

2.2.5 Persistence-Mobility as a Surrogate Measure for Magnitude

In cases where production data are the only measure of occurrence, scoring for Prevalence and Magnitude becomes difficult. The NRC discussed Persistence and Mobility as a fifth attribute and had suggested it could be used to predict possible occurrence if other direct measures were not available. In its review, NDWAC suggested that Persistence and Mobility could provide a surrogate measure of Prevalence with production used as a measure of Magnitude.

To examine the NDWAC proposal, EPA carried out a series of exercises in which scores for Magnitude derived from concentrations in drinking water and environmental releases were examined to see if they correlated with production scores and with Persistence-Mobility scores calculated using the scoring equation developed by NDWAC. In no case was the correlation as good as one might desire, but it was apparent that the Persistence-Mobility approach showed a better correlation with the Magnitude scores, based on the preferred data elements (concentration/release), than the production information. Accordingly, EPA chose to use Persistence-Mobility as a surrogate measure for Magnitude.

Persistence and Mobility are environmental fate parameters. They are considered in combination as a measure of potential occurrence because both transport (i.e., Mobility) and fate (i.e.,
Persistence) are important when predicting whether a contaminant is likely to be found in water at a specific location, in situations where there is an environmental source for the contaminant.

The length of time a chemical remains in the environment before it is degraded (Persistence) affects its importance as a potential drinking water contaminant. Persistence is generally expressed as rate of degradation or half-life (t1/2), indicating, in this case, the length of time required for the chemical to degrade to half its original concentration in the medium of interest (e.g., water). Similarly, the Mobility of a chemical, or its ability to be transported to and in water, affects its potential to reach and dissolve in the source waters for a PWS.

There are a number of data elements that measure the fate of a chemical in the environment. The physical/chemical parameters that are most relevant to the fate in drinking water are summarized in Exhibit 16. The first four measures of mobility represent the equilibrium ratio for the partitioning of the contaminant from one medium to another: Koc (sediment:water), Kow (octanol:water), Kd (soil:water), and Henry's Law Coefficient (air:water). Koc, Kow, and Kd are sometimes expressed as logs of the original measurements. The measures of persistence each reflect the time the chemical will remain unchanged in the environment.

Exhibit 16. Mobility and Persistence Data Elements

MOBILITY                                      PERSISTENCE
Organic Carbon Partition Coefficient (Koc)    Half-Life
Octanol/Water Partition Coefficient (Kow)     Measured Degradation Rate
Soil/Water Distribution Coefficient (Kd)      Modeled Degradation Rate
Henry's Law Coefficient (KH)
Solubility

The data elements listed in the table above are arranged in hierarchical order, with the most desirable at the top (i.e., the first data to be used if available).
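The hierarchical use of data elements can be illustrated with a minimal sketch. The ordering follows Exhibit 16; the function name, dictionary keys, and sample values are hypothetical and are shown only to make the "first available" rule concrete.

```python
# Data elements in the order listed in Exhibit 16 (most desirable first).
MOBILITY_HIERARCHY = ("Koc", "Kow", "Kd", "KH", "Solubility")
PERSISTENCE_HIERARCHY = (
    "Half-Life",
    "Measured Degradation Rate",
    "Modeled Degradation Rate",
)


def first_available(data, hierarchy):
    """Return (element name, value) for the highest-ranked element with data.

    data:      mapping of element name to reported value, or None if missing
    hierarchy: element names ordered from most to least desirable
    Returns None when no element in the hierarchy has a value.
    """
    for name in hierarchy:
        value = data.get(name)
        if value is not None:
            return name, value
    return None


# Hypothetical chemical with no Koc value: Kow is used for Mobility.
chemical = {"Kow": 2.5, "Solubility": 30.0, "Modeled Degradation Rate": 0.1}
print(first_available(chemical, MOBILITY_HIERARCHY))
print(first_available(chemical, PERSISTENCE_HIERARCHY))
```

The same pattern applies to the Prevalence and Magnitude hierarchies, with the ordered tuple replaced by the corresponding list of data elements.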
Organic Carbon Partitioning Coefficient (Koc) is one of the most common indicators of the mobility of a chemical in water. A high Koc increases the probability that, once a chemical reaches a receiving water body, it will remain bound to sediments or adjacent soils and thus partition slowly from the sediment to the water column. A high Koc favors the presence of the contaminant in water for a long time but at low concentrations, since the Koc will favor the sediment over the water. A high solubility favors rapid dissolution in the water body from a nearby source and potentially high concentrations if the water source is confined and the environmental release substantial.

2.2.6 Persistence-Mobility Data - Calibrating Scales and Scoring

Many of the measurements of environmental fate properties vary depending on the actual field or laboratory conditions. Some are reported in standard data sources only as ranges or categorical descriptions. Scoring was further complicated by the fact that two separate environmental fate parameters were used in the scoring of the one attribute. Accordingly, EPA selected the approach proposed by NRC and supported by the NDWAC for using the Persistence-Mobility information after experimenting with several other approaches.

The Persistence and Mobility data were arrayed, or partitioned, into relatively simple low-medium-high categories as suggested by NRC. Published definitions for the categories were used, such as the categories for Koc from Fetter, 1994 and the classifications for the octanol water partition coefficient (Kow) from Lyman et al., 1990. The categories are given values of 1, 2, or 3 based on the ranking of the measurement from low to high. The persistence value is averaged with the mobility value and a multiplier (10/3) is used to translate the score to a 10-point scale (see the Persistence-Mobility Protocol in Appendix A for details).
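The averaging and scaling step can be written out as a short sketch. The function name is hypothetical, and rounding to the nearest integer is an assumption; it is consistent with the Persistence/Mobility scores of 5, 7, 8, and 10 that appear in Exhibit 17, but the authoritative rule is the protocol in Appendix A.

```python
def persistence_mobility_score(persistence_rank, mobility_rank):
    """Combine low/medium/high ranks (1, 2, or 3) into a 10-point score.

    The two ranks are averaged and multiplied by 10/3. Rounding to the
    nearest integer is an assumption made for this illustration.
    """
    for rank in (persistence_rank, mobility_rank):
        if rank not in (1, 2, 3):
            raise ValueError("ranks must be 1 (low), 2 (medium), or 3 (high)")
    average = (persistence_rank + mobility_rank) / 2
    return round(average * 10 / 3)


# All nine rank combinations and the resulting scores
for p in (1, 2, 3):
    for m in (1, 2, 3):
        print(p, m, persistence_mobility_score(p, m))
```

Only five distinct scores are possible under this scheme (3, 5, 7, 8, and 10), so the surrogate measure is coarser than the scales based on measured concentrations or releases.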
Since the persistence and mobility data are being used as a measure of Magnitude, a low ranking (1) for a parameter is one that will minimize the concentration in water and a high ranking (3) is one that will maximize the concentration. For example, a high Koc means that the distribution between the water column and sediment favors the sediment and is ranked as low, while a lower Koc means that the ratio of a contaminant in sediment to that in the water allows a larger portion of the total to be in the water and is ranked as high.

As mentioned above, EPA undertook a series of evaluations to compare the Persistence-Mobility scores for selected contaminants to the Magnitude scores derived from the preferred data elements (concentrations in water or environmental releases). Often, data were not available for a half-life or a measured degradation rate for the Persistence value. In these cases, EPA's PBT Profiler was tested and added to the Persistence protocol to ensure both Mobility and Persistence data were used to calculate the attribute score (www.pbtprofile.net).

The PBT Profiler was developed as a screening tool to identify pollution prevention opportunities for chemicals without experimental data. Among other endpoints, it estimates environmental Persistence for organic chemicals.[7] In addition to estimating a degradation rate, the PBT Profiler also estimates the percentage of a chemical that partitions to soil, sediment, water, and air compartments. As a last option, in cases where other chemical property data are not available, the amount of a chemical that is predicted to partition to the water phase by the PBT Profiler (the percent in water, a measure of solubility) is used to score Mobility.
EPA recognized that the Persistence-Mobility protocol can result in relatively high scores (7 to 10) in cases where more direct data elements for scoring are not available. However, given the uncertainty associated with some of the Persistence-Mobility data elements, EPA decided the somewhat conservative scores were acceptable as surrogate measures for Magnitude when only these data were available for scoring.

2.2.7 Evaluation of the Magnitude Protocol

The occurrence data clearly vary in how directly they measure demonstrated or potential occurrence related to drinking water. Exhibit 17 compares the scores for several chemicals using the different measures of Magnitude. In all cases the finished water Magnitude score is higher than the score for ambient water. Scores for pesticide application rates are higher than those for TRI releases. As was the case for Prevalence, EPA determined that a hierarchy would be used in scoring Magnitude. The hierarchy developed uses finished water occurrence data if available.

Exhibit 17. Comparison of Scores derived using the Magnitude Protocol

Chemical                    CASRN     Finished Water  Ambient Water  Pesticide     Total TRI   Persistence/
                                      Concentration   Concentration  Release Data  (lbs/year)  Mobility
                                      Median (µg/L)   Median (µg/L)  (lbs/year)
Calcium                     7440702   10              -              -             -           10
Atrazine                    1912249   6               4              10            8           8
Glyphosate                  1071836   2               -              10            -           7
Metribuzin                  21087649  7               3              8             2           7
Toluene                     108883    6               4              -             7           5
Trichloroethylene           79016     7               4              -             10          10
1,1,2,2-Tetrachloroethane   79345     6               5              -             4           7

[7] The PBT program will not accept inorganics as input, and identifies the elements that, if present, prevent the profiling of a particular chemical. The only exceptions to this rule are sodium, potassium, and ammonium salts of organic acids, which can be profiled. Thus, the PBT Profiler cannot be used for inorganics or organometallics.
However, as drinking water ions, inorganic contaminants are generally present as salts and do not degrade, and thus are assigned a score of "3" (high persistence). See Appendix A for a more complete review.

The hierarchy suggested for Magnitude draws on the following data sets:

• Median concentration of detections from finished water systems
• Median concentration of detections from ambient water sites or samples
• Amount of pesticide applied
• Amount of total releases
• Persistence-Mobility data

2.3 Fine Tuning the Protocols

As discussed in the previous sections, EPA developed and fine-tuned the Attribute Scoring Protocols through a step-wise process of data selection, data analysis, calibration of scales, and evaluation of the functionality of the scores in PCCL to CCL decision-making. The decision-making component of the process examined the ability of the scored attributes to adequately represent the level of concern about contaminants. The testing also evaluated whether or not the scores provide a consistent input to the decision-making portion of the CCL listing process that is relatively independent of the type of input data that provides the basis for the score.

Quality assurance measures compared list-not list determinations made by a panel of EPA subject matter experts based on descriptive and quantitative measures of health effects and occurrence (raw data) with determinations based on the scored attributes. Differences in decisions were identified. The panel discussed those differences and the rationale they had used to reach decisions based on the raw data versus the scored data. Minor adjustments were made to the scoring protocols based on those discussions.
Using a training data set of contaminants (Chapter 3), blinded test-case decisions made with raw data versus scored results, or decisions based on one data element in a hierarchy versus another, were compared. The results provided a high level of confidence that the scores, while not capturing all information experts used in making decisions based on raw data, adequately captured the critical relationships that informed "list" versus "don't list" determinations made by the EPA panel.

3.0 DEFINITIONS AND OVERVIEW OF THE TRAINING DATA SET

This chapter describes the process used to identify a set of chemicals to train (or calibrate) the classification models discussed in the next chapter. The raw data, attribute scores, and protocols discussed in Chapter 2 were applied to these contaminants, and that information is carried forward in the evaluation of classification models discussed in Chapter 4.

The training data set (TDS) for chemicals is the set of data used to train (or teach) the classification models to mimic expert list-not list decisions. The TDS used to train the models for CCL 3 comprised 202 discrete sets of attribute scores for contaminants and consensus list-not list decisions made by a team of EPA subject matter experts.

Classification models are algorithms that use statistical approaches for pattern recognition and derive mathematical relationships among input variables (measurements or descriptive data) and output from a TDS. For the CCL, the classification models are used to develop a relationship between the contaminant attribute scores (input variables) and the classification of these contaminants into list-not list categories (output). The mathematical relationship between attribute scores and list-not list decisions is determined based on the classification decisions on TDS chemicals and their associated data.
Once the TDS is used to train the classification model, the model is then applied to a larger list of contaminants to predict their classifications, list or not list.

The process for developing the TDS utilized EPA subject matter experts familiar with the technical aspects of the attribute data and the selection of drinking water contaminants for listing and regulation. The subject matter experts represented drinking water, toxicology, public health, engineering, and statistics disciplines.

3.1 Key Considerations

EPA considered the following key factors in developing the training data set:

• Selection of contaminants representing a range of outcomes and decisions likely to be encountered in developing a CCL;
• A variety of input data ensuring adequate coverage of possible attribute scores and combinations of scores;
• Chemicals that, when present in drinking water, would present a meaningful opportunity for public health improvement if regulated; and
• Contaminants that would likely be selected for the PCCL.

3.2 Developing Key Components of the Training Data Set

3.2.1 Attribute Scores

Attribute scores are a critical component of the TDS, as mentioned in Chapter 2. The TDS used for training the classification models consisted of attribute scores for 202 contaminants. A set of known chemicals was chosen to develop the TDS and supplemented with a range of attribute scores that represented hypothetical or artificial contaminants. These artificial contaminants were developed to fill voids in the space of possible attribute scores and improve classification model results.

3.2.2 Attribute scores for real contaminants

Initially, EPA selected "data rich" contaminants from among regulated contaminants and previous CCLs because they had a range of readily available occurrence and health effects information. EPA drinking water subject matter experts and stakeholders (as part of the NDWAC process) reviewed the initial list of contaminants and identified candidates for the TDS.
Based upon an NRC and NDWAC recommendation, EPA also added chemicals "generally regarded as safe" by the U.S. Food and Drug Administration to provide adequate coverage of possible attribute inputs and a range of list-not list decisions. This initial selection process identified 51 chemical contaminants for the TDS.

Subsequently, EPA chose 50 additional contaminants from the CCL 3 Universe. These 50 contaminants were randomly selected from those with high health effects toxicity levels that had occurrence data because they represented contaminants likely to make it to the PCCL. The addition of these 50 contaminants resulted in 101 contaminants with data to score attributes.

To aid in the review and evaluation, data summary sheets were prepared for each contaminant that included available health effects, occurrence, and environmental fate data. All the available health effects and occurrence, use, and fate data that could be used to develop the attribute scores for Potency, Severity, Magnitude, and Prevalence were included on the individual summary sheets. The data summary sheets are presented in Appendix B.

While contaminant names were included in the initial evaluations, EPA subject matter experts found that knowledge of the contaminant name introduced bias into the decision-making process. Subsequently, EPA "blinded" contaminant names or identifiers in contaminant evaluations to increase objectivity and force decisions to be made solely on the available data and associated attribute scores. The names of contaminants were revealed after the "blinded" evaluations. The attribute scores were developed according to the Attribute Scoring Protocols discussed in Chapter 2 and presented in Appendix A.
3.2.3 Attribute scores for hypothetical contaminants

The performance of the classification models using the initial TDS gave an indication of gaps in the possible attribute space that the set of 101 TDS contaminants did not adequately cover. This led EPA to add a set of 101 hypothetical contaminants to the TDS. These contaminants had specific combinations of attribute scores designed to fill gaps in the space defined by all possible attribute scores and to improve the performance of the models.

EPA identified 16 general ranges of scores using all four attributes and permutations of high or low scores. The majority of these possible scores were selected using Latin hypercube sampling from the set of all possible attribute score combinations, as seen in Exhibit 18 (NIST, 2006). Five contaminants were selected at random from each of the 16 "cubes" represented by the combinations of high (6-10) and low (1-5) scores for the four attributes. This selection resulted in 80 hypothetical contaminants. Twenty-one additional contaminants were deliberately selected to fill in some obvious voids in the 4-attribute space, resulting in 101 artificial contaminants.

Exhibit 18. Combinations of low and high attribute scores for the four attributes using Latin Hypercube Sampling.

Potency  Severity  Prevalence  Magnitude
Low      Low       Low         Low
Low      Low       Low         High
Low      Low       High        Low
Low      Low       High        High
Low      High      Low         Low
Low      High      Low         High
Low      High      High        Low
Low      High      High        High
High     Low       Low         Low
High     Low       Low         High
High     Low       High        Low
High     Low       High        High
High     High      Low         Low
High     High      Low         High
High     High      High        Low
High     High      High        High

Low scores are randomly sampled from the range 1-5. High scores are randomly sampled from the range 6-10.

Exhibit 19 displays the attribute space coverage of the 101 contaminants compared to the attribute space coverage of the TDS of 202 contaminants.
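The selection of the 80 sampled hypothetical contaminants can be approximated with the sketch below. It draws five random score sets from each of the 16 low/high combinations in Exhibit 18, which is a stratified random sample; it does not implement the additional constraints of a full Latin hypercube design, and the function name, seed, and use of Python's random module are assumptions made for illustration rather than a description of the software EPA used.

```python
import itertools
import random

ATTRIBUTES = ("Potency", "Severity", "Prevalence", "Magnitude")
SCORE_RANGES = {"Low": (1, 5), "High": (6, 10)}


def sample_hypothetical_contaminants(per_cell=5, seed=0):
    """Draw per_cell random score sets from each of the 16 low/high cells.

    Returns a list of dicts mapping attribute name to an integer score.
    Duplicate score sets are possible because draws are independent.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    samples = []
    # itertools.product enumerates the 16 rows of Exhibit 18 in order.
    for levels in itertools.product(("Low", "High"), repeat=len(ATTRIBUTES)):
        for _ in range(per_cell):
            scores = [rng.randint(*SCORE_RANGES[level]) for level in levels]
            samples.append(dict(zip(ATTRIBUTES, scores)))
    return samples


hypothetical = sample_hypothetical_contaminants()
print(len(hypothetical))  # 16 cells x 5 draws
print(hypothetical[0])
```

The 21 deliberately selected contaminants are not reproduced here because the text does not list their scores.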
The combination of real and artificial contaminants resulted in 202 scored candidates that became the TDS. The total attribute space for a model that includes four attributes with scores from 1 to 10 is 10,000 combinations of possible attribute scores. Each point plotted in Exhibit 19 represents one chemical in the TDS and one of the 10,000 possible combinations of attribute scores.

Exhibit 19. Attribute Space for the 101 TDS compared to that for the 202 TDS
Color of square indicates its classification decision. List = red; List? = beige; Not List? = light blue; and Not List = dark blue (also see Exhibit 29).
[Figure: two panels of colored squares; horizontal axis Potency with Severity nested within each Potency square, vertical axis Prevalence with Magnitude nested within each Prevalence square]

This graphical analysis shows five elements of the model results in a single graph: the four attributes evaluated and the categorical decision (L, L?, NL?, and NL). Note in Exhibit 19 that the vertical and horizontal axes show two attributes on each axis. The attribute scores for Potency are the large squares across the horizontal axis. The corresponding score for Severity is a separate scale within each larger square. That is, each Potency square has a range of Severity scores. Similarly, the Prevalence and Magnitude scores are plotted on the vertical axis, Prevalence as the large squares along the vertical axis and Magnitude as a separate scale within each larger square. The decision category assigned to each combination of attribute scores is color coded (NL decisions are denoted by dark blue, NL? by lighter blue, L? by beige, and L decisions by red).

3.3 Making List-Not list Decisions

List-not list decisions are the second key component of the TDS, as mentioned in Chapter 3.
The EPA subject matter experts made list-not list decisions on an individual basis and as a group, based on attribute scores and based on data that had not been converted to attribute scores (actual or raw data). The development of the list-not list decisions was an iterative process that incorporated revisions to the attribute scoring protocols, and to the final list-not list decisions, as experience was gained by the EPA experts. Differences between the decisions based on the scored attributes and the raw data were resolved by revising the scoring protocols to improve the correlation of scores to the raw data.

After evaluating the health effects and occurrence data for each contaminant, each subject matter expert individually decided how to classify the contaminant, and the experts then met as a group to discuss their decisions. Early in the process the subject matter experts recognized that clear list or not list classification decisions could easily be made for some contaminants, but not for others. The chemicals in the latter group were placed into categories of List? (L?) or Not list? (NL?), in which L? signifies that the decision is leaning towards listing but with some uncertainty, and NL? signifies that the decision is leaning towards not listing, but with some uncertainty. These additional two categories were incorporated into the evaluation process.

As part of the iterative process, the subject matter experts discussed their classification results and made adjustments to the process accordingly. When adjustments changed attribute scoring protocols, TDS contaminants were rescored and reevaluated.

Individual decisions were made separately based upon either the raw data or attribute scores. Decisions based upon raw data utilized health effects and occurrence data elements, as well as supporting information on fate and uses. For decisions based on attribute scores, only the numeric individual scores were used.
The scores were developed from the raw data using the protocols for Potency, Severity, Prevalence, and Magnitude. In both cases, this evaluation was conducted "blinded," meaning contaminant names were not shown. Appendix C summarizes decisions based upon raw data and attribute scores.

For each contaminant, comparisons were made between the list-not list decisions based upon raw data and those based on scores. Subject matter experts discussed the similarities and differences on an individual contaminant basis, and revised the attribute protocols to reflect decisions made on the actual data (see Chapter 2).

Once list or not list classification decisions were made based on the attribute scores using the revised protocols, consensus among the EPA subject matter experts was used as the final decision for each contaminant. This consensus decision was used to train the models and is further discussed in Chapter 4. Consensus decisions were made by averaging the numerical decisions of individual reviewers (L = 4, L? = 3, NL? = 2, and NL = 1) and rounding to the nearest integer. The rounded averages became the consensus values used to train and evaluate the models (Chapter 4). Appendix C also provides the consensus decisions for each TDS contaminant.

4.0 PROTOTYPE CLASSIFICATION MODELS AND THE CCL PROCESS

The NRC recommended EPA use prototype classification models for CCL selection, citing the limitations of expert processes and other rule-based models. NDWAC agreed that EPA should use a prototype model, also noting that this should improve the reproducibility and transparency in the process. This kind of approach does not eliminate subjectivity but rather makes the judgments more explicit.

Prototype classification models are often described as pattern recognition models.
These models develop statistical relationships (to recognize the patterns) among input variables (attributes, discussed in Chapter 2) of drinking water contaminants to predict their classification ("List," "List?," "Not List?," and "Not List"). The model determines the relationship or rule that links the input to the output based on the decisions made on the TDS (Chapter 3) and then uses that relationship to classify PCCL contaminants based on their attribute scores.

In its study, the NRC experimented with a linear discriminant model and an artificial neural network (ANN) model to demonstrate the use of classification approaches. EPA, working with NDWAC, identified the following classes of models for evaluation (NDWAC 2004):

• Artificial Neural Networks,
• Classification Decision Trees (with univariate and multivariate splitting rules),
• Linear Models, and
• Multivariate Adaptive Regression Splines (MARS).

The model evaluation was a two-step process. First was the evaluation and selection of the most appropriate ("best-fit") model from within each of the model classes. The second step was the evaluation of the performance of the best models selected from each class. Following these evaluations, two of the models were rejected and three were maintained to inform the final expert review process.

Artificial Neural Networks (ANNs) - ANNs are information processing models conceptually based on the human nervous system and its learning processes. ANNs apply flexible and often very complex parameterization. Their value is that they use flexible, non-linear functions that can capture almost any kind of underlying relationship between input and output data. For classification purposes, ANNs apply weighting in non-linear functions and do not specify a strict functional form (such as quadratic or cubic equations) as do many statistical models.
Classification Decision Trees - The decision tree classifies the sample by devising a series of tests (or rules, from the TDS) that are mutually exclusive in outcome. The graphical tree is derived with a test at each node, with the outcomes of the test branching from that node. Hence, in moving through the tree a contaminant encounters the test at a node, and is sent down one branch or another based on how its attribute(s) meets the test criterion, usually a simple inequality such as "Is Magnitude < 3.5?" (true or false). Eventually the contaminant reaches a terminal node (the last node, that no longer branches) that assigns the classification (e.g., category 2 = NL?). Two types of decision tree models were explored: Classification and Regression Tree (CART), which utilized univariate (one attribute at a time) tests at nodes, and the Quick, Unbiased, Efficient Statistical Tree (QUEST) model, which utilized multivariate (weighted sum of all attributes) tests at all nodes of the tree.

Linear Models - General Linear Models - Two types of linear models were tried. A Logistic regression model was applied to deal with CCL's categorical data. The Logistic model was only attempted using two categories (List and Not List). EPA found that the binary approach was not satisfactory, and moved to a four category approach. Recognizing that the ANN models often employ logistic regression, to avoid duplication, the Logistic model was dropped from the final evaluations. Consequently, the data were adapted for use with a regular Linear Regression model. This model estimates EPA's average classification (on a scale of 1 to 4; 1 = Not List, 2 = Not List?, etc.) for each contaminant as a linear combination of the contaminant's four attribute scores.
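The linear combination described above can be illustrated with a short sketch. The coefficients below are the posterior means reported for the linear model in Exhibit 23; the mapping of the predicted average to a category by clamping to 1-4 and rounding to the nearest integer is an assumption made for illustration, and the function names are hypothetical.

```python
import math

# Posterior-mean parameters as reported in Exhibit 23 (b0 = intercept).
B0 = -1.674
B_POTENCY, B_SEVERITY, B_PREVALENCE, B_MAGNITUDE = 0.2410, 0.2170, 0.1157, 0.1699

CATEGORIES = {1: "NL", 2: "NL?", 3: "L?", 4: "L"}


def predict_average(potency, severity, prevalence, magnitude):
    """Estimated average classification on the 1-4 scale."""
    return (B0
            + B_POTENCY * potency
            + B_SEVERITY * severity
            + B_PREVALENCE * prevalence
            + B_MAGNITUDE * magnitude)


def to_category(value):
    """Assumed mapping: round half up, then clamp to the 1-4 range."""
    code = min(4, max(1, math.floor(value + 0.5)))
    return CATEGORIES[code]


estimate = predict_average(4, 8, 10, 10)
print(round(estimate, 2), to_category(estimate))  # 3.88 L
```

The example scores (4, 8, 10, 10) are those of the outlier discussed in Section 4.2.2, for which the report gives a model-estimated value of 3.88 (L).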
Multivariate Adaptive Regression Splines (MARS) - MARS is a non-parametric classification model sometimes referred to as a statistical neural network model. MARS has become widely used in data mining and exploratory analysis because it does not assume or impose any particular class of relationship (such as linear or logistic) on all the predictor variables and the outcomes. It can develop different regression relationships for different input variables.

4.1 Model Training and Development

Some software packages are designed to build, fit, and test models internally, while others require an expert user to develop the model. Generally, models are evaluated based on:

• the number of attributes that the model is able to consider,
• the types of relationships or mathematical functions that the model utilizes, and
• the model's ability to predict classification of the TDS.

For example, training a model can involve estimating the values of rule coefficients (such as $\beta_0$ and $\beta_1$ in the simple linear regression model $Y = \beta_0 + \beta_1 X + \varepsilon$), or determining some other aspect of model structure (such as the number of splits in a regression tree model) to improve how well the model classifies the existing data. Ideally, this training process minimizes the model's predictive error, thereby reducing incorrect model predictions.

"Over-fitting" is a concern when selecting a model. Any of the model classes can be made to fit a particular data set very well by making the model more complex (this usually means estimating more model parameters). However, the addition of model complexity can come at the cost of a loss of general applicability; the added complexity may capture the idiosyncrasies of the specific data set, but may not be representative of the broader processes that generate the data, and hence, may not perform well when applied to an unknown sample. Several methods were used as guidance to avoid over-fitting, depending on the specific type of model being tested.
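One widely used guard against over-fitting is k-fold cross-validation, which this section reports for the QUEST model. The sketch below shows only the generic mechanics of holding back one fold at a time; it is not the procedure implemented inside any of the software packages used by EPA, and the function names, seed, and shuffling scheme are assumptions.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Shuffle item indices and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_items, k=5, seed=0):
    """Yield (train, held_back) index lists, holding back one fold at a time."""
    folds = k_fold_indices(n_items, k, seed)
    for held_out in range(k):
        held_back = folds[held_out]
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, held_back


# 202 is the size of the final TDS; a model would be fit on `train`
# and its error rate measured on `held_back` in each round.
for train, held_back in train_test_splits(202, k=5):
    print(len(train), len(held_back))
```

Because every contaminant is held back exactly once, the pooled error on the held-back folds is an estimate of how the rule performs on data it was not fitted to.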
Software designed specifically for CART, ANN, and MARS was used for those methods. Appendix D lists the specific software sources that were used. These programs provide the user with a number of options to control the model building process. For example, QUEST software, used to produce a classification decision tree model with linear discriminant nodes, allows the user to specify the following:

• Minimal node size of the tree
• Splitting method (linear or univariate discriminants)
• Splitting criterion (likelihood ratio, Pearson chi-square, etc.)
• Pruning method (by coefficient of variation or by test sample)
• Number of folds for cross-validation

After the user selects the control options, the software does its best to fit the training data set. In general, the user is not able to view precisely how the software does its job, but is shown the final model, some statistics regarding its performance, and an indication of other alternatives that were considered. For example, the QUEST software outputs a list of decision trees and their summary statistics (numbers of nodes, error rates). QUEST also identifies the optimal tree and provides the tree's decision rule. In addition, QUEST reports the results of cross-validation tests, in which subsets of the training data are held back. The algorithm produces a rule to best fit the remaining data and this rule is then applied to the data that were held back. This gives a slightly greater error rate because (a) fewer data are used to estimate the model parameters and (b) data used for checking are independent of those used to estimate the parameters. Exhibits 20a and 20b compare QUEST Classifications based on the full training data set (Exhibit 20a) and 5-fold cross-validation (Exhibit 20b).

Exhibit 20a. QUEST Classifications Based on the Full Training Data Set (shaded cells are exact match with Expert Decisions)

                   Consensus Blinded Decisions
Model Decisions    4 (L)   3 (L?)   2 (NL?)   1 (NL)
4 (L)                42      13        0         0
3 (L?)                0      41        8         0
2 (NL?)               0       2       54         2
1 (NL)                0       0        3        37

Exhibit 20b. QUEST Classifications Based on 5-Fold Cross-Validation (shaded cells are exact match with Expert Decisions)

                   Consensus Blinded Decisions
Model Decisions    4 (L)   3 (L?)   2 (NL?)   1 (NL)
4 (L)                41      14        0         0
3 (L?)                1      37       10         0
2 (NL?)               0       5       50         8
1 (NL)                0       0        5        31

Unlike other models, the simple linear model did not depend on special software. Under this model, the average classification of the subject matter experts for a contaminant was estimated as a linear combination of attribute scores. Letting $Y_i$ be the subject matter experts' average classification for training set contaminant $i$, the model equation is:

$$Y_i = b_0 + b_{Pot}\,\mathrm{Pot}_i + b_{Sev}\,\mathrm{Sev}_i + b_{Prev}\,\mathrm{Prev}_i + b_{Mag}\,\mathrm{Mag}_i + \varepsilon$$

An intercept term ($b_0$) and coefficients for the four attributes ($b_{Pot}$, $b_{Sev}$, $b_{Prev}$, and $b_{Mag}$) were selected to maximize the likelihood of the TDS average classifications, given normal error structure ($\varepsilon$ is an error term that is normally distributed with mean zero). A residuals plot revealed that unanimous List and unanimous Not List contaminants were often predicted to have extreme errors, suggesting that perhaps the subject matter experts would have assigned some of these to more extreme categories, had they been available. Without censoring, the unanimous Lists were treated as observations of exactly 4.0 and the unanimous Not Lists were treated as observations of exactly 1.0.
Recognizing that these may be censored values, they are treated as ≥ 4.0 and ≤ 1.0, and the likelihood function is adjusted to include these as probability masses (probability of at least 4.0 and probability of at most 1.0) rather than probability densities (probability of exactly 4.0 and exactly 1.0). Maximum likelihood parameters appear to fit the data very well, and predict most TDS average decisions to within 0.25 units.

4.2 Model Sensitivity Analyses

Some analyses that were performed in the development process may be considered sensitivity analyses. These included the following:

• Training the models on subsets of the TDS. This included the partial TDS (as it was being developed) and cross-validation exercises, wherein randomly-selected contaminants were held back from training to provide independent error checks.
• Training after selected "outliers" are removed from the TDS. Those selected outliers found to have strong influence on the overall performance were investigated further to see if there were valid reasons for excluding them from the TDS.
• Graphical and statistical analyses. These analyses were used to identify significant differences in attribute "weights" or influence on model performance. If any attribute had been found to be insignificant, it could have been ignored, perhaps saving some data development resources. (Though attributes were found to have different weights, none was found to be insignificant.)

Rather than detail all of the sensitivity analyses conducted for all classes of models, the remainder of this chapter illustrates the analyses described above using selected applications.

4.2.1 Training with subsets of the TDS

Cross validation for QUEST is described under 4.1, above. Training with early subsets of the TDS (50 and 102 contaminants) produced mixed results for the five model classes. QUEST and linear models exhibited no logical inconsistencies, but ANN, MARS, and CART showed some problems.
Most dramatic was MARS, which placed contaminants with the very lowest health effects and occurrence scores in the List category. Clearly, additional training data was needed to overcome these difficulties. No class of model was eliminated on the basis of these findings.

The final TDS (size 202) allowed all of the classes to improve their performance. ANN was found to have no logical inconsistencies. Although MARS and CART improved significantly, both had some areas of non-monotonicity. This means that there were some cases where an increasing attribute score could lead to a decreasing classification for a contaminant. (This inconsistency is discussed and displayed graphically in Section 4.4.2.)

4.2.2 Training after Selected "Outliers" Are Removed From the TDS

The linear model was most sensitive to selected TDS contaminants. Fortunately, this model provided a number of tools for identifying outliers. While other models had the objective of minimizing the count of classification errors (or in the case of QUEST, a weighted sum of classification errors), the linear model attempted to minimize the deviance between its prediction and the average classifications for TDS contaminants. When the other models encountered an outlier (for example, a contaminant with very high attribute scores, but a classification of NL), they did not attempt to make the correct classification for the outlier because that would have meant making other errors for nearby contaminants. Including or not including such an outlier had no effect on the outcome. The linear model, in essence, attempted to minimize the squared estimation error, so outliers tended to have some influence on the linear model parameters. Residuals plots such as Exhibit 21 revealed potentially important outliers for the linear model.
Exhibit 21 shows the model-estimated versus team classification of one important outlier: a contaminant with scores (4, 8, 10, and 10) with a team-average classification of 3.17 (L?) and model-estimated value of 3.88 (L). Another contaminant has as large a residual (model = 1.53 and team = 2.33, both NL?). However, when the model was run first with one and then the other contaminant removed, only the first outlier was found to have a marked influence on the overall error rate (number of misclassifications and weighted sum of misclassifications).

When EPA's subject matter experts were asked about these two contaminants, they agreed that their classification for the first contaminant was influenced by their belief that it was a ubiquitous inorganic that should probably not be listed. When asked how the model should treat PCCL contaminants with such high Severity and occurrence levels, the team agreed that the correct decision would probably be to List the contaminant, but that the two tens for occurrence suggested that the contaminant was inorganic, biasing them towards the lower decision category. It was decided to drop this contaminant from training the linear model. Because it had negligible influence on the other models, it was included for them.

Exhibit 21. Linear Model-estimated versus Team Average Classification for the TDS

[Figure: scatter plot of the linear model-estimated classification against the Team Mean Decision (doesn't include perfect 1 = NL and 4 = L).]

The graphical displays discussed in Section 4.4.2 were used as additional checks for outliers. The outliers for the linear model were apparent when the training data set was plotted against the background display.
The inorganic contaminant that was eliminated from linear model training was seen to fall "between" two other contaminants that were both assigned to the List category - further evidence that its classification of L? may have been inappropriate, at least for the purpose of training this model.

4.2.3 Graphical and Statistical Analyses to Identify Significant Differences in Attribute "Weights" Or Influence on Model Performance

Graphical displays of model outputs (Section 4.4.2) revealed that all of the attributes were important. The ANN graph is the only means of studying the ANN rule, but QUEST and the linear model provide mathematical expressions that clarify the roles of the four attributes. For QUEST, each "node" of the tree involves comparing a weighted sum of attribute scores with a threshold. If the threshold is surpassed, then the "right" path is taken; otherwise, the "left" path is taken. The QUEST software is capable of using fewer than four attributes, and when trained with about half of the 202 TDS contaminants, it sometimes used only three of the four. When the full TDS was used, however, all four attributes were used at each of the tree's seven final nodes. At each node, the four attributes can be ranked in order of their model coefficient. Exhibit 22 shows the ranking of attributes for the nodes of the final QUEST tree.

Exhibit 22. Relative Weights of Attributes at QUEST Nodes (1 = greatest weight, 4 = least weight)

Node # (1)     1     2     3     4     5     1    28
Potency        1     1     2     1     1     1     2
Severity       2     2     1     3     2     3     1
Prevalence     4     4     3     4     4     4     3
Magnitude      3     3     4     2     3     2     4
N (2)        202   141    61    52    89    18    23

(1) Numbers as assigned by QUEST.
(2) N = Number of TDS contaminants that are evaluated at the node. All 202 are evaluated at the first node. Of these, 141 proceed to node 2, while the remaining 61 pass to node 3 (see Appendix E for additional details).
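The node test and the attribute ranking described above can be sketched as follows. The weights and threshold are invented for illustration and are not the coefficients of the final QUEST tree (those are given in Appendix E); ranking by absolute coefficient size is also an assumption, as are the class and method names.

```python
from dataclasses import dataclass

ATTRIBUTES = ("potency", "severity", "prevalence", "magnitude")


@dataclass(frozen=True)
class LinearNode:
    """One tree node with a multivariate (weighted-sum) split."""
    weights: tuple      # hypothetical coefficients, in ATTRIBUTES order
    threshold: float    # hypothetical split threshold

    def goes_right(self, scores):
        """Take the "right" path when the weighted sum surpasses the threshold."""
        weighted_sum = sum(w * s for w, s in zip(self.weights, scores))
        return weighted_sum > self.threshold

    def attribute_ranks(self):
        """Rank attributes by coefficient size (1 = greatest weight)."""
        by_weight = sorted(
            ATTRIBUTES,
            key=lambda name: -abs(self.weights[ATTRIBUTES.index(name)]),
        )
        return {name: rank for rank, name in enumerate(by_weight, start=1)}


node = LinearNode(weights=(0.40, 0.30, 0.10, 0.20), threshold=5.0)
print(node.goes_right((8, 7, 3, 4)))  # weighted sum 6.4 > 5.0, so True
print(node.attribute_ranks())
```

With these illustrative weights the ranking is Potency 1, Severity 2, Magnitude 3, Prevalence 4, which is the same ordering Exhibit 22 reports for the first node.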
Overall, it appears that Potency carries the most weight, followed by Severity, Magnitude, and Prevalence. The linear model assigns a weight to each attribute and the greatest of these is that of Potency, followed by Severity, Prevalence, and Magnitude (see also Appendix E). The order of Prevalence and Magnitude is the reverse of that found for QUEST.

The linear model also provides a means of testing the statistical significance of the intercept and four coefficients. Because the model accounts for possible censoring, this testing is not as simple as in a least-squares regression. Two methods were used to approximate the covariance matrix for this model. The first is based on the Fisher information, $J(\theta)$, where $\theta$ denotes the model parameters, derived using the likelihood function $L(\mathrm{data} \mid \theta)$:

$$J(\theta) = -E\left[\frac{\partial^2 \ln L(\mathrm{data} \mid \theta)}{\partial \theta^2} \,\middle|\, \theta\right]$$

The second used a Bayesian posterior sample of parameter values. This sample produced a covariance matrix that was nearly identical to that derived from the Fisher information, suggesting that the likelihood and posterior are very nearly multivariate normal. Hypothesis tests could therefore be conducted using the Markov Chain Monte Carlo (MCMC) sample (10,000 sets of parameter values).

Exhibit 23 below shows means, medians, and 95% credible intervals for the model parameters: b1 through b4 are the parameters for the four attributes (Potency, Severity, Prevalence, and Magnitude, respectively), b0 is an intercept term, and Phi is the precision (inverse of the error variance). The 95% intervals reveal that all of the attribute parameters are statistically significantly greater than zero.

Exhibit 23. Summary Statistics from MCMC Sample

Parameter    b0        b1       b2       b3       b4       Phi
Mean        -1.674    0.2410   0.2170   0.1157   0.1699   14.25
2.5%        -1.865    0.3343   0.2002   0.1033   0.1539   11.44
Median      -1.673    0.241    0.2169   0.1157   0.1699   14.22
97.5%       -1.488    0.2591   0.2342   0.1284   0.1858   17.41

Based on the MCMC sample, pairwise comparisons of attribute parameters were all found to be statistically significant. Separate weights are needed for the two health effects attributes and for the two occurrence attributes.

4.3 Model Performance Testing

The TDS, Attribute Scoring Protocols, and prototype model test results were linked together in an iterative process. Testing of the models in the early stages was impacted by changes and refinements in attribute scales, resulting changes in the scores, and changes in the composition of the TDS. These changes required iterative reevaluation of the models and resulted in many improvements that are part of this final analysis. Refinements in scoring are discussed further in Chapter 2 and development of the TDS in Chapter 3.

EPA also evaluated the impact of the attributes used by the models and the effects of missing data on the performance of the models during the various stages of development. During early stages of the model testing, the models were run with various sized TDSs. The CART and MARS models did not always use all four attributes with some of the smaller TDSs. However, all models used all four attributes when trained with the final TDS, consisting of 202 contaminants.

Exploratory analysis of the results revealed some additional problems with the CART and MARS models. When two contaminants have identical attribute scores for all but one attribute, the contaminant with the higher score for that attribute should logically be classified at least as high as the contaminant with the lower score. For example, if a contaminant with scores (4, 4, 4, 4) is assigned to the L? category, then a contaminant with scores (4, 4, 4, 5) should not be assigned lower, to category NL? or NL. Both the CART and MARS rules produced this type of misclassification, so neither model classified contaminants consistently.

Another problem with the CART and MARS models was their errors across two categories. Neither model consistently separated the NL? from the L contaminants or the L? from the NL contaminants. Because of these problems, and because of poor performance with respect to the training set decisions, EPA decided not to use these two models to inform PCCL to CCL decisions.

Three models, ANN, QUEST, and Linear Regression, consistently demonstrated the best performance when using the final TDS. Exhibit 24 lists the features of these three models.

Exhibit 24. Features of the Three Preferred Models Based on TDS Test Results

Artificial Neural Network
  Objective Function (to be minimized or maximized): Minimize count of training set errors
  Prediction: Rounded average subject matter expert classification
  Ranking Capability: Rank by Probability (Probability of List)
  Transparency of Optimization Method: Not transparent
  Classification Rule: Not clear, but classifications available for all attribute score combinations.
  Computation Speed: < 1 Second
  Software Cost: Version used is Freeware.

Classification Tree with Linear Nodes (QUEST)
  Objective Function (to be minimized or maximized): Minimize count of training set error loss OR minimize error loss
  Prediction: Rounded average subject matter expert classification
  Ranking Capability: Rank by classification and distance from discriminant (requires post-processing)
  Transparency of Optimization Method: Not transparent
  Classification Rule: Clear. Complex classification tree with linear inequalities for intermediate nodes
  Computation Speed: < 1 Second (but process for deriving distances for ranking is not part of software)
  Software Cost: Freeware

Linear Regression
  Objective Function (to be minimized or maximized): Maximize likelihood or minimize error loss
  Prediction: Average subject matter expert classification (not rounded)
  Ranking Capability: Rank by prediction
  Transparency of Optimization Method: Simple and transparent
  Classification Rule: Clear. Simple linear function of attribute scores.
  Computation Speed: < 1 Second
  Software Cost: No special software

4.4 Evaluating Classification Differences

This section describes how the classification models were assessed and compared with respect to:

• The number of correct and incorrect classifications for the 202 contaminants in the final TDS
• The number of "large" misclassifications (off by more than one category)
• The weighted sum of TDS classification errors
• Ability to identify intermediate classifications
• Consistent behavior (e.g., no decreasing classification as attribute scores increase)

As described in Section 3.3.1, the approach to classifying the TDS contaminants became a four-category decision (L, L?, NL?, and NL) to allow the EPA subject matter experts, experienced in making list or not list decisions, to identify the decisions that were not strong list or not list decisions. Accordingly, quantification of model performance as it compared with the decisions of the EPA subject matter experts had to consider a suite of various misclassification outcomes (Exhibit 21), such as a consensus decision that a contaminant should be a L?, but the model classifying it as a L. However, not all the misclassifications are considered to be equally serious. Of the differences, the most substantive misclassification would be placing a strong "List" contaminant in the "Not List" category. This might result in missing a key candidate for the CCL.
To consider the relative seriousness of the different kinds of misclassifications, EPA developed the classification error losses in terms of the weights displayed in Exhibit 25. Initially, the table had equal weights for all misclassifications and these were adjusted until EPA was comfortable that they represented the relative significance of the 12 misclassifications or errors that are possible. The most serious error (placing a List contaminant in the Not List category) has ten times the weight (i.e., a 10) of the least substantive difference (placing a contaminant one category too high, such as placing a List? contaminant in the List category, i.e., a value of 1).

Exhibit 25. Decision Comparison Matrix; Weight of Differences

Subject Matter          Model Decisions
Expert Decisions    Not list   Not list?   List?   List
Not list               ✓          1          2       3
Not list?              2          ✓          1       2
List?                  5          2          ✓       1
List                  10          5          2       ✓

The Decision Comparison Matrix and the quantitative weighting of differences were used to compare model results to EPA decisions. This was part of the process to minimize the losses and cost of the misclassifications. The models are tools to help classify and prioritize the contaminants for expert review at the end of the CCL process. After applying the models, EPA plans to scrutinize all of the contaminants identified as "List," but likely will spend less evaluation time on those placed in the other categories (particularly the "Not List"). As a result, EPA recognized the need to minimize the likelihood of classifying "List" contaminants as "Not List" or "Not List?" and applied the Decision Comparison Matrix as a tool in evaluating model output misclassifications.

4.4.1 Classification Differences Among the Models

Appendix E describes the classification rules or "solutions" that were generated by the different models. These rules perform differently when compared with the TDS consensus decisions.
Exhibit 26 summarizes the number of each type of decision by each model compared to the subject matter expert consensus decisions and Exhibit 27 summarizes the results and summed products from the Weighted Loss Value (Exhibit 25). The model input (and output) for ANN, CART, MARS, and QUEST were the integers representing the classes (i.e., 4 = L, through 1 = NL) while the Linear model estimated the average classification. When a majority of subject matter experts favored one classification for a contaminant, that class was assigned. When the subject matter experts were evenly split (for example, if three assigned a contaminant to 1 (NL) and three assigned it to 2 (NL?)), an agreement was reached to assign the contaminant to the higher of the two categories (2 (NL?) in the case of an even split between categories 1 and 2). In contrast, the Linear model predicted the average classification and was trained using average classifications for the TDS. For example, if three decision makers assigned a contaminant to 1 and three assigned it to 2, the average classification was 1.5.

Exhibit 26. Summary of Quaternary Model Decisions

                     Number of Decisions in Category by Model
Decision    Expert Workgroup
Category    Blinded Decision    ANN   CART   Linear   MARS   QUEST
4 (L)              42            42     27     27      47      55
3 (L?)             56            55     68     69      38      49
2 (NL?)            65            65     73     69      81      58
1 (NL)             39            40     34     37      36      40
Total             202           202    202    202     202     202

Exhibit 27. Results of 202 Model Classifications and Weighted Misclassifications

                                         ANN   CART   Linear   MARS   QUEST
Number of Classifications matching TDS   168    156     160     160     174
Weighted Loss Value                       52     84      72      67      33

While there are important differences, all the models were able to process the TDS and produce classification rules. All five models produced from 79 percent to 86 percent exact matches with the consensus decisions. Exhibit 28 provides further details on the predicted classifications for each model.
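As a worked check, the QUEST entries in Exhibit 27 can be recomputed from the full-training-set counts in Exhibit 20a and the weights in Exhibit 25. The sketch below is illustrative only: the data layout and function names are assumptions, and the weight of 3 for a Not List contaminant classed as List is read from a damaged table cell (it does not affect this calculation, because QUEST made no such error).

```python
CATEGORIES = ("NL", "NL?", "L?", "L")

# LOSS[expert][model]: weights from Exhibit 25 (0 on the diagonal = exact match).
LOSS = {
    "NL":  {"NL": 0,  "NL?": 1, "L?": 2, "L": 3},   # 3 is an assumed reading
    "NL?": {"NL": 2,  "NL?": 0, "L?": 1, "L": 2},
    "L?":  {"NL": 5,  "NL?": 2, "L?": 0, "L": 1},
    "L":   {"NL": 10, "NL?": 5, "L?": 2, "L": 0},
}

# QUEST_COUNTS[model][expert]: counts from Exhibit 20a.
QUEST_COUNTS = {
    "L":   {"L": 42, "L?": 13, "NL?": 0,  "NL": 0},
    "L?":  {"L": 0,  "L?": 41, "NL?": 8,  "NL": 0},
    "NL?": {"L": 0,  "L?": 2,  "NL?": 54, "NL": 2},
    "NL":  {"L": 0,  "L?": 0,  "NL?": 3,  "NL": 37},
}


def exact_matches(counts):
    """Number of contaminants where model and consensus decisions agree."""
    return sum(counts[c][c] for c in CATEGORIES)


def weighted_loss(counts, loss):
    """Sum of (number of misclassifications x weight) over all cells."""
    return sum(
        counts[model][expert] * loss[expert][model]
        for model in CATEGORIES
        for expert in CATEGORIES
    )


print(exact_matches(QUEST_COUNTS))        # 174
print(weighted_loss(QUEST_COUNTS, LOSS))  # 33
```

Both values agree with the QUEST column of Exhibit 27 (174 matching classifications and a Weighted Loss Value of 33).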
Perhaps most important, no model classed any consensus L(4) or L?(3) decision as NL(1). Only CART classed any L(4) candidates (2%) as NL?(2). The best performance, by these metrics, was that of the QUEST model, while the lowest performance was by CART. The objective of the QUEST model was to minimize the value loss of the misclassifications, while the other methods minimized errors with no regard for the weights shown in Exhibit 25. As a result, QUEST has the lowest loss and the highest exact match rate. Note in Exhibit 28 that QUEST's misclassifications are all shifted to the "left"; i.e., QUEST predicted that only 2 of the consensus L? decisions would be NL?, while 8 of the consensus NL? decisions were predicted to be L?. EPA believes this is a more acceptable and conservative difference. ANN attempts to maximize the likelihood of correct predictions and simply minimizes the number of misclassifications (not their weighted value). Its misclassifications are rather equally distributed around the exact match categories. The performance of MARS and the Linear model look similar, but MARS had the highest value of any model for consensus L? decisions that were predicted as NL? (16), a less acceptable difference.

4.4.2 Logical Evaluation of the Models - Graphical Analysis

As introduced in Section 3.2.1, the testing of the models included evaluation of the total potential "attribute space." The total "attribute space" for a model that includes four attributes with scores from 1 to 10 is 10,000 combinations of possible attribute scores. The graphical analysis of model performance looked at how each model generated decisions on the category to which it assigned contaminants (L or NL). When a model was applied across the entire attribute space, the discriminant surfaces that bound its decisions on the category assigned to any possible score became apparent.
These category boundaries or discriminant surfaces were reviewed for consistency through the graphical analysis. Five models (ANN, QUEST, MARS, CART, and Linear Regression) developed with the 202 TDS contaminants produced classification rules that were applied to the 10,000 scores and plotted to evaluate their performance (Exhibits 29 through 32).

Exhibit 29 is another example of the graphic tool introduced in Chapter 3, Exhibit 19, to help visualize the multi-dimensional space of the CCL classifications. The graphical analysis shows five elements of the model results, the four attributes evaluated and the categorical decision (L, L?, NL?, and NL), in a single graph. Note in Exhibit 29 that the vertical and horizontal axes each show two attributes. The attribute scores for Potency are the large squares across the horizontal axis. The corresponding score for Severity for each Potency score is a separate scale within each larger square. That is, each Potency square has a range of Severity scores. Similarly, the Prevalence and Magnitude scores are plotted on the vertical axis, with Prevalence along the primary axis and Magnitude along the axis embedded in each Prevalence square. The categorical decision assigned to each potential attribute score combination is color coded. Red represents an L decision; beige, an L?; light blue, an NL?; and dark blue, an NL decision.

Exhibit 28. Summary of Individual Quaternary Model Classifications (shaded cells are exact match with Expert Decisions)

                              Consensus Blinded Decisions
Model     Model Decision    4(L)   3(L?)   2(NL?)   1(NL)
ANN       4(L)               37      5       0        0
          3(L?)               5     44       6        0
          2(NL?)              0      7      53        5
          1(NL)               0      0       6       34
CART      4(L)               26      1       0        0
          3(L?)              12     47       9        0
          2(NL?)              4      8      53        8
          1(NL)               0      0       3       31
Linear    4(L)               26      1       0        0
          3(L?)              16     47       6        0
          2(NL?)              0      8      54        7
          1(NL)               0      0       5       32
MARS      4(L)               37     10       0        0
          3(L?)               5     30       3        0
          2(NL?)              0     16      59        6
          1(NL)               0      0       3       33
QUEST     4(L)               42     13       0        0
          3(L?)               0     41       8        0
          2(NL?)              0      2      54        2
          1(NL)               0      0       3       37

Exhibit 29. ANN Model Predictions for the Four Attribute Space (10,000 possible score combinations)

[Figure: horizontal axis, Potency (1 to 10) with Severity nested within each Potency square; vertical axis, Prevalence with Magnitude nested within each Prevalence square.]

The colors represent the classification decision: List = red; List? = beige; Not List? = light blue; and Not List = dark blue. One TDS contaminant (Potency = 4, Severity = 8, Prevalence = 5, and Magnitude = 10) is shown in black, though EPA's decision for that contaminant is List (red). This particular contaminant is always shown in a contrasting color to help the viewer orient to the details of the graph and check the scaling and axes.

1 Expressed in RGB format, dark blue is (5 113 176), light blue is (146 197 222), beige is (244 165 130), and red is (202 0 32). These colors were selected using ColorBrewer, by Cynthia A. Brewer of Penn State University. ColorBrewer can be found online at www.ColorBrewer.org.

Exhibit 29 plots the results of the ANN model's classifications for the 10,000 combinations of attribute scores. The patterns clearly show a logical progression from the lower left to the upper right, progressing from Not List predictions (dark blue) for low attribute scores, through NL? and L?, to List classifications for the highest scores, both within each square and across the entire matrix. The graphical analysis helped EPA understand and visualize the logic of the models' discriminant approach and their performance with the TDS. The QUEST model produces a graphic result very similar to that of the ANN model. In contrast to Exhibit 29, Exhibit 30 shows the MARS results.
The figure shows areas where red (L) directly touches light blue (NL?) and where dark blue (NL) touches beige (L?). Both are indications that the model was unable to define the intermediary categories. Another problem can be seen in the lower right box of the figure, where Potency is 10 and Prevalence is 1. Within that box, when Magnitude is 1 (along the bottom edge of the box), the decision goes directly from NL? to NL (light blue to dark blue) as Severity increases. This result also occurs for several other combinations of high Potency and low Prevalence. EPA found that these results were illogical and unacceptable. Exhibit 31 shows that the univariate CART model exhibited similar problems.

The adapted Linear Regression model, shown in Exhibit 32, presents an interesting variant. As noted, the Linear model predicts the average classification of contaminants. In other words, in contrast to ANN or QUEST, which predict a classification as an integer such as 3 (L?), the Linear model predicts the value from the regression model, such as 3.312 (rounded to 3 = L?), so the colors can be displayed more as a continuous variable. The Linear model again displays a very logical function across the total attribute space.

As discussed above, the CART and MARS models exhibited inconsistent categorization of contaminants and poor performance in the decision matrix comparisons, while the other three models (ANN, Linear, and QUEST) performed very well with respect to TDS error loss, number of training set errors, and the logic of the classification model. The Linear model was generally able to predict EPA's average classification within approximately 0.3 (less than half a category). Hence, evaluating ways to apply the model results focused on procedures for utilizing the results from the ANN, Linear, and QUEST models.

Exhibit 30. MARS Model Predictions for the Four Attribute Space (10,000 possible score combinations). See Exhibit 29 for the key and text for discussion.

Exhibit 31. Univariate CART Model Predictions for the Four Attribute Space (10,000 possible score combinations). See Exhibit 29 for the key and text for discussion.

Exhibit 32. Linear Model Predictions for the Four Attribute Space (10,000 possible score combinations). See Exhibit 29 for the key and text for discussion.

4.5 Applying Model Results

From the inception of the development of the CCL classification process, EPA intended to use classification models as decision support tools. It was envisioned that, after testing and evaluation, one or more models might be used to process complex data in a consistent, objective, and reproducible manner and provide a prioritized listing of candidate contaminants for the last stage of the CCL process, an expert review and evaluation. This also would help to focus resources for the review and evaluation of potential contaminants.

The use of classification models as a tool in the CCL process is a new application of such tools. Several factors have been considered in assessing how to utilize the model results. After testing, EPA determined that three models performed well: the ANN, Linear, and QUEST models.
These are three different classes of models, with three different mathematical approaches, but all provided similar results and logical determinations. Yet the results of each are unique (e.g., Exhibits 29 and 32). Therefore, EPA explored ways to combine the results of all three models, to capture both agreement among models and unique results. Two straightforward approaches looked most useful and were applied: a simple additive approach and a collective rank-order approach.

4.5.1 Additive Model Results

The first step in combining the results of the three models was to simply add the results of their classifications for each contaminant. A tabulation of all contaminants (in the TDS) was prepared with their predicted classification from the models. Recall that the model output is a class (number), with 4 equaling L through 1 equaling NL. (The Linear model output was rounded to its integer class for this approach.) Then the three results were simply added. This resulted in 10 "bins" or classes, ranging from 3 (all three models classed the contaminant as a 1) to 12 (all three models classed the contaminant as a 4). Hence, a contaminant with an additive score of 11 had two models class it as a 4 and one model class it as a 3. A comparison of the sum of the three models to the TDS workgroup decisions is shown in Exhibit 33.

Exhibit 33 shows some important features of the additive process. For 142 of the 202 contaminants, the three models were unanimous and in agreement with the TDS. When reviewing these analyses, EPA noted that every contaminant that subject matter experts classed as List (by consensus) was predicted as List by at least one model. The models do move some NL? into a strong L? position, but only 2 of the L? contaminants were placed into the NL? category. The areas where the models differ in outcome can provide a place to focus some review during the development of future CCLs.
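The additive approach described in Section 4.5.1 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and clipping the rounded Linear output to the range 1 to 4 is an assumption that the text does not state.

```python
def to_class(linear_output):
    """Round a Linear model prediction to an integer class.

    Rounds half up and clips to 1..4; the clipping is an assumption.
    """
    return min(4, max(1, int(linear_output + 0.5)))


def additive_score(ann_class, linear_output, quest_class):
    """Sum the three model classes, giving a bin from 3 to 12."""
    return ann_class + to_class(linear_output) + quest_class


def is_unanimous(ann_class, linear_output, quest_class):
    """True when all three models assign the same class."""
    return ann_class == to_class(linear_output) == quest_class


# Two models predict 4 (L) and the Linear model predicts 3.312 (rounded to 3).
print(additive_score(4, 3.312, 4))  # 11
print(is_unanimous(4, 3.312, 4))    # False
```

A score of 12, 9, 6, or 3 does not by itself show unanimity (for example, 4 + 3 + 2 also sums to 9), which is why the sketch checks agreement separately.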
4.5.2 Additive Rank Order Results

EPA also tested a different approach from the 10 additive classes. A simple method to provide a more continuous rank-order for each model was also developed. The output for each model was used to produce a rank-ordering for that model, ordering from highest (an L candidate) as number one to lowest (an NL) as number 202 for the TDS. Once the ranks for a model were ordered, the contaminants were simply assigned a number from 1 to 202 (high to low). After this was done for all three models, the rank numbers were added (resulting in a range from 3 to 606), divided by 3 (just to stay on the 202 scale), and then reordered by their composite ranks.

Exhibit 33. Summary Comparison of the Sum of the 3 Model Decisions to the Distribution of EPA Blinded (TDS) Decisions

Sum of 3 model results: 12 (all 3 = 4 (L)), 11, 10, 9 (all 3 = 3 (L?)), 8, 7, 6 (all 3 = 2 (NL?)), 5, 4, 3 (all 3 = 1 (NL))

Consensus Blinded Decision   Counts in order of decreasing sum (blank cells omitted)   Total
4(L)                         26  11  5                                                 42
3(L?)                        1  4  8  35  1  5  2                                      56
2 (NL?)                      6  2  49  4  2  2                                         65
1(NL)                        2  3  2  32                                               39

Shaded cells are unanimous model decisions that match with the TDS.

These analyses were also conducted using all models. The analysis reinforced some of the problems discussed for the CART and MARS applications.

As part of the unique input of the three models, each model produces different output with which to develop its own prediction and a rank-order. The Linear Regression model, as applied, predicted the outcome as a continuous variable by solving the regression equation (e.g., 3.312), and these values were simply used to rank-order. ANN produces a probability of a contaminant being a 4. So, for ANN, the probabilities for each contaminant were used for the rank-ordering. QUEST does require some processing after the model produces classification predictions to produce a rank order.
For QUEST, the distance from the lower discriminant surface was computed. The contaminants were then rank-ordered within a classification group (e.g., ranked within the L? group), then a composite was compiled. QUEST, as a classification decision or regression tree, produces more ties than the other models, but it still produces enough of a continuum that it did not present a problem.

The composite provides a nearly continuous rank-ordered list that can further help to prioritize the analysis for the expert review. Combining the additive results and the rank ordering could also be useful. Knowing which contaminants get unanimous 4s and 1s, or identifying contaminants that stand out as anomalies in one model, was useful in the review of the model output. Having the rank-ordering within the group that included an L? decision, for example, was useful for prioritizing additional evaluation.

5.0 MODEL OUTCOME AND POST MODEL EVALUATION PROCESS

The preceding chapters have described the process that was developed for selecting the CCL from the PCCL. The companion document, Final CCL 3 Chemicals: Screening to a PCCL (USEPA, 2009b), describes the approach that was used for screening and selecting the PCCL from the Universe of chemicals. Once the PCCL screening was executed, the Attribute Scoring Protocols finalized, and the models trained, all of the PCCL chemicals were scored for their attributes and run through the models. This chapter describes the results from the modeling and the processes EPA used in evaluating the model output before selecting the CCL 3. EPA evaluated the model output and formulated several post-model refinements that were added to the CCL selection process, including an approach for considering the certainty reflected in the differing data elements. The post-model analyses are also described in this chapter.
5.1 PCCL Characterization and Model Results

The screening process, described in "Final CCL 3 Chemicals: Screening to a PCCL" (USEPA, 2009b), selected the chemicals for the PCCL. The attributes for these chemicals were scored using the procedures presented in Chapter 2 and evaluated by the three models described in Chapter 4. Exhibit 34 illustrates the results of the model output for the PCCL contaminants developed for the Draft CCL 3. These results show the distribution of the different types of data and information EPA used to evaluate occurrence and potential occurrence (footnote 8). The PCCL consisted of chemicals with variable health effects data, ranging from RfDs to LD50s, and occurrence data, ranging from measured water concentration data from PWSs to production volume data.

Exhibit 34. Model Results for the PCCL Chemicals

3-Models Decision           L     L-L?   L?     NL?-L?   NL?    NL?-NL   NL    N(all)
% of PCCL                   9%    12%    33%    6%       28%    4%       9%    100%
Total # PCCL                44    58     163    30       139    20       46    500
Finished or Ambient Water   3     9      26     6        29     7        21    101
Release                     24    29     64     11       28     9        7     172
Production                  17    20     73     13       82     4        18    227

As described in Chapter 4, three models (ANN, QUEST, and Linear) were used in classifying the PCCL contaminants. EPA used an additive process to combine the results of all three models.

8 The screening of the CCL 3 Universe, including processing with supplemental data during the nominations process, resulted in 532 chemical contaminants for the PCCL. These chemicals were scrutinized as part of the classification and modeling process. Some of the PCCL chemicals had limited data available for scoring and could not be run through the models. The 32 contaminants that had limited data remain on the PCCL. They are identified in Appendix G. Exhibit 34 recaps the model output for the 500 chemicals that were scored and processed.

The bolded decision category (i.e., L, L?, NL?, NL) in Exhibit 34 signifies that all of the models were in 100% agreement with that listing decision. The other categories (e.g., NL?-NL) represent varied agreement, where one or two of the models chose one listing option and one or two models chose a different option. None of the models placed a contaminant more than one category higher or lower than the other models. That is, no contaminants were categorized as an "L" by one model and as an "NL?" by another model, or vice versa.

The models categorized approximately 1/2 of the chemicals on the PCCL as L? or above. When analyzed by data type, the majority of chemicals in the List category had LD50 data for health effects. This was a concern and became an important issue for consideration in the post-model evaluation process.

5.2 Evaluation of the Modeling Output

As part of the last stage in the CCL classification process, the model output was reviewed by internal EPA experts. This step involved:
• a more detailed review of the data used,
• a review of supplemental data, and
• deliberations on how the model data should be used to produce a draft proposal for a CCL.

Specifically, the function of the team was to critically compare the results from the model to the information collected for the individual chemicals, and to identify any concerns with the model output. This exercise was conducted for a cross section of the model outcomes and their associated contaminants.

The Evaluation Team comprised internal EPA experts, including scientists, engineers, toxicologists, and environmental protection specialists from the OW, Office of Research and Development, Office of Children's Health, and Office of Pesticide Programs. The Evaluation Team met on a weekly basis for approximately 8 weeks to discuss the evaluation of modeling results.
5.2.1 Procedure

Prior to the initiation of the evaluation effort, all Evaluation Team members received background descriptions of the CCL process for chemicals (Chapters 1-4 of this document), Attribute Scoring Protocols, and evaluation worksheets. A spreadsheet with the attribute scores, the data that supported the scores, and the model output for each of the chemicals selected for the first review session was also included in the package. An initiation meeting was held to familiarize the participants with the contents of their evaluation package and discuss the approach that would be followed in evaluating the model output for individual contaminants.

Participants on the Evaluation Team received a set of contaminants and supplemental data dossiers for evaluation. The completed evaluation sheets were submitted so that the results could be compiled for discussion. The evaluation sheets allowed the participants to:
• Comment on the model input data for each attribute,
• Provide a statement on their level of confidence in the data underlying each attribute score,
• Express agreement or disagreement with the model output,
• Indicate their degree of confidence in the model decision, and
• Provide an explanation for their agreement or lack of agreement with the model decision.

Following submission of the evaluation results for each set of contaminants, the Evaluation Team discussed the outcome of the evaluation, concentrating first on those contaminants with the greatest differences among the reviewers. These discussions identified the issues and steps described in the following sections of this chapter.

The Evaluation Team reviewed a subset of 129 chemicals from the PCCL.
The contaminants were divided into groups as follows:
• Contaminants with finished and/or ambient water data,
• Contaminants with release data (pesticide applications and/or TRI), and
• Contaminants with production data.

The team evaluated all contaminants with finished and/or ambient water data and a randomly chosen subset of the contaminants with release or production data. The identities of the contaminants were blinded for the review. This was done so that the team would focus their review on the data for a contaminant and not its name. The identity of all contaminants was revealed when the team discussed the evaluation results.

5.2.2 Evaluation Results

Discussion of the model results raised issues that are important to the selection process for CCL 3 and subsequent CCLs. The evaluators represented a variety of disciplines and contributed important perspectives reflecting their fields of specialization. Below are some of the important issues that were raised by evaluators:

• The ratio between the health reference value and the concentrations observed in finished and/or ambient water is an important relationship that is not entirely captured by the four attribute scores. When finished and/or ambient water data were available, this ratio was most often the reason for not agreeing with the model output. For example, the model may have classified a chemical as an L?, but when the health value and concentration data were directly compared, the outcome indicated that occurrence was one or more orders of magnitude below the health-based benchmark. In this situation, the evaluators usually disagreed with the models' decision.

• Confidence in the data elements used for attribute scoring varied widely among the PCCL contaminants. Evaluators noted that there was a considerable difference in the weight-of-evidence for the differing types of data used to score PCCL contaminants.
Although the scores used a hierarchy in selecting the data elements that best represented health effects and occurrence, the most highly ranked data element was not equivalent for every chemical. Individual chemicals used different combinations of data as input for the models. The type of data elements used to represent the occurrence and health effects became a subject of discussion for the Evaluation Team. Some contaminants had recent UCMR monitoring data combined with an Office of Pesticide Programs (OPP) RfD, and others had TRI release data combined with an LD50. For some chemicals, the only available data came from an LD50 combined with the number of pounds produced per year and environmental fate properties. The evaluators were more comfortable with the model decisions based on strong supporting data (e.g., RfD and finished water occurrence) than with those based on weak data sets (e.g., LD50 and production data).

• Reviewers believed it was important that the occurrence and health values represent the same form of the chemical. This is particularly important for nonmetals where the common inorganic form of the element is a complex ion (e.g., phosphate) and not the element (e.g., phosphorus). This is also important for metals (e.g., vanadium) where the occurrence data represent ions in solution that may have been paired with a toxicity value for the free metal.

• Toxicity data from National Cancer Institute/National Toxicology Program bioassays were incorporated into the Universe for a number of contaminants that were positive for tumors and were tested by way of the inhalation route of exposure. Some of these contaminants were screened to the PCCL on the basis of their qualitative cancer findings.
They were scored for Potency and Severity based on slope factors that had been derived for the oral route of exposure, but based on the inhalation data without the use of Physiologically Based Pharmacokinetic (PBPK) modeling. Some of these very volatile contaminants received L or L? model designations. Reviewers questioned whether toxicity data from inhalation studies should be used for scoring cancer Potency. Therefore, only cancer slope factors that were derived using PBPK modeling for cross-route extrapolation were used to score chemicals. Inhalation data were not used for non-cancer endpoints.

• Due to the risk assessment policy differences between agencies, the hierarchy for scoring Potency and Severity considered the agency that established the value (described in Chapter 2 and listed in Appendix A). However, some reviewers questioned whether the date of the assessment, rather than the agency conducting the assessment, should be the basis for the hierarchy.

• Prevalence and Magnitude were given the lowest possible scores ("1") when a contaminant had been monitored but there were no detections. Since the detection level for a few chemicals was above the health-based value, some reviewers questioned whether this was appropriate. They suggested that it might be better to use the detection limit as the basis of the Magnitude attribute score.

• UCMR 1 screening studies monitored a small number of statistically selected sites (300). There were cases where there were no finished water detections in the screening surveys, but the same contaminant had been detected in ambient water by USGS. Reviewers questioned the placement of finished water above ambient water in the hierarchy in these cases.

• A number of disinfection byproducts (DBPs) had occurrence data based on production or release, while some had no occurrence data.
Production and release data do not adequately represent the potential occurrence of DBPs and byproducts of other treatment processes in finished water.

• Reviewers were uniform in believing that contaminants that had a Potency score based on an LD50 value and a Severity score of 9 (death) should be returned to the Universe, independent of their other attribute scores.

The quantitative results of the model output evaluation are summarized in Exhibit 35. For Exhibit 35, agreement with the model outcome by a majority of the Evaluation Team constitutes agreement. Appendix F lists the chemicals reviewed by the Team and the percentage of the team agreeing with the model outcome for the individual chemicals.

Exhibit 35. Results of the Model Output Evaluation (Total = 129 chemicals)

                                                                 Finished/Ambient   Release    Production
                                                                 Water Grouping     Grouping   Grouping
Number of Contaminants                                                 89              28          12
Agreement with model outcome (>50%)                                    96%             89%         67%
% where an outcome higher than the model was recommended               2%              0%          0%
% where an outcome lower than the model was recommended                2%              11%         33%
% high confidence decisions (avg.)                                     36%             16%         7%
% medium confidence decisions (avg.)                                   49%             31%         17%
% low confidence decisions (avg.)                                      15%             52%         76%

5.3 Post-Model Adjustments to the Process

Based upon issues identified in the Evaluation Team comments, EPA added several post-model refinements to the CCL process. The post-model refinements changed the listing status of some of the chemicals as candidates for CCL 3. For example, EPA evaluated the UCMR screening studies to determine the adequacy of the analytical method for contaminants with no detections when ambient water occurrence data were available. In these cases the Agency opted to use the ambient water data and included contaminants on the CCL. The post-model adjustments that were incorporated are discussed in the following sections.
EPA re-evaluated health effects data to ensure that toxicological data matched the various forms and valences of contaminants and that those data used appropriate cross-route extrapolation methods to develop values from different exposure routes (e.g., inhalation to ingestion). The Agency did not change the health effects hierarchy to use the most recent data from any source rather than the best available data from the most suitable agency. The protocols and established hierarchies ensured that the data used at each step were applied uniformly for all contaminants. The hierarchy also provides a transparent, data-driven approach that allows stakeholders and the public to uniformly understand the assumptions and processes that were applied in the selection of data for individual contaminants.

Issues raised by the Evaluation Team may have been addressed by one of the post-model adjustments and processes. For example, several of the scoring or hierarchical issues identified were addressed by implementing the Health-Concentration Ratio. This process addressed concerns about the occurrence data hierarchy, Magnitude scores based on detection rather than the method reporting limit, and using the most relevant data to make listing decisions.

5.3.1 Using Supplemental Sources to Identify the Data Most Relevant to Drinking Water

One issue identified by the Evaluation Team was that scoring should be based on the data most relevant to exposure from drinking water. For example, DBPs were included in the Universe and many were brought forward to the PCCL. The data used to score these contaminants for occurrence should be based on their occurrence in drinking water at PWSs, not ancillary data that may be available, such as release or production volume.
There are DBP data from the Information Collection Rule monitoring and from supplemental studies identified in the CCL Nominations process. These data had not originally been included in the data used for scoring Prevalence and Magnitude. As part of the post-model processing, EPA retrieved the data, scored the chemicals, and ran the models using the supplemental data.

5.3.2 Calculation of a Health Effect-Concentration Ratio for Contaminants with Water Data

The models classified chemicals using scores for the four attributes. The Evaluation Team recognized that the relationship between Potency and Magnitude was important when deciding whether or not to list a chemical. Therefore, EPA calculated the ratio between the health-based value and the 90th percentile concentration in finished or ambient water as a post-model process to select contaminants for the CCL 3. EPA also sought models to predict water concentrations for contaminants that did not have direct measurements in water sources in order to calculate this ratio. EPA used the health effect-concentration ratio as a key criterion for listing a contaminant on the CCL 3 if this value was less than or equal to 10.

5.3.2.1 Developing a Health Reference Level (HRL)

To calculate the health effect-concentration ratio, the data that provided the Potency score were used to calculate the HRL benchmark using a process similar to the one the Agency has used for Regulatory Determination. For a carcinogen, the HRL is the one-in-a-million cancer risk expressed as a drinking water concentration. For non-carcinogens, the HRL is equivalent to the lifetime health advisory value. The lifetime health advisory value is obtained by multiplying the RfD by 70 kg, dividing by a water intake of 2 L/day, and multiplying by a 20% relative source contribution (unless there are data to suggest that the 20% is inappropriate).
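As a worked illustration of the lifetime health advisory arithmetic described above, assume a hypothetical RfD of 0.005 mg/kg/day (a value chosen only for illustration, not taken from any CCL 3 contaminant) and the default 20% relative source contribution:

```latex
\[
\mathrm{HRL}
  = \frac{0.005\ \mathrm{mg/kg/day} \times 70\ \mathrm{kg} \times 0.2}{2\ \mathrm{L/day}}
  = 0.035\ \mathrm{mg/L}
  = 35\ \mu\mathrm{g/L}
\]
```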
Determining the HRL for chemicals where the Potency value was the NOAEL, LOAEL, or a value from an individual study required application of an uncertainty factor to adjust the toxicity value to an RfD approximation. In these cases, the uncertainty factor was based on the difference in the modal values from the log-based data distributions used to develop the Potency scoring equations (see Chapter 2). The uncertainty factors applied are as follows:

• NOAEL = 1,000
• LOAEL = 3,000
• LD50 = 100,000

The NOAEL and LD50 uncertainty factors were derived from the difference in the constants for the non-cancer Potency scoring equations (Exhibit 4). For a NOAEL, the difference is 3 (7 - 4 = 3), which corresponds to a factor of 1,000 since the Potency equation is log based. The difference for an LD50 is 5 (7 - 2 = 5), or a factor of 100,000. The uncertainty factor (3,000) chosen for the LOAEL is a half log greater than that for the NOAEL, in recognition that the LOAEL is a level that causes effects rather than no effects.

Exhibit 36 shows the formulae used for the CCL 3 program (including the uncertainty factors) to calculate HRLs from the various Potency data elements. The formulae calculate the HRL in mg/L; the HRL was then converted to µg/L for comparison with the CCL 3 water data.

Exhibit 36. Formulae used in the CCL 3 Process to Calculate Health Reference Levels (HRLs) from the CCL 3 Potency Data Elements. The formulae identify the uncertainty factors (UF) applied for the CCL 3. The HRLs are in mg/L. They were further converted to µg/L for comparison with CCL 3 water data. BW = body weight; RSC = relative source contribution.
Non-Cancer Equations

$$\text{HRL (mg/L)} = \frac{\text{RfD (mg/kg/day)} \times \text{BW (70 kg)} \times \text{RSC (0.2)}}{2\ \text{L/day}}$$

$$\text{HRL (mg/L)} = \frac{\text{NOAEL (mg/kg/day)} \times \text{BW (70 kg)} \times \text{RSC (0.2)}}{2\ \text{L/day} \times \text{UF (1,000)}}$$

$$\text{HRL (mg/L)} = \frac{\text{LOAEL (mg/kg/day)} \times \text{BW (70 kg)} \times \text{RSC (0.2)}}{2\ \text{L/day} \times \text{UF (3,000)}}$$

$$\text{HRL (mg/L)} = \frac{\text{LD}_{50}\ \text{(mg/kg)} \times \text{BW (70 kg)} \times \text{RSC (0.2)}}{2\ \text{L/day} \times \text{UF (100,000)}}$$

Cancer Equations

$$\text{HRL (mg/L)} = \frac{\text{Risk}\ (10^{-6}) \times \text{BW (70 kg)}}{\text{Slope Factor} \times 2\ \text{L/day}}$$

$$\text{HRL (mg/L)} = 10^{-4}\ \text{Cancer Risk (mg/L)} \times 0.01$$

5.3.2.2 Developing a HRL - Concentration Ratio

EPA determined that if the measured or modeled concentration of a contaminant was equal to or greater than one tenth of the health reference level, then the contaminant should be included on the CCL 3. EPA selected the 90th percentile (of detections) water concentration as the point of comparison for the ratio, rather than the mean or median. The CCL is designed to identify contaminants that may benefit from a Health Advisory, even if they do not merit a positive regulatory determination. The 90th percentile concentration level was used as a public health protective benchmark that may identify a possible need for a health advisory for areas of the country that may have higher concentrations in drinking water than others. If a 90th percentile concentration level was not available, the Agency used the maximum or the next highest percentile (i.e., 95th or 99th percentile) reported value.

The ratio of the health-based value to the 90th percentile concentration detected in water (either ambient or finished) was calculated for all contaminants with water data. If the ratio was 10 or less, the contaminant was selected for consideration for the CCL 3. If the ratio was greater than 10, the contaminant was eliminated from consideration for the CCL 3 and remains on the PCCL.
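The Exhibit 36 formulae and the ratio test described above can be sketched in a few lines of Python. This is an illustrative sketch only, not EPA's implementation: the function names, the dictionary of uncertainty factors keyed by data element, and the example input values are all assumptions introduced here. The constants (70 kg, 2 L/day, RSC of 0.2, the uncertainty factors, and the cutoff of 10) come from the text.

```python
# Sketch of the Exhibit 36 HRL formulae and the HRL/concentration ratio test.
# All names below are hypothetical; only the constants come from the document.

BODY_WEIGHT_KG = 70.0
WATER_INTAKE_L_PER_DAY = 2.0
RELATIVE_SOURCE_CONTRIBUTION = 0.2

# Uncertainty factors by Potency data element. The NOAEL and LD50 factors
# follow from the differences in the scoring-equation constants:
# 10**(7 - 4) and 10**(7 - 2).
UNCERTAINTY_FACTORS = {
    "RfD": 1,
    "NOAEL": 10 ** (7 - 4),   # 1,000
    "LOAEL": 3_000,           # half a log above the NOAEL factor
    "LD50": 10 ** (7 - 2),    # 100,000
}


def noncancer_hrl_mg_per_l(value, element):
    """HRL (mg/L) from an RfD, NOAEL, LOAEL (mg/kg/day) or LD50 (mg/kg)."""
    uf = UNCERTAINTY_FACTORS[element]
    return (value * BODY_WEIGHT_KG * RELATIVE_SOURCE_CONTRIBUTION
            / (WATER_INTAKE_L_PER_DAY * uf))


def cancer_hrl_from_slope_factor(slope_factor, risk=1e-6):
    """HRL (mg/L) at the one-in-a-million risk level from a slope factor."""
    return risk * BODY_WEIGHT_KG / (slope_factor * WATER_INTAKE_L_PER_DAY)


def cancer_hrl_from_1e4_risk(concentration_mg_per_l):
    """HRL (mg/L) from a 10^-4 cancer risk concentration (mg/L)."""
    return concentration_mg_per_l * 0.01


def hrl_concentration_ratio(hrl_mg_per_l, concentration_ug_per_l):
    """Ratio of the HRL (converted to ug/L) to a water concentration (ug/L)."""
    return hrl_mg_per_l * 1000.0 / concentration_ug_per_l


def selected_for_consideration(ratio, cutoff=10.0):
    """A ratio of 10 or less selects the contaminant for consideration."""
    return ratio <= cutoff


# Hypothetical example: NOAEL of 10 mg/kg/day, 90th percentile of 12 ug/L.
example_hrl = noncancer_hrl_mg_per_l(10.0, "NOAEL")
example_ratio = hrl_concentration_ratio(example_hrl, 12.0)
print(example_hrl, example_ratio, selected_for_consideration(example_ratio))
```

In this hypothetical case the HRL is about 0.07 mg/L (70 µg/L), so the ratio to a 12 µg/L concentration is below 10 and the contaminant would be carried forward for consideration.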
For chemicals that had been monitored but not detected, and for chemicals that were detected in ambient waters but not finished water, analytical method detection limits were compared to the HRL to ensure that the detection limits were adequate to capture concentrations relevant to the health effects. Consideration was also given to whether the ambient water data suggested that the UCMR 1 screening might have been too limited to identify the contaminant in areas where it might pose a problem. For contaminants that had limited finished water data, but more robust ambient water monitoring data, the ambient water concentration was used to develop the ratio. The contaminant information sheets note which data were used to develop the ratio and are available for all PCCL contaminants in the docket at www.regulations.gov.

5.3.2.3 Developing a Ratio for Contaminants Without Concentration Data

EPA used modeled data for pesticides when concentration data were not available. The modeled concentrations of pesticides in water are included in the OPP registration and re-registration evaluation documentation, but they are not readily available in a form that could be used for the Universe database. The modeled data predict environmental concentrations in ground water and surface water using a standardized approach for pesticides.

For pesticides, the modeled data from OPP were compared with the health reference level. As part of the pesticide registration process, EPA calculates an estimated environmental concentration (EEC) in water or an estimated drinking water concentration (EDWC), depending on the year the last assessment was completed. Both the EEC and EDWC are derived from models that estimate the pesticide's concentration in an index reservoir used for drinking water. OPP used the PRZM-EXAMS model for surface water. Ground water concentrations are derived using the SCI-GROW regression model to represent exposures in shallow ground water. The EEC and the EDWC are equivalent.
The modeled EEC values allowed EPA to calculate the HRL/EEC or HRL/EDWC ratio for pesticides and/or their degradates. Pesticides with HRL/EEC ratios of 10 and lower were selected for the draft CCL 3.

5.3.3 Grouping Contaminants based on the Certainty or Relevance of the Supporting Data

Data certainty was not directly factored into the development of the attribute scoring protocols, but was indirectly factored into the protocols through the use of the hierarchies of the data used for health effects and occurrence (Chapter 2). In the evaluation of the model output, data certainty was an important factor for the Evaluation Team. In cases where the model output listed a chemical with well developed data from high in the hierarchy (e.g., IRIS RfD, UCMR/NAWQA concentration), the Evaluation Team typically agreed with the model decision. The Evaluation Team confidence ranking for model decisions based on these types of data was generally high, while confidence for less developed or preliminary data from the hierarchy was generally lower (see Exhibit 35).

Accordingly, as part of the post-model evaluation process, EPA tried various approaches for addressing the certainty issue. Initially, EPA attempted to develop numeric certainty scores for each data element, but decided not to use this approach because the certainty scores could not be calibrated due to the subjectivity in assigning the numeric values. For example, it would be difficult to justify that a chemical evaluated by environmental release data should be assigned a certainty score of 6, while a chemical evaluated by production volume should be assigned a certainty score of 10 versus 9. Therefore, EPA placed tags on the chemicals that characterize the certainty.
The chemicals were tagged as high, medium, or low certainty based on the combinations of data elements that were used to score the attributes for health effects and occurrence. The certainty tags are not calibrated measures of certainty. They were developed to express the relative certainty associated with the data elements that were used to score a chemical's attributes. The certainty rankings assigned to the combinations of individual attribute data elements are listed below:

High Certainty:
• Finished Water + RfD/CSF, NOAEL or LOAEL
• Ambient Water + RfD/CSF, NOAEL

Medium Certainty:
• Ambient Water + LOAEL
• Release/Application + RfD, NOAEL, LOAEL
• Production + RfD

Low Certainty:
• Health effects based on LD50
• Occurrence based on production values

The high certainty bin consisted of chemicals that had been scored based on the most relevant data for occurrence in water and with the richest data for health effects. Such contaminants are expected to be good candidates for regulatory determination with minimal research needs. Examples of chemicals in the high bin include chemicals with reference doses and measured water concentration data. The medium bin consists of chemicals that need further occurrence and/or health effects research. These include chemicals that may have well studied health effects data but may need additional occurrence data (e.g., chemicals with release data but no measured water occurrence data). The low certainty bin consists of chemicals that need extensive health effects and occurrence research that may take longer than the life cycle of a CCL. Examples include chemicals with LD50 and/or production volume data.
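The tagging rules listed above amount to a lookup on the pair of data elements used for occurrence and health effects. The sketch below is one possible reading, not EPA's procedure: the function name, the string labels, and the handling of combinations the list does not name are assumptions. In particular, it assumes the explicitly listed combinations take precedence (so Production + RfD is medium), that any other combination involving LD50 or production data is low, and that combinations not covered by the list return None rather than a guessed tag.

```python
# Sketch of the certainty tagging as a lookup table. Labels and fallback
# behavior are assumptions; the listed combinations come from the text.

LISTED_COMBINATIONS = {
    # High certainty
    ("finished water", "RfD"): "high",
    ("finished water", "CSF"): "high",
    ("finished water", "NOAEL"): "high",
    ("finished water", "LOAEL"): "high",
    ("ambient water", "RfD"): "high",
    ("ambient water", "CSF"): "high",
    ("ambient water", "NOAEL"): "high",
    # Medium certainty
    ("ambient water", "LOAEL"): "medium",
    ("release/application", "RfD"): "medium",
    ("release/application", "NOAEL"): "medium",
    ("release/application", "LOAEL"): "medium",
    ("production", "RfD"): "medium",
}


def certainty_tag(occurrence_element, health_element):
    """Return 'high', 'medium', 'low', or None for an unlisted combination."""
    key = (occurrence_element, health_element)
    if key in LISTED_COMBINATIONS:
        return LISTED_COMBINATIONS[key]
    # Low certainty: health effects based on LD50, or occurrence based on
    # production values (other than the listed Production + RfD case).
    if health_element == "LD50" or occurrence_element == "production":
        return "low"
    return None  # the text does not assign a tag to this combination


print(certainty_tag("finished water", "LOAEL"))   # high
print(certainty_tag("ambient water", "LOAEL"))    # medium
print(certainty_tag("production", "NOAEL"))       # low
```

Returning None for unlisted pairs keeps the sketch from inventing a ranking the document does not state.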
While the CCL consists both of chemicals that may provide sufficient data to support regulatory determinations and chemicals that are of concern and need additional drinking water research, EPA found that LD50 and production volume data are not sufficient data for a contaminant to be included on the CCL. EPA evaluated contaminants from each bin to determine if the best available data for that contaminant were sufficient for listing.

5.3.4 LD50 Values with Limited Documentation

Following the advice from the Evaluation Team, Severity scores based on death from LD50 studies were subject to additional research to identify supplemental health effects information. If no other information was available, the contaminants were removed from the modeled PCCL results. (This decision applies to contaminants where no critical endpoint other than death was specified in the source of the LD50 data.) These contaminants were removed from consideration for the CCL. None of the chemicals on the PCCL with LD50-derived health attribute scores had ambient or finished water data.

5.4 Selecting the Draft CCL 3

The chemicals for the draft CCL 3 were selected based on the processes previously described in this document and the other cited support documents, the Final CCL 3 Chemicals: Identifying the Universe (EPA, 2009a) and the Final CCL 3 Chemicals: Screening to a PCCL (EPA, 2009b). The Agency noted from which of the three certainty bins, described in Section 5.3.3, contaminants were selected.

In selecting contaminants for the draft CCL 3, EPA used the post-model criteria (described in Section 5.3) for the HRL/concentration ratio and the certainty bins. EPA identified four groups of chemicals on the draft CCL 3:

• 36 chemicals in the high certainty bin, which have finished water data and an HRL/concentration ratio of ≤ 10.
• 24 pesticides in the medium certainty bin, which have modeled surface and/or ground water data that yielded an HRL/concentration ratio of ≤ 10.
• 27 pesticides and chemicals in the medium certainty bin, which have release data that gave modeled L or L? rankings.
• 8 chemicals that were initially in the low certainty bin. These contaminants were nominated and evaluated with supplemental information that was submitted or evaluated by EPA.

No chemicals with only LD50 and production data were selected for the CCL. These chemicals will be considered for future CCLs.

Subsequent to placement on the draft CCL 3, the list was subject to review by a panel of qualified external experts and stakeholders. Stakeholder input was considered in determining which chemicals from among a preliminary CCL 3 grouping were retained for the draft CCL 3. A summary of this review is available in the docket at www.regulations.gov. The draft CCL 3 was published on February 21, 2008, and included 93 chemicals or chemical groups and 11 microbiological contaminants.

5.5 Selecting the Final CCL 3

EPA provided information on the process and the draft list, and sought comment on its efforts to expand and strengthen the underlying CCL listing process, on the draft list, and on its efforts to improve the contaminant selection process for future CCLs. EPA received comments from 177 individuals or organizations on the draft CCL 3. Commenters identified several issues with the draft CCL 3 and the process used to select contaminants for the list. Commenters also provided information and recommendations for the Agency to consider as it finalized the CCL 3.
The Agency has provided responses to individual comments in the "Comment Response Document for the Third Drinking Water Contaminant Candidate List (Categorized Public Comments)" document that is available in the regulatory docket at www.regulations.gov (USEPA 2009c).

The EPA SAB and its Drinking Water Committee also reviewed the draft CCL 3 during 2008, and provided an advisory to the EPA Administrator on January 29, 2009. EPA staff met with the SAB to provide an overview of the draft CCL 3, to answer questions from the Drinking Water Committee, and to clarify questions from the full SAB. The Agency also participated in teleconferences with the SAB during the development of the "SAB Advisory on EPA's Draft Third Drinking Water Contaminant Candidate List (CCL 3)" (USEPA 2009e).

EPA evaluated all the data and information on chemical contaminants provided by commenters and collected by the Agency after the draft CCL 3 was published. EPA used the same process described in the draft CCL 3 notice (73 FR 9628, USEPA 2008) and other supporting documents to evaluate contaminants for which data became available after the publication of the draft CCL 3. The Agency added contaminants to the Universe, adjusted the contaminants that passed through to the PCCL based on new data, and reevaluated the PCCL using the protocols described in this document. The 106 chemicals on the final CCL 3 included:

• 38 chemicals in the high certainty bin.
• 23 pesticide chemicals in the medium certainty bin with modeled occurrence data and an HRL/concentration ratio ≤ 10.
• 26 pesticides and chemicals in the medium certainty bin, which have application or release data and health effects data that resulted in an L, L-L?, or L? classification.
• 19 chemicals that were initially in the low or medium certainty bin, for which EPA or commenters identified supplementary data that resulted in an HRL/concentration ratio or model classification supporting inclusion on the final CCL 3.
5.6 Summary

The CCL 3 and the process EPA used to select contaminants were developed and tested by the Agency to meet the Safe Drinking Water Act requirements and to address recommendations and advice from the NRC (2001) and NDWAC (2004). The Agency has developed a process and final CCL 3 that:

• Considers a broad Universe of contaminants
• Relies on best available science and information to inform the process
• Evaluates the known or potential health effects and occurrence in screening the Universe to a PCCL
• Uses a set of contaminant attributes and prototype classification algorithms as decision support tools in selecting candidates for the CCL from the PCCL
• Provides an opportunity for nominations and expert judgment.

The first application of the CCL 3 process accomplished many of the specific recommendations from NRC and NDWAC. During the development of CCL 3, the Agency identified areas for improvement that can be implemented in the selection of CCL 4 and later CCLs.

6.0 REFERENCES

Fetter, C. W. 1994. Applied Hydrogeology, 3rd Edition. Macmillan College Publishing Co., New York.

Lyman, W. J., Reehl, W. F., and Rosenblatt, D. H. 1990. Handbook of Chemical Property Estimation Methods. American Chemical Society, Washington, DC.

National Drinking Water Advisory Council (NDWAC). 2004. National Drinking Water Advisory Council Report on the CCL Classification Process to the U.S. Environmental Protection Agency. May 19, 2004.

National Research Council (NRC). 2001. Classifying Drinking Water Contaminants for Regulatory Consideration. National Academy Press, Washington, DC.

NIST. 2006. NIST/SEMATECH e-Handbook of Statistical Methods. Available on the internet at: http://www.itl.nist.gov/div898/handbook/ (used on May 3, 2007).

USEPA. 2004. Office of Water. Drinking Water Standards and Health Advisories. EPA 822-R-04-005. Washington, DC. Winter 2004.
USEPA. 2008. Drinking Water Contaminant Candidate List 3 - Draft Notice. Federal Register. Vol. 73. No. 35. p. 9628. February 21, 2008.

USEPA. 2009a. Final Contaminant Candidate List 3 Chemicals: Identifying the Universe. EPA 815-R-09-006. August 2009.

USEPA. 2009b. Final Contaminant Candidate List 3 Chemicals: Screening to a PCCL. EPA 815-R-09-007. August 2009.

USEPA. 2009c. Final Comment Response Document for the Third Drinking Water Contaminant Candidate List (Categorized Public Comments). EPA 815-R-09-010. August 2009.

USEPA. 2009d. Summary of Nominations for the Third Contaminant Candidate List. EPA 815-R-09-011. August 2009.

USEPA. 2009e. SAB Advisory on EPA's Draft Third Drinking Water Contaminant Candidate List (CCL 3). EPA-SAB-09-011. January 2009.

7.0 APPENDICES

Appendix A. Attribute Scoring Protocols

This section provides scoring protocols for the health effects attributes of Potency and Severity as well as the occurrence attributes of Magnitude and Prevalence.

A.1 Potency Scoring Protocol

This section describes the process for assigning a numerical score for the Potency attribute.

Protocol for Potency Scoring

Step One: Open the spreadsheet for Potency and Severity Scoring. (A sample of this spreadsheet is shown in Exhibit A.1 and is an alternative to using the computer version of the spreadsheet.)

Step Two: Enter the name of the chemical in the column labeled contaminant.
Step Three: Identify and score the highest-ranked non-cancer data element for potency using the following hierarchy of values:

Reference Dose (RfD) or equivalent > No-Observed-Adverse-Effect Level (NOAEL) that is lower than the lowest LOAEL > Lowest-Observed-Adverse-Effect Level (LOAEL) > Toxic Dose Low (TDLO - RTECS) > Lethal Dose (LD50)

Measured > Modeled

For RfDs (or equivalent) only: EPA RfD > ATSDR Minimal Risk Level (MRL) (Chronic > Intermediate > Acute) > RAISHE RfD > Cal EPA Public Health Goal (PHG)(a) > TDIs from WHO/EU/Health Canada > UL from IOM

For pesticides: Office of Pesticide Programs (OPP) > IRIS

Step Four: Enter the selected quantitative measure of non-cancer potency into the appropriate column of the spreadsheet. Make sure that the units are in mg/kg/day. (The spreadsheet formula produces a score in a corresponding column for the data element on the right side of the sheet.)

Step Five: Select a measure for cancer potency if one is available. The preferable measure is the 10^-4 risk concentration in drinking water in mg/L. If the risk is expressed at levels other than 10^-4, convert the value to the target risk (10^-4). If the cancer potency measure is the slope factor, calculate the 10^-4 risk concentration using the following equation:

$$10^{-4}\ \text{Risk concentration (mg/L)} = \frac{0.0001 \times 35\ \text{kg-day/L}}{\text{Slope Factor (mg/kg/day)}^{-1}}$$

(a) The California PHG will have to be converted from mg/L to a dose by multiplying it by [Drinking Water Intake (L) ÷ (body weight (kg) x Relative Source Contribution)].

Step Six: In a case where the entered potency value is an LD50 value that is reported as greater than a particular dose, or is a NOAEL with no LOAEL, decrease the score calculated using the spreadsheet by one integer. Situations where there is a NOAEL with no LOAEL can be identified by the lack of a critical effect, because the NOAEL was the highest dose tested.
Step Seven: Choose the higher of the non-cancer or cancer potency scores as the measure of potency.

Note: If no value for Potency can be found that qualifies for this protocol, please refer the contaminant for expert judgment. The only endpoints that may be applied to this protocol are those listed explicitly in the hierarchy of values. Further, the only endpoints considered as equivalent to an RfD are MRLs from ATSDR, RAISHE RfDs, Cal EPA RfDs, WHO or HC TDIs, and IOM ULs.

Exhibit A.1. Potency Scoring Table

SCORE | RfD (mg/kg-day) | LOAEL/NOAEL (mg/kg-day) | LD50 (mg/kg) | 10^-4 Cancer Risk (mg/L)
10 | 0 - 0.000000316 | 0 - 0.000316 | 0 - 0.0316 | 0 - 0.00000316
9 | 0.000000317 - 0.00000316 | 0.000317 - 0.00316 | 0.0317 - 0.316 | 0.00000317 - 0.0000316
8 | 0.00000317 - 0.0000316 | 0.00317 - 0.0316 | 0.317 - 3.16 | 0.0000317 - 0.000316
7 | 0.0000317 - 0.000316 | 0.0317 - 0.316 | 3.17 - 31.6 | 0.000317 - 0.00316
6 | 0.000317 - 0.00316 | 0.317 - 3.16 | 31.7 - 316 | 0.00317 - 0.0316
5 | 0.00317 - 0.0316 | 3.17 - 31.6 | 317 - 3,160 | 0.0317 - 0.316
4 | 0.0317 - 0.316 | 31.7 - 316 | 3,170 - 31,600 | 0.317 - 3.16
3 | 0.317 - 3.16 | 317 - 3,160 | 31,700 - 316,000 | 3.17 - 31.6
2 | 3.17 - 31.6 | 3,170 - 31,600 | 317,000 - 3,160,000 | 31.7 - 316
1 | 31.7 - >31.7 | 31,700 - >31,700 | 3,170,000 - >31,700,000 | 317 - >317

A.2 Severity Scoring Protocol

The score for Severity is based upon the critical effect associated with the data element (RfD, LOAEL, etc.) used to score Potency. Potency must be scored prior to Severity.

Protocol for Severity Scoring

Step One: Identify the critical effect for the contaminant, based on the data used to score the attribute of potency, and enter it into the severity scoring worksheet (shown in Exhibit A.2). If the contaminant has more than one critical effect, all of the listed effects should be included. NOTE: If the critical effect is death and the LD50 data element was used to score potency, go to Step Four.
If the effects are for a LOAEL from RTECS, go to Step Five.

Step Two: Locate the critical effect within the Compendium of Critical Effects Table (see Exhibit A.3) and enter the severity score associated with that critical effect in the severity scoring worksheet. If a contaminant has more than one critical effect, choose the highest of the scores. NOTE: If the critical effect is not listed in the Table, go to Step Three.

Step Three: If the critical effect is not listed in the Table, the scorer should flag that critical effect as 'not listed.' (Health effects experts should be consulted to score these effects.) Once the effect is scored, it should be added to the compendium for future use and consistent scoring.

Step Four: If a critical effect is not available, or is "death," use one of the following options for scoring:

1) Search sources identified as supplemental sources for CCL for additional health effects data that could be used to score potency and severity for the contaminant. If data are found that provide a data element from the potency protocol other than LD50 to score the contaminant, then that element can be used for scoring. Sources that may be most helpful in this search include: Hazardous Substances Data Bank (HSDB), International Program on Chemical Safety (INCHEM), and the National Toxicology Program (NTP). The element that is found may be used to rescore the contaminant for potency, and subsequently severity, using the score associated with the critical effect endpoint.

2) Search for an alternative critical effect associated with the LD50 determination. Locate the LD50 study and search for information regarding the types of effects occurring prior to animal death. If a critical effect other than death is given in the study, it may be used to score the severity of the contaminant. (The potency score is still given by the value of the LD50.)

3) If no additional information can be found, recommend that the contaminant be returned to the Universe.
Step Five: If the Potency score is a LOAEL from RTECS, the effects listed represent all effects and not just the critical effect(s). There are three available options for improving the scoring in this situation.

1. If the RTECS data source is included in the supplemental data, review the supplemental information to identify the critical effect. If the supplemental source includes a NOAEL for the critical effect, replace the LOAEL with the NOAEL and rescore potency if necessary.

2. In cases where the data source for the LOAEL is not in the supplemental data, search the supplemental data for an alternative data source. If the data identified provide a NOAEL or LOAEL that is the same as or lower than that in RTECS, or that is from a study of higher quality than the RTECS study, use that NOAEL or LOAEL and its critical effect to score both potency and severity.

3. If it is not possible to find better information in the supplemental data sources, score the most serious of the effects listed in RTECS.

Exhibit A.2. Severity Scoring Table (blank worksheet with rows keyed 1 through 25 and columns: Key; Study used to score Potency; Critical Effect(s) for Severity; Severity Score)

Exhibit A.3. Compendium of Critical Effects Table (from Health Advisories & IRIS) for Scoring Severity

Severity Score 1
Score Definition: NO ADVERSE EFFECT
Compendium of Critical Effects: No observed effect(s). No observed adverse effect(s). Absence of effects. No critical effect(s) identified. No effect(s) related to treatment. Absence of biologically significant adverse effect(s). Absence of gross light microscopic histopathological change(s).
Exceedance of the Taste Threshold.

Severity Score 2
Score Definition: COSMETIC EFFECT (Interpretation: Consider those effects that alter the appearance of the body without affecting structure or functions)
Compendium of Critical Effects: Dental fluorosis. Abnormal appearance. Facial flushing. Flushing. Argyria. Dermal sensitization. Skin pigmentation. Hyperpigmentation. Alopecia. Keratosis.

Severity Score 3
Score Definition: REVERSIBLE EFFECTS; DIFFERENCES IN ORGAN WEIGHTS OR SIZE, BODY WEIGHTS OR CHANGES IN BIOCHEMICAL PARAMETERS WITH MINIMAL CLINICAL SIGNIFICANCE. (Interpretation: Transient, adaptive effects)
Compendium of Critical Effects:

Growth and Weight Effects: Decreased body weight and/or body-weight gain. Increased absolute organ weights. Increased liver weight. Increased kidney weight. Increased relative organ weight. Decreased relative organ weight. Lower ovarian weight. Decreased maternal weight gain. Increased absolute and relative (to body and/or brain) liver weight. Increased kidney body weight ratio. Increase in spleen weight. Increase in thyroid/body weight ratio. Changes in thymus weight. Decreased body weight. Decreased growth.

Gastrointestinal Disturbances: Decreased stool quantity. Osmotic diarrhea. Diarrhea. Nausea. Vomiting. GI irritation. GI disturbances.

Irritation/Irritability: Chronic irritation. Maternal hyperirritability. Chronic irritation without histopathology changes.

Biochemical Changes: Decreased glucose. Increased blood sugar. Increased enzymes. Increased triglycerides. Increase serum concentration of compound. Clinical serum effects. Alterations in clinical chemistry. Increased serum alkaline phosphatase. Significant elevation of serum calcium levels. Enzyme inhibition, induction, or change in blood tissue levels. Decreased ESOD activity.
Decrease in erythrocyte superoxide dismutase (ESOD) concentration. Minor alteration in clinical chemistry, e.g., decrease in erythrocyte superoxide dismutase (ESOD).

Hematological Effects: Hematological effects. Abnormal pigments in blood. Decreased lymphocyte count. Decreased blood counts. Decreased white blood cells. Methemoglobinemia. Increased carboxyhemoglobin. Hemosiderosis. Anemia. Normocytic anemia. Iron deposits and elevated Heinz bodies in liver. Decreased hemoglobin and possible erythrocyte destruction. Decreased RBC, packed cell volume, and hemoglobin. Hematologic, hepatic, and renal toxicity as evidenced by a statistically significant decrease in hemoglobin, hematocrit, and RBC levels. RBC and liver effects as evidenced by increase Heinz bodies in RBC. Sporadic decrease in hemoglobin and RBC. Decreased RBC and hematocrit.

Cholinesterase Effects: Reversible PChE (plasma) or RBC-ChE inhibition without cholinergic symptoms or signs. RBC ChE depression without cholinergic symptoms or sweating. Plasma cholinesterase (ChE) inhibition without cholinergic symptoms or sweating.

Hormone Changes: Decrease in T3, T4. Dose-related decrease in T4, T3, and increase TSH. Elevated thyroid stimulating hormone (TSH) concentration. ACTH decrease.

Cellular Vacuolization: Mild to moderate vacuolization. Tubular epithelial vacuolization. Brain cell vacuolization.

Additional Effects: Changes in teeth and supporting structures. Sensory organ effects. Centrilobular eosinophilic liver changes. Possible vascular complication. Inhibition of the concentration of beneficial bacteria in the gastrointestinal microflora.

Severity Score 4
Score Definition: CELLULAR/PHYSIOLOGICAL CHANGES THAT COULD LEAD TO DISORDERS (risk factors or precursor effects).
(Interpretation: Considers cellular/physiological changes in the body that are used as indicators of disease susceptibility)
Compendium of Critical Effects:

Hematological Effects: Jaundice. Anemia. Hemolytic anemia. Erythrocyte destruction. Hemolysis.

Immunological Effects: Decreased delayed hypersensitivity response. Decrease in cellular immune response. Decrease in humoral immune response.

Liver Effects: Fatty cyst - liver and elevated liver enzymes (i.e., SGPT, LDH). Liver cell enlargement or alteration. Liver cell polymorphism. Proteinuria. Renal cytomegaly.

Cholinergic Effects: Cholinesterase inhibition with symptoms. Cholinergic signs or symptoms.

Other Effects: Hypothermia. Mild CNS Effects.

Severity Score 5
Score Definition: SIGNIFICANT FUNCTIONAL CHANGES THAT ARE REVERSIBLE OR PERMANENT CHANGES OF MINIMAL TOXICOLOGICAL SIGNIFICANCE. (Interpretation: Consider those disorders in which the removal of chemical exposure will restore health back to prior condition)
Compendium of Critical Effects:

Increased cholinergic effects: ChE inhibition with sweating, diarrhea, hypotension, and/or fishy body odor. RBC and/or plasma acetylcholinesterase (AChE) inhibition with cholinergic symptoms or sweating. Brain acetylcholinesterase inhibition with or without signs or symptoms.

Hematological Effects: GI bleeding. Coagulation defects. Extramedullary hematopoiesis. Tendency to hemorrhage.

Structural Effects: Rachitic bone.

Renal Effects: Renal cytomegaly. Renal effects/toxicity (increased uric acid levels; increased urinary coproporphyrins). Inflammatory foci - kidneys.

Hepatic Effects: Liver function tests impaired. Fatty-cyst in liver hemosiderosis.

Multiple Organ Effects: Effects on the lungs, liver, kidney, thyroid and thyroid hormones.

Ocular Effects: Corneal damage.

Neurological Effects: Mild neurological signs.
Alteration of classic conditioning. Brain ChE inhibition. Myelin degeneration. CNS depression. Brain/other coverings - recordings from specific areas of CNS. Tremors. Dyspnea. Changes in motor activity. Hypoactivity. Ataxia.

Severity Score 5 (cont.)
Other Effects: Chronic pneumonitis. Clinical selenosis. Nonneoplastic lesions - splenic capsule. Intestinal lesions. Splenomegaly.

Severity Score 6. SIGNIFICANT, IRREVERSIBLE, NONLETHAL CONDITIONS OR DISORDERS. (Interpretation: Consider those disorders that persist for over a long period of time but do not lead to death)
Multiple Organ Effects: Histopathological effects in liver, kidney, and thyroid. Minimal to moderate congestion of liver, kidney, and lungs. Liver and kidney pathology. Kidney and spleen pathology.
Hepatic Effects: Hepatic lesions/necrosis. Hepatocyte degeneration. Hepatotoxicity. Liver cell polymorphism. Liver effects/toxicity. Liver lesions.
Renal Effects: Atrophy and degeneration of the renal tubules - nephropathy (unspecified). Kidney toxicity. Mineralization of the kidneys. Renal dysfunction. Renal effects/toxicity (increased uric acid levels; increased urinary coproporphyrins). Functional and histopathological effects in kidney. Kidney damage (unspecified). Kidney lesions (unspecified). Impaired renal clearance/function. Tubular epithelial vacuolation.
Sensory and Neurological Effects: Significant decrease in brain and brain to body weight ratio. Degenerative changes for brain/other coverings. Peripheral neuropathy - neuropathy (unspecified). Neurotoxicity. Nerve damage (unspecified). Optic nerve degeneration/damage. Sensory neuropathy. Minimal lens opacity and cataracts. Nasal olfactory lesions.
Severity Score 6 (cont.)
Hyperplasia: Thyroid hyperplasia. Urothelial hyperplasia. Hyperplasia. Squamous and basal hyperplasia of the forestomach. Epithelial hyperplasia - forestomach.
Cardiac Effects: Cardiac toxicity. Cardiomyopathy, including infarction. Vascular complications. Right atrial dilation. Convulsions. Mild histological lesions.
Other Effects: Gastrointestinal necrotic changes. Chronic irritation with histopathology findings. Forestomach lesions (unspecified). Organ atrophy. Thyroid effects (unspecified). Thyroid mineralization. Spleen toxicity (unspecified). Bladder toxicity (unspecified). Bone marrow toxicity (unspecified). Hormonal response to extrogenic substances in post menopausal women.

Severity Score 7. DEVELOPMENTAL OR REPRODUCTIVE EFFECTS LEADING TO MAJOR DYSFUNCTION. (Interpretation: Considers those chemicals that cause permanent developmental effects or that impact the ability of a population to reproduce)
Reproductive Organ Effects: Testicular atrophy/damage. Testicular and uterine effects. Atrophied seminiferous epithelium. Histopathological changes in testes. Hypospadia. Lesions observed in reproductive organs. Decreased testes weight and testes to body weight ratio, atrophied seminiferous epithelium; and decreased tubular size in testes. Endometriosis. Decreased tubular size in testes. Decreased ovarian weight and function. Altered cellular foci.
Maternal Toxicity: Maternal toxicity. Decreased maternal weight gain.

Severity Score 7 (cont.)
Fertility effects: Spermatogenic arrest. Reduced numbers of corpora allata. Reduced or deformed sperms. Adverse reproductive effects. Reduction in fertility. Decreased fertility index. Decrease in size of litter.
Growth inhibition: Reduced offspring weight gain, total litter weight, or litter size. Decreased pup weight. Decreased lactation indices. Increased runt incidence. Decreased crown-rump length.
Decreased offspring viability: Excessive loss of litters. Increase in number of stillbirths. Maternal and fetal toxicity. Increased intrauterine death. Decreased pup survival or viability. Increased abortion rate. Increased dead pups at birth. Decreased pup viability index. Parturition mortality. Fetal resorptions.
Developmental effects: Fetal toxicity/malformations. Developmental toxicity (skeletal or visceral abnormalities). Delayed ossification. Neurodevelopmental effects. Brain cell vacuolization in neonates. Myelin degeneration. Skeletal or visceral abnormalities (Extra ribs and other measures of sexual maturation). Increased retinal folds in weanlings. Mixed sexual differentiation (i.e., effeminization or emasculanization). Imbalance in sex ratio.

Severity Score 8. TUMORS OR DISORDERS LIKELY LEADING TO DEATH. (Interpretation: Considers chemical exposures that result in a fatal disorder and all types of tumors)
Cancer. Suspected carcinogenicity (including short latency periods and rare tumors). Any type of cancer.

Severity Score 9. DEATH.
Increased mortality. Longevity. Mortality. Survival. Decreased survival. Decreased adult survival. Decreased adult longevity. High incidence of mortality at early age (i.e., 25% to 50% by mid-life) in chronic studies.
Maternal death during pregnancy. Reduced longevity. Death.

A.3 Prevalence Scoring Protocol

This section describes how to assign a numerical score for the attribute Prevalence.

Step One: Identify highest-ranked data value

When more than one data value is available for a particular contaminant candidate, use the hierarchy in Exhibit A.4. Use the same type of data to score Prevalence as for Magnitude.

Exhibit A.4. Hierarchy of Prevalence Data Elements

Rank | Prevalence Data Element | Type of Data
1 | Finished Drinking Water - Percentage of all Public Water Systems (PWSs) with Detections (If data from both NCOD Round 1 and Round 2 are available, use the higher of the values.) | National scale / representative data (data from UCMR has highest priority, then NCOD, then NIRS)
2a | Percentage of all Ambient/Raw/Source Monitoring Samples or Sites with Detections | National scale / representative data (NAWQA)
2b | Percentage of Ambient/Raw/Source Monitoring Samples or Sites with Detections (Note: use combined surface / ground water if available and higher of SW/GW if not) | National scale / representative data (NREC - first use National Reconnaissance data, then National Aggregate data)
3 | Pesticide application data, number of states where pesticide was applied | From NCFAP
4 | Environmental release data, number of states reporting releases | From TRI
5 | Production volume data | From Chemical Update System/Inventory Update Rule (CUS/IUR)

Step Two: Use scoring table to find attribute score for value identified in Step One.

For each element there is a corresponding column in the Prevalence Scoring table (see Exhibit A.5), which contains a range of data values assigned to a numeric prevalence score between 1 and 10. Once a data value has been found for a particular element, look up the value in Exhibit A.5 to determine the prevalence score. For CUS/IUR data, use the most recent year reported.
For pesticides, if the compound is a degradate and does not have its own data, use the parent to score.

Exhibit A.5. Prevalence Scoring Scales

Columns are numbered by hierarchy: (1) % finished water PWSs with detections of contaminant, all PWSs; (2) % ambient water sites with detections of contaminant, all sites/samples; (3) # states reporting pesticide in use; (4) # of states reporting TRI total releases; (5) CUS/IUR production data, number of pounds (by category) produced.

Prevalence Score | (1) | (2) | (3) | (4) | (5)
1 | <=0.10 | <=0.10 | (none) | 1 | <500K
2 | 0.11-0.16 | 0.11-0.16 | (none) | 2 | (none)
3 | 0.17-0.25 | 0.17-0.25 | Default for any pesticide in non-environmental use | 3 | >500K-1M
4 | 0.26-0.44 | 0.26-0.44 | (none) | 4 | (none)
5 | 0.45-0.61 | 0.45-0.61 | Default for any pesticide in environmental use without data | 5 | >1M-10M
6 | 0.62-1.00 | 0.62-1.00 | <6 | 6 | >10M-50M
7 | 1.01-1.30 | 1.01-1.30 | 6-10 | 7-10 | >50M-100M
8 | 1.31-2.50 | 1.31-2.50 | 11-15 | 11-15 | >100M-500M
9 | 2.51-10.00 | 2.51-10.00 | 16-25 | 16-25 | >500M-1B
10 | >10.00 | >10.00 | >25 | >25 | >1B

Note: Use data in the highest category to score. For CUS/IUR data, use the most recent year reported. Not Reported means there has been no change in production volume since the last report. For pesticides, if the compound is a degradate and does not have its own data, use the parent to score.

A.4 Magnitude Scoring Protocol

This section describes how to assign a numerical score for the attribute Magnitude.

Step One: Identify the highest-ranked data element

When more than one data element is available for a particular contaminant, use the hierarchy below to select the preferred element. Exhibit A.6 presents the hierarchy of data elements to be used in the Magnitude scoring process. Note that the Magnitude element should be correlated with the value used to score the attribute Prevalence, except when production data are used for Prevalence and Persistence-Mobility is used for Magnitude.

Exhibit A.6.
Hierarchy of Magnitude Data Elements

Rank | Magnitude Data Element | Type of Data
1 | Finished Drinking Water - Median of detected concentrations from all Public Water Systems with detections (If data from both NCOD Round 1 and Round 2 are available, use the higher of the values.) | National scale finished drinking water occurrence data [data from Unregulated Contaminant Monitoring Rule (UCMR) has highest priority, then the National Contaminant Occurrence Database (NCOD), then the National Inorganics Reconnaissance Survey (NIRS)]
2a | Median of detected concentrations from all ambient / raw source monitoring sites with detections | National scale ambient monitoring data (National Water Quality Assessment Program - NAWQA)
2b | Median of detected concentrations from ambient / raw / source water samples with detections (Note: use combined surface / ground water if available and higher of SW/GW if not) | National scale / representative data (National Reconnaissance of Emerging Contaminants - NREC - first use National Reconnaissance data, then National Aggregate data)
3 | Pesticide application data | From National Center for Food and Agricultural Policy (NCFAP)
4 | Environmental release data, total pounds or tons reported as released (TRI) | From Toxics Release Inventory (TRI)
5 | Persistence-Mobility (Environmental Fate Data) | Physical chemical properties

Step Two: Use scoring table to find attribute score for value identified in Step One.

For each data element, there is a corresponding column in the Magnitude Scoring table (Exhibit A.7), which contains a range of data values assigned to a numerical magnitude score. Locate the column in the table associated with the highest-ranking data element identified in Step One. Use the information in the column to determine the numerical score associated with the data value for the chemical being scored. The number corresponding to each "Score" is the maximum in that category, e.g.
0.1 ug/L for finished water scores 4, not 5. In cases where there are no data for scoring Magnitude in Exhibit A.7 (e.g., Prevalence is scored using Production Volume data), use the Persistence-Mobility Scoring approach to develop a Magnitude score.

Persistence-Mobility Scoring

The approach for scoring persistence and mobility assigns two values, one for persistence and one for mobility, on a numeric scale of 1 through 3, representing low, medium, and high for each property as it favors the presence of the contaminant in water. Using a hierarchy of physical property data elements, each contaminant is scored for both persistence and mobility. The average of these two values is multiplied by 10/3 to obtain the persistence-mobility score. Exhibit A.8 displays the hierarchy of available properties for each data element representing either persistence or mobility.

Protocol for Persistence-Mobility Scoring

Step One: Identify and score highest-ranked data value for Persistence

When more than one data element value is available for a particular contaminant candidate, use the hierarchy below to select the preferred element. Exhibit A.8 describes the hierarchy of data elements to be used in the Persistence scoring process. When several values for a physical property are available, the highest scoring value should be used, unless that value is not representative of environmental conditions in drinking water.

Step Two: Identify and score highest-ranked data value for Mobility

The hierarchy of physical properties for scoring mobility is given in Exhibit A.8. Select the highest priority data element available for scoring. When several values for a particular physical property are available, the highest scoring value should be used for scoring, unless that value is not representative of environmental conditions in drinking water.
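The arithmetic of the persistence-mobility approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is hypothetical, the inputs are the 1-3 scale values already assigned from the physical property hierarchy, and no rounding is applied because the protocol text does not say how fractional results are converted to a final score.

```python
def persistence_mobility_score(persistence: int, mobility: int) -> float:
    """Average the two 1-3 scale values and rescale by 10/3.

    Hypothetical helper for illustration; inputs are the Low (1),
    Medium (2), or High (3) values from the fate-data scales.
    """
    for name, value in (("persistence", persistence), ("mobility", mobility)):
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be 1, 2, or 3, got {value!r}")
    average = (persistence + mobility) / 2
    # Rescale the 1-3 average onto the 10-point magnitude scale.
    return average * 10 / 3


if __name__ == "__main__":
    # High persistence and high mobility give the top of the scale.
    print(persistence_mobility_score(3, 3))
    # Mixed values land in between; rounding is left to the scorer.
    print(persistence_mobility_score(2, 3))
```

Because both inputs are at least 1, the lowest possible result is 10/3 (about 3.3), so this surrogate never produces a magnitude value at the very bottom of the 10-point scale.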
Step Three: Multiply the average of the persistence and mobility values by 10/3 for the magnitude score.

Exhibit A.7. Magnitude Scoring Scales

Columns are numbered by hierarchy: (1) Finished Water Occurrence Scale, median of detections, all PWSs (ug/L); (2) Ambient Water Occurrence Scale, median of detections, all sites/samples (ug/L); (4) TRI Total Releases Scale, total number of pounds released (lbs).

Score | (1) | (2) | (4)
1 | <0.003 | <0.003 | <300
2 | 0.003-0.01 | 0.003-0.01 | 301-1,000
3 | >0.01-0.03 | >0.01-0.03 | 1,001-3,000
4 | >0.03-0.1 | >0.03-0.1 | 3,001-10,000
5 | >0.1-0.3 | >0.1-0.3 | 10,001-30,000
6 | >0.3-1 | >0.3-1 | 30,001-100,000
7 | >1-3 | >1-3 | 100,001-300,000
8 | >3-10 | >3-10 | 300,001-1M
9 | >10-30 | >10-30 | 1M-3M
10 | >30 | >30 | >3M

(3) Pesticide Use Scale, number of pounds applied (lbs), ranges in ascending order: <10,000; 10,000-30,000; 30,001-100,000; 100,001-300,000; 300,001-1M; 1M-3M; 3M-10M; 10M-30M; >30M.

(5) Persistence/Mobility: Used when Production data are used to score for prevalence. See Persistence/Mobility protocol (Exhibit A.8).

Notes: Use data in the highest category to score. The number corresponding to each "Score" is the maximum in that category, e.g. 0.1 ug/L scores 4, not 5. For pesticides, use the parent to score if the compound is a degradate and does not have its own data.

Exhibit A.8. Magnitude Scales for Environmental Fate Data (Magnitude Hierarchy 5)

Mobility Scale

Scale Value | Organic Carbon Partitioning Coefficient (Koc), mL/g | Log Octanol/Water Partitioning Coefficient (log Kow), dimensionless | Soil/Water Distribution Coefficient (Kd), mL/g | Henry's Law Coefficient (KH), atm-m3/mol | Henry's Law Coefficient (KH), dimensionless
1 (Low) | >1,000 | >4 | >10 | >10^-3 | >0.042
2 (Medium) | 100-1,000 | 1-4 | 1-10 | 10^-7 - 10^-3 | 0.042 - 4.2x10^-6
3 (High) | <100 | <1 | <1 | <10^-7 | <4.2x10^-6

Exhibit A.8.
Magnitude Scales for Environmental Fate Data (Magnitude Hierarchy 5)

Mobility Scale

Scale Value | Solubility, mg/L | Percent in water (PBT Profiler), dimensionless
1 (Low) | <1 | <25
2 (Medium) | 1-1,000 | >25-50
3 (High) | >1,000 | >50

Persistence Scale

Scale Value | Half Life (t1/2), time | Measured Degradation Rate (1), time | Modeled Degradation Rate (PBT Profiler), time
1 (Low) | days, days-weeks | days, days-weeks (BF, BFA) (2) | days, days-weeks
2 (Medium) | weeks, weeks-months | weeks, weeks-months (BS, BSA) | weeks, weeks-months
3 (High) | months, recalcitrant | months, recalcitrant (BST) | months, recalcitrant

(1) When two results are found for a measured degradation rate, the data are "averaged" and then a value determined.
(2) BF = Biodegrades Fast, BFA = Biodegrades Fast with Acclimation, BS = Biodegrades Slow, BST = Biodegrades Sometimes.

Appendix B. Example Blinded Information Sheets from the IDS Exercises

Contaminant 3

Contaminant Name:

Background: It is a volatile organic chemical. It is used as a wetting and dispersing agent in textile processing, dye-baths, stain and printing compositions; used in cleaning and detergent preparations, adhesives, cosmetics, deodorants, fumigants, emulsions and polishing compositions. Used in lacquers, paints, varnishes, paint and varnish removers. Degreasing agent. It is on the TSCA list. The reportable released quantity of this substance under CERCLA is 1 lb. It is also subject to RCRA waste management requirements, and is listed as a hazardous air pollutant by EPA. Several states have drinking water guidelines for this chemical (CA, FL, MA, ME, NC). Its one-day Health Advisory Level (HAL) is 4,000 ug/L, its 10-day HAL is 400 ug/L, and its 10^-4 cancer risk HAL is 300 ug/L. This is an HPV chemical. It is also on the CCL.
(HSDB, 2005; EPAHA, 2004)

HEALTH EFFECTS DATA

Data Element | Value | Units | Source | Notes
Reference Dose | N/A | | |
Carcinogen classification (EPA) | B2 (probable human carcinogen) | | IRIS | 9/1/1990
Slope Factor | 0.011 | 1/(mg/kg-d) | IRIS | 9/1/1990
Carcinogen Classification (IARC) | 2B (possible) | | IARC |
Non EPA Derived Dose (1) | 0.1 | mg/kg-d | ATSDR MRL | Chronic oral, UF=100
  Critical Effect | Hepatic effects | | |
  File/Issue Date | 10/1/2004 | | |
Lowest Oral Chronic LOAEL (1) | N/A | | |
Lowest Oral LD50 (1) | N/A | | |
Is contaminant on list of carcinogens? | Y | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity | 1/1/1988
Is the contaminant on a list of reproductive toxins? | N | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity |
Risk assessment ongoing? | Y | Y/N | |
Health Reference Level (HRL) (2) | 700 | ug/L | | Based on MRL
Health Reference Level (HRL) (2) cancer | 3.18 | ug/L | |
Health Reference Level (HRL) cancer | 300 | ug/L | | 10^-4 cancer risk Health Advisory (EPAHA, 1987)

Notes
(1) Non-EPA toxicology data will be sought if no EPA Reference Dose or carcinogen information available; may require multiple entries; chronic studies will be prioritized over short term studies.
(2) Health Reference Level calculated by conversion of RfD or other dose to units of ug/L, assuming 2 liters per day of water consumed by a 70 kg adult, and a default Relative Source Contribution of 20%. For carcinogens, the concentration at the 10^-6 cancer risk level will be converted to units of ug/L and will also be listed.
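The Health Reference Level conversion in note 2 can be illustrated with a short Python sketch. The function names are hypothetical, and the cancer formula is an assumption: it takes the dose at the target risk as risk divided by slope factor and applies no Relative Source Contribution, which is the reading that reproduces the 700 ug/L and 3.18 ug/L values on the Contaminant 3 sheet.

```python
def hrl_noncancer_ug_per_l(dose_mg_per_kg_day: float,
                           body_weight_kg: float = 70.0,
                           water_l_per_day: float = 2.0,
                           rsc: float = 0.20) -> float:
    """Convert an RfD (or other dose) to a drinking water level in ug/L."""
    mg_per_l = dose_mg_per_kg_day * body_weight_kg / water_l_per_day * rsc
    return mg_per_l * 1000.0  # mg/L -> ug/L


def hrl_cancer_ug_per_l(slope_factor_per_mg_kg_day: float,
                        risk: float = 1e-6,
                        body_weight_kg: float = 70.0,
                        water_l_per_day: float = 2.0) -> float:
    """Concentration in ug/L at the given cancer risk level.

    Assumption: dose at target risk = risk / slope factor, with no RSC.
    """
    dose_mg_per_kg_day = risk / slope_factor_per_mg_kg_day
    return dose_mg_per_kg_day * body_weight_kg / water_l_per_day * 1000.0


if __name__ == "__main__":
    # Contaminant 3 inputs: MRL 0.1 mg/kg-d, slope factor 0.011 per mg/kg-d.
    print(round(hrl_noncancer_ug_per_l(0.1), 2))
    print(round(hrl_cancer_ug_per_l(0.011), 2))
```

The defaults mirror the stated assumptions (70 kg adult, 2 liters per day, 20% Relative Source Contribution), so other body weights or intake rates can be explored by passing different arguments.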
Contaminant 3: OCCURRENCE DATA

Water Occurrence Data

Measure | Finished Water Occurrence - total
# PWSs/Sites sampled | No Data
# with Detects | No Data
% Detects | No Data
Minimum of Detects (ug/L) | No Data
Maximum of Detects (ug/L) | No Data
Median of Detects (ug/L) | No Data
99% of Detects (ug/L) | No Data

Measure | Source Water - Total
# PWSs/Sites sampled | No Data
# with Detects | No Data
% Detects | No Data
Minimum of Detects (ug/L) | No Data
Median of Detects (ug/L) | No Data
Mean of Detects (ug/L) | No Data
90% of Detects (ug/L) | No Data
95% of Detects (ug/L) | No Data
99% of Detects (ug/L) | No Data
Maximum of Detects (ug/L) | No Data

Production/Release

Data Element | Value | Units | Source
Production data | >1M-10M | lbs/yr | CUS-IUR (2002)
Pesticide Application - total | N/A | lbs/yr |
Pesticide Application - total (# States) | N/A | # States |
Release - total | 1,146,641 | lbs/yr | TRI
Release - total (# States) | 22 | # States | TRI
Release - to Surface Water | 75,119 | lbs/yr | TRI
Release - to SW (# States) | 9 | # States | TRI

Environmental Fate Parameters

Parameter | Value | Units | Source
T1/2, Half life | No Data | length of time |
Koc, Organic Carbon Partition Coefficient | 1 | L/kg | RAISCF
Kow, Octanol Water Partition Coefficient | Log -0.27 | unitless | RAISCF
HLC, Henry's Law Constant | 0.000196 | unitless | RAISCF
Water Solubility | 1,000,000 | mg/L | RAISCF
Kd, Distribution Coefficient | N/A | source specific |

No Data = No data found for this contaminant; N/A = Not applicable to contaminant

Contaminant 4

Contaminant Name:

Background: This is a volatile organic chemical. It is used as a food additive, organic intermediate, solvent, and in cosmetic formulations.
It is also used as a solvent or solubilizer in the paint and printing ink sector, as components in textile auxiliaries and pesticides, for hormone extraction, and in the surfactant field as foam boosters or antifrothing agents. Per the FDA, this food additive is permitted for direct addition to food for human consumption as a synthetic flavoring substance and adjuvant. (HSDB, 2005)

HEALTH EFFECTS DATA

Data Element | Value | Units | Source | Notes
Reference Dose | N/A | | |
Carcinogen classification (EPA) | N/A | | |
Slope Factor | N/A | | |
Carcinogen Classification (IARC) | N/A | | |
Non EPA Derived Dose (1) | N/A | | |
Lowest Oral Chronic LOAEL (1) | N/A | | |
Lowest Oral LD50 (1) | 500 | mg/kg | RTECS | Critical effect: Ataxia, irritability, dyspnea, acute pulmonary edema
Is contaminant on list of carcinogens? | N | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity |
Is the contaminant on a list of reproductive toxins? | N | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity |
Risk assessment ongoing? | N | Y/N | |
Health Reference Level (HRL) (2) | N/A | ug/L | |
Health Reference Level (HRL) (2) cancer | N/A | ug/L | |

Notes
(1) Non-EPA toxicology data will be sought if no EPA Reference Dose or carcinogen information available; may require multiple entries; chronic studies will be prioritized over short term studies.
(2) Health Reference Level calculated by conversion of RfD or other dose to units of ug/L, assuming 2 liters per day of water consumed by a 70 kg adult, and a default Relative Source Contribution of 20%. For carcinogens, the concentration at the 10^-6 cancer risk level will be converted to units of ug/L and will also be listed.
Contaminant 4: OCCURRENCE DATA

Water Occurrence Data

Measure | Finished Water Occurrence - total
# PWSs/Sites sampled | No Data
# with Detects | No Data
% Detects | No Data
Minimum of Detects (ug/L) | No Data
Maximum of Detects (ug/L) | No Data
Median of Detects (ug/L) | No Data
99% of Detects (ug/L) | No Data

Measure | Source Water - Total
# PWSs/Sites sampled | No Data
# with Detects | No Data
% Detects | No Data
Minimum of Detects (ug/L) | No Data
Median of Detects (ug/L) | No Data
Mean of Detects (ug/L) | No Data
90% of Detects (ug/L) | No Data
95% of Detects (ug/L) | No Data
99% of Detects (ug/L) | No Data
Maximum of Detects (ug/L) | No Data

Production/Release

Data Element | Value | Units | Source
Production data | >500K-1M | lbs/yr | CUS-IUR (2002)
Pesticide Application - total | N/A | lbs/yr |
Pesticide Application - total (# States) | N/A | # States |
Release - total | No Data | lbs/yr |
Release - total (# States) | No Data | # States |
Release - to Surface Water | No Data | lbs/yr |
Release - to SW (# States) | No Data | # States |

Environmental Fate Parameters

Parameter | Value | Units | Source
T1/2, Half life | No Data | length of time |
Koc, Organic Carbon Partition Coefficient | 15 | L/kg | HSDB
Kow, Octanol Water Partition Coefficient | Log 2.62 | unitless | HSDB
HLC, Henry's Law Constant | 1.88E-05 | atm-cu m/mol | HSDB
Water Solubility | 1000 | mg/L | HSDB
Kd, Distribution Coefficient | N/A | source specific |

No Data = No data found for this contaminant; N/A = Not applicable to contaminant

Contaminant 5

Contaminant Name:

Background: This is a volatile organic chemical registered for use in the U.S. Nematicide. Seventh most commonly used pesticide in U.S. agricultural crop production, used in organic synthesis and in manufacture of pesticides. Pre-plant soil fumigant. It is listed on FIFRA and TSCA. The reportable release quantity under CERCLA is 100 lbs.
It is subject to RCRA waste management requirements. It is listed as a hazardous air pollutant and as a hazardous substance by the Federal Water Pollution Control Act and the Clean Water Act. It has a state drinking water standard in CA. It has a state drinking water guideline in several states (FL, MA, ME, MN, WI). It has a DWEL of 1,000 ug/L, and its one-day and ten-day Health Advisory Levels (HALs) are 30 ug/L. This is an HPV chemical. (HSDB, 2005; EPAHA, 2004)

HEALTH EFFECTS DATA

Data Element | Value | Units | Source | Notes
Reference Dose | 0.03 | mg/kg-d | IRIS | Basis = BMDL(10) 3.4 mg/kg-d, Rat, UF=100, MF=1. Confidence: Study: High; Database: High; RfD: High
  Critical Effect | Chronic irritation | | |
  File/Issue Date | 5/25/2000 | | |
Reference Dose | 0.025 | mg/kg-d | OPP | Basis = NOEL 2.5 mg/kg-d, Rat, UF=100, MF=1
  Critical Effect | decrease in body weight gain and an increase in the incidence of basal cell hyperplasia of the nonglandular mucosa of the stomach | | |
  File/Issue Date | 1998 | | |
Carcinogen classification (EPA) | B2; inadequate in humans, sufficient in animals | | IRIS |
Slope Factor | 0.1 | 1/(mg/kg-d) | IRIS | 5/25/2000
Carcinogen Classification (IARC) | 2B (possible) | | IARC |
Non EPA Derived Dose (1) | N/A | | |
Lowest Oral Chronic LOAEL (1) | N/A | | |
Lowest Oral LD50 (1) | N/A | | |
Is contaminant on list of carcinogens? | Y | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity | 1/1/1989
Is the contaminant on a list of reproductive toxins? | N | Y/N | Cal EPA Chemicals Known to the State to Cause Cancer or Reproductive Toxicity |
Risk assessment ongoing? | N | Y/N | |
Health Reference Level (HRL) (2) | 210 | ug/L | | Based on IRIS RfD
Health Reference Level (HRL) (2) cancer | 0.35 | ug/L | | Based on IRIS slope factor
Health Reference Level (HRL) cancer | 40 | ug/L | | 10^-4 cancer risk Health Advisory (EPAHA, 1988)

Notes
(1) Non-EPA toxicology data will be sought if no EPA Reference Dose or carcinogen information available; may require multiple entries; chronic studies will be prioritized over short term studies.
(2) Health Reference Level calculated by conversion of RfD or other dose to units of ug/L, assuming 2 liters per day of water consumed by a 70 kg adult, and a default Relative Source Contribution of 20%. For carcinogens, the concentration at the 10^-6 cancer risk level will be converted to units of ug/L and will also be listed.

Contaminant 5: OCCURRENCE DATA

Water Occurrence Data

Finished Water Occurrence (Source: NCOD Round 1)

Measure | Total | SW | GW
# PWSs/Sites sampled | 9,164 | 898 | 8,303
# with Detects | 15 | 5 | 10
% Detects | 0.16% | 0.56% | 0.12%
Minimum of Detects (ug/L) | 0.5 | 1 | 0.5
Maximum of Detects (ug/L) | 2 | 2 | 1.6
Median of Detects (ug/L) | 1 | 1.25 | 0.5
99% of Detects (ug/L) | 2 | 2 | 1.6

Finished Water Occurrence (Source: NCOD Round 2)

Measure | Total | SW | GW
# PWSs/Sites sampled | 16,787 | 1,609 | 15,178
# with Detects | 58 | 10 | 48
% Detects | 0.35% | 0.62% | 0.32%
Minimum of Detects (ug/L) | 0.2 | 0.2 | 0.2
Maximum of Detects (ug/L) | 39 | 1.6 | 39
Median of Detects (ug/L) | 0.5 | 0.5 | 0.5
99% of Detects (ug/L) | 39 | 1.6 | 39

Measure | Source Water - Total
# PWSs/Sites sampled | No Data
# with Detects | No Data
% Detects | No Data
Minimum of Detects (ug/L) | No Data
Median of Detects (ug/L) | No Data
Mean of Detects (ug/L) | No Data
90% of Detects (ug/L) | No Data
95% of Detects (ug/L) | No Data
99% of Detects (ug/L) | No Data
Maximum of Detects (ug/L) | No Data

Production/Release

Data Element | Value | Units | Source
Production data | >1M-10M | lbs/yr | CUS-IUR (2002)
Pesticide Application - total | 34,717,237 | lbs/yr | NCFAP
Pesticide Application - total (# States) | 20 | # States | NCFAP
Release - total | 10,532 | lbs/yr | TRI
Release - total (# States) | 8 | # States | TRI
Release - to Surface Water | 85 | lbs/yr | TRI
Release - to SW (# States) | 3 | # States | TRI

Environmental Fate Parameters

Parameter | Value | Units | Source
T1/2, Half life | No Data | length of time |
Koc, Organic Carbon Partition Coefficient | 81 | L/kg | RAISCF
Kow, Octanol Water Partition Coefficient | Log 2.03 | unitless | RAISCF
HLC, Henry's Law Constant | 0.145 | unitless | RAISCF
Water Solubility | 2,800 | mg/L | RAISCF
Kd, Distribution Coefficient | N/A | source specific |

No Data = No data found for this contaminant; N/A = Not applicable to contaminant

Appendix C. CCL 3 Training Data Set (TDS) and Summary of EPA TDS Decisions.

For detailed Chemical Information Sheets for the TDS go to www.regulations.gov Docket ID EPA-HQ-OW-2007-1189.

Chemical ID (Chemical Algorithm Number): 5 16 91 22 53 10 48 49 101 51 20 37 52 4 54 24 55 26 33 21 19 11 66 61 44 90 8 86 57 38 92 93 32 14 2 17 82 84 35 58 40 45 60 25 94 36 23 59 81 62 83 69 6 65 89 95 27 96 97 18 50 9 30 67 98

CASRN: 75343 563586 87843 95636 122667 142289 123319 123911 111706 542756 594207 88062 99558 109068 75070 34256821 107028 142363539 309002 7429905 62533 7440360 7440382 1912249 1302789 100527 71432 95943 82688 50328 271896 140114 7440428 15541454 75274 74839 471341 120809 67663 1897456 2921882 77929 7440484 7758987 2691410 333415 683181 60571 124403 298044 75003 106934 7705080 50000 110009 98011 87683 115117 78795 7439921 1309428 7439965 57837191 16752775 598550

Contaminant Name: 1,1-Dichloroethane; 1,1-Dichloropropene; 1,2,3,4,5-Pentabromo-6-; 1,2,4-Trimethylbenzene; 1,2-Diphenylhydrazine; 1,3-Dichloropropane; 1,4-Benzenediol; 1,4-Dioxane; 1-Heptanol (4); 1-Propene, 1,3-dichloro-; 2,2-Dichloropropane; 2,4,6-Trichlorophenol; 2-Methyl-5-nitroaniline; 2-Methylpyridine (2); Acetaldehyde; Acetochlor; Acrolein; Alachlor ESA; Aldrin; Aluminum; Aniline; Antimony; Arsenic; Atrazine; Bentonite (2); Benzaldehyde (3); Benzene; Benzene, 1,2,4,5-tetrachloro- (3); Benzene, pentachloronitro-; Benzo(a)pyrene; Benzofuran (3); Benzyl acetate (3); Boron; Bromate; Bromodichloromethane; Bromomethane; Calcium carbonate; Catechol (2); Chloroform; Chlorothalonil; Chlorpyrifos; Citric acid (2); Cobalt; Copper sulfate; Cyclotetramethylenetetranitramine (3); Diazinon; Dibutyltin dichloride; Dieldrin; Dimethylamine (2); Disulfoton;
Ethane, chloro- (2) Ethylenedibromide Ferric chloride Formaldehyde Furan (3) Furfural (3) Hexachlorobutadiene Isobutene (3) Isoprene (3) Lead Magnesium hydroxide Manganese Metalxyl Methomyl Methyl Carbarn ate (3) INPUT ATTRIBUTE SCORES Potency 6 4 5 4 6 5 4 4 5 5 4 7 5 4 5 5 6 3 8 3 5 6 7 5 3 4 5 7 6 7 4 3 4 6 5 6 3 4 5 5 6 2 5 5 4 7 5 8 4 7 3 7 3 4 6 6 7 4 6 7 2 4 4 5 4 Severity 8 5 8 3 8 6 3 8 6 8 5 3 8 3 5 7 9 3 8 6 6 9 8 7 1 6 8 6 6 8 6 3 3 8 8 5 6 8 4 8 3 3 3 3 6 3 5 8 3 6 8 8 3 6 6 3 6 4 5 3 3 5 3 6 8 Prevalence 6 2 1 6 1 1 8 9 3 4 2 1 1 7 10 1 3 9 1 9 9 10 10 9 3 6 8 5 9 4 1 5 10 10 10 6 10 10 10 4 9 7 4 10 3 1 10 3 10 1 4 7 10 10 6 6 4 10 8 9 10 10 9 2 5 Magnitude 7 6 5 6 1 6 8 9 7 6 6 1 1 6 10 1 7 3 6 10 8 7 8 6 1 7 7 7 6 4 7 7 10 7 8 7 10 6 8 4 2 8 8 9 5 1 5 6 8 1 7 4 10 10 8 7 5 7 7 8 10 10 3 7 10 Team Consensus Blinded Decisions List=4 Mean 3.50 1.67 2.20 1.50 1.17 1.67 2.33 3.67 2.60 2.83 1.50 1.00 1.17 1.83 3.50 1.17 3.00 1.50 2.83 3.17 3.17 3.67 4.00 3.50 1.00 2.40 3.67 3.20 3.33 3.00 1.80 1.20 2.50 3.83 3.67 2.83 2.83 3.33 2.83 2.67 1.83 1.33 2.00 3.00 1.40 1.17 2.67 3.33 2.33 1.33 2.50 3.33 2.17 3.33 3.00 2.40 2.83 2.40 3.20 3.33 1.83 3.17 1.50 2.17 3.60 Integer Score 4 2 2 2 1 2 2 4 3 3 2 1 1 2 4 1 3 2 3 3 3 4 4 4 1 2 4 3 3 3 2 1 3 4 4 3 3 3 3 3 2 1 2 3 1 1 3 3 2 1 3 3 2 3 3 2 3 2 3 3 2 3 2 2 4 L/NL L NL? NL? NL? NL NL? NL? L L? L? NL? NL NL NL? L NL L? NL? L? L? L? L L L NL NL? L L? L? L? NL? NL L? L L L? L? L? L? L? NL? NL NL? L? NL NL L? L? NL? NL L? L? NL? L? L? NL? L? NL? L? L? NL? L? NL? NL? 
L C-1 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to the CCL EPA815-R-09-008 August 2009 Chemical ID Chemical Algorithm Number 99 85 79 41 29 42 56 28 31 46 68 1 70 71 78 15 100 13 3 72 88 73 43 7 74 39 75 34 76 77 64 87 102 80 47 12 63 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 CASRN 12108133 98851 51218452 21087649 1634044 91203 14797558 14797650 98953 1836755 100754 109319 95487 1910425 56382 14797730 77098 584087 1610180 96184 106490 91225 7440235 144558 96093 14808798 75650 127184 78002 62566 108883 78422 102716 76879 1330207 7646857 Contaminant Name Methylcyclopentadienylmanganese tricarbonyl(MMT) (3) Methylphenyl carbinol (3) Metolachlor Metribuzin MTBE Naphthalene Nitrate Nitrite Nitrobenzene Nitrofen N-Nitrosopiperidine Nonanedioic acid, dihexyl ester (2) o-cresol Paraquat Parathion Perchlorate Phenolphthalein (3) Potassium carbonate Prometon Propane, 1,2,3-trichloro- p-Toluidine (3) Quinoline Sodium Sodium bicarbonate Styrene oxide Sulfate tert-Butanol Tetrachloroethylene Tetraethyl lead Thiourea Toluene Tri(ethylhexyl) phosphate (3) Triethanolamine (5) Triphenyltin hydroxide Xylenes Zinc chloride Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic INPUT ATTRIBUTE SCORES Potency 7 3 4 5 4 5 3 4 6 5 7 2 4 5 5 8 3 3 5 7 6 8 4 4 6 4 3 5 10 4 4 4 4 7 4 4 4 3 5 6 2 6 8 2 4 7 4 10 1 9 1 2 7 4 7 2 6 10 5 9 3 8 10 8 1 10 3 10 Severity 9 3 3 3 5 3 3 3 5 8 8 7 6 5 3 7 8 3 7 8 8 8 4 9 8 3 3 8 6 7 3 6 8 3 3 3 1 2 3 1 8 9 8 2 1 3 8 8 3 1 1 6 8 5 3 6 6 9 1 3 6 7 5 8 5 4 8 8 Prevalence 5 10 6 1 5 7 10 10 1 1 1 1 1 1 2 9 1 10 1 3 5 7 10 10 1 10 10 9 9 7 9 3 8 10 9 10 7 2 1 2 1 1 8 10 8 8 8 8 5 1 1 1 4 10 
10 7 7 10 5 3 1 1 1 8 10 8 10 9 Magnitude 8 8 7 6 8 6 10 9 10 5 1 1 1 1 3 8 7 10 1 6 8 6 10 10 1 10 8 7 8 5 7 5 7 6 7 9 7 2 2 1 1 4 2 9 1 3 1 1 9 7 1 9 7 9 10 7 5 7 3 3 3 2 1 8 3 5 1 4 Team Consensus Blinded Decisions List=4 Mean 3.80 1.80 1.83 1.50 2.33 2.00 2.00 2.50 2.00 2.17 1.67 1.00 1.00 1.00 1.17 4.00 1.80 2.00 1.17 3.33 3.60 3.83 2.83 3.17 1.17 2.50 1.83 3.67 4.00 2.67 2.33 1.80 3.00 2.83 2.17 2.50 1.50 1.00 1.00 1.00 1.00 2.33 3.50 2.00 1.00 2.00 2.33 3.50 1.33 1.67 1.00 1.83 3.67 3.17 3.67 2.17 2.67 4.00 1.00 2.00 1.00 2.33 2.00 4.00 1.33 3.33 1.83 3.83 Integer Score 4 2 2 2 2 2 2 3 2 2 2 1 1 1 1 4 2 2 1 3 4 4 3 3 1 3 2 4 4 3 2 2 3 3 2 3 2 1 1 1 1 2 4 2 1 2 2 4 1 2 1 2 4 3 4 2 3 4 1 2 1 2 2 4 1 3 2 4 L/NL L NL? NL? NL? NL? NL? NL? L? NL? NL? NL? NL NL NL NL L NL? NL? NL L? L L L? L? NL L? NL? L L L? NL? NL? L? L? NL? L? NL? NL NL NL NL NL? L NL? NL NL? NL? L NL NL? NL NL? L L? L NL? L? L NL NL? NL NL? NL? L NL L? NL? L C-2 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to the CCL EPA815-R-09-008 August 2009 Chemical ID Chemical Algorithm Number 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 CASRN Contaminant Name Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic 
Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic Synthetic INPUT ATTRIBUTE SCORES Potency 5 2 8 1 6 1 6 8 1 9 5 10 5 9 6 2 6 3 6 9 3 8 4 8 3 9 10 8 4 7 4 9 7 4 7 1 6 2 2 8 5 10 5 8 7 2 8 1 10 4 10 6 1 7 2 8 6 2 8 5 8 4 10 9 4 7 3 9 1 Severity 4 2 5 8 8 2 5 8 8 8 4 4 6 8 6 1 2 7 8 8 1 2 9 8 5 8 2 4 8 6 4 3 7 7 8 4 3 8 3 8 3 3 8 7 7 4 3 8 8 4 8 4 6 6 3 2 9 8 8 4 1 7 8 7 1 2 8 8 1 Prevalence 3 10 3 5 2 9 6 8 7 7 3 5 3 9 1 8 8 6 8 2 1 2 5 1 6 9 1 6 8 9 1 4 7 3 1 10 10 9 10 6 3 2 2 2 9 6 10 7 6 4 2 5 5 4 10 7 6 7 8 1 1 4 10 4 8 10 10 7 10 Magnitude 8 7 8 10 6 8 6 6 7 10 3 3 5 9 1 3 2 3 1 3 6 7 7 7 8 4 1 9 7 8 4 3 7 5 4 2 5 3 5 5 10 6 6 7 4 9 6 8 8 2 9 3 1 2 4 2 6 4 4 6 7 7 3 6 9 6 9 8 5 Team Consensus Blinded Decisions List=4 Mean 2.50 1.17 3.33 2.83 3.00 1.33 2.83 3.83 2.17 4.00 1.17 2.67 2.17 4.00 1.17 1.17 1.50 2.17 2.83 3.17 1.17 2.33 3.33 3.50 2.33 3.67 1.50 3.33 3.67 3.83 1.00 2.33 3.50 2.17 2.67 1.33 2.50 2.17 1.17 3.67 2.50 2.83 2.67 3.33 3.67 2.33 3.50 2.33 3.83 1.00 3.67 2.00 1.00 2.00 1.00 1.50 3.67 2.00 3.83 2.33 1.67 2.83 3.83 3.83 2.00 2.67 3.67 4.00 1.17 Integer Score 3 1 3 3 3 1 3 4 2 4 1 3 2 4 1 1 2 2 3 3 1 2 3 4 2 4 2 3 4 4 1 2 4 2 3 1 3 2 1 4 3 3 3 3 4 2 4 2 4 1 4 2 1 2 1 2 4 2 4 2 2 3 4 4 2 3 4 4 1 L/NL L? NL L? L? L? NL L? L NL? L NL L? NL? L NL NL NL? NL? L? L? NL NL? L? L NL? L NL? L? L L NL NL? L NL? L? NL L? NL? NL L L? L? L? L? L NL? L NL? L NL L NL? NL NL? NL NL? L NL? L NL? NL? L? L L NL? L? L L NL C-3 ------- EPA-OGWDW Final CCL 3 Chemicals: EPA815-R-09-008 Classification of the PCCL to CCL August 2009 APPENDIX D. SOFTWARE SOURCES Artificial Neural Networks - ANN methods packaged in R software libraries "MASS" and "nnet" are available at no charge from the website http ://www. r-project.org, under the Free Software Foundation's GNU General Public License. 
Univariate Decision Tree - CART - methods packaged in the R software library "rpart" are available at no charge from the website http://www.r-project.org, under the Free Software Foundation's GNU General Public License.

Multivariate Decision Tree - QUEST software is available at no charge from the website http://www.stat.wisc.edu/~loh/quest.html

Linear Modeling - Likelihood function was maximized using MathCAD's built-in Maximize function (www.mathsoft.com).

Multivariate Adaptive Regression Splines - MARS methods packaged in the R software library "polspline" are available at no charge from the website http://www.r-project.org, under the Free Software Foundation's GNU General Public License.

APPENDIX E. SOLUTIONS TO THE CLASSIFICATION MODELS USED IN THE CCL PROCESS

Artificial Neural Network - The software used does not reveal its decision rule. Instead, it provides classifications for contaminants that have been scored for the four attributes. When given a complete set of all possible combinations of integer attribute scores, the software provides classifications. Although not expressed mathematically, this complete description of the decision rule can be seen in Exhibit 4-4.

Example: Contaminant with scores (3, 4, 5, 6). Exhibit 29 shows this as a dark blue point. Not List

Simple Linear Model - The maximum likelihood linear model is shown below. Y[i] is the estimated team-average classification and Pot[i], Sev[i], Prev[i], Mag[i] are the attribute scores for contaminant i. If Y[i] is less than 1.5, then the classification is Not List. Similarly, if Y[i] is at least 3.5, then the classification is List.

Y[i] = -1.671 + 0.241 * Pot[i] + 0.217 * Sev[i] + 0.116 * Prev[i] + 0.170 * Mag[i]

Example: Contaminant with scores (3, 4, 5, 6).
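The linear rule above can be sketched in a few lines of Python. The intercept, the four coefficients, and the two cut-offs (below 1.5 for Not List, at least 3.5 for List) are taken from the text. The function names are hypothetical, and the 2.5 split between the two intermediate categories is an assumption, chosen to match the way team means are rounded to integer scores in Appendix C.

```python
# Coefficients of the maximum likelihood linear model, as given in the text.
INTERCEPT = -1.671
WEIGHTS = {"pot": 0.241, "sev": 0.217, "prev": 0.116, "mag": 0.170}


def linear_score(pot, sev, prev, mag):
    """Return Y, the estimated team-average classification (NL=1 ... L=4)."""
    return (INTERCEPT
            + WEIGHTS["pot"] * pot
            + WEIGHTS["sev"] * sev
            + WEIGHTS["prev"] * prev
            + WEIGHTS["mag"] * mag)


def classify(y):
    """Map Y to a category.

    The 1.5 and 3.5 cut-offs are stated in the text. The 2.5 split between
    the intermediate categories is an assumption (simple rounding).
    """
    if y < 1.5:
        return "Not List"
    if y >= 3.5:
        return "List"
    return "Not List?" if y < 2.5 else "List?"


if __name__ == "__main__":
    # Example contaminant with scores (Pot, Sev, Prev, Mag) = (3, 4, 5, 6).
    y = linear_score(3, 4, 5, 6)
    print(f"Y = {y:.3f} -> {classify(y)}")
```

The hand calculation for the same example contaminant follows.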
Y = -1.671 + 0.241 * 3 + 0.217 * 4 + 0.116 * 5 + 0.170 * 6 = 1.520 → Not List? (1.520 is not less than 1.5, so the contaminant falls just above the Not List cutoff)

Multivariate Tree (QUEST) - The solution involves a number of intermediate nodes and terminal nodes arranged as shown in Exhibit 4.1.1. When a contaminant encounters an intermediate node, a weighted sum of attribute scores is compared to a threshold value. The direction the contaminant moves from the node depends on whether the threshold is exceeded. Vector notation is used below to simplify the description. Letting X[i] be a column vector of attribute scores, (Pot[i], Sev[i], Prev[i], Mag[i])^T, then B1^T*X[i] is the vector product of B1 (a column vector of weights) and X[i], which, in turn, is compared with the threshold. When the contaminant encounters a terminal node (Node 6, 10, 11, 16, 17, 29, 30, or 31), a classification is assigned.

Node 1: If B1^T*X[i] < 0.3023, then Node 2, otherwise Node 3.
  Node 2: If B2^T*X[i] < 0.3844, then Node 4, otherwise Node 5.
    Node 4: If B4^T*X[i] < 0.6460, then Node 6, otherwise Node 7.
      Node 6: Not List
      Node 7: If B7^T*X[i] < 3.336, then Node 10, otherwise Node 11.
        Node 10: Not List
        Node 11: Not List?
    Node 5: If B5^T*X[i] < 1.213, then Node 16, otherwise Node 17.
      Node 16: Not List?
      Node 17: List?
  Node 3: If B3^T*X[i] < 1.181, then Node 28, otherwise Node 29.
    Node 28: If B28^T*X[i] < 6.460, then Node 30, otherwise Node 31.
      Node 30: List?
      Node 31: List
    Node 29: List

Exhibit A.1 - Tree Produced by QUEST (heavy arrows show path of contaminant with attribute scores 3, 4, 5, 6)
[Figure labels: Contaminant Entry: (3, 4, 5, 6); Terminal Node Index]

Exhibit A.2 - The column vectors of weights (rows follow the order of X: Pot, Sev, Prev, Mag):

        B1        B2       B3       B4       B5       B7       B28
Pot   0.01631   0.03008  0.05223  0.06890  0.07779  0.3531   0.2966
Sev   0.01315   0.02075  0.06855  0.01756  0.06447  0.1136   0.3174
Prev  0.007523  0.01214  0.03516  0.01753  0.03300  0.07560  0.1995
Mag   0.01034   0.02043  0.01807  0.05501  0.04850  0.2144   0.1952

Example: Contaminant with scores X = (3, 4, 5, 6)

Node 1: B1^T*X = 0.01631*3 + 0.01315*4 + 0.007523*5 + 0.01034*6 = 0.2012
This is less than 0.3023, so go to Node 2.

Node 2: B2^T*X = 0.03008*3 + 0.02075*4 + 0.01214*5 + 0.02043*6 = 0.3565
This is less than 0.3844, so go to Node 4.

Node 4: B4^T*X = 0.06890*3 + 0.01756*4 + 0.01753*5 + 0.05501*6 = 0.6947
This exceeds 0.6460, so go to Node 7.

Node 7: B7^T*X = 0.3531*3 + 0.1136*4 + 0.07560*5 + 0.2144*6 = 3.1781
This is less than 3.336, so go to Node 10.

Node 10: Not List

Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results

Set 1 Summary
CASRN: 51285 60571 62737 63252 67641 67721 72559 74839 74873 74953 74975 75150 75343 75694 75718
Common Name: 2,4-Dinitrophenol; Dieldrin; Dichlorvos; Carbaryl; Acetone; Hexachloroethane; p,p'-DDE; Methyl bromide (Bromomethane); Chloromethane (Methyl chloride); Dibromomethane; Halon 1011 (bromochloromethane); Carbon disulfide; 1,1-Dichloroethane; CFC-11 (Trichlorofluoromethane); CFC-12 (Dichlorodifluoromethane)
Model Decision: NL; L?-L; NL-NL?; NL?; L?; NL; NL-NL?; L?; L?; NL?; NL?; NL?-L?; L?; L?-L; NL?
# Evaluators 18 18 16 16 18 17 16 17 16 14 13 15 12 13 13 % agreement 100 83 88 69 78 100 88 82 81 71 62 53 67 69 77 Direction - disagree +/-(+ toward L) +/-0 +7 -1 +/-0 -2 +1 +1 +3 +/-0 -1 -2 -2 +1 +1 -4 Value (L=4; NL=1) 1.00 3.66 1.63 2.00 2.75 1.06 1.61 3.11 2.88 1.92 1.83 2.21 3.00 3.38 1.71 Category NL L NL? NL? L? NL NL? L? L? NL? NL? NL? L? L? NL? Overall Confidence H% 65% 41% 33% 33% 40% 44% 40% 47% 50% 36% 40% 29% 18% 42% 50% M% 35% 47% 53% 47% 60% 56% 53% 47% 50% 57% 60% 64% 64% 50% 42% L% 0% 12% 13% 20% 0% 0% 7% 7% 0% 7% 0% 7% 18% 8% 8% Value H=3; L=1) 2.647 2.294 2.200 2.133 2.400 2.438 2.333 2.400 2.500 2.286 2.400 2.214 2.000 2.333 2.417 POTENCY Data Element Element (L4G) Reference Dose (RfD) Lifetime Cancer Risk (10A-4) Lifetime Cancer Risk (10A-4) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Lifetime Cancer Risk (10A-4) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Slope Factor (Oral) Reference Dose (RfD) Reference Dose (RfD) Source IRIS IRIS IRIS OPP IRIS IRIS IRIS IRIS EPAHA RAISH E EPAHA IRIS OEHHA IRIS IRIS Type (NCAR/ CAR) NCAR CAR CAR NCAR NCAR NCAR CAR NCAR NCAR NCAR NCAR NCAR CAR NCAR NCAR PREVALENCE Data Element Element (L4G) Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Samples (Detects), Surface Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, 
Finished Source UCMR NCODR1 2 NREC NCODR1 2 NAWQA NAWQA UCMR NCODR1 2 NCODR1 2 NCODR1 2 NCODR1 2 NAWQA NCODR1 2 NCODR1 2 NCODR1 2 F-1 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 79345 80626 86500 87616 87683 88062 91203 94746 95498 95636 Set 1 Summary Common Name 1,1,2,2- Tetrachloroethane Methyl methacrylate Azinphos-methyl 1,2,3- Trichlorobenzene Hexachlorobutadiene 2,4,6-Trichlorophenol Naphthalene MCPA o-Chlorotoluene 1,2,4- Trimethylbenzene Model Decision NL? NL NL? NL - NL? L? NL NL? NL? - L? NL? NL? # Evaluators 14 13 12 13 13 13 13 14 13 13 % agreement 64 100 100 77 77 92 85 71 77 69 Direction -disagree +/-(+ toward L) +1 +/-0 +1 -3 -2 +1 -1 -1 -4 -1 Value (L=4; NL=1) 2.04 1.00 2.15 1.47 2.75 1.08 1.93 2.38 1.71 1.92 Category NL? NL NL? NL L? NL NL? NL? NL? NL? Overall Confidence H% 36% 58% 27% 42% 46% 42% 67% 33% 58% 33% M% 55% 42% 64% 42% 46% 50% 33% 42% 42% 33% L% 9% 0% 9% 17% 8% 8% 0% 25% 0% 33% Value H=3; L=1) 2.273 2.583 2.182 2.250 2.385 2.333 2.667 2.083 2.583 2.000 POTENCY Data Element Element (L4G) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Source EPAHA IRIS OPP RTECS EPAHA EPAHA IRIS OPP IRIS RAISH E Type (NCAR/ CAR) NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR PREVALENCE Data Element Element (L4G) Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs 
(Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Source NCODR1 2 NAWQA NAWQA NCODR1 2 NCODR1 2 UCMR NCODR1 2 NAWQA NCODR1 2 NCODR1 2 F-2 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 2212671 2312358 5989275 7439987 7440020 7440097 7440213 7440235 7440246 7440428 7440484 7440564 Set 2 Summary Common Name Molinate Propargite (D)-Limonene Molybdenum Nickel Potassium Silicon Sodium Strontium Boron Cobalt Germanium Model Decision L? L? NL? L?-L L? L? L L? L? L? NL? - L? L? # Evaluator s 19 18 17 18 18 18 18 19 19 18 17 18 % agreement 84 72 82 78 89 44 61 68 74 61 71 61 Direction -disagree +/-(+ toward L) -1 -5 -4 +/-0 -2 -9 -4 -3 +/-0 +3 -1 -2 Value (L=4; NL=1) 3 3 2 3 3 2 3 3 3 3 2 3 Categor y L? L? NL? L? L? NL? L? L? L? L? NL? L? Overall Confidence H% 32% 24% 24% 50% 28% 24% 17% 26% 26% 24% 24% 18% M% 58% 59% 59% 39% 67% 24% 33% 37% 47% 53% 53% 24% L% 11% 18% 18% 11% 6% 53% 50% 37% 26% 24% 24% 59% Value H=3; L=1) 2.211 2.059 2.059 2.389 2.222 1.706 1.667 1.895 2.000 2.000 2.000 1.588 POTENCY Data Element Element (L4G) Reference Dose (RfD) Reference Dose (RfD) No Observed Effect Level (NOEL) UL Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Lethal Dose 50 (LD50) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Reference Dose (RfD) MRL-Int Lowest Observed Adverse Effect Level (LOAEL) Source IRIS OPP NTP IOM IRIS NAS RTECS RTECS IRIS IRIS ATSDR RTECS Type (NCAR /CAR) NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR PREVALENCE Data Element Element (L4G) Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Samples (Detects), Surface Water, Ambient Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, 
Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Percentage of Samples (Detects), All Water, Finished Source UCMR NAWQA NREC NIRS NIRS NIRS NIRS NIRS NIRS NIRS NIRS NIRS F-3 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 7440622 7664417 7723140 13071799 13194484 13494809 14797730 16655826 16752775 21087649 21725462 25013165 25057890 27314132 Set 2 Summary Common Name Vanadium Ammonia White Phosphorus Terbufos Ethoprop Tellurium Perchlorate 3-Hydroxycarbofuran Methomyl Metribuzin Cyanazine Butylated hydroxyanisole Bentazon Norflurazon Model Decision L?-L NL? L NL NL? NL? - L? NL? - L? L? NL? NL - NL? NL? NL? NL? NL? # Evaluator s 18 17 19 17 16 16 16 18 16 16 17 15 15 14 % agreement 78 82 100 82 81 56 50 83 56 69 65 73 53 79 Direction -disagree +/-(+ toward L) -4 -4 -1 +3 +2 +2 +6 +2 -1 +/-0 +/-0 +/-0 +1 +2 Value (L=4; NL=1) 3 2 4 1 2 2 3 3 2 2 2 2 2 2 Categor y L? NL? L NL NL? NL? L? L? NL? NL? NL? NL? NL? NL? 
Overall Confidence H% 18% 24% 63% 63% 33% 18% 33% 29% 27% 50% 31% 13% 36% 31% M% 59% 65% 32% 31% 39% 18% 47% 53% 67% 31% 63% 40% 57% 46% L% 24% 12% 5% 6% 28% 65% 20% 18% 7% 19% 6% 47% 7% 23% Value H=3; L=1) 1.941 2.118 2.579 2.563 2.056 1.529 2.133 2.118 2.200 2.313 2.250 1.667 2.286 2.077 POTENCY Data Element Element (L4G) MRL-Int Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) NOAEL Reference Dose (RfD) RfD Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Reference Dose (RfD) Source ATSDR RAISHE IRIS OPP OPP Journal IRIS OPP OPP OPP EPAHA RTECS IRIS OPP Type (NCAR /CAR) NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR PREVALENCE Data Element Element (L4G) Percentage of Samples (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Samples (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Samples (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Samples (Detects), Surface Water, Ambient Percentage of Sites (Detects), All Water, Ambient Percentage of Sites (Detects), All Water, Ambient Source NIRS NAWQA NIRS UCMR NAWQA NIRS UCMR NCODR12 NCODR12 NCODR12 NAWQA NREC NAWQA NAWQA F-4 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 34014181 34256821 51218452 Set 2 Summary Common Name Tebuthiuron Acetochlor Metolachlor Model Decision NL-NL? NL NL? 
# Evaluators: 15 16 13
% agreement: 73 69 69
Direction - disagree +/- (+ toward L): -4 +4 -3
Value (L=4; NL=1): 1 1 2
Category: NL NL NL?
Overall Confidence H%: 53% 67% 38%
M%: 33% 20% 54%
L%: 13% 13% 8%
Value (H=3; L=1): 2.400 2.533 2.308
POTENCY Data Element, Element (L4G): Reference Dose (RfD); Reference Dose (RfD); Reference Dose (RfD)
Source: OPP IRIS OPP
Type (NCAR/CAR): NCAR NCAR NCAR
PREVALENCE Data Element, Element (L4G): Percentage of Sites (Detects), All Water, Ambient; Percentage of PWSs (Detects), All Water, Finished; Percentage of PWSs (Detects), All Water, Finished
Source: NAWQA UCMR NCODR12

Set 3 Summary
CASRN: 96184 96333 98066 98953 103651 106434 107028 107131 108054 108861 109999 115968 121142
Common Name: 1,2,3-Trichloropropane; Methyl acrylate; tert-Butylbenzene; Nitrobenzene; n-Propylbenzene; p-Chlorotoluene; Acrolein; Acrylonitrile; Vinyl acetate; Bromobenzene; Tetrahydrofuran; Trichlorethyl phosphate; 2,4-Dinitrotoluene
Model Decision: NL?; NL; NL?; NL?-L?; NL?; NL?; L?-L; NL?-NL; NL; NL?; L?; NL?-L?; L?-L
# Evaluators: 16 15 16 16 16 15 16 15 15 16 16 14 15
% agreement: 75 93 75 44 94 87 69 73 100 69 75 50 60
Direction - disagree +/- (+ toward L): +1 +1 -1 +5 +1 -1 +1 +3 +/-0 +3 -1 -3 +1
Value (L=4; NL=1): 2.12 1.07 1.97 2.75 2.03 1.94 3.53 1.78 1.00 2.09 2.93 2.39 3.53
Category: NL? NL NL? L? NL? NL? L NL? NL NL? L? NL?
L Overall Confidence H% 44% 40% 19% 31% 31% 31% 25% 20% 40% 27% 13% 7% 38% M% 31% 53% 69% 38% 50% 56% 63% 73% 47% 53% 47% 60% 54% L% 25% 7% 13% 31% 19% 13% 13% 7% 13% 20% 40% 33% 8% Value H=3; L=1) 2.188 2.333 2.063 2.000 2.125 2.188 2.125 2.133 2.267 2.067 1.733 1.733 2.308 POTENCY Data Element Element (L4G) Reference Dose (RfD) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Reference Dose (RfD) Lifetime Cancer Risk (10A-4) Reference Dose (RfD) Reference Dose (RfD) No Observed Effect Level (NOEL) Reference Dose (RfD) Lifetime Cancer Risk (10A-4) Source IRIS RAISHE RTECS IRIS RTECS EPAHA /IRIS RAISHE EPAHA RAISHE RAISHE Journal RAISHE EPAHA Type (NCAR / CAR) NCAR NCAR NCAR NCAR NCAR NCAR NCAR CAR NCAR NCAR NCAR NCAR CAR PREVALENCE Data Element Element (L4G) Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Sites (Detects), All Water, Ambient Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of Samples (Detects), Surface Water, Ambient Percentage of PWSs (Detects), All Water, Finished Source NCODR12 NAWQA NCODR12 UCMR NCODR12 NCODR12 NAWQA NAWQA NAWQA NCODR12 NAWQA NREC UCMR F-6 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. 
Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 121755 122667 126987 135988 298044 309002 314409 50000 50997 75570 78002 78795 78820 101779 Set 3 Summary Common Name Malathion 1 ,2-Diphenylhydrazine Met hacrylonit rile sec-Butylbenzene Disulfoton Aldrin Bromacil Formaldehyde D-Glucose Tetramethylammonium chloride Tetraethyl lead Isoprene Isobutyronitrile Benzenamine, 4,4'- methylenebis- Model Decision NL NL-NL? NL NL? NL L? NL? L?-L NL?-NL L? L L?-L L?-L L # Evaluators 13 12 14 15 14 15 15 15 14 14 15 15 15 15 % agreement 77 100 93 93 71 73 73 67 64 57 73 47 33 67 Direction - disagree +/-(+ toward L) +3 +/-0 +1 +/-0 +3 +4 -4 -3 -3 -3 -2 -7 -7 -5 Value (L=4; NL=1) 1.23 1.50 1.11 2.00 1.35 3.27 1.80 3.27 2.14 2.77 3.88 2.94 3.00 3.40 Category NL NL? NL NL? NL L? NL? L? NL? L? L L? L? L? Overall Confidence H% 31% 64% 54% 29% 38% 33% 36% 13% 8% 14% 7% 7% 7% 13% M% 54% 27% 31% 64% 38% 47% 43% 47% 8% 7% 43% 21% 0% 47% L% 15% 9% 15% 7% 23% 20% 21% 40% 85% 79% 50% 71% 93% 40% Value H=3; L=1) 2.154 2.545 2.385 2.214 2.154 2.133 2.143 1.733 1.231 1.357 1.571 1.357 1.133 1.733 POTENCY Data Element Element (L4G) Reference Dose (RfD) Lifetime Cancer Risk (10M) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Reference Dose (RfD) Lifetime Cancer Risk (10M) Reference Dose (RfD) Reference Dose (RfD) Lethal Dose 50 (LD50) Lethal Dose 50 (LD50) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Lethal Dose 50 (LD50) Slope Factor (Oral) Source OPP IRIS IRIS RTECS OPP, 2002 EPAHA OPP IRIS RTECS RTECS IRIS RTECS HSDB OEHHA Type (NCAR/ CAR) NCAR CAR NCAR NCAR NCAR CAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR CAR PREVALENCE Data Element Element (L4G) Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, 
Finished Percentage of Sites (Detects), All Water, Ambient Release, Number of States Production Volume Production Volume Production Volume Production Volume Production Volume Release, Number of States Source NAWQA UCMR NAWQA NCODR12 UCMR NCODR12 NAWQA TRI CUS/IUR CUS/IUR CUS/IUR CUS/IUR CUS/IUR TRI F-7 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 108930 302012 625558 1111780 1335326 3268493 4719044 5216251 6610293 13463406 23422539 71751412 91465086 Set 3 Summary Common Name Cyclohexanol Hydrazine Isopropyl formate Ammonium carbamate Lead acetate Methional Hexahydro-1 ,3,5-tris(2- hydroxyethyl)-s-triazine 4-Chlorobenzotrichloride Methylthiosemicarbazide Iron pentacarbonyl Methanimidamide, N,N- dimethyl-N'-[3- [[(methylamino)carbonyl]o xy]phenyl]-, monohydrochloride Avermectin B1 Cyclopropanecarboxylic acid, 3-2-chloro-3,3,3- trifluoro-1-propenyl)-2,2- dimethyl- cyano(3- phenoxyphenyl)methyl ester, 1. alpha. (S*),3.alpha.(Z)- (.+ -.)- Model Decision L?-L L L L? L L? L L?-L L? L?-L L?-L L?-L L?-L # Evaluators 14 15 13 13 14 13 13 14 14 13 14 13 14 % agreement 64 87 54 77 50 69 38 86 71 62 57 69 71 Direction - disagree +/-(+ toward L) -6 -1 -5 -2 -6 -2 -6 -2 -2 -6 -3 -1 -4 Value (L=4; NL=1) 2.83 3.79 3.46 2.75 3.35 2.86 3.41 3.21 2.75 2.81 3.27 3.39 3.11 Category L? L L? L? L? L? L? L? L? L? L? L? L? 
Overall Confidence H% 7% 13% 0% 14% 8% 0% 7% 14% 0% 0% 17% 14% 14% M% 21% 53% 7% 7% 17% 31% 7% 36% 15% 31% 42% 29% 36% L% 71% 33% 93% 79% 75% 69% 86% 50% 85% 69% 42% 57% 50% Value H=3; L=1) 1.357 1.800 1.071 1.357 1.333 1.308 1.214 1.643 1.154 1.308 1.750 1.571 1.643 POTENCY Data Element Element (L4G) Lethal Dose 50 (LD50) Lifetime Cancer Risk (10A-4) Lethal Dose 50 (LD50) Lethal Dose 50 (LD50) Slope Factor (Oral) Lethal Dose 50 (LD50) Lethal Dose 50 (LD50) NOAEL Lethal Dose 50 (LD50) Lethal Dose 50 (LD50) Reference Dose (RfD) ADI Reference Dose (RfD) Source RTECS IRIS RTECS RTECS OEHHA RTECS RTECS OPPT RTECS HSDB OPP JMPR 1997 IRIS Type (NCAR/ CAR) NCAR CAR NCAR NCAR CAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR NCAR PREVALENCE Data Element Element (L4G) Release, Number of States Release, Number of States Production Volume Production Volume Production Volume Production Volume Production Volume Production Volume Production Volume Release, Number of States Release, Number of States Release, Number of States Release, Number of States Source TRI TRI CUS/IUR CUS/IUR CUS/IUR CUS/IUR CUS/IUR CUS/IUR CUS/IUR TRI NCFAP NCFAP NCFAP F-8 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. 
Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 51796 55630 60355 62533 67561 71363 75218 75569 76879 80159 106990 107211 109864 121448 Set 4 Summary Common Name Cobalt compounds Urethane Nitroglycerin Acetamide Aniline Methanol 1-Butanol Ethylene oxide Propylene oxide Triphenyltin hydroxide Cumene hydroperoxide 1 ,3-Butadiene Ethylene glycol 2-Methoxyethanol Triethylamine Model Decision L L L?-L L L?-L L?-L L?-L L L L L L L L L # Evaluators 8 8 9 9 10 10 11 9 9 10 10 11 10 9 7 % agreement 75 63 78 67 70 60 55 78 89 80 60 73 80 78 43 Direction - disagree +/-(+ toward L) -2 -2 +/-0 -2 +2 +1 -1 -2 -1 -2 -3 -2 -2 -3 -4 Value (L=4; NL=1) 3.81 3.79 3.50 3.56 3.61 3.45 3.33 3.78 3.78 3.90 3.61 3.80 3.70 3.65 3.36 Category L L L L L L? L? L L L L L L L L? Overall Confidence H% 22% 22% 9% 20% 20% 17% 17% 36% 36% 22% 18% 27% 27% 30% 0% M% 22% 0% 18% 20% 30% 50% 33% 18% 18% 44% 9% 9% 36% 10% 29% L% 56% 78% 73% 60% 50% 33% 50% 45% 45% 33% 73% 64% 36% 60% 71% Value H=3; L=1) 1.667 1.444 1.364 1.600 1.700 1.833 1.667 1.909 1.909 1.889 1.455 1.636 1.909 1.700 1.286 POTENCY Data Element Element (L4G) Lowest Observed Adverse Effect Level (LOAEL) No Observed Effect Level (NOEL) Lowest Observed Adverse Effect Level (LOAEL) Slope Factor (Oral) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Slope Factor (Oral) Slope Factor (Oral) Slope Factor (Oral) Lowest Observed Adverse Effect Level (LOAEL) Slope Factor (Oral) Reference Dose (RfD) Reference Dose (RfD) Lowest Observed Adverse Effect Level (LOAEL) Source Journal Journal RTECS OEHHA RAISHE IRIS IRIS OEHHA OPP OPP RTECS OEHHA IRIS RAISHE RTECS Type (NCAR / CAR) NCAR NCAR NCAR CAR NCAR NCAR NCAR CAR CAR CAR NCAR CAR NCAR NCAR NCAR PREVALENCE Data Element Element (L4G) Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States 
Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Release, Number of States Source TRI TRI TRI TRI TRI TRI TRI TRI TRI NCFAP TRI TRI TRI TRI TRI F-9 ------- EPA-OGWDW Final CCL 3 Chemicals: Classification of the PCCL to CCL EPA815-R-09-008 August 2009 Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results CASRN 123911 133062 137304 319846 330541 330552 333415 541731 542756 630206 Set 4 Summary Common Name 1 ,4-Dioxane Captan Ziram . alpha. - Hexachlorocyclohexane Diuron Linuron Diazinon m-Dichlorobenzene Telone 1,1,1 ,2-Tetrachloroethane Model Decision L L L L? NL? NL NL NL? L? L? # Evaluators 9 10 8 12 13 12 11 13 13 13 % agreement 100 70 88 67 77 92 91 77 62 77 Direction - disagree +/-(+ toward L) +/-0 -2 -1 +1 +3 +/-0 +1 +1 +3 -1 Value (L=4; NL=1) 4.00 3.72 3.75 3.00 2.19 1.00 1.09 2.00 3.23 2.88 Category L L L L? NL? NL NL NL? L? L? Overall Confidence H% 30% 33% 13% 18% 18% 50% 45% 45% 25% 27% M% 30% 22% 25% 64% 64% 40% 36% 45% 50% 64% L% 40% 44% 63% 18% 18% 10% 18% 9% 25% 9% Value H=3; L=1) 1.900 1.889 1.500 2.000 2.000 2.400 2.273 2.364 2.000 2.182 POTENCY Data Element Element (L4G) Lifetime Cancer Risk(10A-4) Slope Factor (Oral) Slope Factor (Oral) Lifetime Cancer Risk(10A-4) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Reference Dose (RfD) Slope Factor (Oral) Lifetime Cancer Risk(10A-4) Source EPAHA OPP OPP IRIS OPP OPP OPP EPAHA OPP EPAHA Type (NCAR / CAR) CAR CAR CAR CAR NCAR NCAR NCAR NCAR CAR CAR PREVALENCE Data Element Element (L4G) Release, Number of States Release, Number of States Release, Number of States Percentage of Sites (Detects), All Water, Ambient Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), All Water, Finished Percentage of PWSs (Detects), 
All Water, Finished Source TRI NCFAP NCFAP NAWQA UCMR UCMR UCMR NCODR12 NCODR12 NCODR12

Appendix F. Chemicals Reviewed by the EPA Evaluation Team: Summary of Results

Set 4 Summary

| CASRN | Common Name | Model Decision | # Evaluators | % Agreement | Direction - disagree +/- (+ toward L) | Value (L=4; NL=1) | Category |
|---|---|---|---|---|---|---|---|
| 759944 | S-Ethyl dipropylthiocarbamate | NL | 12 | 75 | +3 | 1.38 | NL |
| 944229 | Fonofos | NL | 12 | 83 | +/-0 | 1.00 | NL |
| 1313275 | Molybdenum oxide (MoO3) | L | 11 | 45 | -3 | 3.38 | L? |
| 1582098 | Trifluralin | NL - NL? | 11 | 82 | +2 | 1.59 | NL? |
| 1610180 | Prometon | NL | 12 | 100 | +/-0 | 1.00 | NL |
| 1634044 | Methyl tert-butyl ether | L? | 12 | 58 | +5 | 3.42 | L? |
| 1861321 | Chlorthal-dimethyl (Dacthal) | NL? | 12 | 67 | +4 | 2.25 | NL? |
| 1897456 | Chlorothalonil | NL? | 12 | 75 | +3 | 2.17 | NL? |
| 2164172 | Fluometuron | NL | 11 | 91 | +/-0 | 1.00 | NL |
| 26471625 | Toluene diisocyanate | L | 10 | 80 | -1 | 3.89 | L |

Overall Confidence

| CASRN | Common Name | H% | M% | L% | Value (H=3; L=1) |
|---|---|---|---|---|---|
| 759944 | S-Ethyl dipropylthiocarbamate | 55% | 45% | 0% | 2.545 |
| 944229 | Fonofos | 60% | 40% | 0% | 2.600 |
| 1313275 | Molybdenum oxide (MoO3) | 0% | 25% | 75% | 1.250 |
| 1582098 | Trifluralin | 56% | 44% | 0% | 2.556 |
| 1610180 | Prometon | 40% | 40% | 20% | 2.200 |
| 1634044 | Methyl tert-butyl ether | 10% | 70% | 20% | 1.900 |
| 1861321 | Chlorthal-dimethyl (Dacthal) | 33% | 56% | 11% | 2.222 |
| 1897456 | Chlorothalonil | 20% | 60% | 20% | 2.000 |
| 2164172 | Fluometuron | 20% | 70% | 10% | 2.100 |
| 26471625 | Toluene diisocyanate | 25% | 25% | 50% | 1.750 |

POTENCY Data Element

| CASRN | Common Name | Element (L4G) | Source | Type (NCAR / CAR) |
|---|---|---|---|---|
| 759944 | S-Ethyl dipropylthiocarbamate | Reference Dose (RfD) | IRIS | NCAR |
| 944229 | Fonofos | Reference Dose (RfD) | IRIS | NCAR |
| 1313275 | Molybdenum oxide (MoO3) | RfD (UL) | DRI | NCAR |
| 1582098 | Trifluralin | Reference Dose (RfD) | OPP | NCAR |
| 1610180 | Prometon | Reference Dose (RfD) | IRIS | NCAR |
| 1634044 | Methyl tert-butyl ether | Slope Factor (Oral) | OEHHA | CAR |
| 1861321 | Chlorthal-dimethyl (Dacthal) | Reference Dose (RfD) | OPP | NCAR |
| 1897456 | Chlorothalonil | Reference Dose (RfD) | OPP | NCAR |
| 2164172 | Fluometuron | Reference Dose (RfD) | IRIS | NCAR |
| 26471625 | Toluene diisocyanate | Slope Factor (Oral) | OEHHA | CAR |

PREVALENCE Data Element

| CASRN | Common Name | Element (L4G) | Source |
|---|---|---|---|
| 759944 | S-Ethyl dipropylthiocarbamate | Percentage of PWSs (Detects), All Water, Finished | UCMR |
| 944229 | Fonofos | Percentage of PWSs (Detects), All Water, Finished | UCMR |
| 1313275 | Molybdenum oxide (MoO3) | Release, Number of States | TRI |
| 1582098 | Trifluralin | Percentage of Sites (Detects), All Water, Ambient | NAWQA |
| 1610180 | Prometon | Percentage of PWSs (Detects), All Water, Finished | UCMR |
| 1634044 | Methyl tert-butyl ether | Percentage of PWSs (Detects), All Water, Finished | UCMR |
| 1861321 | Chlorthal-dimethyl (Dacthal) | Percentage of Sites (Detects), All Water, Ambient | NAWQA |
| 1897456 | Chlorothalonil | Percentage of Sites (Detects), All Water, Ambient | NAWQA |
| 2164172 | Fluometuron | Percentage of Sites (Detects), All Water, Ambient | NAWQA |
| 26471625 | Toluene diisocyanate | Release, Number of States | TRI |

Appendix G. PCCL Contaminants with Incomplete Data for Scoring or that had Parent Compounds Scored in Developing the Draft CCL 3

| CASRN | Substance Name | Common Name |
|---|---|---|
| 930552 | Pyrrolidine, 1-nitroso- | N-nitrosopyrrolidine (NPYR) |
| 10595956 | Ethanamine, N-methyl-N-nitroso- | N-Nitrosomethylethylamine (NMEA) |
| 683181 | Stannane, dibutyldichloro- | Dibutyltin dichloride |
| 753731 | Stannane, dichlorodimethyl- | Dimethyltin dichloride |
| 818086 | Stannane, dibutyloxo- | Dibutyltin oxide |
| 5160021 | Benzenesulfonic acid, 5-chloro-2-[(2-hydroxy-1-naphthalenyl)azo]-4-methyl-, barium salt (2:1) | C.I. Pigment Red 53, barium salt (2:1) |
| 7447418 | Lithium chloride (LiCl) | Lithium chloride |
| 7782992 | Sulfurous acid | Sulfurous acid |
| 7783064 | Hydrogen sulfide (H2S) | Hydrogen sulfide |
| 7783188 | Thiosulfuric acid (H2S2O3), diammonium salt | Ammonium thiosulfate |
| 12108133 | Manganese, tricarbonyl[(1,2,3,4,5-.eta.)-1-methyl-2,4-cyclopentadien-1-yl]- | Methylcyclopentadienyl manganese tricarbonyl |
| 14808607 | Quartz (SiO2) | Quartz (SiO2) |
| 75003 | Ethane, chloro- | Chloroethane |
| 75025 | Ethene, fluoro- | Vinyl fluoride |
| 75887 | Ethane, 2-chloro-1,1,1-trifluoro- | HCFC-133a |
| 102716 | Ethanol, 2,2',2"-nitrilotris- | Triethanolamine |
| 106876 | 7-Oxabicyclo[4.1.0]heptane, 3-oxiranyl- | 1,2-Epoxy-4-(epoxyethyl)cyclohexane |
| 115117 | 1-Propene, 2-methyl- | Isobutene |
| 116143 | Ethene, tetrafluoro- | Tetrafluoroethene |
| 127060 | 2-Propanone, oxime | 2-Propanone oxime |
| 7440291 | Thorium | Thorium-232 |
| 10028156 | Ozone | Ozone |
| 57018527 | 2-Propanol, 1-(1,1-dimethylethoxy)- | Propylene glycol mono-t-butyl ether |
| 1007289 | 1,3,5-Triazine-2,4-diamine, 6-chloro-N-ethyl- | Desisopropylatrazine |
| 1313275 | Molybdenum oxide (MoO3) | Molybdenum trioxide |
| 6190654 | 1,3,5-Triazine-2,4-diamine, 6-chloro-N-(1-methylethyl)- | Desethylatrazine |
| 7681529 | Hypochlorous acid, sodium salt | Sodium hypochlorite |
| 79277671 | 2-Thiophenecarboxylic acid, 3-[[[[(4-methoxy-6-methyl-1,3,5-triazin-2-yl)amino]carbonyl]amino]sulfonyl]- | Thifensulfuron |
| 76578126 | Quizalofop | Quizalofop |
| 56070156 | Terbufos-O-analogue sulfone | Terbufos-O-analogue sulfone |
|  | Diazinon oxygen analog | Diazinon oxygen analog |
|  | DCPA mono/di-acid degradate | Dacthal mono/di-acid degradate |