UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460

May 2010

OFFICE OF THE ADMINISTRATOR
SCIENCE ADVISORY BOARD

SUBJECT: Transmittal of Science Advisory Board Report

FROM: Vanessa T. Vu, Director, Science Advisory Board Staff Office (1400F)

TO: Karen Sheffer, EPA Headquarters Library Repository (3404T)

This is to advise you that the Science Advisory Board, Ecological Processes and Effects Committee (FY 2009) Augmented for Review of Nutrient Criteria Guidance, issued a report numbered EPA-SAB-10-006, SAB Review of Empirical Approaches for Nutrient Criteria Derivation, dated April 27, 2010. Two copies of the report are attached and a third copy has been sent electronically to the attention of Ms. Jeannie Turner at turner.jeannie@epa.gov. The report is available in electronic format on the Science Advisory Board's Web site at http://www.epa.gov/sab.

If you have any questions regarding this report, please contact the Designated Federal Officer, Dr. Thomas Armitage, directly at (202) 343-9995.

Attachments (2)

UNITED STATES ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460

OFFICE OF THE ADMINISTRATOR
SCIENCE ADVISORY BOARD

April 27, 2010

EPA-SAB-10-006

The Honorable Lisa P. Jackson
Administrator
U.S. Environmental Protection Agency
1200 Pennsylvania Avenue, N.W.
Washington, D.C. 20460

Subject: SAB Review of Empirical Approaches for Nutrient Criteria Derivation

Dear Administrator Jackson:

EPA's Office of Water (OW) requested that the Science Advisory Board (SAB) review the Agency's draft guidance document titled Empirical Approaches for Nutrient Criteria Derivation ("Guidance"). The Guidance is one of a series of technical documents developed by OW to describe approaches and methods for developing numeric criteria for nutrients.
The Guidance specifically focuses on empirical approaches for determining stressor-response relationships to derive numeric nutrient criteria. In response to the Agency's advisory request, the SAB Ecological Processes and Effects Committee, augmented with additional experts, met on September 9-11, 2009 to conduct a peer review of the Guidance. OW requested that the SAB: 1) comment on the technical merit of the methods and approaches described in the Guidance; 2) suggest approaches that might be considered to improve the Guidance; and 3) offer suggestions to improve the utility of the Guidance for state and tribal water quality scientists and resource managers. The enclosed advisory report provides the advice and recommendations of the Committee.

The SAB commends EPA for addressing nutrient issues. Nutrients (nitrogen and phosphorus) are a major cause of impairment in the quality of the Nation's waters, and the SAB recognizes the importance of EPA's efforts to develop numeric nutrient criteria. The stressor-response approach is a legitimate, scientifically based method for developing numeric nutrient criteria if the approach is appropriately applied (i.e., not used in isolation but as part of a weight-of-evidence approach). We encourage the Agency to continue this important work.

EPA's draft Guidance provides a primer on a limited set of statistical methods that could be used in deriving nutrient criteria based on stressor-response relationships. However, in its present form, the Guidance does not present a complete or balanced view of using the statistical methods to develop criteria. Restructuring and substantial revision of the Guidance is needed prior to its release to make the document more useful to state and tribal water quality scientists and resource managers.

In general, we find that the scope and intended use of the Guidance should be more clearly identified.
The empirical stressor-response framework described in the Guidance is one possible approach for deriving numeric nutrient criteria, but the uncertainty associated with estimated stressor-response relationships would be problematic if this approach were used as a "stand alone" method because statistical associations do not prove cause and effect. We therefore recommend that the stressor-response approach be used with other available methodologies in the context of a tiered approach where uncertainties in different approaches are recognized, and weight-of-evidence is used to establish the likelihood of causal relationships between nutrients and their effects for criteria derivation. In this regard, we recommend that EPA more clearly articulate how this particular guidance fits within the Agency's decision-making and regulatory processes and, specifically, how it relates to and complements EPA's other nutrient criteria approaches, technical guidance manuals, and documents. The SAB also recognizes that methods in the Guidance do not address downstream impacts of excess nutrients. The SAB has provided many recommendations to improve the Guidance and strongly recommends that they be incorporated into the final document. These recommendations focus on revising the document to address: cause and effect; the utility and limitations of the statistical methods and approaches in the document; the supporting analyses and data needed to correctly identify predictive relationships; the need for more guidance and examples to describe when and how to use various methods and approaches; linkages among designated uses and stressors; and the need for a more specific and descriptive framework outlining the steps in the criteria development process. Finally, the SAB strongly recommends that EPA invest in providing the technical support and training needed to make the approaches and methods in the final Guidance more useful to state and tribal water resource managers. 
Thank you for the opportunity to review this important guidance document. The SAB looks forward to receiving the Agency's response to this advisory report and stands ready to provide additional advice as EPA continues to develop nutrient criteria guidance.

Sincerely,

/Signed/
Dr. Deborah L. Swackhamer, Chair
Science Advisory Board

/Signed/
Dr. Judith L. Meyer, Chair
Ecological Processes and Effects Committee

NOTICE

This report has been written as part of the activities of the EPA Science Advisory Board, a public advisory group providing extramural scientific information and advice to the Administrator and other officials of the Environmental Protection Agency. The Board is structured to provide balanced, expert assessment of scientific matters related to the problems facing the Agency. This report has not been reviewed for approval by the Agency and, hence, the contents of this report do not necessarily represent the views and policies of the Environmental Protection Agency, nor of other agencies in the Executive Branch of the Federal government, nor does mention of trade names or commercial products constitute a recommendation for use. Reports of the EPA Science Advisory Board are posted on the EPA website at http://www.epa.gov/sab.

U.S. Environmental Protection Agency
Science Advisory Board
Ecological Processes and Effects Committee (FY 2009)
Augmented for Review of Nutrient Criteria Guidance

CHAIR

Dr. Judith L. Meyer, Distinguished Research Professor Emeritus, University of Georgia, Lopez Island, WA

MEMBERS

Dr. Richelle Allen-King, Professor and Chair, Department of Geology, University at Buffalo, Buffalo, NY
Dr. Ernest F. Benfield, Professor of Ecology, Department of Biological Sciences, Virginia Tech, Blacksburg, VA
Dr. G. Allen Burton, Professor and Director, Cooperative Institute for Limnology and Ecosystems Research, School of Natural Resources and Environment, University of Michigan, Ann Arbor, MI
Dr. Peter M.
Chapman, Principal and Senior Environmental Scientist, Environmental Sciences Group, Golder Associates Ltd, Burnaby, BC, Canada
Dr. Loveday Conquest, Professor, School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA
Dr. Wayne Landis, Professor and Director, Department of Environmental Toxicology, Institute of Environmental Toxicology, Huxley College of the Environment, Western Washington University, Bellingham, WA
Dr. James Oris, Professor, Department of Zoology, Miami University, Oxford, OH
Dr. Amanda Rodewald, Associate Professor of Wildlife Ecology, School of Environment and Natural Resources, The Ohio State University, Columbus, OH
Dr. James Sanders, Director and Professor, Skidaway Institute of Oceanography, Savannah, GA
Mr. Timothy Thompson, Senior Environmental Scientist, Science and Engineering for the Environment, LLC, Seattle, WA

CONSULTANTS

Dr. Victor Bierman, Senior Scientist, LimnoTech, Oak Ridge, NC
Dr. Elizabeth Boyer, Associate Professor, School of Forest Resources and Assistant Director, Pennsylvania State Institutes of Energy & the Environment, and Director, Pennsylvania Water Resources Research Center, Pennsylvania State University, University Park, PA
Dr. Mark David, Professor, Natural Resources & Environmental Sciences, University of Illinois, Urbana, IL
Dr. Douglas McLaughlin, Principal Research Scientist, National Council for Air and Stream Improvement, Inc., Western Michigan University, Kalamazoo, MI
Dr. Patrick J. Mulholland, Distinguished Research Staff Member, Carbon & Nutrient Biogeochemistry Group, Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN
Dr. Andrew N. Sharpley, Professor, Department of Crop, Soil and Environmental Sciences, Division of Agriculture, University of Arkansas, Fayetteville, AR

SCIENCE ADVISORY BOARD STAFF

Dr. Thomas Armitage, Designated Federal Officer, U.S. Environmental Protection Agency, Washington, DC

U.S.
Environmental Protection Agency
Science Advisory Board

CHAIR

Dr. Deborah L. Swackhamer, Professor and Charles M. Denny, Jr., Chair in Science, Technology and Public Policy and Co-Director of the Water Resources Center, Hubert H. Humphrey Institute of Public Affairs, University of Minnesota, St. Paul, MN

SAB MEMBERS

Dr. David T. Allen, Professor, Department of Chemical Engineering, University of Texas, Austin, TX
Dr. Claudia Benitez-Nelson, Associate Professor, Department of Earth and Ocean Sciences and Marine Science Program, University of South Carolina, Columbia, SC
Dr. Timothy Buckley, Associate Professor and Chair, Division of Environmental Health Sciences, College of Public Health, The Ohio State University, Columbus, OH
Dr. Thomas Burke, Professor, Department of Health Policy and Management, Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
Dr. Deborah Cory-Slechta, Professor, Department of Environmental Medicine, School of Medicine and Dentistry, University of Rochester, Rochester, NY
Dr. Terry Daniel, Professor of Psychology and Natural Resources, Department of Psychology, School of Natural Resources, University of Arizona, Tucson, AZ
Dr. George Daston, Victor Mills Society Research Fellow, Product Safety and Regulatory Affairs, Procter & Gamble, Cincinnati, OH
Dr. Costel Denson, Managing Member, Costech Technologies, LLC, Newark, DE
Dr. Otto C. Doering III, Professor, Department of Agricultural Economics, Purdue University, W. Lafayette, IN
Dr. David A. Dzombak, Walter J. Blenko Sr. Professor, Department of Civil and Environmental Engineering, College of Engineering, Carnegie Mellon University, Pittsburgh, PA
Dr. T. Taylor Eighmy, Vice President for Research, Office of the Vice President for Research, Texas Tech University, Lubbock, TX
Dr.
Elaine Faustman, Professor, Department of Environmental and Occupational Health Sciences, School of Public Health and Community Medicine, University of Washington, Seattle, WA
Dr. John P. Giesy, Professor and Canada Research Chair, Veterinary Biomedical Sciences and Toxicology Centre, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
Dr. Jeffrey Griffiths, Associate Professor, Department of Public Health and Community Medicine, School of Medicine, Tufts University, Boston, MA
Dr. James K. Hammitt, Professor, Center for Risk Analysis, Harvard University, Boston, MA
Dr. Rogene Henderson, Senior Scientist Emeritus, Lovelace Respiratory Research Institute, Albuquerque, NM
Dr. Bernd Kahn, Professor Emeritus and Associate Director, Environmental Radiation Center, School of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA
Dr. Agnes Kane, Professor and Chair, Department of Pathology and Laboratory Medicine, Brown University, Providence, RI
Dr. Nancy K. Kim, Senior Executive, New York State Department of Health, Troy, NY
Dr. Catherine Kling, Professor, Department of Economics, Iowa State University, Ames, IA
Dr. Kai Lee, Program Officer, Conservation and Science Program, David & Lucile Packard Foundation, Los Altos, CA
Dr. Cecil Lue-Hing, President, Cecil Lue-Hing & Assoc. Inc., Burr Ridge, IL
Dr. Floyd Malveaux, Executive Director, Merck Childhood Asthma Network, Inc., Washington, DC
Dr. Lee D. McMullen, Water Resources Practice Leader, Snyder & Associates, Inc., Ankeny, IA
Dr. Judith L. Meyer, Distinguished Research Professor Emeritus, Odum School of Ecology, University of Georgia, Lopez Island, WA
Dr. Jana Milford, Professor, Department of Mechanical Engineering, University of Colorado, Boulder, CO
Dr. Christine Moe, Eugene J. Gangarosa Professor, Hubert Department of Global Health, Rollins School of Public Health, Emory University, Atlanta, GA
Dr. Eileen Murphy, Manager, Division of Water Supply,
New Jersey Department of Environmental Protection, Trenton, NJ
Dr. Duncan Patten, Research Professor, Department of Land Resources and Environmental Sciences, Montana State University, Bozeman, MT
Dr. Stephen Polasky, Fesler-Lampert Professor of Ecological/Environmental Economics, Department of Applied Economics, University of Minnesota, St. Paul, MN
Dr. Stephen M. Roberts, Professor, Department of Physiological Sciences, Director, Center for Environmental and Human Toxicology, University of Florida, Gainesville, FL
Dr. Amanda Rodewald, Associate Professor, School of Environment and Natural Resources, The Ohio State University, Columbus, OH
Dr. Joan B. Rose, Professor and Homer Nowlin Chair for Water Research, Department of Fisheries and Wildlife, Michigan State University, East Lansing, MI
Dr. Jonathan M. Samet, Professor and Flora L. Thornton Chair, Department of Preventive Medicine, University of Southern California, Los Angeles, CA
Dr. James Sanders, Director and Professor, Skidaway Institute of Oceanography, Savannah, GA
Dr. Jerald Schnoor, Allen S. Henry Chair Professor, Department of Civil and Environmental Engineering, Co-Director, Center for Global and Regional Environmental Research, University of Iowa, Iowa City, IA
Dr. Kathleen Segerson, Professor, Department of Economics, University of Connecticut, Storrs, CT
Dr. V. Kerry Smith, W.P. Carey Professor of Economics, Department of Economics, W.P. Carey School of Business, Arizona State University, Tempe, AZ
Dr. Herman Taylor, Professor, School of Medicine, University of Mississippi Medical Center, Jackson, MS
Dr. Barton H. (Buzz) Thompson, Jr., Robert E. Paradise Professor of Natural Resources Law at the Stanford Law School and Perry L. McCarty Director, Woods Institute for the Environment, Stanford University, Stanford, CA
Dr. Paige Tolbert, Associate Professor, Department of Environmental and Occupational Health, Rollins School of Public Health, Emory University, Atlanta, GA
Dr. Thomas S.
Wallsten, Professor and Chair, Department of Psychology, University of Maryland, College Park, MD
Dr. Robert Watts, Professor of Mechanical Engineering Emeritus, Tulane University, Annapolis, MD

SCIENCE ADVISORY BOARD STAFF

Dr. Angela Nugent, Designated Federal Officer, U.S. Environmental Protection Agency, Washington, DC

TABLE OF CONTENTS

1. EXECUTIVE SUMMARY
2. INTRODUCTION
3. RESPONSE TO CHARGE QUESTIONS
3.1 Charge Question 1. Improving the utility of the Guidance
3.2 Charge Question 2. Selecting stressor and response variables
3.3 Charge Question 3. Approaches to demonstrate the distribution of and relationships among variables
3.4 Charge Question 4. Methods for assessing the strength of the cause-effect relationship
3.5 Charge Question 5. Statistical methods to analyze the data
3.6 Charge Question 6. Evaluating the predictive accuracy of stressor-response relationships
3.7 Charge Question 7. Evaluating candidate stressor-response criteria
4. REFERENCES

1. EXECUTIVE SUMMARY

EPA's Office of Water (OW) requested that the Science Advisory Board (SAB) conduct a peer review of the Agency's draft guidance document, Empirical Approaches for Nutrient Criteria Derivation (the "Guidance"). The Guidance was developed by OW to provide information for state and tribal water resource managers on empirical stressor-response approaches for developing numeric nutrient criteria. In response to the Agency's advisory request, the SAB Ecological Processes and Effects Committee reviewed the Guidance. To augment the expertise on the Committee for this advisory activity, several additional scientists with specific knowledge and expertise in assessing the effects of nutrient enrichment in aquatic systems also participated in the review.

EPA's Office of Water develops ambient water quality criteria that serve as guidance to states and tribes for adoption of water quality standards.
The water quality standards include designated uses, such as aquatic life protection and recreation, and criteria that define levels of water quality variables protective of the designated uses. Because nutrients (nitrogen and phosphorus) are a major cause of impairment in the quality of the Nation's waters, state adoption of numeric nutrient criteria in water quality standards has been a high priority for OW. To assist the states and tribes in developing numeric nutrient criteria, OW published technical guidance manuals for developing nutrient criteria for lakes and reservoirs (U.S. EPA, 2000a), rivers and streams (U.S. EPA, 2000b), estuaries and coastal marine waters (U.S. EPA, 2001), and wetlands (U.S. EPA, 2008). These technical guidance manuals focus primarily on describing a reference condition approach for deriving criteria from distributions of nutrient concentrations and biological responses in minimally disturbed reference waterbodies. Other basic analytical approaches for nutrient criteria derivation recognized in the manuals include mechanistic modeling (i.e., predicting the effects of changes in nutrient concentrations using site-specific parameters and equations that represent ecological processes), which EPA intends to address as the subject of a later document, the stressor-response approach (discussed in the Guidance and considered in this advisory report), and the application and/or modification of established nutrient/algal thresholds. The stressor-response approach involves quantifying the relationship between nutrient concentrations and biological response measures related to the designated use of a waterbody. The Guidance outlines a five-step process for developing numeric nutrient criteria. It describes data analysis methods and approaches that could be used in each of these steps. 
Step one involves the use of exploratory analysis and data visualization tools to select variables that appropriately quantify the stressor (i.e., excess nutrients) and the response. Step two involves the use of conceptual models, existing literature, and other methods to assess the strength of the relationship represented in the stressor-response linkage. Step three involves the use of various statistical methods to analyze data, estimate stressor-response relationships, and identify thresholds that may be used to derive water quality criteria. Step four involves the evaluation of estimated stressor-response relationships (including validation of predictive performance for a stressor-response model, and selecting a model that best represents the data). Step five involves evaluating candidate nutrient criteria by predicting conditions that might be expected after implementing different criteria.

The Guidance contains five sections, each addressing one of the proposed steps in the criteria development process. In its charge to the SAB, EPA requested that the Committee comment on the methods and approaches described in each section of the Guidance, recommend other approaches that might be considered, and offer suggestions to improve the utility of the Guidance for state and tribal water quality scientists and resource managers. In its responses to the charge questions, the Committee provides comments and recommendations to improve the Guidance and assist EPA in its efforts to support the development of numeric nutrient criteria.

General comments on the Guidance

The Committee recognizes the importance of EPA's efforts to support numeric nutrient criteria development and encourages the Agency to continue this important work.
In addition, we recognize the stressor-response approach as a legitimate, scientifically based method for developing numeric nutrient criteria if it is appropriately applied (i.e., not used in isolation but as part of a tiered weight-of-evidence approach using individual lines of evidence as discussed here). The draft Guidance provides a primer on a limited set of statistical methods that could be used in deriving numeric nutrient criteria. However, we find that improvements in the Guidance are needed prior to its release to make the document more useful to state and tribal water quality scientists and resource managers. In general, we find that the scope, limitations, and intended use of the Guidance should be more clearly described. The Guidance addresses only one type of "empirical" approach for derivation of numeric nutrient criteria (i.e., the stressor-response framework). As illustrated in many of the examples in the Guidance, considerable unexplained variation can be encountered when attempting to use the empirical stressor-response approach to develop nutrient criteria. The final Guidance should clearly indicate that such unexplained variation presents significant problems in the use of this approach. Further, the final document should clearly state that statistical associations may not be biologically relevant and do not prove cause and effect. However, when properly developed, biologically relevant statistical associations can be useful arguments as part of a weight-of-evidence approach (further discussed in Section 3.3, recommendation #7 of this advisory report) to criteria derivation. Therefore, the final Guidance should provide more information on the supporting analyses needed to improve the basis for conclusions that specific stressor-response associations can predict nutrient responses with an acceptable degree of uncertainty.
Such predictive relationships can then be used with mechanistic or other approaches in a tiered weight-of-evidence assessment including cause and effect relationships to develop nutrient criteria.

Tiered environmental assessment is iterative. The initial assessment is the simplest (e.g., minimal ecosystem specific data) and most conservative (i.e., risks must be assumed in the absence of system-specific information), and thus, will not always provide sufficient certainty for decision-making. Cause and effect relationships would be inferred but not demonstrated; only a few lines of evidence would be available and the corresponding uncertainty great. At the highest tier, there would be several lines of evidence and factors that would confound the prediction of effects, such as other stressors or the morphology of the waterbody, and these need to be understood and considered. Successive tiers involve more focused (e.g., specific for particular ecosystem types) investigations, based on the results of the previous tier. Data needs are relatively few at the initial tier, but increase at successive tiers. However, through additional testing, measurement, or modeling, uncertainty decreases at successive tiers, and sources of uncertainty become better understood. Policy makers require information to understand the uncertainty associated with regulatory decisions, and to determine how much uncertainty may or may not be acceptable in particular decision-making contexts. Weight-of-evidence typically determines the tier at which uncertainty has been reduced sufficiently for informed management decision-making.

It is important to explicitly describe and consider uncertainty at each step of the criteria development and decision-making process. The level of uncertainty of the conceptual model is likely to be rather low, as it is mostly based on well-established general principles of aquatic systems.
Here the uncertainty is about how well the selected conceptual model fits the specific stressors and ecological systems under consideration. As criteria are developed it is important to address uncertainty associated with more specific factors that influence biological responses to nutrient inputs because uncertainty may cascade down through the analysis, in effect multiplying the uncertainty in later steps of the analysis.

The Committee also recommends that EPA more clearly articulate how the Guidance fits within the Agency's decision-making and regulatory processes and, specifically, how it relates to and complements EPA's other nutrient criteria technical guidance manuals and documents. As further discussed in the response to Charge Question 1, numeric nutrient concentration criteria and load-response models should be considered as two different approaches for accomplishing the goal of controlling excessive nutrient loadings. In addition, the Committee notes that the methods in the Guidance do not address the problem of excess nutrient enrichment downstream from waters for which the criteria are being developed. There is a need for methods to address this problem (one of which could be load-response modeling) and it should be clearly stated that this is beyond the scope of the current guidance document.

Charge Question 1. Improving the utility of the Guidance for state and tribal water quality scientists and resource managers

What suggestions do you have that will improve the utility of the draft document, Empirical Approaches for Nutrient Criteria Derivation, for State water quality scientists and resource managers to derive numeric nutrient criteria based on stressor-response relationships?

The Committee finds that improvements in EPA's Guidance are needed to make the document more useful to state and tribal water quality scientists and resource managers and to ensure against inadvertent misuse.
In this regard, as previously mentioned, the scope, limitations, and intended use of the document should be more clearly identified.

• The Committee recommends that EPA more clearly articulate how the Guidance fits within the Agency's decision-making and regulatory processes and, specifically, how it relates to and complements EPA's other nutrient criteria technical guidance manuals and documents.

• In the Guidance, and the Agency's related technical manuals, EPA should more clearly address the importance of: 1) establishing linkages among designated uses and measured responses, stressors and measures of stressors; and 2) relating measures of stressors directly to deleterious effects on designated uses.

• The Committee finds that the Guidance: 1) should provide a more specific and descriptive framework outlining the steps in the criteria development process (Figure 1 of this advisory report illustrates EPA's proposed framework for developing nutrient criteria and the SAB recommendations for revision of the framework); 2) must be detailed and sophisticated enough to ensure statistical rigor, but additional support must also be provided by EPA to help users meet the technical demands of the methods; 3) should more clearly express the caveats and limitations of the statistical methods and approaches in the document, in particular the fact that statistical correlations do not establish cause and effect; 4) should contain more technical guidance and examples to describe when and how to use various methods and approaches; and 5) should provide additional guidance on data requirements for application of the statistical methods and approaches.
Charge Question 2. Selecting stressor and response variables

Section 1 of the draft guidance document reviews how to select the variables that appropriately quantify the stressor (i.e., excess nutrients) and the response (e.g., chlorophyll a, dissolved oxygen, or a biological index). Please comment on whether the factors to consider described in Section 1 of the draft document are appropriate for selecting response variables that are sensitive to nutrients and related to measures of designated uses.

In Section 1 of the Guidance, EPA discusses factors to consider when selecting the stressor and response variables. In this regard, the Committee finds that EPA should strengthen the Guidance by including additional material.

• The examples in the Guidance rely heavily on taxa richness as a response variable. Some rationale as to how this variable relates to a designated use should accompany these examples. The coupling of response variables to designated uses must be clear and the rationale explained. Further, the Guidance could be strengthened considerably by presentation of examples showing strong nutrient-response relationships with response variables that are clearly linked to designated uses.

• The Committee notes that co-limitation by both nitrogen and phosphorus may be common in many systems and regions. Therefore, the use of multivariate or data stratification approaches may be needed to identify nutrient-response relationships.
[Figure 1. EPA's Framework for Developing Nutrient Criteria Based on Stressor-Response Relationships and SAB Recommendations for Revision]

EPA's Framework as Described in the Draft Guidance Document:
Step 1. Selecting and Evaluating Data
Step 2. Assessing the Strength of the Cause-Effect Relationship
Step 3. Analyzing Data
Step 4. Evaluating Estimated Stressor-Response Relationships
Step 5. Evaluating Candidate Stressor-Response Criteria

Framework Recommended by the SAB (at each step in the process, the uncertainty should be explicitly described):
Step 1. Problem Formulation Development
Step 2. Conceptual Model Development (Consideration of Empirical Approach in Conjunction With Other Lines of Evidence)
Step 3. Selection and Preliminary Evaluation of Data
Step 4. Evaluation of Stressor-Response Approach (Is Stressor-Response Appropriate? If yes, proceed to Step 5; if no, consider other methods)
Step 5. Stressor-Response Model Selection
Step 6. Evaluate Candidate Stressor-Response Criteria and Consider Other Methods if Necessary
Step 7. Criteria Development: Use Weight-of-Evidence Approach to Compare Output to Step 1 Goals

* This includes consideration of factors discussed in this advisory report such as cause and effect, relevance to known mechanisms and existing conditions, and ability to predict the probability of meeting designated uses.

• The Guidance should provide more information on the data needed to characterize other stressor and constraint variables (e.g., high dissolved organic carbon versus low dissolved organic carbon lakes, shaded versus unshaded streams) which are critical for applying multivariate techniques or for stratification/classification of univariate nutrient-response relationships.

• The Guidance focuses on total nitrogen and total phosphorus as the primary nutrient stressor variables.
In systems where inorganic nutrients are the dominant form, additional consideration should be given to inorganic nitrogen and phosphorus.

• The Guidance focuses on nutrient-response pathways driven by autotrophic processes (nutrients directly control algal growth and excessive amounts of algae impair systems through indirect effects on dissolved oxygen, food web changes, and aesthetics). The Committee notes that nutrients can also directly control heterotrophic microbes (bacteria and fungi) and indirectly control decomposition of organic matter. This should be more fully discussed in the Guidance.

• The Guidance provides inadequate discussion of the temporal/spatial aspects of data needed to develop relevant stressor-response relationships. The Guidance should discuss the conditions under which mean/median or maximum/minimum values of stressor and response variables might be more appropriate than discrete instantaneous measurements for developing stressor-response relationships. The use of time series data to describe specific systems should also be addressed. Although such guidance may be provided in various system-specific technical manuals (e.g., U.S. EPA, 2000a, b), a summary synthesis of the major points in these earlier documents should be included in the Guidance.

Charge Question 3. Approaches to demonstrate the distribution of and relationships among variables

Section 1 outlines methods to visualize available data. Please comment on the effectiveness of the following approaches described in the document (listed below) to demonstrate the distribution of and relationships among variables:
a) Basic data visualization techniques
b) Maps
c) Conditional probability
d) Classifications

Section 1 of the Guidance discusses exploratory data analysis, and presents several methods for demonstrating the distribution of and relationships among variables. In Subsections 1.2 - 1.6 several basic plotting techniques are presented.
This is followed by a description of conditional probability analysis (a statistical approach for summarizing how changes in nutrient concentrations are associated with the probability of waterbodies attaining their designated uses). The Committee finds that the discussion of exploratory data analysis would be more effective if Section 1 of the Guidance were reorganized and expanded.
• As further discussed in the response to Charge Question 3, Subsections 1.2 - 1.6 of the Guidance should be reframed as a separate major section on exploratory data analysis, which should follow another separate major section on problem formulation. The material in Subsection 1.1 (selection of stressor-response variables) should be moved to a later section of the document.
• The Guidance should stress that exploratory data analysis, including data visualization, should be conducted prior to inferential statistical analyses of potential stressors and responses. The objectives of exploratory data analysis should be to better understand the system of interest and to maximize accuracy and minimize variability of subsequently derived stressor-response relationships.
• Additional methods for exploratory data analysis should be discussed in the Guidance. These additional methods should include: use of summary statistics; time series plots at fixed points in space; longitudinal plots at fixed points in time; bubble plots; Pearson and other correlation analyses; and maps that show temporal (monthly, seasonal, inter-annual) as well as spatial patterns.
• Clear guidance is needed on when and how to use the statistical methods and visualization techniques presented in the document. The strengths and limitations of the methods should also be clearly identified. It would be useful to show several case examples that range from state-wide to local and from data-rich to data-poor, and that exemplify different types of aquatic ecosystems (e.g., headwater streams, large rivers, lakes and estuaries).
Examples should note the strengths, limitations, assumptions and uncertainties that must be considered when using the methods to explore and visualize the data. These examples should demonstrate how nutrients can be identified as significant stressors in the presence of multiple stressors and habitat factors that may affect the resident communities.
• Subsection 1.6 of the Guidance (examination of stressor-response distributions across different classes, e.g., ecoregions) should be expanded. The subsection should discuss additional data analyses and examples for different spatial classifications (e.g., ecoregions, states, watersheds, systems of interest), different waterbody types (e.g., streams, rivers, lakes, estuaries) and other important physical and chemical characteristics of systems that could affect the applicability of the nutrient criteria.
Charge Question 4. Methods for assessing the strength of the cause-effect relationship
Section 2 of the draft guidance document describes methods for assessing the strength of the cause-effect relationship represented in the stressor-response linkage. Please comment on whether the draft guidance document adequately describes how conceptual models, existing literature, and empirical models can be used to assess how changes in nutrient concentration are likely to cause changes in the chosen response variable.
Section 2 of the Guidance provides a summary of how the strength of candidate stressor-response pairings from step 1 can be assessed. The Committee recommends a number of improvements in this section.
• It is appropriate to use conceptual models and existing literature as the scientific basis to assess how changes in nutrient concentrations might affect response variables. However, additional discussion of conceptual model selection, with specific examples, would be helpful.
As illustrated in Figure 1 of this advisory report and further discussed in the response to Charge Question 7, the Committee recommends that development of the conceptual model occur in or immediately after the problem formulation step, early in the process of criteria development.
• Structural Equation Modeling (SEM) is discussed in the Guidance as a method for exploring nutrient-ecosystem response. The Committee finds that use of SEM should be more fully explained. Clear examples of its use should be provided.
• The Guidance discusses the use of Propensity Score Analysis (PSA) to estimate stressor-response relationships. PSA seems to be useful for sorting out groups that share covariates but may have unique nutrient characteristics. Such sorting could lead to a clearer understanding of how nutrients function amid multiple covariates. The example of PSA in the Guidance appendix is helpful, but further explanation of how to interpret the results of the analysis is needed. An analysis such as PSA should be discussed in a later section of the Guidance because it is a tool for analyzing data (Section 3 of the Guidance) rather than supporting potential relationships.
• It is not clear why EPA did not include information obtained from mechanistic models in Section 2 of the Guidance. Because mechanistic models can integrate information on the interactions of major ecosystem processes to derive quantitative estimates of effects, they should be discussed as a means to interpret the stressor-response relationship.
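The idea behind the PSA comment above (grouping sites that share covariates before estimating the nutrient effect) can be illustrated with a small sketch. The Python example below is a simplified, hypothetical illustration and not the procedure in the Guidance appendix: it uses synthetic data, a single covariate, and a propensity score defined simply as the nutrient level predicted from that covariate by linear regression. All variable names and numeric values are assumptions made for illustration.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    return np.polyfit(x, y, 1)[0]

def stratified_slope(nutrient, response, covariate, n_strata=10):
    """Average within-stratum slope after stratifying on a propensity score.

    Assumption: the score is the nutrient level predicted from the
    covariate by a straight-line regression (one covariate only).
    """
    coef = np.polyfit(covariate, nutrient, 1)
    score = np.polyval(coef, covariate)
    # Quantile edges give strata with roughly equal numbers of sites.
    edges = np.quantile(score, np.linspace(0.0, 1.0, n_strata + 1))
    labels = np.searchsorted(edges, score, side="right") - 1
    labels = np.clip(labels, 0, n_strata - 1)
    slopes = [
        ols_slope(nutrient[labels == k], response[labels == k])
        for k in range(n_strata)
    ]
    return float(np.mean(slopes))

# Synthetic sites: the covariate drives both the nutrient and the response,
# so it confounds the naive nutrient-response slope.
rng = np.random.default_rng(1)
n_sites = 2000
true_effect = 0.5
covariate = rng.normal(size=n_sites)
nutrient = covariate + rng.normal(size=n_sites)
response = (true_effect * nutrient + 2.0 * covariate
            + rng.normal(0.0, 0.5, size=n_sites))

naive = ols_slope(nutrient, response)
adjusted = stratified_slope(nutrient, response, covariate)
print(f"true effect:      {true_effect:.2f}")
print(f"naive slope:      {naive:.2f}")
print(f"stratified slope: {adjusted:.2f}")
```

In this construction the naive slope overstates the nutrient effect because the covariate influences both variables, while the stratified estimate lies closer to the simulated effect. Some residual confounding remains within strata, particularly the widest ones, which is one reason interpretation of such results needs the further explanation the Committee requests.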
Charge Question 5. Statistical methods to analyze the data
Section 3 of the draft guidance document outlines statistical methods to analyze the data to estimate stressor-response relationships. Please comment on the appropriateness of the methods outlined in the document (listed below) for describing stressor-response relationships associated with nutrient pollution. What approaches would you recommend that could effectively address indirect pathways of adverse effects? What recommendations do you have to address the effects of confounding variables and uncertainty in the estimated relationships? a) Simple linear regression; b) Quantile regression; c) Logistic regression; d) Multiple linear regression; e) Non-parametric changepoint analysis; f) Discontinuous regression models.
Section 3 of the Guidance describes a number of statistical methods for analyzing data to estimate stressor-response relationships. The Committee provides comments addressing the appropriateness of statistical methods for estimating stressor-response relationships.
• Methods described in the Guidance are generally appropriate for estimating stressor-response relationships in support of conceptual models. However, as further discussed in the response to Charge Question 5, more careful consideration of confounding variables is necessary to maximize the potential for stressor-response relationships to reflect cause and effect between nutrient concentrations and ecological responses. The Guidance should be revised to state this more definitively and better assist its users in achieving this objective.
• Those charged with using stressor-response methodology may require additional technical support to use the methods in the Guidance.
• EPA should provide guidance on how the degree of the relationship (indicated by R², residuals analysis, and other evidence) relates to establishing predictive stressor-response relationships for numeric nutrient criteria development.
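As an illustration of the quantities named in the last bullet above, the following Python sketch fits a simple linear regression and reports R² together with a basic residual summary. It is a minimal example on synthetic data; the variable names (log total phosphorus, log chlorophyll a), the log-log form, and every numeric value are assumptions chosen for illustration and are not taken from the Guidance.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns the slope, intercept, R-squared, and residuals.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree 1 = straight line
    residuals = y - (intercept + slope * x)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals

# Synthetic (hypothetical) paired observations on a log10 scale.
rng = np.random.default_rng(0)
log_tp = rng.uniform(0.5, 2.5, size=200)                      # stressor
log_chl = -0.4 + 0.9 * log_tp + rng.normal(0.0, 0.3, size=200)  # response

slope, intercept, r_squared, residuals = fit_simple_regression(log_tp, log_chl)

# Crude residual check: correlation between residual size and the stressor.
# A value far from zero would suggest non-constant variance.
spread_trend = np.corrcoef(log_tp, np.abs(residuals))[0, 1]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"R-squared = {r_squared:.2f}")
print(f"residual standard deviation = {residuals.std(ddof=2):.2f}")
print(f"|residual| vs stressor correlation = {spread_trend:.2f}")
```

A summary of this kind describes only how well a line fits the data at hand. It does not by itself show that the relationship is causal or that it will predict conditions elsewhere, which is why the Committee asks for guidance on how such measures should inform criteria development.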
The Committee also notes that uncertainty must be identified and quantified for all methods and at all stages of the process.
Charge Question 6. Evaluating the predictive accuracy of stressor-response relationships
Section 4 of the draft guidance document describes how to evaluate the predictive accuracy of estimated stressor-response relationships. Please comment on the appropriateness of approaches in Section 4 of the guidance document and factors to consider in evaluating and comparing different estimates of the stressor-response relationships and selecting those most appropriate for criteria derivation.
The Committee provides comments on the appropriateness of approaches discussed in Section 4 of the Guidance and the factors to consider in evaluating and comparing different estimates of stressor-response relationships in order to select those most appropriate for criteria development. Overall, the Committee finds that this section of the Guidance lacks the detail provided in other sections and needs improvement.
• A clear framework for statistical model selection is needed. This framework should include: 1) an assessment of whether analyses indicate that the stressor-response approach is appropriate; 2) selection criteria to evaluate the capability of models to consider cause/effect and direct/indirect relationships between stressors and responses; 3) consideration of model relevance to known mechanisms and existing conditions; 4) establishment of biological relevance; and 5) ability to predict probability of meeting designated use categories.
• The concept of "validation" as presented in Subsection 4.1 of the Guidance is inconsistent with other EPA guidance (U.S. EPA, 2009a) on development, evaluation, and application of models. Model corroboration (sensu "validation") and uncertainty analysis should both be part of model evaluation and selection. These activities should be directed and informed by pre-established data quality objectives.
Additional guidance is also needed on: data set specification and stratification; a suite of validation techniques (e.g., random or non-random held-out data, independent data, resampling/Monte Carlo); and appropriate quantitative levels of goodness-of-fit and uncertainty measures.
• With regard to validation, the Committee recommends that nutrient criteria should result from a tiered weight-of-evidence approach based on the application of multiple empirical approaches and consideration of multiple response variables as appropriate. The nutrient criteria values that may be determined, after considering validation and uncertainty, may vary significantly from technique to technique or from response variable to response variable. EPA should provide greater guidance on how to assign numeric criteria when a range of responses among analyses/models results in different values.
Charge Question 7. Evaluating candidate stressor-response criteria
Section 5 of the draft guidance document describes how to evaluate the candidate stressor-response criteria. An approach is outlined for predicting conditions that might result after implementing different nutrient criteria. Please comment on uncertainties that would remain if water quality criteria for nutrients were based solely on estimated stressor-response relationships and in what ways other information/analysis would help address and possibly reduce this uncertainty?
Section 5 of the Guidance describes how to evaluate candidate numeric nutrient criteria. The Committee provides comments on uncertainties associated with deriving candidate water quality criteria. We also recommend improvements in the Guidance to help address and reduce uncertainty.
• The Guidance describes approaches that use a data-mining exercise to demonstrate a possible cause-effect relationship for the nutrient-ecosystem response.
However, as further discussed in the response to Charge Question 7, the document does not address or partition the inherent critical uncertainties associated with the stressor-response approach. We note that these uncertainties can be extremely large (e.g., several orders of magnitude). To address these uncertainties, the Guidance should better document the physical, chemical and biological variables comprising the morphological relationships (e.g., habitat, spatial, and temporal) that define the aquatic system of interest, and which may be important in modifying the relationship between nutrient concentrations (both nitrogen and phosphorus) and observed endpoints. These factors may dominate the cause-effect pathway and should be documented so that uncertainty in the relationship between nutrient concentrations and measured endpoints can be reduced.
• The Guidance should indicate that, at the start of the initial problem formulation exercise, a realistic cause-effect conceptual model must be developed, and that the model should include those factors that are likely to contribute most to the change in the response variable for the specific region/system of interest. Then data analyses can be used to evaluate which of the factors, or combination of factors, caused the observed change in the response variable.
• As further discussed in the response to Charge Question 7, when predicting conditions that might result after implementing nutrient criteria, it is important to consider environmental factors that may cause differences in nutrient concentrations and biological conditions (e.g., lead and lag times) in response to nutrient loadings.
• There is considerable uncertainty in linkage of the response variables discussed in the Guidance to the Clean Water Act goals of drinkable, swimmable, and fishable waters. The recommended response variables in the Guidance should be directly linked to these Clean Water Act goals.
2.
INTRODUCTION EPA's Office of Water (OW) requested that the Science Advisory Board (SAB) conduct a peer review of the Agency's draft guidance document, Empirical Approaches for Nutrient Criteria Derivation (the "Guidance"). The Guidance was developed by EPA's Office of Water to provide information for water resource managers on the scientific foundation for using empirical approaches to describe stressor-response relationships for developing numeric nutrient water quality criteria. The SAB Ecological Processes and Effects Committee (Committee) met on September 9th-11th, 2009 to review the Guidance. To augment the expertise on the Committee for this advisory activity, several additional scientists with specific knowledge and expertise in assessing the effects of nutrient enrichment in aquatic systems also participated in the review. This report transmits the advice of the Committee. EPA's Office of Water is charged with protecting aquatic life, wildlife, and human health from adverse water-mediated effects of anthropogenic pollutants. In support of this mission, OW develops ambient water quality criteria that serve as guidance to states and tribes for adoption of water quality standards. State and tribal water quality standards include designated uses, such as aquatic life protection and recreation, and criteria that define levels of water quality variables protective of the designated uses. Because nutrients (nitrogen - N and phosphorus - P) are a major cause of water quality impairment in the Nation's waters, state adoption of numeric nutrient criteria into water quality standards has been a high priority for OW. 
The Office of Water has stated that numeric nutrient water quality standards are important because they can: support development of nutrient related Total Maximum Daily Loads (TMDLs); provide targets for nutrient trading programs; and make it easier to write National Pollutant Discharge Elimination System (NPDES) permits, evaluate the success of nutrient runoff minimization programs, and measure environmental progress.
To assist states and tribes in developing numeric nutrient criteria, OW published peer reviewed technical guidance for developing such criteria for lakes and reservoirs (U.S. EPA, 2000a), rivers and streams (U.S. EPA, 2000b), estuaries and coastal marine waters (U.S. EPA, 2001), and wetlands (U.S. EPA, 2008). These technical guidance documents focus primarily on a reference condition approach for deriving nutrient criteria from distributions of nutrient concentrations and biological responses in minimally disturbed reference waterbodies. Other basic analytical approaches for nutrient criteria derivation identified in prior guidance documents include mechanistic modeling (i.e., predicting the effects of changes in nutrient concentrations using site-specific parameters and equations that represent ecological processes), the stressor-response approach, and the application and/or modification of established nutrient/algal thresholds. The stressor-response approach involves quantifying a relationship between nutrient concentrations and biological response measures related to the designated use of a waterbody. In the Guidance, EPA states that, when developing nutrient criteria, the strengths and characteristics of each analytical approach should be carefully considered with respect to data availability and designated use protection needs.
The Guidance outlines a five-step process for developing numeric nutrient criteria. Step one involves selecting variables that appropriately quantify the stressor (i.e., excess nutrients) and the response.
The Guidance describes various techniques for exploratory data analysis to understand the properties of different variables and visualize data. These techniques include histograms, box and whisker plots, quantile-quantile plots, cumulative distribution plots, scatter diagrams, and spatial mapping. Step two involves assessing the strength of the relationship represented in the stressor-response linkage. The Guidance discusses the use of conceptual models, existing literature, and empirical models to assess the degree to which changes in nutrient concentration are likely to cause changes in a chosen response variable. Step three involves analysis of data to estimate stressor-response relationships and identify thresholds that may be used to derive criteria. The Guidance describes a number of statistical methods for analyzing data to estimate stressor-response relationships. These methods include linear regression, logistic regression, quantile regression, non-parametric changepoint analysis, and discontinuous regression modeling. Step four involves evaluating the stressor-response relationships (including validation of predictive performance for a stressor-response model and selecting a model that best represents the data). Step five involves evaluating candidate stressor-response criteria. The Guidance outlines an approach for predicting conditions that might be expected after implementing different nutrient criteria and selecting a value to optimize resource protection. The Committee was asked to comment on the scientific and technical merit of the methods and approaches discussed in the Guidance and to offer suggestions to improve the usefulness of the document to state and tribal water quality scientists and resource managers. The Committee recognizes the importance of EPA's efforts to support numeric nutrient criteria development and encourages the Agency to continue this important work. 
In addition, we recognize the stressor-response approach as a legitimate, scientifically based method for developing numeric nutrient criteria if it is appropriately applied (i.e., not in isolation). The draft Guidance provides a primer on a limited set of statistical methods that could be used in deriving nutrient criteria based on stressor-response relationships. However, the Committee finds that improvements in the Guidance are needed prior to its release to make the document more useful to state and tribal water quality scientists and resource managers.
In general, we find that the scope, limitations, and intended use of the Guidance need to be more clearly described. The Guidance addresses only one type of "empirical" approach for derivation of numeric nutrient criteria (i.e., the stressor-response framework). In this regard, we strongly recommend that EPA more clearly articulate how the Guidance fits within the decision-making and regulatory processes and, specifically, how it relates to and complements EPA's other nutrient criteria technical guidance manuals and documents.
As illustrated in the data analysis examples in the Guidance, a large degree of unexplained variation can be encountered when attempting to use the empirical stressor-response approach to develop nutrient criteria. The final Guidance should clearly indicate that such unexplained variation can present significant problems in the use of this approach. Further, the final document should clearly state that statistical associations may not be biologically relevant and do not prove cause and effect. When properly developed, statistical associations can be useful in supporting cause and effect arguments as part of a weight-of-evidence approach (further discussed in Section 3.3, recommendation #7 of this advisory report) to criteria development.
Therefore, the final Guidance should provide more information on the supporting analyses needed to improve the basis for conclusions that specific stressor-response associations can predict nutrient responses with an acceptable degree of uncertainty. Such predictive relationships can then be applied, with mechanistic or other approaches, in a tiered weight-of-evidence assessment using individual lines of evidence in combination to develop nutrient criteria.
Tiered environmental assessment is iterative. The initial assessment is the simplest (e.g., minimal ecosystem specific data) and most conservative (i.e., risks must be assumed in the absence of system-specific information), and thus will not always provide sufficient certainty for decision-making. Cause and effect relationships would be inferred but not demonstrated; only a few lines of evidence would be available and the corresponding uncertainty high. At the highest tier, there would be several lines of evidence and factors that would confound the prediction of effects, such as other stressors or the morphology of the waterbody, and these need to be understood and considered. Successive tiers will involve more focused (e.g., specific for particular ecosystem types) investigations, based on the results of the previous tier. Data needs are relatively few at the initial tier, but increase at successive tiers. However, through additional testing, measurement, or modeling, uncertainty decreases at successive tiers, and sources of uncertainty become better understood. Policy makers require information to understand the uncertainty associated with regulatory decisions, and to determine how much uncertainty may or may not be acceptable in particular decision-making contexts. Weight-of-evidence typically determines the tier at which uncertainty has been reduced sufficiently for informed management decision-making.
It is important to explicitly describe and consider uncertainty at each step of the criteria development and decision-making process. The level of uncertainty of the conceptual model is likely to be rather low, as it is mostly based on well-established general principles of aquatic systems. Here the uncertainty is about how well the selected conceptual model fits the specific stressors and ecological systems under consideration. As criteria are developed, it is important to address uncertainty associated with more specific factors that influence biological responses to nutrient inputs because uncertainty may cascade down through the analysis, in effect multiplying the uncertainty at later steps.
In our responses to the charge questions we have recommended specific revisions to improve various sections of the Guidance before it is published. These recommendations focus on: modifying the framework of the Guidance to make it more specific and descriptive (as illustrated in Figure 1 of this report); providing additional information on conditions under which the stressor-response framework may apply; more clearly expressing the caveats, limitations, and data requirements associated with the approaches presented in the Guidance; providing additional information and examples showing when and how to use methods and approaches described in the document; and providing more detailed and descriptive guidance on the use of statistical methods and additional support from EPA to help users meet the technical demands of the methods.
3. RESPONSE TO CHARGE QUESTIONS
In the responses to each of the charge questions, the Committee has listed key findings and comments as bullets. These findings are followed by the Committee's key recommendations. Various aspects of some cross-cutting findings have been discussed in the responses to more than one of the charge questions and cross-references have been provided.
3.1. Charge Question 1.
Improving the utility of the Guidance
What suggestions do you have that will improve the utility of the draft document, Empirical Approaches for Nutrient Criteria Derivation, for State water quality scientists and resource managers to derive numeric nutrient criteria based on stressor-response relationships?
The Committee was asked to offer suggestions to improve the usefulness of the Guidance to state and tribal water quality scientists and resource managers for deriving numeric nutrient criteria based on stressor-response relationships. In this regard, we find that the following improvements in EPA's Guidance are needed.
Findings concerning improving the utility of the Guidance
• The scope, limitations, and intended use of the Guidance should be more clearly identified. The Guidance addresses only one possible approach (i.e., the stressor-response framework) for derivation of numeric nutrient criteria. The Guidance would be more useful if it: 1) expanded upon the utility of the mechanistic modeling approach for understanding stressor-response relationships and the reference condition approach for criteria derivation; 2) more clearly articulated how it relates to EPA's other published nutrient criteria guidance; 3) explained the linkages among designated uses, stressors, measures of stressors, and the deleterious effects of stressors on designated uses; 4) explained that the Guidance does not address "downstream" effects of nutrients; and 5) acknowledged other factors that have appeared to limit state progress toward developing nutrient criteria (e.g., lack of data and technical expertise, insufficient resources, or other factors).
• Substantial revision of the document is needed to facilitate identification of the most scientifically defensible approaches to deriving numeric nutrient criteria.
The Committee emphasizes that understanding the causative link between nutrient levels and impairment is necessary in order to assure that managing for particular nutrient levels will lead to desired outcomes. As further discussed below, the stressor-response framework in the Guidance may often not be the most appropriate approach for deriving numeric nutrient criteria. [See the response to Charge Question 5 for additional discussion.] • Substantial revision of the document is needed to increase its usability while reducing the likelihood of misuse. The Committee finds that the Guidance would be more useful if it: 1) provided a more specific and descriptive framework outlining the steps in the criteria development process (a specific example is illustrated in Figure 1); 2) contained more technical guidance and examples to describe when and how to use various methods and approaches in the document and ensure statistical rigor (with additional support provided from EPA to help users meet the technical demands of the methods); 3) more clearly expressed the caveats and limitations of the statistical methods and approaches in the document; and 4) provided additional guidance on data requirements for application of the statistical methods and approaches. [See the response to Charge Question 5 for additional discussion.] 
[Figure 1. EPA's Framework for Developing Nutrient Criteria Based on Stressor-Response Relationships and SAB Recommendations for Revision.
EPA's Framework as Described in the Draft Guidance Document: Step 1, Selecting and Evaluating Data; Step 2, Assessing the Strength of the Cause-Effect Relationship; Step 3, Analyzing Data; Step 4, Evaluating Estimated Stressor-Response Relationships; Step 5, Evaluating Candidate Stressor-Response Criteria.
Framework Recommended by the SAB (at each step in the process, the uncertainty should be explicitly described): Step 1, Problem Formulation and Goal Development; Step 2, Conceptual Model Development (Consideration of Empirical Approach in Conjunction With Other Lines of Evidence); Step 3, Selection and Preliminary Evaluation of Data; Step 4, Evaluation of Stressor-Response Approach (decision point: Is Stressor-Response Appropriate? If no, Consider Other Methods; if yes, continue); Step 5, Stressor-Response Model Selection; Step 6, Evaluate Candidate Stressor-Response Criteria and Consider Other Methods if Necessary; Step 7, Criteria Development: Use Weight-of-Evidence Approach to Compare Output to Step 1 Goals.
* This includes consideration of factors discussed in this advisory report such as cause and effect, relevance to known mechanisms and existing conditions, and ability to predict the probability of meeting designated uses.]
• The absence of a direct causative relationship between stressor and response is one of the most serious issues raised by the Committee. Without a mechanistic understanding and a clear causative link between nutrient levels and impairment, there is no assurance that managing for particular nutrient levels will lead to the desired outcome. There are numerous empirical examples where a given nutrient level is associated with a wide range of response values due to the influence of habitat, light levels, grazer populations and other factors.
If the numeric criteria are not based upon well-established causative relationships, the scientific basis of the water quality standards will be seriously undermined. [See the responses to Charge Questions 4, 5, and 7 for additional discussion.]
• Numeric nutrient concentration criteria and load-response models should be considered as two different approaches for accomplishing the goal of controlling excessive nutrient loadings. EPA has put forth the reference condition approach, the empirical stressor-response approach, and mechanistic modeling as basic analytic approaches for development of numeric nutrient criteria. However, the way in which EPA used results from mechanistic models to develop nutrient load reduction goals for the Gulf of Mexico (Mississippi River/Gulf of Mexico Watershed Nutrient Task Force, 2008), and the way in which it is currently using mechanistic models for nutrient and sediment TMDLs for Chesapeake Bay, does not involve development or use of numeric nutrient criteria. The reason is that these mechanistic models (Scavia et al., 2004; Cerco and Noel, 2004) are load-response models, not empirical stressor-response models, and hence they obviate the need for numeric nutrient criteria because they directly link nutrient loads to response variables that represent water quality impairments (e.g., dissolved oxygen, chlorophyll, water clarity and acreage of submerged aquatic vegetation). This reasoning applies not only to mechanistic models but can also apply to empirical models. Turner et al. (2008) and Hagy et al. (2004) developed empirical statistical models for hypoxia in the Gulf of Mexico and Chesapeake Bay, respectively. Both of these models are load-response models and neither involves numeric nutrient concentrations. Further support for this reasoning can be found in Carleton et al. (2005), an EPA study designed to demonstrate the use of mechanistic models to develop nutrient criteria.
In fact, in the two examples presented in this study, mechanistic models were actually used as load-response models and not to develop ambient nutrient concentration criteria.
Key recommendations concerning identification of the scope, limitations, and intended use of the document
As a consequence of the Committee's discussion and the findings listed above, we provide the following recommendations for revising the Guidance.
1. EPA should specify how the Guidance is to be used in combination with other EPA nutrient criteria technical guidance manuals. In the preamble, the Guidance should clearly state that the contents represent one of several possible approaches (i.e., the stressor-response framework in the Guidance, mechanistic modeling, reference condition, and the application and/or modification of established nutrient/algal thresholds) that should be considered when deriving numeric nutrient criteria, and expand upon the utility of considering all approaches in a tiered weight-of-evidence approach before deciding on a particular course of action. In this regard, the Guidance should indicate that numeric nutrient concentration criteria and load-response models should be considered as two different approaches for accomplishing the goal of controlling excessive nutrient loadings. To provide additional information on other approaches, EPA should consider appending to the document relevant portions from earlier guidance manuals.
2. EPA should more clearly articulate how the Guidance fits within the decision-making and regulatory processes and, specifically, how it relates to and complements EPA's nutrient criteria technical guidance manuals and other EPA technical documents. Outlining the fundamental principles that underlie the use of stressor-response relationships and providing background information on water quality impairments (e.g., causes and types of impairments, types of designated uses) might provide a useful context.
Including a clearer description of how water-use designations influence the derivation of empirically derived nutrient criteria might be considered as well. Considering the number and usefulness of other EPA-developed processes and recommendations, the authors should consider how they might improve the integration of this document with other EPA efforts. For example, the Guidance would benefit by incorporating the problem formulation stage that is part of the Ecological Risk Assessment process (see Figure 1 of this advisory report).

3. In the Guidance, EPA should address the importance of: 1) establishing linkages among designated uses, measured responses, stressors, and measures of stressors; and 2) relating measures of responses directly to deleterious effects on designated uses. We agree with the statement in the Florida Department of Environmental Protection's letter of September 4, 2009 (letter from Daryll Joyner, Florida Department of Environmental Protection to Thomas Armitage, Designated Federal Officer, EPA Science Advisory Board Staff Office) indicating that the "most scientifically defensible strategy for managing nutrients within the range of uncertainty is to verify a biological response prior to taking a management action." This risk/performance-based approach to setting nutrient criteria is evident not only in Florida's program, but also in those developed by California and Maine (Florida Department of Environmental Protection, 2009; Maine Department of Environmental Protection, 2009; McLaughlin and Sutula, 2007). Those risk-based linkages are not addressed in either the Guidance or EPA's Nutrient Criteria Technical Guidance documents for Rivers (U.S. EPA, 2000a), Lakes/Reservoirs (U.S. EPA, 2000b), and Estuaries (U.S. EPA, 2001).

4. In the Guidance, EPA should emphasize that the document does not address downstream effects of nutrient enrichment, which are intended to be the focus of a separate future document.
Load-response models may prove useful in addressing downstream effects. The Committee has some reservations about addressing downstream effects in a separate document because fragmentation of the guidance documents will increase the likelihood that each will be used in isolation and potentially provide misleading results.

5. In the Guidance, EPA should acknowledge key factors that have appeared to limit state progress toward developing nutrient criteria. It is the Committee's understanding that one of the key aims of the Guidance is to accelerate state progress toward adopting numeric nutrient criteria. Because a variety of issues (such as limited availability of data and technical expertise, insufficient resources, and expense) are likely responsible for slow progress, the Guidance may not sufficiently remedy the underlying problems and therefore may not facilitate state numeric nutrient criteria adoption. A more thorough exploration of the underlying reasons for slow progress would help EPA more directly address specific issues that impede progress.

Key recommendations concerning identification of the most scientifically defensible approaches to deriving numeric nutrient criteria

6. In the Guidance, EPA should recommend that users consider alternative conceptual and methodological approaches in cases where such approaches may be needed to account for complex problems associated with nutrients. The problem of eutrophication is complex, involving multiple causal variables, multiple response variables, and feedbacks among the variables (e.g., plants increase in response to nutrients; then, in turn, those nutrients are provided a second time as plants decay). Moreover, response variables can be at multiple levels: primary response variables (e.g., plants), secondary response variables (e.g., dissolved oxygen [DO], pH), and tertiary response variables (e.g., fish, macroinvertebrates).
A change in a response variable is unlikely to be satisfactorily described by changes in a single "causal" variable (e.g., total nitrogen [TN] or total phosphorus [TP]). The Committee suggests that developing conceptual models/diagrams (more detailed and accurate than shown in Figure 10 of the Guidance) to illustrate linkages and feedbacks between nutrients and response variables would be a useful approach to capture ecological complexity and better construct the conceptual framework.

7. In the Guidance, EPA should explicitly acknowledge the conditions under which the stressor-response relationship applies. For example, the stressor-response relationship is relatively strong and well-established in lakes and reservoirs as opposed to streams and rivers, where the relationship is more complex and influenced by many factors (e.g., shading, sediment, flow regime). In cases where the relationship is not the most appropriate lens through which the problem should be viewed, the user could be directed to other approaches that might better fit the problem. Other documents referenced above (e.g., Florida nutrient guidance document) provide useful examples. The Guidance would benefit from addition of an inset "red-flag" text box that lists circumstances or system characteristics that would alert the user to the need to consider approaches other than stressor-response. This box also might caution the user about mixed systems that have been highly modified and are not easily classified. Likewise, these caveats should include explicit recognition that the most appropriate criteria may depend upon contexts of the waterbody (e.g., shaded versus open-canopy streams), as was done in Florida's guidance document. Searching for a single statewide criterion might obscure important relationships.

8.
The Committee suggests that EPA consider the following two key questions as the Agency selects variables to develop numeric criteria: 1) which measures will allow detection of impairment of designated uses? and 2) is the relationship sufficiently strong to determine a management or regulatory target (i.e., a criterion) to ensure that the designated use is protected? In certain cases, the most appropriate numeric criterion may not be a particular concentration level of a nutrient. Moreover, the stressor-response framework is but one approach for developing numeric nutrient criteria, and often it may not be the most appropriate. Because this concern cuts across all recommendations and approaches included in the Guidance, and also cuts across all charge questions, it must be addressed.

Key recommendations to increase the usability of the Guidance and reduce the likelihood of misuse

9. EPA should consider modifying the steps that provide the framework of the Guidance. The Committee suggests that the steps in the framework should be more specific and descriptive. An example is illustrated in Figure 1 of this advisory report. Two important aspects of the example in Figure 1 currently are missing from the Guidance: problem formulation/goal development and conceptual model development should be the first steps in the process, and the framework should contain an explicit step to determine whether the stressor-response relationship is appropriate.

10. EPA should revise the Guidance to include more detailed and descriptive information on the use of the statistical methods discussed in the document. In addition, EPA should provide additional support to help users meet the technical demands of the methods. The Committee finds that the current draft of the Guidance is written for a user with considerable statistical expertise that may or may not be possessed by state water agencies. This potential mismatch has two serious potential consequences.
First, the Guidance will not be helpful if it cannot be easily used by state/tribal water scientists, and second, the recommended methods could be misused and/or misapplied if not sufficiently understood by the user. As a corollary, the Guidance could specify the level of expertise needed by potential users. Correctly identifying the level of expertise of the anticipated users and providing detailed and descriptive information for them is perhaps the most critical step in the continued development and refinement of the Guidance. As part of this process, EPA needs to outline a relatively straightforward process that the users can follow to employ the methods described, and provide technical support for their use.

11. In the Guidance, EPA should more clearly express the caveats and limitations of the approaches presented. In this regard, the following issues are of greatest concern to the Committee:
a) The approaches presented in the Guidance are correlative and do not demonstrate causation.
b) Many water quality problems are site-specific and confounding variables likely exist.
c) As further discussed in the responses to Charge Questions 2, 3, and 5, there are limitations associated with the retrospective approaches that are the primary focus of the Guidance, and also shortcomings associated with the multivariate techniques presented in the document. In particular, EPA should better identify potential confounding variables and other latent variables that may affect the response.

12. The Guidance should be revised to include additional information (i.e., technical guidance) and more examples showing when and how to use different approaches presented in the document, the advantages and limitations of each approach, the underlying assumptions and data requirements, appropriate interpretations of statistical results, and how to best parameterize the statistical models.
This "how-to" information could take a number of forms, including keys, inset boxes, and appendices. Users must be given additional information that provides a clear understanding of why and under which conditions they should consider any particular approach. Related to this, the Committee recommends that the Guidance contain additional examples of the methods described in the document. Specific topics that might be included in this technical guidance include: how to modify the approaches in order to derive site-specific criteria, how to identify thresholds, use of weight-of-evidence approaches, and how to handle censored values. EPA also could include an appendix that lists other sources of assistance (e.g., Regional Technical Assistance Groups [RTAGs]) and methodological resources. Organization of the document and current section headings also could more clearly identify the steps involved in the suggested empirical approaches. It would also be helpful to incorporate case studies that apply data sets typical of what most states have. These case studies could highlight decision points in the process of criteria derivation. The use of a single case study across all the various approaches suggested in the document would be particularly helpful.

13. The document should better address data requirements (including data acquisition and data quality requirements). Without guidelines on data requirements, the potential for applying techniques to inappropriate or inadequate data sets is great. The Committee recommends casting this discussion in terms of data quality objectives (DQOs) and therefore suggests the following process: 1) state the problem; 2) identify the decision; 3) identify inputs to the decision; 4) define the study boundaries; 5) develop a decision rule; 6) specify tolerable limits on decision errors; and 7) optimize the design for obtaining data.

3.2. Charge Question 2.
Selecting stressor and response variables

Section 1 of the draft guidance document reviews how to select the variables that appropriately quantify the stressor (i.e., excess nutrients) and the response (e.g., chlorophyll a, dissolved oxygen, or a biological index). Please comment on whether the factors to consider described in Section 1 of the draft document are appropriate for selecting response variables that are sensitive to nutrients and related to measures of designated uses.

Section 1 of the EPA Guidance reviews factors to consider when selecting stressor and response variables for empirical derivation of numeric nutrient criteria. The Committee finds that this section of the Guidance could be strengthened and recommends that EPA include additional material to address the points discussed below. Although the current version of the Guidance addresses some of these points, we recommend including additional examples and revisions to further develop various parts of the text as discussed below.

Findings on selecting response variables

• Although the Guidance states that response variables should be coupled to designated uses, the Committee finds that this point needs additional elaboration. Some response variables described in the Guidance are clearly related to designated uses (e.g., DO) but the linkage of other responses to designated uses is less obvious or not as well supported scientifically (e.g., macroinvertebrate species richness). Despite the importance of DO and the fact that a large number of waterbodies are impaired due to low DO concentrations, none of the examples in the Guidance include DO as a response variable. This is a significant omission that needs correcting. The Committee notes that appropriate response variables are also highly ecosystem specific. For example, chlorophyll concentrations are often more clearly related to designated uses for lakes than streams.
While response variables for single taxa (e.g., salmon) may be tightly related to designated use, multimetric variables (macroinvertebrate indices, index of biotic integrity [IBI]) may be more powerful for integrating the response to nutrients at the community or ecosystem level. The Guidance would be strengthened by including more discussion relating ecosystem type and potential response variables to the designated uses (a table with some accompanying text might be an effective way to do this). [See the responses to Charge Questions 3, 5, and 7 for additional discussion.]

• Conceptual model development should be required and should be incorporated early in the process of criteria development (see Figure 1 of this advisory report). Conceptual models are an important component in selection of response variables. Any stressor-response relationship used in criteria development must have ecological relevance (based on ecological understanding of the system) that can be readily explained and defended as discussed in step two in the Guidance. Conceptual models based on past empirical and experimental studies are important for identifying the mechanisms responsible for responses and effectively communicating this linkage. In the framework suggested by the Committee (Figure 1), developing the conceptual model is the second step in the process. [See the responses to Charge Questions 4 and 6 for additional discussion.]

• The Guidance would be strengthened considerably by presentation of examples illustrating a strong nutrient-response relationship and, as previously mentioned, clear linkage of the response variable to a designated use. It is important to clearly present the rationale for such linkage. Some of the examples in the Guidance illustrate relationships with very low R² and response variables that are not clearly related to designated use. [See the responses to Charge Questions 3, 5, 6, and 7 for additional discussion.]
• In the Guidance, further discussion of potential response variables appropriate for nutrient effects on detritus-based systems is warranted (e.g., how macroinvertebrate populations dependent on detritus may respond). The Guidance focuses on nutrient-response relationships driven by autotrophic processes (nutrients directly control algal growth, excessive amounts of which impair systems through indirect effects on DO, food web changes, and aesthetics). However, nutrients can also directly control heterotrophic microbes (bacteria, fungi) and indirectly control decomposition of organic matter. Excessive nutrient levels could produce large microbial growths or alter food webs in detritus-based ecosystems (e.g., many streams). Studies in the literature are cited, but examples using relevant response variables (e.g., shredder macroinvertebrate biomass, leaf breakdown rate) would be useful.

Findings on stressor and related variables

• In the Guidance, more discussion is needed to outline and provide advice on the rationale for selecting variables that should be included in data collection to allow: 1) classification/stratification of data prior to evaluation of stressor-response relationships (e.g., development of different criteria for different strata of systems); and 2) use of multivariate approaches to separate the influence of nutrients from other stressors (e.g., sediments, light regime, toxics). Stratification/classification is a particularly important issue for defining nutrient stressor-response relationships for streams, where other factors can impose significant constraints on the effects of excess nutrients on designated uses. For example, nutrient-chlorophyll relationships may not be observed in highly shaded (forested) streams, but may be significant in open-canopy streams. Similarly, nutrient-chlorophyll relationships may be weak in high-gradient streams but much stronger in low-gradient streams.
For lakes, nutrient-chlorophyll relationships may be much different for highly colored (high dissolved organic carbon [DOC]) versus clear (low DOC) systems. [See the responses to Charge Questions 3 and 5 for additional discussion.]

• Single-variable stressor-response relationships (e.g., those derived using the simple linear regression approach discussed in the Guidance) that explain a substantial amount of variation are likely to be uncommon for most aquatic ecosystems (in particular, streams). Multivariate approaches (multiple regression, structural equation modeling [SEM], etc.) may be needed to identify nutrient effects. These approaches require data on other potential stressors or constraining variables. Multivariate approaches may also be useful early in the analysis to determine whether nutrient effects are significant relative to other stressors and constraints and whether/how to pursue the nutrient effects using simple univariate regressions, perhaps after stratification of systems. [See the response to Charge Question 5 for additional discussion.]

• The Guidance focuses primarily on TN and TP as the primary nutrient stressor variables. In systems where inorganic nutrients are the dominant form, some consideration should be given to inorganic N and inorganic P. It is easier to measure the inorganic forms of N and P, and more and/or better data may be available for these forms. This is particularly true for ammonium and nitrate versus TN, but perhaps less so for P.

• In many regions N and P are often co-limiting to plants and microbes, and stressor-response relationships based on only one nutrient are weak. Nevertheless, nutrients (N and P) may be the primary factor controlling productivity/biomass. There have been several recent papers arguing for management of N and P in combination rather than singly (Lewis and Wurtsbaugh, 2008; Conley et al., 2009; Paerl, 2009).
This would suggest development of multivariate stressor-response relationships (e.g., multiple regression) that include both N and P as independent variables.

• A basic conceptual problem concerning selection of nutrient concentrations as stressor variables (as illustrated in the Guidance) is that nutrient concentrations directly control only point-in-time, point-in-space kinetics, not peak or standing stock plant biomass. Plant biomass is driven by nutrient supply rates (i.e., nutrient mass loads). Ambient nutrient concentrations are not necessarily good surrogates for nutrient mass loads. Relationships between nutrient mass loads and ambient nutrient concentrations are highly system-specific and depend on many factors including inflows, hydrology, bathymetry, sediment-water exchanges, and chemical-biological processes. Consequently, there may be many systems for which nutrient concentrations will not be appropriate stressor variables. For such systems it may be more appropriate, and scientifically defensible, to use site-specific mechanistic models incorporating loading to determine the nutrient controls required to attain designated uses.

Findings on temporal/spatial aspects of data

• The Guidance provides little discussion regarding the temporal/spatial aspects of data needed to develop relevant stressor-response relationships. For example, the document could be strengthened by providing additional material to address the following questions. "Under what conditions might the use of mean/median or maximum/minimum values of stressor and response variables be more appropriate than discrete instantaneous measurements?" "Are there instances when the use of temporally out-of-phase stressor and response data is most appropriate (e.g., the widely recognized relationship between spring nutrient concentration and summer maximum chlorophyll concentration in lakes)?"
"How can time series or longitudinal data in specific systems be used to develop more generalized stressor-response relationships?" Although such guidance may be covered in the various system-specific technical manuals (U.S. EPA, 2000a, 2000b, 2001), a summary/synthesis of the major points of these earlier documents should be included in the empirical approaches document.

• The Guidance could be strengthened by including a discussion of the importance of considering "data bias" in interpreting the stressor-response relationships. This discussion should focus on how "data bias" (i.e., limits on data representativeness) might affect predictive performance and uncertainty in stressor-response relationships. Uncertainty imposed by model assumptions should also be discussed. Specifically, additional guidance is needed with regard to interpretation of data from particular environments (e.g., a set of lake data from a particular region) and its appropriateness (or lack thereof) for describing conditions more broadly. It would be helpful to include in the Guidance examples illustrating databases that would be "ideal" or appropriate for each empirical model presented. For example, information could be provided to indicate whether a conceptual model for considering nutrient criteria might be best approached using: seasonal data; data from shaded versus unshaded streams; data from wadeable streams versus big rivers; and/or long- versus short-term averages of data describing the stressor or the response. [See the response to Charge Question 6 for additional discussion.]

• It would be useful to include in the Guidance some discussion of how nutrient recycling and other feedbacks influence stressor-response relationships. For example, the Guidance could be strengthened by addressing the following questions. "How does recycling contribute to variability and uncertainty in stressor-response relationships?"
"Are there variables that can be used in stressor-response relationships to account for recycling?"

Key recommendations concerning selection of variables to appropriately quantify the stressor and response

The Committee provides the following key recommendations to address the findings above and strengthen Section 1 of the Guidance.

1. The Guidance should be revised to elaborate upon the coupling of response variables to designated uses and the importance of ecological relevance of the stressor-response relationship. Examples should be included to further illustrate this important point. The examples should show strong nutrient-response relationships. The Guidance should be revised to include at least one example for DO as a response variable. Ideally, each method should include an example for streams/rivers and an example for lakes. If empirical stressor-response relationships are not appropriate or workable for DO in lakes, then the Guidance should state this specifically and recommend other approaches, for example, site-specific mechanistic models. There are a large number of waterbodies that are impaired by low DO and the draft Guidance is silent on this important nutrient-related problem.

2. The Guidance should be revised to include discussion of potential response variables appropriate for assessing nutrient effects on detritus-based systems.

3. The Guidance should be revised to include more discussion and advice concerning selection of variables and data needed to allow:
- Classification/stratification of data prior to evaluation of stressor-response relationships (e.g., development of different criteria for different strata of systems).
- Use of multivariate approaches to separate the influence of nutrients from other stressors (e.g., sediments, light regime, toxics). In general, the importance of multivariate stressor-response relationships and tools for multivariate approaches should be further discussed in the final Guidance.

4.
In systems where inorganic nutrients are the dominant form, the Guidance should recommend considering inorganic N and P as nutrient stressor variables.

5. The basic conceptual problem associated with selecting nutrient concentrations as stressor variables should be addressed in the Guidance (i.e., nutrient concentrations directly control only point-in-time, point-in-space kinetics, not peak or standing stock plant biomass).

6. The Guidance should be revised to include discussion of:
- The temporal/spatial aspects of data needed to develop relevant stressor-response relationships (e.g., are there instances when the use of temporally out-of-phase stressor and response data is most appropriate?).
- How "data bias" (e.g., data from different types of systems) might affect predictive performance and uncertainty in stressor-response relationships.
- How nutrient recycling and other feedbacks influence stressor-response relationships.

3.3. Charge Question 3. Approaches to demonstrate the distribution of and relationships among variables

Section 1 outlines methods to visualize available data. Please comment on the effectiveness of the following approaches described in the document (listed below) to demonstrate the distribution of and relationships among variables.
a) Basic data visualization techniques
b) Maps
c) Conditional probability
d) Classifications

Section 1 of EPA's Guidance discusses exploratory data analysis and presents several methods for demonstrating the distribution of and relationships among variables. Several basic plotting techniques are presented in Subsections 1.2 - 1.6 of the document. This is followed by a description of conditional probability analysis (a statistical approach for summarizing how changes in nutrient concentrations are associated with the probability of waterbodies attaining their designated uses). The Committee was asked to comment on the effectiveness of the methods presented in this section of the Guidance.
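
As an illustration only, the conditional probability idea described above can be sketched in a few lines of Python. The sketch is not taken from the Guidance or from EPA's CProb tool; the function name, the total phosphorus values, and the impairment flags are all hypothetical. It estimates the probability that a site fails to attain its designated use, given that the nutrient concentration at the site meets or exceeds a candidate threshold.

```python
def conditional_probability(stressor, impaired, threshold):
    """Estimate P(impaired | stressor >= threshold) from paired site data.

    stressor  : nutrient concentrations, one per site (hypothetical units: mg/L)
    impaired  : booleans, True where the site does not attain its designated use
    threshold : candidate nutrient concentration
    Returns None when no site meets or exceeds the threshold.
    """
    # Keep the impairment flags of only those sites at or above the threshold.
    exceeding = [flag for value, flag in zip(stressor, impaired) if value >= threshold]
    if not exceeding:
        return None
    # True counts as 1, so the sum is the number of impaired sites in the subset.
    return sum(exceeding) / len(exceeding)


# Hypothetical total phosphorus (mg/L) and impairment status for eight sites.
tp = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.15, 0.20]
impaired = [False, False, False, True, False, True, True, True]

# Evaluate the conditional probability at each observed concentration.
curve = [(t, conditional_probability(tp, impaired, t)) for t in sorted(set(tp))]
for t, p in curve:
    print(f"TP >= {t:.2f} mg/L: P(impaired) = {p:.2f}")
```

With these made-up data the estimated probability of impairment is 0.5 across all sites and 0.8 for sites at or above 0.05 mg/L. A curve of this kind is descriptive: it summarizes an association and does not by itself demonstrate that the nutrient causes the impairment.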
The Committee notes that the response to Charge Question 3 necessarily overlaps with responses to other charge questions, particularly those that focus on identifying stressor-response relationships and conducting statistical analyses. We emphasize that visualization of data is of secondary importance if the data and statistical methods being visualized are inappropriate, because the visualization in itself suggests authenticity. Furthermore, the exploratory data analysis, including visualization, should be conducted prior to inferential statistical analyses of potential stressors and responses. The objectives of exploratory data analysis should be to better understand the system of interest and to maximize the accuracy and minimize the variability of the subsequent stressor-response relationships. The Committee finds that discussion of exploratory data analysis in the Guidance would be more effective if the document were reorganized and expanded to address the following points.

• The Guidance would be more effective if exploratory data analysis were included by itself in a separate section of the document following a major section on problem formulation (corresponding to the Framework in Figure 1 of this advisory report).

• Additional methods for exploratory data analysis should be described in the Guidance. These additional methods should include: the use of summary statistics; time series plots at fixed points in space; longitudinal plots at fixed points in time; bubble plots; Pearson and non-parametric correlation analyses; and maps that show temporal (monthly, seasonal, inter-annual) as well as spatial patterns.

• Clear guidance is needed for identifying when and how the statistical methods and visualization techniques should be used. The strengths and limitations of the methods should also be identified.
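
To make the distinction between Pearson and non-parametric correlation concrete, the following minimal Python sketch compares the Pearson coefficient with the rank-based Spearman coefficient on a small data set. It is an illustration under stated assumptions rather than a method from the Guidance: the function names are hypothetical, the total phosphorus and chlorophyll values are invented, and Spearman is used here as one common non-parametric choice.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical paired observations: total phosphorus (ug/L) and chlorophyll a (ug/L).
tp = [10, 20, 40, 80, 160]
chl = [2, 3, 5, 12, 60]

print(f"Pearson:  {pearson(tp, chl):.3f}")
print(f"Spearman: {spearman(tp, chl):.3f}")
```

In this invented example chlorophyll rises monotonically but not linearly with phosphorus, so the Spearman coefficient equals 1 while the Pearson coefficient is lower. Comparing the two during exploratory analysis is one simple way to flag a relationship that a straight-line summary would understate.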
• It would be useful to show several case examples that range from statewide to local and data-rich to data-poor, and exemplify different types of aquatic ecosystems (e.g., headwater streams, large rivers, lakes, and estuaries). Examples should note the strengths, limitations, assumptions, and uncertainties that must be considered when using the methods to explore and visualize the data, and subsequently develop the criteria. These examples should demonstrate how nutrients can be identified as significant stressors when multiple stressors and habitat factors are present and may affect the resident communities. [See the response to Charge Question 5 for additional discussion.]

• The discussion in Subsection 1.6 of the Guidance (examination of stressor-response distributions across different classes, e.g., ecoregions) should be expanded. The subsection should discuss additional data analysis and contain examples for different spatial classifications (e.g., ecoregions, states, watersheds, systems of interest), different waterbody types (e.g., streams, rivers, lakes, estuaries), and other important physical and chemical characteristics that could affect the applicability of the nutrient criteria. [See the responses to Charge Questions 2 and 5 for additional discussion.]

• The examples provided in the Guidance generally do not demonstrate a strong nutrient stressor linkage to beneficial use impairment. The stream examples show very weak correlations that have high levels of uncertainty, and the examples lump data from distinctly different ecosystems where multiple factors in addition to nutrients will contribute to biotic responses. [See the responses to Charge Questions 5, 6, and 7 for additional discussion.]

• All of the statistical and visualization methods discussed in Subsections 1.2 - 1.6 of the Guidance can be effective but they should be presented and used in a combined, weight-of-evidence approach because they each involve exploring the data in different ways.
[See the responses to Charge Questions 1, 3, 5, and 7 for additional discussion of weight-of-evidence.]

• The Committee emphasizes the importance of choosing the biological endpoints (i.e., response variables) that respond specifically to nutrients. We note that responses of benthic indices can be related to many types of stress. We question why periphyton would not be a better receptor to measure.

• The Committee suggests that field-based species sensitivity distributions (SSDs) may be useful for nutrient criteria development. We note that SSDs have been used effectively in recent publications for establishing guidelines (or refuting them) for contaminants, temperature, and salinity (Hickey, 2008; Leung et al., 2005).

The Committee also notes the following technical edits and corrections needed in the Guidance.
a. Clarify that macroinvertebrate richness is plotted in examples in Subsection 1.3.
b. The Guidance (p. 7) states that "variables are equally weighted" yet only one variable is plotted in each box plot. A better statement would be: "One limitation for box plots is that all of the samples are equally weighted."
c. Explain probability survey design and data smoothers or provide references.
d. Figure 7 is very confusing to those unfamiliar with scatterplot matrices; some additional explanation regarding how to "read" the horizontal and vertical axes of each graph in the matrix would help. Suggested wording: "For each scatterplot, its x-axis is the variable stated in the column in which the graph appears. Its y-axis is the variable stated in the row in which the graph appears."

Key recommendations regarding methods for demonstrating the distribution of and relationships among variables

As discussed above, the Committee recommends that EPA restructure and revise the Guidance to strengthen discussion of methods for demonstrating the distribution of and relationships among variables. The following key recommendations are provided.

1.
The Committee recommends that the Guidance be clarified by reframing Subsections 1.2 through 1.6 as a separate major section on exploratory data analysis. These subsections should follow another separate major section on problem formulation (see Figure 1 of this advisory report), and the material in Subsection 1.1 (selecting stressor and response variables) should be moved to later section(s) of the document.

2. The Guidance should be revised to include additional methods for exploratory data analysis. These additional methods should include: the use of summary statistics; time series plots at fixed points in space; longitudinal plots at fixed points in time; bubble plots; Pearson and non-parametric correlation analyses; and maps that show temporal (monthly, seasonal, inter-annual) as well as spatial patterns.

3. Subsection 1.6 of the Guidance should be expanded to include additional examples of different spatial classifications. Specifically, the classification subsection of the Guidance (Subsection 1.6) should be expanded with data analysis examples for different spatial classifications (e.g., ecoregions, states, watersheds, systems of interest), different waterbody types (e.g., streams, rivers, lakes, estuaries) and other important characteristics that will affect the applicability of the nutrient criteria. These characteristics could include, but should not be limited to, stream order, flow, velocity, canopy cover, dissolved oxygen, reference condition trophic status, channel width and depth.

4. The Guidance should be revised to clarify, early in the document, that there are many useful statistical and visualization methods that are not presented and which may be useful. The more common/well accepted methods could be listed in a table with references. It may also be useful to mention methods that are inappropriate. With each method the associated strengths, limitations, assumptions and uncertainties should be noted to better guide the user.

5.
Several case examples of exploratory data analysis should be included in the Guidance. These examples should illustrate cases ranging from national to local in scope, and data-rich to data-poor, with guidance on how best to explore and visualize the data.

6. The Guidance should contain additional information concerning statistical assumptions associated with various methods. Some guidance should be presented, as in other EPA documents (e.g., U.S. EPA, 2006a; U.S. EPA, 2006b), to address the importance of ensuring that statistical assumptions are not violated and that adequately trained statisticians, in concert with experienced aquatic ecologists and environmental modelers, evaluate the data. An example could be included to show how overly simplistic statistical analysis could not identify a relationship that became evident after complex/advanced analysis. The Committee notes that CProb 1.0, EPA's tool for conditional probability analysis, was developed with the R language and environment for statistical computing. The Committee questions whether R, an open-source freeware product that is becoming very popular, is completely acceptable, in the sense that there are many R-macros in use that remain to be properly "vetted." There should be some level of assurance that the recommended R-products have been properly vetted (e.g., CProb 1.0).

7. The Guidance should contain a quantitatively based weight-of-evidence framework using multiple methods and then combining them into figures and tables for visualization. Multiple statistical methods on one data set do not equate to a reasonable weight-of-evidence that significantly reduces uncertainty. Rather, the weight-of-evidence should involve different assessment methods (e.g., different data sets, different biological endpoints, measures of habitat, etc.). This premise has been embraced by other EPA programs and the scientific community (Adams, 2003; Burton et al.,
2002; Chapman, 2007; Chapman et al., 2002; Collier, 2003; Cormier et al., 2010; Fox, 1991; Linder et al., 2010; Linkov et al., 2009; Suter et al., 2002; Suter et al., 2010; U.S. EPA, 2000c; Weed, 2005; Wickwire and Menzie, 2010).

8. The Guidance should contain a discussion of how the stressor/response variables to be used are linked to one another in space and time for further analysis. There is no mention of this in Subsection 1.1 of the Guidance. The Committee questions whether it should be assumed that stressor/response measurements always occur at the exact same time and locations. It is also important to ensure that high flow events have been measured. It is well established that most nutrient loading occurs during high flows. Therefore, the influence of seasonality and smaller-scale temporal dynamics (e.g., storm events) and the importance of linking stressor and response variables with these factors should be at least noted in the Guidance.

9. The Guidance should discuss the use of modeled data (e.g., land use characterization, hydrology, surface runoff, receiving water quality) for estimating nutrient concentrations/exposures. The pros and cons associated with the use of such data should be briefly mentioned. There are a number of EPA-supported models that have been widely used and documented in recent years (e.g., HSPF, BASINS, QUAL2K, WASP, AQUATOX, and Chesapeake Bay WQSTM). Some of these are integrated watershed models designed to represent inflows and non-point source runoff loads. Typically, they are used as a "loading engine" for a receiving water quality model. Receiving water quality models describe load-response relationships for exposures (ambient nutrient concentrations) and effects (e.g., plant biomass, zooplankton, dissolved oxygen), and response parameters that represent use impairment. Some receiving water quality models can address multiple stressors.
For example, they can include N, P and silicon as potentially limiting nutrients, sediment (suspended solids) and its influence on underwater light attenuation, incident solar radiation, temperature, and grazing pressure. It is possible to use these water quality models to describe exposure (in terms of ambient nutrient concentrations) but in the absence of empirical data, this would not be scientifically defensible.

10. The Committee recommends that EPA re-evaluate many of the figures in the Guidance (e.g., 4-8, 13-16, 21, 25, and 26). These figures show widely varying data that demonstrate weak relationships.

11. The Committee recommends that the Guidance be revised to clearly indicate the statistical assumptions and uncertainties that should be taken into consideration when using methods described in the document. Some of the methods are complex and their descriptions lack transparency. Guidance should be provided to ensure that states and other users have an understanding of the data requirements and limitations, the associated statistical assumptions, and uncertainties.

12. The document should contain a discussion of ways to examine the independent and interactive effects of the variables to be considered in deriving numeric nutrient criteria (i.e., provide a menu of options to examine independent and interactive effects). Statistically, there are several well known ways to address additional contributing variables, such as total suspended solids (TSS). One way would be to use a multiple regression model or analysis of covariance (ANCOVA). This would be a valuable approach, as the additional variables are to be treated as continuous variables, and interaction terms could be added to see if the effects of TN/TP were dependent on levels of TSS, which would be expected, particularly for TP. If one treats the additional variables as factors then an analysis of covariance (ANCOVA) model would be most appropriate.
For example, if there were a TSS threshold of interest, a relationship could be established between an invertebrate endpoint and nutrient levels above and below a critical TSS threshold. This would allow one to examine independent and interactive effects.

13. The Guidance should mention the potential benefits of using proxy variables in an initial approach for exploratory analysis of data trends. For example, variable data sets that are easier and more practical to obtain, such as more generic point/nonpoint source loadings or commonly sampled stressor/response variables, might be used as proxy variables for exploratory analysis of data trends. This is briefly mentioned in Subsection 3.1 of the Guidance (auxiliary model), but such an approach could also be useful for selecting stressor/response variables early in the process (Section 1).

3.4. Charge Question 4. Methods for assessing the strength of the cause-effect relationship

Section 2 of the draft guidance document describes methods for assessing the strength of the cause-effect relationship represented in the stressor-response linkage. Please comment on whether the draft guidance document adequately describes how conceptual models, existing literature, and empirical models can be used to assess how changes in nutrient concentration are likely to cause changes in the chosen response variable.

Section 2 of the Guidance provides a summary of how the strength of tentative stressor-response pairings from step 1 can be assessed. Certainly, as indicated in the Guidance, conceptual models and existing literature can be used to support relationships that will be explored with the statistical analysis that follows. At this stage of the analysis, stressor-response relationships for which there is no reasonable conceptual model or literature to explain the underlying mechanisms would be of limited value for setting criteria. Such relationships should be set aside.
The Committee finds that the Guidance should be improved by incorporating revisions to address the following points.

• Section 2 of the Guidance does not address the strength of the stressor-response relationship, but rather support for the stressor-response relationship that is to be explored statistically. "Support" for the stressor-response relationship, rather than "strength" of the relationship, would be a better term to use in this section of the Guidance, because strength refers to the "tightness" of the statistical association between stressor and response. Use of the term "support" would, therefore, be less confusing to the user.

• It is not clear why information from mechanistic models was not included in Section 2 of the Guidance. Because mechanistic models can integrate information on the interactions of major ecosystem processes to derive quantitative estimates of effects, they too should be discussed as a possible way of supporting the stressor-response relationship. [See the response to Charge Question 1 for additional discussion.]

• Additional discussion of conceptual model selection (with specific examples) would be helpful. There are many ways to select a conceptual model and various model selection criteria that could be applied. An expanded discussion of these issues could help provide further background for a user of the document. Specific examples could be followed in later sections with discussion of statistical approaches to analyze the strength of the potential cause-effect relationships. In other words, EPA could provide an example from beginning to end that a user could follow from step to step. [See the responses to Charge Questions 1, 2, and 6 for additional discussion.]
• One important aspect of finding support for stressor-response pairings is that without formal training and practical experience in the sciences, especially biological and ecological disciplines, it is difficult to fully understand the complex relationships that may be identified. The Guidance should state the level of statistical and ecological expertise needed to use the document. [See the response to Charge Question 1 for additional discussion.]

Structural equation modeling (SEM) and Propensity Score Analysis (PSA) are techniques that can be used to organize and evaluate relationships between nutrients and response variables when extensive data are available. SEM might be more useful in tracing pathways (it is also called path analysis) of cascades that are initiated by excess nutrients than in defining criteria candidates. A relevant example of SEM is really needed in the Guidance if this approach is to be considered by users. PSA, on the other hand, seems to be useful for sorting out groups that share covariates but may have unique nutrient characteristics. Such sorting could lead to a clearer understanding of how nutrients function amid multiple covariates. The example of PSA in the Guidance appendix is helpful, but further explanation of how to interpret the results of the analysis is needed. An analysis such as PSA might really belong in a later section of the document, as it is used for data analysis rather than supporting potential relationships.

A reasonable way to assess nutrient effects might be to split data sets (through PSA, principal components analysis, and/or cluster analysis) to enable a system-specific analysis (or analysis of a small group of sites). Given the many factors that affect streams and rivers, system-specific analysis really provides an assessment of whether altering nutrient concentrations would have the desired effect on the biotic communities present.
Possible factors to consider in splitting data for streams and rivers might include, for example, stream order, flow, velocity, canopy cover, dissolved oxygen, bottom type, channel width, habitat, and depth. [See the responses to Charge Questions 2 and 5 for additional discussion.]

Experimental validation of causal relationships between nutrient and response variables should be approached with caution. The final method discussed on page 17 of the Guidance is experimental validation of causal relationships between selected nutrients and response variables. The Committee notes that this approach could be helpful in situ and there are examples of this (Benstead et al., 2009; Cross et al., 2006; Cross et al., 2007; Greenwood et al., 2007; Peterson et al., 1985; Slavik et al., 2004; Stockner and Shortreed, 1978), but mesocosm or laboratory experiments are of limited use in validating causal relationships between nutrient and response variables. For example, Hill and Fanta (2008) and Hill et al. (2009) showed in Oak Ridge National Laboratory artificial streams how P and light interact. This type of work provides fundamental data on how stream algae respond to P and light, and supports basic conceptual models of this relationship. These and previous studies have shown that, under controlled conditions, it takes very little P to maximize algal growth given high light, and this fundamental relationship could be applied to any stream in the U.S. However, the relationship is often not observed in data sets because other factors such as bottom substrate, turbidity, canopy cover, hydrology, or depth limit algal production. Therefore, caution must be used in applying a relationship from a subset of data to all data from systems that do not have the same or similar conditions. [See the response to Charge Question 6 for additional discussion of model validation.]
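The reasoning in the preceding paragraph, that a strong phosphorus response seen under high light may disappear when another factor is limiting, can be illustrated with a minimal Python sketch. The sketch assumes a generic saturating (Monod-type) response combined with a minimum rule across limiting factors. This is a textbook-style simplification chosen only for illustration; it is not the model used by Hill and Fanta (2008) or Hill et al. (2009), and the function names and half-saturation constants are hypothetical.

```python
# Illustrative sketch only: a generic saturating response with a minimum rule.
# The half-saturation constants are hypothetical and are not taken from any study.

def limitation(resource, half_saturation):
    """Saturating limitation term between 0 and 1 (Monod-type form)."""
    return resource / (half_saturation + resource)

def relative_growth(p, light, k_p=2.0, k_light=50.0):
    """Relative algal growth (0 to 1), set by whichever factor is most limiting."""
    return min(limitation(p, k_p), limitation(light, k_light))

# Under high light, growth responds to P and then saturates quickly.
for p in (1.0, 5.0, 20.0, 200.0):
    print(f"high light, P = {p:>5}: growth = {relative_growth(p, 1000.0):.2f}")

# Under heavy shade, the same P gradient produces no change in growth.
for p in (1.0, 5.0, 20.0, 200.0):
    print(f"low light,  P = {p:>5}: growth = {relative_growth(p, 5.0):.2f}")
```

In this sketch, sites that are shaded (or limited by any other factor represented in the same way) show a flat response to P, which is one reason a relationship derived from a subset of systems may not hold for systems with different conditions.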
Key recommendations concerning methods for assessing the strength of the cause-effect relationship represented in the stressor-response linkage

In light of the comments and findings discussed above, the Committee provides the following key recommendations to improve Section 2 of the Guidance.

1. Section 2 of the Guidance would be more appropriately titled "Assessing Support for the Potential Cause-Effect Relationship."

2. Mechanistic models should be discussed in the Guidance as one way of supporting the stressor-response relationship.

3. The discussion of conceptual models should be expanded to address various criteria for model selection, and additional examples should be included.

4. The level of statistical and ecological expertise needed to use the Guidance should be stated.

5. Structural Equation Modeling (SEM), offered as an alternative model for exploring nutrient-ecosystem response, should be more fully explained with clear examples.

6. Further explanation of how to interpret the results of propensity score analysis (and additional examples) should be included in the Guidance.

7. Experimental validation of causal relationships between nutrient and response variables should be approached with caution because a number of factors can affect the response of a system to nutrient enrichment.

3.5. Charge Question 5. Statistical methods to analyze the data

Section 3 of the draft guidance document outlines statistical methods to analyze the data to estimate stressor-response relationships. Please comment on the appropriateness of the methods outlined in the document (listed below) for describing stressor-response relationships associated with nutrient pollution. What approaches would you recommend that could effectively address indirect pathways of adverse effects? What recommendations do you have to address the effects of confounding variables and uncertainty in the estimated relationships?
a) Simple linear regression
b) Quantile regression
c) Logistic regression
d) Multiple linear regression
e) Non-parametric changepoint analysis
f) Discontinuous regression models

The Committee notes that EPA's draft Guidance appropriately states that numeric nutrient criteria should be based on predictive stressor-response relationships so that changes in the level of stressor variables will result in predictable ecosystem responses. However, based on examples presented in the draft document and elsewhere, a large degree of unexplained variation can be encountered when attempting to use empirical stressor-response approaches to establish criteria. The final Guidance needs to clearly indicate that such unexplained variation can present a significant problem to this method of developing numeric criteria. Further, the final document should emphasize that statistical associations may not be biologically relevant and do not prove cause and effect. However, when properly determined, statistical associations can be very useful in supporting a cause and effect argument as part of a weight-of-evidence approach to criteria development. To this end, the final document should provide greater detail on the implementation of statistical procedures and development of other supporting information to minimize the degree of unexplained variation and maximize the potential for the empirical stressor-response approach to result in useful numeric nutrient criteria. EPA should also provide guidance on the strength of stressor-response relationships needed to support criteria development using an empirical stressor-response approach. Further, because nutrients are essential elements, the application of statistical methods must consider both nutrient deficiency and excess. Clear links between response variables and designated uses are needed to ensure that both of these possible impairment types are addressed.
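To make the idea of "unexplained variation" concrete, the following minimal Python sketch fits a simple linear regression by ordinary least squares and reports the fraction of variation in the response that the fitted line does and does not explain. The data values, variable names, and function names are hypothetical and were invented for this illustration; they are not taken from the Guidance or from any monitoring data set.

```python
# Hypothetical illustration: simple linear regression and its unexplained variation.
# The data below are invented for this sketch and are not taken from the Guidance.

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Fraction of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical log10(TN) values and macroinvertebrate richness counts.
log_tn = [1.5, 1.8, 2.0, 2.2, 2.5, 2.7, 3.0, 3.2, 3.5, 3.8]
richness = [52, 35, 58, 41, 30, 49, 44, 27, 46, 33]

slope, intercept = fit_line(log_tn, richness)
r2 = r_squared(log_tn, richness, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"explained variation (R^2) = {r2:.2f}, unexplained = {1 - r2:.2f}")
```

A negative slope with a small explained fraction would indicate that some trend exists but that most of the variation in the response is attributable to factors other than the stressor, which is the situation the Committee asks the final Guidance to address.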
The Committee provides the following findings and comments concerning the appropriateness of statistical methods in the Guidance, approaches to address indirect pathways of adverse effects, and ways to address the effects of confounding variables and uncertainty in the estimated relationships.

Findings on appropriateness of listed statistical methods

• The Guidance represents a substantial step forward in describing statistical methods that can be used in deriving nutrient criteria based on stressor-response relationships, but more information is needed to describe supporting analyses necessary for application of the methods. The six methods identified in the Guidance generally provide appropriate options for describing stressor-response relationships that may be sufficiently predictive to support setting numeric nutrient criteria. As many examples in the draft document illustrate, there is likely to be considerable variability in stressor-response nutrient relationships and, thus, in the predicted outcome or response to both target setting and response to mitigation efforts. Therefore, the document must provide more information on the supporting analyses needed for each method to correctly identify useful predictive relationships, and acknowledge that the use of these statistical methods alone cannot provide sufficient evidence of a cause-effect relationship. [See the response to Charge Question 1 for additional discussion.]

• The use of non-parametric change point analysis and discontinuous regression analysis must be associated with biological significance and the designated uses to be protected by numeric nutrient criteria. As stated previously, response variables must be associated with designated uses in all cases. This has implications for the use of non-parametric change point analysis (nCPA) and discontinuous regression in criteria development.
The Guidance indicates that, because these procedures may identify breakpoints in nutrient responses that can serve as criteria thresholds, the methods may be used when designated use thresholds are not available. However, although these methods may be able to identify and characterize breakpoints, such breakpoints may not necessarily have any biological significance, nor will they necessarily be related to designated uses that are to be protected by numeric nutrient criteria. Use of these methods must be associated with designated uses. [See the responses to Charge Questions 1, 3, 6, and 7 for additional discussion of the importance of biological significance and linkages to designated uses.]

The statistical methods in the Guidance require careful consideration of confounding variables before being used as predictive tools. For example, the appropriate use of bivariate regression methods requires additional efforts through classification or other means to minimize the influence of other potential causal variables so that an acceptable level of confidence in the predictive power of the relationship can be achieved. Without such information, nutrient criteria developed using bivariate methods may be highly inaccurate. Multiple linear regression is an appropriate way to incorporate covariates into a single analysis, although predictive power using this procedure must also be evaluated carefully. [See the responses to Charge Questions 1, 2, 3, and 4 for additional discussion.]

As previously noted, because plant biomass is driven by nutrient supply rates (mass loads), a potential conceptual problem exists with the selection of nutrient concentration (often used in the Guidance) as a stressor variable. This problem illustrates the importance of careful characterization of confounding variables. Nutrient concentrations control only point-in-time, point-in-space kinetic rates, not peak or standing stock plant biomass.
Plant biomass is driven by nutrient supply rates (mass loads). Furthermore, nutrient concentrations may not be direct surrogates for nutrient mass loads. Relationships between nutrient mass loads and ambient nutrient concentrations are highly system-specific and depend on many factors. Consequently, in some circumstances, statistical methods alone will not adequately account for the influence of confounding variables and reduce uncertainties. In other words, the Committee anticipates situations in which stressor-response statistical analysis may not lead to a scientifically justified endpoint. [See the responses to Charge Questions 1 and 2 for additional discussion.]

In order to be scientifically defensible, empirical methods must take into consideration the influence of other variables. On page 22 of the Guidance, the authors acknowledge that factors co-varying with TP concentrations may explain a portion of the 61% of the variation in log chlorophyll a concentrations apparently attributable to log TP concentrations. This presents a critical challenge in the use of empirical methods as a means of establishing numeric nutrient criteria because it means that controlling TP concentrations may have no potential to yield reductions in chlorophyll a concentrations. Thus, in order to be scientifically defensible, empirical methods must take into consideration the influence of other variables.

It is important to discuss strength-of-relationship concerns and how results of empirical approaches should be interpreted in the context of criteria development. Figure 13 on page 24 of the Guidance provides an illustration of the challenges facing the users of simple linear regression (SLR) and other empirical approaches. In this case, total macroinvertebrate species richness was regressed against total N concentrations obtained from EPA Environmental Monitoring and Assessment Program (EMAP) West Xeric Region streams.
Overall, total species richness declines with increasing TN concentration in these stream data. Applying SLR to log-transformed data yields a statistically significant slope, -3(log(TN)), at p < 0.001. However, a large degree of scatter remains, as indicated by the R² value of 0.19. A TN "candidate criterion" of 320 µg/L is obtained by finding the point of intersection of an assumed designated use total species richness threshold of 40 and the mean regression line at log(TN) = ~2.5. Unfortunately, the points where the lower and upper 90% prediction interval lines cross a species richness threshold of 40 cover a TN concentration range from about log(TN) = 1.25 to log(TN) = 4 based on inspection of Figure 13. This corresponds to a TN concentration range of 16 µg/L to 10,000 µg/L. It is important to understand the management consequences of this considerable uncertainty. Also, the fact that the relationship in Figure 13 is both statistically significant (i.e., some trend is evident) and has a low R² = 0.19 (much scatter also exists) presents an opportunity to discuss strength-of-relationship concerns and how such results should be interpreted in the context of criteria development. [See the responses to Charge Questions 1, 2, and 6 for additional discussion.]

As previously discussed, relationships for streams may be more complex than for lakes and must account for multiple stressors/conditions and/or stream 'types' or conditions, and then be applied appropriately. For example, a stratified approach that considers attributes known to be important for a particular environment (lake, stream, estuary) such as canopy, habitat, etc., should be considered. It is also important to deal with both N and P simultaneously and to consider inorganic N and dissolved P. An exercise in Section 3 of the Guidance illustrates the relationship between chlorophyll a and TP in lake water.
This is perhaps the easiest and most well known example of stressor-response in natural waters, and specifically in lakes. This relationship is less certain in streams because they are more heterogeneous than lakes. The Guidance also inappropriately assumes that only nutrients affect taxa. The functionality of aquatic food chains is not solely dependent on one type of biota, sediment type, or single nutrient concentration. There are multiple stressors affecting receptors in a number of ways, over the landscape and watershed in question. Confounding variables are not sufficiently addressed in the Guidance. As previously discussed, approaches that address multiple factors, such as a stratified (or hierarchical) approach that considers other attributes known to be important (e.g., canopy, habitat, multiple nutrients) should be considered. [See the responses to Charge Questions 1, 2, and 3 for additional discussion.]

The Guidance could be improved by replacing many examples that provide low explanatory power. Concerns include examples with very low R² indicating low explanatory power and incomplete description of large uncertainty. These examples indicate that variables other than TP or TN have a greater impact on response, which implies that reducing TP or TN may not have the desired effect. Helpful examples could include: one with a response variable indirectly associated with a designated use; and one from a state where a Secchi depth is used as a criterion for water quality (otherwise Subsection 3.1, paragraph 2 sounds extremely vague). [See the responses to Charge Questions 1 and 3 for additional discussion.]

Parametric (e.g., Pearson) and non-parametric (e.g., Spearman's rank, Kendall's tau) correlation analyses can assist in identifying the influence of confounding variables, but these methods are not specifically mentioned in the Guidance. Both of these types of analyses would be helpful in exploratory data analysis.
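As a minimal sketch of the correlation screening described above, the following Python example computes Pearson and Spearman rank correlations among a stressor, a response, and a potential confounding variable. All data values and names are hypothetical and were invented for this illustration; Kendall's tau is not shown. Spearman's coefficient is computed here as the Pearson correlation of average ranks, which is the standard definition when ties are present.

```python
# Hypothetical illustration of parametric and rank-based correlation screening.
# The data are invented for this sketch and are not taken from the Guidance.
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical site data: total P, chlorophyll a, and total suspended solids.
tp = [5, 10, 20, 40, 80, 160]
chl = [1, 2, 3, 10, 40, 200]
tss = [2, 3, 9, 7, 30, 45]

print(f"TP vs chl a:  Pearson = {pearson(tp, chl):.2f}, Spearman = {spearman(tp, chl):.2f}")
print(f"TP vs TSS:    Pearson = {pearson(tp, tss):.2f}, Spearman = {spearman(tp, tss):.2f}")
print(f"TSS vs chl a: Pearson = {pearson(tss, chl):.2f}, Spearman = {spearman(tss, chl):.2f}")
```

A strong correlation between the stressor and a covariate such as TSS would flag that covariate as a possible confounder to be examined further, for example through classification or multiple regression.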
The Guidance lacks sufficient discussion of the importance of variable selection and data characteristics to ensure useful implementation of the statistical procedures. In addition to its incomplete treatment of confounding variables, the Guidance lacks sufficient discussion of the importance of variable selection and data characteristics to ensure useful implementation of the statistical procedures. Many of the non-parametric procedures rely upon bootstrap procedures to obtain confidence intervals. This underscores the importance of using a probability sampling procedure. The implications of different sample sizes should also be more fully discussed. The Guidance states that an advantage of using quantile regression (QR) is that it can provide direct estimates of percentiles of a distribution of Y values at given X values, which may be better estimates of these values than provided by SLR when the assumptions of SLR are not met. Uncertainty associated with estimating extreme quantiles from "small" sample sizes is appropriately identified in the Guidance as a concern for QR. However, small sample size is likely to present considerable challenges to any nutrient criteria development approach, and the Guidance should provide a discussion of how the amount of data may affect the utility of empirical stressor-response approaches.

In the Guidance, more information must be provided regarding regression assumptions, limitations, and diagnostic procedures. Although the Guidance should not be expected to provide the same level of detail on the implementation of statistical procedures contained in a statistics textbook, more information must be provided regarding regression assumptions, limitations, and diagnostic procedures. The appropriateness of the regression methods will depend on the assumptions and use restrictions of each method.
Although the document discusses many of the important assumptions, it would be helpful for this information to be clearly summarized in a table. The table could include headings for each method such as use, inherent assumptions, and specific remarks. In addition, the importance of regression diagnostic procedures should be emphasized. Examples and specific references to additional sources of information should be provided. This could include evaluating data with and without outliers or unusual values.

More guidance is needed on the interpretation of results from the listed regression procedures. For example, how does one decide whether the results of quantile regression are adequate for criterion development? In the discussion of logistic regression (p. 28, last paragraph), nothing is said about whether the coefficients in this analysis are significantly different from zero, or about the proportion of total deviance accounted for by the regression. For multiple linear regression (p. 31) a reference (e.g., Kutner et al., 2004) is needed for Akaike and the other methods listed in the third paragraph of the page.

The role of, and options for, data transformations should receive considerably more discussion in the Guidance. Data transformation may be appropriate in the development of stressor-response relationships using regression analysis, but this topic (including the associated back-transformation of slope estimates and confidence intervals to yield criteria) should be more carefully developed. In reading the document, one wonders when the log-transformation should be used to establish linear relationships or whether curvature that may be present in raw data (with no transformation) should be characterized. In addition, the document does not describe the range of data transformations that may be appropriate, instead focusing only on the log-transformation.
For example, regarding the nCPA presented in Figure 24, would the analysis give the same result if it were based on TP data that were not log-transformed? It is not clear in the Guidance when to apply a linear method to transformed data or a changepoint or discontinuous regression method to untransformed data. As a start, a table like Table 6.5, "Linearizing Transformations" in Weisberg (1985), p. 142 could be included in the Guidance, along with some explanation. Finally, "back-transformation" has the potential to introduce bias into the criterion value if done incorrectly, and this topic should be treated more completely to minimize that potential. The Guidance appropriately points out that regression relationships should generally not be used to project conditions beyond the range of conditions used to develop the relationships.
The Guidance is silent on how and when the results of multiple statistical procedures may be integrated to support numeric criteria as an alternative to selecting "the best" model in situations where a clearly preferred model does not emerge from the analysis. Rather than presenting the statistical techniques strictly as alternatives, the document could describe how these procedures can complement each other and provide a more robust picture of what an appropriate criterion should be. For example, a linear regression whose residuals appear to show the presence of curvature might also be evaluated with nCPA to evaluate the range of stressor values over which the curved response occurs. Model averaging (Burnham and Anderson, 2002) is recommended for use with multiple regression when slight changes in the data lead to different final models.
The Guidance provides a limited list of the statistical methods that could be explored to yield useful criteria.
If a data set includes censored values, maximum likelihood estimation can provide an alternative to bivariate or multivariate linear regression that avoids the need to substitute values such as one-half the detection limit for nondetects. In addition, parametric multivariate methods including principal components analysis (PCA), discriminant function analysis, cluster analysis, and others may also provide a useful means of incorporating covariates in a stressor-response relationship. PCA may be used to describe a group of correlated variables through a single equation. A number of non-parametric linear regression approaches are also available, including the family of Kendall tests available from the U.S. Geological Survey (Helsel and Hirsch, 1992; Helsel et al., 2006).
A key and an associated appendix of case studies should be included in the Guidance to explain the appropriate use of statistical methods and inherent assumptions and uncertainties. Since choice of method(s) will depend on the nature of the data being modeled and on the underlying assumptions, it would be useful to include in the Guidance some kind of key giving an explanation of "which method to use when," with the inherent required assumptions and uncertainties associated with each method. Better use of case studies (from lakes, streams, estuaries) in an appendix could help show "why one approach works in a particular situation and another does not." One case study should estimate the stressor-response relationship when the data form a "wedge-shaped" scatterplot, a pattern commonly observed in nutrient stressor-response relationships.
• Statistical rigor is essential to the development of scientifically defensible criteria. Simplistic application of approaches in the Guidance can lead to stressor-response relationships with poor predictive power and result in inappropriate numeric nutrient criteria.
Therefore, EPA will need to provide technical support and training to states for use of these statistical methods. As previously stated, the use of bivariate methods (including nCPA) must involve a careful examination of potentially confounding variables to develop support for a predictive relationship. In order to properly evaluate the predictive power of empirical stressor-response relationships, uncertainties associated with each method used must be identified and quantified. Simulated data sets designed to contain specific properties that may be encountered by users of the Guidance could help communicate how these statistical procedures behave over a variety of data set characteristics (e.g., a range of uncertainty in the regression slope).
• The need for statistical rigor applies to both the strength and the form of the relationship among variables (i.e., evaluating the presence of curvature in a stressor-response relationship). The Guidance should describe the goal of data analysis as one of characterizing not only the strength of relationship but also its form, and the evidence supporting conclusions about both. This is particularly relevant when deciding to use nCPA or discontinuous regression to characterize a relationship. A more complete approach should be presented to test the hypothesis that a true data threshold exists.
• EPA should provide guidance on how the degree of relationship (indicated by R², residuals analysis, and other evidence) relates to establishing predictive stressor-response relationships. At a minimum, EPA should describe how to address the important question of "when is the evidence insufficient to support using an empirical stressor-response approach?" One suggestion is to better incorporate the EPA data quality objectives process into the Guidance (see U.S. EPA, 2009c).
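The Committee's points above about simulated data sets and about bias introduced by back-transformation can be illustrated together with a short numerical sketch. The example below is not drawn from the Guidance: the simulated TP and chlorophyll a values, the coefficients, and the function names are all hypothetical, and the correction shown (Duan's smearing estimator) is one commonly used option chosen here for illustration rather than a method the Committee or EPA prescribes. It fits a log-log regression to simulated data and compares a naive back-transformed prediction with a bias-corrected one.

```python
import numpy as np

def fit_loglog(x, y):
    """Least-squares fit of log10(y) on log10(x); returns intercept, slope, residuals."""
    lx, ly = np.log10(x), np.log10(y)
    slope, intercept = np.polyfit(lx, ly, 1)  # polyfit returns highest degree first
    resid = ly - (intercept + slope * lx)
    return intercept, slope, resid

def predict_mean(intercept, slope, resid, x_new, smear):
    """Back-transform a log10-scale prediction to the original scale.

    With smear=False this is the naive 10**prediction, which estimates the
    median rather than the mean response. With smear=True the prediction is
    multiplied by the mean of the back-transformed residuals (smearing factor).
    """
    naive = 10.0 ** (intercept + slope * np.log10(x_new))
    factor = np.mean(10.0 ** resid) if smear else 1.0
    return float(naive * factor)

# Hypothetical simulated data set: log10(chl) is linear in log10(TP) with normal error.
rng = np.random.default_rng(0)
B0, B1, SIGMA = -0.5, 0.9, 0.3            # assumed "true" parameters, illustration only
tp = rng.uniform(5.0, 200.0, size=2000)   # simulated TP concentrations
chl = 10.0 ** (B0 + B1 * np.log10(tp) + rng.normal(0.0, SIGMA, size=tp.size))

b0, b1, resid = fit_loglog(tp, chl)
tp_new = 50.0
naive = predict_mean(b0, b1, resid, tp_new, smear=False)
smeared = predict_mean(b0, b1, resid, tp_new, smear=True)

# Known mean response for this simulation (lognormal mean on the original scale).
true_mean = 10.0 ** (B0 + B1 * np.log10(tp_new)) * np.exp((np.log(10.0) * SIGMA) ** 2 / 2.0)

print(f"naive: {naive:.2f}  smeared: {smeared:.2f}  true mean: {true_mean:.2f}")
```

In this simulation the naive back-transformation falls noticeably below the known mean response, while the corrected value is much closer. Repeating the exercise with different sample sizes or error variances is one way to see how the size of the bias depends on data set characteristics.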
Findings on indirect pathways
• The Committee notes that, with respect to approaches used to address indirect pathways of adverse effects, the Guidance currently does not contain a clear definition of the term "indirect pathway." One definition follows in part from the caption of Figure 10 in the Guidance: "Simplified diagram illustrating the causal pathway between nutrients and aquatic life use impacts. Nutrients enrich both plant/algal as well as microbial assemblages, which lead to changes in the physical/chemical habitat and food quality of streams. These effects directly impact the insect and fish assemblages. The effects of nutrients are influenced by a number of other confounding factors as well, such as light, flow, and temperature." This description appropriately indicates that nutrient concentrations directly impact plant/algal and microbial communities and indirectly impact insect and fish assemblages through impacts on plant/algal and microbial communities. As discussed previously, a challenge in using empirical approaches is establishing sufficient evidence to support conclusions of cause and effect so that relationships with adequate predictive power can be developed. The farther removed the response variables are from immediate responses to variations in nutrient concentrations, the more difficult it may be to demonstrate a useful degree of predictive power. Guidance on the acceptable degree of uncertainty, and/or the desired level of predictive power, may help users of the Guidance identify useful relationships whether or not pathways are direct or indirect. On the other hand, empirical methods alone are unlikely to effectively address indirect pathways of adverse effects. This requires appropriate conceptual and mechanistic models, adequate site-specific data, and experienced professional judgment.
Findings on confounding variables and uncertainty
• As previously discussed, exploratory data analysis that includes classification of data by similarities in confounding variables prior to the evaluation of stressor-response relationships may improve the predictive power of the relationships if sufficient data are available. Incorporation of confounding variables in a multiple regression is also appropriate. [See the responses to Charge Questions 1, 2, and 3 for additional discussion.]
• Because uncertainty in the appropriate criterion value cannot be eliminated, it is prudent to evaluate the potential consequences of varying degrees of uncertainty in a stressor-response relationship on the resulting criteria and management objectives. This may be accomplished in part through the use of the EPA data quality objectives (DQO) process or a similar approach. [See the responses to Charge Questions 1, 3, 6, and 7 for additional discussion of evaluating uncertainty in the stressor-response relationship.]
• References should be provided to direct the reader to more information on regression diagnostics including leverage statistics and information on influential points. This would assist the user in addressing uncertainties associated with these values. (One useful textbook is Kutner et al., 2004; there are many others.)
• The Guidance should emphasize the importance of careful pairing of potential stressor and response variables. Uncertainty in a stressor-response relationship may be increased if incompatible data types are paired. For example, combining a seasonal average chlorophyll a concentration calculated from multiple samples with a TP concentration obtained from a single grab sample could introduce considerably more uncertainty than if both variables represent seasonal averages. There are places in the Guidance where measured values are presented without a clear description of the spatial or temporal components that the value represents (on p.
22, for example, 15 µg/L chlorophyll a is presented as a threshold between mesotrophic and eutrophic conditions without indicating the applicable averaging period). The Guidance should consistently include such information in its descriptions of various components of the threshold identification and criteria-setting process.
Key recommendations concerning statistical methods in the Guidance
The Committee provides the following key recommendations to address the comments and findings presented above.
1. In the Guidance, EPA must provide more information on the supporting analyses needed for each statistical method to correctly identify useful predictive relationships, and acknowledge that the use of these statistical methods alone cannot provide sufficient evidence of a cause-effect relationship.
2. The Guidance should indicate that response variables must in all cases have biological relevance and be associated with designated uses.
3. The Guidance should emphasize that use of the statistical methods requires careful consideration of confounding variables before the methods can be used as predictive tools. As discussed above, further information on how to address confounding variables should be included in the document.
4. The Guidance should contain additional discussion of the potential consequences of varying degrees of uncertainty in a stressor-response relationship on the resulting criteria and management objectives. This may be accomplished in part through the use of the EPA DQO process or a similar approach.
5. The Guidance should contain more information on approaches that address multiple factors, such as a stratified (or hierarchical) approach that considers other attributes known to be important such as canopy, habitat, multiple nutrients, etc.
6. EPA should consider replacing the examples in the Guidance that provide low explanatory power.
7.
As discussed above, the Guidance should contain additional specific information (or guidance on where to find it) on:
- The use of parametric (e.g., Pearson) and non-parametric (e.g., Spearman's rank, Kendall's tau) correlation analyses.
- The importance of variable selection (including careful pairing of stressor and response variables) and data characteristics to ensure useful implementation of the statistical procedures.
- Regression assumptions, limitations, and diagnostic procedures.
- Interpretation of results from the listed regression procedures.
- The role of, and options for, data transformations.
- How and when the results of multiple statistical procedures may be integrated to support numeric criteria.
- An appendix of case studies to explain the appropriate use of statistical methods and inherent assumptions and uncertainties.
8. The Committee recommends that EPA consider providing technical support and training to states and tribes to assist them in the use of the statistical methods in the Guidance.
9. The Guidance should describe the goal of data analysis as one of characterizing not only the strength of relationship but also its form, and the evidence supporting conclusions about both.
10. The Committee emphasizes that EPA should provide guidance on how the degree of relationship (indicated by R², residuals analysis, and other evidence) relates to establishing predictive stressor-response relationships for numeric nutrient criteria development.
3.6. Charge Question 6. Evaluating the predictive accuracy of stressor-response relationships
Section 4 of the draft guidance document describes how to evaluate the predictive accuracy of estimated stressor-response relationships. Please comment on the appropriateness of approaches in Section 4 of the guidance document and factors to consider in evaluating and comparing different estimates of the stressor-response relationships and selecting those most appropriate for criteria derivation.
Overall, the Committee notes that Section 4 of the Guidance lacks the detail provided in other sections and, as discussed below, needs improvement. The Committee finds that this section is particularly important because it addresses the reliability or "validity" of the approaches considered. The Guidance should provide information to help managers decide which criteria derivation approach to use (e.g., analysis of best fit by regression or some other means). These are important decisions and additional guidance on how to select the best tools would be helpful. If the proposed methods yield inaccurate results, this could lead to inappropriate or ineffectual solutions to comply with Clean Water Act goals. The Committee provides the following findings and comments in response to Charge Question 6.
• The Committee finds that a clear framework and criteria for statistical model selection is needed in the Guidance. This framework should include a set of decision tools and criteria used not only to determine which model fits best, but also to decide whether the stressor-response approach to criteria development is appropriate. Model selection criteria should include:
- Capability of model to consider cause-effect and direct-indirect relationships between stressor and response;
- Biological relevance;
- Relevance to known mechanisms and existing conditions; and
- Capability of model to predict probability of meeting designated use categories.
Findings on model validation
• More detail is needed in Subsection 4.1 of the Guidance to describe model validation techniques. In the Guidance there is limited discussion of validation of empirically derived stressor-response relationships. This is a critical component. Validation can be defined as demonstrating the accuracy of the model for a specified use. Within this context, accuracy is the absence of systematic and random error - in ecology they are commonly known as trueness and precision respectively.
All models are by their nature incomplete representations of the system they are intended to model but, in spite of this limitation, models can be useful. Many discussions of mathematical modeling discriminate between model confirmation (i.e., plausible, worthy of belief) and model verification (i.e., shown to be true). Given the nature of the environmental stressor and response data, such stressor-response models cannot be fully validated. EPA should provide much more detailed validation guidance including four components:
- Conceptual validation concerns the question of whether the model accurately represents the environmental system. This is largely qualitative and requires consideration of the strength of the cause-effect relationships. To consider whether the empirical model assumptions are credible, a conceptual model of factors affecting the stressor-response relationship should be developed. For each of the proposed methods, guidance should be provided with examples showing the mechanistic reasoning behind the cause-effect assumptions and the direct-indirect responses of the stressor and response variables. This should be supported by some experimental evidence relevant to the context in which it is used (e.g., data needs appropriate for lakes may be different than for streams). For each application of the empirical model, experimental or observational data in support of the principles and assumptions should be presented and discussed.
- Algorithm validation concerns the translation of model concepts into mathematical formulae. It addresses questions such as: "Do the equations represent the conceptual model?" "Under which conditions can simplifying assumptions be justified?" "Is there agreement among the results from use of different methods (e.g., different response variables) to solve the model?"
For ecological stressor-response models, these questions relate to the adequacy of the empirical models themselves for describing the effects of nutrient enrichment on aquatic life.
- Functional validation concerns checking the model against independently obtained observations. For this type of validation the Guidance recommends using additional empirical observations (an alternative experimental data set). However, this requires more information than is usually available, and expected results may not be the same from one data set to another given the heterogeneity of environmental systems. Such data cannot truly validate the stressor-response model per se, but may produce valuable insights. Guidance is needed to answer questions such as: "what are the minimum data requirements for validation?" and "if one is working with a limited data set, how does one consider the tradeoffs between using more data in the original analysis and reserving data for validation?"
- Software validation concerns the implementation of mathematical formulae in various computer software. This validation takes into consideration the possible effects of software-specific factors on the model output (e.g., with regard to precision). For example, problems have been documented with regard to performing statistical analyses with some spreadsheet programs or open source codes.
The Committee finds that the concept of "validation" as presented in Subsection 4.1 of the Guidance is inconsistent with other EPA guidance (U.S. EPA, 2009a) on development, evaluation, and application of models. In EPA's other modeling guidance, model evaluation includes model corroboration, and sensitivity and uncertainty analyses. Model corroboration is defined as quantitative and qualitative methods for evaluating the degree to which a model corresponds to reality. In practical terms, this is the process of "confronting models with data." In some disciplines, this process has been referred to as validation.
EPA prefers the term "corroboration" because it implies a claim of usefulness and not truth. The Committee finds that this is not just a semantic distinction and we recommend that Subsection 4.1 of the Guidance be revised so that it is consistent with other EPA guidance (U.S. EPA, 2009a).
The use of data quality objectives (DQOs) should be discussed in Subsection 4.1 of the Guidance. The DQOs should be established at the beginning of the criteria development process (i.e., Guidance step one) but they can also be used to evaluate the potential stressor-response models (Guidance step four). The discussion of DQOs should address levels of uncertainty, Type I and Type II error rates, and the extent to which each model can predict the probability of meeting designated use categories. [See the response to Charge Question 1 for additional discussion of DQOs.]
In Subsection 4.1, more detailed guidance should be provided on the use of randomly or non-randomly selected data sets to help address questions about how much data should be held out of the original analysis to adequately support the validation process. Subsection 4.1 is intended to describe how to validate "the predictive performance of different models." Recommended approaches include: a) collecting new samples; and b) holding out a subset of the original data from the analysis. Reserved samples may be selected randomly or non-randomly. Authors of the Guidance appropriately note that a potential problem with using random subsetting is that the covariance structure of the data is likely to be the same, so that this approach may not provide an independent test of the predictive power of a relationship. As stated in the Guidance, reserving a non-random subset may be a useful alternative. Some discussion of the relative size of calibration and validation data sets is warranted.
The concept of "best fit" needs elaboration in the Guidance.
Best fit is based on the assumptions made and the model developed and, as previously discussed, there may be considerable uncertainty even if a model is thoroughly and carefully developed. Assumptions that are incorrect or incomplete will lead to erroneous criteria. Authors of the Guidance understand this, and state that relationships can be confounded by unsampled or unmodeled factors. This statement is true and it should be more fully discussed, and perhaps given much greater weight in each section. EPA should consider whether each example in the Guidance should be accompanied by a discussion of possible confounding issues and what might be missing. The concept of uncertainty, its effect on model results, and ways to at least understand the level of uncertainty are not fully described in the Guidance.
The Guidance should contain additional information to assess the closeness of root-mean-square predictive error (RMSPE). The RMSPE as presented on p. 42 of the Guidance is a well-recognized measure of how well a statistical model does in predicting response values from given stressor values. Figure 27 of the Guidance gives an example where the RMSPE for the calibration data set was 0.28, while the RMSPE for the held-out validation data (from a particular State) was 0.27. Many would agree that those two RMSPEs are "close." But it is necessary to answer the question, "how close is close?" No further statements appear in the Guidance about how to assess the closeness of two RMSPEs. Comparing 0.28 with 0.27 in a single example does not help users of the Guidance extend this example to their own data sets. It might be possible to take a bootstrap approach with regard to the calibration data set to derive an actual distribution of values for the calibration RMSPE against which the RMSPE of the validation data set could be compared. The Guidance does not address this.
In addition, it is appropriate to characterize fit quality using other information such as R², residuals analysis, and regression results.
• With regard to validation, nutrient criteria should result from weight-of-evidence from the application of multiple empirical approaches considering multiple response variables and other approaches as appropriate. The nutrient criteria values determined after considering validation and uncertainty may vary significantly from technique to technique or from response variable to response variable. The Committee suggests that EPA consider the range of responses and concordance among analyses/models and, as stated previously, establish linkage between response variables and designated use categories. The Guidance should discuss model averaging and should recommend considering the range of responses as a measure of overall utility of the empirical approach. In addition, the Guidance should more strongly advocate decision making based on weight-of-evidence from multiple empirical and other approaches. [See the responses to Charge Questions 1, 3, 5, and 7 for additional discussion of weight-of-evidence.]
Findings on qualitative assessment of the uncertainty of the estimated stressor-response relationship
• The Committee finds that Guidance Subsection 4.2 (addressing uncertainty) is too brief. Given the importance of this cross-cutting issue, a section on uncertainty is needed for each of the steps outlined in the Guidance, and uncertainty should be summarized at the end of the document.
• Subsection 4.2 of the Guidance should address both qualitative and quantitative estimates of uncertainty. Given reasonable expectations for data availability and inevitable limits on the conceptual understanding of complex environmental systems, the Guidance should discuss both qualitative and quantitative estimates of uncertainties. The Committee notes that an explicit accounting of uncertainty is critical.
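Returning to the question of "how close is close?" for two RMSPE values, the bootstrap idea mentioned in the findings on model validation might be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the model is a simple linear regression on log-scale variables, the function names are hypothetical, and the percentile interval is one of several possible ways to summarize the bootstrap distribution. Nothing here reproduces the Figure 27 example from the Guidance.

```python
import numpy as np

def rmspe(y_obs, y_pred):
    """Root-mean-square predictive error."""
    return float(np.sqrt(np.mean((y_obs - y_pred) ** 2)))

def fit_line(x, y):
    """Simple linear regression; returns (slope, intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def bootstrap_calibration_rmspe(x, y, n_boot=1000, seed=0):
    """Resample calibration pairs with replacement, refit, and record each RMSPE."""
    rng = np.random.default_rng(seed)
    n = x.size
    out = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # bootstrap sample of row indices
        slope, intercept = fit_line(x[idx], y[idx])
        out[b] = rmspe(y[idx], intercept + slope * x[idx])
    return out

# Hypothetical simulated calibration and validation sets (log10-scale variables).
rng = np.random.default_rng(1)
x_cal = rng.uniform(0.5, 2.5, size=200)
y_cal = -0.4 + 0.8 * x_cal + rng.normal(0.0, 0.28, size=x_cal.size)
x_val = rng.uniform(0.5, 2.5, size=60)
y_val = -0.4 + 0.8 * x_val + rng.normal(0.0, 0.28, size=x_val.size)

slope, intercept = fit_line(x_cal, y_cal)
cal_rmspe = rmspe(y_cal, intercept + slope * x_cal)
val_rmspe = rmspe(y_val, intercept + slope * x_val)

boot = bootstrap_calibration_rmspe(x_cal, y_cal)
lo, hi = np.percentile(boot, [2.5, 97.5])         # 95% percentile interval
inside = bool(lo <= val_rmspe <= hi)

print(f"calibration RMSPE {cal_rmspe:.3f}, 95% bootstrap interval [{lo:.3f}, {hi:.3f}]")
print(f"validation RMSPE {val_rmspe:.3f}, inside interval: {inside}")
```

A validation RMSPE falling well outside the interval would suggest that the two values are not "close" in any useful sense, although the width of the interval itself depends on the calibration sample size and on whether the resampled units are truly independent.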
• Validity of the space-for-time substitution assumption can be supported by analysis of long-term stressor-response data for selected data-rich sites. Subsection 4.2 of the Guidance states that all stressor-response models estimated from cross-sectional or synoptic data must also invoke the assumption that spatial differences in sites can be substituted for temporal differences without a substantial degradation of model accuracy (i.e., the space-for-time substitution). As the Guidance states, a good way to provide support for the validity of this assumption is to analyze long-term stressor-response data for selected data-rich sites.
• As previously discussed, the Guidance should contain additional information about the importance of considering "data bias" in interpreting the stressor-response results with regard to predictive performance and uncertainty, and also the importance of uncertainty imposed by model assumptions. Additional guidance is needed on how to interpret data from a particular environment (e.g., a data set based on lake data) and its appropriateness (or lack thereof) for describing conditions more broadly. It would be helpful to include in the Guidance examples of databases that would be "ideal" or appropriate for each empirical model presented. For example, would the conceptual model for considering nutrient criteria be ideally approached using seasonal data, data from shaded versus unshaded tributaries, data from wadeable streams versus big rivers, and/or long versus short term averages of data describing the stressor or the response? [See the Responses to Charge Questions 1 and 2 for additional discussion.]
Findings on selection of the stressor-response model
• The Committee notes that Subsection 4.3 of the Guidance should discuss grounding models in reality through use of prior knowledge.
A great deal is known about the effects of nutrients on aquatic systems, and the relationships between variables should reflect that knowledge. All models should be evaluated to determine whether they make sense biologically (e.g., is the range of data used appropriate? are the models mechanistically sound?). [See the response to Charge Question 5 for additional discussion.]
• Subsection 4.3 of the Guidance could be improved by providing a more detailed discussion of how to decide when to use each method to model stressor-response relationships, and the advantages/disadvantages associated with each method. Table 1 on page 44 of the Guidance is not sufficient for this purpose. It would be beneficial to provide a case study using a single data set to demonstrate the comparison of a range of model choices.
• The Committee notes that the stated objective of Subsection 4.3 in the Guidance, "demonstrating how to select a stressor-response model using the response variable that best represents the data," is not the same as the goal of Section 4, "evaluating the predictive accuracy of estimated stressor-response relationships." Confidence in predictive accuracy should be the primary consideration in model selection. Further, while it may ultimately be necessary to select a single model, one should also understand the significance to criteria derivation of selecting among reasonable alternative models or the effect of model averaging when a single most appropriate model cannot clearly be identified.
• In Subsection 4.3 of the Guidance, more detail should be provided in the discussion of conditions under which the last two methods, non-parametric changepoint analysis (nCPA) and discontinuous regression, should be applied (other than simply stating that they should be used when a direct designated use impairment threshold is unavailable).
In addition, the Committee notes that a curved response: 1) may or may not be real; 2) may or may not signal an impaired designated use; and 3) may or may not be indicated at all by the data. Further, a curved response may be modeled by one of the linear methods after transformation. [See the response to Charge Question 5 for additional discussion.]
• The Committee notes that linear stressor-response functions may not provide high levels of accuracy for nutrient criteria development. Six different methods are summarized in Table 1 of Subsection 4.3. The first four methods all assume that the stressor-response function can be modeled sufficiently as a linear model or a generalized linear model. It is unlikely that linear stressor-response functions can ever achieve high levels of accuracy across the many different confounding variables and the many different physical, chemical and biological characteristics of specific sites.
Key recommendations concerning evaluating the predictive accuracy of estimated stressor-response relationships
As a consequence of the findings presented above, the Committee provides the following key recommendations.
1. The Guidance should be revised to provide a clear framework for statistical model selection. This framework should include a set of decision tools and criteria used not only to determine which model fits best, but also whether the stressor-response approach to criteria development is appropriate.
2. The Guidance should be revised to provide much more detailed model validation guidance.
3. Subsection 4.1 of the Guidance (Model validation) should be revised to:
- Make it consistent with other EPA guidance (U.S. EPA, 2009a) on development, evaluation, and application of models.
- Provide more detailed information on the use of randomly or non-randomly selected data sets to help address questions about how much data should be held out of the original analysis to adequately support the validation process.
- Elaborate upon assumptions and uncertainties in "best fit" determinations, and in particular provide additional information to assess the closeness of root-mean-square predictive error (RMSPE).
- State that nutrient criteria should result from a weight-of-evidence approach based on the application of multiple empirical approaches considering multiple response variables as appropriate.
4. Subsection 4.2 of the Guidance should be revised to provide an expanded discussion of uncertainty. This section should address both qualitative and quantitative estimates of uncertainty as well as data bias.
5. Subsection 4.3 of the Guidance should be revised to:
- Address grounding models in reality through use of prior knowledge.
- Provide a more detailed discussion on how to decide when to use each method for modeling stressor-response relationships, and the advantages/disadvantages associated with each method.
- Provide more detail regarding the conditions under which the last two methods, non-parametric changepoint analysis (nCPA) and discontinuous regression, should be applied.
- Address inaccuracies associated with linear stressor-response functions.
3.7. Charge Question 7. Evaluating candidate stressor-response criteria
Section 5 of the draft guidance document describes how to evaluate the candidate stressor-response criteria. An approach is outlined for predicting conditions that might result after implementing different nutrient criteria. Please comment on uncertainties that would remain if water quality criteria for nutrients were based solely on estimated stressor-response relationships and in what ways other information/analysis would help address and possibly reduce this uncertainty.
Section 5 of the Guidance is an important part of the document because selection of criteria has environmental, social, and economic consequences. We provide the following comments and findings in response to Charge Question 7.
Findings on recognizing uncertainty

• As previously discussed, the Guidance does not address or partition inherent critical uncertainties in the stressor-response approach. The Guidance describes approaches that use a data-mining exercise to demonstrate a possible cause-effect relationship for the nutrient-ecosystem response. However, the document does not address or partition inherent critical uncertainties in the stressor-response approach which, as demonstrated in examples in the Guidance and in public presentations given to the Committee, can be extremely large (e.g., several orders of magnitude). Because of the demonstrated uncertainties, prediction from an empirical stressor-response model for a specific system of interest cannot always be interpreted as an accurate prediction of future conditions. [See the responses to Charge Questions 1 and 5 for additional discussion.]

• Uncertainty also results from climatic or other environmental conditions under which studies were conducted. In addition to uncertainties documented in the Guidance and in the public presentations to the Committee, uncertainty also results from the climatic or other environmental conditions under which empirical studies were conducted and response models developed. Studies conducted over relatively limited conditions (e.g., wet or dry years) or short-term periods (e.g., base flows, summer) are unlikely to provide the robust response relationships required for criteria development.

Findings on reducing uncertainty

• A major uncertainty inherent in the Guidance is accounting for factors that influence biological responses to nutrient inputs. For criteria that meet EPA's stated goal of "protecting against environmental degradation by nutrients," the underlying causal models must be correct.
Habitat condition is a crucial consideration in this regard (e.g., light [for example, canopy cover], hydrology, grazer abundance, velocity, sediment type) that is not adequately addressed in the Guidance. Thus, a major uncertainty inherent in the Guidance is accounting for factors that influence biological responses to nutrient inputs. Addressing this uncertainty requires adequately accounting for these factors in different types of waterbodies. [See the responses to Charge Questions 1, 2, 3, and 5 for additional discussion.]

• Uncertainty in the water quality criteria for nutrients could be reduced by obtaining data from well-designed site-specific monitoring programs. If "water quality criteria for nutrients were based solely on estimated stressor-response relationships," a critical overall uncertainty would be understanding where, within the range of probabilities, a single waterbody to which the criteria are applied will fall. This, in effect, is uncertainty in the space-for-time assumption discussed in the Guidance. That is, if the criterion nutrient concentration developed using an approach involving data from multiple locations is exceeded, will the predicted response and designated use impairment occur at a single location of interest? This type of uncertainty can be reduced by obtaining data from well-designed site-specific monitoring programs. Such monitoring would focus on obtaining specific information on the variability in stressor and response variables and important covariates with a goal of better defining the interactions of multiple variables and attributes affecting the designated uses of a waterbody. Measurement of actual biological responses would be appropriate, emphasizing variables that respond most directly to changes in nutrient concentrations. These are typically measures of primary productivity or primary producers, or water chemistry changes such as DO and pH.
Where necessary, such data may be used to develop computer simulation models specific to the system of interest that can facilitate forecasting of stressors and associated responses.

• Numeric nutrient criteria developed and implemented without consideration of system specific conditions (e.g., from a classification based on site types) can lead to management actions that may have negative social and economic and unintended environmental consequences without additional environmental protection. The Committee emphasizes the importance of not only recognizing but also making allowance in the Guidance for conditions specific to the system of interest so that the resulting science allows the best management decisions to be made. In this regard, as previously discussed, we recommend use of a tiered weight-of-evidence approach to criteria development. Weight-of-evidence is typically used to determine the tier at which uncertainty has been reduced sufficiently for informed management decision making. [See the responses to Charge Questions 1, 2, 3, and 5 for additional discussion.]

• The Guidance can be used to develop numeric nutrient criteria in a tiered, weight-of-evidence assessment using appropriately modified EPA approved procedures together with other approaches that address causation. Large uncertainties in the stressor-response relationship and the fact that causation is neither directly addressed nor documented indicate that the stressor-response approach using empirical data cannot be used in isolation to develop technically defensible water quality criteria that will "protect against environmental degradation by nutrients." The Guidance can, however, be used in a tiered, weight-of-evidence assessment (using appropriately modified U.S. EPA-approved procedures, e.g., EPA's Causal Analysis/Diagnosis Decision Information System [CADDIS]; U.S. EPA, 2009b). [See the responses to Charge Questions 1, 3, 5, and 6 for additional discussion.]
• EPA should consider addressing the use of probabilistic modeling (using the distribution of data in the model and re-sampling or simulating a new distribution) to better determine significant stressor-response relationships. For instance, a statistically significant stressor-response relationship can be derived that may represent only a small portion of the variability in the data. Relying solely on this relationship would result in a tremendous amount of uncertainty in the final criterion developed. A good example of this is Figure 14 (p. 25) of the Guidance, which shows a statistically significant model that explains only 5% of the variation in the data, meaning that 95% of the variation is not explained by the model. Guidance on model selection is critical to reducing uncertainty. The selection of target numeric criteria as outlined in the Guidance is enhanced by the attempt to predict post-implementation conditions. However, the example used in Figures 29 and 30 of the Guidance is confusing, as it appears that the values are re-projected using one criterion value (log TP=2) while the prediction analysis (i.e., that all 8 of the sites would still exceed the criterion) is made using a different value (log TP=1.6).

Findings on criteria application and monitoring for assessment

• The approach presented in Section 5 of the Guidance should be revisited and possibly replaced. It appears to be highly sensitive to the way that individual data points located above a response threshold are distributed around the regression line. For example, in Figures 30 and 31 of the Guidance, near the intersection of TP and chlorophyll a targets and candidate criteria, more than half of the data points fall above the regression line which reflects the best fit to all the data.
Projecting back to lower TP concentrations for each of these individual data points would force a lower TP criterion than would be the case if the data were actually normally distributed around the regression line. In other cases, there may be a "cluster" of data points below the regression line, and the back-projected TP criterion would be higher than if all data points were distributed randomly about the regression line.

• The Guidance does not adequately address the important issue of continued monitoring and assessment for adaptive management. With regard to application of numeric nutrient criteria, Section 5 of the Guidance discusses comparison of predicted and observed data to evaluate response(s), along the lines of adaptive targets. This intrinsically implies that continued monitoring and assessment of concentration versus biological response is taking place. While this is a good idea in principle, it is not clear from the Guidance that this is to be done, how it is to be done, or at what scale it should be done. This is important because it relates to the issue of measuring changes in indicators of biological response as nutrient inputs are reduced to waterbodies. It is unclear how hereditary or legacy losses or inputs of N and P to waterbodies will be considered and accounted for in such an empirical approach. This raises the next set of questions facing water resource managers who establish targets for nutrient loss reduction: "if no water quality improvement or indicator biological response is seen, are the targets/criteria too high or are legacy nutrient inputs increasingly significant contributors?" and "how long does it take dynamic ecosystems and watersheds to respond to changing nutrient inputs?"

• The Guidance should address a number of questions to clarify how the evaluation of candidate stressor-response criteria will occur, presumably through monitoring.
These questions include the following:
- While a sound monitoring program will be essential, what form will this take?
- At what level in time and space will monitoring be established to evaluate criteria?
- Where, when, and how will samples be collected to establish a long-term monitoring program to clearly define and measure candidate response(s) to any changes in management and stressor inputs, as predicted by nutrient criteria?
- How will monitoring be conducted to give a whole watershed assessment, considering all nutrient sources and stressors that are contributing spatially and temporally?
- How will continued legacy stressor inputs (N and P) be distinguished from management change-related decreases? Internal recycling of nutrients can mask water quality improvements brought about by nutrient loss reductions resulting from land management changes.

• The direct and indirect effects of best management practices should be captured in setting numeric nutrient targets and evaluating responses to target reductions. Implementation of practices to decrease nutrient losses or inputs to surface waters (i.e., best or beneficial management practices [BMPs]) can influence other factors that will affect biological response to nutrient loadings. For instance, riparian buffers are effective at removing sediment and sediment-bound nutrients (particularly P), as well as removing N by uptake and denitrification. However, they also provide shade and will influence stream water temperature and thereby the stressor-response relationship. Such interactions should be addressed in nutrient criteria development. In addition, the use of buffers, for example, will influence the size of particulates or sediment in a stream or river that may affect the benthic population dynamics or species diversity. These direct and indirect effects and complexities should be captured in target setting and the evaluation of response to achieving target reductions.
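
The finding above that a statistically significant model may explain only about 5% of the variation in the data, together with the suggestion to consider probabilistic modeling through re-sampling, can be illustrated with a minimal Python sketch. The data are synthetic, and every name and number in it (log_tp, response, the slope of -1.0, the noise level, 1,000 sites) is a hypothetical assumption chosen for illustration; none is taken from the Guidance or from the Committee's analyses. The sketch refits an ordinary least-squares line to bootstrap resamples of the sites, reports an interval for the slope, and sets it beside the fraction of variance explained and the residual scatter.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares line; returns slope, intercept, and R^2."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    r2 = 1.0 - resid.var() / y.var()
    return slope, intercept, r2


def bootstrap_fit(x, y, n_boot=1000, seed=0):
    """Refit the line on resampled (x, y) pairs; returns slopes and R^2 values."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    r2s = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample sites with replacement
        slopes[b], _, r2s[b] = fit_line(x[idx], y[idx])
    return slopes, r2s


# Synthetic, hypothetical data: a weak stressor effect buried in large scatter.
rng = np.random.default_rng(42)
n_sites = 1000
log_tp = rng.uniform(0.5, 3.0, n_sites)  # assumed stressor: log10 total phosphorus
response = 10.0 - 1.0 * log_tp + rng.normal(0.0, 3.0, n_sites)  # assumed response metric

slope, intercept, r2 = fit_line(log_tp, response)
resid_sd = np.std(response - (slope * log_tp + intercept))

slopes, r2s = bootstrap_fit(log_tp, response)
slope_lo, slope_hi = np.percentile(slopes, [2.5, 97.5])
r2_lo, r2_hi = np.percentile(r2s, [2.5, 97.5])

print(f"slope = {slope:.2f}, bootstrap 95% interval = ({slope_lo:.2f}, {slope_hi:.2f})")
print(f"R^2 = {r2:.3f}, bootstrap 95% interval = ({r2_lo:.3f}, {r2_hi:.3f})")
print(f"residual standard deviation = {resid_sd:.2f} response units")
```

Under these assumptions the slope interval would typically exclude zero while R^2 stays small, so the residual scatter, rather than the slope, governs how well the response at any single waterbody can be predicted from its nutrient concentration.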
Key Recommendations in response to Charge Question 7

The Committee provides the following key recommendations to address the comments and findings above.

Key Recommendations with regard to recognizing uncertainty

1. The Guidance needs to clearly indicate that the empirical stressor-response approach does not result in cause-effect relationships; it only indicates correlations that need to be explored further. For example, the words "cause-effect" should be removed from the title of Step two.

2. The Guidance should address partitioning the uncertainty among the various factors that are involved in the stressor-response relationship for the specific region/system of interest. Some variables may be irrelevant to the hypothesized model for that system.

3. The Guidance should better document the physical, chemical and biological variables comprising the relationships (e.g., habitat, spatial, and temporal) that define the aquatic system, and which may be important in modifying the relationship between nutrient concentrations and observed endpoints. These factors need to be well documented so that the uncertainty in the relationship between nutrient concentrations and measured endpoints can be reduced.

Key recommendations with regard to conceptual models and uncertainty description/analysis

4. The Guidance should caution users about potential problems associated with using the overall regression to predict conditions that might result after implementing different nutrient criteria.

5. EPA should consider addressing the use of probabilistic modeling to better determine significant stressor-response relationships.

6. The Guidance should address uncertainty resulting from climatic or other environmental conditions under which studies were conducted.

7. EPA should discourage use of "biased" databases (i.e., that do not contain the range of data necessary to fully characterize a system of interest) to develop stressor-response relationships.

8.
When cross-sectional data are used to develop empirical models, the ranges of values for stressors and responses in the cross-sectional data should fully encompass not only the current conditions in systems of interest, but also the predicted values for the stressors and responses corresponding to removal of the designated use impairment.

9. The Committee recommends predicting conditions that might result after implementing different nutrient criteria and testing these conditions on specific data-rich systems of interest.

10. The Committee recommends that EPA frame uncertainty according to the following key issues:
- What are the goals of the decision makers (e.g., what are the designated uses and when are they impaired?), and what amount of certainty is required to make that decision?
- Are the mechanisms of the cause-effect relationship understood and are they reflected in the types of measurements recommended?
- Do the variables measured reflect the goals of the Clean Water Act? In the examples presented in Section 5 of the Guidance, species richness or chlorophyll a are not clearly linked to the stated goals (fishable, swimmable waters, etc.).
- Does the analysis tool reflect a known cause-effect relationship and does it allow an understanding of the process?
- What are the a priori criteria to be met by the data? These must be established to make it possible to tell when the data cannot support the decision making process.

4. REFERENCES

Adams, S.M. 2003. Establishing causality between environmental stressors and effects on aquatic ecosystems. Human and Ecological Risk Assessment 9:17-35.

Benstead, J.P., A.D. Rosemond, W.F. Cross, J.B. Wallace, S.L. Eggert, K. Suberkropp, V. Gulis, J.L. Greenwood, and C.J. Tant. 2009. Nutrient enrichment alters storage and fluxes of detritus in a headwater stream ecosystem. Ecology 90:2556-2566.

Burnham, K.P. and D.R. Anderson. 2002. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach.
Springer-Verlag, NY, 488 pp.

Burton, G.A., Jr., P.M. Chapman, and E.P. Smith. 2002. Weight-of-evidence approaches for assessing ecosystem impairment. Human and Ecological Risk Assessment 8:1657-73.

Carleton, J.N., M.C. Wellman, A.S. Donigian, J.C. Imhoff, J.T. Love, R.A. Park, and J.S. Clough. 2005. Nutrient Criteria Development with a Linked Modeling System: Methodology Development and Demonstration Case Studies for Blue Earth, Rum and Crow Wing Rivers, Minnesota. EPA-823-R-05-003. U.S. Environmental Protection Agency, Office of Water and Office of Science and Technology, Washington, DC.

Cerco, C.F. and M.R. Noel. 2004. The 2002 Chesapeake Bay Eutrophication Model. EPA 903-R-04-004. U.S. Environmental Protection Agency, Region III, Chesapeake Bay Program Office, Annapolis, MD, and U.S. Army Corps of Engineers, Engineer Research and Development Center, Vicksburg, MS.

Chapman, P.M. 2007. Determining when contamination is pollution - weight-of-evidence determinations for sediments and effluents. Environment International 33:492-501.

Chapman, P.M., B.G. McDonald, and G.S. Lawrence. 2002. Weight-of-evidence frameworks for sediment quality and other assessments. Human and Ecological Risk Assessment 8:1489-1515.

Collier, T.K. 2003. Forensic ecotoxicology: Establishing causality between contaminants and biological effects in field studies. Human and Ecological Risk Assessment 9:259-266.

Conley, D.J., H.W. Paerl, R.W. Howarth, D.F. Boesch, S.P. Seitzinger, K.E. Havens, C. Lancelot and G.E. Likens. 2009. Controlling eutrophication: nitrogen and phosphorus. Science 323:1014-1015.

Cormier, S.M., G.W. Suter, and S.B. Norton. 2010. Causal characteristics for ecoepidemiology. Human and Ecological Risk Assessment 16 (in press).

Cross, W.F., J.B. Wallace, A.D. Rosemond, and S.L. Eggert. 2006. Whole-system nutrient enrichment increases secondary production in a detrital-based ecosystem. Ecology 87:1556-1565.

Cross, W.F., J.B. Wallace, and A.D. Rosemond. 2007.
Nutrient enrichment reduces constraints on material flows in a detritus-based food web. Ecology 88:2563-2575.

Florida Department of Environmental Protection. 2009. Draft Technical Support Document: Development of Numeric Nutrient Criteria for Florida Lakes and Streams. Standards and Assessment Section, Tallahassee, FL. [Available at: http://www.dep.state.fl.us/water/wqssp/nutrients]

Fox, G.A. 1991. Practical causal inference for ecoepidemiologists. Journal of Toxicology and Environmental Health 33:359-373.

Greenwood, J.L., A.D. Rosemond, J.B. Wallace, W.F. Cross, and H.S. Weyers. 2007. Nutrients stimulate leaf breakdown rates and detritivore biomass: bottom-up effects via heterotrophic pathways. Oecologia 151:637-649.

Hagy, J.D., W.R. Boynton, C.W. Keefe, and K.V. Wood. 2004. Hypoxia in Chesapeake Bay, 1950-2001: Long-term change in relation to nutrient loading and river flow. Estuaries 27(4):634-658.

Helsel, D.R. and R.M. Hirsch. 1992. Statistical Methods in Water Resources. Elsevier, NY, 522 pp. [Available online at: http://www.practicalstats.com/aes/aes/AESbook_files/HelselHirsch.PDF]

Helsel, D.R., D.K. Mueller, and J.R. Slack. 2006. Computer program for the Kendall family of trend tests: US Geological Survey Scientific Investigations Report 2005-5275, 4 pp.

Hickey, G.L. 2008. Making species salinity sensitivity distributions reflective of naturally occurring communities: using rapid testing and Bayesian statistics. Environmental Toxicology and Chemistry 22:2403-2411.

Hill, W.R. and S.E. Fanta. 2008. Phosphorus and light colimit periphyton growth at subsaturating irradiances. Freshwater Biology 53:215-225.

Hill, W.R., S.E. Fanta, and B.J. Roberts. 2009. Quantifying phosphorus and light effects in stream algae. Limnology and Oceanography 54:368-380.

Kutner, M.H., C.J. Nachtsheim, J. Neter, and W. Li. 2004. Applied Linear Statistical Models, 5th Edition. McGraw-Hill Irwin, NY, 1396 pp.

Lewis, W.M., Jr. and W.A. Wurtsbaugh. 2008.
Control of lacustrine phytoplankton by nutrients: erosion of the phosphorus paradigm. International Review of Hydrobiology 93:446-465.

Leung, K.M.Y., A. Bjorgesester, J. Gray, W.K. Li, G.C.S. Lui, Y. Wang, and P.K.S. Lam. 2005. Deriving sediment quality guidelines for field-based species sensitivity distributions. Environmental Science and Technology 39:5148-5156.

Linder, S.H., G. Delclos, and K. Sexton. 2010. Making causal claims about environmentally-induced adverse effects. Human and Ecological Risk Assessment 16 (in press).

Linkov, I., D. Loney, S. Cormier, F.K. Satterstrom, and T. Bridges. 2009. Weight-of-evidence evaluation in environmental assessment: Review of qualitative and quantitative approaches. Science of the Total Environment 401:5199-5205.

Maine Department of Environmental Protection. 2009. Nutrient Criteria for Fresh Surface Waters. http://www.maine.gov/dep/blwq/rules/Other/nutrients_freshwater/index.htm [Accessed on October 25, 2009]

McLaughlin, K. and M. Sutula. 2007. Developing Nutrient Numeric Endpoint and TMDL Tools for California Estuaries: An Implementation Plan. Southern California Coastal Water Research Project Technical Report 540. Costa Mesa, CA. [Available at: ftp://ftp.sccwrp.org/pub/download/DOCUMENTS/TechnicalReports/540_CA_N>JE_Phasell.pdf]

Mississippi River/Gulf of Mexico Watershed Nutrient Task Force. 2008. Gulf Hypoxia Action Plan 2008 for Reducing, Mitigating, and Controlling Hypoxia in the Northern Gulf of Mexico and Improving Water Quality in the Mississippi River Basin. Washington, DC. [Available at: http://www.epa.gov/msbasin/pdf/ghap2008_update082608.pdf]

Paerl, H.W. 2009. Controlling eutrophication along the freshwater-marine continuum: dual nutrient (N and P) reductions are essential. Estuaries and Coasts 32:593-601.

Peterson, B.J., J.E. Hobbie, A.E. Hershey, M.A. Lock, T.E. Ford, J.R. Vestal, V.L. McKinley, M.A.J. Hullar, M.C. Miller, R.M. Ventullo, and G.S. Volk. 1985.
Transformation of a tundra river from heterotrophy to autotrophy by addition of phosphorus. Science 229:1383-1386.

Scavia, D., D. Justic, and V.J. Bierman, Jr. 2004. Reducing hypoxia in the Gulf of Mexico: Advice from three models. Estuaries 27(3):419-425.

Slavik, K., B.J. Peterson, L.A. Deegan, W.B. Bowden, A.E. Hershey, and J.E. Hobbie. 2004. Long-term responses of the Kuparuk river ecosystem to phosphorus fertilization. Ecology 85(4):939-954.

Stockner, J.G. and K.R.S. Shortreed. 1978. Enhancement of autotrophic production by nutrient addition in a coastal rainforest stream on Vancouver Island. Journal of the Fisheries Research Board of Canada 35:28-34.

Suter, G.W., II, S.B. Norton, and S.M. Cormier. 2002. A methodology for inferring the causes of observed impairments in aquatic ecosystems. Environmental Toxicology and Chemistry 21:1101-1111.

Suter, G.W., II, S.B. Norton, and S.M. Cormier. 2010. The science and philosophy of a method for assessing environmental causes. Human and Ecological Risk Assessment 16 (in press).

Turner, R.E., N.N. Rabalais, and D. Justic. 2008. Gulf of Mexico hypoxia: alternate states and a legacy. Environmental Science and Technology 42:2323-2327.

U.S. EPA. 2000a. Nutrient Criteria Technical Guidance Manual: Rivers and Streams. EPA-822-B-00-001. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2000b. Nutrient Criteria Technical Guidance Manual: Lakes and Reservoirs. EPA-822-B00-001. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2000c. Stressor Identification Guidance Document. EPA/822/B-00/025. U.S. EPA Office of Water, Washington, DC.

U.S. EPA. 2001. Nutrient Criteria Technical Guidance Manual: Estuarine and Coastal Marine Waters. EPA-822-B-01-003. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2006a. Data Quality Assessment: A Reviewer's Guide (QA/G-9R). EPA/240/B-06/002. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2006b.
Data Quality Assessment: Statistical Tools for Practitioners (QA/G-9S). EPA/B-06/003. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2008. Nutrient Criteria Technical Guidance Manual: Wetlands. EPA-822-B-08-001. U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2009a. Guidance on the Development, Evaluation, and Application of Environmental Models. EPA/100/K-09/003. Office of the Science Advisor, Council for Regulatory Environmental Modeling, U.S. Environmental Protection Agency, Washington, DC.

U.S. EPA. 2009b. CADDIS: Helping Scientists Identify the Causes of Biological Impairments. http://cfpub.epa.gov/caddis/ [Accessed September 15, 2009]

U.S. EPA. 2009c. Quality Management Tools - Systematic Planning. http://www.epa.gov/qualityl/dqos.html [Accessed November 11, 2009]

Weed, D.L. 2005. Weight-of-evidence: A review of concept and methods. Risk Analysis 25:1545-57.

Weisberg, S. 1985. Applied Linear Regression, 2nd Edition. John Wiley & Sons, New York, 324 pp.

Wickwire, T. and C.A. Menzie. 2010. The causal analysis framework: Refining approaches and expanding multidisciplinary applications. Human and Ecological Risk Assessment 16 (in press).