EPA-600/D-84-251
October 1984

Are the "National Guidelines" Based on Sound Judgments?

by Charles E. Stephan
U.S. Environmental Protection Agency
Environmental Research Laboratory-Duluth
6201 Congdon Boulevard
Duluth, Minnesota 55804

NOTICE

This document has been reviewed in accordance with U.S. Environmental Protection Agency policy and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

ABSTRACT: Until recently, procedures used to derive water quality criteria for aquatic life were not well defined and few principles were identified. On November 28, 1980, the United States Environmental Protection Agency published "Guidelines for Deriving Water Quality Criteria for the Protection of Aquatic Life and Its Uses" in the Federal Register. These have subsequently been revised and renamed "Guidelines for Deriving Numerical National Water Quality Criteria for the Protection of Aquatic Life and Its Uses" and are referred to as the "National Guidelines." In addition, guidelines have been developed for deriving site-specific criteria either by modifying national criteria or by using other appropriate information. Procedures for deriving water quality criteria and procedures for assessing hazard to aquatic life have many similarities because both make use of information from many areas of aquatic toxicology and both assume that the science has developed sufficiently that these activities are feasible and desirable. The desirability of National Guidelines depends on the appropriateness of the strategy developed for using the resulting criteria and on the numerous technical judgments that must be made when developing the Guidelines.
KEY WORDS: aquatic toxicology, water pollution, water quality criteria, acute-chronic ratio, bioconcentration, bioaccumulation

Most aquatic toxicologists are familiar with the colorful history of water quality criteria for aquatic life as exemplified in the Green Book [1], the Blue Book [2], and the Red Book [3]. Criteria in these books were derived by a variety of procedures, but the general approach might best be called the "lowest number approach" or "most sensitive species approach." Most of the criteria were based on the lowest available result from a toxicity test or were designed to protect the most sensitive species that had been tested. In January 1978, when the U.S. EPA was preparing the sequel to the Red Book, Don Mount convinced appropriate people in the agency that there ought to be a better way to derive criteria. Naturally, the first step was to form a committee, and so six representatives from U.S. EPA's Environmental Research Laboratories began developing guidelines for deriving water quality criteria. One version was published in the Federal Register on May 18, 1978 [4], for public comment; another on March 15, 1979 [5]; and another on November 28, 1980, with response to public comment [6]. Since then, work has been progressing on a new version which will be titled "Guidelines for Deriving National Water Quality Criteria for the Protection of Aquatic Life and Its Uses" and will be available for public comment in 1983. The U.S. EPA has also proposed "Guidelines for Deriving Site-Specific Water Quality Criteria for the Protection of Aquatic Life and Its Uses" [7]. (Although these are commonly referred to as the National Guidelines and Site-Specific Guidelines, respectively, only the National Guidelines will be discussed herein, and they will be referred to simply as the Guidelines.) The title of this article is not intended to mislead anyone into thinking that my answer to the question might be "No."
Rather, the title is intended to encourage people to realize that the Guidelines are based on numerous judgments, some of which are philosophical and some technical. My purpose here is to promote consideration of some of the judgments underlying the Guidelines. In many respects, developing guidelines for deriving water quality criteria is very similar to writing a standard practice for assessing hazard to aquatic organisms. One of the obvious similarities is that some of us have been working on both of these for what seems to be a long time. The major similarity, however, is that both require consideration of many facets of aquatic toxicology and both require numerous judgments concerning generalities as well as specifics. If the major difference between hazard assessment and risk assessment is that hazard assessment is qualitative and risk assessment is quantitative, then deriving numerical water quality criteria can be considered a form of risk assessment rather than hazard assessment. It is illuminating to consider the similarities between deriving water quality criteria and assessing hazard because many of the same philosophical and technical decisions have to be made in both activities. Although much effort has been spent in the last few years in ASTM on a practice for assessing hazard and in U.S. EPA on guidelines for deriving criteria, nobody expects the final word to come soon in either area because both involve working on state-of-the-art issues in aquatic toxicology. Work in both areas continues because people feel that both are feasible and desirable, even though the questions of feasibility and desirability have not been examined very closely.

Feasibility of National Guidelines

Criteria presented in the Green Book, Blue Book, and Red Book were derived using whatever data were available and whatever rationale was considered appropriate for interpreting the data that were available for each individual material.
A basic judgment underlying the Guidelines is that a valid, comprehensive procedure can be applied to all materials. Note carefully that the claim is that the Guidelines can be applied to all materials; the claim is not that the Guidelines will allow derivation of criteria for all materials. Reasonable Guidelines must acknowledge at a number of points that for some materials the available data may not fit a recognizable pattern and so it may not be possible to derive a water quality criterion for aquatic life. In spite of differences between materials, however, it should be possible to develop a comprehensive procedure that will be valid for all materials. Another basic judgment is that not only is the available information sufficient to allow us to envision Guidelines, but enough information is presently available to develop the Guidelines. Even though all of the desirable information is not available, the data that are available provide an adequate basis for the Guidelines. When deriving water quality criteria or assessing hazard, decisions are based on data as much as possible, but in almost every case it is necessary to choose between conflicting data, to adopt simplifications, or to go beyond the available data. New data usually both answer and raise questions. Even if limited resources were not a problem, there would always be unanswered questions. Aquatic toxicologists will always be faced with the desire for more data. Thus a fundamental judgment underlying the Guidelines is that, in spite of a variety of unanswered questions, aquatic toxicology has advanced to the point that adequate information exists in the pertinent areas to develop guidelines for deriving water quality criteria for aquatic life. A corollary of this judgment about the state-of-the-art of aquatic toxicology is that the Guidelines are not "cast in stone".
Much desired information is not available; therefore, as new information and better rationales are developed, changes will be necessary. A major side benefit of the effort to develop Guidelines is that it aids in the development of new data by causing the re-examination of available data, the proposal of new ideas, and the clarification of research needs. Although new data and ideas should result in improvements from time to time, current information certainly justifies the development of Guidelines at this time. Thus the Guidelines are predicated on two fundamental judgments, one concerning the basic applicability of general principles to most materials and the other concerning the state-of-the-art of aquatic toxicology, that lead to the conclusion that Guidelines are feasible. An additional but equally important question is whether they are desirable.

Desirability of Guidelines

Two of the more fundamental problems in aquatic toxicology are that (a) water quality can affect the toxicity of most materials and (b) aquatic species show a range of sensitivities to most materials [8]. It would seem only logical, therefore, that national criteria are useless because the only good criteria are site-specific criteria. Although the authors of the Guidelines realize the importance of local or site-specific criteria, the rationale of the relationship of national criteria to site-specific criteria has developed from a vague concept in 1978 to a more well-defined idea in 1980 to a specific strategy in 1983. The assumption is that if national criteria are appropriately derived, both the need for and the cost of deriving site-specific criteria can be minimized. The strategy is intended to be cost-effective, i.e., to minimize costs associated with site-specific criteria, by ensuring that most, but not necessarily all, site-specific criteria for a material are higher than the national criterion for the material.
This is a cost-effective strategy because it permits the assumption that if the concentration of a material in a body of water is lower than the national criterion, the aquatic life usually will not be unacceptably affected; thus neither a site-specific criterion nor additional pollution control is needed. This means that site-specific criteria do not have to be derived for most materials in most bodies of water in which the actual concentrations do not exceed national criteria. Any other approach to the relationship of national criteria to site-specific criteria would mean that a site-specific criterion would have to be derived for each body of water in which there was any concern about the concentration of a particular material. However, in order for this strategy to be cost-effective, the Guidelines must not only result in national criteria that are not too high, they must also result in criteria that are not too low. If national criteria are unnecessarily low, too many site-specific criteria will have to be derived. In an attempt to be low enough but not too low, a national criterion is intended to be an appropriate criterion for an aquatic community that is among the most sensitive to the material of concern in water that contains low concentrations of substances that can reduce the toxicity of the material. If the highest acceptable concentration of a material were known for all bodies of water in the United States, the national criterion for the material would be equal to the lowest of these concentrations that was not judged to be an outlier. The second part of the strategy is that by using appropriate procedures for determining the relative sensitivities of various aquatic communities and the relative toxicities of a material in different waters, it is possible to derive many site-specific criteria merely by modifying national criteria.
The strategy, therefore, not only reduces the number of site-specific criteria that are needed, but it also reduces the cost of obtaining many of the site-specific criteria that are needed. Another way in which valid Guidelines will save money is by resulting in better national criteria and better site-specific criteria. If either kind of criterion is derived by different people using different procedures, at least some of the criteria will be much too high and some will be much too low. If criteria are too low, money will be unnecessarily spent on pollution control and the nation will suffer economically. On the other hand, aquatic life will suffer if criteria are too high; and if aquatic life suffers too much, the nation will also suffer. Therefore, it is in the best interest of the nation that criteria be neither excessively high nor excessively low. The important point is that the community of aquatic toxicologists must decide whether national water quality criteria for aquatic life should be derived using some form of Guidelines or whether they should be derived without Guidelines. The Guidelines are based on the dual judgments that Guidelines are both feasible and desirable because valid Guidelines will be in the best interest of both the nation and the aquatic life. Although the questions of feasibility and desirability are partly technical and partly social, aquatic toxicologists must consider them to keep their work in perspective. Developing guidelines for deriving criteria and developing practices for assessing hazard must be both technically feasible and socially desirable if they are to be accepted as useful activities. If the strategy for national and site-specific criteria is to be cost-effective, it is not enough to derive the best possible criterion for a material; it is equally important that each criterion be a good estimate.
If the concept of the Guidelines were merely to derive the best possible criterion based on available data, the Guidelines would provide ways of interpreting whatever data were available. In the extreme, if no results of toxicity tests or bioconcentration tests were available for a material, criteria for that material would be derived by extrapolation based on data on physical and chemical properties, structure, data on related materials, or some combination of all three. Criteria derived merely by doing the best that can be done using existing data are likely to sometimes be much too high and sometimes be much too low, and such criteria would not be cost-effective. To help ensure that criteria are generally good estimates, the concept of required data has been incorporated into the Guidelines. The idea was to define the required data so that when all are available, a good criterion can usually be derived; whereas when all the required data are not available, a criterion usually should not be derived. The distinction between the qualitative process of assessing hazard and the quantitative process of deriving criteria is pertinent here because more data are necessary to be quantitative than to be qualitative. Data currently required by the Guidelines are:

1. acute tests with species in at least eight different families;
2. acute-chronic ratios with at least three species;
3. a test with at least one plant species; and
4. a bioconcentration factor in some cases.

The concept of required data is a means of implementing the idea that criteria can only be cost-effective if they are good estimates. This mechanism is obviously not an ideal solution because good criteria cannot be derived from some sets of data which contain all the required data and, alternatively, good criteria can be derived from some sets that do not contain all the required data.
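The four data requirements listed above can be pictured as a simple completeness check. The following is only a sketch: the class name, field names, and example species are hypothetical, and the text does not say when a bioconcentration factor is needed, so that is represented by a flag the user must set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CriterionData:
    """Hypothetical summary of the data available for one material."""
    acute_test_families: set          # families with an acute test
    acute_chronic_ratio_species: set  # species with an acute-chronic ratio
    plant_test_species: set           # plant species that have been tested
    bcf_needed: bool = False          # "in some cases"; the rule is not given here
    bioconcentration_factor: Optional[float] = None


def missing_required_data(data: CriterionData) -> list:
    """Return the unmet requirements; an empty list means all are met."""
    missing = []
    if len(data.acute_test_families) < 8:
        missing.append("acute tests in at least eight families")
    if len(data.acute_chronic_ratio_species) < 3:
        missing.append("acute-chronic ratios for at least three species")
    if len(data.plant_test_species) < 1:
        missing.append("a test with at least one plant species")
    if data.bcf_needed and data.bioconcentration_factor is None:
        missing.append("a bioconcentration factor")
    return missing


# Illustrative data set: only three families have acute tests.
example = CriterionData(
    acute_test_families={"Salmonidae", "Cyprinidae", "Daphnidae"},
    acute_chronic_ratio_species={"species 1", "species 2", "species 3"},
    plant_test_species={"alga 1"},
)
print(missing_required_data(example))
```

As the text notes, passing such a check does not guarantee a good criterion, and failing it does not rule one out; the check only encodes the minimum stated in the list.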
It would be desirable to have a better means of implementing the concept of "good criteria," but as yet the state-of-the-art has not advanced that far.

What Do the Guidelines Intend to Protect?

It would seem that one of the first steps in assessing hazard or deriving water quality criteria for protecting aquatic life would be to develop a reasonably clear definition of what is meant by "protection". Unfortunately, neither the ASTM drafts on assessing hazard nor the published versions of the Guidelines have dealt with this issue. Over the years many people have expressed opinions about what constitutes adequate protection and the ideas cover a wide gamut. For this reason most aquatic toxicologists probably feel it is prudent to avoid the topic as long as possible. The proper attitude, of course, is that a cost-effective strategy for achieving protection of aquatic life is the ultimate goal of aquatic toxicology and therefore toxicologists must conscientiously work at defining the concept of protection if there is to be any justification for everything else that they do. In order to convince other scientists and the public that aquatic toxicology is a useful activity, aquatic toxicologists must work seriously at defining what constitutes adequate protection. In the water quality-based approach to pollution control, the public decides what uses are to be protected in each body of water. If the use known as "aquatic life" is to be protected, then criteria necessary to protect that use must be incorporated into the water quality standards. One of the important aspects of protection of aquatic life is that it is not enough to protect the presence of aquatic life; its uses must also be protected. The uses are important. If water quality criteria for aquatic life are not designed to protect the uses of aquatic life, the uses may not be protected.
For example, many commercially and recreationally important aquatic species will be useless if they taste so bad that nobody will eat them or if they contain concentrations of materials that exceed FDA action levels. Recreational and commercial fishermen will be quite unhappy if they cannot eat or sell their catch after aquatic toxicologists have said that aquatic life is adequately protected. Another important aspect of protection is that there are very few kinds of aquatic species about which the public is concerned. Judging by the things that people usually complain about, as long as the presence and uses of these few species are not noticeably affected, most people feel that aquatic life is adequately protected. There are both fish and invertebrate species in salt water about which the public is concerned, but most people only care about fish in fresh water. Even so-called extremists - the snail darter types - are named after a fish. Very few people are concerned about adverse effects on various species of aquatic bacteria, fungi, protozoans, phytoplankton and zooplankton unless effects on those species result in unacceptable effects on a commercially or recreationally important species. Thus, real world protection of aquatic life and its uses should be operationally defined in terms of field monitoring of the kinds of species that the public cares about. Such monitoring would of course have to continue for many years to take into account seasonal and annual fluctuations. Also, it would be impossible to adequately monitor some species and would be prohibitively expensive for many others. The practical alternative would be to monitor surrogate species. Appropriate monitoring of desirable and surrogate kinds of species would detect any direct unacceptable effects and would also detect any indirect unacceptable effects that might be caused by such things as loss of a key food organism, a change in energy flow, or a change in a predator-prey relationship.
Such monitoring should be designed to detect effects that the public would consider unacceptable, rather than trying to detect other kinds of effects and extrapolating to effects that the public would consider unacceptable. If appropriately performed on a regular and continuing basis, a well designed monitoring program would detect unacceptable effects regardless of whether they were caused directly or indirectly. My purpose here is not to discuss monitoring programs per se but to express the point of view that an appropriate definition of "protection of aquatic life and its uses" should be based on the kinds of species that most people actually care about. Although this kind of definition of "protection of aquatic life and its uses" will certainly be considered unacceptable by some people, in my opinion the primary goal of aquatic toxicology should not be to protect such things as the function or structure of aquatic ecosystems unless effects on such things result in unacceptable effects on species of concern to the public. In particular, there is no reason to protect a species of bacteria, fungi, phytoplankton or zooplankton if that species can be replaced by one or more other species so that the kinds of species that most people care about are not adversely affected. This concept of protection leads directly to two judgments that are used in the Guidelines. An unstated concept is that the loss of a few or even several species among the lower forms of life, such as bacteria, fungi, and protozoans, is not of concern because there are so many other species that can replace the ones that are lost. In most cases a human-induced shift in the species composition of some lower forms of life will not cause a problem. Also, most lower forms reproduce so rapidly that adaptation and repopulation can take place quickly.
This may apply even to species as high as algae. For example, in the Shayler Run study [9], the test concentration of copper decimated the dominant algal species. However, other species replaced it so well that algal biomass was not reduced, algal diversity increased [10], and no resulting adverse effects on the fish and macroinvertebrates were detected. Eliminating all species of bacteria, fungi, protozoans or algae in a body of water would probably cause unacceptable effects on important species, but harming only a few such species, apparently including some dominant species, will not always cause a problem. A second fundamental judgment, which is stated in the Guidelines, concerns the number of higher species that need to be protected. The rationale is that it is not necessary to protect all species all the time. Aquatic communities can recover from some short-term adverse effects and they can adapt to some long-term adverse effects. The approach generally used in the Green Book, Blue Book, and Red Book was to set criteria so that all species that had been tested would be protected. This approach is usually criticized as resulting in criteria that are too low, but the resulting criteria can be too high if the most sensitive tested species is not as sensitive as some important species. In general, however, this approach will usually be overprotective. The major advantage of this approach is that it is fairly easy to search the literature, find the lowest number, and use it as the criterion. Unfortunately, although it is easy to decide that it is not always necessary to protect all species, this decision raises many difficult questions. The judgment used in the Guidelines is that if acceptable data are available on the toxicity of a material to a variety of appropriate species, the criterion should be set to protect (a) 95 percent of the tested species and (b) all commercially, recreationally, and socially important species.
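The two-part judgment stated above can be pictured with a small check on a candidate concentration. This is a sketch under stated assumptions, not the calculation used in the Guidelines: a species is counted as "protected" when the candidate is below its toxicity value, and the species names and values are hypothetical.

```python
def fraction_protected(tested_values, candidate):
    """Fraction of tested species whose toxicity value exceeds the candidate.

    Assumption: a species counts as protected when the candidate
    concentration is strictly below its toxicity value.
    """
    protected = sum(1 for value in tested_values.values() if value > candidate)
    return protected / len(tested_values)


def satisfies_judgment(tested_values, important_species, candidate, target=0.95):
    """Check (a) the target fraction of tested species and (b) every
    commercially, recreationally, or socially important species."""
    important_ok = all(tested_values[name] > candidate for name in important_species)
    return fraction_protected(tested_values, candidate) >= target and important_ok


# Hypothetical data: twenty tested species with values 1, 2, ..., 20.
tested = {"sp%d" % i: float(i) for i in range(1, 21)}

print(fraction_protected(tested, 1.5))            # 19 of 20 species lie above 1.5
print(satisfies_judgment(tested, ["sp2"], 1.5))   # important species is protected
print(satisfies_judgment(tested, ["sp1"], 1.5))   # most sensitive species is important
```

The last call shows how part (b) can lower the acceptable concentration even when part (a) is already met: if the most sensitive tested species is an important one, a candidate that protects 95 percent of tested species is still not acceptable.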
Qualitative and Quantitative Judgments

As a brief but important digression, let me comment on the vast difference between making philosophical or qualitative judgments and making quantitative judgments. Whereas philosophical and qualitative judgments can be based on rationales, many scientists seem to feel that quantitative judgments must be based on data. Obviously, however, if the appropriate data were available, the desired number could be calculated and the issue would not have to be decided by judgment. The counter argument is that if the necessary data are not available, judgment should not be used as a substitute for data. This is as unrealistic in applied aquatic toxicology as it is in most areas of life. Making assumptions and simplifications is a fact of life. The two times when judgment is clearly inappropriate are when it contradicts the available data and when the alternatives cover such a broad range that any decision is merely a guess and may be very unrealistic. Under reasonable circumstances, however, quantitative judgments can be just as justified as qualitative judgments. The major problem with quantitative judgments, as opposed to qualitative or philosophical judgments, is that people always ask "Why 95? Why not 94 or 96?" or "Why 8? Why not 7 or 9?" The problem, of course, is that because it is a judgment, adequate data are not available to quantitatively justify the decision. For example, 95 was chosen because 90 and 99 resulted in Final Acute Values that seemed to be too high and too low, respectively, when compared to the data sets from which they were calculated. Of the numbers available between 90 and 99, 95 is near the middle and is an easily recognizable number. On the other hand, 8 was chosen because 8 acceptable values were available for many materials, but more than 8 were available for only a few.
Although there is not much of a difference between 7, 8, and 9, all of these are quite different from numbers like 1, 2, and 3, which have been advocated by some people. Hopefully, advances in aquatic toxicology will allow better justifications for these or other numbers, but at this time the Guidelines describe the best available way of deriving national criteria; in addition, it is felt that these criteria are a useful basis for a cost-effective strategy for protection of aquatic life. All of the problems of quantitative judgments could be avoided by not putting that level of detail into the Guidelines. This would give users of the Guidelines lots of flexibility to make appropriate case-by-case decisions. Unfortunately, even experienced aquatic toxicologists have different viewpoints on major and minor decisions. The fewer details there are in the Guidelines, the more variation there will be between criteria derived by different people for the same material. Thus it was necessary to make the Guidelines as detailed as feasible. The best way to decide what level of detail is appropriate in the Guidelines is to listen to the questions asked by people who try to use the Guidelines to derive criteria for aquatic life. The conclusion is that it is difficult to include too much detail. People who favor more flexibility and less detail usually feel either (a) that the Guidelines are biased toward underprotection or overprotection (both claims have been made) or (b) that some situations are so abnormal that predetermined Guidelines cannot adequately deal with them. The latter argument is the reason that some of the details in the Guidelines are explanations as to why criteria should not be derived in some specific situations. The Guidelines do specify that the most important tenet is that criteria should be based on good science. When good science and the Guidelines do not agree, good science must be followed.
Good science is certainly a valid reason for not following the Guidelines, but individual whim is not. There is a big difference between a position based on good science and one based on a personal preference.

Two-number Criteria

One of the major new features of the Guidelines is that a water quality criterion for aquatic life should consist of more than one number. The Blue Book mentioned the idea of two-number criteria, but never actually derived any such criteria. Later, John Eaton [11] proposed values for two-number criteria for some pesticides but the proposal was never adopted. Organisms can usually tolerate higher concentrations for short periods of time than they can for long periods, and very few discharges are of constant quality. If a never-to-be-exceeded, one-number criterion adequately protects aquatic life from long-term exposures, it will over-restrict dischargers by prohibiting short-term higher concentrations that could be tolerated by aquatic life. Similarly, a one-number average criterion will either underprotect aquatic life or overly restrict dischargers. In the worst of all possible cases, a criterion would both overly restrict dischargers and not adequately protect aquatic life. The easy decision to have more than one number in criteria results in the very difficult problem of how to do it. All kinds of combinations of two or more numbers and time periods can be proposed, including graphs. To be realistic, however, an approach must take into account (a) the kinds of toxicological data that are, or are likely to be, available; (b) the differences between aquatic species; and (c) the practicalities of treatment plant operation and monitoring programs faced by dischargers and regulatory agencies.
The simplest alternative to one-number criteria is, of course, two-number criteria and so the Guidelines specify criteria in terms of an average concentration and a maximum concentration. This is judged to be the best that can be done with the kinds of data that are, or are likely to be, available; in addition, a two-number criterion can adequately protect aquatic life without being unfair to dischargers. There are many formats that could be used for two-number criteria and many ways the two numbers might be calculated, so adoption of the idea of a two-number criterion still presents many options. The option used in the Guidelines is intended to be the best way of using available data to obtain the best two numbers that will provide reasonable flexibility to dischargers while also protecting aquatic life from exposures to long-term average concentrations and short-term exposures to higher concentrations. The question of the number of numbers in a criterion is tied directly to the issue of how criteria are used. Criteria do not limit dischargers. Dischargers are limited by effluent limitations, which are sometimes calculated from water quality standards, which in turn are based on water quality criteria. Extrapolating from a criterion to an effluent limitation can be technically complicated, and legal, economic, and social considerations often magnify the level of difficulty. Even if criteria are derived appropriately, standards may be inappropriate if, for example, the wrong use is selected to be protected. Further, even if the standard is appropriate, the effluent limitation may provide more or less protection than needed by the aquatic life. Two-number criteria are derived on the assumption that a discharger might want to use flow-proportional discharge in order to discharge the maximum concentration allowed by the criterion during each period of time.
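The difference between a one-number criterion and a criterion expressed as an average concentration and a maximum concentration can be pictured with a short sketch. The readings, limits, and function name below are hypothetical, and the averaging period is an assumption because the text does not specify one.

```python
def evaluate_two_number(concentrations, average_limit, maximum_limit):
    """Compare a series of measured concentrations with a two-number criterion.

    Assumption: the whole series is treated as one averaging period.
    """
    average = sum(concentrations) / len(concentrations)
    peak = max(concentrations)
    return {
        "average": average,
        "peak": peak,
        "average_ok": average <= average_limit,
        "maximum_ok": peak <= maximum_limit,
    }


# Hypothetical readings: mostly low, with one short-term excursion.
readings = [2.0, 2.0, 2.0, 10.0]

two_number = evaluate_two_number(readings, average_limit=5.0, maximum_limit=12.0)
print(two_number)

# A never-to-be-exceeded one-number criterion set at the same long-term
# value would reject the same readings because of the single excursion.
one_number_ok = max(readings) <= 5.0
print(one_number_ok)
```

In this illustration the readings satisfy both parts of the two-number criterion, while a never-to-be-exceeded limit at the average value would prohibit the short-term excursion, which is the over-restriction described in the text.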
Most dischargers and regulatory agencies do not consider flow-proportional discharge a viable option, and so permit limitations usually allow the discharge of some amount less than the theoretical maximum. The processes of deriving standards and effluent limitations are possibly as complex as the process of deriving water quality criteria; they have to deal with economic and social, as well as technical, issues, and these can have as much bearing on the actual amount of protection afforded aquatic life as the Guidelines. Many of the important judgments concerning how much protection is afforded aquatic life are outside the realm of the Guidelines.

Acute-Chronic Ratios

One of the enduring controversies in aquatic toxicology is the appropriateness of using application factors. The Guidelines cleverly avoid this issue by using acute-chronic ratios instead of application factors. The problem is still the same, however, and the Guidelines permit use of an acute-chronic ratio to derive criteria for a particular material only if enough data are available for that material to justify its use. Because aquatic toxicology cannot yet provide a general answer to this problem, the Guidelines wisely require that at least a minimum amount of pertinent data be available and that the decision be based on data, not theories. Although the original suggestion was that experimentally determined application factors for a material would be the same for all species of fish [12], it has been found that acute-chronic ratios experimentally obtained for a material using different species of fish and invertebrates often increase or decrease as the acute sensitivities of the species increase or decrease. Because criteria are based on the idea of protecting 95% of the tested species, the acute-chronic ratio used must be one that is appropriate to the fifth percentile. The Guidelines place quite stringent limitations on the use of acute-chronic ratios.
The subject of acute-chronic ratios raises a minor but interesting point. Some of the public comment in 1978 stated that geometric means were used instead of arithmetic means in various places in the Guidelines merely to get a lower number. Although it is true that the geometric mean of a set of positive numbers is never higher than the arithmetic mean, it is not true that use of the geometric mean will always result in a lower criterion. In addition, there is usually at least one mathematical rationale for choosing between a geometric mean and an arithmetic mean. As an illustration, assume that both acute and chronic tests have been conducted on a material with four different species with the following results:

Species            Acute Value    Chronic Value    Acute-Chronic    Application
                   (µg/litre)     (µg/litre)       Ratio            Factor
A                  0.6400         0.0800           8.000            0.1250
B                  1000           100.0            10.00            0.1000
C                  320.0          20.00            16.00            0.0625
D                  20.00          1.000            20.00            0.0500
Arithmetic Mean                                    13.50            0.0844
Geometric Mean                                     12.65            0.0791

If the acute value for another species is 100 µg/litre, what is the best estimate of its chronic value? Four calculations are possible using the four means:

                   Acute-Chronic Ratio                       Application Factor
Arithmetic Mean    (100 µg/litre) / 13.50 = 7.41 µg/litre    100 µg/litre x 0.0844 = 8.44 µg/litre
Geometric Mean     (100 µg/litre) / 12.65 = 7.91 µg/litre    100 µg/litre x 0.0791 = 7.91 µg/litre

Note that the answer is the same using the two geometric means, but not the two arithmetic means. This is one reason why it is usually best to use geometric means rather than arithmetic means when dealing with ratios and similar kinds of data. A separate, statistical reason is that ratios are more likely to be lognormally distributed than normally distributed. In this example the way to get the lowest possible criterion would be to use arithmetic means with acute-chronic ratios, except that if application factors are used, then geometric means would give the lowest criterion.
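The arithmetic in this illustration can be reproduced with a short script. The acute and chronic values are those given in the example; the variable and function names are illustrative choices only. The sketch treats the application factor as the reciprocal of the acute-chronic ratio (chronic value divided by acute value), as the tabulated values imply.

```python
from math import prod

# Acute and chronic values (µg/litre) for species A to D, from the example.
acute = {"A": 0.64, "B": 1000.0, "C": 320.0, "D": 20.0}
chronic = {"A": 0.08, "B": 100.0, "C": 20.0, "D": 1.0}


def arithmetic_mean(values):
    values = list(values)
    return sum(values) / len(values)


def geometric_mean(values):
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# Acute-chronic ratio = acute / chronic; application factor = chronic / acute.
ratios = [acute[s] / chronic[s] for s in acute]
factors = [chronic[s] / acute[s] for s in acute]

new_acute = 100.0  # µg/litre, acute value for the additional species

# Estimated chronic value: divide by a mean ratio, or multiply by a mean factor.
estimates = {
    "arithmetic mean, ratio": new_acute / arithmetic_mean(ratios),
    "arithmetic mean, factor": new_acute * arithmetic_mean(factors),
    "geometric mean, ratio": new_acute / geometric_mean(ratios),
    "geometric mean, factor": new_acute * geometric_mean(factors),
}

for label, value in estimates.items():
    print(f"{label:25s} {value:.2f} µg/litre")
```

The two geometric-mean estimates agree because the geometric mean of the reciprocals is the reciprocal of the geometric mean; the arithmetic mean has no such property, so its two estimates differ.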
Decisions concerning the content of the Guidelines should not be based on an attempt to make the resulting criteria as low or as high as possible.

Final Residue Value

Two of the most obvious ways in which the Guidelines might result in Final Residue Values that are too high could be avoided if the list of required data were strengthened. The judgment was, however, that these shortcomings are not serious enough in most cases to make the criteria undesirable or to require additional expensive data. The first area of concern is that data on chronic effects are not available for many important species of wildlife consumers of aquatic life. Without such data, it is impossible to know whether various wildlife species might be unacceptably affected by materials accumulated by aquatic life. The second area of concern is that bioaccumulation factors (BAFs) might be higher than bioconcentration factors (BCFs) for many materials. BCFs are determined in laboratory bioconcentration tests and are intended to measure only net uptake directly from water, although some additional uptake may occur if the food sorbs some of the material before it is eaten by the test organisms. The term BAF is used here to refer to the situation in which the food eaten by the organism is in steady-state with the concentration in the water so that the organisms of concern proportionately accumulate material from both food and water. Whereas a BCF almost has to be measured in a laboratory test, the best way to measure a BAF is in a field situation. For several materials BAFs appear to be higher than BCFs [13-18]. For many materials adequate data are not available concerning either toxicity to wildlife or BAFs or both. Thus for many materials the Final Residue Value either is too high or does not exist at all. It was decided, however, that if the required data were available, it would be better to derive criteria using the available data even if the Final Residue Value might be too high.
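The effect of using a BCF where a higher BAF applies can be sketched numerically. The sketch assumes that a residue value is obtained by dividing a maximum permissible tissue concentration by a BCF or BAF; that formula is not stated in this section and is used here only as a working assumption. The function name and all numbers are hypothetical.

```python
def residue_value(max_tissue_conc, accumulation_factor):
    """Water concentration expected to produce max_tissue_conc at steady state.

    max_tissue_conc:     maximum permissible tissue concentration (mg/kg)
    accumulation_factor: BCF or BAF (litre/kg)
    Returns a water concentration in mg/litre.
    Assumed relationship: tissue concentration = factor x water concentration.
    """
    if accumulation_factor <= 0:
        raise ValueError("accumulation factor must be positive")
    return max_tissue_conc / accumulation_factor


# Hypothetical inputs, chosen only to show the direction of the effect.
max_tissue = 2.0   # mg/kg, e.g. a wildlife-based or regulatory tissue limit
bcf = 10_000.0     # litre/kg, laboratory value (uptake from water only)
baf = 40_000.0     # litre/kg, field value (uptake from food and water)

from_bcf = residue_value(max_tissue, bcf)
from_baf = residue_value(max_tissue, baf)

print(f"residue value using BCF: {from_bcf * 1000:.3f} µg/litre")
print(f"residue value using BAF: {from_baf * 1000:.3f} µg/litre")
print(f"BCF-based value is higher by a factor of {from_bcf / from_baf:.1f}")
```

Under this assumption, a residue value based on a BCF is too high by the ratio of the BAF to the BCF whenever field accumulation exceeds laboratory bioconcentration.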
Even some of the data that are available are not easy to use. Some wildlife studies report that the lowest concentration tested caused an adverse effect. Similarly, FDA action levels might be considered unacceptable concentrations because a Final Residue Value calculated from a BCF or BAF and an FDA action level should result in 50 percent of the organisms exceeding the FDA action level. More importantly, if the BCF or BAF is an average of values for different species, all the individuals of some species may exceed the FDA action level.

A common way of dealing with situations in which data are lacking or incomplete is to use safety or uncertainty factors. Mammalian toxicologists routinely use factors of 10, 100, and 1000 [6], but such factors have not become accepted in aquatic toxicology. Safety factors are not used in the Guidelines because the implications of national criteria are so great that safety factors are not considered cost-effective and are not technically justifiable. When available data do not allow adequate confidence in a criterion, the only acceptable alternatives are either to obtain additional information or to not derive a criterion.

Summary

Because numerous judgments were made during the development of the Guidelines, this discussion has only dealt with the major philosophical issues that determine the overall nature of the Guidelines and with a few representative important technical issues to show how various kinds of decisions were made. In addition, the validity of a water quality criterion for a material depends just as much on the validity of numerous detailed technical decisions concerning that material as it does on the validity of the Guidelines. Anyone who tries to use the Guidelines quickly finds that criteria cannot be derived mechanically. Numerous "small" decisions must be made and some of these can substantially affect the resulting criterion.
The Guidelines provide a framework for deriving criteria and they attempt to establish an attitude toward derivation of water quality criteria for aquatic life, but criteria still must be derived by people who are both conscientious and competent. It is to be hoped that a better understanding of the Guidelines will result in increased confidence in the resulting criteria and will help interested persons ask questions and make suggestions that will help improve the Guidelines and the resulting criteria. In addition to resulting in better national and site-specific criteria at less cost, the Guidelines have resulted in a better understanding of the relationships between various areas of aquatic toxicology and have resulted in the formulation of ideas that ought to be tested.

Acknowledgments

The committee consisting of Don Mount, Dave Hansen, Jack Gentile, Gary Chapman, Bill Brungs and Charles Stephan developed the Guidelines, with input from many other people. Various people at the U.S. EPA's Environmental Research Laboratories in Corvallis, Oregon; Duluth, Minnesota; Gulf Breeze, Florida; and Narragansett, Rhode Island provided most of the input on the technical content of various aquatic life criteria documents. All of these people contributed to this paper, but none of them necessarily agree with anything contained herein.

References

[1] National Technical Advisory Committee, Water Quality Criteria, Federal Water Pollution Control Administration, Washington, D. C., 1968.
[2] National Academy of Sciences-National Academy of Engineering, Water Quality Criteria 1972, EPA-R3-73-033, U.S. Environmental Protection Agency, Washington, D. C., 1973.
[3] U.S. Environmental Protection Agency, Quality Criteria for Water, Washington, D. C., 1976.
[4] U.S. Environmental Protection Agency, Federal Register, Vol. 43, No. 97, May 18, 1978, pp. 21506-21518.
[5] U.S. Environmental Protection Agency, Federal Register, Vol. 44, No. 52, March 15, 1979, pp.
15926-15981.
[6] U.S. Environmental Protection Agency, Federal Register, Vol. 45, No. 231, November 28, 1980, pp. 79318-79379.
[7] U.S. Environmental Protection Agency, Federal Register, Vol. 47, No. 210, October 29, 1982, pp. 49234-49252.
[8] Stephan, C. E. in Aquatic Toxicology and Hazard Assessment, ASTM STP 766, American Society for Testing and Materials, Philadelphia, 1982, pp. 69-81.
[9] Weber, C. I. and McFarland, B. H. in Ecological Assessments of Effluent Impacts on Communities of Indigenous Aquatic Organisms, ASTM STP 730, American Society for Testing and Materials, Philadelphia, 1981, pp. 101-131.
[10] Geckler, J. R., et al., "Validity of Laboratory Tests for Predicting Copper Toxicity in Streams," EPA-600/3-76-116, National Technical Information Service, Springfield, Va., December 1976.
[11] Eaton, J. G., Personal communication, U.S. EPA, Duluth, Minnesota.
[12] Mount, D. I. and Stephan, C. E., Transactions of the American Fisheries Society, Vol. 96, No. 2, April 1967, pp. 183-193.
[13] Macek, K. J., et al., Aquatic Toxicology, ASTM STP 667, American Society for Testing and Materials, Philadelphia, 1979, pp. 251-268.
[14] Bahner, L. H., et al., Chesapeake Science, Vol. 18, 1977, pp. 299-308.
[15] Jarvinen, A. W., and Tyo, R. M., Archives of Environmental Contamination and Toxicology, Vol. 7, 1978, pp. 409-421.
[16] U.S. Environmental Protection Agency, "Ambient Water Quality Criteria for Polychlorinated Biphenyls," EPA-440/5-80-068, National Technical Information Service, Springfield, Va., 1980, pp. B7-B10.
[17] Boudou, A., et al., Bulletin of Environmental Contamination and Toxicology, Vol. 22, 1979, pp. 813-818.
[18] Phillips, G. R., and Buhler, D. R., Transactions of the American Fisheries Society, Vol. 107, 1978, pp. 853-861.

TECHNICAL REPORT DATA
Report No.: EPA-600/D-84-251
Recipient's Accession No.: PB85-114072