Breakout Session Debriefings
Breakout Session II: Freshwaters III
Discussion Chair: Kevin Farley, Manhattan College

Presented at the Workshop on Water Quality Modeling for National-Scale Economic Benefit Assessment
Washington, DC, February 9-10, 2005

DISCLAIMER

These proceedings have been prepared by Alpha-Gamma Technologies, Inc. under Contract No. 68-W-01-055 for the United States Environmental Protection Agency, Office of Water. These proceedings have been funded by the United States Environmental Protection Agency. The contents of this document do not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

Freshwaters III Breakout Group Participants

Bondelid, Tim (Research Triangle Institute, Inc.)
Bruins, Randy (US EPA, Office of Research and Development)
Clesceri, Nick (Rensselaer Polytechnic Institute)
Cocca, Paul (US EPA, Office of Water)
Corona, Joel (US EPA, Office of Water)
Crawford, Charles (US Geological Survey)
Dabolt, Thomas (US EPA, Office of Water)
Di Luzio, Mauro (Texas Agricultural Experiment Station)
Farley, Kevin (Manhattan College)
Grevatt, Peter (US EPA, Office of Wetlands, Oceans, and Watersheds)
Griffiths, Charles (US EPA, Office of Policy, Economics and Innovation)
Hayes, Sharon (US EPA, Office of Water)
Pendergast, Jim (US EPA, Office of Water)
Reckhow, Ken (Duke University)
Schwarz, Gregory (US Geological Survey)
Shenk, Gary (US EPA, Chesapeake Bay Program)
Shih, Jhih-Shyang (Resources for the Future)
Stedge, Gerald (Abt Associates, Inc.)
Wellman, Marjorie (US EPA, Office of Water)
Weltz, Mark (USDA, Agricultural Research Service)
Zipf, Lynn (US EPA, Office of Water)

Farley: Debriefings from Breakout Session II, Freshwaters III
February 9, 2005

Freshwaters III Group—Kevin Farley, Discussion Chair

Slide #1 "Freshwater III, Tolerance" (Tape 1A, tape counter starts at 1921)

Our charge today was to look at, in one word, "tolerance"—in terms of model tolerance. We didn't start off by doing our charge; we took the liberty, based on how Freshwater I and II handled their charges yesterday, and figured we had a little bit of time to talk about some other things. (laughter)

Slide #2 "Reflections on Scope" (Tape counter starts at 1937)

What we did first was actually reflect on yesterday, when our charge was to look at scope. What we were doing, I guess, was trying to put the overall objective of the workshop in context. In thinking about it and looking at what should be involved in an economic assessment, there are four parts, modeling being just one of the four. There is a strong component that involves just looking at the data that you have for these types of assessments. Then there are the modeling tools and modeling platforms that you use to carry out these calculations. The third component is the link between criteria and environmental services, and then, lastly, valuation.

It was interesting to see the first two speakers get up and say that their groups spent a fair amount of time trying to answer the question of criteria and environmental services. In our group, we deemed that it was probably more appropriate to handle that in a separate workshop, just because of the difficulties in trying to make that connection.
But the importance of that link can't be stressed enough: in order to carry water quality modeling through the whole process of doing an economic assessment, you have to have a very strong linkage between the criteria we're going to use and how we relate them directly to environmental services and the valuation of those environmental services.

So we focused, then, on the modeling issue, and the one thing that people in the room kept pointing out is to always remember the final question—that what we're trying to do is demonstrate the benefits of current programs. So, in terms of looking at tolerance, or things like how accurate or precise our models are, we should be thinking of it in terms of where we stand relative to a policy decision. If it's clear that you don't see improvements, then you probably don't have to run a high-resolution model just to confirm that you're going to get less than a 5% benefit. However, when you're in that range between 10% and 80% or 90%, then these questions of accuracy and precision become much more important in the overall analysis.

Slide #3 "Tolerance: Model Accuracy" (Tape counter starts at 2036)

So, with that, we first talked about model accuracy—or, if you want to think of it this way, how well the models are able to describe the central tendency of the data that are collected. I guess implied in that is something related to what Dominic (Di Toro) had brought up and was discussed yesterday: we saw the idea of applying these models on a site-specific basis as being a very important part of the validation process. Even if you were able to think about validating a national-scale model, just trying to do that would probably be too expensive. The way we should be thinking about it is to have a good number of case studies where we can identify the accuracy and the level of precision that we have in the different modeling approaches.

With that, we talked about different water quality variables and tried to think about what types of accuracy we have, and we came to the conclusion that it's going to be very different for each parameter that you're going to consider. From my background of working on toxic chemicals, we often talk about whether or not a model calculation passes a "factor of two" test—whether we're within, say, a factor of two of an actual observation. People in the room said that with pathogens, maybe if you're within an order of magnitude you should be happy. Then with DO, within a factor of two there's a big difference between a DO of 6 mg/L and 3 mg/L, so you're going to have to think in terms of a much higher degree of accuracy in those cases.

In terms of how we should look at accuracy, a lot of times we look at it as a point-by-point comparison, and in this case, thinking of it in terms of a national assessment, we would just be applying models that have pre-determined coefficients. So we discussed ways of improving accuracy. The one that we always think of is site-specific calibration, so that you can hone in on what the data at your particular site are showing. But the other thing that came up, which is probably more relevant for a national assessment, is how we think about averaging—whether we do spatial or temporal averaging of the data or of the model realizations we're considering. So, again, there is this whole question of whether it is sufficient to work with average annual concentrations in a system. As you do average annual, you don't worry so much about those daily fluctuations, but you may be able to do a very good job of describing that average annual condition.
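As a concrete illustration of the parameter-specific accuracy screens just described, here is a minimal sketch in Python. The tolerance rules, parameter names, and paired values are purely illustrative assumptions for this sketch (a factor of two for a toxic, an order of magnitude for pathogens, and a tighter absolute band for DO); they are not recommendations from the group.

```python
# Minimal sketch (illustrative only): a parameter-specific accuracy screen of the kind
# described above. The tolerance rules and data below are hypothetical placeholders.

TOLERANCES = {
    "copper":    {"type": "factor", "value": 2.0},    # "factor of two" test for a toxic
    "pathogens": {"type": "factor", "value": 10.0},   # within an order of magnitude
    "DO":        {"type": "absolute", "value": 1.0},  # +/- 1 mg/L, a tighter screen
}

def within_tolerance(predicted, observed, rule):
    """True if a single prediction passes the tolerance rule for its parameter."""
    if rule["type"] == "factor":
        if predicted <= 0 or observed <= 0:   # guard before taking a ratio
            return False
        return max(predicted / observed, observed / predicted) <= rule["value"]
    return abs(predicted - observed) <= rule["value"]

def accuracy_screen(parameter, pairs):
    """Fraction of (predicted, observed) pairs that pass the parameter's tolerance rule."""
    rule = TOLERANCES[parameter]
    return sum(within_tolerance(p, o, rule) for p, o in pairs) / len(pairs)

# Paired annual-average predictions vs. observations at a few sites (made-up numbers).
do_pairs = [(6.1, 5.4), (7.0, 6.8), (4.2, 5.9)]
print(f"DO predictions within tolerance: {accuracy_screen('DO', do_pairs):.0%}")
```

A real screen would, of course, be run against the agreed criteria and averaging periods for each water quality variable; the point here is only that the pass/fail rule differs by parameter.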
As Dominic (Di Toro) brought up, however, in a lot of cases average annual conditions aren't sufficient—we really want to know things on a probabilistic scale. In that case, maybe instead of making point-by-point comparisons, we should be looking at how a calculated model distribution compares to an observed distribution. That's probably the most useful way of handling a lot of the issues that the Agency has to address. With this, though, realize that for each specific question you're going to want to handle this differently, so your selected metric should be consistent with the specific policy decisions and economic benefit measures that you're going to consider.

The one thing we didn't talk about—I had it on my list, and we did look at it later and talk about it in terms of model calibration, and I mentioned validation, but—how do you do validation for these things? I guess an important part, just to underscore it, is to make sure that you do these validations at a number of sites, such that you gain confidence to apply this on a national scale.

Slide #4 "Tolerance (cont.)—Model Precision" (Tape counter starts at 2198)

Turning to precision, or how much explained or unexplained variability there is in your system: we first had a discussion about how you can make the distinction between explained and unexplained variability, and the one conclusion we came to here is that when you have explained variability, the way you can improve the precision of the model is to increase model resolution. This came up yesterday with the coastal group with respect to hydrodynamics in a system. If you want to get a more accurate representation of what goes on in an estuary, you have to include hydrodynamic modeling, and that automatically means that you have to go to finer time and space scales. As I talked about yesterday, in addition to thinking about resolution in terms of time and space scales, I'm going to add that third category of reaction resolution—just how complex the chemical and biochemical descriptions in our models become. So, again, if you want to become more precise, you're going to have to be willing to go to more complex models.

Slide #5 "Tolerance (cont.)—Handling Model Uncertainty" (Tape counter starts at 2244)

The last part of this, which in some ways I thought was most interesting, was the whole question of "How do we handle model uncertainty as we pass the information from a water quality model up to an economic evaluation?" What was clear from the economists in the room is that they were looking for not just an average value; rather, they wanted information on distributions, which underscores this whole idea of looking at probability and passing uncertainty up through the calculation.

Different ways were identified in which we could handle model uncertainty. One is to use analytical approaches, but most of us felt that we weren't, I guess, smart enough to actually know how to apply some of the analytical approaches or develop them—particularly as models get to be more non-linear, it gets to be pretty challenging to have an analytical approach that would work. More often, you are then left with doing Monte Carlo simulations to get some idea of the probability, or the distributions, that would be involved in your water quality parameters. As part of the Monte Carlo discussion, we talked about using inverse techniques to develop model coefficients—in this case, using something like PEST, where in calibrating the model you use PEST to determine many different sets of calibration parameters that can all describe your field observations. Then, when you go forward and look at your projections, you run them using all of those different combinations of calibration parameters, so that you get a distribution of values that, again, you pass forward as a distribution for the economic evaluation.
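To make that multiple-calibration idea concrete, here is a minimal sketch under stated assumptions: a toy steady-state expression stands in for a real calibrated water quality model, and the handful of parameter sets stands in for the (typically much larger) family of PEST-style calibrations. The model form, parameter names, and numbers are all hypothetical.

```python
# Minimal sketch (hypothetical model and numbers): propagate calibration uncertainty by
# running the same projection with each of several equally plausible calibration sets,
# then pass the resulting spread forward instead of a single mean.

import statistics

def projected_concentration(load_kg_per_day, params):
    """Toy steady-state response: concentration falls as flow, decay, and residence time rise."""
    return load_kg_per_day / (params["flow"] * (1.0 + params["decay"] * params["residence_time"]))

# Stand-ins for parameter sets produced by an inverse calibration (e.g., PEST), each of
# which fits the field observations about equally well.
calibration_sets = [
    {"flow": 42.0, "decay": 0.10, "residence_time": 12.0},
    {"flow": 38.5, "decay": 0.14, "residence_time": 10.5},
    {"flow": 45.0, "decay": 0.08, "residence_time": 13.0},
    {"flow": 40.2, "decay": 0.12, "residence_time": 11.2},
]

def project_scenario(load_kg_per_day):
    """One projection per calibration set; the spread across sets is what gets passed forward."""
    return sorted(projected_concentration(load_kg_per_day, p) for p in calibration_sets)

# Projection under a reduced-load scenario.
dist = project_scenario(load_kg_per_day=500.0)
print("projected concentrations:", [round(c, 2) for c in dist])
print("mean:", round(statistics.mean(dist), 2))
```

In practice there would be far more calibration sets and the projection would come from one of the actual candidate models; the point is simply that a distribution, not a single number, is handed to the economic evaluation.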
The difficulty with all of these approaches is that they add complexity to the analysis, and computer time, that in some cases probably isn't necessary, particularly if you're going through this on a first go-around. So we did want to offer one simple approach that you could use, and that is: whatever model you're using, whether it's on an average annual basis or has certain spatial averaging going on in it, you just acknowledge the fact that you're only going to try to describe the mean, and that the field data are going to show you some variability about that mean. Instead of trying to explicitly model that variability, what if we just took the information from analyzing the field data and passed it forward for the economic evaluation, along with our new calculation of the mean? So we would take something like the coefficient of variation from the observed distribution, assume a log-normal distribution, apply it around the new mean that we're calculating, and let the economists do the analysis that way. So, again, it may be a good first approach for a lot of the different problems that you're trying to solve.
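A minimal sketch of that simple approach, with entirely made-up numbers for the coefficient of variation, the modeled means, and the criterion: the observed CV fixes the shape of an assumed log-normal distribution, the model supplies the baseline and with-program means, and what gets passed forward is a distribution summary such as the probability of exceeding a criterion.

```python
# Minimal sketch (illustrative assumptions throughout): the "simple approach" described
# above. Take the coefficient of variation (CV) seen in the field data, assume a
# log-normal distribution, center it on the modeled mean, and pass the resulting
# distribution (here summarized as an exceedance probability) forward for valuation.

import math

def lognormal_params(mean, cv):
    """Convert a mean and CV into the mu/sigma of the underlying normal distribution."""
    sigma2 = math.log(1.0 + cv**2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def prob_exceeds(threshold, mean, cv):
    """P(concentration > threshold) for a log-normal with the given mean and CV."""
    mu, sigma = lognormal_params(mean, cv)
    z = (math.log(threshold) - mu) / sigma
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical numbers: field data show CV ~ 0.6 about the mean; the model predicts a
# baseline mean of 1.2 mg/L and a with-program mean of 0.8 mg/L; criterion is 1.0 mg/L.
cv, criterion = 0.6, 1.0
for label, mean in [("baseline", 1.2), ("with program", 0.8)]:
    print(f"{label}: P(exceeds {criterion} mg/L) = {prob_exceeds(criterion, mean, cv):.2f}")
```

The same two quantities (mean and CV) could just as easily be handed forward directly, so that the economic analysis works with the full assumed distribution rather than a single summary.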
Lastly, as I was trying to type the slides together, the group kept going. We didn't really get to talk about how the specific models—the four models that we're talking about—really address these issues of tolerance. As I was overhearing the discussion while typing, it was clear that SPARROW has a lot of interesting aspects that hit on some of the things we talk about with tolerance. Instead of my trying to relay that to you, I'm just going to ask Greg (Schwarz) to come up and talk for a minute or two about what SPARROW can do in terms of tolerance and generating distributions.

Greg Schwarz, USGS (Tape counter starts at 2390)

Actually, I want to mention a couple of things. First of all, SPARROW can do all of these techniques. It does it all as part of the estimation process, and it's all inherent in the statistical analysis of a model, so it can handle uncertainty really well—statistical models are designed that way.

I wanted to make one other point with regard to a major topic of discussion in our group, which was how uncertainty is measured in a model. I'm going to put up a quantitative model—hopefully this doesn't alarm you too much. (He writes this equation on the board:)

WQ = f(x, θ) + ε

We have some variable, water quality, that we're relating to some variables in the model—land use or other things—and then we have some parameters, and then there's an error term. This is a statistical model, but I think it's descriptive of the deterministic model as well.

A lot of the discussion here was on how well we're predicting water quality—that's the measure of uncertainty. I think that's probably germane to a lot of the issues that were discussed here, primarily cases where we're trying to determine the metrics of water quality improvement—GPRA measures and things like that.

In this case, the kind of thing that you want for estimating good parameters in a statistical model is more data. So going to higher resolution is not necessarily the answer to getting better estimates of these parameters. In the case of SPARROW, we see a big difference with the scale of the model. Local-scale models don't have a lot of variation in these input variables, and consequently often don't measure these parameters all that well. A national-scale analysis, which has a lot of variation in conditions—a lot of differences in atmospheric deposition, for example—allows you to get much better estimates of the parameters that are going to be important for use in a policy analysis.

Q&A

Dominic Di Toro, University of Delaware (Tape 1B, Tape counter starts at 100)

I just wanted to make a comment that I forgot to make in my presentation. The presumption we were working on was that these models are being run by professional water quality modelers—that they're not being run by whoever isn't doing anything in the office that day. It is, I think, crazy to think that at this level of analysis anyone can turn on these models and make them work, even the simplest of models, because if you're not professionally trained... who would buy a GC-mass spec and then go look around for who isn't busy and say, "Why don't you go run that thing today?" (laughter) Unfortunately, that happens a fair amount in the modeling world, so I caution you that all of this is predicated on the idea that you actually have somebody who knows what they're doing pushing the buttons. It's only reasonable to assume that, and I know for a fact, of course, that mostly that's not true.

Presentation Slides

Freshwater III: Tolerance
- Four Parts of Economic Assessment: Data; Modeling; Criteria/Environmental Services; Valuation
- Can we demonstrate benefits of current programs?

Tolerance
- Model Accuracy
  - Different for each WQ variable
  - Can improve accuracy by: site-specific calibration; spatial and temporal averaging; distributions
  - Selected metric (e.g., use of distributions) should be consistent with the specific policy decision and economic benefit measures

Tolerance (continued)
- Model Precision
  - Explained and unexplained variability
  - For explained variability, increase model resolution (e.g., time, space, reaction)

Tolerance (continued)
- Handling model uncertainty
  - Analytical approaches
  - Monte Carlo simulations
  - Inverse techniques (e.g., PEST): develop multiple calibrations; run multiple projections
  - Simple approach: pass observed (field data) distribution forward