Breakout Session Debriefings
Breakout Session II: Freshwaters III
Discussion Chair: Kevin Farley, Manhattan College

Presented at the Workshop on Water Quality Modeling for National-Scale Economic Benefit Assessment
Washington, DC, February 9-10, 2005

DISCLAIMER

These proceedings have been prepared by Alpha-Gamma Technologies, Inc. under Contract No. 68-W-01-055 for the United States Environmental Protection Agency, Office of Water. These proceedings have been funded by the United States Environmental Protection Agency. The contents of this document do not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

Freshwaters III Breakout Group Participants

Bondelid, Tim (Research Triangle Institute, Inc.)
Bruins, Randy (US EPA, Office of Research and Development)
Clesceri, Nick (Rensselaer Polytechnic Institute)
Cocca, Paul (US EPA, Office of Water)
Corona, Joel (US EPA, Office of Water)
Crawford, Charles (US Geological Survey)
Dabolt, Thomas (US EPA, Office of Water)
Di Luzio, Mauro (Texas Agricultural Experiment Station)
Farley, Kevin (Manhattan College)
Grevatt, Peter (US EPA, Office of Wetlands, Oceans, and Watersheds)
Griffiths, Charles (US EPA, Office of Policy, Economics and Innovation)
Hayes, Sharon (US EPA, Office of Water)
Pendergast, Jim (US EPA, Office of Water)
Reckhow, Ken (Duke University)
Schwarz, Gregory (US Geological Survey)
Shenk, Gary (US EPA, Chesapeake Bay Program)
Shih, Jhih-Shyang (Resources for the Future)
Stedge, Gerald (Abt Associates, Inc.)
Wellman, Marjorie (US EPA, Office of Water)
Weltz, Mark (USDA, Agricultural Research Service)
Zipf, Lynn (US EPA, Office of Water)

Farley: Debriefings from Breakout Session II, Freshwaters III
February 9, 2005

Freshwaters III Group—Kevin Farley, Discussion Chair

Slide #1 "Freshwater III, Tolerance" (Tape 1A, tape counter starts at 1921)

Our charge today was to look at, in one word, "tolerance"—in terms of model tolerance. We didn't start off by doing our charge; we took the liberty, based on how Freshwater I and II handled their charges yesterday, and figured we had a little bit of time to talk about some other things. (laughter)

Slide #2 "Reflections on Scope" (Tape counter starts at 1937)

What we did first was actually reflect on yesterday, when our charge was to look at scope. What we were doing, I guess, was trying to put the overall objective of the workshop in context. In thinking about it and looking at what should be involved in an economic assessment, there are four parts, modeling being just one of the four. There is a strong component that involves just looking at the data that you have for these types of assessments. Then there are the modeling tools and modeling platforms that you use to carry out these calculations. The third component is the link between criteria and environmental services, and then, lastly, valuation.

It was interesting to see the first two speakers get up and say that their groups spent a fair amount of time trying to answer the question of criteria and environmental services. In our group, we deemed that it was probably more appropriate to handle that in a separate workshop, just because of the difficulties in trying to make that connection.
But the importance of that link can't be stressed enough: in order to carry water quality modeling through the whole process of doing an economic assessment, you have to have a very strong linkage between the criteria we're going to use and how we relate them directly to environmental services and the valuation of those environmental services.

So we focused, then, on the modeling issue, and the one thing that people in the room kept pointing out is to always remember the final question—that what we're trying to do is demonstrate the benefits of current programs. So, in terms of looking at tolerance, or things like how accurate or precise our models are, we should be thinking of it in terms of where we stand relative to a policy decision. If it's clear that you don't see improvements, then you probably don't have to run a high-resolution model just to confirm that you're going to get less than a 5% benefit. However, when you're in that range between 10% and 80% or 90%, then these questions of accuracy and precision become much more important in the overall analysis.

Slide #3 "Tolerance: Model Accuracy" (Tape counter starts at 2036)

So, with that, we first talked about model accuracy—or, if you want to think of it this way, how well the models are able to describe the central tendency of the data that are collected. I guess implied in that is something related to what Dominic (Di Toro) had brought up and was discussed yesterday: we saw the idea of applying these models on a site-specific basis as being a very important part of the validation process. Even if you were able to think about validating a national-scale model, just trying to do that would probably be too expensive. The way we should be thinking about it is to have a good number of case studies where we can identify the accuracy and the level of precision that we have in the different modeling approaches.

With that, we talked about different water quality variables and tried to think about what types of accuracy we have, and we came to the conclusion that it's going to be very different for each parameter that you're going to consider. From my background of working on toxic chemicals, we often talk about whether or not a model calculation passes a "factor of two" test—whether we're within, say, a factor of two of an actual observation. People in the room said that with pathogens, maybe if you're within an order of magnitude you should be happy. Then with DO, within a factor of two there's a big difference between a DO of 6 mg/L and 3 mg/L, so you're going to have to think in terms of a much higher degree of accuracy in those cases.

In terms of how we should look at accuracy, a lot of times we look at it as a point-by-point comparison, and in this case, thinking of it in terms of a national assessment, we would just be applying models that have pre-determined coefficients. So we discussed ways of improving accuracy. The one that we always think of is site-specific calibration, so that you can hone in on what the data at your particular site are showing. But the other thing that came up, which is probably more relevant for a national assessment, is how we think about averaging—whether we do spatial or temporal averaging of the data or of the model realizations we're considering. So, again, there is this whole question of whether it is sufficient to work with average annual concentrations in a system. As you do average annual, you don't worry so much about those daily fluctuations, but you may be able to do a very good job of describing that average annual condition.
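As a concrete illustration of the parameter-specific accuracy screens just described, here is a minimal sketch in Python. The tolerance rules, parameter names, and paired values are purely illustrative assumptions for this sketch (a factor of two for a toxic, an order of magnitude for pathogens, and a tighter absolute band for DO); they are not recommendations from the group.

```python
# Minimal sketch (illustrative only): a parameter-specific accuracy screen of the kind
# described above. The tolerance rules and data below are hypothetical placeholders.

TOLERANCES = {
    "copper":    {"type": "factor", "value": 2.0},    # "factor of two" test for a toxic
    "pathogens": {"type": "factor", "value": 10.0},   # within an order of magnitude
    "DO":        {"type": "absolute", "value": 1.0},  # +/- 1 mg/L, a tighter screen
}

def within_tolerance(predicted, observed, rule):
    """True if a single prediction passes the tolerance rule for its parameter."""
    if rule["type"] == "factor":
        if predicted <= 0 or observed <= 0:   # guard before taking a ratio
            return False
        return max(predicted / observed, observed / predicted) <= rule["value"]
    return abs(predicted - observed) <= rule["value"]

def accuracy_screen(parameter, pairs):
    """Fraction of (predicted, observed) pairs that pass the parameter's tolerance rule."""
    rule = TOLERANCES[parameter]
    return sum(within_tolerance(p, o, rule) for p, o in pairs) / len(pairs)

# Paired annual-average predictions vs. observations at a few sites (made-up numbers).
do_pairs = [(6.1, 5.4), (7.0, 6.8), (4.2, 5.9)]
print(f"DO predictions within tolerance: {accuracy_screen('DO', do_pairs):.0%}")
```

A real screen would, of course, be run against the agreed criteria and averaging periods for each water quality variable; the point here is only that the pass/fail rule differs by parameter.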
As Dominic (Di Toro) brought up, however, in a lot of cases average annual conditions aren't sufficient—we really want to know things on a probabilistic scale. In that case, maybe instead of making point-by-point comparisons, we should be looking at how a calculated model distribution compares to an observed distribution. That's probably the most useful way of handling a lot of the issues that the Agency has to address. With this, though, realize that for each specific question you're going to want to handle this differently, so your selected metric should be consistent with the specific policy decisions and economic benefit measures that you're going to consider.

The one thing we didn't talk about—I had it on my list, and we did look at it later and talk about it in terms of model calibration, and I mentioned validation, but—how do you do validation for these things? I guess an important part, just to underscore it, is to make sure that you do these validations at a number of sites, such that you gain confidence to apply this on a national scale.

Slide #4 "Tolerance (cont.)—Model Precision" (Tape counter starts at 2198)

Turning to precision, or how much explained or unexplained variability there is in your system: we first had a discussion about how you can make the distinction between explained and unexplained variability, and the one conclusion we came to here is that when you have explained variability, the way you can improve the precision of the model is to increase model resolution. This came up yesterday with the coastal group with respect to hydrodynamics in a system. If you want to get a more accurate representation of what goes on in an estuary, you have to include hydrodynamic modeling, and that automatically means that you have to go to finer time and space scales. As I talked about yesterday, in addition to thinking about resolution in terms of time and space scales, I'm going to add that third category of reaction resolution—just how complex the chemical and biochemical descriptions in our models become. So, again, if you want to become more precise, you're going to have to be willing to go to more complex models.

Slide #5 "Tolerance (cont.)—Handling Model Uncertainty" (Tape counter starts at 2244)

The last part of this, which in some ways I thought was most interesting, was the whole question of "How do we handle model uncertainty as we pass the information from a water quality model up to an economic evaluation?" What was clear from the economists in the room is that they were looking for not just an average value; rather, they wanted information on distributions, which underscores this whole idea of looking at probability and passing uncertainty up through the calculation.

Different ways were identified in which we could handle model uncertainty. One is to use analytical approaches, but most of us felt that we weren't, I guess, smart enough to actually know how to apply some of the analytical approaches or develop them—particularly as models get to be more non-linear, it gets to be pretty challenging to have an analytical approach that would work. More often, you are then left with doing Monte Carlo simulations to get some idea of the probability, or the distributions, that would be involved in your water quality parameters. As part of the Monte Carlo discussion, we talked about using inverse techniques to develop model coefficients—in this case, using something like PEST, where in calibrating the model you use PEST to determine many different sets of calibration parameters that can all describe your field observations. Then, when you go forward and look at your projections, you run them using all of those different combinations of calibration parameters, so that you get a distribution of values that, again, you pass forward as a distribution for the economic evaluation.
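To make that multiple-calibration idea concrete, here is a minimal sketch under stated assumptions: a toy steady-state expression stands in for a real calibrated water quality model, and the handful of parameter sets stands in for the (typically much larger) family of PEST-style calibrations. The model form, parameter names, and numbers are all hypothetical.

```python
# Minimal sketch (hypothetical model and numbers): propagate calibration uncertainty by
# running the same projection with each of several equally plausible calibration sets,
# then pass the resulting spread forward instead of a single mean.

import statistics

def projected_concentration(load_kg_per_day, params):
    """Toy steady-state response: concentration falls as flow, decay, and residence time rise."""
    return load_kg_per_day / (params["flow"] * (1.0 + params["decay"] * params["residence_time"]))

# Stand-ins for parameter sets produced by an inverse calibration (e.g., PEST), each of
# which fits the field observations about equally well.
calibration_sets = [
    {"flow": 42.0, "decay": 0.10, "residence_time": 12.0},
    {"flow": 38.5, "decay": 0.14, "residence_time": 10.5},
    {"flow": 45.0, "decay": 0.08, "residence_time": 13.0},
    {"flow": 40.2, "decay": 0.12, "residence_time": 11.2},
]

def project_scenario(load_kg_per_day):
    """One projection per calibration set; the spread across sets is what gets passed forward."""
    return sorted(projected_concentration(load_kg_per_day, p) for p in calibration_sets)

# Projection under a reduced-load scenario.
dist = project_scenario(load_kg_per_day=500.0)
print("projected concentrations:", [round(c, 2) for c in dist])
print("mean:", round(statistics.mean(dist), 2))
```

In practice there would be far more calibration sets and the projection would come from one of the actual candidate models; the point is simply that a distribution, not a single number, is handed to the economic evaluation.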
The difficulty with all of these approaches is that they add complexity to the analysis, and computer time, that in some cases probably isn't necessary, particularly if you're going through this on a first go-around. So we did want to offer one simple approach that you could use, and that is: whatever model you're using, whether it's on an average annual basis or has certain spatial averaging going on in it, you just acknowledge the fact that you're only going to try to describe the mean, and that the field data are going to show you some variability about that mean. Instead of trying to explicitly model that variability, what if we just took the information from analyzing the field data and passed it forward for the economic evaluation, along with our new calculation of the mean? So we would take something like the coefficient of variation from the observed distribution, assume a log-normal distribution, apply it around the new mean that we're calculating, and let the economists do the analysis that way. So, again, it may be a good first approach for a lot of the different problems that you're trying to solve.
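A minimal sketch of that simple approach, with entirely made-up numbers for the coefficient of variation, the modeled means, and the criterion: the observed CV fixes the shape of an assumed log-normal distribution, the model supplies the baseline and with-program means, and what gets passed forward is a distribution summary such as the probability of exceeding a criterion.

```python
# Minimal sketch (illustrative assumptions throughout): the "simple approach" described
# above. Take the coefficient of variation (CV) seen in the field data, assume a
# log-normal distribution, center it on the modeled mean, and pass the resulting
# distribution (here summarized as an exceedance probability) forward for valuation.

import math

def lognormal_params(mean, cv):
    """Convert a mean and CV into the mu/sigma of the underlying normal distribution."""
    sigma2 = math.log(1.0 + cv**2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)

def prob_exceeds(threshold, mean, cv):
    """P(concentration > threshold) for a log-normal with the given mean and CV."""
    mu, sigma = lognormal_params(mean, cv)
    z = (math.log(threshold) - mu) / sigma
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical numbers: field data show CV ~ 0.6 about the mean; the model predicts a
# baseline mean of 1.2 mg/L and a with-program mean of 0.8 mg/L; criterion is 1.0 mg/L.
cv, criterion = 0.6, 1.0
for label, mean in [("baseline", 1.2), ("with program", 0.8)]:
    print(f"{label}: P(exceeds {criterion} mg/L) = {prob_exceeds(criterion, mean, cv):.2f}")
```

The same two quantities (mean and CV) could just as easily be handed forward directly, so that the economic analysis works with the full assumed distribution rather than a single summary.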
Lastly, as I was trying to type the slides together, the group kept going. We didn't really get to talk about how the specific models—the four models that we're talking about—really address these issues of tolerance. As I was overhearing the discussion while typing, it was clear that SPARROW has a lot of interesting aspects that hit on some of the things we talk about with tolerance. Instead of my trying to relay that to you, I'm just going to ask Greg (Schwarz) to come up and talk for a minute or two about what SPARROW can do in terms of tolerance and generating distributions.

Greg Schwarz, USGS (Tape counter starts at 2390)

Actually, I want to mention a couple of things. First of all, SPARROW can do all of these techniques. It does it all as part of the estimation process, and it's all inherent in the statistical analysis of a model, so it can handle uncertainty really well—statistical models are designed that way.

I wanted to make one other point with regard to a major topic of discussion in our group, which was how uncertainty is measured in a model. I'm going to put up a quantitative model—hopefully this doesn't alarm you too much. (He writes this equation on the board:)

WQ = f(x, θ) + ε

We have some variable, water quality, that we're relating to some variables in the model—land use or other things—and then we have some parameters, and then there's an error term. This is a statistical model, but I think it's descriptive of the deterministic model as well.

A lot of the discussion here was on how well we're predicting water quality—that's the measure of uncertainty. I think that's probably germane to a lot of the issues that were discussed here, primarily cases where we're trying to determine the metrics of water quality improvement—GPRA measures and things like that.

In this case, the kind of thing that you want for estimating good parameters in a statistical model is more data. So going to higher resolution is not necessarily the answer to getting better estimates of these parameters. In the case of SPARROW, we see a big difference with the scale of the model. Local-scale models don't have a lot of variation in these input variables, and consequently often don't measure these parameters all that well. A national-scale analysis, which has a lot of variation in conditions—a lot of differences in atmospheric deposition, for example—allows you to get much better estimates of the parameters that are going to be important for use in a policy analysis.

Q&A

Dominic Di Toro, University of Delaware (Tape 1B, Tape counter starts at 100)

I just wanted to make a comment that I forgot to make in my presentation. The presumption we were working on was that these models are being run by professional water quality modelers—that they're not being run by whoever isn't doing anything in the office that day. It is, I think, crazy to think that at this level of analysis anyone can turn on these models and make them work, even the simplest of models, because if you're not professionally trained... who would buy a GC-mass spec and then go look around for who isn't busy and say, "Why don't you go run that thing today?" (laughter) Unfortunately, that happens a fair amount in the modeling world, so I caution you that all of this is predicated on the idea that you actually have somebody who knows what they're doing pushing the buttons. It's only reasonable to assume that, and I know for a fact, of course, that mostly that's not true.

Presentation Slides

Freshwater III: Tolerance
- Four Parts of Economic Assessment: Data; Modeling; Criteria/Environmental Services; Valuation
- Can we demonstrate benefits of current programs?

Tolerance
- Model Accuracy
  - Different for each WQ variable
  - Can improve accuracy by: site-specific calibration; spatial and temporal averaging; distributions
  - Selected metric (e.g., use of distributions) should be consistent with the specific policy decision and economic benefit measures

Tolerance (continued)
- Model Precision
  - Explained and unexplained variability
  - For explained variability, increase model resolution (e.g., time, space, reaction)

Tolerance (continued)
- Handling model uncertainty
  - Analytical approaches
  - Monte Carlo simulations
  - Inverse techniques (e.g., PEST): develop multiple calibrations; run multiple projections
  - Simple approach: pass observed (field data) distribution forward