United States
Environmental Protection
Agency
Atmospheric Sciences Research
Laboratory
Research Triangle Park NC 27711
Research and Development
EPA/600/S9-86/018 Sept. 1986
EPA Project Summary
Workshop on Model
Evaluation Protocols:
Chairman's Report
W. T. Pennell
This report summarizes the results of
a workshop sponsored by the U.S. Envi-
ronmental Protection Agency that was
held to discuss procedures and proto-
cols for evaluating regional-scale acid
deposition models. The workshop was
the first of three that are planned to
assist the U.S. Environmental Protec-
tion Agency, the Ontario Ministry of En-
vironment, the Atmospheric Environ-
ment Service of Canada, and the
Electric Power Research Institute in de-
signing a model evaluation program.
The workshop was asked to consider
four major topics:
• procedures to be used in evaluat-
ing the performance of acid deposi-
tion models and methodologies for
applying these procedures
• data requirements of these proce-
dures and methodologies
• the probable impact of time and
budget constraints on the evalua-
tion process
• possible conflicts between client
needs and the probable output of
the evaluation program.
Each of these topics was considered,
and a series of recommendations was
made. These recommendations cov-
ered the manner in which the evalua-
tion should be conducted, the specific
tasks that should be undertaken to ef-
fect the evaluation, the data require-
ments of the recommended evaluation
program, and the types of questions
that can and cannot be answered by
such a program.
This Project Summary was devel-
oped by EPA's Atmospheric Sciences
Research Laboratory, Research Triangle
Park, NC, to announce key findings of
the research project that is fully docu-
mented in a separate report of the same
title (see Project Report ordering infor-
mation at back).
Introduction
The workshop on model evaluation
protocols, sponsored by the U.S. Envi-
ronmental Protection Agency (EPA) and
held in Raleigh, NC, on February 11-13,
1986, was the first of three workshops
planned to assist the EPA, the Ontario
Ministry of Environment (OME), the At-
mospheric Environment Service (AES),
and the Electric Power Research Insti-
tute (EPRI) in designing a program to
evaluate the performance of regional-
scale acid deposition models. It was at-
tended by a group of 27 scientists repre-
senting government laboratories,
industrial groups, private contractors,
and universities. An attendance list is
given in Appendix A of the report. The
workshop participants considered four
major topics:
• Procedures to be used in evaluating
the performance of acid deposition
models and methodologies for ap-
plying these procedures
• Data requirements of these proce-
dures and methodologies
• Probable impact of time and bud-
get constraints on the evaluation
process
• Possible conflicts between client
needs and the probable output of
the evaluation program
A series of recommendations was
made regarding the manner in which
the evaluation should be conducted, the
specific tasks that should be undertaken
to effect the evaluation, the data re-
quirements of the recommended evalu-
ation program, and the types of ques-
tions that can and cannot be answered
by such a program. These recommen-
dations are presented in the report.
In reaching their conclusions, the
workshop participants drew on the re-
sults of two previous activities: the
EPRI-sponsored Utility Deposition Net-
work (UDN) workshop held in Novem-
ber 1985 and an August 1985 workshop
on evaluation of acid deposition models
(sponsored by EPA, OME and AES),
which resulted in a Concept Plan. The
UDN workshop was convened to review
a technical plan for a deposition moni-
toring network that would gather the
wet deposition and aerometric meas-
urements needed for an operational
evaluation of acid deposition models.
The UDN workshop made specific rec-
ommendations on the type of measure-
ments required and on the methods to
be used in making them. These recom-
mendations were incorporated into the
recommendations of the evaluation proto-
cols workshop.
The Concept Plan, on the other hand,
outlined a series of field studies as well
as a wet and dry deposition monitoring
network that would generate the data
needed for both operational (how well a
model reproduces actual observations
of deposition and concentration fields
on the time and space scales needed)
and diagnostic (how well individual
components of the model simulate ac-
tual atmospheric processes) evalua-
tions of the models. This plan, however,
did not contain a set of specific tasks for
its fulfillment, nor did it indicate the rel-
ative importance of routine monitoring
activities, which generate the data for
operational evaluation, compared to
process studies, which provide the in-
formation needed for diagnostic evalua-
tion. The workshop participants, there-
fore, spent considerable time in
defining these tasks and in discussing
the general order of priorities.
From the clients' point of view, the
primary purpose of regional-scale acid
deposition models is to provide scientif-
ically defensible tools for analyzing the
consequences of alternative strategies
for controlling emissions of acid deposi-
tion precursors. Reflecting this need,
the clients posed a set of four key policy
questions that they expect the models
to address. These questions were re-
lated to (1) deposition loadings,
(2) source attribution, (3) chemical non-
linearity, and (4) detectability.
During the course of the workshop,
several issues were raised that could be
a source of conflict between the needs
of the clients and what the workshop
participants think is possible in a realis-
tic model evaluation program: conflicts
between model evaluation and model
development, problems in evaluating
complex models, and problems in set-
ting priorities.
One of the charges given the work-
shop was to set priorities for the model
evaluation process, that is, to determine
which aspects of the models contained
the greatest uncertainty and thus re-
quired the greatest emphasis in an eval-
uation program. Thus, the participants
strongly suggested that sensitivity stud-
ies be conducted as soon as possible in
order to guide the process of experi-
ment design. Nevertheless, the work-
shop participants recognized that the
clients required preliminary guidance in
this area; consequently, the tasks de-
scribed in the following sections are or-
dered according to the participants' ini-
tial priority judgments.
General Recommendations
Goals and Management
The workshop participants agreed
that the original goals of the model eval-
uation program could not be met under
the probable time and budget con-
straints. These goals were therefore
modified to be more consistent with
what was thought to be achievable. The
restated goals are given below:
Deposition Loadings
Given data from a surface-based
monitoring network operating for only a
few years and sampling under a limited
range of chemical conditions, it will not
be possible to determine the accuracy
to which a given model can predict cli-
matologically valid deposition loadings
to a given area. It will, however, be pos-
sible to determine whether such a
model can simulate deposition fields
over the period sampled by the network
to an accuracy comparable to the uncer-
tainty to which the monitoring network
can define the actual deposition field. It
should also be possible to quantify ob-
jectively the level of disagreement be-
tween the measurements and alterna-
tive predictions, assuming that this
disagreement exceeds the measure-
ment uncertainty.
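The comparison described above can be illustrated with a minimal sketch.
It assumes per-site observed and predicted deposition totals and a per-site
measurement uncertainty; the function name, the statistics chosen (root-mean-
square error, mean bias, and the fraction of sites where the disagreement
exceeds the measurement uncertainty), and the numbers are illustrative
assumptions, not procedures specified by the workshop.

```python
import math


def operational_comparison(observed, predicted, obs_uncertainty):
    """Compare predicted and observed deposition site by site.

    All three arguments are equal-length lists in the same units
    (for example, kg/ha over the sampling period). Hypothetical helper.
    """
    if not (len(observed) == len(predicted) == len(obs_uncertainty)):
        raise ValueError("inputs must have the same length")
    if not observed:
        raise ValueError("at least one site is required")

    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    mean_bias = sum(errors) / len(errors)

    # A disagreement is only meaningful where it is larger than the
    # uncertainty with which the network defines the observed value.
    exceeds = [abs(e) > u for e, u in zip(errors, obs_uncertainty)]

    return {
        "rmse": rmse,
        "mean_bias": mean_bias,
        "fraction_exceeding": sum(exceeds) / len(exceeds),
    }


# Made-up values for three hypothetical monitoring sites.
result = operational_comparison(
    observed=[10.0, 20.0, 30.0],
    predicted=[11.0, 18.0, 36.0],
    obs_uncertainty=[2.0, 3.0, 4.0],
)
print(result)
```

In this made-up case only the third site shows a model-observation
difference larger than its measurement uncertainty.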
Source Attribution
No method is available for directly
testing or evaluating the ability of a
model to make this computation. The
models will certainly be able to com-
pute changes in deposition resulting
from changes in emissions; and it is
possible, by numerically tagging the
emissions from a given area, to deter-
mine where the emissions from the
tagged area may go. However, one
must remember that such relationships
apply only to a fixed distribution of
emissions. Because of the nonlinearity
of the chemistry, changing the emis-
sions at any point in the modeled do-
main may affect source attributions at
every other location.
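A toy calculation can make this point concrete. The sketch below is not any
of the models discussed at the workshop: it assumes a single receptor, two
hypothetical source areas "A" and "B", and a saturating (nonlinear) relation
between total emissions and deposition, with arbitrary parameters vmax and k.
Tagged contributions are apportioned in proportion to each area's emissions.

```python
def tagged_deposition(emissions, vmax=150.0, k=100.0):
    """Apportion deposition at one receptor among tagged source areas.

    Toy assumption: total deposition saturates with total emissions,
    deposited = vmax * total / (k + total), standing in for nonlinear
    chemistry. vmax and k are arbitrary illustrative parameters.
    """
    total = sum(emissions.values())
    if total <= 0:
        return {name: 0.0 for name in emissions}
    deposited = vmax * total / (k + total)
    # Each tagged area receives a share proportional to its emissions.
    return {name: deposited * e / total for name, e in emissions.items()}


# Base case: both areas emit 100 units.
base = tagged_deposition({"A": 100.0, "B": 100.0})
# Control case: area B is eliminated; area A is unchanged.
control = tagged_deposition({"A": 100.0, "B": 0.0})

print(base["A"], control["A"])
```

Under these assumptions the deposition attributed to area A rises from 50 to
75 units when area B's emissions are removed, even though area A's emissions
are unchanged. An attribution computed for one emission distribution does
not carry over to another.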
Chemical Nonlinearity
Short of drastically changing existing
emission patterns, one cannot truly
evaluate the ability of a model to handle
this issue. The only solution, therefore,
is to perform extensive diagnostic stud-
ies that investigate how accurately
process modules in the models repre-
sent the corresponding natural phe-
nomena. If these modules appear to be
fairly accurate, then there will be confi-
dence in the ability of the models to
treat issues such as nonlinearity and
source attribution.
Detectability
The ability of a model to make this
sort of computation cannot be directly
evaluated. However, the degree of con-
fidence that one can have in the predic-
tions of the model depends on how well
the various process modules simulate
the actual atmospheric phenomena.
In summary, an operational model
evaluation program will indicate how
well the models can simulate current
deposition patterns over a time period
comparable to the length of the surface
monitoring program. A diagnostic eval-
uation program, on the other hand, will
indicate how well the process modules
in a given model simulate the important
physical and chemical processes in the
atmosphere. Diagnostic studies are es-
sential if we are to have confidence that
the models are capable of simulating
deposition for significantly different
emission compositions and distribu-
tions.
The field measurement program as-
sociated with the model evaluation
process should last a minimum of two
years, and longer if possible, because a
shorter measurement program would
not gather a sufficient data set for a use-
ful evaluation of the models. All data
collected during diagnostic studies
should be released to potential users as
soon as they have been quality audited,
because experimental data for model
development purposes are critically
lacking. Filling this need was judged
more important than achieving a com-
pletely hands-off diagnostic evaluation
of sequestered data. The first year of
data from the operational evaluation
program should be delivered to the
users for model development purposes,
and only the second year of data should
be sequestered for use in blind evalua-
tion tests.
The model evaluation program
should be managed by a highly quali-
fied, disinterested party. The managing
organization should be responsible for
developing the evaluation protocols,
supervising their execution, and report-
ing the results. The National Academy
of Sciences was suggested as the most
acceptable organization for this role.
Recommended Tasks
The workshop recommended that
several preliminary tasks be started dur-
ing FY 86 because the results are
needed for planning the work to be ac-
complished in FY 87 and beyond: de-
velop model evaluation methods, de-
velop evaluation protocols, analyze
existing data bases, conduct model sen-
sitivity studies, and conduct emissions
studies.
Operational Evaluation
Given current budget restrictions,
highest priority should be placed on es-
tablishing and operating a surface-
based wet deposition and aerometric
monitoring network, and on performing
other tasks to produce a data base for
operational evaluation of regional-scale
acid deposition models. The first four
tasks are listed in order of decreasing
priority. The last three are necessary
support tasks for any of the first four.
Some of the tasks have a diagnostic
component, but the workshop partici-
pants felt this component was vital to
interpreting the operational compari-
sons.
Deploy and Operate a Surface-
Based Monitoring Network
Top priority is given to the task of es-
tablishing and operating a 30-station
wet deposition and aerometric monitor-
ing network in the northeastern United
States. This network should be main-
tained for a minimum of two years.
Vertical Profiles Over the
Modeling Domain
In this task, frequent aircraft flights
will be made year-round over various
portions of the eastern United States to
obtain information on the vertical distri-
bution, from near the surface to several
thousand feet, of several important spe-
cies.
Deploy and Operate Subgrid
Variability Networks
The purpose of this task is to gather
data on the subgrid variability of precip-
itation chemistry and ambient concen-
tration of pollutants for interpreting the
results of network measurements. The
subgrid variability networks will also
form the nucleus for additional diag-
nostic studies.
Two subgrid variability networks
should be established: one in Kentucky
or the Ohio River Valley and one on the
U.S./Canada border. Each network
should consist of an enhanced central
monitoring station surrounded by a
cluster of approximately 100 sequential
precipitation chemistry monitoring sites
and should cover a 200 km² area. Each
full subgrid network should be operated
for two 2-month intensive periods per
year, and about ten sites should be op-
erated continuously. During the inten-
sive observation periods, aircraft meas-
urements should be made of the vertical
profiles of the species measured at the
enhanced station.
Emission Inventories
Existing emission inventories should
be updated to correspond to the time
periods being modeled in the opera-
tional evaluation studies. In addition,
more extensive improvements should
be made to the inventories, such as im-
proving VOC inventories, if the FY 86
emission inventory task shows that this
effort is justifiable economically.
Support Tasks
In addition to these major tasks there
are three additional support tasks:
(1) quality audit data, (2) archive data,
and (3) perform evaluations.
Suggested Protocols
For the first year of measurements, all
data should be released as soon as
quality auditing is complete. In the sec-
ond year, the following policies are sug-
gested.
1. Emission inventory updates qual-
ity audited and released to modelers
2. Surface monitoring data quality
audited and sequestered
3. Subgrid variability data quality au-
dited and released to modelers
4. Vertical profile data quality audited
and sequestered
5. Comparison of model outputs and
data overseen by the National Academy
of Sciences
• surface comparisons made by
approved objective techniques
• vertical profile comparisons
made by objective and subjective analy-
sis
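The release policies listed above can be written down as a small lookup
table. The sketch below is one possible encoding; the dataset keys, status
strings, and function name are hypothetical labels chosen for illustration.
Only years 1 and 2 are handled because those are the only years for which
the summary states a policy.

```python
RELEASED = "released to modelers"
SEQUESTERED = "sequestered for blind evaluation"

# Second-year handling of each data set once it has been quality audited.
SECOND_YEAR_POLICY = {
    "emission_inventory_updates": RELEASED,
    "surface_monitoring": SEQUESTERED,
    "subgrid_variability": RELEASED,
    "vertical_profiles": SEQUESTERED,
}


def release_status(dataset, year, quality_audited):
    """Return how a data set is handled in a given program year."""
    if dataset not in SECOND_YEAR_POLICY:
        raise KeyError(f"unknown data set: {dataset}")
    if year not in (1, 2):
        raise ValueError("policy is stated only for years 1 and 2")
    if not quality_audited:
        return "held until quality audit is complete"
    if year == 1:
        # First year: all data are released once audited.
        return RELEASED
    return SECOND_YEAR_POLICY[dataset]


print(release_status("surface_monitoring", 1, True))
print(release_status("surface_monitoring", 2, True))
```

Encoding the policy as data keeps the sequestered and released sets easy
to audit against the list above.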
Diagnostic Evaluation
The primary purpose of the diagnos-
tic evaluation process is to ensure that
the models are providing an accurate
simulation of the physical and chemical
processes that control the transport,
transformation, and deposition of acidic
materials. Initially, seven types of diag-
nostic studies were identified as neces-
sary for realistically assessing the abil-
ity of acid deposition models to
simulate wet and dry removal. Two of
these study classes, subgrid variability
and vertical profiles, were included in
the basic operational evaluation pro-
gram. The remaining five study classes
were grouped according to a prelimi-
nary assessment of their relative impor-
tance. High priority studies: wet deposi-
tion modules and dry deposition mod-
ules. Low priority studies: atmospheric
transport, gas-phase chemistry, and
treatment of the inflow boundary condi-
tions.
Wet Deposition Module
These studies should ensure that the
physical and chemical processes gov-
erning the wet removal of acidifying
gases and aerosols are accurately rep-
resented in the models. Intensive pre-
cipitation scavenging studies should be
conducted in the two subgrid regions
during the intensive observation peri-
ods. Additional measurements would
be required from radar, cloud-physics-
equipped aircraft, and enhanced upper-
air soundings. Data can be analyzed by
simulating the observed cloud and pre-
cipitation chemistry fields with diagnos-
tic models; comparing observed and
simulated data will indicate how well
the model represents the critical proc-
esses. Given a sufficient number of case
studies, the parameterization schemes
that consistently result in the best repre-
sentation of the observations should
emerge.
Dry Deposition Module
Model simulations should be com-
pared with alternative methods for
measuring or estimating dry deposition
in the atmosphere, using the dry depo-
sition core stations as the primary diag-
nostic reference points. Additional core
stations should be placed at each of the
two enhanced subgrid sites, and meas-
urements for deducing dry deposition
from air concentration and meteorolog-
ical measurements should also be
made at about four of the subgrid clus-
ter stations. Using combinations of
ground-based and aircraft data, esti-
mates can be made of dry deposition
fluxes to each of the subgrid areas, and
these estimates can be compared with
model computations.
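One common way to deduce dry deposition from air concentration and
meteorological measurements is to multiply the measured concentration by a
deposition velocity estimated from the meteorology. The summary does not say
which method the workshop had in mind, so the sketch below is an assumption;
the function name, units, and example values are illustrative only.

```python
def dry_deposition_flux(concentration_ug_m3, deposition_velocity_cm_s):
    """Estimate a dry deposition flux in micrograms per m^2 per second.

    Assumes flux = deposition velocity x air concentration, with the
    velocity converted from cm/s to m/s. Hypothetical helper.
    """
    if concentration_ug_m3 < 0 or deposition_velocity_cm_s < 0:
        raise ValueError("inputs must be non-negative")
    return concentration_ug_m3 * deposition_velocity_cm_s / 100.0


# Made-up values: 10 ug/m^3 of a pollutant and a 0.5 cm/s velocity.
flux = dry_deposition_flux(10.0, 0.5)
print(flux)
```

Estimates of this kind, made at the core and cluster stations, are what
would be set against the fluxes computed by the models.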
Atmospheric Transport
In evaluating transport, two types of
studies are envisioned: studies of long-
range horizontal transport and studies
of vertical translation in storms. The
most effective tests of a model's ability
to handle pollutant transport are tracer
studies. The sampling system must be
composed of several tracer-sampling
aircraft in addition to a ground-based
sampling network.
Gas Phase Chemistry
This diagnostic evaluation, largely in-
direct, will involve detailed measure-
ment of hydrocarbon species, reaction
products, and NOx/NOy chemistry at the
central stations. These measurements
will be compared with zero-dimensional
reaction-chemistry simulations, using
the parameterizations employed by the
models, as well as more elaborate de-
scriptions. Comparisons of key ratios of
reactants and intermediates will be em-
ployed as the primary tests of reality in
the submodel calculations. An alterna-
tive experimental approach is to per-
form Lagrangian-type experiments.
Boundary Conditions
These studies should determine
whether model simulations are being
affected by the transport of errors into
the modeling domain through the in-
flow boundaries. A possible method of
determining transport effects would be
to make a series of research aircraft
flights along the inflow boundary at var-
ious altitudes in the boundary layer.
W. T. Pennell is with Battelle Pacific Northwest Laboratories, Richland, WA
99352.
Jack Durham is the EPA Project Officer (see below).
The complete report, entitled "Workshop on Model Evaluation Protocols:
Chairman's Report," (Order No. PB 86-217 122/AS; Cost: $9.95, subject to
change) will be available only from:
National Technical Information Service
5285 Port Royal Road
Springfield, VA 22161
Telephone: 703-487-4650
The EPA Project Officer can be contacted at:
Atmospheric Sciences Research Laboratory
U.S. Environmental Protection Agency
Research Triangle Park, NC 27711
United States
Environmental Protection
Agency
Center for Environmental Research
Information
Cincinnati OH 45268