United States
                    Environmental Protection
                    Agency
Atmospheric Sciences Research
Laboratory
Research Triangle Park NC 27711
                    Research and Development
EPA/600/S9-86/018 Sept. 1986
EPA Project Summary

                    Workshop  on   Model
                    Evaluation  Protocols:
                    Chairman's  Report
                    W. T. Pennell
                     This report summarizes the results of
                    a workshop sponsored by the U.S. Envi-
                    ronmental Protection Agency that was
                    held to discuss procedures and proto-
                    cols for evaluating regional-scale acid
                    deposition models. The workshop was
                    the first of three that are planned to
                    assist the U.S. Environmental Protec-
                    tion Agency, the Ontario Ministry of En-
                    vironment, the Atmospheric Environ-
                    ment Service of Canada, and the
                    Electric Power Research Institute in de-
                    signing a model evaluation program.
                    The workshop was asked to consider
                    four major topics:
                     • procedures to be used in evaluat-
                       ing the performance of acid deposi-
                       tion models and methodologies for
                       applying these procedures
                     • data requirements of these proce-
                       dures and methodologies
                     • the probable impact of time and
                       budget constraints on the evalua-
                       tion process
                     • possible conflicts between client
                       needs and the probable output of
                       the evaluation program.
                     Each of these topics was considered,
                    and a series of recommendations was
                    made. These recommendations cov-
                    ered the manner in which the evalua-
                    tion should be conducted, the specific
                    tasks that should be undertaken to ef-
                    fect the evaluation, the data require-
                    ments of the recommended evaluation
                    program,  and  the types of questions
                    that can and cannot be answered by
                    such a program.
                     This Project Summary was devel-
                    oped by EPA's Atmospheric Sciences
                    Research Laboratory, Research Triangle
                    Park, NC, to announce key findings of
the research project that is fully docu-
mented in a separate report of the same
title (see Project Report ordering infor-
mation at back).

Introduction
  The workshop on model evaluation
protocols, sponsored by the U.S. Envi-
ronmental Protection Agency (EPA) and
held in Raleigh, NC, on February 11-13,
1986, was the first of three workshops
planned to assist the EPA, the Ontario
Ministry of Environment (OME), the At-
mospheric Environment Service (AES),
and the Electric Power  Research Insti-
tute (EPRI) in designing a program to
evaluate  the performance of regional-
scale acid deposition models. It was at-
tended by a group of 27 scientists repre-
senting  government  laboratories,
industrial groups,  private contractors,
and universities. An attendance list is
given in Appendix  A of the report. The
workshop participants considered four
major topics.
  • Procedures to be used in evaluating
   the performance of acid deposition
   models and methodologies for  ap-
   plying these procedures
  • Data requirements of these proce-
   dures and methodologies
  • Probable impact of time and bud-
   get constraints on the evaluation
   process
  • Possible conflicts  between  client
   needs and the probable output of
   the evaluation  program
  A series of recommendations was
made  regarding the manner in which
the evaluation should be conducted, the
specific tasks that should be undertaken
to effect the evaluation, the data
requirements of the recommended evalu-
ation program, and the types of ques-
tions that can and cannot be answered
by such a program. These recommen-
dations are presented in the report.
  In  reaching their conclusions, the
workshop participants drew on the re-
sults of two previous activities: the
EPRI-sponsored  Utility Deposition Net-
work (UDN) workshop held in  Novem-
ber 1985 and an  August 1985 workshop
on evaluation of acid deposition models
(sponsored by  EPA,  OME and  AES),
which resulted in  a Concept Plan. The
UDN workshop was convened to review
a technical plan  for a  deposition  moni-
toring network  that would gather the
wet deposition  and aerometric meas-
urements needed for an operational
evaluation of acid deposition  models.
The UDN workshop made specific rec-
ommendations on the type of measure-
ments required  and on the methods to
be used in making them. These recom-
mendations were  incorporated into the
recommendations of the evaluation protocols
workshop.
  The Concept Plan, on the other hand,
outlined a series of field studies as well
as a wet and dry deposition monitoring
network that would generate the data
needed for both  operational (how well a
model reproduces actual observations
of deposition and concentration fields
on the time and space scales  needed)
and  diagnostic (how well individual
components of the model simulate ac-
tual  atmospheric processes)  evalua-
tions of the models. This plan, however,
did not contain a set of specific tasks for
its fulfillment, nor  did it indicate the rel-
ative importance of routine monitoring
activities, which generate the  data for
operational  evaluation, compared  to
process studies, which provide the in-
formation needed for diagnostic evalua-
tion. The workshop participants, there-
fore, spent considerable time  in
defining these tasks and in discussing
the general order  of priorities.
  From the clients' point of view, the
primary purpose of regional-scale acid
deposition models is to provide scientif-
ically defensible tools for analyzing the
consequences of  alternative strategies
for controlling emissions of acid deposi-
tion  precursors. Reflecting this need,
the clients posed a set of four key policy
questions that they expect the models
to address. These questions were re-
lated to  (1) deposition loadings,
(2) source attribution, (3) chemical non-
linearity, and (4) detectability.
  During the course of the workshop,
several issues were raised that could be
a source of conflict between the needs
of the clients and what the workshop
participants think is possible in a realis-
tic model evaluation program: conflicts
between model evaluation and model
development, problems in evaluating
complex models, and problems in set-
ting priorities.
  One of the charges given the work-
shop was to set priorities for the model
evaluation process, that is, to determine
which aspects of the models contained
the greatest uncertainty and thus required
the greatest emphasis in an evaluation
program. Because sensitivity studies help
identify where that uncertainty lies, the
participants strongly suggested that such
studies be conducted as soon as possible in
order to guide the process of experiment
design. Nevertheless, the work-
shop participants recognized that the
clients required preliminary guidance in
this area; consequently, the  tasks de-
scribed in the following sections are or-
dered according to the participants' ini-
tial priority judgments.

General Recommendations

Goals and Management

  The workshop participants agreed
that the original goals of the model
evaluation program could not be met
under the probable time and budget
constraints. These goals were therefore
modified to be more consistent with what
was thought to be achievable. The restated
goals are given below, organized by the
four policy questions:

Deposition Loadings
  Given data  from a  surface-based
monitoring network operating for only a
few years and sampling under a limited
range of chemical conditions, it will not
be possible to determine the accuracy
to which a given model can predict cli-
matologically valid deposition loadings
to a given area. It will, however, be pos-
sible to determine whether such  a
model  can simulate deposition fields
over the period sampled by the network
to an accuracy comparable to the uncer-
tainty to which the monitoring network
can define the actual deposition field. It
should also be possible to quantify ob-
jectively the level of disagreement be-
tween the measurements and alterna-
tive  predictions, assuming  that this
disagreement exceeds the  measure-
ment uncertainty.
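
  The comparison described above can be pictured with a minimal sketch in
Python. The function name, the use of root-mean-square error as the
measure of disagreement, and the station values are all illustrative
assumptions; the workshop did not prescribe a particular statistic.

```python
import math

def disagreement_exceeds_uncertainty(predicted, observed, obs_uncertainty):
    """Return (rmse, exceeds) for paired model and network values.

    rmse is the root-mean-square difference between predictions and
    observations; exceeds is True when rmse is larger than the stated
    measurement uncertainty (same units as the data).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need equal-length, non-empty sequences")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    rmse = math.sqrt(sum(squared) / len(squared))
    return rmse, rmse > obs_uncertainty

# Hypothetical deposition values at four stations (arbitrary units).
model_values = [10.0, 12.0, 9.0, 15.0]
network_values = [11.0, 10.0, 9.0, 18.0]
rmse, exceeds = disagreement_exceeds_uncertainty(
    model_values, network_values, obs_uncertainty=1.0
)
```

  Only when the disagreement is larger than the measurement uncertainty
does it say anything about the model rather than about the network.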

Source Attribution
  No method is available for directly
testing or evaluating the  ability of a
model to make this computation. The
models will certainly be able to com-
pute changes  in deposition resulting
from changes  in emissions; and it  is
possible, by numerically tagging the
emissions from a given area, to deter-
mine where the emissions from the
tagged area  may  go.  However, one
must remember that such relationships
apply only to a fixed  distribution of
emissions. Because of the nonlinearity
of the chemistry, changing the emis-
sions at any point  in the modeled do-
main may affect source attributions  at
every other location.
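
  The dependence of a tagged attribution on the overall emission
distribution can be illustrated with a toy calculation. The sketch below
is not taken from any of the models discussed: the oxidant-limited cap,
the proportional sharing rule, the function name, and the numbers are
all assumptions chosen only to show the effect.

```python
def tagged_deposition(emissions, oxidant_capacity):
    """Split deposition among tagged sources in a toy oxidant-limited system.

    emissions maps a source name to its emission rate. Total deposition
    is capped at oxidant_capacity (the nonlinearity), and the capped total
    is shared among sources in proportion to their emissions.
    """
    total = sum(emissions.values())
    if total == 0:
        return {name: 0.0 for name in emissions}
    deposited = min(total, oxidant_capacity)
    return {name: deposited * rate / total for name, rate in emissions.items()}

# Source A emits the same amount in both cases; only source B is cut.
base = tagged_deposition({"A": 80.0, "B": 80.0}, oxidant_capacity=100.0)
cut_b = tagged_deposition({"A": 80.0, "B": 20.0}, oxidant_capacity=100.0)
```

  In this toy case the deposition attributed to source A rises from 50
to 80 units when only source B is reduced, which is the sense in which a
tagged attribution applies only to a fixed distribution of emissions.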

Chemical Nonlinearity
  Short of drastically  changing existing
emission patterns, one cannot truly
evaluate the ability of a model to handle
this issue. The only solution, therefore,
is to perform extensive diagnostic stud-
ies that investigate how accurately
process  modules in the  models repre-
sent the  corresponding natural phe-
nomena. If these modules appear to be
fairly accurate, then there will be confi-
dence in the  ability  of the models to
treat issues such as nonlinearity and
source attribution.

Detectability
  The ability of a model to make this
sort of computation cannot be directly
evaluated. However, the degree of con-
fidence that one can have in the predic-
tions of the model depends on how well
the various process  modules simulate
the actual atmospheric phenomena.
  In summary, an operational  model
evaluation program  will indicate how
well the  models can simulate current
deposition patterns over a time period
comparable to the length of the surface
monitoring program. A diagnostic eval-
uation program, on the other hand, will
indicate how well the process modules
in a given model simulate the important
physical and chemical processes in the
atmosphere. Diagnostic studies are es-
sential if we are to have confidence that
the models are capable of simulating
deposition for significantly different
emission  compositions and  distribu-
tions.
  The field measurement  program as-
sociated with  the model evaluation
process should last a minimum of two
years, and longer if possible, because a
shorter measurement program would
not gather a sufficient data set for a use-
ful  evaluation of the models. All data
collected during diagnostic studies
should be released to potential users as
soon as they have been quality audited,
because  experimental data for model
development purposes are critically
lacking. Filling this need was judged
more important than achieving a com-
pletely hands-off diagnostic evaluation
of sequestered data. The first year of
data from the operational  evaluation
program should  be delivered to the
users for model development purposes,
and only the second year of data should
be sequestered for use in  blind evalua-
tion tests.
  The model evaluation program
should be managed by a  highly quali-
fied, disinterested party. The managing
organization should be responsible for
developing the evaluation protocols,
supervising their execution, and report-
ing the results. The National Academy
of Sciences was suggested as the most
acceptable organization for this role.

Recommended Tasks
  The workshop  recommended that
several preliminary tasks be started dur-
ing FY  86 because the  results are
needed for planning the work to  be ac-
complished in FY 87 and  beyond: de-
velop model evaluation methods, de-
velop evaluation protocols, analyze
existing data bases, conduct model sen-
sitivity studies, and conduct emissions
studies.

Operational Evaluation
  Given  current budget  restrictions,
highest priority should be placed on establishing
and operating a surface-
based wet deposition and  aerometric
monitoring network, and on performing
other tasks to produce a data base for
operational evaluation of regional-scale
acid deposition  models. The first four
tasks are listed in order of  decreasing
priority. The last  three are necessary
support tasks for any  of the first four.
Some of the tasks  have  a diagnostic
component,  but the workshop partici-
pants felt this component was vital to
interpreting  the operational compari-
sons.

Deploy and Operate a Surface-
Based  Monitoring  Network
  Top priority is given  to the task of es-
tablishing and operating  a 30-station
wet deposition and aerometric monitor-
ing network in the northeastern United
States. This network should be main-
tained for a minimum of two years.

Vertical Profiles Over the
Modeling Domain
  In this task, frequent aircraft flights
will be made year-round  over  various
portions of the eastern United States to
obtain information on the vertical distri-
bution, from near the surface to several
thousand feet, of several important spe-
cies.

Deploy and Operate Subgrid
Variability Networks
  The purpose of this task is to gather
data on the subgrid variability of precip-
itation  chemistry and ambient concen-
tration of pollutants for interpreting the
results of network  measurements. The
subgrid variability networks will also form the
nucleus for additional diagnostic studies.
Two subgrid variability networks
should be established: one in Kentucky
or the Ohio River Valley and one on the
U.S./Canada border.  Each network
should consist of an enhanced central
monitoring station surrounded by a
cluster of approximately 100 sequential
precipitation chemistry monitoring sites
and should cover a 200 km² area. Each
full subgrid network should be operated
for two 2-month intensive periods per
year, and about ten sites should be op-
erated continuously. During the  inten-
sive observation periods, aircraft meas-
urements should be made of the vertical
profiles of the species measured  at the
enhanced station.

Emission Inventories
  Existing emission inventories should
be  updated to correspond to the time
periods being modeled in  the opera-
tional  evaluation studies. In addition,
more extensive  improvements should
be made to the inventories, such as im-
proving VOC inventories, if the FY 86
emission inventory task shows that this
effort is justifiable economically.

Support Tasks
  In addition to these major tasks there
are three additional support tasks:
(1) quality audit  data,  (2) archive data,
and (3) perform evaluations.

Suggested  Protocols
  For the first year of measurements, all
data should  be released as soon as
quality auditing is complete. In the sec-
ond year, the following policies are sug-
gested.
  1. Emission inventory updates qual-
 ity audited and released to modelers
  2. Surface monitoring data quality
 audited and sequestered
  3. Subgrid variability data quality au-
 dited and released to modelers
  4. Vertical profile data quality audited
 and sequestered
  5. Comparison of model outputs and
 data overseen by the National Academy
 of Sciences
    • surface comparisons made by
 approved objective techniques
    • vertical profile comparisons
 made by objective and subjective analy-
 sis
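
  The release rules in the list above can be written down as a small
lookup table. This is only a sketch: the dictionary keys, the function
name, and the "released"/"sequestered" labels are illustrative choices,
not part of the workshop's recommendations.

```python
# Data streams and second-year handling, transcribed from the list above.
SECOND_YEAR_POLICY = {
    "emission inventory updates": "released",
    "surface monitoring data": "sequestered",
    "subgrid variability data": "released",
    "vertical profile data": "sequestered",
}

def handling(stream, year):
    """Return how a quality-audited data stream is handled in a given year."""
    if stream not in SECOND_YEAR_POLICY:
        raise KeyError(f"unknown data stream: {stream}")
    if year == 1:
        return "released"  # first year: everything released after auditing
    if year == 2:
        return SECOND_YEAR_POLICY[stream]
    raise ValueError("the suggested protocols cover only years 1 and 2")
```

  For example, handling("surface monitoring data", 2) gives
"sequestered", while the same stream in year 1 is released.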

 Diagnostic Evaluation
  The primary purpose of the diagnos-
 tic evaluation process is to ensure that
 the  models are  providing an accurate
 simulation of the physical and chemical
 processes that control the transport,
 transformation, and deposition of acidic
 materials. Initially, seven types of diag-
 nostic studies were identified as neces-
 sary for realistically assessing the abil-
 ity  of  acid deposition models to
 simulate wet and dry removal. Two of
 these study classes, subgrid variability
 and vertical profiles, were included in
 the  basic operational evaluation pro-
 gram. The remaining five study classes
 were grouped according to  a prelimi-
 nary assessment of their relative impor-
 tance:  wet deposition modules, dry
 deposition modules (high priority stud-
 ies), atmospheric transport, gas-phase
 chemistry, and treatment of the inflow
 boundary conditions (low priority stud-
 ies).

 Wet Deposition Module
  These studies  should ensure that the
 physical and chemical processes governing
 the wet removal of acidifying
 gases and aerosols are accurately rep-
 resented in the  models. Intensive pre-
 cipitation scavenging studies should be
 conducted in the two subgrid regions
 during  the intensive observation peri-
 ods. Additional  measurements would
 be  required from radar, cloud-physics-
 equipped aircraft, and enhanced upper-
 air soundings. Data can be analyzed by
 simulating the observed cloud and pre-
 cipitation chemistry fields with diagnos-
 tic  models; comparing observed and
 simulated data will indicate how well
 the model represents the critical proc-
 esses. Given a sufficient number of case
 studies, the parameterization  schemes
 that consistently result in the best
 representation of the observations should
    emerge.

    Dry Deposition Module
     Model simulations should be com-
    pared with alternative methods for
    measuring or estimating dry deposition
    in the atmosphere, using the dry depo-
    sition core stations as the primary diag-
    nostic reference points. Additional core
    stations should be placed at each of the
    two enhanced subgrid sites, and meas-
    urements for deducing dry deposition
    from air concentration and meteorolog-
    ical measurements should  also be
    made at about four of the subgrid clus-
    ter stations. Using combinations of
    ground-based and aircraft data, esti-
    mates can be made of dry deposition
    fluxes to each of the subgrid areas, and
    these estimates can be compared with
    model computations.

    Atmospheric Transport
     In evaluating transport, two types of
    studies are envisioned: studies of long-
    range horizontal transport  and studies
    of vertical translation  in storms.  The
    most effective tests of a model's ability
    to handle pollutant transport are tracer
    studies. The sampling system must be
    composed of several tracer-sampling
    aircraft  in addition to a ground-based
    sampling network.

    Gas Phase Chemistry
     This diagnostic evaluation, largely in-
    direct, will involve detailed measure-
    ment of hydrocarbon species, reaction
    products, and NOx/NOy chemistry at the
       central stations. These measurements
       will be compared with zero-dimensional
       reaction-chemistry simulations, using
       the parameterizations employed by the
       models, as well as more elaborate de-
       scriptions. Comparisons of key ratios of
       reactants and intermediates will be em-
       ployed as the primary tests of reality in
       the submodel calculations. An alternative
       experimental approach is to perform
       Lagrangian-type experiments.

Boundary Conditions

  These studies should determine
whether  model simulations are being
affected by the transport of errors into
the modeling domain through the in-
flow boundaries. A possible method of
determining transport effects would be
to make  a series of research  aircraft
flights along the inflow boundary at var-
ious altitudes in the boundary layer.
          W. T. Pennell is with Battelle Pacific Northwest Laboratories, Richland, WA
            99352.
          Jack Durham is the EPA Project Officer (see below).
          The complete report, entitled "Workshop on Model Evaluation Protocols:
            Chairman's Report," (Order No. PB 86-217 122/AS; Cost: $9.95, subject to
            change) will be available only from:
                 National Technical Information Service
                 5285 Port Royal Road
                 Springfield, VA 22161
                  Telephone: 703-487-4650
          The EPA Project Officer can be contacted at:
                 Atmospheric Sciences Research Laboratory
                 U.S. Environmental Protection Agency
                 Research Triangle Park, NC 27711
United States
Environmental Protection
Agency
Center for Environmental Research
Information
Cincinnati OH 45268
