EPA
            United States
            Environmental Protection
            Agency
           Office of Air Quality
           Planning and Standards
           Research Triangle Park NC 27711
EPA-450/3-79-032
May 1979
            Air
End  Use of Solvents
Containing Volatile
Organic Compounds

-------
                                     EPA-450/3-79-032
End  Use of  Solvents  Containing
  Volatile  Organic  Compounds
                        by

                     Ned Ostojic

            The Research Corporation of New England
                 125 Silas Deane Highway
              Wethersfield, Connecticut 06109
                 Contract No. 68-02-2615
                     Task No. 8
              EPA Project Officer: Reid E. Iversen
                     Prepared for

           U.S. ENVIRONMENTAL PROTECTION AGENCY
              Office of Air, Noise, and Radiation
           Office of Air Quality Planning and Standards
           Research Triangle Park, North Carolina 27711

                     May 1979

-------
                              DISCLAIMER


     This report has been reviewed  by the Office of Air Quality Planning
and Standards,  U.S.  Environmental Protection Agency, and approved for
publication.  Approval  does not  signify that the contents necessarily
reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names or commercial products constitute
endorsement or  recommendation for use.
                                  ii

-------
                               ABSTRACT
     Currently there are no standardized guidelines  for evaluating  the
performance of air quality simulation models.   In  this report we develop
a conceptual framework for objectively evaluating  model  performance.  We
define five attributes of a well-behaving model:   accuracy of the peak
prediction, absence of systematic bias, lack of gross  error, temporal cor-
relation, and spatial alignment.   The relative importance  of these  attri-
butes is shown to depend on the issue being addressed  and  the pollutant
being considered.  Acceptability of model behavior is  determined by cal-
culating several  performance "measures" and comparing  their values  with
specific "standards."  Failure to demonstrate a particular attribute may
or may not cause a model to be rejected, depending on  the  issue and pollutant.

     Comprehensive background material is presented on the elements of the
performance evaluation problem:  the types of issues to be addressed, the
classes of models to be used along with the applications for which  they are
suited, and the categories of performance measures available for considera-
tion.  Also, specific rationales are developed on  which performance standards
could be based.  Guidance on the interpretation of performance measure values
is provided by means of an example using a large,  grid-based air quality
model.
                                  iii

-------
                           ACKNOWLEDGMENTS
     A number of persons have generously  provided their assistance and sup-
port to this project.   Special thanks  is  due Philip Roth, whose fore-
sight and leadership made this project possible.  His perceptive advice
and guidance contributed immeasurably  to  the results of this work.

     Steven Reynolds and Martin Hillyer made many significant, insightful
comments, which were greatly appreciated.

     For their patience and diligence,  grateful thanks is also due the
members of the SAI support staff, particularly Marie Davis, Sue Bennett,
Chris Smith, and Linda Hill.
                                   iv

-------
                              CONTENTS


DISCLAIMER	     ii

ABSTRACT	    iii

ACKNOWLEDGMENTS	     iv

LIST OF ILLUSTRATIONS  	    vii

LIST OF TABLES	     xi

LIST OF EXHIBITS	    xiv

  I  INTRODUCTION  	    I-1

     A.  Overview of the Problem 	    I-2

     B.  Structure of the Report 	    I-5

 II  SUMMARY	    II-1

     A.  Main Results	    II-1

     B.  Detailed Summary	    II-2
         1.  Summary of Chapter III (Issues)	    II-2
         2.  Summary of Chapter IV (Models)  	    II-3
         3.  Summary of Chapter V (Performance Measures) ....    II-4
         4.  Summary of Chapter VI (Performance Standards) .  .  .   II-14

 III  ISSUES REQUIRING MODEL APPLICATION  	   III-1

     A.  A Perspective on the Issues	III-1
         1.  Federal Air Pollution Law	III-2
         2.  The Code of Federal Regulations	III-3

     B.  Generic Issue Categories  	   III-7
         1.  The Issues:  Their Classification 	   III-8
         2.  The Issues:  Some Practical Examples and
             Their  Implications for Air Pollution Modeling .  .  .  III-10
         3.  The Issues:  A Prologue to the Next Chapter  ....  III-13

-------
IV  AIR QUALITY MODELS	      IV-1
    A.   Generic Model Categories 	      IV-2
        1.   Rollback Category	      IV-2
        2.   Isopleth Category	      IV-4
        3.   Physico-Chemical  Category	      IV-5
    B.   Generic Issue/Model  Combinations 	     IV-16
    C.   Model/Application Combinations	     IV-22
    D.   Some Specific Air Quality Models	     IV-22
    E.   Air Quality Models:   A Summary	     IV-25
 V  MODEL PERFORMANCE MEASURES	       V-1
    A.   The Comparison of Prediction with Observation	       V-2
    B.   Generic Performance Measure Categories 	       V-4
        1.  The Generic Measures	       V-5
        2.  Some Types of Variations Among Performance
            Measures	      V-10
        3.  Several Practical Considerations 	      V-10
    C.  A Basic Distinction:  Regional Versus Source-Specific
        Performance Measures	      V-15
    D.  Some Specific Performance Measures 	      V-22
    E.  Matching Performance  Measures to Issues and Models . .      V-27
        1.   Performance Measures  and Air Quality  Issues.  . . .      V-27
        2.   Performance Measures  and Air Quality  Models.  . . .      V-33
    F.  Performance Measures:   A  Summary 	      V-36
VI  MODEL PERFORMANCE STANDARDS	      VI-1
    A.  Performance Standards:  A Conceptual Overview	      VI-2
    B.  Performance Standards:  Some Practical
        Considerations 	     VI-4
        1.   Data Limitations	     VI-5
        2.   Time/Resource Constraints	     VI-6
        3.   Variability of Analysis  Requirements  	     VI-6
    C.  Model  Performance Attributes	     VI-7
    D.  Recommended Measures  and  Standards	     VI-12
        1.   Recommended Performance  Measures  	     VI-14
        2.   Recommended Performance  Standards	     VI-23
        3.   Summary Table of Recommended Measures and
             Standards	     VI-30
        4.   Formulas for Calculating  Performance  Measures
             and Standards	     VI-32

                                    vi

-------
 VI  MODEL  PERFORMANCE  STANDARDS  (Continued)

    E.   A  Sample  Case:   The  SAI  Denver  Experience	VI-39

         1.  The  Denver Modeling  Problem 	  VI-39
         2.  Values  of  the Performance Measures	VI-4U
         3.  Interpreting the Performance Measure Values  	  VI-4b

     F.   Suggested Framework  for  a Draft Standard	VI-53

VII  RECOMMENDATIONS FOR FUTURE WORK	VII-1

     A.   Areas for Technical  Development 	  	  VII-2

         1.  Further Evaluation of Performance Measures  	  VII-2
         2.  Identification and Specification of Prototypical
             Point Source "Test Bed" Data Bases	VII-2
         3.  Examination of Performance  Evaluation  Procedure in
             Sparse-Data Point Source Applications  ...  	  VII-3
         4.  Further Development of Rationales for  Setting
             Performance Standards	   VII-4

     B.  Assessment of  Institutional Implications  	   VII-5

     C.  Documents To Be Compiled	VII-5

 APPENDICES

    A     IMPORTANT  PARTS OF  THE  CODE OF FEDERAL REGULATIONS
          CONCERNING AIR PROGRAMS	    A-1

    B     SOME SPECIFIC AIR QUALITY MODELS	    B-1

    C     SOME SPECIFIC MODEL PERFORMANCE MEASURES   	    C-1

    D     SEVERAL RATIONALES  FOR  SETTING MODEL PERFORMANCE
          STANDARDS	    D-1

 REFERENCES	    R-1
                                    vii

-------
                           ILLUSTRATIONS
II-1   Various  Levels  of Knowledge About Regional Concentrations  .  .  .    II-9

II-2   Various  Levels  of Knowledge About Specific-Source
       Concentrations	    II-9

 V-1   Various  Levels  of Knowledge About Regional Concentrations  ...     V-6

 V-2   Various  Levels  of Knowledge About Specific-Source
       Concentrations  	     V-7

 V-3   Sample Regional Isopleth Diagram Illustrating Ozone
       Concentrations  in Denver on 29  July  1975 for
       Hour 1200-1300  MST 	  	    V-17

 V-4   Sample Specific-Source Isopleth Diagram Illustrating
       Concentrations  Downwind of a  Steady-State Gaussian
       Point Source 	    V-18

 V-5   Concentration Isopleth Patterns for  Various Source Types ....    V-20

 V-6   Schematic of a  Point Source Measurement Network   	    V-21

 V-7   Locus of Possible Footprint Locations for an  Elevated
       Point Source	    V-21

VI-1    Orientation and Scaling of CAVE and  d* A*65 on
       a Prediction-Observation Correlogram .	      VI-37

VI-2   Locations of Monitoring Stations in  the Denver
       Metropolitan Region  	  VI-41

VI-3   Predicted and Observed Ozone  Concentrations at Each
       Monitoring Station During the Day  (Denver,  28 July 1976) ....  VI-42

VI-4   Correlogram of  Ozone Observation-Prediction Pairs
       for Sample Case (Denver, 28 July 1976) 	  VI-46

VI-5   Normalized Deviations About the Perfect Correlation  Line
       as a Function of Ozone Concentration (Denver, 28 July  1976)  . .  VI-47

VI-6   Non-Normalized  Ozone Deviations About the  Perfect Correlation
       Line Compared with Instrument Errors (Data  for  14 Hours and
       8 Stations, Denver, 28 July  1976)   	  VI-48

VI-7   Non-Normalized Ozone Absolute Deviation About the Perfect
       Correlation Line Compared with  Instrument  Error  (Data  for
       14 Hours and 8 Stations, Denver, 28  July  1976)  	  VI-49

                                 viii

-------
VI-8   Ground-Traces of the Predicted and Observed Peak Ozone
       Concentrations (Denver, Hours 1100-1200 to 1400-1500
       Local Standard Time, 28 July 1976)  ...............  VI-52
VI-9   Possible Relationships Between the Model  Performance
       Standards and a Guidelines Document ...............  VI-54
 C-l   Locations and Values of Predicted Maximum One-Hour-Average
       Ozone Concentrations for Each Hour from 8 a.m.  to 6 p.m .....    C-7
 C-2   Concentration Histories Revealing Time Lag or Spatial Offset   .  .   C-14
 C-3   Estimate of Bias in Model Predictions as  a Function of
       Ozone Concentration .......................   C-15
 C-4   Time Variation of Differences Between Means of Observed
       and Predicted Ozone Concentrations  ...............   C-17
 C-5   Probabilities of Ozone Concentration Exceedance .........   C-18
 C-6   Model Predictions Correlated with Instrument Observations
       of Ozone (Data for 3 Days, 9 Stations, Daylight Hours)   .....   C-19
 C-7   Model Predictions Compared with Estimates of Instrument Errors
       for Ozone (Data for 3 Days, 9 Stations, Daylight Hours) .....   C-21
 C-8   Map of Denver Air Quality Modeling Region Showing Air
       Quality Monitoring Stations . . . . ...............   C-23
 C-9   Time History of Predicted and Observed Concentrations at
       Monitoring Sites  ........................   C-24
C-10   Variations over All Stations of Observed and Predicted
       Average Ozone Concentrations  ..................   C-25
C-11   Plots of Residuals and Forcing Variable .............   C-26
C-12   Distribution of Area Fraction Exposed to Greater
       Than a Given Concentration Value  ................   C-30
C-13   Isopleths of Ozone Concentrations (pphm)  on 29 July 1975  ....   C-35
C-14   Size of Area in Which Predicted Ozone Concentrations Exceed
       Given Values for Years 1976, 1985, and 2000 ...........   C-40
C-15   Typical Residuals Isopleth Plot for Annual Average NO2  .....   C-42
C-16   Estimated Exposure to Ozone as a Function of Ozone
       Concentration for 3 August 1976 Meteorology ...........  C-48
C-17   General Shape of the Exposure Cumulative Distribution
       and Density Functions ......................  C-49
                                     ix

-------
C-18   Shape of t(/(C), the Approximation to the Delta Function	C-52
C-19   Cumulative Ozone Dosage as a Function of the Time of Day
       for 3 August 1976 Meteorology 	   C-54
                                 o
C-20   Cumulative Exposure (in 10  Person-Hours) to Ozone
       Concentrations Above Given Level in One-Square-Mile Grid
       Cells Between 500 and 1800 Hours for 3 August 1976
       Meteorology and 1976 Emissions	C-55
C-21   Cumulative Ozone Dosages (in 10  pphm-Person-Hours) in the
       One-Square-Mile Grid Cells from 500 to 1800 Hours (MST) for
       3 August 1976 Meteorology and Emissions in 1976 	   C-58
C-22   Orientation with Respect to Measurement Station of Nearest
       Point at Which Prediction Equals Station Observation  	   C-59
C-23   Space-Time Trace of Location of Nearest Point Predicting
       a Concentration Equal to the Station Measured Value 	   C-60
 D-l   Possible Health Effects Curves  	    D-4
 D-2   Representation of Spatial and Concentration Dependent
       Population Functions  	    D-6
 D-3   Population Distribution as a Function of Concentrations 	   D-10
 D-4   Idealized Concentration Isopleths  	   D-11
 D-5   Typical Radial Concentration Distributions About
       the Peak	D-13
 D-6   Predicted Population Distribution as a Function
       of Concentration  	   D-16
 D-7   Shifts in w(C) Caused by Nonuniform Population Distributions  .  .   D-17
 D-8   Expected Shape of Health Effects Function 	   D-20
 D-9   Minimum Allowable Ratio of Predicted to Measured Peak
       Concentration Value 	   D-23
D-10   Prototypical Isopleth Diagram 	   D-28
D-ll   The Isopleth Diagram Replotted  	   D-29
D-12   Total Regional Control Cost as a Function of the Level
       of Control Required 	   D-32
D-13   Uncertainty Distribution for a Conservative Model  	   D-35
D-14   Uncertainty Distribution for a Nonconservative Model  	   D-35

-------
                               TABLES
 II-1   Air Quality Issues Commonly Addressed,
        by Generic Model Type  ....................    II-5
 II-2   Model/Application Combinations ................    II-6
 II-3   Some Air Quality Models  ...................    II-7
 II-4   Generic Performance Measure Information Requirements .....   II-10
 II-5   Types of Variations Among Generic Performance
        Measure Categories ......................   II-12
 II-6   Performance Measures Commonly Associated with
        Specific Issues   .......................   II-13
 II-7   Performance Measures That Can Be Calculated by
        Each Model Type   .......................   II-13
 II-8   Performance  Measure Objectives  ................  II-15
 II-9   Importance of Performance Attributes by  Issue   ........  II-16
II-10   Importance of Performance Attributes by  Pollutant and
        Averaging Time ........................  II-18
II-11   Measures Recommended  for Use in Setting  Model
        Performance Standards  ....................  II-19
II-12   Possible Rationales for Setting Model  Performance
        Standards  ..........................  II-21
II-13   Performance Attributes Addressable Using Performance
        Standard Rationales   .....................  II-22
II-14   Association of Rationales with  Generic Issues   ........  II-22
II-15   Recommended Rationales for  Setting Standards .........  II-26
II-16   Summary of Recommended Performance Measures and Standards   .  .  II-26
 IV-1   Air Quality Issues Commonly Addressed,
        by Generic Model  Type  ....................  IV-18
 IV-2   Possible Designations of Application Attributes   .......  IV-23
 IV-3   Model/Application Combinations  ................  IV-24
 IV-4   Some Air Quality  Models ...................  IV-26
                                       xi

-------
 V-1   Generic Performance Measure Information Requirements .....     V-8
 V-2   Types of Variations Among Generic Performance
       Measure Categories ......................    V-11
 V-3   Some Peak Performance Measures ................    V-23
 V-4   Some Station Performance Measures  ..............    V-24
 V-5   Some Area Performance Measures ................    V-26
 V-6   Some Exposure/Dosage Performance Measures   ..........    V-28
 V-7   Performance Measures Associated with Specific  Issues .....    V-34
 V-8   Performance Measures That  Can Be Calculated by
       Each Model  Type  .......................    V-37
 VI-1   Performance Measure Objectives ................   VI-10
 VI-2   Importance  of  Performance  Attributes by Issue .........   VI-10
 VI-3   Importance  of  Performance  Attributes by Pollutant and
       Averaging Time ........................  VI-13
 VI-4   Candidate Station Performance Measures ............   VI-16
 VI-5   Useful  Hybrid  Performance  Measures  ..............  VI-20
 VI-6   Measures Recommended  for Use in  Setting Model
       Performance Standards   ....................  VI-21
 VI-7   Possible Rationales for Setting  Model  Performance Standards  .  VI-24
 VI-8   Performance Attributes  Addressable Using  Performance
       Standard Rationales   .....................  VI-26
 VI-9   Association of Rationales  with Generic Issues   ........  VI-27
VI-10   Recommended Rationales  for Setting Standards .........  VI-29
VI-11   Summary of Recommended  Performance Measures and Standards  . .  VI-31
VI-12   Sample Values  for Model Performance Standards
        (Denver Example) .......................  VI-43
VI-13   Importance of Performance Attributes by Issue  ........  VI-56
VI-14   Importance of Performance Attributes by Pollutant
        and Averaging Time .....................  VI-56
VI-15   Model Performance Measures and Standards ...........  VI-57

                                  xii

-------
B-1   Some Specific Air Quality Models 	    B-3
C-1   Some Peak Performance Measures	    C-3
C-2   Several Peak Measure Combinations of Interest and
      Some Possible Interpretations  	    C-4
C-3   Some Station Performance Measures  	    C-8
C-4   Occurrence of Correspondence Levels of Predicted and
      Observed Ozone Concentrations  	   C-20
C-5   Some Area Performance Measures	   C-29
C-6   Some Exposure/Dosage Performance Measures  	   C-45
D-1   Selected Parameter Values in Denver Test Case  	   D-15

-------
                               EXHIBITS
III-1    Formal Organization of CFR Title 40—Protection
        of  Environment	III-4

 IV-1    General Model Categories 	    IV-3
                                 xiv

-------
                          I    INTRODUCTION


     In this report a candidate framework is suggested within which  an
objective evaluation of air quality simulation model  (AQSM)  performance
may be carried out, along with an assessment of the relative applicability
of models to specific problems.  Quantitative procedures  are identified
that could facilitate assessment of the relative accuracy and  usability
of an AQSM.

     The subject addressed in this report is a broad and  complex  one.  Sel-
dom can a rule for judging model performance be stated that  does  not have
several plausible exceptions to it.  Consequently,  we view the  establish-
ment of model performance standards to be a pragmatic and evolutionary
exercise.  As we gain experience in evaluating model  performance, we will
need to modify both our choice of performance measures and the  range of
acceptable values we insist on.  Nevertheless, the  process must begin some-
where.  The recommendations contained in this report represent  such  a
beginning.

     Model performance evaluation should not be viewed as a  mechanistic
process, to be performed in a "cookbook" fashion.  Performance  measures
may be defined to be specific quantities whose value in some way  character-
izes the difference between predicted and observed  concentrations.   No set
of performance measures, however well designed, can fully characterize
model behavior.  Judgment is required of the model  user.   Predictions can
be compared with measurement data in a variety of ways.  Some  comparisons
involve the calculation of specific quantities and  are thus  suited for
having specific standards set.  (An example might be the  difference  between
the predicted and the observed concentration peak.)  Other comparisons are
more qualitative, better used in an advisory sense  to facilitate  "pattern
                                    I-1

-------
recognition."  (Concentration isopleth maps and time profiles of predicted
and observed concentrations are examples of this type of qualitative com-
parison.)  Although we recommend a set of performance measures and standards
in this report, in no way does this recommendation suggest that computation
of measures be limited to this set.  For this reason, we catalogue many
different types of performance measures, only a small subset of which have
explicit, formal standards.
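
     As a rough illustration of what "calculating specific quantities"
can look like, the sketch below computes four simple comparisons between
paired predicted and observed concentrations at one station.  The function
names, the particular formulas, and the hourly values are assumptions made
for this example only; they are not the measures or data recommended later
in this report.

```python
from math import sqrt

def peak_difference(predicted, observed):
    """Predicted peak minus observed peak, in the units of the inputs."""
    return max(predicted) - max(observed)

def mean_bias(predicted, observed):
    """Average signed residual; a nonzero value hints at systematic bias."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)

def mean_absolute_error(predicted, observed):
    """Average unsigned residual; a large value hints at gross error."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)

def correlation(predicted, observed):
    """Pearson correlation of the two time series (temporal agreement)."""
    n = len(observed)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov / sqrt(var_p * var_o)

# Hypothetical hourly ozone concentrations (pphm) at a single station.
observed = [4.0, 6.0, 9.0, 12.0, 10.0, 7.0]
predicted = [5.0, 6.0, 8.0, 10.0, 11.0, 8.0]

peak = peak_difference(predicted, observed)      # peak underpredicted by 1 pphm
bias = mean_bias(predicted, observed)            # residuals cancel on average
gross = mean_absolute_error(predicted, observed) # yet typical error is 1 pphm
r = correlation(predicted, observed)
```

     In this made-up case the signed residuals cancel while the unsigned
error does not, which is one small reason why no single measure can
characterize model behavior on its own.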

     The measures and standards we suggest for use will almost certainly
change as experience improves our "collective judgment" about what
constitutes model acceptability and what does not.  Perhaps the number of
measures will increase to provide richer insight into model performance, or
perhaps the number will shrink without any loss of "information content."
Regardless of the list of measures and standards that ultimately emerges
for use, it is the conceptual structuring of the performance evaluation itself
that seems to be most important at this point.  We must identify clearly the
desirable model  attributes whose  presence  we are most  interested in detecting,
and we need to understand how we assess their relative importance, depending
on the issue we are addressing and the pollutant species we are considering.
This report offers a conceptual structure for "folding in" all these concerns
and suggests candidate measures and standards.

A.   OVERVIEW OF THE PROBLEM

     Air quality simulation models (AQSMs) are widely used as predictive
tools, estimating the impact on future air quality of alternative public
decisions.  Their predictions, however, are inherently nonverifiable.   Only
after the proposed action has been taken and the required implementation
time elapsed will measurement data confirm or refute the model's predictive
ability.

     Herein lies the dilemma faced by users of air quality models:   If a
model's predictions at some future time cannot be verified, on what basis
can we rely on that model to decide among policy alternatives?  In resolving
this dilemma, most users have adopted a pragmatic approach:  If a model can

                                   I-2

-------
 demonstrate  its  ability  to  reproduce a set of  "known" results for a similar
 type  of application,  then it  is judged an acceptable predictive tool.  It
 is  on this basis that model "verification" has become an essential prelude
 to  most modeling exercises.

      Several  investigators  (Calder, 1974, and Johnson, 1972) have objected
 to  this approach,  arguing that  it amounts to little more than "crude cali-
 bration."  They  suggest  that  true model validation can only be accomplished
 by  evaluating each component  sub-model--emissions, transport, or chemistry,
 for example.   While this may  be a scientifically  sound approach, there are
 so  many models available that it is difficult  to  complete  such efforts for
 them  all.  Worse,  the demand  for a model, truly validated  or not, often
 forces such  concerns  to  be  swept aside.  We take  a highly  pragmatic position
 in  this report,  one that is also consistent with  recommendations recently
 made  to and  by the U.S.  Environmental Protection  Agency [EPA] (Roth, 1977,
 and EPA, 1977).   Because verification is so often performed at the "output
 end"  (that  is, only model results are examined, comparing  them with "true"
 data), a systematic and  objective procedure is needed in assessing model
 performance  on that same basis.

      A further difficulty exists.   What  constitutes  a set  of "known" results?
This is not a problem easily solved.   For "answers" to be  known exactly,  the
"test" problem must be simple  enough to  be  solved  analytically.  Few problems
involving atmospheric dynamics are  so simple.   Most are  complex and nonlinear.
For those, the analytic test problem is  an  unacceptable  one.  Another, more
practical alternative often  is employed.   For  regional,  multiple-source
applications, the "known" results are taken  to be  the station measurements
of concentrations actually  recorded  on  a  "test" date.

      For source-specific applications,  the  source of interest may not yet
exist, permission for its construction  being the  principal  issue at hand.
For these applications, it  is  often  necessary  to  verify  a  model using the
most appropriate  of several  prototypical  "test cases."  Though  not existing
currently, these  could be assembled  from measurements taken at  existing
sources, the variety of source size, type and  location spanning the range of
values found in applications of interest.

                                   I-3

-------
      The term  "known"  is used imprecisely when referring to a set of measure-
ment data.   Station observations are subject to instrumentation error.  The
locations of fixed monitoring sites may not be sufficiently well distri-
buted spatially to record data fully characterizing  the  concentration
field and its peak value.  Nevertheless,  despite  those shortcomings,
"observed"  data often are regarded as  "true" data for the purpose of
model verification.

      In evaluating model  performance,  we  must  decide which performance
attributes  we most wish the  model  to possess.  Having assembled two sets
of data, one "known"  and the other "predicted," we can assess model perfor-
mance by comparing  one with  the  other.  Prediction and observation, however,
can be compared in  many ways.  We  must select  the quantities (performance
measures) that can most effectively test for the  presence of those attributes.

      Once we have decided on the performance measures best suited to our
 needs (and most feasible computationally),  we  can calculate these values.
 Having done so, however, we must ask a central question: How close must
 prediction be  to observation in  order for us to judge model performance
 as acceptable?  If we are to answer "how good  is  good,"  performance stand-
 ards for these measures must be  set, with allowable tolerances  (predicted
 values minus observed ones)  derived from a  reasonable rationale (health
 effects or pollution control cost considerations, for instance).
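
      The comparison of measure values with standards can be sketched as
 follows.  This is a minimal illustration only: the measure names, the
 measure values, and the tolerance numbers are hypothetical placeholders,
 not the standards recommended in this report, and a symmetric tolerance on
 (predicted minus observed) is assumed for simplicity.

```python
def evaluate(measures, standards):
    """Return {measure name: True/False} for every measure with a standard.

    A measure passes when its magnitude does not exceed the allowable
    tolerance.  Measures without a standard are advisory and are skipped.
    """
    return {
        name: abs(measures[name]) <= tolerance
        for name, tolerance in standards.items()
    }

# Hypothetical measure values (predicted minus observed, in pphm).
measures = {
    "peak_difference": -1.0,
    "mean_bias": 0.0,
    "mean_absolute_error": 1.0,
    "time_profile_plot": 0.0,  # qualitative aid, no formal standard
}

# Placeholder tolerances; real values would come from a stated rationale
# such as health effects or control cost considerations.
standards = {
    "peak_difference": 2.0,
    "mean_bias": 0.5,
    "mean_absolute_error": 0.8,
}

results = evaluate(measures, standards)
failed = [name for name, passed in results.items() if not passed]
```

      Here only the hypothetical gross-error measure exceeds its tolerance.
 Whether such a failure should lead to rejection of the model is a separate
 judgment that depends on the issue and the pollutant being considered.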

      By setting these standards explicitly, certain benefits may  be gained.
Among these are the following:

      >  A degree of uniformity is introduced in assessing model
         reliability.
      >  A rational  and objective basis is provided for  comparing
         alternative models.
      >  The impact  of limitations in both data gathering proce-
         dures and measurement network design can be made more
         explicit, facilitating any review of them that  may  be
         required.
                                    I-4

-------
      >  The performance expected of a model  is  stated  clearly,
         in advance of the expenditure of substantial analysis
         funds, allowing model  selection to be a more straight-
         forward and less "risky" process.
      >  The needs for additional research can be identified  clearly,
         with such efforts more directed in purpose.

B.    STRUCTURE OF THE REPORT

      The central purpose of this report is to suggest  means  for  setting  per-
formance standards for air quality dispersion models.   In doing so,  our dis-
cussion proceeds in two phases, the first exploring key elements  of  the over-
all problem, as well as their interactions, and the second synthesizing all
into a conceptual framework for model performance evaluation.

      We recognize three key elements of the performance assessment  problem,
all of which are interrelated:  the classes of issues addressed  by AQSMs  (air
quality maintenance planning or prevention of significant deterioration,  for
example),  the  types of AQSMs available for use (grid-based, trajectory, or
Gaussian models, for  instance) with the applications for which they are suit-
able,  and  the  classes of performance  measures that are candidates for our
use (two of which  are station  and exposure/dosage measures).

     We  consider each of these three  elements  in Chapters  III, IV,  and V,
providing  supporting  material  in Appendices  A,  B,  and  C.   In Chapter III,
we identify from current federal  law  and  regulations seven distinctly dif-
ferent types of air quality issues,  each  of  which may  be addressed  using an
AQSM.   In  Chapter IV, we assess  major model  classes, examining their capabil-
ities  and  limitations as well  as  their suitability for use in addressing
each of the generic classes of issues.   In Chapter V,  we discuss model per-
formance measures,  identifying four major types, which we  then assess  for
computational  feasibility and  suitability for use.

      We provide supplementary  detail  for these three chapters in  the first
 three  appendices.   In  Appendix A, we outline important  portions  of the  Code
                                    I-5

-------
of Federal  Regulations.   In  Appendix  B, we  describe  in  summary  form a number of
specific air quality models.   In Appendix C, we  examine at  length a variety
of specific model  performance  measures, discussing their computation and pro-
viding illustrative examples of their calculation.

     Having identified issues  (Chapter III),  issue/model combinations
(Chapter IV), and issue/model/measure associations  (Chapter V), we reach
the synthesis phase in Chapter VI.   Here we first identify  five desirable
attributes  of model performance.   Then we  recommend  a set of performance
measures suitable for use in determining  the presence or absence of each
attribute.   Each measure is  chosen based  on two criteria:  First, it is
an accurate indicator of the presence of a problem type and second, it is
quantitative (that is, amenable to having specific standards set).

     Having selected the performance measures for use,  we then  offer several
possible rationales for determining the range of their acceptable values.
We examine four rationales,  discussing each in detail in Appendix D.  Having
done so, we recommend standards for use.

     We also consider the way in which the relative importance  of the five
model performance attributes varies with the issue being addressed and the
pollutant being considered.   We recommend a means for ranking problem types
that is dependent on these factors, using it as a way to decide from among
procedural  alternatives when a model fails to display a particular attribute.

     To illustrate how to interpret the values of the recommended
performance measures, we discuss a sample case.  The sample case history is based
on the use of the grid-based SAI Airshed Model in modeling  the  Denver Met-
ropolitan region.  Supplementary means for gaining insight into model
behavior are also shown.

     Finally, a conceptual framework is suggested for a  draft model  perfor-
mance standard.  The elements it should contain are discussed,  as well  as
its  relationship to a supplementary guidelines document.
                                   I-6

-------
     With this final discussion, our presentation  is  complete,  though  the
subject itself is by no means exhausted.   Considerable additional  effort
is warranted, given the importance of this complex and difficult  topic.
We suggest in Chapter VII several areas in which we feel  such work would
prove fruitful.

-------
                               II   SUMMARY


     In this chapter we summarize the results of this study.   First,
we state them in overall terms.  Then, we summarize detailed  results
on a chapter-by-chapter basis.

A.   MAIN RESULTS

     Several main tasks are accomplished in this report.   These represent
the chief results of the study.  We summarize them as follows:

     >  A conceptual framework is set for objective evaluation of
        dispersion model performance (Chapter VI).
     >  An outline for a draft model performance standards document is
        suggested (Chapter VI).
     >  Specific measures are recommended for use (Chapter VI).
     >  Specific rationales on which standards could be based are
        developed, several of which represent research that is
        original with this study (Chapter VI and Appendix D).
     >  Comprehensive  background material  is presented on key elements
        of  the  performance evaluation  problem:  the  types of  issues to
        be  addressed  (Chapter  III and  Appendix A), the classes of
        models  to be used along with the applications for which they
        are  suited  (Chapter IV and Appendix B), and  the categories of
        performance measures available for consideration (Chapter V
        and  Appendix C).
     >  Guidance on the  interpretation of  performance measure values
        is  provided by means of an  illustrative sample case  (Chapter VI).

-------
B.   DETAILED SUMMARY

     Discussion in this report proceeds in two phases.   In the first of these,
we present a comprehensive examination of key elements  of the performance
evaluation problem.  This background phase consists of  the in-depth
analysis in Chapters III, IV and V, supported by material in Appendices
A, B and C.

     We intend the background phase of this report to be regarded  not as a
supplement but rather as an essential prelude to the second, or synthesis,
phase.  The second phase, contained in Chapter VI and Appendix D,  draws
from the background material to identify a set of performance criteria
that is both useful and computationally feasible.

     In this section we present detailed summaries of the important
results of the report.  We do so on a chapter-by-chapter basis.

1.   Summary of Chapter III (Issues)

     This chapter provides an issues framework within which the
application of air pollution models can be viewed.  First, an overview
is provided, highlighting important aspects of federal  air pollution
law (also see Appendix A).  By means of this discussion, seven generic
classes of issues are identified.  These issues are examined and
their implications for model applications explored.

     The seven issue classes,  divided into multiple-source and single-
source categories, are described as follows:

     >  Multiple-Source Issues
        -   SIP/C  (State Implementation Plan/Compliance).  The attainment
           of regional  compliance with NAAQS, as considered in the SIP.
        -   AQMP (Air Quality Maintenance Planning).  Regional main-
           tenance of compliance  with the NAAQS, as considered in
           the SIP.
     >   Single-Source  Issues
        -   PSD  (Prevention of  Significant Deterioration).  Limitation
           of the amount  by which  the air quality may be degraded in
           areas in attainment of  the NAAQS; this is considered in
           each SIP.
        -   NSR  (New Source Review).  Permit process by which applicants
           proposing new  or modified stationary  sources must demonstrate
           that both directly  and  indirectly caused emissions are
           within certain limits and that the  pollution control to
           be employed is performed with the best available tech-
           nology; this is considered in each  SIP.
        -   OSR  (Offset Rules).  Interpretive decision by which all
           new  or modified stationary sources  in urban areas currently
           in noncompliance with the NAAQS are judged unacceptable
           unless the  applicant can demonstrate  a plan for reducing
           emissions  in an existing source by  an amount greater
           than the emissions  from the  proposed  new sources; this
           decision has a strong impact on the stationary source
           permit process.
        -   EIS/R  (Environmental  Impact  Statement/Report).  A state-
           ment of  impact required for  major projects undertaken by
           the  federal government  or financed  by federal funds
           (EIS), or  a report of project  impact  required of public
           or private agencies by  state or  local statutes  (EIR).
        -   LIT  (Litigation).   Court suits brought to resolve disagree-
           ment over  any  of  the issues  mentioned above or to secure
           variances  waiving federal,  state  or local requirements.

2.   Summary of Chapter IV  (Models)

     In Chapter III,  we identified a  set  of  generic air  quality  issues.
In this chapter, we define a set of generic  model  types.  Having done  so,
we match the two,  identifying in generic  terms those  issues for which
each model may  be a  suitable analysis  tool.  We  also describe  the  technical
formulations and  underlying  assumptions employed in each generic model
type, indicating some key limitations.  Through this presentation, we
specify the relationship between generic issues, models, and the appli-
cations for which they are suitable.

     The generic classes of dispersion models that we consider are:

     >  Rollback
     >  Isopleth
     >  Physico-chemical
        -  Grid
           •  Region Oriented
           •  Specific Source Oriented
        -  Trajectory
           •  Region Oriented
           •  Specific Source Oriented
        -  Gaussian
           •  Long-Term Averaging
           •  Short-Term Averaging
        -  Box

     In Table II-l we associate generic model types with air quality issues
for which their use is most appropriate.  In Table II-2 we present model/
application combinations of interest, characterizing applications by five
attributes:  number of sources, area type, pollutant, terrain complexity,
and required resolution.  The table lists the values of the attributes that
can be accommodated by each model type.

     In Table II-3 we relate some specific air quality models to the generic
model categories in which they may be classified.  Each of these models is
described in detailed summary form in Appendix B.

3.   Summary of Chapter V (Performance Measures)

     In this chapter we discuss the types of performance measures available
for use, examining their relationship with both the issues


-------
TABLE  II-1.    AIR QUALITY ISSUES  COMMONLY ADDRESSED  BY  GENERIC  MODEL  TYPE

     Issue categories (columns):  SIP/C, AQMP, PSD, NSR, OSR, EIS/R, LIT

     Generic model types (rows):

        Refined Usage
        1.  Grid(1)
            a.   Region Oriented
            b.   Specific Source Oriented
        2.  Trajectory(1)
            a.   Region Oriented
            b.   Specific Source Oriented
        3.  Gaussian(3)
            a.   Short-Term Averaging
                 i)  Multiple Source
                 ii) Single Source
            b.   Long-Term Averaging
        Refined/Screening Usage
        4.  Isopleth(1,5)
        Screening Usage
        5.  Rollback
        6.  Box

        Notes:
             1.  Only short-term time scales can be considered (less than several days).
             2.  Regional impact of new sources can be assessed but not near-source, or microscale, effects.
             3.  Only non-reactive pollutants can be considered.
             4.  Only pollutants having long-term standards can be considered (SO2, TSP, and NO2).
             5.  Only photochemically active pollutants can be considered.

-------
                    TABLE II-2.    MODEL/APPLICATION  COMBINATIONS

REFINED USAGE

 Grid
   a.  Region Oriented
         Number of Sources:     Multiple-Source
         Area Type:             Urban, Rural
         Pollutant:             O3, HC, CO, NO2 (1-hour), SO2 (3- and 24-hour), TSP
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal, Spatial
   b.  Specific Source Oriented
         Number of Sources:     Single-Source
         Area Type:             Rural
         Pollutant:             O3, HC, CO, NO2 (1-hour), SO2 (3- and 24-hour), TSP
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal

 Trajectory
   a.  Region Oriented
         Number of Sources:     Multiple-Source
         Area Type:             Urban
         Pollutant:             O3, HC, CO, NO2 (1-hour), SO2 (3- and 24-hour), TSP
         Terrain Complexity:    Simple
         Required Resolution:   Temporal, Spatial (Limited)
   b.  Specific Source Oriented
         Number of Sources:     Single-Source
         Area Type:             Urban, Rural
         Pollutant:             O3, HC, CO, NO2 (1-hour), SO2 (3- and 24-hour), TSP
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal, Spatial (Limited)

 Gaussian
   a.  Long-Term Averaging
         Number of Sources:     Multiple-Source, Single-Source
         Area Type:             Urban, Rural
         Pollutant:             SO2 (Annual), TSP, NO2 (Annual)*
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Spatial
   b.  Short-Term Averaging
         Number of Sources:     Multiple-Source, Single-Source
         Area Type:             Urban, Rural
         Pollutant:             SO2 (3- and 24-hour), CO, TSP, NO2 (1-hour)*
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal, Spatial

REFINED/SCREENING USAGE

 Isopleth
         Number of Sources:     Multiple-Source
         Area Type:             Urban
         Pollutant:             (1-hour)
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal (Limited)

SCREENING USAGE

 Rollback
         Number of Sources:     Multiple-Source, Single-Source
         Area Type:             Urban, Rural
         Pollutant:             O3, HC, NO2, SO2, CO, TSP
         Terrain Complexity:    Simple, Complex (Limited)
 Box
         Number of Sources:     Multiple-Source
         Area Type:             Urban
         Pollutant:             O3, HC, CO, NO2 (1-hour), SO2 (3- and 24-hour), TSP
         Terrain Complexity:    Simple, Complex (Limited)
         Required Resolution:   Temporal

* Only if NO2 is taken to be total

-------
           TABLE II-3.   SOME  AIR QUALITY  MODELS

     Generic Model  Type               Specific Model Name

Refined Usage
  Grid
  a.  Region Oriented                  SAI, LIRAQ, PICK
  b.  Specific Source Oriented         EGAMA, DEPICT

  Trajectory
  a.  Region Oriented                  DIFKIN, REM, ARTSIM
  b.  Specific Source Oriented         RPM, LAPS

  Gaussian
  a.  Long-term Averaging              AQDM, CDM, CDMQC, TCM, ERTAQ*,
                                       CRSTER*, VALLEY*, TAPAS*
  b.  Short-term Averaging             APRAC-1A, CRSTER*, HANNA-GIFFORD,
                                       HIWAY, PTMTP, PTDIS, PTMAX, RAM,
                                       VALLEY*, TEM, TAPAS*, AQSTM,
                                       CALINE-2, ERTAQ*

Refined/Screening Usage
  Isopleth                             EKMA, WHITTEN

Screening Usage
  Rollback                             LINEAR ROLLBACK, MODIFIED ROLLBACK,
                                       APPENDIX J
  Box                                  ATDL

 * These models can be used for both long-term and short-term
   averaging.

-------
and the models we  identified in Chapters III and IV.  Our discussion
proceeds as follows:  We first identify generic types of performance
measures; we then  catalogue some specific performance measures
(describing them in detail in Appendix C); and finally we match
generic performance measures to the issue/model/application combin-
ations presented in earlier chapters.

     We consider four generic performance measure categories:  peak,
station, area, and exposure/dosage.  The first category contains
those measures deriving from the differences between the predicted and
observed concentration peak, its level, location and timing.  The second
category includes  measures based on concentration differences between
prediction and observation at specific measurement stations.  Within the
third category are contained those measures based on concentration
field differences  throughout a specified area.  The fourth category
includes measures  derived from differences in population exposure and
dosage within a specified area.

     Each of these generic performance measure categories requires
successively greater knowledge of the spatial and temporal  distribution
of concentrations.  We show in Figure II-1 a schematic representation of
several distinct levels of knowledge about regional concentrations.  A
similar schematic  illustration appropriate for source-specific situations
is shown in Figure II-2.  Listed in Table II-4 are the information require-
ments for the four categories.  We also consider the relative likelihoods
that reliable information will be available supporting calculation of measures
from each of the four categories.

     Three types of variations are recognized among performance measures:
scalar, statistical, and pattern recognition.  Those measures of the
first type are based on a comparison of the predicted and observed
values of a specific quantity:  the peak concentration level, for
instance.   Those of the second type compare the statistical behavior
(the mean,  variance, and correlation, for example) of the differences
between the predicted and observed values for the quantities of interest.


-------
FIGURE II-1.  VARIOUS LEVELS OF KNOWLEDGE ABOUT REGIONAL CONCENTRATIONS
(Schematic labels:  concentration peak; measurement station concentrations;
boundaries of modeling region.)

FIGURE II-2.  VARIOUS LEVELS OF KNOWLEDGE ABOUT SPECIFIC-SOURCE CONCENTRATIONS
(Schematic labels:  measurement station concentrations; ground-level
concentration peak; concentration field C(x,y,t).)

-------
                TABLE II-4.  GENERIC PERFORMANCE MEASURE
                            INFORMATION REQUIREMENTS

    Generic
  Performance
 Measure Type                     Information Required

Peak                 Predicted and measured concentration peak (level,
                     location, and time)

Station              Predicted and measured concentrations at specific
                     stations (temporal history)

Area                 Predicted and measured concentration field within
                     a specified area (spatial and temporal history),
                     i.e., C(x,y,t)pred and C(x,y,t)meas

Exposure/dosage      Both the predicted and measured concentration
                     field and the predicted and actual population
                     distribution within a specified area (spatial
                     and temporal history)

-------
Measures of the final type are useful in triggering "pattern recognition,"
that is, providing qualitative insight into model behavior, transforming
concentration "residuals" (the differences between predicted and observed
values) into forms that highlight certain aspects of model performance.

      To illustrate  the types of variations found in each generic
 performance measure category, we present  Table II-5.  Some typical
 examples  are included  for  each  category/variation  combination.  In
 Section D of this chapter, a number of specific performance measures
 are listed.  Examined in detail  in Appendix  C, they are classified
 according to the scheme presented here.

      For reasons we examine in  this chapter, performance measures may
 be associated with  the issue classes.   We match issue with measure  in
 Table II-6, indicating where their calculation might be of use.  Note
 that NSR and PSD are both  part  of the preconstruction review  process
 for a new source.

       Also,  we may  match measures  to model type, as  is  shown  in Table II-7.
  This  we  do based on differences among model  types in their ability to cal-
  culate each of  the measure types.   Isopleth,  rollback  and box models, for
  instance,  provide  insufficient  spatial resolution for calculation of station,
  area  or  exposure/dosage measures.   Likewise,  long-term averaging Gaussian
  models lack sufficient temporal resolution  to permit calculation of exposure/
  dosage measures.

       Several  important conclusions are reached in this chapter about the
  suitability for use of each of the four  measure types:

       >  Performance measures relying on  a comparison of the
          predicted  and "true" peak concentrations  may not be
          reliable in all circumstances, since measurement networks
          can provide only  the concentration  at the station record-
          ing the highest value,  not necessarily the value at  the
          "true"  peak.


-------
TABLE II-5.  TYPES OF VARIATIONS AMONG GENERIC PERFORMANCE MEASURE CATEGORIES

Generic Performance    Types of
  Measure Category     Variations       Typical Example

Peak                   Scalar           Concentration residual* at the peak.
                       Pattern          Map showing locations and values of maximum
                       Recognition      one-hour-average concentrations for each hour.

Station                Scalar           Concentration residual at the station measuring
                                        the highest value.
                       Statistical      Expected value, variance and correlation coef-
                                        ficient of the residuals for the modeling day
                                        at a particular measurement station.
                       Pattern          At the time of the peak (event-related), the
                       Recognition      ratio of the residual at the station having
                                        the highest value to the average of the resi-
                                        duals at the other station sites (this can
                                        indicate whether the model performs better near
                                        the peak than it does throughout the rest of
                                        the modeled region).

Area                   Scalar           Difference in the fraction of the modeled area
                                        in which the NAAQS are exceeded.
                       Statistical      At the time of the peak, differences in the
                                        area/concentration frequency distribution.
                       Pattern          For each modeled hour, isopleth plots of the
                       Recognition      ground-level residual field.

Exposure/dosage        Scalar           Differences in the number of person-hours of
                                        exposure to concentrations greater than the
                                        NAAQS.
                       Statistical      Differences in the exposure concentration fre-
                                        quency distribution.
                       Pattern          For the entire modeled day, an isopleth plot
                       Recognition      of the ground-level dosage residuals.

 * Residual:  The difference between "predicted" and "observed."

-------
TABLE  II-6.  PERFORMANCE MEASURES COMMONLY ASSOCIATED WITH SPECIFIC ISSUES

                               Performance Measure Type
     Issue                Peak    Station    Area    Exposure/Dosage

     Multiple-source
       SIP/C                X        X         X            X
       AQMP                 X        X         X
     Specific-source
       PSD                  X        X         X
       NSR                  X        X         X
       OSR                           X         X
       EIS/R                X        X         X            X
       LIT                  X        X         X

  TABLE  II-7.  PERFORMANCE MEASURES THAT CAN BE CALCULATED BY EACH MODEL TYPE

     Performance measure types (columns):  Peak, Station, Area, Exposure/Dosage

     Model types (rows):

       Refined usage
         Grid
           Region oriented
           Specific source oriented
         Trajectory
           Region oriented
           Specific source oriented
         Gaussian
           Long-term averaging
           Short-term averaging
       Refined/screening usage
         Isopleth
       Screening usage
         Rollback
         Box

-------
    >  Performance measures relying on a comparison of the
       predicted and "true" concentration fields may not be
       computationally feasible since neither predicted nor
       "true" concentration fields are always resolvable,
       spatially or temporally.
     >  Performance measures based upon a comparison of predicted
       and "true" exposure/dosage, though they are appealing
       because  of their ability to serve as  surrogates for the
       health effects  experienced  by the populace, may not be
       computationally feasible because of the difficulty in
       measuring the  "true"  population  distribution and the
       "true" concentration  field.   (We do suggest in Chapter VI
       and Appendix D, however, one  means by which health effects
        considerations can be accounted  for implicitly.)
     > Performance measures  based  upon  a comparison of the
       predicted and  observed concentrations at  station sites
        in the measurement network  may be of  the  greatest practical
       value.

4.   Summary of Chapter VI (Performance  Standards)

     The  central purpose of this report  is to suggest means for  setting
performance standards for air quality dispersion  models.   In  this
chapter we reach this goal.  Our discussion proceeds as follows:   First
we identify five key attributes of desirable  model  performance,  evaluating
how their relative importance varies depending  on the  issue addressed and
the pollutant/averaging time considered; then we propose  specific  perfor-
mance measures appropriate for use in testing for the  presence of  these
attributes; and finally we suggest rationales on which to base the setting
of formal standards.  Having recommended for use a list of performance mea-
sures and standards, we deal with two additional issues:   interpretation
of the values of the measures, which we illustrate by means of a sample
case study, and promulgation of formal performance criteria,  which we
explore by proposing an outline of a draft standard.

-------
     The five attributes of desirable model  performance are defined  as
follows:  accuracy of the peak prediction, absence of systematic  bias,
lack of gross error, temporal  correlation, and spatial alignment.  Though
they are interrelated, each of the five performance attributes is  distinct.
Consequently, we must employ different kinds of performance measures to
determine the presence or absence of each.  We list in Table II-8 the
objectives of each type of performance measure.

                TABLE  II-8.  PERFORMANCE MEASURE OBJECTIVES
    Performance
    Attribute              Objective of  Performance Measures
 Accuracy  of  the       Assess  the model's  ability  to  predict the concentra-
 peak prediction       tion  peak  (its  level, timing and  location)
 Absence of            Reveal  any  systematic bias  in  model  predictions
 systematic bias
 Lack of gross          Characterize the error  in model  predictions both at
 error                 specific monitoring stations and  overall
 Temporal               Determine  differences between  predicted and observed
 correlation            temporal behavior
 Spatial alignment     Uncover spatial  misalignment between the predicted
                       and observed concentration  fields

     We clarify the difference between bias and error by means of the
following example.  Suppose when we compare a set of model predictions
with station observations, we  find several large positive residuals  (pre-
dicted  minus observed  concentrations)  balanced by several equally large
negative  residuals.   If we were testing for bias, we would allow the
oppositely signed  residuals to cancel.  A conclusion that the model  dis-
played  no systematic  bias therefore might be  a justifiable one.  On  the
other hand, were we testing for gross  error,  the signs of the residuals
would not be considered, with  oppositely signed residuals no longer allowed
to  cancel.  Because the absolute  value of the residuals is large in  our
example,  we might  well conclude that  the  model predictions are subject to
significant  gross  error.
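
     The distinction drawn above can be illustrated with a short numerical
sketch.  The residual values below are hypothetical, chosen only so that
large positive and negative residuals balance; the function names are
ours and do not correspond to any measure defined in this report.

```python
# Hypothetical residuals (predicted minus observed), arbitrary concentration units.
residuals = [40.0, -40.0, 35.0, -35.0, 50.0, -50.0]


def mean_bias(res):
    """Signed mean of the residuals: oppositely signed values cancel."""
    return sum(res) / len(res)


def mean_gross_error(res):
    """Mean absolute residual: signs are ignored, so nothing cancels."""
    return sum(abs(r) for r in res) / len(res)


bias = mean_bias(residuals)
gross_error = mean_gross_error(residuals)

# The signed mean is zero even though every individual residual is large.
print(f"bias = {bias:.1f}, gross error = {gross_error:.1f}")
```

     In this sketch the test for bias finds nothing, while the test for
gross error reports a mean absolute residual of roughly 42 units.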

-------
     Which of these performance attributes, however, is most  important?
This question has no unique answer, the relative importance of each
attribute depending on the type of issue the model is being used  to  address
and the type of pollutant under consideration.  In order to relate attri-
bute importance to application issue in a more convenient manner, we pre-
sent in Table II-9 a matrix of generic issues (as defined earlier in this
report) and problem type.  For each combination we indicate an "importance
category."  We define the three categories based on how strongly  we  insist
that model performance be judged acceptable for the given problem type.
For Category  1, we require that the performance attribute must be present
 (the problem  type is of prime  importance).  For Category 2, the attribute
should be present but, if  it is not, some leeway ought to be  allowed, per-
 haps at the discretion of  a reviewer (although the attribute  is of consider-
 able importance,  some degree of "mismatch" may be tolerable).   For Category 3,
we are not insistent that  the  performance attribute be present, though we
 state  that as being a desirable objective (the attribute is not of central
 importance).   The reasoning behind the entries in this table  is complex.
 For this reason,  we urge the reader to consult the detailed discussion in
 Chapter VI, Section C.

       TABLE  II-9.   IMPORTANCE OF PERFORMANCE ATTRIBUTES BY ISSUE

                             Importance of Performance Attribute*
 Performance Attribute   SIP/C   AQMP   PSD   NSR   OSR   EIS/R   LIT
 Accuracy of the peak      1       1     1     1     2      1      1
 prediction
 Absence of systematic     1       1     1     1     1      1      1
 bias
 Lack of gross error       1       1     1     1     1      1      1
 Temporal correlation      2       2     3     3     3      3      3
 Spatial alignment         2       2     1     3     3      3      3
 * Category 1 - Performance standard must always be satisfied.
   Category 2 - Performance standard should be  satisfied, but some leeway
                may be allowed at the discretion of a  reviewer.
   Category 3 - Meeting the performance  standard is desirable but failure
                is not sufficient to reject the model; measures dealing
                with this problem should be regarded as  "informational."

-------
     The relative importance of each performance attribute also is dependent
on the type of pollutant being considered and the averaging time required
by the NAAQS.  If a species is subject to a short-term standard, for
instance, model peak accuracy and temporal correlation might be of con-
siderable concern, depending on the issue being addressed.  However, if
the species is subject to a long-term standard, neither of these is of
appropriate form.  We indicate in Table II-10 a matrix of the problem types
and pollutant species.  We rank each combination by the same importance
categories we used earlier in Table II-9.

     Conceivably, a conflict might exist between the ranking indicated
by the issue and the pollutant matrices in Tables II-9 and II-10.  We
would resolve the conflict in favor of the less stringent of the two
rankings.
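
     The conflict-resolution rule just stated can be sketched as a simple
lookup.  The issue rankings below are transcribed from Table II-9; the
function name, the short attribute labels, and the treatment of a "not
applicable" pollutant entry (passed as None) are our own assumptions for
illustration.  Because Category 1 is the most stringent and Category 3
the least, "less stringent" corresponds to the larger category number.

```python
ISSUES = ["SIP/C", "AQMP", "PSD", "NSR", "OSR", "EIS/R", "LIT"]

# Importance categories by issue, transcribed from Table II-9.
ISSUE_RANK = {
    "peak accuracy":        [1, 1, 1, 1, 2, 1, 1],
    "systematic bias":      [1, 1, 1, 1, 1, 1, 1],
    "gross error":          [1, 1, 1, 1, 1, 1, 1],
    "temporal correlation": [2, 2, 3, 3, 3, 3, 3],
    "spatial alignment":    [2, 2, 1, 3, 3, 3, 3],
}


def resolve_category(attribute, issue, pollutant_category):
    """Return the less stringent of the issue and pollutant rankings.

    pollutant_category is the ranking read from the pollutant matrix
    (1, 2 or 3), or None where the attribute is not applicable
    (an assumed convention, not one stated in the report).
    """
    if pollutant_category is None:
        return None
    issue_category = ISSUE_RANK[attribute][ISSUES.index(issue)]
    # Larger category number = less stringent requirement.
    return max(issue_category, pollutant_category)


# Spatial alignment is Category 1 for PSD; if the pollutant matrix ranks
# it as Category 2, the less stringent ranking governs.
print(resolve_category("spatial alignment", "PSD", 2))
```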

     Having identified the problem types of interest, we then suggest
specific performance measures for use.  Our recommended choice of perfor-
mance measures is based upon the following criteria:

      >   The measure is an accurate indicator of the presence of a
         given problem type.
      >   The measure is of the "absolute" kind, that is,  specific
         standards can be set.
      >  Only station measures should be considered for use in
         setting standards.*  (This is more an  "unavoidable" choice
         than a "preferred" one.)

      Based on these criteria, we recommend the set of measures described
 in Table II-11.   The use of ratios ($C_{pp}/C_{pm}$ and $\bar{\mu}$, for example)  can  intro-
 duce difficulties:   They can become unstable at low concentrations,  and  the
 statistics of a ratio of two random variables can  become  troublesome.  Never-
 theless, when used properly their advantages can be offsetting.   For example,
 the use of $C_{pp}/C_{pm}$ (instead of $C_{pp} - C_{pm}$)  permits a health effects  rationale  to  be
 used in recommending a performance standard (see a  later  discussion).

*Note the caveat on  pages VI-18 and VI-19, with respect to point source applications.

-------
     We draw a distinction between those measures that are of general
use in examining model performance and the much smaller subset of measures
that are most amenable to the establishment of explicit standards.  Many
measures can provide rich insight into model behavior, but the informa-
tion is conveyed in a qualitative way not suitable for quantitative
characterization (a requisite for use in setting performance standards).
TABLE II-10.    IMPORTANCE OF  PERFORMANCE ATTRIBUTES BY POLLUTANT AND AVERAGING TIME

                                           Importance of Performance Attribute*

                                       Pollutants with Short-term Standards                          Pollutants with Long-term Standards
 Performance        O3         CO**      NMHC      SO2       NO2†      CO        TSP**      SO2         NO2       TSP       SO2
 Attribute          (1 hour)‡  (1 hour)  (3 hour)  (3 hour)            (8 hour)  (24 hour)  (24 hour)   (1 year)  (1 year)  (1 year)

 Accuracy of the    1          1         1         1         1         1         1          1           3         3         3
 peak prediction
 Absence of         1          1         1         1         1         1         1          1           1         1         1
 systematic bias
 Lack of gross      1          1         1         1         1         1         1          1           1         1         1
 error
 Temporal           1          2         2         2         1         2         3          3           N/A††     N/A††     N/A††
 correlation
 Spatial            1          2         2         2         1         2         2          2           2         2         2
 alignment

    * Category 1 - Performance standard must be satisfied.
      Category 2 - Performance standard should be satisfied, but some leeway may be allowed at the discretion of a reviewer.
      Category 3 - Meeting the performance standard is desirable but failure is not sufficient to reject the model.
    † No short-term NO2 standard currently exists.
    ‡ Averaging times required by the NAAQS are in parentheses.
   ** Primary standards.
   †† The performance attribute is not applicable.

-------
TABLE II-11.  MEASURES RECOMMENDED FOR USE IN SETTING MODEL PERFORMANCE STANDARDS†

    Performance
     Attribute                      Performance Measure

Accuracy of the      Ratio of the predicted station peak to the measured station
peak prediction      peak (could be at different stations and times)

                          $C_{p_p} / C_{p_m}$

                     Difference in timing of occurrence of station peak*

Absence of           Average value and standard deviation of the mean deviation
systematic bias      about the perfect correlation line normalized by the average
                     of the predicted and observed concentrations, calculated for
                     all stations during those hours when either the predicted or
                     the observed values exceed some appropriate minimum value
                     (possibly the NAAQS)

                          $(\bar{\mu}, \sigma_{\bar{\mu}})_{OVERALL}$

Lack of gross        Average value and standard deviation of the absolute devia-
error                tion about the perfect correlation line normalized by the
                     average of the predicted and observed concentrations, calcu-
                     lated for all stations during those hours when either the
                     predicted or the observed values exceed some appropriate
                     minimum value (possibly the NAAQS)

                          $(|\bar{\mu}|, \sigma_{|\bar{\mu}|})_{OVERALL}$

Temporal cor-        Temporal correlation coefficients at each monitoring station
relation*            for the entire modeling period and an overall coefficient
                     averaged for all stations

                          $r_{t_i}$, $r_{t_{OVERALL}}$

                     for $1 \le i \le M$ monitoring stations

Spatial alignment    Spatial correlation coefficients calculated for each modeling
                     hour considering all monitoring stations, as well as an over-
                     all coefficient averaged for the entire day

                          $r_{x_j}$, $r_{x_{OVERALL}}$

                     for $1 \le j \le N$ modeling hours

  * These measures are appropriate when the chosen model is used to consider questions
    involving photochemically reactive pollutants subject to short-term standards.

  † There is deliberate redundancy in the performance measures.  For example, in
    testing for systematic bias, $\bar{\mu}$ and $\sigma_{\bar{\mu}}$ are calculated.  The latter quantity
    is a measure of "scatter" about the perfect correlation line.  This is also an
    indicator of gross error and could be used in conjunction with $|\bar{\mu}|$ and $\sigma_{|\bar{\mu}|}$.
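
The bias and gross-error measures in the table above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the "deviation about the perfect correlation line" is taken here to be the simple difference between prediction and observation, the sample standard deviation is used, and the function name `bias_and_gross_error` and its `minimum_value` argument are hypothetical names introduced for illustration.

```python
import statistics


def bias_and_gross_error(predicted, observed, minimum_value):
    """Summarize normalized deviations for prediction-observation pairs.

    Assumption: the deviation is (prediction - observation), normalized by
    the average of the two values. Only pairs in which either value exceeds
    `minimum_value` (for example, the NAAQS) are retained.

    Returns ((mean, std) of the signed deviation,
             (mean, std) of the absolute deviation).
    """
    deviations = []
    for p, o in zip(predicted, observed):
        if p > minimum_value or o > minimum_value:
            deviations.append((p - o) / ((p + o) / 2.0))
    if len(deviations) < 2:
        raise ValueError("need at least two qualifying pairs")
    absolute = [abs(d) for d in deviations]
    bias = (statistics.mean(deviations), statistics.stdev(deviations))
    gross = (statistics.mean(absolute), statistics.stdev(absolute))
    return bias, gross


if __name__ == "__main__":
    # Hypothetical concentrations (any consistent unit); threshold is illustrative.
    pred = [10.0, 12.0, 2.0, 6.0]
    obs = [8.0, 12.0, 1.0, 10.0]
    print(bias_and_gross_error(pred, obs, minimum_value=5.0))
```

A signed mean near zero together with a large absolute mean would indicate little systematic bias but substantial gross error, which is why the two sets of measures are reported together.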

-------
These "measures," often involving graphical display, really are tools
for use in "pattern recognition."  They display model behavior in
suggestive ways, highlighting  "patterns" whose presence reveals much
about model performance.  Several examples of such  "measures" are
isopleth contour maps of predicted concentrations and of estimates of
"observed" ones, isopleth contour maps of the differences between the
two, and time histories of  predicted and observed concentrations at
specific monitoring  stations.

     Although we focus on station measures for use in setting model per-
formance standards, we do not suggest that the calculation of performance
measures be limited to such measures.  Many other measures should be used
where appropriate.  The data should be viewed in as many varied ways as
possible in order to enrich insight into model behavior.  We suggest a
number of useful measures both in Chapter V and Appendix C.

     Having identified specific measures for use, we consider four rationales
for setting appropriate standards.  The rationales, along with a statement
of their guiding principles, are shown in Table II-12.  We discuss each in
detail in Appendix D.

     The four rationales differ in their ability to consider each of the
five problem types.  Shown in Table II-13 are the types of problems
addressable by measures whose standards are set by each of the rationales.
Only the Pragmatic/Historic rationale is of use in addressing all problem
types; the other three are of use principally in defining the level of
performance required in predicting values at or near the concentration
peak.  In Table II-14 we associate each rationale with those issues
for which its use is appropriate.

     We select from among the alternative rationales as follows.
Hoping to avoid introducing a procedural bias, we first eliminate the
Guaranteed Compliance rationale from further consideration.  Then,
because the Health Effects rationale is better suited for use in setting

-------
          TABLE II-12.  POSSIBLE RATIONALES FOR SETTING MODEL
                        PERFORMANCE STANDARDS

      Rationale                        Guiding Principle

Health Effects           The metric of concern is the area-integrated cumulative
                         health effects due to pollutant exposure; the ratio of
                         the metric's value based on prediction to its value
                         based on observation must be kept to within a
                         prescribed tolerance of unity.

Control Level            Uncertainty in estimates of the percentages of
Uncertainty              emissions control required must be kept within
                         certain allowable bounds.

Guaranteed Compliance    Compliance with the NAAQS must be "guaranteed";
                         all uncertainty must be on the conservative side
                         even if this approach means introducing a systematic
                         bias.

Pragmatic/Historic       In each new application, a model should perform
                         at least as well as the "best" previous performance
                         of a model in its generic class in a similar
                         application; until such a historical data base is
                         complete, other more heuristic approaches may be
                         applied.

-------
            TABLE II-13.   PERFORMANCE  ATTRIBUTES ADDRESSABLE  USING
                          PERFORMANCE  STANDARD RATIONALES
Performance
Attribute
Accuracy of the
peak prediction
Absence of
systematic bias
Lack of gross
error
Temporal
correlation
Spatial alignment
Health*
Effects
X



X




Control Level* Guaranteed
Uncertainty Compliance
X X



X




Pragmatic/
Historic
X

X

X

X

X
  * These are most suited for photochemically reactive pollutants subject
    to short-term standards.
         TABLE II-14.  ASSOCIATION OF RATIONALES WITH GENERIC ISSUES
                                           Issue Category
   Rationale

Health Effects

Control Level
Uncertainty

Guaranteed
Compliance

Pragmatic/
Historic
Multiple-Source
SIP/C
X
X
AQMP
X
X
PSD
X
X
Specific-Source
NSR
X
X
OSR EIS/R
X
X
LIT
X
X
X


X
X


X
X


X

-------
standards for peak measures, we choose to use it only in that way.  As is
clear from Table II-13, we presently have no alternative but to apply
the Pragmatic/Historic rationale for those measures designed to test
for systematic bias and gross error as well as to evaluate temporal
correlation and spatial alignment.

     Where we invoke the Pragmatic/Historic rationale  as  justification
for selecting specific standards, we also state the specific guiding
principles we follow.  We summarize those here:

     >  When the pollutant being considered is subject to a short-
        term standard, the timing of the concentration peak may  be
        an important quantity for a model to predict.   This is parti-
        cularly true when the pollutant is also photochemically
        reactive.   We state as a guiding principle:  "For photochem-
        ically reactive pollutants, the model must reproduce  reason-
        ably well  the phasing of the peak."  For ozone an acceptable
        tolerance for peak timing might be ±1 hour.
     >  The model  should not exhibit any systematic bias  at concen-
        trations at or above some appropriate minimum  value (possibly
        the NAAQS) greater than the maximum resulting  from EPA-allowable
        calibration error in the air quality monitors.  We would
        consider in our calculations any prediction-observation pair
        in which either of the values exceeds the pollutant standard.
     >  Error (as measured by its mean and standard deviation)  should  not
        be significantly different from the distribution of differences
        resulting from the comparison of an EPA-acceptable monitor
        with an EPA reference monitor.  The EPA has set maximum
        allowable limits on the amount by which a monitoring technique
        may  differ from a reference method  (40 CFR § 53.20).   An "EPA-
        acceptable monitor"  is defined here to be one that differs from
        a  reference monitor  by up  to  the maximum allowable amount.
     >  Predictions and observations should appear to be highly cor-
        related at a 95 percent confidence level, both when compared
        temporally and spatially.  We can estimate the minimum allow-
        able value for the respective correlation coefficient by using
        a t-statistic at the appropriate percentage level and having
        the degrees of freedom appropriate for the number of prediction-
        observation pairs.
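
The last principle can be made concrete with a small Python sketch that converts a confidence level into a minimum allowable correlation coefficient. It rests on assumptions not stated in the text: the usual relation $r = t / \sqrt{t^2 + \nu}$ with $\nu = n - 2$ degrees of freedom for $n$ prediction-observation pairs, a two-sided test by default, and a t quantile obtained numerically with the standard library. The function names and the sample sizes in the usage lines are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2.0 * math.log1p(x * x / df))


def t_cdf(x, df, step=0.001):
    """CDF for x >= 0, using Simpson's rule on [0, x] plus the lower half (0.5)."""
    if x == 0.0:
        return 0.5
    n = 2 * max(50, math.ceil(x / step / 2.0))  # even number of intervals
    h = x / n
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + total * h / 3.0


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def min_significant_r(n_pairs, confidence=0.95, two_sided=True):
    """Smallest correlation coefficient significant at `confidence` for n pairs."""
    df = n_pairs - 2
    if df < 1:
        raise ValueError("need at least three prediction-observation pairs")
    alpha = 1.0 - confidence
    p = 1.0 - alpha / 2.0 if two_sided else 1.0 - alpha
    t = t_quantile(p, df)
    return t / math.sqrt(t * t + df)


if __name__ == "__main__":
    # Hypothetical sample sizes, chosen only for illustration.
    for n in (8, 14, 24):
        print(n, round(min_significant_r(n), 3))
```

The threshold falls as the number of pairs grows, so a correlation that is not significant for a handful of stations may be significant for a full day of hourly values.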

      The  guiding principles noted above are plausible ones, though in
 some cases  they are  arbitrary.  As a "verification  data base" of
 experience  is  assembled,  historically achieved  performance levels may
 be better indicators of  the expected level  of model  performance.
 Standards derived on this more pragmatic basis  may  supplant those
 deriving  from the "guiding  principles" followed in  this report.

      Our  recommended choice for use, when possible,  in establishing peak-
 accuracy  standards is a  composite one, combining the Health Effects and
 Control Level  Uncertainty rationales.   Were a model to overpredict the
 peak, a control strategy  based on its  prediction might be expected
 to abate  the health  impact  actually occurring,  though with more control
 than actually needed.   If the model  underpredicted, however, the control
 strategy  might be "underdesigned," with the risk existing that some of the
 health impact might  remain  unabated even after  control implementation.
 The penalty,  in a health  sense, is incurred only when the model underpre-
 dicts. The Health Effects  rationale then is one-sided, helping us set
 performance standards only  on the "low side."

      On the other hand, the Control  Level Uncertainty rationale is
 bounded "above" and  "below",  that is,  its use provides a tolerance
 interval about  the value  of the measured peak concentration.  For a
 model  to be judged acceptable under  this criterion, its prediction of
 the peak concentration would  have to fall within this interval.  Model
 underprediction could lead to control levels lower than required, leaving
 residual health risks.  Overprediction, on the other hand, could lead
 to abatement strategies posing little  or no health risk but incurring
control costs greater than  required.

-------
     For the above reasons, we suggest that the Control  Level  Uncer-
tainty rationale be used to establish an upper bound (overprediction)
on the acceptable difference between the predicted and observed peak.
We would choose the lower bound (underprediction) to be the interval
that is the minimum of that suggested by  the Health Effects and
Control Level Uncertainty rationales.
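
The composite rule described above can be sketched in a few lines of Python. This is an illustration only: the tolerances are expressed as fractions of the observed peak, the function names are hypothetical, and the numerical values in the usage lines are made up rather than taken from Appendix D.

```python
def composite_peak_bounds(he_under, clu_under, clu_over):
    """Combine the two rationales into one acceptance interval for the
    ratio of predicted to observed peak.

    he_under  - allowable underprediction under the Health Effects rationale
    clu_under - allowable underprediction under Control Level Uncertainty
    clu_over  - allowable overprediction under Control Level Uncertainty
    All three are fractions of the observed peak (assumed convention).
    """
    # Lower bound: the smaller (stricter) of the two underprediction intervals.
    allowed_under = min(he_under, clu_under)
    # Upper bound: set by the Control Level Uncertainty rationale alone.
    return 1.0 - allowed_under, 1.0 + clu_over


def peak_is_acceptable(predicted_peak, observed_peak, bounds):
    """True if the predicted/observed peak ratio lies within the bounds."""
    lower, upper = bounds
    return lower <= predicted_peak / observed_peak <= upper


if __name__ == "__main__":
    # Hypothetical tolerances: 20% (HE), 35% (CLU) under, 50% (CLU) over.
    bounds = composite_peak_bounds(0.20, 0.35, 0.50)
    print(bounds, peak_is_acceptable(0.9, 1.0, bounds))
```

Because the lower bound takes the minimum of the two intervals, the resulting tolerance is asymmetric whenever the health-based limit on underprediction is tighter than the control-cost limit.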

     We list our recommendations in Table II-15, noting the possibility
that the recommended rationales may not be appropriate in all  applications
for all pollutants.   Whether health effects would be an appropriate con-
sideration when considering TSP, for instance, is unclear.  The Health
Effects rationale, as defined in Appendix D, is best suited for use in
urban applications involving short-term, reactive pollutants.   In those
circumstances when the HE or CLU rationales are not suitable,  we suggest
the Pragmatic/Historic rationale.

     We summarize in Table II-16 our list of recommended performance
measures and standards.  In it, we associate performance attribute
and standard.  To further describe the standard, we state the type of
rationale used and the guiding principle followed, as well as providing
sample values that are appropriate for the sample case we consider
in this chapter.

     We also discuss two supplementary subjects.  First, we illustrate
how performance measure values may be  interpreted by describing a
sample case based on use of the SAI Airshed Model in simulating the
Denver Metropolitan region.  Then, we consider means by which model
performance criteria may be promulgated, suggesting an outline for a
draft standard.

     Thus we conclude this chapter and the report.  We note in closing
that the performance subject itself  is by no means exhausted.   Many
areas remain to be explored in greater detail, all warranting considerable
additional effort.

-------
         TABLE II-15.  RECOMMENDED RATIONALES FOR SETTING STANDARDS

   Performance
    Attribute                        Recommended Rationale

Accuracy of peak    Health Effects* (lower side/underprediction)
prediction          Control Level Uncertainty* (upper side/overprediction)

Absence of          Pragmatic/Historic
systematic bias

Lack of gross       Pragmatic/Historic
error

Temporal cor-       Pragmatic/Historic
relation

Spatial alignment   Pragmatic/Historic

* These may not be appropriate for all regulated pollutants in all applica-
  tions.  When they are not, the Pragmatic/Historic rationale should be
  employed.  They are most applicable for photochemically reactive pol-
  lutants subject to a short-term standard (O3, and NO2 if a 1-hour
  standard is set).

-------
        TABLE II-16.  SUMMARY OF RECOMMENDED PERFORMANCE MEASURES AND STANDARDS

Performance Attribute:  Accuracy of the peak prediction

  (a) Performance Measure:  Ratio of the predicted station peak to the measured
                            station peak (could be at different stations and
                            times), $C_{p_p}/C_{p_m}$
      Performance Standard
        Type of Rationale:  Health Effects† (lower side) combined with Control
                            Level Uncertainty (upper side)
        Guiding Principle:  Limitation on uncertainty in aggregate health
                            impact and pollution abatement costs†
        Sample Value
        (Denver Example):   $80 \le C_{p_p}/C_{p_m} \le 150$ percent

  (b) Performance Measure:  Difference in timing of occurrence of station peak*
      Performance Standard
        Type of Rationale:  Pragmatic/Historic
        Guiding Principle:  Model must reproduce reasonably well the phasing
                            of the peak, say, ±1 hour
        Sample Value
        (Denver Example):   ±1 hour

Performance Attribute:  Absence of systematic bias

      Performance Measure:  Average value and standard deviation of mean
                            deviation about the perfect correlation line
                            normalized by the average of the predicted and
                            observed concentrations, calculated for all
                            stations during those hours when either predicted
                            or observed values exceed some appropriate minimum
                            value (possibly the NAAQS),
                            $(\bar{\mu}, \sigma_{\bar{\mu}})_{OVERALL}$
      Performance Standard
        Type of Rationale:  Pragmatic/Historic
        Guiding Principle:  No or very little systematic bias at concentrations
                            (predictions or observations) at or above some
                            appropriate minimum value (possibly the NAAQS); the
                            bias should not be worse than the maximum bias
                            resulting from EPA-allowable monitor calibration
                            error (-8 percent is a representative value for
                            ozone); the standard deviation should be less than
                            or equal to that of the difference distribution of
                            an EPA-acceptable monitor** compared with a
                            reference monitor (3 pphm is representative for
                            ozone at the 95 percent confidence level)
        Sample Value
        (Denver Example):   No apparent bias at ozone concentrations above
                            0.06 ppm (see Table VI-12 and Figures VI-5 and
                            VI-6 for further details)

Performance Attribute:  Lack of gross error

      Performance Measure:  Average value and standard deviation of absolute
                            mean deviation about the perfect correlation line
                            normalized by the average of the predicted and
                            observed concentrations, calculated for all
                            stations during those hours when either predicted
                            or observed values exceed some appropriate minimum
                            value (possibly the NAAQS),
                            $(|\bar{\mu}|, \sigma_{|\bar{\mu}|})_{OVERALL}$
      Performance Standard
        Type of Rationale:  Pragmatic/Historic
        Guiding Principle:  For concentrations at or above some appropriate
                            minimum value (possibly the NAAQS), the error (as
                            measured by the overall values of $|\bar{\mu}|$ and
                            $\sigma_{|\bar{\mu}|}$) should be indistinguishable
                            from the difference resulting from comparison of an
                            EPA-acceptable monitor with a reference monitor
        Sample Value
        (Denver Example):   No excessive gross error (see Table VI-12 and
                            Figures VI-5 and VI-6 for further details)

Performance Attribute:  Temporal correlation*

      Performance Measure:  Temporal correlation coefficients at each
                            monitoring station for the entire modeling period
                            and an overall coefficient for all stations,
                            $r_{t_i}$, $r_{t_{OVERALL}}$
                            for $1 \le i \le M$ monitoring stations
      Performance Standard
        Type of Rationale:  Pragmatic/Historic
        Guiding Principle:  At a 95 percent confidence level, the temporal
                            profile of predicted and observed concentrations
                            should appear to be in phase (in the absence of
                            better information, a confidence interval may be
                            converted into a minimum allowable correlation
                            coefficient by using an appropriate t-statistic)
        Sample Value
        (Denver Example):   For each monitoring station,
                            $0.69 \le r_{t_i} \le 0.97$
                            Overall, $r_{t_{OVERALL}} = 0.88$
                            In this example a value of $r \ge 0.53$ is
                            significant at the 95 percent confidence level

Performance Attribute:  Spatial alignment

      Performance Measure:  Spatial correlation coefficients calculated for
                            each modeling hour considering all monitoring
                            stations, as well as an overall coefficient for
                            the entire day,
                            $r_{x_j}$, $r_{x_{OVERALL}}$
                            for $1 \le j \le N$ modeling hours
      Performance Standard
        Type of Rationale:  Pragmatic/Historic
        Guiding Principle:  At a 95 percent confidence level, the spatial
                            distribution of predicted and observed
                            concentrations should appear to be correlated
        Sample Value
        (Denver Example):   For each hour,
                            $-0.43 \le r_{x_j} \le 0.66$
                            Overall, $r_{x_{OVERALL}} = 0.17$
                            In this example a value of $r \ge 0.71$ is
                            significant at the 95 percent confidence level

 * These measures are appropriate when the chosen model is used to consider questions involving photochemically reactive
   pollutants subject to short-term standards.
 † These may not be appropriate for all regulated pollutants in all applications.  When they are not, the Pragmatic/
   Historic rationale should be employed.
** The EPA has set maximum allowable limits on the amount by which a monitoring technique may differ from a reference
   method.  An "EPA-acceptable monitor" is defined here to be one that differs from a reference monitor by up to the
   maximum allowable amount.
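
As an illustration of the two peak-accuracy measures in the table above, the following Python sketch finds the station peak in predicted and observed hourly series and reports their ratio and timing difference. The data layout (a dictionary mapping station name to a list of hourly concentrations), the function names, and the numbers in the usage lines are all hypothetical; only the 80 to 150 percent and ±1 hour limits come from the sample values quoted for the Denver example.

```python
def station_peak(series_by_station):
    """Return (peak value, station, hour index) over all stations and hours."""
    return max(
        ((value, station, hour)
         for station, series in series_by_station.items()
         for hour, value in enumerate(series)),
        key=lambda item: item[0],
    )


def peak_accuracy(predicted, observed):
    """Return (peak ratio in percent, timing difference in hours).

    The predicted and observed peaks may occur at different stations and
    times; the timing difference is predicted hour minus observed hour.
    """
    p_value, _, p_hour = station_peak(predicted)
    o_value, _, o_hour = station_peak(observed)
    return 100.0 * p_value / o_value, p_hour - o_hour


def meets_sample_standard(ratio_percent, timing_hours):
    """Check against the sample values: 80-150 percent and within 1 hour."""
    return 80.0 <= ratio_percent <= 150.0 and abs(timing_hours) <= 1


if __name__ == "__main__":
    # Hypothetical hourly concentrations at two stations.
    predicted = {"A": [2.0, 9.0, 4.0], "B": [3.0, 5.0, 6.0]}
    observed = {"A": [2.0, 5.0, 10.0], "B": [3.0, 4.0, 6.0]}
    ratio, shift = peak_accuracy(predicted, observed)
    print(ratio, shift, meets_sample_standard(ratio, shift))
```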

-------
            III    ISSUES REQUIRING MODEL APPLICATION
     Air pollution models  have been developed  over a period of years, not
always in response to specific needs.   While convenience and availability
(rather than strict suitability)  often motivated  their  use in particular
applications, certain classes of  models have come to be associated with
certain classes of applications.   For  this  reason, it is helpful to view
the setting of model performance  measures and  standards within that issue-
specific context.  This chapter is intended to provide  an issues framework
within which the application of air pollution  models can be viewed.  First,
an overview is provided, highlighting  important aspects of air pollution
law.  By means of this discussion, generic issues are identified.  Then,
these issues are examined and their implications  for model applications
explored.

A.   A PERSPECTIVE ON THE ISSUES

     Basic air pollution law  in this country has  been  enacted  at the fed-
eral level, although many important legal variants  exist among states  and
localities.  The passage of  legislation, however, is  often  just a  first
step.  Usually, only broad authority is granted in  the  original  law.   It
remains  to the federal agency thus chartered by the Congress  to set  the
specific regulations implementing  the  law.  These are then  promulgated,
becoming an additional part  of the Code of Federal  Regulations (CFR).
Notice is provided  of  such an action by publication in the  Federal Register
 (FR).  When disagreements exist over the degree to which the  promulgated
 regulations mirror  the intent of  the original  law,  civil suits may be  brought
 in  court to resolve disputes.  Judgments in such suits  can  and have  had
 important effects on the CFR.  In  the  remainder of this section we will
 explore  briefly  the body of  air  pollution  law, from enabling  legislation to
 promulgation  of regulations  in the CFR.


-------
1.   Federal Air Pollution Law

     Basic federal law is contained in the United States Code (USC).  It
is divided into "Titles" which are themselves divided into "Sections."
Groups of sections form "Chapters."  Title 42 of the USC (usually denoted
as 42 USC) is entitled "The Public Health and Welfare."  It contains the
basic law pertaining to air pollution:  Chapter 15B entitled "Air Pollution
Control" and Chapter 55 entitled  "National Environmental Policy."

     The Clean Air Act is contained in Section 1857 of Title 42 (within
Chapter 15B) and  is referenced by the notation 42 USC §1857.  Originally
enacted in 1963, it has since been amended a number of times.  The most
notable changes occurred with the passage of the Clean Air Act Amendments
of 1970 and 1977, the former of which, among other things, created the
Environmental  Protection Agency (EPA), authorized the setting of national
ambient air quality standards (NAAQS) and required the development of state
 implementation plans  (SIPs) for the attainment of compliance with the NAAQS.
After passage  by  the Congress and signature by the President, a bill con-
 taining such amendments or  providing for new portions of the USC becomes
a part of the  public law and  is referred to both by the Congressional ses-
sion and  a  passage sequence number.  The 1970 Amendments, for example, are
referred  to as  Public Law 91-604.  For reference, the 91st Congress convened
for the two years from January 1969 to January 1971.

     The other  legislation most heavily affecting air pollution law is the
National Environmental Policy Act (NEPA) of 1969 (Public Law 91-190), which
amended Chapter 55 (National Environmental Policy) of Title 42.  In its
primary features, the act created the Council on Environmental Quality
reporting to the  President and mandated the preparation of environmental
impact statements (EISs) for "major Federal actions significantly affecting
the quality of the human environment."  These are required for federal agency
actions and for projects supported "in whole or in part" with federal finan-
cing.  The NEPA is found in 42 USC §§4321, 4331 to 4335, and 4341 to 4347.

-------
2.   The Code of Federal  Regulations

     Implementation of federal  law is accomplished by promulgation  of
specific regulations,  the body  of which is contained in the Code  of Fed-
eral  Regulations.   The CFR is divided into "Titles" (not the same as those
in the USC), which are themselves subdivided into "Chapters," "Subchapters,"
and "Parts."   All federal regulations pertaining to air pollution are  con-
tained in Title 40  which  is called "Protection of the Environment." The
formal organization of 40 CFR is shown in Exhibit III-1.  Note that Title 40
contains no Chapters II and III.

     Subchapter C,  "Air Programs," is expanded in that exhibit to include
"Part" subheadings  as  is  Chapter V, "Council on Environmental Quality."
The following parts within Chapter I  are of particular importance.   In Part
50 the primary and  secondary NAAQS are set for sulfur dioxide, particulate
matter, carbon monoxide, photochemical oxidants, hydrocarbons, and nitrogen
dioxide.  In Part 51 requirements are stated for the development of SIPs.
All State plans, whether approved or disapproved, are published in  Part  52.
In Part 60 the emissions  standards are set for new and modified stationary
sources.  Further breakdown of  these parts by section heading is provided
in Appendix A.

     As originally conceived, SIPs were blueprints for achieving compliance
with the NAAQS.  As the regulations have evolved, however, they now require
that SIPs provide for air quality maintenance (AQM) once compliance
has been achieved.   SIPs are currently being revised according to the  man-
dates of the 7 August 1977 Clean Air Act Amendments and are required to
be reassessed periodically as to their ability to attain and maintain  the
NAAQS.

-------
          EXHIBIT III-1.    FORMAL ORGANIZATION OF CFR TITLE 40—
                           PROTECTION OF ENVIRONMENT
Chapter I.  Environmental Protection Agency
     Subchapter A - General (Parts 0-21)
     Subchapter B - Grants and Other Federal Assistance (Parts 30-49)
     Subchapter C - Air Programs (Parts 50-89)
          Part 50.  National primary and secondary ambient air quality
                    standards
          Part 51.  Requirements for preparation, adoption, and submittal
                    of implementation plans
          Part 52.  Approval and promulgation of implementation plans
          Part 53.  Ambient air monitoring reference and equivalent
                    methods
          Part 54.  Prior notice of citizen suits
          Part 55.  Energy related authority
          Part 60.  Standards of performance for new stationary sources
          Part 61.  National emission standards for hazardous air
                    pollutants
          Part 79.  Registration of fuels and fuel additives
          Part 80.  Regulation of fuels and fuel  additives
          Part 81.  Air quality control regions,  criteria, and control
                    techniques
          Part 85.  Control of air pollution from new motor vehicles and
                    new motor vehicle engines
          Part 86.  Control of air pollution from new motor vehicles and
                    new motor vehicle engines:  certification and test
                    procedures
          Part 87.  Control of air pollution from aircraft and aircraft
                    engines
          Part 88-89.  [Reserved]
     Subchapter D - Water Programs (Parts 100-149)
     Subchapter E - Pesticide Programs (Parts 162-180)
     Subchapter F - [Reserved]
     Subchapter G - Noise Abatement Programs (Parts 201-210)
     Subchapter H - Ocean Dumping (Parts 220-230)
     Subchapter I - Solid Wastes (Parts 240-399)

-------
     Subchapter N - Effluent Guidelines  and Standards  (Parts  401-460)
     Subchapter Q - Energy Policy (Part  600)

Chapter IV.   Low Emissions Vehicle Certification Board (Part  1400)

Chapter V.   Council on Environmental  Quality (Parts  1500-1510)
     Part 1500.  Preparation of environmental  impact statement:
                 Guidelines
     Part 1510.  National oil and hazardous substances pollution
                 contingency plan

     Contained within SIPs are procedures for controlling emissions from both
mobile and stationary sources.  Because of the size and age of the vehicle
fleet, control of emissions from mobile sources is currently an important
part of other SIP segments dealing with NAAQS compliance.  As stricter auto-
motive emissions standards are achieved and older cars are removed from high-
ways through age attrition, stationary sources will contribute an increasing
fraction of the total emissions inventory.  Their importance thus increases
in the AQM segment of the SIPs.

     The portion of  40  CFR relating  to the review of applications for
new or modified stationary sources is Section 51.18.  There it is stated
that "no approval  to construct or modify will be granted unless the appli-
cant shows to the  satisfaction of the Administrator that the source will
not  prevent or interfere with attainment or maintenance of any national
standard."  The quote is a paraphrase of §51.18(a), as written in the
California SIP [40 CFR  §52.233(g)(3)].  Several issues of practical impor-
tance derive  from  this  section of 40 CFR.  New source review (NSR) proce-
dures are  thus required, with such stationary sources directed to meet
new  source performance  standards (NSPS) where stated in 40 CFR §60 or as
determined by the  appropriate reviewing agency and to install appropriate
pollution  control  equipment.  Also,  an important consequence of 40 CFR
§51.18 derives from  its interpretation in urban areas currently in noncom-
pliance with  the NAAQS.  In most instances, the addition of a single, modestly
sized, stationary  source would be unlikely to affect regional peak pollutant
concentration. Considered separately, an argument could be made that few new
stationary sources violate the letter of §51.18.  Taken in the aggregate,
however, emissions from several new  sources together could have serious ad-
verse effects  on regional pollutant  concentrations.  To overcome this inter-
pretive difficulty,  the EPA has employed the so-called offset rules (OSR).
All  new stationary sources in noncompliant urban areas are considered to be
in violation of §51.18  unless the applicant can demonstrate that a reduction
in emissions  from other sources and  a reduction in the air quality impact of
those emissions has  been achieved to offset those produced by the proposed
new source.
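
The two-part offset test described above can be sketched in a few lines of code. This is only an illustration: the function name, its arguments, and the numbers are hypothetical, and the air quality impacts are simply summed here, whereas in practice a model would have to supply each impact estimate.

```python
def offsets_demonstrated(new_emissions, emission_cuts, new_impact, impact_cuts):
    """Return True if both parts of the offset test are met.

    new_emissions: emissions from the proposed source (any consistent unit).
    emission_cuts: reductions obtained at existing sources (same unit).
    new_impact:    estimated air quality impact of the proposed source.
    impact_cuts:   estimated reductions in impact from the existing sources.

    Assumption: impacts are treated as additive, which is a simplification.
    """
    emissions_ok = sum(emission_cuts) >= new_emissions
    impact_ok = sum(impact_cuts) >= new_impact
    return emissions_ok and impact_ok


# Hypothetical case: two existing sources are cut back to offset a new one.
enough = offsets_demonstrated(100.0, [60.0, 50.0], 2.0, [1.2, 1.0])
too_little = offsets_demonstrated(100.0, [60.0, 30.0], 2.0, [1.2, 1.0])
print(enough, too_little)
```

Both conditions must hold: a cutback that balances the emissions total but not the estimated air quality impact would not satisfy the rule as described.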

     Another issue  of importance  in  SIP  development  is  the  prevention of
significant deterioration  (PSD) of the air quality in areas currently in
attainment of the NAAQS.   Originally, 40 CFR contained  no provision for
consideration of PSD.   A court suit, however, brought about a judgment
that SIPs  must address this  issue.   As a consequence, subsequent to May 31,
1972, the  EPA Administrator  disapproved  all  SIPs  not considering PSD.
Standards  for PSD were promulgated in §52.21, entitled  "Significant Deteriora-
tion of Air Quality."

     In addition to SIPs,  environmental  impact statements and  reports
(EIS/R) represent the other  major class  of planning  documents  formally
required to address air quality issues.   In Chapter  V of 40 CFR, guidelines
are provided for drafting  EISs for major federal  actions.   They are required
not only for projects undertaken  solely  by the federal  government, but  also
for any major projects supported  "in whole or in  part"  by federal  financing.
EISs were formerly submitted to the CEQ for review.  They are now, however, received
and reviewed by the EPA.   State and  local agencies  can  also require for in-
dividual projects a formal statement of  environmental  impact.   In  California,
for instance, such  a statement is called an "Environmental  Impact  Report"
(EIR) and is filed pursuant to the California Environmental Quality Act (CEQA).

     Running throughout air  pollution law is the  basic  right  of legal  appeal.
Court suits have played an important part in shaping the body of the  law.
Portions of the authorizing  statutes,  the CFR, and many individual  EIS/Rs
have come under legal challenge.   As a  result, litigation (LIT) also  re-
presents an important class  of issues addressed by air pollution modelers.

B.   GENERIC ISSUE  CATEGORIES

     In the previous section we have outlined many of the important features
of air pollution law.  A number of generic  issues thereby have been ident-
ified.  In this section we will summarize these generic issues, discuss each
briefly, and then examine their implications for air pollution modeling.
In the next chapter we will match these issue categories with a number of
existing models, comparing application requirements with model  capabilities.

1.   The Issues:  Their Classification

     The air pollution burden in a geographical area is the result of the
complex interaction of emissions from all sources as they mix and disperse
in the atmosphere, subject to prevailing influences of meteorology, solar
irradiation, and terrain.  The total pollutant concentrations experienced
 are a  function of the effects of emissions from each of the mobile and
 stationary emitting sources, though that function is generally not a
 linearly  additive one.   Because the NAAQS are expressed in terms of total
 allowable concentration  levels  and are applicable at any location to which
 the public has access,  implementation plans are inherently regional in
 perspective.   There is  a certain duality of focus in SIPs, however:  While
 they detail plans for regional  NAAQS compliance and maintenance, they do so
 through curtailment of emissions from individual sources and source cate-
 gories.  Thus, while the focus  is ultimately on regional effects, the environ-
 mental impact of individual sources also must be considered.  This is an
 explicit  issue with new source  review (NSR), for instance.  As the number of
 sources to be considered decreases, the two perspectives--regional and
 single-source specific--merge together.  A case in point is the examination
 of the impacts of a few sources located in a rural area, where prevention
 of significant deterioration (PSD) is an issue.

      From the discussion of air pollution law presented earlier, we have
 isolated  several specific issues, each falling into one of two distinct
 generic issue categories.   The  chief distinction between the two is not
 simply the difference between regional and source-specific perspective, for
 each individual source has both a regional and a localized downwind impact.
 Rather, the clearest distinction  lies  in the number of sources considered.
 Questions of regional NAAQS compliance and maintenance are multi-source
 issues.  NSR, on the other hand, primarily concerns a single source.  Using
 such a distinction, the  principal issues addressed by air quality planners
 are as follows:

Multiple-Source Issues
-  SIP/C (State Implementation Plan/Compliance).   The
   attainment of regional compliance with the NAAQS,
   as considered in the SIP.
-  AQMP (Air Quality Maintenance Planning).  Regional
   maintenance of compliance with the NAAQS, as con-
   sidered in the SIP.
Single-Source Issues
-  PSD (Prevention of Significant Deterioration).  Limita-
   tion of the amount by which the air quality can be de-
   graded in areas currently in attainment of the NAAQS;
   this is considered in each SIP.
-  NSR (New Source Review).  Permit process by which  appli-
   cants proposing new or modified stationary sources must
   demonstrate that both directly and indirectly caused
   emissions are within certain limits and that the pollu-
   tion control to be employed is performed with the
   appropriate technology; this is considered in each SIP.
-  OSR (Offset Rules).  Interpretive decision by which all
   new or modified stationary sources in urban areas  cur-
   rently in noncompliance with the NAAQS  are judged unac-
   ceptable unless the applicant can demonstrate a plan for
   reducing emissions in existing sources  and that a  reduc-
   tion in the air quality  impact of these emissions  has
   been achieved to offset  those produced  by the proposed
   new source; this decision has a strong  impact on  the
   stationary source  permit process.
-  EIS/R (Environmental Impact Statement/Report).  A statement
   of impact required for major projects undertaken by the
   federal government or financed by federal funds (EIS), or a
   report of project impact required by state or local
   statutes (EIR).
-  LIT (Litigation).  Court suits brought to resolve disagreement
   over any of the issues mentioned above or to secure
   variances waiving federal, state, or local requirements.

     The above seven issues are classified according to their most fre-
quently encountered form.  We note that actual cases do not always conform
to the bounds of the generic issue categories as shown.  An EIS, for
instance, can have a regional perspective, as with the Denver Overview EIS
recently completed for Region VIII of the EPA.  Also, LIT can occasionally
have effects on regional NAAQS compliance and maintenance.  For example, PSD
and AQMP resulted from court suits.

2.   The Issues:   Some Practical  Examples and Their Implications
     for Air Pollution Modeling

     Many  practical  examples can  be  found in which the issues identified
above  play an important  role in planning.  At this point, we will discuss
some of the more  important applications in which they are likely to be
encountered.  Modeling requirements  can thus be identified.  This discus-
sion will  serve as a prelude to the  examination of air pollution models
presented  in the  next chapter.

     First, we consider the nature of multiple-source (M/S) issue
applications.  SIP/C and AQMP can focus on urban areas as well as on large
rural  sources.  Here we  concentrate  on the most frequently encountered
applications, those in urban areas.  Encountered in such regions are both
reactive pollutants [ozone (O3), hydrocarbon (HC), and nitrogen dioxide
(NO2)] and relatively nonreactive pollutants [carbon monoxide (CO), sulfur
dioxide (SO2),* and total suspended particulates (TSP)].  There are a
variety of different source types:   point sources (power plants, refin-
eries, and large  industrial plants,  such as steel, chemical and manufac-
turing companies),  line  sources (highway, railroads, shipping lanes, and
airport runways), and area sources (home heating, light industrial users
of volatile chemicals, street sanding, gasoline distribution facilities,
and shipping ports).  Mobile sources (cars, trucks, and buses) almost
invariably can be  aggregated into highway line sources.  While a few
cities with air pollution problems are located in complex terrain (Pittsburgh,
for example), most are situated in relatively flat or gently rolling
terrain.   Geographical features can  play an important part in regional air
pollution  (for instance, the ocean near Los Angeles, the lake near Chicago,
and the mountains  near Denver).
* Sulfur dioxide is slowly reactive:  SO2 -> SO4= aerosol.

     Air pollution  modeling  in  such  circumstances  has  been used for several
principal  purposes.   It has  been  useful  in  estimating  the total amount of
emissions  cutback required to reach  compliance with  the  NAAQS.  Individual
control  strategies  also have been assessed, both for SIP/C and AQMP.  In-
sights  from regional  modeling have been  useful in modifying and improving
pollutant measurement network design.   In Denver, for  instance, use of the
SAI Urban Airshed Model indicated for a particular model day the presence
of an ozone (O3) peak in a then-unmonitored area.  Subsequent location of a
temporary monitoring station at that site led to the observation of O3
readings in excess of any previously measured.  Also, models have had an
influence on transportation  network  design (the  balance  of freeways, arterials,
and feeders) and modal split (the mix  between personal and mass transit).
Through the EIS/R process, individual  projects  (for  example,  the  Interstate
470 freeway and the construction of  wastewater treatment facilities, both
in Denver) have been  examined using  models to estimate air quality impact.

     Second, we consider the nature  of stationary single-source  (S/S)
issues.  Important applications occur  in both urban  and rural areas.  These
focus on the following:  (1) SIP/C and the permit approval  process for new
or modified stationary sources and (2) the variance  process  for existing
facilities.  As for the first of these,  SIP/C and the permit approval  pro-
cess, all new or modified major S/Ss,  urban and rural, are  subject to  NSR
and must meet NSPS and use the best  available pollution control  equipment.
Also, both direct and  indirect impact on air quality must be considered.

     In urban areas, major S/Ss might include proposed refineries, power
plants, and industrial facilities, as well as shopping, employment, and
recreational/sports centers.  With the last of  these, indirect effects are
particularly important.   Each draws appreciable  numbers of automobiles,
adding  to  local vehicle miles traveled (VMT) and  increasing  congestion and
thus pollutant emissions.  Also,  automobile hot soak  and some cold start
emissions  are concentrated  in accompanying parking  lots.

     Urban S/Ss are dealt with in the SIP/C and permit application process
differently than are rural S/Ss.  In urban areas in noncompliance with the NAAQS,
OSR must be considered.  The air pollution modeler must be able not only to
represent the regional and localized downwind impact of the new S/S but also
to estimate the subtractive effect of reducing emissions from one or more
existing sources.

     Another difference between urban and rural areas has important
significance for the modeler.  In rural areas, the relatively nonreactive
pollutants (SO2 and TSP) are often of greater interest than are the more
reactive ones.  Although the NOx emissions also produced at some point could
generate, with the addition of HC, photochemically reactive pollutants, they
are usually not of primary concern.  In urban areas, the reactive pollutants
(O3, NO2, and HC) must also be modeled.  When the incremental effect of an
S/S is being considered in an urban area (for OSR, as well), this distinction
can have a strong effect on model choice.  This is particularly true when an
S/S emits O3 precursors such as NOx, which power plants do, or HC, which
refineries do.

      In rural areas, applications centering on energy development have been
 prominent in recent years, particularly in the northern and central Great
 Plains.  The direct air  pollution impact of these S/Ss would be produced by
 coal extraction (strip mining), conversion to natural gas, transport to
 energy production facilities if they are not on site (via unit train or
 slurry pipelines), or coal combustion in large power plants.  Indirect impact
 would result from the construction  of the above-mentioned facilities  (new
 highways, provision for  temporary construction crews) and the growth of nearby
 "boom" towns (housing for families  of workers and the additional population
 increase required to provide commercial and public services to workers).

      A complicating factor not confronted  in nonattainment regions  arises  in
 attainment areas:  PSD must be considered.  No  S/S or combination  of them is
 permitted to degrade significantly the  air quality in nonpolluted  rural  areas.
 In each SIP such areas are identified.  The modeler must  be able to assess  the
likelihood that an S/S will impinge on such areas to an unacceptable  degree.
Also, because pollutants from rural sources are either inert or slow in
reacting and because surface deposition, rainout, and washout often proceed
at slow rates (depending on synoptic meteorology), atmospheric residence times
are long for some pollutants such as the derivative products of SO2.
Transport distances on the order of a thousand kilometers may not be unusual.  The
modeler must be able to account for pollutant transport and transformation
on this temporal and spatial scale, if required.

     In both urban and rural areas, the owner of an S/S has the right to seek
a variance temporarily excusing the source from provisions of the law, but
not such as to cause a violation of the NAAQS.  A number of reasons could
motivate such a request.  For a power plant, petroleum shortages could result
in a need to burn high-sulfur fuel.  For a refinery, petroleum storage and
shipping needs might result in a variance request.  Other reasons might include
a need for an extension of the time required to comply with SIP control
strategy requirements or for periodic pollution control equipment maintenance
or replacement.

3.   The Issues:  A Prologue to the Next Chapter

     In this chapter, we have examined the body of air pollution law and
identified two generic  issue categories:  multiple-source issues and single-
source issues.  Seven separate  (though interrelated) types of issues  were
classified within that  structure:  SIP/C, AQMP, PSD, NSR, OSR, EIS/R, and
LIT.

     We have examined some practical examples illustrating particular
features of these issues as they manifest themselves in both  urban and rural
areas.  We have also discussed  some key implications that these issues have
for air pollution modeling.  This  serves as an important prologue to the
discussion of specific  models undertaken in the next chapter.  In that
chapter we will match application  requirements to model capabilities.  The
issues identified here  will serve  as the framework within which that dis-
cussion is carried out.

                       IV   AIR  QUALITY MODELS
     In the last chapter,  we identified  generic  types  of  air quality issues.
In this chapter, we  define generic  classes  of  models.   Having done so,
we match the two, identifying those issues  for which each model may be a
suitable analytical  tool.   We also  describe the  technical formulations
and underlying  assumptions employed in each generic model class, indicating
some key limitations.

     The final  choice  of a model  for use in addressing a  particular issue
can be made only by  considering the characteristics of the  proposed applica-
tion.  To facilitate the comparison between model  capabilities and applica-
tions requirements,  we define a set of applications attributes.  We then
match the two,  identifying for each generic model  the  combinations of
application attributes for which it is suited.  A  related means for match-
ing model to application is described in EPA (1978a).

     In this chapter we attempt to  specify  the relationship between issues,
models, and applications.   Having done so,  we  then develop  in Chapter V
model performance measures appropriate to each issue/model  combination of
practical interest.  This  will  set  the stage for a discussion of requisite
model performance standards in Chapter VI.

     In order to preserve  generality, our emphasis in  this  chapter centers
primarily on generic model categories rather than  on specific air quality
models.  Certain benefits  may be achieved thereby: General conclusions
appropriate to  an entire class of models may be  stated without reference
to any particular model, and extensive discussions of  any observed differ-
ences between intended capabilities and  technically achieved ones need not
be conducted for each  specific model.

      Our central purpose in this report is to discuss means for setting
 model performance standards.  While not central  to  this, however, we do
 recognize a need to associate some specific models  with our generic
 model categories.  To assist in doing so,  we examine in Appendix B a number
 of air quality models.  Though the list is not a complete one, a number of
 available models are examined in detail and tabulated according to several
 attributes.  Among these are the following: level  of intended usage
 (screening or refined), type of pollutant  (reactivity, averaging time),
 degree of resolution (spatial and temporal), and certain site specifics
 (terrain, geography, as well as source type and  geometry).

      We summarize at the end of this chapter that part of Appendix B needed
 to associate specific models with our generic  categories.  No attempt is
 made in this chapter or in Appendix B to screen  models for technical accept-
 ability nor  is  any attempt made to be all-inclusive.  Models are classified
 according  to their intended capabilities rather  than their technically achieved
 ones.  Among the references we have drawn upon in gathering this information
 are the following:  Argonne (1977), EPA (1978b), and Roth et al. (1976), as
 well as several  program users' manuals.

 A.   GENERIC MODEL CATEGORIES

      In this chapter air quality models and prediction methods are class-
 ified into generic model categories.  Here we  describe the structure of
 the classification scheme employed, the full form of which is shown in
 Exhibit IV-1.  Though many such schemes have been proposed (Roth et al.,
 1976, and Rosen, 1977, for example), we identify three broad divisions:
 rollback,  isopleth,  and physico-chemical.   We  describe here each of these
 categories, mentioning technical  formulation,  general capabilities, and
 major limitations.   In doing so,  we draw upon  material in Roth et al. (1976).

 1.    Rollback Category

      Included in the  first of these are all  those prediction methods in
which ambient pollutant concentrations are assumed to be directly (though
not necessarily linearly) proportional to emissions, according to some
simple relationship.   Emissions control  requirements  are presumed  propor-
tional to the amount by which the peak pollutant concentration  exceeds
the NAAQS.   Linear rollback and Appendix J are examples  of such methods.
                      I.   Rollback
                     II.   Isopleth
                    III.   Physico-Chemical
                          A.  Grid
                              1.  Region Oriented
                              2.  Specific Source Oriented
                          B.  Trajectory
                              1.  Region Oriented
                              2.  Specific Source Oriented
                          C.  Gaussian
                              1.  Long-Term Averaging
                              2.  Short-Term Averaging
                          D.  Box
                   EXHIBIT IV-1.  GENERAL MODEL CATEGORIES

     Because atmospheric processes are generally complex and nonlinear,
 the fundamental proportionality assumption invoked in rollback methods
 is frequently violated in actual application.  For this reason, rollback
 methods are usually regarded as screening techniques, whose results give
 at best only a general indication of the amount of emissions control
 required.  They are most often  used when insufficient data are available
 to perform an analysis that is more technically justifiable.  Even then,
 results obtained with them are  appropriate only as a crude indication of
 the need for more extensive data gathering and analysis.  Because rollback
 methods  lack  spatial  resolution,  they  are  most suitable for addressing
 regional, multiple-source*  issues.  Also,  their use  is  more  appropriate
 for applications involving relatively nonreactive pollutants (SO2, CO, and TSP).
 * In this report, "multiple-source" refers to many, well-distributed
   sources of all types and sizes.  It does not include, for instance,
   a single complex having multiple stacks.
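
The proportionality assumption behind rollback can be illustrated with a short calculation. The sketch below uses the commonly cited linear rollback form, in which the fractional emissions reduction equals the excess of the peak concentration over the standard divided by the excess of the peak over background. The background term, the function name, and the numbers are assumptions introduced here for illustration; they are not taken from this report.

```python
def linear_rollback_reduction(peak, standard, background=0.0):
    """Fractional emissions reduction implied by linear rollback.

    peak:       observed peak concentration (any consistent unit).
    standard:   the air quality standard to be met (same unit).
    background: concentration assumed unaffected by local controls
                (an assumption of this sketch; defaults to zero).
    """
    if peak <= background:
        raise ValueError("peak must exceed background")
    if peak <= standard:
        return 0.0  # already in compliance; no cutback indicated
    return (peak - standard) / (peak - background)


# Hypothetical values in arbitrary concentration units.
reduction = linear_rollback_reduction(peak=240.0, standard=120.0, background=40.0)
print(f"required emissions reduction: {reduction:.0%}")
```

Because the result depends on a single peak value and carries no spatial information, such a calculation gives only the crude regional indication described above.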

2.   Isopleth Category

     Within the second generic model category are included those methods
relying on isopleth diagrams to relate precursor concentrations  of primary
emissions (usually oxides of nitrogen and nonmethane hydrocarbon) to the
level of secondary pollutant (usually ozone) resulting from such a mixture.
As  is true with  the EPA EKMA method (see EPA, 1977), these diagrams are usually
constructed  from computer simulations using theoretically and chamber derived
chemical kinetic mechanisms.  They  invoke assumptions about a number of
parameters such  as regional ventilation and solar insolation, as well as
pollutant entrainment, carryover from the previous day, and transport from
upwind.  The accuracy of the postulated chain of chemical reactions is
evaluated using  smog-chamber data.  The types of information required to con-
struct an isopleth diagram are roughly equivalent to those required to employ
a box model, and we note that the two methods are conceptually similar in
many regards.  We maintain a distinction between the two, however, because
of  the view  prevailing in the user  community that they are separate classes
of  models.   Also, not all box models are photochemical, as are isopleth-
based methods.

     Entry into  an isopleth diagram requires an estimate of the peak con-
centration actually occurring during the day on some initial base date.
Given an assumption about the relative proportion of precursor species
control (HC versus NOx), the degree of emissions cutback required to achieve
the NAAQS can be estimated  directly.

     Isopleth methods lack  spatial  resolution.  They are thus capable of
addressing only  regional, multiple-source  issues.  By  their nature,  isopleth
methods are  useful only  for applications involving photochemically  reactive
pollutants.   Because  of  the level  of approximation involved in  constructing
the isopleth diagram  itself, in  entering it using measured ambient  data,
and in accounting for the effect of transport from upwind, such methods  are
more appropriate for use as screening tools.   In this capacity,  they  can
be helpful  in assessing the need for further, more refined analysis.   How-
ever, in some limited applications where the  assumptions invoked in the
formulation of the isopleth methods are generally satisfied, estimates of the
required degree of emissions control obtained using such a method can be
regarded as acceptably accurate.

3.   Physico-Chemical Category

     The third category contains models based upon physical and chemical
principles  as embodied in the atmospheric equations of state.  It is  divided
into four main subcategories:  grid, trajectory, Gaussian, and box.  We
discuss here each subcategory.

a.   Grid Subcategory

     Grid models employ a fixed Cartesian reference system within which
to describe atmospheric dynamics.  The region to be modeled is bounded
on the bottom by the ground, on the top usually by the inversion base
(or some other maximum height), and on the sides by the desired east-west
and north-south boundaries.  This space is then subdivided into a two- or
three-dimensional array of grid cells.  Horizontal dimensions of each cell
measure on the order of several kilometers, while vertical dimensions can
vary, depending on the number of vertical layers and the spatially and
temporally varying inversion base height.  Some grid models assume only a
single, well-mixed cell extending from the ground to the inversion base;
others subdivide the modeled region into a number of vertical layers.

      Ideally,  the coupled  atmospheric  equations  of  state,  expressing  con-
 servation  of  mass, momentum,  and energy,  would  be solved  systematically
 within each grid cell, with a chemical kinetic mechanism used to describe
 the  evolution  of  pollutant species.   Several  major  difficulties  arise
 in practice.   Computing  limitations are  rapidly encountered.   A region
fifty kilometers on a side and subdivided into five vertical  layers  requires
12,500 separate grid cells if grid cells are one kilometer on a side.
Maintaining a sufficient number of species to allow the functioning  of a
chemical kinetic mechanism compounds the storage problem.  For a ten-
species mechanism, storage of the concentrations for each species in each
grid cell in our example would alone require 125,000 storage  locations.
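
The storage estimate in this example can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name and arguments are hypothetical and simply restate the assumptions of the example (a square region, uniform square cells, and one stored concentration per species per cell).

```python
def grid_cell_count(side_km, cell_km, layers):
    """Number of grid cells in a square region with uniform square cells."""
    cells_per_side = round(side_km / cell_km)
    return cells_per_side * cells_per_side * layers


# Values from the example in the text.
cells = grid_cell_count(side_km=50, cell_km=1, layers=5)
species = 10
storage_locations = cells * species

print(cells)              # 12500 grid cells
print(storage_locations)  # 125000 stored concentrations
```

Halving the horizontal cell size to 0.5 kilometer would quadruple both figures, which shows how quickly finer resolution runs into the computing limitations mentioned above.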

     To avoid these and other computing or numerical problems, most  grid
models solve only one atmospheric state equation (the conservation of
mass, or continuity, equation), decoupling the other two.  The momentum
equation is replaced by meteorological data supplied to the model in the
form of spatially and temporally varying wind fields.  The energy equation
is supplanted by externally supplied vertical temperature profile data,
from which inversion heights are also calculated.

     Other problems are encountered in solving the mass continuity equation,
a principal one being the treatment of the atmospheric viscosity terms.  Turbulence,
which is a randomly varying quantity, can be described only in statistical
terms.  Species concentrations, as a result, can be found only as values
averaged over some time interval.  Also, the continuity equation can be
solved only if  turbulence effects are decoupled through a series of  approxi-
mations involving  turbulence gust eddy sizes and strengths.
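
The form that the continuity equation commonly takes after such approximations can be sketched as follows. The gradient-transport (K-theory) closure shown here is an assumption made for illustration, since the text does not write the equation out; all symbols are introduced here and are not taken from this report.

```latex
\frac{\partial \bar{c}_i}{\partial t}
  + \nabla \cdot \left( \bar{\mathbf{u}}\, \bar{c}_i \right)
  = \nabla \cdot \left( \mathbf{K}\, \nabla \bar{c}_i \right)
  + R_i(\bar{c}_1, \ldots, \bar{c}_N) + S_i
```

Here $\bar{c}_i$ is the time-averaged concentration of species $i$, $\bar{\mathbf{u}}$ is the externally supplied mean wind field, $\mathbf{K}$ is an eddy diffusivity tensor standing in for the turbulence terms, $R_i$ is the net rate of chemical production of species $i$ given by the kinetic mechanism, and $S_i$ is the emissions source term.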

     Grid models require the specification of time-varying boundary  condi-
tions on the outer sides and the top of the modeled region, the initial
conditions (species concentrations) in each grid cell at the start of a
simulation, and spatially and temporally varying emissions for each  pri-
mary pollutant species.  The first two of these are derived from station
measurement data, and the last is obtained from an appropriate emissions
inventory for the modeled region.

     Grid models  are  capable of considering both reactive and relatively
nonreactive pollutant species.   Models considering reactive species,
because of their  limited time scale (less than several  days), are
appropriate tools only for addressing questions involving pollutants
having short-term standards (O3, CO, HC, and SO2) and for medium-range
pollutant transport (an urban plume, for example).  Some grid models
are designed to model large spatial regions (such as the Northern Great
Plains; see Liu and Durran, 1977) and thus can address long-range transport
questions.  At their present state of development, these models are appropriate
tools only for examining questions involving relatively nonreactive pollutants
(principally long-term SO2 and TSP).

     There are two major classes among grid models:  region oriented  and
specific source oriented.  In the first class, two basic variants exist:
urban scale and regional scale models.  The first of these attempts to
model the urban environment, considering emissions from a number of dif-
ferent sources and simulating both reactive and  relatively non-reactive
pollutant species over a spatial scale on  the  order of tens of kilometers
through a temporal scale of 8 to 36  hours.  Regional-scale models, on the
other hand, represent an attempt to model  long-range pollutant transport
over a spatial scale of hundreds of  kilometers through a  temporal scale
of several days.  Emissions are assumed to come from a few widely dispersed,
usually rural, sources; the pollutants considered are relatively nonreactive
(or, more precisely, slowly reactive) ones such as SO2.  (Though SO2 is
converted to sulfate, SO2 → SO4=, it does so much more slowly than the
reactions involving the more reactive species.)  One such model was developed
by SAI for use in assessing the air quality impact of large-scale energy
development in the Northern Great Plains (Liu and Durran, 1977).

      Because of their spatial extent, region oriented grid models are
appropriate  tools for  addressing regional  (multiple-source)  issues, such
as  SIP/C  and AQMP.   Because  of their spatial  resolution, certain regional
questions  about  single-source issues can also be addressed.   The regional
effect of a new source can be assessed.  The subtractive regional effect
of removing an existing source also can be estimated, an essential cap-
ability for addressing OSR questions.  However, only grid models specifi-
cally designed to consider a single source have sufficient spatial reso-
lution to assess near-source, or microscale effects.

     Specific source oriented models represent the second major class of
grid models.  Some specific examples of such models are listed later in
this chapter.  Models of  this type are particularly useful in two types
of applications:  examining the behavior of a plume containing reactive
constituents, and accounting for the effects of complex terrain on a
point source plume.  Because of their formulation, these models can con-
sider the effects of plume interaction with ambient reactive pollutants.
This is  of  interest in addressing single-source issues in urban areas with
significant levels of reactive pollutants.  Often urban-scale grid models
are used to predict the ambient conditions with which the plume interacts.

     Those  models designed for applications in complex terrain can be used
when it is  necessary to describe explicitly the wind fields and inversion
characteristics encountered by a dispersing pollutant.  Although simpler
models exist, they are often inadequate when applied in situations in
which terrain is particularly complex or when photochemical reactivity
is important.

b.   Trajectory Subcategory

     Trajectory models employ a reference coordinate system that is allowed
to move with the particular air parcel of interest.  A hypothetical column
of air is defined, bounded on the bottom by the ground and on the top by
the inversion base (if one exists), which varies with time.  Given a speci-
fied starting point, the column moves under the influence of prevailing
winds.   As  it does so, it passes over emissions sources, which inject pri-
mary pollutant species into the column.  Chemical reactions are simulated
in the column, driven by a photochemical kinetic mechanism.  Some trajectory
models  allow the  column  to be partitioned vertically into several  layers,
or cells.   Emissions  in  such models  undergo vertical mixing upward from
lower cells.  Other trajectory models  allow only a single layer;  in these,
vertical mixing is  assumed to be uniform and instantaneous.
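
     The single-layer variant described above can be sketched in a few lines
of Python.  This is a minimal illustration, not the formulation of any
particular model: the names Step and advance_column are hypothetical, the
first-order loss term merely stands in for a photochemical kinetic mechanism,
and air entrained from aloft when the inversion base rises is assumed clean.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    dt_s: float             # length of the time step, s
    emission_flux: float    # emissions entering the column base, ug/(m^2 s)
    mixing_height_m: float  # inversion base height at the end of the step, m


def advance_column(c0, h0_m, steps, loss_rate_per_s=0.0):
    """Concentration history (ug/m^3) of one well-mixed, moving air column."""
    c, h = c0, h0_m
    history = [c]
    for step in steps:
        # Emissions are mixed uniformly and instantaneously through the column.
        c += step.emission_flux * step.dt_s / h
        # Stand-in for the kinetic mechanism: first-order loss (an assumption).
        c *= math.exp(-loss_rate_per_s * step.dt_s)
        # A rising inversion base entrains air from aloft, assumed clean here;
        # a falling base leaves the concentration unchanged.
        if step.mixing_height_m > h:
            c *= h / step.mixing_height_m
        h = step.mixing_height_m
        history.append(c)
    return history
```

     Each Step corresponds to one segment of the space-time track: the
emission flux is whatever the inventory gives for the ground beneath the
column during that segment.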

     The formulation employed by trajectory models to describe atmospheric
dynamics represents an attempt to solve the mass continuity equation
in a moving coordinate system.  The remaining state equations—conservation
of momentum and energy—are not solved explicitly.  As is done in grid models,
solution of the momentum equation is avoided by specification of a spatially
and temporally varying wind field, while solution of the energy equation
is sidestepped by externally supplying temperature  and inversion base height
information.

     Several basic  assumptions are invoked in the formulation of trajectory
models.  Since only a single air column is considered, the effects of
neighboring air parcels  cannot be included.  For this reason, horizontal
diffusion  of pollutants  into the column along its sides must be neglected.
This may not seriously impair model  results so long as sources are suffic-
iently  well distributed  that emissions can be idealized as uniform, or
nearly  so, over the region of interest.  However, if the space-time track
of the  air column passes near but not over large emissions sources, neglect
of the  effect of the  horizontally diffusing material from those sources
might cause model results to be deficient.  In general, problems occur
whenever there are significant concentration gradients perpendicular to
the trajectory path.

     Also, the column is assumed to retain its vertical shape as it is
advected by prevailing winds.  This requires that actual winds be  ideal-
ized by means of a mean wind velocity assumed constant with height.  Because
of the earth's rotation and frictional effects at ground level, winds aloft
usually blow at greater speeds than do surface winds, and  in different directions.
This produces an effect known as wind shear, which  is neglected in trajec-
tory models.  If emissions  are evenly distributed in amount and type over
the region of interest and winds are also uniform, this may not represent
a serious deficiency.  In such a case, material blown out of the column
by wind shear effects would be replaced by similar material blown into
it, with the net effect on model results expected to be small.  However,
if a significant fraction of the emissions inventory is contributed by
large point sources or if wind patterns display significant spatial vari-
ation, neglect of wind shear can seriously impair the reliability of
trajectory model results.

     Additionally, many trajectory models assume that the horizontal dim-
ensions  of the air column remain constant and unaffected by convergence
and divergence of the wind field.  Where winds are relatively uniform,
this may not  be  of serious consequence.  Where winds have significant
spatial  variation, as could be the case in even mildly complex terrain,
however, this assumption could lead to deficient results.  In the San
Francisco Bay region, for example, wind flow convergence during the day
causes the merging of several air parcels.  Peak pollutant concentrations
subsequently  occur in this merged "super-parcel."  A trajectory model
would be an inadequate tool for addressing problems in such a region.

     In  general, trajectory models require as inputs much the same types
of data  required to exercise a grid model.  Emissions are required along
the space-time track of the air column.  Wind speed and direction must be
provided to determine its movement.  Vertical temperature soundings must
also be  input in order to determine the height of the column (the height
of the inversion base).  Although these data need be prepared only for
the corridor  encompassing the trajectory path, general application of the
model to an entire urban area requires that data be prepared for a
significant portion of the region.

     Two major classes exist among trajectory models:  region oriented and
specific source oriented.  The first of these classes includes those models
designed to address multiple-source, regional issues, usually in urban
areas.  The second class contains  so-called  reactive plume models.  For
reasons noted above, the use  of trajectory models  is appropriate on an
urban scale only  in certain circumstances.   Careful screening  is required
of the emission and meteorological characteristics in a proposed application
region to ensure the appropriateness of trajectory model usage.

    The second class of trajectory models includes those designed to
evaluate the air  quality impact downwind of  a specific source.  Because
of the underlying equation formulation, these models are more appropriate
for use in areas  having relatively simple terrain. However, because
they are capable  of simulating  photochemical reaction, they  can be used
in addressing issues involving  reactive pollutants.  Often,  region ori-
ented models are  used to generate  the  ambient conditions with  which the
reactive plume downwind of the  source  must interact.  For all  trajectory
models considering reactive pollutants, the  time scales remain short  (less
than several days).  Consequently, they are  inappropriate for  consideration
of problems involving pollutants subject to  long-term standards.

c.  Gaussian Subcategory

    In the formulation of Gaussian models,  the atmosphere  is  assumed to
consist of many diffusing pollutant "puffs," all moving on  individual
trajectories determined by prevailing  winds.  The concentration at any
point is assumed  due to the superimposed effect of all puffs passing  over
the point at the  time of observation.   Rather than keeping  track of the
path of each puff, their motion (both  advection and diffusion) is described
in terms of conditional state transition probabilities.  Given an initial
location at a particular time,  this state transition probability describes
the likelihood that the puff  will  arrive at  another specified  point a
given time interval later.  With an entire field specified at some reference
time, the net expected effect at a particular point and time is calculated
by determining the integral sum of the separate expected effects of
each puff in the field.

     Central  to this type  of formulation  is  a  knowledge  of the  time-varying
state transition probabilities  for the entire  concentration field.   In
practice, turbulence nonuniformities  and  terrain-specific effects  combine
to render it unlikely that such probabilities  can be  determined.   To over-
come this difficulty, traditional  Gaussian models (among others, those
recommended by the EPA) invoke  several assumptions.   First, the turbulence
field is assumed to be stationary and homogeneous, which implies it has
two important qualities:  first, the statistics of the state transition
probabilities can be assumed dependent only  on spatial displacement, thus
removing their time-dependency; and second,  the probabilities are  not
dependent on puff location in the field, thus removing spatial  variability.
These are satisfactory approximations so long as significant differences
do not exist between turbulence characteristics of the atmosphere  in dif-
ferent portions of the region to be modeled.  For applications  in  complex
terrain, for instance, such an assumption might not be justified.
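
     In symbols, the formulation just described might be written as follows.
The notation is introduced here for illustration only and is not taken from
any particular model: Q denotes the state transition probability density,
angle brackets denote an expected concentration, and t_0 is the reference
time at which the field is specified.  The first line expresses the
superposition of all puffs in the field; the second expresses the
simplification that follows from assuming stationary and homogeneous
turbulence.

```latex
\begin{align}
\langle c(\mathbf{x},t)\rangle
  &= \int Q(\mathbf{x},t \mid \mathbf{x}',t_0)\,
          \langle c(\mathbf{x}',t_0)\rangle \, d\mathbf{x}' \\
Q(\mathbf{x},t \mid \mathbf{x}',t_0)
  &= Q(\mathbf{x}-\mathbf{x}',\; t-t_0)
\end{align}
```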

     Once turbulence field stationarity and homogeneity have been assumed,
it still remains to specify the functional form of the state transition
probability.  Gaussian models derive their name from their assumption that
this probability function is Gaussian in form.  Given this assumption,  the
concentration field can be determined analytically by evaluating the integral
expressing the summation of separate effects from all pollutant puffs affect-
ing the region of interest.  In order to isolate the effect of  an  individual
source, only puffs containing pollutants emitted from that source  are
considered.

     Concentrations about the plume centerline are assumed to be distributed
according to a Gaussian relationship, whose vertical and horizontal
cross-sectional shape is a function of downwind distance from the  source
and atmospheric stability class.  Analytic forms can be determined express-
ing the form of the downwind concentration field for several different
types of emissions regimes:  instantaneous "puff," continuous point source
emission (steady-state), continuous emissions from an area source, and
continuous emissions along a line source.

     Several  other assumptions are invoked in Gaussian steady-state models.
The vertical  and horizontal  spread of the plume is  assumed characterized
by dispersion coefficients,  whose values  are dependent on the distance
downwind of the source.   They are assumed to be functions of atmospheric
stability and are thus characterized by stability class.   Specific values
are obtained from standard workbooks, such as that developed by Turner, or
from evaluation of data measured downwind of actual sources.

     In many models,  plume interaction with the ground and the inversion is
considered.  Usually, perfect or near-perfect reflection is assumed  to
occur.   Multiple reflections are often modeled, although some models  assume
that beyond a certain downwind distance mixing is uniform between the ground
and the inversion base.
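
     For a continuous point source, the steady-state relationship with
perfect reflection at the ground takes the widely used form sketched below.
This is an illustration under stated assumptions rather than the code of any
model discussed here: the function name is hypothetical, the dispersion
coefficients are supplied by the caller (already evaluated for the downwind
distance and stability class of interest), and reflection from the inversion
base is omitted for brevity.

```python
import math


def plume_concentration(q_g_s, u_m_s, sigma_y_m, sigma_z_m, y_m, z_m, h_eff_m):
    """Steady-state concentration (g/m^3) downwind of a continuous point source.

    sigma_y_m and sigma_z_m are the dispersion coefficients already evaluated
    at the receptor's downwind distance for the prevailing stability class.
    h_eff_m is the effective release height (stack height plus plume rise).
    """
    crosswind = math.exp(-0.5 * (y_m / sigma_y_m) ** 2)
    # Direct plume plus an image source below ground (perfect reflection).
    direct = math.exp(-0.5 * ((z_m - h_eff_m) / sigma_z_m) ** 2)
    reflected = math.exp(-0.5 * ((z_m + h_eff_m) / sigma_z_m) ** 2)
    return (q_g_s / (2.0 * math.pi * u_m_s * sigma_y_m * sigma_z_m)
            * crosswind * (direct + reflected))
```

     Multiple reflections between the ground and the inversion base would
add further image-source terms of the same form.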

     Consideration of plume rise is made in Gaussian point source models.
Depending upon ambient atmospheric conditions, such as temperature and
humidity, hot gases from an emitting stack may rise, sink, or remain at the same
height.  Simplifying thermodynamic equilibrium relationships, such as that
developed by Briggs, are often used to estimate the magnitude of plume  rise.

     Two major classes of Gaussian models exist:  long-term averaging and
short-term averaging.  Though both invoke the basic Gaussian assumptions,
major differences exist in formulation.  Long-term models divide the region
surrounding each source into azimuthal sectors.  The long-term variation
of the wind at the source must then be specified by wind speed and direction
(by sector) classes, along with the frequency of occurrence for each combin-
ation.   This information usually is conveyed in the form of a "wind rose."
Data describing the frequency of occurrence of the various atmospheric
stability categories must also be specified.  The probability of occurrence of
each stability category/wind vector (speed and direction) combination is then
used to weight the downwind concentrations resulting from it.  The weighted
sum represents the expected value of the long-term averaged pollutant con-
centration.  Models employing this so-called "climatological" formulation
are appropriate tools for addressing problems involving pollutants for
which long-term (annual) standards are specified (SO2, TSP, and NO2).
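
     The frequency weighting used in the long-term formulation can be
sketched as follows.  The function and argument names are hypothetical; the
joint frequency table and the steady-state concentration function are
assumed to be supplied by the user (the latter could be any short-term
Gaussian calculation).

```python
def long_term_average(joint_frequency, steady_state_concentration):
    """Expected long-term concentration at one receptor.

    joint_frequency maps (stability class, wind speed class, wind sector)
    to its frequency of occurrence; the frequencies must sum to one.
    steady_state_concentration(stability, speed, sector) returns the
    concentration the source produces at the receptor for that combination.
    """
    total = sum(joint_frequency.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("joint frequencies must sum to 1")
    # Weight each combination's concentration by how often it occurs.
    return sum(freq * steady_state_concentration(stability, speed, sector)
               for (stability, speed, sector), freq in joint_frequency.items())
```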

     The second class of Gaussian models includes those designed for short-
term analysis.  Prevailing wind direction and speed, as well as emissions
characteristics, are assumed to persist long enough that steady-state con-
ditions are established.  The downwind concentration field resulting from
source emissions can then be evaluated analytically.  Some models allow a
limited form  of temporal variability by dividing the modeling day into
segments  (perhaps  one hour long), during each of which conditions are assumed
to be in  steady state.   Source strengths and prevailing wind speed at the
height of emissions  release are required for each segment, as are sufficient
vertical  temperature profile data to calculate inversion base height, if
one exists, and atmospheric stability class.  The last of these is required
in order  to determine vertical and horizontal dispersion coefficients.
Because wind  data  frequently are not available at the height of emission
release,  surface wind measurements are extrapolated.  Wind speed is assumed
to vary vertically according to a power law, the exponent of which is given
as a function of stability class.  Determination of stability class is made
by one of several  appropriate methods, each of which is also dependent on
surface observations.
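
     The power-law extrapolation of surface winds mentioned above can be
written compactly.  The sketch below is illustrative only: the function name
is hypothetical, and the exponent is left to the caller because its value
depends on the stability class and on the tabulation a given model adopts.

```python
def wind_at_height(u_ref_m_s, z_ref_m, z_m, exponent):
    """Extrapolate a measured wind speed to height z_m with a power law.

    u_ref_m_s is the speed measured at height z_ref_m; exponent is the
    stability-dependent power-law exponent chosen by the user.
    """
    if z_ref_m <= 0.0 or z_m <= 0.0:
        raise ValueError("heights must be positive")
    return u_ref_m_s * (z_m / z_ref_m) ** exponent
```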

      Both Gaussian classes contain models that can be used to estimate the
impact of single or  multiple sources.  Some models are designed to consider
only a single point  source; others can model many different sources simul-
taneously. Consequently, the first group of these is appropriate only for
addressing single-source issues; the second group can be  used to consider
multiple-source issues  as well.  Most models in this second group, though
able to account for  many sources, can also simulate as few as one.  They
can thus  be used to  consider both single and multiple source issues.

      Full  consideration of regional-scale issues (SIP/C and AQMP) requires
of a model the ability  to simulate  all types of sources:  point, area, and
line.  Not all multiple-source Gaussian models are capable of doing so.
Some are used to consider only point and area sources;  others  are  used  to
consider line sources only.  These latter are usually intended for use in
addressing traffic related questions; they might be used, for  instance,
to estimate the impact of emissions from a full  highway network on regional
CO distribution and level.  Consequently, consideration of all
source types in a region may require the joint use of more than one model--
one considering point  and area sources and another simulating  line sources.

     An important restriction exists on the type of pollutant  species
that can be simulated  using Gaussian models.  Because the formulation
cannot accommodate explicit kinetic mechanisms,  only relatively nonreactive
pollutants can be modeled (CO, TSP, and SO2).*  However, some models incorporate
first-order, exponential decay to account for pollutant removal
processes and limited  species chemical conversion.  Multiple-source Gaussian
models assume that the combined effect of many emitters can be calculated
by linearly superimposing the effects from each  individual source.  Such
an assumption would be an erroneous one if questions involving reactive
species were being considered.
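
     A minimal sketch of linear superposition with first-order decay
follows.  The names are hypothetical, and expressing the decay rate as a
half-life is a choice made for this illustration; each contribution is
assumed to have been computed separately, without decay, for its own source.

```python
import math


def decayed(concentration, travel_time_s, half_life_s):
    """Apply first-order (exponential) decay over the source-to-receptor travel time."""
    return concentration * math.exp(-math.log(2.0) * travel_time_s / half_life_s)


def superpose(contributions, half_life_s):
    """Sum the decayed contributions of individual sources at one receptor.

    contributions is an iterable of (concentration without decay, travel time
    in seconds) pairs, one pair per source.
    """
    return sum(decayed(c, t, half_life_s) for c, t in contributions)
```

     Because each term is independent of the others, adding or removing a
source changes the total by exactly that source's term, which is the
property that fails for reactive species.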

     Some Gaussian models have been designed to simulate the effects of
point source emissions in complex terrain.  Various assumptions are made
about the behavior of  the plume and the variation in height of the inver-
sion base as an obstacle is approached.  Usually the plume is  allowed to
impinge on the obstacle without any sophisticated means to account for  flow
alteration, although some models allow for flow convergence and divergence
in the wind field.  Also, the base of the inversion is sometimes assumed to
be at constant height  above the source; in other models it is  assumed to be
a fixed distance above the terrain, thus varying with it.  However, the
Gaussian formulation depends on the assumption of turbulence field station-
arity and homogeneity.  This is a simplification that may not be justified
in many applications in complex terrain.
* Long-term Gaussian models are also used to model annual NO2, a reactive
  species, for which no short-term standard currently is set.  This
  usually is accomplished by combining NO and NO2 as NOx, the "species"
  modeled.  NOx exhibits less variability during the day than NO2 taken
  separately.

 d.   Box Subcategory

     Box models  are the simplest of the physico-chemical models.  The region
 to  be modeled  is treated as a single cell or box, bounded by the ground
 on  the  bottom, the inversion base  on the top, and the east-west and north-
 south boundaries on the sides.  The box may enclose an area on the order
 of  several  hundred square  kilometers.  Primary pollutants are emitted into
 the box by the various sources  located within the modeled region, under-
 going uniform and instantaneous mixing.  Concentrations of secondary pol-
 lutants are calculated through  the use of a chemical kinetic mechanism.
 The ventilation characteristics of the modeled region are represented,
 though  only grossly,  by specification of a characteristic wind speed.
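
     For an inert pollutant, the box formulation reduces to a single
mass balance, sketched below.  This is an illustration under stated
assumptions: the function names are hypothetical, chemistry is omitted, the
air entering the box carries a fixed background concentration, and the
inversion height and wind speed are held constant.

```python
import math


def box_steady_state(q_flux, length_m, wind_m_s, height_m, background=0.0):
    """Steady-state box concentration for area emissions q_flux (ug/(m^2 s)).

    length_m is the along-wind length of the box, wind_m_s the characteristic
    wind speed, and height_m the inversion base height.
    """
    return background + q_flux * length_m / (wind_m_s * height_m)


def box_step(c, dt_s, q_flux, length_m, wind_m_s, height_m, background=0.0):
    """Advance the box concentration by dt_s (exact for constant inputs).

    Solves dc/dt = q/H + (u/L) * (background - c) over one step.
    """
    c_ss = box_steady_state(q_flux, length_m, wind_m_s, height_m, background)
    return c_ss + (c - c_ss) * math.exp(-wind_m_s * dt_s / length_m)
```

     The ratio of box length to wind speed acts as the ventilation time
scale: the lighter the wind, the more slowly the box flushes.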

     Because of their formulation, box models can predict, at best, only
 the temporal variation of  the average regional concentration for each
 pollutant species.  Consequently,  they are capable of addressing only multi-
 ple source, regional  issues.  Furthermore, such models are useful only in
 regions having relatively  uniform  emissions.  In those areas where point
 sources contribute significantly to the emissions inventory (in number and
 amount),  the assumption of emissions uniformity may be an unsatisfactory one.

     Box  models require only limited data.  Emissions can be specified on
 a regional  basis, eliminating any  need for determining their spatial
 variation.   Only simple meteorological data need be supplied as input.  For
 these reasons, box models  can be used when little information is available.
 They are  more appropriately used as screening tools, helping to identify
 those situations requiring more extensive data collection and modeling
 analysis.

 B.   GENERIC ISSUE/MODEL COMBINATIONS

     The discussion in the previous section outlined the characteristics
 of  generic  classes of air  quality  models.  In this section we associate
 generic model  type with generic issue category.  In so doing, we indicate
 the gross suitability of a generic model type as a tool in addressing a
particular  issue.  As  noted earlier, each generic model (GM) has associated
with it a set  of limitations on its use.  In Section C we summarize the
effects  of these  limitations.   We  first  classify  types of actual applications
according to several  key  attributes  and  then  indicate those which each GM
is capable of considering.   The result is  an  enumeration of possible model/
application combinations.

     In  order to  match model  to issue, we  present in Table IV-1 a matrix of
model/issue combinations.   For each  GM,  an indication is provided of its
usefulness in addressing  each of the seven generic issues identified in
the previous chapter.  Even where  a  GM is  indicated as suitable, however,
its inherent limitations  (some of  which  are noted in the table) may prevent
its use  in certain applications.   Consequently,  further examination is
required in order to  make a final  GM selection.

     Summarizing  the  basic features  of Table IV-1, we note the following:

     >  Grid Models
        -  Region Oriented Models.  Urban scale models are able to
           address multiple-source issues (SIP/C, AQMP) involving
           both reactive and nonreactive pollutants.  Their short-term
           temporal scale (< 36 hours), however, restricts
           them to problems involving pollutants with short-term
           standards (O3, HC, CO, and secondary SO2).  Their spatial
           resolution (on the order of tens of kilometers) allows
           them to address some single-source issues (OSR, EIS/R, LIT).
           Regional scale models, as opposed to urban scale ones, are
           more oriented towards application in rural areas (few sources)
           involving nonreactive (or rather, slowly reactive) pollutants,
           such as SO2, TSP, CO, and NO2 (which is slowly reactive
           in nonurban areas because of limited ambient HC).  Their
           short-term temporal scale (on the order of a week or less),
           often a practical restriction due to computing requirements,
           limits their use in predicting long-term pollutant
           concentrations (SO2, TSP, NO2).  They are suited for addressing
           questions involving single-source issues (PSD, NSR, EIS/R,
           LIT) in isolated rural areas.

           TABLE IV-1.   AIR QUALITY  ISSUES  COMMONLY ADDRESSED
                              BY  GENERIC MODEL  TYPE
                                                  Issue Category
  Generic Model Type              SIP/C  AQMP   PSD    NSR    OSR    EIS/R  LIT
Refined Usage
1.  Grid(1)
    a.  Region Oriented             X      X     X(2)   X(2)   X(3)   X      X
    b.  Specific Source Oriented                 X      X      X      X      X
2.  Trajectory(1)
    a.  Region Oriented             X      X                   X      X      X
    b.  Specific Source Oriented                 X      X      X      X      X
3.  Gaussian(3)
    a.  Short-Term Averaging
        1) Multiple Source          X      X     X             X      X      X
        2) Single Source                         X      X      X      X      X
    b.  Long-Term Averaging(4)      X      X     X      X      X      X      X
Refined/Screening Usage
4.  Isopleth(1,5)                   X      X
Screening Usage
5.  Rollback                        X      X
6.  Box                             X      X

Notes:
    1.  Only short-term time scales can be considered (less than several days).
    2.  Regional impact of new sources can be assessed but not near-source, or microscale, effects.
    3.  Only non-reactive pollutants can be considered.
    4.  Only pollutants having long-term standards can be considered (SO2, TSP, and NO2).
    5.  Only photochemically active pollutants can be considered.

-  Specific Source Oriented Models.  These models  are  used
   primarily for addressing single-source issues (PSD, NSR,
   OSR, EIS/R, LIT).  This class contains the so-called
   reactive plume models.  Their ability to consider reactive
   pollutants makes them suitable for urban applications  or
   rural applications where plume reactivity is important.
   However, because OSR (a primarily urban issue)  requires an
   estimate of the subtractive effect of removing  an existing
   source, only questions involving pollutants for which  linear
   superposition is approximately valid, i.e., nonreactive
   pollutants, can be addressed in an urban area with a specific-
   source model.  These models are also suitable for use  in
   applications where terrain complexity is important.
>  Trajectory Models
-  Region Oriented Models.  With some important restrictions,
   these models can be suitable for use in addressing multi-
   ple-source issues (SIP/C and AQMP) and, in limited circum-
   stances, some single-source issues (OSR, EIS/R, LIT).  Among
   the most important of such restrictions are the following:
   Emissions must be approximately uniform over the modeling
   region; air flow cannot be complex enough to cause merging
   of air parcels, i.e., flow convergence or divergence should
   not be important; and horizontal diffusion effects should
   not have significant nonuniformities, e.g., large point
   sources near but not within the space-time track of the
   advected air parcel being modeled.  Because chemical kinetic
   mechanisms can be included in their formulation, these models
   are capable of considering reactive as well as nonreactive
   species.  Their temporal scale  is so short, however, that
   no estimates of long-term concentration averages can be
   computed.
-  Specific Source Oriented Models.  Subject to the same restric-
   tions mentioned above,  these models can be appropriate tools
   for  use  in considering  single-source  issues  (PSD,  NSR, OSR,
   EIS/R, LIT).  Because they can consider reactive pollutant
   species, they can be used in applications involving reactive
   plumes.  Limited terrain complexity can also be simulated,
   so long as the above-mentioned restrictions are not violated.
>  Gaussian Models
-  Long-Term Averaging Models.  These models can be used to
   address both multiple-source issues (SIP/C, AQMP) and some
   single-source issues (PSD, OSR, EIS/R, LIT).  Because of
   the Gaussian formulation they cannot consider chemistry or
   surface removal effects beyond first order, i.e., exponential
   decay.  Thus, they are appropriate tools only for addressing
   questions involving nonreactive (slowly reactive) pollutants.
   Their  temporal scale is such that only pollutants having
   long-term (annual) standards can be considered (SO2 primary
   standard, TSP, NO2, where NO2 is taken as NO + NO2, i.e.,
   NOx).  As currently configured, these models are appropriate
   for use in both urban and rural settings, although
   the terrain in such applications should be relatively
   simple.
-  Short-Term Averaging Models.  Two variants exist among
   these models:  multiple-source and single-source.  The
   types of issues they may be used to address divide
   similarly.  Some multiple-source models, however, do
   not consider all types of sources:  Some consider only
   point  and area sources; others consider only line
   sources.  The latter group is useful for examining the
   effects of traffic-related pollutants (particularly CO)
   resulting from highway network emissions.  Consequently,
   if regional questions are to be addressed, the concur-
   rent use of more than one model may be required.  Only
   relatively nonreactive pollutants may be examined
   using  this type of model.  Because of their short-term
   temporal scale, these models are best suited for
   addressing questions involving pollutants having short-term
   standards (CO, SO2 secondary standard).
>  Rollback Models
   Because rollback models lack spatial resolution,  they
   are appropriate only for considering questions involving
   multiple-source issues (SIP/C, AQMP).  Their use  is
   generally confined to urban areas located in simple
   terrain.  Their assumption that emissions are directly
   proportional to peak pollutant values is a technically
   limiting one.  Consequently, they should be viewed as
   screening tools to evaluate the need for more extensive
   analysis and data gathering.
>  Isopleth Models
   Lacking spatial resolution, isopleth models are appro-
   priate only for use in addressing multiple-source
   issues (SIP/C, AQMP).  Employing ozone isopleth dia-
   grams derived through the use of a photochemical
   kinetic mechanism, these models are designed to examine
   questions involving reactive pollutants (O3, HC, short-term
   NO2).  Their use is most appropriate for applications in
   urban areas located in simple terrain.   Because the isopleth
   diagram is constructed using regional ventilation, emissions,
   and background/transport assumptions, it is similar to
   the box models, which are described  below.  Like the
   box model, its technical limitations, except  under
   exceptional circumstances,  render it more useful and
   reliable as a screening tool to evaluate the  need for
   more extensive analysis.
>  Box Models
   Because they lack spatial resolution, box models are
   appropriate only for use in considering multiple-source
   issues (SIP/C, AQMP).  They assume  spatially  uniform
   emissions.  For this reason, their  use is more suited
   to areas that are urban or  semi-urban.  They  are best
   used in modeling areas located in simple terrain but have
           also been  used  in applications  in complex  terrain.  An
           example of the  latter type of application  might be the
           modeling of a mountain valley containing several ski
           resorts and related developments.  Technical  limitations
           render the box  models more suitable as screening tools.
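
     The proportionality assumed by rollback models, noted in the list
above, leads to a one-line calculation.  The sketch below is illustrative
only: the function name is hypothetical, and the allowance for an
uncontrollable background concentration is an assumption added here (with
background set to zero, the peak is directly proportional to emissions).

```python
def rollback_reduction(peak, standard, background=0.0):
    """Fractional emission reduction needed for the peak to just meet the standard.

    Assumes the controllable part of the peak (peak minus background) scales
    linearly with regional emissions.  Returns 0.0 if the standard is already met.
    """
    if peak <= background:
        raise ValueError("peak must exceed background")
    return max(0.0, (peak - standard) / (peak - background))
```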
C.   MODEL/APPLICATION COMBINATIONS

     In the previous section we  discussed the  relationship  between generic
models and generic issues.   In this  section we associate those generic
models and the specific applications in  which  they may be used.   We first
classify applications by means of  several key  attributes.  We then com-
pare the possible values of these  with model capabilities.   For each generic
model type, we are thereby  able  to identify the range of applications for
which the model is suited.

     Applications are characterized here by five attributes:  number of
sources, area type, pollutant, terrain complexity, and required resolution.
In Table IV-2 we list the possible designations these attributes may assume.
Against these we match generic model capabilities, identifying the list of
designations for which each is suitable.  A chart of the resulting model/
application combinations is presented in Table IV-3.  While exceptions may
occur, the list of attribute designations shown is chosen based upon con-
siderations presented earlier in this chapter.
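
     The matching of application attributes against model capabilities
described above can be sketched as a simple lookup.  The following is
only an illustrative sketch: the dictionary layout and the function name
are assumptions of ours, only three rows of Table IV-3 are transcribed,
the pollutant attribute is omitted for brevity, and "Complex (Limited)"
is treated simply as "complex."

```python
# Illustrative capability table; a few rows transcribed from Table IV-3.
# The data layout and names are hypothetical, chosen for this sketch only.
CAPABILITIES = {
    "Grid (region oriented)": {
        "sources": {"multiple"},
        "area": {"urban", "rural"},
        "terrain": {"simple", "complex"},   # complex terrain: limited use
        "resolution": {"temporal", "spatial"},
    },
    "Gaussian (long-term averaging)": {
        "sources": {"multiple", "single"},
        "area": {"urban", "rural"},
        "terrain": {"simple"},
        "resolution": {"spatial"},
    },
    "Box": {
        "sources": {"multiple"},
        "area": {"urban"},
        "terrain": {"simple", "complex"},   # complex terrain: limited use
        "resolution": {"temporal"},
    },
}


def suitable_models(application):
    """Return the model types whose capabilities cover every
    attribute designation required by the application."""
    matches = []
    for model, caps in CAPABILITIES.items():
        if all(required <= caps[attr] for attr, required in application.items()):
            matches.append(model)
    return matches


# A multiple-source urban application in simple terrain that needs
# temporal resolution.
application = {
    "sources": {"multiple"},
    "area": {"urban"},
    "terrain": {"simple"},
    "resolution": {"temporal"},
}
print(suitable_models(application))
```

A match in such a lookup indicates only that a model type is a candidate;
the limitations noted earlier in this chapter still govern the final choice.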

D.   SOME SPECIFIC AIR QUALITY MODELS

     Our central purpose in this report  is to  discuss means for setting
suitable standards for model performance.  As  prologue to this, both
air  quality  issues and  the models used to address them needed to be
examined.  We  have done so in general terms to this point.   Throughout
this discussion we have referred to air quality models only in generic

-------
terms.   By doing so, several advantages were achieved:   General  conclu-
sions appropriate to an entire class of models could be stated without
reference to any specific model, and extensive discussions  of any  observed
differences between intended capabilities and technically achieved ones
were not necessary for each particular model.
      TABLE IV-2.    POSSIBLE DESIGNATIONS OF APPLICATION ATTRIBUTES

         Attribute               Possible Designations

         Number of Sources       Multiple-Source
                                 Single-Source

         Area Type               Urban
                                 Rural

         Pollutant               Ozone (O3)
                                 Hydrocarbon (HC)
                                 Nitrogen Dioxide (NO2)
                                 Sulfur Dioxide (SO2)
                                 Carbon Monoxide (CO)
                                 Total Suspended Particulates (TSP)

         Terrain Complexity      Simple
                                 Complex

         Required Resolution     Temporal
                                 Spatial

-------
                  TABLE  IV-3.     MODEL/APPLICATION  COMBINATIONS

Generic Model Type         Number of         Area     Pollutant            Terrain             Required
                           Sources           Type                          Complexity          Resolution

REFINED USAGE

Grid

 a.  Region Oriented       Multiple-Source   Urban    O3, HC, CO, NO2      Simple              Temporal
                                             Rural    (1-hour), SO2        Complex (Limited)   Spatial
                                                      (3- and 24-hour),
                                                      TSP

 b.  Specific Source       Single-Source     Rural    O3, HC, CO, NO2      Simple              Temporal
     Oriented                                         (1-hour), SO2        Complex (Limited)
                                                      (3- and 24-hour),
                                                      TSP

Trajectory

 a.  Region Oriented       Multiple-Source   Urban    O3, HC, CO, NO2      Simple              Temporal
                                                      (1-hour), SO2                            Spatial (Limited)
                                                      (3- and 24-hour),
                                                      TSP

 b.  Specific Source       Single-Source     Urban    O3, HC, CO, NO2      Simple              Temporal
     Oriented                                Rural    (1-hour), SO2        Complex (Limited)   Spatial (Limited)
                                                      (3- and 24-hour),
                                                      TSP

Gaussian

 a.  Long-Term             Multiple-Source   Urban    SO2 (Annual), TSP,   Simple              Spatial
     Averaging             Single-Source     Rural    NO2 (Annual)*

 b.  Short-Term            Multiple-Source   Urban    SO2 (3- and 24-      Simple              Temporal
     Averaging             Single-Source     Rural    hour), CO, TSP,      Complex (Limited)   Spatial
                                                      NO2 (1-hour)*

REFINED/SCREENING USAGE

Isopleth                   Multiple-Source   Urban    O3, HC, NO2          Simple              Temporal (Limited)
                                                      (1-hour)             Complex (Limited)

SCREENING USAGE

Rollback                   Multiple-Source   Urban    O3, HC, NO2,         Simple
                           Single-Source     Rural    SO2, CO, TSP         Complex (Limited)

Box                        Multiple-Source   Urban    O3, HC, CO, NO2      Simple              Temporal
                                                      (1-hour), SO2        Complex (Limited)
                                                      (3- and 24-hour)

* Only if NO2 is taken to be total NOx.

-------
     Having made  our  general  points  in  previous  sections,  however, we
associate  here  some specific  models  with  our  generic model  categories.
Though  this  is  not central  to our discussion  of  model  performance
standards, it may be  helpful  in linking specific models  to the  issues
and applications  for  which  they are  most  suited.

     In Table IV-4 we associate a number  of specific models with the generic
model types  identified earlier.  We  included  many  of the models with
which we were familiar.   Because the list is  intended  only to be a
representative  one, we did  not seek  to  make it fully complete.  Many
other models, particularly  Gaussian  ones, certainly exist  and would
be appropriate  for use in the proper circumstances.

     For the  models listed  in Table  IV-4, a detailed summary of their
characteristics is provided in Appendix B.  Among  the  information
contained  there is the following: model  developer, EPA  recommendation
status, technical description, and model  capabilities.  The last of these
is further subdivided  into source type/number,  pollutant  type, terrain
complexity,  and spatial/temporal resolution.


E.   AIR QUALITY  MODELS:  A SUMMARY

     In Chapter III we identified generic classes  of air quality issues.
In this chapter we defined  generic types  of models.  Having done so, we
associated the  two, identifying those issues  for which each model was a
potentially suitable  analysis tool.   We also  described the technical formula-
tions employed  in each generic type  of  model, indicating some key limitations.

     As noted in  Table IV-1,  several generic  model types may be of potential
use in  addressing the same  generic class  of issue. Only by considering the
characteristics of a  proposed application can a  final  choice of model be

-------
           TABLE  IV-4.   SOME AIR QUALITY MODELS

     Generic Model  Type                Specific Model Name

Refined Usage
  Grid
  a.  Region Oriented                   SAI
                                        LIRAQ
                                        PICK
  b.  Specific Source Oriented          EGAMA
                                        DEPICT

  Trajectory
  a.  Region Oriented                   DIFKIN
                                        REM
                                        ARTSIM
  b.  Specific Source Oriented          RPM
                                        LAPS

  Gaussian
  a.  Long-term Averaging               AQDM
                                        CDM
                                        CDMQC
                                        TCM
                                        ERTAQ*
                                        CRSTER*
                                        VALLEY*
                                        TAPAS*
  b.  Short-term Averaging              APRAC-1A
                                        CRSTER*
                                        HANNA-GIFFORD
                                        HIWAY
                                        PTMTP
                                        PTDIS
                                        PTMAX
                                        W
                                        VALLEY*
                                        TEM
                                        TAPAS*
                                        AQSTM
                                        CALINE-2
                                        ERTAQ*

Refined/Screening Usage
  Isopleth                              EKMA
                                        WHITTEN

Screening Usage
  Rollback                              LINEAR ROLLBACK
                                        MODIFIED ROLLBACK
                                        APPENDIX J

  Box                                   ATDL

 * These models  can  be  used  for both long-term and  short-term
   averaging.

-------
made.   To facilitate the comparison between model  capabilities  and appli-
cation requirements, we defined a set of application attributes.  We  then
matched the two,  identifying for each generic model  type the combinations of
application attributes for which it was suited.

     In this chapter we defined the interface between issue, model, and
application.  In  addition, we mentioned some specific air quality models
within each model category, giving additional detail on each in Appendix B.

     With the completion of this chapter, we are ready to consider model
performance measures.  In the next chapter, we identify performance measures
appropriate for the consideration of each air quality issue.  Having  done
so, we examine the interface of performance measure and model category.
Finally, in Chapter VI, we discuss several alternative rationales and
formats for setting model performance standards.  These are designed  to  be
consistent with the performance measures defined in Chapter V.

-------
                 V    MODEL PERFORMANCE  MEASURES
    The central purpose of this report is to identify means for
setting standards for air quality model performance.  As prologue to
doing  so,  we identified generic types  of  air quality issues  in Chapter III
and generic classes  of air  quality  models in Chapter IV, exploring their
interrelationships.  Now  it remains to discuss  the model performance mea-
sures  for  which performance standards  must be set.   Several  rationales for
setting these standards are presented  in  Chapter VI.

     In this chapter our discussion proceeds as  follows:  We first
identify generic types of performance  measures;  we then suggest some
specific performance measures  (describing them  in detail in Appendix
C); and finally we match generic performance measures to the issue/
model/application combinations presented  in  earlier chapters.  Before
beginning, however,  the notion of a model "performance measure" needs
to be  defined in more detail.

     Typically, air  quality models  are used  in  the following context:
a problem  is posed,  a model  is chosen  that is suitable  for  use in
addressing the issue/application, existing data are assembled for  in-
put and additional  data are gathered (if  needed), and a  simulation  is
conducted.  Results  often  are expressed in the form of spatially and
temporally varying  concentration predictions for one or many pollutant
species.   Since most problems are  hypothetical  ones posing  "what-if"
questions  (e.g., what if  a new power plant is built, or what if
population growth  and development proceeds as forecast),  model  results
in such situations  are inherently nonverifiable.  Consequently,  before
its results can be accepted,  the reliability of  the chosen model  must be
demonstrated.  Most frequently, "validation" is  accomplished by using
the model  to simulate pollutant concentrations  in a test situation


-------
 which is  similar to  the hypothetical one and for which measurement
 data are  available.   A region-oriented model  (urban  or regional  scale)
 may be required  to predict  region-wide  concentrations resulting from
 conditions  existing  on some past date.  A specific-source model may
 have to reproduce the downwind  concentrations  resulting  from emissions
 from an existing source having  size  and siting characteristics similar
 to the proposed one.  If  its predictions are judged  to be  in  sufficient
 agreement with observed data, the  model is  then accepted as a satis-
 factory tool for use in addressing the hypothetical  problem.

     However, what do we mean by "satisfactory" agreement between predic-
 tion and observation?  What are the quantities most appropriate for use
 in characterizing differences between  the two? Within what range of
 values must these quantities remain?  The values for how many different
 quantities  must be "satisfactory"  before we judge model  predictions to
 be acceptably near test case observations?

      In this  chapter, we explore the second of these questions.   In doing
 so, we identify  a  set of model performance measures, surrogate quantities
 whose  values  serve to characterize the  comparison between prediction and
 observation.  We match these performance measures with the  generic types of
 air quality issues identified in Chapter III and the generic classes of air
 quality models listed in Chapter IV.   We defer until  Chapter VI  the next and
 final  step:   the specification of model performance standards  against which
 to compare for acceptability the values of the model  performance measures.

 A.   THE COMPARISON OF PREDICTION WITH OBSERVATION

     Before accepting a model for use in addressing hypothetical air
 quality questions, the user must validate it.   This is often done by
 demonstrating its ability to reproduce a set of test results, usually
 consisting of observational concentration data recorded at a number of
measurement stations for several hours during  the day.  In comparing
predictions with observation, several questions should be asked.  Among
these are the following:

-------
>  What are the differences?  How much  does  prediction
   differ from observation at the location of the  peak
   concentration level  and at each of the  monitoring sta-
   tions?  What is the  spatial  and temporal  distribution
   of the residuals (the difference between  prediction and
   observation)?  Do these differences  correlate with diur-
   nal changes in atmospheric characteristics (mixing
   height, wind speed,  or solar irradiation, for instance)?
   If more than one species is  being considered, are there
   differences in performance among the species?
>  How serious are the  differences?  Are peak concentration
   levels widely different?  Are the estimates  of  the area
   in violation of the  NAAQS in substantial  disagreement?
   How near to agreement are the estimates of the  area ex-
   posed to concentrations within 10 percent of the peak
   value?  Are differences in the timing and spatial dis-
   tribution of concentrations  such that the expected
   health impacts on the population (exposure/dosage) are
   of different magnitude?  Do the predicted and observed
   patterns and levels  of concentrations lead to seriously
   different conclusions about the required  amount and cost
   of emissions control?  Are policy decisions  deriving
   from prediction and  observation different (such as a
   "build-no build" decision on a power plant based on PSD
   considerations)?
>  Are there straightforward reasons for the differences?  Are
   the locations and timing of  the concentration peaks slightly
   different between prediction and observation?   (If con-
   centration gradients within the pollutant cloud are
   steep, even a slight difference in cloud  location can
   produce large discrepancies at set monitoring sites.
   Such a problem could occur if there were  only slight
   errors in the wind speed or direction input to  the model.
   In such an instance, model performance  might otherwise


-------
        be perfectly adequate.)  Are wide fluctuations in ground-
        level concentrations and thus station measurements produced
        by relatively small discrepancies between the modeled and
        the  actual atmospheric characteristics?  [This "multiplier
        effect" can occur downwind of an elevated point source,
        for  example.  Because the emissions plume from a point
        source has dimensions much greater downwind than crosswind,
        slight changes in the atmospheric profile (stability
        category), having an effect on plume rise and dispersion,
        have a more than proportionate effect both on the downwind
        distance at which the ground-level peak concentration
        occurs and on the amount of area exposed to a given con-
        centration level.]

      In the  remainder of this chapter we discuss, first in generic
 terms and then  in specific ones, several different types of model
 performance  measures.  While each type and variant is designed to high-
 light different  aspects of the comparison between prediction and obser-
 vation, they all  address the general questions noted above.  Those
 questions, and others like them, are the fundamental ones from which
 the notion of performance measures and standards derives.

 B.   GENERIC PERFORMANCE MEASURE CATEGORIES

      In this section, we define several generic model performance
 measure categories, distinguishing among them on the basis of their
 general characteristics and the amount of information required to
 compute them.  We also note three variants found among measures in each
 category. We  then  introduce some practical considerations which can
 limit the choice of performance measure.  In Section C we list some of
 the specific measures  included in the generic categories, beginning
 with a discussion of the fundamental differences between those designed
 to measure performance on a regional scale and those characterizing it
 on a specific-source scale.  Details of  these specific measures are
 provided  in  Appendix C.

-------
  1.   The Generic Measures

      We consider here four generic performance measure categories:
  peak, station, area, and exposure/dosage.  The first category contains
  those measures related to the differences between the predicted and
  observed concentration peak, its level, location and timing.  The second
  category includes measures based upon concentration differences between
  prediction and observation at specific measurement stations.  Within the
  third category are contained those measures based upon concentration
  field differences throughout a specified area.  The fourth category in-
  cludes measures derived from differences in population exposure and
  dosage within a specified area.

      Each of these generic performance measure categories requires
  successively greater knowledge of the spatial and temporal distribution
  of concentrations.  We show in Figure V-1 a schematic representation
  illustrating several distinct levels of knowledge about regional
  concentrations.  A similar schematic appropriate for source-specific
  situations is shown in Figure V-2.  Listed in Table V-1 are the
  information requirements for the four categories.  These range from an
  estimate of a simple scalar quantity, concentration at the peak,  all
  the way to full knowledge of the spatially  and temporally resolved
  concentration field and population distribution.  For peak measures,
  the concentration residuals (the difference between predicted and
  observed values) are required at a single point and time.   For station
  measures, the temporal variations of the  residuals are required at
  several points.  For both area and exposure/dosage measures, the  full
  residual field is required, both spatially  and temporally  resolved.
  The latter type of measure  requires, in  addition, the  spatial and
  temporal history of population  movement within the area of  interest.

      As the  information content  increases,  the ability of the performance
  measure to characterize  the comparison  between prediction  and observation
  also can increase.  However, measures  from  different  categories  tend to
  emphasize different aspects of  the  comparison.   For  this reason,  several

-------
[Figure: schematic of a modeling region showing the concentration peak, the measurement station concentrations, the concentration field C(x,y,t), and the boundaries of the modeling region.]

                      FIGURE V-1.  VARIOUS LEVELS OF  KNOWLEDGE  ABOUT  REGIONAL  CONCENTRATIONS

-------
[Figure: schematic of a point source and the prevailing wind, showing the measurement station concentrations C_i(x_i, y_i, t), the ground-level concentration peak, and the concentration field C(x,y,t).]

FIGURE V-2.   VARIOUS LEVELS OF KNOWLEDGE ABOUT SPECIFIC-SOURCE  CONCENTRATIONS

-------
                    TABLE V-1.  GENERIC PERFORMANCE MEASURE
                                INFORMATION REQUIREMENTS

   Generic
   Performance
   Measure Type        Information Required

   Peak                Predicted and measured concentration peak (level,
                       location, and time), i.e.,

                           C_p(x_p, y_p, t_p), Pred. and Meas.

   Station             Predicted and measured concentrations at specific
                       stations (temporal history), i.e.,

                           C_i(x_i, y_i, t), Pred. and Meas., for each station i

   Area                Predicted and measured concentration field within
                       a specified area (spatial and temporal history),
                       i.e.,

                           C(x,y,t), Pred. and Meas.

   Exposure/dosage     Both the predicted and measured concentration
                       field and the predicted and actual population
                       distribution within a specified area (spatial
                       and temporal history), i.e.,

                           C(x,y,t), Pred. and Meas.
                           Population distribution, Pred. and Actual

-------
types  of performance measures  are  usually required in order to fully
characterize a model's ability to  reproduce  observationally obtained
data.

    That a model predicts the observed concentration peak well, for
instance, does not necessarily mean its predictions can  reproduce the
spatially distributed concentration field.   A comparison of the temporal
history of concentration  values  at several specific stations might give
a better indication of spatial model  behavior.   Even this might not
prove  conclusive.  The prevailing  direction  of the winds input to the
model  might have been slightly in  error.  This may have  little impact
on concentration levels,  resulting only in a pollutant cloud slightly
displaced from its actual  location.  If concentration gradients are
steep  within the cloud, station  predictions  might not agree well with
the values observed, even though the model might not be  significantly
deficient.  In such a circumstance, area measures might  provide a better
means  for assessing model  performance.   For  instance, the areas in
excess of a specified concentration value could be compared for several
values ranging between the peak  and background values.
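
     An area measure of the kind just described can be sketched as
follows.  This is a minimal illustration under assumptions of ours: the
concentration fields are given as small grids of cell-average values,
every cell has the same area, and the function names, grid values, and
thresholds are hypothetical.

```python
def area_exceeding(field, threshold, cell_area_km2):
    """Total area of the grid cells whose concentration exceeds threshold."""
    return sum(cell_area_km2 for row in field for c in row if c > threshold)


def area_comparison(predicted, observed, thresholds, cell_area_km2):
    """For each threshold, pair the predicted and observed areas in excess."""
    return [
        (t,
         area_exceeding(predicted, t, cell_area_km2),
         area_exceeding(observed, t, cell_area_km2))
        for t in thresholds
    ]


# Hypothetical fields: the predicted cloud has the observed shape but is
# displaced by one cell, so cell-by-cell residuals are large.
predicted = [[0.05, 0.10, 0.20],
             [0.05, 0.15, 0.25]]
observed = [[0.10, 0.20, 0.05],
            [0.15, 0.25, 0.05]]

for threshold, pred_area, obs_area in area_comparison(
        predicted, observed, thresholds=[0.08, 0.12, 0.18], cell_area_km2=4.0):
    print(threshold, pred_area, obs_area)
```

In this constructed case the predicted and observed areas agree at every
threshold even though the cloud is displaced, which is the situation in
which an area measure is more informative than station residuals.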

    Even employing the above  measures, the  degree of seriousness of
the disagreement between  prediction and observation might not  be
obvious.  Since health effects result from both the pollutant  level  and
length of exposure, measures expressing differences in exposure/dosage
might  give an  indication  of a  model's ability to estimate the  inter-
action of population with pollutant.  This might be helpful in a number
of circumstances.   For example,  suppose prevailing winds on "worst"  epi-
sode days carry  the  pollutant cloud containing ozone and its precursors
into adjacent  rural  areas before the early-afternoon peak occurs.   If few
people live  in the affected area,  exposure/dosage  measures may  indicate
that the model's failure to accurately  predict peak concentrations is of
little practical  consequence.
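
     The rural-transport example above can be made concrete with a small
exposure calculation.  The sketch below is illustrative only: the
two-cell region, the population figures, the concentration values, and
the 0.08 threshold standing in for a standard are all hypothetical, as is
the function name.

```python
def person_hours_above(concentration, population, standard, hours_per_step=1.0):
    """Person-hours of exposure to concentrations above a standard.

    concentration[t][k] and population[t][k] give the concentration and
    the number of people in cell k during time step t.
    """
    total = 0.0
    for conc_t, pop_t in zip(concentration, population):
        for c, people in zip(conc_t, pop_t):
            if c > standard:
                total += people * hours_per_step
    return total


# Two cells (urban, rural) over two hours; the cloud peaks over the
# sparsely populated rural cell.
population = [[50000, 200], [50000, 200]]
predicted = [[0.06, 0.10], [0.07, 0.14]]
observed = [[0.06, 0.12], [0.07, 0.20]]

standard = 0.08  # hypothetical threshold, for illustration only
pred_exposure = person_hours_above(predicted, population, standard)
obs_exposure = person_hours_above(observed, population, standard)
print(pred_exposure, obs_exposure, pred_exposure - obs_exposure)
```

Here the model underpredicts the rural peak (0.14 against 0.20), yet the
predicted and observed person-hours of exposure are the same, since few
people live where the error occurs.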

-------
 2.   Some Types of Variations  Among  Performance Measures

      Three types of variations are found among performance measures:
 scalar, statistical, and "pattern recognition."   Those measures
 of the first type are based upon  a comparison of  the  predicted and
 observed values of a specific  quantity:   the peak concentration level,
 for instance.  Those of the second type  compare the statistical behavior
 (the mean, variance and correlation, for example) of  the  differences
 between the predicted and observed values for the quantities  of interest.
 Measures of the final type are useful in providing qualitative insight
 into model behavior, transforming concentration  "residuals"  (the  differ-
 ences between predicted and observed values)  into forms that highlight
 certain aspects of model performance and thus triggering "pattern
 recognition."
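
      As an illustration of the first two types of variation, the sketch
 below computes a scalar measure and several statistical measures for a
 single station.  It is a minimal example under assumptions of ours: the
 function name and sample values are hypothetical, the residual is taken
 as predicted minus observed, the variance is the population variance,
 and the correlation shown is that between the predicted and observed
 series.

```python
import math


def station_measures(predicted, observed):
    """Scalar and statistical measures for one station's hourly record."""
    n = len(predicted)
    residuals = [p - o for p, o in zip(predicted, observed)]
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    # The normalization is zero if either series is constant; this sketch
    # does not guard against that case.
    norm = math.sqrt(sum((p - mean_p) ** 2 for p in predicted)
                     * sum((o - mean_o) ** 2 for o in observed))
    return {
        "peak_residual": max(predicted) - max(observed),  # scalar measure
        "mean_residual": mean_res,                        # statistical
        "residual_variance": var_res,                     # statistical
        "correlation": cov / norm,                        # statistical
    }


# Hypothetical hourly concentrations at one station.
predicted = [0.04, 0.08, 0.12, 0.10]
observed = [0.05, 0.07, 0.14, 0.09]
print(station_measures(predicted, observed))
```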

       In order  to  illustrate the types of variations found in each
 generic performance measure category, we present Table V-2.   Some
 typical examples  are included for each category/variation combination.
 In section D of this chapter, a number of specific performance measures
 are  listed.  Examined in detail in Appendix C,  they are classified
 according to the  scheme presented here.

 3.    Several  Practical Considerations

      Several  practical considerations have a strong impact on the choice
 of model performance measures.   Each of these derives from limitations on
 the degree of spatial resolution attainable with most models  and  measure-
 ment  networks.

      Ideally, in assessing the performance of a  model, one might want to
 examine for several hours during the day the agreement between prediction
 and observation  throughout the concentration field (the spatial  distribu-
 tion of concentrations).  Differences between the  predicted and  observed
 values of the following could be uncovered thereby:  the location, timing,
 and level of  the concentration peak;  the  area exposed to a concentration
 in excess of a given value (e.g., the NAAQS);  and the concentration values
at stations within a measurement network.

-------
            TABLE V-2.  TYPES OF VARIATIONS AMONG GENERIC
                        PERFORMANCE MEASURE CATEGORIES

Generic Performance       Types  of
 Measure Category        Variations       Typical Example

  Peak                  Scalar            Concentration residual* at
                                          the peak.
                        Pattern           Map showing locations and
                        recognition       values of maximum one-hour-
                                          average concentrations for
                                          each hour.

  Station               Scalar            Concentration residual at the
                                          station measuring the highest
                                          value.
                        Statistical       Expected value, variance and
                                          correlation coefficient of
                                          the residuals for the model-
                                          ing day at a particular
                                          measurement station.
                        Pattern           At the time of the peak (event-
                        recognition       related), the ratio of the
                                          residual at the station hav-
                                          ing the highest value to the
                                          average of the residuals at
                                          the other station sites (this
                                          can indicate whether the model
                                          performs better near the peak
                                          than it does throughout the
                                          rest of the modeled region).

  Area                  Scalar            Difference in the fraction of
                                          the modeled area in which the
                                          NAAQS are exceeded.
                        Statistical       At the time of the peak, dif-
                                          ferences in the area/concen-
                                          tration frequency distribution.
                        Pattern           For each modeled hour, iso-
                        recognition       pleth plots of the ground-
                                          level residual field.

*Residual:  The difference between "predicted" and "observed."

-------
                          TABLE  V-2  (Concluded)


Generic Performance       Types  of
 Measure Category        Variations       	Typical Example	

  Exposure/dosage       Scalar            Differences  in the number of
                                         person-hours of exposure to
                                         concentrations greater than
                                         the NAAQS.
                        Statistical       Differences  in the exposure
                                         concentration frequency dis-
                                         tribution.
                        Pattern           For the  entire modeled day, an
                        recognition       isopleth plot of  the ground
                                         level dosage residuals.

-------
     Difficulties  hindering such an  examination arise  from two sources:
the limited spatial  resolution of the model  and the sparsity  of  the
measurement network.   While some models,  such as the Gaussian ones, are
analytic and thus  able to resolve the concentration field, many  cannot
do so completely.   Grid models, for  example, predict a single average
concentration value for each cell.  For this reason, they cannot resolve
the concentration  field on a spatial scale any finer than the intergrid
spacing (usually on the order of one or two kilometers for urban scale
grid models).  Trajectory models are similarly limited:  They can resolve
the concentration  field only as finely as the dimensions of the  air
parcel being simulated.  Further, predictions are computed only  for a
particular space-time track, and not for the entire concentration field.

     The relatively small number of stations in most measurement networks
limits the ability to reconstruct completely the concentration field
actually occurring on the modeled day.  While  stations are well-placed in
some networks, in others they  are not.  Thus,  not only are stations
often 3-10 kilometers apart, their  placement does not always guarantee
the observation of peak or near-peak concentrations.  Further, even in
extended urban areas, seldom does the  number of stations  exceed  10 to 20.

      For these reasons, concentration  fields generally are not  known
with  precision, from  either model predictions  or observational  data.
Estimates  of the  spatial  distribution  of  concentrations  can  be  obtained
only  by inference from "sparse"  data.  The  use of  numerical  processes,
such  as interpolation and extrapolation,  to extend those data introduces
additional  uncertainty into  the comparison  of predictions with  observations.

      Another consequence results from the limited  resolution of measure-
ment networks:  The value of the concentration peak actually occurring  on
 the day of observation may not be known.   Measurement networks  usually
 consist of fixed  stations arranged  in a set pattern.   Unless the air
 parcel  containing the peak drifts over or near one of the stations,  the
 maximum concentration value sensed by the network  will be less  (sometimes
 substantially so) than the value of the actual maximum.   When  prevailing

-------
winds and pollutant chemistry are highly predictable for the  days  of
worst episode conditions, station placement can be designed so as  to
maximize the likelihood of sensing the true peak.   When  conditions are
not so predictable, a measurement network with a modest  number of
stations has little chance of "seeing" the true peak. For instance,
suppose the cloud containing the peak and all  concentrations  within 20
percent of it covers an area of 25 square kilometers in  an urban area
having a total area of 1000 square kilometers.  If the cloud  has an
equal likelihood of being above any point in the urban region at the
time of the peak, by dividing the area of the cloud into the  total
urban area, we can make a crude estimate of the number of stations
required to guarantee a measurement within 20 percent of the  peak:
40 stations evenly spaced about 5 kilometers apart throughout the
urban region would be required.  Even if the probable location of
the cloud were known to be within an area equal to one-quarter of  the
urban area, 10 stations would be required just within that small area.
This degree of station density is high and may not be found in many
circumstances.
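
     The arithmetic behind this crude estimate can be written out as a
short sketch.  The function names are ours, and the calculation assumes,
as in the text, that the cloud is equally likely to lie anywhere in the
region and that stations are evenly spaced, one per cloud-sized area.

```python
import math


def stations_required(region_area_km2, cloud_area_km2):
    """Crude station count: one station per cloud-sized patch of the region."""
    return math.ceil(region_area_km2 / cloud_area_km2)


def station_spacing_km(cloud_area_km2):
    """Approximate spacing if each station covers a square of the cloud's area."""
    return math.sqrt(cloud_area_km2)


urban_area = 1000.0   # square kilometers
cloud_area = 25.0     # square kilometers within 20 percent of the peak

print(stations_required(urban_area, cloud_area))       # whole urban region
print(station_spacing_km(cloud_area))                  # spacing in kilometers
print(stations_required(urban_area / 4, cloud_area))   # one-quarter of the region
```

These calls reproduce the figures quoted above:  40 stations about
5 kilometers apart for the whole region, and 10 stations when the cloud
is known to lie within one-quarter of it.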

     The above example is a simplistic one.  The design  of actual
station placement can be a far more complex process than indicated here.
However, the example serves to underscore the main point:  a  measurement
network, though satisfying EPA regulations,* may still be unable to
guarantee an observation "close" to the actual concentration  peak, i.e.,
within 10 to 20 percent.

     The points raised in the above discussion have some practical
implications for the choice of a model performance measure.  Among
these are the following:
   * Source:  40 CFR §51.17 (1975).


-------
      >   Performance measures relying on a comparison of the
          predicted and "true" peak concentrations may not be
          reliable in all circumstances, since measurement networks
          can provide only the concentration at the station
          recording the highest value, not necessarily the value at
          the "true" peak.
      >   Performance measures relying on a comparison of the
          predicted and "true" concentration fields may not be
          computationally feasible, since neither predicted nor "true"
          concentration fields are always resolvable, spatially or
          temporally, at the scales required for comparison.
      >   Performance measures based upon a comparison of predicted
          and "true" exposure/dosage, though they are appealing
          because of their ability to serve as surrogates for the
          health effects experienced by the populace, may not be
          computationally feasible because of the difficulty in
          measuring the "true" population distribution and the
          "true" concentration field.  (We do suggest in Chapter
          VI, however, one means by which health effects
          considerations can be accounted for implicitly.)
      >   Performance measures based upon a comparison of the
          predicted and observed concentrations at station sites
          in the measurement network may be of the greatest practical
          value.*

      While the above points are general ones,  exceptions  to them do
 occur in specific  applications.  Also, certain performance measures,
 though  not fully reliable on  their own, can be useful  in  a qualitative
 sense when used in conjunction with other measures.

*Note caveat on pages VI-18 and VI-19 with respect to point source applications.

 C.    A BASIC DISTINCTION: REGIONAL VERSUS SOURCE-SPECIFIC
       PERFORMANCE MEASURES

      Some models are used to address multiple-source, region-oriented
 issues; others are applied to consider single-source issues.  The
 performance measures appropriate for each differ.  We consider here the
 distinction between regional and source-specific performance measures.

     The distinction is drawn not so much between  the type of performance
measure used (peak, station,  area, or exposure/dosage),  but  rather between
the spatial scales over which it is applied.  To address urban or regional
scale issues (SIP/C, AQMP), we must consider a  region hundreds of square
kilometers in area, with the spatial and temporal  distribution of
concentrations the result of emissions from many sources.  The quantities
of interest are:  the regional peak concentration  (its level, location
and timing) and for each hour during the day (particularly at the time
of the peak), the spatial distribution of the pollutant concentrations,
by species.  This information is frequently conveyed in the  form of  a
concentration isopleth diagram, an example of which is shown in Figure
V-3.  The  diagram shown was produced by the SAI Urban Airshed Model,
                *
illustrating its ozone predictions for the Denver Metropolitan region
at Hour 1200-1300 MST on 29 July 1975.

     To address single-source issues, on the other hand, we  consider
only the region downwind of the specific source being modeled.  While
emissions  from it contribute to the overall pattern and level of
regional pollutant concentrations, it is usually the incremental impact
of those emissions that is of concern.  The principal quantities of
interest are:  the peak incremental ground-level concentration downwind
of the source and the spatial distribution of the  incremental concen-
trations within the downwind ground-level "footprint."  Specific para-
meters describing the latter are:  the area within which concentrations
exceed a certain value and  the shape of the concentration isopleths, usu-
ally conveyed in the form of a diagram such as  the one shown in Figure  V-4.
This diagram was constructed using a Gaussian formulation for a continu-
ously emitting  elevated point source.  Conditions  are in steady-state and
"perfect"  reflection from the ground is assumed.   No inversion layer exists.
It should be noted that winds are unlikely to persist long enough  for
actual conditions ever to resemble these isopleths beyond 20 to 25  km
(about 6 to 10 hours).
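
The kind of formulation described above can be sketched with the standard steady-state Gaussian plume expression for ground-level concentration with perfect ground reflection and no inversion lid, C = Q / (pi u sigma_y sigma_z) exp(-y^2 / 2 sigma_y^2) exp(-H^2 / 2 sigma_z^2). This is an illustration only, not the calculation used to draw Figure V-4. The function name and all numerical inputs are hypothetical, and the dispersion coefficients sigma_y and sigma_z, which in practice depend on downwind distance and stability class, are simply supplied by the caller.

```python
import math

def ground_level_concentration(q_g_per_s, u_m_per_s, sigma_y_m, sigma_z_m,
                               crosswind_m, stack_height_m):
    """Ground-level concentration (g/m^3) downwind of an elevated point source.

    Steady-state Gaussian plume, perfect ground reflection, no inversion lid.
    sigma_y_m and sigma_z_m are the dispersion coefficients at the downwind
    distance of interest; choosing them is left to the caller (an assumption
    of this sketch).
    """
    lateral = math.exp(-crosswind_m ** 2 / (2.0 * sigma_y_m ** 2))
    vertical = math.exp(-stack_height_m ** 2 / (2.0 * sigma_z_m ** 2))
    return (q_g_per_s / (math.pi * u_m_per_s * sigma_y_m * sigma_z_m)
            * lateral * vertical)

# Hypothetical inputs, chosen only to exercise the function.
centerline = ground_level_concentration(100.0, 1.0, 200.0, 40.0, 0.0, 75.0)
off_axis = ground_level_concentration(100.0, 1.0, 200.0, 40.0, 300.0, 75.0)
print(centerline > off_axis)  # True: concentration falls off away from the plume axis
```

Evaluating the function over a grid of downwind and crosswind positions, with sigma values appropriate to each distance, would give the data from which isopleths like those in Figure V-4 could be contoured.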
                                   V-16

-------
[Figure V-3: isopleth diagram; only the map label NORTH survives in the source.]
-------
                                               NOTES:
                                               SOURCE STRENGTH = 1000 lbm/hr
                                               WIND = ? mph
                                               EFFECTIVE STACK HEIGHT = 250 ft
                                               PERFECT GROUND REFLECTION
                                               NO INVERSION
                                               E-STABILITY CLASS (SLIGHTLY STABLE)
[Isopleth plot.  Marker: 10-HOUR TRAVEL TIME.  Horizontal axis: Downwind Distance (kilometers).]
FIGURE V-4.  SAMPLE SPECIFIC-SOURCE ISOPLETH DIAGRAM ILLUSTRATING CONCENTRATIONS
             DOWNWIND OF A STEADY-STATE GAUSSIAN POINT SOURCE

-------
     Other types of sources produce different downwind isopleth
patterns.  In Figure V-5 we show qualitatively the downwind concentra-
tion patterns resulting from emissions from each of the three prin-
cipal source types:  point, line, and area.  These are only represen-
tations; the actual location, level, and shape of the isopleth lines
are heavily dependent on wind speed, source strength, and atmospheric
stability class.  The figure does indicate, however, the general  shape
of the downwind area within which the source impact is felt.

     The type of source provides information in two areas:  It identifies
the modeling region within which the peak, station, area, and exposure/
dosage performance measures are to be applied; and it provides insight
for monitoring network design.  The observational data against which
model performance is to be judged are gathered at the measurement stations
within that network.  To measure properly the impact produced by a
specific source, the measurement network should be deployed in a
pattern consistent with the concentration field shapes shown in Figure
V-5.  The station designed to measure the ground-level peak concentra-
tion should be located downwind from the source, several  kilometers
distant for an elevated point source and immediately adjacent for
either a line or an area source.  Located farther downwind are those
stations designed to resolve the concentration field and  to determine
the  concentration value most representative of the  regional incremental
impact of the source.  A schematic of such a measurement  network for a
point source is presented  in Figure V-6, showing one possible configu-
ration for the stations.

     Several difficulties  arise  in practice:  Wind  direction  is change-
able, and the location of  ground-level  footprints  is very sensitive to
atmospheric stability.  These problems  are particularly  acute when the
emitter  being considered is  an elevated point source.  To illustrate, we
show in  Figure V-7  the locus of  the downwind footprint if all wind direc-
tions are considered equally likely to  occur.   If we idealize the concen-
tration  isopleths  as being elliptical in shape,  we  can determine an
                                   V-19

-------
[Figure V-5 residue.  Legible labels: SOURCE; PREVAILING WIND; "tens of kilometers"; "several kilometers".  Panel (a): Point Source (e.g., Power ...]
-------
[Schematic labels: PREVAILING WIND; POINT SOURCE (X); STATION SENSING PEAK; X = MEASUREMENT STATIONS]
       FIGURE V-6.  SCHEMATIC OF A POINT SOURCE MEASUREMENT NETWORK
[Diagram labels: CONCENTRATION WITHIN A CERTAIN AMOUNT OF THE PEAK; MAXIMUM CONCENTRATION; MINIMUM CONCENTRATION OF INTEREST]
      FIGURE V-7.  LOCUS OF POSSIBLE FOOTPRINT LOCATIONS FOR AN
                   ELEVATED POINT SOURCE.  All wind directions
                   are considered equally likely.

              V-21

-------
expression for the ratio of the  area within a given  isopleth to the
area of the annulus, as shown in Figure V-7.  Doing so, we can evaluate a
sample problem.  Referring once again to Figure V-4, let the minimum
concentration value of interest be 300 µg/m³.  Then, obtaining from
the figure the appropriate values, we  can calculate  that the isopleth
contains only 1.2 percent of the total  area of the annulus.  A monitor
placed at random within the annulus would have only  a 1.2 percent chance
of observing a concentration greater than the minimum value of interest.
This problem is compounded if we consider variations in the inner and
outer radii due to the varying dispersive power of the wind.
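
The geometric idea behind this ratio can be sketched as follows, idealizing the isopleth as an ellipse and the locus of possible footprint locations as an annulus centered on the source. This is a sketch under those assumptions only. The function name and the dimensions used are hypothetical; they are not the values read from Figure V-4 and are not intended to reproduce the 1.2 percent result.

```python
import math

def isopleth_area_fraction(semi_major_km, semi_minor_km,
                           inner_radius_km, outer_radius_km):
    """Fraction of the annulus area that lies inside one elliptical isopleth."""
    ellipse_area = math.pi * semi_major_km * semi_minor_km
    annulus_area = math.pi * (outer_radius_km ** 2 - inner_radius_km ** 2)
    return ellipse_area / annulus_area

# Hypothetical footprint: the isopleth begins 5 km from the source and ends
# 25 km from it (so its semi-major axis is 10 km), with a 0.5 km semi-minor axis.
fraction = isopleth_area_fraction(10.0, 0.5, 5.0, 25.0)
print(f"{100.0 * fraction:.2f} percent of the annulus")
```

Under the equal-likelihood assumption, this fraction is also the chance that a monitor placed at random within the annulus lies inside the isopleth.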

     The message of all this is  clear: When winds are variable, fixed
monitoring stations have little  chance of characterizing the concen-
tration field downwind of an elevated  point source.   Several specific
implications result for the gathering  of  measurement data for computing
point source performance measures.   Among these are  the following:

     >  Measurement data may have to be gathered  using mobile
        monitoring stations.  Plume  cross sectional  sampling
        could be done then based on  the wind speed/direction
        and atmospheric stability observed in  "real  time."
     >  The annulus (or sector,  if winds  are more predictable)
        containing the locus of peak concentrations  is much
        smaller in area than that containing the  minimum
        concentration of interest and  is  much  closer to the
        source (usually ranging  from 1-5  km distant).

D.   SOME SPECIFIC PERFORMANCE MEASURES

     Having discussed model performance measures  in  generic terms, we
now present some specific examples.  We provide in Appendix C a detailed
discussion of each specific measure.   To  summarize here, we provide a
list for each of the four generic types of performance measures:  peak
(Table V-3), station (Table V-4), area (Table V-5),  and exposure/dosage

                                  V-22

-------
         TABLE V-3.  SOME PEAK PERFORMANCE MEASURES


   Type                            Performance Measure
Scalar                a.  Difference* in the peak ground-level
                          concentration values.

                      b.  Difference in the spatial location of
                          the peak.

                      c.  Difference in the time at which the
                          peak occurs.

                      d.  Difference in the peak concentration
                          levels at the time of the observed
                          peak.

                      e.  Difference in the spatial location of
                          the peak at the  time of  the observed
                          peak.

 Pattern              Map showing the locations  and values of the
 recognition          predicted maximum one-hour-average  concen-
                      trations for each hour.
* "Difference" as used here usually refers to "prediction minus
  observation."
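
Scalar measures (a) through (c) in Table V-3 can be sketched as follows, assuming that both predictions and observations are available as simple lists of (hour, x, y, concentration) records. This data layout and the function names are hypothetical and serve only as an illustration; "difference" follows the table's convention of prediction minus observation.

```python
import math

def peak_of(field):
    """Return the (hour, x_km, y_km, concentration) record with the highest concentration."""
    return max(field, key=lambda record: record[3])

def peak_measures(predicted, observed):
    """Differences (prediction minus observation) in peak level, location, and timing."""
    p_hour, p_x, p_y, p_conc = peak_of(predicted)
    o_hour, o_x, o_y, o_conc = peak_of(observed)
    return {
        "level_difference": p_conc - o_conc,
        "location_difference_km": math.hypot(p_x - o_x, p_y - o_y),
        "timing_difference_h": p_hour - o_hour,
    }

# Hypothetical records: (hour, x_km, y_km, concentration).
predicted = [(12, 0.0, 0.0, 0.10), (13, 3.0, 4.0, 0.18), (14, 6.0, 8.0, 0.15)]
observed = [(12, 0.0, 0.0, 0.12), (13, 3.0, 4.0, 0.16), (14, 0.0, 0.0, 0.20)]
measures = peak_measures(predicted, observed)
print(measures)
```

As discussed earlier in this chapter, the "observed" peak in practice is the highest value recorded at a station, which is not necessarily the true peak.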
                             V-23

-------
            TABLE V-4.  SOME STATION PERFORMANCE MEASURES

    Type                         Performance Measure

Scalar            a.  Concentration residual at the station measuring
                      the highest concentration (event-specific time
                      and fixed-time comparisons).
                  b.  Difference in the spatial locations of the
                      predicted peak and the observed maximum
                      (event-specific time and fixed-time comparisons).
                  c.  Difference in the times of the predicted peak
                      and the observed maximum.

Statistical       a.  For each monitoring station separately, the
                      following concentration residual statistics
                      are of interest for the entire day:
                      1)  Average deviation
                      2)  Average absolute deviation
                      3)  Average relative absolute deviation
                      4)  Standard deviation
                      5)  Correlation coefficient
                      6)  Offset-correlation coefficient.
                  b.  For all monitoring stations considered together,
                      the following residual statistics are of
                      interest:
                      1)  Average deviation
                      2)  Average absolute deviation
                      3)  Average relative absolute deviation
                      4)  Standard deviation
                      5)  Correlation coefficient
                      6)  Estimate of bias as a function of
                          concentration
                      7)  Comparison of the probabilities of
                          concentration exceedances as a function of
                          concentration
                  c.  Scatter plots of all predicted and observed
                      concentrations with a line of best fit
                      determined in a least squares sense.
                  d.  Plot of the deviations of the predicted versus
                      observed points from the perfect correlation
                      line compared with estimates of instrumentation
                      errors.
                                 V-24

-------
                        TABLE V-4 (Concluded)


   Type	      	Performance Measure	

Pattern           a.  Time history for the modeling day of the predicted
recognition           and observed concentrations at each site.
                  b.  Time history of the variations over all stations
                      of the predicted and observed average concentra-
                      tions.
                  c.  At the time of the peak (event-related),  the ratio
                      of the normalized residual  at the station having
                      the highest value to the average of the normal-
                      ized residuals at the other stations.
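
Several of the statistical measures listed in Table V-4 can be sketched as follows for a single station's hourly record. The definitions used here are assumptions made for illustration (residual taken as prediction minus observation, relative deviation normalized by the observation, sample standard deviation with an n-1 divisor); the detailed definitions are those of Appendix C and may differ. The function name is hypothetical.

```python
import math

def residual_statistics(predicted, observed):
    """Residual statistics for paired predicted and observed concentrations.

    Assumes equal-length series, at least two pairs, nonzero observations,
    and some variation in each series (needed for the correlation).
    """
    n = len(predicted)
    residuals = [p - o for p, o in zip(predicted, observed)]
    mean_dev = sum(residuals) / n
    mean_abs_dev = sum(abs(r) for r in residuals) / n
    mean_rel_abs_dev = sum(abs(r) / o for r, o in zip(residuals, observed)) / n
    std_dev = math.sqrt(sum((r - mean_dev) ** 2 for r in residuals) / (n - 1))
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    spread_p = sum((p - mean_p) ** 2 for p in predicted)
    spread_o = sum((o - mean_o) ** 2 for o in observed)
    correlation = covariance / math.sqrt(spread_p * spread_o)
    return {
        "average_deviation": mean_dev,
        "average_absolute_deviation": mean_abs_dev,
        "average_relative_absolute_deviation": mean_rel_abs_dev,
        "standard_deviation": std_dev,
        "correlation_coefficient": correlation,
    }

# Hypothetical hourly concentrations at one station.
predicted = [0.10, 0.14, 0.20, 0.12]
observed = [0.08, 0.16, 0.18, 0.12]
stats = residual_statistics(predicted, observed)
print(stats)
```

The same function applied to the pooled pairs from all stations would give a version of the "all monitoring stations considered together" statistics.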
                                 V-25

-------
              TABLE V-5.  SOME AREA PERFORMANCE MEASURES

    Type                         Performance Measure

Scalar            a.  Difference in the fraction of the area in which
                      the NAAQS are exceeded.
                  b.  Nearest distance at which the observed
                      concentration is predicted.
                  c.  Difference in the fraction of the area in which
                      concentrations are within 10 percent of the
                      peak value.

Statistical       a.  At the time of the peak, differences in the
                      fraction of the area experiencing greater than
                      a certain concentration; differences in the
                      following are of interest:
                      1)  Cumulative distribution function
                      2)  Density function
                      3)  Expected value of concentration
                      4)  Standard deviation of density function
                  b.  For the entire residual field, the following
                      statistics are of interest:
                      1)  Average deviation
                      2)  Average absolute deviation
                      3)  Average relative absolute deviation
                      4)  Standard deviation
                      5)  Correlation coefficient
                      6)  Estimate of bias as a function of
                          concentration
                      7)  Comparison of the probabilities of
                          concentration exceedances as a function of
                          concentration
                  c.  Scatter plots of prediction-observation
                      concentration pairs with a line of best fit
                      determined in a least squares sense.

Pattern           a.  Isopleth plots showing lines of constant
recognition           pollutant concentration for each hour during the
                      modeling day.
                  b.  Time history of the size of the area in which
                      concentrations exceed a certain value.
                  c.  Isopleth plots showing lines of constant residual
                      values for each hour during the day ("subtract"
                      prediction and observed isopleths).
                  d.  Isopleth plots showing lines of constant residuals
                      normalized to selected forcing variables
                      (inversion height, for instance).
                  e.  Peak-to-overall performance indicator, computed
                      by taking the ratio of the mean residual in the
                      area of the peak (e.g., where concentrations are
                      within 10 percent of the peak) to the mean
                      residual in the overall region.
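
Scalar measure (a) in Table V-5 can be sketched as follows, assuming that both the predicted and the "observed" concentration fields have been resolved onto the same grid of equal-area cells. As noted earlier in this chapter, such an observed field is often not available in practice. The function names, the sample fields, and the threshold value are hypothetical.

```python
def exceedance_fraction(field, threshold):
    """Fraction of equal-area grid cells whose concentration exceeds the threshold."""
    cells = [conc for row in field for conc in row]
    return sum(1 for conc in cells if conc > threshold) / len(cells)

def exceedance_fraction_difference(predicted_field, observed_field, threshold):
    """Prediction minus observation, in the fraction of area above the threshold."""
    return (exceedance_fraction(predicted_field, threshold)
            - exceedance_fraction(observed_field, threshold))

# Hypothetical 2 x 2 fields and an illustrative threshold.
predicted_field = [[0.05, 0.09], [0.13, 0.15]]
observed_field = [[0.05, 0.13], [0.14, 0.16]]
difference = exceedance_fraction_difference(predicted_field, observed_field, 0.12)
print(difference)  # -0.25
```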
                                  V-26

-------
(Table V-6).   We include scalar,  statistical, and qualitative/composite
pattern recognition variants.

E.   MATCHING PERFORMANCE MEASURES TO ISSUES AND MODELS

     To this  point we have identified several performance measures
categories, discussed their general attributes and data requirements,
and associated with them a number of specific performance measures.
Two tasks remain in this chapter:  We first indicate for each of the
generic types of issues the performance measures most appropriate for
use; we then discuss the capability of each generic class of model  to
calculate those measures.

1.   Performance Measures and Air Quality Issues

     In Chapter III we identified seven generic types of air quality
issues, dividing them into two broad categories.  Within the first of
these, multiple-source issues, we included:  State Implementation Plan/
Compliance (SIP/C) and Air Quality Maintenance  Planning  (AQMP).  The
second category, source-specific  issues, was  defined to  contain the
following:   Prevention of Significant  Deterioration  (PSD), New Source
Review (NSR), Offset Rules  (OSR),  Environmental  Impact Statements/
Reports  (EIS/R), and Litigation  (LIT).  For each of  these issues we now
consider some important  distinctions that  bear on  the  selection of the
most appropriate model performance measures (PMs).

      >   Multiple-Source  Issues
         - SIP/C.   The  compliance portion of a SIP details
           plans  for achieving ambient pollutant levels at
           or below the  NAAQS in Air Quality Control  Regions
            (AQCRs) currently  in  noncompliance.  Because it
            is the  peak  concentration level that is of primary
            concern, a model should demonstrate its ability
            to predict that peak.  For a day chosen as the one

                                   V-27

-------
         TABLE V-6.  SOME EXPOSURE/DOSAGE PERFORMANCE MEASURES

    Type                         Performance Measure

Scalar            a.  Difference for the modeling day in the number of
                      person-hours of exposure to concentrations:
                      1)  Greater than the NAAQS
                      2)  Within 10 percent of the peak.
                  b.  Difference for the modeling day in the total
                      pollutant dosage.

Statistical       a.  Differences in the exposure/concentration
                      frequency distribution function; differences in
                      the following are of interest:
                      1)  Cumulative distribution function
                      2)  Density function
                      3)  Expected value of concentration
                      4)  Standard deviation of density function
                  b.  Cumulative dosage distribution function as a
                      function of time during the modeled day.

Pattern           For each hour during the modeled day, an isopleth
recognition       plot of the following (both for predictions and
                  observations):
                      1)  Dosage
                      2)  Exposure
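
The scalar measures in Table V-6 can be sketched as follows. The definitions here are assumptions made for illustration: exposure is counted as person-hours spent in cells where the concentration exceeds a threshold, and dosage as the sum over cells and time steps of concentration times population times duration. The detailed definitions are those of Appendix C and may differ. All names and values are hypothetical.

```python
def person_hours_above(concentrations, population, threshold, hours_per_step=1.0):
    """Person-hours of exposure to concentrations above the threshold.

    concentrations[t][i] is the concentration in cell i at time step t;
    population[i] is the number of people assumed to remain in cell i.
    """
    total = 0.0
    for step in concentrations:
        for conc, people in zip(step, population):
            if conc > threshold:
                total += people * hours_per_step
    return total

def total_dosage(concentrations, population, hours_per_step=1.0):
    """Sum of concentration x population x duration over all cells and steps."""
    return sum(conc * people * hours_per_step
               for step in concentrations
               for conc, people in zip(step, population))

# Hypothetical two-cell, two-hour example.
population = [1000, 500]
predicted = [[0.10, 0.14], [0.13, 0.15]]
observed = [[0.11, 0.13], [0.11, 0.15]]
exposure_difference = (person_hours_above(predicted, population, 0.12)
                       - person_hours_above(observed, population, 0.12))
dosage_difference = total_dosage(predicted, population) - total_dosage(observed, population)
print(exposure_difference)  # 1000.0
```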
                                 V-28

-------
to be used for model  verification,  peak  performance
measures should be computed.   Also  contained within
SIPs are emissions control  strategies.   To  assess
the effects of controlling  specific sources, a  model
must be capable of spatially  resolving  its  concen-
tration predictions.   Area  PMs should be calculated,
if possible, to evaluate a model's ability to do so.
Station PMs are another means to evaluate model
spatial resolution, although pollutant cloud offset
can account sometimes for apparent large discrep-
ancies.  Because SIP/C is most frequently an issue
in densely populated urban areas, large differences in
health  effect  impact can exist between prediction and
observation.   Exposure/dosage PMs should be calcu-
lated,  if possible, in order to evaluate the ac-
ceptability of a model's performance.
- AQMP.  Detailed within the maintenance portion of
a SIP are procedures for ensuring, once compliance
has been achieved, that ambient pollutant concen-
trations do not again rise above the NAAQS.  Because
violation of the NAAQS is an issue, peak PMs are
important measures of model performance.  However,
because pollutant levels are  low  (relative to  the
values before compliance), small errors  in model
performance might not produce a  large  uncertainty
 in expected health impact.  Consequently,  the  use
 of exposure/dosage PMs may not be  necessary.   Also,
 emissions control strategies  may not be as global.
 Retrofit of control  devices  on existing sources will
 have been accomplished.   Automotive emissions  will
 have been controlled (presumably) such that point
 sources will   contribute a large fraction of the
 emissions inventory.  While incremental growth and
 development will alter the spatial and temporal
                        V-29

-------
   distribution of pollutants,  the  need  for modeling
   spatial  resolution  may not be so crucial  as  it was
   with SIP/C.   Agreement between prediction and  observa-
   tion as  measured by area  and station  PMs, while desir-
   able, may not always be required within  the  same
   tolerance as for SIP/C issues.
>  Specific-Source Issues
   -  PSD.  Individual sources are not permitted to cause
   more than small incremental increases in concentra-
   tions in areas currently in attainment of the NAAQS.
   Since these so-called "Class I" regions (often state
   or national parks) are generally some distance from
   the polluting source (>10 kilometers), a model must
   be able to predict accurately ground-level concentra-
   tions some distance downwind from the source.  If the
   source being modeled is by itself likely to produce
   near-stack ground-level concentrations in excess of
   the NAAQS or increments greater than Class II allow-
   able increments, peak measures are of particular
   interest.  Otherwise, "far-field" concentration predic-
   tions are more important than estimates of the peak
   value.  Downwind station PMs are often the measures
   most suitable for evaluating model predictions for
   PSD Class I.  Also, plumes from point sources are very
   narrow, that is, their cross-wind dimensions are much
   smaller than their downwind ones.  Consequently, the
   incidence of a Class I violation may be quite sensi-
   tive to model performance, as measured by area PMs.
   However, exposure/dosage PMs are not likely to be of
   interest because of the sparsity of population in areas
   where PSD is an issue and the relatively low concentra-
   tions occurring there.
   -  NSR.  New source review is an important issue in both
   urban and nonurban regions.  With the density of popula-
   tion in urban areas, many persons may live within a short
   distance (<5 kilometers) of a source.  The ground-level
   peak concentration, then, may be an important
                           V-30

-------
indicator of near-source health impact.   Prediction
of that peak, as measured by a peak PM,  may be an
important model performance requirement.  However,
because ground-level concentrations fall off rapidly
farther downwind and because of the "narrowness" of
the plume, differences in exposure and dosage between
prediction and observation may not be of substantial
consequence.  Close agreement, as measured by area
and exposure/dosage PMs, may not be required.  Also,
in order to assess the impact of a new or modified
source, it is necessary to know its incremental effect
on regional air quality.  This is best represented  by
an "average" concentration value (including background)
well downwind of the source (>10 kilometers).  Thus,  a
model should demonstrate its ability to reproduce mea-
surement data at that downwind range.  The use of
station PMs  is  indicated.
- OSR.  In order to construct a new source or modify
an existing  one in  a  region experiencing concentra-
tions in excess of  the NAAQS, the owner of the source
must arrange for the  removal of existing sources.
An amount  greater than the emissions  from  the  proposed
new source must be  removed from the  regional  inven-
tory.  Currently, these  "offsets" are made on  the
basis of  emissions  rather  than as a  result of their
impact on  ambient concentrations.   In such a  case,
no air quality predictions  are required (unless a
region-wide  violation is attributable to  the  source
being  removed  or  cleaned up).  Only  an  accurate
emissions  inventory is  necessary.   However,  if off-
sets were "negotiated"  at the level  of  ambient concen-
 trations, the  predictions of air quality  models would
 assume significance.   The "far"  downwind  concentration
 value,  representative of its regional incremental
 impact,  would be the quantity of greatest interest,
                         V-31

-------
since it would describe the source's offset "potential."
Station  PMs then would be of use in evaluating
model performance.
- EIS/R.  Projects having a significant, adverse impact
on air quality usually are presented for public
review by means of an EIS or an EIR. Such projects
generally consist of one or a few distinct sources,
although some consist of a greater number.  An
example of the latter is the Denver Metropolitan
Wastewater Overview EIS recently completed by
Region VIII of the EPA.  Federal funding for
twenty-two separate sewerage treatment  facilities
was conditioned upon favorable review of the EIS
which examined their combined regional impact.  If
the sources are widely distributed throughout the
modeling region, spatial resolution may be an im-
portant model requirement.  In such a case, area
and station PMs would provide a useful  means  to
verify model acceptability.  If the combined
emissions from the proposed sources are relatively
low or they are localized to a narrow downwind
plume, their incremental health impact may be
small.  Exposure/dosage PMs  might be applied  to
assess model performance.  However, if, as in
Denver, the potential impact is more serious and
widespread, this latter type of PM can  be useful.
- LIT.  Court challenges can arise to the basic air
pollution laws themselves, to their implementation
in federal regulation, or to decisions regarding
specific sources (requests for variances and
applications for construction/modification approval,
for example).   While challenges of the  first two
types can and have  had important consequences, we
identify the third  type as the principal variant
included in LIT.  When the specific source in question
                        V-32

-------
           is  to be  located  in an urban area, the model used to
           estimate  its effects should be expected to predict
           both its  near-source, ground-level concentration peak
           and its far-field "average" value.  Peak and station
           PMs should be used.  If the source is to be constructed
           in  a rural area,  PSD may  be an issue  in arriving at a
           build/no-build  decision.   If so,  accuracy of spatial
           resolution could  be important.  The use of area PMs
           could be  of assistance.

     We summarize  in Table V-7 many  of  the  points  mentioned  above.   In  it
issues are associated with the  generic  categories  of performance measures
most commonly required  for use  in assessing model  performance.   However,
exceptions do occur.  For this  reason,  the  final choice  of performance
measures should be dictated by  the character of the  specific application.

2.   Performance Measures and Air Quality Models

     In the previous section we associated performance measures with gen-
eric types of issues.  We now discuss the ability of generic classes of
models to  generate  predictions in a form suitable for calculation of those
measures.  All model types  produce estimates of the concentration peak.
Some can predict station concentrations.  Fewer can spatially resolve
the concentration field.  Fewer still are able  to determine an estimate
of exposure/dosage.  For each generic model category, we outline here
its general capabilities.

      >  Grid.  The formulation of grid models permits the estimation
         of concentrations averaged for each grid cell.
         Consequently, the  concentration  field can be  resolved
         spatially as finely as  the  dimensions of  the  grid cell.
         The  peak  is estimated  to be the  maximum ground-level
         grid cell  concentration occurring  during  the  modeling  day.
         The  location of the peak is predicted only  as closely  as

                                    V-33

-------
     TABLE V-7.  PERFORMANCE MEASURES ASSOCIATED
                 WITH SPECIFIC ISSUES

                          Performance Measure Type
     Issue          Peak   Station   Area   Exposure/Dosage
Multiple-source
  SIP/C              X        X       X            X
  AQMP               X        X       X
Specific-source
  PSD                X        X       X
  NSR                X        X       X
  OSR                         X       X
  EIS/R              X        X       X            X
  LIT                X        X       X
                         V-34

-------
   a single grid  cell  dimension.  The  value  at  the peak is
   predicted only as  an  area  average in  the  vicinity of the
   peak (within one grid cell).   Because of  its spatial and
   temporal resolution,  predictions suitable for calculation
   of station,  area and  exposure/dosage  performance measures
   also can be  generated.
>  Trajectory.  Because  a single air "column" is simulated,
   only concentrations along  the space-time  track followed
   by the advecting air parcel  can  be  estimated.  Such
   models, as a consequence,  can predict station concentra-
   tions only for those over  which  they pass.  If several
   adjoining parcels are modeled, predictions at other
   stations can be determined.   The spatial  location of the
   peak can be estimated only as closely as  the dimensions
   of the air column.  The peak level  is estimated to be
   the greatest column-averaged concentration occurring
   during the modeling day.  Averaging can take place over
   the entire vertical region from the ground to the inver-
   sion base, or over the lowest of several vertical  column-
   layers.  Because of their limited spatial resolution,
   regional trajectory models do not generate predictions
   in a form suitable for the calculation of area or
   exposure/dosage PMs.  Specific-source trajectory models,
   on the other hand, may do so.  Concentrations are pre-
   dicted as a function  of downwind distance from the source.
   Though  lateral  resolution is  limited, concentration esti-
   mates can be put  in  a form appropriate for  calculation of
    station, area  and  exposure/dosage PMs.
 >   Gaussian.   Concentration  field predictions  are expressed
    analytically.   Thus,  subject  to the  steady-state limita-
    tions of their formulation,  the short-term  averaging
    versions of these models  can  provide their  estimates  in a
    form that is  suitable for the calculation of all performance
    measure types.  The  long-term averaging  versions, however,
                              V-35

-------
        predict regional or sector-averaged estimates of annual
        concentrations.   Estimates of exposure/dosage (except
        crudely on the basis of an annual concentration level) are
        difficult to derive.  Predictions  of annual  station averages,
        though, can be obtained for regional models  of this type.
     >  Isopleth.  Estimates in no other form than the regional
        peak concentration can be obtained with this method. This
        can be done only when the isopleth diagrams  can be inter-
        preted in an absolute sense.  This is the case only when
        the isopleth diagram has been derived for ambient condi-
        tions similar to the ones in the area being modeled.  In
        addition, a prediction of the peak can be verified only
        if a historical data base exists that is sufficient to
        determine a peak concentration in a previous base year and
        a record of the emissions cutbacks occurring since then.
     >  Rollback.  The only prediction obtainable from rollback
        is an estimate of the regional peak concentration.  This
        is determinable only if an historical data base exists
        such as that described for the isopleth method.
     >  Box.  A prediction of the regional peak concentration
        can be determined using this method.  No other estimates
        requiring finer spatial resolution can be computed.
        Estimates of the diurnal variation in regional average
        concentration, however, can be made.

     We summarize in Table V-8 many of the points mentioned above.  In
it, we indicate for each generic model the type of performance  measure
that may be calculated, given the capabilities and limitations  of each
formulation.

F.   PERFORMANCE MEASURES:  A SUMMARY

     In this chapter we identified generic performance measure  categories,
listed some specific performance measures, and then associated  the

        TABLE V-8.   PERFORMANCE MEASURES THAT CAN BE
                    CALCULATED BY EACH MODEL TYPE
                                 Performance Measure Type
           Model
Refined usage
  Grid
    Region oriented
    Specific source oriented
  Trajectory
    Region oriented
    Specific source oriented
  Gaussian
    Long-term averaging
    Short-term averaging
Refined/screening usage
  Isopleth
Screening usage
  Rollback
  Box
Peak  Station
  X
  X
      Exposure/
Area   Dosage
X
X
X
X
X
X
X
X
X
X
X
X
X
X

X
X
X
X
X

X

X

generic measures with generic issues, noting for each model type the PMs
it is capable of calculating.  Having done so, we are now ready to
proceed with the final objective of this report:  the discussion of
model performance standards.  The presentation in Chapter VI will be
based upon the points raised in this chapter.  The following are of
crucial importance:

     >  Measurement networks often do not sense the "true"
        concentration peak.
     >  Only performance measures based upon station measure-
        ment data may be computationally feasible.
     >  Model predictions are often resolvable on a finer
        scale than measured concentrations; even though
        strict  comparison of prediction with observation
        through some  computed measure may  not be fruitful,
        the model  predictions themselves may still offer
        valuable insight.

               VI  MODEL  PERFORMANCE STANDARDS


     The central  purpose of this report is to suggest  means  for  setting
performance standards for air quality dispersion models.   Toward that end
our discussion has proceeded as follows:  Issues were  identified (Chapter
III); issue/model combinations were presented (Chapter IV);  and  alternative
issue/model/performance measure associations were discussed  (Chapter V).
We are now at the final step:  the setting of standards.   To place  this
in the proper framework, we first identify five attributes of desirable
model performance, showing how their relative importance  depends on the
issue being addressed and the pollutant being considered. Then  we  recom-
mend specific performance measures whose values reveal the presence or
absence of each performance attribute.  We detail several rationales for
establishing standards for those measures.  To illustrate the use of these
measures in assessing model performance, we present a  sample case.   It  is
based upon SAI experience in using a grid-based photochemical model in  the
Denver metropolitan region.  Finally, we detail possible forms the  actual
standard might assume, suggesting a sample draft outline and format.

     The subject  addressed  in  this report is a broad  and complex one.
Seldom  can a rule for  judging  model performance be stated that does not
have several plausible exceptions  to  it.  Consequently, we view the estab-
lishment of model performance  standards to be a pragmatic and evolutionary
exercise.  As we gain  experience  in  evaluating model  performance, we will
need to modify both  our  choice of performance measures and  the  range of
acceptable values we insist on.   Nevertheless,  the process must begin
 somewhere.   The  recommendations contained in this chapter represent such
a beginning.

     We feel  the measures and standards we suggest  for use  here will almost
 certainly change as experience improves our "collective  judgment"  about
what constitutes model acceptability and what does  not.   Perhaps the
number of measures will increase to provide richer insight into model
performance, or perhaps the number will shrink without any loss of "informa-
tion content."  Regardless of the list of measures and their standards that
ultimately emerges for use, it is the conceptual structuring of the per-
formance evaluation itself that seems to be most important at this point.
We must identify the attributes of a well-performing model, and we need to
understand how we assess their relative importance, depending on the issue
we are addressing and the pollutant species we are considering.  The dis-
cussion in this chapter offers a conceptual structure for "folding in" all
these concerns and suggests candidate measures and standards.

A.   PERFORMANCE STANDARDS:  A CONCEPTUAL OVERVIEW

     The chief value of air quality models lies in their predictive ability.
Only through their use can the consequences of pollution abatement alter-
natives be assessed and compared.  Only by means of model predictions  can
the impact of emissions from newly proposed sources be estimated and evalua-
ted for acceptability.  However, because the questions typically asked of
models are hypothetical ones, their predictions are inherently nonverifiable.
Only after the proposed action has been taken and the required implementation
time elapsed will measurement data confirm or refute the model's predictive
ability.

     Herein lies the dilemma faced by users of air quality models:  If
a model's predictions at some future time cannot be verified in advance,
on what basis can we rely on that model to decide among policy alternatives?
In resolving this, most users have adopted a pragmatic approach:  If a
model  can demonstrate its ability to reproduce for a similar type of appli-
cation a set of "known" results, then it is judged an acceptable predictive
tool.   It is on this basis that model "verification" has become an essential
prelude to most modeling exercises.

    A further difficulty exists.   What constitutes a set of "known" results?
This is  not a  problem easily solved.   For "answers" to be known exactly, the
"test" problem must be simple enough  to be solved analytically.  Few problems
involving atmospheric dynamics are so simple.   Most  are  complex and nonlinear.
For these, the analytic test problem is an unacceptable  one.  Another, more
practical alternative often is employed.  For  regional,  multiple-source
applications, the "known" results are taken to be the station measurements
of concentrations actually recorded on a "test" date. For pollutants having
a short-term standard, the duration of measurement is a  day or less.  For
those subject to a long-term (annual) standard, the  duration is a year or more.

     For source-specific applications, the source of interest may not yet
exist, permission for its construction being the principal issue at hand.
For these applications, it is often necessary  to verify  a model using the
most appropriate of several prototypical "test cases."  These could be assembled
from measurements taken at existing sources, the variety of source size,
type and location spanning the range of values found in  applications of interest.

     The term "known" is used imprecisely when referring to a set of measure-
ment data.  Station observations are subject to instrumentation error.  The
locations of fixed monitoring sites may not be sufficiently well distributed
spatially to record data fully characterizing the concentration field and  its
peak value.  Nevertheless, despite those shortcomings, "observed" data often
are regarded as "true" data for the purposes of model verification.

     Having assembled two sets of data, one "known"  and  the other  "predicted,"
we can assess model performance by comparing one with the other.   Predic-
tion and observation, however, can be compared in many ways. We must select
the quantities that can best characterize the distribution of pollutants  in
the ambient air, for  it is through comparison of their predicted and  observed
("known") values that we specify model performance.   We catalogued a  number
of useful performance measures in Chapter V, as well as  in Appendix  C.
Later in this chapter we indicate the subset we view as  having the
greatest practical usefulness.

     Once we have decided on the performance measures best suited  to  our
issue/application (and most feasible  computationally), we can calculate
these values.   Having done so,  however, we must ask a central  question:   How
close must prediction be to observation in order for us to judge  model per-
formance as acceptable?  In order for us to answer "how good is good," per-
formance standards for these measures must be set, with allowable tolerances
(predicted values minus observed ones) derived based upon a reasonable
rationale (health effects or pollution control cost considerations,  for
instance).

     By setting these standards explicitly, certain benefits may  be  gained.
Among these are the following:

     >  A degree of uniformity is introduced in assessing model
        reliability.
     >  The impact of  limitations in both data gathering proce-
        dures and measurement network design can be made more
        explicit, facilitating any review of them that may be
        required.
     >  The performance expected of a model is stated clearly,
        in advance of  the expenditure of substantial analysis
        funds, allowing model selection to be a more straight-
        forward and less "risky" process.
     >  The needs for additional research can be identified clearly,
        with such efforts more directed in purpose.

B.   PERFORMANCE STANDARDS:  SOME PRACTICAL CONSIDERATIONS

     Before continuing, we point out several practical considerations that
can  have  a direct impact on model verification.  Among the most important
of these  are the following:  data limitations  (due to its form,  quantity,
quality,  and availability); time/resource constraints; and variability in
the  level and timing of analysis requirements.  We discuss each of these
in turn.

1.    Data Limitations

     For a modeling simulation to be conducted, data must be gathered  charac-
terizing both the "driving forces" (emissions, meteorology, and vertical
temperature profile, for example) and the "resulting effects" (pollutant
concentrations).  To do so requires an extensive and coordinated effort.
Consequently, complete data sets usually are assembled for only a few  sample
days.  The dates on which these data are gathered are chosen as ones  likely
to be typical of "worst" episode conditions.  However, unanticipated shifts
in meteorology (frontal passage, for example) can occur, confounding attempts
to measure ambient conditions on high-concentration days.  Consequently,  the
data available for model verification may not be representative of conditions
on the day when the "second highest" concentration occurs, i.e., the worst
NAAQS violation.

     Confronted with such a situation, the modeler must decide the following:
Even if model performance proves acceptable for non-episode conditions,  can
it be considered "verified" as a predictive tool for higher-concentration
days?  This question is part of a still more general one:  Should a model
be verified for more than one day, each of these days experiencing a  dif-
ferent peak concentration?  If such a procedure were followed, model  perfor-
mance could be evaluated for concentrations ranging from the current  peak
value to ones nearer the NAAQS.  But, the meteorology occurring on days
experiencing low peak concentrations is not typical of that occurring  on
high peak days.  Should not the model, when used as a predictive tool,
employ maximum-episode meteorology?  We do not answer these questions  here
but note their importance as questions remaining to be resolved.  We  observe,
however, that limitations on data quantity and availability can constrain us,
limiting our flexibility in dealing with these questions.

     Another difficulty can arise because of spatial limitations in the
data.  As we noted in the last chapter, measurement networks provide
concentration data only at a few fixed sites.  In general, these networks
cannot guarantee observation of the "true" peak, nor are they sufficiently
well-spaced to assure that the  "true"  concentration field can be reconstructed
from the station measurements.   As  a practical  matter,  however,  these  station
data must form the basis for the comparison of  prediction and observation.
Station-type performance measures,  as  defined in Chapter V,  therefore  must
be the "preferred" (or rather the "unavoidable") measures of interest.   We
detail some of these later in Section  D.

2.   Time/Resource Constraints

     The amount and quality of the data collected and the level of
modeling analysis performed are strongly influenced by time deadlines
and resource constraints.   This  has  several  consequences among which
are the following:  Because it  is difficult, expensive  and time  consuming
to mount special data gathering efforts, heavy  reliance is placed  on previously
gathered data, even with its recognized deficiencies; also,  model  selection
occasionally is made more on the basis of  the form and  extent of existing
data and financial budgetary considerations than on grounds  more technically
justifiable.  In such cases a conscious choice  has been made, trading  model
performance for other considerations.

     The combined effect of inadequate data and inappropriate model choice
can reduce in value any assessment  of  model performance.  In this  report,
however, we take the following  view:   The  level of performance required of
a model is determined not by exogenous considerations  but by the  nature of
the issue and the specific modeling application.

3.   Variability of Analysis Requirements

     Modeling analysis requirements differ from one application  to another.
There is an important question  to ask  in every  modeling situation:  How
much analysis is justified?  In the Los Angeles Basin,  for instance,  attain-
ment of the NAAQS for ozone cannot  be  achieved  without  widespread  and
extensive hydrocarbon (HC) emissions  control.   Ambient  HC levels are  currently
so high that more HC radicals are available than are "needed" by the  chain
of photochemical reactions that results in the O3 peak.   Consequently, reduc-
tions in HC emissions must be sizable before any appreciable  reduction in
peak O3 can be achieved.  The result of this is the following:  Estimates of
the percentage HC emissions control  required to reach NAAQS compliance in
Los Angeles are so high (75 to 80 percent) that they are not  strongly sensi-
tive to uncertainties in the value of the O3 peak, either measured  or predicted.

     If the only questions to be answered depended on the general  region-
wide level of HC emissions control required (a SIP/C-related  problem), then
a fair amount of uncertainty could be tolerated in model predictions of  the
O3 peak.  Use of a less sophisticated model might be acceptable.   Were a
different issue/question addressed, however, a model providing more detailed
predictions might be required.

C.   MODEL PERFORMANCE ATTRIBUTES

     Model predictions  are subject  to a number of sources  of uncertainty.   Some
of these  are  data related, while others are inherent in  the model theoretical
formulations.   Regardless of their  source,  however, errors manifest themselves
in similar ways.  They  may affect a model's ability to predict peak concen-
trations, as  well as  introduce  systematic bias or gross  error into its pre-
dictions.  They may  limit a  model's  ability to  reproduce temporal variation
or affect the spatial  distribution  of  the concentration  field.

     What are the attributes of desirable model  performance?  Ideally, we
would  ask that a model  have  five major attributes, the strength of our insis-
tence  depending on the  circumstance of our application and the pollutant we
are  considering.  The five model  performance  attributes  are:  accuracy of the
peak prediction,  absence of systematic bias, lack of gross  error, temporal correlation,
and  spatial  alignment.   The  first of these concerns  the  model's
ability to  predict accurately the level,  timing, and  location  of the concen-
tration peak.  The second attribute is the absence  of systematic bias, where
predictions  are shown not to differ from observations in any consistent and
unexplained  way.  The third attribute concerns the lack of gross  error, or
 rather the absolute  amount by which predictions differ  from  observations.

     We clarify the difference between bias and error by means of the
following example.  Suppose when we compare a set of model predictions with
station observations, we find several large positive residuals (predicted
minus observed concentrations) balanced by several equally large negative
residuals.  If we were testing for bias, we would allow the oppositely
signed residuals to cancel.  A conclusion that the model displayed no syste-
matic bias therefore might be a justifiable one.  On the other hand, were
we testing for gross error, the signs of the residuals would not be considered,
with oppositely signed residuals no longer allowed to cancel.  Because the
absolute value of the residuals is large in our example, we might well con-
clude that the model predictions are subject to significant gross error.
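
     The distinction drawn above can be sketched numerically.  The Python
fragment below is a minimal illustration only: the concentration values are
hypothetical, and the function names (mean_bias, mean_gross_error) are ours,
not measures defined in this report.  It assumes bias is summarized by the
signed mean residual and gross error by the mean absolute residual.

```python
# Illustrative sketch: a bias test lets oppositely signed residuals cancel,
# while a gross-error test ignores their signs.
# All concentration values are hypothetical (arbitrary units).

def residuals(predicted, observed):
    """Residuals are predicted minus observed concentrations."""
    return [p - o for p, o in zip(predicted, observed)]


def mean_bias(predicted, observed):
    """Signed mean residual: oppositely signed residuals cancel."""
    r = residuals(predicted, observed)
    return sum(r) / len(r)


def mean_gross_error(predicted, observed):
    """Mean absolute residual: signs are dropped, so nothing cancels."""
    r = residuals(predicted, observed)
    return sum(abs(x) for x in r) / len(r)


observed = [10.0, 12.0, 8.0, 14.0]    # hypothetical station observations
predicted = [15.0, 7.0, 13.0, 9.0]    # hypothetical model predictions

print(mean_bias(predicted, observed))         # 0.0
print(mean_gross_error(predicted, observed))  # 5.0
```

With residuals of +5, -5, +5 and -5, the signed mean is zero while the mean
absolute residual is 5, matching the situation described in the example.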

     The fourth of the desirable performance attributes is that of temporal
correlation.  When this is important, can the model reproduce the temporal
variation displayed by the observational data?  A model might be judged as
being capable of doing so if its predictions varied in phase with observa-
tion, that is, if they were "correlated."  The fifth desirable attribute is
that of spatial alignment.  At each time of interest, does the model  pre-
dict a concentration field that is distributed spatially like the observed
one?  To determine this, correlation of prediction with observation could
be assessed at several points in the concentration field, e.g., monitoring
stations.
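
     One way these two attributes might be examined is sketched below.  It
assumes, as our own simplification, that "correlation" means the ordinary
Pearson coefficient, computed over time at one station for temporal
correlation and over stations at one time for spatial alignment.  The data
layout (conc[hour][station]), the function names, and the values are all
hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def temporal_correlation(pred, obs, station):
    """Correlate predicted and observed time histories at one station."""
    return pearson([row[station] for row in pred],
                   [row[station] for row in obs])


def spatial_correlation(pred, obs, hour):
    """Correlate predicted and observed values across stations at one hour."""
    return pearson(pred[hour], obs[hour])


# Hypothetical concentrations: 3 hours (rows) by 3 stations (columns).
obs = [[2, 4, 6], [4, 8, 6], [6, 4, 2]]
pred = [[3, 5, 6], [5, 9, 7], [6, 5, 3]]

print(temporal_correlation(pred, obs, station=0))
print(spatial_correlation(pred, obs, hour=0))
```

A value near 1 indicates predictions varying in phase with observations; how
large a value should be required is a question of standards, not addressed by
this sketch.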

     The five performance attributes are interrelated.  Suppose, for instance,
that our model does not reproduce well the photochemistry of ozone formation
in the atmosphere.  Not only could its estimates of the concentration peak
be in error,  but also its temporal correlation and spatial alignment might
be poor.   Even if the model  predicted the peak properly, problems might still
exist.   If the chemistry were "fast," the peak, though correct, might be pre-
dicted to occur sooner than  that actually observed.   Even if atmospheric
transport were properly modeled,  performance measures might then "detect"
temporal  and spatial  problems.

     By treating each performance attribute separately,  we may run the risk
of rejecting  a model  on several  grounds  where only a single reason actually
exists.  For example, slight errors in the wind field input to the model
might result in predictions apparently wrong both spatially and temporally.
Yet, only a single defect exists, in this case not due to the model at all.

     Nevertheless, we adopt a conservative viewpoint.  We suggest evaluating
the model separately for the presence of each attribute, even though they
themselves may be interrelated.  Redundancy should not result in a satis-
factory model being unfairly rejected.  If model predictions are good, they
will be acceptable both spatially and temporally.  If they are poor, they
will probably be rejected, both for temporal and spatial reasons.

     If model performance is mixed, showing, for example, good temporal cor-
relation but poor spatial alignment, two possibilities exist.  Either the
model performance may not be particularly poor or the performance measure
used to detect one or the other performance attribute is deficient (too
stringent or too lenient).   In either case, however, forcing model perfor-
mance to be reassessed makes sense.  On balance, while requiring a model to
"jump the hoop" twice may be redundant in looking for the same problem,  it
should  provide us a measure  of safety in the "double-check"  it provides, pre-
suming  each attribute assumes the  same importance (see  the discussion below).

     Although they are interrelated, the five model performance attri-
butes are distinct.  Consequently, we must  employ different  kinds  of per-
formance measures to determine the presence of  each attribute.  While we
defer to Section D a statement of  specific  measures we  recommend using,  we
list  in Table VI-1 their objectives.

     We have  identified five model performance  attributes.   Which  of  these,
however,  is most  important?  This  question  has  no unique answer, the  rela-
tive  importance in each problem  depending on  the type of issue the model
is being  used  to  address  and the type of  pollutant  under consideration.
In order to  relate attribute importance  to  application  issue in  a  more con-
venient manner, we  present  in  Table  VI-2  a  matrix of generic issue class
(as defined  earlier  in  this report)  and  problem type.   For each  combination


               TABLE VI-1.   PERFORMANCE MEASURE OBJECTIVES

   Performance
   Attribute            Objective of Performance Measures

Accuracy of the         Assess the model's ability to predict the concentration
peak prediction         peak (its level, timing and location)

Absence of              Reveal any systematic bias in model predictions
systematic bias

Lack of gross           Characterize the error in model predictions both at
error                   specific monitoring stations and overall

Temporal                Determine differences between predicted and observed
correlation             temporal behavior

Spatial alignment       Uncover spatial misalignment between the predicted
                        and observed concentration fields

     TABLE VI-2.   IMPORTANCE OF PERFORMANCE ATTRIBUTES BY ISSUE
Performance Attribute

Accuracy of the peak
prediction

Absence of systematic
bias
Lack of gross error

Temporal correlation

Spatial alignment
       Importance of Performance Attribute*

   SIP/C    AQMP   PSD   NSR   OSR   EIS/R   LIT

     1111211
     1
1
1
1
1
1
1
1
2
2
1
2
2
1
3
1
1
3
3
1
3
3
1
3
3
1
3
3
* Category 1 - Performance standard must always be  satisfied.
  Category 2 - Performance standard should be satisfied,  but some  leeway
               may be allowed at the discretion of  a  reviewer.
  Category 3 - Meeting the performance standard is  desirable but failure
               is not sufficient to reject the model;  measures  dealing
               with this problem should be regarded as "informational."

we indicate an "importance category."   We define  the  three  categories based
upon how strongly we insist our model  demonstrate the presence of a given
attribute.  For Category 1, we require that performance  standards always
be satisfied (the problem type is of prime importance).   For  Category 2,
we state that the standard should be satisfied but some  leeway ought to
be allowed, perhaps at the discretion of a reviewer (while  the problem type
is of considerable importance, some degree of "mismatch" may  be  tolerable).
For Category 3, we are not insistent that standards be met, though we state
that as being a desirable objective (the problem  type is not  of  central
importance).
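
     The three importance categories can be read as a simple decision rule.
The sketch below is our own hedged rendering of that rule; the outcome labels
and the function name review_outcome are hypothetical and are not terms used
elsewhere in this report.

```python
# Hypothetical outcome labels: the categories are defined in the text,
# but these strings are illustrative only.

def review_outcome(category, standard_met):
    """Map an importance category and a pass/fail result to an outcome."""
    if category not in (1, 2, 3):
        raise ValueError("category must be 1, 2, or 3")
    if standard_met:
        return "satisfied"
    if category == 1:
        return "reject model"          # standard must always be satisfied
    if category == 2:
        return "reviewer discretion"   # some leeway may be allowed
    return "informational only"        # failure alone does not reject the model


for cat in (1, 2, 3):
    print(cat, review_outcome(cat, standard_met=False))
```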

     A number of assumptions are embedded in Table VI-2. Among  the more
significant are the following:

     >   Both peak and "far-field" concentrations  are  of  interest
        in considering PSD and NSR questions.
     >   Specific-source issues  (PSD, NSR,  OSR,  EIS/R  and LIT) most
        often deal with sources assumed  to be  continuously  emitting
        at a constant level  (or nearly so);  consequently, performance
        measures  considering time variations between  prediction
        and observation are not the principal measures of interest.
     >   Spatial agreement between prediction and  observation  is par-
        ticularly important in  applications where PSD is an issue;
        this is so because source impact on pristine  areas  (Class I)
        and elevated terrain  (Class II)  often  occurs  well downwind
        of the  source,  with the magnitude and  incidence  of  impact
        highly  directional  and  spatially dependent.
     >   Specific-source impact generally occurs in a  narrow downwind
        plume;  thus, the monitoring network set up to provide measure-
        ment data often consists of only a few stations; as a result,
        the calculation of all-station performance measures may  not
        prove meaningful.
     >   Error is  less important in considering regional  issues than is
        the presence of a systematic bias.
     >   To  achieve  and maintain compliance with the NAAQS  (SIP/C, AQMP),
        alternate control  strategies must be developed and evaluated.
        For this to be done properly,  some degree of spatial resolution
        should be attained by  the model  and verified.

     The relative importance of each performance attribute is dependent
on the  type of pollutant being considered and  the averaging  time required
by the  NAAQS.   If a species is subject to a short-term standard, for
instance, accuracy  of the peak prediction and  temporal correlation might
be of considerable  concern, depending  on the issue  being addressed.  How-
ever, if the species is  subject  to a  long-term standard, neither of these
problem types is of appropriate form.  We  indicate in Table VI-3 a matrix
of the performance  attributes  and pollutant species.  We rank each combina-
tion by the same importance  categories we used earlier in  Table VI-2.

     Conceivably, a conflict might exist between  the ranking indicated by  the
issue and  the pollutant matrices  in Tables  VI-2 and VI-3.  We suggest  resolving
the conflict in favor of the  less stringent of the  two  rankings.   For example,
suppose the issue being addressed was  SIP/C and the pollutant  being considered
was CO.  According to Table  VI-2, the  accuracy of the peak prediction should
be regarded as Category 1 (the standard must always be  satisfied).  However,
according  to Table VI-3, it  should be  considered  as Category 2  (the standard
should be  satisfied but some  leeway may be  allowed).  The  conflict should
be resolved by allowing the  combined issue/pollutant ranking to  be Category 2.
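
     The resolution rule just described can be sketched in a few lines.
Because a higher category number denotes a less stringent requirement, the
combined ranking is taken here as the larger of the two category numbers.
The function name combined_category is hypothetical.

```python
def combined_category(issue_category, pollutant_category):
    """Resolve a conflict in favor of the less stringent ranking.

    Category 1 is the most stringent and Category 3 the least, so the
    less stringent of two rankings is the larger category number.
    """
    for c in (issue_category, pollutant_category):
        if c not in (1, 2, 3):
            raise ValueError("category must be 1, 2, or 3")
    return max(issue_category, pollutant_category)


# The example from the text: Category 1 by issue, Category 2 by pollutant.
print(combined_category(1, 2))  # 2
```

This sketch does not handle entries marked as not applicable; those would
need separate treatment.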

D.   RECOMMENDED MEASURES AND STANDARDS

     In this section we reach a  major  goal  of  this  report:  We  identify a
recommended set of performance measures and propose rationales  for setting
standards  for each.  Our discussion in this section unfolds as  follows.
First, we  isolate a candidate list of performance measures from which we
select the recommended set.   Then, we detail  several rationales  on which to
base standards for our "preferred" measures.   Using these we identify
specific "guiding principles" from which standards  may be set.   In a final


             TABLE VI-3.   IMPORTANCE OF PERFORMANCE ATTRIBUTES BY
                           POLLUTANT AND AVERAGING TIME

                              Pollutants with Short-term Standards                   Pollutants with Long-term Standards
Performance       O3        CO**      NMHC*     SO2       NO2       CO        TSP**     SO2**     NO2       TSP       SO2
Attribute         (1 hour)‡ (1 hour)  (3 hour)  (3 hour)  (†)       (8 hour)  (24 hour) (24 hour) (1 year)  (1 year)  (1 year)

Accuracy of the   1         1         1         1         1         1         1         1         3         3         3
peak prediction

Absence of        1         1         1         1         1         1         1         1         1         1         1
systematic bias

Lack of gross     1         1         1         1         1         1         1         1         1         1         1
error

Temporal          1         2         2         2         1         2         3         3         N/A††     N/A       N/A
correlation

Spatial           1         2         2         2         1         2         2         2         2         2         2
alignment

 *  Category 1 - Performance standard must always be satisfied.
    Category 2 - Performance standard should be satisfied, but some leeway may be allowed at the discretion of a reviewer.
    Category 3 - Meeting the performance standard is desirable but failure is not sufficient to reject the model.
 †  No short-term NO2 standard currently exists.
 ‡  Averaging times required by the NAAQS are in parentheses.
 ** Primary standards.
 †† The performance attribute is not applicable.


synthesis,  we present a  summary  table  listing  for each  performance attri-
bute, the recommended measures and  a means  for setting  standards for them,
along with a sample value for the standard  (ones listed are appropriate
for the Denver case study described in Section E of  this chapter).

1.   Recommended Performance Measures

     Of the many performance measures  considered in Chapter V (and in  more
detail in Appendix C), which of these are most suitable for use in establishing
standards for model performance?  The answer to this is constrained in two
major ways, the first conceptual and the second practical.  First, the con-
ceptual constraint is imposed by the types of performance attributes we are
concerned with:  The measures must adequately assess the presence  or absence
of each of the five attributes.   Second, the practical  constraint  is imposed
by the "sparseness" of  the observational data:  Since station observations
constitute  the only data available for characterizing "true" ambient con-
ditions, we  have little choice but to employ station performance measures
in determining model acceptability.

     We draw a distinction between those measures  that are of general  use
in examining model performance and the much smaller subset of them that is
most amenable to the establishment of explicit standards.  Many measures
can  provide rich insight into model behavior but the information is conveyed
in a qualitative way not suitable for quantitative characterization (a
requisite for use in setting performance standards).  These "measures,"
often involving graphical display,  really are tools for use in "pattern
recognition."  They display model  behavior in suggestive ways, highlighting
"patterns" whose presence reveals much about model  performance.  Several
examples of such "measures" are isopleth contour maps of predicted concen-
trations and estimated  "observed" ones, isopleth contour maps of the dif-
ferences between the two, and time histories of predicted and observed con-
centrations at specific monitoring stations.

-------
     Though we focus on station measures for use in setting model performance
standards, we do not suggest the calculation of performance measures be
limited to them.  Many others, where each is appropriate,  should be used.
The data should be viewed in as many, varied ways as possible  in order to
enrich insight into model behavior.   We suggest a number of useful measures
both in Chapter V and Appendix C.

     Given that station measures are our "preferred" (rather,  our "unavoid-
able") choice, we now consider the list of candidate measures. From these
we select our final recommended set.  We present the candidate station
performance measures in Table VI-4.  We group them by the number of stations
compared, noting the performance attribute and generic issue class each is
most suited to address.  We identify four types of comparisons:

     >  Event Specific Values.  Predicted and observed concentra-
        tions are compared at the time a specific event occurs.
        For instance, the peak station prediction can be compared
        with the peak station observation, even though these  may
        occur at different stations and times.
     >  Comparative Values.  Predicted and observed concentrations
        are compared at  the same monitoring station.
     >  Average Values.  Predicted  and observed concentrations are
        compared averaged  for all monitoring stations.
     >  Offset Values.   Observed concentrations at  a given station
        are compared with  predicted values offset  by a small   amount
        spatially  (values  at  near-by stations)  and/or  temporally (values
        at other  times,  either earlier or later).
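
     The event-specific comparison can be illustrated with a minimal sketch.
The station names and concentrations below are made up for illustration, and
the helper `station_peak` is a hypothetical name, not part of any model
package.  The sketch finds the peak station prediction and the peak station
observation independently, so the two may fall at different stations and hours.

```python
# Hypothetical hourly concentrations (ppm) keyed by station name.
# Each list holds one value per modeling hour; the numbers are made up.
predicted = {
    "station_a": [0.04, 0.09, 0.14, 0.11],
    "station_b": [0.05, 0.12, 0.10, 0.07],
}
observed = {
    "station_a": [0.05, 0.08, 0.10, 0.12],
    "station_b": [0.04, 0.10, 0.09, 0.13],
}


def station_peak(field):
    """Return (value, station, hour index) of the largest value in the field."""
    return max(
        (value, station, hour)
        for station, series in field.items()
        for hour, value in enumerate(series)
    )


pred_peak, pred_station, pred_hour = station_peak(predicted)
obs_peak, obs_station, obs_hour = station_peak(observed)

# Event-specific values: the two peaks need not share a station or an hour.
peak_ratio = pred_peak / obs_peak
timing_difference = pred_hour - obs_hour  # hours; negative means the predicted peak is early

print(pred_station, obs_station, round(peak_ratio, 3), timing_difference)
```

     A comparative-value calculation would instead pair the two series at the
same station, and an average-value calculation would pool all stations.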

     Performance measures  are of two different  kinds:   "absolute" and
 "informational."   The  first type includes  those measures  for  which we can
 set specific,  absolute standards.   Measures  of  the second type are more
 informational  in  nature, providing  qualitative  insight  into model performance.
 Their  values  are  to be considered as "advisory,"  having associated with them
 no specific standard.

-------
TABLE VI-4.    CANDIDATE STATION PERFORMANCE MEASURES

                                                                                       Issue Category
                                                                                       Multiple-     Specific-Source
Stations          Performance            Performance Measure                           Source        (PSD, NSR, OSR*,
Considered        Attribute              Description                    Status         SIP/C  AQMP   EIS/R, LIT)

Peak Stations     Accuracy of the peak   1.  Difference between or      Absolute       X      X      X X X X X
(Event-Specific   prediction                 ratio of peak station
Values)           (Concentration level)      concentrations (could be
                                             at different measurement
                                             stations)

                                         2.  Difference between or      Absolute       X      X      X XX
                                             ratio of predicted and
                                             observed concentrations
                                             at the station recording
                                             the maximum measured
                                             value

                  Accuracy of the peak   3.  Spatial displacement       Informational  X      X      X XX
                  prediction                 between predicted and
                  (Location of Peak)         observed peak stations

                  Accuracy of the peak   4.  Timing difference          Absolute       X      X
                  prediction                 between occurrence of
                  (Timing of Peak)           predicted and observed
                                             peak

Each Station      Absence of             5.  Average relative           Absolute       X      X      XXX XX
Separately        systematic bias            deviation
(Comparative
Values)

                  Lack of gross error    6.  Average absolute           Absolute       X      X      XXX XX
                                             relative deviation

                                         7.  Standard deviation of      Absolute       X      X      XXX XX
                                             deviations

                  Temporal correlation/  8.  Correlation coefficient    Absolute       X      X
                  spatial alignment

                  Temporal correlation   9.  Temporal offset            Informational  X      X
                                             correlation coefficient

                                         10. Plots of comparative       Informational  X      X
                                             time histories

All Stations      Absence of             11. Average relative           Absolute       X      X      XXX XX
Together          systematic bias            deviation
(Average Values)

                  Lack of gross error    12. Average absolute           Absolute       X      X      XXX XX
                                             relative deviation

                                         13. Standard deviation of      Absolute       X      X      XXX XX
                                             deviations

                                         14. Correlogram of             Informational  X      X
                                             prediction-observation
                                             pairs

                                         15. Ratio of peak to           Informational  X      X
                                             average deviation

                  Temporal correlation/  16. Correlation coefficient    Absolute       X      X
                  spatial alignment


-------
TABLE VI-4  (Concluded)

                                                                                       Issue Category
                                                                                       Multiple-     Specific-Source
Stations          Performance            Performance Measure                           Source        (PSD, NSR, OSR*,
Considered        Attribute              Description                    Status         SIP/C  AQMP   EIS/R, LIT)

Nearby Stations   Temporal correlation   17. Temporal offset            Informational
(Offset Values)                              correlation coefficient

                                         18. Plot of comparative        Informational
                                             time histories

                  Spatial alignment      19. Spatial offset             Informational
                                             correlation coefficient
                                             (comparison at the same
                                             time)

                  Temporal correlation/  20. Spatial/temporal offset    Informational
                  spatial alignment          correlation coefficient
                                             (comparison at different
                                             times)

* These measures are appropriate if offsets are considered at the level of ambient concentrations
  rather than primary emissions.

          Often in  practice  modeling predictions are known  with greater spatial
     resolution than measurement data.   The  predicted concentration  field, for
     instance, can  be resolved at  intervals  of several kilometers  or less by
     various  types  of models,  including grid and Gaussian ones.  To  retain the
     information contained in  concentration  field  predictions, several "hybrid"
     performance measures can  be employed.   With these, concentration field
     predictions are compared  with station measurements.  We list  in Table VI-5
     several  of these hybrid measures.   When predictions are available in this
     more detailed  form, these measures may  be calculated to supplement those  in
     Table VI-4.

-------
     Our recommended choice of performance measures  is  based upon  the
following criteria:

     >  The measure is an accurate indicator of the  presence of a
        given performance attribute.
     >  The measure is of the "absolute" kind, that is, specific
        standards can be set.
     >  Only station measures should be considered for use in
        setting standards.  (This is more an unavoidable choice
        than a preferred one.)

     Based on these criteria, we have selected the set of measures described
in Table VI-6.  The use of ratios ($C_{p_p}/C_{p_m}$, the ratio of the predicted
to the measured station peak, and $\bar{\mu}$, for example) can introduce
difficulties:  They can become unstable at low concentrations, and the
statistics of a ratio of two random variables can become troublesome.
Nevertheless, when used properly, their advantages can be offsetting.  For
example, the use of $C_{p_p}/C_{p_m}$ instead of $(C_{p_p} - C_{p_m})$ permits
a health effects rationale to be used in recommending a performance standard
(see the later discussion of the Health Effects rationale).
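
     The instability of ratios at low concentrations can be seen in a small
numerical sketch.  The concentrations are made up, and the function name
`ratio_and_difference` is hypothetical.  The same 0.02 ppm overprediction
gives a ratio near unity at a high ambient level but a ratio of about three
at a low one.

```python
def ratio_and_difference(predicted, observed):
    """Return (predicted / observed, predicted - observed) for one pair."""
    return predicted / observed, predicted - observed


# Same 0.02 ppm overprediction at a high and at a low observed level
# (illustrative values only).
high_ratio, high_diff = ratio_and_difference(0.22, 0.20)
low_ratio, low_diff = ratio_and_difference(0.03, 0.01)

print(round(high_ratio, 2), round(low_ratio, 2))
```

     This is one reason for restricting ratio-based measures to
concentrations above some minimum value.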

     Before continuing, however, we insert an important caveat.  For calcu-
 lation of these measures to be statistically meaningful, a certain minimum
 level of  spatial and temporal "richness" must be available from monitoring
 data.  Often, this criterion is met for multiple-source, urban applications.
 However,  for isolated point source applications, it may not  be.   For such
 cases, data inadequacies may be overcome by using prototypical  "test bed"
 data bases  for the purposes of model verification.  Selection of  the
 proper  "test bed" could be accomplished by choosing the prototypical data
 base that describes an application most nearly  like the proposed  one.
These data bases, where they do not already exist,  could be assembled
through special measurement efforts at existing large point sources.
Monitoring could be extensive enough to ensure adequate data "richness."

     As a practical matter, however, such "test beds" are not currently
available.  Verification instead must be conducted using whatever data are
at hand.  These may be provided by tracer experiments.  Alternatively,
where a source already exists (for instance, where retrofit of pollution
control equipment is the issue or where construction of a new source is
to occur on the site of an existing source), some site-specific data already
may be available.

     Considerable care should be exercised when using such data to calcu-
late the performance measures listed in Table VI-6.  If the data are too
"sparse," in either a spatial or a temporal sense, these measures may be
of little value, or worse yet, may actually be misleading.  Additional
work needs to be conducted to identify, if possible, supplementary
performance measures for use when the available data are inadequate for
reliable use of the recommended measures.

     Having stated the above caveat, we continue.  A number of key
assumptions are embedded in the choice of the specific measures shown in
Table VI-6.  We state several of them:

     >   Concentration  gradients  within a  pollutant  cloud can  be
         "steep".   Thus  a slight  spatial  misalignment of the cloud,
         perhaps an inconsequential problem on its own, can sometimes
         result in the predicted  peak  occurring at a different
         monitoring station than the measured peak.   Estimating the
         value of the concentration peak, however,  is often of
         much greater importance than predicting its exact location.

-------
TABLE VI-5.    USEFUL HYBRID PERFORMANCE MEASURES

                                                                                       Issue Category
                                                                                       Multiple-     Specific-Source
Stations          Performance            Performance Measure                           Source        (PSD, NSR, OSR*,
Considered        Attribute              Description                    Status         SIP/C  AQMP   EIS/R, LIT)

Peak Station      Accuracy of the peak   1.  Difference between or      Absolute       X      X      X X X X X
(Event-Specific   prediction                 ratio of predicted peak
Values)           (Concentration level)      concentration and highest
                                             station value

                  Accuracy of the peak   2.  Spatial displacement       Informational
                  prediction                 between the predicted
                  (Location of Peak)         peak and the station
                                             measuring the highest
                                             value

                  Accuracy of the peak   3.  Timing difference          Informational
                  prediction                 between occurrence of
                  (Timing of Peak)           the predicted peak and
                                             the maximum station
                                             measurement

Each Station      Spatial alignment      4.  Plot showing for each      Informational
Separately                                   hour during the day the
(Comparative                                 distance and direction
Values)                                      from the measurement
                                             station to the nearest
                                             point at which a predicted
                                             concentration occurs equal
                                             to the station measured
                                             value

All Stations      Lack of gross error    5.  Difference for each        Informational
Together                                     hour between the average
(Average Values)                             predicted concentration
                                             (averaged over the entire
                                             field) and the average
                                             station measurement
                                             (averaged over all
                                             stations)

                                         6.  Difference for each        Informational
                                             hour between the standard
                                             deviations of the
                                             predicted concentrations
                                             and the station measured
                                             values
                                                                                                                   X      X
                                                                                         X       X
                                                                                         X       XXX       XX
                                                                                         X       XXX       XX

-------
 TABLE VI-6.  MEASURES RECOMMENDED FOR USE IN SETTING MODEL PERFORMANCE STANDARDS†

Performance
Attribute              Performance Measure

Accuracy of the        Ratio of the predicted station peak to the measured station
peak prediction        peak (could be at different stations and times):
                       $C_{p_p}/C_{p_m}$

                       Difference in timing of occurrence of station peak*:
                       $\Delta t$

Absence of             Average value and standard deviation of the mean deviation
systematic bias        about the perfect correlation line normalized by the average
                       of the predicted and observed concentrations, calculated for
                       all stations during those hours when either the predicted or
                       the observed values exceed some appropriate minimum value
                       (possibly the NAAQS):
                       $\bar{\mu}$, $\sigma_{\bar{\mu}}$ (OVERALL)

Lack of gross          Average value and standard deviation of the absolute
error                  deviation about the perfect correlation line normalized by
                       the average of the predicted and observed concentrations,
                       calculated for all stations during those hours when either
                       the predicted or the observed values exceed some appropriate
                       minimum value (possibly the NAAQS):
                       $|\bar{\mu}|$, $\sigma_{|\bar{\mu}|}$ (OVERALL)

Temporal               Temporal correlation coefficients at each monitoring station
correlation*           for the entire modeling period and an overall coefficient
                       averaged for all stations:
                       $r_{t_i}$, $r_{t_{OVERALL}}$
                       for $1 \le i \le M$ monitoring stations

Spatial alignment      Spatial correlation coefficients calculated for each modeling
                       hour considering all monitoring stations, as well as an
                       overall coefficient averaged for the entire day:
                       $r_{x_j}$, $r_{x_{OVERALL}}$
                       for $1 \le j \le N$ modeling hours

  * These measures are appropriate when the chosen model is used to consider questions
    involving photochemically reactive pollutants subject to short-term standards.

  † There is deliberate redundancy in the performance measures.  For example, in
    testing for systematic bias, $\bar{\mu}$ and $\sigma_{\bar{\mu}}$ are calculated.
    The latter quantity is a measure of "scatter" about the perfect correlation line.
    This is also an indicator of gross error and could be used in conjunction with
    $|\bar{\mu}|$ and $\sigma_{|\bar{\mu}|}$.
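
     The temporal and spatial correlation coefficients in the table can be
sketched as follows.  This is a minimal illustration under stated
assumptions:  the concentrations are made up, the helper `pearson` is our own
plain implementation of the ordinary product-moment coefficient, and the
overall coefficients are taken as simple averages of the individual ones.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Rows are M = 3 stations, columns are N = 4 modeling hours (made-up ppm values).
predicted = [
    [0.04, 0.08, 0.12, 0.09],
    [0.05, 0.10, 0.11, 0.07],
    [0.03, 0.06, 0.09, 0.08],
]
observed = [
    [0.05, 0.09, 0.11, 0.10],
    [0.04, 0.11, 0.12, 0.06],
    [0.04, 0.05, 0.10, 0.07],
]

# Temporal correlation: one coefficient per station, taken across hours.
r_t = [pearson(p, o) for p, o in zip(predicted, observed)]
r_t_overall = sum(r_t) / len(r_t)

# Spatial correlation: one coefficient per hour, taken across stations.
pred_by_hour = list(zip(*predicted))
obs_by_hour = list(zip(*observed))
r_x = [pearson(p, o) for p, o in zip(pred_by_hour, obs_by_hour)]
r_x_overall = sum(r_x) / len(r_x)

print(len(r_t), len(r_x))
```

     With only a few stations, each hourly spatial coefficient rests on very
few points, which is the data "sparseness" problem noted in the text.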

-------
        Consequently, we suggest, when this seems reasonable (judgment
        is necessary here), comparing the peak station prediction with
        the peak station measurement, regardless of when or where they
        both occur.
     >  For pollutants subject to short-term standards, concentration
        levels vary diurnally.  It is reasonable to insist that
        short-term predictions emulate that pattern.  Differences in the
        timing of the peak should be considered (particularly for
        photochemically reactive pollutants) and temporal correlation
        should be evaluated.
     >  In many circumstances, percentage differences between predicted
        and observed concentrations seem better indicators of model
        performance than gross differences.  For instance, a difference
        of 0.04 ppm of ozone might be regarded as serious if ambient
        levels were 0.10 ppm, whereas it might not be if those levels
        were 0.24 ppm.  The use of such measures can cause some problems:
        Ratios can become unstable at low concentrations, and the
        statistics of a ratio of two random variables can be complex.
        Nevertheless, percentage differences should be calculated
        (possibly along with gross differences).  Further, we suggest
        that residuals (prediction minus observation) be taken about the
        perfect correlation line (prediction equals observation), since
        we have no a priori reason to regard observation as any more
        accurate than prediction.  This was pointed out by Anderson
        et al. (1977).  We also suggest normalizing the residuals by the
        arithmetic average of the predicted and observed concentrations.
     >  The concentrations of greatest interest are often the higher
        values, that is, those that exceed some appropriate minimum
        value (possibly the NAAQS, though this may differ from one
        situation to another).  We may be less interested in model
        reliability below those levels.  We suggest that performance
        measures include only those prediction-observation "pairs" where
        one or the other value exceeds the chosen minimum value.
        ("Stratification" may also be of interest, that is, repeating
        the calculation of measures using different minimum values.)
        This should not be done, however, if it results in the number
        of pairs being reduced below the number required for
        statistical significance.
     >  Measurement stations usually are widely spaced.  We assumed
        this spacing to be so great that the use of spatial/temporal
        offset correlation coefficients would be of uncertain value.
        Consequently, we did not include them among the list of
        measures recommended for use.
     >  Redundancy should be built into the calculation of
        performance measures.  This provides an internal means for
        double-checking results.  For example, in testing for
        systematic bias, $\bar{\mu}$ and $\sigma_{\bar{\mu}}$ are
        calculated.  The latter quantity is a measure of "scatter"
        about the perfect correlation line.  This is also an indicator
        of gross error and should be used in conjunction with
        $|\bar{\mu}|$ and $\sigma_{|\bar{\mu}|}$.
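
     The assumptions listed above can be combined in a short sketch of the
bias and gross-error statistics.  Several choices here are ours rather than
the report's:  the function name `normalized_deviation_stats` is hypothetical,
the residual is taken simply as prediction minus observation divided by the
arithmetic average of the pair, the sample (n - 1) standard deviation is
used, and the ozone values and the 0.08 ppm cutoff are illustrative only.

```python
from statistics import mean, stdev


def normalized_deviation_stats(pairs, minimum):
    """Bias and gross-error statistics for (predicted, observed) pairs.

    Only pairs in which either value exceeds `minimum` are kept.  Each
    residual (prediction minus observation) is divided by the arithmetic
    average of the two values.  At least two pairs must survive the filter.
    """
    kept = [(p, o) for p, o in pairs if p > minimum or o > minimum]
    deviations = [(p - o) / ((p + o) / 2.0) for p, o in kept]
    absolute = [abs(d) for d in deviations]
    return {
        "pairs_used": len(kept),
        "mean_deviation": mean(deviations),       # signed: systematic bias
        "std_deviation": stdev(deviations),       # scatter about the line
        "mean_abs_deviation": mean(absolute),     # unsigned: gross error
        "std_abs_deviation": stdev(absolute),
    }


# Made-up ozone pairs in ppm; 0.08 ppm is only an illustrative cutoff.
pairs = [(0.10, 0.12), (0.14, 0.10), (0.05, 0.04), (0.09, 0.07), (0.06, 0.10)]
stats = normalized_deviation_stats(pairs, minimum=0.08)

print(stats["pairs_used"], round(stats["mean_deviation"], 4))
```

     Stratification amounts to calling the same function with several values
of `minimum` and checking that enough pairs remain each time.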

2.   Recommended  Performance Standards
     Having identified the performance measures requiring a  specific
standard, we now consider four alternative rationales  for setting those
standards.  We designate the four as  follows:

     >   Health Effects
     >   Control Level Uncertainty
     >   Guaranteed Compliance
     >   Pragmatic/Historic

     The guiding principles for each  of these  rationales  are stated in
Table VI-7.

     We describe in detail each rationale in Appendix D, deferring their
technical description in order not to interrupt the flow of this chapter.
However, to offer insight into their general nature, we present here a
brief outline of each.

-------
          TABLE VI-7.  POSSIBLE RATIONALES FOR SETTING MODEL
                       PERFORMANCE STANDARDS

      Rationale                         Guiding Principle

Health Effects           The metric of concern is the area-integrated
                         cumulative health effects due to pollutant exposure;
                         the ratio of the metric's value based on prediction
                         to its value based on observation must be kept to
                         within a prescribed tolerance of unity.

Control Level            Uncertainty in the percentage of emissions control
Uncertainty              required must be kept within certain allowable
                         bounds.

Guaranteed Compliance    Compliance with the NAAQS must be "guaranteed";
                         all uncertainty must be on the conservative side
                         even if this means introducing a systematic bias.

Pragmatic/Historic       In each new application a model should perform
                         at least as well as the "best" previous performance
                         of a model in its generic class in a similar
                         application; until such a historical data base is
                         complete, other more heuristic approaches may be
                         applied.

   >  Health Effects.  The most fundamental reason for setting
      air quality standards is to limit the adverse health impact
      the regulated pollutants (and their products) produce.
      Thus, founding a model performance standard on a health
      effects basis has strong intuitive appeal.  To do so, we
      assume an analytic form for urban population distribution
      and an exposure/dosage health effects functional, both
      of which require as inputs only easily derived data.   Using
      these, we determine in analytic form a new health-based
      metric:  the area-integrated cumulative health effects.   We
      estimate through this metric the total health burden  experi-
      enced by the population during  the day.  The model is required
      to predict concentrations that  do not differ from observa-
      tions to the point an unacceptable difference is seen in the
      health metric.  While the data used are application-specific,
      the method itself is general.  The assumptions made in deriving
      this rationale, while extensive, seem plausible.  A sample case
      was conducted for ozone exposure in the Denver Metropolitan
      region, with promising corroboration of the rationale in several
      key regards.  The sample case is described in detail in
      Appendix D.
   >  Control Level Uncertainty.  With this rationale we set
      performance standards to ensure that uncertainty in estimates of
      the amount of pollution control required is kept within
      acceptable bounds.  These limits may be determined in a number
      of ways, but we consider limits on uncertainty in control cost a
      promising means for doing so.  If we can assume that pollutant
      production and evolution over the modeled region can be
      approximated by some simple surrogate, such as an isopleth
      diagram for ozone, then control uncertainty limits can be
      directly and easily related to equivalent bounds on uncertainty
      in the pollutant peak, the quantity for which control strategies
      are often designed.
   >  Guaranteed Compliance.  The NAAQS are written in quite
      specific terms and must ultimately be complied with.  An
      argument can be made that to "guarantee" such compliance,
      uncertainty in model predictions must be on the "conservative"
      side.  That is, the probability must be acceptably small that a
      control strategy designed on the basis of model predictions will
      not actually achieve compliance.  We consider this rationale
      here and in Appendix D primarily for completeness.  While the
      rationale has some potential usefulness, it implies the
      introduction of a systematic bias into modeling results,
      something we would hope to avoid in a final choice of a
      performance standard.
   >  Pragmatic/Historic.  Standards for all performance measures
      cannot be derived from the rationales mentioned above,
      something we will discuss later in this chapter.  Until
      additional research expands our options by providing insight
      into other rationales, we adopt a pragmatic approach.  We
      may proceed in either of two ways.  If we are able to state
      heuristically a specific guiding principle for setting a
      standard for a particular measure, we invoke it.  Otherwise,
      we simply require the following:  In each new application
      a model should perform at least as well as the "best"
      previous performance of a model in its generic class in a
      similar application.  In addition to being pragmatic, this
      last approach is also evolutionary, requiring a continually
      expanding and updated model/application data base.
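
     The second form of the Pragmatic/Historic rationale can be sketched as a
lookup against a record of earlier evaluations.  Everything in this sketch is
hypothetical:  the record layout, the field names, the function
`meets_historic_standard`, and the numbers.  No such data base is described
in this report.

```python
def meets_historic_standard(value, history, model_class, application,
                            measure, higher_is_better=True):
    """True if `value` is at least as good as the best comparable past result.

    Returns None when no comparable record exists, in which case a
    heuristic guiding principle would have to be used instead.
    """
    past = [
        record[measure]
        for record in history
        if record["model_class"] == model_class
        and record["application"] == application
        and measure in record
    ]
    if not past:
        return None
    best = max(past) if higher_is_better else min(past)
    return value >= best if higher_is_better else value <= best


# Hypothetical record of earlier evaluations (made-up numbers).
history = [
    {"model_class": "grid", "application": "urban ozone", "r_overall": 0.71},
    {"model_class": "grid", "application": "urban ozone", "r_overall": 0.78},
    {"model_class": "gaussian", "application": "point source", "r_overall": 0.55},
]

ok = meets_historic_standard(0.80, history, "grid", "urban ozone", "r_overall")
not_ok = meets_historic_standard(0.75, history, "grid", "urban ozone", "r_overall")
unknown = meets_historic_standard(0.60, history, "trajectory", "urban ozone", "r_overall")
```

     For error-type measures, where smaller values are better, the caller
would pass `higher_is_better=False`.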

     The  four  rationales differ  in their  usefulness vis-a-vis the five
performance attributes.  Shown  in Table VI-8 are  the attributes addressable
by measures whose standards are set  by each of the rationales.  Only the
Pragmatic/Historic rationale is  of use in addressing all attributes;
the other three are of use principally in defining the  level of performance
required in predicting values at or near the  concentration  peak.  The Health
Effects and Guaranteed Compliance rationales  also may have  some application
to problems involving concentration field error.
          TABLE VI-8.  PERFORMANCE ATTRIBUTES ADDRESSABLE USING
                       PERFORMANCE STANDARD RATIONALES

Performance            Health*     Control Level*    Guaranteed     Pragmatic/
Attribute              Effects     Uncertainty       Compliance     Historic

Accuracy of the           X             X                X              X
peak prediction

Absence of                                                              X
systematic bias

Lack of gross             X                              X              X
error

Temporal                                                                X
correlation

Spatial alignment                                                       X

* These are most suited for photochemically reactive pollutants subject
  to short-term standards.

-------
     One conclusion seems clear.   Unless more comprehensive rationales are
developed in subsequent research  work, several must be used simultaneously
to completely define standards of performance.  Any one of the  four can be
used to specify allowable bounds  on model performance in predicting peak
concentrations.  Either the Health Effects or the Pragmatic/Historic
rationales can be helpful in setting standards for error measures.  Only the
latter of these two rationales is of use for addressing attributes of the
other types.

     We associate in Table VI-9 each rationale with those generic issues
for which its use is appropriate.  Several assumptions are embedded in
that table.  Among them are the following:

     >  Health effects are not of overriding concern in PSD and OSR
        issues, for reasons noted earlier.  (Even though we indicate
        such a rationale may be used in addressing other specific-
        source issues, we observe that plume "narrowness" can limit
        downwind health impact).
     >  Near-source peak concentrations are not of primary interest
        in OSR, but rather "far-field" average values.
     >  The Guaranteed Compliance rationale is of use in addressing
        questions involving PSD as long as the air quality standards
        being used are the PSD class increments.
        TABLE VI-9.   ASSOCIATION OF RATIONALES WITH GENERIC ISSUES
Issue Category
Multiple-Source
SIP/C
X
X
AQMP
X
X
PSD
X
X
Specific-Source
NSR
X
X
OSR EIS/R
X
X
LIT
X
X
   Rationale
Health Effects
Control Level
Uncertainty
Guaranteed            X        X        X       X                X
Compliance
Pragmatic/            X        X        X       X       X        X
Historic

-------
     Having outlined the rationales  we  consider  in  this  report,  it  remains
to match them with the set of performance measures  we  recommended earlier
in this chapter.   As is clear from Table VI-8, we have no alternative  but
to apply the Pragmatic/Historic rationale for those measures designed  to
test for systematic bias or to evaluate temporal behavior and spatial  align-
ment.  However, several alternatives exist for measures  dealing with peak
performance and gross error.

     We select in the following ways from among the alternatives.  Hoping to
avoid  introducing a procedural bias, we first eliminate the Guaranteed Com-
pliance rationale from further consideration.  Then, because the Health
Effects rationale is better suited for use in setting standards for peak-
accuracy measures, we choose to use it only in that way.

     Our recommended choice for use in establishing standards for peak-
accuracy measures is a composite one, combining the Health Effects  and Control
Level  Uncertainty rationales.  Were a model to overpredict the peak, a
control strategy designed based on its prediction might  be expected to abate
the health impact actually occurring.  If the model underpredicted, however,
the control strategy might be "underdesigned," with the  risk existing  that
some of the health impact might remain unabated even after control  implemen-
tation.  The penalty, in a health sense, is incurred only when the  model
underpredicts.  The Health Effects rationale then is one-sided,  helping us
set performance standards only on the "low side."

     On the other hand, the Control  Level Uncertainty rationale is  bounded
"above" and "below", that is, its use provides a tolerance interval about the
value  of the measured peak concentration.  For a model to be judged accept-
able under this criterion, its prediction of the peak concentration would
have to fall within this interval.  Model underprediction could lead to
control levels lower than required, leaving residual health risks.
Overprediction, on the other hand, could lead to abatement strategies posing little
or no  health risk but incurring control costs greater than required.

-------
     For the above reasons, we suggest that the Control  Level  Uncertainty
rationale be used to establish an upper bound (overprediction)  on  the
acceptable difference between the predicted and observed peak.   We would
choose the lower bound (underprediction) to be the interval  that is  the
minimum of that suggested by the Health Effects and Control  Level  Uncertainty
rationales.
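
     The composite rule just described can be written as a short acceptance
check on the ratio of predicted to measured peak.  This is a sketch only:
the function name `peak_ratio_acceptable`, its parameter names, and the
tolerance values used in the example are hypothetical, and each tolerance is
expressed as an allowed fractional departure from a ratio of unity.

```python
def peak_ratio_acceptable(ratio, he_under, clu_under, clu_over):
    """Check a predicted/measured peak ratio against the composite tolerance.

    he_under  - allowed fractional underprediction from the Health Effects rationale
    clu_under - allowed fractional underprediction from Control Level Uncertainty
    clu_over  - allowed fractional overprediction from Control Level Uncertainty
    """
    lower = 1.0 - min(he_under, clu_under)  # the tighter of the two low-side intervals
    upper = 1.0 + clu_over                  # the high side comes from CLU alone
    return lower <= ratio <= upper


# Hypothetical tolerances: 20 percent (HE) and 30 percent (CLU) on the low
# side, 50 percent (CLU) on the high side.
print(peak_ratio_acceptable(0.85, 0.20, 0.30, 0.50))
print(peak_ratio_acceptable(0.75, 0.20, 0.30, 0.50))
```

     The asymmetry is deliberate:  underprediction is limited by whichever
rationale is stricter, while overprediction is limited only by control cost.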

     We list our recommendations in Table VI-10, noting the possibility for
peak-accuracy measures that the recommended rationales may not be  appropriate
in all applications for all pollutants.  Whether health  effects would  be an
appropriate consideration when considering TSP, for instance,  is unclear.
The Health Effects rationale is best suited for use in urban applications
involving short-term, reactive pollutants.  In those circumstances when the
HE or CLU rationales are not suitable, we suggest the Pragmatic/Historic
rationale.
      TABLE VI-10.  RECOMMENDED RATIONALES FOR SETTING STANDARDS

   Performance
    Attribute          Recommended Rationale

Accuracy of peak       Health Effects* (lower side/underprediction)
prediction             Control Level Uncertainty* (upper side/overprediction)

Absence of             Pragmatic/Historic
systematic bias

Lack of gross          Pragmatic/Historic
error

Temporal               Pragmatic/Historic
correlation

Spatial alignment      Pragmatic/Historic

 * These may not be appropriate for all regulated pollutants in all
   applications.  When they are not, the Pragmatic/Historic rationale should
   be employed.  They are most applicable for photochemically reactive
   pollutants subject to a short-term standard (O3 and NO2, if a 1-hour
   standard is set).

-------
3.   Summary Table of Recommended Measures and Standards

     Until now, our discussion has remained general  when relating  performance
measures and standards.  Here we become specific.  In Table VI-11, we sum-
marize for each of the five problem types whose presence we are testing for
the performance measures we recommend and the standards we suggest.  Since
the actual value of the standard may vary from one application to  another
or between pollutant types, we present sample values calculated based on a
sample case.  The example is appropriate for consideration of SIP/C in the
Denver Metropolitan region and is described in a case study fashion in Section
E of this chapter.

     Where we invoke the Pragmatic/Historic rationale as justification for
selecting specific standards, we also state the specific guiding principle
we followed.  We summarize those here:

     >  When the pollutant being considered is subject to a short-
        term standard, the timing of the concentration peak may be an
        important quantity for a model to predict.  This is
        particularly true when the pollutant is also photochemically
        reactive.  We state as a guiding principle:   "For photochem-
        ically reactive pollutants,  the model  must reproduce reason-
        ably well the phasing of the peak."  For ozone an acceptable
        tolerance for peak timing might be ± 1  hour.
     >  The model should not exhibit any systematic bias at concentrations
        at or above some appropriate minimum value (possibly the NAAQS)
        greater than the maximum resulting from EPA-allowable
        calibration error.  We would consider in our calculations any
        prediction-observation pair in which either of the values
        exceeds the pollutant standard.  Error (as measured by its mean
        and standard deviation) should be indistinguishable from the
        distribution of differences resulting from the comparison of an
        EPA-acceptable monitor with an EPA reference monitor.  The EPA
        has set maximum allowable limits on the amount by which a
        monitoring technique may differ from a reference method
        (40 CFR §53.20).  An


-------
       TABLE VI-11.  SUMMARY OF RECOMMENDED PERFORMANCE MEASURES AND STANDARDS

Performance attribute:  Accuracy of the peak prediction

  Performance measure:  Ratio of the predicted station peak to the measured
      station peak (could be at different stations and times)
    Performance standard
      Type of rationale:  Health Effects (lower side) combined with Control
          Level Uncertainty (upper side)
      Guiding principle:  Limitation on uncertainty in aggregate health
          impact and pollution abatement costs†
      Sample value (Denver example):  $80 \le C_{pp}/C_{pm} \le 150$ percent

  Performance measure:  Difference in timing of occurrence of station peak
    Performance standard
      Type of rationale:  Pragmatic/Historic
      Guiding principle:  Model must reproduce reasonably well the phasing
          of the peak, say, ±1 hour
      Sample value (Denver example):  ± 1 hour

Performance attribute:  Absence of systematic bias

  Performance measure:  Average value and standard deviation of mean
      deviation about the perfect correlation line normalized by the
      average of the predicted and observed concentrations, calculated for
      all stations during those hours when either predicted or observed
      values exceed some appropriate minimum value (possibly the NAAQS),
      $(\bar{\mu}, \sigma_{\bar{\mu}})_{OVERALL}$
    Performance standard
      Type of rationale:  Pragmatic/Historic
      Guiding principle:  No or very little systematic bias at
          concentrations (predictions or observations) at or above some
          appropriate minimum value (possibly the NAAQS); the bias should
          not be worse than the maximum bias resulting from EPA-allowable
          monitor calibration error (-8 percent is a representative value
          for ozone); the standard deviation should be less than or equal
          to that of the difference distribution of an EPA-acceptable
          monitor** compared with a reference monitor (3 pphm is
          representative for ozone at the 95 percent confidence level)
      Sample value (Denver example):  No apparent bias at ozone
          concentrations above 0.06 ppm (see Table VI-12 and Figures VI-5
          and VI-6 for further details)

Performance attribute:  Lack of gross error

  Performance measure:  Average value and standard deviation of absolute
      mean deviation about the perfect correlation line normalized by the
      average of the predicted and observed concentrations, calculated for
      all stations during those hours when either predicted or observed
      values exceed some appropriate minimum value (possibly the NAAQS),
      $(|\bar{\mu}|, \sigma_{|\bar{\mu}|})_{OVERALL}$
    Performance standard
      Type of rationale:  Pragmatic/Historic
      Guiding principle:  For concentrations at or above some appropriate
          minimum value (possibly the NAAQS), the error (as measured by the
          overall values of $|\bar{\mu}|$ and $\sigma_{|\bar{\mu}|}$) should
          be indistinguishable from the difference resulting from
          comparison of an EPA-acceptable monitor with a reference monitor
      Sample value (Denver example):  No excessive gross error (see Table
          VI-12 and Figures VI-5 and VI-6 for further details)

Performance attribute:  Temporal correlation*

  Performance measure:  Temporal correlation coefficients at each
      monitoring station for the entire modeling period and an overall
      coefficient for all stations, $r_{t_i}$ and $r_{t_{OVERALL}}$, for
      $1 \le i \le M$ monitoring stations
    Performance standard
      Type of rationale:  Pragmatic/Historic
      Guiding principle:  At a 95 percent confidence level, the temporal
          profile of predicted and observed concentrations should appear to
          be in phase (in the absence of better information, a confidence
          interval may be converted into a minimum allowable correlation
          coefficient by using an appropriate t-statistic)
      Sample value (Denver example):  For each monitoring station,
          $0.69 \le r_{t_i} \le 0.97$.  Overall, $r_{t_{OVERALL}} = 0.88$.
          In this example a value of $r \ge 0.53$ is significant at the
          95 percent confidence level.

Performance attribute:  Spatial alignment

  Performance measure:  Spatial correlation coefficients calculated for
      each modeling hour considering all monitoring stations, as well as an
      overall coefficient for the entire day, $r_{x_j}$ and
      $r_{x_{OVERALL}}$, for $1 \le j \le N$ modeling hours
    Performance standard
      Type of rationale:  Pragmatic/Historic
      Guiding principle:  At a 95 percent confidence level, the spatial
          distribution of predicted and observed concentrations should
          appear to be correlated
      Sample value (Denver example):  For each hour,
          $-0.43 \le r_{x_j} \le 0.66$.  Overall, $r_{x_{OVERALL}} = 0.17$.
          In this example a value of $r \ge 0.71$ is significant at the
          95 percent confidence level.

 *  These measures are appropriate when the chosen model is used to consider
    questions involving photochemically reactive pollutants subject to
    short-term standards.
 †  These may not be appropriate for all regulated pollutants in all
    applications.  When they are not, the Pragmatic/Historic rationale
    should be employed.
 ** The EPA has set maximum allowable limits on the amount by which a
    monitoring technique may differ from a reference method.  An
    "EPA-acceptable monitor" is defined here to be one that differs from a
    reference monitor by up to the maximum allowable amount.


-------
        "EPA-acceptable monitor" is defined here to be one that
        differs from a reference monitor by up to the maximum
        allowable amount.
     >  Prediction and observation should appear to be correlated
        at a 95 percent confidence level, both when compared
        temporally and spatially.  We can estimate the minimum
        allowable value for the respective correlation coefficient
        by using a t-statistic at the appropriate percentage level
        and having the degrees of freedom required by the number of
        prediction-observation pairs.

     The guiding principles  noted above are  plausible ones, though  in  some
cases they are arbitrary.  As a "verification data  base" of experience is
assembled, historically achieved performance levels may  be better indicators
of the expected level of model performance.   Standards derived  on this more
pragmatic basis may supplant those deriving  from the  "guiding principles"
followed in this report.

4.   Formulas for Calculating Performance Measures and Standards

     A number of performance measures are recommended in Table  VI-6.   Here
we state explicitly the equations used for their calculation and the forms
assumed by the standards. We include, where  appropriate, brief theoretical
justifications for these relationships.

     The definitions are self-explanatory  for measures testing  the  accuracy
of the peak model  prediction.  Specifically,

$$ \alpha \le \frac{C_{pp}}{C_{pm}} \le \beta , \tag{VI-1} $$

where $C_{pp}$ is the peak station prediction, $C_{pm}$ is the peak station
measurement, $\alpha$ is the lower bound on the ratio of the peaks, and
$\beta$ is the upper bound.
The bounds may be determined either from Pragmatic/Historic considerations

-------
or, where possible, by means of the Health Effects/Control  Level  Uncertainty
rationales described in Appendix D.  The latter of these two approaches may
prove feasible only when considering photochemically reactive pollutants
(particularly ozone) subject to a short-term standard.  Also, for such
reactive species,

$$ |\Delta t_p| \le \delta , \tag{VI-2} $$

where $|\Delta t_p|$ is the absolute value of the difference between the
predicted and observed times of the station peak, and $\delta$ is the maximum
allowable difference, say, one hour (this is an arbitrarily set value).
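
The two peak tests can be sketched in a few lines of Python.  This is a minimal illustration, not the report's procedure: the function name `check_peak` and the input values are hypothetical, and the default bounds simply mirror the Denver sample values (80 to 150 percent, one hour).

```python
def check_peak(c_pp, c_pm, t_pred, t_obs, alpha=0.80, beta=1.50, delta=1.0):
    """Apply the peak-ratio test (VI-1) and the peak-timing test (VI-2).

    c_pp, c_pm    : predicted and measured station peaks (same units)
    t_pred, t_obs : hours at which the predicted and observed peaks occur
    alpha, beta   : lower and upper bounds on the peak ratio (assumed defaults)
    delta         : maximum allowable timing difference in hours (assumed default)
    """
    ratio = c_pp / c_pm
    ratio_ok = alpha <= ratio <= beta
    timing_ok = abs(t_pred - t_obs) <= delta
    return ratio, ratio_ok, timing_ok


# Hypothetical peaks in pphm, chosen only for illustration (not Denver data).
ratio, ratio_ok, timing_ok = check_peak(c_pp=9.9, c_pm=10.0, t_pred=14, t_obs=13)
print(f"ratio = {ratio:.0%}, ratio ok: {ratio_ok}, timing ok: {timing_ok}")
```

Because the two peaks may occur at different stations, the inputs are the station-network maxima rather than values paired at one site.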

     Underlying our definitions of bias and error is the following assump-
tion:  A priori, we have no reason to prefer either prediction or observa-
tion as a better measure of reality.  Both, in fact, can be subject to  sig-
nificant uncertainty.  It follows from this assumption that residuals (pre-
dicted concentrations minus observed ones) should be taken perpendicularly
about the perfect correlation line.

     We emphasize an important point:  The residual for a given prediction-
observation pair is not the geometric distance from the perfect correlation
line, as displayed in a correlogram (such as the one shown later in
Figure VI-3).  Rather, the geometric distance must be scaled downward by a
factor of $\sqrt{2}$.  That this is so follows from the discussion presented below.
It is based on our requirement that prediction and observation differ by no
more than the maximum amount by which an EPA-acceptable monitoring technique
may differ from the accepted reference technique.

     Uncertainty in monitoring results can be introduced from many sources.
Three principal source categories are the calibration method, the agreement
with the reference monitoring technique, and the actual instrument error.
The last of these categories includes instrument noise and precision, mea-
surement drift, and  interference from other contaminants.  In defining the
characteristics of  the EPA-acceptable monitor we wish to use as a standard,


-------
we have chosen to include only the first two error source categories.   We
thus eliminate the need to consider performance characteristics  of specific
monitoring instruments.  Also, in comparing a monitor with an instrument
using the EPA-accepted reference monitoring technique, it is not unreason-
able to assume that both are subject to the same instrument error.

     We may define an acceptance standard for a model insofar as error
and bias are concerned:  The distribution of differences between prediction
and observation must be indistinguishable from that resulting from the com-
parison of an EPA-acceptable monitor with the accepted reference monitor.
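Specifically, we define "indistinguishable" to mean

$$ |\bar{\mu}| \le \xi , \tag{VI-3} $$

$$ \sigma_{\bar{\mu}} \le \epsilon , \tag{VI-4} $$

where $\xi$ and $\epsilon$ can be determined from federal regulations
(40 CFR §53.20) for instrument performance, and $\bar{\mu}$ and
$\sigma_{\bar{\mu}}$ are defined below.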

     We may confirm a model's acceptability by hypothesizing that the
acceptance standard for bias and error is satisfied and checking to deter-
mine whether this hypothesis is violated.  Consistent with this approach,
we may assume that each prediction and observation pair are random samples
drawn from the same distribution, the one that describes the behavior of
an EPA-acceptable monitor with respect to a reference monitor.  The standard
deviation (S.D.) of a random variable whose value is the difference of
two other independent random variables having the same S.D. $\sigma$ may be
expressed as

$$ \sigma_{\mathrm{diff}} = \sqrt{\sigma^2 + \sigma^2} = \sqrt{2}\,\sigma . \tag{VI-5} $$

     The geometric distance from the perfect correlation line, $d_i$, may
be written as

$$ d_i = \frac{P_i - M_i}{\sqrt{2}} , \tag{VI-6} $$

-------
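where $P_i$ and $M_i$ are the i-th prediction-observation pair.  We are
searching for a test variable $\sigma_\mu$ to compare with $\sigma$.
Therefore, referring to Equation VI-5, we see that we must divide $d_i$ by
$\sqrt{2}$ to obtain the properly scaled mean deviation from the perfect
correlation line, $d_i^*$, that is,

$$ d_i^* = \frac{d_i}{\sqrt{2}} = \frac{P_i - M_i}{2} . \tag{VI-7} $$

Thus, the average and standard deviation of the mean deviation may be
expressed as

$$ \mu = \frac{1}{n}\sum_{i=1}^{n} d_i^* , \tag{VI-8} $$

$$ \sigma_\mu = \left[\frac{1}{n}\sum_{i=1}^{n}\left(d_i^* - \mu\right)^2\right]^{1/2} , \tag{VI-9} $$

where $n$ is the number of prediction-observation pairs.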
     These quantities may be compared with those characterizing the distri-
bution of differences between an EPA-acceptable monitor and a reference
instrument.  Those values may be derived from 40 CFR §53.20.  As an example,
(see Burton, et al., 1976) an EPA-acceptable monitor for ozone/oxidants
could have a -8 percent bias and a 95 percent confidence interval of
±3 pphm (a $\sigma$ of 1.53 pphm).  If an EPA-acceptable monitor were defined
to be subject to instrument error as well, the -8 percent bias would remain
because it is assumed due to calibration, but the 95 percent confidence
interval would increase to ±7 pphm (a $\sigma$ of 3.57 pphm).
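
A short Python sketch shows how the non-normalized bias statistics and the monitor reference values might be computed.  It takes the scaled residual to be half the prediction-observation difference (the geometric distance from the perfect correlation line divided by the square root of 2).  The function names and the input data are hypothetical, the 1/n divisor in the standard deviation is an assumption, and the conversion from a 95 percent interval to a standard deviation assumes normally distributed differences (z = 1.96).

```python
import math


def scaled_residuals(pred, obs):
    """d* = (P - M) / 2 for each prediction-observation pair."""
    return [(p - m) / 2.0 for p, m in zip(pred, obs)]


def mean_and_sd(values):
    """Mean and standard deviation (1/n form; the divisor is an assumption)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, sd


def sigma_from_interval(half_width, z=1.96):
    """S.D. implied by a symmetric 95 percent interval, assuming normal errors."""
    return half_width / z


# Hypothetical predictions and observations in pphm (illustration only).
pred = [10.0, 12.0, 9.0, 7.0]
obs = [8.0, 12.0, 11.0, 9.0]
mu, sigma_mu = mean_and_sd(scaled_residuals(pred, obs))

print(f"mu = {mu:.2f} pphm, sigma_mu = {sigma_mu:.2f} pphm")
print(f"monitor sigma, +/-3 pphm interval: {sigma_from_interval(3.0):.2f} pphm")
print(f"monitor sigma, +/-7 pphm interval: {sigma_from_interval(7.0):.2f} pphm")
```

The last two lines reproduce the 1.53 and 3.57 pphm values quoted for an EPA-acceptable monitor without and with instrument error.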

     We noted earlier that the "seriousness" of the magnitude of a given
residual depends on the ambient concentration of the pollutant being con-
sidered.  For instance, a value for d* of 2 pphm might be considered of
less importance when ambient concentrations are on the order of 30 pphm
than when they are 10 pphm.  In consideration of this effect, we suggest

-------
normalizing residuals by the arithmetic average of the predicted and
observed concentrations for a given pair.  This is consistent with our
earlier statement that, a priori, we have no reason to prefer observation
over prediction as an inherently better indicator of reality.

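     Defining the average concentration to be

$$ C_{AVE,i} = \frac{P_i + M_i}{2} , \tag{VI-10} $$

we may write expressions for the normalized average and standard deviation
of the mean deviation about the perfect correlation line, with the sums taken
over the $n$ prediction-observation pairs considered:

$$ \bar{\mu} = \frac{1}{n}\sum_{i=1}^{n} \frac{d_i^*}{C_{AVE,i}} , \tag{VI-11} $$

$$ \sigma_{\bar{\mu}} = \left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{d_i^*}{C_{AVE,i}} - \bar{\mu}\right)^2\right]^{1/2} . \tag{VI-12} $$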
     A deliberate redundancy has been built into the list of suggested
performance measures.  Both $\sigma_\mu$ and $\sigma_{\bar{\mu}}$ are measures
of "scatter" about the perfect correlation line.  Thus, they are also
indicators of gross error and may be used in conjunction with those measures
explicitly listed in Table VI-6 for use in investigating gross error.  These
measures consider absolute rather than signed residuals.  Specifically, the
normalized average value and standard deviation of the absolute deviation
about the perfect correlation line may be written

$$ |\bar{\mu}| = \frac{1}{n}\sum_{i=1}^{n} \frac{|d_i^*|}{C_{AVE,i}} , \tag{VI-13} $$

-------
$$ \sigma_{|\bar{\mu}|} = \left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{|d_i^*|}{C_{AVE,i}} - |\bar{\mu}|\right)^2\right]^{1/2} . \tag{VI-14} $$

Their values may be compared with standards such that

$$ |\bar{\mu}| \le \lambda , \tag{VI-15} $$

$$ \sigma_{|\bar{\mu}|} \le \gamma , \tag{VI-16} $$

where the values of $\lambda$ and $\gamma$ may be derived from instrument
performance specifications in federal regulations.
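
The normalized signed and absolute measures can be sketched together in Python.  This is an illustration under stated assumptions: the function name `normalized_stats` and the data are hypothetical, the standard deviations use a 1/n divisor, and a pair is kept when either its prediction or its observation is at or above the chosen minimum concentration.

```python
import math


def normalized_stats(pred, obs, min_conc=0.0):
    """Return (mean, sd, abs_mean, abs_sd) of the normalized residuals d*/C_AVE.

    Only pairs in which the prediction or the observation is at or above
    min_conc are used.  Standard deviations use the 1/n form (an assumption).
    Raises ZeroDivisionError if no pair qualifies.
    """
    signed = []
    for p, m in zip(pred, obs):
        if max(p, m) >= min_conc:
            d_star = (p - m) / 2.0   # scaled residual
            c_ave = (p + m) / 2.0    # average concentration for the pair
            signed.append(d_star / c_ave)

    def mean_sd(xs):
        mean = sum(xs) / len(xs)
        return mean, math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))

    mean, sd = mean_sd(signed)
    abs_mean, abs_sd = mean_sd([abs(x) for x in signed])
    return mean, sd, abs_mean, abs_sd


# Hypothetical hourly values in pphm; 8.0 pphm is the NAAQS level quoted
# in the Denver example.  The last pair falls below it and is excluded.
pred = [12.0, 8.0, 6.0, 3.0]
obs = [8.0, 8.0, 10.0, 1.0]
mean, sd, abs_mean, abs_sd = normalized_stats(pred, obs, min_conc=8.0)
print(f"signed:   mean = {mean:.1%}, sd = {sd:.1%}")
print(f"absolute: mean = {abs_mean:.1%}, sd = {abs_sd:.1%}")
```

Signed residuals of opposite sign cancel in the mean, so the absolute mean is never smaller in magnitude than the signed mean; this is why the two sets of measures are listed separately.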

     It may be helpful to visualize the definitions of $d_i^*$ and $C_{AVE}$
geometrically on a correlogram.  Figure VI-1 is a schematic, showing the
orientation of the $d^*$-$C_{AVE}$ axes with respect to the P-M axes of the
correlogram.  The $C_{AVE}$ axis is aligned with the perfect correlation line,
and both the $d^*$ and $C_{AVE}$ axes are scaled downward by a factor of
$\sqrt{2}$ from the P and M axes.
           FIGURE VI-1.
                  ORIENTATION  AND SCALING  OF  CAVE AND d* AXES
                  ON A PREDICTION-OBSERVATION  CORRELOGRAM


-------
     Finally,  we consider measures suitable for use in testing  for tem-
poral correlation and spatial alignment.  The former of these is of con-
cern when the  chosen model is used to consider questions involving photo-
chemically reactive pollutants subject to a short-term standard.  We sug-
gest the use of temporal correlation coefficients, whose values are
defined to be
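$$ r_{t_i} = \frac{1}{N}\sum_{j=1}^{N}\left(\frac{P_{ij} - \mu_{P_i}}{\sigma_{P_i}}\right)\left(\frac{M_{ij} - \mu_{M_i}}{\sigma_{M_i}}\right) , \tag{VI-17} $$

$$ r_{t_{OVERALL}} = \frac{1}{K}\sum_{i=1}^{K} r_{t_i} , $$

where $r_{t_i}$ is the temporal correlation coefficient at the i-th station
for the N divisions of the modeling period, and $r_{t_{OVERALL}}$ is the
average correlation coefficient for all the K monitoring stations.  Here
$P_{ij}$ and $M_{ij}$ are the prediction and measurement at the i-th station
in the j-th division.  Also, $\mu_{P_i}$ and $\sigma_{P_i}$ are the mean and
standard deviation of the predictions for N hours at the i-th station.
Similarly, $\mu_{M_i}$ and $\sigma_{M_i}$ are the mean and standard deviation
of the measured concentrations at the i-th station.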

     In testing for spatial alignment, we recommend  using the  following
spatial correlation coefficients:
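$$ r_{x_j} = \frac{1}{K}\sum_{i=1}^{K}\left(\frac{P_{ij} - \mu_{P_j}}{\sigma_{P_j}}\right)\left(\frac{M_{ij} - \mu_{M_j}}{\sigma_{M_j}}\right) , $$

$$ r_{x_{OVERALL}} = \frac{1}{N}\sum_{j=1}^{N} r_{x_j} , $$

-------
where $r_{x_j}$ is the spatial correlation coefficient at the j-th hour for
the K monitoring stations, and $r_{x_{OVERALL}}$ is the average correlation
coefficient for all the N modeling period divisions (e.g., hours).  Also,
$\mu_{P_j}$ and $\sigma_{P_j}$ are the mean and standard deviation of the
predictions for K stations at the j-th hour.  Similarly, $\mu_{M_j}$ and
$\sigma_{M_j}$ are the mean and standard deviation of the measured
concentrations at the j-th hour.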
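
As a sketch of how the temporal and spatial coefficients might be computed from a station-by-hour array, the following Python uses the ordinary product-moment (Pearson) coefficient and averages the individual coefficients to obtain the overall values.  The function names and the data are hypothetical.  A station or hour with constant values has zero standard deviation and would need separate handling.

```python
import math


def pearson(x, y):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return cov / (sd_x * sd_y)  # fails if either sequence is constant


def correlation_measures(pred, obs):
    """pred[i][j] and obs[i][j] hold station i, hour j.

    Returns (r_t, r_t_overall, r_x, r_x_overall): one temporal coefficient
    per station, one spatial coefficient per hour, and their averages.
    """
    n_stations, n_hours = len(pred), len(pred[0])
    r_t = [pearson(pred[i], obs[i]) for i in range(n_stations)]
    r_x = [
        pearson([pred[i][j] for i in range(n_stations)],
                [obs[i][j] for i in range(n_stations)])
        for j in range(n_hours)
    ]
    return r_t, sum(r_t) / n_stations, r_x, sum(r_x) / n_hours


# Hypothetical concentrations in pphm: 3 stations by 3 hours.
pred = [[2.0, 5.0, 9.0], [3.0, 6.0, 8.0], [1.0, 4.0, 10.0]]
obs = [[3.0, 6.0, 8.0], [2.0, 7.0, 9.0], [2.0, 3.0, 12.0]]
r_t, r_t_overall, r_x, r_x_overall = correlation_measures(pred, obs)
print("temporal:", [round(r, 2) for r in r_t], "overall", round(r_t_overall, 2))
print("spatial: ", [round(r, 2) for r in r_x], "overall", round(r_x_overall, 2))
```

In this made-up data the temporal coefficients are high while the first spatial coefficient is zero, which mirrors the general pattern of the Denver sample values: a model can track the diurnal cycle well at each station and still place the spatial pattern poorly.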

     As for the form of the standard, we would require that

$$ r \ge r_{min} , $$

where $r$ is any of the correlation coefficients defined above and $r_{min}$
is defined at the 95 percent confidence level, perhaps using a t-statistic
if no better method is apparent.
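
One common way to turn a t-statistic into a minimum correlation coefficient is to solve t = r sqrt(n - 2) / sqrt(1 - r^2) for r.  The Python sketch below does this with critical values taken from a standard t-table.  The function name is hypothetical, and the choice of a two-sided 5 percent test with 8 and 14 pairs is an assumption, made because it reproduces the thresholds quoted in the Denver sample values (0.71 and 0.53).

```python
import math


def min_significant_r(t_crit, n_pairs):
    """Smallest correlation coefficient that is significant.

    t_crit  : critical t value for n_pairs - 2 degrees of freedom
    n_pairs : number of prediction-observation pairs
    """
    dof = n_pairs - 2
    return t_crit / math.sqrt(t_crit ** 2 + dof)


# Two-sided 5 percent critical values from a standard t-table.
T_CRIT_6_DOF = 2.447    # 8 pairs, e.g. 8 stations in one hour
T_CRIT_12_DOF = 2.179   # 14 pairs, an assumed 14 modeled hours at one station

r_min_spatial = min_significant_r(T_CRIT_6_DOF, 8)
r_min_temporal = min_significant_r(T_CRIT_12_DOF, 14)
print(f"spatial  r_min = {r_min_spatial:.2f}")
print(f"temporal r_min = {r_min_temporal:.2f}")
```

The threshold falls as the number of pairs grows, which is why a temporal coefficient based on many hours can be significant at a lower value than a spatial coefficient based on a handful of stations.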

E.   A SAMPLE CASE:  THE SAI DENVER EXPERIENCE

     In Section D we recommended a set of measures and standards  for use  in
evaluating model performance.  Here we illustrate how these measures might
actually be used in practice.  To do so, we draw on SAI experience in
modeling the Denver metropolitan region (Anderson et al., 1977) using the
grid-based SAI Airshed Model (Ames et al., 1978).  We first show for the sample
case the values we calculate for the performance measures; then we discuss
how to interpret their meaning.

1.   The Denver Modeling Problem

     Over the past several years, Region VIII of the EPA has prepared an
Overview EIS assessing the  impact on the Denver metropolitan region of the
proposed construction of twenty-two separate wastewater treatment projects.
Adopting a regional approach,  they assessed the projected  impact of the
facilities in several key ways, among which was their effect on air quality.
They contracted with SAI in late 1976 to conduct that portion of the
assessment.  SAI employed several air quality models, one a long-term
climatological model (CDM) and the other a short-term photochemical model
(the SAI Airshed Model).  We consider the latter of these in our sample
case.

-------
     The grid-based Airshed Model is fully three-dimensional  and capable
of simulating concentrations of up to 13 chemical species,  including ozone,
nitrogen dioxide and several types of reactive hydrocarbons.   The modeling
grid chosen for overlaying the Denver Metropolitan region was 30 miles by
30 miles, subdivided horizontally into grid cells two miles on a side.

     In cooperation with local agencies, SAI assembled meteorological
information (spatial and temporal profiles of temperature and inversion
height, as well as wind speeds and directions) characterizing atmospheric
conditions on several summertime test days, 29 July 1975, 28  July 1976,
and 3 August 1976.  Also, gridded emissions inventories were  compiled
(hourly by species) for those days as were estimates for the  years 1985
and 2000.  Simulations were then conducted, with projections  also made
of air quality in the two subsequent years.

2.   Values of the Performance Measures

     We compare in this sample case the predicted and observed concentrations
of ozone at each monitoring station in the regional measurement network.
The issues we address are SIP/C and AQMP.  On the test date we have chosen,
28 July 1976, eight monitoring stations provided ozone concentration data.
Their locations are shown in Figure VI-2.  Of the nine stations, all but
CAMP provided usable ozone measurements.  Data were recorded as hourly
averages for each hour throughout the day.

     The Airshed Model  generates its predictions as grid cell-averaged
hourly concentrations.   Through interpolation,  these values may then be
used to estimate station predictions (concentrations at fixed points
rather than grid cell averages).   Plotted in Figure VI-3 are  the predicted
and observed ozone concentrations at each of the eight stations reporting
on the modeled day (Anderson,  et al.,  1977).  From the station predictions
and observations,  we can calculate performance measure values.  We present
the values of these measures in Table VI-12.  We indicate in  the table
how these values might  be interpreted in evaluating model performance,
considering each in more detail  below.
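
The interpolation scheme used to convert grid cell averages into station predictions is not specified here.  As one possibility only, the Python sketch below applies bilinear interpolation among the four nearest cell centers, treating each cell average as the value at the cell center.  The function name, the grid layout (`grid[row][col]`, rows running south to north), and the example values are all hypothetical; the 2-mile cell size follows the grid description given earlier.

```python
def station_prediction(grid, x_miles, y_miles, cell=2.0):
    """Bilinear interpolation of cell-averaged values to a station location.

    grid    : grid[row][col] of cell averages, at least 2 by 2 (assumed layout)
    x_miles : station distance east of the grid's west edge
    y_miles : station distance north of the grid's south edge
    cell    : cell width in miles
    """
    n_rows, n_cols = len(grid), len(grid[0])
    # Position in units of cells, measured from the first cell center.
    fx = min(max(x_miles / cell - 0.5, 0.0), n_cols - 1.0)
    fy = min(max(y_miles / cell - 0.5, 0.0), n_rows - 1.0)
    col = min(int(fx), n_cols - 2)
    row = min(int(fy), n_rows - 2)
    tx, ty = fx - col, fy - row
    return ((1 - tx) * (1 - ty) * grid[row][col]
            + tx * (1 - ty) * grid[row][col + 1]
            + (1 - tx) * ty * grid[row + 1][col]
            + tx * ty * grid[row + 1][col + 1])


# Hypothetical 2 by 2 patch of cell-averaged ozone (pphm).
patch = [[4.0, 6.0],
         [8.0, 10.0]]
print(station_prediction(patch, 2.0, 2.0))  # station at the shared corner
print(station_prediction(patch, 1.0, 1.0))  # station at a cell center
```

A station at a cell center simply takes that cell's value, while one at the corner shared by four cells takes their mean; stations outside the ring of cell centers are clamped to the nearest edge value.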

-------
                                     KEY

      NG -  Northglenn                    NJ -  National Jewish Hospital
      WE -  Welby                         GM -  Green Mountain
      AR -  Arvada                        OV -  Overland
      CR -  C.A.R.I.H.                    PR -  Parker Road
      CM -  Continuous Air Monitoring
            Program [CAMP]

FIGURE  VI-2.  LOCATIONS OF MONITORING STATIONS  IN THE DENVER METROPOLITAN REGION


-------
FIGURE VI-3.  PREDICTED AND OBSERVED OZONE CONCENTRATIONS
              AT EACH MONITORING  STATION DURING THE  DAY
              (DENVER, 28 JULY  1976)

              [One panel per station (e.g., Parker Road).  Horizontal axis:
              time of day, by hourly interval.  Curves: observed and predicted.]


-------
           TABLE VI-12.  SAMPLE VALUES FOR MODEL PERFORMANCE STANDARDS (DENVER EXAMPLE)

Performance attribute:  Accuracy of the peak prediction
Composite importance category*:  1

  Performance measure:  Ratio of predicted to measured station peaks,
      $C_{pp}/C_{pm}$
    Performance standard:  $80 \le C_{pp}/C_{pm} \le 150$ percent
    Calculated value:  99 percent
    Interpretation:  Peak performance of the model is satisfactory.

  Performance measure:  Timing of the peak,† $\Delta t_p$
    Performance standard:  ± 1 hour
    Calculated value:  + 1 hour
    Interpretation:  The timing of the peak is satisfactory.  Since the
        model provides only hourly averages, this is as finely as
        $\Delta t_p$ can be determined.

Performance attribute:  Absence of systematic bias

  Performance measure:  Average value and standard deviation of the mean
      deviation about the perfect correlation line, normalized by the
      average of the predicted and observed concentrations
    Performance standard:  For concentrations (predicted or observed) at or
        above the NAAQS, the bias should not be greater than the maximum
        bias resulting from EPA-allowable monitor calibration error.  A
        -8 percent bias--not normalized--is representative, which for this
        case is

            $\mu = -0.4$ pphm
            $\sigma_\mu = 1.53$ pphm

        for an EPA-acceptable monitor§--see Burton, et al. (1976)--when all
        concentrations are considered.  An EPA-acceptable monitor can have
        an uncertainty with respect to a reference monitor of as much as
        ± 3 pphm for ozone at a 95 percent confidence level.
    Calculated value:  For concentrations greater than the NAAQS (8.0 pphm),

            $\bar{\mu} = 4.1\%$
            $\sigma_{\bar{\mu}} = 19.4\%$

        For all concentrations,

            $\bar{\mu} = -23.4\%$
            $\sigma_{\bar{\mu}} = 33.5\%$

        In a form suitable for comparison with non-normalized instrument
        bias,

            $\mu = -0.52$ pphm
            $\sigma_\mu = 1.22$ pphm

        when all concentrations are considered.
    Interpretation:  For concentrations at or above the NAAQS, a slight
        positive bias exists, though within acceptable bounds.  When all
        concentrations are considered, a larger negative bias seems to
        exist.  Put in a form suitable for comparison with an EPA-allowable
        monitor,§ however, the bias appears to be indistinguishable from
        that resulting from maximum allowable calibration error.  Overall,
        no conclusion of unacceptably high bias would seem justified.

-------
                                                       TABLE VI-12 (Concluded)

Columns: Performance Attribute; Composite Importance Category*;
Performance Measure†; Performance Standard; Calculated Value;
Interpretation

Performance Attribute:  Lack of gross error

Performance Measure:
Average value and standard deviation of the absolute mean deviation
about the perfect correlation line, normalized by the average of the
predicted and observed concentrations

Performance Standard:
For concentrations at or above the NAAQS, the error should be
indistinguishable from the distribution of error resulting from
comparison of an EPA-acceptable monitor§ with a reference monitor.
Representative values for an EPA-acceptable monitor (-8 percent bias;
± 3 pphm at a 95 percent confidence level) might be estimated to be

       |μ|   = 1.22 pphm
       σ_|μ| = 0.95 pphm

Note that these values are based on non-normalized deviations.

Calculated Value:
For concentrations greater than the NAAQS (8.0 pphm),

       |μ|   = 15.7%
       σ_|μ| = 19.4%

For all concentrations,

       |μ|   = 31.5%
       σ_|μ| = 33.5%

In a form suitable for comparison with non-normalized instrument error,

       |μ|   = 1.12 pphm
       σ_|μ| = 0.72 pphm

Interpretation:
For concentrations at or above the NAAQS, the error seems to be about
half of what is seen if all concentrations are considered.  The model
thus appears to be subject to less error at the higher concentration
range.  We can determine the acceptability of this error level by
converting to a non-normalized form for comparison with an estimate of
that resulting from use of an EPA-acceptable monitor.§  Even when all
concentrations are considered, the error in model predictions appears to
be less than that resulting from monitoring technique differences.  We
conclude that the model performance is acceptably good insofar as error
is concerned.

Performance Attribute:  Temporal correlation

Performance Measure:
Temporal correlation coefficients at each monitoring station and an
overall coefficient (the all-station average)

       r_ti, r_t,OVERALL

for 1 ≤ i ≤ M monitoring stations

Performance Standard:
At a 95 percent confidence level, predicted and observed concentrations
should appear to be correlated.  Using a t-statistic to estimate the
minimum acceptable correlation coefficient, in this example we find

       r_t,min = 0.53

Calculated Value:
For each monitoring station,

       [illegible] ≤ r_ti ≤ [illegible]

Overall,

       r_t,OVERALL = 0.88

Interpretation:
For all stations and overall, predicted and observed concentrations
appear to be correlated.  The model performance appears to be within
acceptable bounds.

Performance Attribute:  Spatial alignment

Performance Measure:
Spatial correlation coefficients for each modeling hour and an overall
coefficient for the entire day (the all-hours average)

       r_xj, r_x,OVERALL

for 1 ≤ j ≤ N modeling hours

Performance Standard:
At a 95 percent confidence level, predicted and observed concentrations
should appear to be correlated.  Using a t-statistic to estimate the
minimum acceptable correlation coefficient, in this example we find

       r_x,min = 0.71

Calculated Value:
For each modeling hour,

       -0.44 ≤ r_xj ≤ [illegible]

Overall,

       r_x,OVERALL = 0.17

Interpretation:
During none of the hours considered (all daylight hours) do prediction
and observation appear to be correlated at the 95 percent confidence
level.  Model predictions appear to be spatially misaligned, although
the presence of temporal correlation suggests that the misalignment may
not be a serious problem.  (Another interpretation may be correct:
either r_x is too stringent a measure of spatial alignment or r_t is too
lenient a measure of temporal behavior.  Only by additional research,
however, will we be able to confirm or refute this.)

 * The composite importance category is determined by consulting Tables VI-2 and VI-3 for the appropriate issue and pollutant/averaging time (in this
   example, SIP/C and ozone/one-hour averaging time).  The composite category is the less stringent of the two importance rankings.

 † These measures are appropriate when the chosen model is used to consider questions involving photochemically reactive pollutants
   subject to short-term standards.

 § An "EPA-acceptable monitor" is defined here to be one that differs from a monitor using the EPA reference technique by up to the
   maximum allowable amount.
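
The composite-category rule in the first footnote can be sketched in a few lines. This is only an illustration: the rankings are copied from the SIP/C column of Table VI-13 and the ozone (1 hour) column of Table VI-14, and the function and dictionary names are hypothetical, not part of the report. Because a larger category number is less stringent, "the less stringent of the two" is taken to be the maximum.

```python
# Importance rankings copied from Tables VI-13 (SIP/C) and VI-14 (ozone, 1 hour).
# The names below are illustrative only.
SIP_C = {
    "Accuracy of the peak prediction": 1,
    "Absence of systematic bias": 1,
    "Lack of gross error": 2,
    "Temporal correlation": 2,
    "Spatial alignment": 2,
}
OZONE_1_HOUR = {attribute: 1 for attribute in SIP_C}


def composite_category(issue_rank, pollutant_rank):
    """Return the less stringent (numerically larger) of two rankings."""
    return max(issue_rank, pollutant_rank)


composite = {
    attribute: composite_category(SIP_C[attribute], OZONE_1_HOUR[attribute])
    for attribute in SIP_C
}

for attribute, category in composite.items():
    print(f"{attribute}: Category {category}")
```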

-------
3.   Interpreting the Performance Measure Values

     Briefly, we summarize the conclusions suggested by the model  perfor-
mance measures.  First, even though the predicted and observed concentra-
tion peaks occur at different monitoring stations and times (North Glenn
at 2-3 p.m. versus Welby at 1-2 p.m.), their values agree quite closely,
well within the acceptable tolerance.
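
As a rough sketch of the two peak measures (the ratio of predicted to measured station peak, and the difference in timing), one could proceed as follows. The data layout (a dictionary of hourly series per station), the function names, and the numbers in the usage lines are assumptions made for illustration; they are not the Denver data.

```python
def station_peak(field):
    """Return (value, hour_index, station) of the largest hourly value.

    `field` maps a station name to a list of hourly concentrations (pphm).
    """
    return max(
        (value, hour, station)
        for station, series in field.items()
        for hour, value in enumerate(series)
    )


def peak_accuracy(predicted, observed):
    """Return (peak ratio, timing difference in hours).

    The peaks may occur at different stations, as in the sample case.
    """
    p_value, p_hour, _ = station_peak(predicted)
    o_value, o_hour, _ = station_peak(observed)
    return p_value / o_value, p_hour - o_hour


# Made-up numbers, for illustration only.
predicted = {"A": [1.0, 5.0, 3.0], "B": [2.0, 4.0, 9.0]}
observed = {"A": [1.0, 8.0, 3.0], "B": [2.0, 4.0, 6.0]}
ratio, lag = peak_accuracy(predicted, observed)
print(ratio, lag)
```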

     Second, systematic bias appears to remain within acceptable limits.
We can demonstrate this graphically, first by plotting prediction-
observation pairs in a correlogram (see Figure VI-4) and then plotting the
normalized mean  deviation  about  the  perfect correlation line as is done
in  Figure  VI-5.   From  this latter figure  (suggested  by Anderson, et al.,
1977) we  see that the  Airshed  Model, while systematically  underpredicting
at  concentration levels below  4.5 pphm,  does not appear subject to such
bias at concentrations above that level.   Incidentally, recent internal
studies at SAI have  indicated  that the Denver  region may be subject to
background concentrations  as high as 4 pphm  (Anderson,  1978),  values
substantially  higher than  those supplied as  input to the Airshed  Model.
Also, we  may compare the  deviations  about the  perfect correlation line
 to those  that  we would expect from comparison  of an EPA-acceptable
 monitor with a monitor using the EPA reference technique  (normally
 distributed, -8 percent bias,  ± 3 pphm at the 95 percent confidence  level —
 see Burton, et al.,  1976).  This comparison is shown in Figure VI-6.   To
 aid in presenting this graphical comparison, we have converted deviations
 to the non-normalized form.  We observe that the means (a measure of
 systematic bias) of both are nearly the same and that the standard deviation of
 prediction-observation deviations is somewhat less than that of the
 monitoring error distribution.
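
A minimal sketch of the bias and error measures discussed above follows. It assumes that the "deviation about the perfect correlation line" is simply predicted minus observed (a perpendicular distance would differ by a constant factor) and that each deviation is normalized by the average of the predicted and observed values. The function names and the optional threshold argument are illustrative.

```python
from statistics import mean, pstdev


def normalized_deviations(predicted, observed, threshold=None):
    """Deviation (P - O) divided by (P + O) / 2 for each pair.

    If `threshold` is given, keep only pairs in which either value
    is at or above it (for example, the NAAQS).
    """
    return [
        (p - o) / ((p + o) / 2.0)
        for p, o in zip(predicted, observed)
        if threshold is None or max(p, o) >= threshold
    ]


def bias_and_error(predicted, observed, threshold=None):
    """Return mean and standard deviation of signed and absolute deviations."""
    devs = normalized_deviations(predicted, observed, threshold)
    abs_devs = [abs(d) for d in devs]
    return {
        "bias_mean": mean(devs),
        "bias_std": pstdev(devs),
        "error_mean": mean(abs_devs),
        "error_std": pstdev(abs_devs),
    }


# Made-up pairs (pphm), for illustration only.
stats_all = bias_and_error([4.0, 10.0], [6.0, 10.0])
stats_high = bias_and_error([4.0, 10.0], [6.0, 10.0], threshold=8.0)
print(stats_all)
print(stats_high)
```

A negative `bias_mean` corresponds to underprediction under the sign convention assumed here.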

      Third, consistent with our  conclusions about systematic bias, gross
  error also appears to be  within  tolerable bounds.  We show in Figure VI-7
  the distribution of non-normalized  error, that  is,  the absolute deviation
  of predictions  and observations  from  the perfect correlation line.  For
  reference we also estimate the corresponding distribution resulting from

-------
[Figure: correlogram.  Horizontal axis: P = Predicted O3 Concentration (pphm), with the NAAQS level marked.]

            FIGURE  VI-4.    CORRELOGRAM OF  OZONE  OBSERVATION-PREDICTION
                           PAIRS  FOR  SAMPLE  CASE (DENVER,  28  JULY  1976)

-------
[Figure: normalized deviation plotted against Average Ozone Concentration (pphm), (Predicted + Observed)/2.]

         FIGURE VI-5.  NORMALIZED  DEVIATIONS ABOUT THE PERFECT  CORRELATION LINE AS A FUNCTION
                       OF OZONE  CONCENTRATION (DENVER, 28 JULY  1976)

-------
[Figure: frequency distributions plotted against Non-normalized Deviation (pphm).  Curves shown:
  - Deviation of predicted versus observed points from the perfect correlation line
    (111 one-hour-averaged data points; standard deviation = 1.22 pphm)
  - (True-instrumental) EPA-acceptable monitor (mean bias = -8 percent; ± 3 pphm at 95 percent confidence level)
  - (True-instrumental) maximum probable error (mean bias = -8 percent; ± [illegible] pphm at 95 percent confidence level)]

FIGURE  VI-6.  NON-NORMALIZED OZONE  DEVIATIONS  ABOUT THE  PERFECT  CORRELATION LINE
               COMPARED WITH  INSTRUMENT  ERRORS  (DATA FOR  14 HOURS AND 8 STATIONS,
               DENVER,  28 JULY 1976)

-------
[Figure: frequency distributions plotted against Non-normalized Error (pphm).  Curves shown:
  - Absolute deviation of predicted and observed points from the perfect correlation line
    (111 one-hour-averaged data points; standard deviation = 0.72 pphm)
  - (True-instrumental) EPA-acceptable monitor (mean bias = -8 percent; ± 3 pphm at 95 percent confidence level)]

FIGURE  VI-7.  NON-NORMALIZED OZONE ABSOLUTE DEVIATIONS  ABOUT  THE  PERFECT  CORRELATION LINE
               COMPARED WITH INSTRUMENT ERROR (DATA FOR  14  HOURS AND  8  STATIONS,  DENVER,
               28 JULY 1976)

-------
comparison of an EPA-acceptable monitor with an EPA reference instrument.   We
see that the mean value and standard deviation of the prediction-observation
"error" are both somewhat less than those resulting from  instrument  differ-
ences.  The conclusion suggests itself that gross error is within  acceptable
bounds, though we caution that the shape of the instrument difference curve
is an estimate and needs to be analyzed in further detail.
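
One way to approximate the instrument-difference distribution used for reference is sketched below. It assumes the monitor difference is normally distributed, converts the ± 3 pphm 95 percent interval to a standard deviation, and then uses the standard folded-normal formulas for the absolute deviation. This is an assumption-laden illustration, not necessarily the procedure the authors followed; the -0.4 pphm mean is the representative bias quoted in Table VI-12.

```python
import math

Z_95_TWO_SIDED = 1.959964  # standard normal quantile for a 95 percent interval


def sigma_from_interval(half_width, z=Z_95_TWO_SIDED):
    """Standard deviation implied by a +/- half_width confidence interval."""
    return half_width / z


def folded_normal_moments(mu, sigma):
    """Mean and standard deviation of |X| for X ~ Normal(mu, sigma)."""
    def phi(x):  # standard normal cumulative distribution function
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    mean_abs = (
        sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu**2 / (2.0 * sigma**2))
        + mu * (1.0 - 2.0 * phi(-mu / sigma))
    )
    var_abs = mu**2 + sigma**2 - mean_abs**2
    return mean_abs, math.sqrt(var_abs)


sigma = sigma_from_interval(3.0)                      # about 1.53 pphm
zero_mean_abs, _ = folded_normal_moments(0.0, sigma)  # about 1.22 pphm
_, biased_abs_std = folded_normal_moments(-0.4, sigma)
print(round(sigma, 2), round(zero_mean_abs, 2), round(biased_abs_std, 2))
```

Under these assumptions the results land close to the representative values quoted in Table VI-12 (1.53, 1.22, and 0.95 pphm), but the agreement should be read as suggestive only.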

     Fourth, temporal behavior at each monitoring station seems satisfac-
tory,  appearing correlated to better than the requisite 95  percent con-
fidence level.  We note that the correlation we have observed provides
information  only  about the "shape" of the concentration profiles (shown
in Figure VI-3),  not its  absolute level.  In general, predicted concen-
trations rise  and fall when observed values do, though the  concentration
values might be quite different.  Only by examining bias  and error per-
formance measures can we  draw conclusions about concentration levels.
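
The minimum acceptable correlation coefficient can be obtained from a t-statistic with the standard relation r = t / sqrt(t^2 + n - 2). The sketch below assumes two-sided 5 percent critical values of Student's t taken from standard tables (2.179 for 12 degrees of freedom, 2.447 for 6); with 14 hourly pairs per station and 8 stations per hour, these reproduce the 0.53 and 0.71 values quoted in Table VI-12. Whether the authors used exactly this test is an assumption.

```python
import math


def min_significant_r(n_pairs, t_critical):
    """Smallest correlation coefficient judged significant.

    Inverts t = r * sqrt(n - 2) / sqrt(1 - r^2) for r, given the
    critical t value for n - 2 degrees of freedom.
    """
    dof = n_pairs - 2
    return t_critical / math.sqrt(t_critical**2 + dof)


# Critical values from standard t tables (two-sided, 5 percent level).
r_t_min = min_significant_r(n_pairs=14, t_critical=2.179)  # 14 hours per station
r_x_min = min_significant_r(n_pairs=8, t_critical=2.447)   # 8 stations per hour
print(round(r_t_min, 2), round(r_x_min, 2))
```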

      Fifth,  spatial  alignment does  not appear to be acceptably good.
 During none of the 14 hours considered do the spatial patterns of
 predictions and observations appear to be correlated at the 95 percent
 confidence level.  In fact,  for a number of hours,  the correlation seems
 quite poor.   Two possible explanations exist.  Either the spatial cor-
 relation coefficient is too "stringent"  or the predicted concentration
 field in fact is misaligned.   Since temporal  correlation appears  strong,
 the lack of corresponding spatial correlation is somewhat surprising,
 though countervailing errors responsible for  this conceivably could  be
 present.  It is also possible that the temporal  correlation  coefficient
 either is too "lenient" or it should not be computed including concentra-
 tions at all daylight hours.  Presently, we do not know  which of  these
 explanations is correct, noting only that it is a subject  for future
 investigation.  Conceivably,  measurement data errors could also be  contri-
buting to the problem.
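
The contrast between the two coefficients is easier to see in code. In this sketch the concentrations are stored as a station-by-hour grid; temporal coefficients correlate each station's hourly series, while spatial coefficients correlate the across-station pattern at each hour. The grid layout, the function names, and the toy numbers are assumptions chosen so that temporal correlation is perfect while spatial correlation is as poor as possible.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def temporal_and_spatial(pred, obs):
    """pred[i][j] and obs[i][j]: concentration at station i, hour j."""
    r_t = [pearson(p_row, o_row) for p_row, o_row in zip(pred, obs)]
    n_hours = len(pred[0])
    r_x = [
        pearson([row[j] for row in pred], [row[j] for row in obs])
        for j in range(n_hours)
    ]
    return r_t, sum(r_t) / len(r_t), r_x, sum(r_x) / len(r_x)


# Toy grids: every station rises hour by hour in both fields, but the
# station with the highest predictions has the lowest observations.
pred = [[1, 2, 3], [3, 4, 5], [2, 3, 4]]
obs = [[3, 4, 5], [1, 2, 3], [2, 3, 4]]
r_t, r_t_overall, r_x, r_x_overall = temporal_and_spatial(pred, obs)
print(r_t_overall, r_x_overall)
```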

      In this example, we can examine model predictions for spatial  mis-
alignment.  To do so, we conducted an informal experiment  among several
of our staff.   In general, reconstructing the "true" concentration


-------
field from a "sparse" set of observational  data  is  a  difficult  and  uncer-
tain process.  Nevertheless, we attempted,  using only station measurement
data, to draw isopleth maps showing contours  of  constant  concentration
values.  The process, of course, is a  highly  subjective one, requiring the
person doing the drawing to make a number of  judgmental and  often arbi-
trary decisions.  In this case, a useful  result  was achieved.

     None of the participants in the experiment was able to draw unam-
biguous isopleth maps for those hours when overall concentrations were low
(before 11 in the morning and after 3 in the afternoon).  During the four
"peak hours," however, while their estimates of the configurations of the
lower, outlying concentration isopleths varied widely, the participants
agreed reasonably well on the location of the peak. We com-
pare in Figure VI-8 a "ground-trace" of their composite estimates with
the peak locations predicted by the Airshed Model.

     We observe that the ground-traces of the predicted and observed peaks
differ, both in direction and  speed of drift.  This suggests that  either
the model has had some difficulty in simulating  atmospheric dispersion
or it  is being driven by inputs that imperfectly characterize ambient
conditions on the modeling day.  Based on a generally favorable model
performance rating, as judged  by the other four types of measures,  we
feel the latter of these two explanations is more likely.
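
A comparison of ground-traces in terms of direction and speed of drift could be sketched as follows. The coordinates, units, and function name are hypothetical; the report gives the traces only graphically in Figure VI-8.

```python
import math


def drift(track, hours_per_step=1.0):
    """Net drift of a peak from the first to the last position.

    `track` is a list of (east, north) positions, one per hour.
    Returns (bearing in degrees clockwise from north, speed in
    distance units per hour).
    """
    (x0, y0), (x1, y1) = track[0], track[-1]
    dx, dy = x1 - x0, y1 - y0
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    elapsed = hours_per_step * (len(track) - 1)
    return bearing, math.hypot(dx, dy) / elapsed


# Hypothetical hourly peak positions (km), for illustration only.
predicted_track = [(0, 0), (1, 1), (2, 2), (3, 3)]
observed_track = [(0, 0), (0, 2), (0, 4), (0, 6)]
pred_bearing, pred_speed = drift(predicted_track)
obs_bearing, obs_speed = drift(observed_track)
print(pred_bearing, pred_speed, obs_bearing, obs_speed)
```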

     The model input most likely to have caused the alignment problem
is the temporally and spatially varying wind field.  By comparing the
ground-trace of the  predicted  peak with the directions and speeds of pre-
vailing winds that we input to the Airshed model, we confirmed that the
wind field  did  indeed appear to be  "forcing" the predicted pollutant
cloud  in just the direction noted  in Figure VI-8.

     We emphasize that  this  does  not confirm that  "errors" in  the  input
wind field  were responsible for the spatial misalignment, but  the  evi-
dence  is  suggestive.  Final  confirmation or refutation would come  by

-------
[Figure: map (north at top) showing the ground-traces of the predicted and measured peaks, labeled by time of day.]

  FIGURE VI-8.  GROUND-TRACES OF THE PREDICTED AND OBSERVED PEAK OZONE
                CONCENTRATIONS (DENVER, HOURS 1100-1200 TO 1400-1500
                LOCAL STANDARD TIME, 28 JULY 1976)

-------
rerunning the Airshed Model using a wind field "adjusted"  to better
mirror our updated estimates of the meteorology on the modeling day.   If
agreement, as evaluated by the five types of performance measures, were
"better," then we might conclude that wind field imperfections were
responsible for our misalignment problems.

F.   SUGGESTED FRAMEWORK FOR A DRAFT STANDARD

     We have now completed our central objective in this report:  the
identification and specification of model performance measures and stan-
dards.  In doing so, however, we have not solved the problem but rather
only begun a discussion that will be a continually evolving one.  Almost
certainly, the specific measures and standards employed to evaluate
model performance will change as our insight and experience expands.
On balance, the most enduring benefit from this study will be the con-
ceptual structure it sets.

     With that structure in mind, we discuss one final subject:  a frame-
work for a draft model performance standard.  We view the promulgation
of the standard as having  two distinct parts:  the text of the standard
itself and an accompanying guidelines document.  Whereas the standard
should be quite specific about selecting and applying the performance
measures to be used, a guidelines document is needed to provide
supplementary discussion and examples.  While a full examination of the
interrelationships between the two documents is beyond the scope of the
current study, we illustrate in Figure VI-9 one possible
configuration.

     We  focus  in  this  discussion on suggested  elements of a draft per-
formance standard.  We state  several  of  the functional sections  it
should contain:

      >   Goals  and Objectives.   The reasons  for insisting on model
         validation  should  be  stated,  as  well  as  a  summary of
         expected  costs and benefits.  Our objectives  in conduct-
         ing  performance  evaluation should be  clearly  presented.

-------
[Figure: flow diagram.  Each step of the STANDARD (left, in sequence) is paired with a section of the GUIDELINES (right).]

   STANDARD                                   GUIDELINES

   Goals and objectives                       Rationale for goals and objectives

   Overall modeling acceptance criteria       Guidance on checking whether the modeling
   (e.g., "modeling must be done for          effort conforms to overall acceptance
   'worst case' episode conditions")          criteria

   Determination of performance measures      Supplementary guidance on proper selection
                                              and ranking of performance measures

   Specification of performance standards     Background and statement of rationales
                                              for standards

   Calculation of measures                    Additional guidance on the calculation
                                              of measures

   Evaluation of model acceptability          Guidance on interpretation of the values
                                              of the measures; case studies

   Determination of required action           Supplementary discussion of procedural
                                              alternatives

   FIGURE VI-9.  POSSIBLE RELATIONSHIPS BETWEEN THE MODEL PERFORMANCE
                 STANDARDS AND A GUIDELINES DOCUMENT

-------
>  Overall  Modeling Acceptance Criteria.   Important  criteria
   for judging a modeling effort  in an overall  sense should
   be clearly stated,  along with  the action required if any
   of the criteria are not satisfied.  Among possible criteria  are
   the following:  The verification must be done for modeling
   days typical of "worst case" conditions, the measurement
   network must meet certain stated minimum standards
   (numbers, types and configurations of the monitoring
   stations), and point source models must be verified using
   the appropriate prototypical data base (one appropriate for
   an application similar to the  proposed hypothetical one).
   Without these and perhaps other overall criteria  being sat-
   isfied,  model evaluation would be premature.
>  Determination of Performance Measures.  The procedure must be
   stated for determining the performance measures to be used
   for model evaluation.  Instructions must also be  provided
   for matching the importance ranking of each of the model
   performance attributes to the  type of issue being
   addressed and the pollutant/averaging time being  considered.
   We might do so using the importance tables we presented
   earlier in this chapter and repeat for convenience in
   Tables VI-13 and VI-14.
>  Specification of Performance Standards.  The standards must
   be clearly stated for each of  the performance measures to
   be used.  We present in Table  VI-15 one format for doing
   so, presenting the standards in the form of general prin-
   ciples.   In each instance, the actual numerical standard  is
   dependent on the characteristics of the specific application.
   Guidance must be provided on how to determine the proper
   numerical values.
>  Calculation of Measures.  Each measure should be defined
   mathematically, accompanied by directions on precisely how
   the measures are to be calculated.

-------
               TABLE VI-13.     IMPORTANCE  OF PERFORMANCE  ATTRIBUTES  BY  ISSUE

                                              Importance of Performance Attribute*

   Performance Attribute               SIP/C   [illegible]   PSD    NSR    OSR    EIS/R   LIT

   Accuracy of the peak prediction       1          1         1      1      2       1      1
   Absence of systematic bias            1          1         1      1      1       1      1
   Lack of gross error                   2          2         1      1      2       1      1
   Temporal correlation                  2          2         3      3      3       3      3
   Spatial alignment                     2          2         1      3      3       3      3

 * Category 1 - Performance standard must always be satisfied.
   Category 2 - Performance standard should be satisfied, but some leeway
                may be allowed at the discretion of a reviewer.
   Category 3 - Meeting the performance standard is desirable but failure
                is not sufficient to reject the model; measures dealing
                with this problem should be regarded as "informational."
          TABLE  VI-14.  IMPORTANCE OF  PERFORMANCE ATTRIBUTES  BY POLLUTANT
                         AND  AVERAGING  TIME

                                                 Importance of Performance Attribute*

                         |                     Pollutants with Short-term Standards                       | Pollutants with Long-term Standards
   Performance           | O3**      CO**      NMHC**    SO2       NO2†   CO        TSP**      SO2**      | NO2**     TSP       SO2
   Attribute             | (1 hour)§ (1 hour)  (3 hour)  (3 hour)         (8 hour)  (24 hour)  (24 hour)  | (1 year)  (1 year)  (1 year)

   Accuracy of the       |    1         1         1         1       1        1         1          1       |    3         3         3
   peak prediction
   Absence of            |    1         1         1         1       1        1         1          1       |    1         1         1
   systematic bias
   Lack of gross error   |    1         1         1         1       1        1         1          1       |    1         1         1
   Temporal correlation  |    1         2         2         2       1        2         3          3       |  N/A††     N/A††     N/A††
   Spatial alignment     |    1         2         2         2       1        2         2          2       |    2         2         2

  * Category 1 - Performance standard must be satisfied.
    Category 2 - Performance standard should be satisfied, but some leeway may be allowed at the discretion of a reviewer.
    Category 3 - Meeting the performance standard is desirable but failure is not sufficient to reject the model.

  † No short-term NO2 standard currently exists.

  § Averaging times required by the NAAQS are in parentheses.

 ** Primary standards.

 †† The performance attribute is not applicable.

-------
           TABLE  VI-15.    MODEL  PERFORMANCE  MEASURES AND STANDARDS*

Performance Attribute:  Accuracy of the peak prediction

  Performance Measure:   Ratio of the predicted station peak to the measured
                         station peak (could be at different stations)

                              C_pp / C_pm

  Performance Standard:  Limitation on uncertainty in aggregate health impact
                         and pollution abatement costs†

  Performance Measure:   Difference in timing of occurrence of station peak†

  Performance Standard:  Model must reproduce reasonably well the phasing of
                         the peak, say, ±1 hour

Performance Attribute:  Absence of systematic bias§

  Performance Measure:   Average value and standard deviation of the mean
                         deviation about the perfect correlation line,
                         normalized by the average of the predicted and
                         observed concentrations, calculated for all stations
                         during those hours when either the predicted or the
                         observed values exceed some appropriate minimum value
                         (possibly the NAAQS)

                              (μ, σ_μ) OVERALL

  Performance Standard:  No or very little systematic bias at concentrations
                         (predictions or observations) at or above some
                         appropriate minimum value (possibly the NAAQS); the
                         bias should not be worse than the maximum bias
                         resulting from EPA-allowable calibration error
                         (-8 percent is a representative value for ozone);
                         also, the standard deviation should be less than or
                         equal to that of the difference distribution between
                         an EPA-acceptable monitor** and an EPA reference
                         monitor (3 pphm is representative for ozone at the
                         95 percent confidence level)

Performance Attribute:  Lack of gross error§

  Performance Measure:   Average value and standard deviation of the absolute
                         mean deviation about the perfect correlation line,
                         normalized by the average of the predicted and
                         observed concentrations, calculated for all stations
                         during those hours when either the predicted or the
                         observed values exceed some appropriate minimum value
                         (possibly the NAAQS)

                              (|μ|, σ_|μ|) OVERALL

  Performance Standard:  For concentrations at or above some appropriate
                         minimum value (possibly the NAAQS) the error (as
                         measured by the overall values of |μ| and σ_|μ|)
                         should not be worse than the error resulting from
                         the use of an EPA-acceptable monitor**

Performance Attribute:  Temporal correlation§

  Performance Measure:   Temporal correlation coefficients at each monitoring
                         station for the entire modeling period and an overall
                         coefficient averaged for all stations

                              r_ti, r_t,OVERALL

                         for 1 ≤ i ≤ M monitoring stations

  Performance Standard:  At a 95 percent confidence level, the temporal
                         profile of predicted and observed concentrations
                         should appear to be in phase (in the absence of
                         better information, a confidence interval may be
                         converted into a minimum allowable correlation
                         coefficient by using an appropriate t-statistic)

Performance Attribute:  Spatial alignment

  Performance Measure:   Spatial correlation coefficients calculated for each
                         modeling hour considering all monitoring stations, as
                         well as an overall coefficient average for the entire
                         day

                              r_xj, r_x,OVERALL

                         for 1 ≤ j ≤ N modeling hours

  Performance Standard:  At a 95 percent confidence level, the spatial
                         distribution of predicted and observed concentrations
                         should appear to be correlated

  * There is deliberate redundancy in the performance measures.  For example, in testing for systematic bias, μ and σ_μ
    are calculated.  The latter quantity is a measure of "scatter" about the perfect correlation line.  This is also
    an indicator of gross error and should be used in conjunction with |μ| and σ_|μ|.
  § These measures are appropriate when the chosen model is used to consider questions involving photochemically reac-
    tive pollutants subject to short-term standards.
  † These may not be appropriate for all regulated pollutants in all applications.  When they are not, standards derived
    based on pragmatic/historic experience should be employed.
 ** By "EPA-acceptable monitor" we mean a monitor that satisfies the requirements of 40 CFR §53.20.

-------
     >  Evaluation of Model Acceptability.  The  rating  procedure
        to be  used in evaluating model  performance must be stated.
        Guidance should be supplied  on  the  way in which problem
        importance ranking is  "folded in" with the performance
        rating for each of the measures.
     >  Determination of Required  Action.  The alternative actions
        required of the model  user,  depending on the model evalua-
        tion,  must be stated.  Among the possible alternative out-
        comes  of the model evaluation are the following:  The model
        is rated acceptable,  the model  requires  a waiver  from an
        outside reviewer before acceptance  can be granted (that is,
        the model is deficient in  some Category 2-importance problem
        area), or the model is unacceptable (the model  is deficient
        in some Category 1-importance problem area).
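
The three outcomes listed above amount to a simple decision rule, which can be sketched as follows. The names `Outcome` and `required_action` are hypothetical, and treating deficiencies in problem areas of lower importance than Category 2 as acceptable is an assumption; the text above states only the Category 1 and Category 2 cases.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPTABLE = "model is rated acceptable"
    WAIVER_REQUIRED = "waiver from an outside reviewer is required"
    UNACCEPTABLE = "model is unacceptable"


def required_action(deficient_categories):
    """Map the importance categories of the problem areas in which a
    model was found deficient to one of the three outcomes.

    deficient_categories: iterable of integers (1 = most important).
    """
    categories = set(deficient_categories)
    if 1 in categories:
        return Outcome.UNACCEPTABLE
    if 2 in categories:
        return Outcome.WAIVER_REQUIRED
    # Assumption: deficiencies of lower importance do not block acceptance.
    return Outcome.ACCEPTABLE


print(required_action([]).value)
print(required_action([2, 3]).value)
print(required_action([1, 2]).value)
```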

     We end our discussion of a suitable structure for  a  draft performance
standard by noting that this has  been only  a brief encounter with an
important and  complex subject.  We recommend that it be examined in far
greater detail in subsequent work.
                              VI-58

-------
                 VII   RECOMMENDATIONS FOR  FUTURE WORK


     In this study we have suggested a conceptual framework within which model
performance may be objectively evaluated.  We have identified key attributes
of a well performing model and selected performance measures for use in detect-
ing the presence or absence of each attribute.   For the measures chosen for
use, we have developed explicit standards that specify the range of their
acceptable values.

     Throughout, we have maintained the point of view that measures and stan-
dards of performance for models should be determined as independently as possible
of considerations about model-specific limitations and data inadequacies.
Remembering this perspective may be important when evaluating the practical
utility of the procedure suggested in this report in certain point source appli-
cations.  This is particularly true when the available measurement data are
"sparse."  Where data quantity and resolution (temporal and spatial) are insuf-
ficient to permit meaningful  calculation of the performance measures, we view
this more as a data inadequacy that must be overcome than as a deficiency in
the model  evaluation framework suggested here.

     The development of a performance evaluation procedure for models is an
evolutionary process.  We have advanced in this study a conceptual  structure
and a first-generation procedure for conducting such an evaluation.   We now
recommend ways in which development may proceed, moving from the conceptual
framework provided in this study to the realm of practical  application of per-
formance evaluation procedures.

     We recommend that the work begun in this study continue in several  key
areas.   In this chapter we outline briefly our  specific recommendations, group-
ing them into three categories:   areas for technical  development,  assessment of
institutional  implications, and documents to be compiled.   We consider each
category in turn.
                                     VII-1

-------
A.   AREAS FOR TECHNICAL DEVELOPMENT

     A number of important technical areas remain that would benefit from
additional developmental work.   We consider four key areas here.

1.   Further Evaluation of Performance Measures
     In this study, a sample case has been considered that permits us to
evaluate in a practical situation the utility of the recommended performance
measures in detecting the presence or absence of desirable model attributes.
However, the suitability for use of each of these measures needs further evalu-
ation over a range of circumstances.  Specifically, we recommend the following:

     >  Additional case studies need to be considered, with perfor-
        mance measures calculated for each.  The choice of case studies
        should be made in order to  "stress" the evaluation procedure,
        that is, any limitations should be made apparent.  The range of
        case studies should include both multiple-source and specific-
        source applications.
     >  The behavior of the suggested performance measures needs
        to be assessed over a range of conditions.  Alternate or supple-
        mentary performance measures should be identified, if required,
        so as to further extend the range of applicability of the evalua-
        tion procedure suggested in this study.
     >  A performance measure evaluation analysis should be conducted.
        Two concentration fields, initially aligned spatially and
        temporally, could be progressively "degraded," that is, offset
        in space or time.  By observing the corresponding changes in the
        values of the performance measures and the conclusions that derive
        therefrom, insight could be gained into their overall suitability for use.
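
The "degradation" analysis in the last item above can be illustrated with a small sketch. It assumes a synthetic 24-hour sinusoidal concentration profile (values hypothetical), treats a copy of that profile delayed by a few hours as the "prediction," and tracks two simple measures: the correlation coefficient and the mean deviation. These helper functions are illustrative stand-ins, not the report's own definitions of its performance measures.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mean_deviation(pred, obs):
    """Average of (predicted - observed); a simple bias measure."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def shift_in_time(series, hours):
    """Delay a profile by `hours` steps, wrapping around the day."""
    hours %= len(series)
    return series[-hours:] + series[:-hours]


# Hypothetical hourly "observed" profile, in ppm.
observed = [0.08 + 0.06 * math.sin(2 * math.pi * (h - 9) / 24) for h in range(24)]

for lag in (0, 1, 2, 3, 6):
    predicted = shift_in_time(observed, lag)
    print(lag,
          round(pearson(predicted, observed), 3),
          round(mean_deviation(predicted, observed), 6),
          round(max(predicted) / max(observed), 3))
```

In this idealized case the correlation falls steadily as the timing offset grows, while the mean deviation and the peak ratio do not change at all. A pure timing error is therefore invisible to the bias and peak measures, which suggests why several complementary measures are needed.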

2.   Identification and Specification of Prototypical Point Source
     "Test Bed" Data Bases

     For the purposes of model evaluation in the many specific-source appli-
cations where site-specific data are either inadequate or nonexistent, a
"test-bed," or surrogate, data base is required.   This  data  base must  provide
concentration data of sufficient spatial  extent and temporal  frequency to
permit the calculation of meaningful  values for the model  performance  measures.
Selection of a particular data base could be made by determining, from among
several prototypical "test beds," which derives from conditions  most like  those
in the proposed application.  We recommend that the following work be  under-
taken:

     >  A comprehensive list of prototypical point source  situa-
        tions should be compiled.
     >  For each prototypical situation, a "test bed" data base
        should be specified and assembled.

3.   Examination of Performance Evaluation Procedure in Sparse-Data
     Point Source Applications

     We have identified in this study several key attributes of  a well-
performing model, for each of which presence or absence may  be detected
by calculating certain performance measures.  However, for the values  of
these measures to assume statistical  significance, a certain minimum level
is required for the spatial extent and the temporal frequency of the measure-
ment data.  Often, in multiple-source applications, such a minimum level is
attained, particularly in urban areas with well-developed monitoring networks.
In specific-source applications, though, a minimum acceptable level of data
may not be attained.  To overcome this problem, we have suggested that proto-
typical point source data bases be assembled for the purposes of model evalua-
tion.  These data bases would provide sufficiently well-conditioned data for
calculation of the performance measures to be useful.

     As a practical matter, however, such data bases are not presently available
to the modeling community.  In lieu of their use, other sources  of data may be
used for the purpose of model evaluation, despite the deficiencies in  such data.
For example, a limited amount of tracer data may be gathered.  If the  situation
to be modeled involves either construction at a site where another source already
exists or retrofit of pollution control equipment, then some limited site-
specific monitoring data may be available.   Such data may not be sufficiently
"well-conditioned" to permit meaningful  calculation of the performance measures
suggested for use.  What can be done?  Should calculation of the performance
measures be allowed using the possibly deficient, sparse data available, or
should the model evaluation process be halted until more "robust" data are
acquired?  We suggest that the implications of both these alternatives be assessed,
searching for those limited circumstances where a "middle ground" may be found,
with alternative measures and standards identified for use that are less
"demanding" in their measurement data requirements.  The implications of allow-
ing the use of such supplementary measurements also need to be examined.

     Also, a related issue may be important in point source modeling appli-
cations:  relative versus absolute model performance.  Are there circumstances
in which a model may be better able to predict relative, incremental changes
in concentration than absolute ground-level values?  It should be determined
whether or not such situations occur in practice.  If they do, relative vali-
dation of a model may become a consideration.  This could be of concern, for
example, when using a Gaussian model to assess the impact of control equipment
that  is retrofit on an existing source.  If relative performance is deemed im-
portant in some circumstances, then additional performance measures and stan-
dards should be identified which allow the modeler to make such an assessment.
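
A small numerical sketch may help show why relative and absolute performance can differ. It assumes, purely for illustration, that the model error is a constant multiplicative factor (as could happen when predicted concentration scales linearly with emission rate and the dispersion estimate is off by a fixed ratio). All concentrations and the bias factor are hypothetical.

```python
def fractional_change(before, after):
    """Relative change in concentration from `before` to `after`."""
    return (after - before) / before


# Hypothetical peak concentrations (arbitrary units) before and after
# control equipment is retrofitted on an existing source.
observed_before, observed_after = 100.0, 60.0

# Assumed constant multiplicative model bias.
bias_factor = 1.5
predicted_before = bias_factor * observed_before
predicted_after = bias_factor * observed_after

absolute_error_after = predicted_after - observed_after
observed_change = fractional_change(observed_before, observed_after)
predicted_change = fractional_change(predicted_before, predicted_after)

print("absolute error after retrofit:", absolute_error_after)
print("observed fractional change:   ", round(observed_change, 3))
print("predicted fractional change:  ", round(predicted_change, 3))
```

Under this assumption the absolute prediction is 50 percent too high, yet the predicted fractional reduction matches the observed one. Whether real model errors behave this way is exactly the question the text above recommends investigating.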

4.    Further Development of Rationales for Setting Performance Standards

      Several rationales for setting performance standards have been examined
in this study.  Some of these merit further technical development and assess-
ment of the range of their applicability.  Also, additional rationales should
be  identified where possible.  Towards these ends, we recommend the following:

      >  Additional developmental work should continue on the Health
         Effects  (HE) and Control Level Uncertainty  (CLU) rationales.
      >  The use of the HE/CLU rationales in setting a standard for
         the ratio of predicted and observed peak station concentra-
         tions should be exposed to peer review.  A journal article
         on the subject should be prepared and submitted for publi-
         cation.
     >  Explicit error and bias  standards should be calculated for
        all regulated pollutants.  This may be done using monitoring
        specifications  in federal  regulations.   In this study, only
        bias and error  standards for ozone were  calculated numerically.

B.   ASSESSMENT OF INSTITUTIONAL IMPLICATIONS

     A number of institutional requirements  are  implied by any decision to
promulgate standards for model performance,  or even  by a  decision  to  publish
formal guidelines for model performance evaluation.   We recommend  that these
implications and their  attendant procedural  and  resource  requirements be
assessed.  Among the many questions to be resolved are the  following:

     >  Regulatory Responsibility
        -  How should formal performance standards be promulgated—
           or should they be promulgated at all?
        -  If standards are stated or recommended, how will  they
           be updated?
        -  Who will accumulate  information about historically
           achieved model performance?  (This information would
           be required when setting a standard invoking the Pragmatic/
           Historic  rationale.)
     >  Custodial Responsibility
        -  Who will identify and assemble the prototypical "test
           bed" data bases for use in point source applications?
        -  Who will maintain, store, and distribute the "test bed"
           data bases?
     >  Review Responsibility
        -  Who should review the adequacy of model performance in
           a specific application?
                                     VII-5

-------
        -  Does a model  need to be repeatedly  evaluated using  a
           "test bed"  data base?  If not,  who  decides  when a
           model/data  base combination  has been  sufficiently
           examined?
     >  Advisory Responsibility
        -  What advisory documents should  be provided  to the model
           user community?
        -  Who will provide guidance to model  users  and how should
           that support be funded?

     These are simply a few of the many procedural and institutional  questions
that arise.  Answers to these and other key questions  should be sought at
an early date.

C.   DOCUMENTS TO BE COMPILED

     Specific documents will have to be drafted that describe  suggested or
mandated model performance standards.  Two documents seem appropriate for
publication (though conceivably they could be combined into a  single  guide-
lines document).  These documents are the following:

     >   Formally promulgated model performance standards along with
         specific procedures for evaluating performance.  These could
         be presented in guideline form rather than  as mandated stan-
         dards.  The latter of these two approaches  may be preferable,
         given the complexities of modeling and its  attendant  uncertain-
         ties.
      >  Advisory/informative model  performance  guidelines document.
         This  may provide the advice and information necessary to con-
         duct  a meaningful model  performance evaluation.   It could
         play  the role,  with respect to the performance standards,
         that  is indicated in Figure VI-9.
                                   VII-6

-------
              APPENDIX A
IMPORTANT PARTS OF THE CODE OF FEDERAL
  REGULATIONS CONCERNING AIR PROGRAMS
                  A-1

-------
               PART 50.  NATIONAL PRIMARY AND SECONDARY
                        AMBIENT AIR QUALITY STANDARDS
Section
 50.1       Definitions.
 50.2       Scope.
 50.3       Reference Conditions.
 50.4       National  primary ambient  air quality standards for
            sulfur oxides (sulfur  dioxide).
 50.5       National  secondary  ambient air quality standards for
            sulfur oxides (sulfur  dioxide).
 50.6       National  primary AAQS  for particulate matter.
 50.7       National  secondary  AAQS for particulate matter.
 50.8       National  primary and secondary AAQS for carbon monoxide.
 50.9       National  primary and secondary AAQS for photochemical oxidants.
 50.10      National  primary and secondary AAQS for hydrocarbons.
 50.11      National  primary and secondary AAQS for nitrogen dioxide.

 Appendix A—Reference Method for  the Determination of Sulfur Dioxide in
             the Atmosphere (Pararosaniline  Method).
 Appendix B—Reference Method for  the Determination of Suspended
             Particulates in the Atmosphere  (High Volume Method).
 Appendix C—Measurement Principle and Calibration Procedure for the
             Continuous Measurement of Carbon Monoxide in  the Atmosphere
             (Non-Dispersive Infrared Spectrometry).
 Appendix D—Measurement Principle and Calibration Procedure for the
             Measurement of Photochemical Oxidants Corrected for Inter-
             ferences due to Nitrogen Oxides and Sulfur  Dioxide.
 Appendix E—Reference Method for the Determination of Hydrocarbons
             Corrected for Methane.
 Appendix F—Reference Method for the Determination of Nitrogen  Dioxide
              (24-Hour Sampling Method).

 Authority:  The provisions of this  Part  50  issued under Sec. 4, Public
             Law 91-604, 84 Stat.  1679  (42  U.S.C. 1857c-4).

  Source:     The provisions of this  Part  50  appear at 36 F.R. 22384,
             November 25,  1971, unless  otherwise noted in the CFR.
                                 A-2

-------
          PART 51.   REQUIREMENTS FOR PREPARATION,  ADOPTION,
                    AND SUBMITTAL OF IMPLEMENTATION  PLANS
Section
                    Subpart A—General  Provisions
 51.1     Definitions.
 51.2     Stipulations.
 51.3     Classification of regions.
 51.4     Public hearings.
 51.5     Submittal  of plans;  preliminary review of  plans.
 51.6     Revisions.
 51.7     Reports.
 51.8     Approval of plans.

              Subpart B--Plan Content and Requirements
 51.10    General requirements.
 51.11    Legal authority.
 51.12    Control strategy:  General.
 51.13    Control strategy:  Sulfur oxides and particulate  matter.
 51.14    Control strategy:  Carbon monoxide, hydrocarbons, photo-
          chemical oxidants, and nitrogen dioxide.
 51.15    Compliance schedules.
 51.16    Prevention of air pollution emergency episodes.
 51.17    Air quality surveillance.
 51.17a   Air quality monitoring methods.
 51.18    Review of new sources and modifications.
 51.19    Source surveillance.
 51.20    Resources.
 51.21    Intergovernmental cooperation.
 51.22    Rules and regulations.
 51.23    Exceptions.
                                  A-3

-------
                        Part 51  (continued)

                        Subpart  C—Extensions
51.30    Requests for 2-year extension.
51.31    Requests for 18-month extension.
51.32    Requests for 1-year postponement.
51.33    Hearings and appeals relating to  requests for one year
         postponement.
51.34    Variances.

           Subpart D--Maintenance of National  Standards
51.40    Scope.
         AQMA Analysis
51.41    Submittal date.
51.42    Analysis period.
51.43    Guidelines.
51.44    Projection  of  emissions.
51.45    Allocation  of  emissions.
51.46    Projection  of  air quality concentrations.
51.47    Description of data sources.
51.48    Data bases.
51.49    Techniques  description.
51.50    Accuracy factors.
51.51    Submittal of calculations.
         AQMA Plan
51.52    General.
51.53    Demonstration of adequacy.
51.54    Strategies.
51.55    Legal authority.
51.56    Future strategies.
51.57    Future legal authority.
51.58    Intergovernmental cooperation.
51.59    Surveillance.

                                A-4

-------
                         Part 51  (continued)
51.60    Resources.
51.61    Submittal format.
51.62    Data availability.
51.63    Alternative procedures.


Appendix A—Air Quality Estimation.
Appendix B—Examples of Emission Limitations Attainable with Reasonably
            Available Technology.
Appendix C—Major Pollutant Sources.
Appendix D—Emissions Inventory Summary (Example  Regions).

Appendix E—Point Source Data.
Appendix F—Area Source Data.
Appendix G--Emissions Inventory Summary (other Regions).

Appendix H--Air Quality Data Summary.
Appendix J--Required Hydrocarbon Emission Control as  a Function of
            Photochemical Oxidant Concentrations.

Appendix K—Control Agency Functions.
Appendix L--Example Regulations for Prevention of Air Pollution
            Emergency Episodes.
Appendix M--Transportation Control Supporting Data Summary.
Appendix N--Emissions Reductions Achievable Through Inspection,
            Maintenance and  Retrofit of Light Duty Vehicles.

Appendix O--[No title, but related to §51.18]
Appendix P—Minimum Emission Monitoring Requirements.

Appendix Q--[Reserved]
Appendix R—Agency Functions for Air Quality Maintenance Area Plans
             for the         AQMA in the State of	
             for the year	.


Authority:   Part 51 issued under Section 301(a) of the Clean Air Act
             [42 U.S.C. 1857g(a)], as amended by Section 15(c)(2) of
             Public Law 91-604, 84 Stat. 1713, unless otherwise noted.

Source:      Part 51 appears at 36 F.R. 22398, November 25, 1971, unless
             otherwise noted.  AQMA considerations arose from 41 F.R. 18388,
             May 3, 1976, unless otherwise noted in the CFR.  NSR seems to
             be required by §51.18, with Appendix O intended to assist in
             developing regulations.  Standards are in Part 60.


                                A-5

-------
                  PART 52.  APPROVAL AND  PROMULGATION
                           OF  IMPLEMENTATION  PLANS
Section
                    Subpart A—General  Provisions
 52.01   Definitions.
 52.02   Introduction.
 52.03   Extensions.
 52.04   Classification of regions.
 52.05   Public availability of emission data.
 52.06   Legal authority.
 52.07   Control strategies.
 52.08   Rules and regulations.
 52.09   Compliance schedules.
 52.10   Review of new source and modification.
 52.11   Prevention of air pollution emergency episodes.
 52.12   Source surveillance.
 52.13   Air quality surveillance; resources; intergovernmental
         cooperation.
 52.14   State ambient air quality standards.
 52.15   Public availability of plans.
 52.16   Submission to administrator.
 52.17   Severability of provisions.
 52.18   Abbreviations.
 52.19   Revision of plans by Administrator.
 52.20   Attainment dates for national  standards.
 52.21   Significant deterioration of air quality.
 52.22   Maintenance of national standards.
 52.23   Violation and enforcement.

                  Subpart B through Subpart DDD
         SIPs for States and Territories
                                A-6

-------
                         Part 52 (concluded)
          Subpart EEE--Approval  and  Promulgation of Plans
Appendix A—Interpretive rulings for §52.22(b)—Regulation for review
            of new or modified indirect sources.
Appendix B-C—[Reserved]
Appendix D--Determination of sulfur dioxide emission from stationary
            sources by continuous monitors.
Appendix E--Performance specifications and specification test procedures
            for monitoring systems for effluent stream gas volumetric
            flow rate.
Authority:   40 U.S.C.  1857c-5,  42 U.S.C.  1857c-5 and 6; 1857g(a); 1859(g)

Source:      For Subpart A,  37 FR 10846, May  31, 1972, unless otherwise
            noted.
                               A-7

-------
               PART 60.   STANDARDS OF  PERFORMANCE FOR
                         NEW STATIONARY SOURCES
Subpart A—General  Provisions
Subpart B--Adoption and Submittal of State Plans for Designated Facilities
Subpart C--[Reserved]
Subpart D--Standards of Performance for Fossil-Fuel-Fired Steam Generators
Subpart E—SOP for Incinerators
Subpart F—SOP for Portland Cement Plants
Subpart G--SOP for Nitric Acid Plants
Subpart H--SOP for Sulfuric Acid Plants
Subpart I—SOP for Asphalt Concrete  Plants
Subpart J--SOP for Petroleum Refineries
Subpart K--SOP for Storage Vessels for Petroleum Liquids
Subpart L--SOP for Secondary Lead Smelters
Subpart M--SOP for Brass and Bronze  Ingot Production Plants
Subpart N—SOP for Iron and Steel  Plants
Subpart O--SOP for Sewerage Treatment Plants
Subpart P—SOP for Primary Copper Smelters
Subpart Q--SOP for Primary Zinc Smelters
Subpart R—SOP for Primary Lead Smelters
Subpart S—SOP for Primary Aluminum  Reduction Plants
Subpart T—SOP for the Phosphate Fertilizer Industry:  Wet Process
           Phosphoric Acid Plants
Subpart U—SOP for the Phosphate Fertilizer Industry:  Superphosphoric
           Acid Plants
Subpart V—SOP for the Phosphate Fertilizer Industry:  Diammonium
           Phosphate Plants
Subpart W—SOP for the Phosphate Fertilizer Industry:  Triple
           Superphosphate  Plants
Subpart X--SOP for the Phosphate Fertilizer Industry:  Granular Triple
           Superphosphate  Storage  Facilities
Subpart Y—SOP for Coal  Preparation  Plants
Subpart Z—SOP for Ferroalloy  Production Facilities
Subpart AA—SOP for Steel  Plants:  Electric Arc Furnaces
                                A-8

-------
                          Part 60  (concluded)
Appendix A—Reference  Methods.
Appendix B—Performance  Specifications.
Appendix C--Determination of Emission Rate Change.
Appendix D--Required Emission Inventory Information.


Authority:  Sections 111 and 114 of the  Clean Air Act, as amended by
            Section 4(a) of Public Law 91-604, 84 Stat. 1678
            (42 U.S.C. 1857c-6,  1857c-9).


Source:     36 FR 24877, December 23, 1971,  unless otherwise noted
            in the CFR.
                                 A-9

-------
           APPENDIX B
SOME SPECIFIC AIR QUALITY MODELS
                B-1

-------


     In Chapter IV  of this  report we subdivided air quality simulation
models into the following generic categories:

     >  Rollback
     >  Isopleth
     >  Physico-Chemical
        - Grid
        - Trajectory
        - Gaussian
        - Box

 In this appendix we associate with  each of these  generic  types a number of
 specific models.  We  include many of the models with which we are  familiar.
 Because the list is  intended only to be a representative  one, we do  not
 enumerate  all  available models.  Many others, particularly Gaussian  models,
 certainly  exist and would  be appropriate for use  in the proper circumstances.
 In  compiling this  list, we have drawn heavily from material in Argonne  (1977),
 EPA (1977a), and Roth et al. (1976), as well as various program users'
 manuals.   Also we  have made no attempt to screen the models for technical
 acceptability.

      Among the information contained  in the accompanying table is  the fol-
 lowing:   model developer,  EPA recommendation status, technical  description,
 and model  capabilities.  The last of  these is further subdivided  into
 source type/number,  pollutant type, terrain complexity, and spatial/
 temporal  resolution.
                                    B-2

-------
                                                                             TABLE B-1.     SOME SPECIFIC  AIR QUALITY  MODELS
[The column layout of this part of the table did not survive extraction.
Legible cell text is grouped below by model; "[illegible]" marks text that
cannot be recovered.]

Legible column headings:  Category; Method; EPA Recommendation Status;
Developer; Description.

ROLLBACK

   Linear Rollback
      EPA recommendation status:  Accepted by EPA for reactive and
                                  non-reactive pollutants; nonverifiable.
      Developer:                  EPA
      Description:                A linear relationship is assumed between
                                  [illegible] and peak pollutant level.
      Other legible cell text:    No treatment of individual sources.
                                  Percentage cutback required [illegible].

ISOPLETH

   EPA [illegible] Isopleth Method
      EPA recommendation status:  Not yet recommended (though active
                                  interest).
      Developer:                  EPA
      Description:                Isopleths of constant peak O3 on a plot of
                                  NOx vs. NMHC are constructed using a chemical
                                  kinetic mechanism tuned to fit smog chamber
                                  data for the isopleth asymptotes.  The diagram
                                  incorporates diurnal variation in solar
                                  radiation, and insolation, dilution, and
                                  inversion typical of a stagnant, mid-summer
                                  day in LA.  Entry to diagram with 6-9 a.m.
                                  [illegible].
      Other legible cell text:    No treatment of individual sources.
                                  Not considered (1-hour implied).
                                  Regional only (urban).
                                  Percentage cutback required [illegible].

   [Method name illegible]
      EPA recommendation status:  [illegible]
      Developer:                  [illegible] Applications (San Rafael,
                                  [illegible])
      Description:                Isopleth diagram is similar to one used in
                                  EPA method except for a constant [illegible]
                                  rather than a diurnally varying one.  Entry
                                  parameters also differ.  Absolute NMHC and NOx
                                  concentrations are used to [illegible].
      Other legible cell text:    No treatment of individual sources.
                                  Not considered (1-hour implied).
                                  Regional only (urban).
                                  Percentage cutback required [illegible].
PHYSICO-CHEMICAL

   Grid (region oriented)

   [The assignment of cells to individual grid models did not survive
   extraction.  Legible cell text:]

      Pollutants:      Total aerosols; four HC categories (single bond,
                       slow double bond, fast double bond, carbonyl bond);
                       three HC categories; species list largely illegible
                       (CO, H2O2, HO2, and RCO3 can be read).
      Terrain:         Surface roughness coefficients.  Horizontal features
                       can be handled through wind field; vertical features
                       through cell vertical dimension.
      Resolution:      As fine as input data resolution (time scale up to
                       [illegible] hours); as fine as grid cell.
      Output:          Spatial concentration [illegible] for each hour for
                       each pollutant of interest; vertical concentration
                       profiles; point predictions at monitoring stations;
                       concentration isopleths.  For a second model:  roughly
                       the same as for the SAI model.
      Applications:    Regional-scale problems; evaluation studies have been
                       carried out for LA, Las Vegas, and Denver, with
                       Sacramento and St. Louis soon to follow.  For a second
                       model:  regional-scale applications; an evaluation
                       study has been conducted for the [illegible] Bay Area.
      Description:     [illegible] mechanism dividing HC into "olefins and
                       highly reactive aromatics"; "paraffins, less reactive
                       aromatics"; "aldehydes, ketones"; only [illegible] is
                       considered; [illegible].
                                                                                                                                                                                         B-3

-------
                                                                         TABLE B-1 (Continued)
                          CM
                    **co»Mr*</] of Ut oanltlat.
l»f«m»tiw %t u location >«4 MM of
each, oartlcle nutt be oaintafned.  foa-
                                                           *w i«*«r
                                                                                                           *«•
                                                                               »u
                                                                                u HO.     C
                                                                                 . W,    so. torn**
J» rto* H  to ffl* «
           |r.de,)l
                                                                                         tkrouo> vlnd- (T1o*
                                                                                         field !•»»
                                                       .                         .
                                                    tur*>:  MM nrylny mt»(m and i-0
                                                    •In*; fi«* chwicil tftcin k,»«< •»
                                                                 m) (tortlim;
                                                               comUitt.
                                                                                                                                                    Ml
                                                                                                                                                                •nliMtfM ttuty
                                                                                                                                                                tat kMM can-
                                                                                                                                                                U latin |a»roo-
                                                                                                                                                                •ent idCn ootor.
                                                                                                                                                                vatlon not tM
                                                                                                                                                                toad for 1 tu-
                                                                                                                                                                «ow reported «o
                                                                                                                                                                Itn ptoar to
                                                                                                                                                                Ml.rt.tt •?.)
  trl«-$l»t1flc Sourcp Ortantad

EWtt--E»M and      No raeommdatton
                   »t«»*
EnviroiMOntal
Rcuarch and
TocRnoloajr
                                  Stiuliut tlw d<«p*n|UCIM>
                                                    >uck »( M*r field dl>p*n1«ii  f«r nigfc-
                                                    Miy tOuTCH, fwMfktim Mir to pain
                                                          .      i-ii,, nwpir.
                                                    tun*:  tlM Mnrliif oliitaM ••< f-«
                                                    «r vuwto-d-0 idndtt tM M*ef«t eM»-
                                                    litrr/dK«/) m t/tmt rlM foMU
                                                    ••Utiwu «»M4 Mil »)u4i tta*
                                                    («• mrtfetl dlffmlHtgri kvlmut
                                                    dirrmivl^r eftM Mil
                                                                                                                        to fine at  to  fin. at    Concontrat«a
                                                                                                                        Input dtt*  arid col)     ffolda at
                                                                                                                                   »e nod for    tloni

                                                                                                                                   tola *a»H-
                                                                                                                                            'todltt (2-0
                                                                                                                                            »r»»a»)! "
                                                                                                                                            Mt of J-fl .rr-
                                                                                                                                            t«M In lonj-
                                                                                                                                                 ~~bi tranm-
E>MriMt1M of
rilM |Mct ill
       tWrMn
                   lUtm
                                     SCi
                   Inc.  (Stl«v*
                   «t «1.)
                                 *. 1-0 •
                                 1* aw
               «» Brt»llUd
CickmrBtotr
ind mrttnti
                   itatin
                            (KtiM
                                     Cntnt
                                     •nMrc*
         »t «).
                              >t«M
                                     UK (U»U
                                     brt*ra. tt);
                                     HOD off *r*d
                                     by CUT
                                     Ptelflc
                                     tiwiranimul
                                     Wrvlcn.
                                     Iw.  (butt
                                     Noxtu, C*(
                                                            4-tmd
                                                              '

                           to Miit tttrtn I*
                         .  Salvm conurntlw «f
              •Mi •OMtllm.  It wlnlltM It* wn
              1-D irtfld f «»ld n«n| potMrtUI
              ld> art t*f*t: f in it«»»«ud wi«f a 1Z-*U*
                                 raacttw •Kknldo); UM Mrttul <•)«••
                                 4« ldod tit* umnl e»l»»; *ortt
                                     dlffutlvltr *> a fwcttM of
                                                    ator* (radlont and kotoM o«M* surlocoi
                                                                d t»rc«w ara
                                 f loutiUd n**f • tt-iuo wtiun««a
                                 dnloMd to (roat anpirlaii* *nd 1«>*
                                 roacttw Hti M wttcal dlf'utivltf
                                 •Ndtd; tartulDM ara camortro U
                                 •quInlMt aroprtxw and UK.
                                                                                               u» (• w
                                                                         CooolM      to fIM at M fine M
                                                                                      Input data orld col)      tlal
                                                                                      rewlktlan tin
                                                                                                                                                                  at orld »•»-
                                                                                                                                                                  tiom
                                                     Itno
                                                     Ana
                                                     rionted
                                                                                                                         , ca.
                                                                                           IM P»H
                                                                                           toqul fao-
                                                                                           turx tan kt
                                                                                           hanolod tkra
                                                                                           tnt Kind
to fiM at Only olano;
tooot oata trajectory
rowlotlo* treat *or-
          ora) cowl*     air
                                                                                                                                                        ttdo.
                                            ftudlM;
                                                                                                                                                               toojltm. for
                                                                                                                                                               ••Mflti lOOClflC
                                                                                                                                                               ••(••in art
                                                                                                                                                                        •Itor
                                                                                                                                                              Sonera tint St«-
                                                                                                                                                              tloariW. Mid
                                                                                                                                                                                         . statlM
                                                                                                                                                                                 (*T.. M/M}. <-j»
                                         ftadtaml oaidint;
                                         aaolM W U
                                         latin (tin dan
                                                                    , M).    Mot tipllclt  to flM at  «»1y
                                                                   f. CO.   loot l»pH.    «o»«t *»t*  —-•
                                                                   \, C»%  IMUI fM-   ro»oto
-------
                                                                                                      TABLE B-1   (Continued)
                            CM
                       >«CO*M ndatlon
                          Status
                                           Developer
 JUTTSIH
                     Mo recommendation    l*U
                     status
   Trajectory-Specific Source Oriented
                     Ho
                     sUtn
                     I/It MB
                     citlom  (for
                     th* California
                     Air Resources
                     »oard~CMm)
LAPS
                                                                       Description
                 The model is trajectory oriented and
                 intended to be used for regional
                 application.  It appears to be similar to
                 DIFKIN in that the air column allows up
                 to 10 vertical layers.  Features:
                 hourly emissions and horizontal 2-D
                 winds are input; simulated species
                 include four HC classes (alkenes, alkanes,
                 aromatics, and aldehydes) as well as
                 oxidants, SO2, and sulfate; 54-step
                 mechanism is employed; no horizontal
                 diffusion; vertical diffusivity specified at up
                 to 10 vertical levels with time variation.


                 The model is designed to estimate
                 concentrations of reactive species downwind of
                 a single point or areal source.  Based on a
                 Lagrangian (moving-with-air-parcel)
                 version of the mass conservation equation,
                 allowing for background entrainment, the air
                 parcel containing the emitted pollutants
                 is allowed to drift downwind.
                 The parcel expands from the plume height
                 according to measured plume width and depth
                 as functions of downwind distance or to
                 the Pasquill-Gifford methods.  Features a
                 modified H-S-D mechanism for HC-NOx-SO2;
                 1-D wind field; plume rise input.


                 The model is designed to calculate
                 concentration fields downwind of single or
                 multiple concentrated sources.  The air parcel
                 is allowed to drift downwind, dispersing
                 laterally and vertically.  Features:
                 equilibrium coupling of NO, NO2, and O3; first
                 order conversion of SO2 to sulfate; eddy
                 diffusivities; 2-D wind field; Briggs plume
                 rise; up to 7 species can be specified.
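
The "first order conversion of SO2 to sulfate" feature listed above can be illustrated with a minimal sketch. It is not taken from the model itself: the function name, the rate constant, and the use of travel time as the independent variable are assumptions made only for illustration.

```python
import math

# Molecular weights (g/mol) used to convert reacted SO2 mass to sulfate mass.
MW_SO2 = 64.06
MW_SO4 = 96.06

def so2_to_sulfate(so2_initial, rate_per_hour, travel_hours):
    """First-order decay of SO2 along a parcel trajectory (illustrative only).

    so2_initial   -- SO2 mass concentration at the source (any mass unit)
    rate_per_hour -- hypothetical first-order rate constant, 1/hour
    travel_hours  -- travel time of the air parcel, hours
    Returns (remaining SO2, sulfate formed) in the same mass unit.
    """
    remaining = so2_initial * math.exp(-rate_per_hour * travel_hours)
    converted = so2_initial - remaining
    # Each unit mass of SO2 that reacts yields MW_SO4 / MW_SO2 units of sulfate.
    sulfate = converted * MW_SO4 / MW_SO2
    return remaining, sulfate
```

A real application would take the rate constant from the model documentation or from measurements; no value is suggested here.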
Sources
Number: Any number
Type: Point, area, line

Pollutant Type: O3, NO, NO2, SO2, sulfate; four HC groups (alkenes, alkanes, aromatics, aldehydes)

Complexity: Not explicit, but horizontal features can be handled thru wind field

Resolution
Temporal: As fine as input data resolution
Spatial: Only along trajectory track; several could be run side-by-side

Form of Output: Temporal concentration history in air parcel

Problem Addressed: Regional oxidant; applied to Las Vegas (UTS), Troctet (If?*), and SF Bay Area (1*74), as well as LA Basin (U7J- Eschenroeder and Martinez)
Single
sovrce
   , NO.     No terrain
  j,S02,   Interaction
Sulfate     currently
                           Resolution
                           Temporal: As fine as input data resolution; long-range transport as well
                           Spatial: Resolution all the way to source (near-, medium-, and far-field)
                           Form of Output: Temporal concentration history in downwind direction
                           Problem Addressed: Single source problems, e.g., refineries, power plants for fumigation; trapping; applied to: Moss Landing PP, Monterey; Los Alamitos PP, LA;
                                                                                     Up te 10
                                                                                     Mlat
                                                                                     sources;
                                                                                     uparat*
                                                                                     •real
                                                                                     lources
            •olnt. areal.  Oj. NO.     No terrain    As fine as  Resolution     Vertical co»
            rlevated       NO,, S0>.   Interaction   Input data  all the way    tratlem amps
            «urces        lulfate                  resolution  to source      vert,  m 20 hi
                                                                                                                     Poi
                                                                                                                     e
                                                                                                                     sources
                                                                    Oil i, LA; romr
                                                                    Conuiis PP. farmfme-
                                                                    ton. M; Noons PP.
                                                                    Nook*. NH; Jefferson
                                                                    PP. Jefferson. Teus


                                                   Form of Output: Vertical concentration maps (10 	 « 20 hours sta.); ground concen. maps and contours; concen. vs. dist.; ground concen. crossplot
                                                   Problem Addressed: Single or few sources problems; only analytical problems attempted, e.g., steady-state Gaussian plumes
  Long-Term Averaging
AQDM--Air
Quality Display
Model
Recommended by
EPA in guideline
(No. 2t)
Developer: TRW (for Public Health Service)
Description:
                This is a climatological steady state
                Gaussian plume model that estimates the
                annual arithmetic average SO2 and particulate
                concentration at ground level.  A statistical
                model based on Larsen (1969) is used to
                transform the average concentration data
                from a limited number of receptors
                into an expected geometric mean and maximum
                concentration values for several averaging
                times.  Features:  treats one or two
                pollutants simultaneously; Holland (1953) plume
                rise; no plume rise for areal sources; no
                temporal variation in sources; 16 wind
                directions; 6 wind speed classes; 5 stability
                classes (Turner, 1964); Pasquill-Gifford
                stability coefficients; no chemical
                mechanism; perfect reflection at ground; no
                effect of mixing height until σz = 0.47H
                (when x = xL); for x > 2xL, uniform mixing;
                no variation in wind speed with height;
                linear superposition of sources;
                σz(x) = ax^b + c; does not treat fumigation
                or downwash; Larsen procedure assumes
                log-normal concentration distribution and
                power law dependence of median and maximum
                concentrations on averaging time.
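
As an illustration of how the features listed above combine, the following sketch evaluates a sector-averaged, ground-level Gaussian plume concentration with perfect reflection at the ground, using the power-law form σz(x) = ax^b + c and 16 wind-direction sectors. It is a textbook form under stated assumptions, not the AQDM code: the function names and all coefficient values are hypothetical, and mixing-height effects are left out.

```python
import math

def sigma_z(x_m, a, b, c):
    """Vertical dispersion coefficient, sigma_z(x) = a * x**b + c (metres)."""
    return a * x_m ** b + c

def sector_avg_conc(q, u, x_m, h, a, b, c, n_sectors=16):
    """Ground-level, sector-averaged concentration downwind of one source.

    q    -- emission rate (mass per second)
    u    -- wind speed (m/s), taken as constant with height
    x_m  -- downwind distance (m)
    h    -- effective plume height (m)
    a, b, c -- hypothetical sigma_z coefficients for one stability class
    The crosswind distribution is taken as uniform over an arc of width
    2*pi*x/n_sectors; the leading factor 2 is the ground reflection term.
    """
    sz = sigma_z(x_m, a, b, c)
    sector_width = 2.0 * math.pi * x_m / n_sectors
    return (2.0 * q / (math.sqrt(2.0 * math.pi) * sz * u * sector_width)
            * math.exp(-0.5 * (h / sz) ** 2))
```

In a climatological calculation of this kind, a value such as this would be weighted by the joint frequency of each wind direction, wind speed class, and stability class, and then summed over all sources (linear superposition).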
Nany (up    Point, areal.
to M user> elevated
specified   sources
recta-tor
locations;
•• **m
receptors
located on
a uniform
                          SO,. TSP
                          (could bo
                          used for
                          MO, Mitt
                          NOj ob-
                          tained
                          thru use
                          of an
           Relatively
           flat ter-
           rain; no
           Height dif-
           ference
           allowed
                                                                                                                                               source and
                                                                                                                                   appropriate receptors
                                                                                                                                   factor)
Resolution
Temporal: Steady-state; averaging time = 1 mo. to 1 yr.; Larsen procedure can be used to transform to 1-24 hour averages
Spatial: Regional scale
Form of Output: 1 mo.-1 yr. averaged concentrations; individual point, area source culpability list for each receptor
Problem Addressed: Regional long-term averages for relatively inert pollutants; urban areas primarily
                                                                                                                                                                                                  B-5

-------
                                                                                                      TABLE B-1   (Continued)
                             CM
                           Recommendation
                           Status
                                           Developer
                                                                       Description
                                                                                             Sources
                                                                                                                                               Resolution
                                                                                                                                                              Temporal     Spatial
                                                                                                                                                                       Form of
                                                                                                                                                                       Output
                                                                                                                                                                                                          Addressed
 CDM and CDMQC--
 Climatological
 Dispersion Model
 Recommended by
 EPA in guideline
 (No.  27)
 This is a climatological steady state
 Gaussian plume model for determining long-term
 (seasonal or annual) arithmetic average
 concentrations
                                       ference       1 no. to
                                       allaxed ot*   1 yr.;
                                       timon source  larsa* pro-
                                       end receptors cedure can
                                                    bo Meal to
                                                    transform
                                                    to I-74
                                                    now
                                                    arareees
                           icale
                                         Form of Output: 1 mo. to 1 yr. averaged concentrations; source-receptor culpability list (CDMQC only)
                                         Problem Addressed: Regional long-term averages for relatively inert pollutants; urban areas primarily
 TCM--Texas Climatological Model
 Recommendation Status: No recommendation status
 Developer: Texas Air Control Board
 Description:
                                                          This is a climatological steady state
                                                          Gaussian plume model similar to CDM but
                                                          incorporating design features "reducing run time
                                                          by as much as two orders of magnitude."
                                                          Features:  downwash and fumigation not
                                                          considered; all sources have a single average
                                                          emissions rate for the averaging period (i.e.,
                                                          month, season, year); Pasquill-Gifford-Turner
                                                          stability classes; mixing height not a factor
                                                          because no effect for typical climatology.
                                                                                    IMItatted   Point. Mne.
                                                                                    (arbitrary  areol. ele-
                                                                                    receptar    «ate*
                                                                                    location—  sources.
                                                                                    MI iOde)  tall stack-
                                                                                               sources la
                                                                                               skort-teni
                                                                                                     tial-
                                                                         SO}. TSP.   Dolatlvely
                                                                         CO. NO,     flat ter-
                                                                                     rain; M
                                                                                     talent aHf
              Staedy-
              itate;
              evenelng
              time •

aliened ke-    I yr!:
              tarsen pro-
                                                              «0«tOMl
                                                                                                                                       ke used to
                                                                                                                                       transform
                                                                                                                                       to I-M
                                                                             tlon; concentn-
                                                                             tia* it (rid
                                                                             points (op to
                                                                             SOrfO): a llst-
                                                                             in§ of tke *
                                                                             kloMst contrl-
                                                                             kutart toccn-
                                                                             centratlom at
                                                                             eack (rid point
                                                        for relatively
                                                        inert pollutants;
                                                        urban areas primarily
ttTW—can
also be used
for short-term
averaging
No recommendation
status
                        CUT
This is a steady-state sector-averaged
Gaussian plume model that calculates
concentrations of up to six pollutants from an
unlimited number of point, line, and areal
sources.  The model can be operated either
in the "climatological" mode or the
"sequential" mode for short-term averaging times.
Features:  crosswind dispersion function may
be sector-averaged over 22.5°; for "sequential"
mode and "tall stacks," the crosswind
dispersion function is given by the expected value
within the 22.5° sector for receptors within
the downwind sector; for receptors adjacent
to the downwind sector, a formulation is used
which avoids CMttrltM one-hour values when
accumulating concentration estimates for
multiple hour averages; Briggs plume rise;
stack tip downwash (Gifford) for tall stacks;
wind speed power law; half-life decay
factors for species; chemistry not treated
directly; perfect reflection at ground and
mixing layer; unique emissions rate for each
source that may be varied diurnally, weekly
or monthly; 5 stability classes.
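
Two of the features named above, the wind speed power law and the half-life decay factor, have simple closed forms, sketched below. The function names, the 10 m reference height, and the default exponent are assumptions for illustration only and are not taken from the model.

```python
def wind_speed_at_height(u_ref, z, z_ref=10.0, p=0.25):
    """Power-law wind profile: u(z) = u_ref * (z / z_ref) ** p.

    u_ref -- wind speed measured at the reference height z_ref (m/s)
    z     -- height of interest, e.g. stack top (m)
    p     -- hypothetical exponent; in practice it depends on stability class
    """
    return u_ref * (z / z_ref) ** p

def decay_factor(travel_time_s, half_life_s):
    """Fraction of a species remaining after travel_time_s, given its half-life."""
    return 0.5 ** (travel_time_s / half_life_s)
```

With these, a non-reactive plume concentration computed at distance x would be multiplied by decay_factor(x / u, half_life), where u is the transport wind speed.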
unlimited   Point line
(up to la  areal elm-
                                                                                    point! at   -tall  itert-
                                                                                    any Mine-   sources  In
                                                                                    ted I oca-   inert-term
                                                                                    tlgns)      -laouentlal-
Flit and      juady-    *e(1anal
kllly tor-    stetet can scale
rain; a tall  handle
stack- tar-   skort-torm
rain correc-  in 'toouan
tlon is evall-tlat- moot
amle for 'so- (I. 1. •.
eumtlal-     and M hr)
node aut not  and lenm.
•cllmtcoloel-  term In
cal- name;    •cllmeto-
alsa i unloue lofictl-
elevation     mode (I ma..
can ke fpecl-  seasonal.
flea1 for re-  I  yr.)
copters;
plus* and
mlileo depth
respond to
terrain
obstacles
                                        CancantntloM
                                        at eack
                                                                                              avaraea* for
                                                                                              relatively IM
                                                                                              pellotanUi
                                                                                                                                                                                      B-6

-------
                                                                                                                     TABLE B-1  (Continued)
Cite o» rr
                                                        Developer
                                                                                   Description
                                                                                                                                                                                                        Form of
                                                                                                                                                                                                        Output
               Short-Term Averaging

             APRAC-1A
Recommended by       EPA (devol-
EPA In guidelines    oped by
(No. 14 and 35)      Stanford
                     Research
                     Institute)
             CRSTER--
             this also can be
             used for annual
             averaging
Recommended by
EPA in guidelines
(No. 13)
This Is a model which calculates hourly          Many        UM           CO. TSP
average CO concentrations for urban              (an exten-
areas.  Contribution from dispersion on          live traf-
1 scales are calculated:  extraurban,            flc tnveft-
matnly from sources upwind of city of            tary alue which If used thereafter);       4 lutera-
and local. I ram street canyon effects.           ally dt-
Featuni:  no pliant rise, fumigation or          fined re-
downwash; helical circulation In street          ceptors ere
canyons; hourly varying traffic emissions        used on
and 2-0 Kind field; oz(i) • axb; link            each street
Missions are aggregated Into area               where
sources; no wind power lav: 6 stability          street can-
classes (Turner); dispersion coefficients        yon affects
from HcClroy and Pooler (IMS), modified         are con-
using Letghton and OltMr (1953): no chem-       sldered
Istry; perfect reflection at surface and
inversion (Ignores latter until concen-
tration equals that calculated using bo«
model and uses that thereafter).

Steady-state Gaussian plume model appll-         Single      Point          CO. S0>.
cable In uneven terrain.   Features:  7           source up                 HO,. TSP
stability classes (Turner. Pasquill);            to l»
dispersion coefficients from Turner; no          stacks (all
chemistry-. Irlggs plums rise; no fumigation      assumed at
or downwash; perfect reflection at surface       the saee
                                    A model used to calculate dispersion from
                                    urban area sources.  Analytic integration of
                                    area sources.  All sources upwind of each
                                    receptor area are summed.  It is most
                                    applicable in areas where no point source
                                    information is available.  Features:  perfect
                                    reflection at ground; mixing height reflection
                                    not considered; hourly emissions and winds;
                                    σz(x) = ax^b; dispersion coefficients from
                                    Smith (1968); stability classes from Smith;
                                    narrow plume approx. (no horizontal
                                    dispersion); no plume rise; no chemistry.

                          EPA       Steady-state (S-S) Gaussian plume model that
                                    computes the hourly concentrations of non-
                                    reactive pollutants downwind of roadways.
                                    Based on analytic integration of line source.
                                    It is applied to each lane of traffic.  Fea-
                                    tures:  no chemistry; perfect reflection at
                                    surface and inversion; one road or highway
                                    segment per run; 6 stability classes (Turner);
                                    dispersion coefficients from Turner; for dis-
                                    tances < 100 m, coefficient from Zimmerman
                                    and Thompson (1975); no wind power law;
                                    hourly emissions and 2-D wind.

                         EPA        An S-S Gaussian plume model that considers
                                    multiple point sources.  It is based on
                                    linear additivity of individual source
                                    effects.  Features:  hourly emissions and
                                    winds; Briggs plume rise; no fumigation or
                                    downwash; no wind power law; Turner stabil-
                                    ity classes and dispersion coefficients
                                    (horizontal and vertical); no chemistry;
                                    (multiple reflection).
                                                                                                                       Haay
                                                                          SO,. TSP.
                                                                                                                          Simple
                          1-hr.. 8-hr. Regional
                          and 24-hr.
                          averages
                          (although
                                                                                                                                                                   Hourly concern-
                                                                                                                                                                   tration values
                                                                                                                                                                   at receptors
                                                          Regional prmelmms
                                                          Involving 1«art
                                                          pollutants: erban
                                                Up to 24    line
                                                (arbitrary
                                                receptor
                                                and release
                                                kcfotts)
                                                Up to ZS    Point
                                                (up to 30   (alevated)
                                                receptors)
                                                                                                                                                CO. TSP
            Level ter-
            rain
                                                                                                                                                                          average
                                                                                                                                                                          can  be
                                                                                                                                                                          estimated
              Hourly
              (1-24 hr.
              average)
                                                                                                                                                    hear to
                                                                                                                                                    medium field
                                                                                                                                                    downwind
One-hr.  aver-
age concentra-
tions at each
receptor
Regional- or
highway-specific
problems for
nonreactlve
pollutants
S02. TSP
Flat
terrain
                                                                                                                                        Hourly
                                                                                                                                        (1-24  hr.
                                                                                                                                        average)
                                      Regional       Hourly concen-    Regional ,	
                                                     trations; source  source problems;
                                                     contribution       	
                                                     list at each
                                                     receptor; aver-
                                                     age concentra-
                                                     tions
                                                                                                                                                                                                                      urban area; 	
                                                                                                                                                                                                                      reactive pollu-
                                                                                                                                                                                                                      tants

-------
                                                                                               TABLE B-1  (Continued)
                             EPA
                       Recommendation
                           Status
                                           Developer
                                                                      Description
                                                                                                                                      Capabilities
 PTDIS
                                 by EPA
                     Recommended by EPA
                                              EPA        A steady-state Gaussian plume model that
                                                         estimates short-term center-line concentra-
                                                         tions directly downwind of a point source.
                                                         Features:  same as PTMTP.
                                                         An S-S taussian plumt model Out Until the
                                                         maximum short-term concentrations fro* t
                                                         single point court* as • function of sta-
                                                         bility and wind speod.  Features:  same ) - l.»H  and uniform mlilnf
                                    thereefter; mliint kelakt determined fron
                                    dice-daily tenperature  soundlnos (stability
                                    class, also).
                                                                                   Niny        Point           SO., TV    Flat          Hourly      «e»lon»l
                                                                                   (recevters  areal                      terrain.       >nd         (urban) and
                                                                                   ira all at                                          iveraond    rural
                                                                                   tne un                                            uo to
                                                                                   Mlekt)                                             24 tirs.
                                                                 Hourly and aver-  national proolen
                                                                 aee concentre-    for nonreectlve
                                                                 ttans at recee-   pol Intents; urban
                                                                 tent 1 tori ted     ai
                                                                 lawn contrlbu-
                                                                 tlan list: cunu-
                                                                 latlv* fiaouancj
                                                                 alstrtbutton date
VALLEY--
This can also be
used for annual
averaging (cli-
matological
mode)
 TEM--Texas
 Episodic Model
•ecomiended by EPA        CPA        An S-S GawtUn oluw uadel  for calcutatlnf
In guidelines                        annual and nMlnun 24-keur evernoe  SOf and TSP
(Ho. 14)                          '  from single point sources  In coonlex  terrain.
                                    Features:  cllnBCologlcal  an*  short-tern
                                    nodes; 16 »1od directions  and  ( «lnd  speed
                                    cateaortes; Irlggs plum rise  (1971.  1972);
                                    i stability (urban) classes  (Turner.  I9M):
                                    dispersion fron Pasoulll (1961) and Sir ford
                                    (1961); 6 stability classes  for rural; no
                                    «ind power leu; exponential  decay for choo-
                                    Istry and removal.

             ition    Texas Air      An S-S Gaussian plune model for predicting
                     Control         short-tern concentreclons (10 oiin.  to 24
                     Board          hour)  fron) nultlple point and area sources.
                                    Calculations  are porfomed for I u 24
                                    scenarios (eateoroloay. averaging tine, end
                                    mrtxinej height).   Features:  Irtggs plune, rise;
                                    mixing height penetration factor; up te S
                                    pollutants; m chemistry but exponential
                                    decay, no domwash on fumigation; »iod paamr
                                    !«•; rasauf 11-Cl'ford-Turner stability classes
                                    dispersion coefficient  from Turner; perfect
                                    reflection from surface and inversion until
                                    «, - 0.47H
No
status
Point, anal
(treated as
a point
      )
                                                                                                                                   SO., TV
                                                                                   Up u SO
                                                                                   (112 re-
                                                                                   captors on
                                                                                   radial
                                                                                   grid; can
                                                                                   be at dif-
                                                                                   ferent
                                                                                   taoelagi-
                                                                                   cal bgts.)
                                                                                                         Up to M   Point, area I   SOj. TV
                                      Conplex
                                      terrain
 Short- and  leglanal
 long-Urm   (urban and
 average     rural)
 (24-hr, and
     ml)
                           Flat
                           terrain
                                                                                                         us U200
                                                                                                         araal
                                                    Short-term  neglonal
                                                    (I. 1. 24   (urban)
                                                    hrs.)
                                                                                                         (mote
                                                                                                         iOxSO re-
                                                                                            -
                                                                              ,._'»««*»st
                                                                              24-hr,  ceneon-
                                                                              tratlan.  source
                                                                              contribution
                                                                              llsti long-term
                                                                              node: arithmetic
                                                                              emams.  source
                                                                              cantriWttam
                                                                              list
                           rssan concentre-   Imgienel •rumlemm
                           timm for each     for nemraactlve
                           grid point  (10    pollutants; urban
                           min.. 30 mln..    area
                           1 hr., 1 hrs..
                           and 24 hrt.);
                           printed  plot;
                           culpability Hit
                                                                                                         grid)

-------
                                                                                                       TABLE  B-1  (Concluded)
                            EPA
                      Recommendation
                          Status
                                                                      Description
TAPAS--Topographic
Air Pollution
Analysis System
AQSTM--Air Quality
Short Term Model
No recommendation
status
CALINE-2
                     No recommendation
                     status
No recommendation
status
USDA Forest     This model combines a simulation of the
Service         wind field over mountainous terrain with a
                Gaussian derived diffusion model.  It pro-
                vides an estimate of the total allowable
                emissions within each of a number of grid
                cells (ranging from 0.2S km* to » km2) to
                maintain a preselected level of air quality.
                The diffusion model Is employed in each grid
                cell to provide an estimate of the mixing
                conditions within these cells.  These con-
                ditions are combined with the Pollutant
                Standards Index such that a maximum allowable
                emission is calculated.  Features:  wind
                model (Cressman objective analysis, poten-
                tial flow over topography, influences of sur-
                face temperature and roughness); Gaussian
                model (σy and σz from Turner, effects of
                mass flow divergence included, stability
                classes from Turner, no upper bound on diffu-
                sion although the wind is calculated assum-
                ing a lid at a specified height above the
                topography); the calculated wind follows the
                terrain and thus gives a vertical wind com-
                ponent; no chemistry; no explicit treatment
                of plume behavior.

Illinois        An S-S Gaussian plume model for estimating
Environmental   short-term concentration averages from mul-
Protection      tiple point sources in level or complex ter-
Agency          rains.  It can simulate late inversion break-
                up fumigation, lake shore fumigation, and
                atmospheric trapping.  Features:  one or two
                pollutants simultaneously; no chemistry;
                Briggs plume rise; no downwash; wind power
                law; user-supplied stability classes; disper-
                sion coefficients from Turner (1969); perfect
                reflection at ground and mixing height.
                     California      An S-S Gaussian line source model for traffic
                     Air Resources   impact assessment.  Features:  no chemistry;
                     Board (CARB)    perfect ground reflection; Pasquill stability
                                     classes; hourly emissions; some accounting
                                     for depressed highways.
                                         Atmospheric     The region of interest is assumed to be
                                         Turbulence      encompassed by a single cell or box,
                                         and Diffu-      bounded by the inversion above and the
                                         sion Labor-     terrain below.  All concentrations are
                                         atory--ATDL     assumed to be in steady-state.  Features:
                                         (Oak Ridge,     for given time, constant emissions rate
                                         Tenn.)          and simple winds; seven-step chemical
                                                         mechanism proposed by Friedlander and
                                                         Seinfeld (1969); uniform and constant
                                                         wind and constant mixing depth.
Stt

Many





urces
Tvpe
Point (no
distinc-
tion made1
between
point, line.
and a real
sources
in
Type Complexity
SO*. TS». Complex
CO





Resolution
Temporal Spatial
Both short- Limited
term and regional
long-term
estimates




OutDUt
Allowable emis-
sions In each
grid cell for
each pollutant
of Interest



Addressed
limited regional
Impact problems
in complex ter-
rain; nonreactlve
pollutants


Up to 200 Point
sourcti (elevated)
(up to goo
receptors
located on
a unlforei
rectangular
grid);
unieue to-
pographic
elevation
far bout
«•«» (an line
ei tensive
traffic
inw'nr Is
reqt. -11
All emitting
Inte a single
box







SO2, TSP Mostly flat
terrain, but
some correc-
tions for
complex ter-
rain






CO Relatively
flat terrain



Oj. NO. Not explicit
1OI. IMC








Short-tene Regional
(1. 3. and
24 hr.
averaging)








Short-term



Temporal No resolution
resolution
can be
obtained by
varying
Initial
conditions
to match a
temporal
pattern
Average concen-
trations at re-
ceptors; source
contributions
at receptors







Hourly
concentrations
at receptors


Concentration
values at the
time considered







Regional point
source problems
for nonreactive
pollutants; urban
areas; shorelines'







Regional CO
problems
from traffic
sources

Regional oxidant;
it was applied
to LA Basin (30
Sept. 1969 data).
Ozone predictions
were low.





-------
               APPENDIX C
SOME SPECIFIC MODEL PERFORMANCE MEASURES

-------
     Having discussed model  performance measures in generic terms in
Chapter V, we now present some specific examples.  We discuss each of the
four generic types of performance measures:  peak, station, area, and
exposure/dosage.  We include scalar,  statistical, and "pattern recogni-
tion" variants.

1.   PEAK PERFORMANCE MEASURES

     The use of a performance measure of this type requires the modeler to
know both the predicted and the "true" concentration peak.  The measurement
network must be situated so as to ensure a high probability of sensing the
"true" peak concentration or a value near it.  There are three characterizing
parameters of interest:  peak concentration level, spatial location, and time
of occurrence.  The predicted and observed values of some or all of these may
be available for comparison.  Differences between their predicted and observed
values represent the performance measures of interest.  These peak measures
are summarized in Table C-1.

     Each measure conveys separate but related information about model
behavior in predicting the concentration peak.  Their values should be
examined in combination.  Several combinations of interest and some of
their possible interpretations are shown in Table C-2.  The table is not
intended to include all combinations and interpretations.  Rather, it
illustrates by example how inferences can be made about model performance
through the joint use of performance measures.

-------
                TABLE C-1.   SOME PEAK PERFORMANCE MEASURES

    Type                Performance Measure

    Scalar              a.  Difference* in the peak ground-level
                            concentration values.
                        b.  Difference in the spatial location of
                            the peak.
                        c.  Difference in the time at which the
                            peak occurs.
                        d.  Difference in the peak concentration
                            levels at the time of the observed
                            peak.
                        e.  Difference in the spatial location of
                            the peak at the time of the observed
                            peak.

    Pattern             Map showing the locations and values of the
    recognition         predicted maximum one-hour-average concen-
                        trations for each hour.

    * "Difference" as used here usually refers to "prediction minus
      observation."
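
     The scalar measures in Table C-1 can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions, not a method
given in this report:  the function name and argument layout are
hypothetical, the "true" peak is assumed to be the largest value in an array
of observations, and the location differences (b) and (e) are expressed as
straight-line distances.  Every difference is taken as prediction minus
observation.

```python
import numpy as np


def peak_measures(pred, obs, xy, hours):
    """Scalar peak measures (a)-(e) of Table C-1 (hypothetical helper).

    pred, obs : arrays of shape (n_hours, n_sites) holding concentrations
    xy        : array of shape (n_sites, 2) holding site coordinates
    hours     : array of shape (n_hours,) holding the clock hour of each row
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    xy = np.asarray(xy, dtype=float)
    hours = np.asarray(hours)

    # Hour index and site index of the predicted and observed peaks.
    tp, sp = np.unravel_index(np.argmax(pred), pred.shape)
    to, so = np.unravel_index(np.argmax(obs), obs.shape)

    # Site of the largest predicted value at the hour of the observed peak.
    sf = int(np.argmax(pred[to]))

    return {
        # (a) difference in peak concentration values
        "a_level": pred[tp, sp] - obs[to, so],
        # (b) distance between predicted and observed peak locations
        "b_location": float(np.hypot(*(xy[sp] - xy[so]))),
        # (c) difference in the time at which the peak occurs
        "c_timing": hours[tp] - hours[to],
        # (d) level difference at the time of the observed peak
        "d_level_fixed_time": pred[to, sf] - obs[to, so],
        # (e) location difference at the time of the observed peak
        "e_location_fixed_time": float(np.hypot(*(xy[sf] - xy[so]))),
    }


if __name__ == "__main__":
    pred = [[1, 2, 3], [4, 5, 6], [2, 10, 7]]
    obs = [[1, 2, 2], [3, 5, 8], [2, 4, 6]]
    xy = [[0, 0], [3, 4], [6, 8]]
    print(peak_measures(pred, obs, xy, hours=[8, 9, 10]))
```

     Measures (a) through (c) are event-related, while (d) and (e) hold the
time fixed at the hour of the observed peak.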
     Table C-2 illustrates several points.  While a large difference in
peak concentration levels might in itself be sufficient reason to question
a model's performance, a simple difference in peak location might not.  If
the concentration residual (the difference between predicted and observed
values) at the peak is small (good agreement) and yet there is a difference
in the spatial location of the peak, this may be due mostly to slight errors
in the wind field input to the model.  The slight offset in the location of
the peak might cause predicted and measured concentrations to disagree at
specific monitoring stations, particularly if concentration gradients within
the pollutant cloud are "steep."  However, a small displacement in the
concentration field, unless it resulted in a large change in population
exposure and dosage, may not be a serious problem.  Model performance might
otherwise be acceptable.

-------
        TABLE C-2.   SEVERAL PEAK MEASURE COMBINATIONS OF INTEREST
                    AND SOME POSSIBLE INTERPRETATIONS

            Residual Values
Concentration
    Level       Location    Timing      Some Possible Interpretations

Event-Related*

    Small       Small       Small       Model performance in predicting the
                                        concentration peak is acceptable

    Small       Large       Small       Model performance is still good in
                                        predicting the peak concentration
                                        level
                                        There is a possible error in the
                                        wind field input

    Small       Large       Large       Concentration level prediction is
                                        good
                                        There is a possible error in wind
                                        field input
                                        There is a possible error in the
                                        chemistry package or emissions input

    Large       Any value   Any value   Model performance is probably
                                        unacceptable

Fixed-Time†

    Large       Large                   Model performance may or may not be
                                        acceptable; event-related (peak)
                                        residuals must be examined to make a
                                        final judgment

    Large       Small                   Model performance is probably
                                        unacceptable
                                        Pollutant transport is handled accep-
                                        tably well
                                        There is a possible error in the chem-
                                        istry package, the emissions input,
                                        or the inversion height time and
                                        spatial history

* Residual values are calculated at the time an event occurs (the peak).
† Residual values are calculated at a fixed time (the time of the observed peak).

-------
     On the other hand, if the spatial offset of the location of the peak
is accompanied by a significant difference between the predicted and observed
times at which the peak occurs, more serious problems might be suspected.
Not only might there be a wind field problem, but the chemical kinetic
mechanism may be giving erroneous results (if the pollutant species  of
interest is a reactive one).  Alternatively (or additionally), one might
suspect that the emissions supplied as input to the model  were not the same
as those injected into the actual atmosphere.  Another possibility also
exists.  Slight differences between the modeled and actual wind field
might result in the air parcel in which the peak occurs following a  space-
time track having sufficiently different emissions to account for differences
in peak concentration values.

     Additional clarity of  interpretation can be achieved in another way.
We can compare concentration  level, location and timing, not just at the
time a specific event occurs  (the peak, for  instance) but also at a  fixed
time (the time at which the observed  peak occurs, for example).  Suppose
that the concentration level  residual at that fixed  time  (the difference
between maximum predicted concentration and  the observed peak value) is
large but the spatial one is  not.   In this case, one could conclude that
the model reproduced the pollutant  transport process but was  unable to
predict concentration levels.  This could result from many causes, among
which are errors in the chemical kinetic mechanism,  the emissions input,
or the inversion height space/time  profile.  Whatever the cause, however,
the conclusion remains the  same:  Model performance  is probably  inadequate.

     Alternatively, if the fixed-time concentration level and location
residuals are both large, a firm conclusion about model acceptability may
be premature.  Performance  may or may not be satisfactory.   A comparison
with the event-related peak performance measures is  necessary before a
final judgment is made.
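
     The joint reading of peak measures described above, and tabulated in
Table C-2, can be expressed as a simple lookup.  The Python sketch below is
an illustration only.  The function names and the tolerance used to label a
residual "small" or "large" are hypothetical; this report does not prescribe
such thresholds.  Combinations not listed in Table C-2 return an empty list
rather than a guessed interpretation.

```python
# Interpretations for event-related residuals, keyed by the size of the
# (concentration level, location, timing) residuals.
EVENT_RELATED = {
    ("small", "small", "small"): [
        "Model performance in predicting the concentration peak is acceptable",
    ],
    ("small", "large", "small"): [
        "Model performance is still good in predicting the peak concentration level",
        "There is a possible error in the wind field input",
    ],
    ("small", "large", "large"): [
        "Concentration level prediction is good",
        "There is a possible error in wind field input",
        "There is a possible error in the chemistry package or emissions input",
    ],
}

# Interpretations for fixed-time residuals, keyed by (level, location).
FIXED_TIME = {
    ("large", "large"): [
        "Model performance may or may not be acceptable; event-related (peak) "
        "residuals must be examined to make a final judgment",
    ],
    ("large", "small"): [
        "Model performance is probably unacceptable",
        "Pollutant transport is handled acceptably well",
        "There is a possible error in the chemistry package, the emissions "
        "input, or the inversion height time and spatial history",
    ],
}


def size(residual, tolerance):
    """Label a residual 'small' or 'large' against a user-chosen tolerance."""
    return "small" if abs(residual) <= tolerance else "large"


def interpret_event(level, location, timing):
    """Look up event-related residual sizes in Table C-2."""
    if level == "large":  # location and timing may take any value
        return ["Model performance is probably unacceptable"]
    return EVENT_RELATED.get((level, location, timing), [])


def interpret_fixed(level, location):
    """Look up fixed-time residual sizes in Table C-2."""
    return FIXED_TIME.get((level, location), [])


if __name__ == "__main__":
    for line in interpret_event(size(0.01, 0.02), size(12.0, 5.0), size(0, 1)):
        print(line)
```

     Keeping the table as data rather than nested conditionals makes it easy
to see which combinations the report actually addresses.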

-------
     If the model being used is capable of sufficient spatial and temporal
resolution, a "pattern recognition" performance measure may be of some use:
a map showing the locations and values of the predicted maximum concentrations
at several times during the day.  Such a map is shown in Figure C-1.  It
was produced using the SAI Urban Airshed Model simulating conditions in
the Denver Metropolitan region.

2.   STATION PERFORMANCE MEASURES

     The use of a station performance measure requires the modeler to
know, usually at each hour during the daylight hours, the values of both
the predicted and observed concentrations at each monitoring station.  From
the two concentration time histories at each site, a number of performance
measures can be calculated.  These are listed in Table C-3, divided into
three categories:  scalar, statistical, and "pattern recognition."
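
     As an illustration of how station measures follow from the two time
histories, the Python sketch below computes the residual (prediction minus
observation) at every station and hour, and then the residual at the station
measuring the highest concentration.  The function names are hypothetical.
The reading of "event-specific" as comparing that station's predicted
maximum with its observed maximum, and of "fixed-time" as comparing both at
the hour of the observed maximum, is an assumption made for this example.

```python
import numpy as np


def station_residuals(pred, obs):
    """Residual time history at each station, shape (n_hours, n_stations)."""
    return np.asarray(pred, dtype=float) - np.asarray(obs, dtype=float)


def residual_at_highest_station(pred, obs):
    """Residuals at the station that measured the highest concentration."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)

    # Hour and station of the highest observed concentration.
    t_obs, s = np.unravel_index(np.argmax(obs), obs.shape)

    # Hour at which the model predicts its maximum at that same station.
    t_pred = int(np.argmax(pred[:, s]))

    return {
        "station": int(s),
        # Predicted maximum versus observed maximum, whenever each occurs.
        "event_specific": pred[t_pred, s] - obs[t_obs, s],
        # Prediction versus observation at the hour of the observed maximum.
        "fixed_time": pred[t_obs, s] - obs[t_obs, s],
    }


if __name__ == "__main__":
    pred = [[1, 2, 3], [4, 5, 6], [2, 10, 7]]
    obs = [[1, 2, 2], [3, 5, 8], [2, 4, 6]]
    print(station_residuals(pred, obs))
    print(residual_at_highest_station(pred, obs))
```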

      Station measures are the performance measures whose use is most
 feasible in  practice.  Their calculation is  based upon the comparison
 of model predictions with observational data in the form that it is most
 often available—a  set of station  measurements.  By contrast, peak
 measures require the observation of the  "true" peak.  If this peak value
 is not the same as  the value recorded at that station in the monitoring
 network measuring the highest level,  if the  location of the peak is
 somewhere other than at that station, and if its time of occurrence is
 different than  the  time of the  peak observation, then the calculation of
 peak  performance measures may not  be feasible.  Although one can sometimes
 use numerical methods to infer  from station  data the level, location and
 timing of the peak,  results  are subject to uncertainty.

      Similarly, area  and  exposure/dosage measures require knowledge of the
 "true"  spatially  and  temporally varying concentration field.  However,
unless  circumstances  are  simple and the monitoring network is exceptionally
extensive and well-designed, the "true" concentration field will not be
known.   The only  data available will  consist of station measurements.  Infer-
ence of  the concentration field from  such data can often be an uncertain
and error prone process.

-------
[Figure C-1: map of the modeling region (compass label SOUTH legible);
meteorology of 3 August 1976.]

  FIGURE C-1.   LOCATIONS AND VALUES OF PREDICTED MAXIMUM ONE-HOUR-
                AVERAGE OZONE CONCENTRATIONS FOR EACH HOUR
                FROM 8 a.m. TO 6 p.m.

-------
              TABLE C-3.   SOME STATION PERFORMANCE MEASURES

    Type                      Performance Measure
-----------     ---------------------------------------------------
Scalar          Concentration residual at the station measuring
                the highest concentration (event-specific time
                and fixed-time comparisons).

                Difference in the spatial locations of the pre-
                dicted peak and the observed maximum (event-
                specific time and fixed-time comparisons).

                Difference in the times of the predicted peak
                and the observed maximum.

Statistical     For each monitoring station separately, the
                following concentration residuals statistics
                are of interest for the entire day:
                1)  Average deviation
                2)  Average absolute deviation
                3)  Average relative absolute deviation
                4)  Standard deviation
                5)  Correlation coefficient
                6)  Offset-correlation coefficient

                For all monitoring stations considered together,
                the following residuals statistics are of
                interest:
                1)  Average deviation
                2)  Average absolute deviation
                3)  Average relative absolute deviation
                4)  Standard deviation
                5)  Correlation coefficient
                6)  Estimate of bias as a function of
                    concentration
                7)  Comparison of the probabilities of concen-
                    tration exceedances as a function of
                    concentration

                Scatter plots of all predicted and observed
                concentrations with a line of best fit deter-
                mined in a least squares sense.

                Plot of the deviations of the predicted versus
                observed points from the perfect correlation
                line compared with estimates of instrumentation
                errors.

Pattern         Time history for the modeling day of the pre-
recognition     dicted and observed concentrations at each site.

                Time history of the variations over all stations
                of the predicted and observed average concentra-
                tions.

                At the time of the peak (event-related), the ratio
                of the normalized residual at the station having
                the highest value to the average of the normal-
                ized residuals at the other stations.

-------
a.   Scalar Station Performance Measures

     Since the "true" concentration peak  is  not always known with confidence,
a surrogate is needed for determining model  performance in predicting the
concentration peak.  Such a measure is often based  upon a comparison of
the predicted and observed concentrations at the  station measuring the
highest value during the day.  The comparison can be  done at an event-related
time (the peak) or a fixed time.  Since the values  of the measures may
differ at the two times, the implications of those  differences should be
considered carefully.

b.   Statistical Station Performance Measures

     Many statistical station  performance measures  are of use.   Sometimes
the  behavior  of the  concentration  residuals at a single station  is  considered.
 At other times,  the overall  behavior of  the residuals averaged over all
 stations is the  focus of interest.  In either case,  however, several of
 the statistical  performance measures remain the  same.  We define them here
 (the tilde ~ denotes "predicted," while m is the pollutant species, n
 is the hour of the day, k is the station index, K is the number of stations
 being considered, and N is the number of hours being compared):

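      >  Average Deviation

         $$\bar{\mu}^m = \frac{1}{NK} \sum_{n=1}^{N} \sum_{k=1}^{K} \left( \tilde{c}_k^{m,n} - c_k^{m,n} \right)$$                    (C-1)

      >  Average Absolute Deviation

         $$\overline{|\mu|}^m = \frac{1}{NK} \sum_{n=1}^{N} \sum_{k=1}^{K} \left| \tilde{c}_k^{m,n} - c_k^{m,n} \right|$$                    (C-2)

      >  Average Relative Absolute Deviation

         $$\overline{|\mu|}_r^m = \frac{1}{NK} \sum_{n=1}^{N} \sum_{k=1}^{K} \frac{\left| \tilde{c}_k^{m,n} - c_k^{m,n} \right|}{c_k^{m,n}}$$                    (C-3)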

-------
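      >  Standard Deviation

         $$\sigma^m = \sqrt{ \frac{1}{NK-1} \sum_{n=1}^{N} \sum_{k=1}^{K} \left[ \left( \tilde{c}_k^{m,n} - c_k^{m,n} \right) - \bar{\mu}^m \right]^2 }$$                    (C-4)

         or, alternatively,

         $$\sigma^m = \sqrt{ \frac{1}{NK-1} \left\{ \sum_{n=1}^{N} \left[ \sum_{k=1}^{K} \left( \tilde{c}_k^{m,n} - c_k^{m,n} \right)^2 \right] - NK \left( \bar{\mu}^m \right)^2 \right\} }$$                    (C-5)

         where $\bar{\mu}^m$ is the average deviation.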
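
The residual statistics named above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to produce the results in this Appendix. It assumes that residuals (predicted minus observed) are pooled over all hours and stations and that the standard deviation uses an NK - 1 divisor. The function name and the sample values are hypothetical.

```python
import math


def residual_statistics(predicted, observed):
    """Residual statistics for predicted[n][k] and observed[n][k].

    n indexes the hour and k the station; residuals are pooled over
    all hours and stations (an assumption of this sketch).
    """
    residuals = []
    relative = []
    for pred_row, obs_row in zip(predicted, observed):
        for pred, obs in zip(pred_row, obs_row):
            residuals.append(pred - obs)
            # The relative deviation is undefined for a zero observation.
            relative.append(abs(pred - obs) / obs)

    count = len(residuals)  # N * K
    average = sum(residuals) / count
    variance = sum((r - average) ** 2 for r in residuals) / (count - 1)
    return {
        "average_deviation": average,
        "average_absolute_deviation": sum(abs(r) for r in residuals) / count,
        "average_relative_absolute_deviation": sum(relative) / count,
        "standard_deviation": math.sqrt(variance),
    }


# Hypothetical ozone values (pphm): two hours (rows), two stations (columns).
predicted = [[10.0, 8.0], [12.0, 6.0]]
observed = [[8.0, 8.0], [10.0, 8.0]]
stats = residual_statistics(predicted, observed)
print(stats)
```

Setting K = 1 (one column per row) gives the single-station statistics. Note that the average deviation can be small while the average absolute deviation is not, since residuals of opposite sign cancel in the former.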
-------
     Another statistical  measure is of interest.  The correlation  coeffi-
cient,  as  expressed below, provides an indication of the extent  to which
variations in observed station concentrations  are matched  by variations in
the predicted station values.   A close match is indicated by a value near
one (the value for "perfect" correlation).

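     >   Correlation Coefficient

         $$r^m = \frac{\operatorname{cov}^m(\tilde{c}, c)}{\sigma_{\tilde{c}}^m \, \sigma_c^m}$$

     where the covariance and the standard deviations $\sigma_{\tilde{c}}^m$ and
     $\sigma_c^m$ of the predicted and observed concentrations are taken over
     the N hours and K stations about the means

         $$\mu_{\tilde{c}}^m = \frac{1}{KN} \sum_{n=1}^{N} \sum_{k=1}^{K} \tilde{c}_k^{m,n}, \qquad \mu_c^m = \frac{1}{KN} \sum_{n=1}^{N} \sum_{k=1}^{K} c_k^{m,n}$$

     [The detailed summation forms of Eqs. (C-6) through (C-10) are not
     legible in the source.]
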
     If the value of the correlation coefficient is not close to one,
this may or may not be an indication that model performance is deficient.
For instance, suppose slight errors were embedded in the wind field
supplied to the model.  Possibly, the only effect of this could be a
slight offset between the predicted and the "true" pollutant cloud location.
The concentration level and its distribution within the cloud might be

-------
well predicted otherwise.  However, the correlation  coefficients computed
at individual stations (K = 1) might not demonstrate agreement between
prediction and observation, indicating instead  the opposite.  Conceivably,
this also might be the case even if the correlation  coefficient is computed
using concentration values averaged for all stations (K = total number of
stations).
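
The effect of a slight spatial offset on the correlation coefficient can be illustrated with a short Python sketch. It assumes the ordinary (Pearson) correlation computed from prediction-observation pairs pooled over all hours and stations, which may differ in detail from the summation form used in this Appendix. The function name and the data are hypothetical.

```python
import math


def correlation_coefficient(predicted, observed):
    """Pearson correlation of predicted[n][k] against observed[n][k],
    pooling all hours n and stations k (an assumption of this sketch)."""
    pred = [p for row in predicted for p in row]
    obs = [o for row in observed for o in row]
    mean_pred = sum(pred) / len(pred)
    mean_obs = sum(obs) / len(obs)
    cross = sum((p - mean_pred) * (o - mean_obs) for p, o in zip(pred, obs))
    spread_pred = sum((p - mean_pred) ** 2 for p in pred)
    spread_obs = sum((o - mean_obs) ** 2 for o in obs)
    return cross / math.sqrt(spread_pred * spread_obs)


# One hour, three stations.  The predicted cloud has the right level but
# sits over the neighboring station (hypothetical values, pphm).
observed = [[0.0, 10.0, 0.0]]
predicted = [[0.0, 0.0, 10.0]]
r_offset = correlation_coefficient(predicted, observed)

# A prediction that tracks the observations linearly correlates perfectly.
r_linear = correlation_coefficient([[1.0, 21.0, 1.0]], observed)
print(r_offset, r_linear)
```

In the first case the peak level is predicted exactly, yet the coefficient is negative because the cloud is displaced by one station.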

     Another statistical  measure is useful in overcoming this difficulty
when sampling stations are not too "sparsely" sited.  This measure is the
offset correlation coefficient and is  designed to compare predictions at
one station and time against observations at another station and/or time.
It is defined as follows:

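     >  Offset Correlation Coefficient

         $$r_{kj}^m(\Delta n) = \frac{\sum_{n} \left( \tilde{c}_k^{m,n} - \mu_{\tilde{c}_k}^m \right) \left( c_j^{m,n+\Delta n} - \mu_{c_j}^m \right)}{\sqrt{\sum_{n} \left( \tilde{c}_k^{m,n} - \mu_{\tilde{c}_k}^m \right)^2 \; \sum_{n} \left( c_j^{m,n+\Delta n} - \mu_{c_j}^m \right)^2}}$$                    (C-11)

where k is the index of the measurement station at which concentrations are
predicted, j is the index of the station at which they are measured, and Δn
is the time offset between prediction and observation; also

         $$\mu_{\tilde{c}_k}^m = \frac{1}{N} \sum_{n=1}^{N} \tilde{c}_k^{m,n}$$                    (C-12)

with $\mu_{c_j}^m$ defined in the same way from the observed concentrations at
station j.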

-------
     Many reasons can account for differences between  prediction  and
observation.   The offset correlation coefficient itself cannot  be used
to isolate specific reasons,  but it can detect time lags or  spatial offsets
between comparative concentration histories.   A time lag might  occur
because of slight differences between modeled and actual wind speed, diurnal
inversion height history, emissions, or atmospheric chemistry,  as well as
any of a number of other reasons.  These differences could manifest them-
selves at a particular monitoring station as  a simple time lag, an example
of which is shown in Figure C-2(a).  Also, for the reasons mentioned above,
as well as differences in modeled and actual  wind direction, a  spatial
offset can occur which could  result in the actual and predicted pollutant
clouds passing over different but adjacent stations.  A comparison of the
concentration profiles at these two stations, such as those  shown in
Figure C-2(b), can reveal the offset.  Good agreement could be inferred if
the value of the offset correlation coefficient between the concentrations
at the two stations, at the same time, assumed a value near one ("perfect"
correlation).

     In using station data as a basis for comparing prediction  with obser-
vation, the offset correlation coefficient should be computed as  a matter
of course.  For the station of interest (perhaps the one recording the highest
concentration value), computation of the following offset correlation coeffi-
cients might be revealing:  first, at the same hour, with all adjacent sta-
tions (unless none are nearby); then, at the  same station, for  adjacent  hours
(for example, one and two hours lag and lead); and finally,  with  all adjacent
stations and hours (to reveal the joint presence of spatial offset and time lag).
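
The search over adjacent hours described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the offset correlation is taken to be the Pearson correlation of the predicted series against the observed series shifted by a whole number of hours, using only the hours at which both values exist. The function names and the series are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(spread_x * spread_y)


def offset_correlation(predicted, observed, lag):
    """Correlate predicted[n] with observed[n + lag] where both exist."""
    pairs = [
        (predicted[n], observed[n + lag])
        for n in range(len(predicted))
        if 0 <= n + lag < len(observed)
    ]
    return pearson([p for p, _ in pairs], [o for _, o in pairs])


# Hypothetical hourly series: the prediction peaks one hour late.
observed = [1.0, 2.0, 5.0, 9.0, 5.0, 2.0, 1.0]
predicted = [1.0, 1.0, 2.0, 5.0, 9.0, 5.0, 2.0]

by_lag = {lag: offset_correlation(predicted, observed, lag) for lag in range(-2, 3)}
best_lag = max(by_lag, key=by_lag.get)
print(best_lag, by_lag[best_lag])
```

Passing the observed series from an adjacent station instead of the same station gives the spatial comparison, and looping over both stations and lags gives the joint case.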

-------
[Figure C-2: panels of concentration plotted against hour of day.  Panel (a)
marks the time offset Δn between the predicted and measured curves.]

 (a)  Time Lag (Predicted and Measured Concentrations
      are for the Same Monitoring Station)

 (b)  Spatial Offset (Predicted and Measured Concentrations
      are for Different but Adjacent Monitoring Stations)

      FIGURE C-2.   CONCENTRATION HISTORIES REVEALING
                    TIME LAG OR SPATIAL OFFSET

-------
     For  all  the monitoring stations considered together, several other
statistics  are  of interest.  For instance, the variation of bias in model
predictions with the level of pollutant concentration can be plotted as
shown in Figure C-3.  In this particular example, based upon simulations
of the Denver Metropolitan region performed using the SAI Urban Airshed
Model, the  fractional  mean deviation from perfect agreement between predic-
tion and  observation appears to vary randomly at the higher ozone concen-
trations.  Aside from an apparent systematic bias at very low concentrations,
no conclusion of significant bias seems demonstrable.
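
A simple way to tabulate bias against concentration level is sketched below in Python. It is only an illustration: it groups prediction-observation pairs into bins of observed concentration and reports the mean residual in each bin, whereas the figure discussed above uses a root-mean-square concentration as the abscissa. The function name, bin width, and data are hypothetical.

```python
def bias_by_concentration(predicted, observed, bin_width):
    """Mean residual (predicted - observed) per bin of observed level.

    Returns a dict mapping the lower edge of each bin to the mean
    residual of the pairs whose observed value falls in that bin.
    """
    bins = {}
    for pred, obs in zip(predicted, observed):
        index = int(obs // bin_width)
        bins.setdefault(index, []).append(pred - obs)
    return {
        index * bin_width: sum(values) / len(values)
        for index, values in sorted(bins.items())
    }


# Hypothetical pooled station values (pphm).
observed = [1.0, 2.0, 5.0, 6.0, 9.0]
predicted = [3.0, 4.0, 5.0, 5.0, 10.0]
bias = bias_by_concentration(predicted, observed, bin_width=4.0)
print(bias)
```

A mean residual that keeps the same sign across adjacent bins would suggest systematic bias in that concentration range. Values that change sign irregularly, as in the example discussed above, would not.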
[Figure C-3: horizontal axis is Root Mean Square Ozone Concentration (pphm),
formed from the squares of the observed and predicted values.]

    FIGURE C-3.   ESTIMATE OF BIAS IN MODEL PREDICTIONS AS A FUNCTION OF
                  OZONE CONCENTRATION.   This figure is based upon predic-
                  tions of the SAI Urban Airshed Model for the Denver
                  Metropolitan region.

-------
     Residuals can vary in sign and magnitude during the modeling day.
It is often helpful to plot their diurnal variation.  An example is
shown in Figure  C-4,  based upon predictions of the SAI Urban Airshed Model
for three modeling days in Denver.  A discernible pattern might be sympto-
matic of basic model inadequacies.  In this example, however, no simple
pattern seems apparent.

     For each set of observations or predictions (for all stations and
times), there exists a cumulative concentration frequency distribution.
This describes the probability of occurrence of a concentration in excess of
a certain value for the range of possible concentration values.  An example
based upon the modeling effort noted earlier is shown in Figure C-5.  A con-
clusion might be drawn from this figure:  Although background ozone concen-
trations are not well-determined (low background concentrations are difficult
to measure accurately), higher concentrations are more predictably distributed.
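
The exceedance probabilities compared in such a figure can be estimated directly from the pooled values. The Python sketch below is a minimal illustration. It assumes the probability is estimated as the fraction of values strictly above each level, and the function name and data are hypothetical.

```python
def exceedance_probability(concentrations, levels):
    """Fraction of values strictly greater than each level."""
    total = len(concentrations)
    return [sum(1 for c in concentrations if c > level) / total for level in levels]


# Hypothetical pooled values (pphm) for all stations and hours.
observed = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [3.0, 3.0, 7.0, 8.0, 9.0]
levels = [0.0, 5.0, 10.0]

observed_curve = exceedance_probability(observed, levels)
predicted_curve = exceedance_probability(predicted, levels)
print(observed_curve, predicted_curve)
```

Plotting the two curves against the same levels shows where the predicted distribution departs from the observed one.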

     By plotting observed concentrations against predicted ones (at each
station for each hour), a graphic record of their correlation can be obtained.
The degree of clustering of observation-prediction pairs about the perfect
correlation line provides an indication of the degree of their agreement.
An example is presented in Figure C-6.  For each particular combination of
observation and prediction, the number of occasions on which it occurred
is shown.

     Superimposed on the figure are the standard deviation bands (1σ) for
both the EPA standard and maximum acceptable instrumentation error.  These
bands portray the extent to which station measurements are accurate indi-
cators of "true" concentrations.  To conclude that a model is unable to
reproduce a set of "true" concentrations, one must know the value of those
concentrations.   Measurements, however, are imperfect surrogates.  If
concentration residuals are within instrumentation limits, differences could
be explained solely by measurement errors.  In such a case, no further
conclusions could be reached about model  predictive ability.

-------
[Figure C-4: horizontal axis is Time of Day by Hourly Averaging Period.
Legend: mean of all stations for each of three modeling days (including
3 August 1976); average of the 3 days.]

 FIGURE C-4.   TIME VARIATION OF DIFFERENCES BETWEEN MEANS OF OBSERVED AND
               PREDICTED OZONE CONCENTRATIONS.  This figure is based upon
               predictions of the SAI Urban Airshed Model for the Denver
               Metropolitan region.

-------
[Figure C-5: observed and predicted curves; horizontal axis is Probability
of Exceedance of Given Ozone Concentration.  279 data pairs from 3 days and
9 stations.]

    FIGURE C-5.   PROBABILITIES OF OZONE CONCENTRATION EXCEEDANCE.  This figure is based
                  upon predictions of the SAI Urban Airshed Model for the Denver
                  Metropolitan region.

-------
[Figure C-6: plot of predicted (P) against observed ozone concentrations.]

FIGURE C-6.   MODEL PREDICTIONS CORRELATED WITH INSTRUMENT OBSERVATIONS
              OF OZONE (DATA FOR 3 DAYS, 9 STATIONS, DAYLIGHT HOURS).
              This figure is based on predictions of the SAI Urban
              Airshed Model for the Denver Metropolitan region.

-------
     Some of the information contained in Figure C-6 is summarized in
Table C-4.  The percent of prediction/observation pairs meeting certain
correspondence levels is indicated for this example.  The extent to
which concentration residuals compare with instrumentation error is
shown in Figure C-7.  These same plots can be constructed for most
modeling applications for which station predictions are known.

      TABLE C-4.   OCCURRENCE OF CORRESPONDENCE LEVELS OF PREDICTED
                   AND OBSERVED OZONE CONCENTRATIONS

                                            Percent of Comparisons
                                         Meeting Correspondence Level
                                       ----------------------------------
         Correspondence Level              All       Both Predicted and
 Between Predicted and Observed Pairs  Comparisons  Observed Conc. > 8 pphm
 ------------------------------------  -----------  -----------------------
 1)  Factor of two (2P > O > P/2)           80%               94%

 2)  Computed value is within ± twice
     S.D. of max. prob. inst. error
     (95% level) of observed value         100               100

 3)  Computed value is within ± S.D.
     of max. prob. inst. error
     (95% level) of observed value          93                90

 4)  Computed value is within ± twice
     S.D. of inst. errors by EPA std.
     (95% level) of observed value          89                77

 5)  Computed value is within ± S.D.
     of inst. errors by EPA std.
     (95% level) of observed value          60                37

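
The percentages in a table of this kind can be computed as sketched below in Python. The factor-of-two test follows the condition 2P > O > P/2 stated in the table. The band test uses a single half-width supplied by the user, which is an assumption of this sketch, and the value used here is hypothetical rather than one of the instrument-error figures. Function names and data are also hypothetical.

```python
def percent_within_factor_of_two(predicted, observed):
    """Percent of pairs satisfying 2P > O > P/2."""
    hits = sum(1 for p, o in zip(predicted, observed) if 2 * p > o > p / 2)
    return 100.0 * hits / len(predicted)


def percent_within_band(predicted, observed, half_width):
    """Percent of pairs whose residual magnitude is at most half_width."""
    hits = sum(1 for p, o in zip(predicted, observed) if abs(p - o) <= half_width)
    return 100.0 * hits / len(predicted)


# Hypothetical prediction-observation pairs (pphm).
predicted = [4.0, 10.0, 3.0, 8.0]
observed = [5.0, 4.0, 3.0, 9.0]

factor_two = percent_within_factor_of_two(predicted, observed)
narrow_band = percent_within_band(predicted, observed, half_width=0.5)
print(factor_two, narrow_band)
```

Restricting both lists to pairs in which predicted and observed values exceed a chosen level before calling these functions gives the second column of the table.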
c.   "Pattern Recognition" Station Performance Measures

     Several qualitative/composite model performance measures are useful
in comparing station predictions with observations.  At each monitoring
site, for instance, the time history through the modeling day of the
predicted concentrations can be plotted directly with the time history of

-------
[Figure C-7: horizontal axis is Difference (pphm).  Curves shown:
 - Deviation of predicted versus observed points from perfect correlation
   line (281 one-hour average data points)
 - (True - Instrumental), EPA acceptable monitor (mean bias = -8 percent;
   ± 3 pphm = 95 percent confidence level)
 - (True - Instrumental), maximum probable error (mean bias = -8 percent;
   ± 7 pphm = 95 percent confidence level)]

FIGURE C-7.   MODEL PREDICTIONS COMPARED WITH ESTIMATES OF INSTRUMENT ERRORS FOR OZONE (DATA
               FOR 3 DAYS, 9 STATIONS, DAYLIGHT HOURS)

-------
the measurement data.  This is done in Figure C-9  for one of the days
(3 August 1976) in the Denver modeling example employed earlier.
Preceding this figure is a map in Figure C-8, which shows the names and
locations of the air quality monitoring stations in the Denver Metropol-
itan region.

     For each hour during the day, the predicted and observed concentrations
each can be averaged for all measurement stations.   The diurnal variation
of this all-station average can also be of interest.   An example  of such
a time history is shown in Figure C-10.

     At the time the concentration peak occurs,  the performance of the
model in predicting that peak is of interest as  is  its ability to predict
the lower concentration values at monitoring stations distant from the
peak.  An indication of the relative prediction-observation agreement at
the peak versus the agreement at outlying stations can be found by
computing a composite performance measure:  the ratio of the
normalized residual at the station measuring the highest concentration
value to the average of the normalized residuals at the other stations.
If this ratio is large, better performance at the outlying stations than
near the peak can be inferred.  If the value is  small, the reverse is true.
If the ratio is near unity, agreement is much the same throughout the
modeled region.
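
This composite measure can be sketched in Python as follows. The sketch assumes that a normalized residual is the magnitude of the residual divided by the observed value and that the peak station is the one with the highest observation at the hour considered. Neither definition is spelled out above, so both are assumptions. The function name and data are hypothetical.

```python
def peak_to_outlying_ratio(predicted, observed):
    """Normalized residual at the peak station divided by the mean
    normalized residual at the remaining stations (one hour of data)."""
    peak = max(range(len(observed)), key=lambda k: observed[k])
    normalized = [abs(p - o) / o for p, o in zip(predicted, observed)]
    others = [r for k, r in enumerate(normalized) if k != peak]
    return normalized[peak] / (sum(others) / len(others))


# Hypothetical values (pphm) at three stations at the hour of the peak.
observed = [10.0, 5.0, 4.0]
predicted = [8.0, 5.5, 3.6]
ratio = peak_to_outlying_ratio(predicted, observed)
print(ratio)
```

Here the relative error at the peak station is about twice that at the outlying stations, so the ratio is near two.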

     The value of a concentration residual  at a  station changes during
the modeling day.  If these changes can be tied  to  corresponding  changes
in atmospheric characteristics (the height of the inversion base, for
instance), we can sometimes draw valuable inferences about model performance
as a function of the value of these atmospheric "forcing variables."  Some
of these variables include:  wind speed, inversion height, ventilation (com-
bining the previous two variables into a product of their values),
insolation, and a particular category of emissions (automotive, for example).

-------
                        KEY
   NG - Northglenn               NJ - National Jewish Hospital
   WE - Welby                    GM - Green Mountain
   AR - Arvada                   OV - Overland
   CR - C.A.R.I.H.               PR - Parker Road
   CM - Continuous Air Moni-
        toring Program [CAMP]

[Figure C-8: map oriented with NORTH at top and SOUTH at bottom.]

FIGURE C-8.  MAP OF DENVER AIR QUALITY MODELING REGION SHOWING
             AIR QUALITY MONITORING STATIONS

-------
[Figure C-9: panels of concentration against Time of Day by Hourly Interval
(start hour to stop hour).  Legend: Observed; Predicted.]

  FIGURE C-9.   TIME HISTORY OF PREDICTED AND OBSERVED CONCENTRATIONS
                AT MONITORING SITES.  This figure is based on the pre-
                dictions of the SAI Urban Airshed Model in Denver
                for 3 August 1976.

-------
[Figure C-10: horizontal axis is Time of Day by Hourly Averaging Period.]

FIGURE C-10.   VARIATIONS OVER ALL STATIONS OF OBSERVED AND PREDICTED AVERAGE OZONE CONCENTRATIONS.
               This figure is based on the prediction of the SAI Urban Airshed Model in Denver.

-------
To examine residual values for cause-and-effect relationships, we can
plot on the same figure the time history of both the residual and the
forcing variable.  Alternatively we can plot the residual directly
with the forcing variables.  Examples of both of these are presented in
Figure C-11.
                 
-------
model performance measures.  In practice, however, we are seldom able to
resolve fully the "true" concentration field, even if the model we use is
capable of doing so for the predicted field.  This difficulty derives from
the limited sampling of measurement data generally available: Only measure-
ments at several scattered monitoring stations are recorded.  Unless ambient
conditions are highly predictable and the monitoring network is  extensive
and exceptionally well-designed, reconstruction of the "observed" concen-
tration field from discrete station measurements can be an uncertain and
error prone process.

     Nevertheless, the observed concentration field can be inferred with
accuracy in some circumstances.  In addition, models frequently can provide
spatially resolved predictions.  Grid models, for instance, predict average
concentrations  in a number of  grid cells.  Resolution is then provided as
finely as the horizontal  grid-cell dimensions (on the order of one  to sev-
eral kilometers).  Trajectory  model predictions can  be used  to calculate
concentrations  along  the  space-time track followed by the air parcel being
modeled.  Gaussian models are  analytic and  can resolve fully their  predictions.
Thus, even if the observed concentration field is known  only imperfectly,
the  predicted field,  because it is often much better resolved, can  still
provide qualitative information about model  performance.  Further,  the
shape of the predicted concentration field  can suggest ways  to extract
information for comparison with station measurements.  We discuss  "hybrid"
performance measures  later in  this Appendix.

     In this section  we present several  area performance measures.  When
predicted and observed concentration fields  are known, they  can  provide
considerable insight  into model performance. These  performance  measures
are  based upon  taking the difference between the  predicted  and  observed
values of certain quantities.   Even when the observed  values of these
quantities are  not  known  with  accuracy,  computation  of their predicted  values
 can  provide  a systematic  means for characterizing model  predictions.

-------
     The performance measures presented here can be divided into three
types:  scalar, statistical, and "pattern recognition."   We discuss each
in turn.  In Table C-5, we list some of these measures.

a.   Scalar Area Performance Measures

     The seriousness of a pollutant problem is a function not only of the
concentration level itself but also of the spatial  extent of the pollutant
cloud.  Several scalar area performance measures are designed with this in
mind.  Even if a model predicts the peak concentration well, it may not
necessarily predict the extent of the area exposed  to concentrations near
to that value.  This might not be a serious defect  if the pollutant cloud
passed over uninhabited terrain.  However, if the cloud were to drift
over a densely populated urban area, a considerable difference in the
health effects experienced could exist between a cloud one mile across and
another five miles across.  This could correspondingly affect our
willingness to accept for use a model whose predictions of cloud dimensions
differed considerably from observed dimensions.

     Two performance measures of interest are the differences between the
predicted and observed values of the following:  the fraction of the area
of interest within which concentrations exceed the NAAQS, and the fraction
experiencing concentrations within 10 percent of the peak value.  The first
of these is a measure of the
general ability of the model to predict the spatial  extent of concentra-
tions in the range of interest.  The second estimates the performance of
the model in the higher concentration ranges at  which, presumably, health
effects are more pronounced.
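
Both scalar measures can be sketched in Python for gridded fields. The sketch assumes that the predicted and observed fields are given on the same grid of equal-area cells, so that an area fraction is simply a fraction of cells. The function names, the threshold standing in for the standard, and the field values are hypothetical.

```python
def area_fraction_above(field, threshold):
    """Fraction of equal-area grid cells with concentration above threshold."""
    cells = [c for row in field for c in row]
    return sum(1 for c in cells if c > threshold) / len(cells)


def area_fraction_near_peak(field, fraction=0.10):
    """Fraction of cells within the given fraction of the field's own peak."""
    cells = [c for row in field for c in row]
    cutoff = (1.0 - fraction) * max(cells)
    return sum(1 for c in cells if c >= cutoff) / len(cells)


# Hypothetical 2 x 3 fields (pphm) and a hypothetical standard of 8 pphm.
predicted = [[2.0, 6.0, 10.0], [4.0, 9.5, 12.0]]
observed = [[2.0, 5.0, 9.0], [3.0, 8.0, 12.0]]

exceedance_difference = (
    area_fraction_above(predicted, 8.0) - area_fraction_above(observed, 8.0)
)
near_peak_difference = (
    area_fraction_near_peak(predicted) - area_fraction_near_peak(observed)
)
print(exceedance_difference, near_peak_difference)
```

A positive exceedance difference indicates that the model predicts a larger area above the standard than was observed.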

     A third measure is of interest.   At each measurement station a set of
concentration readings is recorded.   It is interesting to compute from
the predicted concentration field the nearest distance at which there occurs
a value equal  to the observed value,  as well  as  the  azimuthal  direction
from the station to the nearest such  point.   This direction lies  along the
concentration gradient of the predicted field.  The magnitude of the distance
is a measure of the spatial offset between the predicted and observed concen-
tration fields in the vicinity of the monitoring station.  The direction  is
a measure of the orientation of the offset.
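
For a gridded predicted field, the nearest-distance measure can be approximated as in the Python sketch below. Several assumptions are made for illustration: the field is stored as rows of cell values with the row index increasing northward, a cell "equals" the observed value when it lies within a chosen tolerance, distances are measured to cell centers, and the azimuth is reported in degrees clockwise from north. The function name and data are hypothetical.

```python
import math


def nearest_matching_cell(field, cell_size, station_xy, observed_value, tolerance):
    """Return (distance, azimuth_degrees) from the station to the nearest
    cell center whose predicted value is within tolerance of the observed
    value, or None if no cell qualifies."""
    best = None
    for j, row in enumerate(field):          # j increases northward
        for i, value in enumerate(row):      # i increases eastward
            if abs(value - observed_value) > tolerance:
                continue
            dx = (i + 0.5) * cell_size - station_xy[0]
            dy = (j + 0.5) * cell_size - station_xy[1]
            distance = math.hypot(dx, dy)
            if best is None or distance < best[0]:
                azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
                best = (distance, azimuth)
    return best


# Hypothetical predicted field (pphm) on 2-km cells; station at (1 km, 1 km).
field = [[4.0, 6.0, 8.0], [5.0, 7.0, 9.0], [6.0, 8.0, 10.0]]
result = nearest_matching_cell(field, 2.0, (1.0, 1.0), observed_value=8.0, tolerance=0.5)
print(result)
```

In this example the station observes 8 pphm while the model predicts 4 pphm at that location; the nearest predicted 8 pphm lies 4 km due east.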

-------
            TABLE C-5.  SOME AREA PERFORMANCE MEASURES

    Type                     Performance Measure
-----------    -----------------------------------------------------
Scalar         a.  Difference in the fraction of the area of interest
                   in which the NAAQS are exceeded.
               b.  Nearest distance at which the observed concen-
                   tration is predicted.
               c.  Difference in the fraction of the area of interest
                   in which concentrations are within 10 percent of
                   the peak value.

Statistical    a.  At the time of the peak, differences in the
                   fraction of the area experiencing greater than
                   a certain concentration; differences in the
                   following are of interest:
                       Cumulative distribution function
                       Density function
                       Expected value of concentration
                       Standard deviation of density function
               b.  For the entire residual field, the following
                   statistics are of interest:
                   1)  Average deviation
                   2)  Average absolute deviation
                   3)  Average relative absolute deviation
                   4)  Standard deviation
                   5)  Correlation coefficient
                   6)  Estimate of bias as a function of
                       concentration
                   7)  Comparison of the probabilities of concen-
                       tration exceedances as a function of con-
                       centration
               c.  Scatter plots of prediction-observation concen-
                   tration pairs with a line of best fit determined
                   in a least squares sense.

Pattern        a.  Isopleth plots showing lines of constant pollu-
recognition        tant concentration for each hour during the
                   modeling day.
               b.  Time history of the size of the area in which
                   concentrations exceed a certain value.
               c.  Isopleth plots showing lines of constant residual
                   values for each hour during the day ("subtract"
                   prediction and observed isopleths).
               d.  Isopleth plots showing lines of constant residuals
                   normalized to selected forcing variables (inver-
                   sion height, for instance).
               e.  Peak-to-overall performance indicator, computed
                   by taking the ratio of the mean residual in the
                   area of the peak (e.g., where concentrations are
                   within 10 percent of the peak) to the mean
                   residual in the overall region.

-------
b.   Statistical Area Performance Measures

     A number of statistical area performance measures are of use.  They are
generally computed either at a fixed time or at the time of a fixed event
(the peak, for instance).  Before they can be computed, however, both the
predicted and observed concentration fields must be transformed into a
compatible, discrete form.  The scales of resolution must be made the same,
though kept as fine as possible.  For example, if a grid model provided
average concentrations every two kilometers in a lattice-work pattern
spanning the region of interest, then the observed concentration field
inferred from station measurements must also be resolved at two-kilometer
intervals with concentrations obtained at each point in the lattice-work.
If resolution cannot be obtained so finely, then the predicted concentration
field must be adjusted to be comparable with the observed one.  The field
having the coarsest resolution is the limiting one.

     Once the fields have been resolved into a compatible form, several
performance measures can be computed.  We can characterize a concentration
field by indicating for each concentration value the fraction of the area
experiencing a concentration greater than that value.  By so doing, we define
a cumulative distribution function (CDF) such as that shown in Figure C-12.
The CDF is the integral of its density function (f), also shown in the figure.
[Figure C-12: predicted and observed curves of the cumulative distribution
function (CDF) and the density function (f); horizontal axis is
Concentration.]

         FIGURE C-12.   DISTRIBUTION OF AREA FRACTION EXPOSED TO GREATER
                        THAN A GIVEN CONCENTRATION VALUE

-------
     For the predicted and observed concentration fields, the CDFs may
differ.  The following statistics can be compared in order to characterize
the difference:  the CDF itself, the mean expected concentration in the
modeled region, and the standard deviation of the area density function.
If the CDF and f were continuous functions, the following would express the
form of these measures:

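     >  Cumulative Distribution Function

         $$\mathrm{CDF}(C < K) = \int_{C_B}^{K} f(c)\,dc$$                    (C-16)

     >  Expected Concentration

         $$\mu_A = \int_{C_B}^{C_P} c\,f(c)\,dc$$                    (C-17)

        where $C_P$ is the peak and $C_B$ is the background concentration.

     >  Standard Deviation

         $$\sigma_A = \left[ \int_{C_B}^{C_P} \left( c - \mu_A \right)^2 f(c)\,dc \right]^{1/2}$$                    (C-18)
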
         However,  the CDF and f are not available in practice as continuous
         functions:  They are expressed discretely, derived from concentra-
         tions at the nodal points of a ground-level grid having dimensions
         I by J.  The above measures have the following discrete form:

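      >  Discrete Cumulative Distribution Function

               \mathrm{CDF}(C^m < K) = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} u(K - C^m_{ij})      (C-19)

where m is the pollutant species and u is a unit step function whose value
is

               u(x) = \begin{cases} 1, & x > 0 \\ 0, & x < 0 \end{cases}      (C-20)

      >  Discrete Expected Concentration

               \mu^m_A = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} C^m_{ij}                           (C-21)

      >  Discrete Standard Deviation

               (\sigma^m_A)^2 = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (C^m_{ij} - \mu^m_A)^2      (C-22)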
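
     As an illustration, the discrete area measures can be sketched in a
few lines of Python.  This is a minimal sketch rather than part of the
original analysis:  the function names and the small 2 x 3 field are
hypothetical, the step function is taken to be zero at x = 0 (an assumption),
and the standard deviation is assumed to be normalized by the number of
nodes, IJ.

```python
from math import sqrt


def unit_step(x):
    """Unit step u(x): 1 for x > 0, else 0 (treating x == 0 as 0 is an assumption)."""
    return 1 if x > 0 else 0


def flatten(field):
    """Return the nodal values of a gridded field as one flat list."""
    return [c for row in field for c in row]


def area_cdf(field, k):
    """Fraction of grid nodes whose concentration is less than k."""
    nodes = flatten(field)
    return sum(unit_step(k - c) for c in nodes) / len(nodes)


def area_mean(field):
    """Expected concentration: the average over all grid nodes."""
    nodes = flatten(field)
    return sum(nodes) / len(nodes)


def area_std(field):
    """Standard deviation of nodal concentrations (normalized by IJ, an assumption)."""
    nodes = flatten(field)
    mu = area_mean(field)
    return sqrt(sum((c - mu) ** 2 for c in nodes) / len(nodes))


# Hypothetical 2 x 3 concentration field (pphm), for illustration only.
field = [[2.0, 4.0, 6.0],
         [4.0, 6.0, 8.0]]

print(area_cdf(field, 5.0), area_mean(field), area_std(field))
```

Applying the same three functions to the predicted field and to the observed
field gives the pairs of values whose differences are of interest.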

     The predicted and observed concentration fields can be differenced, with
the result being a spatially distributed residual field at the fixed time or
event of interest.  The statistics of this residual field are essentially the
same as those described earlier in Eqs. (C-1) to (C-10) for the set of station
residuals.  They are as follows (the tilde, ~, denotes "predicted," while m is
the pollutant species and I, J are the number of nodes in the concentration
field grid):

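     >  Average Deviation

               \bar{\mu}^m = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (C^m_{ij} - \tilde{C}^m_{ij})                      (C-23)

     >  Average Absolute Deviation

               |\bar{\mu}|^m = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} |C^m_{ij} - \tilde{C}^m_{ij}|                    (C-24)

     >  Average Relative Absolute Deviation

               |\bar{\mu}|^m_r = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} \frac{|C^m_{ij} - \tilde{C}^m_{ij}|}{C^m_{ij}}  (C-25)

     >  Standard Deviation

               (\sigma^m)^2 = \frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} [(C^m_{ij} - \tilde{C}^m_{ij}) - \bar{\mu}^m]^2   (C-26)

     >  Correlation Coefficient

               r^m = \frac{\frac{1}{IJ} \sum_{j=1}^{J} \sum_{i=1}^{I} (C^m_{ij} - \mu^m_C)(\tilde{C}^m_{ij} - \mu^m_{\tilde{C}})}{\sigma^m_C \, \sigma^m_{\tilde{C}}}   (C-27)
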
    Calculation of the above statistics can be extended through the model-
ing day by including residual values not just at a specific time or event
but for each hour during the day.  Also, a graphical representation of the
correlation between prediction and observation can be developed by plotting
prediction-observation concentration pairs on a scatter plot, much as was
done for station values in Figure C-6.
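
     The residual-field statistics can be illustrated with a short Python
sketch.  It is offered under stated assumptions only:  the residual is taken
as observed minus predicted, all averages are normalized by the number of
nodes, and the function name, dictionary keys, and sample fields are
hypothetical.

```python
from math import sqrt


def residual_statistics(observed, predicted):
    """Statistics of the residual field between two fields on the same grid."""
    obs = [c for row in observed for c in row]
    pred = [c for row in predicted for c in row]
    n = len(obs)
    resid = [o - p for o, p in zip(obs, pred)]  # assumed sign: observed - predicted

    mean_dev = sum(resid) / n
    mean_abs_dev = sum(abs(r) for r in resid) / n
    # Relative deviation is normalized by the observed value, which must be nonzero.
    mean_rel_abs_dev = sum(abs(r) / o for r, o in zip(resid, obs)) / n
    std_dev = sqrt(sum((r - mean_dev) ** 2 for r in resid) / n)

    mu_o, mu_p = sum(obs) / n, sum(pred) / n
    sd_o = sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_p = sqrt(sum((p - mu_p) ** 2 for p in pred) / n)
    cov = sum((o - mu_o) * (p - mu_p) for o, p in zip(obs, pred)) / n
    corr = cov / (sd_o * sd_p)  # undefined if either field is spatially uniform

    return {
        "average_deviation": mean_dev,
        "average_absolute_deviation": mean_abs_dev,
        "average_relative_absolute_deviation": mean_rel_abs_dev,
        "standard_deviation": std_dev,
        "correlation_coefficient": corr,
    }


# Hypothetical 2 x 2 observed and predicted fields (pphm).
observed = [[4.0, 6.0], [8.0, 10.0]]
predicted = [[5.0, 5.0], [9.0, 9.0]]
stats = residual_statistics(observed, predicted)
print(stats)
```

To extend the calculation through the modeling day, the same function could
be applied to the nodal values of every hour pooled into one list.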

-------
 c.    "Pattern Recognition" Area Performance Measures

      Considerable information about model performance often can be found
 through the use of "pattern recognition" area performance measures.  Even
 if a comparison between prediction and observation is difficult due to the
 sparsity of the latter data, insight can still be gained through the use
 of the measures described here.

      The spatial and temporal development of the pollutant cloud  is  of con-
 siderable  interest.  Frequently, differences  between prediction and  obser-
 vation can be  spotted quickly by comparing isopleth plots showing  contours
 of constant pollutant concentrations.  The development of the cloud can be
 portrayed graphically in a series of hourly isopleth plots, such as those
 shown in Figures C-13(a) through (e).  These represent predictions for ozone
 generated by the SAI Urban Airshed Model for the Denver Metropolitan region
 on 29 July 1975.  The locations of the measurement stations are also shown,
 as they were in Figure C-8.

     The example illustrated in Figure C-13 is typical of applications involving
 multiple-source, region-oriented issues (SIP/C, AQMP).  However, for
 specific-source issues, the downwind isopleth contours are approximately
 elliptical.  An example of a specific-source isopleth, or "footprint,"
 plot was presented earlier in Figure V-4 in Chapter V.

     Model  performance  can also be characterized by comparing against
 observation the time histories of the size of the area  in which  concentrations
 exceed a certain value.  Such a comparison would provide insight into the
 temporal variation  of prediction-observation differences.   An example of
 such a history is presented in Figure C-14 for ozone in the Denver Metropolitan
 region.  The SAI Urban Airshed Model was run with the meteorology observed on
 28 July 1976, together with emissions for that date and projected emissions
 for 1985 and 2000, to predict the spatial and temporal distribution of ozone
 for each year.  Lines of constant concentration values are also shown.
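
     A time history of this kind can be computed directly from hourly
gridded fields.  The following Python sketch assumes equal-area grid cells
and counts a cell only when its concentration strictly exceeds the threshold;
the function names and the three hourly 2 x 2 fields are hypothetical.

```python
def area_exceeding(field, threshold, cell_area=1.0):
    """Area of the grid cells whose concentration exceeds the threshold."""
    count = sum(1 for row in field for c in row if c > threshold)
    return cell_area * count


def exceedance_history(hourly_fields, threshold, cell_area=1.0):
    """Area above the threshold for each hour, in chronological order."""
    return [area_exceeding(f, threshold, cell_area) for f in hourly_fields]


# Hypothetical ozone fields (pphm) for three consecutive hours.
hourly = [
    [[5.0, 7.0], [9.0, 6.0]],
    [[9.0, 12.0], [14.0, 8.0]],
    [[6.0, 9.0], [10.0, 7.0]],
]

history = exceedance_history(hourly, threshold=8.0)
print(history)
```

Computing one such history from predictions and another from the observed
field, then differencing them hour by hour, would show how the
prediction-observation differences vary in time.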


-------
[FIGURE C-13.  ISOPLETHS OF OZONE CONCENTRATIONS (pphm) ON 29 JULY 1975.
Isopleth interval 1 pphm.  This figure is based on predictions of the SAI
Urban Airshed Model for the Denver Metropolitan region.  Panels:
(a) Hour 0800-0900 MST; (b) Hour 1000-1100 MST; (c) Hour 1200-1300 MST;
(d) Hour 1400-1500 MST; (e) Hour 1600-1700 MST.]


-------
[FIGURE C-14.  SIZE OF AREA IN WHICH PREDICTED OZONE CONCENTRATIONS EXCEED
GIVEN VALUES FOR YEARS 1976, 1985, AND 2000.  This figure is based on
predictions of the SAI Urban Airshed Model for the Denver Metropolitan
region.  Panels: Year 1976 Emissions; Year 1985 Emissions; Year 2000
Emissions.  Horizontal axis: Time of Day By Hourly Interval (Meteorology for
28 July 1976 Assumed).]

-------
     If both the predicted and observed concentration fields are resolved
compatibly to the same scale, the two can be differenced and the residuals
plotted directly as isopleth contour plots.  This may be done either at a
fixed time/event or hourly.  The example shown in Figure C-15 is typical
of such a plot, although it was not derived from observational data.  This
particular figure was calculated by differencing the annual NO2 concentrations
predicted by the EPA's Climatological Dispersion Model (CDM) for two
emissions regions:  one a base case and the other a 17.5 percent reduction
in emissions in downtown Denver.  Since the magnitude of the residuals may
depend strongly on certain atmospheric forcing variables (wind
speed or inversion height, for instance), it can be helpful to normalize
residuals to the forcing variable values.

     Several model performance problems can be spotted qualitatively using
residual isopleth plots.  Some of those that might be apparent are:

     >  Good peak/poor spatial agreement.
     >  Bad peak/good spatial agreement.
     >  Different peak location.

     A  composite measure  can also  be useful  in assessing the relative  peak/
 spatial  performance  of a  model.  The peak-to-overall  indicator can  be  calculated
 at the time of the peak as the  ratio of the mean residual  in the  vicinity of the
 peak (where concentrations are  within 10 percent of the peak, for example) to
 the mean residual  in the  overall region.
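
     The peak-to-overall indicator can be sketched as follows.  This is a
minimal illustration under several assumptions that the text leaves open:
the residual is taken as observed minus predicted, the "vicinity of the peak"
is defined from the observed field, and the function name and sample fields
are hypothetical.

```python
def peak_to_overall(observed, predicted, near_peak=0.10):
    """Ratio of the mean residual near the peak to the mean residual overall."""
    obs = [c for row in observed for c in row]
    pred = [c for row in predicted for c in row]
    resid = [o - p for o, p in zip(obs, pred)]  # assumed sign: observed - predicted

    peak = max(obs)  # assumption: vicinity is defined from the observed field
    near = [r for r, o in zip(resid, obs) if o >= (1.0 - near_peak) * peak]

    mean_near = sum(near) / len(near)
    mean_all = sum(resid) / len(resid)
    if mean_all == 0:
        raise ValueError("overall mean residual is zero; the ratio is undefined")
    return mean_near / mean_all


# Hypothetical 2 x 2 fields (pphm) at the time of the peak.
observed = [[10.0, 20.0], [19.0, 5.0]]
predicted = [[8.0, 16.0], [16.0, 4.0]]

indicator = peak_to_overall(observed, predicted)
print(indicator)
```

A value well above one would suggest that the model performs worse near the
peak than over the region as a whole; a value near one would suggest that
the error is spread evenly.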

 4.   EXPOSURE/DOSAGE PERFORMANCE MEASURES

      The health effects experienced by an individual in a pollutant region
 seem to be a function of both the concentration level and the duration of
 exposure.  The aggregate impact experienced by the total populace would be
 expressed by the sum of the effects on each individual.  The seriousness
 of the pollutant problem would be related not just to the spatial and
 temporal development of the pollutant alone but also to the spatial and
 temporal distribution of the population living beneath it.  Several
 performance measures attempt to gauge model performance on this basis.

-------
[FIGURE C-15.  TYPICAL RESIDUALS ISOPLETH PLOT FOR ANNUAL AVERAGE NO2.]

-------
     In this section we present some of these performance measures,
acknowledging at the outset the difficulty of their computation in practice.
Whether the spatial scale is urban/regional or source-specific, the problem is
essentially the same.  Not only must the predicted and observed concentration
fields be known, but also the population distribution.  All are temporally and
spatially varying.  Conceivably, the observed concentration field may be
estimable from station measurements.  Recording actual population movements
during the modeling day, however, seems a nearly insurmountable task.  In
reconciling these problems, several options seem available; among these are
the following two:

     >  If the  observed concentration field can be  estimated
        acceptably well, both it and  the predicted  field can
        be used with the predicted population  distribution  to
        compute exposure/dosage measures for comparison.  Such a
        predicted distribution is frequently available when multiple-
        source, region-oriented issues are being considered.   To
        characterize diurnal variations in emissions, particularly
        mobile automotive ones, one must estimate the diurnal
        patterns of population movement.  Having done so,  one can
        infer the hourly spatial distribution of population.   How-
        ever, for specific-source issues, population distribution
        is seldom considered.  Since only the emissions from the
        individual source are of interest, those of the same species
        resulting from nearby population-related activities need  not
        be explicitly considered, except to compute a background  con-
        centration over which  the specific-source emissions are super-
         imposed.  Unless  additional  information can  be gathered
         (from  a  traffic planning agency perhaps), population distri-
         bution may not  be available,  even  as a prediction.
      >   If the observed concentration  field is not known acceptably
        well,  computation of the observed  exposure/dosage measures
         cannot be accomplished.  However,  these quantities often can be
        calculated for model  predictions  (presuming a predicted
        population distribution  history is  available).  Even though
        these cannot be compared against  their  observed values,
        they can help characterize  model  predictions.  A model
        sensitivity analysis  can be conducted to  estimate the effect
        of population distribution  on exposure/dosage calculations.
        If the results are sensitive, gathering additional observational data
        might be warranted, as would an expanded effort in predicting
        population movement.

     The exposure/dosage performance measures considered  here fall into
three types:  scalar, statistical,  and "pattern recognition."  We
present in Table C-6 some specific  measures.

a.   Scalar Exposure/Dosage Performance  Measures

     Several performance measures are defined in  terms of concentration
exposure and dosage.  The exposure  is defined to  be the product of the
number of persons experiencing a concentration in excess  of a certain value
and the time duration over which the value is exceeded.   It is expressed
analytically as follows:

              E^m(x,y,\eta) = \int_{t_1}^{t_2} P(x,y,t)\, u[C^m(x,y,t) - \eta]\, dt ,      (C-28)

where E^m(x,y,\eta) is the exposure at a point (x,y) to a concentration
C^m(x,y,t) of species m in excess of a given level, \eta (the NAAQS, for
example); P(x,y,t) is the population level at (x,y) at time t; u is the unit
step function such that

              u(z) = \begin{cases} 1, & z > 0 \\ 0, & z < 0 \end{cases}                    (C-29)

-------
        TABLE C-6.   SOME EXPOSURE/DOSAGE PERFORMANCE MEASURES

    Type                   Performance Measure
    ===================    ==================================================
    Scalar                 a.  Difference for the modeling day in the number of
                               person-hours of exposure to concentrations:
                               1)  Greater than the NAAQS
                               2)  Within 10 percent of the peak.
                           b.  Difference for the modeling day in the total
                               pollutant dosage.

    Statistical            a.  Differences in the exposure/concentration
                               frequency distribution function; differences in
                               the following are of interest:
                               1)  Cumulative distribution function
                               2)  Density function
                               3)  Expected value of concentration
                               4)  Standard deviation of density function
                           b.  Cumulative dosage distribution function as a
                               function of time during the modeled day.

    Pattern recognition    For each hour during the modeled day, an isopleth
                           plot of the following (both for predictions and
                           observations):
                               1)  Dosage
                               2)  Exposure

-------
and \Delta t = t_2 - t_1 is the duration of exposure.  The total exposure
between t_1 and t_2 over a region measuring X by Y can be written as

              E^m_T(\eta) = \int_0^Y \int_0^X E^m(x,y,\eta)\, dx\, dy                      (C-30)

     Since in practice the predicted and observed concentration  fields  are
known only at discrete points on a  ground-level grid, it follows  that the
population function P(x,y,t) must be resolved into a compatible,  discrete
form.  Once this is done, the discrete forms of Eqs. (C-28) and (C-30) can
be written as follows:

              E^m_{ij}(\eta) = \sum_{n=N_1}^{N_2} P^n_{ij}\, u[C^{m,n}_{ij} - \eta]        (C-31)

              E^m_T(\eta) = \sum_{j=1}^{J} \sum_{i=1}^{I} E^m_{ij}(\eta)                   (C-32)

where I and J are the X and Y dimensions of the grid, while N_1 and N_2 are
the starting and ending hours of the summation.
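
     The discrete exposure calculation can be illustrated in Python.  The
sketch below assumes hourly fields indexed as [hour][j][i], a one-hour time
step (so that each exceedance contributes the cell population in
person-hours), and a strict "greater than" test against the level; the
function names and the two-hour, two-cell data are hypothetical.

```python
def exposure_grid(population, concentration, level):
    """Person-hours of exposure above `level` in each grid cell.

    population[n][j][i] and concentration[n][j][i] are hourly values
    on the same grid; one-hour time steps are assumed.
    """
    n_rows = len(concentration[0])
    n_cols = len(concentration[0][0])
    exposure = [[0.0] * n_cols for _ in range(n_rows)]
    for pop_hour, conc_hour in zip(population, concentration):
        for j in range(n_rows):
            for i in range(n_cols):
                if conc_hour[j][i] > level:  # unit step: counts only exceedances
                    exposure[j][i] += pop_hour[j][i]
    return exposure


def total_exposure(exposure):
    """Sum of the cell exposures over the whole grid."""
    return sum(sum(row) for row in exposure)


# Hypothetical data: two hours on a 1 x 2 grid (persons, pphm).
population = [[[100, 200]], [[150, 250]]]
concentration = [[[7.0, 9.0]], [[10.0, 6.0]]]

grid = exposure_grid(population, concentration, level=8.0)
print(grid, total_exposure(grid))
```

Running the same calculation on the predicted and the observed concentration
fields, with a common population history, gives the two totals to be compared.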

     Dosage is defined as  the product of the population at  a given point,
the pollutant concentration to which that population is exposed,  and the
length of time for which the exposure to that  concentration persists.  The
dosage provides a measure  of the  total amount  of pollutant  present in the
total volume of air inhaled by people over the time period  of  interest.  This
may be illustrated as follows.  Let the dosage, D, be  in  units of ppm-person-
hour.  If the volume of air inhaled is V cubic meters  per person-hour, the
quantity of pollutant, Q,  present in the air may be estimated  as

                       Q = DV \times 10^{-6} cubic meters                    (C-33)

If V is assumed to be a constant, then Q is proportional to D and the dosage
D provides a measure of Q.  It may be noted that the dosage provides no
information as to the amount of pollutant inhaled per person.  The dosage
at a point (x,y) may be expressed as
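
              D^m(x,y) = \int_{t_1}^{t_2} P(x,y,t)\, C^m(x,y,t)\, dt                       (C-34)

while the total dosage within an area X by Y is

              D^m_T = \int_0^Y \int_0^X D^m(x,y)\, dx\, dy                                 (C-35)

Expressed in discrete terms these two equations can be written as

              D^m_{ij} = \sum_{n=N_1}^{N_2} P^n_{ij}\, C^{m,n}_{ij}                        (C-36)

              D^m_T = \sum_{j=1}^{J} \sum_{i=1}^{I} D^m_{ij}                               (C-37)
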
      Using Eqs. (C-31) and (C-32) we can calculate two measures of interest:
  we can determine for the predicted and observed concentrations the number
  of person-hours of exposure to concentrations (1) greater than the NAAQS
  and (2) near the peak (within 10 percent, for example).  Using Eqs. (C-36)
  and (C-37), we can determine for the modeling day the total predicted and
  observed pollutant dosage.  By comparing the predicted and observed
  values, the seriousness of any differences between the two can be estimated
  in a way that relates, though crudely, to pollutant health impact.
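
      The dosage calculation can be sketched in the same way.  The Python
  example below assumes hourly fields indexed as [hour][j][i], a one-hour
  time step, and concentrations in ppm; the function names, the sample data,
  and the inhalation volume of 0.5 cubic meters per person-hour are
  hypothetical values chosen only for illustration.

```python
def dosage_grid(population, concentration):
    """Dosage (ppm-person-hours) in each grid cell, assuming one-hour steps."""
    n_rows = len(concentration[0])
    n_cols = len(concentration[0][0])
    dosage = [[0.0] * n_cols for _ in range(n_rows)]
    for pop_hour, conc_hour in zip(population, concentration):
        for j in range(n_rows):
            for i in range(n_cols):
                dosage[j][i] += pop_hour[j][i] * conc_hour[j][i]
    return dosage


def total_dosage(dosage):
    """Sum of the cell dosages over the whole grid."""
    return sum(sum(row) for row in dosage)


def pollutant_volume(dosage_ppm_person_hours, inhaled_m3_per_person_hour):
    """Volume of pollutant in the inhaled air, Q = D * V * 1e-6 (cubic meters)."""
    return dosage_ppm_person_hours * inhaled_m3_per_person_hour * 1e-6


# Hypothetical data: two hours on a 1 x 2 grid (persons, ppm).
population = [[[100, 200]], [[150, 250]]]
concentration = [[[0.07, 0.09]], [[0.10, 0.06]]]

cells = dosage_grid(population, concentration)
d_total = total_dosage(cells)
q = pollutant_volume(d_total, inhaled_m3_per_person_hour=0.5)  # 0.5 is illustrative
print(cells, d_total, q)
```

  The difference between the totals obtained from the predicted and observed
  fields is the scalar dosage measure listed in Table C-6.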

  b.   Statistical  Exposure/Dosage Performance Measures

      Exposure/dosage performance measures have several useful statistical
  variants.  One of these is the difference between the predicted and observed
  exposure/concentration distribution function.  An example of such a function
  is shown in Figure C-16, calculated for ozone in the Denver Metropolitan
  region.  The figure is based on predictions made by the SAI Urban Airshed
  Model using actual emissions and meteorology for 3 August 1976, as well
  as projected emissions for 1985 and 2000.

  [FIGURE C-16.  ESTIMATED EXPOSURE TO OZONE AS A FUNCTION OF OZONE
  CONCENTRATION FOR 3 AUGUST 1976 METEOROLOGY.  This figure is based on
  predictions of the SAI Urban Airshed Model for the Denver Metropolitan
  region.  Horizontal axis: Ozone Concentration (pphm).]

     Certain statistics of the exposure distribution are useful:  the
cumulative distribution function (CDF) itself, the density function (f_E),
the expected value of the pollutant concentration, and the standard deviation
of the density function.  We show in Figure C-17 a representation of
the general shapes taken by the CDF_E and the f_E.
[FIGURE C-17.  GENERAL SHAPE OF THE EXPOSURE CUMULATIVE DISTRIBUTION AND
DENSITY FUNCTIONS.  Labels: CDF_E; C_B (background).  Horizontal axis:
Concentration.]
Incorporated in this figure are two important assumptions:  None of the
population is exposed to concentrations above the peak value, C_P, while
all are exposed to concentrations at least as high as the background value,
C_B.

     The first of these is certainly a valid assumption.  The second may
not be accurate in all circumstances.  Those persons spending their days
indoors within environmentally controlled buildings may experience
concentrations lower than the background value.  Noting this possible
limitation, however, we proceed.

     The CDF_E can be derived from the exposure function defined in Eq. (C-30)
and illustrated with the example in Figure C-16.  It can be expressed as

              \mathrm{CDF}_E(C) = 1 - \frac{E^m_T(C)}{E^m_T(C_B)}                                        (C-38)

The density function, f_E, can be derived from this relation as follows:

              f_E(C) = \frac{d}{dC} [\mathrm{CDF}_E(C)]                                                   (C-39)

 Combining Eqs. (C-28) and (C-30), we can write

              E^m_T(C) = \int_Y \int_X \int_t P(x,y,t)\, u[C^m(x,y,t) - C]\, dt\, dx\, dy                 (C-40)

 From this, we can express its derivative as

              \frac{d}{dC} [E^m_T(C)] = - \int_Y \int_X \int_t P(x,y,t)\, \delta[C^m(x,y,t) - C]\, dt\, dx\, dy   (C-41)

where \delta is the Dirac delta function defined such that \delta(z) is 1
when z = 0 and zero for all other values of z.  The density function can thus
be written as

              f_E(C) = \frac{1}{E^m_T(C_B)} \int_Y \int_X \int_t P(x,y,t)\, \delta[C^m(x,y,t) - C]\, dt\, dx\, dy   (C-42)

 The expected value, \mu_E, and the standard deviation, \sigma_E, are defined as follows:

              \mu_E = \int_{C_B}^{C_P} C\, f_E(C)\, dC                                     (C-43)

              \sigma_E^2 = \int_{C_B}^{C_P} (C - \mu_E)^2 f_E(C)\, dC                      (C-44)

-------
The approximation to the delta function has the form shown in Figure C-18.

[FIGURE C-18.  SHAPE OF \Delta(C), THE APPROXIMATION TO THE DELTA FUNCTION.
Labels: 1.0 (vertical axis); \Delta C.  Horizontal axis: Concentration.]

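Using Eq. (C-45) the discrete form of the density function can be written
in the following form:

              f_E(C_k) = \frac{1}{E^m_T(C_B)} \sum_{n=N_1}^{N_2} \sum_{j=1}^{J} \sum_{i=1}^{I} P^n_{ij}\, \Delta[C^{m,n}_{ij} - C_k]   (C-46)

The expected value and standard deviation then can be expressed as

              \mu_E = \sum_{k=1}^{K} C_k\, f_E(C_k)                                        (C-47)

              \sigma_E^2 = \sum_{k=1}^{K} (C_k - \mu_E)^2 f_E(C_k)                         (C-48)

where K is the number of equally spaced intervals, \Delta C, spanning the
concentration range from C_B to C_P.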
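
     One way to picture the discrete exposure distribution is as a
population-weighted histogram of concentration.  The Python sketch below is
an illustration under stated assumptions, not the report's own procedure:
each person-hour is assigned to exactly one of K equal intervals between the
background and peak values, interval midpoints stand in for the
concentrations C_k, values outside the range are clamped to the end
intervals, and the function name and sample data are hypothetical.

```python
from math import sqrt


def exposure_distribution(population, concentration, c_background, c_peak, k_bins):
    """Person-hour-weighted distribution of concentration over K equal intervals."""
    width = (c_peak - c_background) / k_bins
    mass = [0.0] * k_bins
    for pop_hour, conc_hour in zip(population, concentration):
        for pop_row, conc_row in zip(pop_hour, conc_hour):
            for p, c in zip(pop_row, conc_row):
                k = int((c - c_background) / width)
                k = max(0, min(k, k_bins - 1))  # clamp values at or beyond the ends
                mass[k] += p                    # one-hour steps: persons = person-hours

    total = sum(mass)
    density = [m / total for m in mass]         # fraction of person-hours per interval
    centers = [c_background + (k + 0.5) * width for k in range(k_bins)]

    cdf, running = [], 0.0
    for d in density:
        running += d
        cdf.append(running)

    mu = sum(ck * fk for ck, fk in zip(centers, density))
    sigma = sqrt(sum((ck - mu) ** 2 * fk for ck, fk in zip(centers, density)))
    return centers, density, cdf, mu, sigma


# Hypothetical data: two hours on a 1 x 2 grid (persons, pphm).
population = [[[100, 200]], [[150, 250]]]
concentration = [[[7.0, 9.0]], [[10.0, 6.0]]]

centers, density, cdf, mu, sigma = exposure_distribution(
    population, concentration, c_background=6.0, c_peak=10.0, k_bins=2)
print(centers, density, cdf, mu, sigma)
```

Because every person-hour falls in exactly one interval, the interval
fractions sum to one, and the expected value and standard deviation follow
as simple weighted sums.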


-------
     The quantities described above (the CDF_E, f_E, \mu_E, and \sigma_E) form
the basis for a comparison between prediction and observation.  Differences
in the shape of the CDF_E can be characterized by differences in \mu_E and
\sigma_E^2, and are also revealed by differences in the qualitative shapes
of the f_E.  If these differences are large, model performance may be
judged unacceptable.

     The variation of the cumulative dosage function during the  modeling
day is another means for comparing prediction with observation.   An example
of such a dosage function is shown in Figure C-19, calculated for ozone in
the Denver Metropolitan region.  The figure is based on predictions made
by the SAI Urban Airshed Model.

c.   "Pattern Recognition" Exposure/Dosage Performance Measures

     The performance of a model in predicting exposure and dosage can be
judged qualitatively by comparing isopleth plots of predicted values with a
similar plot showing observed ones.  We present in Figures C-20  and C-21 the
ozone exposure and dosage contours, respectively, predicted by the SAI Urban
Airshed Model for Denver on 3 August 1976.  The population distribution
assumed in each was based on data supplied by the Denver Regional Council
of Governments.  Residential population figures were corrected  temporally
to account for daytime employment patterns.  No attempt was made, however,
to adjust for other shifts during the day.

     In Figure C-20, the cumulative exposure at one-mile intervals  is shown.
Isopleths of exposure to concentrations greater than a certain  value are
included for three different levels.  In Figure C-21, the cumulative dosages
are shown for each point on the same one-mile spaced grid.  In  both  figures,
the interval of time considered was 13 hours, from 500 to 1800  (MST).

[FIGURE C-19.  CUMULATIVE OZONE DOSAGE AS A FUNCTION OF TIME OF DAY FOR
3 AUGUST 1976 METEOROLOGY.  This figure is based on the predictions of the
SAI Urban Airshed Model for the Denver Metropolitan region.  Horizontal axis:
Time of Day (MST), 600 to 1900.]

[FIGURE C-20.  CUMULATIVE EXPOSURE (IN 10^3 PERSON-HOURS) TO OZONE
CONCENTRATIONS ABOVE GIVEN LEVEL IN ONE-SQUARE-MILE GRID CELLS BETWEEN 500
AND 1800 HOURS FOR 3 AUGUST 1976 METEOROLOGY AND 1976 EMISSIONS.  Grid
numbers are listed on left side and top of figure.  This plot is based on
predictions of the SAI Urban Airshed Model for the Denver Metropolitan
region.  Panels: (a) Concentration Greater than 8 pphm; (b) Concentration
Greater than 16 pphm; (c) Concentration Greater than 24 pphm; all for Year
1976 Emissions.]

[FIGURE C-21.  CUMULATIVE OZONE DOSAGES (IN 10^6 PPHM-PERSON-HOURS) IN
ONE-SQUARE-MILE GRID CELLS FROM 500 TO 1800 HOURS (MST) FOR 3 AUGUST 1976
METEOROLOGY AND EMISSIONS IN 1976.  This figure is based on predictions of
the SAI Urban Airshed Model for the Denver Metropolitan region.]

5.   "HYBRID" PERFORMANCE MEASURES

     As noted earlier, model predictions often are more finely resolved
spatially than measurement data.  A consequence of this is the following:
model performance sometimes must be evaluated using performance measures
requiring different classes of data "completeness."  For instance,  the
observed concentration field may not be inferred reliably from station
data even though the predicted field can be well described.  In such a
case, concentration isopleth plots for both could not be constructed and
compared directly.  Still, we would not wish to rely solely on station
performance measures.  To do so, we would sacrifice some of the information
content available on the prediction side of the comparison.

     Several performance measures are "hybrid" ones.  They are designed
for use when a different level of concentration information is available
for prediction than for observation.  We discuss here such a measure, the
basis for which is shown in Figure C-22.
[Figure labels: measurement station; predicted concentration field; actual
concentration field; nearest point at which prediction equals station observation.]
  FIGURE C-22.   ORIENTATION WITH RESPECT TO MEASUREMENT STATION OF NEAREST
                 POINT AT WHICH PREDICTION EQUALS STATION OBSERVATION

-------
In the figure, isopleths are shown for the predicted and actual  concentration
fields.  Only at the measurement station, however, is data available describing
the actual field.  The offset between the two fields nevertheless can be
characterized by determining the vector (distance, azimuthal  orientation)
from the station to the nearest point at which the predicted  concentration
equals the measured value.  This can be done for several hours,  producing
a time history of the distance and orientation of that point.  A plot of
this can be constructed, as shown in Figure C-23.
[Figure labels: compass axes (north, south, east, west) centered on the station;
trace points labeled by hour of day.]
      FIGURE C-23.   SPACE-TIME TRACE OF LOCATION OF NEAREST POINT
                     PREDICTING A CONCENTRATION EQUAL TO THE
                     STATION MEASURED VALUE
     The space-time trace shown in the figure is centered at the measurement
station.  Similar traces could be constructed for each station.  Space-time
correlations could be made to infer the amount and orientation of the
displacement of the two concentration fields.
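
     The search for the nearest matching point can be sketched in Python.
This is a minimal illustration, not part of the original analysis.  It
assumes the predicted field is given on a regular grid, looks for matches
only along grid lines (by linear interpolation between neighboring points),
and measures azimuth clockwise from north.  The function names and the
sample field are hypothetical.

```python
import math


def _crossing(a, b, target):
    """Fraction along the segment a -> b where the value equals target, or None."""
    if a == target:
        return 0.0
    if b == target:
        return 1.0
    if (a - target) * (b - target) < 0:
        return (target - a) / (b - a)
    return None


def nearest_matching_point(field, station_xy, observed, spacing=1.0):
    """Return (distance, azimuth_deg) from the station to the nearest point
    where the predicted field equals the observed value, or None.

    field[j][i] is the prediction at x = i * spacing (east), y = j * spacing
    (north).  Azimuth is measured clockwise from north.
    """
    ny, nx = len(field), len(field[0])
    sx, sy = station_xy
    best = None
    for j in range(ny):
        for i in range(nx):
            # Examine the grid segments running east and north from (i, j).
            for di, dj in ((1, 0), (0, 1)):
                i2, j2 = i + di, j + dj
                if i2 >= nx or j2 >= ny:
                    continue
                t = _crossing(field[j][i], field[j2][i2], observed)
                if t is None:
                    continue
                east = (i + t * di) * spacing - sx
                north = (j + t * dj) * spacing - sy
                dist = math.hypot(east, north)
                azim = math.degrees(math.atan2(east, north)) % 360.0
                if best is None or dist < best[0]:
                    best = (dist, azim)
    return best


# Hypothetical field increasing eastward; the station at x=0, y=1 observes 2.5.
field = [[0.0, 1.0, 2.0, 3.0] for _ in range(3)]
distance, azimuth = nearest_matching_point(field, (0.0, 1.0), 2.5)
```

Repeating the call for each hour gives the (distance, azimuth) pairs that
make up one station's space-time trace.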

-------
          APPENDIX D
SEVERAL RATIONALES FOR SETTING
 MODEL PERFORMANCE STANDARDS

-------


      In Chapter VI of this report, we identify a "preferred"  set of model
 performance measures, the values of which are  helpful  in assessing  the degree
 to which model predictions agree with observations.   It remains  for us to
 decide how "close" these must be in order to judge model performance to be
 acceptably good.   In this appendix, we present four alternate rationales
 for making such decisions:  Health Effects, Control Level Uncertainty,
 Guaranteed Compliance, and Pragmatic Historic.   To maintain perspective
 about each rationale and the problems for which their use may be appropriate,
 we recommend Section D of Chapter VI be read prior to considering this appendix.

 1.   Health Effects Rationale

      Ambient pollutant concentrations are not  themselves our  most funda-
 mental concern but rather the adverse health effects  they produce.   The
 NAAQS are chosen to serve as measurable, enforceable  surrogates  for the
 "acceptable" levels of health impact they imply.  Because health effects
 are of such basic importance, it makes  sense to define model  performance
 in such terms.  However, quantifying the health effects resulting from
 exposure to a specified pollutant level can be a difficult and controversial
 task.  Toxicological studies in laboratories by necessity are performed
 at high concentrations, often at levels and dosages seldom occurring even
 in the most polluted urban areas.  Experiments are conducted on animals
 whose response patterns may not serve as perfect analogues for human behavior.
 Epidemiological studies are confounded by the variety of effects occurring
 simultaneously in  a complex  urban environment.  Consequently,  isolation
 of a  "cause-and-effect"  relationship between health effect and pollutant
 level becomes statistically very  difficult.

     Nevertheless, in this discussion we indicate one means whereby health
effects can be used as a basis for evaluating  the acceptability  of  model
performance.  We postulate the existence of a health effects functional, Φ,
dependent both on concentration level and health effects for all exposed

-------
persons in the polluted region.  This quantity (the area-integrated cumu-
lative health effect) we use as the metric of interest.   If  the  ratio of
the predicted value of Φ to its observed value remains within a certain
tolerance of unity, model performance is judged acceptable.

     Several features of this approach have appeal.   Among these are:

     >  The health effects functional need not be known precisely,
        only its general shape.
     >  The use of area-integrated cumulative health effects as  a
        metric has strong intuitive appeal; it is less sensitive
        than dosage to concentrations not near the peak value.
     >  A transformation of variables reduces the spatial sensitivity
        of the metric, Φ, with more than one spatially distributed
        region mapped into the same value of Φ; this can result
        in an increase in generality of application.
     >  Simplifying assumptions can be invoked to allow computation
        of specific numerical values.
a.   Area Cumulative Health Effects as a Concept

     "Total area dosage" is frequently used as a surrogate for  "total area
health effects."  Mathematically, total area dosage, D_T, can be expressed as

$$ D_T(t_1, t_2) = \int_X \int_Y \int_{t_1}^{t_2} P(x,y,t)\, C(x,y,t)\, dt\, dy\, dx \tag{D-1} $$

where the duration of exposure is Δt (= t_2 - t_1); P(x,y,t) and C(x,y,t) are
the population and concentration at (x,y) at time t; and X and Y represent
the spatial limits of the polluted region.

     However, the concentration C(x,y,t) in this relation and the time
duration of exposure really combine to approximate health effects.  Suppose
that a health effects function exists such that

-------
$$ HE = HE(C, \Delta t) \tag{D-2} $$
Such a function could behave as shown in Figure  D-l, with HE disappearing
only when concentrations approach zero.   Alternatively,  a threshold concen-
tration might exist below which specific effects are either indistinguishable
from a background level  or below the threshold of perception.
[Figure axis label: Concentration]
               FIGURE D-1.   POSSIBLE HEALTH EFFECTS CURVES

     We define a new metric:  the area-integrated cumulative health effects
functional, Φ.  It can be written as follows:

$$ \Phi(\Delta t) = \int_X \int_Y \int_{\Delta t} P(x,y,t)\, HE\left[C(x,y,t),\, t - t_1\right]\, dt\, dy\, dx \tag{D-3} $$

If this function could be evaluated for predicted and "true" values of
P(x,y,t) and C(x,y,t), we could formulate the performance standard such
that their ratio, r, was required to remain within a fixed tolerance of
unity, i.e.,

$$ r = \frac{\Phi_{\text{predicted}}}{\Phi_{\text{observed}}} \geq 1 - \alpha \tag{D-4} $$

where α is some small value (10 percent, for instance)

-------
chosen to represent a maximum acceptable level  of uncertainty  in
aggregate health impact.  It may be noted with  this  standard   that
model acceptability is called into doubt only if the predicted value
of Φ is less than the "observed" value.  This makes sense for the
following reason:  Considering only a perspective based on health
effects, we are concerned that the model predict conditions leading
to health impact at least (or nearly so) as large as actually occurs.
To bound the model on the "upper" side, another rationale must be used
(control level uncertainty, perhaps).
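
     As an illustration only, the one-sided test of Eq. D-4 can be sketched
in Python for gridded data.  The sketch assumes a power-law health effects
function, HE proportional to C**gamma (the form adopted later in Eq. D-20),
ignores any exposure-time weighting, and replaces the integrals of Eq. D-3
by sums over hours and grid cells.  The function names and sample numbers
are hypothetical.

```python
def cumulative_health_effects(population, concentration, gamma, dt_hours=1.0):
    """Discrete stand-in for Eq. D-3 with HE proportional to C**gamma.

    population[h][j][i] is persons in cell (j, i) during hour h;
    concentration[h][j][i] is the concentration in the same cell and hour.
    The unknown scaling constant is omitted because it cancels in the ratio.
    """
    total = 0.0
    for pop_hour, conc_hour in zip(population, concentration):
        for pop_row, conc_row in zip(pop_hour, conc_hour):
            for persons, conc in zip(pop_row, conc_row):
                total += persons * conc ** gamma * dt_hours
    return total


def meets_health_standard(phi_predicted, phi_observed, alpha=0.10):
    """Eq. D-4: acceptable if the ratio is no lower than 1 - alpha."""
    return phi_predicted / phi_observed >= 1.0 - alpha


# One hour, one grid row, two cells; concentrations in pphm.
population = [[[100.0, 200.0]]]
observed = [[[10.0, 20.0]]]
predicted = [[[10.0, 18.0]]]

ok_linear = meets_health_standard(
    cumulative_health_effects(population, predicted, 1),
    cumulative_health_effects(population, observed, 1),
)
ok_square = meets_health_standard(
    cumulative_health_effects(population, predicted, 2),
    cumulative_health_effects(population, observed, 2),
)
```

In this example the same underprediction passes for a linear health effects
function but fails for a quadratic one, since higher exponents weight the
near-peak cells more heavily.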

     The expression in Eq. D-3, however, is of only academic interest
unless it can be made more tractable.  Several of its key limitations
are as follows:

      >   It is a spatial integral.  The values of P(x,y,t) and C(x,y,t)
         change for each new application locale.  Thus it is difficult
         to extend results obtained in one situation to those
         expected in any new one.
     >   The health effects function, HE,  is  dependent on concen-
         tration  and cannot be expressed  directly without being
         "mapped" through the  concentration field.

      However, through a transformation of variables, some difficulties
 can be overcome.  We will replace in Eq. D-3 the double spatial
 integration by a single concentration integration taken over the range
 of ambient values (background, C_B, to the current peak, C_p).  Total
 population within the modeling region at time t, P_T(t), can be written as

 $$ \int_X \int_Y P(x,y,t)\, dy\, dx = P_T(t) = \int_{C_B}^{C_p(t)} w(C,t)\, dC \tag{D-5} $$

 where w(C,t) is the population exposed to a concentration C at time t.
 (By definition, no one is exposed to concentrations lower than the

-------
background value, C_B.)  A pictorial representation of the population
functions P(x,y,t) and w(C,t) is shown in Figure D-2.
[Figure labels: modeling region; isopleth C; isopleth C+dC; the population
at a point; w(C,t)dC is the population within the area between the isopleths.]
         FIGURE D-2.  REPRESENTATION OF SPATIAL AND CONCENTRATION
                      DEPENDENT POPULATION FUNCTIONS
     The equivalence expressed in Eq. D-5 holds without qualification
provided the modeling region is chosen large enough to contain the
background (C_B) isopleth for every hour during the day.  However, this
requirement can be relaxed under the following condition:  No or very
few persons live or work in the area outside the modeling region but
within the C_B isopleth.  In such a case the modeling region need only
be large enough to enclose within it the population of interest.
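
     The change of variables in Eq. D-5 can be illustrated with a short
Python sketch.  It is an assumption-laden example rather than the study's
procedure: population and concentration are taken as values on a common
grid for a single hour, and w(C,t)dC is approximated by the population
falling into each of a few equal-width concentration bins.  All names and
numbers are hypothetical.

```python
def population_by_concentration(population, concentration,
                                c_background, c_peak, n_bins):
    """Approximate w(C,t) dC: persons in each concentration bin.

    population[j][i] and concentration[j][i] describe the same grid cell.
    Cells at or below background fall in the first bin; cells at the peak
    fall in the last bin.
    """
    width = (c_peak - c_background) / n_bins
    bins = [0.0] * n_bins
    for pop_row, conc_row in zip(population, concentration):
        for persons, conc in zip(pop_row, conc_row):
            k = int((conc - c_background) / width)
            k = max(0, min(k, n_bins - 1))
            bins[k] += persons
    return bins


population = [[100.0, 200.0], [300.0, 400.0]]     # persons per cell
concentration = [[0.04, 0.10], [0.17, 0.24]]      # ppm per cell
bins = population_by_concentration(population, concentration, 0.04, 0.24, 5)

# Both sides of Eq. D-5: the spatial sum and the sum over concentration bins.
spatial_total = sum(sum(row) for row in population)
binned_total = sum(bins)
```

The two totals agree because every cell is assigned to exactly one bin,
which is the discrete counterpart of the equality in Eq. D-5.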

     An important observation can now be made:  The health effects func-
tion, HE, can be introduced into both sides of Eq.  D-5 without disturb-
ing the equality.  Doing  so and integrating with respect to time, the
area integrated cumulative health effects (CHE) functional can be trans-
formed into

-------
$$ \Phi(\Delta t) = \int_{\Delta t} \int_{C_B}^{C_p(t)} w(C,t)\, HE(C,\, t - t_1)\, dC\, dt \tag{D-6} $$

It is this equation with which we deal in the remainder of this  section.

b.   Components of the Cumulative Health Effects Functional

     We now examine each of the two major components of the CHE  functional:
the population distribution and health effects function.  For Eq. D-6
to be of any use to us, it must be made analytic in a way that has  a  degree
of generality from one application locale to another.  Consequently,  we
are guided by three principal objectives:  Both w(C,t) and HE(C,Δt) must
be analytic, integrable, and based upon simple, easily understood
assumptions.  To accomplish this, important simplifications are invoked.  The
degree to which they limit the generality of the results is discussed,
although additional research beyond the scope of this study seems desirable.

Population Distribution Function

     The function w(C,t) represents the distribution of population with
respect to both concentration level and time of day.  As a first approx-
imation, we assume it is separable, i.e.,

$$ w(C,t) = w(C)\, f_w(t) , \tag{D-7} $$

where w(C) is the distribution of daytime (workday) population with
respect to concentration level alone at a particular fixed time (the time
of the concentration peak, for example), and f_w(t) is a weighting function
chosen to reflect the diurnal variation in that distribution (residential
vs. commute vs. work hours).

-------
     Within a pollutant cloud,  concentrations  tend to be distributed as
follows:  A distinct peak value occurs,  with concentration  falling off
as a function of radial distance from that peak.  Contours  of constant
concentration (isopleth lines)  surround  the peak  concentrically,  with
concentration diminishing to background  levels.   This radial  distribution
of concentration level is suggestive.  If population  is  distributed  about
the peak such that

$$ P(C) = \int_0^{2\pi} \int_0^{r(C)} p(r^*,\theta)\, r^*\, dr^*\, d\theta , $$
-------
cloud may have drifted some distance (10-30 km)  from the  densest
population centers.  However, our approach here is highly pragmatic.
To render Eq. D-9 soluble, we must invoke simplifying assumptions.
Having done so, comparison of our results with actual data offers  us a
measure of our success.

     Such data has been obtained from ozone exposure/dosage studies
done for the Denver Metropolitan region using the grid-based SAI
Urban Airshed Model.  Shown in Figure  D-3 is the population density
function predicted on 3 August 1976 for the hour from 1300-1400 (1  to
2 p.m.)—the time of the predicted ozone peak (0.24 ppm).  The concen-
tration field predicted by the model was used.  A coarse population
distribution was derived based upon data supplied by the Denver
Regional Council of Governments  (DRCOG) and was adjusted to approximate
employment shifts.  Since the analysis supplied exposure estimates only
above 0.08 ppm which were expressed no more finely than in 0.02 ppm
increments,  an uncertainty band, as shown, exists about each point.

     Several  key observations can be made.  The value of w(C) seems to
become  very  small  at the peak concentration,  i.e., while concentration
levels  may be high near the  peak (within  90%  of it),  the area (and
population)  affected is small.   Also,  an  apparent anomaly  occurs between
0.18 and  0.20 ppm.  This may be  due to any of several causes.  Population
density non-uniformities, however, appear to be the most likely of these.

     Using  the  data contained in Figure  D-3 as a standard for comparison,
we may proceed  in  developing a  simplified, analytic form  for w(C).  We
make two  key assumptions  in  doing so.   First, we  assume  a  shape for the
 radial concentration distribution, C(r), which we invert to give us r(C).
Then we make a simplifying assumption about the population density
 distribution, p(r,θ).

      To estimate C(r), we may idealize isopleth contours as a series of
 concentric circles, as shown in Figure D-4.  Further, we may assume

-------
[Figure axes: horizontal axis, Concentration (ppm), 0 to 0.30; vertical axis
ticks from 0 to 200.  Notes: 1) Date: 3 August 1976.  2) Time of day:
1300-1400 hours.  3) Population corrected to account for employment.]
               FIGURE D-3.  POPULATION DISTRIBUTION AS A FUNCTION OF CONCENTRATIONS.  Based on predictions
                            of the SAI Urban Airshed Model for the Denver metropolitan region.

-------
there to be N isopleths between the peak concentration, C_p, and the
background value, C_B.
[Figure labels: background C_B; peak C_p]
         FIGURE D-4.   IDEALIZED CONCENTRATION ISOPLETHS
      If we assume that for isopleths separated by a constant concentration
 decrement, ΔC, the interisopleth distance grows exponentially (that
 is, the isopleths are separated by a steadily growing distance), then
 we may write an expression for the n-th radius such that

 $$ r_n = \Delta r \sum_{i=0}^{n-1} e^{bi} = \Delta r\, \frac{1 - e^{bn}}{1 - e^{b}} \tag{D-10} $$

-------
Since

$$ C_n = C_p - n\, \Delta C = C_p - \frac{n\,(C_p - C_B)}{N} , \tag{D-11} $$

we can solve for n, substitute this into Eq. D-10, and then generalize
to yield the following:

$$ C(r) = C_p - \frac{\Delta C}{b} \ln\left[ 1 - \left( \frac{1 - e^{b}}{\Delta r} \right) r \right] , \tag{D-12} $$

 where ΔC is the interisopleth concentration decrement and b is chosen
 so that r(C_B) equals the radius of the pollutant cloud (here assumed to be
 the urban radius).  Several typical such concentration distributions
 are shown in Figure D-5.
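
      The idealized profile can be checked numerically with a short Python
 sketch.  It is offered as an illustration under stated assumptions: the
 parameter values are those listed for the Denver test case in Table D-1
 (b = 0.4, a first-isopleth radius of 1 mile, N = 5 isopleths), and the
 forms of Eqs. D-10 and D-12 are as read from this copy of the report.
 The function names are hypothetical.

```python
import math

C_P, C_B, N = 0.24, 0.04, 5      # peak (ppm), background (ppm), isopleth count
DELTA_C = (C_P - C_B) / N        # Eq. D-11: decrement between isopleths (ppm)
B, DELTA_R = 0.4, 1.0            # growth exponent; first-isopleth radius (miles)


def isopleth_radius(n):
    """Eq. D-10: radius of the n-th isopleth as a sum of growing spacings."""
    return DELTA_R * sum(math.exp(B * i) for i in range(n))


def concentration(r):
    """Eq. D-12: concentration at radial distance r (miles) from the peak."""
    return C_P - (DELTA_C / B) * math.log(1.0 - (1.0 - math.exp(B)) * r / DELTA_R)


radii = [isopleth_radius(n) for n in range(1, N + 1)]
```

 With these values the fifth isopleth, which carries the background
 concentration, falls at about 13 miles from the peak, the approximate
 urban radius.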

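      We can now invert this relation to estimate r(C).  Doing so, we can
 write

 $$ r(C) = \frac{\Delta r}{1 - e^{b}} \left[ 1 - e^{\,b (C_p - C)/\Delta C} \right] . \tag{D-12} $$

 Substituting this and its derivative into Eq. D-9, we get an expression
 for w(C) such that

 $$ w(C) = \frac{b}{\Delta C} \left( \frac{\Delta r}{1 - e^{b}} \right)^{2} e^{\,b (C_p - C)/\Delta C} \left[ e^{\,b (C_p - C)/\Delta C} - 1 \right] \int_0^{2\pi} p(r,\theta)\, d\theta \tag{D-13} $$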

-------
[Figure axes: horizontal axis, Radius from Peak, r(C) (miles); vertical axis,
concentration from 0 to 0.24 ppm, with the background level marked.
Notes: 1) Δr is the distance from the peak to the first isopleth, i.e., C_p - ΔC
(the 0.20 ppm isopleth).  2) Peak concentration of ozone is 0.24 ppm.
3) Background concentration is 0.04 ppm.  4) Pollutant cloud radius is 13 miles.]
                      FIGURE D-5.  TYPICAL RADIAL CONCENTRATION DISTRIBUTIONS ABOUT THE PEAK.  Parameters
                                   are chosen to be representative of the Denver metropolitan region.

-------
We now make another key simplifying assumption.  We approximate the value
of the integral by assuming a uniform radial population density, i.e.,

$$ \int_0^{2\pi} p(r,\theta)\, d\theta = 2\pi D \tag{D-14} $$

Substituting this into Eq. D-13, we arrive at the final form for w(C):

                       w(C)  =

where
                                                                     (D-16)
                                                                     (D-17)
                    K3 =
and D is chosen such that the integral of w(C) between C_B and C_p equals
the total population  within the modeled area.

     We have made thus  far a number of significant assumptions.  To  test
their adequacy, we can  select  parameter values appropriate  for the Denver
example, calculate w(C), and compare the results against the data shown
in Figure D-3.  The parameter  values selected are shown  in  Table D-l.

     In Figure D-6 we show the population distribution predicted by
Eq. D-15.  Several observations can be made about its agreement with the
test data.

-------
   TABLE D-1.   SELECTED PARAMETER VALUES IN DENVER TEST CASE

Symbol   Description                                       Value
------   -----------                                       -----
C_p      Peak concentration (ozone).                       0.24 ppm

C_B      Background concentration.                         0.04 ppm

ΔC       Concentration decrement between isopleth          0.04 ppm
         lines (N=5 isopleths).

b        Exponent by which interisopleth distance          0.4
         grows, selected such that C(r) equals C_B
         at r=13 miles from the peak (at the
         approximate urban radius).

Δr       Radius from peak to the first isopleth            1 mile
         (the 0.20 ppm contour).

D        Uniform population density, chosen such           2405 persons/
         that the integral of w(C) between C_B and         sq. mi.
         C_p equals the total population
         (1.275 million).
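
     The parameter values above can be checked against one another with
a short Python sketch.  It is a hedged illustration, not the report's own
calculation: w(C) is written out explicitly as -2*pi*D*r(C)*dr/dC, which is
our reading of Eqs. D-12 through D-14 for a uniform density D, rather than
in the report's constant-coefficient form, and the integral is evaluated
numerically by Simpson's rule.  The function names are hypothetical.

```python
import math

C_P, C_B, DELTA_C = 0.24, 0.04, 0.04     # ppm (Table D-1)
B, DELTA_R, DENSITY = 0.4, 1.0, 2405.0   # exponent; miles; persons per sq. mi.


def radius(c):
    """r(C): radial distance (miles) at which the concentration equals c."""
    return DELTA_R * (math.exp(B * (C_P - c) / DELTA_C) - 1.0) / (math.exp(B) - 1.0)


def w(c):
    """Population per unit concentration, assuming uniform density."""
    e = math.exp(B * (C_P - c) / DELTA_C)
    scale = DELTA_R / (math.exp(B) - 1.0)
    return 2.0 * math.pi * DENSITY * (B / DELTA_C) * scale ** 2 * e * (e - 1.0)


def integrate(f, a, b, n=2000):
    """Composite Simpson's rule with an even number of intervals n."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


total_population = integrate(w, C_B, C_P)
```

Under these assumptions the integral of w(C) from C_B to C_p comes to
roughly 1.275 million persons, consistent with the total population quoted
in the table, and equals the density times the area of a circle of radius
r(C_B).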

-------
[Figure D-6: population distribution predicted by Eq. D-15 for the Denver test case; plot content not recoverable.]
-------
     >  Qualitatively, the shapes seem to agree.
     >  The analytic form of w(C) seems to underpredict the
        distribution of population at higher concentration
        levels.
     >  The anomaly occurring in the data at 0.19 ppm remains
        unaccounted for in the analytic form.

     Despite the seeming limitations imposed by our assumptions,  however,
agreement with the test data seems surprisingly good.  It remains to be
seen in further investigation (beyond the scope of this study)  whether
this result is typical  or merely fortuitous.   We emphasize that results
obtained thus far, while encouraging, should be regarded as preliminary.

     In deriving Eq. D-15, we assumed a uniform population distribution.
We can estimate qualitatively from our results the change in w(C) re-
sulting from variations in this assumption.  The shifts expected  in  w(C)
for a nonuniform population density are illustrated in Figure D-7.   In
all cases the integral  of w(C) is assumed to  equal  the total  regional
population.
[Figure labels: horizontal axis, Concentration, with C_B marked; curves for
"peak occurs in lower density region," "uniform population density," and
"peak occurs in higher density region."]
         FIGURE D-7.   SHIFTS IN w(C) CAUSED BY NONUNIFORM
                       POPULATION DISTRIBUTIONS

-------
     We now consider the variation of w(C,t) with time.  Temporal
changes in the function are caused by two principal effects:

     >  Evolution of the Concentration Field
        - The peak concentration occurring at a time t, C_p(t),
          increases during the morning, usually reaches a
          diurnal peak in the early afternoon, and then
          decreases slightly by late afternoon.
        - The overall radius of the pollutant cloud, r(C_B),
          increases up to the time of the peak.
        - As the day progresses near-peak concentrations
          "spread out," that is, the percentage of the total
          cloud area having concentrations near the current-
          hour peak (say, within 20% of it) increases during
          the day.
     >  Population Shifts
        - Urban areas have two distinct patterns of popula-
          tion distribution during the day:  residential
          (non-work) and employment (workday).  These are
          separated by two peak-traffic commute periods.
        - A percentage of the population during the day is
          mobile, traveling from one point  to another.

 We have assumed here that the total impact of these effects can be
 approximated  by  a separable weighting  function, fw(t),  applied to the
 function  w(C).   The extent to which this  is valid needs to be verified
 by additional  investigation.  Yet, as  a first approximation it has
 some plausibility,  and it allows  us to proceed  to an  analytic result
 for model performance standards—our principal  objective.

 Health Effects Function

      Health effects resulting from exposure to polluted air manifest
 themselves in many ways, each varying in the symptom it produces and

-------
the seriousness of its impact.   Among such effects  are  the  following:
bronchial irritation, reduced lung function, enzyme damage,  eye  irri-
tation, dizziness, and coughing.  Some of these manifest themselves as
noticeable but low-level discomfort; others produce more serious impact
such as aggravation of respiratory illness.  Equating each  effect on an
absolute scale and relating their aggregate weighted impact directly to
ambient pollutant levels, however, is a formidable task.  Efforts at doing
so have been subject to uncertainty and controversy.  To overcome these
difficulties, we resort to several conceptual simplifications.  Rather
than differentiating between individual health effects, we collapse them
together into a single function, whose "seriousness" is dependent on
concentration level, C, and duration of exposure, Δt.  We represent
this by the following

$$ HE = HE(C, \Delta t) \tag{D-19} $$

     We now make an intuitive appeal.  While we may not know the value
of HE  in an absolute sense, we  observe that its value  increases, that
is, the  HE gets "worse," as concentration levels rise  and the duration
of exposure increases.  Further,  because  health effects at higher con-
centrations and durations are more serious, we expect  HE to grow faster
than linearly with increasing C (and probably Δt).  We also can expect
HE to  exist even  at very low values  of C, though these effects  may be
small, perhaps  below the threshold of human perception.  Qualitatively,
 the shape  of  HE might look  as shown in Figure  D-8.

       Based on the reasons noted above, we can make a useful approximation.
 We assume that HE is separable, one part dependent on C and the other on Δt,
 and that it can be described by the following simple relation:

 $$ HE(C, \Delta t) = A\, C^{\gamma} f_{HE}(\Delta t) \tag{D-20} $$

 where A is a scaling constant (whose value we need not know, as we shall
 observe later); γ is a "shaping" parameter whose value is likely to be

-------
 greater than one, i.e., nonlinear; and f_HE(Δt) is a weighting function
 dependent solely on exposure time.

[Figure panels: (a) Variation With Concentration, horizontal axis
Concentration, C; (b) Variation With Exposure Time, horizontal axis
Exposure Time, Δt, with a threshold marked.]
       FIGURE D-8.    EXPECTED SHAPE OF HEALTH EFFECTS FUNCTION


 c.  Analytic Solution of the Cumulative Health Effects Functional


      Having now specified analytic forms for the population distribution
 function, w(C,t), and the health effects function, HE(C,Δt), we may
 proceed to evaluate the area-integrated cumulative health effects
 functional, Φ, as it was defined in Eq. D-6.  We may rewrite Φ as follows:

 $$ \Phi(\Delta t) = \int_{t_1}^{t_2} f_w(t)\, f_{HE}(t - t_1)\, dt \int_{C_B}^{C_p} w(C)\, A\, C^{\gamma}\, dC = F(\Delta t)\, \Phi(C_p) , \tag{D-21} $$

where C_p is the peak concentration experienced during the day.


     Using relations developed previously, we may evaluate Φ(C_p).  Its
value is

-------
$$ \Phi(C_p) = A \int_{C_B}^{C_p} w(C)\, C^{\gamma}\, dC , \tag{D-22} $$

with w(C) taken from Eq. D-15.  Though no completely general solution
exists to this equation, the integral may be evaluated in closed form for
each integer value of γ, the health effects function shaping parameter.
A point-wise analytic solution to Eq. D-22 thus exists.

d.   Calculation of Minimum Allowable Predicted Peak

     As noted in Eq. D-4, the model performance standard could be
specified in terms of a minimum allowable ratio of the "predicted"
to "measured" values of Φ,

     r = \frac{\Phi(C_{pp})}{\Phi(C_{pm})} = \frac{A \int_{C_B}^{C_{pp}} w(C)\, C^{\gamma}\, dC}{A \int_{C_B}^{C_{pm}} w(C)\, C^{\gamma}\, dC}     (D-23)

where Cpp is the predicted peak concentration and Cpm is the measured
peak value.  By writing the standard in this form, an important simpli-
fication results:  Two parameters, being constant, appear outside the
integrals in the numerator and denominator of Eq. D-23.  Since their
values in both are equal, they cancel.  By this means, we eliminate the
need for "knowing" the health effects function scaling coefficient, A,
and the population distribution scaling constant, K0.  With the
rationale we present here, uncertainty associated with both, while
appreciable, thus does not affect the setting of performance standards.

     We can invert Eq. D-23 to solve for the minimum allowable ratio
of predicted to measured peak concentration value.  We do so for the
Denver example discussed earlier, presenting the results in Figure
D-9.  We show results for several representative values of γ and r.
If health effects varied linearly with concentration and r equaled
0.90, for instance, any predicted peak higher than 64 percent of the
measured peak value would be acceptable.  Similarly, if health effects
were a cubic function of concentration and r = 0.90, the predicted peak
would have to exceed 80 percent of the measured value.
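
     The inversion of Eq. D-23 can be sketched numerically as follows.  This
is an illustration only.  The population weighting w_example, the lower limit
c_b, and all function names are assumptions made for the sketch; they are not
the Denver fit behind Figure D-9, so the printed ratios should not be expected
to reproduce that figure.

```python
import math

def phi(c_peak, gamma, w, c_b=0.0, n=2000):
    """Integral of w(C) * C**gamma from c_b to c_peak by Simpson's rule.
    The scaling coefficient A is omitted because it cancels in the ratio r."""
    if n % 2:
        n += 1
    h = (c_peak - c_b) / n
    total = 0.0
    for i in range(n + 1):
        c = c_b + i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * w(c) * c ** gamma
    return total * h / 3.0

def min_peak_ratio(r, gamma, w, c_pm, c_b=0.0, tol=1e-9):
    """Smallest Cpp/Cpm for which phi(Cpp)/phi(Cpm) >= r (bisection).
    Assumes w(C) >= 0 so that phi increases with the peak."""
    target = r * phi(c_pm, gamma, w, c_b)
    lo, hi = c_b, c_pm
    while hi - lo > tol * c_pm:
        mid = 0.5 * (lo + hi)
        if phi(mid, gamma, w, c_b) < target:
            lo = mid
        else:
            hi = mid
    return hi / c_pm

def w_example(c):
    # Hypothetical weighting: fewer people exposed at higher concentrations.
    return math.exp(-c / 0.1)

for gamma in (1, 2, 3, 4):
    print(gamma, round(min_peak_ratio(0.90, gamma, w_example, c_pm=0.2), 3))
```

     Because A and K0 multiply both integrals, neither appears in the code;
only the shape of w(C) and the exponent γ influence the result.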

     Several decisions must be made in determining a final value for a
performance standard based upon this health effects rationale:  A
minimum acceptable value must be chosen for r, the ratio of predicted
to measured area-integrated cumulative health effects; and a judgment
must be made about the maximum likely value of γ, the exponent of
concentration in the health effects function.  Possible values for use
might be r = 0.90 and γ = 3 or 4.  For reference, we
note that for γ = 10, the minimum allowable ratio of predicted to
measured peak is 94 percent.

[FIGURE D-9. MINIMUM ALLOWABLE RATIO OF PREDICTED TO MEASURED PEAK CONCENTRATION VALUE. Horizontal axis: Exponent of Health Effects Function, γ.]

 e.  The Health Effects Rationale:  A Summary

     A model performance standard based upon pollutant health effects
has intuitive appeal.  For this reason the rationale presented in this
section is of interest.  Among the advantages it offers are the
following:

      >  It is general enough to be applied in many different
         locales and applications; while parameters of the method
         are application-dependent, the method itself is much less so.
      >  It is  analytic and based upon  easily derived parameter
         values.
      >  The test for model acceptability is based upon a
         simple comparison of predicted and measured  peak
         concentration values.
      >  Many of the sources of uncertainty in  the method drop
         out of its final formulation.
      >  Results can be condensed into  a single figure  such  as
         that shown in Figure D-9.

 Similarly, the rationale has several  limitations:

      >  Only a lower bound on the  allowable  difference between
         predicted and measured peak is provided; a prediction
         in excess of the measured peak  (even by a great deal)
         is not sufficient to reject a model  on health effects
         grounds since the model predicts effects at least as
         great  as those  actually existing.
      >  The method does not evaluate explicitly a model's
         spatial or temporal behavior.

       The  rationale presented here should be regarded  as a  preliminary
  method.  While meriting additional consideration, the method and many
  of its assumptions need to be examined critically.  Among  the funda-
  mental questions for which answers  need to  be sought  are  the following:

       >  On what basis  do we select the minimum allowable  ratio
          of area-integrated cumulative health effects?
    >  What value  of  health  effects  exponent  is most appropriate?
    >  Does the population distribution, w(C), always repro-
       duce the data as well as indicated in Figure D-6?
       Does it need to?
    >  Is w(C,t) really a separable function, as assumed?
       What about HE(C,Δt)?
    >  Are health effects really related to peak  concentra-
       tion and  exposure time in the fashion assumed  here?
       What about those  who  work in  environmentally controlled
       buildings and  may thus be isolated from full exposure
       to ambient concentration levels?

     We feel the rationale presented here has a number of advantages.
We also feel it requires a careful review and some additional examination,
particularly as regards the questions noted above.

2.   Control  Level Uncertainty  Rationale

     In order to reduce peak ambient concentrations in an airshed from  a
particular level  to one at or below  the NAAQS, reduction of emissions  into
that airshed is  required.  The  degree of that reduction, however, is
dependent on the amount  by which  the current  peak level exceeds the
standard.  Uncertainty in our knowledge of  the current peak concentration
(due either to measurement or modeling  limitations) translates into cor-
responding uncertainty in the amount of emissions control we must require.
This direct relationship, though generally a highly nonlinear one, forms
the basis for another rationale for setting model performance standards.
Its guiding principle is as follows:  Uncertainty in the percentage of
emissions control  required (PCR)  must be  kept to within certain allowable
bounds.

      In  this  section  we  discuss this Control  Level Uncertainty  (CLU)
rationale.  We first  indicate for a  specific pollutant  (ozone)  how one
may proceed from PCR  bounds  to  equivalent allowable tolerances  on the
 difference  between the predicted and measured peak  concentration.  We then
present one means whereby the PCR bounds can be determined from the
economics of pollution control costs.  Several benefits derive from
use of the CLU rationale, among which  are the following:

     >  It makes explicit the relationship between model  per-
        formance limits and the maximum acceptable level  of
        uncertainty in estimates of regional  emissions
        control.
     >  It provides a structure whereby model performance limits
        also can be related to equivalent uncertainty bounds
        on the total regional cost of pollution control  equipment.

     The rationale presented here is a useful complement to the Health
 Effects (HE) rationale presented earlier.  We noted in discussion of that
 rationale that it could not provide an upper bound on the maximum
 allowable difference between predicted and observed peak concentration
 levels.  It merely required that the predicted peak be greater than a
 fraction (near unity) of the measured peak, i.e., Cpp ≥ βCpm, where β is
 near unity (e.g., 0.9).  Were Cpp to be larger than Cpm, no health effect
 penalty would be incurred by designing a control strategy based upon Cpp.
 Rather, the principal penalty would be an economic one:  The cost of control
 would be greater than that actually required.  It is in setting the upper
 bound on the allowable value of Cpp - Cpm that the CLU rationale has its
 greatest value, since it addresses directly the cost of control.

      We can generalize  this  point as  follows:  The greatest cost of under-
 prediction of the peak concentration  lies in the underestimation of
 health impact, while the greatest consequence of overprediction is the
 extra economic cost associated with unnecessarily imposed control.
 Health Effects and CLU,  then, are compatible rationales.  If the predicted
 peak is required to satisfy K1 ≤ Cpp - Cpm ≤ K2, then it seems reason-
 able that K2 be selected based upon the CLU rationale with K1 chosen to
 be the lesser of the values determined by the HE and CLU rationales.

 a.    The  Relationship  Between CLU  and  the  Concentration Peak

      In most cases  a highly nonlinear  relationship  exists between primary
 emissions and the ambient concentrations that result  from them.  The
 dynamic behavior of the atmosphere is  complex, as are the chemical changes
 undergone by dispersing pollutants carried by it.   Simplifying  assump-
 tions, however, can sometimes be made.  We consider here  one example in
 which this can be done.

      For urban regions in which certain specific criteria are met (Hayes, 1977),
 the ozone production resulting from various mixtures of non-methane precursor
 hydrocarbons (NMHC) and oxides of nitrogen (NOx) can be represented by means
 of an ozone isopleth diagram such as the one shown in Figure D-10 (EPA,
 1976).  Whether the use of such a diagram is justified in a given region
 depends heavily on a number of factors, among which are the prevailing
 meteorology, solar insolation, emissions type/timing/geometry, terrain type/
 complexity, and the presence of large upwind pollutant sources.

       If a region meets the criteria, however, an isopleth diagram may be
 used as an approximation relating regional emissions to consequent peak
 ozone levels.  The region-wide cutback in emissions of precursor HC and
 NOx necessary to reach the NAAQS from a given starting point can then be
 calculated, given a background ozone value (usually about 0.04 ppm) and
 a control mix (NMHC versus NOx cutback).  Usually, in urban areas the
 emphasis has been on NMHC reduction.  The starting point often is defined
 in one of two ways:  It is specified by a peak O3 measurement and either an
 NMHC/NOx ratio typical of ambient conditions prevailing in the early
 morning (6-9 a.m.) or specific concentrations of either of the precursors.
 Most frequently, it is the first of these methods that is used.

       Because the chief value of the isopleth diagram is in its use in
 estimating regional emissions cutback, it is helpful to replot the
 isopleth diagram as shown in Figure D-11 (Hayes, 1977).  In doing so,
 percentage control required (PCR) can be highlighted explicitly.  While
 in principle any mix of NMHC and NOx control could be considered, the
 example shown assumes that only HC control is employed.  That is, per-
 centage control required (PCR) is equivalent to percentage hydrocarbon
 control required (PHCR).

[FIGURE D-10. PROTOTYPICAL ISOPLETH DIAGRAM. Isopleths of O3/oxidant (ppm) from 0.08 to 0.36; horizontal axis: NMHC (ppmC). Source: EPA (1976b).]

[FIGURE D-11. THE ISOPLETH DIAGRAM REPLOTTED. Curves of peak O3 (ppm); horizontal axis: NMHC/NOx. Note: No change in NOx level and no O3 background concentration were assumed.]

     The PHCR diagram in Figure D-11 may be used in the following way
to deduce model performance standards.  First, the measured peak ozone
concentration and the appropriate 6-9 a.m. NMHC to NOx ratio together
define a unique point on the PHCR diagram.  The nominal PHCR is thus
identified.  Then, by defining an allowable band about the nominal PHCR
(say ± a, where a is some small value), we can identify directly an
equivalent band about the measured peak ozone value.  A model predicting
an ozone peak within that allowable band would be judged as acceptable
under this rationale.

     We can illustrate the technique by means of an example.  Suppose the
measured peak ozone was 0.16 ppm and the 6-9 a.m. NMHC/NOx was estimated
to be 9.5.  This point is denoted on the figure as A.  From Figure D-11,
we see that the PHCR is about 70 percent.  If we allow an uncertainty in
the PHCR of ± 10 percent, we see that the value based upon model predic-
tions of the peak must lie between 60 and 80 percent.  The corresponding
values of peak ozone are determined from points C and B, respectively, on
the PHCR diagram.  For a model to be judged as acceptable, it must
predict an ozone peak value, Cpp, such that 0.122 ≤ Cpp ≤ 0.24 ppm or
76 ≤ Cpp/Cpm ≤ 150 percent.
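
     The arithmetic of this example can be checked with a short sketch.  The
three tabulated pairs are the values quoted above for points C, A, and B;
the linear interpolation between them and the function names are assumptions
of the sketch, since the actual PHCR curve is not linear.

```python
# (PHCR percent, peak ozone ppm) at NMHC/NOx = 9.5: points C, A and B.
PHCR_POINTS = [(60.0, 0.122), (70.0, 0.16), (80.0, 0.24)]

def peak_for_phcr(phcr):
    """Peak ozone for a given PHCR, by linear interpolation (an assumption)."""
    for (x0, y0), (x1, y1) in zip(PHCR_POINTS, PHCR_POINTS[1:]):
        if x0 <= phcr <= x1:
            return y0 + (y1 - y0) * (phcr - x0) / (x1 - x0)
    raise ValueError("PHCR outside the tabulated range")

def allowable_peak_band(nominal_phcr, a):
    """(lower, upper) peak ozone for PHCR within nominal_phcr +/- a."""
    return peak_for_phcr(nominal_phcr - a), peak_for_phcr(nominal_phcr + a)

measured_peak = 0.16
low, high = allowable_peak_band(70.0, 10.0)
print(low, high)
# Band expressed as a percentage of the measured peak.
print(round(100 * low / measured_peak), round(100 * high / measured_peak))
```

     With a = 10 the band limits fall on the tabulated points themselves, so
the percentages agree with the 76 and 150 percent quoted in the text.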

     Several general observations  may be made about the above results,
though we caution that they are particular to ozone as a pollutant.
Among the observations are the following:

     >  Because of the characteristic shape of ozone PHCR diagrams,
        the upper value of the allowable tolerance band is less
        restrictive than the lower one.  This is illustrated clearly
        in the example.
     >  The allowable band for Cpp is always bounded on the upper
        and lower side (as contrasted with the HE rationale, which
        calculates only a lower bound).
     >  In those cities for which use of the ozone isopleth shown
        in Figure D-11 is appropriate and where the 6-9 a.m.
        NMHC/NOx is greater than about 5 or 6, the width of the
        allowable band for Cpp is not strongly sensitive to the
        value of NMHC/NOx.

b.   The  Relationship Between CLU and Control  Cost

     While the allowable uncertainty in control level (± a in the above
example) may be set in many ways, we examine here one important means to
do so:  the explicit use of regional pollution control costs, if these can
be specified unambiguously.  We might, for instance, choose as our guiding
principle the following:  The uncertainty in the total cost of regional
pollution control should not be greater than a certain value δ.  We may
restate this in terms of model performance.  The level of control deriving
from the predicted peak, Cpp, should not differ in cost by more than a
certain amount from that level determined based upon the measured peak, Cpm.

     To proceed we must define the total regional  cost of pollution control,
TC.   Depending on the level  of control  required, alternative  regional
control  strategies can be  designed.  The cost of each generally  can be
specified, at least in approximate terms.   By plotting the cost  of a  series
of "preferred" strategies  against the level  of  control they  achieve,  TC
 can be determined, as shown in Figure D-12.

      Several aspects of the TC curve should be noted.  While TC is zero
  for a PCR of zero, any non-zero value of PCR has associated with it a
  minimum, non-zero cost.  Thus, the TC curve really "begins" with a step
  function at PCR = 0.  TC rises quickly at first as many fixed costs
  of control are incurred.  The cost then increases more slowly as fixed
  costs are spread over greater values of PCR.  Finally, at high levels of
  PCR, each additional amount of control becomes more difficult (and more
  expensive) to achieve.  The TC function, consequently, rises rapidly.

[FIGURE D-12. TOTAL REGIONAL CONTROL COST AS A FUNCTION OF THE LEVEL OF CONTROL REQUIRED. Horizontal axis: Percentage Control Required (0 to 100).]

     Once the total cost function has been defined, the allowable band for
the predicted ozone peak can be found in the following way:

     >  Step 1.   The nominal control level PCR0 can be deter-
                  mined using a diagram such as that in Figure D-11.
                  With all-NMHC control as considered in deriving
                  that figure, PCR0 is identical to PHCR0.
     >  Step 2.   The nominal control cost, TC0, can be found using
                  a TC diagram similar to the one in Figure D-12.
     >  Step 3.   The maximum and minimum allowable TC values then
                  can be calculated and the corresponding bounds
                  on PCR determined.
     >  Step 4.   Using the PHCR diagram once again, the allowable
                  bounds on predicted peak ozone can be found by
                  employing the PCR bounds found in Step 3.
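
     The four steps can be sketched as table look-ups.  Both tables below are
hypothetical placeholders (three entries of the first echo the worked ozone
example), and the piecewise-linear interpolation, the cost units, and the
function names are assumptions of the sketch rather than part of the method.

```python
def interp(x, points):
    """Piecewise-linear interpolation through (x, y) points sorted by x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("value outside tabulated range")

def inverse(points):
    """Swap the columns of a monotonically increasing table."""
    return sorted((y, x) for x, y in points)

# Hypothetical PHCR diagram slice at one NMHC/NOx ratio: (peak O3 ppm, PCR %).
PEAK_TO_PCR = [(0.10, 45.0), (0.122, 60.0), (0.16, 70.0), (0.24, 80.0), (0.36, 88.0)]
# Hypothetical total-cost curve: (PCR %, cost in arbitrary units).
PCR_TO_TC = [(1.0, 20.0), (20.0, 40.0), (60.0, 55.0), (80.0, 75.0), (95.0, 140.0)]

def cost_based_peak_band(measured_peak, delta):
    pcr0 = interp(measured_peak, PEAK_TO_PCR)           # Step 1: nominal PCR
    tc0 = interp(pcr0, PCR_TO_TC)                       # Step 2: nominal cost
    pcr_lo = interp(tc0 - delta, inverse(PCR_TO_TC))    # Step 3: PCR bounds
    pcr_hi = interp(tc0 + delta, inverse(PCR_TO_TC))
    peak_lo = interp(pcr_lo, inverse(PEAK_TO_PCR))      # Step 4: peak bounds
    peak_hi = interp(pcr_hi, inverse(PEAK_TO_PCR))
    return pcr0, tc0, (pcr_lo, pcr_hi), (peak_lo, peak_hi)

print(cost_based_peak_band(0.16, 5.0))
```

     The first table entry of the cost curve starts at a non-zero cost for a
small PCR, a crude stand-in for the step at PCR = 0 described for the TC curve.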

     The above procedure is a straightforward one, creating a
structure in which control cost uncertainty can be considered explicitly.
The example presented, however, is appropriate only for considering  ozone
in those regions having ambient conditions simple enough to be represented
by an isopleth diagram.  Extension of the procedure to other pollutants
and into regions of greater atmospheric complexity requires that additional
research be conducted beyond the scope of the current effort.

3.   Guaranteed Compliance Rationale

     As formulated in the federal  regulations,  the NAAQS are explicit,
with maximum  pollutant  levels  specified that must not be exceeded with
greater than  a certain  frequency.   Peak one-hour concentrations of ozone,
for  instance, must not  exceed  0.08 ppm more often than  once  per year.
With the  standards written in  such an absolute  fashion, it may be argued
that little room  exists for uncertainty about achieving compliance.  Under
such circumstances,  a model's  performance should be  constrained to
 "guarantee" that  its use will  not lead to underestimating  the  degree of
emissions control required.

     Model behavior can affect significantly the likelihood of meeting
 the NAAQS.  In those regions currently in noncompliance,  the effective-
 ness of candidate control strategies can be assessed only by means  of
 model  predictions of the peak concentrations resulting from each.   If a
 model  systematically underpredicts the peak value for concentrations
 near the NAAQS,  the adequacy of controls might  be overestimated.   Similarly,
 if the model overpredicts the peak,  controls designed  using it might be
 excessive.

 a.   Description of the  GC  Rationale

      With  the above in mind,  we  examine  the Guaranteed Compliance  (GC)
 rationale for setting model  performance  standards.   We state  its guiding
 principle as follows:   Compliance with  the NAAQS must be  "guaranteed,"
with all model uncertainty on the conservative side even if it means
introducing a systematic bias into model predictions.  The term
"guaranteed" should be taken here in a limited sense.  We intend it
to mean that "the probability is very small" that a model will predict
a peak value less than the standard when its actual value is greater.

     We illustrate this principle using the diagrams in Figures D-13
and D-14.  In these figures we illustrate two models, one "conservative"
(Figure D-13) and the other "nonconservative" (Figure D-14).  For each,
we show two cases:  an actual peak concentration, CA, higher than the
NAAQS, CS, and one near the standard.  We represent the probability density
function of the model as f(C) and the expected value of the predicted peak
as C̄.  Two types of uncertainty affect a model's performance.  The first
includes error in model inputs and uncertainty in the values of the model
parameters themselves.  These affect the shape of f(C).  Uncertainty of
the second type is due to the inability of the model formulation to re-
present reality fully.  The difference between the expected model predic-
tion, C̄, and the actual value, CA, of the peak concentration is a measure
of the effect of formulation errors.  As we define it here, a "conserva-
tive" model is one for which the value of C̄ exceeds CA, while for a "non-
conservative" model the reverse is true.  In both figures, the shaded area
A represents the probability that the model will predict a peak concentra-
tion less than the standard at the same time the actual value is greater.

     With the GC rationale, we want to ensure that A remains acceptably
small.  In mathematical terms, we insist that

     A = \int_{C < C_S} f(C)\, dC \le \epsilon     (D-25)

where ε is some suitably small number.  From the figures we see that A
can be kept small only if C̄ exceeds CA.  Under the requirements of the
GC rationale, only a model having these characteristics would be judged
acceptable.
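
     To make the size of A concrete, the sketch below evaluates it for a
normally distributed f(C).  The normal shape, the spread sigma, and the
function name are assumptions of the sketch; the text does not prescribe a
form for f(C).

```python
import math

def prob_below_standard(mean_pred, sigma, c_s):
    """A = P(C < C_S) when the predicted peak is normal(mean_pred, sigma)."""
    z = (c_s - mean_pred) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

c_s = 0.08      # ozone standard quoted in the text, ppm
sigma = 0.01    # hypothetical spread of model predictions, ppm
for mean_pred in (0.07, 0.08, 0.09, 0.10):
    print(mean_pred, round(prob_below_standard(mean_pred, sigma, c_s), 4))
```

     Under this assumption A equals one half when the expected prediction sits
exactly at the standard and shrinks as the expected prediction moves above it.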

[FIGURE D-13. UNCERTAINTY DISTRIBUTION FOR A CONSERVATIVE MODEL. Panels: (a) Peak Concentration Higher than the NAAQS; (b) Peak Concentration Near the NAAQS. Horizontal axis: Peak Concentration, with the NAAQS (CS) and ACTUAL (CA) values marked.]

[FIGURE D-14. UNCERTAINTY DISTRIBUTION FOR A NONCONSERVATIVE MODEL. Panels: (a) Peak Concentration Higher than the NAAQS; (b) Peak Concentration Near the NAAQS. Horizontal axis: Peak Concentration, with the NAAQS and ACTUAL values marked; the curve shown is f(C).]

     A practical consideration now becomes important.  For peaks near
the NAAQS, we have no way of knowing the actual peak, CA, whose value
we are trying to predict.  This is clearly so.  Until emissions control
has been implemented and ambient conditions "improve," we cannot estimate
CA with measurement data.  Our strategy using the GC rationale is as
follows:

     >  Step 1.   We assume CA = CS and estimate the amount by
                  which C̄ must exceed CA in order that A ≤ ε.
     >  Step 2.   We then use the model to predict the peak under
                  current (uncontrolled) conditions, C̄*, for which
                  we have measurement data to estimate the current
                  peak, CA*.
     >  Step 3.   To judge acceptability, we require the model
                  prediction, C̄*, to exceed CA* by as much as C̄
                  exceeded CA when CA = CS.  Actually, this is a
                  bit more complicated.  Since CA* is based upon
                  measurements, it is subject to instrumentation
                  error.  We know CA* only in terms of a measured
                  value and its probability density function.  There-
                  fore, we must consider the comparison of CA* and
                  C̄* statistically, requiring the probability that
                  C̄* exceeds CA* by C̄ - CA to be greater than some
                  large value (near 1.0).
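
     The statistical comparison in Step 3 can be sketched as follows.  The
sketch assumes that instrumentation error is normally distributed about the
measured value; that assumption, the threshold p_required, the numbers in the
usage lines, and the function names are all illustrative and are not taken
from the text.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def passes_gc_test(predicted_peak, measured_peak, sigma_meas, margin,
                   p_required=0.95):
    """Probability that the prediction exceeds the true current peak by at
    least `margin`, with the true peak taken as normal(measured, sigma_meas)."""
    z = (predicted_peak - margin - measured_peak) / sigma_meas
    prob = normal_cdf(z)
    return prob, prob >= p_required

# Hypothetical numbers: measured peak 40, instrument sigma 2, margin 10.
print(passes_gc_test(50.0, 40.0, 2.0, 10.0))
print(passes_gc_test(56.0, 40.0, 2.0, 10.0))
```

     A prediction that exceeds the measured value by exactly the margin passes
with probability one half only, so a larger overprediction is needed once
measurement error is acknowledged.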

     We have invoked several important assumptions here, whose general
validity would require further verification if the GC rationale were to
be applied in judging model  performance.  Among them are the following:

     >  C̄ maintains the same relationship to CA for ambient condi-
        tions ranging from current ones to those characterizing
        compliance with the NAAQS.
     >  The probability density function, f(C), is known or
        can be determined, as can C̄.
     >   Instrumentation uncertainty can be characterized,
        allowing Step 3 to be accomplished.

     There are several difficulties associated with the GC rationale
approach, however, some of which are conceptual  and some  practical.
Among the most important of the conceptual difficulties is the intro-
duction of a conservative bias into model  predictions.   By insisting
that the model "overpredict" peak concentrations, almost certainly
we will select abatement strategies requiring more control than needed.
Difficulties of the practical kind also can be significant.   For most
models, determination of f(C) is a difficult (and usually impractical)
process.  The uncertainty in predicting the peak is partially due to
uncertainty in the data input to the model.  Since the model results are
related to inputs only in a complex and nonlinear way, estimating the
output uncertainty distribution in terms of the input error distributions
seldom can be done directly.  While a Monte Carlo-type of analysis  in
principle  can be conducted,  the number of  model runs required and the
amount of  computing  resources  consumed are so considerable as to render
such an analysis impractical.
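
     For a model cheap enough to run many times, the Monte Carlo idea reduces
to the sketch below.  The toy model (peak inversely proportional to wind
speed), the input distribution, the truncation at u_min, and all names are
assumptions of the sketch; for a full simulation model each sample would be
a complete model run, which is the source of the expense noted above.

```python
import random
import statistics

def toy_model(wind_speed, c_ref=35.0, u_ref=2.0):
    """Hypothetical stand-in for a simulation model: peak varies as 1/U."""
    return c_ref * u_ref / wind_speed

def monte_carlo_peaks(n_runs, u_mean=2.0, u_sigma=0.5, u_min=0.5, seed=0):
    """Sample the uncertain input and collect the predicted peaks."""
    rng = random.Random(seed)
    peaks = []
    while len(peaks) < n_runs:
        u = rng.gauss(u_mean, u_sigma)
        if u >= u_min:          # reject unphysical near-zero or negative winds
            peaks.append(toy_model(u))
    return peaks

peaks = monte_carlo_peaks(10000)
c_s = 35.0
# Fraction of runs predicting a peak below the standard: an estimate of A.
area_a = sum(p < c_s for p in peaks) / len(peaks)
print(statistics.mean(peaks), statistics.stdev(peaks), area_a)
```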

b.   A Possible Simplification

     Short of doing a Monte Carlo analysis, is there anything useful that
can be determined?  In certain simple circumstances, there is.  We may
infer, when appropriate, some limited information about f(C), C̄, and CA.
To do so, we first recall the modified form of Tchebycheff's inequality,

     P\{ |x - \mu| \ge k\sigma \} \le \frac{4}{9k^2}     (D-26)

 where P is the probability that -kσ ≥ x - μ or kσ ≤ x - μ, x is a random
 variable, μ is its expected value, and σ is its standard deviation.  This
relationship holds for all probability distributions.  We can adapt it
to the present problem by rewriting it in the following way:
                                                                 (D-27)
where C is a random variable whose value is the peak concentration pre-
dicted by the model, C̄ is its expected value, and σC is its standard
deviation.  CS is the standard (NAAQS).

     The relation in Eq. D-27 is a useful one.  The area A in Figures D-13
and D-14 represents the same probability as that on the left hand side of
Eq. D-27.  Using Eq. D-25, we may now write
                                                                 (D-28)
where ε is the maximum allowable value of A.  From this, we may infer the
minimum allowable value of σC/(C̄ - CS).  Its value is

      Still, we need an independent approximation of σC in order to solve
 Eq. D-29 for the minimum value of C̄ - CS.  To do so, we estimate the
 maximum value σC is likely to assume, that is, the σC* such that

     \sigma_C \le \sigma_C^*     (D-30)

If we then use σC* in Eq. D-29, we can determine (C̄ - CS)min.

     Suppose we represent model behavior with a system response function,
φ, that transforms model inputs into the model-predicted concentration
peak, i.e.,

     C = \phi(e)     (D-31)

where C is the predicted peak, and e is the vector of model inputs.  Suppose
further that we know the probability distributions of each of the input
errors, and that we can identify their one-sigma variations, σei.  If so,
we can determine the maximum change in the predicted peak that would occur
if all error sources varied simultaneously by a standard deviation from
their nominal values.  We note that increases in some inputs lower C and
others raise it.  Thus, to bound the value of ΔC, we consider the root-mean-
square of the changes in C as each input is varied separately.  This max-
imum ΔC can be written as

     \Delta C = \left\{ \frac{1}{N} \sum_{i=1}^{N} \left[ \phi(e_1, \ldots, e_i + \sigma_{e_i}, \ldots, e_N) - \phi(e_1, \ldots, e_N) \right]^2 \right\}^{1/2}     (D-32)

where each ei (1 ≤ i ≤ N) is varied separately and the corresponding change
in peak concentration is represented by the quantity in the brackets.  If
we assume that ΔC is a suitable estimate of σC*, we can write (using
Eq. D-29)

     (\bar{C} - C_S)_{min} = \ldots\, \Delta C     (D-33)

which provides an indication of the amount of "overprediction" the model
must provide.

     We now present an example.  Suppose we consider a simple Gaussian
model (no reflection, continuously emitting source), whose only source
of error is the wind speed, U.  We assume the following:  σU = 0.5 m/sec,
U = 2 m/sec, and CS = 35 ppm (the one-hour federal standard for CO).  Using
Eq. D-32, we determine that ΔC = 7 ppm.  Then, using Eq. D-33 and assuming
that ε = 0.05, we estimate that (C̄ - CS)min = 14.7 ppm.  Using the GC rationale,
we would require when modeling current ambient conditions that the model
over-predict the peak by this same amount (assuming that there was no error
associated with the measurement).
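
     The numbers in this example can be reproduced with the sketch below,
under two assumptions that should be read as such.  First, the peak is taken
to vary inversely with wind speed and the input is perturbed upward by one
standard deviation, which gives the quoted 7 ppm.  Second, the multiplier
relating ΔC to the required overprediction is taken to be the square root of
2/(9ε); this form is an inference chosen because it yields about 14.7 ppm at
ε = 0.05, and it is not stated in the text.  Function names are hypothetical.

```python
import math

def peak_concentration(u, c_ref=35.0, u_ref=2.0):
    """Assumed scaling: concentration inversely proportional to wind speed."""
    return c_ref * u_ref / u

def delta_c(model, inputs, sigmas):
    """Root-mean-square of the one-at-a-time, one-sigma changes in the peak."""
    base = model(*inputs)
    total = 0.0
    for i, s in enumerate(sigmas):
        perturbed = list(inputs)
        perturbed[i] += s           # vary input i alone by +1 sigma
        total += (model(*perturbed) - base) ** 2
    return math.sqrt(total / len(sigmas))

def min_overprediction(dc, eps):
    """Assumed multiplier sqrt(2 / (9 * eps)); an inference, not a quoted formula."""
    return math.sqrt(2.0 / (9.0 * eps)) * dc

dc = delta_c(peak_concentration, [2.0], [0.5])
print(dc, min_overprediction(dc, 0.05))
```

     With a single uncertain input the root-mean-square and the root-sum-square
coincide, so the example does not distinguish between the two.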

 c.    The GC  Rationale:  An  Assessment

     We have included the GC rationale in our discussion primarily for the
sake of completeness.  While the guiding principle underlying it--
"guaranteeing" that an adequate abatement strategy will be designed—
has its virtues, the method as conceived here has significant problems
associated with its use.  It is cumbersome and impractical, except in the
most  limited of circumstances.  Also, it may be excessively conservative,
introducing a systematic bias  into model evaluation.

      Unless the major problems noted here can be  solved somehow,  the other
rationales considered in this  chapter appear to have  greater promise.  We
do  not  recommend that this  rationale be pursued extensively in any additional
work.

 4.    Pragmatic/Historic Rationale

      Experience  is  growing  in  the  use  of  air quality  simulation models.  They
 have been applied to a variety of problems in a number of different situa-
 tions.   As familiarity grows with both their capabilities and limitations,
 we become more able to  foresee their behavior in  new  applications.  Taking
advantage of our growing expertise, we may find it reasonable to set per-
formance standards for models based upon the following principle:  In
each new application a model must perform at least as well  as the "best"
previous performance of a model in its generic class in a similar application.

     This approach is a pragmatic one, forced upon us by some very practical
considerations:  our limited ability to derive theoretically justifiable
values for  the  standards and the number of different measures required to
characterize fully model performance.  Five major problem areas exist in
characterizing  the agreement of model predictions with field observations.
The model may be  judged on  its  ability to predict the concentration peak,
to avoid systematic  bias, to limit absolute error, to maintain spatial
alignment,  and  to reproduce temporal  behavior of concentrations.  To assess
a model's performance in these  five  areas, we recommended earlier in this
chapter  the use of a number of  different performance measures.   Our chief
difficulty is as follows:  There are as yet few theoretical means to assign
appropriate values to these measures.  We have identified in this report
several promising candidates for judging the prediction of peak concentrations.
Additional work is required, however, to determine appropriate standards
for many of the other measures.

      While  such additional  work is proceeding, what must we do?  Many  issues
of great practical  interest are pending,  each of  which requires the eval-
 uation of model performance.   Revisions  to State  Implementation Plans, for
 instance, must be reviewed. Model performance studies now  being conducted
 by the EPA  must continue.

     We recommend that the Pragmatic/Historic rationale be used to set
acceptable bounds for performance measures for which no better method
exists.  As research provides greater insight into "better" rationales, we
recommend that the standards be updated accordingly.
                                  D-41

-------
     To employ this  rationale the following steps might be followed:

     >  Step  1.   The proposed application is categorized, identifying
                 the group of previous studies with which its per-
                 formance must be compared.  The criteria by which
                 this might be done could include pollutant type,
                 prevailing meteorology, source geometry, and terrain
                 irregularity.
     >  Step  2.   Performance measures appropriate to the application's
                 category are calculated.
     >  Step  3.   Calculated values are compared with the "best"  values
                 previously attained in a similar application.
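
     The three steps above can be sketched in code.  The following is a
minimal illustration only: the category fields, the measure names, the
direction assigned to each measure, and all numerical values are
hypothetical and do not come from this report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    """Step 1: hypothetical key that groups 'similar' applications."""
    pollutant: str
    meteorology: str
    source_geometry: str
    terrain: str


# Hypothetical measures. True means a larger value indicates better performance.
HIGHER_IS_BETTER = {
    "peak_error_fraction": False,
    "mean_abs_error": False,
    "temporal_correlation": True,
}


def compare_with_best(calculated, best_previous):
    """Step 3: report, per measure, whether the best prior value is matched."""
    result = {}
    for name, value in calculated.items():
        best = best_previous[name]
        if HIGHER_IS_BETTER[name]:
            result[name] = value >= best
        else:
            result[name] = value <= best
    return result


# Illustrative numbers only (assumed, not taken from any study).
category = Category("ozone", "stagnation", "urban area", "flat")
history = {
    category: {
        "peak_error_fraction": 0.20,
        "mean_abs_error": 0.03,
        "temporal_correlation": 0.80,
    }
}

# Step 2: values assumed to have been calculated for the new application.
calculated = {
    "peak_error_fraction": 0.15,
    "mean_abs_error": 0.04,
    "temporal_correlation": 0.85,
}

verdict = compare_with_best(calculated, history[category])
print(verdict)
```

     In this sketch a measure passes only if it is at least as good as the
best value on record for the same category, which is one possible reading
of the principle stated above.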

     For the Pragmatic/Historic rationale to be of use, the EPA would
have to accomplish the following steps.  First, a scheme for classifying
applications into "similar" categories needs to be developed.  Then, data on
previous modeling efforts need to be assembled and appropriate performance
measure values calculated.  Finally, a mechanism for updating the
"performance data base" needs to be established.  Such a mechanism would
require the EPA to assume a custodial role over the data base, amending
it as results of new modeling studies become available.
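
     One way such an updating mechanism might work is sketched below.  This
is an assumption-laden illustration, not a description of any existing EPA
system: the function name, the tuple used as a category key, the measure
name, and the values are all hypothetical.

```python
def update_best(database, category, measure, value, higher_is_better):
    """Store `value` if it beats the recorded best for this category/measure.

    Returns True when the data base was amended, False otherwise.
    """
    bests = database.setdefault(category, {})
    current = bests.get(measure)
    if current is None:
        improved = True
    elif higher_is_better:
        improved = value > current
    else:
        improved = value < current
    if improved:
        bests[measure] = value
    return improved


# Hypothetical category key and illustrative values.
db = {}
key = ("ozone", "urban area", "flat terrain")
first = update_best(db, key, "mean_abs_error", 0.05, higher_is_better=False)
worse = update_best(db, key, "mean_abs_error", 0.07, higher_is_better=False)
better = update_best(db, key, "mean_abs_error", 0.03, higher_is_better=False)
print(db)
```

     Only a result that improves on the stored value amends the data base,
so the record for each category always holds the "best" performance seen
so far.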
                                D-42

-------
                              REFERENCES
Ames, J., et al. (1978), "The User's Manual for the SAI  Airshed Model,"
     EM78-89, Systems Applications, Incorporated, San Rafael,  California.

Anderson, G. E. (1978), private communication.

Anderson, G. E., et al. (1977), "Air Quality in the Denver Metropolitan
     Region 1974-2000," EF77-22, Systems Applications, Incorporated,
     San Rafael, California.

Argonne (1977), "Report to the U.S. EPA of the Specialists'  Conference on
     the EPA Modeling Guideline," 22-24 February 1977, Argonne National
     Laboratory, Argonne, Illinois.

Burton, C. S., et al. (1976), "Oxidant/Ozone Ambient Measurement Methods,"
     EF76-111R, Systems Applications, Incorporated, San  Rafael, California.

Calder, K. L. (1974), "Miscellaneous Questions Relating  to the Use of Air
     Quality Simulation Models," Proc. of the Fifth Meeting of the Expert
     Panel on Air Pollution Modeling, Chapter 6, NATO/CCMS.

Code of Federal Regulations [CFR] (1975), Title 40 (Office of the Federal
     Register, U.S. Government Printing Office, Washington, D.C.).

EPA (1977), "Uses, Limitations and Technical Bases of Procedures for
     Quantifying Relationships Between Photochemical Oxidants  and Precur-
     sors," EPA-450/2-77-021a, Office of Air Quality Planning and Standards,
     Environmental Protection Agency, Research Triangle  Park,  North Carolina.

EPA (1978a), "Workbook for the Comparison of Air Quality Models,"
     EPA-450/2-78-028a,b, Office of Air Quality Planning and Standards,
     U.S. Environmental Protection Agency, Research Triangle Park, North
     Carolina.

EPA (1978b), "Guidelines on Air Quality Models," EPA-450/2-78-027,
     Office of Air Quality Planning and Standards, Environmental Protec-
     tion Agency, Research Triangle Park, North Carolina.

Johnson, W. B. (1972), "Validation of Air Quality Simulation Models," Proc.
     of the Third Meeting of the Expert Panel on Air Pollution Modeling,
     Chapter VI, NATO/CCMS.

Liu, M. K., and D. R. Durran (1977), "The Development of a Regional Air
     Pollution Model and Its Application to the Northern Great Plains,"
     EPA-908/1-77-001, Office of Energy Activities, U.S. Environmental
     Protection Agency, Denver, Colorado.

                                   R-1

-------
Rosen, L. C. (1977),  "A Review of  Air  Quality Modeling Techniques," UCID-
     17382, Lawrence  Livermore Laboratory, Livermore, California.

Roth, P. M., moderator (1977), "Report of the Validation  and  Calibration
     Group (II-5)," in "Report to  the  U.S. EPA  of  the Specialists' Conference
     on the EPA Modeling Guideline," pp. 111-120,  22-24 February 1977,
     Argonne National Laboratory,  Argonne, Illinois.

Roth, P. M., et al. (1976), "An Evaluation of Methodologies for Assessing the
     Impact of Oxidant Control Strategies," EF76-112R, Systems Applications,
     Incorporated, San Rafael, California.
                                    R-2

-------
                          TECHNICAL REPORT DATA
          (Please read Instructions on the reverse before completing)

 1. REPORT NO.:  EPA-450/4-79-032
 2.
 3. RECIPIENT'S ACCESSION NO.:
 4. TITLE AND SUBTITLE:  Performance Measures and Standards for Air Quality
    Simulation Models
 5. REPORT DATE:  October 1979
 6. PERFORMING ORGANIZATION CODE:
 7. AUTHOR(S):  S. R. Hayes
 8. PERFORMING ORGANIZATION REPORT NO.:
 9. PERFORMING ORGANIZATION NAME AND ADDRESS:
    Systems Applications, Incorporated
    950 Northgate Drive
    San Rafael, California  94903
10. PROGRAM ELEMENT NO.:
11. CONTRACT/GRANT NO.:  68-02-2593
12. SPONSORING AGENCY NAME AND ADDRESS:
    Office of Air Quality Planning and Standards
    U.S. Environmental Protection Agency
    Research Triangle Park, North Carolina  27711
13. TYPE OF REPORT AND PERIOD COVERED:  Final Report
14. SPONSORING AGENCY CODE:
15. SUPPLEMENTARY NOTES:
16. ABSTRACT:
     Currently there are no standardized guidelines for evaluating the
performance of air quality simulation models.  In this report we develop a
conceptual framework for objectively evaluating model performance.  We
define five attributes of a well-behaving model:  accuracy of the peak
prediction, absence of systematic bias, lack of gross error, temporal
correlation, and spatial alignment.  The relative importance of these
attributes is shown to depend on the issue being addressed and the pollutant
being considered.  Acceptability of model behavior is determined by
calculating several performance "measures" and comparing their values with
specific "standards."  Failure to demonstrate a particular attribute may or
may not cause a model to be rejected, depending on the issue and pollutant.
     Comprehensive background material is presented on the elements of the
performance evaluation problem:  the types of issues to be addressed, the
classes of models to be used along with the applications for which they are
suited, and the categories of performance measures available for
consideration.  Also, specific rationales are developed on which performance
standards could be based.  Guidance on the interpretation of performance
measure values is provided by means of an example using a large, grid-based
air quality model.
17. KEY WORDS AND DOCUMENT ANALYSIS
    (a. DESCRIPTORS; b. IDENTIFIERS/OPEN ENDED TERMS; c. COSATI Field/Group)
    Air Pollution
    Turbulent Diffusion
    Mathematical Models
    Computer Models
    Atmospheric Models
    Dispersion
    Air Quality Simulation Model
    Model Validation
    Model Evaluation
18. DISTRIBUTION STATEMENT:  Release Unlimited
19. SECURITY CLASS (This Report):  None
20. SECURITY CLASS (This Page):  None
21. NO. OF PAGES:  311
22. PRICE:
EPA Form 2220-1 (9-73)

-------