United States Environmental Protection Agency
Office of Air Quality Planning and Standards
Research Triangle Park, NC 27711
EPA-450/4-83-021
July 1983
Air
Evaluation of Performance Measures
for an Urban Photochemical Model
EPA-450/4-83-021
Evaluation of Performance Measures
for an Urban Photochemical Model
by
Robin L. Dennis,
Mary W. Downton
and
Robbi S. Keil
National Center for Atmospheric Research
Environmental and Societal Impacts Group
P.O. Box 3000
Boulder, Colorado 80307
Contract No. AD-49-F-0-167-0
Prepared for
U.S. ENVIRONMENTAL PROTECTION AGENCY
Office of Air Quality Planning and Standards
Monitoring and Data Analysis Division (MD-14)
Research Triangle Park, NC 27711
July 1983
DISCLAIMER
This report has been reviewed by the Office of Air Quality Planning
and Standards, U. S. Environmental Protection Agency, and approved for
publication as received from National Center for Atmospheric Research.
Approval does not signify that the contents necessarily reflect the views
and policies of the U. S. Environmental Protection Agency, nor does men-
tion of trade names or commercial products constitute endorsement or
recommendation for use. Copies of this report are available from the
National Technical Information Service, 5285 Port Royal Road, Springfield,
Virginia 22161.
EVALUATION OF PERFORMANCE MEASURES FOR AN URBAN PHOTOCHEMICAL MODEL
EXECUTIVE SUMMARY
The AMS/EPA Dispersion Model Performance Workshop, in September 1980,
recommended a large set of statistical measures for use in the evaluation
of air quality models. The present study was designed to test the
recommended measures in an actual performance evaluation on Denver data,
using three versions of the SAI Urban Airshed Model, termed DOT, EPA1 and
EPA2§. The study involved both an evaluation of the models and an
evaluation of the statistical performance measures. The evaluation of the
models had two parts—a base year case and an emissions trend case.
Resulting recommendations are intended to aid in the future use of the
models and in the planning of future performance evaluations on urban
airshed models.
Evaluation of the Models: Base Year Case
The three models in this study represent successive improvements in
the photochemical airshed model. All three versions showed considerable
bias (systematic underprediction) and noise, and a variety of errors. We
were able to identify several types of errors that degraded the models'
performance. They were: missing the peak in space, missing the peak in
time, too rapid a vertical mixing, errors introduced by some of the model
inputs, and difficulty in treating concentrated sources of NOx
emissions. There continued to be systematic errors that contribute to
chronic underprediction that could not be identified. It seems that this
model will have a problem with missing the peak in space and time for
typical regulatory cases in which a peak has been observed at a monitoring
site, particularly when there are few monitoring stations. The predicted
peaks are, however, in approximately the correct locations. The model
randomly misses the peak in time, within two-hour limits, but this is
judged not to be a significant problem for regulatory use of the model.
The models only became differentiated by their peak predictions. The
oldest model version, DOT, was the worst and the newest model version,
EPA2, was the best. There was still bias (underprediction) in the peak
ozone predictions of EPA2, not less than 10% and not more than 30%. The
responsiveness of peak ozone predictions of EPA2 to changes in meteorology
appears to be less than is actually observed. The DOT model appeared to
respond randomly to changes in meteorology. There were too few monitoring
stations to assess the size of the predicted "ozone cloud". Sizeable
clouds were predicted, but our impression is that they are smaller than the
observed "clouds". Based on some of the systematic errors identified, it
appears that the model can still be improved.
Evaluation of the Models: Emissions Trend Case
In the course of this study, it became clear that a performance
evaluation using a data set in which the emissions do not change cannot
provide reliable inferences about the performance of the model predictions
when the emissions do change. Thus if the model is to be used to predict
changes in ozone concentrations due to changes in emissions, then the model
predictions must be tested with a data set in which a change in emissions
and a corresponding change in ambient ozone concentrations has been
established.
For the prediction of the trend in peak ozone due to a change in
emissions, EPA2 was again the best. It did appear to underpredict the
change in peak ozone—13 percent observed versus 10 percent predicted
change over a 3-year period. The data base was too small to make any firm
estimate, however. The Urban Airshed Model has certain idiosyncrasies that
affect its predictions for regulatory use. In particular, its predictions
of a change in peak ozone due to a change in emissions is affected by the
vertical mixing rate. It appears that EPA2 can still be improved.
Bias estimates from a base year evaluation do not seem to be adequate
indicators of how well the model predicts trends in ozone resulting from a
change of emissions. Some errors affect both base year predictions and
ozone trend predictions, but the predictions for the two cases are not
equally sensitive to the same error. Other errors seem to affect only the
base year predictions. This suggests that if certain errors can be fixed,
then it is possible that the existence of bias in the model's predictions
for a historical day will not significantly affect its prediction of a
relative change in peak ozone due to a change in emissions. Thus EPA2
seems to be more adequate from a regulatory perspective than from a purely
scientific perspective.
Evaluation of the Performance Measures
A performance evaluation should be structured around attributes which
are important in the use of the model. Performance measures should then be
chosen on the basis of the attributes that have been selected. Thus the
list of measures and comparisons recommended by the AMS/EPA workshop
included some measures which were not appropriate for the Denver
application and failed to include a measure of the response to emissions
change which was needed in the Denver application. In addition, many
graphical displays not mentioned at the workshop were found to be extremely
useful. Measures found most useful in this study were bias, noise,
absolute deviation, and correlation-related statistics, applied to subsets
of the Denver data. Emphasis was placed on comparisons of completely
paired hourly data for a diagnostic understanding of the models and on
comparisons of the daily maxima for regulatory purposes.
For detailed analysis of the locations and causes of errors in the
models, statistics computed separately for each hour and site, or for each
day and site, were most helpful. Sensitivity analyses and graphical
analyses of model performance under controlled changes in the data and in
the model were necessary to further explain the errors. Analyses of daily
peak concentrations revealed additional information because they were
sensitive to different aspects of model performance. To evaluate the
models' performance for regulatory purposes, statistics computed on the
daily maximum predictions were most appropriate.
The performance measures were found to be useful aids in comparing
models, but subgrouping of the data, graphical analysis, sensitivity
analysis and case-by-case analysis was necessary for diagnosing errors in
models. Thus, it was felt that the measures would be inappropriate as
absolute performance standards. Understanding of the reasons underlying
the computed measures was necessary for a meaningful evaluation, and
professional judgment was required in drawing conclusions. We expect this
to be typical with air quality models. The evaluation of a model will not
be a simple, routine matter. The statistical measures provide an aid by
defining consistent "vital statistics" for a model. But they are not a
substitute for the detailed, diagnostic analysis necessary to support an
evaluation of the adequacy of a model for use in decision-making.
§DOT: Urban Airshed Model with Carbon Bond I chemistry; EPA1: Urban
Airshed Model with Carbon Bond II chemistry; EPA2: Urban Airshed Model
with Carbon Bond II and revised numerical algorithm.
TABLE OF CONTENTS
I. INTRODUCTION 1
A. SELECTION OF EVALUATION CRITERIA 2
Evaluation of a Model for Scientific or Regulatory Purposes 3
Requirements Due to the Type of Model 4
Requirements Based on Use of the Model 4
Model Attributes Selected for the Denver Performance
Evaluation 6
B. PRACTICAL LIMITATIONS ON PERFORMANCE EVALUATION 6
II. DISCUSSION OF THE PERFORMANCE MEASURES 8
A. BASIC ASSUMPTIONS OF STATISTICAL TESTING 8
Choice of a Data Sample for Model Evaluation . . 9
Normality of the Data 9
Lack of Independence 11
Other Sources of Dependence 14
B. PERFORMANCE MEASURES 15
Gross Error 16
Bias 17
Noise 24
Variability Comparison 25
Correlation and Related Measures 26
Trends Resulting from Changing Emissions 28
Graphs 29
Analysis of Subgroups of the Hourly Concentrations .... 30
Ways of Pairing Daily Maximum Concentrations 31
III. EXAMPLE PERFORMANCE EVALUATION 34
A. MODELS AND DATA BASE 35
Data Base Used 36
Meteorological Conditions 37
B. GENERAL DATA SET EVALUATION 39
C. FIRST LEVEL PAIRED COMPARISONS ON HOURLY DATA 41
General Overview Statistics . 41
Model Performance by Hour of the Day 42
D. SECOND-LEVEL COMPARISON: DIAGNOSING ERRORS
IN HOURLY PREDICTIONS 44
Missing the Peak in Space 45
Point Source Influence 49
Introduced Error 52
Dispersion (Vertical Mixing Rate) Influence 58
Missing the Peak in Time 62
E. COMPARISON OF DAILY MAXIMUM CONCENTRATIONS 63
Local Site Maximum for Each Day and Site, Paired by Hour . 63
Local Site Maximum for Each Day and Site, Unpaired by Hour 64
All-Station Daily Maximum, Paired by Site 66
All-Station Daily Maximum, Unpaired by Site 67
Area-wide Daily Maximum, Over Entire Modeling Region ... 68
Regression Analysis of the Daily Maximum Pairings .... 69
Evaluation of the Models Based on Daily Maxima 70
F. EMISSIONS CHANGE COMPARISON 72
Definition of an Emissions Change Comparison 74
Development of the Earlier Emissions Inventory 75
Choice of Models and Days for the Emissions Comparison . . 77
Estimation of the Change in Observed Maxima 78
Results ..... 81
G. PERFORMANCE EVALUATION CONCLUSIONS 89
Performance Character of the Model 90
Insights on the Regulatory Use of the Model 94
IV. IMPLICATIONS FOR PERFORMANCE MEASUREMENT 98
A. CONCLUSIONS ON THE USE OF STATISTICAL TECHNIQUES 98
Evaluation of the Performance Measures 98
Evaluation of Graphical Displays 102
Evaluation of the Use of Subgroups of the Hourly
Concentrations 102
Estimating the Bias in the Predicted Peak Concentration . . 104
Problems in Comparing Models on Hourly Data 106
Effects of Non-normality on Bias Comparisons 108
B. RECOMMENDATIONS 110
Recommended Performance Measures 110
The Use of Statistical Measures as Performance Standards . 113
Evaluating the Usefulness of a Model 114
REFERENCES 117
TABLES 121
FIGURES 147
I. INTRODUCTION
In the implementation of laws to protect air quality, atmospheric
dispersion models have come to be a basis for establishing acceptable
levels of emissions control for air pollution sources. To justify their
use as regulatory tools, the models should be accurate and be used
correctly. In recognition of this need, a workshop on dispersion model
performance was convened jointly by the Environmental Protection Agency and
the American Meteorological Society in September 1980 to recommend proce-
dures to evaluate model performance. As a step toward improved, consistent
evaluation and comparison of models, workshop participants proposed a list
of performance evaluation measures. They called for testing of those
measures and for further development of evaluation methods through actual
tests of models (Fox, 1981).
The present study developed as a result of the AMS/EPA workshop.
Three versions of the Urban Airshed Model developed by Systems Applications
Inc. (SAl) for simulating the production of photochemical pollutants were
tested on ozone data for Denver, Colorado. Statistical measures and graphs
of model performance were used to compare and evaluate the three model
versions in a complete example evaluation. Then, in turn, the usefulness
of the measures themselves as evaluation tools was assessed.
This report begins with a brief discussion on the selection of
evaluation criteria. It is argued that a performance evaluation, to
produce relevant results, should be structured around the attributes which
are important in the use of the model. Models discussed in this report are
intended for regulatory use, and model attributes are selected based on an
analysis of that use. In the second section, general requirements for the
use of statistical measures and statistical tests are discussed. Then some
measures, associated statistical tests, and graphs are described, chosen on
the basis of the model attributes selected for the example performance
evaluation. The third section describes the example performance evaluation
for Denver, including the use of performance measures and detailed
sensitivity studies that were needed to diagnose errors in the modeling.
Conclusions are drawn about the performance of the model on Denver data and
suggestions for further research and improvements are made. In the final
section, the performance measures themselves are evaluated, and their
appropriateness as performance standards is discussed.
A. SELECTION OF EVALUATION CRITERIA
A model performance evaluation needs to be structured around the
intended application of the model and the objectives of the evaluation.
This will determine the scope and the methods which are appropriate. Such
a statement seems almost obvious, yet it was found, both in the example
evaluation described here and in earlier evaluations, that important model
characteristics can be easily overlooked. Initial attention to structuring
the evaluation can reduce the awkward need to add new analyses and data at
a later stage.
Evaluation of a Model for Scientific or Regulatory Purposes
This study focuses on evaluation of air quality models used for
regulatory purposes. Evaluation of a model intended for regulatory use
requires a somewhat different orientation than the usual scientific
approach to model testing. A "scientific" evaluation generally focuses on
determining how well the model results mimic the observed behavior of
pollutant concentrations. It establishes the accuracy of the model in
duplicating the magnitude, location, and timing of concentrations under
selected conditions. The scientific evaluation is generally designed to
explore the details of a model's performance, to determine that the model
makes the right predictions for the right reasons and to identify errors as
a step toward improving the model.
In contrast, a "regulatory" evaluation needs to determine how well the
model provides the results necessary for decision making. Thus the
purposes and methods of the model's application will define what must be
covered by the regulatory evaluation. The structuring should take into
account how the model is used in practice if meaningful results are to be
obtained. This may affect the selection of both the data base and the
performance measures to be used in an evaluation. Of course, in evaluating
a model for regulatory use a determination of the accuracy of model predic-
tions is needed, but the emphasis may be different than in a scientific
evaluation. It will be necessary to ask how the model predictions will be
used, how the effects of errors can be minimized, how much error can be
tolerated, and whether there are some errors which cannot be tolerated.
Requirements Due to the Type of Model
A time-dependent photochemical model is required for predicting ozone
concentrations because of the complexity of the chemical processes involved
in ozone production. Ozone is a secondary pollutant. It is not emitted
directly but is produced by chemical reactions between several other pollu-
tants, primarily hydrocarbons and oxides of nitrogen (NOx). Therefore
ozone concentrations depend on (1) emissions of several pollutants; (2) the
rate of chemical reactions, which depends on the intensity of the sunlight;
and (3) the amount of mixing of the pollutants, which depends on meteoro-
logical factors such as wind speed. The relationship between precursor
emissions and ozone is nonlinear, thus simple rollback models are inappro-
priate.
In any time-dependent air quality model the entire day's pattern of
pollutant production contributes to the daily maximum. Therefore the model
should be able to reproduce the entire hour-by-hour diurnal pattern of
ozone concentrations on a given day. Understanding of errors in the peak
prediction will require knowledge of the full day's predictions. This
implies that both daily peak predictions and hourly predictions should be
evaluated.
Requirements Based on Use of the Model
The Urban Airshed Model is used in long-term air quality management
required under the federal Clean Air Act. Development of State Implementa-
tion Plans (SIPs) under the Clean Air Act necessitates prediction of a
relative change in pollutant concentrations due to a change in emissions
for a worst-case day. Pollution control strategies must be found which
will reduce emissions enough from today's levels to achieve a required
reduction in ozone concentrations by a certain date. Typically the model
is used to simulate only a few worst-case high ozone days selected from
historical data. Then different levels of future emissions are assumed,
corresponding to different control strategies, and the model is used to
predict the pollutant concentrations that would result from the changes in
emissions alone, keeping the meteorological input to the model the same.
Knowledge of this method of application was important in determining
the objectives of the evaluation, and the data and measures that would be
needed. First, use of the model is confined to high ozone days and is
focused on prediction of a daily maximum ozone concentration. Thus
accuracy in the magnitude of the peak prediction on high ozone days will be
of primary importance. Furthermore, in practice the model is used to simu-
late only a few historical days, thus it must be able to replicate concen-
trations under meteorological conditions specific to a given day.
Second, it is important that the model accurately predict changes in
ozone due to emission changes. The variation in concentrations in a given
year is primarily due to differences in meteorology, rather than differen-
ces in emissions. Major changes in emissions from the automobile fleet
occur over a period of years. Therefore the data base for model evaluation
should include points in time which are sufficiently spaced that emissions
changes will have occurred, and a measure must be found to compare observed
and predicted trends in concentrations. Past evaluations of the Urban
Airshed Model have considered variations in meteorology but have not looked
systematically at changes in pollutant emissions (Hayes, 1979; Cole,
1982a and 1982b).
Model Attributes Selected for the Denver Performance Evaluation
The above considerations led to selection of the following attributes
as the focus of this evaluation of model performance for Denver.
1) Accuracy of the magnitude of the peak prediction for each day.
2) Accuracy of hourly predictions, including the daily pattern of the
predictions.
3) Accuracy of predicted trends in peak concentrations resulting from
emission changes over a period of several years.
The first and third attributes are of primary operational importance and
would be given the most weight in the evaluation or selection of a model
for regulatory use. The second is important as an aid in interpreting the
results, for establishing confidence in the models and for diagnosing
errors. These three attributes determine what performance measures and
what types of data will be needed to make relevant judgments about the
model.
B. PRACTICAL LIMITATIONS ON PERFORMANCE EVALUATION
In practice, limited access to data may prevent the complete evalua-
tion of all desirable model attributes. For example, in the Denver evalua-
tion, the available measurement network had only five monitoring stations,
not enough to adequately investigate the spatial distribution of the ozone
concentrations. Attention given to structuring the evaluation, by listing
the important attributes for a given application, increases the chance that
the most important aspects will be covered and promotes an awareness of the
ways in which the evaluation is incomplete. Such awareness is likely to be
important when the results of the evaluation are to be interpreted.
II. DISCUSSION OF THE PERFORMANCE MEASURES
This discussion of statistical measures is focused on evaluation of
time-dependent urban models. However, many of the statistical considera-
tions will apply to evaluation of other models as well. The first part of
this section discusses several statistical issues from a theoretical point
of view. These issues are important in the selection of a data base and in
the use of formal statistical tests and confidence intervals. The second
part of this section explicitly assembles a set of performance measures
that is most appropriate for evaluation of an urban airshed model in
regulatory use. The value of certain time and space pairings is also
discussed.
A. BASIC ASSUMPTIONS OF STATISTICAL TESTING
Most standard statistical tests require that the sample be selected
randomly and independently and that the population be normally distri-
buted. Each of these requirements presents special problems in our analy-
sis and will be addressed separately. First, randomness is discussed as a
problem in defining what population a sample actually represents, since
model-testing samples are invariably small with limitations outside of the
control of the researcher. Second, the amount of deviation from normality
is examined and its effect discussed. Third, the lack of independence due
to autocorrelation is discussed and adjustments to the statistical tests
are computed. Finally, other sources of dependence are delineated.
Choice of a Data Sample for Model Evaluation
Selection of sample data to be used for testing the model is an impor-
tant element of model evaluation which was not discussed in the AMS/EPA
workshop report. Bias in the sample could easily bias the comparison of
two models, for example.
If it is necessary to confine testing to a small number of days, we
need to define the type of days for which it is important to evaluate the
model. Then every effort should be made to assure that the days chosen are
statistically random in all other respects. If the days are not repre-
sentative then we should specify the limited population of days which the
study describes. The validity of the model to describe a different popula-
tion of days must rest on physical arguments and should not be assumed.
If models are to be compared, they should be tested on the same data.
Serious biases in the comparison may be introduced by indiscriminately
comparing confidence intervals derived from two models using different data
sets. If two models are tested on data sets from different urban locations
or years, the comparison may be biased by systematic differences between
the two population data sets. Such systematic differences may be due to
differences in meteorology, differences in background fluxes, or differen-
ces in HC-to-NOx ratios.
Normality of the Data
Many researchers have found that a full year's pollutant data should
be transformed to approximate a normal distribution, using logarithmic or
exponential data transformations (e.g., Larsen, 1971). Confining our popu-
lation to only the high ozone days seriously changes the shape of the
distribution, however.
Hourly concentrations on high ozone days cover the entire range of the
year's concentrations, from near zero in the early morning each day to the
annual maximum on one afternoon. Histograms of the set of all high ozone
(≥ 0.10 ppm) summer weekdays, May-September 1975-80, are shown in Figure 1,
for each of the five Denver area monitoring sites. Most of the distribu-
tions are bimodal. The spike at the low end of four of the histograms
results from low early morning concentrations at those sites. The mean and
median are nearly identical in all five distributions, an indication that
they do not conform to the typical skewed shape of a log normal distribu-
tion. Indeed, attempts to transform the data led to even greater devia-
tions from normality. As a result, no data transformation was used for the
remainder of the analysis.
The set of daily maximum concentrations on high ozone days actually
represents the upper tail chopped off of a distribution which may, perhaps,
be log normal. This upper tail deviates greatly from normality, therefore
only those statistical tests whose probability levels are affected little
by deviations from normality will apply.
Effect of nonnormality on bias estimates. Bias is estimated from the
mean of the model residuals, Co - Cp. In a paired comparison, it is the
residuals, not the original concentrations, that must be normally distribu-
ted. Furthermore, the Central Limit Theorem tells us that if the sample
size is reasonably large (greater than 50 observations) then a distribution
of sample means is approximately normal, even if the original data was
quite nonnormal. As a result, the bias estimates from large samples can
generally be assumed to be normally distributed, regardless of whether the
concentrations are normally distributed, and standard t-test procedures for
establishing confidence intervals on the bias estimate are appropriate
(with any necessary adjustments for autocorrelation). However, if sample
sizes are small, confidence intervals based on Student's t will not
accurately reflect the specified probability levels. For small samples,
tests which do not involve normality assumptions should be tried in deter-
mining the significance of the bias.
Effect of nonnormality on variability and noise estimates. Here,
lack of normality can be a more serious problem. Both the F-test to com-
pare variances, and the chi-square-based confidence interval on a variance
estimate, require that the concentrations be normally distributed.
Empirical studies, however, have shown that departures from normality
have only minor effects on the confidence levels associated with the
F-test, particularly if the parent populations have similarly-shaped dis-
tributions (Myers, 1979). Markedly skewed distributions have been shown to
produce an overestimate of the alpha level, making the F-test more conser-
vative. It should be remembered, however, that use of the F-test on skewed
data will be less sensitive, increasing the chance of not detecting real
differences (Type II error).
Lack of Independence
Autocorrelation effects. Autocorrelation in a time series is the
tendency of an effect to carry over from one element of the series to the
next. Many statistical techniques assume that data points are random and
independent. But in an autocorrelated time series, successive data points
are not independent. Indeed, oxidant and carbon monoxide concentrations
have been shown to be highly correlated from one hour to the next and, to a
lesser extent, from one day to the next. This lack of independence greatly
reduces the precision of estimates of the population mean and variance.
Autocorrelation in the concentrations. Our population in the Denver
investigation was the set of hourly ozone concentrations on summer weekdays
in which the maximum concentration exceeded 100 ppb. To avoid daily auto-
correlation, successive days were deliberately avoided in selecting a
sample (at the expense of some loss of randomness). Hourly autocorrelation
could not be avoided, therefore its effect on precision must be assessed.
The large sample of all high ozone summer weekdays, 1975-80, was used
to estimate the amount of hour-to-hour autocorrelation within a day.
Hourly means were subtracted from each hourly concentration to remove the
effect of the diurnal pattern, creating a stationary time series of devia-
tions from the mean diurnal pattern. Autocorrelations were then computed
for lags of 1 to 6 hours within a single day.
The average autocorrelation function over the 5 monitoring sites was
.69, .38, .21, .11, .08, .08 for lags 1 to 6. When each of the 5 sites was
considered separately, 4 sites followed a pattern quite similar to the
average. This autocorrelation function is fairly typical of a first-order
autoregressive process. Such a process has the form

    X(t) = φ X(t-1) + a(t)

and for lag k, the autocorrelation function is

    r(k) = φ^k .

Thus the first few autocorrelations can be used to estimate φ, i.e., r1,
√r2, and ∛r3 are all estimates of φ. We have taken the average of these
as our estimate, obtaining φ ≈ .63.
One site, Highland, showed considerably higher autocorrelation than
the others, with autocorrelations of .86, .65, .49, .37, .25, .20 for lags
1 to 6. Thus we have assumed a first order autoregressive process with an
estimated φ ≈ .82 at Highland.
Hirtzel and Quon (1981) have derived the equivalent number of independent
measures, ne, for n autocorrelated data points in a first-order
autoregressive process with autocorrelation φ at lag 1:

    ne = n / [1 + 2 Σ (1 - k/n) φ^k],   summed over k = 1 to n-1.

Within each day our data consist of n = 12 hourly concentrations. At
the 4 sites for which φ ≈ .63, the effective number of independent
observations for a day is computed to be ne ≈ 3.3. At Highland, where
φ ≈ .82, the effective independent n is only ne ≈ 1.9. Thus the
precision of estimates of the mean and standard deviation of Co is
equivalent to that obtained with only 2 or 3 independent measurements per day.
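To make the computation concrete, the following is a minimal sketch (not
part of the original analysis) that estimates φ from the first three lagged
autocorrelations and evaluates the effective sample size for n = 12 hourly
values; the autocorrelation values are those reported above, everything
else is illustrative.

    # Sketch: estimate phi from the lag 1-3 autocorrelations and compute the effective
    # number of independent observations n_e for n autocorrelated points under an AR(1)
    # process. Autocorrelation values are those quoted above; the rest is illustrative.
    import numpy as np

    def effective_n(phi, n):
        # n_e = n / [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * phi**k]
        k = np.arange(1, n)
        return n / (1.0 + 2.0 * np.sum((1.0 - k / n) * phi ** k))

    lags = np.array([1, 2, 3])
    r_avg = np.array([0.69, 0.38, 0.21])             # average of the 4 similar sites
    r_highland = np.array([0.86, 0.65, 0.49])        # Highland
    for r in (r_avg, r_highland):
        phi = np.mean(r ** (1.0 / lags))             # r1, sqrt(r2), cbrt(r3) all estimate phi
        print(round(phi, 2), round(effective_n(phi, 12), 1))   # ~0.63 -> 3.3 and ~0.82 -> 1.9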
Autocorrelation in the Residuals. In an ideal model the residuals
would not be autocorrelated even when the concentrations are. The
residuals of the three models examined here all exhibit both a strong
diurnal pattern and considerable autocorrelation, however.
As a result of the high autocorrelation, then, the standard error of a
single day's bias estimate is not Sd/√12 as would be the case with 12
independent measurements. Instead, under a first order autoregressive
process the standard error of the bias is Sd/√ne. The confidence
interval on the bias estimate will be much wider, and the precision of the
estimate much lower, than if successive residuals had been independent.
Whenever all of the hourly concentrations or residuals are included in a
statistical analysis, the standard errors and the degrees of freedom used
to compute confidence intervals must be adjusted accordingly.
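A short numerical illustration (hypothetical residuals, not the Denver
values) of how the autocorrelation adjustment widens the confidence
interval on a single day's bias:

    # Sketch: standard error of one day's bias with and without the autocorrelation
    # adjustment. The hourly residuals below are invented for illustration; n_e = 3.3 is
    # the effective sample size derived above for the typical site.
    import numpy as np
    from scipy import stats

    d = np.array([5., 12., 20., 28., 35., 40., 38., 30., 24., 18., 12., 8.])  # hypothetical Co - Cp
    n, n_e = len(d), 3.3
    bias, s_d = d.mean(), d.std(ddof=1)
    se_indep = s_d / np.sqrt(n)                      # if the 12 residuals were independent
    se_adj = s_d / np.sqrt(n_e)                      # first-order autoregressive adjustment
    half_width = stats.t.ppf(0.975, n_e - 1) * se_adj
    print(bias, se_indep, se_adj, (bias - half_width, bias + half_width))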
The set of daily maximum concentrations will not be autocorrelated
because we have excluded successive days from the sample. For the same
reason, statistics can be computed for each hour separately with no auto-
correlation effect.
Other Sources of Dependence
One other source of dependence between data points should be men-
tioned, although we have no way of adjusting for it directly. There is
likely to be some correlation between concentrations occurring at the same
time at different sites. Thus when statistics are averaged over all of the
sites, measurements from the separate sites will not be entirely indepen-
dent. Standard errors are likely, therefore, to be underestimated and any
confidence intervals will be narrower than they should be.
B. PERFORMANCE MEASURES
The introduction to this report emphasized the need to specify the
attributes on which a particular model is to be evaluated. Those attri-
butes provide a basis for choosing performance measures which are relevant
to the model and its application. Although an extensive list of perfor-
mance measures was recommended by the AMS Workshop, some measures from the
Workshop list may not be appropriate for a particular model, and additional
measures may be needed. For example, in examining how the Urban Airshed
Model is used in practice it was noted that only a few days are simulated,
therefore the model must replicate concentrations under meteorological
conditions specific to a given day. Thus matching the observed and
predicted concentrations on a given day is necessary. It would be inappro-
priate to use the unpaired t-tests and comparisons of frequency distribu-
tions that were recommended by the AMS Workshop for use in evaluating point
source models. Furthermore, the issue of emissions change was not addres-
sed by the AMS Workshop, therefore an additional measure was required
beyond the workshop list.
The measures discussed below were selected for the Denver example
evaluation, based on the list of important attributes of model performance
which was developed in the introduction. The measures were related to the
attributes of model performance as follows:
(1) Accuracy of peak predictions: Bias, gross error, noise, varia-
bility, correlation, linear regression.
(2) Accuracy of hourly predictions: Bias, gross error, noise,
variability, correlation.
(3) Accuracy of trends in peak concentrations resulting from emission
changes: Difference between observed and predicted linear trends in
ambient concentrations.
One potentially useful measure, spatial correlation, was not used
because the data needed were not available.
Gross Error
Two measures of gross error were suggested at the AMS Workshop.
    Mean square error:     MSE = Σ(Co - Cp)² / n

    Absolute deviation:    |d| = Σ|Co - Cp| / n
Both provide an overall measure of model inaccuracy. They are so similar
that they are basically redundant, therefore only one would be needed as a
performance measure. Which is chosen is likely to be a matter of personal
preference. For each, it is desirable to obtain a low value to minimize
the inaccuracy of a model.
The absolute deviation is easy to interpret and is less sensitive to
outliers (more robust) than the MSE. However, it provides no basis for
computing a confidence interval.
The mean square error is familiar to regression analysts because it is
the quantity which is minimized in the least squares process. Because the
errors are squared, greater weight is placed on the larger errors.
Although it is possible to construct a confidence interval, this is gener-
ally not done because the distribution of the MSE is not a standard one—
it is a compounding of a normal distribution and a chi-square distribu-
tion.
Our preference is for the absolute deviation, because of its inter-
pretability and robustness. However, other, more specific, measures are
likely to be more useful than either of the gross error measures. From a
theoretical point of view, for a performance evaluation it should be more
informative to consider separately the two components of the gross error:
bias and noise.
Bias
The estimated bias in a model is the mean of the differences between
observed and predicted values
    d̄ = Σ(Co - Cp) / n = C̄o - C̄p
If the bias is significantly different from zero, it indicates a systematic
tendency of the model to underpredict (if d > 0) or overpredict (if d < 0).
On any particular sample of test data, of course, some difference from
zero is to be expected since no sample can perfectly capture all of the
characteristics of the population. Similarly two models may show slightly
different bias levels on a given set of sample data when that difference
should be attributed to characteristics of the sample rather than real
differences in bias between the models. The model bias computed from a
single data sample is merely an estimate of the true bias in the model.
Computing a confidence interval around that bias estimate indicates how
much the true bias can reasonably be expected to deviate from our esti-
mate. In a sense, the confidence interval provides a measure of the
precision of the bias estimate.
Bias comparisons based on Student's t. An EPA list of statistics,
recommended for use in evaluating air quality models compiled and developed
in more detail by W.M. Cox from the AMS-recommended list, suggests
estimating the bias in two different ways—using both a paired t and an
unpaired (2-sample) t. We would argue that the paired t is sufficient for
our purposes and that the unpaired t will offer no additional useful
information; therefore the difference between the two methods should be
discussed.
The estimated bias is the same under both methods, that is,
    d̄ = Σ(Co - Cp) / n = C̄o - C̄p
However, the confidence intervals will be different, and the interpretation
of "bias" will be somewhat different. In order to make our bias estimate
as precise as possible we would like to obtain a confidence interval that.
is as narrow as possible.
Statistics on the paired concentrations provide a more stringent test
of the model because they require some match (in time or space) between the
observed and predicted concentrations. A completely unpaired test looks at
the observations and the predictions as two independent data sets, with no
matching whatsoever, and simply asks, "Could these two independent samples
have come from the same population?" and, if not, "What is the difference
between the means of the two populations from which they came?"
The paired test assumes that there may be some correlation between
observed and predicted values, and accounts for it. The unpaired test
assumes that there is no correlation between the observed and predicted
values, therefore it does not account for any correlation. Specifically,
under the paired test the standard error of the bias estimate is

    S(d̄) = √[ (S²Co + S²Cp - 2 cov(Co, Cp)) / n ] ,

while under the unpaired test it is √[ (S²Co + S²Cp) / n ].
When the covariance (or correlation) is zero these two standard errors
are the same. If Co and Cp are positively correlated at all, then S(d̄)
will be smaller and will generally produce a narrower confidence interval
(unless the sample size is very small, in which case the higher degrees of
freedom for the unpaired test will make for a lower critical value of t).
Therefore, if r(Co, Cp) ≥ 0, there is no point in computing bias based on
unpaired concentrations, unless sample sizes are very small and sample
variances meet an assumption of equality.
If the observed and predicted concentrations are negatively correla-
ted, the unpaired t will, indeed, produce a narrower confidence interval.
But a negative correlation puts the validity of the entire model in
doubt—the precision of the bias estimate assumes minor importance. Thus,
as a general practice, a one-sample (paired) t should be used to establish
a confidence interval for the estimated bias, and a two-sample (unpaired) t
should not be required.
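As an illustration only (a sketch, not part of the original analysis), the
two standard errors can be computed side by side for the daily-maximum data
used in Example 1 below; because Co and Cp are positively correlated there,
the paired interval is the narrower one.

    # Sketch: paired vs. unpaired standard error of the bias for the Example 1 daily maxima.
    import numpy as np
    from scipy import stats

    co = np.array([153., 146., 162., 166., 157., 117., 117., 100., 154., 121., 101.])
    cp = np.array([119., 113., 102., 129., 109.,  93.,  82.,  85., 105., 142.,  82.])
    n = len(co)
    bias = np.mean(co - cp)
    cov = np.cov(co, cp, ddof=1)[0, 1]
    se_paired = np.sqrt((co.var(ddof=1) + cp.var(ddof=1) - 2.0 * cov) / n)   # = std(co-cp)/sqrt(n)
    se_unpaired = np.sqrt(co.var(ddof=1) / n + cp.var(ddof=1) / n)
    ci_paired = stats.t.ppf(0.975, n - 1) * se_paired
    ci_unpaired = stats.t.ppf(0.975, 2 * n - 2) * se_unpaired
    print(bias, ci_paired, ci_unpaired)              # the paired half-width is smaller here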
For comparison of more than two groups, a two-way analysis of variance
is more appropriate than multiple t-tests. If many confidence intervals
based on t are computed, one should be aware that the chance of making at
least one Type I error increases as the number of t-tests increases. The
confidence intervals are useful because they dramatize the fact that the
computed biases are only estimates. In choosing a 95% confidence level,
then applying it many times, it is important to recognize that 5% of the
computed confidence intervals will not contain the true bias.
Bias comparisons based on the Wilcoxon Test. When samples are small
and there is reason to believe that the data population is not normally
distributed, confidence intervals based on Student's t may be misleading.
On extremely skewed data, one test showed that a sample size of N = 40 was
required to achieve an accurate 95% confidence interval on the sample mean,
and in this case the errors were not symmetrically distributed but were
confined primarily to one tail of the distribution of sample means (Barrett
and Goldsmith, 1976). Thus, when it is known that data is not normally
distributed, it may be worthwhile to try an alternative to the t-test.
The Wilcoxon Paired Rank Test requires no assumptions about the dis-
tribution of the data. It is based only on the rank-order of the measure-
ments, and is quite sensitive to changes in the central tendency of a
distribution. It is described as a "mean slippage" test by Pearson and
Hartley (1976), who provide an excellent brief description and the required
probability tables.
The procedure is as follows: Differences between the paired measure-
ments are computed just as in a paired t-test. The differences are then
rank-ordered in the order of their absolute values, i.e., in the ordering
process, the sign is ignored. Finally, the sum T- of the ranks for all of
the negatively signed differences is computed. If the differences were
randomly distributed about a mean of zero, one would expect that there
would be similar numbers of positive and negative differences and that the
sum of ranks for negative differences would be approximately equal to the
sum of ranks for positive differences. Distributions of sample T- values
have been computed for sample sizes up to N = 50. For larger samples, the
distribution of T- is adequately approximated by a normal distribution with
mean E(T-) = N(N + 1)/4 and variance σ²(T-) = N(N + 1)(2N + 1)/24.
Example 1: Daily maxima predicted by the EPA1 model, selected from
the entire grid area, compared with observed maxima for 11 days.
                                         Day
                          1    2    3    4    5    6    7    8    9   10   11
Observed, Co            153  146  162  166  157  117  117  100  154  121  101
EPA1 predicted, Cp      119  113  102  129  109   93   82   85  105  142   82
Difference, d            34   33   60   37   48   24   35   15   49  -21   19
Rank of |d|               6    5   11    8    9    4    7    1   10    3    2
Sign of d                 +    +    +    +    +    +    +    +    +    -    +
In this case there is only one negative difference, and its rank is
3. Hence the sum of ranks for negative differences is T- = 3. Consulting
a probability table for T- when N = 11, we find that there is a two-tailed
probability of .01 that T- will fall outside of the interval [5, 61].
We conclude that the central tendencies of the observed and predicted
distributions are significantly different, with Co > Cp. Therefore, the
bias in EPA1 predictions of the daily maximum is significantly greater than
zero.
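The rank computation in Example 1 can be reproduced with a few lines of
code; this is a sketch for checking the arithmetic, not part of the
original report.

    # Sketch: compute T- (sum of ranks of the negative differences) for Example 1.
    import numpy as np
    from scipy import stats

    co = np.array([153, 146, 162, 166, 157, 117, 117, 100, 154, 121, 101])
    cp = np.array([119, 113, 102, 129, 109,  93,  82,  85, 105, 142,  82])
    d = co - cp
    ranks = stats.rankdata(np.abs(d))                # ranks of |d|, smallest = 1
    t_minus = ranks[d < 0].sum()
    print(int(t_minus))                              # 3, as in the example
    # scipy.stats.wilcoxon(co, cp) runs the same test and also returns a p-value.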
Example 2: Comparison of bias in DOT and EPA1 models in the
prediction of daily maximum concentrations. Predictions for the 11 days
are chosen from the full grid area.
                                         Day
                          1    2    3    4    5    6    7    8    9   10   11
EPA1 residual, d1        34   33   60   37   48   24   35   15   49  -21   19
DOT residual, d0         47   59   81   59   66   18   42    6   47    4   19
Diff. d1 - d0           -13  -26  -21  -22  -18    6   -7    9    2  -25    0
Rank of |d1 - d0|         5   10    7    8    6    2    3    4    1    9   --
Sign of d1 - d0           -    -    -    -    -    +    -    +    +    -   --
Note that the difference residual, d1 - d0, eliminates the
observations from the comparison; only differences between model
predictions are tested. Zero differences have no sign, therefore they are
not assigned a rank. The sum of ranks for the negative differences is
T- = 5 + 10 + 7 + 8 + 6 + 3 + 9 = 48. For N = 10 ranked differences, there
is a two-tailed probability of .05 that T- will fall outside of the
interval [8, 47]. Therefore at the α = .05 level, we can conclude
that the bias in the EPA1 model is less than the bias in the DOT model on
this data.
For comparison of more than two groups, the Friedman Rank Test may be
used when the normality assumptions required by a 2-way analysis of
variance are violated. This test, too, is described by Pearson and
Hartley, and the test statistic is distributed approximately as χ².
Although the Wilcoxon test can be used to estimate confidence
intervals using the methods of Hollander and Wolfe (1973), the confidence
intervals used here are based on Student's t. Unfortunately, these
confidence intervals may not be estimated accurately for small samples. By
doing both Wilcoxon and t-tests on each sample and comparing the results,
it may be possible to judge whether t-based intervals in a particular
sample are too small or too large.
Use of Wilcoxon and Friedman tests will not be appropriate on auto-
correlated data. The tests require that measurements be independent, and
we do not know of any adjustments analogous to that of Hirtzel and Quon for
the t-test. Therefore we will only be able to use these tests on data sets
which do not contain successive hourly concentrations.
Noise
The estimated noise in a model is the variance of the differences
    Sd² = Σ(d - d̄)² / (n - 1)
The standard deviation of the differences Sd (square root of the vari-
ance) is a more interpretable form of the noise measure because it is in
the same units as the original data.
Bias and noise are two separate components of the gross error, as
measured by the MSE, i.e.,
    MSE = ((n - 1)/n) Sd² + (d̄)² .
In the case when bias is zero, the gross error consists only of noise. In
the (unlikely) case that noise was zero, the gross error would consist only
of bias.
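A quick numerical check of this decomposition, using the Example 1
residuals (the code itself is only an illustrative sketch):

    # Sketch: verify MSE = ((n-1)/n) * Sd^2 + (d-bar)^2 on the Example 1 residuals.
    import numpy as np

    d = np.array([34., 33., 60., 37., 48., 24., 35., 15., 49., -21., 19.])
    n = len(d)
    mse = np.mean(d ** 2)                            # gross error
    noise = d.var(ddof=1)                            # Sd^2
    bias_sq = d.mean() ** 2                          # (d-bar)^2
    print(mse, (n - 1) / n * noise + bias_sq)        # the two values agree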
The effects of systematic bias in a model could be removed by simple
proportional calibration procedures. Corrections for noise, however, are
likely to be model-specific and may even require fundamental changes in the
model. Thus it is of interest to know how much of the error is
attributable to noise and therefore not controllable by simple
calibration.
A confidence interval for the estimated noise can be easily
established using the chi-square distribution, provided that the
differences are normally distributed. The differences resulting from our
three models all have skewed distributions, thus confidence intervals
computed for Sd² would be only approximations.
It should be noted that the noise level can be expected to be quite
large for completely paired hourly data, simply because predictions from
state-of-the-art photochemical models tend to miss by a few hours or a few
miles. Such errors are probably unavoidable. Strict pairing will continue
to be useful for diagnostic purposes, but noise computed as a performance
measure should probably be based on less strict pairings, such as the
observed and predicted daily maxima.
Variability Comparison
A comparison of the variances of the observed and predicted concentrations
can be useful for diagnosing errors in the model. If the variance in the
predictions (S²Cp) is much smaller than the variance in the observed data
(S²Co), then the model is doing a poor job of picking up day-to-day
fluctuations in ozone concentrations, and is probably holding too closely
to an "average" diurnal pattern. The F-test can be used to determine if
predicted variability is too small. If
    S²Co / S²Cp > Fcrit
(the critical value on the F-distribution, at the desired confidence
level), then the predicted variance is significantly less than the observed
variance.
If the variances are not significantly different then the model is
producing an acceptable amount of day-to-day variation, but the F-test
doesn't tell us whether that variation occurs at the same time or place as
that in the observed data. If such matching is important then the noise
measure is more appropriate than the variability comparison.
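A sketch of the variance comparison follows; the observed values are the
Example 1 maxima, while the unrealistically flat prediction series is
invented to show a significant F ratio.

    # Sketch: F-test for predicted variability that is too small.
    import numpy as np
    from scipy import stats

    c_obs = np.array([153., 146., 162., 166., 157., 117., 117., 100., 154., 121., 101.])
    c_pred = np.array([128., 125., 131., 135., 130., 118., 119., 116., 129., 122., 117.])  # hypothetical
    f_ratio = c_obs.var(ddof=1) / c_pred.var(ddof=1)
    f_crit = stats.f.ppf(0.95, len(c_obs) - 1, len(c_pred) - 1)
    print(f_ratio, f_crit, f_ratio > f_crit)         # True: predictions significantly too flat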
Correlation and Related Measures
The Pearson correlation coefficient should be used with some caution
in model evaluation, since its results can be misleading. It measures the
strength of a linear relationship between observed and predicted concentra-
tions. Three problems will be addressed: 1) Linear correlation ignores
the possibility of a curvilinear relationship. 2) A perfect linear
correlation (r = 1) could theoretically be obtained even when there are large
errors in the model. 3) High correlations on hourly concentrations may be
obtained merely because the model is able to duplicate an average diurnal
pattern, regardless of its ability to simulate differences between days.
1) The best initial picture of the relationship between observed and
predicted concentrations can be obtained from time series plots of Co and
Cp and scattergram plots of Co against Cp. A curvilinear relation-
ship between Co and Cp, or other unexpected pattern, may become evident
and may be useful for diagnostic purposes.
2) Computation of the correlation between Co and Cp should be
accompanied by computation of the slope and intercept of the regression
line Co = a + bCp. A perfectly fitting model would produce not only a
correlation of r = 1 but also a slope b = 1 and an intercept a = 0. Errors in
both Co and Cp will affect the magnitude of r. In the above regression
equation, errors in Co will not produce bias in the estimate of b, but
they will affect its sampling distribution (Brier, 1975). The scattergram,
slope, and intercept may be even more useful than d for diagnosing bias in
a model, since they may show systematic tendencies to overpredict at some
concentration levels and underpredict at others.
3) It is of dubious value, given the strong diurnal pattern in ozone
concentrations, to throw together all hourly concentrations and compute an
overall correlation between Co and Cp. The magnitude of the differ-
ences between the hours of the day is much greater than the variation
within any given hour. Therefore this overall correlation primarily
reflects the ability of the model to approximate the shape of the average
diurnal pattern. High correlations, while mildly reassuring, often
indicate only that. Differences in capturing deviations from the average
diurnal pattern will have relatively small impact on the magnitude of r.
It could be worthwhile to remove the effect of the diurnal pattern by
subtracting hourly means from the observed and predicted concentrations
before computing the correlation.
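The point can be illustrated with synthetic data (a sketch, not the Denver
data): predictions that reproduce only the average diurnal cycle still
correlate strongly with observations until the hourly means are removed.

    # Sketch: correlation before and after removing the mean diurnal pattern.
    import numpy as np

    rng = np.random.default_rng(0)
    diurnal = 40 + 60 * np.sin(np.pi * np.arange(12) / 12)      # shared average daily cycle
    c_obs = diurnal + rng.normal(0, 15, size=(8, 12))           # 8 hypothetical days
    c_pred = diurnal + rng.normal(0, 15, size=(8, 12))          # matches only the cycle
    r_raw = np.corrcoef(c_obs.ravel(), c_pred.ravel())[0, 1]
    dev_obs = c_obs - c_obs.mean(axis=0)                        # subtract hourly means
    dev_pred = c_pred - c_pred.mean(axis=0)
    r_dev = np.corrcoef(dev_obs.ravel(), dev_pred.ravel())[0, 1]
    print(round(r_raw, 2), round(r_dev, 2))                     # high r_raw, near-zero r_dev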
Of course, the cyclic effect of a diurnal pattern does not enter into
correlations of daily maximum concentrations. This accounts for the dras-
tic reduction in r to be seen in the St. Louis ozone study (Cole, 1982a)
when going from full-day hourly data to daily maximum data.
Trends Resulting from Changing Emissions
In the regulatory use of the Urban Airshed Model, accurate response to
changing emissions is critically important. Changes in ozone maxima over a
period of years are affected by both emissions and meteorology. To esti-
mate trends due to emissions change, it is necessary to filter out the
effects of year-to-year weather fluctuations.
In this study a linear regression of concentration versus year was
chosen to estimate the annual trend, or rate of change, in peak concentra-
tions. Regression was done on daily maximum concentrations Cy for high
ozone days in year y, based on the trend model Cy = a + t·y, where the
trend t and the constant a are estimated coefficients.
Observed and predicted trends over a period of years were estimated
separately, then compared. The trend predicted by the models due to emis-
sions change was obtained by repeating the simulations of the original days
in 1979-80 using a Denver emissions inventory for 1976. Meteorological
conditions were unchanged. To obtain an observed ozone trend for compari-
son it was necessary to smooth out annual fluctuations in meteorology,
therefore daily maximum concentrations from high ozone days for all six
years from 1975 to 1980 were used in the linear regression.
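A sketch of the trend estimate follows (the year/maximum pairs are
hypothetical, not the Denver data): regressing daily maxima against year
yields the trend t and constant a of Cy = a + t·y.

    # Sketch: linear trend in daily maxima across years.
    import numpy as np
    from scipy import stats

    years = np.array([1975, 1975, 1976, 1976, 1977, 1978, 1978, 1979, 1979, 1980, 1980])
    maxima = np.array([165., 158., 160., 151., 155., 148., 152., 146., 149., 141., 144.])  # hypothetical
    fit = stats.linregress(years, maxima)
    print(fit.slope, fit.intercept)                   # trend t (ppb per year) and constant a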
Graphs
Graphs are an integral part of statistical modeling, at every stage
from model development through model evaluation. In particular, scatter-
plots of the residuals of a model are the primary tools recommended by
Draper and Smith (1966) for evaluation of a regression model. We suggest
the following graphs.
(1) Histograms of the concentrations and of the differences, to show
the shape of their frequency distributions.
(2) Plots of Cp against Co, and of both against time.
(3) Scatterplots of the differences (model residuals) against any
relevant variable, including time, observed concentration,
predicted concentration, and variables used as input to the
model. The residuals should be scattered randomly within a
horizontal band of even width. If their pattern is sloped,
curved, or cyclic, then inadequacies in the model may be indica-
ted.
(4) If the data fall into natural categories of some type, plots of
residuals by category, or bias by category. If the residuals are
normally distributed, then confidence intervals on the bias
should also be plotted. If not, box plots of the residuals by
category would be a useful alternative (Kleiner and Graedel,
1980).
Because pollutant data falls into natural categories by monitoring
site, by hour of observation, and by day of observation it may be useful to
do plots for each of these categories separately. In our experience, plots
of bias by category, rather than the residuals themselves, have provided
the best visual checks on possible patterns of error in a model.
Graphs can also be very helpful in the search for specific causes of
error in a model. For example, in the Denver evaluation plots of the
observed and predicted spatial field of concentrations and plots of wind
trajectories were found to be useful.
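As a sketch of two of the suggested displays (all values are invented), the
following plots residuals against the predicted concentration and mean bias
by hour of day:

    # Sketch: residual scatterplot and bias-by-hour plot with hypothetical data.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    hours = np.tile(np.arange(6, 18), 8)                         # 8 days of daylight hours
    c_pred = 40 + 60 * np.sin(np.pi * (hours - 6) / 12) + rng.normal(0, 10, hours.size)
    resid = 15 + rng.normal(0, 20, hours.size)                   # Co - Cp, biased-low predictions

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 4))
    ax1.scatter(c_pred, resid, s=10)
    ax1.axhline(0, color="k", lw=0.5)
    ax1.set(xlabel="Predicted concentration (ppb)", ylabel="Residual Co - Cp (ppb)")
    bias_by_hour = [resid[hours == h].mean() for h in range(6, 18)]
    ax2.plot(range(6, 18), bias_by_hour, marker="o")
    ax2.axhline(0, color="k", lw=0.5)
    ax2.set(xlabel="Hour of day", ylabel="Mean bias (ppb)")
    plt.tight_layout()
    plt.show()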
Analysis of Subgroups of the Hourly Concentrations
Frequently it is worthwhile to do special analyses on subgroups of a
data set in order to detect characteristics unique to that subgroup which
may be masked or averaged out in the full set. One AMS workshop suggestion
involved separating the data into meteorological categories, for example by
stability class or wind speed, and comparing model performance on different
categories. Such categories should not be used with hourly observations if
confidence intervals are desired, though, because it would be impossible to
determine the amount of autocorrelation in the subgroups. A useful alter-
native is to create categories based on hour of the day, averaging over all
of the days. This tends to separate the data roughly by stability class,
and to show special morning and afternoon characteristics as well. By
using only one hour from each day in each group, it eliminates the problem
of hour-to-hour autocorrelation.
Two ways of sorting the hourly concentrations (breakdowns) were found
to be useful in the Denver example evaluation:
Comparing Co(x,t) with Cp(x,t) for each hour separately, averaged
over all of the days. This is the most important breakdown because it
addresses how well the model is reproducing the diurnal pattern. Perfor-
mance measures were computed for each site and for all sites together.
Comparing Co(x,t) with Cp(x,t) for each day separately, averaged
over all hours in that day. This breakdown is useful as an aid to diagno-
sis because it can isolate days with unusual characteristics. It obscures
information about the diurnal pattern, though, hence it does not contribute
directly to the evaluation of performance. Again, the measures were
computed for each site and for all sites together.
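A sketch of the first breakdown for a single site (the arrays are
hypothetical): one bias and noise value per hour, each computed across days
so that hour-to-hour autocorrelation does not enter.

    # Sketch: bias and noise by hour of the day, averaged over days, for one site.
    import numpy as np

    rng = np.random.default_rng(2)
    obs = 40 + 60 * np.sin(np.pi * np.arange(12) / 12) + rng.normal(0, 12, (11, 12))  # days x hours
    pred = obs - 15 + rng.normal(0, 10, (11, 12))               # predictions biased low
    resid = obs - pred
    for h, (b, s) in enumerate(zip(resid.mean(axis=0), resid.std(axis=0, ddof=1))):
        print(f"hour {h + 6:2d}: bias {b:6.1f}  noise {s:5.1f}")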
Ways of Pairing Daily Maximum Concentrations
The accuracy of maximum ozone predictions on a given day can be judged
in several ways, depending on how the "maximum" predictions are selected.
Comparisons on the Site Maximum
A local site maximum is the observed maximum concentration on a given
day at a given site. Two comparisons were tried, one requiring complete
pairing in time. They were
a) Co(s,h) with Cp(s,h), paired by hour of the day. The observed maximum
   is paired with the prediction at the same hour.
b) Co(s,h) with Cp(s,x), unpaired by hour. The observed maximum is paired
   with the predicted maximum for that day and site, no matter what the
   hour.
Comparisons on the Daily Maximum
The daily maximum is the maximum concentration over all sites. Three
types of comparisons were tried, representing successively less stringent
pairings in space. They were
a) Co(s) with Cp(s), paired by site. The predicted daily maximum at the
   site of the observed daily maximum.
b) Co(s) with Cp(x), unpaired by site. The daily maximum predicted at any
   site, whether observed there or not.
c) Co(s) with Cp(g), unconstrained in space. The predicted daily maximum
   from any grid point in the modeled region.
If the days to be simulated are chosen randomly, then we would expect
the following biases to result from these pairings. Pairing by site, (a),
constrains the choice of predicted maximum, therefore it can be expected to
lead to underprediction and hence to positive bias. When daily maxima are
not paired by site, (b), there are equal numbers of observations and
predictions to choose from, hence no bias is inherent in the pairing
method. If predicted maxima are unconstrained in space, (c), then over-
prediction, hence negative bias, should be expected because observed maxima
are limited to the monitoring sites.
In practice, however, studies of model performance are often confined
to days on which high ozone concentrations have been observed at the moni-
toring sites. This should produce some additional tendency toward under-
prediction under all three pairings, because days with lower observed
concentrations have been excluded.
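For illustration, the three pairings of daily maxima described above can be sketched as follows; the array names and layout are assumed and this is not the procedure actually used to generate the tables that follow:

    import numpy as np

    def daily_maximum_pairings(obs_site_max, pred_site_max, pred_grid_max):
        # obs_site_max  : (n_days, n_sites) observed daily maxima by site
        # pred_site_max : (n_days, n_sites) predicted daily maxima by site
        # pred_grid_max : (n_days, n_cells) predicted daily maxima by grid cell
        days = np.arange(obs_site_max.shape[0])
        peak_site = obs_site_max.argmax(axis=1)    # site of observed daily maximum
        c_obs = obs_site_max.max(axis=1)           # observed daily maximum, all sites

        paired_by_site = pred_site_max[days, peak_site]   # comparison (a)
        unpaired_by_site = pred_site_max.max(axis=1)      # comparison (b)
        unconstrained = pred_grid_max.max(axis=1)         # comparison (c)
        return c_obs, paired_by_site, unpaired_by_site, unconstrained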
III. EXAMPLE PERFORMANCE EVALUATION
This section presents the use of the performance measures described
above in an actual performance evaluation. The goal is to discover which
measures give the most information for different purposes of evaluation,
and to demonstrate the complexity of interpreting the information contained
in performance measure statistics. It will be shown that skill is
required, and that a performance evaluation most likely cannot be performed
in a routine mechanical fashion.
First, the data set as a whole will be examined, looking for anomalous
behavior. Then, a first-level comparison of hourly observed and predicted
ozone concentrations will be presented. Because an understanding of the
sources of error is so critical, a second level comparison of hourly data
is presented, showing how a combination of statistical measures and sensi-
tivity analyses can be used to diagnose the causes of error. The experi-
ence with the performance measures on hourly concentrations is summarized
before next turning to the peak concentration comparisons. Several ways of
matching peak concentrations are presented, moving through successively
less restrictive constraints in the pairing of data. After summarizing
these comparisons, the performance evaluation is taken one important step
further: the model is tested for the regulatory purpose for which it was
designed—the prediction of changes in concentrations due to changes in
emissions. Finally, concluding comments are made about the carrying out of
a performance evaluation.
A. MODELS AND DATA BASE
This study was intended to evaluate the use of statistical measures to
discriminate between models as well as to evaluate a single model. There-
fore, three versions of an urban photochemical model were compared on the
same data base. The basic model is the Urban Airshed Model, developed and
modified by Systems Applications Inc. (SAI) for use in air quality planning
work required under the federal Clean Air Act. A description of the model
and its usage is given in an EPA guideline (Layland, 1980). Variations on
this model have been used previously in Denver (Reynolds, 1979) and in Los
Angeles (Tesche, 1981 and Reynolds, 1979), Sacramento (Reynolds, 1979),
St. Louis (Cole, 1982b), Tulsa (Reynolds, 1982) and Philadelphia (Haney,
1983).
The three versions represent incremental improvements in the model.
The earliest version, which will be referred to as the Department of Trans-
portation (DOT) model, uses Carbon Bond I chemistry (Reynolds, 1979). The
intermediate version, referred to as the EPA1 model, uses an improved
chemical mechanism, Carbon Bond II (Whitten, 1980). The most recent
version, referred to as the EPA2 model, uses Carbon Bond II chemistry and,
in addition, reduces the artificial dispersion of pollutants within the
model by using an improved finite differencing method (Schere, 1982).
The EPA2 version is the model currently recommended by the Environ-
mental Protection Agency for prediction of photochemical air pollution in
State Implementation Plan work required by the Clean Air Act. Our testing
of the EPA2 version on Denver ozone concentrations is one segment of a
broader evaluation of that model which also includes St. Louis and
Philadelphia.
Data Base Used
Eleven days were simulated for the performance work. These were
selected from high-ozone days in Denver in the summers of 1979 and 1980
having a peak ozone concentration of at least 100 ppb. (The maximum
observed was 166 ppb.) Only weekdays were included, because an emissions
inventory was available only for weekdays, but this is not a serious
restriction because observed ozone patterns are similar for weekends and
weekdays. The sample days, by and large, represent isolated high ozone
days—single day "episodes." This is typical for Denver's highest ozone
days. The sample of days was also weighted towards the days with the
highest observed maximum.
Figure 2 shows the region modeled in and around Denver. The shaded
areas show the contiguous metropolitan and developed regions. The five
monitoring stations are also shown: Arvada, CAMP, CARIH, Highland and
Welby. The modeling grid of 2 mile by 2 mile cells has been overlain for
perspective.
In all cases but one, the modeling area shown in Figure 2 fully con-
tained the predicted daily maximum. For one day, the peak was at the edge
of the modeling region boundary. It is estimated that this did not affect
the predicted daily maximum by more than 5%. The simulations were run from
5 a.m. to 5 p.m. (1700), the time over which photochemical production takes
place. All daily maxima, predicted and observed, were contained in this
time period. Because our population of days represented single-day
episodes, there was no reason to carry the simulations further in time,
given limited computer resources.
Meteorological Conditions
Several types of data are included here to provide background on the
meteorological conditions that existed on the eleven days that were
simulated. We have not made a thorough analysis of these meteorological
conditions compared to meteorology on days with low ozone levels. Thus we
cannot indicate which conditions are better than others as indicators of
high ozone days. These data do, however, give some indication of the
similarities between the days which were modeled.
Our judgment, after reviewing the performance evaluation of the
modeled days and reviewing these data, is that there does not seem to be
any pattern to suggest that some of the eleven days would be associated
with multi-day periods of stagnation; rather, the eleven days have been
judged to represent individual, single-day developments of high ozone
levels. This seems to be the dominant and special character of Denver's
ozone problem. The most prevalent characteristics of the days are strong
heating at the surface, very low wind speeds throughout the morning until
mid-day, low wind speeds through early afternoon, typical summertime mixing
depths and a high pressure ridge aloft. A variety of wind patterns, not a
single type, characterize these days, but the dominant pattern is one of
wind flows zigzagging over Denver.
Table 1 shows the synoptic conditions of high pressure at 500 mb and
at the surface for the modeled days, together with the conditions one and
two days earlier and one day after. These data are from the Daily Weather
Maps of NOAA. No pattern is evident between the sequence of high pressure
at the surface and daily maximum ozone values. A "pattern" is evident for
high pressure at 500 mb. All of the days in our data set are ones in which
a high-pressure ridge moves slowly over Colorado at 500 mb, being there the
day before and the day after the modeled day.
Further data related to synoptic conditions are given in Table 2. Of
note is the fact that the surface temperature on almost all of the modeled
days is above 90°F, a high temperature for the Denver area. Precipitation
is associated with thunderstorms in the afternoon. This is evident in
Table 3, showing sky cover by time of day. It is noteworthy that
insolation is strong through noon. The time of maximum temperature
corroborates that strong surface heating is occurring on each of the eleven
days.
Strong surface heating implies that these days should have high mixing
depths. One must take subsidence into account, however. Table 4 shows
that only one day had an upper level inversion below 2100 m at 0500 MST and
no day had evidence of an upper-level inversion at 1700 MST. This together
with the fact that there were no sudden changes in upper-level dewpoint
temperatures implies that subsidence is not a factor that needs to be
considered on these days. There appears to be no relationship between
estimated mixing depth and observed daily maximum ozone on these eleven
days.
The most notable and common meteorological condition across the eleven
days is the existence of low winds in the morning. This is shown
in Table 5 for the wind speeds at the five ozone monitoring stations. The
morning wind speeds are low and even the average wind speed for the day is
low, not far from 2 m/sec.
The character of the wind flows in time is seldom simple for the
eleven days. Four rough categories are sufficient to give the appropriate
impression (Table 6). As shown in Table 6, there is a simple straight
through wind flow on day 79180. Day 79249 had a straight through flow
interrupted by a zigzag over Denver. Several days had mostly a zigzag wind
flow over Denver and three days showed curved wind flows. Day 79218 was
interesting because it had a smooth wind reversal over much of the urban
area.
B. GENERAL DATA SET EVALUATION
Before going through an evaluation, it is worthwhile to check whether
any days show anomalous behavior. This will point out unusual days for
which the simulation should be checked, either because the model behavior
is out of the ordinary, or because the data base contains unusually large
errors. The bias and absolute deviation computed for each day separately
can show average differences in performance by day. It is helpful to com-
pute these as a percent of the mean observed concentration for the day.
The daily bias and daily absolute deviation give similar information, but
because we want to understand how the model is doing on the whole (looking
for gross errors), the daily absolute deviation is expected to give a more
complete indication. Figure 3 shows the daily bias and the daily percent
bias, while Figure 4 shows the daily absolute deviation and the daily per-
cent absolute deviation, for all three models. The differences between
days are minor in all four graphs, and the daily percent absolute deviation
is particularly uniform across the days. If any day showed an absolute
deviation far higher than the others, or a bias that was significantly
larger, it should receive special analysis. It is interesting to note that
those three days on which the observed peak occurred at the Highland site
are the days which have the highest daily bias.
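For concreteness, the per-day screening statistics used here (daily bias, daily absolute deviation, and their percent forms) can be sketched as follows; the data layout and column names are assumed:

    import pandas as pd

    def daily_screening(df: pd.DataFrame) -> pd.DataFrame:
        # df: one row per site, day, and hour with columns 'day', 'obs', 'pred'.
        resid = df["obs"] - df["pred"]
        by_day = pd.DataFrame({
            "bias": resid.groupby(df["day"]).mean(),
            "abs_dev": resid.abs().groupby(df["day"]).mean(),
            "mean_obs": df["obs"].groupby(df["day"]).mean(),
        })
        # Express both statistics as a percent of the day's mean observation.
        by_day["pct_bias"] = 100.0 * by_day["bias"] / by_day["mean_obs"]
        by_day["pct_abs_dev"] = 100.0 * by_day["abs_dev"] / by_day["mean_obs"]
        return by_day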
Figure 5 (a) shows the daily absolute deviation plotted against the
observed daily average concentration. We are interested in detecting any
unusual relationship between a day's gross error and its average ozone
level because this, too, could indicate a day which requires special atten-
tion. Again, however, no day stands out as unusual.
For a thorough check on anomalous behavior, one should use confidence
intervals on the bias estimates as shown for EPA2 in Figure 5 (b). Auto-
correlation in the model residuals has been taken into account in computing
the confidence intervals. Residuals of all three models investigated here
contain substantial autocorrelation, as shown in Table 7. The exponential
decline of each autocorrelation function with increasing lag is charac-
teristic of a first order autoregressive process. The autocorrelation in
the EPA1 and EPA2 model residuals is slightly lower than in the DOT model
residuals, however the differences are too small to be significant.
If the 12 hourly residuals from a single day are used together, the
effective number of independent observations ranges from ne ≈ 2.9 for the
DOT model to ne ≈ 3.3 for the EPA2 model. (Computed using the equation
on page 12.) The estimates of φ show considerable variation, however, so
ne = 3 per day will be used in computing the standard error of the bias
for all three models, in order to avoid assuming more precision than is
warranted.
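The equation referred to on page 12 is not reproduced here; for residuals following a first-order autoregressive process, a common form is ne = n(1 - φ)/(1 + φ), where φ is the lag-one autocorrelation. Assuming that form, the calculation can be sketched as:

    import numpy as np

    def effective_n(residuals):
        # Effective number of independent observations for one day's
        # residuals, assuming a first-order autoregressive error structure.
        # (The exact equation used in the report may differ in detail.)
        r = np.asarray(residuals, dtype=float)
        r = r - r.mean()
        phi = np.sum(r[:-1] * r[1:]) / np.sum(r * r)   # lag-one autocorrelation
        n = len(r)
        return n * (1.0 - phi) / (1.0 + phi)

With 12 hourly residuals per day and φ near 0.6, this yields an ne of roughly 3, the rounded value adopted above.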
From the confidence intervals in Figure 5(b) it can be seen that bias
estimates do not differ significantly over the 11 days. With only five
monitoring stations, the small sample size makes it unlikely that
differences between days would be statistically significant, therefore a
day would have to be rather far out of line to be termed anomalous in the
Denver example. Figures 3, 4, and 5 give consistent information about the
data set, indicating that no one day exhibits anomalous behavior. We
conclude that all 11 days can be used for the performance evaluation, for
all three models.
With this conclusion, we now turn to the first step of the evaluation
itself: the paired comparisons.
C. FIRST-LEVEL COMPARISON: PERFORMANCE MEASUREMENTS FOR HOURLY DATA
General Overview Statistics
Performance measures for the three models on the full hourly data set
are shown in Table 8. Measures are also given for each site separately to
show differences in performance between sites. Autocorrelation in the
differences, d, has been accounted for in establishing confidence intervals
for d. Because some correlation between sites is likely, the confidence
interval estimates for the bias in the full data set are probably somewhat
too narrow.
Several basic conclusions can be drawn from these measures. First,
all three models show bias significantly greater than zero on the full data
set and at the majority of sites. Thus, there is a systematic tendency to
underpredict in all three models. Second, the variance in the predictions
was significantly less than the variance in the observations in all three
models. Third, model performance is similar for the three: bias, noise,
and variability show much greater differences between sites than between
models. Fourth, bias as a percent of the mean concentration C is
particularly high at CARIH and particularly low at CAMP for all three
models, with the differences bordering on statistical significance.
These general observations provide little insight into where the
models may be going wrong, however. More detailed breakdown of the data is
required to find which hours or days contribute most to the bias and
noise.
Model Performance by Hour of the Day
One of the most important predictive capabilities of the model is its
ability to simulate the diurnal cycle of ozone production. Figure 6 shows
the observed and predicted diurnal patterns, averaged over our 11 sample
days, for each of the five monitoring sites. The CARIH site stands out,
having strong underprediction throughout the day. At the other four sites,
predictions are close to or slightly higher than observed values in the
morning, with substantial underprediction at the peak. This is a pattern
which was also observed in an evaluation of the EPA2 version of the model
on St. Louis data (Cole, 1982b). Only at CAMP do the afternoon predictions
come close to observed values.
Figure 7 shows the hourly bias, averaged over the 11 days, for each of
the three models for each of the five monitoring sites. It is apparent
that all of the models show large bias in the midday hours, and that all of
the models are basically alike in their hourly bias pattern. The bias is
largest from 11:00 a.m. on at all of the stations. As noted earlier, the
bias pattern at each station is different, both in shape and in the timing
of the maximum bias.
Figure 8 shows EPA2 hourly bias estimates for each of the five sites,
with the associated 95% confidence intervals based on Student's t. None of
the sets of hourly residuals differed significantly from a normal
distribution under the Kolmogorov-Smirnov test, even with a significance
level of α = .20. Therefore t- and F-tests were assumed to be appropriate
for these data. However, to compare the t-test with the Wilcoxon Paired
Rank Test, the Wilcoxon test was also used to determine whether hourly EPA2
biases were significantly different from zero with α = .05. For the 60
biases tested (12 hours x 5 sites, each with n = 9 to 11), the Wilcoxon and
t-tests disagreed only twice, and in both cases of disagreement the
confidence level was quite close to the .95 borderline. This proportion of
disagreement (.033) is quite compatible with a significance level of .05.
When all sites were analyzed together, with approximately 55 measurements
for each hour, the Wilcoxon and t-tests agreed on significance or
non-significance of the bias for every hour.
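The cross-check between the two tests can be sketched with SciPy as follows (the data layout is assumed; this is not the original analysis code):

    from scipy import stats

    def check_hourly_bias(residuals, alpha=0.05):
        # residuals: observed minus predicted for one hour of the day and
        # one site, across days (roughly 9 to 11 values).
        t_stat, t_p = stats.ttest_1samp(residuals, popmean=0.0)
        w_stat, w_p = stats.wilcoxon(residuals)  # signed-rank test of zero median
        return t_p < alpha, w_p < alpha          # True where bias is significant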
Figure 9 shows the hourly bias and the hourly percent bias for EPA2
with the 95% confidence intervals for all of the sites averaged together.
With the confidence intervals, one can see that the bias is statistically
significant from 10:00 a.m. on.
To give a general comparison of the models by hour, Table 9 shows the
bias and noise for each hour, averaged over all of the sample days and
sites. The differences between the models are small at any given hour in
noise as well as bias. The bias peaks around noon, as do the observed
ozone concentrations, declining during the afternoon. The noise, on the
other hand, continues to increase until 3 p.m. (hour 15). Thus the errors
in the models decrease in the afternoon on the average, but there continue
to be high errors in the afternoon in some cases. Looking at each site
separately, the high noise in the afternoon is most characteristic of
Highland. Extremely high noise at Highland for hours 14 and 15 shows that
errors differ greatly from day to day there, in those hours. This
indicates the need for a special comparison of daily data at that site.
Daily bias, averaged over all hours and sites, was checked above
(Figure 6), looking for anomalous days. To look instead for unusual model
performance by day at a given site, Figure 10 shows the daily bias,
averaged over all hours, for each station. One immediately notes, in this
figure, that the daily bias is quite different across the stations.
Furthermore, at Highland three days stand out with high bias while the
others have near-zero bias.
Different patterns at different stations, both in the hourly and in
the daily bias, are clues that many sources of error have been intertwined
within our average estimates of bias and noise.
D. SECOND-LEVEL COMPARISON: DIAGNOSING ERRORS IN HOURLY PREDICTIONS
This evaluation is meant to examine the adequacy of the models from
a regulatory perspective. Clearly the above analysis has not provided
enough information to make that assessment. Systematic errors have been
found but their causes, and hence their impact on regulation, have not been
identified. Some errors will be tolerated for regulatory applications,
others will not. It is imperative to try to diagnose the causes of hourly
bias before one can truly begin to assess the adequacy of the model.
Several causes of error will be pointed out in this section. The
errors were discovered and/or analyzed by using graphs, statistics on
subgroups of the data, and sensitivity studies involving controlled changes
in the model or its inputs. The causes of error discussed are missing the
peak in space, point source influences, introduced error, dispersion
influences, and missing the peak in time.
Missing the Peak in Space
Most models can be expected to place the peak in the wrong position in
space, because trajectory errors are introduced in the wind field when
hourly averages are used, because the monitoring instruments and sites are
not perfect, because data has to be interpolated and extrapolated, and
because the wind observation network is normally too coarse to resolve the
necessary structure in the wind field. The airshed models seem to exhibit
both errors in direction and errors in distance when missing the peak in
space.
The situation at Highland is an excellent, clear example in which
error in the spatial location of the predicted peak contributed a major
share of the bias and noise. From Figure 10 it is clear that three days
(Days 180, 218, and 249) have unusually high daily bias compared to the
other days. These three days also have much higher than average daily
observed ozone, as indicated below.
Day    Julian Date    Daily Co at Highland
 1       79180              90.9*
 2       79193              60.8
 3       79208              50.8
 4       79218              92.5*
 5       79249              85.4*
 6       80170              58.8
 7       80177              40.0
 8       80191              52.1
 9       80204              48.7
10       80207              45.2
11       80219              42.2
Average, All Days:          61.8
The models did not make correspondingly high predictions at Highland
on these three days; rather, the predictions are similar to those on the
other eight days. A check of the isopleth maps of predicted ozone concen-
tration shows that on all three days the peak ozone cloud came near
Highland, but missed the monitoring site on every occasion (See Figures 11,
12 and 13). In addition, the predicted ozone peak was earlier than the
observed peak on day 79180, on which the wind simply blew southward across
Denver. The predicted peak was late on day 79218, on which there was a
wind reversal at mid-day over much of Denver. The predicted peak was "on
time" on day 79249, on which the wind zigged a bit over Denver before
continuing southward. Thus peaks were predicted nearby and on trajectories
consistent with a peak being observed at Highland, but they clearly missed
the monitoring station. The impact of missing the peak in space (and time)
is assessed in Table 10. A significant portion of the difference between
observed and predicted ozone peaks on these three days can be explained by
the predicted peak having missed the monitoring site. At least half of the
difference on these days, however, is associated with underprediction of
the peaks by the model.
Several bias and noise measures provided a clue to this problem. In
particular, the hourly noise jumped by nearly a factor of two for the hours
of 1400 and 1500 (see Table 11). These are the hours when the observed
peaks occurred at Highland for these three days. Table 11 also shows the
change in the hourly bias and noise in EPA2 when the three days are removed
and the measures re-computed. The high bias in the afternoon has essen-
tially disappeared, being replaced by more random behavior, as one would
hope to see. Thus, it appears reasonable to state that most of the bias
and noise between 1200 and 1600 at Highland can be attributed to the
model's behavior on those 3 days. This problem of missing the peaks in
space also accounts for low variability in the model predictions at this
site, since observed variability is also low in the eight-day subset which
contains no peaks.
The Highland example shows that it is possible for statistics on
subgroups to aid in pinpointing clear problem days. Problems can be
isolated if not every day has some problem or another. While the
statistics could pinpoint days contributing most of the bias and noise,
they could not go to the next step and actually assess which errors were
contributing to the bias and noise on those problem days. The isopleths
for each hour of each day, and knowledge of the wind trajectories on a
case-by-case basis, were necessary to assess the errors.
Only one of the other four stations, Arvada, exhibits some of the
characteristics that seem to be associated with a day and site in which the
peak has been missed in space: unusually high daily bias, unusually high
observed peak ozone, and unusually large jumps in hourly noise. Arvada has
four days with high bias relative to the other days. (There was one day
with a large negative bias that was not large enough to be a candidate.
See Figure 10.) Three of these days had a large observed ozone average,
days 79180, 79193, and 80204. But Arvada does not evidence any large jumps
in the noise during peak hours, therefore missing the peak in space is
probably not a major contributing source of error at this site. The hourly
bias and noise in EPA2 for Arvada, with and without the days 79180, 79193,
and 80204, is given in Table 12. There is some improvement in the bias for
the hours 1200, 1300, and 1400, but nothing as dramatic as with Highland.
The hourly noise has hardly changed. As with Highland, the early morning
negative bias became more pronounced for the hours 600, 700, 800, and 900.
Elimination of the days which had both high observed concentrations
and high bias did not produce much change in the error measurements (bias
and noise) at Arvada. We conclude that sources of error other than missing
the peak in space should be found at Arvada. For example, from examination
of the observed ozone concentrations in Figure 11, it is clear that on Day
180 the ozone peak is supposed to be in the vicinity of Highland, as
predicted, and not at Arvada. But there should still be some ozone
remaining over central Denver, upwind of the peak. Thus day 79180 does not
so much represent a problem at Arvada of missing the peak in space, but
rather, of missing the residual or left-over ozone when the peak is to the
south.
Isopleths of ozone concentrations for days 79193 and 80204 are shown
in Figures 14 and 15. Figure 14 indicates that some of the bias at Arvada
is due to the model missing the peak in space. Figure 15 also suggests the
peak is missed in space. Wind trajectory analysis on the predicted peak of
day 80204 indicates, however, that the predicted peak may be an artifact of
a 2-hour slow-down and 180° reversal in the wind field. Additionally, the
vertical mixing sensitivity study discussed later finds a predicted peak at
Arvada and CARIH on day 80204. Thus causes other than missing the peak in
space contribute to the bias at Arvada.
In summary, three indicators simultaneously giving unusual
answers—unusually large daily bias, high observed ozone on those same
days, and large changes in hourly noise at peak hours—seem to be good
discriminators in looking for days in which missing the peak in space is
one major source of error. However, to get beyond that general
identification to the level of detail necessary to support a regulatory
analysis, one must use graphs and a case-by-case examination of the
results.
Point Source Influence
There are still unidentified sources of bias in the Urban Airshed
Model predictions for Arvada and the interior station, CARIH. As Figure 10
indicates, CARIH has a serious bias problem on almost every day, and Arvada
on some days. The problem in the daily maximum predictions at CARIH is
shown in Table 13. One possible cause is that the point source nitrogen
oxide emissions are being mixed too rapidly to the ground (vertically) and
too rapidly horizontally, effectively depressing the ozone production.
Thus, this potential source of hourly bias will be investigated next.
A sensitivity study was carried out on six of the days. For this
sensitivity study the point sources were removed from the input data for
the EPA2 model and the day was resimulated. The days were picked to span
the range of concentrations observed in our 11 sample days at CARIH,
because we wanted to find out whether CARIH was influenced by the point
sources. The six days also have wind flows that zigzag or curve
over central Denver, near Arvada and CARIH. Thus they should represent
well the influence of point sources on the prediction at these two sites.
The hourly biases which resulted from the sensitivity study are given
in Table 14 for each of the stations. It is not surprising that there was
no change at Highland, because there are no point sources near that site.
There was also essentially no change at Arvada and Welby. There was some
reduction of bias at 1000 and 1100 at CAMP, little change in the rest of
the hours. At CARIH, there was some reduction of bias at 900 and 1000, but
absolutely no improvement was given to the extreme bias shown at 1200. It
is interesting to note that the bias at 1200 is much larger for these six
days than for the 11 days on the average.
Change is more apparent when looking at the maximum ozone predicted at
three sites, Arvada, CAMP and CARIH, as shown in Table 15. Arvada shows a
slight but uniform improvement in the maximum predictions. CARIH and CAMP
are more mixed, the change being uneven across cases, but day 79193 shows
clear improvement in the model predictions. Isopleths of predicted ozone
on different hours for day 79193 are shown in Figures 16, 17, 18 and 19,
showing the isopleths for the case with point sources and comparisons
without point sources.
On day 79193, the point sources produced a number of effects. First,
they depressed the peak concentrations that were predicted (see
Figure 17). Second, they cut the ozone peak apart, reducing the predicted
spatial extent of the ozone cloud (compare Figures 18 and 19). Third, they
caused stationary "holes" to appear in the predicted pattern of ozone
concentrations (see Figure 16). These impacts on the predicted ozone are
important, but they do not explain CARIH's problem. Day 79208 shows an
even larger reduction (Figure 22) in the spatial extent of the ozone peak
than does day 79193.
Day 80170 has the opposite effect on CARIH from Day 79193 in
Table 15. Isopleth diagrams for this day are shown in Figures 20 and 21.
The isopleths show CARIH to be in a saddle and the saddle has become a bit
lower when the point sources are removed. But the height of the maximum
predicted peaks and their spatial extent are not changed much at all,
though the hour at which the highest peak occurs did change. As before,
some "holes" in the ozone disappear.
None of the analyses of the point source influence show any cause to
associate the bias at CARIH or Arvada with some unusual behavior associated
with the dispersion of point sources. In other words, no significant
fraction of the hourly biases that we are trying to explain can be
attributed to point source effects.
The sensitivity study did show the point sources were having other
important influences, however. Figure 22 compares the predictions of EPA1,
EPA2 without point sources, and EPA2 with point sources for day 79208. It
suggests that for some days elimination of the artificial diffusion in the
change from EPA1 to EPA2, while improving the peak predictions of EPA2,
made its results more sensitive to the point source emissions. In addition,
the comparison of contour plots from EPA1 and EPA2 showed that other
stationary holes in the ozone were more apparent in EPA2 than in EPA1. These
other holes behave similarly to those induced by the point sources, which
suggests there may be sources in the area source emissions inventory that
act similarly to the elevated point sources.
Introduced Error
By "introduced errors" we mean errors which can be directly attributed
to one of the inputs to the model. One case is addressed in detail in this
report, hourly bias introduced through the setting of background
concentrations. An understanding through sensitivity studies of how the
model responds to these inputs was established before the concomitant
pattern in the hourly bias was recognized. The effect of two other inputs
to the model, the wind fields and the emissions inventory, will also be
discussed briefly.
Other potential sources of introduced error were not examined. A
known error is the use of a surface-based photolytic rate constant for
NO2. Because ultraviolet radiation, which induces NO2 dissociation,
increases with increasing altitude, higher rate constants are expected
above the surface (Demerjian, 1980). This has a direct impact on ozone
concentrations. For Denver, this effect could possibly account for a 10
percent underprediction of the peak ozone concentration. Other sources of
error, such as the emission inventory, could easily contribute errors of
this magnitude as well.
A sensitivity study was conducted to test the model response to
changes in the value of background ozone. Seven of the 11 days were picked
at random for this test. EPA1 was used for the test due to computer
resource constraints. For the low background 20 ppb was used; for the high
background 90 ppb was used. This variation was centered at 55 ppb and the
average background for the seven days was 50.7 ppb. Table 16 summarizes
the substantial effect on the daily maximum predicted ozone. Similar
effects are seen at the individual stations, as for example on day 79218 in
Figure 25. An examination of the isopleths shows that no peak was changed
as to the prediction of its location by an increase in the background ozone
from the normal value. See for example Figures 26, 27, 28, and 29. On
three of the days (80170, 80191, and 80219) there is an apparent shift in
time of the daily maximum. For the high ozone cases shown, this shift in
time is caused by a change in the relative importance of two different
peaks in the modeled region. For example, on day 80170, the primary peak
has its maximum at 1500, measuring 92.7 ppb (Figure 28), while the
secondary peak has its maximum at 1700, measuring 90.2 ppb (Figure 29).
For the high ozone background simulation, the original primary peak still
has its maximum at 1500, measuring 124 ppb, while the original secondary
peak still has its maximum at 1700, but now measures 138 ppb, becoming the
primary peak. The same behavior explains the time change of the daily
maximum for days 80191 and 80219. All of the other days only had a single
peak predicted in the modeled airshed; hence there was no change in the
timing or location of the peak. (Day 79180's change in time should not be
taken seriously because the peak is right at the edge of the modeling
region.)
The simplified method used to set background ozone concentrations for
the Denver model runs introduced additional bias in the early morning
predictions. For each day a constant value for background ozone was used
and the vertical profile of background ozone was taken to be uniform
throughout the day. Nighttime chemical reactions "scavenge" ozone at the
surface; thus, in the morning the amount of ozone at the surface is reduced
and there is an increase in ozone concentration with height, returning to
background levels. This scavenging was not taken into account for the
eleven days.
Tracking background ozone through the use of monitoring data during
the day was not considered to be feasible because Highland is the only
station that is remotely rural; the other stations are interior to the
metropolitan area. (Arvada, which is upwind all day on day 79180, shows
significant ozone production.) Highland was truly upwind of Denver on only
one day out of the eleven.
The sensitivity studies demonstrated that stations furthest from the
center of the urban area, in other words, Highland and Arvada, were most
quickly affected by and responsive to the level of background ozone. The
interior stations were less affected and also affected later in time. For
Highland and Arvada, the background ozone value had a strong influence on
the predictions as early as 700, making them the best candidates for this
analysis. For all of the stations, the 1700 prediction was almost
completely determined by the background value.
The previous analysis of missing the peak in space at Highland showed
that removal of three days with high daily bias left eight days that had
only small errors at Highland associated with production of ozone during
the day. Such was not the case for Arvada. Highland also showed low
variability in observed ozone concentrations on the eight days. Thus
Highland is the only station for which a check might be valid on early
morning bias resulting from the settings of background ozone used in the
model runs. If Highland is actually a fairly good station to use as an
indicator of the background ozone in the morning around the Denver area,
then the existence of statistically significant bias at Highland in the
early morning, as shown in Table 11, indicates that the decision to keep
the background ozone constant throughout the day introduced additional bias
in the early morning, an overprediction on the order of 10 ppb at 0700 and
decreasing to zero by 1000 or 1100.
Results for the eight days in Table 11 suggest that the "detection
limit" of the bias statistic was around 25%; that is, the bias needed to be
larger than 20-25% of the observed concentration to be statistically
significant at the 95% confidence level. One must remember that a sample
of eight will not provide a very precise bias estimate. Thus while
subgroupings of data can pinpoint different sources of error, it will
probably seldom occur that those individual errors will be shown to be
statistically significant, unless the model is performing very poorly.
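As a rough check on this detection limit, the smallest bias a two-sided t-test can declare significant at the 95% level with n = 8 is t(.975, 7) times the standard error of the mean residual. If the residual standard deviation is taken to be about 30% of the mean observed concentration (an assumed, illustrative figure, not a value from the tables), the limit works out to roughly 25%:

    from scipy import stats

    n = 8                                   # days in the subgroup
    s_pct = 30.0                            # assumed residual std. dev., percent of mean
    t_crit = stats.t.ppf(0.975, df=n - 1)   # about 2.36
    detection_limit = t_crit * s_pct / n ** 0.5
    print(round(detection_limit, 1))        # roughly 25 percent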
The analysis discussed in this section indicates that either the
background was set too high for the early- to mid-morning period or that
something else is not being properly accounted for in the model. Given
known scavenging of ozone at the surface, we believe the latter
explanation. The afternoon background, on the other hand, seems to have
been set about right, but it is difficult to tell. There is good
correspondence between the observed late afternoon ozone for the eight days
of Table 11 and the average predicted value for Highland: 57.3 ppb vs.
59.1 ppb, respectively, but the bias at 1600 and 1700 in Table 11 indicates
something is still not quite right.
Further sensitivity analysis examined the effect of background
hydrocarbon concentrations on ozone predictions of the EPA1 model. No
attempt was made, however, to assess the effect of errors introduced via
the background hydrocarbon concentrations input to the model on the bias or
gross error in the model predictions. The original simulations assumed a
constant vertical profile. The Tulsa work (Reynolds, 1982) showed that the
concentrations of some of the hydrocarbon species at upper levels (above
500 m) were less than half their concentrations at ground level. A
relative profile was used to approximate the decrease in hydrocarbons aloft
at the top boundary of the model domain. Ozone predictions at the
monitoring sites were reduced by around 10 percent on day 79180 as seen in
Figure 30, except at Highland in the vicinity of the predicted peak where
the reduction was greater. The daily maximum predicted ozone was reduced
by 16 percent.
At a later date, the sensitivity of the model to incoming hydrocarbon
concentrations was investigated using surface data from the Pawnee
Grasslands summer experiment run by NCAR (Delany, 1981; Greenberg and
Zimmerman, 1982). Two situations were investigated, one representing clean
air background (.02 ppmC NMHC) and the other representing "dirty" air
(1.2 ppmC NMHC at the upwind boundary). The original simulations used an
intermediate background (.05 ppmC NMHC). Two different days were
simulated, one with the daily peak ozone cloud located beyond the urban
area (79180) and one with the daily peak ozone cloud located over the urban
area (80204). When the ozone cloud was outside of the urban area, day
79180, the daily ozone maximum was reduced by 12% in the clean air
sensitivity and increased by 15% in the dirty air sensitivity. When the
ozone peak was over the urban area, day 80204, the change in the daily
maximum was -1% and +1% for the clean air and dirty air sensitivities,
respectively. Thus, there was not much influence when the ozone peak
occurred over the urban area.
The suspected effect of possible interpolation-related errors in the
wind fields and of possible errors in the area source emissions inventory
deserves mention. A trajectory analysis of the peak predicted by EPA2 for
day 80204 suggested that there was a narrowly confined slow-down and 180°
reversal in the wind field (a "dead-spot") for several hours that
influenced the cell containing the predicted peak. The resultant ozone
peak is high and narrow, yet the monitoring data suggests a large, broad
peak. The wind trajectories of the surrounding cells do not show this
"dead-spot" behavior. As well, EPAl, with its "leaky" horizontal diffusion
does not show this behavior of predicting a high, narrow peak as does EPA2
(see Figure 23 and the section on dispersion). EPA1 predicts a broad peak
more in accordance with that suggested by the monitoring data. More
analysis than was possible in this study would be necessary to judge
whether this section of the wind field is simply improbable in its behavior
or is an artifact of interpolating the wind fields. In any case, it is
clear that EPA2 is much more sensitive to such errors in the wind field
data set than EPA1.
Isopleths of the ozone predictions consistently show a "hole" or
narrowly confined low point in the predicted ozone just below CARIH. An
example was shown in Figure 16. Contour plots of non-methane hydrocarbon
and NOX emissions (Figure 24) show a large source of NOX in a single
adjacent cell. Thus it is expected that this NOX source is affecting the
magnitude and spatial extent of the predicted ozone peaks. This spot of
NOX in the area source inventory may be a major contributor to the bias
at CARIH. Sensitivity studies are needed to determine the magnitude and
extent of the NOX source's impact on the predicted ozone.
Dispersion (Vertical Mixing Rate) Influence
While the vertical mixing rate is not a variable that is easily
accessible to the modeler, it was known from previous carbon monoxide
modeling with the equivalent of the EPAl model (see Figure 31) that the
model had a serious problem of underprediction when unstable conditions
were being modeled. It seems that the Lamb polynomial calculates a
diffusivity near the surface that is almost an order of magnitude too large
for even free convection situations (Panofsky, 1981). McRae (1981), in his
thesis, took similar note of the problem and changed the Lamb polynomial
near the surface. A discussion of this is included in the Masters thesis
of Robbi Keil (1983).
The daily bias in carbon monoxide predictions can be useful to isolate
dispersion effects in the model because, unlike ozone, carbon monoxide
concentrations are not confounded by chemical reaction effects. The bias
in the total non-methane hydrocarbon (NMHC) predictions is also useful for
this purpose. While NMHC is certainly not inert, ambient levels are
dominated by relatively less reactive paraffinic compounds.
Figure 32 shows hourly bias, and Figure 33 shows daily bias versus
observed concentration, indicating that, indeed, there is underprediction
of CO and NMHC. Thus a bias exists that could be associated with too rapid
a vertical mixing rate, using CO and non-methane hydrocarbons (NMHC) as the
indicators. The shape of the all-site hourly bias for CO in Figure 32 is
different than the shape of the all-site hourly bias for ozone of
Figure 9. The ozone hourly bias has an additional hump at mid-day,
suggesting other factors also contribute at this time.
Given that CO is relatively inert, the observed underprediction can
either be explained by an inventory problem or by the pollutants being
mixed too rapidly upward, away from the surface. If there were an
inventory problem, one would expect the hourly bias to be worst at the time
of the CO maxima, that is, during the morning rush hour (hours 7 and 8 in
Figure 32). This is not the case. The absence of significant bias until
mid-morning in the all-site average for CO suggests that too rapid a
vertical mixing is a more plausible explanation. Examination of the
algorithm used to calculate vertical diffusivity in the model corroborates
this explanation.
While no action was taken for this work, a sensitivity study on two
days was undertaken to investigate the impact of reducing vertical mixing
on the ozone predictions. The vertical mixing was reduced from the mixing
occurring for a normal day with free convection by assuming a neutral
atmosphere for the entire day. The two days picked were a day with the
ozone cloud on the outskirts near Highland (79180), and a day with a broad
ozone cloud in the central part of the urban region (80204).
The main illustrative results for ozone are reproduced in Table 17.
They show that decreasing the vertical mixing increases the unconstrained
(full grid) predicted daily maximum for ozone. It also increases the site
maxima when an ozone cloud is at (or nearly at) the site. The hourly
difference, Co-Cp, for a monitoring station at the time when there is
an ozone maximum is considerably improved. The daily bias, on the other
hand, is hardly changed. Opposing changes occur in the hourly biases that
are averaged out in the daily bias.
The time series of the predictions for the regularly simulated day and
the neutral day are shown in Figures 34 to 37 for both CO and O3. EPA1 was
used for this sensitivity analysis because that model had been used for the
CO modeling. As can be seen, reducing the vertical mixing does make a
difference, especially in the region of the ozone cloud. For Day 79180 in
which there is no ozone cloud at the center, there is very little
difference in the prediction for the central monitoring stations, but there
is a large difference for Highland, near the ozone cloud.
Greater detail is given by looking at the hourly differences for all
stations for the two days. These are given in Table 18. As can be seen,
the hourly differences increase somewhat in the late morning and late
afternoon when going to the neutral day. The differences for the hours of
the peak improve for those stations near the peak. When all stations are
averaged, Day 79180 shows little change in the hourly bias, but Day 80204
shows good improvement at the peak hours. The daily bias is very little
changed when comparing the regularly simulated day with the neutral day.
Decreased vertical mixing results in less ozone being entrained from aloft,
leading to slightly lowered ozone predictions at stations away from the
peak. This counterbalances the greater ozone production from emissions in
the vicinity of the peak, resulting in little change in the daily bias
statistics.
The results for carbon monoxide show considerable improvement. The
daily bias of CO on day 79180 is reduced from an underprediction of 42% of
the observed concentration to 30%. For day 80204 the bias changes from a
13% underprediction to a 27% overprediction. The time of most interest is
mid-day, from 1000 to 1500, when the mixing is strongest. The mid-day bias
of CO on day 79180 was reduced from an underprediction of 71% to one of 52%
and on day 80204 reduced from an underprediction of 48% to just 2%. From
this sensitivity study, we conclude that some portion of the extreme bias
at mid-day for CARIH, Arvada and CAMP very likely can be attributed to too
rapid a vertical mixing of the reacting pollutants away from the surface.
The results presented here suggest that excessive vertical mixing
contributes to the bias found in the model predictions. As discussed
later, vertical mixing also has an important influence on the predictions
of peak ozone when there is a change in emissions.
Missing the Peak in Time
Two examples of the models predicting the local peak at the wrong time
are shown in Figure 38. In the 11-day sample, the local site maximum was
predicted at the wrong hour by all three models more than 60% of the time.
This mis-timing of the predicted peak has an effect on the daily bias
and noise which may be confusing. If the magnitudes of the predicted and
observed peaks are similar, predicting the peak too late as in Figure 38
will produce a positive difference at the hour of the observed peak and a
negative difference at the hour of the predicted peak. These will tend to
balance out in the daily bias computation, producing low bias but high
noise.
Examination of the CAMP data for each day shows mis-timing of the peak
to be a common occurrence at that site. It explains the pattern of hourly
bias at CAMP shown in Figure 7, where noontime bias is positive and
afternoon bias is negative.
The pattern in Figure 38 for CARIH is actually atypical, however. Day
79218 is the only one in which the magnitude of the predicted peak
approaches that of the observed peak. On other days, the predicted peak is
much too low (as well as usually occurring at the wrong hour). Therefore
there are no negative differences to balance the positive ones in the daily
bias computation.
E. COMPARISON OF DAILY MAXIMUM CONCENTRATIONS
Model behavior was probed in detail above for diagnostic purposes to
search for a variety of causes of error. Comparisons of observed and
predicted concentrations over all of the daytime hours were necessary for a
complete diagnosis. However, this very detail tends to obscure the
questions which are of most importance in a regulatory evaluation. There,
a primary concern is with the daily maximum concentrations; therefore, we
must evaluate the models' success in predicting the daily maxima. We will
look first at the local site maxima, then at the overall daily maxima.
Local Site Maximum for Each Day and Site, Paired by Hour
The most stringent pairing of daily maximum concentrations matches the
observed maximum at each site for each day with the predicted concentration
for that hour at that site. Five monitoring sites and 11 days result in 55
paired observations. Frequently the maximum prediction misses the time of
the observed maximum by one to three hours. Therefore underprediction is
to be expected with this pairing, even if errors are merely random.
The performance measures for our three models under this pairing
method are shown in the upper half of Table 19. For all three models, the
bias estimate is approximately 40% of the mean observed site maximum. That
is, the models tend to underpredict by about 40%. Noise levels are similar
for the three models as well. Furthermore, predictions of all three models
have significantly smaller variances than the set of observed daily
maxima. Although a t-test at the 95% confidence level does not show a
significant difference in bias between the three models, the nonparametric
Wilcoxon test shows the difference in bias between the DOT and EPA1 models
to be statistically significant, with z = 3.21.
Local Site Maximum for Each Day and Site, Unpaired by Hour
Removing the requirement that the model predict a day's maximum at a
given site at the correct hour, this pairing matches the observed maximum
with the maximum prediction over all hours for that day at that site.
Again, there are 55 paired observations, each day's maximum at each site.
The lower half of Table 19 shows the performance measures for this
pairing method. The bias estimates have decreased to approximately 30% of
the observed mean; that is, the models underpredict by about 30%. For each
model, the change in bias from the first, more stringent, pairing is
statistically significant (α = .05) using the t-test. The t-test again
does not show a significant difference in bias between the three models,
but the Wilcoxon test shows both EPA1 and EPA2 to have significantly
smaller bias than the DOT model (z = 3.43 and 3.21, respectively).
Evidently, even with a sample size over 50 the nonparametric test is the
more sensitive one when data differ greatly from a normal distribution.
Noise levels are quite similar for the three models, and are similar
to those obtained in the first pairing. Variances of the predictions
remain significantly smaller than variances of the observed site maxima.
Separate analyses for each site were performed using this pairing
method. Arvada and Welby sites produced results similar to those for the
data set as a whole, with bias estimates near 30% of the observed mean for
all three models. Predictions at the CARIH site are more biased, with
average underpredictions ranging from 35% to 43% in the three models. At
CAMP and Highland, estimated biases are lower, ranging from 14% to 27%.
Most biases are significantly greater than zero under both Wilcoxon and
t-tests, the only exceptions being the EPA2 model at CAMP and all three
models at Highland.
Noise levels are slightly lower at CAMP, CARIH, and Welby than in the
data set as a whole, and considerably higher at Highland. Noise
differences between models are small, but differences between sites are
more pronounced, with Highland predictions having significantly higher
noise than several other sites. (Highland's high noise was the result of
missing the location of the peak on the three days when the peak occurred
at Highland, as discussed above in the section on hourly predictions.)
Another distinction between sites is in the variability of the
predictions. At Highland, predictions of all 3 models have significantly
smaller variances than the observed data (probably explained by missing the
peak in space, as described earlier), while at CAMP, prediction variances
are not significantly different from observation variances for any model.
At the other three sites the models perform differently: the DOT model
predictions are significantly less variable than the observed data, while
the EPA2 predictions are not. In fact, at CAMP, CARIH, and Welby the EPA2
model achieves variances extremely close to the observation variances.
Unfortunately, high bias and noise levels indicate that this variability is
frequently occurring on the wrong days.
The low variability in the model predictions can be attributed to a
number of factors which have already been discussed in regard to introduced
errors. Many input parameters were held relatively constant from day to
day, including background concentrations, photolysis rates, and emissions.
A priori, these would be expected to lead to lower variability in the
predictions. Given that these factors were common to all three models,
other factors clearly must be contributing to the lower variability in the
DOT and EPA1 predictions as compared to EPA2.
Correlations between observed and predicted site maxima range from
-.122 to .622 over the 5 sites and the 3 models. Only one of the
correlations is significantly different from zero, however (EPA2 model at
Arvada), because of the small sample size at each site (the critical value
is r.05 = .576 with d.f. = 10).
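The critical value quoted above follows from the t distribution: a sample correlation is significant at the 95% level (two-sided) when |r| exceeds t/sqrt(t² + d.f.). With the 10 degrees of freedom stated in the text, this reproduces .576:

    from scipy import stats

    dof = 10
    t_crit = stats.t.ppf(0.975, dof)                  # about 2.23
    r_crit = t_crit / (t_crit ** 2 + dof) ** 0.5      # about 0.576
    print(round(r_crit, 3))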
The above two sets of comparisons for maxima on each day at each site
have examined the question, "how well are region-wide ozone maxima for the
day reproduced by the model for the area covered by the monitoring
stations?" This question is part of the evaluation of the model's ability
to replicate the bulk production of ozone which is of central regulatory
concern. The focus of regulatory concern, however, is centered on the bulk
ozone production at the major peaks, since only the peak prediction is used
in SIP analyses. Thus we next examine the daily maxima in terms of each
day's peak concentration.
All-Station Daily Maximum, Paired by Site
In this comparison, the observed maximum from any monitoring station
for a given day was paired with the maximum predicted on that day at that
site (Comparison (a)). There was no pairing by hour. If there are errors
in the spatial positioning of the prediction, then constraining the
predicted maximum to a single fixed site can be expected to lead to
underprediction by the models by chance alone.
Statistical comparison of the three models on this data is shown at
the top of Table 20. The estimated bias ranges from 43% to 49% of the mean
observed maximum, Comax, for the three models. A t-test does not detect a
significant difference in bias between the three models. The Wilcoxon
test, however, indicates with 95% confidence that the bias in the DOT model
is significantly greater than that in EPA1 and EPA2 on this data. Using
the Friedman test to compare the three models jointly also indicates a
significant difference between models (chi-square = 6.682, d.f. = 2,
p = .035). Results of the Wilcoxon test on this data are shown in the
upper part of Table 21.
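The paired model comparisons reported here and in the following pairings can be sketched with SciPy as follows (the data layout is assumed; each argument holds one value per day, matched across models):

    from scipy import stats

    def compare_models(dot, epa1, epa2):
        # Each argument: per-day residuals (observed maximum minus predicted
        # maximum) for one model, in the same day order.
        chi2, p_friedman = stats.friedmanchisquare(dot, epa1, epa2)
        _, p_dot_vs_epa1 = stats.wilcoxon(dot, epa1)    # paired signed-rank tests
        _, p_dot_vs_epa2 = stats.wilcoxon(dot, epa2)
        _, p_epa1_vs_epa2 = stats.wilcoxon(epa1, epa2)
        return p_friedman, (p_dot_vs_epa1, p_dot_vs_epa2, p_epa1_vs_epa2)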
Variances of the predictions are considerably smaller than the
observation variance for all three models (although only for the DOT and
EPA1 models are the differences statistically significant). A scattergram
of the all-station daily maxima, paired by site, is shown in Figure 39(a).
These features of poor performance (high bias and low variability) are
tempered somewhat in the EPA1 and EPA2 models by relatively low noise and
high correlations between Co and Cp.
All-Station Daily Maximum, Unpaired by Site
Removing the requirement that the model predict the daily maximum at
the correct site, the observed maximum is paired with the predicted maximum
for that day at any monitoring site (Comparison (b)).
Model performances on this data are summarized in the middle of
Table 20. The estimated bias ranges from 31% to 42% of Comax. Again, the
t-test does not show a significant difference in bias between the three
models. However, the Wilcoxon test at a 95% confidence level indicates
that EPA2 has significantly less bias than the other two models. The
Friedman test on the three models jointly substantiates this (chi-square =
7.818, d.f. = 2, p = .020). Wilcoxon tests for this data are shown in the
center of Table 21. Both EPA1 and EPA2 predictions are positively
correlated with the observations, with noise levels somewhat less than in
the DOT model. The scattergram of the daily maxima, unpaired by site, is
shown in Figure 39(b).
Area-wide Daily Maximum, Over Entire Modeling Region
Observed concentrations are available only at monitoring sites, but
predicted concentrations are available at a large number of grid points
over the entire Denver metropolitan area. The chance of attaining the true
maximum, then, can be expected to be higher over all of the grid points
than over the 5 monitoring sites. Thus in pairing the observed maximum
with the full-grid predicted maximum, we should expect the model to over-
predict (Comparison (c), Table 20). This tendency will be partly
counterbalanced, though, by the fact that monitoring sites have been
deliberately located in high pollution regions and that our sample consists
of days when high ozone concentrations were observed at the monitoring
sites.
Statistics at the bottom of Table 20 show that the models continue to
underpredict the daily maxima, with biases ranging from 10% to 30%. Here
for the first time, however, bias in one model, EPA2, is not significantly
different from zero (under both Wilcoxon and t-tests). In comparing
models, the t-test indicates only that bias in the EPA2 model is
significantly lower than in the DOT model at the 95% confidence level
(t = 2.57). The Wilcoxon test detects distinctions between all three
models, with EPA1 significantly less biased than DOT, and EPA2
significantly less biased than either of the others as shown in the lower
part of Table 21. Comparing the three models jointly on this data, the
Friedman test shows highly significant differences (chi-square = 14.045,
d.f. = 2, p = .001).
Variability in the EPA2 predictions is very close to the variability
in the observations, with a significant positive correlation between the
observed and predicted maxima. The scattergram of area-wide maxima,
unpaired in space, is shown in Figure 39(c).
Regression Analysis of the Daily Maximum Pairings
A linear regression was performed on the three different pairings
shown in Figure 39 (paired by site, unpaired by site, and unconstrained in
space). The results, giving the slope, the intercept, and the coefficient
of determination (r2), are shown in Table 22.
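A minimal sketch of the regression summarized in Table 22, assuming the
predicted maxima are regressed on the observed maxima and using
hypothetical values in place of the Denver data:

    # Sketch: slope, intercept and r-squared for one model and one pairing.
    import numpy as np
    from scipy.stats import linregress

    c_obs  = np.array([120, 135, 150, 128, 160, 142, 155, 138, 147, 133, 158])
    c_pred = np.array([ 95, 110, 118, 100, 130, 112, 125, 108, 118, 104, 127])

    fit = linregress(c_obs, c_pred)
    print(fit.slope, fit.intercept, fit.rvalue**2)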
Tests of significance of the correlation and regression coefficients
require that the data be normally distributed. Because this data is
probably not normally distributed, we have used the coefficients only as
rough indicators rather than as statistical tests. The significance test
dramatizes the imprecision of a correlation computed on only eleven points,
however. The sample r must be above .576 (or r2 above .332) in order to be
significantly greater than zero. Thus small differences in correlation
between two models should not be given undue importance. Table 22 shows
that the DOT model has a consistently lower correlation between observed
and predicted peaks than the two EPA models. The differences between EPA1
and EPA2 correlations are relatively small. The slopes and intercepts in
Table 22 do not provide a way to select between the three models. In this
example, the scattergrams in Figure 39 provide a better view of the
differences between the models and their relationship to the line of
perfect agreement (Co = Cp).
Evaluation of the Models Based on Daily Maxima
Substantial underprediction of peak concentrations has been shown for
all three models. A general impression emerges that the DOT model is much
less able than EPA2 to predict day-to-day changes in peak ozone
concentrations, while EPA1 falls somewhere between.
In every comparison, the DOT model shows the highest bias of the three
models, and a near-zero correlation between observed and predicted maxima.
In addition, the variability of DOT predictions is extremely low:
significantly lower than the variability in the observed maxima in every
comparison. Despite this low variability, the "noise," Sd, in DOT
predictions is slightly higher than that in the other two models. Thus the
variability it does produce is more likely to occur at the wrong times and
places. We conclude that the DOT model is less capable of predicting peak
ozone concentrations than either the EPA1 or the EPA2 model.
Differences between the EPA1 and EPA2 models are smaller.
Correlations between observed and predicted maxima are similar for the two
models. Using the all-station, unpaired daily maximum, (comparison (b),
Table 20), the values of r2 indicate that EPA2 explains 46% of the observed
variance while EPA1 explains 36%. Bias in EPA2 is somewhat smaller than
that in EPA1 in most cases, while variability in EPA2 predictions is
consistently higher than in EPA1 and closer to the variability in the
observations. Many of these differences are not statistically significant,
but the consistent slight superiority of EPA2 on all of the performance
measures indicates that it performed best of the three models on this data
set.
Whether the performance of the best model is "good enough" for
regulatory purposes is still a question requiring professional judgment.
The substantial bias in the EPA2 model can reasonably be estimated to be
between 10% and 31% of observed maxima depending on whether predictions are
confined to monitoring sites or selected from the full grid. This could
indicate a need for some kind of calibration or tuning of the model, unless
improvements in the input to the model are found to correct the bias.
Using predictions from all grid points (unconstrained in space) appears to
be a way to reduce the bias in the model. This might be appropriate if the
underprediction has been caused by errors in location of the peak due to
errors in the wind field.
Similar results were obtained in a study of the performance of the
EPA2 model on ozone concentrations in the St. Louis, MO, area (Cole,
1982a). That study concluded that all grid points should be used in
determining the predicted maximum. When this was done, the researchers
found that predicted peaks for most days were within ±30% of the observed
peaks and concluded that the model performed with a "reasonable degree of
accuracy" in estimating observed peak ozone. By this judgment, the EPA2
model performed reasonably well on the Denver data too, since predictions
for all of our 11 days fell within ±27% of the observed peaks.
F. EMISSIONS CHANGE COMPARISON
The attempt to gain a better understanding of the performance of the
model by diagnosing errors showed that a complex of information is
contained in the bias and the other measures of paired comparisons. The
analysis of errors led to a multitude of answered, partially answered and
unanswered questions about how well the model is performing. The
complexity of the evaluation and list of probable errors left an incomplete
appreciation of the strong and the weak points of the model, not enough for
an unambiguous assessment, to our minds, of the acceptability of the model
for regulatory purposes.
Then the comparisons of daily maxima, while addressing more directly
the manner in which the model would be used for regulatory purposes, raised
new doubts about the performance of the model. These doubts, raised by
such results as the low variability in the model predictions and the
unsatisfactory correlation and regression coefficients, created a serious
question as to whether it is valid to draw inferences about the performance
of the model under conditions of changing emissions, based on a performance
evaluation in which the emissions do not change. In the same vein the
sensitivity studies for the evaluation and for the data set development
showed how complex the model predictions are and how non-linear the changes
can be.
The ultimate goal of a performance evaluation such as this one is to
answer the question, is the Urban Airshed Model good enough for regulatory
application? Can a bias in the predictions for a set of historical days be
calibrated out with any confidence that the predicted changes due to
changes in emissions are still valid enough to be used? We believe the
performance evaluation as carried out thus far is incapable of giving a
sufficiently unambiguous answer to that question. That question must, we
believe, be tested directly for photochemical models. In this section,
therefore, we present one approach for directly testing the predictions of
the photochemical models in order to evaluate their response to emissions
changes.
The approach which we have taken is to assemble a second, complete
emissions inventory for an earlier year. That earlier year had to be
sufficiently separated in time from the year of the performance evaluation
data set so that changes in the ambient concentrations, associated with
changes in emissions, had actually been observed. The meteorological
conditions of the set of performance evaluation days were then used to
re-simulate a set of pseudo-days using this "new" emissions inventory.
This procedure predicted changes in ozone concentrations due only to
emissions changes for a fixed set of meteorological conditions. The
relative change in the predicted pollutant concentrations was then compared
with the relative change determined for the observed pollutant
concentrations.
For photochemical models, because the chemistry and meteorology are
highly interrelated, and because the performance of the models could be
different for different ratios in the emissions of NOX to non-methane
hydrocarbons, the ideal test approach would be to also perform a
symmetrical evaluation to the one just described. That is, a second set of
evaluation days should be established for the earlier year. Then an
emissions change test would be performed again using the later emissions
inventory to re-simulate pseudo-days associated with the meteorological
conditions of the evaluation days of the earlier year. Thus the analysis
presented here is one-half of a more ideal analysis approach for testing a
photochemical model. It should, however, represent an adequate approach to
testing the predictions of models. In either case (i.e., each half of the
ideal evaluation), the use of pseudo-days is necessary because exact
replicas of meteorological conditions in two different years would be
nearly impossible to find. The approach also replicates the conditions
under which the model will be used in regulatory analysis.
Definition of an Emissions Change Comparison
The emissions change comparison should resemble as closely as possible
the manner in which the air quality models will be used in a regulatory
application. Thus we are interested in the changes in the area-wide
(unconstrained) and all-site daily maximum predictions that occur due to a
change in the emissions. For the model simulations, that change in ozone
concentrations is represented by use of the two emissions inventories, with
meteorological conditions held constant. For the monitoring data, it is
necessary to find a way of separating the effect of emissions changes on
the ozone trend from the effect of differences in meteorology from year to
year.
Estimating these changes is not a trivial task. A change in emissions
implied by the difference between two emissions inventories is only valid
if the techniques used to estimate the emissions in each inventory are the
same. Otherwise changes in techniques must be corrected for. Trends in
ambient concentrations can be confounded or even masked by such things as
changes in the location of a monitoring station, changes in the chemical
technique used to measure concentrations, changes in calibration procedures
and, last but not least, year-to-year meteorological variability. All of
these factors must be checked for and taken into account in any trend
analysis of ambient concentrations.
Development of the Earlier Emissions Inventory
Because the increase in vehicle miles traveled (VMT) in the Denver
area was so rapid between 1970 and 1980 (4.7% per year), it takes
several years for the reduction in automobile tail-pipe emissions to have a
noticeable effect on total Denver emissions. Thus the two emissions
inventories should be several years apart. Availability of a
transportation data base can be a severe limitation on the choice of years,
however. The earliest year for which a transportation data set was
available for Denver was 1975. The earliest year of the most reliable
transportation data set, i.e., the transportation data set which had the
most up-to-date corrections, was 1976, because that year was the base year
of the 1982 State Implementation Plan (SIP) projections for ozone. As
well, a point source inventory had been developed for 1976 as part of the
SIP work. The mobile source emissions model of EPA (MOBILE2) was expected
to be reliable for any of the earlier years. Thus 1976 was chosen as the
year for which to develop the second emissions inventory.
Because of the rapid growth in Denver's VMT, the span of 1976 to 1979
was considered to be barely long enough for this test and could turn out to
be marginal. It was the best that could be achieved, however. An
emissions inventory representing 1976 traffic and emissions conditions was
developed for both the Carbon Bond I and Carbon Bond II hydrocarbon
splits.
The 1976 inventory was very comparable to the 1979 emissions
inventory. Both inventories used the same large-scale transportation model
to establish the location and magnitude of the vehicle miles traveled.
Major traffic count programs had been carried out in 1975 and 1979 in
Denver to help adjust the transportation model results. Both inventories
used the same mobile source emissions model and the same procedures to
estimate the automotive emissions for given vehicle miles traveled. The
1979 point source inventory was part of a periodic update of the 1976 point
source inventory. Thus problems or bias that might be associated with the
emissions inventory would be systematically similar for each inventory.
Choice of Models and Days for the Emissions Comparison
The major purpose of the emissions comparison presented here should be
a demonstration of its importance and usefulness. The question that ought
to be answered is, does the emissions comparison provide us with new infor-
mation that we did not already have in some form from the above hourly and
peak-concentration comparisons? Therefore, it is important to perform the
emissions comparison on all three of the air quality models.
Ideally all 11 performance days should have been re-simulated.
Because computer resources were limited, however, the number of days re-
simulated with the second emissions inventory had to be reduced to eight.
The choice of the three days to exclude was based on EPA2's performance for
the unconstrained daily maxima. We elected to pick days with reasonably
consistent gross error performance. We also wanted them to span the range
of the observed maxima. Of the daily maxima comparisons, Day 5 (79249) had
the greatest underprediction (26%) and Day 10 (80207) had the greatest
overprediction (27%). In fact, Day 10 was a day in which the major ozone
peak was most likely not observed at any of the monitoring stations. Con-
comitantly, these days also had the largest percent absolute deviation (see
Figure 4). Thus they were considered to be less typical of the average
gross error performance of the model. Removing these days would not affect
the range of our predictions, thus they were excluded from the test.
The third day to be excluded was Day 7 (80177). It was a toss-up
between Day 7 and Day 6 (80170). Both had the same observed daily maximum
and nearly the same predicted daily maximum. Eliminating one should not
cause much of a loss of information. As shown in Figure 3, Day 7 had the
larger percent absolute deviation of the two days and it was slightly less
characteristic of the average gross error performance of the model, thus it
was excluded.
Estimation of the Change in Observed Maxima
Ideally, a regression or time-series analysis of several years of data
that is based on a stochastic model with an accurate deterministic
component should be used to most precisely estimate the underlying trend in
the data. The trend due to emissions changes needs to be separated from the
year-to-year variation in the meteorology. Work of such a nature on the
Denver data set, independent of this research effort, was not advanced
enough to use at this time, nor were we aware of other available work on
this problem. Thus simpler techniques to establish the trend had to be
used to carry out the evaluation of the emissions change comparison.
The earliest year for which ozone data at the five monitoring stations
exists is 1975. Thus the trend had to be established using data for the
1975 through 1980 time period. The distribution of the ozone monitoring
data of Denver is cube-root normal. Thus one could not assume that an
annual trend in ozone concentrations derived from monthly means would be
the same as a trend based on just the extreme end of the distribution of
ozone concentrations. The trend in the observed daily maxima needed to
be computed using a subset of each year's observed daily maximum
concentrations which resembled the limited population of days in the
performance evaluation data set. The fraction of the concentration
distribution used needed to remain constant from year-to-year in order to
give the same weight to each year's data. Therefore, the range of
concentrations defined by the concentration cutoff of 100 ppb had to be
replaced with an equivalent definition for the range in terms of a
percentile of the distribution of daily maxima. The same percentile of the
daily maxima for each year needed to be used in the trend analysis, not the
same range. Thus, for each year, the same number of daily maxima is used
in establishing the data set of observed high ozone concentrations.
Two different high ozone data series were established for the trend
comparison as a sensitivity check. The first data series was the top 11
days of each summer's observed daily maxima, with consecutive high ozone
days excluded to better resemble the evaluation data set. Eleven days were
used for this set to make the number comparable to the number of evaluation
days. For each year, the eleven days having the highest peak concentra-
tions were used, after excluding all but the first day of any multiple day
series of high ozone. The second series was the top 14 days of each
summer's daily observed maxima, with consecutive high ozone days included.
Only 14 days from each year were used because there were only 14 daily
maxima of at least 100 ppb in the summer of 1980. The annual means of the
two high ozone data sets are shown in Figure 40.
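A minimal sketch of how one such series can be constructed, under one
reading of the exclusion rule (a "high ozone" day is one at or above the
100 ppb screening value, and only the first day of a consecutive run of
high days is retained); the example input is hypothetical:

    # Sketch: select the top N daily maxima for one summer.
    def top_maxima(daily_maxima, n=11, high_cutoff=100.0):
        # daily_maxima: list of (day_of_year, max_ppb), one entry per day.
        high_days = {d for d, v in daily_maxima if v >= high_cutoff}
        # Keep only the first day of any consecutive run of high-ozone days.
        candidates = [(d, v) for d, v in daily_maxima
                      if not (d in high_days and d - 1 in high_days)]
        candidates.sort(key=lambda item: item[1], reverse=True)
        return candidates[:n]

    example = [(200, 112.0), (201, 105.0), (210, 131.0), (215, 98.0), (221, 118.0)]
    print(top_maxima(example, n=3))   # day 201 is dropped as a consecutive high day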
The large scatter from year-to-year is evident. The scatter is not
due to any changes in monitoring location or chemical techniques. A
national change in calibration techniques took place in 1978, affecting the
1979 and 1980 data. For Denver, that change was found to be minor, a
slight increase in reported ozone of at most 3.8 percent. There was
apparently no bias before or after the calibration change. The change just
reduced the random error. It was not possible to draw any simple
association between the scatter in the high ozone readings and the
year-to-year variability of the wind speed. Thus external information had
to be used to infer a best estimate of the shape of the annual ozone trend
before any fit could be tried through the points in Figure 40.
To determine the appropriate shape of the ozone trend, the trend in
daily hydrocarbon and NOX emissions was investigated. The trend in
hydrocarbon and NOX emissions was based on the point source inventory
trend established by the Colorado Department of Health, the average daily
VMT estimated by the Department of Highways for each year (based on traffic
count data) and the mobile source emission factors for each year from EPA's
MOBILE2. The resulting daily NOX emissions were expected to increase
somewhat over the period 1975-1980. The resulting daily HC emissions
showed a near-perfect straight-line decrease from 1975 through 1979. The
decrease in 1980 was twice as large because VMT did not increase that
year, due to high
gasoline prices. The 1980 VMT was 5 percent lower than would have been
expected, assuming a regular, smooth trend from 1975 to 1980. An
examination of an EKMA isopleth indicated that the change in ozone expected
per unit decrease in HC would lessen slightly as the hydrocarbon emissions
decreased. The conclusion from integrating the above information was that,
while not perfect, a linear regression of the data points forming
the averages shown in Figure 40 offered a reasonable
approximation of the trend in ozone concentrations that could be attributed
to decreasing emissions with time. The linear trend analysis on the top 14
daily maximum ozone concentrations for each year, 1975-1980, shows that
they have decreased at the average rate of 6.2 ppb per year. This rate of
decrease is significantly different from zero at the 99% confidence level.
The trend analysis results are virtually identical for the top 11 days of
non-consecutive ozone daily maxima for each year.
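A minimal sketch of such a trend fit, using hypothetical daily maxima in
place of the Denver series (the report's fit gave a decline of about
6.2 ppb per year):

    # Sketch: linear trend of high-ozone daily maxima against year.
    import numpy as np
    from scipy.stats import linregress

    # One row per retained daily maximum: (year, peak ozone in ppb).
    # Hypothetical placeholders, not the Denver data.
    data = np.array([
        (1975, 165), (1975, 158), (1975, 150),
        (1976, 157), (1976, 149), (1976, 143),
        (1977, 150), (1977, 144), (1977, 138),
        (1978, 146), (1978, 139), (1978, 131),
        (1979, 139), (1979, 132), (1979, 126),
        (1980, 132), (1980, 126), (1980, 119),
    ], dtype=float)

    fit = linregress(data[:, 0], data[:, 1])
    print(fit.slope, fit.pvalue)   # slope in ppb/year and test against zero slope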
Results
The predicted percent increase in each day's peak ozone due to an
increase in emissions, corresponding to the change from 1979 to 1976
emissions conditions, is given in Table 23 for each model. The three
models clearly show different responses to changes in emissions.
Because the performance evaluation emissions inventory corresponds most
closely to 1979 emissions conditions and the second emissions inventory to
1976 emissions conditions, the percent increase in observed ozone maxima
should be computed using a 3-year interval. Table 24 shows the estimated
trend in the observed ozone concentrations, as well as the trends due to
emissions change that are predicted by the three models. The mean of the
observed concentrations on the 8 days used in the emissions change
comparison was 137 ppb. Under the same weather conditions, then, on the
basis of the trend in observed concentrations we would expect the mean
concentration to have been 18.6 ppb, or 13.6%, higher in 1976.
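These figures follow directly from the fitted trend; a worked check, using
the values stated in the text:

    # Worked check of the expected 1976 increase under the stated assumptions.
    trend = 6.2             # ppb per year decline in the observed maxima
    interval = 3            # years between 1976 and 1979 emissions conditions
    mean_obs_1979 = 137.0   # ppb, mean observed maximum on the 8 comparison days

    expected_increase = trend * interval                        # 18.6 ppb
    print(expected_increase,
          round(100 * expected_increase / mean_obs_1979, 1))    # 13.6 percent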
For each model Table 24 gives the predicted rate of change in the
daily maxima, both when the maximum is selected only at monitoring sites
and when the maximum is selected from the full grid. To allow for the bias
in the models, these changes should be compared as a percentage of
predicted concentrations, rather than in absolute form. For each model,
the average change in peak ozone predicted between 1976 and 1979, expressed
as a percentage of the mean predicted peak ozone concentration for the 8
sample days, is shown in the last column of Table 24. No matter which way
the predicted daily maxima are selected, the EPA2 model is more responsive
to the emissions change than either the EPA1 or the DOT model. Even EPA2,
however, does not predict as great a change as that found in the observed
ozone data.
None of the differences between the models in Table 24 are
statistically significant. Still, on the basis of this comparison and the
earlier comparison of the models on daily maxima, we would conclude that
EPA2 is superior to DOT and EPA1 for regulatory purposes.
All models performed better than we had expected, based on the
performance evaluation up to this point. The implication is that the
results of earlier segments of the evaluation did not give a good
indication of how the models would perform under a change in emissions.
The emissions change comparison produced and highlighted new, independent
information. The DOT model daily maxima predictions showed no effective
response to the differences in meteorology (the correlations in Table 20
and the regression coefficients in Table 22). Yet, the DOT daily maxima
did show a response
to changes in emissions.
None of the performance comparisons before the emissions change
comparisons gave any indication that DOT would perform two-thirds as well
as EPA2 on the crucial test for regulatory purposes. The hourly
comparisons at the monitoring sites indicated there was little significant
difference between models. The peak-concentration comparisons indicated
there was some difference between models (see Tables 20 and 22). As
illustrated in Figure 41, there is no relation between daily bias and the
percent change in ozone predictions due to a change in emissions. As the
scattergrams of Figure 42 indicate, there is little or no relation between
either the observed or predicted peak ozone concentration for a day and
that day's percent change in the daily maximum ozone due to the emissions
change. Although Figure 41 shows no relation between daily bias and
sensitivity to changes in emissions, daily bias has been shown previously
to be a rather insensitive measure of effect. Looking instead at the bias
in the peak predictions (Figure 43), there does appear to be some
relationship between bias and emissions sensitivity, at least for the EPA2
model. Days with the highest bias show a smaller percent change in the
peak ozone predictions. This is consistent with the slight underestimate
of the slope of the trend line in the ozone observations over the 1975-80
time period. Results from the vertical mixing sensitivity tests to be
discussed later suggest a possible mechanism for this effect.
A rather important piece of information is evident in the results just
presented. The modification to the Urban Airshed Model that produced the
greatest improvement in its predictions on the emissions change test was
not the modification that produced the greatest improvements in its
predictions for changes in meteorology. This can best be seen in the
comparison of peak predictions for the unconstrained pairing. That pairing
is least sensitive to the distortion that is introduced when the monitoring
stations happen to be consistently outside of the predicted cloud of
high-ozone. Table 24 implies that the elimination of the horizontal
numerical diffusion (the modification from EPA1 to EPA2) produced the great
majority of the improvement in the predictive capability of the model for a
change in emissions. Tables 20 and 22 and Figure 39 imply that the change
of chemical mechanism (the modification from DOT to EPA1) produced the
majority of the improvement in the predictive capability of the model in
response to a change in meteorology with a "fixed" level of emissions.
The above results suggested the need for a sensitivity study on the
effect of vertical diffusion on predictions of the model when emissions
change. It was observed above that reduction of the rate of horizontal
diffusion (the modification from EPA1 to EPA2) increased the predicted
relative change in peak ozone for a given change in emissions. This raised
the question, "would a change in the vertical rate of diffusion affect the
predictions similarly?" The Urban Airshed Model does appear to have a
problem with too rapid a vertical mixing. Resource limitations precluded a
full sensitivity study, but two days that were included in the emissions
change comparison, 79180 and 80204, had also already been simulated with a
reduced vertical diffusivity using EPA1. Therefore those two days were
resimulated using EPA1 with the 1976 emissions inventory and reduced
vertical mixing. This allowed the calculation of a new trend in peak ozone
resulting from changing emissions, for the case of reduced vertical
mixing.
Reducing the vertical diffusivity in EPA1 substantially increased the
predicted relative change in the daily ozone maximum for the given change
in emissions. The relative response of the ozone peaks to changes in
emissions increased similarly for the two days with reduced vertical
mixing, a somewhat different response than for the change from EPA1 to
EPA2. With reduced vertical diffusion, the relative change in ozone peaks
between the 1979 and 1976 data sets increased from 5.9% to 10.8% and from
10.5% to 19.9% for days 79180 and 80204, respectively, which are
substantial and nearly identical percentage increases in the predicted
change for the two days. This looks like a very important effect and
should be directly checked with EPA2's predictions, since both the
horizontal and vertical rates of diffusion at the surface seem to be
important.
The error in advection for day 80204, which probably caused an
abnormal peak in EPA2, did not seem to affect the prediction of a change in
ozone due to a change in emissions. The two EPA model versions give fairly
similar predictions for day 80204, 10.5% and 14.0% for EPA1 and EPA2,
respectively. In addition, a 14.0% change is a typical prediction for EPA2
on the eight days. Thus it appears that although such errors in the wind
field will introduce a bias in the model's prediction for a given year,
that effect is not necessarily carried over into the relative predictions
of the model for an emissions change. This is an area that deserves more
investigation.
The above analysis leads to two basic conclusions. First, if the
intent of the performance evaluation is to assess the acceptability of the
air quality model's projections of the effect of emissions changes, then
that assessment has to be made directly by testing the model on a data set
which involves a change in emissions. Inferences regarding model
performance with respect to emissions changes, if based on data sets which
do not involve a change in emissions, will be unreliable and potentially
misleading. In addition, inferences about areas on which to focus model
improvements will also be misleading. For example, the single year's
evaluation with respect to meteorological change seemed to suggest that
future effort should be on further improvements in the chemistry, while the
multi-year evaluation with respect to an emissions change seemed to suggest
that future effort should be on improving the correctness of the diffusion
algorithm in the model. These are two quite different components of the
model. This conclusion merely reconfirms that the design of the evaluation
has to match the purpose of the evaluation.
Second, it appears that some errors that contribute to the bias in the
model's prediction of ozone within a given year will also affect the
model's prediction of the effect of an emissions change. It would appear
that there are other errors, however, that do not affect the prediction of
response to emissions change. More work should be done to understand which
errors impact the predictions related to a change in emissions and which do
not. This will affect the choice of simulation days and guide regulatory
use of the model.
There are two specific areas for further work which are highlighted by
the Denver emissions change results: (1) vertical mixing and (2) downwind
location of the peak relative to the major emissions sources. In regard
to vertical mixing, the rate of diffusion out of the ground-level boxes
(both horizontally and vertically) has been shown to greatly influence the
predictions of a relative change in peak ozone due to a change in
emissions. The sensitivity of the peak ozone predictions to the rate of
diffusion is greater for the emissions change prediction than for the
meteorological change (single year) prediction. The neutral day used in
the dispersion sensitivity study helped, but did not completely remove the
bias on days 79180 and 80204. However, the neutral day appeared to
overcorrect the emissions change predictions (obtaining a 20% relative
change on day 80204). Thus only looking at ozone bias may not be the best
way to "get the diffusion right." The fact that the daily CO bias on day
80204 went from underprediction to overprediction suggests that inert
pollutants such as CO may provide a better key to knowing whether the
diffusion in the lowest boxes is appropriate. For each new
city, the model may need to be cross-checked for reasonableness. The use
of the CO predictions as one basis for that cross-check, rather than ozone
predictions, should be investigated. The first step, however, is to
correct Lamb's polynomial, which is present in all versions of the model,
to reduce vertical diffusivity near the surface.
The second area in which further work is suggested is related to the
downwind location of the peak. An investigation of the percent change in
the main ozone peak due to a change in the emissions showed that the change
appeared to be affected by the time and location of the peak. The percent
change tended to monotonically decrease after 1200 MST and tended to be
less, by a fair amount, when the peak was farther from the center of the
urban area. Across the eight days the average percent change in EPA2's
maximum ozone prediction at a given hour due to a change in emissions fell
from 11.8% at 1200 to 10.3% and 8.1% at 1300 and 1400, respectively. An
examination of the eight days in Table 23, using EPA2, showed that days
79180, 79193 and 79208 had predicted peaks that were farthest from the
center of Denver. As well, 79193 and 79208 were the only days in which the
predicted ozone peak was influenced by the point sources. The hours of the
predicted peaks were 1300, 1200 and 1200 for days 79180, 79193 and 79208,
respectively, not late in time at all. The model seems to predict less
change in ozone due to a change in emissions when the peak is later in time
and farther away from the main emissions source. To check whether this
tendency would hold on another day, the emissions change test was carried
out for an extra day using EPA2, day 80207. This day had a predicted peak
of 154 ppb; the peak occurred at 1400 and lay to the south of Denver, past
Highland. The relative change in ozone predicted for 80207 for the daily
maximum over the full grid was 8.4%, much like the three days discussed
above. This analysis suggests that the model's prediction of a relative
change in ozone due to a change in emissions is affected by the timing and
location of the peak. This same effect was seen on two days in Tulsa
(Layland, 1983), days having predicted peaks close to and far from the
city. The veracity of such a prediction clearly must be checked against
monitoring data. A first step would be to check the trends in ozone at
each monitoring station to see if the stations farther away from the urban
center show less of a decline in the ozone trend. A result showing there
is no difference in the trend between stations would have important
implications for appropriate use of the model and interpretation of its
predictions for different types of high-ozone days. While it is possible
that transported NOX emissions and background hydrocarbons could provide
a mechanism for this effect, it is also possible that some types of days
should simply not be used for simulations for regulatory decision-making.
In summary, we have found that the emissions change comparison does
produce an assessment of the Urban Airshed Model that is fairly independent
of the meteorological change assessment. Thus for a regulatory assessment,
an emissions change comparison must be included to investigate whether the
model is good enough for regulatory use. In addition, two areas of concern
with respect to use of the model for regulatory decisions have been raised
which suggest a need for further investigation.
G. PERFORMANCE EVALUATION CONCLUSIONS
The example performance evaluation has pointed out a number of
operating characteristics of the model. These characteristics relate to
its general use as an urban photochemical model and to its use in
regulatory analysis. Although the Denver example evaluation has a number
of flaws, the lessons about the operation of the model are valuable. In
particular, it is evident that it is not a simple task to use the model in
the support of decision making. We will use the term "the model" to mean
the Urban Airshed Model in general and EPA2 in particular, unless otherwise
noted.
We do not believe we can make categorical statements in this report
about the goodness of the model. The Denver example evaluation has shown
that there are a number of errors in the model and in the input data that
are explainable and which affected the results presented in the example
evaluation. We do believe that the bias shown in this Denver evaluation
can certainly be reduced. Thus the Urban Airshed Model is clearly better
than the quantified evaluation indicates. A clear, general impression is
that the model (EPA2) has come of age. That does not answer the question
of whether it is good enough for regulatory application. It has become
clear that the model has idiosyncrasies and that one evaluation for one
city will not answer that question. The purpose of the discussion of this
section is to shed light on that question, based on the Denver experience.
The emphasis will be on those attributes that relate to use and evaluation
of the model for purposes of making predictions for use in decision
making.
Performance Character of the Model
Changes to the Urban Airshed Model from the DOT to EPA1 to EPA2
versions resulted in improved predictions. The day-to-day variability of
the peaks was improved. The amount of ozone produced at the surface
increased, reducing the bias in the predictions of the peaks. The
capability of the model to reproduce changes in peak concentrations due to
changes in meteorology was greatly improved. The capability of the model
to reproduce changes in peak concentrations due to changes in emissions
also appeared to be greatly improved. Nonetheless, the model still shows a
pattern of chronic underprediction.
The predictions from one model version to the next changed in the
regions of the peaks only, rather than everywhere in the grid. The
predicted ozone concentrations for the "valleys" and "saddles" between the
peaks and for the large flat regions of low ozone remained insensitive to
the changes to the model. The size of the base of the ozone peaks remained
unchanged. However, because the peaks were higher, the spatial extent of
high ozone areas increased as the peaks increased in magnitude. This was
true not only for the change between EPA1 and EPA2, but also for the
reduced vertical mixing sensitivities.
The location and timing of the peaks were not accurately reproduced
and it appears that the spatial extent of the peaks may be underpredicted.
There do seem to be a number of factors (wind fields, emissions, and
chemical mechanism) influencing the location and timing of the peaks. This
study did not attempt to determine the relative contributions of those
three factors to this problem of the model. The predicted peak ozone cloud
often moves at a different speed than a parcel of air, generally more
slowly, indicating that there is a complex interaction going on. Due to
the limited number of monitoring sites it is nearly impossible to say
anything definitive about the true spatial extent of the ozone peaks. The
very sharp peak predicted by EPA2 on 80204 is considered to be primarily an
artifact of the wind field and does not represent a problem internal to the
model. From the limited monitoring data available, it appears that the
steepest observed rate of change of ozone per distance is approximately
two-thirds of the average rate of change on the largest predicted peaks.
Thus the predicted peaks still seem to be steeper than those observed. As
will be discussed below, one contributor to the problem could be that the
off-peak production of ozone is not sufficiently large. The inference that
the areal extent of the peaks is underpredicted rests on the fact that the
evidence for Denver seems to be consistent with the results from the St.
Louis and Tulsa studies.
The behavior of the model is very site specific. In the collection of
days that were simulated, every site monitored high concentrations of ozone
on a few of the days. The hourly comparisons and the sensitivity studies
showed that each site had an individualized combination of errors
contributing to the bias at that site. Thus it is not straightforward to
interpret the bias and noise statistics across monitoring sites and it
should not be assumed that comparable errors are contributing to the
statistical measures across sites. For example, the fact that CAMP is
located very near a major intersection of downtown arterials may contribute
to its low bias, whereas missing the peak in space and basic
underprediction of the peaks were the sources of the high bias at Highland.
The predictions of the model are fairly sensitive to some of the input
values. The setting of background ozone levels greatly affected the
predictions of the model throughout the day. Thus great care must be
exercised in setting background ozone levels throughout the day. There is
moderate sensitivity of the predicted ozone maxima to background
hydrocarbon levels on some days. The sensitivity to special features in
the wind fields and emissions is important for interpretation of the model
results.
The EPA2 model has become more sensitive to certain errors or features
in the wind and emissions input data as the result of elimination of the
artificial diffusion. Any "dead-spots" in the wind field will result in a
very large prediction of ozone in a single cell if that cell is located in
a peak ozone area. The result is an anomalous spike of ozone, distorting
the interpretation of the spatial extent of the predicted high ozone region
and affecting the estimation of the bias of the model's predictions. In
the same vein, the influence of the point sources on the ozone predictions
is increased, because the large NOX concentrations do not disperse as
rapidly in EPA2 as they do in EPA1.
The model behaves differently when emissions change than when the
meteorology changes. The level of bias estimated for days which have
basically the same emissions does not relate closely to the error in the
predictions when emissions are changed. The predictions for an emissions
change are much more sensitive to dilution effects than predictions for a
change in meteorology. The responsiveness of the model to changes in
meteorology appears to have no relation to its ability to predict well the
relative changes in concentrations due to changes in emissions. This is an
important conclusion of this Denver example performance evaluation. It
implies that some inferences from past studies which depend on bias
measurements to evaluate the performance of the model for regulatory
application may not be valid.
The response of the predicted peaks to an emissions change seems to be
a function of time and location. This is an area that deserves more
investigation. The general pattern of model response during the day seems
to be that as the ozone cloud builds to a peak and then slowly declines,
the percent change in the hourly prediction due to emissions changes
decreases. In what may be a related phenomenon, the farther the daily
peak is from the major emissions sources, the less change there is in
the predicted peak due to a change in emissions. This behavior of the
model needs to be verified against observed ozone data to establish
whether this is a problem in the model that becomes evident on
non-stagnation days.
The model still has room for improvement. Two areas of improvement
are immediately indicated by this evaluation. First, the algorithm for the
calculation of vertical diffusivity should be corrected to correspond with
observations and theory near the surface. This would reduce the rate of
vertical mixing in the model, which has been shown to improve the
predictions of the model. The model is sensitive to errors in this
vertical diffusivity formulation. Second, the box height at the surface
should be held fixed. This is necessary to make sure the lower diffusivity
is adequately and uniformly taken into account in the calculations. It
will also eliminate a source of error due to variation in the volume of the
box between hours and variation in the vertical diffusivity calculated for
the surface box, when in reality there is no variation. These two changes
should improve the predictions of the model both for an emissions change
and for changes in meteorology.
Insights on the Regulatory Use of the Model
Evaluating the model for its acceptability for use in decision making
requires an understanding of the idiosyncrasies of the model—what
influences its predictions. Those idiosyncrasies must be accommodated or
deemed unimportant in order for the model's predictions to be usable and to
hold up under scrutiny of a "hostile" audience. The assumption is that the
model, because of its complexity and necessary simplification, will
continue to be less than perfect. It will probably continue to perform
better for some types of high ozone days than for others. While a number
of insights about the model have evolved, we focus here on those that are
most relevant to use of the model for regulatory purposes and to
minimization of the effects of errors and model idiosyncrasies on the peak
ozone predictions. These insights are associated with simulation of high
pollution cases.
The model has difficulty correctly incorporating strong NOX
sources. This characteristic of the model affects the quality of its
predictions, both in terms of magnitude and spatial extent of the peaks.
It does appear as if this problem affects model predictions that would be
used for decision making. Accepted guidelines need to be developed telling
users of the model how to handle such sources.
The predicted change in ozone peaks seems to be quite sensitive to
diffusion out of the ground-level box. Thus simulation of this diffusion
needs to be as correct as possible. The sensitivity of the emissions
change prediction to vertical mixing is greater than the sensitivity of
the peak prediction on the historical day. Thus bias in peak ozone
predictions may not be the best guide to assessing whether the diffusion is
correct. One procedure to investigate further is the use of the bias in
the prediction of inert pollutants, especially carbon monoxide, as a
measure of the accuracy of the vertical mixing reproduced by the model. A
means of model adjustment based on CO bias may be important to account for
urban differences. Such an adjustment might be far more important than any
calibration for obtaining correct predictions of the magnitude of the peak
for regulatory purposes. The best situation would be that once the
vertical diffusivity is correctly simulated by the model no further
adjustments would ever have to be made. Further work is obviously required
on this topic.
Changes in peak concentrations due to changes in emissions seem to be
a function of time of day and also seem to be sensitive to distance of the
peak from the major source of emissions. As the distance increases the
predicted change decreases. This characteristic behavior of the model must
be verified as correct or incorrect, possibly by performing a trend
analysis on monitoring data to ascertain if the trend is a function of
distance from the main source of emissions. If this distance effect is
real, this could have important implications for regulatory decisions. In
any case, it has important implications for choice of the days that are
most appropriate to be used for regulatory analysis.
There is some indication that the trailing edge of the ozone peak may
collapse too quickly. This should be investigated. The cause
of this behavior may also be affecting the spatial extent of the predicted
ozone cloud. It may also be implicated in the apparent decrease in the
predicted change of ozone due to an emissions change as the peak is located
farther from the main source of emissions. We have not done any analysis
to allow us to speculate as to the cause of this behavior, but it seems
worth investigating whether there is a connection because of the
implications for guidelines on the model's use.
On the basis of this Denver example evaluation and error diagnosis, it
appears that some, but not all, of the problems that contribute to the bias
in the predictions for a response to change in meteorology also contribute
to the bias in the predictions for a response to emissions change. The
degree of contribution is different, however. Other problems do not appear
to affect the predictions of peak ozone for a change in emissions. This
lack of strong association between the two kinds of predictions means that
much more attention must be given to evaluating the model in the way it is
intended to be used. As discussed above, only using ozone predictions to
evaluate the model may be too limiting. Clearly the ozone predictions of
the model for an emissions change are more complicated and sensitive than
previously imagined and less related to the types of evaluations currently
in common use than presently assumed. This would seem to imply that the
model is not yet adapted to casual regulatory use. If a single correction
of the vertical diffusivity in the model can apply across most urban areas,
then it appears possible that most of the remaining bias in the single
year's predictions of the model will not seriously affect its predictions
for changes in emissions.
IV. IMPLICATIONS FOR PERFORMANCE MEASUREMENT
Conclusions about the performance measures derived from this study are
likely to be dependent, in part, on our particular Denver data set and our
simulation results. This also is a special evaluation case, in that the
three models to be compared represent incremental improvements in one basic
model, which aids in interpretation and reduces the amount of analysis
required to evaluate the basic model.
A. CONCLUSIONS ON THE USE OF STATISTICAL TECHNIQUES
Evaluation of the Performance Measures
Bias was the most useful measure of the accuracy of the predicted
concentrations. In the set of peak predictions it provided a basis for
statistically discriminating between models, through use of the Wilcoxon
test. When computed on subgroups of the set of hourly predictions it
helped to pinpoint systematic errors in time and space. It is useful also
in that it offers an ideal standard, zero bias, against which a model can
be judged.
We found that proportional bias, i.e., the bias divided by the observed
concentration, was also useful in getting a sense of the model's behavior.
It must be interpreted judiciously, however, because the proportional
hourly bias looks rather poor in the early morning, when concentrations
are low and errors there have little effect on the important predictions
of the model.
Noise is less interpretable than the bias. Its ideal value, zero, is
virtually unattainable because there are practical limits on how accurate
a model can be. The same difficulty applies to the gross error
measures, d and MSE. The goal is to obtain a low value, but there is no
standard for determining what value is "small enough" for regulatory use
of the model. Our three models did not have significantly different noise
levels in any of the data sets discussed above. The noise could be useful,
however, in selecting between two models which have similar bias: the
model with the smaller noise level would be preferred.
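A minimal sketch of these paired measures, using the conventional
definitions (bias as the mean residual, noise as the standard deviation of
the residuals, gross error as the mean absolute residual and the mean
square error) and hypothetical concentrations; the workshop definitions
may differ in detail:

    # Sketch of bias, noise and gross error on paired hourly values.
    import numpy as np

    c_obs  = np.array([ 80, 95, 120, 140, 135, 110,  90], dtype=float)
    c_pred = np.array([ 70, 82, 100, 118, 120, 100,  85], dtype=float)

    resid = c_obs - c_pred
    bias  = resid.mean()
    noise = resid.std(ddof=1)          # Sd
    d_bar = np.abs(resid).mean()       # gross error
    mse   = (resid ** 2).mean()
    print(bias, noise, d_bar, mse)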
Variability comparisons between the observed and predicted
concentrations provided useful diagnostic information, despite the fact
that they do not require pairing of the observation and prediction. The
ideal variability in the predictions would be equal to the variability in
the observed concentrations, and this can be tested using the F-test,
although confidence levels are only approximate if the data is not normally
distributed. In the example evaluation, the models tended to produce too
little variability in their predictions, probably holding too closely to a
fixed diurnal pattern.
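A minimal sketch of the variance comparison, assuming a two-sided F test on
the ratio of observed to predicted variances and hypothetical
concentrations; as noted above, the confidence level is only approximate
for non-normal data:

    # Sketch: F ratio of observed to predicted variance with a two-sided p-value.
    import numpy as np
    from scipy.stats import f

    c_obs  = np.array([120, 135, 150, 128, 160, 142, 155, 138, 147, 133, 158], dtype=float)
    c_pred = np.array([118, 121, 126, 119, 130, 124, 128, 122, 125, 120, 129], dtype=float)

    F = c_obs.var(ddof=1) / c_pred.var(ddof=1)
    dfn = dfd = len(c_obs) - 1
    p_one_sided = f.sf(F, dfn, dfd)
    p_two_sided = 2 * min(p_one_sided, 1 - p_one_sided)
    print(F, p_two_sided)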
Correlation related measures appear to be useful only as rough
indicators of model performance. The use of the correlation coefficient
needs to be discussed separately for daily maximum concentrations and for
hourly concentrations. In analyzing daily maxima, the correlation should
be used in conjunction with the slope and intercept of the best-fitting
straight line, and should be accompanied by a scatterplot of Co vs.
Cp. The use of correlation alone would not reveal linear transformations
or nonlinear relationships. Also, there are practical problems in the use
of correlation and regression measures on a small set of extreme maxima.
Data that might form the upper end of an acceptable straight line over a
larger range of concentrations may appear to be an uncorrelated swarm of
points when the range is severely restricted. Furthermore, on the small
set of ozone maxima available for Denver, the standard errors on the
regression coefficients were large, indicating that the estimates were quite
unreliable.
In analyzing hourly concentrations, the correlation is primarily a
measure of the model's ability to replicate the average shape of the
diurnal pattern of ozone concentrations. This is useful information, of
course, for establishing confidence in the general performance of the
model. But it does not measure the types of error that are most relevant
to regulatory use of the model. An error which is of great concern for
regulatory purposes, inaccurate prediction of the magnitude of the daily
peak, may cause little or no reduction in the correlation. On the other
hand, an error which is of less concern in regulation, missing the peak in
time by an hour or two, will reduce the correlation. In comparison, hourly
biases for each hour of the day provided more relevant information about
the diurnal patterns for judging model performance and for diagnosing
errors in the models. Time series plots of Co and Cp provided more
detailed diagnostic information for each day.
Spatial correlation was not computed in the Denver evaluation because
five monitoring stations could not provide enough locational detail. To
investigate spatial patterns, contour plots of the predicted concentrations
were compared with observed concentrations at the five sites. This was
useful in understanding spatial errors, which tended to be different for
each day. As with the other performance measures, we suspect that the
spatial correlation would only point out the days which have the largest
spatial errors. Explanation of the errors would then be necessary and
would require detailed investigation.
Comparison of the observed and predicted trends that result from
changing emissions presents particular problems in a model evaluation. For
the other performance measures, a reasonably accurate observed value was
available for comparison with each predicted concentration. Unfortunately,
an analogous observed trend caused solely by emissions change is not
available for comparison with the predicted trend. It is necessary to use
observed data that compounds the effects of emissions change and
meteorological change. To separate the effects of the two changes requires
either an accurate model of the meteorological effects or a model of the
shape of the trend due to emissions change. In either of these cases, the
errors in the resulting estimate of the "observed" trend may be rather
large. Thus the predicted trend will be compared against a rather
unreliable number, unless every effort is made to verify the modeling of
the observed trend.
The linear trend approach used for the Denver evaluation was
appropriate for Denver ozone concentrations in 1975-80. This was verified
using vehicle travel counts from the Colorado Department of Highways and
vehicle fleet emissions estimates from the federal mobile source emissions
model. Such checking should be done before the same method is applied to
similar model evaluations in other locations.
Evaluation of Graphical Displays
All of the graphs suggested earlier in this report were found to be
useful in the example evaluation. In addition to the commonly used plots
of observed and predicted values and residuals, graphs of hourly bias (with
confidence intervals) and daily bias were helpful in interpreting the
statistical measures.
Special graphical displays related to specific problems in air quality
models were valuable for diagnosing causes of error. In the Denver
evaluation, contour plots of the predicted concentrations were made for
each hour of each day. These contour plots were important for
understanding errors in the spatial location of the predictions. Contour
plots of the emissions data used as input to the model were useful in
finding locations where errors in emissions input could be causing errors
in the predictions. In addition, wind trajectory plots were useful in
tracking the development of the predicted ozone peak within the model.
This experience indicates that graphs should be an integral part of
any performance evaluation. They go beyond the summary statistics in
highlighting the type and location of errors in the predictions.
Evaluation of the Use of Subgroups of the Hourly Concentrations
Statistics which were averaged over the full hourly data set provided
only very general information, merely an impression of high bias and low
precision, with correlations that were high enough to indicate some success
in capturing the diurnal cycle of ozone production. So many effects were
averaged together that specific conclusions about the usefulness of the
models were not possible.
Sorting the data by site provided the additional information that
predictions at one site (CARIH) were substantially more biased than at the
other sites, suggesting that special features of that site should be
examined.
Sorting the data by day was important to determine whether any of the
sample days presented particular difficulty to the models, in case some
unusual atmospheric phenomena could not be duplicated by the models. In the
Denver data there was little difference between days in model performance
when averaged over all sites; therefore, no day required special analysis.
But at specific sites, unusual model performance on particular days
indicated by the daily bias and noise revealed specific site-related
problems in the modeling or in the input to the models. Daily measures
should not form the primary basis of an evaluation, however, because all of
the information on the diurnal ozone pattern has been lost. Because of
high autocorrelation over successive hours, confidence intervals on daily
bias estimates will be extremely large.
Sorting the data by hour and by site was most useful for diagnosing
errors in model performance. It revealed a systematic pattern of bias over
the day at every site, with the models tending to overpredict in the
early morning (7-9 a.m.) and to underpredict, with statistically
significant bias, in the mid-day peak hours and the afternoon. Here, 95%
confidence intervals on the hourly bias were especially useful, because
they helped distinguish between bias which was a real, systematic feature
of the model and bias which might be only the result of random fluctuations
in the data. When the data were sorted by hour and by site, the bias and
noise measures averaged over all of the days helped to reveal patterns of
error and led to
diagnosis of a number of reasons for error in predictions at specific
sites. Still smaller subsets, involving hourly averages for selected days
at a particular site, were useful for diagnosing specific problems and for
confirming hypotheses about the sources of particular errors.
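To make the hour-by-site sorting concrete, the sketch below (Python with the
numpy and scipy packages; the residuals are simulated, and the independence
assumed by the confidence interval is discussed later in the report) computes
the bias and noise at one site for each hour and attaches a 95% Student's t
confidence interval to each hourly bias:

    import numpy as np
    from scipy import stats

    n_days, n_hours = 11, 12
    rng = np.random.default_rng(1)
    residuals = rng.normal(10.0, 20.0, size=(n_days, n_hours))  # hypothetical Co - Cp at one site

    for h in range(n_hours):
        r = residuals[:, h]
        bias = r.mean()
        noise = r.std(ddof=1)
        # 95% confidence interval on the hourly bias (assumes independent, roughly normal residuals)
        half_width = stats.t.ppf(0.975, df=len(r) - 1) * noise / np.sqrt(len(r))
        flag = "*" if abs(bias) > half_width else " "
        print(f"hour {h + 6:2d}: bias {bias:6.1f} +/- {half_width:4.1f} ppb {flag}  noise {noise:5.1f}")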
Estimating the Bias in the Predicted Peak Concentration
The peak predictions are important in regulatory use of the model;
therefore, it is desirable to estimate the bias in those predictions. The
bias estimate depends, however, on which predicted concentration is matched
with the observed daily maximum. Theoretically it might be argued that the
predicted daily maximum should be chosen only from a monitoring site
location, because the chance of hitting the true area-wide peak would then
be the same for both the observed and predicted maximum (as discussed in
the section "Ways of Pairing Daily Maximum Concentrations"). That argument
only holds if the days and the monitoring locations have been chosen
randomly, however. In selecting days with high observed ozone
concentrations and locating the monitoring stations in high-pollution
areas, we have increased the chance of hitting the true area-wide peak in
the observed concentrations. The chance of hitting the true peak in a
prediction at a monitoring site has not been increased accordingly,
however, because spatial errors in the predictions must be expected. The
result is that, for a high ozone data set, this pairing should tend to
produce some underprediction of the maximum, or positive bias.
The opposite tendency can be expected if the predicted maximum is
chosen from the entire grid area covered by the model. In that case, the
predicted maximum is chosen from a larger number of locations than the
observed maximum. Thus this pairing should tend to produce an
overprediction of the maximum, or negative bias.
We conclude that a meaningful statement of the bias in the predicted
maximum should fall somewhere between the bias estimates produced under the
two pairings just described. Where it falls in that range would depend on
the population of days used in the evaluation, the locations of the
monitoring sites, and the kinds of errors in the model. Errors in the
spatial extent of the peak-ozone cloud, errors which systematically affect
the prediction at a particular monitoring site, or errors which increase
the likelihood of the predicted peak missing a monitoring site will each
affect the bias in a different way. Furthermore, some real peaks may
totally miss the monitoring sites and not be observed at all. To obtain
the most accurate bias estimate, it may be necessary to match observed and
predicted peaks by hand, using contour plots of the predicted
concentrations. Thus establishing the bias in the peak predictions is not
completely straightforward.
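The two pairings discussed above amount to two different reductions of the
same prediction field. The sketch below (Python with numpy; the array names,
grid dimensions, and values are hypothetical) compares the observed daily
maximum first with the predicted maximum restricted to the monitoring-site
locations and then with the predicted maximum taken over the full model grid:

    import numpy as np

    rng = np.random.default_rng(2)
    n_days, n_sites, n_hours = 11, 5, 12
    nx, ny = 30, 26                                        # hypothetical grid dimensions

    obs = rng.uniform(80.0, 170.0, size=(n_days, n_sites, n_hours))
    pred_sites = obs - rng.normal(30.0, 25.0, size=obs.shape)             # predictions at the monitors
    pred_grid = rng.uniform(40.0, 160.0, size=(n_days, nx, ny, n_hours))  # full gridded predictions

    obs_max = obs.max(axis=(1, 2))                  # observed daily maximum over sites and hours
    pred_max_sites = pred_sites.max(axis=(1, 2))    # pairing 1: predicted max at monitor locations
    pred_max_grid = pred_grid.max(axis=(1, 2, 3))   # pairing 2: predicted max anywhere on the grid

    print(f"bias, site-constrained pairing: {(obs_max - pred_max_sites).mean():6.1f} ppb")
    print(f"bias, all-grid pairing:         {(obs_max - pred_max_grid).mean():6.1f} ppb")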
In the example evaluation, we conclude that the bias in EPA2's peak
ozone predictions should be estimated to lie between 10% and 31%, based on the biases
shown in Table 20. The best bias estimate within that range, and the most
appropriate prediction of the daily maximum, will depend upon the types of
spatial errors that are affecting the predictions. Spatial errors could
not be adequately investigated in the Denver data set, thus further
investigation is needed.
Problems in Comparing Models on Hourly Data
At first glance it is somewhat surprising that the EPA2 model looks
distinctly better than the others in comparisons of the daily maxima, but
not in comparisons of the complete set of hourly data. Two factors in the
time-paired hourly data tend to mask the superior performance of EPA2 in
predicting the peaks: missing the peak in time, and errors in the off-peak
hours.
When the models miss the peak in time, predicting a maximum several
hours before or after the observed maximum, the effect on the performance
measures in a paired comparison can be quite misleading. An example of
this is shown in Figure 38(a), the observed and predicted concentrations
for a day at CARIH. All three models predict the peak 2 hours too late.
The maximum predicted by the DOT model is much too low, but that predicted
by EPA2 is acceptably close to the observed maximum. We would definitely
select the EPA2 predictions as the superior ones in this example.
The performance measures would have suggested a different conclusion,
however. For this day at CARIH they are
                          DOT    EPA1   EPA2
Bias                     29.3    21.9   21.3
Noise                    31.4    34.9   40.9
Absolute deviation       31.0    30.6   34.1
r (Co vs. Cp)             .68     .60    .47
Except for the bias in the DOT model, both DOT and EPA1 appear to be
superior to EPA2. This judgment is based on their lower noise and absolute
deviation, their higher correlation, and a comparable bias in EPAl. The
apparent inferiority of EPA2 in this case results entirely from its
superior estimation of the magnitude of the daily maximum. The three
models perform similarly at other hours.
This is not an isolated case. The site maximum (i.e., the local peak
at a given site for a given day) was predicted at the wrong hour by all
three models more than 60% of the time, and by individual models even more
frequently. This can be expected to have a substantial impact on
statistical measures of model fit under hourly pairing, as shown above. If
the goal is to find the model which best predicts the magnitude of the
daily maximum, hourly pairing can lead to the wrong choice. Instead, in
that case, the choice should be made by comparing model performances on
the set of daily maxima, not paired by hour.
Another example in which the statistical measures are misleading is
shown in Figure 38(b). This day at CAMP illustrates a problem in averaging
hourly differences over all hours of a single day. Inspection of the
observed and predicted concentrations again leads us to prefer EPA2 because
it most closely approximates observed O3 levels over several peak hours.
The performance measures indicate otherwise, however, as shown below.
                          DOT    EPA1   EPA2
Bias                      1.3    -3.8   -8.1
Noise                    17.2    14.5   14.7
Absolute deviation       12.1    11.6   13.8
r (Co vs. Cp)             .85     .89    .88
EPA1 looks better than EPA2 on every measure, partly because the peak hours
are shifted and partly because of large errors in EPA2 in the morning. By
averaging over the day, a variety of model errors are merged in each
measure, obscuring the most important distinctions between models. When
statistics are averaged over several full days, there is even more tendency
for error effects to balance out. It is no wonder, then, that our summary
statistics on the full set of paired hourly data (Table 2) showed no clear
distinctions between models and gave little information on the sources of
errors in the models.
Effects of Non-normality on Bias Comparisons
The Kolmogorov-Smirnov test of normality was applied to all of the
sets of daily- and site-maxima (observations, predictions, and residuals).
In addition, it was applied to the separate hourly sets of EPA2 residuals,
both for each site separately (n = 11) and for all sites together
(n = 55). In no case could the hypothesis of normality be rejected, even
with the rather liberal significance level of α = .20. Such a result is
conventionally taken to indicate that use of t- and F-tests is
appropriate.
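The normality check described above can be reproduced with standard routines.
In the sketch below (Python with scipy; the residual sample is invented), the
Kolmogorov-Smirnov test is applied against a normal distribution whose mean
and standard deviation are taken from the sample itself; estimating the
parameters from the same data makes the plain test somewhat conservative, a
detail glossed over here:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    residuals = rng.normal(8.0, 22.0, size=11)        # hypothetical daily-maximum residuals (ppb)

    # Kolmogorov-Smirnov test against a normal distribution fitted to the sample
    d_stat, p_value = stats.kstest(residuals, "norm",
                                   args=(residuals.mean(), residuals.std(ddof=1)))
    print(f"KS statistic = {d_stat:.3f}, p = {p_value:.3f}")
    if p_value > 0.20:                                # the liberal significance level used above
        print("normality not rejected at alpha = .20")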
A test of normality on a sample containing only 11 points cannot be
very precise, however. Therefore bias comparisons were done using both the
t-test and the Wilcoxon Paired Rank Test to determine whether the Wilcoxon
test could provide additional information.
On the hourly EPA2 residuals, the results of the two tests were
virtually the same. When all sites were analyzed together, testing whether
the bias was significantly different from zero at each hour, the two tests
agreed on each of the 12 hourly data sets. Separating the data by site
produced 60 sets of residuals, each containing approximately 11 points.
Hypothesis-testing results on these 60 biases using the Wilcoxon and
t-tests disagreed only twice, and the significance levels were close to the
.05 borderline.
Most of the sets of data involving peak concentrations showed very
large biases, thus it is not surprising to find that the Wilcoxon and
t-tests agreed that these biases were significantly different from zero.
The two tests agreed also on the one case in which bias was not
significantly different from zero.
The two tests frequently did not agree, however, when residuals from
two models were compared on peak concentrations. In these comparisons, the
differences in bias were relatively small, hence the sensitivity of the
test could be critical. In each case, the null hypothesis to be tested was
that no difference exists between the bias in two models on a given set of
peak concentrations. Such tests were performed on the two sets of site
maxima (n = 55) and on the three differently matched sets of daily maxima
(each having n = 11). In each data set, the Wilcoxon test found
significant differences in bias which were not detected by the t-test. In
no case did the opposite occur; that is, on the rare occasions when the
null hypothesis was rejected by the t-test, the Wilcoxon test agreed.
We conclude that the nonparametric Wilcoxon test, with fewer
restrictive assumptions than the t-test, can be more powerful than the
t-test when the normality of the data is in doubt. This was shown to be
true in spite of the fact that the data "passed" a Kolmogorov-Smirnov
normality test.
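The two tests can be compared with standard routines. In the sketch below
(Python with scipy; the residual series are hypothetical, with a deliberately
skewed difference between the two models), the difference in bias between two
models is tested both with a paired t-test and with the Wilcoxon paired
(signed) rank test:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n = 11
    resid_model_a = rng.normal(40.0, 15.0, size=n)               # hypothetical peak residuals, model A
    resid_model_b = resid_model_a - rng.gamma(2.0, 4.0, size=n)  # model B: slightly smaller, skewed shift

    t_stat, t_p = stats.ttest_rel(resid_model_a, resid_model_b)  # paired t-test on the bias difference
    w_stat, w_p = stats.wilcoxon(resid_model_a, resid_model_b)   # Wilcoxon paired (signed) rank test

    print(f"paired t-test: t = {t_stat:5.2f}, p = {t_p:.3f}")
    print(f"Wilcoxon test: W = {w_stat:5.1f}, p = {w_p:.3f}")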
In this comparison, Student's t was found to be overly conservative,
less able than it should be to reject the null hypothesis at the specified
confidence level. Nevertheless, it may still be useful to compute
confidence intervals using Student's t. If this is done, it should be
remembered that these intervals do not accurately represent the specified
level of confidence. In this study it appears that the confidence
intervals are unnecessarily large. In general, though, the type of error
will depend upon the underlying distribution.
Uncertainty about the correctness of confidence intervals on the bias
casts some doubt on the appropriateness of using such intervals to
establish performance standards. At the very least, distributions of
residuals should be examined. For example, skewness in these distributions
could require asymmetrical confidence intervals to give fair treatment to
positive and negative biases.
B. RECOMMENDATIONS
Recommended Performance Measures
The choice of performance measures should be based on the model
attributes which are to be evaluated. Thus no single list of measures will
be appropriate for every evaluation. Recommendations based on the Denver
example evaluation should be relevant, however, for other time-dependent
airshed models. The list of performance measures, graphic displays, and
data combinations which appear to be most useful in the evaluation of a
time-dependent urban airshed model are shown in Tables 25 and 26. In the
Denver study it was found that the long list of performance measures
recommended for evaluation of air quality models by the AMS workshop could
be reduced considerably. Some of the recommended measures were simply
redundant, as is the case with mean square error and absolute deviation.
Others would be inappropriate, given the intended application of these
models to a small number of selected high ozone days, or even a single
worst-case day, in the evaluation of state air pollution control
strategies. In particular, if the models are to focus on only a few days,
comparisons of frequency distributions of concentrations over long periods
are not appropriate. Furthermore, both observations and predictions will
be closely tied to the meteorological characteristics of the few chosen
days, therefore pairing by day must be maintained in the comparisons.
The use of graphs to display results is to be encouraged at every
stage of a model evaluation. Scatter diagrams, time series plots, and
contour plots are essential aids in interpreting the statistics, and are
useful, as well, in uncovering sources of error in the models.
Two types of peak comparisons which we tried have not been included in
Table 25: the predicted daily maximum at the site of the observed maximum,
and the predicted site maximum at the hour of the observed site maximum.
These were less important than the other peak comparisons in this study
because the information they offer about missing the peak in time and space
was obtained from the hourly data. They do help by substantiating those
findings, however, and they definitely should be included in a study that
involves only peak concentrations and not hourly comparisons.
Before using statistical tests or confidence intervals, two
assumptions underlying their use must be confronted: normality and
independence. The effect of non-normality on the t-test was checked in
some detail, and found to cause some significant differences in bias to be
overlooked in our data. As a result, use of the Wilcoxon test to compare
sample biases is strongly recommended when normality is in doubt. The
assumption that the data is normally distributed is also important in use
of the F distribution to compare variances, but deviations from normality
have been empirically shown to have only a minor impact on this test.
Therefore only in cases of extreme deviation from normality would one be
unable to apply the recommended F-tests.
Lack of independence, due to autocorrelation of data in a time series,
is a more serious problem. When air quality data is collected over
successive hours or days some mutual dependence of data points is almost
certain, and this dependence can seriously affect the accuracy of
statistical estimates. Therefore autocorrelation should be measured and
corresponding adjustments made to the confidence intervals and degrees of
freedom used in statistical testing.
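One common form of such an adjustment, offered here only as an illustration of
the kind of correction intended and not as the procedure used in this study,
is to estimate the lag-one autocorrelation of the residual series and shrink
the sample size to an effective value n(1 - r1)/(1 + r1) before computing the
confidence interval (Python with numpy and scipy; the series is simulated):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n = 132
    shocks = rng.normal(0.0, 15.0, size=n)
    residuals = np.empty(n)
    residuals[0] = shocks[0]
    for i in range(1, n):                                  # build a hypothetical autocorrelated series
        residuals[i] = 0.6 * residuals[i - 1] + shocks[i]
    residuals += 12.0                                      # add a constant bias

    r1 = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]  # lag-one autocorrelation
    n_eff = n * (1.0 - r1) / (1.0 + r1)                    # effective sample size
    bias = residuals.mean()
    se = residuals.std(ddof=1) / np.sqrt(n_eff)            # standard error with the reduced n
    half = stats.t.ppf(0.975, df=n_eff - 1) * se
    print(f"r1 = {r1:.2f}, effective n = {n_eff:.0f}")
    print(f"bias = {bias:.1f} ppb, 95% CI = ({bias - half:.1f}, {bias + half:.1f})")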
One essential aspect of model performance has not been covered in
earlier evaluations of the Urban Airshed Model, and no associated measure
is included in the Workshop list. Because of how the model is used in practice,
it was important to evaluate the model's response to changing emissions. A simple
comparison of linear trends in observed and predicted concentrations over
several years was chosen, under constraints of limited data and resources.
Independent information about Denver's vehicle travel and vehicle fleet
emissions indicated that a linear trend was a reasonable assumption. In
future research it would be worthwhile to try other approaches which are
capable of better estimating the "observed" trend in ozone concentrations
due to emissions changes over the years, controlling for changes in
meteorology.
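One simple possibility, sketched below (Python with numpy; the yearly values
and the choice of a single meteorological covariate are purely illustrative),
is to regress the yearly peak-ozone statistic on both year and a
meteorological index, so that the coefficient on year approximates the trend
with meteorology held constant:

    import numpy as np

    # Hypothetical yearly values: mean top daily maximum ozone (ppb), year,
    # and a single standardized meteorological index.
    years = np.array([1975, 1976, 1977, 1978, 1979, 1980], dtype=float)
    met_index = np.array([0.3, -0.1, 0.4, -0.2, 0.1, -0.4])
    peaks = np.array([152.0, 149.0, 144.0, 137.0, 135.0, 127.0])

    # Least-squares fit of peaks = a + b*year + c*met_index
    design = np.column_stack([np.ones_like(years), years, met_index])
    (a, b, c), *_ = np.linalg.lstsq(design, peaks, rcond=None)
    print(f"trend adjusted for meteorology: {b:.2f} ppb/year "
          f"(meteorological coefficient {c:.1f} ppb per index unit)")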
The Use of Statistical Measures as Performance Standards
Procedures recommended by Draper and Smith (1966) for evaluating
regression models are widely used to determine whether a statistical model
should be accepted or rejected. These methods primarily involve a careful
study of histograms and scattergrams of the residuals from the fitted
model.
Although some statistics are available for use in judging the
acceptability of a model, in practice they are far less informative than
the corresponding residual plots. For this reason, Draper and Smith do not
recommend their use. More commonly used are statistical measures, such as
R² and the F-test, for choosing the "best" of several models. Such
measures formed the basis of the list of performance measures recommended
by the AMS workshop. Even for choosing between models, however, Draper and
Smith caution against the automatic use of statistics, saying, "sensible
judgment is still required in the initial selection of variables and in the
critical examination of the model through examination of residuals."
Performance of the three models examined here is not impressive by the
usual standards applied to regression models, because these models have not
been "fitted" to observed data. In addition to high bias and noise, a
variety of systematic errors could be found in the residuals of each
model. Yet these are "state of the art" models, the best available tool
for scientific and regulatory analysis of the urban airshed. While we are
forced to acknowledge that they are imperfect, it is still quite possible
that they may be adequate for the purposes required of them.
Despite many attempts within the field of statistics to establish
formal criteria of acceptability, statisticians emphasize the need for
professional judgment based on the intended use of a model. We expect that
it will be equally difficult (impossible?) to establish absolute criteria
for deterministic models.
Comparisons and statistics like those listed in Tables 25 and 26 can
provide, as suggested at the AMS/EPA workshop, a "rational framework for
quantitatively evaluating the nature of differences between observations
and prediction by models." However, this framework is likely to be
skeletal. The performance measures provide "vital statistics" but not
understanding. There are likely to be multiple causes of error in a model,
some of which are serious for a given application and others of which are
not. Diagnosis of the causes of error is necessary to determine their
effects on regulatory applications. Then, judgment is needed to determine
the seriousness of the errors and whether adjustments can be made.
Evaluating the Usefulness of a Model
Physical scientists and statistical modelers have, historically,
approached modeling from fundamentally different points of view. The
physical scientist tries to base a model as much as possible on underlying
scientific truths which have been physically demonstrated in controlled
experiments. Statistical models, on the other hand, are likely to be
derived from observed data rather than from physical principles and can
usually be validated only against limited samples of empirical data
collected under relatively uncontrolled conditions. Thus statisticians are
better able to accept the prospect of an imperfect model. The viewpoint
of the statistician is well summarized by Phadke, Box, and Tiao.
"On this view of modeling all models are wrong, but some models are
useful. Thus while it is useless to seek a true one we can iterate
towards successively more useful ones till we obtain one which is
adequate for our purposes."
The user of air quality models looks for a "true" model because it is
important to base decisions on the most scientifically correct available
understanding of physical processes. But because these models are
imperfect we, too, must iterate toward successively more useful ones until
we obtain one which is adequate. From this perspective, the evaluation of
a model is intimately connected to the objectives of its application.
The regulatory purpose requires accurate prediction of peak ozone
concentrations on high ozone days, under changing emissions conditions.
The change in emissions takes place gradually over a period of years,
therefore a test of the models on data within a one- or two-year period is
probably not sufficient. In this study, then, two separate tests were
needed to match the regulatory purpose: 1) an analysis of daily maximum
predictions under a variety of meteorological conditions represented by the
11 sample days in 1979-80, and 2) an analysis of predicted change in the
daily maxima when emissions input to the model was changed from 1979 to
1976 levels. In both of these analyses, EPA2 performed better than the
other two models.
Although we have concluded that EPA2 is the best of the three models
in the daily maximum predictions required for regulation, judgment will be
required to determine whether that model performs well enough to satisfy
the purposes of its users. Absolute performance standards can lead to the
wrong decision, as shown in examples above. Even in comparing the
performances of the models, knowledge of the nature of the differences in
performance was necessary to choose between them. Statistical measures
provide helpful initial comparisons and valuable clues for decision making,
but they are not a substitute for detailed analysis upon which the decision
making must depend.
REFERENCES
Barrett, J.P. and L. Goldsmith (1976), "When is N Sufficiently Large?",
American Statistician, 30, pp. 67-70.
Brier, G.W. (1975), Statistical Questions Relating to the Validation of Air
Quality Simulation Models, EPA-650/4-75-010, U.S. Environmental
Protection Agency, Research Triangle Park, North Carolina, 313 p.
Cole, H.S., C.F. Newberry, W. Cox, G.K. Moss, and D. Layland (1982a)
"Application of the Airshed Model for Ozone Control in St. Louis,"
82-20.1, 75th Annual Meeting of the Air Pollution Control Association,
New Orleans, Louisiana, June 20-25, 1982.
Cole, H.W., W.M. Cox, D.E. Layland, G.K. Moss, C.F. Newberry (1982b) "The
St. Louis Ozone Modeling Project," draft report, U.S. Environmental
Protection Agency, Research Triangle Park, North Carolina, 392 p.
Delaney, A. (1981) "The CHON Photochemistry of the Troposphere," Notes of
the 1980 Summer Colloquium, Advanced Study Program and Atmospheric
Chemistry and Aeronomy Division, National Center for Atmospheric
Research, Boulder, Colorado, 172 p.
Demerjian, K.L., K.L. Schere and J.T. Peterson (1980) "Theoretical
Estimates of Actinic (Spherically Integrated) Flux and Photolytic Rate
Constants of Atmospheric Species in the Lower Troposphere," In
Advances in Environmental Science and Technology, Volume 10, J. Pitts
and R. Metcalf, eds., John Wiley & Sons, New York, New York,
pp. 369-459.
Draper, N.R. and H. Smith (1966), Applied Regression Analysis, Wiley and
Sons, New York.
Fox, Douglas G. (1981), "Judging Air Quality Model Performance," Bulletin
American Meteorological Society, V. 62, No. 5, May 1981, pp. 599-609.
Greenberg, J. and P. Zimmerman (1982) private communication—unpublished
data from 1980 Summer Colloquium, National Center for Atmospheric
Research, Boulder, Colorado.
Haney, J.L., T.W. Tesche, and J.P. Killus (1983) "Application of the
Systems Applications Airshed Model to the Philadelphia Metropolitan
Area: 19 July 1979 Ozone Episode," U.S. Environmental Protection
Agency, Contract No. 68-02-3582, Systems Applications, Inc., San
Rafael, California, 1983, 121 p.
Hayes, S.R. (1979), Performance Measures and Standards for Air Quality
Simulation Models. EPA-450/4-79-032. U.S. Environmental Protection
Agency, Research Triangle Park, North Carolina, 313 p.
Hirtzel, C.S. and J.E. Quon (1981), "Estimated Precision of Autocorrelated
Air Quality Measurements," Summaries of Conference Presentations,
Environmetrics 81, pp. 200-201.
Hollander, M. and R.A. Wolfe (1973) Nonparametric Statistical Methods,
John Wiley & Sons, New York, New York.
Keil, R. (1983) "The Impact of Meteorological Inputs on the Performance of
an Urban Airshed Model," Masters Thesis, Department of Meteorology,
The Pennsylvania State University, University Park, Pennsylvania (in
press).
Kleiner, B. and T.E. Graedel (1980), "Exploratory Data Analysis in the
Geophysical Sciences," Reviews of Geophysics and Space Physics, V. 18,
No. 3, pp. 699-717.
Larsen, R.I. (1971), A Mathematical Model for Relating Air Quality
Measurements to Air Quality Standards, EPA Office of Air Programs
Publ. No. AP-89, Research Triangle Park, NC, 56 p.
Layland, D.E. (1980) "Guideline for Applying the Airshed Model to Urban
Areas," EPA-450/4-80-020, U.S. Environmental Protection Agency,
Research Triangle Park, North Carolina, 169 p.
Layland, D.E., S.D. Reynolds, H. Hogo and W.R. Oliver (1983) "Demonstration
of Photochemical Grid Model Usage for Ozone Control Assessment,"
83-31.6, 76th Annual Meeting of the Air Pollution Control Association,
Atlanta, Georgia, June 19-24, 1983.
McRae, G.J. (1981) Mathematical Modeling of Photochemical Air Pollution,
Ph.D. Thesis, Environmental Quality Laboratory, Report No. 18,
California Institute of Technology, Pasadena, California, 754 p.
Myers, Jerome L. (1979), Fundamentals of Experimental Design, Allyn and
Bacon, Inc., Boston (pp. 67-68).
Panofsky, H. (1981) Private communication.
Pearson, E.S. and H.O. Hartley (1976), Biometrika Tables for Statisticians,
Vol. II, Biometrika Trust, London.
Phadke, M.S., G.E.P. Box, and G.C. Tiao (1977), "Empirical-Mechanistic
Modeling of Air Pollution," Proceedings of the 4th Symposium on
Statistics and Environment, ASA, Washington, D.C. (pp. 91-100).
Reynolds, S.D., H. Hogo, W.R. Oliver, and L.E. Reid (1982) "Application of
the SAI Airshed Model to the Tulsa Metropolitan Area," U.S.
Environmental Protection Agency, Contract No. 68-02-3370, Systems
Applications, Inc., San Rafael, California, 392 p.
Schere, K.L. (1982) "An Evaluation of Several Numerical Advection Schemes,"
draft report, U.S. Environmental Protection Agency, Research Triangle
Park, North Carolina, 37 p.
Tennekes, H. (1973) "A Model for the Dynamics of the Inversion above a
Convective Boundary Layer," J. Atmos. Sci., 30, pp. 558-567.
Tesche, T.W., C. Seigneur, L.E. Reid, P.M. Roth, W.R. Oliver, J.C.
Cassmassi (1981) "The Sensitivity of Complex Photochemical Model
Estimates to Detail in Input Information," EPA-450/4-81-031a, U.S.
Environmental Protection Agency, Research Triangle Park, North
Carolina, 181 p.
Whitten, G.Z., J.P. Killus, and H. Hogo (1980) "Modeling of Simulated
Photochemical Smog with Kinetic Mechanisms—Volume 1. Final Report,"
EPA 600/3-80-028a, U.S. Environmental Protection Agency, Research
Triangle Park, North Carolina, 348 p.
Table 1
Existence of High Pressure Influencing Denver
(Rows: high pressure / no high pressure. Columns: Surface High Pressure;
500 mb High Pressure Ridge. Time is 1200 GMT, equal to 0500 MST.)
Table 2
Meteorological Conditions I on High Ozone Days, 0600-1700 MST
(Columns for each of the 11 modeled days, 79180-80219: maximum temperature
above 80°F and above 90°F*; daytime precipitation*; 1200 GMT wind speed at
500 mb** and at the surface** (kts); maximum wind speed at the monitors*** (kts).
* at Stapleton International Airport (NWS)   ** from daily weather maps
*** from Colorado Dept. of Health data)
Table 3
Meteorological Conditions II on High Ozone Days (MST)

Day       Time of Maximum        Maximum Ozone on
          Temperature (MST)      Modeled Day (ppb)
79180          1500                    153
79193          1400                    146
79208          1400                    162
79218          1500                    166
79249          1500                    157
80170          1400                    117
80177          1400                    117
80191          1300                    100
80204          1500                    154
80207          1300                    121
80219          1300                    101

(Hourly sky cover in tenths, 0500-1700, is also tabulated for each day.
Insolation classes by sky cover: Strong 0-4, Moderate 5-7, Slight 7-8,
Neutral 9-10.)
Table 4
Meteorological Conditions III on High Ozone Days

          Existence of Upper-Level Inversion
          below 2100 Meters                       Maximum Mixing
Day       at 1200 GMT      at 0000 GMT            Depth* (m)
79180         no               no                     1900
79193         no               no                     1900
79208         no               no                     1450
79218         no               no                     2100
79249         yes              no                     2000
80170         no               no                     2000
80177         no               no                     4000
80191         no               no                     2250
80204         no               no                     2700
80207         no               no                     1750
80219         no               no                     2600

* As calculated by Tennekes' model
Table 5
Central Denver 5-Station Average Wind Speed (m/s)
(Department of Health Monitors)

Day      5-6  6-7  7-8  8-9  9-10 10-11 11-12 12-13 13-14 14-15 15-16 16-17  12-Hour Avg
79180    1.7  2.1  2.6  1.9  1.8   2.2   2.1   2.2   2.5   2.5   4.2   4.2      2.50
79193    2.1  1.8  2.2  1.4  1.1   1.4   1.7   1.9   2.1   2.9   5.2   5.2      2.42
79208    1.4  2.0  1.8  1.2  1.1   1.1   1.6   3.0   2.1   2.6   3.2   3.6      2.06
79218    2.2  2.0  1.8  0.9  0.9   1.0   1.4   1.5   1.8   2.5   3.0   3.6      1.88
79249    1.4  1.6  1.8  1.4  1.2   1.1   1.9   1.8   3.0   3.7   4.1   4.1      2.26
80170    1.7  1.0  1.2  1.0  1.4   1.1   1.0   1.0   1.3   3.1   3.4   4.2      1.78
80177    1.8  1.7  1.4  0.9  1.5   2.3   3.0   4.3   4.1   2.5   1.6   1.8      2.24
80191    1.7  1.9  1.4  1.7  1.1   1.7   1.7   2.0   2.1   2.0   2.6   2.6      1.88
80204    1.8  1.4  1.0  1.3  1.5   0.9   1.5   2.8   3.3   2.7   2.6   2.3      1.93
80207    1.1  0.8  1.1  1.9  1.4   1.2   1.1   1.7   1.9   3.9   4.5   3.5      2.01
80219    1.5  1.3  1.1  1.7  1.5   2.7   2.7   3.4   4.3   2.9   4.5   3.3      2.58
11-Day
Average  1.67 1.60 1.58 1.39 1.32  1.52  1.79  2.33  2.59  2.85  3.54  3.49     2.14
Table 6
Types of Wind Trajectories Occurring in the Six Hours
Prior to the Observed and/or Predicted Ozone Peaks

Straight Through:  79180†, 79249†
Zigzag, Curved:    79193, 80170, 80177, 80191, 80219, 79208, 80204*, 80207
Reversal:          79218†§

* Day with an approximately 3-hour "dead spot" in wind field at the location
  of the predicted peak
† Days with maximum at Highland
§ Day with the highest daily maximum ozone of the 1979 and 1980 summers
Table 7
Table 8
Performance Evaluation Measures for all Hourly
Ozone Concentrations, Paired by Hour and by Station

Measure                Arvada   CAMP   CARIH   Highland   Welby   All Sites
Mean observed, Co       59.1    44.1    63.6     61.8      57.2      57.1
S.D. observed, So       36.8    30.7    35.6     34.1      31.4      34.4
n                        131     132     127      126       130       646

(Additional rows give the bias d, the noise, and the 95% confidence
interval on d for each model at each station.)
Table 9
Bias and Noise for Each Hour
Averaged over Eleven Days and Five Sites

           DOT Model        EPA1 Model       EPA2 Model
Hour      Bias   Noise     Bias   Noise     Bias   Noise
  6        1.9    8.5       1.9    8.5       1.8    8.2
  7         .2   10.6        .2   10.4       -.2   10.6
  8        2.9   14.2       2.5   14.0       2.4   14.7
  9        5.1   16.9       3.9   16.8       3.1   18.6
 10       10.3   20.5       8.4   20.2       8.6   21.9
 11       17.2   27.0      14.4   25.5      14.5   26.1
 12       26.1   28.7      22.9   28.1      22.7   28.7
 13       21.6   24.7      19.6   23.2      21.9   24.1
 14       18.7   28.6      16.6   28.7      18.0   31.0
 15       17.8   30.1      16.2   29.7      18.5   31.0
 16       13.7   23.1      13.3   22.4      15.1   22.8
 17       13.3   22.0      13.5   21.6      14.6   21.4
Overall   12.4   23.6      11.1   22.8      11.8   23.9
Table 10
Table 11
EPA2 at Highland: Bias and Noise for Each Hour
for Full Eleven Days and Eight-Day Subset

           11-Day Sample        Eight-Day Subset**
Hour       Bias     Noise       Bias     Noise        Co
  6         8.2*     6.8         6.7*     6.7        22.0
  7        -9.4*     6.6       -11.1*     6.3        19.5
  8        -7.2*     6.3        -9.1*     5.6        29.6
  9        -3.6      9.2        -8.1*     4.5        37.7
 10          .2     13.7        -5.7     10.9        46.7
 11         6.2     17.5        -2.4      9.2        56.8
 12         9.3     24.2          .1     19.1        70.4
 13        20.5*    27.8         6.1     13.2        73.3
 14        25.7     41.0         3.0     13.3        64.3
 15        20.3     44.4        -3.6     17.3        59.8
 16        14.4     22.2         3.1      7.3        57.6
 17        11.5*    14.8         8.3     10.1        57.3
Overall     8.4     25.0        -0.8     12.4

*Bias is significantly different from zero at the 95% confidence level.
**The eight days analyzed separately at Highland are 79193, 79208, 80170,
80177, 80191, 80204, 80207, 80219.
Table 12
EPA2 at Arvada: Bias and Noise for Each Hour
for Full Eleven Days and Eight-Day Subset

           11-Day Sample        Eight-Day Subset**
Hour       Bias     Noise       Bias     Noise
  6        -2.8      9.2        -5.3      7.7
  7        -4.8     10.1        -6.4      8.4
  8        -4.5     15.6        -7.9     17.2
  9        -1.6     19.0        -7.1     17.9
 10         6.4     19.7          .4     16.5
 11        14.0     18.8        10.1     20.6
 12        25.4     26.8        18.0     24.8
 13        26.0     30.6        13.5     25.3
 14        27.0     30.4        16.8     26.4
 15        24.9     28.8        20.0     30.9
 16        17.3     24.0        17.9     26.0
 17        13.3     28.0        13.0     26.9
Overall    11.8     25.1         7.1     23.3

**The eight days analyzed separately at Arvada are 79208, 79218, 79249,
80170, 80177, 80191, 80207, 80219.
Table 13
Observed and Predicted Site Maxima
at CARIH Monitoring Station

             CARIH Site Maximum* (ppb)
Date       Observed    DOT    EPA1    EPA2
79-180       112        70      73      77
79-193        97        70      87      85
79-208       162        64      68      75
79-218       140        78     109     128
79-249       115        66      72      74
80-170       117        77      73      75
80-177       117        53      61      57
80-191       100        75      70      59
80-204       151        82      84      96
80-207       121        68      79      70
80-219       101        59      62      75

*Unpaired by hour
Table 14

Table 15
Table 16
Background Sensitivity: Results II
Using EPA1, Daily Maximum Predicted Ozone

         Normal       "Normal" Background   Low Background [20 ppb]    High Background [90 ppb]
         Background   Time of   Value       Time of   Value   %        Time of   Value   %
Day*     Input (ppb)  Max       (ppb)       Max       (ppb)   Change   Max       (ppb)   Change
79180        50         14       119          13        80    (-33%)     13       127    ( +7%)
79193        55         12       113          12       100    (-12%)     12       131    (+16%)
79218        50         14       129          14       115    (-11%)     14       153    (+19%)
79249        45         14       109          14        97    (-11%)     14       138    (+27%)
80170        55         15        93          13        70    (-25%)     17       138    (+48%)
80191        50         13        85          13        65    (-24%)     11       138    (+68%)
80219        50         13        82          11        66    (-20%)     15       120    (+46%)

Mean                            104.3                  84.7   (-19%)              135.0  (+29%)
S.D.                             17.9                  19.5                        10.5

*7 days picked at random from the full set of 11.
Table 17
Influence of Vertical Mixing in EPA1
for Days 79180 and 80204

                                              Base    Decreased Vertical   Percent
                                              Run     Mixing Run           Change
Daily Bias                         79180      16.9         17.5              3.6%
                                   80204      14.3         13.9             -2.8
Co(max) - Cp(max),                 79180       60           41             -31.7
  unpaired by site                 80204       70           38             -45.7
Cp(max) over monitoring sites      79180       93          112              20.4
                                   80204       84          116              38.1
Cp(max)                            79180      119          148              24.4
                                   80204      105          141              34.3
Table 18
Neutral Day Sensitivity
Hourly O3 Bias, All Sites, EPA1

                     79180                           80204
            Base Run       Neutral Day      Base Run       Neutral Day
            Simulation     Simulation       Simulation     Simulation
Hour        d (Co - Cp)    d (Co - Cp)      d (Co - Cp)    d (Co - Cp)
  6             5.8            5.8              16.4           16.4
  7             9.0            9.0              10.8           10.8
  8             7.8            7.8               7.0           11.0
  9             1.8            6.8               2.4            6.4
 10             5.0           11.8               1.2           -0.8
 11             9.8           10.4              18.6            7.8
 12            14.8           11.2              44.6           28.0
 13            25.2           22.0              34.6           28.8
 14            30.6           28.8              14.6           19.6
 15            39.2           38.0               6.6           10.8
 16            20.8           22.6               6.2           14.2
 17            32.6           35.2               9.0           14.0

Daily Bias     16.9           17.5              14.3           13.9
All-Station Maximum
Difference, Unpaired
in Site         60             41                70             38
Table 19

Table 20
(Bias statistics for the daily peak ozone predictions, referenced in the text.)
Table 21
Wilcoxon Paired Rank Tests
Comparing Daily Maximum Predictions by the Three Models
(Panels: (a) paired by site, the predicted maximum at the site of the
observed maximum; (b) unpaired by site, the predicted maximum over all
monitoring sites; (c) unconstrained in space, the predicted maximum over the
full modeled region. For each panel the table lists, for the maximum
concentrations (observed vs. DOT, EPA1, and EPA2) and for the residuals
(DOT vs. EPA1, DOT vs. EPA2, EPA1 vs. EPA2), the number of non-zero
differences, the rank sum T-, and the significance of T-.)
Table 22
Daily Maximum Predictions:
Regression Against Observed Maxima, Co = a + bCp

Pairing and Model        Slope b    Intercept a      r2
(a) Paired by Site
  DOT                      .54          98.          .041
  EPA1                    1.11          51.          .283
  EPA2                     .89          67.*         .291
(b) Unpaired by Site
  DOT                      .90          65.          .165
  EPA1                     .98*         53.          .362
  EPA2                     .78*         63.*         .460
(c) Unconstrained in Space
  DOT                      .40          98.          .045
  EPA1                     .70          62.          .303
  EPA2                     .66*         55.          .384

*Significantly different from zero at the 95% confidence level. (Note
that a "good" model should produce a slope which is significantly greater
than zero and an intercept which is not significantly different from zero.)
Table 23

Table 24
Table 25
Recommended Combinations of C0 and Cp
For evaluating accuracy of the peak prediction:
1) Co (s) with Cp (x) compares max. obs. and max. pred. for each
day, where the predicted max. is constrained to be
at the location of a monitoring site.
(1 set of statistics)
2) Co (s) with Cp (g) compares max. obs. and max. pred. for each
day, where the predicted max. is selected from any
grid point in the modeled region
(1 set of statistics)
For diagnosis of site-specific daily maximum problems:
3) Co (s,h) with Cp (s,x) compares max. obs. and max. pred. for
each day at a given site, unpaired by hour. (A set
of statistics for each site, plus another averaged
over all sites.)
For diagnosis of sources of error:
Co (s,t) with Cp (s,t) compares obs. and pred. concentrations
matched by site and time, with the data sorted in
the following ways:
1) By site: One set of statistics for each site, averaged over
all hours and all days.
2) By day and site: One set of statistics for each day/site
combination, plus one set for each day averaged over
all sites.
3) By hour and site: One set of statistics for each hour/site
combination, plus one set for each hour averaged
over all sites.
Table 26
Recommended Statistical Estimators and Displays:
Statistical Estimators:
Bias d, with confidence interval based on a one-sample (paired) t with
adjustments for autocorrelation. Bias comparisons based
on the Wilcoxon paired rank test should also be done if
the data sets are suspected not to be normally
distributed.
Noise S
FIGURE 1: Distribution of Hourly Ozone Concentrations on High-Ozone
Weekdays, May-September, 1975-1980. (Panels for Arvada, CAMP, CARIH, Welby,
and Highland (1978-80 only); frequency versus concentration (ppb).)
FIGURE 2: Denver Modeling Region Showing the 5 Monitoring Stations,
Arvada, CAMP, CARIH, Highland and Welby, and the Model Grid.
FIGURE 5: Daily Absolute Deviation and Bias in EPA2 versus Mean Observed
Concentration, Averaged Over Five Sites.
FIGURE 11: Contour Plot Showing Predicted Ozone Cloud Relative to
Highland: Day 79180; Observations Are Given in Parentheses.

FIGURE 12: Contour Plot Showing Predicted Ozone Cloud Relative to
Highland: Day 79218; Observations Are Given in Parentheses.

FIGURE 13: Contour Plot Showing Predicted Ozone Cloud Relative to
Highland: Day 79249; Observations Are Given in Parentheses.

FIGURE 14: Contour Plot Showing Predicted Ozone Cloud Relative to
Arvada: Day 79193; Observations Are Given in Parentheses.

FIGURE 15: Contour Plot Showing Predicted Ozone Cloud Relative to
Arvada: Day 80204; Observations Are Given in Parentheses.
FIGURE 32: Hourly Bias for Carbon Monoxide (CO) and Nonmethane
Hydrocarbons (NMHC) Averaged over the 11 Days.
FIGURE 33: Daily Mean Observed Concentrations Compared with Daily Bias
for Carbon Monoxide (CO) and Nonmethane Hydrocarbons (NMHC).
FIGURE 38: Observed and Predicted Concentrations for (a) a Day at CARIH
and (b) a Day at CAMP.
FIGURE 39: Predicted versus Observed Daily Maximum Concentrations under
Three Pairing Methods. (a) Paired by site: EPA2 43% bias, EPA1 44% bias,
DOT 49% bias. (b) Unpaired by site: EPA2 31% bias, EPA1 37% bias, DOT 42%
bias. (c) Unconstrained in space: EPA2 10% bias, EPA1 22% bias, DOT 30% bias.
FIGURE 40: Means of the Top Daily Maximum Ozone Concentrations for Summer
for 1975-1980 to Show the Trend Over Time. (Upper panel: means of the top 14
days; lower panel: means of the top 11 days excluding consecutive days; each
with the regressed average trend.)
FIGURE 41: Emissions Change Results versus Daily Bias for the Three
Models (all-site maximum, unpaired by site).
FIGURE 42: Emissions Change Results versus Peak Ozone, Observed and
Predicted (percent change, all-grid maximum).
FIGURE 43: Emissions Change Results versus Daily Maximum Residuals for the
EPA2 and DOT Models (percent change 1979-1976, all-grid maximum and all-site
maximum).
TECHNICAL REPORT DATA

Title and Subtitle: Evaluation of Performance Measures for an Urban
  Photochemical Model
Report Date: July 1983
Authors: Robin L. Dennis, Mary W. Downton and Robbi S. Keil
Performing Organization: National Center for Atmospheric Research,
  Environmental and Societal Impacts Group, P.O. Box 3000, Boulder,
  Colorado 80307
Program Element No.: A13A2A
Contract/Grant No.: AD-49-F-0-167-0
Sponsoring Agency: U.S. Environmental Protection Agency, Office of Air
  Quality Planning and Standards, Monitoring and Data Analysis Division
  (MD-14), Research Triangle Park, North Carolina 27711
Supplementary Notes: Henry S. Cole, Project Officer

Abstract: A workshop conducted by the American Meteorological Society for
EPA in September 1980 recommended a large set of statistical measures for
use in the evaluation of air quality models. The present study was designed
to test the recommended measures in an actual performance evaluation of an
airshed model on data developed for Denver, Colorado. Three versions of the
SAI Urban Airshed Model were examined. The study involved both an evaluation
of the models and an evaluation of the statistical performance measures.
The evaluation of the models had two parts: a base year case and an
emissions trend case, the latter representing the use of the models for
regulatory purposes. Resulting recommendations are intended to aid the
future use of such models, the planning of future performance evaluations
of airshed models, and the use of performance evaluation statistics.

Key Words: Air pollution; Atmospheric models; Photochemical reactions;
Smog; Ozone; Nitrogen oxides; Hydrocarbons; Urban Airshed Model; SAI
Airshed Model; Carbon-Bond Mechanism; Denver

Distribution Statement: Unlimited
Security Class (this page): Unclassified
EPA Form 2220-1 (Rev. 4-77)