Empirical Techniques for Analyzing Air Quality and Meteorological Data: Part II Feasibility Study of a Source-Oriented Empirical Air Quality Simulation Model


EPA-600/4-76-029b
June 1976
Environmental Monitoring Series
            EMPIRICAL  TECHNIQUES  FOR ANALYZING
          AIR  QUALITY AND METEOROLOGICAL  DATA
                      Part  II.  Feasibility Study  of a
                            Source-Oriented Empirical
                         Air Quality Simulation  Model
                                 Environmental Sciences Research Laboratory
                                      Office of Research and Development
                                     U.S. Environmental Protection Agency
                                Research Triangle Park, North Carolina 27711

-------
                RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency,  have been grouped into five series. These five  broad
categories were established to facilitate further development and application of
environmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The five series are:

     1.    Environmental Health Effects Research
     2.    Environmental Protection Technology
     3.    Ecological Research
     4.    Environmental Monitoring
     5.    Socioeconomic  Environmental Studies

This  report has been assigned to the ENVIRONMENTAL MONITORING series.
This  series describes research conducted to develop new or improved methods
and  instrumentation  for the identification and quantification of environmental
pollutants at the lowest conceivably significant concentrations. It also includes
studies to determine  the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical
Information Service, Springfield, Virginia 22161.

-------
                                               EPA-600/4-76-029b
                                               June 1976
 EMPIRICAL TECHNIQUES FOR ANALYZING AIR QUALITY
             AND METEOROLOGICAL DATA
Part II.  Feasibility Study of a Source-Oriented
     Empirical Air Quality Simulation Model
                      by


         W. S. Meisel and M. D. Teener
         Technology Service Corporation
             2811 Wilshire Boulevard
         Santa Monica, California 90403

                      and

               Kenneth L. Calder
            Contract No. 68-02-1704
                Project Officer

               Kenneth L. Calder
      Meteorology and Assessment Division
   Environmental Sciences Research Laboratory
  Research Triangle Park, North Carolina 27711
     U.S. ENVIRONMENTAL PROTECTION AGENCY
      OFFICE OF RESEARCH AND DEVELOPMENT
  ENVIRONMENTAL SCIENCES RESEARCH LABORATORY
 RESEARCH TRIANGLE PARK, NORTH CAROLINA 27711

-------
                          DISCLAIMER
     This report has been reviewed by the Environmental  Sciences
Research Laboratory, U.S. Environmental  Protection Agency,  and
approved for publication.  Approval  does not signify that the
contents necessarily reflect the views and policies of the
U.S. Environmental Protection Agency, nor does mention of trade
names or commercial products constitute endorsement or
recommendation for use.

-------
                               PREFACE

     This is the second of three reports of work performed under EPA
Contract No. 68-02-1704, examining the potential role of state-of-the-art
empirical techniques in analyzing air quality and meteorological data.
The reports are entitled as follows:
     I.  The Role of Empirical Methods in Air Quality and Meteorological
         Analyses
    II.  Feasibility Study of a Source-Oriented Empirical Air Quality
         Simulation Model
   III.  Short-Term Changes in Ground-Level Ozone Concentrations:
         An Empirical Analysis

-------
                                  ABSTRACT
     Meteorological dispersion functions in multiple-source  simulation
models for urban air quality are usually specified on the basis  of the
analysis of data from special field experiments, usually involving isolated
sources.  In the urban environment, individual sources cannot be isolated.
One may, however, ask for an empirical source-receptor relationship which,
when summed (or integrated) over all the sources, would minimize the average
squared error in prediction of measured values.  A methodology for empirically
testing alternative forms and extracting optimal parameters for source-receptor
dispersion functions in this manner is described.  Feasibility was
demonstrated on data for which the "true" source-receptor function was known;
the methodology recovered parameter values very close to true values.  This
approach can be used as a means for calibrating Gaussian-form models for
particular urban environments and in developing alternative source-receptor
functional forms.

-------
                         TABLE OF CONTENTS


PREFACE

ABSTRACT

1.  INTRODUCTION

2.  PROBLEM FORMULATION

    2.1  MATHEMATICAL FORMULATION
    2.2  THE TEST DATA

3.  OPTIMIZING PARAMETERS FOR THE GAUSSIAN FORM OF
      THE SOURCE-RECEPTOR FUNCTION

4.  MORE GENERAL SOURCE-RECEPTOR FUNCTIONS

5.  CONCLUSIONS

REFERENCES

APPENDIX:  The Feasibility of Formulation of a Source-
           Oriented Air Quality Simulation Model that
           Uses Atmospheric Dispersion Functions Em-
           pirically Derived from Joint Historical Data
           for Air Quality and Pollutant Emissions
           (Kenneth L. Calder)

-------
                          1.   INTRODUCTION

     This report examines the feasibility of extracting an empirical
source-receptor air pollutant dispersion function.  The ideas behind
the present report originated with the EPA project monitor,
Kenneth L. Calder; the motivation is best stated in his own words (as
quoted by Niels Busch in the proceedings of the fourth meeting of the
NATO/CCMS panel on air pollution modeling, from a letter written by
K. L. Calder in March 1973):
          It was felt that the topic  "The role of empirical/statistical
     modeling of air quality" might be an appropriate one for discussion
     at this time, since with one significant exception  (Barrie  Smith's
     paper on SO2 prediction  for London and Manchester that was  presented
     by Frank Pasquill at our last meeting) the  topic has been almost en-
     tirely neglected in our  discussions to date.  Perhaps  one reason for
     this state of affairs has been the historical  belief that air quality
     models based on statistical regression type of  analysis are not
     source-oriented and therefore are largely useless for  control strategy
     in terms of the contribution of  individual  sources  to  the degradation
     of air quality.  Also, of course, is the feeling that  insofar as the
     statistical models are empirically established  they will be specifi-
     cally restricted in application, e.g., as regards geographical loca-
     tion, meteorological  regimes, etc.  Although this may  be the  case  I
     am unaware that it has been clearly demonstrated that  these limita-
     tions are really inherent in all statistical-type of air quality

-------
    modeling.  The  question may be asked as to whether, with an appro-
    priate  analysis,  a  source-oriented  statistical-type of air quality
    model could  be  developed  which did  not  involve  prior  specification
    of meteorological dispersion  functions  per se and  incorporation  of
    these as  in  present air quality models.   My  thought here  is  that
    for  given "meteorological  conditions"  these  dispersion functions
    play the  role of transfer functions between  the air quality  distri-
    bution  and the distribution of pollutant  emissions, and  if one were
    smart enough might  therefore  conceivably  be  obtained  empirically by
    a mathematical  inversion  technique  (as, for  example,  by  numerical
     solution  of sets of integral  equations) utilizing  accumulated data
     on the  distributions of air quality and emissions.   If  this  could
     be accomplished then maybe a  major  shortcoming  of  the current sta-
     tistical  models could be removed  and we should  then  in  effect have
     an alternative to the customary  meteorological-dispersion type of
     modeling.
     These concepts are  the genesis of the ideas  in  this report.   They
are elaborated upon at considerable  length by  Mr. Calder in a  memorandum
written in support of the work on  this task which is included as  an Appendix.
     The difficulties in developing a  source-oriented empirical model can
be stated from a statistical  point of  view.  The  spatial distribution of
pollutant concentrations over  a region is determined by emissions  and
meteorological conditions.  The number of variables  determining the con-
centration at  a given point is tremendous,  particularly since  emissions

-------
arise from a large number of point sources and area sources.  Consequently
the number of emission variables alone can easily be in the hundreds.
If an empirical model were to be developed in the most obvious manner,
there should be an attempt to relate the pollutant concentration at a
given point to all the possible emission variables and meteorological
variables affecting the concentration at that point.  Since the determin-
ation of the relationship between emission/meteorological variables and
concentration requires examples of that relationship over a very wide
range of emission and meteorological variables, a tremendous amount of
data would be required to adequately determine this relationship.*  Further,
to obtain the spatial distribution by the direct approach, measurements
of the concentration at a large number of points might be necessary.
Hence, (because the amount of data required to specify the full variation
of the model in this formulation is unattainable) the most obvious approach
to developing empirical models is impractical.  It is typically difficult
to obtain one reliable emission inventory, much less a variety of such
inventories from widely different emission distributions in the same
geographical area.
     This difficulty can be overcome by converting an apparent disadvantage,
the diversity introduced by meteorological variation, into an advantage.
Suppose we have a point source and monitors as indicated in Figure 1.
     * Suppose there were only 20 point sources and no area sources.
Further suppose that we considered only two values of emission rate
for each source, one "high" and one "low."  There are 2^20 (over one
million) combinations of values these 20 point sources can take.  It
would be impossible to construct a practical experiment to sample even
a fraction of this diversity.

-------
[Figure 1 diagram: panels (a) and (b) each show one source and Monitors 1
and 2, with a different wind direction in each panel.]
Figure 1.   An  illustration of the effect of wind direction in
           introducing  diversity despite fixed monitoring sites

-------
If the wind direction never varied from the direction shown in Figure 1a
and all else was equal, monitor 1 would always measure the same concentration
and monitor 2 would measure negligible concentration.  However,
since the wind blows in other directions as in Figure 1b, the situation
could be reversed and monitor 2 could measure significant concentrations.
With enough examples of the source-receptor relationship, the variation
of the concentration with distance from the point source could be deter-
mined empirically.  Parameters of some plume models were, in fact, esti-
mated by taking measurements of the concentration from isolated point
sources.
     In the urban environment, individual sources cannot be isolated.
Measurements are the result of contributions from a number of sources.
However, because of the wide diversity of meteorological conditions, the
concentration will vary widely at a given point, and the sources which
contribute to the concentration at that point will similarly vary.  One
may then ask for a consistent source-receptor relationship which, when
summed  (or integrated) over all the sources, would explain best on the
average the observed concentrations.  More specifically, one could choose
the source-receptor function which minimized the average squared error in
prediction of the measured values.  The feasibility of this approach is
the subject of this report.  The methodology is discussed in more detail
in Section 2 and in the Appendix.
     The data used to test these ideas is model-created data.  Model data
was chosen for three major reasons:
     (1)  With model data, the source-receptor function is known and can
be compared with the function extracted from the data.  With measurement
data, "truth" is unknown.

-------
      (2)  Area sources and point sources can be isolated and studied
      separately as well as jointly.
      (3)  The cost of verifying and organizing measurement data would
      have been beyond the scope of the present study.
      The model used was the  RAM model  (a version of the Gaussian plume
formulation) [4].  It was developed by the  Environmental Protection Agency
and is discussed further in  Section 3.
      One important form of source-receptor  function is the Gaussian
form.  This form is studied  in some detail  to determine the difficulty
of extracting optimal parameter values from observed data and an emissions
inventory.  This approach can be viewed as  a method of calibrating a
Gaussian plume model to fit  the particular  urban environment to which it
is to be applied.  This appears to be  practical, although requiring diver-
sity  in the location of the  monitors relative to the sources and to pre-
vailing wind directions.  This topic is also discussed in Section 3.
     In Section 4,  the possibility of using more general  source-receptor
functions is discussed.  Functional forms considered as source-receptor
functions are polynomials, piecewise continuous polynomials, and fully
continuous piecewise quadratic functions.    Since the data was generated
using a Gaussian form, these other forms are not as efficient as the
Gaussian form for this data; however, comparisons among the non-Gaussian
forms can be made.
     The report concludes with a discussion of the implications of the
results and potential applications of the methodology.

-------
                         2.   PROBLEM FORMULATION

     In this section, we discuss the mathematical  formulation  of  the prob-
lem, the approach used in solving the problem,  and the  data  used  in testing
that approach.

2.1  MATHEMATICAL FORMULATION
     The Appendix goes into some detail  in formulating  the approach employed
in this report.  We work with a rectangular coordinate  system  with x-axis
along the mean horizontal wind direction,  with  y-axis crosswind,  and with
the z-axis vertical.  Then in urban air  quality models  it is customary  to
consider the pollutant emissions in terms  of a  limited  number  (say J) of
elevated point-sources together with horizontal  area-sources,  the latter
being possibly located at a few distinct heights ζ_s (say, for example, for
s = 1,2,3).  The total concentration χ(x,y,0) at ground level at the receptor
location (x,y,0) will be the sum of the concentration contribution from the
point-source distribution, say χ_P(x,y,0), and that from the area-source
distribution χ_A(x,y,0), i.e.,

    \chi(x,y,0) = \chi_P(x,y,0) + \chi_A(x,y,0)                              (1)

where

    \chi_P(x,y,0) = \sum_{\ell=1}^{J} Q_P(\ell)\, K(x-\xi_\ell,\, y-\eta_\ell;\, 0, \zeta_\ell)      (2)

    \chi_A(x,y,0) = \sum_{s=1}^{3} \iint_A Q_A(\xi,\eta,\zeta_s)\, K(x-\xi,\, y-\eta;\, 0, \zeta_s)\, d\xi\, d\eta      (3)
-------
and    Q_P(ℓ) = emission rate of the ℓ-th elevated point-source,
                located at position (ξ_ℓ, η_ℓ, ζ_ℓ).

       Q_A(ξ,η,ζ_s) = emission rate of the horizontal area-source distribution
                located at height ζ_s; A denotes the total integration
                domain of the area-source distributions.

       K(x-ξ, y-η; 0, ζ) = source-receptor function; it gives the ground-level
                concentration at the receptor location (x,y,0) resulting
                from a point-source of unit strength at (ξ,η,ζ).

Note that this formulation  includes the assumption of horizontal homogeneity,
namely, that the impact of  a  given source upon a given receptor depends only
upon their relative and not absolute coordinates.  This assumption is true
for an urban environment only in an average sense.  A single wind direction
is similarly valid only in  an average sense.  Finally, it should be noted
that the above formulation  assumes steady-state conditions and is thus only
applicable for relatively short  time-periods (of the order of one hour),
when this may be an adequate  approximation  providing the emissions and
meteorological  conditions are not  rapidly changing.
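
The superposition over point sources described above can be sketched in a few lines.  This is a minimal illustration, assuming coordinates already aligned with the wind (x downwind, y crosswind) and a caller-supplied kernel; the names `point_source_concentration` and `toy_kernel` and the tuple layout are hypothetical and do not come from the report.  The placeholder kernel is deliberately not the Gaussian form; it only shows that the kernel sees relative, not absolute, coordinates.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical record layout: (xi, eta, zeta, q) = source position (m),
# source height (m), and emission rate (g/s).
PointSource = Tuple[float, float, float, float]


def point_source_concentration(
    x: float,
    y: float,
    sources: Sequence[PointSource],
    kernel: Callable[[float, float, float], float],
) -> float:
    """Ground-level concentration at receptor (x, y) as a sum over sources.

    kernel(dx, dy, zeta) plays the role of K(x - xi, y - eta; 0, zeta).  It is
    given only the source-to-receptor offset, which is exactly the
    horizontal-homogeneity assumption.
    """
    return sum(q * kernel(x - xi, y - eta, zeta) for xi, eta, zeta, q in sources)


def toy_kernel(dx: float, dy: float, zeta: float) -> float:
    """Placeholder kernel (an assumption): zero upwind, decaying with distance."""
    if dx <= 0.0:
        return 0.0  # a source downwind of the receptor contributes nothing
    return 1.0 / (dx * dx + dy * dy + zeta * zeta)
```

Because only offsets enter the kernel, translating every source and the receptor by the same vector leaves the computed concentration unchanged.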
     In equations (2) and (3) above it is convenient to use "source-oriented"
position coordinates, and to consider a typical ground-level receptor location
as (x_i, y_i), i = 1,2,...
     Let

    x' = x_i - \xi,    dx' = -d\xi,    x'_{i\ell} = x_i - \xi_\ell
    y' = y_i - \eta,   dy' = -d\eta,   y'_{i\ell} = y_i - \eta_\ell                    (4)

-------
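Then

    \chi_P(x_i,y_i,0) = \sum_{\ell=1}^{J} Q_P(\ell)\, K(x'_{i\ell},\, y'_{i\ell};\, 0, \zeta_\ell)        (5)

    \chi_A(x_i,y_i,0) = \sum_{s=1}^{3} \iint_A Q_A(x_i-x',\, y_i-y',\, \zeta_s)\, K(x',y';\, 0, \zeta_s)\, dx'\, dy'    (6)

     In the following several different source-receptor functions K(x',y'; 0,ζ)
will be considered, including the classical Gaussian form that is the
basis for the RAM-model [4].  For the latter, and with the meteorological
condition of infinite mixing depth

    K(x',y';\, 0,\zeta) = \frac{1}{\pi U \sigma_y(x') \sigma_z(x')} \exp\left\{-\frac{y'^2}{2\sigma_y^2(x')}\right\} \exp\left\{-\frac{\zeta^2}{2\sigma_z^2(x')}\right\}    (7a)

where U denotes the mean wind speed, and we assume simple power-law dependencies
for the standard deviation functions, say

    \sigma_y(x') = a_y\, x'^{\,b_y}                                          (7b)
    \sigma_z(x') = a_z\, x'^{\,b_z}                                          (7c)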
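
A minimal sketch of the Gaussian source-receptor function with power-law standard deviations follows.  It assumes the standard ground-level form for an elevated source with total reflection at the ground and infinite mixing depth, K = exp(-y'^2/2σ_y^2) exp(-ζ^2/2σ_z^2) / (π U σ_y σ_z); the function names and argument order are illustrative, not taken from the RAM code, and distances are assumed to be in meters.

```python
import math


def sigma(a: float, b: float, x_down: float) -> float:
    """Power-law standard deviation, sigma = a * x'**b."""
    return a * x_down ** b


def gaussian_kernel(
    x_down: float,   # x': downwind distance from source to receptor
    y_cross: float,  # y': crosswind offset
    zeta: float,     # source height
    u: float,        # mean wind speed
    a_y: float,
    b_y: float,
    a_z: float,
    b_z: float,
) -> float:
    """Ground-level concentration per unit emission rate (assumed form)."""
    if x_down <= 0.0:
        return 0.0  # receptor is upwind of the source
    s_y = sigma(a_y, b_y, x_down)
    s_z = sigma(a_z, b_z, x_down)
    crosswind = math.exp(-y_cross ** 2 / (2.0 * s_y ** 2))
    vertical = math.exp(-zeta ** 2 / (2.0 * s_z ** 2))
    return crosswind * vertical / (math.pi * u * s_y * s_z)
```

The four numbers (a_y, b_y, a_z, b_z) are the only free parameters of this form once U is fixed.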
Also, as in the RAM-model  we will  assume  that  the narrow-plume hypothesis

(see Appendix) may be employed in  order to  reduce the double integral of

-------
equation (6)  to a one-dimensional  integral.   Thus  under  this  hypothesis

if

    \int_{-\infty}^{\infty} K(x',y';\, 0,\zeta)\, dy' = G(x',\zeta)                     (8)

then in place of equation (6) we have

    \chi_A(x_i,y_i,0) = \sum_{s=1}^{3} \int_{0}^{\infty} Q_A(x_i-x',\, y_i,\, \zeta_s)\, G(x',\zeta_s)\, dx'        (9)

which only involves values of the area-source emission rates  in the vertical

plane through the wind direction and the receptor location.


For the special case of a Gaussian plume

    G(x',\zeta) = \sqrt{\frac{2}{\pi}}\, \frac{1}{U \sigma_z(x')} \exp\left\{-\frac{\zeta^2}{2\sigma_z^2(x')}\right\}        (10)

It is seen that in this case the total area-source contribution χ_A(x_i,y_i,0)
only involves the meteorological parameters U and σ_z(x'), i.e., is independent
of σ_y(x').  In evaluating equation (9) numerically, the source intensity
function Q_A is, in practice, piecewise constant.
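
The narrow-plume reduction can be checked numerically with a short sketch.  It assumes the Gaussian form with power-law standard deviations, for which the crosswind integral of K has the closed form G = sqrt(2/π) exp(-ζ^2/2σ_z^2) / (U σ_z), and it evaluates the upwind line integral with a midpoint rule over cells of constant emission rate.  All function names, the cell layout, and the quadrature choice are assumptions made for illustration.

```python
import math


def narrow_plume_kernel(x_down, zeta, u, a_z, b_z):
    """G(x', zeta): crosswind-integrated Gaussian kernel; no sigma_y appears."""
    if x_down <= 0.0:
        return 0.0
    s_z = a_z * x_down ** b_z
    return math.sqrt(2.0 / math.pi) * math.exp(-zeta ** 2 / (2.0 * s_z ** 2)) / (u * s_z)


def crosswind_integral(x_down, zeta, u, a_y, b_y, a_z, b_z, n=2000):
    """Midpoint-rule integral of the Gaussian K over y' in [-8 sigma_y, 8 sigma_y]."""
    s_y = a_y * x_down ** b_y
    s_z = a_z * x_down ** b_z
    h = 16.0 * s_y / n
    total = 0.0
    for k in range(n):
        y = -8.0 * s_y + (k + 0.5) * h
        total += math.exp(-y * y / (2.0 * s_y * s_y))
    vertical = math.exp(-zeta ** 2 / (2.0 * s_z ** 2))
    return total * h * vertical / (math.pi * u * s_y * s_z)


def area_source_concentration(cells, zeta, u, a_z, b_z, n_sub=200):
    """Upwind line integral of Q_A * G for piecewise-constant emissions.

    cells: list of (x_near, x_far, q), where q is the constant emission rate
    per unit area over upwind distances x_near..x_far from the receptor.
    """
    total = 0.0
    for x_near, x_far, q in cells:
        h = (x_far - x_near) / n_sub
        total += q * h * sum(
            narrow_plume_kernel(x_near + (k + 0.5) * h, zeta, u, a_z, b_z)
            for k in range(n_sub)
        )
    return total
```

Comparing `crosswind_integral` with `narrow_plume_kernel` at the same downwind distance shows the agreement that justifies replacing the double integral by a single one.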

     The basic equations (5) and (6) (or (5) and (9)), with the Gaussian
forms for K(x',y'; 0,ζ) and G(x',ζ), involve four unspecified parameters

-------
through the equations (7b) and (7c), namely, a_y, b_y, a_z and b_z.*  More


generally, any functional form chosen for K (and therefore G) may have


unspecified parameters; we will denote the set of unspecified parameters


by the vector α.  Thus for the special Gaussian form

    \alpha = (a_y,\, b_y,\, a_z,\, b_z)                                      (11)

The explicit dependence of the calculated concentration values on these
parameters could be indicated by the notation χ(x_i,y_i,0; α).


     The basic method employed in this study is that of choosing α to
minimize the error between calculated and observed values of concentrations.**


In order to express this statement formally, we must elaborate our notation


to indicate explicitly the dependence on wind direction; thus χ(x_i,y_i,0; θ; α).
The dependence on θ, in fact, involves a rotation of coordinate axes, as
shown in the Appendix.  For each wind direction θ_j (j = 1,2,...,R) there is a
concentration observation for each receptor location (monitoring station).
The receptor locations are denoted (x_i,y_i) for i = 1,2,...,N, and are assumed
to be at ground level so that we may omit the symbol 0 in the χ-notation.


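Then, writing χ_obs(x_i,y_i; θ_j) for the observed concentration, the mean
square error over all observations is

    e^2(\alpha) = \frac{1}{RN} \sum_{i=1}^{N} \sum_{j=1}^{R} \left[ \chi_{\mathrm{obs}}(x_i,y_i;\theta_j) - \chi_P(x_i,y_i;\theta_j;\alpha) - \chi_A(x_i,y_i;\theta_j;\alpha) \right]^2        (12)

     * We also examine later the possibility of considering U a free parameter.
For our data the wind speed was taken as a constant 5 m/sec.

     ** "Observed" in the present case is model-created test data; the technique
is, of course, intended for practical use on measured data.

-------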





where χ_P and χ_A are given by equations (5) and (6) (or (5) and (9)).
     The problem of minimizing e² with respect to α is a standard optimization
problem.  Chambers provides a good recent survey of available techniques [2].
The particular technique we employed was "structured random search" [3]; this
is a rather inefficient technique, but one which does not require calculation
of derivatives and which converges under difficult conditions (given enough
time).  This technique's main advantage was that we could modify the form of
the source-receptor function without modifying the search technique.  The
results of applying this methodology to the test data are discussed in
Sections 3 and 4; however, we first turn to a description of the test data.
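
The appeal of a derivative-free search can be seen in a minimal sketch.  The routine below is a plain accept-if-better random search with a shrinking step; it is not the "structured random search" of reference [3], whose details are not given here, and the name `random_search` and all of its parameters are assumptions.  The objective is passed in as a function, so changing the form of the source-receptor function requires no change to the search itself.

```python
import random
from typing import Callable, List, Sequence, Tuple


def random_search(
    objective: Callable[[Sequence[float]], float],
    start: Sequence[float],
    n_iter: int = 100,
    rel_step: float = 0.2,
    shrink: float = 0.98,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimize `objective` by random multiplicative perturbations.

    A trial point is kept only if it lowers the objective, so the returned
    error is never worse than the starting error.  No derivatives are used.
    Multiplicative steps keep the sign of each parameter, which suits the
    positive parameters considered here.
    """
    rng = random.Random(seed)
    best = list(start)
    best_err = objective(best)
    step = rel_step
    for _ in range(n_iter):
        trial = [p * (1.0 + step * rng.gauss(0.0, 1.0)) for p in best]
        err = objective(trial)
        if err < best_err:
            best, best_err = trial, err
        else:
            step *= shrink  # narrow the search slowly after a failed trial
    return best, best_err
```

In use, `objective` would be the mean square error as a function of the parameter vector, and `start` the initial guess.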



2.2  THE TEST DATA


     For a realistic distribution of point-sources, area-sources and
receptor locations, use was made of unpublished information from a 1968
air pollution study conducted in St. Louis, Missouri.*  Data on the
elevated point-sources and ground-level receptor locations are presented

     * Unpublished manuscript of National Air Pollution Control Administration,
"St. Louis SO2 Dispersion Model Study," by D. B. Turner and N. G. Edmisten,
November 1968.

-------
in Figure 2.  The point sources range in strength from 9 g/sec to 2681 g/sec,
while the elevations range from 39 m to 495 m above ground level.  Data for
the area-sources are represented in Figure 3, which indicates the strengths
and the heights of the sources.  The area-sources were categorized as
either 20, 30 or 50 meters high.  The corresponding concentration data were
generated by the EPA-developed RAM algorithm [4], which is a specific imple-
mentation of the classical Gaussian plume formulation, that considers both
point- and area-sources, with three possible heights for the latter, and
which uses the "narrow-plume" hypothesis (i.e., equation (9)) to calculate
the area-source concentration contribution χ_A.  A constant wind speed U of
5 meters per second was employed, and sixteen wind directions at the points
of the compass were simulated.  Infinite mixing depth and a neutral atmo-
spheric stability category were assumed.  For the latter, in equations (7b)
and (7c), we have
                       a_y = 0.072   ,   b_y = 0.90
                       a_z = 0.038   ,   b_z = 0.76
                                                                      (13)
For this data, these values and the indicated equations are optimal and
would produce zero mean-square error.  It is this result we hope to
recover from the data by the optimization procedure.
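
The "zero mean-square error at the true parameters" property can be illustrated with a small twin experiment.  This is a sketch under stated assumptions: point sources only, a hypothetical layout of two sources and three receptors (not the St. Louis inventory), the standard ground-level Gaussian form, and wind direction θ taken as the direction toward which the wind blows, measured counterclockwise from the +x axis.  All names are illustrative.

```python
import math

TRUE = (0.072, 0.90, 0.038, 0.76)  # (a_y, b_y, a_z, b_z) used to create the data
U = 5.0                            # constant wind speed, m/sec

# Hypothetical layout: sources as (xi, eta, zeta, q), receptors as (x, y).
SOURCES = [(0.0, 0.0, 50.0, 100.0), (3000.0, 1000.0, 100.0, 500.0)]
RECEPTORS = [(2000.0, 0.0), (-1500.0, 500.0), (1000.0, 3000.0)]
THETAS = [2.0 * math.pi * j / 16 for j in range(16)]  # sixteen wind directions


def kernel(x_down, y_cross, zeta, params):
    """Ground-level Gaussian kernel with power-law sigmas (assumed form)."""
    a_y, b_y, a_z, b_z = params
    if x_down <= 0.0:
        return 0.0
    s_y = a_y * x_down ** b_y
    s_z = a_z * x_down ** b_z
    return (math.exp(-y_cross ** 2 / (2.0 * s_y ** 2))
            * math.exp(-zeta ** 2 / (2.0 * s_z ** 2))
            / (math.pi * U * s_y * s_z))


def predict(x, y, theta, params):
    """Sum of point-source contributions after rotating into wind-aligned axes."""
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for xi, eta, zeta, q in SOURCES:
        dx, dy = x - xi, y - eta
        x_down = dx * c + dy * s      # offset along the wind
        y_cross = -dx * s + dy * c    # offset across the wind
        total += q * kernel(x_down, y_cross, zeta, params)
    return total


def mean_square_error(observed, params):
    """Average squared error over all receptors and wind directions."""
    errs = [
        (observed[i][j] - predict(x, y, theta, params)) ** 2
        for i, (x, y) in enumerate(RECEPTORS)
        for j, theta in enumerate(THETAS)
    ]
    return sum(errs) / len(errs)


# "Observed" values are created by the model itself, as in the report.
observed = [[predict(x, y, theta, TRUE) for theta in THETAS] for x, y in RECEPTORS]
```

Evaluating `mean_square_error(observed, TRUE)` gives exactly zero because the same computation produced the data, while any other parameter vector, such as the initial guess (0.1, 1.0, 0.1, 1.0), gives a positive error.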

-------
[Figure 2 plot not reproduced.  Legend: receptor; single point source;
more than one point source.]
Figure 2.  Location of point sources and receptors.  (There are 40
           receptor locations and 62 point sources.  Close point sources,
           such as multiple stacks, are indicated by single points.)

-------
[Figure 3 grid of area-source squares not reproduced.]
Figure 3.  Area Sources:  Upper number in each square is emission rate
           in g/sec-m^2 (x 10^[exponent illegible]).  Lower number is
           height of area source in meters.

-------
          3.  OPTIMIZING PARAMETERS FOR THE GAUSSIAN FORM
                   OF THE SOURCE-RECEPTOR FUNCTION
     The data base described in Section 2.2 contains concentration values
at forty receptors for sixteen wind directions, a total  of 640 values
(referred to as "actual" values).  The contribution to the concentration
from point and area sources was available separately, as well  as in toto.
     Equations (5) and (7a) provide a prediction of the point-source
pollutant concentration at any given receptor location
once the four parameters are specified.  A comparison of values predicted
by these equations versus actual values allows calculation of the root-
mean-square value of the error with a given choice of parameter values.
(See equation (12), with area sources at zero.)
     With initial guesses of a_y = a_z = 0.1 and b_y = b_z = 1.0, the search
routine described arrived at values of

                    a_y = 0.074, b_y = 0.92, a_z = 0.039, b_z = 0.77

when the "true" values (those used to create the data) were

                    a_y = 0.072, b_y = 0.90, a_z = 0.038, b_z = 0.76.
     The root-mean-square (RMS) error initially was 157 µg/m³ and the
maximum error over the 640 values was 1205 µg/m³; the parameter values after
100 iterations yielded an RMS error of 14 µg/m³ and a maximum error of
175 µg/m³.  (Table 1 summarizes these results.)  To place the size of the
final error in perspective, we note that the actual values (due to point
sources alone) are as high as 1545 µg/m³.
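
The fitting procedure can be illustrated with a minimal sketch.  Everything in it other than the quoted parameter values is an assumption: the ground-level Gaussian plume expression with sigma_y = a_y x^b_y and sigma_z = a_z x^b_z stands in for equations (5) and (7a), the receptor and source geometry is random rather than the report's data base, and a simple multiplicative pattern search stands in for the report's search routine.

```python
import numpy as np

TRUE = np.array([0.072, 0.90, 0.038, 0.76])   # a_y, b_y, a_z, b_z quoted in the text
GUESS = np.array([0.1, 1.0, 0.1, 1.0])        # initial guess quoted in the text
U = 5.0                                       # wind speed, m/sec

# Hypothetical geometry: 40 receptors, 10 point sources (not the report's data).
rng = np.random.default_rng(0)
x = rng.uniform(500.0, 10_000.0, (40, 10))    # downwind distance, m
y = rng.uniform(-300.0, 300.0, (40, 10))      # crosswind distance, m
h = rng.uniform(20.0, 100.0, (1, 10))         # source height, m
q = rng.uniform(50.0, 500.0, (1, 10))         # emission rate, g/sec
geom = (x, y, h, q)


def predict(params, geom):
    """Sum of ground-level plume contributions at each receptor (assumed form)."""
    a_y, b_y, a_z, b_z = params
    x, y, h, q = geom
    sig_y = a_y * x ** b_y
    sig_z = a_z * x ** b_z
    chi = (q / (np.pi * U * sig_y * sig_z)
           * np.exp(-0.5 * (y / sig_y) ** 2)
           * np.exp(-0.5 * (h / sig_z) ** 2))
    return 1e6 * chi.sum(axis=1)              # g/m^3 -> ug/m^3


def errors(params, geom, actual):
    """Return (RMS error, maximum absolute error) over all receptors."""
    resid = predict(params, geom) - actual
    return float(np.sqrt(np.mean(resid ** 2))), float(np.max(np.abs(resid)))


def pattern_search(start, geom, actual, iterations=100, step=0.2):
    """Scale one parameter at a time; keep a trial only if the RMS error drops."""
    best = np.array(start, dtype=float)
    best_rms = errors(best, geom, actual)[0]
    for _ in range(iterations):
        improved = False
        for i in range(best.size):
            for factor in (1.0 + step, 1.0 / (1.0 + step)):
                trial = best.copy()
                trial[i] *= factor
                rms = errors(trial, geom, actual)[0]
                if rms < best_rms:
                    best, best_rms, improved = trial, rms, True
        if not improved:
            step *= 0.5                       # refine the step when stuck
    return best, best_rms


actual = predict(TRUE, geom)                  # synthetic "actual" values
rms_true, max_true = errors(TRUE, geom, actual)
rms_start, max_start = errors(GUESS, geom, actual)
fitted, rms_fit = pattern_search(GUESS, geom, actual)
print("start RMS:", rms_start, "fitted RMS:", rms_fit, "fitted params:", fitted)
```

By construction the search never accepts a step that increases the RMS error, so the fitted error cannot exceed the starting error; how closely the fitted parameters approach the generating values depends on the geometry and the step schedule.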

-------
Table 1.   Point sources only; parameter values at initial, mid, and final
           iteration during search.  (Windspeed is fixed at 5.0 m/sec.)

Iteration          a_y       b_y       a_z       b_z      RMS Error    Max. Error
                                                           (µg/m³)      (µg/m³)

  0 (initial)     0.100      1.00     0.100      1.00        157          1205
 50 (mid)         0.049      0.85     0.050      0.71         85           711
100 (final)       0.074      0.92     0.039      0.77         14           175

ACTUAL VALUES:   (0.072)    (0.90)   (0.038)    (0.76)        (0)           (0)

-------
     Employing equation (9) for area sources and using only the area-
source contribution in the "actual" data, we get similarly promising
results (Table 2).  Actual values of concentrations due to area sources
reach maximums of over 800 µg/m³.
     The results of treating point and area sources simultaneously,
representative of the case which would be encountered with measurement
data, are listed in Table 3; the algorithm once again closely approaches
the optimum values in 100 iterations.  Actual values of the total con-
centrations from both point and area sources go above 1600 µg/m³.
     While the initial parameter values we chose in these cases con-
verged toward the values used in creating the data, experimentation
indicated that this was not always the case.  Small RMS errors could
be achieved with combinations of parameters significantly different in
value from those used in creating the data.  As indicated in Figure 4,
rather different combinations of a and b yield very similar values of
a x^b over the range of x in which we are interested.  An essentially
equivalent combination of values should not be deemed erroneous, since
it yields an accurate empirical model.  We regard this as a
characteristic of the formulation chosen for calculating σ, not as a
difficulty of the proposed methodology.  Further, in practice,
initial values for the parameters would be chosen from the literature,
and the solution obtained would be a set of values similar to the initial
values, but which minimized the prediction error.  This aspect of imple-
mentation also suggests that a good initial guess would be employed and,
thus, that convergence to an "optimum" solution would be rapid.
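
The near-equivalence of different (a, b) pairs can be checked numerically.  The sketch below is illustrative only: it assumes x is measured in meters over the 0.25 to 10 km range, and the second exponent (0.95) is an arbitrary choice, not a value from the report.

```python
import numpy as np

x = np.geomspace(250.0, 10_000.0, 50)   # assumed range of interest, in meters
a1, b1 = 0.072, 0.90                     # a_y, b_y used to create the test data
b2 = 0.95                                # deliberately different exponent (illustrative)
x0 = np.sqrt(x[0] * x[-1])               # match the two curves at mid-range (log scale)
a2 = a1 * x0 ** (b1 - b2)                # coefficient that makes a2*x0^b2 == a1*x0^b1

ratio = (a2 * x ** b2) / (a1 * x ** b1)  # equals (x / x0) ** (b2 - b1)
change_in_a = a2 / a1 - 1.0
worst_change_in_curve = np.max(np.abs(ratio - 1.0))
print(f"a changes by {100 * change_in_a:.0f}%, "
      f"a*x^b by at most {100 * worst_change_in_curve:.1f}%")
```

Because the ratio of the two curves is (x/x0)^(b2 - b1), a small change in b can be offset almost entirely by a large change in a over a limited range of x, which is why the error criterion alone cannot distinguish such pairs.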

-------
 Figure 4.  Plot of a x^b for several values of a and b.
            (The variable x, in km, is plotted on a log scale
            from 0.25 to 10.)

-------
Table 2.    Area sources only; parameter values at initial, mid,
            and final iteration during search.  (Windspeed is
            fixed at 5.0 m/sec.  Values a_y and b_y do not affect
            area source values.)

Iteration          a_z       b_z      RMS Error    Max. Error
                                       (µg/m³)      (µg/m³)

  0 (initial)     0.100      1.00        157          1205
 50 (mid)         0.028      0.89         15            69
100 (final)       0.037      0.79          6            24

ACTUAL VALUES:   (0.038)    (0.76)        (0)           (0)

-------
Table 3.   Point and area sources together; parameter values at initial,
           mid and final iteration during search.  (Windspeed is fixed at
           5.0 m/sec.)

                                                          Both Point and Area Sources
Iteration          a_y       b_y       a_z       b_z      RMS Error    Max. Error
                                                           (µg/m³)      (µg/m³)

  0 (initial)     0.100      1.00     0.100      1.00        157          1205
 50 (mid)         0.055      0.79     0.044      0.67         79           583
100 (final)       0.074      0.89     0.036      0.74         24           194

ACTUAL VALUES:   (0.072)    (0.90)   (0.038)    (0.76)        (0)           (0)

-------
     One means of examining the tendency to converge to the "optimum"
solution is through a sensitivity analysis about the optimum.  If a
small change in parameter values causes a sufficiently large change in
the error criterion, then the search algorithm will tend to converge
relatively rapidly.  Table 4 indicates the results of such an analysis,
where the parameters are changed one at a time from the optimum values.
Since both the RMS and maximum error are zero for the optimum values in
our test case, the errors indicated are also the changes in the given
error measures (as well as the absolute errors for the perturbed para-
meter values).  The results suggest that a change in any parameter will
cause a significant error, so that the search algorithm should converge
rapidly.
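
A one-at-a-time sensitivity analysis of this kind can be sketched as follows.  The optimum and perturbed values are those listed in Table 4; the plume expression and the random geometry are assumptions made for illustration, so the resulting numbers are not expected to match Table 4.

```python
import numpy as np

OPTIMUM = {"a_y": 0.072, "b_y": 0.90, "a_z": 0.038, "b_z": 0.76, "U": 5.0}
PERTURBED = {"a_y": 0.079, "b_y": 0.99, "a_z": 0.042, "b_z": 0.836, "U": 5.2}

# Hypothetical geometry: 640 receptor/wind-direction samples, 5 point sources.
rng = np.random.default_rng(1)
X = rng.uniform(500.0, 10_000.0, (640, 5))    # downwind distance, m
Y = rng.uniform(-300.0, 300.0, (640, 5))      # crosswind distance, m
H = rng.uniform(20.0, 100.0, 5)               # source height, m
Q = rng.uniform(50.0, 500.0, 5)               # emission rate, g/sec


def model(p):
    """Assumed ground-level Gaussian plume sum, in ug/m^3."""
    sig_y = p["a_y"] * X ** p["b_y"]
    sig_z = p["a_z"] * X ** p["b_z"]
    chi = (Q / (np.pi * p["U"] * sig_y * sig_z)
           * np.exp(-0.5 * (Y / sig_y) ** 2 - 0.5 * (H / sig_z) ** 2))
    return 1e6 * chi.sum(axis=1)


base = model(OPTIMUM)                         # zero-error reference predictions
sensitivity = {}
for name in OPTIMUM:
    trial = dict(OPTIMUM)
    trial[name] = PERTURBED[name]             # change one parameter only
    resid = model(trial) - base
    sensitivity[name] = (float(np.sqrt(np.mean(resid ** 2))),
                         float(np.max(np.abs(resid))))

for name, (rms, worst) in sensitivity.items():
    print(f"{name}: RMS {rms:.1f}, max {worst:.1f}")
```

In this assumed form the concentration is inversely proportional to U, so the wind-speed row is simply a uniform rescaling of the reference predictions by 5.0/5.2.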
     We note that the wind speed U is included in the table as a para-
meter.  One can in fact regard U as a statistical parameter to be
estimated from measurements.  Hence, for a period over which the wind
speed can be assumed constant, the error between predicted and actual
concentration values can be minimized with respect to U.  In our data,
the wind speed was constant for all wind directions, so that the error
over all receptors and all wind directions could be used to determine U.
Full exploration of the practicality of extending the methodology of
this report to estimating U requires further study.
     Forty receptors (i.e., air quality monitoring stations) are more than
are available in many monitoring systems.  How many stations are required
for this methodology to be effective?  The answer to this question is
heavily dependent on the number and distribution of sources, but
experiments with our test data suggest that a considerably
smaller number of stations may suffice.  Table 5 indicates errors due
to changes in parameter values, one at a time, from the optimum values,
for a selection of the individual stations.  The errors are sufficiently
large that one would expect that optimum parameter values could be
extracted from a small number of stations at well-chosen locations.

-------
Table 4.  Sensitivity analysis about optimum values.  Error is over 40
          receptors at 16 wind angles (640 samples).  Values are changed
          about 10%, except for wind speed U, where the change is 4%.

               Value Change       Point Sources Only   Area Sources Only      All Sources
                                       (µg/m³)              (µg/m³)             (µg/m³)
Parameter    From         To
           (Optimum)                 RMS      Max.       RMS      Max.       RMS      Max.

   a_y       0.072      0.079         16      175         0*       0*         16      175
   b_y       0.90       0.99          24      234         0*       0*         24      234
   a_z       0.038      0.042         20      250        19       55          27      151
   b_z       0.76       0.836         21      200        26       91          35      230
   U         5.0        5.2           11      175        11       36          18      177

* Area source prediction not affected by this change.

-------
Table 5.  Sensitivity analysis.  Root-mean-square and maximum error due to change in
          each parameter from nominal values at selected receptors.  Concentrations are
          from both point and area sources.

              Change               Error (in µg/m³) at Selected Receptors, RMS / MAX             Error for
Parameter   From     To        1        10        13        21        22        24        35     All 40
                                                                                                 Receptors

   a_y      0.072   0.079     9/32    67/260    17/46     10/32     11/33     14/52     44/175    16/175
   b_y      0.90    0.99     15/59    23/63     22/69     20/62     20/66     10/31     46/175    24/234
   a_z      0.038   0.042    24/49    32/102    24/45     27/51     16/52     33/109    47/177    27/151
   b_z      0.76    0.836    32/66    44/111    29/62     37/80     23/69     27/65     52/178    35/230
   U        5.0     5.2      15/25    18/30     19/44     18/33     16/39     14/37     45/177    18/177

-------
             4.   MORE GENERAL SOURCE-RECEPTOR FUNCTIONS

     Since the concentration test data used in this report were generated
by a Gaussian-form source-receptor function, no other form can do better
in predicting this particular data.  One can, however, examine more general
forms as an initial assessment of difficulty, and to obtain an indication
of the types of functions which might prove useful for application to real
air quality data.  In the analysis of this section, only point-source data
was used; i.e., the area-sources were disregarded.
     We repeat Eq. (5) here for convenience:
                                        £=1
Note that the source-receptor function K is a function of three variables.
If this function of three variables is parameterized, with a vector of pa-
rameters α, then χ_P depends on the values of these parameters and the mean-
square error [equation (12) with χ_A = 0] depends on α.  Thus, the procedure
we employed in finding the optimal parameters of the Gaussian form is di-
rectly applicable to any parameterized form.  The difficulty and computa-
tional cost, however, increase with the number of parameters.
     Multivariate polynomials are an obvious choice to explore.  We tried
a function of the form of the exponential of a general second-order poly-
nomial in three variables:

$$
K = \exp\bigl[\alpha_1 x' + \alpha_2 y' + \alpha_3 \zeta + \alpha_4 x' y'
      + \alpha_5 x' \zeta + \alpha_6 y' \zeta + \alpha_7 (x')^2 + \alpha_8 (y')^2
      + \alpha_9 \zeta^2 + \alpha_{10}\bigr]
$$

with ten parameters.  The exponential ensures that the function K will
not take negative values.  The resulting RMS error after 20 iterations
was 152 µg/m³; the maximum error was 3130 µg/m³.
     The exponential of a general third-order polynomial in three variables
(with 20 parameters) was similarly tested.  After 20 iterations, the RMS
error was 126 µg/m³ and the maximum error was 2570 µg/m³.
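
The parameter counts quoted above (ten for second order, twenty for third order) follow from enumerating the monomials in three variables.  The sketch below does this and evaluates the exponential-of-polynomial form; the function names, the ordering of terms, the random coefficients, and the scaling of the inputs to order one are all illustrative assumptions.

```python
from itertools import product

import numpy as np


def exponent_triples(order):
    """All (i, j, k) with i + j + k <= order: one monomial x'^i y'^j zeta^k each."""
    return [e for e in product(range(order + 1), repeat=3) if sum(e) <= order]


def exp_poly_kernel(alpha, xp, yp, zeta, order):
    """K = exp(polynomial of the given order in x', y', zeta); always positive."""
    triples = exponent_triples(order)
    if len(alpha) != len(triples):
        raise ValueError("need one coefficient per monomial")
    poly = sum(a * xp ** i * yp ** j * zeta ** k
               for a, (i, j, k) in zip(alpha, triples))
    return np.exp(poly)


n_second = len(exponent_triples(2))      # number of second-order parameters
n_third = len(exponent_triples(3))       # number of third-order parameters

rng = np.random.default_rng(0)
alpha = rng.normal(scale=0.1, size=n_second)      # arbitrary coefficients
xp = np.linspace(0.0, 1.0, 5)                     # inputs scaled to order one
yp = np.linspace(-1.0, 1.0, 5)
zeta = np.full(5, 0.5)
k_values = exp_poly_kernel(alpha, xp, yp, zeta, order=2)
print(n_second, n_third, k_values)
```

Fitting then proceeds exactly as for the Gaussian form, with the coefficient vector in place of the four Gaussian parameters.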
     The exponential of a continuous piecewise linear form with 28 free
parameters was tested.  This form is discussed in some detail elsewhere [3].
The RMS and maximum errors after 20 iterations were 140 and 2808 µg/m³,
respectively.  These errors are smaller than those of the second-order
polynomial, but larger than those of the third-order polynomial.  Since the
"curvature" of the Gaussian form from which the data were created is better
suited to polynomial than piecewise-linear approximation, this is not an
unexpected result for the test data; however, this comparison may not yield
the same result on real data.
     Examination of the data suggested a difficulty in fitting a continuous
functional form.  Most of the data points corresponded to moderate to low
SO2 concentration levels; there were a relatively small number of high con-
centration values.  The continuous forms used tended to make large errors
at the high values in attempting to minimize the errors at the moderate/low
values (which predominated).  As an experiment, we divided the data into two
sets corresponding roughly to high-concentration data and medium/low-con-
centration data.  Explicitly, data for which (x', y', ζ) is such that

$$ g(x', y', \zeta) < 0, $$

where

$$ g(x', y', \zeta) = (x' - 2000)^2 + 100\,(y')^2 + 400\,\zeta^2 - 8 \times 10^6, \qquad (15) $$

are considered separately from data where g(x', y', ζ) ≥ 0.  The first set are
values where the source-receptor function tends to peak; the second set,
where the source-receptor function takes smaller values.  Equation (15) was
developed by "eyeball" examination of the data.
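
The region split defined by equation (15) amounts to a simple mask.  In the sketch below the function and variable names are illustrative, the sample points are made up, and the arguments are assumed to be in meters.

```python
import numpy as np


def g(xp, yp, zeta):
    """Equation (15): negative inside the ellipsoid where the function peaks."""
    return (xp - 2000.0) ** 2 + 100.0 * yp ** 2 + 400.0 * zeta ** 2 - 8.0e6


def split_regions(xp, yp, zeta):
    """Return boolean masks (peak region, remaining region)."""
    peak = g(xp, yp, zeta) < 0
    return peak, ~peak


# Made-up source-receptor geometries: upwind distance, crosswind distance, height.
xp = np.array([2000.0, 1500.0, 10000.0, 2000.0])
yp = np.array([0.0, 100.0, 0.0, 400.0])
zeta = np.array([50.0, 50.0, 50.0, 50.0])

peak, tail = split_regions(xp, yp, zeta)
print(peak, tail)
```

Each mask then selects the samples to which the corresponding polynomial applies, so the two sets of coefficients can be searched simultaneously against a single error criterion.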
     We then searched simultaneously for the best pair of third-order poly-
nomials, one for each region (forty parameters in all).  Having obtained
the result after 20 iterations, we eliminated all terms in the two poly-
nomials with near-zero coefficients, leaving 29 parameters, and performed
twenty more iterations.  The following equation resulted:

 For g(x', y', ζ) < 0,

$$
K(x', y', 0, \zeta) = \exp\bigl[\, a_1 \zeta^2 y' + a_2 x' y' \zeta + a_3 (x')\, y'
      + a_4 (y')^3 + a_5 \zeta + a_6 y' \zeta
      + a_7 x' y' + a_8 y' + a_9 \zeta x'
      + a_{10} (x')^2 + a_{11} x' \zeta + a_{12} x' + \cdots \bigr]
$$

 For g(x', y', ζ) ≥ 0,

$$
K(x', y', 0, \zeta) = \exp\bigl[\, b_1 y' \zeta + b_2 (y')^2 + b_3 \zeta + b_4 y' + b_5 (y')^2 x'
      + b_6 (x')^3 + b_7 x' \zeta + b_8 \zeta^2 + b_9 x' y'
      + b_{10} (x')^2 + b_{11} x' + b_{12} (x')^2 y' + b_{13} (x') + \cdots \bigr] \qquad (16)
$$

(Some superscripts, and the terms following a_12 and b_13, are illegible in
this copy; the terms are shown as they can be read.)  The values of the
coefficients are as given in Table 6.  The RMS
error was respectable, but the maximum error was still large.  The results
for all forms are listed in Table 7.  Table 8 provides an assessment of the
predicted versus actual values at the maximum error for each wind direction.
     It is instructive to look at the functions graphically.  In Figures 5-8,
"slices" of the true (Gaussian) form are compared with  the same slices  of
the empirically derived form represented by equations (15) and (16).
     While these results suggest that these more general forms (and perhaps
others) may prove useful, we do not feel that significant conclusions about
this question  can be reached with test data from a Gaussian form.  Referring
to Figures 5 and 7, we note that the Gaussian form implies rather extreme
rates of change in concentration and very flat tails which are very difficult
to approximate by anything other than a Gaussian form.   For actual measure-
ment data, the many errors in measuring air quality and meteorological

-------
Table 6.  Values of coefficients in source-receptor function.

         i                a_i                     b_i

         1           3.811 x 10^-7           7.520 x 10^-8
         2          -5.019 x 10^-9          -7.502 x 10^-6
         3          -3.312 x 10^-11         -9.430 x 10^-3
         4          -1.074 x 10^-7          -5.505 x 10^-3
         5          -1.203 x 10^-1           5.318 x 10^-11
         6           3.544 x 10^-5           5.258 x 10^-15
         7           2.242 x 10^-6           2.769 x 10^-7
         8          -3.008 x 10^-2          -5.209 x 10^-5
         9          -6.632 x 10^-9           1.354 x 10^-7
        10          -1.403 x 10^-10         -1.125 x 10^-9
        11           9.039 x 10^-6           4.405 x 10^-9
        12          -3.614 x 10^-4          -7.290 x 10^-13
        13           1.134 x 10^-13         -1.657 x 10^-12
        14           8.042                   3.842 x 10^-10
        15              --                   5.694 x 10^-1

-------
                     Table 7.   Errors for all forms

Functional Form               RMS Error (µg/m³)     Max Error (µg/m³)

2nd order polynomial                 152                   3130

3rd order polynomial                 126                   2570

Continuous piecewise-linear          140                   2808

3rd order polynomial
  (split regions)                     40                    550

Gaussian form (from
  Table 1)                            14                    175

-------
Table 8.  Prediction of maximum concentration for each
          wind direction by equations (15) and (16).

Wind Angle (deg.)    Actual SO2 (µg/m³)    Predicted SO2 (µg/m³)

      22.5                 2409                   2448
      45.0                 1286                   1391
      67.5                  681                    703
      90.0                  884                    933
     112.5                  740                    847
     135.0                  912                    891
     157.5                  792                    888
     180.0                 2577                   2027
     202.5                 1561                   1572
     225.0                  711                    749
     247.5                  765                    782
     270.0                  914                    926
     292.5                  587                    552
     315.0                  532                    534
     337.5                  525                    561
     360.0                  985                    992
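
The entries of Table 8 can be cross-checked against Table 7 with a few lines of arithmetic.  The sketch assumes that the three columns are wind angle, actual SO2, and predicted SO2, in that order.

```python
# Values transcribed from Table 8 (assumed column order: angle, actual, predicted).
angles = [22.5 * k for k in range(1, 17)]
actual = [2409, 1286, 681, 884, 740, 912, 792, 2577,
          1561, 711, 765, 914, 587, 532, 525, 985]
predicted = [2448, 1391, 703, 933, 847, 891, 888, 2027,
             1572, 749, 782, 926, 552, 534, 561, 992]

errors = [p - a for a, p in zip(actual, predicted)]          # signed, ug/m^3
worst = max(range(len(errors)), key=lambda i: abs(errors[i]))
worst_angle = angles[worst]
worst_error = abs(errors[worst])
print(f"largest discrepancy: {worst_error} ug/m^3 at {worst_angle} degrees")
```

The largest discrepancy found this way agrees with the maximum error of 550 µg/m³ listed in Table 7 for the split-region third-order form.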

-------
           Figure 5.  Source-receptor function, RAM model (Gaussian kernel),
                      plotted against crosswind distance y' (0 to 1000 meters)
                      at upwind distance x' = 1500, for source heights
                      ζ = 40, 50, 60, 100, and 150 m.

-------
           Figure 6.  Empirical source-receptor function derived by best-fit,
                      plotted against crosswind distance y' (0 to 1000 meters)
                      at upwind distance x' = 1500; ζ = source height.

-------
           Figure 7.  Source-receptor function, RAM model (Gaussian kernel),
                      plotted against crosswind distance y' (0 to 1000 meters)
                      for source height ζ = 50 m; x' = upwind distance
                      (one curve is labeled x' = 1000 meters).

-------
           Figure 8.  Empirical source-receptor function derived by best-fit,
                      plotted against crosswind distance y' (0 to 1000 meters)
                      for source height ζ = 50 m; x' = upwind distance.

-------
variables, as well  as in compiling emission inventories, may make a  relatively
simple form as effective as the Gaussian form in obtaining  an  acceptable
average error.  Since we can only hypothesize at present on the  practical
effectiveness of such other forms, this  will  not be  discussed  further.

-------
                           5.  CONCLUSIONS

     This report develops a methodology for deriving a class  of source-
oriented empirical models for determining spatial  concentration distri-
bution of an air pollutant, given emissions and a  categorization of the
meteorological conditions.  It is noted that direct empirical  modeling of
the relationship between emissions/meteorology and the resulting air
quality distribution is in most cases impractical.  The number of emis-
sions variables and potential receptor locations of interest  lead to  im-
practical requirements on the amount of data and its variability.  On
the other hand, given a variation of the wind direction, the  many source-
receptor pairs provide a wide sampling of source strengths and receptor
impacts.  Since the isolated effect of a given source is not  generally
available, it is proposed that a source-receptor function be  empirically
derived by minimizing the error in observed values of the total concentra-
tion due to many sources versus model-predicted values.  Thus, the source-
receptor function estimating the contribution of a source of a given
strength for an arbitrary receptor location is determined, as opposed to
a direct relationship between total emissions and air quality.  This
source-receptor function could then be used in the classical manner to
provide a source-oriented air quality simulation model by summing the con-
tributions of different point and area sources at any receptor location.
     It is first suggested that the well-known Gaussian form source-
receptor function can be utilized in this manner; the parameter values
of that formulation could be determined empirically to minimize the mean-
square error between predicted and actual values.  This approach could be
considered as a calibration of a Gaussian-form model to a particular urban
 environment.  This concept was tested using data generated from a model
 which employs the Gaussian form and, hence, for which the "true" parameters
 are  known.  The results indicated that extraction of these parameter values,
 even with a poor initial guess, was quite feasible.  However, it was fur-
 ther noted that the empirical parameter values obtained by the best-fit
 procedure may not be unique, and that different sets may lead to concen-
 tration  predictions with similar error characteristics relative to the
 "true" set.
     The use of forms other than the Gaussian form was also examined.
 The  use  of test data generated from the Gaussian form could not allow any
 definitive conclusions in regard to the practical utility of other forms.
 It was,  however, demonstrated that it was feasible to extract parameters of
 other forms which give a good approximation when applied to the test data.
     This report is intended to demonstrate a methodology and to suggest
 that there do not appear to be any fundamental difficulties in extending
 this to measured data.  In applying the methodology to practical situa-
 tions, however, some extensions are required.  For example, the Gaussian
 form utilized assumed a single stability category; in practice, a number
 of stability categories and hence a larger number of parameters would be
 employed.  There are other questions such as the number of monitoring sta-
 tions required to allow adequate estimation of the parameters required.
The requirements in this regard are sufficiently dependent upon the number
and location of sources that there is no obvious way to approach them in
a general sense.  In practice, the question would be answered by the
accuracy of the resulting model.  It should perhaps be emphasized that
this methodology will yield the model of the form chosen which gives the
minimum mean-squared error in forecasting pollutant concentrations over
the data base utilized, a characteristic which generates confidence in
the future use of the model in the same area.

-------
                                 REFERENCES
1.  Chambers, John M., "Fitting Nonlinear Models: Numerical Techniques,"
    Biometrika, Vol. 60, No. 1, 1973, pp. 1-13.

2.  Meisel, W. S., Computer-Oriented Approaches to Pattern Recognition,
    Academic Press, New York, 1972, pp. 51-53.

3.  Horowitz, Alan, W. S. Meisel, and D. C. Collins, The Application of
    Repro-Modeling to the Analysis of a Photochemical Air Pollution Model
    (EPA-650/4-74-001), December 1973.

4.  Hrenko, Joan M. and D. B. Turner, "An Efficient Gaussian-Plume Multiple
    Source Air Quality Algorithm," Paper 75-04.3, 68th Annual APCA Meeting,
    Boston, June 1975.

-------
                             APPENDIX
         The Feasibility of Formulation of a Source-Oriented
         Air Quality Simulation Model that Uses Atmospheric
         Dispersion Functions Empirically Derived from Joint
       Historical Data for Air Quality and Pollutant Emissions
                         Kenneth L. Calder
       Meteorology Laboratory, Environmental Protection Agency,
       Research Triangle Park, North Carolina
      This paper was prepared as a basis for discussion between the contractor
and the project officer.  It is included here for historical background and
to elaborate certain points in the report.  It is not intended to provide a
complete discussion of the subject.

 INTRODUCTION
     The multiple-source simulation model for urban air quality is now
well-known and in common use.  It provides estimates of the spatiotemporal
distribution of concentration of an air pollutant, in terms of the corres-
ponding distribution of the pollutant emissions, and involves the use of
atmospheric dispersion functions that express the quantitative effects of
atmospheric transport and diffusion under the meteorological conditions
that are occurring.  These meteorological dispersion functions need to be
specified in advance, i.e., in some a priori fashion.  This usually involves
the analysis of data from some special ad hoc field experiments that are
made to characterize the diffusive power of the lower atmosphere in a fairly
general fashion.  The tests need to be conducted under a variety of meteoro-
logical conditions and for some simple canonical configuration of the emissions,
e.g., from a single point source.  Because of their fundamental source-oriented
structure these urban air-quality simulation models are widely used in analyzing
the effects on air quality of hypothetical and arbitrary emission control
strategies.  They thus provide a rational basis for air quality management
based on control of selective sources of pollution.  The development and
improvement of such air quality simulation models is being very actively
pursued in most industrialized countries at the present time.
     In strong contrast are those air  quality "models"  that primarily
involve some form of statistical regression analysis, and which depend
entirely on the availability of extensive meteorological and air quality
data for a particular urban location.   Although  such developments have had
useful applications for specific problems, the fact that they are receptor-
rather than source-oriented, and do not normally involve any explicit input
of information concerning pollutant emissions, has rendered them of very
limited value for studies of air pollution control strategies.   The belief,
historically, that this failure was an inherent  characteristic  of statistical
models has possibly led to some neglect in the study of their full  potential.
Also, of course, there has been the feeling that insofar as the statistical
models are empirically established they would be specifically restricted in
application, e.g., as regards geographical location, meteorological regimes, etc.
     The present study grew from the idea that the atmospheric  dispersion
functions of the conventional source-oriented air-quality simulation model
play the role of transfer functions, as between  the distribution of pollutant
emissions and the air quality distribution.  They might therefore possibly
be obtained empirically, through an appropriate  mathematical inversion tech-
nique, from accumulated data on the joint distributions of air quality
and emissions.  In this case these empirically determined functions could
then be used in a conventional way as a basis for a source-oriented air
quality model, for prediction of air quality from arbitrary or hypothetical
distributions of emissions.  The possibility might in this way be provided
for developing an empirical-statistical source-oriented model in terms of
the large mass of accumulated historical data on air quality, rather than
through the input of a priori dispersion functions.  If this could be done
it might have the advantage of utilizing dispersion functions that were
determined directly from the actual conditions of urban dispersion.
     The present paper attempts to provide a preliminary theoretical dis-
cussion and examination of the feasibility of such a formulation.  It does
not contain numerical examples as these are the subject of an ongoing
research project.  However, it has seemed worthwhile to draw some attention
to these ideas at an early stage in the hope that, with more sophisticated
mathematical and statistical formulation, considerable improvement and gener-
alization may be possible.

GENERAL FORMULATION OF A MULTIPLE-SOURCE AIR QUALITY MODEL
     The starting point for most urban air quality models is the assumption
of a quasi-steady state.  Thus, in spite of the obvious long term variability
of pollutant concentrations and the meteorological conditions affecting
transport and diffusion, it is assumed that this variability can be treated
as though it resulted from a sequence of steady-state situations.  The
sequence interval is normally taken to be quite short and perhaps only of
the order of one hour.  For pollutants that can be regarded as chemically
inert, it is assumed that the concentration contributions produced at a
receptor location from several sources combine additively.  Under these
circumstances all possible cases  of emission  can  be  subsumed mathematically
by considering a volume distribution of emissions, and writing the total
mean concentration χ(x, y, z) [for a rectangular coordinate system with
the plane z = 0 at the ground surface] for any "steady-state" period,
that results from superposition of the concentration fields from all the
sources, as a triple integral,

$$
\chi(x, y, z) = \iiint_V Q_V(\xi, \eta, \zeta)\, R(x, y, z;\, \xi, \eta, \zeta)\, d\xi\, d\eta\, d\zeta \qquad (1)
$$

where
     Q_V(ξ, η, ζ)           = steady emission rate per unit volume and per unit
                              time at position (ξ, η, ζ)
     R(x, y, z; ξ, η, ζ)    = mean concentration at (x, y, z) produced by a
                              steady point-source of unit strength located at
                              (ξ, η, ζ)
and the integration extends over the entire volume V occupied by the
source distribution.
     Using the formal device of the Dirac delta function the above general
formulation of the superposition principle includes the case of the area-
sources and point-sources that are more normally considered in air quality
simulation models.  Thus, for an extended, horizontal area-source, located
at height ζ = ζ_0, and of strength Q_A(ξ, η) per unit area per unit time

$$ Q_V(\xi, \eta, \zeta) = Q_A(\xi, \eta)\, \delta(\zeta - \zeta_0) \qquad (2) $$

so that equation (1) then gives (from the sifting property of the δ-function
when under the integral sign)

$$ \chi(x, y, z) = \iint_A Q_A(\xi, \eta)\, R(x, y, z;\, \xi, \eta, \zeta_0)\, d\xi\, d\eta \qquad (3) $$

and the integral extends over the entire area A of the area-source distribution.
Similarly for a single point-source of strength Q_P at (ξ_0, η_0, ζ_0)
             Q_V(ξ, η, ζ) = Q_P ...
-------
which represents the effect at (x, y, z) of unit concentrated source at
(ξ, η, ζ), is often known as the influence or transfer function of the
problem.  If the distribution of causes is prescribed and the influence
function is known, then the equation permits determination of the effect
by direct integration.  However, if it is required to determine a distri-
bution of causes that will produce a known distribution of effects, the
above equation is a Fredholm integral equation to determine Q_V(ξ, η, ζ).
The kernel is then identified with the influence function of the problem.
In the above the kernel is a function of six variables.
     That the above is  so, can readily be seen from the following heuristic
consideration (which, however, is a basis for numerical solution of the
integral equation).  In equation (1) above, let a, A, b, B, c, C denote
constants that define the region occupied by the emissions and receptor
locations, so that $a \le (x, \xi) \le A$, $b \le (y, \eta) \le B$ and $c \le (z, \zeta) \le C$
(this involves no loss of generality since $Q_V$ is zero outside the actual
region of emissions).
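Let
$$
\begin{aligned}
\xi_\ell - \xi_{\ell-1} &= \Delta\xi = \frac{A - a}{L}, & \ell &= 1, 2, \ldots, L \\
\eta_m - \eta_{m-1} &= \Delta\eta = \frac{B - b}{M}, & m &= 1, 2, \ldots, M \\
\zeta_n - \zeta_{n-1} &= \Delta\zeta = \frac{C - c}{N}, & n &= 1, 2, \ldots, N
\end{aligned}
$$
                       ($L, M, N, \ell, m, n$ integers)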
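Then the integral equation (1) is the limiting form for
$(\Delta\xi, \Delta\eta, \Delta\zeta) \to 0$, i.e., $(L, M, N) \to \infty$, of the equation
$$
\chi(x, y, z) = \Delta\xi\, \Delta\eta\, \Delta\zeta \sum_{\ell=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} Q_V(\xi_\ell, \eta_m, \zeta_n)\, R(x, y, z; \xi_\ell, \eta_m, \zeta_n) \tag{6}
$$
This must be true when $a \le x \le A$, $b \le y \le B$, $c \le z \le C$, and in
particular when x takes the values $x_1, x_2, \ldots, x_L$, y takes the values
$y_1, y_2, \ldots, y_M$, and z takes the values $z_1, z_2, \ldots, z_N$.

Let
$$
\Delta\xi\, \Delta\eta\, \Delta\zeta\, Q_V(\xi_\ell, \eta_m, \zeta_n) = Y_{\ell mn}; \quad
R(x_i, y_j, z_k; \xi_\ell, \eta_m, \zeta_n) = R_{ijk\ell mn}; \quad
\chi(x_i, y_j, z_k) = \chi_{ijk}
$$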
Then (6) reduces to the matrix equation
$$
\sum_{\ell=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} R_{ijk\ell mn}\, Y_{\ell mn} = \chi_{ijk}, \quad \text{i.e.,} \quad R\,Y = X \tag{7}
$$
           [i = 1, 2, ..., L; j = 1, 2, ..., M; k = 1, 2, ..., N]

This represents LMN equations for the LMN unknowns $Y_{\ell mn}$
($\ell$ = 1, ..., L; m = 1, ..., M; n = 1, ..., N), i.e., it is an even-determined
linear system with as many equations as there are unknowns.  However, note
that there are $L^2M^2N^2$ values of R involved so that the system would be very
strongly underdetermined, with fewer equations than unknowns, if the Y's
were given and the R's regarded as unknowns.  In other words, the integral
relation expressed by equation (1) is not sufficient to determine the
six-variable function R in terms of specified three-dimensional distributions
$\chi$ and $Q_V$.
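
The counting argument can be illustrated with a minimal sketch, assuming a one-dimensional analogue of equation (1) so that the linear system stays small. The kernel `exp(-|x - xi|)`, the emission values and the helper names below are hypothetical choices for illustration and are not taken from the report.

```python
import math

def kernel(x, xi):
    # Hypothetical 1-D influence function: effect at x of a unit source at xi.
    return math.exp(-abs(x - xi))

def build_system(q_values, lower, upper):
    """Discretize chi(x) = integral of Q(xi) R(x; xi) d(xi) on L equal cells."""
    L = len(q_values)
    d_xi = (upper - lower) / L
    nodes = [lower + (l + 1) * d_xi for l in range(L)]    # xi_l, reused as x_i
    R = [[kernel(x, xi) for xi in nodes] for x in nodes]  # L*L kernel values
    Y = [d_xi * q for q in q_values]                      # Y_l = d_xi * Q(xi_l)
    X = [sum(R[i][l] * Y[l] for l in range(L)) for i in range(L)]
    return R, Y, X

def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x

# Known kernel, unknown emissions: L equations for L unknowns.
true_q = [1.0, 0.0, 2.5, 0.5, 0.0, 3.0]
R, Y_true, X = build_system(true_q, 0.0, 6.0)
Y_recovered = solve(R, X)
```

With the kernel known, the L values of Y follow from the L values of X. In the reverse problem the same L equations would have to fix all L*L entries of R, which is the underdetermination noted in the text.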
HORIZONTALLY HOMOGENEOUS DISPERSION FUNCTION
     In the very general formulation of the previous section, the
meteorological dispersion function $R(x, y, z; \xi, \eta, \zeta)$ is a function of six
independent spatial variables.  This situation is, however, more general than
that assumed in most existing air-quality simulation models.  These assume
that the form of the dispersion function is independent of the horizontal
location of the source, so that the function R is invariant under horizontal
translation of the source-receptor pair.  Although this is obviously only
an approximation, in view of the rather inhomogeneous nature and distribution
of buildings in a city, the variation of the meteorological dispersion
with source location may nevertheless be small in comparison with variations
due to other causes, and this assumption is always made for purposes of
simplification.  We shall here refer to it as that of horizontal homogeneity
of the dispersion function.  In addition, it is normally assumed that the
mathematical form of R is unaffected by the direction of the airflow over
the city, provided that the x-axis of the coordinate system is always taken
along the mean horizontal wind direction over the region of interest.
We shall only make use of this assumption at a later stage in the analysis.
With horizontal homogeneity a great simplification results, and the six-
variable dispersion function reduces to one of four variables, i.e.,
$R(x, y, z; \xi, \eta, \zeta) \to K(x - \xi, y - \eta; z, \zeta)$.  For example, for the common
Gaussian-plume model R is given by (in the case of infinite mixing depth)
$$
R = \frac{1}{2\pi U \sigma_y(x-\xi)\, \sigma_z(x-\xi)}
\exp\left\{-\frac{(y-\eta)^2}{2\sigma_y^2(x-\xi)}\right\}
\left[\exp\left\{-\frac{(z-\zeta)^2}{2\sigma_z^2(x-\xi)}\right\}
+ \exp\left\{-\frac{(z+\zeta)^2}{2\sigma_z^2(x-\xi)}\right\}\right] \tag{8}
$$
where U is the mean wind speed and $\sigma_y$, $\sigma_z$ are the horizontal and vertical
standard deviations for the bivariate Gaussian distribution.  The latter
quantities are functions of the distance $(x - \xi)$ downwind from the source
location and also of the atmospheric stability.  We note in (8) that R is
a function of the horizontal coordinate differences $(x - \xi)$, $(y - \eta)$ and
the two (independent) variables $(z - \zeta)$ and $(z + \zeta)$.
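
As a minimal sketch of equation (8), the function below evaluates the Gaussian-plume kernel for a unit source, taking the standard deviation functions as arguments. The cut-off for receptors upwind of the source, the power-law coefficients in the usage lines and all function names are assumptions of this sketch; units are whatever consistent set the caller supplies.

```python
import math

def gaussian_plume(x, y, z, xi, eta, zeta, wind_speed, sigma_y, sigma_z):
    """Concentration at (x, y, z) from a unit source at (xi, eta, zeta), eq. (8)."""
    downwind = x - xi
    if downwind <= 0.0:
        return 0.0  # assumption of this sketch: no contribution upwind of the source
    sy = sigma_y(downwind)
    sz = sigma_z(downwind)
    crosswind = math.exp(-((y - eta) ** 2) / (2.0 * sy ** 2))
    # Direct term in (z - zeta) plus the ground-reflection term in (z + zeta).
    vertical = (math.exp(-((z - zeta) ** 2) / (2.0 * sz ** 2))
                + math.exp(-((z + zeta) ** 2) / (2.0 * sz ** 2)))
    return crosswind * vertical / (2.0 * math.pi * wind_speed * sy * sz)

# Illustrative (hypothetical) standard-deviation functions of downwind distance.
sigma_y = lambda d: 0.10 * d ** 0.90
sigma_z = lambda d: 0.05 * d ** 0.80

# Horizontal homogeneity: shifting source and receptor together changes nothing.
r_here = gaussian_plume(2.0, 0.1, 0.0, 1.0, 0.0, 0.05, 5.0, sigma_y, sigma_z)
r_there = gaussian_plume(5.0, 3.1, 0.0, 4.0, 3.0, 0.05, 5.0, sigma_y, sigma_z)
```

The two evaluations agree because only the differences (x - xi) and (y - eta) enter, while z and zeta enter separately through (z - zeta) and (z + zeta).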
      In the following, since air quality measurements are normally only
available at ground level, we shall be concerned primarily with a special
case corresponding to the concentration distribution at ground level.
In this case the basic integral relation becomes
$$
\chi(x, y, 0) = \iiint_V Q_V(\xi, \eta, \zeta)\, K(x - \xi, y - \eta; 0, \zeta)\, d\xi\, d\eta\, d\zeta \tag{9}
$$
where the function K is now one of only three variables.  In contrast to
the problem initially considered of a 3-dimensional concentration distribu-
tion, and for which equation (1) could be regarded as an integral equation
to determine the emissions distribution $Q_V(\xi, \eta, \zeta)$ if the dispersion
function R were specified, it is evident that the problem of determining
this 3-variable function $Q_V$ from equation (9) in terms of a known
2-dimensional concentration distribution is no longer well-determined, i.e.,
there is insufficient information to determine $Q_V$ uniquely in a mathematical
sense.  (This is probably most clearly seen by an argument exactly similar to
that previously developed in relation to the interpretation of equation (7).)
The same, of course, would be true if the equation (9) were to be regarded
as one for the determination of the 3-variable dispersion function K in
terms of a specified $\chi(x, y, 0)$ and $Q_V(\xi, \eta, \zeta)$.  The latter, however, is
precisely the basis for the empirical model that is being proposed in the
present paper to determine dispersion functions from air quality and emission
data.  It is therefore evident that a direct approach in terms of numerical
solution of the integral equation (9) will not be possible.  Fortunately,
an alternative is available in terms of a solution in a "least-squares"
sense.  In the latter we attempt to determine an approximate solution
for the dispersion function K by restricting K to membership in a family
of functions that involve a number of parameters $(\alpha_1, \alpha_2, \ldots)$ that
define a vector, say $\alpha$.  The parameter vector $\alpha$ thus specifies a
particular member of the family.  A familiar example is the family of
multivariate polynomials, where a member is specified by a particular choice
of values for the coefficients.  The approximate solution is then taken
as the "best-fitting" function of the chosen family as determined by the
method of least squares applied to the "observed" and "calculated"
concentration values.  The latter are, of course, determined by use of the
basic integral relation (9) in terms of different functions K of the
chosen family.
     In all urban air quality models, rather than considering a general
3-dimensional volume-source distribution, it is customary to consider
the emissions in terms of a limited number (say J) of elevated point-
sources together with horizontal area-sources, the latter being possibly
located at a few distinct heights $\zeta_s$ (say, for example, for s = 1, 2, 3).
Normally for any horizontal location $(\xi, \eta)$ there will be associated only
a single area-source height.  However, since we may always take
$Q_A(\xi, \eta, \zeta_s) = 0$ when there is no area-source emission at height $\zeta_s$,
we could if desired, equally well consider three superimposed area-source
distributions (for s = 1, 2, 3).
     In the above case Eq. (9) becomes
$$
\chi(x, y, 0) = \sum_{s=1}^{3} \iint_A Q_A(\xi, \eta, \zeta_s)\, K(x - \xi, y - \eta; 0, \zeta_s)\, d\xi\, d\eta
+ \sum_{\ell=1}^{J} Q_P(\ell)\, K(x - \xi_\ell, y - \eta_\ell; 0, \zeta_\ell) \tag{10}
$$
where
     $Q_A(\xi, \eta, \zeta_s)$ = emission rate of horizontal area-source distribution
                     at height $\zeta_s$; A denotes the total integration domain
                     of the area-source distributions.
     $Q_P(\ell)$ = emission rate of the $\ell$th elevated point-source, located
                     at position $(\xi_\ell, \eta_\ell, \zeta_\ell)$
In the special case of ground-level area-sources alone this reduces to
$$
\chi_A(x, y, 0) = \iint_A Q_A(\xi, \eta, 0)\, K(x - \xi, y - \eta; 0, 0)\, d\xi\, d\eta \tag{11}
$$
     In this case all three functions that are involved are two-dimensional
and equation (11) can be regarded either as an equation for the unknown
function $Q_A$ if the functions $\chi_A$ and K are given, or alternatively as an
equation for the special two-dimensional form of dispersion function if the
functions $\chi_A$ and $Q_A$ are given.  However, the inclusion of elevated point-
sources and also of elevated area-sources, and hence the need to consider
at least a three-variable dispersion function, is vital for the consideration
of urban air quality.  We are consequently forced to consider the alternative
approach that is based on the method of least-squares approximation.
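
The superposition in equation (10) can be sketched numerically as follows, assuming the area-source integral is replaced by a midpoint sum over grid cells. The tuple layouts, the function name and the toy kernel in the usage lines are hypothetical and chosen only to make the bookkeeping visible.

```python
def ground_level_concentration(receptor, area_cells, point_sources, kernel):
    """Approximate eq. (10) at one ground-level receptor.

    receptor      -- (x, y), with the x-axis along the mean wind
    area_cells    -- iterable of (xi, eta, zeta_s, q_area, cell_area)
    point_sources -- iterable of (xi, eta, zeta, q_point)
    kernel        -- callable K(dx, dy, zeta) for a receptor at z = 0
    """
    x, y = receptor
    total = 0.0
    for xi, eta, zeta_s, q_area, cell_area in area_cells:
        # Midpoint rule: each cell is treated as concentrated at its centre.
        total += q_area * cell_area * kernel(x - xi, y - eta, zeta_s)
    for xi, eta, zeta, q_point in point_sources:
        total += q_point * kernel(x - xi, y - eta, zeta)
    return total

# Toy kernel (hypothetical): unit response downwind of a source, none upwind.
toy_kernel = lambda dx, dy, zeta: 1.0 if dx > 0.0 else 0.0

cells = [(0.0, 0.0, 0.0, 2.0, 0.5),   # upwind of the receptor, contributes
         (2.0, 0.0, 0.0, 5.0, 0.5)]   # downwind of the receptor, does not
points = [(0.5, 0.0, 0.1, 3.0)]
chi = ground_level_concentration((1.0, 0.0), cells, points, toy_kernel)
```

Any dispersion function of the form K(x - xi, y - eta; 0, zeta) can be passed in place of the toy kernel; the loop structure is the same whether K is prescribed or is a trial member of a parameterized family.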
APPROXIMATION OF DISPERSION FUNCTION BY METHOD OF LEAST SQUARES
     As already mentioned above, the form of the dispersion function K
that appears in equation (10) is normally assumed to be independent of
the direction of the airflow over the city, provided that the x-axis of
the coordinate system is taken along the mean horizontal  wind direction.
To exploit this it is convenient to introduce "source-oriented" position
variables by

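$$
\begin{aligned}
x' &= x - \xi, & dx' &= -d\xi, & x'_\ell &= x_i - \xi_\ell \\
y' &= y - \eta, & dy' &= -d\eta, & y'_\ell &= y_i - \eta_\ell
\end{aligned}
$$

Then with the x-axis still along the mean horizontal wind direction,
equation (10) may be rewritten as
$$
\chi(x_i, y_i; \theta_j) = \sum_{s=1}^{3} \iint_A Q_A(x_i - x', y_i - y', \zeta_s)\, K(x', y'; 0, \zeta_s)\, dx'\, dy'
+ \sum_{\ell=1}^{J} Q_P(\ell)\, K(x'_\ell, y'_\ell; 0, \zeta_\ell) \tag{12}
$$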
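where the small obvious changes in notation are made in order to express
more explicitly the functional dependence on the receptor location
$(x_i, y_i)$ (for i = 1, 2, ...) and the wind direction $\theta_j$ (for j = 1, 2, ...) that
defines the angle between the x-axis and, say, the $\bar{x}$-axis of a fixed
coordinate system.  If we use overbars to denote corresponding coordinates
relative to the fixed axes $O\bar{x}\bar{y}$, then
$$
\begin{aligned}
x &= \bar{x} \cos\theta_j + \bar{y} \sin\theta_j \\
y &= \bar{y} \cos\theta_j - \bar{x} \sin\theta_j \\
x_i &= \bar{x}_i \cos\theta_j + \bar{y}_i \sin\theta_j \\
y_i &= \bar{y}_i \cos\theta_j - \bar{x}_i \sin\theta_j \\
\xi_\ell &= \bar{\xi}_\ell \cos\theta_j + \bar{\eta}_\ell \sin\theta_j \\
\eta_\ell &= \bar{\eta}_\ell \cos\theta_j - \bar{\xi}_\ell \sin\theta_j \\
x'_\ell &= \bar{x}'_\ell \cos\theta_j + \bar{y}'_\ell \sin\theta_j \\
y'_\ell &= \bar{y}'_\ell \cos\theta_j - \bar{x}'_\ell \sin\theta_j
\end{aligned}
$$
while if $\bar{Q}_A(\bar{x}, \bar{y}, \zeta_s)$ denotes the area-source strength distribution
relative to the fixed system of axes, then
$$
Q_A(x, y, \zeta_s) = \bar{Q}_A(\bar{x}, \bar{y}, \zeta_s)
$$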
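
The change of axes can be sketched in a few lines, assuming the wind direction is supplied in degrees as the angle between the wind-oriented x-axis and the fixed axis, as defined above. The function names are hypothetical.

```python
import math

def to_wind_oriented(x_bar, y_bar, theta_deg):
    """Rotate fixed-axis coordinates so that the x-axis lies along the mean wind."""
    theta = math.radians(theta_deg)
    x = x_bar * math.cos(theta) + y_bar * math.sin(theta)
    y = y_bar * math.cos(theta) - x_bar * math.sin(theta)
    return x, y

def source_oriented_offsets(receptor_bar, source_bar, theta_deg):
    """Return (x', y') = receptor minus source, both in wind-oriented axes."""
    x_i, y_i = to_wind_oriented(receptor_bar[0], receptor_bar[1], theta_deg)
    xi_l, eta_l = to_wind_oriented(source_bar[0], source_bar[1], theta_deg)
    return x_i - xi_l, y_i - eta_l

# A receptor 2 units along the fixed y-bar axis from a source, with theta = 90
# degrees, lies 2 units directly downwind of that source.
x_prime, y_prime = source_oriented_offsets((0.0, 2.0), (0.0, 0.0), 90.0)
```

Because the rotation is linear, rotating the receptor and source separately and then subtracting gives the same offsets as rotating their difference, which is why the primed coordinates obey the same transformation.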

     To apply the method of least-squares approximation, we now restrict
the approximating dispersion function K to membership in a specified family
of functions that involves a number of parameters whose values define a
parameter vector $\alpha$ (= $\alpha_1$, $\alpha_2$, $\alpha_3$, etc.).  We denote a general member of this
family of functions by $K\{x', y'; 0, \zeta; \alpha\}$ where K is a function of the
three variables x', y' and $\zeta$, and the parameter vector $\alpha$ (the curly brackets
are used here to emphasize a specific family of functions).  Denote the
value of $\chi$ obtained when this value of K is substituted in equation (12) by
$\chi_{calc}(x_i, y_i; \theta_j)$.  If there were an $\alpha$ such that the equation could be
satisfied exactly, the observed concentration $\chi_{obs}(x_i, y_i; \theta_j)$ could be
predicted exactly by the function given by that $\alpha$.  If a perfect fit is not
possible we may determine the value of $\alpha$ that minimizes the mean square error
over a set of values of $(x_i, y_i)$ and $\theta_j$.

$$
\varepsilon^2(\alpha) = \sum_i \sum_j \Bigg[ \chi_{obs}(x_i, y_i; \theta_j)
- \sum_{s=1}^{3} \iint_A Q_A(x_i - x', y_i - y', \zeta_s)\, K\{x', y'; 0, \zeta_s; \alpha\}\, dx'\, dy'
- \sum_{\ell=1}^{J} Q_P(\ell)\, K\{x'_\ell, y'_\ell; 0, \zeta_\ell; \alpha\} \Bigg]^2 \tag{13}
$$
Equation (13) may be minimized with respect to $\alpha$ by any number of
optimization techniques provided that the integral can be calculated,
and many numerical integration techniques are suitable for this purpose.
Also, calculation of the area-source integral can be simplified under
the "narrow plume hypothesis" that is described below and the two-
dimensional integral reduced approximately to a one-dimensional one.
A key problem is the choice of an appropriate family of parameterized
functional forms for K{ } such that the error $\varepsilon^2$ will be small, but
such that the number of parameters will be small.  Continuous piecewise
linear functions, as used in the recent EPA contract with Technology
Service Corporation, provide such a class of functions and are a
promising candidate for achieving a feasible solution.  Whatever form
of approximating function is used, however, it may be possible to make some
simplification by assuming specific dependencies on some meteorological
parameters, e.g., by assuming that the concentration  is inversely propor-
tional to the mean wind speed, rather than extracting this dependence
empirically.  Another possibility is the assumption of a Gaussian-plume
form for K{ } and then determination of the dispersion parameters empirically.
Thus from equation  (8) above we might take
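$$
K\{x', y'; 0, \zeta; \alpha\} = \frac{1}{\pi U \sigma_y(x')\, \sigma_z(x')}
\exp\left\{-\frac{y'^2}{2\sigma_y^2(x')}\right\}
\exp\left\{-\frac{\zeta^2}{2\sigma_z^2(x')}\right\} \tag{14}
$$
and assume simple power-law dependencies for the standard deviation
functions, say
$$
\sigma_y(x') = a_y (x')^{b_y}, \qquad \sigma_z(x') = a_z (x')^{b_z} \tag{15}
$$
where $a_y$, $b_y$, $a_z$, $b_z$ are constants.  Then the parameter vector
$\alpha = (U, a_y, b_y, a_z, b_z)$.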
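
A minimal sketch of the least-squares idea with the Gaussian family of equations (14) and (15) follows. It assumes point sources only (the area-source integrals are omitted), a single wind direction, and an exhaustive search over a small hypothetical grid of (a_z, b_z) values with the other parameters held fixed. The source list, receptor list and grid are invented for illustration, and the sketch ignores unit conversion between wind speed and distance.

```python
import math

def kernel(x_p, y_p, zeta, alpha):
    """Ground-level Gaussian-plume K of eq. (14); alpha = (U, a_y, b_y, a_z, b_z)."""
    wind, a_y, b_y, a_z, b_z = alpha
    if x_p <= 0.0:
        return 0.0  # assumption: receptors upwind of a source see nothing
    sy = a_y * x_p ** b_y  # eq. (15)
    sz = a_z * x_p ** b_z
    return (math.exp(-y_p ** 2 / (2.0 * sy ** 2))
            * math.exp(-zeta ** 2 / (2.0 * sz ** 2))
            / (math.pi * wind * sy * sz))

def calculated(receptor, sources, alpha):
    """Point-source part of eq. (12) at one receptor."""
    x_i, y_i = receptor
    return sum(q * kernel(x_i - xi, y_i - eta, zeta, alpha)
               for xi, eta, zeta, q in sources)

def squared_error(alpha, observations, sources):
    """Sum of squared residuals in the sense of eq. (13), point sources only."""
    return sum((chi_obs - calculated(receptor, sources, alpha)) ** 2
               for receptor, chi_obs in observations)

# Hypothetical case: two point sources (xi, eta, zeta, Q_P) and six receptors.
sources = [(0.0, 0.0, 0.05, 100.0), (1.0, 0.3, 0.02, 40.0)]
receptors = [(0.5, 0.0), (1.5, 0.1), (2.0, 0.3), (3.0, -0.2), (5.0, 0.4), (8.0, 0.0)]
true_alpha = (5.0, 0.072, 0.90, 0.038, 0.76)
observations = [(r, calculated(r, sources, true_alpha)) for r in receptors]

# Exhaustive search over a coarse grid of (a_z, b_z).
candidates = [(5.0, 0.072, 0.90, a_z, b_z)
              for a_z in (0.023, 0.038, 0.056)
              for b_z in (0.73, 0.76, 0.80)]
best_alpha = min(candidates, key=lambda a: squared_error(a, observations, sources))
```

Since the "observations" here are generated by the same family, the search returns the generating parameters with zero residual. With real data a continuous optimizer would replace the grid, and the residual would not vanish.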
      Finally, we consider  the simplification  using  the  "narrow plume
 hypothesis"  ["A Narrow Plume  Simplification  for Multiple Source
 Urban Pollution Models," unpublished  note, K.  L. Calder, 31 Dec.  1969].
Evidently, if the concentration in a point-source plume decreases rapidly
with crosswind distance from the plume centerline, then in the integration
with respect to y' in equation (12), the K function will be small except
for small values of |y'|.  We shall assume that this distance is
sufficiently small so that the y' variations of the area-source functions
$Q_A$ can be disregarded.  This idealization can be thought of in a formal
manner, by using the Dirac $\delta$-function, and writing
$$
K(x', y'; 0, \zeta_s) = G(x', \zeta_s)\, \delta(y') \tag{16}
$$
where $G(x', \zeta_s)$ is thus the crosswind integrated concentration from unit
point-source, since
$$
\int_{-\infty}^{\infty} K(x', y'; 0, \zeta_s)\, dy' = G(x', \zeta_s)
$$
With (16), each area integral in equation (12) reduces to the form
$$
\int Q_A(x_i - x', y_i, \zeta_s)\, G(x', \zeta_s)\, dx'
$$
We thus have a one-dimensional integral to evaluate rather than a two-
dimensional one.  For the special case of a Gaussian-plume
$$
\int_{-\infty}^{\infty} K(x', y'; 0, \zeta_s)\, dy'
= \sqrt{\frac{2}{\pi}}\, \frac{1}{U \sigma_z(x')} \exp\left\{-\frac{\zeta_s^2}{2\sigma_z^2(x')}\right\}
= G(x', \zeta_s) \tag{17}
$$
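
The narrow-plume reduction can be checked numerically with a short sketch, assuming the Gaussian form with power-law standard deviations. The midpoint-rule integrator, the function names and the parameter values in the usage lines are choices of this sketch and not part of the report.

```python
import math

def k_ground(x_p, y_p, zeta, wind, a_y, b_y, a_z, b_z):
    """Ground-level Gaussian-plume kernel, eq. (14)."""
    sy, sz = a_y * x_p ** b_y, a_z * x_p ** b_z
    return (math.exp(-y_p ** 2 / (2.0 * sy ** 2))
            * math.exp(-zeta ** 2 / (2.0 * sz ** 2))
            / (math.pi * wind * sy * sz))

def g_crosswind(x_p, zeta, wind, a_z, b_z):
    """Crosswind-integrated concentration from a unit source, eq. (17)."""
    sz = a_z * x_p ** b_z
    return math.sqrt(2.0 / math.pi) * math.exp(-zeta ** 2 / (2.0 * sz ** 2)) / (wind * sz)

def integrate(f, lo, hi, n=2000):
    """Composite midpoint rule on n equal panels."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def narrow_plume_area_term(q_along_x, x_i, zeta, wind, a_z, b_z, fetch, n=2000):
    """One-dimensional integral of Q_A(x_i - x') G(x') over 0 < x' < fetch.

    For a ground-level source (zeta = 0) the integrand has an integrable
    singularity at x' = 0 and this simple rule converges slowly there.
    """
    return integrate(
        lambda x_p: q_along_x(x_i - x_p) * g_crosswind(x_p, zeta, wind, a_z, b_z),
        0.0, fetch, n)

# Check that integrating K across the wind reproduces G at one downwind distance.
numeric_g = integrate(
    lambda y: k_ground(2.0, y, 0.03, 5.0, 0.072, 0.90, 0.038, 0.76), -1.0, 1.0)
closed_g = g_crosswind(2.0, 0.03, 5.0, 0.038, 0.76)

# Area-source term for a uniform unit emission density and an elevated release.
area_term = narrow_plume_area_term(lambda x: 1.0, 0.0, 0.05, 5.0, 0.038, 0.76, 10.0)
```

Only the vertical standard deviation survives in G, so under this hypothesis the area-source contribution carries no information about the crosswind spread.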
PROPOSED FEASIBILITY STUDY (19 NOV. 74)
     In an initial feasibility study it seems appropriate to use air
quality estimates calculated by use of a known multiple-source
simulation model, rather than actual observations of SO2 concentrations.
However, so that a realistic emissions distribution and sampling network
may be utilized, the calculations should be made for a real situation rather
than a purely hypothetical one.
     The air quality model to be used for the study will be RAM*.  This
is a multiple-source Gaussian-plume model in which, for computational
efficiency, elemental area-sources (5000') are aggregated into larger
squares where this is possible.  The model also estimates the area-source
concentrations using the narrow-plume approximations discussed above.
The algorithm also permits consideration of three different heights
for the area source emissions.  For the St. Louis emissions data (Turner-
Edmisten) that will probably be used, there were 30 x 40 squares (5000')
and 62 point sources, and 40 receptor locations at which concentrations
will be calculated.
     For the purposes of the present study the standard deviation functions
for the basic Gaussian-plume will  be assumed to be simple power laws, as
given by equation (15) above, with  parameter values as below (when the
$\sigma$'s and distances are both measured in kilometers).
     * A paper "An Efficient Gaussian-Plume Multiple Source Air Quality
Algorithm," by Joan M. Hrenko and D. Bruce Turner of the
Meteorology Laboratory, was presented at the 68th Annual APCA
Meeting in June 75 at Boston.
Stability Category     $a_y$     $b_y$     $a_z$     $b_z$
A                      0.20      0.90      0.14      0.90
B                      0.16      0.90      0.080     0.85
C                      0.11      0.90      0.056     0.80
Neutral D              0.072     0.90      0.038     0.76
E                      0.051     0.90      0.023     0.73
F                      0.038     0.90      0.012     0.67
The $a_z$, $b_z$ values are given by F. Pasquill in his Table 6.IX for a
ground roughness length $z_0$ = 10 cm (see Atmospheric Diffusion, 2nd Edition,
John Wiley & Sons, 1974).  The $a_y$, $b_y$ values were obtained by fitting, over
the distance range 0.1 to 10 km, the curves given in Figure 3.10 of
Meteorology and Atomic Energy, 1968.
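
For reference, the tabulated coefficients can be held in a small lookup and evaluated with the power laws of equation (15). The dictionary and function names are hypothetical; the numbers are transcribed from the table above, with standard deviations and distances both in kilometers.

```python
# (a_y, b_y, a_z, b_z) by stability category, transcribed from the table above.
POWER_LAW = {
    "A": (0.20, 0.90, 0.14, 0.90),
    "B": (0.16, 0.90, 0.080, 0.85),
    "C": (0.11, 0.90, 0.056, 0.80),
    "D": (0.072, 0.90, 0.038, 0.76),  # neutral
    "E": (0.051, 0.90, 0.023, 0.73),
    "F": (0.038, 0.90, 0.012, 0.67),
}

def sigmas(stability, distance_km):
    """Return (sigma_y, sigma_z) in km at a downwind distance given in km."""
    a_y, b_y, a_z, b_z = POWER_LAW[stability]
    return a_y * distance_km ** b_y, a_z * distance_km ** b_z

sigma_y_d, sigma_z_d = sigmas("D", 1.0)
```

At 1 km the standard deviations equal the a coefficients themselves, which gives a quick check on the transcription. The fits are stated to hold over 0.1 to 10 km, so values outside that range are extrapolations.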
     Some possible tasks follow:
Task 1
     Select a real urban location (e.g., St. Louis, New York, Chicago, etc.)
for which area- and point-source, short-term emissions distributions are
available for SO2.  For a typical 1-hr emissions distribution use the
multiple-source Gaussian dispersion model (probably the RAM model of
Meteorology Laboratory) - for one wind speed (5 m/sec), one stability
class (neutral) and infinite mixing depth - to calculate total 1-hr
concentrations $\chi(x_i, y_i; \theta_j)$ at ground level at a number of receptor
locations $(x_i, y_i)$, i = 1, 2, ... and for the 16 cardinal wind directions
$\theta_j$ = 0°, 22.5°, 45°, ..., 337.5°.  The concentration values will be
calculated (a) from the area sources alone, (b) from the point-sources
alone, and (c) from the point- and area-sources combined.
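
The size of the data set implied by this task can be sketched by enumerating the model runs. The receptor count of 40 is the figure quoted earlier for the St. Louis data; the names and the tuple layout are hypothetical.

```python
# The 16 wind directions, in degrees: 0, 22.5, 45, ..., 337.5.
WIND_DIRECTIONS_DEG = [22.5 * j for j in range(16)]

# The three source groupings for which concentrations are calculated.
CASES = ("area sources only", "point sources only", "combined")

def enumerate_runs(n_receptors):
    """List every (case, wind direction, receptor index) combination."""
    return [(case, theta, i)
            for case in CASES
            for theta in WIND_DIRECTIONS_DEG
            for i in range(n_receptors)]

runs = enumerate_runs(40)
```

Each source grouping thus supplies 16 x 40 = 640 "observed" concentrations from which the dispersion function is to be recovered.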
     Use the least-squares methodology proposed, and equation (13) to
recover the meteorological dispersion function K.  Determine
     (a) the degree of error in predicting concentrations, for the
receptor locations and wind directions actually used to derive the empiri-
cal dispersion function,
     (b) the degree of error in predicting concentrations at receptor
locations and for wind directions not used in the derivation (a measure
of interpolation accuracy),
     (c) the degree of error in predicting results for a somewhat different
emissions distribution (a test of extrapolation accuracy), and
     (d) compare the empirical dispersion function with the Gaussian form
used to compute input concentrations for the analysis.
Task 2
     Test the sensitivity of the method to the number of "observed" concen-
trations used, and to random errors in the emissions inventory.
Task 3
     Extend the preceding to a range of wind speeds, atmospheric stability
classes, and to several different emissions distributions.
                                  TECHNICAL REPORT DATA

 1. REPORT NO.:  EPA-600/4-76-029b
 3. RECIPIENT'S ACCESSION NO.:
 4. TITLE AND SUBTITLE:  EMPIRICAL TECHNIQUES FOR ANALYZING AIR QUALITY AND
    METEOROLOGICAL DATA.  Part II.  Feasibility Study of a Source-Oriented
    Empirical Air Quality Simulation Model
 5. REPORT DATE:  June 1976
 6. PERFORMING ORGANIZATION CODE:
 7. AUTHOR(S):  W. S. Meisel, M. D. Teener
 8. PERFORMING ORGANIZATION REPORT NO.:  TSC-PD-132-3
 9. PERFORMING ORGANIZATION NAME AND ADDRESS:
    Technology Service Corporation
    2811 Wilshire Boulevard
    Santa Monica, California  90403
10. PROGRAM ELEMENT NO.:  1AA009
11. CONTRACT/GRANT NO.:  EPA 68-02-1704
12. SPONSORING AGENCY NAME AND ADDRESS:
    Environmental Sciences Research Laboratory
    Office of Research and Development
    U.S. Environmental Protection Agency
    Research Triangle Park, North Carolina 27711
13. TYPE OF REPORT AND PERIOD COVERED:  Final, May 74-Dec 75
14. SPONSORING AGENCY CODE:  EPA-ORD
15. SUPPLEMENTARY NOTES:
    This is the second of three reports examining the potential role of
    state-of-the-art empirical techniques in analyzing air quality and
    meteorological data.
16. ABSTRACT:
         Meteorological dispersion functions in multiple-source simulation
    models for urban air quality are usually specified on the basis of data
    from special field experiments, usually involving isolated sources.  In
    the urban environment, individual sources cannot be isolated.  One may,
    however, ask for a source-receptor relationship which, when summed (or
    integrated) over all the sources, would minimize the average squared error
    in prediction of measured values.  The feasibility of this approach is
    demonstrated by application to model-generated data, where the
    source-receptor relationship is known.
17. KEY WORDS AND DOCUMENT ANALYSIS:
    DESCRIPTORS                      COSATI Field/Group
    * Air pollution                  13B
    * Meteorological data            04B
    * Atmospheric diffusion          04A
    * Mathematical models            12A
    * Environmental simulation       14B
18. DISTRIBUTION STATEMENT:  RELEASE TO PUBLIC
19. SECURITY CLASS (This Report):  UNCLASSIFIED
20. SECURITY CLASS (This page):  UNCLASSIFIED
21. NO. OF PAGES:  66
22. PRICE:
EPA Form 2220-1 (9-73)