EPA-600/4-76-029b
June 1976
Environmental Monitoring Series
EMPIRICAL TECHNIQUES FOR ANALYZING
AIR QUALITY AND METEOROLOGICAL DATA
Part II. Feasibility Study of a
Source-Oriented Empirical
Air Quality Simulation Model
Environmental Sciences Research Laboratory
Office of Research and Development
U.S. Environmental Protection Agency
Research Triangle Park, North Carolina 27711
-------
RESEARCH REPORTING SERIES
Research reports of the Office of Research and Development, U.S. Environmental
Protection Agency, have been grouped into five series. These five broad
categories were established to facilitate further development and application of
environmental technology. Elimination of traditional grouping was consciously
planned to foster technology transfer and a maximum interface in related fields.
The five series are:
1. Environmental Health Effects Research
2. Environmental Protection Technology
3. Ecological Research
4. Environmental Monitoring
5. Socioeconomic Environmental Studies
This report has been assigned to the ENVIRONMENTAL MONITORING series.
This series describes research conducted to develop new or improved methods
and instrumentation for the identification and quantification of environmental
pollutants at the lowest conceivably significant concentrations. It also includes
studies to determine the ambient concentrations of pollutants in the environment
and/or the variance of pollutants as a function of time or meteorological factors.
This document is available to the public through the National Technical Information
Service, Springfield, Virginia 22161.
-------
EPA-600/4-76-029b
June 1976
EMPIRICAL TECHNIQUES FOR ANALYZING AIR QUALITY
AND METEOROLOGICAL DATA
Part II. Feasibility Study of a Source-Oriented
Empirical Air Quality Simulation Model
by
W. S. Meisel and M. D. Teener
Technology Service Corporation
2811 Wilshire Boulevard
Santa Monica, California 90403
and
Kenneth L. Calder
Contract No. 68-02-1704
Project Officer
Kenneth L. Calder
Meteorology and Assessment Division
Environmental Sciences Research Laboratory
Research Triangle Park, North Carolina 27711
U.S. ENVIRONMENTAL PROTECTION AGENCY
OFFICE OF RESEARCH AND DEVELOPMENT
ENVIRONMENTAL SCIENCES RESEARCH LABORATORY
RESEARCH TRIANGLE PARK, NORTH CAROLINA 27711
-------
DISCLAIMER
This report has been reviewed by the Environmental Sciences
Research Laboratory, U.S. Environmental Protection Agency, and
approved for publication. Approval does not signify that the
contents necessarily reflect the views and policies of the
U.S. Environmental Protection Agency, nor does mention of trade
names or commercial products constitute endorsement or
recommendation for use.
-------
PREFACE
This is the second of three reports of work performed under EPA
Contract No. 68-02-1704, examining the potential role of state-of-the-art
empirical techniques in analyzing air quality and meteorological data.
The reports are entitled as follows:
I. The Role of Empirical Methods in Air Quality and Meteorological
Analyses
II. Feasibility Study of a Source-Oriented Empirical Air Quality
Simulation Model
III. Short-Term Changes in Ground-Level Ozone Concentrations:
An Empirical Analysis
-------
ABSTRACT
Meteorological dispersion functions in multiple-source simulation
models for urban air quality are usually specified on the basis of the
analysis of data from special field experiments, usually involving isolated
sources. In the urban environment, individual sources cannot be isolated.
One may, however, ask for an empirical source-receptor relationship which,
when summed (or integrated) over all the sources, would minimize the average
squared error in prediction of measured values. A methodology for empirically
testing alternative forms and extracting optimal parameters for source-receptor
dispersion functions in this manner is described. Feasibility was
demonstrated on data for which the "true" source-receptor function was known;
the methodology recovered parameter values very close to true values. This
approach can be used as a means for calibrating Gaussian-form models for
particular urban environments and in developing alternative source-receptor
functional forms.
-------
TABLE OF CONTENTS
PREFACE
ABSTRACT
1. INTRODUCTION
2. PROBLEM FORMULATION
2.1 MATHEMATICAL FORMULATION
2.2 THE TEST DATA
3. OPTIMIZING PARAMETERS FOR THE GAUSSIAN FORM OF
THE SOURCE-RECEPTOR FUNCTION
4. MORE GENERAL SOURCE-RECEPTOR FUNCTIONS
5. CONCLUSIONS
REFERENCES
APPENDIX: The Feasibility of Formulation of a Source-Oriented
Air Quality Simulation Model that Uses Atmospheric
Dispersion Functions Empirically Derived from Joint
Historical Data for Air Quality and Pollutant Emissions
(Kenneth L. Calder)
-------
1. INTRODUCTION
This report examines the feasibility of extracting an empirical
source-receptor air pollutant dispersion function. The ideas behind the
present report originated with the EPA project monitor,
Kenneth L. Calder; the motivation is best stated in his own words (as
quoted by Niels Busch in the proceedings of the fourth meeting of the
NATO/CCMS panel on air pollution modeling, from a letter written by
K. L. Calder in March 1973):
It was felt that the topic "The role of empirical/statistical
modeling of air quality" might be an appropriate one for discussion
at this time, since with one significant exception (Barrie Smith's
paper on SO2 prediction for London and Manchester that was presented
by Frank Pasquill at our last meeting) the topic has been almost entirely
neglected in our discussions to date. Perhaps one reason for
this state of affairs has been the historical belief that air quality
models based on statistical regression type of analysis are not
source-oriented and therefore are largely useless for control strategy
in terms of the contribution of individual sources to the degradation
of air quality. Also, of course, is the feeling that insofar as the
statistical models are empirically established they will be specifi-
cally restricted in application, e.g., as regards geographical loca-
tion, meteorological regimes, etc. Although this may be the case I
am unaware that it has been clearly demonstrated that these limita-
tions are really inherent in all statistical-type of air quality
-------
modeling. The question may be asked as to whether, with an appro-
priate analysis, a source-oriented statistical-type of air quality
model could be developed which did not involve prior specification
of meteorological dispersion functions per se and incorporation of
these as in present air quality models. My thought here is that
for given "meteorological conditions" these dispersion functions
play the role of transfer functions between the air quality distri-
bution and the distribution of pollutant emissions, and if one were
smart enough might therefore conceivably be obtained empirically by
a mathematical inversion technique (as, for example, by numerical
solution of sets of integral equations) utilizing accumulated data
on the distributions of air quality and emissions. If this could
be accomplished then maybe a major shortcoming of the current sta-
tistical models could be removed and we should then in effect have
an alternative to the customary meteorological-dispersion type of
modeling.
These concepts are the genesis of the ideas in this report. They
are elaborated upon at considerable length by Mr. Calder in a memorandum
written in support of the work on this task, which is included as an Appendix.
The difficulties in developing a source-oriented empirical model can
be stated from a statistical point of view. The spatial distribution of
pollutant concentrations over a region is determined by emissions and
meteorological conditions. The number of variables determining the con-
centration at a given point is tremendous, particularly since emissions
-------
arise from a large number of point sources and area sources. Consequently
the number of emission variables alone can easily be in the hundreds.
If an empirical model were to be developed in the most obvious manner,
one would attempt to relate the pollutant concentration at a
given point to all the possible emission variables and meteorological
variables affecting the concentration at that point. Since determining
the relationship between emission/meteorological variables and
concentration requires examples of that relationship over a very wide
range of emission and meteorological variables, a tremendous amount of
data would be required to determine this relationship adequately.* Further,
to obtain the spatial distribution by the direct approach, measurements
of the concentration at a large number of points might be necessary.
Hence, because the amount of data required to specify the full variation
of the model in this formulation is unattainable, the most obvious approach
to developing empirical models is impractical. It is typically difficult
to obtain one reliable emission inventory, much less a variety of such
inventories from widely different emission distributions in the same
geographical area.
This difficulty can be overcome by converting an apparent disadvantage,
the diversity introduced by meteorological variation, into an advantage.
Suppose we have a point source and monitors as indicated in Figure 1.
* Suppose there were only 20 point sources and no area sources.
Further suppose that we only considered two values of emission rates
from each source, one "high" and one "low." There are 2^20 (over one
million) combinations of values these 20 point sources can take. It
would be impossible to construct a practical experiment to sample even
a fraction of this diversity.
-------
[Figure 1: two panels, (a) and (b), each showing the source, Monitor 1, and Monitor 2 under a different wind direction.]
Figure 1. An illustration of the effect of wind direction in
introducing diversity despite fixed monitoring sites
-------
If the wind direction never varied from the direction shown in Figure 1a
and all else was equal, monitor 1 would always measure the same concentration
and monitor 2 would measure negligible concentration. However,
since the wind blows in other directions as in Figure 1b, the situation
could be reversed and monitor 2 could measure significant concentrations.
With enough examples of the source-receptor relationship, the variation
of the concentration with distance from the point source could be deter-
mined empirically. Parameters of some plume models were, in fact, esti-
mated by taking measurements of the concentration from isolated point
sources.
In the urban environment, individual sources cannot be isolated.
Measurements are the result of contributions from a number of sources.
However, because of the wide diversity of meteorological conditions, the
concentration will vary widely at a given point, and the sources which
contribute to the concentration at that point will similarly vary. One
may then ask for a consistent source-receptor relationship which, when
summed (or integrated) over all the sources, would explain best on the
average the observed concentrations. More specifically, one could choose
the source-receptor function which minimized the average squared error in
prediction of the measured values. The feasibility of this approach is
the subject of this report. The methodology is discussed in more detail
in Section 2 and in the Appendix.
The data used to test these ideas is model-created data. Model data
was chosen for three major reasons:
(1) With model data, the source-receptor function is known and can
be compared with the function extracted from the data. With measurement
data, "truth" is unknown.
-------
(2) Area sources and point sources can be isolated and studied
separately as well as jointly.
(3) The cost of verifying and organizing measurement data would
have been beyond the scope of the present study.
The model used was the RAM model (a version of the Gaussian plume
formulation) [4]. It was developed by the Environmental Protection Agency
and is discussed further in Section 3.
One important form of source-receptor function is the Gaussian
form. This form is studied in some detail to determine the difficulty
of extracting optimal parameter values from observed data and an emissions
inventory. This approach can be viewed as a method of calibrating a
Gaussian plume model to fit the particular urban environment to which it
is to be applied. This appears to be practical, although requiring diver-
sity in the location of the monitors relative to the sources and to pre-
vailing wind directions. This topic is also discussed in Section 3.
In Section 4, the possibility of using more general source-receptor
functions is discussed. Functional forms considered as source-receptor
functions are polynomials, piecewise continuous polynomials, and fully
continuous piecewise quadratic functions. Since the data was generated
using a Gaussian form, these other forms are not as efficient as the
Gaussian form for this data; however, comparisons among the non-Gaussian
forms can be made.
The report concludes with a discussion of the implications of the
results and potential applications of the methodology.
-------
2. PROBLEM FORMULATION
In this section, we discuss the mathematical formulation of the prob-
lem, the approach used in solving the problem, and the data used in testing
that approach.
2.1 MATHEMATICAL FORMULATION
The Appendix goes into some detail in formulating the approach employed
in this report. We work with a rectangular coordinate system with x-axis
along the mean horizontal wind direction, with y-axis crosswind, and with
the z-axis vertical. Then in urban air quality models it is customary to
consider the pollutant emissions in terms of a limited number (say J) of
elevated point-sources together with horizontal area-sources, the latter
being possibly located at a few distinct heights $\zeta_s$ (say, for example, for
$s = 1,2,3$). The total concentration $\chi(x,y,0)$ at ground level at the receptor
location $(x,y,0)$ will be the sum of the concentration contribution from the
point-source distribution, say $\chi_P(x,y,0)$, and that from the area-source
distribution $\chi_A(x,y,0)$, i.e.,
$$\chi(x,y,0) = \chi_P(x,y,0) + \chi_A(x,y,0) \tag{1}$$
where
$$\chi_P(x,y,0) = \sum_{\ell=1}^{J} Q_P(\ell)\, K(x-\xi_\ell,\, y-\eta_\ell;\, 0, \zeta_\ell) \tag{2}$$
$$\chi_A(x,y,0) = \sum_{s=1}^{3} \iint_A Q_A(\xi,\eta,\zeta_s)\, K(x-\xi,\, y-\eta;\, 0, \zeta_s)\, d\xi\, d\eta \tag{3}$$
-------
and $Q_P(\ell)$ = emission rate of $\ell$-th elevated point-source,
located at position $(\xi_\ell, \eta_\ell, \zeta_\ell)$.
$Q_A(\xi,\eta,\zeta_s)$ = emission rate of horizontal area-source distribution
located at height $\zeta_s$, and $A$ denotes the total integration
domain of the area-source distributions.
$K(x-\xi,\, y-\eta;\, 0, \zeta)$ = source-receptor function; it gives the ground level
concentration at the receptor location $(x,y,0)$ resulting
from a point-source of unit strength at $(\xi,\eta,\zeta)$.
Note that this formulation includes the assumption of horizontal homogeneity,
namely, that the impact of a given source upon a given receptor depends only
upon their relative and not absolute coordinates. This assumption is true
for an urban environment only in an average sense. A single wind direction
is similarly valid only in an average sense. Finally, it should be noted
that the above formulation assumes steady-state conditions and is thus only
applicable for relatively short time-periods (of the order of one hour),
when this may be an adequate approximation providing the emissions and
meteorological conditions are not rapidly changing.
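As a rough illustration of the point-source sum in equation (2) and of the horizontal-homogeneity assumption, the following Python sketch superposes contributions from several sources using a kernel that depends only on source-to-receptor offsets. The function names, the tuple layout for sources, and `toy_kernel` are hypothetical and are not taken from the report or from RAM; any kernel with the same call signature could be substituted.

```python
import math

def point_source_concentration(receptor, sources, kernel):
    """Sum Q * K(x - xi, y - eta; 0, zeta) over all point sources, as in equation (2).

    receptor: (x, y) of a ground-level receptor, with the x-axis along the wind.
    sources:  list of (Q, xi, eta, zeta) tuples (hypothetical layout).
    kernel:   callable K(dx, dy, zeta), concentration per unit emission rate.
    """
    x, y = receptor
    return sum(q * kernel(x - xi, y - eta, zeta) for q, xi, eta, zeta in sources)

def toy_kernel(dx, dy, zeta):
    """Placeholder kernel for illustration only (not the report's function):
    zero for receptors upwind of the source, decaying with distance otherwise."""
    if dx <= 0.0:
        return 0.0
    return math.exp(-dy * dy) * math.exp(-zeta * zeta) / dx

sources = [(10.0, 0.0, 0.0, 0.0), (5.0, 1.0, 0.0, 0.0)]
total = point_source_concentration((2.0, 0.0), sources, toy_kernel)

# Horizontal homogeneity: shifting every source and the receptor by the same
# offset leaves the result unchanged, because only relative positions enter K.
shifted_sources = [(q, xi + 3.0, eta - 2.0, zeta) for q, xi, eta, zeta in sources]
shifted_total = point_source_concentration((5.0, -2.0), shifted_sources, toy_kernel)
```

Passing the kernel as an argument mirrors the report's intent of trying several functional forms for K without changing the summation itself.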
In equations (2) and (3) above it is convenient to use "source-oriented"
position coordinates, and to consider a typical ground-level receptor location
as $(x_i, y_i)$, $i = 1, 2, \ldots$
Let
$$x' = x_i - \xi, \quad dx' = -d\xi, \quad x'_{i\ell} = x_i - \xi_\ell$$
$$y' = y_i - \eta, \quad dy' = -d\eta, \quad y'_{i\ell} = y_i - \eta_\ell \tag{4}$$
-------
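Then
$$\chi_P(x_i,y_i,0) = \sum_{\ell=1}^{J} Q_P(\ell)\, K(x'_{i\ell},\, y'_{i\ell};\, 0, \zeta_\ell) \tag{5}$$
$$\chi_A(x_i,y_i,0) = \sum_{s=1}^{3} \iint_A Q_A(x_i-x',\, y_i-y',\, \zeta_s)\, K(x',y';\, 0, \zeta_s)\, dx'\, dy' \tag{6}$$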
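In the following several different source-receptor functions $K(x',y';\, 0, \zeta)$
will be considered, including the classical Gaussian form that is the
basis for the RAM-model [4]. For the latter, and with the meteorological
condition of infinite mixing depth,
$$K(x',y';\, 0, \zeta) = \frac{1}{\pi U \sigma_y(x')\, \sigma_z(x')} \exp\left\{-\frac{y'^2}{2\sigma_y^2(x')}\right\} \exp\left\{-\frac{\zeta^2}{2\sigma_z^2(x')}\right\} \tag{7a}$$
where $U$ denotes the mean wind speed, and we assume simple power-law dependencies
for the standard deviation functions, say
$$\sigma_y(x') = a_y\, x'^{\,b_y} \tag{7b}$$
$$\sigma_z(x') = a_z\, x'^{\,b_z} \tag{7c}$$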
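The Gaussian form can be sketched in a few lines of Python. The sketch assumes the usual ground-level expression with ground reflection, K = exp(-y'^2 / 2 sigma_y^2) exp(-zeta^2 / 2 sigma_z^2) / (pi U sigma_y sigma_z), together with the power laws sigma = a x'^b of equations (7b) and (7c). The function names, the argument order, and the convention of returning zero when the receptor is not downwind are illustrative assumptions, not the RAM implementation.

```python
import math

def sigma(x, a, b):
    """Power-law standard deviation of equations (7b) and (7c): a * x**b."""
    return a * x ** b

def gaussian_kernel(xp, yp, zeta, u, ay, by, az, bz):
    """Ground-level concentration per unit emission rate for a source at height
    zeta, offset (xp, yp) upwind/crosswind of the receptor (form assumed above).
    Returning 0 for xp <= 0 (receptor not downwind) is an added convention."""
    if xp <= 0.0:
        return 0.0
    sy = sigma(xp, ay, by)
    sz = sigma(xp, az, bz)
    crosswind = math.exp(-yp ** 2 / (2.0 * sy ** 2))
    vertical = math.exp(-zeta ** 2 / (2.0 * sz ** 2))
    return crosswind * vertical / (math.pi * u * sy * sz)

# Illustrative inputs only; consistent length and time units are assumed.
on_axis = gaussian_kernel(1.0, 0.0, 0.0, u=5.0, ay=0.1, by=1.0, az=0.1, bz=1.0)
off_axis = gaussian_kernel(1.0, 0.2, 0.0, u=5.0, ay=0.1, by=1.0, az=0.1, bz=1.0)
```

With sigma_y = 0.1 at this distance, the off-axis receptor sits two standard deviations from the plume centerline, so its value is exp(-2) times the on-axis value.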
Also, as in the RAM-model we will assume that the narrow-plume hypothesis
(see Appendix) may be employed in order to reduce the double integral of
-------
equation (6) to a one-dimensional integral. Thus under this hypothesis
if
$$\int_{-\infty}^{\infty} K(x',y';\, 0, \zeta)\, dy' = G(x',\zeta) \tag{8}$$
then in place of equation (6) we have
$$\chi_A(x_i,y_i,0) = \sum_{s=1}^{3} \int Q_A(x_i-x',\, y_i,\, \zeta_s)\, G(x',\zeta_s)\, dx' \tag{9}$$
which only involves values of the area-source emission rates in the vertical
plane through the wind direction and the receptor location.
For the special case of a Gaussian plume
$$G(x',\zeta) = \sqrt{\frac{2}{\pi}}\, \frac{1}{U \sigma_z(x')} \exp\left\{-\frac{\zeta^2}{2\sigma_z^2(x')}\right\} \tag{10}$$
It is seen that in this case the total area-source contribution $\chi_A(x_i,y_i,0)$
only involves the meteorological parameters $U$ and $\sigma_z(x')$, i.e., is independent
of $\sigma_y(x')$. In evaluating equation (9) numerically, the source intensity
function $Q_A$ is, in practice, piecewise constant.
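The narrow-plume reduction can be checked numerically. The Python sketch below assumes the Gaussian kernel with prefactor 1/(pi U sigma_y sigma_z), integrates it across the wind by a midpoint rule, and compares the result with the closed form sqrt(2/pi) exp(-zeta^2 / 2 sigma_z^2) / (U sigma_z). It then evaluates a one-dimensional integral of the form of equation (9) for piecewise-constant emissions at a single source height. All function names, the cell layout, and the numerical inputs are hypothetical choices made for illustration.

```python
import math

def k_crosswind_profile(yp, zeta, u, sy, sz):
    """Gaussian kernel at a fixed downwind distance, with sy and sz already evaluated."""
    return (math.exp(-yp ** 2 / (2.0 * sy ** 2)) * math.exp(-zeta ** 2 / (2.0 * sz ** 2))
            / (math.pi * u * sy * sz))

def g_closed_form(zeta, u, sz):
    """Crosswind-integrated Gaussian kernel; note that sigma_y has dropped out."""
    return math.sqrt(2.0 / math.pi) / (u * sz) * math.exp(-zeta ** 2 / (2.0 * sz ** 2))

def g_numeric(zeta, u, sy, sz, half_width=8.0, n=4000):
    """Midpoint-rule integral of the kernel over y' in [-half_width*sy, half_width*sy]."""
    h = 2.0 * half_width * sy / n
    return h * sum(k_crosswind_profile(-half_width * sy + (i + 0.5) * h, zeta, u, sy, sz)
                   for i in range(n))

def area_contribution(cells, zeta, u, az, bz, steps_per_cell=200):
    """One-dimensional upwind integral for one source height with
    piecewise-constant emissions.
    cells: list of (x_start, x_end, q) upwind intervals (hypothetical layout)."""
    total = 0.0
    for x0, x1, q in cells:
        h = (x1 - x0) / steps_per_cell
        for i in range(steps_per_cell):
            xp = x0 + (i + 0.5) * h              # midpoint of the sub-interval
            total += q * g_closed_form(zeta, u, az * xp ** bz) * h
    return total

closed = g_closed_form(zeta=0.03, u=5.0, sz=0.04)
narrow = g_numeric(zeta=0.03, u=5.0, sy=0.05, sz=0.04)
wide = g_numeric(zeta=0.03, u=5.0, sy=0.50, sz=0.04)
strip = area_contribution([(0.0, 1.0, 2.0), (1.0, 2.0, 0.5)],
                          zeta=0.03, u=5.0, az=0.038, bz=0.76)
```

The two numerical integrals use very different values of sigma_y yet agree with the same closed form, which is the independence from sigma_y noted in the text.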
The basic equations (5) and (6) (or (5) and (9)), with the Gaussian
forms for $K(x',y';\, 0, \zeta)$ and $G(x',\zeta)$, involve four unspecified parameters
-------
through the equations (7b) and (7c), namely, $a_y$, $b_y$, $a_z$ and $b_z$.* More
generally, any functional form chosen for $K$ (and therefore $G$) may have
unspecified parameters; we will denote the set of unspecified parameters
by the vector $\underline{\alpha}$. Thus for the special Gaussian form
$$\underline{\alpha} = (a_y, b_y, a_z, b_z) \tag{11}$$
The explicit dependence of the calculated concentration values on these
parameters could be indicated by the notation $\chi(x_i,y_i,0;\, \underline{\alpha})$.
The basic method employed in this study is that of choosing $\underline{\alpha}$ to
minimize the error between calculated and observed** values of concentrations.
In order to express this statement formally, we must elaborate our notation
to indicate explicitly the dependence on wind direction; thus $\chi(x_i,y_i,0;\, \theta;\, \underline{\alpha})$.
The dependence on $\theta$, in fact, involves a rotation of coordinate axes, as
shown in the Appendix. For each wind direction $\theta_j$ $(j = 1, 2, \ldots, R)$ there is a
concentration observation for each receptor location (monitoring station).
The receptor locations are denoted $(x_i,y_i)$ for $i = 1, 2, \ldots, N$, and are assumed
to be at ground level so that we may omit the symbol 0 in the $\chi$-notation.
Then the mean square error over all observations is
* We also examine later the possibility of considering U a free parameter.
For our data the wind speed was taken as a constant, of 5 m/sec.
** "Observed" in the present case is model-created test data; the technique
is, of course, intended for practical use on measured data.
-------
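$$\varepsilon^2(\underline{\alpha}) = \frac{1}{RN} \sum_{i=1}^{N} \sum_{j=1}^{R} \left[ \chi^{\mathrm{obs}}(x_i,y_i;\, \theta_j) - \chi_P(x_i,y_i;\, \theta_j;\, \underline{\alpha}) - \chi_A(x_i,y_i;\, \theta_j;\, \underline{\alpha}) \right]^2 \tag{12}$$
where $\chi^{\mathrm{obs}}$ is the observed concentration and $\chi_P$ and $\chi_A$ are given by equations (5) and (6) (or (5) and (9)).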
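A minimal Python sketch of this error measure follows, restricted to point sources for brevity. It rotates each receptor-minus-source offset into downwind and crosswind components for every wind direction, sums the kernel contributions, and averages the squared residuals over all receptors and directions. The angle convention (direction toward which the wind blows, measured counter-clockwise from the +x axis), the data layout, and `toy_kernel` are assumptions made here, not details given in the report.

```python
import math

def to_wind_frame(dx, dy, theta):
    """Rotate an offset (receptor minus source) into (downwind, crosswind) components."""
    c, s = math.cos(theta), math.sin(theta)
    return dx * c + dy * s, -dx * s + dy * c

def predicted(receptor, sources, theta, kernel, alpha):
    """Point-source prediction at one receptor for one wind direction."""
    rx, ry = receptor
    total = 0.0
    for q, sx, sy, height in sources:
        xp, yp = to_wind_frame(rx - sx, ry - sy, theta)
        total += q * kernel(xp, yp, height, alpha)
    return total

def mean_square_error(observed, receptors, sources, thetas, kernel, alpha):
    """Average squared residual over N receptors and R wind directions;
    observed[i][j] is the observation at receptor i for wind direction j."""
    total = 0.0
    for i, receptor in enumerate(receptors):
        for j, theta in enumerate(thetas):
            residual = observed[i][j] - predicted(receptor, sources, theta, kernel, alpha)
            total += residual ** 2
    return total / (len(receptors) * len(thetas))

def toy_kernel(xp, yp, height, alpha):
    """Placeholder one-parameter kernel (not the report's): plume width grows
    linearly with downwind distance at rate alpha[0]."""
    if xp <= 1e-9:                       # receptor is not downwind of the source
        return 0.0
    width = alpha[0] * xp
    return math.exp(-(yp / width) ** 2 - (height / width) ** 2) / (xp * width)

receptors = [(3.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]
sources = [(10.0, 0.0, 0.0, 0.1)]                      # (Q, x, y, height)
thetas = [2.0 * math.pi * k / 16 for k in range(16)]   # sixteen wind directions
alpha_true = (0.2,)
observed = [[predicted(r, sources, t, toy_kernel, alpha_true) for t in thetas]
            for r in receptors]
```

Because the "observations" here are created by the same kernel, the error is exactly zero at `alpha_true` and positive elsewhere, which is the property exploited by the model-created test data.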
The problem of minimizing $\varepsilon^2$ with respect to $\underline{\alpha}$ is a standard optimization
problem. Chambers provides a good recent survey of available techniques [2].
The particular technique we employed was "structured random
search" [3]; this is a rather inefficient technique, but one which does not
require calculation of derivatives and which converges under difficult conditions
(given enough time). This technique's main advantage was that we
could modify the form of the source-receptor function without modifying the
search technique. The results of applying this methodology to the test
data are discussed in Sections 3 and 4; however, we first turn to a description
of the test data.
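The appeal of a derivative-free search can be seen in a small Python sketch. The routine below is a generic random search with a shrinking step, offered only as a stand-in: the report does not describe the internals of structured random search, and nothing here should be read as that algorithm. The toy objective, the starting point, and all tuning constants are hypothetical.

```python
import random

def random_search(objective, start, step=0.3, iterations=2000,
                  patience=40, shrink=0.5, seed=0):
    """Derivative-free minimization by random multiplicative perturbations.
    A generic stand-in, not the structured random search of reference [3]."""
    rng = random.Random(seed)
    best = list(start)
    best_value = objective(best)
    failures = 0
    for _ in range(iterations):
        trial = [p * (1.0 + step * rng.uniform(-1.0, 1.0)) for p in best]
        value = objective(trial)
        if value < best_value:
            best, best_value, failures = trial, value, 0
        else:
            failures += 1
            if failures >= patience:     # stalled: search closer to the incumbent
                step *= shrink
                failures = 0
    return best, best_value

# Toy twin experiment: recover (a, b) of sigma = a * x**b from noise-free samples.
xs = [0.25, 0.5, 1.0, 2.5, 5.0, 10.0]
true_a, true_b = 0.072, 0.90
samples = [true_a * x ** true_b for x in xs]

def squared_error(params):
    a, b = params
    return sum((a * x ** b - s) ** 2 for x, s in zip(xs, samples)) / len(xs)

start = [0.1, 1.0]
best, best_value = random_search(squared_error, start)
```

The search only ever calls `objective`, so replacing `squared_error` with an error based on a different source-receptor form requires no change to `random_search`.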
2.2 THE TEST DATA
For a realistic distribution of point-sources, area-sources and
receptor locations, use was made of unpublished information from a 1968
air pollution study conducted in St. Louis, Missouri.* Data on the
elevated point-sources and ground-level receptor locations are presented
* Unpublished manuscript of National Air Pollution Control Administration,
"St. Louis SO2 Dispersion Model Study," by D. B. Turner and N. G. Edmisten,
November 1968.
-------
in Figure 2. The point sources range in strength from 9 g/sec to 2681 g/sec,
while the elevations range from 39 m to 495 m above ground level. Data for
the area-sources are represented in Figure 3, which indicates the strengths
and the heights of the sources. The area-sources were categorized as
either 20, 30 or 50 meters high. The corresponding concentration data were
generated by the EPA-developed RAM algorithm [4], which is a specific implementation
of the classical Gaussian plume formulation that considers both
point- and area-sources, with three possible heights for the latter, and
which uses the "narrow-plume" hypothesis (i.e., equation (9)) to calculate
the area-source concentration contribution $\chi_A$. A constant wind speed $U$ of
5 meters per second was employed, and sixteen wind directions at the points
of the compass were simulated. Infinite mixing depth and a neutral atmospheric
stability category were assumed. For the latter, in equations (7b)
and (7c), we have
$$a_y = 0.072, \quad b_y = 0.90, \qquad a_z = 0.038, \quad b_z = 0.76 \tag{13}$$
For this data, these values and the indicated equations are optimal and
would produce zero mean-square error. It is this result we hope to
recover from the data by the optimization procedure.
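To give a feel for the magnitudes implied by these parameter values, the short Python sketch below evaluates the two power laws at a few distances. Treating distance and sigma as kilometres is an assumption made here (suggested by the x(KM) axis label of Figure 4), and the chosen distances are illustrative.

```python
def sigma(x_km, a, b):
    """Power-law dispersion coefficient a * x**b, equations (7b) and (7c)."""
    return a * x_km ** b

A_Y, B_Y, A_Z, B_Z = 0.072, 0.90, 0.038, 0.76  # values of equation (13)

# Distances in kilometres are an assumption (see the lead-in); under it,
# sigma is returned in kilometres as well and converted to metres for display.
table = [(x, sigma(x, A_Y, B_Y), sigma(x, A_Z, B_Z)) for x in (0.25, 0.5, 1.0, 2.5, 10.0)]
for x, sy, sz in table:
    print(f"x = {x:5.2f} km   sigma_y = {1000 * sy:6.1f} m   sigma_z = {1000 * sz:6.1f} m")
```

At unit distance each sigma equals its coefficient a, so under the kilometre assumption the plume is 72 m wide (one standard deviation) and 38 m deep at 1 km downwind.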
-------
[Figure 2: map of receptor and point-source locations; the legend distinguishes receptors, single point sources, and locations with more than one point source.]
Figure 2. Location of point sources and receptors. (There are 40
receptor locations and 62 point sources. Close point sources,
such as multiple stacks, are indicated by single points.)
-------
[Figure 3: grid map of area-source squares, each labeled with an emission rate (upper number) and a source height (lower number).]
Figure 3. Area Sources: Upper number in each square
is emission rate in g/sec-m^2 (x 10^[exponent illegible]).
Lower number is height of area source in
meters.
-------
3. OPTIMIZING PARAMETERS FOR THE GAUSSIAN FORM
OF THE SOURCE-RECEPTOR FUNCTION
The data base described in Section 2.2 contains concentration values
at forty receptors for sixteen wind directions, a total of 640 values
(referred to as "actual" values). The contribution to the concentration
from point and area sources was available separately, as well as in toto.
Equations (5) and (7a) provide a prediction of the point-source
pollutant concentration at any given receptor location
once the four parameters are specified. A comparison of values predicted
by these equations versus actual values allows calculation of the root-mean-square
value of the error with a given choice of parameter values.
(See equation (12), with area sources at zero.)
With initial guesses of $a_y = a_z = 0.1$ and $b_y = b_z = 1.0$, the search routine
described arrived at values of
$a_y = 0.074$, $b_y = 0.92$, $a_z = 0.039$, $b_z = 0.77$
when the "true" values (those used to create the data) were
$a_y = 0.072$, $b_y = 0.90$, $a_z = 0.038$, $b_z = 0.76$.
The root-mean-square (RMS) error initially was 157 µg/m³ and the
maximum error over the 640 values was 1205 µg/m³; the parameter values after
100 iterations yielded an RMS error of 14 µg/m³ and a maximum error of
175 µg/m³. (Table 1 summarizes these results.) To place the size of the
final error in perspective, we note that the actual values (due to point
sources alone) are as high as 1545 µg/m³.
-------
Table 1. Point sources only; parameter values at initial, mid, and final
         iteration during search. (Windspeed is fixed at 5.0 m/sec.)

Iteration        a_y       b_y      a_z       b_z      RMS Error   Max. Error
                                                       (µg/m³)     (µg/m³)
0 (initial)      0.100     1.00     0.100     1.00     157         1205
50 (mid)         0.049     0.85     0.050     0.71     85          711
100 (final)      0.074     0.92     0.039     0.77     14          175
ACTUAL VALUES:   (0.072)   (0.90)   (0.038)   (0.76)   (0)         (0)
-------
Employing equation (9) for area sources and using only the area-source
contribution in the "actual" data, we get similarly promising
results (Table 2). Actual values of concentrations due to area sources
reach maximums of over 800 µg/m³.
The results of treating point and area sources simultaneously,
representative of the case which would be encountered with measurement
data, are listed in Table 3; the algorithm once again closely approaches
the optimum values in 100 iterations. Actual values of the total concentrations
from both point and area sources go above 1600 µg/m³.
While the search started from the initial parameter values we chose in these
cases converged toward the values used in creating the data, experimentation
indicated that this was not always the case. Small RMS errors could
be achieved with combinations of parameters significantly different in
value from those used in creating the data. As indicated in Figure 4,
rather different combinations of $a$ and $b$ yield very similar values of
$a x^b$ over the range of $x$ in which we are interested. It is clear that
an essentially equivalent combination of values should not be deemed
erroneous, since it yields an accurate empirical model. We regard this
as a characteristic of the formulation chosen for calculating $\sigma$ and do not
regard it as a difficulty of the methodology proposed. Further, in practice,
initial values for the parameters would be chosen from the literature,
and the solution obtained would be a set of values similar to the initial
values, but which minimized the prediction error. This aspect of implementation
also suggests that a good initial guess would be employed and,
thus, that convergence to an "optimum" solution would be rapid.
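The near-equivalence of different parameter pairs can be quantified with a short Python sketch. It compares the power law built from the values used to create the data, (0.072, 0.90), with the one built from the final values found for the crosswind parameters in Table 3, (0.074, 0.89), over a log-spaced range of x. The range 0.25 to 10 is taken from the axis ticks of Figure 4 and is an assumption about the distances of interest; the function names are hypothetical.

```python
def sigma(x, a, b):
    """Power law a * x**b."""
    return a * x ** b

def max_relative_difference(pair_1, pair_2, x_min=0.25, x_max=10.0, n=200):
    """Largest |sigma_1 - sigma_2| / sigma_1 on a log-spaced grid over [x_min, x_max]."""
    worst = 0.0
    for i in range(n + 1):
        x = x_min * (x_max / x_min) ** (i / n)
        s1, s2 = sigma(x, *pair_1), sigma(x, *pair_2)
        worst = max(worst, abs(s1 - s2) / s1)
    return worst

# Values used to create the data versus the final crosswind values in Table 3.
difference = max_relative_difference((0.072, 0.90), (0.074, 0.89))
```

A slightly larger coefficient paired with a slightly smaller exponent keeps the two curves within a few percent of each other across this range, which is why the search can settle on either pair with a small RMS error.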
-------
[Figure 4: curves plotted against x (km) on a logarithmic axis, with ticks at 0.25, 0.5, 1, 2.5, and 10.]
Figure 4. Plot of $a x^b$ for several values of a and b.
(The variable x is plotted on a log scale.)
-------
Table 2. Area sources only; parameter values at initial, mid,
         and final iteration during search. (Windspeed is
         fixed at 5.0 m/sec. Values a_y and b_y do not affect
         area source values.)

Iteration        a_z       b_z      RMS Error   Max. Error
                                    (µg/m³)     (µg/m³)
0 (initial)      0.100     1.00     157         1205
50 (mid)         0.028     0.89     15          69
100 (final)      0.037     0.79     6           24
ACTUAL VALUES:   (0.038)   (0.76)   (0)         (0)
-------
Table 3. Point and area sources together; parameter values at initial,
         mid and final iteration during search. (Windspeed is fixed
         at 5.0 m/sec.)

                                                       Both Point and Area Sources
Iteration        a_y       b_y      a_z       b_z      RMS Error   Max. Error
                                                       (µg/m³)     (µg/m³)
0 (initial)      0.100     1.00     0.100     1.00     157         1205
50 (mid)         0.055     0.79     0.044     0.67     79          583
100 (final)      0.074     0.89     0.036     0.74     24          194
ACTUAL VALUES:   (0.072)   (0.90)   (0.038)   (0.76)   (0)         (0)
-------
One means of examining the tendency to converge to the "optimum"
solution is through a sensitivity analysis about the optimum. If a
small change in parameter values causes a sufficiently large change in
the error criterion, then the search algorithm will tend to converge
relatively rapidly. Table 4 indicates the results of such an analysis,
where the parameters are changed one at a time from the optimum values.
Since both the RMS and maximum error are zero for the optimum values in
our test case, the errors indicated are also the changes in the given
error measures (as well as the absolute errors for the perturbed parameter
values). The results suggest that a change in any parameter will
cause a significant error, so that the search algorithm should converge
rapidly.
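The one-at-a-time procedure can be sketched in Python as follows. Each parameter is increased by a fixed fraction while the others stay at their optimum values, and the RMS and maximum differences from the unperturbed predictions are recorded. The single power law used as `toy_model` is a hypothetical stand-in for the full concentration model, and the input distances are illustrative.

```python
import math

def one_at_a_time_sensitivity(model, optimum, inputs, fraction=0.10):
    """Increase each parameter in turn by `fraction`, holding the others at their
    optimum values, and return {name: (rms_error, max_error)} relative to the
    predictions made with the unperturbed parameters."""
    reference = [model(x, optimum) for x in inputs]
    results = {}
    for name, value in optimum.items():
        perturbed = dict(optimum)
        perturbed[name] = value * (1.0 + fraction)
        errors = [abs(model(x, perturbed) - r) for x, r in zip(inputs, reference)]
        rms = math.sqrt(sum(e * e for e in errors) / len(errors))
        results[name] = (rms, max(errors))
    return results

def toy_model(x, p):
    """Stand-in for the full concentration model: a single power law a * x**b."""
    return p["a"] * x ** p["b"]

report = one_at_a_time_sensitivity(toy_model, {"a": 0.072, "b": 0.90},
                                   [0.25, 0.5, 1.0, 2.5, 10.0])
```

Because the reference predictions have zero error by construction, the reported values are both the change in the error measure and the absolute error at the perturbed parameters, as in the test case described above.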
We note that the wind speed U is included in the table as a para-
meter. One can in fact regard U as a statistical parameter to be
estimated from measurements. Hence, for a period over which the wind
speed can be assumed constant, the error between predicted and actual
concentration values can be minimized with respect to U. In our data,
the wind speed was constant for all wind directions, so that the error
over all receptors and all wind directions could be used to determine U.
Full exploration of the practicality of extending the methodology of
this report to estimating U requires further study.
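
As a minimal sketch of how U might be estimated, assume (as in the Gaussian form) that every predicted concentration is inversely proportional to U. Then the prediction at wind speed U is c_i / U, where c_i is the prediction at unit wind speed, and minimizing the squared error over s = 1/U is an ordinary linear least-squares problem with a closed-form solution. The function name and the sample numbers below are hypothetical.

```python
def estimate_wind_speed(observed, unit_speed_pred):
    """Least-squares wind speed, assuming predictions scale as 1/U.

    observed        -- measured concentrations
    unit_speed_pred -- model predictions computed with U = 1
    Minimizing sum((o_i - s * c_i)**2) over s = 1/U gives
    s = sum(o_i * c_i) / sum(c_i**2), hence U = 1 / s.
    """
    num = sum(o * c for o, c in zip(observed, unit_speed_pred))
    den = sum(c * c for c in unit_speed_pred)
    if num <= 0.0 or den == 0.0:
        raise ValueError("data do not determine a positive wind speed")
    return den / num


# Hypothetical check: synthetic observations generated with U = 5.
unit_pred = [100.0, 250.0, 40.0, 900.0]
obs = [c / 5.0 for c in unit_pred]
u_hat = estimate_wind_speed(obs, unit_pred)
```

With noise-free synthetic observations the estimate recovers the generating wind speed; with real measurements it would be the value of U that minimizes the squared error for the fixed remaining parameters.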
Forty receptors (i.e., air quality monitoring stations) are more than
are available in many monitoring systems. How many stations are required
for this methodology to be effective? The answer to this question is
heavily dependent on the number and distribution of sources, but the
indications from experiments with our test data suggest that a considerably
-------
Table 4. Sensitivity analysis about optimum values. Error is over 40
receptors at 16 wind angles (640 samples). Values are changed
about 10%, except for wind speed U, where the change is 4%.

             Value Change       Point Sources Only   Area Sources Only   All Sources
                                     (μg/m³)              (μg/m³)          (μg/m³)
Parameter    From       To       RMS      Max.        RMS      Max.      RMS      Max.
             Optimum
a_y          0.072      0.079    16       175         0*       0*        16       175
b_y          0.90       0.99     24       234         0*       0*        24       234
a_z          0.038      0.042    20       250         19       55        27       151
b_z          0.76       0.836    21       200         26       91        35       230
U            5.0        5.2      11       175         11       36        18       177

* Area source prediction not affected by this change.
-------
smaller number of stations may suffice. Table 5 indicates errors due
to changes in parameter values, one at a time, from the optimum values,
for a selection of the individual stations. The errors are sufficiently
large that one would expect that optimum parameter values could be
extracted from a small number of stations at well-chosen locations.
-------
Table 5: Sensitivity analysis. Root-mean-square and maximum error due to change in
each parameter from nominal values at selected receptors. Concentrations are
from both point and area sources.

Error (in μg/m³) at selected receptors. Each entry is RMS/MAX.

             Change                                  Receptor                                    All 40
Parameter    From    To      1       10       13      21      22      24       35        Receptors
a_y          .072    .079    9/32    67/260   17/46   10/32   11/33   14/52    44/175    16/175
b_y          .90     .99     15/59   23/63    22/69   20/62   20/66   10/31    46/175    24/234
a_z          .038    .042    24/49   32/102   24/45   27/51   16/52   33/109   47/177    27/151
b_z          .76     .836    32/66   44/111   29/62   37/80   23/69   27/65    52/178    35/230
U            5.0     5.2     15/25   18/30    19/44   18/33   16/39   14/37    45/177    18/177
-------
4. MORE GENERAL SOURCE-RECEPTOR FUNCTIONS
Since the concentration test data used in this report were generated
by a Gaussian-form source-receptor function, no other form can do better
in predicting this particular data. One can, however, examine more general
forms as an initial assessment of difficulty, and to obtain an indication
of the types of functions which might prove useful for application to real
air quality data. In the analysis of this section, only point-source data
was used; i.e., the area-sources were disregarded.
We repeat Eq. (5) here for convenience:
$$\chi_P(x_i, y_i; \theta_j) = \sum_{\ell=1}^{J} Q_P(\ell)\, K(x'_\ell, y'_\ell, \zeta_\ell) \tag{5}$$
Note that the source-receptor function K is a function of three variables.
If this function of three variables is parameterized, with a vector of
parameters α, then χ_P depends on the values of these parameters and the
mean-square error [equation (12) with χ_A = 0] depends on α. Thus, the procedure
we employed in finding the optimal parameters of the Gaussian form is
directly applicable to any parameterized form. The difficulty and
computational cost, however, increase with the number of parameters.
Multivariate polynomials are an obvious choice to explore. We tried
a function of the form of the exponential of a general second-order
polynomial in three variables:
$$K = \exp\left[\alpha_1 x' + \alpha_2 y' + \alpha_3 \zeta + \alpha_4 x' y' + \alpha_5 x' \zeta + \alpha_6 y' \zeta + \alpha_7 (x')^2 + \alpha_8 (y')^2 + \alpha_9 \zeta^2 + \alpha_{10}\right],$$
with ten parameters. The exponential ensures that the function K will
not take negative values. The resulting RMS error after 20 iterations
was 152 μg/m³; the maximum error was 3130 μg/m³.
The exponential of a general third-order polynomial in three variables
(with 20 parameters) was similarly tested. After 20 iterations, the RMS
error was 126 μg/m³ and the maximum error was 2570 μg/m³.
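
The parameter counts quoted above (ten for second order, twenty for third order) follow from counting the monomials of total degree at most n in three variables. The sketch below enumerates them and evaluates the exponential-of-polynomial form; the function names and the coefficient ordering are assumptions made for illustration, not the ordering used in the report.

```python
import math
from itertools import product


def monomial_exponents(order, n_vars=3):
    """All exponent tuples (i, j, k) with i + j + k <= order."""
    return [e for e in product(range(order + 1), repeat=n_vars)
            if sum(e) <= order]


def exp_poly_kernel(xp, yp, zeta, coeffs, order):
    """K = exp(polynomial in x', y', zeta); always positive.

    coeffs are matched to monomial_exponents(order) in enumeration
    order (a hypothetical convention).
    """
    exps = monomial_exponents(order)
    if len(coeffs) != len(exps):
        raise ValueError("expected %d coefficients" % len(exps))
    poly = sum(c * xp ** i * yp ** j * zeta ** k
               for c, (i, j, k) in zip(coeffs, exps))
    return math.exp(poly)


n_second = len(monomial_exponents(2))  # constant + 3 linear + 6 quadratic
n_third = len(monomial_exponents(3))
```

Because the polynomial sits inside an exponential, any real coefficient vector yields a strictly positive K, which is the property the text relies on.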
The exponential of a continuous piecewise linear form with 28 free
parameters was tested. This form is discussed in some detail elsewhere [3].
The RMS and maximum errors after 20 iterations were 140 and 2808 μg/m³
respectively. This error is less than that of the second-order polynomial, but
more than that of the third-order polynomial. Since the "curvature" of the Gaussian
form from which the data was created is better suited to polynomial than
piecewise-linear approximation, this is not an unexpected result for the
test data; however, this comparison may not yield the same result on real
data.
Examination of the data suggested a difficulty in fitting a continuous
functional form. Most of the data points corresponded to moderate to low
SO2 concentration levels; there were a relatively small number of high
concentration values. The continuous forms used tended to make large errors
at the high values in attempting to minimize the errors at the moderate/low
values (which predominated). As an experiment, we divided the data into two
sets corresponding roughly to high concentration data and medium/low
concentration data. Explicitly, data for which (x', y', ζ) is such that
$$g(x', y', \zeta) < 0,$$
where
$$g(x', y', \zeta) = (x' - 2000)^2 + 100\,(y')^2 + 400\,\zeta^2 - 8 \times 10^6, \tag{15}$$
is considered separately from data where g(x', y', ζ) ≥ 0. The first set are
values where the source-receptor function tends to peak; the second set,
where the source-receptor function takes smaller values. Equation (15) was
developed by "eyeball" examination of the data.
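
A minimal sketch of the region split defined by equation (15) follows. The function `g` transcribes the equation directly; the labels "peak" and "tail" are names chosen here for the two data sets and do not appear in the report.

```python
def g(xp, yp, zeta):
    """Region indicator of equation (15); negative near the kernel peak."""
    return (xp - 2000.0) ** 2 + 100.0 * yp ** 2 + 400.0 * zeta ** 2 - 8.0e6


def region(xp, yp, zeta):
    """Return 'peak' where g < 0 and 'tail' where g >= 0 (labels assumed)."""
    return "peak" if g(xp, yp, zeta) < 0.0 else "tail"


def split(samples):
    """Partition (x', y', zeta) triples into the two fitting sets."""
    peak = [s for s in samples if g(*s) < 0.0]
    tail = [s for s in samples if g(*s) >= 0.0]
    return peak, tail
```

For example, a source-receptor pair with x' = 1500, y' = 0 and ζ = 50 gives g = 250,000 + 1,000,000 - 8,000,000 < 0 and so falls in the first set, while moving the receptor 300 m crosswind adds 9,000,000 and moves the pair to the second set.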
We then searched simultaneously for the best pair of third-order
polynomials, one for each region - forty parameters in all. Having obtained
the result after 20 iterations, we eliminated all terms in the two
polynomials with near-zero coefficients, leaving 29 parameters, and performed
twenty more iterations. The following equation resulted:
For g(x', y', ζ) < 0,
$$K(x', y'; 0, \zeta) = \exp\left[a_1 \zeta^2 y' + a_2 x' y' \zeta + a_3 (x')^2 y' + a_4 (y')^3 + a_5 \zeta + a_6 y' \zeta + a_7 x' y' + a_8 y' + a_9 \zeta x' + a_{10} (x')^2 + a_{11} x' \zeta + a_{12} x' + \cdots\right]$$
For g(x', y', ζ) ≥ 0,
$$K(x', y'; 0, \zeta) = \exp\left[b_1 y' \zeta + b_2 (y')^2 + b_3 \zeta + b_4 y' + b_5 (y')^2 x' + b_6 (x')^3 + b_7 x' \zeta + b_8 \zeta^2 + b_9 x' y' + b_{10} (x')^2 + b_{11} x' + b_{12} (x')^2 y' + b_{13} (x') \cdots\right] \tag{16}$$
where the values of the coefficients are as given in Table 6. The RMS
error was respectable, but the maximum error was still large. The results
for all forms are listed in Table 7. Table 8 provides an assessment of the
predicted versus actual values at the maximum error for each wind direction.
It is instructive to look at the functions graphically. In Figures 5-8,
"slices" of the true (Gaussian) form are compared with the same slices of
the empirically derived form represented by equations (15) and (16).
While these results suggest that these more general forms (and perhaps
others) may prove useful, we do not feel that significant conclusions about
this question can be reached with test data from a Gaussian form. Referring
to Figures 5 and 7, we note that the Gaussian form implies rather extreme
rates of change in concentration and very flat tails which are very difficult
to approximate by anything other than a Gaussian form. For actual measure-
ment data, the many errors in measuring air quality and meteorological
-------
Table 6. Values of coefficients in source-receptor function.

 i     a_i                 b_i
 1      3.811 x 10^-7       7.520 x 10^-8
 2     -5.019 x 10^-9      -7.502 x 10^-6
 3     -3.312 x 10^-11     -9.430 x 10^-3
 4     -1.074 x 10^-7      -5.505 x 10^-3
 5     -1.203 x 10^-1       5.318 x 10^-11
 6      3.544 x 10^-5       5.258 x 10^-15
 7      2.242 x 10^-6       2.769 x 10^-7
 8     -3.008 x 10^-2      -5.209 x 10^-5
 9     -6.632 x 10^-9       1.354 x 10^-7
10     -1.403 x 10^-10     -1.125 x 10^-9
11      9.039 x 10^-6       4.405 x 10^-9
12     -3.614 x 10^-4      -7.290 x 10^-13
13      1.134 x 10^-13     -1.657 x 10^-12
14      8.042               3.842 x 10^-10
15      -                   5.694 x 10^-1
-------
Table 7. Errors for all forms

Functional Form                         RMS Error (μg/m³)   Max Error (μg/m³)
2nd order polynomial                    152                 3130
3rd order polynomial                    126                 2570
Continuous piecewise-linear             140                 2808
3rd order polynomial (split regions)    40                  550
Gaussian form (from Table 1)            14                  175
-------
Table 8. Prediction of maximum concentration for each
wind direction by equations (15) and (16).

Wind Angle (deg.)   Actual SO2 (μg/m³)   Predicted SO2
 22.5               2409                 2448
 45.0               1286                 1391
 67.5                681                  703
 90.0                884                  933
112.5                740                  847
135.0                912                  891
157.5                792                  888
180.0               2577                 2027
202.5               1561                 1572
225.0                711                  749
247.5                765                  782
270.0                914                  926
292.5                587                  552
315.0                532                  534
337.5                525                  561
360.0                985                  992
-------
Figure 5. Source-receptor function versus y' (crosswind distance in meters). RAM model, Gaussian kernel; upwind distance x' = 1500; ζ = source height (= 40, 50, 60, 100, 150 m). [Plot not reproduced.]
-------
Figure 6. Source-receptor function versus y' (crosswind distance in meters). Empirical source-receptor function derived by best-fit; upwind distance x' = 1500; ζ = source height. [Plot not reproduced.]
-------
Figure 7. Source-receptor function versus y' (crosswind distance in meters). RAM model, Gaussian kernel; source height ζ = 50 m; x' = upwind distance (one curve is labeled x' = 1000 meters). [Plot not reproduced.]
-------
Figure 8. Source-receptor function versus y' (crosswind distance in meters). Empirical source-receptor function derived by best-fit; source height ζ = 50 m; x' = upwind distance. [Plot not reproduced.]
-------
variables, as well as in compiling emission inventories, may make a relatively
simple form as effective as the Gaussian form in obtaining an acceptable
average error. Since we can only hypothesize at present on the practical
effectiveness of such other forms, this will not be discussed further.
-------
5. CONCLUSIONS
This report develops a methodology for deriving a class of source-
oriented empirical models for determining spatial concentration distri-
bution of an air pollutant, given emissions and a categorization of the
meteorological conditions. It is noted that direct empirical modeling of
the relationship between emissions/meteorology and the resulting air
quality distribution is in most cases impractical. The number of emis-
sions variables and potential receptor locations of interest lead to im-
practical requirements on the amount of data and its variability. On
the other hand, given a variation of the wind direction, the many source-
receptor pairs provide a wide sampling of source strengths and receptor
impacts. Since the isolated effect of a given source is not generally
available, it is proposed that a source-receptor function be empirically
derived by minimizing the error in observed values of the total concentra-
tion due to many sources versus model-predicted values. Thus, the source-
receptor function estimating the contribution of a source of a given
strength for an arbitrary receptor location is determined, as opposed to
a direct relationship between total emissions and air quality. This
source-receptor function could then be used in the classical manner to
provide a source-oriented air quality simulation model by summing the con-
tributions of different point and area sources at any receptor location.
It is first suggested that the well-known Gaussian form source-
receptor function can be utilized in this manner; the parameter values
of that formulation could be determined empirically to minimize the mean-
square error between predicted and actual values. This approach could be
considered as a calibration of a Gaussian-form model to a particular urban
environment. This concept was tested using data generated from a model
which employs the Gaussian form and, hence, for which the "true" parameters
are known. The results indicated that extraction of these parameter values,
even with a poor initial guess, was quite feasible. However, it was fur-
ther noted that the empirical parameter values obtained by the best-fit
procedure may not be unique, and that different sets may lead to concen-
tration predictions with similar error characteristics relative to the
"true" set.
The use of forms other than the Gaussian form was also examined.
The use of test data generated from the Gaussian form did not allow any
definitive conclusions in regard to the practical utility of other forms.
It was, however, demonstrated that it was feasible to extract parameters of
other forms which give a good approximation when applied to the test data.
This report is intended to demonstrate a methodology and to suggest
that there do not appear to be any fundamental difficulties in extending
this to measured data. In applying the methodology to practical situa-
tions, however, some extensions are required. For example, the Gaussian
form utilized assumed a single stability category; in practice, a number
of stability categories and hence a larger number of parameters would be
employed. There are other questions such as the number of monitoring sta-
tions required to allow adequate estimation of the parameters required.
The requirements in this regard are sufficiently dependent upon the number
and location of sources that there is no obvious way to approach them in
a general sense. In practice, the question would be answered by the
accuracy of the resulting model. It should perhaps be emphasized that
this methodology will yield the model of the form chosen which gives the
minimum mean-squared error in forecasting pollutant concentrations over
the data base utilized, a characteristic which generates confidence in
the future use of the model in the same area.
-------
REFERENCES
1. Chambers, John M., "Fitting Nonlinear Models: Numerical Techniques,"
Biometrika, Vol. 60, No. 1, 1973, pp. 1-13.
2. Meisel, W. S., Computer-Oriented Approaches to Pattern Recognition,
Academic Press, New York, 1972, pp. 51-53.
3. Horowitz, Alan, Meisel, W. S., and D. C. Collins, The Application of
Repro-Modeling to the Analysis of a Photochemical Air Pollution Model
(EPA-650/4-74-001), December 1973.
4. Hrenko, Joan M. and D. B. Turner, "An Efficient Gaussian-Plume Multiple
Source Air Quality Algorithm", Paper 75-04.3, 68th Annual APCA Meeting,
Boston, June 1975.
-------
APPENDIX
The Feasibility of Formulation of a Source-Oriented
Air Quality Simulation Model that Uses Atmospheric
Dispersion Functions Empirically Derived from Joint
Historical Data for Air Quality and Pollutant Emissions
Kenneth L. Calder
Meteorology Laboratory, Environmental Protection Agency,
Research Triangle Park, North Carolina
INTRODUCTION
The multiple-source simulation model for urban air quality is now
well-known and in common use. It provides estimates of the spatiotemporal
distribution of concentration of an air pollutant, in terms of the corres-
ponding distribution of the pollutant emissions, and involves the use of
atmospheric dispersion functions that express the quantitative effects of
atmospheric transport and diffusion under the meteorological conditions
that are occurring. These meteorological dispersion functions need to be
specified in advance, i.e., in some a priori fashion. This usually involves
the analysis of data from some special ad hoc field experiments that are
made to characterize the diffusive power of the lower atmosphere in a fairly
general fashion. The tests need to be conducted under a variety of meteoro-
logical conditions and for some simple canonical configuration of the emissions,
e.g., from a single point source. Because of their fundamental source-oriented
structure these urban air-quality simulation models are widely used in analyzing
the effects on air quality of hypothetical and arbitrary emission control
strategies. They thus provide a rational basis for air quality management
based on control of selective sources of pollution. The development and
improvement of such air quality simulation models is being very actively
pursued in most industrialized countries at the present time.
(Footnote: This paper was prepared as a basis for discussion between the contractor
and the project officer. It is included here for historical background and
to elaborate certain points in the report. It is not intended to provide a
complete discussion of the subject.)
In strong contrast are those air quality "models" that primarily
involve some form of statistical regression analysis, and which depend
entirely on the availability of extensive meteorological and air quality
data for a particular urban location. Although such developments have had
useful applications for specific problems, the fact that they are receptor-
rather than source-oriented, and do not normally involve any explicit input
of information concerning pollutant emissions, has rendered them of very
limited value for studies of air pollution control strategies. The belief,
historically, that this failure was an inherent characteristic of statistical
models has possibly led to some neglect in the study of their full potential.
Also, of course, there has been the feeling that insofar as the statistical
models are empirically established they would be specifically restricted in
application, e.g., as regards geographical location, meteorological regimes, etc.
The present study grew from the idea that the atmospheric dispersion
functions of the conventional source-oriented air-quality simulation model
play the role of transfer functions, as between the distribution of pollutant
emissions and the air quality distribution. They might therefore possibly
be obtained empirically, through an appropriate mathematical inversion tech-
nique, from accumulated data on the joint distributions of air quality
and emissions. In this case these empirically determined functions could
then be used in a conventional way as a basis for a source-oriented air
quality model, for prediction of air quality from arbitrary or hypothetical
distributions of emissions. The possibility might in this way be provided
for developing an empirical-statistical source-oriented model in terms of
the large mass of accumulated historical data on air quality, rather than
through the input of a priori dispersion functions. If this could be done
it might have the advantage of utilizing dispersion functions that were
determined directly from the actual conditions of urban dispersion.
The present paper attempts to provide a preliminary theoretical dis-
cussion and examination of the feasibility of such a formulation. It does
not contain numerical examples as these are the subject of an ongoing
research project. However, it has seemed worthwhile to draw some attention
to these ideas at an early stage in the hope that, with more sophisticated
mathematical and statistical formulation, considerable improvement and gener-
alization may be possible.
GENERAL FORMULATION OF A MULTIPLE-SOURCE AIR QUALITY MODEL
The starting point for most urban air quality models is the assumption
of a quasi-steady state. Thus, in spite of the obvious long term variability
of pollutant concentrations and the meteorological conditions affecting
transport and diffusion, it is assumed that this variability can be treated
as though it resulted from a sequence of steady-state situations. The
sequence interval is normally taken to be quite short and perhaps only of
the order of one hour. For pollutants that can be regarded as chemically
inert it is assumed that the concentration contributions produced at a
receptor location from several sources combine additively. Under these
circumstances all possible cases of emission can be subsumed mathematically
by considering a volume distribution of emissions, and writing the total
mean concentration χ(x, y, z) [for a rectangular coordinate system with
the plane z = 0 at the ground surface] for any "steady-state" period,
that results from superposition of the concentration fields from all the
sources, as a triple integral,
$$\chi(x, y, z) = \iiint_V Q_V(\xi, \eta, \zeta)\, R(x, y, z; \xi, \eta, \zeta)\, d\xi\, d\eta\, d\zeta \tag{1}$$
where
Q_V(ξ, η, ζ) = steady emission rate per unit volume and per unit
time at position (ξ, η, ζ)
R(x, y, z; ξ, η, ζ) = mean concentration at (x, y, z) produced by a
steady point-source of unit strength located at
(ξ, η, ζ)
and the integration extends over the entire volume V occupied by the
source distribution.
Using the formal device of the Dirac delta function the above general
formulation of the superposition principle includes the case of the
area-sources and point-sources that are more normally considered in air quality
simulation models. Thus, for an extended, horizontal area-source, located
at height ζ = ζ_0, and of strength Q_A(ξ, η) per unit area per unit time
$$Q_V(\xi, \eta, \zeta) = Q_A(\xi, \eta)\, \delta(\zeta - \zeta_0) \tag{2}$$
so that equation (1) then gives (from the sifting property of the δ-function
when under the integral sign)
$$\chi(x, y, z) = \iint_A Q_A(\xi, \eta)\, R(x, y, z; \xi, \eta, \zeta_0)\, d\xi\, d\eta \tag{3}$$
and the integral extends over the entire area A of the area-source distribution.
Similarly for a single point-source of strength Q_P at (ξ_0, η_0, ζ_0)
Q_V(ξ, η, ζ) = Q_P
-------
which represents the effect at (x, y, z) of unit concentrated source at
(ξ, η, ζ), is often known as the influence or transfer function of the
problem. If the distribution of causes is prescribed and the influence
function is known, then the equation permits determination of the effect
by direct integration. However, if it is required to determine a
distribution of causes that will produce a known distribution of effects, the
above equation is a Fredholm integral equation to determine Q_V(ξ, η, ζ).
The kernel is then identified with the influence function of the problem.
In the above the kernel is a function of six variables.
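That the above is so, can readily be seen from the following heuristic
consideration (which, however, is a basis for numerical solution of the
integral equation). In equation (1) above, let a, A, b, B, c, C denote
constants that define the region occupied by the emissions and receptor
locations, so that a ≤ (x, ξ) ≤ A, b ≤ (y, η) ≤ B and c ≤ (z, ζ) ≤ C
(this involves no loss of generality since Q_V is zero outside the actual
region of emissions).
Let
$$\xi_\ell - \xi_{\ell-1} = \Delta\xi = \frac{A - a}{L} \quad (\ell = 1, 2, \ldots, L)$$
$$\eta_m - \eta_{m-1} = \Delta\eta = \frac{B - b}{M} \quad (m = 1, 2, \ldots, M)$$
$$\zeta_n - \zeta_{n-1} = \Delta\zeta = \frac{C - c}{N} \quad (n = 1, 2, \ldots, N)$$
(L, M, N, ℓ, m, n integers)
Then the integral equation (1) is the limiting form for
(Δξ, Δη, Δζ) → 0, i.e., (L, M, N) → ∞ of the equation
$$\chi(x, y, z) = \Delta\xi\, \Delta\eta\, \Delta\zeta \sum_{\ell=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} Q_V(\xi_\ell, \eta_m, \zeta_n)\, R(x, y, z; \xi_\ell, \eta_m, \zeta_n) \tag{6}$$
This must be true when a < x < A, b < y < B, c < z < C, and in particular
when x takes the values x_1, x_2, ..., x_L, y takes the values y_1, y_2, ..., y_M, and
z takes the values z_1, z_2, ..., z_N.
Let
$$\Delta\xi\, \Delta\eta\, \Delta\zeta\, Q_V(\xi_\ell, \eta_m, \zeta_n) = Y_{\ell m n}; \quad R(x_i, y_j, z_k; \xi_\ell, \eta_m, \zeta_n) = R_{ijk\ell mn}; \quad \chi(x_i, y_j, z_k) = \chi_{ijk}$$
Then (6) reduces to the matrix equation R Y = X, i.e.,
$$\sum_{\ell=1}^{L} \sum_{m=1}^{M} \sum_{n=1}^{N} R_{ijk\ell mn}\, Y_{\ell mn} = \chi_{ijk} \tag{7}$$
[i = 1, 2, ..., L; j = 1, 2, ..., M; k = 1, 2, ..., N]
This represents LMN equations for the LMN unknowns
Y_{ℓmn} (ℓ = 1, ..., L; m = 1, ..., M; n = 1, ..., N), i.e., it is an even-determined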
linear system with as many equations as there are unknowns. However, note
that there are L²M²N² values of R involved so that the system would be very
strongly underdetermined, with fewer equations than unknowns, if the Y's
were given and the R's regarded as unknowns. In other words, the integral
relation expressed by equation (1) is not sufficient to determine the
six-variable function R in terms of specified three-dimensional distributions
χ and Q.
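
The counting argument can be made concrete with a short sketch. The function name and the returned keys are hypothetical; the arithmetic simply restates the text: LMN equations, LMN unknowns when solving for the emissions, and (LMN)² unknowns when solving for the dispersion function.

```python
def system_size(L, M, N):
    """Equation and unknown counts for the discretized integral equation."""
    n = L * M * N
    return {
        "equations": n,      # one equation per receptor grid point
        "unknown_Y": n,      # emissions unknown: even-determined
        "unknown_R": n * n,  # dispersion function unknown: underdetermined
    }


# Example: a modest 10 x 10 x 10 grid.
sizes = system_size(10, 10, 10)
```

Even on this small grid there are 1,000 equations against 1,000,000 unknown values of R, which illustrates why equation (1) alone cannot determine the six-variable function.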
HORIZONTALLY HOMOGENEOUS DISPERSION FUNCTION
In the very general formulation of the previous section, the
meteorological dispersion function R(x, y, z; ξ, η, ζ) is a function of six
independent spatial variables. This situation is, however, more general than
that assumed in most existing air-quality simulation models. These assume
that the form of the dispersion function is independent of the horizontal
location of the source, so that the function R is invariant under horizontal
translation of the source-receptor pair. Although this is obviously only
an approximation, in view of the rather inhomogeneous nature and distribution
of buildings in a city, the variation of the meteorological dispersion
with source location may nevertheless be small in comparison with variations
due to other causes, and this assumption is always made for purposes of
simplification. We shall here refer to it as that of horizontal homogeneity
of the dispersion function. In addition, it is normally assumed that the
mathematical form of R is unaffected by the direction of the airflow over
the city, provided that the x-axis of the coordinate system is always taken
along the mean horizontal wind direction over the region of interest.
We shall only make use of this assumption at a later stage in the analysis.
-------
With horizontal homogeneity a great simplification results, and the
six-variable dispersion function reduces to one of four variables, i.e.,
R(x, y, z; ξ, η, ζ) → K(x − ξ, y − η; z, ζ). For example, for the common
Gaussian-plume model R is given by (in the case of infinite mixing depth)
$$R = \frac{1}{2\pi U \sigma_y(x-\xi)\, \sigma_z(x-\xi)} \exp\left\{-\frac{1}{2}\left[\frac{y-\eta}{\sigma_y(x-\xi)}\right]^2\right\} \left[\exp\left\{-\frac{1}{2}\left(\frac{z-\zeta}{\sigma_z(x-\xi)}\right)^2\right\} + \exp\left\{-\frac{1}{2}\left(\frac{z+\zeta}{\sigma_z(x-\xi)}\right)^2\right\}\right] \tag{8}$$
where U is the mean wind speed and σ_y, σ_z are the horizontal and vertical
standard deviations for the bivariate Gaussian distribution. The latter
quantities are functions of the distance (x − ξ) downwind from the source
location and also of the atmospheric stability. We note in (8) that R is
a function of the horizontal coordinate differences (x − ξ), (y − η) and
the two (independent) variables (z − ζ) and (z + ζ).
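
A minimal sketch of a Gaussian-plume dispersion function with the dependence just described is given below. It assumes the standard form with ground reflection and infinite mixing depth, takes the standard-deviation functions as caller-supplied callables, and returns zero for receptors that are not downwind of the source; the function name and these conventions are assumptions for illustration.

```python
import math


def gaussian_plume(x, y, z, xi, eta, zeta, U, sigma_y, sigma_z):
    """Concentration at (x, y, z) from a unit source at (xi, eta, zeta).

    The x-axis points along the mean wind. sigma_y and sigma_z are
    functions of the downwind distance x - xi (assumed to be supplied).
    """
    d = x - xi
    if d <= 0.0:  # receptor is not downwind of the source
        return 0.0
    sy, sz = sigma_y(d), sigma_z(d)
    lateral = math.exp(-0.5 * ((y - eta) / sy) ** 2)
    # Direct term plus the image term that represents ground reflection.
    vertical = (math.exp(-0.5 * ((z - zeta) / sz) ** 2)
                + math.exp(-0.5 * ((z + zeta) / sz) ** 2))
    return lateral * vertical / (2.0 * math.pi * U * sy * sz)
```

Because only the differences x − ξ and y − η enter, shifting the source and receptor by the same horizontal offset leaves the value unchanged, which is the horizontal homogeneity property. At z = 0 the two vertical terms coincide, so the expression reduces to a single exponential with the prefactor 1/(π U σ_y σ_z).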
In the following, since air quality measurements are normally only
available at ground level, we shall be concerned primarily with a special
case corresponding to the concentration distribution at ground level.
In this case the basic integral relation becomes
$$\chi(x, y, 0) = \iiint_V Q_V(\xi, \eta, \zeta)\, K(x - \xi, y - \eta; 0, \zeta)\, d\xi\, d\eta\, d\zeta \tag{9}$$
where the function K is now one of only three variables. In contrast to
the problem initially considered of a 3-dimensional concentration distribution,
and for which equation (1) could be regarded as an integral equation
to determine the emissions distribution Q_V(ξ, η, ζ) if the dispersion
function R were specified, it is evident that the problem of determining
this 3-variable function Q_V from equation (9) in terms of a known
2-dimensional concentration distribution is no longer well-determined, i.e., there
is insufficient information to determine Q_V uniquely in a mathematical sense.
[This is probably most clearly seen by an argument exactly similar to that
previously developed in relation to the interpretation of equation (7).]
The same, of course, would be true if the equation (9) were to be regarded
as one for the determination of the 3-variable dispersion function K in
terms of a specified χ(x, y, 0) and Q_V(ξ, η, ζ). The latter, however, is
precisely the basis for the empirical model that is being proposed in the
present paper to determine dispersion functions from air quality and emission
data. It is therefore evident that a direct approach in terms of numerical
solution of the integral equation (9) will not be possible. Fortunately,
an alternative is available in terms of a solution in a "least-squares"
sense. In the latter we attempt to determine an approximate solution
for the dispersion function K by restricting K to membership in a family
of functions that involve a number of parameters (α_1, α_2, . . .) that
define a vector, say α. The parameter vector α thus specifies a particular
member of the family. A familiar example is the family of multivariate
polynomials, where a member is specified by a particular choice
of values for the coefficients. The approximate solution is then taken
as the "best-fitting" function of the chosen family as determined by the
method of least squares applied to the "observed" and "calculated" con-
centration values. The latter are, of course, determined by use of the
basic integral relation (9) in terms of different functions K of the
chosen family.
-------
In all urban air quality models, rather than considering a general
3-dimensional volume-source distribution, it is customary to consider
the emissions in terms of a limited number (say J) of elevated point-
sources together with horizontal area-sources, the latter being possibly
located at a few distinct heights ζ_s (say, for example, for s = 1,2,3).
Normally for any horizontal location (ξ, η) there will be associated only
a single area-source height. However, since we may always take Q_A(ξ, η, ζ_s) = 0
when there is no area-source emission at height ζ_s, we could if desired,
equally well consider three superimposed area-source distributions
(for s = 1,2,3).
In the above case Eq. (9) becomes
$$\chi(x, y, 0) = \sum_{s=1}^{3} \iint_A Q_A(\xi, \eta, \zeta_s)\, K(x - \xi, y - \eta; 0, \zeta_s)\, d\xi\, d\eta + \sum_{\ell=1}^{J} Q_P(\ell)\, K(x - \xi_\ell, y - \eta_\ell; 0, \zeta_\ell) \tag{10}$$
where
Q_A(ξ, η, ζ_s) = emission rate of horizontal area-source distribution
at height ζ_s; A denotes the total integration domain
of the area-source distributions.
Q_P(ℓ) = emission rate of ℓ-th elevated point-source, located
at position (ξ_ℓ, η_ℓ, ζ_ℓ)
In the special case of ground-level area-sources alone this reduces to
$$\chi_A(x, y, 0) = \iint_A Q_A(\xi, \eta, 0)\, K(x - \xi, y - \eta; 0, 0)\, d\xi\, d\eta \tag{11}$$
-------
In this case all three functions that are involved are two-dimensional
and equation (11) can be regarded either as an equation for the unknown
function Q_A if the functions χ_A and K are given, or alternatively as an
equation for the special two-dimensional form of dispersion function if the
functions χ_A and Q_A are given. However, the inclusion of elevated point-
sources and also of elevated area-sources, and hence the need to consider
at least a three-variable dispersion function, is vital for the consideration
of urban air quality. We are consequently forced to consider the alternative
approach that is based on the method of least-squares approximation.
APPROXIMATION OF DISPERSION FUNCTION BY METHOD OF LEAST SQUARES
As already mentioned above, the form of the dispersion function K
that appears in equation (10) is normally assumed to be independent of
the direction of the airflow over the city, provided that the x-axis of
the coordinate system is taken along the mean horizontal wind direction.
To exploit this it is convenient to introduce "source-oriented" position
variables by
$$x' = x - \xi, \qquad dx' = -d\xi, \qquad x'_\ell = x_i - \xi_\ell$$
$$y' = y - \eta, \qquad dy' = -d\eta, \qquad y'_\ell = y_i - \eta_\ell$$
Then with the x-axis still along the mean horizontal wind direction,
equation (10) may be rewritten as
$$\chi(x_i, y_i; \theta_j) = \sum_{s=1}^{3} \iint_A Q_A(x_i - x', y_i - y', \zeta_s)\, K(x', y'; 0, \zeta_s)\, dx'\, dy' + \sum_{\ell=1}^{J} Q_P(\ell)\, K(x'_\ell, y'_\ell; 0, \zeta_\ell) \tag{12}$$
where the small obvious changes in notation are made in order to express
more explicitly the functional dependence on the receptor location
(x_i, y_i) (for i = 1, 2, ...) and the wind direction θ_j (for j = 1, 2, ...) that
defines the angle between the x-axis and, say, the x̄-axis of a fixed
coordinate system. If we use overbars to denote corresponding coordinates
relative to the fixed axes O x̄ ȳ, then
$$x = \bar{x} \cos\theta_j + \bar{y} \sin\theta_j, \qquad y = \bar{y} \cos\theta_j - \bar{x} \sin\theta_j$$
$$x_i = \bar{x}_i \cos\theta_j + \bar{y}_i \sin\theta_j, \qquad y_i = \bar{y}_i \cos\theta_j - \bar{x}_i \sin\theta_j$$
$$\xi_\ell = \bar{\xi}_\ell \cos\theta_j + \bar{\eta}_\ell \sin\theta_j, \qquad \eta_\ell = \bar{\eta}_\ell \cos\theta_j - \bar{\xi}_\ell \sin\theta_j$$
$$x'_\ell = \bar{x}'_\ell \cos\theta_j + \bar{y}'_\ell \sin\theta_j, \qquad y'_\ell = \bar{y}'_\ell \cos\theta_j - \bar{x}'_\ell \sin\theta_j$$
while if Q̄_A(x̄, ȳ, ζ_s) denotes the area-source strength distribution
relative to the fixed system of axes, then
$$Q_A(x, y, \zeta_s) = \bar{Q}_A(\bar{x}, \bar{y}, \zeta_s)$$
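
The change from fixed axes to wind-aligned axes is a plane rotation through the wind angle. A minimal sketch follows; the function names are hypothetical, and the angle is taken in radians between the fixed x̄-axis and the wind-aligned x-axis.

```python
import math


def to_wind_axes(x_bar, y_bar, theta):
    """Rotate fixed-axis coordinates into wind-aligned coordinates."""
    c, s = math.cos(theta), math.sin(theta)
    return x_bar * c + y_bar * s, y_bar * c - x_bar * s


def source_oriented(receptor_bar, source_bar, theta):
    """Along-wind and crosswind separation (x', y') of a receptor
    from a source, both given in fixed-axis coordinates."""
    rx, ry = to_wind_axes(receptor_bar[0], receptor_bar[1], theta)
    sx, sy = to_wind_axes(source_bar[0], source_bar[1], theta)
    return rx - sx, ry - sy
```

Since the rotation is linear, rotating the receptor and source separately and subtracting gives the same result as rotating their separation, and the source-receptor distance is the same for every wind angle.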
To apply the method of least-squares approximation, we now restrict
the approximating dispersion function K to membership in a specified family
of functions that involves a number of parameters whose values define a
parameter vector α (= α_1, α_2, α_3, etc.). We denote a general member of this
family of functions by K{x', y'; 0, ζ; α} where K is a function of the
three variables x', y' and ζ, and the parameter vector α (the curly brackets
are used here to emphasize a specific family of functions). Denote the
value of χ obtained when this value of K is substituted in equation (12) by
χ_calc(x_i, y_i; θ_j). If there were an α such that the equation could be
satisfied exactly, the observed concentration χ_obs(x_i, y_i; θ_j) could be
predicted exactly by the function given by that α. If a perfect fit is not
possible we may determine the value of α that minimizes the mean square error
over a set of values of (x_i, y_i) and θ_j.
$$\varepsilon^2(\alpha) = \sum_i \sum_j \left[\chi_{obs}(x_i, y_i; \theta_j) - \sum_{s=1}^{3} \iint_A Q_A(x_i - x', y_i - y', \zeta_s)\, K\{x', y'; 0, \zeta_s; \alpha\}\, dx'\, dy' - \sum_{\ell=1}^{J} Q_P(\ell)\, K\{x'_\ell, y'_\ell; 0, \zeta_\ell; \alpha\}\right]^2 \tag{13}$$
Equation (13) may be minimized with respect to $\alpha$ by any number of
optimization techniques provided that the integral can be calculated,
and many numerical integration techniques are suitable for this purpose.
Also, calculation of the area-source integral can be simplified under
the "narrow plume hypothesis" that is described below and the two-
dimensional integral reduced approximately to a one-dimensional one.
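The least-squares idea can be illustrated with a deliberately small Python sketch. Everything specific in it is an assumption made for illustration: the one-parameter kernel family, the made-up source inventory and receptor network, and the crude grid search that stands in for whatever optimization technique is actually chosen. Only the point-source sum is included; the area-source integral is omitted to keep the example short.

```python
import math

def kernel(xp, yp, alpha):
    """Hypothetical one-parameter kernel K{x', y'; alpha} (illustration only).

    Zero upwind of the source; Gaussian crosswind profile whose width
    alpha * x' grows linearly with downwind distance.
    """
    if xp <= 0.0:
        return 0.0
    sigma = alpha * xp
    return math.exp(-0.5 * (yp / sigma) ** 2) / (sigma * xp)

def predict(receptor, theta_deg, sources, alpha):
    """Point-source sum only; the area-source integral is left out."""
    t = math.radians(theta_deg)
    total = 0.0
    for sx, sy, q in sources:
        dx, dy = receptor[0] - sx, receptor[1] - sy   # offsets in fixed axes
        xp = dx * math.cos(t) + dy * math.sin(t)      # downwind distance x'
        yp = dy * math.cos(t) - dx * math.sin(t)      # crosswind distance y'
        total += q * kernel(xp, yp, alpha)
    return total

def squared_error(alpha, cases, sources):
    """Sum of squared differences between observed and calculated values."""
    return sum((obs - predict(r, th, sources, alpha)) ** 2
               for r, th, obs in cases)

# Made-up inventory (x, y, emission rate) and receptor network.
sources = [(0.0, 0.0, 1.0), (1.0, 0.5, 2.0)]
receptors = [(2.0, 0.0), (3.0, 1.0), (2.5, -0.5)]
directions = [0.0, 22.5, 45.0, 337.5]

# "Observations" generated with a known alpha, so the answer can be checked.
true_alpha = 0.2
cases = [(r, th, predict(r, th, sources, true_alpha))
         for r in receptors for th in directions]

# Crude grid search over alpha; any optimizer could replace this step.
grid = [k / 100 for k in range(5, 61)]
best_alpha = min(grid, key=lambda a: squared_error(a, cases, sources))
```

Because the synthetic observations are generated with alpha = 0.2 and that value lies on the grid, the search returns it with zero error. With real observations the minimum error would be nonzero, and a multi-parameter family would call for a proper optimizer.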
A key problem is the choice of an appropriate family of parameterized
functional forms for K{ } such that the error $e^2$ will be small, but
such that the number of parameters will be small. Continuous piecewise
linear functions, as used in the recent EPA contract with Technology
Service Corporation, provide such a class of functions and are a
promising candidate for achieving a feasible solution. Whatever form
of approximating function is used, however, it may be possible to make some
simplification by assuming specific dependencies on some meteorological
parameters, e.g., by assuming that the concentration is inversely propor-
tional to the mean wind speed, rather than extracting this dependence
empirically. Another possibility is the assumption of a Gaussian-plume
form for K{ } and then determination of the dispersion parameters empirically.
Thus from equation (8) above we might take
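$$
K\{x', y'; 0, \zeta; \alpha\} = \frac{1}{\pi U \sigma_y \sigma_z}
\exp\left\{ -\frac{1}{2} \left[ \left(\frac{y'}{\sigma_y}\right)^2 + \left(\frac{\zeta}{\sigma_z}\right)^2 \right] \right\} \tag{14}
$$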
and assume simple power-law dependencies for the standard deviation
functions, say
$$
\sigma_y(x') = a_y (x')^{b_y}, \qquad \sigma_z(x') = a_z (x')^{b_z} \tag{15}
$$
where $a_y$, $b_y$, $a_z$, $b_z$ are constants. Then the parameter vector $\alpha = (U, a_y, b_y, a_z, b_z)$.
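A short Python sketch shows how a Gaussian-plume kernel with power-law standard deviations could be evaluated for a given parameter vector. It assumes the standard ground-level Gaussian-plume expression with full ground reflection; the function names and the numerical inputs are illustrative, and consistent units for wind speed and distance are left to the user.

```python
import math

def sigma(a, b, x_down):
    """Power-law standard deviation a * x'**b."""
    return a * x_down ** b

def gaussian_kernel(x_down, y_cross, zeta, U, a_y, b_y, a_z, b_z):
    """Ground-level concentration per unit emission from a source at height zeta.

    Assumes the standard Gaussian-plume form with ground reflection and
    returns 0 upwind of the source (x' <= 0).
    """
    if x_down <= 0.0:
        return 0.0
    s_y = sigma(a_y, b_y, x_down)
    s_z = sigma(a_z, b_z, x_down)
    exponent = -0.5 * ((y_cross / s_y) ** 2 + (zeta / s_z) ** 2)
    return math.exp(exponent) / (math.pi * U * s_y * s_z)

# Centerline value one distance unit downwind of a ground-level source.
k0 = gaussian_kernel(1.0, 0.0, 0.0, U=5.0,
                     a_y=0.072, b_y=0.90, a_z=0.038, b_z=0.76)
```

In a fitting exercise the five numbers passed after the coordinates would be the components of the parameter vector being adjusted.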
Finally, we consider the simplification using the "narrow plume
hypothesis" ["A Narrow Plume Simplification for Multiple Source
Urban Pollution Models," unpublished note, K. L. Calder, 31 Dec. 1969].
Evidently, if the concentration in a point-source plume decreases rapidly
with crosswind distance from the plume centerline, then in the integration
with respect to $y'$ in equation (12), the K function will be small except
for small values of $|y'|$. We shall assume that this distance is
sufficiently small so that the $y'$ variations of the area-source functions
$Q_A$ can be disregarded. This idealization can be thought of in a formal
manner, by using the Dirac $\delta$-function, and writing
$$
K(x', y'; 0, \zeta_s) = G(x', \zeta_s)\, \delta(y') \tag{16}
$$
where $G(x', \zeta_s)$ is thus the crosswind integrated concentration from unit
point-source, since
$$
\int_{-\infty}^{\infty} K(x', y'; 0, \zeta_s)\, dy' = G(x', \zeta_s)
$$
With (16), each area integral in equation (12) reduces to the form
$$
\int Q_A(x_i - x',\, y_i,\, \zeta_s)\, G(x', \zeta_s)\, dx'
$$
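We thus have a one-dimensional integral to evaluate rather than a two-
dimensional one. For the special case of a Gaussian plume,
$$
\int_{-\infty}^{\infty} K(x', y'; 0, \zeta_s)\, dy' = G(x', \zeta_s)
= \sqrt{\frac{2}{\pi}}\, \frac{1}{U \sigma_z(x')} \exp\left\{ -\frac{\zeta_s^2}{2 \sigma_z^2(x')} \right\} \tag{17}
$$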
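The crosswind-integration step can be checked numerically. The Python sketch below assumes the standard ground-level Gaussian-plume kernel and compares a trapezoidal integral over the crosswind coordinate with the closed-form crosswind-integrated value; the function names and the numerical inputs are illustrative assumptions.

```python
import math

def K(yp, zeta, U, s_y, s_z):
    """Gaussian-plume kernel at ground level (standard form, assumed here)."""
    exponent = -0.5 * ((yp / s_y) ** 2 + (zeta / s_z) ** 2)
    return math.exp(exponent) / (math.pi * U * s_y * s_z)

def G(zeta, U, s_z):
    """Closed-form crosswind-integrated concentration from a unit source."""
    return (math.sqrt(2.0 / math.pi)
            * math.exp(-0.5 * (zeta / s_z) ** 2) / (U * s_z))

def integrate_crosswind(zeta, U, s_y, s_z, half_width=10.0, n=2000):
    """Trapezoidal integral of K over y' within +/- half_width * s_y."""
    h = 2.0 * half_width * s_y / n
    total = 0.0
    for k in range(n + 1):
        yp = -half_width * s_y + k * h
        weight = 0.5 if k in (0, n) else 1.0   # end points get half weight
        total += weight * K(yp, zeta, U, s_y, s_z)
    return total * h

U, zeta, s_y, s_z = 5.0, 0.02, 0.072, 0.038   # illustrative values only
numeric = integrate_crosswind(zeta, U, s_y, s_z)
analytic = G(zeta, U, s_z)
```

The two values agree to many significant figures, and the crosswind spread cancels out of the integrated result, which is why only the vertical standard deviation appears in G.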
PROPOSED FEASIBILITY STUDY (19 NOV. 74)
In an initial feasibility study it seems appropriate to use air
quality estimates calculated by use of a known multiple-source
simulation model, rather than actual observations of SO2 concentrations.
However, so that a realistic emissions distribution and sampling network
may be utilized, the calculations should be made for a real situation rather
than a purely hypothetical one.
The air quality model to be used for the study will be RAM*. This
is a multiple-source Gaussian-plume model in which, for computational
efficiency, elemental area-sources (5000') are aggregated into larger
squares where this is possible. The model also estimates the area-source
concentrations using the narrow-plume approximations discussed above.
The algorithm also permits consideration of three different heights
for the area source emissions. For the St. Louis emissions data (Turner-
Edmisten) that will probably be used, there were 30 x 40 squares (5000')
and 62 point sources, and 40 receptor locations at which concentrations
will be calculated.
For the purposes of the present study the standard deviation functions
for the basic Gaussian-plume will be assumed to be simple power laws, as
given by equation (15) above, with parameter values as below (when the
$\sigma$'s and distances are both measured in kilometers).
* A paper, "An Efficient Gaussian-Plume Multiple Source Air Quality
Algorithm," by Joan M. Hrenko and D. Bruce Turner of the
Meteorology Laboratory, was presented at the 68th Annual APCA
Meeting in June 1975 at Boston.
Stability Category     a_y      b_y      a_z      b_z
A                      0.20     0.90     0.14     0.90
B                      0.16     0.90     0.080    0.85
C                      0.11     0.90     0.056    0.80
Neutral D              0.072    0.90     0.038    0.76
E                      0.051    0.90     0.023    0.73
F                      0.038    0.90     0.012    0.67
The $a_z$, $b_z$ values are given by F. Pasquill in his Table 6.IX for a
ground roughness length $z_0$ = 10 cm (see Atmospheric Diffusion, 2nd Edition,
John Wiley & Sons, 1974). The $a_y$, $b_y$ values were obtained by fitting, over
the distance range 0.1 to 10 km, the curves given in Figure 3.10 of
Meteorology and Atomic Energy, 1968.
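For convenience, the tabulated power-law constants can be held in a small lookup and evaluated at any downwind distance. The Python sketch below copies the values from the table above; the dictionary name `POWER_LAW` and the function name `sigmas` are illustrative assumptions.

```python
# (a_y, b_y, a_z, b_z) by stability category, copied from the table above.
# Standard deviations and distances are both in kilometers.
POWER_LAW = {
    "A": (0.20, 0.90, 0.14, 0.90),
    "B": (0.16, 0.90, 0.080, 0.85),
    "C": (0.11, 0.90, 0.056, 0.80),
    "D": (0.072, 0.90, 0.038, 0.76),   # neutral
    "E": (0.051, 0.90, 0.023, 0.73),
    "F": (0.038, 0.90, 0.012, 0.67),
}

def sigmas(category, x_km):
    """Return (sigma_y, sigma_z) in km at downwind distance x_km (km)."""
    a_y, b_y, a_z, b_z = POWER_LAW[category]
    return a_y * x_km ** b_y, a_z * x_km ** b_z

# At 1 km the standard deviations equal the a_y and a_z constants themselves.
s_y, s_z = sigmas("D", 1.0)
```

Since the fits are stated for 0.1 to 10 km, values requested outside that range would be extrapolations.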
Some possible tasks follow:
Task 1
Select a real urban location (e.g., St. Louis, New York, Chicago, etc.)
for which area- and point-source, short-term emissions distributions are
available for SO2. For a typical 1-hr emissions distribution use the
multiple-source Gaussian dispersion model (probably the RAM model of
Meteorology Laboratory) - for one wind speed (5 m/sec), one stability
class (neutral) and infinite mixing depth - to calculate total 1-hr
concentrations $\chi(x_i, y_i; \theta_j)$ at ground level at a number of receptor
locations $(x_i, y_i)$, $i = 1, 2, \ldots$ and for the 16 cardinal wind directions
$\theta_j = 0^\circ, 22.5^\circ, 45^\circ, \ldots, 337.5^\circ$. The concentration values will be
calculated (a) from the area sources alone, (b) from the point-sources
alone, and (c) from the point- and area-sources combined.
Use the least-squares methodology proposed, and equation (13) to
recover the meteorological dispersion function K. Determine
(a) the degree of error in predicting concentrations, for the
receptor locations and wind directions actually used to derive the empiri-
cal dispersion function,
(b) the degree of error in predicting concentrations at receptor
locations and for wind directions not used in the derivation (a measure
of interpolation accuracy),
(c) the degree of error in predicting results for a somewhat different
emissions distribution (a test of extrapolation accuracy), and
(d) how closely the empirical dispersion function agrees with the Gaussian
form used to compute input concentrations for the analysis.
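The error measures in items (a) to (c) could be computed with a small helper such as the Python sketch below. The choice of root-mean-square error as the measure, the function names, and the rule of holding out every fourth case for the interpolation test are all assumptions made for illustration; the proposal itself does not fix these details.

```python
import math

def rms_error(predicted, observed):
    """Root-mean-square difference between paired concentration values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))

def split_cases(cases, every=4):
    """Hold out every `every`-th case for the interpolation test in item (b)."""
    fit = [c for k, c in enumerate(cases) if k % every != 0]
    held_out = [c for k, c in enumerate(cases) if k % every == 0]
    return fit, held_out

# Example: eight receptor/direction cases, two of which are held out.
fit_cases, held_out_cases = split_cases(list(range(8)))
```

The dispersion function would be fitted on `fit_cases` only, and `rms_error` evaluated separately on the fitted and held-out sets.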
Task 2
Test the sensitivity of the method to the number of "observed" concen-
trations used, and to random errors in the emissions inventory.
Task 3
Extend the preceding to a range of wind speeds, atmospheric stability
classes, and to several different emissions distributions.
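The random errors in the emissions inventory mentioned under Task 2 could be simulated along the lines of the Python sketch below. The multiplicative, uniformly distributed error model, the function name, and the sample emission rates are assumptions made for illustration; the proposal does not specify an error distribution.

```python
import random

def perturb_emissions(emissions, relative_error, seed=0):
    """Scale each emission rate by (1 + e), with e uniform in
    [-relative_error, +relative_error]. A fixed seed keeps runs repeatable."""
    rng = random.Random(seed)
    return [q * (1.0 + rng.uniform(-relative_error, relative_error))
            for q in emissions]

# Illustrative inventory with errors of up to 10 percent in each source.
inventory = [1.0, 2.0, 3.0]
noisy_inventory = perturb_emissions(inventory, 0.10, seed=1)
```

Refitting the dispersion function with the perturbed inventory, for several error levels, would indicate how sensitive the recovered function is to inventory errors.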
TECHNICAL REPORT DATA
1. REPORT NO.
EPA-600/4-76-029b
3. RECIPIENT'S ACCESSION-NO.
4. TITLE AND SUBTITLE
EMPIRICAL TECHNIQUES FOR ANALYZING AIR
QUALITY AND METEOROLOGICAL DATA.
Part II. Feasibility Study of a Source-Oriented
Empirical Air Quality Simulation Model
5. REPORT DATE
June 1976
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
W. S. Meisel
M. D. Teener
8. PERFORMING ORGANIZATION REPORT NO.
TSC-PD-132-3
9. PERFORMING ORGANIZATION NAME AND ADDRESS
Technology Service Corporation
2811 Wilshire Boulevard
Santa Monica, California 90403
10. PROGRAM ELEMENT NO.
1AA009
11. CONTRACT/GRANT NO.
EPA 68-02-1704
12. SPONSORING AGENCY NAME AND ADDRESS
Environmental Sciences Research Laboratory
Office of Research and Development
U.S. Environmental Protection Agency
Research Triangle Park, North Carolina 27711
13. TYPE OF REPORT AND PERIOD COVERED
Final May 74-Dec 75
14. SPONSORING AGENCY CODE
EPA-ORD
15. SUPPLEMENTARY NOTES
This is the second of three reports examining the potential role of state-of-the-
art empirical techniques in analyzing air quality and meteorological data.
16. ABSTRACT
Meteorological dispersion functions in multiple-source simulation models
for urban air quality are usually specified on the basis of data from special field
experiments, usually involving isolated sources. In the urban environment, indi-
vidual sources cannot be isolated. One may, however, ask for a source-receptor
relationship which, when summed (or integrated) over all the sources, would
minimize the average squared error in prediction of measured values. The feasi-
bility of this approach is demonstrated by application to model-generated data,
where the source-receptor relationship is known.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
c. COSATI Field/Group
* Air pollution
* Meteorological data
* Atmospheric diffusion
* Mathematical models
* Environmental simulation
13B
04B
04A
12A
14B
18. DISTRIBUTION STATEMENT
RELEASE TO PUBLIC
19. SECURITY CLASS (This Report)
UNCLASSIFIED
21. NO. OF PAGES
66
20. SECURITY CLASS (This page)
UNCLASSIFIED
22. PRICE