April 1984
AIR QUALITY MODELS PERTAINING TO PARTICULATE MATTER
ENVIRONMENTAL SCIENCES RESEARCH LABORATORY
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
RESEARCH TRIANGLE PARK, NC 27711
-------
AIR QUALITY MODELS PERTAINING TO PARTICULATE MATTER
by
S.A. Batterman, J.A. Fay, D. Golomb, J. Gruhl
Energy Laboratory
Massachusetts Institute of Technology
Cambridge, MA 02139
Cooperative Agreement Number 809229-01
Project Officer
Jack H. Shreffler
Meteorology and Assessment Division
Environmental Sciences Research Laboratory
Research Triangle Park, NC 27711
-------
DISCLAIMER
This report has been reviewed by the Environmental Sciences Research
Laboratory, U.S. Environmental Protection Agency, and approved for
publication. Approval does not signify that the contents necessarily
reflect the views and policies of the U.S. Environmental Protection
Agency, nor does mention of trade names or commercial products constitute
endorsement or recommendation for use.
-------
ABSTRACT
This report describes an evaluation of the Particle Episodic Model
(PEM), an urban scale dispersion model which incorporates deposition,
gravitational settling and linear transformation processes into the
predecessor model, the Texas Episodic Model (TEM-8). A sensitivity
analysis of the model was performed, which included the effects of
deposition, gravitational settling and receptor grid size.
Recommendations are made to improve the performance and flexibility of the
model.
PEM was applied to a source inventory of the Philadelphia area to
provide a preliminary estimate of source apportionment. PEM modeling
employed both hypothetical and actual meteorology. Results indicate that
area source emissions dominate TSP, SO2 and sulfate concentrations at
urban receptors. A large fraction of the inhalable particles may arrive
from distant sources.
This report also contains an overview of receptor models (RMs) used
for the source apportionment of aerosols. Some diagnostic procedures for
RMs are evaluated using a synthetic data set. Described are RM trade-offs
and protocols and possible hybrid dispersion/receptor models. Issues
regarding the inter-comparison of source apportionments from receptor and
dispersion models are highlighted with reference to the 1982 Philadelphia
study.
This report was submitted in fulfillment of Cooperative Agreement
Number 809229-01 by the M.I.T. Energy Laboratory under the sponsorship of
the U.S. Environmental Protection Agency. This report covers a period
from April to October 1983 and work was completed as of November 4, 1983.
-------
ACKNOWLEDGEMENTS
The authors appreciate the guidance and assistance offered by the
Project Officers, Drs. K. Demerjian and J. Shreffler, and the comments,
discussion and data provided by Drs. R. Stevens, T. Dzubay, C. Lewis and
the scientists and engineers of the environmental agencies of the City
of Philadelphia and the States of New Jersey and Pennsylvania.
-------
CONTENTS
Abstract
Figures
Tables
1. Introduction
2. Conclusions and Recommendations
3. Evaluation of Particle Episodic Model
     PEM Verification
       Transferral and Compilation
       Comparison to TEM-8
       Deposition and Settling
       Transformation
       Area Sources
     PEM Improvements
       Deposition, Settling and Transformation
       Source Specifications
       Area Sources
       Outputs
       Miscellaneous
     Summary
4. Source Apportionment Using PEM
     Philadelphia Source Inventory
     Meteorology
     Dispersion Modeling
       Hypothetical Meteorology
       Actual Meteorology
     Discussion
5. Receptor Models
     Overview of Receptor Models
       Chemical Mass Balance Receptor Models
       Multivariate Receptor Models
       Composite Receptor Models
     Diagnostic Procedures for CMB Models
       Collinearity
       Source Selection
       Influential Characteristics
       Robust Regression
     Hybrid Receptor/Dispersion Models
     Comparisons Between Source and Receptor Models
       Evaluative Criteria
       Averaging Time
     Receptor Model Protocols
       Protocols for Monitoring and Analysis
       Modeling Protocols
       Protocols for Inter-Comparison of Receptor Models
     Summary
References
-------
FIGURES
1. Ambient downwind concentrations from test point source
   under stability A
2. Ambient downwind concentrations from test point source
   under stability D
3. Downwind ambient concentrations for stability B from point
   and area sources with transformation
4. Percent secondary sulfate (of sulfur dioxide concentrations)
   for conditions in Figure 3
5. Downwind centerline concentrations from 1 km square area source
   for various receptor grid sizes (km) under stability D
6. PEM map of SO2 concentrations for 1 point and 3 area sources
   showing receptors where concentrations are calculated
7. PEM map using 2.5 km receptor grid showing concentrations from
   area sources under stability E and 8 consecutive 45 degree
   wind shifts
8. PEM map modeling area sources as point sources and stability D,
   otherwise conditions as in Figure 7
9. Map showing location of point and area sources in the
   Philadelphia inventory
10. Map showing emission rates of coarse particles from
    Philadelphia area sources
11. Map showing emission rates of fine particles from area sources
12. Map showing emission rates of SO2 from area sources
13. Map showing receptor grid of Philadelphia area, sub-census tract
    areas, city limits and locations of monitors
14. Partial regression plots for 8 source classes
-------
TABLES
1. Comparison of ambient particulate concentrations and fraction
   of gaseous concentration for test point source for several
   deposition and settling velocities
2. Summary of TSP, SO2 and SO4 emission inventories in the
   greater Philadelphia area
3. Summary of TSP, SO2 and SO4 emission inventories
   for sources in the local Philadelphia area
4. Primary sulfate emission factors
5. TSP source class contributions from PEM using hypothetical
   meteorology under three stability categories
6. SO2 and SO4 source class contributions from PEM using
   hypothetical meteorology for three stability categories
7. Average percent TSP source class contributions from PEM using
   hypothetical meteorology
8. Average percent SO4 source class contributions from PEM
   modeling using hypothetical meteorology
9. Meteorological data used in PEM simulation
10. TSP source class contributions using actual meteorology
11. SO2 and SO4 source contributions using actual meteorology
12. Apportionment of hourly SO2 and SO4 concentrations at
    selected city receptors for 12 wind directions
13. Differences between classes of receptor and dispersion models
14. Source signatures from Quail Roost II
15. Proportionate variance decomposition of source profiles
16. Simple correlation of source profiles
17. Variance inflation factors of the source profiles
18. Results of all possible regression procedure using synthetic
    data and Mallows' Cp statistic
19. Estimated CMB source apportionment using 8 source classes
20. DFBETAS row deletion statistics for CMB of synthetic data set
-------
SECTION 1
INTRODUCTION
This is the final report for the April-October 1983 phase of the
EPA-MIT Cooperative Agreement examining aerosol models and their
applications. The principal objective of this work is to assess the two
fundamental approaches to source apportionment of aerosol concentrations:
source-oriented dispersion modeling (DM) and receptor modeling (RM). We
aim to define the strengths, limitations, areas of applicability and
possible protocols for the use of these methods in the regulatory context.
The specific objectives include:
(1) Assessment of the new Particle Episodic Model (PEM), a dispersion model
which incorporates deposition, settling and transformation processes into
the standard Gaussian plume dispersion algorithm. PEM is an extension of
the Texas Episodic Model (TEM-8). This report contains a discussion of the
verification, applicability and limitations of PEM. Suggestions are given
to improve the versatility of the model. A validation study, in which
predicted and observed concentrations are compared, is the next logical
step in the model evaluation process.
(2) Review and general comments about receptor models. Part of this
project has involved a review of the models and literature that make up the
current understanding of receptor techniques, as well as our experience
with receptor models and similar statistical problems. The current status
and capabilities of RM are discussed, with attention to the potential uses
of RM in the regulatory process and the application of diagnostic tools to
RM which might be part of the modeling protocol.
(3) Discussion of potential hybrid dispersion-receptor models. There are
a number of ways to combine source and receptor models. Currently, it is
not clear that hybrid models are feasible and desirable. However, it is
clear that contributions from each approach can be made to improve the
other technique in certain circumstances. Recommendations are made
regarding the development and application of hybrid models in inter-model
comparisons, such as the Philadelphia Study (below).
(4) Comparison of DM and RM in the Philadelphia Study. In July and August
of 1982 data were collected in the Philadelphia area, in part to compare
source apportionments from DM and RM on the same, real data. Ground
rules and certain specific tasks are required to make that comparison as
meaningful as possible. Guidelines and a discussion of issues for this
comparison are presented. Preliminary modeling results using PEM and a
source inventory compiled for the Philadelphia area are given.
Data from the Philadelphia Study were not available for this report.
Consequently, much of the analysis of receptor models contained herein is
theoretical or hypothetical. Further analysis using both real and
appropriate synthetic data will be required to determine the value and
application of the proposed approaches.
-------
SECTION 2
CONCLUSIONS AND RECOMMENDATIONS
A new dispersion model, the Particle Episodic Model (PEM), was found to
successfully incorporate deposition, gravitational settling and linear
transformation processes to the predecessor model, the Texas Episodic
Model. Thus, PEM should permit greater realism in urban scale modeling.
However, some improvements to area source calculations seem warranted.
Application of PEM to a source inventory for the Philadelphia area and a
sensitivity analysis indicate:
(1) Area source emissions may dominate TSP, SO2 and sulfate concentrations
at receptors located within an urban scale. Therefore, source inventory
data for area sources, including source strengths, operating schedules,
micro-inventories (around receptor sites) and the degree and method of
aggregation of small sources may be critical to accurate dispersion
modeling. Distant sources warrant less attention.
(2) A large fraction of the observed inhalable particle (IP) mass,
particularly sulfate, probably comes from medium and long range transport
and not from local sources. Preliminary modeling has shown that sulfate
contributions from local sources are generally only a few ug/m3, compared
with measured values which average about 24 ug/m3. However, the modeling
examined a relatively short period, and the source inventory is known to be
very approximate.
(3) Particle and gaseous concentrations from middle to far field sources
(greater than 10 km) may be sensitive to deposition velocities, with greater
effects at larger distances. For these sources, gravitational settling is
relatively unimportant. Gravitational settling may be important only for
sources located in the vicinity of monitors which emit large particles with
high settling velocities.
(4) Sulfate concentrations from distant sources are sensitive to
transformation rate, especially at low wind speeds.
Our review of the literature and experience with receptor models
indicate:
(1) The temporal and spatial variation of source profiles, and the
sensitivity of estimated apportionment to such variation is largely
unknown. The amount of data needed for representative and useful results is
also poorly defined.
(2) Generally, RM studies have been custom designed, including selection of
source signatures, filters and analysis. A high degree of subjectivity may
be involved in the interpretation and use of data and models.
(3) Each type of RM is prone to certain failures and limitations. Chemical
mass balance RMs are subject to problems of collinearity and influential
points. Diagnostic procedures can be used to determine whether these
problems exist; remedial procedures may be able to minimize their effects.
Standardized protocols may be required to ensure meaningful results and
promote appropriate uses of RMs.
Both dispersion and receptor models have useful attributes for the
source apportionment of aerosols. Dispersion models are predictive and
diagnostic. Receptor models are primarily interpretive. Standardized uses
of receptor approaches are possible; however, their applicability and
limitations need to be defined.
Hybrid models, which combine aspects from both dispersion and receptor
approaches, may have application to many air pollution problems, including
apportionment of ambient concentrations of criteria and hazardous
pollutants, visibility impairment and acid deposition. However, at
present, hybrid models are poorly developed and defined. The development
and validation of hybrid models will require an extensive data set for
various conditions and localities.
-------
SECTION 3
EVALUATION OF THE PARTICLE EPISODIC MODEL
The Particle Episodic Model (PEM) is based on the Texas Episodic Model,
Version-8. The new model successfully incorporates simplified deposition,
gravitational settling and linear transformation processes into the
predecessor model. PEM simultaneously calculates concentration and
deposition over an urban scale area in a rectangularly gridded receptor
network. A discussion of the underlying concepts and algorithms in PEM
may be found in Rao (1983).
With respect to deposition, PEM provides greater flexibility and some
increased realism over comparable models. For example, the ISC model
simulates gravitational settling by tilting the (normally horizontal) plume
centerline; deposition is modeled by the partial reflection of the source
contribution. Settling and deposition processes alter the distribution of
mass within the plume. The ISC model's treatment is not altogether
satisfactory since it does not model the modified distribution.
Consequently, the ISC model tends to overpredict concentrations near the
source and underpredict concentrations at large distances. However, as
shown later, deposition and gravitational settling only slightly affect
concentrations on the urban scale (less than 60 km), especially for the
expected range of settling and deposition velocities of inhalable particles.
PEM VERIFICATION
Code Transferral and Compilation
The PEM source code was checked for obvious errors or omissions. No
mistakes were found. The source code was then compiled on the IBM FORTRAN
VS compiler. Only one "block data" per program is permissible with this
compiler. The source code was modified accordingly.
Comparison to TEM-8
PEM was used with simple source and meteorological conditions in each
stability category. Results were compared to the TEM-8 model. This showed
good agreement provided that the TEM time step was specified to be 10
minutes. Both models use the same P-G-T dispersion coefficients, but TEM
increases these coefficients depending on averaging time. PEM uses these
coefficients for calculating one hour averages. (The basic averaging period
in PEM is one hour, in TEM, 10 minutes.) Maximum differences between the
two models were under a few percent. However, under the most stable
categories at relatively long downwind distances, maximum differences could
range up to about 30 percent. These differences occur since TEM uses
"look-up" tables for dispersion coefficients at discrete distances, while
PEM uses piece-wise approximations. While both methods should be
sufficiently accurate for model applications, the piece-wise approximations
in PEM may provide more consistent results.
Deposition and Settling
An analysis was performed to verify the model's response and sensitivity
to changes in settling and deposition velocities. The settling velocity W
increases proportionally to the particle density and the square of particle
diameter for particles ranging from about 0.05 to 300 um in diameter.
Mechanically generated particles, with diameters from about 2 to 50 um, have
settling velocities around 1 cm/s. The settling velocities of sulfate and
SO. are believed to be about 1 to 3 orders of magnitude smaller than the
above value (Sehmel, 1980).
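
The proportionality described above (settling velocity scaling with particle density and the square of diameter) is the form given by Stokes' law. The following is a minimal sketch under that assumption; the report does not state which formula PEM uses, and the viscosity value, the neglect of air density and of slip corrections, and the function name are illustrative choices, not taken from the model.

```python
# Sketch of Stokes' law settling velocity (an assumed idealization; the
# report only states W ~ density * diameter^2).

G = 9.81          # gravitational acceleration, m/s^2
MU_AIR = 1.8e-5   # dynamic viscosity of air near 20 C, Pa*s (assumed value)


def stokes_settling_velocity_cm_s(diameter_um, density_g_cm3):
    """Return the Stokes settling velocity in cm/s.

    diameter_um   -- particle diameter in micrometers
    density_g_cm3 -- particle density in g/cm^3
    Air density and slip corrections are neglected.
    """
    d_m = diameter_um * 1.0e-6            # micrometers -> meters
    rho = density_g_cm3 * 1000.0          # g/cm^3 -> kg/m^3
    w_m_s = rho * d_m ** 2 * G / (18.0 * MU_AIR)
    return w_m_s * 100.0                  # m/s -> cm/s


if __name__ == "__main__":
    for d in (2.0, 10.0, 50.0):
        w = stokes_settling_velocity_cm_s(d, 2.0)
        print(f"d = {d:5.1f} um  ->  W = {w:.4f} cm/s")
```

For a hypothetical 10 um particle of density 2 g/cm3 this gives a value of the order of 1 cm/s, consistent with the magnitude quoted in the text for mechanically generated particles.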
The deposition velocity V is the deposition flux divided by the airborne
concentration. Deposition velocities are functions of micrometeorological
variables (aerodynamic roughness); particle properties (diameter, density,
solubility); surface properties (surface roughness and moisture); and
measurement height (usually 1-1.5 m). Deposition is difficult to measure
and the uncertainties are large. Recommended SO2 deposition velocity
estimates range from 0.04 to 7.5 cm/s (National Academy of Science, 1983);
SO4 deposition velocities are generally less than 1 cm/s. For large
particles (greater than 50 urn), where gravitational settling is the primary
removal mechanism, V=W>0. In the case of a gas with perfect reflection from
surfaces, V=W=0. The effects of deposition and settling are proportional
to V/u and W/u, respectively, where u is the wind speed.
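
Because the deposition velocity is defined as the deposition flux divided by the airborne concentration, the flux follows directly from a modeled concentration. The sketch below illustrates this with the 3 km entries of Table 1 (gaseous concentration 8.677 ug/m3, ratio 0.736 for V = 1 cm/s, W = 0); the function name and the unit handling are illustrative and are not part of PEM.

```python
# Deposition flux from the definition V = flux / concentration.

def deposition_flux_ug_m2_s(concentration_ug_m3, deposition_velocity_cm_s):
    """Return the deposition flux in ug/m^2/s."""
    v_m_s = deposition_velocity_cm_s / 100.0   # cm/s -> m/s
    return concentration_ug_m3 * v_m_s


if __name__ == "__main__":
    gas_concentration = 8.677    # ug/m3 at 3 km, from Table 1
    ratio_v1_w0 = 0.736          # fraction remaining for V = 1 cm/s, W = 0
    aerosol_concentration = gas_concentration * ratio_v1_w0
    flux = deposition_flux_ug_m2_s(aerosol_concentration, 1.0)
    print(f"concentration = {aerosol_concentration:.3f} ug/m3")
    print(f"flux          = {flux:.4f} ug/m2/s")
```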
In general, deposition velocities are greater than or equal to the
gravitational settling velocity. PEM does not permit W>V>0,
representing, for example, re-entrainment of deposited particles in a
dust storm or soil erosion during high winds. Such situations can be
modeled as area sources with emission rates which depend on wind speed.
The effect of varying deposition velocity V and settling velocity W
(in cm/s) is shown in Figures 1 and 2 and Table 1. These examples use a
test point source with a 1 g/s emission rate, 0.13 m diameter stack 10 m
high, exit velocity of 35 m/s, and exhaust temperature of 400°K,
giving an effective plume release height of 17 m. Meteorological
conditions are stability A (Figure 1) and D (Figure 2), and 3 m/s
winds. For small distances (<2 km) and 3 m/s winds, increasing either
the deposition or settling velocity from 0 to 10 cm/s results in less
than 10 percent change of ambient concentrations with the test point
source under stability A (Figure 1) and about 100 percent under
stability D (Figure 2). Maximum concentrations occur about 0.5 km
downwind under stability category D.
In contrast to near field circumstances (within 1 to 2 km away),
concentrations from middle to far field sources (greater than about 10
km distant) may be very sensitive to settling and deposition
velocities. Table 1 shows the fraction of the gaseous concentration
(i.e., V=W=0) that results with selected deposition and settling
velocities using the test point source and the meteorological conditions
(Stability D) in Figure 2. For these conditions, concentrations at
large downwind distances are sensitive to small deposition velocities
(greater than 0.1 cm/s) when the settling velocity is 0. However,
ambient concentrations may be small at these downwind distances. For
example, a deposition velocity of 1 cm/s results in 74 percent of the
gaseous concentration at 3 km, and 33 percent at 60 km. For deposition
velocities above 1 cm/s, concentrations are relatively insensitive to
settling velocities from 0 to 1 cm/s. Settling velocities above 1 cm/s,
(representative of large particles) may have a large effect, producing
considerable near field deposition and low ambient concentrations at
large distances.
Figure 1. Ambient downwind concentrations from test point source under
stability A (horizontal axis: downwind distance in kilometers).

Figure 2. Ambient downwind concentrations from test point source under
stability D (horizontal axis: downwind distance in kilometers).

Table 1. Ambient concentrations and concentration ratios (fraction of gaseous
concentration) for test point source and specified deposition and
settling velocities.

Settling   Deposition         Downwind distance (km)
velocity   velocity
(cm/s)     (cm/s)          3        6       15       30       60

Ambient concentration of gas (ug/m3)
                        8.677    2.978    0.799    0.297    0.111

Ratio of aerosol concentration to gaseous concentration
  0          0.01       0.997    0.996    0.992    0.990    0.982
  0          0.10       0.969    0.957    0.934    0.909    0.874
  0          1.00       0.736    0.662    0.534    0.434    0.333
  0         10.00       0.139    0.086    0.040    0.024    0.009

  0.01       1.00       0.737    0.664    0.537    0.434    0.339
  0.10       1.00       0.749    0.667    0.552    0.451    0.342
  1.00       1.00       0.877    0.819    0.708    0.606    0.477

  2.00       2.00       0.759    0.652    0.474    0.330    0.189
 10.00      10.00       0.125    0.038    0.001    0.000    0.000

The incorporation of deposition and settling processes into PEM is
expected to produce only small changes in gaseous and IP concentrations
since local sources are the primary contributors of ambient concentrations
(excluding regional contributions). Longer distances are required for
significant settling and deposition with the typical range of velocities,
but concentrations are low at these distances. This was corroborated using
various meteorological scenarios and PEM modeling with the Philadelphia
source inventory (Section 4). Varying deposition and/or gravitational
velocities from 0 to 1 cm/s changed ambient concentrations by less than 1
percent at most receptors.
Transformation
Figure 3 shows PEM results for primary (SO2) and secondary (SO4)
pollutants from a point source and an area source of equal strength, using a
transformation rate of 1 percent per hour. No primary emissions of SO4
were assumed. Figure 4 shows that sulfate concentrations, as a fraction of
SO2 concentration, increase proportionally with residence time (or downwind
distance) for the point source. Sulfate concentrations decline, however,
with residence time due to dilution. For the area source, the sulfate
fraction increases to a downwind distance of 7 km, whereupon it vanishes due
to a cut-off in the PEM program (see below).
For low wind speeds and a typical transformation rate of 1 percent per
hour, the transformation of SO2 emitted within an urban area may
constitute a large fraction of ambient SO4 concentrations. Since primary
emissions of SO4 generally are between 1.5 and 14 percent of primary SO2
emissions (the Philadelphia inventory averaged 4.3 percent), only several
hours residence time is necessary for secondary sulfate to exceed the
primary sulfate.
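
The residence time at which secondary sulfate overtakes primary sulfate can be estimated from the transformation rate alone. The sketch below assumes simple first-order conversion at a constant rate, treats the primary and secondary fractions on the same basis (the SO4/SO2 molecular weight ratio is ignored), and neglects deposition and dilution differences; these simplifications and the function names are assumptions for illustration, not PEM's algorithm.

```python
import math

# First-order SO2 -> SO4 conversion sketch (assumed constant rate, no
# deposition, fractions expressed on a common basis).


def secondary_fraction(hours, rate_per_hour=0.01):
    """Fraction of emitted SO2 converted to sulfate after `hours`."""
    return 1.0 - math.exp(-rate_per_hour * hours)


def hours_to_match_primary(primary_fraction, rate_per_hour=0.01):
    """Residence time at which secondary sulfate equals primary sulfate.

    primary_fraction -- primary sulfate emissions as a fraction of
                        primary SO2 emissions (e.g. 0.043).
    """
    return -math.log(1.0 - primary_fraction) / rate_per_hour


if __name__ == "__main__":
    # 1.5 and 14 percent bound the range quoted in the text; 4.3 percent
    # is the Philadelphia inventory average.
    for f in (0.015, 0.043, 0.14):
        t = hours_to_match_primary(f)
        print(f"primary fraction {f:.3f}: about {t:.1f} h")
```

With the Philadelphia inventory average of 4.3 percent and a rate of 1 percent per hour, the estimate is a little over four hours.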
Concentrations from point sources beyond a distance of 60 km are
ignored in PEM. This may lower the level of predicted secondary pollutants
since there may be insufficient time for significant transformation within
60 km of travel. Concentrations from area sources are calculated for only
the four downwind grid cells, which explains the sudden drop at 7 km in
Figures 3 and 4. Consequently, the contribution of secondary pollutants
from area sources is underestimated.
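
The effect of these cut-offs can be gauged by asking how much SO2 has had time to transform before the plume reaches the cut-off distance. The sketch below assumes straight-line travel at the wind speed and first-order conversion; the function name is hypothetical, and the 7 km and 0.5 m/s values are taken from the conditions of Figures 3 and 4.

```python
import math

# Fraction of SO2 transformed by the time a plume reaches a cut-off
# distance (assumes travel time = distance / wind speed, first-order rate).


def transformed_before_cutoff(cutoff_km, wind_speed_m_s, rate_per_hour=0.01):
    """Return (travel time in hours, fraction of SO2 transformed)."""
    travel_hours = cutoff_km * 1000.0 / wind_speed_m_s / 3600.0
    fraction = 1.0 - math.exp(-rate_per_hour * travel_hours)
    return travel_hours, fraction


if __name__ == "__main__":
    cases = [
        ("point source, 60 km cut-off, 3 m/s", 60.0, 3.0),
        ("area source, 7 km cut-off, 0.5 m/s", 7.0, 0.5),
    ]
    for label, km, u in cases:
        hours, frac = transformed_before_cutoff(km, u)
        print(f"{label}: {hours:.1f} h, {100.0 * frac:.1f} percent transformed")
```

In both illustrative cases only a few percent of the SO2 is converted before the cut-off, which is why contributions of secondary pollutants beyond those distances are lost.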
Figure 3. Downwind ambient concentrations for stability B, 0.5 m/s winds. One
point source (1 g/s, 0 plume rise); no secondary emissions; one area
source (1 g/s, 100 m square); transformation rate of 1 percent/hour
(horizontal axis: downwind distance in kilometers).

Figure 4. Percent secondary sulfate (of sulfur dioxide concentrations) for
conditions in Figure 3 (horizontal axes: residence time in hours and
downwind distance in kilometers).
-------
Area Sources
Area source predictions were compared to "equivalent" ground level
point sources. PEM calculates area source contributions in only the four
downwind areas in a pattern selected by the program and within the area
source itself. If the receptor grid size chosen is very small, a zero
concentration will result not far downwind of the area source. Figures 3,
4, 5 and 6 show that SO2 and SO4 concentrations vanish at preselected
distances downwind of the area source. This treatment may underestimate the
buildup of both primary and secondary pollutants, especially for area
sources which are near or smaller than the receptor grid size. Thus,
receptor grid size should be chosen so as not to lose important area
source contributions.
Area source calculations in PEM automatically decrease the stability
category by one, in order to better simulate dispersion under urban
conditions. All else being equal, concentrations produced by stability
categories A and B are the same for area sources, while they differ for point
sources. This treatment may be anomalous since many urban point sources
have low stack heights and undergo similar dispersion conditions as area
sources.
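
The stability adjustment described above can be sketched as a simple lookup. The category letters A through F and the clamping at category A are assumptions made for illustration (the text only states that the category is decreased by one and that A and B then give identical results); the function name is hypothetical.

```python
# Sketch of the area-source stability shift: one category toward
# "more unstable", clamped at A. Letters A-F are an assumed labeling.

CATEGORIES = "ABCDEF"


def area_source_stability(category):
    """Return the stability category applied to area sources."""
    index = CATEGORIES.index(category.upper())
    return CATEGORIES[max(index - 1, 0)]


if __name__ == "__main__":
    for cat in CATEGORIES:
        print(f"input {cat} -> area sources use {area_source_stability(cat)}")
```

Under this reading, inputs A and B both map to A for area sources, which reproduces the behavior noted in the text.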
Area source modeling is sensitive to the size of the receptor grid.
Figure 5 shows downwind centerline concentrations from a 1 km square area
source with different receptor grid sizes. The maximum concentration from
the area source is produced by a 0.5 km receptor grid. Receptor grid sizes
larger than the area source (e.g., 2.5 km grid) result in low concentrations
since PEM increases the dimension of (small) area sources to the grid cell
size and correspondingly reduces the source intensity (in g/s/area) of the
area source. In comparison to the 1 km receptor grid, fine grids (e.g., 0.1
to 0.25 km) produce somewhat lower maximum concentrations and a rapid
decrease in ambient concentrations with distance.
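
The grid-size effects described above can be illustrated with a short calculation. The sketch below is one reading of the text, not PEM code: it assumes that an area source smaller than the grid cell is expanded to the cell size with its emission rate spread uniformly, and that contributions are computed for four grid cells downwind. Function names are hypothetical.

```python
# Sketch of two grid-size effects on area sources (assumed reading of the
# text; not taken from the PEM source code).


def effective_area_source(emission_g_s, source_side_km, grid_km):
    """Return (effective side in km, source intensity in g/s/km^2).

    Sources smaller than the grid cell are expanded to the cell size,
    which lowers the emission rate per unit area.
    """
    side = max(source_side_km, grid_km)
    return side, emission_g_s / side ** 2


def area_source_reach_km(grid_km, downwind_cells=4):
    """Downwind extent over which area source contributions are computed."""
    return downwind_cells * grid_km


if __name__ == "__main__":
    for grid in (0.5, 1.0, 2.5):
        side, intensity = effective_area_source(1.0, 1.0, grid)
        reach = area_source_reach_km(grid)
        print(f"grid {grid:4.2f} km: side {side:4.2f} km, "
              f"intensity {intensity:5.2f} g/s/km2, reach {reach:5.1f} km")
```

For a 1 km square, 1 g/s source on a 2.5 km grid, the intensity falls to 0.16 g/s/km2, while a fine grid preserves the intensity but shortens the downwind reach. This is the trade-off between peak concentration and coverage seen in Figure 5.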
The concentration pattern for the four receptors downwind of area
sources (selected by the program depending on the wind direction) may not be
realistic. Figure 6, a map of ground level concentrations generated by PEM,
shows concentrations produced by three area sources (0.1, 1, and 5 km on a
side) and a ground level point source without plume rise. All sources have
equal source strength. In contrast to the area sources, dispersion from the
point source affects markedly more downwind receptors. Point source
concentrations 20 km downwind are still 20 percent of peak concentrations.
Area sources influence a more restricted set of receptors and concentrations
decrease more quickly with distance. (The lower stability category used for
the area source explains only part of this effect.) Also, the pattern for
the four downwind cells leaves large gaps in coverage.

Figure 5. Downwind centerline concentrations from 1 km square area source with
specified receptor grid sizes (km); stability D; 3 m/s winds; 1 g/s
emission rate (horizontal axis: downwind distance from area source
centroid in km).

Figure 6. PEM map of SO2 concentrations from (top to bottom) 3 area and 1 point
source; each source has 1 g/s emission rate; solid lines indicate area
sources; dashed lines show receptors where concentrations are calculated.
To examine the above effects in urban scale modeling the Philadelphia
area sources were described as area sources (2.5 km square in size) in one
simulation, and as point sources (with ground level release height and no
plume rise) in a second simulation. Both coarse (10 km) and fine (2.5 km)
grids were used to obtain eight hour average concentrations under
consecutive 45 degree shifts in wind direction and stability D. Stability
categories for area sources were increased by one to permit comparison with
point sources. In the vicinity of the sources, modeling using area sources
produces higher concentrations than point sources (Figures 7 and 8).
However, since area source concentrations are calculated only over 4 grid
cells downwind, receptors located beyond 10 km (4 x 2.5 km) have zero
concentrations using area sources, but non-zero concentrations using point
sources. Effects of super-position of point source plumes may result in low
(2-3 percent of maximum) concentrations 20 or 30 kilometers downwind (seen,
for example, in the NE direction in Figure 7).
There was generally good agreement between area and point source
modeling using the coarse grid; concentrations were within 7 ug/m3 (of an
average ambient concentration of 50 ug/m3) and perimeter concentrations
were within 2 ug/m3 (average of 3 ug/m3).
In summary, area source modeling is affected by a number of factors.
The agreement between point and area source modeling depends critically on
the area source dimensions relative to the mean distance between sources.
Similarly, the agreement between area source modeling using various grid
sizes depends on the area source dimension relative to the receptor grid
size. Accurate modeling of near field circumstances may require fine
receptor grids, appropriate to the micro-inventory scale. A coarser grid
may be used to estimate mid and far field contributions. Total
concentrations would be estimated as the sum of the two predictions.
-------
Figure 7. PEM map using 2.5 km receptor grid showing concentrations from area
sources; stability E; and 8 consecutive 45 degree wind shifts.

Figure 8. PEM map modeling area sources as point sources, stability D,
otherwise conditions as in Figure 7.
-------
PEM IMPROVEMENTS
We found the PEM model easy to use (even without a user's guide).
However, the versatility and convenience of the PEM program could be
improved. Some of the recommendations in this section result from our
experience with the model; others are proposed in view of expected
applications. Implementation of the minor changes to the model, e.g.,
book-keeping operations, additional outputs and new user options, would
facilitate a sensitivity analysis and source apportionments. Other changes
might increase the realism of the model. The somewhat increased complexity,
memory storage and computer time required by the model would not be
important considerations to most model users. Incorporating changes before
the model is released would be advantageous since user modifications would
not be necessary thus eliminating further review by regulatory authorities.
A brief assessment is given below of the importance, objective, and
ease of implementing the various possible changes. Many of the suggestions
could be included as options (defaults would use the existing program).
Deposition, Settling and Transformation
The current model requires that deposition, settling and transformation
parameters pertain to all sources and meteorological conditions in an
averaging period (1 to 24 hours). The following suggestions provide a
hierarchy of increasing sophistication in handling these processes.
a. Hourly variations of deposition and transformation parameters.
Importance: Variable, depending on importance of deposition and
transformation.
Objective: Portray time varying characteristics, e.g., day/night
differences of deposition velocities.
Implementation: Programming simple. Little guidance for user.
Possibly tied to stability and wind speed inputs.
b. Source-specific deposition, settling and transformation parameters.
Importance: Moderate for calculations of TSP where emission
characteristics vary widely. Less important for gases and IP with
similar characteristics.
Objective: Increased realism by reflecting different particle size
distributions from various sources.
Implementation: Programming simple, but specification possibly
difficult (especially if parameters vary on hourly basis) due to
incomplete data.
c. Particle size stratification of deposition and settling parameters.
Importance: Generally minor, since size (weighted) average parameters
may be used in most cases.
Objective: More accurate description of particle dispersion by
reflecting distribution of particle sizes. Also permits simultaneous
calculation of TSP, IP, and gaseous pollutants.
Implementation: Programming simple. Handle as additional pollutant
(e.g., increase number of pollutants to five and sum pollutants.)
d. Site-specific deposition velocities.
Importance: Moderate, depending on whether deposition is important.
Objective: Increased realism by modeling surfaces found over large
areas, e.g., water, urban, grass.
Implementation: Programming possibly difficult since present solution
algorithm may not be suitable. More complicated if deposition changes
with time. User must specify surfaces and deposition rates (possibly
keyed to surface type).
e. Insignificant transformation rates.
Importance: Minor, depending on transformation rate, source-receptor
distances and wind speed.
Objective: Reduced CPU time by eliminating calculations for (numerous)
small local sources and meteorology which produce negligible
transformation.
Implementation: Moderate programming difficulty. Requires look-up
charts and relative estimates of extent of transformation. Invisible to
user.
Source Specifications
a. Hourly variations of source strength and other source characteristics
(exit velocity, temperature).
Importance: Small for modeling worst case conditions, possibly large
for validation and other studies, especially with significant daily
variations of emission sources.
Objective: Simulate operating schedules of major emitters.
Implementation: Programming simple. Possibly large amounts of input
data.
b. Determination of source class impacts.
Importance: Variable, depending on application.
Objective: Expands culpability option, particularly useful in SIP
revision. Also defines culpability of industrial source complex with
multiple emission sources.
Implementation: Programming straightforward, involving additional
specification of source classes and book-keeping of class impacts for
receptors. User simply specifies source class.
Area Sources
a. Make area source calculations independent of receptor grid size.
Importance: Variable, depending on importance of area sources and
receptor grid size chosen.
Objective: Increased realism of area source modeling.
Implementation: Programming possibly difficult. Invisible to user.
b. Concentration and/or other criteria for determination of number of
downwind grid cells.
Importance: Variable, depends on importance and configuration of area
sources.
Objective: Increased realism of area source modeling by using
meaningful standard rather than fixed (4 downwind receptors)
calculations.
Implementation: Programming straightforward. Invisible to user.
Outputs
a. Automatic scaling of concentration and deposition flux maps.
Importance: Minor, for convenience.
Objective: Eliminate useless outputs and model runs producing numbers
either too small or too large for display.
Implementation: Programming simple. Invisible to user.
b. Culpability list of all (or specified number) of sources at specified
receptors showing major contributors to receptor concentrations.
Importance: Normally minor, since culpability of top 10 point sources
is provided. However, expand to include area source contributions.
Objective: Facilitates source apportionment, especially in validation.
Implementation: Programming and use straightforward.
c. Hourly concentrations written on tape.
Importance: Minor, but convenient. Present option writes only final (1
to 24 hour) average concentrations on tape.
Objective: Permits analysis of critical meteorological conditions, etc.
Implementation: Trivial.
Miscellaneous
a. Hourly variations of calibration parameters, particularly intercept.
Importance: Minor.
Objective: Portray changes of background levels.
Implementation: Simple.
b. User specification of wind profile exponents and anemometer height.
Importance: Generally minor.
Objective: Tailor model to location, if estimates are available.
Implementation: Simple.
c. Incorporate varying terrain.
Importance: Potentially large with elevated sources and complicated
terrain.
Objective: Model fumigation of elevated terrain.
Implementation: Programming difficult, since algorithms may not be
suitable and deposition and settling processes poorly understood with
terrain differences. Usage simple by specifying receptor elevation.
d. Increase or eliminate cut-off distance for point sources.
Importance: Possibly important with certain emission inventories.
Objective: Extends modeling to mesoscale (30-300 km).
Implementation: Trivial.
e. Use of updated dispersion coefficients.
Importance: Important under some conditions.
Objective: Standardize models with other UNAMAP models.
Implementation: Trivial.
f. Long term version of PEM.
A long term version of PEM could be very useful. Such a model might be
similar to the ISC-LT model, which calculates annual concentrations
using site-specific meteorological data. The joint probability of the
STAR categories, i.e., each wind speed, wind direction and stability
category, is used to weight ambient concentrations for that category in
the determination of annual or seasonal average concentrations. This
would yield spatially varying long-term concentrations that might be
more valid than short-term concentrations predictions, which depend on
accurate source and meteorological data.
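The frequency weighting described above can be illustrated with a minimal sketch. It assumes that a long-term average is simply the sum of the concentration predicted for each meteorological category multiplied by that category's joint frequency of occurrence. The function name, the category keys and all numerical values are hypothetical and are not taken from PEM or ISC-LT.

```python
def long_term_average(frequencies, concentrations):
    """Weight per-category concentrations by joint frequency of occurrence.

    Both arguments are dicts keyed by (stability, wind speed, wind direction).
    Frequencies are normalized here, so they need not sum exactly to one.
    """
    total = sum(frequencies.values())
    if total <= 0:
        raise ValueError("frequencies must sum to a positive value")
    weighted = sum(freq * concentrations[cat] for cat, freq in frequencies.items())
    return weighted / total


# Hypothetical joint frequencies for (stability class, wind speed m/s, direction deg)
star = {("A", 2, 0): 0.10, ("D", 4, 90): 0.60, ("E", 3, 180): 0.30}
# Hypothetical short-term concentrations (ug/m3) predicted for each category
conc = {("A", 2, 0): 60.0, ("D", 4, 90): 20.0, ("E", 3, 180): 40.0}

annual = long_term_average(star, conc)
print(round(annual, 2))
```

A real joint frequency table would have many more categories, but the weighting step itself would be unchanged.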
SUMMARY
Of the recommended changes, we feel that the most important improvements
for increasing model credibility are those concerning area source modeling.
For sulfate modeling, hourly variation of transformation rates and
elimination of cut-off distances for point and area source calculations are
the most relevant. For TSP modeling, source-specific specification of
particle size or of deposition and settling velocities is important. We also
feel that the development of a long term version of PEM for seasonal or
annual particulate concentrations would complement existing air quality
models. Most of the remaining recommendations simply provide greater
convenience for model users.
SECTION 4
SOURCE APPORTIONMENT USING PEM
In this section we describe aerosol source apportionment in
Philadelphia using the PEM dispersion model. The source inventory,
meteorology and PEM predictions are presented. The results could be
indicative of the composition of aerosols obtained by dichotomous samplers
in the Philadelphia study.
SOURCE INVENTORY
The preliminary source inventory was compiled to obtain the
approximate composition and location of the major emission sources in the
Philadelphia metropolitan area. Sources of the data are listed at the end
of this section. The data do not include micro-inventories around receptor
sites. Present data do not allow a complete classification of sources.
However, this should not seriously handicap the apportionment since the
objective is to obtain only representative inventory and source
apportionment estimates. More detailed source and monitor data for the
Philadelphia study should permit more accurate modeling.
The inventory consists of a total of 50 area sources and 104 point
sources. The location of the sources is shown in Figure 9. Particle
emissions are broken down into 12 source classes as shown in Table 2. This
table also shows the six source classes of SO2 and (primary) SO4 sources.
The composition of the emission sources located within the city differs
from that shown in Table 2, which includes a number of large sources
located up to 60 km from the city limits. Table 3 provides a summary of the
emission sources located within 470 to 500 UTM east, and 4410 to 4440 UTM
north, a region which encompasses Philadelphia city limits with the
exception of the northeast corner (see Figure 13). In this smaller region,
area sources emit most of the TSP and a few industrial sources emit most of
the SO2 and SO4.
Area source emissions included both fine and coarse particles, as well
as SO2 and SO4. Area source emissions were estimated by aggregating
about 600 sub-census tract emission estimates into fifty 2.5 km square
[Map not reproduced; scale bar 0 to 30 km.]
Figure 9. Map showing location of 104 point sources and 50 area sources used
in the Philadelphia inventory. Philadelphia city limits are shown
in the center of the figure. Scales are UTM coordinates.
                                   Number of    --TSP Emissions--
     Category                      Sources        g/s     percent
 1   Area sources--coarse              50        246.18     14.1
 2   Area sources--fine                50         17.23      9.9
 3   Refineries                         7        127.30      7.3
 4   Incinerators                       5         39.54      2.3
 5   Oil-fired utility                  8        299.39     17.2
 6   Coal-fired boilers                14        287.10     16.5
 7   Metal and steel                    9        280.33     16.1
 8   NJ County Emissions                6        266.00     15.3
 9   Chemical industries               10         77.89      4.5
10   Grain elevator                     1         25.92      1.5
11   Misc. ground level                 3         24.65      1.4
12   Misc. oil-fired                   40         50.76      2.9
     Totals                           203       1742.26    100

                                   Number of    --SO2 Emissions--    --SO4 Emissions--
     Category                      Sources        g/s     percent      g/s     percent
 1   Area sources                      56        317.48      5.0      22.14      8.0
 2   Oil--industrial                    8       1531.25     23.8     107.19     38.6
 3   Coal                              14       2335.00     36.4      35.04     12.6
 4   Oil--miscellaneous                40        525.18      8.2      27.22      9.8
 5   Chemical industries                9        245.10      3.8      13.89      5.0
 6   Refineries                         7       1457.30     22.7      72.57     26.1
     Totals                           134       6411.31    100       278.05    100

Table 2. Summary of TSP (top) and SO2 and SO4 (bottom) emission
inventories in the Philadelphia area.
                                   Number of    --TSP Emissions--
     Category                      Sources        g/s     percent
 1   Area sources--coarse              50        246.18     37.8
 2   Area sources--fine                50         17.23      2.6
 3   Refineries                         4         95.70     14.7
 4   Incinerators                       5         39.54      6.1
 5   Oil-fired utility                  6         50.19      9.2
 6   Coal-fired boilers                 2         66.30     10.2
 7   Metal and steel                    4         25.11      3.9
 8   NJ County Emissions                0          0.00      0.0
 9   Chemical industries                6         19.29      3.0
10   Grain elevator                     1         25.92      4.0
11   Misc. ground level                 3         19.15      2.9
12   Misc. oil-fired                   22         36.59      5.6
     Totals                           153        651.20    100

                                   Number of    --SO2 Emissions--    --SO4 Emissions--
     Category                      Sources        g/s     percent      g/s     percent
 1   Area sources                      50        249.15     12.0      17.23     10.5
 2   Oil--industrial                    6       1222.96     58.7      85.61     52.3
 3   Coal                               2        502.10     24.1       7.53      4.6
 4   Oil--miscellaneous                22        436.98     21.0      13.10      8.0
 5   Chemical industries                6        223.70     10.7      12.80      6.5
 6   Refineries                         4        548.00     26.3      27.47     16.7
     Totals                            90       2083.19    100       163.68    100

Table 3. Summary of TSP (top) and SO2 and SO4 (bottom) emission
inventories in Philadelphia located within 470 to 500 UTM east
and 4410 to 4440 UTM north.
areas. The area source inventory combined emissions from mobile, point and
area sources. However, coarse TSP emissions are primarily from
road dust; fine TSP emissions are from mobile sources and residential and
commercial activities; and SO2 emissions are primarily from residential
oil-fired burners. Figures 10 to 12 show area source estimates for the
three pollutants.
Primary SO4 emissions were determined by multiplying SO2 emissions
by the factors given in Table 4 (National Research Council, 1983). Thus,
all SO2 sources had primary sulfate emissions. Area sources had an
emission-weighted sulfate emission factor of 7 percent. The
emission-weighted sulfate emission factor for the entire inventory is 4.3
percent.
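The emission-weighted factors quoted above can be checked with a short sketch. It assumes that the factor is the ratio of primary SO4 emissions to SO2 emissions, both in g/s, as listed by source class in Table 2. The dictionary keys and the function name are illustrative only. The chemical industries SO2 entry is read as 245.10, the value that makes the column sum to the stated total.

```python
# (SO2 g/s, primary SO4 g/s) by source class, as read from Table 2.
inventory = {
    "area": (317.48, 22.14),
    "oil_industrial": (1531.25, 107.19),
    "coal": (2335.00, 35.04),
    "oil_misc": (525.18, 27.22),
    "chemical": (245.10, 13.89),  # assumed reading of a garbled entry
    "refineries": (1457.30, 72.57),
}


def sulfate_factor_percent(so2, so4):
    """Primary sulfate emitted, as a percentage of the SO2 emitted."""
    return 100.0 * so4 / so2


area_factor = sulfate_factor_percent(*inventory["area"])
total_so2 = sum(so2 for so2, _ in inventory.values())
total_so4 = sum(so4 for _, so4 in inventory.values())
overall_factor = sulfate_factor_percent(total_so2, total_so4)

print(round(area_factor, 1), round(overall_factor, 1))
```

Under these assumptions the two values round to the 7 percent and 4.3 percent given in the text.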
The inventory was compiled from the inventories listed below:
1. City of Philadelphia Air Management Services, Philadelphia, PA. B. Glazer,
T. Weir, (215) NU6-7393
Comprehensive inventory of city emission sources listing over 1000 point
sources including hazardous pollutant emissions and area source emissions of
fine and coarse particulates, SO2, SO4, NOx. Limited computer capability.
2. PEDCO Micro Inventory, PEDCO Environmental, Cincinnati, OH. Barb Siegal,
(513) 782-4700
Major TSP point sources (over 100 T/yr) in the Philadelphia area. Completed
for 1978 monitoring study. 1982-3 update forthcoming.
3. N.J. State Dept. of Environmental Protection, Bureau of Field Operations.
Trenton office: Dr. Ray Dyba, (609) 985-3009; Cherry Hill, Camden County
office: Terry Juchnowski (609) 984-0616
Estimates of criteria pollutant emissions and identification of major
sources for Burlington, Camden, Gloucester and Mercer Counties. Not
computerized (on microfiche).
4. Pennsylvania Department of Environmental Resources, Inventory section,
Harrisburg, PA. John Walker (717) 787-4324
PA point sources (excluding those located in the City of Philadelphia)
including emissions of criteria and other pollutants, operating schedules
and plant parameters. Sophisticated computerized inventory.
Fuel                            Percentage Primary SO4
Coal                                     1.5
Residual oil--industrial                 7.0
Residual oil--residential               13.4
Distillate oil                           3.0
Mobile sources                           3.0
Miscellaneous                            5.0

Table 4. Primary sulfate emission factors (National Research Council, 1983).
Figure 10. Map showing emission rates (g/s) of coarse particles from
Philadelphia area sources. Area sources are 2.5 km square.
5. "Thermal Electric Power Plant Construction and Annual Production
Expenses," Energy Information Administration, Washington, D.C., 1980.
Several power plant emissions were estimated using these fuel records
and maximum NSPS emissions.
METEOROLOGY
Surface meteorological data for the PEM simulation were obtained for
Philadelphia's Northeast Airport and the Norristown meteorological
station. The Norristown data were received in a preprocessed format, which
included stability classifications. No mixing heights were obtained. PEM
simulations used constant, fictitious mixing heights (as specified).
These data were obtained from:
Pennsylvania Department of Environmental Resources, Meteorology
Section, Harrisburg, PA. Denis Lohman (717) 787-4319
DISPERSION MODELING
The PEM dispersion model was used to estimate the source apportionment
in the Philadelphia area under several meteorological scenarios. PEM can
simulate dispersion of either two independent pollutants, or one pair of
coupled pollutants. Thus, source apportionment of the 12 TSP source classes
was accomplished by 6 model runs, each run modeling two (independent) source
classes. Source apportionment of SO2 and SO4 for the 6 source classes
required 6 model runs, since SO2 to SO4 transformation was permitted and
each source emitted both SO2 and primary sulfate. A constant 1 percent
per hour transformation rate was used. Deposition and gravitational
settling velocities were set to zero. A tape output of concentrations at
each receptor was generated by each model run. After the 6 runs for each
apportionment (and meteorological scenario) were completed, a Conversational
Monitoring System (CMS) "macro" program consolidated the concentrations into
one file, which was then transferred to the Time-shared Reactive On-Line
Laboratory (TROLL), a large, interactive statistical and data analysis
system (Information Processing Center, 1980). TROLL was used to generate
the source apportionment tables. Both CMS and TROLL reside on the MIT
IBM/370 computer.
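The effect of a constant 1 percent per hour transformation rate can be sketched as follows. The sketch assumes first-order (exponential) decay of SO2 over the travel time from source to receptor, with the converted mass multiplied by the SO4/SO2 molecular weight ratio of 1.5 used in this report. The function name and the example travel time are hypothetical, and the exact formulation coded in PEM is not reproduced here.

```python
import math

K_PER_HOUR = 0.01   # 1 percent per hour transformation rate, from the text
MASS_RATIO = 1.5    # SO4/SO2 molecular weight ratio used for mass conversion


def transform(so2_mass, travel_hours, k=K_PER_HOUR):
    """Return (remaining SO2 mass, secondary SO4 mass) after first-order decay."""
    remaining = so2_mass * math.exp(-k * travel_hours)
    secondary_so4 = (so2_mass - remaining) * MASS_RATIO
    return remaining, secondary_so4


# Hypothetical case: a source 30 km upwind with a 4 m/s wind.
hours = 30000.0 / 4.0 / 3600.0
so2_left, so4_formed = transform(100.0, hours)
print(round(hours, 2), round(so2_left, 2), round(so4_formed, 2))
```

With travel times of only a few hours, the converted fraction stays at a few percent, which is why primary sulfate and distant sources matter in the apportionment.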
Hypothetical Meteorology
First, hypothetical meteorology was used to portray the average effect
of the sources at each receptor. Eight hour concentrations were calculated
using consecutive hourly 45 degree wind shifts (starting at 0 degrees) using
a fine (25 x 25 grid; 60.0 km square) receptor network for the following
stability categories and wind speeds:
stability category wind speed (m/s)
A 2
D 4
E 3
Due to hourly shifts in the wind direction in these scenarios, no single
source contributed longer than one hour to receptor concentrations (with the
exception of receptors located within area sources).
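The hypothetical scenario above can be sketched in a few lines. The sketch assumes that the eight hour concentration is the arithmetic mean of eight hourly values, one for each wind direction, and that a point source affects a receptor only during the hour when the wind blows along the source-receptor line. The function names and the concentration value are illustrative, not PEM variables or results.

```python
def wind_directions(start_deg=0, step_deg=45, hours=8):
    """Hourly wind directions for consecutive shifts of step_deg degrees."""
    return [(start_deg + h * step_deg) % 360 for h in range(hours)]


def period_average(hourly_concentrations):
    """Average concentration over the averaging period."""
    return sum(hourly_concentrations) / len(hourly_concentrations)


directions = wind_directions()
# Hypothetical point source that reaches the receptor only with a 90 degree
# wind, giving 40 ug/m3 during that hour and nothing otherwise.
hourly = [40.0 if d == 90 else 0.0 for d in directions]
average = period_average(hourly)
print(directions, average)
```

Because each point source contributes for at most one hour in eight, its eight hour average is diluted, whereas an area source containing the receptor contributes every hour.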
Based on the results obtained, a coarse receptor grid (4 x 4 grid; 30 km
square, SW corner at UTM 4410 north, 470 east) was selected to include both
the areas of maximum concentration and the Philadelphia monitoring sites.
The receptor grid and receptor numbers are shown in Figure 13. This receptor
network encompasses Philadelphia and adjoining suburbs. The coarse grid was
chosen primarily to reduce computations; it may not reflect the diversity of
concentrations at the various monitors in the Philadelphia area.
Predicted source class estimates are given in Table 5 for TSP
(excluding sulfate) and Table 6 for SO2 and SO4. The average percent
source class contributions of TSP and SO4 at the 16 receptors are listed
in Tables 7 and 8. The following results are highlighted:
(1) Receptors show a diverse source apportionment. Receptors 6, 7, 10 and
11, located within Philadelphia, receive high loadings from area sources.
(2) Point sources contribute less than 1 ug/m3 under stability A, compared
to up to 60 ug/m3 from area sources. Point sources have larger
contributions under other stabilities. However, area sources dominate both
TSP and SO2 concentrations.
(3) Sulfate concentrations are low, generally about 5 percent of TSP. This
reflects the relatively high wind speeds, which advect the pollutants
outside the area and leave little time for SO2 to SO4 transformation, as
well as the small proportion of primary sulfate.
[Map not reproduced. Horizontal axis: UTM coordinates east, 470 to 490. Vertical axis: UTM coordinates north, 4410 to 4450.]
Figure 13. Map showing 5 km receptor grid of Philadelphia area. City limits shown
with dashed line. Small numbered regions are sub-census tract areas.
Large numbers indicate receptor number for coarse (10 km) grid used in
[illegible]. Squares indicate location of five monitors of the
Philadelphia study; the sixth is located 7 km south of map.
Stability  Wind                       Percent Contribution
Category   Speed    area-coarse  area-fine  oil-ind  NJ-misc  grain  oil-misc  other
A          2 m/s       84.1        12.6       0.8      0.5     0.1     0.3      1.5
D          4           60.9         9.1       2.6     18.2     4.5     1.1      3.6
E          3           70.4        10.5       1.5     13.0     2.2     0.5      1.9

Table 7. Average (16 receptor) percent TSP source contributions from PEM
modeling using hypothetical meteorology. Source classes contributing
less than 1 percent of total TSP are labeled "other," and include
refineries, coal, steel, chemical and soil.
Stability  Wind                  Percent Contribution
Category   Speed    area   oil-ind   coal   oil-misc   chemical   refineries
A          2 m/s    60.4    17.7      5.4      6.0        2.0         8.3
D          4        39.0    21.2      3.6     22.0        3.8        10.0
E          3        56.1    14.2      4.9     13.6        2.2         9.0

Table 8. Average (16 receptor) percent SO4 source class contributions from
PEM modeling using hypothetical meteorology.
Actual Meteorology
Actual meteorological data were used to simulate three consecutive 12
hour periods, starting with 6 AM, August 22, 1982 (Table 9). These periods
were selected since they showed diversity: transition from unstable daytime
conditions to stability E (period 1); transition to stability A with some
moderate and fairly persistent winds in categories B and C (period 2);
neutral stability and low winds (period 3). Wind speed, direction,
stability, and temperature data were complete for these periods.
Source class apportionments for TSP, SO2 and SO4 are shown in
Tables 10 and 11. Area sources dominate as before, but
other source classes may be significant contributors to both TSP and SO4
concentrations. According to the source class inventory (Tables 2 and 3),
area sources emit nearly one-half of the TSP within the city area. This is
reflected in the source apportionment of Table 10.
Coal burning is the largest SO2 emission class, and industrial oil
burning produces most of the primary sulfate. In the vicinity of the city,
however, most SO2 and SO4 is emitted by 6 industrial oil users and 4
refineries located near the city. The highest SO2 and SO4
concentrations from these sources (refineries) were 11.4 and 1.4 ug/m3,
respectively, for receptor 9 and period 1. These point sources have
elevated stacks ranging from 40 to 84 m, which generally result in low
ground level concentrations. Consequently, area source emissions also
dominated the source apportionment of SO2 and SO4.
The highest predicted sulfate concentration in the three periods was
6.3 ug/m3 (receptor 11, period 1). A search for conditions that produced
the maximum sulfate concentrations was performed by calculating hourly
concentrations using the coarse grid with incremental 30 degree shifts of
wind direction under stability D for the receptors located in the city
(Table 12). The maximum concentrations from point sources are considerably
smaller than the average area source contributions. Only in one
circumstance was the point source contribution roughly comparable to the
area source contribution (receptor 10, wind direction 310 degrees, with an
area source contribution of 1.016 ug/m3 and an industrial oil source
contribution of 0.588 ug/m3). These results imply that most point sources
contribute little to sulfate levels in the city under most conditions.
Period 1
SCENARIO  STABILITY  AMBIENT TEMPERATURE
NUMBER    CLASS      (DEG C)
 1        A          28.00
 2        B          26.00
 3        C          23.00
 4        DD         25.00
 5        DD         24.00
 6        DN         23.00
 7        E          22.00
 8        E          22.00
 9        DN         21.00
10        E          21.00
11        E          19.00
12        E          19.00

Period 2
SCENARIO  STABILITY  AMBIENT TEMPERATURE
NUMBER    CLASS      (DEG C)
 1        DN         19.00
 2        DD         21.00
 3        DD         24.00
 4        C          25.00
 5        B          26.00
 6        C          27.00
 7        B          28.00
 8        B          27.00
 9        A          28.00
10        A          28.00
11        A          28.00
12        A          28.00

[Wind speed class, wind speed, wind sector and wind direction columns illegible.]
Table 9. Meteorological data used for PEM simulation. All mixing heights set
to 2000 m and inversion penetration factor set to unity. Top: period
1; bottom: period 2.
SCENARIO  STABILITY  WIND SPEED  WIND DIRECTION  AMBIENT TEMPERATURE
NUMBER    CLASS      (M/S)       (DEG)           (DEG C)
 1        A          1.300       334.00          28.00
 2        B          1.100       323.00          26.00
 3        C          1.200        17.00          24.00
 4        DD         1.200       108.00          23.00
 5        DD         1.400       167.00          23.00
 6        DD         1.000       173.00          22.00
 7        DD         1.300       209.00          22.00
 8        DD         1.000       219.00          21.00
 9        DD         1.000       309.00          20.00
10        DD         1.000       284.00          19.00
11        DN         1.000       147.00          18.00
12        E          1.000        93.00          17.00

[Wind speed class and wind sector columns illegible.]
Table 9. (continued) Meteorological data; period 3.
                           - - - SOURCE CLASS CONTRIBUTION IN UG/M3 - - -                                          TOTAL
RECEPTOR   AREA-C   AREA-F   REFIN.   INCIN.  OIL-IND    COAL    STEEL   NJ-MIS    CHEM.    GRAIN     SOIL  OIL-MIS   CONCENT
NUMBER
 1         12.141    1.769    0.018    0.000    1.070    0.144    0.026    0.000    0.108    0.000    0.000    0.074    15.350
 2          6.577    0.826    0.000    0.034    0.001    0.045    0.005    0.000    0.000    0.000    0.000    0.013     7.501
 3          0.000    0.000    0.000    0.000    0.000    0.447    0.502    0.000    0.000    0.000    0.000    0.014     0.963
 4          0.000    0.000    0.000    0.000    0.247    0.181    0.016    0.000    0.005    0.000    0.000    0.000     0.453
 5         21.581    3.344    0.084    0.016    0.276    0.340    0.376    0.000    0.065    0.003    3.410    0.251    29.745
 6        214.910   33.108    0.003    0.002    0.044    0.048    0.080    0.026    0.005    0.000    0.000    0.031   248.257
 7        153.250   19.257    0.000    0.008    0.013    1.003    0.492    0.000    0.027    0.000    0.000    0.043   174.093
 8          0.000    0.000    0.000    0.000    1.451    1.537    0.070    0.000    0.050    0.000    0.000    0.154     3.263
 9         15.197    2.198    0.119    0.003    0.060    0.263    0.000    1.977    0.097    4.833    2.863    0.215    27.825
10        185.170   27.876    0.066    0.003    0.005    0.063    0.022    2.735    0.013    0.220    0.346    0.246   216.766
11        298.450   47.321    0.004    0.017    0.033    0.540    0.036    0.224    0.063    0.000    0.000    0.116   346.804
12          0.000    0.000    0.001    0.003    0.079    2.253    3.648    0.000    0.322    0.000    0.083    0.023     6.413
13         10.639    1.829    0.313    0.034    0.073    0.083    0.010    0.001    0.012    0.007    3.139    0.075    16.019
14         47.240    7.188    0.169    0.003    0.334    0.176    0.011    6.076    0.107   11.832    0.629    0.429    74.197
15        111.480   16.378    0.007    0.218    0.298    0.294    0.103    0.037    0.262    0.003    0.066    0.262   129.404
16          0.000    0.000    0.003    0.007    0.113    0.993    0.054    3.493    1.013    0.001    0.000    0.033     5.711
MEAN       67.290   10.058    0.049    0.022    0.256    0.528    0.341    0.911    0.135    1.056    0.659    0.124    81.422
             - - - SOURCE CLASS CONTRIBUTION IN UG/M3 - - -
RECEPTOR   AREA-C   AREA-F   REFIN.   INCIN.  OIL-IND    COAL
NUMBER
 1          4.???    0.711    0.505    0.082    0.205    0.095
 2          4.963    0.716    0.1?9    0.076    0.068    0.037
 3          2.276    0.313    0.003    0.049    0.316    0.038
 4          0.816    0.124    0.002    0.007    0.001    0.006
 5          6.750    0.989    0.547    0.006    0.030    0.090
 6         69.404   10.664    0.041    0.155    0.642    0.030
 7         51.098    6.509    0.005    0.060    0.032    0.057
 8          0.088    0.012    0.003    0.000    0.000    0.016
 9          5.594    0.870    0.149    0.002    0.067    0.073
10         50.321    7.565    0.019    0.009    0.124    0.022
11         88.269   14.211    0.012    0.000    0.001    0.022
12          0.000    0.000    0.000    0.000    0.000    0.069
13          1.391    0.191    0.001    0.007    0.073    0.005
14          3.332    0.502    0.000    0.025    0.010    0.044
15         19.372    2.749    0.014    0.000    0.482    0.030
16          0.000    0.000    0.000    0.000    0.000    0.058

[STEEL, NJ-MIS, CHEM., GRAIN, SOIL, OIL-MIS and TOTAL CONCENT columns and the MEAN row illegible.]
Table 12 also shows that area source contributions remain constant under
varying wind directions. According to this modeling, only the area source
in which the receptor resides is an important contributor to sulfate. Note,
however, that the relatively large receptor grid size of 10 km encompasses
16 of the 2.5 km area sources. This results in less variation than might be
found with smaller grid sizes.
The PEM results support the observation in Section 3 that secondary
sulfate may compose a large fraction of the sulfate mass, especially from
distant sources. Sulfate may attain about 10 percent of the SO2 mass,
although only 4.3 percent primary sulfate is emitted by all sources.
However, it should be recalled that the SO4 emission factor for area
sources was 7 percent. (A molecular weight ratio of 1.5 (SO4/SO2) was
used for mass conversion.)
DISCUSSION
PEM modeling of the Philadelphia area is constrained by the limited
accuracy of the source inventory. In particular, estimates of area source
emissions, which dominated the apportionment, are very crude. More
accurate modeling may require better inventories, especially around
receptor sites, and more detailed meteorological data. The Norristown
station does not necessarily provide representative surface observations
for the Philadelphia area. Also, the omission of mixing height data will
bias results. Mixing height was simply set to 2000 m. Lower mixing
heights were found to alter point source contributions by up to 40
percent. Very low mixing heights (e.g., 200 m) eliminated point source
contributions since elevated sources had plumes that penetrated the ceiling
layer.
Using 9 monitoring sites in Philadelphia from May to September 1979,
Suggs and Barton (1983) obtained 24 hour concentrations of fine particles
ranging from 2 to 54 ug/m3 with a mean of 29 ug/m3. TSP concentrations at
the same sites ranged from 15 to 172 ug/m3 with a mean of 56 ug/m3.
Predicted sulfate concentrations at the four receptors located within
Philadelphia (numbers 6, 7, 10, 11) ranged from 0.5 to 6.3 ug/m3 for the
three 12 hour periods. The corresponding TSP predictions ranged from 58 to
346 ug/m3.
Thus, PEM predictions have the correct order of magnitude, although TSP
predictions seem high and sulfate predictions seem low. However, observed
particle concentrations include non-sulfate species as well as background
regional levels. Background concentrations were not included in the PEM
modeling.
SECTION 5
RECEPTOR MODELS
The principal objectives of this section include the following:
(1) Brief review of receptor models (RMs);
(2) Application and evaluation of several diagnostic procedures designed to
minimize the spurious results obtained by chemical mass balance RMs, and
suggestions for protocols which incorporate these diagnostic tools;
(3) Suggestions for the development and application of hybrid models, i.e.,
models which combine aspects of dispersion and receptor models;
(4) Description of issues concerning inter-comparisons between results of
receptor and dispersion models;
(5) Discussion of issues concerning RM protocols.
OVERVIEW OF RECEPTOR MODELS
Receptor models are procedures which use observed aerosol
characteristics to identify and quantify the sources of ambient air
pollutants. The aerosol characteristics most frequently measured include
the chemical and elemental mass, optical properties and particle size
distribution. Less frequently measured characteristics include isotope
ratios, organic and inorganic compounds, and crystalline structure. Unlike
dispersion models, RMs do not require meteorological or source data. Source
profiles may be used, but are not necessary.
In general, RMs make use of the following assumptions:
Non-reactive (conservative) aerosols. The total aerosol mass at a receptor
is a linear sum of the aerosol contributions from individual sources. In
general, there cannot be any transformation of the aerosol from time of
emissions at the source through atmospheric transport to collection and
ultimate analysis.
Stable aerosol characteristics. Characteristics of the aerosols are a
linear sum of the aerosol characteristics from individual sources, e.g.,
elemental ratios are assumed to be constant between source and receptor.
Identifying aerosol characteristics for source classes. Source
apportionment is possible only for those source classes that have
identifying characteristics. No unique characteristic is needed: just a
unique combination of characteristics.
Description of source profiles. Chemical mass balance models require
quantification of source characteristics, e.g., elemental ratios.
Multivariate models require less precise descriptions, but patterns of
characteristics must be recognizable. Sources with unknown or highly
variable characteristics may not be distinguishable.
Miscellaneous statistical assumptions. For regression models, the number of
characteristics must exceed the number of source classes, and errors
(residuals) should be normally distributed and uncorrelated. For
multivariate models, the number of filters must be sufficient for the
degrees of freedom required for the number of filter characteristics and
sources used.
Deviations from these assumptions will degrade the validity of the
receptor model. The magnitude of the deviations that typically occur in RM
applications is not known at the present time. Also unknown is the
susceptibility of RMs to such deviations. There are obvious situations
where the assumptions are violated and RM applications may not be useful.
For example, RM assumptions do not hold for secondary pollutants, e.g., smog
or sulfate, which undergo extensive transformation and scavenging. Also,
RMs can not apportion sources within a source category, e.g., determining
which of two power plants is culpable, since these sources have similar
aerosol characteristics.
There are many approaches to receptor modeling based on the above
assumptions. Cooper (1980) has classified receptor models by the
interpretive approach used to associate aerosol characteristics with
emission sources. These approaches include 1) regression analysis of
aerosol characteristics, which has been labeled the chemical mass balance
(CMB) approach; 2) multivariate analyses of the variability of aerosol
characteristics; 3) composite receptor models which combine CMB and
multivariate methods; and 4) a miscellaneous category including enrichment
or depletion processes, trajectory analyses and, in general, hybrid
dispersion/receptor models. Table 13 highlights some of the major
differences between model approaches. ERT (1981) and Thurston (1983)
provide reviews of current models. A compilation of recent work in receptor
modeling is found in APCA (1982). Some aspects of the different models are
discussed below.
Chemical Mass Balance Receptor Models
Most chemical mass balance (CMB) models minimize the sum of squared
residuals (SSR) between observed and predicted aerosol characteristics:
     Minimize SSR = (A S - C)^T (A S - C)
where: A is the vector of estimated source apportionment (a_i is the
         contribution of source class i to total pollutant mass).
       S is the source profile matrix (s_{i,j} is a measure of aerosol
         characteristic j emitted by source class i).
       C is the vector of observed ambient characteristics (c_j is a measure
         of characteristic j).
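The minimization above can be sketched as an ordinary least-squares solve. The following is a minimal illustration, not the procedure used in the report: the function name `cmb_ols`, the two hypothetical source profiles and the contributions in `A_true` are all made up for the example.

```python
import numpy as np

def cmb_ols(S, C):
    """Ordinary least-squares CMB fit (illustrative sketch).

    S : (n_sources, n_characteristics) source profile matrix, S[i, j]
    C : (n_characteristics,) observed ambient characteristics
    Returns the estimated contributions A and the sum of squared residuals.
    """
    # A S = C (A as a row vector) is the same system as S^T A^T = C^T.
    A, _, _, _ = np.linalg.lstsq(S.T, C, rcond=None)
    residual = A @ S - C
    return A, float(residual @ residual)

# Hypothetical profiles for two source classes and four characteristics.
S = np.array([[0.30, 0.05, 0.01, 0.10],
              [0.02, 0.20, 0.15, 0.01]])
A_true = np.array([10.0, 5.0])      # made-up contributions
C = A_true @ S                      # noise-free ambient vector
A_hat, ssr = cmb_ols(S, C)
print(A_hat, ssr)
```

With noise-free data the fit recovers the assumed contributions; with real filter data the residuals and the conditioning of S determine how trustworthy the estimates are.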
CMB models require measured characteristics for one or more time periods and
source profiles. Table 14 shows the source profiles for sources used in the
Quail Roost II study of RMs. Ambient sampling and analysis procedures are
quite sophisticated. However, source sampling involves greater practical
difficulties and less accuracy. Moreover, source profiles often are
unavailable or highly variable within source classes (e.g., from one oil
burner to another), as well as dependent on operating conditions (e.g., fuel
burned). Consequently, the matrix of source profiles, S, may be very
approximate. CMB models using inappropriate source profiles may produce
Columns: Dispersion Models; Receptor Models (CMB, Multivariate, Composite)
Rows:
  Identification of culpability
    Individual sources
    Source classes
    Combined source classes
  Source apportionment
    Quantitative
    Qualitative
  Uncertainty estimates
    Quantitative
    Qualitative
  Data requirements
    Source strength, location
    Source profiles
    Meteorology
(Only four "X" entries survive in the source, without their row and column
positions.)
Table 13. Differences between major classes of receptor and dispersion models.
                               SOURCE CLASS
Element  Steel-A  Steel-B   Oil-A   Incin   Oil-B  Coal-A  Aggreg   Glass  Basalt  Coal-B
C         149196   214629   40617   39991  198625   30450     985    2036     150    7045
Na          9383    14438   32536  133280   20676    8648    5335  158353   20158    4459
Al         13196    105?9    6947    4855    6088   39387   59713    3632   85596  164018
Si          8995    11653  195793    7424   76426  142063  235756    1989  166164  155397
S          34948    46829  135376  132183  123929   15868    6589  178681     226  133869
Cl         33141    36707     824  297407   13071   10460     808    3079     194   18979
K          10200    12810     746  248410    1329   10407   13520   32591    8580    8047
Ca         63410    52372     836   11073   10614   36048   41693    4428   45793   35386
Ti          1448     1143     757    1013    1100    2845    2793     336    7910    9011
V            189      149    9289      25     691     140      65    1611     223     797
Cr          3848     3059     227     890     431     348      60    3537     118     330
Mn         15216    22968     117     360     118     280     906     253    1633     348
Fe        123383   154932   13315    5721   13300   18909   49930    9104   81767   71849
Ni          3072     2797    5297      93      50     187      33   1??71      85     635
Cu          2310     3970     655    4248     595    1313     580     432      96     380
Zn         28311    42555     356   21080    1136    3756     490     554      83     688
As            82      114      12     129     339     287      41     651      12     162
Br            91      146      27     831     399     150      13      17       1      11
Pb          7876    13008     811   16663    4133    3039     119     507       4    1197
("?" marks an illegible digit. Entries for the Soil, Auto and Wood columns
are not legible in the source.)
Table 14. Source signatures from Quail Roost II in ug/g.
errors. These models are also prone to several statistical problems (which
may lead to spurious results) due to the similarity between some source
profiles and sensitivity to small changes or errors if the measured
[remainder of sentence illegible].
CMB models have some advantages over multivariate RMs since they are
conceptually very simple, quantitative source estimates are derived, and
only one filter sample is needed. They also have fairly high resolution
(i.e., 6 to 7 source classes). Lastly, regression packages are commonly
available on many computers.
CMB models use any of the following procedures: ordinary least squares,
weighted least squares, effective variance techniques, robust and ridge
regression.
Multivariate Receptor Models
Multivariate receptor models use a time series of ambient observations
and look for consistent relationships among aerosol characteristics. They
may be very complex. Multivariate models have the advantage of not
requiring an explicit source profile matrix. Thus, they may be used when
the identity or profile of likely sources is unknown. This advantage is
somewhat offset since about 40 ([citation illegible], 1983) to 60
([citation illegible], 1983) ambient samples must be collected to provide
robust results. These models indicate the source types and the amount of
variance explained by each, but not the actual apportionment. They also
tend to have lower resolution than CMB models, distinguishing perhaps
[number illegible] sources.
Multivariate procedures include multivariate regression, factor
analysis, cluster analysis and principal component analysis.
Composite Receptor Models
Composite regression/multivariate receptor models have the ability to
provide quantitative estimates of source apportionment without a precise
source profile matrix. For instance, target transformation factor analysis
rotates and scales the factor axis (derived using factor analysis procedures
and the observed source characteristic matrix) such that the overlap with
estimated source profiles is maximized. Then, conventional CMB procedures
are used to calculate mass contributions. This procedure may provide useful
results when source profiles are not known a priori.
In addition to target transformation factor analysis, composite
receptor models include multiple linear regression/factor analysis and
effective variance CMB/factor analysis procedures.
DIAGNOSTIC PROCEDURES FOR CMB MODELS
This section discusses several problems which are frequently
encountered in CMB receptor models. These problems are collinearity,
influential characteristics and outliers. In part, the reliability and
quantitative performance of CMB receptor modeling depend on resolving these
problems. Some relatively new diagnostic procedures, which have not been
applied previously (at least not in our search of the literature) are
discussed and demonstrated. These procedures are developed in depth by
Belsley, Kuh and Welsch (1980) and are implemented in the TROLL statistical
package (Information Processing Service, 1980).
Collinearity
Collinearity exists when source profiles, e.g., elemental ratios, are
similar or linearly dependent. In CMB models, collinearity may result from
1) similar source signatures; 2) too many source classes, and 3) too few
observations, e.g., not enough elemental measures. In general, collinearity
may lead to unstable, negative and/or too large estimates of source
contributions from one or more source classes. Ordinary least squares (OLS)
cannot be rationally applied in CMB models until such dependencies have
been identified.
There are many diagnostic tools which may detect the presence of
collinearity. These include:
-eigensystem analysis and condition number
-singular value decomposition and variance decomposition proportions
-examination of the correlation matrix of the source signatures
-variance inflation factor
These methods, which use only the source profile matrix S (not the observations of
aerosol characteristics), are illustrated using the (contaminated) source
profile matrix given to the Quail Roost II participants. This matrix (Table
14) contains 13 source profiles, each consisting of 19 elements. (The
matrix was contaminated in Quail Roost II by adding random measurement
errors to each element of the true source matrix.)
The singular value decomposition (SVD) and variance decomposition
proportions (VDP) analyses are performed using the procedures of Belsley
et al. (1980). The SVD consists of the first three columns of Table 15,
where "rows" (1st column) correspond to eigenvectors of the source profile
matrix, ranked by decreasing "singular values" (2nd column); and "condition
indices" (3rd column) are the ratio of the largest singular value to the
singular value of each row. (This analysis is similar to eigensystem
analysis: singular values are the square roots of the eigenvalues, but are
computed in a way which decomposes the source matrix S, rather than S^T S.)
The remaining columns show the VDP, where "coef" represents the 13 source
profiles. The VDP, which range from 0 to 1 and sum to 1 in each column,
show the fraction of variance associated with each source class and each
eigenvector.
Collinearity which may be serious enough to degrade source
apportionments is indicated by singular values with high condition indices
(greater than about 100) and two or more high VDPs (greater than about 0.8
or 0.9). Table 15 clearly shows the dependence between oil-A and oil-B
sources (condition index 122, coefficients 3 and 5) and a somewhat
surprising multiple dependence between steel-A, steel-B, glass, basalt, and
soil sources (condition index 300, coefficients 1, 2, 8, 9, and 10). The
coefficients for these source classes may be poorly estimated by OLS
regressions.
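The condition indices and variance decomposition proportions described above can be sketched as follows. This is an illustration under stated assumptions, not the TROLL implementation: the function name `collinearity_diagnostics` is hypothetical, the columns are scaled to unit length before the decomposition (the column scaling usually attributed to Belsley et al.), and the three test profiles are random numbers in which the first two columns are deliberately near-duplicates.

```python
import numpy as np

def collinearity_diagnostics(X):
    """X has one column per source profile and one row per characteristic."""
    Xs = X / np.linalg.norm(X, axis=0)          # unit-length columns (assumption)
    _, mu, Vt = np.linalg.svd(Xs, full_matrices=False)
    cond_index = mu[0] / mu                     # largest / each singular value
    phi = (Vt.T ** 2) / mu ** 2                 # phi[k, j] = v_kj^2 / mu_j^2
    vdp = (phi / phi.sum(axis=1, keepdims=True)).T
    return mu, cond_index, vdp                  # vdp[j, k]: row j, coefficient k

# Made-up data: profiles 0 and 1 are nearly identical, profile 2 is unrelated.
rng = np.random.default_rng(0)
a = rng.random(19)
X = np.column_stack([a, a + 1e-3 * rng.random(19), rng.random(19)])
mu, cond_index, vdp = collinearity_diagnostics(X)
print(np.round(cond_index, 1))
print(np.round(vdp, 3))
```

In the printed matrix each column sums to 1, and the near-duplicate pair shows up as two large proportions on the row with the highest condition index, which is the pattern the text reads from Table 15.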
Somewhat similar dependencies for the Quail Roost II profiles are shown
by the correlation matrix (Table 16). Very high or low correlations (near ±1)
indicate strong dependencies between two sources. There are six pairs of
profiles that have correlations above 0.9, namely, steel-A and steel-B
(arguments 1 and 2); coal-A and aggregate (6 and 7); aggregate and glass (7
and 8); aggregate and basalt (7 and 9); basalt and soil (9 and 10) and auto
and wood (11 and 12). Note that correlation coefficients do not show
multiple dependencies.
The variance inflation factor (VIF) does indicate both single and
VARIANCE DECOMPOSITION MATRIX
Row  Sing.val.  Cond.index  Coef.1  Coef.2  Coef.3  Coef.4  Coef.5  Coef.6  Coef.7  Coef.8  Coef.9  Coef.10  Coef.11  Coef.12  Coef.13
 1   ?            1         0       0       0       0       0       0       0       0       0       0        0        0        0
 2   1.???14      1.3141    0       0       0       0       0       0       0       0       0       0        0.003    0        0
 3   1.29317      3.01904   0       0       0       0.032   0       0       0       0.003   0       0        0        0        0
 4   0.?36041     3.93734   0       0       0       0.051   0       0       0       0       0       0        0.002    0        0
 5   0.730??      3.51335   0       0       0       0.123   0       0       0       0.003   0       0        0.003    0        0.003
 6   0.446701     ?.??733   0       0       0       0.008   0       0       0       0.002   0       0        0.003    0        0.034
 7   0.413176     ?.13711   0       0       0.003   0.007   0       0       0       0.0??   0       0        0.047    0        0.001
 8   0.1?1581    13.9419    0       0       0.003   0       0       0       0       0.01    0       0        0.63     0.025    0.006
 9   0.135035    18.7491    0.001   0.003   0.009   0.039   0.001   0.00?   0       0       0       0.00?    0.003    0        0.019
10   0.044149    57.2177    0       0.014   0.08    0.647   0.019   0.101   0       0.101   0.069   0.00?    0.089    0.001    0.14
11   0.027613    91.5114    0.014   0.034   0.021   0.014   0.139   0.07?   0.399   0.013   0.014   0.041    0.053    0.11     0.42
12   0.020733   122.092     0.044   0.04    0.837   0.013   0.836   0.03?   0.111   0.001   0       0.003    0.0?1    0.634    0.321
13   0.008438   300.029     0.931   0.?11   0.067   0.074   0.019   0.711   0.631   0.849   0.919   0.93     0.141    0.117    0.051
("?" marks an illegible character; values are as legible in the source.)
Table 15. Proportionate variance decomposition of the contaminated Quail Roost II
source profiles. "Coef" correspond to 13 source class profiles (columns
of Table 14). See Table 17 for identification of profile numbers.
        ARG1    ARG2    ARG3    ARG4    ARG5    ARG6    ARG7    ARG8    ARG9    ARG10   ARG11   ARG12   ARG13
ARG1    1.000
ARG2    0.983   1.000
ARG3    0.048   0.097   1.000
ARG4    0.014   0.024   0.040   1.000
ARG5    0.407   0.468   0.400   0.108   1.000
ARG6    0.136   0.124   0.822  -0.054   0.403   1.000
ARG7    0.031  -0.002   0.74?  -0.141   0.176   0.96?   1.000
ARG8   -0.060  -0.004   0.38?   0.404   0.324  -0.036  -0.078   1.000
ARG9    0.150   0.0?3   0.404  -0.182   0.117   0.877   0.?46  -0.071   1.000
ARG10   0.104   0.092   0.717  -0.177   0.284   0.931   0.745  -0.141   0.?40   1.000
ARG11   0.441   0.728   0.043  -0.013   0.764   0.084  -0.115  -0.024  -0.13?   0.026   1.000
ARG12   0.712   0.774   0.088  -0.004   0.802   0.123  -0.07?  -0.088  -0.106   0.068   0.937   1.000
ARG13   0.090   0.078   0.703  -0.00?   0.327   0.681   0.70?   0.23?   0.76?   0.747  -0.144  -0.076   1.000
("?" marks an illegible digit.)
Table 16. Simple correlation for the Quail Roost II source profiles. "Arg"
numbers refer to source profiles. (See Table 17 for identification.)
Number   Source        VIF
  1      Steel-A       2963
  2      Steel-B       2258
  3      Oil-A          728
  4      Incinerator      5
  5      Oil-B         1541
  6      Coal-A        1646
  7      Aggregate     2449
  8      Glass           84
  9      Basalt        2320
 10      Soil          2319
 11      Automobile      23
 12      Wood           652
 13      Coal-B         108
Table 17. Variance inflation factors for the Quail Roost II source profiles.
multiple dependencies. The VIFs for the same profiles are shown in Table
17. A high VIF (above about 30) may indicate strong single or multiple
dependencies. Table 17 suggests that a majority of the sources are involved
in strong collinear relationships. However, the number of dependencies and
the sources which contribute to them are not indicated by the VIFs.
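One common way to obtain VIFs is from the diagonal of the inverse of the correlation matrix of the profiles. The sketch below assumes that definition (the report does not state which formula was used); the function name and the random test profiles are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X (one column per source profile)."""
    R = np.corrcoef(X, rowvar=False)        # correlation matrix of the profiles
    return np.diag(np.linalg.inv(R))        # VIF_k = k-th diagonal of R^-1

# Made-up data: profiles 0 and 1 are nearly identical, profile 2 is unrelated.
rng = np.random.default_rng(1)
a = rng.random(19)
X = np.column_stack([a, a + 1e-3 * rng.random(19), rng.random(19)])
vif = variance_inflation_factors(X)
print(np.round(vif, 1))
```

As the text notes, a large VIF flags that a profile is involved in some dependency but does not say how many dependencies exist or which other profiles take part.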
We believe that the use of correlation coefficients and the singular
value decomposition and variance decomposition procedure of the source
profiles will provide a comprehensive identification of single and multiple
dependencies. These procedures indicate the source classes which will be
poorly estimated as well as those which are not adversely affected by the
collinearity.
Remedial procedures for alleviating the problems caused by collinearity
include:
-eliminating or combining sources
-using additional filter characteristics, e.g., more elemental analysis
-using ridge regression
-using principal components regression
Only the first procedure, eliminating (or combining) sources, is described
here.
Eliminating sources from the CMB model is, perhaps, the simplest method
of handling collinearity. The subset of sources to be used can be selected,
in part, from the collinearity analysis. For the Quail Roost profiles, four
or five of the collinear sources (e.g., oil-B, steel-A, glass and basalt)
might be eliminated, based on the SVD and VDP analysis. However, it is
necessary to account for the possibility that an actual source is eliminated
in the CMB model. This is the problem of source selection, described below.
Source Selection
In Quail Roost II, the subset of sources for CMB modeling was selected
as follows: 1) Initially, all sources were included in CMB modeling. 2)
Sources with negative source contribution estimates were removed one at a
time, discarding sources with the largest σ_i/a_i (uncertainty/estimated
source contribution), and repeating CMB calculations for each removal. 3)
When all source contribution estimates were positive, sources with the
largest σ_i/a_i were removed, and the CMB calculations were repeated after
each removal. 4) After all σ_i/a_i were less than 1, discarded sources
were added experimentally. This last step is necessary since this procedure
does not necessarily guarantee the best model.
The "all possible regressions" (APR) procedure may provide a more rapid
and systematic procedure of determining sources in CMB models. These
computerized routines efficiently generate all possible regressions, that
is, all possible combinations of sources (Montgomery and Peck, 1982). The
procedure calculates regressions with 1 to p sources, where p may be
specified by the user (and is less than or equal to n). For each level of
p, the APR procedure calculates all regressions and determines the models
with the best fits, then prints out the sources included in a specified
number of the best models. This procedure is as fast as stepwise
regression, and is less arbitrary in the sense that no combination of
sources is omitted.
For n sources there are 2^n - 1 possible regression models, a
potentially large number. Practically, however, if strong collinearity
involving k sources exists, then only a maximum of n-k sources need be
included in CMB calculations, and 2^(n-k) - 1 regressions need be calculated.
With the Quail Roost II data set, for example, several of the 13 sources had
strong collinear relationships and four or five fewer sources need be
included in any CMB model.
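As a worked illustration of these counts for the 13 Quail Roost II profiles, taking k = 4 or k = 5 to represent the "four or five fewer sources" mentioned above:

```latex
\begin{aligned}
2^{13} - 1 &= 8191 && \text{(all 13 sources)} \\
2^{13-4} - 1 &= 511 && \text{(four collinear sources dropped)} \\
2^{13-5} - 1 &= 255 && \text{(five collinear sources dropped)}
\end{aligned}
```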
The APR procedure is demonstrated using a synthetic data set created
for the Philadelphia area. Synthetic data sets have the advantage that
complete control is maintained over the data: the "true" apportionment and
the errors are known. However, the error structure and composition of the
synthetic data sets may not be realistic. The synthetic data set created
here was designed only to illustrate the diagnostic procedures. The data
set was constructed using the PEM dispersion model, source inventory and
meteorology from Philadelphia (Section 3) and the same procedure used in
Quail Roost II. In brief, the 13 profiles from Quail Roost II were assigned
to the 12 source classes used in PEM modeling, multiplied by their predicted
contributions to receptor concentrations, and contaminated with random
errors. The synthetic data set used for this evaluation was derived from
period 2 meteorology at receptor 10 (see Section 4), and had major
contributions from three sources: soil, 50.72 ug/m3; wood, 19.74 ug/m3;
and automobile, 7.80 ug/m3. It also included minor contributions from
other sources, as shown in Table 19. Ordinary least squares with 8 profiles
and a constant were used in a simple CMB approach. Four profiles, as
indicated in the collinearity analysis, were eliminated from the CMB model.
Mallows' Cp statistic, which is related to the mean square error of
the estimated source contribution, was the criterion used for source
selection in the APR procedure (Table 18). For p sources included in a
model (p ranging from 1 to 13), the five best models were generated. For
example, the best one-variable model used coal-A and the second best used
oil-B (sources 6 and 5, respectively). The best two-variable model used
soil and wood (sources 10 and 12). In general, models with low
values of Cp are of interest; the lowest value corresponds to a 9-variable
model. However, Table 18 shows that a number of models with various
combinations and numbers of variables have about the same fit, i.e., there
may be no clearly superior model. The table also shows that the models
selected tend to be quite similar, i.e., include or exclude similar sources.
Interestingly, while all models composed of five or more sources included
the major contributing sources of the synthetic data set (sources 10, 11,
and 12), models with fewer variables may not include these sources. Models
with a large number of sources tended to include sources that were
identified as being collinear, e.g., coal-A and aggregate (sources 6 and 7).
This indicates that a model with fewer than 9 variables may be preferable
since collinear source profiles will not be accurately estimated.
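A brute-force version of the all possible regressions search with Mallows' Cp can be sketched as below. It assumes the usual textbook form Cp = SSE_p / s^2 - n + 2p, with s^2 taken from the full model, and it fits models without a constant term; the report's runs included a constant, so its Cp values are counted differently. The function names and the random test data are hypothetical, and efficient APR routines avoid refitting every subset from scratch.

```python
import itertools
import numpy as np

def sse(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def all_possible_regressions(X, y, n_best=5):
    """Return {p: [(Cp, subset), ...]} with the n_best subsets of each size p."""
    n, k = X.shape
    s2_full = sse(X, y) / (n - k)           # residual mean square, full model
    results = {}
    for p in range(1, k + 1):
        scored = []
        for subset in itertools.combinations(range(k), p):
            cp = sse(X[:, list(subset)], y) / s2_full - n + 2 * p
            scored.append((cp, subset))
        results[p] = sorted(scored)[:n_best]
    return results

# Made-up data: 19 characteristics, 4 candidate profiles, two true sources.
rng = np.random.default_rng(2)
X = rng.random((19, 4))
y = 10.0 * X[:, 0] + 5.0 * X[:, 2] + 0.01 * rng.standard_normal(19)
results = all_possible_regressions(X, y)
for p, best in results.items():
    print(p, [(round(cp, 2), s) for cp, s in best])
```

For the full model Cp equals the number of fitted parameters by construction, so the interesting comparison is among smaller subsets whose Cp is close to their own p.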
Since the all possible regressions procedure tests all combinations of
sources it may be more comprehensive than the procedure used in Quail Roost
II. However, unlike the Quail Roost II procedure, it does not focus on
removing negative source contributions. This may not be a serious drawback
since the procedure generates many "near optimum" models. If the 2nd or 3rd
best model of p sources contains negative source estimates, then models
containing p-1 sources may be examined. The likelihood of significant and
negative source estimates should decrease as models contain fewer sources.
Influential Characteristics
Influential characteristics (or observations) are filter
characteristics that exert a disproportionate influence on the fitted
model. For example, lead is an influential characteristic for automobiles,
REGRESSIONS WITH 1 VARIABLE (CP)
  CRITERION   VARIABLES
  109293.     6
  113790.     5
  116458.     10
  125723.     12
  142717.     11
REGRESSIONS WITH 2 VARIABLES (CP)
  CRITERION   VARIABLES
  1390.26     10 12
  7533.54     6 12
  13928.      6 11
  49429.3     5 6
  50748.8     ? 6
REGRESSIONS WITH 3 VARIABLES (CP)
  CRITERION   VARIABLES
  711.606     6 10 12
  1094.24     10 11 12
  3933.7      6 7 12
  5033.32     6 9 12
  5745.85     4 6 12
REGRESSIONS WITH 4 VARIABLES (CP)
  CRITERION   VARIABLES
  446.823     6 10 11 12
  485.597     6 10 12 13
  511.658     (partly illegible; includes 10 and 12)
  519.943     (partly illegible; includes 10 and 12)
  552.52?     (partly illegible; includes 10 and 12)
REGRESSIONS WITH 5 VARIABLES (CP)
  CRITERION   VARIABLES
  61.0221     5 10 11 12 13
  243.687     6 9 10 11 12
  272.?88     6 10 11 12 13
  296.008     1 6 10 11 12
  341.835     2 6 10 11 12
REGRESSIONS WITH 6 VARIABLES (CP)
  CRITERION   VARIABLES
  52.6796     5 6 10 11 12 13
  62.2877     4 5 10 11 12 13
  62.7126     5 7 10 11 12 13
  135.011     6 9 10 11 12 13
  173.831     4 6 9 10 11 12
REGRESSIONS WITH 7 VARIABLES (CP)
  CRITERION   VARIABLES
  25.4399     2 4 5 10 11 12 13
  27.2665     3 6 9 10 11 12 13
  28.0219     5 6 9 10 11 12 13
  49.3224     4 5 6 10 11 12 13
  91.4441     6 ? 9 10 11 12 13
REGRESSIONS WITH 8 VARIABLES (CP)
  CRITERION   VARIABLES
  7.75109     4 5 6 7 10 11 12 13
  7.86788     3 4 6 7 10 11 12 13
  11.6995     2 4 5 6 10 11 12 13
  11.9773     1 4 5 6 10 11 12 13
  13.9464     1 2 4 5 10 11 12 13
REGRESSIONS WITH 9 VARIABLES (CP)
  CRITERION   VARIABLES
  6.64801     2 4 5 6 7 10 11 12 13
  6.99388     1 4 5 6 7 10 11 12 13
  8.83725     3 4 5 6 7 10 11 12 13
  9.32423     3 4 6 7 9 10 11 12 13
  9.39112     4 5 6 7 8 10 11 12 13
REGRESSIONS WITH 10 VARIABLES (CP)
  CRITERION   VARIABLES (illegible in source)
  8.11524
  8.29817
  8.43179
  8.61338
  8.66667
REGRESSIONS WITH 11 VARIABLES (CP)
  CRITERION   VARIABLES (illegible in source)
  10.026
  10.2101
  10.2512
  10.3361
  10.5527
REGRESSIONS WITH 12 VARIABLES (CP)
  CRITERION   VARIABLES (illegible in source)
  12.0252
  12.026
  12.1117
  12.1339
  12.2808
REGRESSIONS WITH 13 VARIABLES (CP)
  CRITERION   VARIABLES
  14.         1 2 3 4 5 6 7 8 9 10 11 12 13
("?" marks an illegible character.)
Table 18. Results of all possible regression procedure using synthetic data set
and Mallows' Cp statistic. Table 17 shows the identification of the
sources (shown under variables).
since the estimated automobile contribution is roughly proportional to the
measured lead concentration, which is a fairly unique tracer for this
source. In general, estimated source contributions may depend more on the
influential points than the rest of the data.
Influential characteristics may increase the accuracy and decrease the
uncertainty of the estimate of the source contribution if the
characteristics are correct. Such (legitimate) extreme characteristics are
useful as indicators of particular sources in RM, and also serve to show
which elements or characteristics should be analyzed in the filter sample.
However, the source apportionment estimate may be misleading if the
influential characteristics result from improperly recorded data,
measurement errors, or inappropriate source profiles.
Many diagnostic procedures exist to detect influential
characteristics. These procedures may be classified into three general
families (Belsley et al., 1980):
1. Single and multiple row deletions: Essentially, one or more filter
characteristics is removed from the analysis, and the source apportionment
is re-estimated. Large changes between the source apportionment estimated
with missing characteristics and with the complete set of characteristics
may indicate critical (i.e., possibly unique) filter characteristics and
possibly uncertain source estimates. This process is repeated for each
filter characteristic. Single and multiple row deletion measures include:
change of estimated source class coefficients—DFBETA(S)
change of fit--DFFIT(S), PRESS
2. Examination of the source profile (X) matrix: These measures examine
only the source profile matrix and look for single or multiple
characteristics which are unique to each profile. If few characteristics
distinguish a profile, the source apportionment may be sensitive to these
characteristics. Common measures include:
diagonals of the projection matrix H = X (X^T X)^-1 X^T
covariance matrix—COVRATIO
3. Examination of residuals: Residuals (the difference between observed
and fitted filter characteristics in CMB models) may be examined to
determine if least squares procedures are appropriate and tests of
significance are valid, i.e., by examining whether the residuals are
correlated and normally distributed. In addition, plots of residuals may
provide a great deal of information concerning influential points. Residual
diagnostics include:
normal probability plots
standardized and studentized residuals
partial residual plots
partial regression leverage plots
Two types of diagnostics are illustrated using the synthetic data set
and ordinary least squares procedures. Five of the profiles that showed
collinearity problems were removed from the model. Actual and estimated
source contributions are shown in Table 19. This apportionment is meant
only to illustrate the use of diagnostics: no attempts were made to improve
the apportionment, e.g., by omitting sources with negative contributions.
DFBETAS are single row deletion statistics, which quantify the effect
of an individual filter characteristic on the estimated source class
contribution. DFBETAS indicate the change of the estimated source
contribution in terms of its standard error. A reasonable cut-off for
determining influential characteristics is ±1, where omitting a
characteristic, e.g., Vn concentration, might change the estimated source
contribution by one standard deviation (of its uncertainty).
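The row-deletion idea behind DFBETAS, together with the diagonals of the projection matrix H, can be sketched by explicitly refitting the model with each characteristic left out. This is an illustrative brute-force version assuming the usual scaling by the deleted-row standard error; the function names, the random profiles and the deliberately corrupted observation are all hypothetical.

```python
import numpy as np

def leverage(X):
    """Diagonals of the projection matrix H = X (X^T X)^-1 X^T."""
    H = X @ np.linalg.inv(X.T @ X) @ X.T
    return np.diag(H)

def dfbetas(X, y):
    """DFBETAS[i, j]: scaled change in coefficient j when row i is deleted."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    out = np.empty((n, p))
    for i in range(n):
        keep = np.arange(n) != i
        Xi, yi = X[keep], y[keep]
        beta_i, *_ = np.linalg.lstsq(Xi, yi, rcond=None)
        resid = yi - Xi @ beta_i
        s_i = np.sqrt(resid @ resid / (n - 1 - p))   # deleted-row std. error
        out[i] = (beta - beta_i) / (s_i * np.sqrt(np.diag(XtX_inv)))
    return out

# Made-up data: 19 characteristics, 3 profiles, one grossly wrong measurement.
rng = np.random.default_rng(3)
X = rng.random((19, 3))
y = X @ np.array([5.0, 2.0, 1.0]) + 0.01 * rng.standard_normal(19)
y[18] += 5.0                                         # simulated bad observation
D = dfbetas(X, y)
h = leverage(X)
print(np.round(D[18], 1), round(float(h.sum()), 3))
```

Rows whose absolute DFBETAS exceed about 1 correspond to the influential characteristics discussed in the text; the leverages sum to the number of fitted coefficients.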
Table 20 shows DFBETAS for the synthetic data set. Each row of the
Table shows the DFBETAS for the removal of one characteristic from the CMB
model. For example, removing characteristic 1, the mass of carbon, from the
regression would change the estimated oil-A contribution by 3.83 times the
standard error of this coefficient, a large amount. Five characteristics
(1=C, 3=AL, 7=K, 13=Fe, and 19=Pb) have DFBETAS which exceed the cut-off for
several source classes. As expected, Pb is an influential characteristic
for the auto source class. The four other influential characteristics
affect steel-B, oil-A, incinerator, aggregate, coal-A, soil, coal-B and wood
classes.
Partial residual plots permit a visual indication of influential
subsets. Each plot reveals the relationship between a particular source
profile, adjusted for its collinearity with the other profiles, and the
Source Apportionment in ug/m3
Source Class   Actual
Steel-A         0.016
Steel-B         0.000
Oil-A           0.121
Incin           0.009
Oil-B           0.275
Coal-A          0.022
Aggreg          0.021
Glass           0.076
Basalt          0.019
Soil           50.321
Auto            7.795
Wood           19.737
Coal-B          0.000
OLS estimates (row alignment lost in source): 3.6, -1.0, 13.7, -9.6, 44.0,
4.7, 20.7, -6.4
Robust estimates (row alignment lost in source): 13.8, -9.7, 44.1, 4.7,
20.6, -6.5
Table 19. Estimated CMB source apportionment using 8 source classes and
synthetic data set; actual apportionments are PEM predictions; only
coefficients with 95 percent confidence are shown.
OBSER.  STEELB    OILA   INCIN   COALA  AGGREG    SOIL    AUTO    WOOD   COALB   CONST
  1      1.170   3.831  -3.258  -0.545  -0.539   1.688   0.219  ?3.510  -3.015  -0.315
  2      0.025   0.066   0.136   0.001  -0.023   0.065   0.314  -0.208  -0.108  -0.010
  3      0.938   1.867   0.122  -0.117   0.192  -0.346  -0.171  -0.109  -2.517  -0.198
  4      0.039  -0.422   0.096  -0.197   0.111  -0.314   0.087   0.145   0.720   0.070
  5      0.021  -0.514   0.003   0.055  -0.037   0.213   0.029  -0.055  -0.439   0.016
  6      0.453  -0.316   0.089   0.538  -0.551   0.434  -0.190  -0.416  -0.223  -0.207
  7      1.597  -0.557  -2.369  ?.797   -1.878   1.322   0.010  -1.742  -0.437  -0.659
  8     -0.155  -1.555  -0.728   0.447  ?0.287  -1.620   0.029   0.198   1.472   0.075
  9      0.045  -0.061  -0.159   0.130  -0.142   0.117  -0.136  -0.044  -0.073   0.236
 10      0.155  -0.201  -0.027   0.175  -0.140   0.041   0.141  -0.206   0.097  -0.440
 11      0.002  -0.001   0.002   0.001  -0.001  -0.000   0.004  -0.003   0.002  -0.010
 12      0.012   0.014  -0.019   0.003  -0.010   0.018  -0.030   0.004  -0.030   0.062
 13      3.227   2.354   1.114  -1.834   0.982   1.090   0.275  -0.410  -1.702  -0.485
 14      0.0??  -0.056   0.011   0.043  -0.031   0.004   0.064  -0.063   0.041  -0.173
 15      0.017   0.004   0.047  -0.015   0.016  -0.009   0.065  -0.025   0.026  -0.145
 16     -0.50?   0.163   0.426  -0.507   0.545  -0.441   0.241   0.406   0.331  -0.159
 17     -0.007   0.003  -0.003  -0.004   0.004  -0.002  -0.009   0.009  -0.002   0.024
 18     -0.631   0.227   0.060  -0.504   0.502  -0.353   0.603   0.158   0.050   1.069
 19      0.056   0.313   0.207  -0.051  -0.012   0.104  -3.850  ?.136   -0.238   0.312
("?" marks an illegible character or sign.)
Table 20. DFBETAS row deletion statistics for CMB of synthetic data set.
Observations refer to the elemental measures in Table 14.
residuals (the difference between fitted and observed filter
characteristics), adjusted for the other sources (i.e., the source
contributions of the other sources have been removed from the residuals).
Thus, each plot illustrates the fit between source profiles and filter
characteristics where the effect of the other sources has been removed.
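The adjustment described above matches what is often called a partial regression (added-variable) plot: both the profile of interest and the filter characteristics are regressed on the remaining profiles, and the two sets of residuals are plotted against each other. The sketch below computes only the plot coordinates; the function name and the random test data are hypothetical, and no constant term is included.

```python
import numpy as np

def partial_regression_coords(X, y, k):
    """Coordinates for the partial regression plot of profile k.

    Returns (adjusted profile, adjusted characteristics): the residuals of
    column k and of y after regressing each on the other columns of X.
    """
    others = np.delete(X, k, axis=1)

    def resid(v):
        b, *_ = np.linalg.lstsq(others, v, rcond=None)
        return v - others @ b

    return resid(X[:, k]), resid(y)

# Made-up data: 19 characteristics, 3 profiles.
rng = np.random.default_rng(4)
X = rng.random((19, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + 0.01 * rng.standard_normal(19)
rx, ry = partial_regression_coords(X, y, k=1)
slope = (rx @ ry) / (rx @ rx)               # least-squares line through origin
beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
print(slope, beta_full[1])
```

The slope of the line through these points equals the coefficient of profile k in the full regression, which is why a single isolated point far from the rest can dominate the corresponding source estimate.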
Partial residual plots for 8 source classes in the CMB model using the
synthetic data set are shown in Figure 14. For example, the top left plot
is the partial residual plot for the soil class. The estimated soil
contribution is related to the slope of the regression line (shown with
asterisks); the significance or fit of the data to the soil profile is
indicated by the scatter around the regression line. The plot for the
incinerator class shows a negative slope (and thus negative source estimate)
and considerable scatter, both of which imply that results for this source
may not be meaningful.
Influential points or subsets typically form isolated clusters of
point(s) located some distance from the majority of points on the partial
residual plots. Influential points heavily influence the regression line,
i.e., the source estimate, since the regression line minimizes the sum of
squared differences. For example, point D (indicated with an arrow)
(characteristic 7=K) is an influential point for the oil-A, incinerator,
coal-A, aggregate and soil classes. This point alone heavily weights the
regression line for oil-A, incinerator, coal-A and aggregate source classes,
which were poorly estimated by OLS. The validity of the K measurement is
thus highly important for these source estimates. Similarly, point K
(19=Pb) is influential for the auto and wood sources. Error in the lead
measurement will highly bias the auto estimate. However, the wood source
estimate may be more resistant since other characteristics lie on the
regression line. This information is similar to that provided by the
DFBETAS. The partial regression plots also show an influential subset—not
definitively indicated by the DFBETAS--in the case of coal-B, which has two
influential points, B (3=Al) and D (7=K).
DFBETAS and partial residual plots may be effective ways to determine
the effect of both single and multiple influential characteristics in CMB
models. Influential characteristics may require further analysis to
determine the legitimacy of the characteristics.
[Figure 14: eight partial regression plot panels, one per source class
(legible panel titles include SOIL, OILA, AGGREG and COALB); plotted points
are labeled by letter. Plot content is not reproducible in text.]
Figure 14. Partial regression plots for 8 source classes. Adjusted elemental
mass on ordinate; adjusted source profile (as indicated) on
abscissa. Fitted least squares line is shown with asterisks.
Robust Regression
Robust regression procedures are designed to diminish the effect of
outliers, such as erroneously measured filter characteristics, data entry
mistakes, etc. In ordinary least squares (OLS), outliers tend to have a
disproportionate effect on estimated source contributions. In robust
regression, the influence of outliers is constrained by downweighting
them. Robust procedures are especially useful when the normality
assumptions of OLS are in question. The principal value of robust
regression in CMB RM, perhaps, is to provide a check on OLS results.
Agreement between the OLS and robust regression results may help ensure that
the estimated source class contributions are meaningful. Robust regression
may be viewed as an automated way of determining the influence of outliers
(as well as influential points).
Most robust procedures employ iteratively reweighted least squares.
First, source estimates and residuals are calculated using OLS.
Then, a specified influence function (Hampel, 1974) is used to downweight
large residuals in a weighted least squares regression. An iterative
process using weighted residuals and the influence function continues until
convergence. Watson's "effective variance" procedure also uses iteratively
reweighted least squares procedures. However, the effective variance
procedure gives more weight to ambient source measurements which are
measured with precision, as well as source class estimates which are
certain. It does not diminish the effect of outliers which tend to pull the
OLS solution.
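The iteratively reweighted least squares procedure described above can be sketched in Python as follows. This is a minimal illustration, not the TROLL implementation: it assumes NumPy, Huber-type weights with the commonly used tuning constant 1.345, and a residual scale based on the median absolute deviation. The names `huber_irls`, `F` (source profiles) and `c` (ambient concentrations) are hypothetical.

```python
import numpy as np

def huber_irls(F, c, k=1.345, max_iter=100, tol=1e-10):
    """Robust source estimates by iteratively reweighted least squares.

    Returns (estimates, weights). Weights come from the last reweighting
    step; values below 1 mark downweighted characteristics.
    """
    s, *_ = np.linalg.lstsq(F, c, rcond=None)      # start from the OLS solution
    w = np.ones(len(c))
    for _ in range(max_iter):
        r = c - F @ s
        # Robust residual scale from the median absolute deviation (assumed choice).
        scale = np.median(np.abs(r - np.median(r))) / 0.6745
        if scale == 0.0:
            break
        u = np.abs(r) / scale
        w = k / np.maximum(u, k)                   # 1 within the threshold, k/u beyond it
        sw = np.sqrt(w)
        s_new, *_ = np.linalg.lstsq(F * sw[:, None], c * sw, rcond=None)
        converged = np.max(np.abs(s_new - s)) < tol
        s = s_new
        if converged:
            break
    return s, w
```

Comparing the returned estimates with the OLS solution, and listing the characteristics whose weights fall below 1, gives the kind of check on OLS results discussed in this section.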
Robust procedures were used on the synthetic data sets using 8 source
signatures in TROLL. The standard convex Huber influence function was used
to weight the squared residuals (Peters et al., 1981). Robust results
were very similar to OLS results, i.e., few characteristics were
significantly downweighted (Table 19). This probably occurred because the
contamination method did not generate outliers (only normal errors were
used). In practice, however, results using OLS and robust procedures may
vary.
It is also possible to determine which characteristics were
downweighted in the robust regression. Examination of such points may
identify possible errors. We note that it is possible to combine robust
procedures with ridge regression and the effective variance technique.
Lastly, a procedure similar to robust regression, called bounded-influence
regression, designed to limit the bias caused by contamination of data such
as that caused by keypunch mistakes (Peters et al., 1981), may have
application to CMB apportionments.
HYBRID RECEPTOR/DISPERSION MODELS
This section discusses several possibilities for hybrid
dispersion/receptor models. Hybrid models use a receptor model approach in
conjunction with source emission rates, source locations and/or dispersion
information. Such models conceivably might provide a more accurate and
flexible means of source apportionment. They may also help reconcile
differences between source and receptor models.
Some potential uses of hybrid RM/DM approaches include:
Operational validation. Receptor and dispersion models are used separately.
Inter-model comparisons may help identify emission inventory deficiencies,
i.e., missing or mis-estimated emission sources. Inter-comparisons can help
"confirm" apportionment estimates by employing a "consensus" standard.
Diagnostic validation. Model comparisons are used to refine and calibrate
model parameters, e.g., dispersion coefficients and source profiles. Such
validation may be appropriate for model development.
Complementary use. Outputs from receptor and dispersion models are
combined. For example, DM might be used to apportion regional or background
concentrations, and RM might estimate local concentrations. Other examples
include using RM with filter samples collected according to the observed
frequency of meteorological conditions, to determine long term averages;
collecting filter samples during meteorological patterns which result in the
highest DM predictions; restricting the profiles in CMB models to upwind
sources; and calculating back-trajectories which incorporate transformation
and deposition for receptor determined mass source apportionments (e.g.,
Thurston, 1983). Some of these approaches may be more practical with
short-term (e.g., hourly) filter samples.
Coupled models. All data, including meteorology, source emissions and
profiles and filter characteristics are used in coupled models. Coupled
models intimately combine receptor and dispersion approaches, and thus may
be more complicated than other hybrid approaches.
Coupled models might be classified by their primary orientation, around
either dispersion or receptor approaches. Some speculations concerning
possible coupled hybrid dispersion/receptor models are given below.
For receptor oriented hybrid models, outputs of DMs might be treated
stochastically. CMB receptor models might then be extended by either
including DM predictions as priors in Bayesian optimizations, or by using DM
predictions of source contributions as coefficient weights in weighted
optimizations. Analogous methods are possible for multivariate RMs. Here,
preprocessed meteorological data might become new variables in a factor
analysis or multiple regression model. Association of particular
meteorological patterns with source class signatures may identify the likely
direction and distance of contributing sources. Alternatively, source
contributions predicted by DMs may become variables in factor analyses,
along with the usual filter characteristics. Associations of predicted
source contributions with the respective source profiles may indicate areas
of agreement between the models.
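One possible reading of "DM predictions as priors" is a Gaussian prior on the source contributions, centered on the DM predictions, combined with a weighted least squares CMB fit. The Python sketch below shows the resulting closed-form estimate. The report does not specify a formulation, so the Gaussian assumption, the function name `cmb_with_dm_prior` and all argument names are hypothetical; NumPy is assumed.

```python
import numpy as np

def cmb_with_dm_prior(F, c, sigma_c, s_dm, sigma_dm):
    """Source contributions minimizing
        sum(((c - F s) / sigma_c)**2) + sum(((s - s_dm) / sigma_dm)**2).

    F        : (n_characteristics, n_sources) source profiles
    c        : ambient concentrations, with uncertainties sigma_c
    s_dm     : DM-predicted source contributions, with uncertainties sigma_dm
    """
    W = np.diag(1.0 / np.asarray(sigma_c, dtype=float) ** 2)    # measurement precision
    P = np.diag(1.0 / np.asarray(sigma_dm, dtype=float) ** 2)   # prior precision
    A = F.T @ W @ F + P
    b = F.T @ W @ np.asarray(c, dtype=float) + P @ np.asarray(s_dm, dtype=float)
    return np.linalg.solve(A, b)
```

With very uncertain DM predictions the result approaches the ordinary weighted CMB solution; with very certain DM predictions it approaches the DM apportionment itself.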
The second general type of coupled models is based around dispersion
modeling. In some ways, this approach is similar to the diagnostic
validation discussed earlier applied to DMs. The (deterministic) inputs or
parameters of DMs are considered as stochastic variables (with estimated
distributions). Optimization is used to match DM predictions to observed
concentrations or receptor determined apportionments, at the same time
maximizing the probability of the stochastic variables in the DM. Yamartino
(1982) and Cooper (1982) have used a time series of observed concentrations
to estimate average source strength by region and sector, respectively. The
identification of large emissions in areas without known sources may help
quantify the regional (background) contribution. These studies did not use
aerosol characteristics.
More sophisticated coupled models may be possible if short-term (CMB)
receptor model results are available and of suitable accuracy. This may
permit the estimation of short-term parameters in dispersion models. For
example, source inventories generally include only annual average emission
rates. However, hourly and seasonal emission rates are highly variable.
The distribution of hourly emissions may be estimated. The optimization
model may be used to select emission rates, within the assigned
distribution, that produce results closest to the source apportionment
determined previously using RMs. DM using the revised emission rates then
might provide the hybrid source apportionment. Similar approaches may be
possible for other variables in dispersion models, e.g., dispersion
coefficients and wind direction.
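A very simple version of the emission rate adjustment described above is sketched below in Python. It assumes a single receptor and a linear source-receptor relation, so that the DM-predicted contribution of a source class equals a transfer coefficient times its emission rate, and it represents the "assigned distribution" only by lower and upper bounds on each rate. The function and argument names are hypothetical.

```python
def adjust_emission_rates(transfer, rm_contrib, q_low, q_high):
    """Pick, for each source class, the emission rate within [q_low, q_high]
    whose DM-predicted contribution (transfer * rate) is closest to the
    contribution estimated by the receptor model.

    transfer   : positive DM transfer coefficients (concentration per unit emission)
    rm_contrib : RM-estimated contributions at the receptor
    """
    adjusted = []
    for t, a, lo, hi in zip(transfer, rm_contrib, q_low, q_high):
        q_match = a / t                              # rate that reproduces the RM estimate
        adjusted.append(min(max(q_match, lo), hi))   # keep it within the allowed range
    return adjusted
```

Rates that end up at a bound indicate source classes for which the DM cannot be reconciled with the RM apportionment by plausible emission changes alone.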
There has been relatively little work in the development of hybrid
models. Some of these models may be used to help apportion both local and
regional pollutants in the Philadelphia study. Data from this study, as
well as from realistic synthetic data sets, may provide a means to test the
feasibility of the various hybrid models.
COMPARISONS BETWEEN SOURCE AND RECEPTOR MODELS
Some aspects of model validation and model inter-comparisons are
discussed in this section with reference to the Philadelphia study. The
evaluation of similarities and differences between dispersion and receptor
models may help explain strengths and weaknesses of the models, and possibly
indicate the certainty of the source apportionment.
One approach for identifying and quantifying the likely causes of
model agreement (or discrepancy) might use regression or factor analyses.
These procedures may be used to associate measures of agreement between RM
and DM results with meteorological data, averaging time, source class, and
other information.
Evaluation Criteria
This section addresses the quantitative "performance" measures used to
evaluate model results. In the Philadelphia study, both RM and DM results
will be compared, and the true source apportionment is not known. Such
inter-model comparisons involve somewhat different concerns than model
validation studies where predicted concentrations are compared to observed
concentrations, or the Quail Roost exercise, which inter-compared only
receptor models.
There are several useful strategies for inter-model comparisons:
(1) Comparison between all models. Provides the most information and makes
the fewest assumptions, but n(n-1)/2 pairwise comparisons must be made for n
models, a potentially large number.
(2) Comparison of each model's results to the mean or median result. This
is useful since it highlights deviations from the mean or median
prediction. However, there is no assurance that the mean or median
prediction is the best estimate of source apportionment.
(3) Comparison to an estimate of true source apportionment, based on
considering and weighing all available evidence. Joint and independent
application of RM and DM as suggested by Core et al. (1982) might be the
means to obtain the source apportionment estimate. Additional ways of
combining receptor and source models were discussed previously.
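The first two strategies can be illustrated with a short Python sketch using only the standard library. The agreement measure (mean absolute difference in source class contributions) and the names `pairwise_differences`, `deviation_from_median` and `results` are assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import median

def pairwise_differences(results):
    """Strategy (1): mean absolute difference for every pair of models.

    results : dict mapping model name -> list of source class contributions
              (same source class order for every model).
    """
    out = {}
    for a, b in combinations(sorted(results), 2):
        diffs = [abs(x - y) for x, y in zip(results[a], results[b])]
        out[(a, b)] = sum(diffs) / len(diffs)
    return out

def deviation_from_median(results):
    """Strategy (2): each model's deviation from the median contribution."""
    names = sorted(results)
    n_classes = len(results[names[0]])
    med = [median(results[m][k] for m in names) for k in range(n_classes)]
    return {m: [results[m][k] - med[k] for k in range(n_classes)] for m in names}
```

The number of entries returned by `pairwise_differences` grows quickly with the number of models, which is the practical drawback of the first strategy.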
While the first two strategies permit direct comparisons of model
results, it is felt that the third strategy may be the most comprehensive
approach to model inter-comparison. However, as discussed previously under
the section on hybrid receptor/dispersion modeling, there are not yet
objective and practical approaches which appropriately combine all evidence.
A second issue of RM and DM comparisons concerns the quantity
compared. In general, it would seem that the quantities compared should
relate to modeling objectives. Comparison of (possibly size fractionated)
source class contributions of the particulate mass might be appropriate, as
used in Quail Roost II. It is important to realize that neither DMs nor RMs
apportion regional sulfate, which may contribute much of the fine
particulate mass. DMs, such as PEM, do not model regional sources (beyond 60 km
distance). RMs may not accurately apportion sulfate since sulfate does not
have identifiable tracers. Consequently, apportionment of the sulfate in
the Philadelphia study will involve large uncertainties. Apportionment of
sulfate might be broken out separately from the apportionment of local
sources, to help clarify the contribution from regional sources.
Third, the different nature of the outputs produced by DM and RM should
be recognized in inter-comparisons. DMs provide deterministic point
estimates; RMs provide statistical or qualitative estimates. While it is
possible to weight comparisons between RM results with estimates of their
statistical confidence, this cannot be accomplished easily with DM. One
possible solution is to compare only the statistically significant RM
results to DM results. Statistically insignificant results might be set to
zero. One or several confidence levels might be selected for this purpose.
(Confidence intervals should be explicitly incorporated in the calculation of
the best estimate of source apportionment, as discussed later.)
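A minimal Python sketch of the screening step suggested above follows. It assumes each RM source estimate comes with a standard error and treats an estimate as significant when the ratio of estimate to standard error reaches a chosen critical value. The default of 2.0 (roughly a 95 percent two-sided level for large samples) and the function name are assumptions.

```python
def significant_contributions(estimates, std_errors, t_crit=2.0):
    """Return RM source estimates with statistically insignificant values set to zero."""
    screened = []
    for s, se in zip(estimates, std_errors):
        is_significant = se > 0 and abs(s) / se >= t_crit
        screened.append(s if is_significant else 0.0)
    return screened
```

Repeating the screening with several values of `t_crit` corresponds to selecting several confidence levels for the comparison with DM results.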
Averaging Time
Bias, distributional effects and extrapolation errors are three effects
associated with averaging time which may influence model inter-comparisons.
DMs usually can predict long-term, i.e., seasonal and annual,
concentrations with greater accuracy than short-term, i.e., hourly to daily,
concentrations. Many of the errors of short-term DMs result from inaccurate
or incomplete data from the source inventory, meteorology and ambient
monitoring. Many of these short-term errors are random, and tend to be
averaged out in long-term concentration averages. (These generalizations
may not apply to the modeling of regional pollutants.)
RMs also may be susceptible to short-term errors since source class
profiles are usually assumed to be time and source averages of measured or
estimated profiles. The errors in the source profile may be small if many
individual emission sources compose a source class (and if there are no
systematic biases). For example, if in a particular source class there are
n individual sources which equally contribute to observed concentrations,
then the uncertainties of the source profile may be expressed as:
$$\text{standard deviation of profile} = \frac{\text{standard deviation of individual profiles}}{\sqrt{n}}$$
since variances are additive. This suggests that source profile errors will
be small for source classes composed of many sources, even in short time
periods. Thus, the accuracy of RM based apportionments of automotive, oil,
soil, etc., source classes may be largely insensitive to averaging time. In
contrast, the accuracy of apportionments for some source classes composed of
only a few sources may be more sensitive to averaging time. Thus, long-term
apportionments from both RMs and DMs may be the most valid, although some
short-term RM results may be equally valid. This may be tested by
contrasting measures of agreement between RM and DM results for different
source classes at different averaging times.
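The relation used above, that the standard deviation of a class profile equals the standard deviation of the individual profiles divided by the square root of n when n sources contribute equally and independently, can be checked with a small simulation. The Python sketch below uses only the standard library. The normal error model, the trial count and the function name are assumptions made for illustration.

```python
import random
import statistics

def simulated_class_profile_sd(n_sources, sigma_individual, n_trials=4000, seed=0):
    """Empirical standard deviation of a class profile formed as the average
    of n_sources independent individual profiles (one characteristic only)."""
    rng = random.Random(seed)
    class_values = []
    for _ in range(n_trials):
        draws = [rng.gauss(0.0, sigma_individual) for _ in range(n_sources)]
        class_values.append(sum(draws) / n_sources)
    return statistics.pstdev(class_values)
```

For example, `simulated_class_profile_sd(16, 1.0)` should come out close to 1.0 divided by the square root of 16, whereas a class made up of only two or three sources retains most of the variability of its members.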
It may be worthwhile to evaluate model agreement in short time periods
since, in many circumstances, it may not be possible to sample and analyze a
large number of samples. This might exclude multivariate models, since
these models require many observations. However, such comparisons may yield
information of practical value regarding the accuracy and reliability of CMB
models. (The Quail Roost II exercise primarily used multivariate models,
and thus only the average apportionment over 40 12-hour averaging periods
was evaluated.)
The confidence level or uncertainty in short-term (and long-term)
source apportionments should be included in inter-model analyses. The
uncertainty of single and multiple short-term apportionments, e.g., by
applications of CMB models, might be estimated using statistics such as the
mean coefficient of variation of estimated contributions in particular
source categories.
The second effect of averaging time is its influence on pollutant
distributions. It is widely recognized that pollutant concentrations at
short averaging times have approximately lognormal distributions and large
standard deviations. Concentrations at long averaging times have much
smaller standard deviations. Similarly, the standard deviation of source
apportionments will depend on averaging time. Comparisons between DM and RM
should account for these distributions by employing appropriate
performance measures, as discussed by the AMS (1979) and Fox (1981). This
also suggests that hybrid RM-DM or RM models which use short averaging times
(e.g., several hours) may be subject to non-normal errors. Consequently,
such models should employ robust procedures which are resistant to such
"contaminated" data, such as robust regression discussed earlier.
The third effect of averaging time concerns extrapolation errors. In
general, seasonal or annual source apportionments using RM and (necessarily)
a limited number of filters will require extrapolation of data. The number
of filter samples required to produce representative longer term source
apportionments is unknown, but is likely to be dependent on the site and
circumstance.
RECEPTOR MODEL PROTOCOLS
Pace (1983) has discussed possible roles for receptor models in the
regulatory framework. There are a number of important and unresolved
questions concerning the practical use of receptor models in the regulatory
setting. Protocols represent well defined and justified methods which help
assure quality results. Protocols for receptor models may be classified
into three areas: 1) protocols for the physical aspects of the study,
including the design, siting, operation, and quality control and assurance
of monitoring and filter analyses; 2) modeling protocols, which include
model selection, treatment of uncertainty and diagnostics; and 3) protocols
for model inter-comparisons, to determine preferred approaches. This
section discusses aspects of RM protocols using this classification.
Protocols for Monitoring and Analysis
There are a number of general issues related to the amount and
quality of the information required in RM studies. These are important
issues since most costs of RMs are associated with sampling and analysis of
filters, rather than modeling. These issues and trade-offs include:
(1) Number of analytic tests on the filter samples: There is almost an
unlimited amount of analytical testing that can be done on a set of
samples. Obviously, it is important to arrange testing such that the most
cost effective methods are conducted first. Analytical tests that can break
collinear relationships between sources can be determined by collinearity
diagnostics. Site-specific cut-offs for analyses might be established,
beyond which further analysis is not cost-effective. This trade-off between
more and fewer analytical tests could be defined prior to regulatory use.
(2) Number of samples: A minimum number of filter samples is required for
representativeness; a larger number of samples is required for multivariate
receptor approaches. Here again, beyond some point it may not be worthwhile
to collect additional samples. Documentation of RM performance for specific sites and
situations might help define the necessary number of samples.
(3) Source sampling: More extensive source sampling may increase the
accuracy and selectivity of RM. Such analysis is expensive, time consuming,
and may not yield representative signatures. Thus, as in filter analysis,
the costs of source sampling can be balanced against its decreasing marginal
value.
(4) Length of sampling period: At this point most sampling periods are 12
hours in length. This allows a representative sample of seasonal conditions
with a reasonable number of filters. However, 12 hour intervals make it
difficult to use the information contained in wind direction and other
meteorological data. It is now possible to obtain hourly samples with
sufficient loadings for the analytical procedures. With hourly samples, the
source apportionment problem might well decompose into different problems
for each wind direction and thus yield finer source resolution. The cost,
however, of running 20 to 25 days of hourly samples through the analytical
procedures would be prohibitive. Thus, a trade-off exists between the
representativeness of the data and the ability to utilize meteorological
information.
Modeling Protocols
The diagnostic procedures discussed previously are a subset of
modeling protocol issues. The main objectives of using diagnostic tools are
to increase the speed and effectiveness of receptor modeling and to decrease
the chance of spurious results. Diagnostics might be incorporated into
model protocols to 1) detect the presence of problems such as collinearity,
outliers and influential observations, 2) assess the extent to which the
source apportionment is degraded, and 3) determine whether corrective action
is necessary. The usefulness of the various diagnostic procedures might be
subject to the same sort of evaluation as model comparison, in order to
determine which procedures are the most appropriate.
A receptor modeling protocol, which incorporates diagnostics, might
include:
1. Use of source inventory, dispersion modeling, factor analysis, or
microscopy to identify major source contributors.
2. Use of measured or estimated source profiles in an effective
variance CMB or multivariate analysis to quantify impacts and possibly
detect additional sources.
3. Estimate the effect of collinearity in the source profiles (e.g.,
using singular value decomposition).
If necessary, re-estimate apportionment (e.g., using ridge or
principal components regression, or deletion/combination of source
classes using all possible regressions). Obtain positive and
significant source contributions.
4. Determine the sensitivity to outliers and influential
points using DFBETAS, partial residual plots, and/or effective
variance techniques.
5. Confirm CMB results using robust regression techniques, or "fresh"
data.
Clearly, receptor model protocols will depend on the circumstances and the
effort of the study.
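Step 3 of the protocol above, estimating collinearity in the source profiles by singular value decomposition, can be sketched in Python as follows. The sketch assumes NumPy, scales each profile to unit length before the decomposition as in Belsley et al. (1980), and assumes no profile is identically zero. The name `condition_indexes` and the argument `F` are hypothetical.

```python
import numpy as np

def condition_indexes(F):
    """Condition indexes of a source profile matrix.

    F : (n_characteristics, n_sources) array, one column per source class.
    Returns one index per singular value; the largest is the condition number.
    """
    F = np.asarray(F, dtype=float)
    scaled = F / np.linalg.norm(F, axis=0)         # unit-length columns
    singular_values = np.linalg.svd(scaled, compute_uv=False)
    return singular_values[0] / singular_values
```

Indexes near 1 indicate well separated profiles. Large values (Belsley et al. suggest that values of roughly 30 or more deserve attention) indicate that some source classes may need to be combined, deleted, or estimated by ridge or principal components regression.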
Protocols for Inter-comparison of Receptor Models
Model inter-comparison serves the purposes of documenting model
performance and uncertainty, and determining model applicability, i.e.,
matching models and uses. Inter-comparison studies of receptor models have
not stressed both objectives, largely due to limited data. Model
inter-comparisons may include measures of model adequacy for different
source and background compositions (e.g., rural western vs. urban eastern
sites); varying averaging times; and various levels of study effort (e.g.,
with or without source sampling). More effort is needed to estimate
confidence levels which are representative.
SUMMARY
Some of the major issues in the use of receptor models are:
(1) Art versus science in the application of RM. Currently there is a
substantial amount of imposed "intelligence" that is introduced into RM
procedures by experienced practitioners. Present RM studies are generally
custom designed, including selection of source signatures, filters and
analysis. In the regulatory context, some of the discretionary elements of
RM might have to be removed to reduce the chances for misuses of the
models. Regulatory protocols must balance the need to ensure robust and
representative results with the interpretative and case-specific nature of
RMs. A middle ground might ensure that monitoring, filter analysis, quality
assurance and control procedures are adequate, and RM results are not
spurious. Also, appropriate measures of uncertainty must be included.
(2) Development of representative source profiles and synthetic data sets.
The variation in source profiles (over time and space) and the deviation
from the RM assumptions can be determined from source or near field
measurements, or perhaps derived using wind trajectory analysis. Synthetic
data sets with realistic error structures and compositions for various
aerosol regimes, e.g., rural western vs. urban eastern, may be used to
construct RM protocols. Such data sets may help identify the resolution and
uncertainties of the various receptor models as well as the necessary study
effort (sampling and analysis). Inter-model comparisons using these data can
be used to select preferred models and diagnostic approaches.
(3) Hybrid receptor/dispersion approaches. Hybrid models require more data
than either approach alone. Such models may be cost-effective if source
inventories and meteorological data are available. If hybrid models permit
significantly improved performance in terms of accuracy and flexibility of
source apportionment, their expense and complexity may be justified in other
circumstances. At present, hybrid models have not been demonstrated and
critically evaluated.
REFERENCES
Air Pollution Control Association (1982): Specialty Conference on Receptor
Models Applied to Contemporary Pollution Problems. Danvers, MA, APCA
Publication. 388 pp.
Cooper, D.W., Receptor-Oriented Source-Receptor Analysis, loc. cit.
DeCesar, R., Evaluation of Multivariate and Chemical Mass Balance Approaches
to Aerosol Source Apportionment using Synthetic Data and an Expanded
Kacs Data set, loc. cit.
Yamartino, R.J., Formulation and Application of a Hybrid Source-Receptor
Model, loc. cit.
American Meteorological Society (1979): Air Quality Modeling and the Clean
Air Act: Recommendations to EPA on Dispersion Modeling for Regulatory
Applications. Boston, MA. 268 pp.
Belsley, D.A., E. Kuh, R.E. Welsch (1980): Regression Diagnostics:
Identifying Influential Data and Sources of Collinearity. John Wiley and
Sons, New York.
Cooper, J.A., J.G. Watson (1980): Receptor Oriented Methods of Air Particulate
Source Apportionment. J. Air Poll. Control Assoc. 30 1116-1125.
Core, J.E. et al. (1982): Particulate Dispersion Model Evaluation: A New
Approach Using Receptor Models. J. Air Poll. Control Assoc. 32 1143-1147.
Energy Information Administration (1980): Thermal Electric Power Plant
Construction and Annual Production Expenses. Washington, D.C.
Environmental Research and Technology, Inc. (1981): The State of the Art of
Receptor Models Relating Ambient Suspended Particulate Matter to
Sources. Report P-A42Z to US Environmental Protection Agency, Research
Triangle Park, NC.
Fox, D.G. (1981): Judging Air Quality Model Performance, Review of the Woods
Hole Workshop. American Meteorological Society, Boston, MA.
Hampel, F.R. (1974): The Influence Curve and its Role in Robust Estimation.
J. Amer. Statist. Assoc. 69 383-394.
Huber, P.J. (1981): Robust Statistics. John Wiley and Sons, New York.
Information Processing Center (1980): TROLL Users Guide. Massachusetts
Institute of Technology, Cambridge, MA.
Montgomery, D.C., E.A. Peck (1982): Introduction to Linear Regression Analysis.
John Wiley and Sons, New York.
National Academy of Sciences (1983): Acid Deposition: Atmospheric Processes
in Eastern North America. National Academy Press, Washington, D.C. 275pp.
Pace, T.G. (1983): Models to Develop Controls Strategies for PM-10, in
Proceedings of the 76th Annual Meeting of the Air Pollution Control
Assn. Atlanta, GA, June 19-24.
PEDCo Environmental (1979): TSP Source Inventory Around Monitoring Sites in
Selected Urban Areas, Philadelphia. Report to US Environmental
Protection Agency, Monitoring and Data Analysis Division, Research
Triangle Park, NC. 92 pp.
Rao, Shankar K. (1983): Plume Concentration Algorithms with Deposition,
Sedimentation and Chemical Transformation. IAG-AD-13-F-1-70/-Q. National
Oceanic and Atmospheric Administration, Oak Ridge, TN. 87 pp.
Sehmel, George (1980): Particle and Gas Dry Deposition: A Review. Atmos.
Environ. 14 983-1011.
Suggs, J.C., R.M. Burton (1983): Spatial Characteristics of Inhalable
Particles in the Philadelphia Metropolitan Area. J. Air Poll. Control
Assoc. 33 686-691.
Texas Air Control Board (1979): Texas Episodic Model, User's Guide
PB80-227572. National Technical Information Service, Springfield, VA.
Thurston, G.D. (1983): A Source Apportionment of Particulate Air Pollution
in Metropolitan Boston. Ph.D. Thesis, Harvard School of Public Health,
Boston, MA.
TECHNICAL REPORT DATA
1. REPORT NO.:
2.
3. RECIPIENT'S ACCESSION NO.:
4. TITLE AND SUBTITLE: Air Quality Models Pertaining to Particulate Matter
5. REPORT DATE:
6. PERFORMING ORGANIZATION CODE:
7. AUTHOR(S): S.A. Batterman, J.A. Fay, D. Golomb, J. Gruhl
8. PERFORMING ORGANIZATION REPORT NO.:
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Energy Laboratory,
Massachusetts Institute of Technology, Cambridge, MA 02139
10. PROGRAM ELEMENT NO.: - 3156 (FY-84)
11. CONTRACT/GRANT NO.: Cooperative Agreement 809229-01
12. SPONSORING AGENCY NAME AND ADDRESS: Environmental Sciences Research
Laboratory - RTP, NC, Office of Research and Development,
U.S. Environmental Protection Agency, Research Triangle Park, NC 27711
13. TYPE OF REPORT AND PERIOD COVERED: Final 04/83 - 10/88
14. SPONSORING AGENCY CODE: EPA/600/09
15. SUPPLEMENTARY NOTES:
16. ABSTRACT
This report describes an evaluation of the Particle Episodic Model (PEM), an
urban scale dispersion model which incorporates deposition, gravitational
settling and linear transformation processes into the predecessor model, the
Texas Episodic Model (TEM-8). A sensitivity analysis of the model was
performed, which included the effects of deposition, gravitational settling
and receptor grid size. Recommendations are made to improve the performance
and flexibility of the model.
PEM was applied to a source inventory of the Philadelphia area to provide a
preliminary estimate of source apportionment. PEM modeling employed both
hypothetical and actual meteorology. Results indicate that area source
emissions dominate TSP, SO2 and sulfate concentrations at urban receptors. A
large fraction of the inhalable particles may arrive from distant sources.
This report also contains an overview of receptor models (RMs) used for the
source apportionment of aerosols. Some diagnostic procedures for RMs are
evaluated using a synthetic data set. Described are RM trade-offs and
protocols and possible hybrid dispersion/receptor models. Issues regarding
the inter-comparison of source apportionments from receptor and dispersion
models are highlighted with reference to the 1982 Philadelphia study.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS:
b. IDENTIFIERS/OPEN ENDED TERMS:
c. COSATI Field/Group:
18. DISTRIBUTION STATEMENT: RELEASE TO PUBLIC
19. SECURITY CLASS (This Report): UNCLASSIFIED
20. SECURITY CLASS (This page): UNCLASSIFIED
21. NO. OF PAGES:
22. PRICE:
EPA Form 2220-1 (Rev. 4-77) PREVIOUS EDITION IS OBSOLETE