United States
Environmental Protection
Agency
Atmospheric Research and
Exposure Assessment Laboratory
Research Triangle Park NC 27711
Research and Development
EPA/600/S3-89/031 Sept. 1989
EPA Project Summary
Materials Aerometric
Database for Use in Developing
Materials Damage Functions
Ruen-Tai Tang, P. Michael Barlow, and Paul Waldruff
Meteorological and air quality data
acquired at field exposure sites have
been accumulated into the Materials
Aerometric Database (MAD). Task
Group VII of the National Acid
Precipitation Assessment Program
(NAPAP) will use the MAD to develop
damage functions for materials ex-
posed at the sites; these functions
then will be used in preparing NAPAP
integrated assessment reports to
Congress. The MAD data cover as
many as six and a half years at five
materials exposure sites in the
eastern United States. Conservative
techniques based on secondary-site
data, regression predictions, and
other information have been applied
to the MAD to enhance the quality
and usability of the database. Both the
original MAD and the enhanced version
have been given to Task Group VII.
This Project Summary was devel-
oped by EPA's Atmospheric Research
and Exposure Assessment Laboratory,
Research Triangle Park, NC, to an-
nounce key findings of the research
project that is fully documented in a
separate report of the same title (see
Project Report ordering information at
back).
Introduction
The EPA's Atmospheric Research and
Exposure Assessment Laboratory
(AREAL) has undertaken the task of
maintaining the Materials Aerometric
Database (MAD), which consists of air
quality and meteorological data from five
test sites to be used in developing dam-
age functions for the National Acid
Precipitation Assessment Program
(NAPAP) Materials Assessment Program.
The research objectives for this project
are as follows:
(1) To accumulate and organize an
aerometric database (MAD) con-
taining air quality data and meteoro-
logical measurements made at five
primary field sites.
  - Develop a uniform format for the
    MAD data.
  - Provide validated tapes of the
    MAD data to the principal investi-
    gators within the Materials and
    Cultural Effects Task Group (Task
    Group VII).
  - Acquire quality assurance/quality
    control (QA/QC) data from the site
    operators.
  - Monitor the independent systems
    and performance audits of the
    sites, conducted by Research Tri-
    angle Institute.
(2) To enhance the database and allow
its use in continuous-damage
models by making reasonable pre-
dictions for missing data points
(infilling).
  - Acquire secondary-source data for
    infilling missing primary-source
    data.
  - Provide data tapes of the en-
    hanced air quality and meteoro-
    logical data to the principal inves-
    tigators within Task Group VII.
Technical Approach
Five materials exposure sites were
chosen for continuously recording air
quality, meteorology, particle loadings
and chemistry, and rain chemistry meas-
urements (Figure 1):
- Adirondack Ecological Center, Newcomb, NY
- Bell Communications Research Center, Chester, NJ
- County Services Building, Steubenville, OH
- West End Library, Washington, DC
- Research Triangle Institute, Research Triangle Park, NC
Figure 1. Locations of the five materials exposure test sites (marked with solid squares).
The following variables were measured
in order to quantitatively evaluate the
deposition of acids on the materials
samples:
Meteorological Variables
Wind speed average (WSA)
Wind direction average (WDA)
Wind direction vector (WDV)
Temperature (TP)
Dew point (DP)
Relative humidity (RH)
Precipitation (PR)
Solar radiation (SR)
Air Quality Variables
Sulfur dioxide (SO2)
Ozone (O3)
Oxides of nitrogen (NOX)
Nitric oxide (NO)
Nitrogen dioxide (NO2)
The data were supplied to the EPA in a
variety of site-dependent formats. The
format used by the Research Triangle
Park site was chosen as the standard
data storage format (Figure 2), and
software was developed to convert all
other sites' data to this format.

Col. no.: 1 through 80
Contents (two example records; fields shown in order, not at their original column positions):
 YR DAY HR SO2 NO2  NO NOX  O3 WDA WSA  TP  DP  RH  PR  SR WDV  WSV
286 043 03  26 15F   3  18  13 320  33 -12 -91 550   0   0 319 9999
286 043 04  24 13    3  16  15 318  33 -17 -95 553   0   0 318 9999
Additional information:
Column 1 contains the site ID (1 = DC, 2 = NC, 3 = NJ, 4 = NY, 5 = OH).
Columns 5-7 contain the Julian date.
Column 21 contains an example information flag (discussed below); all are non-numerical
characters, except the minus sign.
WSV stands for wind speed vector, a variable reported by only the New York site.
Figure 2. Format for storage of all MAD data (based on the format for the RTP, NC, site).
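A record in the standard format could be read along the following lines. This is only a sketch under stated assumptions: it splits on whitespace rather than on the fixed column positions (which are not fully specified here), it treats the two-digit year as 19xx, and it leaves values as raw integers because the summary does not state their scaling. The names parse_record, SITE_IDS, FIELD_NAMES, and VALUE_WITH_FLAG are hypothetical and do not come from the project software.

```python
import re

# Site codes from the notes under Figure 2.
SITE_IDS = {1: "DC", 2: "NC", 3: "NJ", 4: "NY", 5: "OH"}
# Field order from the label row of Figure 2, after YR, DAY, and HR.
FIELD_NAMES = ["SO2", "NO2", "NO", "NOX", "O3", "WDA", "WSA",
               "TP", "DP", "RH", "PR", "SR", "WDV", "WSV"]
# A value is an optionally signed integer, optionally followed by one
# non-numerical information flag character.
VALUE_WITH_FLAG = re.compile(r"^(-?\d+)(\D?)$")


def parse_record(line):
    """Split one record into site, date, and (value, flag) pairs.

    Assumption: fields are separated by whitespace. The real tapes use
    fixed columns, so this would need adapting for production use.
    """
    tokens = line.split()
    if len(tokens) != 3 + len(FIELD_NAMES):
        raise ValueError("unexpected number of fields")
    head, day, hour = tokens[:3]
    record = {
        "site": SITE_IDS[int(head[0])],   # column 1 is the site ID
        "year": 1900 + int(head[1:3]),    # assumption: 19xx two-digit year
        "day": int(day),                  # Julian date
        "hour": int(hour),
    }
    for name, token in zip(FIELD_NAMES, tokens[3:]):
        match = VALUE_WITH_FLAG.match(token)
        if match is None:
            raise ValueError(f"cannot parse {name} field: {token!r}")
        # An empty flag string corresponds to original, untouched data.
        record[name] = (int(match.group(1)), match.group(2))
    return record


example = parse_record(
    "286 043 03 26 15F 3 18 13 320 33 -12 -91 550 0 0 319 9999")
```

For the first example record in Figure 2, the parsed NO and NO2 values (3 and 15) sum to the NOX value (18), which is consistent with the NOX balance applied later in the processing.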
During the project, some of the data
have been lost due to equipment
shortages and failures, power outages,
etc. A number of secondary sources
were found to replace the missing data.
These secondary sites were located near
the primary sites and had similar
micrometeorology; they are listed in
Table 1 with the types of variables
recorded at each site.
Each site performed ongoing QA on its
own systems and data, based on the for-
mat outlined in 40 CFR 58, Appendix A.
Also, an independent audit team from the
Research Triangle Institute (RTI) has
conducted annual or semiannual perfor-
mance and system audits at the sites
since 1985.
As the raw data were received, we
performed preliminary statistical analysis
and quality assurance checks, identified
data problems, and contacted the site
operators about them. Wherever possi-
ble, these problems were corrected and
problem data were replaced by the site.
Using secondary-source data (if avail-
able) to infill primary-source data is the
most reliable method of infilling, as long
as the correlation between them is good
and any bias can be removed. We used
secondary-source substitution when-
ever possible, before using alternative
forms of infilling. For missing meteoro-
logical data, we used only secondary-
source substitution, because infilling this
type of data using predictions from any
form of calculations could corrupt the
database. For missing air quality data,
however, we employed three predictive
infilling methods when no acceptable
secondary-source data were available:
linear interpolation using good data on
either side of the gap; regression using a
long-term least-squares prediction;
and daytime or nighttime averages each
compiled from a month's worth of data.
We performed a preliminary survey of
the NC data to evaluate the occurrences
of missing data. For a given year and
variable, we recorded the gap length for
each instance of missing data and then
developed histograms of this information.
In most cases, the data displayed a peak
at one hour and then dropped off quickly
after two or three hours, as shown in the
example in Table 2.
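The gap-length survey described above can be sketched as follows. The bin edges follow the rows of Table 2; everything else is an assumption made for illustration, including the function names and the use of None to mark a missing hourly value.

```python
from collections import Counter

# Gap-length bins (in hours) matching the rows of Table 2; longer gaps
# fall into the ">24" bin.
BINS = [(1, 1, "1"), (2, 2, "2"), (3, 3, "3"),
        (4, 6, "4-6"), (7, 12, "7-12"), (13, 24, "13-24")]


def gap_lengths(series):
    """Return the length of every run of consecutive missing values.

    Assumption: a missing hourly value is represented by None.
    """
    lengths, run = [], 0
    for value in series:
        if value is None:
            run += 1
        elif run:
            lengths.append(run)
            run = 0
    if run:  # series ends inside a gap
        lengths.append(run)
    return lengths


def gap_histogram(series):
    """Count gaps per gap-length bin for one variable's hourly series."""
    histogram = Counter()
    for length in gap_lengths(series):
        for low, high, label in BINS:
            if low <= length <= high:
                histogram[label] += 1
                break
        else:
            histogram[">24"] += 1
    return dict(histogram)
```

Running gap_histogram on one year of hourly values for one variable gives counts comparable to a single column of Table 2.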
We decided to take the most conserv-
ative approach to infilling the data. For
one-hour and two-hour gaps, we used
interpolation. After these were infilled, the
remaining gaps of three or more hours
were infilled using a regression predic-
tion, if the R^2 for the regression equation
was greater than 0.50. We then infilled
any gaps still remaining with the ap-
propriate daytime or nighttime monthly
average. Listed below is a summary of
the steps we followed in processing each
subset of the database to produce the
final enhanced database:
Step 1. For both meteorological
and air quality data:
Replace all missing data, data below
detectable limits, or data above reason-
able limits with acceptable secondary-
source data, if the correlation between
primary- and secondary-source data is
high (r^2 > 0.95). Use the following
mathematical replacement:
x_prim(t) = x_sec(t) + (the differ-
ence in their yearly
averages)
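A minimal sketch of the Step 1 substitution follows. Several details are assumptions rather than statements from the report: missing or rejected values are represented by None, the yearly averages and r^2 are computed over the hours where both sites have data, and the offset is taken as the primary average minus the secondary average so that the substituted values share the primary site's mean. The function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def r_squared(x, y):
    """Squared Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no usable correlation
    return sxy * sxy / (sxx * syy)


def substitute_secondary(primary, secondary, r2_min=0.95):
    """Fill gaps in one year of primary data from a secondary site.

    Returns the filled series and the indices that were replaced.
    Assumption: values already rejected as below detectable limits or
    above reasonable limits have been set to None by the caller.
    """
    pairs = [(p, s) for p, s in zip(primary, secondary)
             if p is not None and s is not None]
    prim = [p for p, _ in pairs]
    sec = [s for _, s in pairs]
    if len(pairs) < 2 or r_squared(prim, sec) <= r2_min:
        return list(primary), []  # correlation too weak; leave gaps
    # Assumed sign: primary average minus secondary average.
    offset = mean(prim) - mean(sec)
    filled, replaced = list(primary), []
    for i, (p, s) in enumerate(zip(primary, secondary)):
        if p is None and s is not None:
            filled[i] = s + offset
            replaced.append(i)
    return filled, replaced
```

Returning the replaced indices makes it straightforward to set the information flag for each substituted data point.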
Step 2. For air quality data only:
For one-hour or two-hour gaps re-
maining after Step 1, infill with a
smoothed value interpolated from the
points before and after the gap. For gaps
three or more hours long, use regression
to replace the data if R^2 for the regres-
sion equation is greater than 0.50; other-
wise, replace missing data with daytime
or nighttime monthly averages (where
daytime includes hourly data from 7 A.M.
to 6 P.M. and nighttime includes data from
7 P.M. to 6 A.M.).
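The Step 2 decision sequence might look like the sketch below. It assumes one month of hourly values with None marking gaps, clock hours numbered 0 to 23, and a regression prediction series with its R^2 supplied by the caller, since the predictors used in the actual regression are not described in this summary. The flag letters "I", "R", and "A" and all function names are hypothetical.

```python
def is_daytime(hour):
    # Daytime is 7 A.M. to 6 P.M.; hours are assumed to run 0-23.
    return 7 <= hour <= 18


def find_gaps(values):
    """Return (start, end) index pairs for runs of None; end is exclusive."""
    gaps, start = [], None
    for i, value in enumerate(values):
        if value is None and start is None:
            start = i
        elif value is not None and start is not None:
            gaps.append((start, i))
            start = None
    if start is not None:
        gaps.append((start, len(values)))
    return gaps


def infill_air_quality(values, hours, regression=None, r2=0.0):
    """Infill one month of hourly air quality data for one variable."""
    out, flags = list(values), [""] * len(values)
    day = [v for v, h in zip(values, hours)
           if v is not None and is_daytime(h)]
    night = [v for v, h in zip(values, hours)
             if v is not None and not is_daytime(h)]
    day_avg = sum(day) / len(day) if day else None
    night_avg = sum(night) / len(night) if night else None
    for start, end in find_gaps(values):
        n = end - start
        if n <= 2 and start > 0 and end < len(values):
            # One- or two-hour gap: interpolate linearly between the
            # good points on either side.
            left, right = values[start - 1], values[end]
            for k in range(n):
                out[start + k] = left + (right - left) * (k + 1) / (n + 1)
                flags[start + k] = "I"
        elif regression is not None and r2 > 0.50:
            # Longer gap (or a short gap at the series edge, which is an
            # assumption of this sketch): use the regression prediction.
            for i in range(start, end):
                out[i], flags[i] = regression[i], "R"
        else:
            # Fall back to the daytime or nighttime monthly average.
            for i in range(start, end):
                out[i] = day_avg if is_daytime(hours[i]) else night_avg
                flags[i] = "A"
    return out, flags
```

The averages are computed from the measured values only, so earlier infilled points do not feed back into later ones.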
Step 3. Apply the following
special corrections as needed:
- Set solar radiation to zero at night, if
  not already zero.
- For air quality data only, smooth in-
  filled data into measured data to
  avoid abrupt slope discontinuities.
  Use a five-time-step smoothing
  scheme for infilling done with mul-
  tiple regressions, secondary-source
  data, and monthly averages; do not
  use for infilling done with one- and
  two-hour interpolations.
- Maintain the NOX, NO, and NO2
  balance using:
  Conc(NOX) = Conc(NO) + Conc(NO2)
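Two of the Step 3 corrections can be illustrated as follows. The summary does not define the five-time-step smoothing scheme, so the centered five-point moving average below is a stand-in chosen for illustration, not the authors' method. Recomputing NOX from NO and NO2 is likewise only one possible way to maintain the balance. Both function names are hypothetical.

```python
def smooth_infilled(values, infilled, window=5):
    """Smooth only the infilled points; measured points are unchanged.

    Assumption: a centered moving average over `window` time steps,
    truncated at the ends of the series. `infilled` is a list of
    booleans that should be False for measured data and for one- and
    two-hour interpolations, which are not smoothed.
    """
    half = window // 2
    out = list(values)
    for i, was_infilled in enumerate(infilled):
        if not was_infilled:
            continue
        low, high = max(0, i - half), min(len(values), i + half + 1)
        # Average the unsmoothed neighbors so results do not depend on
        # the order in which points are processed.
        out[i] = sum(values[low:high]) / (high - low)
    return out


def rebalance_nox(no, no2):
    """Return NOX such that Conc(NOX) = Conc(NO) + Conc(NO2)."""
    return [a + b for a, b in zip(no, no2)]
```

Applied to the NO and NO2 values of the two example records in Figure 2, rebalance_nox reproduces the recorded NOX values of 18 and 16.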
An information flag accompanies every
data point in the MAD. The flag is a blank
for original, untouched data. If the data
were modified or infilled, this flag is set to
a code describing the infilling method.
Results
Detailed descriptions of the raw data
statistics and the site operations for all
sites are presented in an April 1988 EPA
internal report, Monitoring and
Operations at Materials Effects Sites
(R.T. Tang, P.M. Barlow, and J.W.
Spence). Table 3 presents raw-data
summary statistics for one site (RTP,
NC).

Table 1. Secondary MAD Data Sources
Primary Site                 Corresponding Secondary Site(s)                                        Variables Measured
Newcomb, NY                  State University of New York site 1 km away from primary site         Meteorology
Chester, NJ                  Bell Core Lab, Chester, NJ                                             Meteorology
Steubenville, OH             Harvard School of Public Health Study site in Steubenville, OH        Meteorology, air quality
                             NOVAA, Mingo Junction, OH                                              Meteorology
Washington, DC               Washington National Airport                                            Meteorology
Research Triangle Park, NC   Raleigh-Durham Airport                                                 Meteorology
                             USEPA, Research Triangle Park, NC                                      Meteorology, air quality

Table 2. Missing-Data Gap-Length Frequencies for 1984 RTP, NC Air Quality Data
                 Number of Gaps in a Variable's Data
Gap Length (h)     O3    SO2     NO    NOX    NO2
1                  31     19     12     16     13
2                  10     15      5      4      5
3                   2      6      3      3      4
4-6                 4     15      4      4      4
7-12                5      6      1      1      0
13-24               4      4      1      1      2
>24                 6      8      2      2      2

We processed the raw meteorological
and air quality data from each site using
the steps discussed above, and then per-
formed statistical analyses on the en-
hanced MAD; the procedures and results
are given in Enhancement of Materials
Aerometric Database (R. T. Tang, P. M.
Barlow, and J. W. Spence), a July 1988
EPA draft internal report. Table 4 pre-
sents summary statistics for the infilled
data for the RTP, NC, site.
Discussion
The MAD is now available for use in
predicting damage functions. The raw
and enhanced MAD data tapes contain
the data for all sites over part or all of the
1982-1988 period. The missing air qual-
ity data have been infilled using the algo-
rithm discussed above. However, it was
not possible to find secondary sources
for all of the sites, so there are gaps in
the meteorological data. The use of
modeled meteorological data in the
development of the damage functions
could seriously bias the predicted data
values and the statistics that describe
them.
Also, a bias has already been found in
the data from two sites. For data values
below the minimum detectable limit
(MDL), the DC site has been reporting
the MDL and the NJ site has reported
one-half the MDL. This is acceptable for
some EPA uses, but we are currently
trying to acquire the unmodified data.
Conclusions
We have developed an enhanced
database that will provide materials
assessment investigators with a com-
prehensive data set of meteorological
and air quality data collected during
materials test exposures. Two tapes, one
containing the raw data rewritten in a
uniform format and the second containing
the enhanced database, have been pro-
vided to the principal investigators within
Task Group VII. The MAD provides
essential data for developing damage
functions to be used in estimating current
materials damage due to acid rain,
predicting future damage, and aiding in
the development of control scenarios.
NAPAP will use this information to de-
velop reports to Congress.

Table 3. Summary Statistics for the Raw MAD for the RTP, NC, Site
Year  Statistic         O3     SO2      NO     NOX     NO2     WSA    TEMP   DEWPT      RH      PR      SR
1982* Mean (ppm)     0.023   0.001   0.009   0.020   0.011    1.25   16.74   11.04   71.22    0.12   13.12
      Std. Dev.      0.019   0.003   0.022   0.024   0.007    1.11    7.96    8.50   17.19    0.97   19.41
      Min.           0.000   0.000   0.000   0.000   0.000    0.00    0.00  -13.50   24.00     0.0     0.0
      Max.           0.091   0.021   0.225   0.250   0.046    6.80   31.80    2.60   99.40    33.0   76.20
      % Missing        3.3     3.2     1.1     2.2     2.3    31.0     0.4     6.5     6.8     0.0     4.4
1983  Mean (ppm)     0.029   0.003   0.008   0.021   0.013    1.63   14.64    6.86   65.07    0.13   14.03
      Std. Dev.      0.023   0.004   0.019   0.023   0.009    1.30   10.07    9.43   20.22    0.85   21.03
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -16.00  -26.10   16.20     0.0     0.0
      Max.           0.132   0.048   0.295   0.312   0.073   10.00   38.70   23.00   98.60    22.0    78.0
      % Missing        6.5     4.6     3.3     3.3     3.5     0.1     0.9    16.4    17.3     0.0     2.4
1984  Mean (ppm)     0.025   0.004   0.010   0.022   0.012    1.65   15.16    7.79   65.96    0.14   11.38
      Std. Dev.      0.021   0.004   0.024   0.029   0.010    1.20    9.31    9.45   78.85    1.01   17.57
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -12.70  -17.50   13.40     0.0     0.0
      Max.           0.118   0.040   0.351   0.372   0.065    8.10   35.00   23.10  100.00    32.1    76.0
      % Missing        1.2    42.5     2.8     3.8     3.5     0.1     0.1    10.5    10.5     2.5     0.1
1985  Mean (ppm)     0.025   0.002   0.008   0.022   0.013    1.48   15.29    8.02   61.00    0.12   13.67
      Std. Dev.      0.020   0.002   0.018   0.026   0.010    1.20    9.64   10.58   19.46    0.98   20.27
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -21.10  -33.50   11.90     0.0     0.0
      Max.           0.119   0.026   0.268   0.307   0.108   10.00   34.80   22.70   98.40    36.1    78.0
      % Missing        3.0    71.2    24.5    16.4    31.2     0.0     0.8    35.3    35.3     1.1     1.6
1986  Mean (ppm)     0.025   0.002   0.010   0.023   0.015    1.44   15.53    5.18   57.05    0.11   13.34
      Std. Dev.      0.022   0.005   0.022   0.025   0.009    1.11    9.77   10.70   20.46    1.15   20.12
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -12.70  -26.30   10.60     0.0     0.0
      Max.           0.123   0.080   0.410   0.436   0.169    8.10   37.70   22.80   95.30    41.3    76.0
      % Missing       10.3    42.5    42.4    16.4    49.1     0.3     0.3    46.3    46.3     0.0     0.3
1987  Mean (ppm)     0.027   0.003   0.011   0.027   0.014    1.40   14.91    7.98   66.01    0.12   13.27
      Std. Dev.      0.022   0.004   0.028   0.034   0.009    1.23    9.88    9.88   20.06    1.11   19.88
      Min.           0.000   0.000   0.000   0.000   0.000    0.00   -8.90  -16.20   10.70     0.0     0.0
      Max.           0.112   0.054   0.375   0.391   0.059    8.20   37.70   24.40  100.00    64.2    80.0
      % Missing       10.2    13.3    41.6    31.5    41.3     0.8     0.1     1.5     1.2     0.1     1.7
* July through December data only.

Table 4. Summary Statistics for the Enhanced MAD for the RTP, NC, Site
Year  Statistic         O3     SO2      NO     NOX     NO2     WSA    TEMP   DEWPT      RH      PR      SR
1982* Mean (ppm)     0.023   0.001   0.009   0.020   0.011    1.28   16.77   12.39   77.57    0.12   12.72
      Std. Dev.      0.019   0.003   0.022   0.024   0.007    0.98    7.97    8.37   18.45    0.97   19.18
      Min.           0.000   0.000   0.000   0.000   0.000    0.00    0.00  -12.20   24.60    0.00    0.00
      Max.           0.091   0.021   0.225   0.250   0.091    6.80   31.80   23.90  100.00    33.0   76.20
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.6
1983  Mean (ppm)     0.029   0.003   0.008   0.021   0.013    1.63   14.50    8.21   69.12    0.13   13.72
      Std. Dev.      0.023   0.004   0.019   0.023   0.008    1.30   10.12    9.84   20.36    0.85   20.89
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -16.00  -27.20   20.70    0.00    0.00
      Max.           0.143   0.048   0.295   0.312   0.073   10.00   38.70   24.40  100.00    22.0    78.0
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
1984  Mean (ppm)     0.025   0.004   0.010   0.022   0.012    1.65   15.17    8.87   68.65    0.14   11.36
      Std. Dev.      0.021   0.003   0.024   0.029   0.010    1.20    9.31    9.87   19.34    1.01   17.56
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -12.70  -17.20   19.30    0.00    0.00
      Max.           0.118   0.040   0.351   0.372   0.119    8.10   35.00   24.40  100.00   32.10   76.00
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     2.5     0.0
1985  Mean (ppm)     0.025   0.003   0.007   0.022   0.014    1.48   15.18    7.63   63.84    0.12   13.46
      Std. Dev.      0.020   0.002   0.016   0.024   0.014    1.20    9.69   10.54   20.36    0.98   20.18
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -21.10  -33.30    13.0    0.00    0.00
      Max.           0.119   0.026   0.268   0.307   0.268   10.00   34.80   22.80  100.00   36.10   78.00
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     1.1     0.0
1986  Mean (ppm)     0.025   0.002   0.008   0.022   0.015    1.44   15.51    9.31   70.28    0.11   13.31
      Std. Dev.      0.021   0.004   0.017   0.023   0.013    1.11    9.76   10.18   21.89    1.15   20.10
      Min.           0.000   0.000   0.000   0.000   0.000    0.00  -12.70  -22.80   15.70    0.00    0.00
      Max.           0.123   0.080   0.410   0.436   0.278    8.10   37.70   26.10  100.00   41.30   76.00
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0
1987  Mean (ppm)     0.026   0.003   0.010   0.026   0.016    1.40   14.91    8.63   69.63    0.12   13.27
      Std. Dev.      0.021   0.004   0.022   0.029   0.014    1.23    9.88   10.35   21.80    1.11   19.88
      Min.           0.000   0.000   0.000   0.000   0.000    0.00   -8.90  -20.60   10.40    0.00    0.00
      Max.           0.112   0.054   0.375   0.391   0.276    8.20   37.70   25.00  100.00   64.20   80.00
      % Missing        0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.1     1.7
* July through December data only.

Ruen-Tai Tang, P. Michael Barlow, and Paul Waldruff are with Computer
Sciences Corporation, Research Triangle Park, NC 27709
F. H. Haynie is the EPA Project Officer (see below).
The complete report, entitled "Materials Aerometric Database for Use in
Developing Materials Damage Functions," (Order No. PB 89-181 259/AS;
Cost: $13.95, subject to change) will be available only from:
National Technical Information Service
5285 Port Royal Road
Springfield, VA 22161
Telephone: 703-487-4650
The EPA Project Officer can be contacted at:
Atmospheric Research and Exposure Assessment Laboratory
U.S. Environmental Protection Agency
Research Triangle Park, NC 27711
United States
Environmental Protection
Agency
Center for Environmental Research
Information
Cincinnati OH 45268