Tennessee Valley Authority
Office of Natural Resources
Chattanooga TN 37401
TVA/ONR-79/03

United States Environmental Protection Agency
Office of Energy, Minerals, and Industry
Washington DC 20460
EPA-600/7-79-084
March 1979

Research and Development

The Analysis of Suspended Particulates and Sulfates: A Way to Begin

Interagency Energy/Environment R&D Program Report

RESEARCH REPORTING SERIES

Research reports of the Office of Research and Development, U.S. Environmental Protection Agency, have been grouped into nine series. These nine broad categories were established to facilitate further development and application of environmental technology. Elimination of traditional grouping was consciously planned to foster technology transfer and a maximum interface in related fields. The nine series are:

1. Environmental Health Effects Research
2. Environmental Protection Technology
3. Ecological Research
4. Environmental Monitoring
5. Socioeconomic Environmental Studies
6. Scientific and Technical Assessment Reports (STAR)
7. Interagency Energy-Environment Research and Development
8. "Special" Reports
9. Miscellaneous Reports

This report has been assigned to the INTERAGENCY ENERGY-ENVIRONMENT RESEARCH AND DEVELOPMENT series. Reports in this series result from the effort funded under the 17-agency Federal Energy/Environment Research and Development Program. These studies relate to EPA's mission to protect the public health and welfare from adverse effects of pollutants associated with energy systems. The goal of the Program is to assure the rapid development of domestic energy supplies in an environmentally-compatible manner by providing the necessary environmental data and control technology.
Investigations include analyses of the transport of energy-related pollutants and their health and ecological effects; assessments of, and development of, control technologies for energy systems; and integrated assessments of a wide range of energy-related environmental issues.

This document is available to the public through the National Technical Information Service, Springfield, Virginia 22161.

EPA-600/7-79-084
TVA/ONR-79/03

THE ANALYSIS OF SUSPENDED PARTICULATES AND SULFATES: A WAY TO BEGIN

by
Walter Liggett and William Parkhurst
Office of Natural Resources
Tennessee Valley Authority
Chattanooga, Tennessee 37401

Interagency Agreement No. EPA-IAG-D5-E721
Project No. 80 BDM
Program Element No. INE-625B

Project Officer
James T. Stemmle
401 M Street RD-681
Washington, D.C. 20460

Prepared for
OFFICE OF ENERGY, MINERALS, AND INDUSTRY
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
WASHINGTON, D.C. 20460

DISCLAIMER

This report was prepared by the Tennessee Valley Authority and has been reviewed by the Office of Energy, Minerals, and Industry, U.S. Environmental Protection Agency, and approved for publication. Approval does not signify that the contents necessarily reflect the views and policies of the Tennessee Valley Authority or the U.S. Environmental Protection Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.

ABSTRACT

Total suspended particulate (TSP) and suspended sulfate (SS) levels have been sampled since November 1973 at five isolated sites across the Tennessee Valley. A method for beginning to analyze such data is demonstrated. This beginning is intended to lead finally to information on pollution sources, an objective that may require modeling meteorological influences and resolving sources. Analysis with this objective, which can be very complex, is effectively begun by using the method demonstrated in this paper.
Applied to the TSP and SS data, this method suggests agricultural contributions to TSP levels, distant-source contributions to SS levels, and various influences of the meteorology. This method also shows deficiencies in the data collection that prevent the building of better, more quantitative models. One deficiency in this data set is the sixth-day sampling, which is not frequent enough to allow monthly variations in pollution levels to be distinguished from more rapid variations. Thus, data analysis would be more effective if the sampling frequency were increased and, further, if particle size and chemical composition were better resolved.

This report was submitted by the Tennessee Valley Authority, Office of Natural Resources, in partial fulfillment of Energy Accomplishment Plan 80 BDM under terms of Interagency Agreement EPA-IAG-D5-E721 with the Environmental Protection Agency. Work was completed as of January 12, 1979.

CONTENTS

Abstract
List of Figures
List of Tables
1. Introduction
2. Conclusions and Recommendations
3. The Method
   Overview of the method
   The TSP and SS components
   The algorithm
4. Interpretation of the Data
   Seasonal component
   Valley-wide and local smooths
   Valley-wide and individual roughs
5. Design of Monitoring
References

LIST OF FIGURES

1. Seasonal patterns for TSP and SS
2. Smoothed levels of total suspended particulates
3. Smoothed levels of suspended sulfates
4. Data decomposition showing flow of calculations
5. Daily sulfate data with sixth-day sampling smoothed

LIST OF TABLES

1. Robust Correlations and Number of Nonmissing Observations for the Roughs

SECTION 1
INTRODUCTION

Since November 1973, the Tennessee Valley Authority (TVA) has operated high-volume samplers at five sites to obtain background concentrations and trends for total suspended particulates (TSP) and water-soluble suspended sulfates (SS).
These sites, which are intended to represent large subregions of the Tennessee Valley, are remote from power plants and other large sources of industrial pollution. From east to west, these sites are in Washington County, Virginia (at Loves Mill); Monroe County, Tennessee (at Loudon); Jackson County, Alabama (at Hytop); Giles County, Tennessee; and Trigg County, Kentucky [at Land Between The Lakes (LBL)]. Samples have been collected for a 24-h period every sixth day and analyzed by standard methods [1,4].

This paper demonstrates a method that helps investigators explain data like these. The explanations answer questions such as how much each source contributed to the observed levels, an important question in the application of the 1977 Clean Air Act Amendments. The method may suggest explanations with clear implications. However, the method may be only the first step in developing a more complete model of what influences the measurements. In this case, the method is intended to show the potential benefits of a more complete model and the requirements for its development, so that the considerable expense and expertise possibly needed can be justified and planned. Important benefits may not be available from a particular data set because meteorological influences or something else cannot be adequately modeled. The method is intended to indicate such a possibility.

Models that explain air quality measurements are needed for many purposes, for example, to obtain information relevant to control strategies or to interpret the trends that monitoring is meant to detect [6]. Such models involve several factors, including the sources of pollution and the transport and transformation of pollutants. Such models are needed because they differentiate among these factors. Thus, they allow the effects of control strategies and other changes to be predicted and the causes of a trend to be understood.
The method demonstrated in this paper decomposes the data into components that represent data variations of different temporal and spatial extents [7,8]. A guide to the method is given by the equation,

    log-transformed data = seasonal component + Valley-wide smooth + local smooth + individual rough.

This equation shows that log transforms of the data rather than the original data are decomposed into components. One component represents the seasonal (i.e., annually recurring) variation for all sites. Smooths, which show variations that persist in time, are computed for all sites (Valley-wide) and for the variations unique to each site (local). The individual roughs are the irregular variations not accounted for by the other components. A Valley-wide rough has been computed, but not incorporated in the decomposition.

The method is useful because the data are easier to interpret component by component than all at once. The data are determined by many factors. The influence of these factors on each component is easier to understand than their influence on the undivided data. For example, the influence of seasonal factors can be seen in the seasonal component, but generally not in the other components. Thus, the method is much more revealing, yet no more complicated, than the histograms often used to summarize air quality data.

The TSP and SS data from remote sites on which the method is demonstrated are interesting because of questions about pollutant origins. These origins are both distant sources and local, nonindustrial sources such as agriculture. The questions involve the methods and benefits of controlling such sources and the interference of such sources with the monitoring of a specific industrial source.

Decomposing these data into components reveals several features observed in other regions. One feature is the patterns shown by the seasonal component.
Nationwide, seasonal patterns in TSP are not consistent, indicating the importance of local sources, which differ for urban and rural monitoring [9]. In the east, the seasonal patterns in SS have a single peak in the summer [9]. Another feature is the relations among the series observed at different sites. For SS data, similarities in time behavior at widely separated sites have been observed in New York State [10-12]. Some intersite differences observed in the data analyzed here (unusually high SS levels at LBL) have been explained by Reisinger and Crawford [13].

To describe the method, we discuss the components it produces from the TSP and SS data before we specify the computational details. This discussion, which is in Section 3, is thus more data-oriented than the usual description of a method. In Sections 4 and 5, we interpret the data presented in Section 3. Section 4 discusses the physical mechanisms responsible for the observations. Section 5 discusses consequences for the design of monitoring.

SECTION 2
CONCLUSIONS AND RECOMMENDATIONS

Many factors other than emission levels influence air quality monitoring data; most obviously, the weather influences transport. Further, many sources other than those usually controlled contribute to pollution levels. We recommend that, when possible, the influences of these factors be modeled rather than treated as random.

The influences on pollution levels most obvious in monitoring data are often of little interest in decision making. When this is true, we recommend that the data collection and analysis needed to adjust for these influences be undertaken. For example, adjustment of the data for meteorological influences should allow emission trends to be detected more easily.

The effort finally needed to model the influences on the data might require expertise and data collection, which make monitoring much more expensive.
We recommend that the importance of the information to be obtained determine the degree of monitoring to be done.

The analysis method demonstrated helps guide future data collection and analysis by showing what is needed to meet objectives. We recommend that all monitoring data be subjected to such preliminary analysis.

SECTION 3
THE METHOD

OVERVIEW OF THE METHOD

Both the TSP and the SS data are composed of time series from each site. These series each have 258 observations that have been transformed by

$$y = \log_{10}(x + 1), \tag{1}$$

where

    x = an original observation, µg/m³,
    y = the corresponding log-transformed observation, µg/m³.

Data transformations are discussed by Tukey [7].

The first part of the decomposition computes a smooth trace through each series. This trace follows the slowly changing variations in the data, the variations that persist from sample to sample. It is not affected by the irregular sample-to-sample changes. It represents the data variations that monthly averages are intended to portray. Subtracting the smooth trace from the series that generated it gives a component that represents irregular sample-to-sample changes in the data. Thus, each series is decomposed into two components, a smooth trace and an irregular component called an individual rough. The smooth trace represents data fluctuations caused, for example, by seasonal changes in the weather, and the rough represents fluctuations caused, for example, by frontal passages.

The smooth traces are computed by the use of running medians. Consider, for example, a running median that spans five observations. It is computed by finding the middle value (the third largest value) of every group of five successive values of a series. An alternative, a running month-long average, is computed by finding the average of every group of five successive values. A running median is less sensitive to isolated values that are very large or very small.
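The contrast between the two running summaries can be illustrated with a minimal sketch. It assumes Python, uses made-up concentrations rather than TVA data, and the names `log_transform` and `running` are hypothetical helpers introduced only for this illustration. It applies Equation (1) and then computes a running median and a running average, each spanning five observations.

```python
import math
from statistics import mean, median

def log_transform(x):
    """Equation (1): y = log10(x + 1)."""
    return math.log10(x + 1.0)

def running(values, span, summary):
    """Apply `summary` to every group of `span` successive values."""
    return [summary(values[i:i + span]) for i in range(len(values) - span + 1)]

# Hypothetical concentrations (ug/m3) with one isolated spike; not TVA data.
x = [30.0, 32.0, 31.0, 700.0, 29.0, 33.0, 30.0]
y = [log_transform(v) for v in x]

med5 = running(y, 5, median)  # middle value of each group of five
avg5 = running(y, 5, mean)    # average of each group of five

for m, a in zip(med5, avg5):
    print(f"running median {m:.3f}   running average {a:.3f}")
```

In this toy series every group of five contains the spike, so each running average is pulled upward, whereas each running median stays at the level of the ordinary values.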
Thus, running medians give a smooth trace that is less influenced by such values and, consequently, a rough that better represents such values. The actual algorithm for computing the smooth traces, which is described below, involves repeated computation of running medians, a method for obtaining the smooth trace at the ends of the series, and an approach to missing values.

The second part of the decomposition extracts the Valley-wide component from the smooth traces for each site. Our choice for the Valley-wide component is the sample-by-sample average of the five smooth traces. This choice was made despite one- and two-day differences in sampling day that occur before May 1976 because smooth traces rather than the original data are averaged. Subtracting the Valley-wide component from the smooth traces gives the local smooths.

The third part of the decomposition extracts the seasonal component from the Valley-wide component. The seasonal component is computed without the first and last seven values of the Valley-wide component so that exactly four years of data are used. Since each year has sixty-one values, the seasonal component has sixty-one values. Each of these values is the midmean of the corresponding four yearly values. (The midmean of four values is the average of the second and third largest.) Subtracting the seasonal component from the Valley-wide component gives the Valley-wide smooth. The Valley-wide smooth shows unusual years more clearly because the midmean instead of the average is used to compute the seasonal component.

THE TSP AND SS COMPONENTS

Further understanding of the method can be gained by considering the components produced from the TSP and SS data. However, before presenting these components, we present annual and 24-h summaries of these data to help the reader relate them to other data.

For the calendar years 1974 through 1977, the annual geometric means of the TSP for these sites ranged from 28 to 43 µg/m³.
These TSP levels are well below the primary and secondary National Ambient Air Quality Standards of 75 and 60 µg/m³, respectively. The 24-h TSP levels found in these data also do not exceed the primary and secondary standards of 260 and 150 µg/m³, respectively. However, the 24-h levels for February 24, 1977, a day during a severe dust storm, are recorded as lost records. These levels are actually 88, 767, 699, 654, and 138 µg/m³ for Loves Mill, Loudon, Hytop, Giles County, and LBL, respectively, as shown by TVA laboratory files. This dust storm caused 24-h levels to exceed standards throughout the Southeast [14].

For the same periods and sites, the annual arithmetic means of the SS ranged from 5.9 to 10.0 µg/m³. These levels are within the range expected in rural areas east of the Mississippi River [15]. Some states have standards for SS, and the EPA is considering national standards. Suggestions for the annual standard [16] lie between 5 and 15 µg/m³, and suggestions for the 24-h standard [16] lie between 10 and 25 µg/m³. Fourteen instances of 24-h levels above 25 µg/m³ are contained in the data from these sites.

Consider the components discussed above, starting with the seasonal component. The TSP and SS seasonal components are the most pronounced feature of the data. They are shown in Figure 1 after retransformation to compensate for the log transform. They are plotted on a horizontal axis that starts on the first day of winter and is divided seasonally. The estimate of the TSP pattern has peaks in mid-April and mid-July that reach 54 µg/m³. It has levels as low as 22 µg/m³. The estimate of the SS pattern has a peak in mid-July that reaches 13.0 µg/m³. It has levels as low as 3.7 µg/m³. The April and July TSP peaks invite comparison because the SS is a much larger fraction of the July TSP peak than of the April TSP peak.

[Figure 1. Seasonal patterns for TSP and SS. Two panels, TOTAL SUSPENDED PARTICULATES and SUSPENDED SULFATES, each showing concentration against season (winter, spring, summer, fall).]

The Valley-wide and local smooths in Figure 2 show any annual trends and persistent local conditions contained in the TSP data. The Valley-wide smooth shows that 1974 and 1977 are worse than 1975 and 1976, but it does not seem to provide convincing evidence of an increasing trend. Further, the Valley-wide smooth shows peaks in fall 1974 and in 1977 that invite explanation. The local smooths show that Hytop and Loudon have generally higher levels than the other sites. They also show some interesting peaks.

The corresponding smooths for SS are shown in Figure 3. The Valley-wide smooth seems to show a decreasing trend. As part of this trend, the Valley-wide smooth shows that the winter, spring, and summer of 1975 had unusually high levels. The local smooths show that Hytop has generally higher levels than the other sites and that, except for 1974, Loudon has higher winter levels. Like the TSP smooths, these smooths have many peaks that suggest further investigation.

The roughs are better summarized by the correlations shown in Table 1 than depicted by graphs because the roughs appear nearly random. This table requires four explanations. First, before May 17, 1976, Loves Mill was sampled one day and Loudon was sampled two days before the other three sites. Starting May 17, 1976, all sites were sampled on the same day. Thus, the table has two entries for Loves Mill, Loudon, and the Valley-wide rough, the first for the earlier period and the second for the later period. Second, the Valley-wide rough summarizes the three, then five, roughs from the sites sampled on the same day. The table contains correlations of the individual roughs and the Valley-wide rough to show the similarity of these roughs. Third, the table contains in the lower triangle the numbers of observations not missing and therefore included in the correlations.
These numbers are helpful in making inferences. Fourth, the correlations are computed by a robust method that prevents a few observations from dominating the results. This method is the standardized sum and difference method with 5 percent Winsorized variances centered at 10 percent trimmed means [17].

The roughs have two striking features: (1) Roughs from sites sampled the same day are closely related; and (2) in most cases, for the days on which the Valley-wide rough is unusually high or low, all sites have unusually high or low levels.

THE ALGORITHM

Having described the data, we now show how the decomposition is computed. Before the decomposition is started, missing values in the data are replaced by linear interpolation between nearby values from the same site. Each data point is then transformed as described by Equation (1). These steps produce a 5 x 258 array of values that should be thought of as being in block 1 of Figure 4 at the start of the decomposition. These values represent the period from November 1973 through January 1978. From each smooth, seven values are dropped from each end to reduce the smooths to exactly four years.

[Figure 2. Smoothed levels of total suspended particulates. Panels by site (including Loves Mill and Giles County), plotted by season (W, S, S, F) for 1974 through 1977.]

[Figure 3. Smoothed levels of suspended sulfates. Panels by site (including Loves Mill, Hytop, and Giles County), plotted by season (W, S, S, F) for 1974 through 1977.]

TABLE 1. ROBUST CORRELATIONS AND NUMBER OF NONMISSING OBSERVATIONS FOR THE ROUGHS

Total suspended particulates

              Loves Mill    Loudon        Hytop         Giles County  LBL           Valley-wide
Loves Mill    --            0.09/0.58(a)  0.32/0.48     0.31/0.57     0.35/0.52     0.34/0.76
Loudon        129/84(b)     --            0.12/0.61     -0.01/0.60    0.09/0.51     0.05/0.76
Hytop         132/90        128/88        --            0.62          0.53          0.84/0.77
Giles County  131/79        126/77        212           --            0.60          0.83/0.91
LBL           137/91        133/87        228           217           --            0.84/0.76
Valley-wide   143/95        138/92        142/97        141/85        148/98        --

Suspended sulfates

              Loves Mill    Loudon        Hytop         Giles County  LBL           Valley-wide
Loves Mill    --            0.30/0.58(a)  0.43/0.55     0.21/0.54     0.33/0.47     0.40/0.73
Loudon        125/86(b)     --            0.09/0.69     0.07/0.58     0.21/0.45     0.11/0.83
Hytop         130/92        126/89        --            0.66          0.51          0.90/0.83
Giles County  132/82        126/80        217           --            0.46          0.82/0.83
LBL           133/93        129/89        227           222           --            0.73/0.67
Valley-wide   141/97        136/93        142/98        144/88        145/100       --

(a) Correlation before May 17, 1976/correlation after May 17, 1976.
(b) Number before May 17, 1976/number after May 17, 1976.

[Figure 4. Data decomposition showing flow of calculations. Block 1: individual roughs (Loves Mill, Loudon, Hytop, Giles County, LBL); block 2: local smooths; block 3: Valley-wide rough; block 4: Valley-wide smooth; block 5: seasonal component.]

The first step in the decomposition computes a smooth trace through the data for each site. The particular algorithm we chose for this purpose is called 4253H and is specified below [8]. The smooth traces are subtracted from the data originally in block 1 and stored in block 2; the differences are left in block 1.

The second step computes the Valley-wide component from the values remaining in block 1 by finding the median value for each sampling day. Before May 17, 1976, these medians are determined by Hytop, Giles County, and Land Between the Lakes only. Thereafter, they are determined by all sites. The resulting medians are stored in block 3.

The third step replaces the values stored in block 1 that were initially missing. These values are replaced by the corresponding value of the Valley-wide component in block 3, except for the missing values from Loves Mill and Loudon before May 17, 1976. Missing values from these two sites before May 17, 1976, are replaced by zero.
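Steps 2 and 3 can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the authors' program: block 1 is represented as a dictionary of lists, the function and variable names (`valley_wide`, `refill_missing`, `same_day_from`, `was_missing`) are hypothetical, and the numbers are invented rather than taken from the TVA data.

```python
from statistics import median

SITES = ["Loves Mill", "Loudon", "Hytop", "Giles County", "LBL"]
# Sites sampled one or two days early before the schedule change.
OFFSET_SITES = {"Loves Mill", "Loudon"}

def valley_wide(block1, same_day_from):
    """Step 2: median across sites for each sampling day.

    block1 maps site name -> residuals (data minus smooth trace).
    same_day_from is the index of the first sampling day on which all
    sites share one schedule (May 17, 1976, in the report).
    """
    n = len(block1[SITES[0]])
    same_day_sites = [s for s in SITES if s not in OFFSET_SITES]
    block3 = []
    for t in range(n):
        sites = SITES if t >= same_day_from else same_day_sites
        block3.append(median(block1[s][t] for s in sites))
    return block3

def refill_missing(block1, block3, was_missing, same_day_from):
    """Step 3: overwrite, in place, the values that were initially missing."""
    for site in SITES:
        for t in was_missing[site]:
            early_offset = site in OFFSET_SITES and t < same_day_from
            block1[site][t] = 0.0 if early_offset else block3[t]

# Invented residuals for three sampling days; the schedule change is at index 2.
block1 = {
    "Loves Mill":   [0.5, 0.1, 0.2],
    "Loudon":       [0.4, -0.2, 0.3],
    "Hytop":        [0.1, 0.0, -0.1],
    "Giles County": [0.2, 0.3, 0.0],
    "LBL":          [0.3, -0.1, 0.1],
}
was_missing = {"Loves Mill": [0], "Loudon": [], "Hytop": [0],
               "Giles County": [], "LBL": [1]}

block3 = valley_wide(block1, same_day_from=2)
refill_missing(block1, block3, was_missing, same_day_from=2)
```

Keeping the list of initially missing positions separate from the data is a design choice of this sketch; it lets the interpolated placeholders take part in the first smoothing pass and still be overwritten afterward, as the third step describes.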
The fourth step ensures that in the end the values in block 1 have no smooth trace. It repeats computations like those in steps 1 through 3, using the values left in block 1 as inputs. The smooth traces of the values in block 1 are computed, subtracted from the values in block 1, and added to the values in block 2. Next, the Valley-wide component is recomputed as in step 2 and stored in block 3. Then, missing values are replaced as in step 3. Finally, these analogs of steps 1 through 3 are repeated yet another two times. What then remains in block 1 are the individual roughs, and what then remains in block 3 is the Valley-wide rough.

The fifth step removes the Valley-wide component from the values stored in block 2. It averages the values for each sampling day, ignoring the one- and two-day differences in the schedule. These averages are subtracted from block 2, leaving the local smooths, and are stored in block 4.

The sixth step obtains the seasonal component from these averages. The seasonal component is computed for each of the 61 sampling days in a year by finding the midmean of the yearly values for that sampling day. It is subtracted from block 4, leaving the Valley-wide smooth, and is placed in block 5. To obtain the values in Figure 1, we retransformed this seasonal component.

The smoothing algorithm 4253H is the following sequence of computations [8]. First, running medians of length 4 and 2 are applied to give

$$y_t^{(1)} = \tfrac{1}{2}\,\mathrm{median}\,[y_{t-2},\, y_{t-1},\, y_t,\, y_{t+1}] + \tfrac{1}{2}\,\mathrm{median}\,[y_{t-1},\, y_t,\, y_{t+1},\, y_{t+2}]. \tag{2}$$

Second, a running median of length 5 is applied to give

$$y_t^{(2)} = \mathrm{median}\,[y_{t-2}^{(1)},\, y_{t-1}^{(1)},\, y_t^{(1)},\, y_{t+1}^{(1)},\, y_{t+2}^{(1)}]. \tag{3}$$

Third, a running median of length 3 is applied to give

$$y_t^{(3)} = \mathrm{median}\,[y_{t-1}^{(2)},\, y_t^{(2)},\, y_{t+1}^{(2)}]. \tag{4}$$

Fourth, a running weighted average called hanning is applied to give

$$y_t^{(4)} = [\,y_{t-1}^{(3)} + 2\,y_t^{(3)} + y_{t+1}^{(3)}\,]/4. \tag{5}$$

The above formulas show that $y_t^{(4)}$ is obtained from 13 original data points, $y_{t-6}, \ldots, y_{t+6}$. To make the output the same length as the input, six points are joined to each end of the series. The points at the beginning are obtained by applying the sequence 4253H to the first 14 data points to give $y_7^{(4)}$ and $y_8^{(4)}$. The six new points, which are denoted by $y_{-5}, y_{-4}, \ldots, y_0$, are obtained by linear extrapolation from these two values. The points for the other end are obtained similarly.

SECTION 4
INTERPRETATION OF THE DATA

The decomposition of TSP and SS data allows comparison of the various components to possible causal factors. Some causal factors are regional and some are local; some vary rapidly and some vary slowly. Thus, the decomposition is useful because the causal factors relate to some components, but not to others.

SEASONAL COMPONENT

The seasonal component is not only a prominent feature of most environmental data, but often the component that is most difficult to explain unambiguously. This difficulty is due to the seasonal nature of most possible causes.

The TSP spring peak is interesting in that it seems to be related to annually recurring events in late March and early April. The most plausible explanation for this peak is regional agricultural and biological activity. This period is the planting season in the Tennessee Valley and also the season for release of pine pollen. Both of these particulate sources should be important at rural sites and quite possibly at industrial-urban monitoring sites as well.

The TSP and SS summer peaks coincide with many interrelated factors. These peaks result from the increased frequency of meteorological conditions conducive to the transformation, transport, and buildup of both primary and secondary pollutants. Among these factors are

• High incidence of stagnating anticyclonic (high-pressure) airmasses;
• High absolute atmospheric water vapor content;
• High insolation;
• High temperature;
• Higher convective and less advective mixing;
• Low frequency of regional rainfall; and
• Low wind speed.
Although anthropogenic emissions may be the origin of a significant portion of summer air pollution, the variation in emissions alone does not seem to account for these peaks since the power demand on the TVA system is as high in winter as in summer.

VALLEY-WIDE AND LOCAL SMOOTHS

Smooths are, by definition, representative of persistent behavior and, as such, are useful in determining the trend of the data. The Valley-wide smooth is indicative of persistent behavior common to all sites. The many features of the smooths shown in Figures 2 and 3 have not all been analyzed, but two examples taken from the Valley-wide smooths and two examples from the local smooths will be discussed.

Examining the Valley-wide smooth for TSP in Figure 2, note that the fall of 1974 is a period with high levels. We attribute these unusually high levels to a prolonged dry spell. This dry spell shows the effect of meteorology on pollutant levels. The mechanisms for the pollutant increase are the dry conditions and the presence of stagnating high-pressure systems, which allow a greater amount of wind-borne soil and pollutant buildup.

Turning to the Valley-wide SS smooth in Figure 3, consider the general downward trend of the data. It appears that 1974 and 1975 experienced higher sulfate levels than did 1976 and 1977. What does this indicate? It could represent an actual decline in regional sulfate concentrations, which, as mentioned previously, could be a function of year-to-year meteorological fluctuations. It also could be the result of the change in sampling techniques in July of 1976, namely the switch from Mine Safety Appliance Co. to Gelman Spectrograde high-volume filters. Subsequent experimentation with sulfate extraction suggests that the SS data obtained from the Gelman filters are on the average too low.

The local smooths for TSP and SS at Giles County in the fall of 1974 are unusually low.
Examination of the TSP and SS data during this period indicates either extremely low pollutant concentrations or lost records. An examination of corresponding data collected from the nearby Cumberland Steam Plant indicated no unusual data. This suggests that this negative peak is due to a defective high-volume sampler.

The local SS smooth for LBL in August of 1976 is another interesting example. In this instance, the sixth-day sampling resulted in a smooth not typical of the entire month. Three of five sampling days during the month had high levels of SS. These levels, which were peculiar to LBL, were caused by transport from the Ohio Valley, a meteorological circumstance that occurs infrequently [13]. This particular peak, therefore, is not representative of the entire period.

This problem is an example of failure of the smoothing to separate the slowly varying and rapidly varying components. With sixth-day sampling, we are unable to separate these components. This problem, which is called aliasing [18] and is a form of confounding, makes explanation of the smooths more difficult.

The effect of aliasing is further demonstrated in the following example. The data used in this example are 192 days of daily SS values. In Figure 5, the six lines superimposed over the actual data are sixth-day smooths generated by using different starting days. Each smooth shows a single peak in late August or early September, although the actual data have three major peaks in this period. Thus, the six smooths do not describe the actual data very well. Further, they are not similar to each other. The problems of aliasing could be eliminated through daily sampling or at least reduced through more frequent sampling.

[Figure 5. Daily sulfate data with sixth-day sampling smoothed. Concentration plotted against day.]

VALLEY-WIDE AND INDIVIDUAL ROUGHS

The roughs contain that part of each day's value unsupported by adjacent values.
In other words, a rough is the irregular part of a series, the high-frequency component. The roughs serve as a means for detecting unusual episodes. Also, they can be used for comparisons among the high-frequency variations at the various sites.

Examination of the Valley-wide rough shows episodes that exhibit extreme levels throughout the region. Seventeen such episodes are considered in detail. Five episodes were chosen because they had the highest values of the SS Valley-wide rough. Of these, three are typical and two are unusual. The other twelve episodes have the four lowest values of the SS Valley-wide rough and the four highest and four lowest values of the TSP Valley-wide rough.

The episodes occurring on January 4, 1974, August 26, 1974, and July 4, 1975, are typical high-SS episodes. The common factor in these cases is the presence of a stagnating anticyclonic airmass. The winter episode of January 4, 1974, coincides with the presence of a cold polar continental anticyclone (PcK), which had been stagnating over middle America since January 1. The summer episodes coincide with the presence of warm maritime anticyclones (TmW), which had stagnated over the southeastern United States. Fog, smoke, haze, and low visibility are typically associated with such episodes.

The episodes occurring on January 28, 1974, May 22, 1974, May 7, 1976, and April 7, 1977, are typical low-SS episodes. The common factor associated with these cases is the presence of regional precipitation in substantial amounts on the days preceding and/or during the sampling day. This precipitation is associated with airmass convergence.

Episodes that do not fit into a similar mode require further investigation. The episodes on February 12, 1977, and on September 10, 1977, represent two such nontypical cases.

The episode occurring on February 12, 1977, is quite unusual.
SS concentrations at the trend stations were 9.9, 6.5, 8.4, 8.4, and 7.6 µg/m³, consistently above the winter seasonal mean of 4.5 µg/m³. The meteorology on the sampling day was dominated by a cyclonic center, moving in a northeasterly direction across southeastern Missouri. The associated cold front resulted in measurable precipitation across the Valley on the day of sampling. The elevated SS values conceivably result from two factors: (1) static sampling during the five days before sampling, when the presence of a stagnating anticyclone could have resulted in SS buildup, and (2) partial sampling of this stagnating airmass until the rain began. It is logical to assume that, had the cyclone not developed, SS concentrations would have been much greater.

The episode occurring on September 10, 1977, is also quite unusual. SS concentrations on this day were 16.9, 15.5, 30.4, 27.0, and 0.5 µg/m³, generally above the summer seasonal mean of 10.4 µg/m³. Indeed, as can be seen in the variation of values, this situation is unusual. On the day of sampling, the Valley meteorology was dominated by an approaching cold front from the northwest, followed by a maritime polar anticyclone (PmK). The 0.5-µg/m³ concentration recorded at LBL is so unusually low that SS concentrations at nearby TVA steam plants were also checked. This check confirmed low SS concentrations in the northwest section of the Tennessee Valley, undoubtedly associated with the PmK airmass. The airmass to the southeast of the front is associated with much higher SS concentrations. Because of the complex meteorology on the days before sampling, resulting from the passage of a tropical depression, the origin of this prefrontal airmass is uncertain. The three-dimensional trajectory model of the National Weather Service Techniques Development Laboratory shows that on September 7, 8, and 9, the trajectories into the Valley were from the north to northeast.
These trajectories crossed the large sulfur dioxide emissions sources in the Ohio Valley.

The episodes occurring on October 25, 1974, and July 28, 1975, are examples of one type of high-TSP episode. In this case, the common factor is a stagnating anticyclone, a PcK in the former episode and a TmW in the latter. The stagnating conditions are associated with fog, haze, smoke, low wind speed, and reduced visibility. We believe that, in cases such as these, fine particulates from natural and anthropogenic sources build up in the atmosphere and result in the elevated TSP levels.

The episodes occurring on January 4, 1974, and April 4, 1974, are also examples of high-TSP episodes. The mechanisms are, however, much different from the ones discussed above. In these cases, the episodes are associated with frontal activity, rain, and high wind speeds. We believe that in these cases, coarse particulates, primarily from natural sources, are carried aloft by the high winds associated with the frontal activity and result in the elevated TSP levels.

The episodes occurring on February 4, 1975, May 25, 1977, September 16, 1977, and November 3, 1977, are examples of low-TSP episodes. The common factor associated with these episodes is the presence of regional precipitation before and on the day of sampling. In all these episodes, the precipitation is associated with frontal activity. No readily apparent meteorological difference distinguishes the second type of high-TSP episode from these low-TSP episodes; both involve frontal activity and precipitation. The differences are most likely related to the sources.

The individual roughs isolate unusual data points and describe for each site the rapidly varying part of sample-to-sample variation. Unusual data points may reflect an unusual set of environmental circumstances or an error in sampling, laboratory work, or recording. As such, the individual roughs may be used in quality assurance.
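
The quality-assurance use of a rough can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this report: a running median of three stands in for the smoother, the rough is simply the data minus that smooth (no Valley-wide component is removed), and the cutoff k and the sample values are hypothetical.

```python
from statistics import median


def running_median3(values):
    """Smooth with a running median of span 3; the two end points are copied unchanged."""
    smooth = list(values)
    for i in range(1, len(values) - 1):
        smooth[i] = median(values[i - 1:i + 2])
    return smooth


def flag_unusual(values, k=4.0):
    """Return the indices whose rough exceeds k times a robust scale.

    The scale is the median of the absolute roughs, so a single extreme
    value has little effect on it. Both the scale and the default k are
    assumptions made for this sketch.
    """
    smooth = running_median3(values)
    rough = [v - s for v, s in zip(values, smooth)]
    scale = median([abs(r) for r in rough])
    if scale == 0:
        # Degenerate case: more than half of the roughs are exactly zero.
        return [i for i, r in enumerate(rough) if r != 0]
    return [i for i, r in enumerate(rough) if abs(r) > k * scale]


# Hypothetical sixth-day SS values (µg/m³) with one planted extreme value.
ss = [5.1, 6.0, 5.5, 6.3, 30.4, 5.9, 6.4, 5.2, 6.1, 5.8]
print(flag_unusual(ss))
```

A flagged sample is only a candidate for review. As noted above, it may be a real episode or an error in sampling, laboratory work, or recording, and deciding between these requires the kind of meteorological and cross-site checks described for the episodes in this section.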

As seen in Table 1, the individual roughs are well correlated. This, indeed, is a manifestation of the common regional behavior. When compared with the Valley-wide rough, these correlations provide a quantitative measure of regional "representativeness." The Giles County site appears to be most representative of regional TSP behavior, whereas the Loudon, Hytop, and Giles County sites appear to be most representative of regional SS behavior.

SECTION 5

DESIGN OF MONITORING

Ambient monitoring can be used to estimate exposure as part of a study of pollution effects or to evaluate sources as part of a study of control strategies. To achieve this latter objective, power plant sources, other industrial sources, agricultural sources, other local sources, and distant sources must be resolved. Considerations important to accomplishing this are shown by the data analyzed in this paper.

One consideration is how to resolve the agricultural and biological contribution to the TSP. Compared with some other contributions, this contribution is believed to contain mostly larger-size particles and to be less dangerous to human health [19]. Whatever the relative health effects of various types of particles, this contribution must be distinguished as effectively as possible in studies of control strategies. Thus, studies of control strategies are an important basis for the frequently repeated recommendation that particulates be measured by size and chemical composition.

Another consideration is how the meteorological influence can be removed. This influence is important in the study of long-term variations, which is one of TVA's purposes for monitoring at these isolated sites. How such variations apparent in the data are interpreted depends on their cause: Variations caused by the weather have different implications for control than variations caused by other factors.
Thus, long-term variations must be analyzed by removing the influence of year-to-year differences in the weather to obtain the series that would have occurred had each year's weather been the same. This series should show the part of the variation caused by changes in emissions.

The meteorological influence is also likely to be important in analyzing data from sites surrounding a power plant. This analysis could start with the same decomposition used above. Because all the sites would be sampled on the same day, the common rough, which is the analog of the Valley-wide rough, would be subtracted from the individual roughs to obtain a local rough for each site. The dependence of the local roughs and smooths on plume behavior would contain the evidence of pollution from the power plant. However, this dependence might exist even with no power plant contribution because of other sources. Thus, the resolution of sources also arises in this context, showing that analysis involving the weather will also be important for power plant data.

Analysis involving the meteorological influence, although it is never easy, is made harder by the aliasing problem. In the analysis of these data, the aliasing problem prevents separation of slowly varying components from rapidly varying components. Such separation is important in an observational study such as this, where the objective is to explain as much of the variation as possible. If the sampling were daily, the data would be separated into more than just rough and smooth components. The most irregular component would contain rare meteorological events as well as the results of measurement blunders. Another component would reflect mostly the passage of weather systems, thus tracking the day-to-day variations in transport. A third component would be compared with monthly summaries of causal factors in the same way that we would like to compare Figures 2 and 3 with such summaries.
This more extensive decomposition should allow monitoring to provide, under some circumstances, better information than modeling.

The recommendation that the sampling frequency for particulates be increased has been made previously on the basis that particulate measurements are a random sample [20]. Although this basis for thinking about air quality data is widespread [21], it fails to acknowledge the possibility of modeling and adjusting for meteorological and other influences. When adjustment for these influences is considered, the major problem with sixth-day sampling is seen to be aliasing rather than accuracy.

The features of these data, revealed by the analysis demonstrated in this paper, suggest various changes to be made in the data collection. These changes include more resolution in the sampling itself and collection of more ancillary data. If these changes were made, adequate data for more detailed and quantitative model building would become available. The analysis demonstrated here has thus been shown to be important in ensuring that all the data necessary to satisfy the purposes of the monitoring are collected. Analysis with this purpose should be a part of all ongoing monitoring.

REFERENCES

1. Jutze, G. A., and K. E. Foster. Recommended Standard Method for Atmospheric Sampling of Fine Particulate Matter by Filter Media--High Volume Sampler. J. Air Pollut. Control Assoc., 17:17-25, 1967.
2. U.S. Public Health Service. Determination of Sulfate in Atmospheric Suspended Particulates. 999-AP-11, 1965.
3. Appendix B--Reference Method for the Determination of Suspended Particulates in the Atmosphere (High Volume Method). Fed. Regist., 36(84):8191-8194, 1971.
4. U.S. Environmental Protection Agency. Tentative Method for the Determination of Sulfates in the Atmosphere (Automated Technicon II Methylthymol Blue Procedure), 1977.
5. Goldsmith, B. J., and J. R. Mahoney. Implications of the 1977 Clean Air Act Amendments for Stationary Sources.
Environ. Sci. Technol., 12:144-149, 1978.
6. Pratt, J. W., et al. Environmental Monitoring. National Academy of Sciences, Washington, D.C., 1977.
7. Tukey, J. W. Exploratory Data Analysis. Addison-Wesley, Reading, Mass.
8. Velleman, P. F. Robust Nonlinear Data Smoothers: Definitions and Recommendations. Proc. Natl. Acad. Sci. USA, 74:434-436, 1977.
9. Hidy, G. M., E. Y. Tong, and P. K. Mueller. Design of the Sulfate Regional Experiment (SURE), vol. 1. EPRI EC-125, Electric Power Research Institute, 1976.
10. Lioy, P. J., G. T. Wolff, J. S. Czachor, P. E. Coffey, W. N. Stasiuk, and D. Romano. Evidence of High Atmospheric Concentrations of Sulfates Detected at Rural Sites in the Northeast. J. Environ. Sci. Health, A12:1-14, 1977.
11. Galvin, P. J., P. J. Samson, P. E. Coffey, and D. Romano. Transport of Sulfate to New York State. Environ. Sci. Technol., 12:580-584, 1978.
12. Tong, E. Y., and R. B. Batchelder. Compilation and Analysis of Data Sets for the Evaluation of Regional Sulfate Models. Teknekron, Inc., Berkeley, California, 1978.
13. Reisinger, L. M., and T. L. Crawford. August 1976 Sulfate Episodes in the Tennessee Valley Region. TVA/EP-79/04, Tennessee Valley Authority, Chattanooga, Tennessee.
14. U.S. Environmental Protection Agency. National Air Quality and Emissions Trend Report, 1976. EPA-450/1-77-002, 1977.
15. Altshuller, A. P. Atmospheric Sulfur Dioxide and Sulfate--Distribution of Concentration in Urban and Nonurban Sites in the United States. Environ. Sci. Technol., 7:709-712, 1973.
16. Rowe, M. D., S. C. Morris, and L. D. Hamilton. Potential Ambient Standards for Atmospheric Sulfates: An Account of a Workshop. J. Air Pollut. Control Assoc., 28:772-775, 1978.
17. Gnanadesikan, R. Methods for Statistical Data Analysis of Multivariate Observations. John Wiley and Sons, Inc., New York, 1977. p. 132.
18. Bloomfield, P. Fourier Analysis of Time Series: An Introduction. John Wiley and Sons, Inc., New York, 1976.
19. Hidy, G. M., et al.
Summary of the California Aerosol Characterization Experiment. J. Air Pollut. Control Assoc., 25:1106-1114, 1975.
20. Tong, E. Y., and S. A. DePietro. Sampling Frequencies for Determining Long-Term Average Concentrations of Atmospheric Particulate Sulfates. J. Air Pollut. Control Assoc., 27:1008-1011, 1977.
21. Mage, D. T., and W. R. Ott. Refinements of the Lognormal Probability Model for Analysis of Aerometric Data. J. Air Pollut. Control Assoc., 28:796-798, 1978.

TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)

1. REPORT NO.: EPA/600/7-79-084
3. RECIPIENT'S ACCESSION NO.:
4. TITLE AND SUBTITLE: THE ANALYSIS OF SUSPENDED PARTICULATES AND SULFATES: A WAY TO BEGIN
5. REPORT DATE: March 1979
6. PERFORMING ORGANIZATION CODE:
7. AUTHOR(S): Walter Liggett and William Parkhurst
8. PERFORMING ORGANIZATION REPORT NO.: TVA/ONR-79/03
9. PERFORMING ORGANIZATION NAME AND ADDRESS: Office of Natural Resources, Tennessee Valley Authority, Chattanooga, TN 37401
10. PROGRAM ELEMENT NO.: INE-625B
11. CONTRACT/GRANT NO.: 80 BDM
12. SPONSORING AGENCY NAME AND ADDRESS: U.S. Environmental Protection Agency, Office of Research & Development, Office of Energy, Minerals & Industry, Washington, D.C. 20460
13. TYPE OF REPORT AND PERIOD COVERED: Milestone
14. SPONSORING AGENCY CODE: EPA/600/7
15. SUPPLEMENTARY NOTES: This project is part of the EPA-planned and coordinated Federal Interagency Energy/Environment R&D Program.
16. ABSTRACT: Total suspended particulate (TSP) and suspended sulfate (SS) levels have been sampled since November 1973 at five isolated sites across the Tennessee Valley. A method for beginning to analyze such data is demonstrated. This beginning is intended to lead finally to information on pollution sources, an objective that may require modeling meteorological influences and resolving sources. Analysis with this objective, which can be very complex, is effectively begun by using the method demonstrated in this paper.
Applied to the TSP and SS data, this method suggests agricultural contributions to TSP levels, distant-source contributions to SS levels, and various influences of the meteorology. This method also shows deficiencies in the data collection that prevent the building of better, more quantitative models. One deficiency in this data set is the sixth-day sampling, which is not frequent enough to allow monthly variations in pollution levels to be distinguished from more rapid variations. Thus, data analysis would be more effective if the sampling frequency were increased and, further, if particle size and chemical composition were better resolved.

17. KEY WORDS AND DOCUMENT ANALYSIS (Circle One or More)
a. DESCRIPTORS: Inorganic Chemistry
b. IDENTIFIERS/OPEN ENDED TERMS: Charac. Meas. & Monit.
c. COSATI Field/Group: 7B
18. DISTRIBUTION STATEMENT: Release to public
19. SECURITY CLASS (This Report): Unclassified
20. SECURITY CLASS (This page): Unclassified
21. NO. OF PAGES: 23
22. PRICE:

EPA Form 2220-1 (9-73)