EPA/600/4-85/046
July 1985

Application of the Microenvironment Monitoring Approach to Assess Human Exposure to Carbon Monoxide

Naihua Duan
Rand Corporation
Santa Monica, CA 90405

EPA Contract No. 68-02-4058

EPA Project Officer
Harold Sauls

ENVIRONMENTAL MONITORING SYSTEMS LABORATORY
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
RESEARCH TRIANGLE PARK, NC 27711

NOTICE

This document has been reviewed in accordance with U.S. Environmental Protection Agency policy and approved for publication. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

PREFACE

This study was conducted to apply the microenvironment monitoring approach to assess human exposure to carbon monoxide, using data from the Washington Urban Scale Study carried out in the winter of 1982-1983 by Research Triangle Institute, and the CO Microenvironment Study conducted in the winter of 1983 by Battelle Columbus Laboratories, under the auspices of the U.S. Environmental Protection Agency.

Additional technical background on the modeling of human exposure to air pollution is available in N. Duan, Models for Human Exposure to Air Pollution, The Rand Corporation, N-1884-HHS/RC, 1982.

This report should be of interest to researchers who want to model human exposure to air pollution in terms of activity patterns.

The analysis reported herein was performed pursuant to Contract No. 68-02-4058 with the U.S. Environmental Protection Agency.

SUMMARY

This study applies the microenvironment monitoring (MEM) approach to estimate human exposure to carbon monoxide (CO), with activity time data from the Washington Urban Scale Study and CO concentration data from the CO Microenvironment Study. The estimated MEM exposures are then compared with estimated exposures based on the personal monitoring (PM) approach (the PM exposures).
For the specific data being used in this study, the MEM exposures are about 40 percent higher than the PM exposures. However, despite that discrepancy, the MEM exposure is found to be a powerful predictor for the PM exposure. On the log scale, the MEM exposure has the correct span relative to the PM exposure; the discrepancy between the two sets of estimates is found to be a constant drift.

Two major factors could explain the discrepancy between the MEM and the PM exposures. First, the CO Microenvironment Study might have oversampled microenvironments with high CO concentrations. For example, the commuting routes in the sample were selected to be the "ones considered to be heavily travelled and predicted to have high CO exposures during rush hour periods" (Mack et al., 1984). Second, Wallace, Thomas, and Mage (1984) found a discrepancy between the carboxyhemoglobin (COHb) levels estimated from breath measurements and those estimated from the PM exposures, indicating that the PM exposures might underestimate the actual exposure. One of their possible explanations for the underestimation is a "consistent decline in the readings" on the CO monitor "as the battery approached the end of its charge."

Given the exploratory nature of the data used in this study, the results reported here should be considered as an illustration and should be generalized only with caution to future exposure studies. Based on the experience from this study, in future exposure studies, the other version of the microenvironment type (MET) approach, enhanced personal monitoring (EPM), should be given a higher priority when feasible. When the MEM approach is the only feasible one, the sampling of microenvironments should be carried out with either the weighted sampling scheme or the simulated human activity scheme.

In the estimation of exposures, it is assumed that there is already a scheme to classify microenvironments into METs.
Part of this study also examined how to choose an appropriate classification scheme. In particular, Duan's (1981) criterion was applied to the microenvironments measured in the CO Microenvironment Study to evaluate various schemes for grouping similar microenvironments into METs.

ACKNOWLEDGMENTS

The author would like to express his appreciation to the following colleagues for their support during the course of this study: T. D. Hartwell from Research Triangle Institute and Gregory Mack from Battelle Columbus Laboratories, for their help in clarifying the data from the Washington Urban Scale Study and the CO Microenvironment Study; Allan Abrahamse and Adele Palmer from The Rand Corporation, for their thoughtful review of an earlier draft of this report, which resulted in a substantial improvement in its organization; Wayne Ott and Lance Wallace from the U.S. Environmental Protection Agency, for their thoughtful comments on an earlier draft; David Holland from the U.S. Environmental Protection Agency, for valuable help in the delivery of the data used in this study, and thoughtful discussions and suggestions throughout the course of this study; and Harold Sauls from the U.S. Environmental Protection Agency, the Project Officer for this study, for valuable guidance on the conduct of this study.

CONTENTS

PREFACE
SUMMARY
ACKNOWLEDGMENTS
TABLES
GLOSSARY

Section
I. INTRODUCTION
    Exposure Assessment
    Review of the MET Approach
    Methods for Estimating Exposure
II. ACTIVITY TIME DATA
    Washington Urban Scale Study
    METs and Activity Segments
    Quality of Activity Time Data
    Startup Time, Total Time, and Summary Statistics
III. CO CONCENTRATION DATA
    Microenvironment Study
    Summary Statistics for the MET Concentrations
IV. ESTIMATED EXPOSURES
    Application of the Convolution Method
    Estimated Exposures Based on MEM
    Estimated Exposures Based on PM
    Comparison of Exposure Distributions
    Comparison of Individual-specific Exposures
    Conclusion
V. EVALUATION OF MET CLASSIFICATION SCHEMES
    Elementary METs
    Duan's Criterion
    Empirical Results
VI. FUTURE APPLICATIONS OF THE MET APPROACH
    Enhanced Personal Monitoring and Microenvironment Monitoring
    Sampling of Microenvironments
    Study Designs and Data Collection
    Further Use of the Activity Database

Appendix
A. DEFINITIONS OF METs
B. QUALITY CRITERION FOR THE ACTIVITY TIME DATA
C. FREQUENCY DISTRIBUTIONS OF ACTIVITY TIME
D. SENSITIVITY ANALYSIS
E. DUAN'S CRITERION
F. ESTIMATION OF PARAMETERS IN DUAN'S CRITERION

REFERENCES

TABLES

1. Activity Time for Modes of Travel and Types of Shop
2. Average Concentration for Modes of Travel and Types of Shop
3. Cross Tabulation for the Two Criteria
4. Frequency Distribution of Startup Time
5. Frequency Distribution of Total Time
6. Summary Statistics for Standardized Activity Times
7. Summary Statistics for CO MET Concentrations Based on MEM
8. Summary Statistics for CO MET Concentrations Based on PM
9. Correlation Between MET Concentration and MET Time
10. Summary of Imputed MET Concentrations
11. Summaries for MEM and PM Exposures
12. Percentiles of MEM and PM Exposures
13. Summaries of log MEM and PM Exposures
14. Percentiles of log MEM and PM Exposures
15. MSE and GAIN for MEM Exposures
16. Regressions of PM Exposures on MEM Exposures
17. Ranking of MET Decompositions

GLOSSARY

c.d.f.    cumulative distribution function
CO        carbon monoxide
COHb      carboxyhemoglobin
MEM       microenvironment monitoring
MEM-C     the convolution method applied to microenvironment monitoring data
MEM-H     the hybrid method applied to microenvironment monitoring data
MET       microenvironment type
MSE       mean squared error
NAAQS     national ambient air quality standards
NEM       NAAQS exposure model
PEM       personal exposure monitor
PM        personal monitoring
ppm       parts per million
SHAPE     simulation of human air pollution exposures

I. INTRODUCTION

This study applies the microenvironment monitoring (MEM) approach, one of the two versions of the microenvironment types (MET) approach (also called the indirect approach), to estimate human exposure to carbon monoxide (CO), using activity time data from the Washington Urban Scale Study and CO concentration data from the CO Microenvironment Study. The estimated exposures based on the MEM approach (the MEM exposures) are then compared with estimated exposures based on the personal monitoring approach, also called the direct approach (the PM exposures).

For the specific data used in this study, the MEM exposures are about 40 percent higher than the PM exposures. However, despite the discrepancy, the MEM exposure is found to be a powerful predictor for the PM exposure. On the log scale, the MEM exposure has the correct span relative to the PM exposure; the discrepancy between the two sets of exposure estimates is found to be a constant drift. Further details on these results are discussed in Secs. II-IV.

Two major factors could explain the discrepancy between the MEM and the PM exposures. First, the CO Microenvironment Study might have oversampled microenvironments with high CO concentrations. For example, the commuting routes in the sample were selected to be the "ones considered to be heavily travelled and predicted to have high CO exposures during rush hour periods" (Mack et al., 1984).
Second, Wallace, Thomas, and Mage (1984) found a discrepancy between the COHb levels estimated from breath measurements and those estimated from PM exposures, indicating that the latter might underestimate the actual exposure. One of their possible explanations for the underestimation is a "consistent decline in the readings" on the CO monitor "as the battery approached the end of its charge."

Given the exploratory nature of the data used in this study, the results reported here should be considered as an illustration and be generalized only with caution to future exposure studies. Based on the experience from this study, in future exposure studies the enhanced personal monitoring (EPM) approach should be given a higher priority when feasible. When the MEM approach is the only feasible approach, the sampling of microenvironments should be carried out with either the weighted sampling scheme or the simulated human activity scheme.

In the estimation of exposures, it is assumed that a scheme to classify microenvironments into METs has already been chosen. Part of this study also examined how to choose an appropriate classification scheme. In particular, Duan's (1981) criterion has been applied to the microenvironments measured in the CO Microenvironment Study to evaluate various schemes to group similar microenvironments into microenvironment types (METs).

EXPOSURE ASSESSMENT

Until recently, human exposure to air pollution could be assessed only with fixed-site ambient monitoring data. Typically people residing in the same neighborhood near a monitoring station were treated as homogeneous receptors fixed at the location of the monitoring station. Recent field studies with personal exposure monitors (PEMs) have found this approach inadequate for such pollutants as carbon monoxide, which are spatially variable or have nonambient sources or sinks.
For such pollutants it is important to take into account people's mobility and activities in assessing their exposures. With the recent development of PEMs, it became feasible to incorporate mobility and activities into exposure assessment, especially for CO, for which several reliable, continuous PEMs are available.

There are two general approaches to exposure assessment using PEMs. The first is the personal monitoring (PM) approach, also called the direct approach, in which human subjects are sampled from the target population and are equipped with PEMs for a certain time to measure their exposures directly. This approach is taken in the Washington Urban Scale Study, the details of which are discussed in Sec. II. The advantages of this approach are its simplicity and its freedom from modeling assumptions. The main disadvantage is its cost, which is too high for large scale investigations.

An alternative approach to assess exposure is the MET approach, in which pollutant concentration data are combined with or enhanced by activity time data.[1] There are two ways in which the MET approach can be implemented: the enhanced personal monitoring (EPM) approach and the microenvironment monitoring (MEM) approach. The latter approach was taken in the CO Microenvironment Study, the details of which are discussed in Sec. III.

[1] Duan (1981); Duan (1982); Ott (1981).

Assessment of Individual Exposures

In many situations it is important that the estimated exposure be attributable to the exposed individuals. For example, in epidemiological studies to quantify the health effects of air pollution, one needs to know who is exposed to what levels of pollution. In that situation the estimated exposures must be close to the actual exposures.

In this study we do not know the actual exposures; both the MEM exposures and the PM exposures are estimates and are subject to error. Nevertheless it is still preferable that the two estimated exposures be close to each other; if the two estimated exposures are substantially different from each other, at least one of them must be substantially different from the unobserved actual exposure.

The mean squared error (MSE) will be used as the criterion to measure how close the two estimated exposures are to each other. The criterion is defined as follows:

    \mathrm{MSE} = \sum_i (\mathrm{MEM}_i - \mathrm{PM}_i)^2 / n ,    (1)

where $\mathrm{MEM}_i$ is the MEM exposure for the ith unit (say, person-day), $\mathrm{PM}_i$ is the PM exposure for the ith unit, and $n$ is the total number of units. The square root of MSE can be interpreted as the average magnitude of the difference between the two estimated exposures.

In certain other situations it is important only to distinguish people exposed to higher levels of air pollution from people exposed to lower levels. For example, one might be interested in qualifying the health effects of air pollution--i.e., in determining qualitatively the existence of health effects. In such situations it is important only to rank people's exposures; it is not necessary to know the exact levels of exposures. It is necessary only that the estimated exposure be a significant predictor for actual exposure. Therefore we can use the strength of the regression of the actual exposure on the estimated exposure as the criterion for evaluating the estimated exposure. The estimated exposure might be substantially biased from the actual exposure, but so long as the two measures are well correlated, the estimated exposure can be used as a proxy for actual exposure in qualifying health effects.

In this study the actual exposures are unknown and therefore cannot be used to estimate the regression relationship between the estimated exposures and the unobserved actual exposures.
Instead the PM exposure is used as the benchmark; the PM exposure is regressed on the MEM exposure[2] to test for the relationship between the two estimated exposures. If the MEM exposure is not a good predictor for the PM exposure, it is unlikely that it will be a good predictor for the unobserved actual exposure.

[2] The PM exposure is taken in this analysis as the standard method; the MEM exposure is taken as the new method to be calibrated relative to the standard method.

Assessment of Exposure Distributions

For certain other purposes, such as risk assessment, it is necessary only to estimate the distribution of exposures; it is not necessary to identify each individual's exposure, although it is useful to have such exposures available for reference. To take an extreme case, consider a population with two individuals, one exposed to 10 ppm of CO, another exposed to 1 ppm. If the dose-response relationship has already been established from other sources, the risk assessment can be done without knowing each individual's exposure. For example, assume that it is already known that exposure to levels higher than 9 ppm is harmful. The goal for risk assessment is to estimate what fraction of the target population is exposed to harmful levels. In this case the only information relevant to the risk assessment is that half of the hypothetical population is exposed to 10 ppm and the other half is exposed to 1 ppm. It is not necessary to distinguish which individual is exposed to which level. (To take the case even further, suppose an exposure assessment method wrongly estimated the first individual's exposure to be 1 ppm while his actual exposure is 10 ppm, and wrongly estimated the second individual's exposure to be 10 ppm while his actual exposure is 1 ppm. The result should still be judged satisfactory for risk assessment, because the exposure distribution is correct.)
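The contrast between the individual-level criterion of Eq. (1) and the distribution-level criterion can be illustrated with the two-person example above. The following is a minimal sketch, not part of the original analysis; the function names `mse` and `fraction_above` are hypothetical, and the 9 ppm threshold is the illustrative harmful level assumed in the text.

```python
import math


def mse(estimates, benchmark):
    """Eq. (1): mean squared difference between two paired sets of exposures."""
    if len(estimates) != len(benchmark):
        raise ValueError("the two sets of exposures must be paired")
    return sum((e - b) ** 2 for e, b in zip(estimates, benchmark)) / len(estimates)


def fraction_above(exposures, threshold):
    """Fraction of units whose exposure exceeds the threshold."""
    return sum(1 for e in exposures if e > threshold) / len(exposures)


# Two-person example from the text: actual exposures of 10 ppm and 1 ppm,
# with the estimates swapped between the two individuals.
actual = [10.0, 1.0]
estimated = [1.0, 10.0]

individual_mse = mse(estimated, actual)          # large individual-level error
root_mse = math.sqrt(individual_mse)             # average magnitude of the error
frac_actual = fraction_above(actual, 9.0)        # fraction above 9 ppm (actual)
frac_estimated = fraction_above(estimated, 9.0)  # fraction above 9 ppm (estimated)

print(individual_mse, root_mse, frac_actual, frac_estimated)
```

In this hypothetical case the estimates fail badly by the MSE criterion yet reproduce the exposure distribution exactly, which is why the two assessment goals call for different evaluation criteria.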
REVIEW OF THE MET APPROACH

The MET approach combines MET-specific pollutant concentration data and activity time data to estimate exposures. This approach incorporates information about people's mobility and activity. The MET approach can be implemented in two ways: the EPM approach and the MEM approach. Detailed discussion of these approaches was given in Duan (1981, 1982).

Microenvironments and METs

A microenvironment is a chunk of air space with homogeneous pollutant concentrations. The integrated exposure of a specific individual may be represented as a weighted sum of concentrations in the microenvironments in which he spent time during a given study period, say a 24-hour period, weighted by the amount of time he spent in each microenvironment:

    E_i = \sum_j y_j \times t_{ij} ,    (2)

where $E_i$ is the ith person's exposure during the study period, the microenvironment concentration $y_j$ is the pollutant concentration in the jth microenvironment, and the microenvironment time $t_{ij}$ is the time the ith person spent in the jth microenvironment. The summation is taken over all microenvironments that some individual in the target population might spend some time in.

If the microenvironment times $t_{ij}$ and microenvironment concentrations $y_j$ have been measured, Eq. (2) may be used to calculate exposures $E_i$. However, in any realistic situation, there are far too many microenvironments to be measured extensively. To make good use of data collection resources, microenvironments of a similar nature must be grouped in terms of pollutant concentrations. For example, all indoor microenvironments might be grouped together, and all outdoor microenvironments might be another group. Such groups of similar microenvironments will be referred to as microenvironment types (METs). In terms of the METs, integrated exposures may be reformulated as follows:

    E_i = \sum_k C_{ik} \times T_{ik} ,    (3)

where the MET time $T_{ik}$ is the total amount of time the individual spent in microenvironments belonging to the kth MET, and the MET concentration $C_{ik}$ is the average concentration the individual is exposed to during the time he spent in microenvironments belonging to the kth MET. The MET time is related to the microenvironment times as follows:

    T_{ik} = \sum_j \delta_{jk} \times t_{ij} ,    (4)

where $\delta_{jk} = 1$ if the jth microenvironment belongs to the kth MET, and 0 otherwise. The MET concentration $C_{ik}$ is related to the microenvironment concentrations as a weighted average of the microenvironment concentrations $y_j$ in microenvironments belonging to the kth MET, weighted by the fraction of time spent in each microenvironment belonging to this MET:

    C_{ik} = \sum_j \delta_{jk} \times w_{ijk} \times y_j ,    (5)

where $w_{ijk} = t_{ij} / T_{ik}$.

Equation (3) will be referred to as the time-weighted summation formula.

The discussion has focused on integrated exposure. Maximum one-hour and eight-hour exposures may be dealt with similarly. For each one-hour or eight-hour period, the time-weighted summation formula may be used to estimate the integrated exposure for the time period. The maximum may then be taken over all one-hour or eight-hour periods of interest.

Implementation of the MET Approach

The main advantage of the MET approach is that concentration data and activity data are combined from two different sources. In certain situations one of the two databases might already be available or be inexpensive to obtain. In certain situations both databases might already be available. For example, the sources of human exposures to nitrogen dioxide and CO are fairly similar. Both are mainly determined by combustion sources. Therefore the same set of METs can be used for both pollutants, with little or no modification. The activity data collected in the CO studies can therefore be used as the activity data for a future study of human exposure to nitrogen dioxide.
If another database becomes available for MET concentrations for nitrogen dioxide or can be collected inexpensively, it will be feasible to combine the existing activity database with the nitrogen dioxide concentration database to assess human exposure to nitrogen dioxide without the need to collect additional activity data.

There are two ways to implement the MET approach, depending on how the concentration data are collected. In the first, the MET concentrations are obtained from personal monitoring. In the second, the MET concentrations are obtained from microenvironment monitoring. With either approach, the activity data have to be collected from human subjects, using either a diary or a recall survey.

Enhanced Personal Monitoring

The personal monitoring approach collects pollution concentration data on a sample of human subjects and derives each individual's exposure directly from the measured concentrations. If activity time data are also collected, either on the same sample of human subjects or on a different sample of human subjects, the concentration data from personal monitoring can be combined with activity time data to estimate exposure using the time-weighted summation formula Eq. (3). This approach is called the enhanced personal monitoring (EPM) approach.[3]

[3] Duan (1982) referred to this as the continuous personal monitoring approach.

To use the personal monitoring data in the EPM approach, it is necessary to have continuous measurements of pollutant concentrations[4] to derive the MET concentrations to be used in the time-weighted summation formula Eq. (3). Therefore the EPM approach requires that reliable continuous PEMs be available. Furthermore, for the PEMs to be used by untrained human subjects, they have to be small, lightweight, and easy to operate. For carbon monoxide, several reliable PEMs satisfy the above requirements and EPM is a feasible way to implement the MET approach.
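The relationship among Eqs. (2) through (5) can be checked numerically. The sketch below is illustrative only: the microenvironment names, times (hours), concentrations (ppm), and the three-way MET grouping are hypothetical values chosen for the example, not data from either study, and the function names are assumptions.

```python
import math


def met_times(micro_times, met_of):
    """Eq. (4): total time spent in each MET, summed over its microenvironments."""
    totals = {}
    for micro, t in micro_times.items():
        met = met_of[micro]
        totals[met] = totals.get(met, 0.0) + t
    return totals


def met_concentrations(micro_times, micro_conc, met_of):
    """Eq. (5): time-weighted average concentration within each MET."""
    totals = met_times(micro_times, met_of)
    conc = {met: 0.0 for met in totals}
    for micro, t in micro_times.items():
        met = met_of[micro]
        if totals[met] > 0:
            conc[met] += (t / totals[met]) * micro_conc[micro]  # weight w_ijk
    return conc


# Hypothetical 24-hour day for one person.
micro_times = {"kitchen": 2.0, "bedroom": 8.0, "car": 1.0, "office": 9.0, "street": 4.0}
micro_conc = {"kitchen": 4.0, "bedroom": 1.0, "car": 8.0, "office": 2.0, "street": 3.0}
met_of = {"kitchen": "indoor", "bedroom": "indoor", "office": "indoor",
          "car": "transit", "street": "outdoor"}

# Eq. (2): sum over microenvironments.
e_micro = sum(micro_conc[j] * t for j, t in micro_times.items())

# Eq. (3): sum over METs, using Eqs. (4) and (5).
T = met_times(micro_times, met_of)
C = met_concentrations(micro_times, micro_conc, met_of)
e_met = sum(C[k] * T[k] for k in T)

print(e_micro, e_met, math.isclose(e_micro, e_met))
```

Under these definitions the two formulas give the same integrated exposure (in ppm-hours) for an individual, because the weights in Eq. (5) are that individual's own time fractions. The grouping into METs becomes an approximation only when the MET concentrations are taken from a different source than the individual's own microenvironments.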
With the EPM approach, a sample of human subjects is equipped with PEMs. Participants keep a diary of their activities. From the continuous measurements and diaries, it is possible to determine at any instant the MET the participant was in and the instantaneous concentration the participant was exposed to. The average concentration the participant was exposed to during the time he spent in that MET may then be derived. That is the MET concentration needed for the time-weighted summation formula.

The diaries the participants record during the monitoring phase provide some activity data, which can be supplemented by additional activity data. One inexpensive way to expand the activity database is to collect diary data on the same participants on additional days. For example, participants may be enrolled in the study for a week, record their daily activities, and be equipped with PEMs for one or two days during the study period. Additional participants may only fill out the diaries but not participate in the monitoring. The marginal cost of collecting the additional diaries should be low compared with collecting the monitoring data. Furthermore, existing activity data can be combined at little or no cost with the activity data collected on the participants who are equipped with PEMs.

[4] With the "pure" personal monitoring approach one needs only integrated measurements for the assessment of integrated exposures.

Microenvironment Monitoring

An alternative way to implement the MET approach is microenvironment monitoring.[5] Instead of obtaining MET concentration data with personal monitoring, a number of microenvironments may be sampled in each MET, with research staff or trained technicians sent to the sampled microenvironments to monitor those microenvironments directly.

[5] This approach was referred to as the replicated microenvironment monitoring approach in Duan (1982).

Duan (1982) noted that the distribution of microenvironment concentrations $y_j$ can be different from the distribution of MET concentrations except for a special case in which, for each MET, each individual visits only one microenvironment of that type, and each microenvironment belonging to that MET has the same probability of being visited. This special case is unlikely to be satisfied, because people might visit several microenvironments belonging to the same MET, or some microenvironments will have higher probabilities of being visited. When the special case described above is not satisfied, Duan (1982) showed that the microenvironment concentrations are usually more variable than the MET concentrations.

To use microenvironment concentrations to derive valid estimates of MET concentrations, it is necessary to obtain a valid sample of microenvironments. For the sample to be valid, the target population is not the population of microenvironments belonging to a certain MET. Instead, it is the coincidence (intersection) of microenvironments and people. For example, consider a MET consisting of all office spaces. Assume also that each identifiable room is sufficiently homogeneous and can be regarded as a microenvironment. If the collection of all microenvironments were the target population, a roster of all office spaces could serve as the sampling frame, or an area probability sample could be drawn. However, some of these office spaces might be empty most of the time, and some might be crowded most of the time. To obtain a valid sample of microenvironments, it is necessary to adjust for such differences.

One possibility discussed in Duan (1982) is to count or estimate the number of people present at each sampled microenvironment and use these counts as weights. The crowded microenvironments get larger weights in the analysis. Another possibility is to simulate real human activities using available diary data.
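The effect of weighting sampled microenvironments by the number of people present can be sketched as follows. This is an illustration under stated assumptions, not the estimator specified in Duan (1982): the function name, the three hypothetical office spaces, their concentrations (ppm), and the head counts are all invented for the example.

```python
def weighted_met_concentration(concentrations, people_counts):
    """Average concentration over sampled microenvironments, weighted by occupancy."""
    if len(concentrations) != len(people_counts):
        raise ValueError("one head count is needed per sampled microenvironment")
    total_people = sum(people_counts)
    if total_people <= 0:
        raise ValueError("at least one sampled microenvironment must be occupied")
    weighted_sum = sum(c * w for c, w in zip(concentrations, people_counts))
    return weighted_sum / total_people


# Hypothetical sample of three office spaces: one empty, one moderately
# occupied, one crowded.
office_conc = [1.0, 2.0, 6.0]   # measured CO concentration in ppm
office_people = [0, 10, 30]     # people present during monitoring

unweighted = sum(office_conc) / len(office_conc)
weighted = weighted_met_concentration(office_conc, office_people)

print(unweighted, weighted)
```

In this made-up sample the empty office contributes nothing to the weighted estimate, while the crowded office dominates it. The unweighted mean describes the population of rooms, whereas the weighted mean is closer to what the people occupying those rooms experience.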
Comparison of EPM and MEM Approaches

Compared with the MEM approach, the EPM approach has the advantage of avoiding the difficult task of sampling microenvironments. In essence, with the EPM approach, the participants sample the microenvironments for the investigator.

The difficulty in the sampling of microenvironments is the main disadvantage of the MEM approach. If all microenvironments belonging to the same MET have exactly the same concentration, or have very similar concentrations, this is not a problem. Any reasonable sample of microenvironments will give a good result. However, concentrations are more likely to vary substantially from microenvironment to microenvironment, even though they belong to the same MET. In this situation the appropriate sampling of microenvironments is important.

The EPM approach is very demanding on hardware. As of now, CO appears to be the only pollutant for which the necessary continuous PEM is available for this approach. The main advantage of the MEM approach is that it does not require miniature continuous PEMs. Because the monitoring is conducted by research staff or trained technicians, it is possible to use portable continuous monitors, which are inconvenient for untrained human subjects to use. (Of course it does not hurt to have PEMs available.)

Another disadvantage of the EPM approach is the need to sample and monitor human subjects. This disadvantage is probably secondary, because it is already necessary to sample human subjects to obtain the diary data. However, the inconvenience to the human subject in the collection of the diary information is likely to be minor relative to the collection of the monitoring data; with the latter the participant needs to wear the monitor for an extended time. The MEM approach has the advantage of not needing human subjects in the monitoring. However, human subjects must still be used to collect the activity data, unless there is a suitable activity database already available.
Another disadvantage of the MEM approach is that it might be difficult to access certain microenvironments sampled for monitoring. With the EPM approach it might be easier for the human subjects to access such microenvironments.

METHODS FOR ESTIMATING EXPOSURE

The MET concentration data and the MET time data can be combined in several ways to estimate exposure. If one is interested only in average exposure, one can use the average-time weighted summation formula and estimate average exposure by

    \bar{E} = \sum_k \bar{C}_k \times \bar{T}_k ,    (6)

where $\bar{E}$ is the average exposure, $\bar{C}_k$ is the average MET concentration for the kth MET, and $\bar{T}_k$ is the average MET time for the kth MET. This method implicitly assumes that the MET concentrations and MET times are uncorrelated, because otherwise the correct formula for average exposure, based on averaging both sides of the time-weighted summation formula (3) over the individuals, should contain an additional covariance term:

    \bar{E} = \sum_k [ \bar{C}_k \times \bar{T}_k + \mathrm{Cov}(C_k, T_k) ] .

The assumption that MET concentrations and MET times are uncorrelated is not unusual. It is implicitly assumed in all models for human exposure using the MET approach, including SHAPE (Ott, 1981), the convolution method, and the hybrid method. The assumption basically rules out people's response to air pollution, such as staying away from high concentration METs during polluted days.

For most purposes the mere estimation of average exposure is inadequate, and it is necessary to estimate the exposure distribution or individual exposures. There are several ways of doing this. One approach is to use simulation models such as SHAPE (Ott, 1981), in which the concentration and activity data are summarized by parametric (or nonparametric) probabilistic distributions, human activity and concentration data are simulated from those probabilistic distributions, and the simulated data are used to estimate exposures.
This type of approach generally assumes that the concentration and time are independent to facilitate the simulation.

Another approach is the convolution method proposed in Duan (1981). With this approach, units (e.g., person-days) from the activity database are paired with units (e.g., days) from the concentration database to form convoluted units (e.g., person-days), and the exposure for each convoluted unit is estimated using a time-weighted summation formula similar to Eq. (3):

    E_{im} = \sum_k C_{mk} \times T_{ik} ,    (7)

where $E_{im}$ is the exposure combining the ith unit in the activity database and the mth unit in the concentration database, $C_{mk}$ is the MET concentration for the mth unit in the concentration database in the kth MET, and $T_{ik}$ is the MET time for the ith unit in the activity database in the kth MET.

To illustrate the application of Eq. (7), consider a study that has 43 days of MEM data, combined with a sample of 705 persons, each providing one day of activity diary. If the ith person in the activity sample spent the day according to $\{T_{ik}\}$ and was exposed to concentrations $\{C_{mk}\}$ in the METs encountered during that day, he will receive exposure $E_{im}$. As independence is assumed between the MET concentrations and times, each of the 43 concentration vectors is equally likely for each of the 705 participants. With the convolution method, the exposures $E_{im}$ are derived for each of the 43 x 705 = 30,315 pairings of persons and days in the two databases. Each such pairing is one convoluted person-day. If the number of convoluted person-days is too large, we can sample from them.

This method also requires that the concentration and time be independent. Duan (1982) showed that the distribution of exposures estimated from the convolution method is an unbiased estimate of the distribution of actual exposures and is a function of the empirical c.d.f.s for the MET concentrations and the MET times. Because the empirical c.d.f. is the efficient estimator of the true c.d.f.
in the sense of being the minimum variance unbiased estimator, the exposure distribution estimated using the convolution method is also efficient in the same sense.

An alternative to the convolution approach was suggested by Harold Sauls in a private communication. This method can be viewed as a hybridization between the average-time weighted summation formula Eq. (6) and the convolution method Eq. (7). With this hybrid method, the average MET concentration in each MET is used to estimate the exposure for each unit (day or person-day) from the activity database by

    $E_i = \sum_k \bar{C}_k \times T_{ik}$.    (8)

This method ignores the variability in exposures due to the variability in MET concentrations. If all microenvironments belonging to the same MET have the same concentration, this method is preferable to the convolution method because of its simplicity. If the microenvironments belonging to the same MET vary substantially, this approach is likely to underestimate the variability of the exposure distribution. The estimated exposures based on this approach will be referred to as the hybrid exposures.

II. ACTIVITY TIME DATA

WASHINGTON URBAN SCALE STUDY

A population-based study on CO exposure was conducted during the winter of 1982-1983 in the Washington, D.C. metropolitan area.¹ Details on this study are available in Hartwell et al. (1984a and b) and Whitmore et al. (1984). An area probability sample of human subjects was enrolled for one day each in this study. The participants filled out activity diaries giving the activities they were engaged in during each time period. The activities are entered in the diaries as activity segments, where each activity segment is defined to be the time period between two reported changes in activities in the activity diary. The participants' exposures to CO were measured using PEMs, which gave the average concentration for each activity segment.
The participants in the Urban Scale Study were selected from a probability sample. To extrapolate from the sample to the target population, it is necessary to weight the individual observations by the sampling weights based on sampling probabilities. In preliminary analysis, the summary statistics based on the weighted and the unweighted procedures were compared. The weighting did not have a major effect on the results. For example, the average time spent in car commuting differs by about 2 percent between the weighted and the unweighted estimates. Because the primary goal of this study is to compare the estimated exposures based on the MEM and the PM approaches, the extrapolation to the target population is not crucial. To simplify the analysis it was decided not to weight the individual observations.

¹A similar study was conducted in the Denver metropolitan area at about the same time. Details on this study are available in Johnson (1984).

METS AND ACTIVITY SEGMENTS

In the Washington Urban Scale Study each participant fills out activity diaries for one day. During this sampling day, whenever there is a new activity (e.g., the participant stops reading a newspaper in the living room, ending an old activity, and goes outside for a walk, beginning a new activity), the participant is required to record the start time of the new activity and describe it. The period between two entries in the activity diary is referred to as an activity segment. Each activity segment is regarded as one microenvironment.

Based on information available, activity segments are grouped into seven METs: parking, public transportation, private car, pedestrian, shops, offices, and others. The rest of this section gives the heuristic definitions of these METs. Detailed definitions are given in Appendix A.

The MET parking is restricted to indoor parking because only indoor parking concentration data are available from the CO Microenvironment Study.
The MET public transportation includes both bus and metrorail. Because both buses and metrorails are monitored in the Microenvironment Study, it is possible to consider them as distinct METs. However, in the evaluation of MET classification schemes to be discussed in Sec. V, it was found not to be beneficial to distinguish these two METs; therefore public transportation is considered as one MET without further refinement.

                              Table 1
        ACTIVITY TIME FOR MODES OF TRAVEL AND TYPES OF SHOP

    MET          Mode/Type     Average Time (hr)   Fraction of MET (%)
    Car          Car                 1.517                93.47
                 Truck               0.069                 4.25
                 Motorcycle          0.002                 0.12
                 Van                 0.035                 2.16
                 Total               1.623               100.00
    Pedestrian   Walking             0.254                94.42
                 Jogging             0.007                 2.60
                 Biking              0.008                 2.97
                 Total               0.269               100.00
    Shops        Stores              0.369                96.09
                 Mall                0.015                 3.91
                 Total               0.384               100.00

The MET private car includes private cars, trucks, motorcycles, and vans. It is debatable whether this MET should be restricted to the narrow definition including private cars only. (Only private cars are monitored in the Microenvironment Study.) The four modes of travel were grouped into one MET for two reasons. (1) The amount of time spent in trucks, motorcycles, and vans is very small compared with the amount of time spent in private cars. The top part of Table 1 gives the average amount of time spent in each of these modes of travel. The total amount of time spent in the four modes of travel is 1.623 hours per person per day, of which only 0.106 hours belong to the three modes other than private car, less than 7 percent of the total. (2) The MET concentrations based on PEM for those four modes of travel are roughly similar. The top part of Table 2 gives the average concentrations along with the standard errors for the averages. The difference between car and truck is small (about 1 ppm) and statistically insignificant (t = 0.73). The difference between car and van is not small (about 3 ppm) and is statistically significant (t = 3.67).
However, only seven people reported using a van in their travel. The difference between car and motorcycle is about 2 ppm; the standard error and t-statistic for this difference are not available because only one person reported using a motorcycle in his travel.

The MET pedestrian includes walking, biking, and jogging. It is again debatable whether jogging and biking should be grouped with walking into one MET. Table 1 shows that the amount of time spent jogging and biking is very small (less than 6 percent) compared with time spent walking. The difference in concentrations between walking and jogging is very small (less than 0.1 ppm) and statistically insignificant (t = 0.09). The difference between walking and biking is about 2 ppm and is statistically significant (t = 2.09). However, only five people reported biking during the sampling period. Therefore they are combined into one MET.

                              Table 2
     AVERAGE CONCENTRATION FOR MODES OF TRAVEL AND TYPES OF SHOP

    MET          Mode/Type     N(a)   Average Concentration   SE(b)
    Car          Car           592            5.1             0.22
                 Truck          22            6.3             1.67
                 Motorcycle      1            3.0
                 Van             7            2.1             0.79
    Pedestrian   Walking       220            2.3             0.16
                 Jogging         6            2.3             0.78
                 Biking          5            4.0             0.82
    Shops        Stores        225            2.2             0.17
                 Malls          11            1.8             0.54

    (a) The number of participants who used this mode/type during the sampling period.
    (b) Standard error of the average concentration.

The MET shops consists of the activity segments reported as stores, shopping malls, and theaters in malls. The amount of time spent in the malls is small relative to the time spent in stores (less than 5 percent). The difference in concentration is very small (less than 0.5 ppm) and statistically insignificant (t = 0.65). Therefore they are combined into one MET.

The MET offices consists of activity segments reported as offices.

The MET other is a residual category for activity segments not considered above. The main component of activity segments in this MET is home.
Because there are no microenvironment monitoring data corresponding to these activity segments in the Microenvironment Study, this MET cannot be refined any further.

QUALITY OF ACTIVITY TIME DATA

During the preliminary analysis of the activity time data from the Urban Scale Study, the quality of the data was found not to be uniform. Some participants took great care in providing a detailed and presumably accurate diary of their activities. Some participants reported questionable data. For example, some participants did not report spending any time sleeping during the sampling period. The same observation was made in Hartwell et al. (1984a and b).

To identify the participants who provided more reliable diary data, two criteria were developed. The first is based on the participants' success or failure in following the skip logic in the activity diary and will be referred to as the skip-logic criterion. The second is based on the consistency checks that were developed during the preparation of the activity database (Hartwell et al., 1984a, Section 5.4.2.2) and will be referred to as the consistency criterion. The details on these criteria are given in Appendix B.

The two criteria are strongly associated, indicating that they measure some common trait, presumably the quality of the diary data. Table 3 gives the cross tabulation for the two criteria. Participants who pass the consistency criterion are more likely to pass the skip-logic criterion, and participants who fail the consistency criterion are also more likely to fail the skip-logic criterion. The chi-square (χ²) statistic for association in this 2x2 table is 14.027, with one degree of freedom. The P-value is 0.0002, indicating that the association is highly significant.

The two criteria are used to stratify the participants into different levels of reliability. There are 705 participants in the sample (the "whole sample").
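As a rough check on the association statistic quoted above, the following sketch recomputes the Pearson chi-square from the four cell counts reported in Table 3 (267, 234, 77, 127). It assumes that the reported statistic is the ordinary Pearson chi-square without a continuity correction; the report does not say which variant was used, and the function name is purely illustrative.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With one degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: consistency criterion (fail, pass); columns: skip-logic criterion (fail, pass).
chi2, p_value = pearson_chi2_2x2(267, 234, 77, 127)
print(f"chi-square = {chi2:.3f}, P = {p_value:.4f}")
```

Under that assumption the recomputed statistic agrees with the value of 14.027 given in the text.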
A subset of 361 participants passed the skip-logic criterion, i.e., followed the skip logic correctly. This pool will be called the "good sample." A subset of 127 participants passed both the skip-logic criterion and the consistency criterion. This pool is the "best sample."

Applying all analyses to be discussed in the rest of this section and in later sections simultaneously on all three samples for comparison serves as a sensitivity analysis to examine whether the lack of uniformity in the quality of the activity data results in any difference in the conclusion. As it turns out, practically all of the analysis results are insensitive to the sample chosen. Therefore the results are given for the whole sample in the text, and for the "good" and the "best" samples in Appendix B.

                        Table 3
          CROSS TABULATION FOR THE TWO CRITERIA

                              Skip-logic
                         Fail     Pass     Total
    Consistency   Fail    267      234      501
                  Pass     77      127      204
                  Total   344      361      705

STARTUP TIME, TOTAL TIME, AND SUMMARY STATISTICS

Startup Activity

Most participants reported an activity segment labeled "begin diary," indicated by the activity code 87 in Hartwell et al. (1984a and b). Table 4 gives the frequency distribution of this activity. Some participants took hours to get started in their diaries. We interpret this as an oversight in the instruction. These participants failed to realize that "begin diary" is an instantaneous event that is over as soon as it is recorded, and a new activity should be reported immediately. For example, the participant might be watching TV in the evening when he started the diary, and continued watching TV following the start of the diary. The correct diary should be an activity segment with zero (or nearly zero) duration for "begin diary," followed by another activity segment for watching TV, followed by the next activity such as going to bed.
Many of the participants reported "begin diary" as a time-consuming activity. For example, the participant in the earlier example might report the time he spent watching TV after the start of the diary as part of the activity segment "begin diary," and would not report another activity transition until he went to bed. The actual activity for this activity segment is missing. Therefore the activity segments "begin diary" were deleted from all further analyses and the subsequent activity segment was treated as the real beginning of the diary.

                        Table 4
          FREQUENCY DISTRIBUTION OF STARTUP TIME

    Range (minutes)    Count    Density(a) (percent per minute)
    0                   228        32.3
    1                     3         0.4
    2                    30         4.3
    3                    25         3.5
    4                    23         3.3
    5                    13         1.8
    6                    22         3.1
    7                    14         2.0    (median)
    8 - 10               47         2.2
    11 - 15              46         1.3
    16 - 25              47         0.7
    26 - 45              39         0.28
    46 - 75              40         0.19
    76 - 120             29         0.09
    121 - 180            38         0.09
    181 - 240            30         0.07
    241 - 300            22         0.05
    301 - 360             4         0.009
    361 - 420             4         0.009
    420 - (b)             1

    (a) The density is defined as the percentage ((count/705) x 100 percent) divided by the length of the range (the number of minutes covered by this range), and is given in the unit "percent per minute." For example, 6.7 percent of the participants fall inside the range 8-10 minutes, which covers three minutes; therefore the density for this range is 6.7%/3 minutes = 2.2%/minute.
    (b) The largest nine observations are (in decreasing order): 496, 387, 382, 380, 363, 358, 323, 314, and 312, all in minutes.

Total Time

The total amount of time each participant reported on during the survey is not always exactly 24 hours. Table 5 gives the frequency distribution of the total time reported in the diary. The total time is skewed toward the lower end, because of the deletion of the startup time. Some participants' total time is appreciably shorter than 24 hours. To make the activity times comparable from participant to participant, all activity times are standardized relative to 24 hours:
    SAT = RAT x 24 / Total time,

where SAT = standardized activity time, RAT = reported activity time, and Total time = reported total time, all measured in hours. Unless noted otherwise, all activity times in the rest of this report are given as standardized activity times.

Summary Statistics

Table 6 gives the summary statistics for the standardized activity times. Further details on the distribution of activity times are given in Appendix C.

                        Table 5
          FREQUENCY DISTRIBUTION OF TOTAL TIME

    Range(a)        Count    Density(b)
    Minimum          17.0
    17 - 18            5        0.7
    18 - 19            4        0.6
    19 - 20           24        3.4
    20 - 21           34        4.8
    21 - 22           44        6.2
    22 - 23           94       13.3
    23 - 23.5         77       21.8
    23.5 - 24        171       48.5
    24 - 24.5        136       38.6
    24.5 - 25         62       17.6
    25 - 26           44        6.2
    26 - 27            7        1.0
    27 - 28            3        0.4
    Maximum          27.483

    (a) Total time in hours, including lower endpoint and excluding upper endpoint.
    (b) Densities in percentages per hour range. See Note a in Table 4 for the definition of density.

                        Table 6
     SUMMARY STATISTICS FOR STANDARDIZED ACTIVITY TIMES

    MET            Mean(a)    SD(b)
    Public          0.100     0.423
    Private car     1.517     1.591
    Pedestrian      0.269     0.707
    Parking         0.084     0.766
    Shop            0.384     1.064
    Office          3.051     3.914

    (a) Average activity time in hours.
    (b) Standard deviation of activity times in hours.

III. CO CONCENTRATION DATA

MICROENVIRONMENT STUDY

The CO Microenvironment Study was conducted in the Washington, D.C. metropolitan area during the winter of 1983. The main part of the study focused on the measurement of commuting microenvironments, including parking garages, driving an automobile, riding a bus, riding a train, and walking. The detailed design and preliminary results from the study are given in Flachsbart (1982a and b, 1984) and Mack et al. (1984).

For automobile commutes, the study identified eight routes that "collectively extend 160 miles, about 8.6% of the total length (1,853 miles) of Washington's arterials and freeways." (In 1980, the Washington metropolitan area had 9,432 miles of streets and roads, including arterials, freeways, and locals.)
The routes selected were "ones considered to be heavily traveled and predicted to have high CO exposures during rush hour periods" (Mack et al., 1984). Although the routes were chosen to be representative of the arterials and freeways, they might not be representative of all routes traveled by the general population. The empirical analysis found that for the commuting METs, the MET concentrations from the CO Microenvironment Study are substantially higher than the corresponding MET concentrations based on personal monitoring from the Urban Scale Study. This could be due to the oversampling of arterial routes in the study.

A Commuter Study Links Data Base was constructed from the commuting part of the Microenvironment Study. Each commuting route was divided into links ranging from one-half to three miles. Each link is a physically distinct segment of the route and will be regarded as a microenvironment. Additional links were coded for parking garages.

For quality assurance, several commuting trips used collocated monitors or inside/outside pairs. Preliminary results on monitor accuracy and monitor precision were given in Mack et al. (1984). This study restricts attention to the primary monitor in the paired monitoring.

The Microenvironment Study also included monitoring of some indoor microenvironments (shopping centers and offices). Additional monitoring was also conducted on walking microenvironments. The pedestrian data are combined with those from the commuting part of the study and analyzed together.

The Microenvironment Study did not provide comprehensive coverage of all microenvironments commonly encountered. One major exclusion was the home microenvironments. A residual MET, referred to as the MET other, consists of all microenvironments not covered in the Microenvironment Study. For the exposure estimation to be discussed in Sec.
IV, the microenvironment monitoring data will be supplemented with personal monitoring data from the Urban Scale Study for the microenvironments not covered in the Microenvironment Study.

SUMMARY STATISTICS FOR THE MET CONCENTRATIONS

MET Concentrations Based on MEM

For each MET other than the MET other, the measurements from the Microenvironment Study are aggregated into daily averages, which are used as the MET concentrations in further analysis. A total of 43 days were measured in the Microenvironment Study, from January 1 through March 18, 1983.

Table 7 gives the summary statistics for the MET concentrations for the six METs. As was expected, the concentrations in parking garages are very high. The average concentration exceeds the one-hour federal standard level of 35 ppm. The concentration in private cars is also fairly high. The average concentration exceeds the eight-hour federal standard level of 9 ppm. Public transportation, walking, and shops have moderate levels averaging about 5 ppm. Offices have low levels, averaging about 2 ppm.

                        Table 7
   SUMMARY STATISTICS FOR CO MET CONCENTRATIONS BASED ON MEM

    MET            Mean(a)    SD(b)    N(c)
    Parking         44.55     32.36     29
    Pedestrian       4.95      2.07     13
    Public           5.34      3.12     24
    Private car     11.39      3.11     32
    Shop             4.20      1.54      9
    Office           2.29      0.86      8

    (a) Average of the MET concentrations given in ppm.
    (b) Standard deviation of the MET concentrations given in ppm.
    (c) Number of days on which MEM data were available for this MET.

MET Concentrations Based on PM

An alternative set of estimates of MET concentrations can be derived from the personal monitoring data in the Urban Scale Study. For each activity segment reported, the exposure for that activity segment is computed as the product of the duration of the activity segment and its average CO concentration. For each participant and for each MET, the exposures from the activity segments belonging to that MET are summed as the total exposure for that MET.
The total exposure in the MET is divided by the total amount of time in the MET to get the MET concentration.

For certain activity segments, the CO concentrations are not available, possibly because of monitor failure. Those activity segments are not included in the calculation of the MET concentrations. To assess the effect of those missing data, the amount of time belonging to such activity segments is calculated for each participant and for each MET. For three METs (shops, parking, and public transportation), none of the participants had any activity segments with missing CO concentration data. For the other three METs, for some participants some of the activity segments did not have CO concentrations. However, the amount of time for those activity segments is very small. For the MET private car, the average amount of time per participant for which CO concentration is missing is 0.004 hours. This is less than one-half of 1 percent of the average time of 1.623 hours spent in this MET. For the MET office, the average amount of time without CO concentration is 0.002 hours, very small compared with the average time of 3.051 hours in this MET. For the MET pedestrian, the average amount of time without CO concentration is 0.001 hours, again very small compared with the average time of 0.269 hours in this MET. The activity segments with missing concentration data therefore have very little effect.

Table 8 gives the summary statistics for the average MET concentrations based on personal monitoring.

Comparison of MET Concentrations

The MET concentrations based on PM are substantially lower than the corresponding MET concentrations based on MEM, especially in the commuting METs. (See Tables 7 and 8.) The most dramatic difference of all is for the MET parking, in which there is a fourfold difference between PEM and MEM.
The average MET concentration for private cars based on MEM is more than twice the corresponding average concentration based on personal monitoring.

                        Table 8
   SUMMARY STATISTICS FOR CO MET CONCENTRATIONS BASED ON PM(a)

    MET            Mean      SD
    Parking        9.60     12.6
    Pedestrian     2.29      2.35
    Public         3.10      2.65
    Private car    5.08      5.18
    Shop           2.19      2.47
    Office         1.82      2.73

    (a) The summary statistics are based on 705 participants in the Urban Scale Study.

Two major factors could explain the discrepancy between the MET concentrations based on MEM and personal monitoring. First, for some METs, such as the commuting METs, the Microenvironment Study might have oversampled microenvironments with higher CO concentrations. For example, the commuting routes selected were the "ones considered to be heavily traveled and predicted to have high CO exposures during rush hour periods" (Mack et al., 1984). For future studies using the MEM approach, alternative sampling strategies should be considered.

Second, Wallace et al. (1984) found a discrepancy between the COHb levels estimated from breath measurements and those estimated from the exposures based on personal monitoring, indicating that the PEMs might underestimate the actual concentrations. One of their explanations for the underestimation is that there is a "consistent decline in the readings" on the CO monitor "as the battery approached the end of its charge." Wallace et al. (1984) reported that "a study is underway at EMSL-RTP to explain the behavior of PEMs after 24 hours of monitoring and determine the battery effect on the calibration." The result of that study should throw light on the nature of the discrepancy.

IV. ESTIMATED EXPOSURES

This section discusses the estimation of personal exposures using the microenvironment monitoring approach and the personal monitoring approach. For the MEM approach, both the convolution method and the hybrid method are used.
With either method, there is a substantial discrepancy between the MEM and PM exposures. The MEM exposures are about 40 percent higher. However, despite the discrepancy, the MEM exposure is found to be a powerful predictor for the PM exposure. On the log scale, the MEM exposure has the correct span relative to the PM exposures; the discrepancy between the two estimated exposures is found to be a constant drift.

APPLICATION OF THE CONVOLUTION METHOD

The convolution method discussed in Sec. I is an efficient method of combining concentration data and activity time data to estimate exposures. The hybrid method is a simpler method. This section will discuss several empirical issues encountered in the application of the two methods: the underlying assumption about independence between MET concentrations and MET times, how to deal with the METs that were not monitored in the Microenvironment Study, and the imputation of MET concentrations on days when the MET is not measured in the Microenvironment Study.

Independence and Correlation

Both the convolution and the hybrid methods assume that MET concentration and MET time are stochastically independent. The same assumption is implicitly made in other modeling approaches. The amount of data available in this study does not permit examination of the independence assumption in detail. Instead attention is restricted to the weaker assumption that MET concentration and MET time are uncorrelated. (Independence implies being uncorrelated, but being uncorrelated does not necessarily imply independence.) The data from the Urban Scale Study were used to estimate the correlations between MET time and MET concentration based on PEM. The results, given as Table 9, indicate that for the Urban Scale Study, MET time and MET concentrations are uncorrelated.¹ This lack of correlation supports the application of both the convolution method and the hybrid method.

Other Microenvironments

As was discussed in Sec.
III, the microenvironment monitoring in the Microenvironment Study was not comprehensive and did not cover all METs usually encountered in general human activities. Those microenvironments not covered in the Microenvironment Study will be grouped into one MET and referred to as the MET other. For example, no MEM data were collected in the home microenvironments, where most people spend most of their time. Therefore the MEM data must be supplemented with personal monitoring data for microenvironments without microenvironment monitoring data.

                        Table 9
     CORRELATION BETWEEN MET CONCENTRATION AND MET TIME

    MET             ρ(a)           t(b)
    Parking         0.05           0.6
    Public          0.09           0.5
    Private car    -0.05           0.3
    Pedestrian      0.02           0.8
    Shops          -0.02           0.7
    Office         [illegible]     0.8

    (a) The correlation between MET concentration and MET time for the specific MET.
    (b) The t statistic for the null hypothesis ρ = 0.

¹The correlations estimated from the Washington Urban Scale Study data are cross-individual. (Each person in the study is observed for only one day.) The result here still allows for the possibility that MET concentrations and MET times for the same individual might be correlated across days. The latter correlation can be tested only if there are multiple days on the same individual, such as in the Denver Urban Scale Study (Johnson et al., 1984).

With PM data substituting for MEM data for the MET other, the time-weighted summation formula (3) is revised as follows. For the ith person and the jth day, the exposure is estimated by

    $E_{ij} = \sum_k T_{ik} \times C_{jk} + T_{ir} \times C_{ir}$,    (9)

where the summation is taken over all METs covered in the Microenvironment Study, $T_{ik}$ is the MET time for the ith person in the kth MET covered in the Microenvironment Study, and $C_{jk}$ is the MET concentration for the jth day in the kth MET. The last term in Eq. (9) gives the exposure from the MET other, referred to as the rth MET. $C_{ir}$ is the MET concentration for the ith person in the MET other, based on personal monitoring.
$T_{ir}$ is the MET time in the MET other for the ith person. The formula (8) for the hybrid approach is similarly modified as follows:

    $E_i = \sum_k \bar{C}_k \times T_{ik} + \bar{C}_r \times T_{ir}$.    (10)

Imputation of MET Concentrations on Missing Days

On any one of the sampling days, the microenvironment monitoring was not conducted on all the METs. It is therefore necessary to impute the missing MET concentrations. The imputation procedure is based on the relationship among MET concentrations. Even though the MET concentration for the kth MET is not observed on the jth day, the MET concentrations for the other METs may be used to impute the missing MET concentration if the MET concentrations are related to each other.

An imputation procedure similar to the EM algorithm (Dempster, Laird, and Rubin, 1977) is used to impute the missing MET concentrations. First the missing values are replaced by the sample mean based on observed values for the same MET. For each MET the MET concentration is regressed on the other MET concentrations. In other words, $C_1$ is regressed on $C_2$, $C_3$, etc.; $C_2$ is regressed on $C_1$, $C_3$, etc.; where $C_1$ is the concentration in the first MET, $C_2$ is the concentration in the second MET, etc. From each estimated regression model, the predicted concentrations are used to replace the ones originally missing. (The observed MET concentrations are retained and not replaced by the predicted values.) The algorithm is then cycled back to update the regression models.

The algorithm is cycled through four iterations. The summary statistics for the final iteration are given in Table 10. Compared with the corresponding summary statistics in Table 7 based on the observed MET concentrations, the average concentrations are similar, but the standard deviations for the imputed concentrations are lower. This is a common phenomenon in imputation procedures. The predicted values do not take into account the variation not explained by the fitted regression models.
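The cycle described above (mean fill, regress each MET on the others, replace only the originally missing values, repeat four times) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually coded for the study: each regression is assumed to include an intercept and to be fitted only on the days when the response MET was observed, the METs are updated one after another within a cycle, and the toy data and function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; regression is not identified")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_fit(xs, ys):
    """Least-squares coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear(xtx, xty)


def impute(data, n_iter=4):
    """data[j][k]: concentration on day j in MET k, or None if not monitored."""
    n_days, n_mets = len(data), len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [list(row) for row in data]
    # Step 1: replace missing values by the mean of the observed values for that MET.
    for k in range(n_mets):
        observed = [data[j][k] for j in range(n_days) if not missing[j][k]]
        mean_k = sum(observed) / len(observed)
        for j in range(n_days):
            if missing[j][k]:
                filled[j][k] = mean_k
    # Step 2: regress each MET on the others and update only the missing entries.
    for _ in range(n_iter):
        for k in range(n_mets):
            others = [c for c in range(n_mets) if c != k]
            fit_days = [j for j in range(n_days) if not missing[j][k]]
            beta = ols_fit([[filled[j][c] for c in others] for j in fit_days],
                           [filled[j][k] for j in fit_days])
            for j in range(n_days):
                if missing[j][k]:
                    x = [filled[j][c] for c in others]
                    filled[j][k] = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return filled


# Hypothetical daily concentrations (ppm) for three METs over eight days.
toy = [
    [40.0, 10.0, 5.0],
    [55.0, 12.5, None],
    [30.0, None, 4.0],
    [None, 11.0, 5.5],
    [60.0, 14.0, 6.5],
    [25.0, 8.0, 3.5],
    [None, 9.0, 4.5],
    [45.0, 11.5, None],
]
completed = impute(toy)
for row in completed:
    print([round(v, 2) for v in row])
```

Because the imputed entries are fitted values with no residual noise added back, a completed data set produced this way will tend to show smaller standard deviations than the observed data, which is the effect noted in the text.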
                        Table 10
          SUMMARY OF IMPUTED MET CONCENTRATIONS

    MET            Mean      SD
    Parking       46.82     28.50
    Pedestrian     4.82      1.19
    Public         5.10      2.68
    Private car   11.26      2.70
    Office         2.30      0.37
    Shops          4.30      0.75

ESTIMATED EXPOSURES BASED ON MEM

The exposures for 30,315 convoluted person-days were estimated based on the whole sample of all 705 participants in the Urban Scale Study and the 43 days in the Microenvironment Study. The summary statistics for the MEM exposures based on the convolution method are given as the first row in Table 11 and the first column in Table 12. The average exposure is just over 2 ppm-days. The distributions are highly skewed.

The summary statistics for the logarithmic transformation of the MEM exposures based on the convolution method are given as the first row in Table 13 and the first column in Table 14. The average log exposure is about 0.5 log(ppm-day). The log exposures are somewhat skewed to the left.

The summary statistics for the hybrid exposures are given as the second row in Tables 11 and 13, and the second column in Tables 12 and 14. The average and median exposures for the two methods are very similar. The hybrid exposures have a much narrower spread, indicating that neglecting the variability in the MET concentrations has an important effect on the exposure distribution for our data. In Tables 7 and 8 the standard deviations for the MET concentrations were found to be fairly high for several of the METs.

ESTIMATED EXPOSURES BASED ON PM

An alternative set of exposure estimates can be derived using the personal monitoring data available from the Urban Scale Study. In particular,

    $E_i = \sum_k T_{ik} \times c_{ik} + T_{ir} \times C_{ir}$,    (11)

where the summation is taken over all METs covered in the Microenvironment Study, $T_{ik}$ is the MET time for the ith person in the kth MET covered in the Microenvironment Study, and $c_{ik}$ is the MET concentration for the ith person in the kth MET. The last term in Eq.
(11) gives the exposure from the MET other, referred to as the rth MET. $C_{ir}$ is the MET concentration for the ith person in the MET other, based on personal monitoring. $T_{ir}$ is the MET time in the MET other for the ith person. In a comparison of Eqs. (9) and (11), it can be seen that the difference between the exposure estimates based on the two approaches is the substitution of the MET concentration based on PM, $c_{ik}$, for the MET concentration based on MEM, $C_{jk}$.

                        Table 11
            SUMMARIES FOR MEM AND PM EXPOSURES

    Method      Mean(a)   SD(b)   IQR(c)   Skew(d)   Kurt(e)
    MEM-C(f)     2.29     2.22     1.47     9.47     175.0
    MEM-H(g)     2.29     1.63     0.77     9.39     114.4
    PM           1.59     1.63     1.52     3.11      16.7

    (a) Average of the estimated exposures in ppm-days.
    (b) Standard deviation of the estimated exposures.
    (c) Interquartile range of the estimated exposures.
    (d) Skewness of the estimated exposures.
    (e) Kurtosis of the estimated exposures.
    (f) MEM exposure using the convolution method.
    (g) MEM exposure using the hybrid method.

                        Table 12
       PERCENTILES OF MEM AND PM EXPOSURES (ppm-day)

    Percentile    MEM-C    MEM-H     PM
         1         0.05     1.20    0.05
         5         0.45     1.20    0.10
        10         0.75     1.34    0.22
        25         1.28     1.70    0.58
        50         1.89     2.06    1.17
        75         2.75     2.47    2.10
        90         3.90     2.99    3.30
        95         5.08     3.63    4.49
        99        10.00     6.90    7.54

The personal monitoring approach provides 705 estimated exposures, one for each person in the Urban Scale Study monitored for one day. The summary statistics for the PM exposures are given as the third row of Table 11 and the third column of Table 12. The average PM exposure is about 1.6 ppm-days. The distribution is somewhat skewed, with skewness being about 3.

The log PM exposure is given as the third row of Table 13 and the third column of Table 14. The average log PM exposure is about 0 log(ppm-day). The distribution is slightly skewed toward the left, with the skewness coefficient being about -0.7. The PM exposures are roughly approximated by the lognormal distribution.
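To make the difference between the three estimators in Eqs. (9), (10), and (11) concrete, the sketch below evaluates them on a small hypothetical data set with two persons, two monitored METs, and two monitoring days. The numbers and function names are invented for illustration. Two assumptions are made: times are expressed as fractions of a day so that exposures come out in ppm-days, and the other-MET term of the hybrid formula is taken to use a single average PM concentration, which is one reading of Eq. (10).

```python
def convolution_exposures(times, mem_conc, other_time, other_conc):
    """Eq. (9): one exposure for every pairing of person i with monitoring day j."""
    return [
        [
            sum(t * c for t, c in zip(times[i], mem_conc[j]))
            + other_time[i] * other_conc[i]
            for j in range(len(mem_conc))
        ]
        for i in range(len(times))
    ]


def hybrid_exposures(times, mem_conc, other_time, other_conc):
    """Eq. (10): day-averaged MEM concentrations, one exposure per person."""
    n_days = len(mem_conc)
    n_mets = len(mem_conc[0])
    mean_conc = [sum(day[k] for day in mem_conc) / n_days for k in range(n_mets)]
    # Assumption: the other-MET term uses the PM concentration averaged over persons.
    mean_other = sum(other_conc) / len(other_conc)
    return [
        sum(t * c for t, c in zip(times[i], mean_conc)) + other_time[i] * mean_other
        for i in range(len(times))
    ]


def pm_exposures(times, pm_conc, other_time, other_conc):
    """Eq. (11): each person's own PM-based MET concentrations."""
    return [
        sum(t * c for t, c in zip(times[i], pm_conc[i]))
        + other_time[i] * other_conc[i]
        for i in range(len(times))
    ]


# Hypothetical inputs: METs are (private car, office); times are fractions of a day.
times = [[0.05, 0.35], [0.10, 0.30]]       # T_ik, persons x METs
other_time = [0.60, 0.60]                  # T_ir, time in the MET other
mem_conc = [[10.0, 2.0], [12.0, 3.0]]      # C_jk, days x METs (ppm), from MEM
pm_conc = [[5.0, 2.0], [6.0, 1.0]]         # c_ik, persons x METs (ppm), from PM
other_conc = [1.0, 2.0]                    # C_ir, PM concentration in the MET other

print(convolution_exposures(times, mem_conc, other_time, other_conc))
print(hybrid_exposures(times, mem_conc, other_time, other_conc))
print(pm_exposures(times, pm_conc, other_time, other_conc))
```

The convolution method returns one value per person-day pairing (here 2 x 2 = 4 values), whereas the hybrid and PM calculations return one value per person, which is why the hybrid distribution cannot reflect day-to-day variation in the MET concentrations.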
COMPARISON OF EXPOSURE DISTRIBUTIONS

For certain purposes such as risk assessment, it is only necessary to estimate the distribution of exposures without identifying each individual's exposure.

The comparison between the two sets of summary statistics for the estimated exposures given in Tables 11 and 12 indicates that the two distributions are substantially different. The average MEM exposure is about 40 percent higher than the average PM exposure. The difference is highly significant (t = 6.69 for the convolution method, t = 8.01 for the hybrid method). The percentiles for the MEM exposures based on the convolution method are consistently higher than the corresponding percentiles for the PM exposures through the entire range of their distributions. The percentiles in the lower range for the hybrid exposures are higher than the corresponding percentiles for the PM exposures, and the opposite is true for the upper range. This indicates that the distribution of hybrid exposures is more concentrated around its center than the distribution of the PM exposures. The two-sample Kolmogorov-Smirnov test (Smirnov, 1939; Massey, 1951) for the difference between the MEM and PM exposure distributions is highly significant (P < 0.0000001 for both methods).

The comparison between the summary statistics for the log estimated exposures also indicates major differences between the MEM and PM exposures. The average log MEM exposure is significantly higher than the average log PM exposure.

Table 13

SUMMARIES OF LOG MEM AND PM EXPOSURES

  Method   Mean(a)   SD(b)   IQR(c)   Skew(d)   Kurt(e)
  MEM-C     0.56     0.82    0.76    -1.38     4.85
  MEM-H     0.74     0.36    0.37     1.81     8.71
  PM       -0.02     1.10    1.29    -0.68     0.32

  (a) Average of the estimated log exposures in log(ppm-day).
  (b) Standard deviation of the estimated log exposures.
  (c) Interquartile range of the estimated log exposures.
  (d) Skewness of the estimated log exposures.
  (e) Kurtosis of the estimated log exposures.
Table 14

PERCENTILES OF LOG MEM AND PM EXPOSURES (log(ppm-day))

  Percentile   MEM-C         MEM-H         PM
       1       -2.98          0.18         [illegible]
       5       [illegible]    0.18         -2.33
      10       -0.29          0.29         -1.53
      25        0.25          0.53         -0.55
      50        0.64          [illegible]   0.15
      75        1.01          0.90          0.74
      90        1.36          1.09          1.19
      95        1.63          1.29          1.50
      99        2.30          1.93          2.02

The discrepancy between the MEM exposure distribution and the PM exposure distribution is consistent with the discrepancy shown in Tables 7 and 8, indicating that the MET concentrations based on MEM are substantially higher than the corresponding MET concentrations based on personal monitoring. The discrepancy can be attributed to the oversampling of high concentration microenvironments in the Microenvironment Study and the underestimation of concentrations in the personal monitoring in the Urban Scale Study.

COMPARISON OF INDIVIDUAL-SPECIFIC EXPOSURES

Mean Squared Error

For certain situations, such as the quantification of health effects of air pollution, the estimated exposures must be close to the actual exposures. The actual exposures are unknown in this study. Both the MEM exposures and the PM exposures are estimates and are subject to errors. As was discussed in Sec. I, the two estimated exposures are compared in terms of how close they are to each other. The mean squared error (MSE) given in Eq. (1) is used as the criterion.

To interpret the criterion MSE, the percentage gain relative to the ignorant estimate of zero was also derived:

$$\mathrm{Gain} = (1 - \mathrm{MSE}_1 / \mathrm{MSE}_0) \times 100\%,$$

where

  $\mathrm{MSE}_1$ = Average of (PM exposure - MEM exposure)$^2$,
  $\mathrm{MSE}_0$ = Average of (PM exposure - 0)$^2$.

The percentage gain measures how much better the MEM exposure is than the ignorant estimate that all exposures are zero. The criterion can also be interpreted as the percentage of the sum of squares in the PM exposures that is explained by the MEM exposures,[2] analogous to the R2 statistic in the usual regression analysis.
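The MSE and percentage gain defined above can be sketched as follows. The function names and the three paired exposures are hypothetical and serve only to show the arithmetic; they are not values from the study.

```python
def mean_squared_error(pm, mem):
    """Average squared difference between paired PM and MEM exposures."""
    if len(pm) != len(mem) or not pm:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - m) ** 2 for p, m in zip(pm, mem)) / len(pm)


def percentage_gain(pm, mem):
    """Gain of the MEM estimate over the 'ignorant' estimate of zero, in percent."""
    mse_mem = mean_squared_error(pm, mem)
    mse_zero = mean_squared_error(pm, [0.0] * len(pm))
    return (1.0 - mse_mem / mse_zero) * 100.0


# Hypothetical paired exposures in ppm-days (illustration only).
pm_exposures = [1.0, 2.0, 3.0]
mem_exposures = [1.5, 2.5, 4.0]

mse = mean_squared_error(pm_exposures, mem_exposures)
gain = percentage_gain(pm_exposures, mem_exposures)
print(f"MSE = {mse:.3f} (ppm-day)^2, Gain = {gain:.1f} percent")
```

A gain near 100 percent means the MEM exposures reproduce the PM exposures almost exactly; a gain near zero (or negative) means they do no better than assuming zero exposure for everyone.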
[2] The criterion GAIN could be defined alternatively as the percentage of sum of squares in the MEM exposures that is explained by the PM exposures, but since the burden of proof lies on the MEM exposures, GAIN is defined as given in the text.

The MSE and GAIN results are given in Table 15. The mean squared error indicates that there is a substantial difference between the MEM and the PM exposures. On the original scale, the square root of MSE is about 2 ppm-days, indicating that on the average the MEM exposure is about 2 ppm-days away from the PM exposure. This is large compared with the average PM exposure of about 1.6 ppm-days and the standard deviation of about 1.6 ppm-days, as given in Table 11. The percentage gains for both methods are around 16 percent, indicating that there is a lot of variation left unexplained by the MEM exposure.

On the log scale, the square root of MSE is about 0.9 log(ppm-day) for the convolution method, indicating that on the average the log MEM exposure is about 0.9 log(ppm-day) away from the log PM exposure. This is also large compared with the average log PM exposure of about 0.0 log(ppm-day), and the standard deviation of about 1.1 log(ppm-day), as given in Table 13. The square root of MSE for the hybrid method, about 1.3, is even larger. The percentage gain for the convolution method is about 35 percent, indicating that the log MEM exposures based on the convolution method explain a substantial fraction of variation in the log PM exposures. The percentage gain for the hybrid approach is only about 4 percent.

Table 15

MSE AND GAIN FOR MEM EXPOSURES

                  Convolution               Hybrid
  Scale      MSE       Gain (Percent)   MSE       Gain (Percent)
  Original   4.32(a)   17.1             4.45(a)   15.8
  Log        0.79(b)   34.1             1.6?(b)    4.2

  (a) MSE given in (ppm-day)^2.
  (b) MSE given in (log(ppm-day))^2.

The large mean squared errors are consistent with the discrepancy shown in Tables 7 and 8, indicating that the MET concentrations measured from MEM are substantially higher than the corresponding concentrations measured from personal monitoring.

MEM Exposure as a Predictor for PM Exposure

For certain situations such as qualifying the health effects of air pollution, it is only necessary that the estimated exposure be a good predictor of actual exposure. For such situations the appropriate way to assess the validity of the estimated exposure is to examine the regression relationship between the actual and estimated exposures. The slope coefficient in the regression relationship must be significant, indicating that the estimated exposure predicts the ranking of actual exposures, even though the magnitude might be off. Furthermore, the slope coefficient should be close to one, and the intercept coefficient close to zero, implying that the estimated exposures are properly calibrated relative to the actual exposures.

In this study the actual exposures are unknown; therefore the relationship between the estimated exposures and the unobserved actual exposures cannot be estimated. As discussed in Sec. I, the PM exposure is used as the benchmark, and the regression relationship between the two estimated exposures is tested by regressing the PM exposure on the MEM exposure.

The results for the regression of PM exposures on the MEM exposures are given in Table 16. On the original scale, the regression results show a very significant relationship between the PM and the MEM exposures. The convolution method gives a more significant slope coefficient than the hybrid method. This indicates that even though the MET concentrations from MEM and PM are substantially different, the MEM exposures are still useful for predicting the ranking of the PM exposures.
In other words, given that a certain individual's MEM exposure is high, it is reasonable to expect that his PM exposure is also high. For health effect studies in which the main focus is on qualifying the existence of health effects, this result indicates that the MEM exposure can be used as a proxy for the PM exposure.

Table 16

REGRESSIONS OF PM EXPOSURES ON MEM EXPOSURES
(t-statistics given in parentheses)

  Method   Scale      Intercept          Slope            R2 (Percent)
  Convol   Original    0.528 (7.70)      0.466 (21.64)    39.9
           Log        -0.601 (-19.35)    1.053 (33.44)    61.3
  Hybrid   Original    1.011 (9.84)      0.254 (6.94)      6.4
           Log        -0.667 (-7.39)     0.879 (8.02)      8.4

The R2 statistic for the convolution method is about 40 percent, indicating that the MEM exposure is not only a significant predictor for the PM exposure but is also an informative predictor, explaining an important fraction of the variability in the PM exposure. The hybrid method has a much smaller R2.

With the convolution method, the slope coefficient in this regression is about 0.5, substantially smaller than one, and the intercept coefficient is about 0.5 ppm-day, significantly larger than zero. This indicates that the MEM exposures are not well calibrated relative to the PM exposures. For simplicity the estimated regression model will be approximated as follows:

  PM exposure = 0.5 + 0.5 x MEM exposure.

At low levels (less than 1 ppm-day), the MEM exposure underestimates the PM exposure. For example, for an individual with MEM exposure equal to zero, the regression model predicts that his PM exposure is probably about 0.5 ppm-day. At higher levels (more than 1 ppm-day), the MEM exposure overestimates the PM exposure. For example, for an individual with MEM exposure equal to 10 ppm-day, the regression model predicts that his PM exposure is probably about 5.5 ppm-day, substantially lower than the MEM exposure.
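The two numerical examples above, and the 1 ppm-day crossover between underestimation and overestimation, follow directly from the approximate model. A minimal sketch, using the rounded coefficients 0.5 and 0.5 quoted in the text (the function names are illustrative):

```python
def predicted_pm(mem_exposure, intercept=0.5, slope=0.5):
    """PM exposure predicted by the approximate original-scale regression."""
    return intercept + slope * mem_exposure


def break_even(intercept=0.5, slope=0.5):
    """MEM level at which predicted PM equals MEM: x = a + b*x, so x = a / (1 - b)."""
    return intercept / (1.0 - slope)


low = predicted_pm(0.0)     # individual with MEM exposure of zero
high = predicted_pm(10.0)   # individual with MEM exposure of 10 ppm-day
crossover = break_even()    # below this level MEM underestimates, above it overestimates

print(low, high, crossover)
```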
Because the average MEM exposure is about 2 ppm-day, for most people the MEM exposure overestimates the PM exposure according to the regression model.

It is possible to recalibrate the MEM exposures using the regression model as follows:

  Recalibrated estimate = 0.5 + 0.5 x MEM exposure.

The feasibility of such a recalibration in practice remains to be studied. In future applications of the MEM approach, the PM exposures for such a recalibration might not be available. Moreover, because the PM exposures might be estimated on a small sample, it is not clear that the recalibration will improve the precision of the result.

On the log scale, the regression results also show a significant relationship between the MEM exposure and the PM exposure, indicating that the MEM exposures successfully predict the ranking of the PM exposures. The convolution method gives a more significant slope coefficient than the hybrid method.

The R2 statistic for the convolution method is about 60 percent, indicating that the log MEM exposure is fairly powerful, explaining an important fraction of the variability of the log PM exposure. The hybrid method gives a much smaller R2.

With the convolution method, the slope coefficient in the log-scale regression is very close to one; the difference from one is not statistically significant at the conventional 5 percent level (t = 1.68). This indicates that the span of the MEM exposures is well calibrated relative to the PM exposures. The intercept coefficient is about -0.6 log(ppm-day), significantly less than zero, indicating that the MEM exposure consistently overestimates the PM exposure. For simplicity of discussion, the regression model is approximated as follows:

  log PM exposure = -0.6 + log MEM exposure.

The approximate regression model indicates that at all levels, the log MEM exposures overestimate the log PM exposure by about 0.6 log(ppm-day).
This can be interpreted as follows: The log MEM exposures have the correct span relative to the log PM exposures but have a constant drift of -0.6 log(ppm-day). In terms of the original scale, this means that the PM exposure is proportional to the MEM exposure with the proportionality factor exp(-0.6) = 0.55:

  PM exposure = 0.55 x MEM exposure.

It is also possible to recalibrate the log MEM exposures using the regression model as follows:

  Recalibrated estimate = -0.6 + log MEM exposure.

As was noted earlier in the discussion on the recalibration on the original scale, the validity of such a recalibration remains to be established.

CONCLUSION

The discrepancy found in this study between the MEM exposures and the PM exposures, in terms of both individual-specific exposure and exposure distribution, is probably specific to the data used in the current study and should be generalized to future exposure studies only with caution.

Given the imperfect sampling of microenvironments and the problems in the personal monitoring discussed earlier, it is impressive that the MEM exposure still turns out to be a successful predictor for the PM exposure, especially on the log scale, on which the MEM exposure based on the convolution method has the correct span relative to the PM exposure, and the discrepancy is restricted to a constant drift.

The convolution method is preferable to the hybrid method for the data used here. There is much variability in the MET concentrations in these data, which is unfavorable to the hybrid approach.

V. EVALUATION OF MET CLASSIFICATION SCHEMES

The analysis in the earlier part of this report assumes the MET classification scheme, which classifies all microenvironments usually encountered into seven METs: parking, pedestrian, public transportation, private cars, offices, shops, and other. Conceivably other classification schemes could also be used. This section discusses the basis for choosing this particular MET classification scheme.
Duan (1981) recommended that the grouping of the microenvironments into METs should be carried out in two stages: a preliminary classification stage and an evaluation stage. In the first stage, the researcher should identify a profile of potentially useful METs. Those METs are the minimal chunks that can be identified from the information available, and will be called elementary METs. The second stage will evaluate the elementary METs and consider whether each of them should be analyzed as a distinct MET on its own or should be combined into coarser groupings referred to as composite METs.

From the data from the Microenvironment Study and the Urban Scale Study, eight elementary METs can be identified: parking, pedestrian, bus, rail, private cars, offices, shops, and other.

A criterion developed in Duan (1981) is applied to evaluate the above METs. The results are as follows:

1. The best decomposition of microenvironments covered in the Microenvironment Study is to separate commuting from business. The composite MET commuting consists of the elementary METs parking, buses, rails, private cars, and pedestrian. The composite MET business consists of the elementary METs shops and offices.

2. After this primary decomposition, the next best decomposition is to separate commuting into parking and in-transit.

3. After the first two decompositions, the next best decomposition is to separate business into shops and offices.

4. The next best decomposition is to separate the MET in-transit into vehicles and pedestrian. The composite MET vehicle consists of the elementary METs public transportation and private cars.

5. The next best decomposition is to separate the MET vehicle into public transportation and private vehicles.

6. The least effective decomposition is to separate the MET public transportation into buses and rails.

7.
The elementary MET other has to be kept separate from all other elementary METs because there is no concentration data from the Microenvironment Study for microenvironments in this MET.

Based on the above results, buses and rails were combined into one composite MET, public transportation, giving the seven METs used in the exposure analysis in earlier sections.

ELEMENTARY METs

Fairly detailed information was collected in both the Microenvironment Study and the Urban Scale Study to characterize the microenvironments and activity segments being measured. Based on the information common to the two studies, eight elementary METs may be identified: parking, pedestrian, buses, rails, private cars, offices, shops, and other. The definitions of parking, pedestrian, private cars, offices, shops, and other are given in Sec. II. Buses and rails are self-defined. Further details on the definitions of the elementary METs are given in Appendix A.

The identification of the eight elementary METs given above did not use all the information available from the two studies. For example, the Microenvironment Study provides additional information on the in-transit microenvironments such as average speed of vehicle. With only the MEM data, the information on speed may be used to refine the elementary MET private car into, for example, private car at a low speed (say, below 35 mph), private car at a medium speed (say, between 35 and 50 mph), and private car at a high speed (say, above 50 mph). However, the speed information is not available from the Urban Scale Study, so those refined METs cannot be identified for activity time data from the Urban Scale Study. In the Urban Scale Study, it is known only that the participant is in a private car during a certain activity segment, but the vehicle's speed is unknown.
Because any elementary MET to be considered has to be identifiable both for the Microenvironment Study and for the Urban Scale Study, the speed information cannot be used to refine the elementary MET private car.

DUAN'S CRITERION

Duan (1981) developed a criterion to evaluate MET classification schemes that has become known as Duan's criterion. This section will give a new interpretation of this criterion in terms of effective sample size.

The criterion is based on the estimation of average exposure when implementing the MET approach using the enhanced personal monitoring (EPM) approach. In the context of EPM, assuming a sufficient amount of additional diary information to practically eliminate the part of the variability in the estimated average exposure due to the variability in the MET times, Duan (1981) showed that decomposition of a coarser MET into finer METs always decreases the variance of the estimated average exposure. Duan's criterion (DC) is defined in terms of the amount of decrease in this variance as follows:

$$\mathrm{DC} = n \times (\mathrm{Var}_c - \mathrm{Var}_f),$$

where n is the number of person-days in the monitoring sample, $\mathrm{Var}_c$ is the variance of the estimated average exposure based on the coarser MET classification, and $\mathrm{Var}_f$ is the variance of the estimated average exposure based on the finer MET classification after the decomposition being considered. The criterion can be interpreted as the decrease in variance per observation.

The mathematical form of the criterion is rather complicated and is given in Appendix E. Following are several of its important features:

• The more different the concentrations in the two refined METs are, the more gain there is from the refinement.

• The criterion may be factored into two terms, the first term determined entirely by the distribution of the MET times, the second term determined entirely by the distribution of the MET concentrations. Therefore the two factors in the criterion may be evaluated separately.
For example, one may assess the first term from a population-based survey or activity diary and the second term from MEM data.

• The more time spent in the two finer METs combined, the more gain there is from the refinement.

• The more variable the time allocation is between the two finer METs, the more gain there is from the refinement.

The variability in exposure comes from two sources: the variability in MET concentrations and the variability in MET times. The EPM approach incorporates additional activity data to eliminate or reduce the variability in the average exposure due to the variability in MET times. Therefore the value in a refined MET classification depends on the variability in MET times. If there is no variability in MET times, then there can be no gain from the decomposition.

PIESS

Duan's criterion may be interpreted in terms of the percentage increase in effective sample size (PIESS).

There are two ways to increase effective sample size. The first is to increase the sample size. The second is to use a more efficient analytic method (such as the more refined MET classification) that gives a more precise result by decreasing the variance of the estimated quantity. The PIESS with a more efficient analytic method (such as the finer MET classification) is the percentage increase in sample size that would be required to achieve the same precision with the less efficient analytic method (such as the coarser MET classification).

It can be shown that the PIESS in the kth decomposition (imposed after the first k - 1 decompositions have been imposed) is given by

$$\mathrm{PIESS}_k = \frac{\mathrm{DC}_k}{\mathrm{Var}(E) - \mathrm{DC}_1 - \mathrm{DC}_2 - \cdots - \mathrm{DC}_k} \times 100 \text{ percent},$$

where Var(E) is the variance of exposure, $\mathrm{DC}_1$ is the criterion for the first decomposition, and $\mathrm{DC}_k$ is the criterion for the kth decomposition.

The following is a hypothetical example: n = 100, Var(E) = 10, $\mathrm{DC}_1 = 5$, $\mathrm{DC}_2 = 2$.
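The hypothetical values just given (n = 100, Var(E) = 10, first criterion 5, second criterion 2) can be worked through numerically. The following is a minimal sketch; the function names are illustrative and are not taken from the report.

```python
def variance_of_mean(var_e, criteria, n):
    """Variance of the estimated average exposure after imposing the listed
    decompositions: (Var(E) - DC_1 - ... - DC_k) / n."""
    return (var_e - sum(criteria)) / n


def piess(var_e, criteria):
    """PIESS (in percent) for each successive decomposition:
    DC_k / (Var(E) - DC_1 - ... - DC_k) x 100."""
    results = []
    remaining = var_e
    for dc in criteria:
        remaining -= dc
        if remaining <= 0:
            raise ValueError("criteria must sum to less than Var(E)")
        results.append(dc / remaining * 100.0)
    return results


n, var_e, criteria = 100, 10.0, [5.0, 2.0]

no_met = variance_of_mean(var_e, [], n)               # no MET approach
first_only = variance_of_mean(var_e, criteria[:1], n)  # first decomposition only
both = variance_of_mean(var_e, criteria, n)            # both decompositions
gains = piess(var_e, criteria)

print(no_met, first_only, both)
print([round(g) for g in gains])
```

The variances step down as each decomposition is imposed, and each PIESS value is the extra sample size (in percent) that the coarser classification would need to match the finer one.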
If one does not use the MET approach and simply monitors the exposure for n = 100 subjects with PEM, the variance for the estimated average exposure is

  Var(E)/100 = 0.1.

If one uses the MET approach and imposes the first decomposition, say treating indoor and outdoor as two distinct METs, the variance of the estimated average exposure is reduced to

  $(\mathrm{Var}(E) - \mathrm{DC}_1)/100 = 0.05$.

The PIESS for this decomposition is

  $\mathrm{DC}_1/(\mathrm{Var}(E) - \mathrm{DC}_1) = 100$ percent.

For the coarser method (not using the MET approach) to achieve the same precision, it is necessary to increase the sample size by 100 percent. When that is done and 200 subjects are monitored with the coarser method, the variance of the estimated average exposure is

  Var(E)/200 = 0.05,

the same as the variance based on the MET approach with the first decomposition.

With further imposition of the second decomposition, say decomposing indoor into indoor with gas stove in use and indoor without gas stove in use, the variance of the estimated average exposure is further reduced to

  $(\mathrm{Var}(E) - \mathrm{DC}_1 - \mathrm{DC}_2)/100 = 0.03$.

The PIESS for this decomposition is

  $\mathrm{DC}_2/(\mathrm{Var}(E) - \mathrm{DC}_1 - \mathrm{DC}_2) = 67$ percent.

For the coarser method (using only the first decomposition) to achieve the same precision, it is necessary to increase the sample size by 67 percent. In other words, 100 observations with both decompositions is equivalent to 167 observations with only the first decomposition. The latter is equivalent to 334 observations without the MET approach.

Relative Criterion Versus Absolute Criterion

Duan's criterion (as well as the PIESS interpretation of this criterion) compares the relative benefits from various classifications to determine which one is preferable. It can be applied to rank the relative benefits from various decompositions. The process can be thought of as cutting a pie. Duan's criterion can be used to decide which is the best first cut, which is the best second cut, etc.
However, where to stop cannot be determined from this criterion.

If other sources permit determination of the maximum number of METs that can be used in a study--for example, the data logger might only allow six METs to be distinguished--the criterion may be applied to determine the best cuts or decompositions that will lead to the six most efficient METs. Other situations might not permit determination of how many METs to use from other sources. In such situations it will be desirable to develop an absolute criterion to determine where to stop.

EMPIRICAL RESULTS

Duan's criterion was applied to the concentration data from the Microenvironment Study and the activity time data from the Urban Scale Study to rank the relative benefits from various MET decomposition schemes. The details on the estimation of the criterion are given in Appendix F.

Table 17 summarizes the final ranking of the MET decompositions and gives Duan's criterion and PIESS for each MET decomposition. The largest improvement in Duan's criterion and PIESS is achieved by the decomposition that distinguishes commute and business. (Because of the limitations in the Microenvironment Study, it is possible to address only the business part of noncommute, namely shops and offices.) This decomposition results in nearly a one-third improvement in effective sample size.

Table 17

RANKING OF MET DECOMPOSITIONS

  MET 0(a)     MET 1(b)   MET 2(b)      DC(c)    PIESS(d) (Percent)
  All          Commute    Business      364.6    31.1
  Commute      Parking    In-transit     89.1     8.2
  Business     Shops      Offices        15.8     1.5
  In-transit   Vehicles   Pedestrian     12.6     1.2
  Vehicles     Public     Private car     6.2     0.6
  Public       Bus        Rail            0.05    0.005

  (a) MET 0: The prototype MET being decomposed.
  (b) MET 1, MET 2: The two new METs.
  (c) Duan's criterion.
  (d) Percentage increase in effective sample size.

Duan's criterion for this decomposition is about six times the value of the criterion for the next best decomposition of commute into parking and in-transit.
This means that it takes about six decompositions similar to the latter to match the improvement in the former decomposition. In other words, the former decomposition is equivalent to six decompositions like the latter.

The decomposition of business into shops and offices compares favorably with several finer decompositions of commute, even though the concentrations in the METs offices and shops are both low compared with commuting METs. This is mainly because much more time is spent in the MET business than in the commuting METs. Therefore in considering MET decompositions it is important to recognize METs in which people spend a good deal of their time, even though the concentrations in these METs might be low. Hartwell et al. (1984a) noted that even though the indoor CO level is generally lower than that encountered during commuting, those METs contribute more than half of the total exposure.

Most of the decompositions result in a nontrivial improvement in PIESS, except for the decomposition of public transportation into bus and rail. It takes more than a hundred decompositions similar to this decomposition to match the improvement in other decompositions, mainly because the average amount of time spent in these METs is very small. Therefore this decomposition will be ignored in the rest of this analysis.[1]

[1] Duan's criterion does not permit determination of where to stop. The decision may be viewed as a decision that seven METs is the maximum number of METs that can be used or that a PIESS substantially less than 1 percent does not warrant the creation of an extra MET.

VI. FUTURE APPLICATIONS OF THE MET APPROACH

ENHANCED PERSONAL MONITORING AND MICROENVIRONMENT MONITORING

There are two ways to implement the MET approach, enhanced personal monitoring (EPM) and microenvironment monitoring (MEM).
If the former approach is taken, the sampled participants will sample the microenvironments for the study using their own activity patterns; therefore there is no need to be concerned about the sampling of microenvironments. In the latter approach, the research staff have to determine the microenvironments to be measured; therefore, an appropriate sampling scheme must be devised for the microenvironments.

The main disadvantage of EPM relative to microenvironment monitoring is the need to use human subjects in the monitoring and the need to use PEMs capable of continuous measurements. When those factors do not apply--e.g., if a future exposure assessment study using the MET approach will be conducted in conjunction with a personal monitoring study with PEMs capable of continuous measurements--enhanced personal monitoring will be preferable. The apparent advantage is the avoidance of the difficult task of sampling microenvironments. So long as the human subjects are chosen to be representative of the target population, they will automatically generate a representative sample of microenvironments during their monitoring.

There are certain situations in which microenvironment monitoring will be the better choice. An obvious case is when there are no PEMs capable of continuous measurements. In that situation it might still be possible to conduct microenvironment monitoring using portable (but not truly personal) monitors. Another case is when a comprehensive activity database already exists. In that case it is possible to avoid human subjects entirely by using microenvironment monitoring.

A third situation for which microenvironment monitoring might be chosen is when the primary goal is not to assess exposure but to generate data to feed into exposure models such as SHAPE or NEM. In those situations it is still important to design an appropriate sampling scheme for microenvironments.
Otherwise the distribution of microenvironment concentration generated as input to the models might not be appropriate.

SAMPLING OF MICROENVIRONMENTS

For the rest of this section it is assumed that microenvironment monitoring has been chosen as the more appropriate way to implement the MET approach for a future exposure study. For each MET, one needs to determine the microenvironments belonging to that MET to be sampled. It was noted in Sec. I that the appropriate target population is not the population of microenvironments such as all office spaces, but the intersection of people and the microenvironments.

There are two possible schemes to generate a representative sample of this target population. The first is the weighted sampling scheme. The second is to simulate real human activities, using an existing activity database.

The two approaches are not exclusive. They may be applied simultaneously in the same study. For home and office METs, the weighted sampling scheme might be preferable. For commuting METs, simulated activity might be preferable. If feasible it would also be desirable to collect some personal monitoring data at the same time for comparison.

Weighted Sampling Scheme

With this approach, the microenvironments are sampled as if they were the target population; then the number of people present in the sampled microenvironments is counted or estimated and used as a weight to adjust for the difference between the two target populations. This approach was first discussed in Duan (1982). It is valid only if each person in the period of interest will visit only one microenvironment belonging to the MET of interest.

The assumption of visiting only one microenvironment of the same type will probably apply to certain METs but not to others. For example, it probably applies to the METs home and office for most people.
However, it might not apply to certain METs such as commuting METs--a substantial fraction of people probably visit more than one such microenvironment during the period of interest. How multiple visits to microenvironments of the same type will affect the validity of this approach remains to be studied empirically.

The best way to sample microenvironments in this approach is to use area probability sampling, with the modification that the count of people present at the microenvironment be incorporated as part of the sampling weights.

Simulate Real Human Activities

An alternative to the weighted sampling scheme is to abandon the direct sampling of microenvironments and sample people instead. This approach requires that there already is an activity database providing sufficient detail on the location and characteristics of the microenvironments reported in the diaries. This is not necessarily a restrictive requirement, because one type of situation in which microenvironment monitoring is preferable to enhanced personal monitoring is when there is an existing activity database.

With this approach, diaries from the activity database are sampled and an attempt is made to simulate the reported activity paths. For example, assume that we are interested in the MET commuting, to which the weighted sampling scheme might not apply because of the problem of visiting multiple microenvironments in the same MET. Diaries will be sampled from the activity database, and the commuting routes described in the diaries laid out along with descriptions of the conditions of the routes--e.g., type of vehicle, ventilation status, speed, and presence of smokers. (It might be easier to collect part of this information on a general level from a recall survey rather than from diaries.) The research staff can then set out with the monitoring instruments to measure the MET exposure this person would receive given the described commuting path and conditions.
It may be impossible to match all aspects of a microenvironment based on information collected from the diaries and recall surveys. Nevertheless it should be possible to match the main relevant features.

STUDY DESIGNS AND DATA COLLECTION

Collection of Additional Diary Information

If primary data collection for activity patterns will be used in a future study, it is preferable to collect the diary information over a longer period of time to enhance the information on activities. The marginal cost for collecting additional diaries from a person already in the sample is likely to be low.

If enhanced personal monitoring will be used, it will be preferable to collect the diary information for a few days before passing out the PEM to ensure that the participant is already familiar with the diary instrument so that the quality of the diary information during the monitoring period will be good.

Closed Format for the Diary

Hartwell et al. (1984a) recommended that the open-ended format in the diaries in the Urban Scale Study be replaced by a closed format in future studies to improve the quality of the activity data. In the closed format the participant is given a list of METs with instructions on how to determine which MET a certain activity belongs to, and is instructed to report the specific MET he is in during each activity segment. This change will improve the quality of the activity data to be collected.

Hartwell et al. (1984a) also recommended that a simplified version of the diary be built into the data logger so that when the participant keys in the MET at each change in activity, the activity will be recorded automatically along with the concentration. This will probably improve the quality of the activity data. The hardcopy diary and electronic diary can be compared during analysis for validation, avoiding a substantial fraction of the mismatches between the diary and the monitoring data segments.
Match Between Activity and Monitoring METs

If microenvironment monitoring will be used in future studies along with primary diary data collection, it will be desirable to match the METs in the design phase so that the METs monitored can be matched with METs defined from the diaries. In particular, if the closed format for the diary is adopted, the METs in the diaries should match with the METs in the microenvironment monitoring.

Overlap Between METs in Microenvironment Monitoring

If microenvironment monitoring will be used in future studies, it will be desirable to design the monitoring schedule so that there will be sufficient overlap in time between each pair of METs to allow for the examination of the relationship among the MET concentrations.

FURTHER USE OF THE ACTIVITY DATABASE

The activity database from the Urban Scale Study is not perfect. However, it is the first of its kind, providing a population-based sample of real human activities containing information relevant to combustion-related air pollution.

As was noted in Sec. I, one advantage of the MET approach is that an existing activity database can be used for other purposes. For example, the current activity database can be used in a future exposure study on nitrogen dioxide. The nonambient sources of nitrogen dioxide are basically the same as the nonambient sources for CO, namely, indoor combustion (smoking, gas stove) and commuting. The activities identified in the current activity database are probably sufficient for characterizing METs relevant for nitrogen dioxide. Therefore a future study on human exposure to nitrogen dioxide can be carried out with little need for new activity data. One possibility would be to use the monitors available to conduct a microenvironment study for nitrogen dioxide and apply the convolution method to estimate nitrogen dioxide exposure.
Such a nitrogen dioxide study using the current activity database can serve as a useful pilot study for the future application of the continuous PEM for nitrogen dioxide under development. The information generated can help evaluate the MET classification scheme developed in this study for nitrogen dioxide, and will also be useful in identifying the subpopulation at high risk.

The current activity database can also be used to provide updated input to existing exposure models such as SHAPE and NEM. Before the Urban Scale Study, the exposure models had to rely on synthetic activity patterns based on the best input information then available. With the activity database from the Urban Scale Study, it is possible to provide updated activity information for the exposure models, using either the realized activity paths or statistical summaries from the current activity database to recalibrate the statistical distribution used as input to the exposure models.

Appendix A

DEFINITIONS OF METS

The heuristic definitions of METs used in this study are given in Sec. II. This appendix gives the detailed definition rules for the elementary METs, including parking, pedestrian, buses, rails, private cars, offices, shops, and other. The composite MET public transportation considered in the exposure estimation is the union of the elementary METs buses and rails.

DEFINITIONS OF ELEMENTARY METS FROM THE MICROENVIRONMENT STUDY

Parking Garage

From the Microenvironment Study, a microenvironment is defined to be a parking garage when the microenvironment is measured as part of the commuter part of the Microenvironment Study, and the link number is given in the Commuter Study Links Data Base either as 31, indicating the microenvironment is inside a parking garage at the beginning of the trip, or as 38, indicating the microenvironment is inside a parking garage at the end of a trip.
Bus

From the Microenvironment Study, a microenvironment is defined to be a bus when the microenvironment is measured as part of the commuter part of the Microenvironment Study, the link number is given as 1-20, indicating this is a roadway link, and the vehicle code is given as 6, indicating the vehicle is a bus.

Rail

A microenvironment is defined to be rail with the same criterion as was used to define bus, except that the vehicle code is given as 7 instead of 6, indicating the vehicle is rail.

Pedestrian

There are two groups of pedestrian microenvironments measured in the Microenvironment Study. In the commuter part of the study, some pedestrian data were collected. Those microenvironments are identified by the same criterion that was used for bus and rail, except that the vehicle code is 8, indicating the "vehicle" is pedestrian. Some additional pedestrian data were collected along with the shop microenvironments during the noncommute phase of the Microenvironment Study. Those data are given as the Walking Survey Database and have been combined with the other pedestrian data in the analysis.

Private Car

The majority of the microenvironments measured in the commuter part of the Microenvironment Study are private cars. Those microenvironments are identified with the same criterion that was used for the other roadway METs, except that the vehicle codes are given as V1 through V5, indicating the microenvironment is one of the five survey vehicles.

Shops

The shop microenvironments are measured in the noncommute part of the Microenvironment Study. The data are given as the Shopping Center Survey Database.

Offices

The office microenvironments are measured in the noncommute phase of the Microenvironment Study. The data are given as the Office Building Database. The analysis is restricted to the measurements taken between 7 a.m. and 6 p.m., when the majority of the population spend time in this MET.
Other

There are no MEM data from the Microenvironment Study for the microenvironments belonging to the MET other.

DEFINITIONS OF METS FROM THE URBAN SCALE STUDY

Parking

An activity segment is defined to be parking if the location of activity is given in Hartwell et al. (1984a and b) as 0661, indicating the location is indoors-garage.

Bus

An activity segment is defined to be bus if in Hartwell et al. (1984a and b) the activity code is given as 1, indicating the activity is "transit, travel;" the location of activity is given as 0100, indicating the location is "in transit;" and the mode of travel is given as 0300, indicating the trip was made in a bus.

Rail

An activity segment is defined to be rail using the same criteria as for bus except that the mode of travel should be 0500, indicating the trip was made in "train/subway."

Car

An activity segment is defined to be car using the same criteria as for bus except that the mode of travel should be 0200, 0400, 0663, or 0664. Mode of travel 0200 indicates the trip was made in a car. Mode of travel 0400 indicates the trip was made in a truck. Mode of travel 0663 indicates the trip was made on a motorcycle. Mode of travel 0664 indicates the trip was made in a van.

Pedestrian (Noncar)

An activity segment is defined to be noncar, corresponding to the elementary MET pedestrian in Sec. III, using the same criteria as for bus, except that the mode of travel should be 0100, 0661, or 0662. The mode 0100 indicates walking, the mode 0661 indicates jogging, and the mode 0662 indicates biking.

Shops

An activity segment is defined to belong to the MET shops if the location code is 0400 or 0664. Location 0400 indicates the activity segment was in stores. Location 0664 indicates the activity segment was in shopping malls/theater in malls.

Offices

An activity segment is defined to belong to the MET offices in Hartwell et al. (1984a and b) if the location code is given as 0300, which indicates the location is an office.
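The Urban Scale Study rules above amount to a lookup on the activity, location, and mode-of-travel codes. The following is a minimal sketch of that lookup, assuming the codes are held as strings so that leading zeros are preserved. The function name, argument names, and dictionary names are hypothetical and do not come from Hartwell et al.; segments matching none of the listed rules fall through to the MET other, which this appendix lists among the elementary METs.

```python
from typing import Optional

# Mode-of-travel codes quoted in the rules above (strings keep leading zeros).
MODE_TO_MET = {
    "0300": "bus",
    "0500": "rail",
    "0200": "car", "0400": "car", "0663": "car", "0664": "car",
    "0100": "pedestrian", "0661": "pedestrian", "0662": "pedestrian",
}

# Location codes quoted in the rules above.
LOCATION_TO_MET = {
    "0661": "parking",
    "0400": "shops", "0664": "shops",
    "0300": "offices",
}


def classify_segment(activity: str, location: str, mode: Optional[str] = None) -> str:
    """Assign one activity segment to an elementary MET (hypothetical helper)."""
    # Travel segments: activity code 1 with location 0100 ("in transit");
    # the mode of travel then selects the MET.
    if activity == "1" and location == "0100":
        return MODE_TO_MET.get(mode, "other")
    # All remaining rules depend on the location code only.
    return LOCATION_TO_MET.get(location, "other")
```

For example, `classify_segment("1", "0100", "0300")` gives the MET bus, while a segment with location code 0661 is assigned to parking whatever its activity code.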
Other

All other activity segments are assigned to the MET other.

Appendix B

QUALITY CRITERION FOR THE ACTIVITY TIME DATA

This appendix presents the detailed definitions of the two quality criteria for activity time data--namely, the skip-logic criterion and the consistency criterion.

SKIP-LOGIC CRITERION

The activity diary requires the participant to respond differently to two questions depending on whether the current activity segment is indoor or outdoor (the latter includes in-transit activity). If the current activity is indoor, the participant should give a valid response to the following questions:

• Is there a garage attached to the building?
• Is there a gas stove in use?

For both questions, "don't know" is a valid response. If the participant did not give a valid response to those questions (indicated by a missing value code in the database), he is judged to have made an error in the skip logic.

If the current activity is outdoor (including in-transit), the participant should skip those two questions. If the participant fails to do so and responds "yes," "no," or "don't know" to those questions, he is determined to have made an error in the skip logic. (The correct response to these questions for those activity segments is no response, indicated by a missing value code in the database.)

Table B.1 gives the frequency distribution of the error rate in following the skip logic for the whole sample of 705 participants. As was noted earlier, 361 participants made no errors in the skip logic. Those participants constitute the "good" sample.

Table B.1

FREQUENCY DISTRIBUTION OF ERROR RATES^a IN FOLLOWING THE SKIP LOGIC IN THE ACTIVITY DIARIES

                        Density^c
                        (Percent per
Range^b        Count    unit range)
0               361     (51.2)^d
0 - 0.05        123     348.9
0.05 - 0.1      109     309.2
0.1 - 0.2        62      87.9
0.2 - 0.3        25      35.5
0.3 - 0.4         8      11.3
0.4 - 0.5         6       8.5
0.5 - 0.6         4       5.7
0.6 - 0.7         4       5.7
0.7 - 0.8         3       4.3
Maximum        0.7857

^a The fraction of activity segments on which a skip logic error is found.
^b The ranges include the upper and exclude the lower endpoint.
^c The density is given as percent per unit range; that is, the percentage in the range divided by the width of the range. See the discussion of density in Table 4, note a.
^d The density over this range is infinite; the value given in parentheses is the actual percentage of respondents who pass this criterion.

The chance of passing the skip-logic criterion is inversely related to the number of reported activity segments: the more activity segments a participant reports, the more chances he has of making an error. As can be seen from Table B.1, some participants rejected from the good sample made skip-logic errors on a very small fraction of activity segments. It is plausible that the activity data based on their diaries are still fairly reliable. However, for the sake of simplicity, the "good sample" is restricted to the participants who made no errors at all to avoid ambiguity.

CONSISTENCY CRITERION

During the preparation of the activity database before this study, several inconsistencies in the reported activity data were found, including those between reported activities and locations and those between activity time and monitoring time. The details of those inconsistencies were given in Hartwell et al. (1984a and b), Section 5.4.2.2. Such inconsistencies are interpreted as the participant's failure to fill out the diary correctly, indicating that the activity data from this participant might be less reliable. Therefore the consistency checks are another criterion to test for the participant's reliability.

Appendix C

FREQUENCY DISTRIBUTIONS OF ACTIVITY TIME

Tables C.1 through C.7 give the frequency distributions for the activity times for each of the elementary METs. Generally speaking, the activity distributions are similar among the three samples.

The prevalence of bus and rail users is low.
(See Tables C.1 and C.2.) For both bus and rail, a few participants report spending several hours in those METs. Those participants are more frequent in the whole sample and the good sample than in the best sample. For example, one participant in the whole sample reported spending more than four hours in the MET rail. The longest rail times for the good and the best sample are both less than two hours. The validity of those reported long bus and rail activities is questionable. In this case the skip-logic and consistency criteria were successful in eliminating those questionable participants from the best sample and the good sample.

Most people spend some time in the MET private car (see Table C.3). The distribution of time spent in this MET is again fairly skewed, with a few participants spending a great deal of time in this MET. The maximum amount of time spent in this MET is 9-1/2 hrs for the best sample, 12 hrs for the good sample, and 15 hrs for the whole sample. Again the quality criteria appear to have succeeded in eliminating participants spending questionably long times in this MET.

About one-fourth to one-third of the participants spend some time in the MET pedestrian (see Table C.4). One participant in the whole sample reported spending close to 11 hrs in this MET. The quality criteria again were successful in eliminating this questionable participant from the good and the best sample. The maximum time in this MET is about four hours for the good sample and about two hours for the best sample.

Only a few participants reported spending some time in the MET parking (see Table C.5). A few participants reported spending over ten hours in this MET.

About half of the participants report spending time in the MET office (see Table C.6). One participant in the whole sample reported spending over 18 hours in this MET.
For all three samples the distribution of time spent in this MET appears bimodal, with one cluster of participants spending a small amount of time (a few hours or less than one hour) and another larger cluster of participants who spend about eight hours. Comparison of the distributions among the three samples indicates that the distribution in the best sample is clustered around eight hours more than in the other two samples. There are also a few reported workaholics spending over 12 hours in this MET; again more of these unusual participants are in the whole sample than in the good sample and the best sample.

About one-third to one-fourth of the participants spend some time in the MET shops (see Table C.7). Most of those are fairly short, but a few participants report up to eight or nine hours in this MET.

The activity times have been standardized as was discussed in Sec. II.

Table C.1

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET BUS

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           661   (93.8)^c      335   (92.8)^c      121   (95.3)^c
0 - 0.25          2     1.1           1     1.1           0     0.0
0.25 - 0.5       11     6.2           5     5.5           0     0.0
0.5 - 0.75       14     7.9          10    11.1           2     6.3
0.75 - 1          2     1.1           1     1.1           1     3.1
1 - 1.5           8     2.3           5     2.8           2     3.1
1.5 - 2           4     1.1           2     1.1           0     0.0
2 - 3             1     0.1           1     0.3           1     0.8
3 - 4             1     0.1           1     0.3           0     0.0
4 - 5             0     0.0           0     0.0           0     0.0
5 - 6             1     0.1           0     0.0           0     0.0
Maximum        5.139               3.262               2.163

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.2

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET RAIL

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           670   (95.0)^c      341   (94.5)^c      124   (97.6)^c
0 - 0.25          3     1.7           2     2.2           1     3.1
0.25 - 0.5       12     6.8           6     6.6           1     3.1
0.5 - 0.75        6     3.4           3     3.3           0     0.0
0.75 - 1          4     2.3           2     2.2           0     0.0
1 - 1.5           8     2.3           7     3.9           1     1.6
1.5 - 2           1     0.3           0     0.0           0     0.0
2 - 3             0     0.0           0     0.0           0     0.0
3 - 4             0     0.0           0     0.0           0     0.0
4 - 5             1     0.1           0     0.0           0     0.0
Maximum        4.287               1.478               1.387

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.3

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET PRIVATE CAR

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           100   (14.2)^c       56   (15.5)^c       22   (17.3)^c
0 - 0.25         16     9.1           8     8.9           2     6.3
0.25 - 0.5       39    22.1          22    24.4           7    22.0
0.5 - 0.75       55    31.2          31    34.3           8    25.2
0.75 - 1         68    38.6          32    35.5          11    34.6
1 - 1.5         139    39.4          65    36.0          18    28.3
1.5 - 2          93    26.4          49    27.1          21    33.1
2 - 3           103    14.6          53    14.7          24    18.9
3 - 4            47     6.7          23     6.4           7     5.5
4 - 5            16     2.3          10     2.8           4     3.1
5 - 6            10     1.4           4     1.1           0     0.0
6 - 7             6     0.9           1     0.3           0     0.0
7 - 8             3     0.4           1     0.3           0     0.0
8 - 9             4     0.6           3     0.8           2     1.6
9 - 10            2     0.3           1     0.3           1     0.8
10 - 11           1     0.1           0     0.0           0     0.0
11 - 12           2     0.3           2     0.6           0     0.0
12 - 13           0     0.0           0     0.0           0     0.0
13 - 14           0     0.0           0     0.0           0     0.0
14 - 15           0     0.0           0     0.0           0     0.0
15 - 16           1     0.1           0     0.0           0     0.0
Maximum       15.252              11.991               9.523

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.4

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET PEDESTRIAN

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           480   (68.1)^c      255   (70.6)^c       95   (74.8)^c
0 - 0.25         47    26.7          23    25.5           8    25.2
0.25 - 0.5       57    32.3          26    28.8           6    18.9
0.5 - 0.75       34    19.3          15    16.6           3     9.4
0.75 - 1         29    16.5          15    16.6           4    12.6
1 - 1.5          27     7.7          15     8.3           8    12.6
1.5 - 2          13     3.7           6     3.3           2     3.1
2 - 3            10     1.4           3     0.8           1     0.8
3 - 4             5     0.7           3     0.8           0     0.0
4 - 5             2     0.3           0     0.0           0     0.0
5 - 6             0     0.0           0     0.0           0     0.0
6 - 7             0     0.0           0     0.0           0     0.0
7 - 8             0     0.0           0     0.0           0     0.0
8 - 9             0     0.0           0     0.0           0     0.0
9 - 10            0     0.0           0     0.0           0     0.0
10 - 11           1     0.1           0     0.0           0     0.0
Maximum       10.935               3.867               2.362

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.5

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET PARKING

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           646   (91.6)^c      338   (93.6)^c      122   (96.1)^c
0 - 0.25         34    19.3          15    16.6           3     9.4
0.25 - 0.5       10     5.7           4     4.4           0     0.0
0.5 - 0.75        3     1.7           2     2.2           1     3.1
0.75 - 1          3     1.7           0     0.0           0     0.0
1 - 1.5           2     0.6           0     0.0           0     0.0
1.5 - 2           2     0.6           0     0.0           0     0.0
2 - 3             0     0.0           0     0.0           0     0.0
3 - 4             1     0.1           0     0.0           0     0.0
4 - 5             1     0.1           0     0.0           0     0.0
5 - 6             0     0.0           0     0.0           0     0.0
6 - 7             0     0.0           0     0.0           0     0.0
7 - 8             0     0.0           0     0.0           0     0.0
8 - 9             0     0.0           0     0.0           0     0.0
9 - 10            1     0.1           0     0.0           0     0.0
10 - 11           1     0.1           1     0.3           1     0.8
11 - 12           0     0.0           0     0.0           0     0.0
12 - 13           1     0.1           1     0.3           0     0.0
Maximum       12.689              12.689              10.417

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.6

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET OFFICE

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           356   (50.5)^c      188   (52.1)^c       64   (50.4)^c
0 - 1            50     7.1          21     5.8           6     4.7
1 - 2            19     2.7           5     1.4           2     1.6
2 - 3            20     2.8           7     1.9           2     1.6
3 - 4             9     1.3           2     0.6           0     0.0
4 - 5            15     2.1           6     1.7           3     2.4
5 - 6            13     1.8           9     2.5           0     0.0
6 - 7            34     4.8          20     5.5           6     4.7
7 - 8            57     8.1          27     7.5          10     7.9
8 - 9            63     8.9          36    10.0          16    12.6
9 - 10           45     6.4          28     7.8          11     8.7
10 - 11          15     2.1           8     2.2           5     3.9
11 - 12           5     0.7           2     0.6           2     1.6
12 - 13           2     0.3           1     0.3           0     0.0
13 - 14           1     0.1           1     0.3           0     0.0
14 - 19           1     0.0           0     0.0           0     0.0
Maximum       18.339              13.145              11.556

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Table C.7

FREQUENCY DISTRIBUTION FOR ACTIVITY TIME FOR THE MET SHOPS

               Whole Sample        Good Sample         Best Sample
                      Density^b           Density             Density
                      (Percent            (Percent            (Percent
Range^a        Count  per hr)      Count  per hr)      Count  per hr)
0 hrs           471   (66.8)^c      260   (72.0)^c       94   (74.0)^c
0 - 0.25         48    27.2          25    27.7           8    25.2
0.25 - 0.5       45    25.5          19    21.1           4    12.6
0.5 - 0.75       29    16.5          15    16.6           5    15.7
0.75 - 1         40    22.7          15    16.6           8    25.2
1 - 1.5          30     8.5          11     6.1           4     6.3
1.5 - 2          11     3.1           2     1.1           0     0.0
2 - 3            13     1.8           4     1.1           1     0.8
3 - 4             4     0.6           2     0.6           0     0.0
4 - 5             4     0.6           3     0.8           1     0.8
5 - 6             3     0.4           1     0.3           1     0.8
6 - 7             1     0.1           0     0.0           0     0.0
7 - 8             1     0.1           0     0.0           0     0.0
8 - 9             4     0.6           4     1.1           1     0.8
9 - 10            1     0.1           0     0.0           0     0.0
Maximum        9.789               8.547               8.121

^a The ranges include the upper and exclude the lower endpoint.
^b See the discussion on density in Table 4, note a.
^c The density for zero hour is infinity. The value given is the percentage for participants with no time in this MET.

Appendix D

SENSITIVITY ANALYSIS

This appendix compares the results given in the text across the three samples defined in terms of the quality criteria--namely, the whole sample of 705 available participants, the good sample of 361 participants who passed the skip-logic criterion, and the best sample of 127 participants who passed both criteria. (The results in the text were based on the whole sample.)

ACTIVITY TIMES FOR THE ELEMENTARY METS

Table 4 gave the summary statistics for the activity time in the various METs.
Table D.1 compares the activity time summaries across the three samples. Generally speaking, there is very little difference among the activity times across the three samples.

ESTIMATED EXPOSURES

Section IV discussed the MEM and PM exposures. Tables D.2 through D.13 compare the summary statistics for the estimated exposures across the three samples. Tables D.2-D.5 compare the summaries for the MEM exposures based on the convolution method. Tables D.6-D.9 compare the summaries for the MEM exposures based on the hybrid method. Tables D.10-D.13 compare the summaries for the PM exposures. The estimated exposures are similar across the three samples.

DUAN'S CRITERION AND PIESS

Section V discussed Duan's criterion and PIESS. Tables D.14 and D.15 compare those measures across the three samples. The qualitative nature of the results is similar across the three samples.

Table D.1

SUMMARY STATISTICS FOR STANDARDIZED ACTIVITY TIMES FOR ELEMENTARY METS
(Time in hours)

               Whole Sample      Good Sample       Best Sample
MET            Mean     SD       Mean     SD       Mean     SD
Bus            0.060    0.319    0.065    0.293    0.051    0.258
Rail           0.040    0.241    0.043    0.207    0.017    0.132
Private car    1.517    1.591    1.485    1.609    1.503    1.475
Pedestrian     0.269    0.707    0.226    0.528    0.198    0.443
Parking        0.084    0.766    0.075    0.864    0.089    0.925
Shop           0.384    1.064    0.331    1.077    0.298    0.983
Office         3.051    3.914    3.201    3.996    3.547    4.190

Table D.2

SUMMARIES FOR THE MEM EXPOSURES BASED ON THE CONVOLUTION METHOD

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     2.287     2.215    9.471     175.0
Good      2.225     2.370    10.98     205.1
Best      2.067     2.308    11.89     208.6

^a Average of the MEM exposures in ppm-days.
^b Standard deviation of the MEM exposures.
^c Skewness of the MEM exposures.
^d Kurtosis of the MEM exposures.

Table D.3

PERCENTILES OF MEM EXPOSURES BASED ON THE CONVOLUTION METHOD

Percentile    Whole Sample    Good Sample    Best Sample
 1            0.051           0.050          0.050
 5            0.452           0.395          0.275
10            0.746           0.706          0.487
25            1.281           1.252          1.255
50            1.894           1.875          1.840
75            2.753           2.665          2.533
90            3.902           3.671          3.459
95            5.082           4.457          3.887
99            9.997           10.78          6.131

Table D.4

SUMMARIES FOR LOG MEM EXPOSURES BASED ON THE CONVOLUTION METHOD

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     0.555     0.815    -1.381    4.858
Good      0.516     0.835    -1.452    4.904
Best      0.432     0.868    -1.528    4.241

^a Average of the log MEM exposures in log(ppm-day).
^b Standard deviation of the log MEM exposures.
^c Skewness of the log MEM exposures.
^d Kurtosis of the log MEM exposures.

Table D.5

PERCENTILES OF LOG MEM EXPOSURES BASED ON THE CONVOLUTION METHOD

Percentile    Whole Sample    Good Sample    Best Sample
 1            -2.976          -2.996         -2.996
 5            -0.794          -0.929         -1.290
10            -0.293          -0.348         -0.719
25             0.247           0.225          0.227
50             0.639           0.629          0.610
75             1.013           0.980          0.929
90             1.362           1.300          1.241
95             1.626           1.495          1.358
99             2.302           2.378          1.813

Table D.6

SUMMARIES FOR THE HYBRID EXPOSURES

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     2.289     1.628    9.387     114.4
Good      2.247     1.771    10.25     124.9
Best      2.286     1.856    8.808     90.12

^a Average of the hybrid exposures.
^b Standard deviation of the hybrid exposures.
^c Skewness of the hybrid exposures.
^d Kurtosis of the hybrid exposures.

Table D.7

PERCENTILES OF THE HYBRID EXPOSURES

Percentile    Whole Sample    Good Sample    Best Sample
 1            1.202 ppm       1.202          1.202
 5            1.202           1.202          1.202
10            1.340           1.278          1.258
25            1.706           1.628          1.557
50            2.063           2.051          2.093
75            2.479           2.465          2.541
90            2.996           2.958          3.007
95            3.635           3.365          3.356
99            6.905           6.151          16.86

Table D.8

SUMMARIES FOR THE LOG HYBRID EXPOSURES

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     0.740     0.361    1.806     8.709
Good      0.717     0.362    1.930     10.80
Best      0.724     0.384    1.839     9.796

^a Average of the log hybrid exposures.
^b Standard deviation of the log hybrid exposures.
^c Skewness of the log hybrid exposures.
^d Kurtosis of the log hybrid exposures.

Table D.9

PERCENTILES OF THE LOG HYBRID EXPOSURES
(log (ppm))

Percentile    Whole Sample    Good Sample    Best Sample
 1            0.184           0.184          0.184
 5            0.184           0.184          0.184
10            0.293           0.246          0.229
25            0.534           0.487          0.443
50            0.724           0.718          0.738
75            0.908           0.902          0.932
90            1.097           1.084          1.101
95            1.291           1.213          1.211
99            1.932           1.817          2.825

Table D.10

SUMMARIES FOR THE PM EXPOSURES

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     1.593     1.634    3.112     16.74
Good      1.514     1.572    3.047     14.15
Best      1.411     1.395    2.906     13.40

^a Average of the PM exposures in ppm-days.
^b Standard deviation of the PM exposures.
^c Skewness of the PM exposures.
^d Kurtosis of the PM exposures.

Table D.11

PERCENTILES OF THE PM EXPOSURES

Percentile    Whole Sample    Good Sample    Best Sample
 1            0.050           0.050          0.050
 5            0.097           0.092          0.092
10            0.218           0.163          0.151
25            0.577           0.505          0.423
50            1.165           1.154          1.155
75            2.103           2.010          1.833
90            3.295           3.106          2.968
95            4.494           3.888          3.738
99            7.541           10.14          9.318

Table D.12

SUMMARIES OF LOG PM EXPOSURES

Sample    Mean^a    SD^b     Skew^c    Kurt^d
Whole     -0.017    1.096    -0.678    0.315
Good      -0.082    1.118    -0.675    0.168
Best      -0.123    1.090    -0.688    0.101

^a Average of the log PM exposures in log(ppm-day).
^b Standard deviation of the log PM exposures.
^c Skewness of the log PM exposures.
^d Kurtosis of the log PM exposures.

Table D.13

PERCENTILES OF THE LOG PM EXPOSURES

Percentile    Whole Sample    Good Sample    Best Sample
 1            -2.996          -2.996         -2.996
 5            -2.330          -2.389         -2.389
10            -1.525          -1.814         -1.889
25            -0.550          -0.683         -0.861
50             0.153           0.143          0.144
75             0.743           0.698          0.606
90             1.192           1.133          1.088
95             1.503           1.358          1.319
99             2.020           2.316          2.232

Table D.14

DUAN'S CRITERION FOR VARIOUS MET DECOMPOSITIONS

MET 0^a        MET 1^b     MET 2^b        Whole    Good     Best
All^c          Commute     Business       364.6    378.4    436.2
Commute        Parking     In-transit      89.1     73.3     65.9
Business^c     Shops       Offices         15.8     16.5     17.6
In-transit     Vehicles    Pedestrian      12.6     10.8     13.8
Vehicles       Public      Private car      6.2      6.2      3.1
Public^d       Bus         Rail            0.05     0.05     0.02

^a MET 0: The coarser MET being decomposed.
^b MET 1, MET 2: The two new METs.
^c The correlation between the MET concentrations in the two finer METs cannot be estimated and is assumed to be zero. The resulting Duan's criterion and PIESS should be viewed as a lower bound.
^d The correlation between the MET concentrations in the two finer METs cannot be estimated and is assumed to be one. The resulting Duan's criterion and PIESS should be viewed as an upper bound.

Table D.15

PERCENTAGE INCREASE IN EFFECTIVE SAMPLE SIZE

MET 0^a        MET 1^b     MET 2^b        Whole    Good     Best
All^c          Commute     Business       31.1     36.2     63.7
Commute        Parking     In-transit      8.2      7.5     10.7
Business^c     Shops       Offices         1.5      1.7      2.9
In-transit     Vehicles    Pedestrian      1.2      1.1      2.4
Vehicles       Public      Private car     0.6      0.7      0.5
Public^d       Bus         Rail           0.005    0.005    0.003

^a MET 0: The coarser MET being decomposed.
^b MET 1, MET 2: The two new METs.
^c The correlation between the MET concentrations in the two finer METs cannot be estimated and is assumed to be zero. The resulting Duan's criterion and PIESS should be viewed as a lower bound.
^d The correlation between the MET concentrations in the two finer METs cannot be estimated and is assumed to be one. The resulting Duan's criterion and PIESS should be viewed as an upper bound.

Appendix E

DUAN'S CRITERION

The criterion defined and used in Sec. V for decomposing a coarser MET into two finer METs, denoted as MET_1 and MET_2, is shown in Duan (1981) to have the following expression:

$DC = DC_T \times DC_C$,

where

$DC_T = (\mu_1^T + \mu_2^T)^2 \times \mathrm{Var}(R_1)$,
$DC_C = (\mu_1^C - \mu_2^C)^2 + (\Sigma_{11} + \Sigma_{22} - 2 \times \Sigma_{12})$,    (E.1)

$\mu_1^T$ = average MET time in MET_1,
$\mu_2^T$ = average MET time in MET_2,
$R_1 = T_1/(T_1 + T_2)$,
$T_1$ = MET time in MET_1,
$T_2$ = MET time in MET_2,
$\mu_1^C$ = average MET concentration in MET_1,
$\mu_2^C$ = average MET concentration in MET_2,
$\Sigma_{11}$ = variance of MET concentrations in MET_1,
$\Sigma_{22}$ = variance of MET concentrations in MET_2,
$\Sigma_{12}$ = covariance between MET concentrations in MET_1 and MET_2.

This expression was used in the calculation of Duan's criterion in the analysis reported in Sec. V.

The factor $DC_C$ measures how different the concentrations can be between the two METs. The factor can also be expressed equivalently as

$DC_C = E(C_1 - C_2)^2$,    (E.2)

where $C_1$, $C_2$ are the MET concentrations in the two METs.
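The equivalence of (E.2) and the concentration factor in (E.1) follows from writing a second moment as a squared mean plus a variance. The short derivation below is added for the reader's convenience and uses only the symbols defined for (E.1); it does not appear in Duan (1981) in this form as far as this report states.

```latex
\begin{align*}
E(C_1 - C_2)^2
  &= \bigl[E(C_1 - C_2)\bigr]^2 + \mathrm{Var}(C_1 - C_2) \\
  &= (\mu_1^C - \mu_2^C)^2
     + \mathrm{Var}(C_1) + \mathrm{Var}(C_2) - 2\,\mathrm{Cov}(C_1, C_2) \\
  &= (\mu_1^C - \mu_2^C)^2 + (\Sigma_{11} + \Sigma_{22} - 2\,\Sigma_{12}).
\end{align*}
```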
Empirically, the expanded formulation given in (E.1) is preferable to the more concise formula (E.2) because the two MET concentrations might not be observed simultaneously in microenvironment monitoring, as was discussed in Sec. V. With formula (E.2), the factor cannot be estimated in such situations. With formula (E.1), it is still possible to estimate part of the formula--namely, the terms other than $\Sigma_{12}$. The term $\Sigma_{12}$, which requires the two MET concentrations to be simultaneously observed, can be bounded, say, between zero and $\sigma_1 \times \sigma_2$, where $\sigma_1$ and $\sigma_2$ are the standard deviations of the MET concentrations in the two METs. (This is equivalent to bounding the correlation between zero and one. It is assumed that the correlation cannot be negative.)

Appendix F

ESTIMATION OF PARAMETERS IN DUAN'S CRITERION

The estimation of Duan's criterion requires the estimation of several summary parameters for activity times and MET concentrations.

ACTIVITY TIME

For activity times, two parameters are needed: the average of the total amount of time spent in the two METs by the individuals in the sample, and the variance of the ratios

$R_i = t_{i1}/(t_{i1} + t_{i2})$,

where $t_{i1}$ is the amount of time the ith person spent in the first MET, and $t_{i2}$ is the amount of time the person spent in the second MET.

For each person, the activity times $t_{i1}$ and $t_{i2}$ are calculated by summing the durations from activity segments associated with the MET being considered. The average of the total amount of time is calculated directly. For the other parameter, the variance of the ratios $R_i$, the ratio may be calculated only for individuals who spent some time in at least one of the two METs--i.e., $t_{i1} + t_{i2} > 0$. The ratios and the variance are calculated using those people only.
This procedure implicitly assumes that if the people who did not spend any time in these two METs on the sampling day spend some time in these two METs on another day, the variance of their ratios $R_i$ is the same as the one based on the observed ratios. This assumption cannot be tested using the data available from this study. However, if there are multiple days of activity times from the same individual, such as in the Denver CO Study (Johnson, 1984), it will be possible to test this assumption.

MET CONCENTRATIONS

For MET concentrations, the following parameters are needed:

$\mu_1^C$ = average MET concentration in the first MET,
$\mu_2^C$ = average MET concentration in the second MET,
$\Sigma_{11}$ = variance of the MET concentrations in the first MET,
$\Sigma_{22}$ = variance of the MET concentrations in the second MET,
$\rho_{12}$ = correlation between the MET concentrations in the two METs.

If both METs are elementary, as was defined in Sec. V, these parameters can be estimated directly. The average and variance parameters are estimated from days on which microenvironment monitoring was conducted in these elementary METs. The estimates are the same as the summary statistics given in Table 7. This procedure implicitly assumes that the MET concentrations are second-order stationary, i.e., they have the same mean and variance on different days.

The correlation is estimated from the days on which microenvironment monitoring was conducted in both METs. This procedure also assumes second-order stationarity. Certain pairs of METs (e.g., bus and rail) were never measured on the same day for more than one day; therefore the correlation cannot be estimated. In those situations both the uncorrelated case ($\rho = 0$) and the perfectly correlated case ($\rho = 1$) are considered, and Duan's criterion is derived for both cases, the former as an upper bound and the latter as a lower bound.
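When the correlation cannot be estimated, the two bounding cases can be computed directly from formula (E.1) by writing the covariance as $\rho \times \sigma_1 \times \sigma_2$. The Python sketch below illustrates this under stated assumptions: the function names and all numbers are hypothetical, and the activity-time factor is taken as the squared average total time multiplied by the variance of the ratios.

```python
import math


def dc_concentration(mu1, mu2, var1, var2, rho):
    """Concentration factor DC_C of formula (E.1) for a given correlation."""
    cov12 = rho * math.sqrt(var1 * var2)  # covariance = rho * sigma_1 * sigma_2
    return (mu1 - mu2) ** 2 + (var1 + var2 - 2.0 * cov12)


def dc_bounds(avg_total_time, var_ratio, mu1, mu2, var1, var2):
    """Return (lower, upper) bounds on Duan's criterion DC = DC_T * DC_C.

    The lower bound uses rho = 1 and the upper bound uses rho = 0,
    assuming the correlation cannot be negative.
    """
    dc_t = avg_total_time ** 2 * var_ratio  # activity-time factor DC_T
    lower = dc_t * dc_concentration(mu1, mu2, var1, var2, rho=1.0)
    upper = dc_t * dc_concentration(mu1, mu2, var1, var2, rho=0.0)
    return lower, upper


if __name__ == "__main__":
    # Made-up inputs: total time 2 (hours), Var(R) = 0.25,
    # means 5 and 3 (ppm), variances 4 and 9.
    print(dc_bounds(2.0, 0.25, 5.0, 3.0, 4.0, 9.0))
```

Because the covariance enters (E.1) with a negative sign, a larger assumed correlation always gives a smaller value of the criterion, which is why the uncorrelated case serves as the upper bound.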
When one or both of the METs is composite and not elementary (i.e., is an aggregation of more than one elementary MET), the MET concentration must be defined on each day before estimating the parameters. For example, the MET "vehicle" consists of three elementary METs: private cars, buses, and rail. There are several ways the MET concentrations for "vehicle" can be estimated, which will be discussed below.

Equation (5) expressed MET concentration as the average of microenvironment concentrations belonging to this MET, weighted by the amount of time the person spent in each microenvironment. With some algebraic manipulation, this definition leads to an analogous expression for MET concentration in a composite MET in terms of the MET concentrations in its component elementary METs. The MET concentration for a composite MET is the average of MET concentrations in the component elementary METs, weighted by the amount of time the individual spent in each elementary MET:

$$C_{ik} = \frac{\sum_j C_{ij} \times T_{ij}}{\sum_j T_{ij}}, \qquad \text{(F.1)}$$

where $C_{ik}$ is the MET concentration in the kth MET, a composite MET consisting of the elementary METs $\{j\}$; the summations are over these elementary METs; and for each elementary MET, say the jth, $C_{ij}$ is the MET concentration and $T_{ij}$ is the time spent in that MET.

One way to estimate the MET concentration for a composite MET such as "vehicle" is to take the average of all microenvironment concentrations measured as part of this MET. Under this definition, the MET concentrations in the elementary METs are weighted by the amount of time the investigators spent in each elementary MET, instead of the amount of time the general population spent. For example, the investigators might spend half of their time in private cars and half of their time in public transportation.
If they average the microenvironment concentrations they measured as their estimate for the composite MET "vehicle," the estimate is equally weighted by private cars and public transportation. However, the general population might not allocate their time between private cars and public transportation in the same way.

When there are no activity data from a human population available, the direct averaging of microenvironments described above might be the only method feasible. During the earlier phases of this analysis, this method was used to estimate MET concentrations. The results are basically the same as those based on the more accurate method.

For the present study, the activity data are available from the Urban Scale Study, which can be used to give better estimates of MET concentrations for composite METs. The final analysis uses the average amount of time the Urban Scale Study sample (the best sample) spent in each elementary MET as weights when estimating the MET concentration for a composite MET, using the formula (F.1) given above. All the results given in the text of this report are based on this method.

Once the MET concentrations are estimated for the composite METs, the concentration parameters are estimated in the same manner as for elementary METs.

REFERENCES

Dempster, A. P., N. M. Laird, and D. B. Rubin, "Maximum Likelihood from Incomplete Data via the EM Algorithm (with Discussion)," Journal of the Royal Statistical Society, Series B, Vol. 39, No. 1, 1977, pp. 1-38.

Duan, Naihua, "Micro-Environment Types: A Model for Human Exposure to Air Pollution," Technical Report No. 47, SIMS, Stanford University, Stanford, California, May 1981.

Duan, Naihua, Models for Human Exposure to Air Pollution, The Rand Corporation, N-1884-HHS/RC, July 1982. Also published in Environment International, Vol. 8, 1982, pp. 305-309.
Flachsbart, Peter G., "Development of a Field Methodology and Model to Estimate Commuter Exposures to Air Pollution Using Personal Exposure Monitors," Progress Report No. 1, Environmental Monitoring Systems Laboratory, Research Triangle Park, North Carolina, September 30, 1982a.

Flachsbart, Peter G., "Development of a Field Methodology and Model to Estimate Commuter Exposures to Air Pollution Using Personal Exposure Monitors," Progress Report No. 2, Environmental Monitoring Systems Laboratory, Research Triangle Park, North Carolina, February 17, 1984.

Flachsbart, Peter G., "Field Survey Procedures for Monitoring Carbon Monoxide Exposures of Commuters in the Washington Metropolitan Area," Environmental Monitoring Systems Laboratory, Research Triangle Park, North Carolina, September 1982b.

Hartwell, T. D., C. A. Clayton, D. M. Michie, R. W. Whitmore, H. S. Zelon, S. M. Jones, and D. A. Whitehurst, "Study of Carbon Monoxide Exposure of Residents of Washington, D.C. and Denver, Colorado, Part I," RTI/2390/00-OIF, Research Triangle Institute, Research Triangle Park, North Carolina, January 1984a.

Hartwell, T. D., C. A. Clayton, D. M. Michie, R. W. Whitmore, H. S. Zelon, S. M. Jones, and D. A. Whitehurst, "Study of Carbon Monoxide Exposure of Residents of Washington, D.C. and Denver, Colorado, Part II: Appendices," RTI/2390/00-OIF, Research Triangle Institute, Research Triangle Park, North Carolina, January 1984b.

Johnson, Ted, "A Study of Personal Exposure to Carbon Monoxide in Denver, Colorado," Environmental Monitoring Systems Laboratory, Research Triangle Park, North Carolina, January 1984.

Mack, Gregory A., Peter A. Flachsbart, Charles Rodes, James E. Howes, Gerry G. Akland, and Harold Sauls, "Microenvironmental Approach to Estimation of Carbon Monoxide Exposures to Commuters," 1984. (To appear in the Journal of the Air Pollution Control Association.)

Massey, Frank J., Jr., "The Distribution of the Maximum Deviation between Two Sample Cumulative Step Functions," Annals of Mathematical Statistics, Vol. 22, 1951, pp. 125-128.
Ott, Wayne R., "Computer Simulation of Human Air Pollution Exposures to Carbon Monoxide," paper presented at the 74th Annual Meeting of the Air Pollution Control Association, Philadelphia, Pennsylvania, June 21-26, 1981.

Smirnov, N., "On the Estimation of the Discrepancy between Empirical Curves of Distributions of Two Independent Samples," Bulletin Mathematique de l'Universite de Moscou, Vol. 2, 1939, fasc. 2.

Wallace, Lance A., Jacob Thomas, and David T. Mage, "Comparison of COHb Predicted from CO Exposures with End-tidal Breath Estimates of COHb from the Denver-Washington, D.C., Human Exposure Studies," paper presented at the 77th Annual Meeting of the Air Pollution Control Association, San Francisco, California, June 24-29, 1984.

Whitmore, Roy, Sheldon M. Jones, and Martin S. Rosenzweig, "Final Sampling Report for the Study of Personal CO Exposure," RTI/2390/02-OIF, Research Triangle Institute, Research Triangle Park, North Carolina, January 1984.

TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)

1. REPORT NO. EPA/600/4-85/046
2.
3. RECIPIENT'S ACCESSION NO. PBS5 2 289 55'rR
4. TITLE AND SUBTITLE APPLICATION OF THE MICROENVIRONMENTAL MONITORING APPROACH TO ASSESS HUMAN EXPOSURE TO CARBON MONOXIDE
5. REPORT DATE July 1985
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S) Naihua Duan
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS Rand Corporation, 1700 Main Street, Santa Monica, CA 90406
10. PROGRAM ELEMENT NO.
11. CONTRACT/GRANT NO. 68-02-4058
12. SPONSORING AGENCY NAME AND ADDRESS Environmental Monitoring Systems Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711
13. TYPE OF REPORT AND PERIOD COVERED
14. SPONSORING AGENCY CODE EPA/600/08
15. SUPPLEMENTARY NOTES
16.
ABSTRACT

Exposure estimates based on monitoring carbon monoxide in microenvironments are compared to exposure estimates based on personal monitoring with individual, portable monitors. Methods of calculation are reviewed and discussed, and results of calculations are presented. These data indicate that population exposure estimates based on data from the Washington Microenvironment Study, combined with people's activity data from the Washington Urban Scale Study, are about forty percent higher than estimates based on personal monitoring data from the Urban Scale Study. The former set of exposure estimates is found to be a good predictor of the latter. Nevertheless, generalizations of these findings to other data bases are not valid at this time.

17. KEY WORDS AND DOCUMENT ANALYSIS
    a. DESCRIPTORS
    b. IDENTIFIERS/OPEN ENDED TERMS
    c. COSATI Field/Group
18. DISTRIBUTION STATEMENT RELEASE TO PUBLIC
19. SECURITY CLASS (This Report) UNCLASSIFIED
20. SECURITY CLASS (This page) UNCLASSIFIED
21. NO. OF PAGES 99
22. PRICE

EPA Form 2220-1 (Rev. 4-77) previous edition is obsolete