PB89-166599
PROCEEDINGS OF THE RESEARCH PLANNING
CONFERENCE ON HUMAN ACTIVITY PATTERNS
Environmental Research Center
Las Vegas, NV
Jan 89
U.S. DEPARTMENT OF COMMERCE
National Technical Information Service
-------
EPA/600/4-89/004
January 1989
Proceedings of the
RESEARCH PLANNING CONFERENCE ON
HUMAN ACTIVITY PATTERNS
edited by
Thomas H. Starks, Ph.D.
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Cooperative Agreement No. CR 814342-01
Project Officer
Stephen C. Hern
Technical Monitor
Joseph V. Behar, Ph.D.
Exposure Assessment Division
Environmental Monitoring Systems Laboratory
Las Vegas, Nevada 89193-3478
OFFICE OF RESEARCH AND DEVELOPMENT
U.S. ENVIRONMENTAL PROTECTION AGENCY
LAS VEGAS, NEVADA
-------
NOTICE
Although the information in this document has been funded wholly or
in part by the United States Environmental Protection Agency under
Cooperative Agreement No. CR 814342-01 to the Environmental Research
Center, University of Nevada-Las Vegas, it has not been subjected to Agency
review and therefore does not necessarily reflect the views of the Agency,
and no official endorsement should be inferred.
Mention of trade names or commercial products does not constitute
endorsement or recommendation for use.
-------
TECHNICAL REPORT DATA
(Please read Instructions on the reverse before completing)
1. REPORT NO.
EPA/600/4-89/004
3. RECIPIENT'S ACCESSION NO.
4. TITLE AND SUBTITLE
PROCEEDINGS OF THE RESEARCH PLANNING CONFERENCE ON
HUMAN ACTIVITY PATTERNS
5. REPORT DATE
January 1989
6. PERFORMING ORGANIZATION CODE
7. AUTHOR(S)
Thomas H. Starks, Ph.D.
8. PERFORMING ORGANIZATION REPORT NO.
9. PERFORMING ORGANIZATION NAME AND ADDRESS
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
10. PROGRAM ELEMENT NO.
11. CONTRACT/GRANT NO.
CR814342-01
12. SPONSORING AGENCY NAME AND ADDRESS
Environmental Monitoring Systems Laboratory - LV, NV
Office of Research and Development
U.S. Environmental Protection Agency
Las Vegas, NV 89193-3478
13. TYPE OF REPORT AND PERIOD COVERED
14. SPONSORING AGENCY CODE
EPA/600/07
15. SUPPLEMENTARY NOTES
16. ABSTRACT
The study of human activity patterns was initially an area of interest
in the field of sociology, but recently it has become important to people
investigating the amount and extent of exposure of human populations to
hazardous chemicals. This report presents the proceedings of a conference
held to compare various methods of studying human activity patterns, and to
determine where additional research is needed to develop methods for
collecting reliable human activity patterns data pertinent to the determination
of exposure rates. Entitled "Research Planning Conference on Human Activity
Patterns," the conference was held May 8, 9, and 10, 1988, in Las Vegas, Nevada.
17. KEY WORDS AND DOCUMENT ANALYSIS
a. DESCRIPTORS
b. IDENTIFIERS/OPEN ENDED TERMS
c. COSATI Field/Group
18. DISTRIBUTION STATEMENT
RELEASE TO PUBLIC
19. SECURITY CLASS (This Report)
UNCLASSIFIED
20. SECURITY CLASS (This Page)
UNCLASSIFIED
21. NO. OF PAGES
292
22. PRICE
EPA Form 2220-1 (Rev. 4-77) PREVIOUS EDITION IS OBSOLETE
-------
ABSTRACT
The study of human activity patterns was initially an area of
interest in the field of sociology, but recently it has become important to
people investigating the amount and extent of exposure of human populations
to hazardous chemicals. This report presents the proceedings of a
conference held to compare various methods of studying human activity
patterns, and to determine where additional research is needed to develop
methods for collecting reliable human activity patterns data pertinent to
the determination of exposure rates. Entitled "Research Planning
Conference on Human Activity Patterns," the conference was held May 8, 9,
and 10, 1988, in Las Vegas, Nevada. This report was submitted in partial
fulfillment of Cooperative Agreement No. CR 814342-01 by the Environmental
Research Center, University of Nevada-Las Vegas, under the partial
sponsorship of the U.S. Environmental Protection Agency.
-------
CONTENTS
Notice
Abstract
Acknowledgements
Summary
Human Activity Pattern Studies: History and Issues
  Estimating Americans' Exposure to Air Pollution: Issues, Alternatives
    and Suggestions - John Robinson
  Human Activity Patterns: A Review of the Literature for Estimating
    Time Spent Indoors, Outdoors, and In Transit - Wayne Ott
  Basic Activity Patterns Structure for Modeling Pollution Exposure -
    Jacob Thomas and Joseph V. Behar
Personal Activities: Monitoring and Quality Assurance
  A Comparative Evaluation of Self-Reported and Independently-Observed
    Activity Patterns in an Air Pollution Health Effects Study -
    Thomas H. Stock and Maria T. Morandi
  Assessing Activity Patterns for Air Pollution Exposure Research -
    James Adair and John D. Spengler
  Perception of Daily Cigarette Consumption in an Office Environment -
    David A. Sterling, D.J. Moschandreas, and Robert D. Gibbons
  Capture of Activity Pattern Data During Environmental Monitoring -
    Harvey Zelon
  An Activity Pattern Survey of Asthmatics - Carolyn H. Lichtenstein,
    H. Daniel Roth, and Ron E. Wyzga
Nonresponse: Avoidance and Data Analysis
  The Treatment of Missing Survey Data - Graham Kalton and Daniel
    Kasprzyk
  Nonresponse Adjustment Methods for Demographic Surveys at the
    U.S. Bureau of the Census - Rajendra P. Singh and Rita J.
    Petroni
  On the Robustness of the Maximum Likelihood Estimator in the Presence
    of Nonresponse in Compositional Data - Chao-lung Chen
  Nonresponse Problems and Solutions: A Case Study - Dawn Nelson and
    Chet Bowie
  Principles of Questionnaire Design and Methods of Administration -
    Wendy Visscher, Roy W. Whitmore, Mel Kollander and
    F. Cecil Brenner
Microenvironments and Activities: Definitions and Distinctions
  Estimation of Microenvironment Concentration Distribution Using
    Integrated Exposure Measurements - Naihua Duan
  Microenvironment Database for Total Human Exposure Studies -
    Muhilan Pandian
  A Methodology for Estimating Carbon Monoxide and Resulting
    Carboxyhemoglobin Levels in Denver, Colorado - Ted Johnson
  The Influence of Daily Activity Patterns on Differential Exposure
    to Carbon Monoxide Among Social Groups - Margo Schwab
Identification of Research Needs
Conference Participants
-------
ACKNOWLEDGEMENTS
Gerry Akland, Joseph Behar, Stephen Hern, and Wayne Ott of the U.S.
Environmental Protection Agency were important sources of information,
guidance, and encouragement in the planning of this conference. Chao Chen,
Lynn Fenstermaker, Carol Forsythe, Leslie Gorr, Muhilan Pandian, and Marie
Schnell of the Environmental Research Center are also thanked for their
help in the operation of the conference and the production of these
proceedings.
-------
SUMMARY
Modern technology has brought a dramatic increase in the production
and consumption of man-made chemicals. To determine human health risks
posed by these new chemicals, it is necessary to investigate and estimate
how, how often, and at what concentrations the human population is exposed
to the chemicals. This in turn requires information on when, where, and
how people spend their time; that is, what are the human activity patterns
of a population.
The U.S. Environmental Protection Agency (EPA) employs exposure
models (e.g., SHAPE and NEM) to help develop risk assessments for various
chemicals. A key component for these models is human activity pattern
information. To date, only a few exposure studies have gathered the human
activity pattern data required to drive such models, and these studies have
utilized essentially the same basic methods.
This conference was held to compare and contrast methods of human
activity pattern data collection and analysis from environmental exposure
studies (e.g., EPA's TEAM VOC Studies) with those from non-environmental
studies. Of particular interest were validity of approach, methodology for
eliciting cooperation from individuals in the sample, and implementation of
questionnaires and other data acquisition methodologies. In addition,
there was interest in data analysis methods used to reduce bias caused by
nonresponse. Another important objective of this conference was to
identify research needs in human activity patterns research.
To accomplish the conference objectives, both speakers and attendees
were selected so as to provide a broad background of experience in dealing
with the problems and needs of human activity pattern studies and research.
The conference participants came from universities, research institutes,
and state and federal agencies; and they represented a broad spectrum of
disciplines and interests. The discussions following each session of
presentations were spirited, informative, and representative of many points
of view concerning the problems discussed.
The conference papers are categorized into three substantive areas of
human activity patterns studies: microenvironments and activities,
personal activities monitoring and quality assurance, and nonresponse.
MICROENVIRONMENTS AND ACTIVITIES
To describe human activity patterns, one must decide how detailed the
descriptions must be. For some environmental studies, it may be sufficient
to state when and how long people are indoors, in transit (car, bus, train,
or plane), or outdoors, and whether or not they are engaged in a particular
activity (e.g., smoking). For other environmental studies, it may be
necessary to be far more specific in the characterization of each person's
location and activity. Typically, the required specificity of the
descriptions depends on the nature of the environmental pollutant under
investigation. Some participants of the conference thought that it would
be wise to standardize the descriptions of locations (or microenvironments)
and activities so that results of different environmental human activity
pattern studies could be compared, and also could be used in estimating
population exposure to pollutants other than those targeted in the survey.
The word microenvironment was frequently used in the conference, but
it was evident that there was some disagreement as to what its proper
definition should be. Pandian states that
In its simplest form, a microenvironment can be defined as a
control volume with a homogeneous pollutant concentration.
...Since the pollutant concentration of the same volume might
vary with time, a better definition of microenvironment is the
four dimensional concept (3-D space) x (time) (Duan, 1982).
...In this paper, a new concept is introduced in which the use
of microenvironments is extended to include all the different
components of the total human exposure process, from pollutant
sources to related health effects.
Thomas and Behar, on the other hand, state "...the term microenvironment
has been defined to mean the location where activity takes place."
Similarly, Chen uses the word microenvironment to represent type of
location ("...well defined 'microenvironments' [e.g., kitchen, parking lot,
bathroom]...") but evidently employs the quotes on microenvironments to
indicate that this is his definition for the purposes of his paper. Many
of the other speakers do not try to define microenvironment, but rather
employ categorizations of microenvironments. Typically they categorize
microenvironments by types of location (e.g., residence, office, car,
etc.).
Pandian suggests the development of a relational data base that will
connect many different types of information that one might associate with
microenvironments such as pollutant sources, sinks, and concentrations;
locations; activities and their durations; pollutant-carrying media; dosage
processes; and health effects. Pandian points out that such a data base
would provide scientists studying human exposure with data on
microenvironments and related elements in a concise package, and would
expose areas where further research is needed.
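The kind of relational data base described above can be pictured with a small sketch in Python, using the standard sqlite3 module. The table names, column names, and sample rows below are hypothetical illustrations only; they are not taken from Pandian's paper, and a working data base would carry many more tables (sinks, media, dosage processes, health effects).

```python
import sqlite3

def build_demo_db():
    """Create an in-memory data base linking microenvironments to related data.

    All table names, column names, and rows are hypothetical.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE microenvironment (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE source (
            id INTEGER PRIMARY KEY,
            microenvironment_id INTEGER REFERENCES microenvironment(id),
            pollutant TEXT NOT NULL,
            description TEXT NOT NULL
        );
        CREATE TABLE activity (
            id INTEGER PRIMARY KEY,
            microenvironment_id INTEGER REFERENCES microenvironment(id),
            name TEXT NOT NULL,
            minutes REAL NOT NULL
        );
    """)
    con.executemany("INSERT INTO microenvironment VALUES (?, ?)",
                    [(1, "kitchen"), (2, "car"), (3, "park")])
    con.executemany("INSERT INTO source VALUES (?, ?, ?, ?)",
                    [(1, 1, "CO", "gas stove"), (2, 2, "CO", "vehicle exhaust")])
    con.executemany("INSERT INTO activity VALUES (?, ?, ?, ?)",
                    [(1, 1, "cooking", 45.0), (2, 2, "commuting", 60.0),
                     (3, 1, "eating", 30.0), (4, 3, "walking", 20.0)])
    return con

def minutes_near_pollutant(con, pollutant):
    """Total activity minutes per microenvironment that has a source of `pollutant`."""
    rows = con.execute("""
        SELECT m.name, SUM(a.minutes)
        FROM microenvironment AS m
        JOIN activity AS a ON a.microenvironment_id = m.id
        WHERE m.id IN (SELECT microenvironment_id FROM source WHERE pollutant = ?)
        GROUP BY m.name
        ORDER BY m.name
    """, (pollutant,)).fetchall()
    return dict(rows)

con = build_demo_db()
print(minutes_near_pollutant(con, "CO"))
```

The point of the relational layout is that one query can connect sources, locations, and activity durations that would otherwise sit in separate files.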
Duan considers the difficult problem of estimating microenvironment
pollutant concentrations given the results of exposure measurements
integrated over several microenvironments and given data indicating how
long subjects wearing the exposure meters spent in each environment. He
gives estimation procedures based on three different types of independence
assumptions. It is possible that all of these independence assumptions
are too strong to approximate reality.
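The general shape of this estimation problem can be sketched as follows. The sketch is not one of Duan's procedures; it assumes, for illustration only, that each microenvironment has a single mean concentration shared by all subjects, so that a subject's time-averaged exposure is the sum of (fraction of time in each microenvironment) times (that microenvironment's concentration). Ordinary least squares then recovers the concentrations. The function names and numbers are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("time fractions do not identify every microenvironment")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def estimate_concentrations(time_fractions, exposures):
    """Least-squares estimate of one mean concentration per microenvironment.

    time_fractions[s][k]: fraction of the monitoring period subject s spent
    in microenvironment k. exposures[s]: subject s's time-averaged exposure.
    """
    k = len(time_fractions[0])
    # Normal equations (F'F) c = F'e.
    ftf = [[sum(row[i] * row[j] for row in time_fractions) for j in range(k)]
           for i in range(k)]
    fte = [sum(row[i] * e for row, e in zip(time_fractions, exposures))
           for i in range(k)]
    return solve_linear_system(ftf, fte)

# Hypothetical data: three subjects, two microenvironments (indoors, in transit).
fractions = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]
exposures = [2.8, 3.6, 6.0]
print(estimate_concentrations(fractions, exposures))
```

If every subject split time in the same proportions, the system would be singular and the concentrations could not be separated, which is one reason the activity data matter as much as the exposure readings.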
Johnson discusses a methodology that was developed to estimate carbon
monoxide exposure and resulting carboxyhemoglobin levels of residents of
five counties in the greater Denver metropolitan area. The author tells
how activity pattern data from previous surveys were used in this
estimation process. He also discusses some of the shortcomings of his
approach.
Schwab gives a geographer's approach to the analysis of activity
pattern and exposure data. She examines the relationship between
sociodemographic factors and exposure, and gives results for this type of
analysis that are based on EPA's study of personal exposure to carbon
monoxide in Washington, D.C.
Thomas and Behar compare time-budget data from carbon monoxide
exposure studies in two cities, Denver and Washington, to determine the
"likeness" of the activity patterns observed in the two cities. They
investigate the proportions of time spent in residence, indoors (other than
residence), outdoors, and in transit. Both studies were conducted in the
winter season. The authors point out the need for data from other seasons
to determine seasonal effects.
Ott, in reviewing results of activity pattern studies from around the
world, found as a common thread that people are basically indoor animals
that spend less than five percent of their time outdoors. This paper
includes a long list of references that provides the reader with a good
bibliography of the activity patterns literature.
PERSONAL ACTIVITIES MONITORING AND QUALITY ASSURANCE
The basic questions discussed are (i) What methods of monitoring
human activity patterns give a good balance between being unobtrusive and
having accurate results? and (ii) How does one evaluate accuracy of human
activity pattern monitoring results? If monitoring methods are obtrusive,
this obtrusiveness may cause an increase in nonresponse rates and may cause
respondents to alter their normal activity pattern during the period of
monitoring. Relatively non-obtrusive methods typically rely on the memory
of the respondents concerning activities during a previous period of some
fixed length and thereby may be unreliable.
Estimation of measurement error variance and measurement bias in
human activity pattern monitoring data typically must rely on redundancy in
measurement instruments, with one instrument being more accurate (and often
more obtrusive) than the other. Some biases may be obvious, such as
failure to report time spent in bathrooms and restrooms, but the extent of
the bias may still be difficult to estimate. The research done in this
area of data quality assurance appears to be quite limited.
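As a minimal sketch of the redundancy idea, suppose the same quantity (say, minutes spent indoors) is recorded for each participant by a self-report diary and by a more accurate reference method such as an observer's log. Treating the reference as error-free, which is itself an assumption, the mean of the paired differences estimates the bias and their sample variance reflects the measurement error. The function name and data are hypothetical.

```python
from statistics import mean, variance

def compare_instruments(self_report, reference):
    """Summarize paired differences (self-report minus reference)."""
    if len(self_report) != len(reference):
        raise ValueError("measurements must be paired")
    diffs = [s - r for s, r in zip(self_report, reference)]
    return {
        "bias": mean(diffs),                      # average over- or under-report
        "difference_variance": variance(diffs),   # sample variance (n - 1 divisor)
    }

# Hypothetical minutes indoors for four participants.
diary = [400, 380, 450, 300]
observed = [420, 390, 445, 330]
print(compare_instruments(diary, observed))
```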
Stock and Morandi report on a study in which participants maintained
a personal activity log and were observed by technicians who also logged
the participants' activities during some of the study period. They found
some large discrepancies between the pairs of logs, especially for logs of
participants who were young male children.
In a different type of study, Sterling, Moschandreas, and Gibbons
compare the amount of smoking in a workplace perceived by smokers and by
nonsmokers. While the smokers' perceptions were in agreement with the
amount of individual smoking that they reported, the nonsmokers perceived
the number of cigarettes smoked to be significantly lower than that
reported and perceived by the smokers.
Zelon discusses observed advantages and disadvantages for two methods
of reporting activity patterns that were used in EPA exposure studies.
These two methods are a 24-hour recall questionnaire and a diary in which
the respondent is to make entries whenever he changes activities. He
suggests a number of improvements that might be made in these procedures
including some pre-training of the respondents.
Robinson lists several methods for recording information on human
activities and some of their advantages and disadvantages. He also gives
some information on the amount of agreement found between results obtained
by different approaches. In addition, he lists the advantages of a
Computer-Assisted Telephone Interviewing (CATI) system developed by the
University of California-Berkeley, and discusses its use in a study being
conducted for the California Air Resources Board.
Quality assurance and quality control procedures employed in the
Harvard Air Pollution Study are described by Adair and Spengler. The study
involves a survey of children in several cities to record daily respiratory
symptoms and to monitor exposure to NO2 and respirable particles. In
addition to listing their procedures, the authors give some of the problems
and limitations resulting from the procedures. They also compare activity
pattern results from different regions and from different seasons of the
year.
Lichtenstein, Roth, and Wyzga describe a diary approach in which a
questionnaire is filled out on an hourly basis to determine the activity
patterns of asthmatics. They also briefly address the problem of how to
keep the use of monetary incentives from bringing in respondents who are
not actually members of the target population.
NONRESPONSE
Two types of nonresponse are discussed. One is called unit
nonresponse and this occurs when a sampled individual either cannot be
located or is unwilling or unable to participate in the survey. The other
type of nonresponse is called item nonresponse and this happens when only
some of the questions in the questionnaire go unanswered or answers are
deleted in editing. A probability sample is one in which each unit in the
target population has a known, nonzero, probability of being included in
the sample. Probability samples are advantageous in that they allow
unbiased estimation of the population characteristics and also estimation
of the variance of the estimates. However, when nonresponse is
encountered, one no longer has a true probability sample in that one no
longer knows the probability that information for a given individual will
be in the sample survey results. That is, one may select a probability
sample of a population, but, if some members of the population may be
nonrespondents, the actual sample from which data are collected is not a
probability sample of the population. Hence, samples of large human
populations are almost never truly probability samples. Nevertheless, if
the rate of unit nonresponse is small (say less than 5 percent), it is not
unreasonable to treat the sample as a probability sample. For larger
nonresponse rates, one has to worry about how the nonrespondents differ
from respondents in terms of the variables being measured in the survey,
and some methods for correction for nonresponse should be applied in the
analysis so as to avoid seriously biased estimates.
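The role of known inclusion probabilities can be illustrated with the standard Horvitz-Thompson estimator of a population total, in which each sampled value is divided by that unit's probability of selection. The sketch below is an illustration added here, not a method described by the conference papers; the function names and figures are hypothetical. Once nonresponse occurs, the probability that a unit's data actually appear in the results is no longer the known design probability, which is why the estimator loses its guarantee of unbiasedness.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each observed value is weighted by the inverse of its known,
    nonzero inclusion probability.
    """
    if len(values) != len(inclusion_probs):
        raise ValueError("one inclusion probability is needed per value")
    if any(p <= 0 or p > 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))

def unit_nonresponse_rate(n_sampled, n_responding):
    """Fraction of sampled units that yielded no usable data."""
    return 1 - n_responding / n_sampled

# Hypothetical hours outdoors for three sampled people with unequal
# selection probabilities.
hours = [1.0, 2.0, 0.5]
probs = [0.01, 0.02, 0.005]
print(horvitz_thompson_total(hours, probs))
print(unit_nonresponse_rate(1000, 960))
```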
One conference participant observed that for some surveys, the
objectives may be such that it is not necessary to try to obtain
probability samples nor to worry about nonresponse bias in estimation.
This can happen when the objectives are qualitative rather than
quantitative. For example, the purpose of the survey may be to show that
there are at least some people in the population being exposed to a
chemical, or receiving a major portion of their exposure to a chemical, at
a particular type of location or through a particular type of activity.
Nelson and Bowie list methods employed by the U.S. Bureau of the
Census to obtain low unit-nonresponse rates and also discuss an experiment
to evaluate experimental methods to further increase response rates. They
stress the importance of "using well-trained professional interviewers who
have a positive attitude."
Visscher, Whitmore, Kollander, and Brenner place emphasis on the role
of wording, testing, and administration of the questionnaire in the
avoidance of both unit and item nonresponse. In addition, they list the
steps required in the production of a good questionnaire from the careful
specification of the purpose of the survey to a pretest of the
questionnaire on people similar to those in the target population.
Kalton and Kasprzyk give an overview of data analysis procedures
designed to partially compensate for nonresponse in sample surveys. They
review methods of weighting adjustment and imputation. Included in the
review are discussions of several "hot deck" methods of imputation. All
the methods discussed assume that once the auxiliary variables have been
taken into account, the missing values are missing at random. This is a
strong assumption about the nonresponse mechanism. The authors mention a
method that avoids the assumption, but point out that it is sensitive to
distributional assumptions. They conclude their paper with the statements
that all methods for handling missing survey data must depend on untestable
assumptions, and that the only safeguard against serious nonresponse bias
is to keep the amount of missing data small.
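To make the "hot deck" idea concrete, the sketch below shows one simple variant, a sequential hot deck: records are processed in file order, and a missing item is filled with the most recently reported value from a respondent in the same imputation class. This is only an illustration of the general technique, not a reconstruction of any particular method reviewed by Kalton and Kasprzyk; the field names and data are hypothetical. Note that it embodies the assumption discussed above: within a class defined by the auxiliary variable, missing values are treated as missing at random.

```python
def sequential_hot_deck(records, class_key, item):
    """Fill missing `item` values from the latest donor in the same class.

    records: list of dicts; a missing item is represented by None.
    Returns new records with an added 'imputed' flag; input is not modified.
    """
    last_donor = {}   # imputation class -> most recent reported value
    result = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's data are left untouched
        cls = rec[class_key]
        if rec[item] is None:
            if cls in last_donor:
                rec[item] = last_donor[cls]
                rec["imputed"] = True
            else:
                rec["imputed"] = False  # no donor seen yet; value stays missing
        else:
            last_donor[cls] = rec[item]
            rec["imputed"] = False
        result.append(rec)
    return result

# Hypothetical survey file: employment status is the auxiliary variable.
survey = [
    {"employed": True, "hours_outdoors": 1.0},
    {"employed": False, "hours_outdoors": 3.0},
    {"employed": True, "hours_outdoors": None},
    {"employed": False, "hours_outdoors": None},
]
for row in sequential_hot_deck(survey, "employed", "hours_outdoors"):
    print(row)
```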
Singh and Petroni discuss nonresponse weighting adjustment used at
the Bureau of the Census for demographic surveys. Procedures for defining
noninterview adjustment cells are presented. A response is weighted
according to a function of the nonresponse rate for the cell containing the
respondent. An application of these methods to the Census Bureau's Survey
of Income and Program Participation is used as an example.
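A common form of such an adjustment, sketched below, multiplies each respondent's weight by the inverse of the weighted response rate in the respondent's cell, so that respondents also stand in for the nonrespondents in that cell. This is a generic weighting-class sketch under that assumption; the actual functions used by the Census Bureau may differ, and the field names and figures are hypothetical.

```python
from collections import defaultdict

def adjust_weights(sample):
    """Return respondents with nonresponse-adjusted weights.

    sample: list of dicts with keys 'cell', 'weight', and 'responded'.
    Assumes every cell contains at least one respondent.
    """
    total = defaultdict(float)       # weight of all sampled units, by cell
    responding = defaultdict(float)  # weight of respondents only, by cell
    for unit in sample:
        total[unit["cell"]] += unit["weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["weight"]
    adjusted = []
    for unit in sample:
        if unit["responded"]:
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append({**unit, "weight": unit["weight"] * factor})
    return adjusted

# Hypothetical sample: cell A has one nonrespondent, cell B has none.
sample = (
    [{"cell": "A", "weight": 100.0, "responded": True}] * 3
    + [{"cell": "A", "weight": 100.0, "responded": False}]
    + [{"cell": "B", "weight": 50.0, "responded": True}] * 2
)
for unit in adjust_weights(sample):
    print(unit["cell"], round(unit["weight"], 2))
```

After adjustment the respondents' weights in each cell sum to the cell's original total, so estimated population totals are preserved cell by cell.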
Chen considers the item nonresponse problem and develops a maximum
likelihood estimator based on a class of logistic normal distributions. He
then investigates the robustness of the estimator and relates the
robustness to assumptions about the nonresponse mechanism. He suggests
that use of response incentives may change the nonresponse mechanism, and
thereby change the robustness of estimators.
RESEARCH NEEDS
The study of human activity patterns and their relation to human
exposure is a relatively new science. Nearly all aspects of this science
require further development. As this science is developed, it will provide
better information for environmental policy makers in making decisions
about the health risks of many potentially hazardous chemicals. At the
conclusion of the conference, Michael Callahan pointed out that
researchers, when planning and when reporting their work, should be
cognizant of problems faced by the people designing and enforcing
regulations. From the regulatory decision maker's perspective the
important questions are "What is the probability of harm from industrial
use of chemical X?" and "What can I do about it?"
Conference participants were asked for their recommendations
concerning future research needs in the study of human activity patterns
and their relationships to human exposure to pollutants. A listing of
these recommendations is given at the end of these proceedings. The
recommended research topics include
• measurement methods development, validation, and comparison;
• formulation of better exposure models and of methods for the
validation of the models;
• development of a standard data coding and format;
• investigation of the longitudinal aspects of activity patterns
and exposure;
• examination of procedures that might increase participation
rates of sampled individuals;
• determination of methods of data analysis best suited for
reducing nonresponse bias;
• performance of field and laboratory studies to determine the
associations between activities, microenvironments, and
exposures;
• development of guidelines and standards for field studies;
• development of protocols for interviewer and technician
training and for subject instruction;
• development of quality assurance and quality control methods
appropriate to activity pattern studies.
-------
ESTIMATING AMERICANS' EXPOSURE TO AIR POLLUTION:
ISSUES, ALTERNATIVES AND SUGGESTIONS
by: John P. Robinson
Department of Sociology
University of Maryland,
College Park, MD 20742
ABSTRACT
With the increased interest in how the public spends its time has come
a proliferation of techniques for its measurement. In addition to
observation, five such methods are described and contrasted. Some general
advantages and examples of the diary method are presented in relation to
this method.
A major problem with existing diary studies is that they are not
focused on variables that are of main concern or relevance to environmental
researchers. The features of a new study that was explicitly designed to
adapt the diary method to produce generalizable exposure estimates for a
large population are discussed. It is proposed that such a technique is
quite appropriate and adaptable for generating national level estimates,
although they would ultimately need to be validated and calibrated using a
combination observational/personal monitoring approach.
-------
INTRODUCTION
Time is an increasingly used indicator of human activity and
performance. Inferences about the quality of life are made from data on the
length of the workweek, the time we spend watching television, the number
of hospital days we are ill, or the time American men and women spend doing
housework. There seems to be a widening perception that our decisions
about time are becoming as important as our decisions about money.
Time plays much the same role in estimating the quality of the
interaction with our environment. The amount of time we spend in
certain environmental conditions, or in exposure to certain pollutants, is a
key indicator of the daily risks we take. At the same time, it also serves
to suggest understandable numbers that are directly subject to policy
manipulation; that is, the simple statement that we spend x minutes per day
exposed to carbon monoxide or y hours per week exposed to cigarette smoke
immediately implies steps that can be taken to reduce risk.
Measuring these amounts of time would also seem a fairly
straightforward matter. We can easily visualize the daily activities that
lead to individuals being exposed x minutes to carbon monoxide per day.
But that is because we are implicitly back-translating this number to the
common-sense method of observation. Unfortunately, the observational
method of estimation, while feasible and persuasive, is simply too unwieldy
and cost intensive to be workable as an approach. How many people selected
at random would be willing to have an observer follow them around for an
extended period of time? How much interviewer time would be required to
collect the data, and how much effort would be required to train the
interviewers to be adequate observers of what we want to observe? How much
might people's activities be affected by the presence of the observer?
ALTERNATIVE METHODS
These are some of the obstacles that face the observation method. It
is not so much that these problems are unsolvable as that they are unwieldy
and expensive to address. More cost-effective methods have been proposed,
and I will attempt to make a persuasive case for them. At the same time, I
will later argue for the ultimate need for more observational studies.
At least five methods of estimating time exposure can be found in the
literature:
1) Respondent estimate: This is the cheapest and perhaps most
commonly used method. What is involved is simply asking
respondents to estimate the time per day or per week they
spend doing a particular activity. This is the way the Census
Bureau and Bureau of Labor Statistics obtain their estimates about
the length of the workweek or the numbers of vacation days. It
is the way many survey organizations have estimated time spent
watching television or using other media, or time spent in
voluntary organizations or time spent doing housework.
2) Estimates of others: This is essentially the same approach,
except one uses other informants who live with or know the
respondent as respondents.
3) Telephone coincidental: In this technique, respondents give
only brief reports for a specific moment — usually the moment
the telephone rang in their household. This approach has
usually been used by media rating services.
4) Behavioral meters: This is probably the most expensive
approach, since it requires the use of expensive equipment,
respondent agreement to use such equipment and usually some
sort of technical help to install or adjust the metering
equipment. The most common example of this approach is the TV
monitoring boxes used by Nielsen, Arbitron and the other TV
rating services. More sophisticated "people meters" have been
developed to replace these black boxes, but they still suffer
from the same problems, particularly in relation to respondent
cooperation (Hartwell, 1984; Johnson, 1985).
5) Respondent Diaries: Like estimates, these are respondent
self-reports. Diaries differ in that they require respondents
to give a full accounting of their time for a specified period,
such as an evening, a day, or week. The diary is constructed
to be comprehensive, so that respondents report for all time
during that period. There are almost as many diary approaches
as diary studies, with some studies using fixed reporting
intervals, others open; some using closed activity
categories, others open; some including several parameters of
activity, others only one or two; some retrospective (i.e.,
about "yesterday"), some prospective (i.e., about "tomorrow"),
and so forth.
Figure 1 illustrates the typical information obtained in a diary
format, using open-end action questions as in the 1965, 1975 and 1985
national studies of Americans' Use of Time conducted by The Survey Research
Centers at the University of Michigan and at the University of Maryland.
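Because a diary is meant to account for all time in the reporting period, one natural check on a completed diary is to look for stretches of the day that no entry covers. The following sketch shows one way such a record and check might be represented. The field names are assumptions chosen to mirror the kinds of questions a diary asks (what was done, when it began and ended, where, with whom, and any secondary activity); they are not the coding scheme of the national studies.

```python
from dataclasses import dataclass

@dataclass
class DiaryEntry:
    activity: str                  # what the respondent did
    began: int                     # minutes after midnight
    ended: int                     # minutes after midnight
    where: str                     # location of the activity
    with_whom: str = ""            # other people present
    secondary_activity: str = ""   # anything else done at the same time

def find_gaps(entries, day_start=0, day_end=24 * 60):
    """Return (start, end) intervals of the day not covered by any entry."""
    gaps = []
    cursor = day_start
    for entry in sorted(entries, key=lambda e: e.began):
        if entry.began > cursor:
            gaps.append((cursor, entry.began))
        cursor = max(cursor, entry.ended)
    if cursor < day_end:
        gaps.append((cursor, day_end))
    return gaps

# Hypothetical partial diary.
diary = [
    DiaryEntry("sleeping", 0, 420, "home"),
    DiaryEntry("breakfast", 420, 450, "home", with_whom="spouse"),
    DiaryEntry("driving to work", 480, 510, "car", secondary_activity="radio"),
]
print(find_gaps(diary))
```

An interviewer, or a computer-assisted interviewing program, could use such a check to prompt the respondent about unreported periods before the interview ends.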
The diary approach takes 15-20 minutes to complete and thus is far
more expensive than the estimate approach, especially so in relation to
most other methods. Several studies have demonstrated the reliability of the
diary method in terms of its ability to produce similar estimates. Thus,
Robinson (1977) found a .85 correlation between diary estimates using the
"yesterday" and "tomorrow" approaches and a .86 correlation between overall
estimates from a 1965-66 national sample and a separate random sample of
respondents from the single site of Jackson, Michigan.
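Reliability figures of this kind compare two sets of estimates for the same list of activities. Assuming the reported values are ordinary product-moment (Pearson) correlations, which the text does not state explicitly, the calculation can be sketched as follows. The activity averages are hypothetical and are not data from the studies cited.

```python
from math import sqrt

def pearson_r(x, y):
    """Product-moment correlation between two paired lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical average minutes per day for five activities, as estimated
# from "yesterday" diaries and from "tomorrow" diaries.
yesterday = [470, 95, 130, 60, 25]
tomorrow = [455, 110, 120, 75, 20]
print(round(pearson_r(yesterday, tomorrow), 3))
```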
The diary method has also demonstrated basic validity in the aggregate
sense. In one study (Robinson, 1985), diary estimates correlated .81 with
estimates using a "beeper" that went off at random moments during the day.
In another study, Juster (1985) found husbands' and wives' diary accounts
independently agreeing about what the other was doing at various points
during the day. Chapin (1974) reports that his diary accounts squared well
with those reported by participant observers in his Washington, D.C.,
study.
FIGURE 1: SAMPLE TIME DIARY PAGE (1985 Study of Americans' Use of Time)
What you did from midnight until 9 in the morning
[Diary form. Rows: time of day by hour, beginning at midnight. Columns: What did you do?; Time Began; Time Ended; Where; List Other People With You; Doing Anything]
Thus, the diary method has been found to produce national estimates
that have desirable measurement properties in terms of activity reports.
As such, they raise questions about the accuracy and interpretation of
alternative sources of data using other methods. As in other countries,
for example, diary reports of the time employed people spend at work are
10-15% lower than the time reported as their "official" workweek. In the same way,
time spent watching television is far lower than that reported by media
rating services. On the other hand, free time is much greater than what
respondents estimate that they have.
But that is for activity; and as valuable as activity estimates are
for understanding what the population is doing, the more important question
for environmental research is the air quality in the location in which the
activity takes place and the length of time spent in that location.
Playing softball in a field next to a toxic waste site has far different
health implications than playing softball in a clean air environment.
Driving on an open country road is much different than driving in a
congested urban location. Working in the presence of smokers is far
different from working in a smoke-free work station.
However, location information is not a missing element in the national
time diary studies done to date. Information is carefully and regularly
collected on "where" each activity takes place, and Figure 2 shows the type
of dynamic location information that can be derived from diary studies. In
this figure, it can be seen that the pattern of the time spent at home for
employed men on a weekday in five countries is basically similar, but can
diverge at certain points during the day. In contrast to men in Western
Europe, for example, relatively few American men are at home between
midnight and 2 a.m. and also between 7 p.m. and midnight (at the end of
Figure 2). It can also be seen that relatively fewer of them
return home for lunch than is true in Western Europe (or in Peru).
Thus, if exposure levels at home (or in other locations) were known to vary
across the day, the diary estimates would provide information that would
reflect these variations. In this way, Figure 2 provides a far richer
source of data than the simple average number of minutes or hours in
certain locations.
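A curve of this kind can be tabulated directly from coded diaries. The short Python sketch below is illustrative only: it assumes that each diary has already been reduced to one location code per hour of the day, and the function name and the "home" code are hypothetical rather than taken from any of the studies cited.

```python
def percent_at_home_by_hour(diaries, home_code="home"):
    """Percentage of respondents at home during each hour of the day.

    diaries: list of 24-item lists, one (hypothetical) location code per hour.
    """
    n = len(diaries)
    return [
        100.0 * sum(1 for day in diaries if day[hour] == home_code) / n
        for hour in range(24)
    ]


# Two made-up respondents: one at home all day, one away from 8 a.m. to 6 p.m.
stay_home = ["home"] * 24
commuter = ["home"] * 8 + ["work"] * 10 + ["home"] * 6
curve = percent_at_home_by_hour([stay_home, commuter])
```

If hourly exposure levels at home were available, the same hour-by-hour layout would allow them to be matched against the proportion of people present.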
AN ENVIRONMENTAL APPROACH
Nonetheless, it is clear that to address environmental issues one
needs to reverse the basic time diary method's current sociological focus
on what the public is doing and how that is changing. With an
environmental focus, the concentration instead needs to be on the nature of
the location in which activities occur. In other words, activities are
useful mainly to the extent that they alert respondents to report more
accurately on where they are doing the activities and the possible exertion
rates during the activity.
FIGURE 2: PERCENTAGES OF PEOPLE AT HOME ACROSS THE DAY
EMPLOYED MEN ON A WEEKDAY
(1965 U.S. National Study of Americans' Use of Time)
That emphasis has been incorporated into the time diary
study now underway on the air quality exposure of residents of the State of
California. The study is funded by the California Air Resources Board
(CARB) and was designed to obtain data useful in estimating exposure to
several pollutant sources on the previous day. The following features have
been incorporated into the survey instrument, which is conducted over the
telephone with a random sample of state residents:
1) It is specifically focused on exposure for a particular day.
Activity coding for that day has been streamlined to highlight
activities of major environmental concern, in particular,
activities that require higher breathing rates, such as sports
activities, or that involve exposure to chemicals, such as
painting and auto repair.
2) For each activity period during the day, respondents are asked
specifically about passive exposure to cigarette smoke or
smoking materials. In that way it is possible to have detailed
information about the length of exposure periods and when they
occur during the day, as well as reminding respondents of
periods of the day when they might not recall being exposed to
smokers.
3) This same feature of diary expansion could have been applied to
all other possibilities of daily exposure to problematic air
quality, but that would have greatly increased the reporting
burden on respondents and lengthened an already time-consuming
instrument. For that reason, a more direct set of questions
regarding potential exposure was also developed for the survey.
These were largely asked after the diary was completed and the
respondent's memory had been refreshed regarding the previous
day's activities. Possible exposure on the previous day to
several sources of pollution was included: gasoline engines,
cars in garages, solvents and cleaning agents, paints, glues,
dry cleaning chemicals, etc. A separate set of
questions asked about exposure at the workplace. These direct
questions were intended to focus respondents' attention on
important aspects of the previous day's activities, which they
might have otherwise overlooked in their attempts to
reconstruct the behavioral details of the previous day in the
diary. These questions are shown in Table 1.
TABLE 1
EXPOSURE QUESTIONS BEFORE DIARY
>wp1< Does your job involve working on a regular basis, that is, once
a week or more often, with:
Gas stoves or ovens?
>wp2< Open flames?
>wp3< Solvents or chemicals?
>wp4< Dust or particles of any sort?
>wp5< Gasoline or diesel-powered vehicles or work equipment?
>wp6< Other air pollutants?
>smok< Did you smoke any cigarettes yesterday—even one?
(If yes) Roughly, how many cigarettes did you smoke
yesterday?
>smok< (CODE OR ASK AS NEEDED)
Did you smoke any cigars or pipe tobacco yesterday?
(If yes) Roughly how many cigars or pipes of tobacco did you
smoke yesterday?
EXPOSURE QUESTIONS AFTER DIARY
Just to be sure we didn't miss any important information, I
have some additional questions about yesterday's activities.
Did you spend ANY time yesterday at a gas station or in a
parking garage or auto repair shop?
>P9ys< (If Yes) About how long in all yesterday did you spend in
those places?
>pgas< Did you pump or pour any gasoline (yesterday)?
>gstv< Did you spend any part of yesterday in a room where a gas range
or oven was turned on?
>nstv< Were you around more than one gas range or oven yesterday, or
only one?
>ms1< Was the gas range or oven you were around for the longest time
yesterday being used for cooking, for heating the room, or
for some other purpose?
>mstm< Roughly how many minutes or hours IN ALL were you in rooms
where gas ranges or ovens were turned on (yesterday)?
>ms2< Does the oven or range that you were around the longest have a
gas pilot light or pilotless ignition?
>gspr< Was the gas range or oven being used for cooking, for heating
the room, or for some other purpose?
>htfl< What kind of heat was it --gas, electricity, oil, or what?
(IF COMBINATION: Which kind did you use most?)
>heat< What type of heater was turned on for the longest amount of
time? Was it a wall furnace, a floor furnace, forced air,
radiator, space heater, or something else?
>open< Were any doors or windows in your home open for more than a
minute or two at a time yesterday?
>opn1< For about how long during the day, that is, from 6 a.m. to
6 p.m., (were they/was it) open?
>opn2< For about how long during evening or night hours, that is, from
6 p.m. to 6 a.m., (were they/was it) open?
>fan1< Did you use any kind of fan in your home yesterday?
>fan2< Was that a ceiling fan, window fan, portable room fan, or
something else?
>airc< (Other than the fan you just mentioned) Did you use any kind of
air cooling system in your home yesterday, such as an air
conditioner?
>ACtp< What type is it?
<1> Evaporative cooler (swamp cooler)
<5> Refrigeration type (air conditioner)
<7> Other (SPECIFY)
>glue< Did you use or were you around anyone while they were using any
of the following yesterday:
Any glues or liquid or spray adhesives?
(NOT INCLUDING ADHESIVE TAPE)
<1> Yes
<5> No
>pnt2< Any water-based paint products (yesterday)?
(ALSO KNOWN AS "LATEX PAINT")
>solv< Any solvents (yesterday)?
>pest< Any pesticides (yesterday) such as bug strips or bug sprays?
>pst2< When you were around pesticides yesterday, were you mostly
indoors or outdoors?
>soap< Any soaps or detergents (yesterday)?
>ocln< Any other household cleaning agents such as Ajax or ammonia
(yesterday)?
>aero< Yesterday, did you use any personal care aerosol spray
products such as deodorants or hair spray, or were you in a
room while they were being used?
>shwr< Did you take a hot shower yesterday?
>bath< Did you take a hot bath or use an indoor hot tub yesterday?
>moth< Are you currently using any of the following in your home:
Any mothballs, moth crystals, or cakes?
>deod< Any toilet bowl deodorizers?
>rmfr< Any SCENTED room fresheners?
All these improvements in time reporting were made possible within
the context of a larger significant improvement in overall diary
methodology. This involves the technology of CATI (Computer-Assisted
Telephone Interviewing), in particular the CATI system developed at the
University of California at Berkeley. With its flexible capacity for
question branching and the ease with which it handles open-end responses,
the Berkeley CATI system is ideally suited for collecting diary data such
as these.
The development of a full CATI diary thus represents a major
breakthrough in diary data collection. No longer is it necessary to record
diary entries in longhand for subsequent coding. No longer is it necessary
to go through an expensive and time-consuming coding process, particularly
the tricky problem of ensuring that all time periods add up to 1,440
minutes. With the parallel development of computer programs to convert the
variable-field diary entries to fixed-field analytic format, tabulations
from the diaries can be made within a day of the time the final study
interview is completed. In our prior diary studies, it could take up to a
year to complete the diary coding and translate the data into fixed-field
format for conventional statistical analysis.
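The bookkeeping that such programs take over can be illustrated with a short sketch. It is not the Berkeley CATI code, which is not reproduced in this paper; the Episode record, the function names, and the 15-minute slot width are all assumptions chosen for illustration. The first function checks that the reported episodes account for exactly 1,440 minutes with no gaps or overlaps, and the second converts the variable-length entries to a fixed number of fields per day.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 1440


@dataclass
class Episode:
    start: int      # minutes after midnight
    end: int        # minutes after midnight (exclusive)
    activity: str   # hypothetical activity code
    location: str   # hypothetical location code


def check_full_day(episodes):
    """True when the episodes tile the day: no gaps, no overlaps, 1,440 minutes."""
    clock = 0
    for ep in sorted(episodes, key=lambda e: e.start):
        if ep.start != clock or ep.end <= ep.start:
            return False
        clock = ep.end
    return clock == MINUTES_PER_DAY


def to_fixed_field(episodes, slot_minutes=15):
    """One location code per fixed-length slot (the code in effect at slot start).

    Slots not covered by any episode are recorded as None.
    """
    slots = []
    for slot_start in range(0, MINUTES_PER_DAY, slot_minutes):
        code = next(
            (ep.location for ep in episodes if ep.start <= slot_start < ep.end),
            None,
        )
        slots.append(code)
    return slots


# A made-up diary with three episodes.
diary = [
    Episode(0, 420, "sleep", "bedroom"),
    Episode(420, 480, "eating", "kitchen"),
    Episode(480, 1440, "work", "office"),
]
is_complete = check_full_day(diary)
fields = to_fixed_field(diary)
```

With 15-minute slots each diary becomes a record of 96 fields, which is the kind of fixed layout that conventional statistical packages expect.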
This is particularly true for the CATI location codes, which as Table
2 shows, are (predominantly) already in closed-end categories. That does
not, as yet, extend to the activity codes, which remain largely in open-end
text. Nonetheless, we have been able to develop a preliminary set of 15
closed activity categories that encompass more than half of the activities
that are reported in diaries. For air quality research purposes, where the
number of activities of interest is far smaller than the 270+ activity codes
that have been developed, it is a simple step to devise a core list of
40-50 activities that capture the needed distinctions, so that the task of
analyzing this facet of time use is as easy as for the location data in
Table 2.
TABLE 2: DISPLAY OF THREE ALTERNATIVE CARB STUDY LOCATION CODES
A. Where in your house were you?
<1> Kitchen                        <7> Garage
<2> Living rm, family rm, den      <8> Basement
<3> Dining room                    <9> Utility/Laundry rm.
<4> Bathroom                       <10> Pool, Spa (outside)
<5> Bedroom                        <11> Yard, Patio, other outside house
<6> Study/office                   <12> Moving from room to room in the house
Other (SPECIFY)
B. Where were you? (if not at home)
<1> Office building, bank, post office
<2> Industrial plant, factory
<3> Grocery store (convenience store to supermarket)
<4> Shopping mall or (non-grocery) store
<5> School
<6> Public bldg. (Library, museum, theater)
<7> Hospital, health care facility, or Dr.'s office
<8> Restaurant
<9> Bar, nightclub
<10> Church
<11> Indoor gym, sports or health club
<12> Other people's home
<13> Auto repair shop, indoor parking garage, gas station
<14> Park, playground, sports stadium (outdoor)
<15> Hotel, motel
<16> Dry cleaners
<17> Beauty parlor; barber shop; hairdressers
<18> At work: no specific main location; moving among locations
Other indoors (SPECIFY)
Other outdoors (SPECIFY)
C. How were you travelling? Were you in a car, walking, in a truck, or
something else?
<1> Car                            <6> Train/rapid transit
<2> Pick-up truck or van           <7> Other truck
<3> Walking                        <8> Airplane
<4> Bus/train/ride stop            <9> Bicycle
<5> Bus                            <10> Motorcycle, scooter
Other (SPECIFY)
At the same time, as a sociologist, I think there are very grave risks
in such a step, because one is then never able to recapture the specific
activity that interviewers choose to lump with other activities or the
behavioral context that leads to specific activity choices. Investigators
who have only a "sports" activity, for example, combine bowling and
shuffleboard with strenuous exercises and sports such as squash or jogging.
Such trade-offs in activity detail need considerable discussion and
deliberation.
Another significant feature of this CARB study is that, rather than
picking particular "typical" seasons or periods of the year, the data
collection period extends across all seasons of the year. Data are spread
from mid-October to mid-December, January and February, April and May, and
June and July. This removes a major stumbling block to interpretation of
the diary data. It is still somewhat short, however, of the effort we made
in our 1985 and 1987 national data collections to spread interviews evenly
across almost all days of the year.
In order to reflect the greater variation of weekend day activities
over weekdays, we have oversampled Saturdays and Sundays in relation to
other days of the week. Where possible, we also interview respondents for
a designated day—that is, if they are not available on Monday to report
about Sunday, we will ask them on Tuesday about Sunday's activities.
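Oversampling of this kind implies that the days of the week must be rebalanced before weekly averages are computed. The paper does not describe the weights used in the CARB study; the sketch below shows one simple possibility, in which each day of the week is weighted so that it contributes one-seventh of the total. The function names and the data are hypothetical.

```python
from collections import Counter


def day_of_week_weights(interview_days):
    """Weight per diary so that every day of the week carries a 1/7 share.

    weight = (target share) / (observed share) = (1/7) / (count / n)
    """
    counts = Counter(interview_days)
    n = len(interview_days)
    return {day: n / (7.0 * count) for day, count in counts.items()}


def weighted_mean(values, interview_days, weights):
    """Mean of diary values after applying the day-of-week weights."""
    total_weight = sum(weights[day] for day in interview_days)
    weighted_sum = sum(v * weights[day] for v, day in zip(values, interview_days))
    return weighted_sum / total_weight


# Made-up sample: weekend days sampled twice as heavily as weekdays.
days = ["Sat"] * 4 + ["Sun"] * 4 + ["Mon", "Tue", "Wed", "Thu", "Fri"] * 2
minutes_at_home = [900 if d in ("Sat", "Sun") else 600 for d in days]
weights = day_of_week_weights(days)
weekly_average = weighted_mean(minutes_at_home, days, weights)
```

In this made-up sample the unweighted mean overstates time at home, because weekend diaries are overrepresented; the weighted mean restores the two-in-seven share of weekend days.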
There is another feature of this study, however, that may be of more
interest to environmental researchers. That is the inclusion of a
Scientific Advisory Panel. In addition to the several staff scientists who
have closely reviewed and commented on the CARB questionnaire, we have
solicited and followed the advice of an appointed panel of outstanding
researchers in the field across the country. The panel, which includes
scientists with expertise in the fields of engineering, economics,
statistics and public health, is drawn from academic institutions,
government agencies and private firms.
The panel's comments have considerably influenced the direction of the
study and specific source questions we have chosen to examine in the study.
However, several excellent recommendations could not be followed, and must
await future research opportunities. At the same time, there is no
logistical reason to prevent their being incorporated into the current CATI
instrumentation.
A final important feature of this study has been the inclusion of
children aged 12-18 in the household among those eligible for random
selection. What is currently under consideration is an additional study to
include, for the first time, diary data on all children under the age of 12.
Many of these data would need to be reported by adults rather than by the
children themselves; however, pilot testing indicates that parents keep
fairly close watch on what their children are doing, or at least where they
are.
SOME RECOMMENDATIONS FOR FUTURE RESEARCH
These developments in the CARB study give great promise for a fully
integrated study, one that would not depend on secondary analysis of
sociological data but would be specifically designed for air quality
estimation purposes. The CATI format can easily be adapted for a national
sample, and can be easily reprogrammed to include exposure questions on
several alternative sources of air pollution. The use of split samples,
with questions on some pollutants asked of some respondents and questions
on different pollutants posed to other respondents, can again be easily
accomplished with this CATI instrumentation.
However, all of this needs to be accomplished within a more
comprehensive model of air pollution exposure and validated using basic
observational data on the actual circumstances of exposure across the day.
To reduce field costs, these observational studies need to be conducted in
a limited number of randomly selected sites across the country. These
observational studies will require expensive personal monitoring equipment
to validate the impressions obtained from the telephone instrument
approach. To the extent possible, technical observers could follow
respondents on their round of daily activities to isolate actual high
incidence of exposure for as many sources of risk as present field
instrumentation would allow.
In that way, we could achieve the single objective of the two
methods—the accuracy and internal validity of personal exposure monitors
and the external validity of the diary method. We would have a fully
integrated set of estimates to generalize to the national population with
considerably more accuracy than we have now. With these norms in hand, the
next step of identifying respondents at high risk would then be possible.
The work described in this paper was not funded by the U.S.
Environmental Protection Agency and therefore the contents do not
necessarily reflect the views of the Agency and no official endorsement
should be inferred.
REFERENCES
Chapin, S. (1974). Human Activity Patterns in the City: Things People Do
in Time and in Space. John Wiley, New York.
Hartwell, T. (1984). Study of carbon monoxide exposure of residents of
Washington, D.C., and Denver, Colorado. EPA-600/S4-84-031,
PB84-182516, Environmental Monitoring Systems Laboratory, U.S.
Environmental Protection Agency, Research Triangle Park, North
Carolina.
Johnson, T. L. (1985). A study of personal exposure to carbon monoxide in
Denver, Colorado. EPA-6004-84-015, OB-840146-12. Environmental
Protection Agency, Research Triangle Park, North Carolina.
Juster, F. T. (1985). The validity and quality of time use estimates
obtained by recall diaries. In Time, Goods, and Well-Being (ed.)
F. T. Juster and F. P. Stafford. Institute for Social Research, The
University of Michigan, Ann Arbor.
Robinson, J. (1977). How Americans Use Time: A Social Psychological
Analysis. (Further analyses were published in How Americans Used
Time in 1965-66. University Microfilms, Monograph Series, Ann Arbor.)
Robinson, J. (1976). Changes in Americans' Use of Time: 1965-1975.
Cleveland State University, Communication Research Center, Cleveland,
Ohio.
Robinson, J. (1985). Testing the validity and reliability of diaries
versus alternative time use measures. In Time, Goods, and Well-Being
(ed.) F. T. Juster and F. P. Stafford. Institute for Social
Research, The University of Michigan, Ann Arbor.
Robinson, J. (1983). Environmental differences in how Americans use time:
the case for subjective and objective indicators. Journal of
Community Psychology, Vol. 11, pp. 171-189.
Szalai, A. et al. (1972). The Use of Time. The Hague: Mouton.
-------
HUMAN ACTIVITY PATTERNS: A REVIEW OF THE LITERATURE FOR
ESTIMATING TIME SPENT INDOORS, OUTDOORS, AND IN TRANSIT
by: Wayne R. Ott
Chief, Air, Toxics, and Radiation Staff
Office of Research and Development (RD-680)
U.S. Environmental Protection Agency
Washington, D.C. 20460
ABSTRACT
This paper reviews field surveys of human activity patterns and "time
budgets" in the U.S. and other countries published in the sociological,
transportation, and environmental literature. This review emphasizes the
use of these activity data for assessing human exposure to environmental
chemicals. Although many previous activity pattern field surveys have been
conducted in fields outside the environmental sciences, few of these have
collected the kind of data needed to construct human exposure models for
environmental exposure assessment. Using previous studies as a data
source, this paper estimates approximately the times people spend in three
general categories of microenvironments: indoors, outdoors, and in
transit. From U.S. nationwide activity pattern surveys, employed persons
spend the following proportion of their time in three categories: indoors
(home, work, or other locations), 92%; in-transit, 6%; and outdoors, 2%.
By comparison, U.S. housewives spend the following proportion of time in
three categories: indoors (home or other locations), 94.1%; in-transit,
4.2%; and outdoors, 1.7%. The existing activity pattern literature reveals
that man is primarily an indoor creature.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
INTRODUCTION
Information on the time people spend in certain microenvironments is
important for estimating population exposures to air pollution. A large
number and variety of studies in which data on human activities were
collected from real population samples have been completed over the last 60
years (1-38) (Table 1). These studies may be useful in determining the amount
of time that individuals spend in particular locations throughout the day.
Such data are required in the human exposure-activity pattern models that
have been developed in the 1980's (39-44). To calculate a person's 24-hour
exposure, such a model requires information on the person's activities and
locations visited throughout the day. Such information, if obtained from a
diary, can be coded to become an "activity profile" on the computer,
providing a record of each person's activities by time of day. Then the
computer uses this record to generate a corresponding estimate of a
person's exposure by time of day, thus producing an "exposure profile." A
number of exposure profiles can be generated, and the highest hourly
exposure (or 8-hour exposure) for each person can be obtained, producing a
frequency distribution of exposures for the entire population. The total
human exposure field studies conducted in the 1980s using personal
monitoring (31-34, 45-46) all share a common finding: the activities of a person
are the most important determinant of the person's exposure to
environmental pollution. Thus, activity pattern data and exposure modeling
are essential ingredients of risk assessments.
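The profile calculation described above can be sketched as follows. This is a minimal illustration and not one of the published exposure models: an hourly time step is assumed, and the microenvironment labels and the concentration values assigned to them are invented placeholders, not measured data.

```python
def exposure_profile(activity_profile, concentrations):
    """Translate an hourly activity profile into an hourly exposure profile.

    activity_profile: 24 microenvironment labels, one per hour of the day.
    concentrations:   assumed concentration for each microenvironment.
    """
    return [concentrations[place] for place in activity_profile]


def max_running_mean(values, window):
    """Highest average over any run of `window` consecutive hours."""
    best = None
    for start in range(len(values) - window + 1):
        mean = sum(values[start:start + window]) / window
        if best is None or mean > best:
            best = mean
    return best


# Hypothetical concentrations (arbitrary units) and one made-up person-day.
concentrations = {"indoors": 1.0, "in transit": 5.0, "outdoors": 2.0}
activity = (
    ["indoors"] * 7 + ["in transit"] + ["indoors"] * 9
    + ["in transit"] + ["outdoors"] + ["indoors"] * 5
)
profile = exposure_profile(activity, concentrations)
highest_1_hour = max_running_mean(profile, 1)
highest_8_hour = max_running_mean(profile, 8)
daily_mean = sum(profile) / len(profile)
```

Repeating the calculation for every diary in a sample and collecting the highest 1-hour or 8-hour value for each person gives the frequency distribution of exposures referred to in the text.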
Table 1
STUDIES OF HUMAN ACTIVITIES AND TIME BUDGETS
REFERENCE | LOCATION | DATE | SURVEY SAMPLE | APPROACH | COMMENTS
Lundberg, Komarovsky, and McInerny (1) | Westchester County, NY | Nov - May 1931 - 1932 | 2,460 persons, including school children | 3-day diaries, with a few 7-day diaries covering a total of 4,460 days | Part of an in-depth study of life values and leisure in Westchester County
Sorokin and Berger (2) | Boston, MA | May - Nov 1935 | Approximately 100 persons aged 17 or older | Diaries over a 4-week period, with 1 form per day | A total of 3,472 forms was generated, giving considerable detail on activities
de Grazia (3) | U.S. national sample | Mar - Apr 1954 | Probability sample of 7,000 households | 2-day diaries, with 15-minute time periods from 6 a.m. to 11 p.m. | Analysis focused on the use of leisure time (TV, radio) as a function of age
Chapin and Hightower (4) | Durham, NC | Oct - Nov 1963 | Small nonsystematic sample of adults | 20-minute questionnaire, plus 20-minute game | Limited experimental survey for generating hypotheses about discretionary time
Szalai (5-6) | 12 countries | 1965 - 1966 | 24,392 persons in 13 surveys in 12 countries | 1-day diary, with before-and-after visit to residence by interviewer | Standardized sampling, interviewing, coding, and analysis to compare activities in different countries
Robinson (7-8) | U.S. national sample and Jackson, Mich. | 1965 - 1966 | 1,244 adults in national sample and 788 adults in Jackson, Mich., sample | 1-day diary, with before-and-after visit to residence by interviewer | Included both primary and secondary activities as well as location and social company
Chapin and Brail (9) | U.S. national sample | 1966 | 1,467 households in 43 SMSAs and 48 states | Interview about activities on the previous day | Part of a study of residential preferences. Employed 13-class activity classification system
Chapin (10) | Washington, DC | Spring 1968 | 1,667 respondents | 1-hour interview about activities on the previous weekday and weekend | Differentiated between discretionary and obligatory activities
Brail and Chapin (11) | U.S. national sample | 1969 | 1,199 households | Repeat survey on the 1966 population sample (see Chapin and Brail) | Little difference observed in activity patterns between the 1966 and 1969 surveys
U.S. Dept of Transportation and Bureau of Census (12-22) | U.S. national sample | 1969 - 1970 | 6,000 households | Home-based interview | Detailed interview conducted by Census Bureau with emphasis on travel activities
Flachsbart, Baer, and Schalman (23) | Los Angeles County, CA | Mar - Jun 1972 | 244 persons aged 16 or older, non-representative sample | Interview used cards to describe frequency and duration of 80 activities in the past year | Small-scale study to examine the hypothesis that the residential environment facilitates or hinders certain activities
Michelson and Reed (24) | Toronto, Canada | 1973 - | 591 families of child-bearing years who moved | 1 interview before the move and 2 afterward | Designed to assess changes in activities as a result of the move
Bullock, Dickens, Shapcott, and Steadman (25) | Reading, England | Mar 1973 | 806 persons aged 16-70 provided 450 usable diaries | 7-day diaries with assistance provided on the first day | Data collected to validate a model developed by Tomlinson, Bullock, et al.
British Broadcasting Corp (26) | England, Scotland, and Wales | 1974 - 1975 | 1,822 persons in winter and 1,723 persons in summer | 7-day diary | Diary format divided into 42 half-hour time bands (emphasized TV viewing)
Robinson (27) | U.S. national sample | Oct - Dec 1975 | 1,519 persons aged 18 or older | Interview about activities on the previous day | Sample was more completely representative of the U.S. population than the 1965-1966 study by Robinson
U.S. Dept of Transportation and Bureau of Census (28) | U.S. national sample | 1978 | Approximately 20,000 households | Home-based interview | Expansion and update of initial 1969-1970 Nationwide Personal Transportation Study (NPTS)
Juster et al. (29) | U.S. national sample | Feb - Dec 1981 | 620 adults, 492 children aged 3-7 | Personal interviews and telephone interviews | Follow-up study of households from the 1975 time use study
Letz and Soczek (30) | Kingston-Harriman, TN | Jun - Sep 1981 | 322 persons, including some children | Telephone interviews using a "yesterday recall" diary | Microenvironments were coded by Home, Work, Travel for exposure modeling purposes
Johnson (31-32) | Denver, CO | Dec - Feb 1982 - 1983 | Probability sample of 452 persons, 2 days each | 2-day diary with personal monitor | Activity changes recorded as part of U.S. EPA CO personal exposure field study
Hartwell et al. (33-34) | Washington, DC | Dec - Feb 1982 - 1983 | Probability sample of 712 persons | 1-day diary with personal monitor | Activity changes recorded as part of U.S. EPA CO personal exposure field study
Klinger and Kuzmyak (35) | U.S. national sample | 1983 - 1984 | Probability sample of 6,438 households in 50 states and DC | Home-based interview | Follow-up to NPTS surveys conducted in 1969-1970 and 1977-1978 (extensive information on personal travel habits)
Johnson (36) | Cincinnati, OH, metropolitan area | Mar & Aug 1985 | Probability sample of 973 persons | Activity diaries for 3 consecutive 24-hour periods plus questionnaires | Approximately 2,000 subject-days for use with air pollution activity pattern-exposure models
Robinson and Holland (37) | U.S. national sample | Jan 1985 - Jan 1986 | 4900 adults aged over 18, 300 adolescents over 12 | 1-day "yesterday recall" diary; 1-day prospective diary by mail | Included U.S. EPA questions on dwelling unit characteristics
Wiley and Robinson (38) | State of California | Oct 1987 - Jul 1988 | 180 persons aged over 12 | 1-day "yesterday recall" diary on telephone | Very complete location codes. Emphasized air pollution sources (smoking, hot showers, gasoline engines, solvents, air fresheners)
-------
TIME BUDGET STUDIES
When one examines the literature on human activities, the term "time
budget" ("zeitbudget," "budget de temps") frequently is encountered. A
time budget, which is conceptually similar to a person's money budget,
summarizes the amount of time an individual spends in each of many
activities over some time period (a day or a week). As Michelson noted (47), a
time budget contains considerable detail on a person's activities,
including the locations in which the activities take place:
A time budget is a record, presented orally or on paper, of what a
person has done during the course of a stated period of time. It
usually covers a 24-hour day or multiples thereof. The record is
taken down with precision and detail, identifying what people have
done with explicit reference to exact amounts of time. It is usually
presented chronologically through the day, beginning with the time
that a person gets up in the morning.
The information that is normally gathered in a time budget consists
of the time an activity began, the time it ended, the nature of the
activity per se, the persons who were present and active in the given
activities, and, not least, the exact location where the activity
took place.
One way of obtaining time budget information from the population
surveyed is by having each respondent maintain a diary over a 24-hour
period or longer. In another approach, the so-called "yesterday" survey
approach, the interviewer asks each respondent about his or her activities
on the "day before." Once the diaries or questionnaires are completed,
they are collected, and the activities are coded according to some
systematic procedure. Then the results are tabulated, and the average
number of hours devoted to each activity, or class of activities, is
summarized for the population sample as a whole, or for various subgroups.
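The tabulation step can be sketched in a few lines. The example is a minimal illustration under stated assumptions: each coded diary is represented as a list of (activity, minutes) pairs, and the activity labels and the function name are hypothetical rather than drawn from any of the coding schemes discussed here.

```python
from collections import defaultdict


def time_budget(diaries):
    """Average hours per day spent in each activity across all respondents.

    diaries: list of diaries, each a list of (activity, minutes) pairs.
    An activity that a respondent did not report counts as zero for that person.
    """
    total_minutes = defaultdict(float)
    for episodes in diaries:
        for activity, minutes in episodes:
            total_minutes[activity] += minutes
    n = len(diaries)
    return {activity: total / 60.0 / n for activity, total in total_minutes.items()}


# Two made-up one-day diaries, each accounting for 1,440 minutes.
sample = [
    [("sleep", 480), ("work", 480), ("other", 480)],
    [("sleep", 420), ("work", 540), ("tv", 120), ("other", 360)],
]
budget = time_budget(sample)
```

Because every diary accounts for a full day, the averages across activities sum to 24 hours, which provides a simple check on the coding. Subgroup budgets follow by applying the same function to the diaries of each subgroup.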
Several summaries of the historical development of time budget
research appear in the literature, including the literature review prepared
by Ottensmann (48), the article by Szalai (49), the paper by Converse (50), and the
book by Chapin (51). The present paper discusses the literature on activity
patterns primarily in the context of estimating air pollution exposure.
EARLY ACTIVITY PATTERN STUDIES
Perhaps the original interest in time budget research evolved from
the industrial revolution and the importance it attached to time: "Time is
money." Probably the first studies of human activities were time-and-
motion studies of industrial workers in the early part of the century. It
appears that time budget research of industrial workers in Moscow in the
1920's represents the first attempt to collect data on individual
activities over a 24-hour period (48). Also in the 1920's, the Bureau of Home
Economics of the U.S. Department of Agriculture conducted research into how
farm women used time (49).
In the 1930's, several important time budget studies were carried
out, including an in-depth study by Lundberg, Komarovsky, and McInerny (1) of
the leisure time activities of residents of Westchester County, New York.
As part of a study of the residents' participation in clubs, churches,
schools, the arts, and other leisure activities, these investigators
collected diaries that ranged from 3 to 7 days on 2,460 persons (Table 1).
Later in the decade, Sorokin and Berger (2) distributed 1-day diary forms to
approximately 100 residents in Boston over a 4-week period, generating
3,472 daily report forms. Sorokin and Berger's work is interesting because
it attempts to examine the motivation for activities, social context, and
the predictability of behavior. Despite the nonrepresentativeness of the
sample and a poor response rate, the book describing their study is
considered a landmark in time budget research. According to Szalai (49),
"This book probably did more than any other at that time to popularize the
time budget process as a sociological method of investigation." Also in
the 1930's, McCormick (52) suggested that time budgets ultimately could be used
to compare cultures. As Ottensmann noted (48), it took more than 30 years for
McCormick's suggestion to be carried out.
Szalai49 points to the large number of time budget research studies
undertaken in other countries, particularly in France, beginning in the
1940's. In addition, time budget studies have been undertaken in Japan,
other Western European countries, the Soviet Union, and the eastern
European socialist countries. In Hungary, for example, the Central
Statistical Office conducted a time budget study of a national (2%) sample
of 12,000 persons as part of its "micro-census" in 196349. This apparently
was the first time budget study that was part of an official census. In
the 1950's, several studies were completed in Britain that focused on the
time spent by housewives in various activities throughout the day.48
Walker53 continued the studies of housewives in the United States, reporting
that, despite increased use of labor-saving devices, "homemaking still
takes time." In addition to the interest researchers and government
agencies expressed in changes in leisure time activities as a result of
affluence, technological innovation, and reductions in the number of hours
worked per week, another user of time budget information appeared in the
1950's. The broadcasting industry saw a need for information on the amount
of time that people spend in various activities in order to determine the
audience available throughout the day for its radio and television
programs. de Grazia3 discusses the results of a time budget study
commissioned by the Mutual Broadcasting Company in 1954 of the activities
of a U.S. national probability sample of 7,000 families.
MULTINATIONAL COMPARATIVE TIME BUDGET RESEARCH PROJECT
A time budget study that was impressive both for its scale and its
importance for international cooperation was the Multinational Comparative
Time Budget Research Project5. This project, launched in September 1964 by
a small international group of social scientists, employed common
principles for sampling, interviewing, coding, and tabulating the data.
The population sample consisted of nearly 25,000 persons in 12 countries
(Belgium, Bulgaria, Czechoslovakia, France, East Germany, West Germany,
Hungary, Peru, Poland, Union of Soviet Socialist Republics, United States,
and Yugoslavia).
The data were collected by asking each individual in the sample to
record his or her primary activities for a "complete day," referred to as
"day n." Information also was requested on any parallel activities carried
out at the same time, the location where each activity was performed, and
the persons in whose company it was performed. Once a respondent was
selected, the interviewer provided a self-recording form on "day n - 1" to
be used on "day n." The investigator returned on "day n + 1" and, by means
of an interview, checked, corrected, and completed the form filled out by
the subject. Thus the investigator obtained a "recapitulatory" interview
in addition to the raw diary data. If for any reason the form had not been
filled out, the investigator on "day n + 1" interviewed the subject about
"day n" activities, thus obtaining a "spontaneous interview." In each
case, a supplementary questionnaire at the close of the visit was used to
obtain information for each respondent on personal and demographic
characteristics5.
The multinational study developed an activity coding system
consisting of 100 categories of activities represented by a 2-digit code
(from 00 to 99). The activities represented by these codes can be grouped
into 10 larger classes: (1) working time and related activities, (2)
domestic housework, (3) care of children, (4) purchasing of goods and
services, (5) private needs such as meals and sleep, (6) adult education
and professional training, (7) civic and collective participation
activities, (8) spectacles, entertainment, and social life, (9) sports and
active leisure, and (10) passive leisure.6 An example of the activities
within a particular class, domestic housework, shows that the activities
range from preparation and cooking of food to gardening and taking care of
animals (Table 2)5.
Table 2. Activities in the Domestic Housework Category of the
Multinational Time Budget Project5

Code   Activity
10     Preparation and cooking of food
11     Washing and putting away utensils
12     Interior cleaning (sweeping, washing, making beds)
13     Exterior cleaning (pavement or sidewalk)
14     Washing and ironing linen
15     Repair and maintenance of clothes, linen, shoes
16     Miscellaneous repair and maintenance work
17     Gardening, taking care of animals (not for profit)
18     Maintenance and provisioning for heating and water
19     Miscellaneous (adding up accounts, tidying up papers,
       usual attentions paid to members of the household)
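
The two-digit coding scheme lends itself to a simple lookup. The sketch below is illustrative only. It assumes that each of the 10 classes occupies one block of ten consecutive codes, in the order the classes are listed in the text; Table 2 confirms this only for domestic housework (codes 10 to 19). The names ACTIVITY_CLASSES and activity_class are hypothetical and do not come from the project documentation.

```python
# Hypothetical lookup from a two-digit activity code to its broad class.
# Assumption: class k (1-10, in the order listed in the text) covers
# codes 10*(k-1) through 10*(k-1)+9.
ACTIVITY_CLASSES = (
    "working time and related activities",
    "domestic housework",
    "care of children",
    "purchasing of goods and services",
    "private needs such as meals and sleep",
    "adult education and professional training",
    "civic and collective participation activities",
    "spectacles, entertainment, and social life",
    "sports and active leisure",
    "passive leisure",
)


def activity_class(code: int) -> str:
    """Return the broad class for a two-digit activity code (00-99)."""
    if not 0 <= code <= 99:
        raise ValueError(f"activity code out of range: {code}")
    return ACTIVITY_CLASSES[code // 10]


# Code 17 ("Gardening, taking care of animals") falls in domestic housework.
print(activity_class(17))
```

Grouping by the tens digit keeps the 100 detailed codes available while allowing tabulation at the level of the 10 classes.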
The Multinational Comparative Time Budget Research Project yielded a
rich data base that is summarized in a number of tables, figures, and
articles by various participants in Szalai's book6. For example, data from
the appendix of the book show the average time employed men, employed
women, and married housewives spend in various locations in 12 countries
(Table 3). Employed men in the 12 countries spend between 12 hours (in
Hungary) and 15.2 hours (in Belgium) inside their homes, while housewives
spend between 19.6 hours (in USSR) and 21.7 hours (in France) inside their
homes6. For the 12 countries, employed men average between 50% and 63% of
the day inside their homes, compared to between 82% and 90% for housewives.
It is difficult to determine the overall amount of time spent indoors
from these data, because categories such as "at one's workplace" do not
distinguish between indoor and outdoor workplaces. Similarly, the
categories "in place of business" and "in all other locations" do not
specify whether they are indoors or outdoors. However, I have estimated
the amount of time respondents spent in three general categories (indoors,
outdoors, and in transit) under the following assumptions:
• The categories "inside one's home", "at one's work place", "in
other people's homes", "in places of business", and "in
restaurants and bars" are assumed to be entirely indoors.
• The categories "just outside one's home" and "in all other
locations" are entirely outdoors.
• The category "in transit" is neither indoors nor outdoors.

Table 3. Time Spent in Various Locations in 12 Countries(a)
(Average Hours per Day, All Days of the Week)

                             01   02   03   04   05   06   07   08   09   10   11   12   13   14   15
EMPLOYED MEN, ALL DAYS
inside one's home          15.2 12.5 14.3 13.6 13.6 14.2 13.8 12.0 12.9 14.0 13.4 13.6 13.4 12.9 13.0
just outside one's home     0.5  0.7  0.3  0.3  1.0  0.5  0.4  1.0  0.1  0.2  0.2  0.3  0.3  0.5  1.4
at one's work place         5.0  7.7  5.9  7.2  5.4  5.1  6.8  7.5  6.4  7.0  6.7  6.5  6.8  7.1  6.1
in transit                  1.5  2.1  1.6  1.5  1.7  2.2  1.7  2.0  2.5  1.7  1.6  1.5  2.0  1.8  2.2
in other people's home      0.5  0.2  0.3  0.5  0.5  0.6  0.3  0.3  0.5  0.5  0.5  0.6  0.2  0.7  0.5
in places of business       0.7  0.6  0.6  0.5  0.4  0.4  0.6  0.4  0.7  0.4  0.7  0.7  0.4  0.5  0.5
in restaurants and bars     0.2  0.0  0.1  0.2  0.5  0.4  0.1  0.2  0.3  0.0  0.4  0.4  0.2  0.2  0.0
in all other locations      0.4  0.2  0.9  0.2  0.9  0.6  0.3  0.6  0.6  0.2  0.5  0.4  0.7  0.3  0.3
total                      24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0

EMPLOYED WOMEN, ALL DAYS
inside one's home          17.1 14.6 16.0 15.3 17.0 16.7 16.7 14.5 16.1 15.0 15.4 15.3 14.0 15.0 15.0
just outside one's home     0.1  0.3  0.2  0.0  0.7  0.2  0.2  0.3  0.4  0.1  0.0  0.1  0.1  0.3  0.4
at one's work place         3.6  6.5  5.1  6.3  3.6  3.6  4.9  6.8  4.4  5.8  5.2  5.0  6.7  6.1  6.4
in transit                  1.2  1.6  1.3  1.1  1.1  1.3  1.1  1.4  1.8  1.5  1.3  1.3  1.7  1.4  1.5
in other people's home      0.4  0.2  0.2  0.5  0.4  0.9  0.2  0.3  0.3  0.6  0.7  0.6  0.2  0.6  0.2
in places of business       1.0  0.6  0.8  0.6  0.6  0.8  0.7  0.5  0.7  0.8  0.9  1.1  0.6  0.4  0.4
in restaurants and bars     0.2  0.0  0.0  0.1  0.2  0.3  0.1  0.0  0.1  0.0  0.2  0.2  0.2  0.0  0.0
in all other locations      0.4  0.2  0.4  0.1  0.4  0.2  0.1  0.2  0.2  0.2  0.3  0.4  0.5  0.2  0.1
total                      24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0

HOUSEWIVES, ALL DAYS (Married Only)
inside one's home          21.6 20.4 20.9 21.7 20.4 20.5 21.3 19.7 21.0 20.9 20.5 20.9 19.6 20.5 19.7
just outside one's home     0.2  1.4  0.3  0.1  0.8  0.4  0.3  2.1  0.5  0.1  0.1  0.1  0.4  0.8  2.3
in transit                  1.0  0.9  1.2  1.0  1.0  1.0  1.0  0.9  1.2  1.2  1.0  0.9  1.9  1.5  1.1
in other people's home      0.4  0.4  0.3  0.5  0.6  0.6  0.3  0.2  0.4  0.5  0.8  0.7  0.7  0.7  0.3
in places of business       0.5  0.7  1.1  0.6  0.7  1.1  0.9  0.9  0.7  1.2  1.2  1.1  1.1  0.4  0.5
in restaurants and bars     0.1  0.1  0.0  0.0  0.1  0.1  0.0  0.0  0.0  0.0  0.1  0.1  0.0  0.0  0.0
in all other locations      0.2  0.1  0.2  0.1  0.4  0.3  0.2  0.2  0.2  0.1  0.3  0.2  0.3  0.1  0.1
total                      24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0 24.0

Cities: 01 Belgium
        02 Kazanlik, Bulgaria
        03 Olomouc, Czechoslovakia
        04 Six cities, France
        05 100 districts, West Germany
        06 Osnabruck, West Germany
        07 Hoyerswerda, East Germany
        08 Gyor, Hungary
        09 Lima-Callao, Peru
        10 Torun, Poland
        11 Forty-four cities, USA
        12 Jackson, Michigan, USA
        13 Pskov, USSR
        14 Kragujevac, Yugoslavia
        15 Maribor, Yugoslavia

(a) In this table from Section VII.3, "Distribution of Daily Time According to Different Locations," Tables 7-1.1 to
7-1.3, p. 795, Szalai6, data were weighted to ensure equality of days of the week and numbers of eligible
respondents per household.

With these assumptions and the data restructured accordingly (Table
4), employed men in the 12 countries are seen to spend between 84% (in
Maribor, Yugoslavia) and 92% (in France) indoors, compared to between 89%
(in Maribor, Yugoslavia) and 97% (in France and Torun, Poland) for
housewives. However, many of the entries in Table 4 cannot be compared on
a statistical basis, because the number of respondents in each sample
varies. Also, the representativeness varies. For example, the Soviet
Union is represented by a single city and its suburbs (Pskov, population
115,000), while the United States is represented by a national sample of 44
metropolitan areas. Finally, the assumptions used to develop Table 4 need
to be examined because they may introduce error. However, the estimates in
Table 4 appear useful as a rough approximation of the actual times spent by
residents of 12 countries indoors, outdoors, and in transit, and they are
used in the remainder of this paper.
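
The restructuring of Table 3 into the three categories of Table 4 can be sketched as follows. This is a minimal illustration of the stated assumptions, not the author's actual procedure; the function and constant names are hypothetical. The input values are the Table 3 entries for employed men in the 44 U.S. cities.

```python
# Assumed mapping of Szalai's location categories to the three
# environmental categories, following the assumptions stated in the text.
INDOORS = {
    "inside one's home",
    "at one's work place",
    "in other people's home",
    "in places of business",
    "in restaurants and bars",
}
OUTDOORS = {"just outside one's home", "in all other locations"}
TRANSIT = {"in transit"}


def aggregate(hours: dict) -> dict:
    """Sum hours per day into indoors, outdoors, and transit totals."""
    totals = {"indoors": 0.0, "outdoors": 0.0, "transit": 0.0}
    for location, value in hours.items():
        if location in INDOORS:
            totals["indoors"] += value
        elif location in OUTDOORS:
            totals["outdoors"] += value
        elif location in TRANSIT:
            totals["transit"] += value
        else:
            raise KeyError(f"unclassified location: {location}")
    return totals


# Table 3, employed men, 44 U.S. cities (average hours per day).
US_EMPLOYED_MEN = {
    "inside one's home": 13.4,
    "just outside one's home": 0.2,
    "at one's work place": 6.7,
    "in transit": 1.6,
    "in other people's home": 0.5,
    "in places of business": 0.7,
    "in restaurants and bars": 0.4,
    "in all other locations": 0.5,
}

totals = aggregate(US_EMPLOYED_MEN)
shares = {name: 100.0 * value / 24.0 for name, value in totals.items()}
for name in totals:
    print(f"{name}: {totals[name]:.1f} h ({shares[name]:.1f}% of the day)")
```

Raising an error for an unclassified location makes it explicit that every category must be assigned to exactly one of the three groups before the totals are trusted.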
Activities of the U.S. Population
If only the data for the United States (44 cities) are considered,
employed men, on the average, spend 90% of the day indoors, versus 95% for
married housewives (Table 4). Overall, employed men in the United States
are estimated to spend 2.9% of the day outdoors, versus 1.7% for
housewives.
To estimate the time that employed people in the U.S. spend in various
activities, it is necessary to combine the times from Szalai's categories
in Table 3. For example, U.S. employed men (44 U.S. cities in Table 3)
spend an average of 13.4 hours indoors at home (all days of the week),
versus 15.4 hours for U.S. employed women. If we weight both sexes equally,
then the category "employed U.S. persons" spends an average of 14.4 hours
per day indoors at home, or 60% of the time (Table 5 and Figure 1). The
category indoors at home (IH) combines being in one's own home (60%) with
being in other people's homes (2.5%), for a total of 62.5%. Similarly, U.S.
employed men spend 6.7 hours per day at their workplace, and employed women
5.2 hours. Assuming, as in the above discussion, that all workplaces are
indoors, U.S. employed persons spend, on the average, 5.95 hours, or 24.8%
of their day, indoors at their workplaces. Thus, the category indoors at
work (IW) consists of being at one's workplace (24.8%) and being in places
of business (3.3%), for a total of 28.1%.
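
The equal weighting of the two sexes described above amounts to a simple average followed by conversion to a percentage of the 24-hour day. The following sketch reproduces that arithmetic for four of the Table 3 categories (44 U.S. cities); the variable and function names are illustrative assumptions.

```python
HOURS_PER_DAY = 24.0

# Table 3 entries for the 44 U.S. cities (average hours per day).
MEN = {
    "inside one's home": 13.4,
    "at one's work place": 6.7,
    "in other people's homes": 0.5,
    "in places of business": 0.7,
}
WOMEN = {
    "inside one's home": 15.4,
    "at one's work place": 5.2,
    "in other people's homes": 0.7,
    "in places of business": 0.9,
}


def equal_weight_average(men: dict, women: dict) -> dict:
    """Average the two groups, giving each sex the same weight."""
    return {key: (men[key] + women[key]) / 2.0 for key in men}


def percent_of_day(hours: float) -> float:
    """Express hours per day as a percentage of a 24-hour day."""
    return 100.0 * hours / HOURS_PER_DAY


average = equal_weight_average(MEN, WOMEN)
percent = {key: percent_of_day(value) for key, value in average.items()}

# Combined categories used in the text.
indoors_home = percent["inside one's home"] + percent["in other people's homes"]
indoors_work = percent["at one's work place"] + percent["in places of business"]

print(f"indoors at home (IH): {indoors_home:.1f}%")
print(f"indoors at work (IW): {indoors_work:.1f}%")
```

Equal weighting ignores the actual proportions of men and women in the employed population, so the result should be read as an approximation.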
U.S. employed men spend 1.6 hours per day in transit (T) microenvironments,
and employed women 1.3 hours. The average, 1.5 hours, is 6% of each person's
day. The smallest category is "just outside one's home" (OH); U.S. employed
persons spend an overall average of 0.1 hours in this category (men: 0.2
hours; women: 0.0), constituting only 0.4% of the day. If we assume that
"all other locations" (OO) is entirely outdoors and combine it with being
just outside one's home (OH), then the total time spent outdoors by employed
persons in the U.S. is 0.4 + 1.7 = 2.1% (30 minutes), a relatively small
proportion of the day. This 30 minutes reflects the total time it takes
people to walk
from home to an automobile, walk from the parking lot to a place of
employment and return, and a great variety of other brief outdoor
activities.

Table 4. Estimated Time Spent in Three Environmental Categories (Average Hours per Day)(a)

                                       Employed Men                Housewives(b)
Country                          Indoors Outdoors Transit    Indoors Outdoors Transit
Belgium                            21.6     0.9     1.5        22.6     0.4     1.0
Bulgaria (Kazanlik)                21.0     0.9     2.1        21.6     1.5     0.9
Czechoslovakia (Olomouc)           21.3     1.1     1.6        22.3     0.5     1.2
France (Six Cities)                22.0     0.5     1.5        22.8     0.2     1.0
West Germany (100 Districts)       20.4     1.9     1.7        21.8     1.2     1.0
West Germany (Osnabruck)           20.7     1.1     2.2        22.3     0.7     1.0
East Germany (Hoyerswerda)         21.6     0.7     1.7        22.5     0.5     1.0
Hungary (Gyor)                     20.4     1.6     2.0        20.8     2.3     0.9
Peru (Lima-Callao)                 20.8     0.7     2.5        22.1     0.7     1.2
Poland (Torun)                     21.9     0.4     1.7        22.6     0.2     1.2
United States (44 Cities)          21.7     0.7     1.6        22.6     0.4     1.0
United States (Jackson, Mich.)     21.8     0.7     1.5        22.8     0.3     0.9
U.S.S.R. (Pskov)                   21.0     1.0     2.0        21.4     0.7     1.9
Yugoslavia (Kragujevac)            21.4     0.8     1.8        21.6     0.9     1.5
Yugoslavia (Maribor)               20.1     1.7     2.2        20.5     2.4     1.1

(a) Derived by the author from data originally published in Szalai6,
Tables 7-1.1 to 7-1.3, p. 795; data are weighted to ensure equality of
days of the week and number of eligible respondents per household.
(b) Married persons only

Table 5. Time Spent by Employed Persons in Various Locations
in 44 U.S. Cities(a)
(Average Hours Per Day)

                                       Employed   Employed
Category  Location                       Men       Women     Average      %
IH        Inside One's Home              13.4       15.4       14.4      60.0
OH        Just Outside One's Home         0.2        0.0        0.1       0.4
IW        At One's Workplace              6.7        5.2        5.9      24.6
T         In Transit                      1.6        1.3        1.5       6.2
IH        In Other People's Homes         0.5        0.7        0.6       2.5
IW        In Places of Business           0.7        0.9        0.8       3.3
IO        In Restaurants and Bars         0.4        0.2        0.3       1.3
OO        In All Other Locations          0.5        0.3        0.4       1.7
          Total                          24.0       24.0       24.0     100.0

(a) Based on data from 44 U.S. Cities (Table 3)

[Figure 1: pie chart. Slices: indoors, home 63%; indoors, work 28%;
in-transit 6%; outdoors 2%; indoors, other (no value shown).]
Figure 1. Proportion of time U.S. employed persons spend in
indoor, outdoor, and in-transit microenvironments.
(Based on data from Table 3 for 44 U.S. cities; men
and women were weighted equally; percentage of hours
per day averaged over all days of the week.)

Using Szalai's data in Table 3, similar findings emerge for U.S.
unemployed, married women ("housewives"). The largest category is "inside
one's home," accounting for 20.5 hours, or 85.4% of the day. When this is
combined with the category "in other people's homes," we find that these
women spend 21.3 hours, or 88.7% of their time, inside homes (Figure 2).
Assuming, as above, that "in places of business" (1.2 hours) and "in
restaurants and bars" (0.1 hours) are all indoors, then U.S. housewives
spend 1.3 hours, or 5.4% of their time, indoors in other locations. On the
average, they spend 1.0 hours per day in transit, or 4.2% of the day. When
the indoor and in-transit categories are combined and subtracted from a
24-hour day, we find that U.S. housewives spend only 0.4 hours, or 1.7% of
their time, outdoors.
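
The outdoor figure above is obtained as a residual, and it can be cross-checked against the two categories assumed to be outdoors. The sketch below performs both calculations with the Table 3 entries for housewives in the 44 U.S. cities; the variable names are illustrative.

```python
# Table 3, housewives, 44 U.S. cities (average hours per day).
HOUSEWIVES_US = {
    "inside one's home": 20.5,
    "just outside one's home": 0.1,
    "in transit": 1.0,
    "in other people's home": 0.8,
    "in places of business": 1.2,
    "in restaurants and bars": 0.1,
    "in all other locations": 0.3,
}

h = HOUSEWIVES_US
indoors_home = h["inside one's home"] + h["in other people's home"]
indoors_other = h["in places of business"] + h["in restaurants and bars"]
transit = h["in transit"]

# Outdoors as the residual of a 24-hour day ...
outdoors_residual = 24.0 - indoors_home - indoors_other - transit
# ... and as the direct sum of the two categories assumed to be outdoors.
outdoors_direct = h["just outside one's home"] + h["in all other locations"]

for label, hours in [
    ("indoors, home", indoors_home),
    ("indoors, other", indoors_other),
    ("in transit", transit),
    ("outdoors", outdoors_residual),
]:
    print(f"{label}: {hours:.1f} h ({100.0 * hours / 24.0:.1f}%)")
```

The two routes agree here because the Table 3 column sums to exactly 24.0 hours; with rounded inputs they could differ by a tenth of an hour.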
The survey methodology used for the U.S. population sample, along
with the analyses and findings, is described in detail in a monograph7 and
a book8 by Robinson. In 1975, Robinson27 conducted a follow-up study
including a more representative sample of the total U.S. population than
the 1965-1966 study, and in 1985-86 Robinson and Holland37 undertook a more
comprehensive U.S. national survey that included adolescents over age 12.
Diurnal Profiles
Although the estimates given in Tables 3 and 4 are useful for
determining the total amount of time spent in various locations, they give
little information about the time of day that persons are present in each
location. Data from the multinational study also can be displayed in other
ways. A composite profile shows the proportion of the U.S. population
during the day engaged in selected activities such as sleeping, eating,
work, travel, home, leisure, and television (Figure 3). Diurnal profiles
for five countries show that U.S. employed men are not likely to spend
their noon lunch hour in their homes, while men in other countries,
particularly France, are quite likely to do so (Figure 4). The diurnal
profiles for housewives also show similar patterns in the six countries,
but show a striking difference from the men's diurnal profiles (Figure 5).
Housewives spend most of the day inside their homes, except for slight dips
in the graphs in the morning and afternoon. Compared to other countries,
housewives in the United States seem to show less tendency to spend their
noontime periods at home and exhibit a dip (minimum) at approximately 8:00
p.m., presumably because they are eating out.

[Figure 2: pie chart. Slices: indoors, home 88.7%; indoors, other 5.4%;
in-transit 4.2%; outdoors 1.7%.]
Figure 2. Proportion of time U.S. housewives (unemployed,
married women) spend in indoor, outdoor, and in-
transit microenvironments. (Based on data from
Table 3 for 44 U.S. cities; percentage of hours
per day averaged over all days of the week.)

[Figure 3: graph. Horizontal axis: time of day, from midnight through
6 a.m., noon, and 6 p.m. to midnight.]
Figure 3. Diurnal profiles showing percentage of employed
men in 44 U.S. cities engaged in 9 types of
activities as function of time of day (weekdays
only). (Source: Szalai6, Figure 5-1.11 A,
page 736)
Note: Data are weighted to ensure equality of days of the
week and number of eligible respondents.

[Figure 4: graph. Horizontal axis: hour of day. Legible legend entries:
100 electoral districts, Fed. Rep. Germany; Lima-Callao, Peru.]
Figure 4. Diurnal profiles showing percentage of employed
men in six countries present at home as a function
of time of day (weekdays only). (Source: Szalai6,
Figure 7-3.1 A, page 800)
Note: Data are weighted to ensure equality of days of the
week and number of eligible respondents.

[Figure 5: graph. Horizontal axis: hour of day. Legible legend entries:
Six cities, France; Forty-four cities, USA; 100 electoral districts,
Fed. Rep. Germany; Lima-Callao, Peru.]
Figure 5. Diurnal profiles showing the percentage of
housewives in six countries present at home as a
function of time of day (weekdays only).
(Source: Szalai6, Figure 7-3.3 A, page 804)
Note: Data are weighted to ensure equality of days of the
week and number of eligible respondents.

In addition to the studies by Robinson7,8,27 and Robinson and
Holland37, activity pattern studies have been carried out in Durham, N.C.,
by Chapin and Hightower4; on a U.S. sample of 43 Standard Metropolitan
Statistical Areas (SMSA's) by Chapin and Brail9; on a follow-up U.S.
national sample by Brail and Chapin11; and in the Washington, D.C.,
metropolitan area by Hammer and Chapin10. Chapin and his associates employ
a 3-digit system for coding activities that is based on a "dictionary" of
about 225 activity codes. Although activities can be grouped in a variety
of ways, Chapin finds it convenient to group the original 225 activities
according to two systems, one forming 28 categories of activities and
another forming 12. Finally, activities are grouped into two large,
general classes: obligatory activities and discretionary activities (Table
6). Considerable detail is available about the activities of residents in
Washington, D.C., in a report by Hammer and Chapin10 and the book by
Chapin51, including a diurnal profile of their activities by time of day
(Figure 6) that is similar to the profiles prepared by Szalai6.
AIR POLLUTION EXPOSURE STUDIES
EPA has conducted several large scale field studies employing a
probability sample of the population in which respondents wear personal
monitors and record their daily activities in diaries45,46. Through the use
of these techniques, called the Total Exposure Assessment Methodology
(TEAM), the carbon monoxide exposures of a representative sample of 452
persons in Denver31,32 and 712 persons in Washington, D.C.33,34, were
monitored in the winter of 1982-83. The activity pattern data from
Washington, D.C., (Figure 7) yielded overall findings similar to those from
the Multinational Comparative Time Budget Research Project6.
The Washington, D.C., respondents, who include both employed
persons and housewives, spend only 1.3% of their time outdoors. This
figure is lower than the 1.7-2.0% obtained from Szalai6, probably because
people spend more time indoors in the winter in eastern U.S. cities than in
other seasons. Transportation accounted for 8.1% of the time. As with the
other studies, the bulk of the time, 90.6%, was spent indoors, with 73.4%
spent indoors at home and 17.2% indoors at work.
An overview of the carbon monoxide human exposure field studies
appears in a paper by Akland et al.4, and the activity pattern data from
Washington, D.C., have been analyzed in detail by Hartwell et al.33,34 and
Schwab55, who related sociodemographic factors to activities and exposures.
TRANSPORTATION STUDIES
Few areas of human activities have received more study than
transportation. In the United States, for example, legislation passed in
1952 required urban areas to conduct metropolitan-area transportation
studies as a prerequisite for receiving Federal funds for highway
construction. As a result, transportation studies have been undertaken in
200 areas of the United States, and these studies usually involve
collection of considerable detail about the transportation activities of
the urban population, particularly in cities with populations in excess of
50,000. One method for obtaining the data on activities is by an
"origin-destination" study, which uses a questionnaire called a "trip
report form" to determine the time of each trip, its purpose, the mode of
travel, and where it ends (Figure 8)56.
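
A trip report of this kind maps naturally onto one record per trip. The sketch below is a hypothetical representation, not the layout used by any particular transportation study: the class name, field names, date, and example trips are all invented for illustration, with mode and purpose labels patterned on those of the sample form in Figure 8.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trip:
    """One line of a hypothetical origin-destination trip report."""
    origin: str
    destination: str
    mode: str            # e.g., "auto driver", "transit passenger"
    start: datetime
    arrive: datetime
    purpose_from: str    # purpose at the origin, e.g., "home"
    purpose_to: str      # purpose at the destination, e.g., "work"

    @property
    def minutes(self) -> float:
        """Travel time of the trip in minutes."""
        return (self.arrive - self.start).total_seconds() / 60.0


def minutes_by_mode(trips):
    """Total travel time per mode of travel for one person-day."""
    totals = {}
    for trip in trips:
        totals[trip.mode] = totals.get(trip.mode, 0.0) + trip.minutes
    return totals


DAY = (1968, 5, 1)  # arbitrary illustrative travel date
trips = [
    Trip("home", "office", "auto driver",
         datetime(*DAY, 7, 15), datetime(*DAY, 7, 45), "home", "work"),
    Trip("office", "home", "auto driver",
         datetime(*DAY, 17, 0), datetime(*DAY, 17, 30), "work", "home"),
    Trip("home", "store", "auto passenger",
         datetime(*DAY, 19, 0), datetime(*DAY, 19, 10), "home", "shop"),
]

totals = minutes_by_mode(trips)
print(totals)
```

Keeping the start and arrival times, rather than only a duration, preserves the time of day of each trip, which is needed to match trips with time-varying pollutant concentrations.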
Table 6. 40-Category Activity Classification System
Suggested by Chapin51
OBLIGATORY ACTIVITIES
Miscellaneous
Main Job
Other Income-Related
Personal Care
Eating
Shopping
Sick or Utilization of Medical Care Services
Maintenance of Home, Yard, or Car
Housework and Child Care
Misc. Household Chores, Including Pet Care or Walking the Dog
Household Business
Education
DISCRETIONARY ACTIVITIES
Child-Centered Activities
Visiting, Writing Letters, Phoning Relatives
Overseeing Children's Study, Practice
Family Outings or Drives
Talking and Visiting Within Family
Visiting, Writing Letters, Phoning Friends
Visiting In the Neighborhood
Visiting Outside the Neighborhood
Other Socializing Activities
Relaxing, Loafing, Resting, or Napping
Reading Newspapers, Magazines, or Nonspecified Materials
Reading Books
Cultural Activities
Movies
Television
Radio
Crafts and Hobbies
Walking and Cycling
Driving About, Sightseeing (Not with Family)
Participant Sports
Spectator Sports
Out-of-Town Holiday
Other Recreation
Religious Activities
Meetings of Voluntary Organizations
Public Affairs and Service Activities
Travel, Including Waiting for Travel
Sleep

Figure 6. Diurnal profiles of activities of heads of
households and spouses on weekdays in
Washington, D.C., spring 1968 (Source:
Chapin51, page 104.)

[Figure 7: bar chart. Vertical axis: percent. Horizontal axis:
microenvironment (transit, residence, other indoor, outdoor), with paired
bars for exposure and time; the outdoor bars are labeled 1.5 and 1.3.]
Figure 7. Time spent in various microenvironments by
residents of Washington, D.C., from the carbon
monoxide TEAM exposure study conducted by EPA in
winter of 1982-83.33,34

[Figure 8: sample form, "Urban Transportation Study, Dwelling Unit Survey -
Internal Trip Report," for trips which began on the travel date (4:00 a.m.
to 4:00 a.m.) by persons 5 years of age or older living in the sampled
dwelling. For each trip the form records where the trip began and ended
(street number, street name, city, state); the mode of travel (1 auto
driver, 2 auto passenger, 3 transit passenger, 4 school bus passenger,
5 taxi passenger, 6 truck driver, 7 truck passenger, 8 walked to work,
9 worked at home); the start and arrival times; and the trip purpose at
origin and destination (1 work, 2 shop, 3 social, 4 recreation, 5 school,
6 personal affairs, 7 transfer to another means of travel, 8 serve
passenger, 9 home).]
Figure 8. Sample trip report form for use in a metropolitan
transportation study (Source: Reference 54)

As reported by Robinson, Converse, and Szalai57, the multinational
research project also collected information on the average time spent
commuting to and from work in various countries (Table 7). If the entries
for all countries are averaged, people who commute by public transit spend
an average of 82 minutes per day traveling to and from work; those who
commute by automobile spend an average of 55 minutes per day; and those who
commute by walking spend an average of 41 minutes per day. (Obviously,
persons who walk live closer to their places of work than those who use
other transportation modes). In the United States (44 cities), the average
time spent commuting by automobile is 46 minutes per day, or 23 minutes
each way. However, the variability of national samples makes it difficult
to compare travel times in different countries. For example, the travel
time for commuting by automobile in one Yugoslavian city (Kragujevac, 53
minutes) is higher than the U.S. average of 46 minutes, while the travel
time in another Yugoslavian city (Maribor, 44 minutes) is lower than the
U.S. average.
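
The column means quoted above can be reproduced directly from the entries in Table 7. The sketch below does so for the automobile column, skipping the two sites with no reported value. Treating the table's "standard deviation" as the sample (n - 1) standard deviation is an assumption; it is the form that matches the printed figure.

```python
from statistics import mean, stdev

# Table 7, automobile column (minutes per day), in table order;
# None marks the sites where no value is reported (N.A. or --).
AUTOMOBILE = [55, 73, 62, 46, None, 41, 66, 48, 93, 50, 46, 39, None, 53, 44]

reported = [value for value in AUTOMOBILE if value is not None]

auto_mean = mean(reported)
auto_sd = stdev(reported)  # sample standard deviation (n - 1 denominator)

print(f"n = {len(reported)}, mean = {auto_mean:.1f}, sd = {auto_sd:.1f}")
```

Because each site contributes one value regardless of its sample size, this is an unweighted average across study sites, not an average over commuters.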
More important, however, is that the average time commuting, even if
it were known exactly for a particular country, would not be sufficient to
estimate air pollution exposures. The average time could be used to
calculate the average exposure (assuming the average pollutant
concentration associated with commuting were known), but it could not be
used to calculate the maximum exposure some commuters receive.
In any population subgroup such as commuters, some people will spend very
long times commuting, while others spend far less.

Table 7. Average Time Spent Traveling To and From Work
by Mode of Transportation in 12 Countries55
(Minutes Per Day)

                                 Public
Country                         Transport  Automobile  Walking  All Travel
Belgium                             98         55         52        66
Bulgaria (Kazanlik)                 93         73         47        57
Czechoslovakia (Olomouc)            73         62         46        59
France (Six Cities)                 82         46         44        50
West Germany (100 Districts)       N.A.       N.A.       N.A.       40
West Germany (Osnabruck)            71         41         46        47
East Germany (Hoyerswerda)          82         66         30        62
Hungary (Gyor)                     104         48         40        64
Peru (Lima-Callao)                 103         93         48        89
Poland (Torun)                      71         50         41        60
United States (44 Cities)           81         46         30        50
United States (Jackson, Mich.)      --         39         34        38
U.S.S.R. (Pskov)                    67         --         32        --
Yugoslavia (Kragujevac)             70         53         47        51
Yugoslavia (Maribor)                71         44         40        51
Mean:                              82.0       55.1       41.2      56.0
Standard deviation:                13.3       15.1        7.2      12.7

Defining exposures in a meaningful way requires information on the
variability and range of commute times. Ideally, one would like to have
the entire frequency distribution of travel times of commuters to generate
the entire frequency distribution of commuter exposures. Unfortunately,
most summaries of time budget studies present only average values and
seldom give histograms or information on the variability of the time spent
in various locations or activities, even though such histograms could be
generated from the raw data.
In 1969-70, the U.S. Department of Transportation arranged with the
Bureau of the Census to carry out a nationwide study of the
transportation-related activities of the U.S. population. This study,
called the Nationwide Personal Transportation Study, was based on home
interviews and covered individual activities in considerable detail12-22.
Home-to-work travel times averaged 22 minutes per trip (Figure 9), or 44
minutes per day assuming two automobile trips per day, which compares
reasonably well with the average of 46 minutes per day (that is, 23 minutes
per trip) reported by Robinson, Converse, and Szalai57. However, about
one-third of the population
spends only 10 minutes commuting, while a small proportion (3.8%) spends 60
minutes, nearly three times the average value.
These figures demonstrate why the average duration of an activity is
not sufficient to characterize exposures in a meaningful way. Use of the
average value for commute times would underestimate, by a factor of three,
the very long commute times experienced by 3.8% of the employed persons in
the U.S., which represents several million persons. The SHAPE model39 uses
frequency distributions such as Figure 9 to simulate human activity
patterns.
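
The contrast between the average and the upper tail can be made concrete with a small calculation. The distribution below is hypothetical: only the share at 10 minutes (about one-third) and the share at 60 minutes (3.8%) are taken from the text, and the remaining entries, the constant concentration, and all names are assumptions made purely for illustration. Exposure is approximated here as concentration multiplied by time.

```python
# Hypothetical one-way commute-time distribution: (minutes, fraction of
# commuters). Only the 10-minute and 60-minute shares come from the text;
# the other entries are invented so that the fractions sum to 1.
DISTRIBUTION = [
    (10, 0.33),
    (20, 0.30),
    (30, 0.20),
    (40, 0.09),
    (50, 0.042),
    (60, 0.038),
]

CO_PPM = 10.0  # assumed constant in-vehicle concentration, illustration only


def mean_minutes(distribution):
    """Population-weighted average commute time."""
    return sum(minutes * fraction for minutes, fraction in distribution)


def exposure(concentration_ppm, minutes):
    """Integrated exposure (ppm-minutes) at a constant concentration."""
    return concentration_ppm * minutes


average_time = mean_minutes(DISTRIBUTION)
longest_time = max(minutes for minutes, _ in DISTRIBUTION)

mean_exposure = exposure(CO_PPM, average_time)
max_exposure = exposure(CO_PPM, longest_time)
ratio = max_exposure / mean_exposure

print(f"average commute: {average_time:.1f} min")
print(f"longest commute is {ratio:.1f} times the average exposure")
```

With a full frequency distribution, the same calculation can be repeated for every interval to obtain a distribution of exposures instead of a single average.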
In 1983-84, the U.S. Nationwide Personal Transportation Study was
repeated on a probability sample of 6,438 households in 50 states and the
District of Columbia35. Comparing average home-to-work commute times for
the three time periods for which U.S. survey data are available shows that
the average commute time for all SMSA's declined from about 23 minutes in
1969-70 to about 21 minutes for all U.S. SMSA's in both the 1977-78 and
1983-84 survey periods (Table 8). Average commute times increase with the
size of the SMSA. For example, in the 1983-84 U.S. survey, commute times
averaged 15.3 minutes in SMSA's with fewer than 250,000 persons, compared
with 22.1 minutes in SMSA's of 1-3 million persons, and 26.8 minutes in
SMSA's above 3 million persons35.
U.S. workers overwhelmingly favor the automobile for commuting (Table
9). Approximately 67-72% of the home-to-work trips were by passenger car,
while public transportation accounted for only 5-7% of the trips. Only
about 4-5% of U.S. workers walk to work. The typical home-to-work trip by
passenger car averages about 19 minutes, while trips by public
transportation are much longer. In 1983, the average trip by public
transportation took 46.1 minutes. Persons who walk to work average
slightly less than 9 minutes. Home-to-work trips by truck, van, and other
private transportation modes average about 20 minutes, or nearly the same
as automobile travel. While accounting for only 15.6% of the total travel
time in 1983, this category has been steadily increasing since 1969.

[Figure 9: histogram. Vertical axis: percent of persons. Horizontal axis:
commuting-time intervals. Annotations: average = 22 minutes; the highest
interval is labeled 2.2%.]
Figure 9. Frequency distribution of home-to-work commuting
times for employed persons in the U.S. (excludes
persons who work at home or at no fixed address).
(Source: Nationwide Personal Transportation Study,
Table A-6, Report No. 8, August 1973, p. 57.
Based on data from Reference 19)

Table 8. Average Commuting Time of U.S. Workers by Size of SMSA*
(minutes)
                            SMSA Population
       Fewer Than  250,000-  500,000-  1,000,000-  3,000,000  All
YEAR   250,000     499,999   999,999   2,999,999   and over   SMSA's
1969   19.4        19.8      21.2      23.7        25.6       23.1
1977   16.9        17.1      18.9      21.8        25.2       20.8
1983   15.3        18.8      17.9      22.1        26.8       20.9
*Source: Table 7-2 from Nationwide Personal Transportation Study;35
includes only SMSA's.
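As a worked check on the trends described in the text, the Table 8 values can be differenced by size class. The sketch below is only an arithmetic aid; the names `TABLE_8`, `SIZE_CLASSES`, and `change_between` are introduced here for illustration.

```python
# Average commute times (minutes) from Table 8, keyed by survey year.
SIZE_CLASSES = [
    "<250,000", "250,000-499,999", "500,000-999,999",
    "1,000,000-2,999,999", ">=3,000,000", "All SMSA's",
]
TABLE_8 = {
    1969: [19.4, 19.8, 21.2, 23.7, 25.6, 23.1],
    1977: [16.9, 17.1, 18.9, 21.8, 25.2, 20.8],
    1983: [15.3, 18.8, 17.9, 22.1, 26.8, 20.9],
}


def change_between(first, last):
    """Change in minutes (last minus first) for each SMSA size class."""
    return {
        label: round(b - a, 1)
        for label, a, b in zip(SIZE_CLASSES, TABLE_8[first], TABLE_8[last])
    }


changes = change_between(1969, 1983)
```

With these inputs, the change is negative for every size class except the largest (3,000,000 and over), where the 1983 average is higher than the 1969 average.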
Table 9. Mode of Travel of Home-to-Work Trips of U.S. Workers (a)
(% = percentage of trips; min = average trip time in minutes)
                       Truck, Van, and
            Passenger  Other Private   Public                Work at
YEAR        Car        Transportation  Transportation  Walk  Home     Other  Total
1969  %     67.3       5.8             7.3             5.1   4.5      10.0   100.0
1977  %     72.1       11.5            5.7             4.7   3.7      1.4    100.0 (b)
      min   19.0       19.9            38.8            8.8   -        16.7   19.8
1983  %     70.6       15.6            5.3             4.1   3.5      0.9    100.0
      min   19.1       20.1            46.1            8.9   -        29.9   20.4
(a) Source: Table 7-6 from Nationwide Personal Transportation Study;35
    U.S. workers numbered 75,758,000 in 1969; 93,019,000
    in 1977; and 103,244,000 in 1983.
(b) Includes 0.9% unknown
3-29
-------
RESEARCH NEEDS
The literature on human activity patterns, time budgets, and
transportation-related activities is at this time voluminous and
comprehensive. A large number of studies of human activities have been
undertaken in the United States and other countries. In general, they have
been designed and implemented by two groups: sociological researchers and
transportation analysts. These studies have reflected the particular
interest of these two disciplines.
Despite the large volume of information available on human activity
patterns, these data are not really suitable for estimating the exposure of
the population to air pollution. Although crude exposure estimates are
possible, the data from existing studies suffer from three main problems:
• failure to collect in the diaries basic data that are important
for estimating air pollution exposures. For example,
respondents were not asked to indicate their smoking
activities, nor did they report if they used gas stoves or
other gas appliances. Rather, the emphasis was on their
leisure time activities, such as viewing television or
socializing.
• failure to code data on the diary forms in a manner suitable
for estimating air pollution exposures. For example, it would
be possible to determine explicitly from the diaries whether a
person was indoors or outdoors, but the investigators were not
interested in this matter, so the resulting codes are of no
help. Estimating the percentage of time spent indoors from
existing activity pattern data requires making numerous
assumptions. For example, the activity category "working
around the house" is ambiguous. Is it indoors or outdoors? It
is likely that a revised coding of the original diaries could
yield more accurate information for air pollution purposes.
• failure to present the analyses and data summaries in a manner
suitable for estimating air pollution exposures. For example,
tables usually present the duration of each activity as an
average value so that only the average exposure of the
population can be computed. However, the average exposure may
be well below the relevant air quality standard, concealing the
fact that a significant proportion of the population is exposed
to levels above the standard. Determining that proportion
requires the entire frequency distribution of times spent in a
given activity.
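The third problem, that an average can hide the exposed fraction of the population, can be illustrated with a small worked example. The exposure values and the standard below are hypothetical numbers chosen only to make the arithmetic visible; they are not data from any of the studies reviewed.

```python
def summarize(exposures, standard):
    """Return the mean exposure and the fraction of persons above the standard."""
    mean = sum(exposures) / len(exposures)
    fraction_above = sum(x > standard for x in exposures) / len(exposures)
    return mean, fraction_above


# Hypothetical exposures for ten persons, in arbitrary concentration units.
exposures = [2, 3, 3, 4, 4, 5, 5, 6, 12, 16]
standard = 9  # hypothetical air quality standard, same units

mean, fraction_above = summarize(exposures, standard)
```

Here the mean (6.0) is below the hypothetical standard of 9, yet two of the ten persons (20%) are above it. Only the full distribution reveals that proportion.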
Thus, much of the information needed to estimate human exposures is not
available in past studies of human activities and time budgets, even though
those studies contain a large amount of superfluous information. For
example, the time spent "socializing" or "watching television" is of no use
in estimating exposures to air pollution. Flachsbart56 suggests that the
analyst of air pollution exposure is less interested in the "activity" than
in the "environmental setting." Thus, the exposure analyst is less
interested in whether an individual is talking with friends (socializing)
than whether he or she is driving in traffic, cooking indoors with a gas
stove, or parking a car inside an underground garage. Obtaining this
information requires conducting an activity pattern study specially
designed to estimate air pollution exposure.
Such a study should begin with a pilot study on a single city to
perfect the experimental design and data collection methodology and should
measure exposures with personal monitoring instruments. Once the results
are evaluated, a large-scale investigation would be carried out on a number
of cities or a national probability sample. The large-scale survey would
employ diaries and personal monitoring instruments to characterize the
frequency distribution of air pollution exposures of the population as a
whole and of selected cities. Information from the diaries could be
correlated with the measurements of exposure to determine how different
activities of the population affect their exposure rates.
CONCLUSIONS
Although a great body of literature and data currently exist on human
activity patterns, most of this research was conducted by sociological
researchers not interested in air pollution or environmental problems.
Thus, the existing time budget studies contribute very little to the newly
emerging field of human exposure modeling and monitoring. Field studies
are needed to gather data on the particular activities that affect human
exposure to pollution (for example, using consumer products, storing
chemicals in the home, driving, living with a smoker, using gas appliances,
visiting dry cleaning establishments, filling gasoline tanks). The
importance of human activity patterns in determining an individual's health
risk from environmental pollution has only recently been recognized. The
potential to gain a better understanding of the causes and sources of risk
from environmental chemicals through activity pattern research is enormous.
The existing time budget studies were not designed to estimate human
exposure, but they can still be used to make imperfect estimates of the
time spent in different microenvironments, as we have attempted to do in
this paper. When the data are interpreted in this manner, some interesting
findings emerge.
In general, people spend very little time outdoors. Compared with the
in-transit categories and all indoor locations, outdoors is
the smallest time category. Time budget data indicate that U.S. workers
spend only about 2% and U.S. housewives spend about 1.4% of their time
outdoors. In other countries, people always spend less than 10% of their
time outdoors, and in most countries they spend less than 5%.
There are minor differences from country to country in human activity
patterns, probably resulting from differences in culture, transportation
systems, and climate. The similarities, however, are more striking than
the differences. The finding that emerges is that we are basically indoor
animals. When not indoors at home, we are indoors on our jobs, in stores,
or other locations. When not located within some room or building, we can
usually be found in a transportation microenvironment, such as a train,
bus, or automobile. In a modern society, total time outdoors is the most
insignificant part of the day, often so small it barely shows up in the
total.
Possibly, this emphasis on enclosed structures stems from our
evolution. Unlike the animals of the forest, we have no fur to protect us
from the cold. Nor are our bodies equipped with efficient weapons of
defense, such as claws, fangs, or tusks. Perhaps because of our insecurity
outdoors, we build residences and other buildings, and even enclosed
transportation vehicles such as cars and buses. The existing data, on all
cultures thus far studied, if they are correct and valid, suggest that we
are primarily indoor animals.
Acknowledgment
I wish to thank Herb Hunt of General Sciences Corporation
for his tireless work on the development of this paper.
3-32
-------
REFERENCES
1. Lundberg, George A., Mirra Komarovsky, and Mary Alice
McInerny, Leisure: A Suburban Study. Columbia University Press, New
York City, 1934.
2. Sorokin, Pitirim A., and Clarence Q. Berger, Time-Budgets of Human
Behavior. Harvard University Press, Cambridge, Mass.
1939.
3. de Grazia, Sebastian, "The Uses of Time," in Aging and Leisure.
Robert W. Kleemeier, ed., Oxford University Press, New York City,
1961, pp. 113-153.
4. Chapin, F. Stuart, and Henry C. Hightower, "Household Activity
Patterns and Land Use," Amer. Inst. of Planners. 31, 3: 222-238,
August 1965.
5. Szalai, Alexander, "The Multinational Comparative Time Budget
Research Project: A Venture in International Research Cooperation,"
Amer. Behavioral Scientist. 10, 4: 1-31, December 1966.
6. Szalai, Alexander, The Use of Time: Daily Activities of Urban and
Suburban Populations in Twelve Countries. Mouton, The Hague, 1972.
7. Robinson, John P., How Americans Used Time in 1965. University of
Michigan, Monograph from University Microfilms International, Ann
Arbor, Mich., 1977.
8. Robinson, John P. How Americans Use Time: A Social-Psychological
Analysis of Everyday Behavior. Praeger Publishers, Praeger Special
Studies, New York City, 1977.
9. Chapin, F. Stuart, Jr., and Richard K. Brail, "Human Activity Systems
in the United States," Environ. and Behavior. 2:107-130, December
1969.
10. Hammer, Philip G. Jr., F. Stuart Chapin Jr., "Human Time Allocations:
A Case Study of Washington, D.C." Technical monograph, Center for
Urban and Regional Studies, University of North Carolina, Chapel
Hill, N.C., March 1972.
11. Brail, Richard K., and F. Stuart Chapin, Jr., "Activity Patterns of
Urban Residents," Environ. and Behavior. 5:163-191, June 1973.
12. Strate, Harry E., "Automobile Occupancy", U.S. Department of
Transportation, Federal Highway Administration, Nationwide Personal
Transportation Study, Washington, D.C., Report No. 1, April 1972.
13. Strate, Harry E., "Annual Miles of Automobile Travel," U.S.
Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 2, April 1972.
14. Strate, Harry E., "Seasonal Variations of Automobile Trips and
Travel," U.S. Department of Transportation, Federal Highway
Administration, Nationwide Personal Transportation Study, Washington,
D.C., Report No. 3, April 1972.
15. Beschen, Darrell A. Jr., "Transportation Characteristics of School
Children." U.S. Department of Transportation, Federal Highway
Administration, Nationwide Personal Transportation Study, Washington,
D.C., Report No. 4, July 1972.
16. Hatley, Rolan M., "Availability of Public Transportation and Shopping
Characteristics of SMSA Households", U.S. Department of
Transportation, Federal Highway Administration, Nationwide Personal
Transportation Study, Washington, D.C., Report No. 5, July 1972.
17. Gish, Robert E., "Characteristics of Licensed Drivers", U.S.
Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 6, April 1973.
18. Goley, Beatrice T., Geraldine Brown, and Elizabeth Samson, "Household
Travel in the United States," U.S. Department of Transportation,
Federal Highway Administration, Nationwide Personal Transportation
Study, Washington, D.C., Report No. 7, December 1972.
19. Svercl, Paul V., and Ruth H. Asin, "Home-to-Work Trips and Travel,"
U.S. Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 8, August 1973.
20. Randill, Alice, Helen Greenhalgh, and Elizabeth Samson, "Mode of
Transportation and Personal Characteristics of Tripmakers," U.S.
Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 9, November 1973.
21. Asin, Ruth H., "Purposes of Automobile Trips and Travel." U.S.
Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 10, December 1974.
22. Asin, Ruth H., and Paul V. Svercl, "Automobile Ownership," U.S.
Department of Transportation, Federal Highway Administration,
Nationwide Personal Transportation Study, Washington, D.C., Report
No. 11, December 1974.
23. Flachsbart, Peter G., William C. Baer, and Gary Schalman, "Activity
Patterns in the Residential Environment," report prepared for the
U.S. Public Health Service project: Research on the Residential
Environment. Graduate Program Urban and Regional Planning,
University of Southern California, Los Angeles, Calif., 1972.
24. Michelson, William, and Paul Reed, "The Time Budget," in Behavior
Research Methods in Environmental Design. William Michelson, ed.,
Community Development Series, Halsted Press, 1975.
25. Bullock, Nicholas, Peter Dickens, Mary Shapcott, and Philip Steadman,
"Time Budgets and Models of Urban Activity Patterns," Social Trends.
56: 45-63, Nov. 5, 1974.
26. British Broadcasting Corporation, The People's Activities and Use of
Time. BBC Audience Research Department, J. Smethurst (E.S.D.), Ltd.,
England, 1978.
27. Robinson, John P., "Changes in Americans' Use of Time: 1965-1975: A
Progress Report," Communications Research Center, Cleveland State
University, Cleveland, Ohio., 1977.
28. Asin, Ruth H., Personal communication regarding the Nationwide
Personal Transportation Study, U.S. Department of Transportation,
Federal Highway Administration, Washington, D.C. July 1979.
29. Juster, Thomas F., Martha S. Hill, Frank P. Stafford, and Jacquelynne
E. Pearsons, "1975-1981 Time Use Longitudinal Panel Study," Report on
Project #466066, Survey Research Center, Institute for Social
Research, the University of Michigan, Ann Arbor, Mich., January 1983.
30. Letz, Richard E., and Mary Lou Soczek, "A Survey of Time-Activity
Patterns in Kingston/Harriman, Tennessee: Methods Support for Modeled
Data," presented at a specialty conference on Quality Assurance in
Air Pollution Measurements, Air Pollution Control Association and
American Society for Quality Control, Boulder, Col., October
14-18, 1984.
31. Johnson, Ted, "A Study of Personal Exposure to Carbon Monoxide in
Denver, Colorado," Report No. EPA-600/4-84-014, NTIS No.
PB-84-146125, U.S. Environmental Protection Agency, Research Triangle
Park, N.C., 1983.
32. Johnson, Ted, "A Study of Personal Exposure to Carbon Monoxide in
Denver, Colorado," Paper No. 84-121.3 presented at the 77th Annual
Meeting of the Air Pollution Control Association, San Francisco,
Calif., June 1984.
33. Hartwell, T.D., C.A. Clayton, R.M. Mitchie, Jr., R.W. Whitmore, H.S.
Zelon, S.M. Jones, and D.A. Whitehurst, "A Study of Carbon Monoxide
Exposure of Residents of Washington, DC, and Denver, Col," Report No.
EPA-600/54-84-031, NTIS No. PB-84-18356, U.S. Environmental
Protection Agency, Research Triangle Park, N.C., 1984.
34. Hartwell, Ty, C.A. Clayton, R.M. Mitchie, Jr., R.W. Whitmore, H.S.
Zelon, D.A. Whitehurst, "A Study of Carbon Monoxide Exposure of the
Residents in Washington, D.C.," Paper No. 84-121.4 presented at the
77th Annual Meeting of the Air Pollution Control Association, San
Francisco, Calif., June 1984.
35. Klinger, Dieter, and J. Richard Kuzmyak, "Personal Travel in the
United States, Vol. I, 1983-84 Nationwide Personal Transportation
Study," U.S. Department of Transportation, Federal Highway
Administration, Washington, D.C., August 1986.
36. Johnson, Ted, "A Study of Activity Patterns in Cincinnati, Ohio,"
Draft Report No. RP940-06 PN 3640-2, prepared by PEI Associates,
Research Triangle Park, N.C., for the Electric Power Research
Institute, Palo Alto, Calif., June 12, 1986.
37. Robinson, John, and Jeffrey M. Holland, "Trends in Americans' Use of
Time: Some Preliminary 1975-1985 Comparisons," Draft Report to Office
of Technology Assessment, U.S. Congress, Survey Research Center,
University of Maryland, College Park, Md., May 1986.
38. Wiley, James A., and John P. Robinson, "Activity Pattern Study of
California Residents: A Micro-Behavioral Approach," research proposal,
University of California, Berkeley, Calif., undated.
39. Ott, Wayne R., "Exposure Estimates Based on Computer Generated
Activity Patterns," Paper No. 81-57.6 presented at the 74th Annual
Meeting of the Air Pollution Control Association, Philadelphia, Pa.,
June 21-26, 1981.
40. Duan, Naihua, "Models for Human Exposure to Air Pollution,"
Environment International. 8: 305-309, 1987.
41. Thomas, Jacob, David Mage, Lance Wallace, and Wayne Ott, "A
Sensitivity Analysis of the Enhanced Simulation of Human Air
Pollution and Exposure (SHAPE) Model," Report No. EPA-600/4-85-036,
NTIS No. PB-85-201101, Environmental Monitoring Systems Laboratory,
U.S. Environmental Protection Agency, Research Triangle Park, N.C.,
1984.
42. Ott, Wayne, Jacob Thomas, David Mage, and Lance Wallace, "Validation
of the Simulation of Human Activity and Pollution Exposure (SHAPE)
Model Using Paired Days from the Denver, Colorado, Carbon Monoxide
Field Study," Atmospheric Environment, in press.
43. Johnson, Ted, and Roy Paul, "NAAQS Exposure Model (NEM) and
Application to Nitrogen Dioxide," technical report, Office of Air
Quality Planning and Standards, U.S. Environmental Protection Agency,
Research Triangle Park, N.C., 1981.
44. Ott, Wayne R., "Concepts of Human Exposure to Air Pollution,"
Environment International. 7: 179-196, 1982.
45. Ott, Wayne R., "Total Human Exposure: An Emerging Science Focuses on
Humans as Receptors of Environmental Pollution," Environmental
Sciences and Technology. 19: 880-886, October 1985.
46. Ott, Wayne, Lance Wallace, David Mage, Gerald Akland, Robert Lewis,
Harold Sauls, Charles Rodes, David Kleffman, Donna Kuroda, and Karen
Morehouse, "The Environmental Protection Agency's Research Program on
Total Human Exposure," Environment International, 12: 475-494, 1986.
47. Michelson, William, "Time Budgets in Environmental Research: Some
Introductory Considerations," in Environmental Design Research, Vol.
II, Symposia and Workshops. W.F.E. Preiser, ed., 4th International
E.D.R.A. Conference, Dowden, Hutchinson, Ross, Inc., Stroudsburg,
Pa., 1973.
48. Ottensmann, John R., "Systems of Urban Activities and Time: An
Interpretive Review of the Literature," Urban Studies Paper, Center
for Urban and Regional Studies, University of North Carolina, Chapel
Hill, N.C., 1972.
49. Szalai, Alexander, "Trends in Comparative Time-Budget Research,"
Amer. Behavioral Scientist. 9: 3-8, May 1966.
50. Converse, Philip E., "Time-Budgets," in International Encyclopedia of
the Social Sciences. Vol. 16, David Sills, ed. Macmillan Co. and the
Free Press, New York City, 1968.
51. Chapin, F. Stuart Jr., Human Activity Patterns in the City: Things
People Do in Time and Space. John Wiley & Sons, New York City, 1974.
52. McCormick, Thomas C., "Quantitative Analysis and Comparison of Living
Cultures," Amer. Sociol. Rev. 4: 463-474, August 1939.
53. Walker, Kathryn E., "Homemaking Still Takes Time," Journal of Home
Economics. LXI: 621-624, October 1969.
54. Akland, Gerald G., Tyler D. Hartwell, Ted R. Johnson, and Roy W.
Whitmore, "Measuring Human Exposure to Carbon Monoxide in Washington,
D.C., and Denver, Colorado, during the Winter of 1982-1983," Environ.
Sci. Technol., Vol. 19, No. 10, October 1985.
55. Schwab, Margo, "Differential Exposure to Carbon Monoxide Among
Sociodemographic Groups in Washington, D.C.," Ph.D. Dissertation,
Graduate School of Geography, Clark University, Worcester, Mass.,
February 1988.
56. "Urban Origin-Destination Survey", U.S. Department of Transportation,
Federal Highway Administration, Washington, D.C., July 1975.
57. Robinson, John P., Philip E. Converse, and Alexander Szalai, "Life in
Twelve Countries," in The Use of Time: Daily Activities of Urban and
Suburban Populations in Twelve Countries. Alexander Szalai, ed.,
Mouton, The Hague, pp. 112-144, 1972.
3-38
-------
BASIC ACTIVITY PATTERNS STRUCTURE
FOR MODELING POLLUTION EXPOSURE
By: Jacob Thomas
General Sciences Corporation
6100 Chevy Chase Drive
Laurel, MD 20707
and
Joseph V. Behar
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
ABSTRACT
The significance of activity patterns in man's relationship to the
pollutants in his environment has become progressively evident to
environmental scientists over the last decade. Field studies of personal
exposures to several environmental pollutants have been conducted in
several major metropolitan areas of the country on statistically
representative samples of the respective populations. These data were
examined to determine the "likeness" of activity patterns in different
cities. Using data from Denver, Colorado, and Washington, D.C., the
distributions of occupancy duration periods for seven broadly defined
microenvironments were determined. Despite significant differences in
specific characteristics between the two cities, the overall similarities
found in the activity patterns are quite remarkable. Activity patterns
thus determined, when combined with microenvironmental concentration data
in a total human exposure model, will provide more realistic estimates of
human exposure to environmental pollution.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
4-1
-------
INTRODUCTION
The significance of activity patterns in man's relationship to the
pollutants in his environment has become progressively evident to
environmental scientists over the last decade. Scientists at the U.S.
Environmental Protection Agency and several of their colleagues in various
private and academic institutions have been mainly responsible for this
increasing consciousness. The field studies of personal exposures to
carbon monoxide were conducted in the Denver and Washington, D.C.,
metropolitan areas during the winter of 1982-83.1-4 These studies were
conducted on participants drawn from statistically representative samples
of the populations living in these two metropolitan areas. In addition to
carrying personal exposure monitors to measure the exposure to CO, they
provided very meticulous records of time, activity, and location for the 24
hours during which each participant carried the monitor. These studies
have provided a unique database of activity patterns over 24 hours of urban
dwellers in these two cities.
Such studies are costly and time consuming. It is not surprising
then that it has been repeatedly asked whether or not the activity patterns
found in the databases of these activity pattern studies are usable in
different cities and for different pollutants.
BACKGROUND
Given this objective, an intense examination of the Denver and
Washington, D.C., study data was conducted to determine the "likeness" of
activity patterns observed in these two cities. An activity pattern of an
urban dweller is the time spent by the individual in different activities
at different locations during a 24-hour period. It is not only the
duration of the activity that is important, but also the time of its
occurrence. The term microenvironment has been defined to mean the
location where the activity takes place. In pollutant exposure modeling,
microenvironments and the time spent in them by individuals are the dynamic
variables with which we deal.
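One common way to combine these two variables is a time-weighted average of the concentrations encountered. The paper does not spell out a formula at this point, so the form below is an assumption, and the minutes and concentrations are hypothetical values for a single person.

```python
def time_weighted_exposure(minutes, concentration):
    """Time-weighted average concentration over the microenvironments visited."""
    total = sum(minutes.values())
    return sum(minutes[m] * concentration[m] for m in minutes) / total


# Minutes spent in each microenvironment over one day (hypothetical person).
minutes = {"indoor": 1300, "transit": 100, "outdoor": 40}
# Hypothetical pollutant concentrations (e.g., ppm) in each microenvironment.
concentration = {"indoor": 1.0, "transit": 5.0, "outdoor": 2.0}

exposure = time_weighted_exposure(minutes, concentration)
```

In this sketch the transit microenvironment occupies about 7% of the day but contributes more than a quarter of the weighted sum, which is why both the time spent and the concentration in each microenvironment matter.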
In our present effort, we are not looking at specific pollutants; we
are characterizing activity patterns with a broader brush. All human activities
occur either indoors or outdoors. The activity of transit needs to be
classified separately, however. Also, since most human activities occur
indoors, further distinction could be made between such indoor environments
as residence, office, school, shop, etc., which should be considered
separately from one another.
The residence is the location where most of everyday life is spent.
For most people the activity of wage earning occurs elsewhere. The
"indoor" location can therefore be further broadly classified as
indoor-residence and indoor-other. The outdoors can be classified into a
myriad of microenvironments, but as observed in the data, the time spent
outdoors is less than 5% of the 24 hours, hence the distinction need be made
only when dealing with specific pollutants. Of the time spent in
residence, more than half of it is spent sleeping. The remainder of the
time spent in residence is classifiable as many other different activities.
When specific pollutants are considered, certain specific activities
in the residence become important and will require special attention when
modeled. For example, cooking with a gas stove is a specific activity
occurring in a specific indoor residential microenvironment important in
carbon monoxide (CO) exposure modeling. Under the category "indoor-other,"
the wage-earning activity is a major activity in everyday life. For office
workers, the microenvironment "office" is a major subcategory under
"indoor-other." Again, the balance of "indoor-other" categories such as
shops, restaurants, etc., are short-duration activities when considered as
a part of daily life. Some of them will need to be specifically modeled
for certain pollutants. The transportation or "transport" category must be
separated into transport by combustion and noncombustion modes.
METHOD
The modeling of exposure to virtually any pollutant can be achieved
if we consider all activities in all microenvironments in relation to seven
basic activity/location groups shown below (Figure 1 and Table 1). We
first segregate the universe of microenvironmental activities into three
groups: Indoor, Transport, and Outdoor. These three broad groupings can
be further classified into the subgroups indicative of major activities in
which people are involved.
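The grouping described above can be sketched as a small classification routine. The seven category names follow Table 1; the input code strings (for example "residence-sleeping" or "car") are hypothetical and do not reproduce the diary codes used in the Denver and Washington studies.

```python
from enum import Enum


class Micro(Enum):
    """The seven broad microenvironments of Table 1."""
    RESIDENCE_SLEEPING = 1
    RESIDENCE_OTHER = 2
    OFFICE = 3
    OTHER_INDOOR = 4
    COMBUSTION_TRANSIT = 5
    NONCOMBUSTION_TRANSIT = 6
    OUTDOOR = 7


def classify(group, detail):
    """Map a (group, detail) diary code to one of the seven categories.

    The code strings are hypothetical; the original diary codes differ.
    """
    if group == "indoor":
        if detail == "residence-sleeping":
            return Micro.RESIDENCE_SLEEPING
        if detail == "residence":
            return Micro.RESIDENCE_OTHER
        if detail == "office":
            return Micro.OFFICE
        return Micro.OTHER_INDOOR
    if group == "transit":
        combustion = {"car", "bus", "truck", "motorcycle"}
        return (Micro.COMBUSTION_TRANSIT if detail in combustion
                else Micro.NONCOMBUSTION_TRANSIT)
    if group == "outdoor":
        return Micro.OUTDOOR
    raise ValueError(f"unknown group: {group!r}")
```

For example, `classify("indoor", "shop")` falls through to `Micro.OTHER_INDOOR`, and `classify("transit", "walk")` returns `Micro.NONCOMBUSTION_TRANSIT`.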
With appropriate adjustments for occupational exposure or other
pollution intensive activities or microenvironments, these basic groups of
activity/locations can be used for the analysis of activity data in almost
any urban/suburban area in the country. Certain additional factors, such
as seasonal or geographic adjustments of particular impact on specific
pollutants, must be made to accommodate the modeling of some pollutants.
Given the rich database of activities, and transitions between
activities, available from the Denver and Washington studies, we know the
frequency distribution by sex, age, occupation, daypart, weekpart, etc.,
for a broad range of activities. While these studies were designed to map
exposure to carbon monoxide, the data can be used to form the basic
building blocks for modeling exposure to a variety of pollutants.
[Figure 1 graphic: tree diagram of the groupings]
INDOOR
    RESIDENCE
        Sleeping in Residence
        Other Residence
    OTHER
        Office
        Other Nonresidence Indoor
TRANSIT
    Combustion Powered
    Noncombustion Powered
OUTDOOR
Figure 1. Derivation of Seven Basic Modeling Microenvironments
Table 1. Seven broad microenvironments into which all activity/location
combinations can be placed.
1 - Indoor residence sleeping
2 - Indoor residence other
3 - Indoor other office
4 - Indoor other nonresidence
5 - Combustion powered transit
6 - Noncombustion powered transit
7 - Outdoor
The duration of study days included in this investigation ranged from
22 to 26 hours. To investigate the basic structure of activity patterns
and the time spent in each of the seven broad microenvironments identified
above, it was necessary to standardize the duration of the study day to 24
hours. This meant that those observed longer than 24 hours were reduced to
24 hours, and those observed for less were adjusted to provide 24 hours.
The study days for most participants started and ended around 6 p.m. at
their residence. For most persons it was considered reasonable to extend
or reduce the time at their last location, such as residence, to
standardize the observation period to 24 hours. As a result, this study
provides data for 1066 person-days from the two cities (526 from Denver and
540 from Washington, D.C.) which can be evaluated in order to provide basic
activity pattern structures which may be more generally applicable to human
exposure studies.
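The standardization step can be sketched as follows, assuming each study day is held as a time-ordered list of (location, minutes) pairs. That data layout and the function name `standardize_day` are assumptions for illustration; the paper states only that time at the last location was extended or reduced.

```python
def standardize_day(segments, target=1440):
    """Adjust the final segment so the diary covers exactly `target` minutes.

    `segments` is a list of (location, minutes) pairs in time order.
    """
    total = sum(minutes for _, minutes in segments)
    location, last = segments[-1]
    adjusted = last + (target - total)
    if adjusted < 0:
        raise ValueError("last segment too short to absorb the reduction")
    return segments[:-1] + [(location, adjusted)]


# Hypothetical 25-hour diary that ends at the residence.
diary = [("residence", 840), ("transit", 45), ("office", 480),
         ("transit", 50), ("residence", 85)]
day = standardize_day(diary)
```

The same function extends a short diary: a 22-hour record has 120 minutes added to its final segment. A record whose last segment is shorter than the required reduction raises an error rather than being silently truncated, since the paper does not say how such cases were handled.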
RESULTS
Figure 2 and Table 2 compare the mean activity duration in the two
cities. The bars depict the major location categories. They are
subdivided to represent the activities. The indoor residence (INDRES) is
subdivided into primary and secondary activities: sleep and
others, respectively. The other indoor location is subdivided between
office time and time spent in other indoor locations. The outdoor location
time is very small in total duration and has no subdivision. Transit time
is divided into primary mode of transport, internal combustion driven
vehicles, and secondary modes of transportation, including walking,
cycling, or travel by train. As illustrated in the graphic and table, the
"likeness" in the two cities is remarkable and greater than anticipated.
There are some differences which distinguish the two cities, however.
Washington, D.C., is more of an office-going city, and residents need a
little longer transit duration to traverse the larger metropolitan area.
Denver residents appear to spend the minutes saved in commuting getting
extra sleep.
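The city comparisons in Table 2 can be checked approximately from the reported means and standard errors alone. The sketch below assumes an unpooled two-sample comparison and uses the normal approximation for the p-value; the paper does not state which form of t-test was used, so this is an assumed reconstruction rather than the authors' procedure.

```python
import math


def two_sample_z(mean1, se1, mean2, se2):
    """Test statistic and two-sided p-value for a difference in means,
    using the normal approximation (reasonable for n > 500 per group)."""
    z = (mean1 - mean2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Sleeping time (minutes): Denver vs. Washington, from Table 2.
z, p = two_sample_z(513.3, 5.2, 495.0, 5.0)
```

For the sleeping row this gives a statistic of about 2.5 and a p-value of roughly 0.01, in line with the significance level tabulated for that row.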
The results of examination of some of the characteristics which
affect activity duration in each of the microenvironments are shown in
Figure 3 and Table 3 which contrast the week-day time distributions with
those of the weekend for all study participants (Denver and Washington,
D.C., combined). There is evidently more time to sleep, and much more time
is spent in residence during weekends. Less time is spent in transit, and
relatively few work during weekends.
Figure 4 and Table 4 show significant differences between activity
patterns based on sex for both cities. Women spend more time at home and
less time in office and indoor-other and in transit than do men.
Significant differences in time budget distribution are also evident among
different age groups as illustrated in Figure 5 and Table 5. Humans tend
to sleep less as age increases, and older people spend more time at their
residence and less time away. Transit time also decreases progressively
with age.
Figures 6a and 6b and the associated tables (6a and 6b) compare the
two cities for weekdays and weekends respectively. They confirm the larger
office and commute time for Washingtonians during weekdays. The sex
characteristics are similar in the two cities (Figures 7a and 7b and
Tables 7a and 7b). Women spend more time in their residence than men and
less time in other indoor locations or in transit.
4-5
-------
[Figure 2 graphic: bar chart of minutes per 24 hours by location (INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington, each bar divided into primary and secondary activities; INDRES bars labeled 1059 (Denver) and 1026 (Washington).]
FIGURE 2 - Comparison of Time Budgets by City
TABLE 2 - Comparison of Time Budgets by City
(minutes per 24 hours)
                                    Denver             Washington
                                    n = 526            n = 540            T-TEST
LOCATION          ACTIVITY          MEAN     STD.ERR   MEAN     STD.ERR   P<
Indoor Residence  All               1059.2   10.2      1025.9   10.6      .023
                  Sleeping          513.3    5.2       495.0    5.0       .012
                  Other             545.9    10.5      513.8    10.0      .297
Indoor Other      All               257.6    9.4       276.5    10.0      .169
                  Office            117.4    8.6       179.7    9.6       .0001
                  Other             140.2    7.2       96.8     5.8       .0001
Outdoor           All               21.8     2.4       21.2     2.3       .8562
Transit           All               101.3    3.7       116.5    4.2       .0061
                  Combustion        88.8     3.5       96.4     4.0       .157
                  Non-Combustion    12.5     1.5       20.1     2.1       .003
4-6
-------
[Figure 3 graphic: bar chart of minutes per 24 hours by location (INDOTH, INDRES, OUTDOR, TRNSIT) for weekdays and weekends, each bar divided into primary and secondary activities; weekend INDRES bar labeled 1186.]
FIGURE 3 - Comparison of Time Budgets by Week Part
TABLE 3 - Comparison of Time Budgets by Week Part
(minutes per 24 hours)
                                    Weekday            Weekend
                                    n = 798            n = 268            T-TEST
LOCATION          ACTIVITY          MEAN     STD.ERR   MEAN     STD.ERR   P<
Indoor Residence  All               994.1    8.4       1186.0   11.7      .0001
                  Sleeping          490.7    4.1       543.8    7.3       .0001
                  Other             503.4    8.2       642.0    13.0      .0001
Indoor Other      All               313.4    8.0       129.4    9.4       .0001
                  Office            192.5    8.0       19.3     5.3       .0001
                  Other             121.0    5.6       110.0    8.1       .3089
Outdoor           All               19.9     1.9       26.1     3.4       .1094
Transit           All               112.5    3.2       98.6     5.9       .0316
                  Combustion        94.9     3.0       85.8     5.6       .1423
                  Non-Combustion    17.6     1.5       12.7     2.2       .1015
4-7
-------
FIGURE 4 - Comparison of Time Budgets by Sex
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for females and males; primary and secondary activities shown separately.]

TABLE 4 - Comparison of Time Budgets by Sex (minutes per 24 hours)

                                      Female (n = 637)    Male (n = 429)         T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1096.8      9.2      961.4     11.2         .0001
                  Sleeping            512.5      4.7      491.4      5.7         .004?
                  Other               584.2      9.2      470.0     10.8         .0001
Indoor Other      All                 228.2      8.6      325.0     10.9         .0001
                  Office              115.6     10.9      198.4     10.8         .0001
                  Other               112.6      5.6      126.6      8.1         .142?
Outdoor           All                  15.8      1.6       29.8      3.3         .0001
Transit           All                  99.1      3.3      123.6      4.9         .0001
                  Combustion           84.6      2.9      104.5      5.0         .0003
                  Non-Combustion       14.5      1.8       19.1      1.8         .0639

? = digit illegible in the source.
-------
FIGURE 5 - Comparison of Time Budgets by Age Group
[Bar chart: minutes per 24 hours in the four location categories for age groups 18-24, 25-44, 45-59, and 60-70; primary and secondary activities shown separately.]

TABLE 5 - Comparison of Time Budgets by Age Group (minutes per 24 hours)

                                                 AGE GROUPS
                                      18-24     25-44     45-59     60-70     F-TEST
LOCATION          ACTIVITY            n = 99    n = 594   n = 236   n = 137   P<
Indoor Residence  All                 1015.7    1013.6    1044.8    1181.8    .0001
                  Sleeping             519.7     506.1     503.4     484.9    .1401
                  Other                496.0     507.5     541.4     696.9    .0001
Indoor Other      All                  289.1     289.3     271.6     147.9    .0001
                  Office               132.1     170.9     154.9      54.4    .0001
                  Other                157.0     118.4     116.6      92.5    .0158
Outdoor           All                   18.6      22.4      20.9      20.3    .9079
Transit           All                  116.2     114.6     102.8      90.0    .0197
                  Combustion           101.4      97.2      90.9      69.4    .0062
                  Non-Combustion        14.8      17.5      11.9      20.5    .2086
-------
FIGURE 6a - Comparison of Time Budgets by City on Weekday
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 6a - Comparison of Time Budgets by City on Weekday (minutes per 24 hours)

                                      Denver (n = 378)    Washington (n = 420)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1011.9     12.0      978.0     11.7         .042?
                  Sleeping            499.9      5.9      482.4      5.6         .032?
                  Other               512.0     12.3      495.6     11.1         .32??
Indoor Other      All                 301.9     11.3      323.8     11.2         .170?
                  Office              158.4     11.1      223.1     11.2         .0001
                  Other               143.5      8.8      100.7      7.0         .0001
Outdoor           All                  19.7      2.9       20.1      2.5         .9137
Transit           All                 106.2      4.4      118.1      4.6         .0619
                  Combustion           93.3      4.2       96.3      4.3         .6228
                  Non-Combustion       12.9      1.7       21.8      2.5         .0034

? = digit illegible in the source.
-------
FIGURE 6b - Comparison of Time Budgets by City on Weekend
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 6b - Comparison of Time Budgets by City on Weekend (minutes per 24 hours)

                                      Denver (n = 148)    Washington (n = 120)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1180.0     15.7     1193.5     17.6         .564?
                  Sleeping            547.5     10.3      539.3     10.4         .577?
                  Other               632.4     18.1      654.2     18.5         .4011
Indoor Other      All                 144.4     12.9      110.9     13.7         .076?
                  Office               12.6      5.3       27.7      9.7         .155?
                  Other               131.8     12.2       83.2      9.7         .002?
Outdoor           All                  27.1      4.2       24.9      5.4         .749?
Transit           All                  88.7      6.6      110.8     10.1         .060?
                  Combustion           77.2      6.2       96.5      9.8         .0??
                  Non-Combustion       11.5      3.0       14.3      3.4         .544?

? = digit illegible in the source.
-------
FIGURE 7a - Comparison of Time Budgets for Females by City
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 7a - Comparison of Time Budgets for Females by City (minutes per 24 hours)

                                      Denver (n = 345)    Washington (n = 292)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1107.9     12.0     1083.7     14.2         .193?
                  Sleeping            522.6      6.5      500.7      6.8         .020?
                  Other               585.3     12.5      582.9     13.6         .897?
Indoor Other      All                 215.4     11.1      242.8     13.3         .1191
                  Office               87.1      9.8      149.2     12.5         .0001
                  Other               128.7      8.1       93.6      7.4         .0014
Outdoor           All                  16.1      1.9       15.5      2.6         .8332
Transit           All                  99.9      4.6       98.0      4.7         .7679
                  Combustion           86.9      4.3       81.9      3.8         .3894
                  Non-Combustion       13.1      3.1       16.1      3.0         .4049

? = digit illegible in the source.
-------
FIGURE 7b - Comparison of Time Budgets for Males by City
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 7b - Comparison of Time Budgets for Males by City (minutes per 24 hours)

                                      Denver (n = 181)    Washington (n = 248)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                 966.4     16.9      957.8     14.9         .703?
                  Sleeping            495.6      8.6      488.3      7.5         .5211
                  Other               470.8     17.6      469.5     13.7         .950?
Indoor Other      All                 337.2     15.9      316.1     14.8         .33?
                  Office              175.1     15.8      215.5     14.6         .060?
                  Other               162.1     14.1      100.7      9.3         .000?
Outdoor           All                  32.5      5.8       27.9      3.9         .51??
Transit           All                 103.8      6.1      138.2      7.1         .000?
                  Combustion           92.4      6.1      113.4      7.4         .02?
                  Non-Combustion       11.4      1.9       24.8      2.7         .0002

? = digit illegible in the source.
-------
The comparisons of age groups within cities are shown in the 8a-d
series of figures and tables. There does not appear to be any significant
interaction between these characteristics and the two cities.
DISCUSSION
The Denver and Washington, D.C., databases used in this study
represent distributions of occupancy duration periods for the seven broad
microenvironments identified above and can be sampled for simulation of
activity patterns. They can be made to accommodate a wide variety of
activity groups identified in many different activity pattern studies.
Specific activities of importance for a particular pollutant can also
be modeled within one of these seven microenvironments. For example,
cooking with gas stoves can be modeled as a part of the time spent in a
residence other than sleeping. Similarly, an occupationally hazardous
activity would be a part of Indoor Other.
Despite the significant difference between the two cities in specific
characteristics, the overall similarities found between the two cities are
quite notable. Only the winter season in both cities is represented,
however, which may account for the very short outdoor duration in both
cities. The characteristic that Washington, D.C., residents are
predominantly office workers is seen in the significant difference in
indoor-office duration in the two samples. Similarly, the larger
metropolitan area of Washington, D.C., compared to that of Denver,
translates into the longer average transit duration. Whether or not the
mile-high elevation of Denver has anything to do with the longer sleep
duration observed in that city, compared to Washington, D.C., cannot be
determined from this limited data set. These types of city-specific
characteristics must be taken into consideration when modeling activity
patterns for use in total human exposure modeling.
The similarities and differences observed in the characteristics of
these two cities must now be translated into the efficient and accurate
simulation modeling of a basic activity patterns structure. One can sample
from the database of detailed activity patterns for a limited number of
microenvironments in Denver and Washington, D.C., according to the
characteristics which affect activity patterns such as week part, or
demographic characteristics such as sex and age. City-specific
characteristics, such as commute-time differences, can be built into the
simulation from available census information for different cities. Sex and
age differences will be reflected by proportional sampling. Presently,
there is no seasonal information available. It is quite possible, however,
that the Cincinnati activity study may provide such information to add to
the database. Hence, a comprehensive database reflecting season, city, and
demographic characteristics for activity patterns does not, in principle,
seem a formidable undertaking.
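
The sampling scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the diary records, the stratum definitions, and the population shares are hypothetical placeholders (they are not values from the Denver or Washington databases), and the names `build_index` and `sample_budgets` are invented for this sketch.

```python
import random
from collections import defaultdict

# Hypothetical diary records: (week part, sex, age group, minutes per microenvironment).
# Each budget sums to 1440 minutes; the values are illustrative, not survey data.
DIARIES = [
    ("weekday", "F", "25-44", {"residence": 1010, "indoor_other": 300, "outdoor": 20, "transit": 110}),
    ("weekday", "F", "25-44", {"residence": 1050, "indoor_other": 270, "outdoor": 15, "transit": 105}),
    ("weekday", "M", "25-44", {"residence": 950, "indoor_other": 340, "outdoor": 30, "transit": 120}),
    ("weekend", "M", "60-70", {"residence": 1200, "indoor_other": 130, "outdoor": 25, "transit": 85}),
]

def build_index(records):
    """Group time budgets by (week part, sex, age group) stratum."""
    index = defaultdict(list)
    for week_part, sex, age_group, budget in records:
        index[(week_part, sex, age_group)].append(budget)
    return index

def sample_budgets(index, shares, n, rng):
    """Draw n budgets, choosing each stratum in proportion to its population share."""
    strata = list(shares)
    weights = [shares[s] for s in strata]
    picks = rng.choices(strata, weights=weights, k=n)
    return [(s, rng.choice(index[s])) for s in picks]

index = build_index(DIARIES)
# Hypothetical population shares for a target city (for example, from census data).
shares = {
    ("weekday", "F", "25-44"): 0.5,
    ("weekday", "M", "25-44"): 0.3,
    ("weekend", "M", "60-70"): 0.2,
}
draws = sample_budgets(index, shares, n=1000, rng=random.Random(0))
mean_transit = sum(budget["transit"] for _, budget in draws) / len(draws)
print(f"mean simulated transit time: {mean_transit:.1f} min/day")
```

Proportional sampling is expressed here through the `shares` weights; a city-specific adjustment such as a commute-time difference could be applied to the drawn budgets afterward.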
With the activity patterns so structured, the next step is to
identify the microenvironments which are important for modeling exposure to
specific pollutants. For benzene exposure, for example, the important
activities appear to be active and passive smoking and exposure to
gasoline. The TEAM study (5), though not designed to provide activity pattern
data, can provide seasonal and geographical information for benzene
exposure modeling.
-------
FIGURE 8a - Comparison of 18-24 Age Group Time Budgets
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 8a - Comparison of 18-24 Age Group Time Budgets by City (minutes per 24 hours)

                                      Denver (n = 526)    Washington (n = 540)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1053.0     36.7      984.6     32.7         .167?
                  Sleeping            515.4     23.7      523.2     21.0         .8062
                  Other               537.6     38.4      461.4     28.3         .106?
Indoor Other      All                 263.0     31.4      310.8     33.3         .298?
                  Office               95.3     25.4      162.7     32.3         .114?
                  Other               167.7     26.6      148.1     25.6         .595?
Outdoor           All                  11.4      3.9       24.6      9.0         .2104
Transit           All                 111.6     15.2      120.0     11.2         .6504
                  Combustion           97.0     13.9      105.2     11.4         .6496
                  Non-Combustion       14.6      7.4       14.9      4.8         .9791

? = digit illegible in the source.
-------
FIGURE 8b - Comparison of 25-44 Age Group Time Budgets
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 8b - Comparison of 25-44 Age Group Time Budgets by City (minutes per 24 hours)

                                      Denver (n = 300)    Washington (n = 294)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1021.8     13.1     1005.3     14.4         .398?
                  Sleeping            517.6      6.7      494.4      6.9         .016?
                  Other               504.2     13.1      510.9     13.4         .7215
Indoor Other      All                 286.5     12.6      292.2     13.6         .7601
                  Office              138.7     12.1      203.8     13.4         .000?
                  Other               147.8     10.0       88.4      7.3         .0001
Outdoor           All                  23.9      3.6       20.9      3.3         .5337
Transit           All                 107.8      5.0      121.6      6.1         .0790
                  Combustion           95.4      4.8       99.0      5.6         .6345
                  Non-Combustion       12.3      2.0       22.7      3.2         .0060

? = digit illegible in the source.
-------
FIGURE 8c - Comparison of 45-59 Age Group Time Budgets
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 8c - Comparison of 45-59 Age Group Time Budgets by City (minutes per 24 hours)

                                      Denver (n = 109)    Washington (n = 127)   T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1066.7     23.6     1025.9     22.2         .2080
                  Sleeping            516.7     10.4      491.9      8.6         .064?
                  Other               550.0     22.8      534.0     20.4         .6011
Indoor Other      All                 258.0     21.6      283.2     20.6         .399?
                  Office              124.7     19.1      180.9     19.4         .039?
                  Other               133.3     16.4      102.3     12.5         .128?
Outdoor           All                  24.8      5.3       17.5      3.5         .2370
Transit           All                  90.4      7.2      113.4      8.9         .0508
                  Combustion           81.2      7.3       99.2      8.7         .1103
                  Non-Combustion        9.3      2.4       14.2      3.3         .2446

? = digit illegible in the source.
-------
FIGURE 8d - Comparison of 60-70 Age Group Time Budgets
[Bar chart: minutes per 24 hours in INDOTH, INDRES, OUTDOR, and TRNSIT for Denver and Washington; primary and secondary activities shown separately.]

TABLE 8d - Comparison of 60-70 Age Group Time Budgets by City (minutes per 24 hours)

                                      Denver (n = 72)     Washington (n = 65)    T-TEST
LOCATION          ACTIVITY            MEAN     STD.ERR    MEAN     STD.ERR       P<
Indoor Residence  All                1207.6     20.6     1153.2     25.7         .0981
                  Sleeping            488.9     14.9      480.4     13.9         .6770
                  Other               718.7     24.6      672.8     25.1         .1945
Indoor Other      All                 133.2     16.9      164.1     23.2         .2784
                  Office               31.3     14.1       82.0     19.7         .0350
                  Other               102.0     11.4       82.0     14.3         .2773
Outdoor           All                  14.6      3.4       26.6      6.6         .1004
Transit           All                  84.4      8.1       96.1      9.1         .3669
                  Combustion           67.5      7.1       71.6      9.2         .7227
                  Non-Combustion       16.9      4.0       24.5      5.4         .2605
-------
From this investigation comes the concept of a national activity
patterns database. Information from extensive activity pattern research,
as reviewed by Ott (6), including time use studies by Robinson et al. (7), can be
utilized to build a national activity patterns database. It can form the
basis for sampling activities for exposure simulation modeling for any
given city.
For some specific pollutants, however, additional modeling for
particular microenvironments will be required. For example, special
modeling efforts will be required to accommodate active and passive smoking
activities. Finally, background concentrations of some pollutants must be
estimable for different geographic regions by season and climate, in order
to realistically simulate exposure to those pollutants.
-------
REFERENCES
1. Johnson, T., "A Study of Personal Exposure to Carbon Monoxide in
Denver, Colorado," Report No. EPA-600/4-84-014. NTIS No.
PB84-146125. U.S. Environmental Protection Agency, Research Triangle
Park, NC, 1984.
2. Johnson, T., "A Study of Personal Exposure to Carbon Monoxide in
Denver, Colorado," Paper No. 84-121.3 presented at the 77th Annual
Meeting of the Air Pollution Control Association, San Francisco, CA,
June 1984.
3. Hartwell, T.D., C.A. Clayton, R.M. Michie, Jr., R.W. Whitmore, H.S.
Zelon, S.M. Jones, and D.A. Whitehurst, "A Study of Carbon Monoxide
Exposure of Residents of Washington, D.C., and Denver, Colorado,"
Report No. EPA-600/S4-84-031. NTIS No. PB84-183516. U.S.
Environmental Protection Agency, Research Triangle Park, NC, June
1984.
4. Hartwell, T.D., C.A. Clayton, R.M. Michie, Jr., R.W. Whitmore, H.S.
Zelon, and D.A. Whitehurst, "A Study of Carbon Monoxide Exposure of
the Residents in Washington, D.C.," Paper No. 84-121.4 presented at
the 77th Annual Meeting of the Air Pollution Control Association, San
Francisco, CA, June 1984.
5. Wallace, L.A., "The Total Exposure Assessment Methodology (TEAM)
Study: Summary and Analysis: Volume I," Report No. EPA-600/6-87-002a,
June 1987.
6. Ott, W.R., "Human Activity Patterns: A Review of the Literature for
Estimation of Exposure to Air Pollution," presented at the Research
Planning Conference on Human Activity Patterns, U.S. Environmental
Protection Agency, Environmental Monitoring Systems Laboratory, Las
Vegas, NV, May 10-12, 1988.
7. Robinson, J., and J.M. Holland, "Trends in Americans' Use of Time:
Some Preliminary 1975-1985 Comparisons," Draft Report to the Office
of Technology Assessment, U.S. Congress. Survey Research Center,
University of Maryland, May 1986.
-------
A COMPARATIVE EVALUATION OF SELF-REPORTED AND INDEPENDENTLY-OBSERVED
ACTIVITY PATTERNS IN AN AIR POLLUTION HEALTH EFFECTS STUDY
by: Thomas H. Stock and Maria T. Morandi
University of Texas
Health Science Center at Houston
School of Public Health
P.O. Box 20186
Houston, Texas 77225
ABSTRACT
As part of a community-based air pollution health effects study, 29
asthmatics residing in two Houston neighborhoods participated in a personal
monitoring project. Participants maintained personal log forms indicating
their presence in one or more of seven major microenvironments (ME's)
during each clock hour. These self-recorded activities were compared with
independent observations of the subjects' activities by technicians
performing personal air pollution monitoring. Considering the independent
observations as "truth," location misclassification error rates were
calculated, and the effects of age, gender, resident neighborhood,
complexity of activity, and training on these error rates were examined.
Of those variables considered, complexity of activity in an hour appeared
to have the greatest effect on error.
-------
INTRODUCTION
Accurate assessment of human exposure to air pollutants requires
either direct measurement by means of personal monitoring (PM) or an
indirect approach whereby personal time and location information are
combined with pollutant monitoring in relevant microenvironments (ME's) to
estimate personal exposure (1-3). Direct PM is performed relatively
infrequently, because of limitations imposed by such considerations as
cost, manpower, availability of appropriate monitoring devices, and
acceptance by study subjects (4). Although the indirect approach to
exposure assessment is generally more feasible, it requires both accurate
and appropriate ME pollutant monitoring as well as reliable records of the
temporal and spatial aspects of individual daily activity patterns (2).
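
The indirect approach described above amounts to weighting each microenvironment's concentration by the time spent there. A minimal sketch follows, assuming a single average concentration per ME; the minutes, the concentrations, and the function name are hypothetical and serve only to show the arithmetic.

```python
def time_weighted_exposure(minutes, concentration):
    """Time-weighted average concentration over the occupied microenvironments."""
    total_minutes = sum(minutes.values())
    weighted_sum = sum(minutes[me] * concentration[me] for me in minutes)
    return weighted_sum / total_minutes

# Hypothetical 12-hour daytime record (minutes in each ME) and
# hypothetical ME concentrations (arbitrary units).
minutes = {"home": 420, "school_work": 180, "outdoor_neighborhood": 60, "closed_vehicle": 60}
concentration = {"home": 10.0, "school_work": 15.0, "outdoor_neighborhood": 60.0, "closed_vehicle": 30.0}

exposure = time_weighted_exposure(minutes, concentration)
print(f"time-weighted average exposure: {exposure:.1f}")
```

An error in the recorded location moves minutes from one ME to another, so its effect on the estimate depends on how different the two concentrations are.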
The distributions of average times spent by individuals in various
ME's have been reported from time-budget surveys (5,6), as well as from
some investigations of air pollutant exposures (7-11). While these studies
have provided much valuable information on relative exposure times of
populations in different ME's, as well as a relatively consistent pattern
of average time distributions among various population groups, the reported
data do not address some critical issues in the application of activity
patterns to estimating personal pollutant exposures. For example, it is
necessary to obtain information on individual variability in activity
patterns, distributions of location by time, and reliability of recorded
activities.
The objective of this investigation is to examine the issue of
reliability of self-reported activity patterns. As part of a community-
based air pollution health effects study (4,12,13), 29 asthmatics residing
in two Houston communities participated in a personal monitoring project in
which study subject activity patterns were both self-recorded on daily log
forms and independently observed and recorded by technicians operating the
monitoring instruments. The accuracy of individual self-reported activity
logs was evaluated considering the technician-reported activities as
"truth."
METHODS
PROTOCOL
Of the 51 study subjects in the six-month (May-October, 1981) Houston
Area Asthma Study, 30 agreed to participate in a personal monitoring (PM)
project. Each participant was accompanied by two air monitoring
technicians carrying portable analyzers for ozone and respirable
particulate mass during one or two daytime periods (between 7 A.M. and 7
P.M., CDT). Detailed observations of the subject's location and activity
were recorded by the technicians. The participants were not generally
aware that their activities were being recorded. Because of instrument
failure, one volunteer was not monitored; 21 participants were monitored
during two daytime periods while 8 subjects were monitored once.
All asthma study participants maintained personal activity log forms
indicating their presence in one or more of seven major ME's (three indoor,
two outdoor, and two transportation-mode) during each clock hour. The
final daily activity log design was selected after pretesting it with a
subgroup of the asthma study participants prior to the field portion of the
investigation. All study participants were given a pre-health-study
training course which included a trial practice on recording activities on
the log forms. The personal logs were maintained on all study days,
whether or not PM occurred, and were normally completed at the end of each
12-hour period (daytime or nighttime). The daytime version of the activity
log is shown in Figure 1. The subjects were instructed to indicate their
presence in one or more ME at any given hour by drawing a horizontal line
within the appropriate box. No attempt was made to estimate a temporal
resolution of less than one hour. The dated daily activity log forms were
provided in batches sufficient to cover 10 days of the field portion of the
health study at a time. All participants in the asthma study paid a weekly
visit to a field office where study staff reviewed the week's activity logs
with them. Subjects were questioned about missing or apparently incorrect
records which were sometimes revised based on subject verification. New
activity log forms were also provided during the visits. These weekly
meetings also helped maintain the motivation level of the participants
throughout the study.
DEMOGRAPHIC DESCRIPTION
Selected demographic characteristics of the PM study group are summarized
in Table 1. There was approximately an even distribution of study subjects
by both study area (site) and
-------
FIGURE 1. DAYTIME VERSION OF DAILY ACTIVITY LOG

                                                   HOURS
                                   morning          noon          evening
PLACE                              7  8  9  10  11  12  1  2  3  4  5  6  7
INDOORS    HOME
           SCHOOL OR WORK
           ELSEWHERE
OUTDOOR    IN NEIGHBORHOOD
           OUT OF NEIGHBORHOOD
IN OPEN CAR, TRUCK OR BUS
IN CLOSED CAR, TRUCK OR BUS
-------
TABLE 1. SELECTED DEMOGRAPHIC CHARACTERISTICS OF THE STUDY GROUP

Study Area         CL        SS
                   15        14

Age, yrs.          Mean      Median    Range
                   17.9      12        7-54

Sex                Male      Female    Total
  ≤ 18 yrs.        13         8        21
  > 18 yrs.         1         7         8
  Total            14        15        29
-------
calculating the frequency of this occurrence in the total number of valid
hours (561 hours). Second, error rates for each ME were calculated by
assigning a value of 1 if a disagreement occurred in any hour (0
otherwise), and calculating the frequency of this occurrence for each ME.
A more refined categorization of ME error rate was obtained by assigning a
value of -1 to errors of commission (i.e., the participant indicated he was
present in an ME when he was not), a value of 0 if both participant and
technician reported that the subject was not present in an ME, a value of 1
for error of omission (i.e., the participant reported not to be present in
an ME when he actually was), and a value of 2 if both technician and
participant reported presence in an ME. Finally, major relevant
misclassification errors (i.e., reporting to be indoors when actually
outdoors, or to be in a vehicle with open windows when the vehicle had
closed windows) were also calculated.
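
The hourly coding scheme described above can be sketched as follows. The ME labels, the function names, and the three example hours are hypothetical; only the coding values (-1, 0, 1, 2) and the rule that any disagreement within an hour counts as an overall error come from the text.

```python
MES = ["home", "school_work", "elsewhere", "in_neighborhood",
       "out_of_neighborhood", "open_vehicle", "closed_vehicle"]

def code_hour(reported, observed):
    """Code each ME for one hour: 2 agree present, 0 agree absent,
    1 omission (observed but not reported), -1 commission (reported but not observed)."""
    codes = {}
    for me in MES:
        was_reported, was_observed = me in reported, me in observed
        if was_reported and was_observed:
            codes[me] = 2
        elif was_observed:
            codes[me] = 1
        elif was_reported:
            codes[me] = -1
        else:
            codes[me] = 0
    return codes

def overall_error_rate(hours):
    """Fraction of hours with a disagreement in at least one ME."""
    errors = sum(1 for reported, observed in hours if set(reported) != set(observed))
    return errors / len(hours)

def me_error_rate(hours, me):
    """Fraction of hours in which the log and the technician disagree for one ME."""
    errors = sum(1 for reported, observed in hours if (me in reported) != (me in observed))
    return errors / len(hours)

# Hypothetical hours: (ME's marked on the log, ME's observed by the technician).
hours = [
    ({"home"}, {"home"}),
    ({"home"}, {"home", "in_neighborhood"}),   # omission of "in_neighborhood"
    ({"closed_vehicle"}, {"open_vehicle"}),    # vehicle misclassification
]
print(overall_error_rate(hours), me_error_rate(hours, "home"))
```

Separating codes 1 and -1 is what allows omission and commission rates to be tallied independently for each ME.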
The error rates were evaluated in terms of classification
characteristics of the study participant: sex, area of residence (site),
age, date (i.e., first versus second day of personal monitoring), and the
number of ME's visited during each hour. All computations and statistical
analyses were performed using the Statistical Package for the Social
Sciences SPSS-X, version 2 (15). Tests of statistical difference were
considered significant at p < 0.05.
RESULTS
A summary of overall error rates for several potentially relevant
participant classification categories, based on 561 hours of comparison
data, is presented in Table 2.
The summary overall error rate for all subjects independent of
classification category was 35.5% (or expressed as overall agreement,
64.5%). Overall error rates for individuals ranged from 0% for a subject
who remained inside her home throughout two complete monitoring periods, to
80% for a 7-year-old participant, who was the youngest subject in the PM
study. Chi-squared analysis for each stratification variable indicated
significant (p < 0.01) differences in overall error rates for gender and
the number of ME's visited during a given hour (# ME's/hr; as independently
observed by the technicians), and for age category and monitoring day
(date; for subjects with two PM days) variables (p < 0.05). Females had
lower overall error rates than males; adults had lower rates than younger
subjects. There was no statistical difference in overall error rates
between SS participants and CL participants.
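
A chi-squared comparison of error rates between two participant groups can be sketched as follows. The hour counts are hypothetical: the paper does not list the number of hours per group, and the counts here were picked only so that the group error rates match the reported 29.1% and 41.6%. The use of a continuity correction is likewise an assumption about how the statistic was computed.

```python
import math

def chi2_2x2(a, b, c, d, yates=True):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]].

    With yates=True a continuity correction is applied (an assumption here;
    the source does not say whether one was used).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical hour counts (error hours, correct hours) for each group.
female_err, female_ok = 80, 195    # 80 / 275 = 29.1%
male_err, male_ok = 119, 167       # 119 / 286 = 41.6%

chi2 = chi2_2x2(female_err, female_ok, male_err, male_ok)
p = p_value_df1(chi2)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

The one-degree-of-freedom tail probability used here applies only to the two-category variables (sex, age, site, date); the comparison across four levels of # ME's/hr has three degrees of freedom and needs a different tail function.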
Summaries of individual ME error rates (i.e., agreement vs.
disagreement) are presented in Tables 3-5 for indoor, outdoor, and vehicle
environments, respectively. As shown in these tables, the pattern of
participant classification variables with significant differences in ME
error rates differs from the overall rates and varies according to ME. The
number of ME's visited in an hour shows the most consistent effect, with a
highly significant (p < 0.001) positive correlation between number of ME's
and error rates for all environment locations. Gender differences were
significant for three of the ME's, with error rates for males higher for
two indoor ME's, and rates for females higher for the closed vehicle ME. A
similarly inconsistent pattern was observed for study area (site). One
indoor, one outdoor, and one vehicle ME exhibited significant differences,
with error rates of SS participants higher for two of the three
environments. A significant effect of monitoring day was apparent for only
two of the indoor ME's, with higher error rates on the first day for both
of these environments. A significant difference in error rate based on age
category was found for only one ME, with the younger group showing a higher
mean error rate.
-------
TABLE 2. SUMMARY OF OVERALL ERROR RATES BY PARTICIPANT CATEGORY

Variable     Category    Error rate (%)    Chi-squared    D.F.    p
Sex          Females         29.1              9.1          1     0.003
             Males           41.6
Age          ≤ 18 yr         38.5              4.1          1     0.042
             > 18 yr         29.5
Site         CL              34.1              0.4          1     0.520
             SS              37.1
Date*        1st             41.4              4.8          1     0.028
             2nd             30.9
# ME's/hr    1               14.1            248.8          3     0.000
             2               80.4
             3               82.0
             4              100.0
Summary                      35.5

Chi-squared significant at p < 0.05
* includes participants monitored during two different dates (447 hours)
-------
TABLE 3. SUMMARY OF INDOOR ME ERROR RATES BY PARTICIPANT CATEGORIES

                                   Mean Error Rate (%)
Variable     Category    Home          School/work    Elsewhere
Sex          Female       9.1  **        1.5  ***       NS
             Male        18.2            8.0
Age          ≤ 18 yr      NS             NS             NS
             > 18 yr
Site         CL          10.3  *         NS             NS
             SS          17.8
Date         1st         22.0  **        NS            17.8  ***
             2nd         10.5                           4.7
# ME's/hr    1            4.7  ***       2.1  ***       3.9  ***
             2           29.5            6.3           17.0
             3           44.0           20.0           36.0
             4           28.6           14.3           14.3
Summary                  13.7            4.8            9.6

*** p < 0.001
**  p < 0.01
*   p < 0.05
-------
TABLE 4. SUMMARY OF OUTDOOR ME ERROR RATES BY PARTICIPANT CATEGORIES

                                   Mean Error Rate (%)
Variable     Category    In neighborhood    Out of neighborhood
Sex          Female         NS                 NS
             Male
Age          ≤ 18 yr       18.3  **            NS
             > 18 yr        8.4
Site         CL             NS                  3.3  *
             SS                                 7.7
Date         1st            NS                 NS
             2nd
# ME's/hr    1              5.7  ***            2.6  ***
             2             31.3                12.5
             3             32.0                12.0
             4             78.6                 0.0
Summary                    15.0                 5.3

*** p < 0.001
**  p < 0.01
*   p < 0.05
-------
TABLE 5. SUMMARY OF VEHICLE ME ERROR RATES BY PARTICIPANT CATEGORIES

                                   Mean Error Rate (%)
Variable     Category    Opened windows     Closed windows
Sex          Female         NS                  9.5  ***
             Male                               2.4
Age          ≤ 18 yr        NS                 NS
             > 18 yr
Site         CL             NS                  8.6  **
             SS                                 2.7
Date         1st            NS                 NS
             2nd
# ME's/hr    1              2.9  ***            3.1  ***
             2             18.8                11.6
             3             48.0                 6.0
             4             50.0                35.7
Summary                    11.4                 5.9

*** p < 0.001
**  p < 0.01
*   p < 0.05
-------
Summary error rates for ME's ranged from 4.8% to 15.0%, with higher
rates for indoor and outdoor location categories associated with the ME's
occupied most frequently ("home" for the indoor and "in neighborhood" for
the outdoor categories). When these summary ME error rates are sorted by
error type, i.e., omission (+1) or commission (-1), the results indicate
that, for six of the seven ME's, rates of error of omission are
approximately twice as high as rates of error of commission. Only for
closed vehicles is this rate ratio reversed.
In order to ascertain the contribution of each of the participant
classification variables to the variance in overall and ME error rates,
ANOVA procedures with the classic experimental approach were used. A
summary of the results of these analyses, including significant (p < 0.05)
2-way interactions, is presented in Table 6. The amount of variance in
error rates explained by the participant classification variables ranged
from a low of 22.0% for the "outdoors/out of neighborhood" location to a
high of 55.5% for "vehicle/opened windows." With the exception of the
"outdoors/out of neighborhood" and "vehicle/closed windows" locations,
statistically significant main effects accounted for over half of the
variance explained by the respective models. For the overall error rate,
the number of ME's visited during an hour accounted for approximately 84%
of the total variance explained by the model, as indicated by the
corresponding adjusted beta-squared, while sex, date, and interactions
between sex and the number of ME's visited per hour, sex and date, sex and
site, and age and the number of ME's visited per hour contributed only
marginally.
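
The share of explained variance attributed to the number of ME's visited per hour can be checked directly from the Table 6 entries. The short sketch below divides the adjusted beta-squared for # ME's/hr by the total variance explained by each model; the dictionary keys are shorthand labels introduced for this example.

```python
# (total variance explained by model, adjusted beta-squared for # ME's/hr), from Table 6
TABLE6 = {
    "indoors/home": (27.6, 5.3),
    "indoors/school-work": (38.1, 1.2),
    "indoors/elsewhere": (38.5, 28.2),
    "outdoors/in neighborhood": (51.6, 32.5),
    "outdoors/out of neighborhood": (22.0, 6.2),
    "vehicle/opened windows": (55.5, 46.2),
    "vehicle/closed windows": (36.0, 9.6),
    "overall": (44.3, 37.2),
}

# Share of each model's explained variance carried by # ME's/hr.
share = {me: beta_sq / total for me, (total, beta_sq) in TABLE6.items()}

for me, value in share.items():
    print(f"{me:30s} {100 * value:5.1f}%")
```

For the overall error rate the ratio is 37.2 / 44.3, or about 84%, which is the figure quoted in the text.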
The number of ME's visited per hour was also the major contributor to
error rate (from a low of 27% to a maximum of 83% of the variance explained
by the respective models) for all individual ME locations, with the
exception of "indoors/home" and the "indoors/school-work" categories, where
date of monitoring was more important. Date was a contributing variable
only to the three indoor ME and overall error rates, excluding 2-way
interactions. Age of the participant was important in determining error
rates in outdoor environments, as well as the "vehicle/closed window" ME,
again not considering the interaction terms. Gender of the participants
was significant for explaining the "indoor/home," "indoor/school-work,"
"outdoor/in neighborhood," and overall error rates, as was site for
"outdoors/in neighborhood" and the two vehicle categories. In most cases,
all contributions aside from the "# ME's/hr" contributed relatively little
to overall variance. It is important to note that since a nonorthogonal
analysis approach was used, it is possible for two variables to contribute
to variance through interactions, even if they are not independent
contributors.
-------
TABLE 6. SUMMARY OF ANALYSIS OF VARIANCE RESULTS FOR OVERALL AND ME
CATEGORY ERROR RATES

                           Total variance
                           explained by     Main          Adjusted        Contributing 2-way
ME                         model (%)        effects       beta-squared    interactions
Indoors
  Home                     27.6             Date          10.9            Sex: age, site
                                            # ME's/hr      5.3            Age: date
                                            Sex            2.0
  School/work              38.1             Date          22.1            Sex: age, # ME's/hr
                                            Sex            1.4            Site: age
                                            # ME's/hr      1.2            Age: date
  Elsewhere                38.5             # ME's/hr     28.2            Sex: site
                                            Date           2.2            Site: age
                                                                          # ME's/hr: date
Outdoors
  In neighborhood          51.6             # ME's/hr     32.5            Sex: age
                                            Age            2.0            Site: date
                                            Sex            2.0            Age: date, # ME's/hr
                                            Site           0.01
  Out of neighborhood      22.0             # ME's/hr      6.2            Sex: date, age
                                            Age            1.2            Site: # ME's/hr
                                                                          # ME's/hr: date
Vehicle
  Opened windows           55.5             # ME's/hr     46.2            Sex: # ME's/hr
                                            Site           0.001          Site: age, # ME's/hr, date
                                                                          Age: # ME's/hr
  Closed windows           36.0             # ME's/hr      9.6            Sex: # ME's/hr
                                            Age            3.2            Site: date, # ME's/hr, age
                                            Site           2.8            Age: # ME's/hr
                                                                          # ME's/hr: date
Overall                    44.3             # ME's/hr     37.2            Sex: # ME's/hr, date, site
                                            Sex            1.2            Age: # ME's/hr
                                            Date           0.6
-------
During 12 of the 561 hours reported, the subjects totally
misclassified the outdoor/indoor location, while 4 of 99 hours reported as
having been in a vehicle were misclassified within the category (i.e.,
opened windows vs. closed windows and vice versa).
DISCUSSION
The data analysis results indicate that both overall and ME errors of
self-reported location were not random, but were influenced by some of the
characteristics of the study population. It is also important to indicate
that the design of the study and the quality assurance measures adopted for
the self-reported data tended to reduce disagreements between observed and
reported activities. The overall error is the most stringent measure of
error calculated, because of the strict requirement for agreement in all
ME's for each hour. According to this error measure, the study
participants made an ME misclassification 35.5% of the time. From the
point of view of exposure assessment, some of these misclassifications may
not be crucial for certain pollutants (e.g., ozone exposure in two
different air-conditioned indoor microenvironments), but the overall error
rate attributed equal weights to errors in any ME. However, the
associations of overall and ME error rates with certain characteristics of
the study population are indicative of the types of variables which should
be considered when designing activity log forms.
Although a positive association between error rate and increasing
number of microenvironments visited during an hour was hypothesized, since
accuracy of recall would be expected to decrease with increasing complexity
of activities, it was somewhat surprising that it was the major and most
consistent effect of all the category variables considered. When only one
ME was visited, the overall error rate was as low as 14.1%. The rate
increased dramatically to over 80% when two or more ME's were visited in
the same hour, and to 100% when four ME's were visited within the same
hour. This result has a direct bearing on the design of activity diaries,
and needs to be studied further.
Another unexpected result was the weak association between error
rates and the neighborhood of residence of the study participants, also
used as a surrogate indicator of SES. During the field portion of the
study, the staff had reported more difficulties in obtaining compliance
with protocols on the part of the SS participants. This observation was
not reflected in the SS-CL comparison of observed and reported activities,
probably due to the requirement that the study participants meet weekly with
the study personnel at a field office in each of the communities. If a
participant did not report on the scheduled date, the staff made phone
contact and, if needed, a home visit.
An individual's personal monitoring days were typically separated by
a month or more. Lower error rates for the second day of monitoring
suggest an effect of training on the accuracy of reporting. In part, this
effect could also be due to the intensive interaction between field office
personnel and subjects.
Age and sex, both independently and interactively, had an effect on
some of the error rates. Overall, the females were more accurate than the
males. This effect occurred even after adjusting for age. (The males were
mostly in the younger age group which had larger error rates.) The older
participants also had lower error rates, even after adjusting for the
number of microenvironments visited per hour, which was larger for the
younger group of participants. Field technicians had reported difficulties
with communicating instructions to at least one of the younger members of
the study population. They acted promptly in reporting and correcting this
problem by increasing communication with both the subject and his parents.
This kind of intervention probably resulted in reduced error rates for some
of the categories of participants. It is important to indicate that the
effects of sex and age may only be specific to this study due to the
particular age distribution of the participants.
The participants were more likely to report that they were not in an
individual ME when they actually were (omission), than to record that they
were in a particular ME when they were not (commission). Extreme
misclassification of location (i.e., that which could probably lead to
gross errors in exposure estimates) occurred very infrequently. The
participants reported being in an indoor environment while outdoors, or
vice versa, during only 2.3% of the hours observed. Also, they reported
being in a vehicle with closed windows when the windows were actually open,
or vice versa, during 4% of the total vehicular hours observed.
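The omission/commission distinction can be illustrated with a minimal sketch. It assumes that each observed hour is represented by two sets of microenvironment labels, one observed and one reported; the function names, labels, and sample hours are hypothetical and are not taken from the study's data or software.

```python
def classify_hour(observed, reported):
    """Compare the microenvironments (MEs) observed and reported for one hour."""
    observed, reported = set(observed), set(reported)
    return {
        "omission": sorted(observed - reported),    # was there, did not report it
        "commission": sorted(reported - observed),  # reported it, was not there
    }


def error_counts(hours):
    """Tally omissions and commissions over (observed, reported) pairs."""
    totals = {"omission": 0, "commission": 0}
    for observed, reported in hours:
        result = classify_hour(observed, reported)
        totals["omission"] += len(result["omission"])
        totals["commission"] += len(result["commission"])
    return totals


# Hypothetical hours: (observed MEs, reported MEs)
hours = [
    ({"home", "vehicle"}, {"home"}),       # vehicle omitted
    ({"school"}, {"school"}),              # agreement
    ({"outdoors"}, {"outdoors", "home"}),  # home reported in error
]
print(error_counts(hours))
```

Counting by set difference treats every ME equally, which mirrors the equal weighting of errors noted earlier; a pollutant-specific analysis might weight the categories differently.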
The participant classification variables included in this study
accounted for at most 55.5% of the error variance for any of the
environment categories considered. It is possible that other variables not
taken into consideration in this investigation also affected error rates.
CONCLUSIONS
The results of this study indicate that significant location errors
can occur with self-reported activity data, and that these errors are
linked in part to some of the characteristics of the study population.
Error rates were most affected by the number of environments visited by the
study participants during any given hour, probably associated with
decreased effectiveness of recall as individual activities increase. This
variable needs to be considered by researchers, since it could result in
increased misclassification of location and errors in exposure estimates
for the more active subgroups of their study population. The effects of
other variables such as gender, age, and socioeconomic status (as indicated
by area of residence in this study) can probably be controlled by intensive
and extensive communication between the study participants and staff.
Although this paper is based on data collected in a project funded by
the U.S. Environmental Protection Agency, the analysis of the data
discussed in this paper was not funded by that Agency.
The work described in this paper was not funded by the U.S.
Environmental Protection Agency and therefore the contents do not
necessarily reflect the views of the Agency, and no official endorsement
should be inferred.
REFERENCES
1. WHO. Estimating Human Exposure to Air Pollutants. Offset
Publication No. 69. World Health Organization, Geneva, 1982.
2. Sexton, K. and Ryan, P.B. Assessment of human exposure to air
pollution: methods, measurements, and models. In: A. Watson, R.R.
Bates, and D. Kennedy (eds.), Air Pollution, the Automobile and
Public Health: Research Opportunities for Quantifying Risk. National
Academy Press, Washington, DC, 1988. p. 207.
3. Spengler, J.D. and Soczek, M.L. Evidence for improved ambient air
quality and the need for personal exposure research. Environ. Sci.
Technol. 18: 268A, 1984.
4. Stock, T.H.; Kotchmar, D.J.; Contant, C.F.; Buffler, P.A.; Holguin,
A.H.; Gehan, B.M.; and Noel, L.M. The estimation of personal
exposures to air pollutants for a community-based study of health
effects in asthmatics -- design and results of air monitoring. JAPCA
35: 1266, 1985.
5. Chapin, Jr., F.S. Human Activity Patterns in the City. John Wiley
and Sons, New York, NY, 1974.
6. Robinson, J.P.; Converse, P.E.; and Szalai, A. Everyday life in
twelve countries. In: A. Szalai (ed.), The Use of Time - Daily
Activities of Urban and Suburban Populations in Twelve Countries.
Mouton and Co., The Hague, 1972. p. 114.
7. Dockery, D.W. and Spengler, J.D. Personal exposure to respirable
particulates and sulfates. JAPCA 31: 153, 1981.
8. Fugas, M.; Sega, K.; and Sisovic, A. Study of personal exposure to
airborne respirable particles and carbon monoxide. Environ. Monit.
and Assess. 2: 157, 1982.
9. Spengler, J.D.; Treitman, R.D.; Tosteson, T.D.; Mage, D.T.; and
Soczek, M.L. Personal exposures to respirable particulates and
implications for air pollution epidemiology. Environ. Sci.
Technol. 19: 700, 1985.
10. Quackenboss, J.J.; Spengler, J.D.; Kanarek, M.S.; Letz, R.; and
Duffy, C.P. Personal exposure to nitrogen dioxide: relationship to
indoor/outdoor air quality and activity patterns. Environ. Sci.
Technol. 20: 775, 1986.
11. Morandi, M.T. and Stock, T.H. A comparative study of respirable
particulate microenvironmental concentrations and personal exposures.
Environ. Monit. and Assess. In Press, 1988.
12. Holguin, A.H.; Buffler, P.A.; Contant, C.F.; Stock, T.H.; Kotchmar,
D.J.; Hsi, B.P.; Jenkins, D.E.; Gehan, B.M.; Noel, L.M.; and Mei, M.
The effects of ozone on asthmatics in the Houston area. In: S.D.
Lee (ed.), Evaluation of the Scientific Basis for Ozone/Oxidants
Standards. Air Pollution Control Association, Pittsburgh, PA, 1985.
p. 262.
13. Contant, C.F.; Stock, T.H.; Buffler, P.A.; Holguin, H.A.; and Gehan,
B.M. The estimation of personal exposures to air pollutants for a
community-based study of health effects in asthmatics -- exposure
model. JAPCA 37: 587, 1987.
14. Stock, T.H. Formaldehyde concentrations inside conventional
housing. JAPCA 37: 913, 1987.
15. SPSS, Inc. SPSS-X User's Guide, 2nd edition. McGraw-Hill, New
York, NY, 1986.
-------
ASSESSING ACTIVITY PATTERNS FOR AIR POLLUTION EXPOSURE RESEARCH
by
James H. Adair and John D. Spengler
Harvard University, School of Public Health
665 Huntington Avenue
Boston, MA 02115
ABSTRACT
Time/activity diaries are a means by which microenvironmental exposure
to pollutants can be assessed. This paper presents two such diaries, used
in large scale field research studies, designed to identify human activity
patterns in relation to NO2 and/or PM2.5 exposure. From these patterns,
estimates of human exposure can be obtained which are more sensitive than
those made at fixed central site monitoring locations.
The diaries themselves reflect the research questions and the
technology available at the time of monitoring. The two main parameters,
time and location, vary according to the monitoring device chosen and the
pollution source in question. In turn, each parameter reflects the need to
make the diaries as simple and complete as possible for the target sampling
group.
Results from one of the studies indicate regional, seasonal, and
day-of-the-week differences in children's activity patterns. They appear
to reflect differences in climate, urbanization, and other factors.
Results from the second study are not yet available.
HOME ACTIVITY PATTERN USE IN TWO INDOOR AIR QUALITY RESEARCH STUDIES
Two major air quality research studies at the Harvard School of
Public Health are currently using time/activity diaries to determine home
activity patterns, microenvironments, and human exposure. One project, the
Harvard Indoor Air Pollution Health Study, employs fixed location
monitoring devices to measure NO2 and respirable particulates. The
activity diaries are designed to assess time spent in microenvironments,
including those where monitors are located. The Gas Research Institute
Project is actually two large-scale NO2 personal exposure studies. For
these studies conducted in Boston and Los Angeles, participants completed
time/activity diaries. The diaries for these studies use time as the major
dimension to capture duration-of-stay within microenvironments. For each
of these studies, time/activity is recorded over 24-hour periods that
correspond with monitoring.
While complete details about these projects are beyond the scope of
this presentation, this paper does provide a brief description of the
Harvard Indoor Air Quality Study. In an attached appendix, the quality
assurance aspects of the Gas Research Institute study are presented.
Further details for the Harvard Indoor Air Pollution Health Study can be
found in Ferris et al. (1979) and Spengler et al. For more information on
the Gas Research Institute Project, see Ryan et al. (1988a), Ryan et al.
(1988b), and Soczek et al. (1987; personal communication).
THE HARVARD INDOOR AIR POLLUTION HEALTH STUDY
The Harvard Air Pollution Health Study is a prospective epidemiologic
study involving about 20,000 people in six communities (Ferris et al.,
1979). A component of this study is concerned with indoor air pollution
and respiratory effects. By the end of 1988, approximately 1,800 children
in six cities will have been surveyed for daily respiratory symptoms and
monitored for outdoor/indoor NO2 and respirable particle exposure. In
addition, time/activity data will have been collected for the final three
cities in the study: Portage, WI, and Steubenville, OH, in 1986-87, and
Topeka, KS, in 1987-88. This paper is based upon the sample of children in
those cities. However, the summary activity pattern results are based only
upon Portage and Steubenville, where activity data collection is now
complete.
AIR QUALITY MEASUREMENTS
Previous studies (Spengler et al., 1985; and Quackenboss et al., 1985)
have demonstrated that indoor measurements of particles
and NO2 are predictive of personal exposures for nonoccupationally exposed
people. Therefore, in designing this large, indoor air pollution health
study, we utilized the concept of microenvironmental monitoring to
establish an estimate of exposure for children. Data to characterize these
exposures were gathered through monitoring devices and a time/activity
diary. In addition, questionnaires were administered to describe home and
health characteristics. Finally, a daily health diary was kept for the
children involved in the study.
NO2 measurements were made with fixed-location passive diffusion
tubes (Palmes et al., 1976). The Harvard Aerosol Impactor (Turner et al.,
1984; Marple et al., 1987) was used to measure respirable particulates
less than 2.5 microns in size (PM2.5). Passive diffusion water vapor tubes
(Girman et al., 1984) were also deployed, as was the Brookhaven National
Laboratory perfluorocarbon tracer system (Dietz et al., 1982) for measuring
air exchange rates.
Integrated pollution monitors were placed in the child's home,
outdoors, and in the schools. Temporal variations were assessed by
repeated measurements twice in the winter and twice in the summer. A
subset of approximately 30 homes in each city was measured in each of four
seasons. Particles were measured at multiple outdoor locations, the
activity room of each home, and in a single classroom in each school.
Timers controlled the particle samplers in the home (4 pm to 8 am on
weekdays, 24 hours otherwise) and school (8 am to 4 pm on weekdays). This
corresponded to the projected activities in those locations. NO2 was
measured in three rooms of the home, one location outside the home, and in
several classrooms per school.
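The timer schedule for the particle samplers can be written as a simple rule. The sketch below is only an illustration of the schedule as stated in the text; the function name is hypothetical, and the handling of the exact boundary minutes (the sampler is taken to switch at 8 am and 4 pm sharp) is an assumption.

```python
from datetime import datetime


def sampler_on(site, when):
    """Return True if the particle sampler at `site` would be running at `when`.

    Schedule from the text: home samplers run 4 pm to 8 am on weekdays and
    24 hours otherwise; school samplers run 8 am to 4 pm on weekdays.
    """
    weekday = when.weekday() < 5          # Monday=0 ... Friday=4
    school_hours = 8 <= when.hour < 16    # 8 am up to, not including, 4 pm
    if site == "school":
        return weekday and school_hours
    if site == "home":
        # Assumption: the home sampler is off exactly when the school one is on.
        return not (weekday and school_hours)
    raise ValueError(f"unknown site: {site!r}")


print(sampler_on("home", datetime(1987, 1, 5, 18)))    # a Monday evening
print(sampler_on("school", datetime(1987, 1, 3, 10)))  # a Saturday morning
```

Under this reading the two schedules are complementary, so at any moment exactly one of the two indoor particle samplers is collecting.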
In three cities, children completed a technician-administered
time/activity diary covering 3 winter and 3 summer days. The purpose of
this diary was to determine the proper weights to apply to microenvironment
concentrations. A home characteristics questionnaire was administered to
describe the home environment for each participant. It included questions
about home type and setting, heating systems and fuels, cooking and water
heating fuels, ventilation, participants and smoking patterns, as well as
water and humidity. A floor diagram of the house was also drawn so that
house volumes could be calculated. The exact locations of all measuring
devices and pollution sources were noted on floor plans. A daily health
diary was self-administered to record respiratory symptoms for the year in
which the study took place in a particular city. A substudy also collected
indoor and outdoor microbiological specimens concomitant with the other
data collected. Finally, yearly pulmonary function measurements were made.
TIME/ACTIVITY MEASUREMENTS
Time/activity diaries were filled out twice during the year
concomitant with source monitoring. They were first presented and
partially filled out during the first or second of three home visits by the
field technician. These visits were designated as equipment setup, change,
and pickup visits. During the setup visit, sampling devices were placed
and the home characteristics questionnaire was administered. The change
visit included collecting the first week's samples and replacing them for
the second week sample. The pickup visit involved the collection of the
remaining samples and the monitoring equipment.
Field technicians were instructed to administer the activity diary
during the setup visit if time allowed. During the change visit, the
technician reviewed the portion of the activity diary filled out by the
participant. If not completed or incorrectly filled out, an additional
diary was administered. However, in many cases, the diary was first
administered during the shorter change visit due to the lengthy setup
session (one hour or more). Field technicians administered a portion of
the diary to the participant on a 24- to 36-hour recall basis. This method
introduced the participant to the concept and level of detail required to
complete the remaining portion of the diary. Thus diary entries for the
remainder of the day plus the entire next day were filled out by the child
or parent (mother). At the next visit, the technician reviewed the
activity diary for errors and resolved any problems or misrecordings.
ACTIVITY LOG MEASUREMENT
We were interested in understanding a child's time/activity patterns
in general. Specifically, we wanted to know how much time was spent in
certain indoor locations. The indoor locations of particular interest were
the monitored microenvironments. (See Exhibit 1.) Home location
categories were the child's bedroom, the kitchen, the activity/living room,
and other rooms in the house. This level of detail was required because
kitchen/bedroom NO2 differences often exceed factors of two. Nonhome
location categories were the school, outdoors, transportation (car/bus),
and other indoor environments.
The time component of the activity diary spanned three consecutive
24-hour time blocks. The first day began at midnight and extended to the
following midnight. Days two and three followed in a similar fashion.
Every hour had to be partitioned into activities. A minimum averaging time
of 15 minutes per category was acceptable except for transportation. Since
school children were often in transit for durations of less than 15
minutes, we attempted to have the technicians make particular note of short
travel times. Unfortunately, this was not uniformly adhered to by all
technicians. The person(s) filling out the diary (child, mother, father,
etc.) and the technician introducing the activity diary were recorded on
the diary to test for bias at a later date.
Diaries were collected in both the winter (December to April) and the
summer (May to August). Winter data did not include time spent during the
Christmas holidays. Summer data included time when some children were
still in school (May and part of June).
QUALITY ASSURANCE AND QUALITY CONTROL
Several quality assurance and quality control steps were taken to
insure the integrity of the data. Quality assurance included:
1. the use of a three-consecutive-day activity diary format to provide
for replication and to capture day-of-the-week differences,
2. a technician demonstration of a 24- to 36-hour recall diary as a
"warm up"/introduction to the diary technique,
3. limiting the number of locations and keeping the recording of time to
15-minute or greater durations to insure nonconfusing, nonoverlapping
alternatives, and
4. standardized training of field technicians by the Harvard
investigators.
Quality control was accomplished through manual and computerized
procedures. These are described briefly below.
1. The initials of the person filling out the diary were recorded. This
was used to identify whether the child, mother, father, etc., filled
out the form and to test for respondent bias.
2. The initials of the field technician administering the activity diary
were also recorded. This allowed for a check of technician bias.
3. The field technician reviewed the completed form before leaving the
site and made corrections as needed.
4. The site field manager reviewed all diaries for accuracy before
returning them to the central office for processing.
5. The central office field coordinator reviewed each diary to insure
the number of minutes totalled 1440 per day.
6. Site visits were made on a random basis by the site field manager and
the central office field coordinator to observe that proper protocol
was being followed. If problems were found, corrective action was
taken and planned visits were made to insure compliance.
7. The data entry program "SPSS Data Entry" allowed a screen image that
resembled the diary itself. This enhanced the data entry process.
8. As data were entered, automatic value checks were made for day of the
week (MON to SUN), month (JAN to SEP), adults' initials (CHILD, MOM,
DAD, OTHER), and field tech initials. Range checks were also
automatically made to insure proper day of the month (1 to 31), year
(87, 88), and time/location (>=0 and <=60).
9. Ten percent of the forms were verified for data entry errors. A
total accuracy of 99% was found before correction. No systematic
errors were observed.
10. Location totals were recalculated by computer program. All diaries
which did not total 1440 minutes were rechecked and corrected as
necessary.
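The range and total checks in steps 5, 8, and 10 can be sketched as follows. This is an illustrative reconstruction, not the SPSS Data Entry rules actually used: the location names, the one-record-per-hour layout, and the requirement that each hour total 60 minutes are assumptions drawn from the diary description.

```python
LOCATIONS = ("bedroom", "kitchen", "activity_room", "other_room",
             "school", "outdoors", "car_bus", "other_indoor")


def check_hour(minutes_by_location):
    """Return a list of problems found in one hour of diary entries."""
    problems = []
    for loc, minutes in minutes_by_location.items():
        if loc not in LOCATIONS:
            problems.append(f"unknown location {loc!r}")
        elif not 0 <= minutes <= 60:
            problems.append(f"{loc}: {minutes} min outside 0-60")
    if sum(minutes_by_location.values()) != 60:
        problems.append("hour does not total 60 min")
    return problems


def check_day(hours):
    """Check 24 hourly records; a valid day totals 1440 minutes."""
    problems = []
    for i, hour in enumerate(hours):
        problems += [f"hour {i}: {p}" for p in check_hour(hour)]
    total = sum(sum(h.values()) for h in hours)
    if len(hours) != 24 or total != 1440:
        problems.append(
            f"day totals {total} min over {len(hours)} hours, "
            "expected 1440 over 24"
        )
    return problems


# Hypothetical diary day: 8 hours asleep, 8 at school with travel, 8 at home.
day = ([{"bedroom": 60}] * 8
       + [{"school": 45, "car_bus": 15}] * 8
       + [{"kitchen": 30, "activity_room": 30}] * 8)
print(check_day(day))
```

Returning a list of messages rather than raising on the first problem matches the workflow described above, in which flagged diaries were rechecked and corrected by hand.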
USES
The time/activity diaries will be used in connection with the
monitoring results to calculate weighted total exposures for individuals
and to apply to exposure simulation models. The general approach is one of
time-weighted average concentrations summed over microenvironments (Fugas,
1975; Duan, 1982). The specific technique is presented in Letz et al.
(1984). Initial results using Portage and Steubenville activity data are
presented elsewhere (Spengler et al., 1987).
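In its general form, the time-weighted approach sums concentration times duration over microenvironments and divides by total time: E = sum(c_j * t_j) / sum(t_j). The sketch below illustrates only this general form; it is not the specific technique of Letz et al., and the function name, location labels, and concentration values are hypothetical.

```python
def time_weighted_exposure(minutes, concentration):
    """Time-weighted average concentration over microenvironments.

    minutes: dict of microenvironment -> minutes spent there
    concentration: dict of microenvironment -> average concentration
    """
    missing = set(minutes) - set(concentration)
    if missing:
        raise KeyError(f"no concentration for: {sorted(missing)}")
    total = sum(minutes.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    return sum(concentration[m] * t for m, t in minutes.items()) / total


# Hypothetical diary day (minutes) and concentrations (arbitrary units).
minutes = {"bedroom": 600, "kitchen": 60, "school": 420, "outdoors": 360}
concentration = {"bedroom": 20.0, "kitchen": 40.0,
                 "school": 15.0, "outdoors": 10.0}
print(time_weighted_exposure(minutes, concentration))  # 16.875
```

The result carries the units of the concentration inputs. Time spent in a location without a monitor has no concentration to apply, which is why the function refuses to proceed rather than silently dropping that time.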
LIMITATIONS
No activity diary is perfect. There are trade-offs in length,
complexity, administration, entry, and analysis. Our activity diaries have
limitations that can be categorized as "form" and "technique."
Form-specific limitations are:
1. The diary form (See Exhibit 1) caused some confusion because of the
number of rows and columns. This led to entries being recorded on
the wrong line. (Additional shading was placed on the form to reduce
this confusion.)
2. The initials of the person who filled out the form were not adequate
to identify the preparer. (The technicians were told to write
whether the preparer was the child, mother, father, etc.)
3. Field and central site coordinators observed slight differences in
the manner that the activity diaries were presented. (Corrective
action was taken and protocols were tightened to prevent bias.)
4. There is a potential underestimate of exposures for those activities
with less than 15-minute durations. Thus environments such as
car/bus involving short distances (to and from school) may not be
recorded adequately.
6. The degree or amount of activity in any location is not recorded.
This could affect exposure levels which can increase with increased
physical exertion.
Additional limitations are concerned with the technique used. They
are:
7. The total-weighted exposure is an estimate rather than a measurement.
8. The averaged concentrations from the fixed devices do not indicate
levels at the time of exposure.
9. Activity logs do not record where in the room a person is in relation
to the monitoring device and/or pollutant source.
10. Diaries really do not indicate the level or type of physical
activities by the child. Thus, we cannot modify our estimates of
calculated exposure to estimate possible differences in dose.
ANALYSIS OF RESULTS
This section presents basic descriptive time/activity data for
elementary-aged children in Steubenville, Ohio, and Portage, Wisconsin. A
total of 597 three-day logs were filled out between the two sites over two
seasons. (See Table 1.)
THREE DAY LOGS
The activity logs were first analyzed for differences across the
three days of data collection. No significant differences were found among
the three days for any location. Results were similar in both the winter
and summer as well as for Portage and Steubenville.
MONITORING PLACEMENT ACCURACY
A second analysis was performed to determine the percentage of time
over all locations accounted for by the fixed monitoring devices. (See
Table 2.) Overall, the NO2 monitors located in the kitchen, bedroom,
activity room, outside, and in the schools accounted for between 75% and
87% of a participant's activity/time. The remainder of the time was spent
in nonmonitored locations: car/bus (2-4%), other indoor locations (5-12%),
and other rooms in the home (3-7%). Further analysis indicated that the
child's weekend activities were less likely to be approximated by our NO2
microenvironmental monitoring scheme (65-80%). The weekend/weekday
difference can be partially explained by an increase in activities at
nonmonitored "other inside" locations. These replaced activities at the
monitored school.
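The coverage figure is the share of diary time spent in locations that had a monitor. A minimal sketch of that calculation follows; the location labels and the sample weekday are hypothetical, and only the list of monitored locations comes from the text.

```python
# NO2 monitor locations named in the text.
MONITORED = {"kitchen", "bedroom", "activity_room", "outdoors", "school"}


def monitored_fraction(minutes):
    """Fraction of diary time spent in locations that had a monitor."""
    total = sum(minutes.values())
    if total == 0:
        raise ValueError("diary has no recorded time")
    covered = sum(t for loc, t in minutes.items() if loc in MONITORED)
    return covered / total


# Hypothetical weekday, in minutes (totals 1440).
weekday = {"bedroom": 540, "kitchen": 60, "activity_room": 180,
           "school": 390, "outdoors": 60,
           "car_bus": 45, "other_indoor": 105, "other_room": 60}
print(f"{monitored_fraction(weekday):.1%}")
```

Applying the same function to weekday and weekend diaries separately would reproduce the kind of comparison reported above.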
REGIONAL DIFFERENCES
Another analysis compared communities to ascertain regional
differences. (See Table 3.) Portage children were in the kitchen, outdoor,
and car/bus microenvironments a significantly greater amount of time during
the academic period than their Steubenville counterparts. These same
children were also in the car/bus microenvironment more after school let
out for the summer than during the academic year. These results appear to
reflect differences between a rural farming lifestyle (Portage) and a
distinctly urban way of life (Steubenville).
SEASONAL DIFFERENCES
Analyses for seasonal differences compared academic period (12/86 to
06/87) and vacation period (06/87 to 09/87) activities. (See Table 4.)
Concomitant to the expected in-school to out-of-school decrease in the
percentage of time spent at school, there was an increase in the amount of
time spent outside. It also appears that children were spending less time
in their bedrooms after school let out for the summer. Overall, the
seasonal analyses indicated activity differences resulting from school
being replaced by increased outdoor and other activities in the summer.
A second seasonal analysis compared winter (12/86 to 04/87) and
summer (05/87 to 09/87) activities. The results from this analysis
indicated two differences from the academic vs. vacation results. The
winter/summer difference showed a substantial amount of time spent in
school during the summer. This result was due to the definition for summer
which included a month when children were still in school. The second more
important finding was a small but significant winter (5%) to summer (4%)
decrease in the amount of time Portage children spent in the kitchen. In
addition, an increase in outdoor activities (5% to 16%) was also noted in
both cities. Combined, these results seem to indicate a cold weather/warm
weather factor influencing activity patterns.
DAY-OF-WEEK DIFFERENCES
After looking at day-of-the-week differences, weekdays were collapsed
into a single category, while weekend days were retained as separate
categories. However, due to site visit scheduling patterns during the
summer sampling cycle, an insufficient number of Sunday diaries were
available for analysis from Portage. (Results reported here use the
academic period vs. vacation period definition as a basis for the
winter/summer comparisons.)
Steubenville participants indicated increased usage of the activity
room on the weekend while school was in session and decreased usage of the
bedroom on Saturdays throughout the year. (See Table 5.) Total in-home
time increased on weekends during the school year but decreased once school
closed for the summer. The decrease between weekday and weekend school
time was expected and was accompanied by an increase in other activities on
the weekend.
Portage children recorded increased usage of the activity room on
weekends during the school year. (See Table 6.) They also indicated
greater bedroom usage on Sunday during the same time frame. The decrease
in school time between the weekday and weekend seems to have been taken up
by an increase in outdoor time plus other indoor activities. The percent
time spent "at home" increased between the weekday and weekend during the
school year.
OUTDOOR COMPARISONS
Because some pollutants (e.g., ozone) display distinct diurnal and
seasonal cycles, we were interested in a child's activity patterns over
time. Table 7 presents an analysis of outdoor activity patterns over time.
As expected, the time spent outdoors during the daytime (8:00 a.m.-
8:00 p.m.) was much greater at both sites than the time spent outdoors at
night (8:00 p.m.-8:00 a.m.). There was also an increase in the time
children spent outside at night once schools closed for the summer. Future
analysis will provide a detailed examination of the percent of the
population outdoors by hour of the day. We will incorporate weather and
pollution data into this analysis.
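The day/night split can be sketched as an interval overlap. The example assumes that an outdoor episode is stored as start and end times in minutes after midnight within a single day; the function name and that representation are hypothetical, and only the 8:00 a.m. and 8:00 p.m. boundaries come from the text.

```python
def split_day_night(start_min, end_min, day_start=8 * 60, day_end=20 * 60):
    """Split an outdoor episode into (daytime, nighttime) minutes.

    start_min, end_min: minutes after midnight, within one calendar day.
    Daytime is 8:00 a.m. to 8:00 p.m. by default, as in the text.
    """
    if not 0 <= start_min <= end_min <= 1440:
        raise ValueError("need 0 <= start_min <= end_min <= 1440")
    # Overlap of the episode with the daytime window, never negative.
    day = max(0, min(end_min, day_end) - max(start_min, day_start))
    night = (end_min - start_min) - day
    return day, night


# Hypothetical episode: outdoors from 7:00 p.m. to 9:30 p.m.
print(split_day_night(19 * 60, 21 * 60 + 30))  # (60, 90)
```

Summing the two components over all episodes in a diary gives the daytime and nighttime outdoor totals for that child.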
OTHER STUDIES
An attempt was made to compare time/activity results reported here
with results reported previously in the literature. Two such studies
investigated time/activity patterns of school-aged children (Letz et al.,
1983; and Quackenboss et al., 1982). While such comparisons are tenuous
due to study design differences, they do provide a beginning point to which
the data reported here can be compared.
Letz et al. used activity data gathered from Watertown,
Massachusetts, children in the fall of 1982. (See Table 8.) The 1987
Portage and Steubenville data indicate a greater amount of time spent
indoors and in school. While summer data was also reported for the
Watertown children, the definition for summer was different, making strict
comparisons difficult.
Quackenboss et al. presented data collected from school children in
Portage, Wisconsin, during 1981. (See Table 9.) Their data appears to be
similar to the 1987 Portage data. Differences may reflect time of data
collection: Portage 1981 (March); Portage 1987 (December through March for
winter and May to September for summer).
SUMMARY OF TIME ACTIVITY RESULTS
Differences in time/activity patterns were found between academic and
vacation periods as well as between heating (winter) and non-heating
(summer) seasons. There were interregional differences between Portage and
Steubenville reflecting differences in climate, urbanization, and other
factors. Children displayed different activity patterns between weekdays
and weekends but not among weekdays. The amount of time spent outdoors
differed by season, time of day (day vs. night), and community. In
addition, we were able to establish that time/activity patterns placed
children in the proximity of our microenvironmental monitors about 80% of
the time. However, on some weekend days this percentage dropped to as low
as 67%.
In the study of respiratory symptoms in these communities, estimates
of total exposure for children will be derived from the microenvironment
concentrations. Certainly, exposures based on these time-weighted
microenvironment measurements are not perfect estimators, but they are a
substantial improvement over measurements taken at a central site outdoor
monitor. They allow generalization to a population based on several
microenvironmental locations rather than a single location. This statement
is more valid when there are indoor and outdoor sources of a pollutant.
However, even for ambient pollutants such as ozone and acid aerosols,
exposure will be modified by indoor activity patterns. In view of these
factors, there continues to be a need to conduct personal monitoring
studies to quantify the relationships among exposure, microenvironmental
monitors, and fixed-site ambient monitors.
ACKNOWLEDGEMENTS
Work reported in this paper was supported in part by National
Institute of Environmental Health Sciences Grants ES-01108 and ES-0002. We
are indebted to our field technicians and data processing personnel for
their efforts to provide us with the data upon which this paper is based.
We are especially thankful to the families in Portage, Steubenville, and
Topeka who shared their time and homes. We wish to acknowledge the help of
P.B. Ryan in organizing the time/activity material reported in the
appendix. Finally, visiting scholar Eric Lebret was instrumental in the
design of the time/activity form.
APPENDIX
ACTIVITY PATTERNS OBTAINED UNDER HARVARD'S GRI PROJECT*
The Gas Research Institute Project (GRI) is designed to identify and
quantify the portion of total public exposure to nitrogen dioxide due to
indoor sources. To accomplish this goal, a series of studies were
undertaken. The first two -- the Residential Characterization Study (Ryan
et al., 1988a; Ryan et al., 1988b) and the Personal Exposure Monitoring
Study (Ryan et al., 1987) -- were both conducted in the Boston area
preliminary to the study for which activity diaries are herein reported:
the Los Angeles Personal Monitoring Study (Soczek et al., personal
communication, 1987). This combination of studies, once completed, will
allow the quantification and comparison of indoor and outdoor source
contributions, as well as activity patterns in low and high ambient NO2
areas.
The Los Angeles Personal Monitoring Study was designed to increase
the understanding of microenvironmental contributions to total sources,
while at the same time characterizing NO2 distributions in a high ambient
NO2 area. In the main, study participants/technicians collected 24-hour
personal exposure data for two consecutive days. Approximately 650
participants were involved. The sampling period covered the time between
May 1987 and April 1988. A supplemental study collected more detailed
microenvironmental data on 50 people sampled eight times over the year.
This will provide information on seasonal variations.
* Project funded by the Gas Research Institute and Southern California Gas
Company. Investigation conducted by J.D. Spengler, P. Barry Ryan, and
Steven D. Colome (U. of Cal. Irvine). Project officers are Dr. I. Billick
- GRI and Phil Baker - SoCalGas.
AIR QUALITY MEASUREMENTS
Nitrogen dioxide measurements were made with personal NO2 badge
monitors (Yanagisawa et al., 1982). For the main study, a 24-hour badge
was worn by participants on two consecutive days. In addition, a bedroom
badge and an outdoor badge were placed in fixed locations for the overall
two-day period. For the supplemental study, two additional
microenvironmental badges were also worn. An "at home" badge was worn in the home,
while an "away from home" badge was worn when the participant was out of
the house. Homes were characterized by the use of a home characteristics
questionnaire. This instrument was designed to record source and source
usage patterns as well as dilution and ventilation parameters. A personal
characteristics questionnaire included occupational questions and a
yesterday recall of activities.
TIME/ACTIVITY MEASURE
Time/activity diaries were filled out for the two days in which
personal monitoring took place. They were presented by the field
technician at the home setup visit. At that visit, the technician
explained the monitoring protocol and guided the participant through a
practice activity diary in the instruction booklet. (See Exhibit 3.)
Twenty-four to 48 hours after the setup visit, a telephone call was made to
encourage protocol compliance, answer questions, and prompt for the return
of samplers as well as the activity diary.
The activity diary used time and location as its major dimensions.
(See Exhibit 4.) Time was determined by the length of stay within a
microenvironmental location. When a location was entered, the time was
noted in the activity diary. Only when a location changed was a new entry
recorded.
The location dimension was divided into two major categories: inside
and outside. The inside location was further divided between home and not
at home categories. These locations were then subdivided further. The
outside location was divided according to proximity to major roads.
Durations of 15 minutes or more were recorded except during gas stove usage
when durations of 5 or more minutes were entered.
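The recording rule described above can be illustrated with a short sketch. It is not the study's own software: the entry layout, the function name, and the assumption that all entries fall on the same calendar day are hypothetical. Only the 15-minute and 5-minute thresholds come from the text.

```python
from datetime import datetime

MIN_MINUTES = 15        # general recording threshold stated in the text
MIN_MINUTES_STOVE = 5   # threshold while a gas stove is in use

def diary_durations(entries, end_time):
    """Compute length of stay per diary entry.

    entries  : list of (start "HH:MM", location, gas_stove_on), in time order,
               one entry per change of location (hypothetical layout)
    end_time : "HH:MM" at which monitoring stopped (same day assumed)
    Returns a list of (location, minutes, meets_threshold).
    """
    fmt = "%H:%M"
    starts = [datetime.strptime(t, fmt) for t, _, _ in entries]
    starts.append(datetime.strptime(end_time, fmt))
    rows = []
    # Each stay ends when the next location is entered.
    for (_, location, stove_on), begin, finish in zip(entries, starts, starts[1:]):
        minutes = int((finish - begin).total_seconds() // 60)
        threshold = MIN_MINUTES_STOVE if stove_on else MIN_MINUTES
        rows.append((location, minutes, minutes >= threshold))
    return rows

example = [
    ("07:00", "home kitchen", True),
    ("07:10", "home bedroom", False),
    ("07:20", "outside near road", False),
]
for row in diary_durations(example, "08:00"):
    print(row)
```

In this illustration the 10-minute kitchen stay qualifies because the stove was in use, while the 10-minute bedroom stay falls below the general threshold.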
QUALITY ASSURANCE - QUALITY CONTROL
Several quality assurance and quality control steps were taken to
insure the integrity of the data. Quality assurance measures included:
1. field technician setup visits to explain procedures (including
mail-back procedures);
2. a practice activity diary filled out with the assistance of the field
technician to insure understanding of procedures;
3. limiting the number of locations to insure nonconfusing,
nonoverlapping alternatives;
4. the use of two-day diaries to provide replication and to capture
day-of-the-week differences;
5. a pilot study conducted to assess four alternative ways for setting
up, following up, and finishing up each two-day monitoring period,
including:
a. field technician setup, field technician return after 24 hours,
and mail back at the end;
b. field technician setup, field technician telephoning after 24
hours, and mail back at the end;
c. field technician setup, no follow-up, field technician return
after 48 hours to collect measurement instruments, and
d. mailed setup, no followup, and mail back at end.
(As a result of the pilot study, the main study adopted
alternative b. At the same time, sample size was increased to
offset an anticipated 5-10% loss due to noncompliance and other
data loss.)
6. an instruction booklet and checklist were provided to help the
participants know what to do;
7. interviews with pilot participants and refusals were conducted to
assess response to initial presentation.
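The sample size adjustment mentioned under item 5 can be sketched as simple arithmetic. The function name and the target of 600 usable records are hypothetical and used only for illustration; the 5-10% loss range is the one quoted above.

```python
import math

def recruits_needed(target_complete, expected_loss):
    """Number of participants to enroll so that roughly target_complete
    usable records remain after a fraction expected_loss is lost."""
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be in [0, 1)")
    return math.ceil(target_complete / (1 - expected_loss))

# Hypothetical target of 600 usable records under 5% and 10% loss.
for loss in (0.05, 0.10):
    print(loss, recruits_needed(600, loss))
```

Dividing by the retained fraction, rather than adding the loss percentage to the target, keeps the expected number of usable records at the target.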
Quality control was accomplished through:
1. a visit by the senior staff to train the field technician followed by
a visit to insure proper observance of protocol by the field staff;
2. telephone follow-ups twenty-four hours after setup to encourage
protocol compliance, answer questions, and prompt for return;
3. field coordinator checks of all returns;
(These checks were made to insure form completeness, proper
activity recording, and logical activity sequencing. If
errors were encountered, the field technician called the
participant back to correct discrepancies.)
4. central site coordinator checks to insure that diaries were matched
with the proper site and badge number;
(Checks were also made to insure that badge start and stop
times corresponded with activity start and stop times.)
5. 100% verification of all diaries after entry; and
6. programs designed to check values (such as date) and to detect
inconsistencies between field card times and time/activity diary times.
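A consistency program of the kind listed in items 4 and 6 might look like the following sketch. The record fields, the 15-minute tolerance, and the function name are assumptions made for illustration; only the sampling period (May 1987 to April 1988) is taken from the text.

```python
from datetime import datetime, timedelta

STUDY_START = datetime(1987, 5, 1)
STUDY_END = datetime(1988, 4, 30, 23, 59)

def check_record(rec, tolerance_min=15):
    """Return a list of problems found in one badge/diary record (a dict
    with hypothetical keys badge_start, badge_stop, diary_start, diary_stop)."""
    problems = []
    if not (STUDY_START <= rec["badge_start"] <= STUDY_END):
        problems.append("badge start date outside sampling period")
    if rec["badge_stop"] <= rec["badge_start"]:
        problems.append("badge stop is not after badge start")
    tol = timedelta(minutes=tolerance_min)
    if abs(rec["badge_start"] - rec["diary_start"]) > tol:
        problems.append("badge and diary start times differ")
    if abs(rec["badge_stop"] - rec["diary_stop"]) > tol:
        problems.append("badge and diary stop times differ")
    return problems

record = {
    "badge_start": datetime(1987, 6, 1, 8, 0),
    "diary_start": datetime(1987, 6, 1, 8, 5),
    "badge_stop": datetime(1987, 6, 2, 8, 0),
    "diary_stop": datetime(1987, 6, 2, 9, 0),
}
print(check_record(record))
```

Records that return a non-empty list would be referred back to the field coordinator for correction.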
LIMITATIONS
This method of time/activity exposure assessment suffers from certain
limitations. While the personal monitor insures a personal exposure
quantity, only estimates can be made of the source microenvironments. This
study attempted to address that issue through the use of stationary and
personal measurements at various locations. In addition, the supplemental
study used "at home" and "away from home" badges to assess the
microenvironments. However, measures of specific rooms within the home
environment were beyond the cost/technical capabilities of this study. As
concluded in the Six City section of this paper, until monitoring devices
become technically feasible and cost-effective for large field studies,
compromises will have to be made in microenvironmental assessment.
TABLE 1. NUMBER OF ACTIVITY DIARIES BY CITY AND SEASON
                   PORTAGE   STEUBENVILLE   TOTAL
ACADEMIC PERIOD      139         236         375
VACATION PERIOD       50         172         222
TOTAL                189         408         597
ACADEMIC PERIOD was defined as from 12/86 to 06/87.
VACATION PERIOD was defined as from 06/87 to 09/87.
6-13
-------
TABLE 2. PERCENT OF CHILD'S TIME SPENT IN MONITORED LOCATIONS:
DAY-OF-WEEK MEANS AND STANDARD DEVIATIONS
                                  PERIOD
SITE           DAY         ACADEMIC     VACATION
                           (DEC-MAY)    (JUNE-SEPT)
STEUBENVILLE   All Days    82 (16)      73 (20)
               Weekdays    85 (14)      79 (20)
               Saturday    69 (26)      65 (34)
               Sunday      73 (24)      74 (18)
PORTAGE        All Days    88 (11)      72 (16)
               Weekdays    89 (09)      71 (16)
               Saturday    77 (17)      80 (17)
               Sunday      79 (15)      -- (--)
TABLE 3. PERCENT OF CHILD'S TIME SPENT IN SPECIFIED LOCATIONS:
REGIONAL MEANS AND STANDARD DEVIATIONS
ACADEMIC PERIOD VACATION PERIOD
LOCATION STEUBENVILLE PORTAGE STEUBENVILLE PORTAGE
HOME:
ACTIVITY ROOM 13 (11) 13 (13) 17 (16) 17 (13)
BEDROOM 41 (13) 41 (09) 39 (17) 37 (16)
KITCHEN 03 (03) 04 (04) 03 (03) 04 (04)
OTHER ROOM 05 (09) 03 (05) 06 (11) 07 (11)
TOTAL 62 (17) 61 (13) 65 (20) 65 (19)
NON-HOME:
SCHOOL 20 (12) 21 (10) 02 (06) 02 (06)
OUTDOORS 07 (08) 08 (07) 19 (15) 17 (13)
CAR/BUS 02 (03) 03 (03) 02 (04) 04 (05)
OTHER ACTIVITY 08 (11) 05 (09) 11 (18) 12 (16)
DON'T KNOW 01 (04) 00 (01) 01 (05) 00 (02)
TOTAL 38 (17) 38 (12) 35 (20) 34 (19)
ALL:
INDOORS 93 (08) 92 (07) 81 (14) 83 (13)
6-14
-------
TABLE 4. PERCENT OF CHILD'S TIME SPENT IN SPECIFIED LOCATIONS:
SEASONAL MEANS AND STANDARD DEVIATIONS
STEUBENVILLE PORTAGE
LOCATION ACADEMIC VACATION ACADEMIC VACATION
HOME:
ACTIVITY ROOM 13 (11) 17 (16) 13 (13) 17 (13)
BEDROOM 41 (13) 39 (17) 41 (09) 37 (16)
KITCHEN 03 (03) 03 (03) 04 (04) 04 (04)
OTHER ROOM 05 (09) 06 (11) 03 (05) 07 (11)
TOTAL 62 (17) 65 (20) 61 (13) 65 (19)
NON-HOME:
SCHOOL 20 (12) 02 (06) 21 (10) 02 (06)
OUTDOORS 07 (08) 19 (15) 08 (07) 17 (13)
CAR/BUS 02 (03) 02 (04) 03 (03) 04 (05)
OTHER ACTIVITY 08 (11) 11 (18) 05 (09) 12 (16)
DON'T KNOW 01 (04) 01 (05) 00 (01) 00 (02)
TOTAL 38 (17) 35 (20) 37 (12) 35 (19)
ALL:
INDOORS 93 (08) 81 (14) 92 (07) 83 (13)
TABLE 5. PERCENT OF CHILD'S TIME SPENT IN VARIOUS STEUBENVILLE
LOCATIONS: DAY-OF-WEEK MEANS AND STANDARD DEVIATIONS
LOCATION ACADEMIC PERIOD VACATION PERIOD
WEEK SAT SUN WEEK SAT SUN
HOME:
ACTIVITY ROOM 11 (10) 21 (15) 19 (15) 17 (16) 16 (17) 13 (11)
BEDROOM 41 (11) 37 (21) 44 (21) 40 (17) 32 (19) 39 (16)
KITCHEN 03 (03) 04 (03) 04 (04) 03 (03) 03 (03) 02 (02)
OTHER ROOM 05 (08) 07 (11) 08 (14) 06 (10) 04 (12) 04 (02)
TOTAL 60 (14) 69 (30) 75 (22) 66 (14) 55 (27) 58 (06)
NON-HOME:
SCHOOL 24 (10) 01 (04) 00 (01) 02 (06) 02 (06) 01 (06)
OUTDOORS 07 (08) 05 (09) 07 (12) 19 (15) 16 (15) 20 (12)
CAR/BUS 03 (03) 02 (02) 02 (03) 02 (03) 03 (06) 05 (09)
OTHER ACTIVITY 06 (12) 24 (30) 16 (21) 10 (16) 20 (31) 17 (19)
DON'T KNOW 01 (04) 00 (01) 00 (01) 01 (05) 02 (07) 00 (00)
TOTAL 41 (11) 32 (30) 26 (22) 34 (19) 44 (27) 43 (20)
ALL:
INDOORS 93 (08) 95 (09) 93 (12) 81 (15) 84 (15) 80 (12)
6-15
-------
TABLE 6. PERCENT OF CHILD'S TIME SPENT IN VARIOUS PORTAGE LOCATIONS:
DAY-OF-WEEK MEANS AND STANDARD DEVIATIONS
LOCATION             ACADEMIC PERIOD              VACATION PERIOD
                 WEEK     SAT      SUN       WEEK     SAT      SUN
HOME:
ACTIVITY ROOM    12 (12)  23 (14)  20 (13)   17 (13)  15 (16)  -- (--)
BEDROOM          40 (09)  35 (12)  50 (10)   37 (16)  40 (15)  -- (--)
KITCHEN          05 (04)  03 (03)  04 (03)   04 (04)  06 (05)  -- (--)
OTHER ROOM       03 (04)  10 (12)  05 (06)   07 (11)  06 (08)  -- (--)
TOTAL            60 (11)  71 (18)  79 (15)   65 (20)  67 (18)  -- (--)
NON-HOME:
SCHOOL           24 (08)  01 (03)  00 (00)   02 (07)  00 (01)  -- (--)
OUTDOORS         08 (06)  15 (15)  06 (09)   17 (13)  19 (18)  -- (--)
CAR/BUS          03 (03)  03 (04)  04 (03)   03 (05)  04 (05)  -- (--)
OTHER ACTIVITY   05 (08)  09 (14)  12 (11)   12 (16)  08 (14)  -- (--)
DON'T KNOW       00 (01)  00 (00)  00 (00)   00 (00)  02 (05)  -- (--)
TOTAL            40 (11)  28 (26)  22 (15)   34 (20)  33 (18)  -- (--)
ALL:
INDOORS          92 (07)  84 (15)  94 (09)   83 (13)  81 (18)  -- (--)
TABLE 7. PERCENT OF CHILD'S TIME SPENT OUTDOORS:
TIME-OF-DAY MEANS AND STANDARD DEVIATIONS
ACADEMIC PERIOD VACATION PERIOD
STEUBENVILLE PORTAGE STEUBENVILLE PORTAGE
DAY 06 (07) 08 (07) 16 (13) 15 (12)
NIGHT 01 (02) 01 (01) 03 (04) 02 (03)
TOTAL 07 (08) 08 (07) 19 (15) 17 (13)
6-16
-------
TABLE 8. COMPARISON OF 1987 PORTAGE AND STEUBENVILLE
ACTIVITY DATA WITH 1982 WATERTOWN DATA
           WATERTOWN 1982         PORTAGE 1987           STEUBENVILLE 1987
           ANNUAL  ACADEMIC YR    ANNUAL  ACADEMIC YR    ANNUAL  ACADEMIC YR
SCHOOL       12        16           16        21           16        20
TRAVEL       03        03           03        03           02        02
OUTDOOR      18        14           10        08           10        07
INDOOR       82        86           90        92           90        93
TABLE 9. COMPARISON OF 1987 PORTAGE AND STEUBENVILLE
ACTIVITY DATA WITH 1981 PORTAGE DATA BY HEATING SEASON
                  PORTAGE    PORTAGE 1987       STEUBENVILLE 1987
                  1981       WINTER  SUMMER     WINTER  SUMMER
INDOOR              61         63      61         66      62
OUTDOOR             09         06      16         04      16
MOTOR VEHICLES      05         03      04         02      02
OTHER INDOORS       24         27      18         28      20
TOTAL INDOORS       85         90      79         94      80
TOTAL OUTDOORS      14         09      20         06      18
Total Outdoors = Outdoors + Motor Vehicles
Total Indoors = Indoor Home + Other Indoors
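The two definitions given beneath Table 9 can be checked against the printed percentages with a short sketch. The dictionary layout, key names, and the one-point rounding tolerance are assumptions; the figures are transcribed from Table 9 as printed.

```python
# Percentages as read from Table 9 (column assignment follows the printed order).
TABLE_9 = {
    "Portage 1981": dict(indoor_home=61, outdoor=9, motor_vehicles=5,
                         other_indoors=24, total_indoors=85, total_outdoors=14),
    "Portage 1987 winter": dict(indoor_home=63, outdoor=6, motor_vehicles=3,
                                other_indoors=27, total_indoors=90, total_outdoors=9),
    "Portage 1987 summer": dict(indoor_home=61, outdoor=16, motor_vehicles=4,
                                other_indoors=18, total_indoors=79, total_outdoors=20),
    "Steubenville 1987 winter": dict(indoor_home=66, outdoor=4, motor_vehicles=2,
                                     other_indoors=28, total_indoors=94, total_outdoors=6),
    "Steubenville 1987 summer": dict(indoor_home=62, outdoor=16, motor_vehicles=2,
                                     other_indoors=20, total_indoors=80, total_outdoors=18),
}

def total_mismatches(col, tol=1):
    """Recompute the two totals from their components and return
    {total_name: (printed, recomputed)} where they differ by more than tol."""
    recomputed = {
        "total_indoors": col["indoor_home"] + col["other_indoors"],
        "total_outdoors": col["outdoor"] + col["motor_vehicles"],
    }
    return {name: (col[name], value)
            for name, value in recomputed.items()
            if abs(col[name] - value) > tol}

for name, col in TABLE_9.items():
    print(name, total_mismatches(col) or "consistent")
```

Any difference flagged beyond the tolerance may reflect rounding of the published percentages or a transcription error and should be checked against the original table.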
The work described in this paper was not funded by the U.S.
Environmental Protection Agency and, therefore, the contents do not
necessarily reflect the views of the Agency and no official endorsement
should be inferred.
6-17
-------
REFERENCES
Dietz, R. N. and Cote, E. A. "Air infiltration measurements in a home
using a convenient perfluorocarbon tracer technique." Environment
International. 8:419-433, 1982.
Duan, N. "Models for human exposure to air pollution." Environment
International. 8:305-309, 1982.
Ferris, B. G., Speizer, F. E., Spengler, J. D., Dockery, D. W., Bishop,
Y.M.M., Wolfson, M., and Humble, C. "Effects of sulphur oxides and
respirable particulates on human health: methodology and demography
of populations in study." American Review of Respiratory Diseases.
120:769-779, 1979.
Fugas, M., "Assessment of total exposure to an air pollutant." Proceedings
of the International Conference on Environmental Sensing and
Assessment, IEEE #75-CH 1004-1 ICESA, Las Vegas, 1975.
Girman, J. R., Hodgson, A.T., Robison, B.K., Traynor, G.W. "Laboratory
studies of the temperature dependence of the Palmes NO2 sampler."
Proceedings: National Symposium on Recent Advances in Pollutant
Monitoring of Ambient Air and Stationary Sources, Raleigh, North
Carolina. May 1983 EPA-600/9-84-001 January 1984.
Letz, R. E., Ryan, P. B., and Spengler, J. D. "Estimated distributions of
personal exposures to respirable particulates." Environment Monit.
Assess. 4:451-359, 1984.
Marple, V.A., Rubow, K.L., Turner, W., and Spengler, J.D. "Low flow rate
sharp cut impactors for sampling: Design and calibration." JAPCA
37:1303-1307, 1987.
Palmes, E. D., Gunnison, A. F., DiMattio, S., Tomcyzk, C. "Personal
sampler for NO2." Am. Ind. Hyg. Assoc. 37:570-577, 1976.
Quackenboss, J. J., Kanarek, M. S., Spengler, J. D., and Letz, R.
"Personal Monitoring for Nitrogen dioxide exposure: Methodological
considerations for a community study." Environ. Int. 8:249-258,
1982.
Quackenboss, J.J., Spengler, J.D., Kanarek, M.S., Letz, R., and Duffy, C.P.
"Personal exposure to nitrogen dioxide: relationship to
indoor/outdoor air quality and activity patterns." Environ. Sci.
Tech. 20(6):775-783, 1985.
Reed, M. P., McKay, V., and Fraumeni, L. P. "Indoor Air Quality/Acute
Health Study: Field Technician's Reference Manual." Personal
communication, 1986.
Ryan, P. B., Soczek, M. L., Spengler, J. D., and Billick, I.H. "The Boston
Residential NO2 Characterization Study: I. Preliminary Evaluation of
the Survey Methodology." Int. Jour. of Air Pollution Control and
Waste Management. 38(1):22-27, 1988a.
Ryan, P.B., Soczek, M.L., Treitman, R.D. and Spengler, J.D. "The Boston
Residential NO2 Characterization Study: II. Survey Methodology and
Population Concentration Estimates." Atmos. Environ., In press,
1988b.
Ryan, P.B., Spengler, J.D. "Nitrogen dioxide personal exposure assessment:
Methodological considerations in design, implementation and data
analysis." Presented at Environmetrics '87. Washington, DC. Nov.
1987.
Spengler, J. D., Treitman, R. D., Tosteson, T. D., Mage, D. T. and Soczek,
M.L. "Personal exposure to respirable particulates and implications
for air pollution epidemiology". Env. Sci. Tech. 19:700-707, 1985.
Spengler, J. D., Reed, M. P., Lebret, E., Chang, B. H., Ware, J. H.,
Speizer, F. E., and Ferris, B. G. "Harvard's Indoor Air
Pollution/Health Study." Presented at the 79th Annual Meeting of the
Air Pollution Control Association. June 22-27, 1986.
Spengler, J. D., Keeler, G. J., Koutrakis, P., and Ryan, P. B. "Exposures
to Acidic Aerosols." Presented at the International Symposium on the
Health Effects of Acid Aerosols. October 19-21, 1987.
SPSS Data Entry II. SPSS, Inc., 1987.
Turner, W. A., Marple, V. A., and Spengler, J. D. "Indoor Aerosol
Impactor." Aerosols. Elsevier Science Publishing Co., Inc., Lies,
Pui, and Fissan, editors. 1984.
Yanagisawa, Y., and Nishimura, H. "A badge type personal sampler for
measurement of personal exposure to NO2 and NO in ambient air."
Environment International. 8:235-242, 1982.
6-19
-------
PERCEPTION OF DAILY CIGARETTE CONSUMPTION
IN THE OFFICE ENVIRONMENT
by: David A. Sterling, D.J. Moschandreas,
and Robert D. Gibbons
Editor's Note: This published article appears in Bulletin of the
Psychonomic Society 26(2):120-123, 1988.
7-1
-------
CAPTURE OF ACTIVITY PATTERN DATA DURING ENVIRONMENTAL MONITORING
by: Harvey S. Zelon
Research Triangle Institute
Research Triangle Park, N.C. 27709
ABSTRACT
The United States Environmental Protection Agency (U.S. EPA) has
sponsored many studies involving the collection of human exposure data.
Collection of data describing the activities undertaken by the study
respondents during the course of monitoring is an important component of
the research effort. The information collected may be vital in explaining
the results of environmental or biological monitoring. This paper will
describe some of the studies undertaken, the types of data collected,
focusing on the collection of activity descriptions, and the efforts
undertaken to assure completeness and quality in the data set.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
8-1
-------
INTRODUCTION
How people spend their time has long been a topic of great interest
to the social scientist. Time and motion studies have been conducted in a
broad variety of settings. Now the study of people's activities has become
a component of the environmental scientist's realm of interest.
Major efforts have been undertaken to describe the exposures of the
population to environmental chemicals of many varieties. The efforts
include sampling of exposures through various routes and attempts to
measure subsequent levels of the chemical in the body. Exposures may be
through the environmental routes, air and water, or through various other
routes, such as food, household dust, or occupational or avocational
exposure.
Vital to the explanation of the individual results of the
measurements and the attempts to correlate the various levels measured is
the ability to determine in great detail the activities undertaken by the
respondent during the periods when samples were being collected. Since the
environmental routes may account for only some of the potential paths of
exposure to various chemicals, it is vital to know if the levels of
chemical detected by the sampling and analysis process are from exposure
through other routes or should be considered to be environmental exposures.
Two studies conducted by the Research Triangle Institute (RTI) for
the U.S. EPA involved the collection of various environmental samples.
These studies were the Total Exposure Assessment Methodology Study, known
as the TEAM study, and the Study of Carbon Monoxide Exposure in Residents
of Washington, D.C., and Denver, CO, known as the CO Study. In both of
these studies, data from the respondent were also collected and used to
help explain the relationships observed within the environmental and
biological samples. This paper will describe these studies with a primary
focus on the collection of data about the respondent's activities to be
used during the analysis. Emphasis will be placed in describing the
collection and validation of activity data, and discussions of the
strengths and weaknesses of the techniques will be presented.
Recommendations for improvements in approaches to collecting data
describing the routine daily activities of study respondents will conclude
the discussions.
TOTAL EXPOSURE ASSESSMENT METHODOLOGY (TEAM) STUDY
Between 1980 and 1985, a series of studies was conducted in multiple
sites throughout the United States. The goals of the studies were to
develop methods to measure individual total exposure (exposure through air,
food, and water) and resulting body burden of toxic and carcinogenic
chemicals, and to apply these methods within a probability-based sampling
framework to estimate the exposures and body burden of urban populations in
several U.S. cities. In order to reach these goals, a three-pronged
approach was developed which included the development and testing of small
personal monitors to measure exposure to airborne chemicals, the
development of a transportable spirometer to measure the chemicals in
exhaled breath, and a survey design involving a stratified probability
selection with variations to insure the inclusion of members of potentially
highly exposed groups. Based on the desire to assess exposure through as
many routes as possible, and to prove a series of methodologies for the
collection and analysis of the chemicals of interest, the study was named
the Total Exposure Assessment Methodology study, or TEAM.
The study began with a pilot study conducted with 9 subjects in New
Jersey and 3 subjects in North Carolina. The New Jersey subjects all were
located in the areas which became part of the first main phase sample. All
pilot study subjects were selected purposively with the assistance of local
health department officials. No attempt was made to select subjects
representative of the population, but rather to select persons who were as
diverse as possible in occupation, location of residence, and other
variables of interest.
The study subjects were each visited for several days 3 times over
the course of the pilot study. During each series of visits, various
environmental and biological samples were collected, and a series of
questionnaires was completed. The samples collected were used to test
some 30 sampling and analytical protocols for 4 groups of chemicals of
interest. The pilot study tests indicated that the goals of the
TEAM study could be met with only one group of the chemical compounds, the
volatile organics.
The main TEAM study was begun with the objective of collecting and
analyzing samples of 20 target chemicals selected from among all the
volatiles based on toxicity, carcinogenicity, mutagenicity, production
volume, presence in preliminary sampling and pilot studies, and amenability
to collection on Tenax. Some 600 persons, representing a total population
of 700,000 residents of cities in New Jersey, North Carolina, North Dakota,
and California were selected in a multi-stage, stratified, random sampling
process. Each participant was involved in a twenty-four hour monitoring
period during which two twelve-hour personal air samples were collected.
Ambient air samples were collected in each sampling segment for comparison
to personal air samples. Each participant provided two drinking water
samples, and at the end of the twenty-four hours of monitoring, a sample of
exhaled air. The first phase of the main study consisted of three seasons
of measurements in the two sites in northern New Jersey, and in comparison
sites in North Carolina and North Dakota. The second phase of the study
consisted of 3 seasonal sets of measurements: 2 in the Los Angeles area in
southern California, and one in an area northeast of Oakland, CA.
During the same time frame as the main studies, a series of special
studies was undertaken with small populations of persons with special
exposures or concerns. These included nursing mothers in whose milk some
of the chemicals of interest were thought to be highly concentrated,
employees of dry cleaners, and lifeguards at swimming pools, all of whom
are exposed to high levels of chemicals in the set of interest. In
addition, a series of studies was conducted to monitor levels of volatile
organics and several other classes of chemical compounds in the air in
public buildings. In each of the special studies involving personal
monitoring, all participants completed the same sets of questionnaires that
participants in the main phase of the TEAM study were asked to complete.
Data collection for each respondent began during the initial stages
of sampling when each of the households in the sample segments was
screened for eligible residents. The screening consisted of creating a
household roster for each housing unit containing information on the age,
sex, occupation, and smoking status of each resident of the housing unit.
Based on this information, a sample of respondents was selected, and during
a series of return visits to the households, a field interviewer explained
the study, detailing the participation required of respondents and
enrolling respondents in the study. Once a sample member agreed to
participate and the necessary informed consent requirements were met, the
main study questionnaire was administered, and sampling appointments were
established. When the sampling teams completed the twenty-four hours of
monitoring, they administered the "24-Hour Exposure and Activity Screener."
This document (Figure 1) was designed to collect information about what the
respondent did during the monitoring period, as well as what he ate and
drank, and whether there was occupational or incidental exposure to
chemicals from the groups being studied. Each individual was asked to
provide details about his "main" activity and any other activity lasting
more than one hour. Descriptions of all travel and any unusual events
which occurred during the monitoring period completed the data collection
effort.
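The screener's activity rule (a "main" activity plus any other activity lasting more than one hour) can be sketched as follows. The sketch takes the main activity to be the one of longest duration, following the wording on the form; in the study itself the respondent decided which activity was the main one. The function name and data layout are hypothetical.

```python
def screener_activities(activities):
    """Split reported activities into the main activity and the others
    that must also be documented.

    activities : list of (description, minutes)
    Returns (main, others): main is the longest activity (an assumption);
    others are the remaining activities lasting more than 60 minutes.
    """
    if not activities:
        return None, []
    main = max(activities, key=lambda a: a[1])
    others = [a for a in activities if a is not main and a[1] > 60]
    return main, others

reported = [
    ("office work", 480),
    ("sleeping", 420),
    ("commuting", 45),
    ("gardening", 90),
]
main, others = screener_activities(reported)
print(main, others)
```

Activities of one hour or less, such as the 45-minute commute in this illustration, would appear only in the separate trip section of the form.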
CARBON MONOXIDE MONITORING STUDY
Conducted in 1982 and 1983, this study was undertaken in an attempt
to evaluate methodologies for collecting personal exposure monitoring data
and corresponding personal activity data in an urbanized area. The
specific objectives were to develop a methodology for measuring the
distribution of carbon monoxide (CO) exposures of a representative
population of an urban area for assessment of the risk to the population;
to test, evaluate, and validate this methodology by employing it in the
execution of pilot field studies in Denver, CO, and Washington, D.C.; and
to obtain an activity-pattern data base related to CO exposures.
A stratified, three-stage, probability-based sample was selected in
both sites. Initial contacts with the selected households in the sample
area were conducted by telephone, using RTI's Computer Assisted Telephone
Interviewing (CATI) capabilities. The purpose of the initial contacts was
to determine household eligibility and to collect information on all
members of the household to permit the selection of the final (third-stage)
sample respondents.
8-4
-------
Figure 1. 24-HOUR EXPOSURE AND ACTIVITY SCREENER (TEAM Study)
Study Number: ________          O.M.B. No. 2000-0364, Expires 9/30/86
Date: __/__/__
1. Have you pumped your own gas in the past 24 hours?
   [1] Yes (GO TO QUESTION 1a)   [2] No (GO TO QUESTION 2)
   a. During which monitoring periods?
      [1] Overnight   [2] Daytime
2. Have you done your own dry cleaning, or been in a dry cleaning establishment during the past 24 hours?
   [1] Yes (GO TO QUESTION 2a)   [2] No (GO TO QUESTION 3)
   a. During which monitoring periods?
      [1] Overnight   [2] Daytime
3. Have you smoked cigarettes, cigars, or a pipe in the past 24 hours?
   [1] Yes (GO TO QUESTION 3a)   [2] No (GO TO QUESTION 4)
   a. During which monitoring periods?
      [1] Overnight   [2] Daytime
4. Were you in an enclosed area with active smokers for more than 15 minutes at any time in the past 24 hours?
   [1] Yes (GO TO QUESTION 4a)   [2] No (GO TO QUESTION 5)
   a. During which monitoring periods?
      [1] Overnight   [2] Daytime
5. Have you used or worked with insecticides, pesticides, or herbicides in any way, including farming, gardening, and
   extermination, in the past 24 hours?
   [1] Yes   [2] No
6. During this time of year, on an average weekday or weekend day, how many hours per day are spent:
   (Answer a through d below)
                                          Weekday    Weekend Day
   a. Away from home                       ____        ____
   b. Out of doors--leisure activities     ____        ____
   c. Out of doors--working                ____        ____
   d. In a motor vehicle                   ____        ____
Figure 1. (continued)
   Please list the specific name of any chemical or hazardous substance to which you have been exposed.
8. a. [...] please indicate your activity which lasted the most time. In addition, please indicate any other activities which
   lasted for more than one hour.
   (Columns: Activity; Location: Indoor/Outdoor; Location: Urban/Suburban/Rural; Level of Physical Activity: Strenuous/Light)
   b. Please provide the following information for each trip during this time period.
   (Columns: Trip; Minutes; Mode of Transport; Traffic: Heavy/Light; Ventilation: Windows: Open/Closed; NA)
   1.
   2.
   3.
   4.
   c. Please indicate any unusual events which happened during this time period which might have any effect on your
   exposure to environmental chemicals.
-------
After the third stage sample of respondents was selected,
another telephone contact was initiated. During this call to the selected
respondents, the study details were discussed, requirements of
participation outlined, and cooperation sought. If the selected respondent
agreed to participate, an appointment was established for a personal visit
during which a field interviewer would bring the CO monitor to the
respondent for the beginning of the twenty-four hour monitoring period.
(Respondents in the Denver sample were monitored for 48 hours.) During the
telephone contact in which the respondent agreed to participate, a brief
questionnaire was completed which collected data for a computer model being
developed to predict and describe exposure levels.
At the time of the visit to the respondent to start the monitoring, a
Household (Study) questionnaire was administered, collecting information on
the respondents and their home and work environments. During the time
period(s) that the respondents were monitored by the personal exposure
monitor (PEM), they were also asked to carry and fill out an activity
diary, describing their activities, location, and environment. Each time
that the respondent began a new activity or changed location, he was asked
to push a button on the PEM, which stored an integrated reading of CO
exposure for the last activity, and reset the monitor to begin capturing
the new data. At the same time that the respondent pushed the button, he
was to make an entry in the diary. Figure 2 displays one page of the
diary.
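The button-press procedure yields one stored reading per activity, and these readings can be combined into an overall exposure figure. The following sketch assumes that each stored reading can be expressed as an average CO concentration (ppm) for a segment of known length; the text states only that an integrated reading was stored, so the units, names, and example values are hypothetical.

```python
def time_weighted_average(segments):
    """Time-weighted average concentration over a monitoring period.

    segments : list of (minutes, mean_co_ppm), one per diary entry
    """
    total_minutes = sum(minutes for minutes, _ in segments)
    if total_minutes == 0:
        raise ValueError("no monitored time")
    weighted = sum(minutes * ppm for minutes, ppm in segments)
    return weighted / total_minutes

# Hypothetical segments: at home, in transit, at the office.
segments = [(60, 2.0), (30, 8.0), (90, 1.0)]
print(time_weighted_average(segments))
```

Because each segment is tied to a diary entry, the same readings can also be grouped by location or activity before averaging.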
COMPARISON OF THE TWO ACTIVITY PATTERN DATA COLLECTION MODES
The two studies took different approaches to the collection of data
describing the activities undertaken by the respondents during the time
that they were being monitored. The TEAM study used a questionnaire
administered at the end of each 24-hour monitoring period and relied on the
respondent's memory to provide the requested information. After the first
administration of the activity screener, the respondents began to learn
what was expected of them and were able to provide more complete
information. In fact, the respondents often provided data before being
asked, and in greater detail than could be recorded. This was particularly
true of details concerning the food consumed during the monitoring period.
The CO study collected data by self-report of the respondent using a diary
form to collect the details associated with each activity and location.
The level of detail provided and the number of activities entered were
determined completely by the respondent, with minimal input from the study
team and with no opportunity to gain experience by using the diary over
several monitoring periods.
Figure 2. ACTIVITY DIARY
TIME FROM MONITOR: ________
A. ACTIVITY
B. LOCATION
   In transit                                    1
   Indoors, residence                            2
   Indoors, office                               3
   Indoors, store                                4
   Indoors, restaurant                           5
   Other indoor location                         6
      Specify:
   Outdoors, within 10 yards of road or street   7
   Other outdoor location                        8
      Specify:
   Uncertain                                     9
C. ADDRESS (if not in transit)
D. ONLY IF IN TRANSIT
   (1) Start address
   (2) End address
   (3) Mode of travel:
       Walking        1
       Car            2
       Bus            3
       Truck          4
       Train/subway   5
       Other          6
          Specify:
ONLY IF INDOORS
   (1) Garage attached to building?
       Yes         1
       No          2
       Uncertain   3
   (2) Gas stove in use?
       Yes         1
       No          2
       Uncertain   3
ALL LOCATIONS
   Smokers present?
       Yes         1
       No          2
       Uncertain   3
The two approaches thus represent ends of a continuum, with the TEAM
study using a highly structured form to collect a specific but minimal data
set, while the CO study used a diary to collect an unspecified amount of
data about an unlimited variety of activities. The TEAM study collected
information about the activity which the respondent considered to be the
"main" activity undertaken during the monitoring period. In addition, any
activity which lasted more than one hour was documented. Determination of
main activity was left up to the respondent and was a source of great
variability. In the CO study, there was also great variability, caused not
by the respondent's definition of the activities of interest, but by the
level of detail provided by the respondents. While one respondent might
list a general activity such as housework, another respondent would provide
separate entries for cleaning, dusting, washing clothes, vacuuming, etc.
The two respondents would cover corresponding periods of time but with
different numbers of entries. This discrepancy in level of detail was
particularly noticeable during the evening and night hours, while levels of
detail for daytime work activities tended to be more uniform. This may be
related to the impact of location changes as a reason to initiate a log
entry and diary description. During the instruction period provided to the
respondents, the combination of activity and location was stressed.
Respondents were told that a change in either member of the pair was the
signal for a PEM recording and a diary entry.
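One way to compare diaries kept at different levels of detail is to map specific activities onto general categories and merge adjacent entries that then share a category and location. The following sketch illustrates the idea using the housework example above; the mapping table, function name, and durations are hypothetical and do not represent a procedure used in the CO study.

```python
# Hypothetical mapping from detailed activities to a general category.
GENERAL = {
    "cleaning": "housework",
    "dusting": "housework",
    "washing clothes": "housework",
    "vacuuming": "housework",
}

def collapse(entries, mapping=GENERAL):
    """Merge consecutive diary entries that fall in the same general
    category and location.

    entries : list of (activity, location, minutes), in time order
    """
    merged = []
    for activity, location, minutes in entries:
        category = mapping.get(activity, activity)
        if merged and merged[-1][0] == category and merged[-1][1] == location:
            # Same category and place as the previous entry: extend it.
            merged[-1] = (category, location, merged[-1][2] + minutes)
        else:
            merged.append((category, location, minutes))
    return merged

detailed = [
    ("cleaning", "home", 30),
    ("dusting", "home", 20),
    ("vacuuming", "home", 40),
    ("washing clothes", "home", 30),
]
general = [("housework", "home", 120)]
print(collapse(detailed))
print(collapse(general))
```

After collapsing, the detailed and the general diary in this illustration reduce to the same single entry, so the corresponding exposure readings can be summed over the same period.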
Verification of the activities reported by the respondents in both
studies was virtually impossible. To verify completely the activities
undertaken by an individual, it is necessary to observe the individual at
all times, and for an independent recorder to document all activities. If
the respondent knows that he is being observed, there is a very real
possibility that the respondent may change his activities, by choice or by
chance. The use of a covert observer, while minimizing the probability of
deliberate change in activity, may result in missing data due to
observational problems. During the TEAM pilot study, one observation of
the respondent was possible during the work day. One sample to be
collected for each respondent was a vial of water taken from a source at
work. When the sampling team arrived at the respondent's work setting,
they collected the water sample and observed the respondent. The
observation yielded responses to two specific questions: Was the
respondent wearing the monitoring equipment as requested? and What was the
respondent doing at the time of the observation? The respondent's activity
was later checked against the responses to the activity diary questions.
This type of check is difficult to implement in a large-scale study and
provides only one or at best a few data points for comparison and
verification.
Another difference in the two modes of data collection is the ability
to review the answers being recorded in the questionnaire as it is being
administered and to probe for further details of interest, thus eliminating
any internal discrepancies in the data as the data are being recorded.
This is easily done in an in-person interview mode, especially when the
data collectors are knowledgeable about the subject material. In the case
of the TEAM study, the activity screener was administered by chemists on
the sampling team. In the CO study, the interviewers were not chemists or
environmental scientists, and the diary, while it was reviewed at the
respondent's residence for completeness of entries and general usability,
could not be reviewed for inconsistencies or missing activities. The
interviewer had no way of knowing what was not recorded, either
deliberately or by oversight, and thus could not probe or explore for
additional information.
8-10
-------
SUGGESTIONS FOR IMPROVEMENTS
While the utility of collecting activity data to supplement data from
environmental and biological samples is unquestioned, the manner in which
it is collected is still evolving. Based on the experience gained during
the two studies described, there are several modifications to the methods
of collecting activity data which should be considered.
The data collection instrument should be more highly structured. The
times at which entries should be made must be indicated, although the
frequency of entries will be determined by the study being undertaken. The
availability of a timing device with an audible alarm that can be attached
to the diary or worn by the respondent may be considered. Allowing the
respondent to keep the timing device could be an incentive for
participation.
Additional pre-training or instructions for the respondent may be of
value. If contacts are made with the respondent prior to the beginning of
the monitoring period, a study guide or brief written instructions can be
provided, along with a sample set of diary pages. The respondent could
complete the diary for a time period immediately prior to the visit of the
sampling team. The sample diary entries could be reviewed before the
beginning of the actual data collection period, and the respondent given
specific feedback on the level of detail provided and the types of
activities recorded. A drawback to this review is the potential lack of
consistency among the reviewers and trainers.
An additional component of pre-training could be the development of a
list of sample activities of interest. This list, developed with
substantive inputs from the technical members of the study team, could be
reviewed with the respondent in order to demonstrate the types of
activities of interest. There is a drawback to this approach, since some
respondents may feel that the activities on the list are the only ones of
interest and thus may not report any other activities.
Data collection instruments should be reviewed or administered at
shorter intervals. In the TEAM study, the activity screener was
administered at the end of each 24-hour period, even though the study team
was in the respondent's home every 12 hours. If the screener were
administered at each visit, it would rely less on memory, could be
administered in a shorter period of time, and could be used to collect more
information.
One additional item would involve the development of a checklist of
activities, locations, and exposures of interest. This checklist could be
reviewed with the respondent at the end of the data collection period as a
means of jogging the respondent's memory, or of allowing the respondent to
provide details of potential exposures that he did not think significant
or simply omitted from the record. The checklist can also be used as a part
of a review of the diary. If any of the special items were entered in the
diary by the respondent, a detailed review and discussion of the entries
could be undertaken and details recorded for future analysis.
CONCLUSIONS
The two studies described included initial attempts to explore a new
component of environmental research: the examination of the relationship
of daily activities to exposure to many groups of toxic chemicals found
throughout the environment in which the population lives and works. While
the data collected were useful, more and better data will be available as
we evolve better means of collecting them and using them in the analysis of
exposure. We must use the experiences of previous studies to design and
implement the next studies in order to capture more of the data that are
available, but which have not yet been fully utilized.
8-12
-------
AN ACTIVITY PATTERN SURVEY OF ASTHMATICS
by: Carolyn H. Lichtenstein
H. Daniel Roth
Roth Associates, Inc.
Rockville, MD 20852
Ron E. Wyzga
Electric Power Research Institute
Palo Alto, CA
ABSTRACT
Asthmatics are more likely to react to air pollution than
nonasthmatics, and human clinical studies have shown that they are at
greatest risk from these substances when they are exercising fairly
strenuously. Thus, the activity patterns of asthmatics, particularly the
occurrence of strenuous exercise, are of interest in assessing their health
risks.
Since asthmatics constitute a special population subgroup with its
own characteristics, a survey of their activity patterns must also be
uniquely designed. Several activity pattern surveys have identified
asthmatics and several other surveys have covered aspects of asthmatics
other than their activity patterns. However, the recent survey by the
Electric Power Research Institute (EPRI) and Roth Associates, Inc. (RAI),
conducted in Los Angeles and Cincinnati, is the first activity pattern
survey to focus exclusively on asthmatics. The RAI survey includes
questions concerning asthma symptoms, medication, and factors that affect
asthma symptoms (e.g., anxiety and stress). This information is obtained
in a background questionnaire including demographic information and medical
history, plus a 3-day hourly diary.
The analysis of the EPRI-RAI survey data should prove very
interesting because it allows for a more detailed examination of many
aspects of asthmatics' behavior than has been possible from previous
surveys. In addition to an estimate of the pattern of exercise over the
days of the week, various relationships between asthma symptoms and their
causes can be estimated. For example, preliminary results based on the
L.A. diary data indicate that anxiety and stress are positively related to
increased incidence of asthma symptoms. It is hoped that a further
understanding of the factors related to asthma will aid in policy decisions
aimed at this group.
9-2
-------
PURPOSE OF EPRI-RAI SURVEY OF ASTHMATICS
The Electric Power Research Institute (EPRI) and Roth Associates,
Inc. (RAI), designed and implemented a survey of asthmatics in Los Angeles
and Cincinnati. The goals of this survey were:
1. To examine the activity patterns of asthmatics, including a
comparison with the activity patterns of nonasthmatics.
2. To describe and examine asthmatics' patterns of symptomatic
response, medication, and exposure to factors related to
asthma.
3. To estimate the typical number of asthmatics' adverse effects
for the purposes of risk analysis.
4. To examine the interactions of the symptom patterns and
activity and medication patterns, (e.g., the effects of
medication on symptomatic response).
5. To study the causal nature of various factors related to
asthma, such as exercise, stress, etc.
6. To augment the results of clinical studies on asthmatics, in
part by comparing the patterns of asthmatic response in the
survey and laboratory settings.
Two questions of interest to this conference are: 1) Why are
activity patterns of asthmatics of special interest?; and 2) Why was it
necessary to carry out a special survey focusing on the activity patterns
of asthmatics? Asthmatics and their activity patterns are of special
interest because they are more likely to react to air pollution than
nonasthmatics. In fact, asthmatics seem to constitute the most sensitive
subgroup of the population; they react to certain pollutants even at
ambient levels of these substances. Clinical chamber studies indicate that
this response occurs only when the subjects are exercising strenuously.
Thus, examining the activity patterns of asthmatics for periods of
strenuous outdoor exertion is a primary concern.
Previous activity pattern surveys have been carried out for the
general population. A new activity pattern survey of asthmatics only was
desired for two reasons. First, the activity pattern surveys of the
general public had flaws that limited the usefulness of their results.
Second, it is reasonable to assume that the activity patterns of asthmatics
are different from those of nonasthmatics due to their illness. Thus, the
information available on exercise levels and patterns for the general
population is probably not directly applicable to asthmatics.
Activity patterns of asthmatics will probably differ from those of
nonasthmatics in several ways. First, asthmatics will tend to exhibit
lower levels of exertion in general and shorter periods of strenuous
exercise. Second, they may exhibit less activity outdoors where they can
be exposed to factors that could affect their asthma. Third, their
activities may be limited by medication taken and/or symptoms experienced.
Thus, any survey of asthmatics must include questions concerning
symptomatic responses, medication usage, and exposure to factors related
to, and possibly causing, episodes of their disease.
PREVIOUS SURVEYS
There have been several studies that have attempted to estimate
activity patterns. An early effort that modeled activity patterns, rather
than actually measuring them, is the NAAQS Exposure Model developed by
PEDco Environmental, Inc. (PEI) for the U.S. EPA. The population in each
study area was divided into age-occupation groups, each of which was
subdivided into three subgroups assumed to have different typical activity
patterns and levels. These activity patterns were then combined with
simulated pollution levels in the study areas for estimates of health risks
to the population. Two problems with this study are: 1) the activity
patterns and levels are modeled, rather than measured from a survey; and 2)
there is no emphasis on asthmatics (in fact, asthmatics are not even
identified).
A more recent effort is "A Study of Human Activity Patterns in
Cincinnati, Ohio" by PEI Associates, Inc., for EPRI. This work involved a
sample of 973 participants who completed a 3-day activity diary as well as
a detailed background questionnaire. Although this survey identified
asthmatics through an item on the background questionnaire, it did not
focus on them, and the number of asthmatics in the sample was quite small
due to the low incidence rate of asthma. However, this survey was able to
produce estimates of activity probability distributions useful for risk
assessment for the general population.
There have also been studies that focused on asthmatics but did not
concentrate on activity patterns. A 1978 study prepared for Southern
California Edison Co., "Asthma in Six Los Angeles Communities: Analysis of
Findings Based Upon CHESS 1972-1973," examined many factors related to
asthma and tried to assess this relationship. CHESS is the Community
Health and Environmental Surveillance System implemented by EPA. The data
for this study were collected in a daily diary for the primary purpose of
assessing the potential relationship between pollution and asthma attack
patterns. Thus, the activity pattern information available from the CHESS
survey is extremely limited.
DESCRIPTION OF THE EPRI-RAI SURVEY
The EPRI-RAI survey consisted of the following elements:
• instrument,
• sampling plan, and
• data handling and analysis.
Each of these involved issues specific to an activity pattern survey of
asthmatics as well as issues related to any survey in general.
9-4
-------
SURVEY INSTRUMENT
The survey instrument consisted of two components: a background
questionnaire and a 3-day diary. Each participant completed one background
questionnaire, including questions characterizing his/her asthma; relevant
medical history; questions on medication usage and doctor/hospital contact;
questions on the relationship of various factors and asthma, and exposure
to these factors; and demographic information. This questionnaire was
based in part on several questionnaires focusing on asthma and related
diseases, particularly a questionnaire on asthma developed by the
International Union Against Tuberculosis, and the American Thoracic
Society/Division of Lung Diseases Respiratory Questionnaire.
The 3-day diary included questions on activity descriptions and
durations, symptomatic response, medication usage, and exposure to factors
associated with asthma (e.g., stress/anxiety level). The diary was filled
out hourly during waking hours, with a separate overnight summary. In
addition, a summary of the day was completed that evaluated the day
relative to other days in terms of activity level and various symptoms.
The hourly form contained eight questions for each of up to three
activities per hour, plus 15 general questions. (See the appendix for a
sample diary sheet.) The main activity was considered to be that which
lasted the longest, while the second and third activities had to last at
least 10 minutes each. Our hourly format is a substantial change from
previous activity pattern surveys, in which people made an entry for each
activity engaged in. It was felt that an hourly format placed less of a
burden on the participant while still obtaining the desired information. The
overnight summary asked questions very similar to those on the hourly diary
form, e.g., "What was your breathing like during the night?" and "Did you
take any medication for asthma between going to bed and getting up in the
morning?".
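The structure of one hourly diary record, and the consistency checks that the form itself implies, can be sketched as follows. The class and function names are hypothetical, and the requirement that the activities total no more than 60 minutes is an assumption that follows from the hourly format rather than a rule stated in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ActivityReport:
    name: str
    minutes: int       # minutes of the hour spent on this activity
    outside: int       # minutes outside
    inside: int        # minutes inside
    in_vehicle: int    # minutes in a vehicle


def check_hour(reports: List[ActivityReport]) -> List[str]:
    """Return a list of consistency problems for one hourly record.

    reports[0] is the main activity; any others are the 2nd and 3rd.
    """
    problems = []
    if not 1 <= len(reports) <= 3:
        problems.append("an hour needs between one and three activities")
    for position, report in enumerate(reports):
        located = report.outside + report.inside + report.in_vehicle
        if located != report.minutes:
            problems.append(
                f"{report.name}: location minutes {located} != {report.minutes}"
            )
        if position > 0 and report.minutes < 10:
            problems.append(f"{report.name}: must last at least 10 minutes")
        if position > 0 and report.minutes > reports[0].minutes:
            problems.append(f"{report.name}: lasts longer than the main activity")
    # Assumed rule: one hourly record cannot account for more than 60 minutes.
    if sum(r.minutes for r in reports) > 60:
        problems.append("activities total more than 60 minutes")
    return problems


# Illustrative records only; these are not survey data.
consistent = [
    ActivityReport("work away from home", 40, 0, 40, 0),
    ActivityReport("traveling", 20, 0, 0, 20),
]
inconsistent = [
    ActivityReport("exercising", 30, 20, 0, 0),
    ActivityReport("shopping or errands", 5, 0, 5, 0),
]
```

Checks of this kind could be run as the diaries are keyed, which is one way an hourly format with fixed questions lends itself to editing.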
SAMPLING PLAN
The survey was administered to two different groups at two different
times. The first group consisted of asthmatics who had been subjects in
the clinical studies at Rancho Los Amigos Medical Center near Los Angeles.
This source provided us with an easily obtainable sample of participants
whose clinical responses could be compared to their survey responses.
However, the survey results for this group cannot be considered estimates
of general population values. In addition, the survey of this group acted
as a pretest for the larger second group.
This administration of the survey took place in April before the
worst of the smog season. All the participants completed the diary during
the three days Thursday, Friday, and Saturday in order to cover at least
one weekday and one weekend day. Previous activity pattern surveys had
indicated that Saturday and Sunday exhibit different patterns, but given
the small number of participants, it was felt that obtaining more
information on weekdays was more important than obtaining information on
both weekend days.
The second group surveyed consisted of a random sample of asthmatics
living in Cincinnati, Ohio. This city was chosen as being "typical" in
many different ways, including pollution patterns. Asthmatics living in
the three counties surrounding and including the City of Cincinnati were
utilized in order to include both urban and rural populations. This sample
was surveyed during the summer (August), when people are active and
children are out of school. The participants completed the diary on either
Friday, Saturday, Sunday, and Monday (because there were enough
participants in this sample to gather information on a weekday plus both
weekend days).
Since the prevalence rate of asthma is quite low (about 3-5%), simple
random sampling would have been prohibitively expensive, so a technique
known as multiplicity sampling was used. In this approach, households were
telephoned randomly and any asthmatic in a selected household was eligible
for the study. In addition, a randomly selected adult in the household
could nominate asthmatics from among relatives living in the same 3-county
area. Only children, parents, and siblings were eligible for nomination.
This method produces a probability sample, thus permitting projections to
the population, although the data collected for this sample must be
weighted to adjust for the different probabilities of selection.
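The weighting step can be illustrated with a minimal sketch. It assumes, beyond what is stated here, that households are selected with equal probability and that an asthmatic who could be reported by m households (his or her own plus those of eligible relatives in the area) receives the household weight divided by m. The function names and the household weight of 1,000 are hypothetical; this is the general multiplicity idea, not necessarily the weighting scheme actually applied to the Cincinnati data.

```python
def multiplicity_weight(household_weight: float, linked_households: int) -> float:
    """Weight for one reported asthmatic.

    household_weight: number of households each sampled household represents.
    linked_households: number of households that could have reported this
        person (own household plus households of parents, children, and
        siblings living in the study area).
    """
    if linked_households < 1:
        raise ValueError("a reported person is linked to at least one household")
    return household_weight / linked_households


def estimate_total(reports) -> float:
    """Estimate the number of asthmatics from (weight, multiplicity) pairs."""
    return sum(multiplicity_weight(w, m) for w, m in reports)


# Illustrative values only: one asthmatic reportable only by his or her own
# household, and one reportable by four households.
estimated_asthmatics = estimate_total([(1000.0, 1), (1000.0, 4)])
```

Dividing by the number of linked households keeps a person with many eligible relatives from being overrepresented simply because he or she had more chances of being nominated.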
An important issue in these surveys was the definition of
"asthmatic." The Los Angeles sample included asthmatics as defined by the
clinical studies. These studies utilized medical histories and physiologic
testing, including pulmonary function tests, as well as doctor diagnosis
and current symptoms, to assess each subject's asthmatic status.
The Cincinnati survey used two criteria to identify eligible
asthmatics. The first, and easiest to determine, was that the asthma had
to have been diagnosed by a doctor, i.e., each participant was asked: "Has
a doctor ever told you that you had asthma?" The second criterion was that
the participant had to exhibit current symptoms of asthma, as determined
from the following questions:
1. "Have you had any wheezing in the past 12 months, and if so,
how often?"
2. "Have you had any chest tightness in the past 12 months, and
if so, how often?"
If the respondent answered "yes" and "every day," "once a month," or "once
in a while" to either of these questions, he or she was considered eligible
for the study.
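The two Cincinnati criteria can be written as a short screening rule. This is a sketch only: the function names and the representation of each symptom answer as a (reported, frequency) pair are assumptions, while the qualifying frequency responses are those quoted above.

```python
from typing import Optional, Tuple

# Frequency answers that count as a current symptom, as quoted in the text.
QUALIFYING_FREQUENCIES = {"every day", "once a month", "once in a while"}

SymptomAnswer = Tuple[bool, Optional[str]]  # (reported, frequency)


def symptom_is_current(reported: bool, frequency: Optional[str]) -> bool:
    """True if the symptom was reported with a qualifying frequency."""
    return reported and frequency in QUALIFYING_FREQUENCIES


def is_eligible(doctor_diagnosed: bool,
                wheezing: SymptomAnswer,
                chest_tightness: SymptomAnswer) -> bool:
    """Apply both criteria: doctor diagnosis and at least one current symptom."""
    has_current_symptom = (
        symptom_is_current(*wheezing) or symptom_is_current(*chest_tightness)
    )
    return doctor_diagnosed and has_current_symptom


# Illustrative respondents only.
diagnosed_with_wheeze = is_eligible(True, (True, "once a month"), (False, None))
undiagnosed = is_eligible(False, (True, "every day"), (True, "every day"))
diagnosed_no_symptoms = is_eligible(True, (False, None), (False, None))
```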
Since a monetary incentive was offered for participation in the
surveys, it was felt that an additional check was necessary to prevent
nonasthmatics in Cincinnati from misleading the interviewers. Thus, the
participant agreement form for this group required the name and address of
the individual's doctor, plus the agreement that this doctor might be
contacted for confirmation of asthma diagnosis.
ANALYSIS
The analysis goals for the data from the two samples are different.
The ultimate focus of the Los Angeles data analysis is to relate activity,
medication, and symptom patterns to the clinical response patterns of the
participants. One issue is how the day-to-day symptom patterns obtained
from the survey relate to the symptom and pulmonary function responses
obtained in the laboratory. The other issue is whether, and how, a
participant's clinical responses can be further elucidated by his or her
activity patterns, medication patterns, and exposure to various factors
affecting asthma.
The focus of the Cincinnati survey is to estimate activity,
medication, and symptomatic patterns of the general population of
asthmatics. The ultimate goals of the analysis are policy-oriented,
especially the assessment of asthmatics' risk from various factors,
particularly air pollution.
The analysis of the data from the two surveys will have some common
aspects, of course. First, the activity patterns of both groups will be
examined and compared, both to each other and to those of nonasthmatics.
Second, the patterns of symptoms, medication, and exposure to explanatory
factors will be studied. One way that these patterns will be examined is
by subgroups of the samples, e.g., by severity of asthma. The clinical
studies developed criteria for separating asthmatics into two categories,
minimal/mild and moderate/severe, which will be utilized in the survey
analyses. Third, the patterns discernible among asthmatic symptoms will
be related to activity, medication, and exposure patterns.
PRELIMINARY FINDINGS FROM THE L.A. SURVEY DIARY DATA
The data from the L.A. survey are currently being analyzed. Some
preliminary findings from the hourly diary data are:
1. Asthmatics spend much of their time inside at a low exertion
level.
2. Several asthma symptoms (wheezing, coughing, chest tightness,
and asthma attacks) occurred at fairly high rates.
3. Exertion level is significantly related to asthma symptoms in
the expected direction (as exertion level increases, symptom
incidence increases).
4. Several other factors, such as stress, are related to asthma
symptoms, and many asthmatics are exposed to such factors.
5. Medication is taken in response to asthma symptoms, rather than
as part of a maintenance regimen.
6. Medication has a strong mitigating effect on respiratory
symptoms resulting from strenuous exercise.
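A finding such as item 3 rests on a tabulation of symptom incidence by exertion level across diary hours. The sketch below shows one way such a tabulation might be computed. The function name is hypothetical, and the records are invented placeholder values chosen only to exercise the code; they are not results from the L.A. survey.

```python
from collections import defaultdict


def incidence_by_exertion(hours):
    """Proportion of diary hours with a symptom, by exertion level.

    hours: iterable of (exertion_level, had_symptom) pairs, one per
        person-hour of diary data.
    """
    counts = defaultdict(lambda: [0, 0])  # level -> [hours with symptom, hours]
    for level, had_symptom in hours:
        counts[level][0] += int(had_symptom)
        counts[level][1] += 1
    return {
        level: with_symptom / total
        for level, (with_symptom, total) in counts.items()
    }


# Invented placeholder person-hours, not survey data.
placeholder_hours = [
    ("easy", False), ("easy", False), ("easy", True), ("easy", False),
    ("strenuous", True), ("strenuous", False),
]
rates = incidence_by_exertion(placeholder_hours)
```

A formal test of the relationship would also have to allow for repeated hours from the same participant, which this tabulation ignores.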
Thus, a preliminary analysis of the data indicates several
interesting patterns. It is hoped that a further understanding of the
factors related to asthma, as well as the coping mechanisms utilized by
asthmatics, will aid in air quality policy aimed at this group.
The work described in this paper was not funded by the
U.S. Environmental Protection Agency and, therefore,
the contents do not necessarily reflect the views of
the Agency and no official endorsement should be
inferred.
9-8
-------
APPENDIX
Editor's Note: The questionnaire given in this appendix was
originally on two sides of a 17"x11" sheet of paper. We were forced to cut
it and then reproduce it on several pages so that it would fit into this
report.
9-9
-------
60 Minutes = 1 hour
START HERE
Q.1 (CIRCLE ONE)
Time of day: a.m. p.m.
Q.2 (CIRCLE ONE)
Report on beginning and ending hours
12-1 1-2 2-3 3-4 4-5 5-6
6-7 7-8 8-9 9-10 10-11 11-12
Less than an hour, from ____ to ____
Is it noon or midnight?
Q.3 (CIRCLE ONE)
Day of the week
Monday 01 Tuesday 02 Wednesday 03 Thursday 04
Friday 05 Saturday 06 Sunday 07
9-10
-------
MAIN ACTIVITY IN LAST HOUR
Q.4 (CIRCLE ONE)
What activity took up most time in the last hour?
(This will be MAIN ACTIVITY)
01 getting ready to go somewhere
02 preparing/eating meal
03 work in/around the home
04 traveling (e.g., to or from work)
05 work away from home
06 exercising
07 watching tv, reading, talking
08 child care
09 shopping or errands
10 other ____
GO TO Q.5
Q.5 (FILL IN THE BLANKS)
About how many minutes of the past hour did you spend on MAIN ACTIVITY?
# of minutes: ____
Q.6 How many of the minutes reported in Q.5 were in the following areas:
(FILL IN ZERO (0) IF NO TIME IN AN AREA)
outside: ____
inside: ____
in a vehicle: ____
TOTAL: ____
(TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.5)
GO TO Q.7A
Q.7A (CIRCLE ALL THAT APPLY)
What were your levels of exertion while doing MAIN ACTIVITY?
1 strenuous (e.g., jogging)
2 moderate (fast walking)
3 mild (slow walking)
4 easy (sitting, standing)
GO TO Q.7B
Q.7B (FILL IN ALL THAT APPLY)
About how many minutes of each type of exertion?
TOTAL: ____
(SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.5)
GO TO Q.8A
Q.8A (CIRCLE ALL THAT APPLY)
What was your breathing like during MAIN ACTIVITY?
01 wheezing
02 coughing
03 tightness of chest
04 shortness of breath
05 substernal irritation
06 fast breathing*
07 shallow breathing*
08 heavy breathing*
09 normal: slow to moderate
GO TO Q.8B
Q.8B *If you marked fast, shallow or heavy breathing in Q.8A:
How many minutes did this last? ____
GO TO Q.8C
Q.8C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.8A)
Was the breathing you reported in Q.8A an asthma symptom for you?
Yes No Don't Know (one answer for each item 01-09)
GO TO Q.9
Q.9 (CIRCLE ONE)
Did you experience any asthma symptom(s) not mentioned in Q.8A?
1 Yes, describe below ____
2 No (GO TO Q.10)
GO TO Q.10
Q.10 (CIRCLE ALL THAT APPLY)
Were you feeling stressed or anxious while doing MAIN ACTIVITY?
1 Yes, stressed
2 Yes, anxious
3 No
GO TO Q.11
Q.11 (CIRCLE ALL THAT APPLY)
Was anyone smoking while you did MAIN ACTIVITY?
1 No one was smoking
2 I was smoking
3 Someone else was smoking
8 I don't know
If your MAIN ACTIVITY took all 60 minutes of the last hour, GO TO Q.28.
If not, please go to Q.12.
Reproduced from best available copy.
9-11
-------
2ND ACTIVITY IN LAST HOUR
Q.12 (CIRCLE ONE)
What other activity took time (at least 10 minutes) in the last hour?
(This will be 2ND ACTIVITY for the past hour)
01 getting ready to go
02 preparing/eating meal
03 work in/around the home
04 traveling (e.g., to or from work)
05 work away from home
06 exercising
07 watching tv, reading, talking
08 child care
09 shopping or errands
10 other ____
GO TO Q.13
Q.13 (FILL IN THE BLANKS)
About how many minutes of the past hour did you spend on 2ND ACTIVITY?
# of minutes: ____
Q.14 How many of the minutes reported in Q.13 were in the following areas:
(FILL IN ZERO (0) IF NO TIME IN AN AREA)
outside: ____
inside: ____
in a vehicle: ____
TOTAL: ____
(TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.13)
GO TO Q.15A
Q.15A (CIRCLE ALL THAT APPLY)
What were your levels of exertion while doing 2ND ACTIVITY?
1 strenuous (e.g., jogging)
2 moderate (fast walking)
3 mild (slow walking)
4 easy (sitting, standing)
GO TO Q.15B
Q.15B (FILL IN ALL THAT APPLY)
About how many minutes of each type of exertion?
TOTAL: ____
(SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.13)
GO TO Q.16A
Q.16A (CIRCLE ALL THAT APPLY)
What was your breathing like during 2ND ACTIVITY?
01 wheezing
02 coughing
03 tightness of chest
04 shortness of breath
05 substernal irritation
06 fast breathing*
07 shallow breathing*
08 heavy breathing*
09 normal: slow to moderate
GO TO Q.16B
Q.16B *If you marked fast, shallow or heavy breathing in Q.16A:
How many minutes did this last? ____
GO TO Q.16C
Q.16C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.16A)
Was the breathing you reported in Q.16A an asthma symptom for you?
Yes No Don't Know (one answer for each item 01-09)
GO TO Q.17
Q.17 (CIRCLE ONE)
Did you experience any asthma symptom(s) not mentioned in Q.16A?
1 Yes, describe below ____
2 No (GO TO Q.18)
GO TO Q.18
Q.18 (CIRCLE ALL THAT APPLY)
Were you feeling stressed or anxious while doing 2ND ACTIVITY?
1 Yes, stressed
2 Yes, anxious
3 No
GO TO Q.19
Q.19 (CIRCLE ALL THAT APPLY)
Was anyone smoking while you did 2ND ACTIVITY?
1 No one was smoking
2 I was smoking
3 Someone else was smoking
8 I don't know
If you did something else (a 3RD ACTIVITY) in the past hour, GO TO Q.20.
If you have accounted for all of the past hour, GO TO Q.28.
9-12
-------
3RD ACTIVITY IN LAST HOUR
Q.20 (CIRCLE ONE)
What other activity took time (at least 10 minutes) in the last hour?
(This will be 3RD ACTIVITY for the past hour)
[activity list illegible in this copy]
Q.21 (FILL IN THE BLANKS)
About how many minutes of the past hour did you spend on 3RD ACTIVITY?
# of minutes: ____
Q.22 How many of the minutes reported in Q.21 were in the following areas:
(FILL IN ZERO (0) IF NO TIME IN AN AREA)
outside: ____
inside: ____
in a vehicle: ____
TOTAL: ____
(TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.21)
GO TO Q.23A
Q.23A (CIRCLE ALL THAT APPLY)
What were your levels of exertion while doing 3RD ACTIVITY?
1 strenuous (e.g., jogging)
2 moderate (fast walking)
3 mild (slow walking)
4 easy (sitting, standing)
GO TO Q.23B
Q.23B (FILL IN ALL THAT APPLY)
About how many minutes of each type of exertion?
TOTAL: ____
(SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.21)
GO TO Q.24A
Q.24A (CIRCLE ALL THAT APPLY)
What was your breathing like during 3RD ACTIVITY?
01 wheezing
02 coughing
03 tightness of chest
04 shortness of breath
05 substernal irritation
06 fast breathing*
07 shallow breathing*
08 heavy breathing*
09 normal: slow to moderate
GO TO Q.24B
Q.24B *If you marked fast, shallow or heavy breathing in Q.24A:
How many minutes did this last? ____
GO TO Q.24C
Q.24C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.24A)
Was the breathing you reported in Q.24A an asthma symptom for you?
Yes No Don't Know (one answer for each item 01-09)
GO TO Q.25
Q.25 (CIRCLE ONE)
Did you experience any asthma symptom(s) not mentioned in Q.24A?
1 Yes, describe below ____
2 No (GO TO Q.26)
GO TO Q.26
Q.26 (CIRCLE ALL THAT APPLY)
Were you feeling stressed or anxious while doing 3RD ACTIVITY?
1 Yes, stressed
2 Yes, anxious
3 No
GO TO Q.27
Q.27 (CIRCLE ALL THAT APPLY)
Was anyone smoking while you did 3RD ACTIVITY?
1 No one was smoking
2 I was smoking
3 Someone else was smoking
8 I don't know
GO TO Q.28
9-13
-------
These questions are for all activities for the past hour...
Q.28 Did any chest symptoms limit your activity in the past hour?
(CIRCLE ONE)
1 Yes -- GO TO Q.29
2 No -- GO TO Q.30
Q.29 What type of activity was affected? (CIRCLE AS MANY AS APPLY)
1 limited my recreational activity
2 limited my work
3 limited my study
4 other, please describe ____
GO TO Q.30
Q.30 Did any nose or throat symptoms limit your activity? (CIRCLE ONE)
1 Yes -- GO TO Q.31
2 No -- GO TO Q.32
Q.31 What type of activity was affected? (CIRCLE AS MANY AS APPLY)
1 limited my recreational activity
2 limited my work
3 limited my study
4 other, please describe ____
GO TO Q.32
Q.32 Did you take any medication for asthma during the past hour?
(CIRCLE ONE)
1 Yes -- GO TO Q.33 AND Q.34
2 No -- GO TO Q.35
Q.33 What type of medication did you take? (CIRCLE AS MANY AS APPLY)
Q.34 Was the medication you reported in Q.33 a maintenance dose or was it
in response to specific symptoms you had in the last hour?
ANSWER BELOW FOR EACH CIRCLE IN Q.33
Q.33                                      Q.34
1 inhaled bronchodilator                  1 maintenance 2 response 3 both
2 inhaled steroid                         1 maintenance 2 response 3 both
3 inhaled cromolyn powder                 1 maintenance 2 response 3 both
4 oral bronchodilator (pill or liquid)    1 maintenance 2 response 3 both
5 oral steroid (pill or liquid)           1 maintenance 2 response 3 both
6 other, please describe ____             1 maintenance 2 response 3 both
GO TO Q.34                                GO TO Q.35
Q.35 Did you experience heavy or rapid breathing for
5 minutes or more during the past hour? (CIRCLE ONE)
1 Yes
2 No
GO TO Q.36
Q.36 Did you drink any beverage containing caffeine (e.g.,
coffee, tea, soda) during the past hour? (CIRCLE ONE)
1 Yes
2 No
GO TO Q.37
Q.37 The weather outside is: (CIRCLE AS MANY AS APPLY)
1 sunny
2 cloudy
3 rainy
4 other
5 dark
8 I can't tell
GO TO Q.38
Q.38 The weather outside feels: (CIRCLE AS MANY AS APPLY)
1 damp
2 cold
3 humid
4 windy
5 warm
6 hot
7 other: ____
8 I can't tell
GO TO Q.39
9-14
-------
Q.39 Were you exposed to any irritants or factors not covered above that affected your breathing in the last hour?
(CIRCLE ONE)
1 Yes, please describe ____
2 No
GO TO Q.40
Q.40 If you were in a house or other building in the last hour, please answer Q.40 a, b, and c; otherwise, you're done
for this hour. Thanks.
Q.40a Was air conditioner or central air cooling on? (CIRCLE ONE)
1 Yes
2 No
6 I don't know
Q.40b Was a gas stove used for cooking? (CIRCLE ONE)
1 Yes
2 No
6 I don't know
Q.40c Were any windows open? (CIRCLE ONE)
1 Yes
2 No
6 I don't know
Thanks; another hour is finished!
9-15
-------
FOR NOTES ABOUT THE NEXT HOUR
Activity          Time: Begin          End
9-16
-------
THE TREATMENT OF MISSING SURVEY DATA
by: Graham Kalton and Daniel Kasprzyk
Editor's Note: This published article has been copied with the kind
permission of the editor and the publishers of the journal, SURVEY
METHODOLOGY.
10-1
-------
Survey Methodology, June 1986
Vol. 12, No. 1, pp. 1-16
Statistics Canada
The Treatment of Missing Survey Data
GRAHAM KALTON and DANIEL KASPRZYK1
ABSTRACT
Missing survey data occur because of total nonresponse and item nonresponse. The standard way to
attempt to compensate for total nonresponse is by some form of weighting adjustment, whereas item
nonresponses are handled by some form of imputation. This paper reviews methods of weighting
adjustment and imputation and discusses their properties.
KEY WORDS: Nonresponse; item nonresponse; Weighting adjustments; Imputation.
1. INTRODUCTION
Surveys typically collect responses to a large number of items for each sampled element.
The problem of missing data occurs when some or all of the responses are not collected for
a sampled element or when some responses are deleted because they fail to satisfy edit con-
straints. It is common practice to distinguish between total (or unit) nonresponse, when none
of the survey responses are available for a sampled element, and item nonresponse, when
some but not all of the responses are available. Total nonresponse arises because of refusals,
inability to participate, not-at-homes, and untraced elements. Item nonresponse arises because
of item refusals, "don't knows", omissions and answers deleted in editing.
This paper reviews the general-purpose methods available for handling missing survey data.
The distinction between total and item nonresponse is useful here since different adjustment
methods are used for these two cases. In general the only information available about total
nonrespondents is that on the sampling frame from which the sample was selected (e.g., the
strata and PSUs in which they are located). The important aspects of this information can
usually be readily incorporated into weighting adjustments that attempt to compensate for
the missing data. Hence as a rule weighting adjustments are used for total nonresponse.
Methods for making weighting adjustments are reviewed in Section 2.
In the case of item nonresponse, however, a great deal of additional information is available
for the elements involved: not only the information from the sampling frame, but also their
responses for other survey items. In order to retain all survey responses for elements with
some item nonresponses, the usual adjustment procedure produces analysis records that in-
corporate the actual responses to items for which the answers were acceptable and imputed
responses for other items. Imputation methods for assigning answers for missing responses
are reviewed in Section 3.
In general the choice between weighting adjustments and imputation for handling miss-
ing survey data is fairly clearcut; there are cases, however, when the choice is not so clear.
These are cases of what may be termed partial nonresponse, when some data are collected
for a sampled element but a substantial amount of data is missing. Partial nonresponse can
arise, for instance, when a respondent terminates an interview prematurely, when data are
not obtained for one or more members of an otherwise cooperating household (for household
level analysis), or when a sampled individual provides data for some but not all waves of
a panel survey. Discussions of the choice between weighting and imputation to compensate
for wave nonresponse in a panel survey are given by Cox and Cohen (1985) and Kalton (1986).
1 Graham Kalton, Survey Research Center, University of Michigan, Ann Arbor, Michigan, 48106-1248 and Daniel
Kasprzyk, Population Division, U.S. Bureau of the Census, Washington, D.C., 20233. The authors would like
to thank the referees for their helpful comments.
Although weighting adjustments and imputation are treated as separate approaches in
the discussion below, they are in fact closely related. The relationship and differences bet-
ween the two approaches are briefly discussed in Section 4, which also mentions some alter-
native ways of handling missing survey data.
2. WEIGHTING ADJUSTMENTS
Weighting adjustments are primarily used to compensate for total nonresponse. The essence
of all weighting adjustment procedures is to increase the weights of specified respondents
so that they represent the nonrespondents. The procedures require auxiliary information on
either the nonrespondents or the total population. The following four types of weighting
adjustments are briefly reviewed below: population weighting adjustments, sample weighting
adjustments, raking ratio adjustments, and weights based on response probabilities. More
details are provided in Kalton (1983).
2.1 Population Weighting Adjustments
The auxiliary information used in making population weighting adjustments is the distribu-
tion of the population over one or more variables, such as the population distribution by
age, sex and race available from standard population estimates. The sample of respondents
is divided into a set of classes, termed here weighting classes, defined by the available
auxiliary information (e.g., White males aged 15-24, non-White females aged 25-34, etc.). The
weights of all respondents within a weighting class are then adjusted by the same multiplying
factor, with different factors in different classes. The adjustment is carried out in such a
way that the weighted respondent distribution across the weighting classes conforms to the
population distribution.
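As an illustration only, the adjustment just described can be sketched in Python for an equal-probability sample. The function names, class labels and data are hypothetical and do not come from the paper; the multiplying factor for a class is taken to be its population proportion divided by its proportion among the respondents.

```python
from collections import Counter


def population_weighting_factors(respondent_classes, population_shares):
    """Return the multiplying factor for each weighting class.

    respondent_classes: weighting class of each respondent (base weights of 1
        are assumed, i.e. an equal-probability sample).
    population_shares: dict mapping class -> known population proportion.
    """
    counts = Counter(respondent_classes)
    n = len(respondent_classes)
    factors = {}
    for cls, share in population_shares.items():
        if counts[cls] == 0:
            raise ValueError(f"no respondents in class {cls!r}")
        # Factor = population proportion / respondent proportion in the class.
        factors[cls] = share / (counts[cls] / n)
    return factors


def weighted_mean(values, classes, factors):
    """Respondent mean with each value weighted by its class factor."""
    weights = [factors[c] for c in classes]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical respondents: class "a" is half the population but only a
# quarter of the respondents.
classes = ["a", "a", "b", "b", "b", "b", "b", "b"]
values = [10.0, 12.0, 20.0, 22.0, 18.0, 20.0, 21.0, 19.0]
shares = {"a": 0.5, "b": 0.5}

factors = population_weighting_factors(classes, shares)
adjusted_mean = weighted_mean(values, classes, factors)
unadjusted_mean = sum(values) / len(values)
print(factors, unadjusted_mean, adjusted_mean)
```

In this made-up example the adjusted mean equals the two respondent class means averaged with the population proportions as weights.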
This type of adjustment is often termed poststratification. That term is avoided here,
however, because although population weighting resembles poststratification, there is an im-
portant difference between the two. Like population weighting, poststratification weights
the sample to make the sample distribution conform to the population distribution across
a set of classes (or strata). However, the standard textbook theory of poststratification is
concerned only with the sampling fluctuations that cause the sample distribution to deviate
from the population distribution, not with the more major deviations that can arise from
varying response rates across the classes. Poststratification adjustments are more like a fine
tuning of the sample, resulting generally in only small variations in the weights across strata.
In consequence, provided that the strata are not small, poststratification leads to lower stan-
dard errors for the survey estimates. In contrast, population weighting adjustments may in-
volve more major adjustments and result in higher standard errors.
Population weighting adjustments attempt to reduce the bias created by nonresponse and
coverage errors. Consider the estimation of a population mean $\bar{Y}$ from a sample in which
the elements are selected with equal probability. Suppose that the population is divided into
a set of weighting classes, with a proportion $W_h$ of elements in class $h$. Assume that
respondents always respond and that nonrespondents never do. Let $R_h$ and $M_h$ be the
proportions of respondents and nonrespondents respectively in class $h$, and let $\bar{R} = \sum_h W_h R_h$ be
the overall response rate. Then, following Thomsen (1973), the bias of the unadjusted
respondent mean ($\bar{y}_r$) can be expressed as
$$B(\bar{y}_r) = \bar{R}^{-1} \sum_h W_h (\bar{Y}_{rh} - \bar{Y}_r)(R_h - \bar{R}) + \sum_h W_h M_h (\bar{Y}_{rh} - \bar{Y}_{mh}) = A + B \qquad (1)$$
where $\bar{Y}_{rh}$ and $\bar{Y}_{mh}$ are the means for respondents and nonrespondents in class $h$ respectively,
and $\bar{Y}_r$ is the population mean for the respondents. The use of the population weighting
adjustment leads to the weighted sample mean, $\bar{y}_p = \sum_h W_h \bar{y}_{rh}$, where $\bar{y}_{rh}$ is the respondent
sample mean in class $h$. The bias of $\bar{y}_p$ is simply the second term in $B(\bar{y}_r)$, that is,
$B(\bar{y}_p) = B$.
If A and B are of the same sign, the population weighting adjustment reduces the
absolute bias in the estimate of $\bar{Y}$ by $|A|$. If $\bar{Y}_{rh} = \bar{Y}_{mh}$, as occurs in expectation when the
nonrespondents are missing at random within the weighting classes, then B = 0. In this case,
the population weighting adjustment eliminates the bias. The term A is a covariance-type
term between the class response rates and the class respondent means. It is zero if either
the response rates or the respondent means do not vary between classes. In either of these
cases, the population weighting adjustment has no effect on the bias of the estimator. It
may be noted that population weighting adjustments may increase the absolute bias of the
estimate of $\bar{Y}$. This will occur when A and B are of opposite signs and $|A| < 2|B|$.
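The decomposition of the bias into A and B can be checked numerically. The sketch below assumes the setup described above (class proportions, class response rates, and class means for respondents and nonrespondents); the function name and the figures are illustrative assumptions, not values from the paper. It computes A as the covariance-type term divided by the overall response rate, B as the term in the within-class differences between respondent and nonrespondent means, and compares their sum with the bias obtained directly.

```python
def bias_terms(W, R, Yr, Ym):
    """Return (A, B, direct_bias) for a population split into classes.

    W: class population proportions; R: class response rates;
    Yr, Ym: class means for respondents and nonrespondents.
    """
    M = [1.0 - r for r in R]
    R_bar = sum(w * r for w, r in zip(W, R))
    # Population mean of the respondents, and of the whole population.
    Y_resp = sum(w * r * yr for w, r, yr in zip(W, R, Yr)) / R_bar
    Y_all = sum(w * (r * yr + m * ym)
                for w, r, m, yr, ym in zip(W, R, M, Yr, Ym))
    A = sum(w * (yr - Y_resp) * (r - R_bar)
            for w, r, yr in zip(W, R, Yr)) / R_bar
    B = sum(w * m * (yr - ym) for w, m, yr, ym in zip(W, M, Yr, Ym))
    return A, B, Y_resp - Y_all


# Hypothetical two-class population.
A, B, direct = bias_terms(W=[0.5, 0.5], R=[0.8, 0.4],
                          Yr=[10.0, 20.0], Ym=[12.0, 20.0])
print(A, B, direct)
```

Here A and B are both negative, so removing A through the weighting adjustment reduces the absolute bias and leaves only B.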
Population weighting adjustments require external data on the population distributions
for the variables to be used. Care is needed to ensure that the data on which the population
distributions are based are exactly comparable with the survey data; otherwise, inappropriate
weights will result. Since the procedure weights up to population distributions, it does more
than just attempt to compensate for nonresponse. It also compensates for coverage errors
and makes a poststratification adjustment.
2.2 Sample Weighting Adjustments
As with population weighting adjustments, with sample weighting adjustments the sample
is divided into weighting classes; varying weights are then assigned to these classes in
an attempt to reduce the nonresponse bias. The essential difference between the two pro-
cedures lies in the auxiliary information used. As described above, population weighting ad-
justments are based on externally obtained population distributions. No data are needed for
the sample nonrespondents. In contrast, sample weighting adjustments employ only data
internal to the sample and require information about the nonrespondents.
With sample weighting adjustments, the nonresponse adjustment weights for the weighting
classes are made proportional to the inverses of the response rates in the classes. In order
to compute these response rates, the numbers of respondents and nonrespondents in the classes
must be determined. It is therefore necessary to know to which class each respondent and
nonrespondent belongs. Since typically very little information about the nonrespondents is
available, the choice of weighting class is usually severely restricted. It is often limited to
general sample design variables (e.g., PSUs and strata), characteristics of those variables
(e.g., urban/rural, geographical region), and sometimes some additional variables available
on the sampling frame. On occasion it may also be possible to collect information on one
or two variables for the nonrespondents, for instance by interviewer observation.
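A minimal sketch of the inverse-response-rate weights follows, assuming that the weighting class is known for every sampled element. The function name, class labels and response pattern are hypothetical.

```python
from collections import Counter


def sample_weighting_factors(sample_classes, responded):
    """Return inverse-response-rate factors, one per weighting class.

    sample_classes: weighting class of every sampled element, respondents and
        nonrespondents alike (so the class must be known for both).
    responded: parallel list of booleans.
    """
    totals = Counter(sample_classes)
    respondents = Counter(c for c, r in zip(sample_classes, responded) if r)
    factors = {}
    for cls, n_h in totals.items():
        if respondents[cls] == 0:
            raise ValueError(f"no respondents in class {cls!r}")
        factors[cls] = n_h / respondents[cls]  # 1 / class response rate
    return factors


# Hypothetical sample of nine elements in two frame-based classes.
sample_classes = ["urban"] * 5 + ["rural"] * 4
responded = [True, True, True, False, False, True, True, True, True]
factors = sample_weighting_factors(sample_classes, responded)
print(factors)
```

With these factors the weighted number of respondents in each class equals the number of sampled elements in that class.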
As population weighting adjustments resemble poststratification, so sample weighting ad-
justments resemble two-phase sampling. The first phase sample is the total sample of
respondents and nonrespondents; the second phase sample is the subsample of respondents,
selected with different sampling fractions (response rates) in different strata (weighting classes).
The sample weighted mean can be represented by $\bar{y}_s = \sum_h w_h \bar{y}_{rh}$, where $w_h$ is the proportion
of the total sample in weighting class $h$. Assuming no coverage errors, $E(w_h) \doteq W_h$, the
population proportion in class $h$, as used in the population weighted estimator
$\bar{y}_p = \sum_h W_h \bar{y}_{rh}$. The bias of $\bar{y}_s$ is the same as that of $\bar{y}_p$, namely $B(\bar{y}_s) = B$ as given in
equation (1); hence the effect of the sample weighting adjustment on the bias of the survey estimate
is the same as that of the population weighting adjustment. Since sample weighting ad-
justments use only data for the sample, they do not compensate for coverage errors (unlike
population weighting adjustments).
Population and sample weighting adjustments have different data requirements, and hence
address different potential sources of bias. In practice the two forms of adjustment are used
in combination. Generally sample weighting adjustments are applied first, and then popula-
tion weighting adjustments are applied afterwards. A common approach is initially to deter-
mine the sample weights needed to compensate for unequal selection probabilities, next to
revise these weights to compensate for unequal response rates in different sample weighting
classes (e.g., urban/rural classes within geographical regions), and finally to revise the weights
again to make the weighted sample distribution for certain characteristics (e.g., age/sex)
conform to the known population distribution for those characteristics. The use of this approach
in the U.S. Current Population Survey is described by Bailar et al. (1978).
As with population weighting adjustments, the aim of sample weighting adjustments is
to reduce the bias that nonresponse may cause in survey estimates. An effect of sample
weighting adjustments is, however, to increase the variances of the survey estimates. There
is therefore a trade-off to be made between bias reduction and variance increase.
An indication of the amount of increase in variance from weighting can be obtained by
considering the situation where the element variances within the weighting classes are all the
same and the variances between the class means are negligible compared to the within-class
variances. In this situation, the loss of precision from weighting is approximately the same
as that arising from the use of disproportionate stratified sampling when proportionate
stratified sampling is optimum; Kish (1965, Section 11.7C; 1976) discusses this latter case.
Under the above conditions, weighting increases the variance of a sample mean by a factor of
approximately $L = (\sum_h W_h k_h)(\sum_h W_h / k_h)$, where $W_h$ is the proportion of the population and
$k_h$ is the weight for class $h$. An alternative expression for $L$ is $(\sum_h n_h)(\sum_h n_h k_h^2)/(\sum_h n_h k_h)^2$,
where $n_h$ is the sample size in class $h$. The factor $L$ becomes large when the variance of the
weights is large.
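The two expressions for L can be compared on a small example. The sketch below is illustrative only: the function names and counts are hypothetical, and the population proportions are derived under the assumption that the weighted sample reproduces the population, so that W_h is proportional to n_h k_h.

```python
def inflation_from_shares(W, k):
    """L = (sum W_h k_h)(sum W_h / k_h), with the W_h summing to 1."""
    return (sum(w * kh for w, kh in zip(W, k))
            * sum(w / kh for w, kh in zip(W, k)))


def inflation_from_counts(n, k):
    """L = (sum n_h)(sum n_h k_h^2) / (sum n_h k_h)^2."""
    numerator = sum(n) * sum(nh * kh ** 2 for nh, kh in zip(n, k))
    return numerator / sum(nh * kh for nh, kh in zip(n, k)) ** 2


# Hypothetical classes: 100 respondents with weight 1 and 50 with weight 2.
n = [100, 50]
k = [1.0, 2.0]
# Assumption: W_h is proportional to n_h * k_h.
total = sum(nh * kh for nh, kh in zip(n, k))
W = [nh * kh / total for nh, kh in zip(n, k)]

L_counts = inflation_from_counts(n, k)
L_shares = inflation_from_shares(W, k)
print(L_counts, L_shares)
```

Both forms give L = 1.125 for these figures, that is, a variance increase of about 12.5 percent relative to equal weights.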
A large variance in the weights can arise from segmenting the sample into many weighting
classes with only a few sampled elements in each. When the weighting classes are small, their
response rates are unstable, and this gives rise to a large variation in the weights. To avoid
this effect, it is common practice to limit the extent to which the sample is segmented. Even
so, there may still be some weighting classes that require large weights. Sometimes these
weighting classes are handled by collapsing them with adjacent ones and sometimes their
weights are cut back to some acceptable maximum value (see Bailar et al. 1978 and Chapman
et al. 1986, for examples). These procedures avoid the increase in variance associated
with the use of extreme weights, but they may lead to increased bias; their effect on the bias
is, however, unknown.
In some cases it seems desirable to use several auxiliary variables in forming the weighting
classes for population or sample weighting adjustments. However, if the classes are formed
by taking the full crossclassification of the variables, there will be a large number of weighting
classes. Unless the sample is very large, the sample sizes in the resultant weighting classes
will be small, and the instability in the response rates will lead to a large variance in the weights
and loss of precision in the survey estimates. One way to deal with this problem is to cut
down on the number of classes by collapsing cells, for instance by discarding some of the
auxiliary variables or using coarser classifications. Another way is to base the weights on
a model, as is done in raking ratio weighting discussed below.
2.3 Raking Ratio Adjustments
When weighting classes are taken to be the cells in the crossclassification of the auxiliary
variables, population weighting adjustments make the joint distribution of the auxiliary
variables in the sample conform to that in the population. Similarly, sample weighting
adjustments make the joint distribution of the auxiliary variables in the respondent sample
conform to that in the total sample. As noted above, however, this crossclassification approach
may have the undesirable effect of creating many small, and hence unstable, weighting classes.
Also, it is not always possible to employ this approach with population weighting adjustments:
in many cases the population marginal distributions, and perhaps some bivariate distribu-
tions, of the auxiliary variables are available, but the full joint distribution is unknown.
An alternative approach is to develop weights that make the marginal distributions of
the auxiliary variables in the sample conform to marginal population distributions (with
population weighting) or marginal total sample distributions (with sample weighting), without
ensuring that the full joint distribution conforms. The method of raking ratio estimation,
or raking, may be used to obtain weights that satisfy these conditions. Raking corresponds
to iterative proportional fitting in contingency table analysis (see, for instance, Bishop et
al., 1975).
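A minimal sketch of iterative proportional fitting for two auxiliary variables is given below. It is not taken from the paper: the function name, the stopping rule and the counts are assumptions made for illustration, and every cell is assumed to contain at least one respondent.

```python
def rake(cell_counts, row_targets, col_targets, max_iter=200, tol=1e-12):
    """Iterative proportional fitting for a two-way table.

    cell_counts: respondent counts by cell (every cell assumed non-empty).
    row_targets, col_targets: target marginal proportions, each summing to 1.
    Returns the fitted cell proportions.
    """
    total = sum(sum(row) for row in cell_counts)
    fitted = [[c / total for c in row] for row in cell_counts]
    n_cols = len(col_targets)
    for _ in range(max_iter):
        # Step 1: scale each row to its target margin.
        fitted = [
            [v * row_targets[h] / sum(row) for v in row]
            for h, row in enumerate(fitted)
        ]
        # Step 2: scale each column to its target margin.
        col_sums = [sum(row[k] for row in fitted) for k in range(n_cols)]
        fitted = [
            [row[k] * col_targets[k] / col_sums[k] for k in range(n_cols)]
            for row in fitted
        ]
        # Columns now match; stop once the rows match as well.
        row_error = max(abs(sum(row) - row_targets[h])
                        for h, row in enumerate(fitted))
        if row_error < tol:
            break
    return fitted


counts = [[30, 20], [10, 40]]  # hypothetical respondent counts
fitted = rake(counts, row_targets=[0.5, 0.5], col_targets=[0.6, 0.4])
total = sum(map(sum, counts))
# Weight for every respondent in cell (h, k): fitted share / respondent share.
weights = [[fitted[h][k] / (counts[h][k] / total) for k in range(2)]
           for h in range(2)]
print(fitted, weights)
```

The fitted table matches both sets of margins while keeping the cross-product ratio of the respondent table, which is the sense in which the full joint distribution is not forced to conform.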
Consider the use of raking in the simple case of two auxiliary variables. Let $W_{hk}$ be the
proportion of the population in the $(h, k)$-th cell of the crossclassification, and let $\hat{W}_{hk}$ be
the proportion assigned to that cell by the raking algorithm. Conditional on the total and
respondent sample sizes in the cells (and assuming all cells have at least one respondent),
the bias of the raking ratio adjusted sample mean $\bar{y}_q = \sum_h \sum_k \hat{W}_{hk} \bar{y}_{rhk}$ is
$$B(\bar{y}_q) = \sum_h \sum_k W_{hk} M_{hk} (\bar{Y}_{rhk} - \bar{Y}_{mhk}) + \sum_h \sum_k (\tilde{W}_{hk} - W_{hk})(\bar{Y}_{rhk} - \bar{Y}_{rh\cdot} - \bar{Y}_{r\cdot k} + \bar{Y}_r)$$
where $\tilde{W}_{hk} = E(\hat{W}_{hk})$. The first term in this bias corresponds to the bias term B in
equation (1) for the population and sample weighting adjustments. It is zero in expectation if
the cell nonrespondents are random subsets of the cell populations. The second term is zero
if either $\tilde{W}_{hk} = W_{hk}$ or there is no interaction in the $\bar{Y}_{rhk}$ for this classification.
Underlying the raking ratio weighting procedure is a logit model for the cell response rates.
With the model $\ln[R_{hk}/(1 - R_{hk})] = \alpha_h + \beta_k$ for the response rates in a two-way
classification, $\tilde{W}_{hk} = W_{hk}$. Thus, under this model, the second term in $B(\bar{y}_q)$ is zero.
Further discussion of raking ratio weighting is given by Oh and Scheuren (1978a,1978b,
1983). Oh and Scheuren (1978a) also provide a bibliography on raking.
2.4 Weighting with Response Probabilities
Although a number of methods for weighting with response probabilities have been pro-
posed, this approach has not been widely adopted as an adjustment procedure. The basis
of the approach is to assume that all population elements have probabilities (usually required
to be non-zero) of responding to the survey. Some method is used to estimate the response
probabilities for responding elements. These elements are then given nonresponse adjust-
ment weights that are in inverse proportion to their estimated response probabilities.
An early application of this approach is the well-known procedure of Politz and Sim-
mons (1949, 1950). A single (evening) call is made to each selected household, and during
the course of the interview respondents are asked on how many of the previous five evenings
they were at home at about the same time. Their response probabilities are then taken to
be the fraction of the six evenings (including the one of the interview) that they were at home,
and the inverses of these probabilities are used in the analysis. Note that the procedure does
not deal with those who were out on all six evenings and those who refused.
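The Politz and Simmons weights can be sketched as follows. The function names and the three respondents are hypothetical; the only rule taken from the text is that the response probability is the fraction of six evenings at home, counting the interview evening.

```python
def politz_simmons_weight(evenings_home_of_previous_five):
    """Weight = 1 / estimated probability of being found at home."""
    x = evenings_home_of_previous_five
    if not 0 <= x <= 5:
        raise ValueError("expected a count between 0 and 5")
    # The interview evening itself counts as one evening at home out of six.
    return 6 / (x + 1)


def weighted_mean(values, weights):
    """Mean of the values using the supplied weights."""
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


# Hypothetical respondents: (evenings at home out of previous five, y value).
respondents = [(5, 10.0), (2, 20.0), (0, 30.0)]
weights = [politz_simmons_weight(x) for x, _ in respondents]
estimate = weighted_mean([y for _, y in respondents], weights)
print(weights, estimate)
```

A respondent who was at home on none of the previous five evenings receives six times the weight of one who was at home on all of them.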
Another approach for estimating response probabilities is to regress response status (1
for respondents, 0 for nonrespondents) on a set of variables available for both respondents
and nonrespondents, using a logistic or probit regression. The predicted values from the regres-
sion for the respondents are then taken to be their response probabilities, and weights in
inverse proportion to these predicted values are used in the analysis. A special case is when
the predictor variables are dummy variables that identify a set of classes. The predicted
response probabilities are then the class response rates, and the method reduces to a sample
weighting adjustment. The method is most appropriate for situations where a good deal of
information is available for the nonrespondents, as for instance when the nonrespondents
are losses after the first wave of a panel survey. Little and David (1983) discuss the
application of the method for panel nonresponse. It should be noted that if the regression is highly
predictive of response status, the resultant weights will vary markedly, leading to a substan-
tial loss in the precision of the survey estimates.
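The special case of dummy predictors can be illustrated with a small logistic fit. The sketch below is an assumption-laden illustration rather than the authors' procedure: it fits a single-predictor logistic regression by Newton-Raphson using only the standard library, with hypothetical function names and data, and shows that the predicted probabilities reproduce the class response rates.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of logit P(y = 1) = a + b * x for one predictor."""
    a = b = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix [[h00, h01], [h01, h11]].
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: (a, b) += inverse(information) applied to the gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def predicted_probability(a, b, xi):
    """Fitted response probability for predictor value xi."""
    return 1.0 / (1.0 + math.exp(-(a + b * xi)))


# Hypothetical sample: x is a dummy for class membership; 8 of 10 respond in
# class 0 and 4 of 10 respond in class 1.
x = [0] * 10 + [1] * 10
y = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
a, b = fit_logistic(x, y)
p0 = predicted_probability(a, b, 0)
p1 = predicted_probability(a, b, 1)
weights = {0: 1.0 / p0, 1: 1.0 / p1}
print(p0, p1, weights)
```

The fitted probabilities are the class response rates (0.8 and 0.4), so the weights coincide with those of a sample weighting adjustment.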
Drew and Fuller (1980, 1981) describe an approach for estimating response probabilities
from the number of respondents secured at successive calls. In their model, the population
is divided into classes. Within each class, every element is assumed to have the same response
probability which remains the same at each call. The model also allows for a proportion
of hard-core nonrespondents that is assumed constant across classes. Under these assump-
tions, the response probabilities for each class and the proportion of hard-core nonrespondents
can be estimated, and hence weighting adjustments can be made. Thomsen and Siring (1983)
adopt a similar approach using a more complex model.
Finally, mention should be made of a related approach that compensates for nonresponse
by weighting up difficult-to-interview respondents. Bartholomew (1961), for instance, pro-
posed making only two calls in a survey, and weighting up the respondents at the second
call to represent the nonrespondents. The assumption behind this approach is that the
nonrespondents are like the late respondents. This assumption seems questionable, however,
and empirical evidence from an intensive follow-up study of nonrespondents in the U.S. Cur-
rent Population Survey does not support it (Palmer and Jones 1966; Palmer 1967).
3. IMPUTATION
A wide variety of imputation methods has been developed for assigning values for miss-
ing item responses. The aim here is to provide a brief overview of the methods, the basic
differences between them, and some of the issues involved in imputation. A fuller treatment
is provided by Kalton and Kasprzyk (1982).
Imputation methods can range from simple ad hoc procedures used to ensure complete
records in data entry to sophisticated hot-deck and regression techniques. The following are
some common imputation procedures:
(a) Deductive imputation. Sometimes the missing answer to an item can be deduced with
certainty from the pattern of responses to other items. Edit checks should check for con-
sistency between responses to related items. When the edit checks constrain a missing
response to only one possible value, deductive imputation can be employed. Deductive
imputation is the ideal form of imputation.
(b) Overall mean imputation. This method assigns the overall respondent mean to all miss-
ing responses.
(c) Class mean imputation. The total sample is divided into classes according to values of
the auxiliary variables being used for the imputation (comparable to weighting classes).
Within each imputation class the respondent class mean is assigned to all missing responses.
(d) Random overall imputation. A respondent is chosen at random from the total respon-
dent sample, and the selected respondent's value is assigned to the nonrespondent. This
method is the simplest form of hot-deck imputation, that is an imputation procedure
in which the value assigned for a missing response is taken from a respondent to the cur-
rent survey.
(e) Random imputation within classes. In this hot-deck method, a respondent is chosen at
random within an imputation class, and the selected respondent's value is assigned to
the nonrespondent.
(f) Sequential hot-deck imputation. The term sequential hot-deck imputation is used here
to describe the procedure used with the labor force items in the U.S. Current Population
Survey (Brooks and Bailar 1978). The procedure starts with a set of imputation classes.
A single value for the item subject to imputation is assigned for each class (perhaps taken
from a previous survey). The records in the survey's data file are then considered in turn.
If a record has a response for the item in question, its response replaces the value stored
for the imputation class in which it falls. If the record has a missing response, it is assign-
ed the value stored for its imputation class.
The hot-deck method is similar to random imputation within classes. If the order of
the records in the data file were random, the two methods would be equivalent, apart
from the start-up process. The non-random order of the list generally acts to the benefit
of the hot-deck method since it gives a closer match of donors and recipients provided
that the file order creates positive autocorrelation. The benefit is, however, unlikely to
be substantial.
The sequential hot-deck suffers the disadvantage that it may easily make multiple uses
of donors, a feature that leads to a loss of precision in survey estimates. Multiple use
of a donor occurs when, within an imputation class, a record with a missing response
is followed by one or more other records with missing responses. The number of imputa-
tion classes that can be used with the method also has to be limited in order to ensure
that donors are available within each class.
Useful discussions of the sequential hot-deck method are provided by Bailar et al.
(1978), Bailar and Bailar (1978, 1983), Ford (1983), Oh and Scheuren (1980), Oh et al.
(1980), and Sande (1983).
(g) Hierarchical hot-deck imputation. The above disadvantages of the sequential hot-deck
are avoided in the hierarchical hot-deck method, a form of hot-deck imputation developed
for the items in the March Income Supplement of the Current Population Survey. The
procedure sorts respondents and nonrespondents into a large number of imputation classes
from a detailed categorization of a sizeable set of auxiliary variables. Nonrespondents
are then matched with respondents on a hierarchical basis, in the sense that if a match
cannot be made in the initial imputation class, classes are collapsed and the match is made
at a lower level of detail. Coder (1978) and Welniak and Coder (1980) provide further
details on the hierarchical hot-deck procedure.
(h) Regression imputation. This method uses respondent data to regress the variable for which
imputations are required on a set of auxiliary variables. The regression equation is then
used to predict the values for the missing responses. The imputed value may either be
the predicted value, or the predicted value plus some residual. There are several ways
in which the residual may be obtained, as discussed later.
(i) Distance function matching. This hot-deck method assigns a nonrespondent the value
of the "nearest" respondent, where "nearest" is defined in terms of a distance function
for the auxiliary variables. Various forms of distance function have been proposed (e.g.,
Sande 1979; Vacek and Ashikago 1980), and the function can be constructed to reduce
the multiple use of donors by incorporating a penalty for each use (Colledge et al. 1978).
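The sequential hot-deck procedure of item (f) can be sketched in a few lines. The record layout, function name and starting values below are hypothetical; the logic follows the description in the text, in which a reported value replaces the stored value for its class and a missing value is filled from the stored value.

```python
def sequential_hot_deck(records, start_values):
    """Impute missing values in file order.

    records: list of (imputation_class, value) pairs, with None for a
        missing value, in the order in which they appear in the data file.
    start_values: dict mapping class -> initial stored value (for instance
        taken from a previous survey).
    Returns a list of (value, was_imputed) pairs.
    """
    stored = dict(start_values)
    completed = []
    for cls, value in records:
        if value is None:
            # Missing response: assign the value currently stored for the class.
            completed.append((stored[cls], True))
        else:
            # Reported response: it becomes the stored value for the class.
            stored[cls] = value
            completed.append((value, False))
    return completed


records = [("a", None), ("a", 5), ("b", 7), ("a", None), ("a", None), ("b", None)]
completed = sequential_hot_deck(records, start_values={"a": 1, "b": 2})
print(completed)
```

In this example the value 5 is donated twice because two records with missing responses follow it within class "a", which is the multiple use of donors noted in the text.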
Although at first sight these may appear a diverse set of procedures, they can nearly all
be fitted within a single unifying framework. The methods can all be described, at least ap-
proximately, as special cases of the general regression model
$$\hat{y}_{mi} = b_{r0} + \sum_j b_{rj} z_{mij} + \hat{e}_{mi} \qquad (2)$$
where $\hat{y}_{mi}$ is the imputed value for the $i$th record with a missing $y$ value, $z_{mij}$ are values
reflecting the auxiliary variables for that record, $b_{r0}$ and $b_{rj}$ are the regression coefficients for the
regression of $y$ on $x$ for the respondents, and $\hat{e}_{mi}$ is a residual chosen according to a specified
scheme for the particular imputation method.
Equation (2) represents the regression imputation method in an obvious way. If the $\hat{e}_{mi}$'s
are set at zero, then the imputed value is the predicted value from the regression; otherwise
a residual of some form may be added. The equation also represents class mean imputation
by defining the $z$'s to be dummy variables that represent the classes, and setting $\hat{e}_{mi} = 0$.
The regression equation then reduces to $\hat{y}_{mi} = \bar{y}_{rh}$, the class mean. Random imputation
within classes is obtained by adding a residual to the class mean, where the residual is the
deviation from the class mean for one of the respondents. Then $\hat{y}_{mi} = \bar{y}_{rh} + e_{rhk}$, where
$e_{rhk}$ is the deviation for respondent $k$ in class $h$; this reduces to $\hat{y}_{mi} = y_{rhk}$, the value for that
respondent. The sequential and hierarchical hot-deck methods resemble the random within
class method. The overall mean and random overall imputation methods are degenerate cases
of the class mean and random within class methods that use no auxiliary information.
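The way class mean imputation and random imputation within classes fit the same framework can be sketched as below. The function names and records are hypothetical; the stochastic branch adds the deviation of a randomly chosen donor from the class mean, which returns that donor's own value.

```python
import random
from collections import defaultdict


def class_means(records):
    """Respondent mean and donor values per class; None marks a missing value."""
    donors = defaultdict(list)
    for cls, value in records:
        if value is not None:
            donors[cls].append(value)
    means = {cls: sum(v) / len(v) for cls, v in donors.items()}
    return means, donors


def impute(records, stochastic, rng):
    """Class mean imputation, optionally plus a randomly chosen donor residual."""
    means, donors = class_means(records)
    completed = []
    for cls, value in records:
        if value is None:
            value = means[cls]                   # predicted value, residual = 0
            if stochastic:
                donor = rng.choice(donors[cls])
                value += donor - means[cls]      # add the donor's deviation
        completed.append(value)
    return completed


records = [("a", 2.0), ("a", 4.0), ("a", None), ("b", 10.0), ("b", None)]
deterministic = impute(records, stochastic=False, rng=random.Random(0))
stochastic = impute(records, stochastic=True, rng=random.Random(0))
print(deterministic, stochastic)
```

With a single class containing all records, the same code gives the overall mean and random overall methods.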
An important consideration in the choice of imputation method is the type of variable
being imputed. All the above methods can be applied routinely with continuous variables,
but some of them are not suitable for use with categorical or discrete variables (such as being
a member of the labor force (1) or not (0), and the number of completed years of
education). Overall mean, class mean, and regression imputations impute values like 0.7 for being
a member of the labor force (i.e., a 70% chance) and 10.7 for the number of completed
years of education. These values are not feasible for individual respondents, and rounding
them to whole numbers leads to bias. For this reason, these imputation methods do not work
well for categorical and discrete variables. A notable advantage of all hot-deck methods is
that they always give feasible values since the values are taken from respondents.
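The bias from rounding can be seen in a small made-up example. The function name and the counts are hypothetical, and the example assumes that the missing cases resemble the respondents in their class, so that about 70 percent of them are in the labor force.

```python
def rounded_class_mean_share(respondent_values, n_missing):
    """Share of 1s in the completed data after rounding the class mean."""
    mean = sum(respondent_values) / len(respondent_values)
    imputed = 1 if mean >= 0.5 else 0  # every missing case gets the same value
    ones = sum(respondent_values) + imputed * n_missing
    return ones / (len(respondent_values) + n_missing)


# Hypothetical class: 7 of 10 respondents are in the labor force (mean 0.7),
# and 10 further records are missing the item.
respondents = [1] * 7 + [0] * 3
share = rounded_class_mean_share(respondents, n_missing=10)
print(share)  # 0.85, although the respondent share is 0.70
```

Rounding 0.7 up to 1 places every missing case in the labor force, which pushes the estimated proportion from 0.70 to 0.85.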
There are two major distinguishing features of the above imputation methods that deserve
elaboration: whether or not a residual is added and, if one is, the form of the residual; and
whether the auxiliary information is used in dummy variable form to represent classes or
whether it is used straightforwardly in the regression. These features are discussed in the
next two subsections. Other issues arising with the use of imputation are then discussed in
subsequent subsections.
3.1 Choice of Residuals
Imputation methods may be classified as deterministic or stochastic according to whether
the $\hat{e}_{mi}$'s are set at zero or not. For each deterministic imputation method, there is a
stochastic counterpart. Let $\hat{y}_{mid}$ be the value imputed by the deterministic method and
$\hat{y}_{mis} = \hat{y}_{mid} + \hat{e}_{mi}$ be that imputed by the corresponding stochastic method. Then
$E_2(\hat{y}_{mis}) = \hat{y}_{mid}$, where $E_2$ denotes expectation over the sampling of residuals given the
initial sample, provided that $E_2(\hat{e}_{mi}) = 0$ (as generally applies).
The choice between a deterministic and the corresponding stochastic imputation method
depends on the form of survey analysis to be conducted. Consider first the estimation of
the population mean of the y-variable using the sample mean of the respondents' values and
the nonrespondents' imputed values. As Kalton and Kasprzyk (1982) show, given that
$E_2(\hat{y}_{mis}) = \hat{y}_{mid}$, it follows that the expectation of the sample mean is the same whether the
deterministic method or the corresponding stochastic method is used. Thus both methods
have the same effect on the bias of the estimate. However, the addition of random residuals
in the stochastic method causes a loss of precision in the sample mean. Although this loss
can be controlled by the choice of a suitable method of sampling residuals (Kalton and Kish
1984), nevertheless some loss in precision occurs. For this reason a deterministic scheme is
preferable for the purpose of estimating the population mean.
Consider now the estimation of the element standard deviation and distribution of the
y-variable. Deterministic imputation methods fare badly for these purposes, since they cause
an attenuation in the standard deviation and they distort the shape of the distribution. This
may be simply illustrated in terms of the class mean imputation method. By assigning the
class mean to all the missing values in a class, the shape of the distribution is clearly distorted
with a series of spikes at the class means. The standard deviation of the distribution is at-
tenuated because the imputed values reflect only the between-class and not the within-class
variance. The appeal of the stochastic imputation methods is that the residual term captures
the within-class (or residual) variance, and hence avoids the attenuation of the element stan-
dard deviation and the distortion of the distribution.
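The attenuation described above can be checked with a small simulation. The sketch below is illustrative only: it assumes a single imputation class, simulated normal data, and values missing at random, and the function names are hypothetical rather than taken from any survey package.

```python
import random
import statistics


def impute_class_mean(values):
    """Deterministic: every missing value (None) gets the respondents' mean."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]


def impute_random_within_class(values, rng):
    """Stochastic: every missing value gets a randomly chosen respondent value."""
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


rng = random.Random(0)
# Hypothetical data for one imputation class; about half the values are
# set to missing completely at random.
true_values = [rng.gauss(50, 10) for _ in range(2000)]
with_missing = [v if rng.random() < 0.5 else None for v in true_values]

sd_true = statistics.pstdev(true_values)
sd_mean = statistics.pstdev(impute_class_mean(with_missing))
sd_random = statistics.pstdev(impute_random_within_class(with_missing, rng))

print(f"SD of complete data:           {sd_true:.2f}")
print(f"SD after class mean imputation: {sd_mean:.2f}")
print(f"SD after random imputation:     {sd_random:.2f}")
```

With half the values replaced by a single mean, the standard deviation shrinks markedly, whereas drawing respondent values at random leaves it close to that of the complete data.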
Since some survey analyses are likely to involve the distributions of the variables, stochastic
imputation methods like the hot-deck methods are generally preferred. Once a decision is
made to use a stochastic method, the question of how to choose the residuals arises. If the
standard regression assumptions are accepted, the residuals could be chosen from a normal
distribution with a mean of zero and a variance equal to the residual variance from the respon-
dent regression. However, this places complete reliance on the model. An alternative that
avoids the normality assumption is to choose the residuals randomly from the empirical
distribution of the respondents' residuals. Another alternative is to select a residual from
a respondent who is a "close" match to the nonrespondent, measuring "close" in terms
of similar values on the auxiliary variables. This attractive alternative avoids the assumption
of homoscedasticity and guards against misspecification of the distribution of the residual
term. In the limit, the closest respondent is one who has the same values of all the auxiliary
variables as the nonrespondent. In this case, the nonrespondent is given one of the matched
respondents' values. This case arises with hot-deck methods, where nonrespondents and
respondents are matched in terms of the auxiliary variables, and nonrespondents are assign-
ed values from matched respondents.
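The three ways of choosing a residual can be set side by side in code. This is a minimal sketch under stated assumptions: a single auxiliary variable z, a straight-line respondent regression, and hypothetical names (`fit_line`, `impute_with_residual`) and data introduced only for illustration.

```python
import random
import statistics


def fit_line(z, y):
    """Ordinary least squares for y = a + b*z; returns (a, b)."""
    z_bar, y_bar = statistics.mean(z), statistics.mean(y)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    b = sxy / sxx
    return y_bar - b * z_bar, b


def impute_with_residual(z_resp, y_resp, z_miss, method, rng):
    """Regression prediction plus a residual chosen by one of three rules."""
    a, b = fit_line(z_resp, y_resp)
    residuals = [yi - (a + b * zi) for zi, yi in zip(z_resp, y_resp)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(e * e for e in residuals) / (len(residuals) - 2)) ** 0.5
    imputed = []
    for zm in z_miss:
        if method == "normal":       # relies fully on the regression model
            e = rng.gauss(0.0, sigma)
        elif method == "empirical":  # drops the normality assumption
            e = rng.choice(residuals)
        elif method == "nearest":    # residual of the closest respondent on z
            donor = min(range(len(z_resp)), key=lambda i: abs(z_resp[i] - zm))
            e = residuals[donor]
        else:
            raise ValueError(f"unknown method: {method}")
        imputed.append(a + b * zm + e)
    return imputed


# Hypothetical respondent data and two nonrespondents.
z_resp = [1.0, 2.0, 3.0, 4.0, 5.0]
y_resp = [2.1, 3.9, 6.2, 7.8, 10.1]
rng = random.Random(0)
for method in ("normal", "empirical", "nearest"):
    print(method, impute_with_residual(z_resp, y_resp, [3.0, 4.4], method, rng))
```

When a nonrespondent has exactly the same z as a respondent (here z = 3.0), the "nearest" rule returns that respondent's own value, which is the limiting case the text identifies with hot-deck matching.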
A further consideration in the choice of residuals is to make the imputed values feasible
ones. As noted above, deterministic methods may impute values for categorical and discrete
variables that are not feasible. Some stochastic methods solve this problem through the alloca-
tion of the residuals. In particular, the use of respondents' residuals with the random within
class and the sequential and hierarchical hot-deck methods ensures that the imputed values
are feasible ones.
3.2 Imputation Class or Regression Imputation
As noted earlier, both imputation class and regression imputation methods fall within the
imputation model given by equation (2). The difference between them lies in the ways in
which they employ the auxiliary variables.
Imputation class methods divide the sample into a set of classes. For this purpose, con-
tinuous auxiliary variables have to be categorized. There is complete flexibility in the way
the classes are formed, and the symmetrical use of the auxiliary variables in different parts
of the sample is not required. Thus, for instance, in imputing for hourly rate of pay in a
sample of employees, the sample might first be divided into two parts, union members, and
nonmembers; then the imputation classes for the members might be formed in terms of age
and occupation whereas those for nonmembers might be formed in terms of sex and industry.
As a rule, the aim is to construct classes of adequate size that explain as much of the variance
in the variable to be imputed as possible. When the classes are formed by a complete
crossclassification of the auxiliary variables, the underlying model contains all main effects
and all interactions for the crossclassification. The limitation of imputation class methods
is that the number of classes formed has to be limited to ensure that there is some
minimum number of respondents in each class. The hierarchical hot-deck method attempts
to extend the amount of auxiliary data used, but even with this method matches of respondents
and nonrespondents often cannot be made at the finer levels of detail. Coupled with the
use of a random respondent residual within a class, imputation class methods have the valuable
property that imputed values are feasible ones: that is, the imputed values are actual
respondents' values.
Regression imputation methods have an advantage over imputation class methods in the
number and in the level of detail of the auxiliary variables they can employ. Age can, for
instance, be taken as a continuous variable rather than being categorized into a few classes.
The regression model allows more main effects to be included in the model, but at the price
of fewer interactions. Regression models can, of course, include some interactions, but they
need to be specified. The models can also include polynomial terms and employ transforma-
tions, but again they need to be specified. The regression model has the potential of pro-
viding better predictions for the imputed values, but to achieve this careful modelling is
required. Careful imputation modelling is unrealistic for all the variables in a survey, but
it may be feasible for one or two major ones (and especially so for continuous surveys).
Without careful modelling, there is a serious risk of poor imputations, although as noted
earlier, this risk can be reduced by the allocation of random residuals from "close"
respondents.
If a regression imputation assigns the residual from a respondent with exactly the same
values of the auxiliary variables, the imputed value is necessarily a feasible one. If, however,
there is even a small difference between the respondent's and nonrespondent's values on the
auxiliary variables, the imputed value may not be feasible. A variant of regression imputa-
tion that avoids this problem, termed predictive mean matching, is described by Little (1986b)
(Little attributes the method to Rubin). With predictive mean matching, the nonrespondent
is matched to the respondent with the closest predicted value. Then, instead of adding the
respondent's residual to the nonrespondent's predicted value, the nonrespondent is assigned
the respondent's value. The method is thus a hot-deck method, and is similar to distance
function matching.
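Predictive mean matching as described above can be sketched in a few lines. The example assumes a single auxiliary variable and a straight-line regression; the function names and the small data set (a count variable predicted from age) are hypothetical.

```python
import statistics


def fit_line(z, y):
    """Ordinary least squares for y = a + b*z; returns (a, b)."""
    z_bar, y_bar = statistics.mean(z), statistics.mean(y)
    sxy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    sxx = sum((zi - z_bar) ** 2 for zi in z)
    b = sxy / sxx
    return y_bar - b * z_bar, b


def predictive_mean_matching(z_resp, y_resp, z_miss):
    """Give each nonrespondent the observed value of the respondent whose
    predicted value is closest to the nonrespondent's predicted value."""
    a, b = fit_line(z_resp, y_resp)
    pred_resp = [a + b * zi for zi in z_resp]
    imputed = []
    for zm in z_miss:
        target = a + b * zm
        donor = min(range(len(pred_resp)), key=lambda i: abs(pred_resp[i] - target))
        imputed.append(y_resp[donor])  # the donor's actual value, not a prediction
    return imputed


# Hypothetical data: y is a count, so fractional predictions are not feasible.
age_resp = [20, 30, 40, 50, 60]
count_resp = [1, 2, 2, 3, 4]
imputed = predictive_mean_matching(age_resp, count_resp, [33, 58])
print(imputed)  # [2, 4]
```

The plain regression predictions for the two nonrespondents would be 1.91 and 3.66, neither of which is a possible count; matching on the predicted value returns actual respondent values instead.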
The choice between imputation class and regression imputation methods should in part
depend on the efforts made to develop the regression model. Unless adequate resources are
devoted to the development of a regression model, the imputation class methods may be
safer. The choice should also in part depend on the sample size. With large samples, hot-
deck methods are likely to be able to use enough classes to take advantage of all the major
predictor variables; however, with small samples this may not hold, and regression methods
may have greater potential. David et al. (1986) describe an interesting study that compares
regression models for imputing wages and salary in the U.S. Current Population Survey with
hierarchical hot-deck imputations. Despite the extensive efforts made to develop the regres-
sion models, the hot-deck imputations were not found to be inferior in this large sample.
3.3 Effect of Imputation on Relationships
Although most of the literature on imputation deals with its effect on univariate statistics
such as means and distributions, a large part of survey analysis is concerned with bivariate
and multivariate relationships. Here the analysis of relationships can be considered in broad
terms to include crosstabulation, correlation or regression analysis, comparisons of subclass
means or proportions, and any other analysis involving two or more variables. As will be
illustrated below, imputation can have harmful effects on all analyses of relationships, often
attenuating the associations between variables. Discussions of the effects of imputations on
relationships are provided by Santos (1981), Kalton and Kasprzyk (1982) and Little (1986a).
The general nature of the effect of imputation on relationships can be seen by considering
its effect on the estimate of the sample covariance in the simple situation where the y-variable
has missing responses that are missing at random over the population and the x-variable has
no missing data. The sample covariance, $s_{xy}$, is calculated in the standard way, based on
the actual values for respondents and the imputed values for nonrespondents, as an estimate
of the population covariance $S_{xy}$. Using the fact that $E_2(\hat{y}_{mis}) = \hat{y}_{mid}$ as above, it can be
readily shown that the expected value of $s_{xy}$ under a deterministic imputation method is the
same as that under the corresponding stochastic method.
As Santos (1981) shows, the relative bias of $s_{xy}$ when the mean overall or random overall
imputation methods are used is approximately $-M$, where $M$ is the nonresponse rate. This
occurs because the imputed y-values are unrelated to their x-values, and hence the cases with
imputed values attenuate the covariance towards zero. This attenuation is decreased in
magnitude by imputation methods that use auxiliary variables. With class mean imputation
or random imputation within classes, the relative bias is approximately $-M(S_{xy \cdot z}/S_{xy})$,
where $S_{xy \cdot z} = \sum_h W_h S_{xyh}$ is the average within-class covariance for classes formed by the
auxiliary variables z, $S_{xyh}$ is the covariance within class $h$, and $W_h$ is the proportion of the
population in class $h$. With predicted regression imputation or regression imputation with
a random residual, both with a single auxiliary variable z, the relative bias is approximately
$-M[1 - (\rho_{xz}\rho_{yz}/\rho_{xy})]$, where $\rho_{uv}$ is the correlation between u and v.
The disturbing feature of these results is that, unless $M$ is small, $s_{xy}$ calculated with
imputed values under any of these imputation methods may be subject to substantial bias even
under the missing at random model. The estimates $s_{xy}$ computed with imputed values
obtained under the imputation class and regression methods are unbiased only if the partial
covariance $S_{xy \cdot z}$ is zero. In general, there is no reason to assume uncritically that $S_{xy \cdot z}$ is zero.
However, there is an important case when $S_{xy \cdot z} = 0$. This occurs when x = z, that is when
x is used as an auxiliary variable in the imputation procedure. In this case, the sample
covariance is unbiased under the missing at random model. This result suggests that if the
relationship between x and y is to form an important part of the survey analysis, x should
be used as an auxiliary variable in imputing for missing y-values.
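A simulation gives a rough check on these results. The sketch below is an illustration under assumptions that are not in the source: simulated bivariate data with y = 2x + noise, a nonresponse rate of 0.4 with y missing completely at random, and a straight-line regression of y on x for the second method.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length lists."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / (n - 1)


rng = random.Random(1)
n, M = 20000, 0.4  # sample size and nonresponse rate (assumed values)
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]
missing = [rng.random() < M for _ in range(n)]

x_r = [xi for xi, m in zip(x, missing) if not m]
y_r = [yi for yi, m in zip(y, missing) if not m]

# Random overall imputation: the donor is chosen without regard to x.
y_overall = [rng.choice(y_r) if m else yi for yi, m in zip(y, missing)]

# Regression imputation with a random respondent residual, using x itself
# as the auxiliary variable (the case x = z in the text).
b = cov(x_r, y_r) / cov(x_r, x_r)
a = sum(y_r) / len(y_r) - b * sum(x_r) / len(x_r)
resid = [yi - (a + b * xi) for xi, yi in zip(x_r, y_r)]
y_reg = [(a + b * xi + rng.choice(resid)) if m else yi
         for xi, yi, m in zip(x, y, missing)]

s_full = cov(x, y)
rel_bias_overall = cov(x, y_overall) / s_full - 1
rel_bias_regression = cov(x, y_reg) / s_full - 1
print(f"random overall imputation: {rel_bias_overall:+.3f} (theory: about {-M})")
print(f"regression on x + residual: {rel_bias_regression:+.3f} (theory: about 0)")
```

In this setup the covariance under random overall imputation falls short by roughly the nonresponse rate, while imputing with x as the auxiliary variable leaves it approximately unbiased.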
The above theory assumes that only the y-variable was subject to missing data. In practice
the x-variable will often also be incomplete. If so, the sample covariance may be
attenuated because of the imputations for both variables. A special feature occurs when x and
y are both missing for a record. If the two values are imputed separately, the covariance
is attenuated, but if they are imputed jointly, using the same respondent as the donor of
both values, the covariance structure is retained. This suggests that when a record has several
missing related values, they should be taken from the same donor. Coder (1978) describes
the use of joint imputation from the same donor in the March Income Supplement of the
Current Population Survey.
As an illustration of how the above arguments about the attenuation of covariances app-
ly to other forms of relationships, we will give a simple numerical example of the effect of
imputation on the difference between two proportions. Let the variable of interest be whether
an individual has a particular attribute or not, and suppose that one half of the respondents
fail to answer this question. The missing responses are imputed by a random within class
imputation method using two classes, A and B. The objective is now to compare the
percentages of men and women with the attribute. The data are displayed in Table 1. Since
60% of the total respondents in class A have the attribute, 60 of the 100 male and 60 of
the 100 female nonrespondents in that class will be imputed to have the attribute. Similarly,
in class B 40% of the total respondents have the attribute, and so 40 male and 40 female
nonrespondents will be imputed to have the attribute. The proportion of actual and imputed
males with the attribute is thus (80 + 60 + 60 + 40)/400 = 0.6, or 60%. For females the
corresponding proportion is (40 + 60 + 20 + 40)/400 = 0.4, or 40%. The difference
between these two percentages is 20%.

Table 1
Number of Respondents with the Attribute, and Number of
Sampled Persons by Class, Sex and Response Status

                                        Class A               Class B
                                    M     F    Total      M     F    Total
Respondents with the attribute      80    40    120       60    20     80
Total respondents                  100   100    200      100   100    200
Nonrespondents                     100   100    200      100   100    200
Total sample                       200   200    400      200   200    400

Had sex also been taken into account in forming the imputation classes, the percentages
of males and females with the attribute would have been 70% and 30%, differing by 40%.
The failure to include sex as an auxiliary variable in the imputation has thus caused a substan-
tial attenuation in the measurement of the relationship between sex and having the attribute.
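The arithmetic of this example can be reproduced directly from the counts in Table 1. The sketch below works with expected numbers of imputed cases (the respondent proportion in the imputation class times the number of nonrespondents), as the text does; the function name and the data layout are hypothetical.

```python
# Counts from Table 1, keyed by (class, sex):
# (respondents with the attribute, total respondents, nonrespondents)
table = {
    ("A", "M"): (80, 100, 100),
    ("A", "F"): (40, 100, 100),
    ("B", "M"): (60, 100, 100),
    ("B", "F"): (20, 100, 100),
}


def proportion_with_attribute(sex, class_key):
    """Proportion of actual plus (expected) imputed cases with the attribute.

    class_key maps a (class, sex) cell to the imputation class it belongs to.
    """
    # Pool the respondents of all cells that share an imputation class.
    pooled = {}
    for cell, (attr, resp, _) in table.items():
        a, r = pooled.get(class_key(cell), (0, 0))
        pooled[class_key(cell)] = (a + attr, r + resp)

    have = total = 0.0
    for cell, (attr, resp, nonresp) in table.items():
        if cell[1] != sex:
            continue
        a, r = pooled[class_key(cell)]
        have += attr + nonresp * a / r  # respondents plus expected imputed cases
        total += resp + nonresp
    return have / total


by_class = lambda cell: cell[0]      # imputation classes A and B only
by_class_and_sex = lambda cell: cell  # sex also used in forming the classes

for label, key in (("class only", by_class), ("class and sex", by_class_and_sex)):
    m = proportion_with_attribute("M", key)
    f = proportion_with_attribute("F", key)
    print(f"{label}: males {m:.0%}, females {f:.0%}, difference {m - f:.0%}")
```

The first line of output reproduces the 60% and 40% of the text, and the second the 70% and 30% obtained when sex is included among the auxiliary variables.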
3.4 Multiple Imputations
Ideally the analyst using a data set with imputed values should be able to obtain valid
results for any analyses by applying standard techniques for complete data. However, as
noted in the last section, imputation can distort measures of the relationships between
variables. It also distorts standard error estimation.
All imputation methods except deductive imputation fabricate data to some extent. The
extent of fabrication depends on how well the imputation model predicts the missing values.
If the imputation model explains only a small proportion of the variance in the variable among
the respondents, the amount of fabrication in each imputed value is likely to be substantial.
If the imputation model explains a high proportion of the respondent variance, the amount
of fabrication is likely to be less serious. However, it needs to be recognized that the fit of
the imputation model for the respondents is not necessarily a good measure of the fit for
the nonrespondents.
Standard errors computed in the standard way from a data set with imputed values will
generally be underestimates because of the fabrication involved in the imputed values. Rubin
(1978, 1979) has advocated the method of multiple imputations to provide valid inferences
from data sets with imputed values (see also Herzog and Rubin 1983; Rubin and Schenker
1986). When multiple imputations are used for the purpose of standard error estimation,
the construction of the complete data set by imputing for the missing responses is carried
out several (say m) times using the same imputation procedure. The sample estimates
$z_i$ ($i = 1, 2, \ldots, m$) of the population parameter of interest $Z$ are computed from each of
the replicate data sets, and their average $\bar{z}$ is calculated. A variance estimator for $\bar{z}$ is then
given by $V = W + [(m + 1)/m]B$, where $W$ is the average of the within-replicate variances
of the $z_i$ and $B = \sum_i (z_i - \bar{z})^2/(m - 1)$ is the between-replicate variance. Even with the
inclusion of the between-replicate variance component, however, the coverages of confidence
intervals for $Z$ based on $V$ are still overstated, with the amount of overstatement increasing
with the level of nonresponse.
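The variance estimator just described is simple to compute once the m replicate estimates and their variances are available. The following sketch implements the formula as stated in the text; the function name and the two-replicate numbers are hypothetical.

```python
def combine_replicates(estimates, within_variances):
    """Combine m replicate estimates z_i and their estimated variances.

    Returns (z_bar, W, B, V) with V = W + ((m + 1) / m) * B.
    """
    m = len(estimates)
    if m < 2 or len(within_variances) != m:
        raise ValueError("need at least two replicates with one variance each")
    z_bar = sum(estimates) / m
    W = sum(within_variances) / m                            # average within-replicate variance
    B = sum((z - z_bar) ** 2 for z in estimates) / (m - 1)   # between-replicate variance
    V = W + (m + 1) / m * B
    return z_bar, W, B, V


# Hypothetical example with m = 2 replicates.
z_bar, W, B, V = combine_replicates([10.0, 12.0], [1.0, 1.0])
print(z_bar, W, B, V)  # 11.0 1.0 2.0 4.0
```

Ignoring the between-replicate component would give a variance of 1.0 here rather than 4.0, which illustrates how far a standard error computed from a single completed data set can understate the uncertainty.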
This overstatement of the confidence levels can be addressed by modifying the imputa-
tion procedure, as described by Rubin and Schenker (1986). Their treatment considers the
random overall imputation method, and one of their modifications allows for uncertainty
about the population mean and variance in the following way. With the standard random
overall imputation method, the conditional expected mean and variance of the imputed values
are the sample respondents' mean and variance. With the modification, the expected mean
and variance of the imputed values for a replicate are drawn at random from appropriate
distributions. The imputed values are then a random selection of respondents' values, modified
for the randomly-chosen mean and variance. When estimating the population mean, the ef-
fect of the changing expected mean and variance between replicates is to increase the between-
replicate variance component in V. This increase gives improved coverage for the resultant
confidence intervals.
A major problem with the use of multiple imputations is the additional computer analysis
needed, which increases as the number of replicates, m, increases. For this reason, a small
value of m, such as m = 2, may be preferred. A small value of m may, however, result in
a low level of precision for the variance estimator. Even with small m, it is questionable
whether the multiple imputation approach is feasible for routine analyses. It may be best
reserved for special studies, such as that described by Herzog and Rubin (1983).
In addition to providing appropriate standard errors, another advantage of multiple im-
putations from the same imputation procedure is that it reduces the loss of precision in survey
estimates arising from the random selection of respondents to act as donors of imputed values
(see Section 3.1). This loss is reduced with multiple imputations by averaging over the
replicates. A small number of replicates serves well for this purpose. As noted earlier, Kalton
and Kish (1984) describe alternative ways of selecting the sample of respondents to achieve
this end.
A second major potential application of multiple imputations is to generate the imputa-
tions for the several replicates by different imputation procedures, making different assump-
tions about the nonrespondents. Suppose, for instance, that hourly rates of pay are to be
imputed for some earners in the sample. One procedure that might be used is the random
within class imputation method, which is based on an assumption that nonrespondents are
missing at random within the classes. If it is thought that the nonrespondents might in fact
come more heavily from those with higher rates of pay in each class, a simple modification
to the random within class method might be to impute values that are, say, 50 cents above
the donors' values. Other imputation procedures - for instance, using different imputation
classes - could also be tried. Comparison of the survey estimates obtained from the data
sets in which the different imputation procedures are applied then provides a valuable in-
dication of the sensitivity of the estimates to the values imputed. If the estimates turn out
to be very similar, they can be accepted with greater confidence; if they differ markedly,
the estimates need to be treated with considerable caution.
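A sensitivity check of this kind can be sketched as follows. The records, class labels, and function name are hypothetical; the shift of 50 cents follows the example in the text, and the same random seed is used for both runs so that the two completed data sets differ only in the assumption made about the nonrespondents.

```python
import random


def random_within_class(records, rng, shift=0.0):
    """records: list of (class_label, hourly_rate or None).

    Each missing rate is replaced by a randomly chosen donor rate from the
    same class, plus an optional shift reflecting a different assumption
    about the nonrespondents.
    """
    donors = {}
    for label, rate in records:
        if rate is not None:
            donors.setdefault(label, []).append(rate)
    completed = []
    for label, rate in records:
        if rate is None:
            rate = rng.choice(donors[label]) + shift
        completed.append(rate)
    return completed


def mean(values):
    return sum(values) / len(values)


# Hypothetical sample of earners; None marks a missing hourly rate of pay.
records = [
    ("clerical", 8.00), ("clerical", 9.50), ("clerical", None),
    ("skilled", 14.00), ("skilled", None), ("skilled", 16.50),
    ("skilled", None), ("clerical", 7.25),
]

base = mean(random_within_class(records, random.Random(42)))
high = mean(random_within_class(records, random.Random(42), shift=0.50))
print(f"missing at random within classes: {base:.4f}")
print(f"nonrespondents 50 cents higher:   {high:.4f}")
print(f"difference in estimated mean:     {high - base:.4f}")
```

Because three of the eight rates are imputed, the shifted assumption raises the estimated mean by 0.50 x 3/8 = 0.1875; whether a change of that size matters is a judgment for the analyst.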
4. CONCLUDING REMARKS
Weighting and imputation have been presented as two distinct methods for handling missing
survey data, but in fact there is a close relationship between them. This may be illustrated
by considering any imputation method that assigns respondents' values to the nonrespondents.
For univariate analyses, this process is equivalent to dropping the nonrespondents' records
and adding the nonrespondents' weights to those of the donor respondents (Kalton 1986).
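The equivalence can be verified numerically for a mean. The sketch assumes that every sampled unit starts with a weight of 1 and that donors are drawn at random with replacement; the values are hypothetical.

```python
import random
from collections import Counter

rng = random.Random(7)
respondent_values = [12.0, 15.0, 9.0, 20.0, 11.0]  # hypothetical responses
n_nonrespondents = 3

# Imputation: each nonrespondent receives the value of a randomly chosen donor.
donor_index = [rng.randrange(len(respondent_values)) for _ in range(n_nonrespondents)]
completed = respondent_values + [respondent_values[i] for i in donor_index]
mean_imputed = sum(completed) / len(completed)

# Weighting: drop the nonrespondents and add their weights (1 each) to their donors.
times_used = Counter(donor_index)
weights = [1 + times_used[i] for i in range(len(respondent_values))]
mean_weighted = (sum(w * v for w, v in zip(weights, respondent_values))
                 / sum(weights))

print(mean_imputed, mean_weighted)
```

The two means agree because a record duplicated k times contributes exactly as a single record carrying weight 1 + k.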
The differences between weighting and imputation emerge when one considers the
multivariate nature of survey data. It is possible to impute for the responses of a total
nonrespondent by taking all the responses from a single donor; however, weighting is generally
simpler in this case and it avoids the loss of precision arising from the sampling of respondents
to serve as donors. It is not practicable to use weighting to handle item nonresponse since
it would result in different sets of weights for each item; this would cause serious difficulties
for crosstabulations and other analyses of the relationships between variables.
Weighting is a single global adjustment that attempts to compensate for the missing
responses to all the items simultaneously. Imputation, on the other hand, is item-specific.
This difference has consequences for the way that the auxiliary data are used. In forming
weighting classes, the focus is on determining classes that differ in their response rates. The
choice of auxiliary variables to use in imputation, however, is primarily made in terms of
their abilities to predict the missing responses.
An assumption underlying all the procedures reviewed in this paper is that once the aux-
iliary variables have been taken into account the missing values are missing at random. Thus,
for instance, the nonrespondents are assumed to be like the respondents within weighting
and imputation classes. This assumption can be avoided by using stochastic censoring models,
as has been done by Greenlees et al. (1982) in imputing wages and salaries in the Current
Population Survey. However, as Little (1986b) observes, these models are highly sensitive
to the distributional assumptions made.
An alternative approach for handling missing survey data is to leave the values missing
in the data set and let the analyst incorporate appropriate missing data models into the analysis
(Little 1982). This approach has much to commend it, but the labor and computing time
needed to implement it effectively preclude its use as a general purpose strategy. Rather,
the approach seems best suited for a small range of special analyses. In order to permit the
analyst to adopt this approach, it is essential that all imputed values be flagged to indicate
they are not actual responses, so that they can then be dropped from the analysis.
Finally, we should note that all methods of handling missing survey data must depend
upon untestable assumptions. If the assumptions are seriously in error, the analyses may
give misleading conclusions. The only secure safeguard against serious nonresponse bias in
survey estimates is to keep the amount of missing data small.
REFERENCES
BAILAR III, J.C., and BAILAR, B.A. (1978). Comparison of two procedures for imputing missing
survey values. Proceedings of the Section on Survey Research Methods, American Statistical
Association, 462-467.
BAILAR, B.A., and BAILAR III, J.C. (1983). Comparison of the biases of the hot-deck imputation
procedure with an "equal-weights" imputation procedure. In Incomplete Data in Sample Surveys,
Volume 3, Proceedings of the Symposium, (Eds. W.G. Madow and I. Olkin), New York: Academic
Press, 299-311.
BAILAR, B.A., BAILEY, L., and CORBY, C.A. (1978). A comparison of some adjustment and
weighting procedures for survey data. In Survey Sampling and Measurement, (Ed. N.K. Namboodiri),
New York: Academic Press, 175-198.
BARTHOLOMEW, D.J. (1961). A method of allowing for 'not at home' bias in sample surveys.
Applied Statistics, 10, 52-59.
BISHOP, Y.M.M., FIENBERG, S.E., and HOLLAND, P.W. (1975). Discrete Multivariate Analysis.
Cambridge, Mass.: The MIT Press.
BROOKS, C.A., and BAILAR, B.A. (1978). An Error Profile: Employment as Measured by the
Current Population Survey. Statistical Policy Working Paper 3. U.S. Department of Commerce.
Washington, D.C.: U.S. Government Printing Office.
CHAPMAN, D.W., BAILEY, L., and KASPRZYK, D. (1986). Nonresponse adjustment procedures
at the U.S. Census Bureau. Survey Methodology, forthcoming.
CODER, J. (1978). Income data collection and processing from the March Income Supplement to the
Current Population Survey. The Survey of Income and Program Participation Proceedings of the
Workshop on Data Processing, February 23-24, 1978, (Ed. D. Kasprzyk), Chapter II. Washington,
D.C.: U.S. Department of Health, Education and Welfare.
COLLEDGE, M.J., JOHNSON, J.H., PARE, R., and SANDE, I.G. (1978). Large scale imputation
of survey data. Proceedings of the Section on Survey Research Methods, American Statistical
Association, 431-436.
COX, B.G., and COHEN, S.B. (1985). Methodological Issues for Health Care Surveys. New York:
Marcel Dekker.
DAVID, M., LITTLE, R.J.A., SAMUHEL, M.E., and TRIEST, R.K. (1986). Alternative methods
for CPS income imputation. Journal of the American Statistical Association, 81, 29-41.
DREW, J.H., and FULLER, W.A. (1980). Modelling nonresponse in surveys with callbacks.
Proceedings of the Section on Survey Research Methods, American Statistical Association, 639-642.
DREW, J.H., and FULLER, W.A. (1981). Nonresponse in complex multiphase surveys. Proceedings
of the Section on Survey Research Methods, American Statistical Association, 623-628.
FORD, B.L. (1983). An overview of hot-deck procedures. In Incomplete Data in Sample Surveys, Volume
2, Theory and Bibliographies, (Eds. W.G. Madow, I. Olkin and D.B. Rubin), New York: Academic
Press, 185-207.
GREENLEES, W.S., REECE, J.S., and ZIESCHANG, K.D. (1982). Imputation of missing values
when the probability of response depends on the variable being imputed. Journal of the American
Statistical Association, 77, 251-261.
HERZOG, T.N., and RUBIN, D.B. (1983). Using multiple imputation to handle nonresponse in
sample surveys. In Incomplete Data in Sample Surveys, Volume 2, Theory and Bibliographies, (Eds.
W.G. Madow, I. Olkin and D.B. Rubin), New York: Academic Press, 209-245.
KALTON, G. (1983). Compensating for Missing Survey Data. Ann Arbor: Survey Research Center,
University of Michigan.
KALTON, G. (1986). Handling wave nonresponse in panel surveys. Journal of Official Statistics, 2,
forthcoming.
KALTON, G., and KASPRZYK, D. (1982). Imputing for missing survey responses. Proceedings of
the Section on Survey Research Methods, American Statistical Association, 22-31.
KALTON, G., and KISH, L. (1984). Some efficient random imputation methods. Communications
in Statistics - Theory and Methods, 13(16), 1919-1939.
KISH, L. (1965). Survey Sampling. New York: Wiley.
KISH, L. (1976). Optima and proxima in linear sample designs. Journal of the Royal Statistical
Society, Ser. A, 139, 80-95.
LITTLE, R.J.A. (1982). Models for nonresponse in sample surveys. Journal of the American Statistical
Association, 77, 237-250.
LITTLE, R.J.A. (1986a). Survey nonresponse adjustments for estimates of means. International
Statistical Review, 54, 139-157.
LITTLE, R.J.A. (1986b). Missing data in Census Bureau surveys. Proceedings of the Second Annual
Census Bureau Research Conference, 442-454.
LITTLE, R.J.A., and DAVID, M.H. (1983). Weighting adjustments for non-response in panel surveys.
Working Paper, Washington, D.C.: U.S. Bureau of the Census.
OH, H.L., and SCHEUREN, F. (1978a). Multivariate raking ratio estimation in the 1973 Exact Match
Study. Proceedings of the Section on Survey Research Methods, American Statistical Association,
716-722.
OH, H.L., and SCHEUREN, F. (1978b). Some unresolved application issues in raking ratio
estimation. Proceedings of the Section on Survey Research Methods, American Statistical Association,
723-728.
OH, H.L., and SCHEUREN, F. (1980). Estimating the variance impact of missing CPS income data.
Proceedings of the Section on Survey Research Methods, American Statistical Association, 408-415.
OH, H.L., and SCHEUREN, F. (1983). Weighting adjustment for unit nonresponse. In Incomplete
Data in Sample Surveys, Volume 2, Theory and Bibliographies, (Eds. W.G. Madow, I. Olkin and
D.B. Rubin), New York: Academic Press, 143-184.
OH, H.L., SCHEUREN, F., and NISSELSON, H. (1980). Differential bias impacts of alternative
Census Bureau hot deck procedures for imputing missing CPS income data. Proceedings of the
Section on Survey Research Methods, American Statistical Association, 416-420.
PALMER, S. (1967). On the character and influence of nonresponse in the Current Population Survey.
Proceedings of the Social Statistics Section, American Statistical Association, 73-80.
PALMER, S., and JONES, C. (1966). A look at alternate imputation procedures for CPS
noninterviews. Washington, D.C.: U.S. Bureau of the Census memorandum.
POLITZ, A., and SIMMONS, W. (1949). I. An attempt to get the 'not at homes' into the sample
without callbacks. II. Further theoretical considerations regarding the plan for eliminating callbacks.
Journal of the American Statistical Association, 44, 9-31.
POLITZ, A., and SIMMONS, W. (1950). Note on an attempt to get the 'not at homes' into the
sample without callbacks. Journal of the American Statistical Association, 45, 136-137.
RUBIN, D.B. (1978). Multiple imputations in sample surveys: a phenomenological Bayesian approach
to nonresponse. Proceedings of the Section on Survey Research Methods, American Statistical
Association, 20-34.
RUBIN, D.B. (1979). Illustrating the use of multiple imputations to handle nonresponse in sample
surveys. Bulletin of the International Statistical Institute, 48(2), 517-532.
RUBIN, D.B., and SCHENKER, N. (1986). Multiple imputation for interval estimation from simple
random samples with ignorable nonresponse. Journal of the American Statistical Association, 81,
366-374.
SANDE, G. (1979). Numerical edit and imputation. Paper presented to the International Association
for Statistical Computing, 42nd Session of the International Statistical Institute.
SANDE, I.G. (1983). Hot-deck imputation procedures. In Incomplete Data in Sample Surveys, Volume
3, Proceedings of the Symposium, (Eds. W.G. Madow and I. Olkin), New York: Academic Press,
339-349.
SANTOS, R.L. (1981). Effects of imputation on regression coefficients. Proceedings of the Section
on Survey Research Methods, American Statistical Association, 140-145.
THOMSEN, I. (1973). A note on the efficiency of weighting subclass means to reduce the effects of
nonresponse when analyzing survey data. Statistisk Tidskrift, 4, 278-283.
THOMSEN, I., and SIRING, E. (1983). On the causes and effects of nonresponse: Norwegian
experiences. In Incomplete Data in Sample Surveys, Volume 3, Proceedings of the Symposium, (Eds.
W.G. Madow and I. Olkin), New York: Academic Press, 25-29.
VACEK, P.M., and ASHIKAGA, T. (1980). An examination of the nearest neighbor rule for imputing
missing values. Proceedings of the Statistical Computing Section, American Statistical Association,
326-331.
WELNIAK, E.J., and CODER, J.F. (1980). A measure of the bias in the March CPS earnings
imputation system. Proceedings of the Section on Survey Research Methods, American Statistical
Association, 421-425.
-------
NONRESPONSE ADJUSTMENT METHODS
FOR DEMOGRAPHIC SURVEYS AT THE U.S. BUREAU OF THE CENSUS
By: Rajendra P. Singh and
Rita 0. Petroni
Bureau of the Census, SMD
Washington, D.C.
ABSTRACT
All surveys are subject to missing data. The missing data in a
survey could be due either to noncoverage or to nonresponse. The Census
Bureau uses various approaches to adjust for the missing data. In this
paper, we will briefly discuss the various types of nonresponse in
demographic surveys and their possible effects on survey estimates. The
approach used at the Bureau to adjust for various types of nonresponse for
cross-sectional and longitudinal estimates will be discussed. Emphasis
will be placed on the weighting adjustments which utilize ratio estimators.
Discussion of the criteria to form ratio estimation cells will also be
presented in the paper. As an example, the adjustment techniques of the
Survey of Income and Program Participation will be discussed.
-------
INTRODUCTION
A sound sampling plan for a survey includes extensive effort to
obtain usable data for each unit selected into the sample. Resources are
allocated to develop a good sampling frame, design a good questionnaire,
provide good interviewer training, and establish other data collection
procedures, such as ways to gain the cooperation of respondents. However, in spite of such efforts,
all surveys encounter missing data which could occur either due to
noncoverage or nonresponse. In this paper, we will discuss missing data
due to nonresponse and methods to adjust for it. Nonresponse occurs when some or
all responses to the questions on a questionnaire are not obtained. This
may be due to the respondent's inability or unwillingness to answer.
Researchers have been striving to reduce nonresponse. For example,
they have designed questionnaires more carefully and tested them
thoroughly for complete and accurate answers before fielding the survey,
provided respondents with aids for keeping better records, given respondents gifts
(cash or in kind) to gain their cooperation, and sought ways to improve the
training given to the data collection staff. Researchers are also heavily
involved in improving the methods to account for missing data. Two
approaches commonly used are imputation and weighting adjustment.
In imputation, missing information is replaced with usable data from
other sources. Regression imputation (Kalton and Kasprzyk, 1982) and
cold-deck and hot-deck methods have been used by the U.S. Bureau of the
Census. The demographic surveys primarily use the cold-deck and hot-deck
procedures. The cold-deck procedure uses values from some prior
distribution (the same survey or another source), while the hot-deck procedure uses current
responses from the same source (survey) to substitute for missing values.
Imputation is carried out by cross-classifying survey units into categories
(cells) by a few variables in an attempt to group responses that are
relatively homogeneous within the cells and heterogeneous between cells.
Within a cell, values obtained for survey units are inserted as responses
for missing items. To accomplish this, there must be at least one response
available in each category to be a donor for imputation.
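The cell-based hot-deck idea described above can be sketched as follows. This is a minimal illustration, not the Bureau's production procedure: the function name `hot_deck_impute`, the record layout, and the choice of a random donor within the cell are all assumptions made for the example.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, cell_vars, item, rng=None):
    """Fill missing values of `item` (stored as None) with the value reported
    by a randomly chosen responding record (donor) in the same cell.

    `cell_vars` lists the cross-classification variables that define a cell.
    All names here are hypothetical and chosen only for illustration.
    """
    rng = rng or random.Random(0)
    donors = defaultdict(list)
    for rec in records:
        if rec[item] is not None:
            donors[tuple(rec[v] for v in cell_vars)].append(rec[item])

    completed = []
    for rec in records:
        new = dict(rec)
        if new[item] is None:
            cell = tuple(new[v] for v in cell_vars)
            if not donors[cell]:
                # At least one response per cell is required to act as donor.
                raise ValueError(f"no donor available in cell {cell}")
            new[item] = rng.choice(donors[cell])
        completed.append(new)
    return completed


records = [
    {"tenure": "owner", "size": 2, "income": 3000},
    {"tenure": "owner", "size": 2, "income": None},
    {"tenure": "renter", "size": 1, "income": 1200},
    {"tenure": "renter", "size": 1, "income": None},
]
completed = hot_deck_impute(records, ["tenure", "size"], "income")
```

In this toy data set each cell has exactly one donor, so the imputed values are determined by the cell alone; with several donors per cell the draw is random.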
Imputation is commonly used for partial response, that is, when a
questionnaire is partially answered. It has also been used to compensate
for complete nonresponse. One such example is the 1960 U.S. Census
(Pritzker et al., 1965) adjustment for missing data. In this adjustment, a
nonresponding household was imputed by a responding household (donor) in
the same cross-category. This approach of imputing a complete
questionnaire amounts to doubling the weight of those respondents whose
records are duplicated. Such a procedure can increase the variance as
compared to weighting adjustment. Hansen, Hurwitz and Madow (1953) show
that the maximum increase in variance is about 12 percent for the method of
duplicating records. If a donor is used more than once, the variance
increase could be even larger.
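A figure of this size can be reproduced under simple assumptions, which are ours and not necessarily those of Hansen, Hurwitz and Madow: simple random sampling, equal base weights, a common variance for all units, and each nonrespondent replaced by a distinct duplicated respondent. With n respondents and k = t·n nonrespondents, duplication gives k records a weight of 2 and n - k records a weight of 1, so the variance of the weighted mean is proportional to (n + 3k)/(n + k)^2, whereas a weighting adjustment gives a variance proportional to 1/n. The ratio is (1 + 3t)/(1 + t)^2, which the sketch below evaluates.

```python
def variance_ratio(t):
    """Variance under record duplication relative to weighting adjustment.

    t is the number of nonrespondents divided by the number of respondents
    (0 <= t <= 1), assuming each nonrespondent is replaced by duplicating a
    distinct respondent. The formula is a sketch under the assumptions stated
    in the accompanying text.
    """
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1]")
    return (1 + 3 * t) / (1 + t) ** 2


# Locate the worst case on a fine grid of t values.
best_t = max((k / 1000 for k in range(1001)), key=variance_ratio)
worst_ratio = variance_ratio(1 / 3)
```

Under these assumptions the ratio peaks at t = 1/3 (one nonrespondent for every three respondents), where it equals 1.125, i.e., an increase of about 12 percent.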
Weight adjustment within cells (Oh and Scheuren, 1983) to compensate
for complete nonresponse (unit nonresponse) is the predominant technique
used in the demographic surveys of the Bureau of the Census. The general
approach is basically the same for all its major surveys. It is simple and
less expensive to implement, as compared to imputation, and seems to work
well (Jones, 1984) for some labor force characteristics in the Current
Population Survey (CPS) such as number of persons in the labor force,
employed and unemployed. These estimates were not seriously affected by
noninterview bias. The only labor force categories with substantial bias
were those which included vacationers and persons on layoff.
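As an illustration of weight adjustment within cells, the following sketch inflates the weights of responding units so that, in each cell, they also represent the nonresponding units. The function name, field names, and toy data are hypothetical; the adjustment factor used here (sum of weights of all eligible sample units divided by the sum of weights of respondents) is the usual weighting-class form and is assumed rather than quoted from a Bureau specification.

```python
from collections import defaultdict


def nonresponse_adjust(units, cell_key="cell"):
    """Return (factors, adjusted_respondents).

    units: list of dicts with keys 'weight', 'responded' and `cell_key`.
    factors[c] = (sum of weights of all units in cell c)
                 / (sum of weights of responding units in cell c).
    """
    total = defaultdict(float)
    resp = defaultdict(float)
    for u in units:
        total[u[cell_key]] += u["weight"]
        if u["responded"]:
            resp[u[cell_key]] += u["weight"]

    factors = {}
    for c in total:
        if resp[c] == 0:
            raise ValueError(f"cell {c!r} has no respondents; collapse it first")
        factors[c] = total[c] / resp[c]

    adjusted = [dict(u, weight=u["weight"] * factors[u[cell_key]])
                for u in units if u["responded"]]
    return factors, adjusted


units = (
    [{"cell": "A", "weight": 100.0, "responded": True}] * 3
    + [{"cell": "A", "weight": 100.0, "responded": False}]
    + [{"cell": "B", "weight": 150.0, "responded": True}] * 2
)
factors, adjusted = nonresponse_adjust(units)
```

After adjustment the respondents' weights sum to the total weight of the original sample, so estimated totals are preserved within each cell.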
In this paper, we will primarily discuss nonresponse weighting
adjustment for demographic surveys used at the Bureau of the Census.
Sections II and III discuss various types of nonresponse and adjustment
approaches to deal with these different types of nonresponse, respectively.
The effect of nonresponse on survey estimates is discussed in Section IV,
and the criteria to define noninterview cells are presented in Section V.
As an example, the noninterview adjustment methods used for the Survey of
Income and Program Participation (SIPP) are presented in Section VI.
Section VII presents a discussion on noninterview adjustment research.
TYPES OF NONRESPONSE
Nonresponse can be divided into the following categories:
Type A Noninterview: A Type A noninterview occurs when every member
of the household is a noninterview. Also called household
nonresponse, it occurs when no one is home, household members are
temporarily absent (for example, they could be away on vacation),
household members refuse to participate in the survey, or the
household cannot be located.
Type B Noninterview: This type of noninterview occurs when a housing
unit is vacant, occupied by persons with their usual residence
elsewhere, unfit or set to be demolished, under construction
and not ready for occupancy, or converted to temporary business or
storage. It also occurs when a site for a mobile home, trailer or
tent is unoccupied or when a permit has been granted but
construction has not started.
Type C Noninterview: It occurs when a housing unit is demolished, or
house or trailer is moved, converted to permanent business or
storage, or merged or condemned.
Type Z Noninterview: Type Z noninterview occurs when a member of an
interviewed household is not interviewed and a proxy interview is not
obtained. It is also called person nonresponse.
Item Nonresponse: Item Nonresponse occurs when a response to one or
more questions is not provided, though most of the questionnaire is
completed.
ADJUSTMENT FOR VARIOUS TYPES OF NONRESPONSE
Of these five types of noninterview, no adjustment needs to be made
for type B and type C noninterviews. Type C units
are no longer housing units at the original address. Type B units
are either unoccupied or occupied only by households with a usual residence
elsewhere, and such households have
a chance of being in the sample at their usual residence.
Imputation techniques are used to deal with item nonresponse and type
Z nonresponse in most of the demographic surveys at the Bureau of the
Census. Weighting adjustment is used for type A nonresponse and in certain
cases for type Zs. The procedures used for type As and type Zs are similar
and are based on the same general principles.
EFFECT OF NONRESPONSE ON SURVEY ESTIMATES
It is a common belief that respondents have different characteristics
from nonrespondents. This theory is supported by recent studies completed
by Petroni (1987), and Short and McArthur (1986). Thus, nonresponse
introduces bias in survey estimates. We believe that the bias is small
when the nonresponse rate is about 5% or less, but it increases as the
nonresponse rate in a survey increases. Increase in bias with increase in
nonresponse can be shown mathematically as follows:
Let $P_i$ ($i = 1, 2, \ldots, K$) be the proportion and $R_i$ be the response rate of
population members falling in the $i$th group or cell. Thus, the overall response
rate, $R$, is given by:

$$R = \sum_{i=1}^{K} P_i R_i ; \quad 0 < R < 1 \ \text{and} \ 0 < R_i < 1 \ \forall i, \quad \text{where} \ \sum_{i=1}^{K} P_i = 1 .$$

Furthermore, assume that
$\bar{Y}_i$ = Mean of a characteristic of interest of the population
units falling in cell $i$.
$\bar{Y}_{i(\bar{u})}$ = Mean of a characteristic of interest of the population
in the $i$th class which would not respond if selected in a
sample.
$\bar{Y}_{i(u)}$ = Mean of a characteristic of interest of the population
in the $i$th class which would respond if selected in a
sample.
$\bar{y}_{(u)}$ = Sample estimate of $\bar{Y}$.
$r_i$ = Sample estimate of $R_i$.
$w_{ij}^{-1}$ = Selection probability of the $j$th unit in the $i$th cell.
$y_{ij}$ = Value of the characteristic of interest for the $j$th unit
in the $i$th cell.
$n_i$ = Number of sample units in the $i$th cell.
$n_{iu}$ = Number of sample units responding in the $i$th cell.
$p_i$ = Proportion of sample units falling in the $i$th group or
cell.

Then

$$\bar{y}_{(u)} = \frac{\sum_{i=1}^{K} \sum_{j=1}^{n_{iu}} w_{ij} \left[ \frac{n_i}{n_{iu}} \right] y_{ij}}{\sum_{i=1}^{K} \sum_{j=1}^{n_{iu}} w_{ij} \left[ \frac{n_i}{n_{iu}} \right]} \qquad (2.1)$$

The expected value of $\bar{y}_{(u)}$ is

$$E[\bar{y}_{(u)}] = E\left[ E\left( \bar{y}_{(u)} \mid n_1, n_{1u}, n_2, n_{2u}, \ldots, n_K, n_{Ku} \right) \right] \doteq \sum_{i=1}^{K} P_i \bar{Y}_{i(u)} \qquad (2.2)$$

Therefore, the bias of the adjusted estimate is

$$\text{Bias}[\bar{y}_{(u)}] = E[\bar{y}_{(u)}] - \bar{Y} \doteq \sum_{i=1}^{K} P_i (1 - R_i) \left[ \bar{Y}_{i(u)} - \bar{Y}_{i(\bar{u})} \right] \qquad (2.3)$$
Equation (2.3) suggests that the amount of bias depends on the
response rate and the difference in the mean values of the characteristics
for respondents and nonrespondents. With a small response rate, bias
increases even if the difference in the means of respondents and
nonrespondents is small.
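Before discussing the criteria for noninterview (NI) adjustment, let
us consider the following situations:
1. $\bar{Y}_{i(u)} = \bar{Y}_{j(u)}$ and $\bar{Y}_{i(\bar{u})} = \bar{Y}_{j(\bar{u})}$ for $\forall$ i and j, or
2. $R_i = R_j = R$, $\forall$ i and j, or
3. [condition illegible in the source]
Under each of the three situations the bias is the same and is given by

$$\text{Bias}[\bar{y}_{(u)}] = (1 - R) \left[ \bar{Y}_{(u)} - \bar{Y}_{(\bar{u})} \right] \qquad (2.4)$$

and is equivalent to using a single NI adjustment cell.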
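It is obvious from equation (2.3) that the bias in an estimate will
be reduced by using two or more cells if

$$\left| \frac{\bar{Y}_{i(u)} - \bar{Y}_{i(\bar{u})}}{\bar{Y}_{(u)} - \bar{Y}_{(\bar{u})}} \right| < 1 , \ \forall i \qquad (2.5)$$

and

$$R_i \neq R_j , \ \forall i, j. \qquad (2.6)$$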
Therefore, the success of the NI adjustment procedure requires the
identification of the survey variables which will define adjustment cells
such that these cells vary both with respect to survey estimates and
response rates. See Chapman (1976) for further details.
Note that there are other situations where bias could be reduced by
use of more than one NI cell even if the above two conditions are not
satisfied. For example, consider two cells. It is possible that one cell
meets criteria (2.5) and the other does not, yet the population
distribution into the cells and the response rates of the cells are such
that the bias is less using two NI cells instead of one.
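The comparison between one cell and several cells can be made concrete with a small numerical sketch. It assumes, following the definitions above, that the bias with K adjustment cells is the sum over cells of P_i (1 - R_i) times the difference between respondent and nonrespondent means, and that the single-cell bias is (1 - R) times the difference between the overall respondent and nonrespondent means. The function names and the numbers are illustrative only.

```python
def bias_with_cells(P, R, Yu, Ynr):
    """Bias when each cell is adjusted separately.

    P: population proportions, R: response rates, Yu / Ynr: respondent and
    nonrespondent means, all given per cell.
    """
    return sum(p * (1 - r) * (yu - yn) for p, r, yu, yn in zip(P, R, Yu, Ynr))


def bias_single_cell(P, R, Yu, Ynr):
    """Bias when all cells are pooled into one adjustment cell."""
    R_all = sum(p * r for p, r in zip(P, R))
    Yu_all = sum(p * r * yu for p, r, yu in zip(P, R, Yu)) / R_all
    Ynr_all = sum(p * (1 - r) * yn for p, r, yn in zip(P, R, Ynr)) / (1 - R_all)
    return (1 - R_all) * (Yu_all - Ynr_all)


# Two cells that differ in both mean and response rate, with respondents and
# nonrespondents alike within each cell.
P, R = [0.5, 0.5], [0.9, 0.6]
Yu, Ynr = [10.0, 20.0], [10.0, 20.0]
b_cells = bias_with_cells(P, R, Yu, Ynr)    # no bias within cells
b_single = bias_single_cell(P, R, Yu, Ynr)  # pooled respondents under-represent cell 2

# Equal response rates: both approaches give the same bias.
b_cells_eq = bias_with_cells([0.3, 0.7], [0.8, 0.8], [5.0, 9.0], [4.0, 12.0])
b_single_eq = bias_single_cell([0.3, 0.7], [0.8, 0.8], [5.0, 9.0], [4.0, 12.0])
```

In the first case the cell-based adjustment removes the bias while the single cell leaves a bias of -1.0 (respondent mean 14 against a population mean of 15). In the second case, where the response rates are equal, the two biases coincide.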
CRITERIA TO DEFINE NONINTERVIEW ADJUSTMENT CELLS
The objective of noninterview adjustment is to reduce the bias in
survey estimates. A survey produces a large number of estimates, and
adjustments which reduce bias for one set of estimates may not work well
for another set of estimates. Therefore, it is essential to have a clear
understanding of the relative importance of various estimates when
implementing the criteria below to form NI cells. In addition to bias, it
is occasionally necessary to consider reduction of mean square error. This
is the case when the adjustment factor is large and, hence, increases the
variance significantly.
A. Lower Bias
The following four criteria are used in selecting the
cross-classification variables to reduce the bias in survey
estimates.
1. The variables are significantly correlated with the survey
estimates. The implicit assumption in selecting these
variables is that if for respondents these variables show a
significantly high correlation with survey estimates to be
produced, then they will also show high correlation among
nonrespondents. Since these variables must be available for
both respondents and nonrespondents, the choice of the
variables is constrained. These variables are determined prior
to data collection to ensure the necessary data is obtained and
to avoid possible bias due to the particular sample selected.
2. Within each weighting class, $E[\bar{Y}_{i(u)}] = E[\bar{Y}_{i(\bar{u})}]$, $\forall$ i.
3. The means of any two noninterview adjustment cells differ,
i.e., $E[\bar{Y}_{i(u)}] \neq E[\bar{Y}_{j(u)}]$ for $i \neq j$, $\forall$ i and j.
4. The response rates for any two cells differ, that is,
$R_i \neq R_j$, $i \neq j$, $\forall$ i and j.
B. Lower Variance
The variance contribution from a NI cell depends on the number of
responding and nonresponding units in that cell. For small cells the
nonresponse weight adjustment can be large. Therefore, the size of
the cell is an important consideration in defining a cell. One needs
to consider the trade-off between variance and bias in deciding the
size of the cell as bias should be reduced with a homogeneous
(usually a smaller) cell.
Cahoon and Bushery (1984), under a number of simplifying
assumptions, showed that the variance of an estimator for
cells with 25 sample units each is about 0.5% higher, assuming a 5%
nonresponse rate, than for a collapsed cell of 100 units. With a 10%
nonresponse rate it is about 1.0% higher. In deriving these results
they assumed independence between sample units within a cell and
between cells, that cells are of fixed equal size, and that cells have the same
expected response rate, expected value and variability of the
characteristics of interest.
To reduce variance, NI cells are collapsed if the number of
respondents in them is small or the noninterview adjustment factor is
too large.
These limits are somewhat subjective. For most of the demographic
surveys at the Bureau, these limits are: a) minimum interviewed
cases in a cell are 20-35, and b) maximum NI adjustment factor is 2.
If one of these criteria is not satisfied by the cell it needs to be
collapsed with another cell. The following collapsing criteria
attempt to minimize the increase in mean square error of the survey
estimates of interest. A cell i should be collapsed with a cell j
if:
1.
V 1, 1 * j
2.
R. -
v i, j
Usually, these two conditions are not satisfied by the same pair of
cells. In those circumstances, either more emphasis should be placed
on condition 1, or a pair should be found which reduces the mean
square error even if neither of the two conditions is satisfied.
Furthermore, if there is strong evidence that for a cell with a very
high noninterview adjustment factor $E[\bar{Y}_{i(u)}]$ is very different from that of
any other cell, then the cell should be kept separate to minimize the
bias due to nonresponse (Shapiro, 1980). (Since the amounts of bias
and mean square error are unknown, experience is used to make
judgments regarding expected reductions in mean square error and
bias.)
THE SURVEY OF INCOME AND PROGRAM PARTICIPATION
The Survey of Income and Program Participation (SIPP) is a new,
ongoing national household survey administered by the Bureau of the Census.
It is designed to provide improved data on income and participation in
government administered programs such as food stamps, Aid to Families with
Dependent Children (AFDC), Supplemental Security Income (SSI), etc. Data
on demographic characteristics, labor force, education, etc., are also
collected.
The SIPP is a multistage, stratified, systematic sample of the
noninstitutionalized resident population of the United States. This
population includes persons living in group quarters such as dormitories,
rooming houses, and religious group dwellings. Noncitizens of the United
States who work or attend school in this country and their families are
also eligible. Crew members of merchant vessels, Armed Forces personnel
living in military barracks, and institutionalized persons such as
correctional facility inmates and nursing home residents are ineligible.
Initially, a sample of living quarters in selected Primary Sampling Units
(PSUs) is taken. (Living quarters are those in which the occupants do not
live and eat with any other person in the structure and that have either direct
access from the outside of the building or through a common hall, or
complete kitchen facilities for that unit only.) Persons residing in these
living quarters at the time of the first interview are considered to be in
sample. However, only persons who are at least 15 years of age at this
interview are eligible for interview. Limited data on children are also
collected by proxy interviews.
The SIPP sample is divided into four groups of equal size called
rotation groups. One rotation group is interviewed each month. In
general, one cycle of four rotation groups is called a wave. This design
provides a steady workload for data collection and processing. Persons 15
years old and over in the sample are interviewed once every four months for
approximately 2.5 years. With certain restrictions, these sample persons
are followed if they move to a new address. Persons who began living with
sample persons after the first interview are considered to be part of the
sample only while residing with the sample persons. The reference period
for the interview is the four months preceding the interview month. For
example, for the first SIPP sample, the reference period for the November
1983 interview month was July through October 1983. These sample persons
were interviewed again in March 1984 for the November 1983 through February
1984 period. More details on the SIPP design are given in Nelson,
McMillen, and Kasprzyk (1985).
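The relationship between an interview month and its four-month reference period can be written out as a small helper. The function name `reference_months` is hypothetical; the rule it encodes (the reference period is the four months preceding the interview month) is the one stated above.

```python
def reference_months(interview_year, interview_month, length=4):
    """Return the (year, month) pairs of the `length` months that precede the
    interview month, oldest first."""
    months = []
    year, month = interview_year, interview_month
    for _ in range(length):
        month -= 1
        if month == 0:          # step back across a year boundary
            year, month = year - 1, 12
        months.append((year, month))
    return months[::-1]


first_interview = reference_months(1983, 11)   # July through October 1983
second_interview = reference_months(1984, 3)   # November 1983 through February 1984
```

The two calls reproduce the example given in the text for the November 1983 and March 1984 interview months.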
The SIPP questionnaire is long and complex. Questions are asked by
specific type of cash and non-cash income on months received and amounts
per month. For many types of income, additional questions are asked of
recipients. For example, in households with children covered by Medicaid,
up to eight questions about health insurance are asked. Questions are also
asked about assets and labor force status. Topical modules on various
subjects are also included in most interviews.
For the subsequent waves, only original sample persons (those
interviewed in the first wave) and persons living with them are eligible to
be interviewed. With certain restrictions, original sample persons are to
be followed if they moved to a new address. All noninterviewed households
from Wave 1 are designated as noninterviews for all subsequent waves.
Additional noninterviews result when original sample persons move without
leaving a forwarding address or move to extremely remote parts of the
country.
Due to the longitudinal nature (multiple interviews) of the survey,
the noninterview rate accumulates over the life of the panel. Starting at
about 5%-7% at the time of the first interview, it reaches slightly over 20
percent for the last interview of the panel. The following briefly
explains noninterview adjustment methods developed for the SIPP
cross-sectional and longitudinal estimates.
Noninterview Adjustment for Cross-Sectional Estimates
Noninterview adjustments for cross-sectional estimates are made at the
household level. At the time of the first interview very little
information (such as race of the reference person, owner-occupied or
renter-occupied housing unit, size of the household, and the Census region)
is available about the noninterviewed households. Therefore, a limited
number of variables correlated to the SIPP characteristics of interest can
be used to form noninterview cells. For first wave data, noninterview
cells were formed using the following variables. See King (1985) for a
detailed explanation.
a. Census region (Northeast, Midwest, South, West)
b. Residence (metropolitan statistical areas (MSA), not MSA)
c. Race of reference person (black, non-black)
d. Tenure (owner, renter)
e. Household size (1, 2, 3, 4 or more)
The noninterview adjustments for subsequent waves are in addition to
the wave 1 adjustment, i.e., the NI adjustment made as a part of wave 1
weighting becomes an integral part of subsequent waves weighting. In
subsequent waves, additional information obtained on previous wave
respondents is available for use in developing noninterview cells. Using
1980 Decennial Census data, it was found that educational level, race and
origin of householder, household type, and tenure are highly correlated
with the important characteristics (income, poverty, etc.) estimated by the
SIPP. Also, Kalton et al. (1985) showed that the participation of a
household in a given government program during the reference period covered
by interview (K) is highly correlated with its participation in interview
(K-1). For example, the correlations for food stamps and SSI were observed
to be about .9 and .8, respectively. The relationship is also strong
between interviews (K) and (K-2). For example, the correlation for food
stamp participation between interviews (K) and (K-2) is .8. These
correlations were obtained from the data collected in the Income Survey
Development Program (ISDP), a predecessor of the SIPP.
Based on the above knowledge and experience of the Bureau staff, the
following household level variables were chosen to construct noninterview
adjustment cells for second and subsequent waves. A detailed description
of these cells is presented in King (1986).
a. Race and Spanish origin of reference person (non-Spanish white,
other).
b. Household type (female householder with own children under 16
years of age but no husband present, householder is 65 years of age
or older, others).
c. Education level of reference person (less than 8 years, 8-11
years, 12-15 years, and 16 or more years).
d. Type of income (welfare, etc., others).
e. Assets (bonds, etc., others).
f. Tenure (owner, renter).
g. Public housing or rent subsidized (resident of public housing
or recipient of government rent subsidies, others).
h. Household size (1, 2, 3, 4 or more)
Cells which do not meet the following conditions are collapsed in a
predetermined manner.
1. Number of interviewed households in a cell is greater than or
equal to 30.
2. Noninterview adjustment factor is less than or equal to 2.
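The two conditions above can be checked mechanically. The sketch below collapses any cell that fails either condition with its neighbor in a fixed list; the "merge with the next cell in a predetermined order" rule and all names are assumptions made for illustration, since the actual SIPP collapsing order is not given here. The adjustment factor is taken to be the number of sample households in the cell divided by the number interviewed.

```python
def passes(n_sample, n_interviewed, min_interviews=30, max_factor=2.0):
    """True if a cell meets both conditions (enough interviews, factor <= max)."""
    return (n_interviewed >= min_interviews
            and n_sample / n_interviewed <= max_factor)


def collapse_cells(cells, order, min_interviews=30, max_factor=2.0):
    """Collapse failing cells with a neighbor in the predetermined `order`.

    cells: dict name -> (n_sample, n_interviewed).
    Returns a list of (names, n_sample, n_interviewed) for the final cells.
    """
    groups = [([name], cells[name][0], cells[name][1]) for name in order]
    while len(groups) > 1:
        bad = next((i for i, (_, n, r) in enumerate(groups)
                    if r == 0 or not passes(n, r, min_interviews, max_factor)),
                   None)
        if bad is None:
            break
        # Merge with the next cell, or with the previous one if it is last.
        other = bad + 1 if bad + 1 < len(groups) else bad - 1
        lo, hi = min(bad, other), max(bad, other)
        merged = (groups[lo][0] + groups[hi][0],
                  groups[lo][1] + groups[hi][1],
                  groups[lo][2] + groups[hi][2])
        groups[lo:hi + 1] = [merged]
    return groups


cells = {"A": (50, 45), "B": (20, 12), "C": (40, 18), "D": (80, 70)}
final_cells = collapse_cells(cells, ["A", "B", "C", "D"])
```

Here cell B has only 12 interviews, so it is merged with C; the combined cell has 30 interviews and an adjustment factor of exactly 2, which satisfies both conditions.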
Noninterview Adjustment for Longitudinal Estimates
At present, longitudinal weighting procedures are developed only for
the estimates of persons. Two levels of noninterview adjustment are used
in these procedures. The first is at the household level and is similar to
the wave 1 adjustments for the cross-sectional estimate. It accounts for
persons who could not be interviewed at the first wave of the reference
period covered by the interval for which the longitudinal weights are
developed. The second adjustment is made at the person level to account
for those persons who could not be interviewed for at least one of the
later waves covering the reference period of interest. An alternative to
the weighting adjustment is imputation of the complete record for NI
persons. (This is similar to imputation of type Zs in cross-sectional
weighting.) However, this approach may have a significant adverse effect
(increase bias) on estimates of gross flows, one of the most important
longitudinal estimates. See Kalton (1986) and Singh et al. (1988).
The following variables were selected for use in the second level
longitudinal NI adjustment procedures in the same way as for the
cross-sectional adjustments, and are based on the first interview covering
the time interval for which the longitudinal weighting is developed. Note
that certain person level variables are defined based on the household
level variables. For example, if at least one member of a household (HH)
received income from food stamps, the household is defined as having income
from food stamps and each member of the household is considered a food
stamp recipient. See Huggins (1988) for more information.
a. Average monthly HH income (<$1,200, $1,200-$3,999, > $4,000)
b. Employment status (self-employed, others)
c. Type of income (welfare, etc., unemployment compensation, others)
d. Assets (bonds, others)
e. Education level (< 12 years, 12-15 years, 16 or more years)
f. Race and origin (white and not Spanish, others)
g. Labor force status (in labor force, not in labor force)
The cells formed using the above variables are collapsed before
making noninterview adjustments if the number of interviewed persons in a
given cell is less than 30 or the noninterview adjustment
factor is greater than 2.0.
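The rule noted earlier for deriving a person level variable from a household level one (every member of a household counts as a food stamp recipient if at least one member received food stamp income) can be sketched as follows. The field names `hh_id`, `food_stamp_income` and `hh_food_stamps` are hypothetical.

```python
def flag_food_stamp_recipients(persons):
    """Return copies of the person records with a household-derived flag.

    persons: list of dicts with 'hh_id' and a boolean 'food_stamp_income'.
    'hh_food_stamps' is True for every member of a household in which at
    least one member reported food stamp income.
    """
    households_with_income = {p["hh_id"] for p in persons
                              if p["food_stamp_income"]}
    return [dict(p, hh_food_stamps=p["hh_id"] in households_with_income)
            for p in persons]


persons = [
    {"person": 1, "hh_id": "H1", "food_stamp_income": True},
    {"person": 2, "hh_id": "H1", "food_stamp_income": False},
    {"person": 3, "hh_id": "H2", "food_stamp_income": False},
]
flagged = flag_food_stamp_recipients(persons)
```

Person 2 reported no food stamp income but is classified as a recipient because person 1, in the same household, did.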
Noninterview Adjustment Research
To our knowledge, no study has been conducted to evaluate the
effectiveness of noninterview adjustment methods for the demographic
surveys. Therefore, the effectiveness of these procedures to reduce bias
in estimates is unknown. A study (Singh, 1987) to evaluate the SIPP
noninterview adjustment methods for cross-sectional estimates is underway.
The results from this study should be available later this year. Even then
no general statement can be made, since the SIPP provides a large number of
estimates. Some indirect evaluation of these procedures could be done.
For example, the SIPP estimates from wave 1 and from a later wave (say wave
4) for a given characteristic could be compared against corresponding
estimates from an independent source, especially administrative records.
However, the validity of such an evaluation will be questionable.
The Bureau of the Census has conducted noninterview adjustment
related research for its demographic surveys. Some of the research was
performed for the American Housing Survey (AHS-National). Parmer (1986)
examined correlations between variables of interest, between variables of
interest and evaluation variables, and the nonresponse rates for the
selected variables of interest. He also examined stability of the
variables considered to define noninterview adjustment cells. Research is
also being conducted on improving noninterview adjustment for the SIPP
(Petroni, 1988). Similar research may also prove useful for other
demographic surveys.
Some research to examine the feasibility and merits of computing
nonresponse adjustment factors as well as constructing weighting cells is
being conducted by Rosenbaum and Rubin (1983) and Little and Samuhel
(1983). Research is also needed in developing models which may be used to
estimate response probabilities for units. This could be done for several
demographic surveys with similar values of independent variables.
Acknowledgments
The authors wish to express their appreciation for valuable technical
comments provided by Leroy Bailey, David Chapman, Lawrence Altmayer, and
Lloyd Hicks to improve the quality of this paper. Special thanks also go
to Kimberly Wilburn for typing the paper. Without her persistence and
willing attitude, this paper could not have been completed.
The work described in this paper was not funded by the U.S.
Environmental Protection Agency and therefore the contents do not
necessarily reflect the views of the Agency, and no official endorsement
should be inferred.
-------
REFERENCES
Cahoon, L., and J. Bushery. (1984). "Effect of Noninterview Cell Size on
the Variance of Estimates," Internal Census Bureau memorandum for
documentation, November 27, 1984.
Chapman, D.W. (1976). "A Survey of Nonresponse Imputation Procedures,"
Proceedings of the Social Statistics Section, Part 1, American
Statistical Association, 245-251.
Hansen, M.H., W.N. Hurwitz, and W.G. Madow. (1953). Sample Survey Methods
and Theory, Vol. I, New York: John Wiley and Sons, Inc.
Jones, C. (1984). "Imputation Based on Subsets of Interviewed Cases,"
Internal Census Bureau memorandum from Jones to Butz, December 29,
1984.
Kalton, G. (1986). "Handling Wave Nonresponse in Panel Surveys," Journal
of Official Statistics, 2, 303-314.
Kalton, G., and D. Kasprzyk. (1982). "Imputing for Missing Survey
Responses," Proceedings of the Survey Research Methods Section,
American Statistical Association, pp. 22-31.
Kalton, G., J. Lepkowski, and T. Lin. (1985). "Compensating for Wave
Nonresponse in the 1979 ISDP Research Panel," Proceedings of the
Survey Research Methods Section, American Statistical Association,
372-377.
King, K. (1985). "SIPP 85: Cross-sectional Weighting Specifications for
Wave I—Revision," Internal Census Bureau memorandum from Jones to
Walsh, November 21, 1985.
King, K. (1986). "SIPP: Cross-sectional Weighting Specifications for the
Second and Subsequent Waves," Internal Census Bureau memorandum from
Jones to Walsh, June 19, 1986.
Little, R.J.A., and M.E. Samuhel. (1983). "Imputation Models on the
Propensity to Respond," Proceedings of the Section on Survey Research
Methods, American Statistical Association, 415-420.
Nelson, D., D. McMillen, and D. Kasprzyk. (1985). "An Overview of the
Survey of Income and Program Participation: Update 1," SIPP Working
Paper Series no. 8401, U.S. Bureau of the Census.
Oh, H.L., and F.J. Scheuren. (1983). "Weighting Adjustment for Unit
Nonresponse," Incomplete Data in Sample Surveys, Vol. 2, New York:
Academic Press, 143-184.
Parmer, R.J. (1986). "Documentation of the AHS-National Noninterview
Adjustment Research for 1985," Internal Census Bureau memorandum for
documentation, April 16, 1986.
Petroni, R. (1987). "SIPP 84: Characteristics of Initially Interviewed
Persons by Response Status," Internal Census Bureau memorandum from
Nonresponse Workgroup for the Record, September 3, 1987.
Petroni, R. (1988). "Evaluation of Mover Characteristics and
Nonresponse," Internal Census Bureau memorandum from Petroni to
Singh, April 6, 1988.
Pritzker, L., J. Ogus, and M.H. Hansen. (1965). "Computer Editing Methods
- Some Applications and Results," Bulletin of the International
Statistical Institute, Proceedings of the 35th Session Belgrade 41,
1965.
Rosenbaum, P., and D. Rubin. (1983). "The Central Role of the Propensity
Score in Observational Studies for Causal Effects," Biometrika 70,
41-55.
Shapiro, G. (1980). "A General Approach to Noninterview Adjustment,"
Internal Census Bureau memorandum from Shapiro to Programs Area
Branch Chiefs, March 11, 1980.
Short, K., and E. McArthur. (1986). "Life Event and Sample Attrition in
the Survey of Income and Program Participation," Proceedings of the
Section on Survey Research Methods, American Statistical Association,
200-205.
Singh, R., L. Weidman, and G. Shapiro. (1988). "Quality of the SIPP
Estimates," presented at the SIPP Conference on Individuals and
Families in Transition: Understanding Change through Longitudinal
Data, Annapolis, Maryland, March 16-18, 1988.
-------
ON THE ROBUSTNESS OF THE MAXIMUM LIKELIHOOD ESTIMATOR
IN THE PRESENCE OF NONRESPONSE IN COMPOSITIONAL DATA
by: Chao L. Chen
Environmental Research Center
University of Nevada, Las Vegas
Las Vegas, NV 89154
ABSTRACT
Human activity pattern data can be treated as compositional data.
Three statistical models are proposed for compositional data, and we
adopt the logistic normal approach.
Once the logistic normal approach is chosen, the problem of missing
values in the compositional data can be transformed into a
missing value problem in the multivariate normal case; hence the techniques
for treating missing values in the multivariate normal case can be applied.
The results show that certain techniques for missing values, though
useful if the square loss function is used as a criterion for judging
estimators, can lack robustness. The robustness depends on both
nonresponse rate and the nonresponse mechanism.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
-------
INTRODUCTION
To get a better estimate of the total exposure of a specific human
population (e.g., those living in a metropolitan area), one must know: (1)
the chemical attributes of the air in several well-defined
"microenvironments" (e.g., kitchen, parking lot, bathroom), and (2) the
proportion of a given time period each person spends in each
microenvironment and the activities involved in that specific space and time.
In short, we call the latter a "human activity pattern". The current study
focuses on some statistical considerations of part (2).
Human activity pattern data can be collected in many ways. We
consider a sample from the target population, in which each selected person
is asked to record his or her daily activities for a given time period.
Let the microenvironment-activity (m-a) combination be indexed by j. The
proportion of time spent in the j-th m-a for the i-th person, $p_{ij}$, can be
computed from the activity log if the record is complete. Our purpose is
to estimate the mean and covariance of $p_{ij}$ in the presence of item
nonresponse (incomplete log). For simplicity, we will consider simple
random sampling only; a more complex design may be necessary in the real
applications.
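As an illustration of how the proportions for one person might be computed from a complete activity log, consider the sketch below. The log format (a list of m-a labels with minutes spent) and the function name are assumptions made for the example; an incomplete log would leave some of the period unaccounted for and is not handled here.

```python
def time_proportions(log, categories):
    """Proportion of the recorded period spent in each m-a combination.

    log: list of (category, minutes) entries for one person, assumed to cover
    the whole period with no gaps.
    categories: all m-a combinations, including those with zero time.
    """
    total = sum(minutes for _, minutes in log)
    if total <= 0:
        raise ValueError("the log must cover a positive amount of time")
    spent = {c: 0.0 for c in categories}
    for category, minutes in log:
        spent[category] += minutes
    return {c: spent[c] / total for c in categories}


log = [("kitchen", 120), ("office", 480), ("car", 60),
       ("bedroom", 540), ("kitchen", 60), ("office", 180)]
p = time_proportions(log, ["kitchen", "office", "car", "bedroom"])
```

The resulting proportions are nonnegative and sum to one, which is what makes the data compositional.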
In Section 2, some statistical models for compositional data are
reviewed, and Aitchison and Shen's (1980) logistic normal model is adopted. Section 3
explains how incomplete logs can be connected to missing values.
By a theorem of Aitchison (1986), we transform a missing value problem in
the compositional data into a missing value problem in the multivariate
normal case; hence, the standard technique for missing values described
in Section 4 can be applied indirectly to the compositional data. Under
some assumptions, the estimator described in Section 4 is the maximum
likelihood estimator (MLE). Hence, it is asymptotically optimal, but it is
not without disadvantages. As illustrated in Section 5 by a simple
numerical example, this estimator suffers from a lack of robustness under
certain nonresponse mechanisms and nonresponse rates.
STATISTICAL CONSIDERATIONS
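If there are d+1 m-a combinations then

$$\sum_{j=1}^{d+1} p_{ij} = 1,$$

i.e., the sample space is the d-dimensional simplex:

$$\left\{ (p_{i1}, \ldots, p_{id}) : \ p_{ij} > 0 \ (j = 1, 2, \ldots, d), \ \sum_{j=1}^{d} p_{ij} < 1 \right\}.$$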
For simplicity, we drop the subscript i in the following discussion. There
are at least three ways to define probability density functions on a
simplex.
(1) The Dirichlet distribution; see Wilks (1962, pp. 177-182) for
the related properties.
(2) Through the square root transformation $s_j = p_j^{1/2}$, j = 1, 2, ..., d+1,
a point $p = (p_1, p_2, \ldots, p_d)^T$ in the d-dimensional simplex can be mapped into
a point on the (d+1)-dimensional hypersphere of unit radius. Stephens
(1982) contended that the von Mises distribution can be applied to fit
points on the hypersphere. In particular, his study includes a two-way
analysis of variance of the human activity pattern data collected from
university students.
(3) Through the transformation $v_j = \log(p_j / p_{d+1})$, j = 1, 2, ..., d, one
obtains the vector $v = (v_1, v_2, \ldots, v_d)^T$ which has a d-dimensional
multivariate distribution with mean $\mu$ and covariance matrix $\Sigma$ as suggested
by Aitchison and Shen (1980). They assumed v has a multivariate normal
distribution which implies that p has a logistic normal distribution with
mean $\mu$ and covariance matrix $\Sigma$; in short, we say p is $L_d(\mu, \Sigma)$. Note that
this transformation is a multivariate generalization of the logistic
transformation.
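The log-ratio transformation in (3) and its inverse can be sketched in a few lines. This is only an illustration: the function names `alr` and `alr_inverse` and the four-category example day are assumptions made here, not part of the original paper.

```python
import math

def alr(p):
    """Log-ratio transform v_j = log(p_j / p_{d+1}), j = 1..d (last part is the reference)."""
    if any(x <= 0 for x in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be strictly positive and sum to 1")
    last = p[-1]
    return [math.log(x / last) for x in p[:-1]]

def alr_inverse(v):
    """Map v in R^d back to the simplex: p_j = exp(v_j) / (1 + sum_k exp(v_k))."""
    e = [math.exp(x) for x in v]
    denom = 1.0 + sum(e)
    return [x / denom for x in e] + [1.0 / denom]

# Hypothetical proportions of a day spent in four m-a combinations.
p = [0.50, 0.05, 0.35, 0.10]
v = alr(p)               # three unconstrained real numbers
p_back = alr_inverse(v)  # recovers the original composition
```

The transformed vector has one component fewer than the composition and is unconstrained, which is what allows multivariate normal methods to be applied to it.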
We will adopt the third approach for the following reasons:
(1) Aitchison and Shen (1980) pointed out that some properties of the
Dirichlet distribution imply a very strong assumption of independence of the
$p_j$'s, which is unlikely in the current application. The family of logistic
normal distributions provides a wider class of distributions than the
Dirichlet distributions. An even more general class of distributions can
be obtained by different transformations; see Aitchison (1982).
(2) Missing-values analysis in the multivariate normal case is to
some extent developed (Little and Rubin, 1987). It is also easier to
relate the transformed variable v to the design variables (basic
information of each sampling unit known to us prior to the sampling) by the
multivariate normal theory.
(3) As indicated by Aitchison (1986, pp. 338-340), the directional
data approach for compositional data has a topological difficulty since the
square root transformation maps the simplex into only part of the unit
sphere.
Mathematical tractability is one of the main considerations in our
choice of an approach. It is sometimes doubtful that the real data satisfy
the logistic normal assumption. If a test of normality contradicts the
logistic normal distribution, the von Mises distribution
and the generalization of logistic distributions serve as alternatives.
The Dirichlet distribution seems less promising, since the Dirichlet
distribution can be well approximated by the logistic normal distribution
(Aitchison and Shen, 1980).
Under the logistic-normal model, the statistical method for analyzing
the human activity pattern data can be limited to the routine procedures of
multivariate analysis using the transformed variable v if the data are
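complete. For example, we can rewrite p as:
$$(p_1, p_2, \ldots, p_{d+1}) = (\pi_1 u_1, \pi_2 u_2, \ldots, \pi_{d+1} u_{d+1}) \Big/ \sum_{j=1}^{d+1} \pi_j u_j;$$
in other words, the observed value $(p_1, p_2, \ldots, p_{d+1})$ is viewed as a
perturbation of the parameter $(\pi_1, \pi_2, \ldots, \pi_{d+1})$ by $(u_1, u_2, \ldots, u_{d+1})$,
where all the $u_j$'s are nonnegative. Without changing the magnitude of
$\log(p_j / p_{d+1})$, we can impose the condition $u_1 + u_2 + \cdots + u_{d+1} = 1$. It
can be checked that if $u = (u_1, u_2, \ldots, u_d)$ is logistic normal with mean 0
and covariance matrix $\Sigma$, then p is $L_d(\mu, \Sigma)$, where
$$\mu = (\mu_1, \mu_2, \ldots, \mu_d)^T = (\log(\pi_1/\pi_{d+1}), \log(\pi_2/\pi_{d+1}), \ldots, \log(\pi_d/\pi_{d+1}))^T.$$
Once we get $\hat{\mu}$, the estimate of $\mu$, the estimate of $\pi$ ($\hat{\pi}$) can be
computed by the inverse transformation
$$\hat{\pi}_j = \exp(\hat{\mu}_j) \Big/ \Big[\sum_{k=1}^{d+1} \exp(\hat{\mu}_k)\Big], \quad j = 1, 2, \ldots, d+1, \quad \text{with } \hat{\mu}_{d+1} = 0,$$
and the estimated covariance matrix of $\hat{\pi}$ can be obtained by
$G \hat{\Sigma} G^T$, where $\hat{\Sigma}$ is the estimated covariance matrix of $\hat{\mu}$ and the (m, n)-th
element of G is the partial derivative of $\pi_m$ with respect to $\mu_n$ evaluated
at $\mu_n = \hat{\mu}_n$.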
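The inverse transformation and the covariance approximation described above can be sketched as follows. The closed form used for the partial derivatives, d pi_m / d mu_n = pi_m (1{m = n} - pi_n), follows from differentiating the inverse transformation; the numerical values of `mu_hat` and `cov_mu_hat` are hypothetical and chosen only for illustration.

```python
import math

def inverse_alr(mu):
    """pi_j = exp(mu_j) / (1 + sum_k exp(mu_k)); the last part uses mu_{d+1} = 0."""
    e = [math.exp(x) for x in mu]
    denom = 1.0 + sum(e)
    return [x / denom for x in e] + [1.0 / denom]

def jacobian(mu):
    """G[m][n] = d pi_m / d mu_n = pi_m * (1{m == n} - pi_n); shape (d+1) x d."""
    pi = inverse_alr(mu)
    d = len(mu)
    return [[pi[m] * ((1.0 if m == n else 0.0) - pi[n]) for n in range(d)]
            for m in range(d + 1)]

def delta_method_cov(mu, cov_mu):
    """Approximate covariance of pi-hat as G * cov_mu * G^T."""
    G = jacobian(mu)
    d = len(mu)
    GS = [[sum(G[m][k] * cov_mu[k][n] for k in range(d)) for n in range(d)]
          for m in range(d + 1)]
    return [[sum(GS[m][k] * G[n][k] for k in range(d)) for n in range(d + 1)]
            for m in range(d + 1)]

# Hypothetical estimates for d = 2 (three m-a combinations).
mu_hat = [math.log(5.0), math.log(0.5)]
cov_mu_hat = [[0.04, 0.01], [0.01, 0.09]]

pi_hat = inverse_alr(mu_hat)
cov_pi_hat = delta_method_cov(mu_hat, cov_mu_hat)
```

Because the estimated proportions always sum to one, every row of the resulting covariance matrix sums to zero; this is a convenient sanity check on any implementation.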
MISSING VALUES IN ACTIVITY LOGS
Ideally, an activity log should record m-a combinations before and
after each change of m-a. Some of the sampled people will have complete
logs while, inevitably, others will have "coarse" logs. For example, in the
i-th person's log, the first recorded item is "leaving home and entering the
car at 7:20 a.m.", and the second record is "entering the cafeteria from my
office at 10:00 a.m." Suppose we are willing to accept that there are only two
microenvironment-activities from 7:20 a.m. to 10:00 a.m., i.e., driving a
car (m-a 2) and working in the office (m-a 3), and that all the other m-a
combinations are correctly recorded except that in another time interval
m-a 3 and m-a 5 are again mixed together. Then instead of obtaining the
individual $p_{i2}$, $p_{i3}$, and $p_{i5}$, we can only get the sum of $p_{i2}$, $p_{i3}$, and $p_{i5}$ and
conclude that $p_{i2}$, $p_{i3}$, and $p_{i5}$ are within certain intervals.
Missing data can be partly identified by this kind of inconsistency
between two consecutive records, but this method is by no means complete.
For instance, there may be another completely unrecorded m-a between 7:20
a.m. and 10:00 a.m. in the above example. Even a log deemed complete may
become spurious if a completely unrecorded m-a is taken into consideration.
Situations become more complex as the logs become coarser; perhaps an ad
hoc procedure is necessary to examine each log. It is beyond the scope of
the present work to develop these procedures, but we assume that through
some suitable procedures, each log can be mapped into a single point in the
d-dimensional simplex if the log is complete; otherwise, we can observe
some unmingled elements of p and the sum(s) of other elements of p.
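One possible way to organize a coarse log is sketched below. The paper deliberately leaves the mapping procedure open, so everything here is an assumption made for illustration: the representation of a log as (minutes, set of candidate m-a indices) segments, the function name `summarize_log`, and the example day. Ambiguous segments that share an m-a are merged into one group, so that mixing m-a 2 with 3 and m-a 3 with 5 yields only the sum for {2, 3, 5}, as in the example above.

```python
def summarize_log(segments, n_ma, total_minutes=1440.0):
    """Split a log into individually observed proportions and mingled sums.

    segments: list of (minutes, set of m-a indices); a set with more than one
    index marks an interval in which those m-a cannot be separated.
    """
    # Merge overlapping ambiguous sets into disjoint groups.
    groups = []
    for _, ma_set in segments:
        if len(ma_set) > 1:
            merged = set(ma_set)
            keep = []
            for g in groups:
                if g & merged:
                    merged |= g
                else:
                    keep.append(g)
            groups = keep + [merged]
    mingled = set().union(*groups)

    minutes = {j: 0.0 for j in range(1, n_ma + 1)}
    group_minutes = [0.0] * len(groups)
    for dur, ma_set in segments:
        if len(ma_set) == 1 and not (set(ma_set) & mingled):
            (j,) = ma_set
            minutes[j] += dur
        else:
            # Time spent in a mingled m-a only contributes to the group sum.
            for k, g in enumerate(groups):
                if set(ma_set) <= g:
                    group_minutes[k] += dur
                    break

    observed = {j: minutes[j] / total_minutes
                for j in range(1, n_ma + 1) if j not in mingled}
    sums = {tuple(sorted(g)): group_minutes[k] / total_minutes
            for k, g in enumerate(groups)}
    return observed, sums

# Hypothetical day: m-a 1 = home, 2 = car, 3 = office, 4 = cafeteria, 5 = other.
day = [(440, {1}), (160, {2, 3}), (60, {4}), (180, {3}), (120, {3, 5}), (480, {1})]
observed, sums = summarize_log(day, n_ma=5)
```

In this example only m-a 1 and m-a 4 are individually observed; the recorded office time for m-a 3 is absorbed into the sum for the group (2, 3, 5) because its total cannot be separated from the mixed intervals.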
A coarse log is always a problem in the estimation procedure, but
discarding the coarse logs and analyzing data only from the complete logs
seems unsuitable. In the following discussion, we will sacrifice the
information contained in the sum(s) of the elements of p, but at least we
will still make use of the unmingled observations in p. In terms of the
example mentioned in the first paragraph of this section, we ignore the
information provided by the sum of $p_{i2}$, $p_{i3}$, and $p_{i5}$, but the other individually
observed proportions will be incorporated in the estimation procedure.
STRATEGIES FOR INCOMPLETE LOGS
To analyze the unmingled, individually observed proportions, the
following fact (Aitchison, 1986, p. 119) is useful:
Let p be $L_d(\mu, \Sigma)$, and suppose the proportions in the m-a $j_1, j_2, \ldots, j_{c+1}$
are individually observed, $1 \le j_1 < j_2 < \ldots$
-------
normal case, obtaining MLE in the presence of missing values is pretty
simple if we have a monotone missing pattern and the nonresponse mechanism
is ignorable. Operationally, the procedure for computing the MLE consists of
successively regressing the less observed variable(s) on the more observed
variable(s) and then filling in the unobserved parts using the regression
model. Sweep and reverse sweep operators are commonly used in this
procedure. For details, see Little and Rubin (1987, pp. 112-119).
The role of the ignorable nonresponse mechanism should be addressed
further. The assumption of ignorable nonresponse ensures that the complete
likelihood function can be factored into two parts: one corresponds to $\alpha$,
the parameter of the nonresponse mechanism; the other corresponds to $\beta$, the
parameter of the random variable we are interested in. If the parameter
space $\Omega$ can be expressed as the Cartesian product of the parameter spaces
of $\alpha$ and $\beta$, maximizing the complete likelihood can be achieved by
maximizing the separate parts (Little and Rubin, 1987). In the typical
terms of statistical inference, we can say that $\alpha$ is the nuisance
parameter. An MLE of $\beta$ can be obtained by maximizing the marginal
likelihood of $\beta$. Ignorable nonresponse eliminates the dependence of the
marginal likelihood of $\beta$ on the nuisance parameter $\alpha$.
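The factorization described in this paragraph can be written out as a short worked statement. The symbols are introduced here for illustration and are not in the original: $y_{obs}$ denotes the observed data, $r$ the vector of response indicators, $\beta$ the parameter of the data model, and $\alpha$ the parameter of the nonresponse mechanism.

```latex
% Under an ignorable nonresponse mechanism (the response probabilities depend
% only on observed values, and alpha and beta vary in a product space):
L(\beta, \alpha \mid y_{obs}, r)
  = \underbrace{f(y_{obs} \mid \beta)}_{\text{data model}}
    \;\times\;
    \underbrace{f(r \mid y_{obs}, \alpha)}_{\text{nonresponse mechanism}}
% Hence the maximizer over beta is obtained from the first factor alone:
\hat{\beta} = \arg\max_{\beta} f(y_{obs} \mid \beta)
```

The second factor does not involve $\beta$, so it can be ignored when estimating $\beta$; this is the sense in which $\alpha$ acts as a nuisance parameter.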
If a monotone missing pattern seems unlikely, the EM algorithm is
usually applied (Little and Rubin, 1987, pp. 142-145). For a thorough
discussion of the EM algorithm, see Dempster, Laird, and Rubin (1977). If
we can have a monotone pattern by sacrificing a limited amount of data,
Rubin (1987, pp. 189-190) suggested discarding data to obtain a monotone
pattern.
Assuming an ignorable nonresponse mechanism, we use a bivariate
normal example to illustrate the estimation procedure in the presence of
nonresponse. Note that for the example we use, the missing pattern is
monotone.
Let $y = (y_1, y_2)^T$ follow a bivariate normal distribution with mean
$\mu = (\mu_1, \mu_2)^T$ and covariance matrix $\Sigma$. Let $y_2$ be subject to nonresponse.
Whenever necessary, we use the prefixes R and N to denote response and
nonresponse respectively; for example, $({}_R y_1, {}_R y_2)$ is a random vector from
the subpopulation of respondents with mean $({}_R\mu_1, {}_R\mu_2)$. The same nomenclature
applies to the nonresponse subpopulation and to the covariance matrix.
A sample collected from the bivariate normal population can be so arranged
that the first m observations are complete, while the second variable
of the last n-m observations, call it ${}_N y_2$, is not observed. Note that both
${}_R y_1$ and ${}_N y_1$ are observed; the only unobserved part is ${}_N y_2$. Any symbol
without R or N in front means that both response and nonresponse are
considered; for example, $\bar{y}_1$ represents the mean of the n $y_1$'s. The MLE of $\mu$
and the corresponding estimate of the variance can be obtained by the
following formulas (Little and Rubin, 1987, pp. 100-103) if the nonresponse
is ignorable:
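$$\hat{\mu}_1 = \bar{y}_1, \qquad (4.1)$$
$$\hat{\mu}_2 = {}_R\bar{y}_2 + ({}_R s_{12} / {}_R s_{11})(\bar{y}_1 - {}_R\bar{y}_1), \qquad (4.2)$$
$$\mathrm{var}(\hat{\mu}_2) = \ldots \{1/m + {}_R r^2 / [n(1 - {}_R r^2)] + ({}_R\bar{y}_1 - \bar{y}_1)^2 / (n \cdot {}_R s_{11})\}, \qquad (4.3)$$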
where ${}_R s_{12}$ and ${}_R s_{11}$ are the sample covariance and variance computed from the
first m observations, i.e., the response part. Similarly, ${}_R r$ is the
sample correlation computed from the first m observations.
Using the estimator (4.2) is equivalent to constructing a simple
regression of ${}_R y_2$ on ${}_R y_1$ and then obtaining the arithmetic mean of the ${}_R y_2$'s and
${}_N\hat{y}_2$'s, where ${}_N\hat{y}_2$ is the predicted value based on the regression line
constructed from the response part.
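The equivalence stated in this paragraph can be checked numerically with a small sketch. The function names and the six data points are hypothetical; the first function computes the respondent mean of $y_2$ plus a regression adjustment, and the second computes the arithmetic mean of the observed and regression-predicted $y_2$ values.

```python
def regression_estimate(y1_resp, y2_resp, y1_nonresp):
    """Respondent mean of y2 adjusted by the slope times the shift in the mean of y1."""
    m = len(y1_resp)
    n = m + len(y1_nonresp)
    r1 = sum(y1_resp) / m                        # respondent mean of y1
    r2 = sum(y2_resp) / m                        # respondent mean of y2
    all1 = (sum(y1_resp) + sum(y1_nonresp)) / n  # mean of all n values of y1
    # The common divisor of the sample variance and covariance cancels in the slope.
    s11 = sum((a - r1) ** 2 for a in y1_resp)
    s12 = sum((a - r1) * (b - r2) for a, b in zip(y1_resp, y2_resp))
    return r2 + (s12 / s11) * (all1 - r1)

def imputation_estimate(y1_resp, y2_resp, y1_nonresp):
    """Mean of observed y2 values and regression predictions for nonrespondents."""
    m = len(y1_resp)
    r1 = sum(y1_resp) / m
    r2 = sum(y2_resp) / m
    s11 = sum((a - r1) ** 2 for a in y1_resp)
    s12 = sum((a - r1) * (b - r2) for a, b in zip(y1_resp, y2_resp))
    slope = s12 / s11
    predicted = [r2 + slope * (a - r1) for a in y1_nonresp]
    return (sum(y2_resp) + sum(predicted)) / (m + len(y1_nonresp))

# Hypothetical data: four complete cases and two cases with y2 missing.
y1_resp = [8.0, 9.0, 10.0, 11.0]
y2_resp = [7.5, 9.5, 10.0, 12.0]
y1_nonresp = [12.0, 14.0]

est_a = regression_estimate(y1_resp, y2_resp, y1_nonresp)
est_b = imputation_estimate(y1_resp, y2_resp, y1_nonresp)
```

Both functions return the same value up to rounding, which illustrates why the estimator can be read as a regression imputation.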
ROBUSTNESS CONSIDERATION
To discuss the robustness of $\hat{\mu}_2$ in (4.2), the influence function
approach will be employed in this section. For another approach to robust
estimation, in which a contaminated normal distribution is considered, see
Little and Rubin (1987, pp. 194-217).
Loosely speaking, the univariate influence function is a measure of
the "influence" of an additional observation x on the estimator $T(F_n)$.
Here, the estimator is expressed as a functional T of the empirical
distribution function $F_n$. Formally, the influence function is defined to
be the limit of the ratio of the difference of two statistical functionals
to $\epsilon$ as $\epsilon$ goes to zero:
$$IF(x; T, F) = \lim_{\epsilon \to 0} \{T[(1-\epsilon)F + \epsilon\delta_x] - T[F]\} / \epsilon,$$
where $\delta_x$ is the indicator function (the distribution function of a unit
point mass at x).
The influence function can be generalized to the multivariate situation
(Hampel et al., 1986, p. 226). Also, the arguments in the statistical functionals
may include more than one distribution function. Some typical results are
illustrated below:
$IF(y; T, F) = y - \mu$, when T is the mean functional,
$IF(y; T, F) = (y - \mu)^2 - \sigma^2$, when T is the variance functional,
$IF(y_1, y_2; T, F) = (y_1 - \mu_1)(y_2 - \mu_2) - \sigma_{12}$, when T is the covariance
functional.
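The first two results can be checked numerically by evaluating the defining ratio at a small but finite contamination weight. The sketch below uses a hypothetical three-point discrete distribution in place of F; the function names and the value eps = 1e-6 are assumptions made for illustration.

```python
def mean_functional(points, weights):
    return sum(w * x for x, w in zip(points, weights))

def variance_functional(points, weights):
    mu = mean_functional(points, weights)
    return sum(w * (x - mu) ** 2 for x, w in zip(points, weights))

def numerical_if(T, points, weights, x, eps=1e-6):
    """Finite-eps version of {T[(1 - eps) F + eps delta_x] - T[F]} / eps."""
    mixed_points = list(points) + [x]
    mixed_weights = [(1.0 - eps) * w for w in weights] + [eps]
    return (T(mixed_points, mixed_weights) - T(points, weights)) / eps

# Hypothetical discrete distribution standing in for F.
points = [8.0, 10.0, 12.0]
weights = [0.25, 0.5, 0.25]
mu = mean_functional(points, weights)
sigma2 = variance_functional(points, weights)

x = 25.0  # a contaminating observation far from the bulk of F
if_mean = numerical_if(mean_functional, points, weights, x)
if_var = numerical_if(variance_functional, points, weights, x)
```

For this distribution the numerical ratios agree closely with x - mu and (x - mu)^2 - sigma^2. Both grow without bound as x moves away from the mean, which is why the mean and the variance are not robust functionals.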
Since $\hat{\mu}_1$ is the usual sample mean, we will focus on $\hat{\mu}_2$. Note that
the corresponding statistical functional of $\hat{\mu}_2$ is a function of two
bivariate distribution functions, ${}_R F$ and ${}_N F$. By the three equations listed
above and the chain rule of calculus, the influence function of $\hat{\mu}_2$ can be
expressed as
$$\ldots\; {}_R\sigma_{12} ({}_N\mu_1 - {}_R\mu_1) [({}_R y_1 - {}_R\mu_1)^2 - {}_R\sigma_{11}] / {}_R\sigma_{11}^2 \}, \qquad (5.1)$$
where $\sigma_{ij}$ is the (i, j)-th element of $\Sigma$, p is the population nonresponse
rate, and, as stipulated previously, R and N denote response and nonresponse
respectively. If the nonresponse mechanism does not depend on $y_1$ and $y_2$,
then we call this situation missing completely at random (MCAR). Under the
assumption of MCAR, ${}_N\mu_1 = {}_R\mu_1$; hence the above formula can be simplified.
$\hat{\mu}_2$ is better than ${}_R\bar{y}_2$ for the following two reasons: (1) $\hat{\mu}_2$ is
asymptotically unbiased. The squared bias term produced by using ${}_R\bar{y}_2$,
$({}_R\mu_2 - \mu_2)^2$, becomes the dominant component in the mean squared error as
the sample size increases and the nonresponse mechanism is not MCAR. (2)
For fixed m, var($\hat{\mu}_2$) in (4.3) decreases as n increases while the variance
of ${}_R\bar{y}_2$ remains unchanged; in other words, we lose more information using ${}_R\bar{y}_2$
as the nonresponse rate p increases. However, these advantages are not
without a price. The price one pays is robustness. A further inspection
of (5.1) reveals that the influence function may increase as p or $({}_N\mu_1 - {}_R\mu_1)^2$
increases; the increase of $({}_R\mu_2 - \mu_2)^2$ will, in turn, increase $({}_N\mu_1 - {}_R\mu_1)^2$ if
there is correlation between $y_1$ and $y_2$.
The effect of an outlying ${}_N y_1$ on $\hat{\mu}_2$ is less complicated than that of
an outlying $({}_R y_1, {}_R y_2)$ if the nonresponse is not MCAR; hence only the effect
of an outlier $({}_R y_1, {}_R y_2)$ will be illustrated by the following numerical
example. First, generate 200 vectors $y = (y_1, y_2)^T$ following a bivariate normal
distribution with mean $\mu = (10, 10)^T$ and covariance matrix
$$\Sigma = \begin{pmatrix} 10 & 9 \\ 9 & 10 \end{pmatrix}.$$
This can be done by first generating 200 pseudo random vectors z following
a bivariate normal distribution with mean 0 and identity covariance matrix
using the SAS package. Then let $y = \mu + Az$, where A is a lower triangular
matrix and $\Sigma = AA^T$ is the Cholesky decomposition of $\Sigma$. Assume the logistic
nonresponse mechanism:
$$\Pr(y_{i2} \text{ not observed} \mid y_{i1}) = 1 / [1 + \exp(-a - b\,y_{i1})]. \qquad (5.2)$$
This mechanism is ignorable in the sense that it does not depend on the
possibly unobserved $y_2$. It is also MCAR if b = 0. Points with larger $y_1$ have
a higher probability of being unobserved when b is positive. By varying
combinations of a and b, several nonresponse mechanism cases can be
generated. For each generated $(y_{i1}, y_{i2})$, we generate an independent $u_i$
following a uniform distribution on (0, 1). If
$$u_i < 1 / [1 + \exp(-a - b\,y_{i1})],$$
then we treat $y_{i2}$ as unobserved. In order to compare the effect of
different b's under the same response rate, values of a and b are purposely
selected so that the number of nonresponses in cases 3, 4, 5, and 6 is exactly
150. The computational results are listed in Table 1. In each case, 200
points are obtained, and, after the removal of the deemed nonresponse ${}_N y_2$,
${}_R\bar{y}_2$ and $\hat{\mu}_2$ are calculated. Then the outlier (25, 3) or (3, 25) is added to
the data set as a $({}_R y_1, {}_R y_2)$ point and $\hat{\mu}_2^+$, the new estimate of $\mu_2$, is again
computed.
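The simulation described above can be sketched in plain Python. This is not the authors' SAS program: the random number generator, the seed, and the function names are assumptions made here, and the values of a and b are not tuned to give exactly 150 nonresponses, so the numbers produced will differ from those in Table 1.

```python
import math
import random

def cholesky_2x2(s11, s12, s22):
    """Lower-triangular A with A A^T = [[s11, s12], [s12, s22]]."""
    a11 = math.sqrt(s11)
    a21 = s12 / a11
    a22 = math.sqrt(s22 - a21 * a21)
    return a11, a21, a22

def simulate(n, a, b, rng):
    """Return (respondent pairs, y1 values of the deemed nonrespondents)."""
    a11, a21, a22 = cholesky_2x2(10.0, 9.0, 10.0)
    resp, nonresp_y1 = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1 = 10.0 + a11 * z1
        y2 = 10.0 + a21 * z1 + a22 * z2
        p_miss = 1.0 / (1.0 + math.exp(-a - b * y1))  # logistic mechanism
        if rng.random() < p_miss:
            nonresp_y1.append(y1)   # y2 is treated as unobserved
        else:
            resp.append((y1, y2))
    return resp, nonresp_y1

def estimates(resp, nonresp_y1):
    """Respondent mean of y2 and the regression-adjusted estimate of mu_2."""
    m = len(resp)
    n = m + len(nonresp_y1)
    r1 = sum(y1 for y1, _ in resp) / m
    r2 = sum(y2 for _, y2 in resp) / m
    all1 = (sum(y1 for y1, _ in resp) + sum(nonresp_y1)) / n
    s11 = sum((y1 - r1) ** 2 for y1, _ in resp)
    s12 = sum((y1 - r1) * (y2 - r2) for y1, y2 in resp)
    return r2, r2 + (s12 / s11) * (all1 - r1)

rng = random.Random(1989)  # arbitrary seed, for reproducibility only
resp, nonresp_y1 = simulate(200, a=0.0, b=0.14, rng=rng)
ybar2_resp, mu2_hat = estimates(resp, nonresp_y1)

# Add the outlier (25, 3) as an extra respondent and recompute the estimate.
_, mu2_hat_plus = estimates(resp + [(25.0, 3.0)], nonresp_y1)
shift = mu2_hat - mu2_hat_plus
```

Rerunning the sketch with other (a, b) pairs gives a rough feel for how the shift caused by the same outlier depends on the nonresponse rate and mechanism.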
TABLE 1
ESTIMATES OF $\mu_2$ UNDER DIFFERENT RESPONSE MECHANISMS
case   a           b       # of N   ${}_R\bar{y}_2$   $\hat{\mu}_2$   added outlier   $\hat{\mu}_2^+$
1      $-\infty$   0.0     0        10.00    10.00    (25, 3)         9.96
2      -1.099      0.0     48       10.01    9.98     (25, 3)         9.93
3      1.099       0.0     150      9.91     10.01    (25, 3)         9.73
4      0.00        0.14    150      9.22     9.97     (25, 3)         9.44
5      -15.45      2.00    150      6.27     9.15     (25, 3)         6.79
6      24.95       -2.00   150      13.72    10.36    (3, 25)         14.1
Table 1 illustrates the assertion that the advantage gained
by using formula (4.2) is not without a price. In case 1, ${}_R\bar{y}_2 = \hat{\mu}_2$, and the
influence of the added outlier is limited to the second decimal place. A
comparison of cases 1, 2, and 3 shows that the influence of the outlier on $\hat{\mu}_2$
increases as the nonresponse rate becomes higher. A comparison of cases 3, 4,
5, and 6 shows that, given a fixed number of nonresponses, $\hat{\mu}_2$ is closer to
the true value 10 than ${}_R\bar{y}_2$ is. This phenomenon is even more obvious when
the magnitude of b becomes larger; however, the absolute value of $\hat{\mu}_2 - \hat{\mu}_2^+$
also becomes larger.
From the viewpoint of regression diagnostics, it is obvious that the
newly added point is a high leverage point; the simple regression model of
${}_R y_2$ on ${}_R y_1$ is highly distorted by the added point. This in turn will
distort the imputed value ${}_N\hat{y}_2$ obtained from ${}_N y_1$ and the regression model.
Finally, we want to point out that for the numerical example we used,
the outlier is an obvious one. To detect outliers systematically,
we need a general rule to order the multivariate observations. See
Green (1981) for the bivariate example.
DISCUSSION
Since nonresponse will complicate the analysis procedure and make the
statistical results less reliable, attempts should be made to reduce the
nonresponse rate. However, the remedy for nonresponse is by no means
complete if we concentrate on the nonresponse rate only; the role of the
nonresponse mechanism should not be ignored. For example, incentive
methods may not only increase the response rate, but also change the
nonresponse mechanism, say, increase the magnitude of b in (5.2).
Reduction of the sensitivity to the outlier by the increase of the response
rate can be counteracted by the increase of b. From this point of view, a
complete consideration of using an incentive method should at least include
the following questions: When and to whom should we apply the incentive
methods—at the very beginning of the sampling program or at the follow-up
survey, to all the population or just to the "hard core?"
Though the increase of the number of microenvironment-activity
combinations will not change the computational procedures, it is still
necessary to decide how many m-a combinations we are going to define. For
a fixed number of observations, the number of microenvironment-activity
combinations cannot be increased without any limitation. The chance that
the MLE is close to the true value is small if the number of parameters
increases as the number of observations increases.
The numerical example used in this study comes from a bivariate
normal distribution. It can be generalized without difficulties to a
multivariate normal case as long as the assumption of ignorable nonresponse
mechanism and monotone missing pattern still hold. Further studies are
necessary if we do not have an ignorable nonresponse mechanism. One may
also argue that the outlier in the numerical example is an extreme one, but
what we want to emphasize here is that for the same outlier, the influence
varies as the nonresponse rate and nonresponse mechanism change. In other
words, we are more interested in the relative magnitude of $\hat{\mu}_2 - \hat{\mu}_2^+$ among
the different cases in Table 1 than in the difference $\hat{\mu}_2 - \hat{\mu}_2^+$ in each
individual case.
REFERENCES
Aitchison, J. (1982). The statistical analysis of compositional data (with
discussion). J. R. Statist. Soc. B, 44, 139-177.
Aitchison, J. (1986). The Statistical Analysis of Compositional Data.
London: Chapman and Hall.
Aitchison, J., and Shen, S. M. (1980). Logistic-normal distributions: Some
properties and uses. Biometrika 67, 261-272.
Dempster, A. P., Laird, N., and Rubin, D. B. (1977). Maximum likelihood from
incomplete data via the EM algorithm. J. R. Statist. Soc. B, 39,
1-38.
Green, P. J. (1981) Peeling bivariate data. In Interpreting Multivariate
Data, edited by V. Barnett (John Wiley & Sons, New York), pp. 3-19.
Hampel, F. R.; Ronchetti, E. M.; Rousseeuw, P. J.; and Stahel, W. A.
(1986). Robust Statistics. New York: John Wiley & Sons.
Little, R. J. A.; and Rubin, D. B. (1987). Statistical Analysis with
Missing Data. New York: John Wiley & Sons.
Rubin, D. B. (1987). Multiple Imputation for Nonresponse in Surveys. New
York: John Wiley & Sons.
Stephens, M. A. (1982). Use of the von Mises distribution to analyse
continuous proportions. Biometrika 69, 197-203.
Wilks, S. S. (1962). Mathematical Statistics. New York: John Wiley &
Sons.
-------
NONRESPONSE PROBLEMS AND SOLUTIONS: A CASE STUDY
by: Dawn Nelson and Chet Bowie
U.S. Bureau of the Census
Washington, D.C. 20233
ABSTRACT
The purpose of this paper is to describe the nonresponse problems in
one particular survey and the efforts being made to reduce them, in the hope that
the information will be useful in planning other surveys. The survey is
the Survey of Income and Program Participation (SIPP). This survey has
been conducted by the Census Bureau since 1983 to provide longitudinal
information on the economic situation of households and persons in the
United States.
This paper presents information on our experience with nonresponse in
the first SIPP panel to be completed, the 1984 SIPP panel. It contains a
demographic profile of those who refused, which was developed by analyzing
information provided by SIPP interviewers. It also contains a discussion
of the reasons why respondents refused to participate in the survey and the
results of follow-up visits made to convert these refusals into interviews.
Finally, there is a description of other efforts that are being made to
maintain and improve the SIPP response rates. These efforts include
educating the interviewers and respondents about the survey, improving
interviewer training, and testing the effects of offering respondents a
gift for participating.
INTRODUCTION TO NONRESPONSE PROBLEMS
When conducting a survey, the researchers and managers depend upon the
cooperation of respondents to produce accurate results. Without this
cooperation, the resulting survey data may be biased. Therefore, efforts
should be made to understand the reasons for nonresponse and to develop
ways to reduce it.
Nonresponse problems differ somewhat depending on the nature of the
survey; that is, the mode of interviewing (mail, telephone, diary, personal
visit), the type of respondent (person or business), the number of times
each respondent is interviewed, the length and content of the interview,
whether it is voluntary or mandatory, and so forth. This paper will focus
on nonresponse problems associated with household surveys conducted by
personal visit or a combination of personal visit and telephone.
According to the literature, a typical nonresponse rate for this type
of survey is probably about 20 percent (1). The Census Bureau, on the
other hand, manages to keep the rate to around 5 percent or less for most
of its household surveys. This should make you ask how the Census Bureau
addresses the problem of nonresponse. That question is not easily
answered, however. Much of the knowledge about this subject is considered
to be common sense; therefore, it is not well-documented at the Bureau.
But it is well-established that there are two main sources of nonresponse:
noncontacts and refusals.
NONCONTACTS
There are several reasons that an interviewer might not be able to
establish contact with a respondent; for example, the respondent may be away
at work, school, prison, etc., or on vacation, or unable to answer the door due to
an illness or disability. Other barriers include bad weather or road
conditions and housing security measures that prevent access to the
respondent. Obviously some of these conditions are temporary and may be
overcome if enough time and money are allotted for the interviewers to make
return visits or "callbacks" to the respondents.
Callbacks can be made less costly and time-consuming if you plan
ahead for them in designing the survey. For example, a cluster sample
design will ensure that neighboring housing units are selected in clusters
scattered throughout the sampling area. This will minimize the amount of
travel time needed for callbacks to neighboring units. However, some
concern has been expressed that cluster sampling may increase refusals if
an interviewed household tells a neighbor something negative about the
survey before the interviewer reaches them. Another planning suggestion is
to set up a flexible interviewing schedule that includes nights and
weekends. This makes callbacks possible at different times of the day and
on different days of the week, which will increase the chance of finding
someone at home.
There are also a number of steps that can be taken to avoid
noncontacts or limit the number of callbacks. For example:
1. Keep records of the time contacts are attempted or made and
analyze them to determine when respondents are most likely to
be home and how many callbacks to allow.
2. Ask respondents who will be interviewed several times when is
the best time to visit and note it for future use.
3. Use "suspected" characteristics of the sampled person for clues
regarding the best time to attempt contact; e.g., persons
living in housing for the elderly are more likely to answer the
door during the day.
4. Allow proxy interviews; i.e., obtain the information about a
respondent from another knowledgeable person.
5. Ask a neighbor when the respondent is likely to be home.
6. Make an appointment by telephone. However, it should be noted
that many interviewers feel this procedure may lead to a
refusal because it is easier to turn away someone over the
phone than in person.
7. Leave a notice saying that an interviewer was there and asking
the respondent to call the interviewer to make an appointment.
REFUSALS
Once contact is made, however, there is still the possibility that
the respondent may refuse to participate in the survey. In fact, this is
generally a bigger problem than making contact. Researchers Stephan and
McCarthy (2) have found that obtaining a response depends on several
factors including: 1) the form of the approach (mail, telephone, personal
visit), 2) the type of information requested and the advance notice given
to the respondent, 3) the characteristics of the respondent, 4) the
respondent's attitudes toward the group conducting the survey, 5) the
efforts made to overcome resistance, and 6) the circumstances under which
the interview is conducted.
The Survey Research Center at the University of Michigan also studied
the concerns expressed by respondents that might lead to a refusal (3).
The most frequently cited concerns were that the interview would 1) take
too much time, 2) ask for too personal, difficult, or unpleasant
information, or 3) have a negative effect on the respondent, e.g., denial
of some government program benefits. They also found that these concerns
were mitigated in some cases by the respondent's desire to be of public
service or because the respondent was lonely and wanted to talk to someone,
or flattered to be asked to participate, or just friendly and liked to
talk.
It is more difficult to give advice on how to avoid refusals because
each respondent is different and may react differently. Some general tips
in this area have also been developed by Stephan and McCarthy (4). They
recommend:
1. using well-trained professional interviewers who have a
positive attitude,
2. assigning interviewers to respondents who have characteristics
that are similar to the interviewers' (but an interviewer
should not personally know a respondent),
3. providing advance notification about the interview,
4. having a good introduction and description of the survey for
the interviewers to use,
5. paying the respondents,
6. rescheduling the interview if the respondent is too busy when
first contacted, and
7. using a different interviewer to try to change a respondent's
refusal into an interview.
ADDRESSING REFUSALS AT THE CENSUS BUREAU
The Census Bureau follows all of these recommendations except that we
do not pay respondents because we are prohibited from doing so by
governmental regulations. In addition, the Bureau believes that it is
important to:
1. have a well-designed, fully-tested, brief questionnaire, and
2. guarantee the respondent's confidentiality and train the
interviewers on the importance of and ways to maintain
confidentiality. Census Bureau interviewers are subject to a
jail penalty or a fine if they disclose any information that
would identify a respondent. We only publish data in the form
of statistical summaries and never release any information that
identifies an individual. We believe that these measures help
to maintain our high response rates.
Also, the importance of having the right interviewers, training them
well, providing them with incentives, and soliciting their help in
addressing the nonresponse problem should be stressed. The Census Bureau
has a regular staff of around 3,000 sample survey interviewers who work out
of our 12 regional offices throughout the country. When they are assigned
to work on a survey, they are thoroughly trained on it through self-studies
done alone at home and in classroom sessions with other interviewers. The
interviewers also receive refresher classroom or self-study training at
least once a year or sometimes more often. This training will often
emphasize problems the interviewers are encountering, such as nonresponse
problems. The Bureau feels that if the interviewers are familiar with the
basic concepts of the survey, the wording of the questions, and the way to
complete the forms, they will be more likely to display a positive and
professional impression to the respondents. The importance of this
impression was demonstrated in an experiment conducted by Durbin and Stuart
in which only 3 to 4 percent of the respondents refused professional
interviewers while about 13 percent refused the inexperienced amateurs (5).
The Bureau also tries to keep an open line of communication between
the interviewers in the field and the survey managers in the regional
offices and the central headquarters. One way we do this is through a
program called "Thanks for Asking" in which interviewers are invited to
send questions and suggestions to the head of our field operations office.
Matters of general interest are responded to in a newsletter that goes out
to all interviewers and the other letters are answered personally. We
receive a lot of valuable advice from our interviewers in this way, and
occasionally an interviewer is given a cash bonus if their suggestion is
adopted.
The Bureau also provides feedback to the interviewers on how well
they are doing in general and in particular on keeping the nonresponse rate
low. A noninterview rate is calculated for each interviewer at the end of
an enumeration period so they know if their efforts are succeeding.
Interviewers are observed periodically by their supervisors, and their work
is also systematically reviewed for errors to enable us to detect
weaknesses that need to be corrected through additional training. We also
use these evaluations to reward good interviewers with pay incentives.
Finally, despite all of the Bureau's efforts to make contacts and get
cooperation, some nonresponse is considered inevitable. Also, in some
cases, we feel it may be better to accept a refusal than to pursue an
interview with a respondent who is so negative or apathetic that they
provide data of questionable quality. In planning a survey, one must
decide on the extent of the efforts to be made in obtaining cooperation.
This decision will depend upon: 1) the amount of nonresponse expected, 2)
the likely differences between the respondents and nonrespondents, 3) the
accuracy required in the results, and 4) the funds and time available.
These plans should also include procedures for collecting information on
nonrespondents which can be used to make adjustments to the results to
account for the nonresponse. However, this paper will not attempt to
address this important topic of nonresponse adjustments.
A CASE STUDY: THE SURVEY OF INCOME AND PROGRAM PARTICIPATION
THE SURVEY AND ITS NONRESPONSE RATES
The Census Bureau has recently launched a major new survey, the Survey
of Income and Program Participation (SIPP), that has provided us with the
opportunity to further study the problem of nonresponse. The survey has
been conducted by the Census Bureau since 1983 to provide longitudinal
information on the economic situation of households and persons in the
United States. The data are used to analyze the cost and effectiveness of
government transfer programs, to better understand the Nation's income
distribution, and to study national policy issues. Panels of approximately
12,500 households are introduced every February, which results in two or
sometimes three panels in the field concurrently. Respondents in each
panel are interviewed once every 4 months for 2 1/2 years. If they move,
attempts are made to continue interviewing them at their new address. (See
Reference 6 for a more complete description of the survey.)
Nonresponse rates are calculated for each "wave" of interviewing in a
panel. A wave is the 4-month period that is required to interview the
entire sample; one fourth of the sample, called a rotation group, is
interviewed each month of the wave. This paper presents a variety of
information (originally presented at the Bureau of the Census' Third Annual
Research Conference) on our experience with noninterviews in the first SIPP
panel to be completed, the 1984 SIPP Panel, which consisted of 9 waves of
interviewing (7). The interviews were voluntary and long—over an hour per
household on the average—and dealt mainly with a subject that is not
comfortable or pleasant for most people: income. After the first wave of
interviewing, our nonresponse rate including noncontacts and refusals was
4.9 percent. By the ninth and final wave of interviewing, the rate was
22.3 percent, including noninterviews from previous waves that were not
converted to an interview. This was high for a Census Bureau survey.
However, over one-fourth of the loss was accounted for by households that
had moved to an undetermined location or outside of our interviewing area
after the first interview. The remaining loss was primarily due to
refusals.
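The wave and rotation-group structure described above can be sketched in a few lines. This is only an idealized illustration: months are counted from 0 at the start of a panel, the helper name interview_month is hypothetical, and every wave is assumed to contain all four rotation groups.

```python
# Idealized sketch of the wave / rotation-group interview schedule.
# Assumptions (not from the source): months are numbered from 0 at the
# start of the panel, and every wave has all four rotation groups.

ROTATION_GROUPS = 4   # one fourth of the sample is interviewed each month
WAVES = 9             # the 1984 Panel consisted of 9 waves


def interview_month(wave: int, rotation_group: int) -> int:
    """Return the 0-based panel month in which a rotation group is
    interviewed during a given wave (both arguments are 1-based)."""
    if wave < 1:
        raise ValueError("wave must be >= 1")
    if not 1 <= rotation_group <= ROTATION_GROUPS:
        raise ValueError("rotation_group must be between 1 and 4")
    return (wave - 1) * ROTATION_GROUPS + (rotation_group - 1)


# Interview months for each rotation group across the nine waves.
schedule = {
    group: [interview_month(wave, group) for wave in range(1, WAVES + 1)]
    for group in range(1, ROTATION_GROUPS + 1)
}
```

Under these assumptions each rotation group is revisited at a constant 4-month interval, which is the spacing the survey design calls for.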
In Wave 1 of the 1984 Panel, about 76 percent of the nonresponses
(excluding lost movers) were refusals (3.7 percent). This percentage
increased throughout the panel until it reached 94 percent in Wave 9 (14.2
percent). A number of hypotheses have been suggested regarding the reasons
people refuse to participate in SIPP. Some people suspect that interview
length, frequency, and content are the prime candidates. Others believe
that interviewer characteristics (age, experience, understanding of the
survey, etc.) might be related to refusals. And still others think that
the problem is generic: people do not participate in our surveys because
they do not trust the government or are afraid of strangers due to the
increase in crime.
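The percentages quoted above can be cross-checked with a little arithmetic. The sketch below assumes one particular reading of the figures: that 3.7 and 14.2 percent are refusal rates, and that 76 and 94 percent are the refusal shares of nonresponse once lost movers are excluded. The function name is hypothetical.

```python
# Cross-check of the Wave 1 and Wave 9 figures under the assumed reading:
# refusal rate = (refusal share) x (nonresponse rate excluding lost movers).

def nonresponse_excluding_movers(refusal_rate: float, refusal_share: float) -> float:
    """Nonresponse rate (percent), excluding lost movers, implied by a
    refusal rate (percent) and the refusal share of that nonresponse."""
    return refusal_rate / refusal_share


wave1_rate = nonresponse_excluding_movers(3.7, 0.76)    # about 4.9 percent
wave9_rate = nonresponse_excluding_movers(14.2, 0.94)   # about 15.1 percent

# Whatever is left of the 22.3 percent Wave 9 rate is attributed to
# households that moved to an undetermined location or out of the area.
lost_mover_rate = 22.3 - wave9_rate
lost_mover_share = lost_mover_rate / 22.3
```

On this reading, lost movers make up roughly a third of the Wave 9 loss, which agrees with the statement that they account for over one-fourth of it.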
PROFILE OF THE REFUSERS
In an attempt to improve our understanding of the reasons for
nonresponse, SIPP interviewers are asked to provide a detailed description
of each noninterviewed household encountered. For each noncontact or
refusal, interviewers fill out a form providing information on the type of
noninterview, the demographic characteristics of a refuser, the reason for
refusal, and information on the follow-up attempts. Because of the
longitudinal survey design, more than one form can be completed for each
household during the time it is in the survey, since a noninterview in one
wave can be revisited in the next wave and remain a noninterview. The
first data to be analyzed from these forms are refusals in Waves 1 through
6 of the 1984 Panel. Following is a demographic profile of these
households based on interviewer observed characteristics of the household
and the person who refused for the entire household.
Most refusals (about 80 percent) occurred in either central
city or suburban area households. Only around 20 percent of the
refusals occurred in rural area households. (See Table 1.)
Most refusals (approximately 73 percent) occurred in middle
income range households. (See Table 2.) NOTE: Interviewers were
asked to mark either high, middle, or low income (undefined in terms
of dollars) based on their own observation of the sample unit and its
location.
The average age of the person who refused household
participation was between 46 and 47. (See Table 3.)
More females (about 60 percent) refused household participation
than males. (See Table 4.)
Consistent with the population distribution, whites accounted
for the majority of household refusals, i.e., over 87 percent. (See
Table 5.)
TABLE 1. PERCENT DISTRIBUTION OF THE LOCATION OF THE
REFUSAL HOUSEHOLDS BY WAVE
                 % of                     Wave
Location        Sample*      1      2      3      4      5      6
Central City      28       40.2   38.3   36.1   37.3   39.9   44.1
Suburb            41       39.1   42.4   40.3   41.4   40.0   36.4
Rural             31       20.7   19.3   23.6   21.3   20.2   19.5
* The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data. The distributions of refusal households by wave are
based on interviewer-reported observations.
TABLE 2. PERCENT DISTRIBUTION OF INCOME LEVEL
OF REFUSAL HOUSEHOLDS
                           Wave
Income*        1      2      3      4      5      6
High         10.9   13.8   10.8    9.1    8.7    7.6
Middle       73.9   72.2   72.2   73.2   75.4   72.7
Low          15.2   14.0   17.1   17.7   15.9   19.7
* The level was self-defined by each interviewer and would not be
comparable to any reported income data based on the full sample.
TABLE 3. PERCENT DISTRIBUTION OF THE AGE CATEGORIES OF
RESPONDENTS REFUSING TO PARTICIPATE
                     % of                     Wave
Age Category        Sample*      1      2      3      4      5      6
Less than 20          29        0.2    0.2    0.7    0.8    0.4    0.5
20-29                 19        9.8   14.2   16.0   15.5   19.5   17.2
30-39                 15       28.6   20.2   19.9   22.8   20.5   27.5
40-49                 11       19.7   16.9   17.3   17.1   16.9   16.4
50-59                  9       17.2   16.4   16.2   16.3   17.6   17.5
60-69                  9       17.4   17.2   16.9   15.9   14.2   11.6
70 or older            8        7.0   15.1   13.1   11.7   10.7    9.5
Average age of
person refusing                42.6   49.4   48.1   47.4   46.8   44.9
* The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data. The distributions of refusal households by wave are
based on interviewer-reported observations.
TABLE 4. PERCENT DISTRIBUTION OF SEX OF PERSON
REFUSING TO PARTICIPATE
             % of                     Wave
Sex         Sample*      1      2      3      4      5      6
Male          48       44.4   41.0   39.9   39.9   39.2   36.8
Female        52       55.6   59.0   60.1   60.1   60.8   63.2
* The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data. The distributions of refusal households by wave are
based on interviewer-reported observations.
TABLE 5. PERCENT DISTRIBUTION OF RACE OF PERSON REFUSING
                     % of                     Wave
Race                Sample*      1      2      3      4      5      6
White                85.0      87.8   88.8   87.7   86.5   88.2   86.1
Black                12.0       7.9    9.9   10.7   11.3   10.1   12.6
American Indian       0.5        -     0.2     -     0.4    0.1     -
Asian                 2.5       1.0    0.9    1.4    1.2    1.1    0.8
Other                  -        0.5    0.2    0.2    0.5    0.3    0.3
Don't know             -        2.8     -      -      -     0.2    0.3
* The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data. The distributions of refusal households by wave are
based on interviewer-reported observations.
REASONS GIVEN FOR REFUSING
Information is also available on why a household refused to
participate in the survey during Waves 1-6 (1984 Panel). Only one reason
for refusing was coded per household even though multiple reasons may have
been given. The major reasons for refusing to be interviewed in Waves 1
and 2 were similar. (See Table 6.) Mainly, persons just "were not
interested in participating in the survey." This was reported 18.7 percent
of the time in Wave 1 and 13.2 percent of the time in Wave 2. The next
most frequently given reason was "too busy" (14.7 percent in Wave 1 and
13.3 percent in Wave 2). In both Waves 1 (9.9 percent) and 2 (12.8
percent), "invasion of privacy" was the third reason given for not
participating. "Voluntary survey" (9.3 percent) and "questions were too
personal" (9.1 percent) were next in Wave 1. These two reasons were not as
important in Wave 2 (6.8 percent and 3.2 percent respectively) as the fact
that the respondent had only reluctantly participated in Wave 1 (8.8
percent). Also, 6.2 percent of the people refused in Wave 2 because they
did not understand we would be returning.
The main reason for refusing to participate in Wave 3 changed from
Waves 1 and 2. (See Table 7.) The major reason cited in Wave 3 was that
"we answered the questions in earlier visits." This accounted for 24.1
percent of all reasons given in Wave 3. "Too busy" (7.6 percent) and "just
not interested in participating" (6.8 percent) were cited less frequently
than in Waves 1 and 2. In Wave 3, 6.1 percent of the households who had
participated earlier refused now because they felt the questions were too
personal. This is a larger percentage than was reported for this reason in
Wave 2 (3.2 percent). Another 6.1 percent who had reluctantly participated
earlier were lost in Wave 3. Also, almost 4 percent of the households
indicated that the "interview is too long."
In Waves 4, 5, and 6, the main reason for refusing continued to be
that "they had answered the questions in earlier waves." (See Table 7.)
The other reasons were in the same vein. In these waves, more people were
becoming angry and cited "harassment" as their reason for refusing to
participate. Also, people indicated they were "tired of all the visits and
that the survey goes on too long." They felt they should not have to keep
participating in the survey.
TABLE 6. WHY RESPONDENTS REFUSED TO PARTICIPATE IN WAVES 1 AND 2
                                                  Percent of Households
Reason Given                                        Wave 1     Wave 2
Not interested in participating                      18.7       13.2
No time, too busy                                    14.7       13.3
Invasion of privacy                                   9.9       12.8
Voluntary survey                                      9.3        6.8
Offended by income questions, too personal            9.1        3.2
Didn't believe information was confidential           3.4         -
All other reasons (e.g., Angry with government,
  Illness, No reason)                                34.9       35.7
Reluctantly agreed to participate in Wave 1,
  refused to participate in Wave 2                     -         8.8
Refused in Wave 2, didn't understand we
  would be back                                        -         6.2
Total                                               100.0      100.0
TABLE 7. WHY RESPONDENTS REFUSED TO PARTICIPATE
IN WAVES 3, 4, 5, AND 6
                                               Percent of Households
                                                       Wave
Reason                                         3       4       5       6
Answered in earlier waves                    24.1    28.7    29.5    25.4
No time, too busy                             7.6     6.4      -       -
Not interested                                6.8      -       -       -
Reluctant to participate earlier              6.1      -       -       -
Felt questions were too personal              6.1     4.4     3.9      -
Voluntary survey                              4.5     3.4     4.4      -
No change in household income                 4.4      -      3.2      -
Interview is too long                         3.9     3.8     3.2      -
Responding would cause family problems        3.0     3.7      -      3.1
Tired of being harassed, very angry            -      7.6     9.4     9.6
Tired of all the visits, survey too long       -      5.0     9.0    10.4
Confirmed refusal                              -       -      5.8    13.5
All other reasons                            33.5    37.0    31.6    38.0
Total                                       100.0   100.0   100.0   100.0
CONVERTING REFUSALS INTO INTERVIEWS
Considerable effort is spent trying to convert refusal households
into interviews. Usually, we send a letter from the regional office to the
respondent asking them to reconsider participation. We also attempt to
convert refusals by making follow-up visits when the circumstances seem to
indicate that we might be successful. Table 8 shows the number of refusal
households that received follow-up visits in the first six waves (1984
Panel). In Wave 1, approximately 85 percent of the refusal households had
at least one follow-up visit. This drops to 55 percent in Wave 6 because
there are more confirmed refusals which are not eligible to receive
follow-up visits. Once a household refuses to participate for two
consecutive waves, it becomes a confirmed refusal, and no additional
letters are sent or visits made to that household. Table 8 also shows the
number of households converted during follow-up. Around 30 percent of the
refusal households receiving follow-up visits were converted. This number
remained stable during the six waves.
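The confirmed-refusal rule described above amounts to a simple check on a household's wave-by-wave history. The sketch below is illustrative only: the outcome labels and function names are hypothetical, and the text does not say how a noncontact falling between two refusals is treated, so here only back-to-back refusals count.

```python
from typing import Sequence

# Hypothetical outcome labels, one entry per wave in wave order:
# "interview", "refusal", or "noncontact".


def is_confirmed_refusal(outcomes: Sequence[str]) -> bool:
    """True once a household has refused in two consecutive waves."""
    return any(
        earlier == "refusal" and later == "refusal"
        for earlier, later in zip(outcomes, outcomes[1:])
    )


def eligible_for_follow_up(outcomes: Sequence[str]) -> bool:
    """A household that refused in the current wave may receive a letter
    or follow-up visit unless it has become a confirmed refusal."""
    return (
        len(outcomes) > 0
        and outcomes[-1] == "refusal"
        and not is_confirmed_refusal(outcomes)
    )
```

For example, a history of interview, refusal is still eligible for follow-up, while refusal, refusal is not.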
TABLE 8. NUMBER OF REFUSAL HOUSEHOLDS THAT HAD FOLLOW-UP VISITS
                                               Wave
                                 1      2      3      4      5      6
Total Households Eligible
  for Follow-up Visit*          878    495    677    699    559    564
No follow-up visit              134     97    245    243    216    252
  Confirmed refusal              69     56    166    132    157    172
  Other reason                   65     41     79    111     59     80
Follow-up visit reported        744    398    432    456    343    312
  % of eligible households     84.7   80.4   63.8   65.2   61.4   55.3
Households converted            254    131    120    133    104     88
  % of visits reported         34.1   32.9   27.8   29.2   30.3   28.2
* The following number of households were also eligible for follow-up
visits but are excluded because information on the visits was missing:
Wave 1-125; Wave 2-60; Wave 3-226; Wave 4-294; Wave 5-357; Wave 6-208.
Wave 2 only had 3 rotation groups.
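The two percentage rows of Table 8 follow directly from the household counts. As a minimal sketch (the variable names and the pct helper are hypothetical), they can be recomputed as follows:

```python
# Household counts for Waves 1 through 6, taken from Table 8.
eligible = [878, 495, 677, 699, 559, 564]    # eligible for a follow-up visit
visited = [744, 398, 432, 456, 343, 312]     # follow-up visit reported
converted = [254, 131, 120, 133, 104, 88]    # households converted


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as in the table."""
    return round(100.0 * part / whole, 1)


# Share of eligible households that received at least one follow-up visit.
pct_visited = [pct(v, e) for v, e in zip(visited, eligible)]

# Share of visited households that were converted to an interview.
pct_converted = [pct(c, v) for c, v in zip(converted, visited)]
```

The conversion percentage is based on households visited, not on all eligible households, which is why it stays near 30 percent even as the share visited falls.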
The record of follow-up visits is shown in Table 9. In Wave 1,
29.0 percent of the 744 households visited were converted to an interview
during the first follow-up visit. Interviewers spent approximately 69
minutes trying to convert the household on this follow-up visit. This
includes travel to and from the household. Over half of these households
were converted by a supervisory field representative (SFR) who generally
has more experience and is considered to be a better interviewer. The same
interviewer that encountered the refusal was able to convert the interview
only 15.8 percent of the time.
After the first follow-up visit in Wave 1, 528 households had not
been converted. Of those 528 households, 99 were visited a second time.
Twenty-seven percent of the households visited a second time were
converted. The field staff spent 82 minutes on the second follow-up visit.
Very few households were visited a third time. Only three percent of the
households left to convert after the second visit were attempted a third
time, but a large percentage were converted.
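The Wave 1 sequence just described can be traced step by step. The sketch below uses the Wave 1 counts from Table 9 (households visited and households converted on each follow-up visit); the variable names are hypothetical.

```python
# (households visited, households converted) for the first, second, and
# third follow-up visits in Wave 1, from Table 9.
wave1_visits = [(744, 216), (99, 27), (15, 11)]

remaining = wave1_visits[0][0]   # households not yet converted
total_converted = 0
trace = []

for visited, converted in wave1_visits:
    trace.append({
        # share of still-unconverted households that were attempted
        "pct_attempted": round(100.0 * visited / remaining, 1),
        # share of visited households that were converted on this visit
        "pct_converted": round(100.0 * converted / visited, 1),
    })
    remaining -= converted
    total_converted += converted
    trace[-1]["left_to_convert"] = remaining
```

Run on these counts, the trace reproduces the 528 households left after the first visit and the roughly three percent attempted a third time, and the three visits together account for the Wave 1 conversions reported in Table 8.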
During the other five waves, at least 24 percent of all refusal
households visited were converted to an interview in the first follow-up
visit. The majority of the time, the SFR was the person that was able to
convert the refusal. The same interviewer that encountered the original
refusal was the next most successful in converting refusals. Very few
households were visited a third time.
TABLE 9. RECORD OF FOLLOW-UP VISITS
          Follow-up                HH's        %
          Visits        HH's     Converted  Converted
Wave 1    First          744        216        29.0
          Second          99         27        27.3
          Third           15         11        73.3
Wave 2    First          398        100        25.1
          Second          78         26        33.3
          Third            6          5        83.3
Wave 3    First          432        112        25.9
          Second          44          7        15.9
          Third            1          1       100.0
Wave 4    First          456        121        26.5
          Second          34         12        35.3
          Third            -          -          -
Wave 5    First          343         99        28.9
          Second          18          4        22.2
          Third            1          1       100.0
Wave 6    First          312         76        24.4
          Second          30          8        26.7
          Third            4          4       100.0
Person Completing the
Follow-up Conversion
Time
Spent RO Same Dif
(Min.) Staff SFR Int. Int.
69.4 4.2
82.7 18.5
85.1
54.9
21.2
34.3
59.9
45.3
45.0
40.5
46.4
37.8
27.3
40.0
18.8
70.4
30.0
4.0
7.7
5.4
14.3
2.5
8.3
1.0
25.0
2.6
51.9
33.3
90.9
66.0
61.5
60.0
57.1
71.4
100.0
59.6
50.0
15.8
18.5
14.0
15.2
14.3
57.0 8.3
41.7 25.0
73.7 7.9
50.0
50.0 50.0
12.0
11.1
9.1
6.0
3.8
14.0
8.3
11.1 6.1
Don't
Know
16.2
18.5
10.0
26.9
40.0
5.4 17.0
100.0
7.9
50.0
18.2
16.7
22.2
25.0
7.9
For the households converted in Wave 1, 22.7 percent were converted
because a new respondent agreed to be interviewed (Table 10). Getting a
different household member to participate was most successful in Wave 1.
In later waves, other reasons increased in importance relative to getting
another household member to participate in the survey (see reasons listed
in Section II of Table 10). Some examples of other reasons that were
important are:
The respondent
- reconsidered after reading the letter sent by the Regional Office,
- was convinced of the benefits of the survey,
- related more to an experienced interviewer/SFR, or
- was reached at a better time.
Also of interest is that by Wave 6, 15.4 percent of the interviews were
converted when we agreed to conduct a telephone interview instead of a
personal visit.
TABLE 10. REASON INTERVIEW CONVERTED
Wave
1 2
I. Why interview was given
Different household member 22.7 19.0
Other reason 77.3 81.0
II. Other Reason Given for Conversion
Convinced of survey's benefits
Read Regional Office letter
Experienced interviewer/SFR
Interviewer persistence
Convinced of confidentiality
Different interviewer
Answered only certain questions
Could not be reached earlier
Would only respond by phone
Convinced of legitimacy of survey
Encouraged by religious beliefs
Better, more convenient time
Translator used
Interviewed away from home
Convinced name, SSN not used
Proxy agreed to interview
One household member agreed, but
other members refused
Convinced this interview shorter
No reason given by respondent
Total
3
14.7
85.3
21.1
19.1
10.8
8.9
1.5
8.2
8.8
0.5
1.0
0.5
0.5
8.2
1.0
-
1.0
1.0
1.5
0.5
5.7
99.8
4
12.4
87.6
25.7
9.5
22.9
5.7
3.8
-
5.7
-
5.7
-
-
3.8
-
-
1.0
1.9
1.9
6.7
5.7
100.0
5
17.8
82.2
21.2
1.0
23.2
5.1
4.0
6.1
2.0
•
5.1
-
14.1
-
-
2.0
-
4.0
5.1
7_A
100.0
6
7.5
92.5
23.5
11.8
15.1
8.4
1.7
5.9
6.7
_
5.9
-
-
4.2
1.5
1.7
-
0.8
4.2
1.7
6.7
99.8
40.7
2.2
19.8
13.2
-
-
2.2
.
4.4
-
-
3.3
•.
-
-
-
3.3
8.8
2.2
100.1
GENERAL EFFORTS TO IMPROVE SIPP RESPONSE RATES
Other efforts are also made to maintain and improve the SIPP response
rate. First, we try to educate the interviewing staff regarding the
importance of the survey and the intended uses of the data. We believe that
if the interviewers understand the need for the U.S. Government to undertake
such an ambitious survey, they can convey the survey's importance to the
respondents. Second, we try to educate the respondents concerning the
importance of their continued participation in the survey. The respondents
need to understand why the survey design requires them to be interviewed every
4 months over a 2 1/2 year period.
Various papers and articles that have been published about the survey
are used in educating the interviewers and respondents. These include
newspaper articles, articles in the Census Bureau's Data Users News, Public
Information Office releases, and papers written by Census Bureau staff and
outside researchers. We have also developed a special four-page brochure for
the respondents entitled "SIPP DATA NEWS." This publication contains a brief
summary about the survey, explains why the information collected is so
important, and provides interesting SIPP data in graphic form with
nontechnical narrative. The DATA NEWS is updated every 4 months and is given
to each respondent at the beginning of each new wave of interviewing. In
February 1986, interviewers began distributing a portfolio in which the
respondents can store financial records and data for the next interview to
make it more readily available. This folder contains a two-year reference
calendar to remind respondents about the longitudinal nature of SIPP and a
copy of pamphlets such as "America's Fact Finder" and "USA Statistics in
Brief." Each regional office also can include other SIPP-related materials in
the portfolio, such as a personalized letter from the regional director and
copies of newspaper articles that contain SIPP data.
Our third approach involves improvements in interviewer training. For
example, more emphasis is placed on teaching the interviewers how to "sell"
themselves and the survey. We tell the interviewers that if they are
convinced of the importance of the survey, their attitude will be conveyed to
the respondent and will help to enlist cooperation. A checklist of important
things to do and remember is provided to each interviewer during their initial
training. (See Appendix 1.) Also, noninterview workshops are conducted at
the twice yearly interviewer training sessions to discuss the problem and what
interviewers are doing about it.
In some noninterview workshops, the interviewers conduct practice
interviews with another interviewer playing the part of a reluctant
respondent. They also watch videotapes showing simulated interviews with such
respondents. In addition, at some sessions we have had a SIPP data user tell
the interviewers how they use the data. We think this will help the
interviewers explain the data uses to respondents which may convince them to
participate. Also, the interviewers are taught to determine the specific
reason for the respondent's objection to participation and to tailor the
response accordingly. For example, if the respondent is in a low-income
household and feels that they do not earn enough income to make any
difference, we suggest pointing out that the survey also covers programs that
they may be participating in such as the school lunch or energy assistance
programs. One idea for gaining cooperation that has come out of such sessions
is to have the interviewer suggest conducting the interview on a "trial
basis." The respondent is told something like: "Could we just try a few
questions so you can see what the survey is like? You don't have to answer
any questions you consider to be too personal." Often a reluctant respondent
will permit the interview after seeing that it really is not so bad after all.
Other ideas are suggested in Appendix 2.
Our fourth approach focuses on offering respondents some form of
compensation or tangible incentive for participating. We originally proposed
a lottery in which respondents would receive a ticket each time they were
interviewed and extra tickets if they stayed in the survey until the last
interview. At the end of the Panel, we planned to hold a drawing with the
winners receiving prizes. However, the Federal Codes that govern the
activities of the Census Bureau prohibit the use of a lottery.
As an alternative, we suggested testing whether giving respondents a
small gift as a form of appreciation would help to motivate respondents to
cooperate. The first interview period of the 1987 Panel (February-May 1987)
was chosen for the experiment because Wave 1 of each panel has consistently
shown the highest rate of new noninterviews. The April rotation group
(approximately 2,900 households distributed nationally) received a small
hand-held, solar-powered calculator imprinted with the Census Bureau logo.
The other three rotations from Wave 1 (February, March, and May) did not
receive a gift and served as the control group. Rotations are convenient to
use as treatment and control groups since by design they contain a random
sample of approximately one-fourth of the entire sample of a panel. In
addition, because survey operations and controls are carried out by rotation,
it is most convenient operationally and least confusing to implement.
Comparisons between Wave 1 noninterview rates for gift recipient (April
rotation) and nonrecipient (February, March, and May rotations) households
were made at the national and regional office levels at the 10 percent level
of significance. Nationally, the noninterview rates for recipients were
significantly lower than for nonrecipients. Although not significant (except
for the Charlotte Regional Office), the rates for recipient households were
lower for 9 of the 12 regional offices, while they were higher for 3 of the
regional offices. However, compared with the projected rate for April
households in earlier panels, the difference is not significant.
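The paper does not say which test was used for these comparisons. One common choice for comparing two noninterview rates is a pooled two-proportion z-test, sketched below. The function name is hypothetical, and the noninterview counts in the usage lines are invented purely for illustration; only the approximate group sizes (about 2,900 April households and roughly three times as many in the other rotations) come from the text.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test.

    x1, x2 are the numbers of noninterviews and n1, n2 the numbers of
    households in the two groups. Returns (z, two-sided p-value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    variance = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
    if variance == 0.0:
        raise ValueError("test is undefined when the pooled rate is 0 or 1")
    z = (p1 - p2) / math.sqrt(variance)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts, NOT results from the experiment.
ALPHA = 0.10   # the 10 percent level of significance used in the comparisons
z, p_value = two_proportion_z_test(x1=174, n1=2900, x2=652, n2=8700)
recipients_significantly_different = p_value < ALPHA
```

A negative z here would mean a lower noninterview rate among gift recipients than among nonrecipients.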
We also asked the interviewers to provide comments on the effectiveness
of the gifts (8). We received comments from 352 interviewers which were
interesting and enlightening. The leading comment was that the calculators
were well-received and the respondents seemed to like them. Yet only 41 of
the interviewers believed that the gifts made it easier to obtain respondent
cooperation. More interviewers (65) claimed that obtaining cooperation is the
result of the interviewer's skill more than anything else. We think this is a
good indication of the level of confidence instilled in our interviewers.
The noninterview rates must be followed over the future waves of
interviewing for the 1987 Panel and compared with previous panels to answer
the question of the gifts' effectiveness. Even if statistical differences do not
exist for all waves examined individually, a consistent trend of lower rates
of increase in noninterview rates from wave to wave for recipient households
may indicate that a token gift can reduce household nonresponse.
-------
IN SUMMARY
The SIPP is an ambitious data collection effort that attempts to measure
extremely complex phenomena: detailed income and asset sources, program
participation, weekly labor force status, health, child care, and taxes. As
in all surveys, the quality of the data is of major concern. The conclusions
drawn from SIPP data are affected by both sampling and nonsampling errors.
This paper examines one of the major sources of nonsampling error: sample
loss through household nonresponse.
We are attempting to measure and understand sample loss in SIPP. The
main cause of this loss is refusal to participate. The reasons given most
frequently for refusing the initial interview are that the respondent just is
not interested or is too busy. We are hoping that a gift at the first visit
will overcome these feelings and persuade the respondent to participate. The
initial results of the gift experiment conducted during the first wave of the
1987 Panel are encouraging. If the gift succeeds in increasing participation
in SIPP over the life of the panel, we may adopt the practice in future panels
as well and hopefully reduce our nonresponse rates.
The work described in this paper was not funded by the U.S. Environmental
Protection Agency and, therefore, the contents do not necessarily reflect the
views of the Agency, and no official endorsement should be inferred.
-------
APPENDIX 1
INTERVIEWER CHECKLIST
1. I assume I AM going to get each interview.
2. I properly identify myself, using my official CENSUS
Identification Card.
3. I make sure the respondent has had an opportunity to read the SIPP
introductory letter.
4. I provide the respondent with a copy of the SIPP fact sheet if I
feel it will help convince a reluctant respondent to participate.
5. I make sure I explain the Bureau's policy on confidentiality, if
asked.
6. I am well-informed on the survey: its sponsorship, purposes,
benefits to the respondent, etc.
7. I am careful to point out those selling points that would appeal
to a particular respondent; e.g., to learn about the effects of
unemployment or the kinds of help that are needed to assist a
person with low income.
8. I maintain a professional, businesslike attitude; I do not get
angry or discuss politics or government policy.
9. I try to remember something about each respondent to show my
interest at subsequent interviews.
10. I am familiar with survey forms and procedures so that I can
conduct an interview as quickly and efficiently as possible.
11. I probe sufficiently so that whenever necessary I can avoid
calling a respondent for additional information.
12. I make sure the respondent knows I will do everything possible to
help them in participating in the SIPP Survey (e.g., make visits
at times convenient to respondent).
-------
APPENDIX 2
TECHNIQUES FOR GAINING COOPERATION
The following techniques were suggested by Census Bureau interviewers as
possible approaches for gaining cooperation in our surveys.
1. Develop an opening line that you're comfortable with, for example, "Are
you eating?" or "Did I come at a bad time?"
2. Always smile, be friendly, be positive, be enthusiastic, expect to
succeed.
3. Deal with (answer) the person's questions but get them started with the
interview. Move into it as quickly as possible. Don't let them
sidetrack you from your goal of starting the interview.
4. Defuse hostile comments by using neutral responses, such as "Uh-huh," or
"I understand."
5. Empathize with the person. Apply the golden rule, i.e., treat the
person the way you would like to be treated.
6. Find something to admire in the person's home or on the property.
7. Maintain eye contact.
8. Hold your Identification Card where it can be seen easily.
9. Don't let the respondent intimidate you.
10. Dress appropriately for the neighborhood.
11. Know how the data you collect are used and be prepared for questions.
12. Never take negative comments personally.
13. Stress that we will bend over backwards to meet the needs of the
respondent. For example, we'll do the interview at their convenience,
help out in any way we can, talk fast, etc.
14. Use newspaper items or other clippings to illustrate use of the data.
Generate your own materials so that they have a local flavor or reflect
recent news articles.
-------
REFERENCES
1. Moser, C.A. and Kalton, G. Survey Methods in Social Investigation.
Basic Books, Inc., New York, 1972. p. 171, and Hoinville, G. and Jowell,
R. et al. Survey Research Practice. Heinemann Educational Books,
London, 1978. p. 6.
2. Stephan, F. F. and McCarthy, P. J. Sampling Opinions: An Analysis of
Survey Procedure. John Wiley and Sons, Inc., New York, 1958. p. 261.
3. Sheatsley, P.B. The Harassed Respondent: Interviewing Practices.
Paper presented at the Market Research Council Conference, October 15,
1965.
4. Ibid 2, p. 265.
5. Durbin, J. and Stuart, A. Differences in Response Rates of Experienced
and Inexperienced Interviewers. Journal of the Royal Statistical
Society. A, 114, 1951. pp. 163-205.
6. Nelson, D., McMillen, D., Kasprzyk, D. An Overview of the Survey of
Income and Program Participation: Update 1. SIPP Working Paper Series
No. 8401, U.S. Bureau of the Census, Washington, D.C., 1985.
7. Nelson, D., Bowie, C., and Walker, A. Survey of Income and Program
Participation (SIPP) Sample Loss and Efforts to Reduce It. In:
Proceedings of the Bureau of the Census Third Annual Research
Conference. Baltimore, Maryland, 1987. pp.629-643.
8. Census Bureau memorandum from D. Jackson to C. Bowie, "Interviewer's
Evaluation of the Gift Experiment," August 5, 1987.
-------
PRINCIPLES OF QUESTIONNAIRE DESIGN AND
METHODS OF ADMINISTRATION
by: Wendy Visscher
Roy W. Whitmore
Research Triangle Institute
Research Triangle Park, NC 27709
Mel Kollander
F. Cecil Brenner
U. S. Environmental Protection Agency
Washington, D.C. 20460
ABSTRACT
This paper discusses the basic process of questionnaire development:
(1) determining the data requirements of the survey; (2) framing the
questions and formatting the questionnaire; and (3) iteratively testing and
revising the questionnaire. Methods of questionnaire administration are
discussed in the context of their effect on questionnaire development.
Guidance is provided for formatting individual questions and the total
questionnaire so that the data collection instrument measures the
attributes of interest as accurately as possible.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
-------
PRINCIPLES OF QUESTIONNAIRE DESIGN AND METHODS OF ADMINISTRATION
INTRODUCTION
The design of a valid questionnaire is a scientific endeavor and
often a complex task. Survey specialists trained in the principles of
questionnaire design should be consulted during the development of the
questionnaire. A questionnaire is a measurement instrument analogous to
other scientific tools, such as air monitoring devices. Great care must be
taken in constructing the questionnaire, and it must be comparable in
validity and reliability to precise scientific instruments. This paper
provides a brief introduction to the process and principles of
questionnaire development.
To construct an appropriate questionnaire, researchers must:
1) determine the specific data output requirements of the study; 2) frame
and organize the questions; and 3) iteratively test and revise the
questionnaire. The individual questions must be carefully worded and the
questionnaire organized in a meaningful format. The study sponsor must
decide, based on specific needs and resources, how the questionnaire should
be administered to the respondents: in face-to-face interviews, by
telephone, by mail, or some combination thereof.
These issues and others associated with planning, designing, and
managing the conduct of a survey are discussed in the Survey Management
Handbook published by the U.S. EPA.
DEFINING STUDY OBJECTIVES AND FRAMING THE QUESTIONS
The first step in the process of questionnaire development is
requesting the study sponsor to define the specific information needs of
the study and the ultimate uses of the information. The sponsor should
elaborate by translating these needs into specific data requirements.
These statements of purpose and data requirements constitute the primary
contents of the analysis plan.
A thorough background search of what has been done before and what
types of questions have worked best in previous studies should be conducted
(3). Survey specialists should investigate the study topic to ensure that
all relevant data are collected and to decide which questions should be
asked and how they should be framed. The questionnaire can then be built
around the specific needs of the study.
In deciding how to frame a question, survey researchers should
investigate the cognitive processes that a respondent must typically
perform to answer the question (13). The cognitive processes involved in
answering a question include the respondent's understanding of the
question, recall of information, and formulation of a response. These
processes are often studied in controlled settings. The understanding
gained about the respondent's thought processes as they respond to
questions can be used to draft questions that are easier to answer and
result in more accurate data.
FORM OF THE QUESTIONNAIRE
The questionnaire should be formatted so that it can be completed
reliably and with minimal burden. The language should be friendly, and the
questions should flow naturally. The order of the questions is an
important consideration because the meaning of a question is often affected
by the questions that precede it (1). Items about similar topics are
particularly likely to influence each other (5). Finally, the language,
format, and flow of the questionnaire must effectively create rapport with
the respondent to ensure a successfully completed interview. For this
reason, difficult or sensitive questions should be delayed until late in
the interview after rapport has been established (12).
A well-designed interview for environmental studies generally
consists of four parts: an appropriate introduction, a few "warm-up"
questions, the body of the questionnaire, and, if appropriate, demographic
questions (3). The introduction should clearly state who is sponsoring or
conducting the study, the study objectives, and the importance of the
respondent's participation. An informed consent procedure may also be
included as part of the introduction in which the risks, benefits, and the
voluntary nature of the study are explained.
The initial questions should consist of a few simple, non-threatening
items which build interest and rapport and orient the respondent to the
topic before the main study questions begin. These initial questions can
influence whether or not the respondent decides to complete the remainder
of the questions (6). Sometimes, these initial questions are used for
screening to determine whether the person is eligible for the study. For
example, in a study of residential radon exposure, a sponsor may only be
interested in owner-occupied dwellings, and the screening questions would
be used to determine if the potential respondent is the owner of the house
before proceeding with the interview.
The body of the questionnaire contains the critical study questions
and is used to obtain the bulk of the study data. When possible, questions
which are of greatest interest to respondents should be asked early (4).
Finally, a portion of the questionnaire can be used to obtain demographic
characteristics (e.g., age, sex, education), when appropriate, to enable
the researcher to describe the study population. Transitions between the
sections of a questionnaire should be smooth and unobtrusive. However, if
an abrupt change in question types occurs (e.g., household-level versus
personal), the new section should be introduced with a short explanation.
LENGTH OF THE QUESTIONNAIRE AND QUESTIONS
The survey specialist should give considerable thought to which
variables are really important to avoid including nonessential items on the
questionnaire (12). The length of the questionnaire and of the individual
questions depends on the interview mode and the difficulty of the study
topic (3). Very lengthy questionnaires are generally very difficult for
both the interviewer and the respondent and can lead to respondent fatigue
and incomplete interviews, resulting in poor data quality.
Short questions are generally preferred over longer ones. If a
respondent is forced to listen to or read a very long question, especially
concerning a subject about which he knows very little, he is likely to
become uninterested. Long questions which require excessive thought or
are too difficult may result in unreasonable respondent burden (1). Short
questions are particularly effective when the interviewer reads them
slowly. This allows the respondent time to think about his responses but
doesn't overburden him with too much information. However, longer
questions may be more effective for memory questions because additional
clues can be provided to stimulate recall (11). Another approach is to ask
a series of lead-in questions to provide these memory clues. This reduces
the need for longer questions, often with only slight increases in the
overall length of the questionnaire.
TYPES OF QUESTIONS
A number of different kinds of questions can be used in a
questionnaire, including questions of fact, opinion, attitude, information,
and self-perception (3). Different types of questions are appropriate in
different studies, and questions of fact are probably the most useful for
environmental studies. Questions of fact are used to obtain information
about the respondents themselves or about measurable or observable
phenomena about which the respondents have knowledge. Examples of
questions of fact might include those about the respondent's smoking habits,
activity patterns, ventilation practices, use of pesticides, or whether his
house has a basement or a crawl space.
OPEN-ENDED VERSUS CLOSED-ENDED QUESTIONS
The types of questions used on questionnaires can be dichotomized as
closed-ended and open-ended. Closed-ended questions are designed to elicit
specific responses by providing a list of all possible responses to the
question and by having the respondent choose a response from this list.
Open-ended questions, on the other hand, are not restrictive, and the
respondent is allowed to answer freely. The survey specialist must decide
which type of question best fits the needs of the study.
Closed-ended questions are preferred for categorical variables. They
are very easy and fast to administer. Since they are often precoded on the
questionnaire (e.g., the interviewer circles the code associated with the
response), data processing is also fast and efficient. Closed-ended
questions are generally dichotomous (e.g., yes or no) or multiple choice
with several fixed alternatives (e.g., "Which of the following do you use
in your home: window fan, ceiling fan, circulating fan?"). The categories
used for closed-ended questions should be appropriate for the study
population. In fact, the major disadvantage of closed-ended questions is
that respondents are forced to classify themselves into categories that are
based on the researcher's frame of reference (5). Any elaboration that the
respondent may choose to give is essentially lost in a closed-ended
question.
Open-ended questions are sometimes appropriate for quantitative items
to be analyzed as continuous variables and for qualitative items for which
it is not practical to list all possible responses. Examples of
appropriate quantitative items are age of residence and length of daily
commute. Open-ended questions are often used for qualitative items in
"exploratory" studies in which the researcher has very limited knowledge of
the subject and doesn't know what responses to expect. Since open-ended
questions require that the interviewer record the responses verbatim, they
are time-consuming to administer, and a limited number of questions can be
asked of the respondents before they tire. Open-ended questions also
require more data processing time, since responses must be grouped for
analysis and codes must be assigned. This post-interview categorization
invariably leads to a loss of specificity.
Open-ended questions allow a researcher to learn more about the
respondent's opinions, feelings, and reasons for giving a specific
response. However, because of this freedom, open-ended questions may
elicit irrelevant information, and there is generally wide variability in
the amount of detail provided by different respondents (12).
Open questions can be used in pretest questionnaires, allowing more
meaningful categories to be developed for use with closed questions on the
final questionnaire (5). Each type of question has advantages and
disadvantages, and there are situations to which each is particularly well
suited. These are summarized in Figure 1.
DESIGNING QUESTIONS
Simplicity is almost always preferred in a good questionnaire.
Survey specialists must always guard against making the questions too
complicated (1). Each question should be specific and communicate the same
meaning to all respondents (6).
CLOSED-ENDED QUESTIONS
1. Fixed response alternatives
2. Categorical variables
3. Fast, easy administration
4. Pre-interview coding
5. Forces respondent to classify
him/herself into preset category
6. May result in incomplete data
if all potential responses
are not known
OPEN-ENDED QUESTIONS
1. Free response
2. Continuous variables
3. Administration slowed by
recording of verbatim
responses
4. Post-interview coding
5. Allows respondent to form
a more appropriate category
6. Respondents can assist in
creation of response list
for future studies
Figure 1. Advantages of Closed-ended vs. Open-ended Questions
Colloquial English should be used whenever possible. Whether the
respondent understands the question is more important than whether the
question is grammatically correct. The use of slang should be avoided,
however, because different respondents may have different reactions to or
interpretations of this kind of language.
Respondents may give incorrect answers to questions which include
words that have no meaning for them. The survey specialist strives to
choose words that have the same meaning to all respondents, regardless of
educational experiences or cultural background. One should not assume that
respondents who do not know a word will respond in a specific way.
Questions should be neutral and should not impart the researcher's biases.
Neutrality is violated if the question is written in such a way that the
respondent is influenced to choose a particular answer. Respondents can be
influenced in this manner if they are offered unfair alternatives or if one
response is made to seem less desirable than another response, perhaps by
the use of emotionally charged words or stereotypes with negative
connotations.
Ambiguity or vagueness must be avoided. If the respondents do not
understand a question, their responses are not likely to be accurate. This
lack of understanding may be the result of an incomplete question (e.g.,
the question is not self-explanatory and not enough information is given);
a question that is indefinite in time (the respondent may not know what the
researcher means by "frequently," "often," or "usually"); or a question
that includes indefinite comparisons or imprecise categories (e.g., the use
of "below average," "average," and "above average") (8). The survey
specialist should avoid using more than one subject in a question, since
the respondent may have different thoughts about each (e.g., "Do you
consider residential exposure to radon and carbon monoxide a health
hazard?"). A respondent may also be confused by double negatives in a
question (e.g., "Do you agree or disagree that radon is not a health
hazard?") (3).
The sponsor is invariably much more interested in and knowledgeable
about the survey subject than the respondents and may tend to prepare
questions that the respondent does not understand. Survey sponsors are
specialists educated in their field and tend to use long words and jargon.
This elevated language should be avoided (e.g., use the word "feeling,"
not "intuitive," and "clear," not "intelligible"). The fact that the
questions make sense and are important to the sponsor does not mean they
will be understood by the respondent. A survey specialist is needed to
translate the sponsor's data needs into questions that will reliably obtain
the desired information.
PRETESTING A QUESTIONNAIRE
A careful pretest of a draft questionnaire is an integral part of the
questionnaire design process. The pretest should be conducted among
persons who are as similar as possible to the target population subjects.
A small pretest consisting of 12-25 interviews should reveal the major
weaknesses of the questionnaire (12). The researcher should know after the
pretest has been completed whether the questions had the same meaning to
all respondents and if the difficulty level and length of the questionnaire
are appropriate. Focus groups can also be a useful part of the evaluation
of questionnaires and survey procedures. Revisions can then be made to
produce a final questionnaire that will yield relevant and accurate data.
COMPARISON OF QUESTIONNAIRE ADMINISTRATION METHODS
There are three modes of administration for a questionnaire:
face-to-face interviews, telephone interviews, and self-administered
interviews. Many studies use a mixed-mode approach (e.g., screening
potential respondents by phone, then doing face-to-face interviews with
eligible persons). The specific needs of the study will dictate which mode
is most appropriate. The relative advantages and disadvantages of the
methods are discussed below.
Face-to-face interviewing is often efficient for environmental field
studies because the environmental measurements require face-to-face
contacts. Interviewers are recruited and trained to establish rapport with
respondents and complete interviews in a standardized fashion.
Face-to-face interviews generally result in higher response rates than
other modes of interviewing, and they are more flexible than mail
questionnaires because the interviewer can explain questions that the
respondent would not otherwise understand. The interviewer also has
control over the sequence of questions and selection of the respondent.
However, the reading of the questions and their explanations must be
consistent across interviewers.
Telephone interviews can be less expensive
than face-to-face interviews and often result in higher response rates than
mail surveys. Telephone interviews can be efficient and cost-effective for
surveys that must cover large geographic areas or be completed in a short
period of time, and, like face-to-face interviews, they are also more
flexible than mail questionnaires. Another advantage of telephone
interviews is that the interviewers typically conduct their interviews from
a central location. This allows close supervision of the interviewers and
a greater capacity for standardization and quality control. Telephone
interviewing also has certain disadvantages. The interviewer may have more
difficulty establishing rapport with respondents and recruiting them into
the study by phone than face-to-face because the respondents may doubt the
legitimacy of the telephone call. Also, a telephone interviewer has
difficulty judging whether the respondent really understands the questions,
and the respondent can stop the interview at any time simply by hanging up
the phone. Finally, telephone interviewing does not cover the
approximately 7% of U.S. households that do not have telephones. Since
this segment of the population is generally poor or ethnic, this
noncoverage can bias the results of a study (4).
A mail survey can be less expensive than either face-to-face
interviews or a telephone survey because postage is less expensive than an
interviewer's salary, resulting in a smaller cost per case. This allows
the sponsor to increase the sample size of the study without increasing the
cost above that of other modes. A mail survey may be preferred if the
respondent needs time to obtain the information requested. A respondent
may be more likely to answer sensitive questions on a mail questionnaire
than during a telephone or face-to-face interview (12).
The major disadvantage of mail questionnaires is that they often
result in poor response rates, which can invalidate population inferences.
Multiple mailings plus telephone or face-to-face follow-ups are usually
needed to get acceptable response rates. Even so, typical response rates
are 40% for a first mailing and 60% after three mailings (4).
Nonrespondents are often less educated or less interested in the study
subject than respondents or may differ in other ways that can bias the
results of the study.
However, if a mail survey is well-designed and extensive follow-up
procedures are followed, response rates nearly as high as those obtained
with face-to-face and telephone interviews are possible (12). Dillman
describes a "Total Design Method" which has resulted in response rates of
60% to over 90% in mail surveys of heterogeneous populations (10). The
Total Design Method incorporates careful study design and intensive
follow-up with specific techniques which tend to maximize response (e.g.,
specially-formatted questionnaires, multiple mailings, telephone and
face-to-face follow-up of nonrespondents, and various forms of personalized
communication). These specialized procedures and intensive follow-up
efforts, although very effective at increasing response rates, also
increase the cost of the study.
A self-administered questionnaire must be simple and easy-to-follow,
since the respondent will not have the assistance of an interviewer when
completing it. However, this inflexibility also prevents the interviewer
from introducing bias by selectively probing certain questions or
respondents. A mail survey is not appropriate if spontaneous responses are
desired or if the sponsor wants to make sure the respondent completes the
questionnaire without the assistance of other household members. The
success of a mail survey is also dependent on the literacy level of the
study population (9).
In summary, the appropriate mode of questionnaire administration
depends on the nature of the study and the resources (both time and money)
available to the sponsor. The choice of administration mode is a complex
decision, and a mixed-mode approach is often appropriate.
SUMMARY
A well designed and fully tested questionnaire is an essential part
of a successful survey. After a careful background search, the survey
specialist should know what information will be useful and should be
collected. The sponsor must decide, based on cost and the type and
complexity of the data to be obtained, whether to use a face-to-face
interview, a telephone interview, or a mail survey. A draft questionnaire
can then be developed using precoded closed questions or carefully worded
open questions. Special care must be taken during the design of the
questionnaire to ensure that the respondent will understand the questions,
that the language used is non-threatening and unambiguous, and that the
questionnaire is not too long or difficult. A pretest should be done to
further refine the questions before the study begins. The final
questionnaire that is used for the actual collection of the data should
produce the most accurate and useful data possible.
REFERENCES
1. Converse, J.M., Presser, S.: Survey Questions (Sage Publications, Inc.,
Beverly Hills, CA, 1986).
2. Wallace, L.A.: The Total Exposure Assessment Methodology (TEAM
Study): Summary and Analysis: Volume I. (Office of Research and
Development, U.S. Environmental Protection Agency, Washington, D.C.,
1987).
3. Backstrom, C.H., Hursh, G.D.: Survey Research (Northwestern
University Press, Chicago, IL, 1963).
4. Kelsey, J.L., Thompson, W.O., Evans, A.S.: Methods in Observational
Epidemiology (Oxford University Press, New York, NY, 1986).
5. Schuman, H., Presser, S.: Questions and Answers in Attitude Surveys
(Academic Press, New York, NY, 1981).
6. Berdie, D.R., Anderson, J.F., Niebuhr, M.A.: Questionnaires: Design and
Use, Second Edition (The Scarecrow Press, Metuchen, NJ, 1986).
7. Belson, W.A.: The Design and Understanding of Survey Questions (Gower
Publishing Co. Ltd., Aldershot, Hants, England, 1981).
8. Bradburn, N.M., Sudman, S.: Improving Interview Method and
Questionnaire Design (Jossey-Bass Publishers, Washington, D.C.,
1979).
9. Fowler, F.J.: Survey Research Methods (Sage Publications, Beverly
Hills, CA, 1984).
10. Dillman, D.A.: Mail and Telephone Surveys: The Total Design Method.
(John Wiley, New York, NY, 1978).
11. Wright, T.: Statistical Methods and the Improvement of Data Quality
(Academic Press, Inc., New York, NY, 1983).
12. Rossi, P.H., Wright, J.D., Anderson, A.B., editors: Handbook of
Survey Research (Academic Press, Inc., New York, NY, 1983).
13. Jabine, T.B., Straf, M., Tanur, J.M., Tourangeau, R.: Cognitive
Aspects of Survey Methodology: Building a Bridge Between Disciplines
(National Academy Press, Washington, D.C., 1984).
14. U.S. Environmental Protection Agency: Survey Management Handbook
(Office of Policy, Planning and Evaluation, Washington, D.C., 1983).
-------
ESTIMATION OF MICROENVIRONMENT CONCENTRATION DISTRIBUTION
USING INTEGRATED EXPOSURE MEASUREMENTS
by: Naihua Duan
RAND Corporation & UCLA
ABSTRACT
Several methods are proposed to estimate the distributions of pollutant concentrations
in microenvironments using integrated exposure measurements. For the mean concentration in
each environment, we propose a regression estimate based on a linear model with the intercept
suppressed. For the variances and higher cumulants, we propose several regression estimates
based on regressing powers of residuals from the first linear model. We also discuss a general
deconvolution approach which can be used to estimate the entire distribution.
1. Introduction
In applying the indirect approach to assess human exposure to air pollution, it is
necessary to first estimate the distribution of microenvironment concentrations and the
distribution of activity pattern. We can then combine the two estimated distributions to
estimate the distribution of the integrated exposure. We assume that we have conducted
an activity pattern survey from which we can estimate the distribution of activity patterns.
We are then left with the task of estimating the distribution of microenvironment concen-
trations.
For some pollutants such as CO, there are reliable personal monitors which give con-
tinuous measurements of the pollutant concentration. If we conduct personal monitoring
using such continuous personal monitors and record activity patterns at the same time, we
can determine the microenvironment concentrations and estimate the distribution for the
microenvironment concentrations. This was done, for example, in the Washington-Denver
CO studies.
For many pollutants such as NO2 and VOC, we don't have reliable continuous
personal monitors, and have to rely on integrated exposure measurements. There is no obvious
way to determine the microenvironment concentrations from integrated exposure measure-
ments. In this paper, we describe a method which can be used to estimate the distribution
of microenvironment concentrations from integrated exposure measurements and activity
patterns. We can then use the estimated microenvironment concentration distribution in
the indirect approach and estimate the distribution of integrated exposures for another
sample of human subjects for whom we have an activity pattern survey.
In the rest of this section we review the indirect approach and methods that can be
used to estimate the distribution of integrated exposures. In Section 2 we discuss regression
methods which can be used to estimate the means, the variances, and the covariances
for microenvironment concentrations. In Section 3 we sketch a method to estimate the
microenvironment concentration distribution using characteristic functions. In Section 4
we discuss the limitations of the modelling assumptions used in this paper and propose
several directions for future research.
1.1. Review of the Indirect Approach
We assume for now that our main interest is in an individual's integrated exposure1.
1 It is possible to extend the indirect approach and the microenvironment decomposi-
tion (1.1) to deal with other exposure measures such as the peak concentration encountered
during a given time period.
The indirect approach is based on the following microenvironment decomposition of the
integrated exposure:
Y = C'T = \sum_{k=1}^{K} C_k T_k,    (1.1)
where Y denotes an individual's integrated exposure to a given airborne pollutant during
a given time period, say, a 24-hour period. C_k denotes the average pollutant concentration
encountered by this individual during this time period while he is in the k-th
microenvironment; we will refer to the vector C' = (C_1, ..., C_K) as the microenvironment
concentrations. T_k denotes the amount of time this individual spent in the k-th
microenvironment during this time period; we will refer to the vector T' = (T_1, ..., T_K) as
the activity pattern. Both
microenvironment concentrations and the activity pattern can vary from individual to in-
dividual, and from time period to time period. We will refer to each combination of an
individual and a time period as a sampling unit.
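As a minimal sketch of the decomposition (1.1) for a single sampling unit, the Python fragment below multiplies each microenvironment concentration by the time spent there and sums the products. The function name, the choice of three microenvironments, and all numeric values are hypothetical and serve only as an illustration.

```python
from typing import Sequence


def integrated_exposure(conc: Sequence[float], time: Sequence[float]) -> float:
    """Return Y = sum over k of C_k * T_k for one sampling unit, as in (1.1)."""
    if len(conc) != len(time):
        raise ValueError("need one concentration and one time per microenvironment")
    return sum(c_k * t_k for c_k, t_k in zip(conc, time))


# Hypothetical sampling unit with K = 3 microenvironments
# (home, work, in transit); the values are made up for illustration.
concentrations = [2.0, 1.0, 8.0]  # average concentration in each microenvironment
hours = [14.0, 8.0, 2.0]          # time spent in each; sums to a 24-hour period

exposure = integrated_exposure(concentrations, hours)  # concentration x hours
```

Note that a short stay in a high-concentration microenvironment (here, two hours in transit) can contribute as much to Y as a long stay in a low-concentration one.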
We assume in the microenvironment decomposition that the totality of all possible
locales and activities that the individual can engage in has been stratified into K microen-
vironments.2 Duan (1980) developed a criterion for identifying the stratification scheme
which can be used to improve the precision of the estimated average exposure. The cri-
terion was applied in Duan (1985) to identify the important microenvironments for CO
exposures.
In order to apply the indirect approach, we need to first estimate the
microenvironment concentration distribution
F_C(c) = P(C_1 < c_1, ..., C_K < c_K)    (1.2)
and the activity pattern distribution
F_T(t) = P(T_1 < t_1, ..., T_K < t_K).    (1.3)
If we have an enhanced personal monitoring study, such as the Washington-Denver CO
studies, which provides continuous measurements of concentrations and also records
activity patterns, we can determine the microenvironment concentrations for each sampling
unit and estimate the above distributions, using the corresponding sample distributions,
or some suitable parametric estimates. If the personal monitoring study only provides
integrated exposure measurements, such as the TEAM studies, there is no obvious way to
determine the microenvironment concentrations. In Sections 2 and 3 we describe methods
which can be used to estimate the microenvironment concentration distribution F_C(c) in
(1.2) in those situations.
Once we have estimated the distributions F_C(c) and F_T(t), we can combine the two
distributions to estimate the distribution of integrated exposures,
F_Y(y) = P(Y < y).    (1.4)
2 Duan (1980, 1982, 1985) used the term microenvironment for an individual locale
or activity, and used the term microenvironment type for a collection of similar microen-
vironments. In accordance with the more prevalent use of the terminology, the term
microenvironment is used in this paper to refer to a collection of locales or activities.
We discuss methods for doing this in the next subsection.
1.2. Exposure Distribution
There are two major methods for estimating the exposure distribution F_Y(y) from the
microenvironment concentration distribution F_C(c) and the activity pattern distribution
F_T(t), namely, the Cartesianization method3 (Duan 1980, 1985, 1987) and SHAPE (Ott
1981).
The two methods are similar in nature; both models require certain independence
assumptions.4 The Cartesianization method assumes that the microenvironment concen-
trations are stochastically independent of activity patterns. This assumption is equivalent
to
F_{C,T}(c,t) = F_C(c) F_T(t),    (1.5)
where F_{C,T} denotes the joint distribution of C and T. In other words, the joint distribution
for C and T is given by the Cartesian product of the respective marginal distributions.
The distribution for integrated exposure is then given by
F_Y(y) = \int\int 1(c't < y) \, dF_C(c) \, dF_T(t),    (1.6)
where 1(c't < y) denotes the indicator function for the set of (c,t)'s which satisfy the
inequality c't = \sum_{k=1}^{K} c_k t_k < y.
Given appropriate estimates for the microenvironment concentration distribution and
the activity pattern distribution, say, \hat{F}_C(c) and \hat{F}_T(t), we can estimate the joint
distribution F_{C,T}(c,t) by the Cartesian product of the estimated marginal distributions, then
estimate the exposure distribution by
\hat{F}_Y(y) = \int\int 1(c't < y) \, d\hat{F}_C(c) \, d\hat{F}_T(t).    (1.7)
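One way to read this estimate, assuming both marginal distributions are estimated by their sample distributions: the double integral then reduces to the share of all pairings of an observed concentration vector with an observed activity pattern for which c't falls below y. The Python sketch below illustrates that special case only; the function name and the small data sets are hypothetical.

```python
from itertools import product
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def cartesianized_cdf(conc_sample: Sequence[Vector],
                      time_sample: Sequence[Vector],
                      y: float) -> float:
    """Share of all (c, t) pairings with c't < y.

    Every concentration vector is paired with every activity pattern,
    which is the Cartesian product of the two sample distributions.
    """
    pairs = list(product(conc_sample, time_sample))
    below = sum(
        1 for c, t in pairs
        if sum(c_k * t_k for c_k, t_k in zip(c, t)) < y
    )
    return below / len(pairs)


# Hypothetical samples with K = 2 microenvironments (made-up values).
conc_sample = [(1.0, 4.0), (2.0, 6.0)]                   # vectors (C_1, C_2)
time_sample = [(20.0, 4.0), (22.0, 2.0), (16.0, 8.0)]    # vectors (T_1, T_2), hours

share_below_50 = cartesianized_cdf(conc_sample, time_sample, 50.0)
```

The concentration and activity samples need not come from the same sampling units, which is what allows concentration estimates from one study to be combined with an activity pattern survey from another.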
SHAPE employs a different set of independence assumptions. SHAPE decomposes
microenvironment concentrations5 into short term averages, say, minute averages:
C_k = \sum_{s=1}^{T_k} b_k(s) / T_k,    (1.7)
3 Duan (1980,1985) used the term "convolution method." The method was generalized
to a broader context in Duan (1987) and renamed as the Cartesianization method. The
essence of the method is to take the Cartesian product between the microenvironment
concentration distribution and the activity pattern distribution.
4 This assumption might be rather restrictive. Further discussions are given in Section
4.
5 SHAPE also subtracts background ambient concentrations from the microenviron-
ment concentrations. The same can be done for the Cartesianization method. Alterna-
tively, we can implement both SHAPE and the Cartesianization method without sub-
tracting the background concentrations and deal with the background concentrations as a
source of covariance among the microenvironment concentrations.
where b_k(s) denotes the average concentration in the k-th microenvironment during the
s-th minute. SHAPE assumes that the minute averages b_k(s) for the same microenvironment
have the same distribution, and that all minute averages are stochastically independent:
b_k(s) \sim G_k,  s = 1, ..., T_k,  k = 1, ..., K,    (1.8)
where G_k denotes the distribution for minute averages in the k-th microenvironment.6
We then generate independent random samples for the minute averages, compute the
microenvironment concentrations according to (1.7),7 compute the integrated exposures
according to (1.1), then estimate the exposure distribution.
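The simulation steps just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the SHAPE program itself: the lognormal form of each G_k, its parameters, the activity patterns, and all function names are hypothetical, and activity times are taken to be whole minutes.

```python
import random
from typing import Callable, List, Sequence

Sampler = Callable[[random.Random], float]


def shape_exposure(minutes: Sequence[int], samplers: Sequence[Sampler],
                   rng: random.Random) -> float:
    """Simulate one integrated exposure Y from independent minute averages."""
    exposure = 0.0
    for t_k, draw in zip(minutes, samplers):
        if t_k == 0:
            continue  # microenvironment not visited during this period
        minute_avgs = [draw(rng) for _ in range(t_k)]  # b_k(1), ..., b_k(T_k)
        c_k = sum(minute_avgs) / t_k                   # microenvironment concentration
        exposure += c_k * t_k                          # contribution C_k * T_k to Y
    return exposure


def empirical_cdf(values: Sequence[float], y: float) -> float:
    """Share of simulated exposures below y."""
    return sum(1 for v in values if v < y) / len(values)


# Hypothetical G_k: lognormal minute averages for "home" and "in transit".
samplers: List[Sampler] = [
    lambda rng: rng.lognormvariate(0.0, 0.5),
    lambda rng: rng.lognormvariate(1.5, 0.8),
]
# Hypothetical activity patterns in minutes; each sums to 1440 (24 hours).
patterns = [(1380, 60), (1320, 120), (1400, 40)]

rng = random.Random(0)  # fixed seed so the run is reproducible
simulated = [shape_exposure(p, samplers, rng) for p in patterns for _ in range(50)]
share_below = empirical_cdf(simulated, 2500.0)
```

Here the activity patterns are supplied rather than generated, corresponding to the case in which observed patterns from a survey are used.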
Under the independence assumption (1.5) employed in the Cartesianization method,
C and T are uncorrelated, and the variances of microenvironment concentrations do not
depend on activity patterns,
Var(C_k | T) = \Sigma_{kk},    (1.9)
where \Sigma denotes the unconditional covariance matrix for C. Under the assumption that
the minute averages in (1.8) are stochastically independent, as in SHAPE, C and T
are also uncorrelated, while the variances of microenvironment concentrations depend on
activity patterns as a result of averaging over time:
Var(C_k | T) = \Gamma_{kk} / T_k,    (1.10)
where \Gamma_{kk} denotes the common variance for the minute averages (b_k(1), ..., b_k(T_k)).
Switzer commented in a discussion that the independence assumptions used in either
the Cartesianization method or SHAPE might be unrealistic: because of averaging over
time, it is unlikely that the conditional variance Var(Ck\T) would remain constant as in
(1.9); on the other hand, the form for the conditional variance in (1.10) might not allow
for heterogeneity of microenvironments. Switzer suggested that one might decompose the
minute averages into two components,
b_k(s) = a_k + d_k(s),    (1.11)
where a_k denotes a systematic component which does not vary over time, and d_k(s) denotes
a random component which varies over time.8 We assume that the two components are
6 It follows from (1.7) that we can rewrite the microenvironment decomposition (1.1)
as follows:
Y = \sum_{k=1}^{K} \sum_{s=1}^{T_k} b_k(s).
7 We assume here that activity patterns have already been generated, or we might use
observed activity patterns from a survey study.
8 It follows from (1.11) that the microenvironment decomposition (1.1) can be rewritten
as
Y = \sum_{k=1}^{K} a_k T_k + \sum_{k=1}^{K} \sum_{s=1}^{T_k} d_k(s).
stochastically independent. We also assume without loss of generality that the mean of
the random component is zero:
E(d_k(s)) = 0.    (1.12)
To illustrate the decomposition (1.11), we consider the home microenvironment. The
CO concentrations in two different homes might be systematically different because of dif-
ferences in the two homes' heating and cooking facilities, ventilation, presence of smokers,
etc. Those systematic differences are represented by the systematic component a_k. The
CO concentration in a given home might also vary over time; this variation is represented by the
random component d_k(s). Under (1.11), the microenvironment concentration is given by
C_k = a_k + \sum_{s=1}^{T_k} d_k(s) / T_k.    (1.13)
The Cartesianization method in essence neglects the presence of the random components
d_k(s), while SHAPE in essence neglects the presence of the systematic component
a_k. It is plausible in practical applications that neither component can be neglected.
Under the assumption that a_k and d_k(s) are both present and are stochastically
independent, C and T are again uncorrelated, while the conditional variances are given
by
Var(C_k | T) = \Sigma_{kk} + \Gamma_{kk} / T_k,    (1.14)
where \Sigma denotes the covariance matrix for the systematic components (a_1, ...,
-------
and C being the vector of regression parameters. If we have replicates for the same C's,
it would be possible to use an appropriate regression of Y on T to estimate C. Usually,
however, we don't have such replicates, and therefore we cannot determine the
microenvironment concentrations themselves. We can nevertheless use an appropriate
regression to estimate the mean microenvironment concentrations
\gamma_k = E(C_k), k = 1, ..., K. (2.1)
Under any of the assumptions (A), (B), or (C), we have the regression equation
E(Y|T) = T'\gamma = \sum_{k=1}^{K} T_k \gamma_k. (2.2)
Therefore we can regress Y on T to estimate \gamma, e.g., using least squares regression.
It should be noted, though, that this regression model does not contain an intercept.
Tosteson and Spengler (1980) used a similar approach to estimate the mean microenvironment
concentrations; however, they included individual-specific intercept terms in their
regression model.
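As an illustration of the no-intercept regression of Y on T, the following Python sketch fits the model by ordinary least squares. The data are hypothetical and noise-free (five subjects, K = 2 microenvironments, times in minutes), and the function name estimate_mean_concentrations and the use of NumPy are assumptions made for this example, not part of the original analysis.

```python
import numpy as np


def estimate_mean_concentrations(T, Y):
    """Least squares regression of Y on T without an intercept."""
    T = np.asarray(T, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # lstsq adds no constant column, so the fitted model has no intercept.
    gamma_hat, *_ = np.linalg.lstsq(T, Y, rcond=None)
    return gamma_hat


# Hypothetical activity patterns: minutes spent in each of two microenvironments.
T = np.array([[600.0, 840.0], [300.0, 1140.0], [720.0, 720.0],
              [100.0, 1340.0], [900.0, 540.0]])
gamma_true = np.array([2.0, 0.5])   # assumed mean concentrations
Y = T @ gamma_true                  # noise-free integrated exposures
gamma_hat = estimate_mean_concentrations(T, Y)
residuals = Y - T @ gamma_hat       # empirical residuals for later steps
```

With real measurements the residuals would not vanish; they are the quantities used in the variance regressions that follow.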
Under assumption (A), which is employed in the Cartesianization method, the variances
(and covariances) of the microenvironment concentrations do not depend on the
activity pattern. Therefore we have the following regression equation:
E(r^2|T) = \sum_{k=1}^{K} T_k^2 \Sigma_{kk} + \sum_{1 \le k < l \le K} 2 T_k T_l \Sigma_{kl}, (2.3)
where r = Y - T'\gamma is the residual from the previous regression equation, (2.2), and \Sigma
denotes the covariance matrix for C. (The left-hand side of equation (2.3) is also the
conditional variance of Y given T.) While we don't observe the residuals r directly, we
can estimate them by the empirical residuals \hat{r} = Y - T'\hat{\gamma}, where \hat{\gamma} is the
estimate for \gamma after fitting regression equation (2.2). It follows that we can estimate
\Sigma using a regression of \hat{r}^2 on (T_1^2, ..., T_K^2) and (2 T_k T_l, 1 \le k < l \le K).
If we believe that the microenvironment concentration distributions follow a parametric
form, say, the lognormal distribution, it would then be possible to determine the
distribution from the estimated means and the estimated covariance matrix.
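The regression of squared residuals on the squared times and on twice their cross-products can be sketched in Python as follows. This is a minimal illustration under assumption (A); the function name estimate_covariance, the test data, and the covariance values are hypothetical. The squared residuals are constructed to satisfy the variance equation exactly, so the sketch only demonstrates the mechanics of the fit.

```python
from itertools import combinations

import numpy as np


def estimate_covariance(T, r_hat):
    """Regress squared residuals on T_k^2 and 2*T_k*T_l (no intercept)."""
    T = np.asarray(T, dtype=float)
    r2 = np.asarray(r_hat, dtype=float) ** 2
    K = T.shape[1]
    pairs = list(combinations(range(K), 2))
    columns = [T[:, k] ** 2 for k in range(K)]
    columns += [2.0 * T[:, k] * T[:, l] for k, l in pairs]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, r2, rcond=None)
    # The first K coefficients are variances; the rest are covariances.
    sigma = np.diag(coef[:K])
    for value, (k, l) in zip(coef[K:], pairs):
        sigma[k, l] = sigma[l, k] = value
    return sigma


# Hypothetical check with K = 2 and times in minutes.
T = np.array([[600.0, 840.0], [300.0, 1140.0], [720.0, 720.0],
              [100.0, 1340.0], [900.0, 540.0]])
sigma_true = np.array([[0.04, 0.01], [0.01, 0.09]])
r_hat = np.sqrt(np.einsum("ik,kl,il->i", T, sigma_true, T))
sigma_hat = estimate_covariance(T, r_hat)
```

With noisy data nothing in this unconstrained fit forces the estimated matrix to be positive semidefinite, so the result would need to be checked before it is used.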
It is possible to impose restrictions on the covariance matrix \Sigma. For example, if we
assume that the off-diagonal elements in \Sigma are identical, i.e.,
\Sigma_{kl} = \rho, 1 \le k < l \le K,
where \rho denotes the common value of those off-diagonal elements, we can estimate the
diagonal elements of \Sigma (the variances of the microenvironment concentrations) and the
common covariance \rho using a regression of \hat{r}^2 on (T_1^2, ..., T_K^2) and 2 \sum_{k<l} T_k T_l.
If the background ambient contributions have been subtracted from Y, it might be
possible to assume that the microenvironment concentrations are independent of each
other, i.e., the off-diagonal elements in \Sigma are all zero.9 In this case we only need to
estimate the variances, using a regression of \hat{r}^2 on (T_1^2, ..., T_K^2).
It is possible to improve upon the regression models (2.2) and (2.3) by allowing the
residuals to have different variances and using weighted least squares estimates. For
example, after fitting model (2.3), we may re-estimate (2.2) by weighted least squares, using
the reciprocal of the fitted values of Var(Y|T) based on (2.3) as the weights. It is also
possible to modify (2.3) similarly, if we fit similar regression models for higher moments of
Y.
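The weighted least squares step can be sketched as follows. The sketch assumes the fitted conditional variances are strictly positive, which an unconstrained variance regression does not guarantee; the function name weighted_least_squares and all numerical values are hypothetical.

```python
import numpy as np


def weighted_least_squares(T, Y, fitted_variance):
    """Refit Y on T (no intercept) with weights 1 / fitted Var(Y|T)."""
    T = np.asarray(T, dtype=float)
    Y = np.asarray(Y, dtype=float)
    v = np.asarray(fitted_variance, dtype=float)
    if np.any(v <= 0):
        raise ValueError("fitted variances must be positive to serve as weights")
    # Scaling each row by 1/sqrt(variance) turns the problem into ordinary
    # least squares on the rescaled data.
    scale = 1.0 / np.sqrt(v)
    gamma_hat, *_ = np.linalg.lstsq(T * scale[:, None], Y * scale, rcond=None)
    return gamma_hat


# Hypothetical data: times in minutes, exposures with small disturbances.
T = np.array([[600.0, 840.0], [300.0, 1140.0], [720.0, 720.0],
              [100.0, 1340.0], [900.0, 540.0]])
Y = T @ np.array([2.0, 0.5]) + np.array([30.0, -20.0, 10.0, -15.0, 5.0])
fitted_variance = np.array([4.0e4, 1.2e5, 6.0e4, 1.7e5, 5.0e4])  # assumed values
gamma_wls = weighted_least_squares(T, Y, fitted_variance)
```

When all fitted variances are equal, this reduces to the unweighted fit.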
2.2. No Systematic Component
Under assumption (B), which is employed in SHAPE, we can still use (2.2) to estimate
the means for the microenvironment concentrations. Since the microenvironment
concentrations are the averages of the minute averages, the same estimate \hat{\gamma}_k can also be used to
estimate the mean for the minute averages (b_k(1), ..., b_k(T_k)).
We also need to estimate the variances for the minute averages.10 Under assumption
(B) and (1.10), we have the following regression equation, which is analogous to (2.3):
E(r^2|T) = \sum_{k=1}^{K} T_k \tau_{kk}, (2.4)
where \tau_{kk} denotes the variance for the minute averages in the k-th microenvironment,
(b_k(1), ..., b_k(T_k)). It follows that we can estimate (\tau_{kk}, k = 1, ..., K) by a regression of \hat{r}^2
on (T_1, ..., T_K). This is analogous to the special case of (2.3) with all off-diagonal terms
of \Sigma being zero. If we believe that the distribution for the minute averages b_k(s) for the
k-th microenvironment follows a parametric form such as the lognormal distribution, we
can determine the distribution from the mean and the variance. We can then conduct
SHAPE-type simulations.
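As a sketch of how a lognormal distribution might be determined from an estimated mean and variance and then used in a simulation of this kind, the Python code below matches moments and draws independent minute averages. The function names, the activity pattern, and the numerical values are hypothetical; the simulation is a simplified stand-in under assumption (B), not the SHAPE program itself.

```python
import math
import random


def lognormal_parameters(mean, variance):
    """Return (mu, sigma) of the lognormal with the given mean and variance."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


def simulate_integrated_exposure(minutes, means, variances, rng):
    """Draw independent minute averages and sum them over microenvironments."""
    total = 0.0
    for t_k, mean, variance in zip(minutes, means, variances):
        mu, sigma = lognormal_parameters(mean, variance)
        total += sum(rng.lognormvariate(mu, sigma) for _ in range(t_k))
    return total


rng = random.Random(0)      # fixed seed so the sketch is reproducible
minutes = [600, 840]        # hypothetical activity pattern (T_1, T_2)
means = [2.0, 0.5]          # assumed estimated means of the minute averages
variances = [1.0, 0.25]     # assumed estimated variances of the minute averages
simulated = [simulate_integrated_exposure(minutes, means, variances, rng)
             for _ in range(100)]
```

The simulated integrated exposures can then be summarized as an exposure distribution for the chosen activity pattern.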
2.3. Systematic and Random Components both Present
Under assumption (C), we can use (2.2) to estimate the means for the microenvironment
concentrations. It follows from (1.7), (1.11), and (1.12) that the estimate \hat{\gamma}_k also
estimates the mean for the systematic component, E(a_k).
9 In this case we have a general regression equation for the cumulants of Y and C:
\kappa_j(Y|T) = \sum_{k=1}^{K} T_k^j \kappa_j(C_k),
where \kappa_j(Y|T) denotes the j-th conditional cumulant for Y given T, and \kappa_j(C_k) denotes the
j-th cumulant for C_k. Equation (2.2) is a special case of this cumulant equation for j = 1.
Equation (2.3) without the off-diagonal elements in \Sigma is a special case of this cumulant
equation for j = 2.
10 We assume that the minute averages from different microenvironments are independent
of each other; therefore, we don't need to estimate their covariances.
We also need to estimate the variances and covariances for the systematic and the
random components. Under assumption (C) and (1.13), we have the following regression
equation, which generalizes (2.3) and (2.4):
E(r^2|T) = \sum_{k=1}^{K} T_k^2 \Sigma_{kk} + \sum_{1 \le k < l \le K} 2 T_k T_l \Sigma_{kl} + \sum_{k=1}^{K} T_k \tau_{kk}, (2.5)
[...] \psi_C(T\eta) by inverting the Fourier transform (3.1). We can then estimate the
microenvironment concentration distribution by inverting the following Fourier transform
which defines the characteristic function for C:
\psi_C(\tau) = E(e^{i\tau'C}). (3.2)
Under assumption (B), there does not appear to be a simple expression for the joint
characteristic function \psi_{Y,T}. Instead, we may consider the conditional characteristic
function:
E(e^{i\eta Y}|T) = \prod_{k=1}^{K} [\psi_k(\eta)]^{T_k}, (3.3)
where \psi_k denotes the characteristic function for the minute averages (b_k(1), ..., b_k(T_k)) in
the k-th microenvironment:
\psi_k(\eta) = E(e^{i\eta b_k(s)}). (3.4)
It follows that for each value of \eta, we can estimate (\psi_1(\eta), ..., \psi_K(\eta)) by a nonlinear
regression of e^{i\eta Y} on T_1, ..., T_K using (3.3). We can then invert the Fourier transform in
(3.4) to estimate the distribution for the minute averages in the k-th microenvironment.
It follows from (3.3) that the cumulant generating function for Y is linearly related
to the cumulant generating functions for the b_k's; therefore, the cumulants for Y are linear
combinations of the cumulants for the b_k's:
\kappa_j(Y|T) = \sum_{k=1}^{K} T_k \kappa_j(b_k), j = 1, 2, ..., (3.5)
where \kappa_j(Y|T) denotes the j-th conditional cumulant for Y given T, and \kappa_j(b_k) denotes the
j-th cumulant for b_k. (Compare footnote 9 in Section 2.1.) The first cumulant is the mean;
therefore, (2.2) is a special case of (3.5) for j = 1. The second cumulant is the variance;
therefore, (2.4) is a special case of (3.5) for j = 2. We can use (3.5) for j > 2 to estimate
the higher cumulants for the b_k's. For example, we can estimate the third cumulants for the b_k's
by a linear regression of \hat{r}^3 on (T_1, ..., T_K). It is also possible to estimate the distribution
for the minute averages from their estimated cumulants.
We have not been able to find a simple relationship for the characteristic functions
under assumption (C). For the special case with systematic components (a_1, ..., a_K) being
independent of each other, we have the following cumulant equations:
\kappa_j(Y|T) = \sum_{k=1}^{K} T_k^j \kappa_j(a_k) + \sum_{k=1}^{K} T_k \kappa_j(d_k). (3.6)
Equation (2.2) is a special case of (3.6) for j = 1. Equation (2.5) without the off-diagonal
elements is a special case of (3.6) for j = 2. We can use (3.6) to estimate the higher
cumulants for the a_k's and d_k's. For example, we can regress \hat{r}^3 on (T_1^3, ..., T_K^3) and (T_1, ..., T_K)
to estimate the third cumulants for the a_k's and d_k's.
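The third-cumulant regression can be sketched in Python as follows. The cubed residuals are regressed, without an intercept, on the cubed times and on the times themselves. The function name estimate_third_cumulants, the small interval counts used for T, and the cumulant values are hypothetical; the cubed residuals are built to satisfy the cumulant equation exactly so that the sketch only demonstrates the mechanics.

```python
import numpy as np


def estimate_third_cumulants(T, r_hat):
    """Regress cubed residuals on (T_k^3) and (T_k) without an intercept."""
    T = np.asarray(T, dtype=float)
    r3 = np.asarray(r_hat, dtype=float) ** 3
    K = T.shape[1]
    X = np.hstack([T ** 3, T])
    coef, *_ = np.linalg.lstsq(X, r3, rcond=None)
    # First K coefficients: systematic components; last K: random components.
    return coef[:K], coef[K:]


# Hypothetical data: numbers of averaging intervals spent in K = 2 microenvironments.
T = np.array([[2.0, 9.0], [5.0, 7.0], [8.0, 3.0], [3.0, 10.0],
              [6.0, 6.0], [10.0, 1.0], [1.0, 12.0]])
k3_a_true = np.array([0.3, 0.1])   # assumed third cumulants, systematic part
k3_d_true = np.array([1.5, 0.8])   # assumed third cumulants, random part
r_hat = np.cbrt(T ** 3 @ k3_a_true + T @ k3_d_true)
k3_a_hat, k3_d_hat = estimate_third_cumulants(T, r_hat)
```

With times measured in minutes the cubed columns become very large relative to the linear ones, so rescaling the columns before fitting would likely be advisable.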
4. Discussions and Conclusions
Estimating microenvironment concentration distributions using integrated exposure
measurements is useful for several purposes. First, it offers the potential to generalize
from one population to another. For example, we might have integrated exposure
measurements in one metropolitan area, and an activity pattern study in another metropolitan area. If we
believe the microenvironment concentration distributions are the same in the two areas,
it would then be possible to combine the microenvironment concentration distributions
estimated from the first area with the activity pattern data from the second area to estimate
the exposure distribution in the second area. Second, it allows development of simulation
models such as SHAPE. Third, it helps us identify the microenvironments
with potential for large contributions to human exposure, so that further research and
regulatory actions can be directed towards these microenvironments.
Despite this potential, the results described in this paper remain to be validated
empirically before further application. The author plans to do the validation using the
CO personal monitoring data from the Washington Urban Scale Study. In this study, a
sample of human subjects was recruited to carry continuous CO monitors for a 24-hour
period. They also recorded their activities during this period. For the validation analysis, we
will neglect the continuous data, pretend that we only have integrated CO measurements,
and apply the methods in this paper to estimate the microenvironment concentration
distributions. We can then compare the estimates with the actual distributions to see if
these methods provide valid estimates.
All results in this paper require some independence assumptions. This is intrinsic to
the indirect approach. For example, in order to estimate the average exposure Y using
the average microenvironment concentration C and the average activity patterns T, it is
necessary to assume that C is uncorrelated with T. Whether those assumptions are realistic
remains to be studied empirically. There has not been very much work reported on
the validity of this type of independence assumption. Duan (1985) examined this using
data from the Washington CO study and found no significant correlations between the
microenvironment concentrations and the corresponding activity patterns. Switzer (1988)
examined the minute averages from a microenvironment monitoring study conducted on
El Camino Real, an arterial route in Palo Alto, California, and found little autocorrelation
beyond the first four minutes. More empirical studies of this type still need to be done. It
will also be useful to examine the theoretical robustness properties of the results in this paper
against departures from the underlying independence assumptions. For example, if the
minute averages have lagged autocorrelations, do the estimates based on the independence
assumptions still have good statistical properties?
The work described in this paper was not funded by the U.S. Environmental
Protection Agency and therefore the contents do not necessarily reflect the views of the
Agency, and no official endorsement should be inferred.
REFERENCES
1. Duan, N. (1980): "Micro-environment types: a model for human exposure to
air pollution", SIMS Technical Report No. 47, Dept. of Statistics, Stanford University,
Stanford, CA.
2. Duan, N. (1982): "Models for human exposure to air pollution", Environment
International, 8, 305-309.
3. Duan, N. (1985): "Application of the microenvironment monitoring approach to
assess human exposure to carbon monoxide", R-3222-EPA, The RAND Corporation, Santa
Monica, CA.
4. Duan, N. (1987): "Cartesianized sample mean: imposing known independence
structures on observed data", unpublished manuscript, The RAND Corporation, Santa
Monica, CA.
5. Ott, W. (1981): "Computer simulation of human exposures to carbon monoxide",
paper presented at the 74th Annual Meeting of the Air Pollution Control Association,
Philadelphia, PA.
6. Switzer, P. (1988): private communication.
-------
MICROENVIRONMENT DATABASE FOR TOTAL HUMAN EXPOSURE STUDIES
by: Muhilan D. Pandian
Environmental Research Center
University of Nevada-Las Vegas
4505 S. Maryland Parkway
Las Vegas, NV 89154
ABSTRACT
Human exposure to a pollutant is determined by matching pollutant
concentrations in microenvironments with human time activity patterns. In
this context, microenvironments are used to denote volumes with homogeneous
pollutant concentrations. In this paper, a new concept is introduced in
which the use of microenvironments is extended to include all the different
components of the total human exposure process, from pollutant sources to
related health effects. The feasibility of incorporating this concept in a
database format is discussed, and the advantages of such a database are
also noted.
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
INTRODUCTION
During their daily activities, humans are exposed to the multitude of
pollutants present in the surrounding environment. In general, pollutants
are transported by a carrier medium (air, water, soil, or food) to the
physical boundaries that envelop us. Once pollutants cross this envelope, exposure
can lead to a dose being received by the target tissues, resulting
in cancer and/or noncancer risks.
The concentrations of pollutants found at a person's physical
boundaries at any specific time depend largely on where he/she is present
and what he/she is doing at that time. During a day's routine, the person
may pass through many different environments with varying physical
characteristics. These environments, which are denoted as
microenvironments, account for the variations in the sources and sinks of
pollutants, the carrier medium of the pollutants, the concentrations and
physical properties of the pollutants, and the temperature, relative
humidity, airflow characteristics, and other physical properties of the
environments.
Most exposure studies involving microenvironments have divided the
surrounding environment depending upon the pollutant of interest.
Researchers have repeated this division process when considering a
different pollutant. Also, when characterizing microenvironments,
scientists have ignored the health effect factor, which is a significant
component in the total human exposure process.
When studying a particular component in the total human exposure
process, it is important to consider the effects of other components as
well (Pandian, 1987). The following sections explain a new concept in
which the different components of the total human exposure process are
directly or indirectly related to microenvironments. The feasibility of
presenting this concept in the form of a database is also discussed.
SUGGESTED CONCEPT
In its simplest form, a microenvironment can be defined as a control
volume with a homogeneous pollutant concentration. Most exposure studies
to date have applied this definition. Since the pollutant concentration of
the same volume might vary with time, a better definition of
microenvironment is the four-dimensional concept (3-D space) x (time)
(Duan, 1982). Two models have used this four-dimensional concept to
determine human exposure to air pollutants by matching pollutant
concentrations in microenvironments with human activity patterns over a
certain period of time. They are the National Ambient Air Quality
Standards Exposure Model (NEM) (Paul, 1981) and the Simulation of Human Air
Pollution Exposure model (SHAPE) (Ott, 1981; Systems Architects, Inc.,
1982).
Given a human subject's activity pattern, available exposure models
can predict the concentrations he/she is exposed to for specific
pollutants. These models usually assume a different set of
microenvironments for each pollutant. Indoor environments, where human
beings spend more than 90 per cent of their time (Szalai, 1972; NAS, 1981),
are handled poorly in terms of pollutant concentrations and their related
sources and sinks. Dosage levels are estimated after assuming simple
linear behaviors, and no health effects are predicted (Pandian, 1987).
Since actual exposure occurs in a microenvironment, it should be
possible to relate the various components of the total exposure process to
the microenvironment, either directly or indirectly. Pollutant
concentrations in microenvironments can be correlated with the appropriate
sources and sinks; microenvironments can be associated with specific human
time activities; and dosages from exposure can be linked to target tissues. The
resulting health effects can then be traced to the accountable
microenvironments.
The current use of microenvironments in exposure studies can be
easily extended to include the characteristics of various pollutants and
their related health effects. This can be achieved by identifying all
possible microenvironments and characterizing each according to the
following factors:
i) control volume,
ii) pollutant(s), single or multiple,
iii) human time activities,
iv) pollutant(s) source characteristics,
v) pollutant(s) sink characteristics,
vi) exposure media and pathways,
vii) dosage processes, and
viii) health effects.
The next section briefly explains each factor.
CHARACTERISTICS OF MICROENVIRONMENTS
Microenvironments can be categorized by control volumes which may or
may not have definite physical boundaries. In the case of indoor
environments, a residence, for example, can be divided by using definite
boundaries into living room, bedroom, kitchen, dining room, bathroom,
garage, basement, and attic. Examples for control volumes with definite
boundaries in in-transit regimes are automobiles, buses, trains, airplanes,
and other enclosed vehicles. Outdoor environments can be divided into
control volumes by using imaginary boundaries; examples are a street in a
downtown business district, a street in a residential area, park area, and
the beach.
Pollutants include gaseous pollutants, radioactive pollutants,
particulate matter - which includes liquid and solid suspensions - and
bioaerosols. Bioaerosols, which are particulate matter, have been listed
separately because they may carry viable microorganisms in their structure.
More than one pollutant can exist inside a microenvironment.
Depending upon the sources and sinks in the microenvironment, the
concentrations may vary spatially and temporally. A human subject is
exposed to all the pollutants present in his/her microenvironment.
Therefore, given his/her time activity pattern, a listing of
microenvironments with the pollutants present in each will immediately
provide information on the human subject's exposure pattern to the
pollutants in the relevant microenvironments.
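A minimal Python sketch of this matching step is given below. It combines a time activity pattern with a listing of the pollutants present in each microenvironment to obtain the minutes of potential exposure per pollutant. The microenvironment names, pollutant lists, durations, and the function name exposure_pattern are all hypothetical.

```python
from collections import defaultdict

# Hypothetical listing of microenvironments and the pollutants present in each.
MICROENVIRONMENT_POLLUTANTS = {
    "kitchen": {"CO", "NO2"},
    "automobile": {"CO", "benzene"},
    "office": {"formaldehyde"},
}


def exposure_pattern(activity_pattern, pollutants_by_microenvironment):
    """Return minutes of potential exposure to each pollutant.

    activity_pattern is a sequence of (microenvironment, duration in minutes).
    """
    minutes = defaultdict(int)
    for microenvironment, duration in activity_pattern:
        for pollutant in pollutants_by_microenvironment.get(microenvironment, ()):
            minutes[pollutant] += duration
    return dict(minutes)


# Hypothetical time activity pattern for one subject.
pattern = [("kitchen", 45), ("automobile", 30), ("office", 480),
           ("automobile", 35), ("kitchen", 60)]
result = exposure_pattern(pattern, MICROENVIRONMENT_POLLUTANTS)
```

The sketch records only presence and duration; concentrations, which vary spatially and temporally, would have to be supplied separately.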
Microenvironments can be associated with specific human activities.
From studies on human time activity patterns, it is possible to
statistically determine the activities and their durations which can be
linked to any microenvironment.
It is important to know the sources of pollutants to obtain their
concentrations in microenvironments. Two types of sources contribute
pollutants: those which generate background concentrations and those which
emit transient concentrations. Background sources responsible for
pollutant concentrations in microenvironments can be determined by
techniques such as receptor modeling (Friedlander, 1973). Transient
sources are highly time-dependent sources which are either inside the
control volume of a microenvironment or outside, in which case the
pollutants released will be swept inside by physical means. Examples of
background sources include tall smokestacks and vehicular traffic on
highways; examples of transient sources include cigarette smoking, gas
cooking ranges, hot showers, unclean HVAC systems, dusty carpets, and
automobile exhaust systems.
Information on pollutant sinks in microenvironments is necessary to
estimate realistic concentrations. Pollutants are removed by particular
mechanisms depending upon their physical and chemical properties. All
pollutants can be removed from a control volume by diffusion, air exchange,
or ventilation. Gaseous pollutants can also undergo chemical
transformation, and particulate matter can be deposited by impaction or
sedimentation.
Pollutant-carrying media include air, water, soil, or food, one or
more of which can exist in any microenvironment. The ubiquitous medium air
provides exposure through the respiratory tract, skin, and the digestive
tract. Water penetrates the human envelope through the skin and the
digestive tract, soil through the skin, and food through the digestive
tract. Exposure can be associated with microenvironments depending on the
different media present and the related human time activities. By using
pharmacokinetic relationships, exposure can be extrapolated to obtain
harmful dosage levels to human target tissues.
The last factor in the characterization of microenvironments involves
health effects, which can be indirectly related to microenvironments
through exposure to the pollutants found in certain microenvironments. In
cases where the prediction of health effects is very complex, one can
resort to the use of risk factors. A typical prediction would be: human
subject S, due to certain amounts of exposure to pollutant P in
microenvironments M_1, M_2, ..., M_n, has a risk factor of F for experiencing
health effect E.
DATABASE FORMULATION
After extensive and laborious literature searches and reviews,
information relating to the factors which characterize microenvironments
can be accumulated and categorized in a database format. The format should
be set up so that when one needs information on a particular component in
the total human exposure process, he/she can obtain data not just on that
component alone but also on associated components. Explanations on how the
components are related should also be provided.
The formulation of such a database has many advantages. Most
important of all, a scientist studying human exposure would have access to
data on microenvironments and related elements in a concise package. If a
thorough literature search is carried out, the completed database will
reveal which data is available in the field of total human exposure. The
database will also expose areas where further research needs to be pursued,
both theoretical and experimental.
The completed database will facilitate the use of multiple effects
when studying human exposure. Factors such as exposure to multiple
pollutants and dosage from exposure through different pathways can be
considered. Weighting factors can be used to calculate exposure.
When accumulating data, it is important to note the accuracy of the
information. Data presented in the database should carry reported results
on error analyses and statistical inferences. The database should be built
to include bibliographic records, and provisions should be made for
updating data. If feasible, the database should include meteorological
data such as the STAR array as well as U.S. Bureau of Census files on
population distribution and growth, and migration patterns.
The actual formulation of the database itself can be easily
accomplished with the use of available computer software. The appropriate
fields necessary for the input of data can be listed first, followed by the
entry of collected data.
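One possible layout of such fields is sketched below using Python's built-in sqlite3 module. The table and column names, the single example record, and the helper related_components are hypothetical choices made for illustration; the paper does not prescribe a schema or a particular software package.

```python
import sqlite3

# Hypothetical schema: one row per microenvironment, and one characterization
# row per (microenvironment, pollutant) covering the remaining factors.
SCHEMA = """
CREATE TABLE microenvironment (
    id             INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    control_volume TEXT NOT NULL               -- factor (i)
);
CREATE TABLE characterization (
    id                  INTEGER PRIMARY KEY,
    microenvironment_id INTEGER NOT NULL REFERENCES microenvironment(id),
    pollutant           TEXT NOT NULL,         -- factor (ii)
    activity            TEXT,                  -- factor (iii)
    source              TEXT,                  -- factor (iv)
    sink                TEXT,                  -- factor (v)
    medium_pathway      TEXT,                  -- factor (vi)
    dosage_process      TEXT,                  -- factor (vii)
    health_effect       TEXT,                  -- factor (viii)
    reported_error      TEXT,                  -- accuracy of the information
    reference           TEXT                   -- bibliographic record
);
"""


def related_components(conn, pollutant):
    """Return the components associated with one pollutant."""
    cursor = conn.execute(
        "SELECT m.name, c.activity, c.source, c.sink, c.medium_pathway "
        "FROM characterization AS c "
        "JOIN microenvironment AS m ON m.id = c.microenvironment_id "
        "WHERE c.pollutant = ? ORDER BY m.name",
        (pollutant,),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute(
    "INSERT INTO microenvironment (id, name, control_volume) VALUES (?, ?, ?)",
    (1, "kitchen", "room with definite boundaries"),
)
# Placeholder record; every value here is illustrative, not collected data.
conn.execute(
    "INSERT INTO characterization (microenvironment_id, pollutant, activity, "
    "source, sink, medium_pathway, dosage_process, health_effect, "
    "reported_error, reference) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    (1, "CO", "cooking", "gas cooking range", "ventilation",
     "air / respiratory tract", "placeholder", "placeholder",
     "placeholder", "placeholder"),
)
rows = related_components(conn, "CO")
```

A query on one component (here, the pollutant) returns the associated components as well, which is the behavior the database format is meant to provide.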
REFERENCES
Duan, N. (1982): Models for Human Exposure to Air Pollution, Environ.
Intl., 8, 305-309.
Friedlander, S.K. (1973): Chemical Element Balances and Identification of
Air Pollution Sources, Environ. Sci. & Tech., 7 (3), 235-240.
NAS (1981): Indoor Pollutants, Committee on Indoor Pollutants, Board of
Toxicology and Environmental Health Hazards, Assembly of Life
Sciences, National Research Council, National Academy Press,
Washington, DC.
Ott, W.R. (1981): Exposure Estimates Based on Computer Generated Activity
Patterns, paper presented at the 74th Annual Meeting of the Air
Pollution Control Association (APCA), Philadelphia, PA, APCA Pub. No.
81-57.6.
Pandian, M.D. (1987): Evaluation of Existing Total Human Exposure Models,
Prepared for U.S. Environmental Protection Agency, Las Vegas, NV.
Paul, R. (1981): User's Guide for NAAQS Exposure Model (NEM), Prepared for
U.S. Environmental Protection Agency, Research Triangle Park, NC.
Systems Architects, Inc. (1982): Environmental Modeling Catalogue (Draft),
Prepared for U.S. Environmental Protection Agency, Washington, DC,
EPA Contract No. 68-01-4723, 97-102.
Szalai, A. (1972): ed., The Use of Time: Daily Activities of Urban and
Suburban Populations In Twelve Countries, Mouton, The Hague.
-------
A METHODOLOGY FOR ESTIMATING CARBON MONOXIDE EXPOSURE AND RESULTING
CARBOXYHEMOGLOBIN LEVELS IN DENVER, COLORADO
by: Ted Johnson
PEI Associates, Inc.
505 South Duke Street
Suite 503
Durham, NC 27701
ABSTRACT
A methodology was developed for estimating carbon monoxide (CO)
exposure and resulting carboxyhemoglobin (COHb) levels among residents of
Denver, Colorado. The methodology consisted of the following six steps.
1. Two populations-at-risk were defined for the five counties in
the greater Denver Metropolitan Area: a) all residents and b)
persons with angina.
2. Each population-at-risk was divided into an exhaustive set of
cohorts according to appropriate demographic and physiological
variables.
3. A year-long sequence of exposure events was developed for each
cohort. An exposure event is defined by microenvironment,
smoking status, breathing rate, and duration (e.g., outdoors
near roadway, not smoking, fast breathing, 12 minutes).
Sequences were developed by applying a "random-walk"
statistical algorithm to activity diary data collected in
Cincinnati, Ohio, during March and August of 1985.
4. A probabilistic model was developed for estimating the CO
exposure associated with each event in each sequence. The
exposure during an event was assumed to reflect the ambient
concentration measured at one or more fixed-site monitors, the
contribution of localized sources and sinks specific to the
microenvironment, and smoking status. Exposure values for
specific microenvironments were drawn from distributions fit to
personal monitoring data collected in Denver during the winter
of 1982/83.
5. A physiological model (the Coburn equation) was applied to the
exposure sequences developed in Step 4. The model determined
the COHb level of each cohort at the end of each hour.
6. The CO exposures and associated COHb levels were extrapolated
to each population-at-risk through the use of census-derived
weighting factors.
Computer software was developed for implementing Steps 3 through 6.
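The following Python sketch illustrates one way an exposure event and a sequence of events might be represented, and how a sequence could be reduced to hourly average exposures before a physiological model is applied. The class ExposureEvent, the field co_ppm, the function hourly_average_exposures, and all values are hypothetical; the sketch does not reproduce the software described here and does not implement the Coburn equation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExposureEvent:
    microenvironment: str
    smoking: bool
    breathing_rate: str
    duration_min: int
    co_ppm: float  # hypothetical CO exposure assigned to the event


def hourly_average_exposures(events):
    """Expand events to minutes and average within consecutive 60-minute blocks."""
    minute_values = []
    for event in events:
        minute_values.extend([event.co_ppm] * event.duration_min)
    averages = []
    for start in range(0, len(minute_values), 60):
        block = minute_values[start:start + 60]
        averages.append(sum(block) / len(block))
    return averages


# Hypothetical two-hour sequence for one cohort.
sequence = [
    ExposureEvent("outdoors near roadway", False, "fast", 12, 9.0),
    ExposureEvent("indoors, residence", False, "slow", 48, 2.0),
    ExposureEvent("in vehicle", False, "slow", 30, 6.0),
    ExposureEvent("indoors, office", False, "slow", 30, 1.0),
]
hourly = hourly_average_exposures(sequence)
```

Hourly averaging is used here only because COHb levels are reported at the end of each hour; a physiological model could equally be driven event by event.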
This paper has been reviewed in accordance with the U.S.
Environmental Protection Agency's peer and administrative review policies
and approved for presentation and publication.
INTRODUCTION
The Environmental Strategies Project (ESP) for Metro-Denver is a
multiyear study of environmental issues and policy choices facing the
Denver area. The objectives of this study are: 1) to assess the relative
magnitude of a set of metro-Denver's environmental problems; and 2) to
identify strategies to reduce the health and welfare damages associated
with these problems.
Phase I of the ESP began in 1987. The purpose of Phase I is to
provide the ESP Advisory Committee with preliminary estimates of damages
caused by the major sources of environmental contamination in the Denver
area. These initial estimates will enable the Advisory Committee to
identify the environmental issues which merit more detailed analysis in
later phases of the project. To provide results quickly and to avoid
unnecessarily detailed analyses of less critical issues, the analyses
conducted under the first phase of the project have been designed to
provide "order of magnitude" estimates of health and welfare damages.
One of the air pollutants of most concern in the Denver area is
carbon monoxide (CO). Exposure to high levels of CO can increase the level
of carboxyhemoglobin (COHb) in the blood with a corresponding decrease in
the blood's ability to carry oxygen. Elevated COHb levels are associated
with reduced time to onset of attacks among persons with angina and with
reduced vigilance among members of the general population. These and other
health effects related to CO exposure are thought to increase in high
altitude areas where ambient oxygen levels are reduced. Denver is
considered to have a high potential for CO-related health effects because
of the city's altitude (5280 ft) and high ambient CO levels. In an attempt
to characterize the extent of Denver's CO problem, PEI Associates, Inc.
(PEI), designed and executed a methodology for estimating CO exposures and
resulting COHb levels within the general Denver metropolitan area
population and within selected subsets of this population. This
methodology is described in the following report. Exposure estimates are
not provided, as these are currently under review by the Region VIII office
of the U.S. Environmental Protection Agency (EPA).
BACKGROUND
The methodology developed by PEI for the Phase I CO analysis was
derived from a general approach to estimating population exposure to air
pollution which has been used by the Strategies and Air Standards Division
(SASD) of EPA for most of the criteria pollutants. This approach divides a
population-at-risk residing in a particular study area into an exhaustive
set of cohorts. All members of a particular cohort are assumed to have
similar activity patterns and similar physiological characteristics. A
sequence of exposure events spanning a relevant time interval (e.g., one
year) is developed for each cohort. The exposure events are defined in
such a way that the pollutant exposure during each event can be estimated
using supplementary data developed for this purpose. The resulting
sequence of pollutant exposures is then used to predict the occurrence of
specific health effects in each member of the cohort. Census-derived data
are used to extrapolate both the exposure estimates and the health effect
estimates to the entire population-at-risk. In some analyses, health
effects are not estimated, and only the extrapolated exposure estimates are
presented.
In adapting this general approach to the ESP-CO I analysis, PEI was
constrained by the fact that a period of only three months was available
for designing a methodology, collecting and processing the necessary data
bases, performing literature reviews, developing and debugging computer
software, running the software, performing sensitivity analyses, and
summarizing the results. The resulting methodology reflects the decision
by PEI to minimize the complexity of the computer models used to estimate
CO exposures and resulting COHb levels while making full use of personal
monitoring data and activity diary data from large-scale studies conducted
by PEI in Denver and Cincinnati.
The Denver study was conducted during the winter of 1982-1983 and
included as its target population those nonsmoking residents of the Denver
metropolitan urbanized area who were between 18 and 70 years of age in the
Fall of 1982. A total of 454 subjects were obtained through the use of a
screening questionnaire administered to several thousand households in the
study area. Each subject was asked to carry a personal exposure monitor
(PEM) and activity diary for two consecutive 24-hour sampling periods and
to provide a breath sample at the end of each sampling period. Each
subject also completed a detailed background questionnaire. PEI collected
CO data recorded at fixed sites throughout Denver to compare with
simultaneously measured PEM values. Reference 1 contains a more detailed
description of the study. Data from this study have been analyzed by
Johnson et al. (Reference 2).
The second study was conducted in a three-county area surrounding and
including Cincinnati, Ohio, during March and August 1985. The target
population of this study included all residents of Hamilton County, Ohio;
Clermont County, Ohio; and Kenton County, Kentucky. A total of 973
subjects (487 in March and 486 in August) were obtained through the use of
a screening questionnaire administered to several thousand households.
Each subject was requested to carry an activity diary for three consecutive
24-hour periods and to complete a detailed background questionnaire. As no
personal monitors were employed, personal CO exposure data were not
available for subjects of this study. This study, however, included three
population subgroups excluded from the Denver study: children, persons
over 70, and smokers. Reference 3 contains a more detailed description of
the Cincinnati study.
REQUIRED OUTPUTS AND GENERAL APPROACH
PEI was requested to provide estimates for a recent calendar year for
the following COHb indicators and populations-at-risk:
1. Daily maximum 8-hour average COHb levels for the general
population;
2. One-hour COHb levels for persons with angina pectoris.
In each case estimates were to be tabulated for specified ranges of
COHb in three ways: the number of person-occurrences of the COHb levels
falling within each range, the number of persons with one or more COHb
values falling within each range, and the number of persons with yearly
maximum COHb values falling within each range. To provide maximum
flexibility, PEI developed software capable of providing these tabulations
for various combinations of time period (entire year, warm months only,
cold months only), regulatory scenario ("as is" conditions versus
attainment of the current 8-hour National Ambient Air Quality Standard
(NAAQS) for CO), smoking status (all persons, smokers only, and nonsmokers
only), and population group (e.g., retired persons).
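The three tabulations described above can be illustrated with a minimal
sketch. It assumes, as the cohort definition in this paper implies, that
every member of a cohort shares the cohort's COHb sequence, so that each
cohort is weighted by its population. The function name, the half-open
range convention, and the toy input are hypothetical and are not taken
from the PEI software.

```python
from typing import Dict, List, Tuple

# cohort id -> (cohort population, COHb values in percent for the period)
CohortData = Dict[str, Tuple[float, List[float]]]


def tabulate_range(cohorts: CohortData, low: float, high: float) -> Dict[str, float]:
    """Tabulate one COHb range [low, high) in the three requested ways."""
    person_occurrences = 0.0
    persons_with_any = 0.0
    persons_with_max = 0.0
    for population, values in cohorts.values():
        hits = sum(1 for v in values if low <= v < high)
        # Every member of a cohort shares the cohort's COHb sequence,
        # so each occurrence counts once per cohort member.
        person_occurrences += population * hits
        if hits > 0:
            persons_with_any += population
        if low <= max(values) < high:
            persons_with_max += population
    return {
        "person_occurrences": person_occurrences,
        "persons_with_any": persons_with_any,
        "persons_with_max": persons_with_max,
    }


# Hypothetical toy input: two cohorts, three COHb values each.
toy = {"x": (100.0, [0.5, 1.2, 1.8]), "y": (50.0, [0.4, 0.6, 0.9])}
high_range = tabulate_range(toy, 1.0, 2.0)
low_range = tabulate_range(toy, 0.0, 1.0)
```

Restricting the input values to a season, smoking status, or population
group before calling the function would give the other combinations
listed above.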
The study area specified for the CO exposure analysis was identical
to the study area specified for a parallel analysis of ozone and
particulate matter exposure being conducted by another contractor. The
study area included all census tracts in Adams, Arapahoe, Denver, Douglas,
and Jefferson counties and three census tracts (131.03, 131.04, and 131.05)
in Boulder County. Approximately 89 percent of the Denver Standardized
Metropolitan Statistical Area (SMSA) is included in the study area.
PEI's general approach to estimating COHb values for each of the two
populations-at-risk within this study area was to 1) divide the population-
at-risk into an exhaustive set of cohorts, 2) develop a year-long sequence
of exposure events for each cohort, 3) estimate the CO exposure associated
with each event in each sequence, 4) apply the Coburn model to the
resulting sequences of CO exposures, and 5) extrapolate the resulting COHb
estimates to the population-at-risk through the use of census-derived
weighting factors.
POPULATION COHORTS
Two populations-at-risk were defined for the analysis:
1. All residents of the designated study area, including children,
retirees, and smokers
2. Persons with angina residing in the study area.
The first of the two populations-at-risk (i.e., the general
population) was represented by a set of 14 nonoverlapping population groups
(Table 1) hereafter referred to as demographic groups (DG's). A DG is a
group of people with similar demographic characteristics that may
reasonably be expected to have similar CO exposure patterns during a given
calendar year. The demographic characteristics used to define the DG's
were age, school status, employment status, commute time, and smoking
status. These factors were assumed to strongly influence personal activity
patterns and resulting CO exposure patterns.
TABLE 1. NUMBER OF CINCINNATI SUBJECTS ASSIGNED TO EACH
DEMOGRAPHIC GROUP

                                                   Number of assigned subjects
Code  Demographic group (DG)                        March    August    Total
  1   Preschoolers, under 2 years of age              21       11        32
  2   Preschoolers, 2 to 5 years of age               46       30        76
  3   Students, 6 to 9 years of age                   27       39        66
  4   Students, 10 to 13 years of age                 27       38        65
  5   Students, 14+ years of age, smokers             11        9        20
  6   Students, 14+ years of age, nonsmokers          58       44       102
  7   Workers, commute < 20 minutes, smokers          46       36        82
  8   Workers, commute < 20 minutes, nonsmokers       79       80       159
  9   Workers, commute 20+ minutes, smokers           13       28        41
 10   Workers, commute 20+ minutes, nonsmokers        42       47        89
 11   Homemakers and unemployed, smokers              19       24        43
 12   Homemakers and unemployed, nonsmokers           51       47        98
 13   Retired, smokers                                 9       10        19
 14   Retired, nonsmokers                             33       24        57
      Total number of assigned subjects              482      467       949
Each DG was subdivided into four cohorts. A cohort is a group of
people assumed to have identical sequences of CO exposure for a given
calendar year. Each of the cohorts belonging to a particular DG was
assigned a different blood volume according to the distribution of blood
volume in members of the DG (Table 2). Blood volume is an important
determinant of the rate at which a person's COHb level varies as the
concentration of CO to which the person is exposed varies.
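As a worked check on the blood-volume assignment, the sketch below
applies the relation printed in Table 2 for workers (VB = wt * 73.5
ml/kg) to the four cohort body weights listed there for DG 8. The
function name is hypothetical, and rounding to the nearest milliliter is
an assumption about how the tabulated values were produced.

```python
def blood_volume_ml(body_weight_kg: float, ml_per_kg: float) -> int:
    """Blood volume as body weight times a per-kilogram factor, in ml."""
    return round(body_weight_kg * ml_per_kg)


# Body weights (kg) of cohorts A-D in DG 8, as listed in Table 2.
dg8_weights = {"A": 52.6, "B": 65.4, "C": 74.1, "D": 86.3}
dg8_volumes = {c: blood_volume_ml(w, 73.5) for c, w in dg8_weights.items()}
print(dg8_volumes)
```

The results agree with the VB column of Table 2 for DG 8 (3866, 4807,
5446, and 6343 ml).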
Each cohort was represented by a single 365-day exposure sequence
developed from data obtained from Cincinnati subjects assigned to the
specified DG. The Cincinnati study was selected for this purpose because
it included subjects who could be assigned to each of the DG's of
interest. As previously indicated, the Denver study omitted children,
persons over 70 years of age, and smokers.
Responses to data items appearing in the Cincinnati background
questionnaire were the primary means of assigning Cincinnati subjects to
DG's. In general, the DG's were defined so as to provide as many different
exposure sequences as possible within the overall constraints that the DG
definitions emphasize standard Bureau of Census identifiers and that the
number of Cincinnati subjects assigned to each DG equal or exceed 15.
Table 1 lists the number of subjects assigned to each DG by month of
participation (March or August).
The second population-at-risk (persons with angina) was considered to
be a subset of the first. PEI developed age- and sex-specific prevalence
rates for angina and applied these to the general population estimates to
determine the number of persons with angina in each cohort.

TABLE 2. ESTIMATES OF COHORT BLOOD VOLUMES AND 1980 POPULATIONS

Code  Demographic group (DG)    Cohort  Body wt,   VB,   Percent   Population, thousands
                                           kg       ml    of DG    General     Angina
  1   Preschoolers, under 2       A         6.3     460    25.6      10.9      0
      years of age                B         8.9     650    21.8       9.2      0
      VB = wt * 73.0 ml/kg        C        10.7     781    31.3      13.3      0
                                  D        12.9     942    21.4       9.1      0
  2   Preschoolers, 2 to 5        A        12.0     876    23.1      19.6      0
      years of age                B        14.2    1037    25.1      21.3      0
      VB = wt * 73.0 ml/kg        C        16.7    1219    28.8      24.4      0
                                  D        20.2    1475    23.0      19.5      0
  3   Students, 6 to 9 years      A        19.5    1424    25.4      21.7      0
      of age                      B        23.2    1694    23.1      19.8      0
      VB = wt * 73.0 ml/kg        C        26.7    1949    26.1      22.3      0
                                  D        32.0    2336    25.4      21.7      0
  4   Students, 10 to 13 years    A        29.4    2146    25.1      22.9      0
      of age                      B        36.3    2650    24.0      21.9      0
      VB = wt * 73.0 ml/kg        C        41.6    3037    26.1      23.8      0
                                  D        50.1    3657    24.7      22.5      0
  5   Students, 14+ years of      A        47.5    3491    24.5       9.4      0
      age, smokers                B        56.8    4175    25.4       9.8      0
      VB = wt * 73.5 ml/kg        C        63.8    4689    24.7       9.5      0
                                  D        75.3    5535    25.4       9.8      0
  6   Students, 14+ years of      A        47.5    3491    24.5      31.8      0
      age, nonsmokers             B        56.8    4175    25.4      32.8      0
      VB = wt * 73.5 ml/kg        C        63.8    4689    24.7      32.0      0
                                  D        75.3    5535    25.4      32.9      0
  7   Workers, commute < 20       A        52.6    3866    25.1      29.8      0.385
      minutes, smokers            B        65.4    4807    24.4      28.9      0.374
      VB = wt * 73.5 ml/kg        C        74.1    5446    25.4      30.2      0.390
                                  D        86.3    6343    25.1      29.8      0.385
  8   Workers, commute < 20       A        52.6    3866    25.1      48.1      0.605
      minutes, nonsmokers         B        65.4    4807    24.4      46.7      0.588
      VB = wt * 73.5 ml/kg        C        74.1    5446    25.4      48.8      0.612
                                  D        86.3    6343    25.1      48.1      0.605
  9   Workers, commute 20+        A        52.6    3866    25.1      39.9      0.517
      minutes, smokers            B        65.4    4807    24.4      38.8      0.502
      VB = wt * 73.5 ml/kg        C        74.1    5446    25.4      40.6      0.523
                                  D        86.3    6343    25.1      39.9      0.517
 10   Workers, commute 20+        A        52.6    3866    25.1      64.4      0.812
      minutes, nonsmokers         B        65.4    4807    24.4      62.6      0.790
      VB = wt * 73.5 ml/kg        C        74.1    5446    25.4      65.4      0.822
                                  D        86.3    6343    25.1      64.4      0.812
 11   Homemakers and unempl.,     A        47.9    3497    24.6      11.3      0.120
      smokers                     B        59.9    4373    26.1      12.0      0.127
      VB = wt * 73.0 ml/kg        C        68.4    4993    23.9      11.0      0.116
                                  D        80.7    5891    25.4      11.7      0.123
 12   Homemakers and unempl.,     A        47.9    3497    24.6      20.7      0.216
      nonsmokers                  B        59.9    4373    26.1      22.0      0.229
      VB = wt * 73.0 ml/kg        C        68.4    4993    23.9      20.1      0.210
                                  D        80.7    5891    25.4      21.4      0.223
 13   Retired, smokers            A        52.3    3839    24.8       4.6      0.224
      VB = wt * 73.4 ml/kg        B        64.2    4712    25.7       4.7      0.233
                                  C        72.1    5292    24.5       4.5      0.222
                                  D        83.3    6114    25.0       4.6      0.226
 14   Retired, nonsmokers         A        52.3    3839    24.8      23.7      0.055
      VB = wt * 73.4 ml/kg        B        64.2    4712    25.7      24.6      1.094
                                  C        72.1    5292    24.5      23.4      0.996
                                  D        83.3    6114    25.0      23.9      0.017

VB: blood volume.
EXPOSURE EVENT SEQUENCE
An exposure event sequence is a time-ordered list of exposure events
experienced by each member of a cohort from midnight January 1 to midnight
December 31. Each exposure event is defined by microenvironment, smoking
status, breathing rate, and duration.
Microenvironments are generalized locations which are considered to
have predictable CO levels. Examples include parking garages, the
interiors of automobiles, and residences. An exhaustive set of
microenvironments was defined in terms of the subject responses to data
items appearing in the Cincinnati diary (Table 3).
Smoking status and breathing rate were also determined by responses
to items appearing in the Cincinnati activity diary. Possible responses
are listed below.
Smoking status              Breathing rate
17: subject smoking         13: slow
18: others smoking          14: medium
19: no one smoking          15: fast
                            16: breathing problem
Durations were specified in minutes, the smallest unit of time which
could be distinguished in the Cincinnati and Denver data bases. To allow
exposure sequences to be related to hourly average CO data from a downtown
fixed-site monitor, no duration exceeded 60 minutes and no event fell into
more than one clock hour. For example, a visit to a restaurant lasting
from 12:35 to 1:22 was treated as two events, the first lasting from 12:35
to 1:00 and the second lasting from 1:00 to 1:22. Consequently, any given
event can be identified by specifying the hour h during which it occurs and
the position i of the event within the hour. Thus, the third event within
the ninth hour of the year can be identified by specifying h = 9 and i = 3.
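The splitting rule can be sketched as follows. Times are expressed in
minutes elapsed since midnight on January 1, and hours are numbered from
1, consistent with the h = 9 example above. The function names and the
tuple layout are hypothetical; this is an illustration of the rule, not
the PEI software.

```python
from typing import Dict, List, Tuple


def split_at_clock_hours(start_min: int, end_min: int) -> List[Tuple[int, int, int]]:
    """Split [start_min, end_min) into pieces that each stay in one clock hour.

    Returns (hour h, start minute, duration in minutes) for each piece.
    """
    pieces = []
    t = start_min
    while t < end_min:
        next_hour = (t // 60 + 1) * 60      # start of the following clock hour
        stop = min(end_min, next_hour)
        pieces.append((t // 60 + 1, t, stop - t))
        t = stop
    return pieces


def index_events(pieces: List[Tuple[int, int, int]]) -> List[Tuple[int, int, int, int]]:
    """Attach the position i of each piece within its hour h."""
    seen: Dict[int, int] = {}
    indexed = []
    for h, start, duration in pieces:
        seen[h] = seen.get(h, 0) + 1
        indexed.append((h, seen[h], start, duration))
    return indexed


# Restaurant visit from 12:35 to 1:22 p.m. on January 1.
visit = split_at_clock_hours(12 * 60 + 35, 13 * 60 + 22)
print(index_events(visit))
```

The visit becomes a 25-minute event in hour 13 and a 22-minute event in
hour 14, so no event exceeds 60 minutes or spans two clock hours.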
The development for each cohort of an event sequence containing 365
days was complicated by the fact that no individual Cincinnati subject
could contribute more than three distinct days of diary data to the
sequence. PEI chose to overcome this obstacle by pooling the diary data
for all Cincinnati subjects assigned to a particular DG and then
constructing an individual 365-day sequence for each of the four cohorts in
the DG by randomly selecting subject-days from the pool. Subject-days
within the pool were labeled as to day type (weekday, Saturday, or Sunday)
and season (warm or cold). When a warm-weather Saturday was needed for a
sequence, one was randomly selected from the subject-days labeled "warm
Saturdays."
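A minimal sketch of this pooling procedure is given below. Several
details are assumptions made only for illustration: the months treated as
"warm" (WARM_MONTHS), sampling with replacement, the use of 1986 as the
calendar year, and the layout of the pool as a dictionary keyed by season
and day type. None of these is specified in this paper.

```python
import datetime
import random
from typing import Dict, List, Tuple

WARM_MONTHS = {5, 6, 7, 8, 9, 10}  # assumption: not specified in the source


def day_type(d: datetime.date) -> str:
    """Classify a date as weekday, saturday, or sunday."""
    wd = d.weekday()  # Monday = 0, ..., Sunday = 6
    return "saturday" if wd == 5 else "sunday" if wd == 6 else "weekday"


def build_sequence(pool: Dict[Tuple[str, str], List[str]],
                   year: int, rng: random.Random) -> List[str]:
    """Draw one subject-day per calendar day from the matching pool cell."""
    sequence = []
    d = datetime.date(year, 1, 1)
    while d.year == year:
        season = "warm" if d.month in WARM_MONTHS else "cold"
        # Sampling with replacement (assumption): a cell may be reused.
        sequence.append(rng.choice(pool[(season, day_type(d))]))
        d += datetime.timedelta(days=1)
    return sequence


# Hypothetical pool: five labeled subject-days in each season/day-type cell.
pool = {(s, t): [f"{s}-{t}-{n}" for n in range(5)]
        for s in ("warm", "cold") for t in ("weekday", "saturday", "sunday")}
sequence = build_sequence(pool, 1986, random.Random(0))
```

Calling build_sequence four times with independent random draws would
give the four cohort sequences of one DG.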
The obvious drawback to this approach is that the resulting CO
exposure sequences may not exhibit the day-to-day repetitions in activities
that would be present if each sequence of 365 days were obtained from the
same subject. The advantage of this approach is that it makes full use of
all available data. The resulting exposure sequence will reflect to some
extent the person-to-person variability of all subjects assigned to a
particular DG.

TABLE 3. ESTIMATED MICROENVIRONMENT FACTORS

                                     Microenvironment factors      Goodness-of-fit     Spmn.
                                                                                       corr.
 m  Location categories               A    Lambda  Theta  Sigma   Skew.  Kurt.  Pear.  coef.

Motor vehicle microenvironments
 1  Motorcycle                       0.90   0.61    2.95   3.73   -0.4    3.1    0.11   0.52
 2  Bus                              0.64   0.62    1.31   3.47    0      2.7    0.08   0.47
 3  Truck/van                        0.52   0.43    0.11   3.89    0      2.7    0.07   0.43
 4  Car                              0.49   0.44    0.42   3.93    0      3.3    0      0.41

Indoor microenvironments
 5  Church                           0.33   0.44   -1.73   2.48    0.4    2.5   -0.06   0.38
 6  Manufacturing fac.               0.51   1.55   -1.32   2.68    0.7    5.7   -0.10   0.57
 7  Health care fac.                 0.31   0.40   -2.48   2.91    0.7    2.5    0.11   0.56
 8  Rest./bar                        0.41   0.60    0.01   2.99    0.4    3.2    0.10   0.51
 9  School                           0.33   0.40   -3.10   2.84    0.9    2.7    0.11   0.58
10  Service station                  0.58   0.50    0.75   4.18    0      2.7    0      0.37
11  Other public build.              0.33   0.60   -1.24   2.80    0.8    3.5    0.11   0.69
12  Store                            0.28   0.39   -1.18   3.17    0      2.0    0      0.48
13  Other repair                     0.42   0.55    0.60   4.43    0.5    1.8    0.11   0.34
14  Office                           0.16   0.40   -0.81   3.06    0      2.3    0.11   0.36
15  Residential - other              0.12   0.45   -1.00   2.59    0.6    3.1    0.11   0.34
16  Residential - meal               0.20   0.45    0.48   3.16    0      2.4    0.06   0.27
17  Res. garage                      0.18   0.35   -0.96   3.38    0      2.0    0      0.29
18  Indoor public garage             0.29   0.41    2.41   3.64    0      4.2    0      0.33
19  Auditorium                       0.15   0.43    0.22   3.16    0      2.1    0      0.17
                                     0*     0.38    1.09   2.35    0      2.6
20  Shopping mall                    0.24   0.42    0.43   3.07    0      2.7    0      0.23
                                     0*     0.32    1.44   1.96    0      3.6
21  Other indoor loc.                0.19   0.34   -0.31   3.68    0      3.2    0      0.27

Outdoor microenvironments
22  Outdoor public gar.              0.60   0.77    0.79   3.47    0      2.3    0      0.58
23  Park, golf course,               0.14   0.30   -3.56   3.07    0.9    2.2   -0.09   0.06
    sports arena                     0*     0.25   -1.87   2.30    0.6    2.1
24  Other outdoor loc.               0.58   0.50   -2.12   3.51    0.9    3.2    0.11   0.59
25  School grounds, bicycle          0.78   0.60   -2.93   2.97    1.3    3.8    0.11   0.59
26  Near road                        0.39   0.45   -1.30   3.50    0.4    2.5    0.11   0.49
27  Outdoor service sta.,            0.29   0.58    0.33   3.03    0      2.6    0.01   0.25
    parking lot
28  Residential grounds              0.14   0.41   -2.04   2.47    0.6    2.0    0.01   0.23

* Assumed because Spearman rank correlation coefficient is not significant at
0.05 level.
Skew.: Skewness coefficient for distribution of residuals (skew = 0 for
normal distribution).
Kurt.: Kurtosis coefficient for distribution of residuals (kurt = 3 for
normal distribution).
Pear.: Pearson correlation coefficient between Broadway fixed-site monitor
values and residuals of fitted model.
Spmn. corr. coef.: Spearman rank correlation coefficient between PEM values
and simultaneous Broadway fixed-site monitor values.
ESTIMATION OF EXPOSURE DURING AN EVENT
The assumption was made that the CO concentration for a particular
cohort was constant during a given exposure event. This concentration was
estimated by assuming that the CO concentration during a particular
exposure event reflects 1) the ambient concentration as measured by one or
more fixed-site monitors, 2) the contribution of localized sources and
sinks specific to the microenvironment occupied during the event, and 3)
smoking. The resulting model can be expressed as
     CO(c,h,i;m,s) = A(m)*MON(h) + B(m) + C(s)
where CO(c,h,i;m,s) is the CO concentration estimated for cohort c, hour h
(1 ≤ h ≤ 8760), and event i within hour h, given that the microenvironment
occupied is m and the smoking status is s.
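The exposure equation can be checked numerically with a short sketch. The
input values are those reported for exposure event 23 in the sample
application later in this paper (A(m) = 0.29, MON(h) = 3.3 ppm, B(m) =
7.9 ppm, C(s) = 0 because the cohort was not smoking). The function name
is hypothetical.

```python
def co_exposure_ppm(a_m: float, mon_h: float, b_m: float, c_s: float = 0.0) -> float:
    """CO concentration for one exposure event: A(m)*MON(h) + B(m) + C(s)."""
    return a_m * mon_h + b_m + c_s


# Exposure event 23 of the sample application (cohort 8C, microenvironment 27).
co_event_23 = co_exposure_ppm(a_m=0.29, mon_h=3.3, b_m=7.9)
print(round(co_event_23, 1))
```

The result rounds to 8.9 ppm, the value given for that event.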
The quantity MON(h) denotes the hourly average ambient CO
concentration for hour h which would be expected to occur at the Denver
"design-value" monitoring site under one of the two scenarios of interest
("as is" and "meets 8-h NAAQS"). For the "as is" scenario, MON(h) was
taken directly from a data set consisting of 1) the 1986 hourly-average CO
data reported by the fixed-site monitor located on Broadway; and 2)
estimates of the missing values provided by an interpolation technique
based on time-series analysis. For the alternative scenario, the "as is"
data set was adjusted using a rollback formula so that the second largest
daily-maximum 8-hour CO value exactly equalled 9 ppm, the 8-h CO level not
to be exceeded more than once per year under the NAAQS for CO.
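The rollback formula itself is not given in this paper. The sketch below
therefore assumes the simplest form, a proportional rollback in which
every hourly value is multiplied by the same factor, with no background
term. It also assumes that each 8-hour running average is assigned to the
day in which it ends and that the hourly series has no missing values.
All function names are hypothetical.

```python
from typing import List


def daily_max_8h(hourly: List[float]) -> List[float]:
    """Daily maxima of 8-hour running means (window assigned to its end day)."""
    n_days = len(hourly) // 24
    maxima = [float("-inf")] * n_days
    window_sum = 0.0
    for k, value in enumerate(hourly[: n_days * 24]):
        window_sum += value
        if k >= 8:
            window_sum -= hourly[k - 8]   # drop the hour leaving the window
        if k >= 7:                        # first complete 8-hour window
            day = k // 24
            maxima[day] = max(maxima[day], window_sum / 8)
    return maxima


def proportional_rollback(hourly: List[float], target_ppm: float = 9.0) -> List[float]:
    """Scale all hours so the second-largest daily max 8-h value equals target."""
    second_largest = sorted(daily_max_8h(hourly), reverse=True)[1]
    factor = target_ppm / second_largest
    return [value * factor for value in hourly]


# Hypothetical three-day series with a diurnal ramp and a day-to-day trend.
as_is = [1.0 + 0.5 * (k % 24) + (k // 24) for k in range(72)]
adjusted = proportional_rollback(as_is)
print(sorted(daily_max_8h(adjusted), reverse=True)[1])
```

Because running averages scale linearly with the data, a single factor is
enough to bring the second-largest daily-maximum 8-hour value to 9 ppm
under these assumptions.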
A(m) is a multiplicative constant specific to microenvironment m.
The quantity B(m) is an additive factor selected at random from a "Box-Cox"
distribution specific to m. The Box-Cox distribution is completely defined
by three parameters: lambda(m), theta(m), and sigma(m). Values of A(m),
lambda(m), theta(m), and sigma(m), jointly referred to as microenvironment
factors, were developed from the Denver data base by relating PEM
concentrations measured in microenvironment m to simultaneous fixed-site CO
values reported by the Broadway monitor. The values were determined by a
statistical procedure that increased the proportion of variance explained
by the stochastic term B(m) as the observed correlation between PEM values
and fixed-site values decreased (Reference 4). For microenvironments where
the correlation was not statistically significant, A(m) was set equal to
zero and all variance was explained by B(m). Table 3 lists the values of
A(m), lambda(m), theta(m), and sigma(m) determined for each
microenvironment.
The quantity C(s) is an additive factor reflecting the contribution
of active smoking (i.e., smoking status = 17) to the CO exposure. The
contribution of passive smoking is assumed to be included in the B(m) term.
Appropriate values of C(s) were obtained by determining the continuous CO
exposure that would yield the steady-state COHb levels measured in typical
smokers after extended periods of smoking (Reference 4).
ESTIMATION OF HOURLY AVERAGE COHb VALUES
The sequence of CO exposure estimates determined for a particular
cohort together with the associated durations and breathing rates were read
by a computer subroutine that yielded a sequence of event-specific COHb
estimates for the cohort. This subroutine contained an algorithm that
applied an equation developed by Coburn (Reference 5) to the input data.
Another subroutine converted the event-specific COHb values to a series of
8760 hourly-average COHb estimates for the cohort.
The Coburn Equation Algorithm (CEA) required the following
physiological data on the members of the cohort:
Blood volume
Hemoglobin concentration
Endogenous CO production rate
CO diffusion rate
Altitude
Haldane constant
Initial COHb level
Ventilation rate (VA) by breathing rate.
The Haldane constant (M) expresses the relative affinity of hemoglobin for
CO and oxygen. Values for M ranging from 210 to 250 have appeared in the
scientific literature. PEI selected the value of 218 for this analysis, as
this value had been selected by EPA for use in a prior CO exposure analysis
after considering the relative merits of the various suggested values of M
(Reference 6).
PEI used a constant value of 5280 feet for altitude in the analysis. The
effect on COHb estimates of omitting CO exposures at higher altitudes
(e.g., weekend trips to the Rocky Mountains) is expected to be somewhat
offset by the lower ambient CO levels expected at these altitudes.
The CO exposure sequence for each cohort begins at midnight on January
1. The results of CO exposures prior to this time are represented in the
CEA by an initial COHb level (i.e., the COHb level at midnight). PEI
assumed the "true" initial COHb value for each cohort fell within the range
of zero to 1 percent. PEI compared the effects of initial COHb levels of
zero and 1 percent on the COHb level estimated by the CEA for a typical
adult cohort after a few hours of exposure. No effect was discernible.
PEI subsequently decided to assign each cohort an initial COHb level of 0.5
percent.
PEI conducted a search of the scientific literature to develop
estimates of the remaining parameters used in the CEA. Tables 4 and 5 list
these estimates by cohort. Details on the methods used to develop these
and other model inputs are provided in Reference 4.
17-13
-------
EXTRAPOLATION TO POPULATION-AT-RISK
Each DG was defined in terms of the age, working status, commuting
time, and smoking status of its members. The age, working status, and
commuting time identifiers were identical to identifiers used by the Bureau
of the Census. Data on smoking rates by age were obtained through a
literature search. By applying the smoking rates to the Denver census data
broken down by age, working status, and commuting time, PEI was able to
estimate the number of Denver area residents belonging to each DG.
Each DG population was further divided into four cohort populations
through the use of data giving the distribution of body weight by age. The
results were estimates of the number of Denver area residents in each
cohort (Table 2). The sum of these estimates equalled the population-at-
risk referred to as the "general population." Data on the prevalence of
angina by age were then used to estimate the number of persons suffering
from angina that belonged to each cohort. The sum of these latter
estimates equalled the population-at-risk referred to as "persons with
angina" (Table 2).
The cohort population estimates were used to extrapolate the cohort-
specific sequences of hourly-average COHb estimates to the various
populations-at-risk. Details concerning the extrapolation technique are
provided in Reference 4.
SAMPLE APPLICATION OF METHODOLOGY
To demonstrate the methodology, it has been applied to the cohort
designated as 8C in Table 2, that is, non-smoking workers who commute less
than 20 minutes and who fall in the body weight category centered around
74.1 kg. Figure 1 is a computer printout listing the first 54 events in
the exposure event sequence developed for the cohort together with the
values of CO exposure and COHb level associated with each event. The table
headings are defined below.
EVENT: the exposure event number
DAY: the Julian date (1 = January 1)
TIME: Start time of the event (e.g., 1731 = 5:31 p.m.)
DUR: duration of the event in minutes
MICRO: microenvironment occupied during event (see Table 3 for
code explanation)
SMOKE: smoking status during event (18 = others smoking, 19 = no
one smoking)
BR: breathing rate category during event (1 = 10,508 ml/min,
2 = 21,059 ml/min, 3 = 63,100 ml/min for cohort 8C)
A: multiplicative term [A(m)] associated with
microenvironment occupied during exposure event
MON: hourly average fixed-site monitoring value [MON(h)]
during exposure event
during exposure event
B: stochastic term [B(m)] selected from Box-Cox distribution
specific to microenvironment occupied during exposure
event
CO: estimated CO concentration to which cohort is exposed
during event
COHb: estimated COHb level at end of exposure event.
The listings for exposure event 23, for example, indicate the cohort
entered microenvironment 27 (outdoor service station or parking lot) at
1700 (5:00 p.m.) and remained there for 5 minutes. Because the breathing
rate of the cohort was slow (BR = 1), the cohort was assigned a ventilation
rate of 10,508 ml/min according to Table 5. The values of A(m), MON(h),
and B(m) in the exposure equation were determined to be 0.29, 3.3 ppm, and
7.9 ppm, respectively, for this event. The cohort was not smoking;
consequently the C(s) term in the exposure equation was set equal to zero
for this event. The exposure equation yields an estimated CO level for the
event equal to (0.29 * 3.3 ppm) + (7.9 ppm) or 8.9 ppm. Given a COHb level
of 0.307 percent at the beginning of the event, the Coburn equation yields
an estimate of 0.359 percent for the COHb level at the end of the event.
The cohort-specific physiological parameters used as inputs to the Coburn
equation for Cohort 8C can be found in Table 4.
Inspection of Figure 1 reveals a basic limitation of the model as
currently constructed. Because they are randomly drawn from
microenvironment-specific distributions, the values listed for B(m) under
column heading "B" display very little autocorrelation, even when
successive events occur in the same microenvironment (e.g., events 1
through 9). As a consequence, successive CO exposure estimates display
much less autocorrelation than is observed in PEM data. This limitation in
the model was recognized during its initial development but was not
addressed due to resource constraints. The limitation was not considered
to be serious, as the Coburn equation tends to dampen out fluctuations in
CO exposure when estimates of COHb are made. Nevertheless, future versions
of the model will incorporate an appropriate degree of autocorrelation in
the B(m) term so that the resulting CO exposure estimates will more nearly
reflect the autocorrelation observed in PEM data.
17-15
-------
TABLE 4. ESTIMATED VALUES FOR INPUT PARAMETERS OF COBURN MODEL

                                                                    Body      Coburn model inputs (a)
Code  Demographic group (DG)                              Cohort   wt, kg      VB      DL      HB      VCO
  1   Preschoolers, under 2 years of age                    A        6.3      460     2.9    12.3    0.00051
                                                            B        8.9      650     4.9    12.3    0.00072
                                                            C       10.7      781     6.2    12.3    0.00087
                                                            D       12.9      942     7.6    12.3    0.00105
  2   Preschoolers, 2 to 5 years of age                     A       12.0      876     7.0    12.3    0.00097
                                                            B       14.2     1037     8.5    12.3    0.00115
                                                            C       16.7     1219    10.1    12.3    0.00135
                                                            D       20.2     1475    12.2    12.3    0.00164
  3   Students, 6 to 9 years of age                         A       19.5     1424    11.8    12.8    0.00165
                                                            B       23.2     1694    13.9    12.8    0.00196
                                                            C       26.7     1949    15.9    12.8    0.00225
                                                            D       32.0     2336    18.7    12.8    0.00270
  4   Students, 10 to 13 years of age                       A       29.4     2146    17.4    13.3    0.00258
                                                            B       36.3     2650    21.0    13.3    0.00318
                                                            C       41.6     3037    23.6    13.3    0.00365
                                                            D       50.1     3657    27.6    13.3    0.00439
  5   Students, 14+ years of age, smokers                   A       47.5     3491    26.4    14.2    0.00448
                                                            B       56.8     4175    30.7    14.2    0.00535
                                                            C       63.8     4689    33.8    14.2    0.00601
                                                            D       75.3     5535    38.7    14.2    0.00710
  6   Students, 14+ years of age, nonsmokers                A       47.5     3491    26.4    14.2    0.00448
                                                            B       56.8     4175    30.7    14.2    0.00535
                                                            C       63.8     4689    33.8    14.2    0.00601
                                                            D       75.3     5535    38.8    14.2    0.00710
  7   Workers, commute less than 20 minutes, smokers        A       52.6     3866    26.4    14.4    0.00503
                                                            B       65.4     4807    32.9    14.4    0.00625
                                                            C       74.1     5446    37.3    14.4    0.00708
                                                            D       86.3     6343    43.5    14.4    0.00825
  8   Workers, commute less than 20 minutes, nonsmokers     A       52.6     3866    26.4    14.4    0.00503
                                                            B       65.4     4807    32.6    14.4    0.00625
                                                            C       74.1     5446    36.8    14.4    0.00708
                                                            D       86.3     6343    42.8    14.4    0.00825
  9   Workers, commute 20+ minutes, smokers                 A       52.6     3866    26.4    14.4    0.00503
                                                            B       65.4     4807    32.9    14.4    0.00625
                                                            C       74.1     5446    37.3    14.4    0.00708
                                                            D       86.3     6343    43.6    14.4    0.00825
 10   Workers, commute 20+ minutes, nonsmokers              A       52.6     3866    26.4    14.4    0.00503
                                                            B       65.4     4807    32.6    14.4    0.00625
                                                            C       74.1     5446    36.8    14.4    0.00708
                                                            D       86.3     6343    42.8    14.4    0.00825
 11   Homemakers and unempl., smokers                       A       47.9     3497    24.7    13.6    0.00430
                                                            B       59.9     4373    28.9    13.6    0.00537
                                                            C       68.4     4993    31.9    13.6    0.00613
                                                            D       80.7     5891    36.2    13.6    0.00724
 12   Homemakers and unempl., nonsmokers                    A       47.9     3497    24.8    13.6    0.00430
                                                            B       59.9     4373    28.9    13.6    0.00537
                                                            C       68.4     4993    31.8    13.6    0.00613
                                                            D       80.7     5891    35.9    13.6    0.00724
 13   Retired, smokers                                      A       52.3     3839    24.5    14.1    0.00489
                                                            B       64.2     4712    31.4    14.1    0.00600
                                                            C       72.1     5292    36.0    14.1    0.00674
                                                            D       83.3     6114    42.5    14.1    0.00779
 14   Retired, nonsmokers                                   A       52.3     3839    24.6    14.1    0.00489
                                                            B       64.2     4712    30.5    14.1    0.00600
                                                            C       72.1     5292    34.4    14.1    0.00674
                                                            D       83.3     6114    39.9    14.1    0.00779

(a) Abbreviations
VB: blood volume in milliliters
DL: CO diffusion rate in ml/min/torr
HB: hemoglobin concentration in grams/100 ml
VCO: endogenous CO production rate in ml/min.
17-17
-------
TABLE 5. ESTIMATES OF VENTILATION RATES (ml/min) TO BE ASSIGNED TO DIARY
BREATHING RATE RESPONSES

                                                                   VA by breathing rate response
Code  Demographic group (DG)                              Cohort     Slow      Medium      Fast
  1   Preschoolers, under 2 years of age                    A        1,897      3,770     11,233
                                                            B        2,227      4,432     13,221
                                                            C        2,456      4,892     14,599
                                                            D        2,735      5,453     16,282
  2   Preschoolers, 2 to 5 years of age                     A        2,621      5,223     15,593
                                                            B        2,900      5,784     17,276
                                                            C        3,218      6,422     19,189
                                                            D        3,662      7,314     21,866
  3   Students, 6 to 9 years of age                         A        3,574      7,136     21,331
                                                            B        4,043      8,079     24,161
                                                            C        4,488      8,972     26,839
                                                            D        5,161     10,323     30,893
  4   Students, 10 to 13 years of age                       A        4,830      9,660     28,904
                                                            B        5,707     11,420     34,183
                                                            C        6,380     12,771     38,237
                                                            D        7,460     14,939     44,740
  5   Students, 14+ years of age, smokers                   A        7,130     14,276     42,751
                                                            B        8,311     16,647     49,865
                                                            C        9,200     18,432     55,220
                                                            D       10,660     21,365     64,018
  6   Students, 14+ years of age, nonsmokers                A        7,130     14,276     42,751
                                                            B        8,311     16,647     49,865
                                                            C        9,200     18,432     55,220
                                                            D       10,660     21,365     64,018
  7   Workers, commute less than 20 minutes, smokers        A        7,777     15,576     46,652
                                                            B        9,403     18,840     56,444
                                                            C       10,508     21,059     63,100
                                                            D       12,057     24,170     72,433
  8   Workers, commute less than 20 minutes, nonsmokers     A        7,777     15,576     46,652
                                                            B        9,403     18,840     56,444
                                                            C       10,508     21,059     63,100
                                                            D       12,057     24,170     72,433
  9   Workers, commute 20+ minutes, smokers                 A        7,777     15,576     46,652
                                                            B        9,403     18,840     56,444
                                                            C       10,508     21,059     63,100
                                                            D       12,057     24,170     72,433
 10   Workers, commute 20+ minutes, nonsmokers              A        7,777     15,576     46,652
                                                            B        9,403     18,840     56,444
                                                            C       10,508     21,059     63,100
                                                            D       12,057     24,170     72,433
 11   Homemakers and unempl., smokers                       A        7,180     14,378     43,057
                                                            B        8,704     17,438     52,237
                                                            C        9,784     19,605     58,739
                                                            D       11,346     22,742     68,149
 12   Homemakers and unempl., nonsmokers                    A        7,180     14,378     43,057
                                                            B        8,704     17,438     52,237
                                                            C        9,784     19,605     58,739
                                                            D       11,346     22,742     68,149
 13   Retired, smokers                                      A        7,739     15,500     46,423
                                                            B        9,250     18,534     55,526
                                                            C       10,254     20,549     61,670
                                                            D       11,676     23,405     70,138
 14   Retired, nonsmokers                                   A        7,739     15,500     46,423
                                                            B        9,250     18,534     55,526
                                                            C       10,254     20,549     61,570
                                                            D       11,676     23,405     70,138
17-19
-------
EVENT  DAY  TIME  DUR  MICRO  SMOKE  BR     A   MON     B    CO   COHB
    1    1     0   60     15     19   1  0.12   1.3   0.2   0.4  0.384
    2    1   100   60     15     19   1  0.12   1.0   2.3   2.4  0.442
    3    1   200   60     15     19   1  0.12   1.2  -0.0   0.1  0.331
    4    1   300   60     15     19   1  0.12   1.0  -0.1   0.0  0.258
    5    1   400   60     15     19   1  0.12   1.0   1.1   1.2  0.291
    6    1   500   45     15     19   1  0.12   0.9   0.4   0.5  0.271
    7    1   545   15     15     19   1  0.12   0.9  -0.0   0.1  0.259
    8    1   600   60     15     19   1  0.12   1.6   2.6   2.8  0.438
    9    1   700   30     15     19   1  0.12   1.9  14.2  14.4  2.027
   10    1   730    1     26     19   1  0.39   1.9  -0.3   0.4  2.011
   11    1   731   24      4     19   1  0.49   1.9   0.3   1.2  0.765
   12    1   755    5     27     19   1  0.29   1.9   0.1   0.7  0.743
   13    1   800   60     14     19   1  0.16   2.1   2.9   3.2  0.706
   14    1   900   60     14     19   1  0.16   2.2   0.1   0.5  0.359
   15    1  1000   60     14     19   1  0.16   1.7   4.3   4.6  0.731
   16    1  1100   60     14     19   1  0.16   1.5  -0.0   0.2  0.332
   17    1  1200   45      8     19   1  0.41   3.0   0.9   2.1  0.380
   18    1  1245   15     14     19   1  0.16   3.0   0.0   0.5  0.361
   19    1  1300   60     14     19   1  0.16   2.3   1.8   2.2  0.416
   20    1  1400   60     14     19   1  0.16   1.7   5.4   5.7  0.910
   21    1  1500   60     14     19   1  0.16   1.9   0.1   0.4  0.359
   22    1  1600   60     14     19   1  0.16   3.0   0.0   0.5  0.307
   23    1  1700    5     27     19   1  0.29   3.3   7.9   8.9  0.359
   24    1  1705   25      4     19   1  0.49   3.3   1.6   3.2  0.417
   25    1  1730    1     26     19   1  0.39   3.3   0.0   1.3  0.416
   26    1  1731   29     15     19   1  0.12   3.3   0.0   0.4  0.371
   27    1  1800   45     15     19   1  0.12   3.7  -0.0   0.4  0.320
   28    1  1845    1     26     19   1  0.39   3.7   1.1   2.5  0.322
   29    1  1846   14      4     19   1  0.49   3.7   8.8  10.6  1.047
   30    1  1900   15      4     19   1  0.49   2.6   2.4   3.7  1.008
   31    1  1915    2     26     19   1  0.39   2.6   3.7   4.7  1.006
   32    1  1917   43     12     19   1  0.28   2.6  -0.0   0.7  0.503
   33    1  2000   30     12     19   1  0.28   2.1   0.3   0.9  0.455
   34    1  2030    5     26     19   1  0.39   2.1   2.2   3.0  0.462
   35    1  2035   10      4     19   1  0.49   2.1   2.3   3.3  0.479
   36    1  2045    2     26     19   1  0.39   2.1   3.5   4.3  0.485
   37    1  2047   13      8     18   1  0.41   2.1   0.2   1.1  0.468
   38    1  2100   60      8     18   1  0.41   0.7   2.6   2.9  0.524
   39    1  2200    2     26     19   1  0.39   0.3  -0.0   0.1  0.518
   40    1  2202   18      4     19   1  0.49   0.3   8.5   8.6  0.884
   41    1  2220    2     26     19   1  0.39   0.3  -0.1   0.0  0.871
   42    1  2222   38     15     19   1  0.12   0.3   0.2   0.2  0.438
   43    1  2300   30     15     19   1  0.12   0.3   3.0   3.0  0.480
   44    1  2330   30     15     19   1  0.12   0.3   4.8   4.8  0.577
   45    2     0   60     15     19   1  0.12   0.3  -0.0   0.0  0.301
   46    2   100   60     15     19   1  0.12   0.3   0.4   0.4  0.266
   47    2   200   60     15     19   1  0.12   0.3   0.1   0.1  0.226
   48    2   300   60     15     19   1  0.12   0.3   0.2   0.2  0.208
   49    2   400   60     15     19   1  0.12   0.3  -0.0   0.0  0.185
   50    2   500   58     15     19   1  0.12   0.4   2.2   2.2  0.350
   51    2   558    2     15     19   1  0.12   0.4   1.3   1.3  0.350
   52    2   600   60     15     19   1  0.12   0.6  -0.1   0.0  0.269
   53    2   700   60     15     19   1  0.12   0.5   1.5   1.6  0.323
   54    2   800   50     15     19   1  0.12   0.8   0.4   0.5  0.290

Figure 1. Excerpt from the exposure event sequence developed for
Cohort 8C indicating the CO exposure estimated for
each event and the resulting COHb level.
17-20
-------
MODEL VALIDATION
Resource limitations prevented PEI from validating the model during
its developmental stage. Appropriate data bases for validating portions of
the methodology exist, the most useful being COHb measurements obtained
from subjects of the 1982-83 Denver study discussed previously (Reference
1). This data base has been used in efforts to validate SHAPE, a similar
model for estimating COHb levels developed by Ott (Reference 7).
Unfortunately, the Denver study sample did not include children, elderly
persons, or smokers; COHb levels were not measured during the summer; and
the activity diary data obtained from each subject do not include breathing
rates. Data from other studies, as yet unidentified, would be needed to
completely validate the model presented here.
REFERENCES
1. Johnson, T. A Study of Personal Exposure to Carbon Monoxide in
Denver, Colorado. Prepared by PEI Associates, Inc., for U.S.
Environmental Protection Agency, Environmental Monitoring Systems
Laboratory, under Contract No. 68-02-3755. December 1983 (revised
June 1984).
2. Johnson, T., J. Capel, and L. Wijnberg. Selected Data Analyses
Relating to Studies of Personal Carbon Monoxide Exposure in Denver
and Washington. Prepared by PEI Associates, Inc., for U.S.
Environmental Protection Agency, Environmental Monitoring Systems
Laboratory, under Contract No. 68-02-3496. February 1985.
3. Johnson, T. A Study of Human Activity Patterns in Cincinnati, Ohio.
Prepared by PEI Associates, Inc., for Electric Power Research
Institute, Environmental Assessment Department, under Contract No.
RP940-06. November 1987.
4. Johnson, T., J. Capel, L. Wijnberg, and R. Paul. Environmental
Strategies Project Phase One; Estimated Carboxyhemoglobin Levels of
Selected Populations-at-Risk in the Denver Urbanized Area. Prepared
by PEI Associates, Inc., for U.S. Environmental Protection Agency,
under Contract No. 68-02-3890. September 1987 (draft).
5. Biller, W., and H. Richmond. Sensitivity Analysis on Coburn Model
Predictions of COHb Levels Associated With Alternative CO Standards.
Prepared by Dr. William F. Biller, 68 Yorktown Road, East Brunswick,
New Jersey and the U.S. Environmental Protection Agency, under
Contract No. 68-02-3600. November 1982.
6. Johnson, T., and R. A. Paul. The NAAQS Exposure Model (NEM) Applied
to Carbon Monoxide. Prepared by PEI Associates, Inc., for U.S.
Environmental Protection Agency, Office of Air Quality Planning and
Standards. December 1983.
7. Ott, W., J. Thomas, D. Mage, and L. Wallace. Validation of the
Simulation of Human Activity and Pollutant Exposure (SHAPE) Model
Using Paired Days from the Denver, Colorado, Carbon Monoxide Field
Study. Atmospheric Environment (in press).
ACKNOWLEDGMENT
The work described in this report was funded by the U.S.
Environmental Protection Agency under EPA Contract No. 68-02-3890. Mr. Tom
Donaldson was the EPA Project Officer and Mr. Steve Frey was the Work
Assignment manager. Mr. David Dunbar served as the Project Director for
PEI. Mr. Ted Johnson was the PEI Project Manager and developed the general
approach to exposure modeling described in this report. He also defined
the cohorts to be analyzed and developed the cohort-specific input
parameters used by the Coburn model. Mr. Jim Capel developed the computer
programs that constructed the exposure event sequence for each cohort. Dr.
Louis Wijnberg developed the computer programs that determined the
microenvironment factors. Mr. Roy Paul developed the program that
calculated cohort-specific carboxyhemoglobin levels using the Coburn
Equation Algorithm. Ms. Alicia Ferdo and Mr. Joe Steigerwald conducted
literature searches that provided data for estimating the input parameters
of the Coburn model.
The authors would like to thank Mr. Tom Walker of Industrial
Economics, Inc., and Mr. Ken Lloyd of the U.S. Environmental Protection
Agency for their guidance and helpful recommendations. The authors are
indebted to Dr. Ron Wyzga of the Electric Power Research Institute for
providing data obtained from the Cincinnati Activity Diary Study.
17-22
-------
THE INFLUENCE OF DAILY ACTIVITY PATTERNS ON
DIFFERENTIAL EXPOSURE TO CARBON MONOXIDE
AMONG SOCIAL GROUPS
by: Margo Schwab
Graduate School of Geography
Clark University
Worcester, MA 01610
ABSTRACT
What people do, where they go, and when (their activity patterns)
have a pronounced effect on their exposure to pollutants. Because such
activity patterns vary systematically among population subgroups defined by
sex, work status, age, and income, it is hypothesized that exposure varies
among such groups. The research presented here used the data collected
during the EPA's study of personal exposure to carbon monoxide (CO) in
Washington, DC to examine the relationship between sociodemographic factors
and exposure. The results of a statistical comparison of exposure
characteristics among male workers, male nonworkers, female workers, and
female nonworkers show that working men (low-exposure jobs) have the
highest exposures, whereas nonworking women have the lowest exposures. In
addition, although automobiles are the primary source of CO, travel is not
a significant contributor to total exposure for some groups.
This research was supported by the National Science Foundation (grant
SES-8617894).
18-1
-------
INTRODUCTION
An important finding in the field of total human exposure research is
that the locations that individuals visit, when they visit them, and what
they do there affect their exposure to air pollutants. Travel behavior
and time budget research show that these movement/activity patterns vary
systematically among population subgroups defined by such personal
characteristics as age, sex, work status, and income. The study presented
here links, for the first time, sociodemographic characteristics, activity
patterns, and exposure, testing the hypothesis that differences in the
activity patterns of social groups lead to differential exposure to carbon
monoxide (CO). The data collected during the EPA's field investigation of
personal exposure to CO in Washington, DC were used in a statistical
comparison of exposure characteristics among population subgroups. This
paper is divided into three parts: a discussion of the research context of
the differential exposure hypothesis, a description of the analysis method,
and a presentation of the results.
CONTEXT
Two diverse fields of study form the basis of the perspective taken
in this research: the recent focus of environmental assessment on total
human exposure to pollutants and the interests of urban geographers and
planners in the spatial and temporal patterning of social groups.
TOTAL HUMAN EXPOSURE ASSESSMENT
Because fixed-site monitors do not accurately characterize the wide
range of pollutant concentrations people routinely come into contact with,
researchers have begun to measure the actual exposure of individuals. Much
of the recent analysis focuses on "total integrated exposure" (e.g., 1,2).
Here each person's exposure is a function of the activities in which that
individual engages over the course of a day, thus the time spent in contact
with various pollutant concentrations. The discrete equation to describe
this concept defines total exposure for person i as the sum of each of the
products of the pollutant concentrations in microenvironments j and the
length of time the individual spends in j:
$$E_i = \sum_{j=1}^{J} C_j \, t_{ij}$$
where $E_i$    = total exposure of person i over the
                 time period of interest,
      $C_j$    = concentration experienced in
                 microenvironment j,
      $t_{ij}$ = time spent by person i in microenvironment
                 j, and
      $J$      = total number of microenvironments occupied
                 by person i over the period of interest (1).
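The summation above can be illustrated with a minimal sketch. The function name and the example day (three microenvironments with made-up concentrations and durations) are assumptions for illustration only and are not taken from the study data.

```python
from typing import Sequence


def total_exposure(concentrations_ppm: Sequence[float],
                   hours: Sequence[float]) -> float:
    """Sum of concentration x time over microenvironments (ppm-hours)."""
    if len(concentrations_ppm) != len(hours):
        raise ValueError("need one duration per microenvironment")
    return sum(c * t for c, t in zip(concentrations_ppm, hours))


# Hypothetical day: home (1.5 ppm, 14 h), travel (5.0 ppm, 2 h),
# work (2.0 ppm, 8 h).
example_day = total_exposure([1.5, 5.0, 2.0], [14.0, 2.0, 8.0])
print(example_day)
```

In this made-up day, travel occupies 2 of 24 hours but contributes 10 of the 47 ppm-hours, which is the kind of contrast between time use and exposure that the paper examines.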
This new focus on the individual has required incorporating the study
of activity patterns into environmental assessment. Indeed, recent
research has established that the activities a person participates in do
influence exposure. For example, the more time spent in microenvironments
with high concentrations of combustion-related pollutants (i.e., CO, NO2,
RSP), such as travel (3,4), cooking with a gas stove (5,6), and being in
the presence of cigarette smokers (7,8), the higher will be a person's
total exposure.
But the nature of the relationship between daily activity patterns
and exposure is complex. Each individual's activity pattern is unique;
whereas several people might have similar exposure levels (i.e., the same
amount of exposure in a given day), their exposure profiles (that is, the
time/space sources of that exposure) may vary tremendously. For instance,
imagine three individuals with the same total daily exposure. The first
person receives most of her exposure during a long commute to work, whereas
the second person receives most of his exposure from the presence of
smokers on the job, and yet another receives the majority of her exposure
from home sources (e.g., gas stove, clothes dryers, and space heaters).
Such differences in exposure profiles highlight the need to consider
differences in activity patterns among individuals when formulating models,
health research conclusions, and exposure reduction strategies. But,
whereas it is useful from an analytic standpoint to focus on the
individual, it would be enormously complicated and expensive to implement
individual-level policies. Useful models and feasible policy directives
are based upon generalizations about specific groups of people, places, or
activities. The methods of social scientists can help clarify the
relationship between activity patterns and exposure and thus provide the
foundation for useful generalizations about the exposure of population
subgroups.
ACTIVITY PATTERNS
Although the idea of studying activity patterns is new to the field
of environmental assessment, geographers and planners have a long tradition
of analyzing the nature of movement patterns. Analysis of data collected
via a variety of survey instruments, in several countries, and at multiple
times and places in the U.S. has consistently shown that the activity
patterns of population subgroups, defined by such personal characteristics
as sex, age, income, employment status, and race, vary systematically.
Specifically, because the constraints operating on and the opportunities
available to an individual are defined to some extent by their personal
characteristics, the activities of social groups tend to exhibit distinct
patterns. Women, for instance, travel shorter distances to work and to
other activities (9-14), spend less time in leisure activities (15,16), and
spend fewer hours in waged work (17,18) than do men, even after employment
status has been controlled. Both the number of trips and the timing of
that travel differ between those who work outside the home full time and those
who do not work outside of the home (10,19,20). In addition, nonworkers
tend to organize their activities around the home location, whereas workers
organize their activities around both the home and the work place (21).
The elderly make fewer trips (22,23), travel shorter distances, use the bus
more often (24), and have more leisure time (15) than does the younger
adult population cohort. Lower income groups take fewer trips (25), travel
shorter distances for shopping (26,27), and have lower automobile ownership
rates than do higher income groups. These kinds of differences imply that
the exposure levels and profiles also differ among such population
subgroups, thereby allowing useful generalizations to be made.
METHOD
DATA
The study of human exposure has been hampered by the scarcity of
field data on the pollutant concentrations to which people are exposed.
Only since the early 1980's have portable personal exposure monitors
(PEMs), which accurately record pollutant concentrations at low, ambient
levels, become available for general use (28). During the winter of
1982-1983 the EPA applied many of the methods used by social scientists to
study activity patterns to this new personal monitoring technology,
conducting a large-scale field investigation of actual individual exposure
to carbon monoxide. In each of two cities, Denver and Washington, DC, a
statistically representative sample of the noninstitutionalized, nonsmoking
population between the ages of 18 and 65 filled out questionnaires, kept
activity diaries, and carried PEMs with them throughout their daily
routines, recording CO concentrations at a time resolution of less than one
minute. Hartwell et al. (29) and Johnson (30) describe the details of the
data collection processes in Washington and Denver, respectively.
The data base resulting from this study is the largest and most
detailed available on total human exposure to an air pollutant. The EPA
has used it to determine the actual frequency distribution of CO exposure
in the population (31), to study the relationship between actual personal
exposure and exposures based on fixed-site monitors (31,32), to highlight
high exposure settings (31,32), and to verify exposure models (33,34). But
the richness of this disaggregate data base has not yet been fully
explored. The present paper used the Washington, DC component, consisting
of information on one day's activities for approximately 700 persons, to
study the time/space paths of individuals with different personal
characteristics, thereby investigating the implications of these paths for
exposure.
ANALYSIS TECHNIQUE
The method was to compare exposure characteristics among social
groups. I drew on the travel behavior tradition (e.g., 9,11,35) of
grouping individuals based on personal characteristics assumed to influence
activity patterns, thus influencing exposure. The results of previous
travel behavior studies, constrained by data availability (note 1), led to
focusing on groups formed on the basis of the role-related characteristics
of work status and sex. Work status was subdivided into three groups:
nonworkers, workers in low-exposure jobs, and workers in high-exposure jobs
(e.g., taxi, bus, and truck drivers, auto mechanics, police, cooks, and
crane operators) (note 2). Figure 1 shows the groups and their sample sizes.
   Nonworkers(a)          Low-exposure           High-exposure
                           workers(b)              workers(c)
     /      \               /      \
  Males    Females       Males    Females       males and females
  (87)      (190)        (161)     (157)              (37)
Figure 1. Sociodemographic grouping scheme with weighted sample sizes in
parentheses. (a) WORKTIME less than or equal to two hours on
the day sampled (workers sampled on weekends but not working
that day, and nonworkers). (b) WORKTIME greater than two hours
on the day sampled (either weekdays or weekends). (Those coded
as workers and sampled on weekdays, but listing less than two
hours of time as business, study, or on-the-job travel, were not
included in the analysis.) (c) Males and females grouped
together to maintain acceptable subsample sizes.
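The grouping rules in the notes to Figure 1 can be written out as a short sketch. The function name, argument names, and group labels below are hypothetical; the flag identifying a high-exposure job stands in for the occupation codes described in the text, which are not reproduced here.

```python
from typing import Optional


def assign_group(sex: str, worktime_hours: float, high_exposure_job: bool,
                 coded_worker: bool, sampled_on_weekday: bool) -> Optional[str]:
    """Return one of the five group labels, or None if the person is excluded."""
    # Coded workers sampled on a weekday who logged under two hours of
    # work were left out of the analysis (parenthetical note in Figure 1).
    if coded_worker and sampled_on_weekday and worktime_hours < 2.0:
        return None
    # Note (a): two hours or less of work on the sampled day.
    if worktime_hours <= 2.0:
        return "nonworking men" if sex == "M" else "nonworking women"
    # Note (c): high-exposure workers are pooled across sex.
    if high_exposure_job:
        return "high-exposure workers"
    # Note (b): more than two hours of work in a low-exposure job.
    return "low-exposure men" if sex == "M" else "low-exposure women"


print(assign_group("F", 0.0, False, coded_worker=False, sampled_on_weekday=True))
```

Checking the exclusion first keeps a weekday worker with little recorded work time from being counted as a nonworker.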
The first step in the analysis was exploratory, including an
examination of the descriptive statistics and the shape of the
distribution of each variable in Table 1 for each group. I then tested,
for each of the variables, the null hypothesis that "there is no
difference among the five sociodemographic groups."
Parametric ANOVAs with a posteriori Scheffe tests were performed on the
transformed variables (positively skewed variables were converted to
natural logarithms where appropriate), backed up by nonparametric
Kruskal-Wallis tests on the raw variables.
1. Income and race data were not collected. Few elderly or young people
were surveyed. Previous studies have shown that role-related
characteristics are much better determinants of activity patterns than are
economic characteristics.
2. These groups were defined partially on the basis of the original codes
assigned by Research Triangle Institute (RTI) during the data collection
and partially on the basis of new codes and grouping criteria formulated
specifically for this study. See notes in Figure 1.
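The two test statistics named above, the one-way ANOVA F value (applied to log-transformed values) and the Kruskal-Wallis H value (applied to raw values), can be sketched in plain Python. This is an unweighted illustration on made-up numbers; the study applied sampling weights, and the function names are assumptions rather than the software actually used.

```python
import math
from typing import Sequence


def anova_f(groups: Sequence[Sequence[float]]) -> float:
    """One-way ANOVA F statistic (between-group / within-group mean square)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic with midranks and the usual tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    rank_of = {}
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                                 # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2.0    # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    return h / (1.0 - tie_term / (n ** 3 - n))


# Hypothetical exposure values (ppm-hours) for three groups.
raw = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
logged = [[math.log(x) for x in g] for g in raw]
print(anova_f(logged), kruskal_wallis_h(raw))
```

F is compared with an F distribution on (k-1, n-k) degrees of freedom and H with a chi-square distribution on k-1 degrees of freedom; the probability lookup is omitted here.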
18-5
-------
TABLE 1. DEFINITION OF VARIABLES
Code        Description
MAX1HR+     Maximum one-hour exposure.
MAX8HR++    Maximum eight-hour exposure.
HOMEXP*     Exposure while inside the home.
WORKEXP*    Exposure while on the job (includes activities coded as
            business, study, and on-the-job travel).
TRAVEXP*    Exposure while traveling (includes activities coded as
            parking but excludes activities coded as on-the-job
            travel).
PHOME**     Proportion of total exposure from home.
PWORK**     Proportion of total exposure from work.
PTRAV**     Proportion of total exposure from travel.
HOMETIME    Total time (hours) at home (inside, all activities).
WORKTIME    Time (hours) in waged work (includes time in activities
            coded as business, study, and on-the-job travel).
LEISTIME    Time (hours) in leisure activities.
TRAVTIME    Time (hours) traveling (excludes on-the-job travel,
            includes parking).
HOMECO#     Mean CO concentration associated with home activities.
WORKCO#     Mean CO concentration associated with work activities
            (see above work variables).
TRAVCO#     Mean CO concentration associated with travel activities
            (see above travel variables).
+   Calculated by maximizing over all available one-hour averages of CO
    exposure for each individual (provided by RTI). Units are ppm/hour.
++  Calculated by maximizing over all available eight-hour averages of CO
    exposure for an individual (provided by RTI). Units are ppm/hour.
*   Calculated by summing equation 1 over the activity records coded as
    home (or work or travel, for WORKEXP and TRAVEXP, respectively). In
    particular, exposure levels were first calculated for each activity
    record by multiplying the number of minutes in that activity by the
    mean CO concentration recorded during that activity. Second, the
    individual exposure levels were summed across the chosen activities.
    Units are ppm-hours.
**  Calculated by dividing HOMEXP, WORKEXP, or TRAVEXP by total daily
    exposure (from equation 1) and then multiplying by 100.
    Units are percent.
#   Calculated by summing the CO concentrations associated with each
    activity record coded as home (or work or travel, for WORKCO and
    TRAVCO, respectively) for each individual and then dividing by the
    number of activity records. Units are ppm.
3. The sampling weights were applied to the data during all stages of the
analysis.
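The derived variables in Table 1 can be illustrated with a small sketch. The record layout (activity category, minutes, mean CO in ppm), the function names, and the example values are all hypothetical; minutes are converted to hours here on the assumption that this is how the ppm-hour units arise.

```python
from typing import List, Optional, Tuple

# (activity category, minutes in the activity, mean CO during it in ppm)
Record = Tuple[str, float, float]


def activity_exposure(records: List[Record], category: str) -> float:
    """HOMEXP / WORKEXP / TRAVEXP: exposure in ppm-hours for one category."""
    return sum(minutes / 60.0 * co for cat, minutes, co in records if cat == category)


def proportion_from(records: List[Record], category: str) -> float:
    """PHOME / PWORK / PTRAV: percent of total daily exposure from one category."""
    total = sum(minutes / 60.0 * co for _, minutes, co in records)
    if total == 0:
        raise ValueError("total exposure is zero")
    return 100.0 * activity_exposure(records, category) / total


def mean_concentration(records: List[Record], category: str) -> Optional[float]:
    """HOMECO / WORKCO / TRAVCO: unweighted mean over activity records (ppm)."""
    values = [co for cat, _, co in records if cat == category]
    return sum(values) / len(values) if values else None


# Made-up diary for one person.
diary = [("home", 600, 1.0), ("travel", 60, 5.0),
         ("work", 480, 2.0), ("home", 300, 2.0)]
print(activity_exposure(diary, "home"),
      proportion_from(diary, "travel"),
      mean_concentration(diary, "home"))
```

Note that the mean concentration is averaged over activity records rather than over time, following the table note, so a short record counts as much as a long one.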
The analysis presented here (note 4) focused on the following questions:
1) Do exposure levels or exposure profiles differ between
workers and nonworkers?
2) Within each employment-status group, do exposure levels or
profiles vary between men and women?
3) What is the role of time use versus CO concentrations in
differential exposure?
4. Other parts of this research project not presented in this paper include
an assessment of the role of micro-level activities (e.g., being in the
presence of smokers and using gas appliances) on exposure patterns and an
examination of the relationship between residential location, CO
concentrations, and exposure. The results of these analyses will be
presented in a separate paper.
RESULTS
EXPOSURE LEVELS
I used two traditional measures, maximum one-hour (MAX1HR) and
maximum eight-hour (MAX8HR) exposure levels, to determine whether the
amount of exposure a person receives differs among the five
sociodemographic groups set out above. The results, displayed in Table 2,
show that both the ANOVA and Kruskal-Wallis tests yield statistically
significant results (p<.01); there are differences among the five
population subgroups on both of the exposure level variables. Scheffe
tests, performed to identify the nature of the differences detected by the
ANOVA, show the following statistically significant (p<.05) results:
1) MAX1HR and MAX8HR are higher for workers in high-exposure jobs
than they are for each of the other social groups;
2) MAX1HR is lower for nonworking women than it is for either men
or women working in low-exposure jobs; and
3) MAX8HR is lower for nonworking women than it is for men in
low-exposure jobs.
This analysis shows that the level of CO to which individuals are
exposed varies with personal characteristics.
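The two exposure-level measures are maxima over running averages. A minimal sketch follows, assuming a list of hourly mean concentrations for one person and overlapping windows of consecutive hours; how RTI actually formed the one-hour and eight-hour averages is not described here, so the windowing is an assumption and the data are made up.

```python
from typing import Sequence


def max_running_average(hourly_ppm: Sequence[float], window_hours: int) -> float:
    """Largest mean over any run of `window_hours` consecutive hourly values."""
    if window_hours <= 0 or len(hourly_ppm) < window_hours:
        raise ValueError("need at least one full window of data")
    window_sum = sum(hourly_ppm[:window_hours])
    best = window_sum
    # Slide the window one hour at a time, updating the sum incrementally.
    for i in range(window_hours, len(hourly_ppm)):
        window_sum += hourly_ppm[i] - hourly_ppm[i - window_hours]
        best = max(best, window_sum)
    return best / window_hours


# Hypothetical 12-hour record with a two-hour peak.
hourly = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(max_running_average(hourly, 1), max_running_average(hourly, 8))
```

The short peak dominates the one-hour maximum but is diluted in the eight-hour maximum, which is one reason the two measures can rank groups differently.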
TABLE 2. RESULTS OF SIGNIFICANCE TESTS OF EXPOSURE LEVELS AND
         ACTIVITY DIMENSIONS OF EXPOSURE AMONG GROUPS
                        Parametric                  Nonparametric*
Variable   Transform    Test     Stat.     Prob.    Test     Stat.     Prob.
MAX1HR     nat. log.    ANOVA+    14.88    .000     K-W++     41.89    .000
MAX8HR     nat. log.    "         12.81    .000     "         43.13    .000
HOMEXP     nat. log.    "         13.56    .000     "         59.98    .000
WORKEXP#   nat. log.    "         27.14    .000     "         60.34    .000
TRAVEXP    nat. log.    "         30.29    .000     "        102.72    .000
PHOME      none         "        102.46    .000     "        254.03    .000
PTRAV      none         "         19.78    .000     "        115.29    .000
PWORK#     none         "         37.60    .000     "         48.38    .000
*   Nonparametric tests performed on untransformed (raw) data
+   One-way analysis of variance (F value)
++  Kruskal-Wallis test (chi-square value)
#   Tests of work exposure only performed between groups of workers
EXPOSURE PROFILES
But the focus of the total human exposure research field on
individuals as receptors of pollutants implies the need to go beyond an
analysis of exposure levels, toward an assessment of how people's activity
patterns influence their exposure. To this end, I examined the time/space
sources of exposure. This new approach compares exposure profiles by
analyzing differences in both the amount and the proportion of an
individual's total exposure that is attributable to home, work, and travel
activities.
A comparative analysis of these activity dimensions of exposure
documents the way in which exposure profiles vary among the
sociodemographic groups. Both the parametric and the nonparametric tests
yield significant (p<.01) differences among the groups for home, work, and
travel exposure (Table 2). The a posteriori Scheffe tests find the following
specific differences, significant at p<.05, between groups:
1) workers have higher amounts of travel exposure and get a
greater proportion of their exposure from travel than do
nonworkers;
2) nonworkers have higher amounts of home exposure and get a
greater proportion of their total exposure from the home;
3) the amount of exposure from travel is lower for nonworking
women than it is for nonworking men;
4) workers in high-exposure jobs have higher work exposure levels
and get a greater proportion of their total exposure from work
than do either males or females in low-exposure jobs; and
5) the proportion of exposure from travel and home sources is
greater for workers in low-exposure jobs than it is for those
in high-exposure jobs.
Figure 2 illustrates these differences. The bar graph shows the average
contribution of each activity to a person's total exposure for each group.
Time/space sources of exposure do vary with personal characteristics;
there are important differences in exposure profiles among population
subgroups defined by work status and sex. The results also highlight the
complexity of the relationship between personal characteristics, activity
patterns, and exposure. The interpretation of the pattern of differences
among the social groups in exposure associated with each activity is not
clear. In addition, it is important to point out that Figure 2 was derived
from group averages, but there is high within-group variation. The
standard deviation is almost as large as the mean for each variable, even
after controlling for social group (Table 3). The next section of
18-9
-------
[Figure 2: horizontal bar chart titled "Percent of total exposure from each
activity," with one bar for each group (nonworking women, nonworking men,
low-exposure women, low-exposure men, high-exposure workers) divided into
home, work, travel, and other; horizontal axis in percent.]
Figure 2. Average exposure profiles. Each bar shows the proportion of
exposure the average individual in that social group receives from
home, work, travel, and other sources.
18-10
-------
TABLE 3. DESCRIPTIVE STATISTICS FOR EXPOSURE VARIABLES
Group              MAX1HR+     MAX8HR+     HOMEXP++    TRAVEXP++   WORKEXP++
Men, nonwork         5.37*       2.21       21.20        5.04        --
(87)                 6.47**      2.85       30.78        6.40
                    (5.86)***   (2.86)     (34.15)      (5.41)
Men, low-exp         5.64        2.40        7.30        8.71        7.17
work (161)           6.97        2.90       14.67       10.72       12.61
                    (5.52)      (2.72)     (21.91)      (9.91)     (20.30)
High-exp work       10.88        5.15       14.60        8.31       34.21
(37)                20.55        7.11       17.17       11.50       53.83
                   (21.79)      (5.69)     (14.35)     (16.72)     (47.84)
Women, nonwork       4.56        2.35       23.41        2.34        --
(190)                4.62        2.20       28.49        4.24
                    (2.96)      (1.53)     (24.76)      (5.22)
Women, low-exp       5.19        1.86       11.29        7.06        6.09
work (157)           6.45        2.53       17.31        9.16       10.15
                    (5.38)      (2.37)     (21.75)      (7.90)     (17.00)

Group              PHOME       PTRAV       PWORK
Men, nonwork        80.65*      12.53        0
(87)                72.95**     18.61        0
                   (22.77)***  (17.90)       0
Men, low-exp        30.38       26.53       24.96
work (161)          34.20       32.81       28.30
                   (26.35)     (23.27)     (20.01)
High-exp work       21.05        8.36       53.61
(37)                27.10       14.99       57.11
                   (20.60)     (15.80)     (25.66)
Women, nonwork      83.98        8.79        0
(190)               77.69       16.33        0
                   (25.29)     (21.18)       0
Women, low-exp      41.08       22.36       19.84
work (157)          39.67       29.52       25.48
                   (24.79)     (19.58)     (18.93)
*    Median
**   Mean
***  Standard deviation
+    in ppm/hour
++   in ppm-hours
18-11
-------
analysis was designed to investigate reasons for this high person-to-person
variation and for the pattern of differential exposure reported above. The
examination focuses on the component parts of exposure—time use and CO
concentrations.
TIME USE
Exposure is a function of the time spent in an activity and the CO
concentration faced during that time. It is thus appropriate to ask how
time use varies among nonworking women, nonworking men, women working in
low-exposure jobs, men working in low-exposure jobs, and those working in
high-exposure jobs. A statistical comparison of the total amount of time
spent by each person at home, traveling, at work, and in leisure among the
social groups highlights the same type of differences in activity patterns
documented by previous studies. Nonworkers spend significantly more time
at home and less time in travel than do workers or males. Controlling for
both work status and sex reveals that male nonworkers spend more time in
both leisure and travel, but less time at home than do female nonworkers.
These results suggest that at least one reason for the differences between
workers and nonworkers and between male and female nonworkers in the amount
of exposure from travel and from home is differences in the amount of time
spent in these activities. This evidence supports the hypothesis that
differences in time use among groups lead to differences in exposure. But
the differences in exposure levels are not as strong as differences in time
use would suggest. Another important finding is that the high within-group
variation on the exposure variables is not due to time use; there is low
interperson variation on each of the time use variables, especially after
sex and work status have been controlled (Table 4).
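The contrast in within-group variation can be made concrete with the coefficient of variation (standard deviation divided by mean). The sketch below uses the nonworking-men entries from Tables 3 and 4; the function name is an assumption, and the choice of this particular summary is illustrative rather than the author's.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("mean must be nonzero")
    return sd / mean


# Nonworking men: HOMETIME mean 20.22 h, SD 2.77 h (Table 4);
# HOMEXP mean 30.78 ppm-hours, SD 34.15 ppm-hours (Table 3).
cv_hometime = coefficient_of_variation(20.22, 2.77)
cv_homexp = coefficient_of_variation(30.78, 34.15)
print(round(cv_hometime, 3), round(cv_homexp, 3))
```

For this group, time at home varies by roughly a seventh of its mean, whereas home exposure varies by more than its mean, which is consistent with the point that time use alone does not account for the spread in exposure.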
18-12
-------
TABLE 4. DESCRIPTIVE STATISTICS FOR TIME USE VARIABLES
Group              HOMETIME+   TRAVTIME+   WORKTIME+   LEISTIME+
Men, nonwork        20.47*       1.51                    4.15
(87)                20.22**      1.81        0           4.76
                    (2.77)***   (1.26)                  (4.03)
Men, low-exp        13.75        1.86        7.70        1.30
work (161)          14.24        1.98        7.52        1.85
                    (2.77)      (1.01)      (2.01)      (2.15)
High-exp work       13.95        1.23        8.15        0.20
(37)                14.25        1.81        8.42        1.34
                    (2.55)      (1.55)      (2.12)      (2.12)
Women, nonwork      21.99        0.68                    2.73
(190)               22.66        1.19        0           3.29
                    (2.23)      (1.36)                  (2.97)
Women, low-exp      14.27        1.60        7.36        1.26
work (157)          14.86        1.87        6.99        1.62
                    (2.83)      (1.09)      (2.16)      (1.69)
*    Median
**   Mean
***  Standard deviation
+    in hours
18-13
-------
CO CONCENTRATIONS
A new set of variables was created for the analysis of CO
concentrations, the other component of exposure. Whereas previous analyses
of this data set use PEM readings as the unit of analysis (31,32), this
study uses the individual. For each person I calculated the mean CO
concentration associated with all occurrences of a given activity (home,
work, and travel). Such measures maintain the integrity of differences
among individuals. The next step was to compare mean CO concentrations
across both groups and across activities.
A possible reason why differences in exposure between the pairs of
sociodemographic groups are not as strong as differences in their time use
would suggest is that CO concentrations are similar across activities. A
comparison of the mean CO concentrations across activities (ANOVA on the
natural logarithm of the mean CO concentration in each activity) yields a
significant statistic (F=163.39; p=0.000). Scheffe tests demonstrate that
the CO concentrations associated with travel are significantly (p<.05)
higher than are the concentrations associated with any of the other
activities.
A more surprising finding is that the CO concentrations associated
with home and low-exposure work places do not differ significantly from one
another (t=-0.66; p=0.51). This similarity in CO concentrations explains,
in part, why MAX8HR does not differ markedly between workers and
nonworkers, even though time use varies between these groups.
All of the CO concentration variables exhibit high within-activity
variation, even after controlling for social group (Table 5), suggesting
the reason for both the high within-group variation in exposure and the
lack of a significant difference between the concentrations associated with
home and work.
Another possible reason for the pattern of exposure profiles found
above is that mean CO concentrations associated with each activity category
differ across sociodemographic groups as a result of subtle differences in
activities while at home, at work, or traveling. For example, women may
spend more of their time at home cooking, doing laundry, or in the vicinity
of gas appliances than do men, leading them to face higher CO levels at
home than do men. More frequent travel during rush hours and on heavily
traveled routes may lead workers to face higher CO levels while traveling
than nonworkers. Indeed ANOVAs of the mean CO concentrations associated
with the home and travel activity categories across social groups are
significant (Table 6). A posteriori tests show two interesting
relationships.
First, CO concentrations associated with the home are significantly
(p<.05) higher for male nonworkers than they are for male workers. The
presence of a difference between the male employment-status groups but a
lack of a difference between the female work-status groups could reflect
role-related differences between the sexes: a female's duties within the
18-14
-------
TABLE 5. DESCRIPTIVE STATISTICS FOR CO CONCENTRATION
         VARIABLES
Group              HOMECO+     TRAVCO+     WORKCO+
Men, nonwork         1.62*       2.65        --
(87)                 1.79**      3.20
                    (1.45)***   (2.21)
Men, low-exp         0.75        4.10        1.22
work (161)           1.24        5.13        1.73
                    (1.67)      (5.15)      (2.45)
High-exp work        1.56        3.42        4.69
(37)                 1.58        7.17        6.06
                    (1.13)     (12.14)      (4.91)
Women, nonwork       1.48        3.21        --
(190)                1.60        3.23
                    (1.29)      (2.01)
Women, low-exp       1.24        3.87        0.81
work (157)           1.54        4.86        1.42
                    (1.91)      (3.64)      (1.78)
*    Median
**   Mean
***  Standard deviation
+    in ppm
18-15
-------
TABLE 6. RESULTS OF SIGNIFICANCE TESTS OF CO
         CONCENTRATION BY SOCIODEMOGRAPHIC GROUP
                            ANOVA                 Kruskal-Wallis*
CO variable   Transform     F-stat.    Prob.      X2          Prob.
HOMECO        nat. log.      4.32      .002       23.22       0.000
TRAVCO        nat. log.     11.21      .000       28.04       0.000
LEISCO        nat. log.      1.62      .169       10.57       0.032
WORKCO**      nat. log.      1.13+     .258       -1.99++     0.047
*    Nonparametric tests performed on untransformed variables
**   Between male and female workers in low-exposure jobs
+    ... probability)
++   Mann-Whitney test (z statistic)
18-16
-------
home are likely to be similar regardless of work status, whereas a male's
are not.
Second, CO concentrations associated with travel are significantly
(p<.05) higher for workers than they are for nonworkers, regardless of sex.
As reasoned above, this difference may be related to the joint effect of
the timing of trips and routes used by workers.
This examination has highlighted the importance of within-activity
variations in CO concentrations in explaining interperson and intergroup
differences in exposure.
SUMMARY AND IMPLICATIONS
This study of the relationship between personal characteristics,
activity patterns, and exposure offers useful insights into the nature of
human exposure to air pollutants. For the first time, it has been
empirically demonstrated that exposure characteristics vary among social
groups. The analysis shows that differences in activity patterns (manifest
both by differences in time use and by differences in CO concentrations
resulting from differences in micro-level activities) lead to differences
in the time/space sources of exposure. Work status is a better
differentiator of exposure than is sex, but there are also important
differences between men and women. Whereas travel is an important
contributor to exposure for workers, especially men, the home is the major
time/space source of exposure for nonworkers, especially women. Among
nonworkers, men have higher travel exposures than do women.
Although this analysis was not able to pinpoint the extent to which
each of the many personal and household characteristics influences
exposure, the results highlight the value of incorporating a
sociodemographic component into exposure assessments. It is inadequate to
formulate exposure models based on a prototypical person; the individual's
social characteristics affect the probability that she/he will come into
contact with a given pollutant concentration. It is also important to
maintain the integrity of a person's total activity pattern, especially as
such patterns relate to personal characteristics.
In addition, the groundwork has been laid for future analyses by
showing that studying activity patterns in terms of time spent at home, at
work, and in travel is not sufficient to explain differential exposure
either among individuals or groups. Differences among people in their
micro-level activities lead CO concentrations to vary greatly within each
activity category, even after controlling for social group. The
implications of this high interperson variation in CO concentrations for
differential exposure require further research.
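The point can be illustrated with the time-weighted formulation used in the microenvironment approach (e.g., reference 2), in which integrated exposure is the sum over activity categories of concentration multiplied by time. The sketch below is illustrative only: the function name and every number are hypothetical and are not taken from the study data. Two people with identical time budgets receive different exposures because their within-category concentrations differ.

```python
def integrated_exposure(time_hours, conc_ppm):
    """Sum of concentration x time over activity categories (ppm-hours)."""
    return sum(conc_ppm[k] * time_hours[k] for k in time_hours)


# Hypothetical values for illustration only; not from the study data.
time_budget = {"home": 14.0, "work": 8.0, "travel": 2.0}   # hours per day
person_a = {"home": 1.0, "work": 2.0, "travel": 6.0}       # mean CO, ppm
person_b = {"home": 3.0, "work": 2.0, "travel": 9.0}       # mean CO, ppm

exp_a = integrated_exposure(time_budget, person_a)  # 14 + 16 + 12 = 42
exp_b = integrated_exposure(time_budget, person_b)  # 42 + 16 + 18 = 76
print(exp_a, exp_b)
```

A model driven by the time budget alone would assign both people the same exposure, which is the limitation described above.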
Finally, because we now know that systematic variations in activity
patterns among social groups do lead to differential exposure, it is
appropriate to extend the analysis by collecting data on the activity
patterns of high-risk groups—particularly young children, the elderly, and
low-income groups.
18-17
-------
In summary, the perspective and results of this study have important
research implications: the existence of differential exposure has been
demonstrated, the value of monitoring at the individual level has been
verified, and directions for future research have been outlined.
The work described in this paper was not funded by the U.S.
Environmental Protection Agency and, therefore, the contents do not
necessarily reflect the views of the Agency and no official endorsement
should be inferred.
18-18
-------
REFERENCES
1. Ott, W. Concepts of human exposure to air pollution. Environment
International 7: 179-196, 1982.
2. Duan, N. Models for human exposure to air pollution. Environment
International 8: 305-309, 1982.
3. Ott, W. and Willits, N.H. CO exposures of occupants of motor
vehicles: Modeling the dynamic response of the vehicle. SIMS
Technical Reports, No. 48, Department of Statistics, Stanford
University, 1981.
4. Peterson, W.B. and Allen, R. CO exposures to Los Angeles area
commuters. Journal of the Air Pollution Control Association 32:
826-833, 1982.
5. Spengler, J.D.; Ferris, B.G.; Dockery, D.W.; and Speizer, F.E.
Sulfur dioxide and nitrogen dioxide levels inside and outside homes
and the implications on health effects research. Environmental
Science and Technology 13: 1276-1280, 1979.
6. Sterling, T. and Sterling, E. Carbon monoxide levels in kitchens and
homes with gas cookers. Journal of the Air Pollution Control
Association 29: 238-241, 1979.
7. Repace, J. and Lowrey, A. Indoor air pollution, tobacco smoke, and
public health. Science 208: 464-472, 1980.
8. Spengler, J.D.; Treitman, R.D.; Tosteson, T.D.; Mage, D.T.; and
Soczek, M.L. Personal exposures to respirable particulates and
implications for air pollution epidemiology. Environmental Science
and Technology 19: 700-707, 1985.
9. Hanson, S. and Hanson, P. Gender and urban activity patterns in
Uppsala, Sweden. Geographical Review 70: 292-299, 1980.
10. Hanson, S. and Hanson, P. The travel-activity patterns of urban
residents: Dimensions and relationships to sociodemographic
characteristics. Economic Geography 57: 332-347, 1980.
11. Hanson, S. and Johnston, I. Gender differences in work trip length:
Explanations and implications. Urban Geography 6: 193-219, 1985.
12. Madden, J.F. Why women work closer to home. Urban Studies 18:
181-194, 1981.
13. Black, J. and Conroy, M. Accessibility measures and the social
evaluation of urban structure. Environment and Planning A 9:
1013-1031, 1977.
18-19
-------
14. Federal Highway Administration. Personal travel in the U.S., Volume
I: 1983-1984 nationwide personal transportation study. U.S.
Department of Transportation, Washington, D.C., 1986.
15. Brail, R.K. and Chapin, F.S., Jr. Activity patterns of urban
residents. Environment and Behavior 5: 163-192, 1973.
16. Robinson, J.P.; Converse, P.E.; and Szalai, A. Everyday life in
twelve countries, in: A. Szalai (ed.), The Use of Time. Mouton, The
Hague, 1972.
17. U.S. Department of Labor, Bureau of Labor Statistics. Perspectives on
women: A data book. Bulletin 2800. U.S. Government Printing Office,
Washington, D.C., 1980.
18. Chapin, F.S., Jr. Human Activity Patterns in the City. New York:
John Wiley and Sons, 1974.
19. Pas, E. The effect of selected sociodemographic characteristics on
daily travel-activity behavior. Environment and Planning A 16:
571-581, 1984.
20. Doubleday, C. Some studies of the temporal stability of person trip
generation models. Transportation Research 11: 255-263, 1977.
21. Hanson, S. The importance of the multipurpose journey to work in
urban travel behavior. Transportation 9: 229-248, 1980.
22. Carp, F.M. The mobility of retired people. In: E.J. Cantilli and J.
Schmelzer (eds.), Transportation and Aging. U.S. Government
Printing Office, Washington, D.C., 1970. Cited in 10.
23. Potter, R.B. The nature of consumer usage fields in an urban
environment: Theoretical and empirical perspectives. Tijdschrift
voor Economische en Sociale Geografie 68: 168-176, 1977. Cited in
10.
24. Hanson, P. The activity patterns of elderly households. Geografiska
Annaler Series B 59: 109-124, 1977.
25. Douglas, A. Home-based trip end models -- A comparison between
category analysis and regression analysis procedures. Transportation
2: 53-70, 1973. Cited in 10.
26. Davies, R.L. Effects of consumer income differences on shopping
movement behavior. Tijdschrift voor Economische en Sociale Geografie
60: 111-121, 1969. Cited in 10.
27. Oppenheim, N. A typological approach to individual travel behavior
prediction. Environment and Planning 7: 141-152, 1975.
18-20
-------
28. Wallace, L.A. and Ott, W.R. Personal monitors: A state-of-the-art
survey. Journal of the Air Pollution Control Association 32: 602-610,
1982.
29. Hartwell, T.D.; Clayton, C.A.; Mitchie, R.M.; Whitmore, R.W.; Zelon,
H.S.; and Whithurst, D.A. Study of carbon monoxide exposure of
residents in Washington, DC and Denver, CO. EPA-600/4-84-031, U.S.
Environmental Protection Agency, Research Triangle Park, NC, 1984.
30. Johnson, T. A study of personal exposure to carbon monoxide in
Denver, CO. EPA-600/4-84-014, U.S. Environmental Protection Agency,
Research Triangle Park, NC, 1984.
31. Akland, G.; Hartwell, T.D.; Johnson, T.R.; and Whitmore, R.
Measuring human exposure to carbon monoxide in Washington, D.C. and
Denver, CO during the Winter of 1982-3. Environmental Science and
Technology 19: 911-918, 1985.
32. Johnson, T.; Capel, J.; and Wijnberg, L. Selected analyses relating
to studies of personal carbon monoxide exposure in Denver and
Washington, D.C. Report compiled by PEDco (Durham, NC) for the U.S.
Environmental Protection Agency (Contract No. 68-02-3496/PN 3550)
1986.
33. Duan, N. Application of the Microenvironment Type Approach to
Assessment of Human Exposure to Carbon Monoxide. Rand, Santa Monica,
1985.
34. Ott, W.; Thomas, J.; Mage, D.; and Wallace, L. Validation of the
simulation of human air pollution exposure (SHAPE) model using paired
days from the Denver carbon monoxide field study. Atmospheric
Environment (in press -- 1988).
35. Fox, M.B. Working women and travel: The access of women to work and
community facilities. Journal of the American Planning Association
49: 156-170, 1983.
18-21
-------
IDENTIFICATION OF RESEARCH NEEDS
Near the end of the conference, the participants were asked to
identify needs for future research in the area of human activity patterns
and their relationship to exposure assessment. Those identified needs that
were legible and understandable were edited as little as possible and are
presented here. They have been sorted into five categories: data
collection processes, nonresponse problems, models, field studies, and
standardization and organization. Beyond the categorization, no conscious
effort was made to sort and order the identified needs. There is a
considerable amount of repetition, but we decided to list all of them with
the thought that the repetition represents an indication of the amount of
agreement between participants.
DATA COLLECTION PROCEDURES
In what form should data be collected?
A. Self-completed diary
B. Recall questionnaire
C. Direct observation
D. Indirect observation
E. Electronic monitoring
F. Non-paper data recording (electronic event/time logger)
Consider time frames of data collection:
A. How frequently should entries be made?
(1) At change of activity or location
(2) Every "x" minutes
B. How frequently should "diary" be reviewed?
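The trade-off between the two entry schemes listed under item A can be sketched as follows. The example converts a hypothetical change-of-activity log into fixed-interval records; the function name, the log, and the 30-minute step are assumptions made for illustration and do not come from any study discussed at the conference.

```python
def to_fixed_intervals(events, end_minute, step):
    """Sample a change-of-activity log every `step` minutes.

    events: list of (start_minute, label) pairs sorted by start time.
    Returns (minute, label) pairs giving the label in effect at each sample.
    """
    out = []
    idx = 0
    for t in range(events[0][0], end_minute, step):
        # Advance to the last event that started at or before minute t.
        while idx + 1 < len(events) and events[idx + 1][0] <= t:
            idx += 1
        out.append((t, events[idx][1]))
    return out


# Hypothetical log: a 15-minute trip between home and work.
log = [(0, "home"), (35, "travel"), (50, "work")]
sampled = to_fixed_intervals(log, end_minute=120, step=30)
print(sampled)
```

With a 30-minute step the short travel episode falls between two sampling times and is lost, whereas the change-of-activity log retains it. This is one reason the choice of entry frequency depends on the time resolution the pollutant requires.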
Develop follow-up questions/check list.
How can we merge the data collection forms and technique experiences of the
social sciences into the future development of the environmental science
issues?
Develop "standard" activity pattern diary questionnaires.
Develop a voice tape recorder with a voice time recorder that would
facilitate an accurate diary with a minimum amount of effort for the
respondents.
Determine which survey methods - data collection devices are most
productive/effective.
Quality Assurance. Besides coding and clerical errors, it is very
important to reduce errors at the data collection points. In a number of
studies, respondents filled out the diaries or questionnaires. In such
circumstances, I worry about the quality of the data collected. Some
questions I have in my mind are: Are definitions clear to respondents?
Are concepts clear to them? Do they interpret questions correctly?
Research is needed in this area to reduce response variance. Can a
group-training of respondents be arranged? Should a mock interview be
conducted with respondents? These and other ideas may help to improve the
quality of data at the collection end. Also I believe that efforts should
be made to get good quality data to start with rather than find errors
later and correct them.
Data Collection Instrument: In questionnaire design, it is important to
pretest the questionnaire before fielding it. The pretest should be done
on the target population. Even one respondent by subgroup would be more
useful than no pretesting.
Before developing new activity/time diaries, determine the resolution
required in terms of time period (1-5, 5-30, >30 min), of locations
(# indoor, # outdoor, with sources), and of activities (source usage,
breathing rate, heart rate). What is the time frame for anticipated responses
(acute/chronic), and how important is a short-term peak vs. a long-term
average exposure? Additionally, the total burden placed on the respondent
(relative to what is essential) should be considered in determining the
format of the diary/questionnaire (integrated vs. promoted time interval).
Several diary/questionnaire formats were presented here. Each has certain
advantages/disadvantages, in terms of burden and sensitivity/accuracy.
Without adequately defining what is needed (for continuous/integrated
monitoring), there is no real basis for selection between these methods.
Do more methodological research in particular on questionnaire design and
non-response avoidance.
Develop standardized set of questionnaires for characterizing pollutant
exposure (indoor; outdoor; total) and activity information; e.g., for:
(a) pollutants with short time resolution -- 1 to 6 hrs. (such as
CO).
(b) pollutants with medium time resolution -- 6 to 24 hrs. (VOC's).
(c) pollutants with longer time resolution -- > 24 hrs.
Generate an activity pattern questionnaire that is devoid of subjective
questions, such as strenuous activity; heavy/light traffic; high, medium,
or low usage of an appliance, etc.
Compare activity recording instruments:
(1) Paper and pencil diaries for concurrent records
(a) page-by-page format (Denver-Washington)
(b) matrix format (Lambert, Colome, Adair)
(2) Electronic monitors (Ott suggested monitor) for concurrent records
(3) 24-hour Retrospective Interview - telephone based CATI technique.
(4) Observational Studies (ethnography tradition, other family member
observing and reporting on activities of target subject, or
electronic tracking device).
19-2
-------
Convergent methods would allow a multi-method, multi-trait approach to
assessing the efficiency of these methods. Scoring "omissions" and
"commissions" in recording errors against a reference or gold standard
method (e.g., observational standard or electronic).
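One way such scoring might be set up is sketched below. It assumes that both the reference record and the diary have been reduced to (hour, location) entries; that data layout, the function name, and the sample records are hypothetical.

```python
def score_diary(reference, diary):
    """Count matches, omissions, and commissions against a reference record.

    An omission is a reference entry missing from the diary; a commission
    is a diary entry that does not appear in the reference.
    """
    ref, rep = set(reference), set(diary)
    return {
        "matches": len(ref & rep),
        "omissions": len(ref - rep),
        "commissions": len(rep - ref),
    }


# Hypothetical records for one respondent, as (hour, location) entries.
reference = [(7, "home"), (8, "travel"), (9, "work"),
             (12, "restaurant"), (13, "work")]
diary = [(7, "home"), (9, "work"), (12, "work"), (13, "work")]

scores = score_diary(reference, diary)
print(scores)
```

Here the respondent omitted the morning trip and the restaurant visit and recorded being at work at noon, which counts as one commission. Comparing such counts across diary formats gives one simple basis for ranking the methods.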
(1) Test (experimental) time/diary formats and administrative approaches.
(2) Validate, with observer or electronic equipment.
Validate locations reported.
Identify methods to get accurate evaluations of critical microenvironments
(e.g., gas stations, home garage).
Methodological research on activity diaries, to find ways that are
cognitively simplest for respondents, while giving necessary data.
It also appears that a "multitrait, multimethod" study of activity diaries
should be conducted. This will help guard against "instrumentation" and
"observer" validity threats. Thus use "matrix" diaries and "time" driven
diaries in conjunction with other variables that "should" and "should not"
correlate with time/activities. This will result in a validation of
instruments/techniques. Use of standard designs for instrument validation
will also help to determine the instrument effect on behavior.
Develop QA/QC data validation procedures.
Use matrix sampling to determine which sources to ask for from each
respondent.
Conduct pilot studies to determine human activity patterns by using
electronic aids.
Development of monitoring devices that are less intrusive.
Continue development of light-weight, quiet, reliable, personal monitors
for target species.
Consider availability of transmitter technologies to monitor/verify
location information. If "ideal" monitors are not available, develop
specifications for monitor development.
Determine availability/needs for heart rate, respiratory ventilation
monitors with data logging ability.
Develop protocols for observer/technician/training and subject instruction
for diary/monitor studies. Extend literature for activity assessment on
this issue. There is an art and oral history in this area that should be
codified.
Develop improved activity recording procedures such as:
(1) Recording devices that do not require paper.
(2) Reminder devices that supplement paper diaries.
Develop continuous monitoring devices for pollutants.
Improvements in assessing the accuracy of reported personal
location/time/activity patterns. Information obtained from "objective"
observers is useful but not the answer, because presence of observer may
alter normal activity patterns (both type of pattern and accuracy of
recall). High priority should be placed on development and evaluation of
"nonintrusive" devices that allow objective verification of personal
location, and perhaps activity, as a function of time. (Perhaps award a
contract to Bell Labs or the DOD for this?)
NONRESPONSE PROBLEMS
Increase participation?
A. Incentives
B. Determine non - $ benefits to participant
C. Reduce burden (perceived or actual)
Estimation Research: I was not sure whether past activity pattern studies
used weighted data in the analysis. It is worth researching methods to
reduce bias in estimates due to missing data.
Improve Response Rate: Dawn Nelson presented methods to improve response
rate. Some of these are applicable to the studies. There are probably
other ways to improve rates. With a small sample size, even missing a few
respondents has a greater effect on the results and, hence, research should
be directed in this area.
Do more methodological research in particular on questionnaire design and
non-response avoidance.
Develop report on factors that will improve response rate.
(a) initial information as an incentive (how much detail does one
provide to participants?)
(b) effect of financial incentives.
(c) return of appropriate information after the study is completed.
Evaluate/review methods for increasing participation and compliance.
Develop better understanding of human behavior on topics such as the
following:
(1) Willingness to Participate
(2) Ability to Complete Forms (Diaries, Questionnaires) and Wear
Monitors (barriers in workplaces, certain outdoor activities)
19-4
-------
Examination of factors that might increase response rates for exposures.
Study biases from nonresponse.
Explore ways to compensate for nonresponse in the analysis. Also consider
the inclusion of variables in data collection to aid in the adjustment
process.
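A weighting-class adjustment is one common way to compensate for nonresponse in the analysis, and it shows why auxiliary variables collected for the whole sample are useful. The sketch below is a minimal illustration under the assumption that response is unrelated to the survey variables within each class; the field names and sample values are hypothetical.

```python
from collections import defaultdict


def weighting_class_adjust(sample):
    """Inflate respondent weights so each class keeps its sampled total.

    sample: list of dicts with keys 'cls', 'weight', and 'responded'.
    Returns the respondents with an added 'adj_weight' entry.
    """
    total = defaultdict(float)   # sum of base weights, all sampled units
    resp = defaultdict(float)    # sum of base weights, respondents only
    for unit in sample:
        total[unit["cls"]] += unit["weight"]
        if unit["responded"]:
            resp[unit["cls"]] += unit["weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            factor = total[unit["cls"]] / resp[unit["cls"]]
            adjusted.append({**unit, "adj_weight": unit["weight"] * factor})
    return adjusted


# Hypothetical sample: half of the workers did not respond.
sample = (
    [{"cls": "worker", "weight": 10.0, "responded": r}
     for r in (True, True, False, False)]
    + [{"cls": "nonworker", "weight": 10.0, "responded": True}
       for _ in range(2)]
)
respondents = weighting_class_adjust(sample)
print([u["adj_weight"] for u in respondents])
```

The class variable (work status in this example) must be known for nonrespondents as well as respondents, which is why it has to be planned into data collection.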
Random Digit Dialing (RDD). Using RDD, certain socio-economic areas are
not fairly represented in the study. The effect of this type of
undercoverage on activity pattern research studies should be evaluated. [The
group without telephones has different socio-economic conditions, health
conditions, exposure levels, etc.]
MODELS
Analysis Related Research: A number of studies used a modeling approach to
analyze data. These models make certain assumptions. These assumptions
may have significant effects on the results. I recommend that sensitivity
analyses be done to reduce the risk that results are artifacts of the
models, and one should be very careful about those assumptions to which the
model is highly sensitive.
Formulate more sophisticated exposure models.
Design procedures to validate exposure models.
Improved human activity pattern - exposure models.
Develop improved mathematical and statistical models to utilize and
interpret the activity pattern and related information, which would fit
into total exposure. Current work such as SIMS and work on models such as
SHAPE and breath - personal TEAM should be continued. New models must
include a stochastic component; i.e., temporal relationships (most current
models assume independence of pollutant exposure from day to day).
Studies to validate exposure models.
Detailed assessment of the modeling procedures, especially sensitivity
analysis: these models appear very hazardous.
Incorporate autocorrelation (serial correlation) into exposure estimation
models for pollutants which exhibit autocorrelation in PEM data (e.g.,
carbon monoxide).
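A first step toward this is to measure the serial correlation in a monitoring record. The sketch below computes the lag-1 sample autocorrelation of a series; the function name is illustrative and the readings are hypothetical values, not PEM data.

```python
def lag1_autocorrelation(x):
    """Lag-1 sample autocorrelation of a sequence of readings."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    if den == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    return num / den


# Hypothetical successive CO readings (ppm) with a steady upward drift.
readings = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = lag1_autocorrelation(readings)
print(round(r1, 2))
```

A value near zero would support treating successive readings as independent; a clearly positive value indicates that the model should carry information forward from one period to the next.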
Compare pollutant exposure predictions of microenvironment-based models
with predictions of source-proximity models. Evaluate stepwise regression
as a means of combining models.
Time and activity patterns need to be defined for different population
groups. At the minimum, this should include age (group), sex, ethnic/SES
groups, work and/or school status (full/part-time; summer/winter/full
year). Additional classification factors could include the identification
of pre-existing cardiopulmonary disease (i.e., chronic heart disease, COPD,
Asthma), since these might modify both activity and time patterns, in
relation to perceptible air pollutant effects (e.g., irritation).
Optimize the union of activity pattern information (time, density, and
location) and pollutant concentration capability (time of sampling). Such
optimization should lead to a definition of a universal set of
microenvironments. Of course such a set will be pollutant dependent.
Determine time resolution based on technological and "participant"
restraints.
FIELD STUDIES
Collect activity data on high-risk groups (large sample sizes).
A. Young children
B. Elderly
C. Those with a history of illness/prone to illness
D. Low Income Groups - especially inner city.
Some thought must be given for the coordinated efforts by various groups.
Use the same sample if possible for multipurpose studies. This will help
in increasing the sample size since some operational costs will be shared.
An extensive, nationwide activity pattern research field survey that
includes those activities likely to cause exposure to environmental
chemicals.
Specialized field surveys of the activity patterns of children.
Additional TEAM (direct approach) field studies that include improved
activity pattern data collection procedures. TEAM studies are needed for
ozone, NO2, and polar organic compounds.
Establish some mechanism to coordinate studies to prevent overlap,
duplication, and repetition of mistakes.
Assess usefulness and advantages of probability based samples of the
general population. Although necessary for many study goals, there are
also many research hypotheses in health and exposure areas that do not
require precise representation of the general population.
Conduct Additional Field Studies:
More data are needed to evaluate and "validate" model accuracy and
precision characteristics.
For the collection of activity-location data on a large
population-based sample, initiate cooperative research effort with
Bureau of Census or NHANES group. Would encourage CATI interviews
rather than mail-out diaries for self-administration.
19-6
-------
Perform Large-Scale:
A. Population-based studies which assess activity patterns
stratified on sex, age, occupation, role, and health status.
B. Person-day is unit of sampling. Geographical concerns somewhat
overshadowed by day of the week and seasonal concerns.
Field Studies:
A. TEAM studies for species with available instrumentation.
B. Targeted microenvironmental studies for certain species (e.g.,
benzene and garages/areas with passive smoking).
C. Targeted activity/exposure/location studies on "sensitive"
populations (e.g., asthmatics/ozone).
Laboratory Studies (in support of field monitoring):
A. Emissions rate determination in chambers.
B. Decay rate determination in chambers.
Establish voluntary policies for designing and conducting field studies;
e.g.:
(A) Survey and sampling statisticians should be involved.
(B) Linking health effects to exposure (having a health effect
component on questionnaires).
(C) Requiring analysis plan at beginning of project.
Many studies are very small and in special locations. Generalization is
problematic. Are there ways to develop larger-scale representative
studies, perhaps using two-phase sample designs?
More consideration needs to be given to the time dimension. One needs to
ensure that samples are representative across time.
More attention to the need for assessing location/time/activity patterns
for important population subgroups, e.g., children, the elderly,
asthmatics, persons with COPD, residents of low-income substandard housing,
who may be at increased risk of pollutant exposure and/or adverse health
effects. The effect of geographical differences in climate and housing on
activity patterns also needs to be studied much more.
There continues to be a need to conduct personal monitoring studies to
quantify the relationships among exposure, microenvironmental monitors, and
fixed site ambient monitors.
Conduct a source driven activities diary study.
Modification or restriction of activities or time allocation in response to
the perception of air pollutants (directly or through media reports) might
be considered as an indication of air pollution effects. Thus, research is
needed to help identify "normal" activities and time patterns and track
these individuals to determine if there are changes in these patterns that
are related to outdoor air quality (e.g., O3, NOx, PM10) or indoor
pollutants.
Better data on the concentrations formed in key microenvironments and their
causes.
There do appear to be regional and seasonal differences; thus, at the very
least there is a need to determine activities in various locations during
at least two (2) seasons (winter and summer). Care should be taken to
differentiate weekday, Saturday, and Sunday patterns.
Various subgroups need to be identified to determine their T/A patterns.
These should include "at risk" groups as well as occupational groups.
What is an optimum number of locations to be investigated?
STANDARDIZATION AND DATA MANAGEMENT
Definition of a uniform set of microenvironments.
Determination of a set of variables to be included in all studies to allow
comparison of data across studies.
Can we develop precise and consistent definitions of specific activities of
interest? Must be useful and, more importantly, must be collectible.
All future studies (i.e., survey designs) should include basic
demographic/socioeconomic data.
It is important to design studies so that the data are useful in a variety
of contexts.
Defining a manageable set of microenvironments (verbs). These could then
act as the foundation for all exposure studies, allowing cross-study
comparisons.
My recommendation is that standardized definitions and concepts will be
very useful. This could also help follow-up studies for screening the
target cases, if the common definitions are employed by them.
Consistent coding categories for locations and activities are essential if
there is any hope to combine data collected under different studies into a
single data base. The definitions of location categories (hierarchical, for
subsequent aggregation) and activities should be standardized.
A workshop or working group (< 20 individuals) should be assembled to develop
recommendations to be circulated for comment and review on standardization
of diary formats (integrated and detailed) and of definitions for
categories/activities. This group could meet for a 2-3 day period and
produce a set of instruments together with instructions for usage.
Organize available data in a standardized format.
Standardized formats for transferring activity data on tape and personal
computer disks.
Specialized software for analyzing and graphically displaying activity
pattern information from these tapes/disks.
Relational data bases for activity patterns.
Determine minimum set of variables -- e.g., demographic characteristics --
that should be collected in a standard way in every study.
Standardize activity pattern questions (if not form of the survey), e.g.,
exertion level.
Suggestion of How to Carry This Out: Controlled study that asks several
different questions to get at the same information - then possible to
evaluate which question is most effective.
Standardize data collection in small-scale personal monitoring studies to
complement and allow extrapolation to large-scale population-based samples.
Personal characteristics (age, sex, income, geographic location,
occupation, health status).
Adapt 00-99 International Activity Coding System to air pollution
monitoring needs - establish working group to produce a standard coding
scheme with sufficient detail to meet exposure-absorbed dose needs.
Construct a hierarchical locational data coding scheme that allows one to
collapse categories to major environments (i.e., indoor-residence,
indoor-school, indoor work, transit and outdoors).
Use hierarchical location codes that may be collapsed depending upon study
purpose/design. Evaluate available schemes and generate codes for future
research.
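A collapsible coding scheme of the kind proposed in the two preceding items might look like the sketch below. The code format (levels separated by "/", with the major environment first) and all sample records are hypothetical; no existing coding scheme is being reproduced.

```python
from collections import defaultdict


def collapse(code, levels=1):
    """Keep only the first `levels` parts of a hierarchical location code."""
    return "/".join(code.split("/")[:levels])


def minutes_by_location(records, levels=1):
    """Total minutes per location after collapsing codes to `levels` parts."""
    totals = defaultdict(int)
    for code, minutes in records:
        totals[collapse(code, levels)] += minutes
    return dict(totals)


# Hypothetical diary records: (location code, minutes spent).
records = [
    ("indoor-residence/kitchen/gas-range-on", 45),
    ("indoor-residence/bedroom", 480),
    ("indoor-work/office", 420),
    ("transit/car", 60),
    ("outdoors/yard", 30),
]

major = minutes_by_location(records, levels=1)
detailed = minutes_by_location(records, levels=2)
print(major)
```

The same records support a coarse analysis by major environment and a finer analysis by room or vehicle type, so a study designed for one pollutant does not foreclose later use of the data for another.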
Greater efforts toward standardizing collected data on
location/time/activity patterns in order to achieve the standard QA
objective of "comparable" data sets. It is especially important for
investigators primarily concerned with evaluating exposures to one or two
specific pollutants not to exclude by design the possibility of later use
of their activity data for the estimation of personal exposures to other
pollutants. For example, knowledge of specific times of relatively brief
periods outdoors may not be very important for assessing overall exposure
to VOC's, but would be of prime importance in evaluating exposure to ozone.
Determine reasonable number of location categories based on pollutant
sources and activity patterns.
19-9
-------
Establish guidelines and standards for designing and conducting field
studies. One project that would be useful: create compendium of
questionnaires, diaries and sampling plans. Another would be to define
terms.
Identify (specific) human activity patterns research areas.
Improve scientists' and statisticians' understanding of management and
policy makers' needs in the human exposure arena.
19-10
-------
CONFERENCE PARTICIPANTS
Dr. James H. Adair, Project Manager*
Harvard School of Public Health
665 Huntington Avenue
Boston, MA 02115
Dr. Joseph V. Behar * **
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Mr. Andrew Bond
Environmental Monitoring and Data
Analysis Division
EPA Mail Drop 76
Research Triangle Park, NC 27711
Ms. Elizabeth Bryan
U.S. Environmental Protection Agency
TS-798
401 M Street SW
Washington, D.C. 20460
Mr. Michael Callahan, Director*
Exposure Assessment Group
U.S. Environmental Protection Agency
RD-689
401 M Street SW
Washington, D.C. 20460
Dr. Chao Chen
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Dr. Steve Colome
Department of Social Ecology
University of California at Irvine
Irvine, CA 92717
Ms. Lori Coyner
E.R.T.
1220 Avenida Acaso
Camarillo, CA 93010
Mr. Michael Delarco
U.S. Environmental Protection Agency
401 M Street SW (RD-680)
Washington, D.C. 20460
Dr. Naihua Duan*
513 Wilshire Blvd.
Suite 249
Santa Monica, CA 90401
Mr. Michael Dusetzina
U.S. Environmental Protection Agency
(MD-13)
Research Triangle Park, NC 27711
Dr. Evan Englund
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Ms. Lynn Fenstermaker
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Mr. Chas. Fitzsimmons
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Mr. George T. Flatman
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Dr. Kenneth Hedden
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Mr. Stephen C. Hern **
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Dr. C. H. Ho
Department of Mathematics
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
20-1
-------
Mr. Ted Johnson*
PEI Associates, Inc.
505 S. Duke Street, Suite 503
Durham, NC 27701-3196
Dr. Graham Kalton, Chairman*
Department of Biostatistics
University of Michigan
Ann Arbor, MI 48106
Dr. Robert Kinneson
Desert Research Institute
Water Resources Center
University of Nevada System
Las Vegas, Nevada 89120
Mr. Mel Kollander*
U.S. Environmental Protection Agency
(PM-223)
401 M Street SW
Washington, D.C. 20460
Prof. William Lambert*
UNM Medical Center
900 Camino De Salud NE
Albuquerque, NM 87131
Ms. Carolyn H. Lichtenstein*
Roth Associates, Inc.
6115 Executive Blvd.
Rockville, MD 20852
Dr. David McNelis, Director
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Dr. Forest Miller
Desert Research Institute
Water Resources Center
University of Nevada System
Las Vegas, NV 89120
Dr. D.J. Moschandreas*
IIT Research Institute
10 W. 35th Street
Chicago, IL 60616
Ms. Dawn Nelson*
Demographic Surveys Division
Bureau of the Census
FOB#3, Room 3377
Washington, D.C. 20233
Dr. William C. Nelson**
U.S. Environmental Protection Agency
MD-55
Research Triangle Park, NC 27711
Dr. Wayne R. Ott, Chief*
Air, Toxics, and Radiation Staff
U.S. Environmental Protection Agency
RD680
401 M Street SW
Washington, D.C. 20460
Dr. Craig Palmer
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Dr. Muhilan Pandian*
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Mr. J. Gareth Pearson, Director**
Exposure Assessment Research
Division
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Mr. Thomas Phillips*
Air Resources Board
1102 Q Street
P.O. Box 2815
Sacramento, CA 95812
Dr. James Quackenboss
College of Medicine
University of Arizona
Arizona Medical Center
Tucson, AZ 85724
20-2
-------
Dr. John Robinson, Director*
Survey Research Center
University of Maryland
College Park, MD 20742
Ms. Margo Schwab
Harvard School of Public Health
665 Huntington Avenue
Bldg 1, Room 1310
Boston, MA 02115
Dr. R. Keith Schwer, Director
Center for Business and Economic
Research
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Dr. Rajendra P. Singh, Director
Survey of Income and Program
Participation Branch
Statistical Methods Division
U.S. Bureau of the Census
FOB#3, Room 3705
Suitland, MD 20233
Mr. Robert L. Snelling*
Acting Director
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Dr. Thomas Starks*
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154
Dr. Bonnie Stern
Health Protection Branch
203 Environmental Health Center
Tunney's Pasture
Ottawa K1A 0L2
Canada
Dr. Thomas H. Stock*
Department of Environmental Sciences
University of Texas
Health Science Center
P.O. Box 20036
Houston, TX 77225
Mr. Jacob Thomas*
General Sciences Corporation
6100 Chevy Chase Drive
Laurel, MD 20707
Dr. John C. Unrue, Vice-President
for Academic Affairs*
University of Nevada-Las Vegas
Las Vegas, NV 89154
Mr. Llewellyn Williams**
U.S. Environmental Protection Agency
Environmental Monitoring Systems
Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478
Mr. A.L. Wilson
Wilson Environmental Associates
135 E. Live Oak Avenue
Suite 203
Arcadia, CA 91006
Mr. Harvey S. Zelon*
Center for Survey Statistics
Research Triangle Institute
Box 12194
Research Triangle Park, NC
27708-2194
* Speaker
**Session Chairman
20-3
-------