PB89-166599
PROCEEDINGS  OF  THE RESEARCH PLANNING
CONFERENCE ON HUMAN ACTIVITY  PATTERNS
Environmental  Research Center
Las Vegas,  NV
Jan  89
                U.S. DEPARTMENT OF COMMERCE
             National Technical Information Service


-------
                                           EPA/600/4-89/004
                                               January 1989
            Proceedings of the

      RESEARCH PLANNING CONFERENCE ON
          HUMAN ACTIVITY PATTERNS
                 edited by

          Thomas H.  Starks,  Ph.D.
       Environmental Research Center
      University of Nevada-Las Vegas
          Las Vegas, Nevada  89154
  Cooperative Agreement No. CR 814342-01
              Project Officer
              Stephen C. Hern

             Technical Monitor
          Joseph V.  Behar, Ph.D.

       Exposure Assessment Division
Environmental Monitoring Systems Laboratory
       Las Vegas,  Nevada  89193-3478
    OFFICE OF RESEARCH AND DEVELOPMENT
   U.S.  ENVIRONMENTAL PROTECTION AGENCY
             LAS VEGAS,  NEVADA

-------
                                  NOTICE
      Although the information in this  document  has  been funded wholly or
in  part  by  the  United  States  Environmental  Protection  Agency  under
Cooperative  Agreement  No.   CR 814342-01  to  the  Environmental  Research
Center,  University of Nevada-Las  Vegas,  it  has  not  been  subjected to Agency
review and therefore does not necessarily reflect the views of  the Agency,
and no official  endorsement  should be  inferred.

      Mention of  trade names  or  commercial  products does  not constitute
endorsement or recommendation for  use.

-------
                             TECHNICAL REPORT DATA

 1. REPORT NO.
    EPA/600/4-89/004
 4. TITLE AND SUBTITLE
    PROCEEDINGS OF THE RESEARCH PLANNING CONFERENCE ON
    HUMAN ACTIVITY PATTERNS
 5. REPORT DATE
    January 1989
 7. AUTHOR(S)
    Thomas H. Starks, Ph.D.
 9. PERFORMING ORGANIZATION NAME AND ADDRESS
    Environmental Research Center
    University of Nevada-Las Vegas
    Las Vegas, Nevada 89154
11. CONTRACT/GRANT NO.
    CR814342-01
12. SPONSORING AGENCY NAME AND ADDRESS
    Environmental Monitoring Systems Laboratory - LV, NV
    Office of Research and Development
    U.S. Environmental Protection Agency
    Las Vegas, NV 89193-3478
14. SPONSORING AGENCY CODE
    EPA/600/07
16. ABSTRACT
    The study of human activity patterns was initially an area of interest
    in the field of sociology, but recently it has become important to people
    investigating the amount and extent of exposure of human populations to
    hazardous chemicals.  This report presents the proceedings of a conference
    held to compare various methods of studying human activity patterns, and to
    determine where additional research is needed to develop methods for
    collecting reliable human activity patterns data pertinent to the
    determination of exposure rates.  Entitled "Research Planning Conference on
    Human Activity Patterns," the conference was held May 8, 9, and 10, 1988,
    in Las Vegas, Nevada.
18. DISTRIBUTION STATEMENT
    RELEASE TO PUBLIC
19. SECURITY CLASS (This Report)
    UNCLASSIFIED
20. SECURITY CLASS (This page)
    UNCLASSIFIED
21. NO. OF PAGES
    292
EPA Form 2220-1 (Rev. 4-77)   PREVIOUS EDITION IS OBSOLETE

-------
                                 ABSTRACT
      The  study  of  human  activity patterns  was  initially  an area  of
interest in the field of sociology,  but  recently  it has become  important to
people investigating  the amount  and  extent of exposure of human populations
to  hazardous  chemicals.     This  report  presents  the  proceedings  of  a
conference  held  to  compare various  methods  of  studying  human activity
patterns, and  to determine  where  additional research  is  needed to develop
methods for  collecting  reliable human activity  patterns  data pertinent to
the  determination   of  exposure  rates.    Entitled   "Research  Planning
Conference on Human  Activity Patterns,"  the conference was  held  May 8,  9,
and 10,  1988,  in Las Vegas,  Nevada.  This report was submitted  in partial
fulfillment of Cooperative Agreement No. CR 814342-01  by the Environmental
Research  Center,  University   of  Nevada-Las  Vegas,   under  the  partial
sponsorship of the  U.S.  Environmental Protection Agency.

-------
                                  CONTENTS

                                                                      Page

Notice	   ii
Abstract	  iii
Acknowledgements	   vi

Summary	  1-1

            Human Activity Pattern Studies:  History and  Issues

Estimating Americans' Exposure to Air Pollution:  Issues, Alternatives
      and Suggestions - John Robinson	  2-1

Human Activity Patterns:  A Review of the Literature for Estimating
      Time Spent Indoors, Outdoors, and In Transit - Wayne Ott	  3-1

Basic Activity Patterns Structure for Modeling Pollution Exposure -
      Jacob Thomas and Joseph V. Behar	  4-1


           Personal  Activities:  Monitoring and Quality Assurance

A Comparative Evaluation of Self-Reported and Independently-Observed
      Activity Patterns in an Air Pollution Health Effects Study -
      Thomas H. Stock and Maria T. Morandi	  5-1

Assessing Activity Patterns for Air Pollution Exposure Research -
      James Adair and John D. Spengler	  6-1

Perception of Daily Cigarette Consumption in an Office Environment -
      David A. Sterling, D.J. Moschandreas, and Robert D. Gibbons	  7-1

Capture of Activity Pattern Data During Environmental Monitoring -
      Harvey Zelon	  8-1

An Activity Pattern Survey of Asthmatics - Carolyn H. Lichtenstein,
      H. Daniel Roth, and Ron E. Wyzga	  9-1


                 Nonresponse:  Avoidance and Data Analysis

The Treatment of Missing Survey Data - Graham Kalton and Daniel
      Kasprzyk	 10-1

Nonresponse Adjustment Methods for Demographic Surveys at the
      U.S. Bureau of the Census - Rajendra P. Singh and Rita J.
      Petroni	 11-1

On the Robustness of the Maximum Likelihood Estimator in the Presence
      of Nonresponse in Compositional  Data - Chao-lung Chen	 12-1

Nonresponse Problems and Solutions:  A Case Study - Dawn Nelson and
      Chet Bowie	 13-1

Principles of Questionnaire Design and Methods of Administration -
      Wendy Visscher, Roy W. Whitmore, Mel Kollander and
      F. Cecil Brenner	 14-1


      Microenvironments and Activities:  Definitions and Distinctions

Estimation of Microenvironment Concentration Distribution Using
      Integrated Exposure Measurements - Naihua Duan	 15-1

Microenvironment Database for Total Human Exposure Studies -
      Muhilan Pandian	 16-1

A Methodology for Estimating Carbon Monoxide and Resulting Carboxy-
      hemoglobin Levels in Denver, Colorado - Ted Johnson	 17-1

The Influence of Daily Activity Patterns on Differential Exposure
      to Carbon Monoxide Among Social Groups -  Margo Schwab	 18-1


Identification of Research Needs	 19-1

Conference Participants	 20-1

-------
                             ACKNOWLEDGEMENTS
      Gerry Akland,  Joseph Behar, Stephen Hern,  and  Wayne Ott of the U.S.
Environmental  Protection Agency  were  important  sources  of  information,
guidance,  and encouragement  in  the planning of this conference.  Chao Chen,
Lynn Fenstermaker,  Carol Forsythe,  Leslie Gorr,  Muhilan Pandian, and Marie
Schnell of  the Environmental Research  Center  are also thanked for their
help  in  the  operation  of  the  conference  and  the  production of these
proceedings.

-------
                                  SUMMARY

      Modern technology has brought a dramatic increase in the production
and consumption of man-made chemicals.  To determine the human health risks
posed by these new chemicals, it is necessary to investigate and estimate
how, how often, and at what concentrations the human population is exposed
to the chemicals.  This in turn requires information on when, where, and
how people spend their time; that is, on the human activity patterns
of a population.

      The  U.S.   Environmental   Protection Agency  (EPA)  employs exposure
models (e.g., SHAPE  and  NEM)  to help develop  risk assessments for  various
chemicals.   A key  component  for these  models is human activity  pattern
information.  To date, only a  few exposure studies have  gathered the human
activity pattern data required  to  drive  such  models,  and  these studies  have
utilized essentially the  same basic  methods.

      This conference was held to compare and contrast methods of human
activity pattern data collection and analysis from environmental exposure
studies (e.g., EPA's TEAM VOC studies) with those from non-environmental
studies.  Of particular interest were validity of approach, methodology for
eliciting cooperation from individuals in the sample, and implementation of
questionnaires and other data acquisition methodologies.  In addition,
there was interest in data analysis methods used to reduce bias caused by
nonresponse.  Another important objective of this conference was to
identify research needs in the study of human activity patterns.

      To accomplish the conference  objectives, both  speakers and attendees
were selected so as to provide a broad background of  experience  in  dealing
with the problems and needs of human activity pattern studies and research.
The conference  participants  came from universities,  research institutes,
and state  and  federal agencies; and they represented a broad spectrum of
disciplines  and interests.    The  discussions  following  each  session  of
presentations were spirited,  informative,  and representative of many points
of view concerning the problems discussed.

      The conference papers are categorized  into three substantive  areas of
human  activity  patterns  studies:   microenvironments and  activities,
personal  activities monitoring  and quality assurance,  and nonresponse.

MICROENVIRONMENTS AND ACTIVITIES

      To describe human activity patterns,  one  must decide  how detailed the
descriptions must be.  For some environmental studies, it may be  sufficient
to state when and how long people  are indoors,  in  transit (car, bus, train,
or plane), or outdoors, and whether  or not they are engaged in a  particular
activity  (e.g.,  smoking).   For  other environmental  studies,   it  may  be
necessary to be  far more  specific in  the  characterization  of each  person's
location  and  activity.     Typically,   the  required  specificity  of   the
descriptions depends on the nature of the  environmental  pollutant under
investigation.    Some  participants of the  conference thought that it would
be wise to standardize the descriptions of locations (or microenvironments)
and activities so that results of different environmental human activity
pattern studies  could  be compared, and also  could  be used in estimating
population exposure  to pollutants other than those  targeted in the survey.

      The  word microenvironment  was frequently used  in  the conference, but
it was  evident that there  was  some  disagreement as to  what  its proper
definition should be.  Pandian states that

      In its  simplest  form,  a microenvironment can be defined  as  a
      control   volume  with  a  homogeneous  pollutant  concentration.
      ...Since the pollutant  concentration of the same volume might
      vary with time,  a better definition  of  microenvironment  is the
      four dimensional concept (3-D space)  x (time) (Duan,  1982).
      ...In this  paper, a new concept is  introduced  in which the use
      of microenvironments  is extended  to include all the different
      components  of the total human exposure  process,  from pollutant
      sources  to  related health effects.

Thomas and Behar, on the other hand, state "...the term microenvironment
has been defined to mean the location where activity takes place."
Similarly, Chen uses the word microenvironment to represent type of
location ("...well defined 'microenvironments' [e.g., kitchen, parking lot,
bathroom]...") but evidently employs the quotes on microenvironments to
indicate that this is his definition for the purposes of his paper.  Many
of the other speakers do not try to define microenvironment, but rather
employ categorizations of microenvironments.  Typically they categorize
microenvironments by types of location (e.g., residence, office, car).

      Pandian  suggests  the development of  a relational  data base  that  will
connect many  different  types of information that  one might associate  with
microenvironments such  as pollutant sources,  sinks,  and concentrations;
locations; activities and their durations; pollutant-carrying media; dosage
processes; and health effects.  Pandian points out  that such a  data  base
would  provide  scientists  studying  human  exposure  with  data on
microenvironments and  related elements in a concise  package,  and  would
expose areas where further research is needed.

      Duan considers the  difficult  problem of estimating  microenvironment
pollutant  concentrations given  the results of exposure  measurements
integrated over  several  microenvironments  and  given data indicating how
long  subjects wearing the exposure meters spent in  each environment.  He
gives estimation  procedures  based  on three different types of  independence
assumptions.   It is possible  that all  of these independence  assumptions
are too strong to approximate reality.
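
      The estimation problem that Duan considers can be illustrated with a
minimal sketch.  It assumes that each subject's integrated exposure is the
time-weighted sum of one fixed concentration per microenvironment, and it
fits those concentrations by ordinary least squares.  This is an
illustration under that stated assumption only, not one of the three
procedures in Duan's paper; the function name and the input layout are
hypothetical.

```python
def estimate_concentrations(time_fractions, exposures):
    """Least-squares estimate of one concentration per microenvironment.

    time_fractions[i][k]: fraction of the monitoring period that subject i
        spent in microenvironment k (hypothetical input layout).
    exposures[i]: subject i's time-averaged (integrated) exposure.
    Assumes exposure_i = sum_k concentration_k * time_fractions[i][k].
    """
    n_env = len(time_fractions[0])
    # Normal equations (T'T) c = T'e for the least-squares fit.
    a = [[sum(row[j] * row[k] for row in time_fractions)
          for k in range(n_env)] for j in range(n_env)]
    b = [sum(row[j] * e for row, e in zip(time_fractions, exposures))
         for j in range(n_env)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n_env):
        pivot = max(range(col, n_env), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("time data do not identify every microenvironment")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n_env):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
                b[r] -= factor * b[col]
    return [b[k] / a[k][k] for k in range(n_env)]
```

The fit needs at least as many subjects as microenvironments, and the
subjects' time allocations must differ enough to separate the
microenvironments; otherwise the function raises an error rather than
returning an unstable estimate.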

      Johnson  discusses  a  methodology that was developed to estimate carbon
monoxide exposure and resulting carboxyhemoglobin levels  of residents of
five  counties  in the greater  Denver metropolitan  area.  The author tells
how  activity   pattern   data from previous  surveys  were used   in   this
estimation process.   He  also discusses some of  the  shortcomings  of his
approach.

      Schwab gives  a geographer's  approach to  the analysis  of activity
pattern  and  exposure  data.    She  examines  the  relationship  between
sociodemographic factors and exposure,  and  gives  results  for this type of
analysis  that  are  based  on EPA's  study of  personal  exposure  to carbon
monoxide in Washington,  D.C.

      Thomas  and Behar compare  time-budget  data  from  carbon  monoxide
exposure  studies in two cities,  Denver and Washington,  to determine the
"likeness"  of  the  activity  patterns observed  in the two  cities.   They
investigate the proportions  of  time  spent in residence, indoors (other than
residence), outdoors, and in transit.   Both studies were conducted in the
winter season.   The authors point out the need for data from other seasons
to determine seasonal  effects.
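
      As a rough illustration of the time-budget proportions discussed
above, the sketch below sums diary minutes by location category and divides
by the total recorded time.  The category labels, the function name, and the
(category, minutes) record layout are assumptions made for this example and
are not taken from the Denver or Washington studies.

```python
from collections import defaultdict

# Hypothetical labels for the four location categories named in the text.
CATEGORIES = ("residence", "indoors_other", "outdoors", "in_transit")


def time_budget(diary):
    """Return the proportion of recorded time spent in each category.

    diary: iterable of (category, minutes) entries for one person.
    """
    totals = defaultdict(float)
    for category, minutes in diary:
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        totals[category] += minutes
    total_minutes = sum(totals.values())
    if total_minutes == 0:
        raise ValueError("diary contains no recorded time")
    return {c: totals[c] / total_minutes for c in CATEGORIES}
```

Comparing two cities would then amount to comparing these proportions,
averaged over respondents, category by category.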

      Ott, in reviewing  results of activity  pattern  studies  from around the
world, found as  a  common  thread that people  are  basically  indoor animals
that  spend less than five  percent  of  their time  outdoors.   This  paper
includes  a long list of references  that provides the reader  with a good
bibliography of the  activity  patterns  literature.

PERSONAL ACTIVITIES  MONITORING  AND QUALITY ASSURANCE

      The  basic  questions discussed  are (i)  What methods  of monitoring
human activity patterns give a good balance between being  unobtrusive and
having accurate results? and (ii) How does  one evaluate  accuracy of human
activity pattern monitoring results?  If monitoring methods are obtrusive,
this obtrusiveness  may cause  an increase in  nonresponse rates and may cause
respondents  to alter their  normal  activity pattern during the  period of
monitoring.  Relatively non-obtrusive methods typically rely on the memory
of the  respondents  concerning  activities  during  a previous  period of some
fixed length and thereby may  be unreliable.

      Estimation of measurement  error  variance  and measurement  bias  in
human activity pattern monitoring data typically  must rely on redundancy in
measurement instruments, with one instrument being  more accurate  (and often
more  obtrusive) than the  other.   Some  biases  may be  obvious,  such  as
failure to report time  spent in bathrooms and restrooms,  but the extent of
the  bias  may still  be  difficult  to estimate.  The  research  done in this
area of data quality assurance  appears to be quite limited.
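
      The redundancy idea can be sketched as a comparison of paired
measurements.  The example below treats the more accurate instrument (for
instance, an observer's log) as the reference and summarizes the
differences from the self-reported values.  Treating the reference as
error-free is an assumption of this sketch, and the function name is
hypothetical.

```python
import statistics


def paired_log_comparison(self_reported, observed):
    """Summarize disagreement between two logs of the same quantity.

    self_reported, observed: equal-length lists of durations (minutes)
    for matched subjects or time periods.  The observed values are treated
    as the reference, so their own error is ignored.
    """
    if len(self_reported) != len(observed) or len(observed) < 2:
        raise ValueError("need at least two matched pairs")
    differences = [s - o for s, o in zip(self_reported, observed)]
    return {
        # Mean difference: estimated bias of the self-report.
        "bias": statistics.mean(differences),
        # Sample variance of the differences: spread of the disagreement.
        "difference_variance": statistics.variance(differences),
    }
```

A negative bias here would indicate underreporting relative to the
reference log, as might occur for time spent in bathrooms and restrooms.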

      Stock and Morandi report on a study in which participants maintained
a personal activity log and were observed by technicians who also logged
the participants' activities during some of the study period.  They found
some large discrepancies between the pairs of logs, especially for logs of
participants who were young male children.

      In a different  type of  study,  Sterling,  Moschandreas,  and Gibbons
compare the  amount  of smoking  in a workplace  perceived  by  smokers and by
nonsmokers.   While the smokers' perceptions  were in agreement  with the
amount of  individual  smoking that  they  reported,  the nonsmokers perceived
the  amount  of cigarettes  smoked  to  be  significantly  lower than  that
reported and perceived by the smokers.

      Zelon discusses observed  advantages  and disadvantages for two methods
of  reporting  activity patterns  that were used  in EPA exposure  studies.
These two methods  are  a  24-hour  recall  questionnaire  and  a diary in which
the  respondent  is to  make entries  whenever  he changes  activities.   He
suggests a  number of improvements that might be made in  these procedures
including some pre-training of  the respondents.

      Robinson lists several  methods for recording  information  on human
activities  and some  of their  advantages  and disadvantages.   He also gives
some information on  the amount of agreement found  between results obtained
by  different  approaches.    In  addition,  he  lists  the  advantages  of  a
Computer-Assisted Telephone Interviewing (CATI) system developed by the
University  of California-Berkeley,  and discusses its  use  in  a study being
conducted for  the California Air  Resources Board.

      Quality assurance  and quality control  procedures  employed  in  the
Harvard Air Pollution Study are described  by Adair  and Spengler.  The study
involves a  survey of children  in  several cities to  record daily respiratory
symptoms and to monitor exposure to NO2 and respirable particles.  In
addition to listing their  procedures, the  authors give some of the problems
and limitations  resulting  from the procedures.   They  also compare activity
pattern results  from different regions and from different seasons  of the
year.

      Lichtenstein,  Roth,  and  Wyzga  describe a diary approach  in which  a
questionnaire is  filled out on an hourly basis to determine  the  activity
patterns of asthmatics.   They also briefly address the problem of how to
keep the use  of  monetary  incentives  from bringing  in respondents who are
not actually members of the target population.

NONRESPONSE

      Two  types   of   nonresponse are  discussed.     One  is  called  unit
nonresponse and  this  occurs when  a  sampled individual  either cannot  be
located or  is  unwilling or unable to  participate in the survey.   The other
type of nonresponse  is called  item nonresponse  and this  happens when only
some of  the questions in  the  questionnaire  go unanswered or answers are
deleted in editing.  A probability sample is one in which each unit in the
target population  has a  known, nonzero,  probability  of being included in
the  sample.    Probability  samples  are  advantageous  in  that  they  allow
unbiased estimation  of the population  characteristics and also estimation
of  the  variance  of  the  estimates.     However,  when   nonresponse  is
encountered,  one no  longer has  a  true  probability sample in that  one no
longer knows  the  probability that  information for  a given individual  will
be  in  the  sample survey results.  That  is,  one may  select  a probability
sample  of  a population,  but,  if  some members of the population  may  be
nonrespondents,  the  actual sample from which data are collected  is not  a
probability sample  of the population.   Hence,   samples  of large  human
populations are  almost never truly probability  samples.   Nevertheless,  if
the rate of unit nonresponse is small (say less  than 5 percent), it is not
unreasonable  to  treat the sample as  a  probability  sample.    For  larger
nonresponse rates, one has to worry about how the nonrespondents differ
from respondents in terms of the variables being measured in the survey,
and some methods  for correction for nonresponse should  be applied in the
analysis so as to avoid seriously biased estimates.

      One conference participant observed that for some surveys, the
objectives  may  be  such  that   it  is  not  necessary  to  try  to  obtain
probability  samples nor  to worry about  nonresponse bias  in estimation.
This  can  happen  when  the objectives  are  qualitative  rather  than
quantitative.   For  example,  the purpose  of the survey may be to  show that
there  are at  least  some people  in  the  population being  exposed  to  a
chemical, or receiving  a  major  portion of their exposure to a chemical, at
a particular type of location or through a particular type of activity.

      Nelson and Bowie list methods employed by the U.S. Bureau of the
Census to obtain low unit-nonresponse rates and also discuss an experiment
testing additional methods intended to raise response rates further.  They
stress the importance of "using well-trained professional interviewers who
have a positive attitude."

      Visscher, Whitmore,  Kollander,  and  Brenner place emphasis on the role
of  wording,   testing,  and  administration  of the  questionnaire  in  the
avoidance of  both unit and item nonresponse.   In  addition,  they list the
steps required  in  the production of  a good questionnaire from the  careful
specification  of  the  purpose  of  the  survey  to  a  pretest  of  the
questionnaire on people similar  to those  in the target population.

      Kalton  and  Kasprzyk  give an overview of data analysis  procedures
designed to partially  compensate for  nonresponse  in  sample surveys.   They
review methods of weighting adjustment and imputation.  Included in the
review  are discussions of several "hot deck"  methods of imputation.  All
the methods  discussed assume that once the  auxiliary variables have been
taken into account, the missing values are missing  at  random.   This is a
strong  assumption about  the  nonresponse  mechanism.   The authors mention a
method  that  avoids  the assumption, but point  out  that  it is  sensitive to
distributional assumptions.   They conclude their paper with the statements
that all methods for handling missing  survey  data must depend  on untestable
assumptions,   and  that the only  safeguard against  serious nonresponse bias
is to keep the amount of missing data  small.
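
      To make the idea of hot deck imputation concrete, here is a minimal
sketch of one simple variant: a missing item is filled with the value of a
randomly chosen respondent (a donor) from the same class of auxiliary
variables.  It is not claimed to match any specific method in the Kalton
and Kasprzyk review, and the function and field names are hypothetical.

```python
import random


def hot_deck_impute(records, class_key, item_key, seed=0):
    """Fill missing items with a random donor value from the same class.

    records: list of dicts; a missing item is stored as None.
    class_key: field that defines the imputation class.
    item_key: field to impute.
    Assumes that, within a class, values are missing at random.
    """
    rng = random.Random(seed)
    # Collect the reported values (donors) in each class.
    donors = {}
    for rec in records:
        if rec[item_key] is not None:
            donors.setdefault(rec[class_key], []).append(rec[item_key])
    completed = []
    for rec in records:
        new_rec = dict(rec)
        new_rec["imputed"] = rec[item_key] is None
        if new_rec["imputed"]:
            pool = donors.get(rec[class_key])
            if not pool:
                raise ValueError(f"no donor in class {rec[class_key]!r}")
            new_rec[item_key] = rng.choice(pool)
        completed.append(new_rec)
    return completed
```

Flagging imputed values, as the sketch does, lets later analyses
distinguish reported data from filled-in data.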

      Singh and  Petroni discuss nonresponse weighting  adjustment used at
the Bureau of  the Census  for demographic surveys.  Procedures for defining
noninterview  adjustment  cells   are  presented.    A  response  is  weighted
according to a function of the nonresponse  rate for the  cell containing the
respondent.  An application  of  these methods to the  Census  Bureau's  Survey
of Income and Program Participation is used as  an example.
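
      A common textbook form of such a cell-based adjustment multiplies
each respondent's base weight by the inverse of the weighted response rate
in the respondent's cell.  The sketch below shows that form only; whether
it matches the exact function used by the Census Bureau is an assumption,
and the field names are hypothetical.

```python
def adjust_weights(sample):
    """Apply a weighting-class nonresponse adjustment.

    sample: list of dicts with keys 'cell', 'base_weight', and 'responded'.
    Returns the respondents only, each with an adjusted 'weight' equal to
    base_weight * (total weight in cell / respondent weight in cell).
    """
    cell_total = {}
    cell_respondents = {}
    for unit in sample:
        cell = unit["cell"]
        cell_total[cell] = cell_total.get(cell, 0.0) + unit["base_weight"]
        if unit["responded"]:
            cell_respondents[cell] = (
                cell_respondents.get(cell, 0.0) + unit["base_weight"]
            )
    adjusted = []
    for unit in sample:
        if not unit["responded"]:
            continue
        cell = unit["cell"]
        # Inverse of the weighted response rate for this cell.
        factor = cell_total[cell] / cell_respondents[cell]
        adjusted.append({**unit, "weight": unit["base_weight"] * factor})
    return adjusted
```

After adjustment, the respondents in each cell carry the full weight of
the cell.  A cell with no respondents receives no weight at all, which is
one reason cells must be defined carefully.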

      Chen considers the item nonresponse problem and  develops a maximum
likelihood estimator based on a  class  of  logistic normal  distributions.  He
then  investigates  the  robustness   of  the estimator   and  relates  the
robustness  to assumptions about the  nonresponse  mechanism.   He suggests
that use  of  response incentives may  change the nonresponse mechanism,  and
thereby change the robustness of estimators.

RESEARCH NEEDS

      The study of  human  activity patterns  and  their relation  to  human
exposure is a relatively new science.  Nearly all aspects of this science
require further development.  As this  science  is developed, it will provide
better  information  for environmental  policy  makers in  making  decisions
about the health  risks  of many potentially hazardous chemicals.  At the
conclusion  of  the  conference,  Michael  Callahan pointed  out   that
researchers,  when  planning and  when  reporting  their   work,   should  be
cognizant of problems faced by the people designing and enforcing
regulations.    From  the  regulatory  decision  maker's  perspective  the
important questions are "What  is  the probability of harm from industrial
use of chemical X?" and "What can I do about it?"

      Conference  participants  were   asked  for  their  recommendations
concerning future research  needs  in the  study  of human  activity patterns
and their  relationships  to human exposure to  pollutants.   A  listing  of
these  recommendations  is given  at the  end  of these proceedings.    The
recommended  research topics include

      •     measurement methods development, validation,  and  comparison;

      •     formulation  of better exposure models and of methods  for the
            validation of the models;

      •     development of a standard  data coding and format;

      •     investigation of the  longitudinal  aspects of  activity patterns
            and exposure;

      •     examination  of  procedures  that  might increase  participation
            rates  of sampled individuals;

      •     determination  of  methods   of  data  analysis  best  suited  for
            reducing nonresponse bias;

      •     performance  of field and laboratory  studies  to  determine the
            associations  between  activities,   microenvironments,  and
            exposures;

      •     development of guidelines  and  standards for field studies;

      •     development  of protocols   for   interviewer and   technician
            training and for subject instruction;

      •     development  of quality assurance and  quality control  methods
            appropriate to activity pattern studies.

-------
              ESTIMATING AMERICANS' EXPOSURE TO AIR POLLUTION:
                    ISSUES, ALTERNATIVES AND SUGGESTIONS

                     by:   John  P.  Robinson
                          Department of Sociology
                          University of Maryland,
                          College  Park, MD 20742
                                 ABSTRACT

     With the increased interest in how the public spends its time has come
a proliferation of techniques for its measurement.  In addition to
observation, five such methods are described and contrasted.  Some general
advantages and examples of the diary method are presented in relation to
this method.

     A major  problem with  existing diary  studies  is  that they  are not
focused on variables that are of main concern or  relevance to environmental
researchers.   The  features  of  a  new study that was  explicitly designed to
adapt the diary method to produce generalizable exposure estimates for  a
large population  are discussed.   It is proposed  that  such a technique is
quite appropriate  and  adaptable for generating  national  level estimates,
although they would ultimately need to be validated  and calibrated  using  a
combination  observational/personal monitoring approach.

-------
INTRODUCTION

     Time is an increasingly used indicator of human activity and
performance.  Inferences about the quality of life are made from data on the
length of the workweek,  the  time we spend watching television, the number
of hospital  days we are ill,  or the  time  American men  and women spend doing
housework.   There seems  to  be  a widening perception that  our  decisions
about time are becoming as  important as our decisions about money.

     Time plays much  the same  role  in  estimating  the  quality  of  the
interaction  with  our   environment.    The amount  of  time  we   spend  in
certain environmental conditions, or in exposure to certain pollutants, is a
key indicator of the daily risks  we  take.   At  the same time,  it also serves
to  suggest  understandable numbers  that  are directly  subject  to policy
manipulation; that is,  the simple statement that we  spend x minutes per day
exposed to  carbon monoxide or  y  hours per week  exposed  to cigarette smoke
immediately  implies steps  that  can be taken to reduce risk.

     Measuring  these  amounts  of  time would  also  seem  a  fairly
straightforward matter.  We can  easily visualize the daily activities that
lead  to  individuals being exposed  x  minutes to carbon  monoxide per day.
But that  is  because  we are implicitly back-translating  this  number to the
common-sense  method of observation.     Unfortunately,   the  observational
method of estimation, while feasible and  persuasive, is simply  too  unwieldy
and cost intensive to be workable as an  approach.   How many people  selected
at random would be willing to have an observer follow  them  around for an
extended  period of time?  How much interviewer time  would be required to
collect  the data,  and how  much effort  would  be  required  to  train  the
interviewers to be adequate observers  of  what  we want  to  observe?   How much
might people's activities  be  affected by  the presence of the observer?

ALTERNATIVE METHODS

     These are  some of  the obstacles that  face  the  observation method.   It
is not so much  that these problems are unsolvable as that they  are  unwieldy
and expensive to address.  More  cost-effective  methods have  been  proposed,
and I will attempt to make a  persuasive  case for them.  At  the  same time, I
will later argue for the ultimate need for more  observational  studies.

     At least five methods of  estimating time exposure can be  found in the
literature:

      1)    Respondent estimate:  This is the cheapest and perhaps most
            commonly used method.  What is involved is simply asking
            respondents to estimate the time per day or per week they
            spend doing a particular activity.  This is the way the Census
            Bureau and Bureau of Labor Statistics obtain their estimates
            about the length of the workweek or the number of vacation
            days.  It is the way many survey organizations have estimated
            time spent watching television or using other media, or time
            spent in voluntary organizations or time spent doing housework.


      2)     Estimates of others:   This is essentially the  same  approach,
            except one  uses  other informants  who live  with  or  know  the
            respondent  as respondents.


      3)     Telephone coincidental:  In this  technique,  respondents  give
            only brief  reports  for a specific moment — usually the moment
            the telephone  rang  in their household.    This approach  has
            usually  been used by  media rating services.

      4)     Behavioral meters:  This is probably the most expensive
            approach, since it requires expensive equipment, respondent
            agreement to use such equipment, and usually some sort of
            technical help to install or adjust the metering equipment.
            The most common example of this approach is the TV monitoring
            boxes used by Nielsen, Arbitron and the other TV rating
            services.  More sophisticated "people meters" have been
            developed to replace these black boxes, but they still suffer
            from the same problems, particularly in relation to respondent
            cooperation (Hartwell, 1984; Johnson, 1985).

      5)     Respondent  Diaries:    Like estimates,   these   are  respondent
            self-reports.   Diaries differ  in that they require respondents
            to give  a full  accounting of their  time  for a specified period,
            such as  an evening,  a  day, or week.   The diary is  constructed
            to be comprehensive, so  that  respondents report for  all  time
            during that period.  There  are almost as many diary approaches
            as diary studies, with some studies using fixed reporting
            intervals, others open; some using closed activity
            categories, others open; some including several parameters of
            activity, others only one or two; some retrospective (i.e.,
            about "yesterday"), some prospective (i.e., about "tomorrow"),
            and so forth.

      Figure 1 illustrates the typical information obtained in a diary
format, using open-end action questions as in the 1965, 1975 and 1985
national studies of Americans' Use of Time conducted by the Survey Research
Centers at the University of Michigan and at the University of Maryland.

     The diary approach takes 15-20 minutes to complete and thus is far
more expensive than the estimate approach, and expensive in relation to
most other methods as well.  Several studies have demonstrated the
reliability of the
diary method  in terms of its  ability to produce  similar  estimates.   Thus,
Robinson  (1977) found a  .85  correlation between  diary  estimates  using  the
"yesterday" and "tomorrow"  approaches and  a  .86 correlation  between overall
estimates  from  a  1965-66  national sample and  a  separate random  sample of
respondents from the single site  of Jackson,  Michigan.
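
     As an illustration of how a reliability coefficient of this kind is
computed, the sketch below correlates average minutes per day across
activity categories from two diary approaches.  The activity figures are
invented for illustration and are not taken from Robinson (1977); the
function name is likewise an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical average minutes per day, by activity, from two diary approaches.
yesterday = {"sleep": 470, "work": 280, "housework": 95, "tv": 130, "travel": 75}
tomorrow = {"sleep": 460, "work": 300, "housework": 80, "tv": 110, "travel": 85}

activities = sorted(yesterday)
r = pearson_r([yesterday[a] for a in activities],
              [tomorrow[a] for a in activities])
print(f"correlation across activities: {r:.2f}")
```

     A coefficient computed this way describes agreement in the aggregate
(across activity categories), not agreement for individual respondents.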

     The diary method has also demonstrated basic validity in the aggregate
sense.  In one study (Robinson 1985), diary estimates correlated .81 with
estimates using a "beeper" that went off at random moments during the day.
In another study, Juster (1985) found husbands' and wives' diary accounts
independently agreeing about what the other was doing at various points
during the day.  Chapin (1974) reports that his diary accounts squared well
with those reported by participant observers in his Washington, D.C.,
study.

FIGURE 1:  SAMPLE TIME DIARY PAGE  (1985 Study of Americans' Use of Time)
         What you did from midnight until 9 in the morning
         [Diary form.  Rows: hourly time marks from midnight through 9 AM.
         Columns: Time; What did you do?; Time Began; Time Ended; Where?;
         List Other People With You; Doing Anything Else?]

     Thus, the diary  method has been found to produce  national  estimates
that  have desirable measurement properties in terms of activity reports.
As  such,  they raise  questions about  the  accuracy and interpretation  of
alternative sources of data using other methods.  As in other countries,
for example, diary reports of the time employed people spend at work are
10-15% lower than the hours they report as their "official" workweek.  In
the same way, time spent watching television is far lower than that
reported by media rating services.  On the other hand, free time is much
greater than what respondents estimate that they have.

     But that is  for  activity; and as valuable as  activity estimates are
for understanding what the population is doing, the  more important question
for environmental research is the  air quality  in the location in which the
activity  takes   place and  the length  of  time  spent  in that  location.
Playing softball  in a field next to a toxic  waste site has far different
health  implications   than  playing softball  in a  clean  air  environment.
Driving  on an  open   country  road  is  much  different   than  driving  in  a
congested  urban   location.   Working in the  presence  of smokers  is  far
different from working in  a smoke-free work station.

     However, location information is not a missing element in the national
time diary studies done to date.  Information is carefully and regularly
collected on "where" each activity takes place, and Figure 2 shows the type
of dynamic location information that can be derived from diary studies.  In
this figure, it can be seen that the pattern of time spent at home for
employed men on a weekday in five countries is basically similar, but can
diverge at certain points during the day.  In contrast to men in Western
Europe, for example, relatively few American men are at home between
midnight and 2 a.m. and also between 7 p.m. and midnight (at the end of
Figure 2).  It can also be seen that relatively fewer of them return home
for lunch than is true in Western Europe (or in Peru).
Thus, if exposure levels at home (or  in other  locations) were  known to vary
across the day,  the diary estimates would  provide information  that would
reflect these  variations.   In this  way,   Figure  2 provides  a  far  richer
source  of data  than  the  simple  average  number  of minutes or  hours  in
certain locations.
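
     The kind of tabulation behind such a figure can be sketched as
follows.  This is a minimal illustration, assuming each diary is stored as
a list of (start minute, end minute, location) episodes; the data layout,
function name and the three sample diaries are hypothetical and are not
drawn from the national studies.

```python
def percent_at_home_by_hour(diaries):
    """For each clock hour 0-23, return the percentage of respondents who
    were at home at the start of that hour.

    `diaries` maps a respondent id to a list of (start_min, end_min, location)
    episodes, with minutes counted from midnight and end_min exclusive.
    """
    if not diaries:
        raise ValueError("no diaries supplied")
    counts = [0] * 24
    for episodes in diaries.values():
        for hour in range(24):
            t = hour * 60
            if any(s <= t < e and loc == "home" for s, e, loc in episodes):
                counts[hour] += 1
    n = len(diaries)
    return [100.0 * c / n for c in counts]

# Hypothetical diaries for three respondents (not survey data).
diaries = {
    "r1": [(0, 420, "home"), (420, 480, "transit"), (480, 1020, "work"),
           (1020, 1080, "transit"), (1080, 1440, "home")],
    "r2": [(0, 480, "home"), (480, 720, "work"), (720, 780, "home"),
           (780, 1050, "work"), (1050, 1440, "home")],
    "r3": [(0, 120, "other"), (120, 540, "home"), (540, 1140, "work"),
           (1140, 1440, "other")],
}

profile = percent_at_home_by_hour(diaries)
for hour in (1, 12, 20):
    print(f"{hour:02d}:00  {profile[hour]:.0f}% at home")
```

     Multiplying such an hourly profile by hour-specific exposure levels,
were they known, would give the time-varying estimate described above.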

AN  ENVIRONMENTAL APPROACH

     Nonetheless,  it  is  clear that to address  environmental  issues one
needs to  reverse  the  basic time diary method's current sociological focus
on  what  the  public   is   doing  and  how  that  is  changing.    With  an
environmental  focus,  the concentration  instead needs to  be on  the nature of
the location in which activities occur.  In other words, activities are
useful mainly to  the extent  that they alert respondents  to report more
accurately on where they are doing the  activities and the  possible exertion
rates during the activity.

FIGURE 2: PERCENTAGES OF PEOPLE AT HOME ACROSS THE DAY
                         EMPLOYED MEN ON A WEEKDAY
             (1965 U.S. National Study of Americans' Use of Time)
          [Graph: percentage of employed men at home, by time of day]

     That emphasis has been incorporated into the time-diary study now
underway on the air quality exposure of residents of the State of
California.  The study is funded by the California Air Resources Board
(CARB) and was designed to obtain data useful in estimating exposure to
several pollutant sources on the previous day.  The following features have
been incorporated into the survey instrument, which is administered over
the telephone to a random sample of state residents:

      1)     It is specifically focused  on  exposure for  a particular day.
            Activity  coding  for that  day has been streamlined to highlight
            activities  of major  environmental  concern,  in  particular,
            activities that require higher  breathing rates, such as  sports
            activities,  or  that  involve  exposure to  chemicals,  such as
            painting  and auto  repair.

      2)     For each  activity  period  during the day,  respondents are asked
            specifically  about  passive  exposure  to  cigarette smoke  or
            smoking materials.   In  that way it is possible to have detailed
            information  about  the length of exposure periods and when they
            occur during  the   day,  as  well as  reminding  respondents of
            periods of the day when they might not recall being exposed to
            smokers.

      3)     This same feature of diary expansion could have been applied to
            all other possibilities of daily exposure to problematic air
            quality, but that would have greatly increased the reporting
            burden on respondents and lengthened an already time-consuming
            instrument.  For that reason, a more direct set of questions
            regarding potential exposure was also developed for the survey.
            These were largely asked after the diary was completed and the
            respondent's memory had been refreshed regarding the previous
            day's activities.  Possible exposure on the previous day to
            several sources of pollution was included:  gasoline engines,
            cars in garages, solvents and cleaning agents, paints, glues,
            dry cleaning chemicals, etc.  A separate set of
            questions asked about exposure at the workplace.  These direct
            questions were intended to focus respondents' attention on
            important aspects of the previous day's activities, which they
            might have otherwise overlooked in their attempts to
            reconstruct the behavioral details of the previous day in the
            diary.  These questions are shown in Table 1.

                                  TABLE  1
EXPOSURE QUESTIONS  BEFORE DIARY

>wp1<       Does your job involve working on a regular basis, that is, once
            a week or more often, with:

            Gas stoves or ovens?
>wp2<       Open  flames?
>wp3<       Solvents  or chemicals?
>wp4<       Dust  or particles of any sort?
>wp5<       Gasoline  or diesel-powered vehicles or work equipment?
>wp6<       Other air pollutants?

>smok<      Did you smoke any cigarettes yesterday—even one?

            (If  yes)   Roughly,    how  many  cigarettes  did  you  smoke
            yesterday?

>smok<      (CODE OR  ASK AS NEEDED)
            Did you smoke any cigars or pipe tobacco yesterday?

            (If yes)   Roughly how many cigars  or pipes  of tobacco did you
            smoke yesterday?

EXPOSURE QUESTIONS  AFTER DIARY

            Just to be sure we didn't miss any important information, I
            have some additional questions about yesterday's activities.
            Did you  spend  ANY  time yesterday  at a gas  station or  in  a
            parking garage or auto repair  shop?

>P9ys<      (If Yes)   About how  long in  all  yesterday did you spend in
            those places?

>pgas<      Did you pump or pour any gasoline (yesterday)?

>gstv<      Did you spend  any part of  yesterday  in  a room where a gas range
            or oven was turned on?

>nstv<      Were  you around more  than one gas  range or oven yesterday, or
            only  one?

>ms1<       Was the gas range or oven you were around for the longest time
            yesterday  being  used  for cooking,   for  heating the  room,  or
            for some  other purpose?

>mstm<      Roughly  how many  minutes or  hours  IN  ALL  were you  in rooms
            where gas ranges or ovens were  turned on (yesterday)?

>ms2<       Does the oven or range  that you  were  around  the longest have a
            gas pilot light or  pilotless  ignition?
>gspr<      Was the gas range  or  oven  being  used for cooking,  for heating
            the room, or for some  other purpose?

>htfl<      What kind of heat was it -- gas, electricity, oil, or what?
            (IF COMBINATION: Which kind did you use most?)

>heat<      What type  of heater  was  turned  on for  the  longest  amount of
            time?   Was it  a wall  furnace,  a floor  furnace,  forced  air,
            radiator, space heater,  or something  else?

>open<      Were any  doors or windows in your home  open  for  more than a
            minute or two at a  time  yesterday?

>opn1<      For about how long during the day, that is, from 6 a.m. to
            6 p.m., (were they/was it) open?

>opn2<      For about how long during evening or night hours, that is, from
            6 p.m. to 6 a.m., (were they/was it) open?

>fan1<      Did you use any kind of fan in your home yesterday?

>fan2<      Was that a ceiling fan, window fan, portable room fan, or
            something else?

>airc<      (Other than the fan you just mentioned) Did you use any kind of
            air cooling system  in  your  home yesterday,   such as  an  air
            conditioner?

>ACtp<      What type is it?

            <1> Evaporative cooler (swamp cooler)
            <5> Refrigeration type (air conditioner)
            <7> Other (SPECIFY)

>glue<      Did you use or were you around anyone while they were using any
            of the following yesterday:
               Any glues or liquid or spray adhesives?
               (NOT INCLUDING ADHESIVE TAPE)

                   Yes
               <5> No

>pnt2<      Any water-based  paint products  (yesterday)?
               (ALSO KNOWN AS  "LATEX PAINT")

>solv<      Any solvents  (yesterday)?

>pest<      Any pesticides  (yesterday)  such as bug strips or bug sprays?
>pst2<      When you  were   around  pesticides yesterday,  were  you mostly
            indoors or outdoors?

>soap<      Any soaps or  detergents  (yesterday)?

>ocln<      Any other household cleaning agents such as Ajax or ammonia
            (yesterday)?

>aero<      Yesterday,   did  you  use  any  personal  care  aerosol  spray
            products  such  as  deodorants or  hair spray, or were you  in  a
            room while they  were being  used?

>shwr<      Did you take  a  hot shower yesterday?

>bath<      Did you take  a  hot bath  or  use  an  indoor hot tub yesterday?

>moth<      Are you currently  using  any of  the following in your home:
               Any mothballs,  moth crystals, or cakes?

>deod<      Any toilet bowl  deodorizers?

>rmfr<      Any SCENTED room fresheners?

      All these improvements in time reporting were made possible within
the context of a larger significant improvement in overall diary
methodology.  This involves the technology of CATI (Computer-Assisted
Telephone Interviewing), in particular the CATI system developed at the
University of California at Berkeley.  With its flexible capacity for
question branching and  the ease with which  it  handles  open-end  responses,
the Berkeley  CATI  system  is  ideally suited  for collecting diary data  such
as these.

     The  development  of  a  full  CATI  diary  thus  represents  a  major
breakthrough in diary data collection.   No longer is  it  necessary to  record
diary entries  in longhand for subsequent coding.  No  longer  is  it necessary
to go through  an  expensive and  time-consuming  coding  process,  particularly
the  tricky problem  of ensuring  that  all  time  periods  add  up  to 1,440
minutes.  With the parallel development  of computer programs  to convert the
variable-field diary entries to  fixed-field analytic format, tabulations
from  the diaries can  be  made  within  a day of the   time the  final study
interview is completed.  In our prior diary studies, it could take up to a
year to complete the diary coding and translate the data into fixed-field
format for conventional statistical  analysis.
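
     The two steps mentioned above, checking that a diary accounts for all
1,440 minutes and converting variable-length diary entries to a fixed-field
record, can be sketched as follows.  This is an illustrative sketch only;
the tuple layout, function names and activity codes are assumptions and do
not describe the actual CATI conversion programs.

```python
MINUTES_PER_DAY = 1440

def check_full_day(episodes):
    """Verify that diary episodes are contiguous and cover exactly 1,440 minutes.

    `episodes` is a list of (start_min, end_min, activity, location) tuples in
    diary order, with minutes counted from midnight.
    """
    expected_start = 0
    for start, end, _activity, _location in episodes:
        if start != expected_start:
            raise ValueError(f"gap or overlap at minute {expected_start}")
        if end <= start:
            raise ValueError(f"episode starting at {start} has no duration")
        expected_start = end
    if expected_start != MINUTES_PER_DAY:
        raise ValueError(f"diary ends at minute {expected_start}, not 1440")

def to_fixed_field(episodes, activity_codes):
    """Turn a variable number of episodes into one fixed-length record:
    total minutes per activity code, in the order given by `activity_codes`.
    An activity not listed in `activity_codes` raises KeyError."""
    check_full_day(episodes)
    totals = dict.fromkeys(activity_codes, 0)
    for start, end, activity, _location in episodes:
        totals[activity] += end - start
    return [totals[code] for code in activity_codes]

# Hypothetical one-day diary and activity codes.
episodes = [
    (0, 420, "sleep", "bedroom"),
    (420, 480, "meal", "kitchen"),
    (480, 540, "travel", "car"),
    (540, 1020, "work", "office"),
    (1020, 1080, "travel", "car"),
    (1080, 1440, "leisure", "living room"),
]
codes = ["sleep", "meal", "travel", "work", "leisure"]
record = to_fixed_field(episodes, codes)
print(dict(zip(codes, record)))
```

     Running the check at entry time, rather than during later coding, is
what allows the interviewer to resolve gaps while the respondent is still
on the telephone.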

     This is particularly true for the CATI location codes, which, as Table
2 shows, are (predominantly) already in closed-end categories.  That does
not, as yet, extend to the activity codes, which remain largely in open-end
text.  Nonetheless, we have been able to develop a preliminary set of 15
closed activity categories that encompass more than half of the activities
that are reported in diaries.  For air quality research purposes, where the
number of activities of interest is far smaller than the 270+ activity
codes that have been developed, it is a simple step to devise a core list of
40-50 activities  that  capture  the needed distinctions,  so that the task of
analyzing  this facet  of  time  use is as easy  as for the  location  data in
Table 2.
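
     One way such a core list might be applied is sketched below.  The
mapping and category names are hypothetical and are not the study's coding
scheme; the point of the sketch is that the detailed code is kept alongside
the core category so that finer distinctions are not lost.

```python
# Hypothetical detailed activity codes mapped to a hypothetical core list.
CORE_CATEGORY = {
    "jogging": "strenuous sport",
    "squash": "strenuous sport",
    "bowling": "light sport",
    "shuffleboard": "light sport",
    "house painting": "chemical exposure",
    "auto repair": "chemical exposure",
}

def collapse(detailed_code, default="other"):
    """Return (core_category, detailed_code), keeping the original code so
    that finer distinctions can still be recovered later."""
    return CORE_CATEGORY.get(detailed_code, default), detailed_code

for code in ("squash", "bowling", "reading"):
    print(collapse(code))
```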

      TABLE 2:   DISPLAY OF THREE ALTERNATIVE  CARB STUDY LOCATION CODES

A.    Where in your house were you?

      <1> Kitchen                    <7> Garage
      <2> Living rm, family rm, den  <8> Basement
      <3> Dining room                <9> Utility/Laundry rm.
      <4> Bathroom                  <10> Pool, Spa (outside)
      <5> Bedroom                   <11> Yard, Patio, other outside house
      <6> Study/office              <12> Moving from room to room in the
                                         house
          Other (SPECIFY)

B.    Where were you? (if not at home)

      <1> Office building, bank, post office
      <2> Industrial plant, factory
      <3> Grocery store  (convenience store to supermarket)
      <4> Shopping mall  or (non-grocery) store
      <5> School
      <6> Public bldg. (Library, museum, theater)
      <7> Hospital, health care facility, or Dr.'s office
      <8> Restaurant
      <9> Bar, nightclub
     <10> Church
     <11> Indoor gym, sports or health club
     <12> Other people's home
     <13> Auto repair shop, indoor parking garage, gas station
     <14> Park, playground, sports stadium (outdoor)
     <15> Hotel, motel
     <16> Dry cleaners
     <17> Beauty parlor; barber shop; hairdressers
     <18> At work: no specific main location; moving among locations
          Other indoors  (SPECIFY)
          Other outdoors (SPECIFY)

C.    How were you travelling?  Were you in a car, walking, in a truck, or
      something else?

      <1> Car                            <6> Train/rapid transit
      <2> Pick-up truck or van           <7> Other truck
      <3> Walking                        <8> Airplane
      <4> Bus/train/ride stop            <9> Bicycle
      <5> Bus                           <10> Motorcycle, scooter
          Other (SPECIFY)

     At the same time,  as a sociologist,  I think there are very grave risks
in such a step,  because one is  then  never able to recapture the specific
activity  that  interviewers  choose  to lump  with  other  activities or  the
behavioral context that  leads to specific activity choices.   Investigators
who  have only  a  "sports"  activity,  for  example,   combine  bowling  and
shuffleboard with strenuous exercises and sports such  as  squash  or  jogging.
Such  trade-offs  in  activity   detail  need  considerable  discussion  and
deliberation.

     Another significant feature of this CARB study  is  that, rather  than
picking  particular "typical"  seasons or  periods of  the  year,  the  data
collection period extends across all seasons of the year.  Interviews are
spread over mid-October to mid-December, January and February, April and
May, and June and July.  This removes a major stumbling block to
interpretation of
the diary data.   It is still somewhat short,  however,  of  the  effort we made
in our 1985  and  1987  national  data  collections to  spread interviews  evenly
across almost all days of the year.

     In order to reflect the greater variation of weekend day  activities
over weekdays,  we have  oversampled Saturdays and Sundays  in relation  to
other days of the week.   Where  possible, we  also  interview respondents  for
a designated day—that is, if they  are  not  available on Monday to  report
about Sunday, we will  ask them  on Tuesday about Sunday's activities.
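
     When weekend days are oversampled, estimates for an average day
require weights that restore each day of the week to an equal share.  The
following is a minimal sketch of one such weighting, with hypothetical
interview counts and diary values; it is an assumption for illustration,
not the weighting procedure used in the CARB study.

```python
def day_weights(interviews_per_day):
    """Weights that restore each day of the week to an equal share.

    `interviews_per_day` maps a day name to the number of completed diaries
    for that day. The weight for a day is its target count (total divided by
    the number of days) over its achieved count, so oversampled days get
    weights below 1.
    """
    total = sum(interviews_per_day.values())
    n_days = len(interviews_per_day)
    return {day: (total / n_days) / count
            for day, count in interviews_per_day.items()}

def weighted_mean(records, weights):
    """Weighted mean of (day, minutes) records."""
    num = sum(weights[day] * minutes for day, minutes in records)
    den = sum(weights[day] for day, _ in records)
    return num / den

# Hypothetical counts: Saturdays and Sundays sampled at twice the weekday rate.
counts = {"Mon": 100, "Tue": 100, "Wed": 100, "Thu": 100, "Fri": 100,
          "Sat": 200, "Sun": 200}
weights = day_weights(counts)

# Hypothetical minutes outdoors from three diaries.
records = [("Mon", 30), ("Sat", 120), ("Sat", 100)]
print(round(weighted_mean(records, weights), 1))
```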

     There is another  feature  of this  study,  however,  that may  be of more
interest  to environmental  researchers.    That   is  the  inclusion  of  a
Scientific Advisory Panel.  In  addition to  the  several  staff  scientists  who
have closely reviewed  and  commented on  the CARB questionnaire,  we  have
solicited  and  followed  the advice  of an  appointed  panel  of outstanding
researchers in the field across the country.  The panel, which includes
scientists with expertise in the fields of engineering, economics,
statistics and public health, is drawn from academia, government and
private firms.

     The panel's comments have  considerably  influenced the direction  of the
study and specific source questions  we have  chosen  to  examine in  the  study.
However, several excellent recommendations could not be followed and must
await future research opportunities.  At the same time, there is no
logistical  reason  to  prevent their being incorporated into the current  CATI
instrumentation.

     A final important feature of this study has been the inclusion of
randomly selected children aged 12-18 in the household.  What is
currently under consideration is an additional study to include, for the
first time, diary data on all children under the age of 12.  Many of these
data would need to be reported by adults rather than by the children
themselves; however, pilot testing indicates that parents keep fairly close
watch on what their children are doing, or at least where they are.

SOME RECOMMENDATIONS  FOR FUTURE RESEARCH

     These developments  in the  CARB study give great promise for a fully
integrated  study, one  that would  not depend on  secondary analysis  of
sociological data but would be specifically designed for air quality
estimation purposes.  The CATI format can easily be adapted for a national
sample,  and  can  be easily reprogrammed  to include exposure  questions  on
several alternative  sources of  air  pollution.   The use  of  split  samples,
with questions on some  pollutants asked  of some respondents  and questions
on different  pollutants posed to other respondents,  can again be  easily
accomplished with  this CATI instrumentation.
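
     A split-sample design of this kind amounts to random assignment of
respondents to question modules.  The sketch below shows one simple way to
do this; the module names, respondent ids and function name are
hypothetical, and the sketch does not describe how the Berkeley CATI
system itself performs the assignment.

```python
import random

def assign_modules(respondent_ids, modules, seed=0):
    """Randomly assign each respondent to one question module, keeping the
    module sizes as equal as possible."""
    rng = random.Random(seed)
    ids = list(respondent_ids)
    rng.shuffle(ids)
    return {rid: modules[i % len(modules)] for i, rid in enumerate(ids)}

# Hypothetical pollutant question modules and respondent ids.
modules = ["solvents and paints", "combustion sources"]
assignment = assign_modules(range(1, 11), modules, seed=1989)
for rid in sorted(assignment):
    print(rid, assignment[rid])
```

     Because every respondent still completes the full diary, the diary
estimates are unaffected; only the follow-up exposure questions differ
between the split halves.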

     However,  all   of  this  needs  to  be  accomplished  within  a  more
comprehensive model  of  air pollution  exposure and validated using basic
observational  data on the actual  circumstances  of exposure across  the  day.
To reduce field costs, these observational studies need to be conducted in
a limited number of randomly selected sites across the country.  These
observational  studies will  require expensive personal  monitoring equipment
to  validate  the   impressions  obtained  from  the telephone  instrument
approach.    To  the  extent possible,   technical   observers  could  follow
respondents  on  their round  of daily  activities  to  isolate actual  high
incidence  of exposure  for  as many   sources  of  risk   as  present  field
instrumentation would allow.

     In  that  way,   we  could  achieve the single  objective of  the  two
methods—the accuracy and internal  validity  of personal  exposure  monitors
and  the  external  validity of  the diary  method.    We would have a fully
integrated set of estimates to generalize  to  the  national  population  with
considerably more  accuracy than we have now.  With these norms in hand, the
next step of identifying respondents at high risk would then  be possible.

      The  work  described  in  this  paper  was  not  funded  by  the  U.S.
Environmental Protection  Agency  and  therefore  the  contents  do  not
necessarily reflect  the views of the  Agency and no  official endorsement
should be inferred.

                                REFERENCES

Chapin, S. (1974).  Human Activity Patterns in the City:  Things People Do
      in Time and in Space.  John Wiley, New York.

Hartwell, T. (1984).  Study of carbon monoxide exposure of residents of
      Washington, D.C., and Denver, Colorado.  EPA-600/S4-84-031,
      PB84-182516.  Environmental Monitoring Systems Laboratory, U.S.
      Environmental Protection Agency, Research Triangle Park, North
      Carolina.

Johnson, T. L. (1985).  A study of personal exposure to carbon monoxide in
      Denver, Colorado.  EPA-6004-84-015, OB-840146-12.  Environmental
      Protection Agency, Research Triangle Park, North Carolina.

Juster, F. T. (1985).  The validity and quality of time use estimates
      obtained by recall diaries.  In Time, Goods, and Well-Being (ed.)
      F. T. Juster and F. P. Stafford.  Institute for Social Research, The
      University of Michigan, Ann Arbor.

Robinson, J. (1977).  How Americans Use Time:  A Social Psychological
      Analysis.  (Further analyses were published in How Americans Used
      Time in 1965-66.  Monograph Series, University Microfilms, Ann
      Arbor.)

Robinson, J. (1976).  Changes in Americans' Use of Time:  1965-1975.
      Cleveland State University, Communication Research Center, Cleveland,
      Ohio.

Robinson, J. (1985).  Testing the validity and reliability of diaries
      versus alternative time use measures.  In Time, Goods, and Well-Being
      (ed.) F. T. Juster and F. P. Stafford.  Institute for Social
      Research, The University of Michigan, Ann Arbor.

Robinson, J. (1983).  Environmental differences in how Americans use time:
      the case for subjective and objective indicators.  Journal of
      Community Psychology, Vol. 11, pp. 171-189.

Szalai, A. et al. (1972).  The Use of Time.  The Hague:  Mouton.

          HUMAN ACTIVITY PATTERNS:  A  REVIEW OF THE LITERATURE FOR
          ESTIMATING TIME SPENT INDOORS, OUTDOORS, AND IN TRANSIT

          by:   Wayne R.  Ott
               Chief,  Air, Toxics,  and Radiation Staff
               Office of Research and Development (RD-680)
               U.S.  Environmental Protection Agency
               Washington, D.C.  20460
                                 ABSTRACT

      This paper reviews field surveys  of human  activity  patterns and  "time
budgets"  in  the U.S. and  other countries published  in  the sociological,
transportation,  and  environmental literature.   This review emphasizes the
use of these  activity data for assessing  human exposure to environmental
chemicals.  Although many previous activity pattern  field  surveys have  been
conducted in  fields  outside  the environmental  sciences,  few of these  have
collected the kind  of data needed to  construct human  exposure models for
environmental  exposure  assessment.    Using  previous studies as  a   data
source,  this  paper estimates  approximately the  times people spend in  three
general   categories  of microenvironments:   indoors,  outdoors,   and in
transit.   From  U.S.  nationwide activity pattern surveys, employed  persons
spend the following proportion of their  time  in  three  categories:   indoors
(home, work,  or other locations),  92%;  in-transit,  6%;  and outdoors, 2%.
By comparison,  U.S.  housewives spend  the  following proportion of time in
three categories:   indoors (home or other locations), 94.1%;  in-transit,
4.2%; and outdoors,  1.7%.   The existing activity pattern  literature  reveals
that man  is  primarily an indoor creature.

      This  paper   has   been  reviewed   in   accordance  with  the  U.S.
Environmental  Protection Agency's peer and administrative review policies
and approved for presentation  and publication.

                               INTRODUCTION

      Information on the time people spend in certain microenvironments  is
important for  estimating population exposures to  air  pollution.  A large
number  and  variety  of  studies  in which  data on  human  activities were
collected from real  population samples  have been completed  over  the last  60
years (1-38) (Table 1).  These studies may be useful in determining the
amount of time that individuals spend in particular locations throughout
the day.  Such data are required in the human exposure-activity pattern
models that have been developed in the 1980s (39-44).  To calculate a
person's 24-hour exposure, such a model requires information on the
person's activities and
locations visited throughout  the  day.   Such information,  if obtained  from a
diary,  can  be coded to become an  "activity profile"  on the computer,
providing a record  of each person's activities by time  of day.  Then the
computer  uses  this  record  to generate  a  corresponding  estimate  of  a
person's exposure by time of day,  thus producing an "exposure profile."  A
number  of exposure  profiles can  be generated,  and  the  highest   hourly
exposure (or 8-hour exposure) for  each person can be obtained,  producing a
frequency distribution of exposures for  the  entire population.   The total
human exposure field studies conducted in the 1980s using personal
monitoring (31-34, 45-46) all share a common finding:  the activities of a
person
are  the  most  important  determinant  of  the  person's  exposure  to
environmental  pollution.   Thus,  activity  pattern data and exposure modeling
are essential  ingredients of  risk assessments.
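
      The chain described above, from activity profile to exposure profile
to a frequency distribution of peak exposures, can be sketched in a few
lines.  This is a simplified illustration under stated assumptions: one
fixed concentration per microenvironment, hourly time steps, and invented
concentrations and activity profiles.  The published models cited above
are more elaborate, and the function names here are hypothetical.

```python
def exposure_profile(activity_profile, concentration):
    """Hourly exposure profile: the concentration assigned to the
    microenvironment occupied in each of the 24 hours."""
    if len(activity_profile) != 24:
        raise ValueError("activity profile must list 24 hourly locations")
    return [concentration[loc] for loc in activity_profile]

def max_running_mean(profile, window):
    """Highest average over any `window` consecutive hours of the profile."""
    return max(sum(profile[i:i + window]) / window
               for i in range(len(profile) - window + 1))

def frequency_distribution(values, bin_width):
    """Count values falling in bins [k*bin_width, (k+1)*bin_width)."""
    counts = {}
    for v in values:
        lower = int(v // bin_width) * bin_width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical concentrations (arbitrary units) for three microenvironments.
concentration = {"indoors": 1.0, "in transit": 8.0, "outdoors": 2.0}

# Hypothetical 24-hour activity profiles for two persons.
people = {
    "A": ["indoors"] * 7 + ["in transit"] + ["indoors"] * 9
         + ["in transit"] + ["indoors"] * 6,
    "B": ["indoors"] * 12 + ["outdoors"] * 2 + ["indoors"] * 10,
}

profiles = {p: exposure_profile(a, concentration) for p, a in people.items()}
peak_1h = {p: max_running_mean(x, 1) for p, x in profiles.items()}
peak_8h = {p: max_running_mean(x, 8) for p, x in profiles.items()}
print(peak_1h, peak_8h)
print(frequency_distribution(peak_1h.values(), bin_width=2.0))
```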

                                                             Table 1

                                          STUDIES OF HUMAN ACTIVITIES AND TIME BUDGETS

REFERENCE          LOCATION        DATE           SURVEY SAMPLE      APPROACH                    COMMENTS

Lundberg,          Westchester     Nov - May      2,460 persons,     3-day diaries, with a few   Part of an in-depth study of life
Komarovsky, and    County, NY      1931 - 1932    including          7-day diaries covering a    values and leisure in Westchester
McInerny (1)                                      school children    total of 4,460 days         County

Sorokin and        Boston, MA      May - Nov      Approximately      Diaries over a 4-week       A total of 3,472 forms was
Berger (2)                         1935           100 persons aged   period, with 1 form per     generated, giving considerable
                                                  17 or older        day                         detail on activities

de Grazia (3)      U.S. national   Mar - Apr      Probability        2-day diaries, with         Analysis focused on the use of
                   sample          1954           sample of 7,000    15-minute time periods      leisure time (TV, radio) as a
                                                  households         from 6 a.m. to 11 p.m.      function of age

Chapin and         Durham, NC      Oct - Nov      Small              20-minute questionnaire,    Limited experimental survey for
Hightower (4)                      1963           nonsystematic      plus 20-minute game         generating hypothesis about
                                                  sample of adults                               discretionary time

Szalai (5-6)       12 Countries    1965 - 1966    24,392 persons     1-day diary, with           Standardized sampling,
                                                  in 13 surveys      before-and-after visit to   interviewing, coding, and analysis
                                                  in 12 countries    residence by interviewer    to compare activities in different
                                                                                                 countries

Robinson (7-8)    U.S. national
                  sample and
                  Jackson, Mich
                  1965 - 1966
          1.244 adults in
          national sanple
          and 788 adults
          in Jackson. Mich,
          sample
                 1-day diary, with
                 before-and-after visit to
                 residence by interviewer
                    Included both primary and secondary
                    activities as well as location and
                    social corpany
Chapin. and        U.S. national     1966          1,467             Interview about
Brail (9)         sample                          households in     activities on the
                                                  43 SMSAs and 48   previous day
                                                  states
                                                                              Part of a study of residential
                                                                              preferences.  Employed 13-class
                                                                              activity classification  system
       and
Chapin (10)
Washington. DC    Spring 1968
          1.667
          respondents
                 1-hour interview about
                 activities on the
                 previous weekday and
                 weekend
                    Differentiated between
                    discretionary and obiigatory
                    activities
Brail and         U.S. national     1969          1,199             Repeat survey on the 1966
Chapin (11)       sample                          households        population saiple (see
                                                                    Chapin and Brail)
                                                                              Little difference observed  in
                                                                              activity patterns between the  1966
                                                                              and 1969 surveys

-------
                              Table 1 (cont)

               STUDIES OF HUMAN ACTIVITIES AND TIME BUDGETS

U.S. Dept of Transportation and Bureau of Census (12-22)
   Location:       U.S. national sample
   Date:           1969 - 1970
   Survey sample:  6,000 households
   Approach:       Home-based interview
   Comments:       Detailed interview conducted by Census Bureau with
                   emphasis on travel activities

Flachsbart, Baer, and Schalman (23)
   Location:       Los Angeles County, CA
   Date:           Mar - Jun 1972
   Survey sample:  244 persons aged 16 or older, non-representative sample
   Approach:       Interview used cards to describe frequency and duration
                   of 80 activities in the past year
   Comments:       Small-scale study to examine the hypothesis that the
                   residential environment facilitates or hinders certain
                   activities

Michelson and Reed (24)
   Location:       Toronto, Canada
   Date:           1973 -
   Survey sample:  591 families of child-bearing years who moved
   Approach:       1 interview before the move and 2 afterward
   Comments:       Designed to assess changes in activities as a result of
                   the move

Bullock, Dickens, Shapcott, and Steadman (25)
   Location:       Reading, England
   Date:           Mar 1973
   Survey sample:  806 persons aged 16-70 provided 450 usable diaries
   Approach:       7-day diaries with assistance provided on the first day
   Comments:       Data collected to validate a model developed by
                   Tomlinson, Bullock, et al.

British Broadcasting Corp (26)
   Location:       England, Scotland, and Wales
   Date:           1974 - 1975
   Survey sample:  1,822 persons in winter and 1,723 persons in summer
   Approach:       7-day diary
   Comments:       Diary format divided into 42 half-hour time bands
                   (emphasized TV viewing)

Robinson (27)
   Location:       U.S. national sample
   Date:           Oct - Dec 1975
   Survey sample:  1,519 persons aged 18 or older
   Approach:       Interview about activities on the previous day
   Comments:       Sample was more completely representative of the U.S.
                   population than the 1965-1966 study by Robinson

U.S. Dept of Transportation and Bureau of Census (28)
   Location:       U.S. national sample
   Date:           1978
   Survey sample:  Approximately 20,000 households
   Approach:       Home-based interview
   Comments:       Expansion and update of initial 1969-1970 Nationwide
                   Personal Transportation Study (NPTS)

Juster et al. (29)
   Location:       U.S. national sample
   Date:           Feb - Dec 1981
   Survey sample:  620 adults, 492 children aged 3-7
   Approach:       Personal interviews and telephone interviews
   Comments:       Follow-up study of households from the 1975 time use
                   study

                                (continued)

-------
                              Table 1 (cont)

               STUDIES OF HUMAN ACTIVITIES AND TIME BUDGETS

Letz and Soczek (30)
   Location:       Kingston-Harriman, TN
   Date:           Jun - Sep 1981
   Survey sample:  322 persons, including some children
   Approach:       Telephone interviews using a "yesterday recall" diary
   Comments:       Microenvironments were coded by Home, Work, Travel for
                   exposure modeling purposes

Johnson (31-32)
   Location:       Denver, CO
   Date:           Dec - Feb 1982 - 1983
   Survey sample:  Probability sample of 452 persons, 2 days each
   Approach:       2-day diary with personal monitor
   Comments:       Activity changes recorded as part of U.S. EPA CO
                   personal exposure field study

Hartwell et al. (33-34)
   Location:       Washington, DC
   Date:           Dec - Feb 1982 - 1983
   Survey sample:  Probability sample of 712 persons
   Approach:       1-day diary with personal monitor
   Comments:       Activity changes recorded as part of U.S. EPA CO
                   personal exposure field study

Klinger and Kuzmyak (35)
   Location:       U.S. national sample
   Date:           1983 - 1984
   Survey sample:  Probability sample of 6,438 households in 50 states
                   and DC
   Approach:       Home-based interview
   Comments:       Follow-up to NPTS surveys conducted in 1969-1970 and
                   1977-1978 (extensive information on personal travel
                   habits)

Johnson (36)
   Location:       Cincinnati, OH, metropolitan area
   Date:           Mar & Aug 1985
   Survey sample:  Probability sample of 973 persons
   Approach:       Activity diaries for 3 consecutive 24-hour periods plus
                   questionnaires
   Comments:       Approximately 2,000 subject-days for use with air
                   pollution activity pattern-exposure models

Robinson and Holland (37)
   Location:       U.S. national sample
   Date:           Jan 1985 - Jan 1986
   Survey sample:  4900 adults aged over 18, 300 adolescents over 12
   Approach:       1-day "yesterday recall" diary; 1-day prospective diary
                   by mail
   Comments:       Included U.S. EPA questions on dwelling unit
                   characteristics

Wiley and Robinson (38)
   Location:       State of California
   Date:           Oct 1987 - Jul 1988
   Survey sample:  180 persons aged over 12
   Approach:       1-day "yesterday recall" diary on telephone
   Comments:       Very complete location codes.  Emphasized air pollution
                   sources (smoking, hot showers, gasoline engines,
                   solvents, air fresheners)

-------
                           TIME BUDGET STUDIES

      When one examines the literature  on human  activities,  the term "time
budget" ("zeitbudget,"  "budget de temps")  frequently  is encountered.   A
time budget,  which is  conceptually  similar to  a  person's money  budget,
summarizes  the  amount  of  time  an  individual  spends  in each  of  many
activities over some time period (a day or a week).   As  Michelson noted,47 a
time  budget  contains  considerable  detail  on a person's   activities,
including  the  locations  in which the activities  take place:

      A time budget is  a record,  presented orally  or on paper, of  what a
      person has done  during  the  course of  a  stated  period of time.   It
      usually covers a  24-hour day or multiples thereof.   The record  is
      taken down with  precision and  detail, identifying what  people  have
      done with explicit reference to exact amounts  of time.  It is usually
      presented chronologically through  the day, beginning with  the  time
      that a person gets up in  the morning.

      The  information  that  is  normally gathered  in  a time  budget  consists
      of the time  an activity  began,  the  time  it ended,  the nature  of the
      activity per  se,  the  persons who were present and active  in the given
      activities,  and,  not least, the  exact location where the  activity
      took place.

      One  way  of obtaining time  budget  information  from  the  population
surveyed  is by  having  each  respondent  maintain a diary  over a  24-hour
period or longer.   In  another approach, the so-called  "yesterday"  survey
approach,  the interviewer asks each respondent  about his or her activities
on the  "day before."    Once the diaries or questionnaires  are completed,
they  are  collected,   and  the activities  are  coded   according  to  some
systematic  procedure.   Then  the  results are  tabulated,  and  the  average
number  of hours devoted to  each  activity,  or  class  of activities,  is
summarized for the  population sample as a whole, or  for  various  subgroups.
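
      The tabulation step just described can be sketched in a few lines.  This is a minimal illustration under assumed data: the diary entries, the activity labels, and the function name `average_hours` are hypothetical and do not reproduce the coding scheme of any particular study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# One coded diary entry: (respondent_id, activity, start_hour, end_hour).
DiaryEntry = Tuple[str, str, float, float]


def average_hours(entries: List[DiaryEntry]) -> Dict[str, float]:
    """Average hours per day devoted to each activity across respondents."""
    totals: Dict[str, float] = defaultdict(float)
    respondents = set()
    for respondent, activity, start, end in entries:
        totals[activity] += end - start
        respondents.add(respondent)
    n = len(respondents)
    # Respondents who never report an activity contribute zero hours to it.
    return {activity: hours / n for activity, hours in totals.items()}


# Two hypothetical respondents, each accounting for a full 24-hour day.
diaries = [
    ("A", "sleep", 0.0, 7.0),
    ("A", "travel", 7.0, 8.0),
    ("A", "work", 8.0, 17.0),
    ("A", "leisure", 17.0, 23.0),
    ("A", "sleep", 23.0, 24.0),
    ("B", "sleep", 0.0, 8.0),
    ("B", "housework", 8.0, 12.0),
    ("B", "leisure", 12.0, 22.0),
    ("B", "sleep", 22.0, 24.0),
]

budget = average_hours(diaries)
```

      Because every respondent's entries sum to 24 hours, the averages in `budget` also sum to 24 hours, which is a convenient consistency check on the coding.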

      Several   summaries of  the  historical development  of  time  budget
research appear in  the  literature,  including  the  literature review prepared
by Ottensmann48, the  article by Szalai49,  the paper by Converse50,  and the
book by Chapin51.  The  present paper discusses  the  literature  on  activity
patterns primarily  in the context of estimating  air  pollution exposure.

EARLY ACTIVITY PATTERN  STUDIES

      Perhaps the  original interest  in time budget research evolved  from
the industrial revolution and  the  importance  it attached to time:  "Time is
money."   Probably  the  first  studies of  human  activities were  time-and-
motion studies of  industrial  workers  in the early part  of the  century.   It
appears that  time  budget research of industrial workers in Moscow  in the
1920's  represents the first attempt  to  collect data  on   individual
activities over a 24-hour period48.  Also in the 1920's, the Bureau of Home
Economics  of the U.S.  Department of Agriculture conducted research into how
farm women used time49.


-------
       In  the 1930's,  several  important time  budget  studies were  carried
 out,  including  an  in-depth  study by Lundberg,  Komarovsky,  and McInerny1  of
 the  leisure  time activities of residents of Westchester County,  New  York.
 As  part of  a  study of the residents'  participation  in  clubs,  churches,
 schools,   the  arts,  and other leisure  activities,   these  investigators
 collected  diaries  that  ranged  from  3  to 7  days on  2,460 persons (Table  1).
 Later in  the decade, Sorokin and Berger2 distributed 1-day  diary forms  to
 approximately 100  residents in Boston over a 4-week  period,  generating
 3,472 daily  report  forms.   Sorokin and  Berger's work is interesting because
 it  attempts  to examine the motivation  for activities,  social  context,  and
 the  predictability of behavior.  Despite  the  nonrepresentativeness of  the
 sample and  a poor response  rate,   the book  describing  their study  is
 considered a  landmark in  time budget  research.   According  to Szalai49,
 "This book probably did more than any other  at that time  to popularize  the
 time  budget  process as a sociological method of investigation."   Also,  in
 the  1930's,  McCormick52 suggested that time budgets ultimately could be used
 to  compare cultures.   As  Ottensmann noted48,  it took more  than 30 years  for
 McCormick's  suggestion to be carried out.

       Szalai49 points  to  the large  number of time budget research studies
 undertaken in other  countries, particularly  in France,  beginning in  the
 1940's.  In addition,  time budget  studies have been undertaken in Japan,
 other  Western  European  countries,  the  Soviet  Union,  and  the  eastern
 European   socialist countries.    In  Hungary,  for  example,  the  Central
 Statistical  Office conducted  a  time budget study of a national  (2%) sample
 of  12,000 persons  as part of its "micro-census" in  1963.49   This apparently
 was  the first time budget  study that was  part of an official  census.   In
 the  1950's,  several studies were completed  in  Britain  that  focused on  the
 time  spent  by  housewives  in  various activities  throughout  the day.48
 Walker53 continued the studies of housewives  in the United States, reporting
 that,  despite  increased use  of labor-saving devices,  "homemaking  still
 takes  time."    In addition  to the  interest researchers   and  government
 agencies  expressed  in  changes in leisure time  activities   as  a  result  of
 affluence,  technological  innovation,  and  reductions  in  the  number of  hours
 worked per week,  another user  of time  budget information  appeared in  the
 1950's.  The broadcasting  industry  saw  a need  for information on the amount
 of  time that people spend  in various activities in order to determine  the
 audience   available  throughout  the  day  for  its  radio  and  television
 programs.    de Grazia3  discusses  the results  of  a  time budget  study
 commissioned by the Mutual   Broadcasting Company in 1954  of the activities
 of  a U.S.  national  probability  sample of 7,000 families.

 MULTINATIONAL  COMPARATIVE TIME  BUDGET RESEARCH PROJECT

       A time budget study  that was  impressive both for its scale and  its
 importance for international  cooperation  was the Multinational Comparative
 Time Budget  Research  Project5.  This  project, launched in September 1964 by
 a  small   international  group  of  social  scientists,  employed common
 principles for  sampling,  interviewing, coding,  and  tabulating  the  data.
 The population sample  consisted of nearly 25,000 persons  in  12 countries
 (Belgium, Bulgaria,  Czechoslovakia,  France,  East  Germany,  West  Germany,
 Hungary,  Peru,  Poland,  Union  of Soviet  Socialist  Republics, United States,
 and Yugoslavia).

      The data were collected  by asking each individual  in the sample to
record his or her  primary  activities for  a "complete day," referred  to as
"day n."   Information  also  was  requested on  any parallel activities carried
out at the same  time,  the  location  where  each activity was performed,  and
the persons  in  whose  company  it was  performed.    Once a respondent  was
selected, the interviewer provided a self-recording form on "day n -  1" to
be used on "day  n."  The investigator returned on  "day n + 1" and, by means
of an  interview, checked,  corrected, and  completed the form  filled out by
the subject.   Thus the investigator obtained a "recapitulatory"  interview
in addition to  the  raw diary  data.   If  for  any reason the form had not been
filled out, the  investigator on  "day n +  1" interviewed the  subject  about
"day  n"  activities, thus  obtaining a  "spontaneous interview."   In  each
case,   a  supplementary  questionnaire at the  close  of the visit  was used to
obtain  information  for each  respondent  on  personal  and  demographic
characteristics5.

      The  multinational  study  developed an  activity coding system
consisting of 100  categories of activities  represented by a  2-digit  code
(from 00 to 99).   The  activities represented by these codes can be grouped
into  10  larger  classes:  (1)  working  time and  related  activities,   (2)
domestic  housework,  (3) care  of children, (4)  purchasing  of  goods  and
services,  (5) private  needs  such as meals  and sleep,  (6)  adult education
and  professional  training,   (7)  civic   and  collective participation
activities, (8)  spectacles, entertainment,  and social life, (9) sports and
active leisure,  and (10)  passive  leisure.6  An  example of the  activities
within a  particular  class, domestic housework, shows  that the activities
range from preparation and  cooking of food to gardening and taking care of
animals (Table  2)5.
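
      A 2-digit code of this kind lends itself to a simple lookup.  The sketch below assumes that each of the 10 classes occupies one block of ten consecutive codes, so that the tens digit identifies the class.  That assumption is consistent with Table 2 (codes 10 to 19 are domestic housework), but the text does not state the mapping for the other classes, and the function name `activity_class` is hypothetical.

```python
# The ten activity classes, in the order listed in the text.
ACTIVITY_CLASSES = [
    "working time and related activities",
    "domestic housework",
    "care of children",
    "purchasing of goods and services",
    "private needs such as meals and sleep",
    "adult education and professional training",
    "civic and collective participation activities",
    "spectacles, entertainment, and social life",
    "sports and active leisure",
    "passive leisure",
]


def activity_class(code: int) -> str:
    """Map a 2-digit activity code (0-99) to its class.

    Assumption: class k covers codes 10*k through 10*k + 9.
    """
    if not 0 <= code <= 99:
        raise ValueError(f"activity code out of range: {code}")
    return ACTIVITY_CLASSES[code // 10]


# Code 17 is "Gardening, taking care of animals" in Table 2.
gardening_class = activity_class(17)
```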

-------
      Table 2.  Activities in the Domestic Housework Category of the
                     Multinational Time Budget Project5

    Code    Activity
     10     Preparation and cooking of food
     11     Washing and putting away utensils
     12     Interior cleaning (sweeping, washing, making beds)
     13     Exterior cleaning (pavement or sidewalk)
     14     Washing and ironing linen
     15     Repair and maintenance of clothes, linen, shoes
     16     Miscellaneous repair and maintenance work
     17     Gardening, taking care of animals (not for profit)
     18     Maintenance and provisioning for heating and water
     19     Miscellaneous (adding up accounts, tidying up papers, usual
            attentions paid to members of the household)

      The Multinational  Comparative  Time  Budget Research Project  yielded  a
rich data  base that  is summarized  in a number  of tables,  figures,  and
articles by various participants in Szalai's book6.   For example,  data from
the  appendix of  the  book show  the average  time employed  men,   employed
women,  and  married housewives spend in  various locations in  12  countries
(Table  3).   Employed men  in  the 12 countries  spend  between  12  hours  (in
Hungary) and 15.2  hours  (in Belgium) inside   their  homes,  while housewives
spend between 19.6 hours (in USSR) and 21.7 hours (in  France)  inside their
homes6.   For the  12 countries, employed  men average between  50% and  63% of
the day inside their homes, compared  to between  82% and  90% for housewives.

      It is difficult to determine the  overall  amount  of time spent indoors
from these  data,  because  categories such as  "at one's workplace"  do  not
distinguish  between  indoor  and  outdoor  workplaces.    Similarly,   the
categories  "in place  of  business"  and   "in  all  other  locations"  do  not
specify whether they  are indoors or outdoors.   However, I  have  estimated
the amount of time  respondents spent in  three  general  categories  (indoors,
outdoors,  and in  transit),  by  assuming  that:

      •     The categories  "inside one's  home",  "at  one's work  place",   "in
            other  people's homes",   "in  places of   business",   and  "in
            restaurants and bars" are entirely  indoors.

      •     The categories "just  outside one's  home" and  "in all  other
            locations"  are  entirely outdoors.


-------
                    Table 3.  Time Spent in Various Locations in 12 Countriesa
                          (Average Hours per Day, All Days of the Week)

Location                      01    02    03    04    05    06    07    08    09    10    11    12    13    14    15

EMPLOYED MEN, ALL DAYS
inside one's home           15.2  12.5  14.3  13.6  13.6  14.2  13.8  12.0  12.9  14.0  13.4  13.6  13.4  12.9  13.0
just outside one's home      0.5   0.7   0.3   0.3   1.0   0.5   0.4   1.0   0.1   0.2   0.2   0.3   0.3   0.5   1.4
at one's work place          5.0   7.7   5.9   7.2   5.4   5.1   6.8   7.5   6.4   7.0   6.7   6.5   6.8   7.1   6.1
in transit                   1.5   2.1   1.6   1.5   1.7   2.2   1.7   2.0   2.5   1.7   1.6   1.5   2.0   1.8   2.2
in other people's home       0.5   0.2   0.3   0.5   0.5   0.6   0.3   0.3   0.5   0.5   0.5   0.6   0.2   0.7   0.5
in places of business        0.7   0.6   0.6   0.5   0.4   0.4   0.6   0.4   0.7   0.4   0.7   0.7   0.4   0.5   0.5
in restaurants and bars      0.2   0.0   0.1   0.2   0.5   0.4   0.1   0.2   0.3   0.0   0.4   0.4   0.2   0.2   0.0
in all other locations       0.4   0.2   0.9   0.2   0.9   0.6   0.3   0.6   0.6   0.2   0.5   0.4   0.7   0.3   0.3
total                       24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0

EMPLOYED WOMEN, ALL DAYS
inside one's home           17.1  14.6  16.0  15.3  17.0  16.7  16.7  14.5  16.1  15.0  15.4  15.3  14.0  15.0  15.0
just outside one's home      0.1   0.3   0.2   0.0   0.7   0.2   0.2   0.3   0.4   0.1   0.0   0.1   0.1   0.3   0.4
at one's work place          3.6   6.5   5.1   6.3   3.6   3.6   4.9   6.8   4.4   5.8   5.2   5.0   6.7   6.1   6.4
in transit                   1.2   1.6   1.3   1.1   1.1   1.3   1.1   1.4   1.8   1.5   1.3   1.3   1.7   1.4   1.5
in other people's home       0.4   0.2   0.2   0.5   0.4   0.9   0.2   0.3   0.3   0.6   0.7   0.6   0.2   0.6   0.2
in places of business        1.0   0.6   0.8   0.6   0.6   0.8   0.7   0.5   0.7   0.8   0.9   1.1   0.6   0.4   0.4
in restaurants and bars      0.2   0.0   0.0   0.1   0.2   0.3   0.1   0.0   0.1   0.0   0.2   0.2   0.2   0.0   0.0
in all other locations       0.4   0.2   0.4   0.1   0.4   0.2   0.1   0.2   0.2   0.2   0.3   0.4   0.5   0.2   0.1
total                       24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0

HOUSEWIVES, ALL DAYS (Married Only)
inside one's home           21.6  20.4  20.9  21.7  20.4  20.5  21.3  19.7  21.0  20.9  20.5  20.9  19.6  20.5  19.7
just outside one's home      0.2   1.4   0.3   0.1   0.8   0.4   0.3   2.1   0.5   0.1   0.1   0.1   0.4   0.8   2.3
in transit                   1.0   0.9   1.2   1.0   1.0   1.0   1.0   0.9   1.2   1.2   1.0   0.9   1.9   1.5   1.1
in other people's home       0.4   0.4   0.3   0.5   0.6   0.6   0.3   0.2   0.4   0.5   0.8   0.7   0.7   0.7   0.3
in places of business        0.5   0.7   1.1   0.6   0.7   1.1   0.9   0.9   0.7   1.2   1.2   1.1   1.1   0.4   0.5
in restaurants and bars      0.1   0.1   0.0   0.0   0.1   0.1   0.0   0.0   0.0   0.0   0.1   0.1   0.0   0.0   0.0
in all other locations       0.2   0.1   0.2   0.1   0.4   0.3   0.2   0.2   0.2   0.1   0.3   0.2   0.3   0.1   0.1
total                       24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0  24.0

Cities:   01 Belgium                         06 Osnabruck, West Germany       11 Forty-four cities, USA
          02 Kazanlik, Bulgaria              07 Hoyerswerda, East Germany     12 Jackson, Michigan, USA
          03 Olomouc, Czechoslovakia         08 Gyor, Hungary                 13 Pskov, USSR
          04 Six cities, France              09 Lima-Callao, Peru             14 Kragujevac, Yugoslavia
          05 100 districts, West Germany     10 Torun, Poland                 15 Maribor, Yugoslavia

a In this table from Section VII.3, "Distribution of Daily Time According to Different Locations," Tables 7-1.1 to
  7-1.3, p. 795, Szalai6, data were weighted to ensure equality of days of the week and numbers of eligible
  respondents per household.

-------
      •     The category "in transit"  is  neither  indoors nor outdoors.
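
      The three assumptions listed above amount to a fixed mapping from Szalai's location categories to three environmental categories.  A minimal sketch of that restructuring follows, applied to the Table 3 values for employed men in the 44 U.S. cities (column 11).  The dictionary and function names are hypothetical; the hours are those printed in Table 3.

```python
from typing import Dict

# Mapping of Szalai's location categories to three environmental
# categories, following the assumptions stated in the text.
ENVIRONMENT: Dict[str, str] = {
    "inside one's home": "indoors",
    "at one's work place": "indoors",
    "in other people's homes": "indoors",
    "in places of business": "indoors",
    "in restaurants and bars": "indoors",
    "just outside one's home": "outdoors",
    "in all other locations": "outdoors",
    "in transit": "in transit",
}


def restructure(hours_by_location: Dict[str, float]) -> Dict[str, float]:
    """Sum average hours per day into indoors, outdoors, and in transit."""
    totals = {"indoors": 0.0, "outdoors": 0.0, "in transit": 0.0}
    for location, hours in hours_by_location.items():
        totals[ENVIRONMENT[location]] += hours
    # Round to one decimal place, the precision of the source table.
    return {name: round(value, 1) for name, value in totals.items()}


# Employed men, 44 U.S. cities (Table 3, column 11).
us_employed_men = {
    "inside one's home": 13.4,
    "just outside one's home": 0.2,
    "at one's work place": 6.7,
    "in transit": 1.6,
    "in other people's homes": 0.5,
    "in places of business": 0.7,
    "in restaurants and bars": 0.4,
    "in all other locations": 0.5,
}

us_men_summary = restructure(us_employed_men)
```

      For this column the sketch gives 21.7 hours indoors, 0.7 hours outdoors, and 1.6 hours in transit, which together account for the full 24-hour day.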

      With these  assumptions  and  the data  restructured accordingly  (Table
4),  employed men  in  the 12 countries  are seen to  spend between 84%  (in
Maribor, Yugoslavia)  and  92% (in  France) indoors,  compared to  between  89%
(in  Maribor,  Yugoslavia)  and  97%  (in France  and  Torun,   Poland)  for
housewives.   However, many of the entries in Table 4 cannot  be compared on
a  statistical  basis,  because  the  number  of respondents  in each  sample
varies.   Also,  the  representativeness  varies.    For  example, the  Soviet
Union  is  represented by a  single city  and its suburbs (Pskov, population
115,000),  while the United States is represented  by a national sample of 44
metropolitan areas.  Finally, the assumptions  used to develop Table  4 need
to be examined because they may introduce error.   However, the estimates in
Table 4 appear useful as a  rough approximation of the actual  times spent by
residents of 12  countries  indoors,  outdoors,  and  in transit,  and they  are
used in the remainder of this paper.

Activities of the U.S. Population

      If only  the data for the United  States (44 cities) are  considered,
employed men, on  the  average,  spend 90%  of the day indoors,  versus 95% for
married housewives  (Table  4).   Overall,  employed  men  in  the  United  States
are  estimated  to  spend  2.9%  of  the   day   outdoors,   versus  1.7%  for
housewives.

     To estimate  the time that employed  people in the U.S. spend in various
activities,  it  is necessary to combine  the  times  from Szalai's  categories
in Table  3.   For  example,  U.S. employed  men (44 U.S. cities  in Table 3)
spend  an  average  of  13.4 hours  indoors at home  (all  days   of the  week),
versus  15.4  hours  for U.S.  employed women.  If we weight both  sexes equally,
then the  category "employed U.S.  persons" spends  an  average  of 14.4 hours
per  day  indoors  at  home,  or 60% of the  time  (Table  5  and  Figure 1).   The
category  indoors at home (IH)  combines  being  in one's  own home (60%) with
being  in other people's  homes  (2.5%), for a total  of 62.5%. Similarly, U.S.
employed men spend 6.7 hours per day at  their workplace,  and  employed women
5.2  hours.   Assuming, as  in the  above  discussion, that all  workplaces are
indoors, U.S.  employed  persons  spend,  on the average,  5.95 hours, or 24.8%
of their  day,  indoors at  their workplaces.  Thus, the  category indoors at
work (IW) consists of being at  one's  workplace (24.8%)  and being  in  places
of business  (3.3%), for a total of 28.1%.
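
     The arithmetic in this paragraph (equal weighting of men and women, then conversion to a percentage of the 24-hour day) can be reproduced with a short script.  The hours are the Table 3 values for the 44 U.S. cities; the grouping labels and the function name `group_percentages` are hypothetical.  Because the sketch keeps unrounded averages, a few results differ slightly from tabulated values that were rounded first (for example, in transit comes out near 6.0% rather than 6.2%).

```python
from typing import Dict, List, Tuple

HOURS_PER_DAY = 24.0

# (employed men, employed women) average hours per day, 44 U.S. cities.
HOURS: Dict[str, Tuple[float, float]] = {
    "inside one's home": (13.4, 15.4),
    "just outside one's home": (0.2, 0.0),
    "at one's workplace": (6.7, 5.2),
    "in transit": (1.6, 1.3),
    "in other people's homes": (0.5, 0.7),
    "in places of business": (0.7, 0.9),
    "in restaurants and bars": (0.4, 0.2),
    "in all other locations": (0.5, 0.3),
}

# Grouping of locations into microenvironment categories, as in the text.
GROUPS: Dict[str, List[str]] = {
    "indoors, home": ["inside one's home", "in other people's homes"],
    "indoors, work": ["at one's workplace", "in places of business"],
    "indoors, other": ["in restaurants and bars"],
    "in transit": ["in transit"],
    "outdoors": ["just outside one's home", "in all other locations"],
}


def group_percentages(
    hours: Dict[str, Tuple[float, float]], groups: Dict[str, List[str]]
) -> Dict[str, float]:
    """Percent of the day per group, weighting men and women equally."""
    result = {}
    for group, locations in groups.items():
        mean_hours = sum((hours[loc][0] + hours[loc][1]) / 2 for loc in locations)
        result[group] = 100.0 * mean_hours / HOURS_PER_DAY
    return result


shares = group_percentages(HOURS, GROUPS)
```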

     U.S.  male workers  spend 1.6  hours per day in transit (T) microenvironments,
and female workers 1.3 hours (Table 5).  The  average, 1.5  hours,  is  6%  of
each person's day.   The smallest category
is "just  outside  one's  home"  (OH);   U.S. employed persons spend an overall
average of  0.1   hours  in  this  category  (men:  0.2  hours;  women:  0.0),
constituting  only  0.4% of one's  time per  day.   If we assume that  "all other
locations"  (OO)  is  entirely outdoors  and combine  it  with  being  outside
one's  home  (OH),  then the  total  time  spent outdoors  by employed persons in
the  U.S. is  0.4 + 1.7 =  2.1% (30 minutes), a  relatively small  proportion of
the  day.   This 30 minutes  reflects the  total  time it  takes  people to walk
from home to an automobile, walk from the parking lot  to a place of


-------
       Table 4.   Estimated Time  Spent  in Three Environmental Categories   (Average Hours per Day)a

                                         Employed Men                     Housewivesb
Country                           Indoors  Outdoors  Transit      Indoors  Outdoors  Transit

Belgium                              21.6       0.9      1.5         22.6       0.4      1.0
Bulgaria (Kazanlik)                  21.0       0.9      2.1         21.6       1.5      0.9
Czechoslovakia (Olomouc)             21.3       1.1      1.6         22.3       0.5      1.2
France (Six Cities)                  22.0       0.5      1.5         22.8       0.2      1.0
West Germany (100 Districts)         20.4       1.9      1.7         21.8       1.2      1.0
West Germany (Osnabruck)             20.7       1.1      2.2         22.3       0.7      1.0
East Germany (Hoyerswerda)           21.6       0.7      1.7         22.5       0.5      1.0
Hungary (Gyor)                       20.4       1.6      2.0         20.8       2.3      0.9
Peru (Lima-Callao)                   20.8       0.7      2.5         22.1       0.7      1.2
Poland (Torun)                       21.9       0.4      1.7         22.6       0.2      1.2
United States (44 Cities)            21.7       0.7      1.6         22.6       0.4      1.0
United States (Jackson, Mich.)       21.8       0.7      1.5         22.8       0.3      0.9
U.S.S.R. (Pskov)                     21.0       1.0      2.0         21.4       0.7      1.9
Yugoslavia (Kragujevac)              21.4       0.8      1.8         21.6       0.9      1.5
Yugoslavia (Maribor)                 20.1       1.7      2.2         20.5       2.4      1.1

a Derived by the author from data originally published in Szalai6,
  Tables 7-1.1 to 7-1.3, p. 795; data are weighted to ensure equality of
  days of the week and number of eligible respondents per household.
b Married persons only.

-------
Table 5.  Time Spent by Employed Persons in Various Locations
                      in 44 U.S.  Citiesa
                   (Average Hours Per Day)

                                        Employed   Employed
Category   Location                        Men       Women     Average       %

IH         Inside One's Home              13.4       15.4        14.4      60.0
OH         Just Outside One's Home         0.2        0.0         0.1       0.4
IW         At One's Workplace              6.7        5.2         5.9      24.6
T          In Transit                      1.6        1.3         1.5       6.2
IH         In Other People's Homes         0.5        0.7         0.6       2.5
IW         In Places of Business           0.7        0.9         0.8       3.3
IO         In Restaurants and Bars         0.4        0.2         0.3       1.3
OO         In All Other Locations          0.5        0.3         0.4       1.7
           Total                          24.0       24.0        24.0     100.0

a Based on data from 44 U.S. Cities (Table 3)

-------
[Figure 1: pie chart with segments INDOORS, HOME 63%; INDOORS, WORK 28%;
IN-TRANSIT 6%; OUTDOORS 2%; INDOORS, OTHER]

Figure 1.   Proportion of time U.S. employed persons spend  in
           indoor,  outdoor, and in-transit microenvironments.
           (Based on data from Table 3 for 44 U.S.  cities; men
           and women were weighted equally; percentage of  hours
           per day  averaged over all days of the week.)

-------
employment  and  return,   and  a  great  variety  of other  brief  outdoor
activities.

      Using Szalai's  data in Table  3,  similar findings emerge  for U.S.
unemployed,  married women ("housewives").  The  largest category is "inside
one's home," accounting for  20.5  hours, or 85.4% of the day.  When this is
combined with  the  category  "in other  people's  homes,"  we find that these
women spend 21.3 hours,  or  88.7% of their time,  inside  homes (Figure 2).
Assuming,  as  above,   that  "in  places  of  business"  (1.2 hours)  and "in
restaurants and  bars" (0.1  hours) are all  indoors, then U.S.  housewives
spend 1.3 hours,  or 5.4% of  their time,  in other indoor locations.   On the
average, they  spend
1.0  hours  per  day  in  transit,  or 4.2%  of the  day.  When  the  indoor and
in-transit  categories are combined and  subtracted from  a 24-hour day, we
find  that  U.S.  housewives  spend  only 0.4  hours,  or  1.7% of  their time,
outdoors.

      The survey methodology used for the  U.S.  population  sample,   along
with the analyses and findings,  is described  in detail  in a monograph7 and
a  book8  by Robinson.    In  1975,  Robinson27  conducted  a follow-up  study
including a more representative sample  of the  total  U.S. population than
the  1965-1966 study,  and  in 1985-86 Robinson and  Holland37 undertook  a more
comprehensive U.S.  national  survey  that included adolescents over age 12.

Diurnal Profiles

      Although  the estimates  given  in  Tables  3  and  4 are  useful for
determining the total  amount of  time spent in various  locations,  they give
little information about  the time  of day that persons  are present in each
location.  Data from the multinational  study  also can  be  displayed  in  other
ways.   A composite  profile shows the proportion  of  the U.S.  population
during the  day engaged  in  selected  activities  such  as  sleeping, eating,
work,  travel,  home,  leisure,  and television  (Figure 3).   Diurnal  profiles
for  five countries show  that U.S.  employed men  are not  likely  to  spend
their  noon lunch  hour  in  their  homes,  while  men  in  other  countries,
particularly  France,   are quite  likely to do so  (Figure 4).   The diurnal
profiles for  housewives also show similar patterns in the six countries,
but  show a  striking difference from the men's diurnal  profiles  (Figure 5).
Housewives spend most of the day inside their  homes, except  for  slight dips
in  the graphs  in the  morning and afternoon.   Compared to other countries,
housewives  in  the  United States  seem to show  less  tendency to  spend  their
noontime periods at home and exhibit a dip  (minimum) at  approximately 8:00
p.m., presumably because they are eating out.

      In addition to the studies by Robinson7,8,27 and Robinson and
Holland37, activity pattern studies have been carried out in Durham, N.C.,
by Chapin and Hightower4; on a U.S. sample of 43 Standard Metropolitan
Statistical Areas (SMSA's) by Chapin and Brail9; on a follow-up U.S.
national sample by Brail and Chapin11; and in the Washington, D.C.,
metropolitan area by Hammer and Chapin10. Chapin and his associates employ
a 3-digit system for coding activities that is based on a "dictionary" of
about 225 activity codes. Although activities can be grouped in a variety
of ways, Chapin finds it convenient to group the original 225 activities


-------
[Pie chart: Indoors, home 88.7%; Indoors, other 5.4%; In-transit 4.2%; Outdoors 1.7%]
Figure 2.  Proportion of time U.S.  housewives (unemployed,
           married women)  spend in indoor,  outdoor,  and in-
           transit microenvironments.   (Based on data from
           Table 3 for 44 U.S. cities; percentage of hours
           per day averaged over all days of the week.)

-------
[Plot not recoverable. Horizontal axis: TIME, in hourly ticks from midnight through 6 AM, noon, and 6 PM to midnight.]
Figure  3.   Diurnal  profiles  showing percentage of  employed
            men in 44 U.S. cities engaged in 9 types of
            activities as function of time of day  (weekdays
            only).   (Source:  Szalai, Figure 5-1.11 A,
            page  736)
  Note:  Data are  weighted to  ensure equality of  days of the
  week and number of eligible respondents.

-------
[Plot not recoverable. Surviving legend entries: 100 electoral districts, Fed. Rep. Germany; Lima-Callao, Peru. Horizontal axis: hour of day.]
Figure 4.   Diurnal profiles  showing percentage  of employed
            men in six countries present at home as a function
            of time of day  (weekdays only.)  (Source: Szalai,
            Figure 7-3.1 A, page 800)
  Note:   Data are weighted to  ensure equality  of days of the
  week and  number of eligible  respondents.

-------
[Plot not recoverable. Surviving legend entries: six cities, France; forty-four cities, USA; 100 electoral districts, Fed. Rep. Germany; Lima-Callao, Peru. Horizontal axis: hour of day, 1 to 24.]
 Figure 5.   Diurnal  profiles  showing the  percentage  of
             housewives in six countries present at home as a
             function of time  of day (weekdays only.)
             (Source:  Szalai, Figure 7-3.3 A, page  804)
Note:   Data are weighted to  ensure equality of days  of the
week and number of  eligible  respondents.

-------
according  to  two systems,  one  forming 28  categories of  activities and
another  forming 12.    Finally,   activities  are  grouped  into  two large,
general  classes: obligatory activities and discretionary activities (Table
6).  Considerable detail is available about  the activities of residents in
Washington,  D.C.,  in  a  report  by  Hammer  and  Chapin10  and the  book by
Chapin, including a diurnal profile of their activities by time of day
(Figure 6) that is similar  to  the  profiles prepared by Szalai6.

AIR POLLUTION  EXPOSURE STUDIES

     EPA has conducted several large-scale field studies employing a
probability sample of the population in which respondents wear personal
monitors and record their daily activities in diaries45-46. Through the use
of these techniques, called the Total Exposure Assessment Methodology
(TEAM), the carbon monoxide exposures of a representative sample of 452
persons in Denver31,32 and 712 persons in Washington, D.C.33,34, were
monitored in the winter of 1982-83. The activity pattern data from
Washington, D.C., (Figure 7) yielded overall findings similar to those from
the Multinational Comparative Time Budget Research Project6.

      The Washington, D.C., respondents, who include both employed
persons and housewives, spend only 1.3% of their time outdoors. This
figure  is  lower  than  the  1.7-2.0% obtained from Szalai6,  probably because
people spend more time indoors in the winter  in eastern U.S.  cities than in
other seasons.   Transportation accounted  for  8.1%  of  the time.  As with the
other studies,  the bulk of the time,  90.6%,  was spent indoors,  with  73.4%
spent indoors  at home and  17.2% indoors at work.

      An overview of the carbon monoxide human exposure field studies
appears in a paper by Akland et al.4, and the activity pattern data from
Washington, D.C., have been analyzed in detail by Hartwell et al.33,34 and
Schwab55, who related sociodemographic factors to activities and exposures.

TRANSPORTATION  STUDIES

      Few  areas  of  human  activities  have  received  more  study  than
transportation. In the United States, for example, legislation passed in
1962 required urban areas to conduct metropolitan-area transportation
studies as  a  prerequisite  for  receiving  Federal  funds  for  highway
construction.   As a  result, transportation studies have been undertaken in
200 areas of the United States, and these studies usually involve
collection of considerable detail about the transportation activities of
the urban population, particularly in cities with populations in excess of
50,000. One method for obtaining the data on activities is by an
"origin-destination" study, which uses a questionnaire called a "trip
report form" to determine the time of each trip, its purpose, the mode of
travel, and where it ends (Figure 8)56.
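
      As an illustration of how such trip records might be held for
analysis, the following sketch stores one form row per record and totals
the in-transit minutes by travel mode. The class, field names, and sample
trips are hypothetical and are not taken from any actual survey file.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Trip:
    """One row of a trip report form (field names are hypothetical)."""
    origin: str
    destination: str
    mode: str        # e.g. "auto driver", "transit passenger"
    purpose: str     # e.g. "work", "shop", "home"
    start: datetime
    arrive: datetime

    def minutes(self) -> float:
        """Travel time in minutes between start and arrival."""
        return (self.arrive - self.start).total_seconds() / 60.0


def minutes_by_mode(trips):
    """Total in-transit minutes per travel mode for one respondent-day."""
    totals = {}
    for trip in trips:
        totals[trip.mode] = totals.get(trip.mode, 0.0) + trip.minutes()
    return totals


# Hypothetical respondent-day: drive to work in the morning, drive home later.
day = [
    Trip("home", "office", "auto driver", "work",
         datetime(1969, 5, 1, 7, 30), datetime(1969, 5, 1, 7, 55)),
    Trip("office", "home", "auto driver", "home",
         datetime(1969, 5, 1, 17, 0), datetime(1969, 5, 1, 17, 30)),
]
print(minutes_by_mode(day))
```

Keeping the start and arrival times, rather than only a duration, preserves
the time-of-day information that diurnal profiles require.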

-------
            Table 6.   40-Category Activity Classification System
                            Suggested  by  Chapin51
OBLIGATORY ACTIVITIES
    Miscellaneous
    Main Job
    Other Income-Related
    Personal Care
    Eating
    Shopping
    Sick or Utilization of Medical Care Services
    Maintenance of Home, Yard, or Car
    Housework and Child Care
    Misc. Household Chores, Including Pet Care or Walking the Dog
    Household Business
    Education

DISCRETIONARY ACTIVITIES
    Child-Centered Activities
    Visiting, Writing Letters, Phoning Relatives
    Overseeing Children's Study, Practice
    Family Outings or Drives
    Talking and Visiting Within Family
    Visiting, Writing Letters, Phoning Friends
    Visiting In the Neighborhood
    Visiting Outside the Neighborhood
    Other Socializing Activities
    Relaxing, Loafing, Resting, or Napping
    Reading Newspapers, Magazines, or Nonspecified Materials
    Reading Books
    Cultural Activities
    Movies
    Television
    Radio
    Crafts and Hobbies
    Walking and Cycling
    Driving About, Sightseeing (Not with Family)
    Participant Sports
    Spectator Sports
    Out-of-Town Holiday
    Other Recreation
    Religious Activities
    Meetings of Voluntary Organizations
    Public Affairs and Service Activities
    Travel, Including Waiting for Travel
    Sleep

-------

Figure 6.   Diurnal profiles of activities of heads of
            households and spouses  on  weekdays in
            Washington, D.C., spring 1968  (Source:
            Chapin51,  page 104.)

-------
[Bar chart. Vertical axis: PERCENT. Horizontal axis: MICROENVIRONMENT (transit, residence, other indoor, outdoor). Paired bars show EXPOSURE and TIME; the outdoor pair is labeled 1.5 and 1.3.]
Figure 7.   Time spent  in various microenvironments by
           residents of Washington, D.C., from the carbon
           monoxide TEAM exposure study conducted by EPA in
           winter of 1982-83.33,34

-------
[Sample form: Urban Transportation Study, DWELLING UNIT SURVEY - INTERNAL TRIP REPORT
  (For trips which began on the travel date (4:00 a.m. to 4:00 a.m.)
  by persons 5 years of age or older living in sampled dwelling)

  The handwritten sample entries are not recoverable. Column headings that
  survive, some only in part:
    Sample Number; Sheet
    12  No.
    13  No.
    14  Trip Begin?   (Street No., Street Name, City, State)
    15  Trip End?     (Street No., Street Name, City, State)
    16  Get There?    1. Auto Driver; 2. Auto Passenger; 3. Transit Passenger;
                      4. School Bus Passenger; 5. Taxi Passenger;
                      6. Truck Driver; 7. Truck Passenger; 8. Walked to Work;
                      9. Worked at Home
    17  You Start?    (time, AM or PM)
    18  You Arrive?   (time, AM or PM)
        Why Did You Go?  Purpose, From and To: 1 Work; 2 Shop; 3 Social;
                      4 Recreation; 5 School; 6 Personal Affairs;
                      7 Transfer to Another Means of Travel;
                      8 Serve Passenger; 9 Home
        Remarks]
Figure 8.    Sample  trip  report  form  for  use in a  metropolitan
               transportation  study   (Source:  Reference 54)

-------
      As reported by  Robinson,  Converse,  and Szalai57,  the multinational
research  project  also  collected  information on  the average  time spent
commuting to and from work in various  countries  (Table 7).   If  the  entries
for all countries are averaged,  people who commute  by public transit spend
an average  of  82 minutes per day  traveling to  and  from  work; those who
commute by automobile spend  an average of 55 minutes per  day; and  those who
commute by  walking  spend  an average of 41  minutes  per  day.   (Obviously,
persons who walk  live closer to their places of work than those who use
other transportation modes).  In the United States  (44 cities),  the  average
time spent  commuting by automobile is 46 minutes  per day, or 23 minutes
each way.  However,  the variability of national  samples  makes  it  difficult
to compare  travel times  in  different  countries.  For example, the travel
time for commuting  by automobile  in one  Yugoslavian city  (Kragujevac, 53
minutes) is higher  than  the U.S.  average of 46  minutes, while the travel
time in another Yugoslavian  city  (Maribor,  44 minutes)  is  lower than the
U.S. average.
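
      The column means quoted above, and the standard deviations listed
with them in Table 7, can be reproduced from the table entries. The sketch
below transcribes each column in the table's row order and skips entries
the table reports as unavailable; it assumes the tabulated standard
deviations are sample (n-1) values, which is the form that matches.

```python
from statistics import mean, stdev

# Minutes per day traveling to and from work, transcribed from Table 7 in
# row order (Belgium ... Yugoslavia, Maribor). None marks an entry that the
# table gives as "N.A." or "--".
TABLE_7 = {
    "public transport": [98, 93, 73, 82, None, 71, 82, 104, 103, 71, 81,
                         None, 67, 70, 71],
    "automobile": [55, 73, 62, 46, None, 41, 66, 48, 93, 50, 46, 39,
                   None, 53, 44],
    "walking": [52, 47, 46, 44, None, 46, 30, 40, 48, 41, 30, 34, 32,
                47, 40],
    "all travel": [66, 57, 59, 50, 40, 47, 62, 64, 89, 60, 50, 38,
                   None, 51, 51],
}


def summarize(values):
    """Mean and sample standard deviation of the available entries."""
    present = [v for v in values if v is not None]
    return round(mean(present), 1), round(stdev(present), 1)


summary = {mode: summarize(values) for mode, values in TABLE_7.items()}
for mode, (avg, sd) in summary.items():
    print(f"{mode:17s} mean {avg:5.1f}  s.d. {sd:4.1f}")
```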

      More important, however,  is  that the average  time  commuting,  even if
it were known exactly for a particular country,  would not  be sufficient to
estimate  air  pollution  exposures.    The average  time  could  be used to
calculate  the  average  exposure  (assuming   the  average  pollutant
concentration associated  with commuting  were known), but  it could not be
used to calculate  the maximum exposure some commuters  receive.

-------
Table 7.  Average Time Spent Traveling To and From Work
      by Mode of Transportation  in  12 Countries55
                   (Minutes Per Day)

                               Public
Country                       Transport  Automobile  Walking  All Travel
Belgium                          98          55         52        66
Bulgaria (Kazanlik)              93          73         47        57
Czechoslovakia (Olomouc)         73          62         46        59
France (Six Cities)              82          46         44        50
West Germany
  (100 Districts)               N.A.        N.A.       N.A.       40
  (Osnabruck)                    71          41         46        47
East Germany
  (Hoyerswerda)                  82          66         30        62
Hungary (Gyor)                  104          48         40        64
Peru (Lima-Callao)              103          93         48        89
Poland (Torun)                   71          50         41        60
United States
  (44 Cities)                    81          46         30        50
  (Jackson, Mich)                --          39         34        38
U.S.S.R. (Pskov)                 67          --         32        --
Yugoslavia
  (Kragujevac)                   70          53         47        51
  (Maribor)                      71          44         40        51
Mean:                           82.0        55.1       41.2      56.0
Standard deviation:             13.3        15.1        7.2      12.7

-------
In any population subgroup such as commuters,  some  people  will  spend very
long times commuting, while others spend far less.

      Defining exposures  in  a  meaningful  way requires  information on the
variability and  range  of  commute  times.  Ideally,  one  would  like to have
the entire frequency distribution  of travel  times of commuters to generate
the entire frequency distribution of commuter exposures.   Unfortunately,
most  summaries  of  time  budget studies present  only  average values and
seldom give histograms  or information on the variability of the time  spent
in various locations or  activities,  even though  such  histograms could be
generated from the raw  data.
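
      Generating such a histogram from raw diary data is straightforward.
The sketch below bins commute durations into classes like those of Figure
9; the class limits are assumed to be in minutes, the treatment of values
that fall exactly on a limit is an arbitrary choice, and the ten raw
durations are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Class limits (assumed to be minutes) and labels patterned on Figure 9.
EDGES = [5, 15, 25, 35, 45, 55, 65]
LABELS = ["<5", "5-15", "15-25", "25-35", "35-45", "45-55", "55-65", ">65"]


def percent_histogram(durations):
    """Percentage of persons in each commute-time class.

    A duration equal to a class limit is placed in the higher class.
    """
    counts = Counter(LABELS[bisect_right(EDGES, d)] for d in durations)
    n = len(durations)
    return {label: round(100.0 * counts[label] / n, 1) for label in LABELS}


# Hypothetical one-way commute durations (minutes) for ten respondents.
raw = [8, 10, 12, 22, 18, 30, 41, 10, 62, 70]
hist = percent_histogram(raw)
average = sum(raw) / len(raw)

print(f"average = {average:.1f} minutes")
for label in LABELS:
    print(f"{label:>6s}  {hist[label]:5.1f} %")
```

In this made-up sample the average is 28.3 minutes, yet 40% of the
respondents fall in the 5-15 class and 10% exceed 65 minutes, detail that
the average alone does not convey.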

      In 1969-70, the U.S. Department of Transportation arranged with the
Bureau of the Census to carry out a nationwide study of the
transportation-related activities of the U.S. population. This study,
called the Nationwide Personal Transportation Study, was based on home
interviews and covered individual activities in considerable detail12-22.
Home-to-work travel times averaged 22 minutes per trip (Figure 9), which,
assuming two automobile trips per day, compares reasonably well with the
average of 46 minutes per day (that is, 23 minutes per trip) reported by
Robinson, Converse, and Szalai. However, about one-third of the population
spends only 10 minutes commuting, while a small proportion (3.8%) spends 60
minutes, nearly three times the average value.

     These figures demonstrate why the average duration of an activity is
not sufficient to characterize exposures in a meaningful way. Use of the
average value for commute times would underestimate, by a factor of three,
the very long commute times experienced by 3.8% of the employed persons in
the U.S., which represents several million persons. The SHAPE model39 uses
frequency distributions such as Figure 9 to simulate human activity
patterns.
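
     Both points can be illustrated with a short sketch. The first part
applies the 3.8% figure to the 1969 count of U.S. workers given with Table
9, as a rough check on "several million." The second part draws simulated
commute times from a frequency distribution; it is not the SHAPE model,
and the class percentages are hypothetical except for the three values
quoted in the text and in Figure 9 (about one-third near 10 minutes, 3.8%
near 60 minutes, and 2.2% over 65 minutes).

```python
import random

# Rough check: 3.8% of the 75,758,000 U.S. workers reported for 1969.
WORKERS_1969 = 75_758_000
long_commuters = 0.038 * WORKERS_1969
print(f"{long_commuters / 1e6:.1f} million persons with 60-minute commutes")

# Representative commute time (minutes) for each class and the percentage
# of persons in it. Only 33.0, 3.8, and 2.2 come from the source; the other
# percentages are hypothetical fill-ins chosen to total 100.
MINUTES = [2.5, 10, 20, 30, 40, 50, 60, 70]
PERCENT = [10.0, 33.0, 25.0, 15.0, 7.0, 4.0, 3.8, 2.2]

weighted_mean = sum(m * p for m, p in zip(MINUTES, PERCENT)) / sum(PERCENT)


def simulate_commutes(n, seed=0):
    """Draw n commute times in proportion to the class percentages."""
    rng = random.Random(seed)
    return rng.choices(MINUTES, weights=PERCENT, k=n)


sample = simulate_commutes(10_000)
share_long = sum(1 for m in sample if m >= 60) / len(sample)
print(f"distribution mean = {weighted_mean:.1f} minutes")
print(f"simulated share at 60 minutes or more = {share_long:.3f}")
```

Sampling from the whole distribution keeps the long commutes in the
simulated population, which a single average value would omit.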

     In  1983-84,  the  U.S.   Nationwide  Personal  Transportation  Study was
repeated on a probability  sample  of  6,438  households in 50 states and the
District of Columbia35.  Comparing average  home-to-work commute  times for
the three time periods for which  U.S.  survey data are available shows that
the average commute time  for all  SMSA's declined from  about 23 minutes in
1969-70  to  about 21 minutes for  all  U.S.  SMSA's in both  the 1977-78 and
1983-84 survey periods (Table 8). Average commute times increase with the
size of the SMSA. For example, in the 1983-84 U.S. survey, commute times
averaged 15.3 minutes  in  SMSA's with fewer  than  250,000 persons,  compared
with  22.1  minutes in  SMSA's of 1-3  million persons,  and  26.8 minutes in
SMSA's above 3 million  persons35.

      U.S. workers overwhelmingly  favor the  automobile  for  commuting  (Table
9).   Approximately 67-72% of the  home-to-work trips were by passenger car,
while public  transportation accounted  for  only  5-7% of the  trips.   Only
about 4-5% of U.S. workers walk to work.  The  typical home-to-work trip by
passenger  car  averages  about   19 minutes,   while  trips by  public
transportation  are much  longer.    In  1983,  the  average   trip  by public
transportation  took 46.1  minutes.    Persons who  walk to  work  average


-------
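[Histogram not recoverable. Vertical axis: Percent of Persons. Horizontal axis: commuting-time classes of under 5, 5-15, 15-25, 25-35, 35-45, 45-55, 55-65, and over 65 (units garbled in extraction; presumably minutes). Annotations: average = 22 minutes; over 65: 2.2%.]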
Figure 9.  Frequency distribution of home-to-work  commuting
           times  for employed persons in the U.S.  (excludes
           persons  who work at home or at no fixed address).
           (Source:  Nationwide Personal Transportation study,
           Table  A-6,  Report No.  8. August, 1973,  p.  57.
           Based  on data from Reference 19)

-------
slightly less than 9 minutes. Home-to-work trips by truck, van, and other
private transportation modes average about 20 minutes, or nearly the same
as automobile travel. While accounting for only 15.6% of home-to-work
trips in 1983, this category has been steadily increasing since 1969.


     Table 8.  Average Commuting  Time  of  U.S. Workers  by  Size of SMSA*
                             (minutes)

                               SMSA Population
        Fewer Than  250,000-  500,000-  1,000,000-  3,000,000    All
YEAR     250,000    499,999   999,999   2,999,999   and over    SMSA's
1969:     19.4       19.8      21.2       23.7        25.6       23.1
1977:     16.9       17.1      18.9       21.8        25.2       20.8
1983:     15.3       18.8      17.9       22.1        26.8       20.9

*Source: Table 7-2 from Nationwide Personal Transportation Study;35
         includes only SMSA's.

       Table 9.   Mode of Travel  of Home-to-Work Trips of U.S.  Workers(a)

                      Truck, Van, and                        Work
           Passenger   Other Private      Public              at
YEAR          Car      Transportation  Transportation  Walk  Home  Other  Total
1969   %:    67.3           5.8             7.3         5.1   4.5   10.0  100.0
1977   %:    72.1          11.5             5.7         4.7   3.7    1.4  100.0(b)
     min:    19.0          19.9            38.8         8.8    -    16.7   19.8
1983   %:    70.6          15.6             5.3         4.1   3.5    0.9  100.0
     min:    19.1          20.1            46.1         8.9    -    29.9   20.4

(a) Source: Table 7-6 from Nationwide Personal Transportation Study;35
    U.S. workers numbered 75,758,000 in 1969; 93,019,000
    in 1977; and 103,244,000 in 1983.

(b) Includes 0.9% unknown
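
The "Total" trip times in Table 9 are consistent with a share-weighted
average of the trip times for the individual modes. The sketch below
checks this for 1977 and 1983; it assumes that persons working at home,
and the 0.9% of unknown mode in 1977, are left out of the average because
no trip time is reported for them.

```python
# (percent of trips, mean trip time in minutes) by mode, from Table 9.
TABLE_9 = {
    1977: {
        "passenger car": (72.1, 19.0),
        "truck, van, other private": (11.5, 19.9),
        "public transportation": (5.7, 38.8),
        "walk": (4.7, 8.8),
        "other": (1.4, 16.7),
    },
    1983: {
        "passenger car": (70.6, 19.1),
        "truck, van, other private": (15.6, 20.1),
        "public transportation": (5.3, 46.1),
        "walk": (4.1, 8.9),
        "other": (0.9, 29.9),
    },
}


def mean_trip_minutes(modes):
    """Average trip time weighted by each mode's share of trips."""
    total_share = sum(share for share, _ in modes.values())
    weighted = sum(share * minutes for share, minutes in modes.values())
    return weighted / total_share


for year, modes in TABLE_9.items():
    print(year, round(mean_trip_minutes(modes), 1))
```

The results, 19.8 minutes for 1977 and 20.4 minutes for 1983, agree with
the totals in the table.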


-------
                              RESEARCH NEEDS
      The  literature  on  human  activity  patterns,   time  budgets,   and
transportation-related  activities  is  at  this  time  voluminous  and
comprehensive.  A  large number of studies  of  human  activities have been
undertaken  in  the  United States and other countries.   In general, they have
been designed  and  implemented by two groups:  sociological researchers and
transportation  analysts.    These studies  have reflected  the particular
interest of these two disciplines.

      Despite  the  large volume of information available on  human  activity
patterns, these  data are not  really suitable for estimating the exposure of
the population  to  air  pollution.  Although  crude exposure estimates are
possible, the  data  from  existing studies suffer  from three main problems:

      •     failure to collect in the diaries basic data that are important
            for estimating air pollution exposures. For example,
            respondents were not asked to indicate their smoking
            activities, nor did they report if they used gas stoves or
            other gas appliances. Rather, the emphasis was on their
            leisure time activities, such as viewing television or
            socializing.

      •     failure to code data on the diary forms in a manner suitable
            for estimating air pollution exposures. For example, it would
            be possible to determine explicitly from the diaries whether a
            person was indoors or outdoors, but the investigators were not
            interested in this matter, so the resulting codes are of no
            help. Estimating the percentage of time spent indoors from
            existing activity pattern data requires making numerous
            assumptions. For example, the activity category "working
            around the house" is ambiguous. Is it indoors or outdoors? It
            is likely that a revised coding of the original diaries could
            yield more accurate information for air pollution purposes.

      •     failure to present the analyses and data summaries in a manner
            suitable for estimating air pollution exposures. For example,
            tables usually present the duration of each activity as an
            average value so that only the average exposure of the
            population can be computed. However, the average exposure may
            be well below the relevant air quality standard, concealing the
            fact that a significant proportion of the population is exposed
            to levels above the standard. Determining that proportion
            requires the entire frequency distribution of times spent in a
            given activity.
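
      The third problem can be made concrete with a small numerical
sketch. The ten exposure values and the standard below are hypothetical,
in arbitrary concentration units, and serve only to show how an average
below the standard can coexist with individual exposures above it.

```python
def fraction_above(exposures, standard):
    """Fraction of persons whose exposure exceeds the standard."""
    return sum(1 for x in exposures if x > standard) / len(exposures)


# Hypothetical exposures for ten persons and a hypothetical standard.
exposures = [2, 3, 3, 4, 4, 5, 5, 6, 12, 14]
STANDARD = 9

mean_exposure = sum(exposures) / len(exposures)
share_over = fraction_above(exposures, STANDARD)

print(f"mean exposure = {mean_exposure:.1f} (standard = {STANDARD})")
print(f"fraction above standard = {share_over:.0%}")
```

Here the mean is well below the standard while one person in five exceeds
it, a result that only the full distribution reveals.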


      Thus, much information needed to estimate human exposures is not
available in past studies of human activities and time budgets, where a
large amount of superfluous information is available. For
example, the time spent "socializing" or "watching television" is of no use
in estimating exposures to  air pollution.   Flachsbart56  suggests that the
analyst of air pollution exposure  is  less interested in the "activity" than
in  the "environmental  setting."   Thus,  the  exposure  analyst  is  less
interested in whether  an  individual  is  talking  with friends  (socializing)
than whether  he  or  she is driving in traffic, cooking indoors  with a gas
stove,  or parking  a  car inside  an  underground  garage.   Obtaining  this
information   requires  conducting an  activity  pattern  study  specially
designed to estimate air pollution exposure.

      Such a  study  should  begin  with a pilot study  on  a single  city to
perfect the experimental design and data collection methodology and should
measure exposures with  personal monitoring instruments.   Once the results
are evaluated, a large-scale investigation would be carried out on a number
of cities  or  a  national  probability  sample.   The large-scale survey would
employ diaries  and personal  monitoring instruments  to  characterize the
frequency  distribution of air pollution exposures of the population as a
whole  and of selected cities.   Information  from the  diaries could be
correlated with  the measurements of  exposure to determine  how different
activities of the population affect their exposure rates.
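
      One simple form such a comparison could take is sketched below:
each personal-monitor reading is paired with the environmental setting
recorded in the diary for the same period, and the readings are averaged
by setting. The settings, readings, and units are hypothetical, and a real
analysis would also need to account for the time spent in each setting.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical paired records: (diary setting, monitor reading in ppm).
records = [
    ("indoors, home", 1.0),
    ("indoors, home", 2.0),
    ("in transit", 6.0),
    ("in transit", 8.0),
    ("indoors, other", 3.0),
    ("outdoors", 2.0),
]


def mean_by_setting(pairs):
    """Average monitor reading for each diary setting."""
    grouped = defaultdict(list)
    for setting, reading in pairs:
        grouped[setting].append(reading)
    return {setting: mean(values) for setting, values in grouped.items()}


averages = mean_by_setting(records)
for setting, value in averages.items():
    print(f"{setting:15s} {value:4.1f} ppm")
```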


                                CONCLUSIONS

      Although a great body of literature and data currently exists on human
activity patterns, most of this research was conducted by sociological
researchers not interested in air pollution or environmental problems.
Thus, the existing time budget studies contribute very little to the newly
emerging field of human exposure modeling and monitoring. Field studies
are needed to gather data on the particular activities that affect human
exposure to pollution (for example, using consumer products, storing
chemicals in the home, driving, living with a smoker, using gas appliances,
visiting dry cleaning establishments, filling gasoline tanks). The
importance of human activity patterns in determining an individual's health
risk from environmental pollution has only recently been recognized. The
potential to gain a better understanding of the causes and sources of risk
from environmental chemicals through activity pattern research is enormous.

      The existing time budget  studies were not designed to estimate human
exposure,  but they can still  be  used to make imperfect  estimates of the
time  spent  in different microenvironments, as we have attempted to do in
this paper.  When the data are interpreted  in  this manner, some  interesting
findings emerge.

       In  general,   people  spend  a  very  small  amount  of time outdoors.
Excluding  the in-transit  categories  and  all  indoor  locations, outdoors is
the smallest time category. Time budget data indicate that U.S. workers
spend only about 2% and U.S. housewives spend about 1.7% of their time
outdoors.   In other countries,  people always  spend  less  than 10% of their
time outdoors, and  in most countries they spend less than  5%.



-------
      There are minor differences  from country  to  country  in  human  activity
patterns,  probably  resulting from differences  in  culture,  transportation
systems, and climate.   The similarities, however,  are  more striking than
the differences.  The finding that emerges is that we are basically  indoor
animals.  When not  indoors at home, we are indoors on our jobs,  in stores,
or other locations.   When not located within some room or building, we can
usually  be found in a  transportation microenvironment,  such  as a train,
bus, or  automobile.   In a modern  society,  total time outdoors is the most
insignificant  part  of the day, often so small it barely  shows  up in the
total.

      Possibly, this emphasis on enclosed structures stems from our
evolution. Unlike the animals of the forest, we have no fur to protect us
from the cold. Nor are our bodies equipped with efficient weapons of
defense, such as claws, fangs, or tusks. Perhaps because of our insecurity
outdoors, we build residences, other buildings, and even enclosed
transportation vehicles such as cars and buses. The existing data on all
cultures thus far studied, if they are correct and valid, suggest that we
are primarily indoor animals.
                              Acknowledgment
         I wish to thank Herb Hunt of General Sciences Corporation
          for his tireless work on the development of this paper.

-------
                             REFERENCES
1.     Lundberg, George A., Mirra Komarovsky, and Mary Alice
      McInerny, Leisure: A Suburban Study. Columbia University Press, New
      York City, 1934.

2.     Sorokin, Pitirim A., and Clarence Q. Berger, Time-Budgets of Human
      Behavior. Harvard University Press, Cambridge, Mass., 1939.

3.     de Grazia, Sebastian, "The Uses of Time," in Aging and Leisure.
      Robert W. Kleemeier, ed., Oxford University Press, New York City,
      1961, pp. 113-153.

4.     Chapin,  F.  Stuart,  and  Henry  C.  Hightower,   "Household Activity
      Patterns  and  Land  Use," Amer.  Inst.  of Planners.  31,  3: 222-238,
      August 1965.

5.     Szalai,  Alexander,  "The Multinational  Comparative  Time  Budget
      Research  Project:  A  Venture in  International Research  Cooperation,"
      Amer.  Behavior Scientist. 10,  4:  1-31,  December  1966.

6.     Szalai, Alexander, The  Use of Time:  Daily  Activities  of Urban and
      Suburban  Populations  in Twelve Countries. Mouton, The Hague, 1972.

7.     Robinson,  John  P.,  How Americans  Used Time in 1965.  University of
      Michigan,  Monograph  from University  Microfilms  International,  Ann
      Arbor, Mich.,  1977.

8.     Robinson, John P., How Americans Use Time: A Social-Psychological
      Analysis of Everyday Behavior. Praeger Publishers, Praeger Special
      Studies, New York City, 1977.

9.     Chapin, F. Stuart, Jr., and Richard K. Brail, "Human Activity Systems
      in the United States," Environ. and Behavior. 2:107-130, December
      1969.

10.   Hammer, Philip G. Jr.,  F.  Stuart  Chapin Jr., "Human Time Allocations:
      A Case Study  of Washington,   D.C."  Technical  monograph,  Center for
      Urban  and  Regional  Studies,   University  of  North  Carolina,   Chapel
      Hill,  N.C., March 1972.

11.   Brail, Richard K., and F. Stuart Chapin, Jr., "Activity Patterns of
      Urban Residents," Environ. and Behavior. 5:163-191, June 1973.

12.   Strate,  Harry  E.,  "Automobile  Occupancy",  U.S.  Department of
      Transportation,  Federal  Highway  Administration, Nationwide  Personal
      Transportation Study, Washington, D.C., Report No. 1, April 1972.

13.   Strate, Harry E., "Annual Miles of Automobile Travel," U.S.
      Department of Transportation, Federal Highway Administration,
      Nationwide Personal Transportation Study, Washington, D.C., Report
      No. 2, April 1972.

14.    Strate,   Harry E.,  "Seasonal  Variations of  Automobile  Trips  and
      Travel,"  U.S.  Department  of Transportation,  Federal  Highway
      Administration, Nationwide Personal  Transportation Study, Washington,
      D.C., Report No. 3, April  1972.

15.    Beschen,  Darrell  A.  Jr., "Transportation Characteristics  of  School
      Children."  U.S.   Department of  Transportation,   Federal Highway
      Administration, Nationwide Personal  Transportation Study, Washington,
      D.C., Report No. 4, July 1972.

16.    Hatley, Rolan  M.,  "Availability  of Public  Transportation and Shopping
      Characteristics  of  SMSA  Households",    U.S.   Department  of
      Transportation,  Federal  Highway Administration,  Nationwide Personal
      Transportation Study, Washington,  D.C., Report No. 5,  July  1972.

17.    Gish,   Robert  E.,  "Characteristics  of Licensed   Drivers",   U.S.
      Department  of Transportation,  Federal  Highway  Administration,
      Nationwide Personal  Transportation Study, Washington,  D.C.,  Report
      No.  6, April 1973.

18.    Goley, Beatrice T., Geraldine Brown, and  Elizabeth Samson,  "Household
      Travel  in the  United  States,"   U.S. Department  of  Transportation,
      Federal  Highway Administration,  Nationwide  Personal  Transportation
      Study, Washington, D.C., Report  No.  7, December 1972.

19.    Svercl,  Paul V.,  and Ruth H. Asin,  "Home-to-Work Trips and Travel,"
      U.S. Department of Transportation,  Federal  Highway  Administration,
      Nationwide Personal  Transportation Study, Washington,  D.C.,  Report
      No.  8, August  1973.

20.    Randill,   Alice,  Helen Greenhalgh,  and  Elizabeth Samson,  "Mode of
      Transportation and  Personal Characteristics  of Tripmakers,"  U.S.
      Department  of  Transportation,  Federal  Highway   Administration,
      Nationwide Personal  Transportation Study, Washington,  D.C.,  Report
      No.  9, November 1973.

21.    Asin,  Ruth  H.,  "Purposes of Automobile  Trips  and  Travel."    U.S.
      Department  of  Transportation,  Federal  Highway   Administration,
      Nationwide Personal  Transportation Study, Washington,  D.C.,  Report
      No.  10, December  1974.

22.    Asin,  Ruth  H.,  and  Paul  V. Svercl,  "Automobile Ownership,"  U.S.
      Department  of  Transportation,  Federal  Highway   Administration,
      Nationwide Personal  Transportation Study, Washington,  D.C.,  Report
      No.  11, December  1974.

23.    Flachsbart,  Peter G.,  William C.  Baer,  and Gary Schalman,   "Activity
      Patterns   in the  Residential  Environment,"  report prepared for the
      U.S.  Public  Health Service  project:    Research  on the  Residential
      Environment.     Graduate  Program  Urban  and  Regional  Planning,
      University  of Southern California, Los Angeles,  Calif.,  1972.

24.   Michelson,  William,  and Paul Reed,  "The Time Budget,"  in  Behavior
      Research Methods  in  Environmental Design.  William Michelson,  ed.,
      Community Development Series, Halsted Press,  1975.

25.   Bullock,  Nicholas,  Peter Dickens,  Mary Shapcott, and Philip Steadman,
      "Time Budgets and Models of Urban Activity Patterns,"  Social Trends.
      56:  45-63,  Nov.  5,  1974.

26.   British Broadcasting  Corporation,  The  People's  Activities  and Use of
      Time. BBC Audience Research Department,  J.  Smethurst (E.S.D.), Ltd.,
      England,  1978.

27.   Robinson, John P., "Changes  in  Americans' Use  of  Time:  1965-1975:  A
      Progress Report,"  Communications Research  Center, Cleveland State
      University,  Cleveland, Ohio., 1977.

28.   Asin,  Ruth  H.,   Personal  communication regarding  the  Nationwide
      Personal Transportation Study,   U.S.  Department  of Transportation,
      Federal Highway  Administration, Washington,  D.C. July 1979.

29.   Juster, Thomas F.,  Martha S.  Hill, Frank  P. Stafford, and Jacquelynne
      E. Pearsons,  "1975-1981  Time Use Longitudinal Panel Study," Report on
      Project  #466066,  Survey  Research  Center,    Institute  for  Social
      Research, the University of  Michigan, Ann Arbor, Mich.,  January 1983.

30.   Letz,  Richard  E.,  and  Mary  Lou Soczek,  "A Survey  of  Time-Activity
      Patterns in Kingston/Harriman, Tennessee: Methods Support for Modeled
      Data," presented at  a  specialty conference on Quality  Assurance in
      Air  Pollution Measurements, Air  Pollution  Control Association  and
      American Society for Quality Control,  Boulder,  Colo.,  October
      14-18,  1984.

31.   Johnson, Ted,  "A Study  of  Personal  Exposure  to Carbon  Monoxide in
      Denver,   Colorado,"  Report   No.   EPA-600/4-84-014,  NTIS  No.
      PB-84-146125, U.S.  Environmental  Protection Agency, Research Triangle
      Park, N.C.,  1983.

32.   Johnson, Ted,  "A Study  of  Personal  Exposure  to Carbon  Monoxide in
      Denver,  Colorado," Paper  No.   84-121.3 presented  at the 77th  Annual
      Meeting  of  the  Air  Pollution  Control  Association,  San  Francisco,
      Calif., June  1984.

33.   Hartwell, T.D.,  C.A.  Clayton,  R.M. Mitchie,  Jr.,  R.W.  Whitmore, H.S.
      Zelon,  S.M.  Jones, and  D.A.  Whitehurst,  "A  Study  of Carbon  Monoxide
      Exposure of Residents of Washington, DC,  and Denver, Colo.," Report  No.
      EPA-600/S4-84-031,  NTIS  No.   PB-84-18356,  U.S.   Environmental
      Protection  Agency,  Research Triangle Park, N.C.,  1984.


                                   3-35

-------
34.   Hartwell, Ty, C.A.  Clayton,  R.M.  Mitchie, Jr., R.W.  Whitmore,  H.S.
      Zelon,  D.A. Whitehurst,  "A  Study  of Carbon Monoxide Exposure of the
      Residents in Washington,  D.C.," Paper No.  84-121.4  presented at the
      77th Annual  Meeting of the  Air  Pollution Control  Association,  San
      Francisco,  Calif.,  June  1984.

35.   Klinger,  Dieter,  and  J.  Richard Kuzmyak,  "Personal  Travel  in the
      United States,  Vol  I, 1983-84  Nationwide Personal  Transportation
      Study,"  U.S.   Department   of Transportation,   Federal   Highway
      Administration,  Washington, D.C., August 1986.

36.   Johnson,  Ted,  "A Study of  Activity Patterns  in  Cincinnati, Ohio,"
      Draft Report  No.  RP940-06  PN 3640-2,  prepared by PEI  Associates,
      Research  Triangle  Park,   N.C.,   for  the  Electric  Power  Research
      Institute,  Palo  Alto, Calif.,  June 12, 1986.

37.   Robinson, John,  and Jeffrey M. Holland, "Trends in American's Use of
      Time: Some Preliminary  1975-1985  Comparisons,"  Draft Report to Office
      of  Technology  Assessment,  U.S.  Congress,  Survey  Research  Center,
      University of Maryland, College Park, Md., May 1986.

38.   Wiley, James  A.,  and  John  P. Robinson,  "Activity  Pattern  Study of
      California Residents: A Micro-Behavioral Approach," research proposal,
      University of California,  Berkeley,  Calif., undated.

39.   Ott,  Wayne  R.,  "Exposure  Estimates  Based  on  Computer  Generated
      Activity Patterns," Paper No. 81-57.6 presented  at the  74th Annual
      Meeting of the  Air  Pollution  Control Association,  Philadelphia, Pa.,
      June 21-26, 1981.

40.   Duan,  Naihua,    "Models  for Human  Exposure  to  Air  Pollution,"
      Environment International. 8:  305-309, 1987.

41.   Thomas,  Jacob,  David Mage,  Lance Wallace,  and  Wayne  Ott,  "A
      Sensitivity Analysis  of the  Enhanced  Simulation  of  Human  Air
      Pollution and Exposure (SHAPE) Model,"  Report  No.  EPA-600/4-85-036,
      NTIS No.  PB-85-201101, Environmental Monitoring  Systems Laboratory,
      U.S.  Environmental  Protection Agency,  Research Triangle  Park,  N.C.,
      1984.

42.   Ott, Wayne, Jacob Thomas, David Mage, and  Lance Wallace, "Validation
      of  the Simulation of Human Activity and  Pollution  Exposure  (SHAPE)
      Model Using  Paired  Days  from the Denver,  Colorado, Carbon Monoxide
      Field Study,"  Atmospheric Environment, in press.

43.   Johnson,   Ted,   and  Roy  Paul,   "NAAQS  Exposure  Model   (NEM)  and
      Application  to  Nitrogen  Dioxide,"  technical  report,  Office of Air
      Quality Planning and Standards, U.S. Environmental Protection Agency,
      Research Triangle Park, N.C.,  1981.

44.   Ott,  Wayne  R.,  "Concepts  of Human  Exposure  to  Air  Pollution,"
      Environment International. 7:  179-196, 1982.

                                    3-36

-------
 45.   Ott, Wayne R.,  "Total  Human  Exposure:  An  Emerging  Science Focuses  on
       Humans  as Receptors  of Environmental  Pollution,"  Environmental
       Sciences and  Technology.  19:  880-886,  October  1985.

 46.   Ott, Wayne,  Lance  Wallace,  David Mage, Gerald Akland,  Robert Lewis,
       Harold Sauls, Charles  Rodes, David Kleffman,  Donna  Kuroda,  and Karen
       Morehouse, "The Environmental Protection Agency's Research Program  on
       Total Human Exposure," Environment  International, 12: 475-494,  1986.

 47.   Michelson, William,  "Time Budgets in  Environmental Research:   Some
       Introductory  Considerations,"  in  Environment Design Research,  Vol.
       II,  Symposia and Workshops.  W.F.E.   Preiser, ed.,  4th  International
       E.D.R.A.  Conference,  Dowden,  Hutchinson,  Ross,  Inc.,  Stroudsburg,
       Pa., 1973.

 48.   Ottensmann,  John  R.,  "Systems  of Urban  Activities and Time:   An
       Interpretive  Review  of the  Literature," Urban Studies  Paper,  Center
       for Urban and Regional Studies,  University of North Carolina, Chapel
       Hill, N.C., 1972.

 49.   Szalai,  Alexander,  "Trends  in  Comparative  Time-Budget  Research,"
       Amer. Behavioral Scientist.  9:  3-8, May 1966.

 50.   Converse, Philip E., "Time-Budgets,"  in International Encyclopedia  of
       the Social Sciences. Vol. 16,  David  Sills, ed.  Macmillan Co.  and the
       Free Press, New York City,  1968.

 51.   Chapin,  F. Stuart  Jr., Human Activity  Patterns  in  the  City:  Things
       People Do in Time and Space.  John Wiley & Sons,  New  York City,   1974.

 52.   McCormick, Thomas C.,  "Quantitative Analysis  and Comparison of Living
       Cultures," Amer. Sociol. Rev. 4:  463-474,  August  1939.

 53.   Walker,  Kathryn E., "Homemaking Still  Takes Time," Journal  of Home
       Economics.  LXI: 621-624, October 1969.

 54.   Akland,  Gerald G.,  Tyler D.  Hartwell, Ted  R.  Johnson,  and  Roy  W.
       Whitmore, "Measuring Human Exposure to  Carbon Monoxide in Washington,
       D.C., and Denver, Colorado, during the  Winter of 1982-1983," Environ.
       Sci. Technol., Vol. 19, No.  10, October 1985.

 55.   Schwab,   Margo,   "Differential  Exposure  to  Carbon  Monoxide  Among
       Sociodemographic Groups  in  Washington,  D.C.,"  Ph.D.  Dissertation,
       Graduate School  of  Geography,  Clark University,  Worcester,  Mass.,
       February 1988.

 56.   "Urban Origin-Destination Survey",  U.S. Department of Transportation,
       Federal  Highway Administration, Washington, D.C., July  1975.

 57.   Robinson, John P.,  Philip E. Converse,  and Alexander  Szalai,  "Life  in
       Twelve Countries,"  in The  Use of  Time:   Daily Activities of Urban and
       Suburban Populations  in Twelve Countries.  Alexander Szalai,  ed.,
       Mouton,  The Hague,  pp.  112-144, 1972.
                             3-38

-------
                     BASIC ACTIVITY PATTERNS STRUCTURE
                      FOR MODELING POLLUTION EXPOSURE

                       By:   Jacob Thomas
                             General  Sciences  Corporation
                             6100 Chevy Chase  Drive
                             Laurel,  MD  20707
                                   and
                             Joseph V. Behar
                             U.S. Environmental  Protection Agency
                             Environmental  Monitoring Systems
                               Laboratory-Las  Vegas
                             Las Vegas, Nevada  89193-3478
                                 ABSTRACT

      The significance of activity  patterns  in man's relationship  to  the
pollutants  in  his  environment  has  become  progressively  evident  to
environmental  scientists  over the last decade.   Field  studies  of personal
exposures  to   several  environmental  pollutants  have  been  conducted  in
several  major  metropolitan  areas  of  the  country  on  statistically
representative  samples  of the  respective populations.   These  data  were
examined to determine the  "likeness"  of activity  patterns in  different
cities.   Using data  from Denver,  Colorado,  and Washington,  D.C.,  the
distributions  of  occupancy  duration periods  for  seven  broadly  defined
microenvironments were  determined.   Despite  significant differences  in
specific characteristics  between the two  cities,  the  overall  similarities
found  in the  activity patterns are  quite remarkable.   Activity patterns
thus determined, when combined  with microenvironmental  concentration  data
in a total human exposure model,  will  provide  more  realistic  estimates  of
human exposure to environmental pollution.

      This  paper  has  been  reviewed   in  accordance  with   the  U.S.
Environmental  Protection  Agency's peer and administrative  review policies
and approved for presentation and publication.

                                   4-1

-------
INTRODUCTION

      The significance of  activity  patterns in man's relationship  to  the
pollutants  in  his  environment  has  become  progressively  evident  to
environmental scientists  over the  last  decade.    Scientists at  the  U.S.
Environmental  Protection  Agency and  several  of their colleagues in various
private  and  academic institutions  have  been  mainly responsible  for  this
increasing  consciousness.    Field  studies  of  personal  exposures  to
carbon monoxide (CO) were conducted in the Denver and Washington, D.C.,
metropolitan areas during the winter of 1982-83 (1-4).  These studies were
conducted on participants drawn from statistically representative samples
of the populations living in these two metropolitan areas.  In addition to
carrying personal exposure monitors to measure their exposure to CO, the
participants kept meticulous records of time, activity, and location for
the 24 hours during which each carried the monitor.  These studies have
provided a unique database of the 24-hour activity patterns of urban
dwellers in these two cities.

      Such studies  are costly and time  consuming.   It is  not  surprising
then that it has been repeatedly asked whether or not the activity patterns
found in  the  databases of  these  activity  pattern  studies are  usable  in
different cities and for different pollutants.

BACKGROUND

      Given  this  objective,   an  intense examination  of  the  Denver  and
Washington,  D.C.,  study data  was  conducted  to  determine  the "likeness"  of
activity patterns  observed in these  two  cities.   An activity pattern of an
urban dweller is  the time  spent  by  the  individual  in different activities
at  different  locations  during a 24-hour  period.    It is  not  only  the
duration  of the  activity  that is  important,  but  also the  time  of  its
occurrence.    The  term  microenvironment  has been defined to  mean  the
location where  the  activity takes place.   In  pollutant exposure modeling,
microenvironments  and the  time spent in them by individuals are the dynamic
variables with which we deal.
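
      To make the role of these two variables concrete, the following
minimal sketch computes a time-weighted average concentration for one
person-day.  The function name, the four location labels, and all
concentration values are hypothetical illustrations and are not data from
the Denver or Washington studies; only the idea of combining time spent in
each microenvironment with a concentration for that microenvironment comes
from the text.

```python
# Illustrative sketch only: the concentrations below are made-up values,
# not measurements from the Denver or Washington, D.C., studies.

def time_weighted_exposure(minutes, concentrations):
    """Return the time-weighted mean concentration over one person-day.

    minutes        -- dict: microenvironment -> minutes spent there
    concentrations -- dict: microenvironment -> assumed mean concentration
    """
    total_minutes = sum(minutes.values())
    if total_minutes <= 0:
        raise ValueError("the person-day must contain some time")
    weighted = sum(minutes[m] * concentrations[m] for m in minutes)
    return weighted / total_minutes


# Hypothetical person-day (minutes sum to 1440) and hypothetical CO levels (ppm).
day = {"indoor_residence": 1000, "indoor_other": 300, "transit": 120, "outdoor": 20}
co_ppm = {"indoor_residence": 1.0, "indoor_other": 2.0, "transit": 6.0, "outdoor": 3.0}
exposure = time_weighted_exposure(day, co_ppm)
```

In this sketch a short stay in a high-concentration microenvironment, such
as transit, can contribute as much to the daily average as a long stay at
home, which is why both the duration and the location of each activity
matter.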

      In our present effort, we are not looking at specific pollutants; we
are seeking activity patterns drawn with a broader brush.  All human
activities occur either indoors or outdoors.  The activity of transit needs
to be classified separately, however.  Also, since most human activities
occur indoors, a further distinction can be made among indoor environments
such as residence, office, school, and shop, which should be considered
separately from one another.
                                    4-2

-------
      The residence is the location where most of everyday life is spent.
For most people the activity of wage earning occurs elsewhere.  The
"indoor" location can therefore be further broadly classified as
indoor-residence and indoor-other.  The outdoors can be classified into a
myriad of microenvironments, but, as observed in the data, the time spent
outdoors is less than 5% of the 24 hours; hence the distinction need be
made only when dealing with specific pollutants.  Of the time spent in the
residence, more than half is spent sleeping.  The remainder of the time
spent in the residence can be classified into many other activities.

      When  specific pollutants are considered, certain  specific activities
in the  residence  become  important  and will require special  attention when
modeled.   For example,  cooking with a  gas  stove is  a specific  activity
occurring in  a  specific indoor residential microenvironment  important in
carbon monoxide (CO)  exposure modeling.   Under the category  "indoor-other,"
the wage-earning activity is a major activity  in everyday life.  For office
workers,  the microenvironment  "office"  is  a   major  subcategory  under
"indoor-other."   Again,  the balance  of  "indoor-other" categories such as
shops, restaurants, etc.,  are  short-duration  activities  when  considered as
a part  of daily life.   Some of them  will  need to be  specifically modeled
for certain pollutants.   The transportation or "transport" category must be
separated into transport  by combustion and  noncombustion modes.

METHOD

      The modeling of exposure  to  virtually  any  pollutant  can be  achieved
if we consider all activities in all  microenvironments  in relation  to seven
basic  activity/location groups shown below  (Figure  1 and  Table  1).   We
first  segregate the  universe of microenvironmental  activities  into three
groups:  Indoor,  Transport,  and Outdoor.  These three  broad  groupings can
be further classified into  the  subgroups indicative of  major  activities in
which people are involved.
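
      The two-level grouping described above can be sketched as a small
lookup routine.  The argument names and the string labels ("residence",
"office", "sleeping") are assumptions made for illustration; they are not
the coding scheme used in the Denver and Washington diaries.

```python
# Hypothetical sketch: argument names and string labels are assumptions,
# not the coding scheme of the original activity diaries.

MICROENVIRONMENTS = {
    1: "Indoor residence sleeping",
    2: "Indoor residence other",
    3: "Indoor other office",
    4: "Indoor other nonresidence",
    5: "Combustion powered transit",
    6: "Noncombustion powered transit",
    7: "Outdoor",
}


def classify(group, location="", activity="", combustion=False):
    """Map one diary entry to one of the seven basic microenvironments."""
    if group == "indoor":
        if location == "residence":
            return 1 if activity == "sleeping" else 2
        return 3 if location == "office" else 4
    if group == "transit":
        return 5 if combustion else 6
    if group == "outdoor":
        return 7
    raise ValueError(f"unknown group: {group!r}")


# Example entries (all hypothetical).
sleeping_code = classify("indoor", "residence", "sleeping")
shopping_code = classify("indoor", "shop", "shopping")
driving_code = classify("transit", combustion=True)
```

Every entry falls into exactly one of the seven codes, so the durations
assigned to the seven groups always add up to the length of the study day.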

      With appropriate adjustments for occupational exposure or other
pollution-intensive activities or microenvironments, these basic groups of
activity/locations can be used for the analysis of activity data in almost
any urban/suburban area in the country.  Certain additional adjustments,
such as seasonal or geographic ones that bear particularly on specific
pollutants, must be made to accommodate the modeling of some pollutants.

      Given the  rich  database  of  activities,   and   transitions  between
activities,  available from  the Denver and  Washington  studies, we know the
frequency distribution  by sex, age,  occupation,  daypart, weekpart,  etc.,
for  a broad range  of activities.  While  these studies  were designed to map
exposure  to   carbon  monoxide,  the  data can  be  used  to  form the  basic
building blocks for modeling exposure  to  a  variety of pollutants.

                INDOOR
                    RESIDENCE
                        Sleeping in residence
                        Other residence
                    OTHER
                        Office
                        Other nonresidence indoor
                TRANSIT
                    Combustion powered
                    Noncombustion powered
                OUTDOOR

    Figure 1.  Derivation of Seven Basic Modeling Microenvironments

Table 1. Seven broad microenvironments into which all activity/location
         combinations can be placed.

                1 - Indoor residence sleeping
                2 - Indoor residence other
                3 - Indoor other office
                4 - Indoor other nonresidence
                5 - Combustion powered transit
                6 - Noncombustion powered transit
                7 - Outdoor

      The duration of study days included in this investigation ranged from
22 to 26 hours.  To investigate the basic structure of activity patterns
and the time spent in each of the seven broad microenvironments identified
above, it was necessary to standardize the duration of the study day to 24
hours.  This meant that those observed longer than 24 hours were reduced to
24 hours, and those observed for less were adjusted to provide 24 hours.
The study days for most participants started and ended around 6 p.m. at
their residence.  For most persons it was considered reasonable to extend
or reduce the time at their last location, such as residence, to
standardize the observation period to 24 hours.  As a result, this study
provides data for 1066 person-days from the two cities (526 from Denver and
540 from Washington, D.C.) which can be evaluated in order to provide basic
activity pattern structures which may be more generally applicable to human
exposure studies.
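
      The standardization step can be sketched as follows.  This is a
minimal illustration under stated assumptions, not the procedure actually
coded for the study: the episode format (location, minutes) is hypothetical,
and the rule for a final episode that is shorter than the excess time
(trimming earlier episodes in turn) is an assumption that the text does not
address.

```python
# Minimal sketch, assuming each study day is a time-ordered list of
# (location, minutes) episodes.  The fallback of trimming earlier episodes
# when the last one is too short is an assumption, not stated in the text.

DAY_MINUTES = 24 * 60


def standardize_day(episodes, target=DAY_MINUTES):
    """Lengthen or shorten the last location so the day totals `target`."""
    if not episodes:
        raise ValueError("a study day needs at least one episode")
    result = [[location, minutes] for location, minutes in episodes]
    difference = target - sum(minutes for _, minutes in result)
    if difference >= 0:
        result[-1][1] += difference          # extend time at the last location
    else:
        excess = -difference
        while excess > 0 and result:
            cut = min(excess, result[-1][1])
            result[-1][1] -= cut             # reduce time at the last location
            excess -= cut
            if result[-1][1] == 0:
                result.pop()
    return [(location, minutes) for location, minutes in result]


# Hypothetical 26-hour and 22-hour study days.
long_day = [("residence", 600), ("office", 480), ("transit", 60), ("residence", 420)]
short_day = [("residence", 700), ("office", 500), ("residence", 120)]
long_std = standardize_day(long_day)
short_std = standardize_day(short_day)
```

Because most study days began and ended at the residence, an adjustment of
this kind mainly changes the indoor-residence total and leaves the other
six microenvironments untouched.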

RESULTS

      Figure 2 and Table 2 compare the mean activity durations in the two
cities.  The bars depict the major location categories and are subdivided
to represent the activities.  Indoor residence (INDRES) time is subdivided
into a primary activity, sleep, and a secondary activity, all others.
Other indoor (INDOTH) time is subdivided between office time and time
spent in other indoor locations.  Outdoor (OUTDOR) time is very small in
total duration and has no subdivision.  Transit (TRNSIT) time is divided
into the primary mode of transport, internal combustion driven vehicles,
and secondary modes of transportation, including walking, cycling, or
travel by train.  As illustrated in the figure and table, the "likeness"
of the two cities is remarkable and greater than anticipated.  There are
some differences which distinguish the two cities, however.  Washington,
D.C., is more of an office-going city, and its residents need a little
longer transit duration to traverse the larger metropolitan area.  Denver
residents appear to spend the minutes saved in commuting getting extra
sleep.
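
      The city comparisons in Table 2 can be approximately checked from the
published means and standard errors alone.  The sketch below uses a
large-sample normal approximation with unpooled standard errors; this is an
assumption made for illustration and is not necessarily the exact t-test
used to produce the tabulated probabilities.

```python
import math


def two_sample_z(mean_a, se_a, mean_b, se_b):
    """Return (z, two-sided p) for the difference between two means.

    Uses the normal approximation, which is reasonable here only because
    both city samples contain more than 500 person-days.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (mean_a - mean_b) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p


# Table 2 values (Denver first, then Washington).
z_office, p_office = two_sample_z(117.4, 8.6, 179.7, 9.6)    # indoor other, office
z_outdoor, p_outdoor = two_sample_z(21.8, 2.4, 21.2, 2.3)    # outdoor
```

Under this approximation the office-time difference is significant well
beyond the .0001 level, while the outdoor-time difference is not
significant, in line with the probabilities reported in Table 2.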

      The  results of  examination of  some of  the  characteristics which
affect  activity duration in  each  of  the  microenvironments  are  shown  in
Figure  3  and  Table 3 which contrast the week-day time  distributions with
those  of  the weekend for all  study participants (Denver  and Washington,
D.C., combined).  There is evidently more time  to  sleep,  and much more time
is  spent  in residence during weekends.   Less time is spent  in transit, and
relatively few work during weekends.
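
      One simple consistency check on Tables 2 and 3 is that the same 1066
person-days are split two different ways, so the sample-size-weighted mean
of any row should agree between the two tables.  The sketch below applies
this check to total indoor-residence time; the function name is
illustrative only.

```python
def pooled_mean(groups):
    """Return the sample-size-weighted mean of (n, mean) pairs."""
    groups = list(groups)
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Indoor residence, all activities (minutes per 24 hours).
by_week_part = pooled_mean([(798, 994.1), (268, 1186.0)])   # Table 3
by_city = pooled_mean([(526, 1059.2), (540, 1025.9)])       # Table 2
```

Both splits give an overall mean of roughly 1042 minutes per 24 hours in
the residence, which indicates that the two tables describe the same pooled
sample.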

      Figure 4  and Table  4  show significant differences between activity
patterns  based  on  sex for both cities.   Women  spend more time at home and
less  time  in  office  and   indoor-other  and   in  transit  than  do  men.
Significant differences in  time  budget  distribution  are also evident among
different age groups  as illustrated  in Figure  5  and Table  5.   Humans tend
to  sleep  less as  age increases,  and  older people  spend  more time at their
residence and  less time away.   Transit time also decreases progressively
with age.

      Figures 6a  and  6b and the  associated  tables (6a  and  6b) compare the
two cities for weekdays and  weekends respectively.   They  confirm the larger
office  and commute  time  for  Washingtonians  during  weekdays.    The  sex
characteristics are   similar  in the  two  cities (Figures  7a and  7b and
Tables  7a and 7b).   Women  spend  more time  in  their  residence  than men and
less time in other indoor  locations or  in transit.

                                    4-5

-------
[Bar chart: minutes per 24 hours in each location category (INDOTH,
INDRES, OUTDOR, TRNSIT) for Denver and Washington, with each bar divided
into primary and secondary activities.]

          FIGURE 2 - Comparison of Time Budgets by City

TABLE 2 - Comparison of Time Budgets by City (minutes per 24 hours)

                                       Denver          Washington       T-TEST
                                      n = 526            n = 540
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All              1059.2     10.2    1025.9     10.6     .023?
                  Sleeping          513.3      5.2     495.0      5.0     .012?
                  Other             545.9     10.5     513.8     10.0     .297?
Indoor Other      All               257.6      9.4     276.5     10.0     .169?
                  Office            117.4      8.6     179.7      9.6     .0001
                  Other             140.2      7.2      96.8      5.8     .0001
Outdoor           All                21.8      2.4      21.2      2.3     .8562
Transit           All               101.3      3.7     116.5      4.2     .0061
                  Combustion         88.8      3.5      96.4      4.0     .157
                  Non-Combustion     12.5      1.5      20.1      2.1     .003
                          4-6

-------
[Bar chart: minutes per 24 hours in each location category (INDOTH,
INDRES, OUTDOR, TRNSIT) for weekdays and weekends, with each bar divided
into primary and secondary activities.]

          FIGURE 3 - Comparison of Time Budgets by Week Part

TABLE 3 - Comparison of Time Budgets by Week Part (minutes per 24 hours)

                                      Weekday            Weekend        T-TEST
                                      n = 798            n = 268
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All               994.1      8.4    1186.0     11.7     .0001
                  Sleeping          490.7      4.1     543.8      7.3     .0001
                  Other             503.4      8.2     642.0     13.0     .0001
Indoor Other      All               313.4      8.0     129.4      9.4     .0001
                  Office            192.5      8.0      19.3      5.3     .0001
                  Other             121.0      5.6     110.0      8.1     .3089
Outdoor           All                19.9      1.9      26.1      3.4     .1094
Transit           All               112.5      3.2      98.6      5.9     .0316
                  Combustion         94.9      3.0      85.8      5.6     .1423
                  Non-Combustion     17.6      1.5      12.7      2.2     .1015
                                     4-7

-------
[Bar chart: minutes per 24 hours in each location category (INDOTH,
INDRES, OUTDOR, TRNSIT) for females and males, with each bar divided into
primary and secondary activities.]

          FIGURE 4 - Comparison of Time Budgets by Sex

TABLE 4 - Comparison of Time Budgets by Sex (minutes per 24 hours)

                                       Female             Male          T-TEST
                                      n = 637            n = 429
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All              1096.8      9.2     961.4     11.2     .0001
                  Sleeping          512.5      4.7     491.4      5.7     .004?
                  Other             584.2      9.2     470.0     10.8     .0001
Indoor Other      All               228.2      8.6     325.0     10.9     .0001
                  Office            115.6     10.9     198.4     10.8     .0001
                  Other             112.6      5.6     126.6      8.1     .142?
Outdoor           All                15.8      1.6      29.8      3.3     .0001
Transit           All                99.1      3.3     123.6      4.9     .0001
                  Combustion         84.6      2.9     104.5      5.0     .0003
                  Non-Combustion     14.5      1.8      19.1      1.8     .0639
                                      4-8

-------
[Bar chart: minutes per 24 hours in each location category (indoor other,
indoor residence, outdoor, transit) for the age groups 18-24, 25-44, 45-59,
and 60-70, with each bar divided into primary and secondary activities.]

          FIGURE 5 - Comparison of Time Budgets by Age Group

TABLE 5 - Comparison of Time Budgets by Age Group (minutes per 24 hours)

                                                AGE GROUPS
                                    18-24     25-44     45-59     60-70    F-TEST
LOCATION          ACTIVITY         n = 99   n = 594   n = 236   n = 137     P<

Indoor Residence  All              1015.7    1013.6    1044.8    1181.8     .0001
                  Sleeping          519.7     506.1     503.4     484.9     .1401
                  Other             496.0     507.5     541.4     696.9     .0001
Indoor Other      All               289.1     289.3     271.6     147.9     .0001
                  Office            132.1     170.9     154.9      54.4     .0001
                  Other             157.0     118.4     116.6      92.5     .0158
Outdoor           All                18.6      22.4      20.9      20.3     .9079
Transit           All               116.2     114.6     102.8      90.0     .0197
                  Combustion        101.4      97.2      90.9      69.4     .0062
                  Non-Combustion     14.8      17.5      11.9      20.5     .2086
                                      4-9

-------
[Bar chart: weekday minutes per 24 hours in each location category
(INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington, with each bar
divided into primary and secondary activities.]

          FIGURE 6a - Comparison of Time Budgets by City on Weekday

TABLE 6a - Comparison of Time Budgets by City on Weekday (minutes per 24 hours)

                                       Denver          Washington       T-TEST
                                      n = 378            n = 420
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All              1011.9     12.0     978.0     11.7     .042?
                  Sleeping          499.9      5.9     482.4      5.6     .032?
                  Other             512.0     12.3     495.6     11.1     .32??
Indoor Other      All               301.9     11.3     323.8     11.2     .170?
                  Office            158.4     11.1     223.1     11.2     .0001
                  Other             143.5      8.8     100.7      7.0     .0001
Outdoor           All                19.7      2.9      20.1      2.5     .9137
Transit           All               106.2      4.4     118.1      4.6     .0619
                  Combustion         93.3      4.2      96.3      4.3     .6228
                  Non-Combustion     12.9      1.7      21.8      2.5     .0034
                                     4-10

-------
[Bar chart: weekend minutes per 24 hours in each location category
(INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington, with each bar
divided into primary and secondary activities.]

          FIGURE 6b - Comparison of Time Budgets by City on Weekend

TABLE 6b - Comparison of Time Budgets by City on Weekend (minutes per 24 hours)

                                       Denver          Washington       T-TEST
                                      n = 148            n = 120
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All              1180.0     15.7    1193.5     17.6     .564?
                  Sleeping          547.5     10.3     539.3     10.4     .577?
                  Other             632.4     18.1     654.2     18.5     .4011
Indoor Other      All               144.4     12.9     110.9     13.7     .076?
                  Office             12.6      5.3      27.7      9.7     .155?
                  Other             131.8     12.2      83.2      9.7     .002?
Outdoor           All                27.1      4.2      24.9      5.4     .749?
Transit           All                88.7      6.6     110.8     10.1     .060?
                  Combustion         77.2      6.2      96.5      9.8     .0??
                  Non-Combustion     11.5      3.0      14.3      3.4     .544?
                        4-11

-------
[Bar chart: minutes per 24 hours in each location category (INDOTH,
INDRES, OUTDOR, TRNSIT) for females in Denver and Washington, with each bar
divided into primary and secondary activities.]

          FIGURE 7a - Comparison of Time Budgets for Females by City

TABLE 7a - Comparison of Time Budgets for Females by City (minutes per 24 hours)

                                       Denver          Washington       T-TEST
                                      n = 345            n = 292
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All              1107.9     12.0    1083.7     14.2     .193?
                  Sleeping          522.6      6.5     500.7      6.8     .020?
                  Other             585.3     12.5     582.9     13.6     .897?
Indoor Other      All               215.4     11.1     242.8     13.3     .1191
                  Office             87.1      9.8     149.2     12.5     .0001
                  Other             128.7      8.1      93.6      7.4     .0014
Outdoor           All                16.1      1.9      15.5      2.6     .8332
Transit           All                99.9      4.6      98.0      4.7     .7679
                  Combustion         86.9      4.3      81.9      3.8     .3894
                  Non-Combustion     13.1      3.1      16.1      3.0     .4049
                                     4-12

-------
[Bar chart: minutes per 24 hours in each location category (INDOTH,
INDRES, OUTDOR, TRNSIT) for males in Denver and Washington, with each bar
divided into primary and secondary activities.]

          FIGURE 7b - Comparison of Time Budgets for Males by City

TABLE 7b - Comparison of Time Budgets for Males by City (minutes per 24 hours)

                                       Denver          Washington       T-TEST
                                      n = 181            n = 248
LOCATION          ACTIVITY           MEAN  STD.ERR      MEAN  STD.ERR     P<

Indoor Residence  All               966.4     16.9     957.8     14.9     .703?
                  Sleeping          495.6      8.6     488.3      7.5     .5211
                  Other             470.8     17.6     469.5     13.7     .950?
Indoor Other      All               337.2     15.9     316.1     14.8     .33??
                  Office            175.1     15.8     215.5     14.6     .060?
                  Other             162.1     14.1     100.7      9.3     .000?
Outdoor           All                32.5      5.8      27.9      3.9     .510?
Transit           All               103.8      6.1     138.2      7.1     .000?
                  Combustion         92.4      6.1     113.4      7.4     .02??
                  Non-Combustion     11.4      1.9      24.8      2.7     .0002
                          4-13

-------
      Comparisons of age groups within each city are shown in the 8a-d
series of figures and tables.  There does not appear to be any significant
interaction between these characteristics and the two cities.

DISCUSSION

      The Denver and  Washington,   D.C.,   databases  used  in  this  study
represent distributions of occupancy duration periods for the  seven  broad
microenvironments identified  above  and can  be  sampled for simulation  of
activity  patterns.    They  can be made  to accommodate  a  wide variety  of
activity groups  identified  in many different  activity  pattern studies.
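
      Sampling from such a database can be sketched as drawing whole
person-days, with replacement, from the records that match a requested
stratum.  The record layout, field names, and the four toy records below
are hypothetical; they only stand in for the 1066 person-days of the two
studies, and the seven values in each record are minutes in microenvironments
1 through 7 of Table 1.

```python
import random

# Toy stand-in for the person-day database (all values hypothetical).
toy_db = [
    {"city": "Denver", "week_part": "weekday", "sex": "F",
     "minutes": (480, 520, 200, 150, 70, 10, 10)},
    {"city": "Denver", "week_part": "weekend", "sex": "F",
     "minutes": (540, 650, 0, 120, 90, 10, 30)},
    {"city": "Washington", "week_part": "weekday", "sex": "M",
     "minutes": (470, 450, 260, 120, 100, 20, 20)},
    {"city": "Washington", "week_part": "weekend", "sex": "M",
     "minutes": (550, 600, 30, 110, 100, 15, 35)},
]


def sample_person_days(database, k, rng, **strata):
    """Draw k person-days, with replacement, from records matching strata."""
    matches = [record for record in database
               if all(record.get(key) == value for key, value in strata.items())]
    if not matches:
        raise ValueError("no person-days match the requested strata")
    return [rng.choice(matches) for _ in range(k)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
weekend_days = sample_person_days(toy_db, 5, rng, week_part="weekend")
```

Drawing whole person-days, rather than drawing each microenvironment
duration separately, keeps the seven durations of a simulated day
consistent with one another and summing to 24 hours.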

      Specific activities of importance for a particular pollutant can also
be modeled within one of these seven microenvironments.  For example,
cooking with gas stoves can be modeled as a part of the time spent in a
residence other than sleeping.  Similarly, an occupationally hazardous
activity would be a part of indoor-other.

      Despite the significant  difference between the two cities in specific
characteristics, the overall  similarities  found between  the  two  cities are
quite  notable.    Only the winter  season  in both  cities  is  represented,
however,  which  may  account  for the very short  outdoor duration  in  both
cities.     The   characteristic  that  Washington,   D.C.,   residents  are
predominantly  office workers  is seen  in the significant  difference  in
indoor-office  duration  in  the two  samples.    Similarly,   the  larger
metropolitan  area  of Washington,  D.C.,   compared   to that  of  Denver,
translates  into the longer average transit  duration.   Whether  or  not the
mile-high elevation of Denver  has  anything to do with the longer  sleep
duration  observed  in that  city, compared to Washington,  D.C.,  cannot  be
determined from this limited data set.  These types of city-specific
characteristics  must be taken  into  consideration when modeling  activity
patterns for use in  total human  exposure modeling.
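
      As a rough check on the city comparisons discussed above, a t statistic
can be approximated from the published means and standard errors.  The
sketch below is an illustration only, not the procedure used in the study: it
assumes an unpooled (Welch-type) statistic and a normal approximation to the
p-value, which seems reasonable for samples of this size, and the function
name is hypothetical.  The inputs are the Table 7b indoor-office entries for
males.

```python
import math

def two_sample_p(mean_a, se_a, mean_b, se_b):
    """Approximate two-sided p-value for a difference of two means.

    Assumption: unpooled (Welch-type) statistic with a normal
    approximation; the source does not state which t-test variant
    was actually used.
    """
    t = (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(t) / math.sqrt(2.0))  # two-sided normal tail area
    return t, p

# Table 7b, Indoor Other / Office, males: Denver vs. Washington
t_office, p_office = two_sample_p(175.1, 15.8, 215.5, 14.6)
print(f"t = {t_office:.2f}, p = {p_office:.3f}")
```

Under these assumptions the result is close to the tabulated value of about
.06 for this row.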

      The similarities and differences  observed in the  characteristics  of
these  two cities must now be  translated  into  the efficient and  accurate
simulation modeling  of a  basic activity  patterns structure.   One can sample
from  the  database of detailed  activity patterns  for a limited  number  of
microenvironments   in  Denver  and  Washington,   D.C.,  according  to  the
characteristics  which affect activity patterns  such  as week  part,  or
demographic  characteristics  such  as  sex and  age.    City-specific
characteristics,  such as  commute-time differences,  can be  built  into the
simulation from available census  information for different  cities.   Sex and
age differences will be reflected  by  proportional  sampling.    Presently,
there  is no seasonal information available.   It is quite possible, however,
that  the  Cincinnati  activity  study may  provide such  information  to add  to
the database.  Hence, building a comprehensive database reflecting season,
city, and demographic characteristics for activity patterns does not, in
principle, seem a formidable undertaking.
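
      The proportional sampling described above can be sketched as follows.
This is a minimal illustration under stated assumptions, not the simulation
model itself: the record layout, the function name `sample_profiles`, the
target shares, and every number in the toy database are hypothetical
placeholders.

```python
import random
from collections import Counter, defaultdict

def sample_profiles(database, target_shares, n, seed=0):
    """Draw about n daily activity profiles so that each (sex, age group)
    stratum appears in proportion to target_shares.

    database: list of dicts with keys "sex", "age_group", "minutes"
    target_shares: {(sex, age_group): fraction}, fractions summing to 1
    Every stratum named in target_shares must have at least one record.
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for record in database:
        by_stratum[(record["sex"], record["age_group"])].append(record)

    sample = []
    for stratum, share in target_shares.items():
        k = round(n * share)  # proportional allocation; rounding may shift the total
        # Sample with replacement so that small strata can still be drawn from
        sample.extend(rng.choices(by_stratum[stratum], k=k))
    return sample

# Hypothetical toy database: minutes per day in four broad locations
toy_db = [
    {"sex": "F", "age_group": "25-44",
     "minutes": {"indoor_residence": 1080, "indoor_other": 230,
                 "outdoor": 20, "transit": 110}},
    {"sex": "F", "age_group": "25-44",
     "minutes": {"indoor_residence": 1000, "indoor_other": 300,
                 "outdoor": 30, "transit": 110}},
    {"sex": "M", "age_group": "25-44",
     "minutes": {"indoor_residence": 960, "indoor_other": 340,
                 "outdoor": 30, "transit": 110}},
]
shares = {("F", "25-44"): 0.6, ("M", "25-44"): 0.4}  # hypothetical census shares

profiles = sample_profiles(toy_db, shares, n=10)
counts = Counter(p["sex"] for p in profiles)
print(counts)
```

City-specific adjustments such as commute-time differences would be applied
to the sampled profiles in a separate step, which is not shown here.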

      With  the   activity  patterns  so   structured,  the next   step  is  to
identify the microenvironments which are important for modeling exposure to
specific  pollutants.  For benzene exposure, for  example,  the  important

                                   4-14

-------
      [Figure 8a: bar chart of minutes per 24 hours (axis ticks 200-1200) by
      location (INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington;
      legend: PRIMARY, SECONDARY; labeled bars: 1053 (Denver), 984
      (Washington).]

      FIGURE 8a - Comparison of 18-24 Age Group Time Budgets
TABLE 8a - Comparison of 18-24 Age Group Time Budgets by City

                                       Denver         Washington
                                       n = 526         n = 540
LOCATION          ACTIVITY           MEAN  STD.ERR       MEAN  STD.ERR    T-TEST P<

Indoor Residence  All              1053.0     36.7      984.6     32.7    .167?
                  Sleeping          515.4     23.7      523.2     21.0    .8062
                  Other             537.6     38.4      461.4     28.3    .106?
Indoor Other      All               263.0     31.4      310.8     33.3    .298?
                  Office             95.3     25.4      162.7     32.3    .114?
                  Other             167.7     26.6      148.1     25.6    .595?
Outdoor           All                11.4      3.9       24.6      9.0    .2104
Transit           All               111.6     15.2      120.0     11.2    .6504
                  Combustion         97.0     13.9      105.2     11.4    .6496
                  Non-Combustion     14.6      7.4       14.9      4.8    .9791

? = digit illegible in the source.
                                     4-15

-------
      [Figure 8b: bar chart of minutes per 24 hours (axis ticks 200-1200) by
      location (INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington;
      legend: PRIMARY, SECONDARY; labeled bars: 1022 (Denver), 1005
      (Washington).]

      FIGURE 8b - Comparison of 25-44 Age Group Time Budgets
TABLE 8b - Comparison of 25-44 Age Group Time Budgets by City

                                       Denver         Washington
                                       n = 300         n = 294
LOCATION          ACTIVITY           MEAN  STD.ERR       MEAN  STD.ERR    T-TEST P<

Indoor Residence  All              1021.8     13.1     1005.3     14.4    .398?
                  Sleeping          517.6      6.7      494.4      6.9    .016?
                  Other             504.2     13.1      510.9     13.4    .7215
Indoor Other      All               286.5     12.6      292.2     13.6    .7601
                  Office            138.7     12.1      203.8     13.4    .000?
                  Other             147.8     10.0       88.4      7.3    .0001
Outdoor           All                23.9      3.6       20.9      3.3    .5337
Transit           All               107.8      5.0      121.6      6.1    .0790
                  Combustion         95.4      4.8       99.0      5.6    .6345
                  Non-Combustion     12.3      2.0       22.7      3.2    .0060

? = digit illegible in the source.
                                      4-16

-------
      [Figure 8c: bar chart of minutes per 24 hours by location (INDOTH,
      INDRES, OUTDOR, TRNSIT) for Denver and Washington; legend: PRIMARY,
      SECONDARY.]

      FIGURE 8c - Comparison of 45-59 Age Group Time Budgets
TABLE 8c - Comparison of 45-59 Age Group Time Budgets by City

                                       Denver         Washington
                                       n = 109         n = 127
LOCATION          ACTIVITY           MEAN  STD.ERR       MEAN  STD.ERR    T-TEST P<

Indoor Residence  All              1066.7     23.6     1025.9     22.2    .2080
                  Sleeping          516.7     10.4      491.9      8.6    .064?
                  Other             550.0     22.8      534.0     20.4    .6011
Indoor Other      All               258.0     21.6      283.2     20.6    .399?
                  Office            124.7     19.1      180.9     19.4    .039?
                  Other             133.3     16.4      102.3     12.5    .128?
Outdoor           All                24.8      5.3       17.5      3.5    .2370
Transit           All                90.4      7.2      113.4      8.9    .0508
                  Combustion         81.2      7.3       99.2      8.7    .1103
                  Non-Combustion      9.3      2.4       14.2      3.3    .2446

? = digit illegible in the source.
                                      4-17

-------
      [Figure 8d: bar chart of minutes per 24 hours (axis ticks 0-1400) by
      location (INDOTH, INDRES, OUTDOR, TRNSIT) for Denver and Washington;
      legend: PRIMARY, SECONDARY; labeled bars: 1208 (Denver), 1153
      (Washington).]

      FIGURE 8d - Comparison of 60-70 Age Group Time Budgets
TABLE 8d - Comparison of 60-70 Age Group Time Budgets by City

                                       Denver         Washington
                                       n = 72          n = 65
LOCATION          ACTIVITY           MEAN  STD.ERR       MEAN  STD.ERR    T-TEST P<

Indoor Residence  All              1207.6     20.6     1153.2     25.7    .0981
                  Sleeping          488.9     14.9      480.4     13.9    .6770
                  Other             718.7     24.6      672.8     25.1    .1945
Indoor Other      All               133.2     16.9      164.1     23.2    .2784
                  Office             31.3     14.1       82.0     19.7    .0350
                  Other             102.0     11.4       82.0     14.3    .2773
Outdoor           All                14.6      3.4       26.6      6.6    .1004
Transit           All                84.4      8.1       96.1      9.1    .3669
                  Combustion         67.5      7.1       71.6      9.2    .7227
                  Non-Combustion     16.9      4.0       24.5      5.4    .2605
                                     4-18

-------
activities  appear  to  be  active  and  passive  smoking  and  exposure  to
gasoline.  The TEAM study,5 though not designed to provide activity pattern
data,  can  provide  seasonal  and  geographical  information  for  benzene
exposure modeling.

      From this  investigation  comes  the concept  of  a  national  activity
patterns database.   Information from extensive activity  pattern  research,
as reviewed by Ott,6 including time use studies by Robinson et al.,7 can be
utilized to build  a  national  activity patterns  database.   It can  form the
basis  for  sampling  activities  for exposure  simulation  modeling for  any
given city.

      For  some specific  pollutants,   however,   additional  modeling  for
particular  microenvironments will  be  required.     For   example,   special
modeling efforts will be required to accommodate active and passive smoking
activities.   Finally, background concentrations of  some pollutants  must  be
estimable for different geographic regions by season and  climate,  in order
to realistically simulate  exposure  to  those pollutants.
                                   4-19

-------
                           REFERENCES

 1.  Johnson, T., "A Study of Personal Exposure to Carbon Monoxide in
     Denver, Colorado," Report No. EPA-600/4-84-014.  NTIS No.
     PB84-146125.  U.S. Environmental Protection Agency, Research Triangle
     Park, NC, 1984.

 2.  Johnson, T., "A Study of Personal Exposure to Carbon Monoxide in
     Denver, Colorado," Paper No. 84-121.3 presented at the 77th Annual
     Meeting of the Air Pollution Control Association, San Francisco, CA,
     June 1984.

 3.  Hartwell, T.D., C.A. Clayton, R.M. Michie, Jr., R.W. Whitmore, H.S.
     Zelon, S.M. Jones, and D.A. Whitehurst, "A Study of Carbon Monoxide
     Exposure of Residents of Washington, D.C., and Denver, Colorado,"
     Report No. EPA-600/S4-84-031.  NTIS No. PB84-183516.  U.S.
     Environmental Protection Agency, Research Triangle Park, NC, June
     1984.

 4.  Hartwell, T.D., C.A. Clayton, R.M. Michie, Jr., R.W. Whitmore, H.S.
     Zelon, and D.A. Whitehurst, "A Study of Carbon Monoxide Exposure of
     the Residents in Washington, D.C.," Paper No. 84-121.4 presented at
     the 77th Annual Meeting of the Air Pollution Control Association, San
     Francisco, CA, June 1984.

 5.  Wallace, L.A., "The Total Exposure Assessment Methodology (TEAM)
     Study: Summary and Analysis: Volume I," Report No. EPA-600/6-87-002a,
     June 1987.

 6.  Ott, W.R., "Human Activity Patterns: A Review of the Literature for
     Estimation of Exposure to Air Pollution," presented at the Research
     Planning Conference on Human Activity Patterns, U.S. Environmental
     Protection Agency, Environmental Monitoring Systems Laboratory, Las
     Vegas, NV, May 10-12, 1988.

 7.  Robinson, J., and J.M. Holland, "Trends in Americans' Use of Time:
     Some Preliminary 1975-1985 Comparisons," Draft Report to the Office
     of Technology Assessment, U.S. Congress.  Survey Research Center,
     University of Maryland, May 1986.
                              4-20

-------
    A COMPARATIVE  EVALUATION OF SELF-REPORTED AND INDEPENDENTLY-OBSERVED
         ACTIVITY  PATTERNS  IN AN AIR POLLUTION HEALTH EFFECTS STUDY

                  by:    Thomas  H.  Stock  and Maria T. Morandi
                        University of Texas
                        Health  Science Center at Houston
                        School  of  Public Health
                        P.O.  Box 20186
                        Houston, Texas 77225
                                 ABSTRACT

      As part of  a  community-based  air pollution health effects study,  29
asthmatics residing  in two Houston neighborhoods  participated  in a  personal
monitoring project.    Participants maintained personal log  forms indicating
their  presence  in  one or  more of  seven  major  microenvironments  (ME's)
during each clock hour.   These  self-recorded activities were  compared  with
independent  observations  of  the  subjects'  activities  by  technicians
performing personal  air pollution monitoring.  Considering the  independent
observations  as  "truth,"  location  misclassification  error  rates  were
calculated,  and  the  effects  of  age,   gender,  resident  neighborhood,
complexity of activity,  and training  on these error rates were examined.
Of those variables  considered,  complexity  of activity in an  hour  appeared
to have the greatest effect  on error.
                                    5-1

-------
                               INTRODUCTION

      Accurate assessment  of human  exposure to  air pollutants  requires
either  direct measurement  by means  of personal  monitoring  (PM)  or  an
indirect  approach whereby  personal  time  and  location  information  are
combined with pollutant monitoring in relevant microenvironments  (ME's)  to
estimate  personal  exposure  (1-3).   Direct  PM  is  performed  relatively
infrequently,  because of limitations imposed  by  such considerations  as
cost,  manpower,   availability  of appropriate  monitoring  devices,   and
acceptance  by study  subjects (4).   Although  the  indirect approach  to
exposure assessment is generally more feasible,  it  requires  both  accurate
and appropriate ME pollutant monitoring  as  well  as reliable records  of the
temporal  and spatial aspects of individual daily activity patterns  (2).
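
      The indirect approach amounts to a time-weighted average of the
concentrations measured in each microenvironment.  A minimal sketch follows;
the function name and all time and concentration values are hypothetical and
serve only to illustrate the arithmetic.

```python
def time_weighted_exposure(minutes, concentrations):
    """Average exposure = sum(c_i * t_i) / sum(t_i) over microenvironments."""
    total_time = sum(minutes.values())
    weighted = sum(concentrations[me] * t for me, t in minutes.items())
    return weighted / total_time

# Hypothetical 12-hour daytime record (minutes) and ME concentrations (ppb)
minutes = {"indoor_home": 480, "outdoor": 120, "vehicle_closed": 120}
concentrations = {"indoor_home": 10.0, "outdoor": 60.0, "vehicle_closed": 20.0}

exposure = time_weighted_exposure(minutes, concentrations)
print(f"{exposure:.1f} ppb")
```

An error in the recorded time for any ME propagates directly into this
estimate, which is why the reliability of the activity record matters.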

      The distributions of  average times spent by individuals  in various
ME's have been  reported  from time-budget  surveys  (5,6),  as well  as  from
some investigations of air pollutant  exposures (7-11).  While these studies
have provided much valuable information on relative exposure times of
populations  in different ME's, as  well  as  a relatively consistent pattern
of average time distributions  among various  population groups, the reported
data do  not address some critical  issues  in  the application of  activity
patterns to estimating personal  pollutant exposures.  For example,  it  is
necessary to obtain  information  on  individual  variability in  activity
patterns, distributions of  location by time, and reliability of  recorded
activities.

      The objective  of  this  investigation is  to  examine  the  issue  of
reliability of self-reported  activity patterns.  As part  of a  community-
based air pollution health effects study (4,12,13),  29 asthmatics residing
in two Houston communities participated  in  a personal monitoring project  in
which study subject activity patterns were both self-recorded on daily log
forms and independently observed  and recorded by technicians  operating the
monitoring  instruments.   The  accuracy of individual self-reported activity
logs  was evaluated  considering   the  technician-reported  activities  as
"truth."
                                  METHODS

PROTOCOL

      Of the 51 study subjects in the six-month  (May-October, 1981) Houston
Area Asthma Study,  30  agreed  to  participate  in  a personal  monitoring (PM)
project.     Each  participant  was  accompanied  by  two  air  monitoring
technicians  carrying  portable  analyzers  for  ozone  and  respirable
particulate mass  during  one or two daytime periods  (between 7 A.M.  and 7
P.M., CDT).   Detailed  observations  of the subject's location  and activity
were recorded  by the  technicians.    The  participants were not  generally
aware that  their activities were being  recorded.   Because of  instrument
failure, one  volunteer was not monitored; 21 participants  were  monitored
during two daytime periods  while  8 subjects were monitored once.
                                    5-2

-------
  All asthma  study participants  maintained  personal  activity  log forms
indicating their presence in one or more  of seven  major ME's  (three  indoor,
two  outdoor,  and two  transportation-mode) during  each  clock hour.   The
final daily  activity  log design was  selected after pretesting  it with a
subgroup of the asthma study participants  prior to the field  portion of the
investigation.    All   study participants were  given  a  pre-health-study
training course which included a trial practice on  recording activities on
the  log forms.   The  personal  logs  were maintained  on all  study days,
whether or not PM occurred, and were normally completed at the end  of each
12-hour period (daytime or nighttime).   The daytime  version of  the  activity
log  is  shown  in  Figure  1.   The subjects were instructed to  indicate their
presence in one or more ME's at any given hour by drawing a horizontal line
within  the  appropriate box.  No  attempt was made  to  estimate a temporal
resolution of less than one hour.   The dated  daily  activity  log  forms were
provided in batches sufficient to  cover  10 days of the  field  portion of the
health  study at a time.  All participants in the  asthma  study paid  a weekly
visit to a field office where study staff reviewed the week's activity logs
with them.   Subjects  were  questioned  about missing  or  apparently incorrect
records which  were  sometimes revised based  on subject  verification.  New
activity  log forms were  also provided during the  visits.   These weekly
meetings  also helped  maintain  the motivation  level of  the  participants
throughout the study.

DEMOGRAPHIC DESCRIPTION

  Selected demographic characteristics of the PM  study  group  are  summarized
in Table 1.  There was approximately an  even distribution of  study  subjects
by both study area (site) and 
-------
     FIGURE 1.  DAYTIME VERSION OF DAILY ACTIVITY LOG

                                                     HOURS
                                      morning       noon        evening
PLACE                                 7  8  9 10 11  12   1  2  3  4  5  6  7

INDOORS    HOME
           SCHOOL OR WORK
           ELSEWHERE
OUTDOOR    IN NEIGHBORHOOD
           OUT OF NEIGHBORHOOD
IN OPEN CAR, TRUCK OR BUS
IN CLOSED CAR, TRUCK OR BUS

-------
TABLE 1. SELECTED DEMOGRAPHIC CHARACTERISTICS OF THE STUDY GROUP

Study Area       CL        SS
                 15        14

Age, yrs.        Mean      Median    Range
                 17.9      12        7-54

Sex              Male      Female    Total
  ≤ 18 yrs.      13         8        21
  > 18 yrs.       1         7         8
  Total          14        15        29
                              5-5

-------
calculating  the frequency of this occurrence in the  total  number  of valid
hours  (561  hours).   Second,  error rates for  each  ME were calculated  by
assigning  a  value  of   1  if  a disagreement  occurred  in  any  hour  (0
otherwise),  and calculating the frequency of this occurrence  for  each  ME.
A more refined categorization  of ME error rate  was  obtained by assigning a
value of -1  to errors  of commission  (i.e., the participant indicated he was
present in  an  ME  when  he was not), a  value of 0 if both  participant  and
technician reported  that the subject was not present in an ME,  a value of 1
for error of omission  (i.e.,  the participant reported not to be present  in
an ME  when  he  actually was),  and a  value of 2 if both  technician  and
participant reported   presence in  an  ME.    Finally,   major  relevant
misclassification errors  (i.e.,  reporting  to  be  indoors when  actually
outdoors, or  to be  in  a vehicle with open windows  when the  vehicle  had
closed windows) were also calculated.
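
      The coding scheme described above can be sketched as follows.  This is
an illustration under stated assumptions rather than the authors' SPSS-X
procedure: the ME identifiers, the function names, and the three example
hours are hypothetical, and the per-ME error rate is assumed to use the total
number of valid hours as its denominator.

```python
ME_NAMES = [
    "indoor_home", "indoor_school_work", "indoor_elsewhere",
    "outdoor_in_neighborhood", "outdoor_out_of_neighborhood",
    "vehicle_open", "vehicle_closed",
]

def code_hour(reported, observed):
    """Four-level code per ME for one clock hour.

    -1 = commission (reported present, observed absent)
     0 = both absent
     1 = omission (reported absent, observed present)
     2 = both present
    """
    codes = {}
    for me in ME_NAMES:
        r, o = me in reported, me in observed
        if r and o:
            codes[me] = 2
        elif r:
            codes[me] = -1
        elif o:
            codes[me] = 1
        else:
            codes[me] = 0
    return codes

def error_rates(hours):
    """hours: list of (reported_set, observed_set), one pair per valid hour."""
    n = len(hours)
    # Overall error: any disagreement in any ME during the hour
    overall = sum(rep != obs for rep, obs in hours) / n
    # Per-ME error: disagreement for that ME (assumed denominator: all hours)
    per_me = {
        me: sum((me in rep) != (me in obs) for rep, obs in hours) / n
        for me in ME_NAMES
    }
    return overall, per_me

# Hypothetical example: three valid hours
hours = [
    ({"indoor_home"}, {"indoor_home"}),                             # agreement
    ({"indoor_home"}, {"indoor_home", "outdoor_in_neighborhood"}),  # omission
    ({"vehicle_open"}, {"vehicle_closed"}),                         # both error types
]
overall, per_me = error_rates(hours)
print(overall, per_me["vehicle_open"], code_hour(*hours[2])["vehicle_closed"])
```

Keeping the technician record as a set per hour makes the strict overall
criterion (agreement in all ME's) a single set comparison.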

      The  error  rates  were  evaluated   in  terms of  classification
characteristics of  the  study  participant:  sex,  area of  residence (site),
age, date (i.e., first versus second day of personal monitoring), and the
number of ME's visited  during  each  hour.   All  computations  and statistical
analyses  were  performed using  the Statistical  Package  for the  Social
Sciences  SPSS-X,  version  2  (15).   Tests  of  statistical difference  were
considered significant at p < 0.05.


                                 RESULTS

      A summary of overall error rates for several potentially relevant
participant classification categories, based on 561 hours of comparison
data, is presented in Table 2.

      The summary  overall  error  rate for  all  subjects  independent  of
classification  category  was  35.5% (or  expressed  as  overall  agreement,
64.5%).   Overall  error  rates  for  individuals ranged  from 0% for a subject
who remained inside  her  home throughout two  complete monitoring periods,  to
80% for a 7-year-old  participant,  who was  the youngest  subject in the  PM
study.   Chi-squared analysis  for  each stratification variable indicated
significant (p <  0.01)  differences in overall error rates  for gender and
the number of  ME's visited during a  given hour (# ME's/hr; as independently
observed  by the  technicians),  and for  age category  and  monitoring  day
(date; for  subjects with two  PM days)  variables (p < 0.05).   Females  had
lower overall   error rates  than  males;  adults  had  lower  rates  than younger
subjects.   There was   no  statistical  difference in  overall   error  rates
between SS participants and CL participants.
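
      For the two-category variables, the p-values can be recovered from the
chi-squared statistics reported in Table 2.  The sketch below uses the
identity for one degree of freedom, p = erfc(sqrt(chi-squared / 2)); the
function name is hypothetical, and the statistics are the rounded values
printed in the table.

```python
import math

def chi2_p_df1(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Chi-squared statistics reported in Table 2 (each with D.F. = 1)
table2 = {"Sex": 9.1, "Age": 4.1, "Site": 0.4, "Date": 4.8}
p_values = {name: chi2_p_df1(x) for name, x in table2.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.3f}")
```

Because the published statistics are rounded to one decimal place, the
recomputed values can be expected to agree with the tabulated p-values only
approximately.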

      Summaries  of  individual  ME  error  rates  (i.e.,   agreement  vs.
disagreement)  are  presented in Tables  3-5 for indoor,  outdoor,  and vehicle
environments,   respectively.    As shown  in  these tables,  the  pattern  of
participant classification variables with  significant  differences  in  ME
error rates  differs  from the  overall rates and varies according to ME.  The
number of ME's visited  in an hour shows the most consistent effect, with a
highly significant (p  <  0.001) positive correlation  between number of ME's
and error rates for all  environment locations.   Gender differences  were

                                    5-6

-------
TABLE 2. SUMMARY OF OVERALL ERROR RATES BY PARTICIPANT CATEGORY

Variable     Category    Error rate (%)    Chi-squared    D.F.    p

Sex          Females          29.1             9.1          1     0.003
             Males            41.6
Age          ≤ 18 yr          38.5             4.1          1     0.042
             > 18 yr          29.5
Site         CL               34.1             0.4          1     0.520
             SS               37.1
Date*        1st              41.4             4.8          1     0.028
             2nd              30.9
# ME's/hr    1                14.1           248.8          3     0.000
             2                80.4
             3                82.0
             4               100.0

Summary                       35.5

Chi-squared significant at p < 0.05
*  Includes participants monitored during two different dates
   (447 hours)
                                    5-7

-------
    TABLE 3. SUMMARY OF INDOOR ME ERROR RATES BY PARTICIPANT CATEGORIES

                                     Mean Error Rate (%)
Variable     Category    Home           School/work    Elsewhere

Sex          Female       9.1 **          1.5 ***      NS
             Male        18.2             8.0
Age          ≤ 18 yr     NS              NS            NS
             > 18 yr
Site         CL          10.3 *          NS            NS
             SS          17.8
Date         1st         22.0 **         NS            17.8 ***
             2nd         10.5                           4.7
# ME's/hr    1            4.7 ***         2.1 ***       3.9 ***
             2           29.5             6.3          17.0
             3           44.0            20.0          36.0
             4           28.6            14.3          14.3

Summary                  13.7             4.8           9.6

***  p < 0.001
**   p < 0.01
*    p < 0.05
NS   not significant
                                     5-8

-------
TABLE 4. SUMMARY OF OUTDOOR ME ERROR RATES BY PARTICIPANT CATEGORIES

                                   Mean Error Rate (%)
Variable     Category    In neighborhood    Out of neighborhood

Sex          Female      NS                 NS
             Male
Age          ≤ 18 yr     18.3 **            NS
             > 18 yr      8.4
Site         CL          NS                  3.3 *
             SS                              7.7
Date         1st         NS                 NS
             2nd
# ME's/hr    1            5.7 ***            2.6 ***
             2           31.3               12.5
             3           32.0               12.0
             4           78.6                0.0

Summary                  15.0                5.3

***  p < 0.001
**   p < 0.01
*    p < 0.05
NS   not significant

                                 5-9

-------
TABLE 5. SUMMARY OF VEHICLE ME ERROR RATES BY PARTICIPANT CATEGORIES

                                   Mean Error Rate (%)
Variable     Category    Opened windows     Closed windows

Sex          Female      NS                  9.5 ***
             Male                            2.4
Age          ≤ 18 yr     NS                 NS
             > 18 yr
Site         CL          NS                  8.6 **
             SS                              2.7
Date         1st         NS                 NS
             2nd
# ME's/hr    1            2.9 ***            3.1 ***
             2           18.8               11.6
             3           48.0                6.0
             4           50.0               35.7

Summary                  11.4                5.9

***  p < 0.001
**   p < 0.01
*    p < 0.05
NS   not significant

                                5-10

-------
significant for three  of  the ME's,  with error rates  for  males  higher for
two indoor ME's,  and rates  for  females higher for the closed vehicle ME.   A
similarly  inconsistent pattern was observed  for  study area  (site).   One
indoor,  one outdoor, and one vehicle ME exhibited significant differences,
with  error  rates  of  SS  participants  higher  for  two  of  the  three
environments.   A significant  effect of monitoring day was apparent for only
two of the indoor ME's, with higher  error  rates  on  the first day for both
of these environments.   A  significant difference in error rate based on age
category was  found for only one ME, with the younger group showing a higher
mean error rate.

      Summary  error rates  for ME's ranged from 4.8% to 15.0%,  with higher
rates for  indoor  and  outdoor location  categories  associated  with the ME's
occupied most frequently ("home" for the  indoor  and  "in neighborhood" for
the outdoor categories).   When  these  summary  ME  error rates  are sorted by
error type, i.e.,  omission (+1) or commission (-1),  the  results indicate
that,   for six  of  the  seven  ME's,   rates  of error  of  omission  are
approximately  twice as high as  rates  of error  of  commission.    Only for
closed vehicles is this rate  ratio reversed.

      In  order  to ascertain the contribution of each  of  the participant
classification  variables  to the variance  in  overall   and ME error rates,
ANOVA  procedures with  the classic  experimental  approach were used.   A
summary of the results of these analyses,  including significant   (p < 0.05)
2-way  interactions,  is presented in  Table  6.  The amount of variance in
error rates explained by  the participant classification  variables ranged
from a low of 22.0% for the "outdoors/out of neighborhood" location to a
high of 55.5% for "vehicle/opened windows."  With the exception of the
"outdoors/out of neighborhood" and "vehicle/closed windows" locations,
statistically significant main effects accounted for over half of the
variance  explained by  the  respective models.   For the overall error rate,
the number of  ME's  visited during  an hour accounted for approximately 84%
of  the  total   variance  explained  by  the  model,   as indicated  by  the
corresponding  adjusted beta-squared,  while  sex,  date,  and interactions
between sex and the number of ME's visited per hour, sex and date, sex and
site,  and age and  the number of ME's  visited  per hour  contributed only
marginally.
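
      The share of explained variance attributed to the number of ME's
visited per hour is simply the ratio of its adjusted beta-squared to the
total variance explained by the model.  The short sketch below reproduces
that arithmetic; the dictionary layout and labels are illustrative, and the
values are as read from Table 6.

```python
# (total variance explained by model, adjusted beta-squared for # ME's/hr),
# both in percent, as read from Table 6
table6 = {
    "Indoors/home": (27.6, 5.3),
    "Indoors/school-work": (38.1, 1.2),
    "Indoors/elsewhere": (38.5, 28.2),
    "Outdoors/in neighborhood": (51.6, 32.5),
    "Outdoors/out of neighborhood": (22.0, 6.2),
    "Vehicle/opened windows": (55.5, 46.2),
    "Vehicle/closed windows": (36.0, 9.6),
    "Overall": (44.3, 37.2),
}

# Percentage of the explained variance attributable to # ME's/hr
shares = {me: 100.0 * beta / total for me, (total, beta) in table6.items()}
for me, share in shares.items():
    print(f"{me}: {share:.0f}%")
```

The overall ratio, 37.2 / 44.3, gives the figure of approximately 84% quoted
in the text.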

      The number of ME's visited per  hour was  also the  major  contributor to
error rate (from a low of  27% to a maximum of  83% of the variance explained
by  the respective  models)  for all  individual   ME locations,  with  the
exception of "indoors/home" and the  "indoors/school-work" categories, where
date  of monitoring  was more important.  Date was a contributing variable
only  to  the  three  indoor ME  and overall  error rates,  excluding  2-way
interactions.   Age  of the participant was  important  in determining error
rates in  outdoor  environments,  as well  as  the "vehicle/closed window" ME,
again not  considering the interaction terms.  Gender  of  the participants
was  significant for  explaining  the  "indoor/home,"  "indoor/school-work,"
"outdoor/in  neighborhood,"  and overall  error  rates,  as  was site  for
"outdoors/in  neighborhood"  and  the  two vehicle categories.    In most cases,
all contributions aside from the "#  ME's/hr"  contributed relatively little
to overall variance.   It  is  important to note that since a  nonorthogonal

                                   5-11

-------
TABLE 6.  SUMMARY OF ANALYSIS OF VARIANCE RESULTS FOR OVERALL AND ME
          CATEGORY ERROR RATES

                                 Total variance
                                 explained by    Main         Adjusted       Contributing 2-way
           ME                    model (%)       effects      beta-squared   interactions

Indoors    Home                     27.6         Date            10.9        Sex: age, site
                                                 # ME's/hr        5.3        Age: date
                                                 Sex              2.0

           School/work              38.1         Date            22.1        Sex: age, # ME's/hr
                                                 Sex              1.4        Site: age
                                                 # ME's/hr        1.2        Age: date

           Elsewhere                38.5         # ME's/hr       28.2        Sex: site
                                                 Date             2.2        Site: age
                                                                             # ME's/hr: date

Outdoors   In neighborhood          51.6         # ME's/hr       32.5        Sex: age
                                                 Age              2.0        Site: date
                                                 Sex              2.0        Age: date, # ME's/hr
                                                 Site             0.01

           Out of neighborhood      22.0         # ME's/hr        6.2        Sex: date, age
                                                 Age              1.2        Site: # ME's/hr
                                                                             # ME's/hr: date

Vehicle    Opened windows           55.5         # ME's/hr       46.2        Sex: # ME's/hr
                                                 Site             0.001      Site: age, # ME's/hr, date
                                                                             Age: # ME's/hr

           Closed windows           36.0         # ME's/hr        9.6        Sex: # ME's/hr
                                                 Age              3.2        Site: date, # ME's/hr, age
                                                 Site             2.8        Age: # ME's/hr
                                                                             # ME's/hr: date

Overall                             44.3         # ME's/hr       37.2        Sex: # ME's/hr, date, site
                                                 Sex              1.2        Age: # ME's/hr
                                                 Date             0.6


-------
analysis approach was used, it is possible for two variables to contribute
to  variance  through interactions,   even  if they  are  not  independent
contributors.

      During  12  of  the  561  hours  reported,   the  subjects  totally
misclassified  the outdoor/indoor  location,  while  4 of 99 hours reported as
having  been  in  a vehicle  were misclassified  within the category  (i.e.,
opened windows vs. closed windows and vice versa).


                                DISCUSSION

      The data analysis  results indicate that both overall and ME errors of
self-reported  location were not random,  but were influenced  by some of the
characteristics  of the study population.   It is also important to indicate
that the design  of the study and  the quality  assurance measures adopted for
the self-reported data tended to reduce disagreements between observed and
reported activities.   The  overall  error is  the  most stringent measure of
error calculated, because  of the strict requirement for agreement in all
ME's  for  each   hour.   According  to this  error measure,  the  study
participants  made an ME misclassification 35.5% of the time.   From the
point of view of  exposure  assessment, some of  these  misclassifications may
not  be  crucial   for  certain  pollutants   (e.g.,   ozone  exposure  in  two
different air-conditioned indoor microenvironments),  but the overall error
rate attributed equal weights to errors in any ME.  However, the
associations of overall and ME error rates with certain characteristics of
the study population are indicative of the type of variables which should
be considered when designing activity log forms.

      Although a  positive   association  between error rate  and increasing
number  of microenvironments visited during an hour was hypothesized, since
accuracy of recall would be expected to decrease  with increasing complexity
of activities,  it was somewhat surprising that  it  was  the  major and most
consistent effect of  all the  category variables considered.   When only one
ME was  visited,   the overall error  rate  was as  low as  14.1%.   The rate
increased dramatically  to  over 80% when two or  more ME's were visited in
the  same hour,  and  to  100% when four  ME's  were visited within  the same
hour.   This result has  a direct  bearing on the design of activity diaries,
and needs to be  studied further.

      Another unexpected  result was the  weak association  between error
rates  and  the neighborhood  of  residence  of the  study  participants,  also
used  as a  surrogate  indicator  of  SES.   During  the field  portion  of the
study,  the  staff had reported  more difficulties  in obtaining compliance
with  protocols on the part of the SS participants.   This observation was
not reflected in  the  SS-CL  comparison of observed and reported activities,
probably due to the requirement that the study participants meet weekly with
the study personnel at a field office in each of the communities.  If the
participants did not report on the scheduled date, the staff made phone
contact and, if needed, a home visit.
                                   5-13

-------
      An individual's personal monitoring days were typically separated by
a month  or more.    Lower  error rates  for the  second day  of  monitoring
suggest an effect of training on the accuracy of reporting.   In part, this
effect could also be due to  the  intensive  interaction  between field  office
personnel  and subjects.

      Age and sex,  both  independently  and  interactively,  had an effect on
some of the error rates.  Overall, the females were more  accurate than the
males.  This effect occurred even after adjusting  for  age.   (The males were
mostly in the younger age group  which  had  larger error rates.)   The older
participants  also had  lower error  rates,  even  after adjusting  for the
number  of  microenvironments  visited per hour,  which was  larger  for the
younger group of participants.  Field technicians  had  reported difficulties
with communicating instructions to at  least one  of the younger members of
the study population.  They acted promptly  in  reporting and  correcting this
problem by increasing communication with both the  subject and his parents.
This kind of intervention probably resulted in reduced error  rates for some
of the categories  of participants.  It  is important  to indicate that the
effects of sex  and  age may only be  specific to  this study due  to the
particular age distribution  of the  participants.

      The participants were more likely to report that they were not in an
individual  ME when they  actually were  (omission),  than  to record that they
were  in a  particular   ME  when  they  were   not   (commission).    Extreme
misclassification  of location  (i.e.,  that which could  probably  lead to
gross  errors in  exposure  estimates)  occurred  very  infrequently.    The
participants reported being in an indoor environment while outdoors, and
vice versa, during only 2.3% of the hours observed.  Also, they reported
vehicle windows as closed when they were actually open, or vice versa,
during 4% of the total vehicular hours observed.

      The  participant classification  variables   included  in this  study
accounted  for  at  most  55.5%  of  the  error  variance  for  any  of  the
environment categories considered.   It  is possible  that other variables not
taken into consideration  in  this  investigation also affected error rates.

CONCLUSIONS

      The results  of this  study indicate that significant  location  errors
can  occur  with  self-reported activity  data, and that  these  errors are
linked  in  part  to  some  of  the  characteristics  of the study population.
Error rates were most affected by the number  of environments  visited  by the
study  participants  during   any  given  hour,  probably  associated with
decreased effectiveness  of recall  as individual activities increase.   This
variable needs  to be considered by  researchers,  since it could result in
increased  misclassification  of location and  errors  in exposure estimates
for  the more active subgroups of  their  study population.  The effects of
other variables such as gender, age,  and socioeconomic status (as indicated
by area of residence in  this  study) can probably  be controlled by intensive
and extensive communication  between the  study  participants and staff.



-------
      Although  this paper is based on data collected  in  a  project funded by
the  U.S.   Environmental  Protection Agency,   the  analysis  of the  data
discussed  in  this paper was not funded by that Agency.

      The  work described  in  this  paper was  not  funded  by the  U.S.
Environmental  Protection  Agency  and  therefore  the  contents  do  not
necessarily reflect the views  of  the Agency,  and no official endorsement
should be  inferred.

-------
                                REFERENCES

1.    WHO.  Estimating Human Exposure to Air Pollutants.  Offset
      publication No. 69.  World Health Organization, Geneva, 1982.

2.    Sexton, K. and Ryan, P.B.  Assessment of human exposure to air
      pollution: methods, measurements, and models.  In: A. Watson, R.R.
      Bates, and D. Kennedy (eds.), Air Pollution, the Automobile and
      Public Health: Research Opportunities for Quantifying Risk.  National
      Academy Press, Washington, DC, 1988.  p. 207.

3.    Spengler, J.D. and Soczek, M.L.  Evidence for improved ambient air
      quality and the need for personal exposure research.  Environ. Sci.
      Technol. 18: 268A, 1984.

4.    Stock, T.H.; Kotchmar, D.J.; Contant, C.F.; Buffler, P.A.; Holguin,
      A.H.; Gehan, B.M.; and Noel, L.M.  The estimation of personal
      exposures to air pollutants for a community-based study of health
      effects in asthmatics -- design and results of air monitoring.  JAPCA
      35: 1266, 1985.

5.    Chapin, Jr., F.S.  Human Activity Patterns in the City.  John Wiley
      and Sons, New York, NY, 1974.

6.    Robinson, J.P.; Converse, P.E.; and Szalai, A.  Everyday life in
      twelve countries.  In: A. Szalai (ed.), The Use of Time - Daily
      Activities of Urban and Suburban Populations in Twelve Countries.
      Mouton and Co., The Hague, 1972.  p. 114.

7.    Dockery, D.W. and Spengler, J.D.  Personal exposure to respirable
      particulates and sulfates.  JAPCA 31: 153, 1981.

8.    Fugas, M.; Sega, K.; and Sisovic, A.  Study of personal exposure to
      airborne respirable particles and carbon monoxide.  Environ. Monit.
      and Assess. 2: 157, 1982.

9.    Spengler, J.D.; Treitman, R.D.; Tosteson, T.D.; Mage, D.T.; and
      Soczek, M.L.  Personal exposures to respirable particulates and
      implications for air pollution epidemiology.  Environ. Sci.
      Technol. 19: 700, 1985.

-------
10.    Quackenboss,  J.J.;  Spengler,  J.D.;  Kanarek,  M.S.;  Letz,  R.;  and
      Duffy,  C.P.    Personal exposure to  nitrogen dioxide: relationship to
      indoor/outdoor air  quality and activity  patterns.   Environ.   Sci.
      Technol.  20:  775,  1986.

11.    Morandi,  M.T.  and  Stock,  T.H.   A comparative  study  of respirable
      particulate  microenvironmental  concentrations and personal  exposures.
      Environ.   Monit.   and  Assess.  In Press, 1988.


12.    Holguin,  A.H.; Buffler,  P.A.;  Contant,  C.F.;  Stock, T.H.; Kotchmar,
      D.J.;  Hsi, B.P.;  Jenkins, D.E.; Gehan, B.M.; Noel,  L.M.; and Mei, M.
      The effects  of ozone  on asthmatics  in  the Houston  area.   In: S.D.
      Lee (ed.),  Evaluation  of  the Scientific Basis for Ozone/Oxidants
      Standards.  Air Pollution  Control  Association,  Pittsburgh, PA,  1985.
      p.   262.

13.    Contant,  C.F.; Stock,  T.H.; Buffler,  P.A.; Holguin,  A.H.;  and  Gehan,
      B.M.    The  estimation of personal  exposures to air  pollutants for a
      community-based study of  health effects  in asthmatics  -- exposure
      model.  JAPCA 37:  587,  1987.

14.    Stock,    T.H.      Formaldehyde concentrations  inside  conventional
      housing.   JAPCA 37:  913,  1987.


15.    SPSS,   Inc.    SPSS-X  User's Guide,  2nd  edition.    McGraw-Hill,  New
      York,  NY, 1986.

-------
      ASSESSING ACTIVITY PATTERNS FOR AIR POLLUTION EXPOSURE  RESEARCH

                                     by

                    James H. Adair and John D.  Spengler
                Harvard University,  School  of Public Health
                           665 Huntington Avenue
                              Boston, MA  02115
                                 ABSTRACT

      Time/activity diaries are a means by which microenvironmental exposure
to pollutants can be assessed.  This paper presents two such diaries, used
in large scale field research studies, designed to identify human activity
patterns in relation to NO2 and/or PM2.5 exposure.  From these patterns,
estimates of human exposure can be obtained which are more sensitive than
those made at fixed central site monitoring locations.

      The  diaries  themselves  reflect  the  research  questions  and  the
technology  available at the  time  of  monitoring.    The  two  main parameters,
time and location,  vary according to the monitoring device chosen  and the
pollution source in question.  In  turn, each parameter reflects the  need to
make the diaries as simple and complete as possible for the target sampling
group.

      Results  from one of  the studies  indicate  regional,  seasonal,  and
day-of-the-week differences  in children's  activity patterns.   They appear
to  reflect  differences  in  climate,   urbanization,   and  other  factors.
Results from the second study are  not yet  available.

-------
    HOME ACTIVITY PATTERN USE IN TWO INDOOR AIR QUALITY  RESEARCH STUDIES


      Two major  air quality  research studies  at the  Harvard School  of
Public Health are  currently using  time/activity  diaries  to determine home
activity patterns,  microenvironments,  and human exposure.  One project, the
Harvard  Indoor  Air  Pollution  Health  Study,   employs  fixed  location
monitoring devices to measure NO2 and respirable particulates.  The
activity diaries are designed to assess time spent in microenvironments,
including those where monitors are located.  The Gas Research Institute
Project is actually two large-scale NO2 personal exposure studies.  For
these studies conducted in Boston  and Los  Angeles,  participants completed
time/activity diaries.   The diaries  for these  studies use time as the major
dimension to capture duration-of-stay within  microenvironments.   For each
of  these studies,   time/activity  is  recorded over  24-hour periods  that
correspond with  monitoring.

      While complete details  about these projects are  beyond  the  scope of
this  presentation,  this  paper does  provide  a  brief  description  of the
Harvard Indoor Air Quality Study.  In an attached appendix, the quality
assurance aspects of the Gas Research Institute study are presented.
Further details for the Harvard Indoor Air Pollution Health Study can be
found in Ferris et al. (1979) and Spengler et al.  For more information on
the Gas Research Institute Project, see Ryan et al. (1988a), Ryan et al.
(1988b), and Soczek et al. (1987; personal communication).


               THE HARVARD INDOOR AIR  POLLUTION HEALTH STUDY


      The Harvard  Air Pollution Health Study is a prospective epidemiologic
study involving about 20,000 people in six communities (Ferris et al.,
1979).  A component of this study is concerned with indoor air pollution
and respiratory effects.  By the end of 1988, approximately 1,800 children
in six cities will have been surveyed for daily respiratory symptoms and
monitored for outdoor/indoor NO2 and respirable particle exposure.  In
addition, time/activity data will have been collected for the final three
cities  in  the study: Portage, WI,  and  Steubenville, OH,  in  1986-87, and
Topeka,  KS, in 1987-88.   This  paper is based upon the sample of children in
those cities.  However,  the summary activity pattern results are based only
upon  Portage and  Steubenville,  where  activity data  collection  is  now
complete.



-------
AIR QUALITY MEASUREMENTS

      Based on previous studies (Spengler et al., 1985; and Quackenboss et
al., 1985), it has been demonstrated that indoor measurements of particles
and NO2 are predictive of personal exposures for nonoccupationally exposed
people.  Therefore, in designing this large, indoor air pollution health
study, we utilized the concept of microenvironmental monitoring to
establish an estimate of exposure for children.  Data to characterize these
exposures were gathered through monitoring devices and a time/activity
diary.  In addition, questionnaires were administered to describe home and
health characteristics.   Finally, a  daily health diary was  kept  for the
children involved in the study.

      NO2 measurements were made with fixed-location passive diffusion
tubes (Palmes et al., 1976).  The Harvard Aerosol Impactor (Turner et al.,
1984; Marple et al., 1987) was used to measure respirable particulates
less than 2.5 microns in size (PM2.5).  Passive diffusion water vapor tubes
(Girman et al., 1984) were also deployed, as was the Brookhaven National
Laboratory perfluorocarbon tracer system (Dietz et al., 1982) for measuring
air exchange rates.

      Integrated  pollution  monitors were placed  in the  child's  home,
outdoors,   and  in  the schools.   Temporal variations  were  assessed  by
repeated  measurements twice  in the winter and twice in  the summer.   A
subset of approximately 30 homes in each city was measured in each of four
seasons.  Particles were measured at multiple outdoor locations, the
activity room of each home, and in a single classroom in each school.
Timers controlled the particle samplers in the home (4 pm to 8 am on
weekdays, 24 hours otherwise) and school (8 am to 4 pm on weekdays).  This
corresponded to the projected activities in those locations.  NO2 was
measured in three rooms of the home, one location outside the home, and in
several classrooms per school.

      In three cities children completed a technician-administered
time/activity  diary covering 3 winter and  3 summer days.   The purpose of
this diary was to determine the proper weights  to  apply to microenvironment
concentrations.   A  home characteristics  questionnaire was administered to
describe the home environment  for each participant.   It included questions
about home  type  and setting,  heating systems and fuels,  cooking and water
heating  fuels,  ventilation,  participants  and smoking  patterns,  as well as
water  and  humidity.  A floor diagram of the house  was also  drawn so that
house volumes could be calculated.  The exact locations of all measuring
devices and pollution sources were noted on floor plans.  A daily health
diary was  self-administered to  record respiratory symptoms for the year in
which the study took place in a particular  city.   A  substudy also collected
indoor  and outdoor microbiological  specimens concomitant with  the other
data collected.  Finally,  yearly pulmonary  function measurements were made.

TIME/ACTIVITY MEASUREMENTS

      Time/activity  diaries  were  filled  out  twice  during  the  year
concomitant with source monitoring.  They were first presented and
partially filled  out  during  the  first or second of three home visits by the
field technician.   These visits  were designated as equipment setup, change,
and pickup visits.   During  the  setup  visit, sampling devices  were placed
and the home characteristics  questionnaire was administered.   The change
visit included collecting the first week's  samples  and  replacing  them for
the second week  sample.   The pickup visit  involved  the  collection of the
remaining samples  and the  monitoring equipment.

      Field technicians were  instructed to administer the  activity diary
during the setup visit if time allowed.  During the change visit, the
technician reviewed the portion of the activity diary filled out by the
participant.  If it was incomplete or incorrectly filled out, an additional
diary was administered.  However, in many cases, the diary was first
administered  during  the  shorter  change  visit due  to  the  lengthy  setup
session (one hour  or more).   Field technicians administered  a portion of
the diary to the participant on  a  24- to 36-hour recall basis.  This method
introduced the participant to the concept  and  level  of  detail  required to
complete the remaining  portion  of the diary.  Thus  diary  entries for the
remainder of the day plus the entire next day were filled out by the child
or  parent  (mother).    At the  next visit,  the  technician reviewed  the
activity diary for errors  and  resolved any problems or misrecordings.

ACTIVITY LOG MEASUREMENT

      We were interested  in understanding a child's  time/activity  patterns
in general.   Specifically,  we  wanted  to  know how much  time was  spent in
certain indoor locations.  The indoor locations of particular interest were
the monitored microenvironments.  (See Exhibit 1.)  Home location
categories were the child's bedroom, the kitchen, the activity/living room,
and other rooms in the house.  This level of detail was required because
kitchen/bedroom NO2 differences often exceed factors of two.  Nonhome
location categories were the school, outdoors, transportation (car/bus),
and other indoor environments.

      The time component  of the activity diary spanned  three  consecutive
24-hour time blocks.   The first day began at  midnight and extended to the
following midnight.   Days  two  and three  followed   in a similar  fashion.
Every hour had to be partitioned into activities.  A  minimum averaging time
of 15 minutes per category was acceptable  except for  transportation.  Since
school  children  were  often  in transit  for  durations  of  less   than  15
minutes,  we attempted to have  the  technicians make particular note of short
travel times.   Unfortunately,  this was  not  uniformly adhered to by all
technicians. The  person(s)  filling out the diary (child,  mother,  father,
etc.) and the  technician introducing  the activity  diary were  recorded on
the diary to test for bias at  a  later date.
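
      The hourly partitioning rule described above can be illustrated with
a small consistency check.  The Python sketch below is not part of the
study's procedures; the location codes and the function name check_hour are
hypothetical, and the only rules encoded are those stated in the text: each
hour is fully partitioned, and entries shorter than 15 minutes are accepted
only for transportation.

```python
# Hypothetical location codes; the diary's actual column labels may differ.
LOCATIONS = ("bedroom", "kitchen", "activity_room", "other_home",
             "school", "outdoors", "car_bus", "other_indoor")
SHORT_STAY_ALLOWED = {"car_bus"}  # transit may last less than 15 minutes
MIN_MINUTES = 15


def check_hour(minutes_by_location):
    """Return a list of problems found in one diary hour.

    minutes_by_location maps a location code to the minutes spent there.
    An empty list means the hour satisfies the rules sketched here.
    """
    problems = []
    for loc, minutes in minutes_by_location.items():
        if loc not in LOCATIONS:
            problems.append(f"unknown location: {loc}")
        elif minutes < 0 or minutes > 60:
            problems.append(f"{loc}: {minutes} min is outside 0-60")
        elif 0 < minutes < MIN_MINUTES and loc not in SHORT_STAY_ALLOWED:
            problems.append(f"{loc}: {minutes} min is below the minimum")
    total = sum(minutes_by_location.values())
    if total != 60:
        problems.append(f"hour totals {total} min, expected 60")
    return problems
```

      For example, an hour recorded as 50 minutes at school and 10 minutes
in the car/bus category passes, whereas 50 minutes at school and 10 minutes
in the kitchen is flagged.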

      Diaries were collected in  both the winter (December to April) and the
summer (May to August).    Winter data did not include time spent during the
Christmas holidays.   Summer  data included time  when some  children  were
still in school  (May  and part  of June).

-------
QUALITY ASSURANCE AND QUALITY  CONTROL

      Several quality assurance and  quality control  steps  were taken  to
insure the integrity of the data.   Quality  assurance  included:

1.    the use of  a three-consecutive-day  activity diary format  to  provide
      for replication and to capture day-of-the-week  differences,

2.    a technician demonstration  of  a 24-  to  36-hour recall  diary as  a
      "warm up"/introduction to the diary technique,

3.    limiting the number of locations and  keeping the  recording of  time  to
      15-minute or greater durations  to insure nonconfusing,  nonoverlapping
      alternatives, and

4.    standardized  training  of field   technicians  by  the  Harvard
      investigators.

      Quality  control was  accomplished through manual  and computerized
procedures. These are described briefly below.

1.    The initials of the person filling out the diary  were recorded.   This
      was used to  identify  whether the  child, mother,  father,  etc.,  filled
      out the form and to test for  respondent bias.

2.    The initials of the field technician  administering the activity diary
      were also recorded.  This allowed for a check of  technician bias.

3.    The  field  technician  reviewed  the completed form before  leaving  the
      site and made corrections as  needed.

4.    The  site field manager  reviewed all  diaries  for  accuracy  before
      returning them to the central office  for processing.

5.    The  central  office field coordinator reviewed each diary to  insure
      the number of minutes totalled  1440 per day.

6.    Site visits were made on a random basis by the site field manager and
      the  central  office  field  coordinator to observe  that proper  protocol
      was  being  followed.  If problems were found,  corrective  action  was
      taken and planned visits were made to insure compliance.

7.    The data entry  program  "SPSS Data Entry"  allowed a  screen image that
      resembled the diary itself.   This enhanced the  data entry  process.

8.    As data were entered, automatic value checks were made for day of the
      week (MON to SUN), month (JAN to SEP), adults' initials (CHILD, MOM,
      DAD, OTHER), and field tech initials.  Range checks were also
      automatically made to insure proper day of the month (1 to 31), year
      (87, 88) and time/location (>= 0 and <= 60).

-------
9.    Ten percent  of the forms  were verified  for  data entry  errors.   A
      total  accuracy  of 99% was  found  before correction.   No systematic
      errors were observed.

10.   Location totals were recalculated by  computer  program.   All  diaries
      which did  not total  1440  minutes were rechecked and  corrected as
      necessary.
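
      The automatic value checks, range checks, and 1440-minute total
described in items 8 and 10 can be sketched as follows.  This is an
illustration in Python, not the "SPSS Data Entry" configuration actually
used; the record field names are hypothetical, the permitted values are
copied from item 8, and the technician check assumes only that the initials
field is non-empty.

```python
VALID_DAYS = {"MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"}
# Month range as stated in the text (JAN to SEP).
VALID_MONTHS = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN", "JUL", "AUG", "SEP"}
VALID_RESPONDENTS = {"CHILD", "MOM", "DAD", "OTHER"}
VALID_YEARS = {87, 88}
MINUTES_PER_DAY = 1440


def check_header(record):
    """Return the names of header fields that fail a value or range check."""
    errors = []
    if record["day_of_week"] not in VALID_DAYS:
        errors.append("day_of_week")
    if record["month"] not in VALID_MONTHS:
        errors.append("month")
    if record["respondent"] not in VALID_RESPONDENTS:
        errors.append("respondent")
    if not record["technician"].strip():  # assumption: any non-empty initials
        errors.append("technician")
    if not 1 <= record["day_of_month"] <= 31:
        errors.append("day_of_month")
    if record["year"] not in VALID_YEARS:
        errors.append("year")
    return errors


def needs_recheck(minutes_by_location):
    """True when a diary day's location totals do not sum to 1440 minutes."""
    return sum(minutes_by_location.values()) != MINUTES_PER_DAY
```

      A diary day flagged by needs_recheck would be returned for manual
review, in the spirit of item 10.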

USES

      The  time/activity  diaries will  be  used  in  connection  with  the
monitoring  results to calculate weighted total  exposures  for individuals
and to apply to exposure simulation  models.   The general  approach  is one of
time-weighted average concentrations summed over microenvironments (Fugas,
1975; Duan, 1982).  The specific technique is presented in Letz et al.
(1984).  Initial results using Portage and Steubenville activity data are
presented elsewhere (Spengler et al., 1987).
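
      The general time-weighted approach can be written as
E = sum(c_j * t_j) / sum(t_j), where c_j is the concentration measured in
microenvironment j and t_j is the time spent there.  The Python sketch
below illustrates only this general form; it is not the specific technique
of Letz et al. (1984), and the function name and the numbers in the usage
note are hypothetical.

```python
def time_weighted_exposure(concentrations, minutes):
    """Time-weighted average concentration over microenvironments.

    concentrations: location code -> average concentration at that location
    minutes:        location code -> minutes spent at that location
    """
    total_minutes = sum(minutes.values())
    if total_minutes <= 0:
        raise ValueError("no time recorded")
    # Every location where time was spent needs a measured concentration.
    missing = [loc for loc, t in minutes.items()
               if t > 0 and loc not in concentrations]
    if missing:
        raise ValueError(f"no concentration for: {missing}")
    weighted = sum(concentrations[loc] * t
                   for loc, t in minutes.items() if t > 0)
    return weighted / total_minutes
```

      With made-up values of 40 units in the kitchen for 360 minutes and
20 units in the bedroom for 1080 minutes, the function returns 25 in the
same units.  Time spent in a nonmonitored location raises an error here
rather than being assigned a guessed concentration.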

LIMITATIONS

      No  activity diary  is  perfect.    There  are  trade-offs  in  length,
complexity,  administration,  entry,  and  analysis.   Our activity  diaries have
limitations that  can be  categorized  as "form"  and  "technique."

     Form-specific limitations are:

1.    The diary form (See Exhibit 1) caused some confusion because of the
      number of rows and columns.  This led to entries being recorded on
      the wrong line.  (Additional shading was placed on the form to reduce
      this confusion.)

2.    The initials of the person who filled  out  the  form were  not adequate
      to  identify the  preparer.    (The  technicians  were told  to write
      whether the preparer was the child, mother,  father, etc.)

3.    Field and  central  site coordinators  observed slight differences in
      the manner that  the  activity diaries were  presented.   (Corrective
      action was  taken and protocols were tightened to prevent bias.)

4.    There is a potential underestimate of exposures  for  those  activities
      with  less   than  15-minute  durations.    Thus  environments  such as
      car/bus  involving short distances  (to and from  school)  may not be
      recorded adequately.

6.    The degree or amount of activity in  any  location is not  recorded.
      This  could affect  exposure levels which can  increase with  increased
      physical exertion.

      Additional  limitations are  concerned  with  the technique  used.  They
are:

7.    The total-weighted exposure is an  estimate  rather than a measurement.


-------
8.    The averaged concentrations from the fixed devices do not indicate
      levels at the time of exposure.

9.    Activity logs do  not record where  in the room a person is in relation
      to the  monitoring device and/or pollutant source.

10.   Diaries  really  do  not  indicate  the  level  or  type  of  physical
      activities by  the child.   Thus,  we  cannot modify our  estimates  of
      calculated exposure to estimate possible differences  in dose.

ANALYSIS OF RESULTS

      This  section   presents  basic descriptive  time/activity  data  for
elementary-aged  children  in  Steubenville, Ohio,  and  Portage,  Wisconsin.   A
total of 597 three-day logs were filled out between  the two sites over two
seasons. (See Table  1.)


                              THREE DAY LOGS
      The activity  logs were  first  analyzed  for  differences  across  the
three days of data collection.   No significant  differences were found among
the three days for any  location.   Results  were similar in both the winter
and summer as well  as for Portage  and Steubenville.


                       MONITORING PLACEMENT ACCURACY
      A second analysis was performed to determine the percentage of time
over all locations accounted for by the fixed monitoring devices.  (See
Table 2.)  Overall, the NO2 monitors located in the kitchen, bedroom,
activity room, outside, and in the schools accounted for between 75% and
87% of a participant's activity/time.  The remainder of the time was spent
in nonmonitored locations: car/bus (2-4%), other indoor locations (5-12%),
and other rooms in the home (3-7%).  Further analysis indicated that the
child's weekend activities were less likely to be approximated by our NO2
microenvironmental monitoring scheme (65-80%).  The weekend/weekday
difference  can be  partially explained  by an increase  in activities  at
nonmonitored  "other inside"  locations.   These replaced  activities  at the
monitored school.
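
      The coverage percentage discussed in this section is the share of
diary time spent in locations that had a monitor.  A minimal Python sketch
follows; the location codes are hypothetical, and the monitored set is
taken from the five NO2 monitor locations named above.

```python
# Locations with NO2 monitors, per the text; the codes are hypothetical.
MONITORED = frozenset(
    {"kitchen", "bedroom", "activity_room", "outdoors", "school"})


def monitored_share(minutes_by_location, monitored=MONITORED):
    """Percentage of recorded time spent in monitored microenvironments."""
    total = sum(minutes_by_location.values())
    if total <= 0:
        raise ValueError("no time recorded")
    covered = sum(t for loc, t in minutes_by_location.items()
                  if loc in monitored)
    return 100.0 * covered / total
```

      Applied to each diary day and averaged by site, season, and day
type, a function of this kind would give coverage figures comparable in
form to those in Table 2.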
                           REGIONAL DIFFERENCES


      Another  analysis  compared  communities  to  ascertain   regional
differences.  (See Table 3.)   Portage children  were  in  the  kitchen, outdoor,
and car/bus microenvironments  a  significantly  greater  amount of time during
the academic period than their Steubenville counterparts.  These same
children were also  in  the car/bus microenvironment more  after school let
out for the summer than during the academic year.  These results appear to
reflect differences between  a rural  farming lifestyle  (Portage)  and  a
distinctly urban way of life  (Steubenville).


                           SEASONAL DIFFERENCES


      Analyses  for seasonal differences compared  academic  period (12/86 to
06/87) and  vacation period (06/87 to  09/87)  activities.    (See Table 4.)
Concomitant to  the   expected  in-school  to out-of-school decrease in the
percentage of time spent at school, there was an increase in the amount of
time spent outside.   It also appears that children were spending less time
in  their  bedrooms  after  school   let  out for  the  summer.   Overall,  the
seasonal  analyses  indicated  activity differences  resulting  from school
being replaced  by increased outdoor and other activities in the  summer.

        A  second  seasonal  analysis compared  winter (12/86  to  04/87) and
summer  (05/87   to 09/87)  activities.     The  results  from  this  analysis
indicated two  differences from the  academic vs.  vacation results.   The
winter/summer  difference  showed  a substantial  amount of  time  spent  in
school during the summer.   This result  was due to the definition for summer
which included a month when children were still in school.  The second, more
important finding was a small but significant winter (5%) to summer (4%)
decrease in the  amount  of  time Portage children  spent  in  the kitchen.  In
addition,  an increase  in  outdoor  activities  (5%  to 16%)  was also noted in
both cities.  Combined, these  results  seem to indicate a cold weather/warm
weather factor  influencing activity patterns.


                          DAY-OF-WEEK DIFFERENCES


      After looking  at  day-of-the-week  differences,  weekdays were collapsed
into  a  single  category,   while   weekend days were retained  as  separate
categories.   However,   due to site visit scheduling  patterns  during the
summer  sampling  cycle,   an  insufficient  number  of  Sunday diaries  were
available  for   analysis  from  Portage.    (Results   reported  here  use the
academic  period  vs.   vacation   period  definition  as  a  basis   for the
winter/summer comparisons.)

      Steubenville participants indicated increased usage  of the  activity
room on the weekend while  school  was in  session and decreased usage of the
bedroom on  Saturdays throughout  the year.  (See Table 5.)   Total  in-home
time increased  on weekends during  the  school  year but decreased  once school
closed for  the  summer.   The  decrease  between weekday and  weekend school
time was expected and was accompanied  by an  increase in other activities on
the weekend.

      Portage children recorded  increased usage of the activity  room on
weekends during the school year.  (See Table 6.)  They also indicated
greater bedroom usage on Sunday during the  same  time  frame.   The decrease
in school  time between the weekday and weekend seems to have been taken up
by an  increase in outdoor  time  plus  other  indoor activities.   The percent
time spent "at home"  increased  between the  weekday and weekend during the
school  year.


                            OUTDOOR COMPARISONS


      Because some  pollutants  (e.g., ozone) display  distinct  diurnal and
seasonal  cycles,  we were  interested in  a child's  activity  patterns over
time.  Table 7 presents an analysis of outdoor activity patterns over  time.
As expected, the time spent outdoors during the daytime (8:00 a.m.-
8:00 p.m.) was much greater at both sites than the time spent outdoors at
night (8:00 p.m.-8:00 a.m.).  There was also an increase in the time
children spent outside at night  once  schools  closed  for the summer.   Future
analysis  will  provide  a detailed  examination of  the  percent of the
population outdoors  by  hour of  the day.   We will  incorporate  weather and
pollution data into  this  analysis.


                              OTHER  STUDIES
      An attempt  was  made to compare time/activity results reported here
with results reported previously in the literature.  Two such studies
investigated time/activity patterns of school-aged children (Letz et al.,
1983; and Quackenboss et al., 1982).  While such comparisons are tenuous
due to study design differences,  they do provide a  beginning point to which
the data reported here can be compared.

      Letz et al. used activity data gathered from Watertown,
Massachusetts, children in the fall of 1982.  (See Table 8.)  The 1987
Portage and Steubenville data indicate a greater amount of time spent
indoors and in school.  While summer data were also reported for the
Watertown children, the definition for summer was different, making strict
comparisons difficult.

      Quackenboss et al. presented data collected from school children in
Portage, Wisconsin, during 1981.  (See Table 9.)  Their data appear to be
similar  to the 1987  Portage  data.   Differences may  reflect  time of data
collection:  Portage 1981 (March); Portage 1987  (December through  March for
winter and May to September for summer).


                      SUMMARY  OF  TIME ACTIVITY RESULTS


      Differences in time/activity patterns were found  between academic and
vacation  periods  as  well  as between  heating  (winter)   and  non-heating
(summer) seasons.  There were interregional differences between Portage and
Steubenville reflecting  differences in climate,  urbanization, and  other
factors.    Children  displayed  different  activity  patterns  between  weekdays
and weekends but  not  among weekdays.   The  amount of time spent  outdoors
differed  by  season,  time of  day  (day vs.  night),  and  community.    In
addition,  we  were  able  to establish that  time/activity patterns  placed
children  in the proximity  of  our  microenvironmental  monitors  about  80% of
the time.  However,  on some weekend days this percentage dropped  to  as low
as 67%.

      In  the study of respiratory  symptoms in these communities,  estimates
of total  exposure for  children will be derived from the microenvironment
concentrations.     Certainly,   exposures  based  on  these  time-weighted
microenvironment measurements  are not  perfect estimators, but they  are a
substantial improvement over measurements  taken at a central  site outdoor
monitor.    They  allow generalization  to a  population  based on  several
microenvironmental locations rather  than a single location.  This  statement
is more  valid when  there are  indoor and  outdoor sources of  a pollutant.
However,   even  for  ambient pollutants  such  as ozone  and acid  aerosols,
exposure will be  modified by  indoor activity  patterns. In view  of  these
factors,   there continues to  be  a  need  to conduct  personal monitoring
studies  to quantify the  relationships  among exposure,  microenvironmental
monitors, and  fixed-site  ambient monitors.

ACKNOWLEDGEMENTS

      Work  reported in  this  paper was  supported  in part  by  National
Institute of Environmental  Health  Sciences Grants ES-01108 and  ES-0002.  We
are indebted to  our field technicians  and  data processing personnel for
their efforts to provide  us with  the data upon which this  paper  is  based.
We are especially thankful to the families  in Portage,  Steubenville, and
Topeka who shared  their time and homes.  We wish to acknowledge the help of
P.B.  Ryan in  organizing  the time/activity material  reported  in  the
appendix.  Finally, visiting  scholar Eric Lebret was  instrumental  in the
design of the  time/activity form.


                                 APPENDIX

          ACTIVITY PATTERNS OBTAINED UNDER HARVARD'S GRI PROJECT*


      The Gas Research Institute Project (GRI)  is designed to  identify and
quantify the portion of  total  public exposure to nitrogen dioxide  due to
indoor  sources.     To  accomplish  this goal,  a  series  of   studies  were
undertaken.  The first two --  the Residential Characterization Study  (Ryan
et al., 1988a;  Ryan  et  al.,  1988b)  and the  Personal  Exposure Monitoring
Study  (Ryan  et al.,   1987)  -- were both conducted  in  the  Boston  area
preliminary to the  study for  which activity diaries are  herein  reported:
The  Los   Angeles  Personal  Monitoring  Study  (Soczek   et  al.,   Personal
communication, 1987).   This  combination of  studies, once  completed,  will
allow  the quantification  and  comparison of  indoor  and  outdoor  source
contributions,  as  well  as activity  patterns in low and  high  ambient NO2
areas.

      The Los Angeles Personal  Monitoring Study was designed  to increase
the understanding  of microenvironmental  contributions to  total  sources,
while at the same time characterizing  NO2 distributions  in  a high ambient
NO2 area.   In the  main,  study participants/technicians  collected 24-hour
personal  exposure  data  for  two  consecutive days.    Approximately  650
participants were involved.   The  sampling  period  covered the time between
May 1987  and April  1988.   A supplemental  study collected  more detailed
microenvironmental  data  on 50 people  sampled eight times  over  the  year.
This will provide information  on seasonal variations.

*  Project funded by the  Gas  Research Institute and  Southern California Gas
Company.   Investigation conducted  by  J.D.   Spengler,  P. Barry  Ryan,  and
Steven D. Colome (U. of Cal.   Irvine).  Project officers are Dr. I. Billick
- GRI and Phil  Baker -  SoCalGas.

AIR QUALITY MEASUREMENTS

        Nitrogen dioxide was measured with personal  NO2  badge
monitors  (Yanagisawa et  al.,  1982).   For the main  study,  a 24-hour  badge
was worn by  participants on two  consecutive  days.   In  addition,  a bedroom
badge and an outdoor badge were  placed in  fixed  locations for the overall
two-day period.    For the supplemental  study, two additional
microenvironmental badges were  worn.  An  "at home" badge was worn  in  the  home,
while an  "away  from home" badge was worn when the  participant was out of
the house.   Homes  were  characterized by the use of a home  characteristics
questionnaire.   This instrument was designed to  record  source and source
usage patterns as well  as dilution and ventilation parameters.   A personal
characteristics  questionnaire  included  occupational  questions  and  a
yesterday recall of activities.

TIME/ACTIVITY MEASURE

      Time/activity  diaries   were  filled out  for the two  days  in  which
personal  monitoring took  place.    They  were  presented  by  the  field
technician  at  the   home  setup  visit.    At  that  visit,   the technician
explained  the monitoring  protocol  and guided  the participant  through a
practice  activity diary  in   the  instruction booklet.     (See  Exhibit 3.)
Twenty-four to 48 hours after the setup visit,  a  telephone call was made to
encourage protocol compliance, answer questions,  and prompt for the return
of samplers as well  as  the  activity diary.

      The activity  diary used time  and location as its  major dimensions.
(See  Exhibit 4.)    Time  was determined  by  the  length  of  stay  within a
microenvironmental  location.   When  a  location was entered,  the time was
noted in the activity diary.   Only when a location changed was a new entry
recorded.

      The location dimension  was divided into two  major categories:  inside
and outside.  The inside location was  further divided between home and not
at home  categories.   These locations  were then subdivided  further.   The
outside  location  was  divided  according  to  proximity  to  major  roads.
Durations of 15 minutes or more  were recorded  except during gas stove usage
when durations of 5 or  more minutes  were entered.
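
      The diary structure described above, in which a time is noted only
when the participant enters a new location, can be turned into time spent
per location by differencing consecutive entries.  The sketch below is a
minimal Python illustration under stated assumptions: times are expressed
as minutes after the start of monitoring, the location labels and diary
entries are hypothetical, and the end of the monitoring period is supplied
separately.  It is not the processing program used in the study.

```python
from collections import defaultdict


def location_durations(entries, end_minute):
    """Sum minutes per location from change-of-location diary entries.

    entries: list of (start_minute, location), in time order; each entry
             marks the moment a new location was entered.
    end_minute: minute at which the monitoring period ended.
    """
    totals = defaultdict(int)
    for i, (start, location) in enumerate(entries):
        # A stay lasts until the next entry, or until monitoring ends.
        stop = entries[i + 1][0] if i + 1 < len(entries) else end_minute
        if stop < start:
            raise ValueError("diary entries are out of time order")
        totals[location] += stop - start
    return dict(totals)


def percent_of_time(totals):
    """Convert minutes per location to percent of total recorded time."""
    whole = sum(totals.values())
    return {loc: 100.0 * minutes / whole for loc, minutes in totals.items()}


# Hypothetical 24-hour diary (minutes after the start of monitoring).
diary = [(0, "home: bedroom"), (480, "home: kitchen"),
         (540, "not at home: school"), (960, "outside"),
         (1020, "home: activity room"), (1260, "home: bedroom")]
minutes = location_durations(diary, end_minute=1440)
shares = percent_of_time(minutes)
```

Percentages computed this way for each diary are the kind of quantity that
can then be averaged across participants.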

QUALITY ASSURANCE - QUALITY CONTROL

      Several quality  assurance and quality  control  steps  were  taken to
insure the integrity  of the data.  Quality assurance measures  included:

1.    field  technician  setup  visits  to  explain  procedures  (including
      mail-back procedures);

2.    a practice activity diary  filled  out with  the assistance of the field
      technician to insure understanding of procedures;

3.    limiting  the  number   of  locations  to  insure nonconfusing,
      nonoverlapping  alternatives;

4.    the use  of two-day  diaries  to  provide  replication and  to  capture
      day-of-the-week differences;

5.    a  pilot study  conducted to  assess  four alternative ways for setting
      up, following  up,  and  finishing up each  two-day  monitoring period,
      including:

      a.    field technician  setup,  field  technician return after 24 hours,
            and mail back at the  end;

      b.    field technician setup,  field  technician  telephoning after 24
            hours, and  mail back  at the  end;

      c.    field technician setup, no follow-up,  field technician return
            after 48  hours to  collect measurement instruments, and

      d.    mailed setup, no followup,  and mail back at end.
            (As  a result  of the   pilot  study,  the main  study  adopted
            alternative b.   At  the  same  time,  sample  size was increased to
            offset an anticipated  5-10% loss  due to noncompliance and other
            data loss.)

6.    an  instruction booklet  and checklist  provided to help the
      participants know what to  do; and

7.    interviews  with  pilot  participants  and with persons who refused to
      participate, conducted to assess response to the initial presentation.
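
      The sample-size adjustment mentioned under item 5 follows from simple
arithmetic: if a fraction of participants is expected to be lost, the
number enrolled must be the desired number of usable records divided by one
minus that fraction.  The Python sketch below is illustrative only; the
function name and the target of 600 usable records are hypothetical, and
only the 5-10% loss range comes from the text.

```python
import math


def recruits_needed(target_completes, expected_loss):
    """Number to enroll so that about `target_completes` usable records remain.

    expected_loss: anticipated fraction lost to noncompliance or other
    data loss (0 <= expected_loss < 1).
    """
    if not 0 <= expected_loss < 1:
        raise ValueError("expected_loss must be in [0, 1)")
    return math.ceil(target_completes / (1 - expected_loss))


# Hypothetical target of 600 usable two-day records, for illustration only.
low = recruits_needed(600, 0.05)    # 5% anticipated loss
high = recruits_needed(600, 0.10)   # 10% anticipated loss
```

For this invented target, a 5-10% anticipated loss implies enrolling
between 632 and 667 participants.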

      Quality control was accomplished  through:

1.    a visit by the senior staff to train the field technician followed by
      a visit to insure proper observance of protocol  by the field staff;


2.     telephone  follow-ups twenty-four hours  after  setup  to  encourage
      protocol  compliance,  answer questions, and prompt for return;

3.     field coordinator checks  of all returns;
            (These checks  were made to insure form  completeness,  proper
            activity   recording,  and  logical  activity  sequencing.     If
            errors  were encountered,   the  field  technician called the
            participant back to correct  discrepancies.)

4.     central  site coordinator  checks to  insure that diaries were  matched
      with the proper site  and  badge  number;
            (Checks  were  also  made  to  insure  that badge  start and  stop
            times corresponded  with activity start and stop times.)

5.     100% verification of  all  diaries after entry;  and

6.     programs designed to  check for values  (such as date)  and  field card
      time/activity diary time  inconsistencies.
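
      Checks of the kind listed above (form completeness, logical activity
sequencing, and agreement between badge and diary times) lend themselves to
simple programs.  The following Python sketch is a hypothetical example of
such a check and is not the software used in the study; the function name,
the representation of times as minutes, and the 15-minute tolerance are all
assumptions made for illustration.

```python
def check_diary(entries, badge_start, badge_stop, tolerance=15):
    """Return a list of problems found in one diary (empty list if none).

    entries: list of (start_minute, location) in the order written.
    badge_start, badge_stop: badge exposure times, in minutes.
    tolerance: allowed start-time mismatch in minutes (an assumed value).
    """
    if not entries:
        return ["diary has no entries"]
    problems = []
    for i, (start, location) in enumerate(entries):
        if not location:
            problems.append(f"entry {i}: location missing")
        if i > 0 and start < entries[i - 1][0]:
            problems.append(f"entry {i}: time earlier than previous entry")
        if i > 0 and location and location == entries[i - 1][1]:
            problems.append(f"entry {i}: location repeated without a change")
    if abs(entries[0][0] - badge_start) > tolerance:
        problems.append("diary start does not match badge start")
    if entries[-1][0] > badge_stop:
        problems.append("diary entry recorded after badge stop")
    return problems


# Hypothetical diaries, for illustration only.
clean = [(0, "bedroom"), (480, "kitchen"), (540, "school")]
flawed = [(30, "bedroom"), (20, ""), (1500, "school")]
clean_problems = check_diary(clean, badge_start=0, badge_stop=1440)
flawed_problems = check_diary(flawed, badge_start=0, badge_stop=1440)
```

A diary that produces a non-empty list would be a candidate for the
call-back to the participant described in item 3.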

LIMITATIONS

      This method of time/activity  exposure assessment suffers from  certain
limitations.    While  the  personal  monitor  insures a  personal exposure
quantity, only estimates can be made of the source microenvironments.   This
study  attempted to  address that  issue  through the use of  stationary and
personal measurements  at various locations.   In addition,  the supplemental
study   used   "at  home"  and   "away  from  home"   badges   to assess  the
microenvironments.   However,  measures  of specific rooms within the home
environment were beyond the cost/technical capabilities of  this  study.   As
concluded  in  the Six City section  of this paper,  until  monitoring  devices
become  technically  feasible  and  cost-effective  for large  field studies,
compromises will have to be made in microenvironmental assessment.

           TABLE  1.   NUMBER OF ACTIVITY DIARIES BY CITY AND SEASON

                        PORTAGE     STEUBENVILLE      TOTAL

ACADEMIC PERIOD           139           236            375
VACATION PERIOD            50           172            222
TOTAL                     189           408            597

ACADEMIC PERIOD was defined as from 12/86 to 06/87.
VACATION PERIOD was defined as from 06/87 to 09/87.
                                    6-13

-------
TABLE 2.    PERCENT OF CHILD'S TIME SPENT IN MONITORED LOCATIONS:
            DAY-OF-WEEK MEANS AND STANDARD DEVIATIONS
                                                 PERIOD
   SITE                  DAY            ACADEMIC        VACATION
                                        (DEC-MAY)      (JUNE-SEPT)

STEUBENVILLE          All Days           82 (16)         73  (20)
                      Weekdays           85 (14)         79  (20)
                      Saturday           69 (26)         65  (34)
                      Sunday             73 (24)         74  (18)

PORTAGE               All Days           88 (11)         72  (16)
                      Weekdays           89 (09)         71  (16)
                      Saturday           77 (17)         80  (17)
                      Sunday             79 (15)         --  (--)

TABLE 3.    PERCENT OF CHILD'S TIME SPENT IN SPECIFIED LOCATIONS:
            REGIONAL MEANS AND STANDARD DEVIATIONS
                        ACADEMIC PERIOD              VACATION  PERIOD
LOCATION           STEUBENVILLE    PORTAGE      STEUBENVILLE     PORTAGE

HOME:
 ACTIVITY ROOM        13  (11)      13  (13)         17  (16)       17  (13)
 BEDROOM              41  (13)      41  (09)         39  (17)       37  (16)
 KITCHEN              03  (03)      04  (04)         03  (03)       04  (04)
 OTHER ROOM           05  (09)      03  (05)         06  (11)       07  (11)
 TOTAL                62  (17)      61  (13)         65  (20)       65  (19)

NON-HOME:
 SCHOOL               20  (12)      21  (10)         02  (06)       02  (06)
 OUTDOORS             07  (08)      08  (07)         19  (15)       17  (13)
 CAR/BUS              02  (03)      03  (03)         02  (04)       04  (05)
 OTHER ACTIVITY       08  (11)      05  (09)         11  (18)       12  (16)
 DON'T KNOW           01  (04)      00  (01)         01  (05)       00  (02)
 TOTAL                38  (17)      38  (12)         35  (20)       34  (19)

ALL:
 INDOORS              93  (08)      92  (07)         81  (14)       83  (13)
                                    6-14

-------
TABLE 4.    PERCENT OF CHILD'S TIME SPENT IN SPECIFIED  LOCATIONS:
            SEASONAL MEANS AND STANDARD DEVIATIONS
                         STEUBENVILLE                     PORTAGE
LOCATION             ACADEMIC     VACATION        ACADEMIC      VACATION

HOME:
 ACTIVITY ROOM        13 (11)       17  (16)          13  (13)      17  (13)
 BEDROOM              41 (13)       39  (17)          41  (09)      37  (16)
 KITCHEN              03 (03)       03  (03)          04  (04)      04  (04)
 OTHER ROOM           05 (09)       06  (11)          03  (05)      07  (11)
 TOTAL                62 (17)       65  (20)          61  (13)      65  (19)

NON-HOME:
 SCHOOL               20 (12)       02  (06)          21  (10)      02  (06)
 OUTDOORS             07 (08)       19  (15)          08  (07)      17  (13)
 CAR/BUS              02 (03)       02  (04)          03  (03)      04  (05)
 OTHER ACTIVITY       08 (11)       11  (18)          05  (09)      12  (16)
 DON'T KNOW           01 (04)       01  (05)          00  (01)      00  (02)
 TOTAL                38 (17)       35  (20)          37  (12)      35  (19)

ALL:
 INDOORS              93 (08)       81  (14)          92  (07)      83  (13)

TABLE 5.    PERCENT  OF  CHILD'S TIME  SPENT  IN  VARIOUS STEUBENVILLE
            LOCATIONS:   DAY-OF-WEEK  MEANS  AND STANDARD DEVIATIONS
      LOCATION             ACADEMIC  PERIOD               VACATION PERIOD
                      WEEK      SAT       SUN        WEEK      SAT      SUN
HOME:
    ACTIVITY  ROOM    11  (10)   21  (15)   19 (15)      17 (16)   16 (17)   13 (11)
    BEDROOM          41  (11)   37  (21)   44 (21)      40 (17)   32 (19)   39 (16)
    KITCHEN          03  (03)   04  (03)   04 (04)      03 (03)   03 (03)   02 (02)
    OTHER  ROOM       05  (08)   07  (11)   08 (14)      06 (10)   04 (12)   04 (02)
    TOTAL            60  (14)   69  (30)   75 (22)      66 (14)   55 (27)   58 (06)

NON-HOME:
    SCHOOL          24  (10)   01  (04)   00 (01)      02 (06)   02 (06)   01 (06)
    OUTDOORS         07  (08)   05  (09)   07 (12)      19 (15)   16 (15)   20 (12)
    CAR/BUS          03  (03)   02  (02)   02 (03)      02 (03)   03 (06)   05 (09)
    OTHER  ACTIVITY   06  (12)   24  (30)   16 (21)      10 (16)   20 (31)   17 (19)
    DON'T  KNOW       01  (04)   00  (01)   00 (01)      01 (05)   02 (07)   00 (00)
    TOTAL            41  (11)   32  (30)   26 (22)      34 (19)   44 (27)   43 (20)

ALL:
    INDOORS          93  (08)   95  (09)   93 (12)      81 (15)   84 (15)   80 (12)
                                    6-15

-------
TABLE 6.    PERCENT OF CHILD'S TIME SPENT IN VARIOUS PORTAGE LOCATIONS:
            DAY-OF-WEEK MEANS AND STANDARD DEVIATIONS
      LOCATION             ACADEMIC  PERIOD               VACATION PERIOD
                      WEEK      SAT       SUN        WEEK      SAT      SUN
HOME:
    ACTIVITY  ROOM    12 (12)   23 (14)   20 (13)      17 (13)   15 (16)   -- (--)
    BEDROOM          40 (09)   35 (12)   50 (10)      37 (16)   40 (15)   -- (--)
    KITCHEN          05 (04)   03 (03)   04 (03)      04 (04)   06 (05)   -- (--)
    OTHER  ROOM       03 (04)   10 (12)   05 (06)      07 (11)   06 (08)   -- (--)
    TOTAL            60 (11)   71 (18)   79 (15)      65 (20)   67 (18)   -- (--)

NON-HOME:
    SCHOOL           24 (08)   01 (03)   00 (00)      02 (07)   00 (01)   -- (--)
    OUTDOORS         08 (06)   15 (15)   06 (09)      17 (13)   19 (18)   -- (--)
    CAR/BUS          03 (03)   03 (04)   04 (03)      03 (05)   04 (05)   -- (--)
    OTHER  ACTIVITY   05 (08)   09 (14)   12 (11)      12 (16)   08 (14)   -- (--)
    DON'T  KNOW       00 (01)   00 (00)   00 (00)      00 (00)   02 (05)   -- (--)
    TOTAL            40 (11)   28 (26)   22 (15)      34 (20)   33 (18)   -- (--)

ALL:
    INDOORS          92 (07)   84 (15)   94 (09)      83 (13)   81 (18)   -- (--)

TABLE 7.    PERCENT OF CHILD'S TIME SPENT OUTDOORS:
            TIME-OF-DAY MEANS AND STANDARD DEVIATIONS
                      ACADEMIC PERIOD              VACATION PERIOD
                  STEUBENVILLE     PORTAGE     STEUBENVILLE     PORTAGE

DAY                  06 (07)       08 (07)        16 (13)       15 (12)
NIGHT                01 (02)       01 (01)        03 (04)       02 (03)

TOTAL                07 (08)       08 (07)        19 (15)       17 (13)
                                    6-16

-------
TABLE 8.    COMPARISON OF 1987 PORTAGE AND STEUBENVILLE
            ACTIVITY DATA WITH 1982 WATERTOWN DATA
                WATERTOWN 1982         PORTAGE 1987        STEUBENVILLE 1987
               ANNUAL  ACADEMIC      ANNUAL  ACADEMIC      ANNUAL  ACADEMIC
                          YR                    YR                    YR

SCHOOL           12       16           16       21           16       20
TRAVEL           03       03           03       03           02       02
OUTDOOR          18       14           10       08           10       07
INDOOR           82       86           90       92           90       93

TABLE 9.    COMPARISON OF 1987 PORTAGE AND STEUBENVILLE
            ACTIVITY DATA WITH 1981 PORTAGE DATA BY  HEATING SEASON
                      PORTAGE       PORTAGE 1987        STEUBENVILLE 1987
                       1981       WINTER   SUMMER        WINTER   SUMMER

INDOOR                  61          63       61            66       62
OUTDOOR                 09          06       16            04       16
MOTOR VEHICLES          05          03       04            02       02
OTHER INDOORS           24          27       18            28       20
TOTAL INDOORS           85          90       79            94       80
TOTAL OUTDOORS          14          09       20            06       18

Total Outdoors = Outdoors + Motor Vehicles
Total Indoors  = Indoor Home + Other Indoors

      The  work  described  in  this  paper was  not  funded  by  the  U.S.
Environmental  Protection  Agency  and,  therefore,   the contents  do  not
necessarily  reflect the views  of the Agency  and  no official  endorsement
should be inferred.
                                    6-17

-------
                                REFERENCES

Dietz, R.  N.  and Cote,  E.  A.    "Air  infiltration  measurements in  a  home
      using a convenient perfluorocarbon  tracer technique."   Environment
      International.  8:419-433, 1982.

Duan,  N.   "Models  for  human  exposure  to  air  pollution."  Environment
      International. 8:305-309, 1982.

Ferris, B.  G.,  Speizer,  F.  E., Spengler, J.  D.,  Dockery, D.  W.,  Bishop,
      Y.M.M.,  Wolfson,  M.,  and Humble,  C.   "Effects  of  sulphur oxides  and
      respirable particulates  on  human health:  methodology and demography
      of  populations in study."   American Review  of  Respiratory Diseases.
      120:769-779, 1979.

Fugas, M.,  "Assessment  of total exposure to an air pollutant."  Proceedings
      of   the  International  Conference  on  Environmental   Sensing   and
      Assessment,  IEEE  #75-CH 1004-1 ICESA, Las Vegas,  1975.

Girman, J.  R.,  Hodgson, A.T.,  Robison,  B.K., Traynor,  G.W.    "Laboratory
      studies of  the  temperature dependence  of the  Palmes  NO2 sampler."
      Proceedings:  National  Symposium  on  Recent  Advances  in Pollutant
      Monitoring of Ambient Air  and  Stationary Sources,  Raleigh, North
      Carolina.   May 1983  EPA-600/9-84-001 January  1984.

Letz, R.  E., Ryan, P.  B., and Spengler,  J. D.   "Estimated distributions of
      personal  exposures to respirable  particulates."   Environment Monit.
      Assess.   4:351-359, 1984.

Marple, V.A.,  Rubow,  K.L.,  Turner,W.,  and  Spengler,  J.D. "Low flow  rate
      sharp cut  impactors  for  sampling:  Design  and calibration." JAPCA
      37:1303-1307,  1987.

Palmes,  E.  D.,  Gunnison,  A.  F., DiMattio,  S.,  Tomczyk,  C.   "Personal
      sampler for NO2."  Am.  Ind. Hyg.  Assoc.   37:570-577,  1976.

Quackenboss,  J.  J.,   Kanarek,   M.  S.,  Spengler,  J.  D.,  and  Letz,  R.
      "Personal  Monitoring  for Nitrogen dioxide exposure:   Methodological
      considerations for a  community study."   Environ.   Int.   8:249-258,
      1982.

Quackenboss, J.J., Spengler, J.D., Kanarek, M.S., Letz, R., and Duffy,  C.P.
      "Personal  exposure  to   nitrogen   dioxide:  relationship  to
      indoor/outdoor air  quality and  activity  patterns."   Environ.  Sci.
      Tech. 20(6):775-783, 1985.


Reed,  M.  P.,  McKay, V.,  and Fraumeni, L.  P.   "Indoor  Air  Quality/Acute
      Health  Study:    Field  Technician's  Reference Manual."    Personal
      communication, 1986.
                                   6-18

-------
Ryan,  P.  B.,  Soczek,  M.  L.,  Spengler,  J.  D.,  and  Billick,  I.H.   "The  Boston
      Residential  NO2 Characterization Study:  I.  Preliminary Evaluation of
      the Survey  Methodology."  Int.  Jour.  of Air Pollution  Control and
      Waste  Management.  38(1):22-27, 1988a.

Ryan,  P.B.,  Soczek,  M.L., Treitman,  R.D. and Spengler,  J.D.  "The  Boston
      Residential NO2 Characterization Study: II.  Survey Methodology and
      Population  Concentration  Estimates."  Atmos.   Environ.,   In  press,
      1988b.

Ryan,  P.B.,  Spengler, J.D.  "Nitrogen  dioxide personal  exposure  assessment:
      Methodological  considerations  in  design,   implementation  and data
      analysis."  Presented  at Environmetrics '87. Washington, DC.  Nov.
      1987.

Spengler, J.  D., Treitman, R. D., Tosteson, T. D., Mage,  D. T.  and Soczek,
      M.L.    "Personal exposure to respirable particulates and  implications
      for air pollution  epidemiology".  Env. Sci. Tech.  19:700-707, 1985.

Spengler, J.  D., Reed,  M.   P.,  Lebret,  E.,  Chang,  B.  H., Ware,  J. H.,
      Speizer,   F.   E.,  and  Ferris,   B.  G.    "Harvard's   Indoor Air
      Pollution/Health Study."   Presented  at  the  79th  Annual Meeting  of the
      Air Pollution  Control Association.   June  22-27, 1986.

Spengler, J. D., Keeler,  G.  J.,  Koutrakis, P., and  Ryan,  P. B.   "Exposures
      to Acidic Aerosols."  Presented  at the  International  Symposium  on the
      Health Effects  of  Acid  Aerosols.   October 19-21,  1987.

SPSS Data Entry II.   SPSS, Inc.,  1987.

Turner,  W.   A., Marple,  V.  A.,  and Spengler,  J.  D.    "Indoor Aerosol
      Impactor."  Aerosols.   Elsevier Science  Publishing Co.,  Inc.,   Lies,
      Pui,  and Fissan, editors.   1984.

Yanagisawa,  Y., and Nishimura,  H.    "A  badge type  personal  sampler for
      measurement of personal  exposure  to  NO2  and  NO  in  ambient  air."
      Environment International.   8:235-242,  1982.
                                   6-19

-------
                 PERCEPTION  OF DAILY CIGARETTE CONSUMPTION
                         IN  THE OFFICE ENVIRONMENT

                       by:  David A. Sterling,  D.J.  Moschandreas,
                             and Robert D. Gibbons
Editor's  Note:    This  published  article  appears  in  Bulletin  of  the
Psychonomic  Society  26(2):120-123,  1988.
                                   7-1

-------
     CAPTURE OF ACTIVITY PATTERN DATA  DURING ENVIRONMENTAL MONITORING

                       by:   Harvey S. Zelon
                             Research  Triangle  Institute
                             Research  Triangle  Park, N.C. 27709
                                 ABSTRACT

      The United  States  Environmental  Protection  Agency  (U.S.  EPA)  has
sponsored many studies  involving the collection of human exposure  data.
Collection  of data  describing  the  activities  undertaken by  the  study
respondents during the course  of monitoring is an  important  component of
the research effort.   The information  collected  may  be  vital  in  explaining
the results  of  environmental  or biological  monitoring.   This paper  will
describe  some of the  studies  undertaken,  the types  of data collected,
focusing  on  the  collection of  activity descriptions,   and  the  efforts
undertaken to assure  completeness and quality in  the  data  set.

      This  paper  has   been   reviewed  in  accordance  with  the   U.S.
Environmental  Protection  Agency's peer  and  administrative  review  policies
and approved for presentation and publication.
                                   8-1

-------
                               INTRODUCTION

      How people spend their time has long been a  topic  of  great  interest
to the social  scientist.   Time and motion studies  have been  conducted  in  a
broad variety  of settings.   Now the study of people's activities has become
a component of the  environmental scientist's  realm  of interest.

      Major efforts have been undertaken to describe  the  exposures  of the
population  to environmental  chemicals  of many  varieties.    The  efforts
include  sampling  of  exposures  through various  routes  and  attempts  to
measure subsequent levels  of  the  chemical  in  the body.   Exposures  may be
through the environmental routes,  air and water,  or through various other
routes,  such   as  food,   household dust,  or  occupational  or  avocational
exposure.

      Vital   to  the  explanation  of  the   individual   results   of  the
measurements and the  attempts to  correlate the various  levels  measured is
the ability to  determine in great detail  the  activities  undertaken  by the
respondent during the  periods  when samples were being collected.  Since the
environmental   routes  may account  for only some of  the potential  paths of
exposure  to  various  chemicals,   it  is  vital  to  know if  the levels  of
chemical detected  by  the sampling and analysis process  are  from  exposure
through other  routes or  should be  considered  to be  environmental  exposures.

      Two studies  conducted by  the  Research Triangle  Institute (RTI)  for
the  U.S.  EPA  involved  the collection  of various  environmental  samples.
These studies  were the  Total  Exposure Assessment  Methodology  Study,  known
as the TEAM study,  and  the Study  of Carbon  Monoxide Exposure in Residents
of Washington,  D.C.,  and Denver, CO, known as the CO Study.   In  both of
these studies,  data from the respondent  were also collected  and  used to
help  explain   the  relationships  observed within  the  environmental  and
biological samples.  This paper will describe these studies  with a primary
focus on  the  collection of data  about  the  respondent's  activities  to be
used  during  the  analysis.    Emphasis  will  be placed  in describing  the
collection  and  validation of  activity  data,   and  discussions  of  the
strengths  and  weaknesses  of  the   techniques  will  be  presented.
Recommendations  for   improvements  in  approaches  to   collecting  data
describing the routine daily activities  of study  respondents will  conclude
the discussions.

            TOTAL  EXPOSURE ASSESSMENT METHODOLOGY  (TEAM)  STUDY

      Between  1980 and 1985, a series of studies  was conducted in  multiple
sites  throughout the  United  States.   The  goals  of  the studies were  to
develop methods to measure  individual total exposure (exposure through air,
food,  and  water)  and  resulting  body  burden of  toxic  and  carcinogenic
chemicals,  and  to  apply these methods within a  probability-based  sampling
framework to estimate the exposures and body burden of urban  populations  in
several  U.S.  cities.     In  order to  reach these goals,  a  three-pronged
approach was developed which  included  the development  and testing  of  small
personal  monitors  to  measure  exposure  to  airborne  chemicals,   the
development of  a  transportable  spirometer to  measure the  chemicals  in
exhaled  breath, and  a survey  design  involving  a  stratified probability
selection with variations to insure the inclusion of members  of potentially
highly exposed  groups.   Based on the desire to  assess  exposure  through  as
many  routes as possible,  and to prove a series of methodologies for the
collection  and  analysis  of the  chemicals  of interest, the study was  named
the Total Exposure Assessment Methodology  study,  or TEAM.

      The study  began  with a pilot study conducted with 9 subjects in New
Jersey and  3 subjects  in  North  Carolina.  The New Jersey  subjects  all  were
located in  the areas which became part of the first main phase  sample.  All
pilot study subjects were selected purposively with the assistance  of  local
health  department  officials.   No  attempt was  made  to  select  subjects
representative of the  population, but  rather to  select  persons who were  as
diverse  as  possible  in  occupation,  location  of  residence,  and   other
variables of interest.

      The study  subjects were each visited  for  several  days 3 times  over
the  course of  the  pilot  study.   During each  series  of  visits,  various
environmental  and  biological  samples were collected,  and  a  series  of
questionnaires  were completed.   The samples collected were used to  test
some  30  sampling and  analytical protocols  for  4 groups  of chemicals  of
interest.   Based on the  results  of  the pilot study tests,  the  goals of the
TEAM  study  could be  met  with  only  one  group of the chemical  compounds, the
volatile organics.

      The main  TEAM study was begun with the  objective of collecting and
analyzing   samples  of  20  target chemicals selected  from among  all the
volatiles  based on  toxicity,  carcinogenicity,   mutagenicity,  production
volume, presence in preliminary sampling and pilot studies,  and amenability
to collection on Tenax.   Some 600 persons,  representing a total  population
of 700,000  residents of cities in New Jersey, North Carolina,  North Dakota,
and  California  were  selected  in  a  multi-stage,  stratified,  random  sampling
process.   Each participant was  involved  in a  twenty-four hour  monitoring
period during  which two  twelve-hour  personal  air samples were  collected.
Ambient  air samples  were  collected  in  each  sampling  segment  for comparison
to  personal air  samples.   Each participant provided two drinking  water
samples, and at the end of the twenty-four hours  of monitoring, a sample  of
exhaled  air.   The  first  phase of the  main study  consisted  of three seasons
of measurements  in  the two sites in  northern New Jersey,  and  in comparison
sites in North Carolina  and  North  Dakota.   The  second phase  of the  study
consisted of 3  seasonal sets of measurements:  2  in the Los Angeles area  in
southern California, and one in an area northeast of  Oakland, CA.
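
      Because the design deliberately included members of potentially
highly exposed groups, respondents cannot simply be averaged; each must be
weighted by the number of persons he or she represents.  The Python sketch
below illustrates the idea under stated assumptions.  Only the figures of
about 600 respondents and 700,000 residents come from the text; the two
strata, their sizes, and the exposure values are hypothetical and do not
describe the actual TEAM design.

```python
def stratum_weights(population, sample):
    """Sampling weight per stratum: persons represented by each respondent."""
    return {h: population[h] / sample[h] for h in sample}


def population_mean(stratum_means, population):
    """Population-weighted mean of the stratum means."""
    total = sum(population.values())
    return sum(stratum_means[h] * population[h] for h in population) / total


# Figures from the text: about 600 respondents representing 700,000 residents.
average_weight = 700_000 / 600

# Hypothetical strata: this split is invented for illustration.
population = {"potentially high exposure": 100_000, "other": 600_000}
sample = {"potentially high exposure": 200, "other": 400}
weights = stratum_weights(population, sample)

# Hypothetical stratum mean exposures (arbitrary units).
means = {"potentially high exposure": 6.0, "other": 2.0}
weighted = population_mean(means, population)
unweighted = sum(means[h] * sample[h] for h in sample) / sum(sample.values())
```

In this invented example the unweighted sample mean overstates the
population mean, because the oversampled stratum has the higher exposure.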

      During the  same  time frame as the main studies,  a  series  of special
studies  was undertaken  with  small  populations of  persons  with special
exposures or concerns.   These included nursing  mothers in whose milk some
of  the  chemicals  of interest  were  thought  to  be highly  concentrated,
employees of dry  cleaners,  and lifeguards at swimming  pools,  all  of whom
are exposed  to high levels  of chemicals  in  the  set  of  interest.   In
addition,  a series  of  studies was  conducted  to  monitor levels of volatile
organics  and  several other  classes of chemical  compounds in the  air in
public  buildings.   In  each of the special  studies  involving  personal
monitoring,  all  participants  completed the  same sets of  questionnaires that
participants  in  the main  phase of the  TEAM  study were asked to complete.

      Data collection for each  respondent  began  during  the initial   stages
of  sampling  when  each  of  the  households in  the  sample  segments were
screened  for  eligible residents.   The  screening consisted  of creating a
household roster for each housing  unit  containing information  on  the age,
sex, occupation,  and smoking status of  each  resident of the housing unit.
Based on this  information, a  sample of respondents was selected, and  during
a series of return  visits to  the households, a field  interviewer explained
the  study,   detailing   the   participation required  of  respondents  and
enrolling  respondents in  the  study.   Once  a   sample member agreed  to
participate  and the  necessary  informed  consent requirements  were  met,  the
main study questionnaire was administered,  and  sampling appointments were
established.  When the  sampling teams  completed  the  twenty-four  hours of
monitoring,  they administered the  "24-Hour  Exposure  and  Activity Screener."
This document  (Figure 1)  was  designed  to collect  information about what the
respondent did  during the monitoring period,  as well  as  what he  ate and
drank,  and  whether  there   was occupational  or  incidental   exposure  to
chemicals from  the  groups  being  studied.  Each  individual  was  asked  to
provide details about  his  "main" activity and any  other  activity lasting
more than  one hour.   Descriptions of  all  travel and  any unusual   events
which occurred during the monitoring  period  completed the data collection
effort.

                     CARBON  MONOXIDE  MONITORING  STUDY

      Conducted  in  1982  and  1983,  this  study  was undertaken in an attempt
to  evaluate methodologies for collecting personal exposure monitoring data
and  corresponding  personal   activity data in  an  urbanized  area.    The
specific  objectives  were  to develop  a  methodology  for measuring  the
distribution  of  carbon  monoxide (CO)  exposures  of  a representative
population of an  urban area  for assessment of the risk to the population;
to  test,  evaluate, and  validate this methodology by employing  it   in the
execution of pilot field studies in Denver, CO,  and  Washington, D.C.; and
to obtain an  activity-pattern data  base  related to CO exposures.

      A stratified,  three-stage,  probability-based  sample was selected in
both sites.   Initial  contacts with the selected  households  in the  sample
area were conducted  by  telephone,  using RTI's Computer Assisted Telephone
Interviewing  (CATI) capabilities.   The  purpose of the initial  contacts was
to  determine  household  eligibility  and  to  collect  information  on  all
members of the household  to  permit  the selection  of  the  final  (third-stage)
sample respondents.
                                    8-4

-------
                                          Figure  1


                            24-HOUR ACTIVITY SCREENER


Study Number: ______                                              O.M.B. No. 2000-0364
                                              24-HOUR EXPOSURE AND                Expires 9/30/86
Date: __/__/__                                  ACTIVITY SCREENER
                                                     TEAM Study


1.   Have you pumped your own gas in the past 24 hours?

        [1]  Yes (GO TO QUESTION 1a)        [2]  No (GO TO QUESTION 2)

    a.   During which monitoring periods?

        [1]  Overnight                      [2]  Daytime


2.   Have you done your own dry cleaning, or been in a dry cleaning establishment during the past 24 hours?

        [1]  Yes (GO TO QUESTION 2a)        [2]  No (GO TO QUESTION 3)

    a.   During which monitoring periods?

        [1]  Overnight                      [2]  Daytime


3.   Have you smoked cigarettes, cigars, or a pipe in the past 24 hours?

        [1]  Yes (GO TO QUESTION 3a)        [2]  No (GO TO QUESTION 4)

    a.   During which monitoring periods?

        [1]  Overnight                      [2]  Daytime


4.   Were you in an enclosed area with active smokers for more than 15 minutes at any time in the past 24 hours?

        [1]  Yes (GO TO QUESTION 4a)        [2]  No (GO TO QUESTION 5)

    a.   During which monitoring periods?

        [1]  Overnight                      [2]  Daytime


5.   Have you used or worked with insecticides, pesticides, or herbicides in any way, including farming, gardening, and
    extermination, in the past 24 hours?

        [1]  Yes                            [2]  No


6.   During this time of year, on an average weekday or weekend day, how many hours per day are spent:
    (Answer a through d below)
                                                        Weekday                    Weekend Day
    a.  Away from home                                  |__|__|                    |__|__|

    b.  Out of doors - leisure activities               |__|__|                    |__|__|

    c.  Out of doors - working                          |__|__|                    |__|__|

    d.  In a motor vehicle                              |__|__|                    |__|__|
                                             8-5

-------
                                Figure   1.  (continued)

                            24-HOUR ACTIVITY SCREENER

8.  a.  Please list the specific name of any chemical or hazardous substance to which you have been exposed.
        [...] please indicate your activity which lasted the most time.  In addition, please indicate any
        other activities which lasted for more than one hour.

                                Location:          Location:                 Level of Physical Activity:
        Activity                Indoor/Outdoor     Urban/Suburban/Rural      Strenuous/Light

        [Entry rows 1 (longest) through 5; coding boxes 01-28.]

  b.  Please provide the following information for each trip during this time period.

                                                     Traffic:         Ventilation:
        Trip    Minutes    Mode of Transport         Heavy/Light      Windows: Open/Closed/NA
        1.
        2.
        3.
        4.

  c.  Please indicate any unusual events which happened during this time period which might have any effect on your
      exposure to environmental chemicals.
                                          8-7

-------
      After the third-stage sample of respondents was selected,
another telephone contact was initiated.   During this call to the selected
respondents,   the   study  details  were  discussed,   requirements  of
participation outlined,  and  cooperation sought.   If the selected respondent
agreed to participate,  an appointment was established for a personal visit
during which  a field  interviewer  would  bring the  CO  monitor  to  the
respondent  for the  beginning of  the  twenty-four hour monitoring period.
(Respondents in the  Denver sample  were monitored  for  48 hours.)  During the
telephone contact in which  the  respondent  agreed to participate, a brief
questionnaire was completed  which  collected  data  for  a computer model being
developed to predict and  describe  exposure levels.

      At  the time of the  visit to  the respondent  to start the monitoring, a
Household (Study) questionnaire  was  administered, collecting information on
the respondents and their  home and work environments.    During  the time
period(s) that the respondents were monitored  by   the  personal  exposure
monitor  (PEM), they were also  asked to carry  and  fill  out  an  activity
diary, describing their  activities,  location,  and  environment.   Each time
that the respondent began a new activity or changed  location,  he was asked
to  push  a button on  the PEM,  which stored  an  integrated reading of CO
exposure for  the  last  activity, and reset  the monitor to begin capturing
the new data.  At the same time  that the  respondent pushed  the button, they
were  to  make an entry in the diary.   Figure  2 displays  one  page  of the
diary.
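
      The button-press procedure described above amounts to splitting the
monitor's CO record at each press and keeping one integrated value per diary
entry.  A minimal sketch of that bookkeeping follows.  It assumes, purely for
illustration, that raw readings are available as (minute, ppm) pairs and that
the stored value is a simple mean over the interval; the actual PEM data
format and integration method are not described in this paper, and the
function name is hypothetical.

```python
from statistics import mean


def integrate_by_button_press(readings, press_minutes):
    """Split (minute, ppm) readings at each button press and average each segment.

    readings      -- list of (minute, ppm) tuples in time order (hypothetical format)
    press_minutes -- sorted minutes at which the respondent pushed the button
    Returns a list of (start_minute, end_minute, mean_ppm), one per activity.
    """
    if not readings:
        return []
    start = readings[0][0]
    end = readings[-1][0] + 1  # treat the last reading as covering one minute
    # Segment boundaries: monitoring start, each press inside the record, monitoring end.
    bounds = [start] + [p for p in press_minutes if start < p < end] + [end]
    segments = []
    for lo, hi in zip(bounds, bounds[1:]):
        values = [ppm for minute, ppm in readings if lo <= minute < hi]
        if values:  # skip empty intervals, e.g. two presses in the same minute
            segments.append((lo, hi, mean(values)))
    return segments


# Illustrative data only: six one-minute readings, one press at minute 3.
example = [(0, 2.0), (1, 2.0), (2, 2.0), (3, 8.0), (4, 10.0), (5, 12.0)]
segments = integrate_by_button_press(example, [3])
```

Each returned segment would then be matched, in order, to the diary entry
made at the same button press.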

       COMPARISON OF THE TWO  ACTIVITY PATTERN DATA COLLECTION  MODES

      The two  studies  took  different  approaches  to  the collection of data
describing  the activities  undertaken by the respondents  during  the time
that  they were  being  monitored.    The  TEAM  study  used  a  questionnaire
administered at the end of each  24-hour monitoring period  and relied on the
respondent's memory to provide  the requested information.  After  the first
administration of the  activity screener,  the respondents began  to learn
what  was  expected  of  them  and  were   able   to  provide more  complete
information.   In fact,  the respondents  often provided  data  before being
asked, and in  greater detail than  could  be recorded.   This  was particularly
true  of  details  concerning  the  food consumed during  the  monitoring  period.
The CO study  collected data by  self-report  of the respondent using  a diary
form  to  collect the details  associated  with each activity  and location.
The level  of  detail provided  and  the  number of activities  entered were
determined completely by the  respondent,  with minimal input from  the study
team  and with no opportunity to  gain  experience by using the diary over
several monitoring periods.

      The two  approaches thus represent  ends of a continuum,  with the TEAM
study using a  highly structured  form to  collect  a specific  but minimal data
set,  while  the CO  study  used a  diary to collect  an  unspecified amount of
data  about  an unlimited  variety  of activities.   The TEAM study collected
information  about  the  activity  which the respondent considered to  be the
"main" activity undertaken during the monitoring period.    In addition, any
activity which lasted more than  one hour was documented.    Determination of
main  activity was  left  up  to  the  respondent and was  a  source  of great

                                    8-8

-------
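                                 Figure  2
                              ACTIVITY DIARY

TIME FROM MONITOR  |__|__|__|__|

A.  ACTIVITY

B.  LOCATION
    In transit ............................ 1
    Indoors, residence .................... 2
    Indoors, office ....................... 3
    Indoors, store ........................ 4
    Indoors, restaurant ................... 5
    Other indoor location ................. 6
      Specify: ______
    Outdoors, within 10 yards of road
      or street ........................... 7
    Other outdoor location ................ 8
      Specify: ______
    Uncertain ............................. 9

C.  ADDRESS (if not in transit)

D.  ONLY IF IN TRANSIT
    (1) Start address
    (2) End address
    (3) Mode of travel:
        Walking ........................... 1
        Car ............................... 2
        Bus ............................... 3
        Truck ............................. 4
        Train/subway ...................... 5
        Other ............................. 6
          Specify: ______

    ONLY IF INDOORS
    (1) Garage attached to building?
        Yes ............................... 1
        No ................................ 2
        Uncertain ......................... 3
    (2) Gas stove in use?
        Yes ............................... 1
        No ................................ 2
        Uncertain ......................... 3

    ALL LOCATIONS
    Smokers present?
        Yes ............................... 1
        No ................................ 2
        Uncertain ......................... 3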
                                    8-9

-------
variability.  In the CO study,  there was also great variability,  caused  not
by the respondent's definition  of the activities of interest,  but  by  the
level of detail provided  by the respondents.   While one respondent  might
list a general  activity  such as  housework, another respondent would provide
separate entries for  cleaning,  dusting,  washing clothes, vacuuming, etc.
The  two  respondents would cover  corresponding periods  of  time but with
different numbers  of  entries.   This  discrepancy in level  of detail  was
particularly  noticeable  during the  evening and night hours,  while levels of
detail for daytime  work  activities  tended  to be  more  uniform.   This  may be
related to the  impact of location changes  as  a reason to initiate  a  log
entry and diary description.  During the instruction  period  provided  to  the
respondents,   the  combination  of  activity and  location  was  stressed.
Respondents were told that a  change  in either  member of the pair was  the
signal for a  PEM recording and a diary entry.

      Verification  of the activities  reported  by the respondents in both
studies was  virtually impossible.   To verify  completely  the  activities
undertaken by an individual,  it  is  necessary to observe  the  individual  at
all times, and for  an independent recorder to document  all  activities.   If
the  respondent  knows that  he  is  being observed,  there is  a very real
possibility that the respondent  may change  his  activities, by  choice  or by
chance.   The use of a covert observer, while minimizing the  probability of
deliberate  change  in  activity,  may result  in  missing  data  due   to
observational problems.   During the TEAM pilot study,  one observation of
the  respondent was  possible  during  the  work  day.    One sample  to  be
collected for each  respondent was a vial of water taken from  a  source at
work.  When  the sampling team  arrived at the  respondent's work  setting,
they  collected  the  water  sample  and  observed the  respondent.     The
observation  yielded  responses  to  two  specific  questions:    Was  the
respondent wearing  the monitoring equipment  as  requested?  and  What  was  the
respondent doing at the  time of  the observation?  The respondent's activity
was  later checked  against the responses to the activity diary questions.
This  type of  check is difficult to implement  in a large-scale  study  and
provides  only  one or  at  best  a  few  data  points  for  comparison  and
verification.

      Another difference in the  two modes of data collection  is the ability
to review the  answers being  recorded in the questionnaire as  it  is  being
administered  and to probe  for  further  details of interest, thus eliminating
any  internal  discrepancies in  the data as the data are being  recorded.
This  is  easily  done in  an in-person  interview  mode, especially when  the
data  collectors are knowledgeable about the subject  material.   In the case
of the TEAM study,  the  activity screener was administered by  chemists on
the  sampling  team.   In the CO study, the interviewers were  not chemists or
environmental  scientists, and  the diary,  while it was  reviewed at  the
respondent's  residence for completeness of  entries  and  general  usability,
could not be reviewed  for inconsistencies or  missing  activities.    The
interviewer  had   no  way  of  knowing  what was  not   recorded,  either
deliberately  or by oversight,  and thus  could  not  probe  or explore  for
additional information.
                                   8-10

-------
                       SUGGESTIONS FOR IMPROVEMENTS

      While the utility of collecting  activity  data to  supplement data from
environmental  and biological samples  is  unquestioned,  the manner in which
it is collected  is  still  evolving.   Based on the  experience gained during
the two studies  described,  there  are  several modifications to the methods
of collecting  activity data which  should  be considered.

      The data collection instrument should  be  more highly structured.  The
times at  which  entries  should be made  must  be  indicated,  although the
frequency of entries will be determined by the  study being undertaken.  The
availability of a timing device with an  audible alarm that can be attached
to the  diary  or worn  by  the  respondent  may be considered.   Allowing the
respondent  to  keep  the  timing  device   could  be   an  incentive  for
participation.

      Additional pre-training or  instructions for  the respondent may be of
value.  If  contacts  are made with the  respondent  prior  to the  beginning of
the monitoring  period,  a study guide  or brief written  instructions can be
provided,  along with a sample  set of diary pages.  The respondent could
complete the diary  for a  time  period  immediately  prior  to  the  visit of the
sampling  team.   The  sample  diary  entries  could be reviewed before the
beginning  of  the actual  data collection period,  and the respondent given
specific  feedback  on the level  of  detail  provided   and  the  types  of
activities  recorded.  A drawback to  this review  is the potential lack of
consistency among the reviewers and trainers.

      An additional  component of pre-training could be  the development of  a
list  of  sample  activities  of  interest.     This list,   developed  with
substantive inputs  from  the  technical  members  of  the study team, could be
reviewed  with  the  respondent  in  order  to  demonstrate  the   types  of
activities  of  interest.   There is a  drawback to  this approach,  since some
respondents may  feel  that  the  activities on  the  list are the  only ones of
interest and thus may not report any other activities.

      Data  collection instruments should be reviewed  or administered at
shorter  intervals.     In  the  TEAM  study,   the  activity screener  was
administered at the  end of  each 24-hour  period, even though the  study team
was in the respondent's home every 12 hours.  If the screener were
administered at each visit, it would rely less on memory, could be
administered in a shorter period of time, and could be used to collect more
information.

      One additional item would involve  the  development of a checklist of
activities, locations, and  exposures  of  interest.   This checklist could be
reviewed with the respondent at the end  of the data  collection period as  a
means of  jogging the  respondent's memory,  or allowing the respondent to
provide details  of  potential  exposures that  he did not  think  significant,
or just omitted  from the  record.  The  checklist can  also  be used  as a part
of a review of the  diary.   If any of  the special  items  were entered in the


                                   8-11

-------
diary by the  respondent,  a detailed review  and  discussion of the entries
could be undertaken and details  recorded for  future analysis.


                                CONCLUSIONS

      The two studies described included initial  attempts to explore a  new
component of  environmental  research:   the examination of the  relationship
of daily activities to exposure  to  many groups of  toxic chemicals found
throughout  the environment  in which the population lives  and works.  While
the data collected  were useful, more  and  better  data will be  available  as
we evolve better means of collecting  them  and using  them  in  the analysis  of
exposure.  We must use the experiences of previous  studies to design  and
implement the next studies in order to capture  more of the data that  are
available,  but which have  not  yet  been  fully  utilized.
                                   8-12

-------
                 AN ACTIVITY PATTERN  SURVEY OF ASTHMATICS

                       by:   Carolyn  H.  Lichtenstein
                             H. Daniel  Roth
                             Roth Associates,  Inc.
                             Rockville,  MD  20852

                             Ron E.  Wyzga
                             Electric Power  Research  Institute
                             Palo Alto,  CA
                                 ABSTRACT

      Asthmatics are more likely to react to air pollution than
nonasthmatics, and human clinical studies have shown that they are at
greatest risk from pollutants when they are exercising fairly
strenuously.   Thus, the  activity  patterns  of asthmatics,  particularly  the
occurrence of strenuous exercise,  are of  interest in assessing their health
risks.

      Since asthmatics constitute a special  population subgroup  with  its
own characteristics,  a  survey of  their activity patterns  must  also  be
uniquely  designed.   Several  activity  pattern surveys  have  identified
asthmatics and  several  other  surveys  have covered aspects of  asthmatics
other  than  their  activity  patterns.    However, the  recent  survey by  the
Electric Power Research  Institute (EPRI) and Roth  Associates,  Inc.  (RAI),
conducted  in  Los  Angeles  and  Cincinnati,  is  the  first  activity pattern
survey  to focus  exclusively  on  asthmatics.    The RAI  survey  includes
questions concerning  asthma symptoms,  medication,  and  factors  that affect
asthma symptoms (e.g., anxiety and  stress).   This  information  is obtained
in a background questionnaire  including demographic information and medical
history, plus a 3-day  hourly diary.

      The  analysis   of  the  EPRI-RAI   survey data   should   prove   very
interesting  because  it  allows for  a  more  detailed  examination  of  many
aspects  of asthmatics'  behavior  than  has  been  possible  from  previous
surveys.   In  addition to an estimate of the pattern of exercise over  the
days of  the week,  various  relationships  between asthma symptoms  and  their
causes  can be estimated.   For example,  preliminary results based on  the
L.A. diary data indicate that anxiety and stress are positively related to

                                    9-1

-------
increased  incidence of  asthma  symptoms.    It  is hoped  that  a  further
understanding  of the factors  related to  asthma will aid in policy decisions
aimed at this  group.
                                    9-2

-------
                 PURPOSE OF EPRI-RAI  SURVEY  OF ASTHMATICS


      The Electric  Power  Research Institute (EPRI)  and Roth Associates,
Inc.  (RAI),  designed and implemented  a survey of asthmatics  in Los Angeles
and Cincinnati.   The goals of this survey were:

      1.     To  examine the  activity  patterns of  asthmatics,  including  a
            comparison with the activity patterns of nonasthmatics.

      2.     To  describe and  examine   asthmatics'  patterns of symptomatic
            response,   medication,  and  exposure  to  factors  related  to
            asthma.

      3.     To  estimate the typical number  of asthmatics' adverse effects
            for the  purposes of risk analysis.

      4.     To   examine  the  interactions   of  the  symptom  patterns  and
            activity  and  medication  patterns  (e.g.,   the  effects  of
            medication on  symptomatic response).

      5.     To   study  the  causal  nature  of various  factors  related  to
            asthma,  such as exercise,  stress, etc.

      6.     To  augment the results of clinical  studies on asthmatics,  in
            part by comparing  the patterns  of  asthmatic  response  in  the
            survey  and  laboratory  settings.

      Two questions of  interest  to  this  conference are:   1)  Why  are
activity patterns  of  asthmatics of special  interest?;  and 2) Why  was  it
necessary to carry  out  a  special  survey focusing  on the activity patterns
of asthmatics?  Asthmatics and  their activity  patterns  are of special
interest  because they  are more  likely to react to  air pollution  than
nonasthmatics.    In  fact,  asthmatics  seem  to constitute  the  most   sensitive
subgroup  of the  population;   they react  to certain  pollutants even  at
ambient levels  of these substances.   Clinical chamber studies indicate that
this  response  occurs  only when  the  subjects are  exercising  strenuously.
Thus,  examining the  activity  patterns  of asthmatics  for  periods  of
strenuous outdoor exertion is a primary concern.

      Previous  activity  pattern  surveys have  been carried  out for  the
general population.   A  new  activity  pattern survey of asthmatics  only was
desired  for two reasons.   First,   the activity  pattern  surveys  of  the
general  public had flaws  that  limited the usefulness  of their  results.
Second, it is reasonable to assume that the  activity patterns of asthmatics
are different from  those of nonasthmatics  due to their illness.   Thus,  the
information  available on  exercise levels  and patterns  for the general
population is probably  not directly applicable to asthmatics.

      Activity  patterns of  asthmatics will  probably differ  from  those  of
nonasthmatics  in several  ways.   First,  asthmatics will  tend to exhibit
lower  levels  of  exertion  in  general  and  shorter periods  of  strenuous

                                   9-3

-------
exercise.   Second,  they  may  exhibit  less  activity outdoors where they can
be  exposed  to  factors  that  could affect  their  asthma.    Third,  their
activities may be limited by medication taken and/or symptoms experienced.
Thus,   any  survey  of  asthmatics  must  include  questions  concerning
symptomatic responses, medication usage,  and exposure  to  factors related
to, and possibly causing,  episodes of their disease.

                             PREVIOUS SURVEYS

      There have  been several  studies that have  attempted to  estimate
activity patterns.  An early effort that modeled  activity patterns, rather
than  actually  measuring  them,  is the NAAQS Exposure Model  developed by
PEDco Environmental, Inc. (PEI) for the U.S.  EPA.  The population in each
study  area  was  divided  into  age-occupation groups,   each  of which  was
subdivided into three subgroups assumed to have different typical  activity
patterns  and  levels.   These  activity patterns  were  then  combined  with
simulated pollution levels in the  study areas for  estimates of health risks
to  the  population.   Two  problems with this  study are:   1)  the activity
patterns and levels are modeled,  rather than  measured from a  survey; and 2)
there  is  no emphasis  on asthmatics   (in fact,   asthmatics  are  not  even
identified).

      A more  recent  effort  is  "A Study  of Human  Activity Patterns in
Cincinnati,  Ohio" by PEI  Associates,  Inc.,  for EPRI.   This work involved a
sample of 973 participants who completed  a 3-day  activity diary as well as
a  detailed  background questionnaire.   Although this survey  identified
asthmatics through  an item  on the background  questionnaire,  it did not
focus on  them,  and  the number of asthmatics  in the  sample  was  quite small
due to the low  incidence rate of  asthma.   However, this survey was able to
produce estimates of activity probability  distributions useful  for risk
assessment for the general population.

      There have  also been studies that focused  on asthmatics but did not
concentrate  on  activity  patterns.    A 1978  study prepared  for  Southern
California Edison Co.,  "Asthma  in  Six  Los Angeles  Communities:  Analysis of
Findings  Based  Upon CHESS  1972-1973," examined  many factors  related to
asthma  and  tried to  assess  this  relationship.    CHESS is  the Community
Health and Environmental  Surveillance  System  implemented by EPA.   The data
for this study were collected in a daily diary for the primary purpose of
assessing the potential relationship between pollution and asthma attack
patterns.   Thus, the activity pattern information available from the CHESS
survey is extremely limited.

                     DESCRIPTION OF THE EPRI-RAI  SURVEY

      The EPRI-RAI survey consisted of  the following elements:

      •  instrument,
      •  sampling plan, and
      •  data handling  and analysis.

Each  of  these  involved issues specific  to  an activity  pattern survey of
asthmatics as well as issues  related to any survey in general.

                                   9-4

-------
SURVEY INSTRUMENT

      The survey  instrument consisted  of two  components:    a  background
questionnaire and a 3-day diary.   Each  participant completed one background
questionnaire,  including questions characterizing his/her asthma;  relevant
medical  history;  questions on  medication  usage  and doctor/hospital contact;
questions on the  relationship of  various  factors  and  asthma,  and exposure
to these  factors;  and  demographic information.   This  questionnaire was
based in  part  on  several  questionnaires focusing  on asthma and  related
diseases,   particularly a  questionnaire  on  asthma  developed  by  the
International  Union  Against  Tuberculosis,    and  the  American  Thoracic
Society/Division  of Lung Diseases  Respiratory Questionnaire.

      The 3-day  diary  included  questions  on  activity  descriptions  and
durations, symptomatic response,  medication  usage,  and exposure  to factors
associated with asthma  (e.g.,  stress/anxiety  level).   The diary was filled
out hourly  during waking hours,  with  a  separate  overnight  summary.   In
addition,  a  summary of the day was  completed that  evaluated the  day
relative to other days in terms of activity level and various symptoms.

      The hourly  form contained eight questions  for  each of up  to three
activities per hour,  plus 15 general  questions.   (See  the  appendix for a
sample diary sheet.)   The main activity was considered to  be  that which
lasted the  longest, while the second and third activities  had  to last at
least 10  minutes each.   Our  hourly format  is  a big  change  from previous
activity  pattern surveys in which people made  an  entry  for each activity
engaged in.  It  was felt that an  hourly format placed less  of a burden on
the  participant  while   still  obtaining  the  desired information.    The
overnight summary asked  questions  very  similar  to those on the hourly diary
form, e.g.,  "What was your breathing  like during the  night?" and "Did you
take  any  medication for asthma  between  going  to bed  and  getting up in the
morning?".
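
      The rule for choosing which activities to record in a given hour can
be sketched as follows.  This is only an illustration of the rule as stated
above (longest activity first, then up to two others of at least 10 minutes
each); the function name, the input format, and the choice to rank the
remaining activities by duration are assumptions, not part of the survey
instrument.

```python
def select_hourly_activities(activities, min_minutes=10, max_entries=3):
    """Pick the diary entries for one hour.

    activities -- list of (name, minutes) tuples for the hour (hypothetical format)
    Returns up to max_entries names: the longest activity first, then any
    others lasting at least min_minutes, in order of decreasing duration.
    """
    if not activities:
        return []
    # sorted() is stable, so activities of equal length keep their listed order.
    ranked = sorted(activities, key=lambda item: item[1], reverse=True)
    main, others = ranked[0], ranked[1:]
    extras = [a for a in others if a[1] >= min_minutes]
    return [name for name, _ in [main] + extras[: max_entries - 1]]


# Illustrative hour only (not survey data).
hour = [("driving", 15), ("shopping", 35), ("telephone", 5), ("walking", 5)]
entries = select_hourly_activities(hour)
```

In this example only two activities qualify, so the third slot on the hourly
form would be left blank.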

SAMPLING PLAN

      The survey was  administered to two different groups at  two different
times.   The  first group consisted of asthmatics who  had been subjects in
the clinical  studies at Rancho  Los  Amigos Medical Center  near Los Angeles.
This  source  provided  us with an  easily  obtainable  sample  of participants
whose  clinical   responses  could  be  compared  to  their  survey  responses.
However,  the survey results  for this  group cannot be considered estimates
of general population values.   In  addition, the survey of this group acted
as a pretest for the larger second group.

      This  administration of the survey took  place  in April  before the
worst of  the smog  season.  All  the participants completed the diary during
the three days Thursday, Friday,  and  Saturday  in order  to  cover at least
one weekday  and  one weekend  day.   Previous activity pattern surveys had
indicated that Saturday and Sunday exhibit different patterns,  but given
the  small  number  of  participants,   it  was  felt   that  obtaining  more


                                   9-5

-------
information on weekdays  was  more important than obtaining  information on
both weekend days.

      The second  group surveyed consisted of a random sample of asthmatics
living in Cincinnati,  Ohio.   This city  was chosen as being  "typical" in
many different ways,  including pollution patterns.  Asthmatics  living in
the three counties  surrounding and including the City of Cincinnati  were
utilized  in order to include  both urban and  rural populations.  This sample
was  surveyed during  the  summer  (August),  when  people are active  and
children  are out  of school.  The participants completed the diary on either
Friday,   Saturday,   Sunday,   and Monday   (because  there  were  enough
participants  in  this  sample  to gather information on  a  weekday  plus  both
weekend days).

      Since the prevalence  rate of asthma  is quite low  (about 3-5%), simple
random sampling  would have  been  prohibitively expensive,  so  a  technique
known as multiplicity  sampling  was used.   In this approach, households were
telephoned randomly and any asthmatic in a selected household was eligible
for the  study.   In addition,  a randomly selected adult  in  the  household
could nominate asthmatics from among relatives living in  the same 3-county
area.  Only children,  parents,  and siblings were eligible for nomination.
This method  produces  a  probability sample,  thus permitting  projections to
the  population,   although the data  collected  for  this sample must be
weighted to adjust for the  different probabilities of selection.
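
      The weighting step can be illustrated with a small sketch.  The paper
does not give its weighting formula, so the rule below is an assumption:  an
asthmatic can enter the sample through his or her own household or through a
nomination by the randomly selected adult in a relative's household, and the
weight is taken as the inverse of the expected number of such routes.  All
function names and the input format are hypothetical.

```python
def selection_multiplicity(relative_households):
    """Expected number of household 'routes' by which one asthmatic can be selected.

    relative_households -- list of (adults_in_household, eligible_relatives_among_them)
    tuples, one per other household in the area that contains a parent, child,
    or sibling of the asthmatic (hypothetical input format).
    """
    routes = 1.0  # the asthmatic's own household
    for adults, eligible_relatives in relative_households:
        # A relative can nominate only if he or she is the adult picked at random.
        routes += eligible_relatives / adults
    return routes


def sampling_weight(relative_households):
    """Weight inversely proportional to the chance of selection (assumed rule)."""
    return 1.0 / selection_multiplicity(relative_households)


def weighted_proportion(flags, weights):
    """Weighted share of respondents for whom a yes/no characteristic is True."""
    total = sum(weights)
    return sum(w for flag, w in zip(flags, weights) if flag) / total


# Illustrative respondents only (not survey data).
w_a = sampling_weight([])                # no relatives nearby: 1 route
w_b = sampling_weight([(2, 1), (1, 1)])  # 1 + 1/2 + 1/1 = 2.5 routes
share = weighted_proportion([True, False], [w_a, w_b])
```

Respondents with many nearby relatives are more likely to be reached, so they
receive smaller weights when results are projected to the population.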

      An  important  issue  of  these  surveys  was  the  definition  of
"asthmatic."  The Los  Angeles  sample included asthmatics  as  defined by the
clinical  studies.  These studies utilized medical histories and physiologic
testing,  including  pulmonary function tests, as well  as doctor  diagnosis
and current symptoms,  to  assess  each subject's asthmatic status.

      The  Cincinnati  survey  used  two criteria  to identify   eligible
asthmatics.    The  first,  and  easiest  to determine,  was  that  the asthma had
to have been diagnosed by a doctor, i.e.,  each  participant was asked:   "Has
a doctor ever told you that you had asthma?"  The second  criterion was that
the participant  had to exhibit current symptoms of  asthma,  as determined
from the following questions:

      1.     "Have you had  any  wheezing  in the past 12 months, and  if so,
            how often?"

      2.     "Have you   had any  chest  tightness  in  the  past  12 months, and
            if so,  how often?"

If the respondent answered "yes" and "every day," "once a month," or "once
in a while" to either  of these  questions,  he or  she was considered eligible
for the study.
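
      The two eligibility criteria can be expressed as a short screening
function.  This is a sketch under stated assumptions:  the three frequency
answers quoted above are treated as the only qualifying ones, any other
answer categories offered on the telephone screener are not given in the
text, and the function and parameter names are hypothetical.

```python
# Frequency answers quoted in the text as qualifying for the study.
QUALIFYING_FREQUENCIES = {"every day", "once a month", "once in a while"}


def has_current_symptom(had_symptom, frequency):
    """True if the respondent reported the symptom with a qualifying frequency."""
    return bool(had_symptom) and frequency in QUALIFYING_FREQUENCIES


def is_eligible(doctor_diagnosed, wheezing, chest_tightness):
    """Apply both criteria: doctor diagnosis and current symptoms.

    wheezing, chest_tightness -- (had_symptom, frequency) pairs for the
    past 12 months; frequency may be None when the symptom was not reported.
    """
    current = has_current_symptom(*wheezing) or has_current_symptom(*chest_tightness)
    return bool(doctor_diagnosed) and current


# Illustrative answers only (not survey data).
eligible = is_eligible(True, (False, None), (True, "once a month"))
not_diagnosed = is_eligible(False, (True, "every day"), (True, "every day"))
```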

      Since  a monetary  incentive was offered for  participation  in  the
surveys,  it was  felt  that an   additional  check was necessary to prevent
nonasthmatics  in Cincinnati  from misleading the interviewers.   Thus, the
participant agreement  form for this group required the name  and address of

                                    9-6

-------
the  individual's  doctor,  plus the  agreement  that  this  doctor might  be
contacted for confirmation of asthma diagnosis.

ANALYSIS

      The analysis goals for the data  from  the  two samples are different.
The ultimate focus of the Los Angeles data analysis is to relate activity,
medication,  and symptom  patterns to  the  clinical  response  patterns  of the
participants.  One  issue is how the day-to-day symptom  patterns  obtained
from  the survey relate  to  the symptom  and pulmonary function  responses
obtained  in the  laboratory.    The other  issue  is  whether,   and  how,  a
participant's clinical  responses  can be further  elucidated by his  or her
activity  patterns,   medical patterns,   and exposure  to various  factors
affecting asthma.

      The  focus  of  the  Cincinnati  survey  is   to  estimate activity,
medication,  and  symptomatic  patterns  of  the   general   population  of
asthmatics.    The  ultimate goals  of  the  analysis are  policy-oriented,
especially   the   assessment of asthmatics'  risk  from  various  factors,
particularly air pollution.

      The analysis of the data from the two surveys will have some common
aspects, of course.  First, the activity patterns of both groups will be
examined and compared, both to each other and to those of nonasthmatics.
Second, the patterns of symptoms, medication, and exposure to explanatory
factors will be studied.  One way that these patterns will be examined is
by subgroups of the samples, e.g., by severity of asthma.  The clinical
studies developed criteria for separating asthmatics into two categories --
minimal/mild and moderate/severe -- which will be utilized in the survey
analyses.  Third, the patterns discernible among asthmatic symptoms will
be related to activity, medication, and exposure patterns.

            PRELIMINARY  FINDINGS FROM THE L.A. SURVEY DIARY  DATA

      The data  from the L.A.  survey are currently being  analyzed.   Some
preliminary findings from the hourly diary data are:

      1.    Asthmatics  spend much  of their time  inside  at  a  low  exertion
            level.

      2.    Several  asthma  symptoms  (wheezing,  coughing,  chest tightness,
            and asthma attacks)  occurred  at  fairly high rates.

      3.    Exertion level  is significantly related to asthma symptoms in
            the expected  direction  (as exertion  level  increases,  symptom
            incidence increases).

      4.    Several other factors, such as stress, are related to asthma
            symptoms, and many asthmatics are exposed to such factors.

      5.    Medication is taken  in  response  to asthma symptoms, rather than
            as part  of a maintenance regimen.

      6.    Medication  has  a  strong  mitigating  effect  on  respiratory
           symptoms resulting from strenuous  exercise.

      Thus,   a  preliminary  analysis  of  the  data  indicates  several
interesting patterns.   It  is  hoped  that a  further  understanding  of the
factors related to  asthma,  as well as the  coping  mechanisms  utilized by
asthmatics, will aid in air quality policy aimed at this group.

           The  work described  in  this paper was not funded by the
           U.S.  Environmental Protection Agency  and,  therefore,
           the  contents  do not necessarily  reflect  the views of
           the  Agency  and  no  official endorsement  should  be
           inferred.

-------
                                 APPENDIX

      Editor's Note:  The questionnaire given in this appendix was
originally on two sides of a 17"x11" sheet of paper.  We were forced to cut
it and then reproduce it on several pages so that it would fit into this
report.

-------
        60 Minutes = 1 hour

                                    START HERE

Q.1  (CIRCLE ONE)

     Time of day:   a.m.   p.m.

Q.2  (CIRCLE ONE)

     Report on beginning and ending hours

     12-1    1-2    2-3    3-4    4-5    5-6
      6-7    7-8    8-9    9-10   10-11  11-12

     Less than an hour, from _____ to _____

     Is it noon or midnight?

Q.3  (CIRCLE ONE)

     Day of the week

     Monday 01    Tuesday 02    Wednesday 03    Thursday 04
     Friday 05    Saturday 06   Sunday 07

-------
                   MAIN ACTIVITY IN LAST HOUR

Q.4  (CIRCLE ONE)

     What activity took up most time in the last hour?
     (This will be MAIN ACTIVITY)

     01 getting ready to go somewhere
     02 preparing/eating meal
     03 work in/around the home
     04 traveling (e.g., to or from work)
     05 work away from home
     06 exercising
     07 watching tv, reading, talking
     08 child care
     09 shopping or errands
     10 other _____

     GO TO Q.5

Q.5  (FILL IN THE BLANKS)

     About how many minutes of the past hour did you spend
     on MAIN ACTIVITY?

     # of minutes: [_____]

Q.6  How many of the minutes reported in Q.5 were in the
     following areas:

     (FILL IN ZERO (0) IF NO TIME IN AN AREA)

          outside: _____
           inside: _____
     in a vehicle: _____
            TOTAL: _____

     (TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.5)

     GO TO Q.7A

Q.7A (CIRCLE ALL THAT APPLY)

     What were your levels of exertion while doing MAIN ACTIVITY?

     1 strenuous (e.g., jogging)
     2 moderate (fast walking)
     3 mild (slow walking)
     4 easy (sitting, standing)

     GO TO Q.7B

Q.7B (FILL IN ALL THAT APPLY)

     About how many minutes of each type of exertion?

     TOTAL: _____
     (SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.5)

     GO TO Q.8A

Q.8A (CIRCLE ALL THAT APPLY)

     What was your breathing like during MAIN ACTIVITY?

     01 wheezing
     02 coughing
     03 tightness of chest
     04 shortness of breath
     05 substernal irritation
     06 fast breathing*
     07 shallow breathing*
     08 heavy breathing*
     09 normal: slow to moderate

Q.8B *If you marked fast, shallow or heavy breathing in Q.8A:
     How many minutes did this last?  [_____]

     GO TO Q.8C

Q.8C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.8A)

     Was the breathing you reported in Q.8A an asthma symptom for you?

     Yes     No     Don't Know

     GO TO Q.9

Q.9  (CIRCLE ONE)

     Did you experience any asthma symptom(s) not mentioned in Q.8A?

     1 Yes, describe below
     2 No (GO TO Q.10)

     GO TO Q.10

Q.10 (CIRCLE ALL THAT APPLY)

     Were you feeling stressed or anxious while doing MAIN ACTIVITY?

     1 Yes, stressed
     2 Yes, anxious
     3 No

     GO TO Q.11

Q.11 (CIRCLE ALL THAT APPLY)

     Was anyone smoking while you did MAIN ACTIVITY?

     1 No one was smoking
     2 I was smoking
     3 Someone else was smoking
     8 I don't know

     If your MAIN ACTIVITY took all 60 minutes of the last hour,
     GO TO Q.28.  If not, please go to Q.12.

                                             Reproduced from
                                             best available copy.

-------
                   2ND ACTIVITY IN LAST HOUR

Q.12 (CIRCLE ONE)

     What other activity took time (at least 10 minutes) in the last hour?
     (This will be 2ND ACTIVITY for the past hour)

     01 getting ready to go somewhere
     02 preparing/eating meal
     03 work in/around the home
     04 traveling (e.g., to or from work)
     05 work away from home
     06 exercising
     07 watching tv, reading, talking
     08 child care
     09 shopping or errands
     10 other _____

     GO TO Q.13

Q.13 (FILL IN THE BLANKS)

     About how many minutes of the past hour did you spend
     on 2ND ACTIVITY?

     # of minutes: [_____]

Q.14 How many of the minutes reported in Q.13 were in the
     following areas:

     (FILL IN ZERO (0) IF NO TIME IN AN AREA)

          outside: _____
           inside: _____
     in a vehicle: _____
            TOTAL: _____

     (TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.13)

     GO TO Q.15A

Q.15A (CIRCLE ALL THAT APPLY)

     What were your levels of exertion while doing 2ND ACTIVITY?

     1 strenuous (e.g., jogging)
     2 moderate (fast walking)
     3 mild (slow walking)
     4 easy (sitting, standing)

     GO TO Q.15B

Q.15B (FILL IN ALL THAT APPLY)

     About how many minutes of each type of exertion?

     TOTAL: _____
     (SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.13)

     GO TO Q.16A

Q.16A (CIRCLE ALL THAT APPLY)

     What was your breathing like during 2ND ACTIVITY?

     01 wheezing
     02 coughing
     03 tightness of chest
     04 shortness of breath
     05 substernal irritation
     06 fast breathing*
     07 shallow breathing*
     08 heavy breathing*
     09 normal: slow to moderate

Q.16B *If you marked fast, shallow or heavy breathing in Q.16A:
     How many minutes did this last?  [_____]

     GO TO Q.16C

Q.16C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.16A)

     Was the breathing you reported in Q.16A an asthma symptom for you?

     Yes     No     Don't Know

     GO TO Q.17

Q.17 (CIRCLE ONE)

     Did you experience any asthma symptom(s) not mentioned in Q.16A?

     1 Yes, describe below
     2 No (GO TO Q.18)

     GO TO Q.18

Q.18 (CIRCLE ALL THAT APPLY)

     Were you feeling stressed or anxious while doing 2ND ACTIVITY?

     1 Yes, stressed
     2 Yes, anxious
     3 No

     GO TO Q.19

Q.19 (CIRCLE ALL THAT APPLY)

     Was anyone smoking while you did 2ND ACTIVITY?

     1 No one was smoking
     2 I was smoking
     3 Someone else was smoking
     8 I don't know

     If you did something else (a 3RD ACTIVITY) in the past hour,
     GO TO Q.20.  If you have accounted for all of the past hour,
     GO TO Q.28.

-------
                   3RD ACTIVITY IN LAST HOUR

Q.20 (CIRCLE ONE)

     What other activity took time (at least 10 minutes) in the last hour?
     (This will be 3RD ACTIVITY for the past hour)

Q.21 (FILL IN THE BLANKS)

     About how many minutes of the past hour did you spend
     on 3RD ACTIVITY?

     # of minutes: [_____]

Q.22 How many of the minutes reported in Q.21 were in the
     following areas:

     (FILL IN ZERO (0) IF NO TIME IN AN AREA)

          outside: _____
           inside: _____
     in a vehicle: _____
            TOTAL: _____

     (TOTAL SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.21)

     GO TO Q.23A

Q.23A (CIRCLE ALL THAT APPLY)

     What were your levels of exertion while doing 3RD ACTIVITY?

     1 strenuous (e.g., jogging)
     2 moderate (fast walking)
     3 mild (slow walking)
     4 easy (sitting, standing)

     GO TO Q.23B

Q.23B (FILL IN ALL THAT APPLY)

     About how many minutes of each type of exertion?

     TOTAL: _____
     (SHOULD EQUAL NUMBER OF MINUTES IN BOX IN Q.21)

     GO TO Q.24A

Q.24A (CIRCLE ALL THAT APPLY)

     What was your breathing like during 3RD ACTIVITY?

     01 wheezing
     02 coughing
     03 tightness of chest
     04 shortness of breath
     05 substernal irritation
     06 fast breathing*
     07 shallow breathing*
     08 heavy breathing*
     09 normal: slow to moderate

Q.24B *If you marked fast, shallow or heavy breathing in Q.24A:
     How many minutes did this last?  [_____]

Q.24C (ANSWER BELOW, FOR EACH ITEM CIRCLED IN Q.24A)

     Was the breathing you reported in Q.24A an asthma symptom for you?

     Yes     No     Don't Know

     GO TO Q.25

Q.25 (CIRCLE ONE)

     Did you experience any asthma symptom(s) not mentioned in Q.24A?

     1 Yes, describe below
     2 No (GO TO Q.26)

     GO TO Q.26

Q.26 (CIRCLE ALL THAT APPLY)

     Were you feeling stressed or anxious while doing 3RD ACTIVITY?

     1 Yes, stressed
     2 Yes, anxious
     3 No

     GO TO Q.27

Q.27 (CIRCLE ALL THAT APPLY)

     Was anyone smoking while you did 3RD ACTIVITY?

     1 No one was smoking
     2 I was smoking
     3 Someone else was smoking
     8 I don't know

     GO TO Q.28

-------
These questions are for all activities for the past hour. . .

Q.28 Did any chest symptoms limit your activity in the past hour?
     (CIRCLE ONE)

     1 Yes -- GO TO Q.29
     2 No -- GO TO Q.30

Q.29 What type of activity was affected?  (CIRCLE AS MANY AS APPLY)

     1 limited my recreational activity
     2 limited my work
     3 limited my study
     4 other, please describe _____

     GO TO Q.30

Q.30 Did any nose or throat symptoms limit your activity?  (CIRCLE ONE)

     1 Yes -- GO TO Q.31
     2 No -- GO TO Q.32

Q.31 What type of activity was affected?  (CIRCLE AS MANY AS APPLY)

     1 limited my recreational activity
     2 limited my work
     3 limited my study
     4 other, please describe _____

     GO TO Q.32

Q.32 Did you take any medication for asthma during the past hour?
     (CIRCLE ONE)

     1 Yes -- GO TO Q.33 AND Q.34
     2 No -- GO TO Q.35

Q.33 What type of medication did you take?  (CIRCLE AS MANY AS APPLY)

Q.34 Was the medication you reported in Q.33 a maintenance dose or was it
     in response to specific symptoms you had in the last hour?
     ANSWER BELOW FOR EACH CIRCLE IN Q.33

     Q.33                                      Q.34
     1 inhaled bronchodilator                  1 maintenance   2 response   3 both
     2 inhaled steroid                         1 maintenance   2 response   3 both
     3 inhaled cromolyn powder                 1 maintenance   2 response   3 both
     4 oral bronchodilator (pill or liquid)    1 maintenance   2 response   3 both
     5 oral steroid (pill or liquid)           1 maintenance   2 response   3 both
     6 other, please describe _____           1 maintenance   2 response   3 both

     GO TO Q.34                                GO TO Q.35

Q.35 Did you experience heavy or rapid breathing for 5 minutes or more
     during the past hour?  (CIRCLE ONE)

     1 Yes
     2 No

     GO TO Q.36

Q.36 Did you drink any beverage containing caffeine (e.g., coffee, tea,
     soda) during the past hour?  (CIRCLE ONE)

     1 Yes
     2 No

     GO TO Q.37

Q.37 The weather outside is:  (CIRCLE AS MANY AS APPLY)

     1 sunny        4 other
     2 cloudy       5 dark
     3 rainy        8 I can't tell

     GO TO Q.38

Q.38 The weather outside feels:  (CIRCLE AS MANY AS APPLY)

     1 damp         4 windy        7 other:
     2 cold         5 warm         8 I can't tell
     3 humid        6 hot

     GO TO Q.39

-------
Q.39  Were you exposed to any irritants or factors not covered above that affected your breathing in the last hour?
      (CIRCLE ONE)
      1  Yes, please describe _____
      2  No        GO TO Q.40

Q.40  If you were in a house or other building in the last hour, please answer Q.40 a, b, and c; otherwise, you're done
for this hour.  Thanks.

Q.40a  Was air conditioner or       Q.40b  Was a gas stove       Q.40c  Were any windows open?
central air cooling on?             used for cooking?            (CIRCLE ONE)
(CIRCLE ONE)                        (CIRCLE ONE)
1  Yes                              1  Yes                        1  Yes
2  No                               2  No                         2  No
6  I don't know                     6  I don't know               6  I don't know
Thanks; another hour is finished!

-------
      FOR NOTES ABOUT THE NEXT HOUR -

                             Time
      Activity          Begin     End

-------
                   THE TREATMENT OF MISSING SURVEY  DATA

                  by:  Graham Kalton and Daniel Kasprzyk
Editor's Note:  This published article has been copied with the kind
permission of the editor and the publishers of the journal, SURVEY
METHODOLOGY.

-------
Survey Methodology, June 1986
Vol. 12, No. 1, pp. 1-16
Statistics Canada
                The Treatment of Missing Survey  Data

                  GRAHAM KALTON and DANIEL KASPRZYK¹

                                    ABSTRACT
Missing survey data occur because of total nonresponse and item nonresponse. The standard way to
attempt to compensate for total nonresponse is by some form of weighting adjustment, whereas item
nonresponses are handled by some form of imputation. This paper reviews methods of weighting
adjustment and imputation and discusses their properties.

KEY WORDS: Nonresponse;  item nonresponse; Weighting adjustments; Imputation.

                               1.   INTRODUCTION

   Surveys typically collect responses to a large number of items for each sampled element.
The problem of missing data occurs when some or all of the responses are not collected for
a sampled element or when some responses are deleted because they fail to satisfy edit
constraints. It is common practice to distinguish between total (or unit) nonresponse, when none
of the survey responses are available for a sampled element, and item nonresponse, when
some but not all of the responses are available. Total nonresponse arises because of refusals,
inability to participate, not-at-homes, and untraced elements. Item nonresponse arises because
of item refusals, "don't knows", omissions and answers deleted in editing.
   This paper reviews the general-purpose methods available for handling missing survey data.
The distinction between total and item nonresponse is useful here since different adjustment
methods are used for these two cases. In general the only information available about total
nonrespondents is that on the sampling frame from which the sample was selected (e.g., the
strata and PSUs in which they are located). The important aspects of this information can
usually be readily incorporated into weighting adjustments that attempt to compensate for
the missing  data. Hence as a rule weighting adjustments are used for total nonresponse.
Methods for making weighting adjustments are reviewed in Section 2.
   In the case of item nonresponse, however, a great deal of additional information is available
for the elements involved: not only the information from the sampling frame, but also their
responses for other survey items. In order to retain all survey responses for elements with
some item nonresponses, the usual adjustment procedure produces analysis records that
incorporate the actual responses to items for which the answers were acceptable and imputed
responses for other items. Imputation methods for assigning answers for missing responses
are reviewed in Section 3.
   In general the choice between weighting adjustments and imputation for handling missing
survey data is fairly clearcut; there are cases, however, when the choice is not so clear.
These are cases of what may be termed partial nonresponse, when some data are collected
for a sampled element but a substantial amount of data is missing. Partial nonresponse can
arise, for instance, when a respondent terminates an interview prematurely, when data are
not obtained for one or more members of an otherwise cooperating household (for household
level analysis), or when a sampled individual provides data for some but not all waves of
a panel survey. Discussions of the choice between weighting and imputation to compensate
for wave nonresponse in a panel survey are given by Cox and Cohen (1985) and Kalton (1986).
¹ Graham Kalton, Survey Research Center, University of Michigan, Ann Arbor, Michigan, 48106-1248 and Daniel
  Kasprzyk, Population Division, U.S. Bureau of the Census, Washington, D.C., 20233. The authors would like
  to thank the referees for their helpful comments.

-------

   Although weighting adjustments and imputation are treated as separate approaches in
 the discussion below, they are in fact closely related. The relationship and differences
 between the two approaches are briefly discussed in Section 4, which also mentions some
 alternative ways of handling missing survey data.


                         2.   WEIGHTING ADJUSTMENTS

   Weighting adjustments are primarily used to compensate for total nonresponse. The essence
 of all weighting adjustment procedures is to increase the weights of specified respondents
 so that they represent the nonrespondents. The procedures require auxiliary information on
 either the nonrespondents or the total population. The following four types of weighting
 adjustments are briefly reviewed below: population weighting adjustments, sample weighting
 adjustments, raking ratio adjustments, and weights based on response probabilities. More
 details are provided in Kalton (1983).

 2.1   Population Weighting Adjustments
   The auxiliary information used in making population weighting adjustments is the
 distribution of the population over one or more variables, such as the population distribution by
 age, sex and race available from standard population estimates. The sample of respondents
 is divided into a set of classes, termed here weighting classes, defined by the available
 auxiliary information (e.g., White males aged 15-24, non-White females aged 22-34, etc.). The
 weights of all respondents within a weighting class are then adjusted by the same multiplying
 factor, with different factors in different classes. The adjustment is carried out in such a
 way that the weighted respondent distribution across the weighting classes conforms to the
 population distribution.
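
 The adjustment just described can be sketched in a few lines of Python. This is only an
 illustration under stated assumptions, not a procedure taken from the paper: the function
 name `population_weighting`, the class labels, and the sample data are hypothetical, and
 every weighting class is assumed to contain at least one respondent.

```python
from collections import defaultdict

def population_weighting(respondents, population_shares):
    """Return per-class multiplying factors and the adjusted weights.

    respondents: list of (weighting_class, base_weight) pairs (hypothetical layout).
    population_shares: dict mapping each weighting class to its known
        population proportion; the proportions are assumed to sum to 1.
    """
    # Weighted respondent total in each weighting class.
    class_totals = defaultdict(float)
    for cls, weight in respondents:
        class_totals[cls] += weight
    grand_total = sum(class_totals.values())

    # One factor per class: population share / weighted respondent share.
    factors = {
        cls: population_shares[cls] / (class_totals[cls] / grand_total)
        for cls in class_totals
    }
    adjusted = [(cls, weight * factors[cls]) for cls, weight in respondents]
    return factors, adjusted

# Illustrative data: class_1 is under-represented among respondents.
respondents = [("class_1", 1.0)] * 30 + [("class_2", 1.0)] * 70
population_shares = {"class_1": 0.5, "class_2": 0.5}
factors, adjusted = population_weighting(respondents, population_shares)
```

 All respondents in a class share one factor, so the under-represented class is weighted
 up and the over-represented class is weighted down until the weighted class shares match
 the population shares.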
   This type of adjustment is often termed poststratification. That term is avoided here,
 however, because although population weighting resembles poststratification, there is an
 important difference between the two. Like population weighting, poststratification weights
 the sample to make the sample distribution conform to the population distribution across
 a set of classes (or strata). However, the standard textbook theory of poststratification is
 concerned only with the sampling fluctuations that cause the sample distribution to deviate
 from the population distribution, not with the more major deviations that can arise from
 varying response rates across the classes. Poststratification adjustments are more like a fine
 tuning of the sample, resulting generally in only small variations in the weights across strata.
 In consequence, provided that the strata are not small, poststratification leads to lower
 standard errors for the survey estimates. In contrast, population weighting adjustments may
 involve more major adjustments and result in higher standard errors.
   Population weighting adjustments attempt to reduce the bias created by nonresponse and
 coverage errors. Consider the estimation of a population mean $\bar{Y}$ from a sample in which
 the elements are selected with equal probability. Suppose that the population is divided into
 a set of weighting classes, with a proportion $W_h$ of elements in class $h$. Assume that
 respondents always respond and that nonrespondents never do. Let $R_h$ and $M_h$ be the
 proportions of respondents and nonrespondents respectively in class $h$, and let
 $\bar{R} = \sum W_h R_h$ be the overall response rate. Then, following Thomsen (1973), the bias
 of the unadjusted respondent mean ($\bar{y}_r$) can be expressed as

 $$B(\bar{y}_r) = \bar{R}^{-1} \sum W_h (\bar{Y}_{rh} - \bar{Y}_r)(R_h - \bar{R}) + \sum W_h M_h (\bar{Y}_{rh} - \bar{Y}_{mh}) = A + B \qquad (1)$$

-------
 Survey Methodology,  June 1986
 where $\bar{Y}_{rh}$ and $\bar{Y}_{mh}$ are the means for respondents and nonrespondents in class h
 respectively, and $\bar{Y}_r$ is the population mean for the respondents. The use of the population weighting
 adjustment leads to the weighted sample mean, $\bar{y}_p = \sum_h W_h \bar{y}_{rh}$, where $\bar{y}_{rh}$ is the respondent
 sample mean in class h. The bias of $\bar{y}_p$ is simply the second term in $B(\bar{y}_r)$, that is,
 $B(\bar{y}_p) = B$.
    If A and B are of the same sign, the population weighting adjustment reduces the
 absolute bias in the estimate of $\bar{Y}$ by |A|. If $\bar{Y}_{rh} = \bar{Y}_{mh}$, as occurs in expectation when the
 nonrespondents are missing at random within the weighting classes, then B = 0. In this case,
 the population weighting adjustment eliminates  the bias. The term A is a covariance-type
 term between the class response rates  and the class respondent means.  It is zero if either
 the response rates or the respondent means do not vary between classes. In either of these
 cases, the population weighting adjustment has no effect on the  bias of the  estimator. It
 may be noted that population weighting adjustments may increase the absolute bias of the
 estimate of $\bar{Y}$. This will occur when A and B are of opposite signs and |A| < 2|B|.
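
The decomposition of the bias into the terms A and B can be checked numerically. The following Python sketch uses a made-up two-class population (all figures are hypothetical) and computes both terms directly from the definitions of the class proportions, response rates, and class means.

```python
# Hypothetical population: class -> (W_h, R_h, respondent mean, nonrespondent mean)
classes = {
    "urban": (0.6, 0.5, 10.0, 12.0),
    "rural": (0.4, 0.8, 20.0, 21.0),
}

# Overall response rate R = sum of W_h * R_h.
R = sum(W * Rh for W, Rh, _, _ in classes.values())

# True population mean and the population mean for the respondents.
Y_bar = sum(W * (Rh * yr + (1 - Rh) * ym) for W, Rh, yr, ym in classes.values())
Y_r = sum(W * Rh * yr for W, Rh, yr, _ in classes.values()) / R

# Term A: covariance-type term between response rates and respondent means.
A = sum(W * (yr - Y_r) * (Rh - R) for W, Rh, yr, _ in classes.values()) / R
# Term B: within-class respondent/nonrespondent differences, with M_h = 1 - R_h.
B = sum(W * (1 - Rh) * (yr - ym) for W, Rh, yr, ym in classes.values())

bias_unadjusted = Y_r - Y_bar
bias_adjusted = sum(W * yr for W, _, yr, _ in classes.values()) - Y_bar
```

In this made-up population A and B have opposite signs with |A| < 2|B|, so the adjusted estimator has the larger absolute bias, which is the last case noted above.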
    Population weighting adjustments require external data on the population  distributions
 for the variables to be used. Care is needed to ensure that the data on which the population
 distributions are based are exactly comparable with the survey data; otherwise, inappropriate
 weights will result. Since the procedure weights up to population distributions, it does more
 than just attempt to compensate  for nonresponse. It also compensates for coverage errors
 and makes  a poststratification adjustment.

 2.2  Sample Weighting Adjustments
    As with population weighting adjustments, with sample weighting adjustments the
 sample is divided into weighting classes; varying weights are then assigned to these classes in
 an attempt to reduce the nonresponse bias.  The essential difference between  the two pro-
 cedures lies in the auxiliary information used. As described above,  population weighting ad-
 justments are based on externally obtained population distributions. No data are needed for
 the sample nonrespondents.  In contrast,  sample weighting adjustments employ only data
 internal  to  the sample and require information  about the nonrespondents.
    With sample weighting adjustments,  the nonresponse adjustment weights for  the weighting
 classes are made proportional to the inverses of the response rates in the classes. In order
 to compute these response rates, the numbers of respondents and nonrespondents in the classes
 must be determined. It is therefore necessary to know to which class each respondent and
 nonrespondent belongs. Since typically very little information about the nonrespondents is
 available, the choice of weighting class is usually severely restricted. It  is often limited to
 general sample design variables (e.g.,  PSUs and strata), characteristics  of those variables
 (e.g., urban/rural, geographical region), and sometimes some additional  variables available
 on the sampling frame. On occasion it may also be possible to collect information on one
 or two variables for the nonrespondents, for instance by interviewer observation.
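
A minimal Python sketch of the sample weighting adjustment follows. The class names and counts are hypothetical, and the function `sample_weighting_adjustments` is an illustration rather than a procedure specified in the paper.

```python
def sample_weighting_adjustments(respondents, nonrespondents):
    """Return class -> adjustment weight, the inverse of the class response rate.

    respondents / nonrespondents: class -> count of sampled elements.
    """
    weights = {}
    for cls, n_resp in respondents.items():
        n_total = n_resp + nonrespondents.get(cls, 0)
        if n_resp == 0:
            # No response rate can be inverted; such a class must be collapsed.
            raise ValueError(f"no respondents in class {cls!r}")
        weights[cls] = n_total / n_resp
    return weights


# Hypothetical counts by a sample design variable known for all sampled elements.
resp_counts = {"urban": 60, "rural": 90}
nonresp_counts = {"urban": 40, "rural": 10}

adj = sample_weighting_adjustments(resp_counts, nonresp_counts)
```

With these weights the weighted number of respondents in each class equals the total sample size in that class (respondents plus nonrespondents).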
    As population weighting adjustments resemble poststratification, so sample weighting ad-
 justments resemble two-phase sampling. The  first phase  sample is the total sample of
 respondents and nonrespondents; the second phase sample is the subsample of respondents,
 selected with different sampling fractions (response rates) in different strata (weighting classes).
 The sample weighted mean can be represented by $\bar{y}_s = \sum_h w_h \bar{y}_{rh}$, where $w_h$ is the proportion
 of the total sample in weighting class h. Assuming no coverage errors, $E(w_h) = W_h$, the
 population proportion in class h, as used in the population weighted estimator


-------
4                              Kalton and Kasprzyk: Treatment of Missing Survey Data

$\bar{y}_p = \sum_h W_h \bar{y}_{rh}$. The bias of $\bar{y}_s$ is the same as that of $\bar{y}_p$, namely $B(\bar{y}_s) = B$ as given in
equation (1); hence the effect of the sample weighting adjustment on the bias of the survey estimate
is the same as that of the population weighting adjustment. Since sample weighting ad-
justments use only data for the sample, they do not compensate for coverage errors (unlike
population weighting adjustments).
   Population and sample weighting adjustments have different data requirements, and hence
address different potential sources of bias. In practice the two forms of adjustment are used
in combination. Generally sample weighting adjustments are applied first, and then popula-
tion weighting adjustments are applied afterwards. A common approach is initially to deter-
mine the sample weights needed to compensate for unequal selection probabilities, next to
revise these weights to compensate for unequal response rates in different sample weighting
classes (e.g., urban/rural classes within geographical regions), and finally to revise the weights
again to make the weighted sample distribution for certain characteristics (e.g., age/sex)
conform to the known population distribution for those characteristics. The use of this approach
in the  U.S. Current  Population Survey is described by Bailar et al. (1978).
   As with  population  weighting adjustments, the aim of sample weighting adjustments is
to reduce the bias that nonresponse may cause in survey  estimates.  An effect of sample
weighting adjustments is, however, to increase the variances of the survey estimates. There
is therefore a trade-off to be made between bias reduction and variance increase.
   An indication of the amount of increase in variance from weighting can be obtained by
considering the situation where the element variances within the  weighting classes are all the
same and the variances between the class means are negligible compared to the within-class
variances. In  this situation, the loss of precision from weighting is approximately the same
as that arising from the use of  disproportionate stratified sampling when proportionate
stratified sampling is optimum; Kish (1965, Section  11.7C; 1976) discusses this latter case.
   Under the above conditions, weighting increases the variance of a sample mean by
approximately $L = (\sum_h W_h k_h)(\sum_h W_h / k_h)$, where $W_h$ is the proportion of the population and
$k_h$ is the weight for class h. An alternative expression for L is $(\sum_h n_h)(\sum_h n_h k_h^2)/(\sum_h n_h k_h)^2$,
where $n_h$ is the sample size in class h. The factor L becomes large when the variance of the
weights is large.
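
The two expressions for L can be compared numerically. In the Python sketch below the class sizes and weights are hypothetical, and the population proportions are taken to be proportional to the class sample size times the class weight, which is an assumption made here to link the two forms.

```python
def loss_from_sample_sizes(n, k):
    """L = (sum n_h)(sum n_h k_h^2) / (sum n_h k_h)^2."""
    numerator = sum(n) * sum(nh * kh ** 2 for nh, kh in zip(n, k))
    denominator = sum(nh * kh for nh, kh in zip(n, k)) ** 2
    return numerator / denominator


def loss_from_population_shares(W, k):
    """L = (sum W_h k_h)(sum W_h / k_h)."""
    return sum(Wh * kh for Wh, kh in zip(W, k)) * sum(Wh / kh for Wh, kh in zip(W, k))


# Hypothetical class sample sizes and weights.
n = [50, 100, 50]
k = [2.0, 1.0, 2.0]

# Assumed link between the two forms: W_h proportional to n_h * k_h.
total = sum(nh * kh for nh, kh in zip(n, k))
W = [nh * kh / total for nh, kh in zip(n, k)]

L_n = loss_from_sample_sizes(n, k)
L_W = loss_from_population_shares(W, k)
L_equal = loss_from_sample_sizes(n, [1.0, 1.0, 1.0])
```

Both forms give 10/9 for these figures, an increase in variance of about 11 percent, while equal weights give L = 1.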
   A large variance in the weights can arise from segmenting the sample into many weighting
classes with only a few sampled elements in each. When the weighting classes are small, their
response rates are unstable,  and this gives rise to a large variation in the weights. To avoid
this effect, it is common practice to limit the extent to which the sample is segmented. Even
so, there may still be some  weighting classes that require  large weights. Sometimes these
weighting classes are handled by collapsing them with adjacent ones and sometimes their
weights are cut back to some acceptable maximum value (see Bailar et al. 1978 and
Chapman et al. 1986, for examples). These procedures avoid the increase in variance associated
with the use of extreme weights, but they may lead to increased bias; their effect on the bias
is, however, unknown.
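
The effect of cutting back extreme weights on precision can be sketched as follows. This Python fragment treats each respondent as its own class (so every class sample size is 1) in the alternative expression for L given earlier; the weights and the cap of 3.0 are hypothetical, and the sketch says nothing about the resulting bias.

```python
def precision_loss(weights):
    """L = n * sum(k^2) / (sum k)^2 for element-level weights k."""
    n = len(weights)
    return n * sum(k * k for k in weights) / sum(weights) ** 2


def trim_weights(weights, max_weight):
    """Cut every weight back to at most max_weight."""
    return [min(k, max_weight) for k in weights]


# Hypothetical respondent weights with one extreme value.
weights = [1.0, 1.0, 1.0, 1.0, 6.0]
trimmed = trim_weights(weights, 3.0)

loss_before = precision_loss(weights)
loss_after = precision_loss(trimmed)
```

Here the loss factor falls from 2.0 to 65/49 (about 1.33) after trimming.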
   In some  cases it seems desirable to use several auxiliary variables in forming the weighting
classes for population or sample weighting adjustments. However, if the classes are formed
by taking the full crossclassification of the variables, there will be a large number of weighting
classes. Unless the sample is very large, the sample sizes in the resultant weighting classes
will be small, and the instability in the response rates will lead to a large variance in the weights
and loss of precision in the  survey estimates. One way to deal  with this problem is to cut
down on the  number of classes by collapsing cells, for instance by discarding some of the
auxiliary variables or using  coarser classifications. Another way is to base the weights on
a model, as is done  in raking ratio weighting discussed below.

-------


2.3  Raking Ratio Adjustments

   When weighting classes are taken to be the cells in the crossclassification of the auxiliary
variables, population weighting adjustments make the joint distribution of the auxiliary
variables in the sample conform to that in the population. Similarly, sample weighting
adjustments make the joint distribution of the auxiliary variables in the respondent sample
conform to that in the total sample. As noted above, however, this crossclassification approach
may have the undesirable effect of creating many small, and hence unstable, weighting classes.
Also, it is not always possible to employ this approach with population weighting adjustments:
in many  cases the population marginal distributions,  and perhaps some bivariate distribu-
tions, of the auxiliary variables are available, but the full joint distribution is unknown.
   An alternative approach is to develop weights that make the marginal distributions of
the auxiliary variables in the sample conform  to  marginal population distributions (with
population weighting) or marginal total sample distributions (with sample weighting), without
ensuring that the full joint distribution conforms. The method of raking ratio estimation,
or raking, may be used to obtain weights that satisfy these conditions. Raking corresponds
to iterative proportional fitting in contingency table analysis (see, for instance, Bishop et
al., 1975).
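
A minimal Python sketch of raking for a two-way table is given below, written as plain iterative proportional fitting. The cell proportions, the target margins, and the function name `rake` are hypothetical, and all cells are assumed to be positive.

```python
def rake(cell_weights, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Adjust a two-way table (list of rows) so its margins match the targets."""
    table = [row[:] for row in cell_weights]
    for _ in range(max_iter):
        # Step 1: scale each row to its target margin.
        for h, row in enumerate(table):
            row_sum = sum(row)
            table[h] = [v * row_targets[h] / row_sum for v in row]
        # Step 2: scale each column to its target margin.
        for k, target in enumerate(col_targets):
            col_sum = sum(row[k] for row in table)
            for row in table:
                row[k] *= target / col_sum
        # Columns now match exactly; stop once the rows match as well.
        error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if error < tol:
            break
    return table


# Hypothetical weighted respondent proportions and known marginal distributions.
respondent_cells = [[0.20, 0.30],
                    [0.10, 0.40]]
row_margins = [0.5, 0.5]
col_margins = [0.4, 0.6]

raked = rake(respondent_cells, row_margins, col_margins)
cell_factors = [[r / s for r, s in zip(raked_row, sample_row)]
                for raked_row, sample_row in zip(raked, respondent_cells)]
```

The raked table reproduces both sets of margins, but nothing forces its individual cells to equal the population joint distribution; `cell_factors` holds the adjustment applied to the weights in each cell.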
   Consider the use of raking in the simple case of two auxiliary variables. Let $W_{hk}$ be the
proportion of the population in the (h, k)-th cell of the crossclassification, and let $\hat{W}_{hk}$ be
the proportion assigned to that cell by the raking algorithm. Conditional on the total and
respondent sample sizes in the cells (and assuming all cells have at least one respondent),
the bias of the raking ratio adjusted sample mean $\bar{y}_q = \sum_h \sum_k \hat{W}_{hk} \bar{y}_{rhk}$ is

$$B(\bar{y}_q) = \sum_h \sum_k W_{hk} M_{hk} (\bar{Y}_{rhk} - \bar{Y}_{mhk}) + \sum_h \sum_k (\hat{W}_{hk} - W_{hk})(\bar{Y}_{rhk} - \bar{Y}_{rh\cdot} - \bar{Y}_{r\cdot k} + \bar{Y}_r)$$

where $W_{hk} = E(w_{hk})$. The first term in this bias corresponds to the bias term B in
equation (1) for the population and sample weighting adjustments. It is zero in expectation if
the cell nonrespondents are random subsets of the cell populations. The second term is zero
if either $\hat{W}_{hk} = W_{hk}$ or there is no interaction in the $\bar{Y}_{rhk}$ for this classification.
   Underlying the raking ratio weighting procedure is a logit model for the cell response rates.
With the model $\ln[R_{hk}/(1 - R_{hk})] = \alpha_h + \beta_k$ for the response rates in a two-way
classification, $\hat{W}_{hk} = W_{hk}$. Thus, under this model, the second term in $B(\bar{y}_q)$ is zero.
   Further discussion of raking ratio weighting is given by Oh and Scheuren (1978a,1978b,
1983).  Oh and Scheuren (1978a) also provide a bibliography on  raking.

2.4  Weighting with Response Probabilities

   Although a number of methods for weighting with response probabilities have been pro-
posed, this approach has not been widely adopted as an adjustment procedure.  The basis
of the approach is to assume that all population elements have  probabilities (usually required
to be non-zero) of responding to the survey.  Some method is  used to estimate the response
probabilities for responding elements. These elements are then given nonresponse adjust-
ment weights that are in inverse proportion  to their estimated response  probabilities.
   An early application of this approach is the well-known procedure of Politz and Sim-
mons (1949, 1950). A single (evening) call is  made to each selected household, and during
the course of the interview  respondents are asked on how many of the previous five evenings
they were  at home at about  the same time. Their response probabilities are then taken to
be the fraction of the six evenings (including the one of the interview) that they were at home,
and the inverses of these probabilities are used in the  analysis. Note that the procedure does
not deal with those who were out on all  six evenings and those who refused.
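
The Politz and Simmons weights can be sketched in a few lines of Python. The respondent values and at-home counts below are hypothetical, as are the function names.

```python
def politz_simmons_weight(previous_evenings_at_home):
    """Weight = inverse of the fraction of six evenings the respondent was at home.

    previous_evenings_at_home: reported count (0 to 5) for the previous five
    evenings; the evening of the interview is added as the sixth.
    """
    if not 0 <= previous_evenings_at_home <= 5:
        raise ValueError("count must be between 0 and 5")
    response_probability = (previous_evenings_at_home + 1) / 6
    return 1 / response_probability


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical respondents: survey value and reported evenings at home.
values = [10.0, 20.0, 30.0]
evenings = [5, 2, 0]

weights = [politz_simmons_weight(e) for e in evenings]
estimate = weighted_mean(values, weights)
```

Respondents who are rarely at home receive the largest weights (6 for someone at home only on the interview evening), so a few such cases can dominate the estimate.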

-------

  Another approach for estimating response probabilities is to regress response status (1
for respondents, 0 for nonrespondents) on a set of variables available for both respondents
and nonrespondents, using a logistic or probit regression. The predicted values from the regres-
sion for the respondents are then taken to be their response probabilities, and  weights in
inverse proportion to these predicted values are used in the analysis. A special case is when
the  predictor variables  are dummy variables that identify a set of classes.  The predicted
response probabilities are then the class response rates, and the method reduces to a sample
weighting adjustment. The method is most appropriate for situations where a good deal of
information is available for the nonrespondents, as for instance when the nonrespondents
are losses after the first wave of a panel survey. Little and David (1983) discuss the
application of the method for panel nonresponse. It should be noted that if the regression is highly
predictive of response status, the resultant weights will vary markedly, leading to a substan-
tial  loss in the precision of the survey estimates.
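
The regression approach can be sketched with a small logistic fit. The Python below fits a two-parameter logistic model by Newton-Raphson using only the standard library; the single predictor, the data, and the function name `fit_logistic` are hypothetical, and a real application would use several predictors and a statistical package.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(respond) = 1 / (1 + exp(-(a + b*x))) by Newton-Raphson."""
    a = b = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Information matrix entries.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


# Hypothetical data: x is known for everyone, y is 1 for respondents, 0 otherwise.
x = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1]

a, b = fit_logistic(x, y)
probabilities = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
# Nonresponse adjustment weights for respondents only.
weights = [1.0 / p for p, yi in zip(probabilities, y) if yi == 1]
```

In this constructed example the fitted probabilities equal the response rates at each value of x (0.25, 0.5, 0.75), so the weights coincide with those of a sample weighting adjustment with three classes.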
  Drew and Fuller (1980, 1981) describe an approach for estimating response probabilities
from the number of respondents secured at successive calls. In their model, the population
is divided into classes. Within each class, every element is assumed to have the same response
probability which remains the same at each call. The model also allows for a proportion
of hard-core nonrespondents that is assumed constant across classes. Under these assump-
tions, the response probabilities for each class and the proportion of hard-core nonrespondents
can be estimated, and hence weighting adjustments can be made. Thomsen and Siring (1983)
adopt a similar approach using a more complex model.
  Finally, mention should be made of a related approach that compensates for nonresponse
by weighting up difficult-to-interview respondents. Bartholomew (1961), for instance, pro-
posed making only two calls in a survey,  and weighting up  the respondents at the second
call to represent the nonrespondents. The assumption behind this approach is  that the
nonrespondents are like the late respondents. This assumption seems questionable, however,
and empirical evidence from an intensive follow-up study of nonrespondents in the U.S. Cur-
rent Population Survey does not support it (Palmer and Jones 1966; Palmer 1967).
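
One reading of the two-call scheme is sketched below in Python: second-call respondents are weighted up so that they also stand in for the nonrespondents. The data and the function name are hypothetical, and the exact form of the weight is an assumption made for illustration rather than a detail given in the text.

```python
def two_call_estimate(first_call_values, second_call_values, n_nonrespondents):
    """Mean in which second-call respondents also represent the nonrespondents."""
    n1 = len(first_call_values)
    n2 = len(second_call_values)
    if n2 == 0:
        raise ValueError("no second-call respondents to weight up")
    # Assumed weight: each second-call respondent represents
    # (n2 + nonrespondents) / n2 sampled elements.
    second_weight = (n2 + n_nonrespondents) / n2
    total = sum(first_call_values) + second_weight * sum(second_call_values)
    return total / (n1 + n2 + n_nonrespondents)


# Hypothetical survey values.
first_call = [10.0, 12.0, 14.0]
second_call = [20.0, 22.0]

estimate = two_call_estimate(first_call, second_call, n_nonrespondents=3)
unadjusted = sum(first_call + second_call) / len(first_call + second_call)
```

The estimate moves toward the second-call mean, which is only appropriate if nonrespondents really do resemble late respondents.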
                                3.  IMPUTATION

   A wide variety of imputation methods has been developed for assigning values for miss-
ing item responses. The aim here is  to provide a brief overview of the methods, the basic
differences between them, and some of the issues involved in imputation. A fuller treatment
is provided by Kalton and Kasprzyk (1982).
   Imputation methods can range from simple ad hoc procedures used to ensure complete
records in data entry to sophisticated hot-deck and regression techniques. The following are
some common imputation procedures:
(a) Deductive imputation. Sometimes the missing answer to an item can be deduced with
    certainty from the pattern of responses to other items. Edit checks should check for con-
    sistency between responses to related items. When the edit checks constrain a missing
    response to only one possible value, deductive imputation can be employed. Deductive
    imputation is the ideal form of  imputation.
(b) Overall mean imputation. This method assigns the overall respondent  mean to all miss-
    ing responses.
(c) Class mean imputation. The total sample is divided into classes according to values of
    the auxiliary variables being used for the imputation (comparable to weighting classes).
    Within each imputation class the respondent class mean is assigned to all missing responses.


-------


(d) Random overall imputation. A respondent is chosen at random from the total respon-
   dent sample, and the selected respondent's value is assigned to the nonrespondent. This
   method is the simplest form of hot-deck imputation,  that is an imputation procedure
   in which the value assigned for a missing response is taken from a respondent to the cur-
   rent survey.
(e) Random imputation within classes. In this hot-deck method, a respondent is chosen at
   random within an imputation class, and the selected respondent's value is assigned to
   the nonrespondent.
(f) Sequential hot-deck imputation. The term sequential hot-deck imputation is used here
   to describe the procedure used with the labor force items in the U.S. Current Population
   Survey (Brooks and Bailar 1978). The procedure starts with a set of imputation classes.
   A single value for the item subject to imputation is assigned for each class (perhaps taken
   from a previous survey). The records in the survey's  data file are then considered in turn.
   If a record has a response for the item in question, its response replaces the value stored
   for the imputation class in which it falls. If the record has a missing response,  it is assign-
   ed the value stored for its imputation class.
      The hot-deck method is similar to  random imputation  within classes. If the order of
   the records in the data file were random, the two methods would be equivalent, apart
   from the start-up process. The non-random order of the list generally acts to  the benefit
   of the hot-deck method since it gives a closer match of donors and recipients provided
   that the file order creates positive autocorrelation.  The benefit is, however,  unlikely to
   be substantial.
      The sequential hot-deck suffers the disadvantage that it  may easily make multiple uses
   of donors, a feature that leads to a loss of precision in survey estimates. Multiple use
   of a donor occurs when, within an imputation class, a record  with a missing response
   is followed by one or more other records with missing responses. The number of imputa-
   tion classes that can be used with the method also  has to be limited in order to ensure
   that donors are available within each class.
      Useful discussions of the sequential hot-deck  method  are provided by Bailar et al.
   (1978), Bailar and Bailar (1978, 1983), Ford (1983), Oh and Scheuren (1980), Oh et al.
   (1980), and Sande (1983).
(g) Hierarchical hot-deck imputation. The above disadvantages of the sequential hot-deck
   are avoided in the hierarchical hot-deck method, a form of hot-deck imputation developed
   for the items in the March Income Supplement of the  Current  Population Survey. The
   procedure sorts respondents and nonrespondents into a large number of imputation classes
   from a detailed categorization of a sizeable set of auxiliary variables. Nonrespondents
   are then matched with respondents on a hierarchical basis, in the sense that if a match
   cannot be made in the initial imputation class, classes are collapsed and the match is made
   at a lower level of detail. Coder (1978) and Welniak and  Coder (1980) provide further
   details on the hierarchical hot-deck procedure.
(h) Regression imputation. This method uses respondent data to regress the variable for which
   imputations are required on a set of auxiliary variables. The regression equation is then
   used to predict the values for the missing responses. The imputed value may either be
   the predicted value, or the predicted value  plus some residual. There are several ways
   in which the residual may be obtained,  as discussed  later.
(i) Distance function matching. This hot-deck method assigns a nonrespondent the value
   of the "nearest" respondent, where "nearest" is defined in terms of a distance function
   for the auxiliary variables. Various forms of distance function have been proposed (e.g.,
   Sande 1979; Vacek and Ashikaga 1980), and the function can be constructed to reduce
   the multiple use of donors by incorporating a penalty for each use (Colledge et al. 1978).
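
Of the procedures listed above, the sequential hot-deck in item (f) is the most algorithmic, and its single pass through the file can be sketched in Python as follows. The record layout, class labels, and start-up values are hypothetical.

```python
def sequential_hot_deck(records, start_values):
    """Impute missing values in file order.

    records: list of (imputation_class, value) pairs, with value None if missing.
    start_values: class -> initial stored value (e.g., from a previous survey).
    Returns a list of (value, was_imputed) pairs in the same order.
    """
    stored = dict(start_values)
    result = []
    for cls, value in records:
        if value is not None:
            # A reported value replaces the value stored for its class.
            stored[cls] = value
            result.append((value, False))
        else:
            # A missing value receives whatever is currently stored.
            result.append((stored[cls], True))
    return result


# Hypothetical file with two imputation classes.
records = [("A", 5), ("B", None), ("A", None), ("A", None), ("B", 7), ("B", None)]
start = {"A": 1, "B": 2}

completed = sequential_hot_deck(records, start)
values = [v for v, _ in completed]
```

The third and fourth records both receive the value 5 from the first record, which illustrates the multiple use of a donor when missing responses follow one another within a class.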


-------


   Although at first sight these may appear a diverse set of procedures, they can nearly all
be fitted within a single unifying  framework. The methods can all be described, at least ap-
proximately, as special cases of  the general regression model
               $$\hat{y}_{mi} = b_{r0} + \sum_j b_{rj} z_{mij} + e_{mi} \qquad (2)$$

where $\hat{y}_{mi}$ is the imputed value for the ith record with a missing y value, $z_{mij}$ are values
reflecting the auxiliary variables for that record, $b_{r0}$ and $b_{rj}$ are the regression coefficients for the
regression of y on x for the respondents, and $e_{mi}$ is a residual chosen according to a specified
scheme for the particular imputation method.
   Equation (2) represents the regression imputation method in an obvious way. If the $e_{mi}$'s
are set at zero, then the imputed value is the predicted value from the regression; otherwise
a residual of some form may be added. The equation also represents class mean imputation
by defining the z's to be dummy variables that represent the classes, and setting $e_{mi} = 0$.
The regression equation then reduces to $\hat{y}_{mi} = \bar{y}_{rh}$, the class mean. Random imputation
within classes is obtained by adding a residual to the class mean, where the residual is the
deviation from the class mean for one of the respondents. Then $\hat{y}_{mi} = \bar{y}_{rh} + e_{rhk}$, where
$e_{rhk}$ is the deviation for respondent k in class h; this reduces to $\hat{y}_{mi} = y_{rhk}$, the value for that
respondent. The sequential and hierarchical hot-deck methods resemble the random within
class method. The overall mean and random overall imputation methods are degenerate cases
of the class mean and random within class methods that use no auxiliary information.
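
The way class mean imputation and random imputation within classes share one form can be sketched in Python: both impute the class mean plus a residual, and differ only in whether that residual is zero or a respondent's deviation. The data and function name below are hypothetical.

```python
import random


def impute_by_class(respondents, missing_classes, stochastic=False, rng=None):
    """Impute class mean plus residual for each record with a missing value.

    respondents: list of (class, y) pairs.
    missing_classes: the imputation class of each record needing a value.
    """
    rng = rng or random.Random()
    by_class = {}
    for cls, y in respondents:
        by_class.setdefault(cls, []).append(y)
    class_mean = {cls: sum(ys) / len(ys) for cls, ys in by_class.items()}

    imputed = []
    for cls in missing_classes:
        residual = 0.0  # deterministic case: residual set at zero
        if stochastic:
            donor_value = rng.choice(by_class[cls])
            residual = donor_value - class_mean[cls]  # donor's deviation
        imputed.append(class_mean[cls] + residual)
    return imputed


# Hypothetical respondent data in two imputation classes.
respondents = [("A", 10.0), ("A", 14.0), ("B", 30.0), ("B", 34.0), ("B", 38.0)]
missing = ["A", "B", "B"]

means_only = impute_by_class(respondents, missing)
draws = impute_by_class(respondents, missing, stochastic=True, rng=random.Random(0))
```

With `stochastic=True` each imputed value equals the value of some respondent in the same class, because the class mean cancels when the donor's deviation is added back.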
   An important consideration in the choice of imputation method is the type of variable
being imputed. All  the above methods can be applied routinely with continuous variables,
but some of them are not suitable for use with categorical or discrete variables (such as being
a member of the labor force (1) or not (0), and the number of completed years of
education). Overall mean, class mean, and regression imputations impute values like 0.7 for being
a member of the labor force (i.e., a 70% chance) and 10.7 for the number of completed
years of education.  These values are not feasible for individual respondents, and rounding
them to whole numbers leads to bias. For this reason, these imputation methods do not  work
well for categorical and discrete variables. A notable advantage of all hot-deck methods is
that they always give feasible values since the values are taken from respondents.
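
The rounding problem can be seen in a small Python sketch with hypothetical data: ten respondents, seven of whom are in the labor force, and 1000 records needing an imputed value.

```python
import random

# Hypothetical respondent values: 1 = in the labor force, 0 = not.
respondent_values = [1] * 7 + [0] * 3
n_missing = 1000

mean_value = sum(respondent_values) / len(respondent_values)  # not a feasible value
rounded_value = round(mean_value)  # every missing record gets the same whole number

# Hot-deck alternative: draw an actual respondent value for each missing record.
rng = random.Random(0)
hot_deck_values = [rng.choice(respondent_values) for _ in range(n_missing)]

share_after_rounding = rounded_value  # proportion in the labor force among imputed records
share_after_hot_deck = sum(hot_deck_values) / n_missing
```

Rounding assigns everyone with a missing value to the labor force, whereas the hot-deck draws are all 0 or 1 and their proportion stays close to the respondent proportion of 0.7.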
   There are two major distinguishing features of the above imputation methods that deserve
elaboration: whether or not a residual is added and, if one is, the form of the residual; and
whether the auxiliary information is  used in dummy variable form to represent classes or
whether it is  used straightforwardly in  the regression. These features are discussed in the
next two subsections. Other issues arising with the use of imputation are then discussed  in
subsequent subsections.

3.1 Choice of Residuals
   Imputation methods may be classified as deterministic or stochastic according to whether
the $e_{mi}$'s are set at zero or not. For each deterministic imputation method, there is a
stochastic counterpart. Let $\hat{y}_{mid}$ be the value imputed by the deterministic method and
$\hat{y}_{mis} = \hat{y}_{mid} + e_{mi}$ be that imputed by the corresponding stochastic method. Then
$E_2(\hat{y}_{mis}) = \hat{y}_{mid}$, where $E_2$ denotes expectation over the sampling of residuals given the
initial sample, provided that $E_2(e_{mi}) = 0$ (as generally applies).
   The choice between a deterministic and the corresponding stochastic imputation method
depends on the form of survey analysis to be conducted. Consider first the estimation of
the population mean of the y-variable using the sample mean of the respondents' values and


-------

the nonrespondents'  imputed  values. As Kalton  and Kasprzyk  (1982) show, given that
$E_2(\hat{y}_{mis}) = \hat{y}_{mid}$, it follows that the expectation of the sample mean is the same whether the
deterministic method or the corresponding stochastic method is used. Thus both methods
have the same effect on the bias of the estimate. However, the addition of random residuals
in the stochastic method causes a loss of precision in the sample mean.  Although this loss
can be controlled by the choice of a suitable method of sampling residuals (Kalton and Kish
1984), nevertheless some loss in precision occurs. For this reason a deterministic scheme is
preferable for  the purpose of estimating the population mean.
   Consider now the  estimation of the element standard deviation and distribution of the
y-variable. Deterministic imputation methods fare badly for these purposes, since they cause
an attenuation in the  standard deviation and they distort the shape of the distribution. This
may be simply illustrated in terms of the class mean imputation method. By assigning the
class mean to all the missing values in a class, the shape of the distribution is clearly distorted
with a series of spikes at the class means. The standard deviation of the distribution is at-
tenuated  because the  imputed values reflect only the between-class and not the within-class
variance. The appeal  of the stochastic imputation methods is that the residual term captures
the within-class (or residual) variance, and hence avoids the attenuation of the element stan-
dard deviation and the distortion of the distribution.
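
The attenuation can be illustrated by simulation. The Python sketch below generates hypothetical respondent data in two classes, imputes as many missing values as there are respondents in each class, and compares the element standard deviations; the class means, spread, and sample sizes are arbitrary choices.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical respondents: two classes with different means, same spread.
class_means = {"A": 40.0, "B": 60.0}
respondents = {cls: [rng.gauss(mu, 10.0) for _ in range(400)]
               for cls, mu in class_means.items()}
n_missing = 400  # missing values per class

observed, deterministic, stochastic = [], [], []
for cls, ys in respondents.items():
    mean = statistics.mean(ys)
    observed += ys
    # Class mean imputation: a spike of identical values at the class mean.
    deterministic += ys + [mean] * n_missing
    # Random imputation within classes: values drawn from class respondents.
    stochastic += ys + [rng.choice(ys) for _ in range(n_missing)]

sd_observed = statistics.pstdev(observed)
sd_deterministic = statistics.pstdev(deterministic)
sd_stochastic = statistics.pstdev(stochastic)
```

The deterministic data set has a visibly smaller standard deviation than the respondents alone, while the stochastic one stays close to the respondent value.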
   Since some survey analyses are likely to involve the distributions of the variables, stochastic
imputation methods  like the hot-deck methods are generally preferred.  Once a decision is
made to use a stochastic method, the question of how to choose the residuals arises. If the
standard  regression assumptions are accepted, the residuals could  be chosen from a normal
distribution with a mean of zero and a variance equal to the residual  variance from the respon-
dent regression.  However, this places complete reliance on the  model. An alternative that
avoids the normality assumption is to choose the residuals randomly from the empirical
distribution of the respondents' residuals. Another alternative  is  to select a residual from
a respondent who is  a "close" match to the nonrespondent, measuring "close" in terms
of similar values on the auxiliary variables. This attractive alternative avoids the assumption
of homoscedasticity and guards against misspecification of the  distribution of the residual
term. In the limit, the closest respondent is one who has the same values  of all the auxiliary
variables as the nonrespondent. In this case, the nonrespondent  is  given one of  the matched
respondents' values.  This case arises with hot-deck methods, where nonrespondents and
respondents are matched in terms of the auxiliary variables, and nonrespondents are  assign-
ed values from matched respondents.
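
Drawing residuals from the empirical distribution of the respondents' residuals can be sketched as follows. The Python below uses a one-predictor least squares fit written out by hand; the data and function names are hypothetical.

```python
import random


def ols(x, y):
    """Least squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def regression_impute(x_resp, y_resp, x_missing, rng=None):
    """Predicted value, plus a randomly chosen respondent residual if rng is given."""
    intercept, slope = ols(x_resp, y_resp)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x_resp, y_resp)]
    imputed = []
    for xm in x_missing:
        residual = rng.choice(residuals) if rng is not None else 0.0
        imputed.append(intercept + slope * xm + residual)
    return imputed, residuals


# Hypothetical respondent data and one record with y missing.
x_resp = [1.0, 2.0, 3.0, 4.0, 5.0]
y_resp = [5.1, 7.9, 11.2, 13.8, 17.0]

deterministic, residuals = regression_impute(x_resp, y_resp, [6.0])
stochastic, _ = regression_impute(x_resp, y_resp, [6.0], rng=random.Random(0))
```

The deterministic imputation is the predicted value 19.91; the stochastic one differs from it by one of the five respondent residuals. Choosing the residual from a respondent with similar auxiliary values, rather than at random, would be a further refinement not shown here.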
   A further consideration in the choice of residuals is to make  the imputed values feasible
ones. As  noted above, deterministic methods may impute values for categorical and discrete
variables  that are not feasible. Some stochastic methods solve this  problem  through the alloca-
tion of the residuals.  In particular, the use of respondents' residuals with  the random within
class and the sequential and hierarchical hot-deck methods ensures that the imputed values
are feasible ones.

3.2  Imputation Class or Regression Imputation
   As noted earlier, both imputation class and regression imputation methods fall within the
imputation model given by equation (2). The difference between them  lies  in the ways in
which they employ the auxiliary variables.
   Imputation  class methods divide the sample into a set of classes. For this purpose, con-
tinuous auxiliary variables have to be categorized. There is complete flexibility in the way
the classes are  formed, and the symmetrical use of the auxiliary variables in different parts


-------


of the sample is not required. Thus, for instance, in imputing for hourly rate of pay in a
sample of employees, the sample might first be divided into two parts, union members and
nonmembers; then the imputation classes for the members might be formed in terms of age
and occupation whereas those for nonmembers  might be formed in terms of sex and industry.
As a rule, the aim is to construct classes of adequate size that explain as much of the variance
in the variable to be  imputed as possible. When the classes are formed by a complete
crossclassification of the auxiliary variables, the underlying  model contains all main effects
and all interactions for the crossclassification. The limitation of imputation class methods
is that the number of classes formed has to be constrained to ensure that there is some
minimum number of respondents in each class. The hierarchical hot-deck method attempts
to extend the amount of auxiliary data used, but even with this method matches of respondents
and nonrespondents often cannot be made at the finer levels of detail. Coupled with the
use of a random respondent residual within a class, imputation class methods have the valuable
property that imputed values are  feasible ones: that is,  the imputed values are actual
respondents' values.
  Regression imputation methods have an advantage over imputation class methods in the
number and in the level of detail of the auxiliary variables they can employ. Age can, for
instance, be taken as a continuous variable rather than being categorized into a few classes.
The regression model allows more main effects to be included in the model, but at the price
of fewer interactions. Regression models can, of course, include some interactions, but they
need to be specified. The models can also include polynomial terms and employ transforma-
tions, but again they need to  be specified. The regression model has the potential of pro-
viding better predictions for the imputed values, but to achieve this careful modelling is
required. Careful  imputation  modelling is  unrealistic for all the variables in a survey, but
it may be feasible for one or two major ones (and especially so for continuous surveys).
Without  careful modelling, there is a serious  risk of poor imputations, although as noted
earlier, this  risk can  be reduced by  the  allocation of  random  residuals  from  "close"
respondents.
  If a regression imputation assigns the residual from a  respondent with exactly the same
values of the auxiliary variables, the imputed value is necessarily a feasible one. If, however,
there is even a small difference between the respondent's and nonrespondent's values on the
auxiliary variables, the imputed value may not be feasible. A  variant of regression imputa-
tion that avoids this problem, termed predictive mean matching, is described by Little (1986b)
(Little attributes the method to Rubin). With predictive mean  matching, the nonrespondent
is matched to the  respondent with the closest  predicted value. Then, instead of adding the
respondent's residual to the nonrespondent's predicted value, the nonrespondent is assigned
the respondent's value. The method is thus a hot-deck method, and is similar to  distance
function matching.
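
   The matching step in predictive mean matching can be sketched as follows. This is only a minimal illustration, assuming a single continuous auxiliary variable and an ordinary least-squares line fitted to the respondents; the function names and the age and income figures are hypothetical and do not come from Little (1986b).

```python
def fit_line(z, y):
    """Least-squares intercept and slope for y on a single auxiliary z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    slope = s_zy / s_zz
    return y_bar - slope * z_bar, slope


def predictive_mean_match(z_resp, y_resp, z_nonresp):
    """Give each nonrespondent the observed y of the respondent whose
    predicted value is closest to the nonrespondent's predicted value."""
    intercept, slope = fit_line(z_resp, y_resp)
    pred_resp = [intercept + slope * zi for zi in z_resp]
    imputed = []
    for zi in z_nonresp:
        pred = intercept + slope * zi
        donor = min(range(len(z_resp)), key=lambda j: abs(pred_resp[j] - pred))
        # The donor's actual value is assigned, not prediction plus residual,
        # so the imputed value is always a feasible (observed) one.
        imputed.append(y_resp[donor])
    return imputed


# Hypothetical data: age as the auxiliary variable, income as the item.
ages_resp = [25, 32, 41, 50, 58]
income_resp = [21.0, 27.5, 30.0, 38.0, 36.5]
ages_nonresp = [30, 55]

imputed_values = predictive_mean_match(ages_resp, income_resp, ages_nonresp)
```

   Because the donor's own value is copied, every imputed value is one that was actually reported, which is the feasibility property discussed above.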
  The choice between imputation class and regression imputation methods should in part
depend on the efforts made to develop the regression model. Unless adequate resources are
devoted to the development of a regression model, the imputation class methods may be
safer. The choice  should also in part depend on the sample size. With large samples, hot-
deck methods are likely to be able to use enough classes to  take advantage of all the major
predictor variables; however, with small samples this may not hold, and regression methods
may have greater potential. David et al. (1986) describe an  interesting study that compares
regression models for imputing wages and salary in the U.S.  Current Population Survey with
hierarchical hot-deck imputations. Despite the extensive efforts made to develop the regres-
sion models, the hot-deck imputations were not found to be inferior in  this large sample.

3.3  Effect  of Imputation on Relationships
   Although most of the literature on imputation deals with its effect on univariate statistics
such as means and distributions, a large part of survey analysis is concerned with bivariate
and multivariate relationships. Here the analysis of relationships can be considered in broad
terms to include crosstabulation, correlation or regression analysis, comparisons of subclass
means or proportions, and any other analysis involving two or more variables.  As will be
illustrated below, imputation can have harmful effects on all analyses of relationships, often
attenuating the associations between variables. Discussions of the effects of imputations on
relationships are provided by Santos (1981), Kalton and Kasprzyk (1982) and Little (1986a).
   The general nature of the effect of imputation on relationships can be seen by considering
its effect on the estimate of the sample covariance in the simple situation where the $y$-variable
has missing responses that are missing at random over the population and the $x$-variable has
no missing data. The sample covariance, $s_{xy}$, is calculated in the standard way, based on
the actual values for respondents and the imputed values for nonrespondents, as an estimate
of the population covariance $S_{xy}$. Using the fact that $E_2(y_{mis}) = y_{mid}$ as above, it can be
readily shown that the expected value of $s_{xy}$ under a deterministic imputation method is the
same as that under the corresponding stochastic method.
   As Santos (1981) shows, the relative bias of $s_{xy}$ when the mean overall or random overall
imputation methods are used is approximately $-M$, where $M$ is the nonresponse rate. This
occurs because the imputed $y$-values are unrelated to their $x$-values, and hence the cases with
imputed values attenuate the covariance towards zero. This attenuation is decreased in
magnitude by imputation methods that use auxiliary variables. With class mean imputation
or random imputation within classes, the relative bias is approximately $-M(S_{xy.z}/S_{xy})$,
where $S_{xy.z} = \sum_h W_h S_{xyh}$ is the average within-class covariance for classes formed by the
auxiliary variables $z$, $S_{xyh}$ is the covariance within class $h$, and $W_h$ is the proportion of the
population in class $h$. With predicted regression imputation or regression imputation with
a random residual, both with a single auxiliary variable $z$, the relative bias is approximately
$-M[1 - (\rho_{xz}\rho_{yz}/\rho_{xy})]$, where $\rho_{uv}$ is the correlation between $u$ and $v$.
   The disturbing feature of these results is that, unless $M$ is small, $s_{xy}$ calculated with
imputed values under any of these imputation methods may be subject to substantial bias even
under the missing at random model. The estimates $s_{xy}$ computed with imputed values
obtained under the imputation class and regression methods are unbiased only if the partial
covariance $S_{xy.z}$ is zero. In general, there is no reason to assume uncritically that $S_{xy.z}$ is zero.
However, there is an important case when $S_{xy.z} = 0$. This occurs when $x = z$, that is when
$x$ is used as an auxiliary variable in the imputation procedure. In this case, the sample
covariance is unbiased under the missing at random model. This result suggests that if the
relationship between $x$ and $y$ is to form an important part of the survey analysis, $x$ should
be used as an auxiliary variable in imputing for missing $y$-values.
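
   A small simulation can illustrate the two extreme cases just described: mean overall imputation, where the relative bias should be close to minus the nonresponse rate, and regression imputation with $x$ itself as the auxiliary variable, where it should be close to zero. The sketch below is illustrative only; the sample size, the 40% nonresponse rate, the linear model for $y$ and the function names are all assumptions made for the example.

```python
import random


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)


def relative_biases(n=40000, nonresponse_rate=0.4, seed=12345):
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.6 * xi + rng.gauss(0.0, 0.8) for xi in x]
    # y is missing at random over the whole sample; x is always observed.
    missing = [rng.random() < nonresponse_rate for _ in range(n)]

    x_resp = [xi for xi, m in zip(x, missing) if not m]
    y_resp = [yi for yi, m in zip(y, missing) if not m]
    x_resp_mean = sum(x_resp) / len(x_resp)
    y_resp_mean = sum(y_resp) / len(y_resp)

    # Mean overall imputation: every missing y gets the respondent mean.
    y_mean_imp = [y_resp_mean if m else yi for yi, m in zip(y, missing)]

    # Predicted regression imputation with x itself as the auxiliary (x = z).
    slope = covariance(x_resp, y_resp) / covariance(x_resp, x_resp)
    intercept = y_resp_mean - slope * x_resp_mean
    y_reg_imp = [intercept + slope * xi if m else yi
                 for xi, yi, m in zip(x, y, missing)]

    complete = covariance(x, y)  # benchmark: covariance with no missing data
    return (covariance(x, y_mean_imp) / complete - 1.0,
            covariance(x, y_reg_imp) / complete - 1.0)


bias_mean_imputation, bias_regression_on_x = relative_biases()
```

   Under these assumptions the first value should lie near -0.4 and the second near zero, in line with the approximations quoted from Santos (1981).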
   The above theory assumes that only the $y$-variable was subject to missing data. In practice
the $x$-variable will often also be incomplete. If so, the sample covariance may be
attenuated because of the imputations for both variables. A special feature occurs when $x$ and
$y$ are both missing for a record. If the two values are imputed separately, the covariance
is attenuated, but if they are imputed  jointly, using  the same respondent as the donor of
both values, the covariance structure is retained. This suggests that when a record has several
missing related values, they should  be  taken from the same donor. Coder (1978) describes
the use of joint imputation from the same donor in the March Income Supplement of the
Current Population Survey.
   As an illustration of how the above arguments about the attenuation of covariances app-
ly to other forms of relationships, we will give a simple numerical example of the effect of
imputation on the difference between two  proportions. Let the variable of interest be whether
an individual has a particular attribute or not, and suppose that one half of the respondents
fail to answer this question. The missing responses are imputed by a random within class
imputation method using two classes, A  and B. The objective is now to compare the

                                       Table 1
                Number of Respondents with the Attribute, and Number of
                    Sampled Persons by Class, Sex and Response Status

                                         Class A                 Class B
                                     M      F    Total       M      F    Total
Respondents with the attribute      80     40     120       60     20      80
Total respondents                  100    100     200      100    100     200
Nonrespondents                     100    100     200      100    100     200
Total sample                       200    200     400      200    200     400

percentages of men and women with the attribute. The data are displayed in Table 1. Since
60% of the total respondents in class A  have the attribute, 60 of the 100 male and 60 of
the 100 female nonrespondents in that class will be imputed to have the attribute. Similarly,
in class  B 40% of the total respondents have the attribute, and so 40 male and 40 female
nonrespondents will be imputed to have the attribute. The proportion of actual and imputed
males with the attribute is thus (80 + 60 + 60 + 40)/400 = 0.6, or 60%. For females the
corresponding proportion is (40 + 60 + 20 + 40)/400 = 0.4, or 40%. The difference
between these two percentages is 20%.
   Had sex also been taken into account in forming the imputation classes, the percentages
of males and females with the attribute would have been 70% and 30%, differing by 40%.
The failure to include sex as an auxiliary variable in the imputation has thus caused a substan-
tial attenuation in the measurement of the relationship between sex and having the attribute.
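
   The arithmetic of this example can be checked with a short calculation. The sketch below uses the counts from Table 1 and works with expected numbers of imputed cases, as the text does; the function and variable names are introduced here for illustration only.

```python
from fractions import Fraction

# Table 1 counts by (class, sex):
# (respondents with the attribute, total respondents, nonrespondents)
table = {
    ("A", "M"): (80, 100, 100),
    ("A", "F"): (40, 100, 100),
    ("B", "M"): (60, 100, 100),
    ("B", "F"): (20, 100, 100),
}


def imputed_proportion(sex, use_sex_in_classes):
    """Expected proportion with the attribute among all sampled persons of
    one sex after random within-class imputation."""
    with_attr = Fraction(0)
    sample = 0
    for (cls, s), (yes, resp, nonresp) in table.items():
        if s != sex:
            continue
        if use_sex_in_classes:
            # Imputation classes are class-by-sex cells.
            donor_rate = Fraction(yes, resp)
        else:
            # Imputation classes are A and B only, pooling the sexes.
            class_yes = sum(v[0] for k, v in table.items() if k[0] == cls)
            class_resp = sum(v[1] for k, v in table.items() if k[0] == cls)
            donor_rate = Fraction(class_yes, class_resp)
        with_attr += yes + donor_rate * nonresp
        sample += resp + nonresp
    return with_attr / sample


diff_without_sex = imputed_proportion("M", False) - imputed_proportion("F", False)
diff_with_sex = imputed_proportion("M", True) - imputed_proportion("F", True)
```

   The difference is 1/5 (20%) when sex is ignored in forming the classes and 2/5 (40%) when it is included, matching the figures in the text.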

3.4  Multiple  Imputations
   Ideally the analyst using a data set with imputed values should be able  to obtain valid
results for any analyses by applying standard techniques for complete data. However, as
noted in the last section, imputation can distort measures of the  relationships between
variables. It also distorts standard  error estimation.
   All imputation methods except deductive imputation  fabricate data to some extent. The
extent of fabrication depends on how well the imputation model predicts the missing values.
If the imputation model explains only a small proportion of the variance in the variable among
the respondents, the amount of fabrication in each imputed value is likely to be substantial.
If the imputation model explains a high proportion of the respondent variance, the amount
of fabrication  is likely to  be less serious.  However,  it needs to be recognized that the fit of
the imputation model for the respondents is not necessarily a good  measure of the fit for
the nonrespondents.
   Standard errors computed in the standard way from a data set with imputed values will
generally be underestimates because  of the fabrication involved in the imputed values. Rubin
(1978, 1979) has advocated the method of multiple  imputations to provide valid inferences
from data sets with imputed values (see also Herzog and Rubin 1983; Rubin and Schenker
1986). When multiple imputations are used for the purpose of standard error estimation,
the construction of the complete  data set by imputing for the missing responses  is  carried
out several (say m)  times using  the same  imputation  procedure. The  sample  estimates
$z_i$ ($i = 1, 2, \ldots, m$) of the population parameter of interest $Z$ are computed from each of
the replicate data sets, and their average $\bar{z}$ is calculated. A variance estimator for $\bar{z}$ is then
given by $V = W + [(m + 1)/m]B$, where $W$ is the average of the within-replicate variances
of the $z_i$ and $B = \sum_i (z_i - \bar{z})^2/(m - 1)$ is the between-replicate variance. Even with the
inclusion of the between-replicate variance component, however, the coverages of confidence
intervals for $Z$ based on $V$ are still overstated, with the amount of overstatement increasing
with the level of nonresponse.
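
   The combination of replicate estimates into a variance estimate can be sketched in a few lines. The function name and the numerical inputs below are hypothetical; the formula is the one given in the text, with the within-replicate variances supplied by whatever standard variance estimator is used on each completed data set.

```python
def multiple_imputation_variance(estimates, within_variances):
    """Combine m replicate estimates into their average and the variance
    estimate V = W + ((m + 1) / m) * B."""
    m = len(estimates)
    if m < 2 or len(within_variances) != m:
        raise ValueError("need m >= 2 estimates with matching variances")
    z_bar = sum(estimates) / m
    w = sum(within_variances) / m                          # within-replicate
    b = sum((z - z_bar) ** 2 for z in estimates) / (m - 1)  # between-replicate
    return z_bar, w + (m + 1) / m * b


# Hypothetical results from m = 3 completed data sets.
z_bar, v = multiple_imputation_variance([10.0, 12.0, 14.0], [1.0, 1.0, 1.0])
```

   With these illustrative inputs, W = 1 and B = 4, so V = 1 + (4/3)(4) = 19/3.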
   This overstatement of the confidence levels can be addressed by modifying the imputa-
tion procedure, as described by Rubin and Schenker  (1986). Their treatment  considers the
random overall imputation method, and one of their modifications allows for uncertainty
about the population  mean and variance in the following way. With the standard random
overall imputation method, the conditional expected mean and variance of the imputed values
are the sample respondents' mean and  variance. With the modification, the expected mean
and variance of the imputed values for a replicate are drawn at random from appropriate
distributions. The imputed values are then a random selection of respondents' values, modified
for the randomly-chosen mean and variance. When estimating the population  mean, the ef-
fect of the changing expected mean and  variance between replicates is to increase the between-
replicate variance component in V. This increase gives improved coverage for the resultant
confidence intervals.
   A major problem with the use of multiple imputations is the additional computer analysis
needed, which increases as the number of replicates, m, increases. For this reason, a small
value of m, such as m = 2, may be preferred. A small  value of m may, however, result in
a low level of precision for the variance estimator. Even with small m, it is questionable
whether the multiple imputation approach is feasible for routine analyses. It may be best
reserved for special studies, such as that described by  Herzog and Rubin (1983).
   In addition to providing appropriate standard errors, another advantage of multiple im-
putations from the same imputation procedure is that it reduces the loss of precision in survey
estimates arising from the random selection of respondents to act as donors of imputed values
(see Section 3.1). This loss is  reduced with  multiple imputations  by averaging over the
replicates. A small number of replicates serves well for this purpose. As noted earlier, Kalton
and Kish (1984) describe alternative ways of selecting the sample of respondents to achieve
this end.
   A second major potential application of multiple imputations is to generate the imputa-
tions for the several replicates by different imputation  procedures, making different assump-
tions about the nonrespondents. Suppose, for instance, that hourly rates  of  pay are to be
imputed for some earners in the sample. One  procedure that might be used is the random
within class imputation method, which is based on an assumption that nonrespondents are
missing at random within the classes. If it is thought that the nonrespondents might in fact
come more heavily from those with higher rates of pay  in each class, a simple modification
to the random within  class method might be to impute  values that are, say, 50 cents above
the donors' values. Other imputation procedures - for instance, using different imputation
classes - could also be tried. Comparison of the survey estimates obtained from the data
sets in which the different imputation  procedures are applied then  provides a valuable in-
dication of the sensitivity of the estimates to the values imputed. If the estimates turn out
to be very similar, they can be accepted with  greater confidence; if they  differ markedly,
the estimates need to be treated with  considerable caution.
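
   A sensitivity comparison of this kind might be set up as in the sketch below. It is a minimal illustration only: the records, class labels and function name are hypothetical, and the same random seed is reused so that both runs draw the same donors and differ only in the 50-cent shift.

```python
import random


def impute_within_class(records, shift=0.0, seed=2024):
    """Random within-class imputation of hourly pay; `shift` is added to
    each donor value to represent nonrespondents with higher pay."""
    rng = random.Random(seed)
    donors = {}
    for rec in records:
        if rec["pay"] is not None:
            donors.setdefault(rec["cls"], []).append(rec["pay"])
    completed = []
    for rec in records:
        if rec["pay"] is None:
            completed.append(rng.choice(donors[rec["cls"]]) + shift)
        else:
            completed.append(rec["pay"])
    return completed


# Hypothetical sample: hourly pay in dollars, None marks a missing response.
records = [
    {"cls": "clerical", "pay": 6.00},
    {"cls": "clerical", "pay": 7.50},
    {"cls": "clerical", "pay": None},
    {"cls": "technical", "pay": 11.00},
    {"cls": "technical", "pay": 12.50},
    {"cls": "technical", "pay": 14.00},
    {"cls": "technical", "pay": None},
    {"cls": "technical", "pay": None},
]

mean_mar = sum(impute_within_class(records)) / len(records)
mean_shifted = sum(impute_within_class(records, shift=0.50)) / len(records)
sensitivity = mean_shifted - mean_mar
```

   Here the estimated mean moves by 0.50 times the proportion of imputed records (3 of 8), which indicates how strongly the estimate depends on the assumption made about the nonrespondents.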
                          4.  CONCLUDING REMARKS

   Weighting and imputation have been presented as two distinct methods for handling missing
survey data, but in fact there is a close relationship between them. This may be illustrated
 by considering any imputation method that assigns respondents' values to the nonrespondents.
 For univariate analyses, this process is equivalent to dropping the nonrespondents' records
 and adding the nonrespondents' weights to those of the donor respondents (Kalton  1986).
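
   The equivalence for a univariate statistic can be verified directly. In the sketch below the identifiers, values and donor assignments are hypothetical, and every sampled unit is assumed to carry a base weight of one.

```python
from collections import Counter

# Hypothetical respondent values and donor assignments for nonrespondents.
resp_values = {"r1": 10.0, "r2": 14.0, "r3": 20.0}
donor_of = {"n1": "r1", "n2": "r3", "n3": "r3"}

# Imputation: each nonrespondent's record carries its donor's value.
completed = list(resp_values.values()) + [resp_values[d] for d in donor_of.values()]
mean_imputed = sum(completed) / len(completed)

# Weighting: drop the nonrespondents and add their weights to their donors.
weights = Counter({rid: 1 for rid in resp_values})
weights.update(donor_of.values())
mean_weighted = (sum(weights[r] * resp_values[r] for r in resp_values)
                 / sum(weights.values()))
```

   Both routes give the same mean, because a respondent used as a donor k times simply appears k + 1 times in the completed data set.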
   The differences  between weighting and imputation emerge when one considers the
 multivariate nature of survey data. It is possible  to  impute for the responses of a total
 nonrespondent by taking all the responses from a single donor; however, weighting is generally
 simpler in this case and it avoids the loss of precision arising from the sampling of respondents
 to serve as donors.  It is not practicable to use weighting to handle item nonresponse since
 it would result in different sets of weights for each item; this would cause serious difficulties
 for crosstabulations and other analyses of the relationships between variables.
   Weighting is a single global adjustment that attempts to compensate for  the missing
 responses to all the items simultaneously. Imputation, on the other hand, is item-specific.
 This difference has consequences for the way that the auxiliary data are  used. In forming
 weighting classes, the focus is on determining classes that differ in their response rates. The
 choice of auxiliary variables to use in imputation,  however, is primarily made in terms of
 their abilities to predict the missing responses.
   An assumption underlying all the procedures reviewed in this paper is that once the aux-
 iliary variables have been taken into account the missing values are missing  at random. Thus,
 for instance, the nonrespondents are assumed to be like the respondents within weighting
 and imputation classes. This assumption can be avoided by using stochastic censoring models,
 as has been done by Greenlees et al. (1982) in imputing wages and  salaries in  the Current
 Population Survey. However, as Little (1986b) observes, these models are highly sensitive
 to the distributional assumptions made.
   An alternative approach for handling missing survey data is to leave the values missing
 in the data set and let the analyst incorporate appropriate missing data models into the analysis
 (Little 1982). This  approach has much to commend it,  but the labor and computing time
 needed to implement it  effectively preclude its use as a general  purpose  strategy. Rather,
 the approach seems best suited for a small range of special analyses. In order to permit the
 analyst to adopt this approach, it is essential that all imputed values be flagged to indicate
 they are  not actual responses, so that they can then be dropped from the analysis.
   Finally, we should note that all methods of handling missing survey data must depend
 upon untestable assumptions. If the assumptions are seriously in error,  the analyses may
 give  misleading conclusions. The only secure safeguard against serious nonresponse bias in
 survey estimates is  to keep the amount of missing data small.
                                     REFERENCES

 BAILAR III, J.C., and BAILAR, B.A. (1978). Comparison of two procedures for imputing missing
   survey values. Proceedings of the Section on Survey Research Methods, American Statistical Associa-
   tion, 462-467.
 BAILAR, B.A., and BAILAR III, J.C. (1983). Comparison of the biases of the hot-deck imputation
   procedure with an "equal-weights" imputation procedure. In Incomplete Data in Sample Surveys,
   Volume 3, Proceedings of the Symposium, (Eds. W.G. Madow and I. Olkin), New York: Academic
   Press, 299-311.
 BAILAR, B.A., BAILEY, L., and CORBY, C.A. (1978). A comparison of some adjustment and
   weighting procedures for survey data. In Survey Sampling and Measurement, (Ed. N.K. Namboodiri),
   New York: Academic Press, 175-198.
 BARTHOLOMEW, D.J. (1961). A method of allowing for 'not at home' bias in sample surveys.
   Applied Statistics, 10, 52-59.
BISHOP, Y.M.M., FIENBERG, S.E., and HOLLAND, P.W. (1975). Discrete Multivariate Analysis.
   Cambridge, Mass.: The MIT Press.
BROOKS, C.A., and BAILAR, B.A. (1978). An Error Profile: Employment as Measured by the
   Current Population Survey. Statistical Policy Working Paper 3, U.S. Department of Commerce.
   Washington, D.C.: U.S. Government Printing Office.
CHAPMAN, D.W., BAILEY, L., and KASPRZYK, D. (1986). Nonresponse adjustment procedures
   at the U.S. Census Bureau. Survey Methodology, forthcoming.
CODER, J. (1978). Income data collection and processing from the March Income Supplement to the
   Current Population Survey. The Survey of Income and Program Participation Proceedings of the
   Workshop on Data Processing, February 23-24, 1978. (Ed. D. Kasprzyk), Chapter!!. Washington,
   D.C.: U.S. Department of Health, Education and Welfare.
COLLEDGE, M.J., JOHNSON, J.H., PARE, R., and SANDE, I.G. (1978). Large scale imputation
   of survey data. Proceedings of the Section on Survey Research Methods, American Statistical
   Association, 431-436.
COX, B.G., and COHEN, S.B. (1985). Methodological Issues for Health Care Surveys. New York:
   Marcel Dekker.
DAVID, M., LITTLE, R.J.A., SAMUHEL, M.E., and TRIEST, R.K. (1986). Alternative methods
   for CPS income imputation. Journal of the American Statistical Association, 81, 29-41.
DREW, J.H., and FULLER, W.A. (1980). Modelling nonresponse in surveys with callbacks.
   Proceedings of the Section on Survey Research Methods, American Statistical Association, 639-642.
DREW, J.H., and FULLER, W.A. (1981). Nonresponse in complex multiphase surveys. Proceedings
   of the Section on Survey Research Methods, American Statistical Association, 623-628.
FORD, B.L. (1983). An overview of hot-deck procedures. In Incomplete Data in Sample Surveys, Volume
   2, Theory and Bibliographies, (Eds. W.G. Madow, I. Olkin and D.B. Rubin), New York: Academic
   Press, 185-207.
GREENLEES, W.S., REECE, J.S., and ZIESCHANG, K.D. (1982). Imputation of missing values
   when the probability of response depends on the variable being imputed. Journal of the American
   Statistical Association, 77, 251-261.
HERZOG, T.N., and RUBIN, D.B. (1983). Using multiple imputation to handle nonresponse  in sam-
   ple surveys. In Incomplete data in Sample Surveys,  Volume 2,  Theory and Bibliographies, (Eds.
   W.G. Madow, I. Olkin and  D.B. Rubin), New York: Academic Press, 209-245.
KALTON, G. (1983). Compensating for Missing Survey Data. Ann Arbor: Survey Research Center,
   University of Michigan.
KALTON, G. (1986). Handling wave nonresponse in panel surveys. Journal of Official Statistics, 2,
   forthcoming.
KALTON, G., and KASPRZYK, D. (1982). Imputing for missing survey responses. Proceedings of
   the Section on Survey Research Methods, American Statistical Association, 22-31.
KALTON, G., and KISH, L. (1984). Some efficient random imputation methods. Communications
   in Statistics - Theory and Methods, 13(16), 1919-1939.
KISH, L. (1965). Survey Sampling. New York: Wiley.
KISH, L. (1976). Optima and proxima in linear sample designs. Journal of the Royal Statistical
   Society, Ser. A, 139, 80-95.
LITTLE, R.J.A. (1982). Models for nonresponse in sample surveys. Journal of the American Statistical
   Association, 77, 237-250.
LITTLE, R.J.A. (1986a). Survey nonresponse adjustments for estimates of means. International
   Statistical Review, 54,  139-157.
LITTLE, R.J.A. (1986b). Missing data in Census Bureau surveys. Proceedings of the Second Annual
   Census Bureau Research Conference, 442-454.
LITTLE, R.J.A., and DAVID, M.H. (1983). Weighting adjustments for non-response in panel surveys.
   Working Paper, Washington, D.C.: U.S. Bureau of the Census.
OH, H.L., and SCHEUREN, F. (1978a). Multivariate raking ratio estimation in the 1973 Exact Match
   Study. Proceedings of the Section on Survey Research Methods, American Statistical Association,
   716-722.
OH, H.L., and SCHEUREN, F. (1978b). Some unresolved application issues in raking ratio
   estimation. Proceedings of the Section on Survey Research Methods, American Statistical Association,
   723-728.
OH, H.L., and SCHEUREN, F. (1980). Estimating the variance impact of missing CPS income data.
   Proceedings of the Section on Survey Research Methods, American Statistical Association, 408-415.
OH, H.L., and SCHEUREN, F. (1983). Weighting adjustment for unit nonresponse. In Incomplete
   Data in Sample Surveys, Volume 2, Theory and Bibliographies, (Eds. W.G. Madow, I. Olkin and
   D.B. Rubin), New York: Academic Press, 143-184.
OH, H.L., SCHEUREN, F., and NISSELSON, H. (1980). Differential bias impacts of alternative
   Census Bureau hot deck procedures for imputing missing CPS income data. Proceedings of the
   Section on Survey Research Methods, American Statistical Association, 416-420.
PALMER, S. (1967). On the character and influence of nonresponse in the Current Population Survey.
   Proceedings of the Social Statistics Section, American Statistical Association, 73-80.
PALMER, S., and JONES, C. (1966). A look at alternate imputation procedures for CPS
   noninterviews. Washington, D.C.: U.S. Bureau of the Census memorandum.
POLITZ, A., and SIMMONS, W. (1949). I. An attempt to get the 'not at homes' into the sample
   without callbacks. II. Further theoretical considerations regarding the plan for eliminating callbacks.
   Journal of the American Statistical Association, 44, 9-31.
POLITZ, A., and SIMMONS, W. (1950). Note on an attempt to get the 'not at homes' into the
   sample without callbacks. Journal of the American Statistical Association, 45, 136-137.
RUBIN, D.B. (1978). Multiple imputations in sample surveys: a phenomenological Bayesian approach
   to nonresponse. Proceedings of the Section on Survey Research Methods, American Statistical
   Association, 20-34.
RUBIN, D.B. (1979). Illustrating the use of multiple imputations to handle nonresponse in sample
   surveys. Bulletin of the International Statistical Institute, 48(2), 517-532.
RUBIN, D.B., and SCHENKER, N. (1986). Multiple imputation for interval estimation from simple
   random samples with ignorable nonresponse. Journal of the American Statistical Association, 81,
   366-374.
SANDE, G. (1979). Numerical edit and imputation. Paper presented to the International Association
   for Statistical Computing, 42nd Session of the International Statistical Institute.
SANDE, I.G. (1983). Hot-deck imputation procedures. In Incomplete Data in Sample Surveys, Volume
   3, Proceedings of the Symposium, (Eds. W.G. Madow and I. Olkin), New York: Academic Press,
   339-349.
SANTOS, R.L. (1981). Effects of imputation on regression coefficients. Proceedings of the Section
   on Survey Research Methods, American Statistical Association, 140-145.
THOMSEN, I. (1973). A note on the efficiency of weighting subclass means to reduce the effects of
   nonresponse when analyzing survey data. Statistisk Tidskrift, 4, 278-283.
THOMSEN, I., and SIRING, E. (1983). On the causes and effects of nonresponse: Norwegian
   experiences. In Incomplete Data in Sample Surveys, Volume 3, Proceedings of the Symposium, (Eds.
   W.G. Madow and I. Olkin), New York: Academic Press, 25-29.
VACEK, P.M., and ASHIKAGA, T. (1980). An examination of the nearest neighbor rule for imputing
   missing values. Proceedings of the Statistical Computing Section, American Statistical Association,
   326-331.
WELNIAK, E.J., and CODER, J.F. (1980). A measure of the bias in the March CPS earnings
   imputation system. Proceedings of the Section on Survey Research Methods, American Statistical
   Association, 421-425.


                                      10-17

-------
                      NONRESPONSE ADJUSTMENT METHODS
          FOR  DEMOGRAPHIC SURVEYS AT THE U.S. BUREAU OF THE CENSUS

                        By:   Rajendra P. Singh and
                             Rita 0. Petroni
                             Bureau of the Census, SMD
                             Washington, D.C.
                                 ABSTRACT

      All surveys are subject to missing data.  The missing data in a
survey could be due to either noncoverage or nonresponse.  The Census
Bureau uses various approaches to adjust for the missing data.  In this
paper, we will briefly discuss the various types of nonresponse in
demographic surveys and their possible effects on survey estimates.  The
approaches used at the Bureau to adjust for various types of nonresponse for
cross-sectional and longitudinal estimates will be discussed.  Emphasis
will be placed on the weighting adjustments which utilize ratio estimators.
The criteria used to form ratio estimation cells will also be
presented.  As an example, the adjustment techniques of the
Survey of Income and Program Participation will be discussed.
                                   11-1

-------
INTRODUCTION

      A sound sampling plan for a survey includes extensive effort to
obtain usable data for each unit selected into the sample.  Resources are
allocated to develop a good sampling frame, design a good questionnaire,
train interviewers well, and establish other data collection procedures, such
as ways to gain the cooperation of respondents.  However, in spite of such efforts,
all surveys encounter missing data, which could occur due to either
noncoverage or nonresponse.  In this paper, we will discuss missing data
due to nonresponse and methods to adjust for it.  Nonresponse occurs when some or
all responses to the questions on a questionnaire are not obtained.  This
may be due to the respondent's inability or unwillingness to answer.

      Researchers have been striving to reduce nonresponse.  For example,
they have designed questionnaires better and tested them thoroughly
for complete and accurate answers before fielding the survey,
provided respondents with aids to keep better records, given respondents gifts
(in cash or in kind) to gain their cooperation, and found ways to improve the
training given to the data collection staff.  Researchers are also heavily
involved in improving the methods to account for missing data.  Two
approaches commonly used are imputation and weighting adjustment.

      In  imputation,  missing information is replaced with usable  data from
other  sources.    Regression  imputation  (Kalton and Kasprzyk,  1982)  and
cold-deck and  hot-deck methods have been  used  by  the U.S. Bureau  of the
Census.   The  demographic  surveys  primarily  use  the  cold-deck and hot-deck
procedures.  The cold-deck procedure uses values from some prior
distribution (same survey or other source), while the hot-deck uses current
responses from the same source (survey) to substitute for missing values.
Imputation is carried  out  by cross-classifying survey units  into categories
(cells)  by  a  few  variables  in  an   attempt to group  responses  that  are
relatively  homogeneous within the cells and heterogeneous  between cells.
Within a  cell, values obtained for  survey  units are inserted as  responses
for missing items.  To accomplish  this,  there must be at least one response
available in each category  to  be a donor for imputation.
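
      The cell-based procedure described above can be sketched as follows.  This is a minimal illustration only: the choice of a random donor within the cell is an assumption made for the example (the paper does not specify here how a donor is chosen), and the function name, field names and records are hypothetical.

```python
import random


def hot_deck(records, cell_vars, item, seed=0):
    """Fill missing values of `item` from a donor in the same cell, where a
    cell is defined by cross-classifying the records on `cell_vars`."""
    rng = random.Random(seed)
    donors = {}
    for rec in records:
        if rec[item] is not None:
            key = tuple(rec[v] for v in cell_vars)
            donors.setdefault(key, []).append(rec[item])
    completed = []
    for rec in records:
        new = dict(rec)
        new["imputed"] = rec[item] is None  # flag imputed values
        if rec[item] is None:
            key = tuple(rec[v] for v in cell_vars)
            if key not in donors:
                # At least one response per cell is needed to act as donor.
                raise ValueError(f"no donor available in cell {key}")
            new[item] = rng.choice(donors[key])
        completed.append(new)
    return completed


# Hypothetical sample with income missing for two persons.
sample = [
    {"sex": "M", "age_group": "<40", "income": 30},
    {"sex": "M", "age_group": "<40", "income": None},
    {"sex": "F", "age_group": "<40", "income": 28},
    {"sex": "F", "age_group": "<40", "income": 35},
    {"sex": "F", "age_group": "<40", "income": None},
]

completed = hot_deck(sample, ["sex", "age_group"], "income")
```

      A cell that contains a missing value but no respondent raises an error in this sketch; in practice such cells would have to be collapsed with similar cells before imputing.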

      Imputation  is commonly  used for partial  response,  that is,  when  a
questionnaire is partially answered.   It  has  also  been used to compensate
for  complete nonresponse.    One  such  example  is  the  1960  U.S.   Census
(Pritzker et al., 1965) adjustment for missing data.  In this adjustment,  a
nonresponding  household  was imputed by a responding household  (donor)  in
the   same cross-category.    This  approach  of  imputing  a  complete
questionnaire  amounts to doubling  the  weight  of those  respondents whose
records  are duplicated.   Such a procedure can increase the  variance  as
compared  to weighting adjustment.  Hansen, Hurwitz and  Madow (1953) show
that  the maximum increase in variance is about 12 percent  for the method of
duplicating  records.    If a donor  is  used more than  once,  the  variance
increase could be even larger.
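
      One way to see how a figure of about 12 percent can arise is sketched below.  The setup is an assumption made for illustration and is not taken from Hansen, Hurwitz and Madow (1953): a simple random sample within a cell, a common variance for all units, and a fraction p of the respondents each duplicated exactly once.  Under those assumptions the variance of the mean with duplication, relative to the mean of the respondents with equal weights, is (1 + 3p)/(1 + p)^2.

```python
from fractions import Fraction


def variance_ratio(p):
    """Variance of the cell mean when a fraction p of respondents is
    duplicated once, relative to the equal-weight respondent mean
    (assumes simple random sampling and a common unit variance)."""
    return (1 + 3 * p) / (1 + p) ** 2


# Search a grid of duplication fractions between 0 and 1.
grid = [Fraction(k, 300) for k in range(0, 301)]
worst_p = max(grid, key=variance_ratio)
worst_ratio = variance_ratio(worst_p)
```

      On this grid the ratio peaks at p = 1/3 with a value of 9/8, an increase of 12.5 percent, which is consistent with the maximum quoted above.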

      Weight adjustment within cells (Oh and Scheuren,  1983) to compensate
for complete  nonresponse  (unit nonresponse) is  the predominant  technique
used  in the demographic surveys of  the  Bureau  of the  Census.   The general
approach is basically the same  for all  its major  surveys.   It  is simple and
less expensive to  implement, as  compared  to  imputation,  and seems to work
well (Jones,  1984)  for  some labor  force  characteristics  in  the Current
Population  Survey (CPS)  such  as  number  of persons  in the  labor force,
employed and  unemployed.  These estimates were  not  seriously affected by
noninterview bias.   The  only labor force categories with  substantial bias
were those which  included vacationers and persons on layoff.
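
      A weight adjustment within cells can be sketched as follows.  The adjustment factor used here, the ratio of the weighted total of all sample units in a cell to the weighted total of its respondents, is a common textbook form and is an assumption for this example rather than the specific Bureau procedure; the function name, field names and data are hypothetical.

```python
def adjust_weights(units, cell_key="cell"):
    """Return the responding units with weights inflated so that, within
    each cell, respondents also represent the cell's nonrespondents."""
    total = {}
    responding = {}
    for u in units:
        c = u[cell_key]
        total[c] = total.get(c, 0.0) + u["weight"]
        if u["responded"]:
            responding[c] = responding.get(c, 0.0) + u["weight"]
    adjusted = []
    for u in units:
        if not u["responded"]:
            continue  # nonrespondents are dropped from the file
        factor = total[u[cell_key]] / responding[u[cell_key]]
        adjusted.append({**u, "weight": u["weight"] * factor})
    return adjusted


# Hypothetical households in two noninterview adjustment cells.
units = [
    {"cell": "urban", "weight": 100.0, "responded": True},
    {"cell": "urban", "weight": 100.0, "responded": True},
    {"cell": "urban", "weight": 100.0, "responded": False},
    {"cell": "rural", "weight": 150.0, "responded": True},
    {"cell": "rural", "weight": 150.0, "responded": False},
]

adjusted = adjust_weights(units)
urban_total = sum(u["weight"] for u in adjusted if u["cell"] == "urban")
rural_total = sum(u["weight"] for u in adjusted if u["cell"] == "rural")
```

      After adjustment the respondents' weights in each cell sum to the cell's full sample total, so no imputed records are needed for unit nonresponse.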

      In  this paper,  we  will  primarily discuss  nonresponse  weighting
adjustment  for demographic surveys  used at  the Bureau  of  the Census.
Sections  II  and  III discuss various  types of nonresponse and adjustment
approaches to deal with these different types  of  nonresponse,  respectively.
The effect of  nonresponse  on survey estimates  is discussed in Section IV,
and the  criteria  to  define noninterview cells  are presented in Section V.
As  an example, the noninterview  adjustment methods  used for the Survey of
Income  and  Program  Participation  (SIPP)  are  presented  in  Section VI.
Section VII presents  a discussion on noninterview adjustment research.

TYPES OF NONRESPONSE

      Nonresponse can be  divided  into the following categories:

      Type A Noninterview:  A Type A noninterview occurs when every member
      of the household is a noninterview.  Also called a household
      nonresponse, it occurs when no one is home, household members are
      temporarily absent (for example, they could be away on vacation),
      household members refuse to participate in the survey, or the
      household cannot be located.

      Type B Noninterview:  This type of noninterview occurs when a housing
      unit is vacant, occupied by persons with their usual residence
      elsewhere, unfit or set to be demolished, under construction
      and not ready for occupancy, or converted to temporary business or
      storage.  It also occurs when a site for a mobile home, trailer or
      tent is unoccupied or when a permit has been granted, but
      construction is not started.

      Type C Noninterview:  This type of noninterview occurs when a housing
      unit is demolished, a house or trailer is moved, or a unit is
      converted to permanent business or storage, merged, or condemned.

      Type Z  Noninterview:   Type Z  noninterview  occurs  when  a  member  of an
      interviewed household is not interviewed  and a  proxy  interview  is not
      obtained.  It  is also called person  nonresponse.

      Item Nonresponse:   Item Nonresponse occurs  when a response  to  one or
      more questions is  not provided,   though  most of the  questionnaire is
      completed.

ADJUSTMENT FOR VARIOUS TYPES OF NONRESPONSE

      Of  these five  types  of noninterview,  no adjustment  needs to  be made
for type B and type C noninterviews.  This is because type C noninterviews
are no longer housing units at the original address.  For type B
noninterviews, the only households occupying the housing units covered are
those with a usual residence elsewhere.  Such households have a chance of
being in the sample at their usual residence.

      Imputation  techniques are  used to deal  with item nonresponse and type
Z  nonresponse in most of the  demographic  surveys  at  the  Bureau  of  the
Census.  Weighting adjustment is  used for type  A nonresponse  and  in certain
cases for type Zs.  The procedures used for type As and type  Zs are similar
and based on the  same  general principles.
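
      The treatment of each nonresponse type described above can be
summarized as a simple lookup.  The dictionary keys and the function name
are illustrative labels chosen for this sketch.

```python
# Treatment of each nonresponse type as described in the text.
TREATMENT = {
    "A": "weighting adjustment",
    "B": "no adjustment",
    "C": "no adjustment",
    "Z": "imputation (weighting adjustment in certain cases)",
    "item": "imputation",
}

def treatment_for(nonresponse_type):
    """Return the adjustment approach for a nonresponse type code."""
    return TREATMENT[nonresponse_type]
```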

EFFECT OF NONRESPONSE  ON SURVEY  ESTIMATES

      It is a common belief that respondents have different characteristics
from nonrespondents.   This theory  is  supported by recent studies completed
by  Petroni  (1987),  and  Short  and  McArthur  (1986).    Thus,  nonresponse
introduces  bias  in survey estimates.  We believe that the bias  is small
when the  nonresponse  rate  is about  5% or  less,  but  it increases  as  the
nonresponse rate  in a  survey  increases.   Increase in bias with increase in
nonresponse can be shown mathematically as  follows:

Let $P_i$ ($i = 1, 2, \ldots, K$) be the proportion and $R_i$ be the response rate of
population members falling in the ith group or cell.  Thus, the overall response
rate, $R$, is given by:

$$ R = \sum_{i=1}^{K} P_i R_i ; \qquad 0 < R < 1 \ \text{ and } \ 0 < R_i < 1 \ \ \forall\, i $$

where $\sum_{i=1}^{K} P_i = 1$.

Furthermore, assume that

      $\bar{Y}_i$              =  Mean of a characteristic of interest of the population
                                  units falling in cell i.

      $\bar{Y}_{i(\bar{u})}$   =  Mean of a characteristic of interest of the population
                                  in the ith class which would not respond if selected in a
                                  sample.

      $\bar{Y}_{i(u)}$         =  Mean of a characteristic of interest of the population
                                  in the ith class which would respond if selected in a
                                  sample.

      $\bar{y}_{i(u)}$         =  Sample estimate of $\bar{Y}_{i(u)}$.

      $r_i$                    =  Sample estimate of $R_i$.

      $W_{ij}$                 =  Selection probability of jth unit in ith cell.

      $y_{ij}$                 =  Value of the characteristic of interest for the jth unit
                                  in the ith cell.

      $n_i$                    =  Number of sample units in ith cell.

      $n_{iu}$                 =  Number of sample units responding in ith cell.

      $p_i$                    =  Proportion of sample units falling in the ith group or
                                  cell.

Then

$$ \bar{y}_{(u)} = \frac{\sum_{i=1}^{K} \sum_{j=1}^{n_{iu}} [W_{ij}]^{-1} \left[ \frac{n_i}{n_{iu}} \right] y_{ij}}{\sum_{i=1}^{K} \sum_{j=1}^{n_{iu}} [W_{ij}]^{-1} \left[ \frac{n_i}{n_{iu}} \right]} \qquad (2.1) $$

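The expected value of $\bar{y}_{(u)}$ is

$$ E[\bar{y}_{(u)}] = E[\, E(\bar{y}_{(u)} \mid n_1, n_{1u}, n_2, n_{2u}, \ldots, n_K, n_{Ku}) \,] = \sum_{i=1}^{K} P_i \bar{Y}_{i(u)} \qquad (2.2) $$

Therefore, the bias of the adjusted estimate is

$$ \mathrm{Bias}[\bar{y}_{(u)}] = E[\bar{y}_{(u)}] - \bar{Y} = \sum_{i=1}^{K} P_i (1 - R_i) [\bar{Y}_{i(u)} - \bar{Y}_{i(\bar{u})}] \qquad (2.3) $$
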
      Equation (2.3) suggests that the amount of bias depends on the
response rate and the difference in the mean values of the characteristics
for respondents and nonrespondents.  With a small response rate, bias
increases even if the difference in the means of respondents and
nonrespondents is small.
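
      The dependence of the bias on the response rates and on the
respondent-nonrespondent differences can be illustrated numerically.  The
sketch below assumes the cell-level form of the bias implied by the
definitions above, that is, the sum over cells of P_i (1 - R_i) times the
difference between the respondent and nonrespondent means, and checks it
against a direct computation of the expected adjusted estimate minus the
population mean.  All numbers and function names are hypothetical.

```python
def cell_bias(P, R, y_resp, y_nonresp):
    """Bias as sum_i P_i * (1 - R_i) * (respondent mean - nonrespondent mean)."""
    return sum(p * (1 - r) * (yu - yn)
               for p, r, yu, yn in zip(P, R, y_resp, y_nonresp))

def direct_bias(P, R, y_resp, y_nonresp):
    """Expected cell-adjusted estimate minus the true population mean."""
    expected_estimate = sum(p * yu for p, yu in zip(P, y_resp))
    population_mean = sum(p * (r * yu + (1 - r) * yn)
                          for p, r, yu, yn in zip(P, R, y_resp, y_nonresp))
    return expected_estimate - population_mean

# Two hypothetical cells.
P = [0.6, 0.4]             # population proportions
R = [0.95, 0.80]           # response rates
y_resp = [100.0, 60.0]     # respondent means
y_nonresp = [90.0, 40.0]   # nonrespondent means

bias_formula = cell_bias(P, R, y_resp, y_nonresp)
bias_direct = direct_bias(P, R, y_resp, y_nonresp)
```

In this example most of the bias comes from the second cell, which has both
the lower response rate and the larger difference in means.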

      Before discussing  the  criteria  for  noninterview (NI)  adjustment,  let
us consider the following situations:


                         and

                         $\forall\, i$ and $j$, or

2.    $R_i = R_j = R$, $\forall\, i$ and $j$, or

Under each of the three situations the bias is the same and is given by

$$ \mathrm{Bias}[\bar{y}_{(u)}] = (1 - R)\, [\bar{Y}_{(u)} - \bar{Y}_{(\bar{u})}] \qquad (2.4) $$

and is equivalent to using a single NI adjustment cell.
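
      The equivalence can be verified for the situation in which all cells
share the same response rate.  The sketch below is illustrative only: the
numbers are hypothetical, and the cell-level bias is computed as the sum
over cells of P_i (1 - R_i) times the respondent-nonrespondent difference,
which is the form assumed here for the adjusted estimate.

```python
def cell_bias(P, R, y_resp, y_nonresp):
    """Bias with separate adjustment cells."""
    return sum(p * (1 - r) * (yu - yn)
               for p, r, yu, yn in zip(P, R, y_resp, y_nonresp))

def single_cell_bias(P, R, y_resp, y_nonresp):
    """Bias (1 - R) * (overall respondent mean - overall nonrespondent mean)."""
    R_all = sum(p * r for p, r in zip(P, R))
    resp_mean = sum(p * r * yu for p, r, yu in zip(P, R, y_resp)) / R_all
    nonresp_mean = (sum(p * (1 - r) * yn for p, r, yn in zip(P, R, y_nonresp))
                    / (1 - R_all))
    return (1 - R_all) * (resp_mean - nonresp_mean)

# Equal response rates in both cells (hypothetical values).
P = [0.5, 0.5]
R = [0.9, 0.9]
y_resp = [100.0, 60.0]
y_nonresp = [80.0, 50.0]

with_cells = cell_bias(P, R, y_resp, y_nonresp)
without_cells = single_cell_bias(P, R, y_resp, y_nonresp)
```

With equal response rates, forming cells does not change the bias, so
nothing is gained over a single adjustment cell.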

      It is obvious from equation (2.3) that the bias in an estimate will
be reduced by using two or more cells if

$$ \bar{Y}_{i(u)} - \bar{Y}_{i(\bar{u})} < \bar{Y}_{(u)} - \bar{Y}_{(\bar{u})} \qquad (2.5) $$

$$ \text{and } R_i \neq R_j, \ \ \forall\, i, j. \qquad (2.6) $$

      Therefore,  the success of the  NI  adjustment procedure requires the
identification  of the survey variables which will define  adjustment  cells
such  that  these cells  vary both  with  respect  to  survey  estimates and
response rates.   See Chapman (1976)  for further details.

      Note  that there are other situations where bias could be reduced by
use of  more  than one NI  cell  even  if  the above two  conditions  are not
satisfied.   For example,  consider two cells.   It  is  possible that one cell
meets  criterion  (2.5)  and  the other  does not,   yet  the  population
distribution into the cells  and the  response rates  of the cells are such
that the bias  is  less using two NI  cells  instead of one.
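
      A two-cell numerical sketch of this situation follows.  It reads
criterion (2.5) as requiring the respondent-nonrespondent difference within
a cell to be smaller, in absolute value, than the overall difference; that
reading, the numbers, and the function names are assumptions made for
illustration.

```python
def cell_adjusted_bias(P, R, y_resp, y_nonresp):
    """Bias when each cell is adjusted separately."""
    return sum(p * (1 - r) * (yu - yn)
               for p, r, yu, yn in zip(P, R, y_resp, y_nonresp))

def overall_rate_and_difference(P, R, y_resp, y_nonresp):
    """Overall response rate and overall respondent-nonrespondent difference."""
    R_all = sum(p * r for p, r in zip(P, R))
    resp_mean = sum(p * r * yu for p, r, yu in zip(P, R, y_resp)) / R_all
    nonresp_mean = (sum(p * (1 - r) * yn for p, r, yn in zip(P, R, y_nonresp))
                    / (1 - R_all))
    return R_all, resp_mean - nonresp_mean

# Hypothetical cells: cell 1 has no respondent-nonrespondent difference,
# cell 2 has a large one but a high response rate.
P = [0.5, 0.5]
R = [0.6, 0.9]
y_resp = [40.0, 60.0]
y_nonresp = [40.0, 40.0]

R_all, D = overall_rate_and_difference(P, R, y_resp, y_nonresp)
one_cell_bias = (1 - R_all) * D
two_cell_bias = cell_adjusted_bias(P, R, y_resp, y_nonresp)

# Per-cell check of the assumed reading of criterion (2.5).
meets_criterion = [abs(yu - yn) < abs(D) for yu, yn in zip(y_resp, y_nonresp)]
```

Here the second cell fails the criterion, yet the two-cell bias (about 1.0)
is smaller than the single-cell bias (about 3.0).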

CRITERIA TO DEFINE NONINTERVIEW ADJUSTMENT  CELLS

      The objective of noninterview  adjustment  is  to  reduce  the  bias in
survey  estimates.   A  survey produces  a  large  number of estimates, and
adjustments which reduce  bias  for  one set of estimates may not work well
for another set of estimates.  Therefore,   it is essential to have a  clear
understanding  of  the relative importance of  various  estimates  when
implementing  the  criteria  below to  form  NI  cells.  In  addition  to bias, it
is occasionally necessary  to consider reduction of mean square error.  This
is the  case when the adjustment factor  is  large and,  hence, increases the
variance significantly.

A.    Lower Bias

      The following four criteria are used in selecting the
      cross-classification variables to reduce the bias in survey
      estimates.

      1.     The variables  are significantly  correlated with  the survey
            estimates.    The  implicit  assumption  in  selecting  these
            variables is that  if  for respondents these  variables show a
            significantly high  correlation with  survey estimates  to be
            produced,  then  they  will  also  show high  correlation among
            nonrespondents.   Since  these  variables  must be available for
            both  respondents  and  nonrespondents,  the  choice  of  the
            variables is  constrained.  These variables are determined prior
            to data collection to ensure the necessary data  is  obtained and
            to avoid possible bias due to the particular  sample selected.

      2.     Within each weighting class, $E[\bar{Y}_{i(u)}] = E[\bar{Y}_{i(\bar{u})}]$, $\forall\, i$.

      3.     The means of any two noninterview adjustment cells differ,
             i.e., $E[\bar{Y}_{i(u)}] \neq E[\bar{Y}_{j(u)}]$ for $i \neq j$, $\forall\, i$ and $j$.

      4.     The response rates for any two cells differ, that is,
             $R_i \neq R_j$, $i \neq j$, $\forall\, i$ and $j$.

B.     Lower Variance

      The variance  contribution from a  NI  cell  depends on  the number of
      responding and nonresponding  units  in that cell.  For small cells the
      nonresponse weight adjustment can be large.   Therefore,  the size of
      the cell  is an important  consideration  in defining a cell.  One needs
      to consider the  trade-off  between  variance  and bias  in deciding the
      size  of  the  cell as  bias  should be reduced  with   a  homogeneous
      (usually  a smaller) cell.

      Cahoon and Bushery (1984), under a number of assumptions made to
      simplify the mathematics, showed that the variance of an estimator
      for cells with 25 sample units each is about 0.5% higher than for a
      collapsed cell of 100 units, assuming a 5% nonresponse rate.  With a
      10% nonresponse rate it is about 1.0% higher.  In deriving these
      results they assumed independence between sample units within a cell
      and between cells, cells of fixed equal size, and cells with the same
      expected response rate, expected value, and variability of the
      characteristics of interest.

      To reduce variance, NI cells are collapsed if the number of
      respondents in them is small or the noninterview adjustment factor is
      too large.
      These limits  are  somewhat subjective.   For  most of the demographic
      surveys  at  the Bureau,  these limits  are:   a)  minimum interviewed
      cases in a cell are 20-35, and b) maximum NI adjustment factor  is 2.
      If one of these criteria is not satisfied by the cell it needs  to be
      collapsed  with another  cell.   The  following  collapsing criteria
      attempt to minimize the  increase  in  mean square error of  the  survey
      estimates of  interest.   A cell i should  be  collapsed with a cell j
      if:
      1.
           $\forall\, i$, $i \neq j$
      2.
                             $R_i$ -
           $\forall\, i, j$
      Usually, these two conditions are not satisfied by the same pair of
      cells.  In those circumstances, either more emphasis should be placed
      on condition 1, or a pair should be found which reduces the mean
      square error even if neither of the two conditions is satisfied.
      Furthermore,  if there is strong evidence  that  for a cell  with a very
      high noninterview adjustment factor $E[\bar{Y}_{i(u)}]$ is very different from
      that of any other cell, then the cell should be kept separate to minimize the
      bias due to nonresponse (Shapiro,  1980).   (Since the amounts of bias
      and  mean square  error  are  unknown,  experience  is  used  to  make
      judgments regarding  expected  reductions in  mean  square  error and
      bias.)

THE SURVEY OF INCOME AND  PROGRAM PARTICIPATION

      The  Survey  of  Income  and Program Participation  (SIPP)   is  a  new,
ongoing national household  survey administered  by the  Bureau of the Census.
It  is  designed to  provide improved data  on  income  and  participation in
government administered programs  such as food  stamps,  Aid to Families with
Dependent Children  (AFDC),  Supplemental  Security Income  (SSI),  etc.   Data
on  demographic characteristics, labor force,   education,  etc.,  are also
collected.

      The  SIPP is  a  multistage,  stratified,  systematic  sample  of the
noninstitutionalized  resident  population  of  the  United States.    This
population includes persons  living  in  group quarters  such as dormitories,
rooming houses, and religious  group  dwellings.   Noncitizens  of  the United
States who work or attend school  in this country  and  their families are
also eligible.   Crew members of merchant  vessels,  Armed Forces personnel
living  in  military  barracks,  and  institutionalized   persons  such  as
correctional  facility inmates and  nursing  home residents are ineligible.
Initially, a  sample  of living quarters in selected Primary Sampling  Units
(PSUs) is taken.  (Living quarters are those in which the occupants do not
live and eat with any other person in the structure and that have either
direct access from the outside of the building or through a common hall, or
complete kitchen facilities for that unit only.)  Persons residing in these
living quarters at the time of the  first interview are considered to be in
sample.   However,  only persons who  are at least 15  years of  age at this
interview are  eligible for interview.   Limited data  on  children are also
collected by  proxy  interviews.

      The SIPP sample is divided into four groups of equal size called
rotation groups.  One rotation group is interviewed each month.  In
general, one cycle of four rotation groups is called a wave.  This design
provides a steady workload for data collection and processing.   Persons  15
years old and over in the sample are interviewed  once every four months  for
approximately 2.5 years.   With certain restrictions, these  sample  persons
are followed if they move to a new address.   Persons who  began  living with
sample persons after the  first  interview  are  considered  to be  part  of  the
sample only while residing  with  the  sample persons.   The  reference  period
for the  interview  is the four months  preceding  the  interview month.   For
example, for the  first SIPP sample,  the reference period  for the  November
1983 interview month was  July  through  October 1983.   These  sample  persons
were interviewed again in March 1984 for the November 1983 through  February
1984  period.    More details  on the  SIPP  design  are  given  in  Nelson,
McMillen, and Kasprzyk (1985).
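
      The reference-period rule can be expressed as a short calculation.
The function below is an illustrative sketch (its name and return format
are assumptions); it returns the four calendar months preceding a given
interview month and reproduces the two examples in the text.

```python
def reference_period(year, month):
    """Return the four (year, month) pairs preceding the interview month."""
    months = []
    for back in range(4, 0, -1):
        index = year * 12 + (month - 1) - back  # months counted from year 0
        months.append((index // 12, index % 12 + 1))
    return months

november_1983 = reference_period(1983, 11)  # July through October 1983
march_1984 = reference_period(1984, 3)      # November 1983 through February 1984
```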

      The SIPP questionnaire is long and complex.  For each specific type
of cash and non-cash income, questions are asked about the months in which
it was received and the amount received per month.  For many types of
income, additional questions are asked of
recipients.  For example, in households with  children covered by Medicaid,
up to eight questions about health insurance are  asked.   Questions  are also
asked about  assets  and  labor  force status.   Topical modules  on various
subjects are also included in most interviews.

      For  the  subsequent  waves,  only  original   sample  persons  (those
interviewed in the first wave)  and persons living with them are  eligible to
be interviewed.  With  certain  restrictions,  original  sample  persons  are to
be followed if they  moved to a  new  address.   All noninterviewed households
from  Wave 1  are  designated  as  noninterviews for  all  subsequent  waves.
Additional noninterviews  result  when  original sample persons move  without
leaving  a forwarding  address  or move to  extremely remote  parts  of  the
country.

      Due to the  longitudinal  nature  (multiple  interviews) of  the  survey,
the noninterview rate  accumulates over the  life  of the panel.   Starting at
about 5%-7% at the time of the first interview, it  reaches slightly  over 20
percent  for the  last  interview of  the  panel.    The  following  briefly
explains  noninterview  adjustment  methods   developed  for  the SIPP
cross-sectional and longitudinal  estimates.

Noninterview Adjustment for Cross-Sectional  Estimates

      Noninterview adjustments for cross-sectional estimates are  made  at  the
household  level.     At   the  time  of  the  first   interview very  little
information  (such  as race  of  the  reference person,   owner-occupied  or
renter-occupied housing unit, size of the  household,  and  the Census  region)
is available  about  the  noninterviewed households.   Therefore, a limited
number of variables  correlated to the  SIPP characteristics of interest  can
be used to form  noninterview  cells.   For first  wave  data, noninterview
cells were  formed using the following variables.    See  King (1985) for  a
detailed explanation.

      a.    Census region (Northeast, Midwest,  South, West)


      b.     Residence (metropolitan statistical areas (MSA), not MSA)

      c.     Race of reference  person  (black, non-black)

      d.     Tenure (owner,  renter)

      e.     Household size  (1,  2, 3,  4 or more)
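
      The number of cells implied by these five variables can be counted
with a short sketch.  It assumes a full cross-classification before any
collapsing, which the text does not state explicitly; the dictionary keys
are illustrative labels.

```python
from itertools import product

# Categories of the five wave 1 variables listed above.
WAVE1_VARIABLES = {
    "region": ["Northeast", "Midwest", "South", "West"],
    "residence": ["MSA", "not MSA"],
    "race": ["black", "non-black"],
    "tenure": ["owner", "renter"],
    "household_size": ["1", "2", "3", "4 or more"],
}

# Every combination of categories is one candidate noninterview cell.
cells = list(product(*WAVE1_VARIABLES.values()))
number_of_cells = len(cells)
```

Under this assumption there are 4 x 2 x 2 x 2 x 4 = 128 candidate cells, many
of which may be too small to use without collapsing.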

      The noninterview adjustments  for subsequent waves are in  addition  to
the wave  1  adjustment, i.e.,  the  NI adjustment made  as  a part of wave  1
weighting  becomes an  integral part  of  subsequent  waves weighting.    In
subsequent  waves,   additional  information  obtained on previous  wave
respondents is  available for  use  in  developing  noninterview cells.   Using
1980 Decennial  Census  data,  it was found  that educational level,  race  and
origin of householder, household  type,  and tenure  are highly  correlated
with the important characteristics  (income,  poverty,  etc.)  estimated by  the
SIPP.   Also,  Kalton  et  al.  (1985)  showed that  the  participation  of  a
household in a given government program during the  reference period covered
by  interview  (K)  is  highly correlated with its participation  in interview
(K-l).  For example,  the correlations for  food stamps and  SSI  were  observed
to  be  about  .9 and  .8  respectively.   The  relationship  is  also strong
between interviews  (K) and (K-2).    For example,  the correlation  for food
stamp  participants  between  interviews  (K)  and   (K-2)   is   .8.     These
correlations  were obtained from the data collected  in the Income Survey
Development Program (ISDP),  a predecessor of the SIPP.

      Based on  the above knowledge and experience of the  Bureau  staff,  the
following household  level  variables  were  chosen to  construct  noninterview
adjustment cells  for  second  and subsequent  waves.    A  detailed  description
of these cells is presented in King (1986).

      a.     Race and Spanish origin of reference person (non-Spanish white,
      other).

      b.     Household  type  (female householder  with own children  under  16
      years of  age but no  husband  present,  householder is 65 years of  age
      or older, others).

      c.     Education  level  of reference person  (less than 8 years, 8-11
      years, 12-15 years,  and 16 or more years).

      d.     Type of income  (welfare,  etc., others).

      e.     Assets (bonds,  etc., others).

      f.     Tenure (owner,  renter).

      g.     Public housing or  rent subsidized  (resident of public housing
      or recipient of government rent subsidies, others).

      h.     Household size  (1,  2, 3,  4 or more)


Cells  which do  not  meet  the  following  conditions are  collapsed  in a
predetermined manner.

      1.     Number of interviewed households  in  a  cell  is greater than or
            equal  to 30.

      2.     Noninterview  adjustment factor  is less than or equal to 2.
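
      The two conditions can be checked cell by cell as in the sketch
below.  It assumes, for illustration, that the adjustment factor is the
unweighted ratio of all eligible households to interviewed households in
the cell; the names `MIN_INTERVIEWED`, `MAX_FACTOR`, and `needs_collapsing`
are hypothetical.

```python
MIN_INTERVIEWED = 30  # minimum number of interviewed households in a cell
MAX_FACTOR = 2.0      # maximum noninterview adjustment factor

def needs_collapsing(interviewed, noninterviewed):
    """Return True if a cell fails either condition and must be collapsed."""
    if interviewed == 0:
        return True  # no respondents: the factor is undefined
    factor = (interviewed + noninterviewed) / interviewed
    return interviewed < MIN_INTERVIEWED or factor > MAX_FACTOR
```

For example, a cell with 40 interviewed and 5 noninterviewed households is
kept, while a cell with 25 interviewed households is collapsed regardless
of its adjustment factor.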

Noninterview Adjustment for  Longitudinal Estimates

      At  present,  longitudinal weighting procedures  are developed only  for
the estimates of  persons.   Two  levels of  noninterview adjustment are used
in these  procedures.  The first  is at the  household  level  and  is  similar to
the wave 1  adjustments for  the  cross-sectional  estimate.   It  accounts  for
persons who could not be interviewed at  the first  wave  of the reference
period covered  by  the  interval  for which  the  longitudinal  weights  are
developed.   The second  adjustment is made at the  person  level to account
for those  persons who could  not be  interviewed  for at least one  of  the
later waves  covering  the reference  period  of interest.   An alternative to
the weighting adjustment  is imputation   of  the complete  record  for  NI
persons.     (This  is  similar to  imputation of type  Zs  in cross-sectional
weighting.)  However, this  approach  may have a significant adverse effect
(increase  bias)  on estimates of gross flows,  one  of the  most important
longitudinal estimates.  See  Kalton  (1986)  and Singh et al.  (1988).

      The following variables were selected for use in the second level
longitudinal NI adjustment procedures in the same way as for the
cross-sectional adjustments and are based on the first interview covering
the time interval for which the longitudinal weighting is developed.  Note
that certain person level variables are defined based on the household
level variables.  For example, if at least one household (HH) member
received income from food stamps, the household is defined as having income
from food stamps and each member of the household is considered a food
stamp recipient.  See Huggins (1988) for more information.

a.    Average monthly HH  income  (<$1,200, $1,200-$3,999, > $4,000)

b.    Employment status  (self-employed, others)

c.    Type of income (welfare, etc.,  unemployment compensation, others)

d.    Assets (bonds, others)

e.    Education level  (<  12  years,  12-15 years, 16 or more years)

f.    Race  and origin (white and  not  Spanish, others)

g.    Labor force  status  (in labor force, not  in labor force)
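
      The rule noted above, by which a household level characteristic is
assigned to every person in the household, can be sketched as follows.  The
record layout and the function name are illustrative assumptions.

```python
def assign_household_flag(persons, key="food_stamps"):
    """Set each person's flag to True if any member of the household has it.

    Returns new records; the input list is left unchanged.
    """
    households_with_flag = {p["household"] for p in persons if p[key]}
    return [dict(p, **{key: p["household"] in households_with_flag})
            for p in persons]

# Hypothetical person records: only person 1 reports food stamp income.
persons = [
    {"id": 1, "household": "H1", "food_stamps": True},
    {"id": 2, "household": "H1", "food_stamps": False},
    {"id": 3, "household": "H2", "food_stamps": False},
]
flagged = assign_household_flag(persons)
```

Both members of household H1 are treated as food stamp recipients, while
the person in household H2 is not.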

      The cells formed using the above variables are collapsed before
making noninterview adjustments if the number of interviewed persons in a
given cell is less than 30 or the noninterview adjustment factor is
greater than 2.0.

Noninterview Adjustment  Research

      To  our knowledge,  no  study  has been  conducted  to evaluate  the
effectiveness  of  noninterview  adjustment methods  for  the  demographic
surveys.  Therefore,  the  effectiveness  of  these procedures  to reduce bias
in  estimates is  unknown.   A  study  (Singh,  1987)  to  evaluate  the SIPP
noninterview adjustment  methods for cross-sectional  estimates is underway.
The results  from this study  should  be  available  later this year.  Even then
no general statement can be  made,  since  the SIPP provides  a  large number of
estimates.   Some  indirect  evaluation of these procedures  could  be done.
For example, the SIPP estimates from wave 1 and  from  a later wave (say wave
4)  for  a given characteristic  could  be  compared  against  corresponding
estimates from  an independent source,  especially  administrative records.
However, the validity of such an  evaluation will be questionable.

      The  Bureau   of the  Census  has  conducted  noninterview  adjustment
related  research  for its demographic surveys.   Some of  the  research was
performed for  the American  Housing Survey  (AHS-National).   Parmer  (1986)
examined correlations between  variables  of interest,  between variables of
interest  and evaluation  variables,  and  the  nonresponse  rates  for  the
selected  variables  of   interest.     He also  examined  stability  of  the
variables considered to define noninterview adjustment  cells.   Research is
also  being  conducted on  improving noninterview adjustment for  the SIPP
(Petroni,   1988).    Similar  research  may also  prove  useful  for other
demographic surveys.

      Some  research to examine  the  feasibility  and merits  of computing
nonresponse  adjustment  factors as  well  as  constructing  weighting  cells is
being  conducted  by Rosenbaum  and Rubin  (1983) and  Little  and  Samuhel
(1983).   Research  is also needed in developing  models which may be used to
estimate response probabilities for units.   This could be done for  several
demographic surveys with similar  values  of  independent variables.

Acknowledgments

      The authors  wish to express their  appreciation  for valuable technical
comments  provided by Leroy  Bailey, David  Chapman,  Lawrence Altmayer,  and
Lloyd Hicks to improve the quality  of this  paper.  Special thanks also go
to  Kimberly Wilburn for  typing  the paper.   Without her persistence  and
willing attitude,  this paper could  not have been completed.

      The  work described  in  this  paper  was  not funded by  the  U.S.
Environmental   Protection  Agency  and  therefore  the  contents  do  not
necessarily  reflect the views  of the Agency, and  no official  endorsement
should be inferred.

                                REFERENCES
Cahoon,  L.,  and J. Bushery.   (1984).   "Effect of Noninterview Cell Size on
      the Variance of Estimates,"  Internal  Census Bureau  memorandum for
      documentation,  November  27,  1984.

Chapman, D.W.   (1976).   "A Survey of  Nonresponse  Imputation Procedures,"
      Proceedings  of the  Social  Statistics Section,  Part  1,  American
      Statistical  Association, 245-251.

Hansen,  M.H.,  W.N. Hurwitz, and  W.G.  Madow.   (1953).   Sample Survey Methods
      and Theory,  Vol.  I,  New  York: John Wiley and Sons, Inc.

Jones,  C.   (1984).   "Imputation  Based on Subsets of Interviewed Cases,"
      Internal Census  Bureau  memorandum from Jones to Butz,  December 29,
      1984.

Kalton,  G.  (1986).   "Handling  Wave  Nonresponse in Panel Surveys," Journal
      of Official  Statistics,  2, 303-314.

Kalton,  G.,  and  D.  Kasprzyk.    (1982).    "Imputing for  Missing  Survey
      Responses,"  Proceedings  of the  Survey  Research Methods  Section,
      American Statistical  Association, pp. 22-31.

Kalton,  G.,  J. Lepkowski,  and  T. Lin.   (1985).   "Compensating  for Wave
      Nonresponse  in the  1979  ISDP Research  Panel," Proceedings  of the
      Survey  Research  Methods Section, American  Statistical  Association,
      372-377.

King,  K.  (1985).   "SIPP  85:  Cross-sectional Weighting Specifications for
      Wave  I—Revision,"  Internal Census Bureau memorandum from  Jones to
      Walsh, November 21,  1985.

King,  K.  (1986).   "SIPP:   Cross-sectional Weighting Specifications for the
      Second and Subsequent Waves,"   Internal  Census Bureau memorandum from
      Jones  to Walsh, June 19, 1986.

Little,  R.J.A.,   and M.E.  Samuhel.    (1983).   "Imputation Models  on the
      Propensity to Respond,"  Proceedings  of  the Section on  Survey Research
      Methods, American Statistical Association, 415-420.

Nelson,  D., D.  McMillen,   and D.  Kasprzyk.   (1985).    "An  Overview  of the
      Survey of Income and Program Participation:   Update 1," SIPP Working
      Paper  Series no. 8401, U.S.  Bureau of the Census.

Oh, H.L.,  and  F.J.  Scheuren.   (1983).   "Weighting  Adjustment  for Unit
      Nonresponse,"  Incomplete Data in Sample Surveys, Vol.  2, New York:
      Academic Press, 143-184.

Parmer,  R.J.    (1986).    "Documentation  of   the AHS-National  Noninterview
      Adjustment Research  for 1985," Internal Census   Bureau memorandum for
      documentation,  April  16,  1986.

Petroni,  R.   (1987).   "SIPP 84:  Characteristics of  Initially  Interviewed
      Persons by Response  Status,"  Internal  Census  Bureau memorandum from
      Nonresponse Workgroup for the  Record, September 3, 1987.

Petroni,   R.    (1988).     "Evaluation  of  Mover   Characteristics  and
      Nonresponse,"  Internal   Census  Bureau  memorandum  from  Petroni  to
      Singh,  April  6,  1988.

Pritzker,  L., J.  Ogus,  and  M.H. Hansen.   (1965).   "Computer Editing Methods
      - Some Applications  and  Results,"  Bulletin  of the  International
      Statistical  Institute,  Proceedings  of the 35th  Session Belgrade 41,
      1965.

Rosenbaum,  P., and D. Rubin.   (1983).   "The Central Role  of  the  Propensity
      Score  in Observational  Studies for Causal  Effects,"  Biometrika 70,
      41-55.

Shapiro, G.   (1980).   "A  General  Approach to  Noninterview Adjustment,"
      Internal Census  Bureau  memorandum  from   Shapiro  to  Programs  Area
      Branch  Chiefs,  March  11,  1980.

Short, K.,  and E.  McArthur.   (1986).   "Life Event and Sample Attrition in
      the  Survey of  Income  and Program  Participation," Proceedings  of the
      Section on  Survey Research Methods,  American Statistical Association,
      200-205.

Singh,  R.,   L. Weidman,  and  G. Shapiro.  (1988).   "Quality of the SIPP
      Estimates,"  presented  at the  SIPP Conference  on   Individuals  and
      Families in  Transition:   Understanding Change  through Longitudinal
      Data, Annapolis,  Maryland,  March 16-18, 1988.

           ON THE ROBUSTNESS  OF THE MAXIMUM LIKELIHOOD ESTIMATOR
           IN THE PRESENCE OF NONRESPONSE IN  COMPOSITIONAL  DATA

              by: Chao  L. Chen
                  Environmental Research Center
                  University of Nevada, Las Vegas
                  Las Vegas, NV  89154
                                 ABSTRACT

      Human activity pattern data can be treated as compositional data.
Three statistical models for compositional data are reviewed, and the
logistic normal approach is adopted.

      Once the logistic normal approach is chosen, the problem of missing
values in the compositional data can be transformed into a missing value
problem in the multivariate normal case; hence the techniques for treating
missing values in the multivariate normal case can be applied.

      The results  show  that  certain  techniques  for missing values, though
useful  if  the square  loss function  is  used as  a criterion  for judging
estimators,   can  lack  robustness.    The  robustness  depends  on  both
nonresponse rate  and the nonresponse mechanism.

      This  paper   has  been  reviewed  in   accordance   with  the  U.S.
Environmental  Protection  Agency's  peer  and administrative review policies
and approved for  presentation  and publication.

INTRODUCTION

      To get a better estimate of the total exposure of a specific human
population (e.g., those living in a metropolitan area), one must know: (1)
the chemical attribute of the air in several well defined
"microenvironments" (e.g., kitchen, parking lot, bathroom), and (2) the
proportion of a given time period each person spends in each
microenvironment and the activities involved in that specific space and
time.  In short, we call this a "human activity pattern".  The current
study is focused on some statistical considerations of part (2).

      Human activity pattern data can be collected in many ways. We
consider a sample from the target population, in which each selected person
is asked to record his or her daily activities for a given time period.
Let the microenvironment-activity (m-a) combination be indexed by j. The
proportion of time spent in the j-th m-a for the i-th person, $p_{ij}$, can
be computed from the activity log if the record is complete. Our purpose is
to estimate the mean and covariance of $p_{ij}$ in the presence of item
nonresponse (incomplete logs). For simplicity, we will consider simple
random sampling only; a more complex design may be necessary in real
applications.

      In Section 2, some statistical models for compositional data are
reviewed, and Aitchison and Shen's (1980) logistic-normal model is adopted.
Section 3 explains how the incomplete logs can be connected to missing
values. By a theorem of Aitchison (1986), we transform a missing value
problem in the compositional data to a missing value problem in the
multivariate normal case; hence, the standard technique for missing values
mentioned in Section 4 can be applied indirectly to the compositional data.
Under some assumptions, the estimator mentioned in Section 4 is the maximum
likelihood estimator (MLE). Hence, it is asymptotically optimal, but it is
not without disadvantages. As illustrated in Section 5 by a simple
numerical example, this estimator suffers from a lack of robustness under
certain nonresponse mechanisms and nonresponse rates.

STATISTICAL CONSIDERATIONS

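      If there are d+1 m-a combinations, then

      $$p_{i1} + p_{i2} + \cdots + p_{i,d+1} = 1,$$

i.e., the sample space is the d-dimensional simplex:

      $$S^d = \{(p_{i1}, \ldots, p_{id}) : p_{i1} + \cdots + p_{id} < 1,\ p_{ij} > 0\ (j = 1, 2, \ldots, d)\}.$$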

For simplicity,  we drop the subscript  i  in the following discussion.   There
are  at  least three  ways  to  define  probability  density  functions on  a
simplex.

-------
      (1)  The Dirichlet distribution; see Wilks (1962, pp. 177-182) for
the related properties.

      (2)  Through the square root transformation $s_j = p_j^{1/2}$,
j = 1, 2, ..., d+1, a point $p = (p_1, p_2, \ldots, p_d)^T$ in the
d-dimensional simplex can be mapped into a point on the (d+1)-dimensional
hypersphere of unit radius. Stephens (1982) contended that the von Mises
distribution can be applied to fit points on the hypersphere. In
particular, his study includes a two-way analysis of variance of the human
activity pattern data collected from university students.

      (3)  Through the transformation $v_j = \log(p_j / p_{d+1})$,
j = 1, 2, ..., d, one obtains the vector $v = (v_1, v_2, \ldots, v_d)^T$,
which has a d-dimensional multivariate distribution with mean $\mu$ and
covariance matrix $\Sigma$, as suggested by Aitchison and Shen (1980). They
assumed that $v$ has a multivariate normal distribution, which implies that
$p$ has a logistic-normal distribution with parameters $\mu$ and $\Sigma$;
in short, we say $p$ is $L_d(\mu, \Sigma)$. Note that this transformation
is a multivariate generalization of the logistic transformation.
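
      The two transformations in approaches (2) and (3) can be sketched in a
few lines of Python. This is only an illustration, not part of the original
study: the function names `alr` and `sqrt_transform` and the example
composition are assumptions chosen for the sketch.

```python
import numpy as np

def alr(p):
    """Log-ratio transform of approach (3): v_j = log(p_j / p_{d+1}), j = 1..d."""
    p = np.asarray(p, dtype=float)
    if np.any(p <= 0):
        raise ValueError("all proportions must be strictly positive")
    return np.log(p[..., :-1] / p[..., -1:])

def sqrt_transform(p):
    """Square root transform of approach (2): s_j = p_j ** 0.5, j = 1..d+1."""
    return np.sqrt(np.asarray(p, dtype=float))

# Hypothetical composition: hours spent in d + 1 = 4 m-a combinations in a day.
p = np.array([8.0, 9.0, 1.5, 5.5]) / 24.0
v = alr(p)             # point in R^3, to be modeled as multivariate normal
s = sqrt_transform(p)  # point on the unit sphere, since sum(s**2) = sum(p) = 1
```

The last component serves as the reference category in `alr`, matching the
divisor $p_{d+1}$ used in the text.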

      We will  adopt the third approach for the following  reasons:

      (1)  Aitchison and Shen (1980) pointed out that some properties of the
Dirichlet distribution imply a very strong assumption of independence of
the $p_j$'s, which is unlikely in the current application. The family of
logistic-normal distributions provides a wider class of distributions than
the Dirichlet distributions. An even more general class of distributions
can be obtained by different transformations. See Aitchison (1982).

      (2)  Missing-values  analysis in the multivariate  normal case  is to
some  extent developed (Little and Rubin,  1987).    It  is also  easier to
relate  the transformed  variable  v  to the  design  variables   (basic
information of each sampling  unit  known  to us  prior  to the  sampling) by the
multivariate normal theory.

      (3) As indicated by Aitchison (1986,  pp. 338-340),  the directional
data approach for compositional data has a topological difficulty since the
square  root transformation  maps  the  simplex into  only  part of the  unit
sphere.

      Mathematical tractability is one of the main considerations in our
choice of an approach. It is sometimes doubtful that the real data satisfy
the logistic-normal assumption. If a test of normality contradicts the
logistic-normal distribution, the von Mises distribution and the
generalizations of the logistic-normal distribution serve as alternatives.
The Dirichlet distribution seems less promising, since the Dirichlet
distribution can be well approximated by the logistic-normal distribution
(Aitchison and Shen, 1980).

-------
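      Under the logistic-normal model, the statistical method for analyzing
the human activity pattern data can be limited to the routine procedures of
multivariate analysis using the transformed variable v if the data are
complete. For example, we can rewrite p as:

      $$(p_1, p_2, \ldots, p_{d+1}) = (\pi_1 u_1, \pi_2 u_2, \ldots, \pi_{d+1} u_{d+1}) \Big/ \sum_{j=1}^{d+1} \pi_j u_j;$$

in other words, the observed value $(p_1, p_2, \ldots, p_{d+1})$ is viewed
as a perturbation of the parameter $(\pi_1, \pi_2, \ldots, \pi_{d+1})$ by
$(u_1, u_2, \ldots, u_{d+1})$, where all the $u_j$'s are nonnegative.
Without changing the magnitude of $\log(p_j / p_{d+1})$, we can impose the
condition $u_1 + u_2 + \cdots + u_{d+1} = 1$. It can be checked that if
$u = (u_1, u_2, \ldots, u_d)$ is logistic normal with mean 0 and covariance
matrix $\Sigma$, then $p$ is $L_d(\mu, \Sigma)$, where
$\mu = (\mu_1, \mu_2, \ldots, \mu_d)^T = (\log(\pi_1/\pi_{d+1}),
\log(\pi_2/\pi_{d+1}), \ldots, \log(\pi_d/\pi_{d+1}))^T$. Once we get
$\hat\mu$, the estimate of $\mu$, the estimate of $\pi$ ($\hat\pi$) can be
computed by the inverse transformation

      $$\hat\pi_j = \exp(\hat\mu_j) \Big/ \Big[1 + \sum_{k=1}^{d} \exp(\hat\mu_k)\Big] \quad (j = 1, \ldots, d), \qquad \hat\pi_{d+1} = 1 \Big/ \Big[1 + \sum_{k=1}^{d} \exp(\hat\mu_k)\Big],$$

and the estimated covariance matrix of $\hat\pi$ can be obtained by
$G \hat\Sigma G^T$, where $\hat\Sigma$ is the estimated covariance matrix of
$\hat\mu$ and the (m, n)-th element of G is the partial derivative of
$\pi_m$ with respect to $\mu_n$ evaluated for $\mu_n = \hat\mu_n$.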
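
      The back-transformation and the matrix G described above can be
sketched as follows. The code assumes that G includes a row for the
reference component $\pi_{d+1}$ (so G is (d+1) x d); the function names and
the numerical values of `mu_hat` and `Sigma_hat` are hypothetical and serve
only to make the sketch runnable.

```python
import numpy as np

def alr_inverse(mu):
    """Map mu (length d) back to pi (length d + 1) on the simplex."""
    e = np.exp(np.asarray(mu, dtype=float))
    return np.append(e, 1.0) / (1.0 + e.sum())

def alr_jacobian(mu):
    """G[m, n] = d pi_m / d mu_n.

    For m <= d:  pi_m * (1{m == n} - pi_n);  for m = d + 1:  -pi_{d+1} * pi_n.
    """
    pi = alr_inverse(mu)
    d = len(mu)
    G = -np.outer(pi, pi[:d])
    G[:d, :] += np.diag(pi[:d])
    return G

# Hypothetical estimates for d = 3 (illustrative numbers only).
mu_hat = np.array([0.5, -0.2, 1.0])
Sigma_hat = np.array([[0.04, 0.01, 0.00],
                      [0.01, 0.09, 0.02],
                      [0.00, 0.02, 0.05]])

pi_hat = alr_inverse(mu_hat)
G = alr_jacobian(mu_hat)
cov_pi = G @ Sigma_hat @ G.T   # first-order (delta-method) covariance of pi_hat
```

Because the components of $\hat\pi$ sum to one, each column of G sums to
zero, which gives a quick sanity check on an implementation.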

MISSING VALUES IN ACTIVITY LOGS

      Ideally, an activity log should record m-a combinations before and
after each change of m-a. Some of the sampled people will have complete
logs while, inevitably, others will have "coarse" logs. For example, in the
i-th person's log, the first recorded item is "leaving home and entering
the car at 7:20 a.m.", and the second record is "entering the cafeteria
from my office at 10:00 a.m." Suppose we are willing to accept that there
are only two microenvironment-activities from 7:20 a.m. to 10:00 a.m.,
i.e., driving a car (m-a 2) and working in the office (m-a 3). Suppose also
that all the other m-a combinations are correctly recorded except that in
another time interval, m-a 3 and m-a 5 are again mixed together. Then,
instead of obtaining the individual $p_{i2}$, $p_{i3}$, and $p_{i5}$, we
can only get the sum of $p_{i2}$, $p_{i3}$, and $p_{i5}$ and conclude that
$p_{i2}$, $p_{i3}$, and $p_{i5}$ lie within certain intervals.

      Missing data can be partly identified by this kind of inconsistency
between two consecutive records, but this method is by no means complete.
For instance, there may be another completely unrecorded m-a between 7:20
a.m. and 10:00 a.m. in the above example. Even a log deemed complete may
become spurious if a completely unrecorded m-a is taken into consideration.
Situations become more complex as the logs become coarser; perhaps an ad
hoc procedure is necessary to examine each log. It is beyond the scope of
the present work to develop these procedures, but we assume that through
some suitable procedure, each log can be mapped into a single point in the
d-dimensional simplex if the log is complete; otherwise, we can observe
some unmingled elements of p and the sum(s) of other elements of p.

-------
      A coarse log is always a problem in the estimation procedure, but
discarding the coarse logs and analyzing data only from the complete logs
seems unsuitable. In the following discussion, we will sacrifice the
information contained in the sum(s) of the elements of p, but we will
still make use of the unmingled observations in p. In terms of the example
mentioned in the first paragraph of this section, we ignore the
information provided by the sum of $p_{i2}$, $p_{i3}$, and $p_{i5}$, but the
other individually observed proportions will be incorporated in the
estimation procedure.
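
      The bookkeeping described in this section can be sketched in Python.
The sketch is one possible ad hoc procedure, not the one used in the study:
the function `summarize_log`, the representation of a log as (minutes, set
of candidate m-a) pairs, and the example day are all assumptions. It returns
the individually observed proportions and the sums for groups of m-a that
were mixed together.

```python
def summarize_log(intervals):
    """Split a log into individually observed proportions and mixed sums.

    intervals: list of (minutes, set of candidate m-a indices). A set with
    one member is an unambiguous record; a larger set is a "coarse" record.
    """
    total = float(sum(minutes for minutes, _ in intervals))
    exact = {}    # m-a index -> minutes known to belong to that m-a
    groups = []   # [set of m-a indices, minutes] for coarse records
    for minutes, candidates in intervals:
        if len(candidates) == 1:
            (j,) = tuple(candidates)
            exact[j] = exact.get(j, 0.0) + minutes
        else:
            groups.append([set(candidates), float(minutes)])

    # Merge coarse groups that share an m-a: only their combined sum is known.
    i = 0
    while i < len(groups):
        k = i + 1
        while k < len(groups):
            if groups[i][0] & groups[k][0]:
                groups[i][0] |= groups[k][0]
                groups[i][1] += groups[k][1]
                del groups[k]
                k = i + 1   # the enlarged group may overlap groups skipped earlier
            else:
                k += 1
        i += 1

    # Time recorded exactly for an m-a that also appears in a coarse group
    # cannot be separated from that group, so fold it into the group sum.
    for group in groups:
        for j in list(exact):
            if j in group[0]:
                group[1] += exact.pop(j)

    observed = {j: t / total for j, t in exact.items()}
    mixed = [(sorted(members), t / total) for members, t in groups]
    return observed, mixed

# Hypothetical 24-hour log in minutes; m-a 1 (home) and m-a 4 (cafeteria)
# are illustrative labels, m-a 2, 3, 5 follow the example in the text.
day = [(440, {1}), (160, {2, 3}), (60, {4}), (300, {3, 5}), (480, {1})]
observed, mixed = summarize_log(day)
```

Here `observed` holds the unmingled proportions that enter the estimation
procedure, while `mixed` holds the sum for m-a 2, 3, and 5 that the
strategy above sets aside.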

STRATEGIES FOR INCOMPLETE LOGS

      To  analyze  the unmingled,  individually  observed proportions,  the
following fact (Aitchison, 1986,  p.  119) is  useful:

      Let $p$ be $L_d(\mu, \Sigma)$, and suppose the proportions in the m-a
$j_1, j_2, \ldots, j_{c+1}$ are individually observed,
$1 \le j_1 < j_2 < \ldots$
-------
normal case, obtaining the MLE in the presence of missing values is
straightforward if we have a monotone missing pattern and the nonresponse
mechanism is ignorable. Operationally, the procedure for computing the MLE
consists of successively regressing the less observed variable(s) on the
more observed variable(s) and then filling in the unobserved parts using
the regression model. Sweep and reverse sweep operators are commonly used
in this procedure. For details, see Little and Rubin (1987, pp. 112-119).

      The role of the ignorable nonresponse mechanism should be addressed
further. The assumption of ignorable nonresponse ensures that the complete
likelihood function can be factored into two parts: one corresponds to
$\alpha$, the parameter of the nonresponse mechanism; the other corresponds
to $\beta$, the parameter of the random variable we are interested in. If
the parameter space $\Omega$ can be expressed as the Cartesian product of
the parameter spaces of $\alpha$ and $\beta$, maximizing the complete
likelihood can be achieved by maximizing the separate parts (Little and
Rubin, 1987). In the typical terms of statistical inference, we can say
that $\alpha$ is the nuisance parameter. An MLE of $\beta$ can be obtained
by maximizing the marginal likelihood of $\beta$. Ignorable nonresponse
eliminates the dependence of the marginal likelihood of $\beta$ on the
nuisance parameter $\alpha$.

      If a monotone missing  pattern  seems  unlikely,  the  EM algorithm is
usually  applied (Little and Rubin,  1987,   pp.  142-145).   For  a thorough
discussion of the EM algorithm,  see Dempster,  Laird, and Rubin (1977).  If
we can  have  a  monotone pattern  by sacrificing a limited  amount of data,
Rubin  (1987, pp.  189-190) suggested discarding data  to obtain a monotone
pattern.

      Assuming an ignorable nonresponse mechanism, we use a bivariate
normal example to illustrate the estimation procedure in the presence of
nonresponse. Note that for the example we use, the missing pattern is
monotone.

      Let $y = (y_1, y_2)^T$ follow a bivariate normal distribution with
mean $\mu = (\mu_1, \mu_2)^T$ and covariance matrix $\Sigma$. Let $y_2$ be
subject to nonresponse. Whenever it is necessary, we use the letters R and
N as prefixes to denote response and nonresponse respectively; for example,
$({}_Ry_1, {}_Ry_2)$ is a random vector from the subpopulation of
respondents with mean $({}_R\mu_1, {}_R\mu_2)$. The same nomenclature
applies to the nonresponse subpopulation and to the covariance matrix. A
sample collected from the bivariate normal population can be arranged so
that the first m observations are complete, while the second variable of
the last n-m observations, call it ${}_Ny_2$, is not observed. Note that
both ${}_Ry_1$ and ${}_Ny_1$ are observed; the only unobserved part is
${}_Ny_2$. Any symbol without R or N in front means that both response and
nonresponse are considered; for example, $\bar{y}_1$ represents the mean of
the n $y_1$'s. The MLE of $\mu$ and the corresponding estimate of the
variance can be obtained by the following formulas (Little and Rubin, 1987,
pp. 100-103) if the nonresponse is ignorable:

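      $$\hat\mu_1 = \bar{y}_1,$$                                         (4.1)

      $$\hat\mu_2 = {}_R\bar{y}_2 + ({}_Rs_{12} / {}_Rs_{11})\,(\bar{y}_1 - {}_R\bar{y}_1),$$          (4.2)

      $$\widehat{\mathrm{var}}(\hat\mu_2) = \hat\sigma_{22 \cdot 1}\,\{1/m + {}_Rr^2 / [n\,(1 - {}_Rr^2)] + ({}_R\bar{y}_1 - \bar{y}_1)^2 / (n\,{}_Rs_{11})\},$$     (4.3)

where ${}_Rs_{12}$ and ${}_Rs_{11}$ are the sample covariance and variance
computed from the first m observations, i.e., the response part. Similarly,
${}_Rr$ is the sample correlation computed from the first m observations,
and $\hat\sigma_{22 \cdot 1}$ denotes the estimated residual variance from
the regression of ${}_Ry_2$ on ${}_Ry_1$.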

      Using the estimator (4.2) is equivalent to constructing a simple
regression of ${}_Ry_2$ on ${}_Ry_1$ and then obtaining the arithmetic mean
of the ${}_Ry_2$'s and the ${}_N\hat{y}_2$'s, where ${}_N\hat{y}_2$ is the
predicted value based on the regression line constructed from the response
part.
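
      The equivalence stated above can be checked numerically. The sketch
below is illustrative only: the function names, the boolean `observed`
mask, and the simulated data are assumptions. It computes the regression
estimator of $\mu_2$ directly and also by filling in the unobserved values
with regression predictions and averaging.

```python
import numpy as np

def mu2_regression(y1, y2, observed):
    """Regression estimator: respondent mean of y2 adjusted by the y1 shift."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    r1, r2 = y1[observed], y2[observed]
    slope = np.cov(r1, r2)[0, 1] / np.var(r1, ddof=1)   # s12 / s11 from respondents
    return r2.mean() + slope * (y1.mean() - r1.mean())

def mu2_impute_then_average(y1, y2, observed):
    """Fill unobserved y2 with regression predictions, then average all n values."""
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    r1, r2 = y1[observed], y2[observed]
    slope = np.cov(r1, r2)[0, 1] / np.var(r1, ddof=1)
    intercept = r2.mean() - slope * r1.mean()
    filled = np.where(observed, y2, intercept + slope * y1)
    return filled.mean()

# Hypothetical data: 30 units, y2 unobserved (NaN) for the last 10.
rng = np.random.default_rng(1)
y1 = rng.normal(10.0, 3.0, size=30)
y2 = y1 + rng.normal(0.0, 1.0, size=30)
observed = np.arange(30) < 20
y2 = np.where(observed, y2, np.nan)

direct = mu2_regression(y1, y2, observed)
imputed = mu2_impute_then_average(y1, y2, observed)
```

The two results agree up to floating-point error, which is the sense in
which the estimator "fills in" the nonrespondents from the regression line.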

ROBUSTNESS CONSIDERATION

      To discuss the robustness of $\hat\mu_2$ in (4.2), the influence
function approach will be employed in this section. For another approach to
robust estimation, in which a contaminated normal distribution is
considered, see Little and Rubin (1987, pp. 194-217).

      Loosely speaking, the univariate influence function is a measure of
the "influence" of an additional observation x on the estimator $T(F_n)$.
Here, the estimator is expressed as a functional T of the empirical
distribution function $F_n$. Formally, the influence function is defined to
be the limit of the ratio of the difference of two statistical functionals
and $\epsilon$ as $\epsilon$ goes to zero:

      $$IF(x; T, F) = \lim_{\epsilon \to 0} \{T[(1 - \epsilon) F + \epsilon \delta_x] - T[F]\} / \epsilon,$$

where $\delta_x$ is the indicator (point-mass) distribution function at x.

      The influence function can be generalized to the multivariate
situation (Hampel et al., 1986, p. 226). Also, the arguments in the
statistical functionals may have more than one distribution function. Some
typical results are illustrated below:

      $IF(y; T, F) = y - \mu$, when T is the mean functional,

      $IF(y; T, F) = (y - \mu)^2 - \sigma^2$, when T is the variance functional,

      $IF(y_1, y_2; T, F) = (y_1 - \mu_1)(y_2 - \mu_2) - \sigma_{12}$, when T is the
covariance functional.
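
      The first two results can be checked numerically by replacing the
limit with a small finite $\epsilon$. The sketch below is an illustration
under stated assumptions: F is taken to be the empirical distribution of a
small hypothetical sample, and the helper names are not from the paper.

```python
import numpy as np

def mean_functional(values, weights):
    return np.sum(weights * values)

def variance_functional(values, weights):
    m = mean_functional(values, weights)
    return np.sum(weights * (values - m) ** 2)

def influence(T, values, x, eps=1e-6):
    """Finite-eps version of {T[(1 - eps) F + eps delta_x] - T[F]} / eps,
    where F puts equal mass on each element of `values`."""
    values = np.asarray(values, dtype=float)
    w = np.full(len(values), 1.0 / len(values))
    v_mix = np.append(values, x)            # support of the contaminated distribution
    w_mix = np.append((1.0 - eps) * w, eps) # its probability masses
    return (T(v_mix, w_mix) - T(values, w)) / eps

sample = np.array([8.0, 9.5, 10.0, 10.5, 12.0])   # hypothetical data
x = 25.0                                          # hypothetical outlier
mu = sample.mean()
sigma2 = np.mean((sample - mu) ** 2)

if_mean = influence(mean_functional, sample, x)      # compare with x - mu
if_var = influence(variance_functional, sample, x)   # compare with (x - mu)**2 - sigma2
```

Both influence functions are unbounded in x, which is why a single distant
observation can move the mean and, even more strongly, the variance.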

      Since $\hat\mu_1$ is the usual sample mean, we will focus on
$\hat\mu_2$. Note that the corresponding statistical functional of
$\hat\mu_2$ is a function of two bivariate distribution functions, ${}_RF$
and ${}_NF$. By the three equations listed above and the chain rule of
calculus, the influence function of $\hat\mu_2$ can be expressed as

-------
      $$\cdots\ {}_R\sigma_{12}\,({}_N\mu_1 - {}_R\mu_1)\,[({}_Ry_1 - {}_R\mu_1)^2 - {}_R\sigma_{11}] / {}_R\sigma_{11}^2\},$$              (5.1)

where $\sigma_{ij}$ is the (i, j)-th element of $\Sigma$, p is the
population nonresponse rate, and, as stipulated previously, R and N denote
response and nonresponse respectively. If the nonresponse mechanism does
not depend on $y_1$ and $y_2$, then we call this situation missing
completely at random (MCAR). Under the assumption of MCAR,
${}_N\mu_1 = {}_R\mu_1$; hence the above formula can be simplified.


      $\hat\mu_2$ is better than ${}_R\bar{y}_2$ for the following two
reasons: (1) $\hat\mu_2$ is asymptotically unbiased. The squared bias term
produced by using ${}_R\bar{y}_2$, $({}_R\mu_2 - \mu_2)^2$, becomes the
dominant component in the mean squared error as the sample size increases
and the nonresponse mechanism is not MCAR. (2) For fixed m,
var($\hat\mu_2$) in (4.3) decreases as n increases while the variance of
${}_R\bar{y}_2$ remains unchanged; in other words, we lose more information
using ${}_R\bar{y}_2$ as the nonresponse rate p increases. However, these
advantages are not without a price. The price one pays is robustness. A
further inspection of (5.1) reveals that the influence function may
increase as p or $({}_N\mu_1 - {}_R\mu_1)^2$ increases; an increase of
$({}_R\mu_2 - \mu_2)^2$ will, in turn, increase $({}_N\mu_1 - {}_R\mu_1)^2$
if there is correlation between $y_1$ and $y_2$.

      The effect of an outlying ${}_Ny_1$ on $\hat\mu_2$ is less
complicated than that of an outlying $({}_Ry_1, {}_Ry_2)$ if the
nonresponse is not MCAR; hence only the effect of an outlier
$({}_Ry_1, {}_Ry_2)$ will be illustrated by the following numerical
example. First, generate 200 vectors $y = (y_1, y_2)^T$ following a
bivariate normal distribution with mean $\mu = (10, 10)^T$ and covariance
matrix

      $$\Sigma = \begin{pmatrix} 10 & 9 \\ 9 & 10 \end{pmatrix}.$$

This can be done by first generating 200 pseudo random vectors z following
a bivariate normal distribution with mean 0 and identity covariance matrix
using the SAS package. Then let $y = \mu + A z$, where A is a lower
triangular matrix and $\Sigma = A A^T$ is the Cholesky decomposition of
$\Sigma$. Assume the logistic nonresponse mechanism:

      $$\Pr(y_{i2} \text{ not observed} \mid y_{i1}) = 1 / [1 + \exp(-a - b\,y_{i1})].$$         (5.2)
This mechanism is ignorable in the sense that it does not depend on the
possibly unobserved $y_2$. It is also MCAR if b=0. Points with larger $y_1$
have a higher probability of being unobserved when b is positive. By
varying combinations of a and b, several nonresponse mechanism cases can be
generated. For each generated $(y_{i1}, y_{i2})$, we generate an
independent $u_i$ following a uniform distribution on (0, 1). If

      $$u_i < 1 / [1 + \exp(-a - b\,y_{i1})],$$

then we treat $y_{i2}$ as unobserved. In order to compare the effect of
different b's under the same response rate, values of a and b are purposely
selected so that the number of nonresponses in cases 3, 4, 5, and 6 is
exactly 150. The computational results are listed in Table 1. In each case,
200 points are obtained, and, after the removal of the deemed nonresponse
${}_Ny_2$, ${}_R\bar{y}_2$ and $\hat\mu_2$ are calculated. Then the outlier
(25, 3) or (3, 25) is added to the data set as a $({}_Ry_1, {}_Ry_2)$
point, and $\hat\mu_2^+$, the new estimate of $\mu_2$, is again computed.

                                  TABLE 1
          ESTIMATES OF $\mu_2$ UNDER DIFFERENT RESPONSE MECHANISMS

  case       a          b      # of N   ${}_R\bar{y}_2$   $\hat\mu_2$   added outlier   $\hat\mu_2^+$

    1    $-\infty$     0.0        0         10.00          10.00         (25, 3)          9.96
    2     -1.099       0.0       48         10.01           9.98         (25, 3)          9.93
    3      1.099       0.0      150          9.91          10.01         (25, 3)          9.73
    4      0.00        0.14     150          9.22           9.97         (25, 3)          9.44
    5    -15.45        2.00     150          6.27           9.15         (25, 3)          6.79
    6     24.95       -2.00     150         13.72          10.36         (3, 25)         14.1

      Table 1 is an illustration of the assertion that the advantage gained
by using formula (4.2) is not without a price. In case 1,
${}_R\bar{y}_2 = \hat\mu_2$, and the influence of the added outlier is
limited to the second decimal place. A comparison of cases 1, 2, and 3
shows that the influence of the outlier on $\hat\mu_2$ increases as the
nonresponse rate becomes higher. A comparison of cases 3, 4, 5, and 6 shows
that, given a fixed number of nonresponses, $\hat\mu_2$ is closer to the
true value 10 than ${}_R\bar{y}_2$ is. This phenomenon is even more obvious
when the magnitude of b becomes larger; however, the absolute value of
$\hat\mu_2 - \hat\mu_2^+$ also becomes larger.

      From the viewpoint of regression diagnostics, it is obvious that the
newly added point is a high leverage point; the simple regression model of
${}_Ry_2$ on ${}_Ry_1$ is highly distorted by the added point. This in turn
will distort the imputed value ${}_N\hat{y}_2$ obtained from ${}_Ny_1$ and
the regression model.
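
      The numerical experiment can be sketched in Python with NumPy in
place of the SAS package. This is an assumption-laden illustration: the
function names, the seed, and the random number generator are not from the
original study, and the values of a and b are simply taken from Table 1, so
the number of nonresponses will generally not be exactly 150 and the
estimates will not reproduce the table.

```python
import numpy as np

MU = np.array([10.0, 10.0])
SIGMA = np.array([[10.0, 9.0], [9.0, 10.0]])

def simulate(a, b, n=200, seed=0):
    """Generate y = mu + A z and a logistic nonresponse indicator for y2."""
    rng = np.random.default_rng(seed)
    A = np.linalg.cholesky(SIGMA)              # lower triangular, SIGMA = A @ A.T
    z = rng.standard_normal((n, 2))
    y = MU + z @ A.T
    p_missing = 1.0 / (1.0 + np.exp(-a - b * y[:, 0]))   # mechanism (5.2)
    u = rng.uniform(size=n)
    observed = u >= p_missing                  # y2 is unobserved when u < p_missing
    return y, observed

def mu2_hat(y, observed):
    """Regression estimator of mu_2 built from the respondents."""
    r = y[observed]
    slope = np.cov(r[:, 0], r[:, 1])[0, 1] / np.var(r[:, 0], ddof=1)
    return r[:, 1].mean() + slope * (y[:, 0].mean() - r[:, 0].mean())

def outlier_shift(y, observed, outlier):
    """Change in the estimate when one respondent point is appended."""
    y_plus = np.vstack([y, outlier])
    obs_plus = np.append(observed, True)
    return mu2_hat(y_plus, obs_plus) - mu2_hat(y, observed)

# (a, b, outlier) combinations borrowed from cases 2 to 6 of Table 1.
cases = [(-1.099, 0.0, (25.0, 3.0)), (1.099, 0.0, (25.0, 3.0)),
         (0.0, 0.14, (25.0, 3.0)), (-15.45, 2.0, (25.0, 3.0)),
         (24.95, -2.0, (3.0, 25.0))]
results = []
for a, b, outlier in cases:
    y, observed = simulate(a, b)
    results.append((a, b, int((~observed).sum()), y[observed, 1].mean(),
                    mu2_hat(y, observed), outlier_shift(y, observed, outlier)))
```

Each row of `results` holds a, b, the number of nonresponses, the
respondent mean of $y_2$, the regression estimate, and the shift caused by
the added point, which allows the relative sensitivity across mechanisms to
be compared in the same spirit as Table 1.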

      Finally, we want to point out that for the numerical example we used,
the outlier is an obvious one. To detect outliers systematically, we need
a general rule to order the multivariate observations. See Green (1981)
for the bivariate example.

-------
DISCUSSION

      Since nonresponse will complicate the analysis procedure and make the
statistical results less reliable, attempts should be made to reduce the
nonresponse rate. However, the remedy for nonresponse is by no means
complete if we concentrate on the nonresponse rate only; the role of the
nonresponse mechanism should not be ignored. For example, incentive
methods may not only increase the response rate, but also change the
nonresponse mechanism, say, increase the magnitude of b in (5.2).
Reduction of the sensitivity to the outlier by the increase of the response
rate can be counteracted by the increase of b. From this point of view, a
complete consideration of using an incentive method should at least include
the following questions: When and to whom should we apply the incentive
methods: at the very beginning of the sampling program or at the follow-up
survey, to all the population or just to the "hard core"?

      Though  the   increase  of the  number  of   microenvironment-activity
combinations will  not change  the computational  procedures,  it  is  still
necessary  to decide how many m-a combinations we are going  to define.   For
a fixed number of observations,  the number of  microenvironment-activity
combinations cannot be increased without  any  limitation.    The chance that
the MLE is  close  to the  true  value  is  small  if the  number  of  parameters
increases  as the number of observations increases.

      The numerical example used in this study comes from a bivariate
normal distribution. It can be generalized without difficulty to a
multivariate normal case as long as the assumptions of an ignorable
nonresponse mechanism and a monotone missing pattern still hold. Further
studies are necessary if we do not have an ignorable nonresponse mechanism.
One may also argue that the outlier in the numerical example is an extreme
one, but what we want to emphasize here is that for the same outlier, the
influence varies as the nonresponse rate and nonresponse mechanism change.
In other words, we are more interested in the relative magnitude of
$\hat\mu_2 - \hat\mu_2^+$ among different cases in Table 1 than in the
difference $\hat\mu_2 - \hat\mu_2^+$ in each individual case.

-------
                                 REFERENCES
Aitchison, J. (1982).  The statistical analysis of compositional data (with
      discussion).  J. R. Statist. Soc. B, 44, 139-177.
Aitchison, J. (1986).  The Statistical Analysis of Compositional Data.
      London: Chapman and Hall.
Aitchison, J., and Shen, S. M. (1980).  Logistic-normal distributions: Some
      properties and uses.  Biometrika, 67, 261-272.
Dempster, A. P., Laird, N., and Rubin, D. B. (1977).  Maximum likelihood
      from incomplete data via the EM algorithm.  J. R. Statist. Soc. B,
      39, 1-38.
Green, P. J. (1981).  Peeling bivariate data.  In Interpreting Multivariate
      Data, edited by V. Barnett (John Wiley & Sons, New York), pp. 3-19.
Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A.
      (1986).  Robust Statistics.  New York: John Wiley & Sons.
Little, R. J. A., and Rubin, D. B. (1987).  Statistical Analysis with
      Missing Data.  New York: John Wiley & Sons.
Rubin, D. B. (1987).  Multiple Imputation for Nonresponse in Surveys.  New
      York: John Wiley & Sons.
Stephens, M. A. (1982).  Use of the von Mises distribution to analyse
      continuous proportions.  Biometrika, 69, 197-203.
Wilks, S. S. (1962).  Mathematical Statistics.  New York: John Wiley &
      Sons.

-------
             NONRESPONSE PROBLEMS AND SOLUTIONS:  A CASE STUDY

                      by:   Dawn  Nelson and Chet Bowie
                           U.S. Bureau of the Census
                           Washington, D.C.  20233
                                 ABSTRACT

      The purpose of this paper is to describe the nonresponse problems in
one particular survey and the efforts being made to reduce them, in hopes
that the information will be useful in planning other surveys. The survey
is the Survey of Income and Program Participation (SIPP). This survey has
been conducted by the Census Bureau since 1983 to provide longitudinal
information on the economic situation of households and persons in the
United States.

     This paper presents  information on  our experience with nonresponse in
the first SIPP panel  to  be  completed, the  1984  SIPP  panel.    It contains a
demographic profile of those who refused, which was developed by analyzing
information provided by  SIPP interviewers.   It  also  contains a discussion
of the reasons why respondents refused to participate  in the survey and the
results of follow-up visits  made to convert these refusals  into  interviews.
Finally,  there  is a description of other  efforts that are  being  made to
maintain  and improve  the  SIPP response rates.    These  efforts  include
educating  the interviewers  and respondents  about  the survey,  improving
interviewer  training,  and  testing  the  effects of offering  respondents a
gift for participating.

-------
                    INTRODUCTION TO NONRESPONSE PROBLEMS
     When conducting a survey,  the  researchers and managers depend upon the
cooperation  of respondents  to produce  accurate results.    Without this
cooperation,  the resulting survey  data may  be biased.   Therefore, efforts
should be  made to understand  the  reasons for nonresponse  and  to develop
ways to reduce it.

     Nonresponse problems  differ  somewhat depending on  the  nature of the
survey; that is, the mode of interviewing (mail,  telephone, diary, personal
visit), the  type  of respondent (person or  business),  the number of times
each respondent is  interviewed, the length and  content  of the interview,
whether it is voluntary or mandatory,  and so forth.   This paper will focus
on  nonresponse problems  associated with  household surveys  conducted by
personal  visit or a  combination of  personal visit and telephone.

     According to the literature,   a typical nonresponse rate for this type
of  survey  is probably about  20 percent  (1).   The  Census Bureau,  on the
other hand,  manages  to keep  the rate  to  around  5 percent or less for most
of  its household  surveys.   This should make you ask how  the Census  Bureau
addresses  the  problem  of nonresponse.    That  question  is  not  easily
answered,  however.   Much of  the knowledge  about  this subject  is  considered
to  be  common sense; therefore, it is not well-documented  at the Bureau.
But it is  well-established that there  are  two main  sources of nonresponse:
noncontacts and refusals.

NONCONTACTS

      There are several reasons that an interviewer might not be able to
establish contact with a respondent; for example, the respondent may be
away at work, school, prison, etc., or on vacation, or unable to answer the
door due to an illness or disability. Other barriers include bad weather or
road conditions and housing security measures that prevent access to the
respondent. Obviously some of these conditions are temporary and may be
overcome if enough time and money are allotted for the interviewers to make
return visits or "callbacks" to the respondents.

      Callbacks can be made less costly and time-consuming if you plan
ahead for them in designing the survey. For example, a cluster sample
design will ensure that neighboring housing units are selected in clusters
scattered throughout the sampling area. This will minimize the amount of
travel time needed for callbacks to neighboring units. However, some
concern has been expressed that cluster sampling may increase refusals if
an interviewed household tells a neighbor something negative about the
survey before the interviewer reaches them. Another planning suggestion is
to set up a flexible interviewing schedule that includes nights and
weekends. This makes callbacks possible at different times of the day and
on different days of the week, which will increase the chance of finding
someone at home.

-------
     There  are  also   a  number  of  steps  that  can   be  taken  to  avoid
noncontacts or limit the  number of callbacks.  For example:

      1.     Keep records of the time contacts are attempted or made and
            analyze them to determine when respondents are most likely to
            be home and how many callbacks to allow.

      2.     Ask respondents who will be interviewed several times when is
            the best time to visit and note it for future use.

      3.     Use "suspected" characteristics of the sampled person for clues
            regarding the best time to attempt contact; e.g., persons
            living in housing for the elderly are more likely to answer the
            door during the day.

      4.     Allow proxy interviews; i.e., obtain the information about a
            respondent from another knowledgeable person.

      5.     Ask a neighbor when the respondent is likely to be home.

      6.     Make an appointment by telephone. However, it should be noted
            that many interviewers feel this procedure may lead to a
            refusal because it is easier to turn away someone over the
            phone than in person.

      7.     Leave a notice saying that an interviewer was there and asking
            the respondent to call the interviewer to make an appointment.

REFUSALS

      Once contact  is  made,  however, there  is  still  the possibility that
the respondent may  refuse  to  participate  in  the survey.   In fact, this  is
generally  a  bigger  problem than making contact.   Researchers Stephan and
McCarthy  (2)  have  found  that obtaining  a  response depends  on several
factors including:  1) the form of the approach  (mail, telephone,  personal
visit), 2)  the type of information requested and the advance notice given
to  the respondent,  3)  the  characteristics of  the respondent,  4)  the
respondent's  attitudes toward the  group  conducting the  survey,  5)  the
efforts made  to  overcome  resistance,  and  6)  the circumstances under which
the interview is conducted.

     The Survey Research Center at the University of  Michigan also  studied
the concerns  expressed by respondents  that might lead  to  a refusal (3).
The most frequently cited concerns were that the  interview would 1) take
too  much  time,   2)  ask  for  too   personal,   difficult,    or  unpleasant
information,  or  3)  have  a negative  effect  on the respondent, e.g.,  denial
of some government  program benefits.   They also found that  these  concerns
were mitigated in some cases  by  the respondent's desire  to be of  public
service or because the respondent  was lonely  and wanted to talk to  someone,
or flattered  to be asked  to  participate,  or just friendly and  liked  to
talk.

-------
     It is more difficult to give advice on  how  to avoid refusals because
each respondent is different and may react differently.   Some general tips
in this area have  also  been developed by Stephan  and McCarthy  (4).   They
recommend:

      1.     using  well-trained  professional   interviewers  who  have  a
            positive  attitude,

      2.     assigning interviewers  to respondents who have characteristics
            that are similar  to the  interviewers'  (but  an  interviewer
            should  not personally know a respondent),

      3.     providing advance notification about the interview,

      4.     having a good introduction  and  description of the  survey for
            the interviewers to use,

      5.     paying  the respondents,

      6.     rescheduling the interview  if the  respondent is  too busy when
            first  contacted, and

      7.     using a different interviewer to try to change  a respondent's
            refusal  into an  interview.

                 ADDRESSING REFUSALS AT THE CENSUS BUREAU

      The  Census Bureau  follows  all  of these  recommendations except that we
do  not pay respondents because  we  are  prohibited  from  doing   so  by
governmental regulations.    In  addition,  the  Bureau believes  that  it is
important  to:

      1.   have  a well-designed, fully-tested, brief questionnaire,  and

      2.     guarantee  the  respondent's  confidentiality  and  train  the
            interviewers on  the  importance  of   and  ways  to  maintain
            confidentiality.  Census  Bureau  interviewers are subject to a
            jail penalty or  a  fine  if they disclose  any information that
            would identify  a respondent.   We only publish data in the form
            of  statistical  summaries  and never  release any information that
            identifies  an individual.   We believe that these measures help
            to  maintain  our  high  response rates.

     Also,  the  importance of  having  the right interviewers,  training them
well,  providing  them  with  incentives,   and  soliciting  their help  in
addressing the  nonresponse problem should be  stressed.   The Census  Bureau
has a regular staff of around 3,000  sample  survey interviewers who work out
of our 12  regional offices throughout the country.  When they are assigned
to work on a survey,  they are thoroughly trained  on it through self-studies
done alone  at  home and  in classroom  sessions with  other  interviewers.  The
interviewers also  receive refresher  classroom or  self-study  training at
least  once a  year  or  sometimes more  often.    This training  will  often
emphasize problems the interviewers are  encountering,  such as nonresponse
problems.  The Bureau feels that if the  interviewers are familiar with the
basic concepts of the survey,  the wording of the questions, and the way to
complete the  forms,  they  will  be more  likely  to display a  positive and
professional  impression  to  the  respondents.    The  importance of  this
impression  was demonstrated in  an experiment  conducted  by Durbin and Stuart
in which only 3 to 4  percent  of the  respondents refused  professional
interviewers while about  13 percent refused the inexperienced  amateurs  (5).

      The Bureau  also tries to  keep  an open line of communication between
the  interviewers in the  field  and  the survey  managers   in  the  regional
offices  and  the  central  headquarters.   One  way we do this  is  through a
program  called  "Thanks  for Asking"  in  which interviewers are invited to
send questions and suggestions to the head of our field operations office.
Matters of general interest are responded to in a newsletter that goes out
to all  interviewers and  the  other letters  are  answered  personally.   We
receive  a  lot of valuable advice  from our interviewers in  this  way, and
occasionally  an  interviewer is given a  cash  bonus  if  their  suggestion is
adopted.

      The  Bureau also  provides  feedback to the interviewers  on  how well
they are doing in general and in particular on keeping  the  nonresponse rate
low.  A  noninterview rate  is calculated  for each interviewer at the end of
an  enumeration  period  so they  know  if  their  efforts   are  succeeding.
Interviewers are observed periodically by their  supervisors, and their work
is  also  systematically  reviewed  for  errors  to   enable us to  detect
weaknesses that  need to be corrected through additional training.   We also
use these evaluations to  reward  good  interviewers with pay incentives.

      Finally, despite all of the Bureau's efforts to make  contacts and get
cooperation,  some nonresponse  is  considered inevitable.   Also,  in some
cases,  we  feel  it  may  be better  to accept  a  refusal than  to  pursue an
interview  with   a respondent  who  is  so negative  or  apathetic  that they
provide  data of  questionable  quality.   In  planning  a survey,  one must
decide on  the extent of the efforts to  be  made  in  obtaining cooperation.
This decision will depend  upon:  1)  the amount of nonresponse expected,  2)
the  likely  differences between  the  respondents  and  nonrespondents,  3) the
accuracy required in the  results,  and  4)  the funds  and  time available.
These plans  should  also include procedures  for  collecting information on
nonrespondents  which can  be  used to  make  adjustments to the results to
account  for the  nonresponse.    However,  this  paper will  not attempt to
address this important topic  of  nonresponse adjustments.
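
      As a rough illustration of why factors 1) and 2) matter together,
the sketch below uses a common textbook approximation that is not taken
from this paper:  the bias of a respondent-only mean is roughly the
nonresponse rate times the difference between the respondent and
nonrespondent means.  The function name, the rates, and the dollar figures
are all hypothetical.

```python
def nonresponse_bias(nonresponse_rate, mean_respondents, mean_nonrespondents):
    """Approximate bias of the respondent mean as an estimate of the full-sample mean.

    Textbook identity (an assumption here, not a Census Bureau procedure):
        bias = nonresponse_rate * (mean_respondents - mean_nonrespondents)
    """
    if not 0.0 <= nonresponse_rate <= 1.0:
        raise ValueError("nonresponse_rate must be between 0 and 1")
    return nonresponse_rate * (mean_respondents - mean_nonrespondents)


# Hypothetical monthly incomes in dollars; none of these values are survey results.
low = nonresponse_bias(0.05, 2000.0, 1800.0)   # small nonresponse rate
high = nonresponse_bias(0.20, 2000.0, 1800.0)  # larger nonresponse rate
```

With the same gap between respondents and nonrespondents, the larger
nonresponse rate produces a proportionally larger bias, which is why the
expected amount of nonresponse and the likely differences between the two
groups are considered jointly.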


       A CASE STUDY:  THE SURVEY OF INCOME AND PROGRAM  PARTICIPATION


THE SURVEY AND ITS NONRESPONSE RATES

     The Census  Bureau has recently launched a major new survey, the Survey
of Income  and Program Participation  (SIPP),  that has provided us with the
opportunity  to  further  study the problem  of  nonresponse.   The survey has
been conducted  by the Census  Bureau since  1983  to provide  longitudinal
information  on  the economic  situation  of households  and persons  in  the
United States.   The data are used to analyze the cost and effectiveness of
government  transfer  programs,  to better  understand the Nation's  income
distribution, and to  study  national policy  issues.  Panels of approximately
12,500 households  are  introduced every February, which  results  in  two or
sometimes  three  panels in  the  field concurrently.   Respondents in  each
panel are  interviewed once  every  4 months  for  2  1/2  years.   If they move,
attempts are made to  continue  interviewing  them at their new address.  (See
Reference 6 for  a more  complete description of the survey.)

      Nonresponse rates are calculated for  each "wave" of  interviewing in a
panel.   A wave  is the 4-month period that  is required  to  interview  the
entire  sample;   one  fourth of  the  sample,  called  a  rotation group,  is
interviewed  each month of the  wave.   This paper  presents a variety  of
information (originally presented  at  the  Bureau of the Census' Third Annual
Research Conference)  on our experience with noninterviews  in the first SIPP
panel to  be  completed, the 1984  SIPP Panel,  which  consisted of 9 waves of
interviewing (7).  The interviews  were voluntary and  long—over an hour per
household  on the  average—and  dealt mainly with a subject that  is  not
comfortable or pleasant for most people:   income.   After the first wave of
interviewing, our  nonresponse  rate  including noncontacts and refusals  was
4.9  percent.   By the ninth and  final  wave of interviewing,  the  rate  was
22.3  percent,  including noninterviews  from previous waves  that  were  not
converted  to an  interview.   This was  high for  a  Census  Bureau survey.
However,  over one-fourth of the  loss was accounted for by households that
had  moved  to  an  undetermined  location  or outside  of our  interviewing area
after  the  first interview.    The  remaining  loss  was  primarily  due  to
refusals.

     In  Wave 1  of the  1984 Panel,   about  76 percent of  the nonresponses
(excluding lost movers)  were refusals  (3.7 percent).   This  percentage
increased  throughout  the panel until  it reached 94 percent in Wave 9 (14.2
percent).  A number of hypotheses have  been suggested regarding the reasons
people refuse to  participate  in  SIPP.   Some people  suspect that  interview
length,  frequency, and content are the prime  candidates.  Others believe
that interviewer characteristics (age, experience, understanding of the
survey, etc.) might be related to refusals.  And still others think that
the problem is generic:  people do not participate in our surveys because
they  do  not trust the  government or are  afraid  of strangers due  to  the
increase in crime.
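
      The figures quoted above can be cross-checked with a little
arithmetic.  The sketch below assumes that the parenthetical refusal
percentages (3.7 and 14.2) are rates on the same base as the overall
nonresponse rates (4.9 and 22.3 percent); the function name is
illustrative only.

```python
def implied_rate_excluding_movers(refusal_rate_pct, refusal_share_pct):
    """Nonresponse rate (excluding lost movers) implied by the refusal
    rate and the share of nonresponses that were refusals."""
    return refusal_rate_pct / (refusal_share_pct / 100.0)


# Wave 1: refusals were 3.7 percent, about 76 percent of nonresponses.
wave1 = implied_rate_excluding_movers(3.7, 76.0)        # about 4.9 percent

# Wave 9: refusals were 14.2 percent, about 94 percent of nonresponses.
wave9 = implied_rate_excluding_movers(14.2, 94.0)       # about 15.1 percent

# The remainder of the 22.3 percent Wave 9 loss would be lost movers.
lost_movers_wave9 = 22.3 - wave9                        # about 7.2 points
mover_share_wave9 = 100.0 * lost_movers_wave9 / 22.3    # about 32 percent
```

Under that assumption, the implied mover share of roughly 32 percent agrees
with the earlier statement that over one-fourth of the loss was due to
households that moved.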

PROFILE OF THE REFUSERS

      In  an attempt  to  improve our  understanding  of  the reasons  for
nonresponse, SIPP  interviewers are asked to  provide  a detailed description
of  each  noninterviewed  household encountered.    For  each  noncontact  or
refusal,  interviewers fill   out a  form providing information on the type of
noninterview, the demographic  characteristics of a refuser,  the reason  for
refusal,   and  information  on the  follow-up  attempts.    Because  of  the
longitudinal  survey  design, more than one form can  be  completed  for each
household  during the  time  it is in the survey, since a noninterview in  one
wave can  be  revisited in  the  next wave  and  remain a noninterview.   The
first data to be analyzed  from these forms are refusals in Waves 1 through
6  of the  1984  Panel.     Following is  a  demographic  profile of  these
households based on interviewer-observed characteristics of the household
and the person who  refused  for  the  entire household.

            Most refusals  (about 80 percent)  occurred in  either  central
      city or  suburban  area households.   Only  around 20  percent of the
      refusals occurred  in  rural  area households.   (See Table 1.)

            Most refusals  (approximately  73  percent) occurred in middle
      income range  households.  (See Table 2.)   NOTE:   Interviewers were
      asked to mark either high,  middle, or low  income (undefined  in terms
      of dollars) based  on their  own observation  of the sample  unit and its
      location.

            The  average  age  of  the  person   who  refused  household
      participation was  between 46  and 47.  (See  Table 3.)

            More females (about 60 percent) refused household  participation
      than males.  (See  Table 4.)

            Consistent with  the  population distribution, whites accounted
      for the majority of  household refusals,  i.e., over 87 percent.   (See
      Table 5.)


           TABLE 1.  PERCENT DISTRIBUTION OF THE LOCATION OF THE
                         REFUSAL HOUSEHOLDS BY WAVE

                 % of                         Wave
Location        Sample*       1      2      3      4      5      6

Central City      28        40.2   38.3   36.1   37.3   39.9   44.1
Suburb            41        39.1   42.4   40.3   41.4   40.0   36.4
Rural             31        20.7   19.3   23.6   21.3   20.2   19.5

*     The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data.  The distributions of refusal households by wave are
based on interviewer-reported observations.
               TABLE 2.   PERCENT DISTRIBUTION OF INCOME LEVEL
                           OF REFUSAL HOUSEHOLDS

                                    Wave
Income*            1      2      3      4      5      6

High             10.9   13.8   10.8    9.1    8.7    7.6
Middle           73.9   72.2   72.2   73.2   75.4   72.7
Low              15.2   14.0   17.1   17.7   15.9   19.7

*     The level was self-defined by each interviewer and would not be
comparable to any reported income data based on the full sample.

          TABLE 3.  PERCENT DISTRIBUTION OF THE AGE CATEGORIES OF
                         RESPONDENTS REFUSING TO PARTICIPATE

                  % of                         Wave
Age Category     Sample*       1      2      3      4      5      6

Less than 20       29         0.2    0.2    0.7    0.8    0.4    0.5
20-29              19         9.8   14.2   16.0   15.5   19.5   17.2
30-39              15        28.6   20.2   19.9   22.8   20.5   27.5
40-49              11        19.7   16.9   17.3   17.1   16.9   16.4
50-59               9        17.2   16.4   16.2   16.3   17.6   17.5
60-69               9        17.4   17.2   16.9   15.9   14.2   11.6
70 or older         8         7.0   15.1   13.1   11.7   10.7    9.5

Average age of
person refusing              42.6   49.4   48.1   47.4   46.8   44.9

*     The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data.  The distributions of refusal households by wave are
based on interviewer-reported observations.
              TABLE 4.   PERCENT DISTRIBUTION OF SEX OF PERSON
                          REFUSING TO PARTICIPATE

                 % of                         Wave
Sex             Sample*       1      2      3      4      5      6

Male              48        44.4   41.0   39.9   39.9   39.2   36.8
Female            52        55.6   59.0   60.1   60.1   60.8   63.2

*     The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data.  The distributions of refusal households by wave are
based on interviewer-reported observations.

         TABLE 5.  PERCENT DISTRIBUTION OF RACE OF PERSON REFUSING

                    % of                        Wave
Race               Sample*      1      2      3      4      5      6

White               85.0      87.8   88.8   87.7   86.5   88.2   86.1
Black               12.0       7.9    9.9   10.7   11.3   10.1   12.6
American Indian      0.5        -     0.2     -     0.4    0.1     -
Asian                2.5       1.0    0.9    1.4    1.2    1.1    0.8
Other                 -        0.5    0.2    0.2    0.5    0.3    0.3
Don't know            -        2.8     -      -      -     0.2    0.3

*     The composition of the sample changes slightly from wave-to-wave due
to additions and attrition; this percentage distribution is based on Wave 6
full sample data.  The distributions of refusal households by wave are
based on interviewer-reported observations.

REASONS GIVEN FOR REFUSING

      Information  is  also  available  on  why  a  household  refused  to
participate in the survey during Waves 1-6  (1984  Panel).   Only one reason
for refusing was coded per household even though multiple reasons may have
been given.   The major reasons for refusing to be  interviewed in Waves 1
and  2  were  similar.    (See Table  6.)    Mainly,  persons  just  "were not
interested in participating  in  the  survey."  This was reported  18.7 percent
of the  time  in  Wave  1 and 13.2 percent  of the time in Wave  2.   The next
most frequently  given reason was  "too busy"  (14.7 percent  in Wave 1 and
13.3  percent in Wave 2).   In  both  Waves  1  (9.9  percent)   and  2  (12.8
percent),   "invasion of  privacy"  was   the  third  reason  given   for  not
participating.   "Voluntary  survey" (9.3  percent) and  "questions  were too
personal" (9.1 percent) were next in Wave 1.   These two reasons were not as
important in Wave 2  (6.8  percent  and  3.2 percent respectively) as the fact
that  the respondent  had  only reluctantly  participated  in  Wave 1  (8.8
percent).  Also,  6.2 percent of the  people refused  in  Wave  2 because they
did not understand we would be returning.

      The main reason  for refusing to participate in Wave 3  changed from
Waves 1  and 2.   (See Table 7.)  The  major reason  cited in Wave 3 was that
"we  answered  the questions  in earlier  visits."   This accounted  for 24.1
percent of all reasons given in Wave 3.  "Too busy" (7.6 percent) and "just
not  interested in  participating"  (6.8 percent) were  cited  less frequently
than  in  Waves 1  and 2.   In Wave  3,  6.1  percent of the households who had
participated earlier refused  now  because they felt the questions  were too
personal.  This is a larger percentage than was reported for this reason in
Wave 2 (3.2 percent).  Another 6.1 percent who had reluctantly participated
earlier  were  lost in  Wave 3.  Also, almost  4  percent of the  households
indicated that the "interview is too long."

      In Waves 4, 5, and 6, the main reason for refusing continued to be
that  "they  had answered  the  questions in earlier waves."   (See Table 7.)
The other reasons were in  the  same  vein.   In  these waves,  more people were
becoming  angry  and  cited  "harassment"  as  their reason  for  refusing to
participate.  Also, people indicated they were "tired of all  the visits and
that the survey goes on  too  long."   They felt they should not have to keep
participating in the survey.

     TABLE 6.  WHY RESPONDENTS REFUSED TO PARTICIPATE IN WAVES 1 AND 2
Reason Given                                      Percent  of Households
                                                  Wave 1          Wave  2

Not interested in participating                    18.7            13.2
No time, too busy                                  14.7            13.3
Invasion of privacy                                 9.9            12.8
Voluntary survey                                    9.3             6.8
Offended by income questions, too personal           9.1             3.2
Didn't believe information was confidential          3.4
All other reasons (e.g., Angry with government,     34.9            35.7
   Illness, No reason)
                     Wave 1 Total                 100.0

Reluctantly agreed to participate in Wave 1,
  refused to participate in Wave 2                                  8.8
Refused in Wave 2, didn't understand we
  would be back                                                     6.2

                     Wave 2 Total                                 100.0
              TABLE 7.  WHY RESPONDENTS REFUSED TO PARTICIPATE
                            IN WAVES 3, 4, 5, AND 6

Reason                                          Percent of Households
                                            Wave 3  Wave 4  Wave 5  Wave 6

Answered in earlier waves                    24.1    28.7    29.5    25.4
No time, too busy                             7.6     6.4      -       -
Not interested                                6.8      -       -       -
Reluctant to participate earlier              6.1      -       -       -
Felt questions were too personal              6.1     4.4     3.9      -
Voluntary survey                              4.5     3.4     4.4      -
No change in household income                 4.4      -      3.2      -
Interview is too long                         3.9     3.8     3.2      -
Responding would cause family problems        3.0     3.7      -      3.1
Tired of being harassed, very angry            -      7.6     9.4     9.6
Tired of all the visits, survey too long       -      5.0     9.0    10.4
Confirmed refusal                              -       -      5.8    13.5
All other reasons                            33.5    37.0    31.6    38.0

Total                                       100.0   100.0   100.0   100.0

CONVERTING REFUSALS INTO INTERVIEWS

      Considerable  effort  is spent  trying  to convert  refusal  households
into interviews.   Usually,  we send a letter  from  the  regional  office  to  the
respondent asking  them to reconsider  participation.   We  also  attempt to
convert refusals  by making follow-up visits when the circumstances seem to
indicate that we  might be successful.  Table 8 shows the number of refusal
households that  received  follow-up visits in  the first  six waves   (1984
Panel).  In Wave 1, approximately 85 percent of  the  refusal  households  had
at least one follow-up  visit.   This drops to 55 percent in Wave 6 because
there  are  more  confirmed  refusals which  are  not  eligible  to  receive
follow-up  visits.     Once  a household  refuses  to  participate  for  two
consecutive  waves,  it becomes  a  confirmed  refusal,  and  no  additional
letters are sent  or visits made to  that household.  Table  8  also shows  the
number of households converted during follow-up.   Around 30 percent  of  the
refusal households  receiving follow-up  visits were converted.  This  number
remained stable during the  six waves.
      TABLE 8.  NUMBER OF REFUSAL HOUSEHOLDS THAT HAD FOLLOW-UP VISITS

                                                  Wave
                                  1       2       3       4       5       6

Total Households Eligible
 for Follow-up Visit*            878     495     677     699     559     564

No follow-up visit               134      97     245     243     216     252
  Confirmed refusal               69      56     166     132     157     172
  Other reason                    65      41      79     111      59      80

Follow-up visit reported         744     398     432     456     343     312
  % of eligible households      84.7    80.4    63.8    65.2    61.4    55.3

Households converted             254     131     120     133     104      88
  % of visits reported          34.1    32.9    27.8    29.2    30.3    28.2

*   The  following number  of households were  also eligible  for  follow-up
visits  but  are excluded  because  information on  the visits was  missing:
Wave  1-125;  Wave 2-60; Wave  3-226;  Wave 4-294;  Wave 5-357;  Wave  6-208.
Wave 2 only  had 3 rotation groups.
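
      The two percentage rows of Table 8 can be recomputed from the count
rows.  The short sketch below does so; the variable and function names are
illustrative, and the counts are those shown in the table for Waves 1
through 6.

```python
# Counts from Table 8, Waves 1 through 6.
eligible = [878, 495, 677, 699, 559, 564]    # eligible for a follow-up visit
visited = [744, 398, 432, 456, 343, 312]     # follow-up visit reported
converted = [254, 131, 120, 133, 104, 88]    # households converted


def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Share of eligible households that received at least one follow-up visit.
pct_visited = [pct(v, e) for v, e in zip(visited, eligible)]

# Share of visited households that were converted to an interview.
pct_converted = [pct(c, v) for c, v in zip(converted, visited)]
```

The first list falls from about 85 percent in Wave 1 to about 55 percent in
Wave 6, while the second stays close to 30 percent, matching the text.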


     The record  of follow-up  visits  is  shown  in Table  9.   In Wave  1,
29.0 percent of  the 744  households visited  were  converted  to an interview
during  the  first follow-up  visit.    Interviewers spent approximately  69
minutes trying to convert  the  household on  this  follow-up visit.   This
includes travel to and from the household.   Over  half of these households
were converted by a supervisory field representative  (SFR)  who  generally
has more experience and is considered  to  be  a  better  interviewer.   The same
interviewer  that encountered the refusal  was able  to  convert the interview
only 15.8  percent of the time.

      After the  first  follow-up visit in Wave  1,  528 households  had not
been converted.   Of those 528  households, 99 were visited  a second time.
Twenty-seven   percent  of the  households  visited  a  second  time  were
converted.   The field  staff  spent 82 minutes on the second follow-up visit.
Very few households were visited a third time.  Only three percent of the
households left to convert after the second visit were attempted a third
time, but most of those (11 of 15, or 73.3 percent) were converted.
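
      The Wave 1 sequence described above can be traced step by step.  The
sketch below uses the Wave 1 counts from Table 9; the variable names are
illustrative only.

```python
# Wave 1 counts from Table 9: households visited and converted at each visit.
first_visited, first_converted = 744, 216
second_visited, second_converted = 99, 27
third_visited, third_converted = 15, 11

# Households still unconverted after each round of visits.
left_after_first = first_visited - first_converted          # 528
left_after_second = left_after_first - second_converted     # 501

# Conversion rate at the second visit, in percent.
second_rate = 100.0 * second_converted / second_visited     # about 27.3

# Share of the remaining households that received a third visit, in percent.
share_third = 100.0 * third_visited / left_after_second     # about 3.0

# Total conversions across all three visits.
total_converted = first_converted + second_converted + third_converted  # 254
```

The total of 254 conversions agrees with the Wave 1 entry for households
converted in Table 8.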

      During the  other five waves,   at  least 24  percent  of  all  refusal
households visited were  converted  to an interview in  the  first  follow-up
visit.  The majority of the time, the SFR was the person  that  was  able  to
convert the refusal.   The same interviewer that encountered the  original
refusal was  the  next  most  successful in converting refusals.  Very few
households were visited a  third time.
                       TABLE 9.   RECORD OF FOLLOW-UP VISITS

Follow-up                       HH's Con-    % Con-
Visits               HH's        verted      verted

Wave 1
  First               744          216        29.0
  Second               99           27        27.3
  Third                15           11        73.3

Wave 2
  First               398          100        25.1
  Second               78           26        33.3
  Third                 6            5        83.3

Wave 3
  First               432          112        25.9
  Second               44            7        15.9
  Third                 1            1       100.0

Wave 4
  First               456          121        26.5
  Second               34           12        35.3
  Third                 -            -          -

Wave 5
  First               343           99        28.9
  Second               18            4        22.2
  Third                 1            1       100.0

Wave 6
  First               312           76        24.4
  Second               30            8        26.7
  Third                 4            4       100.0

                       Person Completing the
                        Follow-up Conversion
             Time
            Spent    RO          Same  Dif
            (Min.)  Staff  SFR   Int.  Int.
69.4    4.2
82.7   18.5
85.1
54.9
21.2
34.3
59.9
45.3
45.0
40.5
46.4
37.8
27.3
40.0
18.8
70.4
30.0
 4.0
 7.7
 5.4
14.3
 2.5
 8.3
 1.0
25.0
 2.6
       51.9
       33.3
       90.9
 66.0
 61.5
 60.0
 57.1
 71.4
100.0
 59.6
 50.0
       15.8
       18.5
14.0
15.2
14.3
 57.0   8.3
 41.7  25.0
 73.7   7.9
 50.0
 50.0  50.0
      12.0
      11.1
       9.1
6.0
3.8
      14.0
       8.3
11.1   6.1
                                Don't
                                Know
      16.2
      18.5
10.0
26.9
40.0
5.4   17.0
                   100.0
       7.9
      50.0
      18.2
      16.7
      22.2
      25.0
       7.9
     For the  households converted in Wave  1,  22.7 percent  were converted
because  a  new respondent agreed  to  be  interviewed (Table  10).   Getting a
different  household member to participate  was most successful  in  Wave 1.
In later waves,  other reasons increased  in  importance  relative  to getting
another  household member  to participate in the  survey  (see reasons listed
in Section II of  Table  10).   Some  examples  of  other reasons  that  were
important are:

      The respondent
      -  reconsidered after reading the letter sent by the Regional Office,
      -  was convinced of the benefits of the survey,
      -  related more to an experienced interviewer/SFR, or
      -  was contacted at a better time.

Also of  interest  is that by Wave  6,  15.4 percent of  the  interviews were
converted when  we  agreed  to  conduct  a  telephone interview  instead of a
personal  visit.
                    TABLE 10.   REASON  INTERVIEW CONVERTED
Wave
1 2
I. Why interview was given
Different household member 22.7 19.0
Other reason 77.3 81.0
II. Other Reason Given for Conversion
Convinced of survey's benefits
Read Regional Office letter
Experienced interviewer/SFR
Interviewer persistence
Convinced of confidentiality
Different interviewer
Answered only certain questions
Could not be reached earlier
Would only respond by phone
Convinced of legitimacy of survey
Encouraged by religious beliefs
Better, more convenient time
Translator used
Interviewed away from home
Convinced name, SSN not used
Proxy agreed to interview
One household member agreed, but
other members refused
Convinced this interview shorter
No reason given by respondent
Total
3

14.7
85.3

21.1
19.1
10.8
8.9
1.5
8.2
8.8
0.5
1.0
0.5
0.5
8.2
1.0
-
1.0
1.0

1.5
0.5
5.7
99.8
4

12.4
87.6

25.7
9.5
22.9
5.7
3.8
-
5.7
-
5.7
-
-
3.8
-
-
1.0
1.9

1.9
6.7
5.7
100.0
5

17.8
82.2

21.2
1.0
23.2
5.1
4.0
6.1
2.0
•
5.1

-
14.1
-
-
2.0
-

4.0
5.1
7_A
100.0
6

7.5
92.5

23.5
11.8
15.1
8.4
1.7
5.9
6.7
_
5.9
-
-
4.2
1.5
1.7
-
0.8

4.2
1.7
6.7
99.8





40.7
2.2
19.8
13.2
-
-
2.2
.
4.4
-
-
3.3
•.
-
-
-

3.3
8.8
2.2
100.1
GENERAL EFFORTS TO IMPROVE SIPP RESPONSE RATES

      Other efforts are  also  made  to maintain and improve the SIPP  response
rate.    First,  we  try  to  educate  the   interviewing  staff  regarding  the
importance of the survey  and the intended uses of the data.  We  believe that
if the  interviewers understand  the need for  the  U.S.  Government  to undertake
such  an  ambitious survey,  they can  convey  the survey's  importance to the
respondents.     Second,  we  try to  educate  the respondents  concerning the
importance  of  their  continued participation  in the survey.  The respondents
need to understand why the survey design requires them to be interviewed  every
4 months over a 2 1/2 year period.

      Various  papers  and  articles  that have  been published about the survey
are  used in  educating the  interviewers  and  respondents.    These include
newspaper articles,  articles in the Census Bureau's Data Users News, Public
Information Office releases, and papers written by Census Bureau staff and
outside researchers.   We  have also developed  a  special  four-page  brochure for
the respondents entitled  "SIPP DATA NEWS."  This  publication contains a  brief
summary  about  the survey,  explains  why  the  information collected  is so
important,   and  provides   interesting SIPP  data in  graphic  form  with
nontechnical narrative.  The DATA NEWS is  updated every 4 months  and  is  given
to  each  respondent at  the beginning of each new wave of interviewing.  In
February  1986,  interviewers  began  distributing a portfolio in  which the
respondents can store financial records and data for  the next interview to
make  it  more readily available.   This folder  contains a two-year reference
calendar to remind respondents about the longitudinal nature of SIPP and
copies of pamphlets such as "America's Fact Finder" and "USA Statistics in
Brief."  Each regional office also  can include other SIPP-related  materials in
the portfolio,  such  as a  personalized letter from the regional director and
copies of newspaper articles  that contain SIPP data.

      Our third  approach  involves  improvements in interviewer training.   For
example,  more  emphasis is placed on teaching the interviewers how to "sell"
themselves  and  the   survey.    We  tell the  interviewers  that  if  they are
convinced of the  importance of the survey,  their attitude will  be conveyed to
the respondent and will help to enlist cooperation.   A  checklist  of  important
things to do and remember  is  provided to each interviewer during their initial
training.   (See  Appendix  1.)    Also, noninterview workshops are  conducted at
the twice yearly interviewer training sessions to discuss the problem and what
interviewers are doing about  it.

      In  some  noninterview  workshops,  the  interviewers  conduct   practice
interviews  with  another  interviewer  playing  the  part of  a  reluctant
respondent.  They also watch  videotapes showing simulated interviews  with such
respondents.   In  addition, at some sessions we  have  had a SIPP data  user tell
the  interviewers  how  they  use the data.    We think  this  will  help the
interviewers explain  the  data  uses  to respondents which  may convince them to
participate.   Also,  the  interviewers are taught to determine the  specific
reason  for the  respondent's objection to participation  and  to tailor the
response  accordingly.   For  example,  if  the respondent  is  in a low-income
household  and  feels  that  they   do not  earn  enough  income  to  make any
difference, we  suggest  pointing out that the  survey  also covers programs that
they  may be participating in  such  as the  school lunch or energy assistance
programs.   One idea for gaining cooperation that has come out of such sessions
is  to have the interviewer  suggest conducting the  interview  on  a "trial
basis."   The respondent  is  told  something like:   "Could we  just try  a few
questions  so you can see  what the  survey is  like?  You don't have to answer
any questions you  consider to be too  personal."  Often  a reluctant respondent
will permit the interview  after  seeing that it really is not so bad after  all.
Other ideas are suggested  in Appendix 2.

      Our  fourth approach  focuses  on  offering  respondents  some  form of
compensation or tangible incentive for  participating.   We originally  proposed
a lottery  in  which respondents would  receive a ticket  each  time they  were
interviewed and  extra tickets  if they stayed in the  survey  until the  last
interview.  At the  end of the Panel, we planned  to hold a drawing with the
winners  receiving  prizes.    However,   the  Federal Codes that  govern  the
activities of  the Census Bureau  prohibit  the use of a lottery.

      As an alternative,  we suggested testing whether giving respondents  a
small gift  as a  form of appreciation would  help  to motivate respondents to
cooperate.  The  first  interview period of the  1987  Panel  (February-May 1987)
was chosen  for the  experiment because  Wave  1 of each panel has consistently
shown  the highest  rate of  new  noninterviews.    The  April  rotation group
(approximately  2,900  households  distributed  nationally) received  a small
hand-held,  solar-powered  calculator imprinted with the Census Bureau logo.
The other  three  rotations  from Wave 1  (February,  March, and  May)  did not
receive a gift and  served as  the  control group.   Rotations are  convenient to
use as  treatment and  control  groups since  by design  they contain a random
sample  of approximately  one-fourth  of  the  entire sample of a  panel.   In
addition, because survey operations and controls  are carried out by rotation,
it is most convenient operationally and least confusing  to implement.

     Comparisons  between Wave 1 noninterview rates  for  gift recipient  (April
rotation)  and nonrecipient (February,  March,  and May  rotations)  households
were made at  the national and regional  office levels at  the 10  percent level
of  significance.   Nationally,  the  noninterview rates  for  recipients  were
significantly  lower than for nonrecipients.   Although not significant  (except
for the  Charlotte  Regional  Office),  the rates for recipient households  were
lower for 9 of the  12  regional  offices,  while they were higher for 3 of the
regional  offices.    However,  compared  with the  projected  rate for April
households in  earlier panels,  the  difference is not significant.
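
      The paper does not state which test was used for these comparisons.
As one plausible sketch, a one-sided two-proportion z-test at the 10
percent level could be set up as below.  The household and noninterview
counts are hypothetical (only the approximate size of a rotation group
comes from the text), and the test treats the sample as simple random,
ignoring the clustering and weighting of the actual SIPP design.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, one_sided_p) for the alternative that group 1 has the
    lower proportion, using a pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # Standard normal lower-tail probability P(Z <= z).
    p_value = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: noninterviews among gift recipients (one rotation
# group) and among nonrecipients (three rotation groups).
z_stat, p_value = two_proportion_z_test(180, 2900, 640, 8700)
significant_at_10_percent = p_value < 0.10
```

With these made-up counts the recipient rate is lower and the difference
would be judged significant at the 10 percent level; a comparison for a
single regional office, with far fewer households, would often fail the
same test even when the rates differ by a similar amount.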

     We  also  asked  the interviewers  to  provide comments  on the  effectiveness
of  the  gifts   (8).   We received  comments  from 352  interviewers which  were
interesting and  enlightening.  The leading  comment was that the  calculators
were well-received  and the respondents seemed to like them.  Yet only 41 of
the interviewers believed that the gifts made  it  easier  to obtain respondent
cooperation.  More interviewers  (65)  claimed that obtaining cooperation is the
result of the  interviewer's skill  more than anything else.  We  think this  is  a
good indication of the level  of  confidence instilled in  our interviewers.

      The  noninterview rates must  be  followed  over  the future  waves of
interviewing  for the 1987 Panel  and  compared with previous panels to answer
the question  of  their  effectiveness.   Even  if statistical  differences do not
exist for all waves  examined  individually,  a  consistent  trend of  lower rates
of  increase in noninterview  rates from  wave  to  wave for recipient households
may indicate that a token  gift can reduce household nonresponse.
                                   13-16

-------
                                  IN  SUMMARY


      The SIPP is an ambitious  data collection  effort that attempts to measure
extremely  complex phenomena:   detailed  income and asset  sources,  program
participation, weekly labor force status,  health,  child  care,  and taxes.   As
in all surveys, the quality of the data is of major concern.  The conclusions
drawn from SIPP data are  affected by both sampling and  nonsampling errors.
This  paper examines one of the major sources  of nonsampling  error:   sample
loss through household nonresponse.

      We are  attempting  to measure  and understand  sample loss in SIPP.  The
main  cause of this loss is refusal  to participate.  The reasons given most
frequently for refusing the initial  interview are that the respondent just is
not interested or  is  too busy.  We are hoping  that a gift at the first visit
will overcome these feelings and persuade the respondent to participate.  The
initial  results of the gift experiment conducted during the first wave of the
1987 Panel are encouraging.  If the gift  succeeds  in increasing participation
in SIPP over the life of the panel, we may adopt the practice in future panels
as well  and hopefully reduce our nonresponse  rates.

     The work described in  this paper was  not funded by the U.S. Environmental
Protection Agency and, therefore,  the contents do not necessarily reflect the
views of the Agency, and no official  endorsement should be inferred.
                                   13-17

-------
                            APPENDIX 1


                       INTERVIEWER CHECKLIST



 1.     I  assume  I AM going to get each interview.

 2.     I   properly   identify   myself,   using  my   official   CENSUS
       Identification Card.

 3.     I  make  sure the respondent has had an opportunity to  read  the SIPP
       introductory letter.

 4.     I  provide the  respondent  with a copy of the SIPP fact sheet if  I
       feel  it will help convince a reluctant respondent to  participate.

 5.     I  make  sure  I  explain the Bureau's  policy on confidentiality,   if
       asked.

 6.     I  am well-informed  on the survey:   its  sponsorship,  purposes,
       benefits  to the respondent, etc.

 7.     I  am careful  to point out those  selling points  that would  appeal
       to a particular respondent;  e.g.,  to  learn  about the effects  of
       unemployment or the  kinds of  help  that are needed  to assist  a
       person  with low income.

 8.     I  maintain a professional,  businesslike attitude;  I  do not get
       angry or  discuss politics or government policy.

 9.     I  try  to remember  something about  each  respondent to  show  my
       interest  at subsequent interviews.

10.     I  am familiar  with  survey  forms and  procedures so that  I  can
       conduct an interview  as quickly and efficiently  as possible.

11.     I  probe  sufficiently  so that  whenever  necessary  I  can avoid
       calling a respondent  for  additional information.

12.     I  make  sure  the  respondent knows  I  will  do everything possible  to
       help them in  participating in the SIPP  Survey  (e.g., make  visits
       at times  convenient to respondent).
                              13-18

-------
                                 APPENDIX 2

                      TECHNIQUES  FOR  GAINING COOPERATION

      The following techniques  were suggested  by Census Bureau interviewers as
possible approaches for gaining cooperation in our  surveys.
 1.   Develop an opening line that you're comfortable with, for example, "Are
      you eating?" or "Did I  come at  a  bad time?"
 2.   Always  smile,  be  friendly,  be  positive,  be  enthusiastic,  expect  to
      succeed.
 3.   Deal with (answer) the person's questions but get them started with the
      interview.   Move into  it as  quickly  as possible.   Don't  let them
      sidetrack you from your goal of starting the  interview.
 4.   Defuse hostile comments by  using  neutral  responses,  such as  "Uh-huh," or
      "I understand."
 5.   Empathize with the person.   Apply the golden rule,  i.e.,  treat the
      person the way you would  like to  be treated.
 6.   Find something to admire  in the person's home or on  the property.
 7.   Maintain eye contact.
 8.   Hold your Identification  Card where it can be seen easily.
 9.   Don't let the respondent  intimidate you.
10.   Dress appropriately for the neighborhood.
11.   Know how the data you  collect are used and be prepared for questions.
12.   Never take negative comments personally.
13.   Stress  that we  will   bend over  backwards   to  meet the  needs  of the
      respondent.   For  example,  we'll  do the  interview at their  convenience,
      help out in any way we  can, talk  fast, etc.
14.   Use newspaper  items or other clippings  to illustrate  use of the data.
      Generate your own materials so  that they have a local flavor or  reflect
      recent news articles.
                                   13-19

-------
                                REFERENCES
1.   Moser, C.A.  and Kalton,  G.   Survey Methods  in Social  Investigation.
     Basic Books,  Inc.,  New York,  1972.  p.171,  and  Hoinville,  G.  and Jowell,
     R. et al.   Survey  Research Practice.   Heinemann Educational  Books,
     London,  1978.  p.6.

2.   Stephan,  F.  F. and McCarthy, P. J.   Sampling  Opinions:   An  Analysis  of
     Survey Procedure.   John  Wiley and Sons, Inc., New York, 1958. p. 261.

3.   Sheatsley,  P.B.    The  Harassed Respondent:   Interviewing  Practices.
     Paper presented at the  Market Research Council  Conference,  October  15,
     1965.

4.   Ibid  2,  p.  265.

5.   Durbin,  J. and Stuart,  A.  Differences  in Response Rates  of  Experienced
     and  Inexperienced  Interviewers.    Journal of the  Royal  Statistical
     Society.  A,  114,  1951.  pp.163-205.

6.   Nelson, D., McMillen,  D., Kasprzyk, D.   An Overview of  the  Survey  of
     Income and Program Participation:   Update 1.   SIPP Working Paper Series
     No. 8401,  U.S.  Bureau of the  Census, Washington, D.C., 1985.

7.   Nelson, D.,  Bowie, C.,  and  Walker,  A.   Survey of Income  and Program
     Participation  (SIPP)  Sample Loss  and  Efforts  to  Reduce  It.   In:
     Proceedings  of  the  Bureau  of  the  Census  Third  Annual   Research
     Conference.   Baltimore,  Maryland, 1987.  pp.629-643.

8.   Census Bureau  memorandum from D.  Jackson  to  C. Bowie,  "Interviewer's
     Evaluation of the  Gift  Experiment," August 5, 1987.
                                  13-20

-------
                  PRINCIPLES OF QUESTIONNAIRE DESIGN  AND
                         METHODS OF ADMINISTRATION

                        by:   Wendy Visscher
                             Roy W. Whitmore
                             Research Triangle Institute
                             Research Triangle Park,  NC  27709

                             Mel Kollander
                             F. Cecil Brenner
                             U. S. Environmental  Protection Agency
                             Washington, D.C.  20460
                                 ABSTRACT


      This paper discusses the basic process of questionnaire development:
(1)  determining the  data requirements  of the  survey;  (2) framing  the
questions and formatting  the  questionnaire; and (3) iteratively testing and
revising the  questionnaire.   Methods of questionnaire  administration are
discussed  in  the context  of their  effect  on questionnaire  development.
Guidance  is  provided for  formatting individual  questions and  the  total
questionnaire  so  that  the  data  collection  instrument  measures  the
attributes of interest as  accurately  as possible.

      This  paper  has   been  reviewed   in   accordance  with   the   U.S.
Environmental  Protection  Agency's peer and  administrative  review policies
and approved  for presentation  and publication.
                                   14-1

-------
      PRINCIPLES OF QUESTIONNAIRE DESIGN AND METHODS OF ADMINISTRATION


INTRODUCTION

      The design  of  a valid  questionnaire  is a  scientific  endeavor and
often a  complex  task.   Survey  specialists  trained  in  the principles of
questionnaire design  should be  consulted  during the development  of the
questionnaire.  A questionnaire is  a measurement instrument analogous to
other scientific  tools,  such as  air monitoring devices.   Great  care must be
taken in  constructing the  questionnaire,  and  it must  be comparable in
validity and  reliability to precise scientific  instruments.   This paper
provides a   brief   introduction  to  the  process  and  principles  of
questionnaire  development.

      To  construct  an  appropriate  questionnaire,   researchers  must:
1) determine the specific data output requirements of the  study; 2) frame
and  organize  the  questions;   and  3)  iteratively test  and  revise  the
questionnaire.  The  individual  questions must be  carefully worded and the
questionnaire organized  in  a meaningful format.  The  study sponsor must
decide,  based  on  specific needs  and resources, how the questionnaire should
be  administered   to  the  respondents:    in   face-to-face   interviews,  by
telephone,  by  mail, or some combination thereof.

      These issues and others  associated with  planning,  designing,  and
managing the  conduct of a survey are discussed  in  the Survey Management
Handbook published by the U.S. EPA.

DEFINING STUDY OBJECTIVES AND FRAMING THE QUESTIONS

      The  first   step in  the  process  of  questionnaire  development  is
requesting the study sponsor  to define  the  specific information needs of
the  study and the ultimate uses of  the  information.   The sponsor should
elaborate  by  translating  these needs  into   specific  data requirements.
These statements  of  purpose  and data requirements constitute the primary
contents of the  analysis  plan.

      A thorough  background  search  of what  has  been  done  before and what
types of questions have worked best in previous studies  should  be  conducted
(3).  Survey  specialists should investigate  the  study topic to  ensure that
all  relevant  data are collected and to  decide  which questions should be
asked and how they should be  framed.  The questionnaire  can then be built
around the specific needs of the study.

      In deciding how  to frame  a   question,  survey  researchers  should
investigate  the  cognitive  processes  that   a respondent  must typically
perform to answer the  question  (13).  The cognitive  processes  involved in
answering  a  question  include  the  respondent's understanding  of  the
question,  recall  of   information, and  formulation of a  response.   These
processes  are often   studied  in controlled  settings.   The understanding
gained  about the respondent's thought  processes  as  they   respond  to


                                   14-2

-------
questions can  be used to  draft questions that  are easier  to  answer and
result in more  accurate data.

FORM OF THE  QUESTIONNAIRE

      The questionnaire  should be formatted  so  that it  can be completed
reliably and with minimal  burden.   The language should be  friendly, and the
questions  should  flow  naturally.    The  order   of the  questions  is  an
important consideration  because the meaning of a  question  is  often  affected
by  the  questions that precede it (1).    Items  about similar  topics are
particularly likely  to  influence each other  (5).   Finally,  the language,
format, and flow of  the questionnaire must effectively create rapport with
the  respondent to ensure  a successfully  completed interview.   For this
reason, difficult  or sensitive questions  should  be delayed  until  late in
the interview after rapport has been established (12).

      A  well-designed  interview for  environmental  studies  generally
consists  of four  parts:    an  appropriate introduction,  a  few "warm-up"
questions, the body  of the  questionnaire,  and, if appropriate,  demographic
questions (3).   The  introduction should clearly state who is  sponsoring or
conducting  the study,  the study  objectives, and the importance of the
respondent's participation.   An informed consent procedure may  also be
included as part of  the introduction in which the  risks,  benefits, and the
voluntary nature of the  study  are  explained.

      The initial questions should consist  of  a few simple, non-threatening
items which  build interest and  rapport  and  orient  the respondent to the
topic before the main study questions  begin.   These initial   questions can
influence whether or not  the  respondent  decides  to  complete  the remainder
of  the  questions (6).   Sometimes, these  initial  questions  are used for
screening to determine whether  the person  is  eligible  for the study.  For
example,  in  a  study  of  residential radon  exposure,  a  sponsor may only be
interested in  owner-occupied  dwellings,  and  the  screening questions  would
be  used to determine if the potential respondent  is  the owner of the  house
before proceeding with the interview.

      The body  of  the questionnaire  contains  the  critical study questions
and  is used to obtain the bulk of the study data.   When  possible, questions
which are of greatest interest to respondents should be  asked early (4).
Finally,  a  portion of the questionnaire  can  be  used to obtain  demographic
characteristics  (e.g.,  age, sex, education), when appropriate, to enable
the  researcher  to  describe  the study population.    Transitions  between the
sections of  a  questionnaire should be smooth  and  unobtrusive.   However, if
an  abrupt change in question  types  occurs (e.g.,  household-level versus
personal), the new section should  be  introduced with  a short explanation.

LENGTH OF THE QUESTIONNAIRE AND QUESTIONS

      The  survey specialist  should  give considerable  thought  to   which
variables are really important to avoid  including  nonessential items  on the
questionnaire  (12).  The  length of the questionnaire and  of  the individual
questions depends on the  interview mode and  the difficulty  of the   study

                                   14-3

-------
topic (3).  Very  lengthy questionnaires are generally  very difficult for
both the interviewer and the respondent and can lead to respondent fatigue
and incomplete interviews,  resulting in poor data quality.

      Short questions  are  generally  preferred over  longer  ones.    If a
respondent is  forced to listen to or read a very long question, especially
concerning a  subject about which  he  knows very  little,  he  is  likely to
become uninterested.   Long questions which require excessive thought or
are too difficult may result in unreasonable respondent burden (1).   Short
questions  are particularly effective  when  the   interviewer reads  them
slowly.    This  allows the respondent time  to think about his responses but
doesn't  overburden  him with  too  much  information.     However,   longer
questions  may  be  more effective  for  memory questions  because additional
clues can be provided to stimulate  recall  (11).  Another approach is to ask
a series of lead-in questions to provide these memory clues.   This reduces
the need  for  longer questions,  often with  only  slight  increases  in the
overall  length of  the questionnaire.

TYPES OF QUESTIONS

      A  number  of  different  kinds  of  questions  can  be  used   in  a
questionnaire,  including questions  of  fact,  opinion, attitude,  information,
and self-perception  (3).   Different types  of questions are appropriate in
different  studies,  and  questions of fact  are  probably  the most useful for
environmental  studies.   Questions of fact  are  used  to  obtain information
about  the  respondents  themselves  or  about  measurable  or  observable
phenomena  about  which  the respondents  have  knowledge.     Examples  of
questions  of  fact might include those about  the respondent's smoking habits,
activity patterns, ventilation  practices,  use  of pesticides, or whether his
house has a basement or a crawl  space.

OPEN-ENDED VERSUS  CLOSED-ENDED  QUESTIONS

      The  types of  questions used on questionnaires can be  dichotomized as
closed-ended and open-ended.  Closed-ended  questions are designed to elicit
specific  responses by providing a  list  of all possible  responses  to the
question  and  by having  the respondent choose a response from this list.
Open-ended questions,   on  the other  hand,  are not  restrictive,  and the
respondent is allowed to answer freely.  The  survey specialist must decide
which type of question  best fits the needs of the study.

      Closed-ended questions are preferred  for  categorical variables.  They
are very easy and fast to administer.   Since they  are often  precoded on the
questionnaire (e.g., the interviewer  circles  the  code  associated with the
response),  data  processing  is  also  fast  and efficient.    Closed-ended
questions  are  generally dichotomous (e.g., yes or no)  or multiple choice
with several fixed  alternatives  (e.g.,  "Which  of  the following do you use
in your home:   window fan,  ceiling  fan,  circulating fan?").  The categories
used  for  closed-ended  questions   should  be  appropriate  for the  study
population.   In fact,  the major disadvantage of closed-ended  questions is
that respondents are forced to classify themselves  into  categories that are
based on the researcher's frame of  reference  (5).   Any elaboration that the

                                   14-4

-------
respondent may  choose  to  give  is  essentially lost  in  a closed-ended
question.

      Open-ended questions  are sometimes  appropriate  for  quantitative  items
to be analyzed  as continuous  variables and  for  qualitative  items  for  which
it  is  not  practical   to  list  all  possible  responses.     Examples  of
appropriate  quantitative items are  age  of residence  and length of  daily
commute.   Open-ended questions  are often  used for qualitative items in
"exploratory" studies in which the researcher  has  very  limited  knowledge of
the subject  and doesn't know what responses  to expect.   Since open-ended
questions require that the interviewer record the responses verbatim, they
are time-consuming to administer,  and a limited number of questions can be
asked of the respondents  before  they  tire.    Open-ended  questions also
require  more data processing time,  since  responses must  be  grouped for
analysis and codes  must be assigned.  This post-interview categorization
invariably leads to  a loss  of specificity.
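
      The post-interview coding step can be sketched as a small keyword
lookup.  This is an illustration only: the code frame, the category names,
and the sample responses are all hypothetical, and in practice coding is
usually done by trained coders working from a documented code frame rather
than by simple keyword matching.

```python
from collections import Counter

# Hypothetical code frame: keyword fragment -> category code.
CODE_FRAME = {
    "fan": "VENTILATION",
    "window": "VENTILATION",
    "smok": "SMOKING",
    "pesticide": "PESTICIDE",
    "spray": "PESTICIDE",
}


def code_response(verbatim):
    """Assign the first matching category code, or OTHER if none matches."""
    text = verbatim.lower()
    for keyword, code in CODE_FRAME.items():
        if keyword in text:
            return code
    return "OTHER"


# Hypothetical verbatim responses recorded by an interviewer.
responses = [
    "I open the windows every morning",
    "My husband smokes in the garage",
    "We spray for ants twice a year",
    "Nothing in particular",
]

coded = [code_response(r) for r in responses]
tally = Counter(coded)
print(coded)
print(tally)
```

      Once the responses are reduced to codes, details such as "every
morning" or "twice a year" are no longer available to the analyst, which is
the loss of specificity noted above.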

      Open-ended  questions  allow  a researcher to  learn more  about the
respondent's  opinions,  feelings,  and  reasons  for  giving   a specific
response.    However,  because  of  this freedom, open-ended  questions may
elicit  irrelevant information,  and there is generally wide variability in
the amount of detail  provided by different respondents  (12).

      Open questions can be  used  in pretest questionnaires, allowing more
meaningful  categories to be developed for use with closed questions on the
final  questionnaire  (5).   Each  type  of question  has  advantages and
disadvantages,  and there are  situations to  which  each  is particularly well
suited.   These are summarized in Figure 1.

DESIGNING QUESTIONS

      Simplicity  is  almost   always  preferred  in  a  good  questionnaire.
Survey  specialists   must always  guard  against making  the  questions too
complicated  (1).  Each question should be specific and  communicate the same
meaning to all respondents  (6).

      CLOSED-ENDED QUESTIONS                  OPEN-ENDED QUESTIONS

1.  Fixed response alternatives           1.  Free response
2.  Categorical variables                 2.  Continuous variables
3.  Fast, easy administration             3.  Administration slowed by
                                              recording of verbatim
                                              responses
4.  Pre-interview coding                  4.  Post-interview coding
5.  Forces respondent to classify         5.  Allows respondent to form
    him/herself into preset category          a more appropriate category
6.  May result in incomplete data         6.  Respondents can assist in
    if all potential responses                creation of response list
    are not known                             for future studies

       Figure 1.   Advantages  of  Closed-ended vs. Open-ended Questions

                                    14-5

-------
      Colloquial  English should be  used  whenever possible.   Whether the
respondent  understands  the  question is  more  important than  whether the
question  is grammatically correct.   The use of  slang  should be avoided,
however,  because  different  respondents  may  have different  reactions to or
interpretations of this kind of language.

      Respondents  may give  incorrect answers to  questions  which  include
words  that  have no  meaning  for them.   The survey  specialist strives to
choose  words  that  have the same meaning to all respondents,  regardless of
educational experiences or cultural  background.  One  should not assume  that
respondents who  do  not know  a word  will  respond in  a  specific  way.
Questions should be  neutral  and  should  not  impart  the researcher's  biases.
Neutrality  is violated  if the question is  written in  such a way that the
respondent  is influenced to choose a particular  answer.   Respondents  can be
influenced  in this manner if they are offered  unfair  alternatives  or if one
response  is made  to  seem less desirable than  another response, perhaps by
the  use  of  emotionally  charged  words  or  stereotypes  with  negative
connotations.

      Ambiguity or vagueness must be avoided.  If the  respondents do not
understand  a question, their responses are not likely to be  accurate.   This
lack of understanding may be the result  of an  incomplete  question  (e.g.,
the question  is not  self-explanatory  and  not  enough  information is  given);
a  question  that is indefinite in time (the respondent may not  know what the
researcher  means  by  "frequently,"  "often," or "usually");  or a question
that includes indefinite comparisons or imprecise  categories  (e.g.,  the use
of "below  average,"  "average,"  and "above  average")   (8).    The  survey
specialist  should avoid using more than  one  subject in a  question, since
the  respondent may   have  different  thoughts  about  each  (e.g.,   "Do you
consider  residential  exposure  to  radon  and  carbon  monoxide  a   health
hazard?").   A  respondent  may also  be  confused by  double  negatives  in a
question  (e.g.,  "Do you  agree  or disagree  that radon  is  not  a  health
hazard?") (3).
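
      The wording problems listed above can be screened for, very roughly,
with a few text checks on draft questions.  The sketch below is a crude
heuristic, not an established tool: the function name, the word lists (taken
from the examples in this section), and the flag labels are assumptions, and
a flagged question still needs review by a survey specialist.

```python
import re

# Word lists drawn from the examples in the text; both are illustrative only.
VAGUE_TIME_WORDS = {"frequently", "often", "usually"}
NEGATION_WORDS = {"not", "no", "never", "disagree"}


def flag_question(question):
    """Return a list of possible wording problems in a draft question."""
    tokens = re.findall(r"[a-z']+", question.lower())
    flags = []
    if VAGUE_TIME_WORDS & set(tokens):
        flags.append("indefinite time reference")
    if "and" in tokens:
        flags.append("possible double subject")
    if sum(1 for t in tokens if t in NEGATION_WORDS) >= 2:
        flags.append("possible double negative")
    return flags


drafts = [
    "Do you consider residential exposure to radon and carbon monoxide "
    "a health hazard?",
    "Do you agree or disagree that radon is not a health hazard?",
    "Do you usually open your windows?",
    "Does your house have a basement?",
]
for q in drafts:
    print(flag_question(q), "-", q)
```

      Checks like these will miss many problems and will flag some
acceptable questions, so they supplement a pretest rather than replace it.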

      The  sponsor  is invariably  much more  interested in and  knowledgeable
about  the  survey  subject than  the  respondents  and may tend to  prepare
questions that the  respondent  does not understand.  Survey sponsors are
specialists educated  in  their field and tend to use  long words  and  jargon.
This elevated language  should be avoided  (e.g.,  use  the  word "feeling,"
not  "intuitive,"  and "clear,"  not  "intelligible").    The  fact  that the
questions make sense and are  important to  the sponsor  does not  mean  they
will  be understood  by the  respondent.   A  survey  specialist is  needed to
translate the sponsor's  data needs into questions  that will  reliably obtain
the desired information.

PRETESTING  A QUESTIONNAIRE

      A careful pretest  of a draft questionnaire is an integral  part of the
questionnaire design process.   The pretest  should   be  conducted among
persons who are as similar as possible to  the  target population  subjects.
A  small  pretest  consisting of  12-25 interviews   should reveal  the major

                                    14-6

-------
weaknesses of the questionnaire  (12).  The researcher should know after  the
pretest has been  completed  whether the questions had the  same  meaning  to
all respondents and  if the difficulty level and length of the questionnaire
are appropriate.  Focus groups can also be a useful  part of the  evaluation
of questionnaires and survey procedures.   Revisions  can then  be made  to
produce a final questionnaire  that will  produce relevant  and accurate data.

COMPARISON OF QUESTIONNAIRE ADMINISTRATION METHODS

      There  are  three  modes  of administration  for   a  questionnaire:
face-to-face   interviews,   telephone   interviews,   and   self-administered
interviews.    Many  studies  use  a  mixed-mode approach  (e.g.,   screening
potential  respondents by phone,  then  doing face-to-face  interviews with
eligible persons).  The specific needs  of  the  study will  dictate  which mode
is most  appropriate.    The  relative advantages  and disadvantages of  the
methods are discussed  below.

      Face-to-face interviewing  is often efficient for environmental  field
studies  because  the  environmental  measurements  require face-to-face
contacts.  Interviewers are  recruited and  trained  to establish rapport with
respondents   and complete  interviews  in  a  standardized  fashion.
Face-to-face  interviews generally  result in  higher  response  rates than
other  modes  of  interviewing,    and  they  are more  flexible  than mail
questionnaires because  the   interviewer  can  explain questions that  the
respondent  would not otherwise  understand.    The  interviewer also  has
control  over the sequence of questions and selection  of  the respondent.
However,  the  reading of the questions  and  their  explanations must  be
consistent across interviewers.

      Telephone interviews can be less expensive
than face-to-face interviews  and often  result  in higher response  rates than
mail surveys.  Telephone interviews can be efficient and  cost-effective  for
surveys that must cover  large geographic  areas or be  completed  in  a  short
period  of time,  and, like  face-to-face interviews, they are  also more
flexible  than  mail   questionnaires.     Another  advantage  of   telephone
interviews is  that the interviewers typically  conduct their  interviews from
a  central location.    This allows  close supervision of the  interviewers  and
a  greater capacity for  standardization  and  quality  control.    Telephone
interviewing also has  certain disadvantages.   The  interviewer may have more
difficulty establishing  rapport  with respondents  and  recruiting them into
the study by phone than  face-to-face because the respondents may doubt  the
legitimacy  of  the  telephone call.    Also,   a  telephone  interviewer  has
difficulty judging whether the respondent  really  understands the  questions,
and the respondent can stop the  interview at any time simply by  hanging up
the  phone.     Finally,  telephone  interviewing  does  not   cover  the
approximately  7% of U.S. households that do  not have  telephones.   Since
this  segment  of  the  population  is   generally poor   or ethnic,  this
noncoverage can bias the  results  of a study (4).
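
      The size of this noncoverage bias can be worked out with the standard
relation: the bias of the covered-population mean equals the noncovered
share times the difference between the covered and noncovered means.  In the
sketch below, only the 7% noncoverage figure comes from the text.  The two
group means are hypothetical, and the function name is an assumption.

```python
def noncoverage_bias(noncovered_share, covered_mean, noncovered_mean):
    """Bias of the covered-population mean relative to the full population."""
    return noncovered_share * (covered_mean - noncovered_mean)


noncovered_share = 0.07    # households without telephones (from the text)
covered_mean = 0.30        # hypothetical proportion among telephone households
noncovered_mean = 0.50     # hypothetical proportion among nontelephone households

# Full-population mean is the share-weighted average of the two groups.
full_mean = ((1 - noncovered_share) * covered_mean
             + noncovered_share * noncovered_mean)
bias = noncoverage_bias(noncovered_share, covered_mean, noncovered_mean)
print(f"full-population mean = {full_mean:.3f}, bias = {bias:+.3f}")
```

      The bias is small when the noncovered group is small or resembles the
covered group, and it grows as the two groups differ more on the
characteristic being measured.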

      A  mail  survey   can   be  less  expensive  than  either  face-to-face
interviews or  a telephone survey  because postage  is  less expensive  than an
interviewer's  salary,  resulting  in a smaller cost  per  case.  This  allows
the sponsor to increase the  sample size  of the  study without increasing  the
cost above  that of other modes.  A mail  survey may be preferred  if  the

                                   14-7

-------
respondent needs time  to  obtain the information requested.   A respondent
may be more  likely  to  answer sensitive questions  on  a  mail  questionnaire
than during a telephone or  face-to-face interview (12).

      The major  disadvantage of  mail  questionnaires  is  that  they  often
result in poor response rates, which can  invalidate population  inferences.
Multiple mailings  plus telephone or  face-to-face follow-ups  are  usually
needed to get acceptable response rates.   Even  so, typical  response rates
are 40% for a first  mailing and 60% after three mailings  (4).
Nonrespondents  are  often  less  educated or less  interested in  the  study
subject  than  respondents  or  may  differ in other  ways  that can  bias  the
results of the study.
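
      The response rates quoted above translate directly into the number of
questionnaires that must be mailed to reach a target number of completed
returns.  The sketch below uses the 40% and 60% rates from the text.  The
target of 600 completed questionnaires and the function name are
hypothetical.

```python
def mailouts_needed(target_completes, response_rate_pct):
    """Smallest mail-out size expected to yield the target number of returns."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (target_completes * 100 + response_rate_pct - 1) // response_rate_pct


target = 600  # hypothetical number of completed questionnaires wanted

one_mailing = mailouts_needed(target, 40)     # 40% after a first mailing
three_mailings = mailouts_needed(target, 60)  # 60% after three mailings
print(one_mailing, three_mailings)
```

      A larger mail-out raises the number of returns but does nothing to
reduce the bias that arises when nonrespondents differ from respondents.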

      However, if a mail  survey is  well-designed  and extensive follow-up
procedures are  followed,  response rates nearly as high as  those obtained
with  face-to-face  and  telephone  interviews are  possible  (12).   Dillman
describes a  "Total  Design  Method" which  has resulted  in response rates of
60% to over  90% in mail surveys  of  heterogeneous populations  (10).   The
Total  Design  Method  incorporates   careful  study design  and  intensive
follow-up with specific techniques which  tend  to  maximize  response (e.g.,
specially-formatted  questionnaires,   multiple mailings,   telephone  and
face-to-face follow-up  of nonrespondents,  and  various  forms of  personalized
communication).    These  specialized  procedures  and  intensive follow-up
efforts,   although  very  effective  at  increasing  response  rates,   also
increase the cost of the study.

      A  self-administered  questionnaire must be simple and easy-to-follow,
since the  respondent will  not have the assistance of an  interviewer when
completing it.   However,   this inflexibility also prevents the  interviewer
from  introducing   bias  by  selectively  probing certain  questions  or
respondents.   A mail survey is not appropriate if  spontaneous responses are
desired  or if the  sponsor  wants  to  make sure the respondent completes the
questionnaire  without  the assistance  of  other household  members.    The
success  of a mail  survey  is  also dependent on the literacy  level  of the
study population (9).

      In  summary,   the appropriate mode  of questionnaire  administration
depends  on the nature  of the  study and the  resources  (both time and money)
available to the sponsor.   The  choice of administration mode is a complex
decision and a mixed-mode  approach is often appropriate.

SUMMARY

      A  well-designed  and  fully  tested  questionnaire  is  an  essential  part
of  a  successful  survey.   After a careful  background search,  the survey
specialist  should  know what information  will  be useful  and  should  be
collected.    The sponsor  must  decide,  based  on  cost and  the  type  and
complexity  of the  data to  be  obtained,  whether to use a  face-to-face
interview, a telephone  interview,  or a mail survey.   A draft questionnaire
can then  be  developed  using precoded  closed questions or carefully worded
open  questions.    Special  care  must be  taken during  the  design  of  the
questionnaire to ensure that  the respondent will understand the questions,

                                   14-8

-------
that the  language used is  non-threatening and unambiguous,  and that the
questionnaire is  not  too  long or difficult.  A pretest  should be done to
further  refine  the  questions  before  the  study  begins.    The  final
questionnaire that  is used for  the  actual collection of  the data should
produce the most  accurate  and  useful data possible.


                                 REFERENCES

1.    Converse,  J.M.,  Presser, S.:  Survey  Questions  (Sage  Publishing,  Inc.,
      Beverly Hills,  CA,  1986).

2.    Wallace,  L.A.:  The  Total  Exposure Assessment  Methodology   (TEAM
      Study): Summary and Analysis: Volume I.    (Office  of Research and
      Development, U.S. Environmental  Protection  Agency, Washington,  D.C.,
      1987).

3.    Backstrom,   C.H.,   Hursh,  G.D.:    Survey Research  (Northwestern
      University Press, Chicago,  IL,  1963).

4.    Kelsey, J.L., Thompson,  W.O.,  Evans, A.S.: Methods  in  Observational
      Epidemiology (Oxford University Press, New York, NY, 1986).

5.    Schuman, H.,  Presser,  S.:  Questions and Answers in  Attitude Surveys
      (Academic  Press, New York,  NY,  1981).

6.    Berdie, D.R., Anderson, J.F., Niebuhr, M.A.: Questionnaires:  Design and
      Use, Second Edition (The  Scarecrow Press, Metuchen, NJ, 1986).

7.    Belson, W.A.: The Design and Understanding  of  Survey  Questions  (Gower
      Publishing Co.  Ltd., Aldershot, Hants,  England, 1981).

8.    Bradburn,   N.M.,   Sudman,   S.:   Improving  Interview Method  and
      Questionnaire  Design  (Jossey-Bass  Publishers,   Washington,  D.C.,
      1979).

9.    Fowler, F.J.:  Survey Research  Methods (Sage  Publications, Beverly
      Hills, CA,  1984).

10.   Dillman, D.A.:  Mail  and Telephone  Surveys:  The Total  Design Method.
      (John Wiley, New York, NY,  1978).

11.   Wright, T.:  Statistical  Methods  and  the Improvement of  Data Quality
      (Academic  Press, Inc., New  York, NY,  1983).

12.   Rossi,  P.H., Wright,  J.D.,  Anderson,  A.B.,  editors:  Handbook of
      Survey Research  (Academic Press,  Inc., New York, NY, 1983).

13.   Jabine, T.B.,   Straf,  M.,  Tanur,  J.M.,  Tourangeau,  R.:  Cognitive
      Aspects of Survey Methodology:   Building a  Bridge  Between Disciplines
      (National  Academy Press, Washington,  D.C., 1984).


                                   14-9

-------
14.    U.S.  Environmental  Protection Agency,  Survey Management Handbook,
      Office  of Policy,  Planning and Evaluation, Washington,  D.C.,  1983.
                                   14-10

-------
   ESTIMATION OF MICROENVIRONMENT CONCENTRATION DISTRIBUTION
               USING INTEGRATED EXPOSURE MEASUREMENTS

                         by:     Naihua Duan
                                RAND Corporation & UCLA
                                  ABSTRACT
      Several methods are proposed to estimate the distributions of pollutant concentrations
in microenvironments using integrated exposure measurements. For the mean concentration in
each environment, we propose a regression estimate based on a linear model with the intercept
suppressed.  For the variances and higher cumulants, we propose several regression estimates
based  on regressing powers of residuals from the first linear model. We also discuss a general
deconvolution approach which can be used to estimate the entire distribution.
                                      15-1

-------
1. Introduction
    In applying the  indirect approach to assess human exposure to air pollution, it is
necessary to first estimate the distribution of microenvironment concentrations and the
distribution of activity pattern. We can then combine the two estimated distributions to
estimate the distribution of the integrated exposure. We assume that we have conducted
an activity pattern survey from which we can estimate the distribution of activity patterns.
We are then left with the task of estimating the distribution of microenvironment concen-
trations.
    For some pollutants such as CO, there are reliable personal monitors which give con-
tinuous measurements of the pollutant concentration. If we conduct personal monitoring
using such continuous personal monitors and record activity patterns at the same time, we
can determine the microenvironment concentrations and estimate the distribution for the
microenvironment concentrations. This was done, for example, in the Washington-Denver
CO studies.
    For many pollutants such as NO2 and VOC, we don't have reliable continuous personal
monitors, and have to rely on integrated exposure measurements. There is no obvious
way to determine the microenvironment concentrations from integrated exposure measure-
ments. In this paper, we describe a method which can be used to estimate the distribution
of microenvironment concentrations from integrated exposure measurements and activity
patterns. We can then use the estimated microenvironment concentration distribution in
the indirect approach and estimate the distribution of integrated exposures  for another
sample of human subjects for whom we have an activity pattern survey.
    In the  rest of this section we review the indirect approach and methods that can be
used to estimate the distribution of integrated exposures. In Section 2 we discuss regression
methods  which can  be used to estimate the means, the variances, and the  covariances
for microenvironment concentrations.  In Section 3 we sketch a method to estimate the
microenvironment concentration distribution using characteristic functions. In  Section 4
we discuss the limitations of the modelling assumptions used in this paper and propose
several directions for future research.

1.1.  Review of the Indirect Approach
    We assume for now that our main interest is in an individual's integrated exposure1.
  1  It is possible to extend the indirect approach and the microenvironment decomposi-
tion (1.1) to deal with other exposure measures such as the peak concentration encountered
during a given time period.

                                      15-2

-------
 The indirect approach is based on the following microenvironment decomposition of the
 integrated exposure:
                        Y = C'T = \sum_{k=1}^{K} C_k T_k,                     (1.1)
 where Y denotes an individual's integrated exposure to a given airborne pollutant during
 a given time period, say, a 24-hour period. C_k denotes the average pollutant concentration
 encountered by this individual during this time period while he is in the k-th
 microenvironment; we will refer to the vector C' = (C_1, ..., C_K) as the microenvironment
 concentrations. T_k denotes the amount of time this individual spent in the k-th
 microenvironment during this time period; we will refer to the vector T' = (T_1, ..., T_K)
 as the activity pattern. Both the microenvironment concentrations and the activity pattern
 can vary from individual to individual, and from time period to time period. We will refer
 to each combination of an individual and a time period as a sampling unit.
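     As a small illustration of the decomposition (1.1), the following Python sketch
 computes the integrated exposure for a single sampling unit. The microenvironment names,
 concentrations, and times are hypothetical values chosen for illustration; they are not
 data from any study.

```python
# Hypothetical sampling unit: one individual over one 24-hour period.
# Concentrations C_k are in ppm; times T_k are in hours.
concentrations = {"home": 2.0, "office": 1.5, "in_transit": 8.0}   # C_k
hours_spent = {"home": 14.0, "office": 8.0, "in_transit": 2.0}     # T_k


def integrated_exposure(c, t):
    """Return Y = sum over k of C_k * T_k, as in (1.1)."""
    if c.keys() != t.keys():
        raise ValueError("C and T must cover the same microenvironments")
    return sum(c[k] * t[k] for k in c)


Y = integrated_exposure(concentrations, hours_spent)    # ppm-hours
average_concentration = Y / sum(hours_spent.values())   # ppm over the whole period
```

 Dividing Y by the total time gives the time-weighted average concentration for the
 period, which is sometimes the more convenient way to report an integrated measurement.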
     We assume in the microenvironment decomposition that the totality of all possible
 locales and activities that the individual can engage in has been stratified into K microen-
 vironments.2 Duan (1980) developed a criterion for identifying the stratification scheme
 which can be used to improve the precision of the estimated average exposure.  The cri-
 terion was applied in Duan  (1985)  to identify the important microenvironments for CO
 exposures.
     In order to apply the indirect approach, we need to first estimate the microenvironment
 concentration distribution

                 F_C(c) = P(C_1 < c_1, ..., C_K < c_K)                        (1.2)

 and the activity pattern distribution

                 F_T(t) = P(T_1 < t_1, ..., T_K < t_K).                       (1.3)

 If we have an enhanced personal monitoring study, such as the Washington-Denver CO
 studies, which provides continuous measurements of concentrations and also records
 activity patterns, we can determine the microenvironment concentrations for each sampling
 unit and estimate the above distributions, using the corresponding sample distributions,
 or some suitable parametric estimates.  If the personal monitoring study only provides
 integrated exposure measurements, such as the TEAM studies, there is no obvious way to
 determine the microenvironment concentrations.  In Sections 2 and 3 we describe methods
 which can be used to estimate the microenvironment concentration distribution F_C(c) in
 (1.2) in those situations.
     Once we have estimated the distributions F_C(c) and F_T(t), we can combine the two
 distributions to estimate the distribution of integrated exposures,

                         F_Y(y) = P(Y < y).                                   (1.4)
   2  Duan (1980,  1982, 1985) used the term microenvironment for an individual locale
 or activity, and used the term microenvironment type for a collection of similar microen-
 vironments.  In accordance with the more  prevalent use of the terminology, the term
 microenvironment is used in this paper to refer to a collection of locales or activities.

                                        15-3

-------
We discuss methods for doing this in the next subsection.

1.2. Exposure Distribution
    There are two major methods for estimating the exposure distribution F_Y(y) from the
microenvironment concentration distribution F_C(c) and the activity pattern distribution
F_T(t), namely, the Cartesianization method3 (Duan 1980, 1985, 1987) and SHAPE (Ott
1981).
    The two methods are similar in nature;  both models require certain independence
assumptions.4 The Cartesianization method assumes that the microenvironment concen-
trations are stochastically independent of activity patterns.  This assumption is equivalent
to
                         F_{C,T}(c,t) = F_C(c) F_T(t),                        (1.5)
where F_{C,T} denotes the joint distribution of C and T.  In other words, the joint distribution
for  C and T  is given by the Cartesian product of the respective marginal distributions.
The distribution for integrated exposure is then given by

                F_Y(y) = \iint 1(c't < y) dF_C(c) dF_T(t),                    (1.6)

where 1(c't < y) denotes the indicator function for the set of (c,t)'s which satisfy the
inequality c't = \sum_{k=1}^{K} c_k t_k < y.
    Given appropriate estimates for the microenvironment concentration distribution and
the activity pattern distribution, say, \hat{F}_C(c) and \hat{F}_T(t), we can estimate the joint
distribution F_{C,T}(c,t) by the Cartesian product of the estimated marginal distributions, then
estimate the exposure distribution by

          \hat{F}_Y(y) = \iint 1(c't < y) d\hat{F}_C(c) d\hat{F}_T(t).        (1.7)
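
    When the two marginal distributions are estimated by sample distributions, the double
integral above reduces to an average of the indicator over all pairs of an observed
concentration vector with an observed activity pattern. The Python sketch below illustrates
this under that assumption; the data and the function names are hypothetical.

```python
from itertools import product

# Hypothetical data for K = 2 microenvironments (indoor, in transit).
# Each tuple in c_sample is an observed concentration vector (ppm);
# each tuple in t_sample is an observed activity pattern (hours).
c_sample = [(1.0, 6.0), (2.0, 9.0), (3.0, 12.0)]
t_sample = [(23.0, 1.0), (22.0, 2.0)]


def cartesianized_exposures(c_vectors, t_vectors):
    """Pair every concentration vector with every activity pattern and
    return the resulting exposures c't, sorted in increasing order."""
    return sorted(
        sum(ck * tk for ck, tk in zip(c, t))
        for c, t in product(c_vectors, t_vectors)
    )


def exposure_cdf(exposures, y):
    """Share of (c, t) pairs with c't < y: the sample version of the double integral."""
    return sum(1 for e in exposures if e < y) / len(exposures)


exposures = cartesianized_exposures(c_sample, t_sample)
share_below_60 = exposure_cdf(exposures, 60.0)
```

    With n concentration vectors and m activity patterns the product contains n*m pairs,
so for large samples one might draw pairs at random instead of enumerating all of them.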

    SHAPE  employs a different set of independence assumptions. SHAPE decomposes
microenvironment concentrations5 into short term averages, say, minute averages:
                    C_k = \sum_{s=1}^{T_k} b_k(s) / T_k,                      (1.7)
  3  Duan (1980,1985) used the term "convolution method." The method was generalized
to a broader context in Duan (1987) and renamed as the Cartesianization method. The
essence of the method is to take the Cartesian product between the microenvironment
concentration distribution and the activity pattern distribution.
  4  This assumption might be rather restrictive. Further discussions are given in Section
4.
  5  SHAPE also subtracts background ambient  concentrations from the microenviron-
ment concentrations.  The same can be done for the Cartesianization method.  Alterna-
tively, we can implement both SHAPE and  the  Cartesianization method without sub-
tracting the background concentrations and deal with the background concentrations as a
source of covariance among the microenvironment concentrations.

                                       15-4

-------
where b_k(s) denotes the average concentration in the k-th microenvironment during the
s-th minute. SHAPE assumes that the minute averages b_k(s) for the same microenvironment
have the same distribution, and that all minute averages are stochastically independent:
               b_k(s) ~ G_k,  s = 1, ..., T_k,  k = 1, ..., K,                 (1.8)
where G_k denotes the distribution for minute averages in the k-th microenvironment.6
We then generate independent random samples for the minute averages, compute the
microenvironment concentrations according to (1.7),7 compute the integrated exposures
according to (1.1), then estimate the exposure distribution.
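    The simulation steps just described can be sketched in Python as follows. This is only
an illustration of the general scheme, not the SHAPE program itself: the lognormal form
for G_k, its parameters, the activity patterns, and the function name are all assumptions
made for the example.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical lognormal parameters (mean and s.d. of the log concentration)
# for the minute-average distribution G_k in each microenvironment.
G = {"home": (math.log(2.0), 0.5), "in_transit": (math.log(8.0), 0.7)}


def simulate_exposure(minutes_by_env):
    """Simulate one sampling unit: draw independent minute averages b_k(s) from G_k,
    average them into C_k, and add up the contributions C_k * T_k."""
    y = 0.0
    for env, t_k in minutes_by_env.items():
        if t_k == 0:
            continue  # no time spent here, so no contribution
        mu, sigma = G[env]
        b = [rng.lognormvariate(mu, sigma) for _ in range(t_k)]
        c_k = sum(b) / t_k   # microenvironment concentration (ppm)
        y += c_k * t_k       # contribution to the exposure (ppm-minutes)
    return y


# Activity patterns (minutes per microenvironment) are taken as given,
# e.g. observed in a survey.
activity_patterns = [{"home": 1380, "in_transit": 60},
                     {"home": 1320, "in_transit": 120}]
simulated = sorted(simulate_exposure(t) for t in activity_patterns for _ in range(100))
```

    The sorted simulated values play the role of the estimated exposure distribution;
for example, the value at position 0.9 * len(simulated) approximates its 90th percentile.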
    Under the independence assumption (1.5) employed in the Cartesianization method,
C and T are uncorrelated, and the variances of microenvironment concentrations do not
depend on activity patterns,
                            Var(C_k | T) = \Sigma_{kk},                        (1.9)
where \Sigma denotes the unconditional covariance matrix for C. Under the assumption that
the minute averages in (1.8) are stochastically independent, as in SHAPE, C and T
are also uncorrelated, while the variances of microenvironment concentrations depend on
activity patterns as a result of averaging over time:
                            Var(C_k | T) = \Gamma_{kk} / T_k,                  (1.10)
where \Gamma_{kk} denotes the common variance for the minute averages (b_k(1), ..., b_k(T_k)).
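    As a brief check of the form (1.10), write \Gamma_{kk} for the common variance of the
minute averages in the k-th microenvironment, and assume, as SHAPE does, that the minute
averages are independent of each other and of the activity pattern. Conditional on T, the
concentration C_k is then an average of T_k independent terms:

```latex
\begin{align*}
\mathrm{Var}(C_k \mid T)
  &= \mathrm{Var}\Bigl(\frac{1}{T_k}\sum_{s=1}^{T_k} b_k(s) \Bigm| T\Bigr)
   = \frac{1}{T_k^{2}} \sum_{s=1}^{T_k} \mathrm{Var}\bigl(b_k(s)\bigr)
   = \frac{T_k\,\Gamma_{kk}}{T_k^{2}}
   = \frac{\Gamma_{kk}}{T_k}.
\end{align*}
```

    The conditional variance therefore shrinks in proportion to the time spent in the
microenvironment, which is the averaging effect referred to above.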
    Switzer commented in a discussion that the independence assumptions used in either
the Cartesianization method or SHAPE  might be unrealistic:  because of averaging over
time,  it is unlikely that the conditional variance Var(C_k | T) would remain constant as in
(1.9); on the  other hand, the form for the conditional variance in (1.10)  might not allow
for heterogeneity of microenvironments. Switzer  suggested that one might decompose the
minute averages into two components,

                            b_k(s) = a_k + d_k(s),                            (1.11)

where a_k denotes a systematic component which does not vary over time, and d_k(s) denotes
a random component which varies over time.8  We assume that the two components are
  6  It follows from (1.7) that we can rewrite the microenvironment decomposition (1.1)
as follows:

                    Y = \sum_{k=1}^{K} \sum_{s=1}^{T_k} b_k(s).

  7  We assume here that activity patterns have already been generated, or we might use
observed activity patterns from a survey study.
  8  It follows from (1.11) that the microenvironment decomposition (1.1) can be rewritten
as
          Y = \sum_{k=1}^{K} a_k T_k + \sum_{k=1}^{K} \sum_{s=1}^{T_k} d_k(s).
                                       15-5

-------
stochastically independent.  We also assume without loss of generality that the mean of
the random component is zero:
                                  E(d_k(s)) = 0.                              (1.12)

    To illustrate the decomposition (1.11), we consider the home microenvironment. The
CO concentrations in two different homes might be systematically different because of
differences in the two homes' heating and cooking facilities, ventilation, presence of smokers,
etc. Those systematic differences are represented by the systematic component a_k. The
CO concentrations in the home might also vary over time; this variation is represented by the
random component d_k(s). Under (1.11), the microenvironment concentration is given by

                    C_k = a_k + \sum_{s=1}^{T_k} d_k(s) / T_k.                 (1.13)

    The Cartesianization method in essence neglects the presence of the random components
d_k(s), while SHAPE in essence neglects the presence of the systematic component
a_k. It is plausible in practical applications that neither component can be neglected.
    Under the assumption that a_k and d_k(s) are both present and are stochastically
independent, C and T are again uncorrelated, while the conditional variances are given
by
                    Var(C_k | T) = \Sigma_{kk} + \Gamma_{kk} / T_k,            (1.14)
where \Sigma denotes the covariance matrix for the systematic components (a_1, ...,
-------
and C being the vector of regression parameters.  If we have replicates for the same C's,
it would be possible to use an appropriate regression of Y on T to estimate C. However,
usually we don't have such replicates,  therefore we cannot determine the microenviron-
ment concentrations. However, we can use an appropriate regression to estimate the mean
microenvironment concentrations
                      \gamma_k = E(C_k),  k = 1, ..., K.                       (2.1)

Under either assumption (A), (B), or (C), we have the regression equation

               E(Y | T) = T'\gamma = \sum_{k=1}^{K} T_k \gamma_k.              (2.2)

Therefore we can regress Y on T to estimate \gamma, e.g., using least squares regression.
It should be noted,  though, that  this regression model does not contain the intercept.
Tosteson and Spengler (1980) used a similar approach to estimate the mean microenviron-
ment concentrations; however,  they included individual-specific intercept terms in their
regression model.
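     A minimal Python sketch of the regression (2.2) with the intercept suppressed is given
below. It solves the normal equations directly with a small Gaussian elimination routine
so that it needs only the standard library. The data are hypothetical and are generated
without noise, so the estimate should reproduce the coefficients used to generate them;
the function names are ours.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def no_intercept_least_squares(t_rows, y):
    """Least squares fit of y on the columns of t_rows with no intercept,
    obtained from the normal equations (T'T) gamma = T'y."""
    k = len(t_rows[0])
    ttt = [[sum(row[i] * row[j] for row in t_rows) for j in range(k)] for i in range(k)]
    tty = [sum(row[i] * yi for row, yi in zip(t_rows, y)) for i in range(k)]
    return solve_linear_system(ttt, tty)


# Hypothetical data: hours spent in K = 3 microenvironments by five sampling
# units, with exposures generated exactly from gamma = (2.0, 1.5, 8.0).
T = [(14, 8, 2), (16, 7, 1), (12, 9, 3), (20, 0, 4), (10, 12, 2)]
Y = [2.0 * a + 1.5 * b + 8.0 * c for a, b, c in T]
gamma_hat = no_intercept_least_squares(T, Y)
```

     In this example every activity pattern sums to 24 hours, so an intercept column would
be an exact linear combination of the time columns and could not be estimated separately.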
     Under assumption (A), which is employed in the Cartesianization method, the variances
(and covariances) of the microenvironment concentrations do not depend on the
activity pattern. Therefore we have the following regression equation:

     E(r^2 | T) = \sum_{k=1}^{K} T_k^2 \Sigma_{kk} + \sum_{k<l} 2 T_k T_l \Sigma_{kl},     (2.3)

where r = Y - T'\gamma is the residual from the previous regression equation, (2.2), and \Sigma
denotes the covariance matrix for C.  (The left-hand side of equation (2.3) is also the
conditional variance of Y given T.)  While we don't observe the residual r directly, we
can estimate it by the empirical residual \hat{r} = Y - T'\hat{\gamma}, where \hat{\gamma} is the estimate for \gamma after
fitting regression equation (2.2). It follows that we can estimate \Sigma using a regression of \hat{r}^2
on (T_1^2, ..., T_K^2) and (2 T_k T_l, 1 \le k < l \le K). If we believe that the microenvironment
concentration distributions follow a parametric form, say, the lognormal distribution, it would
then be possible to determine the distribution from the estimated means and the estimated
covariance matrix.
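    The regression for the covariance matrix uses K squared terms and K(K-1)/2 cross
product terms as regressors. The Python sketch below builds these regressors for one
activity pattern and checks that, with the variances and covariances as coefficients, they
reproduce the quadratic form T'\Sigma T. The covariance matrix and the function names are
hypothetical.

```python
from itertools import combinations


def variance_regressors(t):
    """Regressors for the squared residual: the squares T_k^2 followed by
    the cross products 2*T_k*T_l for k < l."""
    squares = [tk * tk for tk in t]
    crosses = [2 * t[k] * t[l] for k, l in combinations(range(len(t)), 2)]
    return squares + crosses


def quadratic_form(t, sigma):
    """Conditional variance T' Sigma T for a given covariance matrix."""
    n = len(t)
    return sum(t[k] * sigma[k][l] * t[l] for k in range(n) for l in range(n))


# Hypothetical covariance matrix for K = 3 microenvironment concentrations.
sigma = [[1.0, 0.2, 0.1],
         [0.2, 0.5, 0.3],
         [0.1, 0.3, 4.0]]
# Coefficients in the same order as the regressors: the variances
# Sigma_11, Sigma_22, Sigma_33, then the covariances Sigma_12, Sigma_13, Sigma_23.
coefficients = [sigma[0][0], sigma[1][1], sigma[2][2],
                sigma[0][1], sigma[0][2], sigma[1][2]]

t = (14.0, 8.0, 2.0)
fitted = sum(x * b for x, b in zip(variance_regressors(t), coefficients))
```

    In an application the coefficients would be unknown and would be estimated by regressing
the squared empirical residuals on these regressors, again without an intercept.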
    It is possible to impose restrictions on the covariance matrix \Sigma.  For example, we
might assume that the off-diagonal elements in \Sigma are identical, i.e.,

                       \Sigma_{kl} = \rho,  1 \le k < l \le K,

where \rho denotes the common value of those off-diagonal elements.  We can then estimate the
diagonal elements of \Sigma (the variances of the microenvironment concentrations) and the
common covariance \rho using a regression of \hat{r}^2 on (T_1^2, ..., T_K^2) and 2 \sum_{k<l} T_k T_l
-------
other,  i.e., the off-diagonal elements in \Sigma are all zero.9  In this case we only need to
estimate the variances, using a regression of \hat{r}^2 on (T_1^2, ..., T_K^2).
    It is possible to improve upon the regression models (2.2) and (2.3) by allowing the
residuals to have different variances and using weighted least squares estimates. For ex-
ample, after fitting model (2.3), we may re-estimate (2.2) by weighted least squares, using
the reciprocal of the fitted values of Var(Y | T) based on (2.3) as the weights.  It  is also
possible to modify (2.3) similarly, if we fit similar  regression models for higher moments of
Y.
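    In symbols, one way to write this reweighting step is the following. The index i for
sampling units, the sample size n, and the weights w_i are notation introduced here for
illustration; \widehat{Var}(Y_i | T_i) stands for the fitted value from (2.3) for the
i-th sampling unit.

```latex
\begin{align*}
w_i &= \frac{1}{\widehat{\mathrm{Var}}(Y_i \mid T_i)}, \qquad i = 1, \dots, n, \\
\hat{\gamma}_{W} &= \Bigl(\sum_{i=1}^{n} w_i\, T_i T_i'\Bigr)^{-1}
                    \sum_{i=1}^{n} w_i\, T_i Y_i .
\end{align*}
```

    The fitted values from (2.3) are not guaranteed to be positive, so some adjustment may
be needed before they are used as weights.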

2.2. No Systematic Component
    Under assumption (B), which is employed in SHAPE, we can still use (2.2) to estimate
the means for the microenvironment concentrations. Since the microenvironment concentrations
are the averages of the minute averages, the same estimate \hat{\gamma}_k can also be used to
estimate the mean for the minute averages (b_k(1), ..., b_k(T_k)).
    We also need to estimate the variances for the minute averages.10 Under assumption
(B) and (1.10), we have the following regression equation which is analogous to (2.3):

                   E(r^2 | T) = \sum_{k=1}^{K} T_k \Gamma_{kk},                (2.4)

where \Gamma_{kk} denotes the variance for the minute averages in the k-th microenvironment,
(b_k(1), ..., b_k(T_k)). It follows that we can estimate (\Gamma_{kk}, k = 1, ..., K) by a regression of \hat{r}^2
on (T_1, ..., T_K).  This is analogous to the special case of (2.3) with all off-diagonal terms
of \Sigma being zero. If we believe that the distribution for the minute averages b_k(s) for the
k-th microenvironment follows a parametric form such as the lognormal distribution, we
can determine the distribution from the mean and the variance.  We can then conduct
SHAPE-type simulations.
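    For the lognormal case, the step from an estimated mean and variance to the parameters
of the distribution can be sketched in Python as follows. The numerical values stand in
for regression estimates and are hypothetical, as is the function name.

```python
import math


def lognormal_from_moments(mean, variance):
    """Return (mu, sigma) of the lognormal distribution with the given mean and
    variance, where mu and sigma are the mean and standard deviation of the log."""
    if mean <= 0 or variance <= 0:
        raise ValueError("a lognormal needs a positive mean and a positive variance")
    sigma_sq = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


# Hypothetical estimates for one microenvironment (ppm and ppm squared).
mean_hat, var_hat = 2.0, 1.5
mu, sigma = lognormal_from_moments(mean_hat, var_hat)

# Moments implied by (mu, sigma); they should reproduce the inputs.
implied_mean = math.exp(mu + 0.5 * sigma ** 2)
implied_variance = (math.exp(sigma ** 2) - 1.0) * math.exp(2.0 * mu + sigma ** 2)
```

    The function rejects nonpositive inputs because a regression estimate of a variance can
turn out negative, in which case no lognormal distribution matches the estimates.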

2.3. Systematic and Random Components  both Present
    Under assumption (C), we can use (2.2) to estimate the means for the microenvironment
concentrations.  It follows from (1.7), (1.11), and (1.12) that the estimate \hat{\gamma}_k also
estimates the mean for the systematic component, E(a_k).
  9  In this case we have a general regression equation for the cumulants of Y and C:

               \kappa_j(Y | T) = \sum_{k=1}^{K} T_k^j \kappa_j(C_k),

where \kappa_j(Y | T) denotes the j-th conditional cumulant for Y given T, and \kappa_j(C_k) denotes the
j-th cumulant for C_k. Equation (2.2) is a special case of this cumulant equation for j = 1.
Equation (2.3) without the off-diagonal elements in \Sigma is a special case of this cumulant
equation for j = 2.
  10  We assume that the minute averages from different microenvironments are indepen-
dent of each other; therefore, we don't need to estimate their covariances.

                                        15-8

-------
    We also need to estimate the variances and covariances for the systematic and the
random components. Under assumption (C) and (1.13), we have the following regression
equation which generalizes (2.3) and (2.4):

   E(r^2 | T) = \sum_{k=1}^{K} T_k^2 \Sigma_{kk} + \sum_{k<l} 2 T_k T_l \Sigma_{kl} + \sum_{k=1}^{K} T_k \Gamma_{kk},   (2.5)

kc(Trj)  by inverting the Fourier transform (3.1).  We can then estimate the
microenvironment concentration distribution by inverting the following Fourier transform
which defines the characteristic function for C:

                         \psi_C(\tau) = E(e^{i\tau'C}).                        (3.2)

                                       15-9

-------
    Under assumption (B), there does not appear to be a simple expression for the joint
characteristic function \psi_{Y,T}. Instead, we may consider the conditional characteristic
function:

             E(e^{i\eta Y} | T) = \prod_{k=1}^{K} [\psi_k(\eta)]^{T_k},          (3.3)

where \psi_k denotes the characteristic function for the minute averages (b_k(1), ..., b_k(T_k)) in
the k-th microenvironment:

             \psi_k(\eta) = E(e^{i\eta b_k(s)}).                                (3.4)

It follows that for each value of \eta, we can estimate (\psi_1(\eta), ..., \psi_K(\eta)) by a nonlinear
regression of e^{i\eta Y} on T_1, ..., T_K using (3.3). We can then invert the Fourier transform in
(3.4) to estimate the distribution for the minute averages in the k-th microenvironment.
    It follows from (3.3) that the cumulant generating function for Y is linearly related
to the cumulant generating functions for the b_k's; therefore, the cumulants for Y are linear
combinations of the cumulants for the b_k's:

               \kappa_j(Y | T) = \sum_{k=1}^{K} T_k \kappa_j(b_k),  j = 1, 2, ...,   (3.5)

where \kappa_j(Y | T) denotes the j-th conditional cumulant for Y given T, and \kappa_j(b_k) denotes the
j-th cumulant for b_k. (Compare footnote 9 in Section 2.1.) The first cumulant is the mean;
therefore, (2.2) is a special case of (3.5) for j = 1.  The second cumulant is the variance;
therefore, (2.4) is a special case of (3.5) for j = 2. We can use (3.5) for j > 2 to estimate
the higher cumulants for the b_k's. For example, we can estimate the third cumulants for the
b_k's by a linear regression of \hat{r}^3 on (T_1, ..., T_K). It is also possible to estimate the
distribution for the minute averages from their estimated cumulants.
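    To spell out the step from the conditional characteristic function to the cumulant
equation: under assumption (B), Y is a sum of independent minute averages, T_k of them
from the k-th microenvironment, so its conditional characteristic function is the product
of the \psi_k(\eta) raised to the powers T_k. Taking logarithms and comparing the
coefficients of (i\eta)^j / j! on both sides gives the cumulant identity. The expansion
below assumes that the cumulants involved exist.

```latex
\begin{align*}
\log E\bigl(e^{i\eta Y} \mid T\bigr)
  &= \sum_{k=1}^{K} T_k \log \psi_k(\eta), \\
\sum_{j \ge 1} \kappa_j(Y \mid T)\,\frac{(i\eta)^j}{j!}
  &= \sum_{k=1}^{K} T_k \sum_{j \ge 1} \kappa_j(b_k)\,\frac{(i\eta)^j}{j!}
  \quad\Longrightarrow\quad
  \kappa_j(Y \mid T) = \sum_{k=1}^{K} T_k\,\kappa_j(b_k).
\end{align*}
```

    Because the third cumulant equals the third central moment, the cube of the empirical
residual is the natural response variable for the case j = 3.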
    We have not been able to find a simple relationship for the characteristic functions
under assumption (C). For the special case with systematic components (a_1, ..., a_K) being
independent of each other, we have the following cumulant equations:

     \kappa_j(Y | T) = \sum_{k=1}^{K} T_k^j \kappa_j(a_k) + \sum_{k=1}^{K} T_k \kappa_j(d_k).   (3.6)

Equation (2.2) is a special case of (3.6) for j = 1. Equation (2.5) without the off-diagonal
elements is a special case of (3.6) for j = 2. We can use (3.6) to estimate the higher
cumulants for the a_k's and d_k's. For example, we can regress \hat{r}^3 on (T_1^3, ..., T_K^3) and
(T_1, ..., T_K) to estimate the third cumulants for the a_k's and d_k's.

4. Discussions and Conclusions
    Estimating microenvironment concentration distributions using integrated exposure
measurements is  useful for several purposes.  First, it allows the potential  to general-
ize  from one population to another. For example, we might have integrated exposure

                                       15-10

-------
measurements in one metropolitan area, and an activity pattern study in another metropolitan area. If we
believe the microenvironment concentration distributions are the same in the two areas,
it would then be possible to combine the microenvironment concentration distributions
estimated from the first area with the activity pattern data from the second area to estimate
the exposure distribution in the second area. Second, it  allows development of simulation
models such as  SHAPE. Third, it is useful to help us identify the microenvironments
with potential for large  contributions to human exposure, so that further research and
regulatory actions can be directed  towards these microenvironments.
    Despite this potential, the results described in this paper remain to be validated
empirically before further applications.  The author plans to do the validation using the
CO personal monitoring data from the Washington Urban Scale Study. In this study, a
sample of human subjects were recruited to carry continuous CO monitors for a 24-hour
period. They also recorded their activities during this period. For the validation analysis, we
will neglect the continuous data, pretend that we only have integrated CO measurements,
and apply the methods  in this paper to estimate the  microenvironment concentration
distributions.  We can then compare the estimates with the actual distributions to see if
these methods provide valid estimates.
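    The data preparation for such a validation can be sketched in Python as follows. The
record format (one microenvironment label and one CO reading per minute) and the function
name are assumptions made for illustration; the actual study files may be organized
differently.

```python
from collections import defaultdict

# Hypothetical continuous record for one subject: one (microenvironment, ppm)
# pair per minute. A real 24-hour record would have 1440 entries.
record = ([("home", 2.0)] * 3
          + [("in_transit", 9.0), ("in_transit", 7.0)]
          + [("office", 1.0)] * 5)


def summarize(minutes):
    """Collapse a minute-by-minute record into the activity pattern T_k,
    the microenvironment concentrations C_k, and the integrated exposure Y."""
    time_in = defaultdict(int)
    total = defaultdict(float)
    for env, ppm in minutes:
        time_in[env] += 1
        total[env] += ppm
    conc = {env: total[env] / time_in[env] for env in time_in}
    exposure = sum(total.values())  # equals the sum of C_k * T_k
    return dict(time_in), conc, exposure


T, C, Y = summarize(record)
# In the validation exercise only T and Y would be passed to the estimation
# methods; C is held back and compared with the estimates afterwards.
```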
    All results in this paper require some independence assumptions.  This is intrinsic to
the indirect approach. For example, in order  to estimate the average exposure Y using
the average  microenvironment concentration C and the  average activity patterns T, it is
necessary to assume that C is uncorrelated with T. Whether those assumptions are realistic
remains to be studied empirically. There has not been very much work reported on
the validity of independence assumptions of this type. Duan (1985) examined this using
data from the Washington CO study and  found no significant correlations between the
microenvironment concentrations and the corresponding activity patterns.  Switzer (1988)
examined the minute averages  from a microenvironment monitoring study conducted on
El Camino Real, an arterial route in Palo Alto, California, and found little autocorrelation
beyond the first  four minutes. More empirical studies of this type still need to be done.  It
will also be useful to examine the theoretical robustness properties of the results in this pa-
per against departures from the underlying independence assumptions. For example, if the
minute averages have lagged autocorrelations, do the estimates based on the independence
assumptions still have good statistical properties?


       The  work described in this  paper was not funded by the  U.S.  Environmental
Protection Agency and therefore the contents do not necessarily reflect the views of the
Agency, and no official endorsement should be inferred.
                                      15-11

-------
                                REFERENCES

    1.  Duan, N. (1980):  "Micro-environment  types: a model for human exposure to
air pollution", SIMS Technical Report No.  47,  Dept.  of Statistics, Stanford University,
Stanford, CA.
    2.  Duan, N. (1982): "Models for human exposure to air pollution", Environment
International, 8, 305-309.
    3.  Duan, N. (1985): "Application of the microenvironment monitoring approach to
assess human exposure to carbon monoxide", R-3222-EPA, The RAND Corporation, Santa
Monica, CA.
    4.  Duan, N. (1987): "Cartesianized sample mean:  imposing known independence
structures on observed data", unpublished manuscript, The RAND Corporation, Santa
Monica, CA.
    5.  Ott, W. (1981): "Computer simulation of human exposures to carbon monoxide",
paper presented  at the 74th  Annual Meeting of the Air Pollution Control Association,
Philadelphia, PA.
    6.  Switzer, P. (1988): private communication.
                                      15-12

-------
        MICROENVIRONMENT DATABASE FOR TOTAL HUMAN EXPOSURE STUDIES

                        by:   Muhilan D. Pandian
                             Environmental Research Center
                             University of Nevada-Las Vegas
                             4505 S. Maryland Parkway
                             Las Vegas, NV  89154
                                 ABSTRACT

      Human exposure  to a pollutant  is determined  by  matching pollutant
concentrations in microenvironments with human time  activity patterns.   In
this context,  microenvironments  are used to  denote  volumes  with  homogeneous
pollutant concentrations.   In this paper,  a  new concept is introduced  in
which the use of microenvironments  is  extended to include all  the different
components of the total  human  exposure process,  from pollutant sources  to
related health effects.   The feasibility of  incorporating this concept  in  a
database  format  is  discussed,  and the advantages of  such a database are
also noted.

      This  paper   has   been  reviewed  in  accordance  with  the  U.S.
Environmental  Protection Agency's  peer and administrative  review  policies
and approved for presentation  and publication.
                                   16-1

-------
INTRODUCTION

      During  their daily activities,  humans  are  exposed to the multitude of
pollutants  present  in  the  surrounding environment.  In general, pollutants
are transported by  a  carrier medium (air, water,  soil,  or  food)  to  the
physical  boundaries  that  envelop  us.   On crossing this envelope,  exposure
to pollutants can  lead to  dosage  received by the  target tissues, resulting
in cancer and/or noncancer risks.

      The  concentrations  of  pollutants  found  at  a  person's  physical
boundaries  at any  specific time depend  largely  on where he/she is present
and what he/she is doing  at  that  time.   During  a  day's routine, the person
may  pass  through  many  different  environments  with  varying  physical
characteristics.     These   environments,   which   are  denoted   as
microenvironments,  account for the variations  in  the  sources  and  sinks of
pollutants,  the carrier medium of the  pollutants,  the  concentrations  and
physical properties  of the  pollutants,  and  the  temperature,  relative
humidity,  airflow  characteristics,  and other  physical  properties  of  the
environments.

      Most  exposure studies involving  microenvironments  have divided  the
surrounding   environment  depending  upon  the  pollutant  of interest.
Researchers  have  repeated  this division process  when  considering  a
different  pollutant.    Also,   when  characterizing  microenvironments,
scientists have ignored the health  effect  factor,  which  is a significant
component in  the total human exposure process.

      When  studying  a particular component  in the total  human  exposure
process, it  is  important  to consider the  effects of other components as
well  (Pandian,  1987).  The  following  sections explain  a  new concept in
which the  different  components  of  the total  human exposure  process  are
directly or  indirectly related to microenvironments.   The feasibility of
presenting  this concept in the form of a database  is also discussed.


SUGGESTED CONCEPT

      In its  simplest  form,  a  microenvironment can be defined as a control
volume with  a homogeneous pollutant concentration.   Most exposure studies
to date have  applied this  definition.  Since the pollutant concentration of
the  same  volume  might  vary  with  time,   a  better  definition   of
microenvironment is the four-dimensional concept (3-D space) x (time)
(Duan, 1982). Two models have used this four-dimensional concept to
determine  human  exposure  to   air  pollutants   by  matching pollutant
concentrations in  microenvironments  with human  activity patterns  over a
certain  period  of  time.    They are  the  National  Ambient   Air  Quality
Standards Exposure  Model  (NEM)  (Paul, 1981)  and  the Simulation  of Human  Air
Pollution  Exposure  model  (SHAPE) (Ott,  1981;  Systems Architects,  Inc.,
1982).
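
      The matching of concentrations with activity patterns can be
illustrated with a short sketch. It assumes, as a simplification that is not
taken from NEM or SHAPE, that each visit to a microenvironment has a single
representative concentration. The visit list and its numbers are
hypothetical.

```python
from typing import List, Tuple

# One visit: (microenvironment label, concentration in ppm, duration in minutes).
Visit = Tuple[str, float, float]


def integrated_exposure(visits: List[Visit]) -> float:
    """Sum of concentration x duration over all visits (ppm-minutes)."""
    return sum(conc * minutes for _, conc, minutes in visits)


def time_weighted_average(visits: List[Visit]) -> float:
    """Integrated exposure divided by the total time covered by the visits."""
    total_minutes = sum(minutes for _, _, minutes in visits)
    if total_minutes <= 0:
        raise ValueError("total duration must be positive")
    return integrated_exposure(visits) / total_minutes


# Hypothetical activity pattern for one person over 19 hours.
day = [("residence", 1.0, 600.0), ("car", 8.0, 60.0), ("office", 2.0, 480.0)]
print(integrated_exposure(day), time_weighted_average(day))
```

Because the same volume can have a different concentration at a different
time, a repeat visit to the same location would appear as a separate entry
in the list.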

      Given a human subject's activity pattern,  available  exposure models
can predict the concentrations he/she is exposed to for specific
pollutants.     These  models  usually  assume  a  different  set  of
microenvironments for  each  pollutant.   Indoor  environments,  where  human
beings spend more than 90 percent of their time (Szalai, 1972; NAS, 1981),
are handled poorly in terms of  pollutant  concentrations and their  related
sources and  sinks.    Dosage levels  are estimated  after  assuming  simple
linear behaviors,  and no health effects are predicted  (Pandian, 1987).

      Since actual exposure occurs in  a microenvironment,  it should be
possible to relate the various  components of the total  exposure  process to
the  microenvironment,   either  directly  or  indirectly.     Pollutant
concentrations in microenvironments  can  be correlated with  the  appropriate
sources and sinks; microenvironments can be associated  with specific human
time activities;  and dosages from exposure linked to target tissues.   The
resulting  health  effects can  then   be  traced  to the   accountable
microenvironments.

      The  current use  of microenvironments  in exposure  studies  can be
easily extended  to include  the characteristics of various pollutants  and
their related health effects. This can be achieved by identifying all
possible  microenvironments and  characterizing  each  according  to  the
following factors:

            i) control  volume,
           ii) pollutant(s), single or multiple,
          iii) human  time  activities,
           iv) pollutant(s)  source characteristics,
            v) pollutant(s)  sink characteristics,
           vi) exposure  media and pathways,
          vii) dosage processes, and
         viii) health effects.
The next section briefly explains each factor.


CHARACTERISTICS OF MICROENVIRONMENTS

      Microenvironments  can  be  categorized by  control volumes  which  may or
may  not  have definite  physical  boundaries.    In  the   case  of  indoor
environments,  a  residence,  for example, can be divided by using definite
boundaries  into  living room,  bedroom,  kitchen,   dining  room,  bathroom,
garage, basement, and  attic.   Examples for control  volumes with definite
boundaries in in-transit regimes  are automobiles, buses, trains, airplanes,
and  other enclosed  vehicles.   Outdoor  environments can  be divided into
control volumes by using  imaginary boundaries;  examples are a  street in  a
downtown business district,  a street in a residential area,  park area,  and
the beach.

      Pollutants  include  gaseous  pollutants,  radioactive  pollutants,
particulate  matter  - which  includes  liquid and solid suspensions  -  and
bioaerosols.   Bioaerosols,  which are  particulate matter,  have  been  listed
separately because they  may  carry viable microorganisms  in their  structure.
                                   16-3

-------
      More  than  one  pollutant  can  exist  inside  a  microenvironment.
Depending  upon  the  sources  and  sinks  in  the  microenvironment,   the
concentrations may vary  spatially and  temporally.   A  human subject  is
exposed  to all  the  pollutants  present  in  his/her  microenvironment.
Therefore,   given  his/her  time  activity  pattern,   a  listing   of
microenvironments with the  pollutants present  in  each  will  immediately
provide  information  on  the  human  subject's   exposure  pattern  to  the
pollutants in  the relevant microenvironments.

      Microenvironments can  be  associated  with  specific  human activities.
From  studies  on  human   time   activity  patterns,   it  is   possible   to
statistically determine the activities  and their durations  which can  be
linked to any  microenvironment.

      It is important to  know  the sources of pollutants to  obtain  their
concentrations in  microenvironments.    Two types  of  sources contribute
pollutants:  those which generate  background concentrations  and those  which
emit  transient  concentrations.    Background  sources  responsible  for
pollutant  concentrations  in  microenvironments  can  be  determined   by
techniques  such  as receptor  modeling  (Friedlander,  1973).    Transient
sources  are  highly time-dependent  sources which are  either inside  the
control  volume  of  a microenvironment  or  outside,   in  which  case  the
pollutants released will  be swept inside  by  physical means.   Examples of
background  sources include  tall smokestacks  and vehicular traffic  in
highways;  examples of  transient sources  include smoking cigarettes,  gas
cooking  ranges,  hot  showers,  unclean HVAC systems,  dusty  carpets,  and
automobile exhaust  systems.

      Information on pollutant sinks in microenvironments is necessary to
estimate realistic concentrations.  Pollutants  are removed  by particular
mechanisms depending  upon their physical  and  chemical  properties.    All
pollutants can be removed from a control  volume  by diffusion,  air exchange,
or  ventilation.    Gaseous   pollutants can  also  undergo  chemical
transformation,  and  particulate matter can be deposited by  impaction or
sedimentation.

      Pollutant-carrying  media  include air, water, soil, or  food, one or
more of which  can exist in any microenvironment.  The ubiquitous medium  air
provides exposure  through the  respiratory tract,  skin, and  the  digestive
tract.   Water penetrates  the  human  envelope  through the  skin and  the
digestive  tract,  soil  through  the skin,  and food through  the  digestive
tract. Exposure can be associated with microenvironments depending on the
different media  present and the related  human  time activities.   By  using
pharmaco-kinetic  relationships,  exposure  can  be  extrapolated  to obtain
harmful dosage levels to human target  tissues.

      The last factor  in the characterization  of microenvironments involves
health  effects,   which  can  be  indirectly  related  to microenvironments
through exposure  to the pollutants found  in certain microenvironments.   In
cases  where  the  prediction of  health effects   is very  complex, one  can
resort to the use of risk factors. A typical prediction would be: human
subject S, due to certain amounts of exposure to pollutant P in
microenvironments M1, M2, ..., Mn, has a risk factor of F for experiencing
health effect E.

DATABASE FORMULATION

      After  extensive  and  laborious  literature  searches  and  reviews,
information  relating  to the factors  which  characterize microenvironments
can be accumulated and categorized  in  a database  format.  The format  should
be set up  so that when one needs information on  a  particular  component  in
the total  human  exposure process,  he/she  can obtain data not just on  that
component alone but also on associated components. Explanations on how the
components are related should also  be  provided.

      The  formulation  of  such  a  database  has  many advantages.     Most
important of all,  a scientist  studying  human exposure would have  access  to
data on microenvironments and related elements in a concise package.    If  a
thorough  literature  search  is carried out,  the completed  database  will
reveal which data  is  available  in  the field  of total  human exposure.  The
database will also expose areas where further research needs to be pursued,
both theoretical  and experimental.

      The completed database will  facilitate the use  of multiple effects
when  studying human   exposure.    Factors  such  as exposure  to  multiple
pollutants  and  dosage from exposure through  different  pathways  can  be
considered.  Weighting factors  can  be  used to calculate exposure.

      When accumulating data,  it  is  important  to note the accuracy  of the
information. Data presented in the database should carry reported results
on error analyses and statistical inferences. The database should be built
to include bibliographic records, and provisions should be made for
updating data. If feasible, the database should include meteorological
data such as the STAR array as well as U.S. Bureau of Census files on
population distribution and growth, and migration patterns.

      The  actual  formulation of the database  itself can be   easily
accomplished with the  use of available  computer  software.  The appropriate
fields necessary for the input of data can be listed first, followed  by the
entry of collected data.
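
      As a rough illustration of listing the fields first, the sketch below
defines one record with a field for each of the eight characterizing factors
plus bibliographic records. The class name, field names, query helper, and
sample entries are all hypothetical; the paper does not prescribe a schema
or a software package.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MicroenvironmentRecord:
    """One database entry, with one field per characterizing factor."""
    control_volume: str                                          # factor i
    pollutants: List[str] = field(default_factory=list)          # factor ii
    time_activities: List[str] = field(default_factory=list)     # factor iii
    sources: List[str] = field(default_factory=list)             # factor iv
    sinks: List[str] = field(default_factory=list)               # factor v
    media_and_pathways: List[str] = field(default_factory=list)  # factor vi
    dosage_processes: List[str] = field(default_factory=list)    # factor vii
    health_effects: List[str] = field(default_factory=list)      # factor viii
    references: List[str] = field(default_factory=list)          # bibliographic records


def records_with_pollutant(records, pollutant):
    """Return the control volumes of all records that list the pollutant."""
    return [r.control_volume for r in records if pollutant in r.pollutants]


# Illustrative entries only; the values are placeholders, not reviewed data.
database = [
    MicroenvironmentRecord(
        control_volume="kitchen",
        pollutants=["CO"],
        time_activities=["cooking"],
        sources=["gas cooking range"],
        sinks=["ventilation"],
        media_and_pathways=["air / respiratory tract"],
    ),
    MicroenvironmentRecord(
        control_volume="automobile",
        pollutants=["CO"],
        time_activities=["commuting"],
        sources=["automobile exhaust system"],
        sinks=["air exchange"],
        media_and_pathways=["air / respiratory tract"],
    ),
    MicroenvironmentRecord(control_volume="park area"),
]

print(records_with_pollutant(database, "CO"))
```

Keeping all factors in one record means a query on one component, here the
pollutant, also returns the associated sources, sinks, and activities.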
                                   16-5

-------
                                REFERENCES

Duan,  N.  (1982):  Models for  Human Exposure  to Air  Pollution,  Environ.
      Intl.,  8,  305-309.

Friedlander,  S.K.  (1973):  Chemical  Element Balances and  Identification  of
      Air Pollution Sources, Environ. Sci. & Tech., 7 (3), 235-240.

NAS  (1981):  Indoor Pollutants, Committee on Indoor Pollutants,  Board  of
      Toxicology  and  Environmental  Health  Hazards,    Assembly of  Life
      Sciences,   National  Research  Council,   National  Academy   Press,
      Washington,  DC.

Ott, W.R.  (1981):  Exposure Estimates Based on Computer  Generated Activity
      Patterns,  paper presented  at the  74th Annual  Meeting  of the Air
      Pollution Control  Association  (APCA), Philadelphia,  PA,  APCA Pub. No.
      81-57.6.

Pandian, M.D.  (1987):  Evaluation  of Existing Total Human Exposure  Models,
      Prepared  for U.S.  Environmental Protection Agency, Las Vegas, NV.

Paul, R. (1981): User's  Guide  for NAAQS Exposure  Model  (NEM),  Prepared for
      U.S.  Environmental  Protection  Agency, Research Triangle Park, NC.

Systems Architects, Inc.  (1982):  Environmental Modeling  Catalogue (Draft),
      Prepared  for  U.S.  Environmental  Protection Agency, Washington, DC,
      EPA Contract No.  68-01-4723, 97-102.

Szalai,  A.  (1972): ed.,  The Use of Time: Daily  Activities  of Urban and
      Suburban  Populations In Twelve Countries, Mouton, The Hague.
                                   16-6

-------
A METHODOLOGY FOR ESTIMATING CARBON MONOXIDE EXPOSURE AND  RESULTING
           CARBOXYHEMOGLOBIN LEVELS  IN  DENVER, COLORADO

              by:   Ted Johnson
                   PEI Associates, Inc.
                   505 South Duke Street
                   Suite 503
                   Durham, NC  27701
                             ABSTRACT
  A methodology was developed for estimating carbon monoxide (CO)
exposure and resulting carboxyhemoglobin (COHb) levels among residents of
Denver, Colorado. The methodology consisted of the following six steps.

  1.    Two populations-at-risk were defined  for  the  five  counties  in
        the greater Denver Metropolitan  Area:   a)  all  residents  and  b)
        persons with angina.

  2.    Each population-at-risk was divided  into  an exhaustive  set  of
        cohorts according to appropriate demographic and physiological
        variables.

  3.    A year-long sequence of exposure events was developed for each
        cohort.   An  exposure  event  is defined  by microenvironment,
        smoking status,  breathing  rate, and  duration  (e.g.,  outdoors
        near  roadway,  not  smoking,    fast  breathing,   12  minutes).
        Sequences  were  developed  by  applying a  "random-walk"
        statistical  algorithm  to  activity  diary  data  collected  in
        Cincinnati,  Ohio,  during March and August of 1985.

  4.    A probabilistic model was developed for estimating the CO
        exposure associated with each event in each sequence. The
        exposure during an event was assumed to reflect the ambient
        concentration measured at one or more fixed-site monitors, the
        contribution of localized sources and sinks specific to the
        microenvironment, and smoking status. Exposure values for
        specific microenvironments were drawn from distributions fit to
        personal monitoring data collected in Denver during the winter
        of 1982/83.

  5.    A physiological model (the Coburn equation) was applied to the
        exposure sequences developed in Step 4. The model determined
        the COHb level of each cohort at the end of each hour.

  6.    The CO exposures and associated COHb levels were extrapolated
        to each population-at-risk through the use of census-derived
        weighting factors.

Computer  software  was  developed for implementing Steps  3 through 6.

      This  paper has  been   reviewed   in  accordance  with the U.S.
Environmental  Protection Agency's peer  and  administrative review  policies
and approved for presentation and publication.
                                   17-2

-------
                               INTRODUCTION


      The Environmental  Strategies Project  (ESP)  for  Metro-Denver is  a
multiyear study  of  environmental  issues  and  policy  choices facing  the
Denver area.   The objectives of this  study are:   1)  to assess the relative
magnitude of  a set  of  metro-Denver's environmental  problems;  and  2)  to
identify strategies  to  reduce the health and  welfare damages  associated
with these  problems.

      Phase I  of the ESP  began  in 1987.   The purpose  of Phase I  is  to
provide the ESP  Advisory Committee with preliminary estimates  of  damages
caused by the  major  sources of environmental contamination  in  the  Denver
area.   These  initial  estimates  will  enable  the Advisory  Committee  to
identify the  environmental issues which  merit  more detailed analysis  in
later phases  of the  project.   To  provide results  quickly  and to  avoid
unnecessarily  detailed   analyses  of  less  critical   issues,  the  analyses
conducted under  the  first phase  of  the  project  have  been designed  to
provide "order of magnitude" estimates of health and  welfare damages.

      One of  the air pollutants  of  most concern in  the  Denver area  is
carbon monoxide (CO).  Exposure to  high  levels of CO can increase the level
of carboxyhemoglobin  (COHb) in the  blood  with  a corresponding decrease  in
the blood's ability  to  carry oxygen.   Elevated  COHb  levels are  associated
with reduced time to onset of attacks among persons with angina and with
reduced vigilance among members of the general population. These and other
health  effects related  to CO exposure  are thought  to  increase in  high
altitude  areas  where  ambient  oxygen  levels   are  reduced.    Denver  is
considered  to  have a high potential  for CO-related  health  effects  because
of the city's  altitude (5280 ft) and high  ambient CO levels.  In an attempt
to characterize  the  extent of Denver's  CO  problem,  PEI Associates,  Inc.
(PEI), designed and executed a methodology for  estimating CO exposures  and
resulting  COHb  levels  within  the  general  Denver  metropolitan  area
population  and  within  selected  subsets  of  this  population.     This
methodology is described  in the following report.   Exposure  estimates  are
not provided,  as  these are currently under review by the Region VIII office
of the U.S.  Environmental  Protection Agency (EPA).


                                BACKGROUND


      The methodology developed by PEI  for the Phase  I CO  analysis  was
derived from  a general  approach to estimating population  exposure  to  air
pollution which has  been  used by the  Strategies and Air Standards Division
(SASD) of EPA for most of the  criteria pollutants.  This approach divides a
population-at-risk residing in a particular  study area  into  an  exhaustive
set of  cohorts.   All members of a  particular  cohort are  assumed  to have
similar  activity patterns and similar  physiological characteristics.   A
sequence of exposure events spanning  a relevant time  interval  (e.g.,  one
year) is developed for each cohort. The exposure events are defined in
such a way that  the  pollutant  exposure  during each event can  be  estimated
using  supplementary  data developed  for  this  purpose.     The  resulting
sequence of pollutant exposures  is  then  used to predict the occurrence  of
specific health effects in each member of the  cohort.   Census-derived  data
are used to extrapolate both  the exposure estimates and the health  effect
estimates  to  the  entire  population-at-risk.    In some analyses,   health
effects are not estimated, and only  the  extrapolated exposure estimates are
presented.

      In adapting this general  approach  to  the ESP-CO I analysis, PEI  was
constrained by the  fact that  a period of only three months was  available
for designing a  methodology,  collecting  and processing the  necessary  data
bases,  performing literature  reviews,  developing  and  debugging computer
software,  running  the software,   performing  sensitivity   analyses,   and
summarizing the results.    The resulting methodology reflects  the decision
by PEI  to  minimize  the  complexity  of the computer models used  to estimate
CO exposures and  resulting COHb levels while  making full  use of personal
monitoring  data and  activity diary  data  from large-scale studies  conducted
by PEI in Denver  and Cincinnati.

      The Denver  study  was conducted during the  winter of 1982-1983  and
included as its target population those nonsmoking  residents of the  Denver
metropolitan  urbanized area who were between 18 and 70 years of age  in the
Fall of  1982.  A total  of 454 subjects  were obtained through the use  of a
screening questionnaire administered to several thousand households  in the
study area.  Each subject was asked to  carry a personal exposure monitor
(PEM) and  activity  diary  for  two consecutive 24-hour sampling  periods and
to  provide a breath sample  at  the end  of each  sampling   period.    Each
subject also completed a  detailed background questionnaire.   PEI  collected
CO  data  recorded  at  fixed   sites throughout  Denver to compare  with
simultaneously  measured PEM values.   Reference 1  contains  a more detailed
description  of the  study.    Data  from this study have been  analyzed  by
Johnson et  al.  (Reference  2).

      The second  study was conducted in  a three-county area  surrounding and
including  Cincinnati, Ohio,  during March   and  August  1985.   The  target
population  of this  study  included  all  residents of Hamilton County,  Ohio;
Clermont County,  Ohio;   and  Kenton County,  Kentucky.   A total  of  973
subjects (487 in March and 486  in August) were  obtained through the  use of
a  screening  questionnaire  administered  to several  thousand   households.
Each subject  was  requested to  carry  an  activity diary for three consecutive
24-hour periods and to complete a detailed background questionnaire.   As no
personal  monitors  were   employed,   personal  CO  exposure  data  were  not
available for subjects of this study.  This  study,  however,  included three
population subgroups excluded from the  Denver study:   children, persons
over 70, and smokers.  Reference 3  contains  a  more detailed description of
the Cincinnati  study.

                   REQUIRED OUTPUTS  AND  GENERAL APPROACH

      PEI was requested to provide  estimates  for a recent calendar year for
the following COHb indicators  and populations-at-risk:

                                    17-4

-------
      1.    Daily  maximum  8-hour  average  COHb  levels  for  the  general
            population;

      2.    One-hour COHb levels for persons with angina pectoris.

      In  each  case  estimates  were to be tabulated for specified ranges  of
 COHb  in three ways:   the number  of person-occurrences of the COHb levels
 falling within each range,  the number  of  persons with  one  or more COHb
 values  falling within each  range,  and the number  of persons with yearly
 maximum  COHb  values  falling  within  each  range.    To  provide  maximum
 flexibility,  PEI  developed  software capable of providing  these tabulations
 for  various combinations of  time period (entire year,  warm months only,
 cold  months   only),   regulatory  scenario   ("as  is"  conditions  versus
 attainment of the  current  8-hour  National  Ambient  Air  Quality Standard
 (NAAQS) for CO), smoking status (all persons, smokers only, and nonsmokers
 only),  and  population group  (e.g., retired persons).
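
      The three tabulations can be sketched as follows. The sketch assumes
that each cohort carries a census-derived count of the persons it
represents and a list of COHb values for the year, and that ranges are
half-open intervals. The function name, cohort labels, and numbers are
hypothetical and do not reproduce the PEI software.

```python
from typing import Dict, List, Tuple

# cohort label -> (persons represented by the cohort, COHb values over the year)
Cohorts = Dict[str, Tuple[int, List[float]]]


def tabulate(cohorts: Cohorts, ranges: List[Tuple[float, float]]):
    """For each half-open range [lo, hi), return the three tabulations."""
    table = []
    for lo, hi in ranges:
        occurrences = 0   # person-occurrences of values in the range
        any_value = 0     # persons with one or more values in the range
        yearly_max = 0    # persons whose yearly maximum is in the range
        for persons, values in cohorts.values():
            hits = sum(1 for v in values if lo <= v < hi)
            occurrences += persons * hits
            if hits > 0:
                any_value += persons
            if lo <= max(values) < hi:
                yearly_max += persons
        table.append((lo, hi, occurrences, any_value, yearly_max))
    return table


# Hypothetical cohorts with three COHb values each (percent COHb).
example = {
    "cohort A": (1000, [0.5, 1.2, 2.4]),
    "cohort B": (500, [0.4, 0.6, 0.9]),
}
for row in tabulate(example, [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]):
    print(row)
```

Restricting the tabulation to a time period, smoking status, or population
group would amount to filtering the cohorts or the values before the call.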

      The study  area  specified  for the CO exposure analysis  was  identical
 to  the  study area   specified  for  a  parallel   analysis  of  ozone   and
 particulate matter exposure  being  conducted  by another  contractor.    The
 study area included all  census  tracts  in Adams, Arapahoe,  Denver,  Douglas,
 and Jefferson  counties and three census tracts (131.03, 131.04, and  131.05)
 in  Boulder County.   Approximately 89 percent of  the Denver  Standardized
 Metropolitan Statistical Area (SMSA) is included in  the study  area.

      PEI's general approach  to estimating COHb values for each  of  the  two
 populations-at-risk within this study area was to 1) divide the
 population-at-risk into an exhaustive set of cohorts, 2) develop a
 year-long sequence of exposure events for each cohort, 3) estimate the
 CO exposure associated
 with  each  event  in  each  sequence,   4) apply the  Coburn   model  to  the
 resulting sequences of CO exposures,  and 5)  extrapolate  the  resulting  COHb
 estimates  to  the  population-at-risk through  the use  of census-derived
 weighting factors.

 POPULATION COHORTS

      Two populations-at-risk were defined  for the  analysis:

      1.    All residents of  the designated study area, including children,
            retirees,  and smokers

      2.    Persons  with  angina residing in  the study area.

      The  first  of   the  two  populations-at-risk  (i.e.,    the  general
 population) was represented by  a  set of  14 nonoverlapping population groups
 (Table  1) hereafter referred to  as demographic groups (DG's).  A DG  is a
 group  of  people  with  similar  demographic  characteristics that   may
 reasonably be  expected to have  similar CO exposure  patterns during  a  given
 calendar year.   The  demographic characteristics  used to define the DG's
 were age, school status, employment status, commute time, and smoking
status. These factors were assumed to strongly influence personal activity
patterns and resulting  CO exposure patterns.

          TABLE  1.  NUMBER OF CINCINNATI  SUBJECTS  ASSIGNED TO EACH
                             DEMOGRAPHIC GROUP

                                                    Number of assigned subjects
Code  Demographic group (DG)                         March    August     Total

   1  Preschoolers, under 2 years of age                21        11        32
   2  Preschoolers, 2 to 5 years of age                 46        30        76
   3  Students, 6 to 9 years of age                     27        39        66
   4  Students, 10 to 13 years of age                   27        38        65
   5  Students, 14+ years of age, smokers               11         9        20
   6  Students, 14+ years of age, nonsmokers            58        44       102
   7  Workers, commute < 20 minutes, smokers            46        36        82
   8  Workers, commute < 20 minutes, nonsmokers         79        80       159
   9  Workers, commute 20+ minutes, smokers             13        28        41
  10  Workers, commute 20+ minutes, nonsmokers          42        47        89
  11  Homemakers and unemployed, smokers                19        24        43
  12  Homemakers and unemployed, nonsmokers             51        47        98
  13  Retired, smokers                                   9        10        19
  14  Retired, nonsmokers                               33        24        57

      Total number of assigned subjects                482       467       949

      Each DG was subdivided into four cohorts. A cohort is a group of
people assumed to have identical sequences of CO exposure for a given
calendar year. Each of the cohorts belonging to a particular DG was
assigned a different blood volume according to the distribution of blood
volume in members of the DG (Table 2). Blood volume is an important
determinant of the rate at which a person's COHb level varies as the
concentration of CO to which the person is exposed varies.
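
      Table 2 gives a relation of the form VB = wt * k for each DG, where k
is a factor in ml/kg. The short sketch below applies that relation to the
four cohort body weights listed for DG 1; the function name is
hypothetical, and rounding to the nearest millilitre is an assumption about
how the tabulated values were obtained.

```python
def blood_volume_ml(body_weight_kg: float, ml_per_kg: float) -> float:
    """Blood volume from body weight, using a relation of the form VB = wt * k."""
    if body_weight_kg <= 0 or ml_per_kg <= 0:
        raise ValueError("weight and factor must be positive")
    return body_weight_kg * ml_per_kg


# Body weights (kg) of cohorts A-D of DG 1 and the 73.0 ml/kg factor from Table 2.
dg1_weights = [6.3, 8.9, 10.7, 12.9]
dg1_volumes = [round(blood_volume_ml(w, 73.0)) for w in dg1_weights]
print(dg1_volumes)
```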

      Each cohort  was represented by a  single 365-day exposure  sequence
developed  from  data  obtained from  Cincinnati  subjects  assigned to  the
specified DG.   The Cincinnati  study  was  selected for this purpose because
it included subjects who could be assigned to each of the DG's of
interest. As previously indicated, the Denver study omitted children,
persons over  70  years  of  age,  and  smokers.

      Responses  to data items  appearing  in   the  Cincinnati  background
questionnaire were the primary means of assigning Cincinnati  subjects to
DG's.  In general,  the DG's were defined so  as to provide as many different
exposure  sequences as  possible within the  overall  constraints  that the DG
definitions  emphasize standard Bureau of Census identifiers  and  that  the
number  of Cincinnati   subjects assigned to each DG  equal  or  exceed  15.
Table  1  lists  the number of  subjects  assigned  to  each DG by month  of
participation (March or August).

      The second population-at-risk (persons with angina) was considered to
be a subset  of  the first.   PEI developed age- and sex-specific prevalence
                                   17-6

-------
TABLE 2.  ESTIMATES OF COHORT BLOOD VOLUMES AND 1980 POPULATIONS
                                                 Body               Percent  Population, thousands
Code  Demographic group (DG)          Cohort  wt., kg   VB, ml     of DG    General     Angina

1     Preschoolers, under 2           A           6.3      460      25.6       10.9          0
      years of age                    B           8.9      650      21.8        9.2          0
                                      C          10.7      781      31.3       13.3          0
      VB = wt * 73.0 ml/kg            D          12.9      942      21.4        9.1          0

2     Preschoolers, 2 to 5            A          12.0      876      23.1       19.6          0
      years of age                    B          14.2     1037      25.1       21.3          0
                                      C          16.7     1219      28.8       24.4          0
      VB = wt * 73.0 ml/kg            D          20.2     1475      23.0       19.5          0

3     Students, 6 to 9 years          A          19.5     1424      25.4       21.7          0
      of age                          B          23.2     1694      23.1       19.8          0
                                      C          26.7     1949      26.1       22.3          0
      VB = wt * 73.0 ml/kg            D          32.0     2336      25.4       21.7          0

4     Students, 10 to 13 years        A          29.4     2146      25.1       22.9          0
      of age                          B          36.3     2650      24.0       21.9          0
                                      C          41.6     3037      26.1       23.8          0
      VB = wt * 73.0 ml/kg            D          50.1     3657      24.7       22.5          0

5     Students, 14+ years of          A          47.5     3491      24.5        9.4          0
      age, smokers                    B          56.8     4175      25.4        9.8          0
                                      C          63.8     4689      24.7        9.5          0
      VB = wt * 73.5 ml/kg            D          75.3     5535      25.4        9.8          0

6     Students, 14+ years of          A          47.5     3491      24.5       31.8          0
      age, nonsmokers                 B          56.8     4175      25.4       32.8          0
                                      C          63.8     4689      24.7       32.0          0
      VB = wt * 73.5 ml/kg            D          75.3     5535      25.4       32.9          0

7     Workers, commute < 20           A          52.6     3866      25.1       29.8      0.385
      minutes, smokers                B          65.4     4807      24.4       28.9      0.374
                                      C          74.1     5446      25.4       30.2      0.390
      VB = wt * 73.5 ml/kg            D          86.3     6343      25.1       29.8      0.385

8     Workers, commute < 20           A          52.6     3866      25.1       48.1      0.605
      minutes, nonsmokers             B          65.4     4807      24.4       46.7      0.588
                                      C          74.1     5446      25.4       48.8      0.612
      VB = wt * 73.5 ml/kg            D          86.3     6343      25.1       48.1      0.605

9     Workers, commute 20+            A          52.6     3866      25.1       39.9      0.517
      minutes, smokers                B          65.4     4807      24.4       38.8      0.502
                                      C          74.1     5446      25.4       40.6      0.523
      VB = wt * 73.5 ml/kg            D          86.3     6343      25.1       39.9      0.517

(continued)
                            17-7

-------
TABLE 2 (continued)
                                                 Body               Percent  Population, thousands
Code  Demographic group (DG)          Cohort  wt., kg   VB, ml     of DG    General     Angina

10    Workers, commute 20+            A          52.6     3866      25.1       64.4      0.812
      minutes, nonsmokers             B          65.4     4807      24.4       62.6      0.790
                                      C          74.1     5446      25.4       65.4      0.822
      VB = wt * 73.5 ml/kg            D          86.3     6343      25.1       64.4      0.812

11    Homemakers and unempl.,         A          47.9     3497      24.6       11.3      0.120
      smokers                         B          59.9     4373      26.1       12.0      0.127
                                      C          68.4     4993      23.9       11.0      0.116
      VB = wt * 73.0 ml/kg            D          80.7     5891      25.4       11.7      0.123

12    Homemakers and unempl.,         A          47.9     3497      24.6       20.7      0.216
      nonsmokers                      B          59.9     4373      26.1       22.0      0.229
                                      C          68.4     4993      23.9       20.1      0.210
      VB = wt * 73.0 ml/kg            D          80.7     5891      25.4       21.4      0.223

13    Retired, smokers                A          52.3     3839      24.8        4.6      0.224
                                      B          64.2     4712      25.7        4.7      0.233
                                      C          72.1     5292      24.5        4.5      0.222
      VB = wt * 73.4 ml/kg            D          83.3     6114      25.0        4.6      0.226

14    Retired, nonsmokers             A          52.3     3839      24.8       23.7      0.055
                                      B          64.2     4712      25.7       24.6      1.094
                                      C          72.1     5292      24.5       23.4      0.996
      VB = wt * 73.4 ml/kg            D          83.3     6114      25.0       23.9      0.017

 VB:  blood volume.
                                     17-8

-------
rates for angina  and  applied  these  to the general population estimates  to
determine the number of persons with  angina in each cohort.

EXPOSURE EVENT SEQUENCE

      An exposure event sequence is  a time-ordered list  of  exposure  events
experienced by each member of a cohort from midnight January  1  to  midnight
December 31.   Each exposure event is defined by  microenvironment,  smoking
status,  breathing rate, and duration.

      Microenvironments are generalized  locations  which  are  considered  to
have  predictable  CO  levels.     Examples include  parking  garages,   the
interiors   of automobiles,  and  residences.     An  exhaustive   set   of
microenvironments was defined in  terms  of the  subject  responses to data
items appearing in the Cincinnati  diary (Table 3).

      Smoking status  and  breathing  rate  were also determined by  responses
to items appearing in the Cincinnati activity  diary.    Possible responses
are listed below.

                  Smoking Status               Breathing Rate
              17:  subject smoking           13:   slow
              18:  others smoking            14:   medium
              19:  no one smoking            15:   fast
                                             16:   breathing problem
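
      One way to hold an exposure event in software is sketched below,
using the diary response codes listed above. The class and field names are
hypothetical, and mapping the abstract's example of "not smoking" to code
19 (no one smoking) is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class SmokingStatus(IntEnum):
    SUBJECT_SMOKING = 17
    OTHERS_SMOKING = 18
    NO_ONE_SMOKING = 19


class BreathingRate(IntEnum):
    SLOW = 13
    MEDIUM = 14
    FAST = 15
    BREATHING_PROBLEM = 16


@dataclass(frozen=True)
class ExposureEvent:
    """An event defined by the four attributes named in the text."""
    microenvironment: str
    smoking_status: SmokingStatus
    breathing_rate: BreathingRate
    duration_min: int

    def __post_init__(self):
        if self.duration_min <= 0:
            raise ValueError("duration must be a positive number of minutes")


# Example event from the abstract: outdoors near roadway, 12 minutes.
event = ExposureEvent(
    microenvironment="outdoors near roadway",
    smoking_status=SmokingStatus.NO_ONE_SMOKING,
    breathing_rate=BreathingRate.FAST,
    duration_min=12,
)
print(event)
```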

      Durations were specified in minutes, the smallest unit of time which
could be distinguished in  the  Cincinnati and Denver data bases.   To allow
exposure sequences to  be related to  hourly average CO data  from a  downtown
fixed-site monitor, no duration exceeded 60  minutes  and  no  event  fell  into
more  than  one clock  hour.   For  example, a  visit to a  restaurant  lasting
from 12:35 to 1:22 was treated as two events, the first lasting from 12:35
to 1:00 and  the  second lasting from  1:00 to  1:22.   Consequently,  any given
event can be  identified by specifying the hour h during which it occurs  and
the position  i of the event within the hour.  Thus,  the  third event within
the ninth hour of the year can be  identified  by  specifying h = 9 and  i = 3.
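
The splitting rule and the (h, i) indexing described above can be sketched
in a few lines of Python.  This is an illustrative sketch only, not the PEI
program: the names ExposureEvent, split_at_clock_hours, and index_events
are hypothetical, time is assumed to be counted in minutes from midnight on
January 1, and the restaurant visit is taken to run from 12:35 p.m. to
1:22 p.m.

```python
from dataclasses import dataclass


@dataclass
class ExposureEvent:
    start_min: int   # minutes elapsed since midnight, January 1 (assumed convention)
    duration: int    # minutes; never more than 60
    micro: int       # microenvironment code (Table 3)
    smoke: int       # smoking status code (17, 18, or 19)
    breathing: int   # breathing rate code (13 to 16)


def split_at_clock_hours(start_min, end_min, micro, smoke, breathing):
    """Split the interval [start_min, end_min) at clock-hour boundaries."""
    events = []
    t = start_min
    while t < end_min:
        next_hour = (t // 60 + 1) * 60
        stop = min(next_hour, end_min)
        events.append(ExposureEvent(t, stop - t, micro, smoke, breathing))
        t = stop
    return events


def index_events(events):
    """Return (h, i) pairs: 1-based hour of the year and position within that hour."""
    indexed, last_h, i = [], None, 0
    for ev in sorted(events, key=lambda e: e.start_min):
        h = ev.start_min // 60 + 1
        i = i + 1 if h == last_h else 1
        last_h = h
        indexed.append((h, i))
    return indexed


# Restaurant visit from 12:35 to 1:22 p.m. on January 1 (codes are illustrative).
visit = split_at_clock_hours(12 * 60 + 35, 13 * 60 + 22, micro=8, smoke=19, breathing=13)
print([ev.duration for ev in visit])   # [25, 22]
print(index_events(visit))             # [(13, 1), (14, 1)]
```

Because every event lies inside a single clock hour, each one can be paired
with exactly one hourly fixed-site monitor value.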

      The development for each cohort of an event sequence containing 365
days was complicated by the fact that no individual Cincinnati subject
could contribute more than three distinct days of diary data to the
sequence.  PEI chose to overcome this obstacle by pooling the diary data
for all Cincinnati subjects assigned to a particular DG and then
constructing an individual 365-day sequence for each of the four cohorts in
the DG by randomly selecting subject-days from the pool.  Subject-days
within the pool were labeled as to day type (weekday, Saturday, or Sunday)
and season (warm or cold).  When a warm-weather Saturday was needed for a
sequence, one was randomly selected from the subject-days labeled "warm
Saturdays."
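
The pooled random selection can be sketched as follows.  This is a minimal
illustration under stated assumptions, not the PEI procedure itself: the
function names are hypothetical, the calendar year is assumed to be 1986,
and the months counted as "warm" (May through September) are an assumption,
since the text does not give the season boundaries.

```python
import random
from datetime import date, timedelta


def day_type(d):
    """Classify a date as 'weekday', 'Saturday', or 'Sunday'."""
    wd = d.weekday()            # Monday = 0 ... Sunday = 6
    if wd == 5:
        return "Saturday"
    if wd == 6:
        return "Sunday"
    return "weekday"


def season(d, warm_months=(5, 6, 7, 8, 9)):
    """'warm' or 'cold'; the warm-month set is an assumption."""
    return "warm" if d.month in warm_months else "cold"


def build_sequence(pool, year=1986, rng=None):
    """Draw one subject-day per calendar day from the matching pool.

    pool maps (season, day type) to a list of subject-day identifiers.
    Sampling is with replacement, so a subject-day may be used repeatedly.
    """
    rng = rng or random.Random()
    sequence = []
    d = date(year, 1, 1)
    while d.year == year:
        sequence.append(rng.choice(pool[(season(d), day_type(d))]))
        d += timedelta(days=1)
    return sequence


# Hypothetical pool with three subject-days in each of the six categories.
pool = {
    (s, t): [f"{s}-{t}-{k}" for k in range(3)]
    for s in ("warm", "cold")
    for t in ("weekday", "Saturday", "Sunday")
}
sequence = build_sequence(pool, rng=random.Random(0))
print(len(sequence))   # 365
```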

      The  obvious drawback to  this approach   is  that  the  resulting  CO
exposure sequences may not exhibit the day-to-day repetitions  in activities
that would be present if each sequence of 365 days were obtained from the

                                   17-9

-------
                TABLE 3.   ESTIMATED  MICROENVIRONMENT  FACTORS

                             Microenvironment factors        Goodness-of-fit     Spmn.
                                                                                 corr.
 m  Location category          A  Lambda   Theta  Sigma  Skew.  Kurt.  Pear.  coef.

Motor vehicle microenvironments
 1  Motorcycle              0.90    0.61    2.95   3.73   -0.4    3.1   0.11   0.52
 2  Bus                     0.64    0.62    1.31   3.47      0    2.7   0.08   0.47
 3  Truck/van               0.52    0.43    0.11   3.89      0    2.7   0.07   0.43
 4  Car                     0.49    0.44    0.42   3.93      0    3.3      0   0.41

Indoor microenvironments
 5  Church                  0.33    0.44   -1.73   2.48    0.4    2.5      0   0.38
 6  Manufacturing fac.      0.51    1.55   -1.32   2.68    0.7    5.7  -0.06   0.57
 7  Health care fac.        0.31    0.40   -2.48   2.91    0.7    2.5   0.10   0.56
 8  Rest./bar               0.41    0.60    0.01   2.99    0.4    3.2   0.11   0.51
 9  School                  0.33    0.40   -3.10   2.84    0.9    2.7   0.10   0.58
10  Service station         0.58    0.50    0.75   4.18      0    2.7      0   0.37
11  Other public build.     0.33    0.60   -1.24   2.80    0.8    3.5   0.11   0.69
12  Store                   0.28    0.39   -1.18   3.17      0    2.0   0.11   0.48
13  Other repair            0.42    0.55    0.60   4.43    0.5    1.8   0.11   0.34
14  Office                  0.16    0.40   -0.81   3.06      0    2.3   0.11   0.36
15  Residential - other     0.12    0.45   -1.00   2.59    0.6    3.1   0.11   0.34
16  Residential - meal      0.20    0.45    0.48   3.16      0    2.4   0.06   0.27
17  Res. garage             0.18    0.35   -0.96   3.38      0    2.0      0   0.29
18  Indoor public garage    0.29    0.41    2.41   3.64      0    4.2      0   0.33
19  Auditorium              0.15    0.43    0.22   3.16      0    2.1      0   0.17
                              0*    0.38    1.09   2.35      0    2.6
20  Shopping mall           0.24    0.42    0.43   3.07      0    2.7      0   0.23
                              0*    0.32    1.44   1.96      0    3.6
21  Other indoor loc.       0.19    0.34   -0.31   3.68      0    3.2      0   0.27

Outdoor microenvironments
22  Outdoor public gar.     0.60    0.77    0.79   3.47      0    2.3      0   0.58
23  Park, golf course,      0.14    0.30   -3.56   3.07    0.9    2.2  -0.09   0.06
      sports arena            0*    0.25   -1.87   2.30    0.6    2.1

(continued)
                                    17-10

-------
TABLE 3 (continued).

                             Microenvironment factors        Goodness-of-fit     Spmn.
                                                                                 corr.
 m  Location category          A  Lambda   Theta  Sigma  Skew.  Kurt.  Pear.  coef.

24  Other outdoor loc.      0.58    0.50   -2.12   3.51    0.9    3.2   0.11   0.59
25  School grounds,         0.78    0.60   -2.93   2.97    1.3    3.8   0.11   0.59
      bicycle
26  Near road               0.39    0.45   -1.30   3.50    0.4    2.5   0.11   0.49
27  Outdoor service sta.,   0.29    0.58    0.33   3.03      0    2.6   0.01   0.25
      parking lot
28  Residential grounds     0.14    0.41   -2.04   2.47    0.6    2.0   0.01   0.23

* Assumed because Spearman rank correlation coefficient is not significant at
  0.05 level.
Skew.:  Skewness coefficient for distribution of residuals (skew = 0 for
        normal distribution).
Kurt.:  Kurtosis coefficient for distribution of residuals (kurt = 3 for
        normal distribution).
Pear.:  Pearson correlation coefficient between Broadway fixed-site monitor
        values and residuals of fitted model.
Spmn. corr. coef.:  Spearman rank correlation coefficient between PEM values
                    and simultaneous Broadway fixed-site monitor values.
                                    17-11

-------
same subject.   The advantage of this approach is that it makes full use of
all  available data.   The  resulting  exposure sequence will  reflect to some
extent the  person-to-person  variability of all   subjects  assigned  to  a
particular DG.

ESTIMATION OF  EXPOSURE DURING  AN EVENT

      The assumption  was  made that  the CO  concentration  for a particular
cohort was constant during a given exposure event.  This concentration was
estimated  by  assuming that  the  CO  concentration  during   a  particular
exposure  event reflects 1) the ambient concentration as measured by one or
more  fixed-site  monitors,  2) the  contribution of localized sources and
sinks specific to  the microenvironment occupied  during  the  event, and 3)
smoking.   The resulting model  can  be expressed as

                 CO(c,h,i;m,s) = A(m) * MON(h) + B(m) + C(s)

where CO(c,h,i;m,s) is the CO concentration estimated for cohort c, hour h
(1 <= h <= 8760), and event i within hour h, given that the microenvironment
occupied is m and the smoking status is s.

      The quantity MON(h) denotes the hourly average ambient CO
concentration for hour h which would be expected to occur at the Denver
"design-value" monitoring site under one of the two scenarios of interest
("as is" and "meets 8-h NAAQS").  For the "as is" scenario, MON(h) was
taken directly from a data set consisting of 1) the 1986 hourly-average CO
data reported by the fixed-site monitor located on Broadway and 2)
estimates of the missing values provided by an interpolation technique
based on time-series analysis.  For the alternative scenario, the "as is"
data set was adjusted using a rollback formula so that the second largest
daily-maximum 8-hour CO value exactly equalled 9 ppm, the 8-h CO level not
to be exceeded more than once per year under the NAAQS for CO.
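
The rollback formula itself is not given here.  The sketch below shows one
simple possibility, a purely proportional rollback with no background
concentration, and should be read as an assumption rather than as the
formula PEI used.  It also assumes that each 8-hour running average is
assigned to the day in which its last hour falls.  All function names are
hypothetical.

```python
def running_8h_means(hourly):
    """8-hour running means; means[j] covers hours j .. j+7 (0-based)."""
    window_sum = sum(hourly[:8])
    means = [window_sum / 8]
    for k in range(8, len(hourly)):
        window_sum += hourly[k] - hourly[k - 8]
        means.append(window_sum / 8)
    return means


def second_highest_daily_max(hourly):
    """Second largest of the daily maximum 8-hour averages."""
    daily_max = {}
    for j, mean in enumerate(running_8h_means(hourly)):
        day = (j + 7) // 24          # day containing the last hour of the window
        daily_max[day] = max(daily_max.get(day, float("-inf")), mean)
    return sorted(daily_max.values(), reverse=True)[1]


def proportional_rollback(hourly, target_ppm=9.0):
    """Scale every hourly value so the second-highest daily max equals the target."""
    factor = target_ppm / second_highest_daily_max(hourly)
    return [factor * value for value in hourly]


# Hypothetical three-day hourly series (ppm), for illustration only.
as_is = [1.0 + 0.5 * (h % 24) + (h // 24) for h in range(72)]
meets_naaqs = proportional_rollback(as_is)
```

Because the 8-hour average is linear in the hourly values, scaling every
hour by one factor scales the second-highest daily maximum by the same
factor, which is what makes the adjusted value land on 9 ppm.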

      A(m)  is  a  multiplicative constant  specific  to  microenvironment m.
The quantity B(m)  is an additive factor selected at random from  a  "Box-Cox"
distribution specific  to m.   The  Box-Cox distribution  is  completely defined
by three parameters:   lambda(m),  theta(m),  and  sigma(m).   Values of  A(m),
lambda(m), theta(m), and sigma(m), jointly referred to as microenvironment
factors,   were developed  from  the  Denver  data  base  by   relating PEM
concentrations measured in microenvironment  m to simultaneous  fixed-site CO
values reported by  the Broadway monitor.   The  values were determined by a
statistical procedure  that  increased  the  proportion of variance  explained
by the stochastic term B(m) as the  observed  correlation between PEM values
and fixed-site values  decreased (Reference 4).   For microenvironments  where
the correlation was not statistically significant,  A(m)  was set equal to
zero and all  variance was explained by B(m).  Table 3 lists  the values of
A(m),  lambda(m),   theta(m),   and  sigma(m)  determined  for  each
microenvironment.

      The quantity  C(s) is  an additive factor  reflecting the contribution
of  active  smoking  (i.e.,   smoking status =  17) to the CO exposure.   The
contribution of passive smoking is assumed to be included  in  the B(m)  term.

                                   17-12

-------
Appropriate values of C(s)  were  obtained  by determining the continuous CO
exposure that would yield the  steady-state COHb levels  measured  in  typical
smokers after extended  periods  of smoking  (Reference 4).

ESTIMATION OF HOURLY AVERAGE COHb VALUES

      The  sequence  of  CO exposure  estimates determined  for a particular
cohort together with the associated durations and  breathing  rates were  read
by a  computer subroutine that yielded  a sequence  of event-specific  COHb
estimates  for the cohort.   This  subroutine contained  an  algorithm  that
applied an equation  developed  by Coburn  (Reference 5)  to the input data.
Another subroutine converted the event-specific COHb values to a series of
8760 hourly-average COHb estimates  for the cohort.

      The Coburn Equation Algorithm (CEA) required the following
physiological data on the members of the cohort:

      Blood volume
      Hemoglobin concentration
      Endogenous CO production  rate
      CO diffusion rate
      Altitude
      Haldane constant
      Initial COHb level
      Ventilation rate  (VA)  by  breathing  rate.

The Haldane constant (M) expresses the relative affinity of hemoglobin for
CO and oxygen.  Values for M ranging from 210 to 250 have appeared in the
scientific literature.  PEI selected the value of 218 for this analysis, as
this value had been selected by EPA for use in a prior CO exposure analysis
after considering the relative merits of the various suggested values of M
(Reference 6).

PEI used  a constant  value of 5280 feet  for altitude in the  analysis.   The
effect  on COHb  estimates  of  omitting  CO  exposures  at  higher altitudes
(e.g.,  weekend trips to  the Rocky Mountains)  is  expected to be  somewhat
offset by the lower ambient CO  levels expected  at these  altitudes.

      The CO exposure sequence for each cohort begins at midnight on January
1.  The results of CO exposures prior to this time are represented in the
CEA by an initial COHb level (i.e., the COHb level at midnight).  PEI
assumed the "true" initial COHb value for each cohort fell within the range
of zero to 1 percent.  PEI compared the effects of initial COHb levels of
zero and 1 percent on the COHb level estimated by the CEA for a typical
adult cohort after a few hours of exposure.  No effect was discernible.
PEI subsequently decided to assign each cohort an initial COHb level of 0.5
percent.

      PEI  conducted a  search  of  the  scientific  literature  to   develop
estimates of the remaining parameters used in the  CEA.   Tables  4  and 5  list
these estimates  by cohort.  Details on  the methods used to develop these
and other model inputs  are provided in Reference 4.

                                   17-13

-------
EXTRAPOLATION  TO  POPULATION-AT-RISK

      Each DG was defined  in  terms  of the age, working  status,  commuting
time, and  smoking status  of  its  members.  The  age,  working  status,  and
commuting time identifiers  were  identical  to identifiers used by the Bureau
of the  Census.   Data on  smoking rates  by  age were  obtained  through  a
literature search.  By applying  the smoking rates to the Denver census data
broken down by age,  working  status,  and  commuting time, PEI  was  able to
estimate the number  of Denver  area residents belonging  to  each  DG.

      Each DG  population was  further  divided  into  four cohort populations
through  the use of data giving the distribution of body weight by age.  The
results  were  estimates of the  number  of Denver  area residents  in  each
cohort (Table  2).   The sum of these estimates equalled the population-at-
risk referred  to  as  the  "general  population."  Data on  the  prevalence of
angina by  age  were  then  used to estimate the number of  persons  suffering
from  angina that  belonged  to  each  cohort.    The  sum  of  these  latter
estimates  equalled  the population-at-risk referred to  as  "persons  with
angina"  (Table 2).

      The cohort population estimates  were used to  extrapolate the cohort-
specific  sequences  of  hourly-average  COHb  estimates  to  the  various
populations-at-risk.   Details concerning the extrapolation  technique are
provided in Reference 4.
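
The population bookkeeping described in this section can be sketched as
below.  The sketch covers only the three counting steps stated above
(smoking split, body-weight split, angina prevalence); the extrapolation
technique of Reference 4 is not reproduced.  Function names and all numbers
are hypothetical and serve only to show the arithmetic.

```python
def split_by_smoking(census_count, smoking_rate):
    """Split a census category into (smokers, nonsmokers)."""
    smokers = census_count * smoking_rate
    return smokers, census_count - smokers


def split_into_cohorts(dg_population, weight_fractions):
    """Divide a DG population among its four body-weight cohorts."""
    return {cohort: dg_population * f for cohort, f in weight_fractions.items()}


def persons_with_angina(cohort_populations, angina_prevalence):
    """Apply an age-specific angina prevalence to each cohort population."""
    return {c: n * angina_prevalence for c, n in cohort_populations.items()}


# Hypothetical inputs: one census category, one smoking rate, equal weight
# fractions, and one angina prevalence.
smokers, nonsmokers = split_by_smoking(census_count=100_000, smoking_rate=0.30)
cohorts = split_into_cohorts(nonsmokers, {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25})
angina = persons_with_angina(cohorts, angina_prevalence=0.02)

general_population = sum(cohorts.values())    # population-at-risk, this DG only
angina_population = sum(angina.values())
```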


                     SAMPLE APPLICATION  OF METHODOLOGY

      To demonstrate  the  methodology,  it has  been applied to the cohort
designated as 8C in Table 2,  that is,  non-smoking workers who commute less
than 20 minutes  and  who  fall in the body weight category centered around
74.1 kg.   Figure  1  is a  computer printout listing the first  54  events in
the  exposure  event  sequence  developed  for the  cohort together with the
values of CO exposure and  COHb level associated with each event.  The table
headings are defined  below.

      EVENT:      the exposure event number

      DAY:        the Julian  date  (1 = January 1)

      TIME:       Start time  of  the event  (e.g., 1731  =  5:31  p.m.)

      DUR:        duration  of the event  in minutes

      MICRO:      microenvironment occupied during event  (see Table 3 for
                  code explanation)

      SMOKE:      smoking status during event (18 = others smoking, 19 = no
                  one smoking)
                                   17-14

-------
      BR:         breathing rate category during event (1 = 10,508 ml/min,
                  2 = 21,059 ml/min, 3 = 63,100 ml/min for cohort 8C)

      A:          multiplicative term [A(m)] associated with
                  microenvironment occupied during exposure event

      MON:        hourly average fixed-site monitoring value [MON(h)]
                  during exposure event

      B:          stochastic term [B(m)] selected from Box-Cox distribution
                 specific  to  microenvironment  occupied  during exposure
                 event

      CO:         estimated  CO concentration  to which  cohort is  exposed
                 during event

      COHb:       estimated COHb level at end of  exposure event.

      The  listings  for  exposure event 23,  for example,  indicate  the cohort
entered microenvironment  27  (outdoor  service station  or  parking  lot)  at
1700  (5:00 p.m.) and remained  there  for 5  minutes.   Because the  breathing
rate of the  cohort  was  slow  (BR = 1), the cohort was assigned a ventilation
rate  of 10,508  ml/min  according  to  Table 5.  The values of A(m),  MON(h),
and B(m)  in  the exposure  equation were  determined  to  be  0.29,  3.3  ppm,  and
7.9  ppm,  respectively,  for  this  event.    The cohort was not  smoking;
consequently the C(s) term in  the exposure  equation was set equal  to zero
for this event.  The exposure equation yields an estimated CO level for the
event equal to (0.29 * 3.3 ppm) + (7.9 ppm) or 8.9 ppm.  Given a COHb level
of 0.307 percent at the beginning of the event,  the Coburn equation yields
an estimate of  0.359 percent  for the COHb level at the end of the event.
The  cohort-specific physiological  parameters used as  inputs to  the Coburn
equation  for Cohort 8C  can be  found in Table 4.
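
The arithmetic for exposure event 23 can be reproduced with a short Python
sketch of the exposure equation.  The function name co_exposure and the
dictionary VENTILATION_8C are hypothetical; the numerical inputs are the
values quoted above for event 23 and the cohort 8C ventilation rates given
with the BR heading.  The Coburn equation step is not reproduced here.

```python
# Ventilation rates (ml/min) for cohort 8C, keyed by the BR code used in Figure 1.
VENTILATION_8C = {1: 10508, 2: 21059, 3: 63100}


def co_exposure(a_m, mon_h, b_m, c_s=0.0):
    """CO(c,h,i;m,s) = A(m) * MON(h) + B(m) + C(s), with concentrations in ppm."""
    return a_m * mon_h + b_m + c_s


# Event 23: microenvironment 27, MON(h) = 3.3 ppm, B(m) = 7.9 ppm, no active smoking.
co_23 = co_exposure(a_m=0.29, mon_h=3.3, b_m=7.9)
print(round(co_23, 1))       # 8.9
print(VENTILATION_8C[1])     # 10508
```

Keeping C(s) as a separate argument with a default of zero mirrors the
treatment of non-smoking cohorts, for which the smoking term drops out.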

      Inspection of  Figure  1  reveals  a basic limitation  of the model  as
currently  constructed.     Because   they  are  randomly  drawn  from
microenvironment-specific  distributions,  the values listed  for  B(m)  under
column  heading "B"  display  very  little  autocorrelation,   even  when
successive  events  occur  in   the  same microenvironment  (e.g.,  events  1
through 9).   As a consequence,  successive CO exposure estimates  display
much  less  autocorrelation than  is observed  in PEM data.  This limitation in
the  model  was  recognized  during  its initial   development  but  was  not
addressed  due to resource constraints.   The limitation  was  not  considered
to be serious,  as  the  Coburn  equation  tends to  dampen  out  fluctuations in
CO exposure  when estimates of  COHb are  made.  Nevertheless,   future versions
of the model  will  incorporate  an  appropriate  degree  of autocorrelation in
the B(m)  term so that the resulting  CO exposure  estimates  will  more nearly
reflect the  autocorrelation observed  in PEM data.
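
The lack of autocorrelation can be checked directly from the printout.  The
sketch below computes a lag-1 autocorrelation coefficient for the B values
listed in Figure 1 for events 1 through 9, all of which occur in
microenvironment 15.  The function name is hypothetical, and nine values
are far too few for a firm estimate, so the result is illustrative only.

```python
def lag1_autocorrelation(x):
    """Lag-1 sample autocorrelation: sum of lagged products over sum of squares."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    numer = sum((x[k] - mean) * (x[k + 1] - mean) for k in range(n - 1))
    return numer / denom


# B(m) values (ppm) printed in Figure 1 for events 1 through 9.
b_events_1_to_9 = [0.2, 2.3, -0.0, -0.1, 1.1, 0.4, -0.0, 2.6, 14.2]
r1 = lag1_autocorrelation(b_events_1_to_9)
print(round(r1, 2))   # 0.11
```

A coefficient this close to zero is what independent draws from a single
distribution would be expected to produce.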
                                   17-15

-------
TABLE  4.   ESTIMATED VALUES FOR INPUT PARAMETERS OF COBURN MODEL

                                        Weight      Coburn model inputs (a)
Code  Demographic group (DG)     Cohort   (kg)      VB      DL      HB       VCO

   1  Preschoolers, under 2        A       6.3     460     2.9    12.3   0.00051
      years of age                 B       8.9     650     4.9    12.3   0.00072
                                   C      10.7     781     6.2    12.3   0.00087
                                   D      12.9     942     7.6    12.3   0.00105

   2  Preschoolers, 2 to 5         A      12.0     876     7.0    12.3   0.00097
      years of age                 B      14.2    1037     8.5    12.3   0.00115
                                   C      16.7    1219    10.1    12.3   0.00135
                                   D      20.2    1475    12.2    12.3   0.00164

   3  Students, 6 to 9 years       A      19.5    1424    11.8    12.8   0.00165
      of age                       B      23.2    1694    13.9    12.8   0.00196
                                   C      26.7    1949    15.9    12.8   0.00225
                                   D      32.0    2336    18.7    12.8   0.00270

   4  Students, 10 to 13 years     A      29.4    2146    17.4    13.3   0.00258
      of age                       B      36.3    2650    21.0    13.3   0.00318
                                   C      41.6    3037    23.6    13.3   0.00365
                                   D      50.1    3657    27.6    13.3   0.00439

   5  Students, 14+ years of       A      47.5    3491    26.4    14.2   0.00448
      age, smokers                 B      56.8    4175    30.7    14.2   0.00535
                                   C      63.8    4689    33.8    14.2   0.00601
                                   D      75.3    5535    38.7    14.2   0.00710

   6  Students, 14+ years of       A      47.5    3491    26.4    14.2   0.00448
      age, nonsmokers              B      56.8    4175    30.7    14.2   0.00535
                                   C      63.8    4689    33.8    14.2   0.00601
                                   D      75.3    5535    38.8    14.2   0.00710

   7  Workers, commute less        A      52.6    3866    26.4    14.4   0.00503
      than 20 minutes, smokers     B      65.4    4807    32.9    14.4   0.00625
                                   C      74.1    5446    37.3    14.4   0.00708
                                   D      86.3    6343    43.5    14.4   0.00825

   8  Workers, commute less        A      52.6    3866    26.4    14.4   0.00503
      than 20 minutes,             B      65.4    4807    32.6    14.4   0.00625
      nonsmokers                   C      74.1    5446    36.8    14.4   0.00708
                                   D      86.3    6343    42.8    14.4   0.00825

   9  Workers, commute 20+         A      52.6    3866    26.4    14.4   0.00503
      minutes, smokers             B      65.4    4807    32.9    14.4   0.00625
                                   C      74.1    5446    37.3    14.4   0.00708
                                   D      86.3    6343    43.6    14.4   0.00825

(continued)
                               17-16

-------
TABLE  4  (continued)

                                        Weight      Coburn model inputs (a)
Code  Demographic group (DG)     Cohort   (kg)      VB      DL      HB       VCO

  10  Workers, commute 20+         A      52.6    3866    26.4    14.4   0.00503
      minutes, nonsmokers          B      65.4    4807    32.6    14.4   0.00625
                                   C      74.1    5446    36.8    14.4   0.00708
                                   D      86.3    6343    42.8    14.4   0.00825

  11  Homemakers and unempl.,      A      47.9    3497    24.7    13.6   0.00430
      smokers                      B      59.9    4373    28.9    13.6   0.00537
                                   C      68.4    4993    31.9    13.6   0.00613
                                   D      80.7    5891    36.2    13.6   0.00724

  12  Homemakers and unempl.,      A      47.9    3497    24.8    13.6   0.00430
      nonsmokers                   B      59.9    4373    28.9    13.6   0.00537
                                   C      68.4    4993    31.8    13.6   0.00613
                                   D      80.7    5891    35.9    13.6   0.00724

  13  Retired, smokers             A      52.3    3839    24.5    14.1   0.00489
                                   B      64.2    4712    31.4    14.1   0.00600
                                   C      72.1    5292    36.0    14.1   0.00674
                                   D      83.3    6114    42.5    14.1   0.00779

  14  Retired, nonsmokers          A      52.3    3839    24.6    14.1   0.00489
                                   B      64.2    4712    30.5    14.1   0.00600
                                   C      72.1    5292    34.4    14.1   0.00674
                                   D      83.3    6114    39.9    14.1   0.00779

(a) Abbreviations
     VB:  blood volume in milliliters
     DL:  CO diffusion rate in ml/min/torr
     HB:  hemoglobin concentration in grams/100 ml
     VCO:  endogenous CO production rate in ml/min.
                                      17-17

-------
TABLE 5.  ESTIMATES OF VENTILATION RATES (ml/min) TO BE ASSIGNED TO DIARY
                         BREATHING RATE RESPONSES

                                         VA by breathing rate response
Code  Demographic group (DG)     Cohort       Slow    Medium      Fast

   1  Preschoolers, under 2        A         1,897     3,770    11,233
      years of age                 B         2,227     4,432    13,221
                                   C         2,456     4,892    14,599
                                   D         2,735     5,453    16,282

   2  Preschoolers, 2 to 5         A         2,621     5,223    15,593
      years of age                 B         2,900     5,784    17,276
                                   C         3,218     6,422    19,189
                                   D         3,662     7,314    21,866

   3  Students, 6 to 9 years       A         3,574     7,136    21,331
      of age                       B         4,043     8,079    24,161
                                   C         4,488     8,972    26,839
                                   D         5,161    10,323    30,893

   4  Students, 10 to 13 years     A         4,830     9,660    28,904
      of age                       B         5,707    11,420    34,183
                                   C         6,380    12,771    38,237
                                   D         7,460    14,939    44,740

   5  Students, 14+ years of       A         7,130    14,276    42,751
      age, smokers                 B         8,311    16,647    49,865
                                   C         9,200    18,432    55,220
                                   D        10,660    21,365    64,018

   6  Students, 14+ years of       A         7,130    14,276    42,751
      age, nonsmokers              B         8,311    16,647    49,865
                                   C         9,200    18,432    55,220
                                   D        10,660    21,365    64,018

   7  Workers, commute less        A         7,777    15,576    46,652
      than 20 minutes, smokers     B         9,403    18,840    56,444
                                   C        10,508    21,059    63,100
                                   D        12,057    24,170    72,433

   8  Workers, commute less        A         7,777    15,576    46,652
      than 20 minutes,             B         9,403    18,840    56,444
      nonsmokers                   C        10,508    21,059    63,100
                                   D        12,057    24,170    72,433

   9  Workers, commute 20+         A         7,777    15,576    46,652
      minutes, smokers             B         9,403    18,840    56,444
                                   C        10,508    21,059    63,100
                                   D        12,057    24,170    72,433

(continued)

                                   17-18

-------
TABLE 5 (continued)

                                         VA by breathing rate response
Code  Demographic group (DG)     Cohort       Slow    Medium      Fast

  10  Workers, commute 20+         A         7,777    15,576    46,652
      minutes, nonsmokers          B         9,403    18,840    56,444
                                   C        10,508    21,059    63,100
                                   D        12,057    24,170    72,433

  11  Homemakers and unempl.,      A         7,180    14,378    43,057
      smokers                      B         8,704    17,438    52,237
                                   C         9,784    19,605    58,739
                                   D        11,346    22,742    68,149

  12  Homemakers and unempl.,      A         7,180    14,378    43,057
      nonsmokers                   B         8,704    17,438    52,237
                                   C         9,784    19,605    58,739
                                   D        11,346    22,742    68,149

  13  Retired, smokers             A         7,739    15,500    46,423
                                   B         9,250    18,534    55,526
                                   C        10,254    20,549    61,670
                                   D        11,676    23,405    70,138

  14  Retired, nonsmokers          A         7,739    15,500    46,423
                                   B         9,250    18,534    55,526
                                   C        10,254    20,549    61,570
                                   D        11,676    23,405    70,138
                                      17-19

-------
EVENT  DAY  TIME  DUR  MICRO  SMOKE  BR      A   MON      B     CO    COHB

    1    1     0   60     15     19   1   0.12   1.3    0.2    0.4   0.384
    2    1   100   60     15     19   1   0.12   1.0    2.3    2.4   0.442
    3    1   200   60     15     19   1   0.12   1.2   -0.0    0.1   0.331
    4    1   300   60     15     19   1   0.12   1.0   -0.1    0.0   0.258
    5    1   400   60     15     19   1   0.12   1.0    1.1    1.2   0.291
    6    1   500   45     15     19   1   0.12   0.9    0.4    0.5   0.271
    7    1   545   15     15     19   1   0.12   0.9   -0.0    0.1   0.259
    8    1   600   60     15     19   1   0.12   1.6    2.6    2.8   0.438
    9    1   700   30     15     19   1   0.12   1.9   14.2   14.4   2.027
   10    1   730    1     26     19   1   0.39   1.9   -0.3    0.4   2.011
   11    1   731   24      4     19   1   0.49   1.9    0.3    1.2   0.765
   12    1   755    5     27     19   1   0.29   1.9    0.1    0.7   0.743
   13    1   800   60     14     19   1   0.16   2.1    2.9    3.2   0.706
   14    1   900   60     14     19   1   0.16   2.2    0.1    0.5   0.359
   15    1  1000   60     14     19   1   0.16   1.7    4.3    4.6   0.731
   16    1  1100   60     14     19   1   0.16   1.5   -0.0    0.2   0.332
   17    1  1200   45      8     19   1   0.41   3.0    0.9    2.1   0.380
   18    1  1245   15     14     19   1   0.16   3.0    0.0    0.5   0.361
   19    1  1300   60     14     19   1   0.16   2.3    1.8    2.2   0.416
   20    1  1400   60     14     19   1   0.16   1.7    5.4    5.7   0.910
   21    1  1500   60     14     19   1   0.16   1.9    0.1    0.4   0.359
   22    1  1600   60     14     19   1   0.16   3.0    0.0    0.5   0.307
   23    1  1700    5     27     19   1   0.29   3.3    7.9    8.9   0.359
   24    1  1705   25      4     19   1   0.49   3.3    1.6    3.2   0.417
   25    1  1730    1     26     19   1   0.39   3.3    0.0    1.3   0.416
   26    1  1731   29     15     19   1   0.12   3.3    0.0    0.4   0.371
   27    1  1800   45     15     19   1   0.12   3.7   -0.0    0.4   0.320
   28    1  1845    1     26     19   1   0.39   3.7    1.1    2.5   0.322
   29    1  1846   14      4     19   1   0.49   3.7    8.8   10.6   1.047
   30    1  1900   15      4     19   1   0.49   2.6    2.4    3.7   1.008
   31    1  1915    2     26     19   1   0.39   2.6    3.7    4.7   1.006
   32    1  1917   43     12     19   1   0.28   2.6   -0.0    0.7   0.503
   33    1  2000   30     12     19   1   0.28   2.1    0.3    0.9   0.455
   34    1  2030    5     26     19   1   0.39   2.1    2.2    3.0   0.462
   35    1  2035   10      4     19   1   0.49   2.1    2.3    3.3   0.479
   36    1  2045    2     26     19   1   0.39   2.1    3.5    4.3   0.485
   37    1  2047   13      8     18   1   0.41   2.1    0.2    1.1   0.468
   38    1  2100   60      8     18   1   0.41   0.7    2.6    2.9   0.524
   39    1  2200    2     26     19   1   0.39   0.3   -0.0    0.1   0.518
   40    1  2202   18      4     19   1   0.49   0.3    8.5    8.6   0.884
   41    1  2220    2     26     19   1   0.39   0.3   -0.1    0.0   0.871
   42    1  2222   38     15     19   1   0.12   0.3    0.2    0.2   0.438
   43    1  2300   30     15     19   1   0.12   0.3    3.0    3.0   0.480
   44    1  2330   30     15     19   1   0.12   0.3    4.8    4.8   0.577
   45    2     0   60     15     19   1   0.12   0.3   -0.0    0.0   0.301
   46    2   100   60     15     19   1   0.12   0.3    0.4    0.4   0.266
   47    2   200   60     15     19   1   0.12   0.3    0.1    0.1   0.226
   48    2   300   60     15     19   1   0.12   0.3    0.2    0.2   0.208
   49    2   400   60     15     19   1   0.12   0.3   -0.0    0.0   0.185
   50    2   500   58     15     19   1   0.12   0.4    2.2    2.2   0.350
   51    2   558    2     15     19   1   0.12   0.4    1.3    1.3   0.350
   52    2   600   60     15     19   1   0.12   0.6   -0.1    0.0   0.269
   53    2   700   60     15     19   1   0.12   0.5    1.5    1.6   0.323
   54    2   800   50     15     19   1   0.12   0.8    0.4    0.5   0.290

 Figure  1.   Excerpt  from the exposure event sequence developed for
             Cohort 8C  indicating the CO exposure estimated for
             each  event and  the resulting COHb level.
                                17-20

-------
                             MODEL VALIDATION

      Resource limitations prevented PEI  from  validating  the model during
its developmental  stage.   Appropriate data  bases for validating portions of
the methodology  exist,  the most  useful  being COHb measurements obtained
from subjects of  the  1982-83  Denver study discussed previously  (Reference
1).  This  data  base  has  been  used in efforts to validate SHAPE,  a similar
model  for  estimating  COHb  levels  developed  by  Ott  (Reference  7).
Unfortunately,  the Denver study sample did  not  include children,  elderly
persons,  or smokers;  COHb levels  were  not measured during the summer; and
the activity diary data obtained from each subject  do  not  include breathing
rates.  Data  from other  studies,  as yet  unidentified,  would be  needed to
completely validate the model  presented here.


                                 REFERENCES

1.    Johnson, T.  A Study of Personal Exposure to Carbon Monoxide in
      Denver, Colorado.  Prepared by PEI Associates, Inc., for U.S.
      Environmental Protection Agency, Environmental Monitoring Systems
      Laboratory, under Contract No. 68-02-3755.  December 1983 (revised
      June 1984).

2.    Johnson,  T., J.  Capel,  and  L.  Wijnberg.   Selected Data  Analyses
      Relating  to Studies of  Personal  Carbon  Monoxide  Exposure  in Denver
      and  Washington.    Prepared  by  PEI  Associates,  Inc.,  for  U.S.
      Environmental  Protection  Agency,  Environmental  Monitoring  Systems
      Laboratory,  under Contract No.  68-02-3496.   February  1985.

3.    Johnson, T.  A  Study  of Human  Activity Patterns in  Cincinnati,  Ohio.
      Prepared  by PEI  Associates,   Inc.,  for  Electric  Power  Research
      Institute,  Environmental  Assessment  Department,  under Contract No.
      RP940-06.   November 1987.

4.    Johnson,  T., J.  Capel,   L.  Wijnberg, and  R.   Paul.   Environmental
      Strategies  Project  Phase  One;  Estimated Carboxyhemoglobin  Levels of
      Selected Populations-at-Risk in the Denver Urbanized  Area.    Prepared
      by PEI  Associates,  Inc.,  for  U.S.  Environmental  Protection Agency,
      under Contract  No.  68-02-3890.  September 1987 (draft).

5.    Biller, W., and H. Richmond.  Sensitivity Analysis on Coburn Model
      Predictions of COHb Levels Associated With Alternative CO Standards.
      Prepared by Dr. William F. Biller, 68 Yorktown Road, East Brunswick,
      New Jersey and the U.S. Environmental Protection Agency, under
      Contract No. 68-02-3600.  November 1982.

6.    Johnson, T., and R. A. Paul.   The  NAAQS  Exposure  Model  (NEM) Applied
      to Carbon Monoxide.   Prepared  by PEI  Associates,   Inc.,  for U.S.
      Environmental Protection  Agency,  Office  of  Air  Quality Planning and
      Standards.  December 1983.
                                   17-21

-------
7.    Ott, W., J. Thomas, D. Mage, and L. Wallace.  Validation of the
      Simulation of Human Activity and Pollutant Exposure (SHAPE) Model
      Using Paired Days from the Denver, Colorado, Carbon Monoxide Field
      Study.  Atmospheric Environment (in press).


                              ACKNOWLEDGMENT
      The work described in this report was funded by the U.S.
Environmental Protection Agency under EPA Contract No. 68-02-3890.  Mr. Tom
Donaldson was the EPA Project Officer and Mr. Steve Frey was the Work
Assignment Manager.  Mr. David Dunbar served as the Project Director for
PEI.  Mr. Ted Johnson was the PEI Project Manager and developed the general
approach to exposure modeling described in this report.  He also defined
the cohorts to be analyzed and developed the cohort-specific input
parameters used by the Coburn model.  Mr. Jim Capel developed the computer
programs that constructed the exposure event sequence for each cohort.  Dr.
Louis Wijnberg developed the computer programs that determined the
microenvironment factors.  Mr. Roy Paul developed the program that
calculated cohort-specific carboxyhemoglobin levels using the Coburn
Equation Algorithm.  Ms. Alicia Ferdo and Mr. Joe Steigerwald conducted
literature searches that provided data for estimating the input parameters
of the Coburn model.

      The  authors would  like  to  thank  Mr.  Tom  Walker  of  Industrial
Economics,  Inc.,  and  Mr.  Ken Lloyd of  the U.S. Environmental  Protection
Agency  for  their  guidance and  helpful  recommendations.   The   authors are
indebted to  Dr.   Ron Wyzga of  the Electric  Power Research Institute for
providing data obtained from the Cincinnati Activity  Diary Study.
                                  17-22

-------
                THE INFLUENCE OF DAILY  ACTIVITY  PATTERNS ON
                  DIFFERENTIAL EXPOSURE TO CARBON MONOXIDE
                            AMONG SOCIAL  GROUPS
                  by:   Margo  Schwab
                       Graduate School of Geography
                       Clark  University
                       Worcester, MA  01610
                                 ABSTRACT


      What people  do,  where they go,  and when  (their  activity patterns)
have a pronounced  effect on their  exposure to  pollutants.   Because such
activity  patterns vary systematically  among  population subgroups  defined  by
sex, work status,  age, and  income,  it  is  hypothesized that exposure  varies
among such groups.  The research  presented here  used  the  data collected
during the  EPA's  study  of  personal exposure  to  carbon  monoxide  (CO)  in
Washington,  DC to examine the relationship between  sociodemographic factors
and  exposure.    The  results  of  a  statistical  comparison  of exposure
characteristics  among  male  workers, male nonworkers,  female workers, and
female nonworkers   show  that  working  men  (low-exposure jobs) have the
highest exposures,  whereas nonworking  women have the lowest exposures.   In
addition,  although  automobiles  are the primary source of  CO, travel  is not
a significant contributor to total exposure for some groups.

      This research was supported  by the National Science  Foundation  (grant
SES-8617894).
                                   18-1

-------
                               INTRODUCTION


      An important finding in the field of total human exposure research is
that the locations that individuals visit, when they visit them, and what
they do there affect their exposure to air pollutants.  Travel behavior
and time budget  research show that these  movement/activity patterns  vary
systematically  among  population  subgroups  defined  by  such  personal
characteristics as age, sex, work status,  and  income.   The  study presented
here links, for  the  first  time,  sociodemographic  characteristics,  activity
patterns,  and  exposure,  testing the  hypothesis that  differences in the
activity patterns of  social groups lead to differential  exposure  to  carbon
monoxide (CO).    The  data collected during  the  EPA's  field  investigation  of
personal exposure to CO  in  Washington,  DC  were  used in  a  statistical
comparison of  exposure characteristics among  population subgroups.   This
paper is divided into three parts:   a  discussion  of the research context  of
the differential  exposure hypothesis,  a description  of  the  analysis method,
and a presentation of the results.


                                  CONTEXT
      Two diverse fields of  study  form the basis of the  perspective  taken
in this research:   the recent focus of  environmental  assessment on  total
human exposure  to pollutants  and  the interests  of  urban geographers  and
planners in the  spatial and temporal  patterning of social groups.

TOTAL HUMAN EXPOSURE ASSESSMENT

      Because fixed-site monitors  do not accurately characterize the  wide
range of pollutant  concentrations  people routinely come  into  contact  with,
researchers have begun to measure the actual  exposure of individuals.   Much
of the recent analysis  focuses on  "total integrated exposure"  (e.g.,  1,2).
Here each person's  exposure  is.a function  of the activities  in  which  that
individual  engages over the course of a day,  thus the time spent in  contact
with various pollutant  concentrations.   The discrete equation  to describe
this concept defines total exposure  for person i as the  sum of each of the
products of the pollutant  concentrations  in microenvironments j and  the
length of time the individual  spends  in j:
      \[ E(i) = \sum_{j=1}^{J} C(j) \, t(ij) \]

where E(i)  = total  exposure of person  i  over the
              time period of interest,
      C(j)  = concentration experienced in
              microenvironment j,
      t(ij)  =  time  spent  by person i in microenvironment
              j,  and
         J  =  total number of microenvironments occupied
              by  person i over the period of interest (1).
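
      The summation can be illustrated with a short computational sketch.
The Python fragment below is only an illustration of the equation, not part
of the original analysis; the function name and the sample concentrations
and durations are hypothetical.

```python
from typing import Iterable, Tuple


def total_exposure(visits: Iterable[Tuple[float, float]]) -> float:
    """Sum C(j) * t(ij) over the microenvironments one person occupied.

    Each visit is (concentration in ppm, time in hours), so the result
    is expressed in ppm-hours.
    """
    return sum(concentration * hours for concentration, hours in visits)


# Hypothetical day: home, commute, office (values invented for illustration).
day = [(1.5, 14.0), (5.0, 1.5), (1.0, 8.5)]
print(total_exposure(day))
```

Two people can reach the same total from very different lists of visits,
which is the distinction between exposure levels and exposure profiles
drawn in the text.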

      This new focus  on  the  individual  has  required  incorporating the study
of  activity  patterns  into  environmental   assessment.    Indeed,   recent
research  has  established that  the activities a person participates in do
influence exposure.   For example, the more time spent  in microenvironments
with high concentrations of  combustion-related pollutants (i.e., CO, N02,
RSP), such  as  travel  (3,4),  cooking with a gas stove  (5,6),  and being in
the  presence  of  cigarette smokers  (7,8),  the  higher will be  a person's
total exposure.

      But the  nature of  the  relationship between daily activity patterns
and  exposure  is  complex.   Each individual's  activity pattern is unique;
whereas several people might have  similar  exposure  levels  (i.e.,  the same
amount of exposure in a  given  day),  their  exposure  profiles  (that is,  the
time/space sources of that exposure) may vary tremendously.  For instance,
imagine three  individuals  with the same total  daily  exposure.   The first
person receives most  of  her  exposure during a  long commute  to work, whereas
the  second  person  receives most  of  his  exposure from  the  presence  of
smokers on  the job,  and  yet  another receives  the majority of her exposure
from home sources (e.g., gas  stove,  clothes dryers,  and  space heaters).
Such  differences  in  exposure  profiles highlight  the  need to  consider
differences in activity  patterns among individuals when formulating models,
health  research  conclusions,   and  exposure  reduction strategies.    But,
whereas   it  is  useful   from  an  analytic  standpoint to focus  on  the
individual, it would be enormously complicated and expensive to implement
individual-level  policies.   Useful models and  feasible policy directives
are  based upon generalizations  about specific groups of people, places, or
activities.    The  methods  of  social  scientists  can  help  clarify  the
relationship  between  activity  patterns and exposure  and  thus  provide the
foundation  for useful generalizations about  the  exposure  of population
subgroups.

ACTIVITY PATTERNS

      Although the idea  of studying  activity  patterns  is  new to the field
of environmental  assessment,  geographers and  planners  have  a long tradition
of analyzing  the  nature  of movement  patterns.   Analysis of data collected
via  a variety  of  survey  instruments,  in several countries, and  at multiple
times  and places  in  the U.S.  has consistently  shown that  the activity
patterns  of population subgroups, defined  by  such personal characteristics
as sex,  age,   income,  employment status,  and race,  vary  systematically.
Specifically,  because  the constraints  operating on and the  opportunities
available to  an  individual   are defined to some extent by their personal
characteristics, the activities of social groups tend to exhibit distinct
patterns.   Women,  for instance,  travel shorter distances to  work  and to
other activities  (9-14),  spend  less time in leisure  activities  (15,16), and
spend fewer hours in waged work  (17,18) than do men, even  after  employment
status has been controlled.  Both the number of trips and the timing of
that travel differ between those who work outside the home full time and those
 who  do not work outside  of the home  (10,19,20).  In addition, nonworkers
 tend  to  organize their activities around the home location,  whereas workers
 organize their activities  around  both the home  and  the work place (21).
 The  elderly  make fewer trips (22,23),  travel shorter distances,  use the bus
 more often  (24),  and have  more  leisure time  (15)  than does the  younger
 adult population cohort.  Lower income groups  take fewer trips (25), travel
 shorter  distances for shopping  (26,27), and have lower automobile  ownership
 rates than do  higher income groups.   These kinds  of differences  imply that
 the   exposure   levels  and   profiles  also  differ  among  such  population
 subgroups, thereby  allowing useful  generalizations  to  be  made.


                                   METHOD

 DATA

       The  study of human exposure has  been  hampered  by  the scarcity  of
 field data  on  the  pollutant concentrations  to which people are  exposed.
 Only since the early 1980's have portable personal exposure monitors
 (PEMs), which accurately record pollutant concentrations at low, ambient
 levels, become available for general use (28).  During the winter of
 1982-1983  the  EPA  applied many of the methods used by social scientists  to
 study  activity patterns  to  this  new  personal   monitoring  technology,
 conducting a large-scale  field investigation  of actual  individual  exposure
 to carbon monoxide.  In  each  of two  cities—Denver  and Washington, DC--a
 statistically  representative sample of the  noninstitutionalized,  nonsmoking
 population between  the ages of 18 and 65 filled out questionnaires, kept
 activity diaries,   and  carried PEMs  with  them  throughout  their  daily
 routines,  recording CO concentrations at a time resolution  of less than one
 minute.  Hartwell et al. (29) and Johnson (30) describe the details of the
 data collection processes in Washington and Denver,  respectively.

       The  data base resulting  from  this  study  is  the  largest   and most
 detailed available  on total human exposure to an air pollutant.   The EPA
 has  used it  to determine the  actual  frequency distribution  of  CO exposure
 in the population   (31),  to study  the relationship between  actual  personal
 exposure and exposures  based on fixed-site monitors  (31,32),  to  highlight
 high exposure  settings  (31,32),  and to verify  exposure models (33,34).  But
 the   richness   of  this disaggregate   data base  has not  yet  been  fully
 explored.   The present paper used the Washington,  DC component,  consisting
 of information on  one day's activities  for approximately  700  persons,  to
 study  the  time/space  paths  of  individuals with  different   personal
 characteristics, thereby investigating the implications of these paths for
 exposure.

 ANALYSIS TECHNIQUE

       The method was to compare exposure characteristics among social
 groups.    I  drew   on  the  travel  behavior tradition (e.g.,  9,11,35)  of
 grouping individuals based  on  personal characteristics assumed to  influence
 activity patterns, thus influencing exposure.  The results of previous
travel behavior studies, constrained by data availability,1 led to focusing
on groups formed on the basis of the role-related characteristics of work
status  and sex.  Work  status  was subdivided into three groups:   nonworkers,
workers  in  low-exposure jobs,  and workers  in  high-exposure  jobs  (e.g.,
taxi,  bus,  and truck drivers,  auto mechanics, police,  cooks,  and crane
operators).2  Figure 1 shows the groups and  their sample sizes.
        Nonworkers(a)           Low-exposure             High-exposure
                                 workers(b)               workers(c)
          /     \                 /     \                     |
       Males   Females         Males   Females        males and females
       (87)     (190)          (161)    (157)                (37)

Figure 1.   Sociodemographic grouping scheme with weighted sample sizes in
            parentheses.  (a) WORKTIME less than or equal to two hours on
            the day sampled (workers sampled on weekends but not working
            that day, and nonworkers).  (b) WORKTIME greater than two hours
            on the day sampled (either weekdays or weekends).  (Those coded
            as workers and sampled on weekdays, but listing less than two
            hours of time as business, study, or on-the-job travel, were not
            included in the analysis.)  (c) Males and females grouped
            together to maintain acceptable subsample sizes.
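
      The grouping rules in the notes to Figure 1 can be written out as a
small decision function.  The Python sketch below is hypothetical: the
function and argument names are invented, and the treatment of a weekday
worker with exactly two hours of WORKTIME (excluded here) is an assumption,
because the notes do not settle that boundary case.

```python
from typing import Optional


def assign_group(sex: str, worktime_hours: float, high_exposure_job: bool,
                 coded_worker: bool = False, weekday: bool = False) -> Optional[str]:
    """Return one of the five group labels, or None if the person is excluded."""
    if worktime_hours <= 2.0:
        # Coded workers sampled on a weekday with little recorded work time
        # were left out of the analysis (boundary at two hours assumed).
        if coded_worker and weekday:
            return None
        return "nonworking men" if sex == "M" else "nonworking women"
    if high_exposure_job:
        # Men and women are pooled to keep the subsample size acceptable.
        return "high-exposure workers"
    return "low-exposure men" if sex == "M" else "low-exposure women"


print(assign_group("F", 0.0, False))
print(assign_group("M", 8.0, False, coded_worker=True, weekday=True))
```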
      The first step in the analysis was exploratory, including an
examination  of  the  descriptive  statistics  and  the  shape  of  the
distributions of each variable in  Table  1  by  each  group.   I then tested the
statistical  significance  of the  null  hypothesis  that  for  each of  the
variables "there is  no difference among the  five  sociodemographic groups."
Parametric ANOVAs with a posteriori Scheffe tests were performed on the
transformed  variables  (positively-skewed  variables  were  converted  to
natural   logarithms  where  appropriate),   backed  up  by  nonparametric
Kruskal-Wallis tests on  the raw  variables.
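
      The two omnibus tests named above can be sketched in a few lines.
The Python fragment below is a minimal illustration under stated
assumptions: it ignores the sampling weights used in the study, applies no
tie correction to the Kruskal-Wallis statistic, and omits the Scheffe
comparisons.  The function names and the data are hypothetical.

```python
import math
from typing import List


def anova_f(groups: List[List[float]]) -> float:
    """One-way ANOVA F statistic (between-group over within-group mean square)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_wallis_h(groups: List[List[float]]) -> float:
    """Kruskal-Wallis H statistic using average ranks, without tie correction."""
    pooled = sorted(x for g in groups for x in g)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    n = len(pooled)
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)


# Hypothetical positively skewed exposures for three groups.
raw = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
logged = [[math.log(x) for x in g] for g in raw]
print(anova_f(logged))        # parametric test on the transformed values
print(kruskal_wallis_h(raw))  # nonparametric test on the raw values
```

Running the parametric test on logarithms and the rank-based test on raw
values mirrors the pairing described in the text: the second test does not
depend on the transformation being appropriate.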
1. Income and  race  data  were not collected.  Few elderly  or  young people
were  surveyed.     Previous  studies  have  shown  that role-related
characteristics are much better determinants of activity patterns  than are
economic characteristics.

2. These groups were defined partially on  the  basis  of  the original  codes
assigned by  Research Triangle  Institute (RTI) during the  data  collection
and  partially  on  the basis of new codes and grouping criteria  formulated
specifically for this study.  See notes in Figure 1.
                     TABLE 1.  DEFINITION OF VARIABLES

Code              Description

MAX1HR+           Maximum one-hour exposure.
MAX8HR++          Maximum eight-hour exposure.
HOMEXP*           Exposure while inside the home.
WORKEXP*          Exposure while on the job (includes activities coded as
                  business, study, and on-the-job travel).
TRAVEXP*          Exposure while traveling (includes activities coded as
                  parking but excludes activities coded as on-the-job
                  travel).
PHOME**           Proportion of total exposure from home.
PWORK**           Proportion of total exposure from work.
PTRAV**           Proportion of total exposure from travel.
HOMETIME          Total time (hours) at home (inside, all activities).
WORKTIME          Time (hours) in waged work (includes time in activities
                  coded as business, study, and on-the-job travel).
LEISTIME          Time (hours) in leisure activities.
TRAVTIME          Time (hours) traveling (excludes on-the-job travel,
                  includes parking).
HOMECO#           Mean CO concentration associated with home activities.
WORKCO#           Mean CO concentration associated with work activities
                  (see above work variables).
TRAVCO#           Mean CO concentration associated with travel activities
                  (see above travel variables).

+     Calculated by maximizing over all available one-hour averages of CO
      exposure for each individual (provided by RTI).  Units are ppm/hour.
++    Calculated by maximizing over all available eight-hour averages of CO
      exposure for an individual (provided by RTI).  Units are ppm/hour.
*     Calculated by summing equation 1 over the activity records coded as
      home (or work or travel).  In particular, exposure levels were
      first calculated for each activity record by multiplying the number
      of minutes in that activity by the mean CO concentration recorded
      during that activity.  Second, the individual exposure levels were
      summed across the chosen activities.  Units are ppm-hours.
**    Calculated by dividing HOMEXP, TRAVEXP, or WORKEXP by total daily
      exposure (from equation 1) and then multiplying by 100.
      Units are percent.
#     Calculated by summing the CO concentrations associated with each
      activity record coded as home (or work or travel, for WORKCO and
      TRAVCO, respectively) for each individual and then dividing by the
      number of activity records.  Units are ppm.

3. The sampling weights were applied to the data during all stages of the
analysis.
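
      The derived variables in Table 1 can be illustrated with a short
sketch.  The Python fragment below is hypothetical: the record layout, the
function name, and the values are invented; minutes are converted to hours
so that exposure comes out in ppm-hours; and the total is taken over the
listed categories only, whereas the study's total also covers other
activities.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, float, float]  # (activity category, minutes, mean CO in ppm)


def exposure_profile(records: List[Record]):
    """Return exposure (ppm-hours), share of total (%), and mean CO (ppm) by category."""
    exposure: Dict[str, float] = defaultdict(float)
    co_sum: Dict[str, float] = defaultdict(float)
    count: Dict[str, int] = defaultdict(int)
    for category, minutes, mean_co in records:
        exposure[category] += mean_co * minutes / 60.0  # e.g. HOMEXP
        co_sum[category] += mean_co
        count[category] += 1
    total = sum(exposure.values())  # assumed to be greater than zero
    share = {c: 100.0 * e / total for c, e in exposure.items()}  # e.g. PHOME
    mean_conc = {c: co_sum[c] / count[c] for c in count}  # e.g. HOMECO
    return dict(exposure), share, mean_conc


# One hypothetical person's activity records for a day.
records = [("home", 600, 1.0), ("travel", 60, 4.0),
           ("work", 480, 1.5), ("home", 300, 2.0)]
exposure, share, mean_conc = exposure_profile(records)
print(exposure, share, mean_conc)
```

Note that the mean concentration is an unweighted average over activity
records, as in the table note, so a short record counts as much as a long one.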


      The analysis presented here4 focused on the following questions:

      1)    Do exposure levels or exposure profiles differ between
            workers and nonworkers?

      2)    Within each employment-status group, do exposure levels or
            profiles vary between men and women?

      3)    What is the role of time use versus CO concentrations in
            differential exposure?

4.  Other parts of this research project not presented in this paper include
an assessment of the role of micro-level activities (e.g., being in the
presence of smokers and using gas appliances) on exposure patterns and an
examination of the relationship between residential location, CO
concentrations, and exposure.  The results of these analyses will be
presented in a separate paper.


                                  RESULTS

EXPOSURE LEVELS

      I used two traditional measures, maximum one-hour (MAX1HR) and
maximum eight-hour (MAX8HR) exposure levels, to determine whether the
amount of exposure a person receives differs among the five
sociodemographic groups set out above.  The results, displayed in Table 2,
show that both the ANOVA and Kruskal-Wallis tests yield statistically
significant results (p<.01); there are differences among the five
population subgroups on both of the exposure level variables.  Scheffe
tests, performed to identify the nature of the differences detected by the
ANOVA, show the following statistically significant (p<.05) results:

      1)  MAX1HR  and MAX8HR are higher for  workers  in  high-exposure jobs
            than they are for each of  the other social groups;

      2)    MAX1HR is lower for nonworking  women than it  is  for either  men
            or women working in low-exposure  jobs; and

      3)    MAX8HR  is  lower for nonworking  women than it  is  for men  in
            low-exposure jobs.


      This  analysis  shows  that the level  of CO to  which individuals  are
exposed varies with personal characteristics.
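
      Table 1 states that MAX1HR and MAX8HR were obtained by maximizing
over the available one-hour and eight-hour averages.  One way such a
maximum could be computed from equally spaced readings is sketched below in
Python.  Treating the averages as running (sliding) windows is an
assumption, since the source does not say how RTI formed them; the function
name and the readings are hypothetical.

```python
from typing import Optional, Sequence


def max_window_mean(readings: Sequence[float], window: int) -> Optional[float]:
    """Largest mean over any run of `window` consecutive readings.

    Returns None when there are too few readings to fill one window.
    """
    if window <= 0 or len(readings) < window:
        return None
    running = sum(readings[:window])
    best = running
    for i in range(window, len(readings)):
        # Slide the window one reading forward.
        running += readings[i] - readings[i - window]
        best = max(best, running)
    return best / window


# Hypothetical readings in ppm; a window of 2 stands in for one hour.
print(max_window_mean([1, 1, 5, 5, 1, 1], 2))
```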


TABLE 2.   RESULTS OF SIGNIFICANCE TESTS OF EXPOSURE LEVELS AND
                ACTIVITY DIMENSIONS OF EXPOSURE AMONG GROUPS

                             Parametric                Nonparametric*
Variable   Transform     Test      Stat.   Prob.     Test     Stat.   Prob.

MAX1HR     nat. log.     ANOVA+    14.88   .000      K-W++    41.89   .000
MAX8HR     nat. log.       "       12.81   .000        "      43.13   .000
HOMEXP     nat. log.       "       13.56   .000        "      59.98   .000
WORKEXP#   nat. log.       "       27.14   .000        "      60.34   .000
TRAVEXP    nat. log.       "       30.29   .000        "     102.72   .000
PHOME      none            "      102.46   .000        "     254.03   .000
PTRAV      none            "       19.78   .000        "     115.29   .000
PWORK#     none            "       37.60   .000        "      48.38   .000

*  Nonparametric tests performed on untransformed (raw) data
+  One-way analysis of variance (F value)
++ Kruskal-Wallis test (chi-square value)
#  Tests of work exposure only performed between groups of workers

EXPOSURE PROFILES

      But  the  focus  of  the  total   human  exposure  research  field  on
individuals  as  receptors of  pollutants  implies the  need  to go beyond  an
analysis of exposure levels,  toward an assessment  of  how people's  activity
patterns influence their exposure.  To this end,  I  examined  the time/space
sources of exposure.  This new approach compares exposure profiles by
analyzing differences in both the amount and the proportion of an
individual's total  exposure that is attributable to home,  work, and travel
activities.

      A  comparative analysis  of  these  activity  dimensions  of  exposure
documents  the  way  in  which  exposure  profiles   vary  among   the
sociodemographic  groups.    Both  the parametric  and  the nonparametric tests
yield significant (p<.01)  differences among the groups for home,  work,  and
travel exposure (Table 2).  The a posteriori Scheffe tests find the following
specific differences, significant at  p<.05, between groups:

      1)    workers  have   higher amounts  of   travel  exposure  and get  a
            greater  proportion   of  their  exposure  from  travel   than  do
            nonworkers;

      2)    nonworkers  have higher  amounts of home exposure and get  a
            greater proportion of their total exposure from the home;

      3)    the  amount  of exposure  from travel  is  lower  for nonworking
            women than  it  is for nonworking men;

      4)    workers in  high-exposure jobs have  higher work  exposure levels
            and get a greater proportion of their  total  exposure from work
            than  do either males or females  in  low-exposure jobs;  and

      5)    the  proportion of  exposure  from  travel  and  home  sources is
            greater for workers  in  low-exposure jobs than it is for those
            in high-exposure jobs.

Figure  2 illustrates these differences.   The  bar  graph shows the average
contribution of each activity to a person's  total exposure for each  group.

      Time/space  sources of exposure do vary with personal  characteristics;
there  are  important  differences  in exposure profiles  among population
subgroups defined by work  status and sex.   The results  also highlight the
complexity  of  the  relationship  between personal characteristics,   activity
patterns, and  exposure.   The interpretation of the pattern of differences
among the social groups in exposure  associated with each activity is not
clear.   In addition, it is important to point  out that Figure 2 was derived
from  group  averages,   but there  is high  within-group  variation.    The
standard deviation  is almost as  large as  the  mean  for each variable,  even
after controlling for social group (Table 3).  The next section of
analysis was designed to investigate reasons for this high person-to-person
variation and for the pattern of differential exposure reported above.  The
examination focuses on the component parts of exposure: time use and CO
concentrations.

[Figure 2 is a bar chart titled "Percent of Total Exposure from Each
Activity," with one bar each for nonworking women, nonworking men, women in
low-exposure jobs, men in low-exposure jobs, and high-exposure workers, on
an axis running from 0 to 120.]

Figure 2.  Average exposure profiles.  Each bar shows the proportion of
    exposure the average individual in that social group receives from
    home, work, travel, and other sources.

          TABLE 3.  DESCRIPTIVE STATISTICS FOR EXPOSURE VARIABLES

Group             MAX1HR+     MAX8HR+     HOMEXP++    TRAVEXP++   WORKEXP++

Men, nonwork       5.37*       2.21        21.20        5.04         --
(87)               6.47**      2.85        30.78        6.40
                  (5.86)***   (2.86)      (34.15)      (5.41)

Men, low-exp       5.64        2.40         7.30        8.71        7.17
work (161)         6.97        2.90        14.67       10.72       12.61
                  (5.52)      (2.72)      (21.91)      (9.91)     (20.30)

High-exp work     10.88        5.15        14.60        8.31       34.21
(37)              20.55        7.11        17.17       11.50       53.83
                 (21.79)      (5.69)      (14.35)     (16.72)     (47.84)

Women, nonwork     4.56        2.35        23.41        2.34         --
(190)              4.62        2.20        28.49        4.24
                  (2.96)      (1.53)      (24.76)      (5.22)

Women, low-exp     5.19        1.86        11.29        7.06        6.09
work (157)         6.45        2.53        17.31        9.16       10.15
                  (5.38)      (2.37)      (21.75)      (7.90)     (17.00)

Group             PHOME       PTRAV       PWORK

Men, nonwork      80.65*      12.53        0
(87)              72.95**     18.61        0
                 (22.77)***  (17.90)      (0)

Men, low-exp      30.38       26.53       24.96
work (161)        34.20       32.81       28.30
                 (26.35)     (23.27)     (20.01)

High-exp work     21.05        8.36       53.61
(37)              27.10       14.99       57.11
                 (20.60)     (15.80)     (25.66)

Women, nonwork    83.98        8.79        0
(190)             77.69       16.33        0
                 (25.29)     (21.18)      (0)

Women, low-exp    41.08       22.36       19.84
work (157)        39.67       29.52       25.48
                 (24.79)     (19.58)     (18.93)

*    Median
**   Mean
***  Standard deviation
+    in ppm/hour
++   in ppm-hours

TIME USE

      Exposure is a  function of the time spent  in  an activity and the  CO
concentration faced  during that time.  It is  thus  appropriate to ask  how
time use varies  among nonworking women, nonworking men,  women working  in
low-exposure jobs, men  working  in  low-exposure jobs,  and those working  in
high-exposure jobs.    A statistical  comparison of the total amount of time
spent by each person at home, traveling, at work, and  in leisure  among  the
social  groups highlights the same type  of differences  in activity patterns
documented  by previous  studies.   Nonworkers  spend  significantly  more time
at home and  less time  in travel than do workers or  males.   Controlling  for
both work status  and sex  reveals that  male nonworkers spend more time  in
both leisure and  travel,  but less  time at home than  do female  nonworkers.
These results suggest that at least one reason  for  the  differences between
workers and nonworkers and between  male and  female nonworkers  in the amount
of exposure  from travel and  from home  is differences  in the amount of time
spent  in  these activities.   This  evidence  supports  the  hypothesis that
differences  in time use among groups lead to differences in exposure.   But
the differences in exposure levels  are not  as  strong as differences in time
use would suggest.  Another important finding  is that  the high within-group
variation on the  exposure variables is not due to time use; there is  low
interperson  variation  on each  of the time use variables,  especially  after
sex and work status  have been controlled (Table  4).
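
      The contrast between the exposure and time use variables can be
expressed with the coefficient of variation (standard deviation divided by
the mean).  The Python sketch below applies it to the values reported for
nonworking men in Tables 3 and 4 (HOMEXP and HOMETIME).  The function name
is hypothetical, and the choice of this statistic is an illustration rather
than the method used in the study.

```python
def coefficient_of_variation(mean: float, sd: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return sd / mean


# Nonworking men: HOMEXP mean 30.78, SD 34.15 (ppm-hours);
# HOMETIME mean 20.22, SD 2.77 (hours).
cv_exposure = coefficient_of_variation(30.78, 34.15)
cv_time = coefficient_of_variation(20.22, 2.77)
print(round(cv_exposure, 2), round(cv_time, 2))
```

A ratio above one for home exposure alongside a much smaller ratio for time
at home is consistent with the statement that time use alone does not
account for the within-group variation in exposure.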

    TABLE 4.  DESCRIPTIVE STATISTICS FOR TIME USE VARIABLES

Group              HOMETIME+   TRAVTIME+   WORKTIME+   LEISTIME+

Men, nonwork        20.47*       1.51        0           4.15
(87)                20.22**      1.81                    4.76
                    (2.77)***   (1.26)                  (4.03)

Men, low-exp        13.75        1.86        7.70        1.30
work (161)          14.24        1.98        7.52        1.85
                    (2.77)      (1.01)      (2.01)      (2.15)

High-exp work       13.95        1.23        8.15        0.20
(37)                14.25        1.81        8.42        1.34
                    (2.55)      (1.55)      (2.12)      (2.12)

Women, nonwork      21.99        0.68        0           2.73
(190)               22.66        1.19                    3.29
                    (2.23)      (1.36)                  (2.97)

Women, low-exp      14.27        1.60        7.36        1.26
work (157)          14.86        1.87        6.99        1.62
                    (2.83)      (1.09)      (2.16)      (1.69)

*    Median
**   Mean
***  Standard deviation
+    in hours

CO CONCENTRATIONS

      A  new  set   of variables  was  created  for  the   analysis  of CO
concentrations,  the other component of exposure.   Whereas  previous  analyses
of this  data  set  use PEM readings as  the  unit  of analysis (31,32),  this
study  uses  the individual.   For  each person  I  calculated  the  mean CO
concentration associated with all  occurrences of a given activity  (home,
work,  and travel).   Such measures maintain  the integrity of differences
among  individuals.   The next step was  to  compare mean CO concentrations
across both  groups  and across activities.

      A possible reason why differences in exposure between the  pairs of
sociodemographic groups  are  not  as strong  as  differences  in  their  time use
would suggest is that CO concentrations are similar across activities.  A
comparison of the  mean  CO  concentrations across activities (ANOVA  on the
natural logarithm  of the mean CO  concentration  in  each activity)  yields a
significant  statistic (F=163.39;  p=0.000).   Scheffe tests  demonstrate  that
the  CO concentrations  associated  with travel  are significantly  (p<.05)
higher  than  are  the concentrations  associated  with any  of  the other
activities.

      A more surprising finding  is  that the CO concentrations  associated
with home and low-exposure  work  places  do not differ significantly  from one
another  (t=-0.66;  p=0.51).   This similarity in CO concentrations explains,
in  part,  why  MAX8HR  does  not  differ  markedly  between  workers   and
nonworkers,  even though  time use varies between these groups.

      All of the CO concentration variables exhibit high within-activity
variation, even after controlling for social group (Table 5), suggesting
the  reason  for  both the high within-group variation in  exposure and the
lack of a significant difference  between the concentrations associated  with
home and work.

      Another possible  reason for the pattern of exposure profiles found
above  is that mean CO concentrations  associated with each  activity  category
differ across sociodemographic groups as a  result  of subtle  differences  in
activities while at home,  at  work,  or traveling.   For example, women may
spend more of their time at  home  cooking, doing laundry, or in  the  vicinity
of gas  appliances  than  do men,   leading them to face higher CO levels at
home than do men.   More frequent travel during rush hours and  on heavily
traveled routes may  lead workers  to  face higher CO levels while traveling
than nonworkers.   Indeed ANOVAs  of the mean CO concentrations  associated
with  the home  and travel  activity  categories across social  groups  are
significant (Table 6).  A posteriori tests show two interesting
relationships.

      First, CO concentrations associated with the home are significantly
(p<.05)  higher  for male nonworkers than they are  for  male workers.   The
presence of a difference between the male employment-status groups but a
lack of  a difference between  the female work-status groups could reflect
role-related differences between the sexes--a female's duties within the
home are likely to be similar regardless of work status, whereas a male's
are not.

           TABLE 5.  DESCRIPTIVE STATISTICS FOR CO CONCENTRATION
                                 VARIABLES

Group              HOMECO+     TRAVCO+     WORKCO+

Men, nonwork        1.62*       2.65         --
(87)                1.79**      3.20
                   (1.45)***   (2.21)

Men, low-exp        0.75        4.10        1.22
work (161)          1.24        5.13        1.73
                   (1.67)      (5.15)      (2.45)

High-exp work       1.56        3.42        4.69
(37)                1.58        7.17        6.06
                   (1.13)     (12.14)      (4.91)

Women, nonwork      1.48        3.21         --
(190)               1.60        3.23
                   (1.29)      (2.01)

Women, low-exp      1.24        3.87        0.81
work (157)          1.54        4.86        1.42
                   (1.91)      (3.64)      (1.78)

*    Median
**   Mean
***  Standard deviation
+    in ppm

             TABLE 6.   RESULTS OF SIGNIFICANCE TESTS OF CO
                CONCENTRATION BY SOCIODEMOGRAPHIC GROUP

                                 ANOVA              Kruskal-Wallis*
CO variable   Transform     F-stat.    Prob.        X2         Prob.

HOMECO        nat. log.      4.32      .002        23.22       0.000
TRAVCO        nat. log.     11.21      .000        28.04       0.000
LEISCO        nat. log.      1.62      .169        10.57       0.032
WORKCO**      nat. log.      1.13+     .258        -1.99++     0.047

*    Nonparametric tests performed on untransformed variables
**   Between male and female workers in low-exposure jobs
+    ... probability)
++   Mann-Whitney test (z statistic)

      Second,  CO concentrations  associated with travel are significantly
(p<.05)  higher for  workers than they are for nonworkers, regardless  of  sex.
As reasoned above, this  difference  may be related to  the joint effect  of
the timing of  trips and routes used by workers.

      This examination has highlighted the importance of within-activity
variations in  CO concentrations  in  explaining interperson and  intergroup
differences in exposure.

                         SUMMARY AND IMPLICATIONS

      This study  of the  relationship  between  personal characteristics,
activity patterns,  and exposure offers  useful insights  into the nature  of
human exposure to  air  pollutants.    For the  first  time,   it  has  been
empirically demonstrated  that  exposure characteristics vary  among  social
groups.   The  analysis  shows that differences in activity patterns (manifest
both by  differences  in  time use  and  by differences in CO concentrations
resulting from differences in micro-level  activities)  lead to differences
in the time/space sources of exposure.  Work status is a better
differentiator  of  exposure  than  is  sex,  but  there  are  also important
differences  between men  and  women.     Whereas travel  is  an important
contributor to exposure  for  workers,  especially  men,  the  home  is the major
time/space source  of exposure for  nonworkers,  especially women.   Among
nonworkers, men have higher travel exposures than do  women.

      Although this analysis  was  not able  to pinpoint  the extent to which
each  of  the   many personal  and  household  characteristics influences
exposure,   the  results highlight   the  value  of  incorporating   a
sociodemographic component into  exposure assessments.   It is  inadequate  to
formulate exposure  models based  on a prototypical  person;  the  individual's
social characteristics  affect  the probability that she/he will come  into
contact  with  a given pollutant  concentration.   It  is also  important  to
maintain the integrity of a person's total activity  pattern,  especially  as
such patterns  relate to  personal characteristics.

      In  addition,  the groundwork  has been laid for  future analyses  by
showing  that studying activity  patterns in terms of  time spent at home,  at
work, and in  travel  is  not sufficient  to explain  differential exposure
either  among  individuals or groups.    Differences  among people in their
micro-level activities lead CO concentrations to vary  greatly within  each
activity  category,   even  after controlling  for   social  group.    The
implications  of this  high interperson variation in CO  concentrations for
differential  exposure  require further research.

      Finally, because we now  know  that systematic  variations in  activity
patterns  among social  groups  do  lead to differential  exposure,  it  is
appropriate  to extend  the  analysis  by collecting data  on   the activity
patterns of high-risk  groups—particularly  young children, the elderly, and
low-income groups.

      In summary,  the perspective  and results  of this study have important
research implications:   the existence of  differential  exposure has  been
demonstrated,  the value of  monitoring at the  individual  level has  been
verified,  and directions for future research have been outlined.

      The  work described  in  this  paper was   not  funded  by  the  U.S.
Environmental  Protection  Agency  and,  therefore,   the  contents  do  not
necessarily  reflect  the  views of the  Agency  and no official  endorsement
should be inferred.
                                REFERENCES

1.     Ott,  W.   Concepts of human exposure  to  air pollution.     Environment
      International 7:  179-196, 1982.

2.     Duan, N.   Models for human exposure to  air pollution.  Environment
      International 8:  305-309, 1982.

3.     Ott,   W.   and  Willits,  N.H.    CO exposures of  occupants  of motor
      vehicles:  Modeling  the  dynamic response  of  the  vehicle.  SIMS
      Technical  Reports,   No.  48,   Department  of Statistics,  Stanford
      University,  1981.

4.     Peterson,  W.B.   and  Allen,  R.    CO  exposures to  Los Angeles area
      commuters.  Journal   of  the Air  Pollution  Control  Association  32:
      826-833,  1982.

5.     Spengler,  J.D.;   Ferris,  B.G.;  Dockery,  D.W.;  and  Speizer, F.E.
      Sulfur dioxide and nitrogen dioxide levels  inside  and outside homes
      and  the   implications  on  health effects  research.   Environmental
      Science and Technology 13: 1276-1280, 1979.

6.     Sterling, T. and  Sterling,  E.   Carbon monoxide levels  in kitchens  and
      homes with gas   cookers.  Journal  of the Air  Pollution  Control
      Association 29: 238-241, 1979.

7.     Repace,  J.  and Lowrey,  A.   Indoor air  pollution,  tobacco smoke,  and
      public health.  Science 208: 464-472,  1980.

8.     Spengler,  J.D.;   Treitman,  R.D.; Tosteson, T.D.;  Mage,  D.T.;  and
      Soczek,   M.L.     Personal  exposures  to  respirable particulates  and
      implications for air pollution epidemiology. Environmental  Science
      and Technology  19: 700-707, 1985.

9.     Hanson,   S.  and  Hanson, P.  Gender and  urban activity  patterns  in
      Uppsala,  Sweden.  Geographical  Review 70: 292-299,  1980.

10.   Hanson,   S.  and  Hanson,  P.    The travel-activity patterns of urban
      residents:    Dimensions   and  relationships  to  sociodemographic
      characteristics.   Economic Geography 57: 332-347,  1980.

11.   Hanson,  S.  and Johnston, I.   Gender differences  in work trip  length:
      Explanations and  implications. Urban Geography 6:  193-219, 1985.

12.   Madden, J.F.  Why women work closer to home.  Urban Studies 18:
      181-194,  1981.

13.   Black, J.   and Conroy,  M.   Accessibility  measures  and the  social
      evaluation  of  urban  structure.  Environment   and   Planning A  9:
      1013-1031,  1977.

14.    Federal  Highway Administration.   Personal travel in the U.S.,  Volume
      I:    1983-1984  nationwide  personal  transportation  study.  U.S.
      Department  of Transportation, Washington, D.C., 1986.

15.    Brail,   R.K.  and Chapin,  F.S.,  Jr.    Activity patterns  of  urban
      residents.   Environment  and Behavior 5: 163-192,  1973.

16.    Robinson,  J.P.;  Converse,  P.E.;  and  Szalai,  A.   Everyday life  in
      twelve  countries, in:  A. Szalai  (ed.), The Use of Time.  Mouton,  The
      Hague,  1972.

17.    U.S.  Department of Labor,  Bureau of Labor Statistics.  Perspectives  on
      women:   A data book.  Bulletin  2800.   U.S.  Government Printing  Office,
      Washington, D.C., 1980.

18.    Chapin,  F.S.,  Jr.   Human Activity Patterns in  the City.  New  York:
      John  Wiley  and Sons,  1974.

19.    Pas,  E.   The effect of selected sociodemographic characteristics  on
      daily  travel-activity  behavior.   Environment   and  Planning  A  16:
      571-581,  1984.

20.    Doubleday,  C.  Some studies of  the temporal stability  of person trip
      generation  models.  Transportation Research 11: 255-263, 1977.

21.    Hanson,  S.   The importance  of  the multipurpose journey  to work  in
      urban travel behavior. Transportation  9:  229-248, 1980.

22.    Carp, F.M.  The mobility of retired people.  In:  E.J. Cantilli and J.
      Schmelzer  (eds.),  Transportation  and  Aging.      U.S.  Government
      Printing Office,  Washington,  D.C.,  1970.  Cited in 10.

23.    Potter,  R.B.    The  nature  of consumer usage  fields in  an  urban
      environment:  Theoretical and empirical perspectives.  Tijdschrift
      voor Economische en Sociale Geografie 68: 168-176, 1977. Cited in
      10.

24.    Hanson, P.  The activity patterns of elderly households.  Geografiska
      Annaler Series B 59:  109-124,  1977.

25.    Douglas,  A.   Home-based trip  end models  -- A comparison  between
      category analysis and regression analysis procedures.  Transportation
      2:  53-70, 1973.  Cited in  10.

26.    Davies,  R.L.   Effects  of consumer  income differences  on shopping
      movement behavior.  Tijdschrift voor Economische en Sociale Geografie
      60: 11-121, 1969. Cited  in 10.

27.    Oppenheim,   N.   A typological approach to individual  travel  behavior
      prediction. Environment  and Planning 7: 141-152,  1975.

28.   Wallace, L.A. and Ott, W.R.  Personal monitors:  A state-of-the-art
      survey.  Journal  of the Air Pollution  Control  Association  32:  602-610,
      1982.

29.   Hartwell, T.D.; Clayton, C.A.; Mitchie, R.M.; Whitmore, R.W.; Zelon,
      H.S.; and Whithurst, D.A.  Study of carbon monoxide exposure of
      residents in Washington, DC and Denver, CO.  EPA-600/4-84-031, U.S.
      Environmental Protection Agency,  Research Triangle Park, NC, 1984.

30.   Johnson, T.    A study  of personal  exposure to carbon  monoxide  in
      Denver,  CO.  EPA-600/4-84-014,  U.S.  Environmental Protection  Agency,
      Research Triangle Park,  NC,  1984.

31.   Akland,  G.;  Hartwell,  T.D.;   Johnson,  T.R.;  and  Whitmore,   R.
      Measuring  human  exposure  to  carbon monoxide  in Washington, D.C.  and
      Denver, CO  during the  Winter  of  1982-3.  Environmental  Science  and
      Technology 19: 911-918, 1985.

32.   Johnson, T.; Capel, J.; and Wijnberg, L.  Selected analyses relating
      to studies of personal carbon monoxide exposure in Denver and
      Washington, D.C.  Report compiled by PEDco (Durham, NC) for the U.S.
      Environmental  Protection Agency  (Contract No.  68-02-3496/PN  3550)
      1986.

33.   Duan,   N.     Application  of  the Microenvironment  Type  Approach  to
      Assessment  of Human Exposure to  Carbon Monoxide.  Rand,  Santa Monica,
      1985.

34.   Ott, W.; Thomas, J.; Mage, D.; and Wallace, L.  Validation of the
      simulation  of human air pollution exposure  (SHAPE) model  using paired
      days  from  the   Denver  carbon  monoxide  field  study.  Atmospheric
      Environment (in  press  -- 1988).

35.   Fox, M.B.   Working women and travel:   The access of  women to  work and
      community facilities.  Journal  of  the American Planning Association
      49: 156-170,  1983.

-------
                      IDENTIFICATION OF RESEARCH NEEDS


      Near the  end  of the  conference,   the  participants  were  asked to
identify needs for  future  research  in  the area of human  activity  patterns
and their relationship to exposure  assessment.   Those  identified needs  that
were legible and  understandable were edited  as little as possible and  are
presented  here.     They have  been  sorted  into  five  categories:    data
collection processes,  nonresponse  problems,  models, field  studies,   and
standardization  and organization.   Beyond  the  categorization,  no  conscious
effort  was made  to  sort  and order  the  identified  needs.   There  is  a
considerable  amount of repetition,  but we  decided to  list all  of  them  with
the thought that  the  repetition  represents an indication of the  amount of
agreement between participants.

DATA COLLECTION  PROCEDURES

In what form  should data be collected?

      A.    Self Completed  diary
      B.    Recall questionnaire
      C.    Direct observation
      D.    Indirect observation
      E.    Electronic monitoring
      F.    Non-paper data  recording  (electronic event/time logger)

Consider time frames of data collection:

      A.    How  frequently  should entries  be made?
            (1)    At change of activity or location
            (2)    Every "x" minutes
      B.    How  frequently  should "diary"  be reviewed?
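
The trade-off between the two entry schedules listed above can be sketched in code. What follows is only an illustration under stated assumptions: the event log of (start minute, location) pairs, the function name resample_event_log, and the sample data are all hypothetical and do not come from any study presented at the conference. The sketch converts a change-of-location record into an "every x minutes" record and shows that a long interval can miss a short event entirely.

```python
def resample_event_log(events, end_minute, step):
    """Convert a change-of-location log into a fixed-interval record.

    events: list of (start_minute, location), sorted by start_minute.
    Returns [(minute, location)] sampled every `step` minutes, from the
    first event up to (not including) end_minute.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    if not events:
        return []
    record = []
    current = 0
    minute = events[0][0]
    while minute < end_minute:
        # Advance to the latest event that started at or before this minute.
        while current + 1 < len(events) and events[current + 1][0] <= minute:
            current += 1
        record.append((minute, events[current][1]))
        minute += step
    return record


# Hypothetical morning: home, a 25-minute car trip, then the office.
log = [(0, "home"), (25, "car"), (50, "office")]
every_30 = resample_event_log(log, end_minute=120, step=30)
every_60 = resample_event_log(log, end_minute=120, step=60)
print(every_30)
print(every_60)
```

In this made-up example the 30-minute record still captures the car trip, while the 60-minute record does not, which is the kind of loss a study would need to weigh against respondent burden.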

Develop follow-up questions/check list.

How can we merge the data collection  forms and technique  experiences of the
social  sciences  into the future development  of the  environmental science
issues?

Develop "standard" activity pattern diary questionnaires.

Develop a voice tape recorder with a voice time recorder that would
facilitate an accurate diary with a minimum amount of effort for the
respondents.

Determine which survey methods - data collection devices are most
productive/effective.

Quality  Assurance.    Besides coding  and clerical   errors,  it  is   very
important to  reduce errors  at the  data  collection  points.   In a  number of
studies, respondents filled out the diaries or questionnaires.  In such
circumstances, I worry about the quality of the data collected.  Some
questions I have in my mind are:  Are definitions clear to respondents?
Are  concepts clear  to them?   Do  they  interpret  questions  correctly?
Research  is  needed  in this  area to  reduce  response  variance.    Can  a
group-training  of  respondents be  arranged?    Should  a  mock  interview be
conducted with respondents?  These and other ideas may help to improve the
quality of data at the collection end.  Also I believe that efforts  should
be made  to get good  quality data to  start  with rather than  find errors
later and correct  them.

Data Collection Instrument:   In questionnaire design,  it  is important to
pretest it before  fielding the  questionnaire.   The pretest should be done
on the target population.   Even one respondent  by subgroup  would be more
useful  than no pretesting.

Before developing new activity/time diaries, the resolution required should
be determined in terms of time period (1-5, 5-30, >30 min), of locations
(# indoor, # outdoor, with sources), and of activities (source usage,
breathing rate, heart rate).  What is the time frame for anticipated
responses (acute/chronic), and how important is a short-term peak vs. a
long-term average exposure?  Additionally, the total burden placed on the
respondent (relative to what is essential) should be considered in
determining the format of the diary/questionnaire (integrated vs. prompted
time interval).
Several diary/questionnaire formats were presented here.    Each has certain
advantages/disadvantages,  in  terms of  burden  and  sensitivity/accuracy.
Without  adequately  defining what  is  needed  (for  continuous/integrated
monitoring),  there is no real  basis for selection between these methods.

Do more methodological research  in  particular  on questionnaire design and
non-response avoidance.

Develop  standardized set  of questionnaires  for  characterizing pollutant
exposure (indoor;  outdoor;  total)  and activity information;  e.g.,  for:

      (a)   pollutants with  short time resolution  --  1  to  6  hrs. (such as
            CO).
      (b)   pollutants with medium time resolution --  6 to 24  hrs. (VOC's).
      (c)   pollutants with longer time resolution --  > 24 hrs.

Generate  an  activity pattern questionnaire  that is devoid  of subjective
questions, such as  strenuous  activity; heavy/light traffic;  high, medium,
or low usage of an appliance,  etc.

Compare activity recording  instruments:

(1)   Paper and pencil  diaries for concurrent records
      (a)   page-by-page format  (Denver-Washington)
      (b)   matrix format  (Lambert, Colome, Adair)
(2)   Electronic monitors  (Ott suggested  monitor) for concurrent records.
(3)   24-hour Retrospective Interview  - telephone based CATI technique.
(4)   Observational  Studies   (ethnography  tradition,  other  family member
      observing  and  reporting   on  activities   of  target  subject,   or
      electronic tracking device).

Convergent methods would allow a multi-method, multi-trait approach to
assessing  the  efficiency of  these  methods.     Scoring  "omissions"  and
"commissions"  in  recording  errors against  a  reference or gold standard
method (e.g.,  observational standard or electronic).
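
One way such scoring could work is sketched below. This is an illustration, not a method used in any of the studies discussed: the record format (sets of (hour, location) pairs), the function name score_against_reference, and the data are hypothetical. An "omission" is taken here to be an entry present in the reference record but missing from the diary, and a "commission" an entry in the diary that the reference record does not contain.

```python
def score_against_reference(diary, reference):
    """Score diary entries against a reference ("gold standard") record.

    Both arguments are iterables of hashable entries, here assumed to be
    (hour, location) pairs.
    """
    diary, reference = set(diary), set(reference)
    omissions = reference - diary      # happened, but not recorded
    commissions = diary - reference    # recorded, but not observed
    return {
        "omissions": sorted(omissions),
        "commissions": sorted(commissions),
        "omission_rate": len(omissions) / len(reference) if reference else 0.0,
        "commission_rate": len(commissions) / len(diary) if diary else 0.0,
    }


# Hypothetical observer record and respondent diary for one day.
reference = [(7, "home"), (8, "car"), (9, "office"),
             (12, "gas station"), (13, "office")]
diary = [(7, "home"), (8, "car"), (9, "office"),
         (13, "office"), (18, "store")]
result = score_against_reference(diary, reference)
print(result)
```

In this invented example the diary omits the gas station stop, a brief visit to a critical microenvironment, and reports a store visit the observer did not record.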

(1)   Test (experimental)  time/diary formats and administrative approaches.
(2)   Validate, with  observer or  electronic equipment.

Validate locations reported.
Identify methods to get accurate  evaluations of  critical microenvironments
(e.g., gas stations,  home  garage).

Methodological  research  on  activity  diaries,   to  find  ways  that  are
cognitively simplest  for respondents,  while giving necessary data.

It also appears that a  "multitrait, multimethod"  study of activity diaries
should be  conducted.   This will  help guard  against "instrumentation" and
"observer" validity threats.   Thus use  "matrix"  diaries and  "time" driven
diaries in conjunction with  other  variables that  "should" and  "should not"
correlate  with time/activities.    This will  result  in  a validation  of
instruments/techniques.    Use of  standard designs  for instrument  validation
will also help to determine the instrument effect on  behavior.

Develop QA/QC data validation procedures.

Use matrix sampling to determine which sources to ask for from each
respondent.

Conduct pilot studies to determine human activity patterns by using
electronic aids.

Development of monitoring  devices  that are less intrusive.

Continue  development  of light-weight, quiet,  reliable,  personal monitors
for target species.

Consider   availability of  transmitter technologies  to  monitor/verify
location  information.     If  "ideal"  monitors  are not available,  develop
specifications for monitor development.

Determine  availability/needs  for  heart  rate,   respiratory ventilation
monitors with data logging ability.

Develop protocols  for observer/technician/training  and subject  instruction
for diary/monitor studies.  Extend literature for activity assessment on
this issue.   There is  an  art and oral history  in this area that  should be
codified.

Develop improved activity  recording procedures such  as:
      (1)    Recording  devices that do not require paper.
      (2)    Reminder devices that supplement paper diaries.

Develop continuous  monitoring devices for pollutants.

Improvements  in  assessing   the  accuracy  of   reported  personal
location/time/activity  patterns.    Information obtained from  "objective"
observers is  useful but not the answer, because presence  of observer may
alter  normal  activity  patterns  (both  type  of pattern  and  accuracy  of
recall).  High  priority should  be  placed on development and evaluation of
"nonintrusive"  devices  that  allow objective verification   of  personal
location, and perhaps activity, as a function  of  time.   (Perhaps award a
contract to  Bell  Labs  or the DOD for this?)

NONRESPONSE  PROBLEMS

Increase participation?

      A.    Incentives
      B.    Determine  non -  $ benefits to participant
      C.    Reduce  burden (perceived or actual)

Estimation Research:  I was not sure whether past  activity pattern studies
used weighted data in  the analysis.   It is worth researching methods to
reduce bias  in estimates due to  missing data.

Improve Response Rate:   Dawn Nelson presented  methods to improve response
rate.   Some of  these are applicable  to the studies.   There are probably
other ways to improve  rates.   With a small sample size,  even missing a few
respondents  has  a greater effect on the  results and,  hence, research should
be directed  in this area.

Do more methodological  research in  particular  on questionnaire design and
non-response avoidance.

Develop report on factors that will  improve response rate.

      (a)   initial information as  an incentive (how much  detail does one
            provide to participants?)
      (b)   effect  of  financial  incentives.
      (c)   return  of  appropriate information after the study is completed.

Evaluate/review  methods  for  increasing participation and compliance.

Develop  better   understanding   of  human behavior  on topics   such  as  the
following:

      (1)   Willingness  to Participate
      (2)   Ability to  Complete Forms (Diaries, Questionnaires) and Wear
            Monitors (barriers in workplaces, certain outdoor activities)

Examination of factors  that might increase response  rates  for exposures.
Study biases from nonresponse.

Explore ways to compensate for nonresponse in the analysis.   Also consider
the  inclusion  of variables  in  data collection  to  aid in  the adjustment
process.
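
A common way to compensate for nonresponse in analysis is a weighting-class adjustment, sketched below as an illustration only. It assumes that each sampled person carries a base sampling weight and belongs to an adjustment cell defined by variables known for respondents and nonrespondents alike, which is why collecting such variables matters. The field names (cell, base_weight, responded) and the data are hypothetical.

```python
from collections import defaultdict


def weighting_class_adjustment(sample):
    """Return nonresponse-adjusted weights, one per sampled unit.

    Within each cell, respondents' base weights are inflated so that they
    also represent the nonrespondents of that cell. Nonrespondents get 0.
    """
    total = defaultdict(float)       # sum of base weights, all sampled units
    responding = defaultdict(float)  # sum of base weights, respondents only
    for unit in sample:
        total[unit["cell"]] += unit["base_weight"]
        if unit["responded"]:
            responding[unit["cell"]] += unit["base_weight"]

    adjusted = []
    for unit in sample:
        if unit["responded"]:
            factor = total[unit["cell"]] / responding[unit["cell"]]
            adjusted.append(unit["base_weight"] * factor)
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical sample: half of the "elderly" cell did not respond.
sample = [
    {"cell": "elderly", "base_weight": 10.0, "responded": True},
    {"cell": "elderly", "base_weight": 10.0, "responded": True},
    {"cell": "elderly", "base_weight": 10.0, "responded": False},
    {"cell": "elderly", "base_weight": 10.0, "responded": False},
    {"cell": "adult", "base_weight": 10.0, "responded": True},
    {"cell": "adult", "base_weight": 10.0, "responded": True},
]
weights = weighting_class_adjustment(sample)
print(weights)
```

The adjustment preserves the total weight of each cell, but it reduces bias only to the extent that respondents and nonrespondents within a cell have similar activity patterns.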

Random Digit Dialing (RDD).  Using RDD, certain socio-economic areas are
not fairly represented in the study.  The effect of this type of
undercoverage on Activity Pattern research study should be evaluated.  [The
group without telephones has different socio-economic conditions, health
conditions, exposure level, etc.].

MODELS

Analysis Related  Research:   A  number  of studies  used modeling approach to
analyze data.  These  models make certain  assumptions.   These assumptions
may  have significant effects on the results.   I  recommend that sensitivity
analysis should be  done  to  reduce  risks of replications of  results  due to
the  models,  and  one  should be very  careful  about those assumptions the
model is highly sensitive to.

Formulate more sophisticated exposure models.

Design procedures to validate exposure models.

Improved human activity  pattern  -  exposure  models.

Develop  improved  mathematical  and  statistical models  to  utilize  and
interpret  the  activity pattern and  related information, which  would fit
into total  exposure.  Current work such as SIMS  and work on  models such as
SHAPE  and  breath - personal  TEAM should  be  continued.  New models must
include a stochastic component; i.e., temporal relationships (most current
models assume independence of pollutant exposure from day to day).

Studies to validate exposure models.

Detailed  assessment of  the modeling  procedures,  especially sensitivity
analysis:  these  models  appear  very  hazardous.

Incorporate autocorrelation  (serial  correlation) into  exposure estimation
models  for pollutants  which exhibit  autocorrelation  in PEM  data  (e.g.,
carbon monoxide).
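
Whether a personal exposure monitor (PEM) series exhibits serial correlation can be checked with the sample autocorrelation. The sketch below is illustrative only: the function name lag_autocorrelation and the hourly CO readings are hypothetical, and it uses the usual estimator in which the lagged cross-products are divided by the total sum of squared deviations.

```python
def lag_autocorrelation(series, lag=1):
    """Sample autocorrelation of a sequence of readings at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


# Hypothetical hourly CO readings (ppm) that rise and fall smoothly.
co_ppm = [2.0, 2.5, 3.0, 3.5, 3.0, 2.5, 2.0, 1.5]
r1 = lag_autocorrelation(co_ppm, lag=1)
print(r1)
```

A value well above zero, as in this invented series, would suggest that a model treating successive readings as independent understates the persistence of exposure.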

Compare  pollutant exposure  predictions of microenvironment-based  models
with predictions  of source-proximity models.   Evaluate  stepwise regression
as a means of combining  models.

Time  and  activity  patterns need  to  be defined  for  different population
groups.  At the  minimum,  this  should include age (group),  sex, ethnic/SES
groups,  work  and/or school  status  (full/part-time;  summer/winter/full
year).  Additional classification factors could include the identification
of pre-existing cardiopulmonary disease (i.e., chronic heart disease, COPD,
Asthma),  since  these  might modify both  activity and  time  patterns,  in
relation to perceptible air pollutant effects (e.g., irritation).

Optimize the  union of activity  pattern  information  (time,  density,  and
location) and pollutant concentration  capability  (time  of  sampling).   Such
optimization  should  lead  to  a  definition   of  a  universal   set  of
microenvironments.   Of  course such a set will be  pollutant  dependent.

Determine  time  resolution based  on  technological   and  "participant"
restraints.

FIELD STUDIES

Collect activity data on  high-risk groups (large  sample sizes).

      A.    Young children
      B.    Elderly
      C.    Those with  a  history of illness/prone to  illness
      D.    Low  Income  Groups - especially inner  city.

Some thought must  be given  for  the  coordinated  efforts by various groups.
Use the  same  sample if possible  for multipurpose  studies.   This  will  help
in increasing the sample  size since some operational  costs  will  be  shared.

An  extensive,  nationwide  activity  pattern  research  field  survey  that
includes  those  activities  likely  to  cause  exposure  to environmental
chemicals.

Specialized field surveys of the activity patterns of children.

Additional  TEAM  (direct approach)  field  studies  that include  improved
activity pattern data collection procedures.  TEAM  studies  are needed for
ozone, NO2, and polar organic compounds.

Establish  some  mechanism  to  coordinate  studies  to prevent   overlap,
duplication,  and repetition  of mistakes.

Assess  usefulness  and   advantages  of probability  based  samples of  the
general  population.   Although necessary  for many study goals,  there are
also many  research hypotheses  in  health and exposure areas  that do not
require precise  representation of the general population.

Conduct Additional  Field  Studies:

      More data  are needed  to  evaluate and  "validate"  model  accuracy and
      precision  characteristics.

      For  the   collection  of  activity-location  data  on   a  large
      population-based sample,  initiate  cooperative research  effort  with
      Bureau of  Census or  NHANES group.   Would  encourage  CATI  interviews
      rather than mail-out diaries for self-administration.

Perform Large-Scale:
      A.     Population-based  studies  which  assess  activity  patterns
            stratified on sex, age, occupation, role, and health status.
      B.     Person-day  is  unit  of sampling.   Geographical  concerns  somewhat
            overshadowed by day of the week and seasonal concerns.

Field Studies:

      A.     TEAM studies for  species with available instrumentation.
      B.     Targeted microenvironmental  studies  for certain species  (e.g.,
            benzene and garages/areas with passive smoking).
      C.     Targeted  activity/exposure/location  studies  on  "sensitive"
            populations (e.g.,  asthmatics/ozone).

Laboratory Studies (in  support  of field monitoring):

      A.     Emissions rate determination in chambers.
      B.     Decay rate  determination in chambers.

Establish voluntary policies for designing  and  conducting field  studies;
e.g.:

      (A)   Survey and  sampling statisticians should be involved.
      (B)   Linking  health  effects to  exposure  (having  a  health effect
      component on questionnaires).
      (C)   Requiring analysis plan at beginning of project.

Many studies are very  small  and in special  locations.  Generalization is
problematic.     Are there  ways   to  develop larger-scale representative
studies,  perhaps using  two-phase  sample designs?

More consideration  needs to be given to  the  time dimension.   One  needs to
ensure that samples are representative across time.

More attention to the need for  assessing  location/time/activity  patterns
for  important  population   subgroups,  e.g.,   children,  the  elderly,
asthmatics, persons with COPD, residents of low-income substandard housing,
who may  be  at  increased risk of pollutant exposure and/or adverse health
effects.   The  effect of geographical differences  in climate and housing on
activity patterns also  needs  to be studied much more.

There continues to  be  a  need to conduct  personal monitoring  studies to
quantify the relationships among  exposure,  microenvironmental  monitors,  and
fixed site ambient monitors.

Conduct a source driven activities diary study.

Modification or restriction  of activities or  time  allocation  in  response to
the perception of air pollutants (directly or through media reports) might
be considered as an indication of air pollution effects.  Thus, research is
needed to  help  identify "normal"  activities and time  patterns and track
these individuals to determine if there are  changes  in  these  patterns  that
are related to outdoor air quality (e.g., O3, NOx, PM10) or indoor
pollutants.

Better data on the concentrations formed in key  microenvironments  and their
causes.

There do appear to be regional and seasonal  differences;  thus,  at  the  very
least there  is  a  need  to determine activities  in various  locations during
at least two (2) seasons (winter and summer).  Care should be taken to
differentiate weekday,  Saturday,  and  Sunday patterns.

Various subgroups  need to be  identified  to  determine their  T/A patterns.
These should include "at risk" groups  as well  as occupational  groups.

What is an optimum number of locations to  be  investigated?

STANDARDIZATION AND DATA MANAGEMENT

Definition of a uniform set of microenvironments.

Determination of a set  of  variables to  be  included  in all  studies  to allow
comparison of data across studies.

Can we develop precise  and consistent definitions of specific  activities  of
interest?  Must be useful and, more importantly, must be collectible.

All  future  studies   (i.e.,   survey designs)   should  include  basic
demographic/socioeconomic data.

It is important to design  studies  so  that  the data  are  useful in  a variety
of contexts.

Defining a  manageable  set  of microenvironments  (verbs).   These  could  then
act  as  the foundation  for  all  exposure  studies,  allowing  cross-study
comparisons.

My recommendation is that  standardized definitions  and  concepts will  be
very  useful.   This  could also  help follow-up studies for  screening the
target cases, if the common definitions are employed by  them.

Consistent  coding categories  for locations and  activities  are essential  if
there is any hope  to combine  data  collected  under different studies into a
single data base.  The definitions of location categories (hierarchical, for
subsequent aggregation) and activities should be standardized.

A workshop or working group (< 20 individuals) should be assembled to develop
recommendations to be circulated for comment and review on standardization
of diary formats (integrated and detailed) and of definitions for
categories/activities.  This group could meet for a 2-3 day period and
produce a set  of  instruments together with instructions  for usage.

Organize available data in a standardized format.

Standardized  formats  for  transferring  activity  data on tape and personal
computer disks.

Specialized software for analyzing and graphically displaying activity
pattern information  from these tapes/disks.

Relational data bases  for  activity patterns.

Determine minimum set of variables -- e.g., demographic characteristics --
that should be collected in a standard way in every  study.

Standardize activity pattern  questions  (if not form of the  survey),  e.g.,
exertion level.

Suggestion  of How to  Carry  This Out:   Controlled  study that  asks  several
different  questions to get  at the same information  -  then  possible  to
evaluate which question is most effective.

Standardize data collection  in  small-scale  personal monitoring  studies  to
complement and allow extrapolation  to large-scale population-based samples.
Personal  characteristics   (age,   sex,  income,   geographic  location,
occupation, health status).

Adapt  00-99  International  Activity   Coding  System to   air  pollution
monitoring  needs -  establish working  group to  produce a standard  coding
scheme with sufficient detail to meet exposure-absorbed dose  needs.

Construct a hierarchical locational data coding scheme that allows one to
collapse  categories  to  major  environments  (i.e.,   indoor-residence,
indoor-school, indoor  work, transit and outdoors).

Use hierarchical location codes that may be collapsed depending upon study
purpose/design.    Evaluate available schemes  and generate codes  for future
research.
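
A collapsible coding scheme of this kind might look like the sketch below. The two-level "major.detail" code format and every code in it are hypothetical, invented for illustration; only the five major environments (indoor-residence, indoor-school, indoor-work, transit, outdoors) are taken from the suggestions above.

```python
# Hypothetical two-level location codes: the digit before the dot names the
# major environment, the digit after it names the detailed location.
MAJOR_ENVIRONMENTS = {
    "1": "indoor-residence",
    "2": "indoor-school",
    "3": "indoor-work",
    "4": "transit",
    "5": "outdoors",
}

DETAILED_LOCATIONS = {
    "1.1": "kitchen",
    "1.2": "attached garage",
    "2.1": "classroom",
    "3.1": "office",
    "4.1": "car",
    "4.2": "bus",
    "5.1": "street",
}


def collapse(code):
    """Map a detailed code such as '4.2' to its major environment."""
    return MAJOR_ENVIRONMENTS[code.split(".")[0]]


def minutes_by_major_environment(diary):
    """Total the minutes in a diary of (code, minutes) entries by major environment."""
    totals = {}
    for code, minutes in diary:
        major = collapse(code)
        totals[major] = totals.get(major, 0) + minutes
    return totals


# Hypothetical diary entries recorded at the detailed level.
diary = [("1.1", 60), ("4.1", 30), ("3.1", 480), ("4.2", 20), ("1.2", 10)]
totals = minutes_by_major_environment(diary)
print(totals)
```

Because the detail is recorded once and collapsed afterward, the same diary could serve a study that needs only the major environments and another that needs, for example, time spent in garages.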

Greater  efforts  toward   standardizing   collected  data   on
location/time/activity  patterns  in  order  to  achieve  the  standard  QA
objective  of  "comparable"  data sets.    It is especially  important for
investigators  primarily concerned with evaluating   exposures to  one  or two
specific pollutants  not to exclude by  design  the possibility  of later use
of their activity data  for  the estimation of personal exposures to other
pollutants.  For example, knowledge of specific times of relatively brief
periods  outdoors may  not  be  very important for  assessing  overall  exposure
to VOC's, but would be of prime importance in evaluating exposure to ozone.

Determine reasonable number of location categories based on pollutant
sources and activity patterns.

Establish  guidelines  and  standards for  designing and  conducting  field
studies.    One  project that  would  be  useful:    create  compendium  of
questionnaires,  diaries and sampling plans.   Another would be to  define
terms.

Identify (specific)  human activity patterns research areas.

Improve  scientists'  and statisticians'  understanding of management  and
policy  makers'  needs in  the  human exposure arena.

-------
                          CONFERENCE  PARTICIPANTS
Dr. James H. Adair, Project Manager*
Harvard School of Public Health
665 Huntington Avenue
Boston, MA 02115

Dr. Joseph V. Behar * **
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Mr. Andrew Bond
Environmental Monitoring and Data
  Analysis Division
EPA Mail Drop 76
Research Triangle Park, NC 27711

Ms. Elizabeth Bryan
U.S. Environmental Protection Agency
TS-798
401 M Street SW
Washington, D.C. 20460

Mr. Michael Callahan, Director*
Exposure Assessment Group
U.S. Environmental Protection Agency
RD-689
401 M Street SW
Washington, D.C. 20460

Dr. Chao Chen
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Dr. Steve Colome
Department of Social Ecology
University of California at Irvine
Irvine, CA 92717

Ms. Lori Coyner
E.R.T.
1220 Avenido Acaso
Camarillo, CA 93010

Mr. Michael Delarco
U.S. Environmental Protection Agency
401 M Street SW (RD-680)
Washington, D.C. 20460

Dr. Naihua Duan*
513 Wilshire Blvd.
Suite 249
Santa Monica, CA 90401

Mr. Michael Dusetzina
U.S. Environmental Protection Agency
(MD-13)
Research Triangle Park, NC 27711

Dr. Evan Englund
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Ms. Lynn Fenstermaker
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Mr. Chas. Fitzsimmons
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Mr. George T. Flatman
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Dr. Kenneth Hedden
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Mr. Stephen C. Hern **
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Dr. C. H. Ho
Department of Mathematics
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Mr. Ted Johnson*
PEI Associates, Inc.
505 S. Duke Street, Suite 503
Durham, NC 27701-3196

Dr. Graham Kalton, Chairman*
Department of Biostatistics
University of Michigan
Ann Arbor, MI 48106

Dr. Robert Kinneson
Desert Research Institute
Water Resources Center
University of Nevada System
Las Vegas, Nevada 89120

Mr. Mel Kollander*
U.S. Environmental Protection Agency
(PM-223)
401 M Street SW
Washington, D.C. 20460

Prof. William Lambert*
UNM Medical Center
900 Camino De Salud NE
Albuquerque, NM 87131

Ms. Carolyn H. Lichtenstein*
Roth Associates, Inc.
6115 Executive Blvd.
Rockville, MD 20852

Dr. David McNelis, Director
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Dr. Forest Miller
Desert Research Institute
Water Resources Center
University of Nevada System
Las Vegas, NV 89120

Dr. D.J. Moschandreas*
IIT Research Institute
10 W. 35th Street
Chicago, IL 60616

Ms. Dawn Nelson*
Demographic Surveys Division
Bureau of the Census
FOB#3, Room 3377
Washington, D.C. 20233

Dr. William C. Nelson**
U.S. Environmental Protection Agency
MD-55
Research Triangle Park, NC 27711

Dr. Wayne R. Ott, Chief*
Air, Toxics, and Radiation Staff
U.S. Environmental Protection Agency
RD680
401 M Street SW
Washington, D.C. 20460

Dr. Craig Palmer
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Dr. Muhilan Pandian*
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Mr. J. Gareth Pearson, Director**
Exposure Assessment Research
  Division
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Mr. Thomas Phillips*
Air Resources Board
1102 Q Street
P.O. Box 2815
Sacramento, CA 95812

Dr. James Quackenboss
College of Medicine
University of Arizona
Arizona Medical Center
Tucson, AZ 85724

-------
Dr. John Robinson, Director*
Survey Research Center
University of Maryland
College Park, MD 20742

Ms. Margo Schwab
Harvard School of Public Health
665 Huntington Avenue
Bldg 1, Room 1310
Boston, MA 02115

Dr. R. Keith Schwer, Director
Center for Business and Economic
  Research
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Dr. Rajendra P. Singh, Director
Survey of Income and Program
  Participation Branch
Statistical Methods Division
U.S. Bureau of the Census
FOB#3, Room 3705
Suitland, MD 20233

Mr. Robert L. Snelling*
Acting Director
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Dr. Thomas Starks*
Environmental Research Center
University of Nevada-Las Vegas
Las Vegas, Nevada 89154

Dr. Bonnie Stern
Health Protection Branch
203 Environmental Health Center
Tunney's Pasture
Ottawa K1A 0L2
Canada

Dr. Thomas H. Stock*
Department of Environmental Sciences
University of Texas
Health Science Center
P.O. Box 20036
Houston, TX 77225

Mr. Jacob Thomas*
General Sciences Corporation
6100 Chevy Chase Drive
Laurel, MD 20707

Dr. John C. Unrue, Vice-President
  for Academic Affairs*
University of Nevada-Las Vegas
Las Vegas, NV 89154

Mr. Llewellyn Williams**
U.S. Environmental Protection Agency
Environmental Monitoring Systems
  Laboratory-Las Vegas
Las Vegas, Nevada 89193-3478

Mr. A.L. Wilson
Wilson Environmental Associates
135 E. Live Oak Avenue
Suite 203
Arcadia, CA 91006

Mr. Harvey S. Zelon*
Center for Survey Statistics
Research Triangle Institute
Box 12194
Research Triangle Park, NC
  27708-2194

*  Speaker
** Session Chairman

-------