Total Risk Integrated Methodology (TRIM) Air Pollutants Exposure Model Documentation (TRIM.Expo / APEX, Version 4.3) Volume II: Technical Support Document


United States Environmental Protection Agency

-------

-------
                                                     EPA-452/B-08-001b
                                                          October 2008
                      U.S. Environmental Protection Agency
                    Office of Air Quality Planning and Standards
                    Health and Environmental Impacts Division
                      Research Triangle Park, North Carolina

-------

-------
                                   DISCLAIMER
       This document has been prepared by Alion Science and Technology, Inc. (through
Contract No. EP-D-05-065, WAs 21 and 94). Any opinions, findings, conclusions, or
recommendations are those of the authors and do not necessarily reflect the views of the EPA or
Alion Science and Technology, Inc. Mention of trade names or commercial products is not
intended to constitute endorsement or recommendation for use. Comments on this document
should be addressed to John E. Langstaff, U.S. Environmental Protection Agency, C504-06,
Research Triangle Park, North Carolina 27711 (email: langstaff.john@epa.gov).

-------
                             ACKNOWLEDGEMENTS
The primary authors of this document are Graham Glen and Kristin Isaacs, Alion Science and
Technology, Inc.  Contributions have also been made by Melissa Nysewander, Luther Smith,
Casson Stallings (Alion Science and Technology, Inc.), Tom McCurdy, John Langstaff (EPA),
and ICF Consulting.

-------
                                   CONTENTS

CHAPTER 1.   INTRODUCTION	1
  1.1    TRIM and The APEX Model	1
  1.2    Scope and Organization of This Document	1
  1.3    Introduction to APEX	1
  1.4    Strengths and Limitations of APEX	4
    1.4.1   Strengths	4
    1.4.2  Limitations	5
  1.5    Applicability	6
  1.6    Brief History of APEX	6
CHAPTER 2.   OVERVIEW OF MODEL DESIGN AND ALGORITHMS	8
CHAPTER 3.   USING PROBABILITY DISTRIBUTIONS IN APEX	13
  3.1    The APEX Input Distribution Format	13
  3.2    Details of Distribution Sampling, Truncation, and Resampling	15
    3.2.1  Resampling Options	15
    3.2.2  Beta Distribution	17
    3.2.3  Cauchy Distribution	18
    3.2.4  Discrete Distribution	19
    3.2.5  Exponential Distribution	20
    3.2.6  Extreme Value Distribution	21
    3.2.7  Gamma Distribution	22
    3.2.8  Logistic Distribution	23
    3.2.9  Lognormal Distribution	24
    3.2.10    Loguniform Distribution	25
    3.2.11    Normal Distribution	26
    3.2.12    Pareto Distribution	27
    3.2.13    Triangle Distribution	28
    3.2.14    Uniform Distribution	29
    3.2.15    Weibull Distribution	30
CHAPTER 4.   CHARACTERIZING THE STUDY AREA	31
  4.1    APEX Spatial Units	31
    4.1.1  Initial Study Area	31
    4.1.2   Sectors	31
    4.1.3  Air Quality Districts	33
    4.1.4  Meteorological Zones	34
  4.2    Determining the Final Study Area	35
    4.2.1  Matching Sectors, Air Quality Districts, and Meteorological Zones	35
    4.2.2  The Distance Algorithm	35
  4.3    Modeling Commuting	36
    4.3.1  Nationwide Commuting Database for 2000	37
    4.3.2  Implementation of Commuting in APEX	38
CHAPTER 5.   GENERATING SIMULATED INDIVIDUALS (PROFILES)	40
  5.1    Demographic variables	44
  5.2    Residential Variables	45
  5.3    Physiological Profile Variables	46
  5.4    Daily-Varying Variables	50
  5.5    Modeling Variables	51
CHAPTER 6.   CONSTRUCTING A SEQUENCE OF DIARY EVENTS	53
  6.1    Constructing the Diary Pool	53
    6.1.1   Diary Data	53
    6.1.2   Grouping the Available Diaries into the Diary Pools	54
  6.2    Basic (Random) Composite Diary Construction	55
  6.3    Longitudinal Activity Diary Assembly	55
    6.3.1   The Longitudinal Diary Assembly Algorithm	56
    6.3.2   Selecting Appropriate D and A Values For a Simulated Population	60
CHAPTER 7.   ESTIMATING ENERGY EXPENDITURES AND VENTILATION	62
  7.1    Generating the MET Time-Series	62
  7.2    Adjusting the MET Time-Series for Fatigue and Excess Post-Exercise Oxygen Consumption	63
    7.2.1   Simulation of Oxygen Deficit	65
       7.2.1.1   Fast Processes	65
       7.2.1.2   Slow Processes	66
       7.2.1.3   Derivation of Appropriate Values for the Model Parameters	67
    7.2.2   Adjustments to M for Fatigue	68
    7.2.3   Adjustments to M for EPOC	69
       7.2.3.1   Fast Processes	69
       7.2.3.2   Slow Processes	69
  7.3    Calculating PAI and the Ventilation Rates	70
    7.3.1   Calculating PAI and Energy Expenditure	70
    7.3.2   Calculating Oxygen Consumption and Ventilation Rates	71
CHAPTER 8.   CALCULATING POLLUTANT CONCENTRATIONS IN MICROENVIRONMENTS	72
  8.1    Defining Microenvironments	72
  8.2    Calculating Concentrations in Microenvironments	77
    8.2.1   Microenvironmental Concentrations for Home/Work/Other Locations	78
    8.2.2   Mass Balance Method	78
    8.2.3   Factors Method	86
    8.2.4   Microenvironment Parameter Definitions	87
       8.2.4.1   Time and Area Mappings	90
       8.2.4.2   Conditional Variables	91
       8.2.4.3   Correlation Settings	94
       8.2.4.4   Resampling Options	96
       8.2.4.5   Random Number Seeds	97
       8.2.4.6   Source Strength  Specification	98
       8.2.4.7   Specification of Distribution Data	100
CHAPTER 9.   CALCULATING EXPOSURES	102
  9.1    Estimating Exposure	102
  9.2    Exposure Summary Statistics	103
  9.3    Exposure Summary Tables	104

CHAPTER 10.    CALCULATING DOSE	108
  10.1      Inhaled Dose Calculation	108
  10.2      Carboxyhemoglobin (COHb) Calculation	109
  10.3      Calculating PM Dose	112
    10.3.1    Particle Sizes, Inhalability, and Diffusion Coefficient	113
    10.3.2    The ICRP Deposition Equations	114
       10.3.2.1    Lung Volumes and Age Scaling Factors	115
       10.3.2.2    Tidal Volume and Activity Level	117
       10.3.2.3    Inspiratory Ventilation	117
       10.3.2.4    Residence Times	118
       10.3.2.5    Final Deposition Fractions and Deposited Masses	118
  10.4      Definition of Dose Summary Statistics	119

-------
                                LIST OF TABLES

Table 3.1.  Available Probability Distributions in APEX	14
Table 5.1.  Profile Variables in APEX	41
Table 6.1.  D and A Statistics Derived from the Southern California Children's Study	61
Table 8.1.  Default Mapping of CHAD Location Codes to APEX Microenvironments	73
Table 8.2.  Microenvironmental Parameters	84
Table 10.1 The values of a, R, and P for each filter for oral and nasal breathing	115
Table 10.2 Coefficients for the Lung Volumes and Scaling Factors	116

-------
                               LIST OF EXHIBITS

Exhibit 8-1. Example of a Microenvironmental Parameter Description	88
Exhibit 8-2. Example of the Shortest Possible MP Description	89
Exhibit 8-3. Example of Defining Correlated Microparameters	95
Exhibit 8-4. Use of Source Number in MP Definition	98
Exhibit 8-5. Second MP Definition with Source Number 2	99
Exhibit 8-6. Use of #sources Setting in the Pollutant Parameters section of the Simulation Control File	99

-------
                               LIST OF FIGURES

Figure 2.1a. Overview of APEX, Part 1	10
Figure 3.1. The Beta Distribution in APEX	17
Figure 3.2. The Cauchy Distribution in APEX	18
Figure 3.3. The Discrete Distribution in APEX	19
Figure 3.4. The Exponential Distribution in APEX	20
Figure 3.5. The Extreme Value Distribution in APEX	21
Figure 3.6. The Gamma Distribution in APEX	22
Figure 3.7. The Logistic Distribution in APEX	23
Figure 3.8. The Lognormal Distribution in APEX	24
Figure 3.9. The Loguniform Distribution in APEX	25
Figure 3.10. The Normal Distribution in APEX	26
Figure 3.11. The Pareto Distribution in APEX	27
Figure 3.12. The Triangle Distribution in APEX	28
Figure 3.13. The Uniform Distribution in APEX	29
Figure 3.14. The Weibull Distribution in APEX	30
Figure 4.1. Example of Study Areas, Air Quality Districts, Meteorological Zones, and Sectors 32
Figure 5.1. Generating a Simulated Profile	44
Figure 6.1. Overview of the Longitudinal Diary Assembly Algorithm	57
Figure 7.1. Fast Components of Oxygen Deficit and Recovery	66
Figure 8.1. The Mass Balance (MASSBAL) Model	79
Figure 10.1 Structure of the ICRP Deposition Model	113

-------
CHAPTER 1. INTRODUCTION
1.1 TRIM and The APEX Model

The Air Pollutants Exposure model (APEX) is part of EPA's overall Total Risk Integrated
Methodology (TRIM) model framework (EPA, 1999), in particular the inhalation exposure
component (TRIM.ExpoInhalation). TRIM is a time-series modeling system with multimedia
capabilities for assessing human health and ecological risks from hazardous and criteria air
pollutants; it is being developed to support evaluations with a scientifically sound, flexible, and
user-friendly methodology. The TRIM design includes three modules:

• Environmental Fate, Transport, and Ecological Exposure module (TRIM.FaTE);

• Human Exposure-Event module (TRIM.Expo); and

• Risk Characterization module (TRIM.Risk).

APEX is designed to estimate human exposure to criteria and air toxic pollutants at local, urban,
and regional scales. The current release of the model is APEX4. Note that APEX has been
extensively reviewed. Any changes to the computer code may lead to results that cannot be
supported by this documentation. Model enhancements, bug fixes, and other changes are
occasionally made to APEX, and thus users are encouraged to revisit the website
http://www.epa.gov/ttn/fera/human_apex.html for notices of these changes.

1.2 Scope and Organization of This Document

The documentation of the APEX model is currently divided into two volumes. Volume II:
Technical Support Document (this document) is intended to be a reference on the scientific basis of
the APEX model. The scientific background, original references, and equations for the APEX
model algorithms are included in this volume. Topics covered include the methods implemented
in APEX for sampling probability distributions, calculating microenvironmental concentrations,
modeling ventilation, estimating exposure and dose, and assembling composite activity diaries.
Other model algorithms, such as those for generating the study area and the simulated population,
are also described.

Volume I: User's Guide is designed to be a hands-on guide to using APEX. It is applicable to all
levels of expertise, from novice to advanced, and focuses on how to run the APEX computer
model, develop the appropriate input files, and interpret the model output files.

1.3 Introduction to APEX

APEX estimates human exposure to criteria and toxic air pollutants using a stochastic,
"microenvironmental" approach. That is, the model randomly selects data for a sample of
hypothetical individuals from an actual population database and simulates each individual's
movements over time, in different locations (e.g., at home, in vehicles) to estimate their exposure
to (and, optionally, dose of) the modeled pollutants. APEX can assume people live and work in
the same general area (i.e., that the ambient air quality is the same at home and at work) or
optionally can model commuting and thus exposure at the work location for employed
individuals.

APEX is a multipollutant model.  It can model the simultaneous exposure to any number of
pollutants, assuming that the user can provide the necessary input air quality data and pollutant
parameters.

The APEX model uses the personal profile approach to generate simulated individuals, for whom
exposure time series are calculated. The profile is a description of the characteristics of an
individual that may  affect either their activities or the concentrations in the microenvironments.
Typically, the profile includes demographic variables such as age, gender, and employment
status, as well as physiological variables such as height and weight, and finally some situational
variables such as possession of a gas stove or air conditioning.  The demographic variables are
used in the selection of activity diaries from EPA's Consolidated Human Activity Database
(CHAD, McCurdy et al., 2000) to represent the individual, while the situational variables are
used to help  calculate the microenvironmental concentrations. The physiological variables are
used in the calculation of pollutant dose.

An APEX model run consists of calculating the exposure (and optionally dose) time series for a
user-specified number of profiles. The time series can be calculated on different temporal scales.
Collectively, these profiles are intended to be a representative random sample of the population
in a given study area.  To this end, tables of demographic data from the decennial census are
used, so appropriate probabilities for any given geographical area can be derived. In APEX the
geographical units are called sectors.  Using the standard input files provided with the model,
each sector is a census tract. Ambient air quality and meteorological data for the study area are
also required by the model; the area covered by an air quality monitor is called a district, and the
area  covered by a meteorological monitor is called a zone.   APEX matches up each  sector of the
study area with an appropriate air quality district and meteorological zone  to provide all the data
necessary to simulate exposure and dose for an individual.
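
The way census tables can drive profile generation may be illustrated with a minimal sketch. It is not the APEX implementation: the table contents, the employment probabilities, and the names Profile, SECTOR_COUNTS, EMPLOYMENT_PROB, and sample_profile are all hypothetical, chosen only to show a sector being drawn in proportion to its population and demographic attributes being drawn from that sector's relative frequencies.

```python
import random
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical container for a few profile variables."""
    sector: str
    age_group: str
    gender: str
    employed: bool


# Hypothetical census-style counts per (age group, gender) cell in each sector.
SECTOR_COUNTS = {
    "tract_A": {("0-17", "F"): 300, ("0-17", "M"): 320,
                ("18-64", "F"): 900, ("18-64", "M"): 880},
    "tract_B": {("0-17", "F"): 150, ("0-17", "M"): 140,
                ("18-64", "F"): 400, ("18-64", "M"): 410},
}

# Hypothetical age-specific employment probabilities.
EMPLOYMENT_PROB = {"0-17": 0.0, "18-64": 0.7}


def sample_profile(rng: random.Random) -> Profile:
    """Draw one simulated individual from the tables above."""
    # Choose the home sector with probability proportional to its population.
    sectors = list(SECTOR_COUNTS)
    totals = [sum(SECTOR_COUNTS[s].values()) for s in sectors]
    sector = rng.choices(sectors, weights=totals, k=1)[0]

    # Choose the demographic cell from that sector's relative frequencies.
    cells = list(SECTOR_COUNTS[sector])
    weights = [SECTOR_COUNTS[sector][c] for c in cells]
    age_group, gender = rng.choices(cells, weights=weights, k=1)[0]

    # Employment status is drawn from the age-specific probability.
    employed = rng.random() < EMPLOYMENT_PROB[age_group]
    return Profile(sector, age_group, gender, employed)


if __name__ == "__main__":
    rng = random.Random(12345)
    for _ in range(3):
        print(sample_profile(rng))
```

In such a scheme, each sector would additionally carry a reference to its matched air quality district and meteorological zone, so that a profile's home sector determines which ambient data are used.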

APEX can be thought of as a simulation of a field study that would involve selecting an actual
sample of specific individuals who live in  (or work and live in) a geographic area and then
continuously monitoring their activities  and subsequent inhalation exposures to a specific air
pollutant during a specific period of time.  The main differences between the model and an actual
field study are that in the model:

•      The sample of individuals is a "virtual" sample, created by the model according to
       various demographic variables and census data of relative frequencies,  in order to obtain
       a representative sample (to the extent possible) of the actual people in the study area;

•      The activity patterns of the sampled individuals (e.g., the specification of indoor and
       other microenvironments, the duration of time spent in each) are assumed by the model to
       be similar to those of individuals with similar demographic characteristics, according to
       activity data such as diaries compiled in EPA's Consolidated Human Activity Database
       (CHAD) (EPA, 2002; McCurdy et al., 2000);

• The pollutant exposure concentrations and doses are estimated by the model using
temporally and spatially varying ambient outdoor concentrations, coupled with
information on the behavior of the pollutant in various microenvironments; and

• Various reductions in ambient air quality levels due to potential emission reductions can
be simulated by adjusting air quality concentrations to reflect the scenarios under
consideration.

Thus, the model accounts for the most significant factors contributing to inhalation exposure
(the temporal and spatial distribution of people and pollutant concentrations throughout the study
area and among the microenvironments) while also allowing the flexibility to adjust some of
these factors for regulatory assessments and other reasons.

Nomenclature. The following terms are used throughout this guide:

• Diary—a set of events or activities (e.g., cooking, sleeping) for an individual in a given
time frame (e.g., a day).

• Air quality district—the geographical area represented by a given set of ambient air
quality data (either based on a fixed-site monitor or output from an air quality model).

• Event—an activity (e.g., cooking) with a known starting time, duration,
microenvironment, and location (usually home or work).

• Microenvironment—a space in which human contact with an environmental pollutant
takes place.

• Profile—a set of characteristics that describe the person being simulated (e.g., age,
gender, height, weight, employment status, whether an owner of a gas stove or air
conditioner).

• Sector—the basic geographical unit for the demographic input to and output from APEX
(usually census tracts).

• Study Area—the geographical area modeled.

• Study Area Population—total population of persons who live in the study area.

• Meteorological zone—the geographical area represented by a given set of meteorological
data (either based on a meteorological station or output from a meteorological model).

Labeling Conventions. The labeling used in this document is as follows.

• Input and output file names are in italics.

• Model Variables are in bold italics, generally only when first used in a section.

• KEYWORDS, which are used in the input files to identify variables and settings, are
given in uppercase bold italics.

• Input and output file excerpts are in a box surrounded by a single line,
indicating that the text inside the box is shown exactly as it exists in electronic form.

• This document also contains references to the APEX model code. Specifically, the
discussions of the model algorithms mention the module and function or
subroutine in which they are implemented. The code locations are given in bold
non-italic text in the format Module:Subroutine or Module:Function.

1.4 Strengths and Limitations of APEX

All models have strengths and limitations, and for each application it is important to carefully
select the model that has the desired attributes. With this in mind, it is equally important to
understand the strengths and weaknesses of the chosen model. The following sections provide a
summary of the strengths and potential limitations of APEX.

1.4.1 Strengths

APEX simulates the movement of individuals through time and space to estimate their exposure
to individual or multiple pollutants in indoor, outdoor, and in-vehicle microenvironments.
Compared to conducting a field study that would involve identifying, interviewing, and
monitoring specific individuals in a study area, APEX provides a vastly less expensive, more
timely, and more flexible approach. The model also allows different air quality data, exposure
scenarios, and other inputs and thus is very useful for decision making applications.

An important feature of APEX is its versatility. The model is designed with a great deal of
flexibility so that different levels of detail in input data can be applied for different applications.
The input data sets supplied with APEX contain information for several microenvironments,
covering the needs of most applications. The air quality data input to the model can be in the
form of monitoring or modeling data. The data can be for specific locations, or geo-political
units such as counties, or census units such as tracts, or the locations of air dispersion model
receptors, or the grid cells of Eulerian model output. Criteria and hazardous air pollutants can be
modeled by APEX.

A key strength of APEX is the way it incorporates stochastic processes representing the natural
variability of personal profile characteristics, activity patterns, and microenvironment
parameters. In this way, APEX is able to represent much of the variability in the exposure
estimates resulting from the variability of the factors affecting human exposure.

Another strength of APEX is its ability to estimate exposures and doses on different timescales
for all simulated individuals in the sample population from the study area. This ability allows for
powerful statistical analysis of a number of exposure characteristics (e.g., acute and chronic
exposure, correlations with activities and demographics), many of which are provided
automatically by APEX in output tables.

-------
APEX also estimates the exposures of workers in the areas where they work, in addition to the
areas where they live. The pollutant concentrations in these respective locations may be very
different from each other.

The use of APEX has been facilitated by the availability of model-ready input files which have
been developed from the databases discussed above: national population demographics and
commuting information from the 2000 U.S. Census; CHAD activity data; and microenvironment
definitions.

1.4.2 Limitations

The following limitations of APEX have been identified:

• The population activity pattern data supplied with APEX (CHAD activity data) are
compiled from a number of studies in different areas, and for different seasons and years.
Therefore, the combined data set may not constitute a representative sample.
Nevertheless, the largest portion of CHAD is from random-sample studies of national
scope, which could be extracted by the user if desired to create a representative sample.

• The commuting data address only home-to-work travel; travel between sectors for other
purposes is not modeled directly. APEX can model time spent in travel; however, based
on the model settings, the ambient air quality during travel is assumed to be either 1) a
composite of the air quality in all study area sectors or 2) a composite of the air quality in
a randomly-selected group of sectors.

• APEX creates seasonal or year-long sequences of activities for a simulated individual by
sampling human activity data from more than one subject in CHAD. Thus, uncertainty
exists about season-long exposure event sequences. This approach can tend to
underestimate the variability from person to person, because each simulated person
essentially becomes a composite or an "average" of several actual people in the
underlying activity data (which tends to dampen the variability). At the same time, this
approach may overestimate the day-to-day variability for any individual if each simulated
person is represented by a sequence of potentially dissimilar activities from different
people rather than more similar activities from one person. These uncertainties are partly
addressed by the longitudinal diary assembly algorithm implemented in APEX
(Section 6.3).

• The model currently does not capture certain correlations among human activities that
can impact microenvironmental concentrations (e.g., cigarette smoking leading to an
individual opening a window, which in turn affects the amount of outdoor air penetrating
the residence).

• Certain aspects of the personal profiles are held constant, though in reality they change
every year (e.g., age). This is only an issue for simulations spanning several years.

• At this point in time, no interactions between pollutants are modeled.

-------
Other data and model limitations exist besides those identified above, including physiological,
meteorological, and those associated with estimating concentrations in microenvironments. EPA
will continue to refine the model and data to reduce these limitations to the extent possible. The
uncertainties which result from these limitations of APEX have been characterized for an ozone
assessment (Langstaff, 2007).

1.5 Applicability

APEX is an advanced air inhalation exposure model which can be used for a range of
applications. APEX can be employed to model episodic "high-end" inhalation exposures that
result from highly localized pollutant concentrations (e.g., residual risk assessments). APEX can
also provide detailed probabilistic estimates of exposure for urban and greater metropolitan areas
(e.g., for regulatory analyses supporting national decisions such as NAAQS reviews). APEX is
appropriate for assessing both long-term chronic and short-term acute inhalation exposures of the
general population or of specific segments of the population. The model is designed to look at
the range of inhalation exposures of different groups of people across a population, for a range of
averaging times, in a single simulation. The current version of APEX produces results for
flexible averaging times. By default APEX produces results for 1 hour, 8 hours, 24 hours, and
annual time periods (or the length of a simulation, if shorter than one year). However, APEX
can optionally model results for timesteps on a much smaller scale (e.g., every 5 minutes) by
setting optional run parameters and providing air quality data on the appropriate time scale.
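
As an illustration of how results for several averaging times can be derived from one hourly exposure series, the following sketch computes running means over 1-, 8-, and 24-hour windows. It assumes running (overlapping) windows, which is an assumption of this example rather than a statement of how APEX defines its averaging periods; the function name running_means and the sample data are hypothetical.

```python
def running_means(hourly, window):
    """Return the mean of every full run of `window` consecutive hourly values."""
    if window <= 0 or window > len(hourly):
        return []
    total = sum(hourly[:window])
    means = [total / window]
    # Slide the window one hour at a time, updating the running total.
    for i in range(window, len(hourly)):
        total += hourly[i] - hourly[i - window]
        means.append(total / window)
    return means


if __name__ == "__main__":
    # Hypothetical two-day hourly exposure series (arbitrary concentration units).
    hourly = [float(h % 24) for h in range(48)]
    for window in (1, 8, 24):
        print(window, max(running_means(hourly, window)))
```

The maximum of each list of running means gives the highest exposure at that averaging time for the simulated individual, which is the kind of quantity typically compared against a benchmark.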

Due to the computational demands (run time and disk space) of running APEX, it is not
appropriate for national-level assessments of population exposure. However, this is not an
inherent limitation in the model code or algorithms.

1.6 Brief History of APEX

APEX was originally derived from the probabilistic National Ambient Air Quality Standards
Exposure Model (pNEM). The NEM series was developed to estimate exposure to the criteria
pollutants (e.g., CO, ozone). In 1979, EPA began to develop NEM by assembling a database of
human activity patterns that could be used to estimate exposures to outdoor pollutants (Roddin et
al., 1979). The data were then combined with measured outdoor concentrations in NEM to
estimate exposures to CO (Biller et al., 1981; Johnson and Paul, 1983). In 1988, OAQPS began
to incorporate probabilistic elements into the NEM methodology, using activity pattern data
based on various human activity diary studies in an early version of probabilistic NEM for ozone
(pNEM/O3). In 1991, a probabilistic version of NEM was developed for CO (pNEM/CO) that
included a one compartment mass-balance model to estimate CO concentrations in indoor
microenvironments (Johnson et al., 1992). A newer version of pNEM/O3 was developed in the
1990s and applied to nine urban areas for the general population, outdoor children, and outdoor
workers (Johnson et al., 1996a,b,c). During 1999-2001, an updated version of pNEM/CO
(version 2) was developed that relied on activity diary data from CHAD and enhanced
algorithms for simulating gas stove usage, estimating alveolar ventilation rate (a measure of
human respiration), and modeling home-to-work commuting patterns.

-------
APEX evolved from pNEM to provide greater applicability, flexibility, and accuracy. The
APEX model was substantially different from pNEM, particularly in the use of a personal profile
approach rather than a cohort simulation approach.  APEX introduced a number of new features
including automatic site selection from large (e.g., national) databases, a series of new output
tables providing summary statistics, and a thoroughly reorganized method of describing
microenvironments and their parameters.  Most of the spatial and temporal constraints were
removed or relaxed in APEX.  Several major improvements to APEX have been introduced in
the most recent version, APEX4. Specifically, APEX4 includes:

• multipollutant capability;

• algorithms for the assembly of multi-day (longitudinal) activity diaries that model
intra-individual variance, inter-individual variance, and day-to-day autocorrelation in diary
properties;

• methods for adjusting diary-based energy expenditures for fatigue and excess
post-exercise oxygen consumption;

• new equations for estimation of ventilation;

• the ability to model commuters leaving the study area;

• the ability to model air quality and exposure on different time scales;

• the ability to model person-to-person variability in air quality within an air district;

• new output files containing diary event-level, timestep-level, and hourly-level exposure,
dose, and ventilation data, and hourly-level microenvironmental data;

• the ability to model the prevalence of disease states such as asthma;

• new output exposure tables that report exposure statistics for subpopulations such as
children and active people under different ventilation levels;

• the ability to model inhaled dose for pollutants;

• the inclusion of commuting data from the 2000 census; and

• expanded options for modeling microenvironments.

-------
CHAPTER 2. OVERVIEW OF MODEL DESIGN AND
ALGORITHMS

This chapter provides a brief outline of the key modeling steps, logic processes, and databases
used in APEX.

APEX is designed to simulate population exposure to criteria and air toxic pollutants at local,
urban, and regional scales. The user specifies the geographic area to be modeled and the number
of individuals to be simulated to represent this population. APEX then generates a personal
profile for each simulated person that specifies various parameter values required by the model.
The model next uses diary-derived time/activity data matched to each personal profile to
generate an exposure event sequence (also referred to as "activity pattern" or "composite diary")
for the modeled individual that spans a specified time period, such as one year. Each event in the
sequence specifies a start time, an exposure duration, a geographic location, a microenvironment,
and an activity. Probabilistic algorithms are used to estimate the pollutant concentration and
ventilation (respiration) rate associated with each exposure event. The estimated pollutant
concentrations account for the effects of ambient (outdoor) pollutant concentration, penetration
factor, air exchange rate, decay/deposition rate, and proximity to emission sources, depending on
the microenvironment, available data, and the estimation method selected by the user. The
ventilation rate is derived from an energy expenditure rate estimated for the specified activity.
Because the modeled individuals represent a random sample of the population of interest, the
distribution of modeled individual exposures can be extrapolated to the larger population.
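
The extrapolation from the simulated sample to the study area population can be illustrated with a short worked example. It assumes that each simulated individual carries equal weight, so the fraction of profiles exceeding a benchmark estimates the fraction of the population exceeding it. The function name, the benchmark, and the numbers are hypothetical.

```python
def estimate_population_exceeding(max_exposures, benchmark, population):
    """Scale the sample fraction above `benchmark` up to the full population.

    max_exposures: one value per simulated individual (e.g., highest 8-hour exposure).
    Returns (fraction of sample above benchmark, estimated number of persons).
    """
    if not max_exposures:
        raise ValueError("at least one simulated individual is required")
    n_over = sum(1 for x in max_exposures if x > benchmark)
    fraction = n_over / len(max_exposures)
    return fraction, fraction * population


if __name__ == "__main__":
    # Hypothetical results for four profiles in a study area of 1,000 persons.
    sample = [0.05, 0.09, 0.12, 0.07]
    print(estimate_population_exceeding(sample, benchmark=0.08, population=1000))
```

With only a handful of profiles the estimate is very coarse; the precision of such a fraction improves as the number of simulated individuals grows, at the cost of longer run time.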

The model simulation includes up to seven steps:

1. Characterize the study area - APEX selects sectors (e.g., census tracts) within a study
area—and thus identifies the potentially exposed population—based on the user-defined
center and radius of the study area and availability of air quality and weather input data
for the area.

2. Generate simulated individuals - APEX stochastically generates a sample of simulated
individuals based on the census data for the study area and human profile distribution
data (such as age-specific employment probabilities). The user can specify the size of the
sample. The larger the sample, the more representative it is of the population in the study
area (but also the longer the computing time).

3. Construct a sequence of activity events - APEX constructs an exposure event sequence
(activity pattern) spanning the period of simulation for each of the simulated persons
(based on the supplied Consolidated Human Activity Database (CHAD) data, although
other data could be used).

4. Estimate energy expenditures and ventilation - APEX constructs a time-series of energy
expenditures for each profile based on the activity event sequence. These expenditures
are adjusted for physiological realism, and then used to estimate a number of ventilation
metrics that are later used in estimating dose and in identifying a subpopulation of
active persons for use in creating exposure summary tables.

5. Calculate timestep concentrations in microenvironments for each pollutant - APEX enables
the user to define microenvironments that people in a study area would visit (e.g., by
grouping location codes included in the supplied CHAD database). The model then
calculates timestep concentrations of each pollutant in each of the microenvironments for
the period of simulation, based on the user-provided ambient air quality data. All the
timestep concentrations in the microenvironments are re-calculated for each simulated
individual.

6. Calculate exposures for each pollutant - APEX assigns a concentration to each exposure
event based on the microenvironment occupied during the event and the person's activity.
These values are averaged by timestep and by clock hour to produce a sequence of
timestep and hourly-average exposures spanning the specified exposure period (typically
one year). These hourly values may be further aggregated to produce daily, monthly, and
annual average exposure values.

7. Calculate doses - APEX optionally calculates timestep, hourly, daily, monthly, and annual
average dose values for each of the simulated individuals.
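
As an illustration of step 6, the time-weighted averaging of event concentrations into clock-hour
exposures might be sketched as follows. This is a minimal Python sketch and not APEX code: the
function name hourly_average_exposure, the event layout (start minute, duration in minutes,
concentration), and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict


def hourly_average_exposure(events):
    """Return time-weighted average concentration per clock hour.

    events: iterable of (start_minute, duration_minutes, concentration),
    a hypothetical layout chosen for this sketch.
    """
    weighted = defaultdict(float)  # hour index -> sum of concentration * minutes
    minutes = defaultdict(float)   # hour index -> minutes covered by events
    for start, duration, conc in events:
        t, end = start, start + duration
        while t < end:
            hour = int(t // 60)
            # Portion of this event that falls inside the current clock hour.
            chunk = min(end, (hour + 1) * 60) - t
            weighted[hour] += conc * chunk
            minutes[hour] += chunk
            t += chunk
    return {h: weighted[h] / minutes[h] for h in sorted(minutes)}


# Three consecutive events spanning two clock hours (illustrative values).
events = [(0, 45, 10.0), (45, 30, 40.0), (75, 45, 20.0)]
hourly = hourly_average_exposure(events)
```

An event that straddles a clock-hour boundary contributes to each hour in proportion to the
minutes it spends there, which is why the second event above is split between the two hours.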

The model simulation continues until exposures are calculated for the user-specified number of
simulated individuals. Figure 2.1 presents these steps within a schematic of the APEX model
design. The following chapters provide additional detail on the algorithms used in each of the
above simulation steps.

The above steps are largely self-contained in the APEX computer code and do not depend on
subsequent steps. For example, the generation of simulated individuals (step 2) is independent of
any other profile characteristics or modeling results. This means that the profile variables do not
depend on the diaries assigned to that profile or on the properties of the microenvironments for
that profile. The assignment of diaries to the profile (step 3) depends on the profile variables but
not on the microenvironments, the exposure, the dose, or the properties of any other profile. The
calculation of microenvironment concentrations (step 5) can depend on the profile variables
through the use of conditional variables, but in APEX this step cannot depend on the contents of
the selected activity diaries. Conceptually, this means that the microenvironments essentially
have an existence of their own that is independent of the activities of the profile. For example,
activities such as smoking and cooking can "occur" in a residence even when the person being
profiled is not at home. In reality, the activities of the profiled person can have some effect on
the microenvironments they visit, but this is not captured in the present version of the model.
However, through the judicious use of source terms in microenvironments, APEX can simulate
changes in concentrations due to the presence of the person (e.g., a "personal cloud" effect).

-------
[Figure 2.1a. Overview of APEX, Part 1. Flow diagram for three stages: 1. Characterize study area;
2. Characterize study population; 3. Generate N simulated individuals (profiles). Items shown:
2000 Census tract-level data for the entire U.S. (sectors = tracts for the NAAQS ozone exposure
application); sector location data (latitude, longitude); sector population data
(age/gender/race); locations of air quality and meteorological measurements and their radii of
influence; commuting flow data (origin/destination sectors); age/gender/tract-specific employment
probabilities; age/gender-specific physiological distribution data (body weight, height, etc.);
distribution functions for profile variables (e.g., probability of air conditioning); and
distribution functions for seasonal and daily varying profile variables (e.g., window status, car
speed). Intermediate results are the defined study area (sectors within a city radius and with air
quality and meteorological data within their radii of influence) and the population within the
study area. A stochastic profile generator produces a simulated individual with the following
profile: home sector, work sector (if employed), age, gender, race, employment status, home gas
stove, home gas pilot, home air conditioner, car air conditioner, and physiological parameters
(height, weight, etc.). Legend categories: national database, area-specific input data, data
processor, intermediate step or data, output data.]

-------
[Figure 2.1b. Overview of APEX, Part 2. Flow diagram for stage 4: construct a sequence of activity
events for each simulated individual. Diary events/activities and personal information (e.g.,
from CHAD) are grouped into activity diary pools by day type/temperature category. Using
maximum/mean daily temperature data, each day in the simulation period is assigned to an activity
pool based on day type and temperature category. A stochastic diary selector, using age, gender,
employment, and a key diary statistic (for longitudinal diary assembly) together with the profile
for an individual, selects diary records for each day in the simulation period, resulting in a
sequence of events (microenvironments visited, minutes spent, and activity) in the simulation
period for an individual. Physiological parameters from the profile feed a stochastic calculation
of energy expended per event (adjusted for physiological limits and EPOC) and of ventilation,
yielding the sequence of events for an individual.]

-------
[Figure 2.1c. Overview of APEX, Part 3. Flow diagram for three stages: 5. Calculate concentrations
in microenvironments for all events for each simulated individual; 6. Calculate timestep/hourly
exposures for each simulated individual; 7. Calculate population exposure statistics. Items shown:
microenvironments defined by grouping of CHAD location codes; input functions describing
interpersonal, geographic, meteorological, and temporal variation in microenvironmental
parameters; the calculation method selected for each microenvironment (factors or mass balance);
timestep or hourly air quality data for all sectors; and the sequence of events for each simulated
individual. Concentrations are calculated in all microenvironments and then, at each timestep, in
the microenvironments visited, giving concentrations for all events for each simulated individual
and the timestep concentrations and minutes spent in each microenvironment visited. Outputs are
average exposures for the simulated person, stratified by ventilation rate (timestep, hourly,
daily timestep max, daily, ...), and population exposure indicators for the total population,
children, and asthmatic children. Steps 5-7 are repeated for each pollutant in the simulation.]

-------
CHAPTER 3. USING PROBABILITY DISTRIBUTIONS IN APEX

APEX is a stochastic model. It makes use of random sampling from probability distributions to
model variability in a number of input model parameters. Specifically, distributions are used:

1) To model variability in MET (energy expenditures) for different activities. Input MET
distributions for each activity are defined in the Activity-Specific MET file.

2) To model inter-person variability in physiological parameters. Physiological parameter
distributions are defined for each different age-gender cohort in the Physiology file.

3) To model timestep, hourly, daily, or geographic variability in microenvironment
parameters. Distributions for microenvironmental parameters are defined in the
Microparameter Descriptions file.

4) To model person-to-person variation in hourly air quality data within an air district.
Distributions for hourly air quality values can be defined in the Air Quality Data file. This is an
optional feature of APEX.

This chapter gives direction on how to define distributions in these input files. In addition, each
distribution available in APEX and its parameters are defined in detail.
3.1 The APEX Input Distribution Format

In all APEX input files, distributions are defined in the same manner, via a standard APEX
format. This format consists of the following items:

• Distribution Shape. This variable gives the shape of the distribution.
• Par1. Parameter 1 of the distribution. Depends on shape.
• Par2. Parameter 2 of the distribution. Depends on shape.
• Par3. Parameter 3 of the distribution. Depends on shape.
• Par4. Parameter 4 of the distribution. Depends on shape.
• LTrunc. Lower truncation point of the distribution.
• UTrunc. Upper truncation point of the distribution.
• ResampOut. Distribution resampling flag.

The distribution shape is a text keyword that defines the type of distribution to be used. The next
four items (Par1-Par4) are numerical values defining the parameters of the distribution. The next
two items (LTrunc and UTrunc) are the optional truncation limits for the distribution, and the last
item is an optional character flag (ResampOut, set to either Y or N) indicating how sampled
values outside of the truncation limits are handled. All the information is entered on a single line
in the appropriate input file. Note that in each input file the distribution definition may be
preceded on each line by additional data specific to that file; see sections on individual input
files.

-------
The probability distributions allowed in APEX are listed in Table 3.1. Equations for each of the
distributions in the table are given in Section 3.2.
                   Table 3.1.  Available Probability Distributions in APEX.

Distribution   Keyword      Par1                                   Par2                                    Par3             Par4             LTrunc  UTrunc  ResampOut
Beta           BETA         Minimum                                Maximum                                 Shape1 (s1) > 0  Shape2 (s2) > 0  opt.    opt.    opt.
Cauchy         CAUCHY       Median                                 Scale (b) > 0                           -                -                opt.    opt.    opt.
Discrete       DISCRETE     (no parameters; see note below)
Exponential    EXPONENTIAL  Decay constant, k > 0                  Shift (a)                               -                -                opt.    opt.    opt.
Extreme Value  EVALUE       Scale (b) > 0                          Shift (a)                               -                -                opt.    opt.    opt.
Gamma          GAMMA        Shape (s) > 0                          Scale (b) > 0                           Shift (a)        -                opt.    opt.    opt.
Logistic       LGT          Mean                                   Scale (b) > 0                           -                -                opt.    opt.    opt.
Lognormal      LOGNORMAL    Geometric mean (gm) of unshifted dist  Geometric standard deviation (gsd) > 1  Shift (a)        -                opt.    opt.    opt.
Loguniform     LUNIFORM     Minimum > 0                            Maximum > 0                             -                -                opt.    opt.    opt.
Normal         NORMAL       Mean                                   Standard deviation                      -                -                opt.    opt.    opt.
OffOn          OFFON        Probability of being 0 (0-1)           -                                       -                -                -       -       -
Pareto         PARETO       Shape (s) > 0                          Scale (b) > 0                           Shift (a)        -                opt.    opt.    opt.
Point          POINT        Point value                            -                                       -                -                -       -       -
Triangle       TRIANGLE     Minimum                                Maximum                                 Peak             -                opt.    opt.    opt.
Uniform        UNIFORM      Minimum                                Maximum                                 -                -                opt.    opt.    opt.
Weibull        WEIBULL      Shape (s) > 0                          Scale (b) > 0                           Shift (a)        -                opt.    opt.    opt.

Notes: "opt." marks an optional item. LTrunc is the lower truncation limit, UTrunc is the upper
truncation limit, and ResampOut (Y/N) indicates whether to resample values that fall outside the
truncation limits. A dash (-) marks a grayed-out cell (an item not used by that distribution).
The Discrete distribution has no parameters; rather, the keyword is simply followed by a
space-delimited list of up to 100 discrete values. The distribution returns each of these
values with equal probability.

-------
Cells that are grayed out in the table correspond to items not needed for a particular distribution,
and data entered in these locations will be ignored by APEX. In addition, the LTrunc, UTrunc,
and ResampOut items are in general optional. (If LTrunc and UTrunc are defined but
ResampOut is not, the default value of ResampOut=Y is used.) Note however, that a
placeholder period (".") must be used in the distribution definition for each item that is not
used.
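
The item order and placeholder convention described above can be made concrete with a short
parsing sketch. This is an illustrative Python sketch only, not the APEX reader: the function
name parse_distribution, the dictionary layout, the padding of short lines, and the example
truncation limits (40 and 60) are assumptions, and the Discrete distribution is not handled.

```python
def parse_distribution(tokens):
    """Parse a distribution definition given as a list of text tokens.

    tokens[0] is the Shape keyword, followed by Par1-Par4, LTrunc, UTrunc,
    and ResampOut. A period (".") marks an item that is not used.
    """
    def to_number(token):
        return None if token == "." else float(token)

    # Pad with placeholders so that a short line still yields seven items
    # (an assumption of this sketch; APEX itself may require every placeholder).
    items = (list(tokens[1:8]) + ["."] * 7)[:7]
    flag = items[6].upper()
    return {
        "shape": tokens[0].upper(),
        "pars": [to_number(t) for t in items[0:4]],
        "ltrunc": to_number(items[4]),
        "utrunc": to_number(items[5]),
        # ResampOut defaults to Y when no flag is supplied.
        "resamp_out": flag != "N",
    }


# Hypothetical definition: normal distribution truncated to [40, 60].
spec = parse_distribution("Normal 48.3 1.7 . . 40 60 .".split())
```

Because the parsing starts at the Shape keyword, the same routine could be applied to any of the
input files once the file-specific leading columns have been removed.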

Consider each of the examples below (one from each input file using distributions):

From the Activity-Specific MET file:

Row Act Age Occ. Shape Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1 10000 0 ADMIN LogNormal 1.7 1.450 . 1.4 2.7 Y

From the Physiology file:

Gen Shape Par1 Par2 Par3 Par4
M Normal 48.3 1.7

From the Microparameter Descriptions file:

Block DType Season Area C1 C2 C3 Shape
1 1 1 1 1 1 1 Normal
Note that in each case the distribution definitions follow the exact same format (starting with the
Shape keyword). The only distribution type that does not follow this format is the Discrete
distribution (see Section 3.2.4).

Distributions are read from the various input files and stored in DistributionModule:ReadDist.
3.2 Details of Distribution Sampling, Truncation, and Resampling

Probability density functions (PDFs) for each of the APEX distributions, parameterized
in terms of their input APEX parameters, are given in this section (with the exception of the
OffOn and Point distributions, which are trivial). In addition, real examples (of 10000 samples
each) from APEX are shown for untruncated distributions and for truncated distributions using
both ResampOut=Y and ResampOut=N.

When needed, stored distributions (which were read from the input files) are sampled in
DistributionModule:SampleDist.
3.2.1 Resampling Options

ResampOut determines how truncated distributions are handled by the APEX sampling routines.
If ResampOut = N, then any generated sample outside the truncation points is set to the
truncation limit; in this case, samples "stack up" at the truncation points, and the probability
associated with the area under the PDF outside the truncation bounds is associated with the
truncation limit. If ResampOut = Y, then a new random value is selected from inside the valid
range. In this case, the probability outside the limits is spread over the valid values, and thus the
probabilities inside the truncation limits will be higher than the theoretical untruncated PDF.
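
The two ResampOut behaviors can be sketched as follows. This is a minimal Python illustration
under stated assumptions: the function name sample_truncated is hypothetical, redrawing until a
value falls inside the limits (rejection sampling) is one possible way to select a new value from
the valid range and is not confirmed as the APEX method, and the normal distribution parameters
are arbitrary.

```python
import random


def sample_truncated(draw, ltrunc, utrunc, resamp_out=True, max_tries=10000):
    """Draw one value and apply the truncation rule.

    draw: zero-argument function returning one untruncated sample.
    resamp_out=False mimics ResampOut=N (set to the nearest limit).
    resamp_out=True mimics ResampOut=Y (redraw; rejection is an assumption).
    """
    x = draw()
    if ltrunc <= x <= utrunc:
        return x
    if not resamp_out:
        # Samples outside the limits "stack up" at the truncation points.
        return min(max(x, ltrunc), utrunc)
    for _ in range(max_tries):
        x = draw()
        if ltrunc <= x <= utrunc:
            return x
    raise RuntimeError("no sample found inside the truncation limits")


rng = random.Random(1)
draw = lambda: rng.gauss(8.0, 2.0)  # illustrative normal, mean 8 and SD 2
stacked = [sample_truncated(draw, 4.0, 12.0, resamp_out=False) for _ in range(10000)]
resampled = [sample_truncated(draw, 4.0, 12.0, resamp_out=True) for _ in range(10000)]
n_at_limits = sum(1 for x in stacked if x in (4.0, 12.0))
```

With ResampOut=N a visible share of the samples sits exactly on the limits, whereas with
ResampOut=Y the same probability mass is redistributed over the interior of the range.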

For each of the distributions defined in this section, the theoretical PDF is shown plotted against
APEX results for each truncation case. (Note that, where noted, the "untruncated" case is actually a
truncated case with the truncation points set to the 0.1st and 99.9th percentiles. These
distributions had very long tails, and they were truncated so they would fit in the illustration.)

-------
3.2.2  Beta Distribution

The PDF for the beta distribution in terms of the APEX input parameters is

$$p(x) = \frac{(x - \min)^{s_1 - 1}\,(\max - x)^{s_2 - 1}\,\Gamma(s_1 + s_2)}{\Gamma(s_1)\,\Gamma(s_2)\,(\max - \min)^{s_1 + s_2 - 1}} \tag{3-1}$$

where Γ indicates the gamma function and s1 and s2 are shape parameters. See Table 3.1 for
assignment of the parameters in this equation to the APEX parameters Par1-Par4. The
theoretical PDF for the beta distribution is illustrated in Figure 3.1 along with real examples
obtained from APEX using the different sampling options.

[Figure 3.1 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [3-7], ResampOut=N; truncated, range [3-7], ResampOut=Y.]
                         Figure 3.1. The Beta Distribution in APEX.

-------
3.2.3  Cauchy Distribution

The PDF for the Cauchy distribution in terms of the APEX input parameters is

$$p(x) = \frac{1}{b\,\pi\left[1 + \left(\dfrac{x - \mathrm{median}}{b}\right)^{2}\right]} \tag{3-2}$$

where b is a scale parameter. See Table 3.1 for assignment of the parameters in this equation to
the APEX parameters Par1-Par4. The theoretical PDF for the Cauchy distribution is illustrated
in Figure 3.2 along with real examples obtained from APEX using the different sampling
options.

[Figure 3.2 plot: theoretical PDF compared with APEX samples for three cases: truncated at the
0.1st and 99.9th percentiles; truncated, range [-1 - 8], ResampOut=N; truncated, range [-1 - 8],
ResampOut=Y.]
                        Figure 3.2. The Cauchy Distribution in APEX.

-------
3.2.4   Discrete Distribution

The discrete distribution is a custom form of APEX distribution. Rather than being defined by
the regular 7 parameters, the discrete distribution is just given as a space-separated list of up to
100 values. APEX will return all values with equal probability.

An example of a discrete distribution having 6 values is shown in Figure 3.3.

[Figure 3.3 plot: APEX samples from a DISCRETE distribution with values 2.2, 4.4, 6.6, 8.8, 11.0,
and 13.2.]
                       Figure 3.3. The Discrete Distribution in APEX.

-------
3.2.5  Exponential Distribution

The PDF for the exponential distribution in terms of the APEX input parameters is
$$p(x) = k\,e^{k(a - x)} \tag{3-3}$$

where a is a shift parameter and k is the decay constant. See Table 3.1 for assignment of the
parameters in this equation to the APEX parameters Par1-Par4. The theoretical PDF for the
exponential distribution is illustrated in Figure 3.4 along with real examples obtained from
APEX using the different sampling options.

[Figure 3.4 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [5-15], ResampOut=N; truncated, range [5-15], ResampOut=Y.]
                     Figure 3.4. The Exponential Distribution in APEX.
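
A shifted exponential with decay constant k and shift a has the cumulative distribution
F(x) = 1 - exp(k(a - x)) for x >= a, so a uniform random number u can be converted to a sample by
solving F(x) = u. The Python sketch below illustrates this inverse-transform approach. It is an
assumption that this technique is a reasonable stand-in; the document does not state how APEX
draws its samples, and the function name and the values of k and a are hypothetical.

```python
import math
import random


def sample_shifted_exponential(k, a, rng):
    """Inverse-transform sample: solve 1 - exp(k * (a - x)) = u for x."""
    u = rng.random()                    # u lies in [0, 1), so 1 - u is in (0, 1]
    return a - math.log(1.0 - u) / k


rng = random.Random(0)
k, a = 0.25, 5.0                        # illustrative decay constant and shift
xs = [sample_shifted_exponential(k, a, rng) for _ in range(20000)]
sample_mean = sum(xs) / len(xs)         # theoretical mean is a + 1/k
```

No sample can fall below the shift a, and the sample mean should approach a + 1/k as the number
of samples grows.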

-------
3.2.6   Extreme Value Distribution

The PDF for the extreme value distribution in terms of the APEX input parameters is


[Equation 3-4, the extreme value PDF, is not legible in the source.]               (3-4)

where a is a shift parameter and b is a scale parameter. See Table 3.1 for assignment of the
parameters in this equation to the APEX parameters Par1-Par4. The theoretical PDF for the
extreme value distribution is illustrated in Figure 3.5 along with real examples obtained from
APEX using the different sampling options.

[Figure 3.5 plot: EXTREME VALUE distribution with Scale=3 and Shift=2. Theoretical PDF compared
with APEX samples for three cases: untruncated; truncated, range [-3 - 7], ResampOut=N;
truncated, range [-3 - 7], ResampOut=Y.]
                    Figure 3.5. The Extreme Value Distribution in APEX.

-------
3.2.7   Gamma Distribution

 The PDF for the gamma distribution in terms of the APEX input parameters is
$$p(x) = \frac{(x - a)^{s - 1}\,e^{(a - x)/b}}{b^{s}\,\Gamma(s)} \tag{3-5}$$

where a is a shift parameter, b is a scale parameter, s is a shape parameter, and Γ indicates
the gamma function. See Table 3.1 for assignment of the
parameters in this equation to the APEX parameters Par1-Par4. The theoretical PDF for the
gamma distribution is illustrated in Figure 3.6 along with real examples obtained from APEX
using the different sampling options.

[Figure 3.6 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [2.5-14], ResampOut=N; truncated, range [2.5-14], ResampOut=Y.]
                       Figure 3.6. The Gamma Distribution in APEX.

-------
3.2.8  Logistic Distribution

The PDF for the logistic distribution in terms of the APEX input parameters is
$$p(x) = \frac{e^{(a - x)/b}}{b\left[1 + e^{(a - x)/b}\right]^{2}} \tag{3-6}$$

where a is a shift parameter and b is a scale parameter. See Table 3.1 for assignment of the
parameters in this equation to the APEX parameters Par1-Par4. The theoretical PDF for the
logistic distribution is illustrated in Figure 3.7 along with real examples obtained from APEX
using the different sampling options.

[Figure 3.7 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [0-4], ResampOut=N; truncated, range [0-4], ResampOut=Y.]
                       Figure 3.7. The Logistic Distribution in APEX.

-------
3.2.9  Lognormal Distribution

The PDF for the lognormal distribution in terms of the APEX input parameters is

$$p(x) = \frac{1}{(x - a)\,\sqrt{2\pi}\,\log(GSD)} \exp\left[-\frac{1}{2}\left(\frac{\log\left(\dfrac{x - a}{GM}\right)}{\log(GSD)}\right)^{2}\right] \tag{3-7}$$

where a is a shift parameter, GM is the geometric mean, and GSD is the geometric standard
deviation. See Table 3.1 for assignment of the parameters in this equation to the APEX
parameters Par1-Par4. The theoretical PDF for the lognormal distribution is illustrated in Figure
3.8 along with real examples obtained from APEX using the different sampling options.

[Figure 3.8 plot: LOGNORMAL distribution with GM=1, GSD=1.5, and Shift=2. Theoretical PDF compared
with APEX samples for three cases: untruncated; truncated, range [2.5-4], ResampOut=N;
truncated, range [2.5-4], ResampOut=Y.]
                      Figure 3.8. The Lognormal Distribution in APEX.
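
The roles of GM, GSD, and the shift can be illustrated by drawing samples: if z is a standard
normal variate, then shift + GM * GSD**z follows the shifted lognormal distribution. The Python
sketch below uses the parameter values of the Figure 3.8 example (GM=1, GSD=1.5, Shift=2). The
function name is hypothetical, and this is an assumption-based illustration rather than the
APEX sampling routine.

```python
import math
import random


def sample_shifted_lognormal(gm, gsd, shift, rng):
    """Return shift + exp(log(gm) + z * log(gsd)) for a standard normal z."""
    z = rng.gauss(0.0, 1.0)
    return shift + gm * gsd ** z


rng = random.Random(42)
gm, gsd, shift = 1.0, 1.5, 2.0
xs = [sample_shifted_lognormal(gm, gsd, shift, rng) for _ in range(20001)]

# Recover GM and GSD from the logs of the unshifted samples.
logs = [math.log(x - shift) for x in xs]
mean_log = sum(logs) / len(logs)
sd_log = math.sqrt(sum((v - mean_log) ** 2 for v in logs) / len(logs))
sample_gm = math.exp(mean_log)
sample_gsd = math.exp(sd_log)
```

The recovered values should be close to the inputs, which confirms that GM and GSD describe the
unshifted distribution and that the shift simply translates every sample.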

-------
3.2.10 Loguniform Distribution

The PDF for the loguniform distribution in terms of the APEX input parameters is

$$p(x) = \frac{1}{x\,\log\left(\dfrac{\max}{\min}\right)} \tag{3-8}$$

where min and max are the minimum and maximum values of the untruncated distribution,
respectively. See Table 3.1 for assignment of the parameters in this equation to the APEX
parameters Par1-Par4. The theoretical PDF for the loguniform distribution is illustrated in
Figure 3.9 along with real examples obtained from APEX using the different sampling options.

[Figure 3.9 plot: LOGUNIFORM distribution with Min=1 and Max=5. Theoretical PDF compared with APEX
samples for three cases: untruncated; truncated, range [2-4], ResampOut=N; truncated, range
[2-4], ResampOut=Y.]
                     Figure 3.9. The Loguniform Distribution in APEX.

-------
3.2.11 Normal Distribution

The PDF for the normal distribution in terms of the APEX input parameters is
$$p(x) = \frac{1}{SD\,\sqrt{2\pi}} \exp\left[-\frac{(x - \mathrm{mean})^{2}}{2\,SD^{2}}\right] \tag{3-9}$$

where mean is the mean of the untruncated distribution and SD is the standard deviation. See
Table 3.1 for assignment of the parameters in this equation to the APEX parameters Par1-Par4.
The theoretical PDF for the normal distribution is illustrated in Figure 3.10 along with real
examples obtained from APEX using the different sampling options.

[Figure 3.10 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [4-12], ResampOut=N; truncated, range [4-12], ResampOut=Y.]
                       Figure 3.10. The Normal Distribution in APEX.

-------
3.2.12 Pareto Distribution

The PDF for the Pareto distribution in terms of the APEX input parameters is

$$p(x) = \frac{s\,b^{s}}{(x - a)^{s + 1}} \tag{3-10}$$

where a is a shift parameter, b is a scale parameter, and s is a shape parameter. See Table 3.1 for
assignment of the parameters in this equation to the APEX parameters Par1-Par4. The theoretical
PDF for the Pareto distribution is illustrated in Figure 3.11 along with real examples obtained
from APEX using the different sampling options.
-------
3.2.13 Triangle Distribution

The PDF for the triangle distribution in terms of the APEX input parameters is
$$p(x) = \begin{cases} \dfrac{2\,(x - \min)}{(\mathrm{peak} - \min)(\max - \min)}, & \min \le x \le \mathrm{peak} \\[2ex] \dfrac{2\,(\max - x)}{(\max - \min)^{2}\left(1 - \dfrac{\mathrm{peak} - \min}{\max - \min}\right)}, & \mathrm{peak} < x \le \max \end{cases} \tag{3-11}$$

where min, max, and peak are the minimum, maximum, and peak of the untruncated distribution,
respectively. See Table 3.1 for assignment of the parameters in this equation to the APEX
parameters Par1-Par4. The theoretical PDF for the triangle distribution is illustrated in Figure
3.12 along with real examples obtained from APEX using the different sampling options.

[Figure 3.12 plot: theoretical PDF compared with APEX samples for three cases: untruncated;
truncated, range [4-9], ResampOut=N; truncated, range [4-9], ResampOut=Y.]
Figure 3.12. The Triangle Distribution in APEX.
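The two branches of the triangle PDF (Equation 3-11) can be checked numerically: the rising
branch is 2(x - min)/((peak - min)(max - min)) and the falling branch is
2(max - x)/((max - min)^2 (1 - (peak - min)/(max - min))). The Python sketch below evaluates both
and integrates the result with a midpoint rule. The function names and the parameter values
(min=2, max=10, peak=5) are assumptions for illustration, and the sketch requires
min < peak < max.

```python
def triangle_pdf(x, lo, hi, peak):
    """Triangle PDF with minimum lo, maximum hi, and mode peak (lo < peak < hi)."""
    if lo <= x <= peak:
        return 2.0 * (x - lo) / ((peak - lo) * (hi - lo))
    if peak < x <= hi:
        return 2.0 * (hi - x) / ((hi - lo) ** 2 * (1.0 - (peak - lo) / (hi - lo)))
    return 0.0


def integrate_midpoint(f, a, b, n=100000):
    """Approximate the integral of f over [a, b] with n midpoint cells."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


lo, hi, peak = 2.0, 10.0, 5.0           # illustrative parameter values
area = integrate_midpoint(lambda x: triangle_pdf(x, lo, hi, peak), lo, hi)
height_at_peak = triangle_pdf(peak, lo, hi, peak)   # equals 2 / (hi - lo)
```

The area under the two branches is 1, and both branches meet at the height 2/(max - min) at the
peak, so the falling branch is equivalent to the more familiar form
2(max - x)/((max - min)(max - peak)).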
-------
3.2.14 Uniform Distribution

The PDF for the uniform distribution in terms of the APEX input parameters is

$$p(x) = \frac{1}{\max - \min} \tag{3-12}$$

where min and max are the minimum and maximum values of the untruncated distribution,
respectively. See Table 3.1 for assignment of the parameters in this equation to the APEX
parameters Par1-Par4. The theoretical PDF for the uniform distribution is illustrated in Figure
3.13 along with real examples obtained from APEX using the different sampling options.

[Figure 3.13 plot: UNIFORM distribution with Min=3 and Max=10. Theoretical PDF compared with APEX
samples for three cases: untruncated; truncated, range [4-8], ResampOut=N; truncated, range
[4-8], ResampOut=Y.]
Figure 3.13. The Uniform Distribution in APEX.
-------
3.2.15 Weibull Distribution

The PDF for the Weibull distribution in terms of the APEX input parameters is
$$p(x) = s\,b^{-s}\,(x - a)^{s - 1} \exp\left[-\left(\frac{x - a}{b}\right)^{s}\right] \tag{3-13}$$

where a is a shift parameter, b is a scale parameter, and s is a shape parameter. See Table 3.1 for
assignment of the parameters in this equation to the APEX parameters Par1-Par4. The theoretical
PDF for the Weibull distribution is illustrated in Figure 3.14 along with real examples obtained
from APEX using the different sampling options.

[Figure 3.14 plot: WEIBULL distribution with Shape=2, Scale=2, and Shift=3. Theoretical PDF
compared with APEX samples for three cases: untruncated; truncated, range [4-8], ResampOut=N;
truncated, range [4-8], ResampOut=Y.]
Figure 3.14. The Weibull Distribution in APEX.
-------
CHAPTER 4. CHARACTERIZING THE STUDY AREA

An initial study area in an APEX analysis consists of a set of basic geographic units called
sectors, typically defined as census tracts. (See Nomenclature, Section 1.3.) The user provides
the geographic center (latitude/longitude) and radius of the study area. APEX calculates the
distances to the center of the study area of all the sectors included in the sector location database,
and then selects the sectors within the radius of the study area. One can also provide a list of
counties or census tracts as part of the specification of the initial study area. APEX then maps
the user-provided timestep air quality district and hourly meteorological zone data to the selected
sectors. The sectors identified as having acceptable air and meteorological data within the radius
of the study area are selected to comprise a final study area for the APEX simulation analysis.
This final study area determines the population make-up of the simulated persons (profiles) to be
modeled.

The following sections describe in more detail how a final study area is determined in an APEX
simulation analysis.
4.1 APEX Spatial Units
4.1.1 Initial Study Area

The APEX study area has typically been the neighborhood around an emission source or on the
scale of a city or larger metropolitan area. Larger study areas are possible to simulate, depending
on computing capabilities, available data, and the desired precision of the run.

The user defines an initial study area by specifying the latitude and longitude of a central point
(referred to here as the study area central location), together with a radius. The user also has the
option of providing a list of counties or census tracts to be modeled. If present, this list further
restricts the area to be modeled to the listed counties or tracts that lie within the
specified study area radius. The final study area is a function of the availability of the
user-supplied demographic data, pollutant concentration data, and meteorological data within the
initial study area, as determined respectively by population sectors, air quality districts, and
meteorological zones. Figure 4.1 and the subsections below provide additional details about
these geographical units.
4.1.2 Sectors

The demographic data used by the model to create personal profiles is provided at the sector
level. For each sector the user must provide demographic information allowing the
determination of age, gender, race, and work status. This is most commonly done by equating
sectors with census tracts and providing input files with counts at the tract level for each age,
gender, and race combination. The current release of APEX includes input files that already
contain this demographic and location data for all census tracts in the 50 states and D.C., based
on the 2000 Census. One of the APEX input files (generically named Sector Location file in this
guide) lists the sector ID and location for all sectors that have associated population data. The
supplied Sector Location file has been prepared listing all the census tracts in the 2000 U.S.
Census. Corresponding Population files have been supplied as well. This allows the user to
model any desired study area in the country without having to make any changes to these input
files.

[Figure: map schematic with the following labeled elements: Initial Study Area (within CityRadius distance from Study Area Center); Study Area Center; Intermediate Study Area (e.g., Charlotte, NC CMSA); Zone Center and Zone Radius; Zone (ZoneRadius distance from Zone Center); District Air Monitor and District Air Radius; District (AirRadius distance from Air Monitor); Final Study Area (dark line; the Sectors comprising the five Districts).]
Figure 4.1. Example of Study Areas, Air Quality Districts, Meteorological Zones, and Sectors

If available, finer scales such as census block groups could be used instead. Also, data could be
aggregated to larger regions such as counties if fewer sectors were desired. Regardless of the
specific meaning for sectors, in APEX the shape of sectors is irrelevant in the sense that the
model only uses the central location of each sector, determined by the latitude and longitude for
some representative point.

In the Simulation Control file for an APEX run, the user specifies the area to be modeled by
specifying the latitude and longitude of a central location for the study area, along with a radius
(the CityRadius parameter). Optionally, the user may also provide a list of counties or census
tracts to be modeled. If present, this list further restricts the study area.
For each model run, APEX selects the sectors that meet the study area conditions in the
following way. First, the sector location must be within the specified distance (radius) of the
designated center of the study area. Second, if the user provided a specific list of counties
identified by their FIPS codes (using CountyList and County in the Control file), then the sector
must belong to one of these counties. This is achieved using the first five characters of the sector
ID, which should contain the FIPS code for the county in which the sector is located. The user
may also provide a list of census tracts (sectors) using the TractList and Tract variables. If no
county or tract list is provided, the initial study area is roughly circular, consisting of all sectors
with sector locations within the specified radius. The final study area consists of the subset of
sectors in the initial study area for which air quality and meteorological data are available. This
is discussed under the next two subsections (4.1.3 and 4.1.4).
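
The selection rules above can be sketched as follows. This is an illustrative outline, not the APEX source: the function names and the sector coordinates are hypothetical, the distance function is a simple spherical stand-in for the algorithm of Section 4.2.2, and county and tract lists are combined as a union, as stated in Section 4.2.1.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Stand-in spherical distance (APEX uses the method of Section 4.2.2)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return radius_km * math.acos(max(-1.0, min(1.0, c)))


def select_initial_sectors(sectors, center, city_radius_km, counties=None, tracts=None):
    """Return IDs of sectors inside the radius and, if lists are given, on a list.

    sectors: dict mapping sector ID -> (latitude, longitude) of the sector location.
    counties: optional 5-character county FIPS codes; tracts: optional sector IDs.
    """
    counties = set(counties or [])
    tracts = set(tracts or [])
    selected = []
    for sector_id, (lat, lon) in sectors.items():
        if great_circle_km(center[0], center[1], lat, lon) > city_radius_km:
            continue  # outside the study area radius
        if counties or tracts:
            # The first five characters of the sector ID hold the county FIPS code.
            if sector_id[:5] not in counties and sector_id not in tracts:
                continue
        selected.append(sector_id)
    return selected


# Hypothetical coordinates, for illustration only.
sectors = {
    "01001020100": (32.47, -86.48),
    "01001020700": (32.44, -86.45),
    "01101000100": (32.38, -86.31),
    "13121000100": (33.75, -84.39),
}
in_radius = select_initial_sectors(sectors, (32.45, -86.47), 30.0)
one_county = select_initial_sectors(sectors, (32.45, -86.47), 30.0, counties=["01001"])
```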

One way to exert greater control over the selection of sectors is to edit the Census input files to
eliminate any undesired sectors. However, the Sector Location file and the various Population
files must all have the same set of sectors in them, so consistent editing is necessary. The
Sectors and Population files provided with the current release of APEX contain data for all
census tracts in the 50 states and D.C. from the 2000 Census.

The sectors are read in SpaceTimeModule:ReadSiteLists.
4.1.3 Air Quality Districts

The spatial units for ambient air quality data are called air quality districts. Ambient air quality
data are provided as timestep (e.g. hourly) time series at specific locations. These locations
could be monitoring sites, or political units such as counties, or census units such as tracts, or
receptor locations or grid points as used by some air quality models. As with sectors, each air
quality district has a nominal central location consisting of a latitude and a longitude. The air
quality district locations are stored in the Air Quality District Location input file. The user
designates the maximum representative radius of the air quality districts as a modeling parameter
called AirRadius. In a given APEX run, all air quality districts have the same maximum
representative radius. The same set of air quality districts must be used for each pollutant.

The model checks each air quality district listed in the Air Quality Data input files (one for each
pollutant) to determine if the district has data covering the entire simulation period. Note that
each air quality district may have a different period of operation (i.e., different start and/or stop
dates), but there can be no missing data or gaps between the simulation start and stop dates. If
the user is supplying monitoring data as inputs, for example, then any missing values within each
monitoring period must be filled in. The monitoring period can exceed the simulation period for
the model run because APEX extracts the portion of the input data that corresponds to the
simulation period. If the monitoring period does not cover the entire simulation period, the air
quality district is deleted in its entirety.
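
The coverage rule can be sketched as a simple date comparison. This is a hedged illustration with hypothetical function names and district IDs; it assumes that gaps within each monitoring period have already been filled, so only the start and stop dates are checked.

```python
from datetime import date


def covers_simulation(data_start, data_stop, sim_start, sim_stop):
    """True if the data period spans the whole simulation period."""
    return data_start <= sim_start and data_stop >= sim_stop


def usable_districts(periods, sim_start, sim_stop):
    """Keep districts whose data cover the simulation period; others are dropped.

    periods: dict mapping district ID -> (first date with data, last date with data).
    """
    return [
        district_id
        for district_id, (start, stop) in periods.items()
        if covers_simulation(start, stop, sim_start, sim_stop)
    ]


# Hypothetical districts: D2 starts too late and is deleted in its entirety.
periods = {
    "D1": (date(2000, 1, 1), date(2002, 12, 31)),
    "D2": (date(2001, 3, 1), date(2001, 12, 31)),
    "D3": (date(2001, 1, 1), date(2001, 12, 31)),
}
kept = usable_districts(periods, date(2001, 1, 1), date(2001, 12, 31))
```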

APEX calculates the distance from each sector location to each air quality district center, and
assigns the sector to the nearest air quality district, provided it is within the maximum
representative air quality radius. If there is no air quality district within range (that is, all air
quality district centers are further than AirRadius from the sector center), the sector is deleted
from the study area and is not modeled. It is possible and perhaps even likely that some air
quality districts in the Air Quality Data input files will have no sectors assigned to them. Such
air quality districts are not used. This feature allows the user to prepare an input file in the
simplest manner, perhaps containing more air quality districts than are necessary. For example,
one might prepare a single Air Quality Data input file for a pollutant for all air quality districts in
the state of Texas. This same input file could then be run on a study area around Houston, or
Dallas, or some other location in Texas, without having to alter the input file. In principle,
although the file would be very large, a single nationwide Air Quality Data file could be
prepared and used for any pollutant of interest.
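
A minimal sketch of the nearest-district assignment follows. The names, coordinates, and the spherical distance function are assumptions made for illustration; APEX uses its own distance routine (Section 4.2.2). The same pattern would apply to meteorological zones with ZoneRadius in place of AirRadius.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Stand-in spherical distance, used here only for illustration."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return radius_km * math.acos(max(-1.0, min(1.0, c)))


def assign_to_nearest(sectors, districts, air_radius_km):
    """Map each sector to its nearest district; sectors with none in range are omitted."""
    assignment = {}
    for sector_id, (slat, slon) in sectors.items():
        best_id, best_dist = None, float("inf")
        for district_id, (dlat, dlon) in districts.items():
            dist = great_circle_km(slat, slon, dlat, dlon)
            if dist < best_dist:
                best_id, best_dist = district_id, dist
        if best_id is not None and best_dist <= air_radius_km:
            assignment[sector_id] = best_id
    return assignment


# Hypothetical locations: S3 has no district within 20 km, and D3 serves no sector.
sectors = {"S1": (35.20, -80.80), "S2": (35.30, -80.90), "S3": (36.50, -80.80)}
districts = {"D1": (35.22, -80.84), "D2": (35.35, -80.95), "D3": (34.00, -79.00)}
assignment = assign_to_nearest(sectors, districts, 20.0)
unused_districts = set(districts) - set(assignment.values())
```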

By default, APEX will assign each person within a sector the corresponding appropriate ambient
values from the sector's matching air district. Thus, for each timestep in the simulation, all
persons within the same district will have the same air quality value. However, APEX can
optionally model person-to-person variation in air quality within an air district. In this case, an
optional form of the Air Quality Data file is provided, which lists air quality distributions for
each hour for each air district. Each person in the district is assigned a randomly sampled value
from the appropriate hourly distribution. See Volume I for details on using this option. Note that
this option can only be used when the APEX timestep is equal to 1 hour (the APEX default).

The air quality districts are read in from the input file in SpaceTimeModule:ReadSiteLists.
The actual air quality is read from the Air Quality Data files in
SpaceTimeModule:ReadAirQuality. If the Air Quality Data file uses hourly distributions for
each district (an optional feature of APEX, see Volume I), then the air data for each simulated
individual is sampled in SpaceTimeModule:GenerateAirQuality.
4.1.4 Meteorological Zones

Another spatial unit in APEX is the meteorological zone, which is the equivalent of air quality
districts but for meteorological data. Most of the rules that apply to air quality districts also
apply to meteorological zones. Each meteorological zone is associated with a central location
(specified by latitude and longitude), a maximum representative radius given by ZoneRadius, a
Start Date, and a Stop Date. The start and stop dates may differ for each meteorological zone,
and must encompass the entire simulation period or else the meteorological zone is deleted.
There can be no missing data between the start and stop dates of the simulation.

APEX calculates the distance from each sector location to each meteorological zone center, and
assigns each sector to the nearest meteorological zone if within range (ZoneRadius), or deletes
the sector otherwise.

The meteorological zone locations are read in from the Meteorological Zone Location input file
in SpaceTimeModule:ReadSiteLists. The actual meteorological data is read from the
Meteorology Data file in SpaceTimeModule:ReadMeteorologyData. When the data are read
in, daily maximum and average temperatures are calculated for later use.
4.2 Determining the Final Study Area
4.2.1 Matching Sectors, Air Quality Districts, and Meteorological Zones

The APEX code for reading the locations of sectors, air quality districts, and meteorological
zones, and for pairing the sectors to the nearest air quality districts and meteorological zones, is
found in SpaceTimeModule:SelectSites.

The final study area consists of all the sectors within CityRadius of the study area central
location, restricted to the listed counties or tracts (if provided), that have both an air quality
district and a meteorological zone within range. If both tracts and counties are listed, then the
resulting study area is the union of the two lists. Sectors for which a valid air quality district
(one within AirRadius) or a valid meteorological zone (one within ZoneRadius) cannot be found
are discarded from the final study area. The study area population is the total population in the
input Population files that resides in these sectors.

The SelectSites subroutine makes use of the function SpaceTimeModule:Distance, which
calculates the distance between two points given their latitudes and longitudes. It is discussed
below.
4.2.2 The Distance Algorithm

APEX uses a computational algorithm (implemented in SpaceTimeModule:Distance) to
calculate the distances (e.g. study center to sector center, sector center to air quality district
center) required to determine the final study area. The method is accurate, simple in terms of
program code, and works well everywhere on the globe. It calculates the distance between two
points based on their latitudes and longitudes. It is based on projecting the points onto a sphere,
rather than onto a plane, using the following steps:

• The latitude and longitude of the two points (locations) in question are identified

• The two sets of angles, $(\theta_1, \phi_1)$ and $(\theta_2, \phi_2)$, subtended at the Earth's center by the radial
vectors to the two points are calculated

• The net angle between these two radial vectors is found

• The Earth's radius at the average latitude of the two points is calculated

• The results from the previous 2 steps are multiplied to give the distance between the
points, D

All angles are measured in radians. In step 1, the angles $\phi$ subtended by the radial vectors at the
Earth's center are the same as the longitudes of the points in question. However, calculating
$\theta$ from latitude is more complicated since it is affected by the flattening of the poles. The
intersection of the Earth's surface with a plane through the poles is an ellipse. Let the
eccentricity of the ellipse be given by e. Latitude measures the angle between the polar axis and
a locally horizontal surface such as sea level (a tangent surface to the Earth). For a point on an
ellipse at an angle $\theta$ from the semimajor axis, the tangent line has slope s:

$$s = -(1 - e^2)\cot(\theta) \qquad \text{(4-1)}$$

where the square of the eccentricity of the Earth is $e^2 = 0.00672265$.

This slope is equal to the negative cotangent of the latitude. Hence

$$\theta = \tan^{-1}\left[(1 - e^2)\tan(\text{lat})\right] \qquad \text{(4-2)}$$

When applied to both points, this gives angles $\theta_1$ and $\theta_2$.

The radius of the Earth varies with latitude but not with longitude. At an angle $\theta$ relative to
the semimajor axis, the distance from the center to a point on the ellipse is given by

$$R = A\sqrt{\frac{1 - e^2}{1 - e^2\cos^2(\theta)}} \qquad \text{(4-3)}$$

where A is the semimajor axis length, A = 6378.388 km. For purposes of the distance calculation, $\theta$
is set to the average of $\theta_1$ and $\theta_2$.

For two points at $(\theta_1, \phi_1)$ and $(\theta_2, \phi_2)$ on a sphere of radius R, the arc length of the great circle
connecting them, the distance D, is given by the product of the radius and the net angle between
them:

$$D = R\cos^{-1}\left[\cos(\theta_1)\cos(\theta_2)\cos(\phi_1 - \phi_2) + \sin(\theta_1)\sin(\theta_2)\right] \qquad \text{(4-4)}$$

which follows from the formula for the dot product of the radial vectors, expressed in spherical
coordinates.

All of the above formulas are exact. The only approximation being made is in claiming that the
arc of the Earth's surface between the two points lies exactly on a sphere. In most practical
cases, the error resulting from this approximation is in the range of one to ten parts per million.
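
The steps of the distance algorithm can be sketched in a few lines. This is an illustrative reimplementation, not the APEX source code: the function names are hypothetical, the constants are the values quoted in the text (with the semimajor axis taken to be in kilometers), and the radius function uses the standard polar form of an ellipse, which is assumed to be what Equation 4-3 expresses.

```python
import math

E2 = 0.00672265   # square of the Earth's eccentricity (value from the text)
A_KM = 6378.388   # semimajor axis length (value from the text, assumed to be km)


def central_angle(lat_deg):
    """Angle theta at the Earth's center for a given latitude (Equation 4-2)."""
    return math.atan((1.0 - E2) * math.tan(math.radians(lat_deg)))


def earth_radius_km(theta):
    """Center-to-surface distance at angle theta from the semimajor axis."""
    return A_KM * math.sqrt((1.0 - E2) / (1.0 - E2 * math.cos(theta) ** 2))


def distance_km(lat1, lon1, lat2, lon2):
    """Distance between two points given in decimal degrees (Equation 4-4)."""
    theta1, theta2 = central_angle(lat1), central_angle(lat2)
    phi1, phi2 = math.radians(lon1), math.radians(lon2)
    # Radius evaluated at the average of the two central angles.
    radius = earth_radius_km(0.5 * (theta1 + theta2))
    cos_angle = (math.cos(theta1) * math.cos(theta2) * math.cos(phi1 - phi2)
                 + math.sin(theta1) * math.sin(theta2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return radius * math.acos(cos_angle)


# One degree of longitude along the equator is A * pi / 180, roughly 111.3 km.
one_degree_at_equator = distance_km(0.0, 0.0, 0.0, 1.0)
```

The clamp before `math.acos` is a defensive choice for nearly coincident points, where floating-point rounding can push the cosine marginally above 1.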
4.3 Modeling Commuting

APEX models commuting by assigning a work sector to each employed modeled individual
based on commuting data for the individual's home sector from the 2000 Census. A single
nationwide file of commuting data (the Commuting file) is supplied with APEX. Both the
nationwide Population Data and Commuting input files use census tracts as the sectors. The
Population Data files contain year 2000 tract-level U.S. Census counts, tract locations, and
national employment probabilities by age. The Commuting file contains adult commuting
patterns based on the 2000 Census. As part of step 2 (Figure 2-1), APEX can extract the flows
for the selected home sectors from this large Commuting file. The development of this file is
described below.
4.3.1 Nationwide Commuting Database for 2000

A national commuting database was generated from a set of US Census files listing the number
of persons living in one tract and working in another. The original files, derived from the 2000
Census, are distributed by the US DOT Bureau of Transportation Statistics at their web site
http://transtats.bts.gov/. The data used were collected as part of the Census Transportation
Planning Package (CTPP). The files used are from CTPP Part 3-The Journey To Work files. They
contain counts of individuals commuting from home to work locations at a number of
geographic scales. Tract-to-tract data were used for the commuting databases.

One concern identified from these data was that some home-work pairs are very widely
separated geographically. For example, some people live in Alabama but work in Alaska. This
may be the case for either military personnel, or others who are stationed in remote workplaces
for weeks or months at a time, but in APEX the need is for daily commuting flows and there is
no way to determine from these data which workers commute on a daily basis. A preliminary
analysis of the home-work counts showed that a graph of Log(flows) versus Log(distance) had a
near-constant slope up to somewhere around 100 kilometers. Beyond 150 kilometers the slope is
also fairly constant but flatter, meaning that flows were not as sensitive to distance. Between
these two distances there is a smooth transition from one slope to the other. A simple
interpretation of this result is that at distances below 100 km the vast majority of the flow was
due to persons traveling back and forth daily, and the numbers of such persons decrease fairly
rapidly with increasing distance. At large distances the majority of the flow consists of persons who
stay at the workplace for extended times, in which case the separation distance is not as crucial in
determining the size of the flow.

To apply the home-work data to commuting patterns in APEX, a simple rule was chosen. It was
assumed that all persons in home-work flows up to 120 km are daily commuters, and no persons
in more widely separated pairs commute daily. This meant that the list of destinations for each
home tract can be restricted to only those work tracts that are within 120 km of the home. In
practice, this cutoff has little impact in an APEX model run since persons who live and work
more than 120 km apart are not likely to have both their home and work sectors inside the same
study area.

The resulting database contained a total of 64,958 distinct home sectors, and 65,075 work
sectors, based on data from 117,792,172 persons. The home-work tract pairs were sorted by the
home tract ID, and for each home tract the possible destinations were listed, one per line. The
flows were converted to fractions of the home tract total, and then expressed as cumulative
fractions since this is the form that will be needed by APEX. The work sectors were sorted by
decreasing fraction of the home tract total, and the information was written to an ASCII text file.
Every record in this file contains three variables: a tract ID, a flow fraction, and a distance. A
convention was used that the flow fraction and distance were set to -1.0 to indicate a home tract.
The first three records on the nationwide ASCII file are as follows:
01001020100 -1.00000 -1.0
01001020700 0.10412 5.5
01101000100 0.20097 19.6
The first record contains the ID of the first home tract by sort order, followed by two placeholder
values. The second record contains the ID of the first destination tract for the current home tract.
The destination tracts are sorted in order of decreasing commuting fraction. The number 0.10412
indicates that the largest share of the commuters who live in this home tract
(around 10.4%) commutes to tract 01001020700. The third number on each line is the separation
distance in kilometers between the home and work tracts. The third record shows the second
destination tract for this same home tract, followed by a cumulative fraction of 0.20097. This
means that about 9.7% (0.20097 - 0.10412 = 0.09685) of the workers in that home tract go to the second work tract, despite the two
being greater than 19 kilometers apart. Further examination of this file reveals that this
particular home tract has 38 destination tracts to which workers commute. Note that fractions
down to 0.00001 (one in one hundred thousand) are reported. Fractions that round to zero at this
precision are deleted from the database.
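
The construction of the destination records for one home tract can be sketched as below. This is an assumption-laden illustration, not the code used to build the database: the function name and tract labels are hypothetical, and renormalizing after dropping near-zero fractions is an assumption made so that the last cumulative fraction equals 1.

```python
def build_destination_records(flows, cutoff_km=120.0):
    """Convert raw flows for one home tract into cumulative-fraction records.

    flows: list of (work_tract_id, worker_count, distance_km) tuples.
    Returns a list of (work_tract_id, cumulative_fraction, distance_km).
    """
    # Keep only home-work pairs treated as daily commutes (up to the cutoff).
    kept = [f for f in flows if f[2] <= cutoff_km]
    total = sum(count for _, count, _ in kept)
    if total == 0:
        return []
    # Drop destinations whose fraction rounds to zero at five decimals.
    kept = [f for f in kept if round(f[1] / total, 5) > 0.0]
    total = sum(count for _, count, _ in kept)  # assumption: renormalize
    # Sort by decreasing share of the home tract total.
    kept.sort(key=lambda f: f[1], reverse=True)
    records, running = [], 0
    for tract_id, count, dist in kept:
        running += count
        records.append((tract_id, round(running / total, 5), dist))
    return records


# Hypothetical flows: tract D lies beyond 120 km and is excluded.
flows = [("B", 30, 10.0), ("A", 60, 5.0), ("C", 10, 50.0), ("D", 40, 300.0)]
records = build_destination_records(flows)
```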

The number of destinations per home tract is not constant. There are two ways to know when
the last destination for a given home tract has been reached. First, the cumulative flow fraction
for the last destination is always 1.00000. Second, the next record in the file has negative flow
and distance indicators as placeholders, indicating that the record contains the ID of the next
home tract. The mean number of associated work tracts per home tract is 79, with a minimum
of 1 and a maximum (within 120 km) of 413. Overall, this file has over 5.2 million records and
occupies 131 megabytes of disk space.
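
A reader for records in this layout might look like the sketch below. It assumes whitespace-delimited fields, which matches the sample records shown above but may not match the actual file layout; the function name is hypothetical.

```python
def parse_commuting_records(lines):
    """Group destination records under their home tract.

    A record with a negative flow fraction marks the start of a new home tract.
    Returns dict: home tract ID -> list of (work tract ID, cumulative fraction, km).
    """
    flows = {}
    home = None
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        tract_id = fields[0]
        fraction = float(fields[1])
        distance = float(fields[2])
        if fraction < 0.0:
            home = tract_id
            flows[home] = []
        elif home is not None:
            flows[home].append((tract_id, fraction, distance))
    return flows


# The three sample records quoted in the text.
sample = [
    "01001020100 -1.00000 -1.0",
    "01001020700 0.10412 5.5",
    "01101000100 0.20097 19.6",
]
parsed = parse_commuting_records(sample)
```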

4.3.2 Implementation of Commuting in APEX

The APEX model only extracts the commuting data that pertains to the selected study area. For
each sector in the study area (home sector) APEX reads in a list of possible work sectors and the
cumulative probabilities associated with each. The cumulative probabilities range from 0 to 1;
the work sectors having higher probabilities are associated with a larger "range" of this interval.
For each worker (each profile with Employed=YES), a uniform random number R from zero to
one is generated. This random number is used to select a corresponding work sector (the one
that is associated with the smallest cumulative probability that is > R) for the profile.
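
The selection rule (the work sector with the smallest cumulative probability exceeding R) can be sketched with a binary search. The function name and sector labels are hypothetical; this is an illustration rather than the APEX implementation.

```python
import bisect
import random


def select_work_sector(work_sectors, cumulative_probs, r):
    """Return the sector whose cumulative probability is the smallest value > r.

    cumulative_probs must be sorted ascending and aligned with work_sectors.
    """
    # bisect_right gives the index of the first entry strictly greater than r.
    index = bisect.bisect_right(cumulative_probs, r)
    # Guard in case rounding leaves the last cumulative value slightly below 1.
    index = min(index, len(work_sectors) - 1)
    return work_sectors[index]


work_sectors = ["A", "B", "C"]
cumulative_probs = [0.6, 0.9, 1.0]
rng = random.Random(42)
chosen = select_work_sector(work_sectors, cumulative_probs, rng.random())
```

Sectors with larger probabilities occupy a wider slice of the interval from 0 to 1, so they are selected proportionally more often.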

When a profile's activity diary events are read from the Diary Events file, they are characterized
as occurring in one of three places: "Work", "Home" or "Other" based on the CHAD location
and activity codes. For profiles who are commuters, all "Work" events will use the work
sector's air quality data when calculating microenvironmental concentrations. "Home" events
will use the home sector data; "Other" events will use an average of data from different sectors
(either all sectors in the area or a random selection, based on the APEX input settings). See
Section 8.2.1 for more information on home, work, and other locations.

It is likely that some profiles will commute to sectors outside of the final study area. The user
may choose to include these profiles in the analysis or discard them by setting the Simulation
Control file variable KeepLeavers. If KeepLeavers=YES, these profiles are retained, but the air
quality data for their work sector are unknown. The work-sector concentration is therefore
assumed to be related to the average concentration over all of the
study area air quality districts at the same point in time. Calling this average Cavg, the ambient
concentration C for the person is:

C=LeaverMult*Cavg+LeaverAdd (4-5)

where LeaverMult and LeaverAdd are also set in the Simulation Control file. If KeepLeavers =
NO, then these individuals are not modeled.
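
Equation 4-5 amounts to a linear adjustment of the district average. The sketch below assumes an unweighted mean over the study area districts at a single timestep; the function name and the numerical values are hypothetical.

```python
def leaver_concentration(district_concs, leaver_mult, leaver_add):
    """Ambient concentration for a commuter working outside the study area.

    district_concs: concentrations of all study area air quality districts
    at one point in time (unweighted mean assumed for Cavg).
    """
    c_avg = sum(district_concs) / len(district_concs)
    return leaver_mult * c_avg + leaver_add


# Hypothetical values: three districts, LeaverMult=0.9, LeaverAdd=0.01.
concentration = leaver_concentration([0.04, 0.06, 0.05], 0.9, 0.01)
```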
CHAPTER 5. GENERATING SIMULATED INDIVIDUALS
(PROFILES)

APEX stochastically generates a user-specified number of simulated persons to represent the
population in the study area. Each simulated person is represented by a "personal profile." The
personal profile (see step 2 in Figure 2-1) is a set of parameters that describe the person being simulated.
The simulated person has a specific age, a specific home sector, a specific work sector (or does
not work), specific housing characteristics, specific physiological parameters, and so on.
The profile does not belong to any particular real person; instead, it is a fictional or simulated
person. A single profile does not have much meaning in isolation, but a collection of profiles
represents a random sample drawn from the study area population. This means that statistical
properties of the collection of profiles should reflect statistical properties of the real population in
the study area.

APEX generates the simulated person or profile by probabilistically selecting values for a set of
profile variables. The profile variables fall into five categories:

• Demographic variables, which are generated based on the census data;

• Residential variables, which are generated based on sets of user-defined distribution data;

• Physiological variables, which are generated based on age- and gender-specific
distribution data;

• Daily varying variables, which (in most cases) are based on distribution data that change
daily during the simulation period; and

• Modeling variables, which can have a variety of meanings, and are usually defined by the
user.

The individual profile variables are given in Table 5.1.

The profile variables do not depend on the results for any other profile, and in general do not
depend on any results (diary selection, microenvironment concentrations, exposure, or dose) for
the current profile. The exception is the person's physical activity index (PAI), which is
calculated using the activity diary.

At present, APEX does not allow the user to select which variables are part of the personal
profile. If such an enhancement were included, the user would be able to assign (say) a family
size to each profile using input files rather than by reprogramming the source code.
The process by which APEX generates the personal profile is illustrated in Figure 5.1. The
demographic and residential variables are set in ProfileModule:GenerateProfiles, the
physiological variables are generated in ProfileModule:GeneratePhysiology, and the
daily-varying variables are set in ProfileModule:GenerateDailyVars.

The following subsections describe the different categories of profile variables in more detail.
Table 5.1. Profile Variables in APEX
| Variable Type | Profile Variable | Description |
|---|---|---|
| Demographic variables | Index | Internal APEX profile index |
| | Gender | Male/Female |
| | Race | White, Black, Native American, Asian, and Other |
| | Age | Age (years) |
| | Home sector | Sector in which a simulated person lives |
| | Work sector | Sector in which a simulated person works |
| | Home District | Air quality district assigned to home sector |
| | Work District | Air quality district assigned to work sector |
| | Meteorological zone | Meteorological zone assigned to home sector |
| | Employment Status | Indicates employment outside home |
| Residential variables | Gas stove | Indicates presence of gas stove |
| | Gas pilot | Indicates presence of gas pilot light |
| | Home air conditioning | Indicates type of home ventilation/air conditioning system |
| | Car air conditioner | Indicates presence of air conditioning in the simulated person's car |
| Physiological variables | Blood volume | Blood volume of a simulated person (ml) |
| | Body Mass | Mass of simulated person (kg) |
| | Weight | Body weight of a simulated person (lbs) |
| | Height | Height of a simulated person (in) |
| | Resting metabolic rate | Resting metabolic activity rate (kcal/min) |
| | Body surface area | Surface area of the body (m^2) |
| | Maximum permitted MET value | Maximum metabolic activity level (multiple of resting metabolic rate) that can be obtained by the individual (dimensionless) |
| | Energy conversion factor | Oxygen uptake per unit of energy expended (liters/kcal) |
| | Lung CO diffusivity | Lung CO diffusivity parameter used in COHb calculation (ml/min/torr) |
| | Endogenous CO production rate #1 | Endogenous (internally produced) CO production rate #1 (ml/min) |
| | Endogenous CO production rate #2 | Endogenous CO production rate #2 (ml/min) used only for women between ages of 12 and 50 for half the menstrual cycle |
| | Hemoglobin altitude factor | Correction for blood hemoglobin density at high altitudes |
| | Recovery Time | Time to recover a maximum oxygen deficit (hours) |
| | Ventilation slope | Slope term for MET to ventilation conversion |
| | Ventilation intercept | Intercept term for MET to ventilation conversion |
| | Ventilation residual | Residual term for MET to ventilation conversion |
| | Hemoglobin density in the blood | Amount of hemoglobin in the blood (g/ml) |
| | Normalized maximum oxygen consumption | Maximum rate of oxygen consumption by the individual, normalized to body mass (ml of oxygen/min/kg) |
| | Starting day of menstrual cycle | The day during the first 28 days of the simulation period that menstruation begins; used for calculating endogenous CO |
| | Disease status | Whether or not the person has the disease. Probability is determined by the input Prevalence file. |
| | Maximum oxygen deficit | Maximum oxygen deficit obtainable by profile (ml/kg) |
| Daily varying variables | Daily endogenous CO production rate | Daily endogenous CO production rate in the simulation period (ml/min) |
| | Diary number | Index for the selected diary for the day |
| | First event # | Index of the first event for the day in the composite activity diary |
| | Diary ID | CHAD (or other database) ID for the diary selected for the day |
| | Diary age | Age of the CHAD diary selected for the day |
| | Diary employment | Employment status of the CHAD diary selected for the day |
| | Last event # | Index of the last event for the day in the composite activity diary |
| | Residence window position | Daily residence window position (open or closed) during the simulation period |
| | Car window position | Daily car window position (open or closed) during the simulation period |
| | Daily average car speed category | Daily average car speed category during the simulation period |
| | Daily Conditional variable #1 | Generic user-defined daily conditional variable #1 |
| | Daily Conditional variable #2 | Generic user-defined daily conditional variable #2 |
| | Daily Conditional variable #3 | Generic user-defined daily conditional variable #3 |
| | PAI | Daily physical activity index (MET, dimensionless) |
| | Diary key statistic | Key diary statistic (such as outdoor time or vehicle time) for the day. Used for constructing longitudinal activity diary. Equal to 0 if not using longitudinal assembly. |
| | Number of diaries | Number of different activity diaries used to construct the composite activity diary for the person. |
| Modeling variables | Profile Conditional Variable #1 | Generic user-defined conditional variable #1. |
| | Profile Conditional Variable #2 | Generic user-defined conditional variable #2. |
| | Profile Conditional Variable #3 | Generic user-defined conditional variable #3. |
| | Profile Conditional Variable #4 | Generic user-defined conditional variable #4. |
| | Profile Conditional Variable #5 | Generic user-defined conditional variable #5. |
| | Regional Conditional Variable #1 | Regional user-defined conditional variable #1. |
| | Regional Conditional Variable #2 | Regional user-defined conditional variable #2. |
| | Regional Conditional Variable #3 | Regional user-defined conditional variable #3. |
| | Regional Conditional Variable #4 | Regional user-defined conditional variable #4. |
| | Regional Conditional Variable #5 | Regional user-defined conditional variable #5. |

[Figure: flowchart with the following boxes: Select population type (Race/Gender combination); Select home sector; Select age group within home sector; Assign age within age group; Check if age within range set in Simulation Control file (NO/YES branches); Generate employment status; Select work sector (if modeling commuting); Set demographic variables based on the above; Set residential variables (input: Profile Functions file); Set physiological variables (input: Physiology file); Set daily-varying variables.]
Figure 5.1. Generating a Simulated Profile
5.1 Demographic variables

The following are the profile variables of a demographic nature:

• Age: Age in years (integer)

• Employed: YES if employed outside home, NO otherwise

• Gender: Male or Female

• HomeD: Number of air quality district assigned to home sector

• HomeSec: Number of home sector for this profile

• Race: White, Black, Native American, Asian, Other

• WorkD: Number of air quality district assigned to work sector

• WorkSec: Number of work sector (if any)

• Index: Sequential number indexing the set of profiles

• Zone: Number of meteorological zone for home sector
The values of demographic variables (gender, age, etc.) for a simulated profile are selected
probabilistically according to the following steps (see Figure 5.1):

• The fractions of people in each of the gender/race combinations ("population types") in
the final study area are calculated and then used as probabilities to randomly select a
gender/race type for a simulated individual.

• The fractions of the selected population type in each sector within a study area are found
and then used as probabilities to randomly select a home sector for the simulated person.

• The fractions of people in each age group in the selected population type in the selected
sector are calculated and then used as probabilities to randomly select an age group for
the simulated person.

• A specific age within the selected age group is randomly selected, assuming a uniform
distribution.

• The employment probability for the selected age group (for the home sector) is used to
randomly determine whether a simulated person will work.

• If the commuting option is used and a simulated person works, use the fractions of people
commuting to each of the work sectors for the selected home sector to randomly select a
work sector that a simulated person will commute to. If the commuting option is not
used, assume a person who works does so in their home sector.
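
The selection steps above can be sketched in Python as follows. This is a minimal illustration, not the APEX implementation: the nested dictionary layout for the population counts, the function names, and the argument names are all hypothetical, and the check of the sampled age against the range set in the Simulation Control file (Figure 5.1) is omitted.

```python
import random

def pick(weight_by_key, rng):
    """Draw one key with probability proportional to its weight."""
    keys = list(weight_by_key)
    weights = [weight_by_key[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]

def generate_demographics(pop, employ_prob, commute_frac, rng, commuting=True):
    """Sample one profile.

    Hypothetical input layout (an assumption of this sketch):
      pop[(gender, race)][sector][(age_lo, age_hi)] -> number of people
      employ_prob[sector][(age_lo, age_hi)]         -> employment probability
      commute_frac[home_sector][work_sector]        -> commuting fraction
    """
    # Population type (gender/race), weighted by its share of the study area.
    type_totals = {t: sum(sum(ages.values()) for ages in sectors.values())
                   for t, sectors in pop.items()}
    gender, race = pick(type_totals, rng)

    # Home sector, weighted by the selected population type's count per sector.
    sector_totals = {s: sum(ages.values())
                     for s, ages in pop[(gender, race)].items()}
    home = pick(sector_totals, rng)

    # Age group within the sector, then a uniform age within the group.
    age_lo, age_hi = pick(pop[(gender, race)][home], rng)
    age = rng.randint(age_lo, age_hi)  # inclusive at both ends

    # Employment status from the age-group probability for the home sector.
    employed = rng.random() < employ_prob[home][(age_lo, age_hi)]

    # Work sector: commuting fractions if enabled, otherwise the home sector.
    work = None
    if employed:
        work = pick(commute_frac[home], rng) if commuting else home

    return {"gender": gender, "race": race, "home_sector": home, "age": age,
            "employed": employed, "work_sector": work}
```

Passing a seeded `random.Random` instance as `rng` makes the sampled profiles reproducible.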

5.2 Residential Variables

The residential variables are set after the demographic variables described above. The residential
variables are categorical variables that are used to indicate whether a residence or a car
associated with a simulated person has a specified appliance or component that may affect
exposure. The rules for assigning these four variables are specified by the user in the Profile
Functions (Distributions) input file (see Volume I). The residential variables are:
• AC_Home: An integer indicating the type of home ventilation system (e.g., central air,
attic fan, window unit, no A/C).

• AC_Car: YES if has air conditioning in car, NO otherwise

• HasGasStove: YES if has gas stove in home, NO otherwise

• HasGasPilot: YES if has gas pilot for gas stove in home, NO otherwise
APEX randomly determines the value of each variable based on the probabilities specified in the
input files. For example, suppose a user specifies probabilities of 0.3 for not having a car air
conditioner (AC_Car=NO) and 0.7 for having a car air conditioner (AC_Car=YES). APEX
randomly generates a value in the range of 0 to 1, assuming a uniform distribution. If this value
is larger than 0.3, the simulated person's car has an air conditioner (AC_Car=YES); otherwise it does not.
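
The AC_Car example can be written as a short Python sketch. The function name and the use of Python's `random` module are assumptions made for illustration; APEX's own sampling routines are described in Section 3.2.

```python
import random

def draw_yes_no(p_no, rng):
    """Return "YES" if a uniform draw on [0, 1) is larger than p_no, else "NO"."""
    u = rng.random()
    return "YES" if u > p_no else "NO"

# Illustrative check: with P(NO) = 0.3, roughly 70% of draws should be YES.
rng = random.Random(12345)
draws = [draw_yes_no(0.3, rng) for _ in range(10000)]
frac_yes = draws.count("YES") / len(draws)
```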

These variables may be used as conditional variables for calculating concentrations in
microenvironments (see Chapter 8).

5.3 Physiological Profile Variables

The physiological variables are used for estimating ventilation, calculating dose (as described in
Chapters 7 and 10) and classifying profiles in reporting results. This section covers the
calculation of these variables. The profile variables relating to physiology are as follows:

• Blood: Volume of blood (milliliters)

• BM: Body mass (kilograms)

• BSA: Body surface area (m^2)

• Diff: Lung CO diffusivity (ml/min/torr)

• ECF: Energy conversion factor (liters of oxygen per kcal)

• Endgn1: Endogenous CO production rate #1 (ml/min)

• Endgn2: Endogenous CO production rate #2 (ml/min)

• Height: Body height (inches)

• Hemfac: Hemoglobin altitude adjustment

• Hmglb: Hemoglobin density (grams per milliliter of blood)

• METmax: Maximum obtainable MET value (unitless)

• MOXD: Maximum obtainable oxygen debt (ml/kg body mass)
• NVO2max: Maximum normalized oxygen consumption (milliliter of oxygen per minute
per unit body mass)

• PAI: Median daily physical activity index (MET, dimensionless)

• RMR: Resting metabolic rate (kcal/min)

• RecTime: Oxygen debt recovery time (hours)

• Start: Starting day (phase) for menstrual cycle (if applicable)

• VeSlope: Slope for MET to ventilation rate conversion

• VeInter: Intercept for MET to ventilation rate conversion

• VeResid: Residual for MET to ventilation rate conversion

• Weight: Body weight (pounds)

• Ill: Flag indicating whether or not a person has the disease that is modeled in the
Prevalence input file.

The physiological profile variables do not affect the exposure calculations for individuals in any
way, as none of them affect either the selection of diaries or the concentrations in the various
microenvironments. However, the physiology of a profile affects the ventilation calculations and
thus influences (1) the calculation of dose (see Chapter 10) and (2) the calculation of exertion
level, which is used to group exposure results in the output Tables file (see Volume I).

The physiological variables are calculated from input data. The Physiology input file contains
distributions that are used to set the physiological profile variables and parameters for each
simulated person. Specifically, the physiology file contains distributions for the following
variables and parameters for each age and gender cohort (see Volume I for more information):

• NVO2max: Normal distributions for NVO2max in ml O2/min/kg (Isaacs and Smith,
2005)

• BM: Lognormal distributions for BM in kg (Isaacs and Smith, 2005)

• RMRSLP, RMRINT, and RMRERR: Point distributions for regression coefficients for
RMR (slope, intercept, and error) as a function of BM in MJ/day (Johnson et al., 2000,
adapted from Schofield, 1985)

• BLDF1 and BLDF2: Point distributions for blood volume factors for calculation of
volume (ml) from height and weight (Johnson et al., 2002)

• HMG: Normal distributions for blood hemoglobin density (g/ml of blood) (Johnson et
al., 2002)
• HTSLP, HTINT, and HTERR: Point distributions for regression coefficients for height
in inches (slope, intercept, and error) as a function of age (children 0-17) or body mass
(adults) (Johnson et al., 2000)

• BSAEXP1 and BSAEXP2: Point distributions for exponential parameters for calculating
body surface area (m^2) as a function of body mass (Burmaster, 1998)

• MOXD: Normal distributions for maximum obtainable oxygen debt in ml per kg body
weight (Isaacs et al., 2007). MOXD is used in the adjustment of MET values; see section
7.2.

• ECF: Uniform distributions for the energy conversion factor for the person. Liters of
oxygen per kcal. (Johnson et al., 2000, adapted from Esmail et al., 1995)

• RecTime: Uniform distributions for the time required to recover a maximum oxygen
deficit (hours). (Isaacs et al., 2007) RecTime is used in the adjustment of MET values;
see section 7.2.

• Endgn1 and Endgn2: Point distributions for endogenous CO production rates in ml/min
(Endgn2 is used for women in the second half of the menstrual cycle).

The Ventilation input file contains data for a set of regression parameters used by APEX to
estimate the ventilation-related profile variables VeSlope, VeInter, and VeResid. See Volume I
for an example of the Ventilation file; see Graham and McCurdy (2005) for the derivation of
these parameters. The file contains the following parameters for 5 age groups:

• VEB0 and VEB0SE: Mean and standard error for regression parameter b0

• VEB1 and VEB1SE: Mean and standard error for regression parameter b1

• VEB2 and VEB2SE: Mean and standard error for regression parameter b2

• VEB3 and VEB3SE: Mean and standard error for regression parameter b3

• VEEB: Interpersonal variance

• VEEW: Intrapersonal variance
Note: the standard error terms defined in the Ventilation file are not currently used by APEX, but
could be used for future uncertainty analyses.

Once the above parameters are read/set by APEX, the physiological profile variables are
calculated in ProfileModule:GeneratePhysiology using the appropriate parameter
values for the profile age and gender. First, NVO2max, BM, RMR, HMG, BLDF1, BLDF2,
HTSLP, HTINT, HTERR, MOXD, ECF, RecTime, BSAEXP1, BSAEXP2, Endgn1, and
Endgn2 are sampled from the appropriate input distribution (using the APEX sampling methods,
Section 3.2) for the profile's age and gender. Then the other profile variables are calculated as
follows:

Weight = 2.2046 (BM) (5-1)

where weight is given in pounds (lbs.). Body surface area (in m^2) (Burmaster, 1998) is
calculated as

BSA = e^(BSAEXP1) × BM^(BSAEXP2) (5-2)

Resting metabolic rate (Johnson et al., 2000) is given in units of kcal/min by

RMR = 0.166 [(RMRSLP)(BM) + RMRINT + RMRERR] (5-3)

where 0.166 is the conversion factor for converting MJ/day to kcal/min.

The maximum MET value an individual can obtain is given by their personal maximum energy
expenditure divided by their RMR. Personal maximum energy expenditure is simply maximum
oxygen consumption converted to kcal via ECF, and thus METmax can be expressed as:

METmax = 0.001 (NVO2max)(BM) / [(ECF)(RMR)] (5-4)

where 0.001 is the conversion factor for ml to liters of O2, such that METmax is a unitless number.
The lower bound for METmax is 5, and the upper bound is 20; values outside of this range are
set to the corresponding bound.
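
As a minimal sketch, the calculations for weight, body surface area, RMR, and METmax can be written in Python as below. The function and argument names are hypothetical. The METmax expression assumes that maximum oxygen consumption is NVO2max (ml O2/min/kg) multiplied by BM (kg), which is the form implied by the units given in the text.

```python
import math

def weight_lb(bm_kg):
    """Body weight in pounds from body mass in kg (eq. 5-1)."""
    return 2.2046 * bm_kg

def body_surface_area(bm_kg, bsaexp1, bsaexp2):
    """Body surface area in m^2 (eq. 5-2)."""
    return math.exp(bsaexp1) * bm_kg ** bsaexp2

def resting_metabolic_rate(bm_kg, rmrslp, rmrint, rmrerr):
    """RMR in kcal/min; 0.166 converts MJ/day to kcal/min (eq. 5-3)."""
    return 0.166 * (rmrslp * bm_kg + rmrint + rmrerr)

def met_max(nvo2max, bm_kg, ecf, rmr):
    """Maximum MET (unitless), bounded to the range 5 to 20.

    Assumes maximum oxygen consumption = NVO2max * BM (ml O2/min), with
    0.001 converting ml to liters; ECF is in liters of O2 per kcal.
    """
    raw = 0.001 * nvo2max * bm_kg / (ecf * rmr)
    return min(max(raw, 5.0), 20.0)
```

The clamp in `met_max` mirrors the stated rule that values outside the range are set to the corresponding bound.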

For children under the age of 18, height (in inches) is a function of age (in years):

Height = HTINT + Age × HTSLP + HTERR (children under 18) (5-5)

For adults, height (in inches) is a function of body weight (in pounds):

Height = HTINT + ln(Weight) × HTSLP + HTERR (adults 18 and older) (5-6)

Alion staff fit these equations for height, and derived the accompanying coefficients.

Several physiological variables are calculated differently for males and females, including lung
CO diffusivity in ml/min/torr (Johnson et al., 2000). This variable is only used in the APEX CO
dose algorithm.

Diff = 0.361(Height) - 0.232(Age) + 16.3 (males) (5-7)

Diff = 0.556(Height) - 0.115(Age) - 5.97 (females) (5-8)

where height is in inches and age in years. Blood volume in milliliters (Johnson et al., 2000) is
calculated as:

Blood = BLDF1(Weight) + BLDF2(Height)^3 - 30 (5-9)

where weight is in pounds and height in inches. For males, VeInter is calculated as

VeInter = Intterm - VEB3 (males) (5-10)

and for females

VeInter = Intterm + VEB3 (females) (5-11)

where

Intterm = VEB0 + Z(VEEB) + VEB2 ln(1 + Age) (5-12)

where Z is a number drawn from a normal distribution truncated to the range -4 to 4.

For both genders

VeSlope=VEB1 (5-13)

VeResid = VEEW (5-14)

The variables VeSlope, VeInter, and VeResid are used in the APEX ventilation algorithm (see
Chapter 7 and Graham and McCurdy, 2005).
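
The height, lung diffusivity, blood volume, and ventilation intercept calculations can be sketched as follows. Function names, argument names, and the "M"/"F" gender codes are hypothetical. The text does not state the mean and standard deviation of the distribution for Z; the helper below assumes a standard normal with draws outside -4 to 4 rejected.

```python
import math
import random

def height_in(age, weight_lb, htslp, htint, hterr):
    """Height in inches: function of age for children, of ln(weight) for adults."""
    if age < 18:
        return htint + age * htslp + hterr               # eq. 5-5
    return htint + math.log(weight_lb) * htslp + hterr   # eq. 5-6

def lung_co_diffusivity(height, age, gender):
    """Lung CO diffusivity in ml/min/torr (eqs. 5-7 and 5-8)."""
    if gender == "M":
        return 0.361 * height - 0.232 * age + 16.3
    return 0.556 * height - 0.115 * age - 5.97

def blood_volume_ml(weight_lb, height, bldf1, bldf2):
    """Blood volume in ml from weight (lb) and height (in) (eq. 5-9)."""
    return bldf1 * weight_lb + bldf2 * height ** 3 - 30.0

def sample_z(rng):
    """ASSUMPTION: standard normal, redrawn until it falls within [-4, 4]."""
    while True:
        z = rng.gauss(0.0, 1.0)
        if -4.0 <= z <= 4.0:
            return z

def ve_intercept(age, gender, veb0, veb2, veb3, veeb, z):
    """VeInter from eqs. 5-10 to 5-12; VEB3 is subtracted for males, added for females."""
    intterm = veb0 + z * veeb + veb2 * math.log(1 + age)
    return intterm - veb3 if gender == "M" else intterm + veb3
```

Under equations 5-13 and 5-14, VeSlope and VeResid are simply the file values VEB1 and VEEW, so they need no separate function.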

5.4 Daily-Varying Variables

The daily-varying variables are generated for each day in the model simulation for each profile in
ProfileModule:GenerateDailyVars. The daily-varying profile variables are:

• DailyConditional1: User-defined daily conditional variable #1

• DailyConditional2: User-defined daily conditional variable #2

• DailyConditional3: User-defined daily conditional variable #3

• Endgn: Endogenous CO production rate on given day (ml/min)

• PAI: Daily physical activity index (MET, dimensionless)

• SpeedCat: Category for average vehicle speed on given day

• WindowRes: Residence window status (open or closed) for given day

• WindowCar: Car window status (open or closed) for given day

• DiaryID: ID of the activity diary selected for the day (as it appears in the Diary
Questionnaire and Diary Events files).

• DiaryEmp: Employment status of the activity diary selected for the day.

• DiaryAge: Age of the activity diary selected for the day.
• Stat: Value of the key diary statistic for the activity diary selected for the day.

The rules for defining WindowRes, WindowCar, SpeedCat, DailyConditional1,
DailyConditional2, and DailyConditional3 are provided by the user in the Profile Functions
(Distributions) input file. These variables may be used as conditional variables in
calculating the concentrations in microenvironments. The Profile Functions file and conditional
variables are covered in detail in Volume I. Conditional variables are also explained further in
the Microenvironments chapter (Chapter 8) of this Volume.

The variable Endgn is used in the evaluation of the blood COHb level. Even though it is stored
in an array of daily values, for males it always has the same value from day to day; that is, it is
always set to the physiological profile variable Endgn1. For females it may have one of two
values (the physiological profile variables Endgn1 and Endgn2) depending on the phase of the
menstrual cycle.

The variable PAI is calculated as the time-averaged MET value for the entire simulation day. As
it depends on the daily activity diaries, it is not set in the profile module, but rather later, in the
ventilation subroutine ExposureDoseModule:Ventilation.

The variables DiaryID, DiaryAge, and DiaryEmp are mainly for QA of the diary selection and
diary assembly methods. The variable Stat contains the value for the key diary variable, which is
used to construct the longitudinal diary, and thus is usually a variable important to exposure
(such as time spent outdoors or time spent in vehicles). See Section 6.3. These variables are all
set in DiaryModule:CompositeDiary.

5.5 Modeling Variables

The remaining profile variables are general modeling variables. They are:

• Number of diaries. Number of different activity diaries used to construct the composite
activity diary for the person.

• Profile Conditional Variable #1-5. Generic user-defined profile conditional variables.
Can be defined by the user to model any property of the simulated person (for example,
behavior or residential properties) that may affect microenvironmental parameters.

• Regional Conditional Variable #1-5. Regional user-defined conditional variables. Can
be defined by the user to model any property of the simulated person (for example, behavior
or residential properties) that varies by county or sector.

The number of diaries is a convenience variable that is set during the diary assembly process, in
DiaryModule:CompositeDiary. It depends on the number of appropriate diaries available in
CHAD for the person, the diary pools, and other factors. It gives the modeler an idea of the
heterogeneity in the diary selection for the person.

The profile and regional conditional variables are defined by the user in the Profile Functions
file (see Volume I). They are set in ProfileModule:GenerateProfiles.

CHAPTER 6. Constructing a Sequence of Diary Events
APEX probabilistically creates a composite diary for each of the simulated persons by selecting a
24-hour diary record—or diary day—from an activity database for each day of the simulation
period. The Consolidated Human Activity Database (CHAD) has been supplied with APEX for
this purpose. A composite diary is a sequence of events that simulates the movement of a
modeled person through geographical locations and microenvironments during the simulation
period. Each event is defined by geographic location, start time, duration, microenvironment
visited, and activity performed.
The APEX model generates sets of exposure time series, one for each simulated individual, and
both mean exposures over time and variation in exposures are important. The ability to
realistically reproduce these exposure metrics depends on the method used to construct the
composite activity diaries for the simulated population. APEX provides two methods of
assembling composite diaries. The first (basic) method, which is adequate for estimating mean
exposures, simply involves randomly selecting an appropriate activity diary for the simulated
individual from the available diary pool. The second method is a more complex algorithm for
assembling longitudinal diaries that realistically simulates day-to-day (within-person) and
between-person variation in activity patterns (and thus exposures). Both methods are covered in
this chapter.

As both methods of composite diary assembly require the creation of diary pools, their
construction is covered first.
6.1 Constructing the Diary Pool

6.1.1 Diary Data

The composite diary is created by concatenating individual one-day activity diaries from the
CHAD database. APEX currently provides this database as two files: a file containing personal
information of the studied individual, and one containing the actual diary events (see Volume I
for more information). The Diary Questionnaire (DiaryQuest) file contains the following
variables, in this order, as comma-separated values:

• CHAD ID

• Day of the week (e.g., Monday)

• Gender (M/F)

• Race

• Status of employment (Y/N)
• Age (years)

• Maximum hourly temperature on day of study (°F)

• Average temperature on day of study (°F)

• Occupation

• Count of missing time in minutes (when activity and/or location codes are missing from
the activity file)

• Number of events
The Diary Events file contains the following variables as comma-separated values:

• CHAD ID

• Start time (of the event)

• Duration (minutes)

• Activity code

• Location code
See Volume I for a description of the CHAD activity and location codes. The events file contains
a record for each of the events indicated by the number of events in the diary questionnaire file.
Note that while CHAD data are provided with APEX, other activity data could be used instead,
as long as the input file formats are followed and the CHAD coding conventions are used.

The APEX diary input files are read by DiaryModule:ReadDiaries.

6.1.2 Grouping the Available Diaries into the Diary Pools

A diary pool is a group of CHAD diaries, appropriate for a given simulation day, from which a
daily diary may be drawn. The criteria for creation of the Diary Pools are defined using the
DiaryPool variable in the Profile Functions (Distributions) input file (see Volume I). Briefly, the
user can define different pools for different combinations of ranges of maximum temperature,
average temperature, and day of the week. Thus, the definition of a diary pool for a single
(hypothetical) simulated day may be something like "all CHAD activity diaries for a weekend
day for which the maximum temperature was between 70 and 80 degrees, and the average
temperature was between 50 and 80 degrees." The idea behind this logic is that temperature and
day of the week affect the type of activities people perform, and thus it is important to match
these real properties of the activity diaries to corresponding simulated days. It should be noted
that in APEX, CHAD diaries that are missing temperature data are discarded (i.e., not assigned
to any diary pool).
The number of diary pools that are defined affects the number of diaries that are available for
selection on a given day. Therefore, it is important not to define the pools too narrowly, as doing so
could result in the same activity diaries being selected over and over again, or could result in
pools with no diaries, in which case APEX will fail with a fatal error.

The diary pools are created in DiaryModule:ReadDiaries.
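
The pool construction described in this section can be sketched in Python as follows. The `Diary` and `PoolRule` classes, their field names, and the treatment of range endpoints (lower bound inclusive, upper bound exclusive) are assumptions of this sketch; the actual pool definitions come from the DiaryPool variable in the Profile Functions file (see Volume I).

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple

@dataclass(frozen=True)
class Diary:
    chad_id: str
    day_of_week: str
    max_temp_f: Optional[float]   # None when temperature data are missing
    avg_temp_f: Optional[float]

@dataclass(frozen=True)
class PoolRule:
    name: str
    days: FrozenSet[str]
    max_temp_f: Tuple[float, float]   # (low, high), assumed low <= t < high
    avg_temp_f: Tuple[float, float]

def build_pools(diaries: List[Diary], rules: List[PoolRule]) -> Dict[str, List[Diary]]:
    """Group diaries into pools by day of week and temperature ranges."""
    pools: Dict[str, List[Diary]] = {r.name: [] for r in rules}
    for d in diaries:
        # Diaries missing temperature data are not assigned to any pool.
        if d.max_temp_f is None or d.avg_temp_f is None:
            continue
        for r in rules:
            if (d.day_of_week in r.days
                    and r.max_temp_f[0] <= d.max_temp_f < r.max_temp_f[1]
                    and r.avg_temp_f[0] <= d.avg_temp_f < r.avg_temp_f[1]):
                pools[r.name].append(d)
    # Mirror the fatal error for pools that end up with no diaries.
    empty = [name for name, members in pools.items() if not members]
    if empty:
        raise ValueError("diary pools with no diaries: " + ", ".join(empty))
    return pools
```

Raising an error for empty pools makes an overly narrow pool definition visible at set-up time rather than during diary selection.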
6.2 Basic (Random) Composite Diary Construction

Basic (random) composite diary construction is implemented in
DiaryModule:CompositeDiary. In basic composite diary construction, APEX develops a
composite diary for each of the simulated individuals (profiles) in the following manner.

Once the diary pools have been created (based on temperature and day of the week as described
above), a selection probability for each diary within the pool is calculated based on
age/gender/employment similarity of the simulated person to the "real" diaries. The probabilities
are calculated in DiaryModule:DiaryProbabilities. The selection probability is a
product of three probabilities between 0 and 1, one each for age, gender, and employment. The
gender and employment probabilities are straightforward: if the gender or the employment status
of the diary matches that of the profile, then the probability for these factors is set to 1. If they
do not match, the probability is set to 0. The exception is for the employment probability for
children under 16. Since APEX does not model employment status for children under 16
(the employment profile variable will always be 0), the employment probability is always
set to 1 for this age group. This prevents APEX from discarding the CHAD activity diaries for
children under age 16 that had an employment status=YES.

The age selection probability is a bit more complicated. APEX provides the user with the option
of using activity diaries that have an age close to that of the simulated profile, although it may
not match exactly. This range is controlled by the Simulation Control file setting AgeCut2Pct
(see Volume I). Ages in this range will be given a probability of 1. In addition, APEX allows for
the use of "shoulder ages," which are the age ranges (of width AgeCutPct) above and below the
main age window. These ages are given a reduced probability equal to the value of the Simulation
Control file setting Age2Prob.

If the employment status, gender, or age for an activity diary is missing, then the selection
probability for that variable is determined directly from the Simulation Control file variables
MissEmpl, MissGender, and MissAge, respectively.

The final selection probability for each diary is the product of the age, gender, and employment
selection probabilities. Then, on each day of the simulation period, APEX randomly selects a
diary day from the appropriate diary pool, based on selection probability value.
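
A minimal Python sketch of the selection probability and the random draw is given below. The dictionary layouts and the keys `age_window` and `shoulder_width` are hypothetical and are expressed here in years for simplicity; how the Simulation Control file settings translate into the age window and shoulder widths is described in Volume I. The keys `Age2Prob`, `MissEmpl`, `MissGender`, and `MissAge` stand in for the settings named above.

```python
import random

def match_prob(diary_value, profile_value, missing_prob):
    """1 if the diary value matches the profile, 0 if not, missing_prob if absent."""
    if diary_value is None:
        return missing_prob
    return 1.0 if diary_value == profile_value else 0.0

def age_prob(diary_age, profile_age, cfg):
    """1 inside the main age window, Age2Prob in the shoulders, 0 beyond."""
    if diary_age is None:
        return cfg["MissAge"]
    gap = abs(diary_age - profile_age)
    if gap <= cfg["age_window"]:
        return 1.0
    if gap <= cfg["age_window"] + cfg["shoulder_width"]:
        return cfg["Age2Prob"]
    return 0.0

def selection_prob(diary, profile, cfg):
    """Product of the age, gender, and employment probabilities."""
    p_age = age_prob(diary["age"], profile["age"], cfg)
    p_gender = match_prob(diary["gender"], profile["gender"], cfg["MissGender"])
    if profile["age"] < 16:
        p_emp = 1.0   # employment is not modeled for children under 16
    else:
        p_emp = match_prob(diary["employed"], profile["employed"], cfg["MissEmpl"])
    return p_age * p_gender * p_emp

def pick_diary(pool, profile, cfg, rng):
    """Randomly select one diary from the pool, weighted by selection probability."""
    weights = [selection_prob(d, profile, cfg) for d in pool]
    return rng.choices(pool, weights=weights, k=1)[0]
```

Diaries with a selection probability of 0 are never drawn; `random.choices` raises `ValueError` if every diary in the pool has zero weight.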
6.3 Longitudinal Activity Diary Assembly

The second method of multi-day diary construction in APEX, which is required for
characterizing within-person and between-person exposure variability, is a longitudinal diary
assembly algorithm that constructs multi-day diaries based on reproducing realistic variation in a
user-selected key diary variable. The variable may reflect any diary property, and is provided to
APEX in the Diary Statistics input file. It is assumed that the key variable has a dominant
influence on exposure. The APEX release provides Diary Statistics files for outdoor time and
vehicle time, which were constructed by summing the total time associated with "outdoor" and
"vehicle" CHAD location codes for each diary (see Volume I). For some scenarios, the key
variable might be travel time or time performing a particular activity. The key variable could
also be a composite formed from several different variables, for example, a weighted average of
diary variables. The necessary condition for implementing the method is that every single-day
diary be assigned a numeric value for this key variable. This allows the set of available diaries in
every diary pool to be ranked in terms of this key variable, from lowest to highest. The method
uses this key variable to preferentially select appropriate diaries from the available pool on the
different days in order to produce a final longitudinal activity diary that has specific statistical
properties. The method is primarily contained within DiaryModule:DiaryRanks, which is
called from DiaryModule:DiaryProbabilities.

The longitudinal diary construction method targets two statistics, D and A. The D statistic
reflects the relative importance of within-person variance and between-person variance in the
key variable. The A statistic quantifies the lag-one (day-to-day) variable autocorrelation, which
characterizes the similarity in diaries from day to day. Desired D and A values for the key
variable are selected by the user and set in the Simulation Control file, and the algorithm
constructs a longitudinal diary that preserves these parameters. See Section 6.3.2 for guidance
on selecting appropriate D and A values for a particular simulation.
6.3.1 The Longitudinal Diary Assembly Algorithm

The longitudinal diary selection method is based on the scaled rank, or "x-score" for the key
variable for each diary. The x-scores are calculated within diary pools. First, a pool is sorted
from lowest to highest on the key variable and given a corresponding rank, R. If there are K
diaries in a pool, and each diary has equal statistical weight, then the x-score for the diary at rank
R is

x = (R - 1/2) / K (6-1)

If the diaries have unequal statistical weight, then the x-scores will not be evenly spaced from 0
to 1 (as some diaries will correspond to a greater "interval" on the 0 to 1 scale). In APEX, eq. 6-1
is not used; rather, the x-scores are assigned to each diary by sorting the diary pool on the
key statistic and then applying the age, gender, and employment probabilities as described in
Section 6.2. The diaries are then assigned to days in the simulation based on their x-scores by
the algorithm described below.
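
The x-score assignment can be illustrated with a short Python sketch. For equal weights it uses the scaled rank (R - 1/2)/K, with R running from 1 to K. For unequal weights it places each diary at the midpoint of its share of the cumulative weight; that midpoint rule is an assumption of this sketch (it reduces to the equal-weight form), not a statement of how APEX applies the selection probabilities.

```python
def x_scores(key_values, weights=None):
    """Return an x-score in (0, 1) for each diary, in the input order.

    key_values: key-variable value for each diary in the pool.
    weights:    optional positive statistical weights (assumed midpoint rule).
    """
    n = len(key_values)
    if weights is None:
        weights = [1.0] * n
    order = sorted(range(n), key=lambda i: key_values[i])   # lowest to highest
    total = sum(weights)
    scores = [0.0] * n
    cumulative = 0.0
    for i in order:
        # Midpoint of this diary's interval on the 0-to-1 scale.
        scores[i] = (cumulative + weights[i] / 2.0) / total
        cumulative += weights[i]
    return scores
```

For example, a pool of four equally weighted diaries receives the scores 0.125, 0.375, 0.625, and 0.875 in order of increasing key variable.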

An overview of the longitudinal diary method is shown in Figure 6.1. For each simulated
person, the following steps are performed:
1. An individual target x-score T_i is selected from a beta distribution (β1) that depends on the
value of D.

2. For each day in the simulation, a daily scaled x-score (scaled rank) is generated. It is picked
from a different beta distribution (β2) having a peak near T_i.

3. An individual target correlation A_i is sampled from a beta distribution (β3) having a peak
near the population autocorrelation A.

4. The independently sampled daily x-score values are re-ordered to induce the target
autocorrelation A_i.

5. Diaries are assigned from the diary pool according to the final time-series of daily x-score
values.

[Figure 6.1 flowchart. Input: defined diary pools, sorted by rank of key variable. Loop over simulated persons: (1) pick personal target x-score for the key variable; loop over simulated days: (2) pick daily x-scores distributed about T_i; (3) pick personal target autocorrelation A_i; (4) re-order daily x-scores to match A_i; (5) assign diaries with appropriate x-score to each day.]

Figure 6.1. Overview of the Longitudinal Diary Assembly Algorithm

All of the random number generation in the new method involves drawing numbers from Beta
distributions. All of the random number distributions are bounded both above and below, which
is a natural property of the Beta distribution. Given these fixed end points, the Beta distribution
has two shape parameters which allow a great variety of forms. Both shape parameters (for all
the distributions) are positive. The means and shape parameters for each of the Beta
distributions were formulated to 1) properly reproduce D for the population, while 2) producing
unbiased x-scores across the population. The resulting equations for the parameters are given
below. See Glen et al. (2007) for the derivation of these equations. Each of the above steps of
the algorithm is described in detail below:

1. An individual target x-score T_i is selected from a beta distribution (β1) that depends on
the value of D. The β1 distributions for T_i
are symmetric about their midpoint, so the two distribution shape parameters are equal. They are
equal to a, where

a = 1 - 0.03 (4 sqrt(D) (1 - sqrt(D)))^3 (6-2)

The mean of the distribution, μ, is equal to 0.5. The bounds of β1 are at (1/2 - w1/2) and
(1/2 + w1/2), where w1 is a function of D and a (equation 6-3; see Glen et al., 2007).
Thus the final T_i for the profile is generated as:

T_i = 0.5 + w1 (β(a, a) - 0.5) (6-4)

2. For each day in the simulation, a daily x-score is generated. To generate daily scores, a
beta distribution (β2) dependent on D with a peak near the target value T_i is constructed. Since
each individual has a personal value of T_i, each will have a unique β2. As D approaches zero, β2
flattens into a uniform distribution. As D approaches one, β2 narrows to a spike at T_i. One x
value is selected randomly from this distribution for each day in the simulation, plus a few extra (15
extra is sufficient for a one-year simulation). Since the β2 distributions for two individuals differ, the
between-person variance does not go to zero as the number of draws becomes large. The shape
parameters a and b for β2 are functions of T_i and D:

a = 2 T_i / (1 - D) (6-5)

b = 2 (1 - T_i) / (1 - D) (6-6)

Thus for each day in the simulation, an x-score is generated as

x = β2 = β(a, b) (6-7)

3. An individual target correlation A_i is sampled from a different beta distribution (β3)
having a peak near the population autocorrelation A. The width of this distribution is w2:

w2 = min(2 - 2|A|, 1) (6-8)

The personal target for A, A_i, is then generated as:

A_i = A + w2 (β(2.65, 2.625) - 0.5) (6-9)

4. The independently sampled daily x-scores are re-ordered to induce the target
autocorrelation A_i. This is done with a single pass, from the first simulation day to the last. The
reordering process involves selecting each x-score in the time series from another beta
distribution (β4) that is a function of both the individual autocorrelation target A_i and the
previous x-score. The "extra" scores drawn in step 2 come into play here; they allow the avoidance of
undesirable forced selections. The shape parameters c and d for β4 for day j are:

(6-10)

(6-H)
where

S = ^- (6-12)
(1-Af)

and x_(j-1) is the previous day's x-score. The x-score for the first day is picked at random from a
uniform distribution ranging from 0 to 1. Then, for day j, the x-score is selected as:

(6-13)

The APEX code for performing this process keeps track of which x-scores are picked. If the
same score is selected more than once, then the algorithm finds the closest unused score. The
process laid out by equations 6-10 through 6-13 continues until x-scores for all simulation days
are selected.

On very short time series (less than 30 days), the autocorrelation step has a slight effect on the
resulting D for the population (a few percent increase).

5. Assign the selected x-score (scaled rank) values to each day in the simulation, and assign
corresponding diaries from the diary pool. The result of steps 1-4 is a time series of x-score
values mapped to simulation days, for example,

Jan 1 Jan 2 Jan 3

0.148 0.372 0.324

The diary pool has already been defined by APEX (typically by factors including age, gender,
employment, day type, and season, and perhaps others, see Section 6.1). The pool is then sorted
into rank order from lowest to highest score for the key variable, and sort order is mapped onto a
corresponding x-score. The x-score reflects the behavior of an individual relative to their peer
group (for example, a person with score 0.75 is above 75% of the people in the same cohort and
pool, in terms of the key variable). Scores can be moved across diary pools, whereas absolute
values for the key variable might not. Thus, there might be a diary with six hours of outdoor
time in the Sunday pool, but no such diary in the Monday pool. But a score of 0.75 exists on all
days. The use of scores also helps in ensuring that all the available diaries are collectively
sampled with the correct frequency. Note that the use of scores does not destroy information.
All that matters in terms of diary assembly is the ability to specify which diary should be used on
a given simulation day. For this purpose, requesting the available diary nearest to score 0.38 is
no different than requesting the available diary nearest to (for example) 73 minutes of outdoor
time.

APEX assigns the diary whose x-score is closest to the daily x-score value to the day. No
distinction is made or needed between day types, seasons, etc., because that is already taken into
account. Note that any diary matching criteria such as day type, season, temperature, rainfall,
workday, holiday, etc., affect the list of diaries that belong to the pool for a given day, but have
no effect on the x-scores.
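
Steps 1, 2, 3, and 5 of the algorithm can be sketched in Python as below. Several parts are assumptions rather than the documented APEX forms: the expression used for the shape parameter in step 1, the shape parameters 2T/(1-D) and 2(1-T)/(1-D) in step 2, and especially the width w1, which is chosen here so that the variance of the personal targets equals D/12 (D times the variance of a uniform score on 0 to 1). See Glen et al. (2007) for the derivations. Step 4 (the re-ordering pass) is not sketched, and all function names are hypothetical.

```python
import math
import random

def target_x_score(D, rng):
    """Step 1: personal target x-score T_i (requires 0 <= D < 1)."""
    root = math.sqrt(D)
    a = 1.0 - 0.03 * (4.0 * root * (1.0 - root)) ** 3   # assumed form of eq. 6-2
    # ASSUMPTION: width chosen so that Var(T_i) = D/12 for a symmetric Beta(a, a).
    w1 = math.sqrt(D * (2.0 * a + 1.0) / 3.0)
    return 0.5 + w1 * (rng.betavariate(a, a) - 0.5)     # eq. 6-4

def daily_x_scores(T, D, n_days, rng, extra=15):
    """Step 2: independent daily x-scores from a beta distribution with mean T."""
    a = 2.0 * T / (1.0 - D)           # assumed form of eq. 6-5
    b = 2.0 * (1.0 - T) / (1.0 - D)   # assumed form of eq. 6-6
    return [rng.betavariate(a, b) for _ in range(n_days + extra)]

def target_autocorrelation(A, rng):
    """Step 3: personal autocorrelation target A_i (eqs. 6-8 and 6-9)."""
    w2 = min(2.0 - 2.0 * abs(A), 1.0)
    return A + w2 * (rng.betavariate(2.65, 2.625) - 0.5)

def nearest_diary(pool_x_scores, x):
    """Step 5: index of the pool diary whose x-score is closest to x."""
    return min(range(len(pool_x_scores)), key=lambda i: abs(pool_x_scores[i] - x))

# Illustrative use for one simulated person over 30 days.
rng = random.Random(2008)
T_i = target_x_score(0.2, rng)
xs = daily_x_scores(T_i, 0.2, n_days=30, rng=rng)
A_i = target_autocorrelation(0.2, rng)
```

With D = 0 the sketch gives T_i = 0.5 and uniform daily scores, and as D approaches 1 the daily scores concentrate at T_i, matching the limiting behavior described in step 2.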
6.3.2 Selecting Appropriate D and A Values For a Simulated Population

The statistic D for a population of individuals is given by

$$D = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_w^2}$$   (6-14)

where $\sigma_b^2$ and $\sigma_w^2$ are the between-person and within-person variances in the key variable. Since both
variances are non-negative, D lies in the interval [0,1]. D=0 means that $\sigma_b^2$ is zero,
or that each person has the same mean score. A small D means that $\sigma_b^2$ is substantially smaller
than $\sigma_w^2$, indicating that the overall variability between people in the key diary statistic is smaller
than the variability observed over days within the same person. A D near one means that $\sigma_b^2$ is
much larger than $\sigma_w^2$, or that each person shows little variation over time relative to the
variability between persons.
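
As a rough illustration of eq. 6-14, the sketch below estimates D from per-person score series. The estimator is an assumption chosen for simplicity (population variance of the person means for the between-person term, mean of the per-person population variances for the within-person term); it is not necessarily the procedure used to produce the values reported in this document.

```python
from statistics import mean, pvariance

def statistic_d(scores_by_person):
    """Estimate D = var_between / (var_between + var_within).

    scores_by_person: one list of daily scores per person (hypothetical layout).
    Raises ZeroDivisionError if there is no variability at all.
    """
    person_means = [mean(s) for s in scores_by_person]
    var_between = pvariance(person_means)
    var_within = mean([pvariance(s) for s in scores_by_person])
    return var_between / (var_between + var_within)

# Everyone has the same mean score: D is 0.
d_low = statistic_d([[0.1, 0.9], [0.1, 0.9]])
# Nobody varies from day to day: D is 1.
d_high = statistic_d([[0.2, 0.2], [0.8, 0.8]])
```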

The lag-one autocorrelation A is simpler to calculate than D, because each time series can be
examined independently. The first step is to determine the x-score for each day, relative to the
entire time series. If there are J days in the time series, and a given day is at rank R for the key
variable among the J days, then the x-score for that day is (R - 1/2) / J. The
overall mean and variance of these scores for the time series are then calculated. Due to the
properties of the discrete uniform distribution of the scores (neglecting tied scores), the mean
must be 1/2 and the variance is

$$\mathrm{var} = \frac{J^2 - 1}{12 J^2}$$   (6-15)

which is very close to 1/12 for large J. The lag-one covariance is calculated by

$$\mathrm{cov} = \frac{1}{J} \sum_{j=1}^{J-1} \left( x(j) - \frac{1}{2} \right) \left( x(j+1) - \frac{1}{2} \right)$$   (6-16)

where x(j) is the x-score on day j (see, for example, Box et al., 1994). The lag-one
autocorrelation for the individual time series is given by the ratio of the covariance to the
variance:

$$\text{autocorrelation} = \frac{\mathrm{cov}}{\mathrm{var}}$$   (6-17)

This calculation is repeated for each time series, and the statistic A is the mean of these
individual autocorrelations. The statistic A has a range from -1 to +1, with positive values
indicating that each day has a tendency to resemble the day before. Random selection of diaries
from day to day produces A values near zero. Negative A values imply dissimilarity between
consecutive days.
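
The calculation of A can be sketched as below. The sketch assumes that the score variance is (J^2 - 1) / (12 J^2), that the lag-one covariance is the sum of products of consecutive deviations from 1/2 divided by J, and that ties are broken by position in the series; the function names are illustrative only.

```python
def x_scores(values):
    """x-score (R - 1/2) / J for each day, ranked within one time series."""
    J = len(values)
    order = sorted(range(J), key=lambda j: values[j])  # ties keep series order
    scores = [0.0] * J
    for rank, j in enumerate(order, start=1):
        scores[j] = (rank - 0.5) / J
    return scores

def lag_one_autocorrelation(values):
    """Ratio of lag-one covariance to variance of the x-scores (needs J >= 2)."""
    x = x_scores(values)
    J = len(x)
    var = (J * J - 1) / (12 * J * J)
    cov = sum((x[j] - 0.5) * (x[j + 1] - 0.5) for j in range(J - 1)) / J
    return cov / var

def statistic_a(series_list):
    """Mean of the individual lag-one autocorrelations."""
    acs = [lag_one_autocorrelation(s) for s in series_list]
    return sum(acs) / len(acs)

trend = lag_one_autocorrelation([1, 2, 3, 4])      # consecutive days similar
zigzag = lag_one_autocorrelation([1, 4, 2, 3])     # consecutive days dissimilar
```

A steadily increasing series gives a positive value and an alternating series a negative one, consistent with the interpretation of A given above.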

A study of children conducted in Southern California (Xue et al., 2004) provided about 60 days
of data on each of 163 children. The time series are not continuous, as the monitoring consisted
of twelve six-day periods, one per month over a year. Furthermore, only about 40 children were
measured simultaneously, as the other children were sampled in different weeks. However, a
sample size of 40 is sufficient to calculate reliable rankings across persons. The number of
consecutive day pairs was substantially less than the number of days, due to the gaps in the time
series. However, D and A statistics were calculated for three variables directly recorded on the
activity diaries (outdoor time, travel time, and indoor time), and also for a fourth variable, the
physical activity index or PAI (McCurdy, 2000). The analyses were performed for all children
together and for two gender cohorts. The separation into two cohorts reduces the number of
children measured simultaneously to fewer than 20. Further division into more cohorts is
therefore not practical, as the reliability of the scores would become very uncertain. The results
for these analyses are given in Table 6.1.

Table 6.1. D and A Statistics Derived from the Southern California Children's Study

Variable        Group    D       A
Outdoor time    all      0.19    0.22
Outdoor time    boys     0.21    0.21
Outdoor time    girls    0.15    0.24
Travel time     all      0.18    0.07
Travel time     boys     0.18    0.05
Travel time     girls    0.18    0.08
Indoor time     all      0.17    0.22
Indoor time     boys     0.21    0.20
Indoor time     girls    0.17    0.24
PAI             all      0.16    0.23
PAI             boys     0.16    0.20
PAI             girls    0.16    0.25

For all variables and each group, the standard deviation between persons for autocorrelation was
about 0.20, and the standard error in the mean A was about 0.02. The values in Table 6.1
indicate that gender differences for both D and A are small, if present at all. The variance in A
over the population was also examined for the four diary variables. While the absolute values of
A were different across variables, it was found that the variance in individual autocorrelations was
very similar (approximately 0.2) in all variables. This variance was used to derive the
parameters for the target distribution (eq. 6-9) in the APEX algorithm, and thus the method
returns diaries that reproduce a variance of 0.2 in the daily autocorrelation, no matter what diary
variable is modeled or what absolute A value is used.
CHAPTER 7. ESTIMATING ENERGY EXPENDITURES AND
VENTILATION

Ventilation rates are used in APEX for

• Calculating exertion level for use in tabulating exposure summaries for the population

• Estimating dose
Ventilation does not influence the exposures for a simulated person.

Ventilation is a general term for the movement of air into and out of the lungs. Minute or total
ventilation is the amount of air moved in or out of the lungs per minute. Quantitatively, the
amount of air breathed in per minute (Vi) is slightly greater than the amount expired per minute
(Ve). Clinically, however, this difference is not important, and by convention minute ventilation
is always measured on an expired sample, Ve. Alveolar ventilation (Va) is the volume of air
breathed in per minute that (1) reaches the alveoli and (2) takes part in gas exchange. The
ventilation rate needed for the %COHb calculation is this ventilation rate, Va, and is derived for
use in APEX based on work by Adams (1998), Astrand and Rodahl (1977), Burmaster and
Crouch (1997), Esmail et al. (1995), Galetti (1959), Johnson (1998), Joumard et al. (1981),
McCurdy (2000), McCurdy et al. (2000), Schofield (1985), and many others. Only a brief
description of the derivation of Va is given below; for the complete derivation, see Johnson (2002).

Ventilation is calculated on an activity event-by-activity event basis. Ventilation is derived from
the energy expenditure rate (MET, given as a multiple of resting metabolic rate) associated with
the diary activities. The general steps in estimating ventilation are

• Generate the MET event time-series based on the diary activities

• Adjust the resulting MET series for fatigue and excess post-exercise oxygen consumption

• Convert the MET time-series into a ventilation rate time series
These steps are covered in the following sections.
7.1 Generating the MET Time-Series
MET—which comes from "metabolic equivalents of task"—is a dimensionless ratio of the
activity-specific energy expenditure rate to the basal or resting energy expenditure rate. While
different people have very different basal metabolic rates, it is generally found that the MET
ratios do not exhibit as much variability. Thus, standing still might require two times the basal
energy expenditure, or two MET, for most people, with relatively little variation. The basal rate
is constant (it only has to be determined once per profile), while the activity-specific MET ratio
is calculated for each of the activities reported on the composite activity diary.

Each possible diary activity code (see Volume I) is mapped to a corresponding APEX MET
distribution number in the MET Mapping input file. Each of these distributions is then defined in
the MET Distributions file. This file (see Volume I) gives the properties of the MET distribution
for each type of activity, in some cases as a function of age or occupation. The distributions are
defined in the standard APEX format (see Section 3.1). These distributions are based on extensive
available data on energy expenditure, and in general should not be changed. This file is read into
APEX in DiaryModule:ReadMETS. In the subroutine DiaryModule:METSEval, APEX
steps through the activity diary and assigns a MET value to each event by randomly selecting a
value from the appropriate distribution as defined by the activity code and the profile. The result
is considered the "raw" MET time-series, which is then adjusted to be physiologically realistic.
The adjustments are covered in the next section.
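
The two-step lookup (activity code to distribution number, distribution number to a sampled MET value) might look like the sketch below. The activity codes, distribution numbers, and distribution parameters are invented placeholders, not entries from the APEX MET Mapping or MET Distributions files, and the dependence on age or occupation is omitted.

```python
import random

# Placeholder stand-ins for the MET Mapping and MET Distributions input files.
MET_MAPPING = {"sleep": 1, "walk": 2}            # activity code -> distribution number
MET_DISTRIBUTIONS = {
    1: ("point", 0.9),                           # fixed value
    2: ("uniform", 2.0, 4.0),                    # lower and upper bounds
}

def sample_met(dist, rng):
    """Draw one MET value from a (kind, params...) tuple."""
    kind = dist[0]
    if kind == "point":
        return dist[1]
    if kind == "uniform":
        return rng.uniform(dist[1], dist[2])
    raise ValueError(f"unsupported distribution type: {kind}")

def raw_met_series(events, rng):
    """events: list of (activity_code, minutes). Returns (minutes, raw MET) pairs."""
    return [(minutes, sample_met(MET_DISTRIBUTIONS[MET_MAPPING[code]], rng))
            for code, minutes in events]

rng = random.Random(0)
series = raw_met_series([("sleep", 60), ("walk", 30)], rng)
```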
7.2 Adjusting the MET Time-Series for Fatigue and Excess Post-Exercise
Oxygen Consumption

As discussed in the previous section, APEX assigns distributions for MET level to each diary
event, based on the reported event activity (and in some cases, age and occupation). However,
these raw MET time-series do not consider the sequence of the events (i.e., the order in which
they occur). It is well known that a person's capacity for work will diminish as they get tired,
and in practice, this means that the upper bound on MET is lowered if events in the recent past
have been at unusually high MET levels. Furthermore, once high activity levels have ended,
people tend to breathe heavily even while resting, as they recover their accumulated oxygen
deficit. This effect is called excess post-exercise oxygen consumption (EPOC), and results in
raising the MET levels above the 'raw' values pulled from the activity-based distributions.
APEX contains an algorithm for adjusting the MET time series for both of these effects. The
algorithm is implemented in ExposureDoseModule:Ventilation.

The adjustment method is based on keeping a running total of the oxygen deficit as a simulated
individual proceeds chronologically through his or her activity diary. The oxygen deficit is the
amount of energy supplied to the muscles by non-aerobic systems during exercise. It reflects a
need for increased post-exercise ventilation to "pay back" this energy. The oxygen deficit
calculations were derived from a synthesis of numerous published studies (see below).

Oxygen deficit is measured as a percentage of the maximum oxygen deficit an individual can
attain prior to deterioration of exercise performance. Limitations on MET levels corresponding
to post-exercise diary events were based on maintaining an oxygen deficit below this maximum
value. In addition, adjustments to MET were simultaneously made for EPOC. The EPOC
adjustments are based in part on the modeled oxygen deficit and in part on data from published
studies on EPOC, oxygen deficit, and oxygen consumption.

The methods are constructed in terms of reserve MET, which is the amount over the basal rate
(MET=1). Furthermore, we define M as the normalized reserve, so that M=0 at MET=1, and
M=1 at maximum MET:

$$M = \frac{MET - 1}{MET_{max} - 1}$$   (7-1)

(Recall that METmax is a profile variable assigned for each simulated profile; see Section 5.3).
Using a normalized reserve assures that the method can be applied identically to the entire
population of profiles, each having a unique METmax value.
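
The conversion between MET and the normalized reserve M, and its inverse, can be written as a pair of small helper functions. The function names are illustrative.

```python
def met_to_m(met, met_max):
    """Normalized reserve: 0 at MET = 1 and 1 at MET = met_max."""
    return (met - 1.0) / (met_max - 1.0)

def m_to_met(m, met_max):
    """Inverse conversion from normalized reserve back to MET."""
    return m * (met_max - 1.0) + 1.0

# Two profiles with different maxima working at the same fraction of their reserve.
m_a = met_to_m(5.5, 10.0)
m_b = met_to_m(7.0, 13.0)
```

Both example profiles are at M = 0.5 even though their MET values differ, which is the property that lets the same rules be applied to every profile.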

A number of terms will be used in the description of the algorithm. They are defined below:

• MET         Metabolic equivalent of task (unitless)

• METmax      Maximum achievable metabolic equivalent for an individual (unitless)

• M           Normalized MET reserve (unitless, M, bounded between 0 and 1)

• ΔM          Change in M from one diary event to the next (M)

• Dmax        Absolute maximum oxygen deficit that can be obtained (M-hr)

• F           Fractional oxygen deficit (percent of individual maximum, unitless)

• te          Duration of activity diary event (hours)

• tr          Time required to recover from an F of 1 to an F of 0 at rest (recovery time, hours)

• dFinc       Rate of change of F due to deficit increase (F/hr, will have a positive value)

• dFrec       Rate of change of F due to deficit recovery (F/hr, will have a negative value)

• dFtot       Total rate of change of F, dFinc + dFrec (F/hr)

• ΔFinc       Increase in F due to anaerobic energy expenditure (F)

• ΔFrec       Decrease in F due to recovery of oxygen deficit (F)

• ΔFtot       Change in F due to simultaneous anaerobic work and oxygen recovery, ΔFinc + ΔFrec (F)

• ΔFfast      Total change in F during the fast recovery phase (F)

• Sfast       Magnitude of the rate of change in M during fast component (M/hr)

• EPOCfast    Change in M due to fast-component EPOC (M)

• EPOCslow    Change in M due to slow-component EPOC (M)

See Isaacs et al. (2007) for a complete derivation of the method.
7.2.1 Simulation of Oxygen Deficit

This section presents the theoretical development of the equations describing the accumulation of
oxygen deficit. The method was developed using a large number of studies on oxygen
consumption, oxygen deficit, and EPOC. Individual studies will be referenced below. The first
two subsections below describe the equations themselves, while the last section describes the
determination of the values for the model parameters.
7.2.1.1 Fast Processes

There exists a component of the accumulated oxygen deficit that is due to transition from one M
level to another (McArdle et al., 2001). This component derives from the anaerobic work that is
required by sudden muscular motion. There is also a corresponding fast component of oxygen
recovery which occurs very quickly after a change from a high M level to a lower one. In the
absence of any data to the contrary, it is assumed that these fast deficit accumulation and fast
recovery processes occur at the same rate. These processes are illustrated in Figure 7.1. The
adjustment to F is equal to the area of the triangle associated with either a positive or negative
change in M, normalized by the maximum obtainable accumulated oxygen deficit (Dmax). The
normalized area can thus be calculated as:

$$\Delta F_{fast} = 0.5 \, \frac{\Delta M \, |\Delta M|}{S_{fast} \, D_{max}}$$   (7-2)

where $\Delta M = M_i - M_{i-1}$ and Sfast is the slope of the change in M (in M/hr). Note that this change in
F will be positive if $\Delta M$ is positive, and negative otherwise.
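
A minimal sketch of the fast adjustment follows, assuming eq. 7-2 has the form 0.5 ΔM |ΔM| / (Sfast Dmax), i.e., the signed area of a triangle with height |ΔM| and base |ΔM| / Sfast, normalized by Dmax. The numerical inputs are arbitrary illustrations.

```python
def delta_f_fast(m_prev, m_curr, s_fast, d_max):
    """Signed fast change in F for a transition from m_prev to m_curr.

    s_fast is in M/hr and d_max in M-hr, so the result is a fraction of Dmax.
    """
    dm = m_curr - m_prev
    return 0.5 * dm * abs(dm) / (s_fast * d_max)

up = delta_f_fast(0.2, 0.7, s_fast=10.0, d_max=0.5)     # start of exercise
down = delta_f_fast(0.7, 0.2, s_fast=10.0, d_max=0.5)   # start of recovery
```

The two transitions are equal in size and opposite in sign, reflecting the assumption that fast accumulation and fast recovery proceed at the same rate.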
[Figure omitted. Panel labels: "Fast Component of F: Accumulation (ΔM Positive)" and "Fast Component of F: Recovery (ΔM Negative)"; phases marked Exercise and Recovery; horizontal axis: Time.]
Figure 7.1. Fast Components of Oxygen Deficit and Recovery
7.2.1.2 Slow Processes

The slow component of the increase in oxygen deficit corresponds to the accumulation of deficit
over a period of heavier exercise (rather than that associated with an increase in activity level).
The method was derived from the analysis of a number of studies on exercise and EPOC
including Bahr (1992), Bahr et al. (1987), Bielinski et al. (1985), Brockman et al. (1993), Gillette
et al. (1994), Gore and Withers (1990), Hagberg et al. (1980), Harris et al. (1962), Kaminsky and
Whaley (1993), Katch et al. (1972), Maehlum et al. (1986), and Sedlock (1991a,b). The
following data were considered: the time it took for subjects to reach exhaustion, their
accumulated oxygen deficit, their METmax, the MET value at which they exercised, and the
corresponding normalized reserve MET (M). (Note that the MET and METmax quantities were
derived from the published VO2 and VO2max measurements.) The data indicated that oxygen
deficit accumulates at a much faster rate when M is high. For example, exercise at an M value near 0.5
takes about 5 times longer to reach exhaustion than exercise at an M value near 0.75 (on average),
indicating that the rate of accumulation of F is nonlinear in M.

Let the rate of increase in F be given by dFinc. The relationship between dFinc and M is a power
law:

$$dF_{inc} = a M^b$$   (7-3)

where a and b were estimated from available data. The slow recovery of oxygen deficit must
also be accounted for, as it occurs simultaneously with debt accumulation. A slow, but
continual, process for recovering oxygen deficit is modeled, independent of the MET level.
EPOC recovery is modeled as constant over time until the oxygen deficit is erased. Assuming
this takes tr hours, the slow recovery of oxygen deficit occurs at a rate

$$dF_{rec} = -\frac{1}{t_r}$$   (7-4)

The total net rate of change in F from slow processes during an event with duration te is given by

$$dF_{slow} = dF_{inc} + dF_{rec}$$   (7-5)

and the associated change in F is

$$\Delta F_{slow} = \left( a M^b - \frac{1}{t_r} \right) t_e$$   (7-6)

For an individual starting with an F of 0 and exercising to exhaustion (neglecting the transitory
effects), the change in F is 1.0. In this case, rearranging and taking the logarithm gives

$$\log\left( \frac{1}{t_e} + \frac{1}{t_r} \right) = \log(a) + b \log(M)$$   (7-7)

This equation can be used to fit data to estimate the parameters a and b (this will be discussed in
the next subsection).

The starting normalized oxygen deficit for the next event (i+1), taking into account both the fast
and slow changes in F, is then

$$F_{i+1} = F_i + \Delta F_{slow} + \Delta F_{fast}$$   (7-8)
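
The slow change in F and the event-to-event update can be sketched as below, assuming a power-law accumulation rate a·M^b, a constant recovery rate of 1/tr, and the update F(i+1) = F(i) + ΔFslow + ΔFfast. The values of a and b used here are arbitrary placeholders, and the floor at zero is an added assumption that reflects the statement that recovery continues only until the deficit is erased.

```python
def delta_f_slow(m, t_e, t_r, a, b):
    """Net slow change in F over an event of t_e hours at normalized reserve m."""
    return (a * m ** b - 1.0 / t_r) * t_e

def next_f(f, d_fast, d_slow):
    """Deficit at the start of the next event, floored at zero (assumption)."""
    return max(0.0, f + d_fast + d_slow)

def hours_to_exhaustion(m, t_r, a, b):
    """Time for F to go from 0 to 1 at constant m, ignoring fast effects."""
    rate = a * m ** b - 1.0 / t_r
    return 1.0 / rate if rate > 0 else float("inf")

A_PLACEHOLDER, B_PLACEHOLDER = 5.0, 3.0          # illustrative only
resting = delta_f_slow(0.0, 2.0, 10.0, A_PLACEHOLDER, B_PLACEHOLDER)
working = delta_f_slow(1.0, 0.5, 10.0, A_PLACEHOLDER, B_PLACEHOLDER)
```

At low M the net rate is negative and the person recovers; once a·M^b exceeds 1/tr the deficit grows.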

7.2.1.3 Derivation of Appropriate Values for the Model Parameters

The values of the model parameters tr, a, and b, were derived from summaries of published data
on EPOC and oxygen debt (see references listed in Section 7.2). Several of these studies
reported tr values; however, due to variability in measurement and protocol differences, these
recovery times varied from 0.5 hours to 24 hours. From a modeling viewpoint, it would be
unacceptable to allow recovery to significantly carry over from one day to the next. To do so
could lead to a perpetual delay in recovering an oxygen deficit, for example, by repeatedly
encountering new exercise events before recovery is complete. In APEX, tr, which is a profile
variable, is selected from a uniform distribution having a minimum of 8 and a maximum of 16
hours.

In APEX, a and b are modeled as a function of tr:

(7-9)

These expressions were derived from the experimental data.

Appropriate distributions for maximum oxygen debt (MOD) in ml/kg were derived from data
from a number of studies in adults (Bickham et al., 2002, Billat et al., 1996, Buck and
McNaughton, 1999, Demarle et al., 2001, Doherty et al., 2000, Faina et al., 1997, Gastin and
Lawson, 1994, Gastin et al., 1995, Hill et al., 1998, Maxwell and Nimmo, 1996, Olesen, 1992,
Renoux et al., 1999, Roberts et al., 2003, Weber and Schneider, 2000), adolescents (Naughton et
al., 1998), and children (Berthoin et al., 1996, Carlson and Naughton, 1993). The studies covered
multiple types of exercise protocols, some having more than one protocol per study. Normal
distributions for MOD were defined for all three age groups, based on average mean and
standard deviation values from the studies:

• adults (>17 yrs): 54.95 ± 14.46 (ml/kg)

• adolescents (12-17 yrs): 63.95 ± 21.12 (ml/kg)

• children (<12 yrs): 34.74 ± 13.10 (ml/kg)

(These means and standard deviations are read in from the Physiology file; see Section 5.3.)
Values are selected from normal distributions with these characteristics. These values are
constant for an individual over the simulation period. The bounds of these distributions are
selected as two standard deviations from the mean; these ranges were found to be reasonable
when compared to reported ranges (Olesen, 1992). These values are transformed to Dmax via a
units conversion factor and the normalization needed for use with reserve MET:

$$D_{max} \ (\text{M-hr}) = \left( \frac{MOD}{60 \cdot METStoO2} \right) \frac{1}{METS_{max} - 1}$$   (7-11)

where METStoO2 is the conversion factor for ml O2 to MET-min, 3.5 [(ml O2/min)/kg]/MET.
Note that the variability in this factor is not addressed here.

A number of studies on EPOC (Almuzaini et al., 1998, Dawson et al., 1996, Frey et al., 1993,
Harms et al., 1995, Kaminsky et al., 1990, Knuttgen, 1970, Maresh et al., 1992, Pivarnik and
Wilkerson, 1988, Short and Sedlock, 1997, Trost et al., 1997) were used to derive Sfast. These
were all studies in which oxygen consumption was measured relatively soon (within a few
minutes) after the end of exercise and at a frequency high enough to capture the kinetics of the
change in oxygen consumption. The data were found to be relatively uniform from the minimum
(0.6 MET/min) to the maximum (3.7 MET/min) slope values, and so values were selected from a
uniform distribution having these bounds. Converting units and normalizing to M, one obtains:

$$S_{fast} \ (\text{M/hr}) = \frac{60 \cdot \mathrm{Uniform}(0.6, 3.7)}{METS_{max} - 1}$$   (7-12)
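
The parameter sampling and unit conversions described in this subsection can be sketched as follows. The sketch assumes that Dmax is MOD divided by 60 × METStoO2 × (METmax - 1), that the listed MOD figures are means and standard deviations, and that the two-standard-deviation bounds are enforced by redrawing out-of-range values; the text does not say how APEX applies the bounds, so the redraw is a guess.

```python
import random

METS_TO_O2 = 3.5  # (ml O2/min/kg) per MET, as stated in the text

# Mean and standard deviation of MOD (ml/kg) by age group, from the list above.
MOD_PARAMS = {"adult": (54.95, 14.46), "adolescent": (63.95, 21.12), "child": (34.74, 13.10)}

def sample_mod(group, rng):
    """Draw MOD from a normal distribution truncated at +/- 2 SD (by redrawing)."""
    mean, sd = MOD_PARAMS[group]
    while True:
        value = rng.gauss(mean, sd)
        if mean - 2 * sd <= value <= mean + 2 * sd:
            return value

def d_max(mod, met_max):
    """Convert MOD (ml/kg) to a maximum deficit in M-hr."""
    return mod / (60.0 * METS_TO_O2 * (met_max - 1.0))

def s_fast(met_max, rng):
    """Fast-component slope in M/hr from a uniform 0.6-3.7 MET/min draw."""
    return 60.0 * rng.uniform(0.6, 3.7) / (met_max - 1.0)

def recovery_time(rng):
    """tr in hours, uniform between 8 and 16."""
    return rng.uniform(8.0, 16.0)

rng = random.Random(1)
profile = {"mod": sample_mod("adult", rng), "s_fast": s_fast(11.0, rng), "t_r": recovery_time(rng)}
```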
7.2.2 Adjustments to M for Fatigue

The equations provided in the previous section describe a method for keeping a running total of
the fractional oxygen deficit (F) for each diary event for an individual. These event F values are
used to limit M for each event to appropriate values. Basically, the maximum M value that can
be maintained for an entire event is the value that would result in an $F_{i+1}$ (eq. 7-8) equal to 1 (i.e.,
the maximum value) at the end of the diary event. The approach used in APEX is to set M for
each event equal to the value corresponding to the raw MET value, and test whether $F_{i+1} > 1$.
If it is, then the $M_i$ value is reduced
by a predetermined amount (currently 0.01) and $F_{i+1}$ is recalculated. The process continues until
an appropriate value of $M_i$, called $M_{max,i}$, is found. As the exposure model marches through the
events of the activity diary, the M values associated with each event are adjusted if necessary:

$$M_i = \min(M_i, M_{max,i})$$   (7-13)
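
The step-down search can be sketched as a simple loop. The sketch assumes the fast and slow changes in F take the forms 0.5 ΔM |ΔM| / (Sfast Dmax) and (a·M^b - 1/tr)·te; the function name, the parameter values, and the lower bound of zero on M are illustrative assumptions.

```python
def limit_m_for_fatigue(m_raw, m_prev, f_start, t_e, t_r, a, b, s_fast, d_max, step=0.01):
    """Reduce M in steps until the end-of-event deficit no longer exceeds 1."""
    m = m_raw
    while m > 0.0:
        dm = m - m_prev
        d_fast = 0.5 * dm * abs(dm) / (s_fast * d_max)
        d_slow = (a * m ** b - 1.0 / t_r) * t_e
        if f_start + d_fast + d_slow <= 1.0:
            return m
        m -= step
    return 0.0

# Placeholder parameters: one hour at M = 0.9 is not sustainable here.
params = dict(f_start=0.0, t_e=1.0, t_r=10.0, a=5.0, b=3.0, s_fast=10.0, d_max=0.025)
limited = limit_m_for_fatigue(0.9, 0.9, **params)
unchanged = limit_m_for_fatigue(0.2, 0.2, **params)
```

A light event passes the first test and keeps its raw value, so the loop only costs anything for events that would otherwise overshoot the maximum deficit.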
7.2.3 Adjustments to M for EPOC

As noted above, it has been observed in many studies that EPOC is characterized by both slow
and fast components. The fast increase in oxygen consumption occurs within minutes of
exercise, while the slow component may persist for many hours. Both fast and slow EPOC
components were modeled.
7.2.3.1 Fast Processes

The fast EPOC component, which takes place in the first few minutes after exercise, is also
characterized by the slope Sfast. The energy recovered during those first few minutes
corresponds to the recovery triangle in Figure 7.1 and this increase in the rate of energy
expenditure for a post-exercise event is modeled as the area of the triangle divided by the event
duration:

$$EPOC_{fast} = 0.5 \, \frac{(\Delta M)^2}{S_{fast} \, t_e}$$   (7-14)

EPOCfast will thus have units of M (normalized reserve MET). The M level for the post-exercise
events will be incremented by EPOCfast.
7.2.3.2 Slow Processes

The increase in M associated with the slow EPOC component is estimated as the amount
required to maintain the slow recovery of F. Since the deficit Dmax is recovered in full in the
recovery time tr, the time-averaged adjustment to MET for the slow recovery process must be

$$EPOC_{slow} = \frac{D_{max}}{t_r}$$   (7-15)

Every diary event with the full rate of slow recovery will have its M value adjusted upward by
EPOCslow. An appropriate fraction of EPOCslow is used if only partial recovery is needed to
eliminate the deficit (i.e., return F to 0). The final adjusted M value for the diary event is thus

$$M_{adj} = M + EPOC_{fast} + EPOC_{slow}$$   (7-16)

and the new MET value for the event is

$$METS_{adj} = M_{adj} (METS_{max} - 1) + 1$$   (7-17)
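
The EPOC adjustments can be sketched as below, assuming the fast term is the recovery-triangle area 0.5 ΔM² / Sfast divided by the event duration, the slow term is Dmax / tr scaled by the fraction of the event spent recovering, and the fast term applies only when M has dropped from the previous event. Function names and inputs are illustrative.

```python
def epoc_fast(dm, s_fast, t_e):
    """Fast EPOC increment in M; applied only after a drop in M (assumption)."""
    if dm >= 0.0:
        return 0.0
    return 0.5 * dm * dm / (s_fast * t_e)

def epoc_slow(d_max, t_r, fraction=1.0):
    """Slow EPOC increment in M; fraction < 1 when only partial recovery is needed."""
    return fraction * d_max / t_r

def adjusted_met(m, epoc_f, epoc_s, met_max):
    """Add both EPOC terms to M and convert back to MET."""
    m_adj = m + epoc_f + epoc_s
    return m_adj * (met_max - 1.0) + 1.0

fast = epoc_fast(-0.5, s_fast=10.0, t_e=0.5)
slow = epoc_slow(d_max=0.25, t_r=10.0)
met_adj = adjusted_met(0.1, fast, slow, met_max=11.0)
```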
7.3 Calculating PAI and the Ventilation Rates

APEX calculates three different ventilation rates from the adjusted MET time series. These are
the expired ventilation rate (Ve), the alveolar ventilation rate (Va), and the effective ventilation
rate (EVR). All three are reported for each hour in the simulation in the APEX output files. Va
is used in the CO dose calculations, and EVR is used in compiling summary exposure tables for
different populations during different levels of exertion.

In addition, the final MET time-series is used to calculate a physical activity index (PAI) for
each individual. Finally, an intermediate rate, oxygen consumption (VO2), is also calculated.
The equations for calculating these rates from the MET time series are given below. All of these
calculations are implemented in ExposureDoseModule:Ventilation.
7.3.1 Calculating PAI and Energy Expenditure

Once the final MET time series is calculated, the physical activity index (PAI) for each timestep
and each hour in the simulation is calculated for the simulated individual as the
time-weighted average of MET:

$$PAI = \frac{\sum_{i=1}^{N_{events}} MET_i \, t_i}{t}$$   (7-18)

where $MET_i$ is the MET value for event i, $t_i$ is the event duration in minutes, and t is the length
of the timestep in minutes (or 60 minutes, if calculating hourly values). Nevents is the number of
diary events in the considered timestep or hour. These PAI values can be written to the Timestep
or Hourly files. The daily PAI value is simply the average of the 24 hourly values. Finally, a
median daily PAI value is calculated for each profile. The daily and median daily PAI values are
saved as profile variables (see Section 5.3). The median daily PAI value is used in the
characterization of persons as "active" when creating the output exposure summary tables (see
Volume I and Section 9.2).

The energy expenditure (kcal/min) is

$$EE = PAI \times RMR$$   (7-19)

where RMR is the profile resting metabolic rate in kcal per minute. EE is calculated for both
timesteps and hours, and can be written to the corresponding output files.
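
The PAI and energy expenditure calculations reduce to a time-weighted average and a product, as in the sketch below. The event layout (MET value paired with minutes) and the function names are assumptions made for the example.

```python
from statistics import mean, median

def pai(events, period_minutes=60.0):
    """Time-weighted average MET over one timestep or hour.

    events: list of (met, minutes) pairs whose minutes sum to period_minutes.
    """
    return sum(met * minutes for met, minutes in events) / period_minutes

def daily_pai(hourly_values):
    """Average of the 24 hourly PAI values."""
    return mean(hourly_values)

def energy_expenditure(pai_value, rmr):
    """Energy expenditure in kcal/min, with rmr in kcal/min."""
    return pai_value * rmr

hour = pai([(1.0, 30.0), (3.0, 30.0)])
ee = energy_expenditure(hour, rmr=1.2)
median_daily = median([1.4, 1.6, 1.5])   # median over simulated days
```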
7.3.2 Calculating Oxygen Consumption and Ventilation Rates

The oxygen consumption rate in ml/min/kg (McCurdy, 2000), normalized to body mass, is given
by

$$VO_2 = \frac{METS \times ECF \times RMR}{BM}$$   (7-20)

where ECF is the profile energy conversion factor in liters of oxygen per kcal and BM is the
profile body mass. The alveolar ventilation rate is also calculated from MET using:

$$Va = METS \times 19630 \times ECF \times RMR$$   (7-21)

where the constant 19630 is the oxygen to air conversion factor (19,630 ml of air per liter of O2).

The calculation of Ve is based on the VeSlope, VeResid, and VeInter profile variables (see
Section 5.3). The calculation of those variables and the following equations for Ve comprise the
Ve regression equations derived by Graham and McCurdy (2005). Ve is calculated as

$$Ve = BM \, e^{X}$$   (7-22)

where BM is the profile body mass and the exponent term X is given by

$$X = VeInter + VeSlope \, (VO_2) + Z \, (VeResid)$$   (7-23)

where Z is a random number pulled from a normal distribution every event. EVR is then

$$EVR = \frac{Ve}{BSA}$$   (7-24)
where BSA is the profile body surface area.

Using the above equations, APEX generates an event time-series for Ve, Va, and EVR. Ve
and Va are output on the Events output file. Timestep and hourly values (time-weighted timestep
and hourly averages of the event values) for EVR, Va, and Ve are calculated and can be written to
the Timestep and Hourly output files (see Volume I).
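
The chain from MET to the ventilation rates can be sketched as below. The sketch takes the equations at face value as printed (including VO2 entering the exponent X directly) and assumes that EVR is Ve divided by body surface area and that Z is a standard normal draw; the profile values are arbitrary placeholders, not APEX defaults.

```python
import math
import random

O2_TO_AIR = 19630.0  # ml of air per liter of O2, as stated in the text

def vo2(mets, ecf, rmr, bm):
    """Oxygen consumption normalized to body mass."""
    return mets * ecf * rmr / bm

def va(mets, ecf, rmr):
    """Alveolar ventilation rate."""
    return mets * O2_TO_AIR * ecf * rmr

def ve(bm, ve_inter, ve_slope, ve_resid, vo2_value, z):
    """Expired ventilation rate: body mass times exp(X)."""
    x = ve_inter + ve_slope * vo2_value + z * ve_resid
    return bm * math.exp(x)

def evr(ve_value, bsa):
    """Ventilation rate normalized by body surface area (assumed form)."""
    return ve_value / bsa

rng = random.Random(0)
z = rng.gauss(0.0, 1.0)                      # redrawn for every event
event_vo2 = vo2(mets=2.0, ecf=0.2, rmr=1.0, bm=70.0)
event_ve = ve(70.0, ve_inter=-1.0, ve_slope=0.5, ve_resid=0.1, vo2_value=event_vo2, z=z)
event_evr = evr(event_ve, bsa=1.8)
```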
CHAPTER 8. CALCULATING POLLUTANT CONCENTRATIONS IN
MICROENVIRONMENTS

APEX calculates concentrations of all modeled air pollutants in all microenvironments at each
timestep of the simulation period for each of the simulated individuals. The default APEX
timestep is 1 hour (see Volume I). The microenvironmental concentrations are set in
MicroEnvModule:MicroConcs. The input files and algorithms for these calculations are
described in the following sections.

8.1 Defining Microenvironments

APEX gives the user great flexibility in defining the number and properties of the
microenvironments (see step 4 in Figure 2-1). (Note that the term microenvironment is generally
shortened to micro in the computer code and files.) Along with this flexibility, however, is the
need for the user to specify a substantial amount of information about the microenvironments in
the input files.

There are three input files that relate to microenvironments. The first is the Microenvironment
Mapping file, which contains the mapping from the location categories used in the activity
diaries to the APEX microenvironments. The second is the Microenvironment Descriptions file,
which contains rules for calculating pollutant concentrations in each microenvironment. The
third file is the Profile Functions (Distributions) file, in which profile (person) variables (such as
the presence of air conditioning or a gas stove in the home) influencing the microenvironmental
concentrations can be defined. See Volume I for a discussion of the Profile Functions
(Distributions) file.

The Microenvironment Mapping file gives the user control over how many microenvironments
will be modeled and what CHAD (or other activity database) locations should be grouped into
each microenvironment. The file contains one row for each CHAD location code, indicating
which APEX microenvironment is to be used whenever that CHAD code is encountered. Thus,
the more than 100 location codes defined in the activity (CHAD) database are mapped into a
smaller subset of user-defined microenvironments amenable to modeling. In addition, location
codes are also mapped to concentration locations (Home, Work, Other, H/W/O), which tell
APEX which set of microenvironmental concentrations to use for a particular location code. See
Section 8.2.1 for details.
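
A mapping lookup of this kind can be sketched as a dictionary keyed by CHAD location code. The five entries below are taken from the default mapping in Table 8.1. The treatment of microenvironment code -1 ("use previous microenvironment") as carrying forward both the microenvironment and the concentration location, and the error raised when a diary starts with such a code, are assumptions made for the example.

```python
# (APEX microenvironment code, concentration location) by CHAD location code.
MAPPING = {
    "U": (-1, "U"),       # uncertain of correct code
    "X": (-1, "U"),       # no data
    "30000": (1, "H"),    # residence, general
    "31110": (11, "O"),   # car
    "31140": (12, "O"),   # bus
}

def resolve_micros(location_codes, mapping):
    """Translate a diary's location codes into (micro, location) pairs."""
    resolved = []
    previous = None
    for code in location_codes:
        micro, location = mapping[code]
        if micro == -1:
            if previous is None:
                raise ValueError("diary starts with a 'use previous' code")
            micro, location = previous
        previous = (micro, location)
        resolved.append(previous)
    return resolved

micros = resolve_micros(["30000", "X", "31110", "U", "31140"], MAPPING)
```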

Table 8.1 lists the 115 location codes currently in CHAD and the microenvironment to which
each is assigned in the default Microenvironment Mapping file included with the APEX release.

Table 8.1. Default Mapping of CHAD Location Codes to APEX Microenvironments

CHAD    CHAD Location Description             APEX   APEX Microenvironment               Location
Code                                          Micro. Description                         (see Section
                                              Code                                       8.2.1)
U       Uncertain of correct code             -1     Use previous microenvironment       U
X       No data                               -1     Use previous microenvironment       U
30000   Residence, general                    1      Indoors - Residence                 H
30010   Your residence                        1      Indoors - Residence                 H
30020   Other residence                       1      Indoors - Residence                 H
30100   Residence, indoor                     1      Indoors - Residence                 H
30120   Your residence, indoor                1      Indoors - Residence                 H
30121   ..., kitchen                          1      Indoors - Residence                 H
30122   ..., living room or family room       1      Indoors - Residence                 H
30123   ..., dining room                      1      Indoors - Residence                 H
30124   ..., bathroom                         1      Indoors - Residence                 H
30125   ..., bedroom                          1      Indoors - Residence                 H
30126   ..., study or office                  1      Indoors - Residence                 H
30127   ..., basement                         1      Indoors - Residence                 H
30128   ..., utility or laundry room          1      Indoors - Residence                 H
30129   ..., other indoor                     1      Indoors - Residence                 H
30130   Other residence, indoor               1      Indoors - Residence                 H
30131   ..., kitchen                          1      Indoors - Residence                 H
30132   ..., living room or family room       1      Indoors - Residence                 H
30133   ..., dining room                      1      Indoors - Residence                 H
30134   ..., bathroom                         1      Indoors - Residence                 H
30135   ..., bedroom                          1      Indoors - Residence                 H
30136   ..., study or office                  1      Indoors - Residence                 H
30137   ..., basement                         1      Indoors - Residence                 H
30138   ..., utility or laundry room          1      Indoors - Residence                 H
30139   ..., other indoor                     1      Indoors - Residence                 H
30200   Residence, outdoor                    10     Outdoors - Other                    H
30210   Your residence, outdoor               10     Outdoors - Other                    H
30211   ..., pool or spa                      10     Outdoors - Other                    H
30219   ..., other outdoor                    10     Outdoors - Other                    H
30220   Other residence, outdoor              10     Outdoors - Other                    H
30221   ..., pool or spa                      10     Outdoors - Other                    H
30229   ..., other outdoor                    10     Outdoors - Other                    H
30300   Residential garage or carport         7      Indoors - Other                     H
30310   ..., indoor                           7      Indoors - Other                     H
30320   ..., outdoor                          10     Outdoors - Other                    H
30330   Your garage or carport                1      Indoors - Residence                 H
30331   ..., indoor                           1      Indoors - Residence                 H
30332   ..., outdoor                          10     Outdoors - Other                    H
30340   Other residential garage or carport   1      Indoors - Residence                 H
30341   ..., indoor                           1      Indoors - Residence                 H
30342   ..., outdoor                          10     Outdoors - Other                    H
30400   Residence, none of the above          1      Indoors - Residence                 H
31000   Travel, general                       11     In Vehicle - Cars and Trucks        O
31100   Motorized travel                      11     In Vehicle - Cars and Trucks        O
31110   Car                                   11     In Vehicle - Cars and Trucks        O
31120   Truck                                 11     In Vehicle - Cars and Trucks        O
31121   Truck (pickup or van)                 11     In Vehicle - Cars and Trucks        O
31122   Truck (not pickup or van)             11     In Vehicle - Cars and Trucks        O
31130   Motorcycle or moped                   8      Outdoors - Near Road                O
31140   Bus                                   12     In Vehicle - Mass Transit           O
31150   Train or subway                       12     In Vehicle - Mass Transit           O
31160   Airplane                              0      Zero concentration                  O
31170   Boat                                  10     Outdoors - Other                    O
31171   Boat, motorized                       10     Outdoors - Other                    O
31172   Boat, other                           10     Outdoors - Other                    O
31200   Non-motorized travel                  10     Outdoors - Other                    O
31210   Walk                                  10     Outdoors - Other                    O
31220   Bicycle or inline skates/skateboard   10     Outdoors - Other                    O
31230   In stroller or carried by adult       10     Outdoors - Other                    O
31300   Waiting for travel                    10     Outdoors - Other                    O
31310   ..., bus or train stop                8      Outdoors - Near Road                O
31320   ..., indoors                          7      Indoors - Other                     O
31900   Travel, other                         11     In Vehicle - Cars and Trucks        O
31910   ..., other vehicle                    11     In Vehicle - Cars and Trucks        O
32000   Non-residence indoor, general         7      Indoors - Other                     O
32100   Office building/ bank/ post office    5      Indoors - Office                    O
32200   Industrial/ factory/ warehouse        5      Indoors - Office                    O
32300   Grocery store/ convenience store      6      Indoors - Shopping                  H
32400   Shopping mall/ non-grocery store      6      Indoors - Shopping                  O
32500   Bar/ night club/ bowling alley        2      Indoors - Bars and Restaurants      O
32510   Bar or night club                     2      Indoors - Bars and Restaurants      O
32520   Bowling alley                         2      Indoors - Bars and Restaurants      O
32600   Repair shop                           7      Indoors - Other                     O
32610   Auto repair shop/ gas station         7      Indoors - Other                     O
32620   Other repair shop                     7      Indoors - Other                     O
32700   Indoor gym/ health club               7      Indoors - Other                     O
32800   Childcare facility                    4      Indoors - Day Care Centers          O
32810   ..., house                            1      Indoors - Residence                 O
32820   ..., commercial                       4      Indoors - Day Care Centers          O
32900   Large public building                 7      Indoors - Other                     O
32910   Auditorium/ arena/ concert hall       7      Indoors - Other                     O
32920   Library/ courtroom/ museum/ theater   7      Indoors - Other                     O
33100   Laundromat                            7      Indoors - Other                     H
33200   Hospital/ medical care facility       7      Indoors - Other                     O
33300   Barber/ hair dresser/ beauty parlor   7      Indoors - Other                     H
33400   Indoors, moving among locations       7      Indoors - Other                     O
33500   School                                3      Indoors - Schools                   O
33600   Restaurant                            2      Indoors - Bars and Restaurants      O
33700   Church                                7      Indoors - Other                     H
33800   Hotel/ motel                          7      Indoors - Other                     O
33900   Dry cleaners                          7      Indoors - Other                     H
34100   Indoor parking garage                 7      Indoors - Other                     O
34200   Laboratory                            7      Indoors - Other                     O
34300   Indoor, none of the above             7      Indoors - Other                     O
35000   Non-residence outdoor, general        10     Outdoors - Other                    O
35100   Sidewalk, street                      8      Outdoors - Near Road                O
35110   Within 10 yards of street             8      Outdoors - Near Road                O
35200   Outdoor public parking lot/ garage    9      Outdoors - Public Garage / Parking  O
35210   ..., public garage                    9      Outdoors - Public Garage / Parking  O
35220   ..., parking lot                      9      Outdoors - Public Garage / Parking  O
35300   Service station/ gas station          10     Outdoors - Other                    O
35400   Construction site                     10     Outdoors - Other                    O
35500   Amusement park                        10     Outdoors - Other                    O
35600   Playground                            10     Outdoors - Other                    H
35610   ..., school grounds                   10     Outdoors - Other                    O
35620   ..., public or park                   10     Outdoors - Other                    H
35700   Stadium or amphitheater               10     Outdoors - Other                    O
35800   Park/ golf course                     10     Outdoors - Other                    O
35810   Park                                  10     Outdoors - Other                    O
35820   Golf course                           10     Outdoors - Other                    O
35900   Pool/ river/ lake                     10     Outdoors - Other                    O
36100   Outdoor restaurant/ picnic            10     Outdoors - Other                    O
36200   Farm                                  10     Outdoors - Other                    O
36300   Outdoor, none of the above            10     Outdoors - Other                    O

All of the other properties of the microenvironments are provided in the Microenvironment
Descriptions file. All microenvironments assigned locations in the Microenvironment Mapping
file must be defined in the Microenvironment Descriptions file. More microenvironments may
be described than are assigned locations, but they will not be used by APEX (since in this case
simulated people will never enter the microenvironment). The total number of
microenvironments described in the Microenvironment Descriptions file must also be indicated
in the Simulation Control input file. (These input files are also covered in Volume I.)

Definition of the microenvironments using the Microenvironment Descriptions file is covered in
Section 8.2.4.
8.2 Calculating Concentrations in Microenvironments

APEX calculates concentrations of all the air pollutants in all the microenvironments at each
timestep of the simulation period for each of the simulated individuals, based on the ambient air
quality data specific to the geographic locations visited by the individual. APEX provides two
methods for calculating microenvironmental concentrations: the mass balance (MASSBAL)
method and the simpler factors (FACTORS) method. The MASSBAL method starts with the
previous timestep's concentration in each microenvironment, which is modified over time by
exchange with the ambient air. The FACTORS method uses a simple equation to relate the
concentration in each microenvironment to the current ambient concentration. Both methods
require that a number of parameters including proximity, penetration and pollutant sources be
specified over time, while the MASSBAL method uses some additional parameters, such as air
exchange, volume, and decay rates. All pollutants use the same method in each
microenvironment. These methods are described in the next two subsections. The user is
required to specify the calculation methods for each of the microenvironments in the simulation
in the Microenvironment Descriptions file (see Volume I). Microenvironments within a single
simulation may use either method; mixing the methods (across micros, not pollutants) is allowed
with no restrictions.
8.2.1 Microenvironmental Concentrations for Home/Work/Other Locations

APEX calculates up to three sets of microenvironmental concentrations for each person. In
general, these concentrations represent the geographical locations a person moves through in a
day. These three locations are "Home" (H), "Work" (W), and "Other" (O). (A person who is
not employed only has H and O locations.) The H and W concentrations are calculated from the
air quality data in a person's home and work sectors respectively. The concentrations in the O
location are calculated from a composite of a set of air districts. By default, APEX uses the
city-average air concentration to calculate O concentrations. However, the user can customize this
average concentration using the Control file settings SampleOtherLocs, #OtherDistricts, and
HomeProbab (see Volume I).

Location is determined event-by-event for each simulated person. In general, APEX determines
which set of concentrations to use for each event via the diary (CHAD or other) location code.
The Microenvironment Mapping file assigns location definitions to each activity database
location code (see Table 8.1). However, APEX also assigns diary events with a "work" activity
code to the W location. This assignment overrides the location assignment based on location
code. By default, APEX assigns CHAD activity codes 10000-10300 to the work location (see
Table 4-5 in Volume I). However, the user also has the ability to customize this setting using the
Control file variable CustomWork.

APEX defines only a single set of micro parameter distributions for each micro (i.e., there are
no unique distributions for the H, W, and O locations). However, the values of the parameters
themselves may differ between locations. If the ResampleWork keyword is used in a micro
parameter description, APEX selects a different value from the distribution to use for the W
location (see Section 8.2.4.3 for details). If ResampleWork is used, then the micro parameters
for the O location will be the average of the parameters for the H and W locations. If not, the H,
W, and O locations will all use the same values of the parameters for the micro. In addition, if
the simulated person (profile) is not employed, there is no W location, and the values of the
parameters for the O location are equal to the H values.
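
The event-by-event location rule described above can be sketched in a few lines. This is only an illustrative sketch, not APEX source code: the function name, argument names, and the treatment of unemployed profiles (falling back to the mapped location when a work activity code appears) are assumptions made for the example.

```python
# Hypothetical sketch of the event-by-event location rule; not APEX source code.
# Default CHAD "work" activity codes 10000-10300, as stated in the text.
DEFAULT_WORK_ACTIVITY_CODES = range(10000, 10301)


def event_location(mapped_location, activity_code, employed,
                   work_codes=DEFAULT_WORK_ACTIVITY_CODES):
    """Return 'H', 'W', or 'O' for one diary event.

    mapped_location: 'H' or 'O', as assigned to the diary location code
                     by the Microenvironment Mapping file.
    activity_code:   integer diary activity code for the event.
    employed:        whether the simulated person has a W location.
    """
    # A "work" activity code overrides the location-code mapping.
    # Assumption: the override is applied only to employed profiles,
    # because unemployed profiles have only H and O locations.
    if employed and activity_code in work_codes:
        return "W"
    return mapped_location
```

A customized set of work codes (the role played by the CustomWork setting) would be passed through the hypothetical `work_codes` argument.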
8.2.2 Mass Balance Method

The mass balance method assumes that an enclosed microenvironment (e.g., a room in a
residence) is a single well-mixed volume in which the air concentration is approximately
spatially uniform. The concentration of an air pollutant in such a microenvironment is estimated
using the following four processes (as illustrated in Figure 8.1):

• Inflow of air into the microenvironment;

• Outflow of air from the microenvironment;

• Removal of a pollutant from the microenvironment due to deposition, filtration, and
chemical degradation; and

• Emissions from sources of a pollutant inside the microenvironment.

[Figure 8.1: schematic of a microenvironment showing air inflow, air outflow, indoor sources,
and removal due to chemical reactions, deposition, and filtration]
Figure 8.1. The Mass Balance (MASSBAL) Model

It is assumed that the flow of outside air into the microenvironment is equal to that flowing out
of the microenvironment, and this rate is given by the air exchange rate, $R_{\text{air exchange}}$,
which is given in units of hr$^{-1}$. The air exchange rate can be interpreted as the number of
times per hour the entire volume of air in the microenvironment is replaced.

Considering the microenvironment as a distinct, well-mixed volume of air, the mass balance
equation for a pollutant can be described by:

$$\frac{dC(t)}{dt} = \dot{C}_{in} - \dot{C}_{out} - \dot{C}_{removal} + \dot{C}_{source} \tag{8-1}$$

where:

$C(t)$ = Concentration in the microenvironment at time t (µg/m³)

$\dot{C}_{in}$ = Rate of change in C(t) due to air entering the micro

$\dot{C}_{out}$ = Rate of change in C(t) due to air leaving the micro

$\dot{C}_{removal}$ = Rate of change in C(t) due to all removal processes

$\dot{C}_{source}$ = Rate of change in C(t) due to all source terms

(Note that concentration must be in the same units as the ambient air quality data, i.e., either ppm
or µg/m³, although throughout these equations concentration is shown only in µg/m³ for brevity.)

The change in microenvironmental concentration due to influx of air, $\dot{C}_{in}$, is

$$\dot{C}_{in} = C_{ambient} \times f_{proximity} \times f_{penetration} \times R_{\text{air exchange}} \tag{8-2}$$

where:

$C_{ambient}$ = Ambient timestep concentration (µg/m³)

$f_{proximity}$ = Proximity factor (unitless)

$f_{penetration}$ = Penetration factor (unitless)

The proximity factor $f_{proximity}$ is used to account for differences in ambient concentrations
between the geographic location represented by the ambient air quality data (e.g., a regional
fixed-site monitor) and the geographic location of the microenvironment. That is, the outdoor air
at a particular location may differ systematically from the outdoor air at the center of the air
quality district. For example, a house might be located next to a busy road, in which case the air
outside the house would have elevated levels for mobile source pollutants such as carbon
monoxide. The concentration $C_{outdoor}$ in the air directly outside the microenvironment is given by
the product of the ambient concentration and $f_{proximity}$:

$$C_{outdoor} = f_{proximity} \times C_{ambient} \tag{8-3}$$

For some pollutants (especially particulate matter), the process of infiltration may remove a
fraction of the pollutant from the air. The fraction that is retained in the air is given by the
penetration factor $f_{penetration}$. During exploratory analyses, the user may examine how a
microenvironment affects overall exposure by setting the microenvironment's proximity or
penetration factor to zero, thus effectively eliminating the microenvironment.

Change in microenvironmental concentration due to outflux of air is calculated as the
concentration in the microenvironment C(t) multiplied by the air exchange rate:

$$\dot{C}_{out} = R_{\text{air exchange}} \times C(t) \tag{8-4}$$

The third term in the MASSBAL calculation represents removal processes within the
microenvironment. There are three such processes in general: chemical reaction, deposition,
and filtration. Chemical reactions are significant for ozone, for example, but not for carbon
monoxide. The amount lost to chemical reactions will generally be proportional to the amount
present, which in the absence of any other factors would result in an exponential decay in the
concentration with time. Similarly, deposition rates are usually given by the product of a
(constant) deposition velocity and a (time-varying) concentration, also resulting in an
exponential decay. The third removal process is filtration, usually as part of a forced air
circulation or HVAC system. Filtration will normally remove particles but not gases. In any
case, filtration rates are also proportional to concentration. Change in concentration due to
deposition, filtration, and chemical degradation in a microenvironment is simulated based on the
first-order equation:

$$\dot{C}_{removal} = (R_{deposition} + R_{filtration} + R_{chemical}) \times C(t) = R_{removal} \times C(t) \tag{8-5}$$

where:

$\dot{C}_{removal}$ = Change in microenvironmental concentration due to removal
processes (µg/m³/hour)

$R_{deposition}$ = Removal rate of a pollutant from a microenvironment due to
deposition (1/hour)

$R_{filtration}$ = Removal rate of a pollutant from a microenvironment due to
filtration (1/hour)

$R_{chemical}$ = Removal rate of a pollutant from a microenvironment due to
chemical degradation (1/hour)

$R_{removal}$ = Removal rate of a pollutant from a microenvironment due to the
combined effects of deposition, filtration, and chemical
degradation (1/hour)

For unreactive gases like carbon monoxide, all three removal terms could be zero, in which case
$R_{removal} = 0$.

The fourth term in the MASSBAL calculation represents pollutant sources within the
microenvironment. This is the most complicated term, in part because several sources may be
present. APEX allows two methods of specifying source strengths: emission sources (ESource
or ES) or concentration sources (CSource or CS). Either may be used for MASSBAL
microenvironments, and both can be used within the same microenvironment. The source
strength values are used to calculate the source term $\dot{C}_{source}$, which has standard units of
µg/m³/hr.

Emission sources are expressed as emission rates in units of µg/hr. To determine the source term
associated with an emission source, ES must be divided by the volume V of the
microenvironment in m³:

$$\dot{C}_{source,ES} = \frac{ES}{V} \tag{8-6}$$

Concentration sources, however, are expressed in units of concentration. These must be the same
units as used for the ambient concentration (e.g., µg/m³). Concentration sources are normally
used as additive terms for microenvironments using the FACTORS method. Strictly speaking,
they are somewhat inconsistent with the MASSBAL method, since concentrations should not be
inputs but should be consequences of the dynamics of the system. Nevertheless, a suitable
meaning can be found by determining the source strength $\dot{C}_{source}$ that would result in a mean
increase of CS in the concentration, given constant parameters and equilibrium conditions, in this
way:

Assume that a microenvironment is always in contact with clean air (ambient = zero), and it
contains one concentration source. Then the mean concentration over time in this
microenvironment from this source should be numerically equal to CS. The mean source
strength expressed in ppm/hr or µg/m³/hr is the rate of change in concentration $\dot{C}_{source,CS}$. In
equilibrium,

$$CS = \frac{\dot{C}_{source,CS}}{R_{\text{air exchange}} + R_{removal}} \tag{8-7}$$

$\dot{C}_{source,CS}$ can be written as

$$\dot{C}_{source,CS} = CS \times R_{mean} \tag{8-8}$$

where $R_{mean}$ is the mean removal rate. From eq. 8-7, $R_{mean}$ is equal to the sum of the air
exchange rate and the removal rate ($R_{\text{air exchange}} + R_{removal}$) under equilibrium conditions. In
general, however, the microenvironment will not be in equilibrium, and in such conditions there
is no clear meaning to attach to $\dot{C}_{source,CS}$, since there is no fixed emission rate that will lead to a
fixed increase in concentration. The simplest solution is to use $R_{mean} = R_{\text{air exchange}} + R_{removal}$.
However, the user is given the option of explicitly specifying $R_{mean}$ (see discussion of
parameters below). This may be used to generate a truly constant source strength $\dot{C}_{source,CS}$ by
making CS and $R_{mean}$ both constant in time. If this is not done, then $R_{mean}$ is simply set to the
sum ($R_{\text{air exchange}} + R_{removal}$). If these parameters change over time, then $\dot{C}_{source,CS}$ also
changes. Physically, the reason for this is that, in order to maintain a fixed elevation of
concentration over the base conditions, the source emission rate would have to rise if the air
exchange rate were to rise.

Multiple emission and concentration sources within a single microenvironment are combined
into the final total source term by combining equations 8-6 and 8-8:

$$\dot{C}_{source} = \dot{C}_{source,ES} + \dot{C}_{source,CS} = \frac{1}{V} \sum_{i=1}^{n_e} ES_i + R_{mean} \sum_{i=1}^{n_c} CS_i \tag{8-9}$$

where:

$ES_i$ = Emission source strength for emission source i (µg/hour)

$CS_i$ = Source strength for concentration source i (µg/m³)

$n_e$ = Number of emission sources in the microenvironment

$n_c$ = Number of concentration sources in the microenvironment

(A note on units: The above equation is modified if the units of air quality are ppm rather than
µg/m³. Specifically, 1/V is replaced by f/V, where f = 1/ppmFact. The value of ppmFact is
user-supplied in the Simulation Control input file; it expresses the number of µg/m³ that equate
to 1 ppm. For the pollutant CO, ppmFact = 1,145. The conversion factor ppmFact could also be
used to convert µg/m³ to other units.)
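
As a rough illustration of eq. 8-9 and the units note, the total source term could be computed as follows. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, and the `ppm_fact` argument simply applies the f = 1/ppmFact scaling described in the note when air quality is in ppm.

```python
def source_term(emission_sources, concentration_sources, r_mean,
                volume=None, ppm_fact=None):
    """Total source term (concentration per hour), following eq. 8-9.

    emission_sources:      iterable of ES_i values (ug/hr)
    concentration_sources: iterable of CS_i values (same units as ambient)
    r_mean:                mean removal rate R_mean (1/hr)
    volume:                microenvironment volume V (m3); required if any
                           emission source is present
    ppm_fact:              ug/m3 per ppm; pass only when air quality is in ppm
    """
    emission_sources = list(emission_sources)
    concentration_sources = list(concentration_sources)
    if emission_sources and (volume is None or volume <= 0):
        raise ValueError("volume must be positive when emission sources are present")
    # f = 1/ppmFact converts the emission term to ppm/hr; f = 1 for ug/m3.
    f = 1.0 if ppm_fact is None else 1.0 / ppm_fact
    es_term = f * sum(emission_sources) / volume if emission_sources else 0.0
    cs_term = r_mean * sum(concentration_sources)
    return es_term + cs_term
```

For instance, two emission sources of 1000 and 500 µg/hr in a 300 m³ micro contribute 5 µg/m³/hr, and a single concentration source of 2 µg/m³ with a mean removal rate of 1.5 hr⁻¹ contributes a further 3 µg/m³/hr.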

Equations 8-2, 8-4, 8-5, and 8-9 can now be combined with 8-1 to form the differential equation
for the microenvironmental concentration C(t). Within the time period of a timestep, $\dot{C}_{source}$ and
$\dot{C}_{in}$ are assumed to be constant. Using $\dot{C}_{combined} = \dot{C}_{source} + \dot{C}_{in}$ leads to:

$$\frac{dC(t)}{dt} = \dot{C}_{combined} - R_{\text{air exchange}}\, C(t) - R_{removal}\, C(t) = \dot{C}_{combined} - R_{mean}\, C(t) \tag{8-10}$$

Solving this differential equation leads to:

$$C(t) = \frac{\dot{C}_{combined}}{R_{mean}} + \left( C(0) - \frac{\dot{C}_{combined}}{R_{mean}} \right) e^{-R_{mean} t} \tag{8-11}$$

where:

$C(0)$ = Concentration of a pollutant in a microenvironment at the
beginning of a timestep (µg/m³)

$C(t)$ = Concentration of a pollutant in a microenvironment at time t within
the time period of a timestep (µg/m³).
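
For readers who want the intermediate step, one way to get from eq. 8-10 to eq. 8-11 is sketched below. The substitution variable u is introduced here for illustration only and is not part of the APEX notation; the sketch assumes that the combined source term and $R_{mean}$ are constant within the timestep and that $R_{mean} > 0$.

```latex
\begin{aligned}
u(t) &\equiv C(t) - \frac{\dot{C}_{combined}}{R_{mean}}, \\
\frac{du}{dt} &= \frac{dC(t)}{dt} = \dot{C}_{combined} - R_{mean}\, C(t) = -R_{mean}\, u(t), \\
u(t) &= u(0)\, e^{-R_{mean} t}, \\
C(t) &= \frac{\dot{C}_{combined}}{R_{mean}}
        + \left( C(0) - \frac{\dot{C}_{combined}}{R_{mean}} \right) e^{-R_{mean} t}.
\end{aligned}
```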
Based on eq. 8-11, the following three concentrations in a microenvironment are calculated:

$$C_{equil} = \frac{\dot{C}_{combined}}{R_{mean}} = \frac{\dot{C}_{source} + \dot{C}_{in}}{R_{\text{air exchange}} + R_{removal}} \tag{8-12}$$

$$C_{end} = C_{equil} + \left( C(0) - C_{equil} \right) e^{-R_{mean} t} \tag{8-13}$$

$$C_{mean} = \frac{\int_0^t C(t')\,dt'}{\int_0^t dt'} = C_{equil} + \left( C(0) - C_{equil} \right) \frac{1 - e^{-R_{mean} t}}{R_{mean}\, t} \tag{8-14}$$

where:

$t$ = Length of the APEX timestep (hours)

$C_{equil}$ = Concentration in a microenvironment (µg/m³) if t → ∞
(equilibrium state)

$C(0)$ = Concentration in a microenvironment at the beginning of the
timestep (µg/m³)

$C_{end}$ = Concentration in a microenvironment at the end of the timestep (µg/m³)

$C_{mean}$ = Mean concentration in a microenvironment for the timestep (µg/m³)

$R_{mean}$ = $R_{\text{air exchange}} + R_{removal}$

At each timestep of the simulation period, APEX uses Eqs. 8-12, 8-13, and 8-14 to calculate the
equilibrium, ending, and mean concentrations. APEX reports the mean concentration as the
concentration for a specific timestep. The calculation continues to the next timestep by using
$C_{end}$ for the previous timestep as C(0).
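
The per-timestep calculation in Eqs. 8-12 through 8-14 can be sketched as a single function. This is a minimal illustration, not the APEX implementation: the function and argument names are hypothetical, and the sketch assumes a strictly positive $R_{mean}$ so that the equilibrium concentration is defined.

```python
import math


def massbal_step(c0, c_ambient, r_air_exchange, r_removal=0.0,
                 f_proximity=1.0, f_penetration=1.0, c_source=0.0, dt=1.0):
    """One MASSBAL timestep; returns (c_equil, c_end, c_mean).

    c0:       concentration at the start of the timestep, C(0)
    c_source: total source term for the timestep (concentration per hour)
    dt:       timestep length in hours
    """
    r_mean = r_air_exchange + r_removal
    if r_mean <= 0:
        # Assumption of this sketch: the zero-exchange limit is not handled.
        raise ValueError("r_air_exchange + r_removal must be positive")
    # Eq. 8-2: rate of change due to inflow of outside air.
    c_in = c_ambient * f_proximity * f_penetration * r_air_exchange
    c_combined = c_source + c_in
    c_equil = c_combined / r_mean                                    # eq. 8-12
    decay = math.exp(-r_mean * dt)
    c_end = c_equil + (c0 - c_equil) * decay                         # eq. 8-13
    c_mean = c_equil + (c0 - c_equil) * (1.0 - decay) / (r_mean * dt)  # eq. 8-14
    return c_equil, c_end, c_mean


# Chaining timesteps: the ending concentration becomes the next C(0).
c = 0.0
for c_amb in [40.0, 60.0, 80.0]:
    c_equil, c, c_mean = massbal_step(c, c_amb, r_air_exchange=0.8)
```

Only `c_mean` would be reported for each timestep in this sketch; `c` carries the state forward.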

The microenvironmental parameters for the MASSBAL method that can be defined by the user
in the Microenvironment Descriptions file are summarized in Table 8.2, with their valid
ranges and their corresponding names in the file.

Table 8.2. Microenvironmental Parameters

Parameter | Definition | Units | Range | Default Value | Name (a)
----------|------------|-------|-------|---------------|---------
$f_{proximity}$ | Proximity factor | unitless | $f_{proximity} \ge 0$ | 1 | PR
$f_{penetration}$ | Penetration factor | unitless | $0 \le f_{penetration} \le 1$ | 1 | PE
CS | Concentration source | µg/m³ or ppm | $CS \ge 0$ | 0 | CS
ES | Emission source | µg/hr | $ES \ge 0$ | 0 | ES
$R_{removal}$ | Removal rate due to deposition, filtration, and chemical reaction | 1/hr | $R_{removal} \ge 0$ | 0 | DE
$R_{\text{air exchange}}$ | Air exchange rate | 1/hr | $R_{\text{air exchange}} \ge 0$ | none | AE
$R_{mean}$ | Mean removal rate | 1/hr | $R_{mean} \ge 0$ | $R_{removal} + R_{\text{air exchange}}$ | MR
V | Volume of microenvironment | m³ | $V > 0$ | none | V

(a) Designation in Microenvironment Descriptions file
Not all of the possible parameters are always needed, and several of them have natural default
values. Based on the above equations, the following generalizations can be made about the
definition of the MASSBAL parameters in the Microenvironment Descriptions file:

• Air exchange rate is a critical parameter that is always needed in a MASSBAL
calculation. It must always be defined in the file as it has no default value.

• Air exchange rate and volume are not pollutant-specific, and therefore are defined only
once for each micro. All other parameters must be defined for each pollutant.

• Removal rate must also be user-defined in the file if not assumed to be zero. For some
pollutants it can be assumed to have a natural default value of zero.

• The proximity and penetration factors must be defined in the file unless assumed to be
unity, which is the natural default value for both factors that should be used in the
absence of data to the contrary.

• If any emission source terms are present then volume must be defined. Volume has no
default value.

• If any concentration source terms are present then the mean removal rate may be
user-defined, but if appropriate, it may assume a default value of
($R_{\text{air exchange}} + R_{removal}$).

The details for specifying these input parameters in the Microenvironment Descriptions file are
provided in Volume I of this User's Guide. Further details on the options for designating
these parameters are given in Section 8.2.4.

In APEX, it is assumed that the outdoor concentration and the other modeling parameters for the
MASSBAL method remain constant during any timestep. Of course, recalling that the APEX
default timestep is one hour, in many cases the MASSBAL parameters may not remain constant
for an entire timestep. For example, a person may enter a microenvironment and smoke a
cigarette for five or ten minutes and then leave. Or, someone might enter a kitchen and cook for
a few minutes using a gas stove. Or one might alter an air exchange rate by opening or closing a
window. There are two reasons why it is difficult to model such events in APEX. First, there is
already a large computational burden in calculating concentrations in every microenvironment
for every timestep for every simulated person. This burden would be substantially larger if very
fine time resolution were demanded. Second, most examples of fine-scale parameter variation are driven
by human actions. The CHAD activity diaries generally do not contain enough detail to
determine when each cigarette is lit or each time a stove is used or a window is opened or closed.
Furthermore, the diaries only follow the activities of a single person. It is quite possible for these
actions to be performed by other people. For example, if the activity diary follows a child, then
the child's parents may be doing these things that affect the properties of the microenvironments
that the child is in. Since the diaries do not reliably report such information, it was decided that a
very fine time resolution could not reliably be used for the calculation of concentrations.

In a MASSBAL microenvironment, the concentration during any timestep depends on the
concentration for the timestep before. Ultimately, all timesteps depend on some method for
establishing initial conditions. To avoid the problem of establishing new initial conditions every
time the activity diary indicates that a MASSBAL microenvironment is entered, the time series is
evaluated for all timesteps in the simulation period. An extra 24 hour period is added prior to the
start of the APEX simulation period by duplicating the properties from the first day of the
simulation period. It is assumed that 24 hours is sufficient so that the initial concentration
becomes irrelevant. The entire simulation period is then evaluated timestep by timestep without
gaps, with each timestep being used to determine the next.
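
The effect of the 24-hour spin-up can be illustrated with a small sketch. Everything here is hypothetical and simplified for the illustration: it assumes hourly timesteps, no indoor sources or removal, and proximity and penetration factors of one, so that the equilibrium concentration equals the ambient concentration.

```python
import math


def run_series(ambient, r_air_exchange, c_initial=0.0, dt=1.0, warmup_steps=24):
    """Return mean concentrations per timestep, after a discarded warm-up.

    The warm-up repeats the first `warmup_steps` ambient values (the first
    day, for hourly timesteps) ahead of the simulation period.
    Requires r_air_exchange > 0.
    """
    series = list(ambient[:warmup_steps]) + list(ambient)
    decay = math.exp(-r_air_exchange * dt)
    c = c_initial
    means = []
    for c_amb in series:
        c_equil = c_amb  # simplification: no sources, removal, or factors
        means.append(c_equil + (c - c_equil) * (1.0 - decay) / (r_air_exchange * dt))
        c = c_equil + (c - c_equil) * decay  # ending value feeds the next step
    return means[warmup_steps:]  # drop the warm-up results


# Two very different starting guesses give nearly the same results
# once the warm-up day has been evaluated.
ambient = [50.0] * 48
from_zero = run_series(ambient, r_air_exchange=0.5, c_initial=0.0)
from_high = run_series(ambient, r_air_exchange=0.5, c_initial=1000.0)
```

With a low air exchange rate the memory of the initial condition fades more slowly, which is why the adequacy of a 24-hour warm-up is stated in the text as an assumption.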
8.2.3 Factors Method

The FACTORS method is simpler than the mass balance method. In this method, the value of
the concentration in a microenvironment is not dependent on the concentration during the
previous timestep. Rather, the method uses the following equation to calculate the
timestep concentration in a microenvironment from the user-provided air quality data:

$$C_{timestep} = C_{ambient} \times f_{proximity} \times f_{penetration} + \sum_{i=1}^{n_c} CS_i \tag{8-15}$$

where:

$C_{timestep}$ = Timestep concentration in a microenvironment (µg/m³)

$C_{ambient}$ = Timestep concentration in ambient environment (µg/m³)

$f_{proximity}$ = Proximity factor (unitless)

$f_{penetration}$ = Penetration factor (unitless)

$CS_i$ = Mean air concentration resulting from source i (µg/m³)

$n_c$ = Number of concentration sources in the microenvironment

The user may provide values for proximity, penetration, and any concentration source terms, or
may allow them to assume default values (see Table 8.2); however, it is not mandatory that the
user supply any values if the default values are suitable. An undefined proximity or penetration
is assumed to be unity at all times. Missing (i.e., undefined) sources are assumed to be zero.
Parameters are left undefined by simply omitting them from the Microenvironment Descriptions
input file. If all parameters are missing, then the concentration in the microenvironment is
always the same as the ambient concentration. All of the parameters in the above equation are
evaluated for each timestep, although these values might remain constant for several hours,
entire days, or even for the entire simulation. For the ambient concentration, the timestep values
come from the input Air Quality Data file, and may be either measurements or modeled results,
or may be sampled for each hour from a distribution (see Volume I for available formats of this
file). For the other parameters, the timestep values are the result of the calculations based on the
information specified in the Microenvironment Descriptions input file.
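
Eq. 8-15 and its defaults translate almost directly into code. The sketch below is illustrative only; the function and argument names are hypothetical, and the default arguments mirror the default values stated above (unity for proximity and penetration, no sources).

```python
def factors_concentration(c_ambient, f_proximity=1.0, f_penetration=1.0,
                          concentration_sources=()):
    """Timestep concentration under the FACTORS method (eq. 8-15)."""
    return c_ambient * f_proximity * f_penetration + sum(concentration_sources)
```

With every optional argument omitted, the function returns the ambient concentration unchanged, which matches the statement that a micro with no defined parameters always has the ambient concentration.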
8.2.4 Microenvironment Parameter Definitions

The second section of the Microenvironment Descriptions input file contains the rules for
determining the values of the parameters used in the MASSBAL and FACTORS methods.
Instructions for specifying microenvironmental input parameters (those in Table 8.2) in the
Microenvironment Descriptions file are provided in Volume I of this User's Guide, but further
details on the options for using resampling rates, conditional variables, periodic (daily, weekly,
and monthly) groupings, and random seeds, are provided in this section. This includes an
explanation of ways to specify CS terms as products of distributions.

Both of the concentration calculation methods require multiple user-defined input parameters.
These microenvironmental parameters are defined by probability distributions defined in the
Microenvironment Descriptions input file, with values being assigned to each timestep of the
simulation. The file is read in MicroEnvModule:ReadMicroData. Any of the
distribution types in Table 3.1 can be used for microparameters.

The user may provide different distribution data for the parameter for any combination of the
following temporal and spatial variables:

• Hour in a day

• Day in a week

• Month in a year (i.e., season)

• Air quality district
For example, the user can define probability distributions for a parameter that vary depending on
the time of day and whether the simulated timestep is on a weekday or a weekend, or the user
can define a distribution that changes with the season of the year and with the air quality district
associated with the microenvironment being considered (recall that home and work sectors for
each profile will be associated with a unique air quality district).

The distributions for the microenvironmental parameters may also depend on conditional
variables that are a subset of the profile variables for an individual. A single microenvironmental
parameter may depend on up to three of the above conditional variables. Conditional variables
may change on an interpersonal, geographic, meteorological, or temporal basis, influencing the
microparameters accordingly.

The rules for determining any microenvironmental parameter (MP) are unique to each
microenvironment. For example, the rules for proximity for a house may differ completely from
the rules for proximity for a car. Every combination of parameter, pollutant, and
microenvironment is distinct (with the exception of air exchange rate and volume, which can
only be defined once per micro). The order in which the MP definitions are presented is not
significant to APEX, although it might help the user to group them by microenvironment,
pollutant, or MP. Note that the entire definition of an MP can be omitted if its default value is
acceptable, so for example if proximity is always to be unity in some microenvironment, then no
MP definition is needed for proximity in that microenvironment. The example in Exhibit 8-1
shows one possibility for the proximity parameter for microenvironment #1. The actual data are
only for illustrative purposes and are not intended to properly represent this MP in any real
scenario.

Micro number      = 1
Pollutant         = 3
Parameter Type    = Proximity
Hours - Block     = 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1
Weekday-DayType   = 1 1 1 1 1 1 1
Month-Season      = 1 1 2 2 2 3 3 3 4 4 4 1
District-Area     = 1 1 1 1 1 1
Condition #1      = 0
Condition #2      = 0
Condition #3      = 0
ResampHours       = NO
ResampDays        = YES
ResampWork        = YES
RandomSeed        = 0
Block DType Season Area C1 C2 C3 Shape      Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Normal     1.5  1.2            0      4.0    Y
2     1     1      1    1  1  1  Point      2.0
1     1     2      1    1  1  1  Lognormal  1.2  1.5            0      10     Y
2     1     2      1    1  1  1  Lognormal  0.4  1.2            0      10     Y
1     1     3      1    1  1  1  Triangle   0    3    2                       Y
2     1     3      1    1  1  1  Normal     2.5  1.5                          Y
1     1     4      1    1  1  1  Uniform    0    3                            Y
2     1     4      1    1  1  1  Lognormal  3    2              0      10     Y

Exhibit 8-1. Example of a Microenvironmental Parameter Description

The general format is the same for all MPs. The first three lines are mandatory and specify the
microenvironment number, the pollutant (indicated by its order in the Control file), and the MP,
or parameter type (the Pollutant line may be absent for AER and Vol). This combination should
be unique for every MP in the input file, with the possible exception of enumerated sources
(discussed later).

The parameter types are indicated using standard keywords (given in Table 8.2). Note that only
the first two characters of the parameter type are checked by APEX, and the keyword is not case
sensitive, so the example above could use "PR". The user may spell out the parameter types if
desired, providing greater clarity.

After the microenvironment number and parameter type, any or all of the remaining lines
containing an equal sign may be omitted. These indicate the settings for various options, all of
which have default values. These settings may appear in any order within the description; they
are recognized via the keyword that precedes the equal sign. One option, missing in the above
example since it only applies to parameter types ES and CS, is the source number. See section
8.2.4.6 for an example using this option. The other seven options in this example are the
mappings that determine the values for the seven indices that label each distribution. This is
followed by three resampling options and a random seed initialization option. These options are
covered in the following subsections.
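
The way the time and area mappings select a distribution can be pictured as a lookup from calendar and location values to index values. The sketch below is hypothetical: the mapping values are illustrative (in the spirit of Exhibit 8-1), hours, weekdays, and months are assumed to be numbered from 1, and the three conditional-variable indices (C1 to C3) are left out for brevity.

```python
# Illustrative mappings only; real values come from the MP description.
HOURS_TO_BLOCK = [1] * 7 + [2] * 11 + [1] * 6          # 24 entries, one per hour
WEEKDAY_TO_DAYTYPE = [1] * 7                            # 7 entries, one per weekday
MONTH_TO_SEASON = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 1]  # 12 entries, one per month
DISTRICT_TO_AREA = {1: 1, 2: 1}                         # hypothetical district numbers


def distribution_key(hour, weekday, month, district):
    """Return the (Block, DType, Season, Area) indices for one timestep.

    hour is 1-24, weekday is 1-7, month is 1-12 (1-based numbering is an
    assumption of this sketch).
    """
    return (HOURS_TO_BLOCK[hour - 1],
            WEEKDAY_TO_DAYTYPE[weekday - 1],
            MONTH_TO_SEASON[month - 1],
            DISTRICT_TO_AREA[district])
```

The returned tuple would then be matched against the leading index columns of the distribution lines to find the distribution that applies to that timestep.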

After all the options are specified, the next line (starting with "Block") indicates that the
following lines contain descriptions of distributions. At least one distribution is always required;
the exact number needed depends on the settings of the seven indexing options. The shortest
possible MP description (other than one completely missing) consists of four lines. Such a
description would have the first three mandatory lines, the header line indicating that
distributions follow, and a single distribution that applies to all timesteps of the simulation, as in
Exhibit 8-2.

Micro number   = 1
Parameter Type = Proximity
Block DType Season Area C1 C2 C3 Shape   Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Normal  1.5  1.2            0      4      Y

Exhibit 8-2. Example of the Shortest Possible MP Description

These rules state that the proximity in microenvironment #1 is to be drawn from a normal
distribution with mean 1.5 and standard deviation 1.2. If the value drawn is below zero or above
4.0 then another value is drawn, until one is found that is within bounds. This single value is
then applied to all timesteps of the simulation, for that particular simulated individual, for their
home air quality district. A separate single value is applied to that individual for all timesteps in
their work district. Actually, a third value, the average of the first two, is applied to all timesteps
in the "other" (non-home, non-work) air quality district. In effect, all the microenvironments are
modeled in triplicate to account for the three different places. In the above example the three
values would all be the same if an extra line "ResampleWork = NO" were added
after the second line. Each simulated individual is modeled independently so new values are
drawn for all parameters when starting another profile.
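
The sampling rule just described (redraw until the value falls within the truncation bounds, then derive the three location values) might look like the following. This is a sketch under stated assumptions rather than the APEX algorithm: the function names are hypothetical, the bounds are treated as inclusive, and a retry cap is added purely as a safeguard.

```python
import random


def draw_truncated_normal(rng, mean, sd, lower, upper, max_tries=10_000):
    """Draw from a normal distribution, redrawing until within bounds."""
    for _ in range(max_tries):
        value = rng.gauss(mean, sd)
        if lower <= value <= upper:
            return value
    raise RuntimeError("no in-bounds value found; check the truncation limits")


def sample_hwo(rng, resample_work=True):
    """Proximity values for the Home, Work, and Other locations of one profile.

    Uses the Exhibit 8-2 distribution: normal, mean 1.5, standard
    deviation 1.2, truncated to the range 0 to 4.
    """
    home = draw_truncated_normal(rng, 1.5, 1.2, 0.0, 4.0)
    work = draw_truncated_normal(rng, 1.5, 1.2, 0.0, 4.0) if resample_work else home
    other = 0.5 * (home + work)  # average of the home and work values
    return {"H": home, "W": work, "O": other}


values = sample_hwo(random.Random(42))
```

A fresh call to `sample_hwo` for each profile corresponds to drawing new values when starting another simulated individual.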

All parameter distribution data must be given in the correct units (i.e., those that
are compatible with the data in the Air Quality Data input files).

The motivation for the rather complex programming that defines and evaluates the MP is that the
various parameters that enter the MASSBAL and FACTORS equations may have widely
divergent properties. For example, a parameter like house volume should have a single value
that does not change over time. Another parameter such as an air exchange rate may change
every hour. Some parameters like source strengths from cooking or from traffic may show
strong diurnal patterns that may repeat on a daily or a weekly basis. Parameters that relate to
temperature may show seasonal variation.

There are a number of possibilities for each optional rule in the MP definitions. The options fall
into the following categories:

• Time and area mappings;

• Conditional variables;

• Correlation settings;

• Resampling options;

• Random number seeds; and

• Source number specification.
Each of these is discussed in the following subsections, after which comes a brief subsection on
the specification of distributions.
8.2.4.1 Time and Area Mappings

Each MP is evaluated for every timestep of the simulation period. Normally this is done by
drawing a value at random from a distribution. The user may specify that different distributions
apply at timesteps within different hours of the day. Furthermore, the frequency of sampling can
be controlled by the user. A new value need not be drawn every timestep;
instead, values drawn for other times may be reused. The primary purpose of the time and area
mappings is to specify which distribution applies to each timestep. The secondary purpose, in
conjunction with the resampling options, is to establish periodic reuse of values on a daily,
weekly, monthly, or geographical basis.

As an example, suppose some parameter should be sampled from one distribution during typical
working hours and from another distribution at other times. This can be accomplished by
defining two "blocks" and assigning each hour of the day to either block #1 or block #2. Perhaps
hours 1-7 (midnight to 7 a.m.) and hours 19-24 (6 p.m. to midnight) belong to block #1, while
hours 8-18 (7 a.m. to 6 p.m.) belong to block #2. Then distributions must be defined for block
#1 and block #2. Hours that fall into block #1 will have their parameter values drawn from the
distribution that applies to block #1, and so on. If, for example, the distribution for block #2 has
the higher mean, then the daytime values for the parameter will generally be higher than the
nighttime values, generating a diurnal pattern. Similarly, weekly and seasonal patterns may be
generated.

The Hour-Block (HB) mapping indicates the block to which each hour of the day belongs. The
mapping must contain 24 numbers. The first is the block number for hour 1 (midnight to 1 a.m.),
and so on. All timesteps belonging to the same block use the same distribution. The block
numbers range from 1 upwards; there can be anything from 1 to 24 blocks. The number of
blocks (#blocks) is determined from this mapping itself. If the Hour-Block mapping is missing it
is assumed that there is only one block, which implies that the parameter values for all 24 hours
of the day (that is, all timesteps) are taken from the same distribution. The term "Block" might
suggest that the hours belonging to a given block should be adjacent chronologically, but this is
not necessary in APEX. It is also not necessary for each block to contain the same number of
hours.

The Weekday-Daytype (WT) mapping is similar to the Hour-Block mapping, except that it
contains seven values instead of 24. The first value is the day type for Sunday, the second is for
Monday, and so on until the last which is the day type for Saturday. The seven days of the week
are in the same order as on a standard calendar. Thus if day type 1 is weekday and 2 is weekend,
the vector should be (2 1 1 1 1 1 2), but without the parentheses. The mapping (1 2 2 2 2 2 1)
would be equivalent if the distributions presented further down were appropriately renumbered.
If the WT mapping is missing in the Microenvironment Descriptions input file, then only one
day type is assumed, that is, the default mapping is (1 1 1 1 1 1 1).

The Month-Season (MS) mapping is similar to the previous two, except that 12 numbers are
needed. The first number indicates the season for January, and so on through December. Again,
if this mapping is missing then a single season is assumed to apply to all months.
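
The three time mappings described above amount to simple table lookups. The Python sketch below is illustrative only: the mapping values are hypothetical examples taken from or modeled on the text, and the function name time_indices is not part of APEX.

```python
from datetime import datetime

# Hypothetical mappings, following the examples in the text.
HOUR_BLOCK = [1] * 7 + [2] * 11 + [1] * 6              # hours 1-7 and 19-24: block 1; hours 8-18: block 2
WEEKDAY_DAYTYPE = [2, 1, 1, 1, 1, 1, 2]                # Sunday first: 1 = weekday, 2 = weekend
MONTH_SEASON = [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1]    # January first (assumed two-season split)

def time_indices(when):
    """Return (block, day type, season) for the hour containing `when`."""
    block = HOUR_BLOCK[when.hour]                      # when.hour 0 is hour 1 (midnight to 1 a.m.)
    # datetime.weekday() counts Monday as 0; shift so that Sunday is position 0.
    day_type = WEEKDAY_DAYTYPE[(when.weekday() + 1) % 7]
    season = MONTH_SEASON[when.month - 1]
    return block, day_type, season

assert len(HOUR_BLOCK) == 24
print(time_indices(datetime(2008, 10, 4, 9, 30)))      # a Saturday morning in October
```

The resulting triple selects which distribution applies to a given hour; whether a new value is actually drawn for that hour is governed separately by the resampling options.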

The District-Area (DA) mapping is similar to the others, except that the number of air quality
districts is not a universal constant, but may vary from one simulation to another. If the mapping
is present, the user must ensure that it contains the correct number of terms (one area assignment
for each air quality district in the study area). The district indices represent the APEX air district
numbers, as enumerated in the Sites output file.

If the user defines 2 blocks, 2 day types, 4 seasons, and 3 areas, then a total of 48 distributions
(that is, 2 x 2 x 4 x 3) must be specified, one for each possible combination. The number would
be even larger if any conditional variables were used. To ease the burden of data requirements
the number of cases should be kept to a minimum. For example, if the seasonal dependence of
this parameter were weak, one could eliminate it and reduce the number of distributions from 48
to just 12. If this is too extreme, perhaps two seasons would suffice to capture the variation. The
number of seasons (or blocks, day types, or areas) can be defined differently for each MP, even
ones that belong to the same microenvironment. Each MP is evaluated for all timesteps in the
simulation period, independently of other MP, so there is no reason why the rules for one MP
should match or correspond with the rules for any other MP.
8.2.4.2 Conditional Variables

Selected profile variables may be used to influence the parameter values in the MASSBAL or
FACTORS equations. These profile variables are known as Conditional Variables. Conditional
variables can be used to vary parameters on a profile, daily, hourly, or timestep basis. The list of
variables to use for the current simulation is set in ProfileModule:SetCVlist. The allowable
conditional profile variables are:

• Gender

• Population category (Race/gender combination)

• Employed

• HasGasStove

• HasGasPilot

• AC_Home

• AC_Car

• Window_Res

• Window_Car

• SpeedCat

• ProfileConditional1

• ProfileConditional2

• ProfileConditional3

• RegionalConditional1

• RegionalConditional2

• RegionalConditional3

• RegionalConditional4

• RegionalConditional5

These variables influence the parameters on a profile basis. See Chapter 5 for definition of these
variables. With the exception of the first three, rules for setting these variables for each profile
are defined in the Profile Functions (Distributions) input file. In addition to these variables, the
MPs can also depend on seven meteorological variables:

• TempCat: Hourly temperature, binned into categories.

• HumidCat: Hourly humidity, binned into categories.

• PrecipCat: Hourly precipitation category.

• WindCat: Hourly windspeed, binned into categories.

• DirCat: Hourly wind direction, binned into categories.

• MaxTempCat: Daily maximum temperature, binned into categories.

• AvgTempCat: Daily average temperature, binned into categories.

The first five can influence parameters on an hourly basis, while the last 2 are daily-varying
parameters.

Finally, the MPs can depend on five user-defined daily-varying functions:

• DailyConditional1

• DailyConditional2

• DailyConditional3

• DailyConditional4

• DailyConditional5

These hourly and daily varying variables are not profile variables. However, rules for defining
them are also designated in the Profile Functions (Distributions) input file (see Volume I).

Note that Conditional Variables must be integers, since their values are used as indices to select
the distribution to be sampled. In practice, the number of categories must be fairly small
(generally 2 or 3) or else defining distributions for every case becomes burdensome. It should be
noted that while the concentrations in various microenvironments may depend on the profile
through the conditional variables, these concentrations do not depend on the activity diaries or
the event structure.

The user may select up to three Conditional Variables from the list for each MP. If one is used,
it does not matter whether conditional variable #1, #2, or #3 is used. If more than one is used,
the order in which they are designated does not matter.

The conditional variable to be applied is identified by its name. For example, if a parameter
were to depend on gender then it would be indicated as follows in the input file:
Condition #1 = Gender
Note that the word "Gender" must be spelled out in full as in the above list, but it is not case
sensitive. There are two ways to indicate that a conditional variable is not used. Either the line
for it can be omitted, or else the variable name can be set to anything that is not on the list. The
standard practice (if the line is not simply omitted) is to set the right hand side to zero:
Condition #1 = 0

There is a complication for the user when it comes to specifying the distributions that are
applicable to conditional variables. Each distribution has seven indices (four for time and area
mappings and three for Conditional Variables), and the value of every index must be an
integer. Thus, if gender is used as a Conditional Variable then the user must specify
distributions for gender=1 and for gender=2. The user cannot use mnemonic devices such as
"M" or "F" instead. The numerical codes for the Conditional Variables are set as constants in
GlobalModule and are as follows:

• Gender: 1=Male, 2=Female

• Employed: 1=YES, 2=NO

• HasGasStove: 1=YES, 2=NO

• HasGasPilot: 1=YES, 2=NO

• AC_Car: 1=YES, 2=NO

• Window_Res: 1=OPEN, 2=CLOSED

• Window_Car: 1=OPEN, 2=CLOSED
For the population category conditional variable, the value codes are
integers that represent the order in which the population categories are defined in the "Population
file" definitions in the Control file. For example, if "Native American females" is the third
population file defined, then the code for that population category would be 3. For the other
Conditional Variables, the numerical values are provided by the user in the Profile Functions
input file, so there are no pre-assigned ranges or interpretations. The list of conditional variables
to be used in defining MPs is set in DistributionModule:SetCVList.
8.2.4.3 Correlation Settings

In APEX the user has the option of correlating samples for microenvironmental parameters.
Such correlation would make sense, for example, when the value of the parameter is assumed to
be mainly a function of the properties of a simulated individual's home and the pollutants have
similar properties (for example, are all particles). In addition, in some cases it may be that the
same parameters may be correlated in different microenvironments.

APEX uses a simple method of correlating microparameters: it samples them using the same
random numbers. This results in values being selected for correlated parameters at the same
percentile from the appropriate distributions. The percentiles will correspond each hour as long
as the 2 (or more) parameters use the same conditional variables, time and area mappings, and
resampling rates and thus have the same number of required distributions and samples.
Otherwise the samples get out of phase and any correlation is lost. APEX checks that the
conditionals, mappings, and resampling are the same when correlating parameters, and writes a
warning if they are not. APEX will still run, but the user should be aware that the correlation is
lost.
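
The idea of sampling correlated parameters at the same percentile can be illustrated with a short Python sketch. It is not the APEX implementation: the function name correlated_draws is hypothetical, the two normal distributions are example values, and truncation is ignored for brevity.

```python
import random
from statistics import NormalDist

def correlated_draws(dists, rng):
    """Draw one value from each distribution using a single shared random number."""
    u = rng.random()                       # one uniform number shared by all parameters
    u = min(max(u, 1e-12), 1.0 - 1e-12)    # inv_cdf requires 0 < u < 1
    return [d.inv_cdf(u) for d in dists]   # each value sits at percentile u of its own distribution

rng = random.Random(12345)
penetration_pollutant1 = NormalDist(mu=0.5, sigma=0.2)
penetration_pollutant2 = NormalDist(mu=0.4, sigma=0.1)
a, b = correlated_draws([penetration_pollutant1, penetration_pollutant2], rng)
print(a, b)
```

If one of the two parameters consumed more random numbers than the other (for example because it was resampled more often), the shared numbers would no longer line up, which is the loss of phase described above.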

Correlation is handled by an optional keyword in the microparameter definition, CORRNUM.
Each subset of microparameters that the user wants to be correlated (sampled at the same
percentile each hour) is assigned a unique integer 1-N, where N is the total number of
correlated subsets. For example, assume that the user wants to correlate the penetration and
decay values for residences (micro #1) for 2 pollutants. Valid MP definitions for this case are
shown in Exhibit 8-3.
! PENETRATION FACTOR
Micro number   = 1
POLLUTANT      = 1
CorrNum        = 1
Parameter Type = PE
Month - Season = 1 1 1 1 2 2 2 2 2 2 1 1
ResampHours    = NO
ResampDays     = YES
ResampWork     = YES
Block DType Season Area C1 C2 C3 Shape     Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Normal    0.5  0.2  .    .    0.1    1.0    Y
1     1     2      1    1  1  1  Normal    0.8  0.2  .    .    0.1    1.0    Y

Micro number   = 1
POLLUTANT      = 2
CorrNum        = 1
Parameter Type = PE
Month - Season = 1 1 1 1 2 2 2 2 2 2 1 1
ResampHours    = NO
ResampDays     = YES
ResampWork     = YES
Block DType Season Area C1 C2 C3 Shape     Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Normal    0.4  0.1  .    .    0.1    1.0    Y
1     1     2      1    1  1  1  Normal    0.7  0.1  .    .    0.1    1.0    Y

! DECAY RATES
Micro number   = 1
POLLUTANT      = 1
CorrNum        = 2
Parameter Type = DE
ResampHours    = NO
ResampDays     = NO
ResampWork     = YES
Block DType Season Area C1 C2 C3 Shape     Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  LogNormal 2.51 1.53 0    .    0.95   8.05   Y

Micro number   = 1
POLLUTANT      = 2
CorrNum        = 2
Parameter Type = DE
ResampHours    = NO
ResampDays     = YES
ResampWork     = YES
Block DType Season Area C1 C2 C3 Shape     Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  LogNormal 2.51 1.53 0    .    0.95   8.05   Y
Exhibit 8-3. Example of Defining Correlated Microparameters
Exhibit 8-3. Example of Defining Correlated Microparameters

In this example, the 2 penetrations are correlated, as are the decays, but the decays and
penetrations are not correlated with each other. (If for some reason that were desirable, all four
descriptions could be set to have CorrNum=1.)

Note that if the same distributions are assigned to each pollutant (which is true for decay in this
example), the pollutants will use exactly the same values (samples) for that parameter for
each hour.

Note that the user can specify random number seeds for each microparameter using the
RandomSeed keyword (see section 8.2.4.5). This can also be used to correlate microparameters.
However, in this case the hourly microparameter samples will be the same for all persons in the
simulation, while using the CorrNum setting will produce different sets of correlated samples for
each person.

8.2.4.4 Resampling Options

The time and area mappings indicate which cases are to have their parameter values drawn from
the same distribution. However, they do not distinguish between cases when a single parameter
value is to be drawn and applied to multiple timesteps and when separate values from the same
distribution are to be drawn. The concept is called resampling, and the two specific cases
mentioned above correspond to ResampTS=NO and ResampTS=YES.

There are four resampling options, namely for timesteps, hours, days, and work air quality
district. Each may be set to either YES or NO. If ResampTS=NO and ResampHours=NO,
then all timesteps within a block on any day of the simulation will share the same value.
Timesteps in different blocks have their values drawn from different distributions, so these
values are never shared. If ResampTS=NO and ResampHours=YES, then only timesteps
within each hour will share values. If ResampTS=YES, then each timestep in the same block on
the same day will have its own value drawn, so Ntimestep values will be drawn no matter how
many blocks there are. In this case the blocks serve only to determine which parent distribution
is used for each particular hour. The defaults are ResampTS=NO and ResampHours=NO. If the
Control file variable TimeStepsPerDay is equal to 24 (the APEX default), then ResampTS and
ResampHours are interchangeable. If TimeStepsPerDay < 24, then the ResampHours flag has
no meaning and is ignored (because the microconcentrations are only calculated for each
timestep, which in this case is greater than an hour).

The second resampling option is ResampDays. If ResampDays=NO, then the same daily profile
is used on all days in the same category. If ResampDays=YES, then all days have new sets of
values drawn for them. The category for days is defined by the combination of day type, season,
and conditional variables. An example might be all winter weekdays with windows open, if day
type = weekday and season = winter and condition #1 = (WindowPos==OPEN), assuming these
terms have been defined appropriately. If ResampDays=NO (which is the default) then all such
days in the simulation period will share the same set of values.

The third resampling option is ResampWork. APEX normally generates different parameter
values for home and work locations. However, there are cases when this is not logical. For
example, if a car is defined to be a microenvironment and it uses the mass balance method, the
volume of the car should be the same whether the car is at home or at work. If
ResampWork=YES, then home and work always draw parameter values independently. If
ResampWork=NO, then the same values are used for work as for home. Since such cases are
rare, ResampWork=YES is the default, meaning that the workplace will have its values sampled
independently of the home.

To summarize, the default values for ResampTS, ResampHours and ResampDays are NO, and
for ResampWork it is YES. This means that the default is to draw only two values (one for
home and one for work) from each distribution listed for that MP and to apply those values
whenever possible. If a default is to be used then the line may be omitted from the input file, but
it does not hurt to show the lines anyway for purposes of clarity.
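
The interaction of the hourly and daily resampling flags can be sketched as follows. This Python fragment is an illustration under simplifying assumptions: 24 timesteps per day (so ResampTS and ResampHours coincide), a single day category, and one location. The function names and the example distributions are hypothetical.

```python
import random

def daily_profile(hour_block, draw, resamp_hours, rng):
    """Build one day's 24 hourly values from the Hour-Block mapping."""
    if resamp_hours:
        return [draw(block, rng) for block in hour_block]       # a new draw every hour
    per_block = {block: draw(block, rng) for block in sorted(set(hour_block))}
    return [per_block[block] for block in hour_block]           # one draw per block

def simulate_days(n_days, hour_block, draw, resamp_hours=False, resamp_days=False, seed=0):
    """Return a list of daily profiles for days that all fall in the same category."""
    rng = random.Random(seed)
    days = []
    profile = None
    for day in range(n_days):
        if day == 0 or resamp_days:                             # otherwise reuse the first profile
            profile = daily_profile(hour_block, draw, resamp_hours, rng)
        days.append(list(profile))
    return days

hour_block = [1] * 7 + [2] * 11 + [1] * 6
draw = lambda block, rng: rng.gauss(10.0 * block, 1.0)          # hypothetical distributions
defaults = simulate_days(3, hour_block, draw)                   # both flags NO
print(len(set(defaults[0])), defaults[0] == defaults[2])
```

With both flags left at NO, each day contains only one value per block and every day is identical; setting resamp_days=True redraws the profile for each day, and setting resamp_hours=True gives each hour its own value.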
8.2.4.5 Random Number Seeds

One of the features of the APEX model is the ability to conduct paired runs by controlling the
random number seeds. In normal use the model stochastically generates random profiles, selects
diaries, and generates MP values from distributions. If multiple model runs are to be
independent then the main RandomSeed value in the Simulation Control input file should be set
to zero or to different values. If model runs are to be paired, then identical streams of random
numbers are desired, in which case RandomSeed should be set to the same positive number in
both runs.

Creating paired runs in which everything is identical will create results that are identical as well.
Thus, the usual mode of operation is to have one specific difference between the two runs and
then see how much effect it has on the results. The difficulty from a programming point of view
is that it is not sufficient to simply start both runs with the same random number seed. In
addition, multiple random number streams are required. As an example, suppose one of the
paired runs contains an extra source term, and there was only one random number stream. The
first profile in both runs would be the same, as would all random numbers evaluated until values
are drawn for the new source. From then on the two runs will be drawing random numbers out
of phase, meaning at different points in the stream. This means that the next profile will differ
between the two runs.

In APEX the solution to the problem of getting out of phase is to set up multiple random number
streams. Then, for example, it is possible to generate the profile variables by sampling from one
stream while generating sources from another. This ensures that even if the sources differ, the
same profiles will be generated in both runs, since the random numbers used for the profiles
always will stay in step. The "extra" random numbers used for the additional source are drawn
from a separate stream and therefore do not affect the status of any of the other streams.
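
The out-of-phase problem and its remedy can be demonstrated with a small Python sketch. It is only an analogy to the APEX design: the function run is hypothetical, and deriving the source stream's seed as seed + 1 is an assumption made for the example.

```python
import random

def run(seed, n_sources, separate_streams):
    """Return the 'profile' random numbers produced by a toy model run."""
    profile_rng = random.Random(seed)
    # Hypothetical choice: derive the source stream's seed from the main seed.
    source_rng = random.Random(seed + 1) if separate_streams else profile_rng
    profiles = []
    for _ in range(3):                              # three simulated profiles
        profiles.append(profile_rng.random())       # stands in for generating a profile
        for _ in range(n_sources):
            source_rng.random()                     # stands in for sampling a source term
    return profiles

# Paired runs that differ only by one extra source term.
print(run(42, 1, separate_streams=True) == run(42, 2, separate_streams=True))
print(run(42, 1, separate_streams=False) == run(42, 2, separate_streams=False))
```

With separate streams the profile values agree in both runs; with a single stream only the first profile agrees, because the extra source draw shifts every later position in the stream.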

In general, paired runs using the same microparameters will stay in phase (even if the definitions
for these microparameters change). However, if a different number of microparameters is
defined (if, for example, a different number of sources is used in 2 different runs), then APEX
will get out of phase. The solution is to allow a random number seed to be defined for each
MP.

Each MP has an optional line for setting a RandomSeed specific to that MP. There are two
choices. If RandomSeed is set to a positive value, then that sets an absolute starting point for the
stream for that MP that can be set the same in another run. If RandomSeed = 0 (which is the
default) then the seed is chosen internally. This internal seed is fixed relative to the overall
model seed as defined in the Simulation Control file, based on the order of the MP definitions.
Thus, if the first MP defined in the Microenvironment Descriptions input file has RandomSeed
= 0, then this stream will begin at a fixed offset from the main stream for the model run. There
are two cases, depending on whether the main seed has a fixed value or is set to zero. If the main
seed is fixed (as for two paired runs), then the starting seed value for any MP will be the same in
both runs, provided the MP are in exactly the same order in both runs. However, this is not the
case unless the run with "extra" MP has them all appear in the input file after the MP that are in
common to both runs. To avoid this restriction, it is better in paired runs to set each MP to have
a non-zero random seed of its own. If the main seed is 0 and an MP seed is greater than 0, runs
can be generated in which: 1) the MP values are reproduced exactly and 2) all other random
aspects of the model (for example, the profiles) change.

If an MP has RandomSeed = 0 and the Simulation Control file also has RandomSeed = 0, then
the MP will have a starting seed that has a fixed offset from a randomly chosen main seed. This
means that the MP will draw random numbers independently from one run to another.

In summary there are three cases. If independent, random runs are desired then set the main
RandomSeed in the Simulation Control file to zero and for each MP either set the RandomSeed
to zero or simply omit the option. If paired runs are desired and the MP are not altered between
runs, then the main seed in the Simulation Control file should be nonzero and it does not matter
whether the seeds for each MP are zero or not (although if they are not zero the seeds should be
the same in both runs). If paired runs involve changes to the list of MP then all seeds should be
set to positive values, matching seeds in the MP common to both runs.
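
The seed rules summarized above can be expressed as a small decision function. The sketch below is hypothetical throughout: the text states only that the internal seed is at a fixed offset determined by the order of the MP definitions, so the spacing OFFSET_STEP and both function names are assumptions, not APEX internals.

```python
import random

OFFSET_STEP = 1000   # hypothetical spacing between internally assigned MP seeds

def main_seed_for_run(control_seed):
    """A zero seed in the Simulation Control file means: choose a main seed at random."""
    if control_seed > 0:
        return control_seed
    return random.SystemRandom().randrange(1, 2**31)

def resolve_mp_seed(main_seed, mp_seed, mp_order):
    """Return the starting seed for the MP that is defined in position mp_order (1-based)."""
    if mp_seed > 0:
        return mp_seed                            # absolute seed, independent of order
    return main_seed + OFFSET_STEP * mp_order     # fixed offset from the main seed

main = main_seed_for_run(42)
print(resolve_mp_seed(main, mp_seed=0, mp_order=2))   # depends on position in the file
print(resolve_mp_seed(main, mp_seed=7, mp_order=2))   # does not depend on position
```

The sketch makes the restriction visible: with RandomSeed = 0, inserting an extra MP ahead of a common MP changes mp_order and therefore its seed, whereas a positive MP seed is unaffected by reordering.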

A limitation of using the RandomSeed keyword is that it will produce the same stream of random
numbers (and thus the same set of hourly microparameter samples) for each person in the
simulation.

8.2.4.6 Source Strength Specification

As described in the sections on the MASSBAL and FACTORS methods (Sections 8.2.1 and
8.2.3), APEX allows two types of sources to be defined. ESource terms are emission sources
expressed in units of micrograms per hour. CSource terms are sources expressed in
concentration units of µg/m3 (or ppm). ESource and CSource strengths can optionally be
specified as the product of two or more values drawn from different distributions.

As an example, an ESource term representing emissions from gas stoves can be constructed as
the product of three terms: a binary switch for Use/Non-use for each timestep, a duration of
usage term, and an emission rate per minute of usage. One advantage of separating this term into
three pieces is that different rules for the time and area mappings and the resampling rate can be
defined for each of the three pieces.

Each source in APEX, meaning each MP of type ES or CS, may be assigned an optional source
number. This is done by adding a line to the MP definition as in Exhibit 8-4:
Micro number   = 4
Parameter Type = ES
Source number  = 2
Block DType Season Area C1 C2 C3 Shape     Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Lognormal 1000 2    .    .    100    10000  Y
Exhibit 8-4. Use of Source Number in MP Definition

For clarity, the source number should appear right after the parameter type. It cannot appear
earlier nor can it appear after the header line starting with "Block". It only applies to types ES or
CS and is not relevant for other parameter types. If the source number is omitted then it is
assumed to be zero, which is the catch-all category for additive source terms.

All MP of type ES or CS that share the same microenvironment number and source number are
evaluated separately for all timesteps according to their own rules, then the results are multiplied
together on a timestep basis. For example, if another MP has the description shown in Exhibit
8-5, then both this MP and the previous one are evaluated for each timestep and then the results
are multiplied together, timestep by timestep.
Micro number   = 4
Parameter Type = ES
Source number  = 2
Hours - Block  = 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 1 1 1 1
Block DType Season Area C1 C2 C3 Shape  Par1 Par2 Par3 Par4 LTrunc UTrunc ResampOut
1     1     1      1    1  1  1  Normal 10   10   .    .    0      60     Y
2     1     1      1    1  1  1  Point  45

Exhibit 8-5. Second MP Definition with Source Number 2.

Since the first MP was not resampled, only one value is generated per person and that value
remains constant over the simulation. The second MP is resampled every hour (although the
timesteps using the Point distribution will all return the same value).

It is possible to have a large number of terms sharing the same microenvironment and source
numbers, in which case all terms are evaluated separately and the timestep results are multiplied.
Note that all terms sharing the same microenvironment and nonzero source numbers must also
share the same parameter type; one cannot mix ES and CS types since it does not make sense to
multiply them together. APEX will generate an error message if this is attempted. It is possible
for some sources to be ES while others are CS, even in the same microenvironment, as long as
they are not assigned the same nonzero source numbers.

Once the product is evaluated, the result is treated exactly the same as any additive sources for
that microenvironment. All ESource terms are added together timestep by timestep, whether
each resulted from a product or was a separate term by itself. The same applies to all CSource
terms in the same microenvironment. In effect there is no change at all to either the MASSBAL
or the FACTORS equations, there is simply a change in how the ESource and CSource terms are
determined.
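
The product-then-sum rule for source terms can be sketched in Python. This is an illustration of the arithmetic only: the function name total_source and the input layout (a list of source number and per-timestep value pairs for one microenvironment and one source type) are assumptions, not APEX data structures.

```python
from collections import defaultdict
from math import prod

def total_source(terms, n_timesteps):
    """Combine source terms of one type (ES or CS) in one microenvironment.

    terms: list of (source_number, values), where values has one entry per timestep.
    Terms with source number 0 are additive; terms sharing a nonzero source
    number are multiplied together, timestep by timestep, before being added.
    """
    products = defaultdict(list)
    total = [0.0] * n_timesteps
    for source_number, values in terms:
        if source_number == 0:
            for t in range(n_timesteps):
                total[t] += values[t]
        else:
            products[source_number].append(values)
    for group in products.values():
        for t in range(n_timesteps):
            total[t] += prod(values[t] for values in group)
    return total

terms = [
    (0, [1.0, 1.0, 1.0]),            # an additive term (source number 0)
    (2, [1000.0, 1000.0, 1000.0]),   # emission rate, constant for the person
    (2, [0.0, 10.0, 45.0]),          # usage term, varying by timestep
]
print(total_source(terms, 3))
```

A nonzero source number that is used by only one MP passes through unchanged, which corresponds to a product with a single term.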

When defining product terms, a line must be added to the Simulation Control input file
indicating the largest source number used in the run for each pollutant. This line has the
keyword #SOURCES and appears in the pollutant parameters section of the
Simulation Control file, as shown in Exhibit 8-6.
Pollutant = CO
DoDose = YES
InputUnits = ug/m3
OutputUnits = ug/m3
#Sources = 3
Exhibit 8-6. Use of #Sources Setting in the Pollutant Parameters Section of the Simulation
Control File.

The value reported here (3 in this case) is echoed to the Log file after "#microenvironments" and
it is called "#Enumerated sources". Like #microenvironments, this value is only used to
allocate array space, and APEX will run correctly as long as #sources is large enough to
accommodate the source numbers used in the Microenvironment Descriptions input file. If
#sources is larger than is needed, no error occurs although job execution will be slightly less
efficient. This line can only be omitted from the Simulation Control file if no source numbers
are assigned to any MP.

Not all sources need be expressed as products. These sources can either be left without source
numbers or can explicitly have source number of zero, or may even have a positive source
number. If there is only one MP with a given microenvironment number and source number,
then it essentially constitutes a product with only one term. If only source numbers = 0 are used
(i.e. only additive terms are defined), then #Sources may be set to 0 in the control file or omitted.
8.2.4.7 Specification of Distribution Data

The number of distributions that the user must supply for each MP is the product of these
numbers:

$$N_{distributions} = N_{blocks} \times N_{daytypes} \times N_{seasons} \times N_{areas} \times N_{C1} \times N_{C2} \times N_{C3} \qquad (8\text{-}16)$$

Here $N_{C1}$ through $N_{C3}$ are the numbers of possible responses for conditional variables #1-#3. If a
conditional variable is not used, then it has one possible value (its index is always equal to 1).
The number of possible responses varies from one conditional variable to another. All preset
Conditional Variables have two values. Population Category has as many values as population
files are defined in the Control file. For other Conditional Variables the number is determined
from the definition supplied by the user in the Profile Functions (Distributions) input file.
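
As a quick check of equation 8-16, the count can be computed directly. The Python function below is a hypothetical helper, not part of APEX; unused dimensions default to 1, as stated above.

```python
from math import prod

def n_distributions(n_blocks=1, n_daytypes=1, n_seasons=1, n_areas=1,
                    n_c1=1, n_c2=1, n_c3=1):
    """Number of distributions the user must supply for one MP (equation 8-16)."""
    return prod([n_blocks, n_daytypes, n_seasons, n_areas, n_c1, n_c2, n_c3])

print(n_distributions(n_blocks=2, n_daytypes=2, n_seasons=4, n_areas=3))   # example from Section 8.2.4.1
print(n_distributions(n_blocks=2, n_daytypes=2, n_areas=3))                # same MP without seasons
```

Adding a single two-valued conditional variable such as Gender doubles either count, which is why the text recommends keeping the number of categories small.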

Each distribution occupies one line in the Microenvironment Descriptions input file. Every
combination of the seven indices must have a distribution defined, even if that specific
combination never occurs during the simulation. For example, if a winter season exists then it
must have a set of distributions defined, even if the simulation period only covers the summer.
The reason is that the distributions are stored in an array with room for all possible combinations
of index values, and this array is checked once for gaps (missing data) before the first profile is
begun. This is more efficient than checking to see if a distribution exists every time it is called,
as a model run can contain millions or even billions of calls to distributions. The price is that
distributions that ultimately are never called must still exist in the array.

The user does not need to number the distributions; this is done internally by the program. The
distributions are assigned index numbers in standard Fortran order (the block index changes
fastest, and Condition #3 changes slowest). Thus, distribution #1 is for (1,1,1,1,1,1,1), and if
there is more than one block then distribution #2 is for (2,1,1,1,1,1,1), etc. In the input file the
distributions can appear in any order, but the standard order is preferred for consistency.
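
The numbering in standard Fortran order (first index fastest) can be reproduced with a short Python function. The name distribution_number and the tuple layout are assumptions made for this sketch; only the ordering rule comes from the text.

```python
def distribution_number(index, sizes):
    """Map seven 1-based indices to a 1-based distribution number.

    index and sizes are ordered (block, day type, season, area, c1, c2, c3);
    the block index varies fastest and c3 varies slowest.
    """
    number = 0
    stride = 1
    for i, n in zip(index, sizes):
        if not 1 <= i <= n:
            raise ValueError("index out of range for its dimension")
        number += (i - 1) * stride
        stride *= n
    return number + 1

sizes = (2, 2, 4, 3, 1, 1, 1)   # 2 blocks, 2 day types, 4 seasons, 3 areas, no conditionals
print(distribution_number((1, 1, 1, 1, 1, 1, 1), sizes))
print(distribution_number((2, 1, 1, 1, 1, 1, 1), sizes))
print(distribution_number((2, 2, 4, 3, 1, 1, 1), sizes))
```

Because every combination maps to exactly one slot, a missing distribution shows up as a gap in the array, which is the check the program performs once before the first profile is begun.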

Each line describing a distribution contains the following information. First, the seven indices
are listed: block, day type, season, area, c1, c2, and c3. The seven indices must all appear
explicitly in the set order. Any superfluous indices must be given a value of one. Thus if a
conditional variable such as c3 is not used then the c3 index number is 1 for all distributions for
that MP.

After the seven indices comes the distribution definition in standard APEX distribution format
(Section 3.1). Any of the APEX distributions can be used; the available shapes and their
parameters are given in Table 3.1.
CHAPTER 9. CALCULATING EXPOSURES

Below is a description of how the microenvironment concentrations and other information are
used by APEX to estimate exposure (see step 6 in Figure 2-1), and how the model then presents
and summarizes the exposure results. Exposures are calculated in
ExposureDoseModule:Exposure.
9.1 Estimating Exposure

For inhaled pollutants such as the ones APEX models, exposure is defined as the time integrated
concentration of the pollutant in the breathing zone of a person. We refer to the concentration to
which a person is exposed as the exposure concentration. This concentration is assumed to be
approximately spatially uniform within each microenvironment and also approximately
temporally uniform for the duration of any one activity event (at most one hour). A time series
of exposure concentrations can be constructed by following the sequence of microenvironments
and locations ("Home," "Work," or "Other") visited according to the composite activity diary
assembled for the target profile.

As an example, assume the activity diary indicates that the first 40 minutes of the simulation are
spent in microenvironment #3 ("Home"), the next 20 minutes in microenvironment #2 ("Other"),
and the next 60 minutes in microenvironment #5 ("Work"), and so on. Then the exposure time
series has its first 40 minutes at the concentration for hour 1 in microenvironment #3 in location
"Home," the next 20 minutes at the concentration for hour 1 in microenvironment #2 for location
"Other," and the next 60 minutes at the concentration for hour 2 in microenvironment #5 for
location "Work," and so on. The exposure itself depends neither on the activities that the person
is performing nor on any personal physiological properties.

The user may select the units for reporting the exposure output. Either ppm or µg/m3 may be
chosen. This applies both to concentration and to exposure. It also applies to the levels used as
cutpoints in the various exposure tables. The parameter OutputUnits in the Simulation Control
file controls this; if set to PPM (actually, to anything beginning with "P") then parts per million
(ppm) are used, otherwise micrograms per cubic meter (µg/m3) are used.

APEX calculates exposure as a time series of exposure concentrations that a simulated individual
experiences during the simulation period. APEX calculates the exposure by identifying the
concentrations in the microenvironments visited by the person according to the composite
activity diary. In this manner, a time series of event exposures is found. Then, the timestep
exposure concentration at any clock hour during the simulation period is calculated using the
following equation:
$$C_i = \frac{1}{T} \sum_{j=1}^{N} C_{timestep(j)} \, t_{(j)}$$ (9-1)
where:

C_i = Timestep exposure concentration at clock hour i of the simulation period (µg/m3 or ppm)
N = Number of events (i.e., microenvironments visited) in timestep i of the simulation period
C_timestep(j) = Timestep concentration in microenvironment j (µg/m3 or ppm)
t_(j) = Time spent in microenvironment j (minutes)
T = Length of timestep (minutes)
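
As an illustration of Equation 9-1, the following Python sketch computes a duration-weighted
timestep exposure from a list of events. It assumes that the equation is the time-weighted average
of the event concentrations over the timestep. The function name and the concentration values are
hypothetical and are not taken from the APEX code.

```python
def timestep_exposure(events, timestep_minutes):
    """Time-weighted average concentration over one timestep.

    events: list of (concentration, minutes) pairs for the events in the timestep.
    timestep_minutes: length T of the timestep in minutes.
    """
    total_minutes = sum(minutes for _, minutes in events)
    if abs(total_minutes - timestep_minutes) > 1e-9:
        raise ValueError("event durations must sum to the timestep length")
    return sum(conc * minutes for conc, minutes in events) / timestep_minutes


# Hypothetical hour: 40 minutes at 0.030 ppm, then 20 minutes at 0.060 ppm.
hour_1 = timestep_exposure([(0.030, 40), (0.060, 20)], 60)
```

With these illustrative inputs the hourly value lies between the two event concentrations,
weighted toward the event that lasted longer.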

From the timestep exposures, APEX calculates time series of 1-hour, 8-hour and daily average
exposures that a simulated individual would experience during the simulation period. APEX then
statistically summarizes and tabulates the timestep, hourly, 8-hour, and daily exposures. Note
that if the APEX timestep is greater than an hour, the 1-hour and 8-hour exposures are not
calculated and the corresponding tables are not produced. Exposures are calculated
independently for all pollutants in the simulation.
9.2 Exposure Summary Statistics

The exposure time series provides a wealth of detail on the exposure experienced by each profile.
However, it is difficult to analyze since the number of events differs for each profile and even the
number of events on any given calendar day is unpredictable. Also, a model run may consist of
thousands of profiles, so it is not practical to retain the exposure time series for all profiles in
memory. For output tables and for analysis, summaries of the exposure time series are required.

Exposure summary statistics are calculated for each pollutant in the simulation, and are written
to pollutant-specific output files: the Micro Summary file, the Micro Results file, and the Tables
file. Each exposure metric in this section is thus calculated for each pollutant, and the Control
file keywords listed below are pollutant-specific. (See Volume I).

The first step in summarizing exposure is to calculate the time series of event-level, timestep-
level, and hourly average exposures for the profile. The event exposures for all pollutants are
written to the Events output file (if the Simulation Control file setting EventsOut = YES), while
the timestep values are written to the Timestep output file (if the Simulation Control file setting
TimestepOut = YES). Daily averages and maxima for each pollutant can be written to the Daily
output file as well. The timestep values are used to derive most of the other exposure summary
statistics.

There are two exposure time series reported on an hourly basis, namely the series of 1-hour
averages and the series of running 8-hour averages. These are stored in separate vectors. For the
first seven hours of the simulation, the running averages are taken over fewer than eight values;
thereafter each average covers the current hour and the previous seven hours. These hours may
cross day, month, or even year boundaries. The 1-hour and 8-hour averages are not calculated for
timesteps greater than 1 hour (i.e., when the Control file variable TimestepsPerDay is less than 24).
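
The running 8-hour average described above can be sketched as follows. This is a minimal
illustration, not the APEX implementation; the function name is hypothetical. The window is
shortened for the first seven hours, as stated in the text.

```python
def running_8h_average(hourly):
    """Running 8-hour averages of an hourly series.

    Each value averages the current hour and up to seven previous hours;
    the first seven values therefore use fewer than eight hours.
    """
    averages = []
    for i in range(len(hourly)):
        window = hourly[max(0, i - 7): i + 1]
        averages.append(sum(window) / len(window))
    return averages
```

Because the window is taken over the whole simulation series rather than day by day, it crosses
day boundaries without any special handling.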

There are four exposure statistics calculated on a daily basis. The first is the daily average
exposure, or DAvgExp. This is the arithmetic mean of the 24 hourly exposure values that fall
on a given calendar day. Note that all days contain 24 hours in APEX; the effect of Daylight
Saving Time is removed from the model. The daily average exposure is then binned into levels
according to the cutpoints provided by the user in the Simulation Control file. For example, the
input line
DAvgExp = 2, 5, 8, 12, 20
indicates that the first bin for daily average exposure extends from 0.0 to 2.0 (ppm or µg/m3),
and the second bin from 2.0 to 5.0, et cetera. The final or sixth bin in this example contains all
values over 20.0. The number of days at each level is recorded for each profile.
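
The binning step can be sketched in Python as shown below, using the cutpoints from the example
input line. The text does not say which bin receives a value exactly equal to a cutpoint; this
sketch assumes it goes to the higher bin. The function names are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

CUTPOINTS = [2, 5, 8, 12, 20]  # cutpoints from the example input line


def exposure_bin(value, cutpoints=CUTPOINTS):
    """Return the 1-based bin number for a value.

    Assumption: a value equal to a cutpoint is placed in the higher bin.
    """
    return bisect_right(cutpoints, value) + 1


def days_per_bin(daily_values, cutpoints=CUTPOINTS):
    """Count the number of days falling in each bin for one profile."""
    counts = Counter(exposure_bin(v, cutpoints) for v in daily_values)
    return [counts.get(b, 0) for b in range(1, len(cutpoints) + 2)]
```

Five cutpoints produce six bins, which matches the "final or sixth bin" in the example.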

The second daily exposure statistic is the maximum timestep average for each calendar day, or
DMTSExp in the code (for Daily Maximum Timestep Exposure). It is the highest of the
timestep values on each day. Like DAvgExp, these are also converted to bins or levels using
cutpoints from the Simulation Control input file. The number of days at each level is recorded
for each profile.

The third daily exposure statistic is the maximum 1-hour average for each calendar day, or
DM1HExp in the code (for Daily Maximum 1-Hour Exposure). It is the highest of the 24 hourly
values on each day. Like DAvgExp, these are also converted to bins or levels using cutpoints
from the Simulation Control input file. The number of days at each level is recorded for each
profile.

The final daily summary statistic for exposure is DM8HExp, which is the Daily Maximum 8-Hour
Exposure. It is the largest of the 24 running 8-hour averages for each calendar day. As for
the other daily statistics, it is also binned into levels and recorded for each profile.
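
A minimal sketch of the daily statistics derived from the hourly series is given below. It assumes
a 1-hour timestep and a series that covers whole days. The dictionary keys are illustrative labels
rather than APEX keywords, and the function name is hypothetical.

```python
def daily_statistics(hourly):
    """Daily average, daily maximum 1-hour, and daily maximum 8-hour values.

    hourly: hourly exposures for whole days (24 values per day).
    """
    if len(hourly) % 24 != 0:
        raise ValueError("expected 24 hourly values per simulated day")
    # Running 8-hour averages over the whole series, so windows may cross midnight.
    run8 = []
    for i in range(len(hourly)):
        window = hourly[max(0, i - 7): i + 1]
        run8.append(sum(window) / len(window))
    stats = []
    for start in range(0, len(hourly), 24):
        day_hours = hourly[start:start + 24]
        stats.append({
            "daily_avg": sum(day_hours) / 24,
            "daily_max_1h": max(day_hours),
            "daily_max_8h": max(run8[start:start + 24]),
        })
    return stats
```

Each daily value would then be binned with the corresponding cutpoints, and the number of days at
each level recorded for the profile.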

The average exposure over the entire simulation period is also calculated. As for the daily
summary statistics, the average exposure is binned using the cutpoints designated in the
Simulation Control input file by SAvgExp.

In addition to the exposure statistics described above, there are a few others that are derived
directly from the event-based exposure time series. The variable TimeExp represents the
number of minutes spent in each bin in each microenvironment, based not on hourly averages but on the
original event exposures. Again, the cutpoints for the TimeExp bins are provided by the user in
the Simulation Control input file.

The user can also set a threshold exposure level in the Simulation Control file. This variable is
called AlertThresh. If the exposure time series exceeds AlertThresh for any event then the
following three things occur. First, the count of high exposure events for this profile is
incremented by one. Second, the total duration over the threshold exposure is incremented by
the duration of this event. Third, the exposure is checked to see if it exceeds the maximum
exposure previously experienced by this profile. These results are reported in the Log file for the
model run for each profile that exceeds the threshold.
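
The three updates triggered by AlertThresh can be sketched as follows. This is an illustration
under the assumption that "exceeds" means strictly greater than the threshold; the function name
and the return layout are hypothetical.

```python
def alert_summary(events, alert_thresh):
    """Summarize events whose exposure exceeds the threshold for one profile.

    events: list of (exposure, minutes) pairs.
    Returns (number of high-exposure events, total minutes over the threshold,
    maximum exposure among those events or None if there were none).
    """
    count = 0
    minutes_over = 0.0
    max_exposure = None
    for exposure, minutes in events:
        if exposure > alert_thresh:  # assumption: strict exceedance
            count += 1
            minutes_over += minutes
            if max_exposure is None or exposure > max_exposure:
                max_exposure = exposure
    return count, minutes_over, max_exposure
```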
9.3 Exposure Summary Tables

APEX writes out up to 130 different exposure summary tables for the statistics described above.
The content and interpretation of these tables (including examples) are covered in Volume I.

There are 11 basic types of tables; the tables are written to the Tables files (one for each
pollutant) and optionally the Log file in ExposureDoseModule:Output:

1. Minutes in each Exposure Interval by Microenvironment (TimeExp)
2. Minutes at or above each Exposure Level by Microenvironment (TimeExp)
3. Person-Days at or above each Daily Maximum 1-Hour Exposure Level (DM1HExp)
4. Person-Days at or above each Daily Maximum 8-Hour Exposure Level (DM8HExp)
5. Person-Days at or above each Daily Maximum Timestep Exposure Level (DMTSExp)
6. Number of Simulated Persons with Multiple Exposures at or above each Daily Maximum
1-Hour Exposure Level (DM1HExp)
7. Number of Simulated Persons with Multiple Exposures at or above each Daily Maximum
8-Hour Exposure Level (DM8HExp)
8. Number of Simulated Persons with Multiple Exposures at or above each Daily Maximum
Timestep Exposure Level (DMTSExp)
9. Number of Simulated Persons with Multiple Exceedances (in the Simulation) of the
Threshold Timestep Exposure Levels (TSExp)
10. Person-Days at or above each Daily Average Exposure Level (DAvgExp)
11. Persons at or above each Overall Average Exposure Level (SAvgExp)

The levels written to each table are given by the Simulation Control file keywords in
parentheses. The terms in the titles of these tables are defined as follows:

• Person-day: A single simulated day for one simulated individual. A 100-day simulation
of 10 persons contains 1000 person-days, as does a single-day simulation of 1000
persons. APEX in general counts exposures in Person-days. Multiple exposures (at a
single level) during a single day are counted as one Person-day of exposure.

• Multiple Exposures: Multiple exposures for the same person on different simulation
days. This term refers to multiple person-days of exposure in a simulation corresponding
to the same profile. Multiple exposures (at a single level) during a single day are counted as
one Person-day of exposure.

• Multiple Exceedances: Multiple exposures for the same person on different timesteps of
the simulation. Multiple exposures during the same day (at a single level) DO count as
different exceedances.

Table types 1, 2, 10, and 11 are generated only once, for the entire population. Table types 3 to 9
are generated for six population subgroups, under three exertion levels. The six population
subgroups are as follows:
1. All Persons. The table statistics are based on the entire population.

2. Children. The table statistics are based on the population of children, as defined by the age
range given by the Control file settings ChildMin and ChildMax.

3. Active Persons. The table statistics are based on the population of people having a median
Physical Activity Index (PAI, mean MET) over the whole simulation period that exceeds the
value designated by the Control file setting ActivePAI.

4. Active Children. The table statistics are based on the population of active children, as
determined by the Control file settings ChildMin, ChildMax, and ActivePAI.

5. Ill Persons. The table statistics are based on the population of ill people. The population is
determined by the probabilities given in the Prevalence file. This population is only considered
if the input variable Disease is set in the Control file.

6. Ill Children. The table statistics are based on the population of ill children. The population is
determined by the probabilities given in the Prevalence file and the Control file settings
ChildMin and ChildMax. This population is only considered if the input variable Disease is set
in the Control file.

The three exertion levels are:

1. All Exertion Conditions. The table statistics are based on exposures experienced by the
population subgroup under any ventilatory conditions.

2. Moderate Exertion. The table statistics are based on exposures experienced by the
population subgroup only during periods in which their average equivalent ventilation rate
(EVR) is in the "moderate" range. The period of time during which EVR is averaged is the
timestep, 1 hour, or 8 hours, based on the table being generated. The "moderate" EVR ranges
are defined by the Simulation Control file settings ModEVRTS (for timestep exposures), ModEVR1
(for 1-hour exposures) and ModEVR8 (for 8-hour exposures). An individual's EVR is in the
moderate range if it is greater than or equal to the ModEVR# setting and less than the
HeavyEVR# setting for the exposure period.

3. Heavy Exertion. The table statistics are based on exposures experienced by the population
subgroup only during periods in which their average equivalent ventilation rate (EVR) is in the
"heavy" range. The period of time during which EVR is averaged is the timestep, 1 hour, or
8 hours, based on the table being generated. The "heavy" EVR ranges are defined by the
Simulation Control file settings HeavyEVRTS (for timestep exposures), HeavyEVR1 (for 1-hour
exposures) and HeavyEVR8 (for 8-hour exposures). An individual's EVR is in the heavy range
if it is greater than or equal to the HeavyEVR# setting for the exposure period.

The exertion level statistics are calculated in the following manner: For each day in the
simulation, the mean EVR level during every timestep, one hour, and running eight hour time
period is calculated. If the mean timestep, 1-hour or 8-hour EVR is higher than the EVR levels
indicated (in the Simulation Control file) for moderate or heavy exertion, then the exposure
during the same time period is compared with the levels given in the Simulation Control file, and
the statistics and tables are updated if the exposure exceeds the level.
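
The exertion-level screening described above can be sketched as follows. The comparison rules
follow the range definitions given for the moderate and heavy levels (greater than or equal to
the threshold). The function names, argument names, and the "all" label are hypothetical.

```python
def exertion_level(mean_evr, mod_evr, heavy_evr):
    """Classify a period by its mean EVR using the moderate and heavy thresholds."""
    if mean_evr >= heavy_evr:
        return "heavy"
    if mean_evr >= mod_evr:
        return "moderate"
    return "other"


def counts_toward_table(mean_evr, exposure, level, exertion, mod_evr, heavy_evr):
    """True if a period contributes to the table for one exertion condition and level.

    exertion: "all", "moderate", or "heavy" (the table's exertion condition).
    level: the exposure level (cutpoint) being tabulated.
    """
    if exertion != "all" and exertion_level(mean_evr, mod_evr, heavy_evr) != exertion:
        return False
    return exposure >= level
```

A period with a mean EVR below the moderate threshold contributes only to the tables for all
exertion conditions, even at the 0.0 exposure level.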

NOTE: Many of the tables can include statistics at the 0.0 exposure level, if it is indicated in the
Simulation Control file. This can be used to obtain some useful statistics (such as the total
population of the subpopulation in the final study area), since all persons will have exposures
equal to or exceeding the 0.0 level on all person-days. However, use caution in examining these
"0.0" level statistics in the case of the exertion-level tables. If a simulated person has no
timestep, 1-hour or 8-hour periods at Moderate or Heavy EVR, then they will NOT have an
exposure in that table for the 0.0 level, and the 0.0 level statistics will not correspond to the
entire subpopulation.
CHAPTER 10. CALCULATING DOSE

APEX contains algorithms for estimating pollutant doses. The term "dose" refers to some
measure of the amount of pollutant in the body of the target person. The situation is not as clear
as for exposure since there are numerous specific definitions of dose that are used in various
contexts. For an airborne pollutant, dose could refer to the amount inhaled, or the amount
currently in the lungs, or the amount crossing from the lungs to the body, or the total amount in
the body, or the total in some specific target organ, or a number of other things. Dose may be
more useful or accurate than exposure for evaluating the effects of air pollutants because it
accounts better for differences in pollutant uptake resulting from (1) the variation in physiology
and activities across populations and (2) the variation in physiological responses to activities
within an individual. In APEX, dose is generally defined as the amount inhaled. APEX
contains a special algorithm for the pollutant CO, in which the model calculates the percent of
carboxyhemoglobin (%COHb) in the blood. Carboxyhemoglobin is hemoglobin that has carbon
monoxide instead of the normal oxygen bound to it. In addition, APEX contains algorithms to
estimate the deposited lung dose in the case of particulate matter (PM). When the APEX model
is extended to other pollutants, it will likely require the development of specific dose algorithms
for each new pollutant, or at least for each class of pollutant.

The simple algorithm for calculating inhaled dose is discussed in the next section, followed by
the algorithms for the %COHb calculation and the PM dose calculation. The final topic relating
to dose is the explanation of the summary statistics for reporting dose. Doses are calculated in
ExposureDoseModule:Dose.
10.1 Inhaled Dose Calculation

Currently, for all pollutants other than CO, APEX calculates inhaled dose. Inhaled dose is
simply the amount of pollutant inhaled over the course of a specified time period. In APEX,
inhaled dose for each timestep is calculated as

$$D_i = V_e \, C_i$$ (10-1)
where
D_i = Inhaled dose for timestep i of the simulation period
C_i = Timestep exposure concentration at timestep i of the simulation period (µg/m3 or ppm)
V_e = Expired ventilation rate (ml/min)

The calculation of the expired ventilation rate V_e is discussed in Section 7.3. The exposure C_i is
discussed in Section 9.1. From the timestep doses, APEX calculates time series of 1-hour, 8-
hour and daily average dose that a simulated individual would experience during the simulation
period. APEX then statistically summarizes and tabulates the hourly, 8-hour, and daily doses.
Note that if the APEX timestep is greater than an hour, the 1-hour and 8-hour dose are not
calculated and the corresponding tables are not produced. Doses are calculated independently
for all pollutants in the simulation.
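
A minimal sketch of the inhaled dose calculation, assuming Equation 10-1 is the product of the
expired ventilation rate and the timestep exposure concentration. The function name is
hypothetical, and no unit conversion is attempted here; the result carries the units of the inputs.

```python
def timestep_inhaled_doses(ventilation, concentrations):
    """Apply D_i = Ve * C_i to matching timestep series.

    ventilation: expired ventilation rate for each timestep.
    concentrations: timestep exposure concentration for each timestep.
    """
    if len(ventilation) != len(concentrations):
        raise ValueError("series must have one value per timestep")
    return [ve * c for ve, c in zip(ventilation, concentrations)]
```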
10.2 Carboxyhemoglobin (COHb) Calculation

The calculation of CO dose is complex. It starts with the exposure time series, which indicates
the pollutant concentration in the inhaled air at each moment in time. It also requires the
ventilation rate, which is activity dependent, and knowledge of a number of physiological
parameters. In addition, the %COHb level depends not only on current conditions but on recent
history as well. The discussion of physiological parameters is in the chapter on personal
profiles. In APEX, dose is
the delivered dose of CO as measured by the biotransformation product carboxyhemoglobin
(COHb)—specifically %COHb—in the blood. The estimated %COHb, in addition to being a
more refined measure of exposure compared to exposure concentration, is useful for comparing
to health-based benchmarks of %COHb. A normal carboxyhemoglobin level is less than two
percent for non-smokers. Overt signs of toxic effects usually appear at carboxyhemoglobin
levels of 15 to 20 percent, and a level of 25 percent is an index of severe poisoning, which may
lead to sudden loss of consciousness.

Conceptually, APEX uses a number of factors, in particular a time series estimate of alveolar
ventilation rate VA (which is activity and physiology dependent) and a time-series estimate of
exposure (which indicates the pollutant concentration in the inhaled air for specific moments in
time), to calculate %COHb. The ventilation rate algorithm is discussed in Section 7.3 and the
%COHb algorithm is discussed in the following subsections. For other factors and additional
detail, see Johnson (2002).

The %COHb calculation in APEX uses the time series for exposure to CO and the time series for
alveolar ventilation rate, VA, as inputs (among other factors). The dose calculation is based on
the solution to the non-linear Coburn, Forster, Kane (CFK) equation, as detailed in Johnson
(2002). As pointed out by that report, the CFK equation does not have an explicit solution, so an
iterative solution or approximation is needed to calculate the %COHb. An iterative solution,
however, was determined to be unsuitable because of the model execution time necessary (a
typical model run of one calendar year represents roughly 14,000 events per person and several
thousand people, or tens of millions of diary events). Therefore, the CFK equation is solved
using a modified Taylor's series method in which the event duration is restricted in time (if
necessary) to ensure convergence with only a few terms. This method avoids the dangers of
non-convergence that arise in some other methods.

As the mathematical derivation in the above report is very detailed, only the main results are
presented here. First, it should be noted that the literature discusses two forms of the CFK
equation, namely a linear and a non-linear form. The linear form itself is an approximation that
allows an explicit solution, but is not accurate under all conditions. The non-linear form is
considered to be more correct and is the one being discussed here.
Restricting %COHb(t) to between 0 and 100 (percent), the CFK equation takes the form of the
following differential equation:
$$\%COHb'(t) = C_0 - C_1 \times \frac{\%COHb(t)}{100 - \%COHb(t)}$$ (10-2)
where C0 and C1 are constants over the duration of one event that depend on physical and
physiological parameters, including VA and the CO exposure. C0 is given by
c _ (10-3)
0 RHB0+ Blood
where Endgn and Blood are physiological profile variables (Section 5.3). PCO is the partial
pressure of CO (torr)

$$P_{CO} = P_{gases} \times Exposure \times 10^{-6}$$ (10-4)

where Exposure is the event CO exposure concentration and Pgases is the partial pressure (torr) of
dry gases at the study altitude (EPA, 1978):

$$P_{gases} = 760 \times \exp(-0.0000386 \times Altitude) - 47$$ (10-5)

where altitude is in feet. The variable RHB0 is the total reduced blood hemoglobin level (ml O2
or ml CO per ml blood), adjusted for altitude (as in EPA, 1978)

$$RHB_0 = 1.39 \times 0.01 \times 0.995 \times Hmglb \times \left(1 + \frac{2.76 \, e^{0.0001429 \times Altitude}}{100}\right)$$ (10-6)

where Hmglb is the profile variable for hemoglobin density. The factor B is given by

$$B = \frac{1}{DIFF} + \frac{P_{gases}}{V_A}$$ (10-7)

VA is the event alveolar ventilation (see Section 7.3) and DIFF is lung CO diffusivity
(ml/min/torr) adjusted for ventilation rate

$$DIFF = Diff + 0.000845 \, V_A - 5.7$$ (10-8)

where Diff is the CO diffusivity profile variable (which corresponds to a baseline ventilation).

The constant C1 is given by

_ 1 + 0-32xP02
' 69.76 xRHB0x Blood

where PO2 is the partial pressure of oxygen in the lungs (torr)

$$P_{O2} = 0.209 \, P_{gases} - 49$$ (10-10)

See Johnson et al. (2002) for a detailed derivation of the above equations for C0 and C1.
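
The intermediate quantities in Equations 10-4, 10-7, 10-8, and 10-10 can be sketched in Python as
shown below. The sketch takes Pgases as an input and assumes that the factor B has the form
1/DIFF + Pgases/VA. Function names and the sample values in the usage lines are hypothetical.

```python
def p_co(p_gases, exposure_ppm):
    """Partial pressure of CO (torr) from the event exposure in ppm (Equation 10-4)."""
    return p_gases * exposure_ppm * 1e-6


def p_o2(p_gases):
    """Partial pressure of oxygen in the lungs (torr) (Equation 10-10)."""
    return 0.209 * p_gases - 49.0


def adjusted_diffusivity(diff, v_a):
    """Lung CO diffusivity adjusted for alveolar ventilation (Equation 10-8)."""
    return diff + 0.000845 * v_a - 5.7


def b_factor(diff_adjusted, p_gases, v_a):
    """Factor B, assuming the form 1/DIFF + Pgases/VA (Equation 10-7)."""
    return 1.0 / diff_adjusted + p_gases / v_a


# Illustrative values only: 713 torr of dry gases, 10 ppm CO, VA = 6000 ml/min.
pco = p_co(713.0, 10.0)
b = b_factor(adjusted_diffusivity(30.0, 6000.0), 713.0, 6000.0)
```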

Time zero represents the start of the current event. The concentration %COHb(0) (at time zero)
is assumed to be known. The first derivative, %COHb'(0), can easily be found from the above
equation. The solution %COHb(t) is a smoothly varying function of time without sudden
discontinuities or changes in slope. It therefore can be expanded in a Taylor's series about t=0,
which should converge fairly rapidly. One simplification is to rescale the time variable to the
unitless parameter z:

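$$z = \frac{(C_0 + C_1) \times t}{100 \times D_0 \times D_0}$$ (10-11)

where

$$D_0 = 1 - \frac{\%COHb(0)}{100}$$ (10-12)

The Taylor's series up to the fourth order term is:

$$T_4(z) = \%COHb(0) + 100 \times D_0 \times D \times z - \frac{100 \times A_1 \times D_0 \times D \times z^2}{2} + \frac{100 \times A_1 \times D_0 \times D \times (A_1 - 2D) \times z^3}{6} - \frac{100 \times A_1 \times D_0 \times D \times (A_1^2 - 8 D A_1 + 6 D^2) \times z^4}{24}$$ (10-13)

where

$$A_1 = \frac{C_1}{C_0 + C_1}$$ (10-14)

$$D = D_0 - A_1$$ (10-15)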

For typical values for the constants C0, C1, and %COHb(0), convergence occurs for z
-------
subevent duration is used, and all the subevents have equivalent length. An accumulated dose is
calculated as the running sum of the average %COHb level over the subevents. At the end of all
subevents, the average dose is this accumulated dose divided by the number of subevents while
the final value of the dose is simply the final value of COHb itself. These values are saved and
the next diary event is processed.
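
The fourth-order Taylor expansion can be checked numerically with the sketch below. It assumes
the differential equation has the form %COHb'(t) = C0 - C1 x %COHb(t) / (100 - %COHb(t)), with
z, D0, A1, and D defined as in Equations 10-11 to 10-15, and derives the series coefficients from
that form. The function names and the constants in the usage lines are hypothetical values chosen
only for illustration. A Runge-Kutta integration of the same equation serves as a reference.

```python
def cohb_rate(cohb, c0, c1):
    """Right-hand side of the assumed CFK form: C0 - C1 * y / (100 - y)."""
    return c0 - c1 * cohb / (100.0 - cohb)


def cohb_taylor4(cohb0, c0, c1, minutes):
    """%COHb at the end of an event from the fourth-order Taylor series in z."""
    d0 = 1.0 - cohb0 / 100.0
    a1 = c1 / (c0 + c1)
    d = d0 - a1
    z = (c0 + c1) * minutes / (100.0 * d0 * d0)
    k = 100.0 * d0 * d  # common factor of every term after the first
    return (cohb0
            + k * z
            - a1 * k * z ** 2 / 2.0
            + a1 * k * (a1 - 2.0 * d) * z ** 3 / 6.0
            - a1 * k * (a1 * a1 - 8.0 * d * a1 + 6.0 * d * d) * z ** 4 / 24.0)


def cohb_rk4(cohb0, c0, c1, minutes, steps=1000):
    """Reference solution by classical fourth-order Runge-Kutta integration."""
    h = minutes / steps
    y = cohb0
    for _ in range(steps):
        k1 = cohb_rate(y, c0, c1)
        k2 = cohb_rate(y + 0.5 * h * k1, c0, c1)
        k3 = cohb_rate(y + 0.5 * h * k2, c0, c1)
        k4 = cohb_rate(y + h * k3, c0, c1)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


# Illustrative constants only (per-minute rates), one 60-minute event.
series_value = cohb_taylor4(1.0, 0.00539, 0.302, 60.0)
reference_value = cohb_rk4(1.0, 0.00539, 0.302, 60.0)
```

For these illustrative constants z is well below 1 and the two results agree closely, which is the
situation in which a few terms of the series are sufficient. For longer events z grows, and
splitting the event into shorter subevents keeps the series accurate.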
10.3 Calculating PM Dose

In APEX, PM dose is modeled as the mass of PM depositing in the entire respiratory system,
including the extrathoracic regions (mouth, nose, and oropharynx) and the lungs. The PM dose
algorithm was developed from the empirical lung deposition equations of the International
Commission on Radiological Protection's Human Respiratory Tract Model for Radiological
Protection (ICRP 1994).

The algorithm calculates the deposition fraction (f) in each of 9 filters (see Figure 10.1) and
determines the PM mass deposited in each for each event. The 9 filters correspond to the
following regions of the respiratory system:

1. Nose (inhalation, fN)
2. Oropharynx (inhalation, fO)
3. Tracheobronchi (inhalation, fTB,i)
4. Bronchioles (inhalation, fB)
5. Alveoli (fA)
6. Bronchioles (exhalation, fB)
7. Tracheobronchi (exhalation, fTB,e)
8. Oropharynx (exhalation, fO)
9. Nose (exhalation, fN)

The f values determine the fraction of the particle mass entering the region that deposits. Only
the fraction in the Tracheobronchial filters differs for inhalation versus exhalation; for the other
filters the value of the fraction is the same in both cases. Deposition via both aerodynamic and
thermodynamic (diffusive) mechanisms is estimated for each filter.
Figure 10.1 Structure of the ICRP Deposition Model (flow labels: Inhaled Particles, Exhaled Particles).
The fractions f and the resulting deposited doses are calculated for each particle size considered.
The size-specific masses are summed to get the mass deposited in each filter. The deposited filter
masses are then combined into 4 deposited masses corresponding to total PM dose in the entire
respiratory system and in the extrathoracic (ET, nose + oropharynx), tracheobronchial (TB), and
pulmonary (Pulm, bronchiole + alveoli) regions.
10.3.1 Particle Sizes, Inhalability, and Diffusion Coefficient

The particle size used by the ICRP algorithm is the aerodynamic diameter. Aerodynamic
diameter (dae) is the property of particles measured by the majority of particle samplers, and the
EPA designations of PM2.5 and PM10 are based on aerodynamic diameter. The aerodynamic
diameter is provided to the model in the Control file using the pollutant-specific keyword Size
(see Volume I).

Particle inhalability is defined as the fraction of the ambient particles that are inhaled. This
fraction is calculated as:
$$f_{inh,i} = 1 - 0.5\left(1 - \left[7.6 \times 10^{-4} \, d_{ae,i}^{2.8} + 1\right]^{-1}\right)$$ (10.16)
The particle diffusion coefficient D_i is also required to calculate the deposition fractions. D_i is
calculated using Equation 10.17:

$$D_i = \frac{C_c(d_{th,i}) \, k \, T}{3 \pi \mu \, d_{th,i}}$$ (10.17)

where k is Boltzmann's constant, 1.38 x 10^-16 g cm^2 s^-2 K^-1

d_th,i is an equivalent thermodynamic diameter of particle size i

T is absolute temperature (Kelvin)

μ is the absolute viscosity of air (1.82 x 10^-4 g cm^-1 s^-1)

and Cc is the Cunningham correction factor, given in Equation 10.18:

$$C_c(d_{th,i}) = 1 + \frac{2\lambda}{d_{th,i}}\left[1.257 + 0.4 \exp\left(-\frac{1.1 \, d_{th,i}}{2\lambda}\right)\right]$$ (10.18)

where λ is the mean free path of air, 6.5 x 10^-6 cm.
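
The diffusion coefficient calculation can be sketched as follows. The sketch assumes the
Stokes-Einstein form D = Cc k T / (3 pi mu d) and the Cunningham correction written in terms of
2 x lambda / d, with all lengths in centimeters so that the constants quoted above can be used
directly. The function names and the particle size in the usage lines are hypothetical.

```python
import math

BOLTZMANN = 1.38e-16       # g cm^2 s^-2 K^-1
VISCOSITY_AIR = 1.82e-4    # g cm^-1 s^-1
MEAN_FREE_PATH = 6.5e-6    # cm


def cunningham(d_cm):
    """Cunningham correction factor for a particle diameter given in cm."""
    ratio = 2.0 * MEAN_FREE_PATH / d_cm
    return 1.0 + ratio * (1.257 + 0.4 * math.exp(-1.1 / ratio))


def diffusion_coefficient(d_cm, temperature_k):
    """Particle diffusion coefficient (cm^2/s), assuming the Stokes-Einstein form."""
    return (cunningham(d_cm) * BOLTZMANN * temperature_k
            / (3.0 * math.pi * VISCOSITY_AIR * d_cm))


# Illustrative case: a 1 micron (1e-4 cm) particle at 293.15 K.
cc_1um = cunningham(1.0e-4)
d_1um = diffusion_coefficient(1.0e-4, 293.15)
```

The correction factor approaches 1 for large particles and grows as the diameter approaches the
mean free path, so diffusion matters most for the smallest particle sizes.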

The thermodynamic diameter is calculated from the aerodynamic diameter:
(microns)
cthJ
where p is the particle density in g/cm3. The value of dth,i is found by recursively solving this
equation using an initial guess of dae,i(p) 1/2.

10.3.2 The ICRP Deposition Equations

As described above, the ICRP model considers the respiratory system as a system of 9 filters
corresponding to the particles deposited in the nose (N), oropharynx (O), trachea/bronchi (TB),
bronchioles (B), and alveoli (A) during inhalation and exhalation. The ICRP equations consider
fractions for both aerodynamic (ae) and thermodynamic (th) deposition processes in each region.
These fractions in each filter j are exponential functions of 3 empirical coefficients a, R, and p, as
shown in Equations 10.20 and 10.21.

Aerodynamic Deposition Fraction

$$f_{ae,j} = 1 - \exp\left(-a_{ae,j} \, R_{ae,j}^{\,p_{ae,j}}\right)$$ (10.20)

Thermodynamic Deposition Fraction

$$f_{th,j} = 1 - \exp\left(-a_{th,j} \, R_{th,j}^{\,p_{th,j}}\right)$$ (10.21)
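
Both deposition fractions share one functional form, which can be sketched as below. The sketch
assumes the form f = 1 - exp(-a R^p) for each mechanism and a non-negative R. The function name
is hypothetical.

```python
import math


def deposition_fraction(a, r, p):
    """Deposition fraction of one filter for one mechanism: 1 - exp(-a * R**p).

    Assumes r >= 0; a and p are the empirical coefficients for the filter.
    """
    return 1.0 - math.exp(-a * r ** p)
```

The result lies between 0 and 1 whenever a and R are non-negative, and it increases with either
coefficient, as expected for a fraction of the mass entering the filter.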
There is an a, R, and p coefficient for each filter for both aerodynamic and thermodynamic
mechanisms. The coefficients may be constant or may be functions of parameters describing the
respiratory system physiology and activity level of the individual being modeled. The
coefficients, which may also be a function of the mode of breathing (nasal or oral), are given in
Table 10.1.
Table 10.1 The values of a, R, and p for each filter for oral and nasal breathing

Aerodynamic Deposition

Filter                               a                   R                                     p
Nose (N)                             0.0003              d_ae^2 V_n (S_1)^3                    1
Oropharynx (O), nose breathing       0.000055            d_ae^2 V_n (S_1)^3                    1.17
Oropharynx (O), oral breathing       0.00011             d_ae^2 V_n^0.6 (V_t)^-0.2 (S_1)^1.4   1.4
Tracheobronchial (TB), inhalation    0.00000408          d_ae^2 V (S_1)^2.3                    1.152
Tracheobronchial (TB), exhalation    0.00000204          d_ae^2 V (S_1)^2.3                    1.152
Bronchioles (B)                      0.1147              (0.056 + t_B^1.5) d_ae^(t_B^-0.25)    1.173
Alveoli (A)                          0.146 (S_3)^0.98    d_ae^2 t_A                            0.6495

Thermodynamic Deposition

Filter                               a                         R                                     p
Nose (N)                             18                        D (V_n S_1)^-0.25                     0.5
Oropharynx (O), nose breathing       15.1                      D (V_n S_1)^-0.25                     0.538
Oropharynx (O), oral breathing       9                         d_ae^2 V_n^0.6 (V_t)^-0.2 (S_1)^1.4   0.5
Tracheobronchial (TB)                22.02 (S_1)^1.24 x psi    D t_TB                                0.6391
Bronchioles (B)                      -76.8 + 167 (S_2)^0.65    D t_B                                 0.5676
Alveoli (A)                          170 + 103 (S_3)^2.13      D t_A                                 0.6101

where psi = 1 + 100 exp(-[log10(100 + 10 / d_th^0.9)]^2)
As indicated in the table, several of the a, R, and p coefficients are functions of a number of
physiological variables, including lung volumes, inspiratory flow rates, and residence times.
These are described below.
10.3.2.1 Lung Volumes and Age Scaling Factors

The ICRP Publication provides reference values for the required lung volumes as a function of
subject age and gender. The volumes are: (1) the dead spaces for the entire lung (Vd), and for
the ET, TB, and B regions (Vd,ET, Vd,TB, Vd,B) and (2) the functional residual capacity (FRC).
The ICRP model also makes use of three scaling factors to adjust some of the a and R model
coefficients as a function of age. Physiologically, these factors represent ratios of specific
airways of a modeled person to those of a reference adult male. S1 gives the ratio for the
trachea, S2 for the 9th airway generation, and S3 for the 16th airway generation. Both the
volumes and the scaling factors are modeled in APEX based on the reference values, using an
equation of the form:
$$P = A h^2 + B h + C$$ (10.22)
The variable P is the volume or scaling factor of interest and h is the individual's height (in cm).
The values of the A, B, and C coefficients for the different parameters for males and females were
calculated by fitting Equation 10.22 to the ICRP parameters; they are given in Table 10.2.

Table 10.2 Coefficients for the Lung Volumes and Scaling Factors

                     A              B               C
Volumes
  Male
    FRC              0.0002         -0.0279         1.0353
    Vd               0.0078         -0.7135         29.316
    Vd,ET            0.0031         -0.3175         10.907
    Vd,TB            0.0026         -0.2392         9.7143
    Vd,B             0.0023         -0.2119         11.375
  Female
    FRC              0.0002         -0.0265         0.96
    Vd               0.0079         -0.7403         30.381
    Vd,ET            0.0029         -0.2861         9.5306
    Vd,TB            0.0022         -0.1523         5.9635
    Vd,B             0.0029         -0.3225         16.098
Scaling Factors
  Male
    S1               1.1354E-04     -4.0700E-02     4.6711
    S2               2.3020E-05     -1.1168E-02     2.2567
    S3               7.8360E-05     -3.2100E-02     4.2381
  Female
    S1               1.2800E-04     -4.3542E-02     4.7975
    S2               2.3200E-05     -1.1225E-02     2.2597
    S3               8.0500E-05     -3.2543E-02     4.2585
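
Equation 10.22 can be evaluated with a short sketch such as the one below. The two coefficient
sets are copied from the male rows of Table 10.2; the function name, the constant names, and the
height in the usage lines are hypothetical.

```python
# (A, B, C) coefficients from the male rows of Table 10.2
MALE_VD = (0.0078, -0.7135, 29.316)          # total dead space
MALE_S1 = (1.1354e-04, -4.0700e-02, 4.6711)  # trachea scaling factor


def lung_parameter(height_cm, a, b, c):
    """Evaluate P = A*h^2 + B*h + C for a height in cm."""
    return a * height_cm ** 2 + b * height_cm + c


# Illustrative height of 176 cm
vd_176 = lung_parameter(176.0, *MALE_VD)
s1_176 = lung_parameter(176.0, *MALE_S1)
```

At this height the scaling factor is close to 1, which is consistent with the scaling factors being
ratios to a reference adult male.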
10.3.2.2 Tidal Volume and Activity Level

Tidal volumes (Vt) were calculated as a function of age, gender, and activity level. The starting
point for the Vt calculations are the ICRP reference values for men and women of reference ages
for four different activity levels: sleep, sitting, light exercise, and heavy exercise. Other ages are
interpolated from the values at the reference ages. After age 30, the values are assumed to be
constant with increasing age. The correct Vt for each event is determined as a function of age,
gender, and activity level. Activity level is determined by the event METS values for the
individual being studied. Activity level was based on normalized METS (M), defined in
Equation 10.23.
$$M = \frac{METS - 1}{METS_{max} - 1}$$ (10.23)
METSmax is the maximum obtainable MET value for the person. If M was less than or equal to
0, then activity level was assumed to be "sleep." If M was greater than 0, then the activity level
was assigned as follows:

M < 0.333 : activity level = sitting

0.333 < M < 0.667 : activity level = light exercise

M > 0.667 : activity level = heavy exercise
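
The activity level assignment can be sketched as follows. The thresholds are those listed above.
The text does not say how values exactly equal to 0.333 or 0.667 are treated, so this sketch
assigns them to the higher activity level as an assumption. The function names are hypothetical.

```python
def normalized_mets(mets, mets_max):
    """Normalized METS value M = (METS - 1) / (METSmax - 1)."""
    return (mets - 1.0) / (mets_max - 1.0)


def activity_level(mets, mets_max):
    """Map an event METS value to one of the four activity levels.

    Assumption: M equal to 0.333 or 0.667 falls in the higher activity level.
    """
    m = normalized_mets(mets, mets_max)
    if m <= 0.0:
        return "sleep"
    if m < 0.333:
        return "sitting"
    if m < 0.667:
        return "light exercise"
    return "heavy exercise"
```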

10.3.2.3 Inspiratory Ventilation

Several of the model parameters are a function of the inspiratory ventilation. This flow rate was
calculated as
(ml/s)
(10.24)
where Ve is the exhaled ventilation rate.
10.3.2.4 Residence Times

The residence times in the lungs are required to calculate deposition via diffusive mechanisms.
Residence times are a function of the flow rate, the dead spaces, FRC, and Vt. The residence
times in the tracheobronchial, bronchial, and alveolar regions are given by:
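$$t_{TB} = \frac{V_{d,TB}}{V}\left(1 + \frac{0.5 \, V_t}{FRC}\right) \quad (s)$$ (10.25)

$$t_B = \frac{V_{d,B}}{V}\left(1 + \frac{0.5 \, V_t}{FRC}\right) \quad (s)$$ (10.26)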
V
(10.27)
10.3.2.5 Final Deposition Fractions and Deposited Masses

The total deposition efficiency for filter j is:

(10.28)

The total mass deposited in each filter j is:

(10.29)
The variable m_inh is the total inhaled PM mass that is calculated for each event (Equation 10.30).

$$m_{inh} = f_{inh} \times V_E \times minutes \times C_{PM}$$ (10.30)

C_PM is the microenvironmental PM concentration for the event, V_E is the exhaled ventilation,
and minutes is the event duration.
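
The inhaled mass calculation can be sketched as shown below. The sketch assumes that the
inhalability of Section 10.3.1 follows the ICRP form with an exponent of 2.8 on the aerodynamic
diameter and that the diameter is expressed in microns. Function names and the input values in
the test are hypothetical, and no unit conversion is attempted.

```python
def inhalability(d_ae_um):
    """Fraction of ambient particles inhaled.

    Assumptions: aerodynamic diameter in microns, exponent 2.8 (ICRP form).
    """
    return 1.0 - 0.5 * (1.0 - 1.0 / (7.6e-4 * d_ae_um ** 2.8 + 1.0))


def inhaled_mass(d_ae_um, ve, minutes, c_pm):
    """Inhaled PM mass for one event: f_inh * VE * minutes * C_PM."""
    return inhalability(d_ae_um) * ve * minutes * c_pm
```

Under these assumptions the inhalability falls from 1 for very small particles toward 0.5 for very
large ones, so the inhaled mass is always at least half of VE x minutes x C_PM.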

The mass deposited in each region of the respiratory tract can be calculated by summing the
mass deposited by the inhalation and exhalation filters associated with that region. The total
deposited mass is given by
(10.31)
10.4 Definition of Dose Summary Statistics

A flag called DODOSE is available in the Simulation Control file. The default is DODOSE =
YES. If DODOSE = NO then the dose calculation in APEX is skipped.

If the dose calculation is performed, the initial result is the dose level for each event in the
simulation. Unlike exposure, three doses are calculated per event. The first is the
time-average dose over the event. The second is the running 8-hour average of event doses. The
third is the final instantaneous dose at the end of the event. For CO, the final dose is found
directly using the series T4(z) given in the previous section. The average dose is easily found
since T4(z) is a polynomial in time and therefore can be integrated without difficulty. For all
other pollutants, the instantaneous end-of-event dose has no meaning and is simply equal to the
value calculated in Equation 10.1.

Most of the summary statistics for dose are analogous to those for exposure. The average dose
over each event and the instantaneous dose at the end of each event are written to the Events
output file (if it exists). Timestep and hourly-average dose time series are created by taking
appropriate duration-weighted averages of dose over the events in each clock hour. These doses
may be written to the Timestep and Hourly output files. A vector of instantaneous dose at the
end of each hour (FDose) is also saved, for the calculation of a daily statistic for maximum
end-of-hour dose (see next section). This is not an average, but simply the subset of
event-end dose values corresponding to the events that end on a clock hour. For pollutants other than
CO, this end-of-hour dose is simply the dose on the last event of the hour.
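
The duration-weighted hourly average and the end-of-hour dose described above can be sketched as follows. This is a minimal illustration, not the APEX implementation. The function name and the event tuple layout are hypothetical, and the sketch assumes that events are listed in time order and that each event lies entirely within one clock hour.

```python
from collections import defaultdict


def hourly_dose(events):
    """Return (hourly_avg, fdose) dictionaries keyed by clock-hour index.

    events: iterable of (start_minute, duration_minutes, avg_dose, end_dose),
    in time order. Assumption for this sketch: no event crosses a clock hour.
    """
    weighted_sum = defaultdict(float)
    total_minutes = defaultdict(float)
    fdose = {}
    for start, duration, avg_dose, end_dose in events:
        hour = int(start // 60)
        weighted_sum[hour] += avg_dose * duration
        total_minutes[hour] += duration
        # The last event processed in an hour supplies the end-of-hour dose.
        fdose[hour] = end_dose
    hourly_avg = {h: weighted_sum[h] / total_minutes[h] for h in weighted_sum}
    return hourly_avg, fdose


if __name__ == "__main__":
    sample = [(0, 20, 1.0, 1.5), (20, 40, 4.0, 3.0), (60, 60, 2.0, 2.0)]
    print(hourly_dose(sample))
```

In the sample, the first hour holds a 20-minute event with average dose 1.0 and a 40-minute event with average dose 4.0, so the duration-weighted hourly average is (20 × 1.0 + 40 × 4.0) / 60 = 3.0.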

There are five daily summary statistics for dose. The first is DAvgDose (Daily Average Dose),
which is the average of the 24 hourly average dose values that fall on the same calendar day. The
result for each day is binned according to the levels defined by the cutpoints set in the Simulation
Control file. The second summary statistic is DMTSDose (Daily Maximum Timestep Dose), the
largest dose over all the timestep values for the day. The third summary statistic is DM1HDose
(Daily Maximum 1-Hour Dose), which is the largest of the 24 hourly dose values for a day. The
fourth is DM8HDose (Daily Maximum 8-Hour Dose), which is the largest of the 24 running
8-hour dose averages. The fifth is DMEHDose (Daily Maximum End-of-Hour Dose), which is
the largest of the 24 values of the instantaneous dose level at the end of each clock hour on a day.
All five daily summaries are binned according to the appropriate set of cutpoints in the
Simulation Control file. Finally, there is the average dose over the entire simulation period,
SAvgDose. It too is binned and tabulated according to cutpoints set in the Simulation Control
file.
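
The five daily statistics can be illustrated with a short sketch. This is not the APEX code. The function name and arguments are hypothetical, and the dictionary keys follow the statistic names in the text, with the 1-hour and 8-hour maxima written as DM1HDose and DM8HDose. Two assumptions are made: the running 8-hour averages are formed from hourly averages ending at each clock hour, and where fewer than eight hours of history are available (for example at the start of a simulation) the average is taken over the hours that exist.

```python
def daily_dose_stats(hourly_avg, timestep, fdose, prior_hours=()):
    """Compute the five daily dose statistics for one calendar day.

    hourly_avg, fdose: 24 values each; timestep: all timestep doses for the day.
    prior_hours: hourly averages preceding this day (only the last 7 are used).
    Assumption: 8-hour windows end at each clock hour and are shortened when
    fewer than 8 hours of history exist.
    """
    if len(hourly_avg) != 24 or len(fdose) != 24:
        raise ValueError("expected 24 hourly values and 24 end-of-hour values")
    history = list(prior_hours)[-7:] + list(hourly_avg)
    offset = len(history) - 24
    running8 = []
    for h in range(24):
        end = offset + h + 1
        window = history[max(0, end - 8):end]
        running8.append(sum(window) / len(window))
    return {
        "DAvgDose": sum(hourly_avg) / 24,
        "DMTSDose": max(timestep),
        "DM1HDose": max(hourly_avg),
        "DM8HDose": max(running8),
        "DMEHDose": max(fdose),
    }


if __name__ == "__main__":
    hourly = [1.0] * 24
    hourly[10] = 9.0
    end_of_hour = [0.5] * 24
    end_of_hour[3] = 2.0
    print(daily_dose_stats(hourly, hourly, end_of_hour))
```

Binning against the cutpoints in the Simulation Control file is a separate step and is not shown here.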

REFERENCES

Adams WC. 1998. Letter to Tom McCurdy, National Exposure Research Laboratory, U.S.
Environmental Protection Agency, Research Triangle Park, North Carolina. August 21.

Almuzaini KS, Potteiger JA, and Green SB. 1998. Effects of split exercise sessions on excess
postexercise oxygen consumption and resting metabolic rate. Can J Appl Physiol. 23(5):433-43.

Astrand PO and Rodahl K. 1977. Textbook of Work Physiology. 2nd ed. McGraw-Hill, New
York, New York.

Bahr R, Ingnes I, Vaage O, Sejersted OM, and Newsholme EA. 1987. Effect of duration of
exercise on excess postexercise O2 consumption. J Appl Physiol. 62(2):485-90.

Bahr R. 1992. Excess postexercise oxygen consumption—magnitude, mechanisms and practical
implications. Acta Physiol. Scand Suppl. 605:1-70.

Berthoin S, Baquet G, Dupont G, Blondel N, and Mucci P. 1996. Critical velocity and anaerobic
distance capacity in prepubertal children. Can J Appl Physiol. 28(4):561-75.

Bickham D, Le Rossignol P, Gibbons C, and Russell AP. 2002. Re-assessing accumulated
oxygen deficit in middle-distance runners. J Sci Med Sport. 5(4):372-82.

Bielinski R, Schutz Y, and Jequier E. 1985. Energy metabolism during the postexercise recovery
in man. Am J Clin Nutr. 42(1):69-82.

Billat V, Beillot J, Jan J, Rochcongar P, and Carre F. 1996. Gender effect on the relationship of
time limit at 100% VO2max with other bioenergetic characteristics. Med Sci Sports Exerc.
28(8):1049-55.

Biller WF, Feagans TB, Johnson TR, Duggan GM, Paul RA, McCurdy T, and Thomas HC.
1981. A general model for estimating exposure associated with alternative NAAQS. Paper No.
81-18.4 in Proceedings of the 74th Annual Meeting of the Air Pollution Control Association,
Philadelphia, Pa.

Box G, Jenkins G, and Reinsel G. 1994. Time Series Analysis: Forecasting and Control,
Prentice Hall, Englewood Cliffs, NJ.

Brockman L, Berg K, and Latin R. 1993. Oxygen uptake during recovery from intense
intermittent running and prolonged walking. J Sports Med Phys Fitness. 33(4):330-6.

Buck D and McNaughton L. 1999. Maximum accumulated oxygen debt must be calculated using
10 min time periods. Med Sci Sports Exerc. 31(9):1346-1349.

Burmaster DE and Crouch EAC. 1997. Lognormal distributions of body weight as a function of
age for males and females in the United States, 1976 - 1980. Risk Analysis 17(4).

Burmaster DE. 1998. LogNormal distributions for skin area as a function of body weight. Risk
Analysis. 18(1):27-32.

Carlson JS and Naughton GA. 1993. An examination of the anaerobic capacity of children using
maximum accumulated oxygen debt. Pediatr Exerc Sci. 5:60-71.

Dawson B, Straton S, and Randall N. 1996. Oxygen consumption during recovery from
prolonged submaximum cycling below the anaerobic threshold. J Sports Med Phys Fitness.
36:77-84.

Demarle AP, Slawinski JJ, Laffite LP, Bocquet VG, Koralsztein JP, and Billat VL. 2001.
Decrease of O(2) deficit is a potential factor in increased time to exhaustion after specific
endurance training. J Appl Physiol. 90(3):947-53.

Doherty M, Smith PM, and Schroder K. 2000. Reproducibility of the maximum accumulated
oxygen deficit and run time to exhaustion during short-distance running. J Sports Sci. 18(5):331-

EPA 1978. Altitude as a Factor in Air Pollution. Environmental Criteria and Assessment Office.
EPA-600/9-78-015.

EPA 1999. Total Risk Integrated Methodology. [On-line]. Available:
http://www.epa.gov/ttnatw01/urban/trim/trimpg.html.

EPA. 2002. Consolidated Human Activities Database. [On-line] Available:
http://www.epa.gov/chadnet1/.

Esmail S, Bhambhani Y, and Brintnell S. 1995. Gender differences in work performance on the
Baltimore therapeutic equipment work simulator. Amer. J. Occup. Therapy. 49: 405 - 411.

Faina M, Billat V, Squadrone R, De Angelis M, Koralsztein JP, and Dal Monte A. 1997.
Anaerobic contribution to the time to exhaustion at the minimal exercise intensity at which
maximum oxygen uptake occurs in elite cyclists, kayakists and swimmers. Eur J Appl Physiol.
Occup Physiol. 76(1):13-20.

Frey GC, Byrnes WC, and Mazzeo RS. 1993. Factors influencing excess postexercise oxygen
consumption in trained and untrained women. Metabolism. 42(7):822-828.

Galetti, P. M. 1959. Respiratory exchanges during muscular effort. Helv. Physiol. Acta. 17: 34 -
61.

Gastin PB and Lawson DL. 1994. Variable resistance all-out test to generate accumulated
oxygen deficit and predict anaerobic capacity. Eur J Appl Physiol. Occup Physiol. 69(4):331-6.

Gastin PB, Costill DL, Lawson DL, Krzeminski K, and McConell GK. 1995. Accumulated
oxygen deficit during supramaximum all-out and constant intensity exercise. Med Sci Sports
Exerc. 27(2):255-63.

Gillette CA, Bullough RC and Melby CL. 1994. Postexercise energy expenditure in response to
acute aerobic or resistive exercise. Int J Sport Nutr. 4(4):347-60.

Glen G, Smith L, Isaacs K., McCurdy T., and Langstaff J. 2007. A new method of longitudinal
diary assembly for human exposure modeling. J. Expos. Sci. Environ. Epidemiol. In press.

Gore CJ and Withers RT. 1990. Effect of exercise intensity and duration on postexercise
metabolism. J Appl Physiol. 68(6):2362-8.

Graham S and McCurdy T. 2005. Revised Ventilation Rate (Ve) Equations for Use in
Inhalation-Oriented Exposure Models, A NERL Internal Research Report.

Hagberg JM, Hickson RC, Ehsani AA, and Holloszy JO. 1980. Faster adjustment to and
recovery from submaximum exercise in the trained state. J Appl Physiol. 48(2):218-24.

Harms CA, Cordain L, Stager JM, Sockler JM, and Harris M. 1995. Body fat mass affects
postexercise oxygen metabolism in males of similar lean body mass. Med Exer Nutr Health.
4:33-39.

Harris JM, Hobson EA, and Hollingsworth DF. 1962. Individual variations in energy
expenditure and intake. Proc Nutr Soc. 21: 157-169.

Hill DW, Ferguson CS, and Ehler KL. 1998. An alternative method to determine maximum
accumulated O2 deficit in runners. Eur J Appl Physiol. Occup Physiol. 79(1):114-7.

ICRP Publication 66. 1994. Human Respiratory Tract Model for Radiological Protection.
Annals of the ICRP. International Commission on Radiological Protection.

Isaacs K and Smith L. 2005. New Values for Physiological Parameters for the Exposure Model
Input File Physiology.txt. Memorandum submitted to the U.S. Environmental Protection
Agency under EPA Contract EP-D-05-065. NERL WA 10. Alion Science and Technology.

Isaacs K, Glen G, McCurdy T., and Smith L. 2007. Modeling energy expenditure and oxygen
consumption in human exposure models: Accounting for fatigue and EPOC. J. Expos. Sci.
Environ. Epidemiol. In press.

Johnson TR and Paul RA. 1983. The NAAQS Exposure Model (NEM) Applied to Carbon
Monoxide. EPA-450/5-83-003. Prepared for the U.S. Environmental Protection Agency by PEDCo
Environmental Inc., Durham, N.C. under Contract No. 68-02-3390. U.S. Environmental
Protection Agency, Research Triangle Park, N.C.

Johnson T, Capel J, Olaguer E, Wijnberg L. 1992. Estimation of Ozone Exposures Experienced
by Residents of ROMNET Domain Using a Probabilistic Version of NEM. Report prepared by
IT Air Quality Services for the Office of Air Quality Planning and Standards, U. S.
Environmental Protection Agency, Research Triangle Park, North Carolina.

Johnson T, Capel J, and McCoy M. 1996a. Estimation of Ozone Exposures Experienced by
Urban Residents Using a Probabilistic Version of NEM and 1990 Population Data. Report
prepared by IT Air Quality Services for the Office of Air Quality Planning and Standards, U.S.
Environmental Protection Agency, Research Triangle Park, North Carolina.

Johnson T, Capel J, Mozier J, and McCoy M. 1996b. Estimation of Ozone Exposures
Experienced by Outdoor Children in Nine Urban Areas Using a Probabilistic Version of NEM.
Report prepared for the Air Quality Management Division under Contract No. 68-DO-30094,
April.

Johnson T., Capel J, McCoy M and Mozier J. 1996c. Estimation of Ozone Exposures
Experienced by Outdoor Workers in Nine Urban Areas Using a Probabilistic Version of NEM.
Report prepared for the Air Quality Management Division under Contract No. 68-DO-30094,
April.

Johnson T. 1998. Analysis of Clinical Data Provided by Dr. William Adams and Revisions to
Proposed Probabilistic Algorithm for Estimating Ventilation Rate in the 1998 Version of
pNEM/CO. Memorandum submitted to the U.S. Environmental Protection Agency under EPA
Contract No. 68-D6-0064. TRJ Environmental, Inc.

Johnson T, Mihlan G, LaPointe J, Fletcher K, Capel J, Rosenbaum A, Cohen J, Stiefer P. 2000.
Estimation of carbon monoxide exposures and associated carboxyhemoglobin levels for residents
of Denver and Los Angeles using pNEM/CO. Appendices. EPA contract 68-D6-0064.

Johnson T. 2002. A Guide to Selected Algorithms, Distributions, and Databases Used in
Exposure Models Developed By the Office of Air Quality Planning and Standards. Revised
Draft. Prepared for U.S. Environmental Protection Agency under EPA Grant No. CR827033.

Joumard R, Chiron M, Vidon R, Maurin M, and Rouzioux J-M. 1981. Mathematical models of
the uptake of carbon monoxide on hemoglobin at low carbon monoxide levels. Environmental
Health Perspectives. 41: 277 - 289.

Kaminsky LA, Padjen S, and LaHam-Saeger J. 1990. Effect of split exercise sessions on excess
post-exercise oxygen consumption. Br J Sports Med. 24(2):95-8.

Kaminsky LA, and Whaley MH. 1993. Effect of interval-type exercise on excess post-exercise
oxygen consumption in obese and normal-weight women. Med Exer Nutr Health. 2:106-111.

Katch FI, Girandola RN, and Henry FM. 1972. The influence of the estimated oxygen cost of
ventilation on oxygen deficit and recovery oxygen intake for moderately heavy bicycle
ergometer exercise. Med Sci Sports. 4:71-76.

Knuttgen HG. 1970. Oxygen debt after submaximum physical exercise. J Appl Physiol.
29(5):651-657.

Langstaff, J.E. 2007. Analysis Of Uncertainty In Ozone Population Exposure Modeling. Office
of Air Quality Planning and Standards, U.S. Environmental Protection Agency.

Maehlum S, Grandmontagne M, Newsholme EA, and Sejersted OM. 1986. Magnitude and
duration of excess postexercise oxygen consumption in healthy young subjects. Metabolism.
35(5):425-9.

Maresh CM, Abraham A, De Souza MJ, Deschenes MR, Kraemer WJ, Armstrong LE, Maguire
MS, Gabaree CL, and Hoffman JR. 1992. Oxygen consumption following exercise of moderate
intensity and duration. Eur J Appl Physiol. Occup Physiol. 65(5):421-6.

Maxwell NS and Nimmo MA. 1996. Anaerobic capacity: a maximum anaerobic running test
versus the maximum accumulated oxygen deficit. Can J Appl Physiol. 21(1):35-47.

McArdle WD, Katch FI, and Katch VL. 2001. Exercise Physiology: Energy, Nutrition, and
Human Performance, Fifth Edition. Lippincott, Williams, and Wilkins, Philadelphia.

McCurdy T. 2000. Conceptual Basis for Multi-Route Intake Dose Modeling Using an Energy
Expenditure Approach. Journal of Exposure Analysis and Environmental Epidemiology. 10:1-12.

McCurdy T, Glen G, Smith L, and Lakkadi Y. 2000. The National Exposure Research
Laboratory's Consolidated Human Activity Database, Journal of Exposure Analysis and
Environmental Epidemiology 10: 566-578.

Naughton GA, Carlson JS, Buttifant DC, Selig SE, Meldrum K, McKenna MJ, and Snow RJ.
1998. Accumulated oxygen deficit measurements during and after high-intensity exercise in
trained male and female adolescents. Eur J Appl Physiol. Occup Physiol. 76(6):525-31.

Olesen HL. 1992. Accumulated oxygen deficit increases with inclination of uphill running. J
Appl Physiol. 73(3):1130-4.

Pivarnik JM and Wilkerson JE. 1988. Recovery metabolism and thermoregulation of endurance
trained and heat acclimatized men. J Sports Med Phys Fitness. 28(4):375-80.

Renoux JC, Petit B, Billat V, and Koralsztein JP. 1999. Oxygen deficit is related to the exercise
time to exhaustion at maximum aerobic speed in middle distance runners. Arch Physiol
Biochem. 107(4):280-5.

Roberts AD, Clark SA, Townsend NE, Anderson ME, Gore CJ, and Hahn AG. 2003. Changes in
performance, maximum oxygen uptake and maximum accumulated oxygen deficit after 5, 10
and 15 days of live high:train low altitude exposure. Eur J Appl Physiol. 88(4-5):390-395.

Roddin, MF, Ellis HT, and Siddiqee WM. 1979. Background Data for Human Activity Patterns,
Vols. 1, 2. Draft Final Report prepared for Strategies and Air Standards Division, Office of Air
Quality Planning and Standards, U.S. Environmental Protection Agency, Research Triangle
Park, N.C.

Schofield, WN. 1985. Predicting basal metabolic rate, new standards, and review of previous
work. Hum Nutr Clin Nutr, 39C(Supplement 1):5 - 41.

Sedlock DA. 1991a. Effect of exercise intensity on postexercise energy expenditure in women.
Br J Sports Med. 25(1):38-40.

Sedlock DA. 1991b. Postexercise energy expenditure following upper body exercise. Res Q
Exerc Sport. 62(2):213-6.

Short KR and Sedlock DA. 1997. Excess postexercise oxygen consumption and recovery rate in
trained and untrained subjects. J Appl Physiol. 83(1):153-159.

Trost S, Wilcox A, and Gillis D. 1997. The effect of substrate utilization, manipulated by
nicotinic acid, on excess postexercise oxygen consumption. Int J Sports Med 18(2):83-88.

Weber CL and Schneider DA. 2000. Maximum accumulated oxygen deficit expressed relative to
the active muscle mass for cycling in untrained male and female subjects. Eur J Appl Physiol.
82(4):255-61.

Xue J, McCurdy T, Spengler O, Ozkaynak H. 2004. Understanding variability in the time spent
in selected locations for 7-12 year old children. J Exposure Anal Environ Epidem. 14(3):222-233.

United States Environmental Protection Agency
Office of Air Quality Planning and Standards
Health and Environmental Impacts Division
Research Triangle Park, NC
Publication No. EPA-452/B-08-001b
October 2008