The HAPEM User's Guide
Hazardous Air Pollutant Exposure Model,
Version 8

December 2023

Prepared for:

Matt Woody

US Environmental Protection Agency
Office of Air Quality Planning and Standards

Research Triangle Park, North Carolina

Prepared by:

Minti Patel, Chris Holder, Graham Glen,
Aishwarya Javali, Jared Wang,
and Melissa Polansky
ICF

2635 Meridian Pkwy., Suite 200,

Durham, NC 27713

David Yarnell, Ben Holloway, and Michael Blair
Innovate! Inc.

6189 Cobbs Road
Alexandria, VA 22310


Table of Contents

New Features in HAPEM8
1. Introduction
1.1. Organization of the User's Guide
1.2. Background
1.2.1. Population Data
1.2.2. Activity Data
1.2.3. Air-quality Data
1.2.4. ME Data
1.2.5. Stochastic Elements
1.3. Strengths and Limitations of HAPEM
1.3.1. Strengths
1.3.2. Limitations
1.4. Applicability
1.5. Brief History of the Hazardous Air Pollutant Exposure Model
2. Getting Started—An Overview of HAPEM
2.1. Model Structure
2.1.1. Parameter Files
2.1.2. The DURAV Program and the Activity and Cluster Files
2.1.3. The INDEXPOP Program and the Population, Distance-to-road, Commuting-time, and Commuting-fraction Files
2.1.4. The COMMUTE Program and the Commuting, Distance-to-road, Commuting-time, and Commuting-fraction Files
2.1.5. The AIRQUAL Program and the Air Quality and Distance-to-road Files
2.1.6. The HAPEM Program, the ME Factors and Mobiles Files, and the Activity Cluster-transition File
2.1.7. The Statefip File
2.1.8. Background Concentration
2.1.9. Exposure Output Files
2.2. Changing the Parameter Settings
2.2.1. Changing the Number of MEs
2.2.2. Changing the Number and/or Definitions of the Demographic Groups
2.2.3. Changing the Number and/or Definitions of Day Types
2.2.4. Changing the Number and/or Definitions of Time Blocks
2.3. Setting Up a HAPEM Run
2.3.1. Running HAPEM as a "Batch" Job
2.3.2. Running HAPEM Programs Individually
3. HAPEM Input Files
3.1. Parameter Files
3.1.1. Specifying the Location and Names of Input and Output Files
3.1.2. Identifying the Uniform Component of the Background Concentration
3.1.3. Setting the Internal Parameters
3.2. Activity File
3.2.1. Variables and Format of the Default File
3.2.2. Replacing or Modifying the Default File
3.3. Cluster File

3.3.1. Variables and Format of the Default File
3.3.2. Replacing or Modifying the Default File
3.4. Population File
3.4.1. Variables and Format of the Default File
3.4.2. Replacing or Modifying the Default File
3.5. Commuting-time File
3.6. Commuting-fraction File
3.7. Distance-to-road File
3.8. Commuting File
3.8.1. Replacing or Modifying the Default File
3.9. Air Quality File
3.10. ME Factors and Mobiles Files
3.11. Cluster-transition File
3.12. Statefip File
4. HAPEM Output Files
4.1. Log File
4.1.1. DURAV Output to the Log File
4.1.2. INDEXPOP Output to the Log File
4.1.3. COMMUTE Output to the Log File
4.1.4. AIRQUAL Output to the Log File
4.1.5. HAPEM Output to the Log File
4.2. Counter File
4.3. Mistract File
4.4. Final Exposure File
5. HAPEM Programs
5.1. Programming Guidelines Used to Develop HAPEM
5.1.1. Common Structural Elements
5.2. Program Descriptions
5.2.1. DURAV
5.2.2. INDEXPOP
5.2.3. COMMUTE
5.2.4. AIRQUAL
5.2.5. HAPEM
6. References
Appendix A: Updating the Hazardous Air Pollutant Exposure Model (HAPEM) for Use in the 2020 Air Toxics Screening Assessment (AirToxScreen)


Figures

Figure 2-1. Overview of HAPEM
Figure 2-2a. Example parameter file for running model programs 1-3 (DURAV, INDEXPOP, and COMMUTE)
Figure 2-2b. Example parameter file for running model programs 4-5 (AIRQUAL and HAPEM)
Figure 2-3. Example "batch" file for running the five model programs

Tables

Table 1-1. HAPEM MEs
Table 2-1. Keywords for parameter files and example filenames
Table 3-1. Variables in the default activity file
Table 3-2. Variables in the default population file
Table 3-3a. Format for the factors file
Table 3-3b. Format for the mobiles file (one onroad-mobile source category)
Table 3-4. Variables in the cluster-transition file
Table 4-1. Variables in the counter file
Table 4-2. Variables in the final exposure output file (assuming nsource = 4)
Table 5-1. The filename keywords in the parameter files recognized by the model programs


New Features in HAPEM8

The Hazardous Air Pollutant Exposure Model, version 8 (HAPEM8) includes a number of
updated features. These updated features better reflect the residential locations, work locations,
commuting habits, and activity patterns of the current (2020) U.S. population. They also are
designed to provide exposure estimates that better characterize the variability across the
population. These updated features are summarized in the list below and detailed in other
portions of this User's Guide.

•	Data on population, commuting patterns, and residential proximity to major roads have
been updated based on information from the 2020 census where possible.

•	Activity-pattern data have been updated based on the April 2020 version of CHAD-Master.


1. Introduction

The Hazardous Air Pollutant Exposure Model, version 8 (HAPEM8) User's Guide is designed to
assist exposure analysts with running and interpreting results from HAPEM8. Throughout the
User's Guide, for easier identification, the input filenames and file types are in italics (usually
lowercase), model program names are uppercase underlined, and model variables are in bold
italics. When presented, input and output data and program source code are shown in a
single-lined box, indicating that the text inside the box appears exactly as it exists in its
electronic form. In addition, shaded text boxes appear throughout the document providing useful
information and tips to users.

Most of the material in this HAPEM8 User's Guide was taken from the HAPEM7 User's Guide.

1.1. Organization of the User's Guide

The User's Guide is organized into six chapters and an appendix. Chapters 1 and 2 provide a
general overview of the background functionality of HAPEM, as well as basic instructions for
running the model. The remaining chapters are designed to provide the user with more detailed
information on the components of HAPEM. These chapters are designed to be easily
referenced without requiring the entire document to be read. We suggest, however, that the
novice user read all of the chapters at least once to gain a better understanding of HAPEM.

Chapter 1 Introduction. Provides a brief introduction to HAPEM modeling
fundamentals, including a brief history of the development of HAPEM.

Chapter 2 Getting Started—An Overview of HAPEM. Provides an overview of the
various components of HAPEM and basic information needed to run the
model.

Chapter 3 HAPEM Input Files. Provides a description of the format, data, and
options for each HAPEM input file.

Chapter 4 HAPEM Output Files. Provides a description of the format and data
associated with each HAPEM output file.

Chapter 5 HAPEM Programs. Provides a description of the purpose, operations,
inputs, and outputs, including a brief description of the computer code, for
each HAPEM computer program.

Chapter 6 References.

Appendix A A 2023 technical memorandum from ICF to EPA, providing a thorough
description of the process of updating the default input files and model
source code for HAPEM8.


1.2. Background

The Hazardous Air Pollutant Exposure Model, version 8 (HAPEM8) is a screening-level
exposure model appropriate for assessing average long-term inhalation exposures of the
general population, or a specific sub-population, over spatial scales ranging from urban[1] to
national. HAPEM provides a relatively transparent set of exposure assumptions and
approximations, as is appropriate for a screening-level model.

HAPEM uses the general approach of tracking
representatives (termed "replicates") of specified
demographic groups as they move among indoor and
outdoor microenvironments (MEs) and among
geographic locations. The estimated HAP concentrations
in each ME visited are combined into a time-weighted
average concentration, which is assigned to members of
the demographic group.
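
The time-weighted averaging described above can be illustrated with a minimal Python sketch. The function name, the durations, and the ME concentrations are hypothetical values chosen for illustration; they are not HAPEM defaults or HAPEM source code.

```python
def time_weighted_average(visits):
    """Return the time-weighted average concentration.

    visits: list of (hours_in_me, me_concentration) pairs for one replicate.
    """
    total_hours = sum(hours for hours, _ in visits)
    if total_hours <= 0:
        raise ValueError("total time must be positive")
    return sum(hours * conc for hours, conc in visits) / total_hours


# Hypothetical day: 16 h in a residence, 6 h in an office, 2 h in a vehicle.
visits = [(16.0, 0.8), (6.0, 0.5), (2.0, 2.0)]
twa = time_weighted_average(visits)  # (12.8 + 3.0 + 4.0) / 24
```

Each ME contributes in proportion to the time spent in it, so a short stay in a high-concentration ME (the vehicle in this example) raises the average only modestly.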

HAPEM uses four primary sources of information:
population data from the U.S. Census Bureau (census),
population activity data from the U.S. Environmental Protection Agency (EPA) Consolidated
Human Activity Database (CHAD), air quality data, and ME data. These data will be discussed
briefly below, and in greater detail later in this User's Guide.

1.2.1.	Population Data

The census is the primary source of most population demographic data. The census collects,
among other things, information on where people live, their demographic makeup (e.g., age,
gender, ethnic group; note that only age currently is used in HAPEM), employment (which is not
explicitly used in HAPEM), and commuting behavior. The default population data for HAPEM
currently are derived from the 2020 census reported at the spatial resolution of census tracts.
Census tracts are small, relatively permanent statistical subdivisions of a county and usually
contain between 2,500 and 8,000 residents. The six HAPEM age groups are: 0-1, 2-4, 5-15,
16-17, 18-64, and 65 years and older.
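
As an illustration only, the six age groups listed above could be looked up as in the following Python sketch. The function and constant names are hypothetical; HAPEM itself reads group definitions from its input and parameter files.

```python
from bisect import bisect_right

# Lower bounds (in years) of the six HAPEM age groups listed in the text.
AGE_GROUP_LOWER_BOUNDS = [0, 2, 5, 16, 18, 65]
AGE_GROUP_LABELS = ["0-1", "2-4", "5-15", "16-17", "18-64", "65+"]


def age_group(age_years):
    """Return the label of the age group containing age_years."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    # bisect_right finds how many lower bounds are <= age_years.
    return AGE_GROUP_LABELS[bisect_right(AGE_GROUP_LOWER_BOUNDS, age_years) - 1]
```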

A second type of population data used in HAPEM is an estimate of the fraction of the population
of each census tract that lives within certain distances of major roadways. These estimates
were derived using geospatial software to perform proximity analyses on roadway location data
from the 2022 census TIGER/Line database (see Appendix A for more details on the HAPEM
default input files). They are used, in conjunction with the PROX factors described below, to
account for the enhanced outdoor concentrations of HAPs emitted from onroad vehicles at
locations near major roadways, and the associated enhanced indoor concentrations.

1.2.2.	Activity Data

HAPEM uses four types of population activity data: activity-pattern data, commuting-flow data,
commuting-time data, and commuting-fraction data. Human activity-pattern data are used to
determine the frequency and duration of exposure for specific groups within various MEs.
Activity-pattern data are taken from demographic surveys compiled in CHAD (Graham et al.
2019) of individuals' daily activities, the amount of time spent engaged in those activities, and
the locations where the activities occur. The current HAPEM uses the version of CHAD that was
current in April 2020. CHAD contains the sequential patterns of activities for each individual
diary-day, and each activity event has a corresponding location code so that the ME of each
activity event is known. It is composed of over 178,000 person-days of activity-pattern data,
including 315 specific activities and 110 specific ME locations, collected and organized from 23
human activity-pattern surveys.

[1] Urban refers to a scale that encompasses the size of a large city, generally on the order of tens of kilometers.

A microenvironment (ME) is a three-dimensional space in which human contact with an
environmental pollutant takes place and which can be treated as a well-characterized, relatively
homogeneous location with respect to pollutant concentrations for a specified time period.

In addition to recording the duration and location of a person's activities, these surveys also
collect important demographic information about the person. The demographic information
usually includes the person's age, gender, and race/ethnicity group. Most activity-pattern
studies also try to collect information on other attributes of a respondent, such as highest level
of education completed, number of people in their household, whether the person or anyone in
their household is a smoker, employment status, and the number of hours spent outdoors. For
the purposes of the HAPEM default files, age is the only CHAD demographic attribute used,
although the current HAPEM activity input file includes gender and race/ethnicity. A commuting-
status indicator also is included in the current HAPEM activity file and used in the modeling, as
is an indication of the day type of the diary-day (HAPEM currently has three day types: summer
weekdays, other [non-summer] weekdays, and weekends).

The ME categories currently incorporated into the default population activity file for HAPEM are
presented in Table 1-1 (see Appendix A for more details on the HAPEM default input files).

Table 1-1. HAPEM MEs

Number  ME Description
------  -----------------------------------  -----------------------  ---------------------
1       Residential                          1 Indoors Residence      No
2       School                               2 Indoors Other          No
3       Hospital                             2 Indoors Other          No
4       Office                               2 Indoors Other          No
5       Public Access                        2 Indoors Other          No
6       Bar/Restaurant                       2 Indoors Other          No
7       Car/Truck                            5 In-vehicle             Yes - Private Transit
8       Public Transit                       5 In-vehicle             Yes - Public Transit
9       Air Travel                           2 Indoors Other          No
10      Waiting Indoors for Public Transit   2 Indoors Other          Yes - Public Transit
11      Waiting Outdoors for Public Transit  3 Outdoors Near-roadway  Yes - Public Transit
12      Motorcycle/Bicycle                   3 Outdoors Near-roadway  Yes - Private Transit
13      Ferryboat                            4 Outdoors Other         Yes - Public Transit
14      Residential Garage                   3 Outdoors Near-roadway  No
15      Outdoors, Near Roadway               3 Outdoors Near-roadway  No
16      Outdoors, Service Station            3 Outdoors Near-roadway  No
17      Outdoors, Parking Garage             3 Outdoors Near-roadway  No
18      Outdoors, Other                      4 Outdoors Other         No

Because available activity data are not adequate to estimate the exposure of each individual in
a population, HAPEM groups activity-pattern data together for people with similar demographic
characteristics that are expected to influence exposure to HAPs (e.g., age and commuting
status), and it makes exposure estimates for these groups. The activity profiles for each
replicate in a demographic group have an equal chance of being selected from the activity
database (see Section 1.2.5 [Stochastic Elements]). The result is that HAPEM provides a
distribution of exposure concentrations for each demographic group in each census tract.

The commuting-flow data contained in the current HAPEM default file were derived by the U.S.
Department of Transportation Federal Highway Administration (FHWA) from the 2012-2016
five-year data from the U.S. Census Bureau's American Community Survey (ACS), as part of
the Census Transportation Planning Package (CTPP) and commissioned by the American
Association of State Highway and Transportation Officials (see the CTPP web site
https://ctpp.transportation.org/2012-2016-5-year-ctpp/). The data files specify the number of
residents of each census tract that work in that tract and every other tract (i.e., the population
associated with each home-tract/work-tract pair). The geographies of the data were for 2010
tracts, so a 2010-to-2020 tract-relationship file from the Census Bureau was used to
approximate the data for 2020 tract geographies. For the current HAPEM, the distances between
the centroids of the home and work tracts were calculated outside of the CTPP, using the 2020
census gazetteer spatial files and geospatial algorithms (see Appendix A for more details on
the current HAPEM default input files). HAPEM uses these data in coordination with the
activity-pattern data to place a replicate who commutes to work either in the home tract or the
work tract at each time step.

For each census tract, the current HAPEM default commuting-time file contains the proportion
of commuting workers using public transit and the proportion using private transit, and it also
contains the average commute time stratified by public or private transit, as derived from the
2016-2020 five-year ACS (see Appendix A for more details on the current HAPEM default input
files). These data are combined with data on the centroid-to-centroid distances between tracts
(see Section 5.2.5 [HAPEM]) to estimate the commuting time for each commuting replicate.

Data specifying the fraction of each demographic group in each census tract that commutes to
work, as contained in the current HAPEM default commuting-fraction file, were derived from the
2016-2020 five-year ACS (see Appendix A for more details on the current HAPEM default input
files).

1.2.3. Air-quality Data

Some previous versions of HAPEM relied on measured outdoor HAP concentration data for the
exposure calculations. This limited both the extent of the modeling domain and the HAPs that
could be modeled, because exposures could only be calculated for locations and HAPs with large
monitoring networks. Typically, sufficient data were only available for large metropolitan areas
and for the criteria pollutants.[2]

[2] Criteria pollutants are those for which a National Ambient Air Quality Standard (NAAQS) has been set. They are
ground-level ozone, carbon monoxide, sulfur dioxide, nitrogen dioxide, lead, and particulate matter.

HAPEM is able to estimate exposures over the entire US at spatial scales as small as a census
tract. In order to preserve any characteristic diurnal patterns in ambient concentrations that
might be important in the estimation of population exposure, HAPEM can accept annual-average
concentration estimates that are stratified by time of day in the air-quality input file. The number
of time steps in the air-quality data must be an integer factor of (i.e., must divide evenly into) the
number of time steps in the activity input file (see Section 2.1.2 [The DURAV Program and the
Activity and Cluster Files]). For
example, the current HAPEM default activity file contains data in (24) 1-hour time blocks, so an
air-quality file used with the default activity file must contain data in (24) 1-hour time blocks, or
(12) 2-hour time blocks, or (8) 3-hour time blocks, and so on. The air-quality data are combined
in HAPEM with activity data to estimate exposure concentrations. The air-quality data also can
be decomposed to reflect the contributions from various emission sources. The number of
sources is a user-specified variable.
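
The time-block compatibility rule above can be checked with a short Python sketch. The function names and the 0-based block indexing are assumptions made for illustration; they do not describe how the HAPEM programs are actually coded.

```python
def activity_blocks_per_air_quality_block(n_activity_blocks, n_air_quality_blocks):
    """Return how many activity time blocks fall within each air-quality block.

    Raises ValueError if the air-quality block count does not divide the
    activity block count evenly.
    """
    if n_air_quality_blocks <= 0 or n_activity_blocks % n_air_quality_blocks != 0:
        raise ValueError(
            "number of air-quality blocks must divide the number of activity blocks"
        )
    return n_activity_blocks // n_air_quality_blocks


def air_quality_block_for(activity_block, n_activity_blocks, n_air_quality_blocks):
    """Map a 0-based activity block index to its 0-based air-quality block index."""
    width = activity_blocks_per_air_quality_block(n_activity_blocks, n_air_quality_blocks)
    return activity_block // width
```

With the default 24 one-hour activity blocks, an air-quality file with 8 blocks passes the check (3 activity blocks per air-quality block), whereas a file with 5 blocks would be rejected.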

HAPEM also is able to incorporate spatial variability of air quality within each census tract. That
is, the air quality within a tract is not limited to a single point estimate (diurnally- and source-
stratified). Spatial variability may be incorporated in two different ways. One method is to
characterize the air quality in a census tract by a set of up to 500 diurnally- and source-stratified
values. How HAPEM handles these datasets is explained below in Section 1.2.5 (Stochastic
Elements).

When air quality is characterized by a single point estimate (diurnally- and source-stratified), a
second method allows the user to specify a scalar factor to be applied to the census-tract air-
quality values, with the scalar dependent on the distance of the replicate's residence from a
major roadway. This approach also is discussed in Section 1.2.5 (Stochastic Elements).

1.2.4.	ME Data

To calculate the exposure concentration for each demographic group, an estimate is required of
the concentration in each ME specified by the activity pattern. In HAPEM, these ME
concentration estimates are derived from the outdoor-concentration estimate for the census
tract and a set of three ME factors: PEN, PROX, and ADD. These respectively account for
penetration of outdoor air into the ME, concentration enhancement due to proximity of the ME to
the emission source, and emission sources within the ME (note that ADD factors are currently
set to zero, as discussed in Section 2.1.6 [The HAPEM Program, the ME Factors and Mobiles
Files, and the Activity Cluster-transition File]).
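
The roles of the three factors can be sketched in Python as follows. This section names the factors but does not state how they are combined, so the form used here (PROX and PEN scale the outdoor concentration, and ADD is added to the result) is an assumption based on the roles described above; the function name and the numeric values are hypothetical.

```python
def me_concentration(outdoor_conc, pen, prox=1.0, add=0.0):
    """Estimate an ME concentration from the tract outdoor concentration.

    Assumed form (not confirmed by this section): PROX * PEN * outdoor + ADD.
    The defaults mirror the default point values noted in the text:
    PROX = 1.0 (no proximity enhancement) and ADD = 0.0 (no indoor sources).
    """
    return prox * pen * outdoor_conc + add


# Hypothetical values: outdoor concentration 2.0, penetration factor 0.5.
indoor = me_concentration(2.0, pen=0.5)
near_road = me_concentration(2.0, pen=0.5, prox=1.5)
```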

The ME factors are entered into the model as data from input files that contain estimates of
distributions for PEN, PROX, and ADD for three phases of HAPs: gases, particles, and HAPs
that might be either phase depending on various conditions. The current HAPEM default PEN
distributions were obtained from an extensive review of literature and databases on
indoor/outdoor ratios of HAPs. The current HAPEM default PROX distributions for onroad-
mobile sources were derived from modeling studies of the concentration gradients of HAPs near
major roadways.[3] How the distributions are utilized in HAPEM is discussed below in Section
1.2.5 (Stochastic Elements).

As is the case with all other HAPEM input files, these data can be modified by the user. The ME
factors should be updated as needed to reflect current knowledge, as available.

1.2.5.	Stochastic Elements

Although it would be difficult to accurately represent the activities of an individual due to day-to-
day variation, the general behavior of population groups can be well represented using
stochastic processes. This makes it possible for estimates of population exposure to be
characterized as distributions rather than point estimates. HAPEM incorporates six stochastic
elements, as described below.

[3] The default PROX values for other emission source categories are point values of 1.0 (i.e., no concentration
enhancement due to proximity), and the default ADD values are point values of 0.0 (i.e., no indoor emission
sources).


1.2.5.1.	Commuting Status

The first stochastic element in the construction of a replicate is the determination of the
commuting status (yes or no), according to the probabilities specific to census tracts and
demographic groups.

1.2.5.2.	Activity Patterns

The second stochastic element is the selection of daily activity patterns to represent the
demographic group and commuting status of the replicate. HAPEM estimates long-term-
average concentrations, but the available sequences of population activity data are specified for
24-hour periods only. The general approach used by HAPEM for constructing long-term-
average activity sequences from short-term records is composed of several steps (see
Appendix A for a detailed discussion, which is briefly summarized here). The first is to select
three sets of 24-hour activity patterns, where each set is used to construct an average pattern
for an individual for one of the three specified HAPEM day types. A set of patterns, rather than a
single pattern, is selected for each day type to reflect the day-to-day variability of activity
patterns for an individual. How the set of patterns is combined into an average pattern for the
day-type is explained later in this section.

Next, the corresponding exposure concentration is calculated for each of the three day-type-
average activity patterns. Then, a weighted average of the three exposure concentrations is
calculated to represent the annual-average concentration, where the weightings represent the
number of days per year for each day type (i.e., 65 for summer weekdays, 196 for other
weekdays, and 104 for weekends). This process is repeated for several replicates[4] for each
combination of census tract and demographic group, to create a set of annual exposure-
concentration estimates for each group in each tract.
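
The day-type weighting described above can be written as a short Python sketch. The day counts come from the text; the dictionary keys, function name, and example concentrations are hypothetical.

```python
# Days per year for each HAPEM day type, as stated in the text (sums to 365).
DAYS_PER_YEAR = {"summer_weekday": 65, "other_weekday": 196, "weekend": 104}


def annual_average(conc_by_day_type):
    """Weight each day-type exposure concentration by its number of days per year."""
    total_days = sum(DAYS_PER_YEAR.values())
    weighted_sum = sum(
        DAYS_PER_YEAR[day_type] * conc_by_day_type[day_type]
        for day_type in DAYS_PER_YEAR
    )
    return weighted_sum / total_days


# Hypothetical day-type exposure concentrations for one replicate.
example = {"summer_weekday": 2.0, "other_weekday": 1.0, "weekend": 0.5}
annual = annual_average(example)  # (130 + 196 + 52) / 365
```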

To implement this approach, first all the activity-pattern data are grouped according to
demographic group, day type, and commuting status. Then, for each group/commuting/day-type
combination, the activity patterns are stratified into one to three categories, based on similarity
of time spent in the various MEs, as determined by cluster analysis (see Appendix A for a
detailed discussion on clustering).

Transition probabilities between categories are derived from empirical data of sequenced diary
records. Given that the first day of a 2-day sequence falls into category X, the transition
probabilities specify the relative frequency of the second day falling into each possible category.
For example, if half of the 2-day sequences with the first day in category X also have the second
day in category X, the X-to-X transition probability would be 0.5.
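
The relative-frequency calculation described above can be sketched in Python. This is an illustrative counting routine under the assumption that each 2-day sequence is available as a pair of category labels; it is not the procedure used to build the default cluster-transition file (see Appendix A for that).

```python
from collections import Counter, defaultdict


def transition_probabilities(day_pairs):
    """Estimate category-to-category transition probabilities.

    day_pairs: iterable of (first_day_category, second_day_category) pairs.
    Returns {first_category: {second_category: probability}}.
    """
    counts = defaultdict(Counter)
    for first, second in day_pairs:
        counts[first][second] += 1
    return {
        first: {second: n / sum(row.values()) for second, n in row.items()}
        for first, row in counts.items()
    }


# Hypothetical 2-day sequences: half of those starting in X also end in X.
pairs = [("X", "X"), ("X", "Y"), ("X", "X"), ("X", "Z"), ("Y", "X")]
probs = transition_probabilities(pairs)  # probs["X"]["X"] is 0.5
```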

[4] The number of replicates is a user-specified variable.

The HAPEM algorithms construct an average activity pattern for each replicate by randomly
selecting one activity pattern from each category and combining them with weighted averaging.
The weights represent the relative frequency of days from each category for the replicate
represented. To determine the averaging weights to use, the algorithms perform a Markov
process based on the category-to-category transition probabilities. For example, suppose the
day type is summer weekday. Because there are 65 summer weekdays in a year, 65 random
selections are made of categories. The category for the first day is selected randomly from the
set of categories using the relative frequency of each category as the probability of selection.
The category for the second day is selected according to the transition probabilities from the first
day's category. The category for the third day is selected according to the transition probabilities
from the second day's category. This is repeated until 65 category selections are made. The
weight given each activity pattern in the averaging process is the number of times its category
was selected in the Markov process.
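
The Markov process and weighted averaging described above can be sketched in Python as follows. This is a simplified illustration, not the HAPEM source code: the function names, the two-category example, the probabilities, and the two-ME activity patterns (hours per ME) are all hypothetical.

```python
import random
from collections import Counter


def category_day_counts(initial_probs, transition_probs, n_days, rng):
    """Simulate n_days of category selections and count days per category."""
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    categories = list(initial_probs)
    # First day: select by the relative frequency of each category.
    current = rng.choices(categories, weights=[initial_probs[c] for c in categories])[0]
    counts = Counter([current])
    # Later days: select by the transition probabilities from the previous day.
    for _ in range(n_days - 1):
        row = transition_probs[current]
        options = list(row)
        current = rng.choices(options, weights=[row[c] for c in options])[0]
        counts[current] += 1
    return counts


def average_pattern(pattern_by_category, counts):
    """Average one pattern per category, weighted by days spent in each category."""
    total_days = sum(counts.values())
    n_mes = len(next(iter(pattern_by_category.values())))
    return [
        sum(counts[c] * pattern_by_category[c][i] for c in counts) / total_days
        for i in range(n_mes)
    ]


rng = random.Random(0)
initial = {"A": 0.5, "B": 0.5}
transitions = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
counts = category_day_counts(initial, transitions, n_days=65, rng=rng)

# One randomly selected pattern per category: hours in [ME 1, ME 2].
patterns = {"A": [20.0, 4.0], "B": [12.0, 12.0]}
avg = average_pattern(patterns, counts)
```

Because every pattern covers a full 24-hour day, the averaged pattern also sums to 24 hours regardless of how the 65 days are split among categories.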

1.2.5.3.	Work Tract

Another stochastic process is applied in HAPEM for replicates that commute to work. For those
replicates, a work census tract is selected at random from the set of work tracts specified for the
replicate's home tract, using the proportion of workers commuting to each work tract as its
selection probability.
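
A minimal Python sketch of this proportional selection is shown below. The function name, the tract identifiers, and the worker counts are hypothetical and serve only to illustrate the weighting.

```python
import random


def select_work_tract(workers_by_work_tract, rng):
    """Pick a work tract with probability proportional to its worker count."""
    tracts = list(workers_by_work_tract)
    weights = [workers_by_work_tract[t] for t in tracts]
    return rng.choices(tracts, weights=weights)[0]


# Hypothetical commuting flows from one home tract to three work tracts.
flows = {"tract_A": 120, "tract_B": 60, "tract_C": 20}
rng = random.Random(1)
work_tract = select_work_tract(flows, rng)
```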

1.2.5.4.	ME Factors

Another stochastic feature of HAPEM is the ability to characterize ME factors as variable,
instead of uniform over the population. That is, three of the four ME factors (PEN, PROX, and
ADD) can be represented by probability distributions rather than point estimates.[5] Several
distribution types may be used, as discussed in Section 3.10 (ME Factors and Mobiles Files).
For each replicate, a different set of ME factors is randomly selected.

1.2.5.5.	Air Quality—General

HAPEM has the ability to characterize outdoor air concentrations as spatially variable within a
census tract. It can do this in two different ways. One approach is to characterize the air quality
for each tract as a dataset with up to 500 sets of values (i.e., diurnally- and source-stratified).
Then, for each replicate, a different set of ambient air concentrations is selected for the home
(and work) tract to reflect the spatial variability in air quality within the tract.

1.2.5.6.	Air Quality—Onroad Vehicles

When air quality is characterized by a single point estimate (diurnally- and source-stratified),
another approach is used to account for enhanced onroad-vehicle-related HAP concentrations
in the vicinity of major roadways. To implement this approach, the distance of the replicate's
home (and workplace) from a major roadway is randomly selected based on probabilities
specific to census tracts and demographic groups. A PROX factor is then selected from a
distribution and applied to the census-tract air-quality values for onroad-mobile sources, with the
distribution dependent on the selected distance.
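
The two-step selection described above (first a distance, then a distance-dependent PROX factor) can be sketched in Python. Everything specific in this sketch is an assumption made for illustration: the distance bins, their probabilities, the use of lognormal distributions, and the distribution parameters. The actual values and distribution types are defined in the HAPEM input files (see Section 3.10).

```python
import random

# Hypothetical distance bins with tract- and group-specific probabilities.
DISTANCE_PROBS = {"near": 0.2, "mid": 0.3, "far": 0.5}
# Hypothetical lognormal (mu, sigma) parameters for PROX in each distance bin.
PROX_LOGNORMAL_PARAMS = {"near": (0.9, 0.3), "mid": (0.4, 0.2), "far": (0.0, 0.1)}


def sample_onroad_prox(distance_probs, prox_params, rng):
    """Select a distance bin, then draw a PROX factor for that bin."""
    bins = list(distance_probs)
    distance_bin = rng.choices(bins, weights=[distance_probs[b] for b in bins])[0]
    mu, sigma = prox_params[distance_bin]
    return distance_bin, rng.lognormvariate(mu, sigma)


rng = random.Random(42)
distance_bin, prox = sample_onroad_prox(DISTANCE_PROBS, PROX_LOGNORMAL_PARAMS, rng)

# The factor is applied only to the onroad-mobile portion of the tract value.
onroad_conc = 1.2  # hypothetical tract-level onroad concentration
adjusted_onroad_conc = prox * onroad_conc
```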

1.3. Strengths and Limitations of HAPEM

All models have strengths and limitations. Therefore, for each application, it is important to
carefully select the model that has the desired attributes. The following sections provide a
summary of the strengths and potential limitations of HAPEM. However, this is not an
exhaustive list and may not address features important for specific applications of an exposure
model.

[5] As noted above, in practice the default PROX values for emission source categories other than onroad vehicles
are point values of 1.0 (i.e., no concentration enhancement due to proximity), and the default ADD values are
point values of 0.0 (i.e., no indoor emission sources). However, HAPEM8 contains the structure to characterize
these as distributions if appropriate data are available.


1.3.1.	Strengths

One strength of HAPEM is the ability to use air-concentration estimates from modeling, allowing
exposure to population groups to be simulated at the census-tract level rather than relying solely
on data from the limited (in both areal extent and HAPs measured) nationwide network of fixed-site
monitors.

Another important feature of HAPEM is its versatility. The model is designed so that input data
specific to different applications can be used without having to rewrite the computer source
code. This flexibility is possible because most specifications are not "hard wired" into the
model's code. Instead, the necessary input data are entered through external databases and
the modeling parameters are specified through an external file. This feature allows easier use of
new data, or other information (e.g., ME factors) used by the model, as they become available.

Another strength of HAPEM is its ability to estimate the exposures of workers in the geographic
area where they work, in addition to the geographic area where they live, since the HAP
concentrations in these locations may be very different.

Another important feature of HAPEM is the incorporation of stochastic processes for the
selection of activity patterns, work census tracts, ambient air quality among locations within a tract,
and ME factors, so that more of the variability in the exposure estimates can be captured than
simply the variability associated with residential tract.

Exposure assessment with HAPEM has also been facilitated by development of default input
files derived from the databases discussed above: national census population and commuting
information, CHAD activity data, and variable ME factors for gases, particles and those HAPs
that might be either gaseous or particulate depending on conditions.

1.3.2.	Limitations

HAPEM calculates long-term average exposure concentrations in order to address exposures to
HAPs with carcinogenic and other long-term effects. Thus, HAPEM does not preserve the time-
sequence of exposure events when sampling from the time-activity databases. The result is that
information used to evaluate possible correlations in exposures to different HAPs due to
activities that are related in time is not preserved.

HAPEM only estimates exposures experienced through inhalation. For certain HAPs, inhalation
might not be the major route of exposure, and, therefore, HAPEM may underestimate
exposures in these instances. Also, although HAPEM is an inhalation-exposure model, it does
not include any measures of the ventilation rate associated with an activity, so there is no ability
to calculate the potential dose received when engaging in various activities.

Uncertainty in the prediction distributions is not addressed. Some of the uncertainties are as
follows.

•	The activity-pattern data are limited. Only three of the 23 studies in the version of CHAD
used for the current HAPEM were national in scope (with several other studies covering
multiple metropolitan areas or state-scale); therefore, the combined dataset does not
constitute a representative sample, at least with respect to geographic region.

•	Commuting-pattern data address only home-to-work travel. The population not
employed outside the home is assumed to always remain in the residential census tract.
Further, although several of the HAPEM MEs account for time spent in travel, the travel
is assumed to always occur either in the home or work tract. No provision is made for
the possibility of passing through other tracts during travel.

•	The ME PEN factor distributions incorporated into the current HAPEM were derived from
reported measurement studies. The data available were quite limited. As a result, most
factors were not derived from a representative sample of measurements, and many
were inferred from measurements of different HAPs and/or MEs that would be expected
to be similar. In addition, the derivation of the PEN factors assumed that measured
indoor:outdoor ratios of 1.0 or less indicate the absence of indoor emission sources.
Because this assumption is unlikely to be uniformly valid, PEN factors are likely to
overestimate penetration by some unknown amount.

•	The ME PROX factor distributions incorporated into the current HAPEM for the onroad-
vehicle source category were derived from modeling studies for Portland, Oregon. They
are subject to the standard uncertainties of air-dispersion modeling. They also are
subject to the uncertainties of extrapolating from the traffic patterns of Portland to other
locations.

•	Air-quality data from modeling studies are uncertain, due to simplifications incorporated
into modeling algorithms and limitations of input data (e.g., emissions, meteorology). Air-
quality measurements also are uncertain due to limitations of measurement technology
(e.g., minimum detection limits) and unknown representativeness of monitoring
locations.

1.4.	Applicability

HAPEM is a screening-level exposure model appropriate for assessing average long-term
inhalation exposures of the general population, or a specific sub-population, over spatial scales
ranging from urban to national. Due to its design features, HAPEM is not appropriate for
modeling short-term (e.g., hourly or daily) exposure events, nor should the model be used to
assess the exposure of individuals.

The model is designed to look at the "typical" inhalation exposures of different groups, including
their variance across the population. However, it should not be used to quantify episodic "high-
end" inhalation exposure that results from highly localized HAP concentrations and/or activities
that, by their nature, could result in potentially high exposures (e.g., occupational exposures).
Furthermore, HAPEM cannot address cumulative exposure from multiple HAPs or HAP
mixtures.

1.5.	Brief History of the Hazardous Air Pollutant Exposure Model

In 1985, the EPA's Office of Mobile Sources (OMS)6 developed a model for estimating human
exposure to nonreactive pollutants emitted by mobile sources. This model was similar to the
probabilistic National Ambient Air Quality Standards Exposure Model (pNEM) in that both
simulated the movements of population groups between home and work locations and through
various MEs. They differed, however, in several respects. The pNEM provided minute-by-
minute exposure estimates, which could be averaged over longer time periods, whereas the
model now known as HAPEM provided annual-average exposure estimates. The pNEM
included stochastic processes for estimating uncertainty and variability, while HAPEM provided
only point estimates. HAPEM also included the ability to estimate cancer incidence using risk
factors developed by EPA, a capability not available to pNEM.

6 The EPA changed this name to the Office of Transportation and Air Quality in 1999.

OMS extended the modeling methodology in 1991 to estimate annual-average carbon
monoxide (CO) exposures in urban and rural areas under specified control scenarios. The
model was renamed the Hazardous Air Pollutant Exposure Model for Mobile Sources (HAPEM-
MS). HAPEM-MS used the estimated annual-average CO exposures to estimate annual-
average exposures to various HAPs associated with mobile sources. This was achieved by
assuming the annual-average exposure to each HAP was linearly proportional to the annual-
average CO exposure. The model was limited by the fact that it could be run only for specified
urban areas with ambient fixed-site CO monitors.

Shortly thereafter, EPA's Office of Research and Development (ORD) developed an enhanced
version of HAPEM-MS, called HAPEM-MS2. HAPEM-MS2 sub-divided the annual exposures by
calendar quarter (i.e., 3-month periods) to more accurately estimate exposures to mobile
sources as a function of outdoor air temperature. HAPEM-MS2 also increased the number of
MEs from 5 to 37, increased the number of demographic groups from 11 to 23, and increased
the size of the activity-pattern database.

In 1996, ORD further enhanced HAPEM by creating another generation of the model called
HAPEM-MS3. These enhancements included adding the ability to customize the demographic
groups, updating the census data using the 1990 census, and developing an algorithm for
estimating ambient impacts in residences with attached garages.

Until the spring of 1998, HAPEM-MS3 could only be run on an EPA mainframe computer.
During early model development, use of the mainframe was necessary because the model
required the storage of large data files and the calculation of large internal arrays. By 1998,
advances in computing technology had made it possible for HAPEM-MS3 to be executed on
a "workstation." To this end, in the spring of 1998, HAPEM-MS3 was migrated (i.e., transferred)
to the UNIX operating system on a workstation. During the migration, further enhancements to
the model were made, including a new time-activity database derived from CHAD, a new air-
quality program that automatically selects air-pollutant monitoring sites, and a more efficient
implementation of the commuting algorithm.

Immediately after the release of the UNIX-version of HAPEM-MS3, ORD, in association with the
EPA's Office of Air Quality Planning and Standards (OAQPS), again made substantial
improvements to the model. The newer model had two distinct improvements over the 1998
UNIX-version. First, the flexibility of the model was expanded to allow the use of modeled air-
quality data as well as measured data. This added functionality allowed the second
improvement: expanding the areal extent of the model to include the entire contiguous US at the
census-tract level. With these improvements, the model was able to directly estimate exposures
to HAPs, and hence the model was again renamed by dropping the mobile source (-MS)
acronym.

An earlier version of the model, HAPEM4, had other enhancements as well. These included
broader flexibility in defining the study area (this can range from a single census tract up to the
entire contiguous US), population and commuting data for all census tracts in the country, a
database of (non-variable) ME factors for more than 30 HAPs, stochastic selection of activity
data, and the ability to allow the user to change internal modeling parameters such as the
number of MEs. EPA used HAPEM4 in its National Air Toxics Assessment (NATA) for 1996, a
periodic assessment designed to help assess the prevalence of air toxics in the US and an
important part of EPA's Integrated Urban Air Toxics Strategy.

HAPEM5 incorporated additional enhancements. These included the use of variable ME factors
and air-quality data that are spatially variable within census tracts. It also contained a more
refined approach for extrapolating short-term (24-hour) activity patterns into annual activity
patterns, to better reflect the day-to-day variability in an individual's activities. HAPEM5 was
applied as part of the NATA for 1999.

HAPEM6 included the ability to account for enhanced onroad-vehicle-related HAP
concentrations in the vicinity of major roadways, a more accurate characterization of the fraction
of the population of each census tract that commutes to work, and a more accurate estimate of
the duration of commuting to work.

HAPEM7 and HAPEM8 are not fundamentally different from HAPEM6. Both include updates to
all census- and CHAD-related data in the default input files. The HAPEM7 update included 18
default microenvironments (up from 14 in HAPEM6). HAPEM7 was applied as part of the NATA
for 2011, while HAPEM8 was applied as part of the AirToxScreen for 2020 (AirToxScreen—Air
Toxics Screening Assessment—is the successor to NATA).

NOTE: HAPEM currently contains enhanced algorithms for estimating exposure concentrations
from indoor emission sources. However, the algorithms have undergone only limited testing,
and development of the databases required to implement these algorithms is not complete.
Therefore, we do not recommend the use of these algorithms at the present time.

2. Getting Started—An Overview of HAPEM

This chapter provides the user with the basic information needed to run HAPEM. The topics
addressed in this chapter include the functions of the programs that are contained in HAPEM,
the contents of the various input and output files, and the meanings of parameter values. The
chapter has been separated into the following sections.

Section 2.1 Model Structure. Describes the general structure of HAPEM, the input
and output files, and the parameter settings.

Section 2.2 Changing the Parameter Settings. Discusses considerations for changing
parameter settings.

Section 2.3 Setting Up a HAPEM Run. Provides instructions for setting up and
running HAPEM.

Figure 2-1 presents a graphical overview of HAPEM, including the types of data needed and the
types of output produced by the model. The user should refer back to the figure while reading
this chapter to understand how all the pieces of the model fit together.

Figure 2-1. Overview of HAPEM

[Flow diagram. Elements shown: activity patterns by day type, demographic group, and cluster
category; air-concentration estimates (e.g., from AERMOD); multiple sets of air-quality diurnal
patterns for census tracts by source category; indexed population data (population for each
demographic group for census tracts); the HAPEM program; and exposure concentrations
(multiple estimates by census tract and source category for each demographic group and the
total population). The legend distinguishes input data, HAPEM programs, intermediate data,
and output data.]

2.1. Model Structure

HAPEM contains five programs. These are listed below.

1. DURAV
2. INDEXPOP
3. COMMUTE
4. AIRQUAL
5. HAPEM

Because several output files of these programs are used as inputs to other programs of the set,
it is important to execute them in the order presented. The COMMUTE program is omitted if
commuting is not included in the exposure assessment.

For a given modeling domain (e.g., a state, a set of states, the entire US), programs 1-3 need
to be executed only once, even if several different air-quality scenarios/HAPs are evaluated.
Programs 4-5 need to be executed one time each for each air-quality scenario/HAP. The
modeling domain for running programs 4-5 must be included in the modeling domain used for
running programs 1-3, but it may be smaller. For example, if programs 1-3 are run for the
entire US, the output files from these runs may then be used by programs 4-5 for evaluating a
single state or set of states.
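
The execution order described above can be sketched as a small planning helper. This is only an illustration: the executable names (durav, indexpop, commute, airqual, hapem) and the function name build_run_plan are hypothetical placeholders, not names defined by this guide, and should be replaced with the names of the compiled programs on the user's system.

```python
def build_run_plan(shared_param_file, scenario_param_files, include_commute=True):
    """Return the ordered list of (executable, parameter file) pairs for a study.

    Programs 1-3 are planned once for the modeling domain; programs 4-5 are
    planned once per air-quality scenario/HAP. Executable names are
    hypothetical placeholders.
    """
    shared_programs = ["durav", "indexpop"]
    if include_commute:
        # COMMUTE is omitted when commuting is not part of the assessment.
        shared_programs.append("commute")

    plan = [(exe, shared_param_file) for exe in shared_programs]
    for param_file in scenario_param_files:
        plan.append(("airqual", param_file))
        plan.append(("hapem", param_file))
    return plan


if __name__ == "__main__":
    for exe, param_file in build_run_plan("params_1-3.txt",
                                          ["params_benzene.txt", "params_toluene.txt"]):
        print(exe, param_file)
```

Each pair corresponds to one command line in which the parameter file name follows the executable name.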

The model programs use 12 groups of user-supplied input data files and two or more
parameter files. All are in American Standard Code for Information Interchange (ASCII) format. A
parameter file identifies the user-supplied input files and the output files available to the user, and
specifies the parameter settings for a model run.

2.1.1. Parameter Files

The information required in the parameter files is presented in Table 2-1 in a way that shows what
information is supplied by user-defined files, what is supplied by user-defined parameters, and
which model program requires the information. With one exception, noted below, any information
in the parameter files that is not required will be ignored by the program. This allows wide
flexibility in the use of parameter files. For example, one approach would be to construct and use
a separate parameter file for each model program, with each parameter file including only the
information required by its corresponding program. An alternative approach is to use the same
parameter file for running more than one program by aggregating all the information needed for
each program into the file. We recommend using one parameter file for running programs 1-3,
and a separate parameter file for each set of program 4-5 runs (i.e., for each air-quality
scenario). This configuration provides a balance between avoiding errors in duplicating
information used by more than one program, and keeping track of the input files used for each
air-quality scenario. To avoid using the wrong parameter file, a checking feature has been
included in programs 1-3 so that they will stop if the keyword nreplic (required by programs 4-5)
is encountered in the parameter file. The name of the parameter file is specified on the command
line just after the name of the executable file to be run.

NOTE: We recommend that the user prepare a separate parameter file for each air-quality
scenario/pollutant evaluation. Using distinct files, rather than re-using the same file repeatedly
(i.e., by editing it between runs), will assist the user in keeping track of the differences between
various model runs, because the parameter file serves as a record of the job settings.

Table 2-1. Keywords for parameter files and example filenames

Program: DURAV_HAPEM8.f90
  User-defined files
    Inputs:   activity file (e.g., durhw_HAPEM8.txt)
              cluster file (e.g., cluster_HAPEM8.txt)
    Outputs:  log_file.txt
              counter.dat
  User-defined parameters
              nmicro, nblock, hblock, ntype, ngroup
  Model-defined files
    Outputs:  durhw_HAPEM8.da
              durhw_HAPEM8.nonzero

Program: INDEXPOP_HAPEM8.f90
  User-defined files
    Inputs:   population file (e.g., population_HAPEM8.txt)
              distance-to-road file (e.g., proximity_road_HAPEM8.txt)
              commuting-time file (e.g., commute_time_HAPEM8.txt)
              commuting-fraction file (e.g., commute_fraction_HAPEM8.txt)
              statefip file (e.g., FIPS_StateCrosswalk_HAPEM8.DAT)
    Outputs:  log_file.txt
              counter.dat
  User-defined parameters
              region1, region2, ngroup
  Model-defined files
    Outputs:  population_HAPEM8.da
              population_HAPEM8_direct.ind
              population_HAPEM8.county_tract_pop_range
              population_HAPEM8.state_county_pop_range
              proximity_road_HAPEM8.STIDX
              proximity_road_HAPEM8.dat
              commute_time_HAPEM8.STIDX
              commute_time_HAPEM8.dat
              commute_fraction_HAPEM8.STIDX
              commute_fraction_HAPEM8.dat

Program: COMMUTE_HAPEM8.f90
  User-defined files
    Inputs:   commuting file (e.g., commute_flow_HAPEM8.txt)
              population file (e.g., population_HAPEM8.txt)
              distance-to-road file (e.g., proximity_road_HAPEM8.txt)
              commuting-time file (e.g., commute_time_HAPEM8.txt)
              commuting-fraction file (e.g., commute_fraction_HAPEM8.txt)
              statefip file (e.g., FIPS_StateCrosswalk_HAPEM8.DAT)
    Outputs:  log_file.txt
              counter.dat
              mistract.dat
  User-defined parameters
              region1, region2, keep
  Model-defined files
    Inputs:   population_HAPEM8_direct.ind
              population_HAPEM8.county_tract_pop_range
              population_HAPEM8.state_county_pop_range
              proximity_road_HAPEM8.STIDX
              proximity_road_HAPEM8.dat
              commute_time_HAPEM8.STIDX
              commute_time_HAPEM8.dat
              commute_fraction_HAPEM8.STIDX
              commute_fraction_HAPEM8.dat
    Outputs:  commute_flow_HAPEM8.st_comm1_fip_range
              commute_flow_HAPEM8.da
              commute_flow_HAPEM8.ind

Program: AIRQUAL_HAPEM8.f90
  User-defined files
    Inputs:   air quality file
              population file (e.g., population_HAPEM8.txt)
              distance-to-road file (e.g., proximity_road_HAPEM8.txt)
              statefip file (e.g., FIPS_StateCrosswalk_HAPEM8.DAT)
    Outputs:  log_file.txt
              counter.dat
              mistract.dat
  User-defined parameters
              hblock, ngroup, nsource, region1, region2, nreplic
  Model-generated files
    Inputs:   population_HAPEM8.da
              population_HAPEM8_direct.ind
              proximity_road_HAPEM8.STIDX
              proximity_road_HAPEM8.dat
    Outputs:  HAP.state_air_fip_range
              HAP.state_air1_fip_range
              HAP.state_air2_fip_range
              HAP.da
              HAP.air_da
              HAP.pop_air_da

Program: HAPEM_HAPEM8.f90
  User-defined files
    Inputs:   factors file (e.g., factors_*_HAPEM8.txt)
              mobiles file (e.g., factors_OnroadMobile_*_HAPEM8.txt)
              population file (e.g., population_HAPEM8.txt)
              air quality file
              commuting file (e.g., commute_flow_HAPEM8.txt)
              activity file (e.g., durhw_HAPEM8.txt)
              cluster-transition file (e.g., clustrans_HAPEM8.txt)
              Product files (specify path only; see note 1)
              AutoPduct file (see note 1)
    Outputs:  log_file.txt
              counter.dat
              mistract.dat
              afile file (path only)
  User-defined parameters
              pollutant, CAS (see note 1), unit, EPA, nmobiles, nemicro, nbmicro, nvehicles, npublict, year
              backg, sarod, nmicro, hblock, ntype, ngroup, nsource, nreplic, region1, region2
              Rseed1, Rseed2, Rseed3, B_00, B_02, B_05, B_16, B_18, B_65
  Model-generated files
    Inputs:   commute_flow_HAPEM8.st_comm1_fip_range
              commute_flow_HAPEM8.da
              commute_flow_HAPEM8.ind
              activity_CHAD_HAPEM8.da
              activity_CHAD_HAPEM8.nonzero
              HAP.state_air_fip_range
              HAP.state_air1_fip_range
              HAP.da
              HAP.air_da
              HAP.pop_air_da

Note: some entries in the above table are presented side-by-side instead of down the page, to save space

1 A path to one or more indoor-emission-source inputs for the indoor-source algorithms is specified in these statements (with the AutoPduct
statement including a filename). These algorithms are included in the HAPEM program but have not yet been tested and reviewed.
Therefore, they are currently not recommended for use, and instructions for their use are omitted from this document. The Chemical
Abstract Service (CAS) registry number is used to identify files for inputs to the HAPEM indoor-source algorithms. To disable the indoor
source algorithms, set keyword CAS to 99999, and specify any existing path (and file for AutoPduct, other than those otherwise specified
for input or output for the HAPEM program) since no indoor source files will then actually be utilized by the HAPEM program.

For a record in the parameter file to be processed by the program, it must contain an equal sign
("="). Other records in the file are ignored by the program. The left side of the equal sign
contains a user-supplied key word or phrase for each user-defined file and parameter, as
indicated in Table 2-1. Note that the word "file" is part of the file key phrase (e.g., "activity file").
On the right side of the equal sign is specified a full file pathname (all files except the final
exposure output files and the indoor-source files), a pathname (the final exposure output files
and the indoor-source files7), or a parameter value. As currently configured, HAPEM creates an
exposure output file for each state/HAP combination. The names of these files are constructed
by the program based on the HAP's SAROAD code and the state's Federal Information
Processing Standard (FIPS) code, so that the user need not supply names for these files in the
parameter file. However, the user must supply the SAROAD code for the HAP in the parameter
file of the AIRQUAL and HAPEM programs as the value for the parameter sarod.

The names of the other user-defined input and output files should consist of two parts,
separated by a dot. The part of the name preceding the dot, including the path, is the root,
and the part following the dot is the extension. Note that the program will not process a record in
the parameter file that is longer than 120 characters, including the key word/phrase, the equal
sign, and the filename/path or parameter value. The number of spaces between the keywords
and the "=" signs and between the "=" signs and the filenames is not fixed and therefore can
be any reasonable number. Figure 2-2a and Figure 2-2b present example parameter files that
can be used to run programs 1-3 and 4-5, respectively. Note that the input and output
filenames must be listed before the parameter settings.
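
The record rules above (only records containing "=" are processed, free spacing around the "=", and a 120-character limit) can be illustrated with a minimal Python sketch. It is not the parsing code used by the HAPEM programs, which are written in Fortran; the function name parse_parameter_records is hypothetical. Raising an error for an over-long record is a choice made here so that the problem is visible; the guide states only that such records are not processed. Trailing descriptive text after a value is not handled.

```python
MAX_RECORD_LENGTH = 120  # limit stated in the guide, counting key, "=", and value


def parse_parameter_records(lines):
    """Collect key/value pairs from parameter-file records (illustrative only)."""
    settings = {}
    for raw in lines:
        record = raw.rstrip("\n")
        if "=" not in record:
            # Records without an equal sign (e.g., "INPUT FILES:") are ignored.
            continue
        if len(record) > MAX_RECORD_LENGTH:
            raise ValueError(f"record longer than {MAX_RECORD_LENGTH} characters: {record[:40]}")
        key, value = record.split("=", 1)
        # The amount of spacing around "=" is not significant.
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    example = [
        "INPUT FILES:",
        "activity file   = input/activity pattern/durhw_HAPEM8.txt",
        "nmicro =    18",
    ]
    print(parse_parameter_records(example))
```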

Figure 2-2a. Example parameter file for running model programs 1-3 (DURAV, INDEXPOP, and COMMUTE)

INPUT FILES:
activity file    = input/activity pattern/durhw_HAPEM8.txt
cluster file     = input/activity pattern/cluster_HAPEM8.txt
population file  = input/population/population_HAPEM8.txt
commuting file   = input/commute/commute_flow_HAPEM8.txt
CommutTime file  = input/others/commute_time_HAPEM8.txt
CommutFrac file  = input/others/Commute_fraction_HAPEM8.txt
DistToRoad file  = input/others/proximity_road_HAPEM8.txt
statefip file    = input/FIPS_StateCrosswalk_HAPEM8.dat

OUTPUT FILES:
log file         = output/log_1-3.txt
counter file     = output/counter.dat
mistract file    = output/mistract_1-3.dat

PARAMETER SETTINGS
keep     = YES
region1  = 1
region2  = 53
nmicro   = 18     ! Number of microenvironments
nblock   = 24     ! Number of time blocks/day in CHAD file
hblock   = 8      ! Number of time blocks/day in
ntype    = 3      ! Number of day types
ngroup   = 6      ! Number of demographic groups

7 Indoor-source algorithms are included in the HAPEM program but have not yet been tested and reviewed.
Therefore, they are currently not recommended for use, and instructions for their use are omitted from this
document. To disable the indoor-source algorithms, set keyword CAS to 99999.

Figure 2-2b. Example parameter file for running model programs 4-5 (AIRQUAL and HAPEM)

INPUT FILES:
activity file         = input/activity pattern/durhw_HAPEM8.txt
ClusTrans file        = input/Activity Pattern/clustrans_HAPEM8.txt
population file       = input/population/population_HAPEM8.txt
commuting file        = input/commute/commute_flow_HAPEM8.txt
air quality file      = input/airqual/2020benzene_AirToxScreen2023.txt
factors file          = input/factor/factors_gas_HAPEM8.txt
mobiles file          = input/factor/factors_OnroadMobile_Benzene_HAPEM8.txt
CommutTime file       = input/others/commute_time_HAPEM8.txt
DistToRoad file       = input/others/proximity_road_HAPEM8.txt
statefip file         = input/FIPS_StateCrosswalk_HAPEM8.dat
product file Pathname = input/Add/
AutoPduct file        = input/Add/AutoGarage.txt

Demographic Groups:
B_00 = Ages 0-1
B_02 = Ages 2-4
B_05 = Ages 5-15
B_16 = Ages 16-17
B_18 = Ages 18-64
B_65 = Ages >=65

OUTPUT FILES:
log file      = output/benzene/log_4-5_benzene.txt
counter file  = output/counter_AirToxScreen2023.dat
mistract file = output/benzene/mistract_4-5_benzene.dat
afile         = output/benzene/
bfile         = output/benzene/

Keep intermediate files = YES

PARAMETER SETTINGS:
pollutant  = Benzene
CAS        = 99999
units      = ug/m3
year       = 2020
region1    = 1
region2    = 53
EPA Region = 1
sarod      = 45201
Rseed1     = -10          ! Random for Selecting Activity Pattern Data
Rseed2     = -1           ! Random for Selecting Micro Factors
Rseed3     = -1           ! Random for Selecting Air Quality Dataset
backg      = 0.0
nmicro     = 18           ! Number
nblock     = 24           ! Number
hblock     = 8            ! Number
ntype      = 3            ! Number
ngroup     = 6            ! Number
nsource    = 4            ! Number
nmobiles   = 3            ! Sequence # of source categories which are on-road mobile
nbmicro    = 1            ! beginning # of micros which are indoor environments
nemicro    = 10           ! ending # of micros which are indoor environments
nvehicles  = 7 12         ! Sequence # of micros which are cars/trucks used in commuting
npublict   = 8 10 11 13   ! Sequence # of micros which are public transit
nreplic    = 30           ! Number of replicates/demo group in output file

The model programs also create several intermediate output files that are used as input to other
programs in the model set but are not directly useful for the user. The model programs generate
the names of the intermediate output files by changing the filename extensions (i.e., the text
after the dot) of the input filenames. An example set of filenames, including the intermediate
files generated by the programs, is shown in Table 2-1, with example user-defined filenames in
parentheses. In the COMMUTE program, two of these intermediate files
(population_HAPEM8.county_tract_pop_range and
population_HAPEM8.state_county_pop_range) will be deleted at the end of the program unless
the keyword variable keep is set to "yes".
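
The extension-swapping convention described above can be sketched as follows. This is an illustration rather than the programs' own logic: the helper names are hypothetical, and the sketch covers only names formed by replacing the extension. Some names in Table 2-1 (e.g., population_HAPEM8_direct.ind) also modify the root, which this sketch does not attempt to reproduce.

```python
def split_root_extension(filename):
    """Split a filename into root (path included) and extension at the last dot."""
    root, dot, extension = filename.rpartition(".")
    if not dot:
        raise ValueError(f"filename has no dot-separated extension: {filename}")
    return root, extension


def intermediate_name(input_filename, new_extension):
    """Form an intermediate filename by replacing the extension of an input file."""
    root, _ = split_root_extension(input_filename)
    return f"{root}.{new_extension}"


if __name__ == "__main__":
    print(intermediate_name("input/population/population_HAPEM8.txt", "da"))
    print(intermediate_name("input/commute/commute_flow_HAPEM8.txt", "ind"))
```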

Besides the input and output files, the model programs create a set of user-defined diagnostic
output files. The main one is a log file, which records information about the execution of the
programs, including some error messages. Another is a counter file that keeps track of the
numbers of elements in various processed files, some of which are used by subsequent
programs. A third diagnostic file is the mistract file which keeps track of census tracts in the
population file that are not matched by tracts in the commuting file, tracts in the population file
that are not matched by tracts in the air quality file, and of tracts in the commuting file that are
not matched by tracts in the air quality file. Only tracts included in both the population and air
quality files are processed by the model since both these pieces of information about a tract
(population and air quality) are needed to make an exposure estimate. If commuting is included
in the simulation and the tract is missing from the commuting file, it is assumed that all workers
residing in that tract stay in the home tract for work.
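
The three comparisons recorded in the mistract file, and the rule that only tracts present in both the population and air quality files are processed, amount to set operations on tract identifiers. The sketch below is a conceptual illustration only; the function name, the dictionary keys, and the use of plain strings for tract identifiers are assumptions, and the actual mistract file layout is not reproduced.

```python
def find_unmatched_tracts(population_tracts, commuting_tracts, air_quality_tracts):
    """Report tract mismatches between files (illustrative; not the HAPEM file format)."""
    pop = set(population_tracts)
    com = set(commuting_tracts)
    air = set(air_quality_tracts)
    return {
        # Population tracts with no commuting record: workers are assumed
        # to stay in the home tract for work.
        "population_not_in_commuting": sorted(pop - com),
        "population_not_in_air_quality": sorted(pop - air),
        "commuting_not_in_air_quality": sorted(com - air),
        # Only tracts with both population and air-quality data are processed.
        "processed": sorted(pop & air),
    }


if __name__ == "__main__":
    report = find_unmatched_tracts(
        ["37063000100", "37063000200", "37063000300"],
        ["37063000100", "37063000400"],
        ["37063000100", "37063000200"],
    )
    for name, tracts in report.items():
        print(name, tracts)
```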

2.1.2. The DURAV Program and the Activity and Cluster Files

The DURAV program performs the three main functions listed below.

•	It categorizes and groups human activity data extracted from CHAD into demographic
groups, day types, commuting status, and cluster categories.

•	If a different number of daily time blocks is specified for the analysis than in the activity
file, it processes the activity records so that the number of time blocks matches the
number specified for the analysis.

•	It creates a sequential ASCII file of the activity-pattern records for use by the HAPEM
program.

The activity file is the primary input file for the DURAV program. The default file, currently
durhw_HAPEM8.txt, contains data extracted from CHAD describing the amount of time spent in
various MEs by individuals. Each record in the activity file consists of one person-day (i.e.,
1,440 minutes for an individual) of activity data. This information is not an activity sequence;
rather, it is the total number of minutes spent in each ME during each block of time throughout
the day (i.e., the time increments used per 24-hour period).

For example, in the current HAPEM default activity file, durhw_HAPEM8.txt, there are 18 MEs,
(24) 1-hour time blocks, and 2 exposure districts (home and work), resulting in a total of 864
duration values. The duration in each of the 18 MEs for the first hour comes first in the activity
file, followed by the 18 durations for the second hour, etc. This pattern is repeated for all 24
hours for the home exposure district, and then for the 24 hours and 18 MEs of the work district
(see Appendix A for more details on the current HAPEM default input files).
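
The ordering described above (MEs varying fastest, then time blocks, then exposure district) implies a simple position formula for each duration value. The sketch below uses zero-based indices, with district 0 for home and 1 for work; the function name duration_index is hypothetical.

```python
def duration_index(district, block, micro, nmicro=18, nblock=24):
    """Zero-based position of a duration value within one activity record.

    district: 0 = home, 1 = work; block: time block (0..nblock-1);
    micro: microenvironment (0..nmicro-1).
    """
    return district * nblock * nmicro + block * nmicro + micro


if __name__ == "__main__":
    total = 2 * 24 * 18
    print(total)                        # number of duration values per record
    print(duration_index(0, 1, 0))      # first ME of the second hour, home district
    print(duration_index(1, 0, 0))      # first value of the work district
    print(duration_index(1, 23, 17))    # last value in the record
```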

The number of time blocks in the activity file is specified by the user in the parameter file of the
DURAV, INDEXPOP, and COMMUTE programs as nblock. The number of MEs in both the
activity file and the factors and mobiles files (discussed below) must be the same and is
specified in all the parameter files as nmicro.8 The number of duration values in the activity file
must equal twice the product of the values of the nmicro and nblock settings in the parameter
file. The sum of the duration values for each individual profile should always equal 1,440
minutes (i.e., there should be no unaccounted time); otherwise, the program will stop. Each
duration must be specified as a whole number (i.e., no decimals; this number can be zero) of
minutes in each ME.
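
A user preparing a custom activity file may wish to check these constraints before running DURAV. The following is a minimal sketch under the rules stated above; the function name is hypothetical, and the rejection of negative durations is an added assumption not stated in the guide.

```python
MINUTES_PER_DAY = 1440


def validate_activity_record(durations, nmicro, nblock):
    """Raise ValueError if a person-day record violates the stated constraints."""
    expected = 2 * nmicro * nblock  # home and work districts
    if len(durations) != expected:
        raise ValueError(f"expected {expected} duration values, found {len(durations)}")
    for value in durations:
        # Whole minutes only; zero is allowed. Negative values are rejected
        # here as an assumption.
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"duration must be a whole, non-negative number: {value!r}")
    if sum(durations) != MINUTES_PER_DAY:
        raise ValueError(f"durations sum to {sum(durations)}, not {MINUTES_PER_DAY}")


if __name__ == "__main__":
    record = [0] * (2 * 18 * 24)
    for hour in range(24):
        record[hour * 18] = 60  # 60 minutes per hour in the first ME, home district
    validate_activity_record(record, nmicro=18, nblock=24)
    print("record is valid")
```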

The number of time blocks for the analysis is specified in all the parameter files as hblock. The
number may be less than or equal to nblock; however, it must be an integral factor of nblock,
so that the activity time blocks can be combined if necessary to match hblock. For
example, if nblock is 24 and hblock is set to 8, the DURAV program will combine the (24) 1-
hour activity time blocks into (8) 3-hour activity time blocks.
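
The block-combination step can be illustrated with a short sketch. It assumes that the minutes spent in each ME are summed across the merged blocks, and that records follow the ordering of the default activity file (MEs within time blocks within exposure districts). The function name is hypothetical, and this is not the DURAV source code.

```python
def combine_time_blocks(durations, nmicro, nblock, hblock):
    """Merge nblock activity time blocks into hblock blocks by summing minutes per ME."""
    if hblock > nblock or nblock % hblock != 0:
        raise ValueError("hblock must be an integral factor of nblock")
    ratio = nblock // hblock  # number of original blocks per combined block
    combined = []
    for district in range(2):  # home, then work
        base = district * nblock * nmicro
        for new_block in range(hblock):
            for micro in range(nmicro):
                total = sum(
                    durations[base + (new_block * ratio + k) * nmicro + micro]
                    for k in range(ratio)
                )
                combined.append(total)
    return combined


if __name__ == "__main__":
    # Two MEs and four time blocks, combined into two blocks.
    home = [1, 2, 3, 4, 5, 6, 7, 8]
    work = [10, 20, 30, 40, 50, 60, 70, 80]
    print(combine_time_blocks(home + work, nmicro=2, nblock=4, hblock=2))
```

With nblock = 24 and hblock = 8, ratio is 3, matching the guide's example of 1-hour blocks merged into 3-hour blocks.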

Each record in the activity file also contains information about the individual from whose
activities the data were derived, so that the records can be classified into demographic groups.
The definitions of these groups are part of the DURAV program source code, so that to change
the definitions of the groups, the source code must be modified and recompiled. Similarly, the
definitions of day types, pertaining to season and day-of-week for categorizing activity patterns,
are part of the DURAV program source code. The number of groups is specified as ngroup in
all the parameter files. The number of day types, ntype, is specified in the parameter files of
the DURAV and HAPEM programs.

The cluster category for each CHAD record, identified by CHAD identification code, is specified
in the cluster file. The current version of the DURAV program divides the activity data into 12
person groups, based on demographic group (six categories) and commuting status (yes or no).
Activity-pattern data also are separated into three day types: summer weekdays, other
weekdays, and weekends. The number of clusters, derived from a statistical cluster-analysis
procedure, ranges from one to three, depending on the group and day type (see Appendix A for
a detailed discussion). The current HAPEM requires that the CHAD diaries are in the same
order in the activity and cluster files.

2.1.3. The INDEXPOP Program and the Population, Distance-to-road,
Commuting-time, and Commuting-fraction Files

The INDEXPOP program performs the three main functions listed below.

•	It creates a direct-access file of population data to be used in the AIRQUAL program.

•	It creates sequential ASCII index files for the population data census tracts, to facilitate
file searching in the COMMUTE and AIRQUAL programs.

•	It creates direct-access files and associated index files of the data in the distance-to-
road, commuting-time, and commuting-fraction files, to be used in the COMMUTE and
AIRQUAL programs.

8 As explained in Section 2.1.6 (The HAPEM Program, the ME Factors and Mobiles Files, and the Activity Cluster-
transition File), there must be nmicro records for each onroad-mobile source category in the mobiles file.

The main input file to the INDEXPOP program is the population file, which provides the number
of people in each demographic group (defined in the DURAV program source code) for each
census tract in the study area. The data must be sorted according to the state, county, and tract
FIPS codes. These data are typically obtained from the census surveys (see Appendix A for
more details on the current HAPEM default input files).

Other input files with census-tract-specific information about the population, such as the
distance-to-road, commuting-time, and commuting-fraction files, also are first processed in this
program. The distance-to-road file provides information on the fraction of each demographic
group in each tract that resides within three different distance categories of major roadways, as
well as the fraction of the tract area that is within each distance category. The commuting-time
file provides information on the average commuting time for commuters residing in each tract.
The commuting-fraction file provides information on the fraction of workers in each group that
resides in each tract and that commutes to work (see Appendix A for more details on the current
HAPEM default input files).

2.1.4. The COMMUTE Program and the Commuting, Distance-to-road,
Commuting-time, and Commuting-fraction Files

The COMMUTE program performs the three main functions listed below.

•	It creates a file identifying for each census tract (i.e., home tract), the associated set of
work tracts (i.e., tracts in which the residents of the home tract work), the fraction of
workers residing in that home tract and working in each work tract, and the normalized
centroid-to-centroid distance between home tract and each work tract. The normalized
distance is the distance/(average distance). The normalized distance is combined with
the average commuting time for the tract to estimate the commuting time for the home-
tract/work-tract pair in the HAPEM program.

•	It creates a sequential index file to facilitate file searching in the HAPEM program.

•	It adds the census-tract-specific information from the distance-to-road, commuting-time,
and commuting-fraction direct-access files (created in the INDEXPOP program) to the
commuting index file.
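
As a rough illustration of the first function above, the sketch below computes normalized distances for one home tract. It assumes that the average is a simple unweighted mean over that tract's work tracts and that the pair-specific commuting time is the tract-average time multiplied by the normalized distance. The guide states neither detail, and the function and variable names are hypothetical.

```python
def normalized_distances(distances_km):
    """Return distance / (average distance) for one home tract's work tracts.

    Assumption: the average is an unweighted mean of the listed distances.
    """
    mean = sum(distances_km) / len(distances_km)
    if mean == 0:
        # Every work tract coincides with the home tract; treat each pair as typical.
        return [1.0 for _ in distances_km]
    return [d / mean for d in distances_km]


def pair_commute_times(avg_time_min, distances_km):
    """Scale the tract-average commuting time by each normalized distance.

    Assumption: the combination is a simple product; HAPEM's actual rule may differ.
    """
    return [avg_time_min * nd for nd in normalized_distances(distances_km)]


# Hypothetical home tract with three work tracts and a 30-minute average commute.
times = pair_commute_times(30.0, [5.0, 10.0, 15.0])
```

Under these assumptions, a work tract at the average distance receives exactly the tract-average commuting time.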

The commuting file is the main input file to the COMMUTE program. It specifies the fraction of
residents of each home census tract that work in that tract and every other tract (i.e., the
population associated with each home-tract/work-tract pair), which is typically derived from
census data (see Appendix A for more details on the current HAPEM default input files). While
there are hundreds of millions of tract pairs nationwide within a reasonable commuting distance
of each other, only about 6 million of these pairs have a non-zero flow of commuters. Only those
pairs with non-zero flows are included in the commuting file.

An important issue pertaining to this commuting data is that workers do not always travel daily
between their home and work locations. The larger the distance between home and work, the
greater the likelihood that daily commuting does not occur. For example, places of residence in
the lower 48 states appear with Alaskan places of work. These workers are almost surely not
commuting daily between the continental US and Alaska. To address this issue, the commuting
flows were examined as a function of distance: researchers plotted the natural log of the natural
log of the total flow versus distance to see how commuting flow declines with distance.
This plot revealed that ln(ln(total flow)) is nearly linear for distances ranging
from 0 to about 100 km. For distances greater than 100 km, the graph exhibits a decreasingly
negative slope with distance (i.e., the curve "flattens out"). These findings suggest that people's
"commuting behavior" is fairly consistent, on an aggregate basis, to a distance of approximately
100 km. Then, at greater distances, factors other than daily commuting may become
increasingly important. Therefore, in constructing the commuting distance distributions for each
census tract, commuting distances greater than 120 km are assumed to be atypical for a daily
commuter and the COMMUTE program ignores these longer commutes.
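
The 120 km cutoff can be pictured with the following sketch. The record layout, the function name, and the rescaling of the remaining commuting fractions are assumptions made for illustration; the guide says only that the COMMUTE program ignores the longer commutes.

```python
MAX_DAILY_COMMUTE_KM = 120.0  # cutoff stated in the text


def drop_long_commutes(flows, max_km=MAX_DAILY_COMMUTE_KM):
    """Keep home-tract/work-tract pairs within the cutoff distance.

    `flows` is a list of (work_tract, fraction, distance_km) tuples for one
    home tract (hypothetical layout). The surviving fractions are rescaled
    to sum to 1, which is an assumption, not documented HAPEM behavior.
    """
    kept = [(tract, frac, dist) for tract, frac, dist in flows if dist <= max_km]
    total = sum(frac for _, frac, _ in kept)
    if total == 0:
        return []
    return [(tract, frac / total, dist) for tract, frac, dist in kept]


# Hypothetical flows: tract "C" stands in for an implausibly distant workplace.
flows = [("A", 0.6, 4.0), ("B", 0.3, 35.0), ("C", 0.1, 5200.0)]
kept = drop_long_commutes(flows)
```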

2.1.5.	The AIRQUAL Program and the Air Quality and Distance-to-
road Files

The AIRQUAL program performs the four main functions listed below.

•	It creates a sequential file of air-quality data to be used in the HAPEM program.

•	It determines the number of data records for each census tract in the air quality file.

•	It creates index files to facilitate file searching in the HAPEM program.

•	It adds the tract-specific information from the distance-to-road direct-access file (created
in the INDEXPOP program) to the air-quality index files.

The air quality file contains the ambient air concentrations that are used by the AIRQUAL
program. The file records can have concentration contributions from multiple emission source
categories for multiple time blocks for a census tract, as well as a time-invariant location-specific
background concentration. There may be multiple such records for each tract, representing
spatial variability throughout the tract. The AIRQUAL program requires a separate air quality file
for each HAP being evaluated. Details about the format of the air quality file can be found in
Section 3.9 (Air Quality File).

The number of outdoor emission-source categories is specified in the parameter files of the
AIRQUAL and HAPEM programs as nsource, and it must match the number in the factors file
(see Section 2.1.6 [The HAPEM Program, the ME Factors and Mobiles Files, and the Activity
Cluster-transition File]). The user specifies the number of time blocks for the analysis in all the
parameter files as hblock. As discussed above, this value must be an integral factor of nblock,
the number of time blocks in the activity file, so that the activity time blocks can be combined if
necessary to match hblock. Similarly, hblock may exceed the number of time blocks in the air
quality file, but it must be an integral multiple of that number, so that the air-quality values
can be replicated if necessary to create hblock air-quality values. For example, suppose the air
quality input file has eight 3-hour time blocks per day; if hblock is set to 24, the AIRQUAL
program will create 24 air-quality time blocks with three replicates of each of the eight
air-quality values.
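
The two adjustments described above can be sketched as follows. This is an illustration only: the function names are hypothetical, and combining activity time blocks by summing their durations is an assumption about what "combined" means here.

```python
def combine_activity_blocks(durations, hblock):
    """Sum consecutive activity time blocks so that nblock values become hblock values."""
    nblock = len(durations)
    if nblock % hblock != 0:
        raise ValueError("hblock must be an integral factor of nblock")
    step = nblock // hblock
    return [sum(durations[i:i + step]) for i in range(0, nblock, step)]


def replicate_air_quality(values, hblock):
    """Repeat each air-quality value so that the result has hblock values."""
    n_aq = len(values)
    if hblock % n_aq != 0:
        raise ValueError("hblock must be an integral multiple of the air-quality blocks")
    reps = hblock // n_aq
    return [v for v in values for _ in range(reps)]


# Eight 3-hour air-quality values expanded to 24 hourly values, as in the text.
aq_8 = [1.0, 1.2, 2.5, 2.0, 1.8, 2.2, 1.5, 1.1]  # hypothetical concentrations
aq_24 = replicate_air_quality(aq_8, 24)

# Twenty-four hourly durations (minutes) combined into eight 3-hour blocks.
minutes_24 = [60] * 24
minutes_8 = combine_activity_blocks(minutes_24, 8)
```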

2.1.6.	The HAPEM Program, the ME Factors and Mobiles Files, and
the Activity Cluster-transition File

The HAPEM program performs the six main functions listed below.

•	For each demographic group in each census tract, it randomly selects nreplic sets of
ME factors based on the distribution data provided in the factors and mobiles files. Each
set contains a subset of ME factors randomly selected for each of the time blocks (for
the PEN and ADD factors) or each of the sources (for the PROX and LAG factors). Each
subset contains randomly selected ME factors for each of nmicro MEs.

•	For each demographic group in each census tract, it randomly selects nreplic sets of
air-quality data from the datasets available for a tract.

•	For each demographic group in each census tract, it creates nreplic sets of average
activity patterns, where a set contains one average pattern for each day type. An
average activity pattern for each day type is calculated as a weighted average of activity
patterns randomly selected from each cluster in a group/day-type/commuting-status
combination. The weights are determined by the relative frequencies of cluster types
randomly selected in a one-stage Markov process,9 based on the cluster-transition
probabilities provided in the cluster-transition file.

•	For each activity pattern for a commuting demographic group, it randomly selects a work
census tract with probability weighting based on the fraction of home-tract residents that
work in that tract.

•	For each census tract, it estimates the concentration in each ME based on ME factors
and outdoor concentrations.

•	It combines activity patterns, commuting status, and estimates of ME concentration to
calculate nreplic annual-average exposure concentrations for each demographic group
in each census tract.

The ME factors and mobiles files provide the factors used to calculate an estimated ME
concentration from an outdoor concentration. This methodology allows the user to specify
values (distributions or point estimates) for three types of ME factors: penetration factors,
proximity factors, and additive factors. These factors are combined with the outdoor
concentration estimates according to the following algorithm.

ME concentration = PROX × PEN × outdoor concentration + ADD

The outdoor concentration is the sum of the concentration contributions from each outdoor
emission-source category and background.
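
The algorithm above can be written as a one-line function. In the sketch below, the function name and the factor values are hypothetical and are not HAPEM defaults; concentrations are assumed to be in the same units as the air quality file.

```python
def me_concentration(prox, pen, outdoor, add=0.0):
    """ME concentration = PROX x PEN x outdoor concentration + ADD."""
    return prox * pen * outdoor + add


# Hypothetical tract-average outdoor concentration of 2.0.
# Indoor ME away from roadways: PROX of 1.0 and partial penetration.
indoor = me_concentration(prox=1.0, pen=0.8, outdoor=2.0)

# In-vehicle ME: PROX above 1.0 because the ME is on or near a roadway.
in_vehicle = me_concentration(prox=1.5, pen=1.0, outdoor=2.0)
```

With ADD left at zero, the ME concentration is simply the outdoor concentration scaled by the two dimensionless ratios.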

The penetration factor, PEN, is an estimate of the ratio of the ME concentration contribution
(from a given emission-source category) to the concurrent outdoor-concentration contribution in
the immediate vicinity of the ME.

The proximity factor, PROX, is an estimate of the ratio of the outdoor concentration in the
immediate vicinity of the ME to the outdoor concentration represented by the air-quality data.
The air-quality data represent an average over some geographic area (i.e., some subset of a
census tract, or an average across the whole tract). For most situations, the current default
factors file specifies a PROX value of 1.0 (i.e., an outdoor-concentration contribution in the
immediate vicinity of the ME equal to the tract-average concentration contribution). However,
when assessing exposure to motor vehicle emissions, for MEs near roadways (e.g., in-vehicle,
residences near major roadways) the HAP concentration contribution in the immediate vicinity of
the ME is expected to be higher than the average HAP concentration contribution over the
census tract (i.e., PROX is expected to be greater than 1.0), and this is reflected in the current
default factors and mobiles files.

9 A one-stage Markov process is a sequence of events in which, at every step in the chain, the
probability distribution for the next event depends only on the current event.

ADD is an additive factor that accounts for emission sources within or near an ME (i.e., indoor
emission sources). Unlike the other two factors, the ADD factor is itself a concentration and
therefore has units of mass/volume. The actual units used must be the same as those in the air
quality file.7

A fourth factor, LAG, is used to account for the possibility of very slow HAP diffusion and
penetration, so that the relevant air-concentration value may be from the previous time block. A
value of zero for LAG indicates no time lag (i.e., the concurrent air-concentration value is used);
otherwise, the previous time-block value is used.

The factors file includes distributions for each of these factors for each ME/emission-source-
category combination, with the exception of PROX and LAG factors for onroad-mobile-source
emissions, which are contained in the mobiles file with separate distributions specified for three
distance-from-roadway categories. As noted above, the number of MEs in the factors and
mobiles files must match the number in the activity file (i.e., nmicro). Similarly, the number of
outdoor-emission source categories (i.e., nsource) must match the number in the air quality file.
The mobiles file must contain nmicro records for each onroad-mobile source category specified
with nmobiles.

There are three default factors files: one each for gaseous HAPs, particulate HAPs, and HAPs
that could be in either phase depending on conditions. There are four default mobiles files for
onroad-mobile sources: one each for benzene, 1,3-butadiene, diesel particulate matter (PM), and
non-specific HAPs.

The default factors and mobiles files contain ME factors applicable to all the MEs included in the
default activity file, for nsource emission-source categories (e.g., point, non-point, onroad
mobile, and nonroad mobile). These category-specific estimates were derived from reported
measurement and modeling studies. Because, as noted above, a new approach to evaluating
indoor sources is in development, the ADD factors are uniformly set to zero. And, due to lack of
data, LAG is uniformly set to zero. For onroad-mobile sources, the PROX and LAG values in the
mobiles files will override those in the factors files.

The cluster-transition file specifies, for each combination of demographic group and day type,
the number of activity patterns in each of two to three clusters (derived from cluster analysis on
the activity-pattern data from CHAD), along with the cluster-to-cluster transition probabilities
(derived from the transition frequencies for multiple-day activity-pattern records from CHAD; see
Appendix A for more details on the current HAPEM default input files). These values are used to
create weights for averaging selected activity patterns, one from each cluster, to represent an
individual within the demographic group for that day type.
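
The weighting scheme can be illustrated with a small simulation. The sketch below is not HAPEM's implementation: the transition probabilities, the number of simulated days, the uniform choice of the first day's cluster, and all names are assumptions.

```python
import random


def cluster_weights(transition, n_days, rng):
    """Simulate a cluster sequence and return each cluster's relative frequency.

    `transition[i][j]` is the probability of moving from cluster i to cluster j.
    Assumptions: the first day's cluster is drawn uniformly, and `n_days` is the
    number of days of this day type being represented.
    """
    n_clusters = len(transition)
    counts = [0] * n_clusters
    current = rng.randrange(n_clusters)
    for _ in range(n_days):
        counts[current] += 1
        # The next cluster depends only on the current one (one-stage Markov process).
        current = rng.choices(range(n_clusters), weights=transition[current])[0]
    return [c / n_days for c in counts]


def average_pattern(patterns, weights):
    """Weighted average of one selected activity pattern per cluster."""
    n_values = len(patterns[0])
    return [sum(w * p[k] for w, p in zip(weights, patterns)) for k in range(n_values)]


rng = random.Random(0)
transition = [[0.7, 0.3], [0.4, 0.6]]            # hypothetical probabilities
weights = cluster_weights(transition, 65, rng)   # 65 days is an assumed count
# One hypothetical pattern per cluster: minutes spent in two MEs during one hour.
pattern = average_pattern([[60.0, 0.0], [30.0, 30.0]], weights)
```

Because the weights sum to one, the averaged pattern still accounts for the full 60 minutes.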

2.1.7. The Statefip File

The statefip file cross-references the two-character state FIPS codes for each U.S. state (plus
Puerto Rico, the U.S. Virgin Islands, and Washington, DC) to its numerical ranking on the list.
The numerical rankings range from 1 to 53 in the default file, although the FIPS codes range
from "01" to "78", since several possible codes in the sequence are skipped (i.e., not assigned
to a state, district, or territory).

The statefip file is used in conjunction with the parameters region1 and region2 (used in all the
parameter files to specify the group of states to be included in the analysis according to
numerical ranking). For example, setting region1 to 1 and region2 to 53 results in assessment
of all the states, districts, and territories in the default statefip file (assuming the input files
contain all the necessary data). Alternatively, setting both region1 and region2 to 5 results in
assessment of the fifth state only: California with FIPS code "06".

The region range need not be the same for each of the five model programs; the range for each
program may be the same as or smaller than the range for the preceding program, where the
order of the programs is as specified above. For example, the INDEXPOP and COMMUTE
programs could be run for region range 1 to 53, while the AIRQUAL and HAPEM programs are
run for a single state.

Note that the region1 and region2 parameters specify the states for which the program will look
for data in the input files. However, the input files need not contain data for every tract within the
specified states. For example, if the air quality file contains data for only a subset of census
tracts within a state, the AIRQUAL and HAPEM programs will simply make estimates for those
tracts, as long as the state or states are specified within the region1 and region2 range.
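
The relationship between FIPS codes and numerical rankings can be reproduced with a short sketch. It does not read the statefip file itself; it rebuilds the list in memory, assuming that the default file is in ascending FIPS order (which is consistent with California being fifth). The names used are hypothetical.

```python
# FIPS codes from "01" to "56" that are not assigned to a state or to Washington, DC.
UNASSIGNED = {3, 7, 14, 43, 52}

# Assumption: the default statefip list is in ascending FIPS order.
fips_codes = [f"{n:02d}" for n in range(1, 57) if n not in UNASSIGNED]
fips_codes += ["72", "78"]  # Puerto Rico and the U.S. Virgin Islands

rank_of = {code: rank for rank, code in enumerate(fips_codes, start=1)}


def states_in_range(region1, region2):
    """Return the FIPS codes selected by a region1/region2 setting."""
    return [code for code in fips_codes if region1 <= rank_of[code] <= region2]


only_california = states_in_range(5, 5)
everything = states_in_range(1, 53)
```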

2.1.8.	Background Concentration

In addition to estimating exposure-concentration contributions for each emission-source
category for which data are provided in the air quality file, the HAPEM program also estimates
the exposure-concentration contribution from the background outdoor concentration. The
background concentration is an estimate of the outdoor concentration that would occur in the
absence of any anthropogenic emissions within the modeling domain. It includes concentration
contributions from natural sources, re-entrainment, global transport, and other anthropogenic
sources outside the modeling domain. This background exposure contribution is added together
with the emission-source-category contributions; the total exposure concentration is reported in
the exposure output files.

The background concentration is composed of two parts, either or both of which may be used.
The first is a uniform background concentration throughout the study area, with the single value
specified as backg in the parameter file of the AIRQUAL and HAPEM programs. The units of
measurement must be the same as those used in the air quality file. The second background-
concentration specification is a single value for each location specified in the air quality file,
representing a spatially variable component of the background concentration.

2.1.9.	Exposure Output Files

As currently configured, the model creates an exposure output file for each state/HAP
combination. The names of these files are constructed by the model based on the HAP
SAROAD code (specified by sarod in the parameter file) and the state FIPS code (as
SAROAD.FIPS.dat).

These output files contain nreplic records for each combination of census tract and
demographic group. Each record identifies the census tract, the group, the number of people to
which the exposure estimates apply (i.e., 1/nreplic of the population of the group in the tract),
and exposure-concentration contribution estimates: one each for the nsource
outdoor-emission-source categories, one for background, one for each of four indoor-source categories,
and a total of the contributions from all outdoor-emission-source categories, background, and
indoor sources.
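
A brief sketch of the bookkeeping described above follows. It reads the SAROAD.FIPS.dat pattern literally, with periods as separators, which is an assumption about the exact filename format; the function names and the population figure are hypothetical.

```python
def output_filename(saroad, state_fips):
    """Build a name following the SAROAD.FIPS.dat pattern quoted in the text."""
    return f"{saroad}.{state_fips}.dat"


def people_per_record(group_population, nreplic):
    """Each of the nreplic records represents 1/nreplic of the group's tract population."""
    return group_population / nreplic


name = output_filename("45201", "06")   # benzene, California
people = people_per_record(1200, 30)    # hypothetical group of 1,200 people
```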

2.2. Changing the Parameter Settings

HAPEM was designed to be as easy to use as possible. With this in mind, the model's structure
is such that, for routine applications, no changes need be made to the model's computer code.
For most applications, the user need only supply the model with the appropriately formatted
input files and parameter specifications declared in the parameter files.

However, there are several changes that users can make to HAPEM to "tailor" the model to their
needs. Changes or modifications to the model are most easily accomplished by altering the
parameter settings. The following discussion describes those parameters that can be altered.

2.2.1.	Changing the Number of MEs

In principle, the model will work with any number of MEs. The number, specified as nmicro in
all the parameter files, must match the number actually used in the activity file and the factors
and mobiles files. Definitions of the MEs do not appear anywhere in the model code.

The model programs should be able to accommodate anywhere from one up to at least 100
MEs. However, large numbers of MEs could result in input-file line lengths beyond a system's
limits (particularly in the case of the activity file) if other parameters (such as the number of time
blocks) also are set to large values.

2.2.2.	Changing the Number and/or Definitions of the Demographic
Groups

The number of demographic groups, specified as ngroup in all the parameter files, must be
consistently represented in

•	the number of columns in the population file,

•	the number of columns in the commuting-fraction file,

•	the number of columns in the distance-to-road file, and

•	the number of demographic groups specified in the cluster and cluster-transition files.

The definitions of the groups are listed in the parameter file for the AIRQUAL and HAPEM
programs so that they can be repeated at the start of the final output file for tracking. This listing
in the parameter file has no impact on the exposure results.

The six current age groups are as follows, in years.

•	0-1

•	2-4

•	5-15

•	16-17

•	18-64

•	65+

The number of demographic groups is unlimited. However, the user is cautioned that for
narrowly defined groups, there might not be enough activity-pattern data to calculate a reliable
group average or create meaningful activity-pattern clusters. An extreme example of this is
where no activity patterns fit a group's definition, resulting in incorrect exposure calculations
(i.e., exposure concentrations equal to zero) for that group.

2.2.3.	Changing the Number and/or Definitions of Day Types

Day types are used to guide the selection of the activity patterns. Demographic studies indicate
that typical weekday (Monday-Friday) and weekend (Saturday-Sunday) activities differ
significantly for most working people and school children. Furthermore, in certain respects,
activities in summer (or warm weather) might differ from those in winter (or cold weather),
especially for children or other non-workers. Currently, season and day of week are used to
determine three day types as

•	weekdays in summer (June-August),

•	other weekdays, or

•	weekends.

In principle, year, month, day, season, temperature, rainfall, other meteorological variables, or
even geographical variables could be used to assign day type. However, if there are too many
day types, or if they are too narrowly defined, then there may not be enough activity-pattern
data fitting the day type definition to allow the determination of a reliable average or to create
meaningful activity-pattern clusters. If additional variables are used to define day types, then the
programmer is advised to check that there is an adequate number of activity-pattern profiles
for each new day type.

2.2.4.	Changing the Number and/or Definitions of Time Blocks

The traditional method for running HAPEM has been to use one-hour time increments (referred
to as time blocks). However, the current model was designed to allow more flexibility in the
selection of time blocks. Time blocks can range between one minute (the finest resolution
available for the activity data) and one day, so in principle, there can be any number from one to
1,440 time blocks. In most practical applications, the number of time blocks will be 24 or fewer. To
accommodate the possible adjustment of time blocks from nblock to hblock as discussed
above, the time blocks must each be of equal size.

2.3. Setting Up a HAPEM Run

This section shows how to set up and conduct a simple HAPEM model run. Subsequent
sections and chapters provide more detailed explanation about HAPEM's input and output files
and the model's programs. The example shown in this section is for a hypothetical HAPEM8
analysis of benzene.

The most important consideration for making a HAPEM run is ensuring that the input files are
accurate and correctly formatted. This is the responsibility of the user. To run the model, the
user must provide 11-12 data input files (depending on the HAP and source category), the
parameter files defining the run, and the five executable files for the five programs that make up
the model. The programs can be run consecutively by using a "batch" file, or they can be run
independently.

Parameter Files

The parameter files for this example, presented above in Figure 2-2a and Figure 2-2b, can be
used for running the five executables. The name of the parameter file must be specified in the
command line immediately after the executable name. As the first three programs in the model
sequence (DURAV, INDEXPOP, and COMMUTE) require different inputs from the final two
programs (AIRQUAL and HAPEM), it is suggested that two separate parameter files be
generated for the model sequence of a given simulation or set of simulations. The first
parameter file (Figure 2-2a) should be used for the first three programs and the second
parameter file (Figure 2-2b) should be used for the final two programs.

Input/output Files

As seen in Figure 2-2a and Figure 2-2b, the input files (including full pathnames) are identified
in the parameter files. The input files reside in a subdirectory named "input/". The main
exposure output files (afile) are sent to a subdirectory named "/output/", along with the
diagnostic output files (the log file, the mistract file, and the counter file). When the full
pathname is identified for an input or output file, it is not required that it reside in the same
subdirectory as the executables.

The names of the input and output files must be identified in the parameter files before the
parameter settings.

As noted above, an existing pathname should be specified for the product files, and the full
pathname of any existing file (except other model input or output files) must be specified as the
AutoPduct file in the parameter file used with the model. In the future, these files will be part of
the input for evaluating indoor sources, but for now the file will not actually be utilized by the
HAPEM program. To disable the indoor-source algorithms, set keyword CAS to "99999".

Parameter Settings

The "PARAMETER SETTINGS" in the parameter files shown in Figure 2-2a and Figure 2-2b
indicate that the region to be modeled is 1 through 53 (all states, the District of Columbia,
Puerto Rico, and the U.S. Virgin Islands), and the HAP SAROAD code is 45201 (benzene).

The last group of information in the parameter file shows that there are 18 MEs to be modeled
(nmicro). This number of MEs must be consistent with the number of ME factors specified in
the factors and mobiles files (i.e., factors_*_HAPEM8.txt and
factors_OnroadMobile_*_HAPEM8.txt) and the number of duration values specified in the
activity file (i.e., durhw_HAPEM8.txt). The number of time blocks per day in the activity file is 24
(nblock), but the number of time blocks per day for the analysis is 8 (hblock), which is an
integral factor of the nblock value, as explained above. The number of outdoor emission-source
categories is 4 (nsource). The data in the air quality file (for this example the file is
2020benzene_AirToxScreen2023.txt) must be consistent with nsource, and the number of time
blocks in that file must be an integral factor of hblock, as explained above. The number of
demographic groups (ngroup) must be consistent with the groups specified in the DURAV source
code and in the population, commuting-fraction, distance-to-road, cluster, and cluster-transition
files (i.e., population_HAPEM8.txt, commute_fraction_HAPEM8.txt, proximity_road_HAPEM8.txt,
cluster_HAPEM8.txt, clustrans_HAPEM8.txt). The number of replicates to be simulated for each
group in each tract is 30 (nreplic).

In addition, there are five parameter settings that specify the sequence numbers of particular
emission-source categories and ME types that are subject to special treatment in HAPEM. In
the example, the sequence number for the onroad-mobile source category in the air quality file
is 3 (nmobiles). The sequence numbers of the indoor MEs (including in-vehicle) in the factors
and mobiles files are 1-10 (nbmicro through nemicro). There are two MEs for private
commuting, with sequence numbers 7 and 12 (nvehicles), and there are four MEs for public-
transit commuting, with sequence numbers 8, 10, 11, and 13 (npublict). (There may be up to
10 values each for nmobiles, nvehicles, and npublict.)

The HAP name (pollutant), measurement units (units), target year for the analysis (year), and
the definitions of demographic groups are listed here by the user so that they can be repeated
at the beginning of the final output file for tracking. They have no effect on the exposure results.
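
Several of the consistency rules above can be checked before a run. The sketch below is a hypothetical pre-flight check, not part of HAPEM: the function names are invented, the air-quality block count is assumed, and the requirement that sequence numbers lie between 1 and nmicro is an inference from their role as ME indices.

```python
def check_time_blocks(nblock, hblock, aq_blocks):
    """Return a list of problems with the time-block settings (empty if none)."""
    problems = []
    if nblock % hblock != 0:
        problems.append("hblock must be an integral factor of nblock")
    if hblock % aq_blocks != 0:
        problems.append("hblock must be an integral multiple of the air-quality blocks")
    return problems


def check_me_sequences(nmicro, nvehicles, npublict):
    """Check commuting ME sequence numbers (assumed range 1..nmicro, at most 10 each)."""
    problems = []
    for name, seqs in (("nvehicles", nvehicles), ("npublict", npublict)):
        if len(seqs) > 10:
            problems.append(f"{name} may list at most 10 values")
        if any(s < 1 or s > nmicro for s in seqs):
            problems.append(f"{name} values must lie between 1 and nmicro")
    return problems


# Settings from the example; the air-quality block count (8) is an assumption.
issues = check_time_blocks(nblock=24, hblock=8, aq_blocks=8)
issues += check_me_sequences(nmicro=18, nvehicles=[7, 12], npublict=[8, 10, 11, 13])
```

An empty list means that none of the checked rules is violated; it does not validate the contents of the input files themselves.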

2.3.1.	Running HAPEM as a "Batch" Job

When running HAPEM by submitting batch jobs, each job should be allowed to finish before
submitting the next job.

For this example, a simple batch file was written to run the five model programs sequentially,
with all five programs residing in the same directory as the batch file. The batch file is shown in
Figure 2-3.

Figure 2-3. Example "batch" file for running the five model programs

durav_HAPEM8.exe p1.txt
indexpop_HAPEM8.exe p1.txt
commute_HAPEM8.exe p1.txt
airqual_HAPEM8.exe p2.txt
HAPEM_HAPEM8.exe p2.txt

Because the parameter files specify the names or paths of all the input and output files as well
as the parameter settings, the batch file simply specifies the order in which the HAPEM
executable programs will be run.
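
Users who prefer a script to a batch file could drive the same sequence from Python, as in the sketch below. It takes the parameter-file names to be p1.txt and p2.txt, and it assumes that each program signals failure with a nonzero exit status, which the guide does not state. The `run_sequence` function and its `runner` argument are hypothetical.

```python
import subprocess

# Executable names follow Figure 2-3; parameter-file names are assumed.
STEPS = [
    ("durav_HAPEM8.exe", "p1.txt"),
    ("indexpop_HAPEM8.exe", "p1.txt"),
    ("commute_HAPEM8.exe", "p1.txt"),
    ("airqual_HAPEM8.exe", "p2.txt"),
    ("HAPEM_HAPEM8.exe", "p2.txt"),
]


def run_sequence(steps=STEPS, runner=subprocess.run):
    """Run each program in order, waiting for it to finish before starting the next."""
    completed = []
    for exe, param_file in steps:
        # check=True stops the sequence if a program exits with a nonzero status.
        runner([exe, param_file], check=True)
        completed.append(exe)
    return completed
```

Calling `run_sequence()` from the directory that holds the executables would run the five programs one after another; `subprocess.run` returns only when the program has finished, which satisfies the requirement that each job complete before the next begins.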

2.3.2.	Running HAPEM Programs Individually

Any of the model programs can be run individually. The user must ensure that the required input
files exist and are in the same location specified in the parameter file.

If a user is interested in running the DURAV program (this is typically the first program that is
run when doing an exposure analysis), they would go to the subdirectory containing the
executable program and type the following command on the command line:

durav_HAPEM8.exe p1.txt

The other model programs are run similarly.

As indicated in Table 2-1, COMMUTE, AIRQUAL, and HAPEM all require input files that are
generated from running other model programs. Therefore, if any of these programs is run alone,
the user must ensure that the required model-generated input files exist and are in the same
subdirectory as the original input file from which their filenames were derived (see Table 2-1).
For example, running AIRQUAL requires two files with filenames derived from the population file
and two files with names derived from the distance-to-road file. For this example, these files are
population_HAPEM8.da, population_HAPEM8_direct.ind, proximity_road_HAPEM8.STIDX, and
proximity_road_HAPEM8.dat, with filenames derived from population_HAPEM8.txt and
proximity_road_HAPEM8.txt. Therefore, the parameter file for running AIRQUAL and HAPEM
must specify the full pathname of the population and distance-to-road files, and the four
intermediate files must exist and reside at those paths.

If the user wishes to run the model for multiple pollutants using the same regions and settings, it
should be noted that DURAV, INDEXPOP, and COMMUTE need only be run in sequence one
time. AIRQUAL and HAPEM may then be run for multiple pollutants without rerunning DURAV,
INDEXPOP, and COMMUTE. This may be accomplished by either running the programs
individually, as directed above, or by creating one batch file for the execution of DURAV,
INDEXPOP, and COMMUTE and then a batch file for each successive run of AIRQUAL and
HAPEM. If the user chooses to do this, it is suggested that, after DURAV, INDEXPOP, and
COMMUTE finish, the user save the log and mistract files, if needed,
both before and after the execution of AIRQUAL and HAPEM, as each successive run will
overwrite these files. The log and mistract files saved before the execution of AIRQUAL and
HAPEM apply to each successive run, as they include the information from the first three
programs in the sequence.
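
The run order for a multi-pollutant analysis can be laid out with a small helper. This is a planning sketch only: the per-pollutant parameter-file names are hypothetical, and the helper does not execute anything or save the log and mistract files.

```python
FIRST_STAGE = ["durav_HAPEM8.exe", "indexpop_HAPEM8.exe", "commute_HAPEM8.exe"]
SECOND_STAGE = ["airqual_HAPEM8.exe", "HAPEM_HAPEM8.exe"]


def build_run_plan(first_param_file, pollutant_param_files):
    """List (executable, parameter file) pairs for a multi-pollutant run.

    The first three programs appear once; AIRQUAL and HAPEM repeat for each
    pollutant-specific parameter file.
    """
    plan = [(exe, first_param_file) for exe in FIRST_STAGE]
    for param_file in pollutant_param_files:
        plan += [(exe, param_file) for exe in SECOND_STAGE]
    return plan


# Hypothetical parameter-file names for two pollutants.
plan = build_run_plan("p1.txt", ["p2_benzene.txt", "p2_butadiene.txt"])
```

In practice, the log and mistract files would be copied to uniquely named files between the entries of such a plan, since each successive run overwrites them.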

3. HAPEM Input Files

The model programs use 11-12 user-supplied data input files (depending on the HAP and
source category), and two or more parameter files. All are in ASCII format. The function of each
of the files and their relationship to the structure of HAPEM are discussed in Chapter 2. The
reader is referred to that chapter for an overview of HAPEM input files. This chapter
summarizes that information and presents the format of each of the user-supplied input files.

The parameter files are the central input files for HAPEM simulations, and customized
parameter files should be prepared for every simulation (or set of simulations). It is best to save
the parameter file used for each simulation under a unique name, so that the files from earlier
simulations are not overwritten. A consistent naming system should be developed to pair each
parameter file with the output files generated by the simulation or set of simulations. This pairing
serves as one form of documentation for the model simulations, so the user can later determine
which settings produced which results. Another form of documentation is the repetition of the
parameter settings at the start of the final output file.

The remaining filenames used by the model programs are input from the parameter files. Thus,
the user must check that the parameter files refer to the correct filenames before conducting a
simulation. Chapter 2 discusses, and Table 2-1 presents, which user-supplied and
model-generated files each of the five HAPEM programs requires.

As explained in Chapter 2, there are default files available for 11 of the 12 user-supplied input
files. These 11 default files are listed below.

Default input files available for HAPEM

population            (e.g., population_HAPEM8.txt; national scope)
activity              (e.g., durhw_HAPEM8.txt)
cluster               (e.g., cluster_HAPEM8.txt)
cluster-transition    (e.g., clustrans_HAPEM8.txt)
commuting             (e.g., commute_flow_HAPEM8.txt; national scope)
commuting-time        (e.g., commute_time_HAPEM8.txt; national scope)
commuting-fraction    (e.g., commute_fraction_HAPEM8.txt; national scope)
distance-to-road      (e.g., proximity_road_HAPEM8.txt; national scope)
factors               (one each for gaseous HAPs, particulate HAPs, and HAPs which could
                      be either phase depending on conditions; e.g.,
                      factors_gas_HAPEM8.txt, factors_particulate_HAPEM8.txt, and
                      factors_mixed_HAPEM8.txt, respectively)
mobiles               (one each for benzene, 1,3-butadiene, diesel PM, and non-specific
                      HAPs; e.g., factors_OnroadMobile_Benzene_HAPEM8.txt,
                      factors_OnroadMobile_Butadiene_HAPEM8.txt,
                      factors_OnroadMobile_Diesel_HAPEM8.txt, and
                      factors_OnroadMobile_Other_HAPEM8.txt, respectively)
statefip              (e.g., FIPS_StateCrosswalk_HAPEM8.DAT; national scope)

See Appendix A for more details on the current HAPEM default input files. The user may
provide his or her own files as replacements for any or all of these files, using the file formats
described in this chapter.

The twelfth user-supplied file, the air quality file, must be provided by the user with the format
described in this chapter.

3.1. Parameter Files

The parameter files contain the seven types of information listed below for use in HAPEM runs.

•	Paths and filenames for the input data files (except the indoor-source files, which
currently are not used) and the diagnostic-type output files.

•	Paths for the final exposure-output files.

•	Identification of the set of states (optionally including the District of Columbia, Puerto
Rico, and the U.S. Virgin Islands) for the simulation.

•	Identification of the HAP, the units of measurement, the target year of the analysis, and
the definitions of demographic groups.

•	A spatially uniform background concentration.

•	Internal parameter settings.

•	Seed values for three random-number generators.

All of this information is identified using keywords. The required parameter-file information for
running each of the five model programs is presented in Table 2-1 of Chapter 2 as user-defined
files and user-defined parameters. The contents and format of each of the user-defined files are
described below. As explained in Chapter 2, with one exception, any information in the
parameter file in addition to that required by a program will be ignored by the program. (The
exception is that programs 1-3 will stop if the keyword nreplic—required by the AIRQUAL and
HAPEM programs—is encountered in the parameter file.) Therefore, although a separate
parameter file may be used for each program in the model set, it is possible to use the same
parameter file for running programs 1-3 and another for running programs 4-5 by aggregating
all the information needed for each program in the file. The format (including keywords) of a
parameter file for running the model programs is presented in Figure 2-2a and Figure 2-2b in
Chapter 2.

The model programs only scan lines containing an equal ("=") sign. The word or words to the
left of the equal sign identify which variable is being set and thus should not be changed. The
data to the right of the equal sign are the values or settings that the user selects for the model
run. The pathnames should precede the parameter settings in the file. The user can add
additional lines (e.g., comments) anywhere in the parameter file. It is safest if these lines do not
contain an equal sign, which could cause them to be parsed accidentally by the model. To
ensure that all the necessary information is specified, it is safest to edit an existing parameter
file, changing only the comments and the right-hand sides of the equal signs.
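
The scanning rule described above can be illustrated with a short Python sketch. This is not the HAPEM parser: the function name, the sample file-path keyword, and all sample values are hypothetical, and the sketch assumes that everything to the left of the first equal sign is the keyword.

```python
def parse_parameter_text(text):
    """Return {keyword: value} for every line that contains an equal sign."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # no equal sign: treated as a comment and skipped
        keyword, _, value = line.partition("=")
        settings[keyword.strip()] = value.strip()
    return settings


# Hypothetical parameter-file content; the path and values are illustrative only.
sample = """\
Parameter file for a test run (comment line, no equal sign)
activity file = /hapem/input/durhw_HAPEM8.txt
nmicro = 18
nblock = 24
hblock = 8
"""

settings = parse_parameter_text(sample)
print(settings["nblock"])
```

In this sketch, a comment line that happens to contain an equal sign would be stored as a setting, which is the kind of accidental parsing the guidance above is meant to avoid.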

3.1.1.	Specifying the Location and Names of Input and Output Files

In editing the parameter file, the user should typically provide the full pathnames for input and
output files (except the indoor-source files [not currently used] and the final exposure output
files). The names can be up to 100 characters in length and should not use quotation marks to
enclose the filenames. If the full pathnames exceed 100 characters, the user may use
abbreviated paths (locations of the files relative to the parameter file's directory) but must
always update these abbreviated paths if the parameter file is moved. Some PC systems might
require backslashes ("\") in pathnames, rather than the forward slashes ("/") used in UNIX and
other systems.

In addition to the input files discussed above, there are three diagnostic output files and a set of
final output files (i.e., one file for each state, district, or territory included in the simulation) for
which full pathnames must be specified. The diagnostic output files are log, counter, and
mistract.

As explained in Chapter 2, HAPEM creates an exposure output file for each state/HAP
combination. The names of these files are constructed by the program based on the HAP
SAROAD code (specified as the value of the sarod parameter) and the state FIPS code. Thus,
the pathnames, but not the filenames, for these files must be specified in the parameter file.

3.1.2.	Identifying the Uniform Component of the Background
Concentration

In addition to estimating exposure-concentration contributions for each emission-source
category for which data are provided in the air quality file, HAPEM also estimates the exposure-
concentration contribution from the background outdoor concentration. This background-
exposure contribution, of which there are two possible components, is added together with the
contributions from the source categories to calculate the total exposure concentration. One
component of the background concentration is assumed uniform throughout the study area (i.e.,
a single value is specified as the backg parameter, in the same units as those used in the air
quality file). The uniform component of the background concentration is an estimate of the
outdoor concentration that would occur in the absence of any local anthropogenic emissions. It
includes concentration contributions from natural sources, re-entrainment, and/or global
transport. The second component of the background concentration is provided in the air quality
file as a single time-invariant value for each location specified in the air quality file. This
component typically represents either the impact of anthropogenic emissions released outside
of the modeling domain or a combination of those emissions and the outdoor concentration that
would occur in the absence of all anthropogenic emissions. In the latter case, the value of
backg would be set to 0.0, since its constituents would be included in the location-specific
background value.

3.1.3. Setting the Internal Parameters

The 12 internal parameter settings (nmicro, nblock, hblock, ntype, ngroup, nsource, nreplic,
nmobiles, nbmicro, nemicro, nvehicles, and npublict) are specified by the user in one or more
of the parameter files and must be consistent with the structure of the other input data files.
Each of these parameters is defined in the Internal Parameters text box. Thus, if the user wishes
to change the number of MEs, for example, the input files that specify MEs must also be altered
in a consistent manner.

As explained in Chapter 2, the value of the hblock parameter (the number of time blocks per day
for the analysis) must be selected to meet the criteria listed below.

•	The value of hblock must be an integral factor of nblock (the number of time blocks
per day in the activity file) so that the activity time blocks can be combined if necessary
to match hblock.

•	The value of hblock must be an integral multiple of the number of time blocks per day
in the air quality file, so that the air-quality values can be replicated if necessary to
create hblock air-quality values.
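
The two hblock criteria amount to simple divisibility tests, sketched below in Python. The function name and the argument n_aq_blocks (the number of time blocks per day in the air quality file) are hypothetical names chosen for illustration; HAPEM performs its own checks.

```python
def check_hblock(hblock, nblock, n_aq_blocks):
    """Return a list of problems with the chosen hblock (empty if none).

    hblock      -- time blocks per day for the analysis
    nblock      -- time blocks per day in the activity file
    n_aq_blocks -- time blocks per day in the air quality file (hypothetical name)
    """
    problems = []
    if nblock % hblock != 0:
        problems.append("hblock is not an integral factor of nblock")
    if hblock % n_aq_blocks != 0:
        problems.append("hblock is not an integral multiple of the air-quality block count")
    return problems


# With 24 activity blocks and 8 air-quality blocks per day, hblock = 8 passes:
print(check_hblock(8, 24, 8))
# hblock = 12 divides 24 but is not a multiple of 8, so one problem is reported:
print(check_hblock(12, 24, 8))
```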

Internal Parameters

nmicro       number of MEs in the activity, factors, and mobiles files
nblock       number of time blocks per day in the activity file
hblock       number of time blocks per day for the analysis
ntype        number of day types in the DURAV source code
ngroup       number of demographic groups in the DURAV source code and the population file
nsource      number of emission-source categories in the air quality file
nreplic      number of replicates to be simulated for each demographic group in each census tract
nmobiles     sequence numbers of up to 10 onroad-mobile emission-source categories in the
             air quality file
nbmicro      sequence number of the first indoor ME (including in-vehicle) in the activity,
             factors, and mobiles files
nemicro      sequence number of the last indoor ME (including in-vehicle) in the activity,
             factors, and mobiles files
nvehicles    sequence numbers of up to 10 MEs for private commuting in the activity, factors,
             and mobiles files (e.g., cars, trucks)
npublict     sequence numbers of up to 10 MEs for public-transit commuting in the activity,
             factors, and mobiles files (e.g., buses, trains)

3.2. Activity File

The activity file, the primary input to the DURAV program, contains information on the time
individuals spent in various MEs. This information is not presented as an activity sequence;
rather, it is presented in the activity file as the total time spent in each ME during each block of
time and at each location throughout the day.

3.2.1. Variables and Format of the Default File

The first line of the activity file is a text header that indicates the order of the variables in each
record, although it does not explicitly name the contents that make up the bulk of the file—the
minutes spent in each ME for each combination of commute status and day type. The header in
the current default activity file is as follows.

Header of default activity file

(in "wrapped" view)

CHADID ZIP ST COU SEX RACE WORK YEAR MN DY AGE G DT CT

Although most of the header record of the activity file is not used by the model programs, it
provides documentation to inform the user of the meaning of the data fields. The exception is
the specification of the number of time blocks per day, nblock, which the DURAV program
checks against the value of the nblock parameter specified in the parameter file for
consistency. If inconsistent, an error message is sent to the log file and the program stops.

Each fixed-width, space-delimited record following the header record consists of one person-
day (1,440 minutes) of activity data. The variables in the default activity file, extracted from
CHAD, are defined in Table 3-1. See Appendix A for more details on the current HAPEM default
input files.

Following the commuting indicator is a series of duration values. The values specify the integral
number of minutes (possibly zero) spent in each ME/time-block/location combination, where
locations are at home or at work. The current default activity file, with 18 MEs (listed in Table
1-1), 24 time blocks per day, and two locations, has a total of 864 duration values. These values
are sequenced so that the 18 ME durations for the first time block in the home location come
first, followed by the 18 ME durations for the second time block in the home location, and so on,
until all the 432 values for the home location are specified. These are followed by the 432
values for the work location. An example of a record from the current activity file is presented
below.
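
The ordering of the duration values described above can be expressed as a small index calculation. The Python sketch below is illustrative only; the function name and the use of 1-based ME and time-block numbers are assumptions, and the defaults reflect the 18 MEs and 24 time blocks of the current default activity file.

```python
def duration_index(me, block, location, nmicro=18, nblock=24):
    """Zero-based position of one duration value among the duration fields.

    me and block are 1-based; location is "home" or "work".
    All home-location values precede all work-location values, and within a
    location the nmicro ME values for a time block are stored together.
    """
    if location not in ("home", "work"):
        raise ValueError("location must be 'home' or 'work'")
    if not (1 <= me <= nmicro and 1 <= block <= nblock):
        raise ValueError("me or block out of range")
    location_offset = 0 if location == "home" else nmicro * nblock
    return location_offset + (block - 1) * nmicro + (me - 1)


print(duration_index(1, 1, "home"))    # first duration value
print(duration_index(18, 24, "home"))  # last home-location value
print(duration_index(1, 1, "work"))    # first work-location value
```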

Table 3-1. Variables in the default activity file

Variable      Description

CHADID        9-character string identifying the data record; e.g., the corresponding person-day in
              the CHAD activity database. This information is used by the DURAV program only to
              identify faulty records in the diagnostic output files.

ZIP           5-character string identifying the ZIP code of the respondent's residence. If a ZIP code
              is missing, it is reported as "00000". This information is not used by the current
              version of the DURAV program.

ST            2-character string identifying the FIPS code of the state where the activities took
              place. This information is not used by the current version of the DURAV program.

COU           3-character string identifying the FIPS code of the county where the activities took
              place. This information is not used by the current version of the DURAV program.

SEX           1-character string, indicating the sex of the respondent, with values as follows:
                  "1" = female
                  "2" = male
                  "9" = unknown
              This information is not used by the current version of the DURAV program.

RACE          1-character string, indicating race/ethnic group of the respondent, with values as
              follows:
                  "1" = White (non-Hispanic)
                  "2" = Black (non-Hispanic)
                  "3" = Hispanic (any race)
                  "4" = Asian or Other (non-Hispanic)
                  "9" = unknown
              This information is not used by the current version of the DURAV program.

WORK          1-character string, indicating employment status of respondent, with values as follows:
                  "Y" = Yes
                  "N" = No
                  "X" = missing
              This information is not used by the current version of the DURAV program.

YEAR, MN, DY  Numeric variables (four-digit year) that identify the date when the activities actually
              took place. This information is not used by the current version of the DURAV program.

AGE           Integer indicator of the age of the respondent (missing = -999.00)

G             Integer indicator of the HAPEM age group of the respondent, with values as follows:
                  1 = 0-1 years old
                  2 = 2-4
                  3 = 5-15
                  4 = 16-17
                  5 = 18-64
                  6 = 65+

DT            Integer indicator of day type for classification, with values as follows:
                  1 = summer weekday
                  2 = non-summer weekday
                  3 = weekend

CT            Integer indicator of whether the respondent is a commuter, with values as follows:
                  0 = no commuting
                  1 = commuting

DURATION (MICRO,BLOCK,HW)
              Duration of event (minutes). There are 864 of these fields, cycling through each of the
              18 MEs; cycling through the MEs for the first time block for home locations, and so on
              through the last time block, then repeating the cycle for work locations.

Example data record from default activity file

(in "wrapped" view; after the header and identifier lines, each line holds the 18 ME durations for
one time block, with the 24 home-location time blocks followed by the 24 work-location time blocks)

CHADID    ZIP   ST COU SEX RACE WORK YEAR MN DY AGE  G DT CT
CAC01166A 93277 06 000 1   1    N    1989 6  16 1.67 1 1  1
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 30
30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 30
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
45 0 0 0 0 0 15 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 35 0 25 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0





3.2.2. Replacing or Modifying the Default File

If the user wishes to replace or modify the default activity file, they must ensure that the
following two conditions are met.

•	The number of duration values in each record must equal twice the product of the values
of nmicro and nblock as specified in the parameter file.

•	The sum of the duration values in each record must total 1,440 minutes (i.e., no time is
unaccounted for); otherwise, the DURAV program will stop.
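
Both conditions can be checked before running DURAV. The Python sketch below is a minimal illustration, not part of HAPEM: it assumes the record can be split on whitespace, that the 14 leading fields of the default file (CHADID through CT) precede the duration values, and that the function and constant names are free to choose.

```python
N_LEADING_FIELDS = 14  # CHADID, ZIP, ST, COU, SEX, RACE, WORK, YEAR, MN, DY, AGE, G, DT, CT


def check_activity_record(record, nmicro, nblock):
    """Return a list of problems found in one activity record (empty if none)."""
    fields = record.split()
    durations = [int(value) for value in fields[N_LEADING_FIELDS:]]
    problems = []
    expected = 2 * nmicro * nblock  # home and work locations
    if len(durations) != expected:
        problems.append(f"expected {expected} duration values, found {len(durations)}")
    if sum(durations) != 1440:
        problems.append(f"durations sum to {sum(durations)} minutes, not 1,440")
    return problems


# Small synthetic record with nmicro = 2 and nblock = 3 (12 duration values).
leading = "CAC01166A 93277 06 000 1 1 N 1989 6 16 1.67 1 1 1"
good = leading + " 480 0 480 0 480 0 0 0 0 0 0 0"
short = leading + " 480 0 480 0 480 0 0 0 0 0 0"
print(check_activity_record(good, nmicro=2, nblock=3))
print(check_activity_record(short, nmicro=2, nblock=3))
```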

In addition, the user must ensure that the activity file is consistent with a feature of the DURAV
source code: the record length of the activity file (unit 11) is specified in the DURAV program. If
the user constructs a replacement activity file with a record length different from that of the
default activity file, corresponding changes need to be made in the DURAV program.

The variables used by the DURAV program for classifying activity records (i.e., day type,
demographic, and commute status), as well as the activity-duration values, are identified by the
program by their position in the data record. If the user constructs a replacement activity file with
these variables positioned differently, corresponding changes need to be made in DURAV for
unit 11 as well as in HAPEM for unit 21, which is produced by DURAV. The number of
demographic groups and day types is unlimited. However, the user is cautioned that for
narrowly-defined groups and day types, there might not be enough activity-pattern data to
calculate a reliable group average or create meaningful activity-pattern clusters. An extreme
example of this is where no activity patterns fit a group's definition, resulting in incorrect
exposure calculations (i.e., exposure concentrations equal to zero) for that group.

The number, definition, and order of MEs must be the same in both the activity file and the
factors and mobiles files (see Section 3.10 [ME Factors and Mobiles Files]). The number is
specified in the parameter files as nmicro.

The activity file is read by the DURAV program, which creates two intermediate output files with
the same path and root filename, but with different filename extensions. Thus, the user should
NOT name an activity file with any of the following filename extensions: .da and .nonzero.

As with other model input files, the user can add comments or other information after the last
data record in the file. To prevent the program from reading these comments as data, a blank
line must be inserted after the last data record and before any comments.

3.3. Cluster File

This file provides information on the demographic group, day type, and cluster type of each
complete (i.e., with 1,440 minutes per day) CHAD record in the activity file. The file is used in
DURAV to group CHAD records according to cluster. See Appendix A for more details on the
current HAPEM input files.

3.3.1. Variables and Format of the Default File

The first line of the fixed-width, space-delimited cluster file is a text header that indicates the
order of the variables in each record. The header in the current default cluster file is as follows.

Header of default cluster file

g dt ct chadid clus nclus

Although the header record of the cluster file is not used by the model programs, it provides
documentation to inform the user of the meaning of the data fields. The first four variables have
the same meaning as in the activity file (see Table 3-1), while "clus" refers to the cluster
category number for that record, and "nclus" refers to the total number of cluster categories for
that demographic-group/day-type/commuting-status combination.

An extract from the current default cluster file is shown below. These cluster categories were
determined using cluster analysis, as explained in Appendix A.

Extract from default cluster file

g  dt  ct  chadid     clus  nclus
1  1   1   CAC01166A  1     1
1  1   1   CAC01251A  1     1
1  1   1   CAC01489A  1     1
1  1   1   CAC01562A  1     1
1  1   1   CAC01568A  1     1
1  1   1   CAC01809A  1     1
1  1   1   CAC01830A  1     1
1  1   1   CAC01982A  1     1
1  1   1   CAC02036A  1     1
1  1   1   CAC02132A  1     1

3.3.2. Replacing or Modifying the Default File

If the user wishes to replace or modify the default cluster file, they must ensure that the file is
properly formatted and the following two conditions are met.

•	There should be one record for every valid record in the corresponding activity file (i.e.,
one with 1,440 minutes, a demographic specification, a day-type designation, and a
commuting-status specification). Any record in the activity file without a corresponding
record in the cluster file will not be used.

•	The records should be sorted in the same way as the activity file.
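
A replacement cluster file can be compared against the activity file before running DURAV. The Python sketch below is one way to do so, under stated assumptions: the CHADID values have already been read from each file into lists (reading is not shown), both function names are hypothetical, and "sorted in the same way" is interpreted as the cluster-file CHADIDs appearing in the same relative order as in the activity file.

```python
def activity_ids_without_cluster(activity_ids, cluster_ids):
    """CHADIDs in the activity file with no cluster record (these would not be used)."""
    cluster_set = set(cluster_ids)
    return [chadid for chadid in activity_ids if chadid not in cluster_set]


def cluster_order_matches(activity_ids, cluster_ids):
    """True if the cluster-file CHADIDs occur in the same relative order as in the activity file."""
    remaining = iter(activity_ids)
    # Each membership test advances the iterator, so order is enforced.
    return all(chadid in remaining for chadid in cluster_ids)


activity_ids = ["CAC01166A", "CAC01251A", "CAC01489A"]
cluster_ids = ["CAC01166A", "CAC01489A"]
print(activity_ids_without_cluster(activity_ids, cluster_ids))
print(cluster_order_matches(activity_ids, cluster_ids))
```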

3.4. Population File

The population file, the primary input to the INDEXPOP program, provides the number of people
in each demographic group residing in each census tract of the study area. The data must be
sorted according to state FIPS, county FIPS, and tract FIPS. The data are typically derived from
the census data. The group definitions are presented in Section 2.2.2 (Changing the Number
and/or Definitions of the Demographic Groups). See Appendix A for more details on the current
HAPEM default input files.

3.4.1. Variables and Format of the Default File

The population file begins with two text header records, followed by one data record for each
census tract. The first header record indicates the order of the variables in each of the data
records. The first and second header records of the current default population file are as follows.

Header of default population file

TRACT        B_00  B_02  B_05  B_16  B_18  B_65
             COM   COM   COM   COM   COM   COM

Although the header of the population file is not used by the model programs, it provides
documentation to inform the user of the meaning of the data fields. Each fixed-width, space-
delimited data record following the header consists of a census-tract identifier and a population
value for each of the indicated demographic groups in that tract. The definitions of the data
fields in the current default population file are presented in Table 3-2.

Table 3-2. Variables in the default population file

Variable      Description

TRACT         11-character string uniquely identifying a U.S. census tract. The first two characters
              identify the state FIPS code, the next three characters the county FIPS code. The
              remaining six characters consist of the four-character tract code followed by its
              two-character extension. If there is no extension for the tract, "00" is used.

B_YY          Integer specifying the number of tract residents with age in category YY.
              The age category definitions are:
                  00 = 0-1 years old
                  02 = 2-4 years old
                  05 = 5-15 years old
                  16 = 16-17 years old
                  18 = 18-64 years old
                  65 = 65 years or older
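
The structure of the TRACT identifier can be illustrated by splitting it into its parts. The Python sketch below is illustrative only; the function name and dictionary keys are hypothetical, and the sample record is the first data record of the default population file extract.

```python
def split_tract_id(tract_id):
    """Split an 11-character tract identifier into its FIPS components."""
    if len(tract_id) != 11:
        raise ValueError("tract identifier must have 11 characters")
    return {
        "state": tract_id[0:2],       # state FIPS code
        "county": tract_id[2:5],      # county FIPS code
        "tract": tract_id[5:9],       # four-character tract code
        "extension": tract_id[9:11],  # "00" if the tract has no extension
    }


# One data record: tract identifier followed by six age-group populations.
record = "01001020100 30 67 252 56 1086 284"
fields = record.split()
parts = split_tract_id(fields[0])
populations = [int(value) for value in fields[1:]]
print(parts)
print(sum(populations))
```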

An extract from the current default population file is presented below.

Extract from default population file

TRACT        B_00  B_02  B_05  B_16  B_18  B_65
             COM   COM   COM   COM   COM   COM
01001020100    30    67   252    56  1086   284
01001020200    33    78   283    77  1268   316
01001020300    60   109   471    91  1887   598
01001020400    81   137   596    88  2383   961
01001020501    75   115   625   138  2599   770
01001020502    84   144   569   105  2090   292
01001020503    81   120   542   104  2188   581
01001020600    77   141   642   101  2198   570
01001020700    91   161   491    86  2105   475
01001020801    44   104   487   131  1861   516

3.4.2. Replacing or Modifying the Default File

If the user wishes to replace or modify the default population file, they must ensure that the
definitions and ordering of the demographic groups in the population file correspond to the
ordering in the output file from DURAV that is subsequently used in the HAPEM program. In
addition, the user must ensure that the record length is consistent with its specification in the
INDEXPOP program (unit 14).

The population file is read by the INDEXPOP program, which creates several intermediate
output files with the same path and root filename, but with different filename extensions. Thus,
the user should NOT name a population file with any of the following filename extensions: .da,
.county_tract_pop_range, and .state_county_pop_range. There also is an intermediate file with
the characters _direct.ind attached to the population file root name.

As with other model input files, the user can add comments or other information after the last
data record in the file. To prevent the program from reading these comments as data, a blank
line must be inserted after the last data record and before any comments.

3.5. Commuting-time File

The commuting-time file provides data for each tract on the proportion of commuting workers
who take public transit and private transit, and their respective round-trip average commuting
times (minutes). This information is combined with data on the centroid-to-centroid commuting
distances for workers in the tract, provided in the commuting file described below, to estimate a
commuting time for each replicate that is probabilistically selected to commute to work,
according to the data provided in the commuting-fraction file described below. The HAPEM
program then adjusts the selected activity patterns for that replicate to reflect the estimated
commuting time (see Section 5.2.5 [HAPEM] for more details on the algorithm).

The commuting-time file has no header records, only data records. Each tab-delimited data
record contains the following five variables, derived from U.S. Census data for the default file.

Variables in the commuting-time file

Home tract ID                                            (11-character string: state FIPS,
                                                         county FIPS, and tract FIPS)
Proportion of commuters who travel by public transit     (decimal number)
Proportion of commuters who travel by private vehicle    (decimal number)
Average round-trip commuting time for public-transit     (minutes)
commuters
Average round-trip commuting time for private-transit    (minutes)
commuters

The default commuting-time file is sorted by tract ID, smallest to largest in numerical order.
Several example data records from the current default commuting-time file are presented below.
See Appendix A for more details on the current HAPEM default input files.

Extract from the default commuting-time file

01001020100  0.0000  1.0000   0.0000  50.8320
01001020200  0.0000  1.0000   0.0000  50.8320
01001020300  0.0000  1.0000   0.0000  49.5615
01001020400  0.0377  0.9623  89.9083  5 0
01001020501  0.0166  0.9834  89.9083  5 0
01001020502  0.0000  1.0000   0.0000  50.8320
01001020503  0.0000  1.0000   0.0000  50.8320
01001020600  0.0000  1.0000   0.0000  50.8320
01001020700  0.0000  1.0000   0.0000  50.8320
01001020801  0.0000  1.0000   0.0000  50.8320

3.6. Commuting-fraction File

The commuting-fraction file provides data for each tract on the proportion of workers in each
demographic group that commutes to work. This information is used by the HAPEM program to
determine for each replicate in each group whether they commute to work, and therefore, which
set of activity patterns should be sampled to represent that replicate. The data in the default
commuting-fraction file are derived from census data.

The commuting-fraction file has no header records, only data records. Each tab-delimited data
record contains 13 variables, as follows.

Variables in the commuting-fraction file

Home tract ID                                            (11-character string: state FIPS,
                                                         county FIPS, and tract FIPS)
Proportion of workers in demographic-group 1 that        (decimal number)
does not commute to work
Proportion of workers in demographic-group 1 that        (decimal number)
commutes to work
Repeat the latter two above for groups 2-6

The default commuting-fraction file is sorted by tract ID, smallest to largest in numerical order.
Several example data records from the current default commuting-fraction file are presented
below. See Appendix A for more details on the current HAPEM default input files.

Extract from the default commuting-fraction file

(in "wrapped" view)

01001020100 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0101 0.9899
	0.0000 1.0000

01001020200 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0149 0.9851
	0.0000 1.0000

01001020300 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0909 0.9091 0.0334 0.9666
	0.0000 1.0000

01001020400 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0379 0.9621
	0.1374 0.8626

01001020501 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0589 0.9411
	0.0000 1.0000

01001020502 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0051 0.9949
	1.0000 0.0000

01001020503 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0896 0.9104
	1.0000 0.0000

01001020600 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0717 0.9283
	0.1000 0.9000

01001020700 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0178 0.9822
	0.0000 1.0000

01001020801 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.1304 0.8696 0.0917 0.9083
	0.0000 1.0000
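
The way a commuting-fraction record might be read and used to decide probabilistically whether a replicate commutes can be sketched in Python. This is a minimal illustration, not the HAPEM algorithm: the function names are hypothetical, the record is split on whitespace, and a single uniform random draw is assumed for the commuting decision.

```python
import random


def parse_commuting_fraction(record, ngroup=6):
    """Return (tract_id, [(non_commuting, commuting), ...]) for one data record."""
    fields = record.split()
    values = [float(value) for value in fields[1:]]
    if len(values) != 2 * ngroup:
        raise ValueError(f"expected {2 * ngroup} proportions, found {len(values)}")
    pairs = [(values[2 * g], values[2 * g + 1]) for g in range(ngroup)]
    return fields[0], pairs


def replicate_commutes(commuting_fraction, rng):
    """Draw once: True with probability equal to the commuting fraction."""
    return rng.random() < commuting_fraction


# First record of the default-file extract, written on one line.
record = ("01001020100 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 "
          "0.0000 1.0000 0.0101 0.9899 0.0000 1.0000")
tract_id, pairs = parse_commuting_fraction(record)
rng = random.Random(12345)  # arbitrary seed, for repeatability of the sketch only
group5_commuting = pairs[4][1]  # demographic group 5 (ages 18-64)
print(tract_id, group5_commuting, replicate_commutes(group5_commuting, rng))
```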

3.7. Distance-to-road File

The distance-to-road file provides data for each tract on the proportion of tract area and the
proportion of each demographic group that resides within three distance categories from a
major roadway: 0-75 meters, 75-200 meters, and greater than 200 meters. This information is
used by the HAPEM program to determine, for each replicate for each ME, the distance from a
major roadway and, therefore, which PROX factor distributions in the mobiles file (described
below) to sample from for the onroad-mobile source categories.

The distance-to-road file has no header records, only data records. Each tab-delimited data
record contains the 22 variables listed below, derived in the current default file using census
data as well as census data processed by a third party.

Variables in the distance-to-road file

Tract ID	(11-character string: state FIPS, county FIPS, and tract FIPS)
Fractions of tract area within each of three distance categories from a major roadway: 0-75 meters, 75-200 meters, greater than 200 meters	(decimal number)
Fractions of demographic-group 1 that reside within each of three distance categories from a major roadway: 0-75 meters, 75-200 meters, greater than 200 meters	(decimal number)
Repeat the latter one for groups 2-6

The default distance-to-road file is sorted by tract ID, smallest to largest in numerical order.
Several example data records from the current default distance-to-road file are presented
below. See Appendix A for more details on the current HAPEM default input files.

Extract from the default distance-to-road file

(in "wrapped" view)

01001020100 0.0601 0.0865 0.8534 0.0778 0.1077 0.8145 0.0778 0.1077 0.8145 0.0512
0.0889 0.8599 0.0903 0.1279 0.7819 0.0412 0.0743 0.8845 0.0607 0.0998 0.8395
01001020200 0.0526 0.0644 0.8830 0.0562 0.0716 0.8722 0.0562 0.0716 0.8722 0.0421
0.0552 0.9027 0.0543 0.0742 0.8715 0.0586 0.1170 0.8243 0.0525 0.0752 0.8722
01001020300 0.0740 0.1116 0.8143 0.0598 0.0978 0.8423 0.0598 0.0978 0.8423 0.0403
0.0752 0.8844 0.0572 0.0898 0.8530 0.0501 0.0951 0.8548 0.0937 0.1450 0.7614
01001020400 0.1151 0.1740 0.7109 0.1024 0.1859 0.7117 0.1024 0.1859 0.7117 0.1000
0.2404 0.6596 0.1250 0.2122 0.6628 0.1107 0.1902 0.6991 0.1082 0.2026 0.6892
01001020501 0.0800 0.1126 0.8074 0.0322 0.0527 0.9151 0.0322 0.0527 0.9151 0.0240
0.0453 0.9307 0.0378 0.0791 0.8831 0.0305 0.0568 0.9126 0.0235 0.0460 0.9306
01001020502 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000
0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000
01001020503 0.0563 0.0932 0.8505 0.0185 0.0312 0.9503 0.0185 0.0312 0.9503 0.0294
0.0496 0.9210 0.0181 0.0305 0.9514 0.0303 0.0512 0.9185 0.0519 0.0874 0.8607
01001020600 0.1428 0.2165 0.6407 0.1410 0.2209 0.6381 0.1410 0.2209 0.6381 0.1288
0.2146 0.6566 0.0941 0.1996 0.7063 0.1301 0.2333 0.6366 0.1211 0.2206 0.6583
01001020700 0.0524 0.0783 0.8693 0.0544 0.0994 0.8462 0.0544 0.0994 0.8462 0.0490
0.1127 0.8383 0.0523 0.1090 0.8387 0.0596 0.1172 0.8232 0.0648 0.1203 0.8149
01001020801 0.0152 0.0248 0.9601 0.0733 0.0347 0.8920 0.0733 0.0347 0.8920 0.0298
0.0320 0.9382 0.0403 0.0548 0.9049 0.0422 0.0493 0.9085 0.0693 0.0669 0.8638
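
As an illustration of the record layout described above, the following Python sketch splits one distance-to-road record into its tract ID, the three tract-area fractions, and the three fractions for each demographic group, and checks that each triple sums to approximately 1. The function names and the rounding tolerance are assumptions made for this sketch; they are not part of HAPEM.

```python
from typing import List, Tuple


def parse_distance_to_road(record: str, n_groups: int = 6) -> Tuple[str, List[float], List[List[float]]]:
    """Split one record into (tract ID, area fractions, per-group fractions).

    Assumes 1 + 3 * (1 + n_groups) whitespace- or tab-delimited fields,
    i.e., 22 fields for the six default demographic groups.
    """
    fields = record.split()
    expected = 1 + 3 * (1 + n_groups)
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, found {len(fields)}")
    tract_id = fields[0]
    values = [float(x) for x in fields[1:]]
    area = values[:3]  # 0-75 m, 75-200 m, >200 m
    groups = [values[3 + 3 * g: 6 + 3 * g] for g in range(n_groups)]
    return tract_id, area, groups


def triple_is_normalized(triple: List[float], tol: float = 1e-3) -> bool:
    """True if the three distance-category fractions sum to 1 within tol.

    The tolerance is a hypothetical allowance for four-decimal rounding.
    """
    return abs(sum(triple) - 1.0) <= tol
```

A check of this kind can be useful before replacing the default file, because fractions that do not sum to 1 would distort any sampling of distance categories.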

3.8. Commuting File

The commuting file, the main input file to the COMMUTE program, provides data on the
commuting flows (i.e., the proportion of commuters) between pairs of census tracts. The default
commuting file was derived from census data identifying the tract of work and tract of residence
for individuals in all 50 states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands.
Only those home-tract/work-tract pairs with non-zero flows are included in the default
commuting file.


The commuting file has data records with no header records. Each fixed-width, space-delimited
data record contains five variables (the first being empty), as follows.

Variables in the commuting file

Leading space in file
Home tract ID	(11-character string: state FIPS, county FIPS, and tract FIPS)
Work tract ID	(11-character string: state FIPS, county FIPS, and tract FIPS)
Distance apart in kilometers	(decimal number)
Fraction of workers in the commuting flow	(decimal number; sums to 1 across all instances of a home tract)

An extract from the current default commuting file is presented below. See Appendix A for more
details on the current HAPEM default input files.

Extract from the default commuting file

 01001020100 01001020100 0.00 0.03045067
 01001020100 01001020803 6.98 0.00365408
 01001020100 01001020200 1.92 0.04263094
 01001020100 01001020300 3.12 0.04872107
 01001020100 01001020400 4.56 0.01827040
 01001020100 01001020501 7.51 0.07308161
 01001020100 01001020502 7.07 0.02436054
 01001020100 01001020503 6.25 0.03654080
 01001020100 01001020600 4.08 0.02436054
 01001020100 01001020700 7.52 0.06090134
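
The following Python sketch shows one way to check a replacement commuting file against the rule stated above, namely that the commuting fractions sum to 1 across all records for a home tract. The function names are hypothetical, and stopping at the first blank line mirrors the convention that a blank line separates data records from trailing comments; neither is taken from the COMMUTE source code.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def parse_commuting_record(line: str) -> Tuple[str, str, float, float]:
    """Return (home tract, work tract, distance in km, fraction of workers)."""
    home, work, dist_km, fraction = line.split()
    return home, work, float(dist_km), float(fraction)


def flow_totals(lines: Iterable[str]) -> Dict[str, float]:
    """Sum the commuting fractions for each home tract.

    Reading stops at the first blank line, on the assumption that
    anything after it is user commentary rather than data.
    """
    totals: Dict[str, float] = defaultdict(float)
    for line in lines:
        if not line.strip():
            break
        home, _work, _dist, fraction = parse_commuting_record(line)
        totals[home] += fraction
    return dict(totals)
```

Any home tract whose total differs noticeably from 1 would indicate missing or duplicated flows in the edited file.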

3.8.1. Replacing or Modifying the Default File

The commuting file is read by the COMMUTE program, which creates several intermediate
output files with the same path and root filename, but with different filename extensions. Thus,
the user should NOT name a commuting file with any of the following filename extensions: .da,
.ind, and .st_comm1_fip_range.

As with other model input files, the user can add comments or other information after the last
data record in the file. To prevent the program reading these comments as data, a blank line
must be inserted after the last data record and before any comments.

3.9. Air Quality File

The air quality file contains the ambient-air concentrations that are used by the AIRQUAL
program. AIRQUAL requires a separate air quality file for each HAP being evaluated.

The air quality file must begin with at least one text header record, followed by one or more data
records for each census tract to be evaluated. The required text header is used by the
AIRQUAL program to determine the number of time blocks per day (of equal size) in the
air-quality data. This value should be indicated immediately following the last instance of the
character string "block". For example, the sixth header record of an AERMOD-derived air quality
file used for the recent AirToxScreen analysis, which indicates the order of the variables in each
of the data records, is as follows.

Example header record from an air quality file

(in "wrapped" view)

FIPS Tract Backgrd_Conc Conc_block1 Conc_block2 Conc_block3
Conc_block4 Conc_block5 Conc_block6 Conc_block7 Conc_block8
Conc_block1 Conc_block2 Conc_block3 Conc_block4 Conc_block5
Conc_block6 Conc_block7 Conc_block8 Conc_block1 Conc_block2
Conc_block3 Conc_block4 Conc_block5 Conc_block6 Conc_block7
Conc_block8 Conc_block1 Conc_block2 Conc_block3 Conc_block4
Conc_block5 Conc_block6 Conc_block7 Conc_block8

For this example, AIRQUAL will interpret the number of time blocks per day as 8. As noted
elsewhere, the number of time blocks per day in the air quality file must be an integral factor of
nblock, the number of time blocks per day for the analysis as specified in the parameter file;
otherwise, the program will stop. If the number of time blocks per day in the air quality file is less
than nblock, AIRQUAL will replicate the values to create nblock concentration values. The
other information in this header record and all other header records is ignored by AIRQUAL.
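
The header rule and the replication step described above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the replication order (each value repeated consecutively) is an assumption, since the text states only that the values are replicated.

```python
import re
from typing import List, Sequence


def blocks_per_day(header_records: Sequence[str]) -> int:
    """Return the integer that immediately follows the last "block" in a header record."""
    for record in header_records:
        pos = record.rfind("block")
        if pos == -1:
            continue
        match = re.match(r"\s*(\d+)", record[pos + len("block"):])
        if match:
            return int(match.group(1))
    raise ValueError('no header record gives a number after "block"')


def expand_to_analysis_blocks(values: Sequence[float], n_analysis_blocks: int) -> List[float]:
    """Replicate air-quality values so that there is one per analysis time block.

    Assumes each value is repeated consecutively; fails if the number of
    air-quality blocks is not an integral factor of the analysis block count.
    """
    n = len(values)
    if n == 0 or n_analysis_blocks % n != 0:
        raise ValueError("air-quality blocks must be an integral factor of the analysis blocks")
    repeat = n_analysis_blocks // n
    return [v for v in values for _ in range(repeat)]
```

For example, under these assumptions two air-quality blocks per day would each be repeated four times for an eight-block analysis.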

After the required header information is found, AIRQUAL identifies data records by finding a
numerical digit in the fourth data field. To avoid a mistaken identification, the user should ensure
that header records do not contain a numerical digit in the fourth data field.

The fields in the data records are defined as follows.

Variables in the air quality file

Leading spaces in file
Home tract ID	(11-character string: state FIPS, county FIPS, and tract FIPS)
Space-delimited air concentrations for spatially-variable (but temporally constant) background concentration	(decimal number; optionally in exponential format, e.g., X.XXE-XX)
Space-delimited air concentrations for each combination of emission-source category and time block	(decimal numbers; optionally in exponential format, e.g., X.XXE-XX)

The number of non-background concentration values in each data record must equal the
product of the number of outdoor-emission-source categories (i.e., the value of nsource in the
parameter file) and the number of time blocks per day, as indicated in the text header record
discussed above. The values are ordered beginning with the first time block of the first emission
source, followed by the second time block of the first emission source, and so on. An example
data record is presented below for nsource = 4.


Example data record from an air quality file

(in "wrapped" view)

01001 020100 0.00E+00 3.18E-03 3.17E-03 2.13E-03
1.02E-03 8.42E-04 1.85E-03 3.00E-03 3.07E-03 2.98E-02
3.44E-02 3.63E-02 2.40E-02 1.83E-02 5.47E-02 8.24E-02
4.59E-02 2.13E-02 3.40E-02 8.71E-02 4.99E-02 3.04E-02
7.28E-02 7.77E-02 4.01E-02 1.58E-02 1.79E-02 1.89E-02
1.16E-02 9.37E-03 2.20E-02 2.74E-02 2.01E-02
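
To make the ordering of values concrete, the Python sketch below splits a data record into the tract ID, the background concentration, and a source-by-time-block table. It assumes, as the example record and header suggest, that the county FIPS and tract codes appear as two separate fields; the function name is hypothetical and this is not the AIRQUAL reading routine.

```python
from typing import List, Tuple


def split_air_quality_record(record: str, nsource: int, nblocks: int) -> Tuple[str, float, List[List[float]]]:
    """Return (tract ID, background concentration, concentrations[source][block]).

    Values are ordered with all time blocks of the first source first,
    then all time blocks of the second source, and so on.
    """
    fields = record.split()
    tract_id = fields[0] + fields[1]  # assumed "FIPS" and "Tract" fields
    background = float(fields[2])
    conc = [float(x) for x in fields[3:]]
    if len(conc) != nsource * nblocks:
        raise ValueError(f"expected {nsource * nblocks} values, found {len(conc)}")
    by_source = [conc[s * nblocks:(s + 1) * nblocks] for s in range(nsource)]
    return tract_id, background, by_source
```

With nsource = 4 and 8 time blocks per day, a record must therefore carry 32 non-background values, which matches the example record above.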

The air quality file is read by the AIRQUAL program, which creates several intermediate output
files with the same path and root filename, but with different filename extensions. Thus, the user
should NOT name an air quality file with any of the following filename extensions: .da, .air_da,
.pop_air_da, .state_air_fip_range, .state_air1_fip_range, and .state_air2_fip_range.

As with other model input files, the user can add comments or other information after the last
data record in the file. To prevent the program reading these comments as data, a blank line
must be inserted after the last data record and before any comments.

3.10. ME Factors and Mobiles Files

The ME factors and mobiles files provide probability distributions for the factors used to
calculate an estimated ME concentration from an outdoor concentration. The files contain
probability distributions for three of the four factors for each ME, and a single value for the fourth
factor. These factors are used in the HAPEM algorithm, as follows.

$$\mathrm{ME}(m,c,t,s,d) = \mathrm{PROX}(m,s,d) \times \mathrm{PEN}(m,t) \times \mathrm{AMB}(c, t_{\mathrm{LAG}(m)}, s)$$

$$\mathrm{ME}(m,t,i) = \mathrm{ADD}(m,t)$$

$$\mathrm{ME}(m,c,t,b) = \mathrm{PROX}(m,s) \times \mathrm{PEN}(m,t) \times [\mathrm{bckgd\_u} + \mathrm{bckgd\_v}(c)]$$

$$\mathrm{ME}(m,c,t,d) = \sum_{s} \mathrm{ME}(m,c,t,s,d) + \mathrm{ME}(m,t,i) + \mathrm{ME}(m,c,t,b)$$

where:

ME(m,c,t,s,d) = concentration in ME m located in census tract c at time t due to source
category s and at distance from source d,

PROX(m,s,d) = proximity factor for ME m, source category s, and distance from source d
(defined below),

PEN(m,t) = penetration factor for ME m at time t (defined below),

AMB(c,t,s) = ambient concentration for census tract c at time t for source category s
from the air quality file,

tLAG(m) = time t if LAG(m) = 0; time t-1, otherwise,

ME(m,t,i) = concentration in ME m at time t due to indoor sources,

ADD(m,t) = additive factor for ME m at time t (defined below),

ME(m,c,t,b) = concentration in ME m located in census tract c at time t, due to the
background concentration,

bckgd_u = uniform component of ambient background concentration,

bckgd_v(c) = spatially-variable component of background concentration, and

ME(m,c,t,d) = total concentration in ME m located in census tract c at time t at distance
from source d.
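
The way these terms combine can be sketched in Python for a single ME, census tract, time block, and distance category, using factor values that have already been sampled. The function and argument names are assumptions for this illustration and do not correspond to identifiers in the HAPEM code.

```python
from typing import Sequence


def me_concentration(
    prox_by_source: Sequence[float],
    pen: float,
    amb_by_source: Sequence[float],
    add: float,
    prox_background: float,
    bckgd_u: float,
    bckgd_v: float,
) -> float:
    """Total ME concentration: outdoor-source terms + indoor term + background term."""
    if len(prox_by_source) != len(amb_by_source):
        raise ValueError("need one PROX value and one ambient value per source category")
    # Sum over source categories of PROX x PEN x AMB
    outdoor = sum(prox * pen * amb for prox, amb in zip(prox_by_source, amb_by_source))
    # Background contribution: PROX x PEN x (uniform + spatially variable component)
    background = prox_background * pen * (bckgd_u + bckgd_v)
    # ADD is itself a concentration, so it is added directly
    return outdoor + add + background
```

For instance, with two source categories, PROX values of 1.0 and 2.0, PEN of 0.5, ambient values of 0.2 and 0.1, no indoor sources, and a total background of 0.04 with a background PROX of 1.0, the result is 0.1 + 0.1 + 0.02 = 0.22 in the units of the air quality file.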

The penetration factor, PEN, is an estimate of the ratio of the ME-concentration contribution
(from a given emission-source category) to the concurrent outdoor-concentration contribution in
the immediate vicinity of the ME. That is,

$$\mathrm{PEN} = \frac{\text{indoor or in-vehicle ME concentration}}{\text{outdoor concentration in immediate vicinity of indoor or in-vehicle ME}}$$

The proximity factor, PROX, is an estimate of the ratio of the outdoor concentration in the
immediate vicinity of the ME (or in the ME for outdoor MEs) to the outdoor concentration
represented by the air-quality data. That is,

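For indoor or in-vehicle MEs:

$$\mathrm{PROX} = \frac{\text{outdoor concentration in immediate vicinity of indoor or in-vehicle ME}}{\text{concentration from air quality file}}$$

For outdoor MEs:

$$\mathrm{PROX} = \frac{\text{outdoor ME concentration}}{\text{concentration from air quality file}}$$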

Air-quality data used in the model typically represent a spatial average over the census tract.
For most MEs, the default factors file specifies a PROX value of 1.0 (i.e., an outdoor-
concentration contribution in the immediate vicinity of the ME equal to the spatial-average
contribution over the census tract). However, when assessing exposure to motor-vehicle
emissions for MEs near roadways (e.g., in-vehicle, indoor MEs situated near roadways), the
HAP-concentration contribution in the immediate vicinity of the ME is expected to be higher than
the spatial-average HAP-concentration contribution of the census tract (i.e., PROX is expected
to be greater than 1.0). This is because the concentration gradient near roadways tends to be
relatively steep. This condition for onroad-mobile emissions is reflected in the default mobiles
file, which contains PROX distributions and LAG factors for onroad-mobile emissions.

ADD is an additive factor that accounts for emission sources within or near to an ME (i.e.,
indoor-emission sources). Unlike the other two factors, the ADD factor is itself a concentration
and therefore has units of mass/volume. The actual units used must be the same as those in
the air quality file.10

LAG is used to account for the possibility of very slow HAP diffusion and penetration, so that the
relevant air-quality concentration value may be from the previous time block. A value of zero for
LAG indicates no time lag (i.e., use the concurrent air-quality value); otherwise, the previous
time-block value is used. Due to lack of sufficient data to make estimates for LAG, the default
file contains a uniform value of zero for all MEs.
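
The LAG rule can be expressed as a small index calculation, sketched below in Python. Time blocks are numbered from 0 here, and the choice to wrap from the first block of the day to the last block is an assumption of this sketch; the text above does not state how the first block is handled.

```python
def ambient_block_index(t: int, lag: int, nblock: int) -> int:
    """Return the time block whose air-quality value is used for block t.

    lag == 0 uses the concurrent block; any other value uses the previous
    block, wrapping around to the last block of the day (assumed).
    """
    if not 0 <= t < nblock:
        raise ValueError("t must be in the range 0..nblock-1")
    if lag == 0:
        return t
    return (t - 1) % nblock
```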

10 A database of distributions of indoor-source-concentration contributions for several indoor-source categories and
subcategories is currently under development. The current version of the HAPEM program contains untested
algorithms to utilize the developing database. Therefore, it is currently recommended that indoor sources be
omitted from HAPEM applications until the database and algorithms have been tested and reviewed. To disable the
indoor-source algorithms, set keyword CAS to 99999.


The factors and mobiles files have no header records. The factors file contains a set of records
(one for each ME) for each outdoor-source category being modeled (the number identified as
nsource), in the same order as the source categories are specified in the air quality file. The
mobiles file contains a set of records (one for each ME) for the onroad-mobile source category
identified with nmobiles and for each distance-from-road category.11 The MEs must be the
same number, definition, and order as the MEs in the activity file. The files are read in free
format, once for each ME, with fields as specified in Table 3-3a and Table 3-3b. All values are
decimal numbers.

Table 3-3a.

Format for the factors file

ME Factor	Field Num.	Parameter
(N/A)	1	Number of ME (1-18)
PEN	2	Indicate distribution type: 1 - Normal; 2 - Lognormal; 3 - Uniform; 4 - Triangular; 5 - Dataset
PEN	3	Normal: mean; Lognormal: mean; Uniform: minimum; Triangular: minimum; Dataset: number of data points
PEN	4	Normal: standard deviation; Lognormal: standard deviation; Uniform: maximum; Triangular: maximum; Dataset: first data point in the set
PEN	5	Normal: 0 (always); Lognormal: 0 (always); Triangular: mode; Dataset: second data point in the set
PEN	6	Normal: lower bound (optional); Lognormal: lower bound (optional); Dataset: third data point in the set
PEN	7	Normal: upper bound (optional); Lognormal: upper bound (optional); Dataset: fourth data point in the set
PEN	8	Dataset: fifth data point in the set
PEN	9	Dataset: sixth data point in the set
PEN	10	Dataset: seventh data point in the set
PEN	11	Dataset: eighth data point in the set
PEN	12	Dataset: ninth data point in the set
PEN	13	Dataset: tenth data point in the set
ADD	14-25	Repeat fields 2-13 for additive factor
PROX, Source 1	26-37	Repeat fields 2-13 for proximity factor
PROX, Source 2	39-50	Repeat fields 2-13 for proximity factor
PROX, Source 3	52-63	Repeat fields 2-13 for proximity factor
PROX, Source 4	65-76	Repeat fields 2-13 for proximity factor
LAG, Source 1	38	0 (no lag) or 1 (lag of 1 time block)
LAG, Source 2	51	0 (no lag) or 1 (lag of 1 time block)
LAG, Source 3	64	0 (no lag) or 1 (lag of 1 time block)
LAG, Source 4	77	0 (no lag) or 1 (lag of 1 time block)

11 Note that a PROX-factor distribution is specified in the factors file for the onroad-mobile source category as a
place-holder and such values should be set to 1. The PROX-factor distributions in the mobiles file are then
multiplied by the distributions from the factors file.
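
The field numbering in Table 3-3a can be illustrated with a short Python sketch that splits the values read for one ME into the PEN, ADD, PROX, and LAG portions. The function name and the dictionary keys are hypothetical; the slicing simply follows the field numbers in the table (12 fields each for PEN and ADD, then 12 PROX fields plus 1 LAG field per source category).

```python
from typing import Dict, List, Sequence


def split_factors_record(fields: Sequence[float], nsource: int = 4) -> Dict[str, object]:
    """Split one ME's factors-file values according to Table 3-3a."""
    expected = 1 + 12 + 12 + 13 * nsource  # 77 when nsource is 4
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, found {len(fields)}")
    prox: List[List[float]] = []
    lag: List[int] = []
    pos = 25  # zero-based index of field 26
    for _ in range(nsource):
        prox.append(list(fields[pos:pos + 12]))
        lag.append(int(fields[pos + 12]))
        pos += 13
    return {
        "me": int(fields[0]),      # field 1
        "pen": list(fields[1:13]),   # fields 2-13
        "add": list(fields[13:25]),  # fields 14-25
        "prox": prox,              # fields 26-37, 39-50, 52-63, 65-76
        "lag": lag,                # fields 38, 51, 64, 77
    }
```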

Table 3-3b.

Format for the mobiles file (one onroad-mobile source category)

ME Factor	Field Num.	Parameter
(N/A)	1	Number of ME (1-18)
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	2	Indicate distribution type: 1 - Normal; 2 - Lognormal; 3 - Uniform; 4 - Triangular; 5 - Dataset
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	3	Normal: mean; Lognormal: mean; Uniform: minimum; Triangular: minimum; Dataset: number of data points
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	4	Normal: standard deviation; Lognormal: standard deviation; Uniform: maximum; Triangular: maximum; Dataset: first data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	5	Normal: 0 (always); Lognormal: 0 (always); Triangular: mode; Dataset: second data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	6	Normal: lower bound (optional); Lognormal: lower bound (optional); Dataset: third data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	7	Normal: upper bound (optional); Lognormal: upper bound (optional); Dataset: fourth data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	8	Dataset: fifth data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	9	Dataset: sixth data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	10	Dataset: seventh data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	11	Dataset: eighth data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	12	Dataset: ninth data point in the set
PROX for Onroad-mobile Source Category: Distance-from-source Category 1	13	Dataset: tenth data point in the set
LAG for Onroad-mobile Source Category: Distance-from-source Category 1	14	0 (no lag) or 1 (lag of 1 time block)
Distance-from-source Category 2	15-27	Repeat fields 2-14
Distance-from-source Category 3	28-40	Repeat fields 2-14

The fields in the factors file include PROX distributions (one per ME and source category), PEN
distributions (one per ME), ADD distributions (one per ME), and LAG factors (one per ME; LAG
factors are either 0 or 1).

The fields in the mobiles file include distributions of PROX and LAG factors for the onroad-mobile
source category identified with nmobiles. Distributions of PROX factors in the mobiles file are
stratified for each of three distance-from-source categories: 0-75 meters, 75-200 meters, and
beyond 200 meters. This information is combined with the data in the distance-to-road file in
the HAPEM program to determine from which probability distribution the PROX factor should be
selected for a given tract/ME combination (see Section 5.2.5 [HAPEM] for more details). The
distributions in the mobiles file override those in the factors file for the onroad-mobile source
category identified with nmobiles.
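
As a rough illustration of how distance-to-road fractions could be used to choose among the three distance-from-source categories, the Python sketch below draws a category index in proportion to a triple of fractions. This is an assumption-laden sketch: the function name is hypothetical, and the actual selection logic, including which fractions (tract area or demographic group) apply to which ME, is documented in Section 5.2.5 rather than here.

```python
import random
from typing import Optional, Sequence


def pick_distance_category(fractions: Sequence[float], u: Optional[float] = None) -> int:
    """Return 0, 1, or 2 (0-75 m, 75-200 m, >200 m) with probability given by fractions.

    u is a uniform random number in [0, 1); it is drawn here if not supplied.
    """
    if u is None:
        u = random.random()
    cumulative = 0.0
    for index, fraction in enumerate(fractions):
        cumulative += fraction
        if u < cumulative:
            return index
    # Guard against fractions that sum to slightly less than 1 after rounding
    return len(fractions) - 1
```

The returned index would then identify which of the three PROX distributions in the mobiles file to sample for that replicate.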

Distributions can take any of five different forms: normal, lognormal, uniform, triangular, or
dataset. The dataset is composed of up to 10 values, each of which is selected with equal
probability. The parameters that need to be specified for each type of distribution are listed
below.

Distribution types used in the factors and mobiles files

Normal	arithmetic mean, arithmetic standard deviation, lower bound (optional), upper
bound (optional) [Note: If both the lower and upper bounds are set to 0.0, then
the distribution is sampled as if unbounded]

Lognormal	geometric mean, geometric standard deviation, lower bound (optional), upper
bound (optional) [Note: If both the lower and upper bounds are set to 0.0, then
the distribution is sampled as if unbounded]

Uniform	minimum, maximum

Triangular	minimum, maximum, mode

Dataset	number of data values in the set (1-10), each value
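
The five distribution types can be sampled along the following lines. This Python sketch takes the 12 values that define one distribution (the distribution type followed by its 11 parameter fields, in the order of Table 3-3a) and returns one random draw. The function name is hypothetical, and enforcing the bounds by redrawing until a value falls inside them is an assumption; the guide does not say how HAPEM applies the bounds.

```python
import math
import random
from typing import Sequence


def sample_factor(spec: Sequence[float], rng=random) -> float:
    """Draw one value from a distribution given as [type, p1, p2, ..., p11]."""
    kind = int(spec[0])
    p = spec[1:]
    if kind in (1, 2):
        center, spread, lower, upper = p[0], p[1], p[3], p[4]
        # Both bounds equal to 0.0 means the distribution is unbounded
        bounded = not (lower == 0.0 and upper == 0.0)
        for _ in range(10000):
            if kind == 1:
                x = rng.gauss(center, spread)  # arithmetic mean and SD
            else:
                # geometric mean and geometric SD
                x = rng.lognormvariate(math.log(center), math.log(spread))
            if not bounded or lower <= x <= upper:
                return x
        raise RuntimeError("bounds are too narrow for rejection sampling")
    if kind == 3:
        return rng.uniform(p[0], p[1])  # minimum, maximum
    if kind == 4:
        return rng.triangular(p[0], p[1], p[2])  # minimum, maximum, mode
    if kind == 5:
        n = int(p[0])  # number of data values (1-10)
        return rng.choice(list(p[1:1 + n]))  # each value equally likely
    raise ValueError(f"unknown distribution type {kind}")
```

Passing a seeded `random.Random` instance as `rng` makes the draws reproducible, which can help when comparing the effect of alternative factor distributions.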

For HAPEM, default factors files are provided for each of three phases of HAPs: gaseous,
particulate, and HAPs that might be either phase depending on various conditions. Default
mobiles files are provided for benzene, 1,3-butadiene, diesel PM, and non-specific HAPs
(formatted for a single onroad-mobile source category). As noted above, because a new
approach to evaluating indoor sources is in development, the default ADD factors are uniformly
set to zero. Due to lack of data, default LAG factors are uniformly set to zero. Excerpts from the
default factors and mobiles files for gaseous HAPs and non-specific HAPs, respectively, are
presented below. See Appendix A for more details on the current HAPEM default input files.

As with other model input files, the user can add comments or other information after the last
data record in the file. In this case a blank line need not be inserted after the last data record
before the comments.

Extract from a default factors file

(in "wrapped" view)

1 5 3 0.8 0.8 1 0 0 0 0 0 0
0 5 1 0 0 0 0 0 0 0 0 0
0 5 1 1 0 0 0 0 0 0 0 0
0 0 5 1 1 0 0 0 0 0 0 0
0 0 0 5 1 1 0 0 0 0 0 0
0 0 0 0 5 1 1 0 0 0 0 0
0 0 0 0 0

2 5 5 0.33 0.67 0.71 1 1 0 0 0 0 0
5 1 0 0 0 0 0 0 0 0 0 0
5 1 1 0 0 0 0 0 0 0 0 0
0 5 1 1 0 0 0 0 0 0 0 0
0 0 5 1 1 0 0 0 0 0 0 0
0 0 0 5 1 1 0 0 0 0 0 0
0 0 0 0

3 5 5 0.33 0.67 0.71 1 1 0 0 0 0 0
5 1 0 0 0 0 0 0 0 0 0 0
5 1 1 0 0 0 0 0 0 0 0 0
0 5 1 1 0 0 0 0 0 0 0 0
0 0 5 1 1 0 0 0 0 0 0 0
0 0 0 5 1 1 0 0 0 0 0 0
0 0 0 0

Extract from a default mobiles file

(in "wrapped" view)

1 2 2.477 2.0477 0 1 8.0532 0 0 0 0 0 0 0
2 1.6113 1.9292 0 1 4.7492 0 0 0 0 0 0 0
5 1 1 0 0 0 0 0 0 0 0 0 0

2 2 2.477 2.0477 0 1 8.0532 0 0 0 0 0 0 0
2 1.6113 1.9292 0 1 4.7492 0 0 0 0 0 0 0
5 1 1 0 0 0 0 0 0 0 0 0 0

3 2 2.477 2.0477 0 1 8.0532 0 0 0 0 0 0 0
2 1.6113 1.9292 0 1 4.7492 0 0 0 0 0 0 0
5 1 1 0 0 0 0 0 0 0 0 0 0



3.11. Cluster-transition File

The cluster-transition file specifies, for each combination of demographic group, day type, and
commuting status, the number of activity patterns in each of 1-3 clusters (derived from cluster
analysis on the activity-pattern data from CHAD) and the cluster-to-cluster transition
probabilities (derived from the transition frequencies for multiple-day activity-pattern records
from CHAD). These values are used to create weights for averaging selected activity patterns,
one from each cluster, to represent an individual within the group for that day type.

The cluster-transition file begins with a text header record, followed by one data record for each
combination of demographic group, day type, and commuting status. The header record indicates
the order of the variables in each of the data records. Although the header record of the
cluster-transition file is not used by the model programs, it provides documentation to inform the
user of the meaning of the data fields. The header record of the current default cluster-transition
file is as follows.

Header record from the default cluster-transition file

g dt ct nclus clust1 clust2 clust3 prob11 prob12 prob13 prob21 prob22 prob23
prob31 prob32 prob33

The cluster-transition file is read in free format with the variables defined in Table 3-4 for each
combination of demographic group, day type, and commuting status.

Table 3-4.

Variables in the cluster-transition file

Variable	Description
g	demographic group
dt	day type
ct	commuting status of subject
nclus	number of clusters for the group/day type (1-3)
clust1	cumulative fraction of group/day type in cluster #1
clust2	cumulative fraction of group/day type in clusters #1-2
clust3	cumulative fraction of group/day type in clusters #1-3
prob11	cumulative transition probability from cluster #1 to #1
prob12	cumulative transition probability from cluster #1 to clusters #1-2
prob13	cumulative transition probability from cluster #1 to clusters #1-3
prob21	cumulative transition probability from cluster #2 to #1
prob22	cumulative transition probability from cluster #2 to clusters #1-2
prob23	cumulative transition probability from cluster #2 to clusters #1-3
prob31	cumulative transition probability from cluster #3 to #1
prob32	cumulative transition probability from cluster #3 to clusters #1-2
prob33	cumulative transition probability from cluster #3 to clusters #1-3

An extract from the current default cluster-transition file is presented below. See Appendix A for
more details on the current HAPEM default input files.

Extract from a default cluster-transition file

(in "wrapped" view)

g dt ct nclus clust1 clust2 clust3 prob11 prob12 prob13 prob21 prob22
prob23 prob31 prob32 prob33
1 1 1 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 1 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 2 1 3 0.48625 0.88567 1.00000 0.60714 1.00000 1.00000 0.13793 0.86207
1.00000 0.11111 0.33333 1.00000
1 2 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 3 1 3 0.54806 0.88666 1.00000 0.90000 1.00000 1.00000 0.37500 0.62500
1.00000 0.50000 0.75000 1.00000
1 3 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
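
Because the cluster fractions and transition probabilities are stored as cumulative values, a cluster can be selected by comparing a single uniform random number against them. The Python sketch below parses one data record and performs that lookup. The function names and the treatment of a draw exactly equal to a cumulative value are assumptions for illustration; they are not taken from the HAPEM code.

```python
import bisect
from typing import Dict, Sequence


def parse_cluster_record(line: str) -> Dict[str, object]:
    """Parse one data record: g, dt, ct, nclus, clust1-3, prob11-prob33."""
    fields = line.split()
    g, dt, ct, nclus = (int(x) for x in fields[:4])
    values = [float(x) for x in fields[4:]]
    if len(values) != 12:
        raise ValueError("expected 3 cluster fractions and 9 transition probabilities")
    return {
        "g": g, "dt": dt, "ct": ct, "nclus": nclus,
        "clust": values[0:3],
        "prob": [values[3:6], values[6:9], values[9:12]],  # one row per current cluster
    }


def pick_cluster(cumulative: Sequence[float], nclus: int, u: float) -> int:
    """Return the cluster number (1-based) selected by uniform draw u in [0, 1)."""
    index = bisect.bisect_left(list(cumulative[:nclus]), u)
    return min(index, nclus - 1) + 1
```

An initial cluster would be drawn with the clust1-clust3 values, and each subsequent cluster with the row of transition probabilities for the current cluster.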

3.12. Statefip File

The statefip file cross-references the two-character FIPS code for each U.S. state (and District
of Columbia, Puerto Rico, and the U.S. Virgin Islands, totaling 53 areas) to its numerical ranking
on the list. The format of each record is as follows.

Variables in the statefip file

Numerical Rank	(integer)

State (or district or territory) FIPS code (2-character string)

As discussed in Section 2.1.7 (The Statefip File), the statefip file is used in conjunction with
region1 and region2 specified in all the parameter files to specify the areas to be included in
the analysis, according to numerical ranking.

Default statefip file and corresponding state names

1	01	Alabama
2	02	Alaska
3	04	Arizona
4	05	Arkansas
5	06	California
6	08	Colorado
7	09	Connecticut
8	10	Delaware
9	11	District Of Columbia
10	12	Florida
11	13	Georgia
12	15	Hawaii
13	16	Idaho
14	17	Illinois
15	18	Indiana
16	19	Iowa
17	20	Kansas
18	21	Kentucky
19	22	Louisiana
20	23	Maine
21	24	Maryland
22	25	Massachusetts
23	26	Michigan
24	27	Minnesota
25	28	Mississippi
26	29	Missouri
27	30	Montana
28	31	Nebraska
29	32	Nevada
30	33	New Hampshire
31	34	New Jersey
32	35	New Mexico
33	36	New York
34	37	North Carolina
35	38	North Dakota
36	39	Ohio
37	40	Oklahoma
38	41	Oregon
39	42	Pennsylvania
40	44	Rhode Island
41	45	South Carolina
42	46	South Dakota
43	47	Tennessee
44	48	Texas
45	49	Utah
46	50	Vermont
47	51	Virginia
48	53	Washington
49	54	West Virginia
50	55	Wisconsin
51	56	Wyoming
52	72	Puerto Rico
53	78	U.S. Virgin Islands
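
The mapping from numerical rank to FIPS code can be sketched in Python as a simple list lookup. The helper name is hypothetical, and treating the two region values as an inclusive range of ranks is an assumption for this sketch; see Section 2.1.7 for how the parameter files actually use them.

```python
from typing import List

# State, district, and territory FIPS codes in rank order (rank 1 is index 0)
STATE_FIPS: List[str] = [
    "01", "02", "04", "05", "06", "08", "09", "10", "11", "12",
    "13", "15", "16", "17", "18", "19", "20", "21", "22", "23",
    "24", "25", "26", "27", "28", "29", "30", "31", "32", "33",
    "34", "35", "36", "37", "38", "39", "40", "41", "42", "44",
    "45", "46", "47", "48", "49", "50", "51", "53", "54", "55",
    "56", "72", "78",
]


def fips_in_range(first_rank: int, last_rank: int) -> List[str]:
    """Return the FIPS codes for ranks first_rank..last_rank (inclusive, 1-based)."""
    if not 1 <= first_rank <= last_rank <= len(STATE_FIPS):
        raise ValueError("ranks must satisfy 1 <= first_rank <= last_rank <= 53")
    return STATE_FIPS[first_rank - 1:last_rank]
```

For example, ranks 34 through 34 would select North Carolina only, while ranks 1 through 53 would select every area in the file.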


4. HAPEM Output Files

HAPEM creates three diagnostic output files and a set of final exposure output files. The
diagnostic files record error messages and information about the parameters of the simulations.
The names for these three files are specified by the user in the parameter files of each model
program. The final exposure output files contain all the exposure estimates from a model run.
The pathnames for these files are specified by the user in the parameter file for the AIRQUAL
and HAPEM programs.

4.1. Log File

The log file contains a record of a model analysis. Three of the model programs (INDEXPOP,
COMMUTE, and HAPEM) will append records onto an existing log file, as specified in their
parameter files, without overwriting previous records. The DURAV and AIRQUAL programs will
overwrite any records on an existing log file. Therefore, if a single log filename is used to run all
the model programs, a running record will be written for the DURAV, INDEXPOP, and
COMMUTE programs, but then the AIRQUAL program will erase those records and begin a
new record, and the HAPEM program will add to it. To maintain a complete log file record of a
HAPEM simulation, the two alternatives below can be used.

•	If a single parameter file is used for a complete simulation, so that the log filename is the
same for all five programs, manually rename the log file created by the first three
programs before AIRQUAL is run.

•	Use a different parameter file for running AIRQUAL and HAPEM than for the other
programs, with a different name specified for the log file.

If the model programs experience no fatal errors during a simulation, there are several items
written to the log file by each of the programs. The first record written to the file by each
program identifies the program and its start time. The start time consists of three numbers—the
current time, the size of the time increment equivalent to one second, and the maximum value
allowed for the current time before it is reset to zero. All three of these quantities are system-
dependent. An example record of this type is presented below.

Example log record: program and start time

DURAV Start time= 34862630 1000 86399999

The last two records written to the log by each model program report the ending time and the
total job time for the particular program. For the total-job-time record, the job time is converted
into seconds. Note that the total job time will not be correct if the clock maximum is exceeded
during the job. An example of these types of records is presented below.


Example log records: program, stop time, and run time

DURAV End time = 34880980
DURAV Job time = 18.3500004
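
The relationship between the logged clock values and the job time can be sketched in Python as follows. The function name is hypothetical. The correction for a single clock reset is an addition made for this sketch only; as noted above, the job time written by the model programs is not corrected when the clock maximum is exceeded.

```python
def job_time_seconds(start: int, end: int, ticks_per_second: int, clock_max: int) -> float:
    """Convert logged start and end clock values to elapsed seconds.

    If the end value is smaller than the start value, one clock reset
    is assumed to have occurred between them.
    """
    elapsed = end - start
    if elapsed < 0:
        elapsed += clock_max + 1
    return elapsed / ticks_per_second
```

With the example values, (34880980 - 34862630) / 1000 gives 18.35 seconds, which agrees with the logged job time to within single-precision rounding.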

If an error occurs that HAPEM considers to be fatal, a diagnostic message will be written to the
log file and the program will stop. For example, if DURAV finds that the number of time blocks
per day specified in the activity-file header does not match the value of nblock specified in the
parameter file, it will write a message to the log file and stop. An example of this type of record
is presented below.

Example log record: error message

number of time blocks in activity file does not equal nblock 999

4.1.1. DURAV Output to the Log File

Apart from the text produced by all model programs, each program writes some specialized
information to the log file. The DURAV program writes the names of the input activity and cluster
files and the output file (the averaged activity database). An example of these types of records
is presented below.

Example log records: input and intermediate files

CHAD data from file=input/activity pattern/durhw_HAPEM8.txt
Clustering from file=input/activity pattern/cluster_HAPEM8.txt
Output data on file=input/activity pattern/durhw_HAPEM8.da

The DURAV program also records the number of records (person-days) extracted from the
activity file, and it produces a table of frequency counts for each combination of demographic
group, commute status, and day type (a matrix whose elements should sum to the total number
of records extracted). If any elements of this matrix are zero then there are groups that have no
activity patterns and thus are undefined. If the numbers are positive but small (e.g., less than
ten), then there is a chance that the exposure results might not be representative for the group.
An example of a part of this type of matrix is presented below.

Example log records: file matrix

Frequency table for diaries (non-commute)
By demographic group (rows) & day type (cols)
  198   691   697
 1275  2126  1968
 2682  7664  7171
  469  2095  2465
 7969 16967 41348
 4404  9678 11555
Frequency table for diaries (commute)
By demographic group (rows) & day type (cols)
    0     0     0
    0     0     0
    0     0     0
  146   382   534
11360 28136 13653
  747  1560   681
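The checks described above (the cell counts should sum to the number of extracted records, zero cells mark undefined groups, and small cells may be unrepresentative) can be sketched in Python as follows. The function name and the threshold argument are hypothetical; the counts are those of the example matrix, and the total of 178,621 is the activity-record count shown in the example counter file in Section 4.2.

```python
# Example frequency tables: rows are demographic groups, columns are day types.
non_commute = [
    [198, 691, 697],
    [1275, 2126, 1968],
    [2682, 7664, 7171],
    [469, 2095, 2465],
    [7969, 16967, 41348],
    [4404, 9678, 11555],
]
commute = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 0, 0],
    [146, 382, 534],
    [11360, 28136, 13653],
    [747, 1560, 681],
]


def check_frequency_tables(tables, total_records, small=10):
    """Return (sums_match, empty_cells, sparse_cells) for the diary counts."""
    counted = sum(n for table in tables.values() for row in table for n in row)
    empty, sparse = [], []
    for status, table in tables.items():
        for group, row in enumerate(table, start=1):
            for day_type, n in enumerate(row, start=1):
                if n == 0:
                    empty.append((status, group, day_type))
                elif n < small:
                    sparse.append((status, group, day_type))
    return counted == total_records, empty, sparse


ok, empty, sparse = check_frequency_tables(
    {"non-commute": non_commute, "commute": commute}, total_records=178621
)
print(ok, len(empty), len(sparse))
```

In this example the counts sum to the expected total, the only empty cells are the commuting cells of the first three demographic groups, and no populated cell falls below the threshold of ten.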

4.1.2. INDEXPOP Output to the Log File

In addition to the program name and the start-, stop-, and job-time information provided to the
log file by all the model programs, the INDEXPOP program writes two other records to the log
file. The first confirms that all the input files were successfully opened, and the second records
the total number of tract records in the population file. An example of these two records is
presented below.

Example log records: opened files and tract counts

Finished opening files
total number of tracts is 85427

4.1.3.	COMMUTE Output to the Log File

The COMMUTE program writes no information to the log file other than the program name and
the start, stop, and job time.

4.1.4.	AIRQUAL Output to the Log File

In addition to the program name and the start-, stop-, and job-time information provided to the
log file by all the model programs, the AIRQUAL program writes several other records to the log
file. First, a summary of the air quality file is written to document the number of census tracts
and distinct counties found in the file. These tracts are then paired with the tracts found in the
population file. The number of tracts found in the air quality file but not in the population file is
recorded in the line containing the phrase "unpaired air tracts". This is followed by the list (if any)
of unpaired tracts. Then, the tracts in the population file are compared to the tracts in the air
quality file—the number of tracts in the population file but not in the air quality file is reported,
along with the number of matching tracts as well as the number of population tracts with multiple
air quality tracts. Next, similar statistics are given comparing tracts within the counties in the air
quality file to those in the population file. Any tract in the population file but not the air quality file
will not be modeled; similarly, any tract in the air quality file but not in the population file will not
be modeled. An example of the log output produced by the AIRQUAL program is presented
below.

Example log records: AIRQUAL statistics

# air tracts = 84810 # of air records = 84810
# counties on air file = 3224
There were 0 unpaired air tracts.
Overall, there were:
617 unpaired census tracts.
84810 census tracts with a matching air tract.
0 census tracts with 2 or more air tracts.
Within the counties on the air file, ther were:
617 unpaired census tracts.
84810 census tracts with a matching air tract.
0 census tracts with 2 or more air tracts.
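The pairing statistics can be illustrated with a short Python sketch. It is not the AIRQUAL algorithm itself: the function name and the tract identifiers are hypothetical, and it assumes that "matching" means a census tract with at least one air quality record, with tracts having two or more records counted as a subset of those.

```python
from collections import Counter


def pair_tracts(air_tracts, pop_tracts):
    """Summarize how air quality tracts pair with population (census) tracts."""
    air_counts = Counter(air_tracts)  # number of air records per tract
    pop_set = set(pop_tracts)
    return {
        "unpaired_air": sorted(t for t in air_counts if t not in pop_set),
        "unpaired_census": sorted(t for t in pop_set if t not in air_counts),
        "matched": sum(1 for t in pop_set if air_counts[t] >= 1),
        "multiple": sum(1 for t in pop_set if air_counts[t] >= 2),
    }


# Hypothetical 11-digit tract identifiers (state + county + tract FIPS).
air = ["01001020100", "01001020200", "01001020200", "99999000000"]
pop = ["01001020100", "01001020200", "01001020300"]
summary = pair_tracts(air, pop)
print(summary)
```

In the example log above, the matched and unpaired census tracts (84,810 and 617) add up to the 85,427 tracts in the population file.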

4.1.5. HAPEM Output to the Log File

In addition to the program name and the start-, stop-, and job-time information provided to the
log file by all the model programs, the HAPEM program writes two other records to the log file. It
reports the time when dynamic array allocation is complete and the number of tracts used in the
analysis (i.e., that had data in the air quality, population, and commuting files). An example of
the log output produced by the HAPEM program is presented below.

Example log record: HAPEM array allocation and tracts

HAPEM Allocation = 1317787200
There were 85427 tracts in the study area.

4.2. Counter File

A second diagnostic file created by HAPEM is the counter file. The counter file records the
number of records in various data-input and -output files, which can also be a useful tool for
troubleshooting and keeping track of which files were used in the simulation.

It is important to use the same counter file for all the model programs in a simulation because the programs
use some of the information recorded by previous programs for dynamic memory allocation of
arrays. If the expected records from previous programs are not in the counter file, an error will
occur.

The model programs add records to the counter file by appending to the end of the records
generated by the previous programs, where programs are run in the expected order as
described in Section 2.1 (Model Structure; though running the COMMUTE program is optional).
For example, the INDEXPOP program reads records generated by the DURAV program, and
then it begins its own recording. If the INDEXPOP program is run a second time using the same
counter file, the second run will overwrite the previous records generated by the INDEXPOP
program.

The specific information recorded in the counter file is provided in Table 4-1. An example
counter file is also shown below.

Table 4-1. Variables in the counter file

HAPEM Program   Record Number   Description
DURAV           1               - number of data records (person-days) in the activity file
                                - number of activity-file data records (person-days) with 1,440 total minutes
INDEXPOP        2               - number of data records (tracts) in the population file
                                - number of counties in the population file
                3               - number of data records in the population index file (e.g., population_HAPEM8_direct.ind)
COMMUTE         4               - number of data records (home-tract/work-tract pairs) in the commuting file
                                - number of data records in the work-tract file (e.g., commute_flow_HAPEM8.da)
                                - number of records in the commuting index file (e.g., commute_flow_HAPEM8.ind)
AIRQUAL         5               - number of matching tracts in the air quality (e.g., benzene.txt) and
                                  population index (e.g., population_HAPEM8.ind) files
                                - number of counties with matching tracts in the air quality (e.g., benzene.txt) and
                                  population index (e.g., population_HAPEM8.ind) files
                6               - number of tracts in the air quality file (e.g., benzene.txt)
                                - number of data records in the air quality file (e.g., benzene.txt)

Example counter file

178621 178621
85427 3224
85427
6004343 5831073 85427
84810 3224
84810 84810

The relationships listed below are expected among the numbers in the counter file.

•	The number of records in the population file, the population index file (e.g.,
population_HAPEM8.ind), and the commuting index file (e.g.,
commute_flow_HAPEM8.ind) should all be the same.

•	The number of records in the work-tract file (e.g., commute_flow_HAPEM8.da) may be
larger or smaller than the number of records in the commuting file. It may be larger
because the COMMUTE program will create a "commuting" flow for a tract that is in the
population file but is not a home tract in the commuting file (using the population tract as
both the home and work tract). It may also be smaller if the study area in the population
file is smaller than the study area in the commuting file (the default commuting file is all
U.S. states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands).
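The expected relationships can be checked with a few lines of Python. This is a sketch only: the dictionary keys and function names are hypothetical, and the assignment of the example counter-file values to those keys is assumed from the descriptions in Table 4-1.

```python
def check_counter(counts):
    """Return a list of problems found among the counter-file values."""
    problems = []
    names = ("population", "population_index", "commuting_index")
    sizes = {name: counts[name] for name in names}
    if len(set(sizes.values())) != 1:
        problems.append(f"record counts differ: {sizes}")
    return problems


def work_tract_note(counts):
    """Describe how the work-tract file compares with the commuting file."""
    diff = counts["work_tract"] - counts["commuting"]
    if diff > 0:
        return "larger: flows were added for population tracts with no home-tract record"
    if diff < 0:
        return "smaller: the population study area is narrower than the commuting file"
    return "same size"


# Values taken from the example counter file (field assignment assumed).
example = {
    "population": 85427,
    "population_index": 85427,
    "commuting_index": 85427,
    "commuting": 6004343,
    "work_tract": 5831073,
}
print(check_counter(example))
print(work_tract_note(example))
```

Both effects described above can occur in the same run, so the sign of the difference indicates only which effect dominates.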

4.3. Mistract File

A third diagnostic file created by the COMMUTE, AIRQUAL, and HAPEM programs is the
mistract file. If the same mistract filename is used for the COMMUTE and AIRQUAL programs,
the COMMUTE program's information will be overwritten by that of the AIRQUAL program. The
HAPEM program will then append records onto an existing mistract file. To maintain a complete
record of this information for a HAPEM simulation, either different mistract filenames should be
used for the COMMUTE and AIRQUAL programs (requiring different parameter files), or the
mistract file should be manually renamed after the COMMUTE program is run.

Each of the three programs records a different set of information about the consistency of
census tracts included in the input files, as detailed in the list below. Below the list are example
excerpts from each program's mistract file.

•	The COMMUTE program's mistract file records the state, county, and tract FIPS codes
of each tract in the population file that is not matched by a home tract in the commuting
file. These unmatched tracts are still processed by the COMMUTE program, as
explained in the previous section, by creating a "commuting" flow using the population
tract as both the home and work tract.

•	The AIRQUAL program's mistract file records the record number and the state, county,
and tract FIPS codes of each tract in the population file that is not matched by a tract in
the air quality file. Only tracts that are included in both files are processed by
HAPEM, since both of these pieces of information about a tract (population and air quality)
are needed to make an exposure estimate.

•	The HAPEM program's mistract file records the state, county, and tract FIPS codes of
each home tract in the commuting file that is not matched by a tract in the air-quality
index files. These air-quality index files contain information on tracts that were included
in both the population and air quality files. The unmatched home tracts are not
processed further. The HAPEM program's mistract file also records each instance of a
work tract that is not matched by a tract in the air quality file; for these cases, the work
tract is assigned the air-quality values of the home tract.

Example excerpt from the COMMUTE program's mistract file

MISSING TRACTS OF COMMUTE & AIRQUAL in COMMUTE
44 2 1 01003990000 0
109 8 1 01015981903 0
139 13 1 01025957601 0
[etc.]

Example excerpt from the AIRQUAL program's mistract file

MISSING TRACTS for AIRQUAL & POPULATION DATA in AIRQUAL
2375 04013980500
5785 06037320000
7049 06037980001
[etc.]

Example excerpt from the HAPEM program's mistract file

MISSING TRACTS of AIRQUAL & COMMUTE IN HAPEM
airtract match with worktract not found
home 3 2375 04013980500 0
airtract match with worktract not found
[etc.]

4.4. Final Exposure File

As explained in Section 2.1.9 (Exposure Output Files), HAPEM creates an exposure output file
for each combination of state and HAP. The names of these files are constructed by the model
based on the HAP SAROAD code (specified by sarod in the parameter file) and the state FIPS
code (as SAROAD.FIPS.dat).

The final exposure output files each begin with a repetition of some of the information specified
in the parameter file for the AIRQUAL and HAPEM programs, as listed below.

Information at the top of the final exposure output file

State FIPS code
HAP SAROAD code                                                  (sarod)
HAP name                                                         (pollutant)
HAP CAS number                                                   (CAS)
Air quality data units                                           (units)
Year of air quality data                                         (year)
Number of outdoor air emission source categories                 (nsource)
Random number seed for activity pattern selection                (Rseed1)
Random number seed for ME factors selection                      (Rseed2)
Random number seed for air quality data selection                (Rseed3)
Number of indoor product emission source types                   (a)
Number of indoor material emission source types                  (a)
Number of indoor combustion emission source types                (a)
Number of vehicle in residential garage emission source types    (a)
EPA Region of indoor emission source data                        (a)
Number of demographic groups                                     (ngroup)
Number of replicates for each demographic group                  (nreplic)
Definition of each demographic group, ordered as in the          (under "Demographic Groups:" heading in the
population file                                                  parameter file for the AIRQUAL and HAPEM programs)

a Indoor-source algorithms are included in the HAPEM program but have not yet been tested and reviewed.
Therefore, they are currently not recommended for use, and instructions for their use are omitted from this document.
To disable the indoor-source algorithms, set keyword CAS to 99999.

This information is followed by a header record defining the fields in the data records. An
example header record is presented below.

Example header record for final exposure output file
(in "wrapped" view)

ST CTY CENSUS GRUP POPUL SOURCE01 SOURCE02 SOURCE03 SOURCE04 BackgConc
IndCon_Pro IndCon_Mat IndCon_Com IndCon_Veh Total Conc

The header record is then followed by nreplic data records for each combination of group and
tract. The format of each data record, assuming nsource = 4, is provided in Table
4-2.

Table 4-2. Variables in the final exposure output file (assuming nsource = 4)

Field Numbers   Description                                         Format
1               leading space
2-3             state FIPS code                                     2-character string
5-7             county FIPS code                                    3-character string
9-14            tract FIPS code                                     6-character string
16-17           demographic-group indicator                         integer 1-10, ordered as in the population
                                                                    input file
19-25           number of people to which the exposure estimates    decimal number equal to the population of
                in the data record apply                            the group/tract combination divided by
                                                                    nreplic
27-36           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from emission-source-category 1                     of measurement as in the air quality file
38-47           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from emission-source-category 2                     of measurement as in the air quality file
49-58           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from emission-source-category 3                     of measurement as in the air quality file
60-69           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from emission-source-category 4                     of measurement as in the air quality file
71-80           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from background                                     of measurement as in the air quality file;
                                                                    derived from the sum of the uniform
                                                                    background (backg) and the variable
                                                                    background concentrations
82-91           estimated exposure-concentration contribution       decimal number in scientific notation; units
                from indoor-product emission sources                of measurement as in the air quality file
93-102          estimated exposure-concentration contribution       decimal number in scientific notation; units
                from building-materials indoor emissions            of measurement as in the air quality file
104-113         estimated exposure-concentration contribution       decimal number in scientific notation; units
                from indoor-combustion emission sources             of measurement as in the air quality file
115-124         estimated exposure-concentration contribution       decimal number in scientific notation; units
                from vehicles in attached garages                   of measurement as in the air quality file
126-135         estimated total-exposure concentration              decimal number in scientific notation; units
                                                                    of measurement as in the air quality file;
                                                                    the sum of the preceding contribution
                                                                    values

An example of a set of HAPEM exposure-output records (for 30 replicates of one demographic
group in one tract) is presented below. The total population for group 1 in this tract is 36 and
nreplic = 30, so the number of people to which the exposure estimates in each record
apply is 36/30 = 1.200.

Example set of exposure-output records (for 30 replicates of one demographic group in one tract)

ST CTY CENSUS GRUP POPUL SOURCE01   SOURCE02   BackgConc  IndCon_Pro IndCon_Mat IndCon_Com IndCon_Veh Total Conc
78 010 970100 1    1.200 0.4119E+00 0.1527E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4121E+00
78 010 970100 1    1.200 0.3939E+00 0.1699E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3941E+00
78 010 970100 1    1.200 0.3360E+00 0.1102E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3361E+00
78 010 970100 1    1.200 0.4791E+00 0.1654E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4793E+00
78 010 970100 1    1.200 0.4057E+00 0.1228E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4058E+00
78 010 970100 1    1.200 0.4652E+00 0.1805E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4654E+00
78 010 970100 1    1.200 0.4535E+00 0.1430E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4536E+00
78 010 970100 1    1.200 0.4412E+00 0.1815E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4414E+00
78 010 970100 1    1.200 0.3891E+00 0.1564E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3893E+00
78 010 970100 1    1.200 0.3771E+00 0.1278E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3772E+00
78 010 970100 1    1.200 0.3785E+00 0.1366E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3786E+00
78 010 970100 1    1.200 0.4700E+00 0.1662E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4702E+00
78 010 970100 1    1.200 0.3918E+00 0.1333E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3919E+00
78 010 970100 1    1.200 0.3923E+00 0.1632E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3925E+00
78 010 970100 1    1.200 0.3228E+00 0.8932E-04 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3229E+00
78 010 970100 1    1.200 0.2741E+00 0.8380E-04 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.2742E+00
78 010 970100 1    1.200 0.5163E+00 0.2219E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.5165E+00
78 010 970100 1    1.200 0.3965E+00 0.1391E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3966E+00
78 010 970100 1    1.200 0.3831E+00 0.1792E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3833E+00
78 010 970100 1    1.200 0.3829E+00 0.1528E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3831E+00
78 010 970100 1    1.200 0.3717E+00 0.1006E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3718E+00
78 010 970100 1    1.200 0.4407E+00 0.1510E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4409E+00
78 010 970100 1    1.200 0.3463E+00 0.1926E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3465E+00
78 010 970100 1    1.200 0.5035E+00 0.1562E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.5037E+00
78 010 970100 1    1.200 0.4628E+00 0.1433E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4629E+00
78 010 970100 1    1.200 0.3847E+00 0.1175E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3848E+00
78 010 970100 1    1.200 0.3232E+00 0.3503E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3236E+00
78 010 970100 1    1.200 0.3479E+00 0.1098E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.3480E+00
78 010 970100 1    1.200 0.4049E+00 0.3598E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4053E+00
78 010 970100 1    1.200 0.4263E+00 0.1639E-03 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.0000E+00 0.4265E+00
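A data record in the layout of Table 4-2 can be read by column position. The Python sketch below is illustrative only: the function name is hypothetical, the sample line is built in the code rather than read from a real output file, and it assumes that the column pattern of Table 4-2 (ten-character concentration fields separated by single spaces, starting at column 27) carries over to other values of nsource. The sample uses two source categories, as in the example records above.

```python
def parse_record(line, nsource):
    """Split one exposure-output data record into named fields.

    Column positions follow Table 4-2 (1-based there, 0-based slices here).
    There are nsource source fields, then background, four indoor fields,
    and the total, for nsource + 6 concentration fields in all.
    """
    nconc = nsource + 6
    conc = [float(line[26 + 11 * k: 36 + 11 * k]) for k in range(nconc)]
    return {
        "state": line[1:3],
        "county": line[4:7],
        "tract": line[8:14],
        "group": int(line[15:17]),
        "people": float(line[18:25]),
        "conc": conc,
    }


# First example record, rebuilt in the assumed fixed-width layout.
fields = ["0.4119E+00", "0.1527E-03"] + ["0.0000E+00"] * 5 + ["0.4121E+00"]
line = f" {'78'} {'010'} {'970100'} {1:2d} {1.2:7.3f} " + " ".join(fields)

record = parse_record(line, nsource=2)
contributions, total = record["conc"][:-1], record["conc"][-1]
print(record["state"], record["county"], record["tract"], record["people"])
print(abs(sum(contributions) - total) < 1e-4)
```

Because each field is printed to four significant digits, the sum of the contributions agrees with the reported total only to within rounding.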

5. HAPEM Programs

This section contains detailed descriptions of the five programs that are contained in the model:
DURAV, INDEXPOP, COMMUTE, AIRQUAL, and HAPEM. The first four programs (DURAV,
INDEXPOP, COMMUTE, and AIRQUAL) are preprocessors that convert input-data files into
the form required for efficient exposure calculations. The final program (HAPEM) performs the
exposure calculations and summarizes the results.

It is important to note that some knowledge of Fortran programming is necessary to understand
all the programming details discussed in this section. However, all the general concepts related
to the programs should be clear to all users.

5.1. Programming Guidelines Used to Develop HAPEM

The source code for each of the five model programs is written in Fortran 90 and designed so
that it can be compiled and executed on various platforms (e.g., UNIX, DOS, Windows) with
little or no programming changes required.

The model programs incorporate a structured programming style as summarized by the
attributes listed below.

•	No "GO TO" statements or line numbers are in any of the programs. Program flow is
direct from the beginning to the end within each program, thus making the code easy to
follow. The only looping is within "DO" blocks.

•	No filenames appear in source code. Instead, this information is specified in the
parameter file, which is read in from the command line.

•	Most parameter values are input from the parameter file so that the programs
themselves only allocate space for carrying out as many calculations as are necessary.

•	Most arrays depending on variable parameters are dynamically allocated.

•	All variables are declared (no implicit typing), with comments at the end of most
declarations to assist in interpretation. Comment lines are inserted between the logical
blocks of code for clarity.

5.1.1. Common Structural Elements

All the model programs consist of a declarations section, a parameters section, a setup section,
a primary section that processes the data, and a wrap-up section.

In the declarations section, all variables are explicitly typed. Most lines include a trailing
comment to indicate the general purpose of the variable(s). Arrays that are to be dynamically
allocated are fixed in rank (number of dimensions), with a colon used to defer the size
specification.

The second program section, referred to as the params section, reads the parameter file to
determine the specific input filenames and the parameter settings. This section is similar in all
the model programs, except that only the names of files needed by each job are retained as
variables. Each line of the parameter file is read in as a character string (maximum length of 120
characters) and inspected for an equals sign ("="). If there is no equals sign, then the line is
ignored. This allows the programmer to add comments and other lines directly to the parameter
file without altering its performance. Lines containing an equals sign are divided into two parts at
the equals sign. The part to the left of the sign is scanned for keywords. All keywords are in
lower case. If the string 'file' is found, then the line is assumed to specify one of the input or
output files. For these lines, a second keyword is searched for. Possible keywords are provided
in Table 5-1. The filenames and paths required by each model program are shown in
Table 2-1 as user-defined files.
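The line-handling rules described above can be sketched in Python as follows. This is not the Fortran source: the function name and the sample parameter lines are hypothetical, the case-sensitive and longest-keyword-first matching is an assumption made so that "commut" is not confused with the longer commuting keywords, and treating every other line with an equals sign as a simple setting is a simplification.

```python
# Filename keywords listed in Table 5-1.
FILE_KEYWORDS = [
    "activity", "cluster", "ClusTrans", "populat", "CommutTime", "CommutFrac",
    "DistToRoad", "commut", "quality", "factors", "mobiles", "statefip",
    "log", "counter", "mistract", "afile", "Product", "AutoPduct",
]


def parse_parameter_lines(lines):
    """Return (files, settings) dictionaries from parameter-file lines."""
    files, settings = {}, {}
    for raw in lines:
        line = raw[:120]              # lines are read as 120-character strings
        if "=" not in line:
            continue                  # no equals sign: comment, ignored
        left, right = line.split("=", 1)
        value = right.strip()
        if "file" in left:
            # Look for the second keyword, trying longer keywords first.
            for keyword in sorted(FILE_KEYWORDS, key=len, reverse=True):
                if keyword in left:
                    files[keyword] = value
                    break
        else:
            settings[left.strip()] = value
    return files, settings


# Hypothetical parameter-file lines for illustration.
sample = [
    "! activity inputs for the run",
    "activity file = input/activity pattern/durhw_HAPEM8.txt",
    "log file = output/run.log",
    "nblock = 24",
]
files, settings = parse_parameter_lines(sample)
print(files)
print(settings)
```

The first sample line contains a keyword but no equals sign, so it is skipped, which is the behavior that allows comments in the parameter file.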

Table 5-1. The filename keywords in the parameter files recognized by the model programs

Keyword       Definition
activity      name of the activity file (input)
cluster       name of the cluster file (input)
ClusTrans     name of the cluster-transition probability file (input)
populat       name of the population file (input)
CommutTime    name of the commuting-time file (input)
CommutFrac    name of the commuting-fraction file (input)
DistToRoad    name of the distance-to-road file (input)
commut        name of the commuting file (input)
quality       name of the air quality file (input)
factors       name of the factors file (input)
mobiles       name of the mobiles file (input)
statefip      name of the statefip file (input)
log           name of the log file (output)
counter       name of the counter file (output)
mistract      name of the mistract file (output)
afile         path of final exposure file (output)
Product¹      path of indoor source files (input)
AutoPduct¹    name of file for automobile-related consumer products (input)

¹ A path to one or more indoor-emission-source inputs for the indoor-source algorithms
is specified in these statements (with the AutoPduct statement including a filename).
These algorithms are included in the HAPEM program, but they have not yet been
tested and reviewed. Therefore, they are currently not recommended for use, and
instructions for their use are omitted from this document. To disable the indoor-source
algorithms, set keyword CAS to 99999, and specify any existing path (and file for
AutoPduct, other than those otherwise specified for input or output for the HAPEM
program) since no indoor-source files will then actually be utilized by the HAPEM
program.

The model user can use the above keywords in lines that do not contain an equals sign, or in
comments containing an equals sign as long as the word "file" does not also appear left of the
equals sign. The strings containing the directory and filenames should not exceed 100
characters. If they do, then use an alias or a logical drive specification to identify most of the
path, and thereby reduce the length to less than 100 characters. As described earlier in this
guide, each of the input files requires a certain format for the data. It is the responsibility of the
user to ensure that this format specification is met.

The setup section allocates and initializes the dynamic arrays that can be sized from the
parameter settings specified in the parameter file. Other arrays that are dependent on the
number of records in an input file are allocated elsewhere. The dynamic allocation saves space
and time by only using as much space as is necessary, allows for the parameters to be
increased or decreased without recompiling the program, and allows vector and array
operations to be programmed more simply since they can be applied to the entire array rather
than only to certain elements.

5.2. Program Descriptions

This section describes the purpose and structure of the processing section of each of the five
model programs.

5.2.1. DURAV

As explained in Section 2.1.2 (The DURAV Program and the Activity and Cluster Files), the
DURAV program performs the two main functions listed below.

•	If a different number of daily time blocks is specified for the analysis than in the activity
file, it processes the activity records so that the number of time blocks matches the
number specified for the analysis.

•	It creates a sequential ASCII file of the activity pattern records for use by the HAPEM
program.

The six age groups in HAPEM are as follows, in years.

•	0-1

•	2-4

•	5-15

•	16-17

•	18-64

•	65+

Currently, season and day of week are used to determine three day types as

•	weekdays in summer (June-August),

•	other weekdays, or

•	weekends.

Cluster types are used to represent variations in activity pattern within each combination of
demographic group, day type, and commuting status. There are 1-3 cluster types for each
combination of group, day type, and commuting status. Each CHAD record in the activity file
has been assigned a cluster type based on the cluster analyses. While DURAV makes use of
the age groups, day types, and commuting-status categories, those assignments are already
present in the activity and cluster input files.

DURAV Processing Operations

In addition to the operations discussed above, the params section of DURAV conducts the
operations described below.

•	The parameter file, which stores input model variables and input file names, is read in
from the command line.

•	The values of nblock (the number of time blocks per day in the activity file) and hblock
(the number of time blocks per day for the analysis), specified in the parameter file, are
checked for compatibility. As explained elsewhere in this guide, hblock must be an
integral factor of nblock, so that the activity time blocks can be combined if necessary to
match hblock. If the check fails, then an error message is written to the log file and the
program stops.
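The compatibility check and the combining of time blocks can be sketched in Python as follows. The function name is hypothetical, and the assumption that each block holds a number of minutes that is simply summed when blocks are combined is for illustration only; this section does not spell out the arithmetic DURAV applies.

```python
def aggregate_blocks(minutes, nblock, hblock):
    """Combine nblock time-block values into hblock values.

    Raises ValueError when hblock is not an integral factor of nblock,
    mirroring the fatal check described above.
    """
    if hblock <= 0 or nblock % hblock != 0:
        raise ValueError(f"hblock={hblock} is not an integral factor of nblock={nblock}")
    if len(minutes) != nblock:
        raise ValueError("record does not contain nblock time blocks")
    width = nblock // hblock          # consecutive blocks merged into one
    return [sum(minutes[i:i + width]) for i in range(0, nblock, width)]


# Hypothetical record: 24 hourly blocks of 60 minutes, combined into 6 blocks.
hourly = [60] * 24
combined = aggregate_blocks(hourly, nblock=24, hblock=6)
print(combined, sum(combined))
```

Under this assumption the total activity time of a record (1,440 minutes) is unchanged by the aggregation.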

Further, the setup section of DURAV conducts the operation described below.

•	The number of time blocks per day in the activity file is determined from the header
record, as explained elsewhere in this guide. This number is checked against the value
of nblock specified in the parameter file. If the values are different, an error message is
written to the log file and the program stops.

Finally, the main processing section of DURAV conducts the several operations described
below.

•	The number of data records in the activity file is determined so that memory can be
allocated for various arrays used to hold the input data records and other data derived
from them.

•	Each activity record is checked to ensure that the total activity time is 1,440 minutes. If
the check fails, then a message is written to the screen. This should never occur if it is
checked for when developing the activity file.

•	The nblock time blocks in each activity record are aggregated, if necessary, to create
hblock time blocks.

•	The total number of data records in the activity file, and the total number with activity
durations of 1,440 minutes, are recorded in the counter file.

•	The number of aggregated records in each combination of demographic group, day type,
and commuting status is determined.

•	The number of aggregated activity records in each combination of demographic group,
day type, commuting status, and cluster, and the number of clusters in each combination
of group, day type, and commuting status, are recorded in an intermediate file with
filename extension .nonzero (see footnote 12). This information is used in the HAPEM program, as
described in Section 5.2.5 (HAPEM).

12 The *.nonzero file also records a flag for each combination of demographic group and day type, indicating whether
10 percent of the activity patterns include commuting. This flag was used by an earlier version of the HAPEM
program, but it is not used in this version.

•	The total number of aggregated activity records processed and their allocation among
demographic group, day types, and commuting status is written into the log file.

•	The activity patterns are written into a sequential file with filename extension .da, sorted
by demographic group, day type, commuting status, and cluster type, and the filename is
recorded in the log file.

5.2.2. INDEXPOP

As explained in Section 2.1.3 (The INDEXPOP Program and the Population, Distance-to-road,
Commuting-time, and Commuting-fraction Files), the INDEXPOP program performs the three
main functions listed below.

•	It creates a direct-access file of population data to be used in the AIRQUAL program.

•	It creates sequential ASCII index files for the population data census tracts, to facilitate
file searching in the COMMUTE and AIRQUAL programs.

•	It creates direct-access files and associated index files of the data in the distance-to-
road, commuting-time, and commuting-fraction files, to be used in the COMMUTE and
AIRQUAL programs.

INDEXPOP Processing Operations

The specific operations performed in the main processing section of INDEXPOP are described
below.

•	The parameter file, which stores input model variables and input file names, is read in
from the command line.

•	Each record in the distance-to-road file is read and written into a direct-access file with
filename extension .dat, and an associated index file is created with filename extension
.STIDX.

•	Each record in the commuting-time file is read and written into a direct-access file with
filename extension .dat, and an associated index file is created with filename extension
.STIDX.

•	Each record in the commuting-fraction file is read and written into a direct-access file
with filename extension .dat, and an associated index file is created with filename
extension .STIDX.

•	The number of data records in the population file is determined so that memory can be
allocated for various arrays used to hold the input data records and other data derived
from them.

•	Each data record in the population file is read. The population array is recorded in a
direct-access file with the filename extension .da. The state FIPS, county FIPS, tract
FIPS, and serial record number are recorded in a direct-access file with the filename
extension _direct.ind.

•	The total number of tract records in each county is determined.

•	The total number of counties included in the population file that are in each state is
determined.

•	A sequential index file is created with filename extension .county_tract_pop_range. For
each county in the population file, there is a record in this file indicating the serial record
numbers of the first and last data record for tracts in that county in the *.da and
*_direct.ind files.

•	A sequential index file is created with filename extension .state_county_pop_range. For
each state, there is a record in this file indicating the serial record numbers of the first
and last data record for counties in that state in the *.county_tract_pop_range file.

•	The total number of records (tracts) and counties in the population file is added to the
counter file.
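
The two range-index files described above can be pictured with a short sketch. The Python below is illustrative only and is not the INDEXPOP source code; the function name and the sample FIPS codes are hypothetical. It assumes that the population records are ordered so that tracts in the same county (and counties in the same state) are contiguous, and that serial record numbers start at 1.

```python
from itertools import groupby

def build_range_index(keys):
    """Return (key, first, last) for each run of equal keys.

    Record numbers are 1-based (an assumption), mirroring serial record
    numbers in a direct-access file. Equal keys must be contiguous.
    """
    ranges = []
    position = 1
    for key, run in groupby(keys):
        count = sum(1 for _ in run)
        ranges.append((key, position, position + count - 1))
        position += count
    return ranges

# Hypothetical 11-digit tract codes: state (2) + county (3) + tract (6).
tracts = ["01001020100", "01001020200", "01003010100", "02013000100"]

# County-level index: first and last tract record for each county.
county_ranges = build_range_index([t[:5] for t in tracts])

# State-level index: first and last county record for each state.
state_ranges = build_range_index([c[:2] for c, _, _ in county_ranges])
```

With an index of this shape, a lookup for one tract can first narrow the search to a state's counties and then to a county's tract records, rather than scanning the whole file.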

5.2.3. COMMUTE

As explained in Section 2.1.4 (The COMMUTE Program and the Commuting, Distance-to-road,
Commuting-time, and Commuting-fraction Files), the COMMUTE program performs the three
main functions described below.

•	It creates a file identifying the set of work tracts (i.e., tracts in which the residents of the
home tract work) associated with each census tract (i.e., home tract), the fraction of
workers residing in that home tract and working in each work tract, and the normalized
centroid-to-centroid distance between home tract and each work tract. The normalized
distance is the distance/(average distance). The normalized distance is combined with
the average commuting time for the tract to estimate the commuting time for the home-
tract/work-tract pair in the HAPEM program.

•	It creates a sequential index file to facilitate file searching in the HAPEM program.

•	It adds the census-tract-specific information from the distance-to-road, commuting-time,
and commuting-fraction direct-access files (created in the INDEXPOP program) to the
commuting index file.

COMMUTE Processing Operations

The specific operations performed in the main processing section of COMMUTE are as follows.

•	The parameter file, which stores input model variables and input file names, is read in
from the command line.

•	The distance-to-road index file (filename extension .STIDX, created in INDEXPOP) is
read twice: first to determine the number of records for array allocation, and then to
populate the arrays with the data in the file.

•	The commuting-time index file (filename extension .STIDX, created in INDEXPOP) is
read twice: first to determine the number of records for array allocation, and then to
populate the arrays with the data in the file.

•	The commuting-fraction index file (filename extension .STIDX, created in INDEXPOP) is
read twice: first to determine the number of records for array allocation, and then to
populate the arrays with the data in the file.

•	The number of data records in the commuting file is determined so that memory can be
allocated for various arrays used to hold the input-data records and other data derived
from them.

•	The number of commuting records with home tracts in each state is determined.

•	For each state, the sequence numbers of the first and last data record indicating a home
tract in that state are determined.

•	The number of records in the population file is read from the counter file, so that memory
can be allocated for various arrays used to hold the input-data records and other data
derived from them.

•	All the tract FIPS are read from the *_direct.ind file created by INDEXPOP, using the
indices from the *.state_county_pop_range and *.county_tract_pop_range files created
by INDEXPOP.

•	For each tract in the *_direct.ind file created by INDEXPOP, all matching home tracts in
the commuting file are found. (There is one home-tract record for every commuting flow
originating in that tract.) For each matched home tract, the FIPS and number of work
tracts within 120 km are determined. For each home tract, the fractions of total
commuting flow to work tracts, which are specified in the commuting file, are adjusted to
the fractions of the total commuting flow within 120 km.

•	For each home-tract/work-tract pair, the centroid-to-centroid distance from the
commuting file is determined and a normalized distance is calculated as
distance/(average distance).

•	Each work-tract FIPS, its adjusted flow fraction, and its normalized distance are
recorded in a sequential file with filename extension .da (one record for each work tract).

•	If no matching home tracts are found in the commuting file for a population tract, an
entry is recorded in the mistract file, indicating the tract FIPS and the indices of the tract
in the *.state_county_pop_range, *.county_tract_pop_range, and *_direct.ind files.

•	For population tracts with no matching commuting home tracts, a record is recorded in
the *.da file indicating the population tract as the work tract, with fractional commuting
flow of 1.0 (i.e., all work takes place in the home tract).

•	For each population tract, a record is written into a temporary index file. The fields in the
record are the population tract FIPS, the sequence numbers of the first and last work
tract record in the *.da file, and a flag indicating whether the population tract was
matched by a home tract in the commuting file (0=no; 1=yes).

•	Two records are added to the counter file. The first record indicates the number of
records found in the *_direct.ind file (created by INDEXPOP) and the number of data
records found in the commuting file. The second record records the number of records in
the *.da file and the number of records in the *.ind file.

•	A sequential index file is created with filename extension .st_comm1_fip_range. For
each state, there is a record in this file indicating the sequence numbers of the first and
last data record for tracts for that state in the temporary index file.

•	The temporary index file is read from the beginning. Each record is matched by tract with
a record in the distance-to-road, commuting-time, and commuting-fraction direct-access
files (filename extensions of .dat, created in INDEXPOP). The combined data for each
tract are written into a direct-access file with the root filename of the commuting file and
the filename extension .ind.
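
The flow-fraction adjustment and distance normalization described in the list above can be sketched as follows. This is a minimal illustration, not the COMMUTE source code. The function name and sample values are hypothetical. Two points are assumptions, because the guide does not specify them: work tracts exactly 120 km away are treated as within range, and the "average distance" is taken as the flow-weighted mean over the retained work tracts.

```python
def adjust_flows(flows, max_km=120.0):
    """Rescale one home tract's commuting flows to work tracts within max_km.

    flows: list of (work_tract_fips, flow_fraction, distance_km) tuples.
    Returns a list of (work_tract_fips, adjusted_fraction, normalized_distance).
    An empty list means no usable work tract was found.
    """
    nearby = [f for f in flows if f[2] <= max_km]
    total = sum(frac for _, frac, _ in nearby)
    if total <= 0.0:
        return []
    # ASSUMPTION: "average distance" is the flow-weighted mean distance
    # over the retained work tracts; the guide does not define it.
    mean_km = sum(frac * km for _, frac, km in nearby) / total
    result = []
    for fips, frac, km in nearby:
        # Placeholder choice: use 1.0 when the mean distance is zero.
        norm = km / mean_km if mean_km > 0.0 else 1.0
        result.append((fips, frac / total, norm))
    return result

# Hypothetical flows for one home tract; tract "C" lies beyond 120 km.
flows = [("A", 0.5, 10.0), ("B", 0.3, 30.0), ("C", 0.2, 200.0)]
adjusted = adjust_flows(flows)
```

In this example the retained fractions 0.5 and 0.3 are rescaled to 0.625 and 0.375, so they again sum to 1. A home tract with an empty result would correspond to the case in which the population tract itself is recorded as the work tract with a flow fraction of 1.0.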

5.2.4. AIRQUAL

As explained in Section 2.1.5 (The AIRQUAL Program and the Air Quality and Distance-to-road
Files), the AIRQUAL program performs the four main functions listed below.

•	It creates a sequential file of air-quality data to be used in the HAPEM program.

•	It determines the number of data records for each census tract in the air quality file.

•	It creates index files to facilitate file searching in the HAPEM program.

•	It adds the tract-specific information from the distance-to-road direct-access file (created
in the INDEXPOP program) to the air-quality index files.

AIRQUAL Processing Operations

The specific operations performed in the main processing section of AIRQUAL are described
below.

•	The parameter file, which stores input model variables and input file names, is read in
from the command line.

•	The number of data records in the air quality file is determined so that memory can be
allocated for various arrays used to hold the input-data records and other data derived
from them.

•	The number of time blocks in the air quality file is determined from the header record. It
is checked for compatibility with the value of hblock (the number of time blocks for the
analysis, as specified in the parameter file). As explained in Section 2.1.5 (The
AIRQUAL Program and the Air Quality and Distance-to-road Files), hblock must be an
integral multiple of the number of air-quality time blocks, so that the air-quality values
can be replicated if necessary to create hblock air-quality values. If this check fails, an
error message is written to the log file and the program stops.

•	Each data record in the air quality file is read and, if necessary, the concentration values
for each time block are replicated to create hblock values.

•	The concentrations in each record are recorded in a sequential file with the root name of
the air quality file and the filename extension .da (e.g., HAP.da) to be used in HAPEM.

•	The index ranges for the multiple data records in each tract are determined and stored in
an index array.

•	All the unique county FIPS in the air quality file are counted and the values saved into an
array.

•	The number of records in the population file is read from the counter file.

•	An attempt is made to match each population tract specified in the *_direct.ind file
(created by INDEXPOP) with a tract in the air quality file. If a match is found, the
population array from the *.da file (created by INDEXPOP) is recorded in a sequential
file with the root name of the population file and the filename extension .pop_air_da
(e.g., population_HAPEM8.pop_air_da). The tract code (state FIPS, county FIPS, and
tract FIPS) and the index range for data records in a tract (from the index array) are
recorded in a sequential file with the root name of the air quality file and the filename
extension .air_da (e.g., HAP.air_da). If no match is found, the serial record number of
the tract in the *_direct.ind file (created by INDEXPOP) and the tract code are recorded
in the mistract file.

•	For each state, the number of tracts in the *.air_da file is determined.

•	For each county in the *.air_da file, the number of tracts is determined.

•	A sequential index file is created with filename extension .state_air_fip_range. For each
county, there is a record in this file indicating the serial record numbers of the first and
last data records in the *.pop_air_da and *.air_da files.

•	A sequential index file is created with filename extension .state_air1_fip_range. For each
state, there is a record in this file indicating the serial record numbers of the first and last
data records in the *.state_air_fip_range file.

•	A sequential index file is created with filename extension .state_air2_fip_range. For each
state, there is a record in this file indicating the serial record numbers of the first and last
data records in the *.pop_air_da and *.air_da files.

•	Two records are added to the counter file. The first record indicates the number of tracts
in the *.pop_air_da and *.air_da files, and the number of counties in the
*.state_air_fip_range file. The second record indicates the number of census tracts in the
air quality file and the number of data records in the air quality file.
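
The time-block compatibility check and replication step described above can be sketched as follows. This is an illustrative Python example rather than the AIRQUAL source code; the function name is hypothetical. It assumes that each air-quality time block covers a contiguous span of analysis time blocks, so each value is repeated consecutively.

```python
def expand_to_hblock(values, hblock):
    """Replicate air-quality values so that there are hblock of them.

    Raises ValueError when hblock is not an integral multiple of the
    number of air-quality time blocks.
    """
    n = len(values)
    if n == 0 or hblock % n != 0:
        raise ValueError(
            f"hblock ({hblock}) must be an integral multiple of the "
            f"number of air-quality time blocks ({n})"
        )
    repeat = hblock // n
    # ASSUMPTION: each value is repeated for consecutive time blocks.
    return [v for v in values for _ in range(repeat)]

# Two air-quality time blocks expanded to six analysis time blocks.
expanded = expand_to_hblock([1.0, 2.0], 6)
```

Here `expanded` holds three copies of the first value followed by three copies of the second; a request for five time blocks would raise the error instead.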

5.2.5. HAPEM

As explained in Section 2.1.6 (The HAPEM Program, the ME Factors and Mobiles Files, and the
Activity Cluster-transition File), the HAPEM program performs the six main functions described
below.

•	For each demographic group in each census tract, it randomly selects nreplic sets of
ME factors based on the distribution data provided in the factors and mobiles files. Each
set contains a subset of ME factors randomly selected for each of the time blocks (for
the PEN and ADD factors) or each of the sources (for the PROX and LAG factors). Each
subset contains randomly selected ME factors for each of nmicro MEs.

•	For each demographic group in each census tract, it randomly selects nreplic sets of
air-quality data from the datasets available for a tract.

•	For each demographic group in each census tract, it creates nreplic sets of average
activity patterns, where a set contains one average pattern for each day type. An
average activity pattern for each day type is calculated as a weighted average of activity
patterns randomly selected from each cluster in a group/day-type/commuting-status
combination. The weights are determined by the relative frequencies of cluster types
randomly selected in a one-stage Markov process,9 based on the cluster-transition
probabilities provided in the cluster-transition file.

•	For each activity pattern for a commuting demographic group, it randomly selects a work
census tract with probability weighting based on the fraction of residents that work in that
tract.

•	For each census tract, it estimates the concentration in each ME based on ME factors
and outdoor concentrations.

•	It combines activity patterns, commuting status, and estimates of ME concentration to
calculate nreplic annual-average exposure concentrations for each demographic group
in each census tract.
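
The cluster weighting in the third function above can be sketched as follows. This is an illustrative Python example, not the HAPEM source code. The function names, the number of simulated days, and the sample probabilities are hypothetical, and the sketch interprets "one-stage Markov process" as a chain in which each day's cluster depends only on the previous day's cluster.

```python
import random

def cluster_weights(initial_probs, transition, n_days, rng):
    """Simulate a chain of daily cluster types and return the relative
    frequency of each cluster over n_days."""
    n = len(initial_probs)
    counts = [0] * n
    state = rng.choices(range(n), weights=initial_probs)[0]
    counts[state] += 1
    for _ in range(n_days - 1):
        # The next cluster depends only on the current one.
        state = rng.choices(range(n), weights=transition[state])[0]
        counts[state] += 1
    return [c / n_days for c in counts]

def average_pattern(patterns, weights):
    """Weighted average of one activity pattern per cluster.

    patterns[k] is a flat list of minutes (one entry per ME/time-block
    cell) for the pattern drawn from cluster k.
    """
    n_cells = len(patterns[0])
    return [
        sum(w * p[i] for p, w in zip(patterns, weights))
        for i in range(n_cells)
    ]

# Hypothetical two-cluster example.
rng = random.Random(1)
weights = cluster_weights([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], 365, rng)
pattern = average_pattern([[1440.0, 0.0], [0.0, 1440.0]], weights)
```

Because the weights are relative frequencies, they sum to 1, and the averaged pattern still accounts for 1,440 minutes.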

HAPEM Processing Operations

The specific operations performed in the main processing section of HAPEM are described
below.

•	The parameter file, which stores input model variables and input file names, is read in
from the command line.

•	The distribution data of ME factors for each of nmicro MEs is read from the factors and
mobiles files (as appropriate) and saved into arrays.

•	For the PROX distributions in the mobiles file for onroad-mobile sources, the average
PROX factor for the second distance category (75-200 meters) over all the indoor MEs
is calculated. (This value will be used later to calculate the ambient concentration for the
third distance category [beyond 200 meters], as described below.)

•	For each combination of demographic group, day type, and commuting status, the
number of activity patterns for each cluster is read from the *.nonzero file created in
DURAV.

•	For each combination of demographic group, day type, and commuting status, the
frequency of each cluster, and the cluster-to-cluster transition probabilities, are read from
the cluster-transition file.

•	For each combination of demographic group, day type, commuting status, and cluster
with a positive number of activity records, the activity-pattern records are read from the
*.da file (created in DURAV) and the values saved into an array.

•	Each activity pattern is checked to ensure a total activity time of 1,440 minutes. If this
check fails, an error message is written to the log file and the program stops.

•	Several values are read from the counter file to allocate memory for various arrays.

•	Indices are read from the *.state_air_fip_range and *.state_air1_fip_range files (created
by AIRQUAL).

•	Data are read from the *.pop_air_da file and the index ranges for air records from the
*.air_da file (created by AIRQUAL).

Air-data records are read from the *.da files created by AIRQUAL.

Indices are read from the *.st_comm1_fip_range and *.ind files created by COMMUTE,
and data are read from the *.da file created by COMMUTE.

For each tract in the *.ind file created by COMMUTE, an attempt is made to find a
matching tract in the *.state_air_fip_range file created by AIRQUAL. If a match is not
found, the commuting tract is recorded in the mistract file.

For each demographic group in each census tract, nreplic sets of ME factors are
randomly selected based on the distribution data provided in the factors and mobiles
files, using subroutines "DISTRIBUTION" and "DATASET" (i.e.,
distribution_HAPEM8.FOR and dataset_HAPEM8.f90). Each set contains a subset of
ME factors randomly selected for each time block (for the PEN and ADD factors) or each
source (for the PROX factor). For onroad-mobile source categories, first a
distance-from-source category is selected for each indoor ME based on the population
fractions in
each distance category that were taken from the distance-to-road file and added to the
commuting index file in COMMUTE.13 Then, a PROX factor for each indoor ME is
selected from the appropriate distribution. Each subset contains randomly selected ME
factors for each of nmicro MEs.

For each demographic group in each census tract, nreplic sets of air-quality data are
randomly selected from the datasets available for the census tract in the *.da file
created by AIRQUAL.

When a single set of ambient concentrations is provided for each tract in the air quality
file (as is typically the case), they represent spatial averages over the tract, excluding
locations very close to an emission source. For onroad-mobile source categories, it is
assumed that the ambient concentrations in the air quality file represent spatial averages
over the second and third distance categories (the distances 75-200 meters and beyond
200 meters) for the distance-to-road and mobiles files. Because HAPEM estimates the
ambient concentration for the second distance category by applying a PROX factor to
the "tract-average" ambient concentration, the ambient concentration for the third
distance category also is adjusted to make the area-weighted average over these two
distance categories equal to the "tract average". This is done as shown below.

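$$\mathrm{CONC}_{AQ} = \mathrm{AREA}_{D3} \times \mathrm{CONC}_{D3} + \mathrm{AREA}_{D2} \times \mathrm{CONC}_{D2}$$

or

$$\mathrm{CONC}_{AQ} = \mathrm{AREA}_{D3} \times \mathrm{CONC}_{D3} + \mathrm{AREA}_{D2} \times \mathrm{PROX}_{D2} \times \mathrm{CONC}_{D3}$$

or

$$\mathrm{CONC}_{D3} = \frac{\mathrm{CONC}_{AQ}}{\mathrm{AREA}_{D3} + \mathrm{AREA}_{D2} \times \mathrm{PROX}_{D2}}$$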

where:

$\mathrm{CONC}_{AQ}$: the "tract-average" concentration from the air quality file,

$\mathrm{CONC}_{D2}$: average ambient concentration in the second distance category (75-200
meters),

13 It is assumed that the spatial distribution of all indoor MEs in a tract with respect to distance from major roadways is
the same as for residences.

$\mathrm{CONC}_{D3}$: average ambient concentration in the third distance category (beyond 200
meters),

$\mathrm{AREA}_{D2}$: fraction of the tract area in the second distance category (from the
distance-to-road file),

$\mathrm{AREA}_{D3}$: fraction of the tract area in the third distance category (from the
distance-to-road file), and

$\mathrm{PROX}_{D2}$: average PROX factor for the second distance category over all the indoor
MEs (calculated above).14
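
As a worked illustration of the adjustment above, the sketch below solves for the third-distance-category concentration and then recovers the second-category concentration from it. The function name and the input values are hypothetical and are chosen only to make the arithmetic easy to follow.

```python
def split_tract_average(conc_aq, area_d2, area_d3, prox_d2):
    """Return (conc_d2, conc_d3) such that
    area_d3 * conc_d3 + area_d2 * conc_d2 equals conc_aq,
    with conc_d2 = prox_d2 * conc_d3."""
    conc_d3 = conc_aq / (area_d3 + area_d2 * prox_d2)
    conc_d2 = prox_d2 * conc_d3
    return conc_d2, conc_d3

# Hypothetical inputs: one quarter of the area is 75-200 meters from a
# road, where concentrations are twice those farther away.
conc_d2, conc_d3 = split_tract_average(
    conc_aq=1.0, area_d2=0.25, area_d3=0.75, prox_d2=2.0
)
```

With these inputs the denominator is 0.75 + 0.25 x 2.0 = 1.25, so the third-category concentration is 0.8 and the second-category concentration is 1.6. The area-weighted average, 0.75 x 0.8 + 0.25 x 1.6, returns the original tract-average value of 1.0.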

The randomly selected air-quality data from the *.da file created by AIRQUAL for each
matched tract is combined with the randomly selected ME factors to estimate the
concentrations for each ME/time-block combination for that tract.

For each demographic group in each census tract, the background-exposure-
concentration contributions are calculated for each ME/time-block combination based on
the uniform value of the backg parameter (specified in the parameter file), the variable
background-concentration values for each data record in the *.da file created by AIRQUAL,
and the randomly selected ME factors.

For each census-tract, demographic-group, and day-type replicate, a commuting status
is selected based on the data from the commuting-fraction file (that were added to the
commuting index file in COMMUTE). If the replicate is a commuter, then a commuting
mode (public or private transit) is randomly selected based on the data from the
commuting-time file (that were added to the commuting index file in COMMUTE). This
selection also determines an associated average commuting time for the tract.

For each replicate that commutes, a work tract is randomly selected for each selected
activity pattern, using the subroutine "RANDOMR" within HAPEM. The work tract is
selected from the set of work tracts corresponding to that home tract, as specified in the
*.da file created by COMMUTE. The air-quality data for that work tract are randomly
selected from the datasets available for the work tract in the *.air_da file created by
AIRQUAL. If the work tract cannot be found in the *.air_da file, the air-quality data for the
home tract are used. The air-quality data are adjusted and combined with the ME factors
randomly selected in the same way as the home tract, to estimate the concentrations for
each ME/time-block combination for that work tract.
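
The work-tract selection and the home-tract fallback described above can be sketched as follows. This is an illustrative Python example that uses the standard library's weighted sampling; it is not the "RANDOMR" subroutine, and the function names and sample tract labels are hypothetical.

```python
import random

def pick_work_tract(work_tracts, fractions, rng):
    """Draw one work tract with probability proportional to the fraction
    of home-tract workers commuting to it."""
    return rng.choices(work_tracts, weights=fractions, k=1)[0]

def air_records_for(work_tract, home_tract, air_index):
    """Return the work tract's air-quality records, or the home tract's
    records when the work tract is absent from the index."""
    if work_tract in air_index:
        return air_index[work_tract]
    return air_index[home_tract]

# Hypothetical home tract "H" with two candidate work tracts.
rng = random.Random(0)
chosen = pick_work_tract(["H", "W"], [0.7, 0.3], rng)
records = air_records_for(chosen, "H", {"H": [1.2, 1.4]})
```

In this example the air-quality index contains only the home tract, so the home-tract records are returned whichever work tract is drawn.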

For each replicate/day-type combination, an average activity pattern is calculated as the
weighted average of activity patterns randomly selected from each cluster in a
combination of demographic group, day type, and commuting status in the *.da file
created in DURAV. The weights are determined by the relative frequencies of cluster
types randomly selected in a Markov process, based on the cluster-transition
probabilities provided in the cluster-transition file.

14 As implied by the equations above, the onroad-mobile-source PROX distributions are estimated as the ratios
between the near-roadway concentration and the concentration distant from the roadway, rather than the ratios
between the near-roadway concentration and the "tract-average" concentration.

The average activity pattern for the day type is adjusted so that the commuting time for
the replicate is equal to the product of the tract-average commuting time for the
commuting mode selected above and the normalized home-tract/work-tract distance
calculated in COMMUTE and recorded in the commuting direct-access file (created in
COMMUTE). The adjustments are made by uniform scaling of the time in each time
block for commuting MEs (so that the sum matches the total calculated commuting time)
and corresponding uniform scaling of the time in each time block for non-commuting
MEs.
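
The uniform scaling described above can be sketched as follows. This is an illustrative Python example, not the HAPEM source code; the function name and sample durations are hypothetical. It assumes that the "corresponding" scaling of non-commuting MEs is the single factor that keeps the daily total at 1,440 minutes.

```python
def rescale_commute(durations, is_commute_me, target_commute_min,
                    total_min=1440.0):
    """Scale an activity pattern so commuting time equals the target.

    durations[m][t]: minutes spent in ME m during time block t.
    is_commute_me[m]: True when ME m is a commuting ME.
    """
    commute = sum(sum(row) for row, c in zip(durations, is_commute_me) if c)
    other = sum(sum(row) for row, c in zip(durations, is_commute_me) if not c)
    if commute <= 0.0 or other <= 0.0:
        raise ValueError("pattern needs both commuting and non-commuting time")
    k_commute = target_commute_min / commute
    # ASSUMPTION: non-commuting time absorbs the change so that the
    # daily total stays at total_min.
    k_other = (total_min - target_commute_min) / other
    return [
        [d * (k_commute if c else k_other) for d in row]
        for row, c in zip(durations, is_commute_me)
    ]

# Hypothetical pattern: one commuting ME and one non-commuting ME,
# two time blocks, 60 commuting minutes scaled up to 90.
scaled = rescale_commute([[30.0, 30.0], [690.0, 690.0]], [True, False], 90.0)
```

In this example the commuting minutes in each time block grow from 30 to 45, and the non-commuting minutes shrink from 690 to 675.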

The ME/time-block time durations of the weighted-average activity patterns are
combined with the estimated ME/time-block concentrations for the home tract and the
work tracts to estimate nreplic exposure concentrations for each combination of
demographic group and day type. A separate set of estimates is made for each
emission-source category. The algorithm for each combination of group and day type in
the tract is as follows.

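$$\mathrm{ExpConc} = \frac{\sum_{t \in \mathrm{TimeBlocks}} \sum_{m \in \mathrm{Microenvironments}} \mathrm{Conc}_{t,m} \times \mathrm{Duration}_{t,m}}{\sum_{t \in \mathrm{TimeBlocks}} \sum_{m \in \mathrm{Microenvironments}} \mathrm{Duration}_{t,m}}$$

where:

$\mathrm{Conc}_{t,m}$: the emission-source-category concentration during time-block t in ME m,
and

$\mathrm{Duration}_{t,m}$: the duration of activity during time-block t in ME m.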

The exposure concentrations for each day type are combined with weighted averaging
to create an annual-average exposure concentration. The weights are the relative
frequencies of the day types: 0.178 for summer weekday, 0.537 for other weekdays, and
0.285 for weekends.
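
The duration-weighted exposure calculation and the day-type averaging can be sketched as follows. This is an illustrative Python example, not the HAPEM source code. The function names, dictionary keys, and sample values are hypothetical; the day-type weights are the ones stated above.

```python
# Relative frequencies of the day types, as stated in the text.
DAY_TYPE_WEIGHTS = {
    "summer_weekday": 0.178,
    "other_weekday": 0.537,
    "weekend": 0.285,
}

def exposure_concentration(conc, duration):
    """Duration-weighted mean concentration over all ME/time-block cells.

    conc[t][m] and duration[t][m] are indexed by time block t and ME m.
    """
    numerator = sum(
        c * d
        for conc_row, dur_row in zip(conc, duration)
        for c, d in zip(conc_row, dur_row)
    )
    denominator = sum(d for dur_row in duration for d in dur_row)
    return numerator / denominator

def annual_average(exp_by_day_type):
    """Combine day-type exposure concentrations into an annual average."""
    return sum(w * exp_by_day_type[k] for k, w in DAY_TYPE_WEIGHTS.items())

# Hypothetical example: two time blocks and two MEs, with 720 minutes
# spent in ME 0 during block 0 and 720 minutes in ME 1 during block 1.
exp_conc = exposure_concentration(
    conc=[[1.0, 2.0], [3.0, 4.0]],
    duration=[[720.0, 0.0], [0.0, 720.0]],
)
annual = annual_average(
    {"summer_weekday": 1.0, "other_weekday": 2.0, "weekend": 3.0}
)
```

For the sample pattern, the exposure concentration is (1.0 x 720 + 4.0 x 720) / 1,440 = 2.5, and the annual average for the sample day-type values is 0.178 x 1.0 + 0.537 x 2.0 + 0.285 x 3.0 = 2.107.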

A total annual-average exposure concentration is calculated by summing the
annual-average values for each emission-source category, the background contribution, and
the contribution from the indoor-source ADD factor.

The results are written into the final exposure output files, with nreplic records for each
demographic group in each tract. The format of the files is described in Section 4.4
(Final Exposure File).

6. References

Graham, S., K. Isaacs, T. McCurdy, J. Langstaff, P. Hartman, C. Stevens, H. Hubbard, S.
Hartley, J. Cohen, A. Bordner, C. Holder, N. Vetter, A.J. Overton, I. Warren, C. Cavanagh, B.
Luukinen, and W. Mitchell, 2019: The Consolidated Human Activity Database (CHAD)
Documentation and Users' Guide. EPA-452/B-19-001. U.S. Environmental Protection Agency,
Research Triangle Park, NC.
https://www.epa.gov/sites/default/files/2019-11/documents/chadreport_october2019.pdf.

Appendix A: Updating the Hazardous Air
Pollutant Exposure Model (HAPEM) for Use in
the 2020 Air Toxics Screening Assessment
(AirToxScreen)

MEMORANDUM

To: Matt Woody, Rod Truesdell, and Michael Moeller

From: ICF: Minti Patel, Chris Holder, Aishwarya Javali, Jared Wang, Graham Glen, and
Melissa Polansky

Innovate! Inc.: David Yarnell, Ben Holloway, and Michael Blair

Date: December 4, 2023

Re: Updating the Hazardous Air Pollutant Exposure Model (HAPEM) for Use in the
2020 Air Toxics Screening Assessment (AirToxScreen)

ICF ("we") updated the default input files accompanying the Hazardous Air Pollutant
Exposure Model (HAPEM), and we updated some of the HAPEM source code to
accommodate the new default files. The resulting new version of HAPEM (i.e., HAPEM8),
with its default files, simulates exposure concentrations for all populated census tracts
using 2020 census data, commuting data from the 2012-2016 and 2015-2020 American
Community Survey (ACS), and time-activity data from the April 2020 version of the U.S.
Environmental Protection Agency (EPA) Consolidated Human Activity Database (CHAD).
In this technical memorandum, we describe how we updated the default files and model
source code, including the quality-assurance (QA) steps we used and the format of the
final default files. HAPEM8 and its updated default files will be available for download as
EPA's latest, default version of HAPEM.1 We modeled exposure concentrations using
HAPEM8 for the 2020 Air Toxics Screening Assessment (AirToxScreen), as described in a
separate memorandum.2

1	We anticipate HAPEM8 and its User's Guide will be made available by EPA online in Winter 2023-2024. As of
April 26, 2021, HAPEM7 is available for download at https://www.epa.gov/fera/human-exposure-modeling-
hazardous-air-pollutant-exposure-model-hapem.

2	We describe the use of HAPEM8 in the 2020 AirToxScreen in the ICF Memorandum "HAPEM8 Modeling for
the 2020 Air Toxics Screening Assessment (AirToxScreen)" dated December 4, 2023, to Matt Woody, Rod
Truesdell, and Michael Moeller of EPA's Office of Air Quality Planning and Standards.

1. Introduction to HAPEM and its Use in AirToxScreen

HAPEM is a model used by EPA to perform screening-level assessments of long-term
inhalation exposures to hazardous air pollutants (HAPs). Exposure concentrations output
by HAPEM are stratified by location (i.e., U.S. census tract), age group, and the individual
source categories and HAPs being modeled. The model's default files cover all 50 states
in the US, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands (USVI).

AirToxScreen uses HAPEM with these default files. Therefore, exposure concentrations
produced for the AirToxScreen have the same stratifications discussed above, though
AirToxScreen-specific post-processing includes accumulating exposure concentrations
into a lifetime period of exposure (defined as 70 years). AirToxScreen (the successor to
the National Air Toxics Assessment, or NATA) is a nationwide modeling assessment of air
concentrations, exposure concentrations, and potential human health cancer risks and
chronic hazards associated with exposure to HAP emissions from man-made and
naturally occurring sources. These results are spatially partitioned by various census
geographies. EPA models air concentrations using two air-concentration models:
AERMOD (the atmospheric dispersion model developed by the American Meteorological
Society and the EPA Regulatory Model Improvement Committee) and CMAQ (EPA's
Community Multiscale Air Quality model). Those modeled air concentrations are the "air
quality" inputs for HAPEM. AirToxScreen is not an enforcement tool to determine
compliance with various standards of emissions, air quality, or health impacts; rather, it is
a screening-level tool used to rank HAPs based on potential health impacts (nationally
and locally), estimate the numbers of people and demographics potentially subject to
health risks above levels of concern, identify gaps in data, and prioritize locations, source
categories, and HAPs to inform additional data collection and assessment.

Data on where people live and work, and otherwise how they spend their time, are critical
to the completeness of the exposure modeling conducted with HAPEM. The version of
HAPEM currently available for download (HAPEM7) uses census data from the year 2010
and activity patterns gleaned from the 2014 version of CHAD.3 We have updated the
default files used by HAPEM to reflect or approximate 2020 census data and the version
of CHAD available in April 2020. We also have updated HAPEM source code as necessary,
mostly to accommodate the sizes of the updated inputs.

3 The content, functionality, and implementation of HAPEM7 are discussed in the HAPEM7 User's Guide,
available as of April 26, 2021 at
https://www.epa.gov/fera/hazardous-air-pollutant-exposure-model-hapem-users-guides.

2. Updating Census-based Data

2.1. Population File - "population_HAPEM8.txt"

The HAPEM default population input file ("population_HAPEM8.txt" in HAPEM8) provides
the number of people in each HAPEM age group residing in each tract in the 50 states
plus the District of Columbia, Puerto Rico, and the USVI. The HAPEM default ages are
binned into six groups: 0-1, 2-4, 5-15, 16-17, 18-64, and 65 years and older.

HAPEM7: For the previous HAPEM model (HAPEM7), the population data were derived
from the 2010 census Summary File 1: Table PCT12 ("Sex by Age"), available separately for
males and females, and provided by each year of age. Population data for the USVI were
not available from Table PCT12, but they were available by querying the census American
FactFinder web page. For the purposes of HAPEM, the male and female data from Table
PCT12 were summed and then aggregated into the HAPEM age groups. The American
FactFinder USVI data were available by groups of ages that did not match the HAPEM
age groups. To fit the USVI age groups to the HAPEM age groups, it
was assumed that population counts were evenly distributed among the individual
years represented in the USVI 0-4-year group (i.e., two fifths being 0-1 and three fifths
being 2-4 years old) and in the 15-17 group (i.e., one third being 15 and two thirds being
16-17 years old); all other USVI age groups (e.g., 5-9, 10-14, 18-19, ..., 62-64, 65-66, ..., 85 and
over) required no subdivision to fit into the HAPEM age groups.

HAPEM8: For HAPEM8, we used the 2020 census' Table PCT12 ("Sex by Single-year Age")
to update the HAPEM population file for all areas except the USVI. We obtained
population data for the USVI from the 2020 census' Table PCT1 ("Sex by Single Years of
Age", U.S. Virgin Islands). We summed the population information across the two sexes
and aggregated the single-age data into the six default HAPEM age groups.
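
The summing and binning step described above can be sketched as follows. This is an illustrative Python example only (the file itself was prepared in a spreadsheet, as noted in the quality-assurance discussion); the function name and the sample counts are hypothetical.

```python
# The six default HAPEM age groups as (youngest, oldest) ages;
# None marks the open-ended oldest group.
AGE_GROUPS = [(0, 1), (2, 4), (5, 15), (16, 17), (18, 64), (65, None)]

def bin_ages(male_by_age, female_by_age):
    """Sum single-year-of-age counts across the two sexes and aggregate
    them into the six HAPEM age groups.

    Each input maps a single year of age to a population count.
    """
    totals = [0] * len(AGE_GROUPS)
    for by_age in (male_by_age, female_by_age):
        for age, count in by_age.items():
            for i, (youngest, oldest) in enumerate(AGE_GROUPS):
                if age >= youngest and (oldest is None or age <= oldest):
                    totals[i] += count
                    break
    return totals

# Hypothetical counts for one tract.
male = {0: 10, 3: 5, 16: 2, 70: 1}
female = {1: 4, 10: 6, 40: 20, 65: 3}
binned = bin_ages(male, female)
```

A simple consistency check is that the binned counts sum to the same grand total as the single-year counts, which is 51 people in this example.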

2.1.1. Quality Assurance

We checked that the HAPEM8 default population file contained all the expected census
geographies (i.e., all the 2020 tracts) by comparing against the 2020 census gazetteer
tract file4 (and tigerweb.geo.census for the USVI). We created the file using Microsoft®
Excel™, where we cross-checked our processing formulas to ensure individual ages were
accurately summed into the HAPEM age groups. We also compared the grand total of
those binned population numbers to the grand total of the raw census data of individual

4 As of August 2023, the census gazetteer files are available at

https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html.

ICF

A-5

HAPEM8 User's Guide
December 2023


-------
Appendix A

ages. Lastly, we compared the HAPEM8 population file against the HAPEM7 file to ensure
proper formatting.

2.1.2. Content and Format

The HAPEM8 population data are contained in a fixed-width, space-delimited text file
with characteristics shown in Table 1. The file contains seven columns and a total of
85,427 rows of data (after two header rows). Each data row corresponds to a tract, where
the first field identifies the tract using Census Federal Information Processing Series
(FIPS) coding5, and fields 2-7 contain population counts per age group. Population counts
are whole numbers (no commas separating thousands). The first header row labels the
fields, where the age-group columns are identified by the youngest age within the group
(i.e., B_00 for age group 0-1 years old, B_02 for age group 2-4, and so on). The second
header row serves an unknown purpose, but we retained it from the HAPEM7 population
file. In Figure 1 we show the first ten data rows of the population file. On the whole, the
HAPEM8 total tract populations range from 0 (for 617 tracts across 42 states and
territories, which is less than 1 percent of all tracts) to 37,892, with an average of 3,919.
The total population in this file is 334,822,301.

Table 1. Characteristics of the HAPEM8 Population File

Variable	Description	Character Start Position on Data Row	Character Length on Data Row (a)

TRACT	Full census FIPS code for home tract	1	11

B_00	Total population ages 0-1 years	17	8

B_02	Total population ages 2-4	25	8

B_05	Total population ages 5-15	33	8

B_16	Total population ages 16-17	41	8

B_18	Total population ages 18-64	49	8

B_65	Total population ages 65 and older	57	8

Note: FIPS = Census Federal Information Processing Series

a Any unused character space after a number and/or between fields consists of blank spaces.
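
As a hedged illustration of how the layout in Table 1 might be read, the sketch below slices one data row by start position and length. The names `POPULATION_FIELDS` and `parse_population_row` are hypothetical, and the code assumes the two header rows have already been skipped.

```python
# (start position, length) for each field, taken from Table 1; positions are 1-based.
POPULATION_FIELDS = {
    "TRACT": (1, 11),
    "B_00": (17, 8),
    "B_02": (25, 8),
    "B_05": (33, 8),
    "B_16": (41, 8),
    "B_18": (49, 8),
    "B_65": (57, 8),
}


def parse_population_row(line):
    """Return one tract record from a fixed-width data row of the population file."""
    record = {}
    for name, (start, length) in POPULATION_FIELDS.items():
        text = line[start - 1:start - 1 + length].strip()
        # Keep the tract identifier as text so that leading zeros are preserved.
        record[name] = text if name == "TRACT" else int(text)
    return record
```

The tract identifier is kept as a string because FIPS codes for some states begin with 0.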

5 The full tract identifier used by census consists of a 2-digit state code, a 3-digit county code, and a 6-digit
tract code, concatenated together to form an 11-digit code.


TRACT           B_00    B_02    B_05    B_16    B_18    B_65
                COM     COM     COM     COM     COM     COM
01001020100     30      67      252     56      1086    284
01001020200     33      78      283     77      1268    316
01001020300     60      109     471     91      1887    598
01001020400     81      137     596     [illegible] 2383    961
01001020501     75      115     625     138     2599    770
01001020502     84      144     569     105     2090    292
01001020503     81      120     542     104     2188    581
01001020600     77      141     642     101     2198    570
01001020700     91      161     491     86      2105    475
01001020801     44      104     487     131     1861    516

Figure 1. Excerpt from the HAPEM8 Population File

2.2. Commuting-flow File - "commute_flow_HAPEM8.txt"

In HAPEM, the tract where a person resides is their home tract, and the tract where a
person works is their work tract. Some people work within their home tract (i.e., the work
tract is the home tract); the remaining employed people work outside their home tract.
For the employed people in each home tract, the HAPEM default commuting-flow input
file ("commute_flow_HAPEM8.txt" in HAPEM8) provides the fraction of those people who
work within their home tract and the fraction that commute to work in each other tract.
For each home tract, the file contains only the tract(s) where residents of the home tract
work (i.e., there are no fractions of 0). These commuting data are provided for nearly all
the (home) tracts contained in the HAPEM population file, with exceptions noted in the
discussion below.

HAPEM7. For the previous HAPEM model (HAPEM7), the commuting-flow data were
derived from data provided by the U.S. Department of Transportation (DOT) Federal
Highway Administration (FHWA)—specifically, their Microsoft® Access™-based Census
Transportation Planning Products (CTPP) 2006-2010 file, based on 2006-2010 five-year
summary data from the ACS and commissioned by the American Association of State
Highway and Transportation Officials (AASHTO). This Access database contains
estimates of the total number of workers commuting within or between tracts.

HAPEM8: For the HAPEM8 commuting-flow file, we used the FHWA CTPP data based on
the 2012-2016 five-year summary data from the ACS (the most current available).6 The

6 As of August 2023, the 2012-2016 CTPP data are available at https://ctpp.transportation.org/2012-2016-5-year-ctpp/.


data are available by state, and we downloaded the state files and concatenated them
into the overall commuting-flow file.

Because the 2012-2016 CTPP data use the geography (census tracts) of the 2010 census,
we mapped the data to 2020 census tracts using a 2020 relationship file made available
by the Census Bureau.7 The relationship file provides a one-to-many crosswalk from 2010
tracts to 2020 tracts, indicating the surface area of overlap between the two vintages of
tracts. We used proportion of total overlapping tract area (sum of land area and water
area) to redistribute the CTPP commuter data to the 2020 tracts.
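
The area-weighted redistribution can be sketched as below. This is an illustration under stated assumptions, not the processing actually used: the function names and input layouts are hypothetical, and the choice to apply the area weights to both the home tract and the work tract of each flow is an assumption that this guide does not confirm.

```python
from collections import defaultdict


def area_weights(overlaps):
    """Return {tract_2010: {tract_2020: share of the 2010 tract's overlapping area}}.

    `overlaps` is a list of (tract_2010, tract_2020, overlap_area) tuples, where
    overlap_area is land area plus water area (hypothetical input layout).
    """
    totals = defaultdict(float)
    for tract_2010, _, area in overlaps:
        totals[tract_2010] += area
    weights = defaultdict(dict)
    for tract_2010, tract_2020, area in overlaps:
        if totals[tract_2010] > 0:
            share = area / totals[tract_2010]
            weights[tract_2010][tract_2020] = weights[tract_2010].get(tract_2020, 0.0) + share
    return weights


def redistribute_flows(flows_2010, weights):
    """Map {(home_2010, work_2010): workers} onto 2020 tract pairs.

    Assumption: the same area weights are applied to both ends of each flow.
    """
    flows_2020 = defaultdict(float)
    for (home_2010, work_2010), workers in flows_2010.items():
        for home_2020, home_share in weights.get(home_2010, {}).items():
            for work_2020, work_share in weights.get(work_2010, {}).items():
                flows_2020[(home_2020, work_2020)] += workers * home_share * work_share
    return dict(flows_2020)
```

Because the weights for each 2010 tract sum to 1, the total number of workers is preserved by the redistribution.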

To produce the commuter fractions, we divided the number of workers in each home-
tract/work-tract pair by the total number of workers residing in the home tract. We
calculated the distance between each home-tract/work-tract pair using the 2020
census coordinates of tract internal points (i.e., centroids), available from the 2020
census gazetteer. More specifically, we used the distGeo function in the geosphere
package of R8 to calculate the distance between the internal-point tract coordinates.
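
The two calculations in this paragraph can be sketched as follows. Note that the distance function here is a spherical haversine approximation swapped in for illustration; it is not the ellipsoidal geodesic computed by distGeo, so results can differ slightly. Function names and input layouts are hypothetical.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def commuting_fractions(flows):
    """Divide each home-tract/work-tract flow by the home tract's total workers.

    `flows` maps (home_tract, work_tract) to a number of workers.
    """
    totals = defaultdict(float)
    for (home, _work), workers in flows.items():
        totals[home] += workers
    return {
        pair: workers / totals[pair[0]]
        for pair, workers in flows.items()
        if totals[pair[0]] > 0
    }
```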

A small number of tracts were absent from the CTPP data as home tracts (totaling 920
tracts, 1 percent of all tracts; 469 of these were unpopulated, while 451 were populated).
HAPEM will model each missing tract as if all its employed residents work within the home
tract (i.e., for the purposes of HAPEM modeling, they essentially do not commute), so we
did not insert any data for these missing tracts. Additionally, the CTPP contained no data
for any of the 32 tracts in the USVI. To prevent the USVI from being conspicuously missing from
the commuting file, we inserted one record for each USVI tract, where work tract equals
home tract and the commute distance is 0 kilometers (km), which is how HAPEM would
model them if they remained missing from the file.

2.2.1. Quality Assurance

We ensured that the data downloaded from the CTPP website matched that obtained
from the CTPP online queries, by randomly checking four tracts from five different states.
We confirmed the numbers of home and work tracts at various stages of the analysis. We
also ensured the accuracy of the commuting fractions, including the use of the
relationship file to estimate flows between the 2020 census tracts, through a thorough
check of the calculations for Alaska. We ensured that the cumulative commuting fraction
equaled 1 for each home tract (with an allowance for very small rounding errors). We used

7	As of August 2023, the census relationship files are available at
https://www.census.gov/geographies/reference-files/time-series/geo/relationship-files.2020.html#tract.

8	As of August 2023, the R geosphere package files are available at
https://www.rdocumentation.org/packages/geosphere/versions/1.5-18.


mapping software to check a small number of the commuting distances calculated by
the distGeo function.
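
A check that the commuting fractions sum to 1 for each home tract could look like the sketch below. The function name and the tolerance are assumptions; the tolerance is chosen loosely because fractions stored to eight decimal places can accumulate small rounding errors over many work tracts.

```python
from collections import defaultdict


def tracts_with_bad_fraction_sums(records, tolerance=1e-5):
    """Return the home tracts whose commuting fractions do not sum to 1.

    `records` is an iterable of (home_tract, work_tract, fraction) tuples.
    The default tolerance is an illustrative assumption, not a documented value.
    """
    sums = defaultdict(float)
    for home, _work, fraction in records:
        sums[home] += fraction
    return sorted(home for home, total in sums.items() if abs(total - 1.0) > tolerance)
```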

2.2.2. Content and Format

The HAPEM8 commuting-flow data are contained in a fixed-width, space-delimited text
file with characteristics shown in Table 2. The file contains five columns (the first being
empty) and a total of 6,004,343 rows of data with no header rows. Each data row
corresponds to a unique home-tract/work-tract pair, where the second and third fields
respectively contain the home and work tract identifiers using FIPS coding, and the fourth
and fifth fields respectively contain the commuting distance (in km) and the fraction of
workers commuting between the associated home and work tracts. Distance values are
presented to no more than two decimal places (i.e., hundredths of km, which is tens of
meters), while commuting fractions are presented to no more than eight decimal places.
In Figure 2 we show the first ten data rows of the commuting-flow file. On the whole, the
data show on average there are 71 work tracts per home tract, up to a maximum of 313
work tracts. In 351 home tracts (which is less than 1 percent of home tracts), all workers
worked within their home tract.

Table 2. Characteristics of the HAPEM8 Commuting-flow File

Field Number	Description	Character Start Position on Data Row	Character Length on Data Row (a)

1	Leading space in file	1	1

2	Full census FIPS code for home tract	2	11

3	Full census FIPS code for work tract	14	11

4	Distance in kilometers between home and work tract	26	8

5	Fraction of workers in the home tract commuting to the work tract	34	10

Note: FIPS = Census Federal Information Processing Series

a Any unused character space after a number and/or between fields consists of blank spaces.


 01001020100 01001020100 0.00    0.03045067
 01001020100 01001020803 6.98    0.00365408
 01001020100 01001020200 1.92    0.04263094
 01001020100 01001020300 3.12    0.04872107
 01001020100 01001020400 4.56    0.01827040
 01001020100 01001020501 7.51    0.07308161
 01001020100 01001020502 7.07    0.02436054
 01001020100 01001020503 6.25    0.03654080
 01001020100 01001020600 [illegible] 0.02436054
 01001020100 01001020700 7.52    0.06090134

Figure 2. Excerpt from the HAPEM8 Commuting-flow File

Commuting distances greater than 120 km are assumed in HAPEM to be very atypical for
a daily commuter. As noted in the HAPEM User's Guide,1 during an earlier development
stage of HAPEM, commuting flows were examined as a function of distance. The analysis
revealed that commute flows generally decreased linearly in log space with increasing
distance, but at commute distances greater than about 100 km that trend flattened. This
suggested that those longer commutes likely did not occur daily. Since HAPEM is
designed to construct daily commutes for simulated workers, it would not be appropriate
for HAPEM to model daily commutes longer than about 120 km, and thus HAPEM ignores
these longer commutes in constructing the commute distance distributions for each
tract. Most home tracts have at least one work tract that is more than 120 km away; that
is, in approximately 64 percent of home tracts there is at least one person residing there
who commutes farther than 120 km. However, this affects only 3 percent of home-
tract/work-tract pairs. Ignoring these records with commuting distances greater than 120
km, the average tract-to-tract distance is 22.5 km (weighting all tract pairs equally, not
by numbers of people performing those commutes; that average is 42.1 km when
commuting distances greater than 120 km are included).
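
The sketch below illustrates how rows of the commuting-flow file might be read using the positions in Table 2, and how an unweighted average tract-to-tract distance could be computed with and without the records longer than 120 km. The function names are hypothetical, and this is not HAPEM's own code; in particular, it does not attempt to reproduce how HAPEM builds its commute-distance distributions.

```python
MAX_DAILY_COMMUTE_KM = 120.0  # longer commutes are ignored, as described above


def parse_flow_row(line):
    """Split one data row using the Table 2 start positions (1-based) and lengths."""
    home = line[1:12]                 # start 2, length 11
    work = line[13:24]                # start 14, length 11
    distance_km = float(line[25:33])  # start 26, length 8
    fraction = float(line[33:43])     # start 34, length 10
    return home, work, distance_km, fraction


def mean_pair_distance(rows, max_km=None):
    """Average distance over tract pairs, weighting every pair equally.

    Pairs with a distance greater than `max_km` are dropped when it is given.
    """
    distances = [d for _home, _work, d, _fraction in rows if max_km is None or d <= max_km]
    return sum(distances) / len(distances) if distances else 0.0
```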

2.3. Commuting-time File - "commute_time_HAPEM8.txt"

While the HAPEM commuting-flow file (see Section 2.2) contains information on the
frequency distribution of commuting distances for workers in a given home tract, the
HAPEM commuting-time file ("commute_time_HAPEM8.txt" in HAPEM8) contains
information on the method of commuting (public versus private transit) and the average
commuting time per person. These commuting-time data are provided for all the tracts
contained in the HAPEM population file, though no commuting data were available for the
USVI, as discussed below.


HAPEM7. For the previous HAPEM model (HAPEM7), the commuting-time file data were
derived for 2010 from the 2006-2010 five-year summary data from the ACS Tables
B08301 ("Means of Transportation to Work for Workers 16+ Years"), C08134 ("Means of
Transportation to Work by Travel Time to Work for Workers 16+ Years who Did Not Work
at Home"), and C08136 ("Aggregate Travel Time to Work (in Minutes) by Means of
Transportation to Work for Workers 16+ Years who Did Not Work at Home").

HAPEM8: For the HAPEM8 commuting-time file, relative to the HAPEM7 file, we identified
equivalent data for the year 2020 from other tables from the ACS 2016-2020 five-year
summary data, as detailed in the following paragraphs.

Table B08134 ("Means of Transportation to Work by Travel Time to Work for Workers 16+
Years who Did Not Work at Home") contains the numbers of people commuting to work,
irrespective of commuting time, for specific means of transit in broader groups than in
Table B08301 that was used for HAPEM7. We used Table B08134 to derive the proportion
of workers traveling by public transit (i.e., bus, trolley bus, streetcar, trolley car, subway,
elevated train, railroad, and ferryboat) and the proportion of commuters traveling by
private transit (i.e., car, truck, van, taxicab, motorcycle, bicycle, any other non-public
means except walking). People working from home (i.e., workers not commuting) were not
included in this dataset. We excluded people walking to work, which are cases where we
assume people work within their home tract and thus are not considered commuters for
the purposes of HAPEM exposure modeling. As such, the fractions of workers commuting
by public and private transit sum to 1, except a relatively small number of tracts
(approximately 1,064, or 1 percent of all tracts) where the survey recorded no commuting
activity.

ACS Table B08136 ("Aggregate Travel Time to Work (in Minutes) by Means of
Transportation to Work for Workers 16+ Years who Did Not Work at Home") contains
travel times to work by the same transit means as in Table B08134, summed across all
people who use those means. We divided these aggregate travel times by the
corresponding population counts from Table B08134, resulting in average per-person
travel times to work, by public transit and by private transit. We then multiplied the
average per-person travel times by two to derive the round-trip time used in the HAPEM8
commuting-time file. Commuting times related to public transit include time spent
waiting at a bus or train stop, and commuting times (and population counts from Table
B08134) related to private transit include walking commuters; these times are included in
our calculations because they cannot be disaggregated from the total commuting time. If
the data derived from Table B08134 (used for the proportions of workers commuting by
public and private means) indicated that a tract had no commuters using public means,


then we set commuting times to 0 for public means; similarly, we set private commuting
times to 0 if there were no private commuters, and we set both public and private
commuting times to 0 if there were no commuters at all.
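
The per-tract calculation described above can be sketched as follows, assuming the public and private commuter counts and the aggregate one-way travel times are already available for a tract. The function name and argument layout are hypothetical, and returning proportions of 0 for a tract with no commuters is an assumption consistent with the statement that the proportions do not sum to 1 in such tracts.

```python
def commuting_time_record(public_count, private_count, public_minutes, private_minutes):
    """Return (public share, private share, public round trip, private round trip).

    Counts are numbers of commuters; minutes are aggregate one-way travel times.
    Round-trip times are in minutes per person.
    """
    total = public_count + private_count
    if total == 0:
        # No commuters at all: every value is set to 0.
        return 0.0, 0.0, 0.0, 0.0

    def round_trip(aggregate_minutes, count):
        # Average one-way time per person, doubled; 0 when nobody uses this mode.
        return 2.0 * aggregate_minutes / count if count > 0 else 0.0

    return (
        public_count / total,
        private_count / total,
        round_trip(public_minutes, public_count),
        round_trip(private_minutes, private_count),
    )
```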

Commuting data for the USVI were not available from the ACS, so we set all their workers
to work in their home tract (i.e., commute neither by public nor private transit, with
commuting times equal to 0). This is consistent with how we approached USVI data in the
commuting-flow file.

Aggregate commuting-time data also were unavailable from Table B08136 (either missing
entirely from the table, or present in the table but with flags [orvalue entries]
indicating a lack of reliable data) for 87 percent9 of tracts in areas outside the USVI. We
used county-average aggregate times for 68 percent of these missing tracts (i.e., for 32
percent of all tracts outside of the USVI) and state times for the remaining 32 percent of
missing tracts (i.e., for 5 percent of all tracts outside of the USVI). We divided those
county and state aggregate times by the county and state counts of commuters to
produce average, per-person, one-way commuting times, and we multiplied by two to
obtain round-trip times. We stratified these county and state averages by public- and
private-transit means.
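
The tract, county, and state fallback can be sketched as below for a single transit mode. The function name and the lookup-table layout are hypothetical; the county and state keys rely on the first five and first two digits of the tract FIPS code, respectively.

```python
def round_trip_minutes(tract_fips, tract_commuters, tract_aggregate, county_data, state_data):
    """Return (average round-trip minutes, level of data used) for one transit mode.

    `tract_aggregate` is the tract's aggregate one-way minutes, or None when missing
    or flagged. `county_data` and `state_data` map FIPS prefixes to
    (aggregate one-way minutes or None, commuter count).
    """
    if tract_commuters == 0:
        return 0.0, "no commuters"
    if tract_aggregate is not None:
        return 2.0 * tract_aggregate / tract_commuters, "tract"
    lookups = (
        ("county", county_data, tract_fips[:5]),  # 2-digit state + 3-digit county
        ("state", state_data, tract_fips[:2]),    # 2-digit state
    )
    for level, table, key in lookups:
        aggregate, commuters = table.get(key, (None, 0))
        if aggregate is not None and commuters > 0:
            return 2.0 * aggregate / commuters, level
    return 0.0, "unavailable"
```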

For the State of Wyoming, although state time aggregates had a null value, values were
available for some counties. To derive a state-level aggregate, we summed values across
all counties. We used this sum as the state-level value in cases of missing tract- and
county-level aggregates in Wyoming.

2.3.1. Quality Assurance

We checked that the HAPEM8 default commuting-time file contained all the expected
census geographies (i.e., all the 2020 tracts) by comparing against the default population
file (see Section 2.1). We spot-checked several very different tracts (e.g., rural Alaska, city
in Alaska, Queens County in New York City) to ensure that the ACS data pulled into our
Excel processing file matched the raw data displayed on the ACS website. We checked
each of our Excel processing formulas, including aggregations across census transit
types, the calculations of county- and state-average data, and the compilation of those
data into a complete set of tract data. We ensured that the public and private
commuting proportions summed to 1 for every record except the tracts with 0
commuters. For consistency, we confirmed that tracts with commuting workers (from the

9 It was unclear why a large percentage of these data were missing or marked as insufficient.


HAPEM8 default commuting-fraction file, discussed later in Section 2.4) had non-0
commuting-time values in the final file.

2.3.2. Content and Format

The HAPEM8 commuting-time data are contained in a tab-delimited text file with
characteristics shown in Table 3. The file contains five columns and a total of 85,427 rows
of data with no header rows. Each row corresponds to a tract, where the first field
contains the tract identifier using census FIPS coding, the second and third fields
respectively contain the proportion of commuters who travel by public transit (excluding
taxicabs) and by private transit (including taxicabs), and the fourth and fifth fields
respectively contain the average round-trip times (in minutes) commuting to work by
public transit and by private transit. All values in fields 2-5 are displayed to four decimal
places. In Figure 3 we show the first ten data rows of the commuting-time file. On the
whole (except the USVI), the data show that 86 percent of commuters used private
transit, and all commuters in 40 percent of census tracts used private transit. The
conditional-average round-trip commute (i.e., averaging only non-zero values) was 53
minutes for private transit and 100 minutes for public transit. This statistic
treats every tract equally, rather than weighting by commuting population, and it includes
county and state averages where we used them. The longest round-trip commuting times
in the data set are 163 minutes for private transit and 336 minutes for public transit.


Table 3. Characteristics of the HAPEM8 Commuting-time File

Field Number Description

1	Full census FIPS code for home tract

2	Proportion of workers commuting outside of the home by public transit

3	Proportion of workers commuting outside of the home by private transit

4	Average round-trip commuting time for workers commuting outside of the home by
public transit

5	Average round-trip commuting time for workers commuting outside of the home by
private transit

Note: The position where table values begin and the number of characters per value are not relevant in a tab-
delimited format.

01001020100	0.0000	1.0000	0.0000	50.8320
01001020200	0.0000	1.0000	0.0000	50.8320
01001020300	0.0000	1.0000	0.0000	49.5615
01001020400	0.0377	0.9623	89.9083	50.8320
01001020501	0.0166	0.9834	89.9083	50.8320
01001020502	0.0000	1.0000	0.0000	50.8320
01001020503	0.0000	1.0000	0.0000	50.8320
01001020600	0.0000	1.0000	0.0000	50.8320
01001020700	0.0000	1.0000	0.0000	50.8320
01001020801	0.0000	1.0000	0.0000	50.8320

Figure 3. Excerpt from the HAPEM8 Commuting-time File

2.4. Commuting-fraction File - "commute_fraction_HAPEM8.txt"

The HAPEM commuting-fraction file ("commute_fraction_HAPEM8.txt" in HAPEM8)
contains the fraction of workers in each tract who commute to work and the fraction who
do not commute, stratified by age group. Workers who walk to work are not included as
commuters for HAPEM8.

HAPEM7. The HAPEM7 commuting-fraction data were derived for 2010 from the 2006-
2010 five-year summary data from the ACS—specifically, ACS Table B23001 ("Sex by Age
by Employment Status for the Population 16 Years and Over") and ACS Table B08101
("Means of Transportation to Work by Age for Workers 16+ Years"). HAPEM7 included
Armed Forces members but did not include those walking to work.

HAPEM8: For the HAPEM8 commuting-fraction file, relative to the HAPEM7 file, we
identified equivalent data for 2020 from Table B08101 of the ACS 2016-2020 five-year


summary data, not including those walking to work. Detailed calculation methods are
discussed in the following paragraphs.

ACS Table B08101 ("Means of Transportation to Work by Age for Workers 16+ Years")
contains the numbers of people per age group commuting to work by various means of
transit (e.g., "Total", "Car, truck, or van: Drove alone", "Car, truck, or van: Carpooled",
"Public Transportation (excluding taxicab)"). We used this table to derive 1) the numbers
of workers who commuted by means other than walking and 2) the number of people per
HAPEM age group who are workers. As we did in calculating the proportion of workers
commuting by public and private transit (see Section 2.3), we excluded people walking to
work because they likely work within their home tract, and for simplicity we consider
them not to be commuters in HAPEM.

For each tract and HAPEM age group, we calculated the fraction of workers commuting as
(number of people aged 16+ years who commute to work other than by walking) divided by
(number of workers aged 16+ years). The fraction of workers not commuting is 1 minus the
above fraction.
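
A minimal sketch of this calculation for one tract and age group follows. The function name is hypothetical, and the treatment of an age group with no workers (returning 1 for "does not commute" and 0 for "commutes", as is done for the age groups younger than 16 years) is an assumption rather than a documented rule.

```python
def commuting_fraction_pair(commuters_not_walking, workers):
    """Return (does-not-commute fraction, commutes fraction), each to four decimal places."""
    if workers == 0:
        # Assumption: with no workers, treat the group like the under-16 age groups.
        return 1.0, 0.0
    commutes = round(commuters_not_walking / workers, 4)
    # Derive the complement from the rounded value so the pair sums to 1.
    return round(1.0 - commutes, 4), commutes
```

Deriving the second value from the rounded first value is a design choice made here so that the two fractions sum to 1 at four decimal places.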

Commuting data for the USVI were not available from the ACS, so we set data in the
commuting-fraction file such that all workers in the USVI work in their home tract (i.e., did
not commute). This is consistent with how we treated USVI data in the commuting-flow
and commuting-time files (see Sections 2.2 and 2.3, respectively).

2.4.1. Quality Assurance

We performed systematic data processing using R. As a thorough check, we also
repeated the processing in Excel (performed by a different person from the author of the R code),
finding that both methods of processing resulted in the same values. We checked that all
commuting-fraction numbers were between 0 and 1. We ensured that the fractions of
workers in each age group commuting and not commuting summed to 1 for every record.
We compared the HAPEM8 and HAPEM7 files to ensure proper layout.

2.4.2. Content and Format

The HAPEM8 commuting-fraction data are contained in a tab-delimited text file with
characteristics shown in Table 4. The file contains 13 columns and a total of 85,427 rows
of data with no header rows. Each row corresponds to a tract, where the first field
contains the tract identifier using census FIPS coding, the second and third fields
respectively contain the fraction of workers aged 0-1 years who do not commute and
who do commute, and the remaining fields show the same data for each of the other five
HAPEM age groups. All values in fields 2-13 are displayed to four decimal places. Nobody
younger than 16 years is considered employed and a commuter, so all values for "does


not commute to work" are 1 and all values for "commutes to work" are 0 for the first three
HAPEM age groups. In Figure 4 we show the first ten data rows of the commuting-fraction
file. On the whole (except the USVI), the data show that the average tract commuting
fraction is 0.80 (80 percent of workers commuting) for ages 16-17 years, 0.89 for ages
18-64 years, and 0.83 for 65+ years. This statistic treats every tract equally, rather than
weighting by commuting population.


Table 4. Characteristics of the HAPEM8 Commuting-fraction File

Field Number Description

1	Full census FIPS code for home tract

2	Proportion of age group 1 (ages 0-1 years) that does not commute to work

3	Proportion of age group 1 (ages 0-1) that commutes to work

4	Proportion of age group 2 (ages 2-4) that does not commute to work

5	Proportion of age group 2 (ages 2-4) that commutes to work

6	Proportion of age group 3 (ages 5-15) that does not commute to work

7	Proportion of age group 3 (ages 5-15) that commutes to work

8	Proportion of age group 4 (ages 16-17) that does not commute to work

9	Proportion of age group 4 (ages 16-17) that commutes to work

10	Proportion of age group 5 (ages 18-64) that does not commute to work

11	Proportion of age group 5 (ages 18-64) that commutes to work

12	Proportion of age group 6 (ages 65 and older) that does not commute to work

13	Proportion of age group 6 (ages 65 and older) that commutes to work

Note: The position where table values begin and the number of characters per value are not relevant in a tab-
delimited format.

01001020100 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0101 0.9899
0.0000 1.0000

01001020200 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0149 0.9851
0.0000 1.0000

01001020300 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0909 0.9091 0.0334 0.9666
0.0000 1.0000

01001020400 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0379 0.9621
0.1374 0.8626

01001020501	1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0589 0.9411
0.0000 1.0000

01001020502	1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0051 0.9949
1.0000 0.0000

01001020503	1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0896 0.9104
1.0000 0.0000

01001020600 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0717 0.9283
0.1000 0.9000

01001020700 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0178 0.9822
0.0000 1.0000

01001020801 1.0000 0.0000 1.0000 0.0000 1.0000 0.0000 0.1304 0.8696 0.0917 0.9083
0.0000 1.0000

Note: Contents wrap around due to space constraints in this figure.

Figure 4. Excerpt from the HAPEM8 Commuting-fraction File


2.5. Distance-to-road File - "proximity_road_HAPEM8.txt"

The HAPEM distance-to-road file ("proximity_road_HAPEM8.txt" in HAPEM8) contains
information on the fraction of a tract's residents that live within each of three categories
of distance from a major roadway, by age group. These distances are 0-75 m, greater
than 75 m up to 200 m, and greater than 200 m. The file contains these data for all tracts
in the HAPEM8 population file. We conducted the proximity assessment at the level of
census blocks and stratified by age and sex, and then we aggregated the block-level
results up to the tract level and stratified only by age group.

We used block-level geographies from the 2020 Census TIGER/Line Shapefiles10 for all
areas except the USVI. We used block-level population data from the 2020 census' Table
P12 ("Sex by Age for Selected Age Categories"). For the USVI, we used 2020 tract-level
geographies and population data from the 2020 census' Table PCT1 ("Sex by Single Years
of Age", U.S. Virgin Islands). We downloaded the geometries and demographic data
separately before joining them into a single table in a PostGIS server.

We compiled roadway location data from the 2022 Census TIGER/Line "All Roads" U.S.
roadway layer. We considered the three roadway types shown in Table 5 to be major
roads for the purposes of evaluating enhanced pollutant exposure to people living near
heavy-use roads, assuming that other features such as traffic circles, cul-de-sacs, local
or neighborhood roads, rural roads, and city streets do not meet the definition.

Table 5. Types of "Major" Roads Included in the Roadway-proximity Assessment

Roadway Type	Definition

Primary Road	Generally divided, limited-access highways within the interstate highway system or under state management, and distinguished by the presence of interchanges. Accessible by ramps and may include some toll highways.

Ramp	Allows controlled access from adjacent roads onto a limited-access highway, often in the form of a cloverleaf interchange.

Secondary Road	Main arteries, usually in the U.S., state, or county highway systems. Have one or more lanes of traffic in each direction, may or may not be divided, and usually have at-grade intersections with many other roads and driveways.

We used Postgres software utilizing PostGIS to perform the steps noted below for the
roadway-proximity geospatial analyses.

10 As of August 2023, the U.S. Census TIGER/Line data are available at https://www.census.gov/geo/maps-data/data/tiger-line.html.


1.	We created 75- and 200-m buffers around all major roadways. We clipped these
buffers at the boundaries of census blocks, such that no buffer crossed a block
boundary.

2.	We assumed uniform population across blocks and used area analysis to calculate the
ratio of each block area within each buffer. For each block, we calculated the fraction
of the area that was within the 75-m buffer, the fraction that was within the 200-m
buffer (subtracting the 75-m portion to create results for the 75-to-200-m distance),
and the fraction that was outside the 200-m buffer. We calculated the ratios of (block
area that fell within each of the major-roadway buffers) divided by (total block area).

3.	For each block and buffer, we multiplied the ratio from Step 2 above by the block
population count per sex and age group. These are the numbers of people residing
0-75 m, greater than 75 m up to 200 m, and greater than 200 m from a major roadway,
at the block level and stratified by sex and age.

4.	We aggregated the data from Step 3 above to the tract level and summed together
the male and female data. We then divided the population counts within the major-
roadway buffers by the total tract population, stratified by each of the six HAPEM age
groups. The result for each age group is the fraction of residents who live within each
of the three distance buffers of a major roadway.
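
Steps 2 through 4 above can be sketched in simplified form as follows, starting from block-level area fractions that are assumed to have been computed already by the geospatial buffering. This is an illustration only (the actual work was done in Postgres with PostGIS); the function name and the dictionary layout of each block are hypothetical.

```python
from collections import defaultdict


def tract_proximity_fractions(blocks):
    """Aggregate block-level results to tract-level fractions per age group.

    Each block is a dict with keys:
      'tract': 11-digit tract FIPS code,
      'area_fractions': three shares of block area (0-75 m, >75-200 m, >200 m),
      'population': {(sex, age_group): count}.
    Returns {tract: {age_group: [three fractions]}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0.0, 0.0, 0.0]))
    for block in blocks:
        for (_sex, age_group), people in block["population"].items():
            # Step 3: uniform population within the block, so people follow area.
            for i, share in enumerate(block["area_fractions"]):
                counts[block["tract"]][age_group][i] += people * share
    result = {}
    for tract, by_age in counts.items():
        result[tract] = {}
        for age_group, per_buffer in by_age.items():
            # Step 4: sexes are already summed; divide by the age group's tract total.
            total = sum(per_buffer)
            result[tract][age_group] = [c / total if total > 0 else 0.0 for c in per_buffer]
    return result
```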

2.5.1. Quality Assurance

We implemented several layers of QA with multiple staff members at different stages of
the processing. A major focus was on calculations performed in Step 4 above (i.e., the
final steps of processing population data and aggregating to the tract level). We reviewed
the block-level population data to ensure they were complete, and we reviewed our
processed block-level results to ensure they included all blocks nationwide.

We checked that the major-roadway buffer ratios from Step 3 summed to 1 for every
block (and in Step 4 summed to 1 for every tract). In this process, we implemented post-
processing algorithms to remove rounding errors so that fractions summed to 1 where
appropriate (when processed at 4 decimal places).
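
The post-processing algorithms themselves are not described in this guide. One possible approach, shown here purely as an assumption, is a largest-remainder adjustment: truncate each fraction to four decimal places and then give the leftover ten-thousandths to the values that lost the most in truncation.

```python
def round_to_unit_sum(fractions, places=4):
    """Round non-negative fractions so that they sum to exactly 1 at `places` decimals.

    Assumes the inputs sum to 1 up to floating-point error. A set of all zeros
    (e.g., a tract with no residents) is returned as zeros.
    """
    if not any(fractions):
        return [0.0] * len(fractions)
    scale = 10 ** places
    scaled = [f * scale for f in fractions]
    floors = [int(s) for s in scaled]
    shortfall = scale - sum(floors)
    # Hand out the missing units to the largest truncated remainders first.
    order = sorted(range(len(fractions)), key=lambda i: scaled[i] - floors[i], reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return [value / scale for value in floors]
```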

We spot-checked that the processed tract population data summed to the correct
state-total populations and summed correctly across age groups. We also noted that we
should not always expect the fraction of tract area within the individual major-roadway
buffers to equal the fraction of tract population within the buffers. This is because we
performed the assessment at the block level and then aggregated to the tract level,
where each block has a unique population density that makes aggregated populations
unequal to aggregated areas.

We also discovered that HAPEM8 throws an error if any age group in a tract has all its
population living within 75 m of a major roadway. This happened with a single tract, and
we worked around the model error by setting 99.98% (0.9998) of that age group as living
in that buffer, with 0.01% (0.0001) living in the second buffer and 0.01% (0.0001) living
in the third buffer.

2.5.2. Content and Format

The HAPEM8 distance-to-road data are contained in a tab-delimited text file with
characteristics shown in Table 6. The file contains 22 columns and a total of 85,427 rows
of data with no header rows. Each row corresponds to a tract, where the first field
contains the tract identifier using census FIPS coding, fields 2-4 contain the fractions of
tract area within each of the three roadway buffers, and the remaining fields show similar
data for the fractions of people in each HAPEM age group who reside within those buffers.
All values in fields 2-22 are displayed to four decimal places. The population fractions in
tracts with 0 residents are shown as 0 values. In Figure 5 we show the first ten data rows
of the distance-to-road file.

Table 6. Characteristics of the HAPEM8 Distance-to-road File

Field Number	Description
1	Full census FIPS code for home tract
2	Proportion of tract area located 0-75 m from major roadway
3	Proportion of tract area located beyond 75 m of major roadway, up to 200 m
4	Proportion of tract area located beyond 200 m of major roadway
5	Proportion of age group 1 (ages 0-1 years) residing 0-75 m from major roadway
6	Proportion of age group 1 (ages 0-1) residing > 75 m of major roadway, up to 200 m
7	Proportion of age group 1 (ages 0-1) residing > 200 m of major roadway
8	Proportion of age group 2 (ages 2-4) residing 0-75 m from major roadway
9	Proportion of age group 2 (ages 2-4) residing > 75 m of major roadway, up to 200 m
10	Proportion of age group 2 (ages 2-4) residing > 200 m of major roadway
11	Proportion of age group 3 (ages 5-15) residing 0-75 m from major roadway
12	Proportion of age group 3 (ages 5-15) residing > 75 m of major roadway, up to 200 m
13	Proportion of age group 3 (ages 5-15) residing > 200 m of major roadway
14	Proportion of age group 4 (ages 16-17) residing 0-75 m from major roadway
15	Proportion of age group 4 (ages 16-17) residing > 75 m of major roadway, up to 200 m
16	Proportion of age group 4 (ages 16-17) residing > 200 m of major roadway
17	Proportion of age group 5 (ages 18-64) residing 0-75 m from major roadway
18	Proportion of age group 5 (ages 18-64) residing > 75 m of major roadway, up to 200 m
19	Proportion of age group 5 (ages 18-64) residing > 200 m of major roadway
20	Proportion of age group 6 (ages 65 and older) residing 0-75 m from major roadway
21	Proportion of age group 6 (ages 65 and older) residing > 75 m of major roadway, up to 200 m
22	Proportion of age group 6 (ages 65 and older) residing > 200 m of major roadway

Note: The position where table values begin and the number of characters per value are not relevant in a tab-
delimited format.

01001020100 0.0601 0.0865 0.8534 0.0778 0.1077 0.8145 0.0778 0.1077 0.8145 0.0512
0.0889 0.8599 0.0903 0.1279 0.7818 0.0412 0.0743 0.8845 0.0607 0.0998 0.8395
01001020200 0.0526 0.0644 0.8830 0.0562 0.0716 0.8722 0.0562 0.0716 0.8722 0.0421
0.0552 0.9027 0.0543 0.0742 0.8715 0.0586 0.1170 0.8244 0.0525 0.0752 0.8723
01001020300 0.0740 0.1116 0.8144 0.0598 0.0978 0.8424 0.0598 0.0978 0.8424 0.0403
0.0752 0.8845 0.0572 0.0898 0.8530 0.0501 0.0951 0.8548 0.0937 0.1450 0.7613
01001020400 0.1151 0.1740 0.7109 0.1024 0.1859 0.7117 0.1024 0.1859 0.7117 0.1000
0.2404 0.6596 0.1250 0.2122 0.6628 0.1107 0.1902 0.6991 0.1082 0.2026 0.6892
01001020501 0.0800 0.1126 0.8074 0.0322 0.0527 0.9151 0.0322 0.0527 0.9151 0.0240
0.0453 0.9307 0.0378 0.0791 0.8831 0.0305 0.0568 0.9127 0.0235 0.0460 0.9305
01001020502 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000
0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000 0.0000 0.0000 1.0000
01001020503 0.0563 0.0932 0.8505 0.0185 0.0312 0.9503 0.0185 0.0312 0.9503 0.0294
0.0496 0.9210 0.0181 0.0305 0.9514 0.0303 0.0512 0.9185 0.0519 0.0874 0.8607
01001020600 0.1428 0.2165 0.6407 0.1410 0.2209 0.6381 0.1410 0.2209 0.6381 0.1288
0.2146 0.6566 0.0941 0.1996 0.7063 0.1301 0.2333 0.6366 0.1211 0.2206 0.6583
01001020700 0.0524 0.0783 0.8693 0.0544 0.0994 0.8462 0.0544 0.0994 0.8462 0.0490
0.1127 0.8383 0.0523 0.1090 0.8387 0.0596 0.1172 0.8232 0.0648 0.1203 0.8149
01001020801 0.0152 0.0248 0.9600 0.0733 0.0347 0.8920 0.0733 0.0347 0.8920 0.0298
0.0320 0.9382 0.0403 0.0548 0.9049 0.0422 0.0493 0.9085 0.0693 0.0669 0.8638

Note: Contents wrap around due to space constraints in this figure.

Figure 5. Excerpt from the HAPEM8 Distance-to-road File
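
As an illustration of the layout in Table 6, the following Python sketch splits one row of the distance-to-road file into its tract identifier, the three area fractions, and the three population fractions for each of the six age groups. The function and constant names are hypothetical. The sample row is the first record of Figure 5, rejoined with tabs because the figure shows the values separated by spaces.

```python
AGE_GROUPS = 6
BUFFERS = 3  # 0-75 m, >75-200 m, >200 m

def parse_distance_to_road(line):
    """Split one tab-delimited row into tract ID, area and population fractions."""
    fields = line.rstrip("\n").split("\t")
    expected = 1 + BUFFERS * (1 + AGE_GROUPS)      # 22 fields per Table 6
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, found {len(fields)}")
    tract = fields[0]                 # keep as text to preserve leading zeros
    values = [float(v) for v in fields[1:]]
    area = values[:BUFFERS]
    by_age = [values[BUFFERS * g: BUFFERS * (g + 1)]
              for g in range(1, AGE_GROUPS + 1)]
    return tract, area, by_age

# First data row of Figure 5, rejoined with tabs.
row = "\t".join(
    "01001020100 0.0601 0.0865 0.8534 0.0778 0.1077 0.8145 0.0778 0.1077 "
    "0.8145 0.0512 0.0889 0.8599 0.0903 0.1279 0.7818 0.0412 0.0743 0.8845 "
    "0.0607 0.0998 0.8395".split()
)
tract, area, by_age = parse_distance_to_road(row)
```

Reading the tract identifier as text rather than as a number matters because FIPS codes for some states begin with a zero.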

3. Updating Activity Files - "durhw_HAPEM8.txt", "cluster_HAPEM8.txt", and "clustertrans_HAPEM8.txt"

We updated the HAPEM activity file ("durhw_HAPEM8.txt" in HAPEM8) to reflect the most
recent version of CHAD as of April 2020. This version of CHAD has nearly four times the
number of activity diaries as the version used for HAPEM7. Accordingly, we also updated
the HAPEM cluster file ("cluster_HAPEM8.txt" in HAPEM8) and the HAPEM cluster-
transition file ("clustertrans_HAPEM8.txt" in HAPEM8).

Starting with HAPEM5, we analyzed CHAD data to create longitudinal activity patterns
using Markov chains. In HAPEM8, we refit the Markov chain model to the most recent
CHAD to include more activity-pattern studies and, thus, more daily activity patterns.

The data analysis groups the daily patterns into one, two, or three activity categories (or
"clusters") of similar activity patterns for each of 36 combinations of type of day (the
three day types of HAPEM: summer weekday, non-summer weekday, and weekend), age
(the six age groups discussed in this memo), and commuter type (two types: commutes
or does not commute). Whether one, two, or three activity clusters are assigned to a
day-age-commuter combination depends on the availability of CHAD data. For HAPEM8, 17
day-age-commuter combinations were assigned three clusters, 2 were assigned two
clusters, and 17 were assigned one cluster. We defined clusters based on similar times
spent in five broad microenvironments (i.e., indoors residence, indoors other, outdoors
near-roadway, outdoors other, and in-vehicle).

In HAPEM, for each day-age-commuter combination, one daily activity pattern is
randomly selected from all the CHAD data that correspond to that combination. The
starting activity category (i.e., for the first day) is selected according to the relative
frequencies of each category. The activity category for the second day is selected
according to the transition probabilities from the starting category. Transition
probabilities are the relative frequencies of each activity category when the same
subject was in the starting category on the first day and the given activity category on
the next day. The activity category for the third day is selected according to the
transition probabilities from the second day's category. This is repeated for all days in the
day type, producing a sequence of daily activity categories. For a given simulated person,
each day is assigned an activity pattern representative of the day's activity category.
Once a particular activity pattern is selected as representative of an activity category,
that pattern is always used for that category for that simulated person. Further details on
the cluster and cluster-transition approach can be found in Appendix A of the HAPEM7
User's Guide (a 2015 memorandum from ICF to EPA's Ted Palma and Terri Hollingsworth).
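
The day-to-day selection of activity categories described above can be sketched as a first-order Markov chain. The Python below is a minimal illustration, not the HAPEM8 Fortran code: the function names are hypothetical, and the cumulative values are copied from the age group 1, non-summer weekday, non-commuter row of the cluster-transition excerpt in Figure 8.

```python
import random

# Cumulative fractions and cumulative transition probabilities for one
# day-age-commuter combination (values from Figure 8).
START_CUM = [0.48625, 0.88567, 1.00000]
TRANS_CUM = [
    [0.60714, 1.00000, 1.00000],   # from cluster 1
    [0.13793, 0.86207, 1.00000],   # from cluster 2
    [0.11111, 0.33333, 1.00000],   # from cluster 3
]

def pick(cumulative, u):
    """Return the 1-based cluster whose cumulative bound first exceeds u."""
    for cluster, bound in enumerate(cumulative, start=1):
        if u < bound:
            return cluster
    return len(cumulative)

def simulate_clusters(n_days, rng):
    """Draw a sequence of daily cluster numbers as a first-order Markov chain."""
    sequence = [pick(START_CUM, rng.random())]      # first day
    for _ in range(n_days - 1):
        previous = sequence[-1]
        sequence.append(pick(TRANS_CUM[previous - 1], rng.random()))
    return sequence

days = simulate_clusters(10, random.Random(0))
```

Storing cumulative rather than individual probabilities lets each day's category be chosen with a single uniform random draw. In the model, each category in the resulting sequence would then be mapped to the one activity pattern chosen for that category and simulated person.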

For HAPEM8, we also forced our analysis of CHAD to consider children in the first three
age groups (through age 15 years) to never be commuters (even if CHAD has them
"working"). This was to better comply with the census-based commuting data (discussed
in this memo) where workers start at age 16 years. We had to create "dummy" records in
the cluster-transition file for commuting children, since HAPEM8 expects these records
to be present in the file even though they are never used by the model. The result is that
the "clustertrans_HAPEM8.txt" output file has 36 data records, one for each combination of
the 6 demographic groups, 3 day types, and 2 commuting categories, and the 9
categories of commuting children under 16 years old are dummy records.

3.1.	Quality Assurance

We ensured that each CHAD record was represented in the activity file and formatted
appropriately, including the proper sets of columns for each day-age-commuter
combination. We ensured that the same CHAD records were represented in the cluster
file and formatted appropriately. We ensured that the cluster-transition file contained
the correct combinations of day-age-commuter and was formatted appropriately.

The HAPEM8 activity and cluster files both contain 178,621 records with the same CHADID
on the same record in both files. This was a change for HAPEM8 that allowed us to
simplify HAPEM8 algorithms (counterbalanced by more complex code to develop the
activity files). By having matching CHADIDs, all diaries are available for use in HAPEM8 and
will remain available for future CHAD updates.

3.2.	Content and Format

The HAPEM8 activity data are contained in a fixed-width, space-delimited text file with
characteristics shown in Table 7. The file contains 878 columns and a total of 178,621 rows
of data with one header row. Each row corresponds to a person-day of activity in CHAD,
where the first field contains an identifier for the record, the next 13 fields can be used
together to describe the study respondent, and the remaining fields contain duration
values for how long the subject spends in each microenvironment, for each hour of a day,
and at work versus at home. All values in fields 15-878 are displayed as whole numbers
(i.e., whole minutes). In Figure 6 we show the header and first data row of the HAPEM8
activity file. This record is from a white non-Hispanic female from an unspecified county
in California. She was a child between 1 and 2 years old (unemployed and non-
commuting). This record was from 16 June 1989, which was a summer weekday. She spent
most of her day indoors at home, except in the afternoon when she was outdoors for 1
hour total, in a vehicle for 40 minutes total, and in some other indoor location for 35
minutes total.

Table 7. Characteristics of the HAPEM8 Activity File

Variable Number	Variable	Description	Character Start Position on Data Row	Character Length on Data Row (a)
1	CHADID	ID of event in CHAD	1	10
2	ZIP	ZIP code of subject's residence	11	6
3	ST	2-character FIPS code of state where event took place	17	3
4	COU	3-character FIPS code of county where event took place	20	4
5	SEX	Gender of subject (1=female, 2=male, 9=unknown)	24	4
6	RACE	Race of subject (1=white non-Hispanic, 2=black non-Hispanic, 3=Hispanic any race, 4=Asian or other non-Hispanic, 9=unknown)	28	5
7	WORK	Employment status of subject ("Y"=employed, "N"=unemployed, "X"=missing)	33	5
8	YEAR	Year when the event took place	38	5 or 6, depending on the next field
9	MN	Month when the event took place	Field length varies such that the last digit of each month entry lines up
10	DY	Day of month when event took place	Field length varies such that the last digit of each day entry lines up
11	AGE	Age of subject (presented to two decimal places)	51	6
12	G	HAPEM8 age group (1-6)	57	3
13	DT	Type of day when the event took place (1=summer weekday, 2=non-summer weekday, 3=weekend)	60	3
14	CT	Commuter status of subject (1=does not commute, 2=commutes)	63	4
15-878	No header text	Duration of event (minutes). There are 864 of these fields, cycling through each of the 18 microenvironments, 24 hours of the day, and 2 commute types. The values are sequenced so that the 18 microenvironment durations for the first hour in the home location come first, followed by the 18 microenvironment durations for the second hour in the home location, and so on, until all the 432 values for the home location are specified. These are followed by the 432 values for the work location.	Field lengths vary such that the last digit of each duration entry lines up down the file

a Any unused character space before a number or character and/or between fields consists of
blank spaces.

CHADID ZIP ST COU SEX RACE WORK YEAR MN DY AGE G DT CT
CAC01166A 93277 06 000 1 1 N 1989 6 16 1.67 1 1 1 60 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[remaining 846 duration values of this data row not reproduced here]

Note: Contents wrap around due to space constraints in this figure.
Figure 6. Excerpt from the HAPEM8 Activity File
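
The ordering of the 864 duration fields described in Table 7 implies a simple position formula, sketched below in Python. The function name and the coding of location (0 = home, 1 = work) are assumptions made for this illustration; the microenvironment number is simply its position (1-18) within each hourly group.

```python
N_ME = 18                    # microenvironments
N_HOURS = 24
FIRST_DURATION_FIELD = 15    # fields 1-14 describe the record (Table 7)

def duration_field(location, hour, me):
    """Return the 1-based field number of a duration value.

    location: 0 = home, 1 = work; hour: 1-24; me: 1-18.
    """
    if location not in (0, 1) or not 1 <= hour <= N_HOURS or not 1 <= me <= N_ME:
        raise ValueError("location, hour, or microenvironment out of range")
    # Home values come first (432 of them), hour by hour, 18 values per hour.
    offset = location * N_HOURS * N_ME + (hour - 1) * N_ME + (me - 1)
    return FIRST_DURATION_FIELD + offset

first = duration_field(0, 1, 1)     # field 15
last = duration_field(1, 24, 18)    # field 878
```

The two calls confirm that the formula spans exactly fields 15 through 878, matching the 878 columns reported for the activity file.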

The HAPEM8 activity-cluster data are contained in a fixed-width, space-delimited text
file with characteristics shown in Table 8. The file contains six columns and a data row
corresponding to each data row in the activity file, plus one header row. The first three
fields together identify the age-day-commuter combination of the event, the fourth field
contains an identifier for the record, and the final two fields respectively identify the
cluster number of the event and the number of clusters that exist for all records
corresponding to the age-day-commuter combination. In Figure 7 we show the first ten
data rows of the HAPEM8 cluster file. All ten are in the first age group, on summer
weekdays, for non-commuters, and all are in cluster #1, which is the only cluster for that
combination.

Table 8. Characteristics of the HAPEM8 Cluster File

Variable Number	Variable	Description	Character Start Position on Data Row	Character Length on Data Row (a)
1	g	HAPEM8 age group (1-6)	1	5
2	dt	Type of day when the event took place (1=summer weekday, 2=non-summer weekday, 3=weekend)	6	5
3	ct	Commuting status of subject (1=does not commute, 2=commutes)	11	5
4	chadid	ID of event in CHAD	16	12
5	clus	Cluster category of event	28	5
6	nclus	Number of clusters for the corresponding combination of g, dt, and ct	33	1

a Any unused character space before a number or character and/or between fields consists of blank spaces.

g dt ct chadid clus nclus
1 1 1 CAC01166A 1 1
1 1 1 CAC01251A 1 1
1 1 1 CAC01489A 1 1
1 1 1 CAC01562A 1 1
1 1 1 CAC01568A 1 1
1 1 1 CAC01809A 1 1
1 1 1 CAC01830A 1 1
1 1 1 CAC01982A 1 1
1 1 1 CAC02036A 1 1
1 1 1 CAC02132A 1 1

Figure 7. Excerpt from the HAPEM8 Cluster File
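
Because the cluster file is fixed-width, a row can be read by slicing at the start positions and lengths listed in Table 8. The Python below is a sketch only: the function name is hypothetical, and the sample row is constructed here with right-justified values, which is an assumption based on the table footnote rather than a line copied from the file.

```python
# (name, 1-based start position, length) taken from Table 8.
CLUSTER_FIELDS = [
    ("g", 1, 5), ("dt", 6, 5), ("ct", 11, 5),
    ("chadid", 16, 12), ("clus", 28, 5), ("nclus", 33, 1),
]

def parse_cluster_row(line):
    """Slice one fixed-width data row into a dict of stripped values."""
    record = {}
    for name, start, length in CLUSTER_FIELDS:
        text = line[start - 1:start - 1 + length].strip()
        # CHADID is an alphanumeric label; every other field is an integer.
        record[name] = text if name == "chadid" else int(text)
    return record

# Hypothetical row, right-justified within the Table 8 widths.
sample = f"{1:>5}{1:>5}{1:>5}{'CAC01166A':>12}{1:>5}{1:>1}"
record = parse_cluster_row(sample)
```

Slicing by position rather than splitting on whitespace keeps the parser tied to the documented layout, which is the safer choice if any field could be blank.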

The HAPEM8 activity-cluster-transition data are contained in a fixed-width, space-
delimited text file with characteristics shown in Table 9. The file contains 16 columns, with
a data row corresponding to each age-day-commuter combination, plus a header row.
The first three fields identify the age group, day type, and commuter status. The fourth
field identifies the number of clusters that exist for that age-day-commuter combination.
Fields 5-7 contain the cumulative fractions of the combination's diaries within each
cluster, and the remaining fields contain the cumulative transition probabilities for all
possible combinations of the subject's cluster number on day X and the subject's cluster
number on day X+1.

In Figure 8 we show the age group #1 data rows of the HAPEM8 cluster-transition file.
Several of the day-commuter combinations shown in the excerpt (day-type 1 with both
commuting statuses, day-type 2 with commuting, and day-type 3 with commuting) have
only one cluster, so all of their diaries are in cluster #1.

For day-type 2 non-commuting, about 49% (0.48625) of diaries are in cluster #1, 89%
(0.88567) are in cluster #1 or 2, and all (1.00000) are in cluster #1, 2, or 3. Logically there
is a 100% probability of a cluster #1, 2, or 3 diary transitioning to a cluster #1, 2, or 3 diary
(i.e., values of 1.00000). There are relatively high probabilities of a cluster #1 diary
transitioning to a cluster #1 diary (0.60714) or a cluster #2 diary transitioning to a cluster
#1 or 2 diary (0.86207). There are relatively low probabilities of a cluster #2 or #3 diary
transitioning to a cluster #1 diary (0.13793 and 0.11111, respectively) or a cluster #3 diary
transitioning to a cluster #1 or 2 diary (0.33333).

For day-type 3 non-commuting, about 55% (0.54806) of diaries are in cluster #1, 89%
(0.88666) are in cluster #1 or 2, and all (1.00000) are in cluster #1, 2, or 3. Logically there
is a 100% probability of a cluster #1, 2, or 3 diary transitioning to a cluster #1, 2, or 3 diary
(i.e., values of 1.00000). There are relatively high probabilities of a cluster #1 diary
transitioning to a cluster #1 diary (0.90000), a cluster #2 diary transitioning to a cluster
#1 or 2 diary (0.62500), or a cluster #3 diary transitioning to a cluster #1 or 2 diary
(0.75000), with a 50% (0.50000) probability of a cluster #3 diary transitioning to a
cluster #1 diary. There is a relatively low probability of a cluster #2 diary transitioning to
a cluster #1 diary (0.37500).
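
Because the file stores cumulative values, the share of diaries in each individual cluster (or the probability of moving to each individual cluster) is the difference between neighboring cumulative values. The Python sketch below shows that arithmetic for the day-type 2 non-commuting row of Figure 8; the function name is hypothetical.

```python
def decumulate(cumulative, nclus):
    """Convert cumulative fractions to per-cluster fractions."""
    fractions, previous = [], 0.0
    for value in cumulative[:nclus]:       # ignore unused trailing zeros
        fractions.append(round(value - previous, 5))
        previous = value
    return fractions

# Age group 1, day type 2, non-commuting (values from Figure 8).
shares = decumulate([0.48625, 0.88567, 1.00000], nclus=3)
# shares -> [0.48625, 0.39942, 0.11433]
from_cluster_2 = decumulate([0.13793, 0.86207, 1.00000], nclus=3)
# from_cluster_2 -> [0.13793, 0.72414, 0.13793]
```

In other words, about 49%, 40%, and 11% of these diaries fall in clusters #1, 2, and 3, respectively, and a cluster #2 diary is most likely (about 72%) to be followed by another cluster #2 diary.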

Table 9. Characteristics of the HAPEM8 Cluster-transition File

Variable Number	Variable	Description (a)	Character Start Position on Data Row	Character Length on Data Row (b)
1	g	HAPEM8 age group (1-6)	1	4
2	dt	Type of day when the event took place (1=summer weekday, 2=non-summer weekday, 3=weekend)	5	4
3	ct	Commuting status of subject (1=does not commute, 2=commutes)	9	4
4	nclus	Number of clusters for the corresponding combination of g, dt, and ct	13	3
5	clust1	Cumulative fraction of g/dt in cluster #1	16	8
6	clust2	Cumulative fraction of g/dt in clusters #1-2	24	8
7	clust3	Cumulative fraction of g/dt in clusters #1-3	32	8
8	prob11	Cumulative transition probability from cluster #1 to #1	40	8
9	prob12	Cumulative transition probability from cluster #1 to clusters #1-2	48	8
10	prob13	Cumulative transition probability from cluster #1 to clusters #1-3	56	8
11	prob21	Cumulative transition probability from cluster #2 to #1	64	8
12	prob22	Cumulative transition probability from cluster #2 to clusters #1-2	72	8
13	prob23	Cumulative transition probability from cluster #2 to clusters #1-3	80	8
14	prob31	Cumulative transition probability from cluster #3 to #1	88	8
15	prob32	Cumulative transition probability from cluster #3 to clusters #1-2	96	8
16	prob33	Cumulative transition probability from cluster #3 to clusters #1-3	104	7

a For the clust* fields, if nclus = 1 then clust2 and clust3 = 0 in the file; similarly, if nclus = 2 then
clust3 = 0. The same is true for the prob* fields (if nclus = 1 then prob12, prob13, prob21, prob22,
prob23, prob31, prob32, and prob33 = 0, and if nclus = 2 then prob13, prob23, prob31, prob32, and
prob33 = 0).

b Any unused character space before a number or character and/or between fields consists of
blank spaces.

g dt ct nclus clust1 clust2 clust3 prob11 prob12 prob13 prob21 prob22
prob23 prob31 prob32 prob33
1 1 1 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 1 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 2 1 3 0.48625 0.88567 1.00000 0.60714 1.00000 1.00000 0.13793 0.86207
1.00000 0.11111 0.33333 1.00000
1 2 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000
1 3 1 3 0.54806 0.88666 1.00000 0.90000 1.00000 1.00000 0.37500 0.62500
1.00000 0.50000 0.75000 1.00000
1 3 2 1 1.00000 0.00000 0.00000 1.00000 0.00000 0.00000 0.00000 0.00000
0.00000 0.00000 0.00000 0.00000

Note: Contents wrap around due to space constraints in this figure.

Figure 8. Excerpt from the HAPEM8 Cluster-transition File

4. Updating Source Code

We made several modifications to various source-code modules for HAPEM8. Most
modifications were minor and functioned either to ensure proper execution from the
command line or to ensure that data-array dimensions were large enough to
accommodate the revised default model input data discussed in this memorandum. The
changes to the "durav" module were more significant. We describe the changes to each
module below.

•	"durav_HAPEM8.f90" (compiled into an executable named "durav_HAPEM8.exe"):
	•	Simplified the code, since the updated activity input files already are sorted
		consistently. The number of code lines is now reduced by more than half.

•	"indexpop_HAPEM8.f90" (compiled into an executable named "indexpop_HAPEM8.exe"):
	•	No changes.

•	"commute_HAPEM8.f90" (compiled into an executable named "commute_HAPEM8.exe"):
	•	Increased seven array bounds from 80000 to 99000.
	•	Updated the status for two files to eliminate a compiler warning.

•	"airqual_HAPEM8.f90" (compiled into an executable named "airqual_HAPEM8.exe"):
	•	No changes.

•	"hapem_HAPEM8.f90" (compiled into an executable named "hapem_HAPEM8.exe"):
	•	Broke down one large seven-dimensional array into six six-dimensional arrays, one
		for each demographic group.
	•	Revised the code to read the commuting database file six times (once for each of
		the six demographic groups) rather than once overall.
