Bayesian Space-time Downscaling Fusion
Model (Downscaler) - Derived Estimates of Air
Quality for 2020


-------

-------
EPA-454/R-23-004
December 2023


U.S. Environmental Protection Agency
Office of Air Quality Planning and Standards
Air Quality Assessment Division
Research Triangle Park, NC


-------
Authors:

Adam Reff (EPA/OAR)
Sharon Phillips (EPA/OAR)

Alison Eyth (EPA/OAR)
Janice Godfrey (EPA/OAR)
Jeff Vukovich (EPA/OAR)
David Mintz (EPA/OAR)

Acknowledgements

The following people served as reviewers of this document: Caroline Farkas (EPA/OAR) and
David Mintz (EPA/OAR).


-------
Contents

1.0 Introduction
2.0 Air Quality Data
2.1	Introduction to Air Quality Impacts in the United States
2.2	Ambient Air Quality Monitoring in the United States
2.3	Air Quality Indicators Developed for the EPHT Network
3.0 Emissions Data
3.1	Introduction to Emissions Data Development
3.2	Emission Inventories and Approaches
3.3	Emissions Modeling Summary
3.4	Emissions References
4.0 CMAQ Air Quality Model Estimates
4.1	Introduction to the CMAQ Modeling Platform
4.2	CMAQ Model Version, Inputs and Configuration
5.0 Bayesian Space-time Downscaling Fusion Model (Downscaler) - Derived Air Quality Estimates
5.1	Introduction
5.2	Downscaler Model
5.3	Downscaler Concentration Predictions
5.4	Downscaler Uncertainties
5.5	Summary and Conclusions
Appendix A - Acronyms
Appendix B - Emissions Totals by Sector


-------
1.0 Introduction

This report describes estimates of daily ozone (maximum 8-hour average) and fine particulate matter
(PM2.5) (24-hour average) concentrations throughout the contiguous United States during the 2020
calendar year generated by EPA's recently developed data fusion method termed the "downscaler model"
(DS). Air quality monitoring data from the State and Local Air Monitoring Stations (SLAMS) and
numerical output from the Community Multiscale Air Quality (CMAQ) model were both input to DS to
predict concentrations at the 2010 and 2020 US census tract centroids encompassed by the CMAQ
modeling domain. Information on EPA's air quality monitors, CMAQ model, and DS is included to
provide the background and context for understanding the data output presented in this report. These
estimates are intended for use by statisticians and environmental scientists interested in the daily spatial
distribution of ozone and PM2.5.

DS essentially operates by calibrating CMAQ data to the observational data and then using the resulting
relationship to predict "observed" concentrations at new spatial points in the domain. Although similar
in principle to a linear regression, DS incorporates spatial modeling aspects to improve the
model fit, and it uses a Bayesian¹ approach to fitting to generate an uncertainty value associated with
each concentration prediction. The uncertainties that DS produces are a major feature distinguishing it
from fusion methods previously used by EPA, such as the "Hierarchical Bayesian" (HB) model
(McMillan et al., 2009). The term "downscaler" refers to the fact that DS takes grid-averaged data
(CMAQ) for input and produces point-based estimates, thus "scaling down" the area of data
representation. Although this allows air pollution concentration estimates to be made at points where no
observations exist, caution is needed when interpreting any within-gridcell spatial gradients generated by
DS since they may not exist in the input datasets. The theory, development, and initial evaluation of DS
can be found in the earlier papers of Berrocal, Gelfand, and Holland (2009, 2010, and 2011).
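To make the calibration idea concrete, the following is a minimal sketch and not the DS implementation. It fits an ordinary least-squares line relating synthetic monitor observations to collocated CMAQ grid-cell values, then applies that line at a new location. DS itself incorporates spatial modeling and a Bayesian fit that also yields uncertainties, all of which this sketch omits. The data values and every name (cmaq_at_monitors, obs_at_monitors, predict) are hypothetical.

```python
import numpy as np

# Synthetic, illustrative values only; not real monitor or CMAQ data.
cmaq_at_monitors = np.array([30.0, 35.0, 40.0, 45.0, 50.0])
obs_at_monitors = np.array([27.0, 31.0, 35.0, 39.0, 43.0])

# Fit obs = intercept + slope * cmaq by ordinary least squares.
design = np.column_stack([np.ones_like(cmaq_at_monitors), cmaq_at_monitors])
coef, *_ = np.linalg.lstsq(design, obs_at_monitors, rcond=None)
intercept, slope = coef


def predict(cmaq_value):
    """Apply the fitted calibration line to a CMAQ value at a new point."""
    return intercept + slope * cmaq_value


# Predicted "observed" concentration where CMAQ reports 42.0 and no monitor exists.
prediction = predict(42.0)
```

In this simplified version a single intercept and slope apply everywhere, so the prediction at an unmonitored point carries no uncertainty estimate of its own.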

EPA's Office of Air and Radiation's (OAR) Office of Air Quality Planning and Standards (OAQPS)
provides air quality monitoring data and model estimates to the Centers for Disease Control and
Prevention (CDC) for use in their Environmental Public Health Tracking (EPHT) Network. CDC's
EPHT Network supports linkage of air quality data with human health outcome data for use by various
public health agencies throughout the U.S. The EPHT Network Program is a multidisciplinary
collaboration that involves the ongoing collection, integration, analysis, interpretation, and dissemination
of data from: environmental hazard monitoring activities; human exposure assessment information; and
surveillance of noninfectious health conditions. As part of the National EPHT Program efforts, the CDC
led the initiative to build the National EPHT Network (https://www.cdc.gov/nceh/tracking/). The
National EPHT Program, with the EPHT Network as its cornerstone, is the CDC's response to requests
calling for improved understanding of how the environment affects human health. The EPHT Network is
designed to provide the means to identify, access, and organize hazard, exposure, and health data from a
variety of sources and to examine, analyze and interpret those data based on their spatial and temporal
characteristics.

1 Bayesian statistical modeling refers to methods that are based on Bayes' theorem and model the world in terms of
probabilities based on previously acquired knowledge.



-------
Since 2002, EPA has collaborated with the CDC on the development of the EPHT Network. On
September 30, 2003, the Secretary of Health and Human Services (HHS) and the Administrator of EPA
signed a joint Memorandum of Understanding (MOU) with the objective of advancing efforts to
achieve mutual environmental public health goals.² HHS, acting through the CDC and the Agency for
Toxic Substances and Disease Registry (ATSDR), and EPA agreed to expand their cooperative
activities in support of the CDC EPHT Network and EPA's Central Data Exchange Node on the
Environmental Information Exchange Network in the following areas:

•	Collecting, analyzing and interpreting environmental and health data from both agencies (HHS
and EPA).

•	Collaborating on emerging information technology practices related to building, supporting,
and operating the CDC EPHT Network and the Environmental Information Exchange
Network.

•	Developing and validating additional environmental public health indicators.

•	Sharing reliable environmental and public health data between their respective networks in an
efficient and effective manner.

•	Consulting and informing each other about dissemination of results obtained through work
carried out under the MOU and the associated Interagency Agreement (IAG) between EPA and
CDC.

The best available statistical fusion model, air quality data, and CMAQ numerical model output were
used to develop the estimates. Fusion results can vary with different inputs and fusion modeling
approaches. As new and improved statistical models become available, EPA will provide updates.

Although these data have been processed on a computer system at the EPA, no warranty expressed or
implied is made regarding the accuracy or utility of the data on any other system or for general or
scientific purposes, nor shall the act of distribution of the data constitute any such warranty. It is also
strongly recommended that careful attention be paid to the contents of the metadata file associated with
these data to evaluate data set limitations, restrictions or intended use. The EPA shall not be held liable
for improper or incorrect use of the data described and/or contained herein.

The four remaining sections and two appendices in the report are as follows:

•	Section 2 describes the air quality data obtained from EPA's nationwide monitoring network
and the importance of the monitoring data in determining potential health risks.

•	Section 3 details the emissions inventory data, how it is obtained and its role as a key input into
the CMAQ air quality computer model.

2 The original HHS and EPA MOU is available at https://www.cdc.gov/nceh/tracking/pdfs/epa_mou_2007.pdf.



-------
•	Section 4 describes the CMAQ computer model and its role in providing estimates of pollutant
concentrations across the U.S. based on 12-km grid cells over the contiguous U.S.

•	Section 5 explains the downscaler model used to statistically combine air quality monitoring
data and air quality estimates from the CMAQ model to provide daily air quality estimates for
the 2010 and 2020 U.S. census tract centroid locations within the contiguous U.S.

•	Appendix A provides a description of acronyms used in this report.

•	Appendix B is a separate spreadsheet that shows emissions totals for the modeling domain and
for each emissions modeling sector (see Section 3 for more details).



-------
2.0 Air Quality Data

To compare health outcomes with air quality measures, it is important to understand the origins of those
measures and the methods for obtaining them. This section provides a brief overview of the origins and
process of air quality regulation in this country. It provides a detailed discussion of ozone (O3) and
particulate matter (PM). The EPHT program has focused on these two pollutants, since numerous studies
have found them to be most pervasive and harmful to public health and the environment, and there are
extensive monitoring and modeling data available.

2.1 Introduction to Air Quality Impacts in the United States

2.1.1	The Clean Air Act

In 1970, the Clean Air Act (CAA) was signed into law. Under this law, EPA sets limits on how much of
a pollutant can be in the air anywhere in the United States. This ensures that all Americans have the same
basic health and environmental protections. The CAA has been amended several times to keep pace with
new information. For more information on the CAA, go to https://www.epa.gov/clean-air-act-overview.

Under the CAA, the EPA has established standards, or limits, for six air pollutants known as the criteria
air pollutants: carbon monoxide (CO), lead (Pb), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone
(O3), and particulate matter (PM). These standards, called the National Ambient Air Quality Standards
(NAAQS), are designed to protect public health and the environment. The CAA established two types of
air quality standards. Primary standards set limits to protect public health, including the health of
"sensitive" populations such as asthmatics, children, and the elderly. Secondary standards set limits to
protect public welfare, including protection against decreased visibility and damage to animals, crops,
vegetation, and buildings. The CAA requires EPA to review these standards at least every five years. For
more specific information on the NAAQS, go to https://www.epa.gov/criteria-air-pollutants/naaqs-table.
For general information on the criteria pollutants, go to https://www.epa.gov/criteria-air-pollutants.

When these standards are not met, the area is designated as a nonattainment area. States must develop
state implementation plans (SIPs) that explain the regulations and controls they will use to clean up the
nonattainment areas. States with an EPA-approved SIP can request that the area be redesignated from
nonattainment to attainment by providing three consecutive years of data showing NAAQS compliance.
The state must also provide a maintenance plan to demonstrate how it will continue to comply with the
NAAQS and demonstrate compliance over a 10-year period, and what corrective actions it will take
should a NAAQS violation occur after redesignation. EPA must review and approve the NAAQS
compliance data and the maintenance plan before redesignating the area; thus, a person may live in an area
designated as nonattainment even though no NAAQS violation has been observed for quite some time.
For more information on ozone designations, go to https://www.epa.gov/ozone-designations and for PM
designations, go to https://www.epa.gov/particle-pollution-designations.

2.1.2	Ozone

Ozone is a colorless gas composed of three oxygen atoms. Ground level ozone is formed when pollutants
released from cars, power plants, and other sources react in the presence of heat and sunlight. It is the
prime ingredient of what is commonly called "smog." When inhaled, ozone can cause acute respiratory
problems, aggravate asthma, cause inflammation of lung tissue, and even temporarily decrease the lung
capacity of healthy adults. Repeated exposure may permanently scar lung tissue. EPA's Integrated
Science Assessments and Risk and Exposure documents are available at
https://www.epa.gov/naaqs/ozone-o3-air-quality-standards. The current NAAQS for ozone (last revised
in 2015) is a daily maximum 8-hour average of 0.070 parts per million (ppm) (for details, see
https://www.epa.gov/ozone-pollution/setting-and-reviewing-standards-control-ozone-pollution#standards).
The CAA requires EPA to review the NAAQS at least every five years and revise
them as appropriate in accordance with Section 108 and Section 109 of the Act. The standards for ozone
are shown in Table 2-1.

Table 2-1. Ozone National Ambient Air Quality Standards

Form of the Standard (parts per million, ppm)	1997	2008	2015
Annual 4th highest daily max 8-hour average, averaged over three years	0.08	0.075	0.070
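The form of the ozone standard in Table 2-1 can be illustrated with a short calculation. This is a sketch under simplifying assumptions: it takes the 4th-highest daily maximum 8-hour average in each of three years and averages them, but it omits the regulatory data-handling rules (data completeness, rounding or truncation conventions), which are not described in this report. The function names and the synthetic values are hypothetical.

```python
import numpy as np

# Levels of the standard from Table 2-1, in ppm.
OZONE_NAAQS_PPM = {1997: 0.08, 2008: 0.075, 2015: 0.070}


def fourth_highest(daily_max_8hr):
    """Return the 4th-highest value in one year of daily max 8-hour averages."""
    values = np.sort(np.asarray(daily_max_8hr, dtype=float))[::-1]
    if values.size < 4:
        raise ValueError("need at least four daily values")
    return values[3]


def ozone_three_year_statistic(years_of_daily_values):
    """Average the annual 4th-highest values over three consecutive years."""
    if len(years_of_daily_values) != 3:
        raise ValueError("expected exactly three years of data")
    return float(np.mean([fourth_highest(y) for y in years_of_daily_values]))


# Synthetic example: three very short "years" of daily max 8-hour averages (ppm).
year1 = [0.081, 0.079, 0.075, 0.072, 0.060, 0.055]
year2 = [0.078, 0.074, 0.071, 0.069, 0.064, 0.050]
year3 = [0.076, 0.073, 0.070, 0.066, 0.061, 0.052]

statistic = ozone_three_year_statistic([year1, year2, year3])
meets_2015_level = statistic <= OZONE_NAAQS_PPM[2015]
```

Because the statistic uses the 4th-highest value rather than the maximum, the three highest days in each year do not affect the result.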

2.1.3 Particulate Matter

PM air pollution is a complex mixture of small and large particles of varying origin that can contain
hundreds of different chemicals, including cancer-causing agents like polycyclic aromatic hydrocarbons
(PAH), as well as heavy metals such as arsenic and cadmium. PM air pollution results from direct
emissions of particles as well as particles formed through chemical transformations of gaseous air
pollutants. The characteristics, sources, and potential health effects of particulate matter depend on its
source, the season, and atmospheric conditions.

As a practical convention, PM is divided by size into classes with differing health concerns and potential
sources.³ Particles less than 10 micrometers in diameter (PM10) pose a health concern because they can be
inhaled into and accumulate in the respiratory system. Particles less than 2.5 micrometers in diameter
(PM2.5) are referred to as "fine" particles. Because of their small size, fine particles can lodge deeply into
the lungs. Sources of fine particles include all types of combustion (motor vehicles, power plants, wood
burning, etc.) and some industrial processes. Particles with diameters between 2.5 and 10 micrometers
(PM10-2.5) are referred to as "coarse" or PMc. Sources of PMc include crushing or grinding operations and
dust from paved or unpaved roads. The distribution of PM10, PM2.5 and PMc varies from the eastern U.S.
to arid western areas.

Particle pollution - especially fine particles - contains microscopic solids and liquid droplets that are so
small that they can get deep into the lungs and cause serious health problems. Numerous scientific
studies have linked particle pollution exposure to a variety of problems, including premature death in
people with heart or lung disease, nonfatal heart attacks, irregular heartbeat, aggravated asthma, decreased
lung function, and increased respiratory symptoms, such as irritation of airways, coughing or difficulty
breathing. Additional information on the health effects of particle pollution and other technical
documents related to PM standards are available at https://www.epa.gov/pm-pollution.

3 The measure used to classify PM into sizes is the aerodynamic diameter. The measurement instruments used for PM are
designed and operated to separate large particles from the smaller particles. For example, the PM2.5 instrument only captures
and thus measures particles with an aerodynamic diameter less than 2.5 micrometers. The EPA method to measure PMc is
designed around taking the mathematical difference between measurements for PM10 and PM2.5.



-------
The current NAAQS for PM2.5 (last revised in 2012) includes both a 24-hour standard to protect against
short-term effects, and an annual standard to protect against long-term effects. The annual average PM2.5
concentration must not exceed 12.0 micrograms per cubic meter (µg/m3) based on the annual mean
concentration averaged over three years, and the 24-hr average concentration must not exceed 35 µg/m3
based on the 98th percentile 24-hour average concentration averaged over three years. More information is
available at
https://www.epa.gov/pm-pollution/setting-and-reviewing-standards-control-particulate-matter-pm-pollution#standards.
The standards for PM2.5 are shown in Table 2-2.

Table 2-2. PM2.5 National Ambient Air Quality Standards

Form of the Standard (micrograms per cubic meter, µg/m3)	1997	2006	2012
Annual mean of 24-hour averages, averaged over 3 years	15.0	15.0	12.0
98th percentile of 24-hour averages, averaged over 3 years	65	35	35
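The two forms of the PM2.5 standard in Table 2-2 can be sketched in the same spirit. This example assumes that `numpy.percentile` with its default linear interpolation is an acceptable stand-in for the 98th percentile; the regulatory procedure has its own data-handling rules, which this report does not describe and which are not reproduced here. The function name and the synthetic data are hypothetical.

```python
import numpy as np

# Levels of the standards from Table 2-2, in micrograms per cubic meter.
PM25_NAAQS = {
    "annual": {1997: 15.0, 2006: 15.0, 2012: 12.0},
    "daily": {1997: 65.0, 2006: 35.0, 2012: 35.0},
}


def pm25_three_year_statistics(years_of_daily_means):
    """Return (3-year mean of annual means, 3-year mean of annual 98th percentiles)."""
    annual_means = [np.mean(y) for y in years_of_daily_means]
    annual_p98 = [np.percentile(y, 98) for y in years_of_daily_means]
    return float(np.mean(annual_means)), float(np.mean(annual_p98))


# Synthetic 24-hour averages for three years (evenly spaced values for clarity).
year1 = np.linspace(5.0, 15.0, 101)
year2 = np.linspace(4.0, 14.0, 101)
year3 = np.linspace(6.0, 16.0, 101)

annual_stat, daily_stat = pm25_three_year_statistics([year1, year2, year3])
meets_2012_annual = annual_stat <= PM25_NAAQS["annual"][2012]
meets_2012_daily = daily_stat <= PM25_NAAQS["daily"][2012]
```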

2.2 Ambient Air Quality Monitoring in the United States

2.2.1 Monitoring Networks

The CAA (Section 319) requires establishment of an air quality monitoring system throughout the U.S.
The monitoring stations in this network have been called the State and Local Air Monitoring Stations
(SLAMS). The SLAMS network consists of approximately 4,000 monitoring sites set up and operated by
state and local air pollution agencies according to specifications prescribed by EPA for monitoring
methods and network design. All ambient monitoring methods or analyzers selected for use in SLAMS are tested
periodically to assess the quality of the SLAMS data being produced. Measurement accuracy and
precision are estimated for both automated and manual methods. The individual results of these tests for
each method or analyzer are reported to EPA. Then, EPA calculates quarterly integrated estimates of
precision and accuracy for the SLAMS data.

The SLAMS network experienced accelerated growth throughout the 1970s. The networks were further
expanded in 1999 based on the establishment of separate NAAQS for fine particles (PM2.5) in 1997. The
NAAQS for PM2.5 were established based on their link to serious health problems ranging from increased
symptoms, hospital admissions, and emergency room visits, to premature death in people with heart or
lung disease. While most of the monitors in these networks are located in populated areas of the country,
"background" and rural monitors are an important part of these networks. For more information on
SLAMS, as well as EPA's other air monitoring networks go to https://www.epa.gov/amtic.

In 2023, approximately 35 percent of the U.S. population was living within 10 kilometers of ozone and
PM2.5 monitoring sites. Highly populated areas in the eastern U.S. and California are well covered by both
ozone and PM2.5 monitoring networks (Figure 2-1).



-------
[Figure 2-1 map legends: population (million people) by distance to the nearest active monitor]

Distance bin	Ozone monitors	PM2.5 monitors
< 10 km	100.7	115.1
10 km - 25 km	129.7	114
25 km - 50 km	58.8	59
50 km - 75 km	21.2	24.6
75 km - 100 km	8.8	10.9
100 km - 150 km	8.4	6.6
> 150 km	5.4	2.9



-------
Figure 2-1. Distances from U.S. Census Tract centroids to the nearest monitoring site, 2023.
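A tabulation like the one behind Figure 2-1 could be produced along the following lines. This is a sketch only: the report does not state how distances were computed, so the great-circle (haversine) formula, the Earth radius, and all coordinates and populations below are assumptions made for illustration. The bin edges follow the distance classes shown in the figure legend.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0                    # assumed mean Earth radius
BIN_EDGES_KM = [10, 25, 50, 75, 100, 150]   # distance classes from the Figure 2-1 legend


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in decimal degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def population_by_distance_bin(tracts, monitors):
    """tracts: rows of (lat, lon, population); monitors: rows of (lat, lon)."""
    tracts = np.asarray(tracts, dtype=float)
    monitors = np.asarray(monitors, dtype=float)
    # Pairwise distances: one row per tract centroid, one column per monitor.
    dist = haversine_km(tracts[:, None, 0], tracts[:, None, 1],
                        monitors[None, :, 0], monitors[None, :, 1])
    nearest = dist.min(axis=1)
    bin_index = np.digitize(nearest, BIN_EDGES_KM)
    totals = np.zeros(len(BIN_EDGES_KM) + 1)
    np.add.at(totals, bin_index, tracts[:, 2])   # sum population within each bin
    return nearest, totals


# Hypothetical tract centroids (lat, lon, population) and a single monitor.
tracts = [(35.0, -79.0, 1000), (35.0, -78.8, 2000), (36.0, -79.0, 3000)]
monitors = [(35.0, -79.0)]
nearest_km, totals = population_by_distance_bin(tracts, monitors)
```

The pairwise distance matrix is convenient for a small example; for tens of thousands of tracts and thousands of monitors, a spatial index would be a more practical choice.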

In summary, state and local agencies and tribes implement a quality-assured monitoring network to
measure air quality across the U.S. The EPA provides guidance to ensure a thorough understanding of the
quality of the data produced by these networks. These monitoring data have been used to characterize the
status of the nation's air quality and the trends across the U.S. (see https://www.epa.gov/air-trends).

2.2.2	Air Quality System Database

EPA's Air Quality System (AQS) database contains ambient air monitoring data collected by EPA, state,
local, and tribal air pollution control agencies from thousands of monitoring stations. AQS also contains
meteorological data, descriptive information about each monitoring station (including its geographic
location and its operator), and data quality assurance and quality control information. State and local
agencies are required to submit their air quality monitoring data into AQS within 90 days following the
end of the quarter in which the data were collected. This ensures timely submission of these data for use
by state, local, and tribal agencies, EPA, and the public. EPA's OAQPS and other AQS users rely upon
the data in AQS to assess air quality, assist in compliance with the NAAQS, evaluate SIPs, perform
modeling for permit review analysis, and perform other air quality management functions. For more
details, including how to retrieve data, go to https://www.epa.gov/aqs.

2.2.3	Advantages and Limitations of the Air Quality Monitoring and Reporting System

Air quality data are required to assess public health outcomes that are affected by poor air quality. The
challenge is to obtain surrogates for air quality on temporal and spatial scales that are useful for EPHT activities.

The advantage of using ambient data from EPA monitoring networks for comparison with health
outcomes is that these measurements of pollution concentrations are the best characterization of the
concentration of a given pollutant at a given time and location. Furthermore, the data are supported by a
comprehensive quality assurance program, ensuring data of known quality. One disadvantage of using
the ambient data is that they are usually out of spatial and temporal alignment with health outcomes. Two
key factors drive this spatial and temporal 'misalignment' between air quality monitoring data and health
outcomes: (1) the living and/or working locations (microenvironments) where a
person spends their time are not co-located with an air quality monitor, and (2) the time(s)/date(s) when a patient
experiences a health outcome/symptom (e.g., asthma attack) do not coincide with the time(s)/date(s) when an
air quality monitor records ambient concentrations of a pollutant high enough to affect the symptom (e.g.,
asthma attack either during or shortly after a high PM2.5 day).

To compare/correlate ambient concentrations with acute health effects, daily local air quality data are
needed.⁴ Spatial gaps exist in the air quality monitoring network, especially in rural areas, since the air
quality monitoring network is designed to focus on measurement of pollutant concentrations in high
population density areas. Temporal limits also exist. Hourly ozone measurements are aggregated to daily
values (the daily max 8-hour average is relevant to the ozone standard). Ozone is typically monitored
during the ozone season (the warmer months, approximately April through October). However, year-long
data are available in many areas and are extremely useful for evaluating whether ozone is a factor in health
outcomes during the non-ozone seasons. PM2.5 is generally measured year-round. Most Federal Reference
Method (FRM) PM2.5 monitors collect data one day in every three days, due in part to the time and costs
involved in collecting and analyzing the samples. Additionally, continuous monitors have become
available that can automatically collect, analyze, and report PM2.5 measurements on an hourly basis.
These monitors are available in most of the major metropolitan areas. Some of these continuous monitors
have been determined to be equivalent to the FRM monitors for regulatory purposes and are called
Federal Equivalent Methods (FEM).

4 EPA uses exposure models to evaluate the health risks and environmental effects associated with exposure. These models
are limited by the availability of air quality estimates (https://www.epa.gov/technical-air-pollution-resources).
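The aggregation of hourly ozone measurements to a daily maximum 8-hour average, mentioned above, can be sketched as follows. This simplified version considers only the 8-hour windows that fit entirely within one day's 24 hourly values; the treatment of windows that extend into the next day and of missing hours follows regulatory rules that are not covered in this report. The function name and the synthetic hourly values are hypothetical.

```python
import numpy as np


def daily_max_8hr_average(hourly_ppm):
    """Largest mean over any 8 consecutive hours within one day's 24 hourly values."""
    hourly = np.asarray(hourly_ppm, dtype=float)
    if hourly.size != 24:
        raise ValueError("expected 24 hourly values")
    # A length-8 moving average; "valid" keeps only complete windows.
    window_means = np.convolve(hourly, np.ones(8) / 8.0, mode="valid")
    return float(window_means.max())


# Synthetic day: low overnight ozone with an 8-hour afternoon plateau (ppm).
hourly = [0.030] * 10 + [0.070] * 8 + [0.030] * 6
daily_value = daily_max_8hr_average(hourly)
```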

2.2.4 Use of Air Quality Monitoring Data

Air quality monitoring data have been used to provide information for the following purposes:

(1)	Assessing effectiveness of SIPs in addressing NAAQS nonattainment areas

(2)	Characterizing local, state, and national air quality status and trends

(3)	Associating health and environmental damage with air quality levels/concentrations

For the EPHT effort, EPA is providing air quality data to support efforts associated with (2) and (3) above.
Data supporting (3) are generated by EPA through the use of its air quality data and its downscaler model.

Most studies that associate air quality with health outcomes use air monitoring as a surrogate for exposure
to the air pollutants being investigated. Many studies have used the monitoring networks operated by
state and federal agencies. Some studies perform special monitoring that can better represent exposure to
the air pollutants: community monitoring, near residences, in-house or workplace monitoring, and
personal monitoring. For the EPHT program, special monitoring is generally not supported, though it
could be used on a case-by-case basis.

From proximity-based exposure estimates to statistical interpolation, many approaches have been developed for
estimating exposures to air pollutants using ambient monitoring data (Jerrett et al., 2005). Depending
upon the approach and the spatial and temporal distribution of ambient monitoring data, exposure
estimates may vary greatly in areas farther from monitors (Bravo et al., 2012).
Factors like limited temporal coverage (e.g., many PM2.5 monitors record only every third day, and
ozone monitors operate during only part of the year) and limited spatial coverage (e.g.,
most monitors are located in urban areas and rural coverage is limited) hinder most
interpolation techniques that use monitoring data alone as the input. Consider the example of Voronoi
Neighbor Averaging (VNA) (referred to as Nearest Neighbor Averaging in most of the literature), in which rural
estimates would be biased towards the urban estimates. To illustrate, assume a scenario
of two cities with monitors and no monitors in the rural areas between them, which is very plausible. Since
VNA exposure estimates are guaranteed to be within the range of the monitor values, estimates for the rural areas
in this scenario would reflect the urban concentrations and would therefore tend to be too high.
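The two-city scenario can be illustrated with a small inverse-distance-weighted average. This is a simplified stand-in rather than VNA itself: it treats every monitor as a neighbor instead of selecting neighbors from a Voronoi diagram, and the weighting exponent, coordinates, and concentrations are all assumptions chosen for illustration.

```python
import numpy as np


def idw_estimate(target_xy, monitor_xy, monitor_values, power=2.0):
    """Inverse-distance-weighted average of monitor values at a target point."""
    monitor_xy = np.asarray(monitor_xy, dtype=float)
    values = np.asarray(monitor_values, dtype=float)
    dist = np.linalg.norm(monitor_xy - np.asarray(target_xy, dtype=float), axis=1)
    if np.any(dist == 0):
        # The target coincides with a monitor, so return that monitor's value.
        return float(values[np.argmin(dist)])
    weights = 1.0 / dist ** power
    return float(np.sum(weights * values) / np.sum(weights))


# Two hypothetical urban monitors 100 km apart (coordinates in km), no rural monitor.
monitors = [(0.0, 0.0), (100.0, 0.0)]
urban_pm25 = [12.0, 10.0]

# Estimate at the rural midpoint between the two cities.
rural_estimate = idw_estimate((50.0, 0.0), monitors, urban_pm25)
```

Because the estimate is a weighted average with positive weights, it always lies between the lowest and highest monitor values, so a rural point with cleaner air than either city cannot be represented.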

Air quality models may overcome some of the limitations that monitoring networks possess. Models such
as CMAQ can estimate concentrations at reasonable temporal and spatial resolutions. However, these
sophisticated air quality models are prone to systematic biases since they depend upon so many variables
(e.g., meteorological models and emission models) and complex chemical and physical process simulations.



-------
Combining monitoring data with air quality models (via fusion or regression) may provide the best results
in terms of estimating ambient air concentrations in space and time. EPA's eVNA⁵ is an example of an
earlier approach for merging air quality monitor data with CMAQ model predictions. DS attempts to
address some of the shortcomings in these earlier attempts to statistically combine monitor and model
predicted data; see the published papers referenced in Section 1 for more information about DS. As discussed
in the next section, two methods are used in EPHT to provide estimates of ambient concentrations of
air pollutants: air quality monitoring data and the downscaler model estimate, which is a statistical
'combination' of air quality monitor data and photochemical air quality model predictions (e.g., CMAQ).

2.3 Air Quality Indicators Developed for the EPHT Network

Air quality indicators have been developed for use in the Environmental Public Health Tracking Network
by CDC using the ozone and PM2.5 data from EPA. The approach used divides "indicators" into two
categories. First, basic air quality measures were developed to compare air quality levels over space and
time within a public health context (e.g., using the NAAQS as a benchmark). Next, indicators were
developed that mathematically link air quality data to public health tracking data (e.g., daily PM2.5 levels
and hospitalization data for acute myocardial infarction). Table 2-3 and Table 2-4 describe the issues
impacting calculation of basic air quality indicators.

Table 2-3. Public Health Surveillance Goals and Current Status

Goal: Air data sets and metadata required for air quality indicators are available to EPHT state Grantees.
Status: Data are available through state agencies and EPA's AQS. EPA and CDC developed an interagency
agreement, where EPA provides air quality data along with statistically combined AQS and CMAQ data,
associated metadata, and technical reports that are delivered to CDC.

Goal: Estimate the linkage or association of PM2.5 and ozone on health to:
•	Identify populations that may have higher risk of adverse health effects due to PM2.5 and ozone,
•	Generate hypotheses for further research, and
•	Provide information to support prevention and pollution control strategies.
Status: Regular discussions have been held on health-air linked indicators and CDC/HFI/EPA convened a
workshop in January 2008. CDC has collaborated on a health impact assessment (HIA) with Emory University,
EPA, and state grantees that can be used to facilitate greater understanding of these linkages.

Goal: Produce and disseminate basic indicators and other findings in electronic and print formats to provide
the public, environmental health professionals, and policymakers with current and easy-to-use information
about air pollution and the impact on public health.
Status: Templates and "how to" guides for PM2.5 and ozone have been developed for routine indicators.
Calculation techniques and presentations for the indicators have been developed.

5 eVNA is described in the "Regulatory Impact Analysis for the Final Clean Air Interstate Rule", EPA-452/R-05-002, March
2005, Appendix F.



-------
Table 2-4. Basic Air Quality Indicators used in EPHT, derived from the EPA data delivered to CDC

Ozone (daily 8-hr period with maximum concentration, ppm, by FRM)

•	Number of days with maximum ozone concentration over the NAAQS (or other relevant benchmarks) (by county
and MSA)

•	Number of person-days with maximum 8-hr average ozone concentration over the NAAQS and other relevant
benchmarks (by county and MSA)

PM2.5 (daily 24-hr integrated samples, µg/m3, by FRM)

•	Average ambient concentrations of particulate matter (< 2.5 microns in diameter) and comparison to the annual
PM2.5 NAAQS (by state)

•	Percent of population exceeding annual PM2.5 NAAQS (by state)

•	Percent of days with PM2.5 concentration over the daily NAAQS (or other relevant benchmarks) (by county and
MSA)

•	Number of person-days with PM2.5 concentration over the daily NAAQS and other relevant benchmarks (by
county and MSA)
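The day-count indicators in Table 2-4 reduce to simple arithmetic once daily values and a benchmark are available. The sketch below assumes that "person-days" means the number of days over the benchmark multiplied by the area's population, and that "over" means strictly greater than the benchmark; neither definition is spelled out in this report, and the county data shown are synthetic.

```python
import numpy as np


def exceedance_indicators(daily_values, population, benchmark):
    """Count days over a benchmark and derive percent-of-days and person-days."""
    daily = np.asarray(daily_values, dtype=float)
    days_over = int(np.sum(daily > benchmark))
    return {
        "days_over": days_over,
        "percent_days_over": 100.0 * days_over / daily.size,
        "person_days_over": days_over * population,  # assumed definition
    }


# Synthetic daily max 8-hr ozone values (ppm) for a hypothetical county of 50,000 people.
county_ozone = [0.065, 0.072, 0.071, 0.068, 0.070]
indicators = exceedance_indicators(county_ozone, population=50_000, benchmark=0.070)
```

The same function applies to the PM2.5 indicators by passing daily 24-hour averages and the daily PM2.5 benchmark instead.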

2.3.1	Rationale for the Air Quality Indicators

The CDC EPHT Network is initially focusing on ozone and PM2.5. These air quality indicators are based
mainly on the NAAQS health findings and program-based measures (measurement, data and analysis
methodologies). The indicators will allow comparisons across space and time for EPHT actions, in the
context of health-based benchmarks. By bringing population into the measures, they roughly
distinguish between potential exposures (at a broad scale).

2.3.2	Air Quality Data Sources

The air quality data will be available in the EPA's AQS database based on the state/federal air program's
data collection and processing. The AQS database contains ambient air pollution data collected by EPA,
state, local, and tribal air pollution control agencies from thousands of monitoring stations (SLAMS).

2.3.3	Use of Air Quality Indicators for Public Health Practice

The basic indicators will be used to inform policymakers and the public regarding the degree of hazard
within a state and across states (national). For example, the number of days per year that ozone is above
the NAAQS can be used to communicate to sensitive populations (such as asthmatics) the number of days
that they may be exposed to unhealthy levels of ozone. This is the same level used in the Air Quality
Alerts that inform these sensitive populations when and how to reduce their exposure. These indicators,
however, are not a surrogate measure of exposure and therefore will not be linked with health data.

3.0 Emissions Data

3.1 Introduction to Emissions Data Development

The U.S. Environmental Protection Agency (EPA) developed an air quality modeling platform for air
toxics and criteria air pollutants that represents the year 2020. The platform is based on the 2020 National
Emissions Inventory (2020 NEI) published in April 2023 (EPA, 2023) along with other data specific to
the year 2020. The air quality modeling platform consists of all the emissions inventories and ancillary
data files used for emissions modeling, as well as the meteorological, initial condition, and boundary
condition files needed to run the air quality model. This document focuses on the emissions modeling
component of the 2020 modeling platform, including the emission inventories, the ancillary data files, and
the approaches used to transform inventories for use in air quality modeling.

The modeling platform includes all criteria air pollutants and precursors (CAPs), two groups of hazardous
air pollutants (HAPs) and diesel particulate matter. The first group of HAPs are those explicitly used by
the chemical mechanism in the Community Multiscale Air Quality (CMAQ) model (Appel, 2018) for
ozone/particulate matter (PM): chlorine (Cl), hydrogen chloride (HCl), naphthalene, benzene,
acetaldehyde, formaldehyde, and methanol (the last five are abbreviated as NBAFM in subsequent
sections of the document). The second group of HAPs consists of 52 HAPs or HAP groups (such as
polycyclic aromatic hydrocarbon groups) that are included in CMAQ for the purposes of air quality
modeling for a HAP+CAP platform.

Emissions were prepared for the Community Multiscale Air Quality (CMAQ) model
(https://www.epa.gov/cmaq) version 5.4,6 which was used to model ozone (O3), particulate matter (PM),
and HAPs. CMAQ requires hourly and gridded emissions of the following inventory pollutants: carbon
monoxide (CO), nitrogen oxides (NOx), volatile organic compounds (VOC), sulfur dioxide (SO2),
ammonia (NH3), particulate matter less than or equal to 10 microns (PM10), and individual component
species for particulate matter less than or equal to 2.5 microns (PM2.5). In addition, the Carbon Bond
mechanism version 6 (CB6) with chlorine chemistry within CMAQ allows for explicit treatment of the
VOC HAPs naphthalene, benzene, acetaldehyde, formaldehyde and methanol (NBAFM), includes
anthropogenic HAP emissions of HCl and Cl, and can model additional HAPs as described in Section 3.
The short abbreviation for the modeling case name was "2020ha2", where 2020 is the year modeled, 'h'
represents that it was based on the 2020 NEI, and 'a' represents that it was the first version of a 2020 NEI-
based platform. The additional '2' after the 'ha' is related to a second run of the 2020ha case with an
updated version of some spatial surrogates.

Emissions were also prepared for an air dispersion modeling system: American Meteorological
Society/Environmental Protection Agency Regulatory Model (AERMOD) (EPA, 2018). AERMOD was
run for 2020 for all NEI HAPs (about 130 more than covered by CMAQ) across all 50 states, Puerto Rico
and the Virgin Islands in a similar way as was done for the 2018 version of AirToxScreen (EPA, 2022a).
This TSD focuses on the CMAQ aspects of the 2020 modeling platform from which ozone and PM data
were developed for the Centers for Disease Control and Prevention.

6 CMAQ version 5.4: https://zenodo.org/record/7218076. CMAQ is also available from the Community Modeling and Analysis
System (CMAS) Center at: http://www.cmascenter.org.

The effort to create the emission inputs for this study included development of emission inventories to
represent emissions during the year of 2020, along with application of emissions modeling tools to
convert the inventories into the format and resolution needed by CMAQ and AERMOD.

The emissions modeling platform includes point sources, nonpoint sources, onroad mobile sources,
nonroad mobile sources, biogenic emissions and fires for the U.S., Canada, and Mexico. Some platform
categories use more disaggregated data than are made available in the NEI. For example, in the platform,
onroad mobile source emissions are represented as hourly emissions by vehicle type, fuel type, process,
and road type while the NEI emissions are aggregated to vehicle type/fuel type totals and annual temporal
resolution. Emissions used in the CMAQ modeling from Canada are provided by Environment and
Climate Change Canada (ECCC) and those from Mexico are mostly provided by SEMARNAT; these emissions are not part of the
NEI. Year-specific emissions were used for fires, biogenic sources, fertilizer, point sources, and onroad
and nonroad mobile sources. Where available, continuous emission monitoring system (CEMS) data were
used for electric generating unit (EGU) emissions.

The primary emissions modeling tool used to create the CMAQ model-ready emissions was the Sparse
Matrix Operator Kernel Emissions (SMOKE) modeling system. SMOKE version 4.9 was used to create
CMAQ-ready emissions files for a 12-km grid covering the continental U.S. Additional information about
SMOKE is available from http://www.cmascenter.org/smoke.

The gridded meteorological model used to provide input data for the emissions modeling was developed
using the Weather Research and Forecasting Model (WRF,
https://ral.ucar.edu/solutions/products/weather-research-and-forecasting-model-wrf) version 4.1.1,
Advanced Research WRF core (Skamarock, et al., 2008). The WRF Model is a mesoscale numerical
weather prediction system developed for both operational forecasting and atmospheric research
applications. The WRF was run for 2020 over a domain covering the continental U.S. at a 12km
resolution with 35 vertical layers. The run for this platform included high resolution sea surface
temperature data from the Group for High Resolution Sea Surface Temperature (GHRSST) (see
https://www.ghrsst.org/) and is given the EPA meteorological case abbreviation "20k." The full case
abbreviation includes this suffix following the emissions portion of the case name to fully specify the
abbreviation of the case as "2020ha2_cb6_20k."
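
The case naming convention described above can be illustrated with a short Python sketch that splits a full case abbreviation into its parts. The regular expression and field names are hypothetical. Reading the middle token ("cb6") as the chemical mechanism is an assumption based on the Carbon Bond mechanism version 6 mentioned earlier in this section.

```python
import re

# Assumed layout: <year><NEI base letter><version letter><optional run number>
#                 _<chemical mechanism>_<meteorological case>
CASE_PATTERN = re.compile(
    r"^(?P<year>\d{4})(?P<nei_base>[a-z])(?P<version>[a-z])(?P<run>\d*)"
    r"_(?P<mechanism>[^_]+)_(?P<met_case>[^_]+)$"
)


def parse_case_name(name):
    """Split a full case abbreviation into named components."""
    match = CASE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized case name: {name!r}")
    return match.groupdict()


parts = parse_case_name("2020ha2_cb6_20k")
```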

Following the emissions modeling steps to prepare emissions for CMAQ and AERMOD, both models
were run for each of the four modeling domains. CMAQ outputs provide the overall mass, chemistry and
formation for specific hazardous air pollutants (HAPs) formed secondarily in the atmosphere (e.g.,
formaldehyde, acetaldehyde, and acrolein), whereas AERMOD provides spatial granularity and more
detailed source attribution. CMAQ also provided the biogenic and fire concentrations, as these sources are
not run in AERMOD. Special steps were taken to estimate secondary HAPs, fire and biogenic emissions
in these areas. The outputs from CMAQ and AERMOD were combined to provide spatially refined
concentration estimates for HAPs, from which estimates of cancer and non-cancer risk were derived.
Information about the emissions and associated data files for this platform is available from this section
of the air emissions modeling website: https://www.epa.gov/air-emissions-modeling/2020-emissions-modeling-platform.

This chapter contains two additional sections. Section 3.2 describes the inventories input to SMOKE and
the ancillary files used along with the emission inventories. Section 3.3 describes the emissions modeling
performed to convert the inventories into the format and resolution needed by CMAQ. Additional details
on the development of the emissions inputs to CMAQ are provided in the publication Technical Support
Document (TSD): Preparation of Emissions Inventories for the 2020 North American Emissions
Modeling Platform (EPA, 2023).

3.2 Emission Inventories and Approaches

This section describes the emissions inventories created for input to SMOKE, which are based on the
April 2023 version of the 2020 NEI. The NEI includes five main data categories: a) nonpoint sources; b)
point sources; c) nonroad mobile sources; d) onroad mobile sources; and e) fires. For CAPs, the NEI data
are largely compiled from data submitted by state, local and tribal (S/L/T) agencies. HAP emissions data
are often augmented by EPA when they are not voluntarily submitted to the NEI by S/L/T agencies. The
NEI was compiled using the Emissions Inventory System (EIS). EIS collects and stores facility inventory
and emissions data for the NEI and includes hundreds of automated QA checks to improve data quality,
and it also supports release point (stack) coordinates separately from facility coordinates. EPA
collaboration with S/L/T agencies helped prevent duplication between point and nonpoint source
categories such as industrial boilers. The 2020 NEI Technical Support Document describes in detail the
development of the 2020 emission inventories and is available at
https://www.epa.gov/air-emissions-inventories/2020-national-emissions-inventory-nei-technical-support-document-tsd (EPA, 2023).

A full set of emissions for all source categories is developed every three years, with 2020 being the most
recent year represented with a full "triennial" NEI. S/L/T agencies are required to submit all applicable
point sources to the NEI in triennial years, including the year 2020. Because all applicable point sources
were submitted for 2020, it was not necessary to pull forward unsubmitted sources from another NEI year,
as was done for interim years such as 2018 and 2019. The SMARTFIRE2 system and the BlueSky
Pipeline (https://github.com/pnwairfire/bluesky) emissions modeling system were used to develop year
2020 fire emissions. SMARTFIRE2 categorizes all fires as either prescribed burning or wildfire, and the
BlueSky Pipeline system includes fuel loading, consumption and emission factor estimates for both types
of fires. Onroad and nonroad mobile source emissions were developed for this project for the year 2020
by running MOVES3 (https://www.epa.gov/moves).

With the exception of onroad and fire emissions, Canadian emissions were provided by Environment
and Climate Change Canada (ECCC) for the year 2020. For Mexico, inventories from the 2019 emissions
modeling platform (EPA, 2022b) were used as the starting point. The Canadian and Mexican emissions
also include adjustments to account for the impacts of the COVID pandemic.

The emissions modeling process was performed using SMOKE v4.9. Through this process, the emissions
inventories were apportioned into the grid cells used by CMAQ and temporally allocated into hourly
values. In addition, the pollutants in the inventories (e.g., NOx, PM and VOC) were split into the
chemical species needed by CMAQ. For the purposes of preparing the CMAQ-ready emissions, the NEI
emissions inventories by data category were split into emissions modeling platform "sectors"; and
emissions from sources other than the NEI were added, such as the Canadian, Mexican, and offshore
inventories. Emissions within the emissions modeling platform were separated into sectors for groups of
related emissions source categories that are run through all of the appropriate SMOKE programs, except
the final merge, independently from emissions categories in the other sectors. The final merge program
called Mrggrid combines low-level sector-specific gridded, speciated and temporalized emissions to
create the final CMAQ-ready emissions inputs. For biogenic and fertilizer emissions, the CMAQ model
allows for these emissions to be included in the CMAQ-ready emissions inputs, or to be computed within
CMAQ itself (the "inline" option). This study used the option to compute biogenic emissions within the
model and the CMAQ bidirectional ammonia process to compute the fertilizer emissions.
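
Conceptually, the final merge described above is a cell-by-cell sum of the gridded emissions from each sector. The Python sketch below illustrates that idea on a toy 2 x 2 grid for one species and one hour. It is not the Mrggrid implementation, and the function name and emission values are hypothetical.

```python
def merge_sectors(sector_grids):
    """Sum gridded emissions cell by cell across sectors."""
    grids = list(sector_grids.values())
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    for grid in grids:
        if len(grid) != n_rows or any(len(row) != n_cols for row in grid):
            raise ValueError("all sector grids must share the same shape")
    return [
        [sum(grid[r][c] for grid in grids) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical gridded emissions for three sectors (arbitrary units).
sector_grids = {
    "ptegu": [[1.0, 0.0], [0.0, 2.0]],
    "onroad": [[0.5, 0.5], [1.5, 0.0]],
    "nonpt": [[0.0, 1.0], [0.5, 0.5]],
}
merged = merge_sectors(sector_grids)
```

Keeping the sectors separate until this last step is what allows each sector to use its own gridding, speciation, and temporal settings.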

Table 3-1 presents the sectors in the emissions modeling platform used to develop the year 2020
emissions for this project. The sector abbreviations are provided in italics; these abbreviations are used in
the SMOKE modeling scripts, the inventory file names, and throughout the remainder of this section.
Annual emission summaries for the U.S. sectors are shown in Table 3-2. Table 3-3 provides a summary of
emissions for the anthropogenic sectors containing Canadian, Mexican, and offshore sources. State total
emissions for each sector are provided in Appendix B, a workbook entitled
"Appendix_B_20202_emissions_totals_by_sector.xlsx".

Table 3-1. Platform Sectors Used in the Emissions Modeling Process

| Platform Sector: abbreviation | NEI Data Category | Description and resolution of the data input to SMOKE |
|---|---|---|
| EGU units: ptegu | Point | 2020 NEI point source EGUs, replaced with hourly Continuous Emissions Monitoring System (CEMS) values for NOx and SO2, and the remaining pollutants temporally allocated according to CEMS heat input where the units are matched to the NEI. Emissions for all sources not matched to CEMS data come from 2020 NEI point inventory. Annual resolution for sources not matched to CEMS data, hourly for CEMS sources. EGUs closed in 2020 are not part of the inventory. |
| Point source oil and gas: pt_oilgas | Point | 2020 NEI point sources that include oil and gas production emissions processes for facilities with North American Industry Classification System (NAICS) codes related to Oil and Gas Extraction, Natural Gas Distribution, Drilling Oil and Gas Wells, Support Activities for Oil and Gas Operations, Pipeline Transportation of Crude Oil, and Pipeline Transportation of Natural Gas. Includes U.S. offshore oil production. |
| Aircraft and ground support equipment: airports | Point | 2020 NEI point source emissions from airports, including aircraft and airport ground support emissions. Annual resolution. |
| Remaining non-EGU point: ptnonipm | Point | All 2020 NEI point source records not matched to the airports, ptegu, or pt_oilgas sectors. Includes 2020 NEI rail yard emissions. Annual resolution. |
| Livestock: livestock | Nonpoint | 2020 NEI nonpoint livestock emissions. Livestock includes ammonia and other pollutants (except PM2.5). County and annual resolution. |
| Agricultural Fertilizer: fertilizer | Nonpoint | 2020 agricultural fertilizer ammonia emissions computed inline within CMAQ. |
| Area fugitive dust: afdust_adj | Nonpoint | PM10 and PM2.5 fugitive dust sources from the 2020 NEI nonpoint inventory, including building construction, road construction, agricultural dust, and paved and unpaved road dust. The emissions modeling system applies a transport fraction reduction and a zero-out based on 2020 gridded hourly meteorology (precipitation and snow/ice cover). Emissions are county and annual resolution. |
| Biogenic: beis | Nonpoint | Year 2020 emissions from biogenic sources. These were left out of the CMAQ-ready merged emissions, in favor of inline biogenic emissions produced during the CMAQ model run itself. Version 4 of the Biogenic Emissions Inventory System (BEIS) was used with Version 6 of the Biogenic Emissions Landuse Database (BELD6). Therefore, the biogenic emissions used here are similar to the 2020 NEI biogenic emissions, but not exactly the same. |
| Category 1, 2 CMV: cmv_c1c2 | Nonpoint | 2020 NEI Category 1 (C1) and Category 2 (C2) commercial marine vessel (CMV) emissions based on Automatic Identification System (AIS) data. Point and hourly resolution. |
| Category 3 CMV: cmv_c3 | Nonpoint | 2020 NEI Category 3 (C3) commercial marine vessel (CMV) emissions based on AIS data. Point and hourly resolution. |
| Locomotives: rail | Nonpoint | Line haul rail locomotives emissions from 2020 NEI. County and annual resolution. |
| Nonpoint source oil and gas: np_oilgas | Nonpoint | Nonpoint 2020 NEI sources from oil and gas-related processes. County and annual resolution. |
| Residential Wood Combustion: rwc | Nonpoint | 2020 NEI nonpoint sources with residential wood combustion (RWC) processes. County and annual resolution. |
| Solvents: np_solvents | Nonpoint | Emissions of solvents from the 2020 NEI (Seltzer, 2021). Includes household cleaners, personal care products, adhesives, architectural and aerosol coatings, printing inks, and pesticides. Annual and county resolution. |
| Remaining nonpoint: nonpt | Nonpoint | 2020 NEI nonpoint sources not included in other platform sectors. County and annual resolution. |
| Nonroad: nonroad | Nonroad | 2020 NEI nonroad equipment emissions developed with MOVES3, including the updates made to spatial apportionment that were developed with the 2016v1 platform. MOVES3 was used for all states except California, which submitted their own emissions for the 2020 NEI. County and monthly resolution. |
| Onroad: onroad | Onroad | Onroad mobile source gasoline and diesel vehicles from parking lots and moving vehicles from 2020 NEI. Includes the following emission processes: exhaust, extended idle, auxiliary power units, evaporative, permeation, refueling, vehicle starts, off network idling, long-haul truck hoteling, and brake and tire wear. MOVES3 was run for 2020 to generate emission factors. |
| Onroad California: onroad_ca_adj | Onroad | California-provided 2020 CAP and HAP (VOCs and metals) onroad mobile source gasoline and diesel vehicles from parking lots and moving vehicles based on Emission Factor (EMFAC), gridded and temporalized based on outputs from MOVES3. Polycyclic aromatic hydrocarbon (PAH) emissions are based on MOVES3. |
| Point source agricultural fires: ptagfire | Nonpoint | Agricultural fire sources for 2020 developed by EPA as point and day-specific emissions.7 Only EPA-developed ag. fire data are used in this study, thus 2020 NEI state submissions are not included. Agricultural fires are in the nonpoint data category of the NEI, but in the modeling platform, they are treated as day-specific point sources. Updated HAP-augmentation factors were applied. |
| Point source prescribed fires: ptfire-rx | Nonpoint | Point source day-specific prescribed fires for 2020 NEI computed using SMARTFIRE 2 and BlueSky Pipeline. The ptfire emissions were run as two separate sectors: ptfire-rx (prescribed, including Flint Hills / grasslands) and ptfire-wild. |
| Point source wildfires: ptfire-wild | Nonpoint | Point source day-specific wildfires for 2020 NEI computed using SMARTFIRE 2 and BlueSky Pipeline. |
| Non-U.S. fires: ptfire_othna | N/A | Point source day-specific wildfires and agricultural fires outside of the U.S. for 2020. Canadian fires for May through December are provided by ECCC. All other fire emissions, including Canadian emissions from January through April, as well as Mexico, Caribbean, Central American, and other international fires, are from v2.5 of the Fire INventory (FINN) from National Center for Atmospheric Research (Wiedinmyer, C., 2023). |
| Canada area fugitive dust sources: canada_afdust | N/A | Area fugitive dust sources from ECCC for 2020 with transport fraction and snow/ice adjustments based on 2020 meteorological data. Annual and province resolution. |
| Canada point fugitive dust sources: canada_ptdust | N/A | 2020 point source fugitive dust sources from ECCC with transport fraction and snow/ice adjustments based on 2020 meteorological data. Monthly and province resolution. |
| Canada and Mexico stationary point sources: canmex_point | N/A | Canada and Mexico point source emissions not included in other sectors. Canada point sources for 2020 were provided by ECCC and Mexico point source emissions for 2016 were provided by SEMARNAT. Mexico sources were projected from 2019ge (EPA, 2022b) with COVID adjustments applied. Canada monthly temporalization adjusted for COVID. Annual and monthly resolution. |
| Canada and Mexico agricultural sources: canmex_ag | | Canada and Mexico agricultural emissions. Canada point sources for 2020 were provided by ECCC and Mexico emissions for 2016 were provided by SEMARNAT and adjusted to 2019. COVID adjustments were not applied to the ag sector. Annual resolution. |
| Canada low-level oil and gas sources: canada_og2D | | 2020 Canada emissions from upstream oil and gas. This sector contains the portion of oil and gas emissions which are not subject to plume rise. The rest of the 2020 Canada oil and gas emissions are in the canmex_point sector. Provided by ECCC with COVID-adjusted monthly temporalization. Monthly resolution. |
| Canada and Mexico nonpoint and nonroad sources: canmex_area | N/A | 2020 Canada and Mexico nonpoint source emissions not included in other sectors. Canada: ECCC provided a 2020 inventory and surrogates. Mexico: applied COVID adjustments to 2019ge. Monthly temporalization adjusted for COVID. |
| Canada onroad sources: canada_onroad | N/A | Canada onroad emissions. 2020 Canada inventory provided by ECCC and processed using updated surrogates. COVID impacts applied to monthly profiles (not to annual totals). Province and monthly resolution. |
| Mexico onroad sources: mexico_onroad | N/A | Mexico onroad emissions. 2020 MOVES-Mexico with COVID adjustments applied. Municipio and monthly resolution. |

7 Only EPA-developed agricultural fire data were included in this study; data submitted by states to the NEI were excluded.

Ocean chlorine emissions were also merged in with the above sectors. The ocean chlorine gas emission
estimates are based on the build-up of molecular chlorine (Cl2) concentrations in oceanic air masses
(Bullock and Brehme, 2002). Ocean chlorine data at 12 km resolution were available from earlier studies
and were not modified other than the name "CHLORINE" was changed to "CL2" because that is the
name required by the CMAQ model.

The emission inventories in SMOKE input formats for the platform are available from EPA's Air
Emissions Modeling website: https://www.epa.gov/air-emissions-modeling/2020-emissions-modeling-platform.
The platform informational text file indicates the particular zipped files associated with each
platform sector. Some emissions data summaries are available with the data files for the 2020 platform.
The types of reports include state summaries of inventory pollutants and model species by modeling
platform sector and county annual totals by modeling platform sector. Summaries of the emissions in the
contiguous U.S. and emissions within the 12-km domain but outside of the U.S. are shown in Table 3-2 and
Table 3-3, respectively.

Table 3-2. 2020 Contiguous United States Emissions by Sector (tons/yr in 48 states + D.C.)

| Sector | CO | NH3 | NOX | PM10 | PM2_5 | SO2 | VOC |
|---|---|---|---|---|---|---|---|
| afdust_adj | | | | 5,513,981 | 765,892 | | |
| airports | 324,335 | 0 | 81,729 | 8,295 | 7,334 | 8,889 | 48,680 |
| cmv_c1c2 | 17,242 | 57 | 113,213 | 3,051 | 2,956 | 571 | 3,973 |
| cmv_c3 | 9,216 | 29 | 91,850 | 1,640 | 1,508 | 3,690 | 4,233 |
| fertilizer | | 1,401,045 | | | | | |
| livestock | | 2,693,568 | | | | | 215,483 |
| nonpt | 2,199,000 | 145,244 | 739,200 | 724,647 | 634,164 | 107,619 | 1,007,035 |
| nonroad | 11,005,619 | 1,980 | 866,081 | 85,040 | 79,961 | 990 | 977,863 |
| np_oilgas | 621,795 | 16 | 571,317 | 10,541 | 10,453 | 135,998 | 2,583,242 |
| np_solvents | | | | | | | 2,586,519 |
| onroad | 14,063,910 | 89,328 | 2,327,115 | 188,720 | 78,626 | 9,785 | 1,030,292 |
| ptegu | 400,900 | 21,491 | 847,682 | 101,118 | 86,781 | 820,839 | 25,466 |
| ptagfire | 664,858 | 140,954 | 28,037 | 102,245 | 66,604 | 11,025 | 107,166 |
| ptfire-rx | 7,181,506 | 114,977 | 140,674 | 794,163 | 681,777 | 64,751 | 1,654,719 |
| ptfire-wild | 18,664,856 | 306,009 | 239,530 | 1,885,536 | 1,597,986 | 135,617 | 4,399,094 |
| ptnonipm | 1,157,963 | 63,289 | 769,850 | 343,959 | 222,800 | 443,029 | 705,590 |
| pt_oilgas | 171,082 | 8,264 | 330,517 | 12,668 | 12,168 | 35,130 | 196,102 |
| rail | 92,100 | 282 | 422,975 | 10,819 | 10,459 | 351 | 17,492 |
| rwc | 2,955,189 | 22,735 | 44,869 | 450,864 | 448,073 | 12,019 | 455,660 |
| beis | 3,265,206 | | 980,749 | | | | 28,254,267 |
| CONUS no beis | 59,529,571 | 5,009,270 | 7,614,637 | 10,237,288 | 4,707,543 | 1,790,303 | 16,018,609 |
| CONUS + beis | 62,794,777 | 5,009,270 | 8,595,386 | 10,237,288 | 4,707,543 | 1,790,303 | 44,272,876 |
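
The relationship between the sector rows and the two total rows of Table 3-2 can be checked with simple arithmetic. The Python sketch below uses values copied from the table: it sums the CO column over the non-biogenic sectors and adds the beis row to the "CONUS no beis" totals for the three pollutants that beis reports. The variable names are illustrative only.

```python
# CO emissions (tons/yr) by sector from Table 3-2, excluding beis.
co_by_sector = {
    "airports": 324_335, "cmv_c1c2": 17_242, "cmv_c3": 9_216,
    "nonpt": 2_199_000, "nonroad": 11_005_619, "np_oilgas": 621_795,
    "onroad": 14_063_910, "ptegu": 400_900, "ptagfire": 664_858,
    "ptfire-rx": 7_181_506, "ptfire-wild": 18_664_856,
    "ptnonipm": 1_157_963, "pt_oilgas": 171_082, "rail": 92_100,
    "rwc": 2_955_189,
}
co_sum = sum(co_by_sector.values())

# Totals without biogenics and the beis row (tons/yr) from Table 3-2.
conus_no_beis = {"CO": 59_529_571, "NOX": 7_614_637, "VOC": 16_018_609}
beis = {"CO": 3_265_206, "NOX": 980_749, "VOC": 28_254_267}
conus_with_beis = {p: conus_no_beis[p] + beis[p] for p in conus_no_beis}
```

Both results reproduce the "CONUS no beis" CO total and the "CONUS + beis" row of the table. Other columns may differ from their listed totals by a few tons because the published values are rounded.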

Table 3-3. Non-US Emissions by Sector within the 12US1 Modeling Domain (tons/yr for Canada, Mexico, Offshore)

| Sector | CO | NH3 | NOX | PM10 | PM2_5 | SO2 | VOC |
|---|---|---|---|---|---|---|---|
| Canada ag | | 495,216 | | 6,567 | 1,876 | | 124,394 |
| Canada oil and gas 2D | | 8 | | | | | 318,720 |
| Canada afdust | | | | 799,628 | 154,654 | | |
| Canada ptdust | | | | 2,791 | 361 | | |
| Canada area | 2,020,228 | 5,987 | 321,437 | 184,241 | 135,848 | 14,263 | 709,347 |
| Canada onroad | 1,622,797 | 6,848 | 354,849 | 24,288 | 13,272 | 830 | 115,863 |
| Canada point | 1,011,453 | 18,160 | 549,975 | 111,671 | 41,376 | 499,692 | 146,194 |
| Canada fires | 654,404 | 8,746 | 10,058 | 118,455 | 102,005 | 5,444 | 215,854 |
| Canada cmv_c1c2 | 2,596 | 8 | 16,691 | 441 | 428 | 60 | 580 |
| Canada cmv_c3 | 7,160 | 19 | 71,623 | 1,051 | 967 | 2,167 | 3,497 |
| Mexico ag | | 115,994 | | 66,380 | 14,465 | | 0 |
| Mexico area | 115,014 | 81 | 55,083 | 29,228 | 16,992 | 1,586 | 278,327 |
| Mexico onroad | 1,241,148 | 2,130 | 311,807 | 11,557 | 8,144 | 4,888 | 110,159 |
| Mexico point | 124,965 | 949 | 144,798 | 39,649 | 27,670 | 293,438 | 29,882 |
| Mexico fires | 211,379 | 3,612 | 13,079 | 24,985 | 21,413 | 2,000 | 109,543 |
| Mexico cmv_c1c2 | 118 | 0 | 766 | 20 | 19 | 2 | 32 |
| Mexico cmv_c3 | 7,375 | 72 | 79,149 | 4,088 | 3,761 | 10,888 | 3,442 |
| Offshore cmv_c1c2 | 3,647 | 11 | 23,290 | 610 | 591 | 64 | 885 |
| Offshore cmv_c3 | 43,133 | 254 | 434,674 | 14,334 | 13,187 | 36,361 | 20,624 |
| Offshore pt_oilgas | 52,008 | 8 | 50,096 | 638 | 637 | 463 | 38,910 |
| Can/Mex/offshore total | 7,117,423 | 658,106 | 2,437,376 | 1,440,620 | 557,665 | 872,147 | 2,226,254 |

3.2.1 Point Sources (ptegu, pt_oilgas, ptnonipm, and airports)

Point sources are sources of emissions for which specific geographic coordinates (e.g., latitude/longitude)
are specified, as in the case of an individual facility. A facility may have multiple emission release points
that may be characterized as units such as boilers, reactors, spray booths, kilns, etc. A unit may have
multiple processes (e.g., a boiler that sometimes burns residual oil and sometimes burns natural gas).

With a couple of minor exceptions, this section describes only NEI point sources within the contiguous
U.S. The offshore oil platform (pt_oilgas sector) and CMV emissions (cmv_c1c2 and cmv_c3 sectors)
are processed by SMOKE as point source inventories and are discussed later in this section. A complete
NEI is developed every three years. At the time of this writing, 2020 is the most recently finished
complete NEI. A comprehensive description about the development of the 2020 NEI is available in the
2020 NEI TSD (EPA, 2023). Point inventories are also available in EIS for non-triennial NEI years such
as 2019 and 2021. In the interim year point inventories, states are required to update larger sources with
the emissions that occurred in that year, while sources not updated by states for the interim year were
either carried forward from the most recent triennial NEI or marked as closed and removed.

In preparation for modeling, the complete set of point sources in the NEI was exported from EIS for the
year 2020 into the Flat File 2010 (FF10) format that is compatible with SMOKE (see
https://cmascenter.org/smoke/documentation/4.9/html/ch06s02s08.html) and was then split into several
sectors for modeling. For both flat files, sources without specific locations (i.e., the FIPS code ends in
777) were dropped and inventories for the other point source sectors were created from the remaining
point sources. The point sectors are: EGUs (ptegu), point source oil and gas extraction-related sources
(pt_oilgas), airport emissions (airports), and the remaining non-EGUs (ptnonipm). The EGU emissions
were split out from the other sources to facilitate the use of distinct SMOKE temporal processing and
future-year projection techniques. The oil and gas sector emissions (pt_oilgas) and airport emissions
(airports) were processed separately for the purposes of developing emissions summaries and due to
distinct projection techniques from the remaining non-EGU emissions (ptnonipm), although this study
does not include emissions projected to other years.

In some cases, data about facility or unit closures are entered into EIS after the modeling
inventory flat files have been extracted from EIS. Prior to processing through SMOKE, submitted facility and
unit closures were reviewed and where closed sources were found in the inventory, those were removed.

For the 2020 platform, an analysis of point source stack parameters (e.g., stack height, diameter,
temperature, and velocity) was performed because unrealistic and repeated stack parameters that
appeared to be default values were noticed. These defaulted values appeared in data submissions for the states of
Illinois, Louisiana, Michigan, Pennsylvania, Texas, and Wisconsin. Where these defaults were detected
and deemed to be unreasonable for the specific process, the affected stack parameters were replaced by
values from the PSTK file that is input to SMOKE. PSTK contains default stack parameters by source
classification code (SCC). These updates impacted the ptnonipm and pt_oilgas inventories.
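
The replacement step described above can be sketched as a lookup of default stack parameters keyed by SCC. The Python example below is a minimal illustration, not the procedure EPA used. The report does not give the criteria for judging a value unreasonable, so the sketch takes a precomputed flag. The field names, the SCC key, and all parameter values are hypothetical.

```python
def apply_pstk_defaults(sources, pstk_defaults):
    """Replace flagged stack parameters with SCC-based defaults."""
    updated = []
    for source in sources:
        record = dict(source)  # copy so the input records are not modified
        defaults = pstk_defaults.get(record["scc"])
        if record["suspect_default"] and defaults is not None:
            record.update(defaults)
        updated.append(record)
    return updated


# Hypothetical default stack parameters keyed by SCC (illustrative values only).
pstk_defaults = {
    "10200601": {"height_ft": 60.0, "diameter_ft": 3.0,
                 "temp_f": 400.0, "velocity_fps": 25.0},
}
sources = [
    {"scc": "10200601", "suspect_default": True,
     "height_ft": 1.0, "diameter_ft": 1.0, "temp_f": 72.0, "velocity_fps": 1.0},
    {"scc": "10200601", "suspect_default": False,
     "height_ft": 45.0, "diameter_ft": 2.5, "temp_f": 350.0, "velocity_fps": 20.0},
]
fixed = apply_pstk_defaults(sources, pstk_defaults)
```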

The inventory pollutants processed through SMOKE for input to CMAQ for the ptegu, pt_oilgas,
ptnonipm, and airports sectors included: CO, NOx, VOC, SO2, NH3, PM10, and PM2.5 and the following
HAPs: HCl (pollutant code = 7647010), Cl (code = 7782505), and several dozen other HAPs listed in
Section 3. NBAFM pollutants from the point sectors were utilized. For AERMOD, additional HAPs
were included as described in the 2020 AirToxScreen TSD.

The ptnonipm, pt_oilgas, and airports sector emissions were provided to SMOKE as annual emissions.
For sources in the ptegu sector that could be matched to 2020 CEMS data, hourly CEMS NOx and SO2
emissions for 2020 from EPA's Acid Rain Program were used rather than annual inventory emissions.
For all other pollutants (e.g., VOC, PM2.5, HCl), annual emissions were used as-is from the annual
inventory but were allocated to hourly values using heat input from the CEMS data. For the unmatched
units in the ptegu sector, annual emissions were allocated to daily values using IPM region- and pollutant-
specific profiles, and similarly, region- and pollutant-specific diurnal profiles were applied to create
hourly emissions.
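
The heat-input allocation described above for CEMS-matched units amounts to distributing an annual total in proportion to each hour's share of total heat input. The Python sketch below illustrates that proportional split. It is not SMOKE's implementation. The function name and numbers are hypothetical, and only four "hours" are shown for brevity.

```python
def allocate_by_heat_input(annual_emissions, hourly_heat_input):
    """Split an annual total across hours in proportion to heat input."""
    total_heat = sum(hourly_heat_input)
    if total_heat <= 0:
        raise ValueError("heat input must sum to a positive value")
    return [annual_emissions * heat / total_heat for heat in hourly_heat_input]


# Hypothetical CEMS heat input for four hours and an annual VOC total.
hourly_heat_input = [100.0, 300.0, 0.0, 400.0]
hourly_voc = allocate_by_heat_input(8.0, hourly_heat_input)
```

Hours with zero heat input receive zero emissions, and the hourly values sum back to the annual total.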

The non-EGU stationary point source (ptnonipm) emissions were used as inputs to SMOKE as annual
emissions. The full description of how the NEI emissions were developed is provided in the NEI
documentation - a brief summary of their development follows:

a.	CAP and HAP data were provided by States, locals and tribes under the Air Emissions Reporting Rule
(AERR) [the reporting size threshold is larger for inventory years between the triennial inventory years of 2011,
2014, 2017, 2020, ...].

b.	EPA corrected known issues and filled PM data gaps.

c.	EPA added HAP data from the Toxics Release Inventory (TRI) where corresponding data were not already
provided by states/locals.

d.	EPA stored and applied matches of the point source units to units with CEMS data and also for all EGU
units modeled by EPA's Integrated Planning Model (IPM).

e.	Data for airports and rail yards were incorporated.

f.	Off-shore platform data were added from the Bureau of Ocean Energy Management (BOEM).

The changes made to the NEI point sources prior to modeling with SMOKE are as follows:

•	The tribal data, which do not use state/county Federal Information Processing Standards (FIPS) codes in the
NEI, but rather use the tribal code, were assigned a state/county FIPS code of 88XXX, where XXX is the 3-
digit tribal code in the NEI. This change was made because SMOKE requires all sources to have a
state/county FIPS code.

•	Sources that did not have specific counties assigned (i.e., the county code ends in 777) were not included in
the modeling because it was only possible to know the state in which the sources resided, but no more
specific details related to the location of the sources were available.
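
The two changes above amount to a simple record filter. The following Python sketch illustrates the logic only; it is not SMOKE or EIS code, and the dict-based record layout and the field names `region_cd` and `tribal_code` are assumptions made for this example.

```python
def prepare_point_records(records):
    """Apply the two pre-SMOKE adjustments to a list of point-source records.

    Each record is a dict with a hypothetical 'region_cd' (5-digit
    state/county FIPS string, or None for tribal data) and a hypothetical
    'tribal_code' (3-digit string, or None for non-tribal data).
    """
    kept = []
    for rec in records:
        rec = dict(rec)  # work on a copy so the input is not modified
        if rec.get("tribal_code"):
            # Tribal data: assign a FIPS code of the form 88XXX
            rec["region_cd"] = "88" + rec["tribal_code"].zfill(3)
        elif rec["region_cd"].endswith("777"):
            # County is unknown, so the source is left out of the modeling
            continue
        kept.append(rec)
    return kept


example = [
    {"region_cd": None, "tribal_code": "042"},
    {"region_cd": "37777", "tribal_code": None},
    {"region_cd": "37063", "tribal_code": None},
]
print([r["region_cd"] for r in prepare_point_records(example)])
```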

Each of the point sectors is processed separately through SMOKE as described in the following
subsections.

3.2.1.1	EGU sector (ptegu)

The ptegu sector contains emissions from EGUs in the 2020 point source inventory that could be matched
to units found in the National Electric Energy Database System (NEEDS) v6 that is used by the Integrated
Planning Model (IPM) to develop projected EGU emissions. It was necessary to put these EGUs into a
separate sector in the platform because EGUs use different temporal profiles than other sources in the
point sector and it is useful to segregate these emissions from the rest of the point sources to facilitate
summaries of the data. Sources not matched to units found in NEEDS were placed into the pt_oilgas or
ptnonipm sectors. For studies that include analytic years, the sources in the ptegu sector are fully replaced
with the emissions output from IPM. It is therefore important that the matching between the NEI and
NEEDS database be as complete as possible because there can be double-counting of emissions in
analytic year modeling scenarios if emissions for units projected by IPM are not properly matched to the
units in the base year point source inventory.

The 2020 ptegu emissions inventory is a subset of the point source flat file exported from the Emissions
Inventory System (EIS). In the point source flat file, emission records for sources that have been matched
to the NEEDS database have a value filled into the IPM_YN column based on the matches stored within
EIS. Thus, unit-level emissions were split into a separate EGU flat file for units that have a populated
(non-null) ipm_yn field. A populated ipm_yn field indicates that a match was found for the EIS unit in the
NEEDS v6 database. Updates were made to the flat file output from EIS as follows:

• ORIS facility and unit identifiers were updated based on additional matches in a cross-platform
spreadsheet, based on state comments, and using the EIS alternate identifiers table as described
later in this section.
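
As an illustration of the split described above, the sketch below separates records with a populated ipm_yn field from the rest. The dict-based record layout is a hypothetical stand-in for the flat file, not the actual EIS export format.

```python
def split_by_ipm_yn(records):
    """Split records into (EGU, other) based on a populated ipm_yn field.

    A field counts as populated when it is not None and not blank.
    """
    egu, other = [], []
    for rec in records:
        value = rec.get("ipm_yn")
        if value is not None and str(value).strip() != "":
            egu.append(rec)    # matched to a NEEDS unit
        else:
            other.append(rec)  # no NEEDS match recorded
    return egu, other


records = [
    {"unit_id": "A", "ipm_yn": "1234_B1"},  # made-up match value
    {"unit_id": "B", "ipm_yn": ""},
    {"unit_id": "C", "ipm_yn": None},
]
egu, other = split_by_ipm_yn(records)
print([r["unit_id"] for r in egu], [r["unit_id"] for r in other])
```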

Some units in the ptegu sector are matched to Continuous Emissions Monitoring System (CEMS) data via
Office of Regulatory Information System (ORIS) facility codes and boiler IDs. For the matched units, the
annual emissions of NOx and SO2 in the flat file were replaced with the hourly CEMS emissions in base
year modeling. For other pollutants at matched units, the hourly CEMS heat input data were used to
allocate the NEI annual emissions to hourly values. All stack parameters, stack locations, and Source
Classification Codes (SCC) for these sources come from the flat file. If CEMS data exist for a unit, but
the unit is not matched to the NEI, the CEMS data for that unit were not used in the modeling platform.
However, if the source exists in the NEI and is not matched to a CEMS unit, the emissions from that
source are still modeled using the annual emission value in the NEI temporally allocated to hourly values.
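
The heat-input allocation for matched units can be sketched as a proportional split: each hour receives the share of the annual total equal to its share of the annual heat input. This is a minimal sketch under that assumption; the function name and the short list of hours are illustrative and do not reproduce the SMOKE implementation.

```python
def allocate_by_heat_input(annual_emissions, hourly_heat_input):
    """Allocate an annual emissions total to hours in proportion to heat input."""
    total_heat = sum(hourly_heat_input)
    if total_heat <= 0:
        raise ValueError("heat input must sum to a positive value")
    return [annual_emissions * h / total_heat for h in hourly_heat_input]


# Four hours stand in for a full year of hourly CEMS heat input values.
hourly = allocate_by_heat_input(100.0, [1.0, 3.0, 0.0, 4.0])
print(hourly)
```

Because the hourly shares sum to one, the hourly values sum back to the annual total, and hours with no heat input receive no emissions.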

EIS stores many matches from NEI units to the ORIS facility codes and boiler IDs used to reference the
CEMS data. In the flat file, emission records for point sources matched to CEMS data have values filled
into the ORIS_FACILITY_CODE and ORIS_BOILER_ID columns. The CEMS data are available at
https://campd.epa.gov/data. Many smaller emitters in the CEMS program cannot be matched to the NEI
due to differences in the way a unit is defined between the NEI and CEMS datasets, or due to
uncertainties in source identification such as inconsistent plant names in the two data systems. In
addition, the NEEDS database of units modeled by IPM includes many smaller emitting EGUs that do not
have CEMS. Therefore, there will be more units in the ptegu sector than have CEMS data.

Matches from the NEI to ORIS codes and the NEEDS database were improved in the platform where
applicable. In some cases, NEI units in EIS match to many CAMD units. In these cases, a new entry was
made in the flat file with a "_M_" in the ipm_yn field of the flat file to indicate that there are "multiple"
ORIS IDs that match that unit. This helps facilitate appropriate temporal allocation of the emissions by
SMOKE. Temporal allocation for EGUs is discussed in more detail in the Ancillary Data section below.

The EGU flat file was split into two flat files: those that have unit-level matches to CEMS data using the
oris_facility_code and oris_boiler_id fields and those that do not, so that different temporal profiles could
be applied. In addition, the hourly CEMS data were processed through v2.1 of the CEMCorrect tool to
mitigate the impact of unmeasured values in the data.

3.2.1.2	Point Oil and Gas Sector (pt_oilgas)

The pt_oilgas sector was separated from the ptnonipm sector by selecting sources with specific North
American Industry Classification System (NAICS) codes shown in Table 3-4. The emissions and other
source characteristics in the pt_oilgas sector are submitted by states, while EPA developed a dataset of
nonpoint oil and gas emissions for each county in the U.S. with oil and gas activity that was available for
states to use. Nonpoint oil and gas emissions can be found in the np_oilgas sector. The pt_oilgas sector
includes emissions from offshore oil platforms. Where available, the point source emissions submitted as
part of the 2020 NEI process were used. More information on the development of the 2020 NEI oil and
gas emissions can be found in Section 13 of the 2020 NEI TSD.

Table 3-4. Point source oil and gas sector NAICS Codes

NAICS     NAICS description
2111      Oil and Gas Extraction
211112    Natural Gas Liquid Extraction
21112     Crude Petroleum Extraction
211120    Crude Petroleum Extraction
21113     Natural Gas Extraction
211130    Natural Gas Extraction
213111    Drilling Oil and Gas Wells
213112    Support Activities for Oil and Gas Operations
2212      Natural Gas Distribution
22121     Natural Gas Distribution
221210    Natural Gas Distribution
237120    Oil and Gas Pipeline and Related Structures Construction
4861      Pipeline Transportation of Crude Oil
48611     Pipeline Transportation of Crude Oil
486110    Pipeline Transportation of Crude Oil
4862      Pipeline Transportation of Natural Gas
48621     Pipeline Transportation of Natural Gas
486210    Pipeline Transportation of Natural Gas
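
The NAICS-based selection in Table 3-4 can be sketched as a set-membership test. The sketch below assumes exact matching on the codes listed in the table and considers only the pt_oilgas/ptnonipm choice; how the platform treats codes of other lengths, and the separate ptegu and airports selections, are outside this illustration.

```python
# NAICS codes listed in Table 3-4
PT_OILGAS_NAICS = {
    "2111", "211112", "21112", "211120", "21113", "211130",
    "213111", "213112", "2212", "22121", "221210", "237120",
    "4861", "48611", "486110", "4862", "48621", "486210",
}


def assign_sector(naics_code):
    """Return 'pt_oilgas' if the code is in Table 3-4, else 'ptnonipm'."""
    code = str(naics_code).strip()
    return "pt_oilgas" if code in PT_OILGAS_NAICS else "ptnonipm"


print(assign_sector("211130"))  # code listed in the table
print(assign_sector("325110"))  # code not listed in the table
```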

3.2.1.3	Airports Sector (airports)

Emissions at airports were separated from other sources in the point inventory based on sources that have
the facility source type of 100 (airports). The airports sector includes all aircraft types used for public,
private, and military purposes and aircraft ground support equipment. The Federal Aviation
Administration's (FAA) Aviation Environmental Design Tool (AEDT) is used to estimate emissions for
this sector. Additional information about aircraft emission estimates can be found in section 3 of the 2020
NEI TSD. EPA used airport-specific factors where available. Airport emissions were spread out into
multiple 12km grid cells when the airport runways were determined to overlap multiple grid cells.
Otherwise, airport emissions for a specific airport are confined to one air quality model grid cell.

3.2.1.4	Non-IPM Sector (ptnonipm)

With some exceptions, the ptnonipm sector contains the point sources that are not in the ptegu, pt_oilgas,
or airports sectors. For the most part, the ptnonipm sector reflects non-EGU emissions sources and rail
yards. However, it is possible that some low-emitting EGUs not matched to units in the NEEDS database or
to CEMS data are in the ptnonipm sector.

The ptnonipm sector contains a small amount of fugitive dust PM emissions from vehicular traffic on
paved or unpaved roads at industrial facilities, coal handling at coal mines, and grain elevators. Sources
with state/county FIPS code ending with "777" are in the NEI but are not included in any modeling
sectors. These sources typically represent mobile (temporary) asphalt plants that are only reported for
some states and are generally in a fixed location for only a part of the year and are therefore difficult to
allocate to specific places and days as is needed for modeling. Therefore, these sources are dropped from
the point-based sectors in the modeling platform.

The ptnonipm sources (i.e., non-EGU and non-oil and gas sources) were used as-is from the 2020 NEI
point inventory. Solvent emissions from point sources were removed from the np_solvents sector to
prevent double-counting, so that all point sources can be retained in the modeling as point sources rather
than as area sources. The modeling was based on the point flat file exported from EIS on January 28, 2023,
with edits made through April 14, 2023, that included corrections to how the selection was implemented in
EIS, updates from the state/local review, and updates specific to ethylene oxide.

Emissions from rail yards are included in the ptnonipm sector. Railyards are from the 2020 NEI railyard
inventory. Additional information about railyard estimates can be found in section 3 of the 2020 NEI
TSD.

3.2.3 Nonpoint Sources (afdust, ag, nonpt, np_oilgas, rwc)

This section describes the stationary nonpoint sources in the NEI nonpoint data category. Locomotives,
C1 and C2 CMV, and C3 CMV are included in the NEI nonpoint data category but are mobile sources that
are described in Section 3.2.4. The 2020 NEI TSD includes documentation for the nonpoint data.

Nonpoint tribal emissions submitted to the NEI are dropped during spatial processing with SMOKE due
to the configuration of the spatial surrogates. Part of the reason for this is to prevent possible double-
counting with county-level emissions and also because spatial surrogates for tribal data are not currently
available. These omissions are not expected to have an impact on the results of the air quality modeling at
the 12-km resolution used for this platform.

The following subsections describe how the sources in the NEI nonpoint inventory were separated into
modeling platform sectors, along with any data that were updated (replaced) with non-NEI data.

3.2.3.1	Area Fugitive Dust Sector (afdust)

The area-source fugitive dust (afdust) sector contains PM10 and PM2.5 emission estimates for nonpoint
SCCs identified by EPA as dust sources. Categories included in the afdust sector are paved roads,
unpaved roads and airstrips, construction (residential, industrial, road and total), agriculture production,
and mining and quarrying. It does not include fugitive dust from grain elevators, coal handling at coal
mines, or vehicular traffic on paved or unpaved roads at industrial facilities because these are treated as
point sources so they are properly located.

The afdust sector was separated from other nonpoint sectors to allow for the application of a "transport
fraction," and meteorological/precipitation reductions. These adjustments were applied using a script that
applies land use-based gridded transport fractions based on landscape roughness, followed by another
script that zeroes out emissions for days on which at least 0.01 inches of precipitation occurs or there is
snow cover on the ground. The land use data used to reduce the NEI emissions determines the amount of
emissions that were subject to transport. This methodology is discussed in Pouliot, et al., 2010, and in
"Fugitive Dust Modeling for the 2008 Emissions Modeling Platform" (Adelman, 2012). Both the
transport fraction and meteorological adjustments were based on the gridded resolution of the platform
(i.e., 12km grid cells); therefore, different emissions would result if the process were applied to different
grid resolutions. A limitation of the transport fraction approach is the lack of monthly variability that
would be expected with seasonal changes in vegetative cover. While wind speed and direction are not
accounted for in the emissions processing, the hourly variability due to soil moisture, snow cover and
precipitation was accounted for in the subsequent meteorological adjustment.
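
The two afdust adjustments can be sketched for a single grid cell and time step as follows. This is a simplified illustration, not the platform scripts: the function name and inputs are assumptions, and the only values taken from the text are the 0.01 inch precipitation threshold and the snow cover condition.

```python
PRECIP_THRESHOLD_IN = 0.01  # inches of precipitation at or above which emissions are zeroed


def adjust_afdust(emissions, transport_fraction, precip_in, snow_cover):
    """Apply the transport fraction, then the meteorological zero-out.

    emissions          unadjusted fugitive dust emissions for the grid cell
    transport_fraction land use-based fraction (0 to 1) for the grid cell
    precip_in          precipitation in inches for the period
    snow_cover         True if snow is on the ground
    """
    transported = emissions * transport_fraction
    if precip_in >= PRECIP_THRESHOLD_IN or snow_cover:
        return 0.0
    return transported


print(adjust_afdust(10.0, 0.5, 0.0, False))   # dry, no snow
print(adjust_afdust(10.0, 0.5, 0.02, False))  # wet period
```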

Paved road dust emissions were from the 2020 NEI. For the fugitive dust emissions compiled into the
2020 NEI, meteorological adjustments were applied to paved and unpaved road SCCs but not transport
adjustments. This is because the modeling platform applies meteorological adjustments and transport
adjustments based on unadjusted NEI values. For the 2020 platform, the meteorological adjustments that
were applied in the NEI to paved and unpaved road SCCs were backed out and reapplied in SMOKE at an
hourly resolution for each grid cell. The FF10 that is run through SMOKE consists of 100% unadjusted
emissions, and after SMOKE all afdust sources have both transport and meteorological adjustments
applied according to year 2020 meteorology.

For categories other than paved and unpaved roads, where states submitted afdust data it was assumed
that the state-submitted data were not met-adjusted and therefore the meteorological adjustments were
applied. Thus, if states submitted data that were met-adjusted for sources other than paved and unpaved
roads, these sources would have been adjusted for meteorology twice. Even with that possibility, air
quality modeling shows that, in general, dust is frequently overestimated in the air quality modeling
results.

3.2.3.2	Agricultural Livestock Sector (livestock)

The livestock emissions in this sector are based only on the SCCs starting with 2805. The livestock
emissions are related to beef and dairy cattle, poultry production and waste, swine production, waste from
horses and ponies, and production and waste for sheep, lambs, and goats. The sector does not include
quite all of the livestock NH3 emissions, as there is a very small amount of NH3 emissions from livestock
in the ptnonipm inventory (as point sources). In addition to NH3, the sector includes livestock emissions
from all pollutants other than PM2.5. PM2.5 from livestock are in the afdust sector.

Agricultural livestock emissions in the 2020 platform were from the 2020 NEI, which is a mix of state-
submitted data and EPA estimates. Livestock emissions utilized improved animal population data. VOC
livestock emissions, new for this sector, were estimated by multiplying a national VOC/NH3 emissions
ratio by the county NH3 emissions. The 2020 NEI approach for livestock utilizes daily emission factors by
animal and county from a model developed by Carnegie Mellon University (CMU) (Pinder, 2004;
McQuilling, 2015) and the 2020 U.S. Department of Agriculture (USDA) National Agricultural Statistics
Service (NASS) survey. Details on the approach are provided in Section 10 of the 2020 NEI TSD.
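
The VOC estimate described above is a single scaling step: county VOC equals county NH3 multiplied by a national VOC/NH3 ratio. The sketch below shows that arithmetic. All numbers and county identifiers are made up for illustration, and the ratio is assumed here to be computed from two national totals.

```python
def livestock_voc(county_nh3, national_voc, national_nh3):
    """Scale county NH3 emissions by a national VOC/NH3 ratio.

    county_nh3 maps a county identifier to its livestock NH3 emissions.
    """
    ratio = national_voc / national_nh3
    return {county: nh3 * ratio for county, nh3 in county_nh3.items()}


# Illustrative values only: a national ratio of 0.25 applied to two counties.
print(livestock_voc({"county_1": 200.0, "county_2": 40.0}, 25.0, 100.0))
```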

3.2.3.3	Agricultural Fertilizer Sector (fertilizer)

As described in the 2020 NEI TSD, fertilizer emissions for 2020 were based on the FEST-C model
(https://www.cmascenter.org/fest-c/). Unlike most of the other emissions input to the CMAQ model,
fertilizer emissions are computed during a run of CMAQ in bidirectional mode and are output during the
model run. The bidirectional version of CMAQ (v5.3) and the Fertilizer Emissions Scenario Tool for
CMAQ (FEST-C, v1.3) were used to estimate ammonia (NH3) emissions from agricultural soils. The
computed emissions were saved during the CMAQ run so they can be included in emissions summaries
and in other model runs that do not use the bidirectional method.

FEST-C is the software program that processes land use and agricultural activity data to develop inputs
for the CMAQ model when run with bidirectional exchange. FEST-C reads land use data from the
Biogenic Emissions Landuse Dataset (BELD), meteorological variables from the Weather Research and
Forecasting (WRF) model, and nitrogen deposition data from a previous or historical average CMAQ
simulation. FEST-C then uses the Environmental Policy Integrated Climate (EPIC) modeling system
(https://epicapex.tamu.edu/epic/) to simulate the agricultural practices and soil biogeochemistry and
provides information regarding fertilizer timing, composition, application method and amount.

An iterative calculation was applied to estimate fertilizer emissions. First, fertilizer application by crop
type was estimated using FEST-C modeled data. To develop the NEI emissions, CMAQ v5.4 was run
with the Surface Tiled Aerosol and Gaseous Exchange (STAGE) deposition option along with
bidirectional exchange to estimate fertilizer and biogenic NH3 emissions. However, for this study, the
M3DRY option was used to develop the fertilizer emissions.

The following activity parameters were input into the EPIC model:

•	Grid cell meteorological variables from WRF

•	Initial soil profiles/soil selection

•	Presence of 21 major crops: irrigated and rain fed hay, alfalfa, grass, barley, beans, grain corn,
silage corn, cotton, oats, peanuts, potatoes, rice, rye, grain sorghum, silage sorghum, soybeans,
spring wheat, winter wheat, canola, and other crops (e.g., lettuce, tomatoes, etc.)

•	Fertilizer sales to establish the type/composition of nutrients applied

•	Management scenarios for the 10 USDA production regions. These include irrigation, tile
drainage, intervals between forage harvest, fertilizer application method (injected versus surface
applied), and equipment commonly used in these production regions.

The WRF meteorological model was used to provide grid cell meteorological parameters for year 2020
using a national 12-km rectangular grid covering the continental U.S. Initial soil nutrient and pH
conditions in EPIC were based on the 1992 USDA Soil Conservation Service (SCS) Soils-5 survey. The
EPIC model then was run for 25 years using current fertilization and agricultural cropping techniques to
estimate soil nutrient content and pH for the 2017 EPIC/WRF/CMAQ simulation.

The presence of crops in each model grid cell was determined using USDA Census of Agriculture data
(2012) and USGS National Land Cover data (2011). These two data sources were used to compute the
fraction of agricultural land in a model grid cell and the mix of crops grown on that land.

Fertilizer sales data and the 6-month period in which they were sold were extracted from the 2014
Association of American Plant Food Control Officials (AAPFCO,
http://www.aapfco.org/publications.html). AAPFCO data were used to identify the composition (e.g.,
urea, nitrate, organic) of the fertilizer used, and the amount applied is estimated using the modeled crop
demand. These data were useful in making a reasonable assignment of what kind of fertilizer is being
applied to which crops.

Management activity data refer to data used to estimate representative crop management schemes. The
USDA Agricultural Resource Management Survey (ARMS,
https://www.nass.usda.gov/Surveys/Guide_to_NASS_Surveys/Ag_Resource_Management/) was used to
provide management activity data. These data cover 10 USDA production regions and provide
management schemes for irrigated and rain fed hay, alfalfa, grass, barley, beans, grain corn, silage corn,
cotton, oats, peanuts, potatoes, rice, rye, grain sorghum, silage sorghum, soybeans, spring wheat, winter
wheat, canola, and other crops (e.g., lettuce, tomatoes, etc.).

3.2.3.4	Nonpoint Oil-gas Sector (np_oilgas)

The nonpoint oil and gas (np_oilgas) sector includes onshore and offshore oil and gas emissions. The
EPA estimated emissions for all counties with 2020 oil and gas activity data with the Oil and Gas Tool.
The types of sources covered include drill rigs, workover rigs, artificial lift, hydraulic fracturing engines,
pneumatic pumps and other devices, storage tanks, flares, truck loading, compressor engines, and
dehydrators. Because of the importance of emissions from this sector, special consideration is given to
the speciation, spatial allocation, and monthly temporalization of nonpoint oil and gas emissions, instead
of relying on older, more generalized profiles.

The 2020 NEI version of the Nonpoint Oil and Gas Emission Estimation Tool (i.e., the "NEI oil and gas
tool") was used to estimate 2020 emissions. Year 2020 oil and gas activity data were obtained from Enverus'
activity database (www.enverus.com) and were also supplied by some state air agencies. The NEI oil and gas tool is an
Access database that utilizes county-level activity data (e.g., oil production and well counts), operational
characteristics (types and sizes of equipment), and emission factors to estimate emissions. The tool was
used to create a CSV-formatted emissions dataset covering all national nonpoint oil and gas emissions.
This dataset was converted to the FF10 format for use in SMOKE modeling. More details on the inputs
for and running of the tool for 2020 are provided in the 2020 NEI TSD.

A new source was added to the oil and gas sector for the 2020 NEI. Pipeline Blowdowns and Pigging
(SCC= 2310021801) emissions were estimated using US EPA Greenhouse Gas Reporting Program
(GHGRP) data. These Pipeline Blowdowns and Pigging emissions included county-level estimates of
VOC, benzene, toluene, ethylbenzene, and xylene (BTEX). These emissions estimates were calculated
outside of the Oil and Gas Tool and submitted to EIS separately from the Oil and Gas Tool emissions.
These emissions were considered EPA default emissions and SLTs had the opportunity to submit their
own Pipeline Blowdowns and Pigging (e.g., Utah) emissions and/or accept/omit these emissions using the
Nonpoint Survey. Unfortunately, these EPA default Pipeline Blowdowns and Pigging emissions did not
get into the 2020 NEI release for the states that accepted these emissions due to EIS tagging issues. These
emissions were included in this 2020 Emissions Modeling Platform.

Lastly, EPA and the state of New Mexico worked together to exercise the point source subtraction step in
the Oil and Gas Tool during the 2020 NEI development period. This point source subtraction step was
used for New Mexico because additional oil and gas point sources were submitted by New Mexico that
were the same processes that are estimated in the Oil and Gas Tool (non-point sources). This point source
subtraction step is a process used to eliminate possible double counting of sources in the Oil and Gas
Tool that are already defined in the point source inventory. Unfortunately, the resulting non-point
emissions from the point source subtraction step for New Mexico did not get into the 2020 NEI release
due to EIS tagging issues. New Mexico non-point oil and gas emissions are overestimated in the 2020
NEI as a result. This overestimation was corrected for this 2020 Emissions Modeling Platform.

3.2.3.5	Residential Wood Combustion Sector (rwc)

The residential wood combustion (rwc) sector includes residential wood burning devices such as
fireplaces, fireplaces with inserts (inserts), free standing woodstoves, pellet stoves, outdoor hydronic
heaters (also known as outdoor wood boilers), indoor furnaces, and outdoor burning in firepots and
chimeneas. Free standing woodstoves and inserts are further differentiated into three categories:
1) conventional (not EPA certified); 2) EPA certified, catalytic; and 3) EPA certified, noncatalytic.
Generally speaking, the conventional units were constructed prior to 1988. Units constructed after 1988
have to meet EPA emission standards and they are either catalytic or non-catalytic. As with the other
nonpoint categories, a mix of S/L and EPA estimates were used. The EPA's estimates use updated
methodologies for activity data and some changes to emission factors.

The 2020 platform RWC emissions are unchanged from the data in the 2020 NEI and include some
improvements to RWC emissions estimates developed as part of the 2020 NEI process. The EPA, along
with the Commission on Environmental Cooperation (CEC), the Northeast States for Coordinated Air Use
Management (NESCAUM), and Abt Associates, conducted a national survey of wood-burning activity in
2018. The results of this survey were used to estimate county-level burning activity data. The activity data
for RWC processes is the amount of wood burned in each county, which is based on data from the CEC
survey on the fraction of homes in each county that use each wood-burning appliance and the average
amount of wood burned in each appliance. These assumptions are used with the number of occupied
homes in each county to estimate the total amount of wood burned in each county, in cords for cordwood
appliances and tons for pellet appliances. Cords of wood are converted to tons using county-level density
factors from the U.S. Forest Service. RWC emissions were calculated by multiplying the tons of wood
burned by emissions factors. For more information on the development of the residential wood
combustion emissions, see Section 27 of the 2020 NEI TSD.
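
The RWC calculation chain described above (homes, appliance fraction, wood burned per appliance, cord-to-ton conversion, emission factor) can be sketched as follows. This is an illustrative sketch only: the function names and all numeric inputs are made up, and the units of the emission factor are assumed to be mass of pollutant per ton of wood.

```python
def wood_burned_tons(occupied_homes, appliance_fraction, wood_per_appliance,
                     tons_per_cord=None):
    """Estimate tons of wood burned in one county for one appliance type.

    wood_per_appliance is in cords for cordwood appliances (pass
    tons_per_cord to convert) or already in tons for pellet appliances.
    """
    wood = occupied_homes * appliance_fraction * wood_per_appliance
    if tons_per_cord is not None:
        wood *= tons_per_cord  # county-level density factor: cords to tons
    return wood


def rwc_emissions(tons_burned, emission_factor):
    """Emissions = tons of wood burned x emission factor (mass per ton)."""
    return tons_burned * emission_factor


# Made-up inputs for a cordwood appliance in a single county.
tons = wood_burned_tons(1000, 0.1, 2.0, tons_per_cord=1.25)
print(tons, rwc_emissions(tons, 0.5))
```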

3.2.3.6	Solvents (np_solvents)

The np_solvents sector is a diverse collection of emission sources for which emissions are driven by
evaporation. Included in this sector are everyday items, such as cleaners, personal care products,
adhesives, architectural and aerosol coatings, printing inks, and pesticides. These sources exclusively emit
organic gases and feature origins spanning residential, commercial, institutional, and industrial settings.
The organic gases that evaporate from these sources often fulfill other functions than acting as a
traditional solvent (e.g., propellants, fragrances, emollients). For this reason, the solvents sector is often
referred to as "volatile chemical products." Emissions from this sector for the 2020 modeling platform are
unchanged from the 2020 NEI, and users should review Section 32 of the 2020 NEI TSD for additional
information on the construction of emissions estimates for solvents in the 2020 NEI.

3.2.3.7	Other Nonpoint Sources (nonpt)

The 2020 platform nonpt sector inventory is unchanged from the April 2023 version of the 2020 NEI.
Stationary nonpoint sources that were not subdivided into the afdust, livestock, fertilizer, np_oilgas, rwc,
or np_solvents sectors were assigned to the "nonpt" sector. Locomotives and CMV mobile sources from
the 2020 NEI nonpoint inventory are described with the mobile sources. The types of sources in the nonpt
sector include:

•	stationary source fuel combustion, including industrial, commercial, and residential and orchard
heaters;

•	chemical manufacturing;

•	industrial processes such as commercial cooking, metal production, mineral processes, petroleum
refining, wood products, fabricated metals, and refrigeration;

•	storage and transport of petroleum for uses such as portable gas cans, bulk terminals, gasoline
service stations, aviation, and marine vessels;

•	storage and transport of chemicals;

•	waste disposal, treatment, and recovery via incineration, open burning, landfills, and composting;
and

•	miscellaneous area sources such as cremation, hospitals, lamp breakage, and automotive repair
shops.

The nonpt sector includes emission estimates for Portable Fuel Containers (PFCs), also known as "gas
cans." The PFC inventory consists of three distinct sources of PFC emissions, further distinguished by
residential or commercial use. The three sources are: (1) displacement of the vapor within the can; (2)
emissions due to evaporation (i.e., diurnal emissions); and (3) emissions due to permeation. Note that
spillage and vapor displacement associated with using PFCs to refuel nonroad equipment are included in
the nonroad inventory.

3.2.4 Mobile Sources (onroad, onroad_ca_adj, nonroad, cmv_c1c2, cmv_c3, rail)

Mobile sources are emissions from vehicles that move and include several sectors. Onroad mobile source
emissions result from motorized vehicles that are normally operated on public roadways. These include
passenger cars, motorcycles, minivans, sport-utility vehicles, light-duty trucks, heavy-duty trucks, and
buses. Nonroad mobile source emissions are from vehicles that do not operate on roads such as tractors,
construction equipment, lawnmowers, and recreational marine vessels. All nonroad emissions are treated
as low-level emissions (i.e., they are released into model layer 1) and most nonroad emissions are
represented as county totals. Note that rail yard and airport emissions are part of the NEI point data
category.

Commercial marine vessel (CMV) emissions are split into two sectors: emissions from Category 1 and
Category 2 vessels are in the cmv_c1c2 sector, and emissions from the larger ocean-going Category 3
vessels are in the cmv_c3 sector. Both CMV sectors are treated as point sources with plume rise.
Locomotive emissions are in the rail sector. Having the emissions split into these sectors facilitates
separating them in summaries and also allows for CMV to be modeled with plume rise. In addition, CMV
emissions are treated as hourly point source emissions in the modeling platform, although they are part of
the NEI nonpoint data category.

3.2.4.1 Onroad (onroad)

Onroad mobile sources include emissions from motorized vehicles operating on public roadways. These
include passenger cars, motorcycles, minivans, sport-utility vehicles, light-duty trucks, heavy-duty trucks,
and buses. The sources are further divided by the fuel they use, including diesel, gasoline, E-85, and
compressed natural gas (CNG) vehicles. The sector characterizes emissions from parked vehicle
processes (e.g., starts, hot soak, and extended idle) as well as from on-network processes (i.e., from
vehicles as they move along the roads). For more details on the approach and for a summary of the
MOVES inputs submitted by states, see section 5 of the 2020 NEI TSD.

For the 2020 modeling platform, activity data (i.e., VMT, VPOP, starts, off-network idling, and hoteling)
were based on state-submitted CDBs, as well as data from Federal Highway Administration (FHWA)
annual VMT at the county level. A new MOVES run for 2020 was done using MOVES3.

Except for California, all onroad emissions are generated using the SMOKE-MOVES emissions modeling
framework that leverages MOVES-generated emission factors (https://www.epa.gov/moves), county and
SCC-specific activity data, and hourly 2020 meteorological data. Specifically, EPA used MOVES3
inputs for representative counties, vehicle miles traveled (VMT), vehicle population (VPOP), and hoteling
hours data for all counties, along with tools that integrated the MOVES model with SMOKE. In this way,
it was possible to take advantage of the gridded hourly temperature data available from meteorological
modeling that are also used for air quality modeling. The onroad source classification codes (SCCs) in the
modeling platform are more finely resolved than those in the National Emissions Inventory (NEI). The
NEI SCCs distinguish vehicles and fuels. The SCCs used in the model platform also distinguish between
emissions processes (i.e., off-network, on-network, and extended idle), and road types.

MOVES3 includes the following updates from MOVES2014b:

•	Updated emission rates:

o Updated heavy-duty (HD) diesel running emission rates based on manufacturer in-use
  testing data from hundreds of HD trucks
o Updated HD gasoline and compressed natural gas (CNG) trucks
o Updated light-duty (LD) emission rates for hydrocarbons (HC), CO, NOx, and PM

•	Includes updated fuel information

•	Incorporates HD Phase 2 Greenhouse Gas (GHG) rule, allowing for finer distinctions among HD
vehicles

•	Accounts for glider vehicles that incorporate older engines into new vehicle chassis

•	Accounts for off-network idling - emissions beyond the idling that is already considered in the
MOVES drive cycle

•	Includes revisions to inputs for hoteling

•	Adds starts as a separate type of rate and activity data

Except for California, all onroad emissions were computed with SMOKE-MOVES by multiplying
specific types of vehicle activity data by the appropriate emission factors. SMOKE-MOVES was run for
specific modeling grids. Emissions for the contiguous U.S. states and Washington, D.C., were computed
for a grid covering those areas.

SMOKE-MOVES makes use of emission rate "lookup" tables generated by MOVES that differentiate
emissions by process (i.e., running, start, vapor venting, etc.), vehicle type, road type, temperature, speed,
hour of day, etc. To generate the MOVES emission rates that could be applied across the U.S., EPA used
an automated process to run MOVES to produce year 2020-specific emission factors by temperature and
speed for a series of "representative counties," to which every other county was mapped. The
representative counties for which emission factors are generated are selected according to their state,
elevation, fuels, age distribution, ramp fraction, and inspection and maintenance programs. Each county
is then mapped to a representative county based on its similarity to the representative county with respect
to those attributes. For this study, there are 254 representative counties in the continental U.S. and a total
of 292 including the non-CONUS areas.
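
The mapping of counties to representative counties can be pictured with a short sketch. This is only an illustration under simplifying assumptions: counties are grouped when their attribute tuples match exactly, and the first county in sorted order stands in as the representative. The function name `map_to_representatives`, the county IDs, and the attribute values are hypothetical, and the actual selection procedure is described in the 2020 NEI TSD rather than here.

```python
from collections import defaultdict

def map_to_representatives(counties):
    """Map each county ID to a representative county ID.

    `counties` maps a county ID to a tuple of attributes (for example state,
    altitude class, fuel region). Counties with identical tuples form one
    group. Picking the first ID in sorted order as the representative is an
    assumption made for this sketch only.
    """
    groups = defaultdict(list)
    for county_id in sorted(counties):
        groups[counties[county_id]].append(county_id)
    mapping = {}
    for members in groups.values():
        representative = members[0]
        for county_id in members:
            mapping[county_id] = representative
    return mapping

# Hypothetical counties and attributes.
example = {
    "c1": ("NC", "low", "fuel_region_1"),
    "c2": ("NC", "low", "fuel_region_1"),
    "c3": ("CO", "high", "fuel_region_2"),
}
mapping = map_to_representatives(example)
```

In this toy example, `c1` and `c2` share one representative, while `c3` represents itself.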

Once representative counties have been identified, emission factors are generated with MOVES for each
representative county and for two "fuel months" - January to represent winter months, and July to
represent summer months - due to the different types of fuels used. SMOKE selects the appropriate
MOVES emissions rates for each county, hourly temperature, SCC, and speed bin and then multiplies the
emission rate by appropriate activity data. For on-roadway emissions, vehicle miles travelled (VMT) is
the activity data; off-network processes use vehicle population (VPOP), vehicle starts, and hours of off-
network idling (ONI); and hoteling hours are used to develop emissions for extended idling of
combination long-haul trucks. These calculations are done for every county and grid cell in the
continental U.S. for each hour of the year.
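
The rate-times-activity calculation described above can be sketched as follows. This is a minimal illustration and not SMOKE code: the table layout, the function name `rpd_emissions`, the keys, and all numeric values are hypothetical.

```python
# Hypothetical emission-rate lookup table in grams per mile, keyed by
# (representative county, fuel month, SCC, temperature bin in degF, speed bin).
# Every key and value here is made up for illustration.
RATE_TABLE = {
    ("rep_A", "jul", "scc_demo", 70, 5): 0.40,
    ("rep_A", "jul", "scc_demo", 80, 5): 0.50,
}

def rpd_emissions(rep_county, fuel_month, scc, temp_bin, speed_bin, vmt):
    """Return grams emitted: the looked-up rate (g/mile) times VMT (miles)."""
    rate = RATE_TABLE[(rep_county, fuel_month, scc, temp_bin, speed_bin)]
    return rate * vmt

# One county, SCC, and hour: pick the rate for that hour's temperature and
# speed bin, then multiply by the VMT allocated to that hour.
grams = rpd_emissions("rep_A", "jul", "scc_demo", 80, 5, vmt=1000.0)
```

The same pattern applies to the off-network processes, with VPOP, starts, ONI hours, or hoteling hours taking the place of VMT.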

The SMOKE-MOVES process for creating the model-ready emissions consists of the following steps:

1)	Determine which counties will be used to represent other counties in the MOVES runs.

2)	Determine which months will be used to represent other months' fuel characteristics.

3)	Create inputs needed only by MOVES. MOVES requires county-specific information on
vehicle populations, age distributions, and inspection-maintenance programs for each of the
representative counties.

4)	Create inputs needed both by MOVES and by SMOKE, including temperatures and activity
data.

5)	Run MOVES to create emission factor tables for the temperatures found in each county.

6)	Run SMOKE to apply the emission factors to activity data (VMT, VPOP, STARTS, off-network
idling, and HOTELING) to calculate emissions based on the gridded hourly temperatures in the
meteorological data.

7)	Aggregate the results to the county-SCC level for summaries and quality assurance.

The onroad emissions were processed in six streams, which were merged into the onroad sector
emissions after all six streams had been processed:

• rate-per-distance (RPD) uses VMT as the activity data plus speed and speed profile information to
compute on-network emissions from exhaust, evaporative, permeation, refueling, and brake and tire
wear processes;

•	rate-per-vehicle (RPV) uses VPOP activity data to compute off-network emissions from exhaust,
evaporative, permeation, and refueling processes;

•	rate-per-start (RPS) uses STARTS activity data to compute off-network emissions from vehicle starts;

•	rate-per-profile (RPP) uses VPOP activity data to compute off-network emissions from evaporative fuel
vapor venting, including hot soak (immediately after a trip) and diurnal (vehicle parked for a long period)
emissions;

•	rate-per-hour (RPH) uses hoteling hours activity data to compute off-network emissions from the
extended idling and auxiliary power unit processes of long-haul trucks; and

•	rate-per-hour off-network idling (RPHO) uses off-network idling hours activity data to compute
off-network idling emissions for all types of vehicles.

The onroad emissions inputs to MOVES for the 2020 platform are based on the 2020 NEI, described in
more detail in Section 5 of the 2020 NEI TSD. These inputs include:

•	Key parameters in the MOVES County databases (CDBs) including Low Emission Vehicle (LEV)
table

•	Fuel months

•	Activity data (e.g., VMT, VPOP, speed, HOTELING)

Fuel months, age distributions, and other inputs were consistent with those used to compute the 2020 NEI.
Activity data submitted by states and development of the EPA default activity data sets for VMT, VPOP,
and hoteling hours are described in detail in the 2020 NEI TSD and supporting documents. Hoteling hours
activity data were used to calculate emissions from extended idling and auxiliary power units (APUs) by
combination long-haul trucks.

SMOKE-MOVES uses vehicle miles traveled (VMT), vehicle population (VPOP), vehicle starts, hours of
off-network idling (ONI), and hours of hoteling, to calculate emissions. These datasets are collectively
known as "activity data". For each of these activity datasets, first a national dataset was developed; this
national dataset is called the "EPA default" dataset. The default dataset started with the 2020 NEI activity
data, which was supplemented with data submitted by state and local agencies. EPA default activity was
used for California, but the emissions were scaled to California-supplied values during the emissions
processing. The states that submitted activity data and the development of the EPA default activity data
sets for VMT, VPOP, and hoteling hours are described in detail in the 2020 NEI TSD (EPA, 2023) and
supporting documents.

In SMOKE 4.7, SMOKE-MOVES was updated to use speed distributions similarly to how they are used
when running MOVES in inventory mode. This new speed distribution file, called SPDIST, specifies the
amount of time spent in each MOVES speed bin for each county, vehicle (aka source) type, road type,
weekday/weekend, and hour of day. This file contains the same information at the same resolution as the
Speed Distribution table used by MOVES but is reformatted for SMOKE. Using the SPDIST file results
in a SMOKE emissions calculation that is more consistent with MOVES than the old hourly speed profile
(SPDPRO) approach, because emission factors from all speed bins can be used, rather than interpolating
between the two bins surrounding the single average speed value for each hour as is done with the
SPDPRO approach.
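
The difference between the two approaches can be illustrated with a small sketch. It assumes linear interpolation for the SPDPRO case and a time-weighted average for the SPDIST case; the function names, bin speeds, and emission factors are hypothetical and are not taken from SMOKE.

```python
def ef_from_spdist(ef_by_bin, time_fraction_by_bin):
    """SPDIST-style: weight each speed bin's emission factor by time spent in it."""
    total = sum(time_fraction_by_bin.values())
    weighted = sum(ef_by_bin[b] * f for b, f in time_fraction_by_bin.items())
    return weighted / total

def ef_from_spdpro(bin_speeds, ef_by_bin, avg_speed):
    """SPDPRO-style: interpolate between the two bins surrounding avg_speed."""
    bins = sorted(bin_speeds, key=bin_speeds.get)
    for lo, hi in zip(bins, bins[1:]):
        s_lo, s_hi = bin_speeds[lo], bin_speeds[hi]
        if s_lo <= avg_speed <= s_hi:
            w = (avg_speed - s_lo) / (s_hi - s_lo)
            return (1.0 - w) * ef_by_bin[lo] + w * ef_by_bin[hi]
    raise ValueError("average speed is outside the tabulated bins")

# Hypothetical bins: representative speeds (mph) and emission factors (g/mile).
speeds = {1: 10.0, 2: 20.0, 3: 30.0}
factors = {1: 3.0, 2: 1.0, 3: 2.0}
# Half the hour is spent at 10 mph and half at 30 mph, so the average is 20 mph.
fractions = {1: 0.5, 2: 0.0, 3: 0.5}

ef_dist = ef_from_spdist(factors, fractions)
ef_pro = ef_from_spdpro(speeds, factors, avg_speed=20.0)
```

In this made-up case the single average speed lands on the bin with the lowest emission factor, so the interpolated value understates the time-weighted value that uses all speed bins.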

For the 2020 NEI, to more accurately reflect the variation of average speeds from month to month
throughout the year 2020, month-specific SPDIST files were generated. Speed data from the Streetlight
dataset were used to generate hourly speed profiles by county, SCC, and month. The SPDIST files for
2020 NEI are based on a combination of the Streetlight project data and 2020 NEI MOVES CDBs. More
information can be found in the 2020 NEI TSD (EPA, 2023) and supporting documents.

Hoteling hours were capped by county at a theoretical maximum, and any hours in excess of the maximum
were removed. For calculating reductions, a dataset of truck stop parking space availability was used,
which includes a total number of parking spaces per county. This same dataset is used to develop the
spatial surrogate for allocating county-total hoteling emissions to model grid cells. The parking space
dataset includes several recent updates based on new truck stops opening and other new information.
There are 8,784 hours in the year 2020; therefore, the maximum number of possible hoteling hours in a
particular county is equal to 8,784 * the number of parking spaces in that county. Hoteling hours were
capped at that theoretical maximum value for 2020 in all counties. The final step related to hoteling
activity is to split county totals into separate values for extended idling (SCC 2202620153) and Auxiliary
Power Units (APUs) (SCC 2202620191). For 2020 modeling with MOVES3, a 7.2% APU split is used
nationwide, meaning that during 7.2% of the hoteling hours auxiliary power units are assumed to be
running.
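
The cap and the APU split described above amount to the following arithmetic. This is a sketch only: the function name and the example inputs are hypothetical, and treating all non-APU hours as extended idling is an assumption based on the two-way split described in the text.

```python
HOURS_IN_2020 = 366 * 24  # 2020 is a leap year, giving 8,784 hours
APU_FRACTION = 0.072      # nationwide APU share stated in the text

def cap_and_split_hoteling(hoteling_hours, parking_spaces):
    """Cap county hoteling hours, then split them into extended idle and APU."""
    max_hours = HOURS_IN_2020 * parking_spaces
    capped = min(hoteling_hours, max_hours)
    apu_hours = capped * APU_FRACTION
    return {"extended_idle": capped - apu_hours, "apu": apu_hours}

# Hypothetical county with 100 truck parking spaces and 1,000,000 reported hours.
result = cap_and_split_hoteling(1_000_000, parking_spaces=100)
```

With these inputs the reported hours exceed the 878,400-hour maximum, so the capped value is what gets split.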

Onroad "start" emissions are the instantaneous exhaust emissions that occur at the engine start (e.g., due
to the fuel rich conditions in the cylinder to initiate combustion) as well as the additional running exhaust
emissions that occur because the engine and emission control systems have not yet stabilized at the
running operating temperature. Operationally, start emissions are defined as the difference in emissions
between an exhaust emissions test with an ambient temperature start and the same test with the
engine and emission control systems already at operating temperature. As such, the units for start
emission rates are instantaneous grams/start.

MOVES3 uses vehicle population information to sort the vehicle population into source bins defined
by vehicle source type, fuel type (gas, diesel, etc.), regulatory class, model year and age. The model uses
default data from instrumented vehicles (or user-provided values) to estimate the number of starts for
each source bin and to allocate them among eight operating mode bins defined by the amount of time
parked ("soak time") prior to the start. Thus, MOVES3 accounts for different amounts of cooling of the
engine and emission control systems. Each source bin and operating mode has an associated g/start
emission rate. Start emissions are also adjusted to account for fuel characteristics, LD inspection and
maintenance programs, and ambient temperatures.

After creating VMT inputs for SMOKE-MOVES, off-network idle (ONI) activity data were also needed.
ONI is defined in MOVES as time during which a vehicle engine is running idle and the vehicle is
somewhere other than on the road, such as in a parking lot, a driveway, or at the side of the road. This
engine activity contributes to total mobile source emissions but does not take place on the road network.
Examples of ONI activity include:

•	light duty passenger vehicles idling while waiting to pick up children at school or to pick up
passengers at the airport or train station,

•	single unit and combination trucks idling while loading or unloading cargo or making
deliveries, and

•	vehicles idling at drive-through restaurants.

Note that ONI does not include idling that occurs on the road, such as idling at traffic signals, stop signs,
and in traffic—these emissions are included as part of the running and crankcase running exhaust
processes on the other road types. ONI also does not include long-duration idling by long-haul
combination trucks (hoteling/extended idle), as that type of long duration idling is accounted for in other
MOVES processes.

ONI activity hours were calculated based on VMT. For each representative county, the ratio of ONI hours
to onroad VMT (on all road types) was calculated using the MOVES ONI Tool by source type, fuel type,
and month. These ratios are then multiplied by each county's total VMT (aggregated by source type, fuel
type, and month) to get hours of ONI activity.

MOVES3 was run in emission rate mode to create emission factor tables for 2020, for all representative
counties and fuel months. The county databases used to run MOVES to develop the emission factor tables
included the state-specific control measures such as the California LEV program, and fuels represented
the year 2020. The range of temperatures run along with the average humidities used were specific to the
year 2020. The remaining settings for the CDBs are documented in the 2020 NEI TSD. To create the
emission factors, MOVES was run separately for each representative county and fuel month for each
temperature bin needed for the calendar year 2020. The MOVES results were post-processed into CSV-
formatted emission factor tables that can be read by SMOKE-MOVES.

The county databases (CDBs) used to run MOVES to develop the emission factor tables were those used
for the 2020 NEI and therefore included any updated data provided and accepted for the 2020 NEI
process. The 2020 NEI development included an extensive review of the various tables, including speed
distributions. Each county in the continental U.S. was classified according to its state,
altitude (high or low), fuel region, the presence of inspection and maintenance programs, the mean light-
duty age, and the fraction of ramps. A binning algorithm was executed to identify "like counties." The
result was 254 representative counties for CONUS.

Age distributions are a key input to MOVES in determining emission rates. The age distributions for 2020
were updated based on vehicle registration data obtained from IHS Markit, subject to reductions for older
vehicles. One of the findings of CRC project A-115 is that IHS data contain higher vehicle populations
than state agency analyses of the same Department of Motor Vehicles data, and the discrepancies tend to
increase with increasing vehicle age (i.e., there are more older vehicles in the IHS data). Appropriate
decreases in older vehicles were therefore applied when the age distributions were computed for 2020.

To create the emission factors, MOVES was run separately for each representative county and fuel month
and for each temperature bin needed for calendar year 2020. The CDBs used to run MOVES include the
state-specific control measures such as the California low emission vehicle (LEV) program. In addition,
the range of temperatures run along with the average humidities used were specific to the year 2020. The
MOVES results were post-processed into CSV-formatted emission factor tables that can be read by
SMOKE-MOVES.

California uses its own emission model, EMFAC, to develop onroad emissions inventories and provides
those inventories to EPA. EMFAC uses emission inventory codes (EICs) to characterize the emission
processes instead of SCCs. The EPA and California worked together to develop a code mapping to better
match EMFAC's EICs to EPA MOVES' detailed set of SCCs that distinguish between off-network and
on-network and brake and tire wear emissions. This detail is needed for modeling but not for the NEI.
California submitted onroad emissions for the 2020 NEI, and these emissions were used for 2020
modeling. The California inventory had CAPs and select HAPs, but did not have NH3 or refueling
emissions. The EPA added NH3 to the CARB inventory by using the state total NH3 from MOVES and
allocating it at the county level based on CO. Refueling emissions were taken from MOVES in California.
HAP emissions for VOCs and metals as provided by California were used, while other HAPs (e.g., PAHs)
were from MOVES.

The California onroad mobile source emissions were created through a hybrid approach of combining
state-supplied annual emissions with EPA-developed SMOKE-MOVES runs. Through this approach, the
platform was able to reflect the California-developed emissions, while leveraging the more detailed SCCs
and the highly resolved spatial patterns, temporal patterns, and speciation from SMOKE-MOVES. The
basic steps involved in temporally allocating onroad emissions from California based on SMOKE-
MOVES results were:

1)	Run CA using EPA inputs through SMOKE-MOVES to produce hourly emissions hereafter
known as "EPA estimates." These EPA estimates for CA were run in a separate sector called
"onroad_ca."

2)	Calculate ratios between state-supplied emissions and EPA estimates. The ratios were
calculated for each county/SCC/pollutant combination based on the California onroad
emissions inventory. The 2020 California data did not separate off and on-network emissions
or extended idling, and also did not include information for vehicles fueled by E-85, so these
differentiations were obtained using MOVES.

3)	Create an adjustment factor file (CFPRO) that includes EPA-to-state estimate ratios.

4)	Rerun CA through SMOKE-MOVES using EPA inputs and the new adjustment factor file.

Through this process, adjusted model-ready files were created that sum to annual totals from California,
but have the temporal and spatial patterns reflecting the highly resolved meteorology and SMOKE-
MOVES. After adjusting the emissions, this sector is called "onroad_ca_adj." Note that in emission
summaries, the emissions from the "onroad" and "onroad_ca_adj" sectors were summed and designated
as the emissions for the onroad sector.
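
The scaling in steps 2 through 4 can be sketched as follows. This is an illustration rather than the SMOKE implementation. The ratio is taken here as the state-supplied total divided by the EPA estimate, so that the adjusted hourly values sum to the state total; that direction is an assumption. The function names, keys, and values are hypothetical, as is the choice to skip keys with a zero EPA estimate.

```python
def adjustment_factors(state_annual, epa_annual):
    """Return state/EPA ratios per (county, SCC, pollutant) key.

    Keys whose EPA estimate is zero or missing are skipped in this sketch.
    """
    return {
        key: state_annual[key] / epa_annual[key]
        for key in state_annual
        if epa_annual.get(key, 0.0) > 0.0
    }

def apply_factors(epa_hourly, factors):
    """Scale each hourly EPA series; keys without a factor are left unchanged."""
    return {
        key: [value * factors.get(key, 1.0) for value in series]
        for key, series in epa_hourly.items()
    }

key = ("county_demo", "scc_demo", "NOX")   # hypothetical identifiers
epa_hourly = {key: [10.0, 30.0, 60.0]}     # hypothetical hourly EPA estimates
epa_annual = {k: sum(v) for k, v in epa_hourly.items()}
state_annual = {key: 50.0}                 # hypothetical state-supplied total

factors = adjustment_factors(state_annual, epa_annual)
adjusted = apply_factors(epa_hourly, factors)
```

The adjusted series keeps the hour-to-hour shape of the EPA estimate while matching the state-supplied annual total.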

3.2.4.2 Category 1, 2, and 3 commercial marine vessels (cmv_c1c2 and cmv_c3)

The cmv_c1c2 sector contains Category 1 and 2 CMV emissions. Category 1 and 2 vessels use diesel
fuel. All emissions in this sector are annual and at county-SCC resolution; however, in the NEI they are
provided at the sub-county level (i.e., port shape IDs) and by SCC and emission type (e.g., hoteling,
maneuvering). For more information on CMV sources, see Section 11 of the 2020 NEI TSD and the
supplemental documentation.8 C1 and C2 emissions that occur outside of state waters are not assigned to
states. For this modeling platform, all CMV emissions in the cmv_c1c2 sector are treated as hourly
gridded point sources with stack parameters that should result in them being placed in layer 1.

Sulfur dioxide (SO2) emissions reflect rules that reduced sulfur emissions for CMV that took effect in the
year 2015. The cmv_c1c2 inventory sector contains small to medium-size engine CMV emissions.
Category 1 and Category 2 (C1C2) marine diesel engines typically range in size from about 700 to 11,000
hp. These engines are used to provide propulsion power on many kinds of vessels including tugboats,
towboats, supply vessels, fishing vessels, and other commercial vessels in and around ports. They are also
used as stand-alone generators for auxiliary electrical power on many types of vessels. Category 1
represents engines up to 7 liters per cylinder displacement. Category 2 includes engines from 7 to 30 liters
per cylinder.

8 https://gaftp.epa.gov/Air/nei/2020/doc/supporting_data/nonpoint/CMV/.

The cmv_c1c2 inventory sector contains sources that traverse state and federal waters along with
emissions from surrounding areas of Canada, Mexico, and international waters. The cmv_c1c2 sources
are modeled as point sources but using plume rise parameters that cause the emissions to be released in
the ground layer of the air quality model.

The cmv_c1c2 sources within state waters are identified in the inventory with the Federal Information
Processing Standard (FIPS) county code for the state and county in which the vessel is registered. The
cmv_c1c2 sources that operate outside of state waters but within the Emissions Control Area (ECA) are
encoded with a state FIPS code of 85. The ECA areas include parts of the Gulf of Mexico, and parts of
the Atlantic and Pacific coasts.

Category 1 and 2 CMV emissions were developed for the 2020 NEI. The emissions were developed
based on signals from Automatic Identification System (AIS) transmitters. AIS is a tracking system used by
vessels to enhance navigation and avoid collision with other AIS transmitting vessels. The USEPA
Office of Transportation and Air Quality received AIS data from the U.S. Coast Guard (USCG) to
quantify all ship activity which occurred between January 1 and December 31, 2020. To ensure coverage
for all of the areas needed by the NEI, the requested and provided AIS data extend beyond 200 nautical
miles from the U.S. coast. The area covered by the NEI is roughly equivalent to the border of the U.S.
Exclusive Economic Zone and the North American ECA, although some non-ECA activity is captured
as well. Two types of AIS data were received: satellite (S-AIS) and terrestrial (T-AIS).

The AIS data were compiled into five-minute intervals by the USCG, providing a reasonably refined
assessment of a vessel's movement. For example, using a five-minute average, a vessel traveling at 25
knots would be captured every two nautical miles that the vessel travels. For slower moving vessels, the
distance between transmissions would be less. The ability to track vessel movements through AIS data
and link them to attribute data has allowed for the development of an inventory of very accurate emission
estimates. These AIS data were used to define the locations of individual vessel movements, estimate
hours of operation, and quantify propulsion engine loads. The compiled AIS data also included the
vessel's International Maritime Organization (IMO) number and Maritime Mobile Service Identifier
(MMSI), which allowed each vessel to be matched to its characteristics obtained from the Clarksons
ship registry (Clarksons, 2021).
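
The two-nautical-mile figure in the five-minute example above follows from the distance covered in one interval, taking one knot as one nautical mile per hour:

```latex
\[
d = 25\ \tfrac{\text{nmi}}{\text{hr}} \times \frac{5}{60}\ \text{hr} \approx 2.08\ \text{nmi}
\]
```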

The engine bore and stroke data were used to calculate cylinder volume. Any vessel that had a calculated
cylinder volume greater than 30 liters was incorporated into the USEPA's new Category 3 Commercial
Marine Vessel (C3CMV) model. The remaining records were assumed to represent Category 1 and 2
(C1C2) or non-ship activity. The C1C2 AIS data were quality assured including the removal of duplicate
messages, signals from pleasure craft, and signals that were not from CMV vessels (e.g., buoys,
helicopters, and vessels that are not self-propelled).
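
The displacement screen described above can be sketched with the standard swept-volume formula for a cylinder. The function names and the example dimensions are hypothetical, and the handling of values that fall exactly on the 7-liter and 30-liter boundaries is an assumption.

```python
import math

def cylinder_volume_liters(bore_mm, stroke_mm):
    """Swept volume of one cylinder, pi/4 * bore^2 * stroke, in liters."""
    volume_mm3 = math.pi / 4.0 * bore_mm ** 2 * stroke_mm
    return volume_mm3 / 1.0e6  # 1 liter = 1e6 cubic millimeters

def cmv_category(bore_mm, stroke_mm):
    """Classify an engine by per-cylinder displacement (boundaries assumed)."""
    liters = cylinder_volume_liters(bore_mm, stroke_mm)
    if liters >= 30.0:
        return "C3"
    if liters >= 7.0:
        return "C2"
    return "C1"

# Hypothetical engines: a small high-speed diesel, a medium-speed engine,
# and a large slow-speed two-stroke.
categories = [cmv_category(100, 100), cmv_category(280, 320), cmv_category(500, 2000)]
```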

The emissions were calculated for each time interval between consecutive AIS messages for each vessel
and allocated to the location of the message following the interval. Emissions were calculated
according to Equation 3-1.

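$$\text{Emissions}_{\text{interval}} = \text{Time (hr)}_{\text{interval}} \times \text{Power (kW)} \times \text{EF}\left(\frac{\text{g}}{\text{kWh}}\right) \times \text{LLAF} \qquad \text{(3-1)}$$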

Power is calculated for the propulsive (main), auxiliary, and auxiliary boiler engines for each interval and
emission factor (EF) reflects the assigned emission factors for each engine, as described below. LLAF
represents the low load adjustment factor, a unitless factor which reflects increasing propulsive emissions
during low load operations. Time indicates the activity duration time between consecutive intervals.
11,302 vessels were directly identified by their ship and cargo number. The remaining group of
miscellaneous ships represent 13 percent of the AIS vessels (excluding recreational vessels) for which a
specific vessel type could not be assigned.
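
Equation 3-1 can be written as a short function. The sketch below is illustrative only: the function names, the engine tuples, and the numeric values are hypothetical, and using an LLAF of 1.0 for the non-propulsive engines is an assumption.

```python
def interval_emissions_g(time_hr, power_kw, ef_g_per_kwh, llaf=1.0):
    """Equation 3-1: time (hr) x power (kW) x EF (g/kWh) x LLAF, in grams."""
    return time_hr * power_kw * ef_g_per_kwh * llaf

def vessel_interval_total_g(time_hr, engines):
    """Sum Equation 3-1 over engines given as (power_kw, ef_g_per_kwh, llaf)."""
    return sum(interval_emissions_g(time_hr, p, ef, llaf) for p, ef, llaf in engines)

five_minutes = 5.0 / 60.0  # hours between consecutive AIS messages
# Hypothetical main engine: 1,200 kW at 10 g/kWh with no low-load adjustment.
main_only = interval_emissions_g(five_minutes, 1200.0, 10.0)
# Hypothetical main, auxiliary, and boiler engines for the same interval.
total = vessel_interval_total_g(
    five_minutes,
    [(1200.0, 10.0, 1.5), (240.0, 10.0, 1.0), (120.0, 2.0, 1.0)],
)
```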

Next, vessels were identified in order to determine their vessel type, and thus their vessel group, power
rating, and engine tier information, which are required for the emissions calculations. See the 2020 NEI
documentation for more details on this process. Following the identification, 108 different vessel types
were matched to the C1C2 vessels. Vessel attribute data were not available for all these vessel types, so the
vessel types were aggregated into 13 different vessel groups for which surrogate data were available.

The final components of the emissions computation equation are the emission factors and the low load
adjustment factor. The emission factors used in this inventory take into consideration the EPA's marine
vessel fuel regulations as well as exhaust standards that are based on the year that the vessel was
manufactured to determine the appropriate regulatory tier. Emission factors in g/kWhr by tier for NOx,
PM10, PM2.5, CO, CO2, SO2 and VOC were developed using Tables 3-7 through 3-10 in USEPA's (2008)
Regulatory Impact Analysis on engines less than 30 liters per cylinder. To compile these emissions
factors, population-weighted average emission factors were calculated per tier based on C1C2 population
distributions grouped by engine displacement. Boiler emission factors were obtained from an earlier
Swedish Environmental Protection Agency study (Swedish EPA, 2004). If the year of manufacture was
unknown then it was assumed that the vessel was Tier 0, such that actual emissions may be less than those
estimated in this inventory. Without more specific data, the magnitude of this emissions difference cannot
be estimated.

Propulsive emissions from low-load operations were adjusted to account for elevated emission rates
associated with activities outside the engines' optimal operating range. The emission factor adjustments
were applied by load and pollutant, based on the data compiled for the Port Everglades 2015 Emission
Inventory.9 Hazardous air pollutants and ammonia were added to the inventory according to
multiplicative factors applied either to VOC or PM2.5.

The stack parameters used for cmv_c1c2 are a stack height of 1 ft, stack diameter of 1 ft, stack
temperature of 70°F, and a stack velocity of 0.1 ft/s. These parameters force emissions into layer 1.

For more information on the C1C2 CMV emission computations for 2020, see the supporting
documentation for the 2020 NEI. The cmv_c1c2 emissions were aggregated to total hourly values in each
grid cell and run through SMOKE as point sources. SMOKE requires an annual inventory file to go along
with the hourly data and this file was generated for 2020.

9 USEPA. EPA and Port Everglades Partnership: Emission Inventories and Reduction Strategies. US Environmental Protection
Agency, Office of Transportation and Air Quality, June 2018. https://nepis.epa.gov/Exe/ZvPDF.cgi?Dockev=P100UKV8.pdf.

The cmv_c3 sector contains large engine CMV emissions. Category 3 (C3) marine diesel engines are those
at or above 30 liters per cylinder; typically these are the largest engines, rated at 3,000 to 100,000 hp.
C3 engines are typically used
for propulsion on ocean-going vessels including container ships, oil tankers, bulk carriers, and cruise
ships. Emissions control technologies for C3 CMV sources are limited due to the nature of the residual
fuel used by these vessels.10 The cmv_c3 sector contains sources that traverse state and federal waters;
along with sources in waters not covered by the NEI in surrounding areas of Canada, Mexico, and
international waters. For more information on CMV sources in the 2020 NEI, see Section 11 of the 2020
NEI TSD and the supplemental documentation for 2020 NEI CMV.

The process for computing the C3 CMV emissions was similar to that used for C1C2 CMV described
above. The 2020 CMV C3 NEI data were computed based on the AIS data from the USCG for the year of
2020. The AIS data were coupled with ship registry data that contained engine parameters, vessel power
parameters, and other factors such as tonnage and year of manufacture which helped to separate the C3
vessels from the C1C2 vessels. Where specific ship parameters were not available, they were gap-filled.
The types of vessels that remain in the C3 data set include bulk carrier, chemical tanker, liquified gas
tanker, oil tanker, other tanker, container ship, cruise, ferry, general cargo, fishing, refrigerated vessel,
roll-on/roll-off, tug, and yacht.

Prior to use, the AIS data were reviewed - data deemed to be erroneous were removed, and data found to
be at intervals greater than 5 minutes were interpolated to ensure that each ship had data every five
minutes. The five-minute average data provide a reasonably refined assessment of a vessel's movement.
For example, using a five-minute average, a vessel traveling at 25 knots would be captured every two
nautical miles that the vessel travels. For slower moving vessels, the distance between transmissions
would be less.

Emissions were computed according to a computed power need (kW) multiplied by the time (hr) and by
an engine-specific emission factor (g/kWh) and finally by a low load adjustment factor that reflects
increasing propulsive emissions during low load operations. The resulting emissions were available at
5-minute intervals. Code was developed to aggregate these emissions to modeling grid cells and up to
hourly levels so that the emissions data could be input to SMOKE for emissions modeling.
Within SMOKE, the data were speciated into the pollutants needed by the air quality model but since the
data were already in the form of point sources at the center of each grid cell, and they were already
hourly, no other processing was needed within SMOKE. SMOKE requires an annual inventory file to go
along with the hourly data, so this file was also generated for 2020.
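
The aggregation from 5-minute values to hourly grid-cell totals can be sketched as below. This is not the code referred to in the text. The function name, the record layout, and the sample values are hypothetical, and each record is assumed to already carry its grid cell as a (column, row) pair.

```python
from collections import defaultdict
from datetime import datetime

def aggregate_hourly(records):
    """Sum 5-minute emissions by (grid cell, hour).

    Each record is (timestamp, (col, row), grams).
    """
    totals = defaultdict(float)
    for timestamp, cell, grams in records:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        totals[(cell, hour)] += grams
    return dict(totals)

# Hypothetical 5-minute records for one grid cell.
records = [
    (datetime(2020, 1, 1, 0, 0), (10, 20), 1.0),
    (datetime(2020, 1, 1, 0, 5), (10, 20), 2.0),
    (datetime(2020, 1, 1, 1, 0), (10, 20), 4.0),
]
hourly = aggregate_hourly(records)
```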

On January 1st, 2015, the ECA initiated a fuel sulfur standard which required large marine vessels to use
fuel with 1,000 ppm sulfur or less. These standards are reflected in the cmv_c3 inventories.

The resulting point emissions centered on each grid cell were converted to an annual point 2010 flat file
format (FF10). A set of standard stack parameters were assigned to each release point in the cmv_c3
inventory. The assigned stack height was 65.62 ft, the stack diameter was 2.625 ft, the stack temperature
was 539.6 °F, and the velocity was 82.02 ft/s. Emissions were computed for each grid cell needed for
modeling.

10 https://www.epa.gov/regulations-emissions-vehicles-and-engines/regulations-emissions-marine-vessels.

3.2.4.3	Locomotive (rail)

The rail sector includes all locomotives in the NEI nonpoint data category including line haul locomotives
on Class 1, 2, and 3 railroads along with emissions from commuter rail lines and Amtrak. The rail sector
excludes railway maintenance locomotives and point source yard locomotives. Railway maintenance
emissions are included in the nonroad sector. The point source yard locomotives are included in the
ptnonipm sector.

The rail emissions for the 2020 platform use the 2020 NEI. The 2020 NEI is based on methods developed
for the 2017 NEI rail inventory by the Lake Michigan Air Directors
Consortium (LADCO) and the State of Illinois with support from various other states. Class I railroad
emissions are based on a confidential link-level line-haul activity GIS data layer maintained by the Federal
Railroad Administration (FRA). In addition, the Association of American Railroads (AAR) provided
national emission tier fleet mix information. Class II and III railroad emissions are based on a
comprehensive nationwide GIS database of locations where short line and regional railroads operate.
Passenger rail (Amtrak) emissions follow a similar procedure as Class II and III, except using a database
of Amtrak rail lines. Yard locomotive emissions are based on a combination of yard data provided by
individual rail companies, and by using Google Earth and other tools to identify rail yard locations for rail
companies which did not provide yard data. Information on specific yards was combined with fuel use
data and emission factors to create an emissions inventory for rail yards. Pollutant-specific factors were
applied on top of the activity-based changes for the Class I rail. More detailed information on the
development of the 2020 NEI rail inventory for this study is available in the 2020 NEI TSD and in the
Rail 2020 National Emissions Inventory supplementary document on the 2020 NEI supporting data FTP
site.

3.2.4.4	MOVES-based Nonroad Mobile Sources (nonroad)

The mobile nonroad equipment sector includes emissions from all mobile sources that do not operate on
roads, excluding commercial marine vessels, railways, and aircraft. Types of nonroad equipment include
recreational vehicles, pleasure craft, and construction, agricultural, mining, and lawn and garden
equipment. Nonroad equipment emissions were computed by running MOVES3 which incorporates the
NONROAD model. MOVES3 incorporated updated nonroad engine population growth rates, nonroad
Tier 4 engine emission rates, and sulfur levels of nonroad diesel fuels. MOVES provides a complete set of
HAPs and incorporates updated nonroad emission factors for HAPs. MOVES3 was used for all states
other than California, which uses its own model. California nonroad emissions were provided by the
California Air Resources Board (CARB) for the 2020 NEI. CARB emissions were used in California for
all pollutants except PAHs, which were taken from MOVES.

MOVES creates a monthly emissions inventory for criteria air pollutants (CAPs) and a full set of HAPs,
plus additional pollutants such as NONHAPTOG and ETHANOL, which are not part of the NEI but are
used for speciation. MOVES provides estimates of NONHAPTOG along with the speciation profile code
for the NONHAPTOG emission source. This was accomplished by using NHTOG#### as the pollutant
code in the Flat File 2010 (FF10) inventory file that can be read into SMOKE, where #### is a speciation
profile code. For California, NHTOG####-VOC and HAP-VOC ratios from MOVES-based emissions
were applied to VOC emissions so that VOC emissions can be speciated consistently with other states.

MOVES also provides estimates of PM2.5 by speciation profile code for the PM2.5 emission source,
using PM25_#### as the pollutant code in the FF10 inventory file, where #### is a speciation profile
code. To facilitate calculation of PMC within SMOKE, and to help create emissions summaries, an
additional pollutant representing total PM2.5 called PM25TOTAL was added to the inventory. As with
VOC, PM25_####-PM25TOTAL ratios were calculated and applied to PM2.5 emissions in California so
that PM2.5 emissions in California can be speciated consistently with other states.

MOVES3 outputs emissions data in county-specific databases, and a post-processing script converts the
data into FF10 format. Additional post-processing steps were performed as follows:

•	County-specific FF10s were combined into a single FF10 file.

•	Emissions were aggregated from the more detailed SCCs modeled in MOVES to the SCCs
modeled in SMOKE. A list of the aggregated SMOKE SCCs is in Appendix A of the 2016v1
platform nonroad specification sheet (NEIC, 2019).

•	To reduce the size of the inventory, HAPs not needed for air quality modeling, such as dioxins and
furans, were removed from the inventory.

•	To reduce the size of the inventory further, all emissions for sources (identified by county/SCC)
with CAP emissions totaling less than 1*10^-10 were removed from the inventory. The
MOVES model attributes a very tiny amount of emissions to sources that are actually zero, for
example, snowmobile emissions in Florida. Removing these sources from the inventory reduces
the total size of the inventory by about 7%.

•	Gas and particulate components of HAPs that come out of MOVES separately, such as
naphthalene, were combined.

•	VOC was renamed VOC_INV so that SMOKE does not speciate both VOC and NONHAPTOG,
which would result in a double count.

•	PM25TOTAL, referenced above, was also created at this stage of the process.

•	Emissions for airport ground support vehicles (SCCs ending in -8005) and oil field equipment
(SCCs ending in -10010) were removed from the inventory at this stage to prevent a double
count with the airports and np_oilgas sectors, respectively.

•	California emissions from MOVES were deleted and replaced with the CARB-supplied emissions.
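
The size-reduction and double-count steps in the list above can be illustrated with a minimal Python sketch. It is not the post-processing script used for the platform. The record layout, the pollutant names in CAP_POLLUTANTS, and the example SCC strings are assumptions made for illustration; only the 1*10^-10 threshold and the SCC endings come from the text.

```python
# Hypothetical record layout: (county FIPS, SCC, pollutant, emissions).
CAP_POLLUTANTS = {"CO", "NOX", "VOC", "SO2", "NH3", "PM10", "PM2_5"}  # assumed names
CAP_THRESHOLD = 1e-10  # sources with CAP totals below this are dropped
DROP_SCC_ENDINGS = ("8005", "10010")  # airport ground support, oil field equipment


def filter_nonroad(records):
    """Drop near-zero sources and SCCs that are covered by other sectors."""
    # Total CAP emissions for each county/SCC source.
    cap_totals = {}
    for fips, scc, pollutant, value in records:
        if pollutant in CAP_POLLUTANTS:
            key = (fips, scc)
            cap_totals[key] = cap_totals.get(key, 0.0) + value

    kept = []
    for fips, scc, pollutant, value in records:
        if scc.endswith(DROP_SCC_ENDINGS):
            continue  # avoid double counting with other sectors
        if cap_totals.get((fips, scc), 0.0) < CAP_THRESHOLD:
            continue  # source is effectively zero
        kept.append((fips, scc, pollutant, value))
    return kept
```

Note that in this sketch all pollutants for a dropped source are removed, including HAPs, which matches the statement that all emissions for such sources were removed.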

California nonroad emissions were provided by CARB for the 2020 NEI. All California nonroad
inventories were annual, with monthly temporalization applied in SMOKE. Emissions for oil field
equipment (SCCs ending in -10010) were removed from the California inventory in order to prevent a
double count with the np_oilgas sector. VOC HAPs from California were incorporated into speciation
similarly to VOC HAPs from MOVES elsewhere, e.g., model species BENZ is equal to HAP emissions
for benzene as submitted by CARB. VOC and PM2.5 emissions were allocated to speciation profiles.
Ratios of VOC (PM2.5) by speciation profile to total VOC (PM2.5) were calculated by county and SCC
from the MOVES run in California, and then applied to CARB-provided VOC (PM2.5) in the inventory so
that California nonroad emissions could be speciated consistently with the rest of the country.
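
As a minimal illustration of the ratio approach described above, the following Python sketch allocates a CARB-provided total for one county and SCC across speciation profiles using the shares from a MOVES run. The function name and the profile codes in the usage note are hypothetical.

```python
def split_by_profile(moves_by_profile, carb_total):
    """Allocate a CARB total across speciation profiles using MOVES shares.

    moves_by_profile maps a speciation profile code to the MOVES emissions
    for one county/SCC; carb_total is the CARB value for the same county/SCC.
    """
    moves_total = sum(moves_by_profile.values())
    if moves_total == 0:
        return {}  # no MOVES activity to derive ratios from
    return {
        code: carb_total * value / moves_total
        for code, value in moves_by_profile.items()
    }
```

For example, if MOVES assigns 3.0 and 1.0 tons to two hypothetical profiles "P001" and "P002", a CARB total of 8.0 tons is split into 6.0 and 2.0 tons, and the CARB total is preserved.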

For more information on the nonroad sector in the 2020 NEI see Section 4 of the 2020 NEI TSD.

3.2.5 Day-Specific Point Source Fires (ptfire)

Multiple types of fires are represented in the modeling platform. These include wild and prescribed fires
that are grouped into the ptfire-wild and ptfire-rx sectors, respectively, and agricultural fires that comprise
the ptagfire sector. All ptfire and ptagfire fires are in the United States. Fires outside of the United States
are described in the ptfire_othna sector later in this document.

Wildfire and prescribed burning emissions are contained in the ptfire-wild and ptfire-rx sectors, respectively. The
ptfire sector has emissions provided at geographic coordinates (point locations) and has daily emissions values.
The ptfire sector excludes agricultural burning and other open burning sources that are included in the ptagfire
sector. Emissions are day-specific and include satellite-derived latitude/longitude of the fire's origin and other
parameters associated with the emissions such as acres burned and fuel load, which allow estimation of plume rise.

The ptfire-rx and ptfire-wild inventories include separate SCCs for the flaming and smoldering
combustion phases for wildfire and prescribed burns. Note that prescribed grassland fires, or Flint Hills,
Kansas fires, have their own SCC (2811021000) in the inventory. Wild grassland fires were assigned the
standard wildfire SCCs.

Inputs to SMARTFIRE2 for 2020 include:

•	The National Oceanic and Atmospheric Administration's (NOAA's) Hazard Mapping System
(HMS) fire location information

•	National Incident Feature Services (NIFS) (formerly GeoMAC) wildland fire perimeter polygons

•	The Incident Status Summary, also known as the "ICS-209", used for reporting specific
information on fire incidents of significance

•	Hazardous fuel treatment reduction polygons for prescribed burns from the Forest Service Activity
Tracking System (FACTS)

•	Fire activity on federal lands from the United States Fish and Wildlife Service (USFWS) and other
Department of Interior agencies

•	Wildfire and prescribed burn dates and locations from S/L/T activity submitters to the 2020 NEI
(includes Alaska, Arizona, California, Delaware, Georgia, Florida, Iowa, Idaho, Kansas (Flint Hills
only), Louisiana, Maine, Massachusetts, Montana, New Jersey, North Carolina, Nevada (Washoe
Co.), Oklahoma, Oregon, Rhode Island, South Carolina, Texas, Utah, Virginia, Washington, and
Wyoming)

The national and S/L/T data mentioned earlier were used to estimate daily wildfire and prescribed burn
emissions from flaming combustion and smoldering combustion phases for the 2020 inventory. Flaming
combustion is more complete combustion than smoldering and is more prevalent with fuels that have a
high surface-to-volume ratio, a low bulk density, and low moisture content. Smoldering combustion
occurs without a flame, is a less complete burn, and produces some pollutants, such as PM2.5, VOCs, and
CO, at higher rates than flaming combustion. Smoldering combustion is more prevalent with fuels that
have low surface-to-volume ratios, high bulk density, and high moisture content. Models sometimes
differentiate between smoldering emissions that are lofted with a smoke plume and those that remain near
the ground (residual emissions), but for the purposes of the inventory the residual smoldering emissions
were allocated to smoldering SCCs.

Figure 3-1 is a schematic of the data processing stream for the inventory of wildfire and prescribed burn
sources. The ptfire-rx and ptfire-wild inventory sources were estimated using Satellite Mapping
Automated Reanalysis Tool for Fire Incident Reconciliation version 2 (SMARTFIRE2) and the BlueSky
Pipeline. SMARTFIRE2 is an algorithm and database system that operates within a geographic
information system (GIS). SMARTFIRE2 combines multiple sources of fire information and reconciles
them into a unified GIS database. It reconciles fire data from space-borne sensors and ground-based
reports, thus drawing on the strengths of both data types while avoiding double-counting of fire events. At
its core, SMARTFIRE2 is an association engine that links reports covering the same fire across any
number of databases. In this process, all input information is preserved, and no attempt is made to reconcile
conflicting or potentially contradictory information (for example, the existence of a fire in one database
but not another).

For the 2020 platform, the national and S/L/T fire information was input into SMARTFIRE2 and then
merged and associated based on user-defined weights for each fire information dataset. The output from
SMARTFIRE2 was daily acres burned by fire type, and latitude-longitude coordinates for each fire. The
fire type assignments were made using the fire information datasets. If the only information for a fire was
a satellite detect for fire activity, then the flow described in Figure 3-1 was used to make fire type
assignment by state and by month in conjunction with the default fire type assignments.

[Figure 3-1 flow diagram: Input Data Sets (state/local/tribal and national data sets) -> Data Preparation -> Data Aggregation and Reconciliation (SmartFire2) -> Daily fire locations with fire size and type, together with Fuel Moisture and Fuel Loading Data -> USFS BlueSky Pipeline -> Daily smoke emissions for each fire -> Emissions Post-Processing -> Final Wildland Fire Emissions Inventory]

Figure 3-1. Processing flow for fire emission estimates

The second system used to estimate emissions is the BlueSky Modeling Pipeline. The framework
supports the calculation of fuel loading and consumption, and emissions using various models depending
on the available inputs as well as the desired results. The contiguous United States, where Fuel
Characteristic Classification System (FCCS) fuel loading data are available, was processed using the
modeling chain described in Figure 3-2. The Fire Emissions
Production Simulator (FEPS) (Anderson, 2004) in the BlueSky Pipeline generates all the CAP emission
factors for wildland fires used in the 2020 study. HAP emission factors were obtained from Urbanski's
(2014) work and applied by region and by fire type.

Figure 3-2. BlueSky Pipeline modeling system

The FCCSv3 cross-reference was implemented along with the LANDFIREv1 (at 200 meter resolution) to
provide better fuel bed information for the BlueSky Pipeline (BSP). The LANDFIREv2 was aggregated
from the native resolution and projection to 200 meter using a nearest-neighbor methodology.
Aggregation and reprojection were required for BSP to function properly.

The final products from this process are annual and daily FF10-formatted emissions inventories. These
SMOKE-ready inventory files contain both CAPs and HAPs. The BAFM HAP emissions from the
inventory were used directly in modeling and were not overwritten with VOC speciation profiles (i.e., an
"integrate HAP" use case).

3.2.6 Agricultural fires (ptagfire)

In the NEI, agricultural fires are stored as county-annual emissions and are part of the nonpoint data
category. For this study, agricultural fires are modeled as day-specific fires derived from satellite data for
the year 2020 in a similar way to the emissions in ptfire.

Daily year-specific agricultural burning emissions are derived from HMS fire activity data, which
contains the date and location of remote-sensed anomalies. The activity is filtered using the 2020 USDA
cropland data layer (CDL). Satellite fire detects over agricultural lands are assumed to be agricultural
burns and assigned a crop type. Detects that are not over agricultural lands are output to a separate file for
use in the ptfire sector. Each detect is assigned an average size of between 40 and 80 acres based on crop
type. Grassland/pasture fires were moved to the ptfire sectors for this 2020 modeling platform. Depending
on their origin, grassland fires are in both ptfire-rx and ptfire-wild sectors because both fire types do
involve grassy fuels.

The point source agricultural fire (ptagfire) inventory sector contains daily agricultural burning emissions.
Daily fire activity was derived from the NOAA Hazard Mapping System (HMS) fire activity data. The
agricultural fires sector includes SCCs starting with '28015'. The first three levels of descriptions for
these SCCs are: 1) Fires - Agricultural Field Burning; Miscellaneous Area Sources; 2) Agriculture
Production - Crops - as nonpoint; and 3) Agricultural Field Burning - whole field set on fire. The SCC
2801500000 does not specify the crop type or burn method, while the more specific SCCs specify field or
orchard crops and, in some cases, the specific crop being grown.

Another feature of the ptagfire database is that the satellite detections for 2020 were filtered to
exclude areas covered by snow during the winter months. To do this, the daily snow cover fraction per
grid cell was extracted from a 2020 meteorological Weather Research and Forecasting (WRF) model simulation.
The locations of fire detections were then compared with this daily snow cover file. For any day in which
a grid cell had snow cover, the fire detections in that grid cell on that day were excluded from the
inventory. Due to the inconsistent reporting of fire detections from the Visible Infrared Imaging
Radiometer Suite (VIIRS) platform, any fire detections in the HMS dataset that were flagged as VIIRS or
Suomi National Polar-orbiting Partnership satellite were excluded. In addition, certain crop types (corn
and soybeans) have been excluded from these specific midwestern states: Iowa, Kansas, Indiana, Illinois,
Michigan, Missouri, Minnesota, Wisconsin, and Ohio. These crop types were excluded
because states have indicated that they are not burned.
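
The three exclusion rules described above (snow cover, satellite platform, and crop type by state) can be sketched as a single filter in Python. This is an illustration only: the dictionary keys, the satellite flag strings, and the two-letter state codes are assumed representations, not the fields of the HMS or CDL datasets.

```python
EXCLUDED_SATELLITES = {"VIIRS", "SUOMI NPP"}  # assumed flag values
NO_BURN_STATES = {"IA", "KS", "IN", "IL", "MI", "MO", "MN", "WI", "OH"}
NO_BURN_CROPS = {"corn", "soybeans"}


def keep_detect(detect, snow_cover):
    """Return True if a fire detection stays in the ptagfire inventory.

    detect is a dict with keys date, cell, satellite, state, and crop.
    snow_cover maps (date, cell) to the daily snow cover fraction.
    """
    if detect["satellite"].upper() in EXCLUDED_SATELLITES:
        return False
    # Any snow cover in the grid cell on that day excludes the detection.
    if snow_cover.get((detect["date"], detect["cell"]), 0.0) > 0.0:
        return False
    if detect["state"] in NO_BURN_STATES and detect["crop"] in NO_BURN_CROPS:
        return False
    return True
```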

Heat flux for plume rise was calculated using the size and assumed fuel loading of each daily agricultural
fire. This information is needed for a plume rise calculation within a chemical transport modeling system.

The daily agricultural and open burning emissions were converted from a tabular format into the
SMOKE-ready daily point flat file format. The daily emissions were also aggregated into annual values
by location and converted into the annual point flat file format.

For this modeling platform, a SMOKE update allows the use of HAP integration for speciation for
PTDAY inventories. The 2020 agricultural fire inventories include emissions for HAPs, so HAP
integration was used for this study.

3.2.7 Biogenic Sources (beis)

Biogenic emissions were computed based on the 2020 meteorology data used for the 2020 NEI and were
developed using the Biogenic Emission Inventory System version 4 (BEIS4) within CMAQ. BEIS4
creates gridded, hourly, model-species emissions from vegetation and soils. It estimates CO, VOC (most
notably isoprene, terpene, and sesquiterpene), and NO emissions for the contiguous U.S. and for portions
of Mexico and Canada. In the BEIS4 two-layer canopy model, the layer structure varies with light
intensity and solar zenith angle (Pouliot and Bash, 2015). Both layers include estimates of sunlit and
shaded leaf area based on solar zenith angle and light intensity, direct and diffuse solar radiation, and leaf
temperature (Bash et al., 2015). BEIS4 computes the seasonality of emissions using the 1-meter soil
temperature (SOIT2) instead of the BIOSEASON file, and canopy temperature and radiation
environments are now modeled using the driving meteorological model's (WRF) representation of
leaf-area index (LAI) rather than the estimated LAI values from BELD data alone. See these CMAQ Release
Notes for technical information on BEIS4:
https://github.com/USEPA/CMAQ/wiki/CMAQ-Release-Notes:-Emissions-Updates:-BEIS-Biogenic-Emissions.
The variables required by BEIS that are output from the Meteorology-Chemistry Interface Processor
(MCIP), which is used to convert WRF outputs to CMAQ inputs, are shown in Table 3-5.

Table 3-5. Meteorological variables required by BEIS 3.7

Variable    Description
LAI         leaf-area index
PRSFC       surface pressure
Q2          mixing ratio at 2 m
RC          convective precipitation per met TSTEP
RGRND       solar rad reaching surface
RN          nonconvective precipitation per met TSTEP
RSTOMI      inverse of bulk stomatal resistance
SLYTP       soil texture type by USDA category
SOIM1       volumetric soil moisture in top cm
SOIT1       soil temperature in top cm
TEMPG       skin temperature at ground
USTAR       cell averaged friction velocity
RADYNI      inverse of aerodynamic resistance
TEMP2       temperature at 2 m
WSATPX      soil saturation from (Pleim-Xiu Land Surface Model) PX-LSM

The Biogenic Emissions Landcover Database version 6 (BELD6) was used as the input gridded land use
information in generating 2020 NEI estimates. BELD version 5 (BELD5) was used to generate 2017 NEI
estimates. There are now two different BELD6 datasets that are input into BEIS4: one is the gridded land
use, and the other is the gridded dry leaf biomass (grams/m2) values for various vegetation types. The
BELD6 includes the following datasets:

•	High resolution tree species and biomass data from Wilson et al. 2013a, and Wilson et al.
2013b for which species names were changed from non-specific common names to scientific
names

•	Biogenic volatile organic carbon (BVOC) emission factors for tree species were
taken from the NCAR Enclosure database (Wiedinmyer, 2001)

o https://www.sciencedirect.com/science/article/pii/S1352231001004290

•	Agricultural land use from US Department of Agriculture (USDA) crop data layer

•	Global Moderate Resolution Imaging Spectroradiometer (MODIS) 20 category data with
enhanced lakes and Fraction of Photosynthetically Active Radiation (FPAR) for vegetation
coverage from National Center for Atmospheric Research (NCAR)

•	Canadian BELD land use, updates to Version 4 of the Biogenic Emissions Landuse Database
(BELD4) for Canada and Impacts on Biogenic VOC Emissions
(https://www.epa.gov/sites/default/files/2019-08/documents/800am_zhang_2_0.pdf).

Bug fixes in BEIS4 included the following:

•	Solar radiation attenuation in the shaded portion of the canopy was using the direct beam
photosynthetically active radiation (PAR) when the diffuse beam PAR attenuation coefficient
should have been used.

o This update had little impact on the total emissions but did result in slightly higher
emissions in the morning and evening transition periods for isoprene, methanol, and
methylbutenol (MBO).

•	The fractions of solar radiation in the sunlit and shaded canopy layers (SOLSUN and SOLSHADE,
respectively) were estimated using a planar surface. These should have been estimated based on the
PAR intercepted by a hemispheric surface rather than a plane.

o This update can result in an earlier peak in leaf temperature, by up to approximately an hour.

•	The quantum yield for isoprene emissions (ALPHA) was updated to the mean value in Niinemets
et al. 2010a and the integration coefficient (CL) was updated to yield 1 when PAR = 1000
following Niinemets et al. 2010b.

o This update resulted in a slight reduction in isoprene, methanol, and MBO emissions.

Biogenic emissions computed with BEIS were used to review and prepare summaries, but were left out of
the CMAQ-ready merged emissions in favor of inline biogenics produced during the CMAQ model run
itself using the same algorithm described above but with finer time steps within the air quality model.

3.2.8 Emissions from Canada, Mexico (othpt, othar, othafdust, othptdust, onroad_can, onroad_mex,
ptfire_othna)

The emissions from Canada and Mexico are included as part of the emissions modeling sectors:
canmex_point, canmex_area, canada_afdust, canada_ptdust, canada_onroad, mexico_onroad, canmex_ag,
and canada_og2D. These sector names are new to the 2020 platform, but the general organization of these
sectors is unchanged from the 2019 platform, except for agricultural emissions in Canada and Mexico.
The canmex_ag sector is processed as a separate sector for reporting and tracking purposes, and unlike in
other recent emissions platforms, the Canada ag sources are area sources in this platform rather than
pre-gridded point sources. As in prior platforms, fugitive dust emissions in Canada are represented as both
area sources (canada_afdust sector, formerly "othafdust") and point sources (canada_ptdust sector,
formerly "othptdust"). Due to the large number of individual points, low-level oil and gas emissions in
Canada are processed separately from the canmex_point sector to reduce the number of individual points
to track within CMAQ, and also to reduce the size of the model-ready emissions files.

Emissions in these sectors were taken from the 2020 inventories. Environment and Climate Change
Canada (ECCC) provided the following inventories for use in the 2020 modeling. The sectors in which
they were incorporated are listed and the inventories are described in more detail below:

•	Agricultural livestock and fertilizer, area source format (canmex_ag sector)

•	Surface-level oil and gas emissions in Canada (canada_og2D sector)

•	Agricultural fugitive dust, point source format (canada_ptdust sector)

•	Other area source dust (canada_afdust sector)

•	Onroad (canada_onroad sector)

•	Nonroad and rail (canmex_area sector)

•	Airports (canmex_point sector)

•	Other area sources (canmex_area sector)

•	Other point sources (canmex_point sector)

The 2020 NEI CMV included coastal waters of Canada and Mexico with emissions derived from AIS
data. These NEI emissions were used for all areas of Canada and Mexico and are included in the
cmv_c1c2 and cmv_c3 sectors. Both the C1C2 and C3 emissions were developed in a point source format
with point locations at the center of the 12km grid cells.

Other than the CB6 species of NBAFM present in the speciated point source data, there are no explicit
HAP emissions in these Canadian inventories. In addition to emissions inventories, the ECCC 2020
dataset also included shapefiles for creating spatial surrogates. These surrogates were used for this study.

Canadian point source inventories provided by ECCC for the 2020 NEI were adjusted for the impacts of
COVID. These inventories include emissions for airports and other point sources. The Canadian point
source inventory is pre-speciated for the CB6 chemical mechanism. Annual emissions provided by ECCC
already reflected pandemic effects, but the monthly distributions of emissions did not. To account for
pandemic effects, monthly emissions in Canada were redistributed using data from the CONFORM
dataset (https://permalink.aeris-data.fr/CONFORM), which provides country-specific adjustment factors
to account for pandemic effects for each month in 2020. Monthly temporal profiles were calculated from
the CONFORM dataset as ratios of monthly totals versus annual totals for several different categories
(aviation, energy, industry, public and commercial, residential, and transport) and applied to the annual
emissions provided by ECCC, with each SCC mapped to a CONFORM category. Annual emissions totals
in Canada were not changed as part of this process, only the distribution to months.
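
The redistribution described above can be sketched in a few lines of Python: a monthly temporal profile is formed as the ratio of each monthly total to the annual total for a CONFORM category, and the profile is applied to the annual emissions so that the annual total is unchanged. The function names and the monthly values in the usage note are hypothetical, and reading the CONFORM dataset itself is outside the scope of this sketch.

```python
def monthly_profile(category_monthly_totals):
    """Convert 12 monthly totals for one category into fractions of the year."""
    if len(category_monthly_totals) != 12:
        raise ValueError("expected 12 monthly values")
    annual = sum(category_monthly_totals)
    return [month / annual for month in category_monthly_totals]


def distribute_annual(annual_emissions, profile):
    """Spread an annual total across months; the annual total is preserved."""
    return [annual_emissions * fraction for fraction in profile]
```

For instance, a category whose hypothetical monthly totals dip in April and May yields a profile with smaller fractions for those months, while the sum of the twelve monthly emissions still equals the annual value provided by ECCC.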

Point sources in Mexico were compiled from the Inventario Nacional de Emisiones de Mexico, 2016
(Secretaria de Medio Ambiente y Recursos Naturales (SEMARNAT)), projected to 2019 as part of the
2019 emissions modeling platform, and then projected to 2020 to include COVID pandemic effects. The
point source emissions were converted to English units and into the FF10 format that could be read by
SMOKE; missing stack parameters were gapfilled using SCC-based defaults; and latitude and longitude
coordinates were verified and adjusted if they were not consistent with the reported municipality. The
emissions were additionally adjusted for COVID. Only CAPs are covered in the Mexico point
source inventory. The CONFORM dataset was used to apply pandemic adjustments to emissions in
Mexico, except that unlike in Canada, annual emissions as well as monthly temporal profiles were
adjusted. First, monthly emissions totals for the unadjusted 2019 inventory were calculated using existing
temporal profiles. Then, a 2019-to-2020 scaling factor was calculated for each month using data from the
CONFORM dataset, and for each emissions category in the CONFORM dataset (energy, industry, public
and commercial, residential, and transport). These scaling factors were applied to the 2019 monthly
Mexico emissions, and a new annual total for 2020 was calculated from the adjusted monthly totals.
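
A minimal Python sketch of the Mexico projection described above follows. Unlike the Canadian case, both the monthly values and the annual total change. The function name and the scaling factors used in any example are hypothetical; the actual factors come from the CONFORM dataset by category.

```python
def project_to_2020(monthly_2019, scale_2019_to_2020):
    """Scale 2019 monthly emissions to 2020 and recompute the annual total.

    Both arguments are sequences of 12 values for one emissions category.
    Returns the 2020 monthly emissions and the new 2020 annual total.
    """
    if len(monthly_2019) != 12 or len(scale_2019_to_2020) != 12:
        raise ValueError("expected 12 monthly values and 12 scaling factors")
    monthly_2020 = [
        emissions * factor
        for emissions, factor in zip(monthly_2019, scale_2019_to_2020)
    ]
    return monthly_2020, sum(monthly_2020)
```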

Fugitive dust sources of particulate matter emissions, excluding land tilling from agricultural activities,
were provided by Environment and Climate Change Canada (ECCC) as part of their 2020 emission
inventory. This inventory no longer contains agricultural dust. Different source categories were provided
as gridded point sources and area (nonpoint) source inventories. Gridded point source emissions resulting
from land tilling due to agricultural activities were provided as part of the ECCC 2020 emission
inventory. The provided wind erosion emissions were removed. Both the canada_afdust and
canada_ptdust emissions have a COVID-adjusted monthly resolution based on the CONFORM dataset
categories of industry and transport, following a similar process as the canmex_point sector. A transport
fraction adjustment that reduces dust emissions based on land cover types was applied to both point and
nonpoint dust emissions, along with a meteorology-based (precipitation and snow/ice cover) zero-out of
emissions when the ground is snow covered or wet.
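
The two dust adjustments described above can be illustrated with a short Python sketch. The text does not state the thresholds used to decide when the ground is wet or snow covered, so treating any nonzero precipitation or snow cover as a zero-out condition is an assumption of this sketch, as are the function and parameter names.

```python
def adjust_dust(emissions, transport_fraction, precipitation, snow_cover):
    """Apply a land-cover transport fraction and a meteorological zero-out.

    transport_fraction is the share of emitted dust (0 to 1) assumed to be
    transported for the grid cell's land cover. precipitation and snow_cover
    are the meteorological values for the same grid cell and time step.
    """
    # Assumed rule: any precipitation or snow cover zeroes out the emissions.
    if precipitation > 0.0 or snow_cover > 0.0:
        return 0.0
    return emissions * transport_fraction
```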

Agricultural emissions from Canada and Mexico, excluding fugitive dust, are included in the canmex_ag
sector. Canadian agricultural emissions were provided by Environment and Climate Change Canada
(ECCC) as part of their 2020 emission inventory. Unlike in recent platforms, Canadian agricultural
emissions were not represented as point sources; instead, they were represented as area sources and
gridded using spatial surrogates. In Mexico, agricultural sources are based on the 2019ge Mexico nonpoint
inventory at the municipio resolution. The 2019 inventory was based on a projection of 2016 inventories
provided by SEMARNAT. COVID pandemic adjustments were not applied to the agricultural sector.

Canadian point source inventories provided by ECCC for the 2020 NEI included oil and gas emissions. A
very large number of these oil and gas point sources are surface level emissions, appropriate to be
modeled in layer 1. Reducing the size of the canmex_point sector improves air quality model run time
because plume rise calculations are needed for fewer sources, so these surface level oil and gas sources
were placed into the canada_og2D sector for layer 1 modeling. These emissions include COVID-adjusted
monthly data based on the CONFORM dataset industry sector.

ECCC provided year 2020 Canada province, and in some cases sub-province, resolution emissions
for nonpoint and nonroad sources (canmex_area). The nonroad sources were monthly while the nonpoint
and rail emissions were annual. Annual emissions provided by ECCC already reflected pandemic effects,
but monthly distributions of emissions did not. Following a similar process as the canmex_point sector,
monthly emissions in Canada were redistributed using data from the CONFORM dataset to reflect
pandemic effects. The CONFORM categories used for the Canada monthly COVID adjustments were
energy, industry, public and commercial, residential, and transport.

For Mexico, 2019ge Mexico nonpoint and nonroad inventories at the municipio resolution (which were
based on a projection of 2016 inventories provided by SEMARNAT) were projected to 2020 to include
COVID pandemic effects using a process similar to the one described for the canmex_point sector. The
CONFORM categories used for the projection and monthly distribution included: industry, public and
commercial, residential, and transport.

The onroad emissions for Canada and Mexico are in the canada_onroad and mexico_onroad sectors,
respectively. Emissions for Canada are new for 2020. In Canada, COVID impacts were applied to the
monthly profiles (not to the annual totals) using the CONFORM dataset emissions from the transport
category.

For Mexico onroad emissions, a version of the MOVES model for Mexico was run that provided the same
VOC HAPs and speciated VOCs as for the U.S. MOVES model (ERG, 2016a). This includes NBAFM
plus several other VOC HAPs such as toluene, xylene, ethylbenzene and others. Except for VOC HAPs
that are part of the speciation, no other HAPs are included in the Mexico onroad inventory (such as
particulate HAPs or diesel particulate matter). Emissions from MOVES-Mexico for the year 2020 did
not include any COVID pandemic effects, so monthly and annual emissions were adjusted using the
monthly CONFORM adjustment factors for Mexico transport.

Annual 2020 wildland fire emissions for Mexico, Canada, Central America, and Caribbean nations are
included in the ptfire_othna sector. Canadian fires from May-December were provided by ECCC and are
based on their Firework system (https://weather.gc.ca/firework/). Canadian fires for the non-summer
months along with fires in Mexico, Central America, and the Caribbean, were developed from the Fire
Inventory from NCAR (FINN) v2.5 daily fire emissions for 2020 (Wiedinmyer, 2023). For FINN fires,
listed vegetation type codes of 1 and 9 are defined as agricultural burning; all other fire detections are
assumed to be wildfires. All wildland fires that are not defined as agricultural are assumed to be wildfires
rather than prescribed. FINN fire detects of less than 50 square meters (0.012 acres) are removed from
the inventory. The locations of FINN fires are geocoded from latitude and longitude to FIPS code.
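
The FINN screening rules described above can be written as a small Python sketch. The vegetation codes and the 50 square meter cutoff are taken from the text; the function name and return labels are hypothetical. The conversion constant is the standard number of square meters per acre, which reproduces the 0.012 acre figure.

```python
SQ_M_PER_ACRE = 4046.8564224  # international acre
AG_VEG_CODES = {1, 9}         # vegetation codes treated as agricultural burning
MIN_AREA_SQ_M = 50.0          # smaller detects are removed from the inventory


def classify_finn_fire(veg_code, area_sq_m):
    """Return None for dropped detects, else 'agricultural' or 'wildfire'."""
    if area_sq_m < MIN_AREA_SQ_M:
        return None
    return "agricultural" if veg_code in AG_VEG_CODES else "wildfire"


def sq_m_to_acres(area_sq_m):
    """Convert square meters to acres."""
    return area_sq_m / SQ_M_PER_ACRE
```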

3.2.9 Ocean Chlorine, Ocean Sea Salt, and Volcanic Mercury

The ocean chlorine gas emission estimates are based on the build-up of molecular chlorine (Cl2)
concentrations in oceanic air masses (Bullock and Brehme, 2002). Data at 36 km and 12 km resolution
were available and were not modified other than the model-species name "CHLORINE" was changed to
"CL2" to support CMAQ modeling.

For mercury, the volcanic mercury emissions that were used in the recent modeling platforms were not
included in this study. The emissions were originally developed for a 2002 multipollutant modeling
platform with coordination and data from Christian Seigneur and Jerry Lin for 2001 (Seigneur et al., 2004
and Seigneur et al., 2001). The volcanic emissions from the most recent eruption were not included
because they had diminished by the year 2019. Thus, no volcanic emissions were included.

Because of mercury bidirectional flux within the latest version of CMAQ, no other natural mercury
emissions are included in the emissions merge step.

3.3 Emissions Modeling Summary

The CMAQ and CAMx air quality models require hourly emissions of specific gas and particle species
for the horizontal and vertical grid cells contained within the modeled region (i.e., modeling domain). To
provide emissions in the form and format required by the model, it is necessary to "pre-process" the "raw"
emissions (i.e., emissions input to SMOKE) for the sectors described above. In brief, the process of
emissions modeling transforms the emissions inventories from their original temporal resolution,
pollutant resolution, and spatial resolution into the hourly, speciated, gridded and vertical resolution
required by the air quality model. Emissions modeling includes temporal allocation, spatial allocation,
and pollutant speciation. Emissions modeling sometimes includes the vertical allocation (i.e., plume rise)
of point sources, but many air quality models also perform this task because it greatly reduces the size of
the input emissions files if the vertical layers of the sources are not included.

The temporal resolutions of the emissions inventories input to SMOKE vary across sectors and may be
hourly, daily, monthly, or annual total emissions. The spatial resolution may be individual point sources;
totals by county (U.S.), province (Canada), or municipio (Mexico); or gridded emissions. This section
provides some basic information about the tools and data files used for emissions modeling as part of the
modeling platform.

3.3.1	The SMOKE Modeling System

SMOKE version 4.9 was used to process the raw emissions inventories for each modeling sector into
emissions inputs in a format compatible with CMAQ. SMOKE executables and source code are
available from the Community Modeling and Analysis System (CMAS) Center at
http://www.cmascenter.org. Additional information about SMOKE is available from
http://www.smoke-model.org. For sectors that have plume rise, the in-line plume rise capability allows
for the use of emissions files that are much smaller than full three-dimensional gridded emissions files.
For quality assurance of the emissions modeling steps, emissions totals by species for the entire model
domain are output as reports that are then compared to reports generated by SMOKE on the input
inventories to ensure that mass is not lost or gained during the emissions modeling process.
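
The mass-conservation check described above can be sketched as follows. This is a minimal illustration
and not the SMOKE reporting code: the function name, the dictionary layout of the report totals, and the
relative tolerance are assumptions made for the example.

```python
def compare_species_totals(inventory_totals, model_ready_totals, rel_tol=1e-6):
    """Return the species whose domain totals differ by more than rel_tol.

    Both arguments are hypothetical dictionaries mapping a species name to
    its domain-wide total (same units in both reports).
    """
    mismatches = {}
    for species in sorted(set(inventory_totals) | set(model_ready_totals)):
        before = inventory_totals.get(species, 0.0)
        after = model_ready_totals.get(species, 0.0)
        scale = max(abs(before), abs(after), 1e-30)  # avoid division by zero
        if abs(after - before) / scale > rel_tol:
            mismatches[species] = (before, after)
    return mismatches


# Illustrative numbers only; they are not taken from the platform reports.
inventory = {"NO": 900.0, "NO2": 92.0, "HONO": 8.0}
model_ready = {"NO": 900.0, "NO2": 92.0, "HONO": 7.5}
print(compare_species_totals(inventory, model_ready))
```

An empty result would indicate that no mass was lost or gained, within the chosen tolerance.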

3.3.2	Key Emissions Modeling Settings

When preparing emissions for the air quality model, emissions for each sector are processed separately
through SMOKE, and then the final merge program (Mrggrid) is run to combine the model-ready, sector-
specific 2-D gridded emissions across sectors. The SMOKE settings in the run scripts and the data in the
SMOKE ancillary files control the approaches used by the individual SMOKE programs for each sector.
Table 3-6 summarizes the major processing steps of each platform sector with the columns as follows.

The "Spatial" column shows the spatial approach used: "point" indicates that SMOKE maps the source
from a point location (i.e., latitude and longitude) to a grid cell; "surrogates" indicates that some or all of
the sources use spatial surrogates to allocate county emissions to grid cells; and "area-to-point" indicates
that some of the sources use the SMOKE area-to-point feature to grid the emissions.

The "Speciation" column indicates that all sectors use the SMOKE speciation step, though biogenics
speciation is done within the Tmpbeis3 program and not as a separate SMOKE step.

The "Inventory resolution" column shows the inventory temporal resolution from which SMOKE needs
to calculate hourly emissions. Note that for some sectors (e.g., onroad, beis), there is no input inventory;
instead, activity data and emission factors are used in combination with meteorological data to compute
hourly emissions.

Finally, the "plume rise" column indicates the sectors for which the "in-line" approach is used. These
sectors are the only ones with emissions in aloft layers based on plume rise. The term "in-line" means
that the plume rise calculations are done inside of the air quality model instead of being computed by
SMOKE. In all of the "in-line" sectors, all sources are output by SMOKE into point source files which
are subject to plume rise calculations in the air quality model. In other words, no emissions are output to
layer 1 gridded emissions files from those sectors as has been done in past platforms. The air quality
model computes the plume rise using stack parameters, the Briggs algorithm, and the hourly emissions in
the SMOKE output files for each emissions sector. The height of the plume rise determines the model
layers into which the emissions are placed. The plume top and bottom are computed, along with the
plumes' distributions into the vertical layers that the plumes intersect. The pressure difference across each
layer divided by the pressure difference across the entire plume is used as a weighting factor to assign the
emissions to layers. This approach gives plume fractions by layer and source. Day-specific point fire
emissions are treated differently in CMAQ. After plume rise is applied, there are emissions in every layer
from the ground up to the top of the plume.
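
The pressure-based weighting described above can be sketched as follows. This is a minimal illustration
under stated assumptions, not the CMAQ implementation: the function name and the example pressures are
hypothetical, layer interfaces are assumed to be given as pressures ordered from the surface upward, and
the plume bottom and top are assumed to have already been computed (e.g., by the Briggs algorithm).

```python
def plume_layer_fractions(interface_pressures, p_bottom, p_top):
    """Fraction of a plume's emissions assigned to each model layer.

    interface_pressures: pressures at layer interfaces, surface first,
        decreasing upward (any consistent unit).
    p_bottom, p_top: pressures at the plume bottom and top (p_bottom > p_top).
    """
    if p_bottom <= p_top:
        raise ValueError("plume bottom pressure must exceed plume top pressure")
    plume_thickness = p_bottom - p_top
    fractions = []
    for lower, upper in zip(interface_pressures[:-1], interface_pressures[1:]):
        # Pressure thickness of the part of this layer that the plume occupies.
        overlap = min(lower, p_bottom) - max(upper, p_top)
        fractions.append(max(overlap, 0.0) / plume_thickness)
    return fractions


# Hypothetical interfaces (hPa) and a plume spanning 985 hPa to 940 hPa.
interfaces = [1000.0, 990.0, 975.0, 950.0, 900.0]
fractions = plume_layer_fractions(interfaces, p_bottom=985.0, p_top=940.0)
print(fractions)
```

When the plume lies entirely within the supplied layers, the fractions sum to one, so the hourly
emissions of the source are conserved when distributed vertically.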

Table 3-6. Key emissions modeling steps by sector

| Platform sector | Spatial | Speciation | Inventory resolution | Plume rise |
|---|---|---|---|---|
| afdust_adj | Surrogates | Yes | Annual | |
| airports | Point | Yes | Annual | None |
| beis | Pre-gridded land use | in BEIS4 | computed hourly in CMAQ | |
| fertilizer | EPIC | No | computed hourly in CMAQ | |
| livestock | Surrogates | Yes | Annual | |
| cmv_c1c2 | Point | Yes | hourly | in-line |
| cmv_c3 | Point | Yes | hourly | in-line |
| nonpt | Surrogates & area-to-point | Yes | Annual | |
| nonroad | Surrogates | Yes | monthly | |
| np_oilgas | Surrogates | Yes | Annual | |
| onroad | Surrogates | Yes | monthly activity, computed hourly | |
| onroad_ca_adj | Surrogates | Yes | monthly activity, computed hourly | |
| canada_onroad | Surrogates | Yes | monthly | |
| mexico_onroad | Surrogates | Yes | monthly | |
| canada_afdust | Surrogates | Yes | annual & monthly | |
| canmex_area | Surrogates | Yes | monthly | |
| canmex_point | Point | Yes | monthly | in-line |
| canada_ptdust | Point | Yes | annual | None |
| canada_og2D | Point | Yes | monthly | None |
| canmex_ag | Surrogates | Yes | annual | |
| ptagfire | Point | Yes | daily | in-line |
| pt_oilgas | Point | Yes | annual | in-line |
| ptegu | Point | Yes | daily & hourly | in-line |
| ptfire-rx | Point | Yes | daily | in-line |
| ptfire-wild | Point | Yes | daily | in-line |
| ptfire_othna | Point | Yes | daily | in-line |
| ptnonipm | Point | Yes | annual | in-line |
| rail | Surrogates | Yes | annual | |
| rwc | Surrogates | Yes | annual | |
| np_solvents | Surrogates | Yes | annual | |



Note that SMOKE has the option of grouping sources so that they are treated as a single stack when
computing plume rise. For the modeling cases discussed in this document, no grouping was performed
because grouping combined with "in-line" processing will not give the same results as "offline"
processing (i.e., when SMOKE creates 3-dimensional files). The difference arises when stacks with
different stack parameters or latitudes and longitudes are grouped, thereby changing the parameters of
one or more sources. The most straightforward way to get the same results from in-line and offline
processing is to avoid the use of stack grouping.

Biogenic emissions can be modeled two different ways in the CMAQ model. The BEIS model in SMOKE
can produce gridded biogenic emissions that are then included in the gridded CMAQ-ready emissions
inputs, or alternatively, CMAQ can be configured to create "in-line" biogenic emissions within CMAQ
itself. For this study, the in-line biogenic emissions option was used, and so biogenic emissions from
BEIS were not included in the gridded CMAQ-ready emissions.

3.3.3 Spatial Configuration

For this study, SMOKE was run for the larger 12-km CONtinental United States "CONUS" modeling
domain (12US1) shown in Figure 3-3, but the air quality model was run on the smaller 12-km domain
(12US2). The grid used a Lambert-Conformal projection, with Alpha = 33, Beta = 45 and Gamma = -97,
with a center of X = -97 and Y = 40. Later sections provide details on the spatial surrogates and
area-to-point data used to accomplish spatial allocation with SMOKE.
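
The projection parameters above can be illustrated with a short sketch of the spherical Lambert conformal
conic forward equations. Several things here are assumptions rather than statements from this report:
Alpha and Beta are interpreted as the two standard parallels, Gamma as the central meridian, and the
(X, Y) center as the projection origin; the spherical Earth radius of 6,370,000 m is an assumed value;
and the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_370_000.0  # assumed spherical radius; not stated in this section


def lcc_forward(lon_deg, lat_deg, lat1=33.0, lat2=45.0, lon0=-97.0, lat0=40.0,
                radius=EARTH_RADIUS_M):
    """Project longitude/latitude (degrees) to Lambert conformal x, y (meters)."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    phi0, lam0 = math.radians(lat0), math.radians(lon0)

    def t(p):
        return math.tan(math.pi / 4.0 + p / 2.0)

    # Cone constant and scaling term for two standard parallels.
    n = math.log(math.cos(phi1) / math.cos(phi2)) / math.log(t(phi2) / t(phi1))
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = radius * f / t(phi) ** n
    rho0 = radius * f / t(phi0) ** n
    theta = n * (lam - lam0)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


# The projection center maps to the origin; other points are offsets in meters.
print(lcc_forward(-97.0, 40.0))
print(lcc_forward(-90.0, 45.0))
```

Dividing the projected offsets from a grid origin by the 12,000 m cell size would then give column and
row indices; the grid origins of 12US1 and 12US2 are not given in this section, so that step is omitted.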

3.3.4 Chemical Speciation Configuration

Chemical speciation involves the process of translating emissions from the inventory into the chemical
mechanism-specific "model species" needed by an air quality model. Using the CB6R5_AE7 chemical
mechanism as an example, which is the mechanism utilized by the 2020 NEI modeling platform, these
model species either represent explicit chemical compounds (e.g., acetone, benzene, ethanol) or groups of
species (i.e., "lumped species;" e.g., PAR, OLE, KET). This chemical mechanism is an updated version of
the CB6R3_AE7 chemical mechanism and features new reaction rates for some chemical reactions
(Yarwood et al., 2020). CMAQ's Aerosol Module version 7 (AE7) is an updated version of the AE6
aerosol module, with alpha-pinene made an explicit emitted species. Table 3-7 lists the model species
produced by SMOKE in the platform used for this study.

Table 3-7. Emission model species produced for CB6R3_AE7 for CMAQ

| Inventory Pollutant | Model Species | Model species description |
|---|---|---|
| Cl2 | CL2 | Atomic gas-phase chlorine |
| HCl | HCL | Hydrogen Chloride (hydrochloric acid) gas |
| CO | CO | Carbon monoxide |
| NOx | NO | Nitrogen oxide |
| NOx | NO2 | Nitrogen dioxide |
| NOx | HONO | Nitrous acid |
| SO2 | SO2 | Sulfur dioxide |
| SO2 | SULF | Sulfuric acid vapor |
| NH3 | NH3 | Ammonia |
| NH3 | NH3_FERT | Ammonia from fertilizer |
| VOC | AACD | Acetic acid |
| VOC | ACET | Acetone |
| VOC | ALD2 | Acetaldehyde |
| VOC | ALDX | Propionaldehyde and higher aldehydes |
| VOC | APIN | Alpha pinene |
| VOC | BENZ | Benzene |
| VOC | CAT1 | Methyl-catechols |
| VOC | CH4 | Methane |
| VOC | CRES | Cresols |
| VOC | CRON | Nitro-cresols |
| VOC | ETH | Ethene |
| VOC | ETHA | Ethane |
| VOC | ETHY | Ethyne |
| VOC | ETOH | Ethanol |
| VOC | FACD | Formic acid |
| VOC | FORM | Formaldehyde |
| VOC | GLY | Glyoxal |
| VOC | GLYD | Glycolaldehyde |
| VOC | IOLE | Internal olefin carbon bond (R-C=C-R) |
| VOC | ISOP | Isoprene |
| VOC | ISPD | Isoprene Product |
| VOC | IVOC | Intermediate volatility organic compounds |
| VOC | KET | Ketone Groups |
| VOC | MEOH | Methanol |
| VOC | MGLY | Methylglyoxal |
| VOC | NAPH | Naphthalene |
| VOC | NVOL | Non-volatile compounds |
| VOC | OLE | Terminal olefin carbon bond (R-C=C) |
| VOC | PACD | Peroxyacetic and higher peroxycarboxylic acids |
| VOC | PAR | Paraffin carbon bond |
| VOC | PRPA | Propane |
| VOC | SESQ | Sesquiterpenes (from biogenics only) |
| VOC | SOAALK | Secondary Organic Aerosol (SOA) tracer |
| VOC | TERP | Terpenes (from biogenics only) |
| VOC | TOL | Toluene and other monoalkyl aromatics |
| VOC | UNR | Unreactive |
| VOC | XYLMN | Xylene and other polyalkyl aromatics, minus naphthalene |
| Naphthalene | NAPH | Naphthalene from inventory |
| Benzene | BENZ | Benzene from the inventory |
| Acetaldehyde | ALD2 | Acetaldehyde from inventory |
| Formaldehyde | FORM | Formaldehyde from inventory |
| Methanol | MEOH | Methanol from inventory |
| PM10 | PMC | Coarse PM >2.5 microns and <10 microns |
| PM2.5 | PEC | Particulate elemental carbon <2.5 microns |
| PM2.5 | PNO3 | Particulate nitrate <2.5 microns |
| PM2.5 | POC | Particulate organic carbon (carbon only) <2.5 microns |
| PM2.5 | PSO4 | Particulate Sulfate <2.5 microns |
| PM2.5 | PAL | Aluminum |
| PM2.5 | PCA | Calcium |
| PM2.5 | PCL | Chloride |
| PM2.5 | PFE | Iron |
| PM2.5 | PK | Potassium |
| PM2.5 | PH2O | Water |
| PM2.5 | PMG | Magnesium |
| PM2.5 | PMN | Manganese |
| PM2.5 | PMOTHR | PM2.5 not in other AE6 species |
| PM2.5 | PNA | Sodium |
| PM2.5 | PNCOM | Non-carbon organic matter |
| PM2.5 | PNH4 | Ammonium |
| PM2.5 | PSI | Silica |
| PM2.5 | PTI | Titanium |

The TOG and PM2.5 profiles used to speciate emissions are part of the SPECIATE v5.2 database
(https://www.epa.gov/air-emissions-modeling/speciate). The SPECIATE database is developed and
maintained by the EPA's Office of Research and Development (ORD), Office of Transportation and Air
Quality (OTAQ), and the Office of Air Quality Planning and Standards (OAQPS), in cooperation with
Environment Canada (EPA, 2016). These profiles are processed using the EPA's S2S-Tool
(https://github.com/USEPA/S2S-Tool) to generate the GSPRO and GSCNV files needed by SMOKE. As
with previous platforms, some Canadian point source inventories are provided from Environment Canada
as pre-speciated emissions.

Speciation profiles (GSPRO files) and cross-references (GSREF files) for this study platform are
available in the SMOKE input files for the platform. VOC and PM2.5 emissions by county, sector, and
profile for all sectors other than onroad mobile can be found in the sector summaries. Total
emissions for each model species by state and sector can be found in the state-sector totals workbook.

The following updates to profile assignments were made to this modeling platform and vary from prior
years:

•	For PM2.5:

o The profile for grass fires was updated to profile 95809.
o The profile for hydrogen boilers was updated to a gas combustion profile.
o Assignments for new PM2.5 SCCs in the 2020 point and nonpoint inventories were included.

•	For VOC:

o The profile for wildfires and prescribed fires was updated to profile 95861.
o Assignments for new VOC SCCs in the 2020 point and nonpoint inventories were included
(e.g., agricultural silage and asphalt paving).
o Several point and nonpoint SCCs which were previously assigned the overall average
profile were reassigned to more appropriate profiles.

The base emissions inventory for this modeling platform includes total VOC and individual HAP
emissions. Often, individual HAPs are components of VOC (HAP-VOC), and these HAP-VOCs are
included ("integrated") in the speciation process. This HAP integration is performed in a way to ensure
double counting of emitted mass does not occur and requires specific data processing by the S2S-Tool
and user input in SMOKE.

To incorporate HAP emissions from the base inventory into the modeling platform, one of two methods
is used. (1) In the "Integrate, HAP-use" method, the mass of integrated HAP-VOCs is summed
and subtracted from VOC, and the residual mass (NONHAPVOC) is speciated using a renormalized
speciation profile that does not include the integrated HAP-VOCs (they are subtracted from the profile
and then the profile is renormalized to 100%). (2) In the "No-Integrate, HAP-use" method, the mass of
VOC is speciated using a speciation profile that does not include the integrated HAP-VOCs (they are
subtracted from the profile and the profile is not renormalized to 100%). In this scenario, the HAP-VOC
and VOC portions of the inventory are difficult to harmonize, and it is assumed that the proportions of
HAPs from these sources are adequately captured in the speciation profile used to speciate the VOC
emissions (which is why there is no renormalization). In addition, HAPs can be introduced into a
modeling platform using speciation profiles alone. In this scenario, HAP-VOC emissions are "generated"
through VOC speciation and are not incorporated from the base inventory. This method is called
"Criteria" speciation. The integration methods used for each platform sector are shown in Table 3-8.

Table 3-8. Integration status for each platform sector

| Platform Sector | Approach for Integrating NEI emissions of Naphthalene (N), Benzene (B), Acetaldehyde (A), Formaldehyde (F) and Methanol (M) |
|---|---|
| afdust | N/A - sector contains no VOC |
| airports | No integration, use NBAFM in inventory |
| beis | N/A - sector contains no inventory pollutant "VOC"; but rather specific VOC species |
| cmv_c1c2 | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| cmv_c3 | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| fertilizer | N/A - sector contains no VOC |
| livestock | Full integration (NBAFM) |
| nonpt | Partial integration (NBAFM) |
| nonroad | Full integration (internal to MOVES) |
| np_oilgas | Partial integration (NBAFM) |
| onroad | Full integration (internal to MOVES) |
| canada_onroad | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| mexico_onroad | Full integration (internal to MOVES-Mexico); however, MOVES-Mexico speciation was older CB6, so post-SMOKE emissions were converted to CB6R3AE6 |
| canada_afdust | N/A - sector contains no VOC |
| canmex_area | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canmex_point | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canada_ptdust | N/A - sector contains no VOC |
| canada_og2D | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| canmex_ag | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| pt_oilgas | No integration, use NBAFM in inventory |
| ptagfire | Full integration (NBAFM) |
| ptegu | No integration, use NBAFM in inventory |
| ptfire-rx | Full integration (NBAFM) |
| ptfire-wild | Partial integration (NBAFM) |
| ptfire_othna | No integration, no NBAFM in inventory, create NBAFM from VOC speciation |
| ptnonipm | No integration, use NBAFM in inventory |
| rail | Full integration (NBAFM) |
| rwc | Full integration (NBAFM) |
| np_solvents | Partial integration (NBAFM) |
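
The difference between the two HAP-use methods described before Table 3-8 can be sketched as follows.
This is a simplified illustration on a mass-fraction basis only, not the S2S-Tool or SMOKE logic:
the function names, the toy profile, and the emission values are hypothetical, and conversions to
moles and to model-species units are ignored.

```python
NBAFM = ("NAPH", "BENZ", "ALD2", "FORM", "MEOH")


def speciate_integrate(voc_mass, hap_masses, profile, integrated=NBAFM):
    """Integrate case: speciate NONHAPVOC with a renormalized profile."""
    nonhapvoc = voc_mass - sum(hap_masses.values())
    residual = {s: w for s, w in profile.items() if s not in integrated}
    weight_sum = sum(residual.values())  # renormalize the remaining weights
    result = {s: nonhapvoc * w / weight_sum for s, w in residual.items()}
    for species, mass in hap_masses.items():
        result[species] = result.get(species, 0.0) + mass  # HAPs from inventory
    return result


def speciate_no_integrate(voc_mass, hap_masses, profile, integrated=NBAFM):
    """No-integrate case: drop HAPs from the profile without renormalizing."""
    result = {s: voc_mass * w for s, w in profile.items() if s not in integrated}
    for species, mass in hap_masses.items():
        result[species] = result.get(species, 0.0) + mass  # HAPs from inventory
    return result


# Toy profile (mass fractions summing to 1) and toy inventory values.
profile = {"PAR": 0.6, "TOL": 0.2, "BENZ": 0.1, "FORM": 0.1}
haps = {"BENZ": 5.0, "FORM": 3.0}
integrated_case = speciate_integrate(100.0, haps, profile)
no_integrate_case = speciate_no_integrate(100.0, haps, profile)
print(integrated_case)
print(no_integrate_case)
```

In the integrate case the speciated total equals the inventory VOC mass, so the inventory HAPs are not
double counted. In the no-integrate case the total is the non-HAP share of the profile applied to VOC
plus the inventory HAP mass, so it generally differs from the inventory VOC mass.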

The HAPs integrated from the base inventory into the modeling platform are sector and chemical
mechanism specific. In recent years, CB6R3_AE7 has been the primary chemical mechanism used at the
EPA. Within that mechanism, naphthalene (NAPH), benzene (BENZ), acetaldehyde (ALD2),
formaldehyde (FORM), and methanol (MEOH) are explicit HAP-VOCs, and these compounds are
collectively referred to as NBAFM. Since NBAFM are explicitly modeled in CB6R3_AE7, these species
have become the default collection of integrated HAP species at the EPA. MOVES, the EPA's mobile
emissions model, features additional species that are explicitly modeled (e.g., ethanol). These species
are also incorporated directly into modeling platforms if they are explicit in CB6R3_AE7. To incorporate
these species, additional files from the S2S-Tool are required. For California, speciation of
NONHAPTOG is performed on CARB's VOC submissions using the county-specific speciation profile
assignments generated by MOVES in California.

Several sectors require VOC speciation to occur at the county level, so consistent speciation profiles
cannot be applied across the nation. To accomplish this, the GSREF_COMBO functionality within
SMOKE is leveraged. A GSREF_COMBO allows profiles to be "blended" at the county/SCC level using
proportions included in the input file. These variable VOC speciation methods are applied in the oil and
gas sectors and for various mobile emissions sources. In both the np_oilgas and pt_oilgas sectors, VOC
speciation profiles are weighted to reflect region-specific application of controls, differences in gas
composition, and variable sources of emissions (e.g., varying proportions of emissions from associated
gas, condensate tanks, crude oil tanks, dehydrators, liquids unloading and well completions). The
Nonpoint Oil and Gas Emissions Estimation Tool generates an intermediate file that provides SCC and
county-specific emissions proportions, which are subsequently incorporated into the modeling platform.
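
The profile "blending" described above amounts to a weighted average of speciation profiles. A minimal
sketch follows; it does not reproduce the SMOKE file formats, and the function name, profile codes,
and proportions are hypothetical values chosen for illustration.

```python
def blend_profiles(profiles, proportions):
    """Blend speciation profiles using county/SCC-specific proportions.

    profiles: {profile_code: {species: mass_fraction}}
    proportions: {profile_code: share}; shares are normalized to sum to 1.
    """
    total_share = sum(proportions.values())
    blended = {}
    for code, share in proportions.items():
        for species, fraction in profiles[code].items():
            blended[species] = blended.get(species, 0.0) + fraction * share / total_share
    return blended


# Hypothetical profiles for two emission processes in one county/SCC.
profiles = {
    "PROFILE_A": {"PAR": 0.8, "ETHA": 0.2},
    "PROFILE_B": {"PAR": 0.5, "PRPA": 0.5},
}
county_proportions = {"PROFILE_A": 0.25, "PROFILE_B": 0.75}
blended = blend_profiles(profiles, county_proportions)
print(blended)
```

Because each input profile sums to one and the shares are normalized, the blended profile also sums to
one, so the blended speciation conserves the VOC mass being speciated.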

For onroad and nonroad mobile sources, the VOC speciation weighting factors vary for each SCC,
representative county, emissions mode (e.g., exhaust, evaporative), month for start exhaust, and season.
To generate onroad emissions and perform the subsequent speciation, SMOKE-MOVES is first run to
estimate emissions and both the MEPROC and INVTABLE files are used to control which pollutants are
processed and eventually integrated. Next, a MOVES post-processing tool is used to generate the needed
GSREF_COMBO data/files. While similar in nature and outcome, the post-processing tools/scripts used
for onroad and nonroad are different. This script allows speciation to occur outside of MOVES, which
better supports processing of onroad emissions for chemical mechanisms other than CB6, without having
to rerun the MOVES model. From there, the NONHAPTOG emission factor tables produced by MOVES
are speciated within SMOKE using the GSREF_COMBO file and the NONHAPTOG GSPRO files
generated by the S2S-Tool. Further details on speciation methods involving MOVES can be found in
the associated technical report.

In Canada, a GSPRO_COMBO file is used to generate speciated gasoline emissions that account for
various ethanol mixes. In Mexico, onroad emissions are pre-speciated from the MOVES-Mexico model,
thus eliminating the need for a GSPRO_COMBO file. For both Canada and Mexico, nonroad VOC
emissions are not defined by mode (e.g., exhaust versus evaporative), which necessitates a
GSPRO_COMBO file that splits total VOC into exhaust and evaporative components. In addition,
MOVES-Mexico uses an older version of MOVES that is hardcoded for an older version of the CB6
chemical mechanism ("CB6-CAMx"). This version does not generate the model species XYLMN or
SOAALK, so additional post-processing is performed to generate those emissions:

•	XYLMN = XYL[1] - 0.966*NAPHTHALENE[1]

•	PAR = PAR[1] - 0.00001*NAPHTHALENE[1]

•	SOAALK = 0.108*PAR[1]
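
The post-processing equations above can be applied as in the following sketch. It assumes that the "[1]"
suffix denotes the value of a species before post-processing; the function name and the input values are
hypothetical.

```python
def postprocess_moves_mexico(xyl, par, naphthalene):
    """Derive XYLMN, adjusted PAR, and SOAALK from pre-processing values.

    All three inputs are the emissions before post-processing (the "[1]"
    terms in the equations), in any consistent unit.
    """
    return {
        "XYLMN": xyl - 0.966 * naphthalene,
        "PAR": par - 0.00001 * naphthalene,
        "SOAALK": 0.108 * par,
    }


# Illustrative inputs only.
converted = postprocess_moves_mexico(xyl=10.0, par=100.0, naphthalene=1.0)
print(converted)
```

Note that SOAALK is computed from the original PAR value, not from the adjusted one, following the
equations as written.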

Unlike VOC speciation, PM2.5 speciation does not integrate species from the base inventory. Except for
mobile sources, speciation is performed within SMOKE, using SPECIATE profiles that were post-
processed using the S2S-Tool. In this modeling platform, onroad PM2.5 speciation is performed within
MOVES, meaning that the model generates emissions factor tables that include total PM2.5 and each of its
components (e.g., POC, PEC, PFE, etc.). Nonroad PM2.5 speciation is also performed within MOVES, but
the output is not speciated emissions. Rather, MOVES outputs emissions of PM2.5 for each relevant
speciation profile. Small adjustments to the methods were needed to accommodate the reporting by
California. Since California does not provide speciated PM2.5 emissions, total PM2.5 emissions for onroad
and nonroad sources in California were speciated using the profile proportions estimated by MOVES in
California. Finally, onroad brake and tire wear PM2.5 emissions were speciated in the moves2smk
postprocessor using the SPECIATE profiles 95462 and 95460, respectively.

Diesel PM emissions are explicitly included in the NEI using the pollutant names DIESEL-PM10 and
DIESEL-PM25 for select mobile sources whose engines burn diesel or residual-oil fuels. This includes
sources in onroad, nonroad, point airport ground support equipment, point locomotives, nonpoint
locomotives, and all PM from diesel or residual oil fueled nonpoint CMV. These emissions are equal to
their primary PM10-PRI and PM25-PRI counterparts, are exclusively from exhaust (i.e., do not include
brake/tire wear), and are exclusively used in toxics modeling. Diesel PM is then speciated in SMOKE
using the same speciation profiles and methods as primary PM, except that diesel PM is mapped to model
species that feature "DIESEL PM" in their species name.

In the NEI, NOx emissions are inventoried on a NO2 weighted basis, but must be speciated into NO, NO2,
and HONO. Table 3-9 provides the NOx speciation profiles used in EPA's modeling platforms. The only
difference between the two profiles is the allocation of some NO2 mass to HONO in the "HONO" profile.
HONO emissions from mobile sources have been identified in tunnel studies and its inclusion in
emissions inventories is important for urban chemistry. Here, a HONO to NOx ratio of 0.008 was selected
(Sarwar, 2008). In this modeling platform, all non-mobile sources use the "NHONO" profile, all non-
onroad mobile sources (including nonroad, cmv, and rail) use the "HONO" profile, and all onroad NOx
speciation occurs within MOVES. For further details on NOx speciation within MOVES, please see the
associated technical report.

Table 3-9. NOx speciation profiles

| Profile | pollutant | species | split factor |
|---|---|---|---|
| HONO | NOX | NO2 | 0.092 |
| HONO | NOX | NO | 0.9 |
| HONO | NOX | HONO | 0.008 |
| NHONO | NOX | NO2 | 0.1 |
| NHONO | NOX | NO | 0.9 |
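
The split factors in Table 3-9 can be applied as in the following sketch. The factors are taken from the
table; the function name and the example NOx mass are hypothetical, and the sketch applies the factors
to NO2-weighted mass only, without any conversion to moles.

```python
NOX_PROFILES = {
    "HONO": {"NO2": 0.092, "NO": 0.9, "HONO": 0.008},
    "NHONO": {"NO2": 0.1, "NO": 0.9},
}


def speciate_nox(nox_mass, profile_name):
    """Split inventory NOx (NO2-weighted mass) into model species."""
    profile = NOX_PROFILES[profile_name]
    return {species: nox_mass * factor for species, factor in profile.items()}


# Example: 1000 mass units of NOx speciated with each profile.
with_hono = speciate_nox(1000.0, "HONO")
without_hono = speciate_nox(1000.0, "NHONO")
print(with_hono)
print(without_hono)
```

Both profiles sum to one, and the NO split is identical; the "HONO" profile simply moves 0.008 of the
NOx mass from NO2 to HONO.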

3.3.5 Temporal Processing Configuration

Temporal allocation is the process of distributing aggregated emissions to a finer temporal resolution,
thereby converting annual emissions to hourly emissions as is required by CMAQ. While the total
emissions are important, the timing of the occurrence of emissions is also essential for accurately
simulating ozone, PM, and other pollutant concentrations in the atmosphere. Many emissions inventories
are annual or monthly in nature. Temporal allocation takes these aggregated emissions and distributes the
emissions to the hours of each day. This process is typically done by applying temporal profiles to the
inventories in this order: monthly, day of the week, and diurnal, with monthly and day-of-week profiles
applied only if the inventory is not already at that level of detail.
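
The order of profile application described above (monthly, then day of week, then diurnal) can be
sketched as follows. This is a simplified illustration and not the SMOKE Temporal program: the function
name and the profile values are hypothetical, and the day-of-week weights are normalized over the days
of each month so that the monthly total is preserved.

```python
import calendar
from datetime import date, timedelta


def allocate_day(annual_total, day, monthly, weekly, diurnal):
    """Return 24 hourly emissions for one day from an annual total.

    monthly: 12 fractions summing to 1 (January first).
    weekly: 7 relative weights (Monday first).
    diurnal: 24 fractions summing to 1.
    """
    month_total = annual_total * monthly[day.month - 1]
    n_days = calendar.monthrange(day.year, day.month)[1]
    # Normalize day-of-week weights over the actual days of this month.
    weight_sum = sum(weekly[date(day.year, day.month, d).weekday()]
                     for d in range(1, n_days + 1))
    day_total = month_total * weekly[day.weekday()] / weight_sum
    return [day_total * fraction for fraction in diurnal]


# Hypothetical profiles: flat monthly and diurnal, lower weekend emissions.
MONTHLY = [1.0 / 12.0] * 12
WEEKLY = [1.1, 1.1, 1.1, 1.1, 1.1, 0.8, 0.7]
DIURNAL = [1.0 / 24.0] * 24

year_total = 0.0
current = date(2020, 1, 1)
while current.year == 2020:
    year_total += sum(allocate_day(1200.0, current, MONTHLY, WEEKLY, DIURNAL))
    current += timedelta(days=1)
print(year_total)
```

For a monthly inventory, the first step would be skipped and the monthly value used directly as the
month total, consistent with applying profiles only where the inventory lacks that level of detail.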

The temporal factors applied to the inventory were selected using some combination of country, state,
county, SCC, and pollutant. Table 3-10 summarizes the temporal aspects of emissions modeling by
comparing the key approaches used for temporal processing across the sectors. In the table, "Daily
temporal approach" refers to the temporal approach for getting daily emissions from the inventory using
the SMOKE Temporal program. The values given are the values of the SMOKE L_TYPE setting. The
"Merge processing approach" refers to the days used to represent other days in the month for the merge
step. If this is not "all," then the SMOKE merge step runs only for representative days, which could
include holidays as indicated by the right-most column. The values given are those used for the SMOKE
M_TYPE setting (see below for more information).

Table 3-10. Temporal Settings Used for the Platform Sectors in SMOKE

| Platform sector short name | Inventory resolutions | Monthly profiles used? | Daily temporal approach | Merge processing approach | Process holidays as separate days |
|---|---|---|---|---|---|
| afdust_adj | Annual | Yes | week | all | Yes |
| airports | Annual | Yes | week | week | Yes |
| beis | Hourly | | n/a | all | No |
| cmv_c1c2 | Annual & hourly | | all | all | No |
| cmv_c3 | Annual & hourly | | all | all | No |
| fertilizer | Monthly | | met-based | all | Yes |
| livestock | Annual | Yes | met-based | all | Yes |
| nonpt | Annual | Yes | week | week | Yes |
| nonroad | Monthly | | mwdss | mwdss | Yes |
| np_oilgas | Annual | Yes | aveday | aveday | No |
| onroad | Annual & monthly (1) | | all | all | Yes |
| onroad_ca_adj | Annual & monthly (1) | | all | all | Yes |
| canada_afdust | Annual & monthly | Yes | week | all | No |
| canmex_area | Monthly | | week | week | No |
| canada_onroad | Monthly | | week | week | No |
| mexico_onroad | Monthly | | week | week | No |
| canmex_point | Monthly | Yes | mwdss | mwdss | No |
| canada_ptdust | Annual | Yes | week | all | No |
| canmex_ag | Annual | Yes | mwdss | mwdss | No |
| canada_og2D | Monthly | | mwdss | mwdss | No |
| pt_oilgas | Annual | Yes | mwdss | mwdss | Yes |
| ptegu | Annual & hourly | Yes (2) | all | all | No |
| ptnonipm | Annual | Yes | mwdss | mwdss | Yes |
| ptagfire | Daily | | all | all | No |
| ptfire-rx | Daily | | all | all | No |
| ptfire-wild | Daily | | all | all | No |
| ptfire_othna | Daily | | all | all | No |
| rail | Annual | Yes | aveday | aveday | No |
| rwc | Annual | No (3) | met-based (3) | all | No (3) |
| np_solvents | Annual | Yes | aveday | aveday | No |

1.	Note the annual and monthly "inventory" actually refers to the activity data (VMT, VPOP, starts) for onroad. The
actual emissions are computed on an hourly basis.

2.	Only units that do not have matching hourly CEMs data use monthly temporal profiles.

3.	Except for 2 SCCs that do not use met-based temporalization.

The following values are used in the table. The value "all" means that hourly emissions are computed for
every day of the year and that emissions potentially have day-of-year variation. The value "week" means
that hourly emissions are computed for all days in one "representative" week, representing all weeks for
each month. This means emissions have day-of-week variation, but not week-to-week variation within the
month. The value "mwdss" means hourly emissions are computed for one representative Monday,
representative weekday (Tuesday through Friday), representative Saturday, and representative Sunday for
each month. This means emissions have variation between Mondays, other weekdays, Saturdays and Sundays
within the month, but not week-to-week variation within the month. The value "aveday" means hourly
emissions are computed for one representative day of each month, meaning emissions for all days within a
month are the same. Special situations with respect to temporal allocation are described in the following
subsections.
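
The four settings can be illustrated by mapping each calendar day to the representative day whose hourly
emissions stand in for it. This is a minimal sketch, not SMOKE code: the function name and the key
layout are hypothetical, and the separate treatment of holidays is not shown.

```python
from datetime import date


def representative_key(day, setting):
    """Return a key identifying the representative day used for `day`."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6
    if setting == "all":
        return (day.month, day.day)          # every day is processed
    if setting == "week":
        return (day.month, weekday)          # one representative week per month
    if setting == "mwdss":
        label = {0: "monday", 5: "saturday", 6: "sunday"}.get(weekday, "weekday")
        return (day.month, label)            # Monday, weekday, Saturday, Sunday
    if setting == "aveday":
        return (day.month,)                  # one average day per month
    raise ValueError(f"unknown setting: {setting}")


# Count how many distinct days must be processed for July 2020.
july = [date(2020, 7, d) for d in range(1, 32)]
counts = {s: len({representative_key(d, s) for d in july})
          for s in ("all", "week", "mwdss", "aveday")}
print(counts)
```

The counts show why the representative-day settings reduce processing: fewer distinct days need to be
run through the merge step for each month.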

In addition to the resolution, temporal processing includes a ramp-up period for several days prior to
January 1, 2020, which is intended to mitigate the effects of initial condition concentrations. The ramp-up
period was 10 days (December 22-31, 2019). For all anthropogenic sectors, emissions from December
2020 were used to fill in surrogate emissions for the end of December 2019. For biogenic emissions,
December 2019 emissions were computed using year 2019 meteorology.

The FF10 inventory format for SMOKE provides a consolidated format for monthly, daily, and hourly
emissions inventories. With the FF10 format, a single inventory file can contain emissions for all 12
months and the annual emissions in a single record. This helps simplify the management of numerous
inventories. Similarly, daily and hourly FF10 inventories contain individual records with data for all days
in a month and all hours in a day, respectively.

SMOKE prevents the application of temporal profiles on top of the "native" resolution of the inventory.
For example, a monthly inventory should not have annual-to-month temporal allocation applied to it;
rather, it should only have month-to-day and diurnal temporal allocation. This becomes particularly
important when specific sectors have a mix of annual, monthly, daily, and/or hourly inventories. The
flags that control temporal allocation for a mixed set of inventories are discussed in the SMOKE
documentation. The modeling platform sectors that make use of monthly values in the FF10 files are
nonroad, onroad (for activity data), and all Canada and Mexico inventories except for agriculture.
Commercial marine vessels in cmv_c3 and cmv_c1c2 use hourly data in the FF10 files.

3.3.5.1 Standard Temporal Profiles

Some sectors use straightforward temporal profiles not based on meteorology or other factors. For the
ptfire, ptagfire, and ptfire_othna sectors, the inventories are in the daily point fire format, so temporal
profiles are only used to go from day-specific to hourly emissions. For all agricultural burning, the
diurnal temporal profile reflects the fact that burning occurs during daylight hours. This puts most of
the emissions during the workday and suppresses the emissions during the middle of the night. This
diurnal profile was used for each day of the week for all agricultural burning emissions in all states.

Most temporal profiles in ptnonipm result in primarily constant emissions for each day of the year,
although some have lower emissions on Sundays. An update in the 2018 platform was an analysis of
monthly temporal profiles for non-EGU point sources in the ptnonipm sector. A number of profiles were
found to be not quite flat over the months but were so close to flat that the difference was not meaningful.
In the cross-reference, these profiles were replaced with the flat monthly profile. The codes
for the profiles that were replaced were: 202, 214, 220, 221, 222, 223, 227, 257, 263, 264, 265, 266, 267,
269, 271, 272, 279, 280, 295, 302, 303, 304, 305, 306, 309, 310, 327, 329, 332, and 333.

Monthly temporalization of np oilgas emissions is based primarily on year-specific monthly factors from
the Oil and Gas Tool (OGT). Factors were specific to each county and SCC. For use in SMOKE, each
unique set of factors was assigned a label (OG20M_0001 through OG20M_6306), and then a SMOKE-
formatted ATPRO_MONTHLY and an ATREF were developed. This dataset of monthly temporal
factors included profiles for all counties and SCCs in the Oil and Gas Tool inventory. Because we are
using non-tool datasets in some states, this monthly temporalization dataset did not cover all counties and
SCCs in the entire inventory used for this study. To fill in the gaps in those states, state average monthly
profiles for oil, natural gas, and combination sources were calculated from Energy Information
Administration (EIA) data and assigned to each county/SCC combination not already covered by the

62


-------
OGT monthly temporal profile dataset. Coal bed methane (CBM) and natural gas liquid sources were
assigned flat monthly profiles where there was not already a profile assignment in the ERG dataset.

For the afdust sector, meteorology is not used in the development of the temporal profiles, but it is used to
reduce the total emissions based on meteorological conditions. These adjustments are applied through
sector-specific scripts, beginning with the application of land use-based gridded transport fractions and
then subsequent zero-outs for hours during which precipitation occurs or there is snow cover on the
ground. The land use data used to reduce the NEI emissions explain the amount of emissions that are
subject to transport. This methodology is discussed in (Pouliot et al., 2010), and in "Fugitive Dust
Modeling for the 2008 Emissions Modeling Platform" (Adelman, 2012). The precipitation adjustment is
applied to remove all emissions for hours where measurable rain occurs, or where there is snow cover.
Therefore, the afdust emissions vary day-to-day based on the precipitation and/or snow cover for each
grid cell and hour. Both the transport fraction and meteorological adjustments are based on the gridded
resolution of the platform; therefore, somewhat different emissions will result from different grid
resolutions. Application of the transport fraction and meteorological adjustments prevents the
overestimation of fugitive dust impacts in the grid modeling as compared to ambient samples.
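
The order of the afdust adjustments described above can be sketched for a single grid cell as follows. This is an illustrative sketch only, not the sector-specific scripts themselves: the function name, the argument layout, and the treatment of any precipitation above zero as "measurable" are assumptions.

```python
def adjust_afdust(emis, transport_frac, precip, snow_cover):
    """Return adjusted hourly emissions for one grid cell.

    emis:           list of unadjusted hourly emissions
    transport_frac: land use-based transport fraction for the cell (0 to 1)
    precip:         list of hourly precipitation amounts
    snow_cover:     list of booleans, True when snow is on the ground
    """
    adjusted = []
    for e, p, snow in zip(emis, precip, snow_cover):
        if p > 0.0 or snow:
            # Zero-out: assumed here to apply to any nonzero precipitation or snow cover.
            adjusted.append(0.0)
        else:
            # Otherwise only the transportable fraction of the emissions is kept.
            adjusted.append(e * transport_frac)
    return adjusted
```

Because the transport fraction and the meteorology are both gridded, running the same logic on a different grid resolution would produce somewhat different totals, as noted above.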

Biogenic emissions from the BEIS model vary each day of the year because they are developed using
meteorological data including temperature, surface pressure, and radiation/cloud data. The emissions are
computed using appropriate emission factors according to the vegetation in each model grid cell, while
taking the meteorological data into account.

For the cmv sectors, most areas use hourly emission inventories derived from the 5-minute AIS data. In
some areas where AIS data are not available, such as in Canada between the St. Lawrence Seaway and the
Great Lakes and in the southern Caribbean, the flat temporal profiles are used for hourly and day-of-week
values. Most regions without AIS data also use a flat monthly profile, with some offshore areas using an
average monthly profile derived from the 2008 ECA inventory monthly values. These areas without AIS
data also use flat day of week and hour of day profiles.

For the rail sector, monthly profiles from the 2016 platform were used. Monthly temporal allocation for
rail freight emissions is based on AAR Rail Traffic Data, Total Carloads and Intermodal, for 2016. For
passenger trains, monthly temporal allocation is flat for all months. Rail passenger miles data is available
by month but it is not known how closely rail emissions track with passenger activity since passenger
trains run on a fixed schedule regardless of how many passengers are aboard, and so a flat profile is
chosen for passenger trains. Rail emissions are allocated with flat day of week profiles, and most
emissions are allocated with flat hourly profiles.

For the ptfire sectors, the inventories are in the daily point fire format FF10 PTDAY, so temporal profiles
are only used to go from day-specific to hourly emissions. For ptfire, state-specific hourly profiles were
used, with distinct profiles for prescribed fires and wildfires. The wildfire diurnal profiles are similar
but vary according to the average meteorological conditions in each state.

3.3.5.2 Temporal Profiles for EGUs

Electric generating unit (EGU) sources matched to ORIS units were temporally allocated to hourly
emissions needed for modeling using the hourly CEMS data for units that could be matched to the CEMS
emissions. Those hourly data were processed through v2.1 of the CEMCorrect tool to mitigate the impact
of unmeasured values in the data.

The temporal allocation procedure for EGUs in the base year is differentiated by whether or not the unit
could be directly matched to a unit with CEMS data via its ORIS facility code and boiler ID. Note that
for units matched to CEMS data, annual totals of their emissions input to CMAQ may be different than
the values in the annual inventory because the CEMS data replaces the NOx and SO2 annual inventory
data for the seasons in which the CEMS are operating. If a CEMS-matched unit is determined to be a
partial year reporter, as can happen for sources that run CEMS only in the summer, emissions totaling the
difference between the annual emissions and the total CEMS emissions are allocated to the non-summer
months. Prior to use of the CEMS data in SMOKE it is processed through the CEMCorrect tool. The
CEMCorrect tool identifies hours for which the data were not measured as indicated by the data quality
flags in the CEMS data files. Unmeasured data can be filled in with maximum values and thereby cause
erroneously high values in the CEMS data. When data were flagged as unmeasured and the values were
found to be more than three times the annual mean for that unit, the data for those hours were replaced
with annual mean values (Adelman et al., 2012). These adjusted CEMS data were then used for the
remainder of the temporal allocation process described below (see Figure 3-4 for an example).
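
The replacement rule described above can be sketched as follows. This is a minimal illustration, not the CEMCorrect tool itself: the function name and the boolean flag representation are hypothetical, and the annual mean is assumed to be taken over all hours of the unit, including the flagged ones.

```python
def replace_unmeasured_spikes(values, measured, factor=3.0):
    """Replace unmeasured hourly values that exceed factor x annual mean.

    values:   hourly CEMS values for one unit over the year
    measured: booleans, False where the quality flag marks the hour as unmeasured
    factor:   multiple of the annual mean above which a flagged hour is replaced
    """
    # Assumption: the annual mean includes every hour, flagged or not.
    annual_mean = sum(values) / len(values)
    corrected = []
    for value, ok in zip(values, measured):
        if not ok and value > factor * annual_mean:
            corrected.append(annual_mean)
        else:
            corrected.append(value)
    return corrected
```

Hours that are flagged as unmeasured but fall below the threshold are left unchanged, as are measured hours of any magnitude.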

[Figure: 2017 January Unit 469_5; x-axis: January 2017 Hour; series: RawCEM, Corrected]

Figure 3-4. Eliminating unmeasured spikes in CEMS data

The region, fuel, and type (peaking or non-peaking) must be identified for each input EGU with CEMS
data so the data can be used to generate profiles. The identification of peaking units was done using
hourly heat input data from the 2020 base year and the two previous years (2018 and 2019). The heat
input was summed for each year. Equation 1 shows how the annual heat input value is converted from
heat units (BTU/year) to power units (MW) using the NEEDS v6 derived unit-level heat rate (BTU/kWh).
In Equation 2 a capacity factor is calculated by dividing the annual unit MW value by the NEEDS v6 unit
capacity value (MW) multiplied by the hours in the year. A peaking unit was defined as any unit that had
a maximum capacity factor of less than 0.2 for every year (2018, 2019, and 2020) and a 3-year average
capacity factor of less than 0.1.

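Equation 1. Annual unit power output

$$\text{Annual Unit Output (MW)} = \frac{\sum_{i=1}^{8760} \text{Hourly HI}_i\ (\text{BTU})}{\text{NEEDS Heat Rate}\ \left(\frac{\text{BTU}}{\text{kWh}}\right)} \times \frac{1\ \text{MW}}{1000\ \text{kW}}$$

Equation 2. Unit capacity factor

$$\text{Capacity Factor} = \frac{\text{Annual Unit Output (MW)}}{\text{NEEDS Unit Capacity (MW)} \times 8760\ (\text{h})}$$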
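
As a rough sketch of how Equations 1 and 2 and the peaking-unit definition fit together, the calculation could be written as below. The function names are hypothetical, and the conversion from kWh to MWh by dividing by 1000 is an assumption about the intended units.

```python
HOURS_PER_YEAR = 8760


def capacity_factor(annual_heat_input_btu, heat_rate_btu_per_kwh, capacity_mw):
    """Capacity factor from annual heat input, NEEDS heat rate, and NEEDS capacity."""
    # Heat input (BTU) divided by heat rate (BTU/kWh) gives kWh; /1000 gives MWh.
    annual_output = annual_heat_input_btu / heat_rate_btu_per_kwh / 1000.0
    return annual_output / (capacity_mw * HOURS_PER_YEAR)


def is_peaking(capacity_factors):
    """capacity_factors: one value per year for 2018, 2019, and 2020."""
    every_year_low = max(capacity_factors) < 0.2
    average_low = sum(capacity_factors) / len(capacity_factors) < 0.1
    return every_year_low and average_low
```

A unit must satisfy both conditions to be treated as peaking: a single year at or above 0.2 disqualifies it even when the 3-year average is below 0.1.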

Input regions were determined from one of the eight EGU modeling regions based on MJO and climate
regions. Regions were used to group units with similar climate-based load demands. Region assignment was
made on a state level, where all units within a state were assigned to the appropriate region. Figure 3-5
shows the regions used to generate the profiles. Unit fuel assignments were made using the primary
NEEDS v6 fuel. Units fueled by bituminous, subbituminous, or lignite were assigned to the coal fuel type.
Natural gas units were assigned to the gas fuel type. Distillate and residual fuel oil were assigned to the
oil fuel type. Units with any other primary fuel were assigned the "other" fuel type. Currently there are
64 profiles based on 8 regions, 4 fuels, and two types (peaking and non-peaking).

The daily and diurnal profiles were calculated for each region, fuel, and peaking type group from the year
2020 CEMS heat input values. The heat input values were summed for each input group to the annual
level at each level of temporal resolution: monthly, day-of-month, and diurnal. The sum by temporal
resolution value was then divided by the sum of annual heat input in that group to get a set of
temporalization factors. Diurnal factors were created for both the summer and winter seasons to account
for the variation in hourly load demands between the seasons. For example, the sum of all hour 1 heat
input values in the group was divided by the sum of all heat inputs over all hours to get the hour 1 factor.
Each grouping contained 12 monthly factors, up to 31 daily factors per month, and two sets of 24 hourly
factors. The profiles were weighted by unit size where the units with more heat input have more influence
on the shape of the profile. Composite profiles were created for each region and type across all fuels as a
way to provide profiles for a fuel type that does not have hourly CEMS data in that region. Figure 3-6
shows peaking and non-peaking daily temporal profiles for the gas fuel type in the LADCO region. Figure
3-7 shows the diurnal profiles for the coal fuel type in the Mid-Atlantic/Northeast Visibility Union
(MANE-VU) region.
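
The factor calculation described above (bin sum divided by the group's total heat input) can be illustrated with a short sketch. The record layout and function name are assumptions made for illustration; the seasonal split of the diurnal factors is not shown.

```python
from collections import defaultdict


def temporal_factors(records, key):
    """Share of a group's annual heat input that falls in each temporal bin.

    records: iterable of (month, day, hour, heat_input) tuples pooled over
             all units in one region/fuel/peaking-type group
    key:     function mapping (month, day, hour) to a bin label
    """
    sums = defaultdict(float)
    total = 0.0
    for month, day, hour, heat_input in records:
        sums[key(month, day, hour)] += heat_input
        total += heat_input
    return {label: value / total for label, value in sums.items()}


# Hypothetical pooled records: (month, day, hour, heat input)
records = [(1, 1, 1, 30.0), (1, 1, 2, 10.0), (2, 1, 1, 50.0), (2, 1, 2, 10.0)]
monthly = temporal_factors(records, lambda m, d, h: m)
hourly = temporal_factors(records, lambda m, d, h: h)
```

Because heat input is pooled before dividing, units with more heat input automatically carry more weight in the shape of the profile, which matches the weighting described above.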

[Map of EGU regions: LADCO, MANE-VU, Northwest, SESARM, South, West, Southwest, West North Central]

Figure 3-5. Small EGU Temporal Profile Regions


Figure 3-6. Example Daily Temporal Profiles for the LADCO region and Gas Fuel Type

Diurnal Small EGU Profile for MANE-VU coal

Figure 3-7. Example Diurnal Profile for MANE-VU Region and Coal Fuel Type

SMOKE uses a cross-reference file to select a monthly, daily, and diurnal profile for each source. For the
2020 platform, the temporal profiles were assigned in the cross-reference at the unit level to EGU sources
without hourly CEMS data. An inventory of all EGU sources without CEMS data was used to identify the
region, fuel type, and type (peaking/non-peaking) of each source. The region used to select the temporal
profile is assigned based on the state from the unit FIPS. The fuel was assigned by SCC to one of the four
fuel types: coal, gas, oil, and other. A fuel type unit assignment is made by summing the VOC, NOx,
PM2.5, and SO2 for all SCCs in the unit. The SCC that contributed the highest total emissions to the unit
for the selected pollutants was used to assign the unit fuel type. Peaking units were identified as any unit
with an oil or gas fuel type with a NAICS of 22111 or 221112. Some units may be assigned to a fuel type
within a region that does not have an available input unit with a matching fuel type in that region. These
units without an available profile for their group were assigned to use the regional composite profile.
MWC and cogen units were identified using the NEEDS primary fuel type and cogeneration flag,
respectively, from the NEEDS v6 database. Assignments for each unit needing a profile were made using
the regions shown in Figure 3-5.
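
The fuel-type assignment and the fallback to the regional composite profile might be sketched as follows. This is a hypothetical illustration, not the platform's actual cross-reference logic: the function names, the dictionary layouts, and the "composite" label are assumptions, and the SCC labels in the usage example are placeholders rather than real codes.

```python
def assign_unit_fuel(scc_emissions, scc_fuel):
    """Pick the fuel type of the SCC with the largest summed emissions.

    scc_emissions: {scc: {pollutant: tons}} for one unit, limited to
                   VOC, NOx, PM2.5, and SO2
    scc_fuel:      {scc: "coal" | "gas" | "oil" | "other"}
    """
    totals = {scc: sum(by_poll.values()) for scc, by_poll in scc_emissions.items()}
    dominant_scc = max(totals, key=totals.get)
    return scc_fuel.get(dominant_scc, "other")


def select_profile(region, fuel, peaking, available):
    """Return the group profile key, or the regional composite if none exists.

    available: set of (region, fuel, peaking) keys that have a CEMS-based profile
    """
    key = (region, fuel, peaking)
    if key in available:
        return key
    # Composite profiles are built per region and type across all fuels.
    return (region, "composite", peaking)


# Placeholder SCC labels, for illustration only.
unit = {
    "SCC_A": {"VOC": 1.0, "NOX": 20.0, "PM2_5": 2.0, "SO2": 5.0},
    "SCC_B": {"VOC": 0.5, "NOX": 10.0, "PM2_5": 1.0, "SO2": 40.0},
}
fuel = assign_unit_fuel(unit, {"SCC_A": "gas", "SCC_B": "coal"})
```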

3.3.5.3 Meteorological-based Temporal Profiles

There are many factors that impact the timing of when emissions occur, and for some sectors this includes
meteorology. The benefits of utilizing meteorology as a method for temporal allocation are: (1) a
meteorological dataset consistent with that used by the AQ model is available (e.g., outputs from WRF);
(2) the meteorological model data are highly resolved in terms of spatial resolution; and (3) the
meteorological variables vary at hourly resolution and can, therefore, be translated into hour-specific
temporal allocation.

The SMOKE program Gentpro provides a method for developing meteorology-based temporal allocation.
Currently, the program can utilize three types of temporal algorithms: annual-to-day temporal allocation
for residential wood combustion (RWC); month-to-hour temporal allocation for agricultural livestock
NH3; and a generic meteorology-based algorithm for other situations. Meteorological-based temporal
allocation was used for portions of the rwc sector and for all agricultural sources. For 2020, some new
temporal profiles were introduced for livestock that differ by animal type and county.

Gentpro reads in gridded meteorological data (output from MCIP) along with spatial surrogates and uses
the specified algorithm to produce a new temporal profile that can be input into SMOKE. The
meteorological variables and the resolution of the generated temporal profile (hourly, daily, etc.) depend
on the selected algorithm and the run parameters. For more details on the development of these
algorithms and running Gentpro, see the Gentpro documentation and the SMOKE documentation at
http://www.cmascenter.org/smoke/documentation/3.1/GenTPRO_TechnicalSummary_Aug2012_Final.pdf
and https://www.cmascenter.org/smoke/documentation/4.5/html/ch05s03s05.html, respectively.

For the RWC sector, two different algorithms for calculating temporal allocation are used. For most SCCs
in the sector, in which wood burning is more prominent on colder days, Gentpro was used to compute
annual to day-of-year temporal profiles based on the daily minimum temperature. These profiles distribute
annual RWC emissions to the coldest days of the year. On days where the minimum temperature does not
drop below a user-defined threshold, RWC emissions for most sources in the sector are zero. Conversely,
the program temporally allocates the largest percentage of emissions to the coldest days. Similar to other
temporal allocation profiles, the total annual emissions do not change, only the distribution of the
emissions within the year is affected. The temperature threshold for RWC emissions was 50 °F for most
of the country, and 60 °F for the following states: Alabama, Arizona, California, Florida, Georgia,
Louisiana, Mississippi, South Carolina, and Texas. The algorithm is as follows:

If Td >= Tt: no emissions that day
If Td < Tt: daily factor = 0.79*(Tt - Td)

where (Td = minimum daily temperature; Tt = threshold temperature, which is 60 degrees F in southern
states and 50 degrees F elsewhere).

Once computed, the factors were normalized to sum to 1 to ensure that the total annual emissions are
unchanged (or minimally changed) during the temporal allocation process.
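
The algorithm and normalization step above can be sketched in a few lines. The function name is hypothetical, and returning all zeros when no day falls below the threshold is an assumption for the degenerate case, which the text does not address.

```python
def rwc_daily_factors(min_temps_f, threshold_f=50.0):
    """Normalized annual-to-day factors from daily minimum temperatures (deg F)."""
    raw = [
        0.79 * (threshold_f - td) if td < threshold_f else 0.0
        for td in min_temps_f
    ]
    total = sum(raw)
    if total == 0.0:
        # Assumption: no day is cold enough, so no emissions are allocated.
        return raw
    # Normalize so the factors sum to 1 and annual emissions are preserved.
    return [value / total for value in raw]


# Three hypothetical days with minimum temperatures of 40, 55, and 30 deg F.
default_factors = rwc_daily_factors([40.0, 55.0, 30.0])
southern_factors = rwc_daily_factors([40.0, 55.0, 30.0], threshold_f=60.0)
```

With the 50 °F threshold the 55 °F day receives nothing, while the 60 °F threshold assigns it a small share and reduces the shares of the colder days. Note that the 0.79 multiplier cancels out in the normalization.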

Figure 3-8 illustrates the impact of changing the temperature threshold for a warm climate county. The
plot shows the temporal fraction by day for Duval County, Florida, for the first four months of 2007. The
default 50 °F threshold creates large spikes on a few days, while the 60 °F threshold dampens these spikes
and distributes a small amount of emissions to the days that have a minimum temperature between 50 and
60 °F.

RWC temporal profile, Duval County, FL, Jan - Apr

Figure 3-8. Example of RWC temporalization using a 50 °F versus 60°F threshold

For the 2020 emissions modeling platform, a separate algorithm is used to determine temporal allocation
of recreational wood burning, e.g. fire pits (SCC 2104008700) and is applied by Gentpro. Recreational
wood burning depends on both minimum and maximum daily temperatures by county, and also uses a
day-of-week temporal profile (61500) in which emissions are much higher on weekends than on
weekdays. According to the recreational wood burning algorithm, only days in which the temperature
falls within a range of 50°F and 80°F at some point during the day receive emissions. On days when the
maximum temperature is less than 50°F or the minimum temperature is above 80°F, the daily temporal
factor is zero. For all other days, the day-of-week profile 61500 is applied, which has 33% of the
emissions on each weekend day and lower emissions on weekdays. An example is shown in Figure 3-9.
As a result of applying this algorithm, northern states have more recreational wood burning in summer
months while southern states show a flatter pattern with emissions distributed more evenly throughout the
months.
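
The screening rule for recreational wood burning can be sketched as follows. This is illustrative only: the function name is hypothetical, the returned values are unnormalized day weights, and the even split of the remaining 34% across the five weekdays is an assumption, since the text gives only the weekend share of profile 61500.

```python
WEEKEND_SHARE = 0.33
# Assumption: the remaining share is split evenly across the five weekdays.
WEEKDAY_SHARE = (1.0 - 2 * WEEKEND_SHARE) / 5


def recreational_rwc_weight(tmin_f, tmax_f, weekday):
    """Unnormalized daily weight for recreational wood burning.

    tmin_f, tmax_f: daily minimum and maximum temperature (deg F)
    weekday:        0 = Monday ... 6 = Sunday
    """
    # No emissions if the day never enters the 50 to 80 deg F range.
    if tmax_f < 50.0 or tmin_f > 80.0:
        return 0.0
    return WEEKEND_SHARE if weekday >= 5 else WEEKDAY_SHARE
```

As with the other RWC profiles, the weights would still need to be normalized over the year so that the annual total is unchanged.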

Figure 3-9. Example of RWC temporalization using a 50 °F versus 60 °F threshold

The diurnal profile used for most RWC sources places more of the RWC emissions in the morning
and the evening when people are typically using these sources. This profile is based on temporal profiles
from a 2004 MANE-VU survey (see
http://www.marama.org/publications_folder/ResWoodCombustion/Final_report.pdf). This profile was
created by averaging three indoor and three RWC outdoor temporal profiles from counties in Delaware
and aggregating them into a single RWC diurnal profile. This new profile was compared to a
concentration-based analysis of aethalometer measurements in Rochester, NY (Wang et al., 2011) for
various seasons and days of the week, and the new RWC profile was found to generally track the
concentration-based temporal patterns.

The temporal profiles for hydronic heaters (i.e., SCCs 2104008610 [outdoor], 2104008620 [indoor], and
2104008630 [pellet-fired]) are not based on temperature data, because the meteorologically based
temporal allocation used for the rest of the rwc sector did not agree with observations for how these
appliances are used.

For hydronic heaters, the annual-to-month, day-of-week and diurnal profiles were modified based on
information in the New York State Energy Research and Development Authority's (NYSERDA)
"Environmental, Energy Market, and Health Characterization of Wood-Fired Hydronic Heater
Technologies, Final Report" (NYSERDA, 2012), as well as a Northeast States for Coordinated Air Use
Management (NESCAUM) report "Assessment of Outdoor Wood-fired Boilers" (NESCAUM, 2006). A
Minnesota 2008 Residential Fuelwood Assessment Survey of individual household responses (MDNR,
2008) provided additional annual-to-month, day-of-week, and diurnal activity information for outdoor
hydronic heaters (OHH) as well as recreational RWC usage.

The diurnal profile for OHH, shown in Figure 3-10, is based on a conventional single-stage heat load unit
burning red oak in Syracuse, New York. The NESCAUM report describes how for individual units, OHH
are highly variable day-to-day but that in the aggregate, these emissions have no day-of-week variation.

In contrast, the day-of-week profile for recreational RWC follows a typical "recreational" profile with
emissions peaked on weekends. Annual-to-month temporalization for OHH as well as recreational RWC
was computed from the MN DNR survey (MDNR, 2008) and is illustrated in Figure 3-11. OHH
emissions still exhibit strong seasonal variability, but do not drop to zero because many units operate
year-round for water and pool heating. In contrast to all other RWC appliances, recreational RWC
appliances are used far more frequently during the warm season.

There are two other types of hydronic heaters: 2104008620 (indoor hydronic heaters)
and 2104008630 (pellet-fired hydronic heaters). Both of these SCCs use the same monthly, weekly, and
diurnal temporal profiles as OHHs, shown in Figure 3-10 and Figure 3-11.

[Figure: y-axis: Heat Load (BTU/hr), 10,000 to 50,000]

Figure 3-10. Diurnal profile for OHH, based on heat load (BTU/hr)

Figure 3-11. Annual-to-month temporal profiles for Outdoor Hydronic Heaters

For the ag sector, agricultural GenTPRO temporal allocation was applied to livestock emissions and to all
pollutants within the sector, not just NH3. The GenTPRO algorithm is based on an equation derived by
Jesse Bash of EPA ORD based on the Zhu, Henze, et al. (2014) empirical equation. This equation is based
on observations from the TES satellite instrument with the GEOS-Chem model and its adjoint to estimate
diurnal NH3 emission variations from livestock as a function of ambient temperature, aerodynamic
resistance, and wind speed. The equations are:

$$E_{i,h} = \left[\frac{161500}{T_{i,h}} \times e^{(-1380/T_{i,h})}\right] \times AR_{i,h}$$

$$PE_{i,h} = \frac{E_{i,h}}{\mathrm{Sum}(E_{i,h})}$$

where

$PE_{i,h}$ = Percentage of emissions in county i in hour h
$E_{i,h}$ = Emission rate in county i in hour h
$T_{i,h}$ = Ambient temperature (Kelvin) in county i in hour h
$V_{i,h}$ = Wind speed (meter/sec) in county i (minimum wind speed is 0.1 meter/sec)
$AR_{i,h}$ = Aerodynamic resistance in county i
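
For a single county, the two equations above can be sketched as below. This is a minimal illustration, not Gentpro's implementation: the function name is hypothetical, the constants 161500 and 1380 are as read from the equation above and should be verified against the Gentpro documentation before use, and the aerodynamic resistance is taken as a given input rather than derived from wind speed.

```python
import math


def livestock_nh3_hourly_shares(temps_k, aero_resistance):
    """Fraction of a county's emissions assigned to each hour.

    temps_k:         hourly ambient temperatures (Kelvin)
    aero_resistance: hourly aerodynamic resistance values
    """
    # Emission rate for each hour from temperature and aerodynamic resistance.
    rates = [
        (161500.0 / t) * math.exp(-1380.0 / t) * ar
        for t, ar in zip(temps_k, aero_resistance)
    ]
    # Normalize so the shares over the period sum to 1.
    total = sum(rates)
    return [rate / total for rate in rates]
```

With equal aerodynamic resistance, a warmer hour receives a larger share than a cooler one at ambient temperatures, which is what gives the profiles their diurnal and seasonal shape.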

Some example plots of the profiles by animal type in different parts of the country are shown in Figure
3-12.

[Figure panels: monthly profile fractions (Jan through Dec, y-axis 0 to 0.25) for Beef, Broiler, Dairy, Layer, and Swine in Tulare County, CA; Duplin County, NC; Sioux County, IA; and Lancaster County, PA]

Figure 3-12. Examples of livestock temporal profiles in several parts of the country

GenTPRO was run using the "BASH_NH3" profile method to create month-to-hour temporal profiles for
these sources. Because these profiles distribute to the hour based on monthly emissions, the monthly
emissions were obtained from a monthly inventory, or from an annual inventory that has been
temporalized to the month. Figure 3-13 compares the daily emissions for Minnesota from the "old"
approach (uniform monthly profile) with the "new" approach (GenTPRO generated month-to-hour
profiles). Although the GenTPRO profiles show daily and hourly variability, the monthly total emissions
are the same between the two approaches.

[Figure: MN ag NH3 livestock temporal profiles; x-axis: date (1/1/2008 through 12/1/2008); series: old, new]

Figure 3-13. Example of animal NH3 emissions temporalization approaches, summed to daily emissions

For the afdust sector, meteorology is not used in the development of the temporal profiles, but it is used to
reduce the total emissions based on meteorological conditions. These adjustments are applied through
sector-specific scripts, beginning with the application of land use-based gridded transport fractions and
then subsequent zero-outs for hours during which precipitation occurs or there is snow cover on the
ground. The land use data used to reduce the NEI emissions explain the amount of emissions that are
subject to transport. This methodology is discussed in Pouliot, et al., 2010, and in "Fugitive Dust
Modeling for the 2008 Emissions Modeling Platform" (Adelman, 2012). The precipitation adjustment is
applied to remove all emissions for days where measurable rain occurs. Therefore, the afdust emissions
vary day-to-day based on the precipitation and/or snow cover for that grid cell and day. Both the
transport fraction and meteorological adjustments are based on the gridded resolution of the platform;
therefore, somewhat different emissions will result from different grid resolutions. Application of the
transport fraction and meteorological adjustments prevents the overestimation of fugitive dust impacts in
the grid modeling as compared to ambient samples.

3.3.5.4 Temporal Profiles for Onroad Mobile Sources

For the onroad sector, the temporal distribution of emissions is a combination of traditional temporal
profiles and the influence of meteorology. For the 2020 NEI, EPA purchased county-level telematics data
from StreetLight for characterization of vehicle speed profiles and VMT temporal distributions for 2020.
Temporal profiles for speeds by road type were obtained by month, day of week, and hour. Vehicle types
included personal, commercial medium-duty, and commercial heavy-duty. This section discusses both
the meteorological influence and the development of the temporal profiles for this platform.

The "inventories" for onroad consist of activity data for the onroad sector, not emissions. VMT is the
activity data used for on-network rate-per-distance (RPD) processes. For the off-network emissions from
the rate-per-profile (RPP) and rate-per-vehicle (RPV) processes, the VPOP activity data are annual and do
not need temporal allocation. For rate-per-hour (RPH) processes that result from hoteling of combination
trucks, the HOTELING inventory is annual and was temporalized to month, day of the week, and hour of
the day through temporal profiles. Day-of-week and hour-of-day temporal profiles are also used to
temporalize the starts activity used for rate-per-start (RPS) processes, and the off-network idling (ONI)
hours activity used for rate-per-hour-ONI (RPHO) processes. The inventories for starts and ONI activity
contain monthly activity so that monthly temporal profiles are not needed.

For on-roadway RPD processes, the VMT activity data are annual for some sources and monthly for other
sources, depending on the source of the data. Sources without monthly VMT were temporalized from
annual to month through temporal profiles. VMT was also temporalized from month to day of the week,
and then to hourly through temporal profiles. The RPD processes also use hourly speed distributions
(SPDIST). For onroad, the temporal profiles and SPDIST will impact not only the distribution of
emissions through time but also the total emissions. SMOKE-MOVES calculates emissions for RPD
processes based on the VMT, speed, and meteorology. Thus, if the VMT or speed data were shifted to
different hours, it would align with different temperatures and hence different emission factors. In other
words, two SMOKE-MOVES runs with identical annual VMT, meteorology, and MOVES emission
factors, will have different total emissions if the temporal allocation of VMT changes. Figure 3-14
illustrates the temporal allocation of the onroad activity data (i.e., VMT) and the pattern of the emissions
that result after running SMOKE-MOVES. In this figure, it can be seen that the meteorologically varying
emission factors add variation on top of the temporal allocation of the activity data.
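
A toy calculation shows why the temporal allocation of VMT can change total emissions even when total VMT is fixed. Everything in this sketch is hypothetical: the emission factor function, the two representative hours, and the numbers are invented for illustration and are not MOVES values.

```python
def total_emissions(total_vmt, hourly_fractions, hourly_temps_f, emission_factor):
    """Sum VMT x emission factor over hours, with the factor chosen by temperature."""
    return sum(
        total_vmt * frac * emission_factor(temp)
        for frac, temp in zip(hourly_fractions, hourly_temps_f)
    )


def toy_emission_factor(temp_f):
    """Hypothetical per-mile factor that rises as temperature departs from 70 F."""
    return 1.0 + 0.01 * abs(temp_f - 70.0)


temps = [40.0, 70.0]   # two representative hours: one cold, one mild
even = [0.5, 0.5]      # half of the VMT in each hour
shifted = [0.2, 0.8]   # same total VMT, more of it in the mild hour

even_total = total_emissions(100.0, even, temps, toy_emission_factor)
shifted_total = total_emissions(100.0, shifted, temps, toy_emission_factor)
```

The two totals differ although the VMT, meteorology, and emission factor function are identical, because shifting activity between hours pairs it with different temperatures.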

[Figure: Wake County, NC 2020 VMT and Onroad NOx emissions; x-axis: date (1/1/2020 through 12/1/2020); series: VMT, NOX (tons)]

Figure 3-14. Example temporal variability of VMT compared to onroad NOx emissions

Meteorology is not used in the development of the temporal profiles, but rather it impacts the calculation
of the hourly emissions through the program Movesmrg. The result is that the emissions vary at the
hourly level by grid cell. More specifically, the on-network (RPD) and the off-network parked and
stationary vehicle (RPV, RPH, RPHO, RPS, and RPP) processes use the gridded meteorology (MCIP)
either directly or indirectly. For RPD, RPV, RPH, RPHO, and RPS, Movesmrg determines the
temperature for each hour and grid cell and uses that information to select the appropriate emission factor
for the specified SCC/pollutant/mode combination. For RPP, instead of reading gridded hourly
meteorology, Movesmrg reads gridded daily minimum and maximum temperatures. The total of the
emissions from the combination of these six processes (RPD, RPV, RPH, RPHO, RPS, and RPP)
comprises the onroad sector emissions. In summary, the temporal patterns of emissions in the onroad
sector are influenced by meteorology.

Day-of-week and hour-of-day temporal profiles for VMT were developed for use in the 2020 NEI using
data acquired from StreetLight. Data were provided for three vehicle categories: passenger vehicles
(11/21/31), commercial trucks (32/52), and combination trucks (53/61/62). StreetLight data did not cover
buses, refuse trucks, or motor homes, so those vehicle types were mapped to other vehicle types as
follows: 1) other/transit buses were mapped to commercial trucks; 2) motor homes were mapped to
passenger vehicles for day-of-week and commercial trucks for hour-of-day; 3) school buses and refuse
trucks were mapped to commercial trucks. In addition to temporal profiles, StreetLight data were also
used to develop the hourly speed distributions (SPDIST) used by SMOKE-MOVES.

The StreetLight dataset includes temporal profiles for individual counties. Temporal profiles also vary by
each of the MOVES road types, and there are distinct hour-of-day profiles for each day of the week. Plots
of hour-of-day profiles for all vehicles and road types in Fulton County, GA, are shown in Figure 3-15.
Separate plots are shown for Monday, Saturday, and Sunday in January 2020, and each line corresponds
to a particular MOVES road type (i.e., road type 2 = rural restricted, 3 = rural unrestricted, 4 = urban
restricted, and 5 = urban unrestricted) and vehicle type (as described in the previous paragraph). In the
pre-pandemic profiles shown in this figure, there are bimodal peaks for light-duty vehicles on Monday,
but there is only a single peak on the weekend days.

State- and local-provided data were accepted for use in the 2020 NEI if they were deemed to
be at least as credible as the StreetLight data (i.e., reflected the effects of COVID). The 2020 NEI TSD
includes more details on which data were used for which counties. In areas of the contiguous United
States where state or local data were not provided, were deemed unacceptable, or lacked sufficient detail
for the 2020 NEI, the StreetLight temporal profiles were used, including in California.
For this platform, the data selection hierarchy favored local input data over EPA-developed information,
with the exception of the three MOVES tables hourVMTFraction, dayVMTFraction, and
avgSpeedDistribution, where county-level, telematics-based EPA defaults were adopted for the NEI
universally due to unique activity patterns by month during 2020.

For hoteling, day-of-week profiles are the same as non-hoteling for combination trucks, while hour-of-day
non-hoteling profiles for combination trucks were inverted to create new hoteling profiles that peak
overnight instead of during the day.

Temporal profiles for RPHO are based on the same temporal profiles as the on-network processes in
RPD, but since the on-network profiles are road-type-specific and ONI is not road-type-specific, the
RPHO profiles were assigned to use rural unrestricted profiles for counties considered "rural" and urban
unrestricted profiles for counties considered "urban". RPS uses the same day-of-week profiles as on-
network processes in RPD, but uses a separate set of diurnal temporal profiles specifically for starts
activity. For starts, there are two hour-of-day temporal profiles for each source type, one for weekdays
and one for weekends. The starts diurnal temporal profiles are applied nationally and are based on the
default starts-hour-fraction tables from MOVES.

[Figure panels: 2020 StreetLight hourly profiles for FIPS 13121 (Fulton Co, Georgia) for Monday, Saturday, and Sunday in January. Legend labels follow the pattern 13121_<day>_m1_<vehicle type>_<road type>, with day MO, SA, or SU, vehicle types 11, 21, 31, 32, 52, 53, 61, and 62, and road types 2 through 5.]

Figure 3-15. Sample onroad diurnal profiles for Fulton County, GA

3.3.5.4 Airport Temporal Profiles

Airport temporal profiles were updated to 2020-specific temporal profiles for all airports other than
Alaska seaplanes (which are not in the CMAQ modeling domain). Hourly airport operations data were
obtained from the Aviation System Performance Metrics (ASPM) Airport Analysis website
(https://aspm.faa.gov/apm/sys/AnalysisAP.asp). A report of 2020 hourly Departures and Arrivals for
Metric Computation by airport was generated. An overview of the ASPM metrics is at
http://aspmhelp.faa.gov/index.php/Aviation_Performance_Metrics_%28APM%29. Figure 3-16 shows
examples of diurnal airport profiles for Phoenix airport (PHX) and the default diurnal profile for Texas.

2020 FAA State Diurnal Profile: TX default

Figure 3-16. 2020 Airport Diurnal Profiles for PHX and state of Texas

Month-to-day and annual-to-month temporal profiles were developed based on a separate query of the
2020 Aviation System Performance Metrics (ASPM) Airport Analysis
(https://aspm.faa.gov/apm/sys/AnalysisAP.asp). A report of all airport operations (takeoffs and landings)
by day for 2020 was generated. Day-of-month profiles were derived directly from the daily airport
operations report. An example is shown for Wisconsin in Figure 3-17, while Figure 3-18 shows the
pre-pandemic day-of-week profile. The pre-pandemic annual-to-month profile is shown in Figure 3-19. The
2020 airport data were summed to create the example annual-to-month temporal profiles shown in Figure
3-20.
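
As a rough illustration of how daily operation counts can yield both kinds of profile, the sketch below sums daily counts to monthly totals, divides each month by the annual total (annual-to-month), and divides each day by its month's total (month-to-day). The function name, the dictionary layout, and the sample counts are hypothetical; the report does not describe the actual processing code.

```python
from collections import defaultdict


def monthly_and_daily_fractions(daily_ops):
    """Derive temporal fractions from daily operation counts.

    daily_ops maps (month, day) -> number of operations for one airport-year.
    Returns (annual_to_month, month_to_day) dictionaries of fractions.
    Assumes every month present has a nonzero total.
    """
    month_totals = defaultdict(float)
    for (month, _day), ops in daily_ops.items():
        month_totals[month] += ops
    annual_total = sum(month_totals.values())

    annual_to_month = {m: t / annual_total for m, t in month_totals.items()}
    month_to_day = {
        (m, d): ops / month_totals[m] for (m, d), ops in daily_ops.items()
    }
    return annual_to_month, month_to_day


# Hypothetical counts for three days only, to keep the example short.
sample_ops = {(1, 1): 10, (1, 2): 30, (2, 1): 60}
annual_to_month, month_to_day = monthly_and_daily_fractions(sample_ops)
```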

For 2020, all airport SCCs (i.e., 2275*, 2265008005, 2267008005, 2268008005 and 2270008005) were
assigned to individual commercial airports where a match could be made between the inventory facility
and the FAA identifier in the ASPM-derived data. State average profiles were calculated as the average
of the temporal fractions for all airports within a state. The state average profiles were assigned by state
to all airports in the inventory that did not have an airport-specific match in the ASPM data. Package
processing hubs at the Memphis (MEM), Indianapolis (IND), Louisville (SDF), and Chicago Rockford
(RFD) airports produced peaks in the average state profiles at times not typical for activity in smaller
commercial airports, so these package hubs were removed from the state averages. Airports that required
state defaults in states lacking ASPM data use national average profiles calculated from the average of the
state temporal profiles.
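
The assignment hierarchy described above (airport-specific profile, then state average without the package hubs, then national average of the state profiles) can be sketched as follows. The function names, data layout, and two-element sample profiles are hypothetical and chosen only to keep the example short; the hub list is taken from the text.

```python
# Package hubs excluded from state averages, as listed in the text.
PACKAGE_HUBS = {"MEM", "IND", "SDF", "RFD"}


def mean_profile(profiles):
    """Element-wise average of equally long lists of temporal fractions."""
    count = len(profiles)
    return [sum(values) / count for values in zip(*profiles)]


def build_default_profiles(airport_profiles, airport_state):
    """Return (state_avg, national_avg), leaving package hubs out of the averages."""
    by_state = {}
    for code, profile in airport_profiles.items():
        if code in PACKAGE_HUBS:
            continue
        by_state.setdefault(airport_state[code], []).append(profile)
    state_avg = {state: mean_profile(p) for state, p in by_state.items()}
    national_avg = mean_profile(list(state_avg.values()))
    return state_avg, national_avg


def select_profile(code, state, airport_profiles, state_avg, national_avg):
    """Airport-specific match first, then state average, then national average."""
    if code in airport_profiles:
        return airport_profiles[code]
    return state_avg.get(state, national_avg)


# Hypothetical two-period profiles; real diurnal profiles would have 24 values.
airport_profiles = {
    "MEM": [0.9, 0.1], "BNA": [0.4, 0.6], "TYS": [0.6, 0.4], "PHX": [0.2, 0.8],
}
airport_state = {"MEM": "TN", "BNA": "TN", "TYS": "TN", "PHX": "AZ"}
state_avg, national_avg = build_default_profiles(airport_profiles, airport_state)
```

In this sketch a hub such as MEM still uses its own airport-specific profile; it is only left out of the Tennessee average.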

Alaska seaplanes, which are outside the CONUS domain, use the monthly profile in Figure 3-21. These
were assigned based on the facility ID.

March 2020 FAA State Daily Profile: WI default

Figure 3-17. 2020 Wisconsin month-to-day profile for airport emissions

[Figure panel: Weekly Airport Profile; y-axis: 0 to 0.18]

Figure 3-18. Prepandemic weekly profile for airport emissions

[Figure panel: Pre-2020 Monthly Airport Profile; x-axis: Jan through Dec]

Figure 3-19. Pre-pandemic monthly profile for airport emissions


-------
2020 FAA Airport Monthly Profile: ATL

Figure 3-20. 2020 Monthly airport profiles for ATL and state of Maryland

[Figure panel: monthly profile; x-axis: Jan through Dec; y-axis: 0.00 to 0.14]

Figure 3-21. Alaska Seaplane Profile

3.3.5.5 Nonroad Temporal Profiles

For nonroad mobile sources, temporal allocation is performed differently for different SCCs. Beginning
with the final 2011 platform, temporal allocation of nonroad mobile sources was improved so that the
temporal profiles more realistically reflect real-world practices. The specific updates were
made for agricultural sources (e.g., tractors), construction, and commercial and residential lawn and garden
sources.

Figure 3-22 shows two previously existing temporal profiles (9 and 18) and a newer temporal profile (19),
which has lower emissions on weekends. In this platform, construction and commercial lawn and garden
sources use the new profile 19. Residential lawn and garden
sources continue to use profile 9 and agricultural sources continue to use profile 19.

[Figure panel: Day of Week Profiles; x-axis: Monday through Sunday; y-axis: 0 to 0.24]

Figure 3-22. Example Nonroad Day-of-week Temporal Profiles

Figure 3-23 shows the previously existing temporal profiles 26 and 27 along with newer temporal profiles
(25a and 26a) which have lower emissions overnight. In this platform, construction sources use profile
26a. Commercial lawn and garden and agriculture sources use the profiles 26a and 25a, respectively.
Residential lawn and garden sources use profile 27.

[Figure panel: Hour of Day Profiles; legend: 26a-New, 27, 25a-New, 26]

Figure 3-23. Example Nonroad Diurnal Temporal Profiles

For the nonroad sector, while the NEI stores only the annual totals, the modeling platform uses monthly
inventories output from MOVES. For California, CARB's annual inventory was temporalized to
monthly using monthly temporal profiles applied in SMOKE by SCC.
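
Temporalizing an annual total to months amounts to multiplying by a set of twelve monthly fractions. The sketch below shows that arithmetic in isolation; it is not SMOKE code, and the function name and the sample profile values are hypothetical.

```python
def annual_to_monthly(annual_tons, monthly_fractions):
    """Split an annual emissions total across 12 months using a monthly profile.

    The profile is renormalized so that the monthly values always sum to the
    annual total, even if the supplied fractions do not sum exactly to 1.
    """
    if len(monthly_fractions) != 12:
        raise ValueError("expected 12 monthly fractions")
    total = sum(monthly_fractions)
    if total <= 0:
        raise ValueError("monthly fractions must sum to a positive value")
    return [annual_tons * f / total for f in monthly_fractions]


# Hypothetical summer-weighted profile for an illustrative SCC.
summer_weighted = [0.04, 0.04, 0.06, 0.08, 0.11, 0.13,
                   0.14, 0.13, 0.10, 0.08, 0.05, 0.04]
monthly_tons = annual_to_monthly(1200.0, summer_weighted)
```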

3.3.6	Vertical Allocation of Emissions

Table 3-6 specifies the sectors for which plume rise is calculated. If there is no plume rise for a sector, the
emissions are placed into layer 1 of the air quality model. Vertical plume rise was performed in-line within
CMAQ for all of the SMOKE point-source sectors (i.e., ptegu, ptnonipm, pt_oilgas, ptfire-rx, ptfire-wild,
ptagfire, ptfire_othna, othpt, and cmv_c3). The in-line plume rise computed within CMAQ is nearly
identical to the plume rise that would be calculated within SMOKE using the Laypoint program. The
selection of point sources for plume rise is pre-determined in SMOKE using the Elevpoint program. The
calculation is done in conjunction with the CMAQ model time steps with interpolated meteorological data
and is therefore more temporally resolved than when it is done in SMOKE. Also, the calculation of the
location of the point sources is slightly different than the one used in SMOKE and this can result in
slightly different placement of point sources near grid cell boundaries.

For point sources, the stack parameters are used as inputs to the Briggs algorithm, but point fires
do not have traditional stack parameters. However, the ptfire-rx, ptfire-wild, ptagfire, and ptfire_othna
inventories do contain data on the acres burned (acres per day) and fuel consumption (tons fuel per acre)
for each day. CMAQ uses these additional parameters to estimate the plume rise of emissions into layers
above the surface model layer. Specifically, these data are used to calculate heat flux, which is then used to
estimate plume rise. In addition to the acres burned and fuel consumption, heat content of the fuel is
needed to compute heat flux. The heat content was assumed to be 8000 Btu/lb of fuel for all fires because
specific data on the fuels were unavailable in the inventory. The plume rise algorithm applied to the fires is
a modification of the Briggs algorithm with a stack height of zero.
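
The three quantities named above combine by simple multiplication to give the heat released by a fire on a given day. The sketch below shows only that arithmetic, assuming "tons" means short tons (2000 lb); converting the daily total to the hourly heat flux and units used inside CMAQ is not shown, and the function name is hypothetical.

```python
LB_PER_SHORT_TON = 2000.0          # assumption: inventory tons are short tons
HEAT_CONTENT_BTU_PER_LB = 8000.0   # value stated in the text for all fires


def daily_fire_heat_btu(acres_burned_per_day, fuel_tons_per_acre,
                        heat_content_btu_per_lb=HEAT_CONTENT_BTU_PER_LB):
    """Heat released in one day (Btu) = acres x tons/acre x lb/ton x Btu/lb."""
    if acres_burned_per_day < 0 or fuel_tons_per_acre < 0:
        raise ValueError("inputs must be non-negative")
    fuel_lb = acres_burned_per_day * fuel_tons_per_acre * LB_PER_SHORT_TON
    return fuel_lb * heat_content_btu_per_lb


# Hypothetical fire: 100 acres burned in a day at 10 tons of fuel per acre.
heat_btu = daily_fire_heat_btu(100.0, 10.0)
```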

CMAQ uses the Briggs algorithm to determine the plume top and bottom, and then computes the plumes'
distributions into the vertical layers that the plumes intersect. The pressure difference across each layer
divided by the pressure difference across the entire plume is used as a weighting factor to assign the
emissions to layers. This approach gives plume fractions by layer and source. Note that the implementation
of fire plume rise in CMAQ differs from the implementation of plume rise in SMOKE. This study uses
CMAQ to compute the fire plume rise.
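
The pressure-weighted layer assignment can be illustrated with a short sketch. It assumes that pressures are given at the layer interfaces from the surface upward and that, for layers only partly inside the plume, the overlapping pressure span is used as the weight. Both are assumptions made for illustration; this is not the CMAQ source code, and the function name and sample pressures are hypothetical.

```python
def plume_layer_fractions(interface_pressures, plume_bottom_p, plume_top_p):
    """Fraction of a plume's emissions assigned to each model layer.

    interface_pressures: pressures at layer interfaces, surface first, so the
        values decrease with height (any consistent pressure unit).
    plume_bottom_p, plume_top_p: pressures at the plume bottom and top.
    Each layer's weight is the pressure span it shares with the plume divided
    by the pressure difference across the whole plume.
    """
    if plume_bottom_p <= plume_top_p:
        raise ValueError("plume bottom pressure must exceed plume top pressure")
    plume_dp = plume_bottom_p - plume_top_p
    fractions = []
    for lower_p, upper_p in zip(interface_pressures[:-1], interface_pressures[1:]):
        overlap = min(lower_p, plume_bottom_p) - max(upper_p, plume_top_p)
        fractions.append(max(0.0, overlap) / plume_dp)
    return fractions


# Hypothetical four-layer column (hPa) with a plume spanning 975 to 875 hPa.
fractions = plume_layer_fractions([1000.0, 950.0, 900.0, 850.0, 800.0], 975.0, 875.0)
```

If the plume extends outside the supplied layer stack, the fractions in this sketch sum to less than one, so a caller would need to handle that case.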

3.3.7	Emissions Modeling Spatial Allocation

The methods used to perform spatial allocation are summarized in this section. For the modeling
platform, spatial factors are typically applied by county and SCC. Spatial allocation was performed for
each of the modeling grids shown in Section 3.1. To accomplish this, SMOKE used national 12-km
spatial surrogates and a SMOKE area-to-point data file. For the U.S., the EPA updated surrogates to use
circa 2020 data. For Canada, shapefiles for generating new surrogates were provided by ECCC for use with
their 2015 inventories. The U.S., Mexican, and Canadian 12-km surrogates cover the entire CONUS domain
12US1 shown in Figure 3-3. While highlights of information are provided below, the file
Surrogate_specifications_2020_platform_US_Can_Mex.xlsx documents the complete configuration for
generating the surrogates and can be referenced for more details.
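
Applying spatial factors by county and SCC can be pictured as a lookup followed by a multiplication: each county/SCC total is split across grid cells according to the fractions of the surrogate assigned to that SCC. The sketch below shows this in plain Python. It is not SMOKE code; the data layout, the placeholder SCC labels, and the sample fractions are hypothetical.

```python
def allocate_to_grid(county_emissions, scc_to_surrogate, surrogate_fractions):
    """Distribute county-level emissions to grid cells using spatial surrogates.

    county_emissions: {(county_fips, scc): tons}
    scc_to_surrogate: {scc: surrogate_code}
    surrogate_fractions: {(surrogate_code, county_fips): {(col, row): fraction}},
        where the fractions for one county and surrogate sum to 1.
    Returns {(col, row): tons} summed over all sources.
    """
    gridded = {}
    for (fips, scc), tons in county_emissions.items():
        code = scc_to_surrogate[scc]
        for cell, fraction in surrogate_fractions[(code, fips)].items():
            gridded[cell] = gridded.get(cell, 0.0) + tons * fraction
    return gridded


# Hypothetical inputs: two placeholder SCCs in one county, two grid cells.
county_emissions = {("13121", "SCC_A"): 100.0, ("13121", "SCC_B"): 10.0}
scc_to_surrogate = {"SCC_A": 100, "SCC_B": 240}
surrogate_fractions = {
    (100, "13121"): {(1, 1): 0.25, (1, 2): 0.75},
    (240, "13121"): {(1, 2): 1.0},
}
gridded = allocate_to_grid(county_emissions, scc_to_surrogate, surrogate_fractions)
```

Because each surrogate's fractions sum to one within a county, the gridded total equals the county total, which is the property the gapfilling step is meant to protect.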

3.3.7.1	Surrogates for U.S. Emissions

There are more than 100 spatial surrogates available for spatially allocating U.S. county-level emissions
to the 12-km grid cells used by the air quality model. Note that an area-to-point approach overrides the
use of surrogates for a limited set of sources. Table 3-11 lists the codes and descriptions of the surrogates.
Surrogate names and codes listed in italics are not directly assigned to any sources for this platform, but
they are sometimes used to gapfill other surrogates. When the source data for a surrogate have no values
for a particular county, gap filling is used to provide values for the spatial surrogate in those counties to
ensure that no emissions are dropped when the spatial surrogates are applied to the emission inventories.

The surrogates for the platform are based on a variety of geospatial data sources, including the American
Community Survey (ACS) for census-related data and the National Land Cover Database (NLCD). Onroad
surrogates are based on annual average daily traffic counts (AADT) from the Highway Performance
Monitoring System (HPMS).

Surrogate updates for this platform include:

-	County boundaries used for all surrogates were updated to use the 2020 TIGER boundaries.

-	Oil and gas surrogates were updated to represent 2020.

-	ACS-based surrogates were updated to use the 2020 ACS.

-	Updated surrogates for residential wood combustion were developed based on ACS data.

-	NLCD-based surrogates were updated to use NLCD 2019.

-	Animal-specific livestock waste surrogates were derived from National Pollutant Discharge
Elimination System (NPDES) animal operation water permits and Food and Agriculture
Organization (FAO) gridded livestock count data.

-	New surrogates for fuel stations, asphalt surfaces, and unpaved roads were created using data from
the OpenStreetMap database.

-	Gravel and lead mines were split out to their own surrogates from the more general United States
Geological Survey mining surrogate.

Surrogates for the U.S. were generated using the Surrogate Tools DB, with the Java-based Surrogate Tools
used to perform gapfilling and normalization where needed. The tool and documentation for the original
Surrogate Tool are available at
https://www.cmascenter.org/sa-tools/documentation/4.2/SurrogateToolUserGuide_4_2.pdf and the tool and
documentation for the Surrogate Tools DB are available from https://www.cmascenter.org/surrogate_tools_db/.
The file Surrogate_specifications_2020_platform_US_Can_Mex.xlsx documents the configuration for generating
the surrogates.

Table 3-11. U.S. Surrogates available for the 2019 modeling platform

Code  Surrogate Description                 Code  Surrogate Description
N/A   Area-to-point approach (see 3.6.2)    672   Gas production - oil wells
100   Population                            674   Unconventional Well Completion Counts
110   Housing                               676   Well count - all producing
135   Detached Housing                      677   Well count - all exploratory
136   Single and Dual Unit Housing          678   Completions at Gas Wells
150   Residential Heating - Natural Gas     679   Completions at CBM Wells
170   Residential Heating - Distillate Oil  681   Spud Count - Oil Wells
180   Residential Heating - Coal            683   Produced Water at All Wells
190   Residential Heating - LP Gas          6831  Produced water at CBM wells
205   Extended Idle Locations               6832  Produced water at gas wells
239   Total Road AADT                       6833  Produced water at oil wells
240   Total Road Miles                      685   Completions at Oil Wells
242   All Restricted AADT                   686   Completions - all wells
244   All Unrestricted AADT                 687   Feet Drilled at All Wells
258   Intercity Bus Terminals               689   Gas Produced - Total
259   Transit Bus Terminals                 691   Well Counts - CBM Wells
261   NTAD Total Railroad Density           692   Spud Count - All Wells
271   NTAD Class 1 2 3 Railroad Density     693   Well Count - All Wells
300   NLCD Low Intensity Development        694   Oil Production at Oil Wells
304   NLCD Open + Low                       695   Well Count - Oil Wells
305   NLCD Low + Med                        696   Gas Production at Gas Wells
306   NLCD Med + High                       697   Oil production - gas wells
307   NLCD All Development                  698   Well Count - Gas Wells
308   NLCD Low + Med + High                 699   Gas Production at CBM Wells
309   NLCD Open + Low + Med                 711   Airport Areas
310   NLCD Total Agriculture                801   Port Areas
319   NLCD Crop Land                        850   Golf Courses
320   NLCD Forest Land                      860   Mines
321   NLCD Recreational Land                861   Sand and Gravel Mines
340   NLCD Land                             862   Lead Mines
350   NLCD Water                            863   Crushed Stone Mines
401   FAO 2010 Cattle                       900   OSM Fuel
402   FAO 2010 Pig                          901   OSM Asphalt Surfaces
403   FAO 2010 Chicken                      902   OSM Unpaved Roads
404   FAO 2010 Goat                         4011  FAO 2010 Large Cattle Operations
405   FAO 2010 Horse                        4012  NPDES 2020 Beef Cattle
406   FAO 2010 Sheep                        4013  NPDES 2020 Dairy Cattle
508   Public Schools                        4021  NPDES 2020 Swine
650   Refineries and Tank Farms             4031  NPDES 2020 Chicken
670   Spud Count - CBM Wells                4041  NPDES 2020 Goat
671   Spud Count - Gas Wells                4071  NPDES 2020 Turkey


For the onroad sector, the on-network (RPD) emissions were spatially allocated differently from the
off-network processes (i.e., RPV, RPP, RPHO, RPS, RPH). Surrogates for on-network processes are based on
AADT data, and off-network processes (including the off-network idling included in RPHO) are based on
land use surrogates as shown in Table 3-12. Emissions from the extended (i.e., overnight) idling of trucks
were assigned to surrogate 205, which is based on locations of overnight truck parking spaces. The
underlying data for this surrogate were updated during the development of the 2016 platforms to include
additional data sources and corrections based on comments received, and these updates were carried into
this platform.

Table 3-12. Off-Network Mobile Source Surrogates

Source type  Source Type name              Surrogate ID  Description
11           Motorcycle                    307           NLCD All Development
21           Passenger Car                 307           NLCD All Development
31           Passenger Truck               307           NLCD All Development
32           Light Commercial Truck        308           NLCD Low + Med + High
41           Other Bus                     306           NLCD Med + High
42           Transit Bus                   259           Transit Bus Terminals
43           School Bus                    508           Public Schools
51           Refuse Truck                  306           NLCD Med + High
52           Single Unit Short-haul Truck  306           NLCD Med + High
53           Single Unit Long-haul Truck   306           NLCD Med + High
54           Motor Home                    304           NLCD Open + Low
61           Combination Short-haul Truck  306           NLCD Med + High
62           Combination Long-haul Truck   306           NLCD Med + High

For the oil and gas sources in the np_oilgas sector, the spatial surrogates were updated to those shown in
Table 3-13 using 2020 data consistent with what was used to develop the nonpoint oil and gas emissions.

The exploration and production of oil and gas have increased in terms of quantities and locations over the
last seven years, primarily through the use of new technologies, such as hydraulic fracturing.
Census-tract, 2-km, and 4-km sub-county shapefiles were developed, from which the 2020 oil and gas
surrogates were generated. All spatial surrogates for np_oilgas are developed based on known locations of
oil and gas activity for year 2020.

The primary activity data source used for the development of the oil and gas spatial surrogates was data
from ENVERUS [formerly Drilling Info (DI) Desktop's HPDI] database (ENVERUS, 2021). This
database contains well-level location, production, and exploration statistics at the monthly level. Due to a
proprietary agreement with ENVERUS, individual well locations and ancillary production cannot be
made publicly available, but aggregated statistics are allowed. These data were supplemented with data
from state Oil and Gas Commission (OGC) websites (Alaska, Arizona, Idaho, Illinois, Indiana, Kentucky,
Louisiana, Michigan, Mississippi, Missouri, Nevada, Oregon, Pennsylvania, and Tennessee). In cases
when the desired surrogate parameter was not available (e.g., feet drilled), data for an alternative
surrogate parameter (e.g., number of spudded wells) were downloaded and used. Under that
methodology, both completion date and date of first production from HPDI were used to identify wells
completed during 2020.

The spatial surrogates, numbered 670 through 699 and also 6831, 6832, and 6833, were gapfilled using
fallback surrogates. For each surrogate, the last two fallbacks were surrogate 693 (Well Count - All
Wells) and 304 (NLCD Open + Low). Where appropriate, other surrogates were also parts of the
gapfilling procedure. For example, surrogate 670 (Spud Count - CBM Wells) was first gapfilled with 692
(Spud Count - All Wells), and then 693 and finally 304. All gapfilling was performed with the Surrogate
Tool.
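
The fallback chain described above can be sketched as a loop that tries the primary surrogate and then each fallback in order, using the first one that has nonzero data in the county. This is an illustration of the idea only, not the Surrogate Tool's implementation; the function name, data layout, and sample weights are hypothetical, while the chain 670, 692, 693, 304 is the example given in the text.

```python
def gapfill_surrogate(primary_code, fallback_codes, surrogate_weights, county_fips):
    """Return (code_used, normalized_fractions) for one county.

    surrogate_weights: {surrogate_code: {county_fips: {(col, row): weight}}}
    The primary surrogate is tried first, then each fallback in order. The first
    surrogate with a positive total weight in the county is normalized so that
    its grid-cell fractions sum to 1.
    """
    for code in [primary_code, *fallback_codes]:
        cells = surrogate_weights.get(code, {}).get(county_fips)
        if cells:
            total = sum(cells.values())
            if total > 0:
                return code, {cell: w / total for cell, w in cells.items()}
    raise LookupError(f"no surrogate data for county {county_fips}")


# Hypothetical weights: 670 has no data here, 692 is all zero, 693 has data.
surrogate_weights = {
    692: {"13121": {(1, 1): 0.0, (1, 2): 0.0}},
    693: {"13121": {(1, 1): 2.0, (1, 2): 6.0}},
    304: {"13121": {(1, 1): 1.0}},
}
code_used, fractions = gapfill_surrogate(670, [692, 693, 304], surrogate_weights, "13121")
```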

The U.S. CAP emissions (i.e., NH3, NOx, PM2.5, SO2, and VOC) allocated to the various spatial
surrogates are shown in Table 3-14.

Table 3-13. Spatial Surrogates for Oil and Gas Sources

Surrogate Code  Surrogate Description
670             Spud Count - CBM Wells
671             Spud Count - Gas Wells
672             Gas Production at Oil Wells
673             Oil Production at CBM Wells
674             Unconventional Well Completion Counts
676             Well Count - All Producing
677             Well Count - All Exploratory
678             Completions at Gas Wells
679             Completions at CBM Wells
681             Spud Count - Oil Wells
683             Produced Water at All Wells
685             Completions at Oil Wells
686             Completions at All Wells
687             Feet Drilled at All Wells
689             Gas Produced - Total
691             Well Counts - CBM Wells
692             Spud Count - All Wells
693             Well Count - All Wells
694             Oil Production at Oil Wells
695             Well Count - Oil Wells
696             Gas Production at Gas Wells
697             Oil Production at Gas Wells
698             Well Count - Gas Wells
699             Gas Production at CBM Wells
6831            Produced water at CBM wells
6832            Produced water at gas wells
6833            Produced water at oil wells

Table 3-14. Selected 2019 CAP emissions by sector for U.S. Surrogates (12US1, tons)

Sector  ID    Description                               NH3      NOX    PM2_5      SO2      VOC
afdust  240   Total Road Miles                            0        0  333,425        0        0
afdust  306   NLCD Med + High                             0        0   41,167        0        0
afdust  308   NLCD Low + Med + High                       0        0  122,726        0        0
afdust  310   NLCD Total Agriculture                      0        0  502,702        0        0
afdust  861   Sand and Gravel Mines                       0        0      271        0        0
afdust  863   Crushed Stone Mines                         0        0      291        0        0

PC

0

0

0

0

0

0

0

,558

,539

,170

,786

,096

,157

,538

,640

36

,946

,680

,086

2

,435

,536

,591

,074

,432

,283

,342

,691

440

292

44

,120

367

,351

,354

,993

,098

341

,964

,724

,364

,321

ID

Description

NH3

NOX

PM2 5

S02

902

OSM Unpaved Roads

960,028

4012

NPDES 2020 Beef Cattle

191,878

4013

NPDES 2020 Dairy Cattle

15,033

4021

NPDES 2020 Swine

658

4031

NPDES 2020 Chicken

5,069

4071

NPDES 2020 Turkey

0

1,959

310

NLCD Total Agriculture

1,832,594

0

405

FAO 2010 Horse

31,969

406

FAO 2010 Sheep

19,235

4012

NPDES 2020 Beef Cattle

702,119

4013

NPDES 2020 Dairy Cattle

572,321

4021

NPDES 2020 Swine

838,696

4031

NPDES 2020 Chicken

426,996

4041

NPDES 2020 Goat

19,231

4071

NPDES 2020 Turkey

83,001

100

Population

454

0

0

0

135

Detached Housing

0

16,359

81,108

2,724

150

Residential Heating - Natural Gas

44,524

214,626

2,669

1,436

170

Residential Heating - Distillate Oil

1,499

25,521

3,165

624

180

Residential Heating - Coal

0

1

190

Residential Heating - LP Gas

127

36,460

150

164

239

Total Road AADT

0

244

All Unrestricted AADT

271

NT AD Class 12 3 Railroad Density

0

0

0

300

NLCD Low Intensity Development

2,860

3,417

17,009

400

306

NLCD Med + High

17,840

251,201

383,854

85,559

307

NLCD All Development

76,463

28,172

126,918

10,917

308

NLCD Low + Med + High

961

162,993

18,656

5,676

310

NLCD Total Agriculture

517

311

504

31

319

NLCD Crop Land

95

70

320

NLCD Forest Land

11

31

650

Refineries and Tank Farms

711

Airport Areas

801

Port Areas

900

OSM Fuel

4011

FAO 2010 Large Cattle Operations

0

0

136

Single and Dual Unit Housing

99

14,706

2,913

47

261

NT AD Total Railroad Density

1,664

168

304

NLCD Open + Low

1,695

155

305

NLCD Low + Med

837

1,014

306

NLCD Med + High

366

160,863

9,452

257

307

NLCD All Development

112

29,:

16,088

52

PC

,408

,114

,069

,532

,202

,398

,875

452

35

,544

,222

,821

489

,225

807

,055

,426

,237

,464

74

2

,474

,896

,686

,524

,727

,334

875

,908

,418

,567

,753

,876

,955

,003

,587

,778

,717

,641

,973

476

,811

ID

Description

NH3

NOX

PM2 5

308

NLCD Low + Med + High

585

242,493

20,187

309

NLCD Open + Low + Med

133

21,682

1,301

310

NLCD Total Agriculture

358

257,080

18,310

320

NLCD Forest Land

15

2,439

438

321

NLCD Recreational Land

80

12,898

5,082

350

NLCD Water

203

115,290

4,502

850

Golf Courses

13

2,108

122

860

Mines

2,439

231

670

Spud Count - CBM Wells

0

671

Spud Count - Gas Wells

674

Unconventional Well Completion
Counts

16

23,908

540

678

Completions at Gas Wells

5,343

121

679

Completions at CBM Wells

681

Spud Count - Oil Wells

683

Produced Water at All Wells

41

685

Completions at Oil Wells

217

687

Feet Drilled at All Wells

35,527

733

689

Gas Produced - Total

485

29

691

Well Counts - CBM Wells

19,267

307

692

Spud Count - All Wells

589

34

693

Well Count - All Wells

0

694

Oil Production at Oil Wells

3,060

695

Well Count - Oil Wells

159,345

4,270

696

Gas Production at Gas Wells

42,067

228

697

Oil Production at Gas Wells

261

0

698

Well Count - Gas Wells

281,181

4,185

699

Gas Production at CBM Wells

22

6831

Produced water at CBM wells

6832

Produced water at gas wells

6833

Produced water at oil wells

100

Population

240

Total Road Miles

306

NLCD Med + High

307

NLCD All Development

308

NLCD Low + Med + High

310

NLCD Total Agriculture

901

OSM Asphalt Surfaces

0

205

Extended Idle Locations

290

33,058

750

242

All Restricted AADT

29,464

783,301

20,867

244

All Unrestricted AADT

54,906

1,215,064

45,715

259

Transit Bus Terminals

42

1,539

37

304

NLCD Open + Low

510

13

Sector  ID    Description                               NH3      NOX    PM2_5      SO2      VOC
onroad  306   NLCD Med + High                           914   91,100    2,823       67   26,456
onroad  307   NLCD All Development                    3,519  182,771    7,802      578  559,726
onroad  308   NLCD Low + Med + High                     179   18,151      535       32   29,126
onroad  508   Public Schools                             13    1,589       72        1      440
rail    261   NTAD Total Railroad Density                13   22,177      599       16    1,015
rail    271   NTAD Class 1 2 3 Railroad Density         269  400,799    9,861      336   16,478
rwc     135   Detached Housing                        7,054   13,004  132,683    3,635  124,847
rwc     136   Single and Dual Unit Housing           15,681   31,864  315,389    8,383  330,813

3.3.7.2	Allocation Method for Airport-Related Sources in the U.S.

There are numerous airport-related emission sources in the NEI, such as aircraft, airport ground support
equipment, and jet refueling. The modeling platform includes the aircraft and airport ground support
equipment emissions as point sources. For the modeling platform, the EPA used the SMOKE "area-to-
point" approach for only jet refueling in the nonpt sector. The following SCCs use this approach:
2501080050 and 2501080100 (petroleum storage at airports), and 2810040000 (aircraft/rocket engine
firing and testing). The ARTOPNT approach is described in detail in the 2002 platform documentation:
http://www3.epa.gov/scram001/reports/Emissions%20TSD%20Vol1_02-28-08.pdf. The ARTOPNT file
that lists the nonpoint sources to locate using point data was unchanged from the 2005-based platform.

3.3.7.3	Surrogates for Canada and Mexico Emission Inventories

The surrogates for Canada to spatially allocate the Canadian emissions are based on the 2020 Canadian
inventories and associated data. The spatial surrogate data came from ECCC, along with cross references.
The shapefiles they provided were used in the Surrogate Tool (previously referenced) to create spatial
surrogates. The Canadian surrogates used for this platform are listed in Table 3-15. The population
surrogate for Mexico was updated and is based on the 2015 GPW v4 (see
https://sedac.ciesin.columbia.edu/data/collection/gpw-v4/sets/browse). The other surrogates for Mexico
are circa 1999 and 2000 and were based on data obtained from the Sistema Municipal de Bases de Datos
(SIMBAD) de INEGI and the Bases de datos del Censo Economico 1999. Most of the CAPs allocated to
the Mexico and Canada surrogates are shown in Table 3-16.

Table 3-15. Canadian Spatial Surrogates

Code  Canadian Surrogate Description                                  Code  Description
100   Population                                                      925   Manufacturing and Assembly
101   total dwelling                                                  926   Distribution and Retail (no petroleum)
102   urban dwelling                                                  927   Commercial Services
103   rural dwelling                                                  933   Rail-Passenger
104   capped total dwelling                                           934   Rail-Freight
105   capped meat cooking dwelling                                    935   Rail-Yard
106   ALL INDUST                                                      940   PAVED ROADS NEW
113   Forestry and logging                                            945   Commercial Marine Vessels
116   Total Resources                                                 946   Construction and mining
200   Urban Primary Road Miles                                        948   Forest
210   Rural Primary Road Miles                                        949   Combination of Dwelling
211   Oil and Gas Extraction                                          951   Wood Consumption Percentage
212   Mining except oil and gas                                       952   Residential Fuel Wood Combustion (PIRD)
220   Urban Secondary Road Miles                                      955   UNPAVED ROADS AND TRAILS
221   Total Mining                                                    960   TOTBEEF
222   Utilities                                                       961   80110 Broilers
230   Rural Secondary Road Miles                                      962   80111 Cattle dairy and Heifer
233   Total Land Development                                          963   80112 Cattle non-Dairy
240   capped population                                               964   80113 Laying hens and Pullets
308   Food manufacturing                                              965   80114 Horses
321   Wood product manufacturing                                      966   80115 Sheep and Lamb
323   Printing and related support activities                         967   80116 Swine
324   Petroleum and coal products manufacturing                       968   80117 Turkeys
326   Plastics and rubber products manufacturing                      969   80118 Goat
327   Non-metallic mineral product manufacturing                      970   TOTPOUL
331   Primary Metal Manufacturing                                     971   80119 Buffalo
340   Construction - Oil and Gas                                      972   80120 Llama and Alpacas
350   Water                                                           973   80121 Deer
412   Petroleum product wholesaler-distributors                       974   80122 Elk
448   clothing and clothing accessories stores                        975   80123 Wild boars
562   Waste management and remediation services                       976   80124 Rabbit
601   SCL: 12003 Petroleum Liquids Transportation (PIRD)              977   80125 Mink
602   SCL: 12007 Oil Sands In-Situ Extraction and Processing (PIRD)   978   80126 Fox
603   SCL: 12010 Light Medium Crude Oil Production (PIRD)             980   TOTSWIN
604   SCL: 12011 Well Drilling (PIRD)                                 981   Harvest Annual
605   SCL: 12012 Well Servicing (PIRD)                                982   Harvest Perennial
606   SCL: 12013 Well Testing (PIRD)                                  983   Synthfert Annual
607   SCL: 12014 Natural Gas Production (PIRD)                        984   Synthfert Perennial
608   SCL: 12015 Natural Gas Processing (PIRD)                        985   Tillage Annual
609   SCL: 12016 Heavy Crude Oil Cold Production (PIRD)               990   TOTFERT
610   SCL: 12018 Disposal and Waste Treatment (PIRD)                  996   urban area
611   SCL: 12019 Accidents and Equipment Failures (PIRD)              1251  OFFR TOTFERT
612   SCL: 12020 Natural Gas Transmission and Storage (PIRD)          1252  OFFR MINES
651   MEIT C1C2 Anchored                                              1253  OFFR Other Construction not Urban
652   MEIT C1C2 Underway                                              1254  OFFR Commercial Services
653   MEIT C1C2 Berthed                                               1255  OFFR Oil Sands Mines
661   MEIT C3 Anchored                                                1256  OFFR Wood industries CANVEC
662   MEIT C3 Underway                                                1257  OFFR UNPAVED ROADS RURAL
663   MEIT C3 Berthed                                                 1258  OFFR Utilities
901   AIRPORT                                                         1259  OFFR total dwelling
902   Military LTO                                                    1260  OFFR water
903   Commercial LTO                                                  1261  OFFR ALL INDUST
904   General Aviation LTO                                            1262  OFFR Oil and Gas Extraction
905   Air Taxi LTO                                                    1263  OFFR ALLROADS
921   Commercial Fuel Combustion                                      1264  OFFR AIRPORT
923   TOTAL INSTITUTIONAL AND GOVERNMENT                              1265  OFFR RAILWAY
924   Primary Industry


Table 3-16. 2018 CAPs Allocated to Mexican and Canadian Spatial Surrogates for 12US1 (tons)



Code  Mexican or Canadian Surrogate Description                                     NH3      NOx    PM2.5      SO2      VOC
11    MEX 2015 Population                                                             0   60,516      330      133  167,796
14    MEX Residential Heating - Wood                                                  0    2,468    6,890      201   18,559
16    MEX Residential Heating - Distillate Oil                                        1       31        0        0        1
22    MEX Total Road Miles                                                        2,130  249,454    8,629    4,749   48,885
24    MEX Total Railroads Miles                                                       0   21,516      450      204      806
26    MEX Total Agriculture                                                     115,677   20,235   16,414      527    3,658
32    MEX Commercial Land                                                             0       59    1,287        0   21,908
34    MEX Industrial Land                                                            72    1,598      927        5   24,672
36    MEX Commercial plus Industrial Land                                             5    6,830      324       14   79,869
40    MEX Residential (RES1-4)+Comercial+Industrial+Institutional+Government          0       13       48        1   16,400
42    MEX Personal Repair (COM3)                                                      0        0        0        0    4,049
44    MEX Airports Area                                                               0    3,805       53      268    1,440
48    MEX Brick Kilns                                                                 0      210    4,180      371      102
50    MEX Mobile sources - Border Crossing                                    [garbled]       64        9        0       50
100   CAN Population                                                                698       56      221       16    3,798
101   CAN total dwelling                                                              0        0        0        0  105,422
104   CAN Capped Total Dwelling                                                     321   32,970    2,486    2,030    1,688
106   CAN ALL INDUST                                                                  0        0      543        0        0
113   CAN Forestry and logging                                                       83      627    2,934       15    2,717
200   CAN Urban Primary Road Miles                                                1,527   75,221    2,659      176    7,124
210   CAN Rural Primary Road Miles                                                  584   40,602    1,405       74    2,880
212   CAN Mining except oil and gas                                                   0        0    1,618        0        0
220   CAN Urban Secondary Road Miles                                              2,866  119,406    5,355      357   18,967
221   CAN Total Mining                                                                0        0   12,266        0        0
222   CAN Utilities                                                                   0    2,562    2,504       32      110
230   CAN Rural Secondary Road Miles                                              1,545   74,760    2,682      187    7,677
240   CAN Total Road Miles                                                          330   44,970    1,181       38   79,357
308   CAN Food manufacturing                                                          0        0   17,591        0    5,104
321   CAN Wood product manufacturing                                                517    1,700      578      207    8,374
323   CAN Printing and related support activities                                     0        0        0        0   18,212
324   CAN Petroleum and coal products manufacturing                                   0      920    1,285      384    5,820
326   CAN Plastics and rubber products manufacturing                                  0        0        0        0   21,854
327   CAN Non-metallic mineral product manufacturing                                  0        0    6,686        0        0
331   CAN Primary Metal Manufacturing                                                 0      112    3,880       21       45
412   CAN Petroleum product wholesaler-distributors                                   0        0        0        0   36,768
448   CAN clothing and clothing accessories stores                                    0        0        0        0      177
562   CAN Waste management and remediation services                               2,656    1,259    2,401    2,119   16,006
601   CAN SCL: 12003 Petroleum Liquids Transportation (PIRD)                          0        0       12      163    6,141
602   CAN SCL: 12007 Oil Sands In-Situ Extraction and Processing (PIRD)               0        0        0        0      108
603   CAN SCL: 12010 Light Medium Crude Oil Production (PIRD)                         0        0        0        0        2
604   CAN SCL: 12011 Well Drilling (PIRD)                                             0        0        0      563      594
605   CAN SCL: 12012 Well Servicing (PIRD)                                            0        0        0       62       65
606   CAN SCL: 12013 Well Testing (PIRD)                                              0        0        0        0        0
607   CAN SCL: 12014 Natural Gas Production (PIRD)                                    0       31        1        0      215
608   CAN SCL: 12015 Natural Gas Processing (PIRD)                                    0        0        0        0        0
611   CAN SCL: 12019 Accidents and Equipment Failures (PIRD)                          0        0        0        0   99,936
612   CAN SCL: 12020 Natural Gas Transmission and Storage (PIRD)                      1      800       55       11      408
901   CAN Airport                                                                     0       99        9        0       10



-------
| Code | Mexican or Canadian Surrogate Description | NH3 | NOx | PM2.5 | SO2 | VOC |
|---|---|---|---|---|---|---|
| 921 | CAN Commercial Fuel Combustion | 195 | 22,375 | 2,452 | 449 | 969 |
| 923 | CAN TOTAL INSTITUTIONAL AND GOVERNMENT | 0 | 0 | 0 | 0 | 14,276 |
| 924 | CAN Primary Industry | 0 | 0 | 0 | 0 | 31,784 |
| 925 | CAN Manufacturing and Assembly | 0 | 0 | 0 | 0 | 64,541 |
| 926 | CAN Distribution and Retail (no petroleum) | 0 | 0 | 0 | 0 | 6,633 |
| 927 | CAN Commercial Services | 0 | 0 | 0 | 0 | 30,243 |
| 933 | CAN Rail-Passenger | 1 | 3,038 | 60 | 1 | 121 |
| 934 | CAN Rail-Freight | 49 | 77,610 | 1,537 | 43 | 3,430 |
| 935 | CAN Rail-Yard | 1 | 4,587 | 95 | 1 | 279 |
| 940 | CAN Paved Roads New |  |  | 24,023 |  |  |
| 946 | CAN Construction and Mining | 42 | 2,675 | 149 | 257 | 38 |
| 951 | CAN Wood Consumption Percentage | 1,119 | 12,431 | 75,655 | 1,776 | 105,563 |
| 955 | CAN UNPAVED ROADS AND TRAILS | 0 | 0 | 403,589 | 0 | 0 |
| 961 | CAN 80110_Broilers | 12,630 | 0 | 115 | 0 | 12,787 |
| 962 | CAN 80111_Cattle_dairy_and_Heifer | 57,942 | 0 | 276 | 0 | 40,516 |
| 963 | CAN 80112 Cattle non-Dairy | 164,849 | 0 | 884 | 0 | 42,876 |
| 964 | CAN 80113 Laying hens and Pullets | 9,451 | 0 | 40 | 0 | 10,596 |
| 965 | CAN 80114_Horses | 2,937 | 0 | 19 | 0 | 1,321 |
| 966 | CAN 80115_Sheep_and_Lamb | 2,122 | 0 | 6 | 0 | 170 |
| 967 | CAN 80116 Swine | 59,569 | 0 | 824 | 0 | 9,949 |
| 968 | CAN 80117_Turkeys | 4,877 | 0 | 41 | 0 | 4,509 |
| 969 | CAN 80118 Goat | 1,680 | 0 | 2 | 0 | 135 |
| 971 | CAN 80119_Buffalo | 2,092 | 0 | 6 | 0 | 517 |
| 972 | CAN 80120_Llama_and_Alpacas | 110 | 0 | 0 | 0 | 0 |
| 973 | CAN 80121_Deer | 18 | 0 | 0 | 0 | 0 |
| 974 | CAN 80122_Elk | 18 | 0 | 0 | 0 | 0 |
| 975 | CAN 80123 Wild boars | 34 | 0 | 0 | 0 | 0 |
| 976 | CAN 80124_Rabbit | 73 | 0 | 0 | 0 | 1 |
| 977 | CAN 80125 Mink | 284 | 0 | 0 | 0 | 951 |
| 978 | CAN 80126_Fox | 4 | 0 | 0 | 0 | 3 |
| 981 | CAN Harvest Annual | 0 | 0 | 24,807 | 0 | 0 |
| 983 | CAN Synthfert Annual | 177,194 | 3,616 | 2,117 | 5,933 | 132 |
| 985 | CAN Tillage_Annual | 0 | 0 | 106,732 | 0 | 0 |
| 996 | CAN urban area | 0 | 0 | 3,423 | 0 | 0 |
| 1251 | CAN OFFR TOTFERT | 83 | 63,804 | 4,510 | 57 | 6,290 |
| 1252 | CAN OFFR MINES | 1 | 585 | 42 | 1 | 81 |
| 1253 | CAN OFFR Other Construction not Urban | 66 | 38,916 | 4,649 | 44 | 10,239 |



-------
| Code | Mexican or Canadian Surrogate Description | NH3 | NOx | PM2.5 | SO2 | VOC |
|---|---|---|---|---|---|---|
| 1254 | CAN OFFR Commercial Services | 44 | 16,547 | 2,478 | 38 | 37,831 |
| 1255 | CAN OFFR Oil Sands Mines | 0 | 0 | 0 | 0 | 0 |
| 1256 | CAN OFFR Wood industries CANVEC | 9 | 3,343 | 272 | 6 | 922 |
| 1257 | CAN OFFR Unpaved Roads Rural | 23 | 10,032 | 626 | 20 | 26,879 |
| 1258 | CAN OFFR Utilities | 7 | 3,988 | 205 | 6 | 829 |
| 1259 | CAN OFFR total dwelling | 17 | 6,202 | 598 | 14 | 12,332 |
| 1260 | CAN OFFR water | 16 | 4,665 | 355 | 24 | 24,371 |
| 1261 | CAN OFFR ALL INDUST | 3 | 4,781 | 168 | 2 | 842 |
| 1262 | CAN OFFR Oil and Gas Extraction | 1 | 400 | 32 | 0 | 120 |
| 1263 | CAN OFFR ALLROADS | 3 | 1,811 | 182 | 2 | 463 |
| 1265 | CAN OFFR CANRAIL | 0 | 65 | 6 | 0 | 12 |
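As a reading aid for Table 3-16, the following minimal sketch shows one way to total the allocated emissions by country prefix. It uses only a handful of rows copied from the table; the function name `totals_by_country` and the tuple layout are illustrative assumptions, not part of the EPA emissions tooling.

```python
# Illustrative subset of Table 3-16 rows, copied from the table above.
# Layout (an assumption for this sketch): code, description, NH3, NOx, PM2.5, SO2, VOC in tons.
ROWS = [
    (22, "MEX Total Road Miles", 2130, 249454, 8629, 4749, 48885),
    (26, "MEX Total Agriculture", 115677, 20235, 16414, 527, 3658),
    (240, "CAN Total Road Miles", 330, 44970, 1181, 38, 79357),
    (934, "CAN Rail-Freight", 49, 77610, 1537, 43, 3430),
    (983, "CAN Synthfert Annual", 177194, 3616, 2117, 5933, 132),
]
POLLUTANTS = ("NH3", "NOx", "PM2.5", "SO2", "VOC")


def totals_by_country(rows):
    """Sum each pollutant over rows, grouped by the MEX/CAN prefix of the description."""
    totals = {}
    for _code, description, *values in rows:
        country = description.split()[0]  # "MEX" or "CAN"
        bucket = totals.setdefault(country, dict.fromkeys(POLLUTANTS, 0))
        for name, tons in zip(POLLUTANTS, values):
            bucket[name] += tons
    return totals


if __name__ == "__main__":
    for country, sums in totals_by_country(ROWS).items():
        print(country, sums)
```

Running the same function over every row of the table would give the full Mexican and Canadian totals per pollutant; the subset here only demonstrates the grouping.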



-------
3.4 Emissions References

Adelman, Z. 2012. Memorandum: Fugitive Dust Modeling for the 2008 Emissions Modeling Platform.
UNC Institute for the Environment, Chapel Hill, NC. September 28, 2012.

Adelman, Z. 2016. 2014 Emissions Modeling Platform Spatial Surrogate Documentation. UNC Institute
for the Environment, Chapel Hill, NC. October 1, 2016. Available at
https://gaftp.epa.gov/Air/emismod/2014/v1/spatial_surrogates/.

Adelman, Z., M. Omary, Q. He, J. Zhao and D. Yang, J. Boylan, 2012. "A Detailed Approach for
Improving Continuous Emissions Monitoring Data for Regulatory Air Quality Modeling."
Presented at the 2012 International Emission Inventory Conference, Tampa, Florida. Available
from http://www.epa.gov/ttn/chief/conference/ei20/index.html#ses-5.

Appel, K.W., Napelenok, S., Hogrefe, C., Pouliot, G., Foley, K.M., Roselle, S.J., Pleim, J.E., Bash, J.,
Pye, H.O.T., Heath, N., Murphy, B., Mathur, R., 2018. Overview and evaluation of the
Community Multiscale Air Quality Model (CMAQ) modeling system version 5.2. In Mensink C.,
Kallos G. (eds), Air Pollution Modeling and its Application XXV. ITM 2016. Springer
Proceedings in Complexity. Springer, Cham. Available at
https://doi.org/10.1007/978-3-319-57645-9_11.

Bash, J.O., Baker, K.R., Beaver, M.R., Park, J.-H., Goldstein, A.H., 2016. Evaluation of improved land
use and canopy representation in BEIS with biogenic VOC measurements in California. Available
from http://www.geosci-model-dev.net/9/2191/2016/.

Bullock Jr., R, and K. A. Brehme (2002) "Atmospheric mercury simulation using the CMAQ model:
formulation description and analysis of wet deposition results." Atmospheric Environment 36, pp
2135-2146. Available at https://doi.org/10.1016/S1352-2310(02)00220-0.

California Air Resources Board (CARB): Final 2015 Consumer & Commercial Product Survey Data
Summaries, 2019.

Coordinating Research Council (CRC), 2017. Report A-100. Improvement of Default Inputs for MOVES
and SMOKE-MOVES. Final Report. February 2017. Available at
http://crcsite.wpengine.com/wp-content/uploads/2019/05/ERG_FinalReport_CRCA100_28Feb2017.pdf.

Coordinating Research Council (CRC), 2019. Report A-115. Developing Improved Vehicle Population
Inputs for the 2017 National Emissions Inventory. Final Report. April 2019. Available at
http://crcsite.wpengine.com/wp-content/uploads/2019/05/CRC-Project-A-115-Final-Report_20190411.pdf.

Drillinginfo, Inc. 2017. "DI Desktop Database powered by HPDI." Currently available from
https://www.enverus.com/.



-------
England, G., Watson, J., Chow, J., Zielenska, B., Chang, M., Loos, K., Hidy, G., 2007. "Dilution-Based
Emissions Sampling from Stationary Sources: Part 2— Gas-Fired Combustors Compared with
Other Fuel-Fired Systems," Journal of the Air & Waste Management Association, 57:1, 65-78,
DOI: 10.1080/10473289.2007.10465291. Available at
https://www.tandfonline.com/doi/abs/10.1080/10473289.2007.10465291.

EPA. 2007a. Control of Hazardous Air Pollutants from Mobile Sources Regulatory Impact Analysis.
EPA420-R-07-002. EPA Office of Transportation and Air Quality (OTAQ) Assessment and
Standards Division, Ann Arbor, MI. Available online at
https://nepis.epa.gov/Exe/ZyPdf.cgi?Dockey=P1004LNN.PDF.

EPA, 2015b. Draft Report Speciation Profiles and Toxic Emission Factors for Nonroad Engines. EPA-
420-R-14-028. Available at
https://cfpub.epa.gov/si/si_public_record_Report.cfm?dirEntryId=309339&CFID=83476290&CFTOKEN=35281617.

EPA, 2015c. Speciation of Total Organic Gas and Particulate Matter Emissions from On-road Vehicles in
MOVES2014. EPA-420-R-15-022. Available at
https://nepis.epa.gov/Exe/ZyPDF.cgi?Dockey=P100NQJG.pdf.

EPA, 2016. SPECIATE Version 4.5 Database Development Documentation, U.S. Environmental
Protection Agency, Office of Research and Development, National Risk Management Research
Laboratory, Research Triangle Park, NC 27711, EPA/600/R-16/294, September 2016. Available
at https://www.epa.gov/sites/production/files/2016-09/documents/speciate_4.5.pdf.

EPA, 2018. AERMOD Model Formulation and Evaluation Document. EPA-454/R-18-003. U.S.
Environmental Protection Agency, Research Triangle Park, North Carolina 27711. Available at
https://www3.epa.gov/ttn/scram/models/aermod/aermod_mfed.pdf.

EPA, 2019. Final Report, SPECIATE Version 5.0, Database Development Documentation, Research
Triangle Park, NC, EPA/600/R-19/988. Available at
https://www.epa.gov/air-emissions-modeling/speciate-51-and-50-addendum-and-final-report.

EPA and National Emissions Inventory Collaborative (NEIC), 2019. Technical Support Document (TSD)
Preparation of Emissions Inventories for the Version 7.2 North American Emissions Modeling
Platform. Available at https://www.epa.gov/air-emissions-modeling/2016-version-72-technical-
support-document.

EPA, 2020. Population and Activity of Onroad Vehicles in MOVES3. EPA-420-R-20-023. Office of

Transportation and Air Quality. US Environmental Protection Agency. Ann Arbor, MI. November
2020. Available under the MOVES3 section at https://www.epa.gov/moves/moves-technical-
reports.

EPA, 2020b. Technical Support Document: "Development of Mercury Speciation Factors for EPA's Air
Emissions Modeling Programs, April 2020". US EPA Office of Air Quality Planning and
Standards.

EPA, 2021. 2017 National Emission Inventory: January 2021 Updated Release, Technical Support
Document. U.S. Environmental Protection Agency, OAQPS, Research Triangle Park, NC 27711.
Available at:
https://www.epa.gov/air-emissions-inventories/2017-national-emissions-inventory-nei-technical-support-document-tsd.



-------
EPA, 2021. 2017 National Emissions Inventory (NEI) data, Research Triangle Park, NC, January 2021.
https://www.epa.gov/air-emissions-inventories/2017-national-emissions-inventory-nei-data.

EPA and NEIC, 2021. Technical Support Document (TSD) Preparation of Emissions Inventories for the
2016v1 North American Emissions Modeling Platform. Available at:
https://www.epa.gov/air-emissions-modeling/2016-version-1-technical-support-document.

EPA, 2022a. Technical Support Document EPA's Air Toxics Screening Assessment - 2018

AirToxScreen TSD. Available at: https://www.epa.gov/AirToxScreen/2018-airtoxscreen-
technical-support-document.

EPA, 2022b. Technical Support Document: Preparation of Emissions Inventories for the 2019 North
American Emissions Modeling Platform. Available at: https://www.epa.gov/air-emissions-
modeling/2019-emissions-modeling-platform-technical-support-document.

EPA, 2023. 2020 National Emission Inventory Technical Support Document. U.S. Environmental
Protection Agency, OAQPS, Research Triangle Park, NC 27711. Available at:
https://www.epa.gov/air-emissions-inventories/2020-national-emissions-inventory-nei-technical-support-document-tsd.

ERG, 2016b. "Technical Memorandum: Modeling Allocation Factors for the 2014 Oil and Gas Nonpoint
Tool." Available at https://gaftp.epa.gov/air/emismod/2014/v1/spatial_surrogates/oil_and_gas/.

ERG, 2017. "Technical Report: Development of Mexico Emission Inventories for the 2014 Modeling
Platform." Available at
https://gaftp.epa.gov/air/emismod/2016/v1/reports/EPA%205-18%20Report_Clean%20Final_01042017.pdf.

ERG, 2018. Technical Report: "2016 Nonpoint Oil and Gas Emission Estimation Tool Version 1.0".
Available at
https://gaftp.epa.gov/air/emismod/2016/v1/reports/2016%20Nonpoint%20Oil%20and%20Gas%20Emission%20Estimation%20Tool%20V1_0%20December_2018.pdf.

The Freedonia Group: Solvents, Industry Study #3429, 2016.

Khare, P., and Gentner, D. R.: Considering the future of anthropogenic gas-phase organic compound
emissions and the increasing influence of non-combustion sources on urban air quality, Atmos
Chem Phys, 18, 5391-5413, 10.5194/acp-18-5391-2018, 2018.

Luecken D., Yarwood G, Hutzell WT, 2019. Multipollutant modeling of ozone, reactive nitrogen and
HAPs across the continental US with CMAQ-CB6. Atmospheric environment. 2019 Mar
15;201:62-72.

Mansouri, K., Grulke, C. M., Judson, R. S., and Williams, A. J.: OPERA models for predicting
physicochemical properties and environmental fate endpoints, J Cheminformatics, 10,
10.1186/s13321-018-0263-1, 2018.

McCarty, J.L., Korontzi, S., Justice, C.O., and T. Loboda. 2009. The spatial and temporal distribution of
crop residue burning in the contiguous United States. Science of the Total Environment, 407 (21):
5701-5712. Available at https://doi.org/10.1016/j.scitotenv.2009.07.009.

MDNR, 2008. "A Minnesota 2008 Residential Fuelwood Assessment Survey of individual household
responses". Minnesota Department of Natural Resources. Available from
http://files.dnr.state.mn.us/forestry/um/residentialfuelwoodassessment07_08.pdf.



-------
NCAR, 2016. FIRE EMISSION FACTORS AND EMISSION INVENTORIES, FINN Data, downloaded
2014 SAPRC99 version from http://bai.acom.ucar.edu/Data/fire/.

NEIC, 2019. Specification sheets for the 2016v1 platform. Available from
http://views.cira.colostate.edu/wiki/wiki/10202.

NESCAUM, 2006. "Assessment of Outdoor Wood-fired Boilers". Northeast States for Coordinated Air
Use Management (NESCAUM) report. Available from
http://www.nescaum.org/documents/assessment-of-outdoor-wood-fired-boilers/2006-1031-owb-report_revised-june2006-appendix.pdf.

NYSERDA, 2012. "Environmental, Energy Market, and Health Characterization of Wood-Fired Hydronic
Heater Technologies, Final Report". New York State Energy Research and Development
Authority (NYSERDA). Available from:
http://www.nyserda.ny.gov/Publications/Case-Studies/-/media/Files/Publications/Research/Environmental/Wood-Fired-Hydronic-Heater-Tech.ashx.

Pouliot, G., H. Simon, P. Bhave, D. Tong, D. Mobley, T. Pace, and T. Pierce. 2010. "Assessing the
Anthropogenic Fugitive Dust Emission Inventory and Temporal Allocation Using an Updated
Speciation of Particulate Matter." International Emission Inventory Conference, San Antonio, TX.
Available at http://www3.epa.gov/ttn/chief/conference/ei19/session9/pouliot_pres.pdf.

Pouliot, G. and J. Bash, 2015. Updates to Version 3.61 of the Biogenic Emission Inventory System
(BEIS). Presented at Air and Waste Management Association conference, Raleigh, NC, 2015.

Pouliot G, Rao V, McCarty JL, Soja A. Development of the crop residue and rangeland burning in the
2014 National Emissions Inventory using information from multiple sources. Journal of the Air &
Waste Management Association. 2017 Apr 27;67(5):613-22.

Reichle, L., R. Cook, C. Yanca, D. Sonntag, 2015. "Development of organic gas exhaust speciation
profiles for nonroad spark-ignition and compression-ignition engines and equipment", Journal of
the Air & Waste Management Association, 65:10, 1185-1193, DOI:
10.1080/10962247.2015.1020118. Available at https://doi.org/10.1080/10962247.2015.1020118.

Reff, A., Bhave, P., Simon, H., Pace, T., Pouliot, G., Mobley, J., Houyoux, M. "Emissions Inventory of
PM2.5 Trace Elements across the United States". Environmental Science & Technology 2009 43
(15), 5790-5796. DOI: 10.1021/es802930x. Available at https://doi.org/10.1021/es802930x.

Sarwar, G., S. Roselle, R. Mathur, W. Appel, R. Dennis, "A Comparison of CMAQ HONO predictions
with observations from the Northeast Oxidant and Particle Study", Atmospheric Environment 42
(2008) 5760-5770. Available at https://doi.org/10.1016/j.atmosenv.2007.12.065.

Schauer, J., G. Lough, M. Shafer, W. Christensen, M. Arndt, J. DeMinter, J. Park, "Characterization of

Metals Emitted from Motor Vehicles," Health Effects Institute, Research Report 133, March 2006.
Available at https://www.healtheffects.org/publication/characterization-metals-emitted-motor-
vehicles.

Seltzer, K. M., Pennington, E., Rao, V., Murphy, B. N., Strum, M., Isaacs, K. K., and Pye, H. O. T., 2021:
"Reactive organic carbon emissions from volatile chemical products", Atmos. Chem. Phys. 21,
5079-5100, 2021. https://doi.org/10.5194/acp-21-5079-2021 and
https://acp.copernicus.org/articles/21/5079/2021/.



-------
Skamarock, W., J. Klemp, J. Dudhia, D. Gill, D. Barker, M. Duda, X. Huang, W. Wang, J. Powers, 2008.
A Description of the Advanced Research WRF Version 3. NCAR Technical Note. National
Center for Atmospheric Research, Mesoscale and Microscale Meteorology Division, Boulder, CO.
June 2008. Available at: http://www2.mmm.ucar.edu/wrf/users/docs/arw_v3_bw.pdf.

Swedish Environmental Protection Agency, 2004. Swedish Methodology for Environmental Data;
Methodology for Calculating Emissions from Ships: 1. Update of Emission Factors.

U.S. Bureau of Labor and Statistics, 2020. Producer Price Index by Industry, retrieved from FRED,
Federal Reserve Bank of St. Louis, available at: https://fred.stlouisfed.org/categories/31, access
date: 21 August 2020.

U.S. Census Bureau: Paint and Allied Products - 2010, MA325F(10), 2011.
https://www.census.gov/data/tables/time-series/econ/cir/ma325f.html.

U.S. Census Bureau, Economy Wide Statistics Division: County Business Patterns, 2018.
https://www.census.gov/programs-surveys/cbp/data/datasets.html.

U.S. Department of Transportation and the U.S. Department of Commerce, 2015. 2012 Commodity Flow
Survey, EC12TCF-US. https://www.census.gov/library/publications/2015/econ/ec12tcf-us.html.

U.S. Energy Information Administration, 2019. The Distribution of U.S. Oil and Natural Gas Wells by
Production Rate, Washington, DC. https://www.eia.gov/petroleum/wells/.

Wang, Y., P. Hopke, O. V. Rattigan, X. Xia, D. C. Chalupa, M. J. Utell. (2011) "Characterization of

Residential Wood Combustion Particles Using the Two-Wavelength Aethalometer", Environ. Sci.
Technol., 45 (17), pp 7387-7393. Available at https://doi.org/10.1021/es2013984.

Weschler, C. J., and Nazaroff, W. W.: Semivolatile organic compounds in indoor environments, Atmos
Environ, 42, 9018-9040, 2008.

Wiedinmyer, C., Y. Kimura, E. C. McDonald-Buller, L. K. Emmons, R. R. Buchholz, W. Tang, K. Seto,
M. B. Joseph, K. C. Barsanti, A. G. Carlton, and R. Yokelson, Volume 16, issue 13, GMD, 16,
3873-3891, 2023. https://gmd.copernicus.org/articles/16/3873/2023/.

Wiedinmyer, C., S.K. Akagi, R.J. Yokelson, L.K. Emmons, J.A. Al-Saadi, J. J. Orlando, and A. J. Soja.
(2011) "The Fire INventory from NCAR (FINN): a high resolution global model to estimate the
emissions from open burning", Geosci. Model Dev., 4, 625-641. http://www.geosci-model-
dev.net/4/625/2011/ doi: 10.5194/gmd-4-625-2011.

Yarwood, G., R. Beardsley, Y. Shi, and B. Czader: Revision 5 of the Carbon Bond 6 Mechanism
(CB6r5). Presented at the Annual CMAS Conference, Chapel Hill, NC, 2020.

Zhu, Henze, et al, 2013. "Constraining U.S. Ammonia Emissions using TES Remote Sensing
Observations and the GEOS-Chem adjoint model", Journal of Geophysical Research:
Atmospheres, 118: 1-14. Available at https://doi.org/10.1002/jgrd.50166.



-------
4.0 CMAQ Air Quality Model Estimates

4.1 Introduction to the CMAQ Modeling Platform

The Clean Air Act (CAA) provides a mandate to assess and manage air pollution levels to protect human
health and the environment. EPA has established National Ambient Air Quality Standards (NAAQS),
requiring the development of effective emissions control strategies for such pollutants as ozone and
particulate matter. Air quality models are used to develop these emission control strategies to achieve the
objectives of the CAA.

Historically, air quality models have addressed individual pollutant issues separately. However, many of
the same precursor chemicals are involved in both ozone and aerosol (particulate matter) chemistry;
therefore, the chemical transformation pathways are dependent. Thus, modeled abatement strategies of
pollutant precursors, such as VOC and NOx to reduce ozone levels, may exacerbate other air pollutants
such as particulate matter. To meet the need to address the complex relationships between pollutants, EPA
developed the Community Multiscale Air Quality (CMAQ) modeling system.11 The primary goals for
CMAQ are to:

•	Improve the environmental management community's ability to evaluate the impact of air quality
management practices for multiple pollutants at multiple scales.

•	Improve the scientist's ability to better probe, understand, and simulate chemical and physical
interactions in the atmosphere.

The CMAQ modeling system brings together key physical and chemical functions associated with the
dispersion and transformations of air pollution at various scales. It was designed to approach air quality as
a whole by including state-of-the-science capabilities for modeling multiple air quality issues, including
tropospheric ozone, fine particles, toxics, acid deposition, and visibility degradation. CMAQ relies on
emission estimates from various sources, including the U.S. EPA Office of Air Quality Planning and
Standards' current emission inventories, observed emissions from major utility stacks, and model estimates
of natural emissions from biogenic and agricultural sources. CMAQ also relies on meteorological
predictions that include assimilation of meteorological observations as constraints. Emissions and
meteorology data are fed into CMAQ and run through various algorithms that simulate the physical and
chemical processes in the atmosphere to provide estimated concentrations of the pollutants. Traditionally,
the model has been used to predict air quality across a regional or national domain and then to simulate the
effects of various changes in emission levels for policymaking purposes. For health studies, the model can
also be used to provide supplemental information about air quality in areas where no monitors exist.

CMAQ was also designed to have multi-scale capabilities so that separate models were not needed for
urban and regional scale air quality modeling. The CMAQ simulation performed for this 2020 assessment
used a single domain that covers the entire continental U.S. (CONUS) and large portions of Canada and
Mexico using 12-km by 12-km horizontal grid spacing. Currently, 12-km x 12-km resolution is sufficient

11 Byun, D.W., and K. L. Schere, 2006: Review of the Governing Equations, Computational Algorithms, and Other
Components of the Models-3 Community Multiscale Air Quality (CMAQ) Modeling System. Applied Mechanics Reviews,
Volume 59, Number 2 (March 2006), pp. 51-77.



-------
as the highest resolution for most regional-scale air quality model applications and assessments.12 With the
temporal flexibility of the model, simulations can be performed to evaluate longer term (annual to multi-
year) pollutant climatologies as well as short-term (weeks to months) transport from localized sources. By
making CMAQ a modeling system that addresses multiple pollutants and different temporal and spatial
scales, CMAQ has a "one atmosphere" perspective that combines the efforts of the scientific community.
Improvements will be made to the CMAQ modeling system as the scientific community further develops
the state-of-the-science.

For more information on CMAQ, go to https://www.epa.gov/cmaq or http://www.cmascenter.org.

4.1.1 Advantages and Limitations of the CMAQ Air Quality Model

An advantage of using CMAQ model output to characterize air quality for comparison with health
outcomes is that it provides complete spatial and temporal coverage across the U.S. CMAQ is a
three-dimensional Eulerian photochemical air quality model that simulates the numerous physical and
chemical processes involved in the formation, transport, and destruction of ozone, particulate matter, and
air toxics for given input sets of initial and boundary conditions, meteorological conditions, and
emissions. The CMAQ model includes state-of-the-science capabilities for conducting urban to regional
scale simulations of multiple air quality issues, including tropospheric ozone, fine particles, toxics, acid
deposition, and visibility degradation. However, CMAQ is resource intensive, requiring significant data
inputs and computing resources.

One source of uncertainty in the CMAQ model is structural: how physical and chemical processes are
represented in the model. Structural choices include the chemical mechanism used to characterize
reactions in the atmosphere, the land surface model, and the planetary boundary layer scheme.

A second source is parametric uncertainty, that is, uncertainty
in the model inputs: hourly meteorological fields, hourly 3-D gridded emissions, initial conditions, and
boundary conditions. Uncertainties due to initial conditions are minimized by using a 10-day ramp-up
period from which model results are not used in the aggregation and analysis of model outputs.
Evaluations of models against observed pollutant concentrations build confidence that the model performs
with reasonable accuracy despite the uncertainties listed above. A detailed model evaluation for ozone
and PM2.5 species provided in Section 4.3 shows generally acceptable model performance that is
equivalent to or better than typical state-of-the-science regional modeling simulations as summarized in
Simon et al., 2012.13

4.2 CMAQ Model Version, Inputs and Configuration

This section describes the air quality modeling platform used for the 2020 CMAQ simulation. A modeling
platform is a structured system of connected modeling-related tools and data that provide a consistent and
transparent basis for assessing the air quality response to changes in emissions and/or meteorology. A
platform typically consists of a specific air quality model, emissions estimates, a set of meteorological
inputs, and estimates of boundary conditions representing pollutant transport from source areas outside the
region modeled. We used the CMAQ modeling system as part of the 2020 Platform to provide a national

12 U.S. EPA (2018), Modeling Guidance for Demonstrating Air Quality Goals for Ozone, PM2.5, and Regional Haze, pp 205.
https://www3.epa.gov/ttn/scram/guidance/guide/O3-PM-RH-Modeling_Guidance-2018.pdf.

13 Simon, H., Baker, K.R., and Phillips, S. (2012) Compilation and interpretation of photochemical model performance
statistics published between 2006 and 2012. Atmospheric Environment 61, 124-139.



-------
scale air quality modeling analysis. The CMAQ model simulates the multiple physical and chemical
processes involved in the formation, transport, and destruction of ozone and PM2.5.

This section provides a description of each of the main components of the 2020 CMAQ simulation along
with the results of a model performance evaluation in which the 2020 model predictions are compared to
corresponding measured ambient concentrations.

4.2.1	CMAQ Model Version

CMAQ is a non-proprietary computer model that simulates the formation and fate of photochemical
oxidants, including PM2.5 and ozone, for given input sets of meteorological conditions and emissions. As
mentioned previously, CMAQ includes numerous science modules that simulate the emission, production,
decay, deposition and transport of organic and inorganic gas-phase and particle pollutants in the
atmosphere. This 2020 analysis employed CMAQ version 5.4.14 The 2020 CMAQ run included CB6r5
chemistry15,16, AERO7 aerosol module17 with non-volatile Primary Organic Aerosol (POA), and updated
halogen chemistry18. The CMAQ community model versions 5.2 and 5.3 were most recently peer-
reviewed in May of 2019 for the U.S. EPA.19

4.2.2	Model Domain and Grid Resolution

The CMAQ modeling analyses were performed for a domain covering the continental United States, as
shown in Figure 4-1. This single domain covers the entire continental U.S. (CONUS) and large portions
of Canada and Mexico using 12-km by 12-km horizontal grid spacing. The 2020 simulation used a
Lambert Conformal map projection centered at (-97, 40) with true latitudes at 33 and 45 degrees
north. The 12-km CMAQ domain consisted of 459 by 299 grid cells and 35 vertical layers. Table 4-1
provides some basic geographic information regarding the 12-km CMAQ domain. The model extends
vertically from the surface to 50 millibars (approximately 17,600 meters) using a sigma-pressure
coordinate system. Table 4-2 shows the vertical layer structure used in the 2020 simulation. Air quality

14	CMAQ version 5.4: United States Environmental Protection Agency. (2022). CMAQ (Version 5.4) [Software]. Available
from https://doi.org/10.5281/zenodo.7218076; https://www.epa.gov/cmaq. CMAQ v5.4 is also available from the Community
Modeling and Analysis System (CMAS) at: http://www.cmascenter.org.

15	Luecken, D. J., Yarwood, G., and Hutzell, W. T.: Multipollutant modeling of ozone, reactive nitrogen and HAPs across the
continental US with CMAQ-CB6, Atmos Environ, 201, 62-72, 10.1016/j.atmosenv.2018.11.060, 2019.

16	Yarwood, G., Beardsley, R., Shi, Y., Czader, B.: Revision 5 of the Carbon Bond 6 Mechanism (CB6r5), CMAS 2020,
October 27, 2020.

https://www.cmascenter.org/conference/2020/slides/BeardsleyR_CMAS2020_CarbonBond6_Revision5_clean.pdf

17	Xu, L., Pye, H. O. T., He, J., Chen, Y. L., Murphy, B. N., and Ng, N. L.: Experimental and model estimates of the
contributions from biogenic monoterpenes and sesquiterpenes to secondary organic aerosol in the southeastern United States,
Atmos Chem Phys, 18, 12613-12637, 10.5194/acp-18-12613-2018, 2018.

18 Kang, D.; Willison, J.; Sarwar, G.; Madden, M.; Hogrefe, C.; Mathur, R.; Gantt, B.; and Saiz-Lopez, A.: Improving the
Characterization of Natural Emissions in CMAQ, Environmental Manager, A&WMA, October 2021.

19 Barsanti, K.C., Pickering, K.E., Pour-Biazar, A., Saylor, R.D., Stroud, C.A. (June 19, 2019). Final Report: Sixth Peer
Review of the Community Multiscale Air Quality (CMAQ) Modeling System,
https://www.epa.gov/sites/default/files/2019-08/documents/sixth_cmaq_peer_review_comment_report_6.19.19.pdf.

This peer review was focused on CMAQv5.2, which was released in June of 2017, as well as CMAQ v5.3, which was released
in August of 2019. It is available from the Community Modeling and Analysis System (CMAS) as well as previous peer-
review reports at: http://www.cmascenter.org.



-------
conditions at the outer boundary of the 12-km domain were taken from the GEOS-Chem global model
(discussed in Section 4.2.4).
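The domain dimensions quoted above translate directly into a physical extent and a cell count. The short sketch below works through that arithmetic using only the numbers given in the text (459 by 299 cells, 35 layers, 12-km spacing); the function name `domain_summary` is an illustrative assumption.

```python
# Grid dimensions and spacing as stated for the 12-km CMAQ domain.
NCOLS, NROWS, NLAYERS = 459, 299, 35
CELL_KM = 12.0


def domain_summary(ncols=NCOLS, nrows=NROWS, nlayers=NLAYERS, cell_km=CELL_KM):
    """Return the horizontal extent and cell counts implied by the grid definition."""
    return {
        "width_km": ncols * cell_km,            # east-west extent
        "height_km": nrows * cell_km,           # north-south extent
        "surface_cells": ncols * nrows,         # cells in one layer
        "total_cells": ncols * nrows * nlayers  # cells in the full 3-D grid
    }


if __name__ == "__main__":
    for key, value in domain_summary().items():
        print(f"{key}: {value:,}")
```

Under these inputs the domain spans 5,508 km by 3,588 km, with 137,241 cells per layer and roughly 4.8 million cells in total.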



-------
Table 4-1. Geographic Information for 2020 12-km Modeling Domain

| National 12 km CMAQ Modeling Configuration | |
|---|---|
| Map Projection | Lambert Conformal Projection |
| Grid Resolution | 12 km |
| Coordinate Center | 97 W, 40 N |
| True Latitudes | 33 and 45 N |
| Dimensions | 459 x 299 x 35 |
| Vertical Extent | 35 Layers: Surface to 50 mb level (see Table 4-2) |
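To illustrate how the projection parameters in Table 4-1 define the map coordinates, the sketch below implements the standard spherical Lambert Conformal Conic forward equations in plain Python. The Earth radius of 6,370 km is an assumption (a value commonly used with WRF and CMAQ, not stated in this section), and the function names are hypothetical; this is not the code used to build the modeling grid.

```python
import math

# Assumption: spherical Earth radius in meters; not given in this section.
EARTH_RADIUS_M = 6_370_000.0


def lcc_cone_constant(lat1_deg, lat2_deg):
    """Cone constant n for a Lambert Conformal Conic projection with two true latitudes."""
    p1, p2 = math.radians(lat1_deg), math.radians(lat2_deg)
    numerator = math.log(math.cos(p1) / math.cos(p2))
    denominator = math.log(
        math.tan(math.pi / 4 + p2 / 2) / math.tan(math.pi / 4 + p1 / 2)
    )
    return numerator / denominator


def lcc_forward(lon_deg, lat_deg, lon0_deg=-97.0, lat0_deg=40.0,
                lat1_deg=33.0, lat2_deg=45.0, radius_m=EARTH_RADIUS_M):
    """Project lon/lat (degrees) to x/y (meters) relative to the coordinate center."""
    n = lcc_cone_constant(lat1_deg, lat2_deg)
    p1 = math.radians(lat1_deg)
    f = math.cos(p1) * math.tan(math.pi / 4 + p1 / 2) ** n / n

    def rho(lat):
        # Distance from the cone apex to the parallel at latitude `lat`.
        return radius_m * f / math.tan(math.pi / 4 + math.radians(lat) / 2) ** n

    theta = n * math.radians(lon_deg - lon0_deg)
    x = rho(lat_deg) * math.sin(theta)
    y = rho(lat0_deg) - rho(lat_deg) * math.cos(theta)
    return x, y


if __name__ == "__main__":
    print("cone constant:", round(lcc_cone_constant(33.0, 45.0), 4))
    print("center maps to:", lcc_forward(-97.0, 40.0))
    print("one degree east of center:", lcc_forward(-96.0, 40.0))
```

With true latitudes of 33 and 45 degrees north the cone constant is about 0.63, and the coordinate center (97 W, 40 N) maps to the origin by construction.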

Table 4-2. Vertical layer structure for 2020 CMAQ simulation (heights are layer top).

| Vertical Layers | Sigma P | Pressure (mb) | Approximate Height (m) |
|---|---|---|---|
| 35 | 0.0000 | 50.00 | 17,556 |
| 34 | 0.0500 | 97.50 | 14,780 |
| 33 | 0.1000 | 145.00 | 12,822 |
| 32 | 0.1500 | 192.50 | 11,282 |
| 31 | 0.2000 | 240.00 | 10,002 |
| 30 | 0.2500 | 287.50 | 8,901 |
| 29 | 0.3000 | 335.00 | 7,932 |
| 28 | 0.3500 | 382.50 | 7,064 |
| 27 | 0.4000 | 430.00 | 6,275 |
| 26 | 0.4500 | 477.50 | 5,553 |
| 25 | 0.5000 | 525.00 | 4,885 |
| 24 | 0.5500 | 572.50 | 4,264 |
| 23 | 0.6000 | 620.00 | 3,683 |
| 22 | 0.6500 | 667.50 | 3,136 |
| 21 | 0.7000 | 715.00 | 2,619 |
| 20 | 0.7400 | 753.00 | 2,226 |
| 19 | 0.7700 | 781.50 | 1,941 |
| 18 | 0.8000 | 810.00 | 1,665 |
| 17 | 0.8200 | 829.00 | 1,485 |
| 16 | 0.8400 | 848.00 | 1,308 |
| 15 | 0.8600 | 867.00 | 1,134 |
| 14 | 0.8800 | 886.00 | 964 |
| 13 | 0.9000 | 905.00 | 797 |
| 12 | 0.9100 | 914.50 | 714 |
| 11 | 0.9200 | 924.00 | 632 |
| 10 | 0.9300 | 933.50 | 551 |
| 9 | 0.9400 | 943.00 | 470 |



-------
Vertical
Layers

Sigma P

Pressure
(mb)

Approximate
Height (m)

8

0.9500

952.50

390

7

0.9600

962.00

311

6

0.9700

971.50

232

5

0.9800

981.00

154

4

0.9850

985.75

115

3

0.9900

990.50

77

2

0.9950

995.25

38

1

0.9975

997.63

19

0

1.0000

1000.00

0
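The pressure column in Table 4-2 is consistent with a linear sigma-pressure relation between the 50 mb model top and a 1000 mb reference surface. The sketch below reproduces a few rows under that assumption; the function name and the reference surface pressure are illustrative choices inferred from the table, not values stated elsewhere in the report.

```python
P_TOP_MB = 50.0     # model top pressure (Table 4-1)
P_SURF_MB = 1000.0  # reference surface pressure implied by layer 0 (assumption)


def sigma_to_pressure(sigma, p_top=P_TOP_MB, p_surf=P_SURF_MB):
    """Convert a sigma level to pressure (mb) with a linear sigma-pressure relation."""
    return p_top + sigma * (p_surf - p_top)


if __name__ == "__main__":
    # Spot-check a few sigma levels listed in Table 4-2.
    for sigma in (0.0, 0.05, 0.74, 0.985, 1.0):
        print(f"sigma {sigma:.4f} -> {sigma_to_pressure(sigma):.2f} mb")
```

For example, sigma 0.7400 gives 50 + 0.74 x 950 = 753 mb, which matches the layer 20 entry.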

Figure 4-1. Map of the 2020 CMAQ Modeling Domain. The blue box denotes the 12-km national
modeling domain.



-------
4.2.3 Modeling Period / Ozone Episodes

The 12-km CMAQ modeling domain was modeled for the entire year of 2020. The annual simulation
included a spin-up period, consisting of the 10 days before the beginning of the simulation, to mitigate the
effects of initial concentrations. All 365 model days were used to calculate the annual average levels of PM2.5. For
8-hour ozone, we used modeling results from the period between May 1 and September 30. This 153-day
period generally conforms to the ozone season across most parts of the U.S. and contains the majority
of days with high observed ozone concentrations.
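The date arithmetic behind the spin-up window and the 153-day ozone period can be checked with a short sketch. The January 1, 2020 start date is an assumption drawn from the statement that the entire year was modeled, and the variable names are illustrative.

```python
from datetime import date, timedelta

SIM_START = date(2020, 1, 1)  # assumed first day of the annual simulation
SPINUP_DAYS = 10              # spin-up length stated in the text
OZONE_START = date(2020, 5, 1)
OZONE_END = date(2020, 9, 30)

# First day of the spin-up period preceding the annual simulation.
spinup_start = SIM_START - timedelta(days=SPINUP_DAYS)

# Number of days in the ozone analysis window, counting both end dates.
ozone_days = (OZONE_END - OZONE_START).days + 1

if __name__ == "__main__":
    print(f"Spin-up begins: {spinup_start.isoformat()}")
    print(f"Ozone analysis days: {ozone_days}")
```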

4.2.4 Model Inputs: Emissions, Meteorology and Boundary Conditions

2020 Emissions: The emissions inventories used in the 2020 air quality modeling are described in Section
3, above.

2020 Meteorological Input Data: The gridded meteorological data for the entire year of 2020 at the 12-km
continental United States scale domain were derived from the publicly available version 4.1.1 of the
Weather Research and Forecasting Model (WRF), Advanced Research WRF (ARW) core.20 The WRF
Model is a state-of-the-science mesoscale numerical weather prediction system developed for both
operational forecasting and atmospheric research applications (http://wrf-model.org). The 12US WRF
model was initialized using the 12-km North American Model (12NAM)21 analysis product provided by
the National Climatic Data Center (NCDC). Where 12NAM data were unavailable, the 40-km Eta Data
Assimilation System (EDAS) analysis (ds609.2) from the National Center for Atmospheric Research
(NCAR) was used. Analysis nudging for temperature, wind, and moisture was applied above the
boundary layer only. The model simulations were conducted continuously. The 'ipxwrf' program was
used to initialize deep soil moisture at the start of the run using a 10-day spin-up period. The 2020 WRF
meteorological simulation was based on the 2011 National Land Cover Database (NLCD).22 The WRF
simulation included the physics options of the Pleim-Xiu land surface model (LSM), the Asymmetric
Convective Model version 2 planetary boundary layer (PBL) scheme, Morrison double moment
microphysics, the Kain-Fritsch cumulus parameterization scheme utilizing the moisture-advection trigger,23
and the RRTMG long-wave and shortwave radiation (LWR/SWR) scheme.24 In addition, the Group for
High Resolution Sea Surface Temperatures (GHRSST)25,26 1-km SST data were used for SST information
to provide more resolved information compared to the coarser data in the NAM analysis.

20	Skamarock, W.C., Klemp, J.B., Dudhia, J., Gill, D.O., Barker, D.M., Duda, M.G., Huang, X., Wang, W., Powers, J.G., 2008.
A Description of the Advanced Research WRF Version 3.

21	North American Model Analysis-Only, http://nomads.ncdc.noaa.gov/data.php; download from
ftp://nomads.ncdc.noaa.gov/NAM/analysis_only/.

22	National Land Cover Database 2011, http://www.mrlc.gov/nlcd2011.php.

23	Ma, L-M. and Tan, Z-M, 2009. Improving the behavior of the Cumulus Parameterization for Tropical Cyclone Prediction:
Convection Trigger. Atmospheric Research 92 Issue 2, 190-211.
http://www.sciencedirect.com/science/article/pii/S0169809508002585.

24	Gilliam, R.C., Pleim, J.E., 2010. Performance Assessment of New Land Surface and Planetary Boundary Layer Physics in the
WRF-ARW. Journal of Applied Meteorology and Climatology 49, 760-774.

25	Stammer, D., F.J. Wentz, and C.L. Gentemann, 2003, Validation of Microwave Sea Surface Temperature Measurements for
Climate Purposes, J. Climate, 16, 73-87.

26	Global High-Resolution SST (GHRSST) analysis, https://www.ghrsst.org/.



-------
Additionally, the hybrid-vertical coordinate system was employed, where the model is terrain-following (Eta)
near the surface and isobaric aloft, reducing the influence of surface features on upper-level dynamics.

2020 Initial and Boundary Conditions: The 2020 annual lateral boundary and initial species
concentrations were provided by a global 3-D GEOS-Chem (Goddard Earth Observing System)
v14.0.1 simulation (of the National Aeronautics and Space Administration (NASA) Global Modeling
and Assimilation Office) utilizing standard options and full atmospheric chemistry.27 The GEOS-Chem
simulation was performed at 2 x 2.5-degree horizontal resolution with a 72-layer vertical structure (36
layers in the troposphere, 120-meter first layer). The simulation used full chemistry with online
stratospheric chemistry, a non-local planetary boundary layer, simple secondary organic aerosols, and
updated methane, lightning, and other parameters for 2020. Emissions included the online Model of
Emissions of Gases and Aerosols from Nature (MEGAN) version 2.128, the online DUST module, and the
online sea salt module. Global Fire Emissions Database (GFED)29 emissions were monthly means.
Anthropogenic emissions included fugitive, combustion, and industrial dust.30 Marine emissions were
based on the Community Emissions Data System (CEDS) version 2, including shipping vessels.31 Aircraft
emissions used Aircraft Emissions Inventory Code (AEIC)32 monthly aircraft input data. In
addition, CEDS and AEIC emissions were scaled using the Covid-19 adjustmeNt Factors fOR eMissions
(CONFORM) dataset.33 Meteorology used in this 2020 GEOS-Chem run was from the Modern-Era
Retrospective analysis for Research and Applications, version 2 (MERRA2)34 at 2 x 2.5-degree resolution.

4.3 CMAQ Model Performance Evaluation

An operational model performance evaluation for ozone and PM2.5 and its related speciated components
was conducted for the 2020 simulation using state/local monitoring sites data in order to estimate the
ability of the CMAQ modeling system to replicate the 2020 base year concentrations for the 12-km
continental U.S. domain.

There are various statistical metrics available and used by the science community for model performance
evaluation. For a robust evaluation, the principal evaluation statistics used to evaluate CMAQ

27	GEOS-Chem, https://geoschem.github.io/index.html

28	Guenther, A.B., Jiang, X., Heald, C.L., Sakulyanontvittaya, T., Duhl, T., Emmons, L.K., and Wang, X. The Model of
Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): an extended and updated framework for modeling
biogenic emissions, 2012, GMD, Volume 5, Issue 6, 1471-1492.

29	https://www.globalfiredata.org/

30 Philip, S., Martin, R.V., Snider, G., Weagle, C.L., van Donkelaar, A., Brauer, M., Henze, D.K., Klimont, Z., Venkataraman,
C., Guttikunda, S.K., and Zhang, Q., April 2017. "Anthropogenic fugitive, combustion and industrial dust is a significant,
underrepresented fine particulate matter source in global atmospheric models." Environmental Research Letters; Bristol, Vol.
12, Iss. 4. Doi: 10.1088/1748-9326/aa65a4.

31 A Community Emissions Data System (CEDS) for Historical emissions, https://www.pnnl.gov/projects/ceds

32 Simone, N.W., Stettler, M.E.J., Barrett, S.R.H., 2013. Rapid estimation of global civil aviation emissions with uncertainty
quantification, Transportation Research Part D: Transport and Environment, Volume 25, 33-41, ISSN 1361-9209,
https://doi.org/10.1016/j.trd.2013.07.001.

33 Doumbia, T., Granier, C., Elguindi, N., Bouarar, I., Darras, S., Brasseur, G., Gaubert, B., Liu, Y., Shi, X., Stavrakou, T.,
Tilmes, S., Lacey, F., Deroubaix, A., and Wang, T., 2021: Changes in global air pollutant emissions during the COVID-19
pandemic: a dataset for atmospheric modeling, Earth Syst. Sci. Data, 13, 4191-4206,
https://doi.org/10.5194/essd-13-4191-2021.

34 Global Modeling and Assimilation Office (GMAO). Inst3_3d_asm_Cp; MERRA-2 IAU State Meteorology Instantaneous 3-
hourly (p-coord, 0.625x0.5L42), version 5.12.4, Greenbelt, MD, USA: Goddard Space Flight Center (GSFC DAAC), 2015.
Doi: 10.5067/VJAFPL1CSIV.



-------
performance were two bias metrics, mean bias and normalized mean bias; and two error metrics, mean
error and normalized mean error.

Mean bias (MB) is the sum of the differences (predicted - observed) divided by the total number of
replicates (n). Mean bias is defined as:

$$MB = \frac{1}{n} \sum_{1}^{n} (P - O)$$

where P = predicted and O = observed concentrations.

Mean error (ME) is the sum of the absolute values of the differences (predicted - observed) divided by the total
number of replicates (n). Mean error is defined as:

$$ME = \frac{1}{n} \sum_{1}^{n} |P - O|$$

Normalized mean bias (NMB) normalizes the bias so that performance can be compared across a range of
concentration magnitudes. This statistic divides the sum of the differences (model - observed) by the sum of
observed values. NMB is a useful model performance indicator because it avoids overinflating the observed
range of values, especially at low concentrations. Normalized mean bias is defined as:

$$NMB = \frac{\sum_{1}^{n} (P - O)}{\sum_{1}^{n} (O)} \times 100$$

where P = predicted concentrations and O = observed concentrations.

Normalized mean error (NME) is similar to NMB, except that it normalizes the mean error. NME divides the sum of
the absolute values of the differences (model - observed) by the sum of observed values. Normalized mean error
is defined as:

$$NME = \frac{\sum_{1}^{n} |P - O|}{\sum_{1}^{n} (O)} \times 100$$
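The four statistics defined above can be sketched in a few lines of Python. This is a minimal illustration of the formulas applied to paired lists of predicted and observed values, not the evaluation software used for this report; the function name and the sample values are hypothetical.

```python
def performance_stats(predicted, observed):
    """Return MB, ME, NMB (%) and NME (%) for paired predicted/observed values."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal in length")

    n = len(observed)
    diffs = [p - o for p, o in zip(predicted, observed)]
    sum_diff = sum(diffs)
    sum_abs_diff = sum(abs(d) for d in diffs)
    sum_obs = sum(observed)

    return {
        "MB": sum_diff / n,                     # mean bias
        "ME": sum_abs_diff / n,                 # mean error
        "NMB": 100.0 * sum_diff / sum_obs,      # normalized mean bias, percent
        "NME": 100.0 * sum_abs_diff / sum_obs,  # normalized mean error, percent
    }


if __name__ == "__main__":
    # Hypothetical 8-hour daily maximum ozone pairs in ppb.
    predicted = [62.0, 70.0, 55.0, 68.0]
    observed = [60.0, 72.0, 50.0, 65.0]
    for name, value in performance_stats(predicted, observed).items():
        print(f"{name}: {value:.2f}")
```

Because NMB and NME divide by the sum of observations rather than averaging individual ratios, a single very low observed value does not dominate the result.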

The performance statistics were calculated using predicted and observed data that were paired in time and
space on an 8-hour basis. Statistics were generated for each of the nine National Oceanic and
Atmospheric Administration (NOAA) climate regions35 of the 12-km U.S. modeling domain (Figure 4-2).
The regions include the Northeast, Ohio Valley, Upper Midwest, Southeast, South, Southwest, Northern
Rockies, Northwest, and West,36,37 as originally identified in Karl and Koss (1984).38

35	NOAA, National Centers for Environmental Information scientists have identified nine climatically consistent regions within
the contiguous U.S., http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php.

36	The nine climate regions are defined by States where: Northeast includes CT, DE, ME, MA, MD, NH, NJ, NY, PA, RI, and
VT; Ohio Valley includes IL, IN, KY, MO, OH, TN, and WV; Upper Midwest includes IA, MI, MN, and WI; Southeast
includes AL, FL, GA, NC, SC, and VA; South includes AR, KS, LA, MS, OK, and TX; Southwest includes AZ, CO, NM, and
UT; Northern Rockies includes MT, NE, ND, SD, WY; Northwest includes ID, OR, and WA; and West includes CA and NV.

37	Note most monitoring sites in the West region are located in California (see Figure 4-2), therefore statistics for the West will
be mostly representative of California ozone air quality.

38	Karl, T. R. and Koss, W. J., 1984: "Regional and National Monthly, Seasonal, and Annual Temperature Weighted by Area,
1895-1983." Historical Climatology Series 4-3, National Climatic Data Center, Asheville, NC, 38 pp.
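The state lists in footnote 36 map naturally onto a lookup table. The sketch below is one possible encoding, with hypothetical names, that could be used to assign a monitoring site to a climate region from its state abbreviation.

```python
# Region membership copied from footnote 36; the names below are illustrative.
CLIMATE_REGIONS = {
    "Northeast": ["CT", "DE", "ME", "MA", "MD", "NH", "NJ", "NY", "PA", "RI", "VT"],
    "Ohio Valley": ["IL", "IN", "KY", "MO", "OH", "TN", "WV"],
    "Upper Midwest": ["IA", "MI", "MN", "WI"],
    "Southeast": ["AL", "FL", "GA", "NC", "SC", "VA"],
    "South": ["AR", "KS", "LA", "MS", "OK", "TX"],
    "Southwest": ["AZ", "CO", "NM", "UT"],
    "Northern Rockies": ["MT", "NE", "ND", "SD", "WY"],
    "Northwest": ["ID", "OR", "WA"],
    "West": ["CA", "NV"],
}

# Invert the mapping so each state abbreviation points to its region.
STATE_TO_REGION = {
    state: region
    for region, states in CLIMATE_REGIONS.items()
    for state in states
}


def region_for_state(abbrev):
    """Return the NOAA climate region for a two-letter state abbreviation."""
    return STATE_TO_REGION[abbrev.upper()]


if __name__ == "__main__":
    print(region_for_state("NC"))
    print(region_for_state("ca"))
```

The nine lists together cover the 48 contiguous states, each assigned to exactly one region.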



-------
U.S. Climate Regions

Figure 4-2. NOAA Nine Climate Regions (source:
http://www.ncdc.noaa.gov/monitoring-references/maps/us-climate-regions.php#references)

In addition to the performance statistics, regional maps which show the MB, ME, NMB, and NME were
prepared for the ozone season, May through September, at individual monitoring sites as well as on an
annual basis for PM2.5 and its component species.

Evaluation for 8-hour Daily Maximum Ozone: The operational model performance evaluation for eight-
hour daily maximum ozone was conducted using the statistics defined above. Ozone measurements in the
continental U.S. were included in the evaluation and were taken from the 2020 state/local monitoring site
data in AQS and the Clean Air Status and Trends Network (CASTNet).

The 8-hour ozone model performance bias and error statistics for each of the nine NOAA climate regions
and each season are provided in Table 4-4. Seasons were defined as: winter (December-January-February),
spring (March-April-May), summer (June-July-August), and fall (September-October-November).
In some instances, observational data were excluded from the analysis and model evaluation
based on a completeness criterion of 75 percent. Spatial plots of the MB, ME, NMB and NME for
individual monitors are shown in Figures 4-3 through 4-6, respectively. The statistics shown in these
figures were calculated over the ozone season, April through September, using data pairs on days with
observed 8-hour ozone of greater than or equal to 60 ppb.
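The season definitions, the 75 percent completeness criterion, and the 60 ppb cutoff can be sketched as small helper functions. How completeness was counted for this evaluation is not described here, so treating it as the fraction of valid days out of expected days is an assumption, and all names are hypothetical.

```python
# Month-to-season mapping as defined in the text.
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Fall", 10: "Fall", 11: "Fall",
}


def season_of(month):
    """Return the season name for a month number (1-12)."""
    return SEASON_BY_MONTH[month]


def is_complete(n_valid, n_expected, threshold=0.75):
    """Assumed completeness test: fraction of valid days meets the threshold."""
    return n_expected > 0 and n_valid / n_expected >= threshold


def high_ozone_pairs(pairs, cutoff_ppb=60.0):
    """Keep (observed, predicted) pairs whose observed value is >= cutoff_ppb."""
    return [(obs, pred) for obs, pred in pairs if obs >= cutoff_ppb]


if __name__ == "__main__":
    print(season_of(7))
    print(is_complete(n_valid=69, n_expected=92))
    print(high_ozone_pairs([(60.0, 58.0), (59.9, 70.0), (75.0, 71.0)]))
```

Note that the cutoff is applied to the observed value, so a day with a high prediction but a low observation is excluded from the high-ozone statistics.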

In general, the model performance statistics indicate that the 8-hour daily maximum ozone concentrations
predicted by the 2020 CMAQ simulation closely reflect the corresponding 8-hour observed ozone
concentrations in space and time in each subregion of the 12-km modeling domain. As indicated by the
statistics in Table 4-4, bias and error for 8-hour daily maximum ozone are relatively low in each
subregion, not only in the summer when concentrations are highest, but also during other times of the year.
Generally, 8-hour ozone at the AQS and CASTNet sites in the summer is over predicted in all climate regions
(NMB ranging from 0.0 to 25.6 percent) except in the Southwest and, at CASTNet sites only, in the Northern
Rockies, West and Northwest, where there is a slight under prediction. Likewise, 8-hour ozone at the AQS



-------
and CASTNet sites in the fall is typically over predicted across the contiguous U.S. (NMB ranging
from 0.0 to 21.9 percent) except in the West and, at CASTNet sites only, in the Southwest. In the winter,
8-hour ozone is over predicted in all climate regions at AQS and CASTNet sites
(NMB ranging from 0.3 to 20.2 percent). In the spring, 8-hour ozone concentrations are over
predicted at AQS and CASTNet sites in all NOAA climate regions (with NMBs less than approximately
20 percent in each subregion) except at AQS sites in the Southwest, Northwest and West (slight under
prediction, with NMB ranging between -0.8 and -3.9 percent) and at CASTNet sites in the Northeast,
Southwest, Northern Rockies, Northwest, and West (NMB ranging between -0.3 and -5.7 percent).

Model bias at individual sites during the ozone season is similar to that seen on a subregional basis for the
summer. Figure 4-3 shows that the mean bias for 8-hour daily maximum ozone greater than 60 ppb is
generally within ±15 ppb across the AQS and CASTNet sites. Likewise, the information in Figure 4-5 indicates
that the normalized mean bias for days with observed 8-hour daily maximum ozone greater than 60 ppb is
within ±20 percent at the vast majority of monitoring sites across the U.S. domain. Model error, as seen
in Figures 4-4 and 4-6, is generally 2 to 16 ppb and 30 percent or less at most of the sites across the
U.S. modeling domain. Somewhat greater error is evident at sites in several areas, most notably in central
California, the Northern Rockies, the Upper Midwest, and the Southeast.

Table 4-4. Summary of CMAQ 2020 8-Hour Daily Maximum Ozone Model Performance Statistics
by NOAA climate region, by Season and Monitoring Network.

Climate region    Monitor  Season   No. of     MB     ME     NMB     NME
                  Network              Obs  (ppb)  (ppb)     (%)     (%)
Northeast         AQS      Winter   11,255    3.6    4.9    11.3    15.4
Northeast         AQS      Spring   16,442    0.8    4.2     2.0    10.1
Northeast         AQS      Summer   16,412    4.6    6.4    10.9    15.1
Northeast         AQS      Fall     13,609    4.4    6.1    13.7    18.9
Northeast         CASTNet  Winter    1,240    2.5    3.8     7.4    11.1
Northeast         CASTNet  Spring    1,267   -0.1    4.0    -0.3     9.2
Northeast         CASTNet  Summer    1,234    3.6    5.6     8.8    13.7
Northeast         CASTNet  Fall      1,241    3.5    5.6    10.5    16.8
Ohio Valley       AQS      Winter    5,808    5.9    6.8    20.2    23.1
Ohio Valley       AQS      Spring   20,625    2.8    5.0     6.8    12.4
Ohio Valley       AQS      Summer   20,549    4.9    7.0    10.9    15.6
Ohio Valley       AQS      Fall     15,292    5.8    6.7    17.5    20.4
Ohio Valley       CASTNet  Winter    1,582    4.3    5.7    13.4    17.6
Ohio Valley       CASTNet  Spring    1,630    1.1    4.5     2.6    10.8
Ohio Valley       CASTNet  Summer    1,635    4.0    6.6     9.3    15.5
Ohio Valley       CASTNet  Fall      1,602    3.7    5.8    11.0    17.2
Upper Midwest     AQS      Winter    1,829    4.5    5.2    13.8    15.8
Upper Midwest     AQS      Spring    8,092    2.2    5.1     5.6    12.8
Upper Midwest     AQS      Summer    8,726    2.2    5.9     5.0    13.6
Upper Midwest     AQS      Fall      6,245    4.7    6.0    15.0    18.8
Upper Midwest     CASTNet  Winter      445    3.5    4.3    10.5    12.8
Upper Midwest     CASTNet  Spring      452    0.2    4.3     0.5    10.4
Upper Midwest     CASTNet  Summer      451    0.9    4.7     2.2    11.7
Upper Midwest     CASTNet  Fall        448    3.4    5.4    10.9    17.3
Southeast         AQS      Winter    7,187    2.8    4.3     7.7    12.0
Southeast         AQS      Spring   15,229    2.3    4.8     5.5    11.4
Southeast         AQS      Summer   14,850    8.9    9.5    25.6    27.3
Southeast         AQS      Fall     11,938    7.0    7.6    21.9    23.7
Southeast         CASTNet  Winter    1,049    1.5    4.1     4.2    11.3
Southeast         CASTNet  Spring    1,092    0.2    4.5     0.4    10.4
Southeast         CASTNet  Summer    1,055    6.7    8.4    18.6    23.3
Southeast         CASTNet  Fall      1,077    4.1    5.8    12.2    17.1
South             AQS      Winter   10,415    3.5    5.7    10.7    17.4
South             AQS      Spring   12,445    3.5    6.1     8.4    14.7
South             AQS      Summer   12,307    6.8    8.8    17.1    22.3
South             AQS      Fall     11,773    5.3    6.9    14.9    19.3
South             CASTNet  Winter      520    2.7    4.8     8.0    13.9
South             CASTNet  Spring      481    1.4    5.3     3.4    12.5
South             CASTNet  Summer      511    5.1    8.1    13.0    20.5
South             CASTNet  Fall        525    4.3    6.0    12.1    17.0
Southwest         AQS      Winter   10,182    2.1    4.7     5.4    12.2
Southwest         AQS      Spring   10,884   -1.9    4.9    -3.9     9.8
Southwest         AQS      Summer   11,039   -2.2    5.5    -4.0    10.2
Southwest         AQS      Fall     10,736    0.0    4.9     0.0    10.9
Southwest         CASTNet  Winter      979    0.5    3.3     8.0     1.3
Southwest         CASTNet  Spring      977   -1.9    3.7    -3.8     7.4
Southwest         CASTNet  Summer      990   -0.8    4.2    -1.6     8.0
Southwest         CASTNet  Fall        920   -0.4    3.9    -0.8     8.5
Northern Rockies  AQS      Winter    4,383    2.9    4.5     7.6    12.0
Northern Rockies  AQS      Spring    4,876    0.4    5.1     0.8    11.7
Northern Rockies  AQS      Summer    4,672    0.3    4.7     0.6    10.2
Northern Rockies  AQS      Fall      4,428    1.7    4.7     4.7    12.7
Northern Rockies  CASTNet  Winter      608    1.5    3.7     3.6     9.2
Northern Rockies  CASTNet  Spring      629   -2.1    4.5    -4.4     9.6
Northern Rockies  CASTNet  Summer      625   -0.1    4.1    -0.1     8.6
Northern Rockies  CASTNet  Fall        578    1.3    4.3     3.3    10.5
Northwest         AQS      Winter      669    1.8    4.8     5.6    14.6
Northwest         AQS      Spring    1,319   -0.3    4.7    -0.8    12.0
Northwest         AQS      Summer    2,409    0.9    4.2     2.5    11.4
Northwest         AQS      Fall      1,129    4.0    6.5    11.7    18.9
Northwest         CASTNet  Winter      201    3.5    4.4     9.9    12.3
Northwest         CASTNet  Spring      182   -0.3    3.7    -0.6     8.9
Northwest         CASTNet  Summer      182   -0.9    4.1    -2.1     9.4
Northwest         CASTNet  Fall        202    2.7    5.3     7.4    14.5
West              AQS      Winter   14,257    1.5    4.9     4.3    13.9
West              AQS      Spring   16,605   -0.4    4.9    -0.9    11.0
West              AQS      Summer   17,005    0.0    6.9     0.1    13.8
West              AQS      Fall     15,610   -0.9    7.5    -2.0    15.9
West              CASTNet  Winter      614    0.1    3.9     0.3     9.4
West              CASTNet  Spring      631   -2.7    5.1    -5.7    10.8
West              CASTNet  Summer      638   -5.7    7.8    -9.9    13.7
West              CASTNet  Fall        615   -3.3    5.9    -6.7    12.0

-------
[Map: O3_8hrmax MB (ppb) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200401 to 20200930; sites: CASTNET Daily, AQS Daily]

Figure 4-3. Mean Bias (ppb) of 8-hour daily maximum ozone greater than 60 ppb over the period
April-September 2020 at AQS and CASTNet monitoring sites in the continental U.S. modeling
domain.

[Map: O3_8hrmax ME (ppb) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200401 to 20200930; sites: CASTNET Daily, AQS Daily]

Figure 4-4. Mean Error (ppb) of 8-hour daily maximum ozone greater than 60 ppb over the period
April-September 2020 at AQS and CASTNet monitoring sites in the continental U.S. modeling
domain.


-------
[Map: O3_8hrmax NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200401 to 20200930; sites: CASTNET Daily, AQS Daily]

Figure 4-5. Normalized Mean Bias (%) of 8-hour daily maximum ozone greater than 60 ppb over
the period April-September 2020 at AQS and CASTNet monitoring sites in the continental U.S.
modeling domain.

[Map: O3_8hrmax NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200401 to 20200930; sites: CASTNET Daily, AQS Daily]

Figure 4-6. Normalized Mean Error (%) of 8-hour daily maximum ozone greater than 60 ppb over
the period April-September 2020 at AQS and CASTNet monitoring sites in the continental U.S.
modeling domain.


-------
Evaluation for Annual PM2.5 components: The PM evaluation focuses on PM2.5 components including
sulfate (SO4), nitrate (NO3), total nitrate (TNO3 = NO3 + HNO3), ammonium (NH4), elemental carbon
(EC), and organic carbon (OC). The bias and error performance statistics were calculated on an annual
basis for each of the nine NOAA climate subregions defined above (provided in Table 4-5). PM2.5
measurements for 2020 were obtained from the following networks for model evaluation: Chemical
Speciation Network (CSN, 24-hour average), Interagency Monitoring of Protected Visual Environments
(IMPROVE, 24-hour average), and Clean Air Status and Trends Network (CASTNet, weekly average).
For PM2.5 species that are measured by more than one network, we calculated separate sets of statistics for
each network by subregion. In addition to the tabular summaries of bias and error statistics, annual spatial
maps which show the mean bias, mean error, normalized mean bias and normalized mean error by site for
each PM2.5 species are provided in Figures 4-7 through 4-30.
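Because the networks report on different averaging times, model output has to be put on the same basis before it is paired with observations. The sketch below illustrates two such steps: forming total nitrate from its components, and averaging seven daily model values for comparison with a weekly CASTNet sample. The pairing procedure actually used is not described in this section, so both helpers and their names are assumptions for illustration, and both inputs to the sum are assumed to be in the same mass units.

```python
def total_nitrate(no3, hno3):
    """TNO3 = NO3 + HNO3, with both inputs assumed to share the same units."""
    return no3 + hno3


def weekly_mean(daily_values):
    """Average seven daily model values for comparison with a weekly sample."""
    if len(daily_values) != 7:
        raise ValueError("expected exactly seven daily values")
    return sum(daily_values) / 7


if __name__ == "__main__":
    # Hypothetical daily modeled concentrations in ug/m3 for one week.
    daily_no3 = [0.8, 0.6, 0.9, 1.1, 0.7, 0.5, 1.0]
    daily_hno3 = [0.4, 0.3, 0.5, 0.4, 0.2, 0.3, 0.4]
    daily_tno3 = [total_nitrate(n, h) for n, h in zip(daily_no3, daily_hno3)]
    print(f"Weekly mean TNO3: {weekly_mean(daily_tno3):.2f} ug/m3")
```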

As indicated by the statistics in Table 4-5, annual average sulfate is consistently under predicted at
CASTNet, IMPROVE, and CSN monitoring sites across the 12-km modeling domain (with MB values
ranging from -0.0 to -0.5 µg/m3) except at IMPROVE and CSN sites in the Northwest (over prediction of
0.1 and 0.4 µg/m3, respectively). Sulfate performance shows moderate error in the eastern subregions
(approximately 30-50 percent on average) while Western subregions show slightly larger error (ranging
from 30 to 80 percent). Figures 4-7 through 4-10 suggest that spatial patterns vary by region. Sulfate
is under predicted in most of the Northeast, Southeast, Ohio Valley, and Southwest states, with model
bias within ±40 percent. The model bias appears to be greater in the Northwest, with over predictions of up to
approximately 60-80 percent at individual monitors. Model error also shows a spatial trend by region:
error is 30 to 50 percent in much of the Eastern states and 40 to 100 percent in the Western and Central
U.S. states.

Annual average nitrate is under predicted at the rural IMPROVE monitoring sites in all NOAA climate
subregions (NMB averaging -40 percent), except in the Northeast, Ohio Valley, Southeast and
Northwest where nitrate is over predicted (between 4 and 83 percent). At CSN urban sites, annual average
nitrate is under predicted in all subregions, except in the Northeast (29.7 percent), Southeast (69.9
percent) and Northwest (64.4 percent) where nitrate is over predicted. Likewise, model performance of
total nitrate at sub-urban CASTNet monitoring sites shows an under prediction in all subregions (NMB in
the range of -10.4 to -53.3 percent), except in the Northeast (21.7 percent), Ohio Valley (3.2 percent) and
Northwest (46.7 percent). Model error for nitrate and total nitrate is somewhat greater in each of the nine
NOAA climate subregions as compared to sulfate. Model bias at individual sites indicates over prediction
of greater than 10 percent at monitoring sites along the upper Northeast and Northwest coastlines as well
as in the South and Southeast, as indicated in Figure 4-13. The exception to this is in the Southwest,
Northern Rockies and Western U.S. portions of the modeling domain, where there appears to be a greater
number of sites with under prediction of nitrate of 10 to 80 percent.

Annual average ammonium model performance, as indicated in Table 4-5, shows a tendency for the model to
under predict across CASTNet sites (ranging from -18 to -72 percent). Ammonium performance across
the urban CSN sites shows an under prediction in all NOAA climate subregions (ranging from -4.4 to
-66.8 percent), except for over predictions in the Northeast (19.5 percent), Upper Midwest (3.5 percent),
South (4.0 percent), and Northwest (41.7 percent). The spatial variation of ammonium across the majority of
individual monitoring sites in the Eastern U.S. shows bias within ±50 percent (Figures 4-19 and 4-21). A
larger bias is seen in the Northeast and in the Northern Rockies (over prediction bias of 80 to
100 percent on average). The urban monitoring sites exhibit slightly larger errors than rural sites for ammonium.



-------
Annual average elemental carbon is under predicted in all of the nine climate regions at urban and rural
sites (biases between -19.8 and -53.8 percent) except at urban Northwest sites (over prediction of 10.8
percent). There is not a large variation in error statistics from subregion to subregion or at urban versus
rural sites.

Similar to elemental carbon, annual average organic carbon is under predicted in all of the nine climate
regions at urban and rural sites (biases between -4.7 and -67.2 percent) except at urban Northwest sites (over
prediction of 36.5 percent). Likewise, model error does not show a large variation from
subregion to subregion or at urban versus rural sites.



-------
Table 4-5. Summary of CMAQ 2020 Annual PM Species Model Performance Statistics by NOAA
Climate region, by Monitoring Network.

Pollutant                 Monitor  Subregion          No. of       MB       ME     NMB     NME
                          Network                        Obs  (µg/m3)  (µg/m3)     (%)     (%)
Sulfate                   CSN      Northeast           2,666     -0.3      0.4   -36.9    46.9
Sulfate                   CSN      Ohio Valley         1,852     -0.4      0.5   -36.1    42.8
Sulfate                   CSN      Upper Midwest       1,009     -0.3      0.4   -32.8    44.8
Sulfate                   CSN      Southeast           1,718     -0.4      0.4   -31.1    44.8
Sulfate                   CSN      South               1,203     -0.4      0.5   -34.6    45.0
Sulfate                   CSN      Southwest           1,031     -0.3      0.3   -51.1    55.5
Sulfate                   CSN      Northern Rockies      647     -0.2      0.3   -28.2    51.0
Sulfate                   CSN      Northwest             556      0.4      0.5    72.8    >100
Sulfate                   CSN      West                1,074     -0.4      0.6   -38.1    56.1
Sulfate                   IMPROVE  Northeast           1,899     -0.3      0.3   -43.1    47.7
Sulfate                   IMPROVE  Ohio Valley           851     -0.4      0.4   -45.9    49.3
Sulfate                   IMPROVE  Upper Midwest         941     -0.2      0.3   -39.2    45.4
Sulfate                   IMPROVE  Southeast           1,466     -0.4      0.4   -42.5    50.2
Sulfate                   IMPROVE  South               1,082     -0.3      0.4   -41.0    48.0
Sulfate                   IMPROVE  Southwest           3,828     -0.2      0.2   -56.1    59.0
Sulfate                   IMPROVE  Northern Rockies    2,012     -0.1      0.2   -28.2    52.9
Sulfate                   IMPROVE  Northwest           1,867      0.1      0.3    38.8    >100
Sulfate                   IMPROVE  West                2,488     -0.2      0.4   -38.3    71.0
Sulfate                   CASTNet  Northeast             891     -0.4      0.4   -51.2    51.8
Sulfate                   CASTNet  Ohio Valley           894     -0.5      0.5   -49.5    49.5
Sulfate                   CASTNet  Upper Midwest         248     -0.4      0.4   -47.5    47.9
Sulfate                   CASTNet  Southeast             647     -0.5      0.5   -57.4    54.7
Sulfate                   CASTNet  South                 352     -0.5      0.5   -51.1    51.2
Sulfate                   CASTNet  Southwest             451     -0.3      0.3   -63.3    63.4
Sulfate                   CASTNet  Northern Rockies      544     -0.2      0.2   -50.2    52.1
Sulfate                   CASTNet  Northwest              56     -0.0      0.1   -16.0    39.6
Sulfate                   CASTNet  West                  298     -0.4      0.4   -60.6    66.0
Nitrate                   CSN      Northeast           2,665      0.2      0.5    29.7    66.1
Nitrate                   CSN      Ohio Valley         1,851     -0.0      0.5    -2.5    43.7
Nitrate                   CSN      Upper Midwest       1,008     -0.0      0.5    -1.1    38.4
Nitrate                   CSN      Southeast           1,720      0.3      0.4    69.9    >100
Nitrate                   CSN      South               1,200     -0.0      0.4    -5.7    67.5
Nitrate                   CSN      Southwest           1,032     -0.3      0.6   -40.5    71.9
Nitrate                   CSN      Northern Rockies      645     -0.2      0.4   -26.8    50.3
Nitrate                   CSN      Northwest             556      0.5      0.9    64.4    >100
Nitrate                   CSN      West                1,072     -1.1      1.4   -48.8    63.1
Nitrate                   IMPROVE  Northeast           1,899      0.3      0.4    83.1    >100
Nitrate                   IMPROVE  Ohio Valley           851      0.0      0.4     3.8    58.4
Nitrate                   IMPROVE  Upper Midwest         941     -0.0      0.4    -3.8    47.5
Nitrate                   IMPROVE  Southeast           1,466      0.1      0.3    44.2    >100
Nitrate                   IMPROVE  South               1,081     -0.1      0.3   -19.4    67.4
Nitrate                   IMPROVE  Southwest           3,826     -0.2      0.2   -69.6    83.4
Nitrate                   IMPROVE  Northern Rockies    2,011     -0.1      0.2   -30.9    73.9
Nitrate                   IMPROVE  Northwest           1,856      0.0     -0.0    15.2    >100
Nitrate                   IMPROVE  West                2,486     -0.2      0.3   -45.5    69.8
Total Nitrate (NO3+HNO3)  CASTNet  Northeast             891      0.2      0.4    21.7    35.8
Total Nitrate (NO3+HNO3)  CASTNet  Ohio Valley           894      0.0      0.4     3.2    24.5
Total Nitrate (NO3+HNO3)  CASTNet  Upper Midwest         248     -0.1      0.3    -4.3    20.5
Total Nitrate (NO3+HNO3)  CASTNet  Southeast             647     -0.0      0.4    -2.5    45.7
Total Nitrate (NO3+HNO3)  CASTNet  South                 352     -0.2      0.4   -14.8    29.7
Total Nitrate (NO3+HNO3)  CASTNet  Southwest             451     -0.2      0.3   -28.7    36.7
Total Nitrate (NO3+HNO3)  CASTNet  Northern Rockies      544     -0.1      0.2   -24.8    35.1
Total Nitrate (NO3+HNO3)  CASTNet  Northwest              56      0.1      0.2    46.7    56.0
Total Nitrate (NO3+HNO3)  CASTNet  West                  298     -0.5      0.5   -37.6    43.7
Ammonium                  CSN      Northeast           2,664      0.1      0.2    19.5    66.8
Ammonium                  CSN      Ohio Valley         1,851     -0.0      0.2    -3.9    45.6
Ammonium                  CSN      Upper Midwest       1,009      0.0      0.2     3.5    47.2
Ammonium                  CSN      Southeast           2,130     -0.1      0.2   -21.3    59.3
Ammonium                  CSN      South               1,718      0.0      0.2     4.0    80.2
Ammonium                  CSN      Southwest           1,203     -0.0      0.2   -12.5    62.2
Ammonium                  CSN      Northern Rockies      645     -0.0      0.2    -2.7    62.0
Ammonium                  CSN      Northwest             555      0.1      0.3    41.7    >100
Ammonium                  CSN      West                1,072     -0.4      0.5   -52.5    71.9
Ammonium                  CASTNet  Northeast             891     -0.1      0.2   -18.2    47.5
Ammonium                  CASTNet  Ohio Valley           894     -0.1      0.2   -29.3    40.3
Ammonium                  CASTNet  Upper Midwest         248     -0.1      0.2   -28.4    38.6
Ammonium                  CASTNet  Southeast             587     -0.1      0.1   -33.6    42.1
Ammonium                  CASTNet  South                 647     -0.1      0.2   -35.0    58.6
Ammonium                  CASTNet  Southwest             352     -0.1      0.2   -31.5    49.8
Ammonium                  CASTNet  Northern Rockies      544     -0.1      0.1   -58.3    61.1
Ammonium                  CASTNet  Northwest              56     -0.0      0.1   -28.3    50.0
Ammonium                  CASTNet  West                  298     -0.2      0.2   -71.9    79.6
Elemental Carbon          CSN      Northeast           2,614     -0.1      0.3   -19.8    44.7
Elemental Carbon          CSN      Ohio Valley           896     -0.1      0.1   -45.0    50.5
Elemental Carbon          CSN      Upper Midwest       1,090     -0.1      0.2   -22.6    45.4
Elemental Carbon          CSN      Southeast           1,572     -0.1      0.3   -40.9    53.3
Elemental Carbon          CSN      South               1,140     -0.2      0.3   -39.9    47.4
Elemental Carbon          CSN      Southwest           1,112     -0.2      0.3   -22.9    44.6
Elemental Carbon          CSN      Northern Rockies      565     -0.2      0.3   -53.8    62.4
Elemental Carbon          CSN      Northwest             538      0.1      0.7    10.8    68.0
Elemental Carbon          CSN      West                2,354     -0.2      0.2   -49.6    63.5
Elemental Carbon          IMPROVE  Northeast           1,777      0.0      0.1     0.0    51.7
Elemental Carbon          IMPROVE  Ohio Valley         1,786     -0.3      0.3   -40.2    47.8
Elemental Carbon          IMPROVE  Upper Midwest       1,041     -0.1      0.1   -41.5    52.8
Elemental Carbon          IMPROVE  Southeast           1,625     -0.3      0.3   -42.8    50.6
Elemental Carbon          IMPROVE  South               1,049     -0.1      0.1   -52.2    55.0
Elemental Carbon          IMPROVE  Southwest           3,537     -0.1      0.1   -43.6    58.2
Elemental Carbon          IMPROVE  Northern Rockies    2,076     -0.1      0.1   -49.5    65.5
Elemental Carbon          IMPROVE  Northwest           1,796     -0.2      0.3   -51.3    85.4
Elemental Carbon          IMPROVE  West                2,354     -0.2      0.2   -49.6    63.5
Organic Carbon            CSN      Northeast           2,614     -0.1      0.9    -4.7    51.4
Organic Carbon            CSN      Ohio Valley         1,786     -0.6      0.8   -29.4    40.8
Organic Carbon            CSN      Upper Midwest       1,041     -0.4      0.5   -44.6    54.0
Organic Carbon            CSN      Southeast           1,591     -0.3      0.7   -22.4    60.2
Organic Carbon            CSN      South               1,144     -0.7      0.9   -35.4    48.1
Organic Carbon            CSN      Southwest           1,014     -0.6      1.2   -30.4    57.6
Organic Carbon            CSN      Northern Rockies      564     -0.8      1.0   -57.9    65.8
Organic Carbon            CSN      Northwest             538      1.0      2.5    36.5    90.9
Organic Carbon            CSN      West                1,041     -1.9      2.3   -46.2    55.6
Organic Carbon            IMPROVE  Northeast           1,788     -0.2      0.5   -17.0    50.0
Organic Carbon            IMPROVE  Ohio Valley           899     -0.5      0.6   -38.7    49.8
Organic Carbon            IMPROVE  Upper Midwest       1,090     -0.4      0.7   -26.9    45.3
Organic Carbon            IMPROVE  Southeast           1,631     -0.3      0.8   -14.7    41.6
Organic Carbon            IMPROVE  South               1,062     -0.6      0.6   -52.7    56.2
Organic Carbon            IMPROVE  Southwest           3,824     -0.7      0.7   -67.2    72.0
Organic Carbon            IMPROVE  Northern Rockies    2,101     -0.6      0.7   -58.6    74.2
Organic Carbon            IMPROVE  Northwest           1,826     -0.7      1.4   -44.2    87.0
Organic Carbon            IMPROVE  West                2,397     -1.4      1.7   -64.0    77.6


-------
[Map: SO4 MB, units = ug/m3, coverage limit = 75%, 20200101 to 20201231; sites: IMPROVE, CSN, CASTNET Weekly]

Figure 4-7. Mean Bias (µg/m3) of annual sulfate at monitoring sites in the continental U.S. modeling
domain.

[Map: SO4 ME, units = ug/m3, coverage limit = 75%, 20200101 to 20201231; sites: IMPROVE, CSN, CASTNET Weekly]

Figure 4-8. Mean Error (µg/m3) of annual sulfate at monitoring sites in the continental U.S.
modeling domain.


-------
[Map: SO4 NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; sites: IMPROVE, CSN, CASTNET Weekly]

Figure 4-9. Normalized Mean Bias (%) of annual sulfate at monitoring sites in the continental U.S.
modeling domain.

[Map: SO4 NME, units = %, coverage limit = 75%, 20200101 to 20201231; color scale 10 to >100; sites: IMPROVE, CSN, CASTNET Weekly]

Figure 4-10. Normalized Mean Error (%) of annual sulfate at monitoring sites in the continental U.S.
modeling domain.


-------
[Map: NO3 MB (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; sites: IMPROVE, CSN]

Figure 4-11. Mean Bias (µg/m3) of annual nitrate at monitoring sites in the continental U.S. modeling
domain.

[Map: NO3 ME (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; sites: IMPROVE, CSN]

Figure 4-12. Mean Error (µg/m3) of annual nitrate at monitoring sites in the continental U.S. modeling
domain.


-------
[Map: NO3 NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; sites: IMPROVE, CSN]

Figure 4-13. Normalized Mean Bias (%) of annual nitrate at monitoring sites in the continental U.S.
modeling domain.

[Map: NO3 NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; sites: IMPROVE, CSN]

Figure 4-14. Normalized Mean Error (%) of annual nitrate at monitoring sites in the continental U.S.
modeling domain.


-------
[Figure panel: TNO3 MB (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; network: CASTNET Weekly]

Figure 4-15. Mean Bias (µg/m3) of annual total nitrate at monitoring sites in the continental U.S.
modeling domain.

[Figure panel: TNO3 ME (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; network: CASTNET Weekly]

Figure 4-16. Mean Error (µg/m3) of annual total nitrate at monitoring sites in the continental U.S.
modeling domain.


-------
[Figure panel: TNO3 NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; network: CASTNET Weekly]

Figure 4-17. Normalized Mean Bias (%) of annual total nitrate at monitoring sites in the continental U.S.
modeling domain.

[Figure panel: TNO3 NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; network: CASTNET Weekly]

Figure 4-18. Normalized Mean Error (%) of annual total nitrate at monitoring sites in the continental
U.S. modeling domain.


-------
[Figure panel: NH4 MB (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: CSN, CASTNET Weekly]

Figure 4-19. Mean Bias (µg/m3) of annual ammonium at monitoring sites in the continental U.S. modeling
domain.

[Figure panel: NH4 ME (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: CSN, CASTNET Weekly]

Figure 4-20. Mean Error (µg/m3) of annual ammonium at monitoring sites in the continental U.S.
modeling domain.


-------
[Figure panel: NH4 NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: CSN, CASTNET Weekly]

Figure 4-21. Normalized Mean Bias (%) of annual ammonium at monitoring sites in the continental U.S.
modeling domain.

[Figure panel: NH4 NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: CSN, CASTNET Weekly]

Figure 4-22. Normalized Mean Error (%) of annual ammonium at monitoring sites in the continental
U.S. modeling domain.


-------
[Figure panel: EC MB (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; color scale from <-2 to >2; networks: IMPROVE, CSN]

Figure 4-23. Mean Bias (µg/m3) of annual elemental carbon at monitoring sites in the continental U.S.
modeling domain.

[Figure panel: EC ME (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: IMPROVE, CSN]

Figure 4-24. Mean Error (µg/m3) of annual elemental carbon at monitoring sites in the continental U.S.
modeling domain.


-------
[Figure panel: EC NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: IMPROVE, CSN]

Figure 4-25. Normalized Mean Bias (%) of annual elemental carbon at monitoring sites in the
continental U.S. modeling domain.

[Figure panel: EC NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: IMPROVE, CSN]

Figure 4-26. Normalized Mean Error (%) of annual elemental carbon at monitoring sites in the
continental U.S. modeling domain.


-------
[Figure panel: OC MB (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: IMPROVE, CSN]

Figure 4-27. Mean Bias (µg/m3) of annual organic carbon at monitoring sites in the continental U.S.
modeling domain.

[Figure panel: OC ME (ug/m3) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; coverage limit = 75%; networks: IMPROVE, CSN]

Figure 4-28. Mean Error (µg/m3) of annual organic carbon at monitoring sites in the continental U.S.
modeling domain.


-------
[Figure panel: OC NMB (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; networks: IMPROVE, CSN]

Figure 4-29. Normalized Mean Bias (%) of annual organic carbon at monitoring sites in the continental
U.S. modeling domain.

[Figure panel: OC NME (%) for run CMAQ_2020ha2_MP_cb6r5hap_ae7_12US1 for 20200101 to 20201231; networks: IMPROVE, CSN]

Figure 4-30. Normalized Mean Error (%) of annual organic carbon at monitoring sites in the
continental U.S. modeling domain.


-------
5.0 Bayesian space-time downscaling fusion model (downscaler) - Derived Air Quality Estimates

5.1	Introduction

The need for greater spatial coverage of air pollution concentration estimates has grown in recent years as
epidemiology and exposure studies that link air pollution concentrations to health effects have become more
robust and as regulatory needs have increased. Direct measurement of concentrations is the ideal way of
generating such data, but prohibitive logistics and costs limit the possible spatial coverage and temporal
resolution of such a database. Numerical methods that extend the spatial coverage of existing air pollution
networks with a high degree of confidence are thus a topic of current investigation by researchers. The
downscaler model (DS) is the result of the latest research efforts by EPA for performing such predictions. DS
utilizes both monitoring and CMAQ data as inputs and attempts to take advantage of the measurement data
accuracy and CMAQ's spatial coverage to produce new spatial predictions. This chapter describes methods and
results of the DS application that accompany this report, which utilized ozone and PM2.5 data from AQS and
CMAQ to produce predictions at 2020 census tract centroids in the continental U.S. for the year 2020.

5.2	Downscaler Model

DS develops a relationship between observed and modeled concentrations, and then uses that relationship to
spatially predict what measurements would be at new locations in the spatial domain based on the input data.
This process is separately applied for each time step (daily in this work) of data, and for each of the pollutants
under study (ozone and PM2.5). In its most general form, the model can be expressed in an equation similar to
that of linear regression:

$Y(s) = \tilde{\beta}_0(s) + \beta_1 x(s) + \varepsilon(s)$	(Equation 1)

Where:

$Y(s)$ is the observed concentration at point $s$. Note that $Y(s)$ could be expressed as $Y_t(s)$, where $t$ indicates the
model being fit at time $t$ (in this case, $t = 1, \ldots, 365$ would represent day of the year).

$x(s)$ is the point-level regressor based on the CMAQ concentration at point $s$. This value is a weighted
average of both the gridcell containing the monitor and neighboring gridcells.

$\tilde{\beta}_0(s)$ is the intercept, where $\tilde{\beta}_0(s) = \beta_0 + \beta_0(s)$ is composed of both a global component $\beta_0$ and a local
component $\beta_0(s)$ that is modeled as a mean-zero Gaussian Process with exponential decay.

$\beta_1$ is the global slope; local components of the slope are contained in the $x(s)$ term.

$\varepsilon(s)$ is the model error.
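
As a rough illustration of how the terms of Equation 1 fit together, the Python sketch below draws one realization of a mean-zero local intercept component from a Gaussian process with an exponential covariance, and then evaluates the right-hand side of Equation 1 at a few points. It is not the DS code. The function names, the site coordinates, the regressor values, and the variance and decay parameters (sigma2, phi) are all hypothetical and chosen only for illustration.

```python
import math
import random


def exp_cov(distance, sigma2=1.0, phi=0.5):
    """Exponential covariance: sigma2 * exp(-phi * distance)."""
    return sigma2 * math.exp(-phi * distance)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_local_intercept(sites, rng, sigma2=1.0, phi=0.5):
    """Draw one realization of the mean-zero local component at the given sites."""
    n = len(sites)
    cov = [[exp_cov(math.dist(sites[i], sites[j]), sigma2, phi) for j in range(n)]
           for i in range(n)]
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Multiplying the Cholesky factor by independent normals gives correlated draws.
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def equation_1(beta0_global, beta0_local, beta1, x, eps=0.0):
    """Right-hand side of Equation 1 at one point."""
    return beta0_global + beta0_local + beta1 * x + eps


# Hypothetical sites (arbitrary coordinates) and CMAQ-based regressor values.
sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
x_values = [42.0, 45.0, 38.0]
local = sample_local_intercept(sites, random.Random(1))
predicted = [equation_1(2.0, b, 0.9, x) for b, x in zip(local, x_values)]
```

Nearby sites receive similar local intercept values because the covariance decays with distance, which is the practical meaning of "exponential decay" in the definition above.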

DS has additional properties that differentiate it from linear regression:

1) Rather than just finding a single optimal solution to Equation 1, DS uses a Bayesian approach so that
uncertainties can be generated along with each concentration prediction. This involves drawing random
samples of model parameters from built-in "prior" distributions and assessing their fit on the data on the order
of thousands of times. After each iteration, properties of the prior distributions are adjusted to try to improve
the fit of the next iteration. The resulting collection of $\beta_0$ and $\beta_1$ values at each space-time point are the
"posterior" distributions, and the means and standard deviations of these are used to predict concentrations
and associated uncertainties at new spatial points.

2) The model is "hierarchical" in structure, meaning that the top-level parameters in Equation 1 (i.e., $\tilde{\beta}_0(s)$,
$x(s)$) are actually defined in terms of further parameters and sub-parameters in the DS code. For example,
the overall slope and intercept are each defined to be the sum of a global (one value for the entire spatial domain) and
local (values specific to each spatial point) component. This gives more flexibility in fitting a model to the
data to optimize the fit (i.e., minimize $\varepsilon(s)$).
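
The general idea of repeatedly sampling parameters and summarizing the resulting draws can be sketched with a generic random-walk Metropolis sampler applied to a non-spatial version of Equation 1. This is a simplified stand-in and not the algorithm implemented in DS: the toy data, the fixed noise standard deviation, the normal priors, the step size and all function names are assumptions made for illustration.

```python
import math
import random
import statistics


def log_posterior(b0, b1, xs, ys, noise_sd=1.0, prior_sd=100.0):
    """Unnormalized log posterior for y = b0 + b1 * x + noise with normal priors."""
    log_prior = -(b0 ** 2 + b1 ** 2) / (2.0 * prior_sd ** 2)
    sq_resid = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    return log_prior - sq_resid / (2.0 * noise_sd ** 2)


def metropolis(xs, ys, n_iter=20000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns draws after discarding the first half."""
    rng = random.Random(seed)
    b0, b1 = 0.0, 0.0
    current = log_posterior(b0, b1, xs, ys)
    draws = []
    for _ in range(n_iter):
        cand0 = b0 + rng.gauss(0.0, step)
        cand1 = b1 + rng.gauss(0.0, step)
        candidate = log_posterior(cand0, cand1, xs, ys)
        # Accept the candidate with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            b0, b1, current = cand0, cand1, candidate
        draws.append((b0, b1))
    return draws[n_iter // 2:]


# Toy data: centered regressor, intercept 1.0 and slope 0.5, no noise added.
xs = list(range(-10, 11))
ys = [1.0 + 0.5 * x for x in xs]
draws = metropolis(xs, ys)
b0_draws = [d[0] for d in draws]
b1_draws = [d[1] for d in draws]
post_mean = (statistics.mean(b0_draws), statistics.mean(b1_draws))
post_sd = (statistics.stdev(b0_draws), statistics.stdev(b1_draws))

# Posterior-mean prediction at a new regressor value.
new_x = 4.0
prediction = post_mean[0] + post_mean[1] * new_x
```

The means of the retained draws play the role of the parameter estimates, and their standard deviations give the associated uncertainty, mirroring the use of posterior means and standard deviations described in item 1.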

Further information about the development and inner workings of the current version of DS can be found in
Berrocal, Gelfand and Holland (2012)39 and references therein. The DS outputs that accompany this report are
described below, along with some additional analyses that include assessing the accuracy of the DS
predictions. Results are then summarized, and caveats are provided for interpreting them in the context of air
quality management activities.

5.3 Downscaler Concentration Predictions

In this application, DS was used to predict daily concentration and associated uncertainty values at the
2020 US census tract centroids across the continental U.S. using measurement and CMAQ data as
inputs. For ozone, the concentration unit is the daily maximum 8-hour average in ppb and for PM2.5 the
concentration unit is the 24-hour average in µg/m3.

5.3.1 Summary of 8-hour Ozone Results

Figure 5-1 summarizes the AQS, CMAQ and DS ozone data over the year 2020. It shows the 4th max
daily maximum 8-hour average ozone for AQS observations, CMAQ model predictions and DS model
results. The DS model estimated that for 2020, about 35% of the US Census tracts (29542 out of 83776)
experienced at least one day with an ozone value above the NAAQS of 70 ppb.

39 Berrocal, V., Gelfand, A., and D. Holland. Space-Time Data Fusion Under Error in Computer Model Output: An Application to
Modeling Air Quality. Biometrics. 2012. September; 68(3): 837-848. doi:10.1111/j.1541-0420.2011.01725.


-------
[Figure: map panels for AQS, CMAQ and DS; legend: 2020 4th max, daily max 8-hour avg ozone (ppb), binned as (-Inf,55], (55,60], (60,65], (65,70], (70,75], (75,80], (80,85], (85,90], (90,Inf]; axes: longitude 120°W to 80°W, latitude 25°N to 45°N]

Figure 5-1. Annual 4th max (daily max 8-hour ozone concentrations) derived from AQS, CMAQ and
DS data.


-------
5.3.2 Summary of PM2.5 Results

Figures 5-2 and 5-3 summarize the AQS, CMAQ and DS PM2.5 data over the year 2020. Figure 5-2 shows
annual means and Figure 5-3 shows 98th percentiles of 24-hour PM2.5 concentrations for AQS observations,
CMAQ model predictions and DS model results. The DS model estimated that for 2020 about 40% of the US
Census tracts (33298 out of 83776) experienced at least one day with a PM2.5 value above the 24-hour NAAQS
of 35 µg/m3.
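
For reference, the two PM2.5 summaries mapped in Figures 5-2 and 5-3 can be sketched as follows. The percentile here uses a simple nearest-rank rule, which is an assumption made for illustration and not necessarily the convention used to produce the figures; the synthetic daily series and the function names are likewise hypothetical.

```python
def annual_mean(daily_values):
    """Arithmetic mean of the daily 24-hour average values."""
    return sum(daily_values) / len(daily_values)


def percentile_98_nearest_rank(daily_values):
    """98th percentile by the nearest-rank rule: the value at rank ceil(0.98 * n)."""
    n = len(daily_values)
    rank = (98 * n + 99) // 100  # integer form of ceil(98 * n / 100)
    return sorted(daily_values)[rank - 1]


# Synthetic series of 100 daily values: 1.0, 2.0, ..., 100.0 (arbitrary units).
series = [float(v) for v in range(1, 101)]
mean_value = annual_mean(series)
p98_value = percentile_98_nearest_rank(series)

# Arithmetic check of the share reported in the text: 33298 of 83776 tracts.
reported_percent = 100.0 * 33298 / 83776
```

With 366 daily values the nearest-rank rule selects the 359th smallest value; reported_percent evaluates to roughly 39.7, consistent with the "about 40%" stated above.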



-------
[Figure: map panels for AQS, CMAQ and DS; legend: 2020 annual mean, 24-hour avg PM2.5 (ug/m3), binned as (0,3], (3,5], (5,8], (8,10], (10,12], (12,15], (15,18], (18,Inf]; axes: longitude 120°W to 80°W, latitude 25°N to 45°N]

Figure 5-2. Annual mean PM2.5 concentrations derived from AQS, CMAQ and DS data.


-------
[Figure: map panels for AQS, CMAQ and DS; legend: 2020 98th percentile, 24-hour avg PM2.5 (ug/m3), binned as (0,10], (10,15], (15,20], (20,25], (25,30], (30,35], (35,40], (40,45], (45,50], (50,Inf]; axes: longitude 120°W to 80°W, latitude 25°N to 45°N]

Figure 5-3. 98th percentile 24-hour average PM2.5 concentrations derived from AQS, CMAQ and DS
data.

5.4 Downscaler Uncertainties



-------
5.4.1 Standard Errors

As mentioned above, the DS model works by drawing random samples from built-in distributions
during its parameter estimation. The standard errors associated with each of these populations provide
a measure of uncertainty associated with each concentration prediction. Figures 5-4 and 5-5 show the
percent errors resulting from dividing the DS standard errors by the associated DS prediction. The black
dots on the maps show the location of EPA sampling network monitors whose data was input to DS via
the AQS datasets (Chapter 2). The maps show that, in general, errors are relatively smaller in regions
with more densely situated monitors (i.e., the eastern U.S.), and larger in regions with more sparse
monitoring networks (i.e., western states). These standard errors could potentially be used to estimate
the probability of an exceedance for a given point estimate of a pollutant concentration.
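
A minimal sketch of both calculations follows: the percent error mapped in Figures 5-4 and 5-5, and one possible way to turn a prediction and its standard error into an exceedance probability. The second function assumes that the prediction uncertainty is approximately normal, which is an assumption of this sketch and not a documented property of the DS output; the function names and numerical inputs are hypothetical.

```python
import math


def percent_error(prediction, std_error):
    """Standard error expressed as a percentage of the prediction."""
    return 100.0 * std_error / prediction


def exceedance_probability(prediction, std_error, threshold):
    """P(concentration > threshold), assuming Normal(prediction, std_error**2)."""
    z = (threshold - prediction) / std_error
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical ozone prediction of 68 ppb with a standard error of 4 ppb.
relative = percent_error(68.0, 4.0)
prob_above_70 = exceedance_probability(68.0, 4.0, threshold=70.0)
```

Under this assumption, a prediction equal to the threshold gives a probability of 0.5, and the probability moves toward 0 or 1 more quickly where the standard error is small, such as in densely monitored regions.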



[Figure: map of the continental U.S. with monitor locations shown as black dots; legend: % DS Error: ozone, with a lowest bin ending at 5 followed by (5,10], (10,15], (15,20], (20,30], (30,40], (40,50], (50,75], (75,100]]

Figure 5-4: Annual mean relative errors (standard errors divided by predictions) from the DS 2020 runs for
ozone. The black dots show the locations of monitors that generated the AQS data used as input to the DS
model.


-------
Figure 5-5: Annual mean relative errors (standard errors divided by predictions) from the DS 2020 runs for
PM2.5. The black dots show the locations of monitors that generated the AQS data used as input to the DS
model.

5.4.2 Cross Validation

To check the quality of its spatial predictions, DS can be set to perform "cross-validation" (CV), which
involves leaving a subset of AQS data out of the model run and predicting the concentrations of those left out
points. The predicted values are then compared to the actual left-out values to generate statistics that provide
an indicator of the predictive ability. In the DS runs associated with this report, 10% of the data was chosen
randomly by the DS model to be used for the CV process. The resulting CV statistics are shown below in
Table 5-1.

Pollutant	Monitor Count	Mean Bias	RMSE	Mean Coverage

PM2.5	943	0.146	4.987	0.953

O3	1224	0.018	4.221	0.962

Table 5-1. Cross-validation statistics associated with the 2020 DS runs.

The statistics indicated by the columns of Table 5-1 are as follows:



-------
-	Mean Bias: The bias of each prediction is the DS prediction minus the AQS value. This column is the
mean of all biases across the CV cases.

-	Root Mean Squared Error (RMSE): The bias is squared for each CV prediction, then the square root
of the mean of all squared biases across all CV predictions is obtained.

-	Mean Coverage: A value of 1 is assigned if the measured AQS value lies in the 95% confidence
interval of the DS prediction (the DS prediction +/- the DS standard error), and 0 otherwise. This
column is the mean of all those 0's and 1's.
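
The three statistics defined above, together with a random 10% holdout, can be sketched as follows. This is an illustration and not the DS implementation. In particular, the interval half-width is taken as 1.96 standard errors under a normal approximation; that multiplier is an assumption of this sketch, chosen because it corresponds to a 95% interval. The function names and the toy numbers are hypothetical.

```python
import math
import random


def holdout_split(n_records, fraction=0.10, seed=0):
    """Randomly choose a fraction of record indices to hold out for validation."""
    rng = random.Random(seed)
    n_holdout = round(fraction * n_records)
    held_out = set(rng.sample(range(n_records), n_holdout))
    kept = [i for i in range(n_records) if i not in held_out]
    return kept, sorted(held_out)


def cv_statistics(predictions, std_errors, observations, z=1.96):
    """Return (mean bias, RMSE, mean coverage) for held-out predictions."""
    biases = [p - o for p, o in zip(predictions, observations)]
    mean_bias = sum(biases) / len(biases)
    rmse = math.sqrt(sum(b * b for b in biases) / len(biases))
    # 1 if the observation lies within prediction +/- z * standard error, else 0.
    covered = [1 if abs(b) <= z * se else 0 for b, se in zip(biases, std_errors)]
    mean_coverage = sum(covered) / len(covered)
    return mean_bias, rmse, mean_coverage


# Toy held-out cases: predictions, standard errors and observed values.
kept, held_out = holdout_split(50)
mean_bias, rmse, coverage = cv_statistics(
    predictions=[10.0, 12.0, 8.0],
    std_errors=[1.0, 1.0, 1.0],
    observations=[9.0, 12.0, 11.0],
)
```

In the toy case the biases are 1, 0 and -3, so two of the three observations fall inside their intervals; a mean coverage near 0.95 over many cases indicates that the intervals are about as wide as they should be.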

5.5 Summary and Conclusions

The results presented in this report are from an application of the DS fusion model for characterizing
national air quality for ozone and PM2.5. DS provided spatial predictions of daily ozone and PM2.5 at
2020 U.S. census tract centroids by utilizing monitoring data and CMAQ output for 2020. Large-scale
spatial and temporal patterns of concentration predictions are generally consistent with those seen in
ambient monitoring data. Both ozone and PM2.5 were predicted with lower error in the eastern versus
the western U.S., presumably due to the greater monitoring density in the east.

An additional caution that warrants mentioning is related to the capability of DS to provide predictions
at multiple spatial points within a single CMAQ grid cell. Care needs to be taken not to over-interpret
any within-grid cell gradients that might be produced by a user. Fine-scale emission sources in CMAQ
are diluted into the grid cell averages, but a given source within a grid cell might or might not affect
every spatial point contained therein equally. Therefore DS-generated fine-scale gradients are not
expected to represent actual fine-scale atmospheric concentration gradients, unless possibly where
multiple monitors are present in the grid cell.



-------
Appendix A - Acronyms

ARW	Advanced Research WRF core model

BEIS	Biogenic Emissions Inventory System

BlueSky	Emissions modeling framework

BSP	BlueSky Pipeline modeling system

CAIR	Clean Air Interstate Rule

CAMD	EPA's Clean Air Markets Division

CAP	Criteria Air Pollutant

CAR	Conditional Auto Regressive spatial covariance structure (model)

CARB	California Air Resources Board

CEM	Continuous Emissions Monitoring

CHIEF	Clearinghouse for Inventories and Emissions Factors

CMAQ	Community Multiscale Air Quality model

CMV	Commercial marine vessel

CO	Carbon monoxide

CSN	Chemical Speciation Network

DQO	Data Quality Objectives

EGU	Electric Generating Units

Emission Inventory	Listing of elements contributing to atmospheric release of pollutant substances

EPA	Environmental Protection Agency

EMFAC	Emission Factor (California's onroad mobile model)

FAA	Federal Aviation Administration

FDDA	Four-Dimensional Data Assimilation

FIPS	Federal Information Processing Standards

HAP	Hazardous Air Pollutant

HC	Hydrocarbon

HMS	Hazard Mapping System

ICS-209	Incident Status Summary form

IPM	Integrated Planning Model

ITN	Itinerant

LSM	Land Surface Model

MOBILE	OTAQ's model for estimation of onroad mobile emissions factors

MODIS	Moderate Resolution Imaging Spectroradiometer

MOVES	Motor Vehicle Emission Simulator

NEEDS	National Electric Energy Database System

NEI	National Emission Inventory

NERL	National Exposure Research Laboratory

NESHAP	National Emission Standards for Hazardous Air Pollutants

NH3	Ammonia

NMIM	National Mobile Inventory Model

NONROAD	OTAQ's model for estimation of nonroad mobile emissions

NOx	Nitrogen oxides


-------
OAQPS	EPA's Office of Air Quality Planning and Standards

OAR	EPA's Office of Air and Radiation

ORD	EPA's Office of Research and Development

ORIS	Office of Regulatory Information Systems (code) - a 4 or 5 digit number assigned by the Department of Energy's (DOE) Energy Information Administration (EIA) to facilities that generate electricity

ORL	One Record per Line

OTAQ	EPA's Office of Transportation and Air Quality

PAH	Polycyclic Aromatic Hydrocarbon

PFC	Portable Fuel Container

PM2.5	Particulate matter less than or equal to 2.5 microns

PM10	Particulate matter less than or equal to 10 microns

PMc	Particulate matter greater than 2.5 microns and less than 10 microns

Prescribed Fire	Intentionally set fire to clear vegetation

RIA	Regulatory Impact Analysis

RPO	Regional Planning Organization

RRTM	Rapid Radiative Transfer Model

SCC	Source Classification Code

SMARTFIRE	Satellite Mapping Automatic Reanalysis Tool for Fire Incident Reconciliation

SMOKE	Sparse Matrix Operator Kernel Emissions

TSD	Technical support document

VOC	Volatile organic compounds

VMT	Vehicle miles traveled

Wildfire	Uncontrolled forest fire

WRAP	Western Regional Air Partnership

WRF	Weather Research and Forecasting Model



-------
Appendix B - Emissions Totals by Sector

Please see the independent spreadsheet AppendixB_2020_emissions_totals_by_sector.xlsx that provides
inventory and speciation emissions totals for each emissions modeling sector.



-------
United States Environmental Protection Agency
Office of Air Quality Planning and Standards
Air Quality Assessment Division
Research Triangle Park, NC
Publication No. EPA-454/R-23-004
December 2023


-------